[1/3] powerpc: lib: sstep: fix 'sthcx' instruction

Message ID 20220223135820.2252470-1-anders.roxell@linaro.org
State Superseded
Series [1/3] powerpc: lib: sstep: fix 'sthcx' instruction

Commit Message

Anders Roxell Feb. 23, 2022, 1:58 p.m. UTC
It looks like there was a copy-paste mistake when the instruction
'stbcx' was added twice; one of them was probably meant to be 'sthcx'.
Change 'stbcx' to 'sthcx'.

Cc: <stable@vger.kernel.org> # v4.13+
Fixes: 350779a29f11 ("powerpc: Handle most loads and stores in instruction emulation code")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
---
 arch/powerpc/lib/sstep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
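
For readers following along, the fix amounts to a size-to-mnemonic mapping. The following is only a minimal sketch: `store_conditional_mnemonic()` is a hypothetical helper that does not exist in sstep.c, where the mnemonic string is passed straight to __put_user_asmx() inside a switch on the access size.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: maps an access size in bytes
 * to the PowerPC store-conditional mnemonic an emulation path would want.
 * The bug fixed by this patch was the size-2 case using the size-1 mnemonic.
 */
static const char *store_conditional_mnemonic(int size)
{
	switch (size) {
	case 1:
		return "stbcx.";	/* store byte conditional */
	case 2:
		return "sthcx.";	/* store halfword conditional (was "stbcx.") */
	case 4:
		return "stwcx.";	/* store word conditional */
	case 8:
		return "stdcx.";	/* store doubleword conditional */
	default:
		return NULL;		/* sizes not covered by this sketch */
	}
}
```

With the old code, a 2-byte store-conditional would have been emulated as a 1-byte store, so only part of the value would reach memory.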

Comments

Nicholas Piggin Feb. 24, 2022, 2:40 a.m. UTC | #1
Excerpts from Anders Roxell's message of February 23, 2022 11:58 pm:
> It looks like there was a copy-paste mistake when the instruction
> 'stbcx' was added twice; one of them was probably meant to be 'sthcx'.
> Change 'stbcx' to 'sthcx'.
> 
> Cc: <stable@vger.kernel.org> # v4.13+
> Fixes: 350779a29f11 ("powerpc: Handle most loads and stores in instruction emulation code")
> Reported-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>

Good catch.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> ---
>  arch/powerpc/lib/sstep.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index bd3734d5be89..d2d29243fa6d 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -3389,7 +3389,7 @@ int emulate_loadstore(struct pt_regs *regs, struct instruction_op *op)
>  			__put_user_asmx(op->val, ea, err, "stbcx.", cr);
>  			break;
>  		case 2:
> -			__put_user_asmx(op->val, ea, err, "stbcx.", cr);
> +			__put_user_asmx(op->val, ea, err, "sthcx.", cr);
>  			break;
>  #endif
>  		case 4:
> -- 
> 2.34.1
> 
>
Nicholas Piggin Feb. 24, 2022, 2:54 a.m. UTC | #2
Excerpts from Anders Roxell's message of February 23, 2022 11:58 pm:
> Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian
> 2.37.90.20220207) the following build error shows up:
> 
>  {standard input}: Assembler messages:
>  {standard input}:1190: Error: unrecognized opcode: `stbcix'
>  {standard input}:1433: Error: unrecognized opcode: `lwzcix'
>  {standard input}:1453: Error: unrecognized opcode: `stbcix'
>  {standard input}:1460: Error: unrecognized opcode: `stwcix'
>  {standard input}:1596: Error: unrecognized opcode: `stbcix'
>  ...
> 
> Rework to add assembler directives [1] around the instruction. Going
> through them one by one shows that the changes should be safe.  Like
> __get_user_atomic_128_aligned() is only called in p9_hmi_special_emu(),
> which according to the name is specific to power9.  And __raw_rm_read*()
> are only called in things that are powernv or book3s_hv specific.
> 
> [1] https://sourceware.org/binutils/docs/as/PowerPC_002dPseudo.html#PowerPC_002dPseudo

Thanks for doing this. There is a recent patch committed to binutils to work
around this compiler bug.

https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=cebc89b9328

Not sure on the outlook for GCC fix. Either way unfortunately we have 
toolchains in the wild now that will explode, so we might have to take 
your patches for the time being.

Thanks,
Nick
Nicholas Piggin Feb. 24, 2022, 5:05 a.m. UTC | #3
Excerpts from Nicholas Piggin's message of February 24, 2022 12:54 pm:
> Excerpts from Anders Roxell's message of February 23, 2022 11:58 pm:
>> Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian
>> 2.37.90.20220207) the following build error shows up:
>> 
>>  {standard input}: Assembler messages:
>>  {standard input}:1190: Error: unrecognized opcode: `stbcix'
>>  {standard input}:1433: Error: unrecognized opcode: `lwzcix'
>>  {standard input}:1453: Error: unrecognized opcode: `stbcix'
>>  {standard input}:1460: Error: unrecognized opcode: `stwcix'
>>  {standard input}:1596: Error: unrecognized opcode: `stbcix'
>>  ...
>> 
>> Rework to add assembler directives [1] around the instruction. Going
>> through them one by one shows that the changes should be safe.  Like
>> __get_user_atomic_128_aligned() is only called in p9_hmi_special_emu(),
>> which according to the name is specific to power9.  And __raw_rm_read*()
>> are only called in things that are powernv or book3s_hv specific.
>> 
>> [1] https://sourceware.org/binutils/docs/as/PowerPC_002dPseudo.html#PowerPC_002dPseudo
> 
> Thanks for doing this. There is a recent patch committed to binutils to work
> around this compiler bug.
> 
> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=cebc89b9328
> 
> Not sure on the outlook for GCC fix. Either way unfortunately we have 
> toolchains in the wild now that will explode, so we might have to take 
> your patches for the time being.

Perhaps not... Here's a hack that seems to work around the problem.

The issue of removing -many from the kernel and replacing it with
appropriate architecture versions is an orthogonal one (that we
should do). Either way this hack should be able to allow us to do
that as well, on these problem toolchains.

But for now it just uses -many as the trivial regression fix to get
back to previous behaviour.

Thanks,
Nick

---
 arch/powerpc/include/asm/asm-compat.h | 28 +++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h
index 2b736d9fbb1b..f9ac4a36f026 100644
--- a/arch/powerpc/include/asm/asm-compat.h
+++ b/arch/powerpc/include/asm/asm-compat.h
@@ -5,6 +5,34 @@
 #include <asm/types.h>
 #include <asm/ppc-opcode.h>
 
+#ifndef __ASSEMBLY__
+/*
+ * gcc 10 started to emit a .machine directive at the beginning of generated
+ * .s files, which overrides assembler -Wa,-m<cpu> options passed down.
+ * Unclear if this behaviour will be reverted.
+ *
+ * gas 2.38 commit b25f942e18d6 made .machine directive more strict, commit
+ * cebc89b9328ea weakens it to take into account the gcc directive and allow
+ * assembler -m<cpu> options to work.
+ *
+ * A combination of both results in an older machine -mcpu= code generation
+ * preventing newer mnemonics in inline asm from being recognised, because it
+ * overrides our -Wa,-many option.
+ *
+ * Emitting a .machine any directive by hand allows us to hack our way around
+ * this.
+ *
+ * XXX: verify versions and combinations.
+ */
+#ifdef CONFIG_CC_IS_GCC
+#if (GCC_VERSION >= 100000)
+#if (CONFIG_AS_VERSION == 23800)
+asm(".machine any");
+#endif
+#endif
+#endif
+#endif /* __ASSEMBLY__ */
+
 #ifdef __powerpc64__
 
 /* operations for longs and pointers */
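
The version tests in the hack above rely on the kernel's numeric version encoding. As a rough sketch, assuming the usual major * 10000 + minor * 100 + patch scheme behind GCC_VERSION and CONFIG_AS_VERSION, the nested #if conditions can be written as a plain predicate. Both `encode_version()` and `wants_machine_any()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* Encode major.minor.patch as major * 10000 + minor * 100 + patch. */
static int encode_version(int major, int minor, int patch)
{
	return major * 10000 + minor * 100 + patch;
}

/*
 * Hypothetical predicate mirroring the nested #if conditions in the patch:
 * GCC 10 or later, paired with exactly the 2.38.0 assembler.
 */
static bool wants_machine_any(bool cc_is_gcc, int gcc_version, int as_version)
{
	return cc_is_gcc && gcc_version >= 100000 && as_version == 23800;
}
```

Note that under this encoding a pre-release assembler numbered 2.37.90 would come out as 23790 and would not match the `== 23800` test, assuming only the first three version components are considered. That may be one of the combinations the XXX comment asks to verify.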
Arnd Bergmann Feb. 24, 2022, 8:55 a.m. UTC | #4
On Thu, Feb 24, 2022 at 6:05 AM Nicholas Piggin <npiggin@gmail.com> wrote:
> Excerpts from Nicholas Piggin's message of February 24, 2022 12:54 pm:
> >
> > Not sure on the outlook for GCC fix. Either way unfortunately we have
> > toolchains in the wild now that will explode, so we might have to take
> > your patches for the time being.
>
> Perhaps not... Here's a hack that seems to work around the problem.
>
> The issue of removing -many from the kernel and replacing it with
> appropriate architecture versions is an orthogonal one (that we
> should do). Either way this hack should be able to allow us to do
> that as well, on these problem toolchains.
>
> But for now it just uses -many as the trivial regression fix to get
> back to previous behaviour.

I don't think the previous behavior is what you want to be honest.

We had the same thing on Arm a few years ago when binutils
started enforcing this more strictly, and it does catch actual
bugs. I think annotating individual inline asm statements is
the best choice here, as that documents what the intention is.

There is one more bug in this series that I looked at with Anders, but
he did not send a patch for that so far:

static void dummy_perf(struct pt_regs *regs)
{
#if defined(CONFIG_FSL_EMB_PERFMON)
        mtpmr(PMRN_PMGC0, mfpmr(PMRN_PMGC0) & ~PMGC0_PMIE);
#elif defined(CONFIG_PPC64) || defined(CONFIG_PPC_BOOK3S_32)
        if (cur_cpu_spec->pmc_type == PPC_PMC_IBM)
                mtspr(SPRN_MMCR0, mfspr(SPRN_MMCR0) & ~(MMCR0_PMXE|MMCR0_PMAO));
#else
        mtspr(SPRN_MMCR0, mfspr(SPRN_MMCR0) & ~MMCR0_PMXE);
#endif
}

Here, the assembler correctly flags the mtpmr/mfpmr as an invalid
instruction for a combined 6xx kernel: As far as I can tell, these are
only available on e300 but not the others, and instead of the compile-time
check for CONFIG_FSL_EMB_PERFMON, there needs to be some
runtime check to use the first method on 83xx but the #elif one on
the other 6xx machines.

       Arnd
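
To make the compile-time versus runtime distinction concrete, here is a small sketch of what a runtime selection could look like. It is not a proposed patch: the enum names and `dummy_perf_path()` are hypothetical stand-ins, and a real version would consult something like cur_cpu_spec->pmc_type and perform the register writes rather than return a tag.

```c
/* Hypothetical CPU classes, standing in for a runtime field such as pmc_type. */
enum sketch_pmc_type {
	SKETCH_PMC_FSL_EMB,	/* cores that implement mtpmr/mfpmr */
	SKETCH_PMC_IBM,		/* cores that use MMCR0 via mtspr/mfspr */
	SKETCH_PMC_OTHER,
};

/* Which register access the handler would perform. */
enum sketch_perf_path {
	PATH_MTPMR,
	PATH_MTSPR_MMCR0,
	PATH_NONE,
};

/* Pick the access method at runtime instead of with #if/#elif. */
static enum sketch_perf_path dummy_perf_path(enum sketch_pmc_type type)
{
	switch (type) {
	case SKETCH_PMC_FSL_EMB:
		return PATH_MTPMR;
	case SKETCH_PMC_IBM:
		return PATH_MTSPR_MMCR0;
	default:
		return PATH_NONE;
	}
}
```

In a combined kernel both branches are compiled in, so each branch's inline asm would still need to assemble for the selected baseline, which is where per-statement .machine annotations come in.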
Nicholas Piggin Feb. 24, 2022, 10:11 a.m. UTC | #5
Excerpts from Arnd Bergmann's message of February 24, 2022 6:55 pm:
> On Thu, Feb 24, 2022 at 6:05 AM Nicholas Piggin <npiggin@gmail.com> wrote:
>> Excerpts from Nicholas Piggin's message of February 24, 2022 12:54 pm:
>> >
>> > Not sure on the outlook for GCC fix. Either way unfortunately we have
>> > toolchains in the wild now that will explode, so we might have to take
>> > your patches for the time being.
>>
>> Perhaps not... Here's a hack that seems to work around the problem.
>>
>> The issue of removing -many from the kernel and replacing it with
>> appropriate architecture versions is an orthogonal one (that we
>> should do). Either way this hack should be able to allow us to do
>> that as well, on these problem toolchains.
>>
>> But for now it just uses -many as the trivial regression fix to get
>> back to previous behaviour.
> 
> I don't think the previous behavior is what you want to be honest.

-many isn't good but that's what we're using and that is still
what we're using upstream on any other toolchain that doesn't
have these issues. Including the next binutils version that will
ignore the initial .machine directive for 64s.

Neither of these approaches solves that. At least for 64s that
is passing -Wa,-many down already. (Although Anders' series
gets almost there).

So this is the minimal fix that brings the toolchains into line
with others and behaves how it previously did and fixes immediate
build regressions. Removing -many is somewhat independent of that.

> We had the same thing on Arm a few years ago when binutils
> started enforcing this more strictly, and it does catch actual
> bugs. I think annotating individual inline asm statements is
> the best choice here, as that documents what the intention is.

A few cases where there are differences in privileged instructions
(that won't be compiler generated), that will be done anyway.

For new instructions added to the ISA though? I think it's ugly and
unnecessary. There is no ambiguity about the intention when you see
a lharx instruction, is there?

It would delineate instructions that can't be used on all processors
but I don't see much advantage there, it's not an exhaustive check
because we have other restrictions on instructions in the kernel
environment. And why would inline asm be special but not the rest
of the asm? Would you propose to put these .machine directives
everywhere in thousands of lines of asm code in the kernel? I
don't know that it's an improvement. And inline asm is a small
fraction of instructions.

> 
> There is one more bug in this series that I looked at with Anders, but
> he did not send a patch for that so far:
> 
> static void dummy_perf(struct pt_regs *regs)
> {
> #if defined(CONFIG_FSL_EMB_PERFMON)
>         mtpmr(PMRN_PMGC0, mfpmr(PMRN_PMGC0) & ~PMGC0_PMIE);
> #elif defined(CONFIG_PPC64) || defined(CONFIG_PPC_BOOK3S_32)
>         if (cur_cpu_spec->pmc_type == PPC_PMC_IBM)
>                 mtspr(SPRN_MMCR0, mfspr(SPRN_MMCR0) & ~(MMCR0_PMXE|MMCR0_PMAO));
> #else
>         mtspr(SPRN_MMCR0, mfspr(SPRN_MMCR0) & ~MMCR0_PMXE);
> #endif
> }
> 
> Here, the assembler correctly flags the mtpmr/mfpmr as an invalid
> instruction for a combined 6xx kernel: As far as I can tell, these are
> only available on e300 but not the others, and instead of the compile-time
> check for CONFIG_FSL_EMB_PERFMON, there needs to be some
> runtime check to use the first method on 83xx but the #elif one on
> the other 6xx machines.

Right that should be caught if you just pass -m<superset> architecture
to the assembler that does not include the mtpmr. 32-bit is a lot more
complicated than 64s like this though, so it's possible in some cases
you will want more checking and -m<subset> + some .machine directives
will work better.

Once you add the .machine directive to your inline asm though, you lose
*all* such static checking for the instruction. So it's really not a
panacea and has its own downsides.

Thanks,
Nick
Arnd Bergmann Feb. 24, 2022, 10:20 a.m. UTC | #6
On Thu, Feb 24, 2022 at 11:11 AM Nicholas Piggin <npiggin@gmail.com> wrote:
> Excerpts from Arnd Bergmann's message of February 24, 2022 6:55 pm:
> > On Thu, Feb 24, 2022 at 6:05 AM Nicholas Piggin <npiggin@gmail.com> wrote:
> > We had the same thing on Arm a few years ago when binutils
> > started enforcing this more strictly, and it does catch actual
> > bugs. I think annotating individual inline asm statements is
> > the best choice here, as that documents what the intention is.
>
> A few cases where there are differences in privileged instructions
> (that won't be compiler generated), that will be done anyway.
>
> For new instructions added to the ISA though? I think it's ugly and
> unnecessary. There is no ambiguity about the intention when you see
> a lharx instruction, is there?
>
> It would delineate instructions that can't be used on all processors
> but I don't see much advantage there, it's not an exhaustive check
> because we have other restrictions on instructions in the kernel
> environment. And why would inline asm be special but not the rest
> of the asm? Would you propose to put these .machine directives
> everywhere in thousands of lines of asm code in the kernel? I
> don't know that it's an improvement. And inline asm is a small
> fraction of instructions.

Most of the code is fine, as we tend to only build .S files that
are for the given target CPU, the explicit .machine directives are
only needed when you have a file that mixes instructions for
incompatible machines, using a runtime detection.

> Right that should be caught if you just pass -m<superset> architecture
> to the assembler that does not include the mtpmr. 32-bit is a lot more
> complicated than 64s like this though, so it's possible in some cases
> you will want more checking and -m<subset> + some .machine directives
> will work better.
>
> Once you add the .machine directive to your inline asm though, you lose
> *all* such static checking for the instruction. So it's really not a
> panacea and has its own downsides.

Again, there should be a minimum number of those .machine directives
in inline asm as well, which tends to work out fine as long as the
entire kernel is built with the correct -march= option for the minimum
supported CPU, and stays away from inline asm that requires a higher
CPU level.

      Arnd
Nicholas Piggin Feb. 24, 2022, 11:13 a.m. UTC | #7
Excerpts from Arnd Bergmann's message of February 24, 2022 8:20 pm:
> On Thu, Feb 24, 2022 at 11:11 AM Nicholas Piggin <npiggin@gmail.com> wrote:
>> Excerpts from Arnd Bergmann's message of February 24, 2022 6:55 pm:
>> > On Thu, Feb 24, 2022 at 6:05 AM Nicholas Piggin <npiggin@gmail.com> wrote:
>> > We had the same thing on Arm a few years ago when binutils
>> > started enforcing this more strictly, and it does catch actual
>> > bugs. I think annotating individual inline asm statements is
>> > the best choice here, as that documents what the intention is.
>>
>> A few cases where there are differences in privileged instructions
>> (that won't be compiler generated), that will be done anyway.
>>
>> For new instructions added to the ISA though? I think it's ugly and
>> unnecessary. There is no ambiguity about the intention when you see
>> a lharx instruction, is there?
>>
>> It would delineate instructions that can't be used on all processors
>> but I don't see much advantage there, it's not an exhaustive check
>> because we have other restrictions on instructions in the kernel
>> environment. And why would inline asm be special but not the rest
>> of the asm? Would you propose to put these .machine directives
>> everywhere in thousands of lines of asm code in the kernel? I
>> don't know that it's an improvement. And inline asm is a small
>> fraction of instructions.
> 
> Most of the code is fine, as we tend to only build .S files that
> are for the given target CPU,

That's not true on powerpc at least. grep FTR_SECTION.

Not all of them are different ISA, but it's more than just the
CPU_FTR_ARCH ones which only started about POWER7.

> the explicit .machine directives are
> only needed when you have a file that mixes instructions for
> incompatible machines, using a runtime detection.

Right. There are .S files in that category. And a lot of
it, for both inline asm and .S, we probably skirt entirely by using raw
instruction encodings because of old toolchains (which get no error
checking at all), which we really should tidy up and trim.

> 
>> Right that should be caught if you just pass -m<superset> architecture
>> to the assembler that does not include the mtpmr. 32-bit is a lot more
>> complicated than 64s like this though, so it's possible in some cases
>> you will want more checking and -m<subset> + some .machine directives
>> will work better.
>>
>> Once you add the .machine directive to your inline asm though, you lose
>> *all* such static checking for the instruction. So it's really not a
>> panacea and has its own downsides.
> 
> Again, there should be a minimum number of those .machine directives
> in inline asm as well, which tends to work out fine as long as the
> entire kernel is built with the correct -march= option for the minimum
> supported CPU, and stays away from inline asm that requires a higher
> CPU level.

There's really no advantage to them, and they're ugly and annoying
and if we applied the concept consistently for all asm they would grow 
to a very large number.

The idea they'll give you good static checking just doesn't really
pan out.

Thanks,
Nick
Michael Ellerman Feb. 24, 2022, 12:39 p.m. UTC | #8
Hi Anders,

Thanks for these, just a few comments below ...

Anders Roxell <anders.roxell@linaro.org> writes:
> Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian
> 2.37.90.20220207) the following build error shows up:
>
>  {standard input}: Assembler messages:
>  {standard input}:1190: Error: unrecognized opcode: `stbcix'
>  {standard input}:1433: Error: unrecognized opcode: `lwzcix'
>  {standard input}:1453: Error: unrecognized opcode: `stbcix'
>  {standard input}:1460: Error: unrecognized opcode: `stwcix'
>  {standard input}:1596: Error: unrecognized opcode: `stbcix'
>  ...
>
> Rework to add assembler directives [1] around the instruction. Going
> through them one by one shows that the changes should be safe.  Like
> __get_user_atomic_128_aligned() is only called in p9_hmi_special_emu(),
> which according to the name is specific to power9.  And __raw_rm_read*()
> are only called in things that are powernv or book3s_hv specific.
>
> [1] https://sourceware.org/binutils/docs/as/PowerPC_002dPseudo.html#PowerPC_002dPseudo
>
> Cc: <stable@vger.kernel.org>
> Co-developed-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
> ---
>  arch/powerpc/include/asm/io.h        | 46 +++++++++++++++++++++++-----
>  arch/powerpc/include/asm/uaccess.h   |  3 ++
>  arch/powerpc/platforms/powernv/rng.c |  6 +++-
>  3 files changed, 46 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
> index beba4979bff9..5ff6dec489f8 100644
> --- a/arch/powerpc/include/asm/io.h
> +++ b/arch/powerpc/include/asm/io.h
> @@ -359,25 +359,37 @@ static inline void __raw_writeq_be(unsigned long v, volatile void __iomem *addr)
>   */
>  static inline void __raw_rm_writeb(u8 val, volatile void __iomem *paddr)
>  {
> -	__asm__ __volatile__("stbcix %0,0,%1"
> +	__asm__ __volatile__(".machine \"push\"\n"
> +			     ".machine \"power6\"\n"
> +			     "stbcix %0,0,%1\n"
> +			     ".machine \"pop\"\n"
>  		: : "r" (val), "r" (paddr) : "memory");

As Segher said it'd be cleaner without the embedded quotes.

> @@ -441,7 +465,10 @@ static inline unsigned int name(unsigned int port)	\
>  	unsigned int x;					\
>  	__asm__ __volatile__(				\
>  		"sync\n"				\
> +		".machine \"push\"\n"			\
> +		".machine \"power6\"\n"			\
>  		"0:"	op "	%0,0,%1\n"		\
> +		".machine \"pop\"\n"			\
>  		"1:	twi	0,%0,0\n"		\
>  		"2:	isync\n"			\
>  		"3:	nop\n"				\
> @@ -465,7 +492,10 @@ static inline void name(unsigned int val, unsigned int port) \
>  {							\
>  	__asm__ __volatile__(				\
>  		"sync\n"				\
> +		".machine \"push\"\n"			\
> +		".machine \"power6\"\n"			\
>  		"0:" op " %0,0,%1\n"			\
> +		".machine \"pop\"\n"			\
>  		"1:	sync\n"				\
>  		"2:\n"					\
>  		EX_TABLE(0b, 2b)			\

It's not visible from the diff, but the above two are __do_in_asm and
__do_out_asm and are inside an ifdef CONFIG_PPC32.

AFAICS they're only used for:

__do_in_asm(_rec_inb, "lbzx")
__do_in_asm(_rec_inw, "lhbrx")
__do_in_asm(_rec_inl, "lwbrx")
__do_out_asm(_rec_outb, "stbx")
__do_out_asm(_rec_outw, "sthbrx")
__do_out_asm(_rec_outl, "stwbrx")

Which are all old instructions, so I don't think we need the machine
power6 for those two macros?

> diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
> index b4386714494a..5bf30ef6d928 100644
> --- a/arch/powerpc/platforms/powernv/rng.c
> +++ b/arch/powerpc/platforms/powernv/rng.c
> @@ -43,7 +43,11 @@ static unsigned long rng_whiten(struct powernv_rng *rng, unsigned long val)
>  	unsigned long parity;
>  
>  	/* Calculate the parity of the value */
> -	asm ("popcntd %0,%1" : "=r" (parity) : "r" (val));
> +	asm (".machine \"push\"\n"
> +	     ".machine \"power7\"\n"
> +	     "popcntd %0,%1\n"
> +	     ".machine \"pop\"\n"
> +	     : "=r" (parity) : "r" (val));

This was actually present in an older CPU, but it doesn't really matter,
this is fine.

cheers
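
As a sketch of the unquoted form suggested above, the popcntd annotation could look like the following. This assumes the assembler accepts `.machine push`, `.machine power7` and `.machine pop` without quotes. `popcount64()` is a hypothetical wrapper, not the kernel's rng_whiten(), and the non-powerpc branch exists only so the example builds elsewhere with GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical wrapper: count the set bits in a 64-bit value. */
static inline uint64_t popcount64(uint64_t val)
{
#if defined(__powerpc64__)
	uint64_t count;

	/* Raise the ISA level only around the one instruction that needs it. */
	__asm__(".machine push\n"
		".machine power7\n"
		"popcntd %0,%1\n"
		".machine pop\n"
		: "=r" (count) : "r" (val));
	return count;
#else
	/* Portable fallback for other targets (GCC/Clang builtin). */
	return (uint64_t)__builtin_popcountll(val);
#endif
}
```

The push/pop pair restores whatever machine setting was active before, so the annotation does not leak into the compiler-generated code that follows. It only affects what the assembler accepts; the caller still has to make sure the code runs on a CPU that implements the instruction.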
Anders Roxell Feb. 24, 2022, 4:12 p.m. UTC | #9
On Thu, 24 Feb 2022 at 13:39, Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Hi Anders,

Hi Michael,

>
> Thanks for these, just a few comments below ...

I will resolve the comments below and resend a v2 shortly.

Cheers,
Anders

>
> Anders Roxell <anders.roxell@linaro.org> writes:
> > Building tinyconfig with gcc (Debian 11.2.0-16) and assembler (Debian
> > 2.37.90.20220207) the following build error shows up:
> >
> >  {standard input}: Assembler messages:
> >  {standard input}:1190: Error: unrecognized opcode: `stbcix'
> >  {standard input}:1433: Error: unrecognized opcode: `lwzcix'
> >  {standard input}:1453: Error: unrecognized opcode: `stbcix'
> >  {standard input}:1460: Error: unrecognized opcode: `stwcix'
> >  {standard input}:1596: Error: unrecognized opcode: `stbcix'
> >  ...
> >
> > Rework to add assembler directives [1] around the instruction. Going
> > through them one by one shows that the changes should be safe.  Like
> > __get_user_atomic_128_aligned() is only called in p9_hmi_special_emu(),
> > which according to the name is specific to power9.  And __raw_rm_read*()
> > are only called in things that are powernv or book3s_hv specific.
> >
> > [1] https://sourceware.org/binutils/docs/as/PowerPC_002dPseudo.html#PowerPC_002dPseudo
> >
> > Cc: <stable@vger.kernel.org>
> > Co-developed-by: Arnd Bergmann <arnd@arndb.de>
> > Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> > Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
> > ---
> >  arch/powerpc/include/asm/io.h        | 46 +++++++++++++++++++++++-----
> >  arch/powerpc/include/asm/uaccess.h   |  3 ++
> >  arch/powerpc/platforms/powernv/rng.c |  6 +++-
> >  3 files changed, 46 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
> > index beba4979bff9..5ff6dec489f8 100644
> > --- a/arch/powerpc/include/asm/io.h
> > +++ b/arch/powerpc/include/asm/io.h
> > @@ -359,25 +359,37 @@ static inline void __raw_writeq_be(unsigned long v, volatile void __iomem *addr)
> >   */
> >  static inline void __raw_rm_writeb(u8 val, volatile void __iomem *paddr)
> >  {
> > -     __asm__ __volatile__("stbcix %0,0,%1"
> > +     __asm__ __volatile__(".machine \"push\"\n"
> > +                          ".machine \"power6\"\n"
> > +                          "stbcix %0,0,%1\n"
> > +                          ".machine \"pop\"\n"
> >               : : "r" (val), "r" (paddr) : "memory");
>
> As Segher said it'd be cleaner without the embedded quotes.
>
> > @@ -441,7 +465,10 @@ static inline unsigned int name(unsigned int port)       \
> >       unsigned int x;                                 \
> >       __asm__ __volatile__(                           \
> >               "sync\n"                                \
> > +             ".machine \"push\"\n"                   \
> > +             ".machine \"power6\"\n"                 \
> >               "0:"    op "    %0,0,%1\n"              \
> > +             ".machine \"pop\"\n"                    \
> >               "1:     twi     0,%0,0\n"               \
> >               "2:     isync\n"                        \
> >               "3:     nop\n"                          \
> > @@ -465,7 +492,10 @@ static inline void name(unsigned int val, unsigned int port) \
> >  {                                                    \
> >       __asm__ __volatile__(                           \
> >               "sync\n"                                \
> > +             ".machine \"push\"\n"                   \
> > +             ".machine \"power6\"\n"                 \
> >               "0:" op " %0,0,%1\n"                    \
> > +             ".machine \"pop\"\n"                    \
> >               "1:     sync\n"                         \
> >               "2:\n"                                  \
> >               EX_TABLE(0b, 2b)                        \
>
> It's not visible from the diff, but the above two are __do_in_asm and
> __do_out_asm and are inside an ifdef CONFIG_PPC32.
>
> AFAICS they're only used for:
>
> __do_in_asm(_rec_inb, "lbzx")
> __do_in_asm(_rec_inw, "lhbrx")
> __do_in_asm(_rec_inl, "lwbrx")
> __do_out_asm(_rec_outb, "stbx")
> __do_out_asm(_rec_outw, "sthbrx")
> __do_out_asm(_rec_outl, "stwbrx")
>
> Which are all old instructions, so I don't think we need the machine
> power6 for those two macros?
>
> > diff --git a/arch/powerpc/platforms/powernv/rng.c b/arch/powerpc/platforms/powernv/rng.c
> > index b4386714494a..5bf30ef6d928 100644
> > --- a/arch/powerpc/platforms/powernv/rng.c
> > +++ b/arch/powerpc/platforms/powernv/rng.c
> > @@ -43,7 +43,11 @@ static unsigned long rng_whiten(struct powernv_rng *rng, unsigned long val)
> >       unsigned long parity;
> >
> >       /* Calculate the parity of the value */
> > -     asm ("popcntd %0,%1" : "=r" (parity) : "r" (val));
> > +     asm (".machine \"push\"\n"
> > +          ".machine \"power7\"\n"
> > +          "popcntd %0,%1\n"
> > +          ".machine \"pop\"\n"
> > +          : "=r" (parity) : "r" (val));
>
> This was actually present in an older CPU, but it doesn't really matter,
> this is fine.
>
> cheers
Segher Boessenkool Feb. 24, 2022, 5:12 p.m. UTC | #10
On Thu, Feb 24, 2022 at 03:05:28PM +1000, Nicholas Piggin wrote:
> + * gcc 10 started to emit a .machine directive at the beginning of generated
> + * .s files, which overrides assembler -Wa,-m<cpu> options passed down.
> + * Unclear if this behaviour will be reverted.

It will not be reverted.  If you need a certain .machine for some asm
code, you should write just that!

> +#ifdef CONFIG_CC_IS_GCC
> +#if (GCC_VERSION >= 100000)
> +#if (CONFIG_AS_VERSION == 23800)
> +asm(".machine any");
> +#endif
> +#endif
> +#endif
> +#endif /* __ASSEMBLY__ */

Abusing toplevel asm like this is broken and you *will* end up with
unhappiness all around.


Segher
Segher Boessenkool Feb. 24, 2022, 5:29 p.m. UTC | #11
On Thu, Feb 24, 2022 at 09:13:25PM +1000, Nicholas Piggin wrote:
> Excerpts from Arnd Bergmann's message of February 24, 2022 8:20 pm:
> > Again, there should be a minimum number of those .machine directives
> > in inline asm as well, which tends to work out fine as long as the
> > entire kernel is built with the correct -march= option for the minimum
> > supported CPU, and stays away from inline asm that requires a higher
> > CPU level.
> 
> There's really no advantage to them, and they're ugly and annoying
> and if we applied the concept consistently for all asm they would grow 
> to a very large number.

The advantage is that you get machine code that *works*.  There are
quite a few mnemonics that translate to different instructions with
different machine options!  We like to get the intended instructions
instead of something that depends on what assembler options the user
has passed behind our backs.

> The idea they'll give you good static checking just doesn't really
> pan out.

That never was a goal of this at all.

-many was very problematical for GCC itself.  We no longer use it.


Segher
Segher Boessenkool Feb. 24, 2022, 5:37 p.m. UTC | #12
On Thu, Feb 24, 2022 at 11:39:16PM +1100, Michael Ellerman wrote:
> >  	/* Calculate the parity of the value */
> > -	asm ("popcntd %0,%1" : "=r" (parity) : "r" (val));
> > +	asm (".machine \"push\"\n"
> > +	     ".machine \"power7\"\n"
> > +	     "popcntd %0,%1\n"
> > +	     ".machine \"pop\"\n"
> > +	     : "=r" (parity) : "r" (val));
> 
> This was actually present in an older CPU, but it doesn't really matter,
> this is fine.

popcntd was new on p7 (popcntb is the older one :-) )  And it does not
matter indeed.


Segher
Nicholas Piggin Feb. 25, 2022, 12:23 a.m. UTC | #13
Excerpts from Segher Boessenkool's message of February 25, 2022 3:29 am:
> On Thu, Feb 24, 2022 at 09:13:25PM +1000, Nicholas Piggin wrote:
>> Excerpts from Arnd Bergmann's message of February 24, 2022 8:20 pm:
>> > Again, there should be a minimum number of those .machine directives
>> > in inline asm as well, which tends to work out fine as long as the
>> > entire kernel is built with the correct -march= option for the minimum
>> > supported CPU, and stays away from inline asm that requires a higher
>> > CPU level.
>> 
>> There's really no advantage to them, and they're ugly and annoying
>> and if we applied the concept consistently for all asm they would grow 
>> to a very large number.
> 
> The advantage is that you get machine code that *works*.  There are
> quite a few mnemonics that translate to different instructions with
> different machine options!  We like to get the intended instructions
> instead of something that depends on what assembler options the user
> has passed behind our backs.
> 
>> The idea they'll give you good static checking just doesn't really
>> pan out.
> 
> That never was a goal of this at all.
> 
> -many was very problematical for GCC itself.  We no longer use it.

You have the wrong context. We're not talking about -many vs .machine
here.

Thanks,
Nick
Nicholas Piggin Feb. 25, 2022, 12:32 a.m. UTC | #14
Excerpts from Segher Boessenkool's message of February 25, 2022 3:12 am:
> On Thu, Feb 24, 2022 at 03:05:28PM +1000, Nicholas Piggin wrote:
>> + * gcc 10 started to emit a .machine directive at the beginning of generated
>> + * .s files, which overrides assembler -Wa,-m<cpu> options passed down.
>> + * Unclear if this behaviour will be reverted.
> 
> It will not be reverted.  If you need a certain .machine for some asm
> code, you should write just that!

It should be reverted because it breaks old binutils which did not have
the workaround patch for this broken gcc behaviour. And it is just
unnecessary because -m option can already be used to do the same thing.

Not that I expect gcc to revert it.

> 
>> +#ifdef CONFIG_CC_IS_GCC
>> +#if (GCC_VERSION >= 100000)
>> +#if (CONFIG_AS_VERSION == 23800)
>> +asm(".machine any");
>> +#endif
>> +#endif
>> +#endif
>> +#endif /* __ASSEMBLY__ */
> 
> Abusing toplevel asm like this is broken and you *will* end up with
> unhappiness all around.

It actually unbreaks things and reduces my unhappiness. It's only done 
for broken compiler versions and only where as does not have the 
workaround for the breakage.
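
For readers following the guard quoted above, here is a minimal sketch of how its two version numbers decode. It assumes the usual kernel encoding of major * 10000 + minor * 100 + patchlevel for both GCC_VERSION and CONFIG_AS_VERSION, so 100000 is gcc 10.0.0 and 23800 is binutils 2.38. The helper names encode_version() and needs_machine_any() are hypothetical and exist only for illustration; the real patch uses preprocessor conditionals, not functions.

```c
/* Hypothetical helper: pack major.minor.patch into one comparable integer,
 * assuming the major * 10000 + minor * 100 + patch convention. */
static int encode_version(int major, int minor, int patch)
{
	return major * 10000 + minor * 100 + patch;
}

/* Hypothetical predicate mirroring the nested #if conditions in the patch:
 * gcc only, gcc >= 10, and exactly the one assembler version named there. */
static int needs_machine_any(int cc_is_gcc, int gcc_version, int as_version)
{
	return cc_is_gcc &&
	       gcc_version >= encode_version(10, 0, 0) &&
	       as_version == encode_version(2, 38, 0);
}
```

Under that assumption, gcc 11.2.0 with binutils 2.38 takes the workaround, while clang, gcc 9, or any other binutils release does not.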

Thanks,
Nick
Arnd Bergmann Feb. 25, 2022, 8:33 a.m. UTC | #15
On Fri, Feb 25, 2022 at 1:32 AM Nicholas Piggin <npiggin@gmail.com> wrote:
> Excerpts from Segher Boessenkool's message of February 25, 2022 3:12 am:
> >> +#ifdef CONFIG_CC_IS_GCC
> >> +#if (GCC_VERSION >= 100000)
> >> +#if (CONFIG_AS_VERSION == 23800)
> >> +asm(".machine any");
> >> +#endif
> >> +#endif
> >> +#endif
> >> +#endif /* __ASSEMBLY__ */
> >
> > Abusing toplevel asm like this is broken and you *will* end up with
> > unhappiness all around.
>
> It actually unbreaks things and reduces my unhappiness. It's only done
> for broken compiler versions and only where as does not have the
> workaround for the breakage.

It doesn't work with clang, which always passes explicit .machine
statements around each inline asm, and it's also fundamentally
incompatible with LTO builds. Generally speaking, you can't expect
a top-level asm statement to have any effect inside of another
function.

        Arnd
Nicholas Piggin Feb. 25, 2022, 10:51 a.m. UTC | #16
Excerpts from Arnd Bergmann's message of February 25, 2022 6:33 pm:
> On Fri, Feb 25, 2022 at 1:32 AM Nicholas Piggin <npiggin@gmail.com> wrote:
>> Excerpts from Segher Boessenkool's message of February 25, 2022 3:12 am:
>> >> +#ifdef CONFIG_CC_IS_GCC
>> >> +#if (GCC_VERSION >= 100000)
>> >> +#if (CONFIG_AS_VERSION == 23800)
>> >> +asm(".machine any");
>> >> +#endif
>> >> +#endif
>> >> +#endif
>> >> +#endif /* __ASSEMBLY__ */
>> >
>> > Abusing toplevel asm like this is broken and you *will* end up with
>> > unhappiness all around.
>>
>> It actually unbreaks things and reduces my unhappiness. It's only done
>> for broken compiler versions and only where as does not have the
>> workaround for the breakage.
> 
> It doesn't work with clang, which always passes explicit .machine
> statements around each inline asm, and it's also fundamentally
> incompatible with LTO builds. Generally speaking, you can't expect
> a top-level asm statement to have any effect inside of another
> function.

You have misunderstood my patch. It is not supposed to "work" with
clang and it is explicitly compiled out for clang. It's not intended
to have any implementation independent meaning. It's working around
a very specific issue with specific versions of gcc, and that's what
it does.

It's also not intended to be the final solution, it's a workaround
hack. We will move away from -many of course. I will post it as a
series, which hopefully will make it less confusing to people.

Thanks,
Nick
Segher Boessenkool Feb. 25, 2022, 10:28 p.m. UTC | #17
On Fri, Feb 25, 2022 at 10:23:07AM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of February 25, 2022 3:29 am:
> > On Thu, Feb 24, 2022 at 09:13:25PM +1000, Nicholas Piggin wrote:
> >> Excerpts from Arnd Bergmann's message of February 24, 2022 8:20 pm:
> >> > Again, there should be a minimum number of those .machine directives
> >> > in inline asm as well, which tends to work out fine as long as the
> >> > entire kernel is built with the correct -march= option for the minimum
> >> > supported CPU, and stays away from inline asm that requires a higher
> >> > CPU level.
> >> 
> >> There's really no advantage to them, and they're ugly and annoying
> >> and if we applied the concept consistently for all asm they would grow 
> >> to a very large number.
> > 
> > The advantage is that you get machine code that *works*.  There are
> > quite a few mnemonics that translate to different instructions with
> > different machine options!  We like to get the intended instructions
> > instead of something that depends on what assembler options the user
> > has passed behind our backs.
> > 
> >> The idea they'll give you good static checking just doesn't really
> >> pan out.
> > 
> > That never was a goal of this at all.
> > 
> > -many was very problematical for GCC itself.  We no longer use it.
> 
> You have the wrong context. We're not talking about -many vs .machine
> here.

Okay, so you have no idea what you are talking about?  Wow.

The reason GCC uses .machine *itself* is because assembler -mmachine
options *cannot work*, for many reasons.  We hit problems often enough
that years ago we started moving away from it already.  The biggest
problems are that on one hand there are mnemonics that encode to
different instructions depending on target arch or cpu selected (like
mftb, lxvx, wait, etc.), and on the other hand GCC needs to switch that
target halfway through compilation (attribute((target(...)))).

Often these problems were hidden most of the time by us passing -many.
But not all of the time, and over time, problems became more frequent
and nasty.

Passing assembler -m options is nasty when you have to mix it with
.machine statements (and we need the latter no matter what), and it
becomes completely unpredictable if the user passes other -m options
manually.

Inline assembler is inserted textually in the generated assembler code.
This is a big part of the strength of inline assembler.  It does mean
that if you need a different target selected for your assembler code
then you need to arrange for that in your assembler code.
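
A minimal sketch of what "arranging for that in your assembler code" can look like, assuming gas's PowerPC ".machine push" / ".machine pop" directives and using mftb as the example mnemonic. The function name read_timebase() and the choice of power4 as the selected machine are illustrative assumptions, not something taken from this thread. The non-PowerPC branch is only a placeholder so the file builds on other hosts.

```c
#include <stdint.h>

/* Hypothetical example: select the machine locally around one instruction
 * and restore the previous selection afterwards, so the encoding does not
 * depend on -m options or on an earlier .machine directive in the file. */
static inline uint64_t read_timebase(void)
{
#if defined(__powerpc64__)
	uint64_t tb;

	asm volatile(".machine push\n\t"	/* save current machine */
		     ".machine power4\n\t"	/* assumed target for mftb */
		     "mftb %0\n\t"
		     ".machine pop"		/* restore previous machine */
		     : "=r" (tb));
	return tb;
#else
	return 0;	/* placeholder on non-PowerPC hosts */
#endif
}
```

Because the directives travel with the inline asm text, the selection stays correct wherever the compiler places the statement in its output.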

So yes, this very much is about -many, other -m options, and .machine .
I discourage the kernel (as well as any other project) from using -m
options, especially -many, but that is your own choice of course.  I
get sick and tired from you calling a deliberate design decision we
arrived at after years of work and weighing alternatives a "bug" though.


Segher
Segher Boessenkool Feb. 25, 2022, 10:33 p.m. UTC | #18
On Fri, Feb 25, 2022 at 10:32:02AM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of February 25, 2022 3:12 am:
> > On Thu, Feb 24, 2022 at 03:05:28PM +1000, Nicholas Piggin wrote:
> >> + * gcc 10 started to emit a .machine directive at the beginning of generated
> >> + * .s files, which overrides assembler -Wa,-m<cpu> options passed down.
> >> + * Unclear if this behaviour will be reverted.
> > 
> > It will not be reverted.  If you need a certain .machine for some asm
> > code, you should write just that!
> 
> It should be reverted because it breaks old binutils which did not have
> the workaround patch for this broken gcc behaviour. And it is just
> unnecessary because -m option can already be used to do the same thing.
> 
> Not that I expect gcc to revert it.

Nothing will happen if you do not file a bug report.  And do read the
bug reporting instructions first please.

> >> +#ifdef CONFIG_CC_IS_GCC
> >> +#if (GCC_VERSION >= 100000)
> >> +#if (CONFIG_AS_VERSION == 23800)
> >> +asm(".machine any");
> >> +#endif
> >> +#endif
> >> +#endif
> >> +#endif /* __ASSEMBLY__ */
> > 
> > Abusing toplevel asm like this is broken and you *will* end up with
> > unhappiness all around.
> 
> It actually unbreaks things and reduces my unhappiness.

It is broken.  You will need -fno-toplevel-reorder, and you really do
not want that, if you *can* use it in the kernel even.

> It's only done 
> for broken compiler versions and only where as does not have the 
> workaround for the breakage.

What compiler versions?  Please file a PR.


Segher
Nicholas Piggin Feb. 26, 2022, 12:07 a.m. UTC | #19
Excerpts from Segher Boessenkool's message of February 26, 2022 8:28 am:
> On Fri, Feb 25, 2022 at 10:23:07AM +1000, Nicholas Piggin wrote:
>> Excerpts from Segher Boessenkool's message of February 25, 2022 3:29 am:
>> > On Thu, Feb 24, 2022 at 09:13:25PM +1000, Nicholas Piggin wrote:
>> >> Excerpts from Arnd Bergmann's message of February 24, 2022 8:20 pm:
>> >> > Again, there should be a minimum number of those .machine directives
>> >> > in inline asm as well, which tends to work out fine as long as the
>> >> > entire kernel is built with the correct -march= option for the minimum
>> >> > supported CPU, and stays away from inline asm that requires a higher
>> >> > CPU level.
>> >> 
>> >> There's really no advantage to them, and they're ugly and annoying
>> >> and if we applied the concept consistently for all asm they would grow 
>> >> to a very large number.
>> > 
>> > The advantage is that you get machine code that *works*.  There are
>> > quite a few mnemonics that translate to different instructions with
>> > different machine options!  We like to get the intended instructions
>> > instead of something that depends on what assembler options the user
>> > has passed behind our backs.
>> > 
>> >> The idea they'll give you good static checking just doesn't really
>> >> pan out.
>> > 
>> > That never was a goal of this at all.
>> > 
>> > -many was very problematical for GCC itself.  We no longer use it.
>> 
>> You have the wrong context. We're not talking about -many vs .machine
>> here.
> 
> Okay, so you have no idea what you are talking about?  Wow.

Wrong context. It's not about -many. We're past that; everyone agrees
it's wrong.

> The reason GCC uses .machine *itself* is because assembler -mmachine
> options *cannot work*, for many reasons.  We hit problems often enough
> that years ago we started moving away from it already.  The biggest
> problems are that on one hand there are mnemonics that encode to
> different instructions depending on target arch or cpu selected (like
> mftb, lxvx, wait, etc.), and on the other hand GCC needs to switch that
> target halfway through compilation (attribute((target(...)))).
> 
> Often these problems were hidden most of the time by us passing -many.
> But not all of the time, and over time, problems became more frequent
> and nasty.
> 
> Passing assembler -m options is nasty when you have to mix it with
> .machine statements (and we need the latter no matter what), and it

No it's not nasty, read the gas manual. -m specifies the machine and
so does .machine. It's simple.

> becomes completely unpredictable if the user passes other -m options
> manually.
> 
> Inline assembler is inserted textually in the generated assembler code.
> This is a big part of the strength of inline assembler.  It does mean
> that if you need a different target selected for your assembler code
> then you need to arrange for that in your assembler code.
> 
> So yes, this very much is about -many, other -m options, and .machine .
> I discourage the kernel (as well as any other project) from using -m
> options, especially -many, but that is your own choice of course.  I
> get sick and tired from you calling a deliberate design decision we
> arrived at after years of work and weighing alternatives a "bug" though.

Alan posted a good summary here

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102485#c10

Thanks,
Nick

Patch

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index bd3734d5be89..d2d29243fa6d 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -3389,7 +3389,7 @@  int emulate_loadstore(struct pt_regs *regs, struct instruction_op *op)
 			__put_user_asmx(op->val, ea, err, "stbcx.", cr);
 			break;
 		case 2:
-			__put_user_asmx(op->val, ea, err, "stbcx.", cr);
+			__put_user_asmx(op->val, ea, err, "sthcx.", cr);
 			break;
 #endif
 		case 4: