diff mbox

[ARM] stop changing signedness in PROMOTE_MODE

Message ID CAKdteOZZzE+aixcoiyqPRQ7qfwZF_QLM=eW2iVLuC39XYs6KDA@mail.gmail.com
State New
Headers show

Commit Message

Christophe Lyon Feb. 17, 2016, 10:20 a.m. UTC
On 17 February 2016 at 11:05, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>

> On 17/02/16 10:03, Christophe Lyon wrote:

>>

>> On 15 February 2016 at 12:32, Kyrill Tkachov

>> <kyrylo.tkachov@foss.arm.com> wrote:

>>>

>>> On 04/02/16 08:58, Ramana Radhakrishnan wrote:

>>>>

>>>> On Tue, Jun 30, 2015 at 2:15 AM, Jim Wilson <jim.wilson@linaro.org>

>>>> wrote:

>>>>>

>>>>> This is my suggested fix for PR 65932, which is a linux kernel

>>>>> miscompile with gcc-5.1.

>>>>>

>>>>> The problem here is caused by a chain of events.  The first is that

>>>>> the relatively new eipa_sra pass creates fake parameters that behave

>>>>> slightly differently than normal parameters.  The second is that the

>>>>> optimizer creates phi nodes that copy local variables to fake

>>>>> parameters and/or vice versa.  The third is that the ouf-of-ssa pass

>>>>> assumes that it can emit simple move instructions for these phi nodes.

>>>>> And the fourth is that the ARM port has a PROMOTE_MODE macro that

>>>>> forces QImode and HImode to unsigned, but a

>>>>> TARGET_PROMOTE_FUNCTION_MODE hook that does not.  So signed char and

>>>>> short parameters have different in register representations than local

>>>>> variables, and require a conversion when copying between them, a

>>>>> conversion that the out-of-ssa pass can't easily emit.

>>>>>

>>>>> Ultimately, I think this is a problem in the arm backend.  It should

>>>>> not have a PROMOTE_MODE macro that is changing the sign of char and

>>>>> short local variables.  I also think that we should merge the

>>>>> PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to

>>>>> prevent this from happening again.

>>>>>

>>>>> I see four general problems with the current ARM PROMOTE_MODE

>>>>> definition.

>>>>> 1) Unsigned char is only faster for armv5 and earlier, before the sxtb

>>>>> instruction was added.  It is a lose for armv6 and later.

>>>>> 2) Unsigned short was only faster for targets that don't support

>>>>> unaligned accesses.  Support for these targets was removed a while

>>>>> ago, and this PROMODE_MODE hunk should have been removed at the same

>>>>> time.  It was accidentally left behind.

>>>>> 3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook, when it was

>>>>> converted to a function, the PROMOTE_MODE code was copied without the

>>>>> UNSIGNEDP changes.  Thus it is only an accident that

>>>>> TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree.  Changing

>>>>> TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE

>>>>> changes to resolve the difference are safe.

>>>>> 4) There is a general principle that you should only change signedness

>>>>> in PROMOTE_MODE if the hardware forces it, as otherwise this results

>>>>> in extra conversion instructions that make code slower.  The mips64

>>>>> hardware for instance requires that 32-bit values be sign-extended

>>>>> regardless of type, and instructions may trap if this is not true.

>>>>> However, it has a set of 32-bit instructions that operate on these

>>>>> values, and hence no conversions are required.  There is no similar

>>>>> case on ARM. Thus the conversions are unnecessary and unwise.  This

>>>>> can be seen in the testcases where gcc emits both a zero-extend and a

>>>>> sign-extend inside a loop, as the sign-extend is required for a

>>>>> compare, and the zero-extend is required by PROMOTE_MODE.

>>>>

>>>> Given Kyrill's testing with the patch and the reasonably detailed

>>>> check of the effects of code generation changes - The arm.h hunk is ok

>>>> - I do think we should make this explicit in the documentation that

>>>> TARGET_PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should agree and

>>>> better still maybe put in a checking assert for the same in the

>>>> mid-end but that could be the subject of a follow-up patch.

>>>>

>>>> Ok to apply just the arm.h hunk as I think Kyrill has taken care of

>>>> the testsuite fallout separately.

>>>

>>> Hi all,

>>>

>>> I'd like to backport the arm.h from this ( r233130) to the GCC 5

>>> branch. As the CSE patch from my series had some fallout on x86_64

>>> due to a deficiency in the AVX patterns that is too invasive to fix

>>> at this stage (and presumably backport), I'd like to just backport

>>> this arm.h fix and adjust the tests to XFAIL the fallout that comes

>>> with not applying the CSE patch. The attached patch does that.

>>>

>>> The code quality fallout on code outside the testsuite is not

>>> that gread. The SPEC benchmarks are not affected by not applying

>>> the CSE change, and only a single sequence in a popular embedded

>>> benchmark

>>> shows some degradation for -mtune=cortex-a9 in the same way as the

>>> wmul-1.c and wmul-2.c tests.

>>>

>>> I think that's a fair tradeoff for fixing the wrong code bug on that

>>> branch.

>>>

>>> Ok to backport r233130 and the attached testsuite patch to the GCC 5

>>> branch?

>>>

>>> Thanks,

>>> Kyrill

>>>

>>> 2016-02-15  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>>

>>>      PR target/65932

>>>      * gcc.target/arm/wmul-1.c: Add -mtune=cortex-a9 to dg-options.

>>>      xfail the scan-assembler test.

>>>      * gcc.target/arm/wmul-2.c: Likewise.

>>>      * gcc.target/arm/wmul-3.c: Simplify test to generate a single

>>> smulbb.

>>>

>>>

>> Hi Kyrill,

>>

>> I've noticed that wmul-3 still fails on the gcc-5 branch when forcing GCC

>> configuration to:

>> --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16

>> (target arm-none-linux-gnueabihf)

>>

>> The generated code is:

>>          sxth    r0, r0

>>          sxth    r1, r1

>>          mul     r0, r1, r0

>> instead of

>>          smulbb  r0, r1, r0

>> on trunk.

>>

>> I guess we don't worry?

>

>

> Hi Christophe,

> Hmmm, I suspect we might want to backport

> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01714.html

> to fix backend the costing logic of smulbb.

> Could you please try that patch to see if it helps?

>


Ha indeed, with the attached patch, we now generate smulbb.
I didn't run a full make check though.

OK with a suitable ChangeLog entry?

Christophe.

> Thanks,

> Kyrill

>

>

>>>

>>>> regards

>>>> Ramana

>>>>

>>>>

>>>>

>>>>

>>>>> My change was tested with an arm bootstrap, make check, and SPEC

>>>>> CPU2000 run.  The original poster verified that this gives a linux

>>>>> kernel that boots correctly.

>>>>>

>>>>> The PRMOTE_MODE change causes 3 testsuite testcases to fail.  These

>>>>> are tests to verify that smulbb and/or smlabb are generated.

>>>>> Eliminating the unnecessary sign conversions causes us to get better

>>>>> code that doesn't include the smulbb and smlabb instructions.  I had

>>>>> to modify the testcases to get them to emit the desired instructions.

>>>>> With the testcase changes there are no additional testsuite failures,

>>>>> though I'm concerned that these testcases with the changes may be

>>>>> fragile, and future changes may break them again.

>>>>

>>>>

>>>>

>>>>> If there are ARM parts where smulbb/smlabb are faster than mul/mla,

>>>>> then maybe we should try to add new patterns to get the instructions

>>>>> emitted again for the unmodified testcases.

>>>>>

>>>>> Jim

>>>

>>>

>

Comments

Christophe Lyon Feb. 18, 2016, 10:16 a.m. UTC | #1
On 17 February 2016 at 11:22, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>

> On 17/02/16 10:20, Christophe Lyon wrote:

>>

>> On 17 February 2016 at 11:05, Kyrill Tkachov

>> <kyrylo.tkachov@foss.arm.com> wrote:

>>>

>>> On 17/02/16 10:03, Christophe Lyon wrote:

>>>>

>>>> On 15 February 2016 at 12:32, Kyrill Tkachov

>>>> <kyrylo.tkachov@foss.arm.com> wrote:

>>>>>

>>>>> On 04/02/16 08:58, Ramana Radhakrishnan wrote:

>>>>>>

>>>>>> On Tue, Jun 30, 2015 at 2:15 AM, Jim Wilson <jim.wilson@linaro.org>

>>>>>> wrote:

>>>>>>>

>>>>>>> This is my suggested fix for PR 65932, which is a linux kernel

>>>>>>> miscompile with gcc-5.1.

>>>>>>>

>>>>>>> The problem here is caused by a chain of events.  The first is that

>>>>>>> the relatively new eipa_sra pass creates fake parameters that behave

>>>>>>> slightly differently than normal parameters.  The second is that the

>>>>>>> optimizer creates phi nodes that copy local variables to fake

>>>>>>> parameters and/or vice versa.  The third is that the ouf-of-ssa pass

>>>>>>> assumes that it can emit simple move instructions for these phi

>>>>>>> nodes.

>>>>>>> And the fourth is that the ARM port has a PROMOTE_MODE macro that

>>>>>>> forces QImode and HImode to unsigned, but a

>>>>>>> TARGET_PROMOTE_FUNCTION_MODE hook that does not.  So signed char and

>>>>>>> short parameters have different in register representations than

>>>>>>> local

>>>>>>> variables, and require a conversion when copying between them, a

>>>>>>> conversion that the out-of-ssa pass can't easily emit.

>>>>>>>

>>>>>>> Ultimately, I think this is a problem in the arm backend.  It should

>>>>>>> not have a PROMOTE_MODE macro that is changing the sign of char and

>>>>>>> short local variables.  I also think that we should merge the

>>>>>>> PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to

>>>>>>> prevent this from happening again.

>>>>>>>

>>>>>>> I see four general problems with the current ARM PROMOTE_MODE

>>>>>>> definition.

>>>>>>> 1) Unsigned char is only faster for armv5 and earlier, before the

>>>>>>> sxtb

>>>>>>> instruction was added.  It is a lose for armv6 and later.

>>>>>>> 2) Unsigned short was only faster for targets that don't support

>>>>>>> unaligned accesses.  Support for these targets was removed a while

>>>>>>> ago, and this PROMODE_MODE hunk should have been removed at the same

>>>>>>> time.  It was accidentally left behind.

>>>>>>> 3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook, when it

>>>>>>> was

>>>>>>> converted to a function, the PROMOTE_MODE code was copied without the

>>>>>>> UNSIGNEDP changes.  Thus it is only an accident that

>>>>>>> TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree.  Changing

>>>>>>> TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE

>>>>>>> changes to resolve the difference are safe.

>>>>>>> 4) There is a general principle that you should only change

>>>>>>> signedness

>>>>>>> in PROMOTE_MODE if the hardware forces it, as otherwise this results

>>>>>>> in extra conversion instructions that make code slower.  The mips64

>>>>>>> hardware for instance requires that 32-bit values be sign-extended

>>>>>>> regardless of type, and instructions may trap if this is not true.

>>>>>>> However, it has a set of 32-bit instructions that operate on these

>>>>>>> values, and hence no conversions are required.  There is no similar

>>>>>>> case on ARM. Thus the conversions are unnecessary and unwise.  This

>>>>>>> can be seen in the testcases where gcc emits both a zero-extend and a

>>>>>>> sign-extend inside a loop, as the sign-extend is required for a

>>>>>>> compare, and the zero-extend is required by PROMOTE_MODE.

>>>>>>

>>>>>> Given Kyrill's testing with the patch and the reasonably detailed

>>>>>> check of the effects of code generation changes - The arm.h hunk is ok

>>>>>> - I do think we should make this explicit in the documentation that

>>>>>> TARGET_PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should agree and

>>>>>> better still maybe put in a checking assert for the same in the

>>>>>> mid-end but that could be the subject of a follow-up patch.

>>>>>>

>>>>>> Ok to apply just the arm.h hunk as I think Kyrill has taken care of

>>>>>> the testsuite fallout separately.

>>>>>

>>>>> Hi all,

>>>>>

>>>>> I'd like to backport the arm.h from this ( r233130) to the GCC 5

>>>>> branch. As the CSE patch from my series had some fallout on x86_64

>>>>> due to a deficiency in the AVX patterns that is too invasive to fix

>>>>> at this stage (and presumably backport), I'd like to just backport

>>>>> this arm.h fix and adjust the tests to XFAIL the fallout that comes

>>>>> with not applying the CSE patch. The attached patch does that.

>>>>>

>>>>> The code quality fallout on code outside the testsuite is not

>>>>> that gread. The SPEC benchmarks are not affected by not applying

>>>>> the CSE change, and only a single sequence in a popular embedded

>>>>> benchmark

>>>>> shows some degradation for -mtune=cortex-a9 in the same way as the

>>>>> wmul-1.c and wmul-2.c tests.

>>>>>

>>>>> I think that's a fair tradeoff for fixing the wrong code bug on that

>>>>> branch.

>>>>>

>>>>> Ok to backport r233130 and the attached testsuite patch to the GCC 5

>>>>> branch?

>>>>>

>>>>> Thanks,

>>>>> Kyrill

>>>>>

>>>>> 2016-02-15  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>>>>

>>>>>       PR target/65932

>>>>>       * gcc.target/arm/wmul-1.c: Add -mtune=cortex-a9 to dg-options.

>>>>>       xfail the scan-assembler test.

>>>>>       * gcc.target/arm/wmul-2.c: Likewise.

>>>>>       * gcc.target/arm/wmul-3.c: Simplify test to generate a single

>>>>> smulbb.

>>>>>

>>>>>

>>>> Hi Kyrill,

>>>>

>>>> I've noticed that wmul-3 still fails on the gcc-5 branch when forcing

>>>> GCC

>>>> configuration to:

>>>> --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16

>>>> (target arm-none-linux-gnueabihf)

>>>>

>>>> The generated code is:

>>>>           sxth    r0, r0

>>>>           sxth    r1, r1

>>>>           mul     r0, r1, r0

>>>> instead of

>>>>           smulbb  r0, r1, r0

>>>> on trunk.

>>>>

>>>> I guess we don't worry?

>>>

>>>

>>> Hi Christophe,

>>> Hmmm, I suspect we might want to backport

>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01714.html

>>> to fix backend the costing logic of smulbb.

>>> Could you please try that patch to see if it helps?

>>>

>> Ha indeed, with the attached patch, we now generate smulbb.

>> I didn't run a full make check though.

>

>

> Thanks for checking.

>

>> OK with a suitable ChangeLog entry?

>

>

> Can you please do a full test run when you get the chance?


Done, I noticed no other difference.

> Since the patch is mine we'll need an ok from another arm maintainer ;)

OK :)

> Thanks,

> Kyrill

>

>

>>

>> Christophe.

>>

>>> Thanks,

>>> Kyrill

>>>

>>>

>>>>>> regards

>>>>>> Ramana

>>>>>>

>>>>>>

>>>>>>

>>>>>>

>>>>>>> My change was tested with an arm bootstrap, make check, and SPEC

>>>>>>> CPU2000 run.  The original poster verified that this gives a linux

>>>>>>> kernel that boots correctly.

>>>>>>>

>>>>>>> The PRMOTE_MODE change causes 3 testsuite testcases to fail.  These

>>>>>>> are tests to verify that smulbb and/or smlabb are generated.

>>>>>>> Eliminating the unnecessary sign conversions causes us to get better

>>>>>>> code that doesn't include the smulbb and smlabb instructions.  I had

>>>>>>> to modify the testcases to get them to emit the desired instructions.

>>>>>>> With the testcase changes there are no additional testsuite failures,

>>>>>>> though I'm concerned that these testcases with the changes may be

>>>>>>> fragile, and future changes may break them again.

>>>>>>

>>>>>>

>>>>>>

>>>>>>> If there are ARM parts where smulbb/smlabb are faster than mul/mla,

>>>>>>> then maybe we should try to add new patterns to get the instructions

>>>>>>> emitted again for the unmodified testcases.

>>>>>>>

>>>>>>> Jim

>>>>>

>>>>>

>
Christophe Lyon March 7, 2016, 12:55 p.m. UTC | #2
On 7 March 2016 at 05:43, Ramana Radhakrishnan
<ramana.gcc@googlemail.com> wrote:
> On Wed, Feb 17, 2016 at 10:20 AM, Christophe Lyon

> <christophe.lyon@linaro.org> wrote:

>> On 17 February 2016 at 11:05, Kyrill Tkachov

>> <kyrylo.tkachov@foss.arm.com> wrote:

>>>

>>> On 17/02/16 10:03, Christophe Lyon wrote:

>>>>

>>>> On 15 February 2016 at 12:32, Kyrill Tkachov

>>>> <kyrylo.tkachov@foss.arm.com> wrote:

>>>>>

>>>>> On 04/02/16 08:58, Ramana Radhakrishnan wrote:

>>>>>>

>>>>>> On Tue, Jun 30, 2015 at 2:15 AM, Jim Wilson <jim.wilson@linaro.org>

>>>>>> wrote:

>>>>>>>

>>>>>>> This is my suggested fix for PR 65932, which is a linux kernel

>>>>>>> miscompile with gcc-5.1.

>>>>>>>

>>>>>>> The problem here is caused by a chain of events.  The first is that

>>>>>>> the relatively new eipa_sra pass creates fake parameters that behave

>>>>>>> slightly differently than normal parameters.  The second is that the

>>>>>>> optimizer creates phi nodes that copy local variables to fake

>>>>>>> parameters and/or vice versa.  The third is that the ouf-of-ssa pass

>>>>>>> assumes that it can emit simple move instructions for these phi nodes.

>>>>>>> And the fourth is that the ARM port has a PROMOTE_MODE macro that

>>>>>>> forces QImode and HImode to unsigned, but a

>>>>>>> TARGET_PROMOTE_FUNCTION_MODE hook that does not.  So signed char and

>>>>>>> short parameters have different in register representations than local

>>>>>>> variables, and require a conversion when copying between them, a

>>>>>>> conversion that the out-of-ssa pass can't easily emit.

>>>>>>>

>>>>>>> Ultimately, I think this is a problem in the arm backend.  It should

>>>>>>> not have a PROMOTE_MODE macro that is changing the sign of char and

>>>>>>> short local variables.  I also think that we should merge the

>>>>>>> PROMOTE_MODE macro with the TARGET_PROMOTE_FUNCTION_MODE hook to

>>>>>>> prevent this from happening again.

>>>>>>>

>>>>>>> I see four general problems with the current ARM PROMOTE_MODE

>>>>>>> definition.

>>>>>>> 1) Unsigned char is only faster for armv5 and earlier, before the sxtb

>>>>>>> instruction was added.  It is a lose for armv6 and later.

>>>>>>> 2) Unsigned short was only faster for targets that don't support

>>>>>>> unaligned accesses.  Support for these targets was removed a while

>>>>>>> ago, and this PROMODE_MODE hunk should have been removed at the same

>>>>>>> time.  It was accidentally left behind.

>>>>>>> 3) TARGET_PROMOTE_FUNCTION_MODE used to be a boolean hook, when it was

>>>>>>> converted to a function, the PROMOTE_MODE code was copied without the

>>>>>>> UNSIGNEDP changes.  Thus it is only an accident that

>>>>>>> TARGET_PROMOTE_FUNCTION_MODE and PROMOTE_MODE disagree.  Changing

>>>>>>> TARGET_PROMOTE_FUNCTION_MODE is an ABI change, so only PROMOTE_MODE

>>>>>>> changes to resolve the difference are safe.

>>>>>>> 4) There is a general principle that you should only change signedness

>>>>>>> in PROMOTE_MODE if the hardware forces it, as otherwise this results

>>>>>>> in extra conversion instructions that make code slower.  The mips64

>>>>>>> hardware for instance requires that 32-bit values be sign-extended

>>>>>>> regardless of type, and instructions may trap if this is not true.

>>>>>>> However, it has a set of 32-bit instructions that operate on these

>>>>>>> values, and hence no conversions are required.  There is no similar

>>>>>>> case on ARM. Thus the conversions are unnecessary and unwise.  This

>>>>>>> can be seen in the testcases where gcc emits both a zero-extend and a

>>>>>>> sign-extend inside a loop, as the sign-extend is required for a

>>>>>>> compare, and the zero-extend is required by PROMOTE_MODE.

>>>>>>

>>>>>> Given Kyrill's testing with the patch and the reasonably detailed

>>>>>> check of the effects of code generation changes - The arm.h hunk is ok

>>>>>> - I do think we should make this explicit in the documentation that

>>>>>> TARGET_PROMOTE_MODE and TARGET_PROMOTE_FUNCTION_MODE should agree and

>>>>>> better still maybe put in a checking assert for the same in the

>>>>>> mid-end but that could be the subject of a follow-up patch.

>>>>>>

>>>>>> Ok to apply just the arm.h hunk as I think Kyrill has taken care of

>>>>>> the testsuite fallout separately.

>>>>>

>>>>> Hi all,

>>>>>

>>>>> I'd like to backport the arm.h from this ( r233130) to the GCC 5

>>>>> branch. As the CSE patch from my series had some fallout on x86_64

>>>>> due to a deficiency in the AVX patterns that is too invasive to fix

>>>>> at this stage (and presumably backport), I'd like to just backport

>>>>> this arm.h fix and adjust the tests to XFAIL the fallout that comes

>>>>> with not applying the CSE patch. The attached patch does that.

>>>>>

>>>>> The code quality fallout on code outside the testsuite is not

>>>>> that gread. The SPEC benchmarks are not affected by not applying

>>>>> the CSE change, and only a single sequence in a popular embedded

>>>>> benchmark

>>>>> shows some degradation for -mtune=cortex-a9 in the same way as the

>>>>> wmul-1.c and wmul-2.c tests.

>>>>>

>>>>> I think that's a fair tradeoff for fixing the wrong code bug on that

>>>>> branch.

>>>>>

>>>>> Ok to backport r233130 and the attached testsuite patch to the GCC 5

>>>>> branch?

>>>>>

>>>>> Thanks,

>>>>> Kyrill

>>>>>

>>>>> 2016-02-15  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>>>>

>>>>>      PR target/65932

>>>>>      * gcc.target/arm/wmul-1.c: Add -mtune=cortex-a9 to dg-options.

>>>>>      xfail the scan-assembler test.

>>>>>      * gcc.target/arm/wmul-2.c: Likewise.

>>>>>      * gcc.target/arm/wmul-3.c: Simplify test to generate a single

>>>>> smulbb.

>>>>>

>>>>>

>>>> Hi Kyrill,

>>>>

>>>> I've noticed that wmul-3 still fails on the gcc-5 branch when forcing GCC

>>>> configuration to:

>>>> --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16

>>>> (target arm-none-linux-gnueabihf)

>>>>

>>>> The generated code is:

>>>>          sxth    r0, r0

>>>>          sxth    r1, r1

>>>>          mul     r0, r1, r0

>>>> instead of

>>>>          smulbb  r0, r1, r0

>>>> on trunk.

>>>>

>>>> I guess we don't worry?

>>>

>>>

>>> Hi Christophe,

>>> Hmmm, I suspect we might want to backport

>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01714.html

>>> to fix backend the costing logic of smulbb.

>>> Could you please try that patch to see if it helps?

>>>

>>

>> Ha indeed, with the attached patch, we now generate smulbb.

>> I didn't run a full make check though.

>>

>> OK with a suitable ChangeLog entry?

>

> Ok with a suitable Changelog entry and if no regressions.

>

Thanks, committed as r2340169, with no regression:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/gcc-5-branch/234019/report-build-info.html

> Ramana

>

>>

>> Christophe.

>>

>>> Thanks,

>>> Kyrill

>>>

>>>

>>>>>

>>>>>> regards

>>>>>> Ramana

>>>>>>

>>>>>>

>>>>>>

>>>>>>

>>>>>>> My change was tested with an arm bootstrap, make check, and SPEC

>>>>>>> CPU2000 run.  The original poster verified that this gives a linux

>>>>>>> kernel that boots correctly.

>>>>>>>

>>>>>>> The PRMOTE_MODE change causes 3 testsuite testcases to fail.  These

>>>>>>> are tests to verify that smulbb and/or smlabb are generated.

>>>>>>> Eliminating the unnecessary sign conversions causes us to get better

>>>>>>> code that doesn't include the smulbb and smlabb instructions.  I had

>>>>>>> to modify the testcases to get them to emit the desired instructions.

>>>>>>> With the testcase changes there are no additional testsuite failures,

>>>>>>> though I'm concerned that these testcases with the changes may be

>>>>>>> fragile, and future changes may break them again.

>>>>>>

>>>>>>

>>>>>>

>>>>>>> If there are ARM parts where smulbb/smlabb are faster than mul/mla,

>>>>>>> then maybe we should try to add new patterns to get the instructions

>>>>>>> emitted again for the unmodified testcases.

>>>>>>>

>>>>>>> Jim

>>>>>

>>>>>

>>>
diff mbox

Patch

Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 233484)
+++ gcc/config/arm/arm.c	(working copy)
@@ -10306,8 +10306,10 @@ 
 	      /* SMUL[TB][TB].  */
 	      if (speed_p)
 		*cost += extra_cost->mult[0].extend;
-	      *cost += (rtx_cost (XEXP (x, 0), SIGN_EXTEND, 0, speed_p)
-			+ rtx_cost (XEXP (x, 1), SIGN_EXTEND, 0, speed_p));
+	      *cost += rtx_cost (XEXP (XEXP (x, 0), 0),
+				 SIGN_EXTEND, 0, speed_p);
+	      *cost += rtx_cost (XEXP (XEXP (x, 1), 0),
+				 SIGN_EXTEND, 1, speed_p);
 	      return true;
 	    }
 	  if (speed_p)