[ARM] PR 67591 ARM v8 Thumb IT blocks are deprecated

Message ID CAKdteOZiSJvDWMfewbMao2FbJu4z+bkuzve4+mt9fUD78Qm_jQ@mail.gmail.com
State New
Headers show

Commit Message

Christophe Lyon Sept. 7, 2016, 8:05 p.m.
Hi,

The attached patch is a first part to solve PR 67591: it removes
several occurrences of "IT blocks containing 32-bit Thumb
instructions are deprecated in ARMv8" messages in the
gcc/g++/libstdc++/fortran testsuites.

It does not remove them all yet. This patch only modifies the
*cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp,
*and_scc_scc and *and_scc_scc_cmp patterns.
Additional work is required in sub_shiftsi etc, at least.
I've started looking at these, but I decided I could already
post this self-contained patch to check if this implementation
is OK.

Regarding *cmp_and and *cmp_ior patterns, the addition of the
enabled_for_depr_it attribute is aggressive in the sense that it keeps
only the alternatives with 'l' and 'Py' constraints, while in some
cases the constraints could be relaxed. Indeed, these 2 patterns can
swap their input comparisons, meaning that any of them can be emitted
in the IT-block, and is thus subject to the ARMv8 deprecation.
The generated code is possibly suboptimal in the cases where the
operands are not swapped, since 'r' could be used.

Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a
and --with-cpu=cortex-a57 --with-mode=thumb, showing only improvements:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239850-depr-it-4/report-build-info.html

Bootstrapped OK on armv8l HW.

Is this OK?

Thanks,

Christophe
2016-09-05  Christophe Lyon  <christophe.lyon@linaro.org>

	PR target/67591
	* config/arm/arm.md (*cmp_and): Add enabled_for_depr_it attribute.
	(*cmp_ior): Likewise.
	(*ior_scc_scc): Add alternative for enabled_for_depr_it attribute.
	(*ior_scc_scc_cmp): Likewise.
	(*and_scc_scc): Likewise.
	(*and_scc_scc_cmp): Likewise.

Comments

Christophe Lyon Oct. 12, 2016, 8:59 a.m. | #1
Hi Kyrill,

On 7 October 2016 at 17:00, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
> Hi Christophe,

>

>

> On 07/09/16 21:05, Christophe Lyon wrote:

>>

>> Hi,

>>

>> The attached patch is a first part to solve PR 67591: it removes

>> several occurrences of "IT blocks containing 32-bit Thumb

>> instructions are deprecated in ARMv8" messages in the

>> gcc/g++/libstdc++/fortran testsuites.

>>

>> It does not remove them all yet. This patch only modifies the

>> *cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp,

>> *and_scc_scc and *and_scc_scc_cmp patterns.

>> Additional work is required in sub_shiftsi etc, at least.

>> I've started looking at these, but I decided I could already

>> post this self-contained patch to check if this implementation

>> is OK.

>>

>> Regarding *cmp_and and *cmp_ior patterns, the addition of the

>> enabled_for_depr_it attribute is aggressive in the sense that it keeps

>> only the alternatives with 'l' and 'Py' constraints, while in some

>> cases the constraints could be relaxed. Indeed, these 2 patterns can

>> swap their input comparisons, meaning that any of them can be emitted

>> in the IT-block, and is thus subject to the ARMv8 deprecation.

>> The generated code is possibly suboptimal in the cases where the

>> operands are not swapped, since 'r' could be used.

>>

>> Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a

>> and --with-cpu=cortex-a57 --with-mode=thumb, showing only improvements:

>>

>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239850-depr-it-4/report-build-info.html

>>

>> Bootstrapped OK on armv8l HW.

>>

>> Is this OK?

>>

>> Thanks,

>>

>> Christophe

>

>

>  (define_insn_and_split "*ior_scc_scc"

> -  [(set (match_operand:SI 0 "s_register_operand" "=Ts")

> +  [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts")

>         (ior:SI (match_operator:SI 3 "arm_comparison_operator"

> -                [(match_operand:SI 1 "s_register_operand" "r")

> -                 (match_operand:SI 2 "arm_add_operand" "rIL")])

> +                [(match_operand:SI 1 "s_register_operand" "r,l")

> +                 (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])

>                 (match_operator:SI 6 "arm_comparison_operator"

> -                [(match_operand:SI 4 "s_register_operand" "r")

> -                 (match_operand:SI 5 "arm_add_operand" "rIL")])))

> +                [(match_operand:SI 4 "s_register_operand" "r,l")

> +                 (match_operand:SI 5 "arm_add_operand" "rIL,lPy")])))

>

> Can you please put the more restrictive alternatives (lPy) first?

> Same with the other patterns your patch touches.

> Ok with that change if a rebased testing run is ok.

> Sorry for the delay in reviewing.

>


OK, I will update my patch accordingly.

However, when I discussed this with Ramana during the Cauldron,
he requested benchmark results. So far, I was able to run spec2006
on an APM machine, and I'm seeing performance changes in the
range 11% improvement (465.tonto) to 7% degradation (433.milc).

Would that be acceptable?

The number of warnings (IT blocks containing 32-bit Thumb instructions
are deprecated in ARMv8)
was 712 without my patch and 122 with it. (using the hosts's binutils
2.24/ubuntu).
I expected some warning, since as I said earlier other patterns need
to be updated.

Christophe


> Thanks for improving this area!

> Kyrill

>
Christophe Lyon Oct. 12, 2016, 9:22 a.m. | #2
On 12 October 2016 at 11:14, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>

> On 12/10/16 09:59, Christophe Lyon wrote:

>>

>> Hi Kyrill,

>>

>> On 7 October 2016 at 17:00, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>

>> wrote:

>>>

>>> Hi Christophe,

>>>

>>>

>>> On 07/09/16 21:05, Christophe Lyon wrote:

>>>>

>>>> Hi,

>>>>

>>>> The attached patch is a first part to solve PR 67591: it removes

>>>> several occurrences of "IT blocks containing 32-bit Thumb

>>>> instructions are deprecated in ARMv8" messages in the

>>>> gcc/g++/libstdc++/fortran testsuites.

>>>>

>>>> It does not remove them all yet. This patch only modifies the

>>>> *cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp,

>>>> *and_scc_scc and *and_scc_scc_cmp patterns.

>>>> Additional work is required in sub_shiftsi etc, at least.

>>>> I've started looking at these, but I decided I could already

>>>> post this self-contained patch to check if this implementation

>>>> is OK.

>>>>

>>>> Regarding *cmp_and and *cmp_ior patterns, the addition of the

>>>> enabled_for_depr_it attribute is aggressive in the sense that it keeps

>>>> only the alternatives with 'l' and 'Py' constraints, while in some

>>>> cases the constraints could be relaxed. Indeed, these 2 patterns can

>>>> swap their input comparisons, meaning that any of them can be emitted

>>>> in the IT-block, and is thus subject to the ARMv8 deprecation.

>>>> The generated code is possibly suboptimal in the cases where the

>>>> operands are not swapped, since 'r' could be used.

>>>>

>>>> Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a

>>>> and --with-cpu=cortex-a57 --with-mode=thumb, showing only improvements:

>>>>

>>>>

>>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239850-depr-it-4/report-build-info.html

>>>>

>>>> Bootstrapped OK on armv8l HW.

>>>>

>>>> Is this OK?

>>>>

>>>> Thanks,

>>>>

>>>> Christophe

>>>

>>>

>>>   (define_insn_and_split "*ior_scc_scc"

>>> -  [(set (match_operand:SI 0 "s_register_operand" "=Ts")

>>> +  [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts")

>>>          (ior:SI (match_operator:SI 3 "arm_comparison_operator"

>>> -                [(match_operand:SI 1 "s_register_operand" "r")

>>> -                 (match_operand:SI 2 "arm_add_operand" "rIL")])

>>> +                [(match_operand:SI 1 "s_register_operand" "r,l")

>>> +                 (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])

>>>                  (match_operator:SI 6 "arm_comparison_operator"

>>> -                [(match_operand:SI 4 "s_register_operand" "r")

>>> -                 (match_operand:SI 5 "arm_add_operand" "rIL")])))

>>> +                [(match_operand:SI 4 "s_register_operand" "r,l")

>>> +                 (match_operand:SI 5 "arm_add_operand" "rIL,lPy")])))

>>>

>>> Can you please put the more restrictive alternatives (lPy) first?

>>> Same with the other patterns your patch touches.

>>> Ok with that change if a rebased testing run is ok.

>>> Sorry for the delay in reviewing.

>>>

>> OK, I will update my patch accordingly.

>>

>> However, when I discussed this with Ramana during the Cauldron,

>> he requested benchmark results. So far, I was able to run spec2006

>> on an APM machine, and I'm seeing performance changes in the

>> range 11% improvement (465.tonto) to 7% degradation (433.milc).

>>

>> Would that be acceptable?

>

>

> Those sound like quite large swings.

Indeed, but most are in the -1%-+1% range.

> Are you sure the machine was not running anything else at the time

> or playing tricks with frequency scaling?

No, I had no such guarantee. I used this machine temporarily,
first to check that bootstrap worked. I planed to use another
board with an A57 "standard" microarch for proper
benchmarking, but I'm not sure when I'll have access to it
(wrt to e/o gcc stage1), that's why I reported these early
figures.

> Did all iterations of SPEC show a consistent difference?

>

> If the changes are consistent, could you have a look at the codegen

> to see if there are any clues to the differences?

I will update my patch according to your comment, re-run the bench
and have a deeper look at the codegen differences.

> I'd like to get an explanation for these differences before committing

> this patch. If they are just an effect of the more restrictive constraints

> then there may be not much we can do, but I'd like to make sure there's not

> anything else unexpected going on.

>

Thanks,

Christophe

>>

>> The number of warnings (IT blocks containing 32-bit Thumb instructions

>> are deprecated in ARMv8)

>> was 712 without my patch and 122 with it. (using the hosts's binutils

>> 2.24/ubuntu).

>> I expected some warning, since as I said earlier other patterns need

>> to be updated.

>

>

> Understood. That's fine.

>

> Thanks,

> Kyrill

>

>

>>

>> Christophe

>>

>>

>>> Thanks for improving this area!

>>> Kyrill

>>>

>

Patch hide | download patch | download mbox

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 318db75..0374bdd 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -9340,6 +9340,7 @@ 
   [(set_attr "conds" "set")
    (set_attr "predicable" "no")
    (set_attr "arch" "t2,t2,t2,t2,t2,any,any,any,any")
+   (set_attr "enabled_for_depr_it" "yes,no,no,no,no,no,no,no,no")
    (set_attr_alternative "length"
       [(const_int 6)
        (const_int 8)
@@ -9422,6 +9423,7 @@ 
   "
   [(set_attr "conds" "set")
    (set_attr "arch" "t2,t2,t2,t2,t2,any,any,any,any")
+   (set_attr "enabled_for_depr_it" "yes,no,no,no,no,no,no,no,no")
    (set_attr_alternative "length"
       [(const_int 6)
        (const_int 8)
@@ -9444,13 +9446,13 @@ 
 )
 
 (define_insn_and_split "*ior_scc_scc"
-  [(set (match_operand:SI 0 "s_register_operand" "=Ts")
+  [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts")
 	(ior:SI (match_operator:SI 3 "arm_comparison_operator"
-		 [(match_operand:SI 1 "s_register_operand" "r")
-		  (match_operand:SI 2 "arm_add_operand" "rIL")])
+		 [(match_operand:SI 1 "s_register_operand" "r,l")
+		  (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
 		(match_operator:SI 6 "arm_comparison_operator"
-		 [(match_operand:SI 4 "s_register_operand" "r")
-		  (match_operand:SI 5 "arm_add_operand" "rIL")])))
+		 [(match_operand:SI 4 "s_register_operand" "r,l")
+		  (match_operand:SI 5 "arm_add_operand" "rIL,lPy")])))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT
    && (arm_select_dominance_cc_mode (operands[3], operands[6], DOM_CC_X_OR_Y)
@@ -9469,6 +9471,7 @@ 
 						  DOM_CC_X_OR_Y),
 		    CC_REGNUM);"
   [(set_attr "conds" "clob")
+   (set_attr "enabled_for_depr_it" "no,yes")
    (set_attr "length" "16")
    (set_attr "type" "multiple")]
 )
@@ -9478,13 +9481,13 @@ 
 (define_insn_and_split "*ior_scc_scc_cmp"
   [(set (match_operand 0 "dominant_cc_register" "")
 	(compare (ior:SI (match_operator:SI 3 "arm_comparison_operator"
-			  [(match_operand:SI 1 "s_register_operand" "r")
-			   (match_operand:SI 2 "arm_add_operand" "rIL")])
+			  [(match_operand:SI 1 "s_register_operand" "r,l")
+			   (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
 			 (match_operator:SI 6 "arm_comparison_operator"
-			  [(match_operand:SI 4 "s_register_operand" "r")
-			   (match_operand:SI 5 "arm_add_operand" "rIL")]))
+			  [(match_operand:SI 4 "s_register_operand" "r,l")
+			   (match_operand:SI 5 "arm_add_operand" "rIL,lPy")]))
 		 (const_int 0)))
-   (set (match_operand:SI 7 "s_register_operand" "=Ts")
+   (set (match_operand:SI 7 "s_register_operand" "=Ts,Ts")
 	(ior:SI (match_op_dup 3 [(match_dup 1) (match_dup 2)])
 		(match_op_dup 6 [(match_dup 4) (match_dup 5)])))]
   "TARGET_32BIT"
@@ -9499,18 +9502,19 @@ 
    (set (match_dup 7) (ne:SI (match_dup 0) (const_int 0)))]
   ""
   [(set_attr "conds" "set")
+   (set_attr "enabled_for_depr_it" "no,yes")
    (set_attr "length" "16")
    (set_attr "type" "multiple")]
 )
 
 (define_insn_and_split "*and_scc_scc"
-  [(set (match_operand:SI 0 "s_register_operand" "=Ts")
+  [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts")
 	(and:SI (match_operator:SI 3 "arm_comparison_operator"
-		 [(match_operand:SI 1 "s_register_operand" "r")
-		  (match_operand:SI 2 "arm_add_operand" "rIL")])
+		 [(match_operand:SI 1 "s_register_operand" "r,l")
+		  (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
 		(match_operator:SI 6 "arm_comparison_operator"
-		 [(match_operand:SI 4 "s_register_operand" "r")
-		  (match_operand:SI 5 "arm_add_operand" "rIL")])))
+		 [(match_operand:SI 4 "s_register_operand" "r,l")
+		  (match_operand:SI 5 "arm_add_operand" "rIL,lPy")])))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_32BIT
    && (arm_select_dominance_cc_mode (operands[3], operands[6], DOM_CC_X_AND_Y)
@@ -9531,6 +9535,7 @@ 
 						  DOM_CC_X_AND_Y),
 		    CC_REGNUM);"
   [(set_attr "conds" "clob")
+   (set_attr "enabled_for_depr_it" "no,yes")
    (set_attr "length" "16")
    (set_attr "type" "multiple")]
 )
@@ -9540,13 +9545,13 @@ 
 (define_insn_and_split "*and_scc_scc_cmp"
   [(set (match_operand 0 "dominant_cc_register" "")
 	(compare (and:SI (match_operator:SI 3 "arm_comparison_operator"
-			  [(match_operand:SI 1 "s_register_operand" "r")
-			   (match_operand:SI 2 "arm_add_operand" "rIL")])
+			  [(match_operand:SI 1 "s_register_operand" "r,l")
+			   (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
 			 (match_operator:SI 6 "arm_comparison_operator"
-			  [(match_operand:SI 4 "s_register_operand" "r")
-			   (match_operand:SI 5 "arm_add_operand" "rIL")]))
+			  [(match_operand:SI 4 "s_register_operand" "r,l")
+			   (match_operand:SI 5 "arm_add_operand" "rIL,lPy")]))
 		 (const_int 0)))
-   (set (match_operand:SI 7 "s_register_operand" "=Ts")
+   (set (match_operand:SI 7 "s_register_operand" "=Ts,Ts")
 	(and:SI (match_op_dup 3 [(match_dup 1) (match_dup 2)])
 		(match_op_dup 6 [(match_dup 4) (match_dup 5)])))]
   "TARGET_32BIT"
@@ -9561,6 +9566,7 @@ 
    (set (match_dup 7) (ne:SI (match_dup 0) (const_int 0)))]
   ""
   [(set_attr "conds" "set")
+   (set_attr "enabled_for_depr_it" "no,yes")
    (set_attr "length" "16")
    (set_attr "type" "multiple")]
 )