[AArch64] Split X-reg UBFIZ into W-reg LSL when possible

Message ID 5853DC63.3030602@foss.arm.com
State New

Commit Message

Kyrill Tkachov Dec. 16, 2016, 12:21 p.m. UTC
On 15/12/16 11:56, James Greenhalgh wrote:
> On Thu, Dec 08, 2016 at 09:35:09AM +0000, Kyrill Tkachov wrote:
>> Hi all,
>>
>> Similar to the previous patch this transforms X-reg UBFIZ instructions into
>> W-reg LSL instructions when the UBFIZ operands add up to 32, so we can take
>> advantage of the implicit zero-extension to DImode
>> when writing to a W-register.
>>
>> This is done by splitting the existing *andim_ashift<mode>_bfi pattern into
>> its two SImode and DImode specialisations and changing the DImode pattern
>> into a define_insn_and_split that splits into a
>> zero-extended SImode ashift when the operands match up.
>>
>> So for the code in the testcase we generate:
>> LSL     W0, W0, 5
>>
>> instead of:
>> UBFIZ   X0, X0, 5, 27
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low
>> risk.  I'm happy for it to wait for the next release if necessary.
> My comments on the previous patch also apply here. This patch should only
> need to add one new split pattern.
>
> Thanks,
> James


Thanks, here is the version adding just a single define_split.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok?

Thanks,
Kyrill

2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/aarch64.md: New define_split above bswap<mode>2.

2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/ubfiz_lsl_1.c: New test.
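
For readers who want to convince themselves of the equivalence described above, here is a minimal C sketch. It is not GCC code: model_ubfiz_x, model_lsl_w, models_agree and check_shift_by_5 are hypothetical helper names, and the two models only reflect the architectural behaviour as I understand it (UBFIZ places the low WIDTH bits at bit LSB and zeroes everything else; writing a W register zeroes the upper 32 bits of the X register).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of UBFIZ Xd, Xn, #lsb, #width: take the low WIDTH
   bits of XN, place them at bit LSB and zero all other bits.
   Assumes 0 < width < 64 and lsb + width <= 64.  */
static uint64_t
model_ubfiz_x (uint64_t xn, unsigned lsb, unsigned width)
{
  uint64_t field = xn & ((UINT64_C (1) << width) - 1);
  return field << lsb;
}

/* Hypothetical model of LSL Wd, Wn, #shift: a 32-bit shift whose result
   is zero-extended into the full X register.  Assumes shift < 32.  */
static uint64_t
model_lsl_w (uint64_t xn, unsigned shift)
{
  uint32_t wn = (uint32_t) xn;
  return (uint64_t) (uint32_t) (wn << shift);
}

/* Return 1 if both models agree on X for every shift in 1..31, with the
   UBFIZ width chosen so that lsb + width == 32.  */
int
models_agree (uint64_t x)
{
  for (unsigned shift = 1; shift < 32; shift++)
    if (model_ubfiz_x (x, shift, 32 - shift) != model_lsl_w (x, shift))
      return 0;
  return 1;
}

/* Usage sketch: the shift-by-5 case from the testcase.  */
void
check_shift_by_5 (void)
{
  uint64_t x = UINT64_C (0x123456789abcdef0);
  uint64_t expected = (x << 5) & UINT64_C (0xffffffff);
  assert (model_ubfiz_x (x, 5, 27) == expected);
  assert (model_lsl_w (x, 5) == expected);
}
```

The loop only covers the case the patch targets, where the shift amount and field width sum to 32; for any other sum the two instructions are not interchangeable.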

Comments

James Greenhalgh Dec. 16, 2016, 4:16 p.m. UTC | #1
On Fri, Dec 16, 2016 at 12:21:55PM +0000, Kyrill Tkachov wrote:
>
> On 15/12/16 11:56, James Greenhalgh wrote:
> >On Thu, Dec 08, 2016 at 09:35:09AM +0000, Kyrill Tkachov wrote:
> >>Hi all,
> >>
> >>Similar to the previous patch this transforms X-reg UBFIZ instructions into
> >>W-reg LSL instructions when the UBFIZ operands add up to 32, so we can take
> >>advantage of the implicit zero-extension to DImode
> >>when writing to a W-register.
> >>
> >>This is done by splitting the existing *andim_ashift<mode>_bfi pattern into
> >>its two SImode and DImode specialisations and changing the DImode pattern
> >>into a define_insn_and_split that splits into a
> >>zero-extended SImode ashift when the operands match up.
> >>
> >>So for the code in the testcase we generate:
> >>LSL     W0, W0, 5
> >>
> >>instead of:
> >>UBFIZ   X0, X0, 5, 27
> >>
> >>Bootstrapped and tested on aarch64-none-linux-gnu.
> >>
> >>Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low
> >>risk.  I'm happy for it to wait for the next release if necessary.
> >My comments on the previous patch also apply here. This patch should only
> >need to add one new split pattern.


OK with a small nit fixed.

Thanks,
James

> Thanks, here is the version adding just a single define_split.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.

> 2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/aarch64/aarch64.md: New define_split above bswap<mode>2.
>
> 2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/ubfiz_lsl_1.c: New test.

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 5a40ee6abd5e123116aaaa478dced2207dd59478..b0f7bcbb84159fc8c0c733d0b40f2f08eea241a9 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4454,6 +4454,24 @@ (define_insn "*andim_ashift<mode>_bfiz"
>    [(set_attr "type" "bfx")]
>  )
>  
> +;; When the bitposition and width of the equivalent extraction add up to 32


s/bitposition/bit position/

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a40ee6abd5e123116aaaa478dced2207dd59478..b0f7bcbb84159fc8c0c733d0b40f2f08eea241a9 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4454,6 +4454,24 @@  (define_insn "*andim_ashift<mode>_bfiz"
   [(set_attr "type" "bfx")]
 )
 
+;; When the bit position and width of the equivalent extraction add up to 32
+;; we can use a W-reg LSL instruction taking advantage of the implicit
+;; zero-extension of the X-reg.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+	(and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+			     (match_operand 2 "const_int_operand"))
+		 (match_operand 3 "const_int_operand")))]
+ "aarch64_mask_and_shift_for_ubfiz_p (DImode, operands[3], operands[2])
+  && (INTVAL (operands[2]) + popcount_hwi (INTVAL (operands[3])))
+      == GET_MODE_BITSIZE (SImode)"
+  [(set (match_dup 0)
+	(zero_extend:DI (ashift:SI (match_dup 4) (match_dup 2))))]
+  {
+    operands[4] = gen_lowpart (SImode, operands[1]);
+  }
+)
+
 (define_insn "bswap<mode>2"
   [(set (match_operand:GPI 0 "register_operand" "=r")
         (bswap:GPI (match_operand:GPI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c b/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..d3fd3f234f2324d71813298210fdcf0660ac45b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check that an X-reg UBFIZ can be simplified into a W-reg LSL.  */
+
+long long
+f2 (long long x)
+{
+  return (x << 5) & 0xffffffff;
+}
+
+/* { dg-final { scan-assembler "lsl\tw" } } */
+/* { dg-final { scan-assembler-not "ubfiz\tx" } } */
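
The arithmetic in the split condition can be illustrated with a small standalone C sketch. None of this is GCC code: popcount64, mask_is_field_at_shift, can_split_to_wreg_lsl and check_testcase_operands are hypothetical names, and mask_is_field_at_shift is only an assumed approximation of what aarch64_mask_and_shift_for_ubfiz_p accepts (a contiguous run of ones whose lowest set bit sits at the shift amount). It also assumes that, for the testcase, the mask reaching the pattern is 0xffffffe0 rather than 0xffffffff, which is inferred from the width of 27 in the UBFIZ quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Portable bit count; stands in for GCC's popcount_hwi.  */
static int
popcount64 (uint64_t v)
{
  int n = 0;
  for (; v != 0; v &= v - 1)
    n++;
  return n;
}

/* ASSUMPTION: approximates the UBFIZ predicate as "MASK is a contiguous
   run of ones whose lowest set bit is at position SHIFT".  */
static bool
mask_is_field_at_shift (uint64_t mask, unsigned shift)
{
  if (mask == 0 || shift >= 64)
    return false;
  if ((mask & ((UINT64_C (1) << shift) - 1)) != 0)
    return false;			/* Bits below the shift are set.  */
  uint64_t field = mask >> shift;
  return (field & (field + 1)) == 0;	/* FIELD has the form 2^k - 1.  */
}

/* Model of the split condition: a valid UBFIZ mask/shift pair whose
   shift amount plus field width equals 32, the bit size of SImode.  */
bool
can_split_to_wreg_lsl (uint64_t mask, unsigned shift)
{
  return mask_is_field_at_shift (mask, shift)
	 && shift + (unsigned) popcount64 (mask) == 32;
}

/* Usage sketch: shift 5 with a 27-bit field qualifies (5 + 27 == 32);
   the same shift with a 19-bit field does not (5 + 19 == 24).  */
void
check_testcase_operands (void)
{
  assert (can_split_to_wreg_lsl (UINT64_C (0xffffffe0), 5));
  assert (!can_split_to_wreg_lsl (UINT64_C (0xffffe0), 5));
}
```

Under these assumptions a field that stops short of bit 31, or one that extends past it, is rejected, so the UBFIZ form would be kept in those cases.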