Message ID | 5853DC60.4060000@foss.arm.com |
---|---|
State | New |
Headers | show |
On Fri, Dec 16, 2016 at 12:21:52PM +0000, Kyrill Tkachov wrote: > > On 15/12/16 11:53, James Greenhalgh wrote: > >On Thu, Dec 08, 2016 at 09:35:08AM +0000, Kyrill Tkachov wrote: > >>Hi all, > >> > >>In this patch we split X-register UBFX instructions that extract up to the > >>edge of a W-register into a W-register LSR instruction. So for the example in > >>the testcase instead of: > >>UBFX X0, X0, 24, 8 > >> > >>we'd generate: > >>LSR w0, w0, 24 > >> > >>An LSR is a simpler instruction and there's a higher chance that it can be > >>combined with other instructions. > >> > >>To do this the patch separates the sign_extract case from the zero_extract > >>case in the *<optab><mode> ANY_EXTRACT pattern and further splits the > >>SImode/DImode patterns from the resulting zrero_extract pattern. > >>The DImode zero_extract pattern then becomes a define_insn_and_split that > >>splits into a zero_extend of an lshiftrt when the bitposition and width of > >>the zero_extract add up to 32. > >> > >>Bootstrapped and tested on aarch64-none-linux-gnu. > >> > >>Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low > >>risk. I'm happy for it to wait for the next release if necessary. > >I'm OK with the idea, and I'm OK taking this in for Stage 3, but I'm not > >convinced by the implementation. > > > >>2016-12-08 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > >> > >> * config/aarch64/aarch64.md (*<optab><mode>): Split into... > >> (*extv<mode>): ...This... > >> (*extzvsi): ...This... > >> (*extzvdi:): ... And this. Add splitting to lshiftrt when possible. > >Why do we want to to it this way, rather than simply defining a single > >"split" which works in the case you're trying to catch. > > > >i.e. (untested) > > > >(define_split > > [(set (match_operand:DI 0 "register_operand") > > (zero_extract:DI (match_operand:DI 1 "register_operand") > > (match_operand 2 > > "aarch64_simd_shift_imm_offset_di") > > (match_operand 3 > > "aarch64_simd_shift_imm_di")))] > > "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), > > 1, GET_MODE_BITSIZE (DImode) - 1) > > && (INTVAL (operands[2]) + INTVAL (operands[3])) > > == GET_MODE_BITSIZE (SImode)" > > [(set (match_dup 0) > > (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))] > > { > > operands[4] = gen_lowpart (SImode, operands[1]); > > } > >) > > > >Thanks, > >James > > Yes, that works and is simpler. > Is this ok then? OK with the typo fixed. Thanks, James > 2016-12-16 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/aarch64/aarch64.md: New define_split above insv<mode>. > > 2016-12-16 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * gcc.target/aarch64/ubfx_lsr_1.c: New test. > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 65ea04587442b0cab18b1e4652537524b82d5b86..5a40ee6abd5e123116aaaa478dced2207dd59478 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -4340,6 +4340,26 @@ (define_insn "*<optab><mode>" > [(set_attr "type" "bfx")] > ) > > +;; When the bitposition and width add up to 32 we can use a W-reg LSR s/bitposition/bit position/
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 65ea04587442b0cab18b1e4652537524b82d5b86..5a40ee6abd5e123116aaaa478dced2207dd59478 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -4340,6 +4340,26 @@ (define_insn "*<optab><mode>" [(set_attr "type" "bfx")] ) +;; When the bitposition and width add up to 32 we can use a W-reg LSR +;; instruction taking advantage of the implicit zero-extension of the X-reg. +(define_split + [(set (match_operand:DI 0 "register_operand") + (zero_extract:DI (match_operand:DI 1 "register_operand") + (match_operand 2 + "aarch64_simd_shift_imm_offset_di") + (match_operand 3 + "aarch64_simd_shift_imm_di")))] + "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), 1, + GET_MODE_BITSIZE (DImode) - 1) + && (INTVAL (operands[2]) + INTVAL (operands[3])) + == GET_MODE_BITSIZE (SImode)" + [(set (match_dup 0) + (zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3))))] + { + operands[4] = gen_lowpart (SImode, operands[1]); + } +) + ;; Bitfield Insert (insv) (define_expand "insv<mode>" [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand") diff --git a/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c b/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c new file mode 100644 index 0000000000000000000000000000000000000000..f6f72b074e1fc6bcb1976eee6c545e9781b4bed6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/ubfx_lsr_1.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +/* Check that an X-reg UBFX can be simplified into a W-reg LSR. */ + +int +f (unsigned long long x) +{ + x = (x >> 24) & 255; + return x + 1; +} + +/* { dg-final { scan-assembler "lsr\tw" } } */ +/* { dg-final { scan-assembler-not "ubfx\tx" } } */