
Recognize a missed usage of a sbfiz instruction

Message ID 1517582287-9923-1-git-send-email-luis.machado@linaro.org
State Superseded
Series Recognize a missed usage of a sbfiz instruction

Commit Message

Luis Machado Feb. 2, 2018, 2:38 p.m. UTC
A customer reported the following missed opportunities to combine a couple of
instructions into a sbfiz.

int sbfiz32 (int x)
{
  return x << 29 >> 10;
}

long sbfiz64 (long x)
{
  return x << 58 >> 20;
}

This gets converted to the following pattern:

(set (reg:SI 98)
    (ashift:SI (sign_extend:SI (reg:HI 0 x0 [ xD.3334 ]))
        (const_int 6 [0x6])))

Currently, gcc generates the following:

sbfiz32:
	lsl x0, x0, 29
	asr x0, x0, 10
	ret

sbfiz64:
	lsl x0, x0, 58
	asr x0, x0, 20
	ret

It could generate this instead:

sbfiz32:
	sbfiz   w0, w0, 19, 3
	ret

sbfiz64:
	sbfiz   x0, x0, 38, 6
	ret
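
For reference, the sbfiz operands follow directly from the shift amounts: the bitfield
width is the number of low bits that survive the left shift (32 - 29 = 3 for the W-reg
case, 64 - 58 = 6 for the X-reg case), and the insert position is the difference between
the two shift amounts (29 - 10 = 19 and 58 - 20 = 38).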

The unsigned versions already generate ubfiz for the same code, so the lack of
a sbfiz pattern may have been an oversight.

This particular sbfiz pattern shows up in both CPU2006 (~ 80 hits) and
CPU2017 (~ 280 hits). It's not a lot, but it seems beneficial in any case. There were no
significant performance differences, probably due to the small number of
occurrences or because they fall outside hot code paths.

Regression-tested and bootstrapped ok on aarch64-linux. Validated with
CPU2017 and CPU2006 runs.

I thought I'd put this up for review. I know we're not in development
mode yet.

2018-02-02  Luis Machado  <luis.machado@linaro.org>

	gcc/
	* config/aarch64/aarch64.md (*ashift<mode>_extv_bfiz): New pattern.
	* testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c: New test.
---
 gcc/config/aarch64/aarch64.md                    | 13 +++++++++++++
 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c | 24 ++++++++++++++++++++++++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c

-- 
2.7.4

Comments

Kyrill Tkachov Feb. 2, 2018, 2:57 p.m. UTC | #1
Hi Luis,


On 02/02/18 14:38, Luis Machado wrote:
> A customer reported the following missed opportunities to combine a couple of
> instructions into a sbfiz.
>
> int sbfiz32 (int x)
> {
>   return x << 29 >> 10;
> }
>
> long sbfiz64 (long x)
> {
>   return x << 58 >> 20;
> }
>
> This gets converted to the following pattern:
>
> (set (reg:SI 98)
>     (ashift:SI (sign_extend:SI (reg:HI 0 x0 [ xD.3334 ]))
>         (const_int 6 [0x6])))
>
> Currently, gcc generates the following:
>
> sbfiz32:
>         lsl x0, x0, 29
>         asr x0, x0, 10
>         ret
>
> sbfiz64:
>         lsl x0, x0, 58
>         asr x0, x0, 20
>         ret
>
> It could generate this instead:
>
> sbfiz32:
>         sbfiz   w0, w0, 19, 3
>         ret
>
> sbfiz64:
>         sbfiz   x0, x0, 38, 6
>         ret
>
> The unsigned versions already generate ubfiz for the same code, so the lack of
> a sbfiz pattern may have been an oversight.

You're right. In fact, llvm will generate the SBFIZ form.

>
> This particular sbfiz pattern shows up in both CPU2006 (~ 80 hits) and
> CPU2017 (~ 280 hits). It's not a lot, but it seems beneficial in any case. There were no
> significant performance differences, probably due to the small number of
> occurrences or because they fall outside hot code paths.

Yeah, these little missed opportunities add up in the long run :)

>
> Regression-tested and bootstrapped ok on aarch64-linux. Validated with
> CPU2017 and CPU2006 runs.
>
> I thought I'd put this up for review. I know we're not in development
> mode yet.
>
> 2018-02-02  Luis Machado  <luis.machado@linaro.org>
>
>         gcc/
>         * config/aarch64/aarch64.md (*ashift<mode>_extv_bfiz): New pattern.
>         * testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c: New test.

I'm sure you know this already, the testsuite entry will go into a ChangeLog
of its own in gcc/testsuite/ with the testsuite/ prefix removed.

> ---
>  gcc/config/aarch64/aarch64.md                    | 13 +++++++++++++
>  gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c | 24 ++++++++++++++++++++++++
>  2 files changed, 37 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 5a2a930..d336bf0 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4828,6 +4828,19 @@
>    [(set_attr "type" "bfx")]
>  )
>
> +;; Match sbfiz pattern in a shift left + shift right operation.
> +
> +(define_insn "*ashift<mode>_extv_bfiz"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +       (ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
> +                                     (match_operand 2 "const_int_operand" "n")
> +                                     (match_operand 3 "const_int_operand" "n"))
> +                    (match_operand 4 "const_int_operand" "n")))]
> +  "UINTVAL (operands[2]) < <GPI:sizen>"
> +  "sbfiz\\t%<w>0, %<w>1, %4, %2"
> +  [(set_attr "type" "bfx")]
> +)

I think you need some more validation on operands 3 and 4.
Look at other similar patterns, but you want something like the aarch64_simd_shift_imm_<mode> predicate
on them to make sure they fall within [0, 63] or [0, 31].
Also, for operand 2 you probably want aarch64_simd_shift_imm_offset_di to make sure it doesn't
allow a size of zero. I don't know if the RTL optimisers will try to create such wonky RTL, but
we tend to be defensive about these things in the pattern.
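
To illustrate the suggestion, a rough sketch of the pattern with those range-checking
predicates might look like the following (assuming aarch64_simd_shift_imm_<mode> and
aarch64_simd_shift_imm_offset_<mode> resolve to the SImode/DImode predicates under the
GPI iterator; the insn condition is kept as in the posted patch and the exact choice is
up to the patch author):

;; Same as the posted pattern, but with predicates that bound the immediates.
(define_insn "*ashift<mode>_extv_bfiz"
  [(set (match_operand:GPI 0 "register_operand" "=r")
	(ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
				      (match_operand 2 "aarch64_simd_shift_imm_offset_<mode>" "n")
				      (match_operand 3 "aarch64_simd_shift_imm_<mode>" "n"))
		    (match_operand 4 "aarch64_simd_shift_imm_<mode>" "n")))]
  "UINTVAL (operands[2]) < <GPI:sizen>"
  "sbfiz\\t%<w>0, %<w>1, %4, %2"
  [(set_attr "type" "bfx")]
)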

> +
>  ;; When the bit position and width of the equivalent extraction add up to 32
>  ;; we can use a W-reg LSL instruction taking advantage of the implicit
>  ;; zero-extension of the X-reg.
> diff --git a/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
> new file mode 100644
> index 0000000..931f8f4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +/* Check that a LSL followed by an ASR can be combined into a single SBFIZ
> +   instruction.  */
> +
> +/* Using W-reg */
> +
> +int
> +sbfiz32 (int x)
> +{
> +  return x << 29 >> 10;
> +}
> +
> +/* Using X-reg */
> +
> +long
> +sbfiz64 (long x)
> +{
> +  return x << 58 >> 20;
> +}

long will be 32 bits when testing with -mabi=ilp32, so you want this to be long long
or some other guaranteed 64-bit type.
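
For example, one way to keep the 64-bit case exercising the X-reg form under
-mabi=ilp32 (just an illustrative tweak, not part of the posted patch) would be:

/* Use a guaranteed 64-bit type so the test still targets the X-reg sbfiz
   when long is only 32 bits.  */
long long
sbfiz64 (long long x)
{
  return x << 58 >> 20;
}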

Thanks,
Kyrill

> +
> +/* { dg-final { scan-assembler "sbfiz\tw" } } */
> +/* { dg-final { scan-assembler "sbfiz\tx" } } */
> -- 
> 2.7.4

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a2a930..d336bf0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4828,6 +4828,19 @@ 
   [(set_attr "type" "bfx")]
 )
 
+;; Match sbfiz pattern in a shift left + shift right operation.
+
+(define_insn "*ashift<mode>_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
+				      (match_operand 2 "const_int_operand" "n")
+				      (match_operand 3 "const_int_operand" "n"))
+		     (match_operand 4 "const_int_operand" "n")))]
+  "UINTVAL (operands[2]) < <GPI:sizen>"
+  "sbfiz\\t%<w>0, %<w>1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+
 ;; When the bit position and width of the equivalent extraction add up to 32
 ;; we can use a W-reg LSL instruction taking advantage of the implicit
 ;; zero-extension of the X-reg.
diff --git a/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
new file mode 100644
index 0000000..931f8f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Check that a LSL followed by an ASR can be combined into a single SBFIZ
+   instruction.  */
+
+/* Using W-reg */
+
+int
+sbfiz32 (int x)
+{
+  return x << 29 >> 10;
+}
+
+/* Using X-reg */
+
+long
+sbfiz64 (long x)
+{
+  return x << 58 >> 20;
+}
+
+/* { dg-final { scan-assembler "sbfiz\tw" } } */
+/* { dg-final { scan-assembler "sbfiz\tx" } } */