[AArch64] Remove use of wider vector modes

Message ID 87efs4c61d.fsf@linaro.org
State New
Headers show

Commit Message

Richard Sandiford Aug. 22, 2017, 9:20 a.m.
The AArch64 port defined x2, x3 and x4 vector modes that were only used
in the rtl for the AdvSIMD LD{2,3,4} patterns.  It seems unlikely that
this rtl would have led to any valid simplifications, since the values
involved were unspecs that had a different number of operands from the
non-dreg versions.  (The dreg UNSPEC_LD2 had a single operand, while
the qreg one had two operands.)

As it happened, the patterns led to invalid simplifications on big-
endian targets due to a mix-up in the operand order, see Tamar's fix
in r240271.

This patch therefore replaces the rtl patterns with dedicated unspecs.
This allows the x2, x3 and x4 modes to be removed, avoiding a clash
with 256-bit and 512-bit SVE.

Tested on aarch64-linux-gnu.  Also tested by comparing the before and after
assembly for the testsuite at -O2 -ftree-vectorize on aarch64-linux-gnu
and aarch64_be-linux-gnu; there were no differences.  OK to install?

Richard


2017-08-22  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* config/aarch64/aarch64-modes.def: Remove 32-, 48- and 64-byte
	vector modes.
	* config/aarch64/iterators.md (VRL2, VRL3, VRL4): Delete.
	* config/aarch64/aarch64.md (UNSPEC_LD2_DREG, UNSPEC_LD3_DREG)
	(UNSPEC_LD4_DREG): New unspecs.
	* config/aarch64/aarch64-simd.md (aarch64_ld2<mode>_dreg_le)
	(aarch64_ld2<mode>_dreg_be): Replace with...
	(aarch64_ld2<mode>_dreg): ...this pattern and use the new DREG
	unspec.
	(aarch64_ld3<mode>_dreg_le)
	(aarch64_ld3<mode>_dreg_be): Replace with...
	(aarch64_ld3<mode>_dreg): ...this pattern and use the new DREG
	unspec.
	(aarch64_ld4<mode>_dreg_le)
	(aarch64_ld4<mode>_dreg_be): Replace with...
	(aarch64_ld4<mode>_dreg): ...this pattern and use the new DREG
	unspec.

Comments

James Greenhalgh Aug. 30, 2017, 4:54 p.m. | #1
On Tue, Aug 22, 2017 at 10:20:46AM +0100, Richard Sandiford wrote:
> The AArch64 port defined x2, x3 and x4 vector modes that were only used

> in the rtl for the AdvSIMD LD{2,3,4} patterns.  It seems unlikely that

> this rtl would have led to any valid simplifications, since the values

> involved were unspecs that had a different number of operands from the

> non-dreg versions.  (The dreg UNSPEC_LD2 had a single operand, while

> the qreg one had two operands.)

> 

> As it happened, the patterns led to invalid simplifications on big-

> endian targets due to a mix-up in the operand order, see Tamar's fix

> in r240271.

> 

> This patch therefore replaces the rtl patterns with dedicated unspecs.

> This allows the x2, x3 and x4 modes to be removed, avoiding a clash

> with 256-bit and 512-bit SVE.

> 

> Tested on aarch64-linux-gnu.  Also tested by comparing the before and after

> assembly for the testsuite at -O2 -ftree-vectorize on aarch64-linux-gnu

> and aarch64_be-linux-gnu; there were no differences.  OK to install?


This is OK. I think we were being a bit too optimistic with the design
of these RTL patterns, though clearly at least some part of this was getting
used in rare corner cases, as we had a wrong code-bug. I think what these
would give you is a simplification if you used a 64-bit ld2/3/4 and then
extracted a lane which GCC knew to be zero. This happened by mistake with
the backwards lane indexes for big-endian, causing us to think the DImode
subreg we took was constant zero.

In reality, there is not likely to be a common case that needs
optimisation here, so I don't mind losing this.

OK.

Thanks,
James

> 2017-08-22  Richard Sandiford  <richard.sandiford@linaro.org>

> 	    Alan Hayward  <alan.hayward@arm.com>

> 	    David Sherwood  <david.sherwood@arm.com>

> 

> gcc/

> 	* config/aarch64/aarch64-modes.def: Remove 32-, 48- and 64-byte

> 	vector modes.

> 	* config/aarch64/iterators.md (VRL2, VRL3, VRL4): Delete.

> 	* config/aarch64/aarch64.md (UNSPEC_LD2_DREG, UNSPEC_LD3_DREG)

> 	(UNSPEC_LD4_DREG): New unspecs.

> 	* config/aarch64/aarch64-simd.md (aarch64_ld2<mode>_dreg_le)

> 	(aarch64_ld2<mode>_dreg_be): Replace with...

> 	(aarch64_ld2<mode>_dreg): ...this pattern and use the new DREG

> 	unspec.

> 	(aarch64_ld3<mode>_dreg_le)

> 	(aarch64_ld3<mode>_dreg_be): Replace with...

> 	(aarch64_ld3<mode>_dreg): ...this pattern and use the new DREG

> 	unspec.

> 	(aarch64_ld4<mode>_dreg_le)

> 	(aarch64_ld4<mode>_dreg_be): Replace with...

> 	(aarch64_ld4<mode>_dreg): ...this pattern and use the new DREG

> 	unspec.

> 

> Index: gcc/config/aarch64/aarch64-modes.def

> ===================================================================

> --- gcc/config/aarch64/aarch64-modes.def	2017-02-23 19:54:24.000000000 +0000

> +++ gcc/config/aarch64/aarch64-modes.def	2017-08-22 10:11:04.724056356 +0100

> @@ -44,15 +44,5 @@ INT_MODE (OI, 32);

>  INT_MODE (CI, 48);

>  INT_MODE (XI, 64);

>  

> -/* Vector modes for register lists.  */

> -VECTOR_MODES (INT, 32);		/* V32QI V16HI V8SI V4DI.  */

> -VECTOR_MODES (FLOAT, 32);	/* V8SF V4DF.  */

> -

> -VECTOR_MODES (INT, 48);		/* V32QI V16HI V8SI V4DI.  */

> -VECTOR_MODES (FLOAT, 48);	/* V8SF V4DF.  */

> -

> -VECTOR_MODES (INT, 64);		/* V32QI V16HI V8SI V4DI.  */

> -VECTOR_MODES (FLOAT, 64);	/* V8SF V4DF.  */

> -

>  /* Quad float: 128-bit floating mode for long doubles.  */

>  FLOAT_MODE (TF, 16, ieee_quad_format);

> Index: gcc/config/aarch64/iterators.md

> ===================================================================

> --- gcc/config/aarch64/iterators.md	2017-08-03 10:40:55.896279339 +0100

> +++ gcc/config/aarch64/iterators.md	2017-08-22 10:11:04.727125997 +0100

> @@ -711,21 +711,6 @@ (define_mode_attr Vendreg [(OI "T") (CI

>  ;; ld..._lane and st..._lane operations.

>  (define_mode_attr nregs [(OI "2") (CI "3") (XI "4")])

>  

> -(define_mode_attr VRL2 [(V8QI "V32QI") (V4HI "V16HI")

> -			(V4HF "V16HF")

> -			(V2SI "V8SI")  (V2SF "V8SF")

> -			(DI   "V4DI")  (DF   "V4DF")])

> -

> -(define_mode_attr VRL3 [(V8QI "V48QI") (V4HI "V24HI")

> -			(V4HF "V24HF")

> -			(V2SI "V12SI")  (V2SF "V12SF")

> -			(DI   "V6DI")  (DF   "V6DF")])

> -

> -(define_mode_attr VRL4 [(V8QI "V64QI") (V4HI "V32HI")

> -			(V4HF "V32HF")

> -			(V2SI "V16SI")  (V2SF "V16SF")

> -			(DI   "V8DI")  (DF   "V8DF")])

> -

>  ;; Mode for atomic operation suffixes

>  (define_mode_attr atomic_sfx

>    [(QI "b") (HI "h") (SI "") (DI "")])

> Index: gcc/config/aarch64/aarch64.md

> ===================================================================

> --- gcc/config/aarch64/aarch64.md	2017-08-16 08:50:34.060622654 +0100

> +++ gcc/config/aarch64/aarch64.md	2017-08-22 10:11:04.726102783 +0100

> @@ -98,10 +98,13 @@ (define_c_enum "unspec" [

>      UNSPEC_GOTTINYTLS

>      UNSPEC_LD1

>      UNSPEC_LD2

> +    UNSPEC_LD2_DREG

>      UNSPEC_LD2_DUP

>      UNSPEC_LD3

> +    UNSPEC_LD3_DREG

>      UNSPEC_LD3_DUP

>      UNSPEC_LD4

> +    UNSPEC_LD4_DREG

>      UNSPEC_LD4_DUP

>      UNSPEC_LD2_LANE

>      UNSPEC_LD3_LANE

> Index: gcc/config/aarch64/aarch64-simd.md

> ===================================================================

> --- gcc/config/aarch64/aarch64-simd.md	2017-08-17 17:29:27.227162205 +0100

> +++ gcc/config/aarch64/aarch64-simd.md	2017-08-22 10:11:04.726102783 +0100

> @@ -4963,278 +4963,62 @@ (define_expand "aarch64_ld<VSTRUCT:nregs

>    DONE;

>  })

>  

> -(define_insn "aarch64_ld2<mode>_dreg_le"

> +(define_insn "aarch64_ld2<mode>_dreg"

>    [(set (match_operand:OI 0 "register_operand" "=w")

> -	(subreg:OI

> -	  (vec_concat:<VRL2>

> -	    (vec_concat:<VDBL>

> -	     (unspec:VD

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD2)

> -	     (vec_duplicate:VD (const_int 0)))

> -	    (vec_concat:<VDBL>

> -	     (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD2)

> -	     (vec_duplicate:VD (const_int 0)))) 0))]

> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"

> -  "ld2\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"

> -  [(set_attr "type" "neon_load2_2reg<q>")]

> -)

> -

> -(define_insn "aarch64_ld2<mode>_dreg_be"

> -  [(set (match_operand:OI 0 "register_operand" "=w")

> -	(subreg:OI

> -	  (vec_concat:<VRL2>

> -	    (vec_concat:<VDBL>

> -	     (vec_duplicate:VD (const_int 0))

> -	     (unspec:VD

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD2))

> -	    (vec_concat:<VDBL>

> -	     (vec_duplicate:VD (const_int 0))

> -	     (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD2))) 0))]

> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"

> +	(unspec:OI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")

> +		    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]

> +		   UNSPEC_LD2_DREG))]

> +  "TARGET_SIMD"

>    "ld2\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"

>    [(set_attr "type" "neon_load2_2reg<q>")]

>  )

>  

> -(define_insn "aarch64_ld2<mode>_dreg_le"

> -  [(set (match_operand:OI 0 "register_operand" "=w")

> -	(subreg:OI

> -	  (vec_concat:<VRL2>

> -	    (vec_concat:<VDBL>

> -	     (unspec:DX

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD2)

> -	     (const_int 0))

> -	    (vec_concat:<VDBL>

> -	     (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD2)

> -	     (const_int 0))) 0))]

> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"

> -  "ld1\\t{%S0.1d - %T0.1d}, %1"

> -  [(set_attr "type" "neon_load1_2reg<q>")]

> -)

> -

> -(define_insn "aarch64_ld2<mode>_dreg_be"

> +(define_insn "aarch64_ld2<mode>_dreg"

>    [(set (match_operand:OI 0 "register_operand" "=w")

> -	(subreg:OI

> -	  (vec_concat:<VRL2>

> -	    (vec_concat:<VDBL>

> -	     (const_int 0)

> -	     (unspec:DX

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD2))

> -	    (vec_concat:<VDBL>

> -	     (const_int 0)

> -	     (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD2))) 0))]

> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"

> +	(unspec:OI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")

> +		    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]

> +		   UNSPEC_LD2_DREG))]

> +  "TARGET_SIMD"

>    "ld1\\t{%S0.1d - %T0.1d}, %1"

>    [(set_attr "type" "neon_load1_2reg<q>")]

>  )

>  

> -(define_insn "aarch64_ld3<mode>_dreg_le"

> +(define_insn "aarch64_ld3<mode>_dreg"

>    [(set (match_operand:CI 0 "register_operand" "=w")

> -	(subreg:CI

> -	 (vec_concat:<VRL3>

> -	  (vec_concat:<VRL2>

> -	    (vec_concat:<VDBL>

> -	     (unspec:VD

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD3)

> -	     (vec_duplicate:VD (const_int 0)))

> -	    (vec_concat:<VDBL>

> -	     (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD3)

> -	     (vec_duplicate:VD (const_int 0))))

> -	  (vec_concat:<VDBL>

> -	     (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD3)

> -	     (vec_duplicate:VD (const_int 0)))) 0))]

> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"

> -  "ld3\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"

> -  [(set_attr "type" "neon_load3_3reg<q>")]

> -)

> -

> -(define_insn "aarch64_ld3<mode>_dreg_be"

> -  [(set (match_operand:CI 0 "register_operand" "=w")

> -	(subreg:CI

> -	 (vec_concat:<VRL3>

> -	  (vec_concat:<VRL2>

> -	    (vec_concat:<VDBL>

> -	     (vec_duplicate:VD (const_int 0))

> -	     (unspec:VD

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD3))

> -	    (vec_concat:<VDBL>

> -	     (vec_duplicate:VD (const_int 0))

> -	     (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD3)))

> -	  (vec_concat:<VDBL>

> -	     (vec_duplicate:VD (const_int 0))

> -	     (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD3))) 0))]

> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"

> +	(unspec:CI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")

> +		    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]

> +		   UNSPEC_LD3_DREG))]

> +  "TARGET_SIMD"

>    "ld3\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"

>    [(set_attr "type" "neon_load3_3reg<q>")]

>  )

>  

> -(define_insn "aarch64_ld3<mode>_dreg_le"

> +(define_insn "aarch64_ld3<mode>_dreg"

>    [(set (match_operand:CI 0 "register_operand" "=w")

> -	(subreg:CI

> -	 (vec_concat:<VRL3>

> -	  (vec_concat:<VRL2>

> -	    (vec_concat:<VDBL>

> -	     (unspec:DX

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD3)

> -	     (const_int 0))

> -	    (vec_concat:<VDBL>

> -	     (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD3)

> -	     (const_int 0)))

> -	  (vec_concat:<VDBL>

> -	     (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD3)

> -	     (const_int 0))) 0))]

> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"

> -  "ld1\\t{%S0.1d - %U0.1d}, %1"

> -  [(set_attr "type" "neon_load1_3reg<q>")]

> -)

> -

> -(define_insn "aarch64_ld3<mode>_dreg_be"

> -  [(set (match_operand:CI 0 "register_operand" "=w")

> -	(subreg:CI

> -	 (vec_concat:<VRL3>

> -	  (vec_concat:<VRL2>

> -	    (vec_concat:<VDBL>

> -	     (const_int 0)

> -	     (unspec:DX

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD3))

> -	    (vec_concat:<VDBL>

> -	     (const_int 0)

> -	     (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD3)))

> -	  (vec_concat:<VDBL>

> -	     (const_int 0)

> -	     (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD3))) 0))]

> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"

> +	(unspec:CI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")

> +		    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]

> +		   UNSPEC_LD3_DREG))]

> +  "TARGET_SIMD"

>    "ld1\\t{%S0.1d - %U0.1d}, %1"

>    [(set_attr "type" "neon_load1_3reg<q>")]

>  )

>  

> -(define_insn "aarch64_ld4<mode>_dreg_le"

> -  [(set (match_operand:XI 0 "register_operand" "=w")

> -	(subreg:XI

> -	 (vec_concat:<VRL4>

> -	   (vec_concat:<VRL2>

> -	     (vec_concat:<VDBL>

> -	       (unspec:VD

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD4)

> -	       (vec_duplicate:VD (const_int 0)))

> -	      (vec_concat:<VDBL>

> -		(unspec:VD [(match_dup 1)]

> -			UNSPEC_LD4)

> -		(vec_duplicate:VD (const_int 0))))

> -	   (vec_concat:<VRL2>

> -	     (vec_concat:<VDBL>

> -	       (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD4)

> -	       (vec_duplicate:VD (const_int 0)))

> -	     (vec_concat:<VDBL>

> -	       (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD4)

> -	       (vec_duplicate:VD (const_int 0))))) 0))]

> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"

> -  "ld4\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"

> -  [(set_attr "type" "neon_load4_4reg<q>")]

> -)

> -

> -(define_insn "aarch64_ld4<mode>_dreg_be"

> +(define_insn "aarch64_ld4<mode>_dreg"

>    [(set (match_operand:XI 0 "register_operand" "=w")

> -	(subreg:XI

> -	 (vec_concat:<VRL4>

> -	   (vec_concat:<VRL2>

> -	     (vec_concat:<VDBL>

> -	       (vec_duplicate:VD (const_int 0))

> -	       (unspec:VD

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD4))

> -	      (vec_concat:<VDBL>

> -		(vec_duplicate:VD (const_int 0))

> -		(unspec:VD [(match_dup 1)]

> -			UNSPEC_LD4)))

> -	   (vec_concat:<VRL2>

> -	     (vec_concat:<VDBL>

> -	       (vec_duplicate:VD (const_int 0))

> -	       (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD4))

> -	     (vec_concat:<VDBL>

> -	       (vec_duplicate:VD (const_int 0))

> -	       (unspec:VD [(match_dup 1)]

> -			UNSPEC_LD4)))) 0))]

> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"

> +	(unspec:XI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")

> +		    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]

> +		   UNSPEC_LD4_DREG))]

> +  "TARGET_SIMD"

>    "ld4\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"

>    [(set_attr "type" "neon_load4_4reg<q>")]

>  )

>  

> -(define_insn "aarch64_ld4<mode>_dreg_le"

> +(define_insn "aarch64_ld4<mode>_dreg"

>    [(set (match_operand:XI 0 "register_operand" "=w")

> -	(subreg:XI

> -	 (vec_concat:<VRL4>

> -	   (vec_concat:<VRL2>

> -	     (vec_concat:<VDBL>

> -	       (unspec:DX

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD4)

> -	       (const_int 0))

> -	      (vec_concat:<VDBL>

> -	        (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD4)

> -		(const_int 0)))

> -	   (vec_concat:<VRL2>

> -	     (vec_concat:<VDBL>

> -	       (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD4)

> -	       (const_int 0))

> -	     (vec_concat:<VDBL>

> -	       (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD4)

> -	       (const_int 0)))) 0))]

> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"

> -  "ld1\\t{%S0.1d - %V0.1d}, %1"

> -  [(set_attr "type" "neon_load1_4reg<q>")]

> -)

> -

> -(define_insn "aarch64_ld4<mode>_dreg_be"

> -  [(set (match_operand:XI 0 "register_operand" "=w")

> -	(subreg:XI

> -	 (vec_concat:<VRL4>

> -	   (vec_concat:<VRL2>

> -	     (vec_concat:<VDBL>

> -	       (const_int 0)

> -	       (unspec:DX

> -		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]

> -		UNSPEC_LD4))

> -	      (vec_concat:<VDBL>

> -		(const_int 0)

> -		(unspec:DX [(match_dup 1)]

> -			UNSPEC_LD4)))

> -	   (vec_concat:<VRL2>

> -	     (vec_concat:<VDBL>

> -	       (const_int 0)

> -	       (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD4))

> -	     (vec_concat:<VDBL>

> -	       (const_int 0)

> -	       (unspec:DX [(match_dup 1)]

> -			UNSPEC_LD4)))) 0))]

> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"

> +	(unspec:XI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")

> +		    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]

> +		   UNSPEC_LD4_DREG))]

> +  "TARGET_SIMD"

>    "ld1\\t{%S0.1d - %V0.1d}, %1"

>    [(set_attr "type" "neon_load1_4reg<q>")]

>  )

> @@ -5248,12 +5032,7 @@ (define_expand "aarch64_ld<VSTRUCT:nregs

>    rtx mem = gen_rtx_MEM (BLKmode, operands[1]);

>    set_mem_size (mem, <VSTRUCT:nregs> * 8);

>  

> -  if (BYTES_BIG_ENDIAN)

> -    emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg_be (operands[0],

> -								mem));

> -  else

> -    emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg_le (operands[0],

> -								mem));

> +  emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg (operands[0], mem));

>    DONE;

>  })

>

Patch

Index: gcc/config/aarch64/aarch64-modes.def
===================================================================
--- gcc/config/aarch64/aarch64-modes.def	2017-02-23 19:54:24.000000000 +0000
+++ gcc/config/aarch64/aarch64-modes.def	2017-08-22 10:11:04.724056356 +0100
@@ -44,15 +44,5 @@  INT_MODE (OI, 32);
 INT_MODE (CI, 48);
 INT_MODE (XI, 64);
 
-/* Vector modes for register lists.  */
-VECTOR_MODES (INT, 32);		/* V32QI V16HI V8SI V4DI.  */
-VECTOR_MODES (FLOAT, 32);	/* V8SF V4DF.  */
-
-VECTOR_MODES (INT, 48);		/* V32QI V16HI V8SI V4DI.  */
-VECTOR_MODES (FLOAT, 48);	/* V8SF V4DF.  */
-
-VECTOR_MODES (INT, 64);		/* V32QI V16HI V8SI V4DI.  */
-VECTOR_MODES (FLOAT, 64);	/* V8SF V4DF.  */
-
 /* Quad float: 128-bit floating mode for long doubles.  */
 FLOAT_MODE (TF, 16, ieee_quad_format);
Index: gcc/config/aarch64/iterators.md
===================================================================
--- gcc/config/aarch64/iterators.md	2017-08-03 10:40:55.896279339 +0100
+++ gcc/config/aarch64/iterators.md	2017-08-22 10:11:04.727125997 +0100
@@ -711,21 +711,6 @@  (define_mode_attr Vendreg [(OI "T") (CI
 ;; ld..._lane and st..._lane operations.
 (define_mode_attr nregs [(OI "2") (CI "3") (XI "4")])
 
-(define_mode_attr VRL2 [(V8QI "V32QI") (V4HI "V16HI")
-			(V4HF "V16HF")
-			(V2SI "V8SI")  (V2SF "V8SF")
-			(DI   "V4DI")  (DF   "V4DF")])
-
-(define_mode_attr VRL3 [(V8QI "V48QI") (V4HI "V24HI")
-			(V4HF "V24HF")
-			(V2SI "V12SI")  (V2SF "V12SF")
-			(DI   "V6DI")  (DF   "V6DF")])
-
-(define_mode_attr VRL4 [(V8QI "V64QI") (V4HI "V32HI")
-			(V4HF "V32HF")
-			(V2SI "V16SI")  (V2SF "V16SF")
-			(DI   "V8DI")  (DF   "V8DF")])
-
 ;; Mode for atomic operation suffixes
 (define_mode_attr atomic_sfx
   [(QI "b") (HI "h") (SI "") (DI "")])
Index: gcc/config/aarch64/aarch64.md
===================================================================
--- gcc/config/aarch64/aarch64.md	2017-08-16 08:50:34.060622654 +0100
+++ gcc/config/aarch64/aarch64.md	2017-08-22 10:11:04.726102783 +0100
@@ -98,10 +98,13 @@  (define_c_enum "unspec" [
     UNSPEC_GOTTINYTLS
     UNSPEC_LD1
     UNSPEC_LD2
+    UNSPEC_LD2_DREG
     UNSPEC_LD2_DUP
     UNSPEC_LD3
+    UNSPEC_LD3_DREG
     UNSPEC_LD3_DUP
     UNSPEC_LD4
+    UNSPEC_LD4_DREG
     UNSPEC_LD4_DUP
     UNSPEC_LD2_LANE
     UNSPEC_LD3_LANE
Index: gcc/config/aarch64/aarch64-simd.md
===================================================================
--- gcc/config/aarch64/aarch64-simd.md	2017-08-17 17:29:27.227162205 +0100
+++ gcc/config/aarch64/aarch64-simd.md	2017-08-22 10:11:04.726102783 +0100
@@ -4963,278 +4963,62 @@  (define_expand "aarch64_ld<VSTRUCT:nregs
   DONE;
 })
 
-(define_insn "aarch64_ld2<mode>_dreg_le"
+(define_insn "aarch64_ld2<mode>_dreg"
   [(set (match_operand:OI 0 "register_operand" "=w")
-	(subreg:OI
-	  (vec_concat:<VRL2>
-	    (vec_concat:<VDBL>
-	     (unspec:VD
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD2)
-	     (vec_duplicate:VD (const_int 0)))
-	    (vec_concat:<VDBL>
-	     (unspec:VD [(match_dup 1)]
-			UNSPEC_LD2)
-	     (vec_duplicate:VD (const_int 0)))) 0))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "ld2\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
-  [(set_attr "type" "neon_load2_2reg<q>")]
-)
-
-(define_insn "aarch64_ld2<mode>_dreg_be"
-  [(set (match_operand:OI 0 "register_operand" "=w")
-	(subreg:OI
-	  (vec_concat:<VRL2>
-	    (vec_concat:<VDBL>
-	     (vec_duplicate:VD (const_int 0))
-	     (unspec:VD
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD2))
-	    (vec_concat:<VDBL>
-	     (vec_duplicate:VD (const_int 0))
-	     (unspec:VD [(match_dup 1)]
-			UNSPEC_LD2))) 0))]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+	(unspec:OI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
+		    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
+		   UNSPEC_LD2_DREG))]
+  "TARGET_SIMD"
   "ld2\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
   [(set_attr "type" "neon_load2_2reg<q>")]
 )
 
-(define_insn "aarch64_ld2<mode>_dreg_le"
-  [(set (match_operand:OI 0 "register_operand" "=w")
-	(subreg:OI
-	  (vec_concat:<VRL2>
-	    (vec_concat:<VDBL>
-	     (unspec:DX
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD2)
-	     (const_int 0))
-	    (vec_concat:<VDBL>
-	     (unspec:DX [(match_dup 1)]
-			UNSPEC_LD2)
-	     (const_int 0))) 0))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "ld1\\t{%S0.1d - %T0.1d}, %1"
-  [(set_attr "type" "neon_load1_2reg<q>")]
-)
-
-(define_insn "aarch64_ld2<mode>_dreg_be"
+(define_insn "aarch64_ld2<mode>_dreg"
   [(set (match_operand:OI 0 "register_operand" "=w")
-	(subreg:OI
-	  (vec_concat:<VRL2>
-	    (vec_concat:<VDBL>
-	     (const_int 0)
-	     (unspec:DX
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD2))
-	    (vec_concat:<VDBL>
-	     (const_int 0)
-	     (unspec:DX [(match_dup 1)]
-			UNSPEC_LD2))) 0))]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+	(unspec:OI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
+		    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
+		   UNSPEC_LD2_DREG))]
+  "TARGET_SIMD"
   "ld1\\t{%S0.1d - %T0.1d}, %1"
   [(set_attr "type" "neon_load1_2reg<q>")]
 )
 
-(define_insn "aarch64_ld3<mode>_dreg_le"
+(define_insn "aarch64_ld3<mode>_dreg"
   [(set (match_operand:CI 0 "register_operand" "=w")
-	(subreg:CI
-	 (vec_concat:<VRL3>
-	  (vec_concat:<VRL2>
-	    (vec_concat:<VDBL>
-	     (unspec:VD
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD3)
-	     (vec_duplicate:VD (const_int 0)))
-	    (vec_concat:<VDBL>
-	     (unspec:VD [(match_dup 1)]
-			UNSPEC_LD3)
-	     (vec_duplicate:VD (const_int 0))))
-	  (vec_concat:<VDBL>
-	     (unspec:VD [(match_dup 1)]
-			UNSPEC_LD3)
-	     (vec_duplicate:VD (const_int 0)))) 0))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "ld3\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
-  [(set_attr "type" "neon_load3_3reg<q>")]
-)
-
-(define_insn "aarch64_ld3<mode>_dreg_be"
-  [(set (match_operand:CI 0 "register_operand" "=w")
-	(subreg:CI
-	 (vec_concat:<VRL3>
-	  (vec_concat:<VRL2>
-	    (vec_concat:<VDBL>
-	     (vec_duplicate:VD (const_int 0))
-	     (unspec:VD
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD3))
-	    (vec_concat:<VDBL>
-	     (vec_duplicate:VD (const_int 0))
-	     (unspec:VD [(match_dup 1)]
-			UNSPEC_LD3)))
-	  (vec_concat:<VDBL>
-	     (vec_duplicate:VD (const_int 0))
-	     (unspec:VD [(match_dup 1)]
-			UNSPEC_LD3))) 0))]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+	(unspec:CI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
+		    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
+		   UNSPEC_LD3_DREG))]
+  "TARGET_SIMD"
   "ld3\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
   [(set_attr "type" "neon_load3_3reg<q>")]
 )
 
-(define_insn "aarch64_ld3<mode>_dreg_le"
+(define_insn "aarch64_ld3<mode>_dreg"
   [(set (match_operand:CI 0 "register_operand" "=w")
-	(subreg:CI
-	 (vec_concat:<VRL3>
-	  (vec_concat:<VRL2>
-	    (vec_concat:<VDBL>
-	     (unspec:DX
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD3)
-	     (const_int 0))
-	    (vec_concat:<VDBL>
-	     (unspec:DX [(match_dup 1)]
-			UNSPEC_LD3)
-	     (const_int 0)))
-	  (vec_concat:<VDBL>
-	     (unspec:DX [(match_dup 1)]
-			UNSPEC_LD3)
-	     (const_int 0))) 0))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "ld1\\t{%S0.1d - %U0.1d}, %1"
-  [(set_attr "type" "neon_load1_3reg<q>")]
-)
-
-(define_insn "aarch64_ld3<mode>_dreg_be"
-  [(set (match_operand:CI 0 "register_operand" "=w")
-	(subreg:CI
-	 (vec_concat:<VRL3>
-	  (vec_concat:<VRL2>
-	    (vec_concat:<VDBL>
-	     (const_int 0)
-	     (unspec:DX
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD3))
-	    (vec_concat:<VDBL>
-	     (const_int 0)
-	     (unspec:DX [(match_dup 1)]
-			UNSPEC_LD3)))
-	  (vec_concat:<VDBL>
-	     (const_int 0)
-	     (unspec:DX [(match_dup 1)]
-			UNSPEC_LD3))) 0))]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+	(unspec:CI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
+		    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
+		   UNSPEC_LD3_DREG))]
+  "TARGET_SIMD"
   "ld1\\t{%S0.1d - %U0.1d}, %1"
   [(set_attr "type" "neon_load1_3reg<q>")]
 )
 
-(define_insn "aarch64_ld4<mode>_dreg_le"
-  [(set (match_operand:XI 0 "register_operand" "=w")
-	(subreg:XI
-	 (vec_concat:<VRL4>
-	   (vec_concat:<VRL2>
-	     (vec_concat:<VDBL>
-	       (unspec:VD
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD4)
-	       (vec_duplicate:VD (const_int 0)))
-	      (vec_concat:<VDBL>
-		(unspec:VD [(match_dup 1)]
-			UNSPEC_LD4)
-		(vec_duplicate:VD (const_int 0))))
-	   (vec_concat:<VRL2>
-	     (vec_concat:<VDBL>
-	       (unspec:VD [(match_dup 1)]
-			UNSPEC_LD4)
-	       (vec_duplicate:VD (const_int 0)))
-	     (vec_concat:<VDBL>
-	       (unspec:VD [(match_dup 1)]
-			UNSPEC_LD4)
-	       (vec_duplicate:VD (const_int 0))))) 0))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "ld4\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
-  [(set_attr "type" "neon_load4_4reg<q>")]
-)
-
-(define_insn "aarch64_ld4<mode>_dreg_be"
+(define_insn "aarch64_ld4<mode>_dreg"
   [(set (match_operand:XI 0 "register_operand" "=w")
-	(subreg:XI
-	 (vec_concat:<VRL4>
-	   (vec_concat:<VRL2>
-	     (vec_concat:<VDBL>
-	       (vec_duplicate:VD (const_int 0))
-	       (unspec:VD
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD4))
-	      (vec_concat:<VDBL>
-		(vec_duplicate:VD (const_int 0))
-		(unspec:VD [(match_dup 1)]
-			UNSPEC_LD4)))
-	   (vec_concat:<VRL2>
-	     (vec_concat:<VDBL>
-	       (vec_duplicate:VD (const_int 0))
-	       (unspec:VD [(match_dup 1)]
-			UNSPEC_LD4))
-	     (vec_concat:<VDBL>
-	       (vec_duplicate:VD (const_int 0))
-	       (unspec:VD [(match_dup 1)]
-			UNSPEC_LD4)))) 0))]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+	(unspec:XI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
+		    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
+		   UNSPEC_LD4_DREG))]
+  "TARGET_SIMD"
   "ld4\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
   [(set_attr "type" "neon_load4_4reg<q>")]
 )
 
-(define_insn "aarch64_ld4<mode>_dreg_le"
+(define_insn "aarch64_ld4<mode>_dreg"
   [(set (match_operand:XI 0 "register_operand" "=w")
-	(subreg:XI
-	 (vec_concat:<VRL4>
-	   (vec_concat:<VRL2>
-	     (vec_concat:<VDBL>
-	       (unspec:DX
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD4)
-	       (const_int 0))
-	      (vec_concat:<VDBL>
-	        (unspec:DX [(match_dup 1)]
-			UNSPEC_LD4)
-		(const_int 0)))
-	   (vec_concat:<VRL2>
-	     (vec_concat:<VDBL>
-	       (unspec:DX [(match_dup 1)]
-			UNSPEC_LD4)
-	       (const_int 0))
-	     (vec_concat:<VDBL>
-	       (unspec:DX [(match_dup 1)]
-			UNSPEC_LD4)
-	       (const_int 0)))) 0))]
-  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "ld1\\t{%S0.1d - %V0.1d}, %1"
-  [(set_attr "type" "neon_load1_4reg<q>")]
-)
-
-(define_insn "aarch64_ld4<mode>_dreg_be"
-  [(set (match_operand:XI 0 "register_operand" "=w")
-	(subreg:XI
-	 (vec_concat:<VRL4>
-	   (vec_concat:<VRL2>
-	     (vec_concat:<VDBL>
-	       (const_int 0)
-	       (unspec:DX
-		[(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
-		UNSPEC_LD4))
-	      (vec_concat:<VDBL>
-		(const_int 0)
-		(unspec:DX [(match_dup 1)]
-			UNSPEC_LD4)))
-	   (vec_concat:<VRL2>
-	     (vec_concat:<VDBL>
-	       (const_int 0)
-	       (unspec:DX [(match_dup 1)]
-			UNSPEC_LD4))
-	     (vec_concat:<VDBL>
-	       (const_int 0)
-	       (unspec:DX [(match_dup 1)]
-			UNSPEC_LD4)))) 0))]
-  "TARGET_SIMD && BYTES_BIG_ENDIAN"
+	(unspec:XI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
+		    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
+		   UNSPEC_LD4_DREG))]
+  "TARGET_SIMD"
   "ld1\\t{%S0.1d - %V0.1d}, %1"
   [(set_attr "type" "neon_load1_4reg<q>")]
 )
@@ -5248,12 +5032,7 @@  (define_expand "aarch64_ld<VSTRUCT:nregs
   rtx mem = gen_rtx_MEM (BLKmode, operands[1]);
   set_mem_size (mem, <VSTRUCT:nregs> * 8);
 
-  if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg_be (operands[0],
-								mem));
-  else
-    emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg_le (operands[0],
-								mem));
+  emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg (operands[0], mem));
   DONE;
 })