[0/4,AArch64] Add SVE support

Message ID: 87a803ntmg.fsf@linaro.org

Message

Richard Sandiford Nov. 3, 2017, 5:45 p.m. UTC
This series adds support for ARM's Scalable Vector Extension.
More details on SVE can be found here:

  https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a

There are four parts for ease of review, but it probably makes
sense to commit them as one patch.

The series plugs SVE into the current vectorisation framework without
adding any new features to the framework itself.  This means for example
that vector loops still handle full vectors, with a scalar epilogue loop
being needed for the rest.  Later patches add support for other features
like fully-predicated loops.
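
To make the implication concrete, here's a minimal sketch (not code
from the patch) of the kind of loop this affects: the vector loop
handles as many full vectors as fit in n, and the remaining n % lanes
iterations run in the scalar epilogue the vectoriser emits:

  /* Hedged illustration only: with this series, the SVE vector loop
     processes whole vectors; leftover iterations fall through to an
     automatically generated scalar epilogue loop.  */
  void
  add_arrays (int *restrict a, int *restrict b, int n)
  {
    for (int i = 0; i < n; i++)
      a[i] += b[i];
  }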

The patches build on top of the various series that I've already posted.
Sorry that there were so many, and thanks again for all the reviews.

Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
(in the default vector-length agnostic mode).  Also tested with
-msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
and 512-bit SVE registers.
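
For reference, the fixed-length testing corresponds to invocations
along the lines of the following (an illustrative command line built
from the options this series adds, not quoted from the test harness):

  gcc -O2 -march=armv8-a+sve -msve-vector-bits=256 -c test.c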

Thanks,
Richard

Comments

Richard Sandiford Nov. 3, 2017, 5:48 p.m. UTC | #1
This patch adds support for ARM's Scalable Vector Extension.
The patch just contains the core features that work with the
current vectoriser framework; later patches will add extra
capabilities to both the target-independent code and AArch64 code.
The patch doesn't include:

- support for unwinding frames whose size depends on the vector length
- modelling the effect of __tls_get_addr on the SVE registers

These are handled by later patches instead.

Some notes:

- The copyright years for aarch64-sve.md start at 2009 because some of
  the code is based on aarch64.md, which also starts from then.

- The patch inserts spaces between items in the AArch64 section
  of sourcebuild.texi.  This matches at least the surrounding
  architectures and looks a little nicer in the info output.

- aarch64-sve.md includes a pattern:

    while_ult<GPI:mode><PRED_ALL:mode>

  A later patch adds a matching "while_ult" optab, but the pattern
  is also needed by the predicate vec_duplicate expander (see the
  WHILELO sketch below).
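
  For intuition: WHILELO builds a predicate from two scalars, making
  lane I active iff BASE + I is unsigned-less-than BOUND.  As a scalar
  model (a sketch for explanation only, not the pattern's
  implementation, and ignoring the instruction's flag results):

      /* Hypothetical scalar model of SVE WHILELO.  */
      void
      whilelo_model (unsigned char *pred, unsigned int lanes,
                     unsigned long base, unsigned long bound)
      {
        for (unsigned int i = 0; i < lanes; i++)
          pred[i] = (base + i < bound);
      }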


2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/invoke.texi (-msve-vector-bits=): Document new option.
	(sve): Document new AArch64 extension.
	* doc/md.texi (w): Extend the description of the AArch64
	constraint to include SVE vectors.
	(Upl, Upa): Document new AArch64 predicate constraints.
	* config/aarch64/aarch64-opts.h (aarch64_sve_vector_bits_enum): New
	enum.
	* config/aarch64/aarch64.opt (sve_vector_bits): New enum.
	(msve-vector-bits=): New option.
	* config/aarch64/aarch64-option-extensions.def (fp, simd): Disable
	SVE when these are disabled.
	(sve): New extension.
	* config/aarch64/aarch64-modes.def: Define SVE vector and predicate
	modes.  Adjust their number of units based on aarch64_sve_vg.
	(MAX_BITSIZE_MODE_ANY_MODE): Define.
	* config/aarch64/aarch64-protos.h (ADDR_QUERY_ANY): New
	aarch64_addr_query_type.
	(aarch64_const_vec_all_same_in_range_p, aarch64_sve_pred_mode)
	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
	(aarch64_sve_inc_dec_immediate_p, aarch64_add_offset_temporaries)
	(aarch64_split_add_offset, aarch64_output_sve_cnt_immediate)
	(aarch64_output_sve_addvl_addpl, aarch64_output_sve_inc_dec_immediate)
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): Declare.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_check_zero_based_sve_index_immediate): Declare.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): Likewise.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): Declare.
	(aarch64_expand_mov_immediate): Take a gen_vec_duplicate callback.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move): Declare.
	(aarch64_expand_sve_vec_cmp_int, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond, aarch64_expand_sve_vec_perm): Declare.
	(aarch64_regmode_natural_size): Likewise.
	* config/aarch64/aarch64.h (AARCH64_FL_SVE): New macro.
	(AARCH64_FL_V8_3, AARCH64_FL_RCPC, AARCH64_FL_DOTPROD): Shift
	left one place.
	(AARCH64_ISA_SVE, TARGET_SVE): New macros.
	(FIXED_REGISTERS, CALL_USED_REGISTERS, REGISTER_NAMES): Add entries
	for VG and the SVE predicate registers.
	(V_ALIASES): Add a "z"-prefixed alias.
	(FIRST_PSEUDO_REGISTER): Change to P15_REGNUM + 1.
	(AARCH64_DWARF_VG, AARCH64_DWARF_P0): New macros.
	(PR_REGNUM_P, PR_LO_REGNUM_P): Likewise.
	(PR_LO_REGS, PR_HI_REGS, PR_REGS): New reg_classes.
	(REG_CLASS_NAMES): Add entries for them.
	(REG_CLASS_CONTENTS): Likewise.  Update ALL_REGS to include VG
	and the predicate registers.
	(aarch64_sve_vg): Declare.
	(BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR, BYTES_PER_SVE_PRED)
	(SVE_BYTE_MODE, MAX_COMPILE_TIME_VEC_BYTES): New macros.
	(REGMODE_NATURAL_SIZE): Define.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Handle
	SVE macros.
	* config/aarch64/aarch64.c: Include cfgrtl.h.
	(simd_immediate_info): Add a constructor for series vectors,
	and an associated step field.
	(aarch64_sve_vg): New variable.
	(aarch64_dbx_register_number): Handle VG and the predicate registers.
	(aarch64_vect_struct_mode_p, aarch64_vector_mode_p): Delete.
	(VEC_ADVSIMD, VEC_SVE_DATA, VEC_SVE_PRED, VEC_STRUCT, VEC_ANY_SVE)
	(VEC_ANY_DATA, VEC_STRUCT): New constants.
	(aarch64_advsimd_struct_mode_p, aarch64_sve_pred_mode_p)
	(aarch64_classify_vector_mode, aarch64_vector_data_mode_p)
	(aarch64_sve_data_mode_p, aarch64_pred_mode, aarch64_get_mask_mode):
	New functions.
	(aarch64_hard_regno_nregs): Handle SVE data modes for FP_REGS
	and FP_LO_REGS.  Handle PR_REGS, PR_LO_REGS and PR_HI_REGS.
	(aarch64_hard_regno_mode_ok): Handle VG.  Also handle the SVE
	predicate modes and predicate registers.  Explicitly restrict
	GPRs to modes of 16 bytes or smaller.  Only allow FP registers
	to store a vector mode if it is recognized by
	aarch64_classify_vector_mode.
	(aarch64_regmode_natural_size): New function.
	(aarch64_hard_regno_caller_save_mode): Return the original mode
	for predicates.
	(aarch64_sve_cnt_immediate_p, aarch64_output_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate_p, aarch64_output_sve_addvl_addpl)
	(aarch64_sve_inc_dec_immediate_p, aarch64_output_sve_inc_dec_immediate)
	(aarch64_add_offset_1_temporaries, aarch64_offset_temporaries): New
	functions.
	(aarch64_add_offset): Add a temp2 parameter.  Assert that temp1
	does not overlap dest if the function is frame-related.  Handle
	SVE constants.
	(aarch64_split_add_offset): New function.
	(aarch64_add_sp, aarch64_sub_sp): Add temp2 parameters and pass
	them to aarch64_add_offset.
	(aarch64_allocate_and_probe_stack_space): Add a temp2 parameter
	and update call to aarch64_sub_sp.
	(aarch64_add_cfa_expression): New function.
	(aarch64_expand_prologue): Pass extra temporary registers to the
	functions above.  Handle the case in which we need to emit new
	DW_CFA_expressions for registers that were originally saved
	relative to the stack pointer, but now have to be expressed
	relative to the frame pointer.
	(aarch64_output_mi_thunk): Pass extra temporary registers to the
	functions above.
	(aarch64_expand_epilogue): Likewise.  Prevent inheritance of
	IP0 and IP1 values for SVE frames.
	(aarch64_expand_vec_series): New function.
	(aarch64_expand_mov_immediate): Add a gen_vec_duplicate parameter.
	Handle SVE constants.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move)
	(aarch64_get_reg_raw_mode, offset_4bit_signed_scaled_p)
	(offset_6bit_unsigned_scaled_p, aarch64_offset_7bit_signed_scaled_p)
	(offset_9bit_signed_scaled_p): New functions.
	(aarch64_replicate_bitmask_imm): New function.
	(aarch64_bitmask_imm): Use it.
	(aarch64_cannot_force_const_mem): Reject expressions involving
	a CONST_POLY_INT.  Update call to aarch64_classify_symbol.
	(aarch64_classify_index): Handle SVE indices, by requiring
	a plain register index with a scale that matches the element size.
	(aarch64_classify_address): Handle SVE addresses.  Assert that
	the mode of the address is VOIDmode or an integer mode.
	Update call to aarch64_classify_symbol.
	(aarch64_classify_symbolic_expression): Update call to
	aarch64_classify_symbol.
	(aarch64_const_vec_all_same_in_range_p): Extend to VEC_DUPLICATE
	constants by using const_vec_duplicate_p.
	(aarch64_const_vec_all_in_range_p): New function.
	(aarch64_print_vector_float_operand): Likewise.
	(aarch64_print_operand): Handle 'N' and 'C'.  Use "zN" rather than
	"vN" for FP registers with SVE modes.  Handle (const ...) vectors
	and the FP immediates 1.0 and 0.5.
	(aarch64_print_operand_address): Use ADDR_QUERY_ANY.  Handle
	SVE addresses.
	(aarch64_regno_regclass): Handle predicate registers.
	(aarch64_secondary_reload): Handle big-endian reloads of SVE
	data modes.
	(aarch64_class_max_nregs): Handle SVE modes and predicate registers.
	(aarch64_rtx_costs): Check for ADDVL and ADDPL instructions.
	(aarch64_convert_sve_vector_bits): New function.
	(aarch64_override_options): Use it to handle -msve-vector-bits=.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_legitimate_constant_p): Use aarch64_classify_vector_mode.
	Handle SVE vector and predicate modes.  Accept VL-based constants
	that need only one temporary register.  Only call
	aarch64_constant_address_p if the constant is a scalar integer.
	(aarch64_conditional_register_usage): Mark the predicate registers
	as fixed if SVE isn't available.
	(aarch64_vector_mode_supported_p): Use aarch64_classify_vector_mode.
	Return true for SVE vector and predicate modes.
	(aarch64_simd_container_mode): Take the number of bits as a poly_int64
	rather than an unsigned int.  Handle SVE modes.
	(aarch64_preferred_simd_mode): Update call accordingly.  Handle
	SVE modes.
	(aarch64_autovectorize_vector_sizes): Add BYTES_PER_SVE_VECTOR
	if SVE is enabled.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): New functions.
	(aarch64_sve_valid_immediate): New function.
	(aarch64_simd_valid_immediate): Use it as the fallback for SVE vectors.
	Explicitly reject structure modes.  Check for INDEX constants.
	Handle PTRUE and PFALSE constants.
	(aarch64_check_zero_based_sve_index_immediate): New function.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_mov_operand_p): Use aarch64_simd_valid_immediate for
	vector modes.  Accept constants in the range of CNT[BHWD].
	(aarch64_simd_scalar_immediate_valid_for_move): Explicitly
	ask for an Advanced SIMD mode.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): New functions.
	(aarch64_simd_vector_alignment): Handle SVE predicates.
	(aarch64_vectorize_preferred_vector_alignment): New function.
	(aarch64_simd_vector_alignment_reachable): Use it instead of
	the vector size.
	(aarch64_shift_truncation_mask): Use aarch64_vector_data_mode_p.
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): New
	functions.
	(MAX_VECT_LEN): Delete.
	(expand_vec_perm_d): Add a vec_flags field.
	(emit_unspec2, aarch64_expand_sve_vec_perm): New functions.
	(aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
	(aarch64_evpc_ext): Don't apply a big-endian lane correction
	for SVE modes.
	(aarch64_evpc_rev): Use a predicated operation for SVE.
	(aarch64_evpc_dup): Enforce a 64-byte range for SVE DUP.
	(aarch64_evpc_tbl): Use MAX_COMPILE_TIME_VEC_BYTES instead of
	MAX_VECT_LEN.
	(aarch64_evpc_sve_tbl): New function.
	(aarch64_expand_vec_perm_const_1): Handle SVE permutes too,
	using aarch64_evpc_sve_tbl rather than aarch64_evpc_tbl.
	(aarch64_expand_vec_perm_const): Initialize vec_flags.
	(aarch64_vectorize_vec_perm_const_ok): Likewise.
	(aarch64_sve_cmp_operand_p, aarch64_unspec_cond_code)
	(aarch64_gen_unspec_cond, aarch64_expand_sve_vec_cmp_int)
	(aarch64_emit_unspec_cond, aarch64_emit_unspec_cond_or)
	(aarch64_emit_inverted_unspec_cond, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond): New functions.
	(aarch64_modes_tieable_p): Use aarch64_vector_data_mode_p instead
	of aarch64_vector_mode_p.
	(aarch64_dwarf_poly_indeterminate_value): New function.
	(aarch64_compute_pressure_classes): Likewise.
	(aarch64_can_change_mode_class): Likewise.
	(TARGET_GET_RAW_RESULT_MODE, TARGET_GET_RAW_ARG_MODE): Redefine.
	(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Likewise.
	(TARGET_VECTORIZE_GET_MASK_MODE): Likewise.
	(TARGET_DWARF_POLY_INDETERMINATE_VALUE): Likewise.
	(TARGET_COMPUTE_PRESSURE_CLASSES): Likewise.
	(TARGET_CAN_CHANGE_MODE_CLASS): Likewise.
	* config/aarch64/constraints.md (Upa, Upl, Uad, Ual, Utr, Utw, Di)
	(Dm, Dv, vsa, vsc, vsd, vsi, vsn, vsl, vsm, vfa, vfm, vfn): New
	constraints.
	(Dn, Dl, Dr): Accept const as well as const_vector.
	(Dz): Likewise.  Compare against CONST0_RTX.
	* config/aarch64/iterators.md: Refer to "Advanced SIMD" instead
	of "vector" where appropriate.
	(SVE_ALL, SVE_BH, SVE_BHS, SVE_BHSI, SVE_HSDI, SVE_HSF, SVE_SD)
	(SVE_SDI, SVE_I, SVE_F, PRED_ALL, PRED_BHS): New mode iterators.
	(UNSPEC_SEL, UNSPEC_ANDF, UNSPEC_IORF, UNSPEC_XORF, UNSPEC_COND_LT)
	(UNSPEC_COND_LE, UNSPEC_COND_EQ, UNSPEC_COND_NE, UNSPEC_COND_GE)
	(UNSPEC_COND_GT, UNSPEC_COND_LO, UNSPEC_COND_LS, UNSPEC_COND_HS)
	(UNSPEC_COND_HI, UNSPEC_COND_UO): New unspecs.
	(Vetype, VEL, Vel, VWIDE, Vwide, vw, vwcore, V_INT_EQUIV)
	(v_int_equiv): Extend to SVE modes.
	(Vesize, V128, v128, Vewtype, V_FP_EQUIV, v_fp_equiv, VPRED): New
	mode attributes.
	(LOGICAL_OR, SVE_INT_UNARY, SVE_FP_UNARY): New code iterators.
	(optab): Handle popcount, smin, smax, umin, umax, abs and sqrt.
	(logical_nn, lr, sve_int_op, sve_fp_op): New code attributes.
	(LOGICALF, OPTAB_PERMUTE, UNPACK, UNPACK_UNSIGNED, SVE_COND_INT_CMP)
	(SVE_COND_FP_CMP): New int iterators.
	(perm_hilo): Handle the new unpack unspecs.
	(optab, logicalf_op, su, perm_optab, cmp_op, imm_con): New int
	attributes.
	* config/aarch64/predicates.md (aarch64_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate, aarch64_split_add_offset_immediate)
	(aarch64_pluslong_or_poly_operand, aarch64_nonmemory_operand)
	(aarch64_equality_operator, aarch64_constant_vector_operand)
	(aarch64_sve_ld1r_operand, aarch64_sve_ldr_operand): New predicates.
	(aarch64_sve_nonimmediate_operand): Likewise.
	(aarch64_sve_general_operand): Likewise.
	(aarch64_sve_dup_operand, aarch64_sve_arith_immediate): Likewise.
	(aarch64_sve_sub_arith_immediate, aarch64_sve_inc_dec_immediate)
	(aarch64_sve_logical_immediate, aarch64_sve_mul_immediate): Likewise.
	(aarch64_sve_dup_immediate, aarch64_sve_cmp_vsc_immediate): Likewise.
	(aarch64_sve_cmp_vsd_immediate, aarch64_sve_index_immediate): Likewise.
	(aarch64_sve_float_arith_immediate): Likewise.
	(aarch64_sve_float_arith_with_sub_immediate): Likewise.
	(aarch64_sve_float_mul_immediate, aarch64_sve_arith_operand): Likewise.
	(aarch64_sve_add_operand, aarch64_sve_logical_operand): Likewise.
	(aarch64_sve_lshift_operand, aarch64_sve_rshift_operand): Likewise.
	(aarch64_sve_mul_operand, aarch64_sve_cmp_vsc_operand): Likewise.
	(aarch64_sve_cmp_vsd_operand, aarch64_sve_index_operand): Likewise.
	(aarch64_sve_float_arith_operand): Likewise.
	(aarch64_sve_float_arith_with_sub_operand): Likewise.
	(aarch64_sve_float_mul_operand): Likewise.
	(aarch64_sve_vec_perm_operand): Likewise.
	(aarch64_pluslong_operand): Include aarch64_sve_addvl_addpl_immediate.
	(aarch64_mov_operand): Accept const_poly_int and const_vector.
	(aarch64_simd_lshift_imm, aarch64_simd_rshift_imm): Accept const
	as well as const_vector.
	(aarch64_simd_imm_zero, aarch64_simd_imm_minus_one): Move earlier
	in file.  Use CONST0_RTX and CONSTM1_RTX.
	(aarch64_simd_reg_or_zero): Accept const as well as const_vector.
	Use aarch64_simd_imm_zero.
	* config/aarch64/aarch64-sve.md: New file.
	* config/aarch64/aarch64.md: Include it.
	(VG_REGNUM, P0_REGNUM, P7_REGNUM, P15_REGNUM): New register numbers.
	(UNSPEC_REV, UNSPEC_LD1_SVE, UNSPEC_ST1_SVE, UNSPEC_MERGE_PTRUE)
	(UNSPEC_PTEST_PTRUE, UNSPEC_UNPACKSHI, UNSPEC_UNPACKUHI)
	(UNSPEC_UNPACKSLO, UNSPEC_UNPACKULO, UNSPEC_PACK)
	(UNSPEC_FLOAT_CONVERT, UNSPEC_WHILE_LO): New unspec constants.
	(movqi, movhi): Pass CONST_POLY_INT operands through
	aarch64_expand_mov_immediate.
	(*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64): Handle
	CNT[BHWD] immediates.
	(movti): Split CONST_POLY_INT moves into two halves.
	(add<mode>3): Accept aarch64_pluslong_or_poly_operand.
	Split additions that need a temporary here if the destination
	is the stack pointer.
	(*add<mode>3_aarch64): Handle ADDVL and ADDPL immediates.
	(*add<mode>3_poly_1): New instruction.
	(set_clobber_cc): New expander.
Richard Sandiford Nov. 3, 2017, 5:50 p.m. UTC | #2
This patch adds gcc.target/aarch64 tests for SVE, and forces some
existing Advanced SIMD tests to use -march=armv8-a.
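
Forcing the architecture in a dejagnu test is just a matter of an
options directive; a typical form looks like the following
(an illustrative sketch, not the exact diff):

  /* { dg-options "-O2 -march=armv8-a" } */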


2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/testsuite/
	* gcc.target/aarch64/bic_imm_1.c: Force -march=armv8-a.
	* gcc.target/aarch64/fmaxmin.c: Likewise.
	* gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
	* gcc.target/aarch64/orr_imm_1.c: Likewise.
	* gcc.target/aarch64/pr62178.c: Likewise.
	* gcc.target/aarch64/pr71727-2.c: Likewise.
	* gcc.target/aarch64/saddw-1.c: Likewise.
	* gcc.target/aarch64/saddw-2.c: Likewise.
	* gcc.target/aarch64/uaddw-1.c: Likewise.
	* gcc.target/aarch64/uaddw-2.c: Likewise.
	* gcc.target/aarch64/uaddw-3.c: Likewise.
	* gcc.target/aarch64/vect-add-sub-cond.c: Likewise.
	* gcc.target/aarch64/vect-compile.c: Likewise.
	* gcc.target/aarch64/vect-faddv-compile.c: Likewise.
	* gcc.target/aarch64/vect-fcm-eq-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
	* gcc.target/aarch64/vect-fcm-ge-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
	* gcc.target/aarch64/vect-fmax-fmin-compile.c: Likewise.
	* gcc.target/aarch64/vect-fmaxv-fminv-compile.c: Likewise.
	* gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
	* gcc.target/aarch64/vect-fmovd.c: Likewise.
	* gcc.target/aarch64/vect-fmovf-zero.c: Likewise.
	* gcc.target/aarch64/vect-fmovf.c: Likewise.
	* gcc.target/aarch64/vect-fp-compile.c: Likewise.
	* gcc.target/aarch64/vect-ld1r-compile-fp.c: Likewise.
	* gcc.target/aarch64/vect-ld1r-compile.c: Likewise.
	* gcc.target/aarch64/vect-movi.c: Likewise.
	* gcc.target/aarch64/vect-mull-compile.c: Likewise.
	* gcc.target/aarch64/vect-reduc-or_1.c: Likewise.
	* gcc.target/aarch64/vect-vaddv.c: Likewise.
	* gcc.target/aarch64/vect_saddl_1.c: Likewise.
	* gcc.target/aarch64/vect_smlal_1.c: Likewise.
	* gcc.target/aarch64/vector_initialization_nostack.c: XFAIL for
	fixed-length SVE.
	* gcc.target/aarch64/sve_arith_1.c: New test.
	* gcc.target/aarch64/sve_const_pred_1.C: Likewise.
	* gcc.target/aarch64/sve_const_pred_2.C: Likewise.
	* gcc.target/aarch64/sve_const_pred_3.C: Likewise.
	* gcc.target/aarch64/sve_const_pred_4.C: Likewise.
	* gcc.target/aarch64/sve_cvtf_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_cvtf_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_cvtf_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_cvtf_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_dup_imm_1.c: Likewise.
	* gcc.target/aarch64/sve_dup_imm_1_run.c: Likewise.
	* gcc.target/aarch64/sve_dup_lane_1.c: Likewise.
	* gcc.target/aarch64/sve_ext_1.c: Likewise.
	* gcc.target/aarch64/sve_ext_2.c: Likewise.
	* gcc.target/aarch64/sve_extract_1.c: Likewise.
	* gcc.target/aarch64/sve_extract_2.c: Likewise.
	* gcc.target/aarch64/sve_extract_3.c: Likewise.
	* gcc.target/aarch64/sve_extract_4.c: Likewise.
	* gcc.target/aarch64/sve_fabs_1.c: Likewise.
	* gcc.target/aarch64/sve_fcvtz_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_fcvtz_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_fcvtz_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_fdiv_1.c: Likewise.
	* gcc.target/aarch64/sve_fdup_1.c: Likewise.
	* gcc.target/aarch64/sve_fdup_1_run.c: Likewise.
	* gcc.target/aarch64/sve_fmad_1.c: Likewise.
	* gcc.target/aarch64/sve_fmla_1.c: Likewise.
	* gcc.target/aarch64/sve_fmls_1.c: Likewise.
	* gcc.target/aarch64/sve_fmsb_1.c: Likewise.
	* gcc.target/aarch64/sve_fmul_1.c: Likewise.
	* gcc.target/aarch64/sve_fneg_1.c: Likewise.
	* gcc.target/aarch64/sve_fnmad_1.c: Likewise.
	* gcc.target/aarch64/sve_fnmla_1.c: Likewise.
	* gcc.target/aarch64/sve_fnmls_1.c: Likewise.
	* gcc.target/aarch64/sve_fnmsb_1.c: Likewise.
	* gcc.target/aarch64/sve_fp_arith_1.c: Likewise.
	* gcc.target/aarch64/sve_frinta_1.c: Likewise.
	* gcc.target/aarch64/sve_frinti_1.c: Likewise.
	* gcc.target/aarch64/sve_frintm_1.c: Likewise.
	* gcc.target/aarch64/sve_frintp_1.c: Likewise.
	* gcc.target/aarch64/sve_frintx_1.c: Likewise.
	* gcc.target/aarch64/sve_frintz_1.c: Likewise.
	* gcc.target/aarch64/sve_fsqrt_1.c: Likewise.
	* gcc.target/aarch64/sve_fsubr_1.c: Likewise.
	* gcc.target/aarch64/sve_index_1.c: Likewise.
	* gcc.target/aarch64/sve_index_1_run.c: Likewise.
	* gcc.target/aarch64/sve_ld1r_1.c: Likewise.
	* gcc.target/aarch64/sve_load_const_offset_1.c: Likewise.
	* gcc.target/aarch64/sve_load_scalar_offset_1.c: Likewise.
	* gcc.target/aarch64/sve_logical_1.c: Likewise.
	* gcc.target/aarch64/sve_loop_add_1.c: Likewise.
	* gcc.target/aarch64/sve_loop_add_1_run.c: Likewise.
	* gcc.target/aarch64/sve_mad_1.c: Likewise.
	* gcc.target/aarch64/sve_maxmin_1.c: Likewise.
	* gcc.target/aarch64/sve_maxmin_1_run.c: Likewise.
	* gcc.target/aarch64/sve_maxmin_strict_1.c: Likewise.
	* gcc.target/aarch64/sve_maxmin_strict_1_run.c: Likewise.
	* gcc.target/aarch64/sve_mla_1.c: Likewise.
	* gcc.target/aarch64/sve_mls_1.c: Likewise.
	* gcc.target/aarch64/sve_mov_rr_1.c: Likewise.
	* gcc.target/aarch64/sve_msb_1.c: Likewise.
	* gcc.target/aarch64/sve_mul_1.c: Likewise.
	* gcc.target/aarch64/sve_neg_1.c: Likewise.
	* gcc.target/aarch64/sve_nlogical_1.c: Likewise.
	* gcc.target/aarch64/sve_nlogical_1_run.c: Likewise.
	* gcc.target/aarch64/sve_pack_1.c: Likewise.
	* gcc.target/aarch64/sve_pack_1_run.c: Likewise.
	* gcc.target/aarch64/sve_pack_fcvt_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_pack_float_1.c: Likewise.
	* gcc.target/aarch64/sve_pack_float_1_run.c: Likewise.
	* gcc.target/aarch64/sve_popcount_1.c: Likewise.
	* gcc.target/aarch64/sve_popcount_1_run.c: Likewise.
	* gcc.target/aarch64/sve_reduc_1.c: Likewise.
	* gcc.target/aarch64/sve_reduc_1_run.c: Likewise.
	* gcc.target/aarch64/sve_reduc_2.c: Likewise.
	* gcc.target/aarch64/sve_reduc_2_run.c: Likewise.
	* gcc.target/aarch64/sve_reduc_3.c: Likewise.
	* gcc.target/aarch64/sve_revb_1.c: Likewise.
	* gcc.target/aarch64/sve_revh_1.c: Likewise.
	* gcc.target/aarch64/sve_revw_1.c: Likewise.
	* gcc.target/aarch64/sve_shift_1.c: Likewise.
	* gcc.target/aarch64/sve_single_1.c: Likewise.
	* gcc.target/aarch64/sve_single_2.c: Likewise.
	* gcc.target/aarch64/sve_single_3.c: Likewise.
	* gcc.target/aarch64/sve_single_4.c: Likewise.
	* gcc.target/aarch64/sve_store_scalar_offset_1.c: Likewise.
	* gcc.target/aarch64/sve_subr_1.c: Likewise.
	* gcc.target/aarch64/sve_trn1_1.c: Likewise.
	* gcc.target/aarch64/sve_trn2_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_fcvt_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_unpack_float_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_float_1_run.c: Likewise.
	* gcc.target/aarch64/sve_unpack_signed_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve_unpack_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve_unpack_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve_uzp1_1.c: Likewise.
	* gcc.target/aarch64/sve_uzp1_1_run.c: Likewise.
	* gcc.target/aarch64/sve_uzp2_1.c: Likewise.
	* gcc.target/aarch64/sve_uzp2_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vcond_1.C: Likewise.
	* gcc.target/aarch64/sve_vcond_1_run.C: Likewise.
	* gcc.target/aarch64/sve_vcond_2.c: Likewise.
	* gcc.target/aarch64/sve_vcond_2_run.c: Likewise.
	* gcc.target/aarch64/sve_vcond_3.c: Likewise.
	* gcc.target/aarch64/sve_vcond_4.c: Likewise.
	* gcc.target/aarch64/sve_vcond_4_run.c: Likewise.
	* gcc.target/aarch64/sve_vcond_5.c: Likewise.
	* gcc.target/aarch64/sve_vcond_5_run.c: Likewise.
	* gcc.target/aarch64/sve_vcond_6.c: Likewise.
	* gcc.target/aarch64/sve_vcond_6_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_init_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_init_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_init_2.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_1_overrange_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_1_overrun.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_single_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_const_single_1_run.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_single_1.c: Likewise.
	* gcc.target/aarch64/sve_vec_perm_single_1_run.c: Likewise.
	* gcc.target/aarch64/sve_zip1_1.c: Likewise.
	* gcc.target/aarch64/sve_zip2_1.c: Likewise.
Richard Sandiford Nov. 24, 2017, 3:59 p.m. UTC | #3
Richard Sandiford <richard.sandiford@linaro.org> writes:
> This series adds support for ARM's Scalable Vector Extension.
> More details on SVE can be found here:
>
>   https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a
>
> There are four parts for ease of review, but it probably makes
> sense to commit them as one patch.
>
> The series plugs SVE into the current vectorisation framework without
> adding any new features to the framework itself.  This means for example
> that vector loops still handle full vectors, with a scalar epilogue loop
> being needed for the rest.  Later patches add support for other features
> like fully-predicated loops.
>
> The patches build on top of the various series that I've already posted.
> Sorry that there were so many, and thanks again for all the reviews.
>
> Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
> (in the default vector-length agnostic mode).  Also tested with
> -msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
> and 512-bit SVE registers.

Here's an update based on an off-list discussion with the maintainers.
Changes since v1:

- Changed the mode names from fixed 256-bit vector names to "VNx"
  followed by a 128-bit mode name, e.g. V32QI -> VNx16QI (a short
  lane-count sketch follows this list).

- Added an "sve" attribute and used it in the "enabled" attribute.
  This allows generic aarch64.md patterns to disable things related
  to SVE on non-SVE targets; previously this was implicit through the
  constraints.

- Improved the consistency of the constraint names, specifically:

  Ua?: addition constraints (already used for Uaa)
  Us?: general scalar constraints (already used for various other scalars)
  Ut?: memory constraints (unchanged from v1)
  vs?: vector SVE constraints (mostly unchanged, but now includes FP
       as well as integer constraints)

  There's still the general "Dm" (minus one) constraint, for consistency
  with "Dz" (zero).

- Added missing register descriptions above FIXED_REGISTERS.

- "should"/"is expected to" -> "must".

- Added more commentary to things like regmode_natural_size.
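
As a reading aid for the new names: "VNx16QI" means N x 16 QImode
lanes, where N scales with the runtime vector length (16 QI lanes per
128 bits).  A hypothetical helper, purely for illustration and not
part of the patch, makes the arithmetic concrete:

  /* Lane count implied by VNx16QI for an SVE vector length of VL_BITS
     (which the architecture requires to be a multiple of 128).
     E.g. 256-bit SVE gives 32 lanes, matching the old V32QI name.  */
  static unsigned int
  vnx16qi_lanes (unsigned int vl_bits)
  {
    return (vl_bits / 128) * 16;
  }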

I also did a before and after comparison of the testsuite output
for base AArch64 (but using the new FIRST_PSEUDO_REGISTER definition
to avoid changes to hash values).  There were no differences.

Thanks,
Richard


2017-11-24  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/invoke.texi (-msve-vector-bits=): Document new option.
	(sve): Document new AArch64 extension.
	* doc/md.texi (w): Extend the description of the AArch64
	constraint to include SVE vectors.
	(Upl, Upa): Document new AArch64 predicate constraints.
	* config/aarch64/aarch64-opts.h (aarch64_sve_vector_bits_enum): New
	enum.
	* config/aarch64/aarch64.opt (sve_vector_bits): New enum.
	(msve-vector-bits=): New option.
	* config/aarch64/aarch64-option-extensions.def (fp, simd): Disable
	SVE when these are disabled.
	(sve): New extension.
	* config/aarch64/aarch64-modes.def: Define SVE vector and predicate
	modes.  Adjust their number of units based on aarch64_sve_vg.
	(MAX_BITSIZE_MODE_ANY_MODE): Define.
	* config/aarch64/aarch64-protos.h (ADDR_QUERY_ANY): New
	aarch64_addr_query_type.
	(aarch64_const_vec_all_same_in_range_p, aarch64_sve_pred_mode)
	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
	(aarch64_sve_inc_dec_immediate_p, aarch64_add_offset_temporaries)
	(aarch64_split_add_offset, aarch64_output_sve_cnt_immediate)
	(aarch64_output_sve_addvl_addpl, aarch64_output_sve_inc_dec_immediate)
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): Declare.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_check_zero_based_sve_index_immediate): Declare.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): Likewise.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): Declare.
	(aarch64_expand_mov_immediate): Take a gen_vec_duplicate callback.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move): Declare.
	(aarch64_expand_sve_vec_cmp_int, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond, aarch64_expand_sve_vec_perm): Declare.
	(aarch64_regmode_natural_size): Likewise.
	* config/aarch64/aarch64.h (AARCH64_FL_SVE): New macro.
	(AARCH64_FL_V8_3, AARCH64_FL_RCPC, AARCH64_FL_DOTPROD): Shift
	left one place.
	(AARCH64_ISA_SVE, TARGET_SVE): New macros.
	(FIXED_REGISTERS, CALL_USED_REGISTERS, REGISTER_NAMES): Add entries
	for VG and the SVE predicate registers.
	(V_ALIASES): Add a "z"-prefixed alias.
	(FIRST_PSEUDO_REGISTER): Change to P15_REGNUM + 1.
	(AARCH64_DWARF_VG, AARCH64_DWARF_P0): New macros.
	(PR_REGNUM_P, PR_LO_REGNUM_P): Likewise.
	(PR_LO_REGS, PR_HI_REGS, PR_REGS): New reg_classes.
	(REG_CLASS_NAMES): Add entries for them.
	(REG_CLASS_CONTENTS): Likewise.  Update ALL_REGS to include VG
	and the predicate registers.
	(aarch64_sve_vg): Declare.
	(BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR, BYTES_PER_SVE_PRED)
	(SVE_BYTE_MODE, MAX_COMPILE_TIME_VEC_BYTES): New macros.
	(REGMODE_NATURAL_SIZE): Define.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Handle
	SVE macros.
	* config/aarch64/aarch64.c: Include cfgrtl.h.
	(simd_immediate_info): Add a constructor for series vectors,
	and an associated step field.
	(aarch64_sve_vg): New variable.
	(aarch64_dbx_register_number): Handle VG and the predicate registers.
	(aarch64_vect_struct_mode_p, aarch64_vector_mode_p): Delete.
	(VEC_ADVSIMD, VEC_SVE_DATA, VEC_SVE_PRED, VEC_STRUCT, VEC_ANY_SVE)
	(VEC_ANY_DATA, VEC_STRUCT): New constants.
	(aarch64_advsimd_struct_mode_p, aarch64_sve_pred_mode_p)
	(aarch64_classify_vector_mode, aarch64_vector_data_mode_p)
	(aarch64_sve_data_mode_p, aarch64_pred_mode, aarch64_get_mask_mode):
	New functions.
	(aarch64_hard_regno_nregs): Handle SVE data modes for FP_REGS
	and FP_LO_REGS.  Handle PR_REGS, PR_LO_REGS and PR_HI_REGS.
	(aarch64_hard_regno_mode_ok): Handle VG.  Also handle the SVE
	predicate modes and predicate registers.  Explicitly restrict
	GPRs to modes of 16 bytes or smaller.  Only allow FP registers
	to store a vector mode if it is recognized by
	aarch64_classify_vector_mode.
	(aarch64_regmode_natural_size): New function.
	(aarch64_hard_regno_caller_save_mode): Return the original mode
	for predicates.
	(aarch64_sve_cnt_immediate_p, aarch64_output_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate_p, aarch64_output_sve_addvl_addpl)
	(aarch64_sve_inc_dec_immediate_p, aarch64_output_sve_inc_dec_immediate)
	(aarch64_add_offset_1_temporaries, aarch64_offset_temporaries): New
	functions.
	(aarch64_add_offset): Add a temp2 parameter.  Assert that temp1
	does not overlap dest if the function is frame-related.  Handle
	SVE constants.
	(aarch64_split_add_offset): New function.
	(aarch64_add_sp, aarch64_sub_sp): Add temp2 parameters and pass
	them to aarch64_add_offset.
	(aarch64_allocate_and_probe_stack_space): Add a temp2 parameter
	and update call to aarch64_sub_sp.
	(aarch64_add_cfa_expression): New function.
	(aarch64_expand_prologue): Pass extra temporary registers to the
	functions above.  Handle the case in which we need to emit new
	DW_CFA_expressions for registers that were originally saved
	relative to the stack pointer, but now have to be expressed
	relative to the frame pointer.
	(aarch64_output_mi_thunk): Pass extra temporary registers to the
	functions above.
	(aarch64_expand_epilogue): Likewise.  Prevent inheritance of
	IP0 and IP1 values for SVE frames.
	(aarch64_expand_vec_series): New function.
	(aarch64_expand_mov_immediate): Add a gen_vec_duplicate parameter.
	Handle SVE constants.  Use emit_move_insn to move a force_const_mem
	into the register, rather than emitting a SET directly.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move)
	(aarch64_get_reg_raw_mode, offset_4bit_signed_scaled_p)
	(offset_6bit_unsigned_scaled_p, aarch64_offset_7bit_signed_scaled_p)
	(offset_9bit_signed_scaled_p): New functions.
	(aarch64_replicate_bitmask_imm): New function.
	(aarch64_bitmask_imm): Use it.
	(aarch64_cannot_force_const_mem): Reject expressions involving
	a CONST_POLY_INT.  Update call to aarch64_classify_symbol.
	(aarch64_classify_index): Handle SVE indices, by requiring
	a plain register index with a scale that matches the element size.
	(aarch64_classify_address): Handle SVE addresses.  Assert that
	the mode of the address is VOIDmode or an integer mode.
	Update call to aarch64_classify_symbol.
	(aarch64_classify_symbolic_expression): Update call to
	aarch64_classify_symbol.
	(aarch64_const_vec_all_same_in_range_p): Extend to VEC_DUPLICATE
	constants by using const_vec_duplicate_p.
	(aarch64_const_vec_all_in_range_p): New function.
	(aarch64_print_vector_float_operand): Likewise.
	(aarch64_print_operand): Handle 'N' and 'C'.  Use "zN" rather than
	"vN" for FP registers with SVE modes.  Handle (const ...) vectors
	and the FP immediates 1.0 and 0.5.
	(aarch64_print_operand_address): Use ADDR_QUERY_ANY.  Handle
	SVE addresses.
	(aarch64_regno_regclass): Handle predicate registers.
	(aarch64_secondary_reload): Handle big-endian reloads of SVE
	data modes.
	(aarch64_class_max_nregs): Handle SVE modes and predicate registers.
	(aarch64_rtx_costs): Check for ADDVL and ADDPL instructions.
	(aarch64_convert_sve_vector_bits): New function.
	(aarch64_override_options): Use it to handle -msve-vector-bits=.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_legitimate_constant_p): Use aarch64_classify_vector_mode.
	Handle SVE vector and predicate modes.  Accept VL-based constants
	that need only one temporary register, and VL offsets that require
	no temporary registers.
	(aarch64_conditional_register_usage): Mark the predicate registers
	as fixed if SVE isn't available.
	(aarch64_vector_mode_supported_p): Use aarch64_classify_vector_mode.
	Return true for SVE vector and predicate modes.
	(aarch64_simd_container_mode): Take the number of bits as a poly_int64
	rather than an unsigned int.  Handle SVE modes.
	(aarch64_preferred_simd_mode): Update call accordingly.  Handle
	SVE modes.
	(aarch64_autovectorize_vector_sizes): Add BYTES_PER_SVE_VECTOR
	if SVE is enabled.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): New functions.
	(aarch64_sve_valid_immediate): New function.
	(aarch64_simd_valid_immediate): Use it as the fallback for SVE vectors.
	Explicitly reject structure modes.  Check for INDEX constants.
	Handle PTRUE and PFALSE constants.
	(aarch64_check_zero_based_sve_index_immediate): New function.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_mov_operand_p): Use aarch64_simd_valid_immediate for
	vector modes.  Accept constants in the range of CNT[BHWD].
	(aarch64_simd_scalar_immediate_valid_for_move): Explicitly
	ask for an Advanced SIMD mode.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): New functions.
	(aarch64_simd_vector_alignment): Handle SVE predicates.
	(aarch64_vectorize_preferred_vector_alignment): New function.
	(aarch64_simd_vector_alignment_reachable): Use it instead of
	the vector size.
	(aarch64_shift_truncation_mask): Use aarch64_vector_data_mode_p.
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): New
	functions.
	(MAX_VECT_LEN): Delete.
	(expand_vec_perm_d): Add a vec_flags field.
	(emit_unspec2, aarch64_expand_sve_vec_perm): New functions.
	(aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
	(aarch64_evpc_ext): Don't apply a big-endian lane correction
	for SVE modes.
	(aarch64_evpc_rev): Rename to...
	(aarch64_evpc_rev_local): ...this.  Use a predicated operation for SVE.
	(aarch64_evpc_rev_global): New function.
	(aarch64_evpc_dup): Enforce a 64-byte range for SVE DUP.
	(aarch64_evpc_tbl): Use MAX_COMPILE_TIME_VEC_BYTES instead of
	MAX_VECT_LEN.
	(aarch64_evpc_sve_tbl): New function.
	(aarch64_expand_vec_perm_const_1): Update after rename of
	aarch64_evpc_rev.  Handle SVE permutes too, trying
	aarch64_evpc_rev_global and using aarch64_evpc_sve_tbl rather
	than aarch64_evpc_tbl.
	(aarch64_expand_vec_perm_const): Initialize vec_flags.
	(aarch64_vectorize_vec_perm_const_ok): Likewise.
	(aarch64_sve_cmp_operand_p, aarch64_unspec_cond_code)
	(aarch64_gen_unspec_cond, aarch64_expand_sve_vec_cmp_int)
	(aarch64_emit_unspec_cond, aarch64_emit_unspec_cond_or)
	(aarch64_emit_inverted_unspec_cond, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond): New functions.
	(aarch64_modes_tieable_p): Use aarch64_vector_data_mode_p instead
	of aarch64_vector_mode_p.
	(aarch64_dwarf_poly_indeterminate_value): New function.
	(aarch64_compute_pressure_classes): Likewise.
	(aarch64_can_change_mode_class): Likewise.
	(TARGET_GET_RAW_RESULT_MODE, TARGET_GET_RAW_ARG_MODE): Redefine.
	(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Likewise.
	(TARGET_VECTORIZE_GET_MASK_MODE): Likewise.
	(TARGET_DWARF_POLY_INDETERMINATE_VALUE): Likewise.
	(TARGET_COMPUTE_PRESSURE_CLASSES): Likewise.
	(TARGET_CAN_CHANGE_MODE_CLASS): Likewise.
	* config/aarch64/constraints.md (Upa, Upl, Uav, Uat, Usv, Usi, Utr)
	(Uty, Dm, vsa, vsc, vsd, vsi, vsn, vsl, vsm, vsA, vsM, vsN): New
	constraints.
	(Dn, Dl, Dr): Accept const as well as const_vector.
	(Dz): Likewise.  Compare against CONST0_RTX.
	* config/aarch64/iterators.md: Refer to "Advanced SIMD" instead
	of "vector" where appropriate.
	(SVE_ALL, SVE_BH, SVE_BHS, SVE_BHSI, SVE_HSDI, SVE_HSF, SVE_SD)
	(SVE_SDI, SVE_I, SVE_F, PRED_ALL, PRED_BHS): New mode iterators.
	(UNSPEC_SEL, UNSPEC_ANDF, UNSPEC_IORF, UNSPEC_XORF, UNSPEC_COND_LT)
	(UNSPEC_COND_LE, UNSPEC_COND_EQ, UNSPEC_COND_NE, UNSPEC_COND_GE)
	(UNSPEC_COND_GT, UNSPEC_COND_LO, UNSPEC_COND_LS, UNSPEC_COND_HS)
	(UNSPEC_COND_HI, UNSPEC_COND_UO): New unspecs.
	(Vetype, VEL, Vel, VWIDE, Vwide, vw, vwcore, V_INT_EQUIV)
	(v_int_equiv): Extend to SVE modes.
	(Vesize, V128, v128, Vewtype, V_FP_EQUIV, v_fp_equiv, VPRED): New
	mode attributes.
	(LOGICAL_OR, SVE_INT_UNARY, SVE_FP_UNARY): New code iterators.
	(optab): Handle popcount, smin, smax, umin, umax, abs and sqrt.
	(logical_nn, lr, sve_int_op, sve_fp_op): New code attributes.
	(LOGICALF, OPTAB_PERMUTE, UNPACK, UNPACK_UNSIGNED, SVE_COND_INT_CMP)
	(SVE_COND_FP_CMP): New int iterators.
	(perm_hilo): Handle the new unpack unspecs.
	(optab, logicalf_op, su, perm_optab, cmp_op, imm_con): New int
	attributes.
	* config/aarch64/predicates.md (aarch64_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate, aarch64_split_add_offset_immediate)
	(aarch64_pluslong_or_poly_operand, aarch64_nonmemory_operand)
	(aarch64_equality_operator, aarch64_constant_vector_operand)
	(aarch64_sve_ld1r_operand, aarch64_sve_ldr_operand): New predicates.
	(aarch64_sve_nonimmediate_operand): Likewise.
	(aarch64_sve_general_operand): Likewise.
	(aarch64_sve_dup_operand, aarch64_sve_arith_immediate): Likewise.
	(aarch64_sve_sub_arith_immediate, aarch64_sve_inc_dec_immediate)
	(aarch64_sve_logical_immediate, aarch64_sve_mul_immediate): Likewise.
	(aarch64_sve_dup_immediate, aarch64_sve_cmp_vsc_immediate): Likewise.
	(aarch64_sve_cmp_vsd_immediate, aarch64_sve_index_immediate): Likewise.
	(aarch64_sve_float_arith_immediate): Likewise.
	(aarch64_sve_float_arith_with_sub_immediate): Likewise.
	(aarch64_sve_float_mul_immediate, aarch64_sve_arith_operand): Likewise.
	(aarch64_sve_add_operand, aarch64_sve_logical_operand): Likewise.
	(aarch64_sve_lshift_operand, aarch64_sve_rshift_operand): Likewise.
	(aarch64_sve_mul_operand, aarch64_sve_cmp_vsc_operand): Likewise.
	(aarch64_sve_cmp_vsd_operand, aarch64_sve_index_operand): Likewise.
	(aarch64_sve_float_arith_operand): Likewise.
	(aarch64_sve_float_arith_with_sub_operand): Likewise.
	(aarch64_sve_float_mul_operand): Likewise.
	(aarch64_sve_vec_perm_operand): Likewise.
	(aarch64_pluslong_operand): Include aarch64_sve_addvl_addpl_immediate.
	(aarch64_mov_operand): Accept const_poly_int and const_vector.
	(aarch64_simd_lshift_imm, aarch64_simd_rshift_imm): Accept const
	as well as const_vector.
	(aarch64_simd_imm_zero, aarch64_simd_imm_minus_one): Move earlier
	in file.  Use CONST0_RTX and CONSTM1_RTX.
	(aarch64_simd_or_scalar_imm_zero): Likewise.  Add match_codes.
	(aarch64_simd_reg_or_zero): Accept const as well as const_vector.
	Use aarch64_simd_imm_zero.
	* config/aarch64/aarch64-sve.md: New file.
	* config/aarch64/aarch64.md: Include it.
	(VG_REGNUM, P0_REGNUM, P7_REGNUM, P15_REGNUM): New register numbers.
	(UNSPEC_REV, UNSPEC_LD1_SVE, UNSPEC_ST1_SVE, UNSPEC_MERGE_PTRUE)
	(UNSPEC_PTEST_PTRUE, UNSPEC_UNPACKSHI, UNSPEC_UNPACKUHI)
	(UNSPEC_UNPACKSLO, UNSPEC_UNPACKULO, UNSPEC_PACK)
	(UNSPEC_FLOAT_CONVERT, UNSPEC_WHILE_LO): New unspec constants.
	(sve): New attribute.
	(enabled): Disable instructions with the sve attribute unless
	TARGET_SVE.
	(movqi, movhi): Pass CONST_POLY_INT operands through
	aarch64_expand_mov_immediate.
	(*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64): Handle
	CNT[BHWD] immediates.
	(movti): Split CONST_POLY_INT moves into two halves.
	(add<mode>3): Accept aarch64_pluslong_or_poly_operand.
	Split additions that need a temporary here if the destination
	is the stack pointer.
	(*add<mode>3_aarch64): Handle ADDVL and ADDPL immediates.
	(*add<mode>3_poly_1): New instruction.
	(set_clobber_cc): New expander.
Richard Sandiford Jan. 5, 2018, 11:41 a.m. UTC | #4
Here's the patch updated to apply on top of the v8.4 and
__builtin_load_no_speculate support.  It also handles the new
vec_perm_indices and CONST_VECTOR encoding and uses VNx... names
for the SVE modes.

Richard Sandiford <richard.sandiford@linaro.org> writes:
> This patch adds support for ARM's Scalable Vector Extension.
> The patch just contains the core features that work with the
> current vectoriser framework; later patches will add extra
> capabilities to both the target-independent code and AArch64 code.
> The patch doesn't include:
>
> - support for unwinding frames whose size depends on the vector length
> - modelling the effect of __tls_get_addr on the SVE registers
>
> These are handled by later patches instead.
>
> Some notes:
>
> - The copyright years for aarch64-sve.md start at 2009 because some of
>   the code is based on aarch64.md, which also starts from then.
>
> - The patch inserts spaces between items in the AArch64 section
>   of sourcebuild.texi.  This matches at least the surrounding
>   architectures and looks a little nicer in the info output.
>
> - aarch64-sve.md includes a pattern:
>
>     while_ult<GPI:mode><PRED_ALL:mode>
>
>   A later patch adds a matching "while_ult" optab, but the pattern
>   is also needed by the predicate vec_duplicate expander.


2018-01-05  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/invoke.texi (-msve-vector-bits=): Document new option.
	(sve): Document new AArch64 extension.
	* doc/md.texi (w): Extend the description of the AArch64
	constraint to include SVE vectors.
	(Upl, Upa): Document new AArch64 predicate constraints.
	* config/aarch64/aarch64-opts.h (aarch64_sve_vector_bits_enum): New
	enum.
	* config/aarch64/aarch64.opt (sve_vector_bits): New enum.
	(msve-vector-bits=): New option.
	* config/aarch64/aarch64-option-extensions.def (fp, simd): Disable
	SVE when these are disabled.
	(sve): New extension.
	* config/aarch64/aarch64-modes.def: Define SVE vector and predicate
	modes.  Adjust their number of units based on aarch64_sve_vg.
	(MAX_BITSIZE_MODE_ANY_MODE): Define.
	* config/aarch64/aarch64-protos.h (ADDR_QUERY_ANY): New
	aarch64_addr_query_type.
	(aarch64_const_vec_all_same_in_range_p, aarch64_sve_pred_mode)
	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
	(aarch64_sve_inc_dec_immediate_p, aarch64_add_offset_temporaries)
	(aarch64_split_add_offset, aarch64_output_sve_cnt_immediate)
	(aarch64_output_sve_addvl_addpl, aarch64_output_sve_inc_dec_immediate)
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): Declare.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_check_zero_based_sve_index_immediate): Declare.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): Likewise.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): Declare.
	(aarch64_expand_mov_immediate): Take a gen_vec_duplicate callback.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move): Declare.
	(aarch64_expand_sve_vec_cmp_int, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond, aarch64_expand_sve_vec_perm): Declare.
	(aarch64_regmode_natural_size): Likewise.
	* config/aarch64/aarch64.h (AARCH64_FL_SVE): New macro.
	(AARCH64_FL_V8_3, AARCH64_FL_RCPC, AARCH64_FL_DOTPROD): Shift
	left one place.
	(AARCH64_ISA_SVE, TARGET_SVE): New macros.
	(FIXED_REGISTERS, CALL_USED_REGISTERS, REGISTER_NAMES): Add entries
	for VG and the SVE predicate registers.
	(V_ALIASES): Add a "z"-prefixed alias.
	(FIRST_PSEUDO_REGISTER): Change to P15_REGNUM + 1.
	(AARCH64_DWARF_VG, AARCH64_DWARF_P0): New macros.
	(PR_REGNUM_P, PR_LO_REGNUM_P): Likewise.
	(PR_LO_REGS, PR_HI_REGS, PR_REGS): New reg_classes.
	(REG_CLASS_NAMES): Add entries for them.
	(REG_CLASS_CONTENTS): Likewise.  Update ALL_REGS to include VG
	and the predicate registers.
	(aarch64_sve_vg): Declare.
	(BITS_PER_SVE_VECTOR, BYTES_PER_SVE_VECTOR, BYTES_PER_SVE_PRED)
	(SVE_BYTE_MODE, MAX_COMPILE_TIME_VEC_BYTES): New macros.
	(REGMODE_NATURAL_SIZE): Define.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Handle
	SVE macros.
	* config/aarch64/aarch64.c: Include cfgrtl.h.
	(simd_immediate_info): Add a constructor for series vectors,
	and an associated step field.
	(aarch64_sve_vg): New variable.
	(aarch64_dbx_register_number): Handle VG and the predicate registers.
	(aarch64_vect_struct_mode_p, aarch64_vector_mode_p): Delete.
	(VEC_ADVSIMD, VEC_SVE_DATA, VEC_SVE_PRED, VEC_STRUCT, VEC_ANY_SVE)
	(VEC_ANY_DATA, VEC_STRUCT): New constants.
	(aarch64_advsimd_struct_mode_p, aarch64_sve_pred_mode_p)
	(aarch64_classify_vector_mode, aarch64_vector_data_mode_p)
	(aarch64_sve_data_mode_p, aarch64_sve_pred_mode)
	(aarch64_get_mask_mode): New functions.
	(aarch64_hard_regno_nregs): Handle SVE data modes for FP_REGS
	and FP_LO_REGS.  Handle PR_REGS, PR_LO_REGS and PR_HI_REGS.
	(aarch64_hard_regno_mode_ok): Handle VG.  Also handle the SVE
	predicate modes and predicate registers.  Explicitly restrict
	GPRs to modes of 16 bytes or smaller.  Only allow FP registers
	to store a vector mode if it is recognized by
	aarch64_classify_vector_mode.
	(aarch64_regmode_natural_size): New function.
	(aarch64_hard_regno_caller_save_mode): Return the original mode
	for predicates.
	(aarch64_sve_cnt_immediate_p, aarch64_output_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate_p, aarch64_output_sve_addvl_addpl)
	(aarch64_sve_inc_dec_immediate_p, aarch64_output_sve_inc_dec_immediate)
	(aarch64_add_offset_1_temporaries, aarch64_offset_temporaries): New
	functions.
	(aarch64_add_offset): Add a temp2 parameter.  Assert that temp1
	does not overlap dest if the function is frame-related.  Handle
	SVE constants.
	(aarch64_split_add_offset): New function.
	(aarch64_add_sp, aarch64_sub_sp): Add temp2 parameters and pass
	them to aarch64_add_offset.
	(aarch64_allocate_and_probe_stack_space): Add a temp2 parameter
	and update call to aarch64_sub_sp.
	(aarch64_add_cfa_expression): New function.
	(aarch64_expand_prologue): Pass extra temporary registers to the
	functions above.  Handle the case in which we need to emit new
	DW_CFA_expressions for registers that were originally saved
	relative to the stack pointer, but now have to be expressed
	relative to the frame pointer.
	(aarch64_output_mi_thunk): Pass extra temporary registers to the
	functions above.
	(aarch64_expand_epilogue): Likewise.  Prevent inheritance of
	IP0 and IP1 values for SVE frames.
	(aarch64_expand_vec_series): New function.
	(aarch64_expand_sve_widened_duplicate): Likewise.
	(aarch64_expand_sve_const_vector): Likewise.
	(aarch64_expand_mov_immediate): Add a gen_vec_duplicate parameter.
	Handle SVE constants.  Use emit_move_insn to move a force_const_mem
	into the register, rather than emitting a SET directly.
	(aarch64_emit_sve_pred_move, aarch64_expand_sve_mem_move)
	(aarch64_get_reg_raw_mode, offset_4bit_signed_scaled_p)
	(offset_6bit_unsigned_scaled_p, aarch64_offset_7bit_signed_scaled_p)
	(offset_9bit_signed_scaled_p): New functions.
	(aarch64_replicate_bitmask_imm): New function.
	(aarch64_bitmask_imm): Use it.
	(aarch64_cannot_force_const_mem): Reject expressions involving
	a CONST_POLY_INT.  Update call to aarch64_classify_symbol.
	(aarch64_classify_index): Handle SVE indices, by requiring
	a plain register index with a scale that matches the element size.
	(aarch64_classify_address): Handle SVE addresses.  Assert that
	the mode of the address is VOIDmode or an integer mode.
	Update call to aarch64_classify_symbol.
	(aarch64_classify_symbolic_expression): Update call to
	aarch64_classify_symbol.
	(aarch64_const_vec_all_in_range_p): New function.
	(aarch64_print_vector_float_operand): Likewise.
	(aarch64_print_operand): Handle 'N' and 'C'.  Use "zN" rather than
	"vN" for FP registers with SVE modes.  Handle (const ...) vectors
	and the FP immediates 1.0 and 0.5.
	(aarch64_print_address_internal): Handle SVE addresses.
	(aarch64_print_operand_address): Use ADDR_QUERY_ANY.
	(aarch64_regno_regclass): Handle predicate registers.
	(aarch64_secondary_reload): Handle big-endian reloads of SVE
	data modes.
	(aarch64_class_max_nregs): Handle SVE modes and predicate registers.
	(aarch64_rtx_costs): Check for ADDVL and ADDPL instructions.
	(aarch64_convert_sve_vector_bits): New function.
	(aarch64_override_options): Use it to handle -msve-vector-bits=.
	(aarch64_classify_symbol): Take the offset as a HOST_WIDE_INT
	rather than an rtx.
	(aarch64_legitimate_constant_p): Use aarch64_classify_vector_mode.
	Handle SVE vector and predicate modes.  Accept VL-based constants
	that need only one temporary register, and VL offsets that require
	no temporary registers.
	(aarch64_conditional_register_usage): Mark the predicate registers
	as fixed if SVE isn't available.
	(aarch64_vector_mode_supported_p): Use aarch64_classify_vector_mode.
	Return true for SVE vector and predicate modes.
	(aarch64_simd_container_mode): Take the number of bits as a poly_int64
	rather than an unsigned int.  Handle SVE modes.
	(aarch64_preferred_simd_mode): Update call accordingly.  Handle
	SVE modes.
	(aarch64_autovectorize_vector_sizes): Add BYTES_PER_SVE_VECTOR
	if SVE is enabled.
	(aarch64_sve_index_immediate_p, aarch64_sve_arith_immediate_p)
	(aarch64_sve_bitmask_immediate_p, aarch64_sve_dup_immediate_p)
	(aarch64_sve_cmp_immediate_p, aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): New functions.
	(aarch64_sve_valid_immediate): New function.
	(aarch64_simd_valid_immediate): Use it as the fallback for SVE vectors.
	Explicitly reject structure modes.  Check for INDEX constants.
	Handle PTRUE and PFALSE constants.
	(aarch64_check_zero_based_sve_index_immediate): New function.
	(aarch64_simd_imm_zero_p): Delete.
	(aarch64_mov_operand_p): Use aarch64_simd_valid_immediate for
	vector modes.  Accept constants in the range of CNT[BHWD].
	(aarch64_simd_scalar_immediate_valid_for_move): Explicitly
	ask for an Advanced SIMD mode.
	(aarch64_sve_ld1r_operand_p, aarch64_sve_ldr_operand_p): New functions.
	(aarch64_simd_vector_alignment): Handle SVE predicates.
	(aarch64_vectorize_preferred_vector_alignment): New function.
	(aarch64_simd_vector_alignment_reachable): Use it instead of
	the vector size.
	(aarch64_shift_truncation_mask): Use aarch64_vector_data_mode_p.
	(aarch64_output_sve_mov_immediate, aarch64_output_ptrue): New
	functions.
	(MAX_VECT_LEN): Delete.
	(expand_vec_perm_d): Add a vec_flags field.
	(emit_unspec2, aarch64_expand_sve_vec_perm): New functions.
	(aarch64_evpc_trn, aarch64_evpc_uzp, aarch64_evpc_zip)
	(aarch64_evpc_ext): Don't apply a big-endian lane correction
	for SVE modes.
	(aarch64_evpc_rev): Rename to...
	(aarch64_evpc_rev_local): ...this.  Use a predicated operation for SVE.
	(aarch64_evpc_rev_global): New function.
	(aarch64_evpc_dup): Enforce a 64-byte range for SVE DUP.
	(aarch64_evpc_tbl): Use MAX_COMPILE_TIME_VEC_BYTES instead of
	MAX_VECT_LEN.
	(aarch64_evpc_sve_tbl): New function.
	(aarch64_expand_vec_perm_const_1): Update after rename of
	aarch64_evpc_rev.  Handle SVE permutes too, trying
	aarch64_evpc_rev_global and using aarch64_evpc_sve_tbl rather
	than aarch64_evpc_tbl.
	(aarch64_vectorize_vec_perm_const): Initialize vec_flags.
	(aarch64_sve_cmp_operand_p, aarch64_unspec_cond_code)
	(aarch64_gen_unspec_cond, aarch64_expand_sve_vec_cmp_int)
	(aarch64_emit_unspec_cond, aarch64_emit_unspec_cond_or)
	(aarch64_emit_inverted_unspec_cond, aarch64_expand_sve_vec_cmp_float)
	(aarch64_expand_sve_vcond): New functions.
	(aarch64_modes_tieable_p): Use aarch64_vector_data_mode_p instead
	of aarch64_vector_mode_p.
	(aarch64_dwarf_poly_indeterminate_value): New function.
	(aarch64_compute_pressure_classes): Likewise.
	(aarch64_can_change_mode_class): Likewise.
	(TARGET_GET_RAW_RESULT_MODE, TARGET_GET_RAW_ARG_MODE): Redefine.
	(TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Likewise.
	(TARGET_VECTORIZE_GET_MASK_MODE): Likewise.
	(TARGET_DWARF_POLY_INDETERMINATE_VALUE): Likewise.
	(TARGET_COMPUTE_PRESSURE_CLASSES): Likewise.
	(TARGET_CAN_CHANGE_MODE_CLASS): Likewise.
	* config/aarch64/constraints.md (Upa, Upl, Uav, Uat, Usv, Usi, Utr)
	(Uty, Dm, vsa, vsc, vsd, vsi, vsn, vsl, vsm, vsA, vsM, vsN): New
	constraints.
	(Dn, Dl, Dr): Accept const as well as const_vector.
	(Dz): Likewise.  Compare against CONST0_RTX.
	* config/aarch64/iterators.md: Refer to "Advanced SIMD" instead
	of "vector" where appropriate.
	(SVE_ALL, SVE_BH, SVE_BHS, SVE_BHSI, SVE_HSDI, SVE_HSF, SVE_SD)
	(SVE_SDI, SVE_I, SVE_F, PRED_ALL, PRED_BHS): New mode iterators.
	(UNSPEC_SEL, UNSPEC_ANDF, UNSPEC_IORF, UNSPEC_XORF, UNSPEC_COND_LT)
	(UNSPEC_COND_LE, UNSPEC_COND_EQ, UNSPEC_COND_NE, UNSPEC_COND_GE)
	(UNSPEC_COND_GT, UNSPEC_COND_LO, UNSPEC_COND_LS, UNSPEC_COND_HS)
	(UNSPEC_COND_HI, UNSPEC_COND_UO): New unspecs.
	(Vetype, VEL, Vel, VWIDE, Vwide, vw, vwcore, V_INT_EQUIV)
	(v_int_equiv): Extend to SVE modes.
	(Vesize, V128, v128, Vewtype, V_FP_EQUIV, v_fp_equiv, VPRED): New
	mode attributes.
	(LOGICAL_OR, SVE_INT_UNARY, SVE_FP_UNARY): New code iterators.
	(optab): Handle popcount, smin, smax, umin, umax, abs and sqrt.
	(logical_nn, lr, sve_int_op, sve_fp_op): New code attributes.
	(LOGICALF, OPTAB_PERMUTE, UNPACK, UNPACK_UNSIGNED, SVE_COND_INT_CMP)
	(SVE_COND_FP_CMP): New int iterators.
	(perm_hilo): Handle the new unpack unspecs.
	(optab, logicalf_op, su, perm_optab, cmp_op, imm_con): New int
	attributes.
	* config/aarch64/predicates.md (aarch64_sve_cnt_immediate)
	(aarch64_sve_addvl_addpl_immediate, aarch64_split_add_offset_immediate)
	(aarch64_pluslong_or_poly_operand, aarch64_nonmemory_operand)
	(aarch64_equality_operator, aarch64_constant_vector_operand)
	(aarch64_sve_ld1r_operand, aarch64_sve_ldr_operand): New predicates.
	(aarch64_sve_nonimmediate_operand): Likewise.
	(aarch64_sve_general_operand): Likewise.
	(aarch64_sve_dup_operand, aarch64_sve_arith_immediate): Likewise.
	(aarch64_sve_sub_arith_immediate, aarch64_sve_inc_dec_immediate)
	(aarch64_sve_logical_immediate, aarch64_sve_mul_immediate): Likewise.
	(aarch64_sve_dup_immediate, aarch64_sve_cmp_vsc_immediate): Likewise.
	(aarch64_sve_cmp_vsd_immediate, aarch64_sve_index_immediate): Likewise.
	(aarch64_sve_float_arith_immediate): Likewise.
	(aarch64_sve_float_arith_with_sub_immediate): Likewise.
	(aarch64_sve_float_mul_immediate, aarch64_sve_arith_operand): Likewise.
	(aarch64_sve_add_operand, aarch64_sve_logical_operand): Likewise.
	(aarch64_sve_lshift_operand, aarch64_sve_rshift_operand): Likewise.
	(aarch64_sve_mul_operand, aarch64_sve_cmp_vsc_operand): Likewise.
	(aarch64_sve_cmp_vsd_operand, aarch64_sve_index_operand): Likewise.
	(aarch64_sve_float_arith_operand): Likewise.
	(aarch64_sve_float_arith_with_sub_operand): Likewise.
	(aarch64_sve_float_mul_operand): Likewise.
	(aarch64_sve_vec_perm_operand): Likewise.
	(aarch64_pluslong_operand): Include aarch64_sve_addvl_addpl_immediate.
	(aarch64_mov_operand): Accept const_poly_int and const_vector.
	(aarch64_simd_lshift_imm, aarch64_simd_rshift_imm): Accept const
	as well as const_vector.
	(aarch64_simd_imm_zero, aarch64_simd_imm_minus_one): Move earlier
	in file.  Use CONST0_RTX and CONSTM1_RTX.
	(aarch64_simd_or_scalar_imm_zero): Likewise.  Add match_codes.
	(aarch64_simd_reg_or_zero): Accept const as well as const_vector.
	Use aarch64_simd_imm_zero.
	* config/aarch64/aarch64-sve.md: New file.
	* config/aarch64/aarch64.md: Include it.
	(VG_REGNUM, P0_REGNUM, P7_REGNUM, P15_REGNUM): New register numbers.
	(UNSPEC_REV, UNSPEC_LD1_SVE, UNSPEC_ST1_SVE, UNSPEC_MERGE_PTRUE)
	(UNSPEC_PTEST_PTRUE, UNSPEC_UNPACKSHI, UNSPEC_UNPACKUHI)
	(UNSPEC_UNPACKSLO, UNSPEC_UNPACKULO, UNSPEC_PACK)
	(UNSPEC_FLOAT_CONVERT, UNSPEC_WHILE_LO): New unspec constants.
	(sve): New attribute.
	(enabled): Disable instructions with the sve attribute unless
	TARGET_SVE.
	(movqi, movhi): Pass CONST_POLY_INT operands through
	aarch64_expand_mov_immediate.
	(*mov<mode>_aarch64, *movsi_aarch64, *movdi_aarch64): Handle
	CNT[BHSD] immediates.
	(movti): Split CONST_POLY_INT moves into two halves.
	(add<mode>3): Accept aarch64_pluslong_or_poly_operand.
	Split additions that need a temporary here if the destination
	is the stack pointer.
	(*add<mode>3_aarch64): Handle ADDVL and ADDPL immediates.
	(*add<mode>3_poly_1): New instruction.
	(set_clobber_cc): New expander.
James Greenhalgh Jan. 6, 2018, 6:05 p.m. UTC | #5
On Fri, Nov 03, 2017 at 05:50:54PM +0000, Richard Sandiford wrote:
> This patch adds gcc.target/aarch64 tests for SVE, and forces some
> existing Advanced SIMD tests to use -march=armv8-a.

I'm going to assume that these new testcases are broadly sensible, and not
spend any significant time looking at them.

I'm not completely happy forcing the architecture to Armv8-a - it would be
useful for our testing coverage if users who have configured with other
architecture variants had these tests execute in those environments. That
way we'd check we still do the right thing once we have an implicit
-march=armv8.2-a.

However, as we don't have a good way to make that happen (other than maybe
only forcing the arch if we are in a configuration wired for SVE?), I'm
happy with this patch as a compromise for now.

OK, but a modification to cover the above point would make me happier.

Thanks,
James

> 
> 
> 2017-11-03  Richard Sandiford  <richard.sandiford@linaro.org>
> 	    Alan Hayward  <alan.hayward@arm.com>
> 	    David Sherwood  <david.sherwood@arm.com>
> 
> gcc/testsuite/
> 	* gcc.target/aarch64/bic_imm_1.c: Force -march=armv8-a.
> 	* gcc.target/aarch64/fmaxmin.c: Likewise.
> 	* gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
> 	* gcc.target/aarch64/orr_imm_1.c: Likewise.
> 	* gcc.target/aarch64/pr62178.c: Likewise.
> 	* gcc.target/aarch64/pr71727-2.c: Likewise.
> 	* gcc.target/aarch64/saddw-1.c: Likewise.
> 	* gcc.target/aarch64/saddw-2.c: Likewise.
> 	* gcc.target/aarch64/uaddw-1.c: Likewise.
> 	* gcc.target/aarch64/uaddw-2.c: Likewise.
> 	* gcc.target/aarch64/uaddw-3.c: Likewise.
> 	* gcc.target/aarch64/vect-add-sub-cond.c: Likewise.
> 	* gcc.target/aarch64/vect-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-faddv-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-eq-d.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-ge-d.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
> 	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
> 	* gcc.target/aarch64/vect-fmax-fmin-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-fmaxv-fminv-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
> 	* gcc.target/aarch64/vect-fmovd.c: Likewise.
> 	* gcc.target/aarch64/vect-fmovf-zero.c: Likewise.
> 	* gcc.target/aarch64/vect-fmovf.c: Likewise.
> 	* gcc.target/aarch64/vect-fp-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-ld1r-compile-fp.c: Likewise.
> 	* gcc.target/aarch64/vect-ld1r-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-movi.c: Likewise.
> 	* gcc.target/aarch64/vect-mull-compile.c: Likewise.
> 	* gcc.target/aarch64/vect-reduc-or_1.c: Likewise.
> 	* gcc.target/aarch64/vect-vaddv.c: Likewise.
> 	* gcc.target/aarch64/vect_saddl_1.c: Likewise.
> 	* gcc.target/aarch64/vect_smlal_1.c: Likewise.
> 	* gcc.target/aarch64/vector_initialization_nostack.c: XFAIL for
> 	fixed-length SVE.
> 	* gcc.target/aarch64/sve_arith_1.c: New test.
> 	* gcc.target/aarch64/sve_const_pred_1.C: Likewise.
> 	* gcc.target/aarch64/sve_const_pred_2.C: Likewise.
> 	* gcc.target/aarch64/sve_const_pred_3.C: Likewise.
> 	* gcc.target/aarch64/sve_const_pred_4.C: Likewise.
> 	* gcc.target/aarch64/sve_cvtf_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_cvtf_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_cvtf_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_cvtf_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_dup_imm_1.c: Likewise.
> 	* gcc.target/aarch64/sve_dup_imm_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_dup_lane_1.c: Likewise.
> 	* gcc.target/aarch64/sve_ext_1.c: Likewise.
> 	* gcc.target/aarch64/sve_ext_2.c: Likewise.
> 	* gcc.target/aarch64/sve_extract_1.c: Likewise.
> 	* gcc.target/aarch64/sve_extract_2.c: Likewise.
> 	* gcc.target/aarch64/sve_extract_3.c: Likewise.
> 	* gcc.target/aarch64/sve_extract_4.c: Likewise.
> 	* gcc.target/aarch64/sve_fabs_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fcvtz_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fcvtz_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_fcvtz_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fcvtz_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_fdiv_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fdup_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fdup_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_fmad_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fmla_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fmls_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fmsb_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fmul_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fneg_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fnmad_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fnmla_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fnmls_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fnmsb_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fp_arith_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frinta_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frinti_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frintm_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frintp_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frintx_1.c: Likewise.
> 	* gcc.target/aarch64/sve_frintz_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fsqrt_1.c: Likewise.
> 	* gcc.target/aarch64/sve_fsubr_1.c: Likewise.
> 	* gcc.target/aarch64/sve_index_1.c: Likewise.
> 	* gcc.target/aarch64/sve_index_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_ld1r_1.c: Likewise.
> 	* gcc.target/aarch64/sve_load_const_offset_1.c: Likewise.
> 	* gcc.target/aarch64/sve_load_scalar_offset_1.c: Likewise.
> 	* gcc.target/aarch64/sve_logical_1.c: Likewise.
> 	* gcc.target/aarch64/sve_loop_add_1.c: Likewise.
> 	* gcc.target/aarch64/sve_loop_add_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_mad_1.c: Likewise.
> 	* gcc.target/aarch64/sve_maxmin_1.c: Likewise.
> 	* gcc.target/aarch64/sve_maxmin_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_maxmin_strict_1.c: Likewise.
> 	* gcc.target/aarch64/sve_maxmin_strict_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_mla_1.c: Likewise.
> 	* gcc.target/aarch64/sve_mls_1.c: Likewise.
> 	* gcc.target/aarch64/sve_mov_rr_1.c: Likewise.
> 	* gcc.target/aarch64/sve_msb_1.c: Likewise.
> 	* gcc.target/aarch64/sve_mul_1.c: Likewise.
> 	* gcc.target/aarch64/sve_neg_1.c: Likewise.
> 	* gcc.target/aarch64/sve_nlogical_1.c: Likewise.
> 	* gcc.target/aarch64/sve_nlogical_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_1.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_fcvt_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_fcvt_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_fcvt_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_fcvt_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_float_1.c: Likewise.
> 	* gcc.target/aarch64/sve_pack_float_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_popcount_1.c: Likewise.
> 	* gcc.target/aarch64/sve_popcount_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_1.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_2.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_2_run.c: Likewise.
> 	* gcc.target/aarch64/sve_reduc_3.c: Likewise.
> 	* gcc.target/aarch64/sve_revb_1.c: Likewise.
> 	* gcc.target/aarch64/sve_revh_1.c: Likewise.
> 	* gcc.target/aarch64/sve_revw_1.c: Likewise.
> 	* gcc.target/aarch64/sve_shift_1.c: Likewise.
> 	* gcc.target/aarch64/sve_single_1.c: Likewise.
> 	* gcc.target/aarch64/sve_single_2.c: Likewise.
> 	* gcc.target/aarch64/sve_single_3.c: Likewise.
> 	* gcc.target/aarch64/sve_single_4.c: Likewise.
> 	* gcc.target/aarch64/sve_store_scalar_offset_1.c: Likewise.
> 	* gcc.target/aarch64/sve_subr_1.c: Likewise.
> 	* gcc.target/aarch64/sve_trn1_1.c: Likewise.
> 	* gcc.target/aarch64/sve_trn2_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_fcvt_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_fcvt_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_fcvt_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_fcvt_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_float_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_float_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_signed_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_signed_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_unsigned_1.c: Likewise.
> 	* gcc.target/aarch64/sve_unpack_unsigned_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_uzp1_1.c: Likewise.
> 	* gcc.target/aarch64/sve_uzp1_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_uzp2_1.c: Likewise.
> 	* gcc.target/aarch64/sve_uzp2_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_1.C: Likewise.
> 	* gcc.target/aarch64/sve_vcond_1_run.C: Likewise.
> 	* gcc.target/aarch64/sve_vcond_2.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_2_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_3.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_4.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_4_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_5.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_5_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_6.c: Likewise.
> 	* gcc.target/aarch64/sve_vcond_6_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_init_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_init_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_init_2.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_1_overrange_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_1_overrun.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_single_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_const_single_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_single_1.c: Likewise.
> 	* gcc.target/aarch64/sve_vec_perm_single_1_run.c: Likewise.
> 	* gcc.target/aarch64/sve_zip1_1.c: Likewise.
> 	* gcc.target/aarch64/sve_zip2_1.c: Likewise.
>
James Greenhalgh Jan. 6, 2018, 6:09 p.m. UTC | #6
On Fri, Nov 24, 2017 at 03:59:58PM +0000, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@linaro.org> writes:
> > This series adds support for ARM's Scalable Vector Extension.
> > More details on SVE can be found here:
> >
> >   https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a
> >
> > There are four parts for ease of review, but it probably makes
> > sense to commit them as one patch.
> >
> > The series plugs SVE into the current vectorisation framework without
> > adding any new features to the framework itself.  This means for example
> > that vector loops still handle full vectors, with a scalar epilogue loop
> > being needed for the rest.  Later patches add support for other features
> > like fully-predicated loops.
> >
> > The patches build on top of the various series that I've already posted.
> > Sorry that there were so many, and thanks again for all the reviews.
> >
> > Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
> > (in the default vector-length agnostic mode).  Also tested with
> > -msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
> > and 512-bit SVE registers.
> 
> Here's an update based on an off-list discussion with the maintainers.
> Changes since v1:
> 
> - Changed the names of the modes from 256-bit vectors to "VNx"
>   + a 128-bit mode name, e.g. V32QI -> VNx16QI.
> 
> - Added an "sve" attribute and used it in the "enabled" attribute.
>   This allows generic aarch64.md patterns to disable things related
>   to SVE on non-SVE targets; previously this was implicit through the
>   constraints.
> 
> - Improved the consistency of the constraint names, specifically:
> 
>   Ua?: addition constraints (already used for Uaa)
>   Us?: general scalar constraints (already used for various other scalars)
>   Ut?: memory constraints (unchanged from v1)
>   vs?: vector SVE constraints (mostly unchanged, but now includes FP
>        as well as integer constraints)
> 
>   There's still the general "Dm" (minus one) constraint, for consistency
>   with "Dz" (zero).
> 
> - Added missing register descriptions above FIXED_REGISTERS.
> 
> - "should"/"is expected to" -> "must".
> 
> - Added more commentary to things like regmode_natural_size.
> 
> I also did a before and after comparison of the testsuite output
> for base AArch64 (but using the new FIRST_PSEUDO_REGISTER definition
> to avoid changes to hash values).  There were no differences.

I seem to have lost 4/4 in my mailer. Would you mind pinging it if I have
any action to take? Also, please ping any other SVE parts I've missed that
you haven't pinged in recent days.

I'll get to 1/4 in good time, but at 5000+ lines, it will need at least
another day! I'd like to OK everything around it that is outstanding, then
build up the courage for the big patch!

Thanks,
James
Richard Sandiford Jan. 6, 2018, 7:13 p.m. UTC | #7
James Greenhalgh <james.greenhalgh@arm.com> writes:
> On Fri, Nov 03, 2017 at 05:50:54PM +0000, Richard Sandiford wrote:
>> This patch adds gcc.target/aarch64 tests for SVE, and forces some
>> existing Advanced SIMD tests to use -march=armv8-a.
>
> I'm going to assume that these new testcases are broadly sensible, and not
> spend any significant time looking at them.
>
> I'm not completely happy forcing the architecture to Armv8-a - it would be
> useful for our testing coverage if users who have configured with other
> architecture variants had these tests execute in those environments. That
> way we'd check we still do the right thing once we have an implicit
> -march=armv8.2-a.
>
> However, as we don't have a good way to make that happen (other than maybe
> only forcing the arch if we are in a configuration wired for SVE?), I'm
> happy with this patch as a compromise for now.

Would something like LLVM's -mattr be useful?  Then we could have
-mattr=+nosve without having to change the base architecture.

I suppose we'd need to be careful about how it interacts with -march
though, so it probably isn't GCC 8 material.  I'll try only forcing
the arch when we're compiling for SVE, like you say.

Not strictly related, but do you think it's OK to require binutils 2.28+
when testing GCC (rather than simply building it)?  When trying with an
older OS the other day, I realised that the SVE dg-do assemble tests
would fail for 2.27 and earlier.  We'd need something like:

  /* { dg-do assemble { aarch64_sve_asm } } */

if we wanted to support older binutils.

Thanks,
Richard
Richard Sandiford Jan. 6, 2018, 7:39 p.m. UTC | #8
James Greenhalgh <james.greenhalgh@arm.com> writes:
> On Fri, Nov 24, 2017 at 03:59:58PM +0000, Richard Sandiford wrote:
>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> > This series adds support for ARM's Scalable Vector Extension.
>> > More details on SVE can be found here:
>> >
>> >   https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a
>> >
>> > There are four parts for ease of review, but it probably makes
>> > sense to commit them as one patch.
>> >
>> > The series plugs SVE into the current vectorisation framework without
>> > adding any new features to the framework itself.  This means for example
>> > that vector loops still handle full vectors, with a scalar epilogue loop
>> > being needed for the rest.  Later patches add support for other features
>> > like fully-predicated loops.
>> >
>> > The patches build on top of the various series that I've already posted.
>> > Sorry that there were so many, and thanks again for all the reviews.
>> >
>> > Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
>> > (in the default vector-length agnostic mode).  Also tested with
>> > -msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
>> > and 512-bit SVE registers.
>> 
>> Here's an update based on an off-list discussion with the maintainers.
>> Changes since v1:
>> 
>> - Changed the names of the modes from 256-bit vectors to "VNx"
>>   + a 128-bit mode name, e.g. V32QI -> VNx16QI.
>> 
>> - Added an "sve" attribute and used it in the "enabled" attribute.
>>   This allows generic aarch64.md patterns to disable things related
>>   to SVE on non-SVE targets; previously this was implicit through the
>>   constraints.
>> 
>> - Improved the consistency of the constraint names, specifically:
>> 
>>   Ua?: addition constraints (already used for Uaa)
>>   Us?: general scalar constraints (already used for various other scalars)
>>   Ut?: memory constraints (unchanged from v1)
>>   vs?: vector SVE constraints (mostly unchanged, but now includes FP
>>        as well as integer constraints)
>> 
>>   There's still the general "Dm" (minus one) constraint, for consistency
>>   with "Dz" (zero).
>> 
>> - Added missing register descriptions above FIXED_REGISTERS.
>> 
>> - "should"/"is expected to" -> "must".
>> 
>> - Added more commentary to things like regmode_natural_size.
>> 
>> I also did a before and after comparison of the testsuite output
>> for base AArch64 (but using the new FIRST_PSEUDO_REGISTER definition
>> to avoid changes to hash values).  There were no differences.
>
> I seem to have lost 4/4 in my mailer. Would you mind pinging it if I have
> any action to take? Also, please ping any other SVE parts I've missed that
> you haven't pinged in recent days.

4/4 was the unwinder support, which you've already reviewed (thanks):

  https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00251.html

There are two other AArch64 patches that I'll ping in a sec.

There are also quite a few patches that add target-independent
support for something and also add corresponding SVE code
to config/aarch64 and/or code quality tests to gcc.target/aarch64.
I think the full list of those is:

  Patches with config/aarch64 code:

    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02066.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02068.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01484.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01485.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01491.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01494.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01497.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01506.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01570.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01575.html

  Patches with gcc.target/aarch64 tests but no config/aarch64 changes,
  with the tests being in the spirit of the ones added in the original
  SVE patch:

    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00752.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01446.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01489.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01490.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01498.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01499.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01572.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01573.html
    https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01577.html

The target-independent pieces have already been reviewed (except where
I'll ping separately).

Thanks,
Richard
James Greenhalgh Jan. 7, 2018, 5:10 p.m. UTC | #9
On Sat, Jan 06, 2018 at 07:13:22PM +0000, Richard Sandiford wrote:
> James Greenhalgh <james.greenhalgh@arm.com> writes:
> > On Fri, Nov 03, 2017 at 05:50:54PM +0000, Richard Sandiford wrote:
> >> This patch adds gcc.target/aarch64 tests for SVE, and forces some
> >> existing Advanced SIMD tests to use -march=armv8-a.
> >
> > I'm going to assume that these new testcases are broadly sensible, and not
> > spend any significant time looking at them.
> >
> > I'm not completely happy forcing the architecture to Armv8-a - it would be
> > useful for our testing coverage if users who have configured with other
> > architecture variants had these tests execute in those environments. That
> > way we'd check we still do the right thing once we have an implicit
> > -march=armv8.2-a.
> >
> > However, as we don't have a good way to make that happen (other than maybe
> > only forcing the arch if we are in a configuration wired for SVE?), I'm
> > happy with this patch as a compromise for now.
> 
> Would something like LLVM's -mattr be useful?  Then we could have
> -mattr=+nosve without having to change the base architecture.
> 
> I suppose we'd need to be careful about how it interacts with -march
> though, so it probably isn't GCC 8 material.  I'll try only forcing
> the arch when we're compiling for SVE, like you say.

(Sorry if you took a duplicate of this - I mistakenly sent with a disclaimer)

We also could do this with Target pragmas:

  #pragma GCC target ("+nosve")

Should work here I think.

> Not strictly related, but do you think it's OK to require binutils 2.28+
> when testing GCC (rather than simply building it)?  When trying with an
> older OS the other day, I realised that the SVE dg-do assemble tests
> would fail for 2.27 and earlier.  We'd need something like:
> 
>   /* { dg-do assemble { aarch64_sve_asm } } */
> 
> if we wanted to support older binutils.

Personally I think this is OK. We have the same problem with other
new instructions we add and want assemble tests for.

Thanks,
James
James Greenhalgh Jan. 7, 2018, 9:09 p.m. UTC | #10
(Resending; this bounced)

On Sat, Jan 06, 2018 at 07:39:46PM +0000, Richard Sandiford wrote:
> James Greenhalgh <james.greenhalgh@arm.com> writes:
> > On Fri, Nov 24, 2017 at 03:59:58PM +0000, Richard Sandiford wrote:
> >> Richard Sandiford <richard.sandiford@linaro.org> writes:
> >> > This series adds support for ARM's Scalable Vector Extension.
> >> > More details on SVE can be found here:
> >> >
> >> >   https://developer.arm.com/products/architecture/a-profile/docs/arm-architecture-reference-manual-supplement-armv8-a
> >> >
> >> > There are four parts for ease of review, but it probably makes
> >> > sense to commit them as one patch.
> >> >
> >> > The series plugs SVE into the current vectorisation framework without
> >> > adding any new features to the framework itself.  This means for example
> >> > that vector loops still handle full vectors, with a scalar epilogue loop
> >> > being needed for the rest.  Later patches add support for other features
> >> > like fully-predicated loops.
> >> >
> >> > The patches build on top of the various series that I've already posted.
> >> > Sorry that there were so many, and thanks again for all the reviews.
> >> >
> >> > Tested on aarch64-linux-gnu without SVE and aarch64-linux-gnu with SVE
> >> > (in the default vector-length agnostic mode).  Also tested with
> >> > -msve-vector-bits=256 and -msve-vector-bits=512 to select 256-bit
> >> > and 512-bit SVE registers.
> >> 
> >> Here's an update based on an off-list discussion with the maintainers.
> >> Changes since v1:
> >> 
> >> - Changed the names of the modes from 256-bit vectors to "VNx"
> >>   + a 128-bit mode name, e.g. V32QI -> VNx16QI.
> >> 
> >> - Added an "sve" attribute and used it in the "enabled" attribute.
> >>   This allows generic aarch64.md patterns to disable things related
> >>   to SVE on non-SVE targets; previously this was implicit through the
> >>   constraints.
> >> 
> >> - Improved the consistency of the constraint names, specifically:
> >> 
> >>   Ua?: addition constraints (already used for Uaa)
> >>   Us?: general scalar constraints (already used for various other scalars)
> >>   Ut?: memory constraints (unchanged from v1)
> >>   vs?: vector SVE constraints (mostly unchanged, but now includes FP
> >>        as well as integer constraints)
> >> 
> >>   There's still the general "Dm" (minus one) constraint, for consistency
> >>   with "Dz" (zero).
> >> 
> >> - Added missing register descriptions above FIXED_REGISTERS.
> >> 
> >> - "should"/"is expected to" -> "must".
> >> 
> >> - Added more commentary to things like regmode_natural_size.
> >> 
> >> I also did a before and after comparison of the testsuite output
> >> for base AArch64 (but using the new FIRST_PSEUDO_REGISTER definition
> >> to avoid changes to hash values).  There were no differences.
> >
> > I seem to have lost 4/4 in my mailer. Would you mind pinging it if I have
> > any action to take? Also, please ping any other SVE parts I've missed that
> > you haven't pinged in recent days.
> 
> 4/4 was the unwinder support, which you've already reviewed (thanks):
> 
>   https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00251.html
> 
> There are two other AArch64 patches that I'll ping in a sec.

These, and 1/4, will now have to wait for later this week I'm afraid; I
hope I'll have a chance by midweek. I'll also see how far I can get with
the Armv8.4-A review this week to avoid further sequencing issues with who
goes first. Thanks for the preemptive rebase.

> There are also quite a few patches that add target-independent
> support for something and also add corresponding SVE code
> to config/aarch64 and/or code quality tests to gcc.target/aarch64.
> I think the full list of those is:
> 
>   Patches with config/aarch64 code:
> 
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02066.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02068.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01484.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01485.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01491.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01494.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01497.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01506.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01570.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01575.html

I think I've put an OK on most of these. The review was overall
straightforward - sorry for missing the action left on me earlier.

>   Patches with gcc.target/aarch64 tests but no config/aarch64 changes,
>   with the tests being in the spirit of the ones added in the original
>   SVE patch:
> 
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg00752.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01446.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01489.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01490.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01498.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01499.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01572.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01573.html
>     https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01577.html
> 
> The target-independent pieces have already been reviewed (except where
> I'll ping separately).

And these are also OK, with two comments on the overall strategy: as I've
mentioned elsewhere, I find the -march=armv8-a+sve to be too restrictive for
our testing efforts. I'd prefer to just add SVE by a target pragma if we can.
Additionally, I'd be happy with the whole AArch64 testsuite being organised
into folders (doing this might also make it easier for us to turn all SVE
tests off with older assemblers, by skipping the .exp file if we
can't find support).

For now, I've OK'd the tests. They would still be OK in my mind with either
or both of my suggestions above.

Thanks,
James
James Greenhalgh Jan. 10, 2018, 7:16 p.m. UTC | #11
On Fri, Jan 05, 2018 at 11:41:25AM +0000, Richard Sandiford wrote:
> Here's the patch updated to apply on top of the v8.4 and
> __builtin_load_no_speculate support.  It also handles the new
> vec_perm_indices and CONST_VECTOR encoding and uses VNx... names
> for the SVE modes.
> 
> Richard Sandiford <richard.sandiford@linaro.org> writes:
> > This patch adds support for ARM's Scalable Vector Extension.
> > The patch just contains the core features that work with the
> > current vectoriser framework; later patches will add extra
> > capabilities to both the target-independent code and AArch64 code.
> > The patch doesn't include:
> >
> > - support for unwinding frames whose size depends on the vector length
> > - modelling the effect of __tls_get_addr on the SVE registers
> >
> > These are handled by later patches instead.
> >
> > Some notes:
> >
> > - The copyright years for aarch64-sve.md start at 2009 because some of
> >   the code is based on aarch64.md, which also starts from then.
> >
> > - The patch inserts spaces between items in the AArch64 section
> >   of sourcebuild.texi.  This matches at least the surrounding
> >   architectures and looks a little nicer in the info output.
> >
> > - aarch64-sve.md includes a pattern:
> >
> >     while_ult<GPI:mode><PRED_ALL:mode>
> >
> >   A later patch adds a matching "while_ult" optab, but the pattern
> >   is also needed by the predicate vec_duplicate expander.

I'm keen to take this. The code is good quality overall, and I'm confident in
your reputation and implementation. There are some parts of the design that
I'm less happy about, but pragmatically, we should take this now to get the
behaviour correct, and look to optimise, refactor, and clean up in future.

Sorry it took me a long time to get to the review. I've got no meaningful
design concerns here, and certainly nothing so critical that we couldn't
fix it after the fact in GCC 9 and up.

That said...

> 	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)

I'm not a big fan of these sorts of functions, which return a char* where
we've dumped the text we want to print out in the short term. The interface
(fill a static char[] we can then leak on return) is pretty ugly.

One consideration for future work would be refactoring out aarch64.c - it is
getting to be too big for my liking (near 18,000 lines).

> 	(aarch64_expand_sve_mem_move)

Do we have a good description of how SVE big-endian vectors work, <snip more
comments - I found the detailed comment at the top of aarch64-sve.md> 

The sort of comment you write later ("see the comment at the head of
aarch64-sve.md for details") would also be useful here as a reference.

> aarch64_get_reg_raw_mode

Do we assert/warn anywhere for users of __builtin_apply that they are
fundamentally unsupported?

> offset_4bit_signed_scaled_p

So much code duplication here and in similar functions. Would a single
interface (unsigned bits, bool signed, bool scaled) let you avoid the many
identical functions?

> aarch64_evpc_rev_local

I'm likely missing something obvious, but what is the distinction you're
drawing between global and local? Could you comment it?

> aarch64-sve.md - scheduling types

None of the instructions here have types for scheduling. That's going to
make for a future headache. Adding them to the existing scheduling types
is going to cause all sorts of trouble when building GCC (we already have
too many types for some compilers to handle the structure!). We'll need
to find a solution to how we'll direct scheduling for SVE.

> aarch64-sve.md - predicated operands

It is a shame this ends up being so ugly and requiring UNSPEC_MERGE_PTRUE
everywhere. That will block a lot of useful optimisation.

Otherwise, this is OK for trunk. I'm happy to take it as is, and have the
above suggestions applied as follow-ups if you think they are worth doing.

Thanks,
James
Richard Sandiford Jan. 10, 2018, 7:54 p.m. UTC | #12
Thanks for the review!

James Greenhalgh <james.greenhalgh@arm.com> writes:
> On Fri, Jan 05, 2018 at 11:41:25AM +0000, Richard Sandiford wrote:
>> Here's the patch updated to apply on top of the v8.4 and
>> __builtin_load_no_speculate support.  It also handles the new
>> vec_perm_indices and CONST_VECTOR encoding and uses VNx... names
>> for the SVE modes.
>> 
>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> > This patch adds support for ARM's Scalable Vector Extension.
>> > The patch just contains the core features that work with the
>> > current vectoriser framework; later patches will add extra
>> > capabilities to both the target-independent code and AArch64 code.
>> > The patch doesn't include:
>> >
>> > - support for unwinding frames whose size depends on the vector length
>> > - modelling the effect of __tls_get_addr on the SVE registers
>> >
>> > These are handled by later patches instead.
>> >
>> > Some notes:
>> >
>> > - The copyright years for aarch64-sve.md start at 2009 because some of
>> >   the code is based on aarch64.md, which also starts from then.
>> >
>> > - The patch inserts spaces between items in the AArch64 section
>> >   of sourcebuild.texi.  This matches at least the surrounding
>> >   architectures and looks a little nicer in the info output.
>> >
>> > - aarch64-sve.md includes a pattern:
>> >
>> >     while_ult<GPI:mode><PRED_ALL:mode>
>> >
>> >   A later patch adds a matching "while_ult" optab, but the pattern
>> >   is also needed by the predicate vec_duplicate expander.
>
> I'm keen to take this. The code is good quality overall, and I'm confident in
> your reputation and implementation. There are some parts of the design that
> I'm less happy about, but pragmatically, we should take this now to get the
> behaviour correct, and look to optimise, refactor, and clean up in future.
>
> Sorry it took me a long time to get to the review. I've got no meaningful
> design concerns here, and certainly nothing so critical that we couldn't
> fix it after the fact in GCC 9 and up.
>
> That said...
>
>> 	(aarch64_sve_cnt_immediate_p, aarch64_sve_addvl_addpl_immediate_p)
>
> I'm not a big fan of these sorts of functions, which return a char* where
> we've dumped the text we want to print out in the short term. The interface
> (fill a static char[] we can then leak on return) is pretty ugly.

Yeah, it's not pretty, but I think the various possible ways of doing
the addition do justify using output functions here.  The distinction
between INC[BHWD], DEC[BHWD], ADDVL and ADDPL doesn't really affect
anything other than the final output, so it isn't something that
should be exposed as different constraints (for example).

We should probably "just" have a nicer interface for target code
to construct instruction format strings.
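
For readers following along, here is a minimal sketch of the output-function
pattern under discussion; the name, mnemonic and buffer size are illustrative
assumptions, not the actual aarch64.c code:

  #include <stdio.h>

  /* Return the asm template for an INCB-style increment, built in a
     static buffer; the result is only valid until the next call.  */
  static const char *
  output_sve_inc (int factor)
  {
    static char buffer[64];
    snprintf (buffer, sizeof buffer, "incb\t%%x0, all, mul #%d", factor);
    return buffer;
  }

The wart James points at is that the caller must consume the returned string
before the next call - fine for final instruction output, but an easy
interface to misuse elsewhere.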

> One consideration for future work would be refactoring out aarch64.c - it is
> getting to be too big for my liking (near 18,000 lines).
>
>> 	(aarch64_expand_sve_mem_move)
>
> Do we have a good description of how SVE big-endian vectors work, <snip more
> comments - I found the detailed comment at the top of aarch64-sve.md>
>
> The sort of comment you write later ("see the comment at the head of
> aarch64-sve.md for details") would also be useful here as a reference.

Ah, yeah, will add a reference there too.

>> aarch64_get_reg_raw_mode
>
> Do we assert/warn anywhere for users of __builtin_apply that they are
> fundamentally unsupported?

Not as far as I know.  FWIW, this doesn't affect SVE (yet), because we
don't yet support any types that would be passed in the SVE-specific
part of the registers.

>> offset_4bit_signed_scaled_p
>
> So much code duplication here and in similar functions. Would a single
> interface (unsigned bits, bool signed, bool scaled) let you avoid the many
> identical functions?

We just kept to the existing style here.  I agree it might be a good idea
to consolidate them, but personally I'd prefer to keep the signed/scaled
distinction in the function name, since it's more readable than booleans
and shorter than a new enum.
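
A minimal sketch of the single interface James suggests, using plain C types
rather than GCC's internal poly_int64 and hypothetical names:

  #include <stdbool.h>
  #include <stdint.h>

  /* Return true if OFFSET is a multiple of SCALE and the scaled value
     fits in a BITS-bit immediate, signed if SIGN is true.  */
  static bool
  offset_nbit_p (int64_t offset, unsigned int bits, bool sign, int64_t scale)
  {
    if (offset % scale != 0)
      return false;
    int64_t unit = offset / scale;
    int64_t limit = (int64_t) 1 << (sign ? bits - 1 : bits);
    return unit < limit && unit >= (sign ? -limit : 0);
  }

offset_4bit_signed_scaled_p (mode, offset) would then be roughly
offset_nbit_p (offset, 4, true, element_size), which also shows why the
spelled-out names read better than bare booleans at call sites.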

>> aarch64_evpc_rev_local
>
> I'm likely missing something obvious, but what is the distinction you're
> drawing between global and local? Could you comment it?

"global" reverses the whole vector: the first and last elements switch
places.  "local" reverses within groups of N consecutive elements but
not between them.
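
To make that concrete, here is an illustration with hypothetical 8-element
selector vectors (my sketch, not code from the patch):

  /* "global" reversal: element i takes the value of element n-1-i.  */
  static const int rev_global[8] = { 7, 6, 5, 4, 3, 2, 1, 0 };

  /* "local" reversal with groups of N = 2: each pair is reversed,
     but the pairs themselves stay in place.  */
  static const int rev_local[8] = { 1, 0, 3, 2, 5, 4, 7, 6 };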

But yet again names are probably my downfall here. :-)  I'm happy to
call them something else instead.  Either way I'll expand the comments.

>> aarch64-sve.md - scheduling types
>
> None of the instructions here have types for scheduling. That's going to
> make for a future headache. Adding them to the existing scheduling types
> is going to cause all sorts of trouble when building GCC (we already have
> too many types for some compilers to handle the structure!). We'll need
> to find a solution to how we'll direct scheduling for SVE.

Yeah.  I didn't want to add scheduling attributes now without scheduling
descriptions to go with them, since there's no way of knowing what the
division should be.

>> aarch64-sve.md - predicated operands
>
> It is a shame this ends up being so ugly and requiring UNSPEC_MERGE_PTRUE
> everywhere. That will block a lot of useful optimisation.

I don't think it blocks many in practice (at least, not the kind that
really do belong in RTL rather than gimple).  Most instructions map
directly to an optab and those that don't do combine OK in the
UNSPEC_MERGE_PTRUE form (e.g. AND + NOT -> BIC).
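
As a source-level illustration (my example, not one from the patch), this is
the kind of loop where the vectoriser emits an AND of a NOT that combine can
fold into BIC:

  void
  f (unsigned int *restrict a, unsigned int *restrict b, int n)
  {
    for (int i = 0; i < n; ++i)
      a[i] = b[i] & ~a[i];
  }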

> Otherwise, this is OK for trunk. I'm happy to take it as is, and have the
> above suggestions applied as follow-ups if you think they are worth doing.

Thanks.  If we can reach quick agreement about the offset checks then
I'll roll in that change.

Richard
Richard Sandiford Jan. 12, 2018, 3:30 p.m. UTC | #13
James Greenhalgh <james.greenhalgh@arm.com> writes:
> On Sat, Jan 06, 2018 at 07:13:22PM +0000, Richard Sandiford wrote:
>> James Greenhalgh <james.greenhalgh@arm.com> writes:
>> > On Fri, Nov 03, 2017 at 05:50:54PM +0000, Richard Sandiford wrote:
>> >> This patch adds gcc.target/aarch64 tests for SVE, and forces some
>> >> existing Advanced SIMD tests to use -march=armv8-a.
>> >
>> > I'm going to assume that these new testcases are broadly sensible, and not
>> > spend any significant time looking at them.
>> >
>> > I'm not completely happy forcing the architecture to Armv8-a - it would be
>> > useful for our testing coverage if users who have configured with other
>> > architecture variants had these tests execute in those environments. That
>> > way we'd check we still do the right thing once we have an implicit
>> > -march=armv8.2-a.
>> >
>> > However, as we don't have a good way to make that happen (other than maybe
>> > only forcing the arch if we are in a configuration wired for SVE?), I'm
>> > happy with this patch as a compromise for now.
>> 
>> Would something like LLVM's -mattr be useful?  Then we could have
>> -mattr=+nosve without having to change the base architecture.
>> 
>> I suppose we'd need to be careful about how it interacts with -march
>> though, so it probably isn't GCC 8 material.  I'll try only forcing
>> the arch when we're compiling for SVE, like you say.
>
> (Sorry if you took a duplicate of this - I mistakenly sent with a disclaimer)
>
> We also could do this with Target pragmas:
>
>   #pragma GCC target ("+nosve")
>
> Should work here I think.

Yeah, it does, thanks.  I switched to that instead of forcing -march=.

>> Not strictly related, but do you think it's OK to require binutils 2.28+
>> when testing GCC (rather than simply building it)?  When trying with an
>> older OS the other day, I realised that the SVE dg-do assemble tests
>> would fail for 2.27 and earlier.  We'd need something like:
>> 
>>   /* { dg-do assemble { aarch64_sve_asm } } */
>> 
>> if we wanted to support older binutils.
>
> Personally I think this is OK. We have the same problem with other
> new instructions we add and want assemble tests for.

I later saw that we have aarch64_asm_<foo>_ok, so I added "sve" to the
list and used it to protect dg-do assemble tests.
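
In dg terms that presumably looks something like:

  /* { dg-do assemble { target aarch64_asm_sve_ok } } */

on each test that needs to assemble SVE instructions.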

In another message you said:

> Additionally, I'd be happy with the whole AArch64 testsuite being organised
> into folders (doing this might also make it easier for us to turn all SVE
> tests off with older assemblers, by skipping the .exp file if we
> can't find support).

I agree organising it into directories is nicer, and it means that we
can add the -march= option in the harness rather than each individual
dg-options line.  That in turn makes it easy to avoid overriding -march=
if the setting that the tester chose (or the toolchain's default setting)
already includes SVE.

Here's what I plan to commit (without reposting the new tests).

Thanks,
Richard


2018-01-12  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_aarch64_asm_sve_ok):
	New proc.
	* gcc.target/aarch64/bic_imm_1.c: Use #pragma GCC target "+nosve".
	* gcc.target/aarch64/fmaxmin.c: Likewise.
	* gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
	* gcc.target/aarch64/orr_imm_1.c: Likewise.
	* gcc.target/aarch64/pr62178.c: Likewise.
	* gcc.target/aarch64/pr71727-2.c: Likewise.
	* gcc.target/aarch64/saddw-1.c: Likewise.
	* gcc.target/aarch64/saddw-2.c: Likewise.
	* gcc.target/aarch64/uaddw-1.c: Likewise.
	* gcc.target/aarch64/uaddw-2.c: Likewise.
	* gcc.target/aarch64/uaddw-3.c: Likewise.
	* gcc.target/aarch64/vect-add-sub-cond.c: Likewise.
	* gcc.target/aarch64/vect-compile.c: Likewise.
	* gcc.target/aarch64/vect-faddv-compile.c: Likewise.
	* gcc.target/aarch64/vect-fcm-eq-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-eq-f.c: Likewise.
	* gcc.target/aarch64/vect-fcm-ge-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-ge-f.c: Likewise.
	* gcc.target/aarch64/vect-fcm-gt-d.c: Likewise.
	* gcc.target/aarch64/vect-fcm-gt-f.c: Likewise.
	* gcc.target/aarch64/vect-fmax-fmin-compile.c: Likewise.
	* gcc.target/aarch64/vect-fmaxv-fminv-compile.c: Likewise.
	* gcc.target/aarch64/vect-fmovd-zero.c: Likewise.
	* gcc.target/aarch64/vect-fmovd.c: Likewise.
	* gcc.target/aarch64/vect-fmovf-zero.c: Likewise.
	* gcc.target/aarch64/vect-fmovf.c: Likewise.
	* gcc.target/aarch64/vect-fp-compile.c: Likewise.
	* gcc.target/aarch64/vect-ld1r-compile-fp.c: Likewise.
	* gcc.target/aarch64/vect-ld1r-compile.c: Likewise.
	* gcc.target/aarch64/vect-movi.c: Likewise.
	* gcc.target/aarch64/vect-mull-compile.c: Likewise.
	* gcc.target/aarch64/vect-reduc-or_1.c: Likewise.
	* gcc.target/aarch64/vect-vaddv.c: Likewise.
	* gcc.target/aarch64/vect_saddl_1.c: Likewise.
	* gcc.target/aarch64/vect_smlal_1.c: Likewise.
	* gcc.target/aarch64/vector_initialization_nostack.c: XFAIL for
	fixed-length SVE.
	* gcc.target/aarch64/sve/aarch64-sve.exp: New file.
	* gcc.target/aarch64/sve/arith_1.c: New test.
	* gcc.target/aarch64/sve/const_pred_1.C: Likewise.
	* gcc.target/aarch64/sve/const_pred_2.C: Likewise.
	* gcc.target/aarch64/sve/const_pred_3.C: Likewise.
	* gcc.target/aarch64/sve/const_pred_4.C: Likewise.
	* gcc.target/aarch64/sve/cvtf_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/cvtf_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/cvtf_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/cvtf_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/dup_imm_1.c: Likewise.
	* gcc.target/aarch64/sve/dup_imm_1_run.c: Likewise.
	* gcc.target/aarch64/sve/dup_lane_1.c: Likewise.
	* gcc.target/aarch64/sve/ext_1.c: Likewise.
	* gcc.target/aarch64/sve/ext_2.c: Likewise.
	* gcc.target/aarch64/sve/extract_1.c: Likewise.
	* gcc.target/aarch64/sve/extract_2.c: Likewise.
	* gcc.target/aarch64/sve/extract_3.c: Likewise.
	* gcc.target/aarch64/sve/extract_4.c: Likewise.
	* gcc.target/aarch64/sve/fabs_1.c: Likewise.
	* gcc.target/aarch64/sve/fcvtz_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/fcvtz_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/fcvtz_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/fcvtz_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/fdiv_1.c: Likewise.
	* gcc.target/aarch64/sve/fdup_1.c: Likewise.
	* gcc.target/aarch64/sve/fdup_1_run.c: Likewise.
	* gcc.target/aarch64/sve/fmad_1.c: Likewise.
	* gcc.target/aarch64/sve/fmla_1.c: Likewise.
	* gcc.target/aarch64/sve/fmls_1.c: Likewise.
	* gcc.target/aarch64/sve/fmsb_1.c: Likewise.
	* gcc.target/aarch64/sve/fmul_1.c: Likewise.
	* gcc.target/aarch64/sve/fneg_1.c: Likewise.
	* gcc.target/aarch64/sve/fnmad_1.c: Likewise.
	* gcc.target/aarch64/sve/fnmla_1.c: Likewise.
	* gcc.target/aarch64/sve/fnmls_1.c: Likewise.
	* gcc.target/aarch64/sve/fnmsb_1.c: Likewise.
	* gcc.target/aarch64/sve/fp_arith_1.c: Likewise.
	* gcc.target/aarch64/sve/frinta_1.c: Likewise.
	* gcc.target/aarch64/sve/frinti_1.c: Likewise.
	* gcc.target/aarch64/sve/frintm_1.c: Likewise.
	* gcc.target/aarch64/sve/frintp_1.c: Likewise.
	* gcc.target/aarch64/sve/frintx_1.c: Likewise.
	* gcc.target/aarch64/sve/frintz_1.c: Likewise.
	* gcc.target/aarch64/sve/fsqrt_1.c: Likewise.
	* gcc.target/aarch64/sve/fsubr_1.c: Likewise.
	* gcc.target/aarch64/sve/index_1.c: Likewise.
	* gcc.target/aarch64/sve/index_1_run.c: Likewise.
	* gcc.target/aarch64/sve/ld1r_1.c: Likewise.
	* gcc.target/aarch64/sve/load_const_offset_1.c: Likewise.
	* gcc.target/aarch64/sve/load_const_offset_2.c: Likewise.
	* gcc.target/aarch64/sve/load_const_offset_3.c: Likewise.
	* gcc.target/aarch64/sve/load_scalar_offset_1.c: Likewise.
	* gcc.target/aarch64/sve/logical_1.c: Likewise.
	* gcc.target/aarch64/sve/loop_add_1.c: Likewise.
	* gcc.target/aarch64/sve/loop_add_1_run.c: Likewise.
	* gcc.target/aarch64/sve/mad_1.c: Likewise.
	* gcc.target/aarch64/sve/maxmin_1.c: Likewise.
	* gcc.target/aarch64/sve/maxmin_1_run.c: Likewise.
	* gcc.target/aarch64/sve/maxmin_strict_1.c: Likewise.
	* gcc.target/aarch64/sve/maxmin_strict_1_run.c: Likewise.
	* gcc.target/aarch64/sve/mla_1.c: Likewise.
	* gcc.target/aarch64/sve/mls_1.c: Likewise.
	* gcc.target/aarch64/sve/mov_rr_1.c: Likewise.
	* gcc.target/aarch64/sve/msb_1.c: Likewise.
	* gcc.target/aarch64/sve/mul_1.c: Likewise.
	* gcc.target/aarch64/sve/neg_1.c: Likewise.
	* gcc.target/aarch64/sve/nlogical_1.c: Likewise.
	* gcc.target/aarch64/sve/nlogical_1_run.c: Likewise.
	* gcc.target/aarch64/sve/pack_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_1_run.c: Likewise.
	* gcc.target/aarch64/sve/pack_fcvt_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_fcvt_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/pack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_fcvt_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/pack_float_1.c: Likewise.
	* gcc.target/aarch64/sve/pack_float_1_run.c: Likewise.
	* gcc.target/aarch64/sve/popcount_1.c: Likewise.
	* gcc.target/aarch64/sve/popcount_1_run.c: Likewise.
	* gcc.target/aarch64/sve/reduc_1.c: Likewise.
	* gcc.target/aarch64/sve/reduc_1_run.c: Likewise.
	* gcc.target/aarch64/sve/reduc_2.c: Likewise.
	* gcc.target/aarch64/sve/reduc_2_run.c: Likewise.
	* gcc.target/aarch64/sve/reduc_3.c: Likewise.
	* gcc.target/aarch64/sve/rev_1.c: Likewise.
	* gcc.target/aarch64/sve/revb_1.c: Likewise.
	* gcc.target/aarch64/sve/revh_1.c: Likewise.
	* gcc.target/aarch64/sve/revw_1.c: Likewise.
	* gcc.target/aarch64/sve/shift_1.c: Likewise.
	* gcc.target/aarch64/sve/single_1.c: Likewise.
	* gcc.target/aarch64/sve/single_2.c: Likewise.
	* gcc.target/aarch64/sve/single_3.c: Likewise.
	* gcc.target/aarch64/sve/single_4.c: Likewise.
	* gcc.target/aarch64/sve/spill_1.c: Likewise.
	* gcc.target/aarch64/sve/store_scalar_offset_1.c: Likewise.
	* gcc.target/aarch64/sve/subr_1.c: Likewise.
	* gcc.target/aarch64/sve/trn1_1.c: Likewise.
	* gcc.target/aarch64/sve/trn2_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_fcvt_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_fcvt_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_float_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_float_1_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_signed_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_signed_1_run.c: Likewise.
	* gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise.
	* gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise.
	* gcc.target/aarch64/sve/uzp1_1.c: Likewise.
	* gcc.target/aarch64/sve/uzp1_1_run.c: Likewise.
	* gcc.target/aarch64/sve/uzp2_1.c: Likewise.
	* gcc.target/aarch64/sve/uzp2_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_1.C: Likewise.
	* gcc.target/aarch64/sve/vcond_1_run.C: Likewise.
	* gcc.target/aarch64/sve/vcond_2.c: Likewise.
	* gcc.target/aarch64/sve/vcond_2_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_3.c: Likewise.
	* gcc.target/aarch64/sve/vcond_4.c: Likewise.
	* gcc.target/aarch64/sve/vcond_4_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_5.c: Likewise.
	* gcc.target/aarch64/sve/vcond_5_run.c: Likewise.
	* gcc.target/aarch64/sve/vcond_6.c: Likewise.
	* gcc.target/aarch64/sve/vcond_6_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_init_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_init_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_init_2.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_1_overrange_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_1_overrun.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_single_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_const_single_1_run.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_single_1.c: Likewise.
	* gcc.target/aarch64/sve/vec_perm_single_1_run.c: Likewise.
	* gcc.target/aarch64/sve/zip1_1.c: Likewise.
	* gcc.target/aarch64/sve/zip2_1.c: Likewise.

Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/lib/target-supports.exp	2018-01-12 15:13:44.050826874 +0000
@@ -8590,7 +8590,7 @@ proc check_effective_target_aarch64_tiny
 # Create functions to check that the AArch64 assembler supports the
 # various architecture extensions via the .arch_extension pseudo-op.
 
-foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod"} {
+foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve"} {
     eval [string map [list FUNC $aarch64_ext] {
 	proc check_effective_target_aarch64_asm_FUNC_ok { } {
 	  if { [istarget aarch64*-*-*] } {
Index: gcc/testsuite/gcc.target/aarch64/bic_imm_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/bic_imm_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/bic_imm_1.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do assemble } */
 /* { dg-options "-O2 --save-temps -ftree-vectorize" } */
 
+#pragma GCC target "+nosve"
+
 /* Each function uses the correspoding 'CLASS' in
    Marco CHECK (aarch64_simd_valid_immediate).  */
 
Index: gcc/testsuite/gcc.target/aarch64/fmaxmin.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/fmaxmin.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/fmaxmin.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fno-inline -fno-vect-cost-model -save-temps" } */
 
+#pragma GCC target "+nosve"
 
 extern void abort (void);
 double fmax (double, double);
Index: gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-save-temps -O2 -ftree-vectorize -fno-inline -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 1024
 
 #define FUNC_DEF(__a)		\
Index: gcc/testsuite/gcc.target/aarch64/orr_imm_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/orr_imm_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/orr_imm_1.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do assemble } */
 /* { dg-options "-O2 --save-temps -ftree-vectorize" } */
 
+#pragma GCC target "+nosve"
+
 /* Each function uses the correspoding 'CLASS' in
    Marco CHECK (aarch64_simd_valid_immediate).  */
 
Index: gcc/testsuite/gcc.target/aarch64/pr62178.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/pr62178.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/pr62178.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int a[30 +1][30 +1], b[30 +1][30 +1], r[30 +1][30 +1];
 
 void foo (void) {
Index: gcc/testsuite/gcc.target/aarch64/pr71727-2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/pr71727-2.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/pr71727-2.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-mstrict-align -O3" } */
 
+#pragma GCC target "+nosve"
+
 unsigned char foo(const unsigned char *buffer, unsigned int length)
 {
   unsigned char sum;
Index: gcc/testsuite/gcc.target/aarch64/saddw-1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/saddw-1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/saddw-1.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, short * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/saddw-2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/saddw-2.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/saddw-2.c	2018-01-12 15:13:44.034827531 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, int * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/uaddw-1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/uaddw-1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/uaddw-1.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, unsigned short * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/uaddw-2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/uaddw-2.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/uaddw-2.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, unsigned short * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/uaddw-3.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/uaddw-3.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/uaddw-3.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 int 
 t6(int len, void * dummy, char * __restrict x)
 {
Index: gcc/testsuite/gcc.target/aarch64/vect-add-sub-cond.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-add-sub-cond.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-add-sub-cond.c	2018-01-12 15:13:44.049826915 +0000
@@ -3,6 +3,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize" } */
 
+#pragma GCC target "+nosve"
+
 #define COUNT1(X) if (X) count += 1
 #define COUNT2(X) if (X) count -= 1
 #define COUNT3(X) count += (X)
Index: gcc/testsuite/gcc.target/aarch64/vect-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-compile.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,7 +1,8 @@
-
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect.x"
 
 /* { dg-final { scan-assembler "orn\\tv" } } */
Index: gcc/testsuite/gcc.target/aarch64/vect-faddv-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-faddv-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-faddv-compile.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,7 +1,8 @@
-
 /* { dg-do compile } */
 /* { dg-options "-O3 -ffast-math" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect-faddv.x"
 
 /* { dg-final { scan-assembler-times "faddp\\tv" 2} } */
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-d.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE double
 #define ITYPE long
 #define OP ==
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-eq-f.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE float
 #define ITYPE int
 #define OP ==
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-d.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE double
 #define ITYPE long
 #define OP >=
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-ge-f.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE float
 #define ITYPE int
 #define OP >=
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-d.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE double
 #define ITYPE long
 #define OP >
Index: gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fcm-gt-f.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-unroll-loops --save-temps -fno-inline" } */
 
+#pragma GCC target "+nosve"
+
 #define FTYPE float
 #define ITYPE int
 #define OP >
Index: gcc/testsuite/gcc.target/aarch64/vect-fmax-fmin-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmax-fmin-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmax-fmin-compile.c	2018-01-12 15:13:44.049826915 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -ffast-math" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect-fmax-fmin.x"
 
 /* { dg-final { scan-assembler "fmaxnm\\tv" } } */
Index: gcc/testsuite/gcc.target/aarch64/vect-fmaxv-fminv-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmaxv-fminv-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmaxv-fminv-compile.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,7 +1,8 @@
-
 /* { dg-do compile } */
 /* { dg-options "-O3 -ffast-math -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect-fmaxv-fminv.x"
 
 /* { dg-final { scan-assembler "fminnmv" } } */
Index: gcc/testsuite/gcc.target/aarch64/vect-fmovd-zero.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmovd-zero.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmovd-zero.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 32
 
 void
Index: gcc/testsuite/gcc.target/aarch64/vect-fmovd.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmovd.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmovd.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 32
 
 void
Index: gcc/testsuite/gcc.target/aarch64/vect-fmovf-zero.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmovf-zero.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmovf-zero.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 32
 
 void
Index: gcc/testsuite/gcc.target/aarch64/vect-fmovf.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fmovf.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fmovf.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #define N 32
 
 void
Index: gcc/testsuite/gcc.target/aarch64/vect-fp-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-fp-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-fp-compile.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,8 +1,8 @@
-
-
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 #include "vect-fp.x"
 
 /* { dg-final { scan-assembler "fadd\\tv" } } */
Index: gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile-fp.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile-fp.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile-fp.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #include "stdint.h"
 #include "vect-ld1r.x"
 
Index: gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-ld1r-compile.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
 /* { dg-options "-O3 -fno-vect-cost-model" } */
 
+#pragma GCC target "+nosve"
+
 #include "stdint.h"
 #include "vect-ld1r.x"
 
Index: gcc/testsuite/gcc.target/aarch64/vect-movi.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-movi.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-movi.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O3 --save-temps -fno-inline" } */
 
+#pragma GCC target "+nosve"
+
 extern void abort (void);
 
 #define N 16
Index: gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-mull-compile.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,7 +1,8 @@
-
 /* { dg-do compile } */
 /* { dg-options "-O3" } */
 
+#pragma GCC target "+nosve"
+
 #define N 16
 
 #include "vect-mull.x"
Index: gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c	2018-01-12 15:13:44.050826874 +0000
@@ -2,6 +2,8 @@
 /* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all -fno-vect-cost-model" } */
 /* Write a reduction loop to be reduced using whole vector right shift.  */
 
+#pragma GCC target "+nosve"
+
 extern void abort (void);
 
 unsigned char in[8] __attribute__((__aligned__(16)));
Index: gcc/testsuite/gcc.target/aarch64/vect-vaddv.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect-vaddv.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect-vaddv.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O3 --save-temps -ffast-math" } */
 
+#pragma GCC target "+nosve"
+
 #include <arm_neon.h>
 
 extern void abort (void);
Index: gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect_saddl_1.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -fno-inline -save-temps -fno-vect-cost-model -fno-ipa-icf" } */
 
+#pragma GCC target "+nosve"
+
 typedef signed char S8_t;
 typedef signed short S16_t;
 typedef signed int S32_t;
Index: gcc/testsuite/gcc.target/aarch64/vect_smlal_1.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vect_smlal_1.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vect_smlal_1.c	2018-01-12 15:13:44.050826874 +0000
@@ -1,6 +1,8 @@
 /* { dg-do run } */
 /* { dg-options "-O3 -fno-inline -save-temps -fno-vect-cost-model -fno-ipa-icf" } */
 
+#pragma GCC target "+nosve"
+
 typedef signed char S8_t;
 typedef signed short S16_t;
 typedef signed int S32_t;
Index: gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c	2018-01-12 15:13:42.557888273 +0000
+++ gcc/testsuite/gcc.target/aarch64/vector_initialization_nostack.c	2018-01-12 15:13:44.050826874 +0000
@@ -49,5 +49,6 @@ f12 (void)
   return sum;
 }
 
-
-/* { dg-final { scan-assembler-not "sp" } } */
+/* Fails for fixed-length SVE because we lack a vec_init pattern.
+   A later patch fixes this in generic code.  */
+/* { dg-final { scan-assembler-not "sp" { xfail { aarch64_sve && { ! vect_variable_length } } } } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/aarch64-sve.exp
===================================================================
--- /dev/null	2018-01-12 06:40:27.684409621 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/aarch64-sve.exp	2018-01-12 15:13:44.035827490 +0000
@@ -0,0 +1,52 @@
+#  Specific regression driver for AArch64 SVE.
+#  Copyright (C) 2009-2018 Free Software Foundation, Inc.
+#  Contributed by ARM Ltd.
+#
+#  This file is part of GCC.
+#
+#  GCC is free software; you can redistribute it and/or modify it
+#  under the terms of the GNU General Public License as published by
+#  the Free Software Foundation; either version 3, or (at your option)
+#  any later version.
+#
+#  GCC is distributed in the hope that it will be useful, but
+#  WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+#  General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with GCC; see the file COPYING3.  If not see
+#  <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Force SVE if we're not testing it already.
+if { [check_effective_target_aarch64_sve] } {
+    set sve_flags ""
+} else {
+    set sve_flags "-march=armv8.2-a+sve"
+}
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
+    $sve_flags $DEFAULT_CFLAGS
+
+# All done.
+dg-finish
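
A typical test under the new directory leaves the SVE flags to the
driver; a hypothetical minimal example (the function and scan pattern
are illustrative, not from the patch):

    /* { dg-do compile } */
    /* { dg-options "-O2 -ftree-vectorize" } */

    void
    vadd (int *__restrict a, int *__restrict b, int n)
    {
      for (int i = 0; i < n; i++)
        a[i] += b[i];
    }

    /* { dg-final { scan-assembler {\tadd\tz[0-9]+\.s} } } */

The file can also be run in isolation from the build tree with
something like make check-gcc RUNTESTFLAGS="aarch64-sve.exp".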