Add VEC_DUPLICATE_{CST,EXPR} and associated optab

Message ID 874lrrjasv.fsf@linaro.org
State New

Commit Message

Richard Sandiford Sept. 25, 2017, 11:08 a.m. UTC
SVE needs a way of broadcasting a scalar to a variable-length vector.
This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
equivalent of the existing rtl code VEC_DUPLICATE.

Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
to mark constant nodes, but in response to last year's RFC, Richard B.
suggested it would be better to have separate codes for the constant
and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
as a normal unary operation and avoids the previous need for treating
it as a GIMPLE_SINGLE_RHS.
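
As a rough illustration of the split described above, the following is a toy model and not GCC code. The names Code, Node, make_cst, make_var, broadcast, is_constant and uniform_element are all hypothetical stand-ins. It only tries to show the idea that constant-ness is carried by the code itself, so the non-constant duplicate can be an ordinary one-operand operation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy stand-ins for GCC tree codes; these types are hypothetical.
enum class Code { ScalarCst, ScalarVar, VecDuplicateCst, VecDuplicateExpr };

struct Node;
using NodePtr = std::shared_ptr<Node>;

struct Node {
  Code code;
  long value;        // meaningful for ScalarCst only
  std::string name;  // meaningful for ScalarVar only
  NodePtr op;        // the duplicated scalar, for the two vector codes
};

NodePtr make_node(Code code, long value, std::string name, NodePtr op) {
  return std::make_shared<Node>(
      Node{code, value, std::move(name), std::move(op)});
}

NodePtr make_cst(long value) {
  return make_node(Code::ScalarCst, value, "", nullptr);
}

NodePtr make_var(std::string name) {
  return make_node(Code::ScalarVar, 0, std::move(name), nullptr);
}

// Constant-ness is a property of the code alone, with no separate flag.
bool is_constant(const Node &n) {
  return n.code == Code::ScalarCst || n.code == Code::VecDuplicateCst;
}

// Broadcast a scalar: a constant operand gets the constant code, anything
// else becomes an ordinary unary operation on its operand.
NodePtr broadcast(const NodePtr &scalar) {
  assert(scalar->code == Code::ScalarCst || scalar->code == Code::ScalarVar);
  Code code = is_constant(*scalar) ? Code::VecDuplicateCst
                                   : Code::VecDuplicateExpr;
  return make_node(code, 0, "", scalar);
}

// Loose analogue of uniform_vector_p: the repeated element, or null if
// NODE is not a duplicate.
NodePtr uniform_element(const NodePtr &node) {
  if (node->code == Code::VecDuplicateCst
      || node->code == Code::VecDuplicateExpr)
    return node->op;
  return nullptr;
}
```

In this sketch, broadcast (make_cst (7)) yields a VecDuplicateCst node and broadcast (make_var ("x")) yields a VecDuplicateExpr node whose single operand is x.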

It might make sense to use VEC_DUPLICATE_CST for all duplicated
vector constants, since it's a bit more compact than VECTOR_CST
in that case, and is potentially more efficient to process.  I don't
have any specific plans to do that though.  We'll need to keep both
types of constant around whatever happens.
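
To make the compactness point concrete, here is a small sketch of the size arithmetic, modelled on the tree_size hunk in the patch. ToyVectorNode is a hypothetical stand-in for struct tree_vector, and the real node has a larger header, so only the difference between the two sizes is meaningful.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for struct tree_vector: a header plus a trailing
// array that is declared with one element and over-allocated as needed.
struct ToyVectorNode {
  void *type;
  void *elts[1];
};

// A VECTOR_CST-style node stores every element.
std::size_t vector_cst_size(unsigned nelts) {
  assert(nelts >= 1);
  return sizeof(ToyVectorNode) + (nelts - 1) * sizeof(void *);
}

// A VEC_DUPLICATE_CST-style node stores a single element whatever the
// vector length is.
std::size_t vec_duplicate_cst_size() {
  return sizeof(ToyVectorNode);
}
```

Under these assumptions a 16-element duplicated constant saves 15 pointers per node, and the saving grows with the number of elements.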

The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?

Richard


2017-09-25  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
	(VEC_COND_EXPR): Add missing @tindex.
	* doc/md.texi (vec_duplicate@var{m}): Document.
	* tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
	* tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
	are used for VEC_DUPLICATE_CST as well.
	(tree_vector): Access base.n.nelts directly.
	* tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of
	valid codes.
	(VEC_DUPLICATE_CST_ELT): New macro.
	(build_vec_duplicate_cst): Declare.
	* tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
	(integer_zerop, integer_onep, integer_all_onesp, integer_truep)
	(real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
	(walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
	(build_vec_duplicate_cst): New function.
	(uniform_vector_p): Handle the new codes.
	(test_vec_duplicate_predicates_int): New function.
	(test_vec_duplicate_predicates_float): Likewise.
	(test_vec_duplicate_predicates): Likewise.
	(tree_c_tests): Call test_vec_duplicate_predicates.
	* cfgexpand.c (expand_debug_expr): Handle the new codes.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
	* gimple-expr.h (is_gimple_constant): Likewise.
	* gimplify.c (gimplify_expr): Likewise.
	* graphite-isl-ast-to-gimple.c
	(translate_isl_ast_to_gimple::is_constant): Likewise.
	* graphite-scop-detection.c (scan_tree_for_params): Likewise.
	* ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
	(func_checker::compare_operand): Likewise.
	* ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
	* match.pd (negate_expr_p): Likewise.
	* print-tree.c (print_node): Likewise.
	* tree-chkp.c (chkp_find_bounds_1): Likewise.
	* tree-data-ref.c (data_ref_compare_tree): Likewise.
	* tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
	* tree-ssa-loop.c (for_each_index): Likewise.
	* tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
	* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
	(ao_ref_init_from_vn_reference): Likewise.
	* tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
	* varasm.c (const_hash_1, compare_constant): Likewise.
	* fold-const.c (negate_expr_p, fold_negate_expr_1, const_binop)
	(fold_convert_const, operand_equal_p, fold_view_convert_expr)
	(exact_inverse, fold_checksum_tree): Likewise.
	(const_unop): Likewise.  Fold VEC_DUPLICATE_EXPRs of a constant.
	(test_vec_duplicate_folding): New function.
	(fold_const_c_tests): Call it.
	* optabs.def (vec_duplicate_optab): New optab.
	* optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
	* optabs.h (expand_vector_broadcast): Declare.
	* optabs.c (expand_vector_broadcast): Make non-static.  Try using
	vec_duplicate_optab.
	* expr.c (store_constructor): Try using vec_duplicate_optab for
	uniform vectors.
	(const_vector_element): New function, split out from...
	(const_vector_from_tree): ...here.
	(expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
	(expand_expr_real_1): Handle VEC_DUPLICATE_CST.
	* internal-fn.c (expand_vector_ubsan_overflow): Use CONSTANT_P
	instead of checking for VECTOR_CST.
	* tree-cfg.c (verify_gimple_assign_unary): Handle VEC_DUPLICATE_EXPR.
	(verify_gimple_assign_single): Handle VEC_DUPLICATE_CST.
	* tree-inline.c (estimate_operator_cost): Handle VEC_DUPLICATE_EXPR.

Comments

Richard Biener Sept. 25, 2017, 11:39 a.m. UTC | #1
On Mon, Sep 25, 2017 at 1:08 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> SVE needs a way of broadcasting a scalar to a variable-length vector.
> This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
> fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
> be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
> equivalent of the existing rtl code VEC_DUPLICATE.
>
> Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
> to mark constant nodes, but in response to last year's RFC, Richard B.
> suggested it would be better to have separate codes for the constant
> and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
> as a normal unary operation and avoids the previous need for treating
> it as a GIMPLE_SINGLE_RHS.
>
> It might make sense to use VEC_DUPLICATE_CST for all duplicated
> vector constants, since it's a bit more compact than VECTOR_CST
> in that case, and is potentially more efficient to process.  I don't
> have any specific plans to do that though.  We'll need to keep both
> types of constant around whatever happens.

I think VEC_DUPLICATE_EXPR is a good thing to have.  Looking at the changelog,
you didn't patch build_vector_from_val to make use of either new tree code?
That would get you quite some testing coverage.

Currently we require all elements of a VECTOR_CST to be present -- how difficult
would it be to declare that if and only if just the first element is present,
all following elements are the same as that one?  That said, I'm looking for
a loophole to avoid adding the extra VEC_DUPLICATE_CST code ... eventually we
could simply allow scalars in contexts where vectors are valid?  Like we do for
shifts?  OTOH that'd be "implicitly typed constants" (typed by context) like
CONST_INT, so probably not the way to go?
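
The "only the first element is present" idea can be sketched as below. This is a hypothetical encoding, not an existing GCC structure: CompactVectorCst, element and is_single_element_duplicate are invented names, and letting any stored prefix repeat its last element is an assumed generalisation of the single-element case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compact vector constant: only a prefix of the elements is
// stored, and every element past the prefix repeats the last stored one.
struct CompactVectorCst {
  std::vector<long> stored;  // must contain at least one element

  // Element I of the logical vector, whatever the vector length is.
  long element(std::size_t i) const {
    assert(!stored.empty());
    return i < stored.size() ? stored[i] : stored.back();
  }

  // True when a single stored element stands for the whole vector.
  // A longer prefix could still be uniform; this check is deliberately
  // conservative.
  bool is_single_element_duplicate() const {
    return stored.size() == 1;
  }
};
```

With this layout a broadcast of 5 stores just { 5 } and reads back as 5 at any index, while { 1, 2, 3 } reads back as 1, 2, 3, 3, 3, and so on.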

The ugly thing about the new codes is that we go from 3 cases when folding
vector CONSTRUCTOR and VECTOR_CST to 24 cases that we now have to cover
(if I didn't miscount).  This really asks for some common iterator over the
elements of a vector (for which VEC_DUPLICATE_{EXPR,CST} would just return the
first element every time).  Note that using scalars instead of vectors reduces
the combinatorial explosion a bit (scalar and scalar const can be handled
the same).
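
A minimal sketch of such a common element accessor follows, assuming a toy representation. Kind, ToyVector, element_at and fold_plus are hypothetical names and nothing here is GCC API. The point is that a fold written against the accessor needs one loop rather than one case per pairing of operand kinds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operand kinds: every element stored, or one element that
// stands for all of them.
enum class Kind { Full, Duplicate };

struct ToyVector {
  Kind kind;
  std::vector<long> elts;  // all elements for Full, exactly one for Duplicate
  std::size_t nelts;       // logical number of elements
};

// Common accessor: a duplicate returns its single element for every index.
long element_at(const ToyVector &v, std::size_t i) {
  assert(i < v.nelts);
  return v.kind == Kind::Duplicate ? v.elts[0] : v.elts[i];
}

// One fold routine that covers every combination of operand kinds.
std::vector<long> fold_plus(const ToyVector &a, const ToyVector &b) {
  assert(a.nelts == b.nelts);
  std::vector<long> result;
  result.reserve(a.nelts);
  for (std::size_t i = 0; i < a.nelts; ++i)
    result.push_back(element_at(a, i) + element_at(b, i));
  return result;
}
```

For example, adding a full vector { 1, 2, 3, 4 } to a duplicate of 10 with four elements goes through the same loop as adding two duplicates.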

So ... I'd rather not have those if we can avoid it, but I haven't fully
thought things out, as you can see from the above.

Richard.

> The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> OK to install?
>
> Richard
>
>
> 2017-09-25  Richard Sandiford  <richard.sandiford@linaro.org>
>             Alan Hayward  <alan.hayward@arm.com>
>             David Sherwood  <david.sherwood@arm.com>
>
> gcc/
>         * doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
>         (VEC_COND_EXPR): Add missing @tindex.
>         * doc/md.texi (vec_duplicate@var{m}): Document.
>         * tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
>         * tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
>         are used for VEC_DUPLICATE_CST as well.
>         (tree_vector): Access base.n.nelts directly.
>         * tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of
>         valid codes.
>         (VEC_DUPLICATE_CST_ELT): New macro.
>         (build_vec_duplicate_cst): Declare.
>         * tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
>         (integer_zerop, integer_onep, integer_all_onesp, integer_truep)
>         (real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
>         (walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
>         (build_vec_duplicate_cst): New function.
>         (uniform_vector_p): Handle the new codes.
>         (test_vec_duplicate_predicates_int): New function.
>         (test_vec_duplicate_predicates_float): Likewise.
>         (test_vec_duplicate_predicates): Likewise.
>         (tree_c_tests): Call test_vec_duplicate_predicates.
>         * cfgexpand.c (expand_debug_expr): Handle the new codes.
>         * tree-pretty-print.c (dump_generic_node): Likewise.
>         * dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
>         * gimple-expr.h (is_gimple_constant): Likewise.
>         * gimplify.c (gimplify_expr): Likewise.
>         * graphite-isl-ast-to-gimple.c
>         (translate_isl_ast_to_gimple::is_constant): Likewise.
>         * graphite-scop-detection.c (scan_tree_for_params): Likewise.
>         * ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
>         (func_checker::compare_operand): Likewise.
>         * ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
>         * match.pd (negate_expr_p): Likewise.
>         * print-tree.c (print_node): Likewise.
>         * tree-chkp.c (chkp_find_bounds_1): Likewise.
>         * tree-data-ref.c (data_ref_compare_tree): Likewise.
>         * tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
>         * tree-ssa-loop.c (for_each_index): Likewise.
>         * tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
>         * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
>         (ao_ref_init_from_vn_reference): Likewise.
>         * tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
>         * varasm.c (const_hash_1, compare_constant): Likewise.
>         * fold-const.c (negate_expr_p, fold_negate_expr_1, const_binop)
>         (fold_convert_const, operand_equal_p, fold_view_convert_expr)
>         (exact_inverse, fold_checksum_tree): Likewise.
>         (const_unop): Likewise.  Fold VEC_DUPLICATE_EXPRs of a constant.
>         (test_vec_duplicate_folding): New function.
>         (fold_const_c_tests): Call it.
>         * optabs.def (vec_duplicate_optab): New optab.
>         * optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
>         * optabs.h (expand_vector_broadcast): Declare.
>         * optabs.c (expand_vector_broadcast): Make non-static.  Try using
>         vec_duplicate_optab.
>         * expr.c (store_constructor): Try using vec_duplicate_optab for
>         uniform vectors.
>         (const_vector_element): New function, split out from...
>         (const_vector_from_tree): ...here.
>         (expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
>         (expand_expr_real_1): Handle VEC_DUPLICATE_CST.
>         * internal-fn.c (expand_vector_ubsan_overflow): Use CONSTANT_P
>         instead of checking for VECTOR_CST.
>         * tree-cfg.c (verify_gimple_assign_unary): Handle VEC_DUPLICATE_EXPR.
>         (verify_gimple_assign_single): Handle VEC_DUPLICATE_CST.
>         * tree-inline.c (estimate_operator_cost): Handle VEC_DUPLICATE_EXPR.
>
> Index: gcc/doc/generic.texi
> ===================================================================
> --- gcc/doc/generic.texi        2017-09-04 08:29:12.853103383 +0100
> +++ gcc/doc/generic.texi        2017-09-25 12:03:06.688818488 +0100
> @@ -1036,6 +1036,7 @@ As this example indicates, the operands
>  @tindex FIXED_CST
>  @tindex COMPLEX_CST
>  @tindex VECTOR_CST
> +@tindex VEC_DUPLICATE_CST
>  @tindex STRING_CST
>  @findex TREE_STRING_LENGTH
>  @findex TREE_STRING_POINTER
> @@ -1089,6 +1090,14 @@ constant nodes.  Each individual constan
>  double constant node.  The first operand is a @code{TREE_LIST} of the
>  constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.
>
> +@item VEC_DUPLICATE_CST
> +These nodes represent a vector constant in which every element has the
> +same scalar value.  At present only variable-length vectors use
> +@code{VEC_DUPLICATE_CST}; constant-length vectors use @code{VECTOR_CST}
> +instead.  The scalar element value is given by
> +@code{VEC_DUPLICATE_CST_ELT} and has the same restrictions as the
> +element of a @code{VECTOR_CST}.
> +
>  @item STRING_CST
>  These nodes represent string-constants.  The @code{TREE_STRING_LENGTH}
>  returns the length of the string, as an @code{int}.  The
> @@ -1692,6 +1701,7 @@ a value from @code{enum annot_expr_kind}
>
>  @node Vectors
>  @subsection Vectors
> +@tindex VEC_DUPLICATE_EXPR
>  @tindex VEC_LSHIFT_EXPR
>  @tindex VEC_RSHIFT_EXPR
>  @tindex VEC_WIDEN_MULT_HI_EXPR
> @@ -1703,9 +1713,14 @@ a value from @code{enum annot_expr_kind}
>  @tindex VEC_PACK_TRUNC_EXPR
>  @tindex VEC_PACK_SAT_EXPR
>  @tindex VEC_PACK_FIX_TRUNC_EXPR
> +@tindex VEC_COND_EXPR
>  @tindex SAD_EXPR
>
>  @table @code
> +@item VEC_DUPLICATE_EXPR
> +This node has a single operand and represents a vector in which every
> +element is equal to that operand.
> +
>  @item VEC_LSHIFT_EXPR
>  @itemx VEC_RSHIFT_EXPR
>  These nodes represent whole vector left and right shifts, respectively.
> Index: gcc/doc/md.texi
> ===================================================================
> --- gcc/doc/md.texi     2017-09-04 11:49:42.934500723 +0100
> +++ gcc/doc/md.texi     2017-09-25 12:03:06.693818177 +0100
> @@ -4888,6 +4888,17 @@ and operand 1 is parallel containing val
>  the vector mode @var{m}, or a vector mode with the same element mode and
>  smaller number of elements.
>
> +@cindex @code{vec_duplicate@var{m}} instruction pattern
> +@item @samp{vec_duplicate@var{m}}
> +Initialize vector output operand 0 so that each element has the value given
> +by scalar input operand 1.  The vector has mode @var{m} and the scalar has
> +the mode appropriate for one element of @var{m}.
> +
> +This pattern only handles duplicates of non-constant inputs.  Constant
> +vectors go through the @code{mov@var{m}} pattern instead.
> +
> +This pattern is not allowed to @code{FAIL}.
> +
>  @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
>  @item @samp{vec_cmp@var{m}@var{n}}
>  Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
> Index: gcc/tree.def
> ===================================================================
> --- gcc/tree.def        2017-07-27 10:37:56.369045398 +0100
> +++ gcc/tree.def        2017-09-25 12:03:06.739815314 +0100
> @@ -304,6 +304,10 @@ DEFTREECODE (COMPLEX_CST, "complex_cst",
>  /* Contents are in VECTOR_CST_ELTS field.  */
>  DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
>
> +/* Represents a vector constant in which every element is equal to
> +   VEC_DUPLICATE_CST_ELT.  */
> +DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)
> +
>  /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */
>  DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)
>
> @@ -534,6 +538,9 @@ DEFTREECODE (TARGET_EXPR, "target_expr",
>     1 and 2 are NULL.  The operands are then taken from the cfg edges. */
>  DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)
>
> +/* Represents a vector in which every element is equal to operand 0.  */
> +DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)
> +
>  /* Vector conditional expression. It is like COND_EXPR, but with
>     vector operands.
>
> Index: gcc/tree-core.h
> ===================================================================
> --- gcc/tree-core.h     2017-09-14 16:25:43.864400951 +0100
> +++ gcc/tree-core.h     2017-09-25 12:03:06.723816310 +0100
> @@ -975,7 +975,8 @@ struct GTY(()) tree_base {
>      /* VEC length.  This field is only used with TREE_VEC.  */
>      int length;
>
> -    /* Number of elements.  This field is only used with VECTOR_CST.  */
> +    /* Number of elements.  This field is only used with VECTOR_CST
> +       and VEC_DUPLICATE_CST.  It is always 1 for VEC_DUPLICATE_CST.  */
>      unsigned int nelts;
>
>      /* SSA version number.  This field is only used with SSA_NAME.  */
> @@ -1062,7 +1063,7 @@ struct GTY(()) tree_base {
>     public_flag:
>
>         TREE_OVERFLOW in
> -           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST
> +           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST, VEC_DUPLICATE_CST
>
>         TREE_PUBLIC in
>             VAR_DECL, FUNCTION_DECL
> @@ -1329,7 +1330,7 @@ struct GTY(()) tree_complex {
>
>  struct GTY(()) tree_vector {
>    struct tree_typed typed;
> -  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];
> +  tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];
>  };
>
>  struct GTY(()) tree_identifier {
> Index: gcc/tree.h
> ===================================================================
> --- gcc/tree.h  2017-09-14 16:45:44.200520742 +0100
> +++ gcc/tree.h  2017-09-25 12:03:06.741815189 +0100
> @@ -730,8 +730,8 @@ #define TREE_SYMBOL_REFERENCED(NODE) \
>  #define TYPE_REF_CAN_ALIAS_ALL(NODE) \
>    (PTR_OR_REF_CHECK (NODE)->base.static_flag)
>
> -/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, or VECTOR_CST, this means
> -   there was an overflow in folding.  */
> +/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST or VEC_DUPLICATE_CST,
> +   this means there was an overflow in folding.  */
>
>  #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)
>
> @@ -1030,6 +1030,10 @@ #define VECTOR_CST_NELTS(NODE) (VECTOR_C
>  #define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)
>  #define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])
>
> +/* In a VEC_DUPLICATE_CST node.  */
> +#define VEC_DUPLICATE_CST_ELT(NODE) \
> +  (VEC_DUPLICATE_CST_CHECK (NODE)->vector.elts[0])
> +
>  /* Define fields and accessors for some special-purpose tree nodes.  */
>
>  #define IDENTIFIER_LENGTH(NODE) \
> @@ -4026,6 +4030,7 @@ extern tree build_int_cst (tree, HOST_WI
>  extern tree build_int_cstu (tree type, unsigned HOST_WIDE_INT cst);
>  extern tree build_int_cst_type (tree, HOST_WIDE_INT);
>  extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
> +extern tree build_vec_duplicate_cst (tree, tree CXX_MEM_STAT_INFO);
>  extern tree build_vector (tree, vec<tree> CXX_MEM_STAT_INFO);
>  extern tree build_vector_from_ctor (tree, vec<constructor_elt, va_gc> *);
>  extern tree build_vector_from_val (tree, tree);
> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c  2017-09-21 12:06:40.939511360 +0100
> +++ gcc/tree.c  2017-09-25 12:03:06.737815438 +0100
> @@ -464,6 +464,7 @@ tree_node_structure_for_code (enum tree_
>      case FIXED_CST:            return TS_FIXED_CST;
>      case COMPLEX_CST:          return TS_COMPLEX;
>      case VECTOR_CST:           return TS_VECTOR;
> +    case VEC_DUPLICATE_CST:    return TS_VECTOR;
>      case STRING_CST:           return TS_STRING;
>        /* tcc_exceptional cases.  */
>      case ERROR_MARK:           return TS_COMMON;
> @@ -816,6 +817,7 @@ tree_code_size (enum tree_code code)
>         case FIXED_CST:         return sizeof (struct tree_fixed_cst);
>         case COMPLEX_CST:       return sizeof (struct tree_complex);
>         case VECTOR_CST:        return sizeof (struct tree_vector);
> +       case VEC_DUPLICATE_CST: return sizeof (struct tree_vector);
>         case STRING_CST:        gcc_unreachable ();
>         default:
>           return lang_hooks.tree_size (code);
> @@ -875,6 +877,9 @@ tree_size (const_tree node)
>        return (sizeof (struct tree_vector)
>               + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));
>
> +    case VEC_DUPLICATE_CST:
> +      return sizeof (struct tree_vector);
> +
>      case STRING_CST:
>        return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) + 1;
>
> @@ -1682,6 +1687,30 @@ cst_and_fits_in_hwi (const_tree x)
>           && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));
>  }
>
> +/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.
> +
> +   Note that this function is only suitable for callers that specifically
> +   need a VEC_DUPLICATE_CST node.  Use build_vector_from_val to duplicate
> +   a general scalar into a general vector type.  */
> +
> +tree
> +build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
> +{
> +  int length = sizeof (struct tree_vector);
> +
> +  record_node_allocation_statistics (VEC_DUPLICATE_CST, length);
> +
> +  tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
> +
> +  TREE_SET_CODE (t, VEC_DUPLICATE_CST);
> +  TREE_TYPE (t) = type;
> +  t->base.u.nelts = 1;
> +  VEC_DUPLICATE_CST_ELT (t) = exp;
> +  TREE_CONSTANT (t) = 1;
> +
> +  return t;
> +}
> +
>  /* Build a newly constructed VECTOR_CST node of length LEN.  */
>
>  tree
> @@ -2343,6 +2372,8 @@ integer_zerop (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return integer_zerop (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -2369,6 +2400,8 @@ integer_onep (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return integer_onep (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -2407,6 +2440,9 @@ integer_all_onesp (const_tree expr)
>        return 1;
>      }
>
> +  else if (TREE_CODE (expr) == VEC_DUPLICATE_CST)
> +    return integer_all_onesp (VEC_DUPLICATE_CST_ELT (expr));
> +
>    else if (TREE_CODE (expr) != INTEGER_CST)
>      return 0;
>
> @@ -2462,7 +2498,7 @@ integer_nonzerop (const_tree expr)
>  int
>  integer_truep (const_tree expr)
>  {
> -  if (TREE_CODE (expr) == VECTOR_CST)
> +  if (TREE_CODE (expr) == VECTOR_CST || TREE_CODE (expr) == VEC_DUPLICATE_CST)
>      return integer_all_onesp (expr);
>    return integer_onep (expr);
>  }
> @@ -2633,6 +2669,8 @@ real_zerop (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return real_zerop (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -2661,6 +2699,8 @@ real_onep (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return real_onep (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -2688,6 +2728,8 @@ real_minus_onep (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return real_minus_onep (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -7090,6 +7132,9 @@ add_expr (const_tree t, inchash::hash &h
>           inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags);
>         return;
>        }
> +    case VEC_DUPLICATE_CST:
> +      inchash::add_expr (VEC_DUPLICATE_CST_ELT (t), hstate);
> +      return;
>      case SSA_NAME:
>        /* We can just compare by pointer.  */
>        hstate.add_wide_int (SSA_NAME_VERSION (t));
> @@ -10344,6 +10389,9 @@ initializer_zerop (const_tree init)
>         return true;
>        }
>
> +    case VEC_DUPLICATE_CST:
> +      return initializer_zerop (VEC_DUPLICATE_CST_ELT (init));
> +
>      case CONSTRUCTOR:
>        {
>         unsigned HOST_WIDE_INT idx;
> @@ -10389,7 +10437,13 @@ uniform_vector_p (const_tree vec)
>
>    gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));
>
> -  if (TREE_CODE (vec) == VECTOR_CST)
> +  if (TREE_CODE (vec) == VEC_DUPLICATE_CST)
> +    return VEC_DUPLICATE_CST_ELT (vec);
> +
> +  else if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)
> +    return TREE_OPERAND (vec, 0);
> +
> +  else if (TREE_CODE (vec) == VECTOR_CST)
>      {
>        first = VECTOR_CST_ELT (vec, 0);
>        for (i = 1; i < VECTOR_CST_NELTS (vec); ++i)
> @@ -11094,6 +11148,7 @@ #define WALK_SUBTREE_TAIL(NODE)                         \
>      case REAL_CST:
>      case FIXED_CST:
>      case VECTOR_CST:
> +    case VEC_DUPLICATE_CST:
>      case STRING_CST:
>      case BLOCK:
>      case PLACEHOLDER_EXPR:
> @@ -12380,6 +12435,12 @@ drop_tree_overflow (tree t)
>             elt = drop_tree_overflow (elt);
>         }
>      }
> +  if (TREE_CODE (t) == VEC_DUPLICATE_CST)
> +    {
> +      tree *elt = &VEC_DUPLICATE_CST_ELT (t);
> +      if (TREE_OVERFLOW (*elt))
> +       *elt = drop_tree_overflow (*elt);
> +    }
>    return t;
>  }
>
> @@ -13797,6 +13858,92 @@ test_integer_constants ()
>    ASSERT_EQ (type, TREE_TYPE (zero));
>  }
>
> +/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs
> +   for integral type TYPE.  */
> +
> +static void
> +test_vec_duplicate_predicates_int (tree type)
> +{
> +  tree vec_type = build_vector_type (type, 4);
> +
> +  tree zero = build_zero_cst (type);
> +  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
> +  ASSERT_TRUE (integer_zerop (vec_zero));
> +  ASSERT_FALSE (integer_onep (vec_zero));
> +  ASSERT_FALSE (integer_minus_onep (vec_zero));
> +  ASSERT_FALSE (integer_all_onesp (vec_zero));
> +  ASSERT_FALSE (integer_truep (vec_zero));
> +  ASSERT_TRUE (initializer_zerop (vec_zero));
> +
> +  tree one = build_one_cst (type);
> +  tree vec_one = build_vec_duplicate_cst (vec_type, one);
> +  ASSERT_FALSE (integer_zerop (vec_one));
> +  ASSERT_TRUE (integer_onep (vec_one));
> +  ASSERT_FALSE (integer_minus_onep (vec_one));
> +  ASSERT_FALSE (integer_all_onesp (vec_one));
> +  ASSERT_FALSE (integer_truep (vec_one));
> +  ASSERT_FALSE (initializer_zerop (vec_one));
> +
> +  tree minus_one = build_minus_one_cst (type);
> +  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
> +  ASSERT_FALSE (integer_zerop (vec_minus_one));
> +  ASSERT_FALSE (integer_onep (vec_minus_one));
> +  ASSERT_TRUE (integer_minus_onep (vec_minus_one));
> +  ASSERT_TRUE (integer_all_onesp (vec_minus_one));
> +  ASSERT_TRUE (integer_truep (vec_minus_one));
> +  ASSERT_FALSE (initializer_zerop (vec_minus_one));
> +
> +  tree x = create_tmp_var_raw (type, "x");
> +  tree vec_x = build1 (VEC_DUPLICATE_EXPR, vec_type, x);
> +  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
> +  ASSERT_EQ (uniform_vector_p (vec_one), one);
> +  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
> +  ASSERT_EQ (uniform_vector_p (vec_x), x);
> +}
> +
> +/* Verify predicate handling of VEC_DUPLICATE_CSTs for floating-point
> +   type TYPE.  */
> +
> +static void
> +test_vec_duplicate_predicates_float (tree type)
> +{
> +  tree vec_type = build_vector_type (type, 4);
> +
> +  tree zero = build_zero_cst (type);
> +  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
> +  ASSERT_TRUE (real_zerop (vec_zero));
> +  ASSERT_FALSE (real_onep (vec_zero));
> +  ASSERT_FALSE (real_minus_onep (vec_zero));
> +  ASSERT_TRUE (initializer_zerop (vec_zero));
> +
> +  tree one = build_one_cst (type);
> +  tree vec_one = build_vec_duplicate_cst (vec_type, one);
> +  ASSERT_FALSE (real_zerop (vec_one));
> +  ASSERT_TRUE (real_onep (vec_one));
> +  ASSERT_FALSE (real_minus_onep (vec_one));
> +  ASSERT_FALSE (initializer_zerop (vec_one));
> +
> +  tree minus_one = build_minus_one_cst (type);
> +  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
> +  ASSERT_FALSE (real_zerop (vec_minus_one));
> +  ASSERT_FALSE (real_onep (vec_minus_one));
> +  ASSERT_TRUE (real_minus_onep (vec_minus_one));
> +  ASSERT_FALSE (initializer_zerop (vec_minus_one));
> +
> +  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
> +  ASSERT_EQ (uniform_vector_p (vec_one), one);
> +  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
> +}
> +
> +/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
> +
> +static void
> +test_vec_duplicate_predicates ()
> +{
> +  test_vec_duplicate_predicates_int (integer_type_node);
> +  test_vec_duplicate_predicates_float (float_type_node);
> +}
> +
>  /* Verify identifiers.  */
>
>  static void
> @@ -13825,6 +13972,7 @@ test_labels ()
>  tree_c_tests ()
>  {
>    test_integer_constants ();
> +  test_vec_duplicate_predicates ();
>    test_identifiers ();
>    test_labels ();
>  }
> Index: gcc/cfgexpand.c
> ===================================================================
> --- gcc/cfgexpand.c     2017-09-14 16:25:43.861637270 +0100
> +++ gcc/cfgexpand.c     2017-09-25 12:03:06.687818551 +0100
> @@ -5049,6 +5049,8 @@ expand_debug_expr (tree exp)
>      case VEC_WIDEN_LSHIFT_HI_EXPR:
>      case VEC_WIDEN_LSHIFT_LO_EXPR:
>      case VEC_PERM_EXPR:
> +    case VEC_DUPLICATE_CST:
> +    case VEC_DUPLICATE_EXPR:
>        return NULL;
>
>      /* Misc codes.  */
> Index: gcc/tree-pretty-print.c
> ===================================================================
> --- gcc/tree-pretty-print.c     2017-08-24 08:46:01.758139665 +0100
> +++ gcc/tree-pretty-print.c     2017-09-25 12:03:06.728815998 +0100
> @@ -1800,6 +1800,12 @@ dump_generic_node (pretty_printer *pp, t
>        }
>        break;
>
> +    case VEC_DUPLICATE_CST:
> +      pp_string (pp, "{ ");
> +      dump_generic_node (pp, VEC_DUPLICATE_CST_ELT (node), spc, flags, false);
> +      pp_string (pp, ", ... }");
> +      break;
> +
>      case FUNCTION_TYPE:
>      case METHOD_TYPE:
>        dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);
> @@ -3230,6 +3236,15 @@ dump_generic_node (pretty_printer *pp, t
>        pp_string (pp, " > ");
>        break;
>
> +    case VEC_DUPLICATE_EXPR:
> +      pp_space (pp);
> +      for (str = get_tree_code_name (code); *str; str++)
> +       pp_character (pp, TOUPPER (*str));
> +      pp_string (pp, " < ");
> +      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> +      pp_string (pp, " > ");
> +      break;
> +
>      case VEC_UNPACK_HI_EXPR:
>        pp_string (pp, " VEC_UNPACK_HI_EXPR < ");
>        dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> Index: gcc/dwarf2out.c
> ===================================================================
> --- gcc/dwarf2out.c     2017-09-21 11:53:16.380966799 +0100
> +++ gcc/dwarf2out.c     2017-09-25 12:03:06.704817493 +0100
> @@ -18862,6 +18862,7 @@ rtl_for_decl_init (tree init, tree type)
>         switch (TREE_CODE (init))
>           {
>           case VECTOR_CST:
> +         case VEC_DUPLICATE_CST:
>             break;
>           case CONSTRUCTOR:
>             if (TREE_CONSTANT (init))
> Index: gcc/gimple-expr.h
> ===================================================================
> --- gcc/gimple-expr.h   2017-02-23 19:54:20.000000000 +0000
> +++ gcc/gimple-expr.h   2017-09-25 12:03:06.708817243 +0100
> @@ -134,6 +134,7 @@ is_gimple_constant (const_tree t)
>      case FIXED_CST:
>      case COMPLEX_CST:
>      case VECTOR_CST:
> +    case VEC_DUPLICATE_CST:
>      case STRING_CST:
>        return true;
>
> Index: gcc/gimplify.c
> ===================================================================
> --- gcc/gimplify.c      2017-08-29 08:47:13.282917702 +0100
> +++ gcc/gimplify.c      2017-09-25 12:03:06.711817057 +0100
> @@ -11501,6 +11501,7 @@ gimplify_expr (tree *expr_p, gimple_seq
>         case STRING_CST:
>         case COMPLEX_CST:
>         case VECTOR_CST:
> +       case VEC_DUPLICATE_CST:
>           /* Drop the overflow flag on constants, we do not want
>              that in the GIMPLE IL.  */
>           if (TREE_OVERFLOW_P (*expr_p))
> Index: gcc/graphite-isl-ast-to-gimple.c

> ===================================================================

> --- gcc/graphite-isl-ast-to-gimple.c    2017-09-22 17:22:08.334305773 +0100

> +++ gcc/graphite-isl-ast-to-gimple.c    2017-09-25 12:03:06.712816994 +0100

> @@ -245,7 +245,8 @@ enum phi_node_kind

>      return TREE_CODE (op) == INTEGER_CST

>        || TREE_CODE (op) == REAL_CST

>        || TREE_CODE (op) == COMPLEX_CST

> -      || TREE_CODE (op) == VECTOR_CST;

> +      || TREE_CODE (op) == VECTOR_CST

> +      || TREE_CODE (op) == VEC_DUPLICATE_CST;

>    }

>

>  private:

> Index: gcc/graphite-scop-detection.c

> ===================================================================

> --- gcc/graphite-scop-detection.c       2017-09-22 17:22:08.510305732 +0100

> +++ gcc/graphite-scop-detection.c       2017-09-25 12:03:06.712816994 +0100

> @@ -1447,6 +1447,7 @@ scan_tree_for_params (sese_info_p s, tre

>      case REAL_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        break;

>

>     default:

> Index: gcc/ipa-icf-gimple.c

> ===================================================================

> --- gcc/ipa-icf-gimple.c        2017-08-30 16:25:16.913251173 +0100

> +++ gcc/ipa-icf-gimple.c        2017-09-25 12:03:06.714816870 +0100

> @@ -333,6 +333,7 @@ func_checker::compare_cst_or_decl (tree

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>      case REAL_CST:

>        {

> @@ -528,6 +529,7 @@ func_checker::compare_operand (tree t1,

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>      case REAL_CST:

>      case FUNCTION_DECL:

> Index: gcc/ipa-icf.c

> ===================================================================

> --- gcc/ipa-icf.c       2017-06-07 07:42:16.940073012 +0100

> +++ gcc/ipa-icf.c       2017-09-25 12:03:06.715816808 +0100

> @@ -1478,6 +1478,7 @@ sem_item::add_expr (const_tree exp, inch

>      case STRING_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        inchash::add_expr (exp, hstate);

>        break;

>      case CONSTRUCTOR:

> @@ -2030,6 +2031,9 @@ sem_variable::equals (tree t1, tree t2)

>

>         return 1;

>        }

> +    case VEC_DUPLICATE_CST:

> +      return sem_variable::equals (VEC_DUPLICATE_CST_ELT (t1),

> +                                  VEC_DUPLICATE_CST_ELT (t2));

>      case ARRAY_REF:

>      case ARRAY_RANGE_REF:

>        {

> Index: gcc/match.pd

> ===================================================================

> --- gcc/match.pd        2017-09-21 11:17:14.827201204 +0100

> +++ gcc/match.pd        2017-09-25 12:03:06.716816745 +0100

> @@ -944,6 +944,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

>  (match negate_expr_p

>   VECTOR_CST

>   (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))

> +(match negate_expr_p

> + VEC_DUPLICATE_CST

> + (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))

>

>  /* (-A) * (-B) -> A * B  */

>  (simplify

> Index: gcc/print-tree.c

> ===================================================================

> --- gcc/print-tree.c    2017-08-21 10:42:05.815630531 +0100

> +++ gcc/print-tree.c    2017-09-25 12:03:06.719816559 +0100

> @@ -783,6 +783,10 @@ print_node (FILE *file, const char *pref

>           }

>           break;

>

> +       case VEC_DUPLICATE_CST:

> +         print_node (file, "elt", VEC_DUPLICATE_CST_ELT (node), indent + 4);

> +         break;

> +

>         case COMPLEX_CST:

>           print_node (file, "real", TREE_REALPART (node), indent + 4);

>           print_node (file, "imag", TREE_IMAGPART (node), indent + 4);

> Index: gcc/tree-chkp.c

> ===================================================================

> --- gcc/tree-chkp.c     2017-08-16 08:50:32.376422338 +0100

> +++ gcc/tree-chkp.c     2017-09-25 12:03:06.722816372 +0100

> @@ -3800,6 +3800,7 @@ chkp_find_bounds_1 (tree ptr, tree ptr_s

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        if (integer_zerop (ptr_src))

>         bounds = chkp_get_none_bounds ();

>        else

> Index: gcc/tree-data-ref.c

> ===================================================================

> --- gcc/tree-data-ref.c 2017-08-29 20:01:07.143372092 +0100

> +++ gcc/tree-data-ref.c 2017-09-25 12:03:06.724816248 +0100

> @@ -1223,6 +1223,7 @@ data_ref_compare_tree (tree t1, tree t2)

>      case STRING_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        {

>         hashval_t h1 = iterative_hash_expr (t1, 0);

>         hashval_t h2 = iterative_hash_expr (t2, 0);

> Index: gcc/tree-loop-distribution.c

> ===================================================================

> --- gcc/tree-loop-distribution.c        2017-08-29 20:01:07.143372092 +0100

> +++ gcc/tree-loop-distribution.c        2017-09-25 12:03:06.727816061 +0100

> @@ -935,6 +935,9 @@ const_with_all_bytes_same (tree val)

>            && CONSTRUCTOR_NELTS (val) == 0))

>      return 0;

>

> +  if (TREE_CODE (val) == VEC_DUPLICATE_CST)

> +    return const_with_all_bytes_same (VEC_DUPLICATE_CST_ELT (val));

> +

>    if (real_zerop (val))

>      {

>        /* Only return 0 for +0.0, not for -0.0, which doesn't have

> Index: gcc/tree-ssa-loop.c

> ===================================================================

> --- gcc/tree-ssa-loop.c 2017-08-10 14:36:07.892477227 +0100

> +++ gcc/tree-ssa-loop.c 2017-09-25 12:03:06.728815998 +0100

> @@ -616,6 +616,7 @@ for_each_index (tree *addr_p, bool (*cbc

>         case STRING_CST:

>         case RESULT_DECL:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case COMPLEX_CST:

>         case INTEGER_CST:

>         case REAL_CST:

> Index: gcc/tree-ssa-pre.c

> ===================================================================

> --- gcc/tree-ssa-pre.c  2017-09-13 18:03:48.390469882 +0100

> +++ gcc/tree-ssa-pre.c  2017-09-25 12:03:06.729815936 +0100

> @@ -2675,6 +2675,7 @@ create_component_ref_by_pieces_1 (basic_

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case REAL_CST:

>      case CONSTRUCTOR:

>      case VAR_DECL:

> Index: gcc/tree-ssa-sccvn.c

> ===================================================================

> --- gcc/tree-ssa-sccvn.c        2017-09-21 11:53:16.339540234 +0100

> +++ gcc/tree-ssa-sccvn.c        2017-09-25 12:03:06.731815812 +0100

> @@ -858,6 +858,7 @@ copy_reference_ops_from_ref (tree ref, v

>         case INTEGER_CST:

>         case COMPLEX_CST:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case REAL_CST:

>         case FIXED_CST:

>         case CONSTRUCTOR:

> @@ -1050,6 +1051,7 @@ ao_ref_init_from_vn_reference (ao_ref *r

>         case INTEGER_CST:

>         case COMPLEX_CST:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case REAL_CST:

>         case CONSTRUCTOR:

>         case CONST_DECL:

> Index: gcc/tree-vect-generic.c

> ===================================================================

> --- gcc/tree-vect-generic.c     2017-09-14 17:04:19.082694343 +0100

> +++ gcc/tree-vect-generic.c     2017-09-25 12:03:06.731815812 +0100

> @@ -1419,6 +1419,7 @@ lower_vec_perm (gimple_stmt_iterator *gs

>  ssa_uniform_vector_p (tree op)

>  {

>    if (TREE_CODE (op) == VECTOR_CST

> +      || TREE_CODE (op) == VEC_DUPLICATE_CST

>        || TREE_CODE (op) == CONSTRUCTOR)

>      return uniform_vector_p (op);

>    if (TREE_CODE (op) == SSA_NAME)

> Index: gcc/varasm.c

> ===================================================================

> --- gcc/varasm.c        2017-09-22 17:43:06.658083770 +0100

> +++ gcc/varasm.c        2017-09-25 12:03:06.743815065 +0100

> @@ -3068,6 +3068,9 @@ const_hash_1 (const tree exp)

>      CASE_CONVERT:

>        return const_hash_1 (TREE_OPERAND (exp, 0)) * 7 + 2;

>

> +    case VEC_DUPLICATE_CST:

> +      return const_hash_1 (VEC_DUPLICATE_CST_ELT (exp)) * 7 + 3;

> +

>      default:

>        /* A language specific constant. Just hash the code.  */

>        return code;

> @@ -3158,6 +3161,10 @@ compare_constant (const tree t1, const t

>         return 1;

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      return compare_constant (VEC_DUPLICATE_CST_ELT (t1),

> +                              VEC_DUPLICATE_CST_ELT (t2));

> +

>      case CONSTRUCTOR:

>        {

>         vec<constructor_elt, va_gc> *v1, *v2;

> Index: gcc/fold-const.c

> ===================================================================

> --- gcc/fold-const.c    2017-09-14 17:04:19.080694343 +0100

> +++ gcc/fold-const.c    2017-09-25 12:03:06.708817243 +0100

> @@ -418,6 +418,9 @@ negate_expr_p (tree t)

>         return true;

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      return negate_expr_p (VEC_DUPLICATE_CST_ELT (t));

> +

>      case COMPLEX_EXPR:

>        return negate_expr_p (TREE_OPERAND (t, 0))

>              && negate_expr_p (TREE_OPERAND (t, 1));

> @@ -577,6 +580,14 @@ fold_negate_expr_1 (location_t loc, tree

>         return build_vector (type, elts);

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      {

> +       tree sub = fold_negate_expr (loc, VEC_DUPLICATE_CST_ELT (t));

> +       if (!sub)

> +         return NULL_TREE;

> +       return build_vector_from_val (type, sub);

> +      }

> +

>      case COMPLEX_EXPR:

>        if (negate_expr_p (t))

>         return fold_build2_loc (loc, COMPLEX_EXPR, type,

> @@ -1433,6 +1444,16 @@ const_binop (enum tree_code code, tree a

>        return build_vector (type, elts);

>      }

>

> +  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +      && TREE_CODE (arg2) == VEC_DUPLICATE_CST)

> +    {

> +      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1),

> +                             VEC_DUPLICATE_CST_ELT (arg2));

> +      if (!sub)

> +       return NULL_TREE;

> +      return build_vector_from_val (TREE_TYPE (arg1), sub);

> +    }

> +

>    /* Shifts allow a scalar offset for a vector.  */

>    if (TREE_CODE (arg1) == VECTOR_CST

>        && TREE_CODE (arg2) == INTEGER_CST)

> @@ -1456,6 +1477,15 @@ const_binop (enum tree_code code, tree a

>

>        return build_vector (type, elts);

>      }

> +

> +  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +      && TREE_CODE (arg2) == INTEGER_CST)

> +    {

> +      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1), arg2);

> +      if (!sub)

> +       return NULL_TREE;

> +      return build_vector_from_val (TREE_TYPE (arg1), sub);

> +    }

>    return NULL_TREE;

>  }

>

> @@ -1649,6 +1679,13 @@ const_unop (enum tree_code code, tree ty

>           if (i == count)

>             return build_vector (type, elements);

>         }

> +      else if (TREE_CODE (arg0) == VEC_DUPLICATE_CST)

> +       {

> +         tree sub = const_unop (BIT_NOT_EXPR, TREE_TYPE (type),

> +                                VEC_DUPLICATE_CST_ELT (arg0));

> +         if (sub)

> +           return build_vector_from_val (type, sub);

> +       }

>        break;

>

>      case TRUTH_NOT_EXPR:

> @@ -1734,6 +1771,11 @@ const_unop (enum tree_code code, tree ty

>         return res;

>        }

>

> +    case VEC_DUPLICATE_EXPR:

> +      if (CONSTANT_CLASS_P (arg0))

> +       return build_vector_from_val (type, arg0);

> +      return NULL_TREE;

> +

>      default:

>        break;

>      }

> @@ -2164,6 +2206,15 @@ fold_convert_const (enum tree_code code,

>             }

>           return build_vector (type, v);

>         }

> +      if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +         && (TYPE_VECTOR_SUBPARTS (type)

> +             == TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1))))

> +       {

> +         tree sub = fold_convert_const (code, TREE_TYPE (type),

> +                                        VEC_DUPLICATE_CST_ELT (arg1));

> +         if (sub)

> +           return build_vector_from_val (type, sub);

> +       }

>      }

>    return NULL_TREE;

>  }

> @@ -2950,6 +3001,10 @@ operand_equal_p (const_tree arg0, const_

>           return 1;

>         }

>

> +      case VEC_DUPLICATE_CST:

> +       return operand_equal_p (VEC_DUPLICATE_CST_ELT (arg0),

> +                               VEC_DUPLICATE_CST_ELT (arg1), flags);

> +

>        case COMPLEX_CST:

>         return (operand_equal_p (TREE_REALPART (arg0), TREE_REALPART (arg1),

>                                  flags)

> @@ -7504,6 +7559,20 @@ can_native_encode_string_p (const_tree e

>  static tree

>  fold_view_convert_expr (tree type, tree expr)

>  {

> +  /* Recurse on duplicated vectors if the target type is also a vector

> +     and if the elements line up.  */

> +  tree expr_type = TREE_TYPE (expr);

> +  if (TREE_CODE (expr) == VEC_DUPLICATE_CST

> +      && VECTOR_TYPE_P (type)

> +      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (expr_type)

> +      && TYPE_SIZE (TREE_TYPE (type)) == TYPE_SIZE (TREE_TYPE (expr_type)))

> +    {

> +      tree sub = fold_view_convert_expr (TREE_TYPE (type),

> +                                        VEC_DUPLICATE_CST_ELT (expr));

> +      if (sub)

> +       return build_vector_from_val (type, sub);

> +    }

> +

>    /* We support up to 512-bit values (for V8DFmode).  */

>    unsigned char buffer[64];

>    int len;

> @@ -8903,6 +8972,15 @@ exact_inverse (tree type, tree cst)

>         return build_vector (type, elts);

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      {

> +       tree sub = exact_inverse (TREE_TYPE (type),

> +                                 VEC_DUPLICATE_CST_ELT (cst));

> +       if (!sub)

> +         return NULL_TREE;

> +       return build_vector_from_val (type, sub);

> +      }

> +

>      default:

>        return NULL_TREE;

>      }

> @@ -12097,6 +12175,9 @@ fold_checksum_tree (const_tree expr, str

>           for (i = 0; i < (int) VECTOR_CST_NELTS (expr); ++i)

>             fold_checksum_tree (VECTOR_CST_ELT (expr, i), ctx, ht);

>           break;

> +       case VEC_DUPLICATE_CST:

> +         fold_checksum_tree (VEC_DUPLICATE_CST_ELT (expr), ctx, ht);

> +         break;

>         default:

>           break;

>         }

> @@ -14563,6 +14644,36 @@ test_vector_folding ()

>    ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));

>  }

>

> +/* Verify folding of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */

> +

> +static void

> +test_vec_duplicate_folding ()

> +{

> +  tree type = build_vector_type (ssizetype, 4);

> +  tree dup5 = build_vec_duplicate_cst (type, ssize_int (5));

> +  tree dup3 = build_vec_duplicate_cst (type, ssize_int (3));

> +

> +  tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);

> +  ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));

> +

> +  tree not_dup5 = fold_unary (BIT_NOT_EXPR, type, dup5);

> +  ASSERT_EQ (uniform_vector_p (not_dup5), ssize_int (-6));

> +

> +  tree dup5_plus_dup3 = fold_binary (PLUS_EXPR, type, dup5, dup3);

> +  ASSERT_EQ (uniform_vector_p (dup5_plus_dup3), ssize_int (8));

> +

> +  tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));

> +  ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));

> +

> +  tree size_vector = build_vector_type (sizetype, 4);

> +  tree size_dup5 = fold_convert (size_vector, dup5);

> +  ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));

> +

> +  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));

> +  tree dup5_cst = build_vector_from_val (type, ssize_int (5));

> +  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));

> +}

> +

>  /* Run all of the selftests within this file.  */

>

>  void

> @@ -14570,6 +14681,7 @@ fold_const_c_tests ()

>  {

>    test_arithmetic_folding ();

>    test_vector_folding ();

> +  test_vec_duplicate_folding ();

>  }

>

>  } // namespace selftest

> Index: gcc/optabs.def

> ===================================================================

> --- gcc/optabs.def      2017-08-10 14:36:07.448493264 +0100

> +++ gcc/optabs.def      2017-09-25 12:03:06.718816621 +0100

> @@ -364,3 +364,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I

>

>  OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")

>  OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")

> +

> +OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)

> Index: gcc/optabs-tree.c

> ===================================================================

> --- gcc/optabs-tree.c   2017-06-22 12:22:57.735313105 +0100

> +++ gcc/optabs-tree.c   2017-09-25 12:03:06.716816745 +0100

> @@ -210,6 +210,9 @@ optab_for_tree_code (enum tree_code code

>        return TYPE_UNSIGNED (type) ?

>         vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;

>

> +    case VEC_DUPLICATE_EXPR:

> +      return vec_duplicate_optab;

> +

>      default:

>        break;

>      }

> Index: gcc/optabs.h

> ===================================================================

> --- gcc/optabs.h        2017-06-30 12:50:37.492697279 +0100

> +++ gcc/optabs.h        2017-09-25 12:03:06.719816559 +0100

> @@ -181,6 +181,7 @@ extern rtx simplify_expand_binop (machin

>                                   enum optab_methods methods);

>  extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,

>                                 enum optab_methods);

> +extern rtx expand_vector_broadcast (machine_mode, rtx);

>

>  /* Generate code for a simple binary or unary operation.  "Simple" in

>     this case means "can be unambiguously described by a (mode, code)

> Index: gcc/optabs.c

> ===================================================================

> --- gcc/optabs.c        2017-09-23 10:28:11.672861860 +0100

> +++ gcc/optabs.c        2017-09-25 12:03:06.718816621 +0100

> @@ -367,7 +367,7 @@ force_expand_binop (machine_mode mode, o

>     mode of OP must be the element mode of VMODE.  If OP is a constant,

>     then the return value will be a constant.  */

>

> -static rtx

> +rtx

>  expand_vector_broadcast (machine_mode vmode, rtx op)

>  {

>    enum insn_code icode;

> @@ -385,6 +385,16 @@ expand_vector_broadcast (machine_mode vm

>    if (CONSTANT_P (op))

>      return gen_rtx_CONST_VECTOR (vmode, vec);

>

> +  icode = optab_handler (vec_duplicate_optab, vmode);

> +  if (icode != CODE_FOR_nothing)

> +    {

> +      struct expand_operand ops[2];

> +      create_output_operand (&ops[0], NULL_RTX, vmode);

> +      create_input_operand (&ops[1], op, GET_MODE (op));

> +      expand_insn (icode, 2, ops);

> +      return ops[0].value;

> +    }

> +

>    /* ??? If the target doesn't have a vec_init, then we have no easy way

>       of performing this operation.  Most of this sort of generic support

>       is hidden away in the vector lowering support in gimple.  */

> Index: gcc/expr.c

> ===================================================================

> --- gcc/expr.c  2017-09-23 10:27:39.925846365 +0100

> +++ gcc/expr.c  2017-09-25 12:03:06.705817430 +0100

> @@ -6572,7 +6572,8 @@ store_constructor (tree exp, rtx target,

>         constructor_elt *ce;

>         int i;

>         int need_to_clear;

> -       int icode = CODE_FOR_nothing;

> +       insn_code icode = CODE_FOR_nothing;

> +       tree elt;

>         tree elttype = TREE_TYPE (type);

>         int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));

>         machine_mode eltmode = TYPE_MODE (elttype);

> @@ -6582,13 +6583,30 @@ store_constructor (tree exp, rtx target,

>         unsigned n_elts;

>         alias_set_type alias;

>         bool vec_vec_init_p = false;

> +       machine_mode mode = GET_MODE (target);

>

>         gcc_assert (eltmode != BLKmode);

>

> +       /* Try using vec_duplicate_optab for uniform vectors.  */

> +       if (!TREE_SIDE_EFFECTS (exp)

> +           && VECTOR_MODE_P (mode)

> +           && eltmode == GET_MODE_INNER (mode)

> +           && ((icode = optab_handler (vec_duplicate_optab, mode))

> +               != CODE_FOR_nothing)

> +           && (elt = uniform_vector_p (exp)))

> +         {

> +           struct expand_operand ops[2];

> +           create_output_operand (&ops[0], target, mode);

> +           create_input_operand (&ops[1], expand_normal (elt), eltmode);

> +           expand_insn (icode, 2, ops);

> +           if (!rtx_equal_p (target, ops[0].value))

> +             emit_move_insn (target, ops[0].value);

> +           break;

> +         }

> +

>         n_elts = TYPE_VECTOR_SUBPARTS (type);

> -       if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))

> +       if (REG_P (target) && VECTOR_MODE_P (mode))

>           {

> -           machine_mode mode = GET_MODE (target);

>             machine_mode emode = eltmode;

>

>             if (CONSTRUCTOR_NELTS (exp)

> @@ -6600,7 +6618,7 @@ store_constructor (tree exp, rtx target,

>                             == n_elts);

>                 emode = TYPE_MODE (etype);

>               }

> -           icode = (int) convert_optab_handler (vec_init_optab, mode, emode);

> +           icode = convert_optab_handler (vec_init_optab, mode, emode);

>             if (icode != CODE_FOR_nothing)

>               {

>                 unsigned int i, n = n_elts;

> @@ -6648,7 +6666,7 @@ store_constructor (tree exp, rtx target,

>         if (need_to_clear && size > 0 && !vector)

>           {

>             if (REG_P (target))

> -             emit_move_insn (target, CONST0_RTX (GET_MODE (target)));

> +             emit_move_insn (target, CONST0_RTX (mode));

>             else

>               clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);

>             cleared = 1;

> @@ -6656,7 +6674,7 @@ store_constructor (tree exp, rtx target,

>

>         /* Inform later passes that the old value is dead.  */

>         if (!cleared && !vector && REG_P (target))

> -         emit_move_insn (target, CONST0_RTX (GET_MODE (target)));

> +         emit_move_insn (target, CONST0_RTX (mode));

>

>          if (MEM_P (target))

>           alias = MEM_ALIAS_SET (target);

> @@ -6707,8 +6725,7 @@ store_constructor (tree exp, rtx target,

>

>         if (vector)

>           emit_insn (GEN_FCN (icode) (target,

> -                                     gen_rtx_PARALLEL (GET_MODE (target),

> -                                                       vector)));

> +                                     gen_rtx_PARALLEL (mode, vector)));

>         break;

>        }

>

> @@ -7683,6 +7700,19 @@ expand_operands (tree exp0, tree exp1, r

>  }

>

>

> +/* Expand constant vector element ELT, which has mode MODE.  This is used

> +   for members of VECTOR_CST and VEC_DUPLICATE_CST.  */

> +

> +static rtx

> +const_vector_element (scalar_mode mode, const_tree elt)

> +{

> +  if (TREE_CODE (elt) == REAL_CST)

> +    return const_double_from_real_value (TREE_REAL_CST (elt), mode);

> +  if (TREE_CODE (elt) == FIXED_CST)

> +    return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);

> +  return immed_wide_int_const (elt, mode);

> +}

> +

>  /* Return a MEM that contains constant EXP.  DEFER is as for

>     output_constant_def and MODIFIER is as for expand_expr.  */

>

> @@ -9548,6 +9578,12 @@ #define REDUCE_BIT_FIELD(expr)   (reduce_b

>        target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);

>        return target;

>

> +    case VEC_DUPLICATE_EXPR:

> +      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);

> +      target = expand_vector_broadcast (mode, op0);

> +      gcc_assert (target);

> +      return target;

> +

>      case BIT_INSERT_EXPR:

>        {

>         unsigned bitpos = tree_to_uhwi (treeop2);

> @@ -9981,6 +10017,11 @@ expand_expr_real_1 (tree exp, rtx target

>                             tmode, modifier);

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      op0 = const_vector_element (GET_MODE_INNER (mode),

> +                                 VEC_DUPLICATE_CST_ELT (exp));

> +      return gen_const_vec_duplicate (mode, op0);

> +

>      case CONST_DECL:

>        if (modifier == EXPAND_WRITE)

>         {

> @@ -11742,8 +11783,7 @@ const_vector_from_tree (tree exp)

>  {

>    rtvec v;

>    unsigned i, units;

> -  tree elt;

> -  machine_mode inner, mode;

> +  machine_mode mode;

>

>    mode = TYPE_MODE (TREE_TYPE (exp));

>

> @@ -11754,23 +11794,12 @@ const_vector_from_tree (tree exp)

>      return const_vector_mask_from_tree (exp);

>

>    units = VECTOR_CST_NELTS (exp);

> -  inner = GET_MODE_INNER (mode);

>

>    v = rtvec_alloc (units);

>

>    for (i = 0; i < units; ++i)

> -    {

> -      elt = VECTOR_CST_ELT (exp, i);

> -

> -      if (TREE_CODE (elt) == REAL_CST)

> -       RTVEC_ELT (v, i) = const_double_from_real_value (TREE_REAL_CST (elt),

> -                                                        inner);

> -      else if (TREE_CODE (elt) == FIXED_CST)

> -       RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),

> -                                                        inner);

> -      else

> -       RTVEC_ELT (v, i) = immed_wide_int_const (elt, inner);

> -    }

> +    RTVEC_ELT (v, i) = const_vector_element (GET_MODE_INNER (mode),

> +                                            VECTOR_CST_ELT (exp, i));

>

>    return gen_rtx_CONST_VECTOR (mode, v);

>  }

> Index: gcc/internal-fn.c

> ===================================================================

> --- gcc/internal-fn.c   2017-09-21 11:17:14.803201205 +0100

> +++ gcc/internal-fn.c   2017-09-25 12:03:06.713816932 +0100

> @@ -1911,12 +1911,12 @@ expand_vector_ubsan_overflow (location_t

>        emit_move_insn (cntvar, const0_rtx);

>        emit_label (loop_lab);

>      }

> -  if (TREE_CODE (arg0) != VECTOR_CST)

> +  if (!CONSTANT_CLASS_P (arg0))

>      {

>        rtx arg0r = expand_normal (arg0);

>        arg0 = make_tree (TREE_TYPE (arg0), arg0r);

>      }

> -  if (TREE_CODE (arg1) != VECTOR_CST)

> +  if (!CONSTANT_CLASS_P (arg1))

>      {

>        rtx arg1r = expand_normal (arg1);

>        arg1 = make_tree (TREE_TYPE (arg1), arg1r);

> Index: gcc/tree-cfg.c

> ===================================================================

> --- gcc/tree-cfg.c      2017-09-13 18:03:48.394093241 +0100

> +++ gcc/tree-cfg.c      2017-09-25 12:03:06.721816434 +0100

> @@ -3803,6 +3803,17 @@ verify_gimple_assign_unary (gassign *stm

>      case CONJ_EXPR:

>        break;

>

> +    case VEC_DUPLICATE_EXPR:

> +      if (TREE_CODE (lhs_type) != VECTOR_TYPE

> +         || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))

> +       {

> +         error ("vec_duplicate should be from a scalar to a like vector");

> +         debug_generic_expr (lhs_type);

> +         debug_generic_expr (rhs1_type);

> +         return true;

> +       }

> +      return false;

> +

>      default:

>        gcc_unreachable ();

>      }

> @@ -4473,6 +4484,7 @@ verify_gimple_assign_single (gassign *st

>      case FIXED_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>        return res;

>

> Index: gcc/tree-inline.c

> ===================================================================

> --- gcc/tree-inline.c   2017-09-21 22:35:16.975368768 +0100

> +++ gcc/tree-inline.c   2017-09-25 12:03:06.726816123 +0100

> @@ -4002,6 +4002,7 @@ estimate_operator_cost (enum tree_code c

>      case VEC_PACK_FIX_TRUNC_EXPR:

>      case VEC_WIDEN_LSHIFT_HI_EXPR:

>      case VEC_WIDEN_LSHIFT_LO_EXPR:

> +    case VEC_DUPLICATE_EXPR:

>

>        return 1;

>
Richard Sandiford Sept. 25, 2017, 12:19 p.m. UTC | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Sep 25, 2017 at 1:08 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> SVE needs a way of broadcasting a scalar to a variable-length vector.
>> This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
>> fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
>> be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
>> equivalent of the existing rtl code VEC_DUPLICATE.
>>
>> Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
>> to mark constant nodes, but in response to last year's RFC, Richard B.
>> suggested it would be better to have separate codes for the constant
>> and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
>> as a normal unary operation and avoids the previous need for treating
>> it as a GIMPLE_SINGLE_RHS.
>>
>> It might make sense to use VEC_DUPLICATE_CST for all duplicated
>> vector constants, since it's a bit more compact than VECTOR_CST
>> in that case, and is potentially more efficient to process.  I don't
>> have any specific plans to do that though.  We'll need to keep both
>> types of constant around whatever happens.
>
> I think VEC_DUPLICATE_EXPR is a good thing to have.  Looking at the
> changelog you didn't patch build_vector_from_val to make use of either
> new tree code?  That would get you (quite) some testing coverage.


I didn't want to change the use of VECTOR_CST and CONSTRUCTOR for
fixed-length vectors since that would be another invasive change,
and wouldn't remove the need for supporting VECTOR_CST and CONSTRUCTOR
in all the places that currently handle them.  I think it would make sense
to do it only after the variable-length support has settled.

The SVE patches do make build_vector_from_val use these codes
for variable-length vectors:

  if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant (&nunits))
    {
      if (CONSTANT_CLASS_P (sc))
        return build_vec_duplicate_cst (vectype, sc);
      return fold_build1 (VEC_DUPLICATE_EXPR, vectype, sc);
    }

> Currently we require all elements of a VECTOR_CST to be present -- how
> difficult would it be to declare that if and only if just the first
> element is present then all following elements are the same as the
> last one?  That said, I'm looking for a loophole to not add the extra
> VEC_DUPLICATE_CST code ... eventually we can simply allow scalars in
> contexts where vectors are valid?  Like we do for shifts?  OTOH that'd
> be "implicitly typed constants" (depends on context) like CONST_INT,
> so probably not the way to go?


I don't think it would be as bad as CONST_INT, because at least it would
still have a type.  But the vast majority of code that sees an INTEGER_CST
is going to expect it to be a scalar integer.  I don't think trying to
reuse it for vectors would make things cleaner.

Note also that we need a VEC_SERIES_CST and VEC_SERIES_EXPR for linear
series.  Unlike VEC_DUPLICATE_CST, that's restricted to integer types,
to avoid awkward rounding questions with floats.
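
To make the "linear series" idea concrete, here is a standalone C sketch.
It is not GCC code: the `toy_*` names are hypothetical, and the base/step
layout (element I equals BASE + I * STEP) is an assumption made for
illustration, since the text above only says "linear series".  Unsigned
arithmetic is used so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a duplicated constant: every element equals ELT.  */
struct toy_dup { uint64_t elt; };

/* Toy model of a linear series: element I is BASE + I * STEP.
   The base/step layout is an assumption made for this sketch.  */
struct toy_series { uint64_t base; uint64_t step; };

/* Return element number I of series S.  */
static uint64_t
toy_series_elt (struct toy_series s, uint64_t i)
{
  return s.base + i * s.step;
}

/* series + duplicate: only the base changes, the step is kept.  */
static struct toy_series
toy_series_plus_dup (struct toy_series s, struct toy_dup d)
{
  struct toy_series res = { s.base + d.elt, s.step };
  return res;
}

/* series + series: bases and steps add independently.  */
static struct toy_series
toy_series_plus_series (struct toy_series a, struct toy_series b)
{
  struct toy_series res = { a.base + b.base, a.step + b.step };
  return res;
}
```

With integers the folded series agrees with adding the operands element
by element, whatever the element index.  With floating-point elements
the closed form and the element-wise sum could round differently, which
is presumably the kind of question the integer restriction avoids.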

> The ugly thing about the new codes is that where we had 3 cases when folding
> vector CONSTRUCTOR and VECTOR_CST, we now have 24 to cover...
> (if I didn't miscount).  This now really asks for some common iterator over
> elements of a vector (and VEC_DUPLICATE_{EXPR,CST} would just return the first
> elt all the time).  Note that using scalars instead of vectors reduces the
> combinatorial explosion a bit (scalar and scalar const can be handled
> the same).


One of the advantages of restricting the new codes to variable-length
vectors is that you never get combinations of the old and new codes.
So at the moment this adds only a single case for each fold.
With VEC_SERIES_CST we get 4 new cases for PLUS and MINUS, but
not for much else.
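
As a rough sketch of what "a single case for each fold" means, here is a
toy model of the duplicate-with-duplicate case: fold the two scalar
elements and, if that succeeds, duplicate the result.  All names below are
hypothetical stand-ins rather than GCC functions:

```cpp
#include <cassert>

// Toy stand-in for a duplicated vector constant: every element equals ELT.
struct toy_dup_cst
{
  long elt;
};

// Hypothetical scalar folder.  Returns false when the fold is not possible
// (here, division by zero or an unknown operator).
static bool
toy_fold_scalar (char op, long a, long b, long *res)
{
  switch (op)
    {
    case '+': *res = a + b; return true;
    case '-': *res = a - b; return true;
    case '/':
      if (b == 0)
        return false;
      *res = a / b;
      return true;
    default:
      return false;
    }
}

// The one new case per fold: both operands are duplicates, so fold the
// scalars once and duplicate the result.  A failed scalar fold is passed
// straight back to the caller.
static bool
toy_fold_dup (char op, toy_dup_cst a, toy_dup_cst b, toy_dup_cst *res)
{
  assert (res != nullptr);
  long sub;
  if (!toy_fold_scalar (op, a.elt, b.elt, &sub))
    return false;
  res->elt = sub;
  return true;
}
```

The scalar fold runs once regardless of how many elements the vector has
at run time.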

If we did extend the new codes to fixed-length vectors, I think we'd want
to hide them behind a common accessor that gives the value of element
number X, rather than operating directly on TREE_CODE, VECTOR_CST_ELT, etc.
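
A minimal sketch of such an accessor, using a toy model rather than real
trees.  The enum, struct and function names are invented for illustration,
and the choice to store the payload in a std::vector is just a convenience:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of the three encodings discussed in this thread.  None of
// these names exist in GCC.
enum class toy_kind { explicit_elts, duplicate, series };

struct toy_vec_cst
{
  toy_kind kind;
  // explicit_elts: every element; duplicate: { value };
  // series: { base, step }.
  std::vector<long> data;
};

// Return the value of element X without the caller inspecting KIND.
static long
toy_vec_elt (const toy_vec_cst &v, std::size_t x)
{
  switch (v.kind)
    {
    case toy_kind::explicit_elts:
      assert (x < v.data.size ());
      return v.data[x];
    case toy_kind::duplicate:
      return v.data[0];
    case toy_kind::series:
      return v.data[0] + static_cast<long> (x) * v.data[1];
    }
  assert (false && "unknown kind");
  return 0;
}

// A predicate written against the accessor needs no per-encoding cases.
static bool
toy_all_zero_p (const toy_vec_cst &v, std::size_t nelts)
{
  for (std::size_t i = 0; i < nelts; ++i)
    if (toy_vec_elt (v, i) != 0)
      return false;
  return true;
}
```

Code such as toy_all_zero_p then stays the same whichever encoding the
constant happens to use.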

Thanks,
Richard

>
> So ... I'd rather not have those if we can avoid it but I haven't
> fully thought out things as you can see from above.
>
> Richard.

Patch

Index: gcc/doc/generic.texi
===================================================================
--- gcc/doc/generic.texi	2017-09-04 08:29:12.853103383 +0100
+++ gcc/doc/generic.texi	2017-09-25 12:03:06.688818488 +0100
@@ -1036,6 +1036,7 @@  As this example indicates, the operands
 @tindex FIXED_CST
 @tindex COMPLEX_CST
 @tindex VECTOR_CST
+@tindex VEC_DUPLICATE_CST
 @tindex STRING_CST
 @findex TREE_STRING_LENGTH
 @findex TREE_STRING_POINTER
@@ -1089,6 +1090,14 @@  constant nodes.  Each individual constan
 double constant node.  The first operand is a @code{TREE_LIST} of the
 constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.
 
+@item VEC_DUPLICATE_CST
+These nodes represent a vector constant in which every element has the
+same scalar value.  At present only variable-length vectors use
+@code{VEC_DUPLICATE_CST}; constant-length vectors use @code{VECTOR_CST}
+instead.  The scalar element value is given by
+@code{VEC_DUPLICATE_CST_ELT} and has the same restrictions as the
+element of a @code{VECTOR_CST}.
+
 @item STRING_CST
 These nodes represent string-constants.  The @code{TREE_STRING_LENGTH}
 returns the length of the string, as an @code{int}.  The
@@ -1692,6 +1701,7 @@  a value from @code{enum annot_expr_kind}
 
 @node Vectors
 @subsection Vectors
+@tindex VEC_DUPLICATE_EXPR
 @tindex VEC_LSHIFT_EXPR
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
@@ -1703,9 +1713,14 @@  a value from @code{enum annot_expr_kind}
 @tindex VEC_PACK_TRUNC_EXPR
 @tindex VEC_PACK_SAT_EXPR
 @tindex VEC_PACK_FIX_TRUNC_EXPR
+@tindex VEC_COND_EXPR
 @tindex SAD_EXPR
 
 @table @code
+@item VEC_DUPLICATE_EXPR
+This node has a single operand and represents a vector in which every
+element is equal to that operand.
+
 @item VEC_LSHIFT_EXPR
 @itemx VEC_RSHIFT_EXPR
 These nodes represent whole vector left and right shifts, respectively.
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	2017-09-04 11:49:42.934500723 +0100
+++ gcc/doc/md.texi	2017-09-25 12:03:06.693818177 +0100
@@ -4888,6 +4888,17 @@  and operand 1 is parallel containing val
 the vector mode @var{m}, or a vector mode with the same element mode and
 smaller number of elements.
 
+@cindex @code{vec_duplicate@var{m}} instruction pattern
+@item @samp{vec_duplicate@var{m}}
+Initialize vector output operand 0 so that each element has the value given
+by scalar input operand 1.  The vector has mode @var{m} and the scalar has
+the mode appropriate for one element of @var{m}.
+
+This pattern only handles duplicates of non-constant inputs.  Constant
+vectors go through the @code{mov@var{m}} pattern instead.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
 @item @samp{vec_cmp@var{m}@var{n}}
 Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
Index: gcc/tree.def
===================================================================
--- gcc/tree.def	2017-07-27 10:37:56.369045398 +0100
+++ gcc/tree.def	2017-09-25 12:03:06.739815314 +0100
@@ -304,6 +304,10 @@  DEFTREECODE (COMPLEX_CST, "complex_cst",
 /* Contents are in VECTOR_CST_ELTS field.  */
 DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
 
+/* Represents a vector constant in which every element is equal to
+   VEC_DUPLICATE_CST_ELT.  */
+DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)
+
 /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */
 DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)
 
@@ -534,6 +538,9 @@  DEFTREECODE (TARGET_EXPR, "target_expr",
    1 and 2 are NULL.  The operands are then taken from the cfg edges. */
 DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)
 
+/* Represents a vector in which every element is equal to operand 0.  */
+DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)
+
 /* Vector conditional expression. It is like COND_EXPR, but with
    vector operands.
 
Index: gcc/tree-core.h
===================================================================
--- gcc/tree-core.h	2017-09-14 16:25:43.864400951 +0100
+++ gcc/tree-core.h	2017-09-25 12:03:06.723816310 +0100
@@ -975,7 +975,8 @@  struct GTY(()) tree_base {
     /* VEC length.  This field is only used with TREE_VEC.  */
     int length;
 
-    /* Number of elements.  This field is only used with VECTOR_CST.  */
+    /* Number of elements.  This field is only used with VECTOR_CST
+       and VEC_DUPLICATE_CST.  It is always 1 for VEC_DUPLICATE_CST.  */
     unsigned int nelts;
 
     /* SSA version number.  This field is only used with SSA_NAME.  */
@@ -1062,7 +1063,7 @@  struct GTY(()) tree_base {
    public_flag:
 
        TREE_OVERFLOW in
-           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST
+           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST, VEC_DUPLICATE_CST
 
        TREE_PUBLIC in
            VAR_DECL, FUNCTION_DECL
@@ -1329,7 +1330,7 @@  struct GTY(()) tree_complex {
 
 struct GTY(()) tree_vector {
   struct tree_typed typed;
-  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];
+  tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];
 };
 
 struct GTY(()) tree_identifier {
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	2017-09-14 16:45:44.200520742 +0100
+++ gcc/tree.h	2017-09-25 12:03:06.741815189 +0100
@@ -730,8 +730,8 @@  #define TREE_SYMBOL_REFERENCED(NODE) \
 #define TYPE_REF_CAN_ALIAS_ALL(NODE) \
   (PTR_OR_REF_CHECK (NODE)->base.static_flag)
 
-/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, or VECTOR_CST, this means
-   there was an overflow in folding.  */
+/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST or VEC_DUPLICATE_CST,
+   this means there was an overflow in folding.  */
 
 #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)
 
@@ -1030,6 +1030,10 @@  #define VECTOR_CST_NELTS(NODE) (VECTOR_C
 #define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)
 #define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])
 
+/* In a VEC_DUPLICATE_CST node.  */
+#define VEC_DUPLICATE_CST_ELT(NODE) \
+  (VEC_DUPLICATE_CST_CHECK (NODE)->vector.elts[0])
+
 /* Define fields and accessors for some special-purpose tree nodes.  */
 
 #define IDENTIFIER_LENGTH(NODE) \
@@ -4026,6 +4030,7 @@  extern tree build_int_cst (tree, HOST_WI
 extern tree build_int_cstu (tree type, unsigned HOST_WIDE_INT cst);
 extern tree build_int_cst_type (tree, HOST_WIDE_INT);
 extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
+extern tree build_vec_duplicate_cst (tree, tree CXX_MEM_STAT_INFO);
 extern tree build_vector (tree, vec<tree> CXX_MEM_STAT_INFO);
 extern tree build_vector_from_ctor (tree, vec<constructor_elt, va_gc> *);
 extern tree build_vector_from_val (tree, tree);
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	2017-09-21 12:06:40.939511360 +0100
+++ gcc/tree.c	2017-09-25 12:03:06.737815438 +0100
@@ -464,6 +464,7 @@  tree_node_structure_for_code (enum tree_
     case FIXED_CST:		return TS_FIXED_CST;
     case COMPLEX_CST:		return TS_COMPLEX;
     case VECTOR_CST:		return TS_VECTOR;
+    case VEC_DUPLICATE_CST:	return TS_VECTOR;
     case STRING_CST:		return TS_STRING;
       /* tcc_exceptional cases.  */
     case ERROR_MARK:		return TS_COMMON;
@@ -816,6 +817,7 @@  tree_code_size (enum tree_code code)
 	case FIXED_CST:		return sizeof (struct tree_fixed_cst);
 	case COMPLEX_CST:	return sizeof (struct tree_complex);
 	case VECTOR_CST:	return sizeof (struct tree_vector);
+	case VEC_DUPLICATE_CST:	return sizeof (struct tree_vector);
 	case STRING_CST:	gcc_unreachable ();
 	default:
 	  return lang_hooks.tree_size (code);
@@ -875,6 +877,9 @@  tree_size (const_tree node)
       return (sizeof (struct tree_vector)
 	      + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));
 
+    case VEC_DUPLICATE_CST:
+      return sizeof (struct tree_vector);
+
     case STRING_CST:
       return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) + 1;
 
@@ -1682,6 +1687,30 @@  cst_and_fits_in_hwi (const_tree x)
 	  && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));
 }
 
+/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.
+
+   Note that this function is only suitable for callers that specifically
+   need a VEC_DUPLICATE_CST node.  Use build_vector_from_val to duplicate
+   a general scalar into a general vector type.  */
+
+tree
+build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
+{
+  int length = sizeof (struct tree_vector);
+
+  record_node_allocation_statistics (VEC_DUPLICATE_CST, length);
+
+  tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
+
+  TREE_SET_CODE (t, VEC_DUPLICATE_CST);
+  TREE_TYPE (t) = type;
+  t->base.u.nelts = 1;
+  VEC_DUPLICATE_CST_ELT (t) = exp;
+  TREE_CONSTANT (t) = 1;
+
+  return t;
+}
+
 /* Build a newly constructed VECTOR_CST node of length LEN.  */
 
 tree
@@ -2343,6 +2372,8 @@  integer_zerop (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2369,6 +2400,8 @@  integer_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2407,6 +2440,9 @@  integer_all_onesp (const_tree expr)
       return 1;
     }
 
+  else if (TREE_CODE (expr) == VEC_DUPLICATE_CST)
+    return integer_all_onesp (VEC_DUPLICATE_CST_ELT (expr));
+
   else if (TREE_CODE (expr) != INTEGER_CST)
     return 0;
 
@@ -2462,7 +2498,7 @@  integer_nonzerop (const_tree expr)
 int
 integer_truep (const_tree expr)
 {
-  if (TREE_CODE (expr) == VECTOR_CST)
+  if (TREE_CODE (expr) == VECTOR_CST || TREE_CODE (expr) == VEC_DUPLICATE_CST)
     return integer_all_onesp (expr);
   return integer_onep (expr);
 }
@@ -2633,6 +2669,8 @@  real_zerop (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2661,6 +2699,8 @@  real_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2688,6 +2728,8 @@  real_minus_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_minus_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -7090,6 +7132,9 @@  add_expr (const_tree t, inchash::hash &h
 	  inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags);
 	return;
       }
+    case VEC_DUPLICATE_CST:
+      inchash::add_expr (VEC_DUPLICATE_CST_ELT (t), hstate);
+      return;
     case SSA_NAME:
       /* We can just compare by pointer.  */
       hstate.add_wide_int (SSA_NAME_VERSION (t));
@@ -10344,6 +10389,9 @@  initializer_zerop (const_tree init)
 	return true;
       }
 
+    case VEC_DUPLICATE_CST:
+      return initializer_zerop (VEC_DUPLICATE_CST_ELT (init));
+
     case CONSTRUCTOR:
       {
 	unsigned HOST_WIDE_INT idx;
@@ -10389,7 +10437,13 @@  uniform_vector_p (const_tree vec)
 
   gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));
 
-  if (TREE_CODE (vec) == VECTOR_CST)
+  if (TREE_CODE (vec) == VEC_DUPLICATE_CST)
+    return VEC_DUPLICATE_CST_ELT (vec);
+
+  else if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)
+    return TREE_OPERAND (vec, 0);
+
+  else if (TREE_CODE (vec) == VECTOR_CST)
     {
       first = VECTOR_CST_ELT (vec, 0);
       for (i = 1; i < VECTOR_CST_NELTS (vec); ++i)
@@ -11094,6 +11148,7 @@  #define WALK_SUBTREE_TAIL(NODE)				\
     case REAL_CST:
     case FIXED_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case BLOCK:
     case PLACEHOLDER_EXPR:
@@ -12380,6 +12435,12 @@  drop_tree_overflow (tree t)
 	    elt = drop_tree_overflow (elt);
 	}
     }
+  if (TREE_CODE (t) == VEC_DUPLICATE_CST)
+    {
+      tree *elt = &VEC_DUPLICATE_CST_ELT (t);
+      if (TREE_OVERFLOW (*elt))
+	*elt = drop_tree_overflow (*elt);
+    }
   return t;
 }
 
@@ -13797,6 +13858,92 @@  test_integer_constants ()
   ASSERT_EQ (type, TREE_TYPE (zero));
 }
 
+/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs
+   for integral type TYPE.  */
+
+static void
+test_vec_duplicate_predicates_int (tree type)
+{
+  tree vec_type = build_vector_type (type, 4);
+
+  tree zero = build_zero_cst (type);
+  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
+  ASSERT_TRUE (integer_zerop (vec_zero));
+  ASSERT_FALSE (integer_onep (vec_zero));
+  ASSERT_FALSE (integer_minus_onep (vec_zero));
+  ASSERT_FALSE (integer_all_onesp (vec_zero));
+  ASSERT_FALSE (integer_truep (vec_zero));
+  ASSERT_TRUE (initializer_zerop (vec_zero));
+
+  tree one = build_one_cst (type);
+  tree vec_one = build_vec_duplicate_cst (vec_type, one);
+  ASSERT_FALSE (integer_zerop (vec_one));
+  ASSERT_TRUE (integer_onep (vec_one));
+  ASSERT_FALSE (integer_minus_onep (vec_one));
+  ASSERT_FALSE (integer_all_onesp (vec_one));
+  ASSERT_FALSE (integer_truep (vec_one));
+  ASSERT_FALSE (initializer_zerop (vec_one));
+
+  tree minus_one = build_minus_one_cst (type);
+  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
+  ASSERT_FALSE (integer_zerop (vec_minus_one));
+  ASSERT_FALSE (integer_onep (vec_minus_one));
+  ASSERT_TRUE (integer_minus_onep (vec_minus_one));
+  ASSERT_TRUE (integer_all_onesp (vec_minus_one));
+  ASSERT_TRUE (integer_truep (vec_minus_one));
+  ASSERT_FALSE (initializer_zerop (vec_minus_one));
+
+  tree x = create_tmp_var_raw (type, "x");
+  tree vec_x = build1 (VEC_DUPLICATE_EXPR, vec_type, x);
+  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
+  ASSERT_EQ (uniform_vector_p (vec_one), one);
+  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
+  ASSERT_EQ (uniform_vector_p (vec_x), x);
+}
+
+/* Verify predicate handling of VEC_DUPLICATE_CSTs for floating-point
+   type TYPE.  */
+
+static void
+test_vec_duplicate_predicates_float (tree type)
+{
+  tree vec_type = build_vector_type (type, 4);
+
+  tree zero = build_zero_cst (type);
+  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
+  ASSERT_TRUE (real_zerop (vec_zero));
+  ASSERT_FALSE (real_onep (vec_zero));
+  ASSERT_FALSE (real_minus_onep (vec_zero));
+  ASSERT_TRUE (initializer_zerop (vec_zero));
+
+  tree one = build_one_cst (type);
+  tree vec_one = build_vec_duplicate_cst (vec_type, one);
+  ASSERT_FALSE (real_zerop (vec_one));
+  ASSERT_TRUE (real_onep (vec_one));
+  ASSERT_FALSE (real_minus_onep (vec_one));
+  ASSERT_FALSE (initializer_zerop (vec_one));
+
+  tree minus_one = build_minus_one_cst (type);
+  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
+  ASSERT_FALSE (real_zerop (vec_minus_one));
+  ASSERT_FALSE (real_onep (vec_minus_one));
+  ASSERT_TRUE (real_minus_onep (vec_minus_one));
+  ASSERT_FALSE (initializer_zerop (vec_minus_one));
+
+  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
+  ASSERT_EQ (uniform_vector_p (vec_one), one);
+  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
+}
+
+/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
+
+static void
+test_vec_duplicate_predicates ()
+{
+  test_vec_duplicate_predicates_int (integer_type_node);
+  test_vec_duplicate_predicates_float (float_type_node);
+}
+
 /* Verify identifiers.  */
 
 static void
@@ -13825,6 +13972,7 @@  test_labels ()
 tree_c_tests ()
 {
   test_integer_constants ();
+  test_vec_duplicate_predicates ();
   test_identifiers ();
   test_labels ();
 }
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	2017-09-14 16:25:43.861637270 +0100
+++ gcc/cfgexpand.c	2017-09-25 12:03:06.687818551 +0100
@@ -5049,6 +5049,8 @@  expand_debug_expr (tree exp)
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
     case VEC_PERM_EXPR:
+    case VEC_DUPLICATE_CST:
+    case VEC_DUPLICATE_EXPR:
       return NULL;
 
     /* Misc codes.  */
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c	2017-08-24 08:46:01.758139665 +0100
+++ gcc/tree-pretty-print.c	2017-09-25 12:03:06.728815998 +0100
@@ -1800,6 +1800,12 @@  dump_generic_node (pretty_printer *pp, t
       }
       break;
 
+    case VEC_DUPLICATE_CST:
+      pp_string (pp, "{ ");
+      dump_generic_node (pp, VEC_DUPLICATE_CST_ELT (node), spc, flags, false);
+      pp_string (pp, ", ... }");
+      break;
+
     case FUNCTION_TYPE:
     case METHOD_TYPE:
       dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);
@@ -3230,6 +3236,15 @@  dump_generic_node (pretty_printer *pp, t
       pp_string (pp, " > ");
       break;
 
+    case VEC_DUPLICATE_EXPR:
+      pp_space (pp);
+      for (str = get_tree_code_name (code); *str; str++)
+	pp_character (pp, TOUPPER (*str));
+      pp_string (pp, " < ");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, " > ");
+      break;
+
     case VEC_UNPACK_HI_EXPR:
       pp_string (pp, " VEC_UNPACK_HI_EXPR < ");
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	2017-09-21 11:53:16.380966799 +0100
+++ gcc/dwarf2out.c	2017-09-25 12:03:06.704817493 +0100
@@ -18862,6 +18862,7 @@  rtl_for_decl_init (tree init, tree type)
 	switch (TREE_CODE (init))
 	  {
 	  case VECTOR_CST:
+	  case VEC_DUPLICATE_CST:
 	    break;
 	  case CONSTRUCTOR:
 	    if (TREE_CONSTANT (init))
Index: gcc/gimple-expr.h
===================================================================
--- gcc/gimple-expr.h	2017-02-23 19:54:20.000000000 +0000
+++ gcc/gimple-expr.h	2017-09-25 12:03:06.708817243 +0100
@@ -134,6 +134,7 @@  is_gimple_constant (const_tree t)
     case FIXED_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
       return true;
 
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	2017-08-29 08:47:13.282917702 +0100
+++ gcc/gimplify.c	2017-09-25 12:03:06.711817057 +0100
@@ -11501,6 +11501,7 @@  gimplify_expr (tree *expr_p, gimple_seq
 	case STRING_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	  /* Drop the overflow flag on constants, we do not want
 	     that in the GIMPLE IL.  */
 	  if (TREE_OVERFLOW_P (*expr_p))
Index: gcc/graphite-isl-ast-to-gimple.c
===================================================================
--- gcc/graphite-isl-ast-to-gimple.c	2017-09-22 17:22:08.334305773 +0100
+++ gcc/graphite-isl-ast-to-gimple.c	2017-09-25 12:03:06.712816994 +0100
@@ -245,7 +245,8 @@  enum phi_node_kind
     return TREE_CODE (op) == INTEGER_CST
       || TREE_CODE (op) == REAL_CST
       || TREE_CODE (op) == COMPLEX_CST
-      || TREE_CODE (op) == VECTOR_CST;
+      || TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_CST;
   }
 
 private:
Index: gcc/graphite-scop-detection.c
===================================================================
--- gcc/graphite-scop-detection.c	2017-09-22 17:22:08.510305732 +0100
+++ gcc/graphite-scop-detection.c	2017-09-25 12:03:06.712816994 +0100
@@ -1447,6 +1447,7 @@  scan_tree_for_params (sese_info_p s, tre
     case REAL_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       break;
 
    default:
Index: gcc/ipa-icf-gimple.c
===================================================================
--- gcc/ipa-icf-gimple.c	2017-08-30 16:25:16.913251173 +0100
+++ gcc/ipa-icf-gimple.c	2017-09-25 12:03:06.714816870 +0100
@@ -333,6 +333,7 @@  func_checker::compare_cst_or_decl (tree
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case REAL_CST:
       {
@@ -528,6 +529,7 @@  func_checker::compare_operand (tree t1,
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case REAL_CST:
     case FUNCTION_DECL:
Index: gcc/ipa-icf.c
===================================================================
--- gcc/ipa-icf.c	2017-06-07 07:42:16.940073012 +0100
+++ gcc/ipa-icf.c	2017-09-25 12:03:06.715816808 +0100
@@ -1478,6 +1478,7 @@  sem_item::add_expr (const_tree exp, inch
     case STRING_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       inchash::add_expr (exp, hstate);
       break;
     case CONSTRUCTOR:
@@ -2030,6 +2031,9 @@  sem_variable::equals (tree t1, tree t2)
 
 	return 1;
       }
+    case VEC_DUPLICATE_CST:
+      return sem_variable::equals (VEC_DUPLICATE_CST_ELT (t1),
+				   VEC_DUPLICATE_CST_ELT (t2));
     case ARRAY_REF:
     case ARRAY_RANGE_REF:
       {
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	2017-09-21 11:17:14.827201204 +0100
+++ gcc/match.pd	2017-09-25 12:03:06.716816745 +0100
@@ -944,6 +944,9 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (match negate_expr_p
  VECTOR_CST
  (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))
+(match negate_expr_p
+ VEC_DUPLICATE_CST
+ (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))
 
 /* (-A) * (-B) -> A * B  */
 (simplify
Index: gcc/print-tree.c
===================================================================
--- gcc/print-tree.c	2017-08-21 10:42:05.815630531 +0100
+++ gcc/print-tree.c	2017-09-25 12:03:06.719816559 +0100
@@ -783,6 +783,10 @@  print_node (FILE *file, const char *pref
 	  }
 	  break;
 
+	case VEC_DUPLICATE_CST:
+	  print_node (file, "elt", VEC_DUPLICATE_CST_ELT (node), indent + 4);
+	  break;
+
 	case COMPLEX_CST:
 	  print_node (file, "real", TREE_REALPART (node), indent + 4);
 	  print_node (file, "imag", TREE_IMAGPART (node), indent + 4);
Index: gcc/tree-chkp.c
===================================================================
--- gcc/tree-chkp.c	2017-08-16 08:50:32.376422338 +0100
+++ gcc/tree-chkp.c	2017-09-25 12:03:06.722816372 +0100
@@ -3800,6 +3800,7 @@  chkp_find_bounds_1 (tree ptr, tree ptr_s
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       if (integer_zerop (ptr_src))
 	bounds = chkp_get_none_bounds ();
       else
Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c	2017-08-29 20:01:07.143372092 +0100
+++ gcc/tree-data-ref.c	2017-09-25 12:03:06.724816248 +0100
@@ -1223,6 +1223,7 @@  data_ref_compare_tree (tree t1, tree t2)
     case STRING_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       {
 	hashval_t h1 = iterative_hash_expr (t1, 0);
 	hashval_t h2 = iterative_hash_expr (t2, 0);
Index: gcc/tree-loop-distribution.c
===================================================================
--- gcc/tree-loop-distribution.c	2017-08-29 20:01:07.143372092 +0100
+++ gcc/tree-loop-distribution.c	2017-09-25 12:03:06.727816061 +0100
@@ -935,6 +935,9 @@  const_with_all_bytes_same (tree val)
           && CONSTRUCTOR_NELTS (val) == 0))
     return 0;
 
+  if (TREE_CODE (val) == VEC_DUPLICATE_CST)
+    return const_with_all_bytes_same (VEC_DUPLICATE_CST_ELT (val));
+
   if (real_zerop (val))
     {
       /* Only return 0 for +0.0, not for -0.0, which doesn't have
Index: gcc/tree-ssa-loop.c
===================================================================
--- gcc/tree-ssa-loop.c	2017-08-10 14:36:07.892477227 +0100
+++ gcc/tree-ssa-loop.c	2017-09-25 12:03:06.728815998 +0100
@@ -616,6 +616,7 @@  for_each_index (tree *addr_p, bool (*cbc
 	case STRING_CST:
 	case RESULT_DECL:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case COMPLEX_CST:
 	case INTEGER_CST:
 	case REAL_CST:
Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c	2017-09-13 18:03:48.390469882 +0100
+++ gcc/tree-ssa-pre.c	2017-09-25 12:03:06.729815936 +0100
@@ -2675,6 +2675,7 @@  create_component_ref_by_pieces_1 (basic_
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case REAL_CST:
     case CONSTRUCTOR:
     case VAR_DECL:
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c	2017-09-21 11:53:16.339540234 +0100
+++ gcc/tree-ssa-sccvn.c	2017-09-25 12:03:06.731815812 +0100
@@ -858,6 +858,7 @@  copy_reference_ops_from_ref (tree ref, v
 	case INTEGER_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case REAL_CST:
 	case FIXED_CST:
 	case CONSTRUCTOR:
@@ -1050,6 +1051,7 @@  ao_ref_init_from_vn_reference (ao_ref *r
 	case INTEGER_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case REAL_CST:
 	case CONSTRUCTOR:
 	case CONST_DECL:
Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c	2017-09-14 17:04:19.082694343 +0100
+++ gcc/tree-vect-generic.c	2017-09-25 12:03:06.731815812 +0100
@@ -1419,6 +1419,7 @@  lower_vec_perm (gimple_stmt_iterator *gs
 ssa_uniform_vector_p (tree op)
 {
   if (TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_CST
       || TREE_CODE (op) == CONSTRUCTOR)
     return uniform_vector_p (op);
   if (TREE_CODE (op) == SSA_NAME)
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	2017-09-22 17:43:06.658083770 +0100
+++ gcc/varasm.c	2017-09-25 12:03:06.743815065 +0100
@@ -3068,6 +3068,9 @@  const_hash_1 (const tree exp)
     CASE_CONVERT:
       return const_hash_1 (TREE_OPERAND (exp, 0)) * 7 + 2;
 
+    case VEC_DUPLICATE_CST:
+      return const_hash_1 (VEC_DUPLICATE_CST_ELT (exp)) * 7 + 3;
+
     default:
       /* A language specific constant. Just hash the code.  */
       return code;
@@ -3158,6 +3161,10 @@  compare_constant (const tree t1, const t
 	return 1;
       }
 
+    case VEC_DUPLICATE_CST:
+      return compare_constant (VEC_DUPLICATE_CST_ELT (t1),
+			       VEC_DUPLICATE_CST_ELT (t2));
+
     case CONSTRUCTOR:
       {
 	vec<constructor_elt, va_gc> *v1, *v2;
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	2017-09-14 17:04:19.080694343 +0100
+++ gcc/fold-const.c	2017-09-25 12:03:06.708817243 +0100
@@ -418,6 +418,9 @@  negate_expr_p (tree t)
 	return true;
       }
 
+    case VEC_DUPLICATE_CST:
+      return negate_expr_p (VEC_DUPLICATE_CST_ELT (t));
+
     case COMPLEX_EXPR:
       return negate_expr_p (TREE_OPERAND (t, 0))
 	     && negate_expr_p (TREE_OPERAND (t, 1));
@@ -577,6 +580,14 @@  fold_negate_expr_1 (location_t loc, tree
 	return build_vector (type, elts);
       }
 
+    case VEC_DUPLICATE_CST:
+      {
+	tree sub = fold_negate_expr (loc, VEC_DUPLICATE_CST_ELT (t));
+	if (!sub)
+	  return NULL_TREE;
+	return build_vector_from_val (type, sub);
+      }
+
     case COMPLEX_EXPR:
       if (negate_expr_p (t))
 	return fold_build2_loc (loc, COMPLEX_EXPR, type,
@@ -1433,6 +1444,16 @@  const_binop (enum tree_code code, tree a
       return build_vector (type, elts);
     }
 
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == VEC_DUPLICATE_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1),
+			      VEC_DUPLICATE_CST_ELT (arg2));
+      if (!sub)
+	return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
+
   /* Shifts allow a scalar offset for a vector.  */
   if (TREE_CODE (arg1) == VECTOR_CST
       && TREE_CODE (arg2) == INTEGER_CST)
@@ -1456,6 +1477,15 @@  const_binop (enum tree_code code, tree a
 
       return build_vector (type, elts);
     }
+
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == INTEGER_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1), arg2);
+      if (!sub)
+	return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
   return NULL_TREE;
 }
 
@@ -1649,6 +1679,13 @@  const_unop (enum tree_code code, tree ty
 	  if (i == count)
 	    return build_vector (type, elements);
 	}
+      else if (TREE_CODE (arg0) == VEC_DUPLICATE_CST)
+	{
+	  tree sub = const_unop (BIT_NOT_EXPR, TREE_TYPE (type),
+				 VEC_DUPLICATE_CST_ELT (arg0));
+	  if (sub)
+	    return build_vector_from_val (type, sub);
+	}
       break;
 
     case TRUTH_NOT_EXPR:
@@ -1734,6 +1771,11 @@  const_unop (enum tree_code code, tree ty
 	return res;
       }
 
+    case VEC_DUPLICATE_EXPR:
+      if (CONSTANT_CLASS_P (arg0))
+	return build_vector_from_val (type, arg0);
+      return NULL_TREE;
+
     default:
       break;
     }
@@ -2164,6 +2206,15 @@  fold_convert_const (enum tree_code code,
 	    }
 	  return build_vector (type, v);
 	}
+      if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+	  && (TYPE_VECTOR_SUBPARTS (type)
+	      == TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1))))
+	{
+	  tree sub = fold_convert_const (code, TREE_TYPE (type),
+					 VEC_DUPLICATE_CST_ELT (arg1));
+	  if (sub)
+	    return build_vector_from_val (type, sub);
+	}
     }
   return NULL_TREE;
 }
@@ -2950,6 +3001,10 @@  operand_equal_p (const_tree arg0, const_
 	  return 1;
 	}
 
+      case VEC_DUPLICATE_CST:
+	return operand_equal_p (VEC_DUPLICATE_CST_ELT (arg0),
+				VEC_DUPLICATE_CST_ELT (arg1), flags);
+
       case COMPLEX_CST:
 	return (operand_equal_p (TREE_REALPART (arg0), TREE_REALPART (arg1),
 				 flags)
@@ -7504,6 +7559,20 @@  can_native_encode_string_p (const_tree e
 static tree
 fold_view_convert_expr (tree type, tree expr)
 {
+  /* Recurse on duplicated vectors if the target type is also a vector
+     and if the elements line up.  */
+  tree expr_type = TREE_TYPE (expr);
+  if (TREE_CODE (expr) == VEC_DUPLICATE_CST
+      && VECTOR_TYPE_P (type)
+      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (expr_type)
+      && TYPE_SIZE (TREE_TYPE (type)) == TYPE_SIZE (TREE_TYPE (expr_type)))
+    {
+      tree sub = fold_view_convert_expr (TREE_TYPE (type),
+					 VEC_DUPLICATE_CST_ELT (expr));
+      if (sub)
+	return build_vector_from_val (type, sub);
+    }
+
   /* We support up to 512-bit values (for V8DFmode).  */
   unsigned char buffer[64];
   int len;
@@ -8903,6 +8972,15 @@  exact_inverse (tree type, tree cst)
 	return build_vector (type, elts);
       }
 
+    case VEC_DUPLICATE_CST:
+      {
+	tree sub = exact_inverse (TREE_TYPE (type),
+				  VEC_DUPLICATE_CST_ELT (cst));
+	if (!sub)
+	  return NULL_TREE;
+	return build_vector_from_val (type, sub);
+      }
+
     default:
       return NULL_TREE;
     }
@@ -12097,6 +12175,9 @@  fold_checksum_tree (const_tree expr, str
 	  for (i = 0; i < (int) VECTOR_CST_NELTS (expr); ++i)
 	    fold_checksum_tree (VECTOR_CST_ELT (expr, i), ctx, ht);
 	  break;
+	case VEC_DUPLICATE_CST:
+	  fold_checksum_tree (VEC_DUPLICATE_CST_ELT (expr), ctx, ht);
+	  break;
 	default:
 	  break;
 	}
@@ -14563,6 +14644,36 @@  test_vector_folding ()
   ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));
 }
 
+/* Verify folding of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
+
+static void
+test_vec_duplicate_folding ()
+{
+  tree type = build_vector_type (ssizetype, 4);
+  tree dup5 = build_vec_duplicate_cst (type, ssize_int (5));
+  tree dup3 = build_vec_duplicate_cst (type, ssize_int (3));
+
+  tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));
+
+  tree not_dup5 = fold_unary (BIT_NOT_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (not_dup5), ssize_int (-6));
+
+  tree dup5_plus_dup3 = fold_binary (PLUS_EXPR, type, dup5, dup3);
+  ASSERT_EQ (uniform_vector_p (dup5_plus_dup3), ssize_int (8));
+
+  tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));
+  ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));
+
+  tree size_vector = build_vector_type (sizetype, 4);
+  tree size_dup5 = fold_convert (size_vector, dup5);
+  ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));
+
+  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));
+  tree dup5_cst = build_vector_from_val (type, ssize_int (5));
+  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -14570,6 +14681,7 @@  fold_const_c_tests ()
 {
   test_arithmetic_folding ();
   test_vector_folding ();
+  test_vec_duplicate_folding ();
 }
 
 } // namespace selftest
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def	2017-08-10 14:36:07.448493264 +0100
+++ gcc/optabs.def	2017-09-25 12:03:06.718816621 +0100
@@ -364,3 +364,5 @@  OPTAB_D (atomic_xor_optab, "atomic_xor$I
 
 OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")
 OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
+
+OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
Index: gcc/optabs-tree.c
===================================================================
--- gcc/optabs-tree.c	2017-06-22 12:22:57.735313105 +0100
+++ gcc/optabs-tree.c	2017-09-25 12:03:06.716816745 +0100
@@ -210,6 +210,9 @@  optab_for_tree_code (enum tree_code code
       return TYPE_UNSIGNED (type) ?
 	vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
 
+    case VEC_DUPLICATE_EXPR:
+      return vec_duplicate_optab;
+
     default:
       break;
     }
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h	2017-06-30 12:50:37.492697279 +0100
+++ gcc/optabs.h	2017-09-25 12:03:06.719816559 +0100
@@ -181,6 +181,7 @@  extern rtx simplify_expand_binop (machin
 				  enum optab_methods methods);
 extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,
 				enum optab_methods);
+extern rtx expand_vector_broadcast (machine_mode, rtx);
 
 /* Generate code for a simple binary or unary operation.  "Simple" in
    this case means "can be unambiguously described by a (mode, code)
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	2017-09-23 10:28:11.672861860 +0100
+++ gcc/optabs.c	2017-09-25 12:03:06.718816621 +0100
@@ -367,7 +367,7 @@  force_expand_binop (machine_mode mode, o
    mode of OP must be the element mode of VMODE.  If OP is a constant,
    then the return value will be a constant.  */
 
-static rtx
+rtx
 expand_vector_broadcast (machine_mode vmode, rtx op)
 {
   enum insn_code icode;
@@ -385,6 +385,16 @@  expand_vector_broadcast (machine_mode vm
   if (CONSTANT_P (op))
     return gen_rtx_CONST_VECTOR (vmode, vec);
 
+  icode = optab_handler (vec_duplicate_optab, vmode);
+  if (icode != CODE_FOR_nothing)
+    {
+      struct expand_operand ops[2];
+      create_output_operand (&ops[0], NULL_RTX, vmode);
+      create_input_operand (&ops[1], op, GET_MODE (op));
+      expand_insn (icode, 2, ops);
+      return ops[0].value;
+    }
+
   /* ??? If the target doesn't have a vec_init, then we have no easy way
      of performing this operation.  Most of this sort of generic support
      is hidden away in the vector lowering support in gimple.  */
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2017-09-23 10:27:39.925846365 +0100
+++ gcc/expr.c	2017-09-25 12:03:06.705817430 +0100
@@ -6572,7 +6572,8 @@  store_constructor (tree exp, rtx target,
 	constructor_elt *ce;
 	int i;
 	int need_to_clear;
-	int icode = CODE_FOR_nothing;
+	insn_code icode = CODE_FOR_nothing;
+	tree elt;
 	tree elttype = TREE_TYPE (type);
 	int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));
 	machine_mode eltmode = TYPE_MODE (elttype);
@@ -6582,13 +6583,30 @@  store_constructor (tree exp, rtx target,
 	unsigned n_elts;
 	alias_set_type alias;
 	bool vec_vec_init_p = false;
+	machine_mode mode = GET_MODE (target);
 
 	gcc_assert (eltmode != BLKmode);
 
+	/* Try using vec_duplicate_optab for uniform vectors.  */
+	if (!TREE_SIDE_EFFECTS (exp)
+	    && VECTOR_MODE_P (mode)
+	    && eltmode == GET_MODE_INNER (mode)
+	    && ((icode = optab_handler (vec_duplicate_optab, mode))
+		!= CODE_FOR_nothing)
+	    && (elt = uniform_vector_p (exp)))
+	  {
+	    struct expand_operand ops[2];
+	    create_output_operand (&ops[0], target, mode);
+	    create_input_operand (&ops[1], expand_normal (elt), eltmode);
+	    expand_insn (icode, 2, ops);
+	    if (!rtx_equal_p (target, ops[0].value))
+	      emit_move_insn (target, ops[0].value);
+	    break;
+	  }
+
 	n_elts = TYPE_VECTOR_SUBPARTS (type);
-	if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
+	if (REG_P (target) && VECTOR_MODE_P (mode))
 	  {
-	    machine_mode mode = GET_MODE (target);
 	    machine_mode emode = eltmode;
 
 	    if (CONSTRUCTOR_NELTS (exp)
@@ -6600,7 +6618,7 @@  store_constructor (tree exp, rtx target,
 			    == n_elts);
 		emode = TYPE_MODE (etype);
 	      }
-	    icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
+	    icode = convert_optab_handler (vec_init_optab, mode, emode);
 	    if (icode != CODE_FOR_nothing)
 	      {
 		unsigned int i, n = n_elts;
@@ -6648,7 +6666,7 @@  store_constructor (tree exp, rtx target,
 	if (need_to_clear && size > 0 && !vector)
 	  {
 	    if (REG_P (target))
-	      emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+	      emit_move_insn (target, CONST0_RTX (mode));
 	    else
 	      clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
 	    cleared = 1;
@@ -6656,7 +6674,7 @@  store_constructor (tree exp, rtx target,
 
 	/* Inform later passes that the old value is dead.  */
 	if (!cleared && !vector && REG_P (target))
-	  emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+	  emit_move_insn (target, CONST0_RTX (mode));
 
         if (MEM_P (target))
 	  alias = MEM_ALIAS_SET (target);
@@ -6707,8 +6725,7 @@  store_constructor (tree exp, rtx target,
 
 	if (vector)
 	  emit_insn (GEN_FCN (icode) (target,
-				      gen_rtx_PARALLEL (GET_MODE (target),
-							vector)));
+				      gen_rtx_PARALLEL (mode, vector)));
 	break;
       }
 
@@ -7683,6 +7700,19 @@  expand_operands (tree exp0, tree exp1, r
 }
 
 
+/* Expand constant vector element ELT, which has mode MODE.  This is used
+   for members of VECTOR_CST and VEC_DUPLICATE_CST.  */
+
+static rtx
+const_vector_element (scalar_mode mode, const_tree elt)
+{
+  if (TREE_CODE (elt) == REAL_CST)
+    return const_double_from_real_value (TREE_REAL_CST (elt), mode);
+  if (TREE_CODE (elt) == FIXED_CST)
+    return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);
+  return immed_wide_int_const (elt, mode);
+}
+
 /* Return a MEM that contains constant EXP.  DEFER is as for
    output_constant_def and MODIFIER is as for expand_expr.  */
 
@@ -9548,6 +9578,12 @@  #define REDUCE_BIT_FIELD(expr)	(reduce_b
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case VEC_DUPLICATE_EXPR:
+      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
+      target = expand_vector_broadcast (mode, op0);
+      gcc_assert (target);
+      return target;
+
     case BIT_INSERT_EXPR:
       {
 	unsigned bitpos = tree_to_uhwi (treeop2);
@@ -9981,6 +10017,11 @@  expand_expr_real_1 (tree exp, rtx target
 			    tmode, modifier);
       }
 
+    case VEC_DUPLICATE_CST:
+      op0 = const_vector_element (GET_MODE_INNER (mode),
+				  VEC_DUPLICATE_CST_ELT (exp));
+      return gen_const_vec_duplicate (mode, op0);
+
     case CONST_DECL:
       if (modifier == EXPAND_WRITE)
 	{
@@ -11742,8 +11783,7 @@  const_vector_from_tree (tree exp)
 {
   rtvec v;
   unsigned i, units;
-  tree elt;
-  machine_mode inner, mode;
+  machine_mode mode;
 
   mode = TYPE_MODE (TREE_TYPE (exp));
 
@@ -11754,23 +11794,12 @@  const_vector_from_tree (tree exp)
     return const_vector_mask_from_tree (exp);
 
   units = VECTOR_CST_NELTS (exp);
-  inner = GET_MODE_INNER (mode);
 
   v = rtvec_alloc (units);
 
   for (i = 0; i < units; ++i)
-    {
-      elt = VECTOR_CST_ELT (exp, i);
-
-      if (TREE_CODE (elt) == REAL_CST)
-	RTVEC_ELT (v, i) = const_double_from_real_value (TREE_REAL_CST (elt),
-							 inner);
-      else if (TREE_CODE (elt) == FIXED_CST)
-	RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
-							 inner);
-      else
-	RTVEC_ELT (v, i) = immed_wide_int_const (elt, inner);
-    }
+    RTVEC_ELT (v, i) = const_vector_element (GET_MODE_INNER (mode),
+					     VECTOR_CST_ELT (exp, i));
 
   return gen_rtx_CONST_VECTOR (mode, v);
 }
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c	2017-09-21 11:17:14.803201205 +0100
+++ gcc/internal-fn.c	2017-09-25 12:03:06.713816932 +0100
@@ -1911,12 +1911,12 @@  expand_vector_ubsan_overflow (location_t
       emit_move_insn (cntvar, const0_rtx);
       emit_label (loop_lab);
     }
-  if (TREE_CODE (arg0) != VECTOR_CST)
+  if (!CONSTANT_CLASS_P (arg0))
     {
       rtx arg0r = expand_normal (arg0);
       arg0 = make_tree (TREE_TYPE (arg0), arg0r);
     }
-  if (TREE_CODE (arg1) != VECTOR_CST)
+  if (!CONSTANT_CLASS_P (arg1))
     {
       rtx arg1r = expand_normal (arg1);
       arg1 = make_tree (TREE_TYPE (arg1), arg1r);
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c	2017-09-13 18:03:48.394093241 +0100
+++ gcc/tree-cfg.c	2017-09-25 12:03:06.721816434 +0100
@@ -3803,6 +3803,17 @@  verify_gimple_assign_unary (gassign *stm
     case CONJ_EXPR:
       break;
 
+    case VEC_DUPLICATE_EXPR:
+      if (TREE_CODE (lhs_type) != VECTOR_TYPE
+	  || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))
+	{
+	  error ("vec_duplicate should be from a scalar to a like vector");
+	  debug_generic_expr (lhs_type);
+	  debug_generic_expr (rhs1_type);
+	  return true;
+	}
+      return false;
+
     default:
       gcc_unreachable ();
     }
@@ -4473,6 +4484,7 @@  verify_gimple_assign_single (gassign *st
     case FIXED_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
       return res;
 
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c	2017-09-21 22:35:16.975368768 +0100
+++ gcc/tree-inline.c	2017-09-25 12:03:06.726816123 +0100
@@ -4002,6 +4002,7 @@  estimate_operator_cost (enum tree_code c
     case VEC_PACK_FIX_TRUNC_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
+    case VEC_DUPLICATE_EXPR:
 
       return 1;