[05/nn] Add VEC_DUPLICATE_{CST,EXPR} and associated optab

Message ID 87bmkyxg9d.fsf@linaro.org
State New

Commit Message

Richard Sandiford Oct. 23, 2017, 11:20 a.m. UTC
SVE needs a way of broadcasting a scalar to a variable-length vector.
This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
equivalent of the existing rtl code VEC_DUPLICATE.
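
As a rough illustration of what "broadcasting a scalar to a
variable-length vector" means, here is a minimal standalone sketch.
It is not GCC code: struct dup_vector and both helpers are hypothetical
names invented for this example.  The point is only that a duplicated
vector can be described by one scalar, with the lane count supplied
later, once it is known.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a duplicated vector: one scalar, repeated for however
   many lanes the hardware turns out to have.  Hypothetical type, not a
   GCC data structure.  */
struct dup_vector {
  int elt;
};

/* Every lane of a duplicated vector holds the same scalar, so the lane
   index does not affect the result.  */
static int
dup_vector_lane (const struct dup_vector *v, size_t lane)
{
  (void) lane;
  return v->elt;
}

/* Write the first NLANES lanes of V to OUT.  NLANES is only needed here,
   at the point where the runtime length is finally known.  */
static void
dup_vector_expand (const struct dup_vector *v, int *out, size_t nlanes)
{
  assert (out != NULL || nlanes == 0);
  for (size_t i = 0; i < nlanes; ++i)
    out[i] = dup_vector_lane (v, i);
}
```

The description itself never records a length, which is why the same
shape works for vectors whose size is not a compile-time constant.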

Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
to mark constant nodes, but in response to last year's RFC, Richard B.
suggested it would be better to have separate codes for the constant
and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
as a normal unary operation and avoids the previous need for treating
it as a GIMPLE_SINGLE_RHS.
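
The split between a constant code and a unary-expression code can be
sketched in isolation as follows.  This is a toy model under stated
assumptions, not the GCC implementation: enum node_kind, struct node,
node_is_constant and fold_duplicate are all hypothetical names, and the
folding rule only mirrors the ChangeLog note that const_unop folds
VEC_DUPLICATE_EXPRs of a constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds, loosely named after the tree codes in the
   patch.  */
enum node_kind { SCALAR_CST, SCALAR_VAR, DUP_CST, DUP_EXPR };

struct node {
  enum node_kind kind;
  int value;                /* Used by SCALAR_CST only.  */
  const struct node *op0;   /* Operand of DUP_CST and DUP_EXPR.  */
};

/* Constants are recognised from the code alone, with no separate flag.  */
static bool
node_is_constant (const struct node *n)
{
  return n->kind == SCALAR_CST || n->kind == DUP_CST;
}

/* Fold a duplicate of scalar OP into *RESULT: a constant operand gives
   the constant code, anything else gives the ordinary unary expression.  */
static void
fold_duplicate (const struct node *op, struct node *result)
{
  assert (op->kind == SCALAR_CST || op->kind == SCALAR_VAR);
  result->kind = node_is_constant (op) ? DUP_CST : DUP_EXPR;
  result->value = 0;
  result->op0 = op;
}
```

With two codes, generic code that walks unary operations never has to
ask whether a duplicate node is "really" a constant.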

It might make sense to use VEC_DUPLICATE_CST for all duplicated
vector constants, since it's a bit more compact than VECTOR_CST
in that case, and is potentially more efficient to process.
However, the nice thing about keeping it restricted to variable-length
vectors is that there is then no need to handle combinations of
VECTOR_CST and VEC_DUPLICATE_CST; a vector type will always use
VECTOR_CST or never use it.
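
The invariant in the last sentence can be sketched like this.  Again
this is a standalone model rather than GCC code; struct vec_type,
struct vec_cst, repr_for_type, build_uniform_cst and fold_add are
hypothetical names, and the model only covers uniform constants for
brevity.

```c
#include <assert.h>
#include <stdbool.h>

enum cst_repr { REPR_VECTOR_CST, REPR_VEC_DUPLICATE_CST };

/* Hypothetical vector type: either a fixed lane count or a runtime one.  */
struct vec_type {
  bool variable_length;
  unsigned nelts;           /* Meaningful only when !variable_length.  */
};

/* The representation is a function of the type alone.  */
static enum cst_repr
repr_for_type (const struct vec_type *type)
{
  return type->variable_length ? REPR_VEC_DUPLICATE_CST : REPR_VECTOR_CST;
}

/* A uniform vector constant; ELT is the value of every lane.  */
struct vec_cst {
  const struct vec_type *type;
  enum cst_repr repr;
  int elt;
};

static struct vec_cst
build_uniform_cst (const struct vec_type *type, int elt)
{
  struct vec_cst c = { type, repr_for_type (type), elt };
  return c;
}

/* Both operands of a vector addition share a type, so folding only ever
   sees matching representations and needs no mixed-case handling.  */
static struct vec_cst
fold_add (struct vec_cst a, struct vec_cst b)
{
  assert (a.type == b.type);
  assert (a.repr == b.repr);
  return build_uniform_cst (a.type, a.elt + b.elt);
}
```

If duplicated constants of fixed-length types also used the compact
form, fold_add would need extra cases for each pairing of
representations.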

The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.


2017-10-23  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hawyard@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
	(VEC_COND_EXPR): Add missing @tindex.
	* doc/md.texi (vec_duplicate@var{m}): Document.
	* tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
	* tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
	are used for VEC_DUPLICATE_CST as well.
	(tree_vector): Access base.u.nelts directly.
	* tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of
	valid codes.
	(VEC_DUPLICATE_CST_ELT): New macro.
	(build_vec_duplicate_cst): Declare.
	* tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
	(integer_zerop, integer_onep, integer_all_onesp, integer_truep)
	(real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
	(walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
	(build_vec_duplicate_cst): New function.
	(uniform_vector_p): Handle the new codes.
	(test_vec_duplicate_predicates_int): New function.
	(test_vec_duplicate_predicates_float): Likewise.
	(test_vec_duplicate_predicates): Likewise.
	(tree_c_tests): Call test_vec_duplicate_predicates.
	* cfgexpand.c (expand_debug_expr): Handle the new codes.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
	* gimple-expr.h (is_gimple_constant): Likewise.
	* gimplify.c (gimplify_expr): Likewise.
	* graphite-isl-ast-to-gimple.c
	(translate_isl_ast_to_gimple::is_constant): Likewise.
	* graphite-scop-detection.c (scan_tree_for_params): Likewise.
	* ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
	(func_checker::compare_operand): Likewise.
	* ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
	* match.pd (negate_expr_p): Likewise.
	* print-tree.c (print_node): Likewise.
	* tree-chkp.c (chkp_find_bounds_1): Likewise.
	* tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
	* tree-ssa-loop.c (for_each_index): Likewise.
	* tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
	* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
	(ao_ref_init_from_vn_reference): Likewise.
	* tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
	* varasm.c (const_hash_1, compare_constant): Likewise.
	* fold-const.c (negate_expr_p, fold_negate_expr_1, const_binop)
	(fold_convert_const, operand_equal_p, fold_view_convert_expr)
	(exact_inverse, fold_checksum_tree): Likewise.
	(const_unop): Likewise.  Fold VEC_DUPLICATE_EXPRs of a constant.
	(test_vec_duplicate_folding): New function.
	(fold_const_c_tests): Call it.
	* optabs.def (vec_duplicate_optab): New optab.
	* optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
	* optabs.h (expand_vector_broadcast): Declare.
	* optabs.c (expand_vector_broadcast): Make non-static.  Try using
	vec_duplicate_optab.
	* expr.c (store_constructor): Try using vec_duplicate_optab for
	uniform vectors.
	(const_vector_element): New function, split out from...
	(const_vector_from_tree): ...here.
	(expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
	(expand_expr_real_1): Handle VEC_DUPLICATE_CST.
	* internal-fn.c (expand_vector_ubsan_overflow): Use CONSTANT_P
	instead of checking for VECTOR_CST.
	* tree-cfg.c (verify_gimple_assign_unary): Handle VEC_DUPLICATE_EXPR.
	(verify_gimple_assign_single): Handle VEC_DUPLICATE_CST.
	* tree-inline.c (estimate_operator_cost): Handle VEC_DUPLICATE_EXPR.

Comments

Richard Biener Oct. 26, 2017, 11:48 a.m. UTC | #1
On Mon, Oct 23, 2017 at 1:20 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> SVE needs a way of broadcasting a scalar to a variable-length vector.
> This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
> fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
> be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
> equivalent of the existing rtl code VEC_DUPLICATE.
>
> Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
> to mark constant nodes, but in response to last year's RFC, Richard B.
> suggested it would be better to have separate codes for the constant
> and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
> as a normal unary operation and avoids the previous need for treating
> it as a GIMPLE_SINGLE_RHS.
>
> It might make sense to use VEC_DUPLICATE_CST for all duplicated
> vector constants, since it's a bit more compact than VECTOR_CST
> in that case, and is potentially more efficient to process.
> However, the nice thing about keeping it restricted to variable-length
> vectors is that there is then no need to handle combinations of
> VECTOR_CST and VEC_DUPLICATE_CST; a vector type will always use
> VECTOR_CST or never use it.
>
> The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.


> Index: gcc/tree-vect-generic.c
> ===================================================================
> --- gcc/tree-vect-generic.c     2017-10-23 11:38:53.934094740 +0100
> +++ gcc/tree-vect-generic.c     2017-10-23 11:41:51.773953100 +0100
> @@ -1419,6 +1419,7 @@ lower_vec_perm (gimple_stmt_iterator *gs
>  ssa_uniform_vector_p (tree op)
>  {
>    if (TREE_CODE (op) == VECTOR_CST
> +      || TREE_CODE (op) == VEC_DUPLICATE_CST
>        || TREE_CODE (op) == CONSTRUCTOR)
>      return uniform_vector_p (op);

VEC_DUPLICATE_EXPR handling?  Looks like for VEC_DUPLICATE_CST
it could directly return true.

I didn't see uniform_vector_p being updated?

Can you add verification to either verify_expr or build_vec_duplicate_cst
that the type is a variable-length one, and amend the tree.def docs
accordingly?  Otherwise we miss a lot of cases in constant folding
(those mixing VEC_DUPLICATE_CST and VECTOR_CST).
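
A check of the kind requested could look roughly like the sketch below.
It is a standalone model, not a proposed GCC change: struct vec_type,
struct dup_cst, dup_cst_type_ok and build_dup_cst are hypothetical
names, and returning false stands in for whatever a real verifier or
builder would do (most likely an assertion failure).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vector type; only the property being checked is modelled.  */
struct vec_type {
  bool variable_length;
};

/* Hypothetical duplicated constant.  */
struct dup_cst {
  const struct vec_type *type;
  int elt;
};

/* The property to verify: duplicated constants are only valid for
   variable-length vector types.  */
static bool
dup_cst_type_ok (const struct vec_type *type)
{
  return type != NULL && type->variable_length;
}

/* Initialise *OUT and return true, or return false and leave *OUT
   untouched when TYPE is not a variable-length type.  */
static bool
build_dup_cst (struct dup_cst *out, const struct vec_type *type, int elt)
{
  assert (out != NULL);
  if (!dup_cst_type_ok (type))
    return false;
  out->type = type;
  out->elt = elt;
  return true;
}
```

Putting the check in the builder catches bad nodes at creation time,
whereas a check in a verifier would catch nodes built by any route.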

Otherwise looks ok to me.

Thanks,
Richard.

>
> 2017-10-23  Richard Sandiford  <richard.sandiford@linaro.org>
>             Alan Hayward  <alan.hawyard@arm.com>
>             David Sherwood  <david.sherwood@arm.com>
>
> gcc/
>         * doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
>         (VEC_COND_EXPR): Add missing @tindex.
>         * doc/md.texi (vec_duplicate@var{m}): Document.
>         * tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
>         * tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
>         are used for VEC_DUPLICATE_CST as well.
>         (tree_vector): Access base.u.nelts directly.
>         * tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of
>         valid codes.
>         (VEC_DUPLICATE_CST_ELT): New macro.
>         (build_vec_duplicate_cst): Declare.
>         * tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
>         (integer_zerop, integer_onep, integer_all_onesp, integer_truep)
>         (real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
>         (walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
>         (build_vec_duplicate_cst): New function.
>         (uniform_vector_p): Handle the new codes.
>         (test_vec_duplicate_predicates_int): New function.
>         (test_vec_duplicate_predicates_float): Likewise.
>         (test_vec_duplicate_predicates): Likewise.
>         (tree_c_tests): Call test_vec_duplicate_predicates.
>         * cfgexpand.c (expand_debug_expr): Handle the new codes.
>         * tree-pretty-print.c (dump_generic_node): Likewise.
>         * dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
>         * gimple-expr.h (is_gimple_constant): Likewise.
>         * gimplify.c (gimplify_expr): Likewise.
>         * graphite-isl-ast-to-gimple.c
>         (translate_isl_ast_to_gimple::is_constant): Likewise.
>         * graphite-scop-detection.c (scan_tree_for_params): Likewise.
>         * ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
>         (func_checker::compare_operand): Likewise.
>         * ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
>         * match.pd (negate_expr_p): Likewise.
>         * print-tree.c (print_node): Likewise.
>         * tree-chkp.c (chkp_find_bounds_1): Likewise.
>         * tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
>         * tree-ssa-loop.c (for_each_index): Likewise.
>         * tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
>         * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
>         (ao_ref_init_from_vn_reference): Likewise.
>         * tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
>         * varasm.c (const_hash_1, compare_constant): Likewise.
>         * fold-const.c (negate_expr_p, fold_negate_expr_1, const_binop)
>         (fold_convert_const, operand_equal_p, fold_view_convert_expr)
>         (exact_inverse, fold_checksum_tree): Likewise.
>         (const_unop): Likewise.  Fold VEC_DUPLICATE_EXPRs of a constant.
>         (test_vec_duplicate_folding): New function.
>         (fold_const_c_tests): Call it.
>         * optabs.def (vec_duplicate_optab): New optab.
>         * optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
>         * optabs.h (expand_vector_broadcast): Declare.
>         * optabs.c (expand_vector_broadcast): Make non-static.  Try using
>         vec_duplicate_optab.
>         * expr.c (store_constructor): Try using vec_duplicate_optab for
>         uniform vectors.
>         (const_vector_element): New function, split out from...
>         (const_vector_from_tree): ...here.
>         (expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
>         (expand_expr_real_1): Handle VEC_DUPLICATE_CST.
>         * internal-fn.c (expand_vector_ubsan_overflow): Use CONSTANT_P
>         instead of checking for VECTOR_CST.
>         * tree-cfg.c (verify_gimple_assign_unary): Handle VEC_DUPLICATE_EXPR.
>         (verify_gimple_assign_single): Handle VEC_DUPLICATE_CST.
>         * tree-inline.c (estimate_operator_cost): Handle VEC_DUPLICATE_EXPR.
>
> Index: gcc/doc/generic.texi
> ===================================================================
> --- gcc/doc/generic.texi        2017-10-23 11:38:53.934094740 +0100
> +++ gcc/doc/generic.texi        2017-10-23 11:41:51.760448406 +0100
> @@ -1036,6 +1036,7 @@ As this example indicates, the operands
>  @tindex FIXED_CST
>  @tindex COMPLEX_CST
>  @tindex VECTOR_CST
> +@tindex VEC_DUPLICATE_CST
>  @tindex STRING_CST
>  @findex TREE_STRING_LENGTH
>  @findex TREE_STRING_POINTER
> @@ -1089,6 +1090,14 @@ constant nodes.  Each individual constan
>  double constant node.  The first operand is a @code{TREE_LIST} of the
>  constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.
>
> +@item VEC_DUPLICATE_CST
> +These nodes represent a vector constant in which every element has the
> +same scalar value.  At present only variable-length vectors use
> +@code{VEC_DUPLICATE_CST}; constant-length vectors use @code{VECTOR_CST}
> +instead.  The scalar element value is given by
> +@code{VEC_DUPLICATE_CST_ELT} and has the same restrictions as the
> +element of a @code{VECTOR_CST}.
> +
>  @item STRING_CST
>  These nodes represent string-constants.  The @code{TREE_STRING_LENGTH}
>  returns the length of the string, as an @code{int}.  The
> @@ -1692,6 +1701,7 @@ a value from @code{enum annot_expr_kind}
>
>  @node Vectors
>  @subsection Vectors
> +@tindex VEC_DUPLICATE_EXPR
>  @tindex VEC_LSHIFT_EXPR
>  @tindex VEC_RSHIFT_EXPR
>  @tindex VEC_WIDEN_MULT_HI_EXPR
> @@ -1703,9 +1713,14 @@ a value from @code{enum annot_expr_kind}
>  @tindex VEC_PACK_TRUNC_EXPR
>  @tindex VEC_PACK_SAT_EXPR
>  @tindex VEC_PACK_FIX_TRUNC_EXPR
> +@tindex VEC_COND_EXPR
>  @tindex SAD_EXPR
>
>  @table @code
> +@item VEC_DUPLICATE_EXPR
> +This node has a single operand and represents a vector in which every
> +element is equal to that operand.
> +
>  @item VEC_LSHIFT_EXPR
>  @itemx VEC_RSHIFT_EXPR
>  These nodes represent whole vector left and right shifts, respectively.
> Index: gcc/doc/md.texi
> ===================================================================
> --- gcc/doc/md.texi     2017-10-23 11:41:22.189466342 +0100
> +++ gcc/doc/md.texi     2017-10-23 11:41:51.761413027 +0100
> @@ -4888,6 +4888,17 @@ and operand 1 is parallel containing val
>  the vector mode @var{m}, or a vector mode with the same element mode and
>  smaller number of elements.
>
> +@cindex @code{vec_duplicate@var{m}} instruction pattern
> +@item @samp{vec_duplicate@var{m}}
> +Initialize vector output operand 0 so that each element has the value given
> +by scalar input operand 1.  The vector has mode @var{m} and the scalar has
> +the mode appropriate for one element of @var{m}.
> +
> +This pattern only handles duplicates of non-constant inputs.  Constant
> +vectors go through the @code{mov@var{m}} pattern instead.
> +
> +This pattern is not allowed to @code{FAIL}.
> +
>  @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
>  @item @samp{vec_cmp@var{m}@var{n}}
>  Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
> Index: gcc/tree.def
> ===================================================================
> --- gcc/tree.def        2017-10-23 11:38:53.934094740 +0100
> +++ gcc/tree.def        2017-10-23 11:41:51.774917721 +0100
> @@ -304,6 +304,10 @@ DEFTREECODE (COMPLEX_CST, "complex_cst",
>  /* Contents are in VECTOR_CST_ELTS field.  */
>  DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
>
> +/* Represents a vector constant in which every element is equal to
> +   VEC_DUPLICATE_CST_ELT.  */
> +DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)
> +
>  /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */
>  DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)
>
> @@ -534,6 +538,9 @@ DEFTREECODE (TARGET_EXPR, "target_expr",
>     1 and 2 are NULL.  The operands are then taken from the cfg edges. */
>  DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)
>
> +/* Represents a vector in which every element is equal to operand 0.  */
> +DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)
> +
>  /* Vector conditional expression. It is like COND_EXPR, but with
>     vector operands.
>
> Index: gcc/tree-core.h
> ===================================================================
> --- gcc/tree-core.h     2017-10-23 11:41:25.862065318 +0100
> +++ gcc/tree-core.h     2017-10-23 11:41:51.771059237 +0100
> @@ -975,7 +975,8 @@ struct GTY(()) tree_base {
>      /* VEC length.  This field is only used with TREE_VEC.  */
>      int length;
>
> -    /* Number of elements.  This field is only used with VECTOR_CST.  */
> +    /* Number of elements.  This field is only used with VECTOR_CST
> +       and VEC_DUPLICATE_CST.  It is always 1 for VEC_DUPLICATE_CST.  */
>      unsigned int nelts;
>
>      /* SSA version number.  This field is only used with SSA_NAME.  */
> @@ -1065,7 +1066,7 @@ struct GTY(()) tree_base {
>     public_flag:
>
>         TREE_OVERFLOW in
> -           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST
> +           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST, VEC_DUPLICATE_CST
>
>         TREE_PUBLIC in
>             VAR_DECL, FUNCTION_DECL
> @@ -1332,7 +1333,7 @@ struct GTY(()) tree_complex {
>
>  struct GTY(()) tree_vector {
>    struct tree_typed typed;
> -  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];
> +  tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];
>  };
>
>  struct GTY(()) tree_identifier {
> Index: gcc/tree.h
> ===================================================================
> --- gcc/tree.h  2017-10-23 11:41:23.517482774 +0100
> +++ gcc/tree.h  2017-10-23 11:41:51.775882341 +0100
> @@ -730,8 +730,8 @@ #define TREE_SYMBOL_REFERENCED(NODE) \
>  #define TYPE_REF_CAN_ALIAS_ALL(NODE) \
>    (PTR_OR_REF_CHECK (NODE)->base.static_flag)
>
> -/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, or VECTOR_CST, this means
> -   there was an overflow in folding.  */
> +/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST or VEC_DUPLICATE_CST,
> +   this means there was an overflow in folding.  */
>
>  #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)
>
> @@ -1030,6 +1030,10 @@ #define VECTOR_CST_NELTS(NODE) (VECTOR_C
>  #define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)
>  #define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])
>
> +/* In a VEC_DUPLICATE_CST node.  */
> +#define VEC_DUPLICATE_CST_ELT(NODE) \
> +  (VEC_DUPLICATE_CST_CHECK (NODE)->vector.elts[0])
> +
>  /* Define fields and accessors for some special-purpose tree nodes.  */
>
>  #define IDENTIFIER_LENGTH(NODE) \
> @@ -4025,6 +4029,7 @@ extern tree build_int_cst (tree, HOST_WI
>  extern tree build_int_cstu (tree type, unsigned HOST_WIDE_INT cst);
>  extern tree build_int_cst_type (tree, HOST_WIDE_INT);
>  extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
> +extern tree build_vec_duplicate_cst (tree, tree CXX_MEM_STAT_INFO);
>  extern tree build_vector (tree, vec<tree> CXX_MEM_STAT_INFO);
>  extern tree build_vector_from_ctor (tree, vec<constructor_elt, va_gc> *);
>  extern tree build_vector_from_val (tree, tree);
> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c  2017-10-23 11:41:23.515548300 +0100
> +++ gcc/tree.c  2017-10-23 11:41:51.774917721 +0100
> @@ -464,6 +464,7 @@ tree_node_structure_for_code (enum tree_
>      case FIXED_CST:            return TS_FIXED_CST;
>      case COMPLEX_CST:          return TS_COMPLEX;
>      case VECTOR_CST:           return TS_VECTOR;
> +    case VEC_DUPLICATE_CST:    return TS_VECTOR;
>      case STRING_CST:           return TS_STRING;
>        /* tcc_exceptional cases.  */
>      case ERROR_MARK:           return TS_COMMON;
> @@ -816,6 +817,7 @@ tree_code_size (enum tree_code code)
>         case FIXED_CST:         return sizeof (struct tree_fixed_cst);
>         case COMPLEX_CST:       return sizeof (struct tree_complex);
>         case VECTOR_CST:        return sizeof (struct tree_vector);
> +       case VEC_DUPLICATE_CST: return sizeof (struct tree_vector);
>         case STRING_CST:        gcc_unreachable ();
>         default:
>           return lang_hooks.tree_size (code);
> @@ -875,6 +877,9 @@ tree_size (const_tree node)
>        return (sizeof (struct tree_vector)
>               + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));
>
> +    case VEC_DUPLICATE_CST:
> +      return sizeof (struct tree_vector);
> +
>      case STRING_CST:
>        return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) + 1;
>
> @@ -1682,6 +1687,30 @@ cst_and_fits_in_hwi (const_tree x)
>           && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));
>  }
>
> +/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.
> +
> +   Note that this function is only suitable for callers that specifically
> +   need a VEC_DUPLICATE_CST node.  Use build_vector_from_val to duplicate
> +   a general scalar into a general vector type.  */
> +
> +tree
> +build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
> +{
> +  int length = sizeof (struct tree_vector);
> +
> +  record_node_allocation_statistics (VEC_DUPLICATE_CST, length);
> +
> +  tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
> +
> +  TREE_SET_CODE (t, VEC_DUPLICATE_CST);
> +  TREE_TYPE (t) = type;
> +  t->base.u.nelts = 1;
> +  VEC_DUPLICATE_CST_ELT (t) = exp;
> +  TREE_CONSTANT (t) = 1;
> +
> +  return t;
> +}
> +
>  /* Build a newly constructed VECTOR_CST node of length LEN.  */
>
>  tree
> @@ -2343,6 +2372,8 @@ integer_zerop (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return integer_zerop (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -2369,6 +2400,8 @@ integer_onep (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return integer_onep (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -2407,6 +2440,9 @@ integer_all_onesp (const_tree expr)
>        return 1;
>      }
>
> +  else if (TREE_CODE (expr) == VEC_DUPLICATE_CST)
> +    return integer_all_onesp (VEC_DUPLICATE_CST_ELT (expr));
> +
>    else if (TREE_CODE (expr) != INTEGER_CST)
>      return 0;
>
> @@ -2463,7 +2499,7 @@ integer_nonzerop (const_tree expr)
>  int
>  integer_truep (const_tree expr)
>  {
> -  if (TREE_CODE (expr) == VECTOR_CST)
> +  if (TREE_CODE (expr) == VECTOR_CST || TREE_CODE (expr) == VEC_DUPLICATE_CST)
>      return integer_all_onesp (expr);
>    return integer_onep (expr);
>  }
> @@ -2634,6 +2670,8 @@ real_zerop (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return real_zerop (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -2662,6 +2700,8 @@ real_onep (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return real_onep (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -2689,6 +2729,8 @@ real_minus_onep (const_tree expr)
>             return false;
>         return true;
>        }
> +    case VEC_DUPLICATE_CST:
> +      return real_minus_onep (VEC_DUPLICATE_CST_ELT (expr));
>      default:
>        return false;
>      }
> @@ -7091,6 +7133,9 @@ add_expr (const_tree t, inchash::hash &h
>           inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags);
>         return;
>        }
> +    case VEC_DUPLICATE_CST:
> +      inchash::add_expr (VEC_DUPLICATE_CST_ELT (t), hstate);
> +      return;
>      case SSA_NAME:
>        /* We can just compare by pointer.  */
>        hstate.add_wide_int (SSA_NAME_VERSION (t));
> @@ -10345,6 +10390,9 @@ initializer_zerop (const_tree init)
>         return true;
>        }
>
> +    case VEC_DUPLICATE_CST:
> +      return initializer_zerop (VEC_DUPLICATE_CST_ELT (init));
> +
>      case CONSTRUCTOR:
>        {
>         unsigned HOST_WIDE_INT idx;
> @@ -10390,7 +10438,13 @@ uniform_vector_p (const_tree vec)
>
>    gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));
>
> -  if (TREE_CODE (vec) == VECTOR_CST)
> +  if (TREE_CODE (vec) == VEC_DUPLICATE_CST)
> +    return VEC_DUPLICATE_CST_ELT (vec);
> +
> +  else if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)
> +    return TREE_OPERAND (vec, 0);
> +
> +  else if (TREE_CODE (vec) == VECTOR_CST)
>      {
>        first = VECTOR_CST_ELT (vec, 0);
>        for (i = 1; i < VECTOR_CST_NELTS (vec); ++i)
> @@ -11095,6 +11149,7 @@ #define WALK_SUBTREE_TAIL(NODE)                         \
>      case REAL_CST:
>      case FIXED_CST:
>      case VECTOR_CST:
> +    case VEC_DUPLICATE_CST:
>      case STRING_CST:
>      case BLOCK:
>      case PLACEHOLDER_EXPR:
> @@ -12381,6 +12436,12 @@ drop_tree_overflow (tree t)
>             elt = drop_tree_overflow (elt);
>         }
>      }
> +  if (TREE_CODE (t) == VEC_DUPLICATE_CST)
> +    {
> +      tree *elt = &VEC_DUPLICATE_CST_ELT (t);
> +      if (TREE_OVERFLOW (*elt))
> +       *elt = drop_tree_overflow (*elt);
> +    }
>    return t;
>  }
>
> @@ -13798,6 +13859,92 @@ test_integer_constants ()
>    ASSERT_EQ (type, TREE_TYPE (zero));
>  }
>
> +/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs
> +   for integral type TYPE.  */
> +
> +static void
> +test_vec_duplicate_predicates_int (tree type)
> +{
> +  tree vec_type = build_vector_type (type, 4);
> +
> +  tree zero = build_zero_cst (type);
> +  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
> +  ASSERT_TRUE (integer_zerop (vec_zero));
> +  ASSERT_FALSE (integer_onep (vec_zero));
> +  ASSERT_FALSE (integer_minus_onep (vec_zero));
> +  ASSERT_FALSE (integer_all_onesp (vec_zero));
> +  ASSERT_FALSE (integer_truep (vec_zero));
> +  ASSERT_TRUE (initializer_zerop (vec_zero));
> +
> +  tree one = build_one_cst (type);
> +  tree vec_one = build_vec_duplicate_cst (vec_type, one);
> +  ASSERT_FALSE (integer_zerop (vec_one));
> +  ASSERT_TRUE (integer_onep (vec_one));
> +  ASSERT_FALSE (integer_minus_onep (vec_one));
> +  ASSERT_FALSE (integer_all_onesp (vec_one));
> +  ASSERT_FALSE (integer_truep (vec_one));
> +  ASSERT_FALSE (initializer_zerop (vec_one));
> +
> +  tree minus_one = build_minus_one_cst (type);
> +  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
> +  ASSERT_FALSE (integer_zerop (vec_minus_one));
> +  ASSERT_FALSE (integer_onep (vec_minus_one));
> +  ASSERT_TRUE (integer_minus_onep (vec_minus_one));
> +  ASSERT_TRUE (integer_all_onesp (vec_minus_one));
> +  ASSERT_TRUE (integer_truep (vec_minus_one));
> +  ASSERT_FALSE (initializer_zerop (vec_minus_one));
> +
> +  tree x = create_tmp_var_raw (type, "x");
> +  tree vec_x = build1 (VEC_DUPLICATE_EXPR, vec_type, x);
> +  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
> +  ASSERT_EQ (uniform_vector_p (vec_one), one);
> +  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
> +  ASSERT_EQ (uniform_vector_p (vec_x), x);
> +}
> +
> +/* Verify predicate handling of VEC_DUPLICATE_CSTs for floating-point
> +   type TYPE.  */
> +
> +static void
> +test_vec_duplicate_predicates_float (tree type)
> +{
> +  tree vec_type = build_vector_type (type, 4);
> +
> +  tree zero = build_zero_cst (type);
> +  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
> +  ASSERT_TRUE (real_zerop (vec_zero));
> +  ASSERT_FALSE (real_onep (vec_zero));
> +  ASSERT_FALSE (real_minus_onep (vec_zero));
> +  ASSERT_TRUE (initializer_zerop (vec_zero));
> +
> +  tree one = build_one_cst (type);
> +  tree vec_one = build_vec_duplicate_cst (vec_type, one);
> +  ASSERT_FALSE (real_zerop (vec_one));
> +  ASSERT_TRUE (real_onep (vec_one));
> +  ASSERT_FALSE (real_minus_onep (vec_one));
> +  ASSERT_FALSE (initializer_zerop (vec_one));
> +
> +  tree minus_one = build_minus_one_cst (type);
> +  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
> +  ASSERT_FALSE (real_zerop (vec_minus_one));
> +  ASSERT_FALSE (real_onep (vec_minus_one));
> +  ASSERT_TRUE (real_minus_onep (vec_minus_one));
> +  ASSERT_FALSE (initializer_zerop (vec_minus_one));
> +
> +  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
> +  ASSERT_EQ (uniform_vector_p (vec_one), one);
> +  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
> +}
> +
> +/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
> +
> +static void
> +test_vec_duplicate_predicates ()
> +{
> +  test_vec_duplicate_predicates_int (integer_type_node);
> +  test_vec_duplicate_predicates_float (float_type_node);
> +}
> +
>  /* Verify identifiers.  */
>
>  static void
> @@ -13826,6 +13973,7 @@ test_labels ()
>  tree_c_tests ()
>  {
>    test_integer_constants ();
> +  test_vec_duplicate_predicates ();
>    test_identifiers ();
>    test_labels ();
>  }
> Index: gcc/cfgexpand.c
> ===================================================================
> --- gcc/cfgexpand.c     2017-10-23 11:41:23.137358624 +0100
> +++ gcc/cfgexpand.c     2017-10-23 11:41:51.760448406 +0100
> @@ -5049,6 +5049,8 @@ expand_debug_expr (tree exp)
>      case VEC_WIDEN_LSHIFT_HI_EXPR:
>      case VEC_WIDEN_LSHIFT_LO_EXPR:
>      case VEC_PERM_EXPR:
> +    case VEC_DUPLICATE_CST:
> +    case VEC_DUPLICATE_EXPR:
>        return NULL;
>
>      /* Misc codes.  */
> Index: gcc/tree-pretty-print.c
> ===================================================================
> --- gcc/tree-pretty-print.c     2017-10-23 11:38:53.934094740 +0100
> +++ gcc/tree-pretty-print.c     2017-10-23 11:41:51.772023858 +0100
> @@ -1802,6 +1802,12 @@ dump_generic_node (pretty_printer *pp, t
>        }
>        break;
>
> +    case VEC_DUPLICATE_CST:
> +      pp_string (pp, "{ ");
> +      dump_generic_node (pp, VEC_DUPLICATE_CST_ELT (node), spc, flags, false);
> +      pp_string (pp, ", ... }");
> +      break;
> +
>      case FUNCTION_TYPE:
>      case METHOD_TYPE:
>        dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);
> @@ -3231,6 +3237,15 @@ dump_generic_node (pretty_printer *pp, t
>        pp_string (pp, " > ");
>        break;
>
> +    case VEC_DUPLICATE_EXPR:
> +      pp_space (pp);
> +      for (str = get_tree_code_name (code); *str; str++)
> +       pp_character (pp, TOUPPER (*str));
> +      pp_string (pp, " < ");
> +      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> +      pp_string (pp, " > ");
> +      break;
> +
>      case VEC_UNPACK_HI_EXPR:
>        pp_string (pp, " VEC_UNPACK_HI_EXPR < ");
>        dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
> Index: gcc/dwarf2out.c
> ===================================================================
> --- gcc/dwarf2out.c     2017-10-23 11:41:24.407340836 +0100
> +++ gcc/dwarf2out.c     2017-10-23 11:41:51.763342269 +0100
> @@ -18862,6 +18862,7 @@ rtl_for_decl_init (tree init, tree type)
>         switch (TREE_CODE (init))
>           {
>           case VECTOR_CST:
> +         case VEC_DUPLICATE_CST:
>             break;
>           case CONSTRUCTOR:
>             if (TREE_CONSTANT (init))
> Index: gcc/gimple-expr.h
> ===================================================================
> --- gcc/gimple-expr.h   2017-10-23 11:38:53.934094740 +0100
> +++ gcc/gimple-expr.h   2017-10-23 11:41:51.765271511 +0100
> @@ -134,6 +134,7 @@ is_gimple_constant (const_tree t)
>      case FIXED_CST:
>      case COMPLEX_CST:
>      case VECTOR_CST:
> +    case VEC_DUPLICATE_CST:
>      case STRING_CST:
>        return true;
>
> Index: gcc/gimplify.c
> ===================================================================
> --- gcc/gimplify.c      2017-10-23 11:41:25.531270256 +0100
> +++ gcc/gimplify.c      2017-10-23 11:41:51.766236132 +0100
> @@ -11506,6 +11506,7 @@ gimplify_expr (tree *expr_p, gimple_seq
>         case STRING_CST:
>         case COMPLEX_CST:
>         case VECTOR_CST:
> +       case VEC_DUPLICATE_CST:
>           /* Drop the overflow flag on constants, we do not want
>              that in the GIMPLE IL.  */
>           if (TREE_OVERFLOW_P (*expr_p))
> Index: gcc/graphite-isl-ast-to-gimple.c

> ===================================================================

> --- gcc/graphite-isl-ast-to-gimple.c    2017-10-23 11:41:23.205065216 +0100

> +++ gcc/graphite-isl-ast-to-gimple.c    2017-10-23 11:41:51.767200753 +0100

> @@ -222,7 +222,8 @@ enum phi_node_kind

>      return TREE_CODE (op) == INTEGER_CST

>        || TREE_CODE (op) == REAL_CST

>        || TREE_CODE (op) == COMPLEX_CST

> -      || TREE_CODE (op) == VECTOR_CST;

> +      || TREE_CODE (op) == VECTOR_CST

> +      || TREE_CODE (op) == VEC_DUPLICATE_CST;

>    }

>

>  private:

> Index: gcc/graphite-scop-detection.c

> ===================================================================

> --- gcc/graphite-scop-detection.c       2017-10-23 11:41:25.533204730 +0100

> +++ gcc/graphite-scop-detection.c       2017-10-23 11:41:51.767200753 +0100

> @@ -1243,6 +1243,7 @@ scan_tree_for_params (sese_info_p s, tre

>      case REAL_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        break;

>

>     default:

> Index: gcc/ipa-icf-gimple.c

> ===================================================================

> --- gcc/ipa-icf-gimple.c        2017-10-23 11:38:53.934094740 +0100

> +++ gcc/ipa-icf-gimple.c        2017-10-23 11:41:51.767200753 +0100

> @@ -333,6 +333,7 @@ func_checker::compare_cst_or_decl (tree

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>      case REAL_CST:

>        {

> @@ -528,6 +529,7 @@ func_checker::compare_operand (tree t1,

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>      case REAL_CST:

>      case FUNCTION_DECL:

> Index: gcc/ipa-icf.c

> ===================================================================

> --- gcc/ipa-icf.c       2017-10-23 11:41:25.874639400 +0100

> +++ gcc/ipa-icf.c       2017-10-23 11:41:51.768165374 +0100

> @@ -1478,6 +1478,7 @@ sem_item::add_expr (const_tree exp, inch

>      case STRING_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        inchash::add_expr (exp, hstate);

>        break;

>      case CONSTRUCTOR:

> @@ -2030,6 +2031,9 @@ sem_variable::equals (tree t1, tree t2)

>

>         return 1;

>        }

> +    case VEC_DUPLICATE_CST:

> +      return sem_variable::equals (VEC_DUPLICATE_CST_ELT (t1),

> +                                  VEC_DUPLICATE_CST_ELT (t2));

>      case ARRAY_REF:

>      case ARRAY_RANGE_REF:

>        {

> Index: gcc/match.pd

> ===================================================================

> --- gcc/match.pd        2017-10-23 11:38:53.934094740 +0100

> +++ gcc/match.pd        2017-10-23 11:41:51.768165374 +0100

> @@ -958,6 +958,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

>  (match negate_expr_p

>   VECTOR_CST

>   (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))

> +(match negate_expr_p

> + VEC_DUPLICATE_CST

> + (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))

>

>  /* (-A) * (-B) -> A * B  */

>  (simplify

> Index: gcc/print-tree.c

> ===================================================================

> --- gcc/print-tree.c    2017-10-23 11:38:53.934094740 +0100

> +++ gcc/print-tree.c    2017-10-23 11:41:51.769129995 +0100

> @@ -783,6 +783,10 @@ print_node (FILE *file, const char *pref

>           }

>           break;

>

> +       case VEC_DUPLICATE_CST:

> +         print_node (file, "elt", VEC_DUPLICATE_CST_ELT (node), indent + 4);

> +         break;

> +

>         case COMPLEX_CST:

>           print_node (file, "real", TREE_REALPART (node), indent + 4);

>           print_node (file, "imag", TREE_IMAGPART (node), indent + 4);

> Index: gcc/tree-chkp.c

> ===================================================================

> --- gcc/tree-chkp.c     2017-10-23 11:41:23.201196268 +0100

> +++ gcc/tree-chkp.c     2017-10-23 11:41:51.770094616 +0100

> @@ -3800,6 +3800,7 @@ chkp_find_bounds_1 (tree ptr, tree ptr_s

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        if (integer_zerop (ptr_src))

>         bounds = chkp_get_none_bounds ();

>        else

> Index: gcc/tree-loop-distribution.c

> ===================================================================

> --- gcc/tree-loop-distribution.c        2017-10-23 11:41:23.228278904 +0100

> +++ gcc/tree-loop-distribution.c        2017-10-23 11:41:51.771059237 +0100

> @@ -921,6 +921,9 @@ const_with_all_bytes_same (tree val)

>            && CONSTRUCTOR_NELTS (val) == 0))

>      return 0;

>

> +  if (TREE_CODE (val) == VEC_DUPLICATE_CST)

> +    return const_with_all_bytes_same (VEC_DUPLICATE_CST_ELT (val));

> +

>    if (real_zerop (val))

>      {

>        /* Only return 0 for +0.0, not for -0.0, which doesn't have

> Index: gcc/tree-ssa-loop.c

> ===================================================================

> --- gcc/tree-ssa-loop.c 2017-10-23 11:38:53.934094740 +0100

> +++ gcc/tree-ssa-loop.c 2017-10-23 11:41:51.772023858 +0100

> @@ -616,6 +616,7 @@ for_each_index (tree *addr_p, bool (*cbc

>         case STRING_CST:

>         case RESULT_DECL:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case COMPLEX_CST:

>         case INTEGER_CST:

>         case REAL_CST:

> Index: gcc/tree-ssa-pre.c

> ===================================================================

> --- gcc/tree-ssa-pre.c  2017-10-23 11:41:25.549647760 +0100

> +++ gcc/tree-ssa-pre.c  2017-10-23 11:41:51.772023858 +0100

> @@ -2675,6 +2675,7 @@ create_component_ref_by_pieces_1 (basic_

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case REAL_CST:

>      case CONSTRUCTOR:

>      case VAR_DECL:

> Index: gcc/tree-ssa-sccvn.c

> ===================================================================

> --- gcc/tree-ssa-sccvn.c        2017-10-23 11:38:53.934094740 +0100

> +++ gcc/tree-ssa-sccvn.c        2017-10-23 11:41:51.773953100 +0100

> @@ -858,6 +858,7 @@ copy_reference_ops_from_ref (tree ref, v

>         case INTEGER_CST:

>         case COMPLEX_CST:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case REAL_CST:

>         case FIXED_CST:

>         case CONSTRUCTOR:

> @@ -1050,6 +1051,7 @@ ao_ref_init_from_vn_reference (ao_ref *r

>         case INTEGER_CST:

>         case COMPLEX_CST:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case REAL_CST:

>         case CONSTRUCTOR:

>         case CONST_DECL:

> Index: gcc/tree-vect-generic.c

> ===================================================================

> --- gcc/tree-vect-generic.c     2017-10-23 11:38:53.934094740 +0100

> +++ gcc/tree-vect-generic.c     2017-10-23 11:41:51.773953100 +0100

> @@ -1419,6 +1419,7 @@ lower_vec_perm (gimple_stmt_iterator *gs

>  ssa_uniform_vector_p (tree op)

>  {

>    if (TREE_CODE (op) == VECTOR_CST

> +      || TREE_CODE (op) == VEC_DUPLICATE_CST

>        || TREE_CODE (op) == CONSTRUCTOR)

>      return uniform_vector_p (op);

>    if (TREE_CODE (op) == SSA_NAME)

> Index: gcc/varasm.c

> ===================================================================

> --- gcc/varasm.c        2017-10-23 11:41:25.822408600 +0100

> +++ gcc/varasm.c        2017-10-23 11:41:51.775882341 +0100

> @@ -3068,6 +3068,9 @@ const_hash_1 (const tree exp)

>      CASE_CONVERT:

>        return const_hash_1 (TREE_OPERAND (exp, 0)) * 7 + 2;

>

> +    case VEC_DUPLICATE_CST:

> +      return const_hash_1 (VEC_DUPLICATE_CST_ELT (exp)) * 7 + 3;

> +

>      default:

>        /* A language specific constant. Just hash the code.  */

>        return code;

> @@ -3158,6 +3161,10 @@ compare_constant (const tree t1, const t

>         return 1;

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      return compare_constant (VEC_DUPLICATE_CST_ELT (t1),

> +                              VEC_DUPLICATE_CST_ELT (t2));

> +

>      case CONSTRUCTOR:

>        {

>         vec<constructor_elt, va_gc> *v1, *v2;

> Index: gcc/fold-const.c

> ===================================================================

> --- gcc/fold-const.c    2017-10-23 11:41:23.535860278 +0100

> +++ gcc/fold-const.c    2017-10-23 11:41:51.765271511 +0100

> @@ -418,6 +418,9 @@ negate_expr_p (tree t)

>         return true;

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      return negate_expr_p (VEC_DUPLICATE_CST_ELT (t));

> +

>      case COMPLEX_EXPR:

>        return negate_expr_p (TREE_OPERAND (t, 0))

>              && negate_expr_p (TREE_OPERAND (t, 1));

> @@ -579,6 +582,14 @@ fold_negate_expr_1 (location_t loc, tree

>         return build_vector (type, elts);

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      {

> +       tree sub = fold_negate_expr (loc, VEC_DUPLICATE_CST_ELT (t));

> +       if (!sub)

> +         return NULL_TREE;

> +       return build_vector_from_val (type, sub);

> +      }

> +

>      case COMPLEX_EXPR:

>        if (negate_expr_p (t))

>         return fold_build2_loc (loc, COMPLEX_EXPR, type,

> @@ -1436,6 +1447,16 @@ const_binop (enum tree_code code, tree a

>        return build_vector (type, elts);

>      }

>

> +  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +      && TREE_CODE (arg2) == VEC_DUPLICATE_CST)

> +    {

> +      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1),

> +                             VEC_DUPLICATE_CST_ELT (arg2));

> +      if (!sub)

> +       return NULL_TREE;

> +      return build_vector_from_val (TREE_TYPE (arg1), sub);

> +    }

> +

>    /* Shifts allow a scalar offset for a vector.  */

>    if (TREE_CODE (arg1) == VECTOR_CST

>        && TREE_CODE (arg2) == INTEGER_CST)

> @@ -1459,6 +1480,15 @@ const_binop (enum tree_code code, tree a

>

>        return build_vector (type, elts);

>      }

> +

> +  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +      && TREE_CODE (arg2) == INTEGER_CST)

> +    {

> +      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1), arg2);

> +      if (!sub)

> +       return NULL_TREE;

> +      return build_vector_from_val (TREE_TYPE (arg1), sub);

> +    }

>    return NULL_TREE;

>  }

>

> @@ -1652,6 +1682,13 @@ const_unop (enum tree_code code, tree ty

>           if (i == count)

>             return build_vector (type, elements);

>         }

> +      else if (TREE_CODE (arg0) == VEC_DUPLICATE_CST)

> +       {

> +         tree sub = const_unop (BIT_NOT_EXPR, TREE_TYPE (type),

> +                                VEC_DUPLICATE_CST_ELT (arg0));

> +         if (sub)

> +           return build_vector_from_val (type, sub);

> +       }

>        break;

>

>      case TRUTH_NOT_EXPR:

> @@ -1737,6 +1774,11 @@ const_unop (enum tree_code code, tree ty

>         return res;

>        }

>

> +    case VEC_DUPLICATE_EXPR:

> +      if (CONSTANT_CLASS_P (arg0))

> +       return build_vector_from_val (type, arg0);

> +      return NULL_TREE;

> +

>      default:

>        break;

>      }

> @@ -2167,6 +2209,15 @@ fold_convert_const (enum tree_code code,

>             }

>           return build_vector (type, v);

>         }

> +      if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +         && (TYPE_VECTOR_SUBPARTS (type)

> +             == TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1))))

> +       {

> +         tree sub = fold_convert_const (code, TREE_TYPE (type),

> +                                        VEC_DUPLICATE_CST_ELT (arg1));

> +         if (sub)

> +           return build_vector_from_val (type, sub);

> +       }

>      }

>    return NULL_TREE;

>  }

> @@ -2953,6 +3004,10 @@ operand_equal_p (const_tree arg0, const_

>           return 1;

>         }

>

> +      case VEC_DUPLICATE_CST:

> +       return operand_equal_p (VEC_DUPLICATE_CST_ELT (arg0),

> +                               VEC_DUPLICATE_CST_ELT (arg1), flags);

> +

>        case COMPLEX_CST:

>         return (operand_equal_p (TREE_REALPART (arg0), TREE_REALPART (arg1),

>                                  flags)

> @@ -7492,6 +7547,20 @@ can_native_interpret_type_p (tree type)

>  static tree

>  fold_view_convert_expr (tree type, tree expr)

>  {

> +  /* Recurse on duplicated vectors if the target type is also a vector

> +     and if the elements line up.  */

> +  tree expr_type = TREE_TYPE (expr);

> +  if (TREE_CODE (expr) == VEC_DUPLICATE_CST

> +      && VECTOR_TYPE_P (type)

> +      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (expr_type)

> +      && TYPE_SIZE (TREE_TYPE (type)) == TYPE_SIZE (TREE_TYPE (expr_type)))

> +    {

> +      tree sub = fold_view_convert_expr (TREE_TYPE (type),

> +                                        VEC_DUPLICATE_CST_ELT (expr));

> +      if (sub)

> +       return build_vector_from_val (type, sub);

> +    }

> +

>    /* We support up to 512-bit values (for V8DFmode).  */

>    unsigned char buffer[64];

>    int len;

> @@ -8891,6 +8960,15 @@ exact_inverse (tree type, tree cst)

>         return build_vector (type, elts);

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      {

> +       tree sub = exact_inverse (TREE_TYPE (type),

> +                                 VEC_DUPLICATE_CST_ELT (cst));

> +       if (!sub)

> +         return NULL_TREE;

> +       return build_vector_from_val (type, sub);

> +      }

> +

>      default:

>        return NULL_TREE;

>      }

> @@ -11969,6 +12047,9 @@ fold_checksum_tree (const_tree expr, str

>           for (i = 0; i < (int) VECTOR_CST_NELTS (expr); ++i)

>             fold_checksum_tree (VECTOR_CST_ELT (expr, i), ctx, ht);

>           break;

> +       case VEC_DUPLICATE_CST:

> +         fold_checksum_tree (VEC_DUPLICATE_CST_ELT (expr), ctx, ht);

> +         break;

>         default:

>           break;

>         }

> @@ -14436,6 +14517,36 @@ test_vector_folding ()

>    ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));

>  }

>

> +/* Verify folding of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */

> +

> +static void

> +test_vec_duplicate_folding ()

> +{

> +  tree type = build_vector_type (ssizetype, 4);

> +  tree dup5 = build_vec_duplicate_cst (type, ssize_int (5));

> +  tree dup3 = build_vec_duplicate_cst (type, ssize_int (3));

> +

> +  tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);

> +  ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));

> +

> +  tree not_dup5 = fold_unary (BIT_NOT_EXPR, type, dup5);

> +  ASSERT_EQ (uniform_vector_p (not_dup5), ssize_int (-6));

> +

> +  tree dup5_plus_dup3 = fold_binary (PLUS_EXPR, type, dup5, dup3);

> +  ASSERT_EQ (uniform_vector_p (dup5_plus_dup3), ssize_int (8));

> +

> +  tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));

> +  ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));

> +

> +  tree size_vector = build_vector_type (sizetype, 4);

> +  tree size_dup5 = fold_convert (size_vector, dup5);

> +  ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));

> +

> +  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));

> +  tree dup5_cst = build_vector_from_val (type, ssize_int (5));

> +  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));

> +}

> +

>  /* Run all of the selftests within this file.  */

>

>  void

> @@ -14443,6 +14554,7 @@ fold_const_c_tests ()

>  {

>    test_arithmetic_folding ();

>    test_vector_folding ();

> +  test_vec_duplicate_folding ();

>  }

>

>  } // namespace selftest

> Index: gcc/optabs.def

> ===================================================================

> --- gcc/optabs.def      2017-10-23 11:38:53.934094740 +0100

> +++ gcc/optabs.def      2017-10-23 11:41:51.769129995 +0100

> @@ -364,3 +364,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I

>

>  OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")

>  OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")

> +

> +OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)

> Index: gcc/optabs-tree.c

> ===================================================================

> --- gcc/optabs-tree.c   2017-10-23 11:38:53.934094740 +0100

> +++ gcc/optabs-tree.c   2017-10-23 11:41:51.768165374 +0100

> @@ -210,6 +210,9 @@ optab_for_tree_code (enum tree_code code

>        return TYPE_UNSIGNED (type) ?

>         vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;

>

> +    case VEC_DUPLICATE_EXPR:

> +      return vec_duplicate_optab;

> +

>      default:

>        break;

>      }

> Index: gcc/optabs.h

> ===================================================================

> --- gcc/optabs.h        2017-10-23 11:38:53.934094740 +0100

> +++ gcc/optabs.h        2017-10-23 11:41:51.769129995 +0100

> @@ -181,6 +181,7 @@ extern rtx simplify_expand_binop (machin

>                                   enum optab_methods methods);

>  extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,

>                                 enum optab_methods);

> +extern rtx expand_vector_broadcast (machine_mode, rtx);

>

>  /* Generate code for a simple binary or unary operation.  "Simple" in

>     this case means "can be unambiguously described by a (mode, code)

> Index: gcc/optabs.c

> ===================================================================

> --- gcc/optabs.c        2017-10-23 11:41:41.549050496 +0100

> +++ gcc/optabs.c        2017-10-23 11:41:51.769129995 +0100

> @@ -367,7 +367,7 @@ force_expand_binop (machine_mode mode, o

>     mode of OP must be the element mode of VMODE.  If OP is a constant,

>     then the return value will be a constant.  */

>

> -static rtx

> +rtx

>  expand_vector_broadcast (machine_mode vmode, rtx op)

>  {

>    enum insn_code icode;

> @@ -380,6 +380,16 @@ expand_vector_broadcast (machine_mode vm

>    if (CONSTANT_P (op))

>      return gen_const_vec_duplicate (vmode, op);

>

> +  icode = optab_handler (vec_duplicate_optab, vmode);

> +  if (icode != CODE_FOR_nothing)

> +    {

> +      struct expand_operand ops[2];

> +      create_output_operand (&ops[0], NULL_RTX, vmode);

> +      create_input_operand (&ops[1], op, GET_MODE (op));

> +      expand_insn (icode, 2, ops);

> +      return ops[0].value;

> +    }

> +

>    /* ??? If the target doesn't have a vec_init, then we have no easy way

>       of performing this operation.  Most of this sort of generic support

>       is hidden away in the vector lowering support in gimple.  */

> Index: gcc/expr.c

> ===================================================================

> --- gcc/expr.c  2017-10-23 11:41:39.187050437 +0100

> +++ gcc/expr.c  2017-10-23 11:41:51.764306890 +0100

> @@ -6572,7 +6572,8 @@ store_constructor (tree exp, rtx target,

>         constructor_elt *ce;

>         int i;

>         int need_to_clear;

> -       int icode = CODE_FOR_nothing;

> +       insn_code icode = CODE_FOR_nothing;

> +       tree elt;

>         tree elttype = TREE_TYPE (type);

>         int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));

>         machine_mode eltmode = TYPE_MODE (elttype);

> @@ -6582,13 +6583,30 @@ store_constructor (tree exp, rtx target,

>         unsigned n_elts;

>         alias_set_type alias;

>         bool vec_vec_init_p = false;

> +       machine_mode mode = GET_MODE (target);

>

>         gcc_assert (eltmode != BLKmode);

>

> +       /* Try using vec_duplicate_optab for uniform vectors.  */

> +       if (!TREE_SIDE_EFFECTS (exp)

> +           && VECTOR_MODE_P (mode)

> +           && eltmode == GET_MODE_INNER (mode)

> +           && ((icode = optab_handler (vec_duplicate_optab, mode))

> +               != CODE_FOR_nothing)

> +           && (elt = uniform_vector_p (exp)))

> +         {

> +           struct expand_operand ops[2];

> +           create_output_operand (&ops[0], target, mode);

> +           create_input_operand (&ops[1], expand_normal (elt), eltmode);

> +           expand_insn (icode, 2, ops);

> +           if (!rtx_equal_p (target, ops[0].value))

> +             emit_move_insn (target, ops[0].value);

> +           break;

> +         }

> +

>         n_elts = TYPE_VECTOR_SUBPARTS (type);

> -       if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))

> +       if (REG_P (target) && VECTOR_MODE_P (mode))

>           {

> -           machine_mode mode = GET_MODE (target);

>             machine_mode emode = eltmode;

>

>             if (CONSTRUCTOR_NELTS (exp)

> @@ -6600,7 +6618,7 @@ store_constructor (tree exp, rtx target,

>                             == n_elts);

>                 emode = TYPE_MODE (etype);

>               }

> -           icode = (int) convert_optab_handler (vec_init_optab, mode, emode);

> +           icode = convert_optab_handler (vec_init_optab, mode, emode);

>             if (icode != CODE_FOR_nothing)

>               {

>                 unsigned int i, n = n_elts;

> @@ -6648,7 +6666,7 @@ store_constructor (tree exp, rtx target,

>         if (need_to_clear && size > 0 && !vector)

>           {

>             if (REG_P (target))

> -             emit_move_insn (target, CONST0_RTX (GET_MODE (target)));

> +             emit_move_insn (target, CONST0_RTX (mode));

>             else

>               clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);

>             cleared = 1;

> @@ -6656,7 +6674,7 @@ store_constructor (tree exp, rtx target,

>

>         /* Inform later passes that the old value is dead.  */

>         if (!cleared && !vector && REG_P (target))

> -         emit_move_insn (target, CONST0_RTX (GET_MODE (target)));

> +         emit_move_insn (target, CONST0_RTX (mode));

>

>          if (MEM_P (target))

>           alias = MEM_ALIAS_SET (target);

> @@ -6707,8 +6725,7 @@ store_constructor (tree exp, rtx target,

>

>         if (vector)

>           emit_insn (GEN_FCN (icode) (target,

> -                                     gen_rtx_PARALLEL (GET_MODE (target),

> -                                                       vector)));

> +                                     gen_rtx_PARALLEL (mode, vector)));

>         break;

>        }

>

> @@ -7686,6 +7703,19 @@ expand_operands (tree exp0, tree exp1, r

>  }

>

>

> +/* Expand constant vector element ELT, which has mode MODE.  This is used

> +   for members of VECTOR_CST and VEC_DUPLICATE_CST.  */

> +

> +static rtx

> +const_vector_element (scalar_mode mode, const_tree elt)

> +{

> +  if (TREE_CODE (elt) == REAL_CST)

> +    return const_double_from_real_value (TREE_REAL_CST (elt), mode);

> +  if (TREE_CODE (elt) == FIXED_CST)

> +    return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);

> +  return immed_wide_int_const (wi::to_wide (elt), mode);

> +}

> +

>  /* Return a MEM that contains constant EXP.  DEFER is as for

>     output_constant_def and MODIFIER is as for expand_expr.  */

>

> @@ -9551,6 +9581,12 @@ #define REDUCE_BIT_FIELD(expr)   (reduce_b

>        target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);

>        return target;

>

> +    case VEC_DUPLICATE_EXPR:

> +      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);

> +      target = expand_vector_broadcast (mode, op0);

> +      gcc_assert (target);

> +      return target;

> +

>      case BIT_INSERT_EXPR:

>        {

>         unsigned bitpos = tree_to_uhwi (treeop2);

> @@ -10003,6 +10039,11 @@ expand_expr_real_1 (tree exp, rtx target

>                             tmode, modifier);

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      op0 = const_vector_element (GET_MODE_INNER (mode),

> +                                 VEC_DUPLICATE_CST_ELT (exp));

> +      return gen_const_vec_duplicate (mode, op0);

> +

>      case CONST_DECL:

>        if (modifier == EXPAND_WRITE)

>         {

> @@ -11764,8 +11805,7 @@ const_vector_from_tree (tree exp)

>  {

>    rtvec v;

>    unsigned i, units;

> -  tree elt;

> -  machine_mode inner, mode;

> +  machine_mode mode;

>

>    mode = TYPE_MODE (TREE_TYPE (exp));

>

> @@ -11776,23 +11816,12 @@ const_vector_from_tree (tree exp)

>      return const_vector_mask_from_tree (exp);

>

>    units = VECTOR_CST_NELTS (exp);

> -  inner = GET_MODE_INNER (mode);

>

>    v = rtvec_alloc (units);

>

>    for (i = 0; i < units; ++i)

> -    {

> -      elt = VECTOR_CST_ELT (exp, i);

> -

> -      if (TREE_CODE (elt) == REAL_CST)

> -       RTVEC_ELT (v, i) = const_double_from_real_value (TREE_REAL_CST (elt),

> -                                                        inner);

> -      else if (TREE_CODE (elt) == FIXED_CST)

> -       RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),

> -                                                        inner);

> -      else

> -       RTVEC_ELT (v, i) = immed_wide_int_const (wi::to_wide (elt), inner);

> -    }

> +    RTVEC_ELT (v, i) = const_vector_element (GET_MODE_INNER (mode),

> +                                            VECTOR_CST_ELT (exp, i));

>

>    return gen_rtx_CONST_VECTOR (mode, v);

>  }

> Index: gcc/internal-fn.c

> ===================================================================

> --- gcc/internal-fn.c   2017-10-23 11:41:23.529089619 +0100

> +++ gcc/internal-fn.c   2017-10-23 11:41:51.767200753 +0100

> @@ -1911,12 +1911,12 @@ expand_vector_ubsan_overflow (location_t

>        emit_move_insn (cntvar, const0_rtx);

>        emit_label (loop_lab);

>      }

> -  if (TREE_CODE (arg0) != VECTOR_CST)

> +  if (!CONSTANT_CLASS_P (arg0))

>      {

>        rtx arg0r = expand_normal (arg0);

>        arg0 = make_tree (TREE_TYPE (arg0), arg0r);

>      }

> -  if (TREE_CODE (arg1) != VECTOR_CST)

> +  if (!CONSTANT_CLASS_P (arg1))

>      {

>        rtx arg1r = expand_normal (arg1);

>        arg1 = make_tree (TREE_TYPE (arg1), arg1r);

> Index: gcc/tree-cfg.c

> ===================================================================

> --- gcc/tree-cfg.c      2017-10-23 11:41:25.864967029 +0100

> +++ gcc/tree-cfg.c      2017-10-23 11:41:51.770094616 +0100

> @@ -3803,6 +3803,17 @@ verify_gimple_assign_unary (gassign *stm

>      case CONJ_EXPR:

>        break;

>

> +    case VEC_DUPLICATE_EXPR:

> +      if (TREE_CODE (lhs_type) != VECTOR_TYPE

> +         || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))

> +       {

> +         error ("vec_duplicate should be from a scalar to a like vector");

> +         debug_generic_expr (lhs_type);

> +         debug_generic_expr (rhs1_type);

> +         return true;

> +       }

> +      return false;

> +

>      default:

>        gcc_unreachable ();

>      }

> @@ -4473,6 +4484,7 @@ verify_gimple_assign_single (gassign *st

>      case FIXED_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>        return res;

>

> Index: gcc/tree-inline.c

> ===================================================================

> --- gcc/tree-inline.c   2017-10-23 11:41:25.833048208 +0100

> +++ gcc/tree-inline.c   2017-10-23 11:41:51.771059237 +0100

> @@ -4002,6 +4002,7 @@ estimate_operator_cost (enum tree_code c

>      case VEC_PACK_FIX_TRUNC_EXPR:

>      case VEC_WIDEN_LSHIFT_HI_EXPR:

>      case VEC_WIDEN_LSHIFT_LO_EXPR:

> +    case VEC_DUPLICATE_EXPR:

>

>        return 1;

>
Richard Sandiford Nov. 6, 2017, 3:09 p.m. UTC | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Oct 23, 2017 at 1:20 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> SVE needs a way of broadcasting a scalar to a variable-length vector.
>> This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
>> fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
>> be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
>> equivalent of the existing rtl code VEC_DUPLICATE.
>>
>> Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
>> to mark constant nodes, but in response to last year's RFC, Richard B.
>> suggested it would be better to have separate codes for the constant
>> and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
>> as a normal unary operation and avoids the previous need for treating
>> it as a GIMPLE_SINGLE_RHS.
>>
>> It might make sense to use VEC_DUPLICATE_CST for all duplicated
>> vector constants, since it's a bit more compact than VECTOR_CST
>> in that case, and is potentially more efficient to process.
>> However, the nice thing about keeping it restricted to variable-length
>> vectors is that there is then no need to handle combinations of
>> VECTOR_CST and VEC_DUPLICATE_CST; a vector type will always use
>> VECTOR_CST or never use it.
>>
>> The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.
>
> Index: gcc/tree-vect-generic.c
> ===================================================================
> --- gcc/tree-vect-generic.c     2017-10-23 11:38:53.934094740 +0100
> +++ gcc/tree-vect-generic.c     2017-10-23 11:41:51.773953100 +0100
> @@ -1419,6 +1419,7 @@ lower_vec_perm (gimple_stmt_iterator *gs
>  ssa_uniform_vector_p (tree op)
>  {
>    if (TREE_CODE (op) == VECTOR_CST
> +      || TREE_CODE (op) == VEC_DUPLICATE_CST
>        || TREE_CODE (op) == CONSTRUCTOR)
>      return uniform_vector_p (op);
>
> VEC_DUPLICATE_EXPR handling?


Oops, yeah.  I could have sworn it was there at one time...

> Looks like for VEC_DUPLICATE_CST it could directly return true.


The function is a bit misnamed: it returns the duplicated tree value
rather than a bool.
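
As a rough illustration of that return convention, here is a toy model in
standalone C++.  It is only a sketch: toy_vec_code, toy_vector and
toy_uniform_vector_p are hypothetical stand-ins, not the real tree.c code,
and elements are modelled as plain longs rather than trees.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the two constant encodings discussed here.
enum class toy_vec_code { vector_cst, vec_duplicate_cst };

struct toy_vector
{
  toy_vec_code code;
  // vector_cst: one entry per element.
  // vec_duplicate_cst: exactly one entry, the duplicated element.
  std::vector<long> elts;
};

// Return a pointer to the duplicated element if every element of VEC is
// the same, otherwise a null pointer.  The result is the value itself,
// not a bool, which mirrors the behaviour described for uniform_vector_p.
const long *
toy_uniform_vector_p (const toy_vector &vec)
{
  if (vec.code == toy_vec_code::vec_duplicate_cst)
    {
      // A duplicate constant is uniform by construction.
      assert (vec.elts.size () == 1);
      return &vec.elts[0];
    }
  if (vec.elts.empty ())
    return nullptr;
  for (std::size_t i = 1; i < vec.elts.size (); ++i)
    if (vec.elts[i] != vec.elts[0])
      return nullptr;
  return &vec.elts[0];
}
```

In this model a caller such as ssa_uniform_vector_p can pass the result
straight through, which is why returning "true" for the duplicate case
would lose information.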

> I didn't see uniform_vector_p being updated?


That part was there FWIW (for tree.c).

> Can you add verification to either verify_expr or build_vec_duplicate_cst
> that the type is one of variable size?  And amend tree.def docs
> accordingly.  Because otherwise we miss a lot of cases in constant
> folding (mixing VEC_DUPLICATE_CST and VECTOR_CST).


OK, done in the patch below with a gcc_unreachable () bomb in
build_vec_duplicate_cst, which becomes a gcc_assert when variable-length
vectors are added.  This meant changing the selftests to use
build_vector_from_val rather than build_vec_duplicate_cst,
but to still get testing of VEC_DUPLICATE_*, we then need to use
the target's preferred vector length instead of always using 4.
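
The intended split can be sketched with a standalone C++ toy model.
Everything named toy_* below is hypothetical and only approximates the
idea: the assert stands in for the eventual gcc_assert (the
gcc_unreachable () stage would reject every call), and elements are plain
longs rather than trees.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of a vector type.
struct toy_vector_type
{
  bool variable_length;  // true for SVE-style variable-length vectors
  unsigned nelts;        // element count; only used when !variable_length
};

enum class toy_cst_kind { vector_cst, vec_duplicate_cst };

struct toy_constant
{
  toy_cst_kind kind;
  std::vector<long> elts;
};

// The duplicate encoding is reserved for variable-length types, so a
// given type either always uses vector_cst or never uses it.
toy_constant
toy_build_vec_duplicate_cst (const toy_vector_type &type, long elt)
{
  assert (type.variable_length);
  return { toy_cst_kind::vec_duplicate_cst, { elt } };
}

// Single entry point that callers (and selftests) use for both kinds
// of type; it picks the encoding from the type.
toy_constant
toy_build_vector_from_val (const toy_vector_type &type, long elt)
{
  if (type.variable_length)
    return toy_build_vec_duplicate_cst (type, elt);
  return { toy_cst_kind::vector_cst, std::vector<long> (type.nelts, elt) };
}
```

Going through the common entry point is what lets the same selftest source
exercise VECTOR_CST on fixed-length targets and VEC_DUPLICATE_* once the
preferred vector type is variable-length.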

Tested as before.  OK (given the slightly different selftests)?

Thanks,
Richard


2017-11-06  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hawyard@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
	(VEC_COND_EXPR): Add missing @tindex.
	* doc/md.texi (vec_duplicate@var{m}): Document.
	* tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
	* tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
	are used for VEC_DUPLICATE_CST as well.
	(tree_vector): Access base.n.nelts directly.
	* tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of
	valid codes.
	(VEC_DUPLICATE_CST_ELT): New macro.
	* tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
	(integer_zerop, integer_onep, integer_all_onesp, integer_truep)
	(real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
	(walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
	(build_vec_duplicate_cst): New function.
	(build_vector_from_val): Add stubbed-out handling of variable-length
	vectors, using build_vec_duplicate_cst and VEC_DUPLICATE_EXPR.
	(uniform_vector_p): Handle the new codes.
	(test_vec_duplicate_predicates_int): New function.
	(test_vec_duplicate_predicates_float): Likewise.
	(test_vec_duplicate_predicates): Likewise.
	(tree_c_tests): Call test_vec_duplicate_predicates.
	* cfgexpand.c (expand_debug_expr): Handle the new codes.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
	* dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
	* gimple-expr.h (is_gimple_constant): Likewise.
	* gimplify.c (gimplify_expr): Likewise.
	* graphite-isl-ast-to-gimple.c
	(translate_isl_ast_to_gimple::is_constant): Likewise.
	* graphite-scop-detection.c (scan_tree_for_params): Likewise.
	* ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
	(func_checker::compare_operand): Likewise.
	* ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
	* match.pd (negate_expr_p): Likewise.
	* print-tree.c (print_node): Likewise.
	* tree-chkp.c (chkp_find_bounds_1): Likewise.
	* tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
	* tree-ssa-loop.c (for_each_index): Likewise.
	* tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
	* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
	(ao_ref_init_from_vn_reference): Likewise.
	* varasm.c (const_hash_1, compare_constant): Likewise.
	* fold-const.c (negate_expr_p, fold_negate_expr_1, const_binop)
	(fold_convert_const, operand_equal_p, fold_view_convert_expr)
	(exact_inverse, fold_checksum_tree): Likewise.
	(const_unop): Likewise.  Fold VEC_DUPLICATE_EXPRs of a constant.
	(test_vec_duplicate_folding): New function.
	(fold_const_c_tests): Call it.
	* optabs.def (vec_duplicate_optab): New optab.
	* optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
	* optabs.h (expand_vector_broadcast): Declare.
	* optabs.c (expand_vector_broadcast): Make non-static.  Try using
	vec_duplicate_optab.
	* expr.c (store_constructor): Try using vec_duplicate_optab for
	uniform vectors.
	(const_vector_element): New function, split out from...
	(const_vector_from_tree): ...here.
	(expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
	(expand_expr_real_1): Handle VEC_DUPLICATE_CST.
	* internal-fn.c (expand_vector_ubsan_overflow): Use CONSTANT_P
	instead of checking for VECTOR_CST.
	* tree-cfg.c (verify_gimple_assign_unary): Handle VEC_DUPLICATE_EXPR.
	(verify_gimple_assign_single): Handle VEC_DUPLICATE_CST.
	* tree-inline.c (estimate_operator_cost): Handle VEC_DUPLICATE_EXPR.

Index: gcc/doc/generic.texi
===================================================================
--- gcc/doc/generic.texi	2017-11-06 12:40:39.845713389 +0000
+++ gcc/doc/generic.texi	2017-11-06 12:40:40.277637153 +0000
@@ -1036,6 +1036,7 @@ As this example indicates, the operands
 @tindex FIXED_CST
 @tindex COMPLEX_CST
 @tindex VECTOR_CST
+@tindex VEC_DUPLICATE_CST
 @tindex STRING_CST
 @findex TREE_STRING_LENGTH
 @findex TREE_STRING_POINTER
@@ -1089,6 +1090,14 @@ constant nodes.  Each individual constan
 double constant node.  The first operand is a @code{TREE_LIST} of the
 constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.
 
+@item VEC_DUPLICATE_CST
+These nodes represent a vector constant in which every element has the
+same scalar value.  At present only variable-length vectors use
+@code{VEC_DUPLICATE_CST}; constant-length vectors use @code{VECTOR_CST}
+instead.  The scalar element value is given by
+@code{VEC_DUPLICATE_CST_ELT} and has the same restrictions as the
+element of a @code{VECTOR_CST}.
+
 @item STRING_CST
 These nodes represent string-constants.  The @code{TREE_STRING_LENGTH}
 returns the length of the string, as an @code{int}.  The
@@ -1692,6 +1701,7 @@ a value from @code{enum annot_expr_kind}
 
 @node Vectors
 @subsection Vectors
+@tindex VEC_DUPLICATE_EXPR
 @tindex VEC_LSHIFT_EXPR
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
@@ -1703,9 +1713,14 @@ a value from @code{enum annot_expr_kind}
 @tindex VEC_PACK_TRUNC_EXPR
 @tindex VEC_PACK_SAT_EXPR
 @tindex VEC_PACK_FIX_TRUNC_EXPR
+@tindex VEC_COND_EXPR
 @tindex SAD_EXPR
 
 @table @code
+@item VEC_DUPLICATE_EXPR
+This node has a single operand and represents a vector in which every
+element is equal to that operand.
+
 @item VEC_LSHIFT_EXPR
 @itemx VEC_RSHIFT_EXPR
 These nodes represent whole vector left and right shifts, respectively.
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	2017-11-06 12:40:39.845713389 +0000
+++ gcc/doc/md.texi	2017-11-06 12:40:40.278630081 +0000
@@ -4888,6 +4888,17 @@ and operand 1 is parallel containing val
 the vector mode @var{m}, or a vector mode with the same element mode and
 smaller number of elements.
 
+@cindex @code{vec_duplicate@var{m}} instruction pattern
+@item @samp{vec_duplicate@var{m}}
+Initialize vector output operand 0 so that each element has the value given
+by scalar input operand 1.  The vector has mode @var{m} and the scalar has
+the mode appropriate for one element of @var{m}.
+
+This pattern only handles duplicates of non-constant inputs.  Constant
+vectors go through the @code{mov@var{m}} pattern instead.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
 @item @samp{vec_cmp@var{m}@var{n}}
 Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
Index: gcc/tree.def
===================================================================
--- gcc/tree.def	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree.def	2017-11-06 12:40:40.292531076 +0000
@@ -304,6 +304,11 @@ DEFTREECODE (COMPLEX_CST, "complex_cst",
 /* Contents are in VECTOR_CST_ELTS field.  */
 DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
 
+/* Represents a vector constant in which every element is equal to
+   VEC_DUPLICATE_CST_ELT.  This is only ever used for variable-length
+   vectors; fixed-length vectors must use VECTOR_CST instead.  */
+DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)
+
 /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */
 DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)
 
@@ -534,6 +539,9 @@ DEFTREECODE (TARGET_EXPR, "target_expr",
    1 and 2 are NULL.  The operands are then taken from the cfg edges. */
 DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)
 
+/* Represents a vector in which every element is equal to operand 0.  */
+DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)
+
 /* Vector conditional expression. It is like COND_EXPR, but with
    vector operands.
 
Index: gcc/tree-core.h
===================================================================
--- gcc/tree-core.h	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-core.h	2017-11-06 12:40:40.288559363 +0000
@@ -975,7 +975,8 @@ struct GTY(()) tree_base {
     /* VEC length.  This field is only used with TREE_VEC.  */
     int length;
 
-    /* Number of elements.  This field is only used with VECTOR_CST.  */
+    /* Number of elements.  This field is only used with VECTOR_CST
+       and VEC_DUPLICATE_CST.  It is always 1 for VEC_DUPLICATE_CST.  */
     unsigned int nelts;
 
     /* SSA version number.  This field is only used with SSA_NAME.  */
@@ -1065,7 +1066,7 @@ struct GTY(()) tree_base {
    public_flag:
 
        TREE_OVERFLOW in
-           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST
+           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST, VEC_DUPLICATE_CST
 
        TREE_PUBLIC in
            VAR_DECL, FUNCTION_DECL
@@ -1332,7 +1333,7 @@ struct GTY(()) tree_complex {
 
 struct GTY(()) tree_vector {
   struct tree_typed typed;
-  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];
+  tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];
 };
 
 struct GTY(()) tree_identifier {
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree.h	2017-11-06 12:40:40.293524004 +0000
@@ -709,8 +709,8 @@ #define TREE_SYMBOL_REFERENCED(NODE) \
 #define TYPE_REF_CAN_ALIAS_ALL(NODE) \
   (PTR_OR_REF_CHECK (NODE)->base.static_flag)
 
-/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, or VECTOR_CST, this means
-   there was an overflow in folding.  */
+/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST or VEC_DUPLICATE_CST,
+   this means there was an overflow in folding.  */
 
 #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)
 
@@ -1009,6 +1009,10 @@ #define VECTOR_CST_NELTS(NODE) (VECTOR_C
 #define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)
 #define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])
 
+/* In a VEC_DUPLICATE_CST node.  */
+#define VEC_DUPLICATE_CST_ELT(NODE) \
+  (VEC_DUPLICATE_CST_CHECK (NODE)->vector.elts[0])
+
 /* Define fields and accessors for some special-purpose tree nodes.  */
 
 #define IDENTIFIER_LENGTH(NODE) \
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree.c	2017-11-06 12:40:40.292531076 +0000
@@ -464,6 +464,7 @@ tree_node_structure_for_code (enum tree_
     case FIXED_CST:		return TS_FIXED_CST;
     case COMPLEX_CST:		return TS_COMPLEX;
     case VECTOR_CST:		return TS_VECTOR;
+    case VEC_DUPLICATE_CST:	return TS_VECTOR;
     case STRING_CST:		return TS_STRING;
       /* tcc_exceptional cases.  */
     case ERROR_MARK:		return TS_COMMON;
@@ -829,6 +830,7 @@ tree_code_size (enum tree_code code)
 	case FIXED_CST:		return sizeof (tree_fixed_cst);
 	case COMPLEX_CST:	return sizeof (tree_complex);
 	case VECTOR_CST:	return sizeof (tree_vector);
+	case VEC_DUPLICATE_CST:	return sizeof (tree_vector);
 	case STRING_CST:	gcc_unreachable ();
 	default:
 	  gcc_checking_assert (code >= NUM_TREE_CODES);
@@ -890,6 +892,9 @@ tree_size (const_tree node)
       return (sizeof (struct tree_vector)
 	      + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));
 
+    case VEC_DUPLICATE_CST:
+      return sizeof (struct tree_vector);
+
     case STRING_CST:
       return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) + 1;
 
@@ -1697,6 +1702,34 @@ cst_and_fits_in_hwi (const_tree x)
 	  && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));
 }
 
+/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.
+
+   This function is only suitable for callers that know TYPE is a
+   variable-length vector and specifically need a VEC_DUPLICATE_CST node.
+   Use build_vector_from_val to duplicate a general scalar into a general
+   vector type.  */
+
+static tree
+build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
+{
+  /* Shouldn't be used until we have variable-length vectors.  */
+  gcc_unreachable ();
+
+  int length = sizeof (struct tree_vector);
+
+  record_node_allocation_statistics (VEC_DUPLICATE_CST, length);
+
+  tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
+
+  TREE_SET_CODE (t, VEC_DUPLICATE_CST);
+  TREE_TYPE (t) = type;
+  t->base.u.nelts = 1;
+  VEC_DUPLICATE_CST_ELT (t) = exp;
+  TREE_CONSTANT (t) = 1;
+
+  return t;
+}
+
 /* Build a newly constructed VECTOR_CST node of length LEN.  */
 
 tree
@@ -1790,6 +1823,13 @@ build_vector_from_val (tree vectype, tre
   gcc_checking_assert (types_compatible_p (TYPE_MAIN_VARIANT (TREE_TYPE (sc)),
 					   TREE_TYPE (vectype)));
 
+  if (0)
+    {
+      if (CONSTANT_CLASS_P (sc))
+	return build_vec_duplicate_cst (vectype, sc);
+      return fold_build1 (VEC_DUPLICATE_EXPR, vectype, sc);
+    }
+
   if (CONSTANT_CLASS_P (sc))
     {
       auto_vec<tree, 32> v (nunits);
@@ -2358,6 +2398,8 @@ integer_zerop (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2384,6 +2426,8 @@ integer_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2422,6 +2466,9 @@ integer_all_onesp (const_tree expr)
       return 1;
     }
 
+  else if (TREE_CODE (expr) == VEC_DUPLICATE_CST)
+    return integer_all_onesp (VEC_DUPLICATE_CST_ELT (expr));
+
   else if (TREE_CODE (expr) != INTEGER_CST)
     return 0;
 
@@ -2478,7 +2525,7 @@ integer_nonzerop (const_tree expr)
 int
 integer_truep (const_tree expr)
 {
-  if (TREE_CODE (expr) == VECTOR_CST)
+  if (TREE_CODE (expr) == VECTOR_CST || TREE_CODE (expr) == VEC_DUPLICATE_CST)
     return integer_all_onesp (expr);
   return integer_onep (expr);
 }
@@ -2649,6 +2696,8 @@ real_zerop (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2677,6 +2726,8 @@ real_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2704,6 +2755,8 @@ real_minus_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_minus_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -7106,6 +7159,9 @@ add_expr (const_tree t, inchash::hash &h
 	  inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags);
 	return;
       }
+    case VEC_DUPLICATE_CST:
+      inchash::add_expr (VEC_DUPLICATE_CST_ELT (t), hstate);
+      return;
     case SSA_NAME:
       /* We can just compare by pointer.  */
       hstate.add_hwi (SSA_NAME_VERSION (t));
@@ -10367,6 +10423,9 @@ initializer_zerop (const_tree init)
 	return true;
       }
 
+    case VEC_DUPLICATE_CST:
+      return initializer_zerop (VEC_DUPLICATE_CST_ELT (init));
+
     case CONSTRUCTOR:
       {
 	unsigned HOST_WIDE_INT idx;
@@ -10412,7 +10471,13 @@ uniform_vector_p (const_tree vec)
 
   gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));
 
-  if (TREE_CODE (vec) == VECTOR_CST)
+  if (TREE_CODE (vec) == VEC_DUPLICATE_CST)
+    return VEC_DUPLICATE_CST_ELT (vec);
+
+  else if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)
+    return TREE_OPERAND (vec, 0);
+
+  else if (TREE_CODE (vec) == VECTOR_CST)
     {
       first = VECTOR_CST_ELT (vec, 0);
       for (i = 1; i < VECTOR_CST_NELTS (vec); ++i)
@@ -11144,6 +11209,7 @@ #define WALK_SUBTREE_TAIL(NODE)				\
     case REAL_CST:
     case FIXED_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case BLOCK:
     case PLACEHOLDER_EXPR:
@@ -12430,6 +12496,12 @@ drop_tree_overflow (tree t)
 	    elt = drop_tree_overflow (elt);
 	}
     }
+  if (TREE_CODE (t) == VEC_DUPLICATE_CST)
+    {
+      tree *elt = &VEC_DUPLICATE_CST_ELT (t);
+      if (TREE_OVERFLOW (*elt))
+	*elt = drop_tree_overflow (*elt);
+    }
   return t;
 }
 
@@ -13850,6 +13922,102 @@ test_integer_constants ()
   ASSERT_EQ (type, TREE_TYPE (zero));
 }
 
+/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs
+   for integral type TYPE.  */
+
+static void
+test_vec_duplicate_predicates_int (tree type)
+{
+  scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (type);
+  machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);
+  /* This will be 1 if VEC_MODE isn't a vector mode.  */
+  unsigned int nunits = GET_MODE_NUNITS (vec_mode);
+
+  tree vec_type = build_vector_type (type, nunits);
+
+  tree zero = build_zero_cst (type);
+  tree vec_zero = build_vector_from_val (vec_type, zero);
+  ASSERT_TRUE (integer_zerop (vec_zero));
+  ASSERT_FALSE (integer_onep (vec_zero));
+  ASSERT_FALSE (integer_minus_onep (vec_zero));
+  ASSERT_FALSE (integer_all_onesp (vec_zero));
+  ASSERT_FALSE (integer_truep (vec_zero));
+  ASSERT_TRUE (initializer_zerop (vec_zero));
+
+  tree one = build_one_cst (type);
+  tree vec_one = build_vector_from_val (vec_type, one);
+  ASSERT_FALSE (integer_zerop (vec_one));
+  ASSERT_TRUE (integer_onep (vec_one));
+  ASSERT_FALSE (integer_minus_onep (vec_one));
+  ASSERT_FALSE (integer_all_onesp (vec_one));
+  ASSERT_FALSE (integer_truep (vec_one));
+  ASSERT_FALSE (initializer_zerop (vec_one));
+
+  tree minus_one = build_minus_one_cst (type);
+  tree vec_minus_one = build_vector_from_val (vec_type, minus_one);
+  ASSERT_FALSE (integer_zerop (vec_minus_one));
+  ASSERT_FALSE (integer_onep (vec_minus_one));
+  ASSERT_TRUE (integer_minus_onep (vec_minus_one));
+  ASSERT_TRUE (integer_all_onesp (vec_minus_one));
+  ASSERT_TRUE (integer_truep (vec_minus_one));
+  ASSERT_FALSE (initializer_zerop (vec_minus_one));
+
+  tree x = create_tmp_var_raw (type, "x");
+  tree vec_x = build1 (VEC_DUPLICATE_EXPR, vec_type, x);
+  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
+  ASSERT_EQ (uniform_vector_p (vec_one), one);
+  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
+  ASSERT_EQ (uniform_vector_p (vec_x), x);
+}
+
+/* Verify predicate handling of VEC_DUPLICATE_CSTs for floating-point
+   type TYPE.  */
+
+static void
+test_vec_duplicate_predicates_float (tree type)
+{
+  scalar_float_mode float_mode = SCALAR_FLOAT_TYPE_MODE (type);
+  machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (float_mode);
+  /* This will be 1 if VEC_MODE isn't a vector mode.  */
+  unsigned int nunits = GET_MODE_NUNITS (vec_mode);
+
+  tree vec_type = build_vector_type (type, nunits);
+
+  tree zero = build_zero_cst (type);
+  tree vec_zero = build_vector_from_val (vec_type, zero);
+  ASSERT_TRUE (real_zerop (vec_zero));
+  ASSERT_FALSE (real_onep (vec_zero));
+  ASSERT_FALSE (real_minus_onep (vec_zero));
+  ASSERT_TRUE (initializer_zerop (vec_zero));
+
+  tree one = build_one_cst (type);
+  tree vec_one = build_vector_from_val (vec_type, one);
+  ASSERT_FALSE (real_zerop (vec_one));
+  ASSERT_TRUE (real_onep (vec_one));
+  ASSERT_FALSE (real_minus_onep (vec_one));
+  ASSERT_FALSE (initializer_zerop (vec_one));
+
+  tree minus_one = build_minus_one_cst (type);
+  tree vec_minus_one = build_vector_from_val (vec_type, minus_one);
+  ASSERT_FALSE (real_zerop (vec_minus_one));
+  ASSERT_FALSE (real_onep (vec_minus_one));
+  ASSERT_TRUE (real_minus_onep (vec_minus_one));
+  ASSERT_FALSE (initializer_zerop (vec_minus_one));
+
+  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
+  ASSERT_EQ (uniform_vector_p (vec_one), one);
+  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
+}
+
+/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
+
+static void
+test_vec_duplicate_predicates ()
+{
+  test_vec_duplicate_predicates_int (integer_type_node);
+  test_vec_duplicate_predicates_float (float_type_node);
+}
+
 /* Verify identifiers.  */
 
 static void
@@ -13878,6 +14046,7 @@ test_labels ()
 tree_c_tests ()
 {
   test_integer_constants ();
+  test_vec_duplicate_predicates ();
   test_identifiers ();
   test_labels ();
 }
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/cfgexpand.c	2017-11-06 12:40:40.276644225 +0000
@@ -5068,6 +5068,8 @@ expand_debug_expr (tree exp)
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
     case VEC_PERM_EXPR:
+    case VEC_DUPLICATE_CST:
+    case VEC_DUPLICATE_EXPR:
       return NULL;
 
     /* Misc codes.  */
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-pretty-print.c	2017-11-06 12:40:40.289552291 +0000
@@ -1802,6 +1802,12 @@ dump_generic_node (pretty_printer *pp, t
       }
       break;
 
+    case VEC_DUPLICATE_CST:
+      pp_string (pp, "{ ");
+      dump_generic_node (pp, VEC_DUPLICATE_CST_ELT (node), spc, flags, false);
+      pp_string (pp, ", ... }");
+      break;
+
     case FUNCTION_TYPE:
     case METHOD_TYPE:
       dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);
@@ -3231,6 +3237,15 @@ dump_generic_node (pretty_printer *pp, t
       pp_string (pp, " > ");
       break;
 
+    case VEC_DUPLICATE_EXPR:
+      pp_space (pp);
+      for (str = get_tree_code_name (code); *str; str++)
+	pp_character (pp, TOUPPER (*str));
+      pp_string (pp, " < ");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, " > ");
+      break;
+
     case VEC_UNPACK_HI_EXPR:
       pp_string (pp, " VEC_UNPACK_HI_EXPR < ");
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-vect-generic.c	2017-11-06 12:40:40.291538147 +0000
@@ -1419,6 +1419,8 @@ lower_vec_perm (gimple_stmt_iterator *gs
 ssa_uniform_vector_p (tree op)
 {
   if (TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_EXPR
       || TREE_CODE (op) == CONSTRUCTOR)
     return uniform_vector_p (op);
   if (TREE_CODE (op) == SSA_NAME)
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/dwarf2out.c	2017-11-06 12:40:40.280615937 +0000
@@ -18878,6 +18878,7 @@ rtl_for_decl_init (tree init, tree type)
 	switch (TREE_CODE (init))
 	  {
 	  case VECTOR_CST:
+	  case VEC_DUPLICATE_CST:
 	    break;
 	  case CONSTRUCTOR:
 	    if (TREE_CONSTANT (init))
Index: gcc/gimple-expr.h
===================================================================
--- gcc/gimple-expr.h	2017-11-06 12:40:39.845713389 +0000
+++ gcc/gimple-expr.h	2017-11-06 12:40:40.282601794 +0000
@@ -134,6 +134,7 @@ is_gimple_constant (const_tree t)
     case FIXED_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
       return true;
 
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/gimplify.c	2017-11-06 12:40:40.283594722 +0000
@@ -11507,6 +11507,7 @@ gimplify_expr (tree *expr_p, gimple_seq
 	case STRING_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	  /* Drop the overflow flag on constants, we do not want
 	     that in the GIMPLE IL.  */
 	  if (TREE_OVERFLOW_P (*expr_p))
Index: gcc/graphite-isl-ast-to-gimple.c
===================================================================
--- gcc/graphite-isl-ast-to-gimple.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/graphite-isl-ast-to-gimple.c	2017-11-06 12:40:40.284587650 +0000
@@ -211,7 +211,8 @@ enum phi_node_kind
     return TREE_CODE (op) == INTEGER_CST
       || TREE_CODE (op) == REAL_CST
       || TREE_CODE (op) == COMPLEX_CST
-      || TREE_CODE (op) == VECTOR_CST;
+      || TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_CST;
   }
 
 private:
Index: gcc/graphite-scop-detection.c
===================================================================
--- gcc/graphite-scop-detection.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/graphite-scop-detection.c	2017-11-06 12:40:40.284587650 +0000
@@ -1212,6 +1212,7 @@ scan_tree_for_params (sese_info_p s, tre
     case REAL_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       break;
 
    default:
Index: gcc/ipa-icf-gimple.c
===================================================================
--- gcc/ipa-icf-gimple.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/ipa-icf-gimple.c	2017-11-06 12:40:40.285580578 +0000
@@ -333,6 +333,7 @@ func_checker::compare_cst_or_decl (tree
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case REAL_CST:
       {
@@ -528,6 +529,7 @@ func_checker::compare_operand (tree t1,
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case REAL_CST:
     case FUNCTION_DECL:
Index: gcc/ipa-icf.c
===================================================================
--- gcc/ipa-icf.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/ipa-icf.c	2017-11-06 12:40:40.285580578 +0000
@@ -1479,6 +1479,7 @@ sem_item::add_expr (const_tree exp, inch
     case STRING_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       inchash::add_expr (exp, hstate);
       break;
     case CONSTRUCTOR:
@@ -2036,6 +2037,9 @@ sem_variable::equals (tree t1, tree t2)
 
 	return 1;
       }
+    case VEC_DUPLICATE_CST:
+      return sem_variable::equals (VEC_DUPLICATE_CST_ELT (t1),
+				   VEC_DUPLICATE_CST_ELT (t2));
     case ARRAY_REF:
     case ARRAY_RANGE_REF:
       {
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	2017-11-06 12:40:39.845713389 +0000
+++ gcc/match.pd	2017-11-06 12:40:40.285580578 +0000
@@ -958,6 +958,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (match negate_expr_p
  VECTOR_CST
  (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))
+(match negate_expr_p
+ VEC_DUPLICATE_CST
+ (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))
 
 /* (-A) * (-B) -> A * B  */
 (simplify
Index: gcc/print-tree.c
===================================================================
--- gcc/print-tree.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/print-tree.c	2017-11-06 12:40:40.287566435 +0000
@@ -783,6 +783,10 @@ print_node (FILE *file, const char *pref
 	  }
 	  break;
 
+	case VEC_DUPLICATE_CST:
+	  print_node (file, "elt", VEC_DUPLICATE_CST_ELT (node), indent + 4);
+	  break;
+
 	case COMPLEX_CST:
 	  print_node (file, "real", TREE_REALPART (node), indent + 4);
 	  print_node (file, "imag", TREE_IMAGPART (node), indent + 4);
Index: gcc/tree-chkp.c
===================================================================
--- gcc/tree-chkp.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-chkp.c	2017-11-06 12:40:40.288559363 +0000
@@ -3799,6 +3799,7 @@ chkp_find_bounds_1 (tree ptr, tree ptr_s
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       if (integer_zerop (ptr_src))
 	bounds = chkp_get_none_bounds ();
       else
Index: gcc/tree-loop-distribution.c
===================================================================
--- gcc/tree-loop-distribution.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-loop-distribution.c	2017-11-06 12:40:40.289552291 +0000
@@ -927,6 +927,9 @@ const_with_all_bytes_same (tree val)
           && CONSTRUCTOR_NELTS (val) == 0))
     return 0;
 
+  if (TREE_CODE (val) == VEC_DUPLICATE_CST)
+    return const_with_all_bytes_same (VEC_DUPLICATE_CST_ELT (val));
+
   if (real_zerop (val))
     {
       /* Only return 0 for +0.0, not for -0.0, which doesn't have
Index: gcc/tree-ssa-loop.c
===================================================================
--- gcc/tree-ssa-loop.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-ssa-loop.c	2017-11-06 12:40:40.290545219 +0000
@@ -616,6 +616,7 @@ for_each_index (tree *addr_p, bool (*cbc
 	case STRING_CST:
 	case RESULT_DECL:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case COMPLEX_CST:
 	case INTEGER_CST:
 	case REAL_CST:
Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-ssa-pre.c	2017-11-06 12:40:40.290545219 +0000
@@ -2627,6 +2627,7 @@ create_component_ref_by_pieces_1 (basic_
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case REAL_CST:
     case CONSTRUCTOR:
     case VAR_DECL:
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-ssa-sccvn.c	2017-11-06 12:40:40.291538147 +0000
@@ -866,6 +866,7 @@ copy_reference_ops_from_ref (tree ref, v
 	case INTEGER_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case REAL_CST:
 	case FIXED_CST:
 	case CONSTRUCTOR:
@@ -1058,6 +1059,7 @@ ao_ref_init_from_vn_reference (ao_ref *r
 	case INTEGER_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case REAL_CST:
 	case CONSTRUCTOR:
 	case CONST_DECL:
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/varasm.c	2017-11-06 12:40:40.293524004 +0000
@@ -3068,6 +3068,9 @@ const_hash_1 (const tree exp)
     CASE_CONVERT:
       return const_hash_1 (TREE_OPERAND (exp, 0)) * 7 + 2;
 
+    case VEC_DUPLICATE_CST:
+      return const_hash_1 (VEC_DUPLICATE_CST_ELT (exp)) * 7 + 3;
+
     default:
       /* A language specific constant. Just hash the code.  */
       return code;
@@ -3158,6 +3161,10 @@ compare_constant (const tree t1, const t
 	return 1;
       }
 
+    case VEC_DUPLICATE_CST:
+      return compare_constant (VEC_DUPLICATE_CST_ELT (t1),
+			       VEC_DUPLICATE_CST_ELT (t2));
+
     case CONSTRUCTOR:
       {
 	vec<constructor_elt, va_gc> *v1, *v2;
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/fold-const.c	2017-11-06 12:40:40.282601794 +0000
@@ -418,6 +418,9 @@ negate_expr_p (tree t)
 	return true;
       }
 
+    case VEC_DUPLICATE_CST:
+      return negate_expr_p (VEC_DUPLICATE_CST_ELT (t));
+
     case COMPLEX_EXPR:
       return negate_expr_p (TREE_OPERAND (t, 0))
 	     && negate_expr_p (TREE_OPERAND (t, 1));
@@ -579,6 +582,14 @@ fold_negate_expr_1 (location_t loc, tree
 	return build_vector (type, elts);
       }
 
+    case VEC_DUPLICATE_CST:
+      {
+	tree sub = fold_negate_expr (loc, VEC_DUPLICATE_CST_ELT (t));
+	if (!sub)
+	  return NULL_TREE;
+	return build_vector_from_val (type, sub);
+      }
+
     case COMPLEX_EXPR:
       if (negate_expr_p (t))
 	return fold_build2_loc (loc, COMPLEX_EXPR, type,
@@ -1436,6 +1447,16 @@ const_binop (enum tree_code code, tree a
       return build_vector (type, elts);
     }
 
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == VEC_DUPLICATE_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1),
+			      VEC_DUPLICATE_CST_ELT (arg2));
+      if (!sub)
+	return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
+
   /* Shifts allow a scalar offset for a vector.  */
   if (TREE_CODE (arg1) == VECTOR_CST
       && TREE_CODE (arg2) == INTEGER_CST)
@@ -1459,6 +1480,15 @@ const_binop (enum tree_code code, tree a
 
       return build_vector (type, elts);
     }
+
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == INTEGER_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1), arg2);
+      if (!sub)
+	return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
   return NULL_TREE;
 }
 
@@ -1652,6 +1682,13 @@ const_unop (enum tree_code code, tree ty
 	  if (i == count)
 	    return build_vector (type, elements);
 	}
+      else if (TREE_CODE (arg0) == VEC_DUPLICATE_CST)
+	{
+	  tree sub = const_unop (BIT_NOT_EXPR, TREE_TYPE (type),
+				 VEC_DUPLICATE_CST_ELT (arg0));
+	  if (sub)
+	    return build_vector_from_val (type, sub);
+	}
       break;
 
     case TRUTH_NOT_EXPR:
@@ -1737,6 +1774,11 @@ const_unop (enum tree_code code, tree ty
 	return res;
       }
 
+    case VEC_DUPLICATE_EXPR:
+      if (CONSTANT_CLASS_P (arg0))
+	return build_vector_from_val (type, arg0);
+      return NULL_TREE;
+
     default:
       break;
     }
@@ -2167,6 +2209,15 @@ fold_convert_const (enum tree_code code,
 	    }
 	  return build_vector (type, v);
 	}
+      if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+	  && (TYPE_VECTOR_SUBPARTS (type)
+	      == TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1))))
+	{
+	  tree sub = fold_convert_const (code, TREE_TYPE (type),
+					 VEC_DUPLICATE_CST_ELT (arg1));
+	  if (sub)
+	    return build_vector_from_val (type, sub);
+	}
     }
   return NULL_TREE;
 }
@@ -2953,6 +3004,10 @@ operand_equal_p (const_tree arg0, const_
 	  return 1;
 	}
 
+      case VEC_DUPLICATE_CST:
+	return operand_equal_p (VEC_DUPLICATE_CST_ELT (arg0),
+				VEC_DUPLICATE_CST_ELT (arg1), flags);
+
       case COMPLEX_CST:
 	return (operand_equal_p (TREE_REALPART (arg0), TREE_REALPART (arg1),
 				 flags)
@@ -7475,6 +7530,20 @@ can_native_interpret_type_p (tree type)
 static tree
 fold_view_convert_expr (tree type, tree expr)
 {
+  /* Recurse on duplicated vectors if the target type is also a vector
+     and if the elements line up.  */
+  tree expr_type = TREE_TYPE (expr);
+  if (TREE_CODE (expr) == VEC_DUPLICATE_CST
+      && VECTOR_TYPE_P (type)
+      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (expr_type)
+      && TYPE_SIZE (TREE_TYPE (type)) == TYPE_SIZE (TREE_TYPE (expr_type)))
+    {
+      tree sub = fold_view_convert_expr (TREE_TYPE (type),
+					 VEC_DUPLICATE_CST_ELT (expr));
+      if (sub)
+	return build_vector_from_val (type, sub);
+    }
+
   /* We support up to 512-bit values (for V8DFmode).  */
   unsigned char buffer[64];
   int len;
@@ -8874,6 +8943,15 @@ exact_inverse (tree type, tree cst)
 	return build_vector (type, elts);
       }
 
+    case VEC_DUPLICATE_CST:
+      {
+	tree sub = exact_inverse (TREE_TYPE (type),
+				  VEC_DUPLICATE_CST_ELT (cst));
+	if (!sub)
+	  return NULL_TREE;
+	return build_vector_from_val (type, sub);
+      }
+
     default:
       return NULL_TREE;
     }
@@ -11939,6 +12017,9 @@ fold_checksum_tree (const_tree expr, str
 	  for (i = 0; i < (int) VECTOR_CST_NELTS (expr); ++i)
 	    fold_checksum_tree (VECTOR_CST_ELT (expr, i), ctx, ht);
 	  break;
+	case VEC_DUPLICATE_CST:
+	  fold_checksum_tree (VEC_DUPLICATE_CST_ELT (expr), ctx, ht);
+	  break;
 	default:
 	  break;
 	}
@@ -14412,6 +14493,41 @@ test_vector_folding ()
   ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));
 }
 
+/* Verify folding of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
+
+static void
+test_vec_duplicate_folding ()
+{
+  scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (ssizetype);
+  machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);
+  /* This will be 1 if VEC_MODE isn't a vector mode.  */
+  unsigned int nunits = GET_MODE_NUNITS (vec_mode);
+
+  tree type = build_vector_type (ssizetype, nunits);
+  tree dup5 = build_vector_from_val (type, ssize_int (5));
+  tree dup3 = build_vector_from_val (type, ssize_int (3));
+
+  tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));
+
+  tree not_dup5 = fold_unary (BIT_NOT_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (not_dup5), ssize_int (-6));
+
+  tree dup5_plus_dup3 = fold_binary (PLUS_EXPR, type, dup5, dup3);
+  ASSERT_EQ (uniform_vector_p (dup5_plus_dup3), ssize_int (8));
+
+  tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));
+  ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));
+
+  tree size_vector = build_vector_type (sizetype, nunits);
+  tree size_dup5 = fold_convert (size_vector, dup5);
+  ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));
+
+  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));
+  tree dup5_cst = build_vector_from_val (type, ssize_int (5));
+  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -14419,6 +14535,7 @@ fold_const_c_tests ()
 {
   test_arithmetic_folding ();
   test_vector_folding ();
+  test_vec_duplicate_folding ();
 }
 
 } // namespace selftest
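
The VEC_DUPLICATE_CST cases added to fold-const.c above share one pattern: fold the operation once on the duplicated element, give up if that fails, and otherwise rebroadcast the result. A minimal standalone sketch of that pattern follows. `dup_cst`, `op`, `fold_scalar` and the `fold_dup_*` helpers are hypothetical stand-ins for illustration, not GCC's tree API, and overflow handling is deliberately left out.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a VEC_DUPLICATE_CST: a vector constant that is
// fully described by the single scalar held in every lane.
struct dup_cst
{
  long elt;
};

enum class op { negate, bit_not, plus, lshift };

// Fold a scalar operation, returning nullopt when no folding is possible
// (the analogue of const_unop/const_binop returning NULL_TREE).  Operands
// are assumed small enough that the arithmetic cannot overflow.
static std::optional<long>
fold_scalar (op code, long a, long b = 0)
{
  switch (code)
    {
    case op::negate:
      return -a;
    case op::bit_not:
      return ~a;
    case op::plus:
      return a + b;
    case op::lshift:
      // Refuse negative operands and out-of-range shift counts.
      if (a < 0 || b < 0 || b >= 31)
        return std::nullopt;
      return a << b;
    }
  return std::nullopt;
}

// Unary operation on a duplicated constant: fold the element once and
// rebroadcast, as the new const_unop case does for BIT_NOT_EXPR.
static std::optional<dup_cst>
fold_dup_unary (op code, dup_cst v)
{
  std::optional<long> sub = fold_scalar (code, v.elt);
  if (!sub)
    return std::nullopt;
  return dup_cst { *sub };
}

// Duplicated constant combined with a scalar, mirroring the new
// VEC_DUPLICATE_CST-by-INTEGER_CST case in const_binop.
static std::optional<dup_cst>
fold_dup_scalar (op code, dup_cst v, long scalar)
{
  std::optional<long> sub = fold_scalar (code, v.elt, scalar);
  if (!sub)
    return std::nullopt;
  return dup_cst { *sub };
}

// Two duplicated constants: only the two elements need to be combined.
static std::optional<dup_cst>
fold_dup_binary (op code, dup_cst a, dup_cst b)
{
  return fold_dup_scalar (code, a, b.elt);
}
```

Used with the same values as test_vec_duplicate_folding, `fold_dup_unary (op::negate, dup_cst { 5 })` yields an element of -5 and `fold_dup_scalar (op::lshift, dup_cst { 5 }, 2)` yields 20. The work is independent of the number of lanes, which is what makes the approach usable for variable-length vectors.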
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def	2017-11-06 12:40:39.845713389 +0000
+++ gcc/optabs.def	2017-11-06 12:40:40.286573506 +0000
@@ -364,3 +364,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I
 
 OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")
 OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
+
+OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
Index: gcc/optabs-tree.c
===================================================================
--- gcc/optabs-tree.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/optabs-tree.c	2017-11-06 12:40:40.286573506 +0000
@@ -210,6 +210,9 @@ optab_for_tree_code (enum tree_code code
       return TYPE_UNSIGNED (type) ?
 	vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
 
+    case VEC_DUPLICATE_EXPR:
+      return vec_duplicate_optab;
+
     default:
       break;
     }
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h	2017-11-06 12:40:39.845713389 +0000
+++ gcc/optabs.h	2017-11-06 12:40:40.287566435 +0000
@@ -181,6 +181,7 @@ extern rtx simplify_expand_binop (machin
 				  enum optab_methods methods);
 extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,
 				enum optab_methods);
+extern rtx expand_vector_broadcast (machine_mode, rtx);
 
 /* Generate code for a simple binary or unary operation.  "Simple" in
    this case means "can be unambiguously described by a (mode, code)
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/optabs.c	2017-11-06 12:40:40.286573506 +0000
@@ -367,7 +367,7 @@ force_expand_binop (machine_mode mode, o
    mode of OP must be the element mode of VMODE.  If OP is a constant,
    then the return value will be a constant.  */
 
-static rtx
+rtx
 expand_vector_broadcast (machine_mode vmode, rtx op)
 {
   enum insn_code icode;
@@ -380,6 +380,16 @@ expand_vector_broadcast (machine_mode vm
   if (valid_for_const_vec_duplicate_p (vmode, op))
     return gen_const_vec_duplicate (vmode, op);
 
+  icode = optab_handler (vec_duplicate_optab, vmode);
+  if (icode != CODE_FOR_nothing)
+    {
+      struct expand_operand ops[2];
+      create_output_operand (&ops[0], NULL_RTX, vmode);
+      create_input_operand (&ops[1], op, GET_MODE (op));
+      expand_insn (icode, 2, ops);
+      return ops[0].value;
+    }
+
   /* ??? If the target doesn't have a vec_init, then we have no easy way
      of performing this operation.  Most of this sort of generic support
      is hidden away in the vector lowering support in gimple.  */
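
With this hunk, expand_vector_broadcast tries its strategies in a fixed order: a constant vector when the operand is a suitable constant, then the target's vec_duplicate pattern, and only then the older vec_init route mentioned in the trailing comment. The sketch below models just that ordering. `target_caps`, `strategy` and `choose_broadcast_strategy` are hypothetical names used for illustration; they do not exist in GCC, and the final `none` case is an assumption about what happens when no pattern is available.

```cpp
#include <cassert>

// Which mechanism would be used to broadcast a scalar into a vector.
enum class strategy { const_vector, vec_duplicate_insn, vec_init_insn, none };

// Hypothetical summary of what the target provides for a given vector mode.
struct target_caps
{
  bool has_vec_duplicate;  // a vec_duplicate pattern exists for the mode
  bool has_vec_init;       // a vec_init pattern exists for the mode
};

// Pick the first applicable strategy, in the order the patched function
// checks them.
static strategy
choose_broadcast_strategy (bool op_is_valid_constant, const target_caps &caps)
{
  // Constants never reach the instruction patterns.
  if (op_is_valid_constant)
    return strategy::const_vector;
  // New in this patch: prefer a dedicated duplicate instruction.
  if (caps.has_vec_duplicate)
    return strategy::vec_duplicate_insn;
  // Pre-existing fallback: build the vector element by element.
  if (caps.has_vec_init)
    return strategy::vec_init_insn;
  return strategy::none;
}
```

The ordering matches the md.texi wording in this patch: the vec_duplicate pattern only ever sees non-constant inputs, because constant duplicates are caught by the first check.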
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/expr.c	2017-11-06 12:40:40.281608865 +0000
@@ -6576,7 +6576,8 @@ store_constructor (tree exp, rtx target,
 	constructor_elt *ce;
 	int i;
 	int need_to_clear;
-	int icode = CODE_FOR_nothing;
+	insn_code icode = CODE_FOR_nothing;
+	tree elt;
 	tree elttype = TREE_TYPE (type);
 	int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));
 	machine_mode eltmode = TYPE_MODE (elttype);
@@ -6586,13 +6587,30 @@ store_constructor (tree exp, rtx target,
 	unsigned n_elts;
 	alias_set_type alias;
 	bool vec_vec_init_p = false;
+	machine_mode mode = GET_MODE (target);
 
 	gcc_assert (eltmode != BLKmode);
 
+	/* Try using vec_duplicate_optab for uniform vectors.  */
+	if (!TREE_SIDE_EFFECTS (exp)
+	    && VECTOR_MODE_P (mode)
+	    && eltmode == GET_MODE_INNER (mode)
+	    && ((icode = optab_handler (vec_duplicate_optab, mode))
+		!= CODE_FOR_nothing)
+	    && (elt = uniform_vector_p (exp)))
+	  {
+	    struct expand_operand ops[2];
+	    create_output_operand (&ops[0], target, mode);
+	    create_input_operand (&ops[1], expand_normal (elt), eltmode);
+	    expand_insn (icode, 2, ops);
+	    if (!rtx_equal_p (target, ops[0].value))
+	      emit_move_insn (target, ops[0].value);
+	    break;
+	  }
+
 	n_elts = TYPE_VECTOR_SUBPARTS (type);
-	if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
+	if (REG_P (target) && VECTOR_MODE_P (mode))
 	  {
-	    machine_mode mode = GET_MODE (target);
 	    machine_mode emode = eltmode;
 
 	    if (CONSTRUCTOR_NELTS (exp)
@@ -6604,7 +6622,7 @@ store_constructor (tree exp, rtx target,
 			    == n_elts);
 		emode = TYPE_MODE (etype);
 	      }
-	    icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
+	    icode = convert_optab_handler (vec_init_optab, mode, emode);
 	    if (icode != CODE_FOR_nothing)
 	      {
 		unsigned int i, n = n_elts;
@@ -6652,7 +6670,7 @@ store_constructor (tree exp, rtx target,
 	if (need_to_clear && size > 0 && !vector)
 	  {
 	    if (REG_P (target))
-	      emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+	      emit_move_insn (target, CONST0_RTX (mode));
 	    else
 	      clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
 	    cleared = 1;
@@ -6660,7 +6678,7 @@ store_constructor (tree exp, rtx target,
 
 	/* Inform later passes that the old value is dead.  */
 	if (!cleared && !vector && REG_P (target))
-	  emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+	  emit_move_insn (target, CONST0_RTX (mode));
 
         if (MEM_P (target))
 	  alias = MEM_ALIAS_SET (target);
@@ -6711,8 +6729,7 @@ store_constructor (tree exp, rtx target,
 
 	if (vector)
 	  emit_insn (GEN_FCN (icode) (target,
-				      gen_rtx_PARALLEL (GET_MODE (target),
-							vector)));
+				      gen_rtx_PARALLEL (mode, vector)));
 	break;
       }
 
@@ -7690,6 +7707,19 @@ expand_operands (tree exp0, tree exp1, r
 }
 
 
+/* Expand constant vector element ELT, which has mode MODE.  This is used
+   for members of VECTOR_CST and VEC_DUPLICATE_CST.  */
+
+static rtx
+const_vector_element (scalar_mode mode, const_tree elt)
+{
+  if (TREE_CODE (elt) == REAL_CST)
+    return const_double_from_real_value (TREE_REAL_CST (elt), mode);
+  if (TREE_CODE (elt) == FIXED_CST)
+    return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);
+  return immed_wide_int_const (wi::to_wide (elt), mode);
+}
+
 /* Return a MEM that contains constant EXP.  DEFER is as for
    output_constant_def and MODIFIER is as for expand_expr.  */
 
@@ -9555,6 +9585,12 @@ #define REDUCE_BIT_FIELD(expr)	(reduce_b
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case VEC_DUPLICATE_EXPR:
+      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
+      target = expand_vector_broadcast (mode, op0);
+      gcc_assert (target);
+      return target;
+
     case BIT_INSERT_EXPR:
       {
 	unsigned bitpos = tree_to_uhwi (treeop2);
@@ -9988,6 +10024,11 @@ expand_expr_real_1 (tree exp, rtx target
 			    tmode, modifier);
       }
 
+    case VEC_DUPLICATE_CST:
+      op0 = const_vector_element (GET_MODE_INNER (mode),
+				  VEC_DUPLICATE_CST_ELT (exp));
+      return gen_const_vec_duplicate (mode, op0);
+
     case CONST_DECL:
       if (modifier == EXPAND_WRITE)
 	{
@@ -11749,8 +11790,7 @@ const_vector_from_tree (tree exp)
 {
   rtvec v;
   unsigned i, units;
-  tree elt;
-  machine_mode inner, mode;
+  machine_mode mode;
 
   mode = TYPE_MODE (TREE_TYPE (exp));
 
@@ -11761,23 +11801,12 @@ const_vector_from_tree (tree exp)
     return const_vector_mask_from_tree (exp);
 
   units = VECTOR_CST_NELTS (exp);
-  inner = GET_MODE_INNER (mode);
 
   v = rtvec_alloc (units);
 
   for (i = 0; i < units; ++i)
-    {
-      elt = VECTOR_CST_ELT (exp, i);
-
-      if (TREE_CODE (elt) == REAL_CST)
-	RTVEC_ELT (v, i) = const_double_from_real_value (TREE_REAL_CST (elt),
-							 inner);
-      else if (TREE_CODE (elt) == FIXED_CST)
-	RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
-							 inner);
-      else
-	RTVEC_ELT (v, i) = immed_wide_int_const (wi::to_wide (elt), inner);
-    }
+    RTVEC_ELT (v, i) = const_vector_element (GET_MODE_INNER (mode),
+					     VECTOR_CST_ELT (exp, i));
 
   return gen_rtx_CONST_VECTOR (mode, v);
 }
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/internal-fn.c	2017-11-06 12:40:40.284587650 +0000
@@ -1911,12 +1911,12 @@ expand_vector_ubsan_overflow (location_t
       emit_move_insn (cntvar, const0_rtx);
       emit_label (loop_lab);
     }
-  if (TREE_CODE (arg0) != VECTOR_CST)
+  if (!CONSTANT_CLASS_P (arg0))
     {
       rtx arg0r = expand_normal (arg0);
       arg0 = make_tree (TREE_TYPE (arg0), arg0r);
     }
-  if (TREE_CODE (arg1) != VECTOR_CST)
+  if (!CONSTANT_CLASS_P (arg1))
     {
       rtx arg1r = expand_normal (arg1);
       arg1 = make_tree (TREE_TYPE (arg1), arg1r);
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-cfg.c	2017-11-06 12:40:40.287566435 +0000
@@ -3798,6 +3798,17 @@ verify_gimple_assign_unary (gassign *stm
     case CONJ_EXPR:
       break;
 
+    case VEC_DUPLICATE_EXPR:
+      if (TREE_CODE (lhs_type) != VECTOR_TYPE
+	  || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))
+	{
+	  error ("vec_duplicate should be from a scalar to a like vector");
+	  debug_generic_expr (lhs_type);
+	  debug_generic_expr (rhs1_type);
+	  return true;
+	}
+      return false;
+
     default:
       gcc_unreachable ();
     }
@@ -4468,6 +4479,7 @@ verify_gimple_assign_single (gassign *st
     case FIXED_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
       return res;
 
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c	2017-11-06 12:40:39.845713389 +0000
+++ gcc/tree-inline.c	2017-11-06 12:40:40.289552291 +0000
@@ -3930,6 +3930,7 @@ estimate_operator_cost (enum tree_code c
     case VEC_PACK_FIX_TRUNC_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
+    case VEC_DUPLICATE_EXPR:
 
       return 1;
Richard Biener Nov. 7, 2017, 10:25 a.m. UTC | #3
On Mon, Nov 6, 2017 at 4:09 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Mon, Oct 23, 2017 at 1:20 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> SVE needs a way of broadcasting a scalar to a variable-length vector.
>>> This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
>>> fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
>>> be used for fixed-length vectors.  VEC_DUPLICATE_EXPR is the tree
>>> equivalent of the existing rtl code VEC_DUPLICATE.
>>>
>>> Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
>>> to mark constant nodes, but in response to last year's RFC, Richard B.
>>> suggested it would be better to have separate codes for the constant
>>> and non-constant cases.  This allows VEC_DUPLICATE_EXPR to be treated
>>> as a normal unary operation and avoids the previous need for treating
>>> it as a GIMPLE_SINGLE_RHS.
>>>
>>> It might make sense to use VEC_DUPLICATE_CST for all duplicated
>>> vector constants, since it's a bit more compact than VECTOR_CST
>>> in that case, and is potentially more efficient to process.
>>> However, the nice thing about keeping it restricted to variable-length
>>> vectors is that there is then no need to handle combinations of
>>> VECTOR_CST and VEC_DUPLICATE_CST; a vector type will always use
>>> VECTOR_CST or never use it.
>>>
>>> The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.
>>
>> Index: gcc/tree-vect-generic.c
>> ===================================================================
>> --- gcc/tree-vect-generic.c     2017-10-23 11:38:53.934094740 +0100
>> +++ gcc/tree-vect-generic.c     2017-10-23 11:41:51.773953100 +0100
>> @@ -1419,6 +1419,7 @@ lower_vec_perm (gimple_stmt_iterator *gs
>>  ssa_uniform_vector_p (tree op)
>>  {
>>    if (TREE_CODE (op) == VECTOR_CST
>> +      || TREE_CODE (op) == VEC_DUPLICATE_CST
>>        || TREE_CODE (op) == CONSTRUCTOR)
>>      return uniform_vector_p (op);
>>
>> VEC_DUPLICATE_EXPR handling?
>
> Oops, yeah.  I could have sworn it was there at one time...
>
>> Looks like for VEC_DUPLICATE_CST it could directly return true.
>
> The function is a bit misnamed: it returns the duplicated tree value
> rather than a bool.
>
>> I didn't see uniform_vector_p being updated?
>
> That part was there FWIW (for tree.c).
>
>> Can you add verification to either verify_expr or build_vec_duplicate_cst
>> that the type is one of variable size?  And amend tree.def docs
>> accordingly.  Because otherwise we miss a lot of cases in constant
>> folding (mixing VEC_DUPLICATE_CST and VECTOR_CST).
>
> OK, done in the patch below with a gcc_unreachable () bomb in
> build_vec_duplicate_cst, which becomes a gcc_assert when variable-length
> vectors are added.  This meant changing the selftests to use
> build_vector_from_val rather than build_vec_duplicate_cst,
> but to still get testing of VEC_DUPLICATE_*, we then need to use
> the target's preferred vector length instead of always using 4.
>
> Tested as before.  OK (given the slightly different selftests)?

Ok.  I'll leave the missed constant foldings to you to figure out.

Richard.

> Thanks,
> Richard
>
>
> 2017-11-06  Richard Sandiford  <richard.sandiford@linaro.org>
>             Alan Hayward  <alan.hawyard@arm.com>
>             David Sherwood  <david.sherwood@arm.com>
>
> gcc/
>         * doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
>         (VEC_COND_EXPR): Add missing @tindex.
>         * doc/md.texi (vec_duplicate@var{m}): Document.
>         * tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
>         * tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
>         are used for VEC_DUPLICATE_CST as well.
>         (tree_vector): Access base.n.nelts directly.
>         * tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of
>         valid codes.
>         (VEC_DUPLICATE_CST_ELT): New macro.
>         * tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
>         (integer_zerop, integer_onep, integer_all_onesp, integer_truep)
>         (real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
>         (walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
>         (build_vec_duplicate_cst): New function.
>         (build_vector_from_val): Add stubbed-out handling of variable-length
>         vectors, using build_vec_duplicate_cst and VEC_DUPLICATE_EXPR.
>         (uniform_vector_p): Handle the new codes.
>         (test_vec_duplicate_predicates_int): New function.
>         (test_vec_duplicate_predicates_float): Likewise.
>         (test_vec_duplicate_predicates): Likewise.
>         (tree_c_tests): Call test_vec_duplicate_predicates.
>         * cfgexpand.c (expand_debug_expr): Handle the new codes.
>         * tree-pretty-print.c (dump_generic_node): Likewise.
>         * tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
>         * dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
>         * gimple-expr.h (is_gimple_constant): Likewise.
>         * gimplify.c (gimplify_expr): Likewise.
>         * graphite-isl-ast-to-gimple.c
>         (translate_isl_ast_to_gimple::is_constant): Likewise.
>         * graphite-scop-detection.c (scan_tree_for_params): Likewise.
>         * ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
>         (func_checker::compare_operand): Likewise.
>         * ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
>         * match.pd (negate_expr_p): Likewise.
>         * print-tree.c (print_node): Likewise.
>         * tree-chkp.c (chkp_find_bounds_1): Likewise.
>         * tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
>         * tree-ssa-loop.c (for_each_index): Likewise.
>         * tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
>         * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
>         (ao_ref_init_from_vn_reference): Likewise.
>         * varasm.c (const_hash_1, compare_constant): Likewise.
>         * fold-const.c (negate_expr_p, fold_negate_expr_1, const_binop)
>         (fold_convert_const, operand_equal_p, fold_view_convert_expr)
>         (exact_inverse, fold_checksum_tree): Likewise.
>         (const_unop): Likewise.  Fold VEC_DUPLICATE_EXPRs of a constant.
>         (test_vec_duplicate_folding): New function.
>         (fold_const_c_tests): Call it.
>         * optabs.def (vec_duplicate_optab): New optab.
>         * optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
>         * optabs.h (expand_vector_broadcast): Declare.
>         * optabs.c (expand_vector_broadcast): Make non-static.  Try using
>         vec_duplicate_optab.
>         * expr.c (store_constructor): Try using vec_duplicate_optab for
>         uniform vectors.
>         (const_vector_element): New function, split out from...
>         (const_vector_from_tree): ...here.
>         (expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
>         (expand_expr_real_1): Handle VEC_DUPLICATE_CST.
>         * internal-fn.c (expand_vector_ubsan_overflow): Use CONSTANT_CLASS_P
>         instead of checking for VECTOR_CST.
>         * tree-cfg.c (verify_gimple_assign_unary): Handle VEC_DUPLICATE_EXPR.
>         (verify_gimple_assign_single): Handle VEC_DUPLICATE_CST.
>         * tree-inline.c (estimate_operator_cost): Handle VEC_DUPLICATE_EXPR.
>
> Index: gcc/doc/generic.texi

> ===================================================================

> --- gcc/doc/generic.texi        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/doc/generic.texi        2017-11-06 12:40:40.277637153 +0000

> @@ -1036,6 +1036,7 @@ As this example indicates, the operands

>  @tindex FIXED_CST

>  @tindex COMPLEX_CST

>  @tindex VECTOR_CST

> +@tindex VEC_DUPLICATE_CST

>  @tindex STRING_CST

>  @findex TREE_STRING_LENGTH

>  @findex TREE_STRING_POINTER

> @@ -1089,6 +1090,14 @@ constant nodes.  Each individual constan

>  double constant node.  The first operand is a @code{TREE_LIST} of the

>  constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.

>

> +@item VEC_DUPLICATE_CST

> +These nodes represent a vector constant in which every element has the

> +same scalar value.  At present only variable-length vectors use

> +@code{VEC_DUPLICATE_CST}; constant-length vectors use @code{VECTOR_CST}

> +instead.  The scalar element value is given by

> +@code{VEC_DUPLICATE_CST_ELT} and has the same restrictions as the

> +element of a @code{VECTOR_CST}.

> +

>  @item STRING_CST

>  These nodes represent string-constants.  The @code{TREE_STRING_LENGTH}

>  returns the length of the string, as an @code{int}.  The

> @@ -1692,6 +1701,7 @@ a value from @code{enum annot_expr_kind}

>

>  @node Vectors

>  @subsection Vectors

> +@tindex VEC_DUPLICATE_EXPR

>  @tindex VEC_LSHIFT_EXPR

>  @tindex VEC_RSHIFT_EXPR

>  @tindex VEC_WIDEN_MULT_HI_EXPR

> @@ -1703,9 +1713,14 @@ a value from @code{enum annot_expr_kind}

>  @tindex VEC_PACK_TRUNC_EXPR

>  @tindex VEC_PACK_SAT_EXPR

>  @tindex VEC_PACK_FIX_TRUNC_EXPR

> +@tindex VEC_COND_EXPR

>  @tindex SAD_EXPR

>

>  @table @code

> +@item VEC_DUPLICATE_EXPR

> +This node has a single operand and represents a vector in which every

> +element is equal to that operand.

> +

>  @item VEC_LSHIFT_EXPR

>  @itemx VEC_RSHIFT_EXPR

>  These nodes represent whole vector left and right shifts, respectively.

> Index: gcc/doc/md.texi

> ===================================================================

> --- gcc/doc/md.texi     2017-11-06 12:40:39.845713389 +0000

> +++ gcc/doc/md.texi     2017-11-06 12:40:40.278630081 +0000

> @@ -4888,6 +4888,17 @@ and operand 1 is parallel containing val

>  the vector mode @var{m}, or a vector mode with the same element mode and

>  smaller number of elements.

>

> +@cindex @code{vec_duplicate@var{m}} instruction pattern

> +@item @samp{vec_duplicate@var{m}}

> +Initialize vector output operand 0 so that each element has the value given

> +by scalar input operand 1.  The vector has mode @var{m} and the scalar has

> +the mode appropriate for one element of @var{m}.

> +

> +This pattern only handles duplicates of non-constant inputs.  Constant

> +vectors go through the @code{mov@var{m}} pattern instead.

> +

> +This pattern is not allowed to @code{FAIL}.

> +

>  @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern

>  @item @samp{vec_cmp@var{m}@var{n}}

>  Output a vector comparison.  Operand 0 of mode @var{n} is the destination for

> Index: gcc/tree.def

> ===================================================================

> --- gcc/tree.def        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree.def        2017-11-06 12:40:40.292531076 +0000

> @@ -304,6 +304,11 @@ DEFTREECODE (COMPLEX_CST, "complex_cst",

>  /* Contents are in VECTOR_CST_ELTS field.  */

>  DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)

>

> +/* Represents a vector constant in which every element is equal to

> +   VEC_DUPLICATE_CST_ELT.  This is only ever used for variable-length

> +   vectors; fixed-length vectors must use VECTOR_CST instead.  */

> +DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)

> +

>  /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */

>  DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)

>

> @@ -534,6 +539,9 @@ DEFTREECODE (TARGET_EXPR, "target_expr",

>     1 and 2 are NULL.  The operands are then taken from the cfg edges. */

>  DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)

>

> +/* Represents a vector in which every element is equal to operand 0.  */

> +DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)

> +

>  /* Vector conditional expression. It is like COND_EXPR, but with

>     vector operands.

>

> Index: gcc/tree-core.h

> ===================================================================

> --- gcc/tree-core.h     2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree-core.h     2017-11-06 12:40:40.288559363 +0000

> @@ -975,7 +975,8 @@ struct GTY(()) tree_base {

>      /* VEC length.  This field is only used with TREE_VEC.  */

>      int length;

>

> -    /* Number of elements.  This field is only used with VECTOR_CST.  */

> +    /* Number of elements.  This field is only used with VECTOR_CST

> +       and VEC_DUPLICATE_CST.  It is always 1 for VEC_DUPLICATE_CST.  */

>      unsigned int nelts;

>

>      /* SSA version number.  This field is only used with SSA_NAME.  */

> @@ -1065,7 +1066,7 @@ struct GTY(()) tree_base {

>     public_flag:

>

>         TREE_OVERFLOW in

> -           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST

> +           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST, VEC_DUPLICATE_CST

>

>         TREE_PUBLIC in

>             VAR_DECL, FUNCTION_DECL

> @@ -1332,7 +1333,7 @@ struct GTY(()) tree_complex {

>

>  struct GTY(()) tree_vector {

>    struct tree_typed typed;

> -  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];

> +  tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];

>  };

>

>  struct GTY(()) tree_identifier {

> Index: gcc/tree.h

> ===================================================================

> --- gcc/tree.h  2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree.h  2017-11-06 12:40:40.293524004 +0000

> @@ -709,8 +709,8 @@ #define TREE_SYMBOL_REFERENCED(NODE) \

>  #define TYPE_REF_CAN_ALIAS_ALL(NODE) \

>    (PTR_OR_REF_CHECK (NODE)->base.static_flag)

>

> -/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, or VECTOR_CST, this means

> -   there was an overflow in folding.  */

> +/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST or VEC_DUPLICATE_CST,

> +   this means there was an overflow in folding.  */

>

>  #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)

>

> @@ -1009,6 +1009,10 @@ #define VECTOR_CST_NELTS(NODE) (VECTOR_C

>  #define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)

>  #define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])

>

> +/* In a VEC_DUPLICATE_CST node.  */

> +#define VEC_DUPLICATE_CST_ELT(NODE) \

> +  (VEC_DUPLICATE_CST_CHECK (NODE)->vector.elts[0])

> +

>  /* Define fields and accessors for some special-purpose tree nodes.  */

>

>  #define IDENTIFIER_LENGTH(NODE) \

> Index: gcc/tree.c

> ===================================================================

> --- gcc/tree.c  2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree.c  2017-11-06 12:40:40.292531076 +0000

> @@ -464,6 +464,7 @@ tree_node_structure_for_code (enum tree_

>      case FIXED_CST:            return TS_FIXED_CST;

>      case COMPLEX_CST:          return TS_COMPLEX;

>      case VECTOR_CST:           return TS_VECTOR;

> +    case VEC_DUPLICATE_CST:    return TS_VECTOR;

>      case STRING_CST:           return TS_STRING;

>        /* tcc_exceptional cases.  */

>      case ERROR_MARK:           return TS_COMMON;

> @@ -829,6 +830,7 @@ tree_code_size (enum tree_code code)

>         case FIXED_CST:         return sizeof (tree_fixed_cst);

>         case COMPLEX_CST:       return sizeof (tree_complex);

>         case VECTOR_CST:        return sizeof (tree_vector);

> +       case VEC_DUPLICATE_CST: return sizeof (tree_vector);

>         case STRING_CST:        gcc_unreachable ();

>         default:

>           gcc_checking_assert (code >= NUM_TREE_CODES);

> @@ -890,6 +892,9 @@ tree_size (const_tree node)

>        return (sizeof (struct tree_vector)

>               + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));

>

> +    case VEC_DUPLICATE_CST:

> +      return sizeof (struct tree_vector);

> +

>      case STRING_CST:

>        return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) + 1;

>

> @@ -1697,6 +1702,34 @@ cst_and_fits_in_hwi (const_tree x)

>           && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));

>  }

>

> +/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.

> +

> +   This function is only suitable for callers that know TYPE is a

> +   variable-length vector and specifically need a VEC_DUPLICATE_CST node.

> +   Use build_vector_from_val to duplicate a general scalar into a general

> +   vector type.  */

> +

> +static tree

> +build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)

> +{

> +  /* Shouldn't be used until we have variable-length vectors.  */

> +  gcc_unreachable ();

> +

> +  int length = sizeof (struct tree_vector);

> +

> +  record_node_allocation_statistics (VEC_DUPLICATE_CST, length);

> +

> +  tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);

> +

> +  TREE_SET_CODE (t, VEC_DUPLICATE_CST);

> +  TREE_TYPE (t) = type;

> +  t->base.u.nelts = 1;

> +  VEC_DUPLICATE_CST_ELT (t) = exp;

> +  TREE_CONSTANT (t) = 1;

> +

> +  return t;

> +}

> +

>  /* Build a newly constructed VECTOR_CST node of length LEN.  */

>

>  tree

> @@ -1790,6 +1823,13 @@ build_vector_from_val (tree vectype, tre

>    gcc_checking_assert (types_compatible_p (TYPE_MAIN_VARIANT (TREE_TYPE (sc)),

>                                            TREE_TYPE (vectype)));

>

> +  if (0)

> +    {

> +      if (CONSTANT_CLASS_P (sc))

> +       return build_vec_duplicate_cst (vectype, sc);

> +      return fold_build1 (VEC_DUPLICATE_EXPR, vectype, sc);

> +    }

> +

>    if (CONSTANT_CLASS_P (sc))

>      {

>        auto_vec<tree, 32> v (nunits);

> @@ -2358,6 +2398,8 @@ integer_zerop (const_tree expr)

>             return false;

>         return true;

>        }

> +    case VEC_DUPLICATE_CST:

> +      return integer_zerop (VEC_DUPLICATE_CST_ELT (expr));

>      default:

>        return false;

>      }

> @@ -2384,6 +2426,8 @@ integer_onep (const_tree expr)

>             return false;

>         return true;

>        }

> +    case VEC_DUPLICATE_CST:

> +      return integer_onep (VEC_DUPLICATE_CST_ELT (expr));

>      default:

>        return false;

>      }

> @@ -2422,6 +2466,9 @@ integer_all_onesp (const_tree expr)

>        return 1;

>      }

>

> +  else if (TREE_CODE (expr) == VEC_DUPLICATE_CST)

> +    return integer_all_onesp (VEC_DUPLICATE_CST_ELT (expr));

> +

>    else if (TREE_CODE (expr) != INTEGER_CST)

>      return 0;

>

> @@ -2478,7 +2525,7 @@ integer_nonzerop (const_tree expr)

>  int

>  integer_truep (const_tree expr)

>  {

> -  if (TREE_CODE (expr) == VECTOR_CST)

> +  if (TREE_CODE (expr) == VECTOR_CST || TREE_CODE (expr) == VEC_DUPLICATE_CST)

>      return integer_all_onesp (expr);

>    return integer_onep (expr);

>  }

> @@ -2649,6 +2696,8 @@ real_zerop (const_tree expr)

>             return false;

>         return true;

>        }

> +    case VEC_DUPLICATE_CST:

> +      return real_zerop (VEC_DUPLICATE_CST_ELT (expr));

>      default:

>        return false;

>      }

> @@ -2677,6 +2726,8 @@ real_onep (const_tree expr)

>             return false;

>         return true;

>        }

> +    case VEC_DUPLICATE_CST:

> +      return real_onep (VEC_DUPLICATE_CST_ELT (expr));

>      default:

>        return false;

>      }

> @@ -2704,6 +2755,8 @@ real_minus_onep (const_tree expr)

>             return false;

>         return true;

>        }

> +    case VEC_DUPLICATE_CST:

> +      return real_minus_onep (VEC_DUPLICATE_CST_ELT (expr));

>      default:

>        return false;

>      }

> @@ -7106,6 +7159,9 @@ add_expr (const_tree t, inchash::hash &h

>           inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags);

>         return;

>        }

> +    case VEC_DUPLICATE_CST:

> +      inchash::add_expr (VEC_DUPLICATE_CST_ELT (t), hstate);

> +      return;

>      case SSA_NAME:

>        /* We can just compare by pointer.  */

>        hstate.add_hwi (SSA_NAME_VERSION (t));

> @@ -10367,6 +10423,9 @@ initializer_zerop (const_tree init)

>         return true;

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      return initializer_zerop (VEC_DUPLICATE_CST_ELT (init));

> +

>      case CONSTRUCTOR:

>        {

>         unsigned HOST_WIDE_INT idx;

> @@ -10412,7 +10471,13 @@ uniform_vector_p (const_tree vec)

>

>    gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));

>

> -  if (TREE_CODE (vec) == VECTOR_CST)

> +  if (TREE_CODE (vec) == VEC_DUPLICATE_CST)

> +    return VEC_DUPLICATE_CST_ELT (vec);

> +

> +  else if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)

> +    return TREE_OPERAND (vec, 0);

> +

> +  else if (TREE_CODE (vec) == VECTOR_CST)

>      {

>        first = VECTOR_CST_ELT (vec, 0);

>        for (i = 1; i < VECTOR_CST_NELTS (vec); ++i)

> @@ -11144,6 +11209,7 @@ #define WALK_SUBTREE_TAIL(NODE)                         \

>      case REAL_CST:

>      case FIXED_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>      case BLOCK:

>      case PLACEHOLDER_EXPR:

> @@ -12430,6 +12496,12 @@ drop_tree_overflow (tree t)

>             elt = drop_tree_overflow (elt);

>         }

>      }

> +  if (TREE_CODE (t) == VEC_DUPLICATE_CST)

> +    {

> +      tree *elt = &VEC_DUPLICATE_CST_ELT (t);

> +      if (TREE_OVERFLOW (*elt))

> +       *elt = drop_tree_overflow (*elt);

> +    }

>    return t;

>  }

>

> @@ -13850,6 +13922,102 @@ test_integer_constants ()

>    ASSERT_EQ (type, TREE_TYPE (zero));

>  }

>

> +/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs

> +   for integral type TYPE.  */

> +

> +static void

> +test_vec_duplicate_predicates_int (tree type)

> +{

> +  scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (type);

> +  machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);

> +  /* This will be 1 if VEC_MODE isn't a vector mode.  */

> +  unsigned int nunits = GET_MODE_NUNITS (vec_mode);

> +

> +  tree vec_type = build_vector_type (type, nunits);

> +

> +  tree zero = build_zero_cst (type);

> +  tree vec_zero = build_vector_from_val (vec_type, zero);

> +  ASSERT_TRUE (integer_zerop (vec_zero));

> +  ASSERT_FALSE (integer_onep (vec_zero));

> +  ASSERT_FALSE (integer_minus_onep (vec_zero));

> +  ASSERT_FALSE (integer_all_onesp (vec_zero));

> +  ASSERT_FALSE (integer_truep (vec_zero));

> +  ASSERT_TRUE (initializer_zerop (vec_zero));

> +

> +  tree one = build_one_cst (type);

> +  tree vec_one = build_vector_from_val (vec_type, one);

> +  ASSERT_FALSE (integer_zerop (vec_one));

> +  ASSERT_TRUE (integer_onep (vec_one));

> +  ASSERT_FALSE (integer_minus_onep (vec_one));

> +  ASSERT_FALSE (integer_all_onesp (vec_one));

> +  ASSERT_FALSE (integer_truep (vec_one));

> +  ASSERT_FALSE (initializer_zerop (vec_one));

> +

> +  tree minus_one = build_minus_one_cst (type);

> +  tree vec_minus_one = build_vector_from_val (vec_type, minus_one);

> +  ASSERT_FALSE (integer_zerop (vec_minus_one));

> +  ASSERT_FALSE (integer_onep (vec_minus_one));

> +  ASSERT_TRUE (integer_minus_onep (vec_minus_one));

> +  ASSERT_TRUE (integer_all_onesp (vec_minus_one));

> +  ASSERT_TRUE (integer_truep (vec_minus_one));

> +  ASSERT_FALSE (initializer_zerop (vec_minus_one));

> +

> +  tree x = create_tmp_var_raw (type, "x");

> +  tree vec_x = build1 (VEC_DUPLICATE_EXPR, vec_type, x);

> +  ASSERT_EQ (uniform_vector_p (vec_zero), zero);

> +  ASSERT_EQ (uniform_vector_p (vec_one), one);

> +  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);

> +  ASSERT_EQ (uniform_vector_p (vec_x), x);

> +}

> +

> +/* Verify predicate handling of VEC_DUPLICATE_CSTs for floating-point

> +   type TYPE.  */

> +

> +static void

> +test_vec_duplicate_predicates_float (tree type)

> +{

> +  scalar_float_mode float_mode = SCALAR_FLOAT_TYPE_MODE (type);

> +  machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (float_mode);

> +  /* This will be 1 if VEC_MODE isn't a vector mode.  */

> +  unsigned int nunits = GET_MODE_NUNITS (vec_mode);

> +

> +  tree vec_type = build_vector_type (type, nunits);

> +

> +  tree zero = build_zero_cst (type);

> +  tree vec_zero = build_vector_from_val (vec_type, zero);

> +  ASSERT_TRUE (real_zerop (vec_zero));

> +  ASSERT_FALSE (real_onep (vec_zero));

> +  ASSERT_FALSE (real_minus_onep (vec_zero));

> +  ASSERT_TRUE (initializer_zerop (vec_zero));

> +

> +  tree one = build_one_cst (type);

> +  tree vec_one = build_vector_from_val (vec_type, one);

> +  ASSERT_FALSE (real_zerop (vec_one));

> +  ASSERT_TRUE (real_onep (vec_one));

> +  ASSERT_FALSE (real_minus_onep (vec_one));

> +  ASSERT_FALSE (initializer_zerop (vec_one));

> +

> +  tree minus_one = build_minus_one_cst (type);

> +  tree vec_minus_one = build_vector_from_val (vec_type, minus_one);

> +  ASSERT_FALSE (real_zerop (vec_minus_one));

> +  ASSERT_FALSE (real_onep (vec_minus_one));

> +  ASSERT_TRUE (real_minus_onep (vec_minus_one));

> +  ASSERT_FALSE (initializer_zerop (vec_minus_one));

> +

> +  ASSERT_EQ (uniform_vector_p (vec_zero), zero);

> +  ASSERT_EQ (uniform_vector_p (vec_one), one);

> +  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);

> +}

> +

> +/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */

> +

> +static void

> +test_vec_duplicate_predicates ()

> +{

> +  test_vec_duplicate_predicates_int (integer_type_node);

> +  test_vec_duplicate_predicates_float (float_type_node);

> +}

> +

>  /* Verify identifiers.  */

>

>  static void

> @@ -13878,6 +14046,7 @@ test_labels ()

>  tree_c_tests ()

>  {

>    test_integer_constants ();

> +  test_vec_duplicate_predicates ();

>    test_identifiers ();

>    test_labels ();

>  }

> Index: gcc/cfgexpand.c

> ===================================================================

> --- gcc/cfgexpand.c     2017-11-06 12:40:39.845713389 +0000

> +++ gcc/cfgexpand.c     2017-11-06 12:40:40.276644225 +0000

> @@ -5068,6 +5068,8 @@ expand_debug_expr (tree exp)

>      case VEC_WIDEN_LSHIFT_HI_EXPR:

>      case VEC_WIDEN_LSHIFT_LO_EXPR:

>      case VEC_PERM_EXPR:

> +    case VEC_DUPLICATE_CST:

> +    case VEC_DUPLICATE_EXPR:

>        return NULL;

>

>      /* Misc codes.  */

> Index: gcc/tree-pretty-print.c

> ===================================================================

> --- gcc/tree-pretty-print.c     2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree-pretty-print.c     2017-11-06 12:40:40.289552291 +0000

> @@ -1802,6 +1802,12 @@ dump_generic_node (pretty_printer *pp, t

>        }

>        break;

>

> +    case VEC_DUPLICATE_CST:

> +      pp_string (pp, "{ ");

> +      dump_generic_node (pp, VEC_DUPLICATE_CST_ELT (node), spc, flags, false);

> +      pp_string (pp, ", ... }");

> +      break;

> +

>      case FUNCTION_TYPE:

>      case METHOD_TYPE:

>        dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);

> @@ -3231,6 +3237,15 @@ dump_generic_node (pretty_printer *pp, t

>        pp_string (pp, " > ");

>        break;

>

> +    case VEC_DUPLICATE_EXPR:

> +      pp_space (pp);

> +      for (str = get_tree_code_name (code); *str; str++)

> +       pp_character (pp, TOUPPER (*str));

> +      pp_string (pp, " < ");

> +      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);

> +      pp_string (pp, " > ");

> +      break;

> +

>      case VEC_UNPACK_HI_EXPR:

>        pp_string (pp, " VEC_UNPACK_HI_EXPR < ");

>        dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);

> Index: gcc/tree-vect-generic.c

> ===================================================================

> --- gcc/tree-vect-generic.c     2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree-vect-generic.c     2017-11-06 12:40:40.291538147 +0000

> @@ -1419,6 +1419,8 @@ lower_vec_perm (gimple_stmt_iterator *gs

>  ssa_uniform_vector_p (tree op)

>  {

>    if (TREE_CODE (op) == VECTOR_CST

> +      || TREE_CODE (op) == VEC_DUPLICATE_CST

> +      || TREE_CODE (op) == VEC_DUPLICATE_EXPR

>        || TREE_CODE (op) == CONSTRUCTOR)

>      return uniform_vector_p (op);

>    if (TREE_CODE (op) == SSA_NAME)

> Index: gcc/dwarf2out.c

> ===================================================================

> --- gcc/dwarf2out.c     2017-11-06 12:40:39.845713389 +0000

> +++ gcc/dwarf2out.c     2017-11-06 12:40:40.280615937 +0000

> @@ -18878,6 +18878,7 @@ rtl_for_decl_init (tree init, tree type)

>         switch (TREE_CODE (init))

>           {

>           case VECTOR_CST:

> +         case VEC_DUPLICATE_CST:

>             break;

>           case CONSTRUCTOR:

>             if (TREE_CONSTANT (init))

> Index: gcc/gimple-expr.h

> ===================================================================

> --- gcc/gimple-expr.h   2017-11-06 12:40:39.845713389 +0000

> +++ gcc/gimple-expr.h   2017-11-06 12:40:40.282601794 +0000

> @@ -134,6 +134,7 @@ is_gimple_constant (const_tree t)

>      case FIXED_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>        return true;

>

> Index: gcc/gimplify.c

> ===================================================================

> --- gcc/gimplify.c      2017-11-06 12:40:39.845713389 +0000

> +++ gcc/gimplify.c      2017-11-06 12:40:40.283594722 +0000

> @@ -11507,6 +11507,7 @@ gimplify_expr (tree *expr_p, gimple_seq

>         case STRING_CST:

>         case COMPLEX_CST:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>           /* Drop the overflow flag on constants, we do not want

>              that in the GIMPLE IL.  */

>           if (TREE_OVERFLOW_P (*expr_p))

> Index: gcc/graphite-isl-ast-to-gimple.c

> ===================================================================

> --- gcc/graphite-isl-ast-to-gimple.c    2017-11-06 12:40:39.845713389 +0000

> +++ gcc/graphite-isl-ast-to-gimple.c    2017-11-06 12:40:40.284587650 +0000

> @@ -211,7 +211,8 @@ enum phi_node_kind

>      return TREE_CODE (op) == INTEGER_CST

>        || TREE_CODE (op) == REAL_CST

>        || TREE_CODE (op) == COMPLEX_CST

> -      || TREE_CODE (op) == VECTOR_CST;

> +      || TREE_CODE (op) == VECTOR_CST

> +      || TREE_CODE (op) == VEC_DUPLICATE_CST;

>    }

>

>  private:

> Index: gcc/graphite-scop-detection.c

> ===================================================================

> --- gcc/graphite-scop-detection.c       2017-11-06 12:40:39.845713389 +0000

> +++ gcc/graphite-scop-detection.c       2017-11-06 12:40:40.284587650 +0000

> @@ -1212,6 +1212,7 @@ scan_tree_for_params (sese_info_p s, tre

>      case REAL_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        break;

>

>     default:

> Index: gcc/ipa-icf-gimple.c

> ===================================================================

> --- gcc/ipa-icf-gimple.c        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/ipa-icf-gimple.c        2017-11-06 12:40:40.285580578 +0000

> @@ -333,6 +333,7 @@ func_checker::compare_cst_or_decl (tree

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>      case REAL_CST:

>        {

> @@ -528,6 +529,7 @@ func_checker::compare_operand (tree t1,

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case STRING_CST:

>      case REAL_CST:

>      case FUNCTION_DECL:

> Index: gcc/ipa-icf.c

> ===================================================================

> --- gcc/ipa-icf.c       2017-11-06 12:40:39.845713389 +0000

> +++ gcc/ipa-icf.c       2017-11-06 12:40:40.285580578 +0000

> @@ -1479,6 +1479,7 @@ sem_item::add_expr (const_tree exp, inch

>      case STRING_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        inchash::add_expr (exp, hstate);

>        break;

>      case CONSTRUCTOR:

> @@ -2036,6 +2037,9 @@ sem_variable::equals (tree t1, tree t2)

>

>         return 1;

>        }

> +    case VEC_DUPLICATE_CST:

> +      return sem_variable::equals (VEC_DUPLICATE_CST_ELT (t1),

> +                                  VEC_DUPLICATE_CST_ELT (t2));

>      case ARRAY_REF:

>      case ARRAY_RANGE_REF:

>        {

> Index: gcc/match.pd

> ===================================================================

> --- gcc/match.pd        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/match.pd        2017-11-06 12:40:40.285580578 +0000

> @@ -958,6 +958,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

>  (match negate_expr_p

>   VECTOR_CST

>   (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))

> +(match negate_expr_p

> + VEC_DUPLICATE_CST

> + (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))

>

>  /* (-A) * (-B) -> A * B  */

>  (simplify

> Index: gcc/print-tree.c

> ===================================================================

> --- gcc/print-tree.c    2017-11-06 12:40:39.845713389 +0000

> +++ gcc/print-tree.c    2017-11-06 12:40:40.287566435 +0000

> @@ -783,6 +783,10 @@ print_node (FILE *file, const char *pref

>           }

>           break;

>

> +       case VEC_DUPLICATE_CST:

> +         print_node (file, "elt", VEC_DUPLICATE_CST_ELT (node), indent + 4);

> +         break;

> +

>         case COMPLEX_CST:

>           print_node (file, "real", TREE_REALPART (node), indent + 4);

>           print_node (file, "imag", TREE_IMAGPART (node), indent + 4);

> Index: gcc/tree-chkp.c

> ===================================================================

> --- gcc/tree-chkp.c     2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree-chkp.c     2017-11-06 12:40:40.288559363 +0000

> @@ -3799,6 +3799,7 @@ chkp_find_bounds_1 (tree ptr, tree ptr_s

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>        if (integer_zerop (ptr_src))

>         bounds = chkp_get_none_bounds ();

>        else

> Index: gcc/tree-loop-distribution.c

> ===================================================================

> --- gcc/tree-loop-distribution.c        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree-loop-distribution.c        2017-11-06 12:40:40.289552291 +0000

> @@ -927,6 +927,9 @@ const_with_all_bytes_same (tree val)

>            && CONSTRUCTOR_NELTS (val) == 0))

>      return 0;

>

> +  if (TREE_CODE (val) == VEC_DUPLICATE_CST)

> +    return const_with_all_bytes_same (VEC_DUPLICATE_CST_ELT (val));

> +

>    if (real_zerop (val))

>      {

>        /* Only return 0 for +0.0, not for -0.0, which doesn't have

> Index: gcc/tree-ssa-loop.c

> ===================================================================

> --- gcc/tree-ssa-loop.c 2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree-ssa-loop.c 2017-11-06 12:40:40.290545219 +0000

> @@ -616,6 +616,7 @@ for_each_index (tree *addr_p, bool (*cbc

>         case STRING_CST:

>         case RESULT_DECL:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case COMPLEX_CST:

>         case INTEGER_CST:

>         case REAL_CST:

> Index: gcc/tree-ssa-pre.c

> ===================================================================

> --- gcc/tree-ssa-pre.c  2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree-ssa-pre.c  2017-11-06 12:40:40.290545219 +0000

> @@ -2627,6 +2627,7 @@ create_component_ref_by_pieces_1 (basic_

>      case INTEGER_CST:

>      case COMPLEX_CST:

>      case VECTOR_CST:

> +    case VEC_DUPLICATE_CST:

>      case REAL_CST:

>      case CONSTRUCTOR:

>      case VAR_DECL:

> Index: gcc/tree-ssa-sccvn.c

> ===================================================================

> --- gcc/tree-ssa-sccvn.c        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/tree-ssa-sccvn.c        2017-11-06 12:40:40.291538147 +0000

> @@ -866,6 +866,7 @@ copy_reference_ops_from_ref (tree ref, v

>         case INTEGER_CST:

>         case COMPLEX_CST:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case REAL_CST:

>         case FIXED_CST:

>         case CONSTRUCTOR:

> @@ -1058,6 +1059,7 @@ ao_ref_init_from_vn_reference (ao_ref *r

>         case INTEGER_CST:

>         case COMPLEX_CST:

>         case VECTOR_CST:

> +       case VEC_DUPLICATE_CST:

>         case REAL_CST:

>         case CONSTRUCTOR:

>         case CONST_DECL:

> Index: gcc/varasm.c

> ===================================================================

> --- gcc/varasm.c        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/varasm.c        2017-11-06 12:40:40.293524004 +0000

> @@ -3068,6 +3068,9 @@ const_hash_1 (const tree exp)

>      CASE_CONVERT:

>        return const_hash_1 (TREE_OPERAND (exp, 0)) * 7 + 2;

>

> +    case VEC_DUPLICATE_CST:

> +      return const_hash_1 (VEC_DUPLICATE_CST_ELT (exp)) * 7 + 3;

> +

>      default:

>        /* A language specific constant. Just hash the code.  */

>        return code;

> @@ -3158,6 +3161,10 @@ compare_constant (const tree t1, const t

>         return 1;

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      return compare_constant (VEC_DUPLICATE_CST_ELT (t1),

> +                              VEC_DUPLICATE_CST_ELT (t2));

> +

>      case CONSTRUCTOR:

>        {

>         vec<constructor_elt, va_gc> *v1, *v2;

> Index: gcc/fold-const.c

> ===================================================================

> --- gcc/fold-const.c    2017-11-06 12:40:39.845713389 +0000

> +++ gcc/fold-const.c    2017-11-06 12:40:40.282601794 +0000

> @@ -418,6 +418,9 @@ negate_expr_p (tree t)

>         return true;

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      return negate_expr_p (VEC_DUPLICATE_CST_ELT (t));

> +

>      case COMPLEX_EXPR:

>        return negate_expr_p (TREE_OPERAND (t, 0))

>              && negate_expr_p (TREE_OPERAND (t, 1));

> @@ -579,6 +582,14 @@ fold_negate_expr_1 (location_t loc, tree

>         return build_vector (type, elts);

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      {

> +       tree sub = fold_negate_expr (loc, VEC_DUPLICATE_CST_ELT (t));

> +       if (!sub)

> +         return NULL_TREE;

> +       return build_vector_from_val (type, sub);

> +      }

> +

>      case COMPLEX_EXPR:

>        if (negate_expr_p (t))

>         return fold_build2_loc (loc, COMPLEX_EXPR, type,

> @@ -1436,6 +1447,16 @@ const_binop (enum tree_code code, tree a

>        return build_vector (type, elts);

>      }

>

> +  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +      && TREE_CODE (arg2) == VEC_DUPLICATE_CST)

> +    {

> +      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1),

> +                             VEC_DUPLICATE_CST_ELT (arg2));

> +      if (!sub)

> +       return NULL_TREE;

> +      return build_vector_from_val (TREE_TYPE (arg1), sub);

> +    }

> +

>    /* Shifts allow a scalar offset for a vector.  */

>    if (TREE_CODE (arg1) == VECTOR_CST

>        && TREE_CODE (arg2) == INTEGER_CST)

> @@ -1459,6 +1480,15 @@ const_binop (enum tree_code code, tree a

>

>        return build_vector (type, elts);

>      }

> +

> +  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +      && TREE_CODE (arg2) == INTEGER_CST)

> +    {

> +      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1), arg2);

> +      if (!sub)

> +       return NULL_TREE;

> +      return build_vector_from_val (TREE_TYPE (arg1), sub);

> +    }

>    return NULL_TREE;

>  }

>

> @@ -1652,6 +1682,13 @@ const_unop (enum tree_code code, tree ty

>           if (i == count)

>             return build_vector (type, elements);

>         }

> +      else if (TREE_CODE (arg0) == VEC_DUPLICATE_CST)

> +       {

> +         tree sub = const_unop (BIT_NOT_EXPR, TREE_TYPE (type),

> +                                VEC_DUPLICATE_CST_ELT (arg0));

> +         if (sub)

> +           return build_vector_from_val (type, sub);

> +       }

>        break;

>

>      case TRUTH_NOT_EXPR:

> @@ -1737,6 +1774,11 @@ const_unop (enum tree_code code, tree ty

>         return res;

>        }

>

> +    case VEC_DUPLICATE_EXPR:

> +      if (CONSTANT_CLASS_P (arg0))

> +       return build_vector_from_val (type, arg0);

> +      return NULL_TREE;

> +

>      default:

>        break;

>      }

> @@ -2167,6 +2209,15 @@ fold_convert_const (enum tree_code code,

>             }

>           return build_vector (type, v);

>         }

> +      if (TREE_CODE (arg1) == VEC_DUPLICATE_CST

> +         && (TYPE_VECTOR_SUBPARTS (type)

> +             == TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1))))

> +       {

> +         tree sub = fold_convert_const (code, TREE_TYPE (type),

> +                                        VEC_DUPLICATE_CST_ELT (arg1));

> +         if (sub)

> +           return build_vector_from_val (type, sub);

> +       }

>      }

>    return NULL_TREE;

>  }

> @@ -2953,6 +3004,10 @@ operand_equal_p (const_tree arg0, const_

>           return 1;

>         }

>

> +      case VEC_DUPLICATE_CST:

> +       return operand_equal_p (VEC_DUPLICATE_CST_ELT (arg0),

> +                               VEC_DUPLICATE_CST_ELT (arg1), flags);

> +

>        case COMPLEX_CST:

>         return (operand_equal_p (TREE_REALPART (arg0), TREE_REALPART (arg1),

>                                  flags)

> @@ -7475,6 +7530,20 @@ can_native_interpret_type_p (tree type)

>  static tree

>  fold_view_convert_expr (tree type, tree expr)

>  {

> +  /* Recurse on duplicated vectors if the target type is also a vector

> +     and if the elements line up.  */

> +  tree expr_type = TREE_TYPE (expr);

> +  if (TREE_CODE (expr) == VEC_DUPLICATE_CST

> +      && VECTOR_TYPE_P (type)

> +      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (expr_type)

> +      && TYPE_SIZE (TREE_TYPE (type)) == TYPE_SIZE (TREE_TYPE (expr_type)))

> +    {

> +      tree sub = fold_view_convert_expr (TREE_TYPE (type),

> +                                        VEC_DUPLICATE_CST_ELT (expr));

> +      if (sub)

> +       return build_vector_from_val (type, sub);

> +    }

> +

>    /* We support up to 512-bit values (for V8DFmode).  */

>    unsigned char buffer[64];

>    int len;

> @@ -8874,6 +8943,15 @@ exact_inverse (tree type, tree cst)

>         return build_vector (type, elts);

>        }

>

> +    case VEC_DUPLICATE_CST:

> +      {

> +       tree sub = exact_inverse (TREE_TYPE (type),

> +                                 VEC_DUPLICATE_CST_ELT (cst));

> +       if (!sub)

> +         return NULL_TREE;

> +       return build_vector_from_val (type, sub);

> +      }

> +

>      default:

>        return NULL_TREE;

>      }

> @@ -11939,6 +12017,9 @@ fold_checksum_tree (const_tree expr, str

>           for (i = 0; i < (int) VECTOR_CST_NELTS (expr); ++i)

>             fold_checksum_tree (VECTOR_CST_ELT (expr, i), ctx, ht);

>           break;

> +       case VEC_DUPLICATE_CST:

> +         fold_checksum_tree (VEC_DUPLICATE_CST_ELT (expr), ctx, ht);

> +         break;

>         default:

>           break;

>         }

> @@ -14412,6 +14493,41 @@ test_vector_folding ()

>    ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));

>  }

>

> +/* Verify folding of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */

> +

> +static void

> +test_vec_duplicate_folding ()

> +{

> +  scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (ssizetype);

> +  machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);

> +  /* This will be 1 if VEC_MODE isn't a vector mode.  */

> +  unsigned int nunits = GET_MODE_NUNITS (vec_mode);

> +

> +  tree type = build_vector_type (ssizetype, nunits);

> +  tree dup5 = build_vector_from_val (type, ssize_int (5));

> +  tree dup3 = build_vector_from_val (type, ssize_int (3));

> +

> +  tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);

> +  ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));

> +

> +  tree not_dup5 = fold_unary (BIT_NOT_EXPR, type, dup5);

> +  ASSERT_EQ (uniform_vector_p (not_dup5), ssize_int (-6));

> +

> +  tree dup5_plus_dup3 = fold_binary (PLUS_EXPR, type, dup5, dup3);

> +  ASSERT_EQ (uniform_vector_p (dup5_plus_dup3), ssize_int (8));

> +

> +  tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));

> +  ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));

> +

> +  tree size_vector = build_vector_type (sizetype, nunits);

> +  tree size_dup5 = fold_convert (size_vector, dup5);

> +  ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));

> +

> +  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));

> +  tree dup5_cst = build_vector_from_val (type, ssize_int (5));

> +  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));

> +}

> +

>  /* Run all of the selftests within this file.  */

>

>  void

> @@ -14419,6 +14535,7 @@ fold_const_c_tests ()

>  {

>    test_arithmetic_folding ();

>    test_vector_folding ();

> +  test_vec_duplicate_folding ();

>  }

>

>  } // namespace selftest

> Index: gcc/optabs.def

> ===================================================================

> --- gcc/optabs.def      2017-11-06 12:40:39.845713389 +0000

> +++ gcc/optabs.def      2017-11-06 12:40:40.286573506 +0000

> @@ -364,3 +364,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I

>

>  OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")

>  OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")

> +

> +OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)

> Index: gcc/optabs-tree.c

> ===================================================================

> --- gcc/optabs-tree.c   2017-11-06 12:40:39.845713389 +0000

> +++ gcc/optabs-tree.c   2017-11-06 12:40:40.286573506 +0000

> @@ -210,6 +210,9 @@ optab_for_tree_code (enum tree_code code

>        return TYPE_UNSIGNED (type) ?

>         vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;

>

> +    case VEC_DUPLICATE_EXPR:

> +      return vec_duplicate_optab;

> +

>      default:

>        break;

>      }

> Index: gcc/optabs.h

> ===================================================================

> --- gcc/optabs.h        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/optabs.h        2017-11-06 12:40:40.287566435 +0000

> @@ -181,6 +181,7 @@ extern rtx simplify_expand_binop (machin

>                                   enum optab_methods methods);

>  extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,

>                                 enum optab_methods);

> +extern rtx expand_vector_broadcast (machine_mode, rtx);

>

>  /* Generate code for a simple binary or unary operation.  "Simple" in

>     this case means "can be unambiguously described by a (mode, code)

> Index: gcc/optabs.c

> ===================================================================

> --- gcc/optabs.c        2017-11-06 12:40:39.845713389 +0000

> +++ gcc/optabs.c        2017-11-06 12:40:40.286573506 +0000

> @@ -367,7 +367,7 @@ force_expand_binop (machine_mode mode, o

>     mode of OP must be the element mode of VMODE.  If OP is a constant,

>     then the return value will be a constant.  */

>

> -static rtx

> +rtx

>  expand_vector_broadcast (machine_mode vmode, rtx op)

>  {

>    enum insn_code icode;

> @@ -380,6 +380,16 @@ expand_vector_broadcast (machine_mode vm

>    if (valid_for_const_vec_duplicate_p (vmode, op))

>      return gen_const_vec_duplicate (vmode, op);

>

> +  icode = optab_handler (vec_duplicate_optab, vmode);

> +  if (icode != CODE_FOR_nothing)

> +    {

> +      struct expand_operand ops[2];

> +      create_output_operand (&ops[0], NULL_RTX, vmode);

> +      create_input_operand (&ops[1], op, GET_MODE (op));

> +      expand_insn (icode, 2, ops);

> +      return ops[0].value;

> +    }

> +

>    /* ??? If the target doesn't have a vec_init, then we have no easy way

>       of performing this operation.  Most of this sort of generic support

>       is hidden away in the vector lowering support in gimple.  */

> Index: gcc/expr.c

> ===================================================================

> --- gcc/expr.c  2017-11-06 12:40:39.845713389 +0000

> +++ gcc/expr.c  2017-11-06 12:40:40.281608865 +0000

> @@ -6576,7 +6576,8 @@ store_constructor (tree exp, rtx target,

>         constructor_elt *ce;

>         int i;

>         int need_to_clear;

> -       int icode = CODE_FOR_nothing;

> +       insn_code icode = CODE_FOR_nothing;

> +       tree elt;

>         tree elttype = TREE_TYPE (type);

>         int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));

>         machine_mode eltmode = TYPE_MODE (elttype);

> @@ -6586,13 +6587,30 @@ store_constructor (tree exp, rtx target,

>         unsigned n_elts;

>         alias_set_type alias;

>         bool vec_vec_init_p = false;

> +       machine_mode mode = GET_MODE (target);

>

>         gcc_assert (eltmode != BLKmode);
>
> +       /* Try using vec_duplicate_optab for uniform vectors.  */
> +       if (!TREE_SIDE_EFFECTS (exp)
> +           && VECTOR_MODE_P (mode)
> +           && eltmode == GET_MODE_INNER (mode)
> +           && ((icode = optab_handler (vec_duplicate_optab, mode))
> +               != CODE_FOR_nothing)
> +           && (elt = uniform_vector_p (exp)))
> +         {
> +           struct expand_operand ops[2];
> +           create_output_operand (&ops[0], target, mode);
> +           create_input_operand (&ops[1], expand_normal (elt), eltmode);
> +           expand_insn (icode, 2, ops);
> +           if (!rtx_equal_p (target, ops[0].value))
> +             emit_move_insn (target, ops[0].value);
> +           break;
> +         }
> +
>         n_elts = TYPE_VECTOR_SUBPARTS (type);
> -       if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
> +       if (REG_P (target) && VECTOR_MODE_P (mode))
>           {
> -           machine_mode mode = GET_MODE (target);
>             machine_mode emode = eltmode;
>
>             if (CONSTRUCTOR_NELTS (exp)
> @@ -6604,7 +6622,7 @@ store_constructor (tree exp, rtx target,
>                             == n_elts);
>                 emode = TYPE_MODE (etype);
>               }
> -           icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
> +           icode = convert_optab_handler (vec_init_optab, mode, emode);
>             if (icode != CODE_FOR_nothing)
>               {
>                 unsigned int i, n = n_elts;
> @@ -6652,7 +6670,7 @@ store_constructor (tree exp, rtx target,
>         if (need_to_clear && size > 0 && !vector)
>           {
>             if (REG_P (target))
> -             emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
> +             emit_move_insn (target, CONST0_RTX (mode));
>             else
>               clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
>             cleared = 1;
> @@ -6660,7 +6678,7 @@ store_constructor (tree exp, rtx target,
>
>         /* Inform later passes that the old value is dead.  */
>         if (!cleared && !vector && REG_P (target))
> -         emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
> +         emit_move_insn (target, CONST0_RTX (mode));
>
>          if (MEM_P (target))
>           alias = MEM_ALIAS_SET (target);
> @@ -6711,8 +6729,7 @@ store_constructor (tree exp, rtx target,
>
>         if (vector)
>           emit_insn (GEN_FCN (icode) (target,
> -                                     gen_rtx_PARALLEL (GET_MODE (target),
> -                                                       vector)));
> +                                     gen_rtx_PARALLEL (mode, vector)));
>         break;
>        }
>
> @@ -7690,6 +7707,19 @@ expand_operands (tree exp0, tree exp1, r
>  }
>
>
> +/* Expand constant vector element ELT, which has mode MODE.  This is used
> +   for members of VECTOR_CST and VEC_DUPLICATE_CST.  */
> +
> +static rtx
> +const_vector_element (scalar_mode mode, const_tree elt)
> +{
> +  if (TREE_CODE (elt) == REAL_CST)
> +    return const_double_from_real_value (TREE_REAL_CST (elt), mode);
> +  if (TREE_CODE (elt) == FIXED_CST)
> +    return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);
> +  return immed_wide_int_const (wi::to_wide (elt), mode);
> +}
> +
>  /* Return a MEM that contains constant EXP.  DEFER is as for
>     output_constant_def and MODIFIER is as for expand_expr.  */
>
> @@ -9555,6 +9585,12 @@ #define REDUCE_BIT_FIELD(expr)   (reduce_b
>        target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
>        return target;
>
> +    case VEC_DUPLICATE_EXPR:
> +      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
> +      target = expand_vector_broadcast (mode, op0);
> +      gcc_assert (target);
> +      return target;
> +
>      case BIT_INSERT_EXPR:
>        {
>         unsigned bitpos = tree_to_uhwi (treeop2);
> @@ -9988,6 +10024,11 @@ expand_expr_real_1 (tree exp, rtx target
>                             tmode, modifier);
>        }
>
> +    case VEC_DUPLICATE_CST:
> +      op0 = const_vector_element (GET_MODE_INNER (mode),
> +                                 VEC_DUPLICATE_CST_ELT (exp));
> +      return gen_const_vec_duplicate (mode, op0);
> +
>      case CONST_DECL:
>        if (modifier == EXPAND_WRITE)
>         {
> @@ -11749,8 +11790,7 @@ const_vector_from_tree (tree exp)
>  {
>    rtvec v;
>    unsigned i, units;
> -  tree elt;
> -  machine_mode inner, mode;
> +  machine_mode mode;
>
>    mode = TYPE_MODE (TREE_TYPE (exp));
>
> @@ -11761,23 +11801,12 @@ const_vector_from_tree (tree exp)
>      return const_vector_mask_from_tree (exp);
>
>    units = VECTOR_CST_NELTS (exp);
> -  inner = GET_MODE_INNER (mode);
>
>    v = rtvec_alloc (units);
>
>    for (i = 0; i < units; ++i)
> -    {
> -      elt = VECTOR_CST_ELT (exp, i);
> -
> -      if (TREE_CODE (elt) == REAL_CST)
> -       RTVEC_ELT (v, i) = const_double_from_real_value (TREE_REAL_CST (elt),
> -                                                        inner);
> -      else if (TREE_CODE (elt) == FIXED_CST)
> -       RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
> -                                                        inner);
> -      else
> -       RTVEC_ELT (v, i) = immed_wide_int_const (wi::to_wide (elt), inner);
> -    }
> +    RTVEC_ELT (v, i) = const_vector_element (GET_MODE_INNER (mode),
> +                                            VECTOR_CST_ELT (exp, i));
>
>    return gen_rtx_CONST_VECTOR (mode, v);
>  }
> Index: gcc/internal-fn.c
> ===================================================================
> --- gcc/internal-fn.c   2017-11-06 12:40:39.845713389 +0000
> +++ gcc/internal-fn.c   2017-11-06 12:40:40.284587650 +0000
> @@ -1911,12 +1911,12 @@ expand_vector_ubsan_overflow (location_t
>        emit_move_insn (cntvar, const0_rtx);
>        emit_label (loop_lab);
>      }
> -  if (TREE_CODE (arg0) != VECTOR_CST)
> +  if (!CONSTANT_CLASS_P (arg0))
>      {
>        rtx arg0r = expand_normal (arg0);
>        arg0 = make_tree (TREE_TYPE (arg0), arg0r);
>      }
> -  if (TREE_CODE (arg1) != VECTOR_CST)
> +  if (!CONSTANT_CLASS_P (arg1))
>      {
>        rtx arg1r = expand_normal (arg1);
>        arg1 = make_tree (TREE_TYPE (arg1), arg1r);
> Index: gcc/tree-cfg.c
> ===================================================================
> --- gcc/tree-cfg.c      2017-11-06 12:40:39.845713389 +0000
> +++ gcc/tree-cfg.c      2017-11-06 12:40:40.287566435 +0000
> @@ -3798,6 +3798,17 @@ verify_gimple_assign_unary (gassign *stm
>      case CONJ_EXPR:
>        break;
>
> +    case VEC_DUPLICATE_EXPR:
> +      if (TREE_CODE (lhs_type) != VECTOR_TYPE
> +         || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))
> +       {
> +         error ("vec_duplicate should be from a scalar to a like vector");
> +         debug_generic_expr (lhs_type);
> +         debug_generic_expr (rhs1_type);
> +         return true;
> +       }
> +      return false;
> +
>      default:
>        gcc_unreachable ();
>      }
> @@ -4468,6 +4479,7 @@ verify_gimple_assign_single (gassign *st
>      case FIXED_CST:
>      case COMPLEX_CST:
>      case VECTOR_CST:
> +    case VEC_DUPLICATE_CST:
>      case STRING_CST:
>        return res;
>
> Index: gcc/tree-inline.c
> ===================================================================
> --- gcc/tree-inline.c   2017-11-06 12:40:39.845713389 +0000
> +++ gcc/tree-inline.c   2017-11-06 12:40:40.289552291 +0000
> @@ -3930,6 +3930,7 @@ estimate_operator_cost (enum tree_code c
>      case VEC_PACK_FIX_TRUNC_EXPR:
>      case VEC_WIDEN_LSHIFT_HI_EXPR:
>      case VEC_WIDEN_LSHIFT_LO_EXPR:
> +    case VEC_DUPLICATE_EXPR:
>
>        return 1;
>
Richard Sandiford Dec. 15, 2017, 12:29 a.m. UTC | #4
This patch just adds VEC_DUPLICATE_EXPR, since the VEC_DUPLICATE_CST
isn't needed with the new VECTOR_CST layout.  It's really just the
original patch with bits removed, but just in case:

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
OK to install?

Richard


2017-12-15  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hawyard@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* doc/generic.texi (VEC_DUPLICATE_EXPR): Document.
	(VEC_COND_EXPR): Add missing @tindex.
	* doc/md.texi (vec_duplicate@var{m}): Document.
	* tree.def (VEC_DUPLICATE_EXPR): New tree codes.
	* tree.c (build_vector_from_val): Add stubbed-out handling of
	variable-length vectors, using VEC_DUPLICATE_EXPR.
	(uniform_vector_p): Handle VEC_DUPLICATE_EXPR.
	* cfgexpand.c (expand_debug_expr): Likewise.
	* tree-cfg.c (verify_gimple_assign_unary): Likewise.
	* tree-inline.c (estimate_operator_cost): Likewise.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
	* fold-const.c (const_unop): Fold VEC_DUPLICATE_EXPRs of a constant.
	(test_vec_duplicate_folding): New function.
	(fold_const_c_tests): Call it.
	* optabs.def (vec_duplicate_optab): New optab.
	* optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
	* optabs.h (expand_vector_broadcast): Declare.
	* optabs.c (expand_vector_broadcast): Make non-static.  Try using
	vec_duplicate_optab.
	* expr.c (store_constructor): Try using vec_duplicate_optab for
	uniform vectors.
	(expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.

Index: gcc/doc/generic.texi
===================================================================
--- gcc/doc/generic.texi	2017-12-15 00:24:47.213516622 +0000
+++ gcc/doc/generic.texi	2017-12-15 00:24:47.498459276 +0000
@@ -1768,6 +1768,7 @@ a value from @code{enum annot_expr_kind}
 
 @node Vectors
 @subsection Vectors
+@tindex VEC_DUPLICATE_EXPR
 @tindex VEC_LSHIFT_EXPR
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
@@ -1779,9 +1780,14 @@ a value from @code{enum annot_expr_kind}
 @tindex VEC_PACK_TRUNC_EXPR
 @tindex VEC_PACK_SAT_EXPR
 @tindex VEC_PACK_FIX_TRUNC_EXPR
+@tindex VEC_COND_EXPR
 @tindex SAD_EXPR
 
 @table @code
+@item VEC_DUPLICATE_EXPR
+This node has a single operand and represents a vector in which every
+element is equal to that operand.
+
 @item VEC_LSHIFT_EXPR
 @itemx VEC_RSHIFT_EXPR
 These nodes represent whole vector left and right shifts, respectively.
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	2017-12-15 00:24:47.213516622 +0000
+++ gcc/doc/md.texi	2017-12-15 00:24:47.499459075 +0000
@@ -4888,6 +4888,17 @@ and operand 1 is parallel containing val
 the vector mode @var{m}, or a vector mode with the same element mode and
 smaller number of elements.
 
+@cindex @code{vec_duplicate@var{m}} instruction pattern
+@item @samp{vec_duplicate@var{m}}
+Initialize vector output operand 0 so that each element has the value given
+by scalar input operand 1.  The vector has mode @var{m} and the scalar has
+the mode appropriate for one element of @var{m}.
+
+This pattern only handles duplicates of non-constant inputs.  Constant
+vectors go through the @code{mov@var{m}} pattern instead.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
 @item @samp{vec_cmp@var{m}@var{n}}
 Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
Index: gcc/tree.def
===================================================================
--- gcc/tree.def	2017-12-15 00:24:47.213516622 +0000
+++ gcc/tree.def	2017-12-15 00:24:47.505457868 +0000
@@ -537,6 +537,9 @@ DEFTREECODE (TARGET_EXPR, "target_expr",
    1 and 2 are NULL.  The operands are then taken from the cfg edges. */
 DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)
 
+/* Represents a vector in which every element is equal to operand 0.  */
+DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)
+
 /* Vector conditional expression. It is like COND_EXPR, but with
    vector operands.
 
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/tree.c	2017-12-15 00:24:47.505457868 +0000
@@ -1785,6 +1785,8 @@ build_vector_from_val (tree vectype, tre
       v.quick_push (sc);
       return v.build ();
     }
+  else if (0)
+    return fold_build1 (VEC_DUPLICATE_EXPR, vectype, sc);
   else
     {
       vec<constructor_elt, va_gc> *v;
@@ -10468,7 +10470,10 @@ uniform_vector_p (const_tree vec)
 
   gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));
 
-  if (TREE_CODE (vec) == VECTOR_CST)
+  if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)
+    return TREE_OPERAND (vec, 0);
+
+  else if (TREE_CODE (vec) == VECTOR_CST)
     {
       if (VECTOR_CST_NPATTERNS (vec) == 1 && VECTOR_CST_DUPLICATE_P (vec))
 	return VECTOR_CST_ENCODED_ELT (vec, 0);
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/cfgexpand.c	2017-12-15 00:24:47.498459276 +0000
@@ -5069,6 +5069,7 @@ expand_debug_expr (tree exp)
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
     case VEC_PERM_EXPR:
+    case VEC_DUPLICATE_EXPR:
       return NULL;
 
     /* Misc codes.  */
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/tree-cfg.c	2017-12-15 00:24:47.503458270 +0000
@@ -3857,6 +3857,17 @@ verify_gimple_assign_unary (gassign *stm
     case CONJ_EXPR:
       break;
 
+    case VEC_DUPLICATE_EXPR:
+      if (TREE_CODE (lhs_type) != VECTOR_TYPE
+	  || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))
+	{
+	  error ("vec_duplicate should be from a scalar to a like vector");
+	  debug_generic_expr (lhs_type);
+	  debug_generic_expr (rhs1_type);
+	  return true;
+	}
+      return false;
+
     default:
       gcc_unreachable ();
     }
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/tree-inline.c	2017-12-15 00:24:47.504458069 +0000
@@ -3928,6 +3928,7 @@ estimate_operator_cost (enum tree_code c
     case VEC_PACK_FIX_TRUNC_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
+    case VEC_DUPLICATE_EXPR:
 
       return 1;
 
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/tree-pretty-print.c	2017-12-15 00:24:47.504458069 +0000
@@ -3178,6 +3178,15 @@ dump_generic_node (pretty_printer *pp, t
       pp_string (pp, " > ");
       break;
 
+    case VEC_DUPLICATE_EXPR:
+      pp_space (pp);
+      for (str = get_tree_code_name (code); *str; str++)
+	pp_character (pp, TOUPPER (*str));
+      pp_string (pp, " < ");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, " > ");
+      break;
+
     case VEC_UNPACK_HI_EXPR:
       pp_string (pp, " VEC_UNPACK_HI_EXPR < ");
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/tree-vect-generic.c	2017-12-15 00:24:47.504458069 +0000
@@ -1418,6 +1418,7 @@ lower_vec_perm (gimple_stmt_iterator *gs
 ssa_uniform_vector_p (tree op)
 {
   if (TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_EXPR
       || TREE_CODE (op) == CONSTRUCTOR)
     return uniform_vector_p (op);
   if (TREE_CODE (op) == SSA_NAME)
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/fold-const.c	2017-12-15 00:24:47.501458673 +0000
@@ -1771,6 +1771,11 @@ const_unop (enum tree_code code, tree ty
 	return elts.build ();
       }
 
+    case VEC_DUPLICATE_EXPR:
+      if (CONSTANT_CLASS_P (arg0))
+	return build_vector_from_val (type, arg0);
+      return NULL_TREE;
+
     default:
       break;
     }
@@ -14442,6 +14447,22 @@ test_vector_folding ()
   ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));
 }
 
+/* Verify folding of VEC_DUPLICATE_EXPRs.  */
+
+static void
+test_vec_duplicate_folding ()
+{
+  scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (ssizetype);
+  machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);
+  /* This will be 1 if VEC_MODE isn't a vector mode.  */
+  unsigned int nunits = GET_MODE_NUNITS (vec_mode);
+
+  tree type = build_vector_type (ssizetype, nunits);
+  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));
+  tree dup5_cst = build_vector_from_val (type, ssize_int (5));
+  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -14449,6 +14470,7 @@ fold_const_c_tests ()
 {
   test_arithmetic_folding ();
   test_vector_folding ();
+  test_vec_duplicate_folding ();
 }
 
 } // namespace selftest
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def	2017-12-15 00:24:47.213516622 +0000
+++ gcc/optabs.def	2017-12-15 00:24:47.502458472 +0000
@@ -363,3 +363,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I
 
 OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")
 OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
+
+OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
Index: gcc/optabs-tree.c
===================================================================
--- gcc/optabs-tree.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/optabs-tree.c	2017-12-15 00:24:47.501458673 +0000
@@ -199,6 +199,9 @@ optab_for_tree_code (enum tree_code code
       return TYPE_UNSIGNED (type) ?
 	vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
 
+    case VEC_DUPLICATE_EXPR:
+      return vec_duplicate_optab;
+
     default:
       break;
     }
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h	2017-12-15 00:24:47.213516622 +0000
+++ gcc/optabs.h	2017-12-15 00:24:47.502458472 +0000
@@ -182,6 +182,7 @@ extern rtx simplify_expand_binop (machin
 				  enum optab_methods methods);
 extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,
 				enum optab_methods);
+extern rtx expand_vector_broadcast (machine_mode, rtx);
 
 /* Generate code for a simple binary or unary operation.  "Simple" in
    this case means "can be unambiguously described by a (mode, code)
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/optabs.c	2017-12-15 00:24:47.502458472 +0000
@@ -367,7 +367,7 @@ force_expand_binop (machine_mode mode, o
    mode of OP must be the element mode of VMODE.  If OP is a constant,
    then the return value will be a constant.  */
 
-static rtx
+rtx
 expand_vector_broadcast (machine_mode vmode, rtx op)
 {
   enum insn_code icode;
@@ -380,6 +380,16 @@ expand_vector_broadcast (machine_mode vm
   if (valid_for_const_vec_duplicate_p (vmode, op))
     return gen_const_vec_duplicate (vmode, op);
 
+  icode = optab_handler (vec_duplicate_optab, vmode);
+  if (icode != CODE_FOR_nothing)
+    {
+      struct expand_operand ops[2];
+      create_output_operand (&ops[0], NULL_RTX, vmode);
+      create_input_operand (&ops[1], op, GET_MODE (op));
+      expand_insn (icode, 2, ops);
+      return ops[0].value;
+    }
+
   /* ??? If the target doesn't have a vec_init, then we have no easy way
      of performing this operation.  Most of this sort of generic support
      is hidden away in the vector lowering support in gimple.  */
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2017-12-15 00:24:47.213516622 +0000
+++ gcc/expr.c	2017-12-15 00:24:47.500458874 +0000
@@ -6598,7 +6598,8 @@ store_constructor (tree exp, rtx target,
 	constructor_elt *ce;
 	int i;
 	int need_to_clear;
-	int icode = CODE_FOR_nothing;
+	insn_code icode = CODE_FOR_nothing;
+	tree elt;
 	tree elttype = TREE_TYPE (type);
 	int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));
 	machine_mode eltmode = TYPE_MODE (elttype);
@@ -6608,13 +6609,30 @@ store_constructor (tree exp, rtx target,
 	unsigned n_elts;
 	alias_set_type alias;
 	bool vec_vec_init_p = false;
+	machine_mode mode = GET_MODE (target);
 
 	gcc_assert (eltmode != BLKmode);
 
+	/* Try using vec_duplicate_optab for uniform vectors.  */
+	if (!TREE_SIDE_EFFECTS (exp)
+	    && VECTOR_MODE_P (mode)
+	    && eltmode == GET_MODE_INNER (mode)
+	    && ((icode = optab_handler (vec_duplicate_optab, mode))
+		!= CODE_FOR_nothing)
+	    && (elt = uniform_vector_p (exp)))
+	  {
+	    struct expand_operand ops[2];
+	    create_output_operand (&ops[0], target, mode);
+	    create_input_operand (&ops[1], expand_normal (elt), eltmode);
+	    expand_insn (icode, 2, ops);
+	    if (!rtx_equal_p (target, ops[0].value))
+	      emit_move_insn (target, ops[0].value);
+	    break;
+	  }
+
 	n_elts = TYPE_VECTOR_SUBPARTS (type);
-	if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
+	if (REG_P (target) && VECTOR_MODE_P (mode))
 	  {
-	    machine_mode mode = GET_MODE (target);
 	    machine_mode emode = eltmode;
 
 	    if (CONSTRUCTOR_NELTS (exp)
@@ -6626,7 +6644,7 @@ store_constructor (tree exp, rtx target,
 			    == n_elts);
 		emode = TYPE_MODE (etype);
 	      }
-	    icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
+	    icode = convert_optab_handler (vec_init_optab, mode, emode);
 	    if (icode != CODE_FOR_nothing)
 	      {
 		unsigned int i, n = n_elts;
@@ -6674,7 +6692,7 @@ store_constructor (tree exp, rtx target,
 	if (need_to_clear && size > 0 && !vector)
 	  {
 	    if (REG_P (target))
-	      emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+	      emit_move_insn (target, CONST0_RTX (mode));
 	    else
 	      clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
 	    cleared = 1;
@@ -6682,7 +6700,7 @@ store_constructor (tree exp, rtx target,
 
 	/* Inform later passes that the old value is dead.  */
 	if (!cleared && !vector && REG_P (target))
-	  emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+	  emit_move_insn (target, CONST0_RTX (mode));
 
         if (MEM_P (target))
 	  alias = MEM_ALIAS_SET (target);
@@ -6733,8 +6751,7 @@ store_constructor (tree exp, rtx target,
 
 	if (vector)
 	  emit_insn (GEN_FCN (icode) (target,
-				      gen_rtx_PARALLEL (GET_MODE (target),
-							vector)));
+				      gen_rtx_PARALLEL (mode, vector)));
 	break;
       }
 
@@ -9563,6 +9580,12 @@ #define REDUCE_BIT_FIELD(expr)	(reduce_b
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case VEC_DUPLICATE_EXPR:
+      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
+      target = expand_vector_broadcast (mode, op0);
+      gcc_assert (target);
+      return target;
+
     case BIT_INSERT_EXPR:
       {
 	unsigned bitpos = tree_to_uhwi (treeop2);
Richard Biener Dec. 15, 2017, 8:58 a.m. UTC | #5
On Fri, Dec 15, 2017 at 1:29 AM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> This patch just adds VEC_DUPLICATE_EXPR, since the VEC_DUPLICATE_CST
> isn't needed with the new VECTOR_CST layout.  It's really just the
> original patch with bits removed, but just in case:
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
> OK to install?

To keep things simple at this point, this is OK.  Note that I'd eventually
like to see this as VEC_PERM_EXPR <scalar_type_1, scalar_type_1, { 0, ... }>.
For reductions where we need { x, 0, ... }, we now have to use a
VEC_DUPLICATE_EXPR to make x a vector and then a VEC_PERM_EXPR
to merge it with { 0, ... }, right?  That is, rather than a single
VEC_PERM_EXPR <x_1, 0, { 0, 1, 1, 1, ... }>?
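
A minimal sketch of the two forms being compared, modelled on plain C
arrays.  This is not GCC code: vec_duplicate, vec_perm and the three
wrappers are hypothetical helpers, NELTS is a fixed stand-in for a
variable vector length, and letting the permute take one-element
(scalar) inputs is the suggested extension, assumed here purely for
illustration.

```c
#define NELTS 4   /* Assumed stand-in for a variable-length vector.  */

/* Hypothetical model of VEC_DUPLICATE_EXPR: every element equals X.  */
static void
vec_duplicate (int *out, int x)
{
  for (int i = 0; i < NELTS; i++)
    out[i] = x;
}

/* Hypothetical model of a permute.  SEL[i] indexes the concatenation
   of A and B, each of which has N_IN elements.  N_IN == 1 models the
   proposed scalar-input form.  */
static void
vec_perm (int *out, const int *a, const int *b, int n_in, const int *sel)
{
  for (int i = 0; i < NELTS; i++)
    out[i] = sel[i] < n_in ? a[sel[i]] : b[sel[i] - n_in];
}

/* Two-step form: duplicate X, then merge with a zero vector.  */
static void
x_then_zeros_two_step (int *out, int x)
{
  int dup[NELTS], zeros[NELTS] = { 0 }, sel[NELTS];
  vec_duplicate (dup, x);
  sel[0] = 0;                     /* Element 0 comes from DUP...  */
  for (int i = 1; i < NELTS; i++)
    sel[i] = NELTS + i;           /* ...the rest come from ZEROS.  */
  vec_perm (out, dup, zeros, NELTS, sel);
}

/* One-step form with scalar inputs: selector { 0, 1, 1, 1, ... }.  */
static void
x_then_zeros_one_step (int *out, int x)
{
  int zero = 0, sel[NELTS];
  sel[0] = 0;
  for (int i = 1; i < NELTS; i++)
    sel[i] = 1;
  vec_perm (out, &x, &zero, 1, sel);
}

/* Broadcast written as a scalar-input permute: selector { 0, ... }.  */
static void
broadcast_as_perm (int *out, int x)
{
  int sel[NELTS] = { 0 };
  vec_perm (out, &x, &x, 1, sel);
}
```

Under these assumptions both x_then_zeros variants fill the output with
x followed by zeros, and broadcast_as_perm gives the same result as
vec_duplicate; the one-step form simply avoids materialising the
intermediate duplicated vector.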

Thanks,
Richard.

> Richard

>

>

> 2017-12-15  Richard Sandiford  <richard.sandiford@linaro.org>

>             Alan Hayward  <alan.hawyard@arm.com>

>             David Sherwood  <david.sherwood@arm.com>

>

> gcc/

>         * doc/generic.texi (VEC_DUPLICATE_EXPR): Document.

>         (VEC_COND_EXPR): Add missing @tindex.

>         * doc/md.texi (vec_duplicate@var{m}): Document.

>         * tree.def (VEC_DUPLICATE_EXPR): New tree codes.

>         * tree.c (build_vector_from_val): Add stubbed-out handling of

>         variable-length vectors, using VEC_DUPLICATE_EXPR.

>         (uniform_vector_p): Handle VEC_DUPLICATE_EXPR.

>         * cfgexpand.c (expand_debug_expr): Likewise.

>         * tree-cfg.c (verify_gimple_assign_unary): Likewise.

>         * tree-inline.c (estimate_operator_cost): Likewise.

>         * tree-pretty-print.c (dump_generic_node): Likewise.

>         * tree-vect-generic.c (ssa_uniform_vector_p): Likewise.

>         * fold-const.c (const_unop): Fold VEC_DUPLICATE_EXPRs of a constant.

>         (test_vec_duplicate_folding): New function.

>         (fold_const_c_tests): Call it.

>         * optabs.def (vec_duplicate_optab): New optab.

>         * optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.

>         * optabs.h (expand_vector_broadcast): Declare.

>         * optabs.c (expand_vector_broadcast): Make non-static.  Try using

>         vec_duplicate_optab.

>         * expr.c (store_constructor): Try using vec_duplicate_optab for

>         uniform vectors.

>         (expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.

>

> Index: gcc/doc/generic.texi

> ===================================================================

> --- gcc/doc/generic.texi        2017-12-15 00:24:47.213516622 +0000

> +++ gcc/doc/generic.texi        2017-12-15 00:24:47.498459276 +0000

> @@ -1768,6 +1768,7 @@ a value from @code{enum annot_expr_kind}

>

>  @node Vectors

>  @subsection Vectors

> +@tindex VEC_DUPLICATE_EXPR

>  @tindex VEC_LSHIFT_EXPR

>  @tindex VEC_RSHIFT_EXPR

>  @tindex VEC_WIDEN_MULT_HI_EXPR

> @@ -1779,9 +1780,14 @@ a value from @code{enum annot_expr_kind}

>  @tindex VEC_PACK_TRUNC_EXPR

>  @tindex VEC_PACK_SAT_EXPR

>  @tindex VEC_PACK_FIX_TRUNC_EXPR

> +@tindex VEC_COND_EXPR

>  @tindex SAD_EXPR

>

>  @table @code

> +@item VEC_DUPLICATE_EXPR

> +This node has a single operand and represents a vector in which every

> +element is equal to that operand.

> +

>  @item VEC_LSHIFT_EXPR

>  @itemx VEC_RSHIFT_EXPR

>  These nodes represent whole vector left and right shifts, respectively.

> Index: gcc/doc/md.texi

> ===================================================================

> --- gcc/doc/md.texi     2017-12-15 00:24:47.213516622 +0000

> +++ gcc/doc/md.texi     2017-12-15 00:24:47.499459075 +0000

> @@ -4888,6 +4888,17 @@ and operand 1 is parallel containing val

>  the vector mode @var{m}, or a vector mode with the same element mode and

>  smaller number of elements.

>

> +@cindex @code{vec_duplicate@var{m}} instruction pattern

> +@item @samp{vec_duplicate@var{m}}

> +Initialize vector output operand 0 so that each element has the value given

> +by scalar input operand 1.  The vector has mode @var{m} and the scalar has

> +the mode appropriate for one element of @var{m}.

> +

> +This pattern only handles duplicates of non-constant inputs.  Constant

> +vectors go through the @code{mov@var{m}} pattern instead.

> +

> +This pattern is not allowed to @code{FAIL}.

> +

>  @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern

>  @item @samp{vec_cmp@var{m}@var{n}}

>  Output a vector comparison.  Operand 0 of mode @var{n} is the destination for

> Index: gcc/tree.def

> ===================================================================

> --- gcc/tree.def        2017-12-15 00:24:47.213516622 +0000

> +++ gcc/tree.def        2017-12-15 00:24:47.505457868 +0000

> @@ -537,6 +537,9 @@ DEFTREECODE (TARGET_EXPR, "target_expr",

>     1 and 2 are NULL.  The operands are then taken from the cfg edges. */

>  DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)

>

> +/* Represents a vector in which every element is equal to operand 0.  */

> +DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)

> +

>  /* Vector conditional expression. It is like COND_EXPR, but with

>     vector operands.

>

> Index: gcc/tree.c

> ===================================================================

> --- gcc/tree.c  2017-12-15 00:24:47.213516622 +0000

> +++ gcc/tree.c  2017-12-15 00:24:47.505457868 +0000

> @@ -1785,6 +1785,8 @@ build_vector_from_val (tree vectype, tre

>        v.quick_push (sc);

>        return v.build ();

>      }

> +  else if (0)

> +    return fold_build1 (VEC_DUPLICATE_EXPR, vectype, sc);

>    else

>      {

>        vec<constructor_elt, va_gc> *v;

> @@ -10468,7 +10470,10 @@ uniform_vector_p (const_tree vec)

>

>    gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));

>

> -  if (TREE_CODE (vec) == VECTOR_CST)

> +  if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)

> +    return TREE_OPERAND (vec, 0);

> +

> +  else if (TREE_CODE (vec) == VECTOR_CST)

>      {

>        if (VECTOR_CST_NPATTERNS (vec) == 1 && VECTOR_CST_DUPLICATE_P (vec))

>         return VECTOR_CST_ENCODED_ELT (vec, 0);

> Index: gcc/cfgexpand.c

> ===================================================================

> --- gcc/cfgexpand.c     2017-12-15 00:24:47.213516622 +0000

> +++ gcc/cfgexpand.c     2017-12-15 00:24:47.498459276 +0000

> @@ -5069,6 +5069,7 @@ expand_debug_expr (tree exp)

>      case VEC_WIDEN_LSHIFT_HI_EXPR:

>      case VEC_WIDEN_LSHIFT_LO_EXPR:

>      case VEC_PERM_EXPR:

> +    case VEC_DUPLICATE_EXPR:

>        return NULL;

>

>      /* Misc codes.  */

> Index: gcc/tree-cfg.c

> ===================================================================

> --- gcc/tree-cfg.c      2017-12-15 00:24:47.213516622 +0000

> +++ gcc/tree-cfg.c      2017-12-15 00:24:47.503458270 +0000

> @@ -3857,6 +3857,17 @@ verify_gimple_assign_unary (gassign *stm

>      case CONJ_EXPR:

>        break;

>

> +    case VEC_DUPLICATE_EXPR:

> +      if (TREE_CODE (lhs_type) != VECTOR_TYPE

> +         || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))

> +       {

> +         error ("vec_duplicate should be from a scalar to a like vector");

> +         debug_generic_expr (lhs_type);

> +         debug_generic_expr (rhs1_type);

> +         return true;

> +       }

> +      return false;

> +

>      default:

>        gcc_unreachable ();

>      }

> Index: gcc/tree-inline.c

> ===================================================================

> --- gcc/tree-inline.c   2017-12-15 00:24:47.213516622 +0000

> +++ gcc/tree-inline.c   2017-12-15 00:24:47.504458069 +0000

> @@ -3928,6 +3928,7 @@ estimate_operator_cost (enum tree_code c

>      case VEC_PACK_FIX_TRUNC_EXPR:

>      case VEC_WIDEN_LSHIFT_HI_EXPR:

>      case VEC_WIDEN_LSHIFT_LO_EXPR:

> +    case VEC_DUPLICATE_EXPR:

>

>        return 1;

>

> Index: gcc/tree-pretty-print.c

> ===================================================================

> --- gcc/tree-pretty-print.c     2017-12-15 00:24:47.213516622 +0000

> +++ gcc/tree-pretty-print.c     2017-12-15 00:24:47.504458069 +0000

> @@ -3178,6 +3178,15 @@ dump_generic_node (pretty_printer *pp, t

>        pp_string (pp, " > ");

>        break;

>

> +    case VEC_DUPLICATE_EXPR:

> +      pp_space (pp);

> +      for (str = get_tree_code_name (code); *str; str++)

> +       pp_character (pp, TOUPPER (*str));

> +      pp_string (pp, " < ");

> +      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);

> +      pp_string (pp, " > ");

> +      break;

> +

>      case VEC_UNPACK_HI_EXPR:

>        pp_string (pp, " VEC_UNPACK_HI_EXPR < ");

>        dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);

> Index: gcc/tree-vect-generic.c

> ===================================================================

> --- gcc/tree-vect-generic.c     2017-12-15 00:24:47.213516622 +0000

> +++ gcc/tree-vect-generic.c     2017-12-15 00:24:47.504458069 +0000

> @@ -1418,6 +1418,7 @@ lower_vec_perm (gimple_stmt_iterator *gs

>  ssa_uniform_vector_p (tree op)

>  {

>    if (TREE_CODE (op) == VECTOR_CST

> +      || TREE_CODE (op) == VEC_DUPLICATE_EXPR

>        || TREE_CODE (op) == CONSTRUCTOR)

>      return uniform_vector_p (op);

>    if (TREE_CODE (op) == SSA_NAME)

> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c    2017-12-15 00:24:47.213516622 +0000
> +++ gcc/fold-const.c    2017-12-15 00:24:47.501458673 +0000
> @@ -1771,6 +1771,11 @@ const_unop (enum tree_code code, tree ty
>         return elts.build ();
>        }
>
> +    case VEC_DUPLICATE_EXPR:
> +      if (CONSTANT_CLASS_P (arg0))
> +       return build_vector_from_val (type, arg0);
> +      return NULL_TREE;
> +
>      default:
>        break;
>      }
> @@ -14442,6 +14447,22 @@ test_vector_folding ()
>    ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));
>  }
>
> +/* Verify folding of VEC_DUPLICATE_EXPRs.  */
> +
> +static void
> +test_vec_duplicate_folding ()
> +{
> +  scalar_int_mode int_mode = SCALAR_INT_TYPE_MODE (ssizetype);
> +  machine_mode vec_mode = targetm.vectorize.preferred_simd_mode (int_mode);
> +  /* This will be 1 if VEC_MODE isn't a vector mode.  */
> +  unsigned int nunits = GET_MODE_NUNITS (vec_mode);
> +
> +  tree type = build_vector_type (ssizetype, nunits);
> +  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));
> +  tree dup5_cst = build_vector_from_val (type, ssize_int (5));
> +  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));
> +}
> +
>  /* Run all of the selftests within this file.  */
>
>  void
> @@ -14449,6 +14470,7 @@ fold_const_c_tests ()
>  {
>    test_arithmetic_folding ();
>    test_vector_folding ();
> +  test_vec_duplicate_folding ();
>  }
>
>  } // namespace selftest
> Index: gcc/optabs.def
> ===================================================================
> --- gcc/optabs.def      2017-12-15 00:24:47.213516622 +0000
> +++ gcc/optabs.def      2017-12-15 00:24:47.502458472 +0000
> @@ -363,3 +363,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I
>
>  OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")
>  OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
> +
> +OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
> Index: gcc/optabs-tree.c
> ===================================================================
> --- gcc/optabs-tree.c   2017-12-15 00:24:47.213516622 +0000
> +++ gcc/optabs-tree.c   2017-12-15 00:24:47.501458673 +0000
> @@ -199,6 +199,9 @@ optab_for_tree_code (enum tree_code code
>        return TYPE_UNSIGNED (type) ?
>         vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
>
> +    case VEC_DUPLICATE_EXPR:
> +      return vec_duplicate_optab;
> +
>      default:
>        break;
>      }
> Index: gcc/optabs.h
> ===================================================================
> --- gcc/optabs.h        2017-12-15 00:24:47.213516622 +0000
> +++ gcc/optabs.h        2017-12-15 00:24:47.502458472 +0000
> @@ -182,6 +182,7 @@ extern rtx simplify_expand_binop (machin
>                                   enum optab_methods methods);
>  extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,
>                                 enum optab_methods);
> +extern rtx expand_vector_broadcast (machine_mode, rtx);
>
>  /* Generate code for a simple binary or unary operation.  "Simple" in
>     this case means "can be unambiguously described by a (mode, code)
> Index: gcc/optabs.c
> ===================================================================
> --- gcc/optabs.c        2017-12-15 00:24:47.213516622 +0000
> +++ gcc/optabs.c        2017-12-15 00:24:47.502458472 +0000
> @@ -367,7 +367,7 @@ force_expand_binop (machine_mode mode, o
>     mode of OP must be the element mode of VMODE.  If OP is a constant,
>     then the return value will be a constant.  */
>
> -static rtx
> +rtx
>  expand_vector_broadcast (machine_mode vmode, rtx op)
>  {
>    enum insn_code icode;
> @@ -380,6 +380,16 @@ expand_vector_broadcast (machine_mode vm
>    if (valid_for_const_vec_duplicate_p (vmode, op))
>      return gen_const_vec_duplicate (vmode, op);
>
> +  icode = optab_handler (vec_duplicate_optab, vmode);
> +  if (icode != CODE_FOR_nothing)
> +    {
> +      struct expand_operand ops[2];
> +      create_output_operand (&ops[0], NULL_RTX, vmode);
> +      create_input_operand (&ops[1], op, GET_MODE (op));
> +      expand_insn (icode, 2, ops);
> +      return ops[0].value;
> +    }
> +
>    /* ??? If the target doesn't have a vec_init, then we have no easy way
>       of performing this operation.  Most of this sort of generic support
>       is hidden away in the vector lowering support in gimple.  */
> Index: gcc/expr.c
> ===================================================================
> --- gcc/expr.c  2017-12-15 00:24:47.213516622 +0000
> +++ gcc/expr.c  2017-12-15 00:24:47.500458874 +0000
> @@ -6598,7 +6598,8 @@ store_constructor (tree exp, rtx target,
>         constructor_elt *ce;
>         int i;
>         int need_to_clear;
> -       int icode = CODE_FOR_nothing;
> +       insn_code icode = CODE_FOR_nothing;
> +       tree elt;
>         tree elttype = TREE_TYPE (type);
>         int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));
>         machine_mode eltmode = TYPE_MODE (elttype);
> @@ -6608,13 +6609,30 @@ store_constructor (tree exp, rtx target,
>         unsigned n_elts;
>         alias_set_type alias;
>         bool vec_vec_init_p = false;
> +       machine_mode mode = GET_MODE (target);
>
>         gcc_assert (eltmode != BLKmode);
>
> +       /* Try using vec_duplicate_optab for uniform vectors.  */
> +       if (!TREE_SIDE_EFFECTS (exp)
> +           && VECTOR_MODE_P (mode)
> +           && eltmode == GET_MODE_INNER (mode)
> +           && ((icode = optab_handler (vec_duplicate_optab, mode))
> +               != CODE_FOR_nothing)
> +           && (elt = uniform_vector_p (exp)))
> +         {
> +           struct expand_operand ops[2];
> +           create_output_operand (&ops[0], target, mode);
> +           create_input_operand (&ops[1], expand_normal (elt), eltmode);
> +           expand_insn (icode, 2, ops);
> +           if (!rtx_equal_p (target, ops[0].value))
> +             emit_move_insn (target, ops[0].value);
> +           break;
> +         }
> +
>         n_elts = TYPE_VECTOR_SUBPARTS (type);
> -       if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
> +       if (REG_P (target) && VECTOR_MODE_P (mode))
>           {
> -           machine_mode mode = GET_MODE (target);
>             machine_mode emode = eltmode;
>
>             if (CONSTRUCTOR_NELTS (exp)
> @@ -6626,7 +6644,7 @@ store_constructor (tree exp, rtx target,
>                             == n_elts);
>                 emode = TYPE_MODE (etype);
>               }
> -           icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
> +           icode = convert_optab_handler (vec_init_optab, mode, emode);
>             if (icode != CODE_FOR_nothing)
>               {
>                 unsigned int i, n = n_elts;
> @@ -6674,7 +6692,7 @@ store_constructor (tree exp, rtx target,
>         if (need_to_clear && size > 0 && !vector)
>           {
>             if (REG_P (target))
> -             emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
> +             emit_move_insn (target, CONST0_RTX (mode));
>             else
>               clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
>             cleared = 1;
> @@ -6682,7 +6700,7 @@ store_constructor (tree exp, rtx target,
>
>         /* Inform later passes that the old value is dead.  */
>         if (!cleared && !vector && REG_P (target))
> -         emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
> +         emit_move_insn (target, CONST0_RTX (mode));
>
>          if (MEM_P (target))
>           alias = MEM_ALIAS_SET (target);
> @@ -6733,8 +6751,7 @@ store_constructor (tree exp, rtx target,
>
>         if (vector)
>           emit_insn (GEN_FCN (icode) (target,
> -                                     gen_rtx_PARALLEL (GET_MODE (target),
> -                                                       vector)));
> +                                     gen_rtx_PARALLEL (mode, vector)));
>         break;
>        }
>
> @@ -9563,6 +9580,12 @@ #define REDUCE_BIT_FIELD(expr)   (reduce_b
>        target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
>        return target;
>
> +    case VEC_DUPLICATE_EXPR:
> +      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
> +      target = expand_vector_broadcast (mode, op0);
> +      gcc_assert (target);
> +      return target;
> +
>      case BIT_INSERT_EXPR:
>        {
>         unsigned bitpos = tree_to_uhwi (treeop2);
Richard Sandiford Dec. 15, 2017, 12:52 p.m. UTC | #6
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Dec 15, 2017 at 1:29 AM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> This patch just adds VEC_DUPLICATE_EXPR, since the VEC_DUPLICATE_CST
>> isn't needed with the new VECTOR_CST layout.  It's really just the
>> original patch with bits removed, but just in case:
>>
>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
>> OK to install?
>
> To keep things simple at this point OK.  Note that I'd eventually
> like to see this as VEC_PERM_EXPR <scalar_type_1, scalar_type_1, { 0, ... }>.
> For reductions when we need { x, 0, ... } we now have to use a
> VEC_DUPLICATE_EXPR to make x a vector and then a VEC_PERM_EXPR
> to merge it with {0, ... }, right?  Rather than VEC_PERM_EXPR <x_1, 0,
> { 0, 1, 1, 1.... }>

That's where the shift-left-and-insert-scalar thing (IFN_SHL_INSERT)
comes in.  But yeah, allowing scalars as operands to VEC_PERM_EXPRs
would mean it could represent both VEC_DUPLICATE_EXPR and IFN_SHL_INSERT.
I guess the question is whether that's better than extending CONSTRUCTOR
(or a replacement) to use the VECTOR_CST encoding.  I realise you don't
like CONSTRUCTOR in gimple though...

I promise to look at either of those for GCC 9 if you think they're
better, but they'll be more invasive for other targets.

Thanks,
Richard
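
For readers following the exchange above, the equivalence under discussion can be sketched in standalone C++. This is only a toy model of the semantics, not GCC code: `perm_scalars` and `vec_duplicate` are hypothetical helper names, the element type is fixed to `long`, and the vector length is passed explicitly, which sidesteps the variable-length case that motivates the patch. The assumption is that a permute taking two scalar inputs would read selector index 0 as the first scalar and index 1 as the second.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model (not a GCC API): a permute whose two inputs are
// scalars.  Selector index 0 picks A and index 1 picks B.
static std::vector<long>
perm_scalars (long a, long b, const std::vector<std::size_t> &sel)
{
  std::vector<long> result;
  result.reserve (sel.size ());
  for (std::size_t idx : sel)
    {
      assert (idx < 2);
      result.push_back (idx == 0 ? a : b);
    }
  return result;
}

// Hypothetical model of VEC_DUPLICATE_EXPR: N copies of X.
static std::vector<long>
vec_duplicate (long x, std::size_t n)
{
  return std::vector<long> (n, x);
}
```

Under this model, `perm_scalars (x, x, {0, 0, 0, 0})` yields the same elements as `vec_duplicate (x, 4)`, and `perm_scalars (x, 0, {0, 1, 1, 1})` yields `{ x, 0, 0, 0 }`, the reduction initializer mentioned in the quoted review.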
Richard Biener Dec. 15, 2017, 1:20 p.m. UTC | #7
On Fri, Dec 15, 2017 at 1:52 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Fri, Dec 15, 2017 at 1:29 AM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> This patch just adds VEC_DUPLICATE_EXPR, since the VEC_DUPLICATE_CST
>>> isn't needed with the new VECTOR_CST layout.  It's really just the
>>> original patch with bits removed, but just in case:
>>>
>>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
>>> OK to install?
>>
>> To keep things simple at this point OK.  Note that I'd eventually
>> like to see this as VEC_PERM_EXPR <scalar_type_1, scalar_type_1, { 0, ... }>.
>> For reductions when we need { x, 0, ... } we now have to use a
>> VEC_DUPLICATE_EXPR to make x a vector and then a VEC_PERM_EXPR
>> to merge it with {0, ... }, right?  Rather than VEC_PERM_EXPR <x_1, 0,
>> { 0, 1, 1, 1.... }>
>
> That's where the shift-left-and-insert-scalar thing (IFN_SHL_INSERT)
> comes in.  But yeah, allowing scalars as operands to VEC_PERM_EXPRs
> would mean it could represent both VEC_DUPLICATE_EXPR and IFN_SHL_INSERT.
> I guess the question is whether that's better than extending CONSTRUCTOR
> (or a replacement) to use the VECTOR_CST encoding.  I realise you don't
> like CONSTRUCTOR in gimple though...
>
> I promise to look at either of those for GCC 9 if you think they're
> better, but they'll be more invasive for other targets.

Thanks.
Richard.

> Thanks,
> Richard
Patch

Index: gcc/doc/generic.texi
===================================================================
--- gcc/doc/generic.texi	2017-10-23 11:38:53.934094740 +0100
+++ gcc/doc/generic.texi	2017-10-23 11:41:51.760448406 +0100
@@ -1036,6 +1036,7 @@  As this example indicates, the operands
 @tindex FIXED_CST
 @tindex COMPLEX_CST
 @tindex VECTOR_CST
+@tindex VEC_DUPLICATE_CST
 @tindex STRING_CST
 @findex TREE_STRING_LENGTH
 @findex TREE_STRING_POINTER
@@ -1089,6 +1090,14 @@  constant nodes.  Each individual constan
 double constant node.  The first operand is a @code{TREE_LIST} of the
 constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.
 
+@item VEC_DUPLICATE_CST
+These nodes represent a vector constant in which every element has the
+same scalar value.  At present only variable-length vectors use
+@code{VEC_DUPLICATE_CST}; constant-length vectors use @code{VECTOR_CST}
+instead.  The scalar element value is given by
+@code{VEC_DUPLICATE_CST_ELT} and has the same restrictions as the
+element of a @code{VECTOR_CST}.
+
 @item STRING_CST
 These nodes represent string-constants.  The @code{TREE_STRING_LENGTH}
 returns the length of the string, as an @code{int}.  The
@@ -1692,6 +1701,7 @@  a value from @code{enum annot_expr_kind}
 
 @node Vectors
 @subsection Vectors
+@tindex VEC_DUPLICATE_EXPR
 @tindex VEC_LSHIFT_EXPR
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
@@ -1703,9 +1713,14 @@  a value from @code{enum annot_expr_kind}
 @tindex VEC_PACK_TRUNC_EXPR
 @tindex VEC_PACK_SAT_EXPR
 @tindex VEC_PACK_FIX_TRUNC_EXPR
+@tindex VEC_COND_EXPR
 @tindex SAD_EXPR
 
 @table @code
+@item VEC_DUPLICATE_EXPR
+This node has a single operand and represents a vector in which every
+element is equal to that operand.
+
 @item VEC_LSHIFT_EXPR
 @itemx VEC_RSHIFT_EXPR
 These nodes represent whole vector left and right shifts, respectively.
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	2017-10-23 11:41:22.189466342 +0100
+++ gcc/doc/md.texi	2017-10-23 11:41:51.761413027 +0100
@@ -4888,6 +4888,17 @@  and operand 1 is parallel containing val
 the vector mode @var{m}, or a vector mode with the same element mode and
 smaller number of elements.
 
+@cindex @code{vec_duplicate@var{m}} instruction pattern
+@item @samp{vec_duplicate@var{m}}
+Initialize vector output operand 0 so that each element has the value given
+by scalar input operand 1.  The vector has mode @var{m} and the scalar has
+the mode appropriate for one element of @var{m}.
+
+This pattern only handles duplicates of non-constant inputs.  Constant
+vectors go through the @code{mov@var{m}} pattern instead.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
 @item @samp{vec_cmp@var{m}@var{n}}
 Output a vector comparison.  Operand 0 of mode @var{n} is the destination for
Index: gcc/tree.def
===================================================================
--- gcc/tree.def	2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree.def	2017-10-23 11:41:51.774917721 +0100
@@ -304,6 +304,10 @@  DEFTREECODE (COMPLEX_CST, "complex_cst",
 /* Contents are in VECTOR_CST_ELTS field.  */
 DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
 
+/* Represents a vector constant in which every element is equal to
+   VEC_DUPLICATE_CST_ELT.  */
+DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)
+
 /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */
 DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)
 
@@ -534,6 +538,9 @@  DEFTREECODE (TARGET_EXPR, "target_expr",
    1 and 2 are NULL.  The operands are then taken from the cfg edges. */
 DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)
 
+/* Represents a vector in which every element is equal to operand 0.  */
+DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)
+
 /* Vector conditional expression. It is like COND_EXPR, but with
    vector operands.
 
Index: gcc/tree-core.h
===================================================================
--- gcc/tree-core.h	2017-10-23 11:41:25.862065318 +0100
+++ gcc/tree-core.h	2017-10-23 11:41:51.771059237 +0100
@@ -975,7 +975,8 @@  struct GTY(()) tree_base {
     /* VEC length.  This field is only used with TREE_VEC.  */
     int length;
 
-    /* Number of elements.  This field is only used with VECTOR_CST.  */
+    /* Number of elements.  This field is only used with VECTOR_CST
+       and VEC_DUPLICATE_CST.  It is always 1 for VEC_DUPLICATE_CST.  */
     unsigned int nelts;
 
     /* SSA version number.  This field is only used with SSA_NAME.  */
@@ -1065,7 +1066,7 @@  struct GTY(()) tree_base {
    public_flag:
 
        TREE_OVERFLOW in
-           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST
+           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST, VEC_DUPLICATE_CST
 
        TREE_PUBLIC in
            VAR_DECL, FUNCTION_DECL
@@ -1332,7 +1333,7 @@  struct GTY(()) tree_complex {
 
 struct GTY(()) tree_vector {
   struct tree_typed typed;
-  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];
+  tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];
 };
 
 struct GTY(()) tree_identifier {
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	2017-10-23 11:41:23.517482774 +0100
+++ gcc/tree.h	2017-10-23 11:41:51.775882341 +0100
@@ -730,8 +730,8 @@  #define TREE_SYMBOL_REFERENCED(NODE) \
 #define TYPE_REF_CAN_ALIAS_ALL(NODE) \
   (PTR_OR_REF_CHECK (NODE)->base.static_flag)
 
-/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, or VECTOR_CST, this means
-   there was an overflow in folding.  */
+/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST or VEC_DUPLICATE_CST,
+   this means there was an overflow in folding.  */
 
 #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)
 
@@ -1030,6 +1030,10 @@  #define VECTOR_CST_NELTS(NODE) (VECTOR_C
 #define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)
 #define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])
 
+/* In a VEC_DUPLICATE_CST node.  */
+#define VEC_DUPLICATE_CST_ELT(NODE) \
+  (VEC_DUPLICATE_CST_CHECK (NODE)->vector.elts[0])
+
 /* Define fields and accessors for some special-purpose tree nodes.  */
 
 #define IDENTIFIER_LENGTH(NODE) \
@@ -4025,6 +4029,7 @@  extern tree build_int_cst (tree, HOST_WI
 extern tree build_int_cstu (tree type, unsigned HOST_WIDE_INT cst);
 extern tree build_int_cst_type (tree, HOST_WIDE_INT);
 extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
+extern tree build_vec_duplicate_cst (tree, tree CXX_MEM_STAT_INFO);
 extern tree build_vector (tree, vec<tree> CXX_MEM_STAT_INFO);
 extern tree build_vector_from_ctor (tree, vec<constructor_elt, va_gc> *);
 extern tree build_vector_from_val (tree, tree);
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	2017-10-23 11:41:23.515548300 +0100
+++ gcc/tree.c	2017-10-23 11:41:51.774917721 +0100
@@ -464,6 +464,7 @@  tree_node_structure_for_code (enum tree_
     case FIXED_CST:		return TS_FIXED_CST;
     case COMPLEX_CST:		return TS_COMPLEX;
     case VECTOR_CST:		return TS_VECTOR;
+    case VEC_DUPLICATE_CST:	return TS_VECTOR;
     case STRING_CST:		return TS_STRING;
       /* tcc_exceptional cases.  */
     case ERROR_MARK:		return TS_COMMON;
@@ -816,6 +817,7 @@  tree_code_size (enum tree_code code)
 	case FIXED_CST:		return sizeof (struct tree_fixed_cst);
 	case COMPLEX_CST:	return sizeof (struct tree_complex);
 	case VECTOR_CST:	return sizeof (struct tree_vector);
+	case VEC_DUPLICATE_CST:	return sizeof (struct tree_vector);
 	case STRING_CST:	gcc_unreachable ();
 	default:
 	  return lang_hooks.tree_size (code);
@@ -875,6 +877,9 @@  tree_size (const_tree node)
       return (sizeof (struct tree_vector)
 	      + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));
 
+    case VEC_DUPLICATE_CST:
+      return sizeof (struct tree_vector);
+
     case STRING_CST:
       return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) + 1;
 
@@ -1682,6 +1687,30 @@  cst_and_fits_in_hwi (const_tree x)
 	  && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));
 }
 
+/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.
+
+   Note that this function is only suitable for callers that specifically
+   need a VEC_DUPLICATE_CST node.  Use build_vector_from_val to duplicate
+   a general scalar into a general vector type.  */
+
+tree
+build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
+{
+  int length = sizeof (struct tree_vector);
+
+  record_node_allocation_statistics (VEC_DUPLICATE_CST, length);
+
+  tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
+
+  TREE_SET_CODE (t, VEC_DUPLICATE_CST);
+  TREE_TYPE (t) = type;
+  t->base.u.nelts = 1;
+  VEC_DUPLICATE_CST_ELT (t) = exp;
+  TREE_CONSTANT (t) = 1;
+
+  return t;
+}
+
 /* Build a newly constructed VECTOR_CST node of length LEN.  */
 
 tree
@@ -2343,6 +2372,8 @@  integer_zerop (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2369,6 +2400,8 @@  integer_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2407,6 +2440,9 @@  integer_all_onesp (const_tree expr)
       return 1;
     }
 
+  else if (TREE_CODE (expr) == VEC_DUPLICATE_CST)
+    return integer_all_onesp (VEC_DUPLICATE_CST_ELT (expr));
+
   else if (TREE_CODE (expr) != INTEGER_CST)
     return 0;
 
@@ -2463,7 +2499,7 @@  integer_nonzerop (const_tree expr)
 int
 integer_truep (const_tree expr)
 {
-  if (TREE_CODE (expr) == VECTOR_CST)
+  if (TREE_CODE (expr) == VECTOR_CST || TREE_CODE (expr) == VEC_DUPLICATE_CST)
     return integer_all_onesp (expr);
   return integer_onep (expr);
 }
@@ -2634,6 +2670,8 @@  real_zerop (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2662,6 +2700,8 @@  real_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2689,6 +2729,8 @@  real_minus_onep (const_tree expr)
 	    return false;
 	return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_minus_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -7091,6 +7133,9 @@  add_expr (const_tree t, inchash::hash &h
 	  inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags);
 	return;
       }
+    case VEC_DUPLICATE_CST:
+      inchash::add_expr (VEC_DUPLICATE_CST_ELT (t), hstate);
+      return;
     case SSA_NAME:
       /* We can just compare by pointer.  */
       hstate.add_wide_int (SSA_NAME_VERSION (t));
@@ -10345,6 +10390,9 @@  initializer_zerop (const_tree init)
 	return true;
       }
 
+    case VEC_DUPLICATE_CST:
+      return initializer_zerop (VEC_DUPLICATE_CST_ELT (init));
+
     case CONSTRUCTOR:
       {
 	unsigned HOST_WIDE_INT idx;
@@ -10390,7 +10438,13 @@  uniform_vector_p (const_tree vec)
 
   gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec)));
 
-  if (TREE_CODE (vec) == VECTOR_CST)
+  if (TREE_CODE (vec) == VEC_DUPLICATE_CST)
+    return VEC_DUPLICATE_CST_ELT (vec);
+
+  else if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR)
+    return TREE_OPERAND (vec, 0);
+
+  else if (TREE_CODE (vec) == VECTOR_CST)
     {
       first = VECTOR_CST_ELT (vec, 0);
       for (i = 1; i < VECTOR_CST_NELTS (vec); ++i)
@@ -11095,6 +11149,7 @@  #define WALK_SUBTREE_TAIL(NODE)				\
     case REAL_CST:
     case FIXED_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case BLOCK:
     case PLACEHOLDER_EXPR:
@@ -12381,6 +12436,12 @@  drop_tree_overflow (tree t)
 	    elt = drop_tree_overflow (elt);
 	}
     }
+  if (TREE_CODE (t) == VEC_DUPLICATE_CST)
+    {
+      tree *elt = &VEC_DUPLICATE_CST_ELT (t);
+      if (TREE_OVERFLOW (*elt))
+	*elt = drop_tree_overflow (*elt);
+    }
   return t;
 }
 
@@ -13798,6 +13859,92 @@  test_integer_constants ()
   ASSERT_EQ (type, TREE_TYPE (zero));
 }
 
+/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs
+   for integral type TYPE.  */
+
+static void
+test_vec_duplicate_predicates_int (tree type)
+{
+  tree vec_type = build_vector_type (type, 4);
+
+  tree zero = build_zero_cst (type);
+  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
+  ASSERT_TRUE (integer_zerop (vec_zero));
+  ASSERT_FALSE (integer_onep (vec_zero));
+  ASSERT_FALSE (integer_minus_onep (vec_zero));
+  ASSERT_FALSE (integer_all_onesp (vec_zero));
+  ASSERT_FALSE (integer_truep (vec_zero));
+  ASSERT_TRUE (initializer_zerop (vec_zero));
+
+  tree one = build_one_cst (type);
+  tree vec_one = build_vec_duplicate_cst (vec_type, one);
+  ASSERT_FALSE (integer_zerop (vec_one));
+  ASSERT_TRUE (integer_onep (vec_one));
+  ASSERT_FALSE (integer_minus_onep (vec_one));
+  ASSERT_FALSE (integer_all_onesp (vec_one));
+  ASSERT_FALSE (integer_truep (vec_one));
+  ASSERT_FALSE (initializer_zerop (vec_one));
+
+  tree minus_one = build_minus_one_cst (type);
+  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
+  ASSERT_FALSE (integer_zerop (vec_minus_one));
+  ASSERT_FALSE (integer_onep (vec_minus_one));
+  ASSERT_TRUE (integer_minus_onep (vec_minus_one));
+  ASSERT_TRUE (integer_all_onesp (vec_minus_one));
+  ASSERT_TRUE (integer_truep (vec_minus_one));
+  ASSERT_FALSE (initializer_zerop (vec_minus_one));
+
+  tree x = create_tmp_var_raw (type, "x");
+  tree vec_x = build1 (VEC_DUPLICATE_EXPR, vec_type, x);
+  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
+  ASSERT_EQ (uniform_vector_p (vec_one), one);
+  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
+  ASSERT_EQ (uniform_vector_p (vec_x), x);
+}
+
+/* Verify predicate handling of VEC_DUPLICATE_CSTs for floating-point
+   type TYPE.  */
+
+static void
+test_vec_duplicate_predicates_float (tree type)
+{
+  tree vec_type = build_vector_type (type, 4);
+
+  tree zero = build_zero_cst (type);
+  tree vec_zero = build_vec_duplicate_cst (vec_type, zero);
+  ASSERT_TRUE (real_zerop (vec_zero));
+  ASSERT_FALSE (real_onep (vec_zero));
+  ASSERT_FALSE (real_minus_onep (vec_zero));
+  ASSERT_TRUE (initializer_zerop (vec_zero));
+
+  tree one = build_one_cst (type);
+  tree vec_one = build_vec_duplicate_cst (vec_type, one);
+  ASSERT_FALSE (real_zerop (vec_one));
+  ASSERT_TRUE (real_onep (vec_one));
+  ASSERT_FALSE (real_minus_onep (vec_one));
+  ASSERT_FALSE (initializer_zerop (vec_one));
+
+  tree minus_one = build_minus_one_cst (type);
+  tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one);
+  ASSERT_FALSE (real_zerop (vec_minus_one));
+  ASSERT_FALSE (real_onep (vec_minus_one));
+  ASSERT_TRUE (real_minus_onep (vec_minus_one));
+  ASSERT_FALSE (initializer_zerop (vec_minus_one));
+
+  ASSERT_EQ (uniform_vector_p (vec_zero), zero);
+  ASSERT_EQ (uniform_vector_p (vec_one), one);
+  ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one);
+}
+
+/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
+
+static void
+test_vec_duplicate_predicates ()
+{
+  test_vec_duplicate_predicates_int (integer_type_node);
+  test_vec_duplicate_predicates_float (float_type_node);
+}
+
 /* Verify identifiers.  */
 
 static void
@@ -13826,6 +13973,7 @@  test_labels ()
 tree_c_tests ()
 {
   test_integer_constants ();
+  test_vec_duplicate_predicates ();
   test_identifiers ();
   test_labels ();
 }
Index: gcc/cfgexpand.c
===================================================================
--- gcc/cfgexpand.c	2017-10-23 11:41:23.137358624 +0100
+++ gcc/cfgexpand.c	2017-10-23 11:41:51.760448406 +0100
@@ -5049,6 +5049,8 @@  expand_debug_expr (tree exp)
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
     case VEC_PERM_EXPR:
+    case VEC_DUPLICATE_CST:
+    case VEC_DUPLICATE_EXPR:
       return NULL;
 
     /* Misc codes.  */
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c	2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree-pretty-print.c	2017-10-23 11:41:51.772023858 +0100
@@ -1802,6 +1802,12 @@  dump_generic_node (pretty_printer *pp, t
       }
       break;
 
+    case VEC_DUPLICATE_CST:
+      pp_string (pp, "{ ");
+      dump_generic_node (pp, VEC_DUPLICATE_CST_ELT (node), spc, flags, false);
+      pp_string (pp, ", ... }");
+      break;
+
     case FUNCTION_TYPE:
     case METHOD_TYPE:
       dump_generic_node (pp, TREE_TYPE (node), spc, flags, false);
@@ -3231,6 +3237,15 @@  dump_generic_node (pretty_printer *pp, t
       pp_string (pp, " > ");
       break;
 
+    case VEC_DUPLICATE_EXPR:
+      pp_space (pp);
+      for (str = get_tree_code_name (code); *str; str++)
+	pp_character (pp, TOUPPER (*str));
+      pp_string (pp, " < ");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, " > ");
+      break;
+
     case VEC_UNPACK_HI_EXPR:
       pp_string (pp, " VEC_UNPACK_HI_EXPR < ");
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
Index: gcc/dwarf2out.c
===================================================================
--- gcc/dwarf2out.c	2017-10-23 11:41:24.407340836 +0100
+++ gcc/dwarf2out.c	2017-10-23 11:41:51.763342269 +0100
@@ -18862,6 +18862,7 @@  rtl_for_decl_init (tree init, tree type)
 	switch (TREE_CODE (init))
 	  {
 	  case VECTOR_CST:
+	  case VEC_DUPLICATE_CST:
 	    break;
 	  case CONSTRUCTOR:
 	    if (TREE_CONSTANT (init))
Index: gcc/gimple-expr.h
===================================================================
--- gcc/gimple-expr.h	2017-10-23 11:38:53.934094740 +0100
+++ gcc/gimple-expr.h	2017-10-23 11:41:51.765271511 +0100
@@ -134,6 +134,7 @@  is_gimple_constant (const_tree t)
     case FIXED_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
       return true;
 
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	2017-10-23 11:41:25.531270256 +0100
+++ gcc/gimplify.c	2017-10-23 11:41:51.766236132 +0100
@@ -11506,6 +11506,7 @@  gimplify_expr (tree *expr_p, gimple_seq
 	case STRING_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	  /* Drop the overflow flag on constants, we do not want
 	     that in the GIMPLE IL.  */
 	  if (TREE_OVERFLOW_P (*expr_p))
Index: gcc/graphite-isl-ast-to-gimple.c
===================================================================
--- gcc/graphite-isl-ast-to-gimple.c	2017-10-23 11:41:23.205065216 +0100
+++ gcc/graphite-isl-ast-to-gimple.c	2017-10-23 11:41:51.767200753 +0100
@@ -222,7 +222,8 @@  enum phi_node_kind
     return TREE_CODE (op) == INTEGER_CST
       || TREE_CODE (op) == REAL_CST
       || TREE_CODE (op) == COMPLEX_CST
-      || TREE_CODE (op) == VECTOR_CST;
+      || TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_CST;
   }
 
 private:
Index: gcc/graphite-scop-detection.c
===================================================================
--- gcc/graphite-scop-detection.c	2017-10-23 11:41:25.533204730 +0100
+++ gcc/graphite-scop-detection.c	2017-10-23 11:41:51.767200753 +0100
@@ -1243,6 +1243,7 @@  scan_tree_for_params (sese_info_p s, tre
     case REAL_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       break;
 
    default:
Index: gcc/ipa-icf-gimple.c
===================================================================
--- gcc/ipa-icf-gimple.c	2017-10-23 11:38:53.934094740 +0100
+++ gcc/ipa-icf-gimple.c	2017-10-23 11:41:51.767200753 +0100
@@ -333,6 +333,7 @@  func_checker::compare_cst_or_decl (tree
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case REAL_CST:
       {
@@ -528,6 +529,7 @@  func_checker::compare_operand (tree t1,
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
     case REAL_CST:
     case FUNCTION_DECL:
Index: gcc/ipa-icf.c
===================================================================
--- gcc/ipa-icf.c	2017-10-23 11:41:25.874639400 +0100
+++ gcc/ipa-icf.c	2017-10-23 11:41:51.768165374 +0100
@@ -1478,6 +1478,7 @@  sem_item::add_expr (const_tree exp, inch
     case STRING_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       inchash::add_expr (exp, hstate);
       break;
     case CONSTRUCTOR:
@@ -2030,6 +2031,9 @@  sem_variable::equals (tree t1, tree t2)
 
 	return 1;
       }
+    case VEC_DUPLICATE_CST:
+      return sem_variable::equals (VEC_DUPLICATE_CST_ELT (t1),
+				   VEC_DUPLICATE_CST_ELT (t2));
     case ARRAY_REF:
     case ARRAY_RANGE_REF:
       {
Index: gcc/match.pd
===================================================================
--- gcc/match.pd	2017-10-23 11:38:53.934094740 +0100
+++ gcc/match.pd	2017-10-23 11:41:51.768165374 +0100
@@ -958,6 +958,9 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (match negate_expr_p
  VECTOR_CST
  (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))
+(match negate_expr_p
+ VEC_DUPLICATE_CST
+ (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))))
 
 /* (-A) * (-B) -> A * B  */
 (simplify
Index: gcc/print-tree.c
===================================================================
--- gcc/print-tree.c	2017-10-23 11:38:53.934094740 +0100
+++ gcc/print-tree.c	2017-10-23 11:41:51.769129995 +0100
@@ -783,6 +783,10 @@  print_node (FILE *file, const char *pref
 	  }
 	  break;
 
+	case VEC_DUPLICATE_CST:
+	  print_node (file, "elt", VEC_DUPLICATE_CST_ELT (node), indent + 4);
+	  break;
+
 	case COMPLEX_CST:
 	  print_node (file, "real", TREE_REALPART (node), indent + 4);
 	  print_node (file, "imag", TREE_IMAGPART (node), indent + 4);
Index: gcc/tree-chkp.c
===================================================================
--- gcc/tree-chkp.c	2017-10-23 11:41:23.201196268 +0100
+++ gcc/tree-chkp.c	2017-10-23 11:41:51.770094616 +0100
@@ -3800,6 +3800,7 @@  chkp_find_bounds_1 (tree ptr, tree ptr_s
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
       if (integer_zerop (ptr_src))
 	bounds = chkp_get_none_bounds ();
       else
Index: gcc/tree-loop-distribution.c
===================================================================
--- gcc/tree-loop-distribution.c	2017-10-23 11:41:23.228278904 +0100
+++ gcc/tree-loop-distribution.c	2017-10-23 11:41:51.771059237 +0100
@@ -921,6 +921,9 @@  const_with_all_bytes_same (tree val)
           && CONSTRUCTOR_NELTS (val) == 0))
     return 0;
 
+  if (TREE_CODE (val) == VEC_DUPLICATE_CST)
+    return const_with_all_bytes_same (VEC_DUPLICATE_CST_ELT (val));
+
   if (real_zerop (val))
     {
       /* Only return 0 for +0.0, not for -0.0, which doesn't have
Index: gcc/tree-ssa-loop.c
===================================================================
--- gcc/tree-ssa-loop.c	2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree-ssa-loop.c	2017-10-23 11:41:51.772023858 +0100
@@ -616,6 +616,7 @@  for_each_index (tree *addr_p, bool (*cbc
 	case STRING_CST:
 	case RESULT_DECL:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case COMPLEX_CST:
 	case INTEGER_CST:
 	case REAL_CST:
Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c	2017-10-23 11:41:25.549647760 +0100
+++ gcc/tree-ssa-pre.c	2017-10-23 11:41:51.772023858 +0100
@@ -2675,6 +2675,7 @@  create_component_ref_by_pieces_1 (basic_
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case REAL_CST:
     case CONSTRUCTOR:
     case VAR_DECL:
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c	2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree-ssa-sccvn.c	2017-10-23 11:41:51.773953100 +0100
@@ -858,6 +858,7 @@  copy_reference_ops_from_ref (tree ref, v
 	case INTEGER_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case REAL_CST:
 	case FIXED_CST:
 	case CONSTRUCTOR:
@@ -1050,6 +1051,7 @@  ao_ref_init_from_vn_reference (ao_ref *r
 	case INTEGER_CST:
 	case COMPLEX_CST:
 	case VECTOR_CST:
+	case VEC_DUPLICATE_CST:
 	case REAL_CST:
 	case CONSTRUCTOR:
 	case CONST_DECL:
Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c	2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree-vect-generic.c	2017-10-23 11:41:51.773953100 +0100
@@ -1419,6 +1419,7 @@  lower_vec_perm (gimple_stmt_iterator *gs
 ssa_uniform_vector_p (tree op)
 {
   if (TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_CST
       || TREE_CODE (op) == CONSTRUCTOR)
     return uniform_vector_p (op);
   if (TREE_CODE (op) == SSA_NAME)
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c	2017-10-23 11:41:25.822408600 +0100
+++ gcc/varasm.c	2017-10-23 11:41:51.775882341 +0100
@@ -3068,6 +3068,9 @@  const_hash_1 (const tree exp)
     CASE_CONVERT:
       return const_hash_1 (TREE_OPERAND (exp, 0)) * 7 + 2;
 
+    case VEC_DUPLICATE_CST:
+      return const_hash_1 (VEC_DUPLICATE_CST_ELT (exp)) * 7 + 3;
+
     default:
       /* A language specific constant. Just hash the code.  */
       return code;
@@ -3158,6 +3161,10 @@  compare_constant (const tree t1, const t
 	return 1;
       }
 
+    case VEC_DUPLICATE_CST:
+      return compare_constant (VEC_DUPLICATE_CST_ELT (t1),
+			       VEC_DUPLICATE_CST_ELT (t2));
+
     case CONSTRUCTOR:
       {
 	vec<constructor_elt, va_gc> *v1, *v2;
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	2017-10-23 11:41:23.535860278 +0100
+++ gcc/fold-const.c	2017-10-23 11:41:51.765271511 +0100
@@ -418,6 +418,9 @@  negate_expr_p (tree t)
 	return true;
       }
 
+    case VEC_DUPLICATE_CST:
+      return negate_expr_p (VEC_DUPLICATE_CST_ELT (t));
+
     case COMPLEX_EXPR:
       return negate_expr_p (TREE_OPERAND (t, 0))
 	     && negate_expr_p (TREE_OPERAND (t, 1));
@@ -579,6 +582,14 @@  fold_negate_expr_1 (location_t loc, tree
 	return build_vector (type, elts);
       }
 
+    case VEC_DUPLICATE_CST:
+      {
+	tree sub = fold_negate_expr (loc, VEC_DUPLICATE_CST_ELT (t));
+	if (!sub)
+	  return NULL_TREE;
+	return build_vector_from_val (type, sub);
+      }
+
     case COMPLEX_EXPR:
       if (negate_expr_p (t))
 	return fold_build2_loc (loc, COMPLEX_EXPR, type,
@@ -1436,6 +1447,16 @@  const_binop (enum tree_code code, tree a
       return build_vector (type, elts);
     }
 
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == VEC_DUPLICATE_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1),
+			      VEC_DUPLICATE_CST_ELT (arg2));
+      if (!sub)
+	return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
+
   /* Shifts allow a scalar offset for a vector.  */
   if (TREE_CODE (arg1) == VECTOR_CST
       && TREE_CODE (arg2) == INTEGER_CST)
@@ -1459,6 +1480,15 @@  const_binop (enum tree_code code, tree a
 
       return build_vector (type, elts);
     }
+
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == INTEGER_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1), arg2);
+      if (!sub)
+	return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
   return NULL_TREE;
 }
 
@@ -1652,6 +1682,13 @@  const_unop (enum tree_code code, tree ty
 	  if (i == count)
 	    return build_vector (type, elements);
 	}
+      else if (TREE_CODE (arg0) == VEC_DUPLICATE_CST)
+	{
+	  tree sub = const_unop (BIT_NOT_EXPR, TREE_TYPE (type),
+				 VEC_DUPLICATE_CST_ELT (arg0));
+	  if (sub)
+	    return build_vector_from_val (type, sub);
+	}
       break;
 
     case TRUTH_NOT_EXPR:
@@ -1737,6 +1774,11 @@  const_unop (enum tree_code code, tree ty
 	return res;
       }
 
+    case VEC_DUPLICATE_EXPR:
+      if (CONSTANT_CLASS_P (arg0))
+	return build_vector_from_val (type, arg0);
+      return NULL_TREE;
+
     default:
       break;
     }
@@ -2167,6 +2209,15 @@  fold_convert_const (enum tree_code code,
 	    }
 	  return build_vector (type, v);
 	}
+      if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+	  && (TYPE_VECTOR_SUBPARTS (type)
+	      == TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1))))
+	{
+	  tree sub = fold_convert_const (code, TREE_TYPE (type),
+					 VEC_DUPLICATE_CST_ELT (arg1));
+	  if (sub)
+	    return build_vector_from_val (type, sub);
+	}
     }
   return NULL_TREE;
 }
@@ -2953,6 +3004,10 @@  operand_equal_p (const_tree arg0, const_
 	  return 1;
 	}
 
+      case VEC_DUPLICATE_CST:
+	return operand_equal_p (VEC_DUPLICATE_CST_ELT (arg0),
+				VEC_DUPLICATE_CST_ELT (arg1), flags);
+
       case COMPLEX_CST:
 	return (operand_equal_p (TREE_REALPART (arg0), TREE_REALPART (arg1),
 				 flags)
@@ -7492,6 +7547,20 @@  can_native_interpret_type_p (tree type)
 static tree
 fold_view_convert_expr (tree type, tree expr)
 {
+  /* Recurse on duplicated vectors if the target type is also a vector
+     and if the elements line up.  */
+  tree expr_type = TREE_TYPE (expr);
+  if (TREE_CODE (expr) == VEC_DUPLICATE_CST
+      && VECTOR_TYPE_P (type)
+      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (expr_type)
+      && TYPE_SIZE (TREE_TYPE (type)) == TYPE_SIZE (TREE_TYPE (expr_type)))
+    {
+      tree sub = fold_view_convert_expr (TREE_TYPE (type),
+					 VEC_DUPLICATE_CST_ELT (expr));
+      if (sub)
+	return build_vector_from_val (type, sub);
+    }
+
   /* We support up to 512-bit values (for V8DFmode).  */
   unsigned char buffer[64];
   int len;
@@ -8891,6 +8960,15 @@  exact_inverse (tree type, tree cst)
 	return build_vector (type, elts);
       }
 
+    case VEC_DUPLICATE_CST:
+      {
+	tree sub = exact_inverse (TREE_TYPE (type),
+				  VEC_DUPLICATE_CST_ELT (cst));
+	if (!sub)
+	  return NULL_TREE;
+	return build_vector_from_val (type, sub);
+      }
+
     default:
       return NULL_TREE;
     }
@@ -11969,6 +12047,9 @@  fold_checksum_tree (const_tree expr, str
 	  for (i = 0; i < (int) VECTOR_CST_NELTS (expr); ++i)
 	    fold_checksum_tree (VECTOR_CST_ELT (expr, i), ctx, ht);
 	  break;
+	case VEC_DUPLICATE_CST:
+	  fold_checksum_tree (VEC_DUPLICATE_CST_ELT (expr), ctx, ht);
+	  break;
 	default:
 	  break;
 	}
@@ -14436,6 +14517,36 @@  test_vector_folding ()
   ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));
 }
 
+/* Verify folding of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.  */
+
+static void
+test_vec_duplicate_folding ()
+{
+  tree type = build_vector_type (ssizetype, 4);
+  tree dup5 = build_vec_duplicate_cst (type, ssize_int (5));
+  tree dup3 = build_vec_duplicate_cst (type, ssize_int (3));
+
+  tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));
+
+  tree not_dup5 = fold_unary (BIT_NOT_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (not_dup5), ssize_int (-6));
+
+  tree dup5_plus_dup3 = fold_binary (PLUS_EXPR, type, dup5, dup3);
+  ASSERT_EQ (uniform_vector_p (dup5_plus_dup3), ssize_int (8));
+
+  tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));
+  ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));
+
+  tree size_vector = build_vector_type (sizetype, 4);
+  tree size_dup5 = fold_convert (size_vector, dup5);
+  ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));
+
+  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));
+  tree dup5_cst = build_vector_from_val (type, ssize_int (5));
+  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -14443,6 +14554,7 @@  fold_const_c_tests ()
 {
   test_arithmetic_folding ();
   test_vector_folding ();
+  test_vec_duplicate_folding ();
 }
 
 } // namespace selftest
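
Note (illustration only, not part of the patch): the fold-const.c hunks
above share one pattern.  Each fold is applied once to the single element
of the VEC_DUPLICATE_CST, the code gives up if that scalar fold fails, and
otherwise the result is rebuilt with build_vector_from_val.  The standalone
C++ sketch below models that pattern outside GCC.  DupCst, Op, fold_scalar,
fold_dup_binop, fold_dup_shift and check_examples are hypothetical names
invented here; only the shape of the logic is taken from the const_binop
hunks, and wrapping arithmetic is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a VEC_DUPLICATE_CST: every lane holds `elt`.
// There is no lane count, since it may be unknown at compile time.
struct DupCst {
  std::int64_t elt;
};

enum class Op { Plus, Minus, Mult, Lshift };

// Hypothetical scalar folder.  Returns nullopt when it declines to fold,
// loosely mirroring const_binop returning NULL_TREE.  Arithmetic is done
// on unsigned values so that it wraps instead of overflowing.
inline std::optional<std::int64_t>
fold_scalar (Op op, std::int64_t a, std::int64_t b)
{
  const std::uint64_t ua = static_cast<std::uint64_t> (a);
  const std::uint64_t ub = static_cast<std::uint64_t> (b);
  switch (op)
    {
    case Op::Plus:
      return static_cast<std::int64_t> (ua + ub);
    case Op::Minus:
      return static_cast<std::int64_t> (ua - ub);
    case Op::Mult:
      return static_cast<std::int64_t> (ua * ub);
    case Op::Lshift:
      if (b < 0 || b >= 64)
	return std::nullopt;	// Decline out-of-range shift amounts.
      return static_cast<std::int64_t> (ua << b);
    }
  return std::nullopt;
}

// Duplicate OP duplicate: fold the two elements once and re-wrap.
inline std::optional<DupCst>
fold_dup_binop (Op op, DupCst arg1, DupCst arg2)
{
  std::optional<std::int64_t> sub = fold_scalar (op, arg1.elt, arg2.elt);
  if (!sub)
    return std::nullopt;
  return DupCst { *sub };
}

// Duplicate shifted by a scalar amount, as in the INTEGER_CST hunk.
inline std::optional<DupCst>
fold_dup_shift (DupCst arg1, std::int64_t amount)
{
  std::optional<std::int64_t> sub = fold_scalar (Op::Lshift, arg1.elt, amount);
  if (!sub)
    return std::nullopt;
  return DupCst { *sub };
}

// Same values as test_vec_duplicate_folding uses for PLUS and LSHIFT.
inline void
check_examples ()
{
  assert (fold_dup_binop (Op::Plus, DupCst { 5 }, DupCst { 3 })->elt == 8);
  assert (fold_dup_shift (DupCst { 5 }, 2)->elt == 20);
}
```

Because the folded result is again a single element, the amount of work in
this model does not depend on the number of lanes, which is the property
that makes the representation suitable for variable-length vectors.
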
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def	2017-10-23 11:38:53.934094740 +0100
+++ gcc/optabs.def	2017-10-23 11:41:51.769129995 +0100
@@ -364,3 +364,5 @@  OPTAB_D (atomic_xor_optab, "atomic_xor$I
 
 OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")
 OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
+
+OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
Index: gcc/optabs-tree.c
===================================================================
--- gcc/optabs-tree.c	2017-10-23 11:38:53.934094740 +0100
+++ gcc/optabs-tree.c	2017-10-23 11:41:51.768165374 +0100
@@ -210,6 +210,9 @@  optab_for_tree_code (enum tree_code code
       return TYPE_UNSIGNED (type) ?
 	vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
 
+    case VEC_DUPLICATE_EXPR:
+      return vec_duplicate_optab;
+
     default:
       break;
     }
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h	2017-10-23 11:38:53.934094740 +0100
+++ gcc/optabs.h	2017-10-23 11:41:51.769129995 +0100
@@ -181,6 +181,7 @@  extern rtx simplify_expand_binop (machin
 				  enum optab_methods methods);
 extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,
 				enum optab_methods);
+extern rtx expand_vector_broadcast (machine_mode, rtx);
 
 /* Generate code for a simple binary or unary operation.  "Simple" in
    this case means "can be unambiguously described by a (mode, code)
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	2017-10-23 11:41:41.549050496 +0100
+++ gcc/optabs.c	2017-10-23 11:41:51.769129995 +0100
@@ -367,7 +367,7 @@  force_expand_binop (machine_mode mode, o
    mode of OP must be the element mode of VMODE.  If OP is a constant,
    then the return value will be a constant.  */
 
-static rtx
+rtx
 expand_vector_broadcast (machine_mode vmode, rtx op)
 {
   enum insn_code icode;
@@ -380,6 +380,16 @@  expand_vector_broadcast (machine_mode vm
   if (CONSTANT_P (op))
     return gen_const_vec_duplicate (vmode, op);
 
+  icode = optab_handler (vec_duplicate_optab, vmode);
+  if (icode != CODE_FOR_nothing)
+    {
+      struct expand_operand ops[2];
+      create_output_operand (&ops[0], NULL_RTX, vmode);
+      create_input_operand (&ops[1], op, GET_MODE (op));
+      expand_insn (icode, 2, ops);
+      return ops[0].value;
+    }
+
   /* ??? If the target doesn't have a vec_init, then we have no easy way
      of performing this operation.  Most of this sort of generic support
      is hidden away in the vector lowering support in gimple.  */
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	2017-10-23 11:41:39.187050437 +0100
+++ gcc/expr.c	2017-10-23 11:41:51.764306890 +0100
@@ -6572,7 +6572,8 @@  store_constructor (tree exp, rtx target,
 	constructor_elt *ce;
 	int i;
 	int need_to_clear;
-	int icode = CODE_FOR_nothing;
+	insn_code icode = CODE_FOR_nothing;
+	tree elt;
 	tree elttype = TREE_TYPE (type);
 	int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));
 	machine_mode eltmode = TYPE_MODE (elttype);
@@ -6582,13 +6583,30 @@  store_constructor (tree exp, rtx target,
 	unsigned n_elts;
 	alias_set_type alias;
 	bool vec_vec_init_p = false;
+	machine_mode mode = GET_MODE (target);
 
 	gcc_assert (eltmode != BLKmode);
 
+	/* Try using vec_duplicate_optab for uniform vectors.  */
+	if (!TREE_SIDE_EFFECTS (exp)
+	    && VECTOR_MODE_P (mode)
+	    && eltmode == GET_MODE_INNER (mode)
+	    && ((icode = optab_handler (vec_duplicate_optab, mode))
+		!= CODE_FOR_nothing)
+	    && (elt = uniform_vector_p (exp)))
+	  {
+	    struct expand_operand ops[2];
+	    create_output_operand (&ops[0], target, mode);
+	    create_input_operand (&ops[1], expand_normal (elt), eltmode);
+	    expand_insn (icode, 2, ops);
+	    if (!rtx_equal_p (target, ops[0].value))
+	      emit_move_insn (target, ops[0].value);
+	    break;
+	  }
+
 	n_elts = TYPE_VECTOR_SUBPARTS (type);
-	if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
+	if (REG_P (target) && VECTOR_MODE_P (mode))
 	  {
-	    machine_mode mode = GET_MODE (target);
 	    machine_mode emode = eltmode;
 
 	    if (CONSTRUCTOR_NELTS (exp)
@@ -6600,7 +6618,7 @@  store_constructor (tree exp, rtx target,
 			    == n_elts);
 		emode = TYPE_MODE (etype);
 	      }
-	    icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
+	    icode = convert_optab_handler (vec_init_optab, mode, emode);
 	    if (icode != CODE_FOR_nothing)
 	      {
 		unsigned int i, n = n_elts;
@@ -6648,7 +6666,7 @@  store_constructor (tree exp, rtx target,
 	if (need_to_clear && size > 0 && !vector)
 	  {
 	    if (REG_P (target))
-	      emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+	      emit_move_insn (target, CONST0_RTX (mode));
 	    else
 	      clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
 	    cleared = 1;
@@ -6656,7 +6674,7 @@  store_constructor (tree exp, rtx target,
 
 	/* Inform later passes that the old value is dead.  */
 	if (!cleared && !vector && REG_P (target))
-	  emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+	  emit_move_insn (target, CONST0_RTX (mode));
 
         if (MEM_P (target))
 	  alias = MEM_ALIAS_SET (target);
@@ -6707,8 +6725,7 @@  store_constructor (tree exp, rtx target,
 
 	if (vector)
 	  emit_insn (GEN_FCN (icode) (target,
-				      gen_rtx_PARALLEL (GET_MODE (target),
-							vector)));
+				      gen_rtx_PARALLEL (mode, vector)));
 	break;
       }
 
@@ -7686,6 +7703,19 @@  expand_operands (tree exp0, tree exp1, r
 }
 
 
+/* Expand constant vector element ELT, which has mode MODE.  This is used
+   for members of VECTOR_CST and VEC_DUPLICATE_CST.  */
+
+static rtx
+const_vector_element (scalar_mode mode, const_tree elt)
+{
+  if (TREE_CODE (elt) == REAL_CST)
+    return const_double_from_real_value (TREE_REAL_CST (elt), mode);
+  if (TREE_CODE (elt) == FIXED_CST)
+    return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);
+  return immed_wide_int_const (wi::to_wide (elt), mode);
+}
+
 /* Return a MEM that contains constant EXP.  DEFER is as for
    output_constant_def and MODIFIER is as for expand_expr.  */
 
@@ -9551,6 +9581,12 @@  #define REDUCE_BIT_FIELD(expr)	(reduce_b
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
 
+    case VEC_DUPLICATE_EXPR:
+      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
+      target = expand_vector_broadcast (mode, op0);
+      gcc_assert (target);
+      return target;
+
     case BIT_INSERT_EXPR:
       {
 	unsigned bitpos = tree_to_uhwi (treeop2);
@@ -10003,6 +10039,11 @@  expand_expr_real_1 (tree exp, rtx target
 			    tmode, modifier);
       }
 
+    case VEC_DUPLICATE_CST:
+      op0 = const_vector_element (GET_MODE_INNER (mode),
+				  VEC_DUPLICATE_CST_ELT (exp));
+      return gen_const_vec_duplicate (mode, op0);
+
     case CONST_DECL:
       if (modifier == EXPAND_WRITE)
 	{
@@ -11764,8 +11805,7 @@  const_vector_from_tree (tree exp)
 {
   rtvec v;
   unsigned i, units;
-  tree elt;
-  machine_mode inner, mode;
+  machine_mode mode;
 
   mode = TYPE_MODE (TREE_TYPE (exp));
 
@@ -11776,23 +11816,12 @@  const_vector_from_tree (tree exp)
     return const_vector_mask_from_tree (exp);
 
   units = VECTOR_CST_NELTS (exp);
-  inner = GET_MODE_INNER (mode);
 
   v = rtvec_alloc (units);
 
   for (i = 0; i < units; ++i)
-    {
-      elt = VECTOR_CST_ELT (exp, i);
-
-      if (TREE_CODE (elt) == REAL_CST)
-	RTVEC_ELT (v, i) = const_double_from_real_value (TREE_REAL_CST (elt),
-							 inner);
-      else if (TREE_CODE (elt) == FIXED_CST)
-	RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
-							 inner);
-      else
-	RTVEC_ELT (v, i) = immed_wide_int_const (wi::to_wide (elt), inner);
-    }
+    RTVEC_ELT (v, i) = const_vector_element (GET_MODE_INNER (mode),
+					     VECTOR_CST_ELT (exp, i));
 
   return gen_rtx_CONST_VECTOR (mode, v);
 }
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c	2017-10-23 11:41:23.529089619 +0100
+++ gcc/internal-fn.c	2017-10-23 11:41:51.767200753 +0100
@@ -1911,12 +1911,12 @@  expand_vector_ubsan_overflow (location_t
       emit_move_insn (cntvar, const0_rtx);
       emit_label (loop_lab);
     }
-  if (TREE_CODE (arg0) != VECTOR_CST)
+  if (!CONSTANT_CLASS_P (arg0))
     {
       rtx arg0r = expand_normal (arg0);
       arg0 = make_tree (TREE_TYPE (arg0), arg0r);
     }
-  if (TREE_CODE (arg1) != VECTOR_CST)
+  if (!CONSTANT_CLASS_P (arg1))
     {
       rtx arg1r = expand_normal (arg1);
       arg1 = make_tree (TREE_TYPE (arg1), arg1r);
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c	2017-10-23 11:41:25.864967029 +0100
+++ gcc/tree-cfg.c	2017-10-23 11:41:51.770094616 +0100
@@ -3803,6 +3803,17 @@  verify_gimple_assign_unary (gassign *stm
     case CONJ_EXPR:
       break;
 
+    case VEC_DUPLICATE_EXPR:
+      if (TREE_CODE (lhs_type) != VECTOR_TYPE
+	  || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))
+	{
+	  error ("vec_duplicate should be from a scalar to a like vector");
+	  debug_generic_expr (lhs_type);
+	  debug_generic_expr (rhs1_type);
+	  return true;
+	}
+      return false;
+
     default:
       gcc_unreachable ();
     }
@@ -4473,6 +4484,7 @@  verify_gimple_assign_single (gassign *st
     case FIXED_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
       return res;
 
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c	2017-10-23 11:41:25.833048208 +0100
+++ gcc/tree-inline.c	2017-10-23 11:41:51.771059237 +0100
@@ -4002,6 +4002,7 @@  estimate_operator_cost (enum tree_code c
     case VEC_PACK_FIX_TRUNC_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
+    case VEC_DUPLICATE_EXPR:
 
       return 1;