From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
X-Patchwork-Id: 116689
Subject: [05/nn] Add
 VEC_DUPLICATE_{CST,EXPR} and associated optab
References: <87wp3mxgir.fsf@linaro.org>
Date: Mon, 23 Oct 2017 12:20:14 +0100
In-Reply-To: <87wp3mxgir.fsf@linaro.org> (Richard Sandiford's message of "Mon, 23 Oct 2017 12:14:36 +0100")
Message-ID: <87bmkyxg9d.fsf@linaro.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0

SVE needs a way of broadcasting a scalar to a variable-length vector.
This patch adds VEC_DUPLICATE_CST for when VECTOR_CST would be used for
fixed-length vectors and VEC_DUPLICATE_EXPR for when CONSTRUCTOR would
be used for fixed-length vectors. VEC_DUPLICATE_EXPR is the tree
equivalent of the existing rtl code VEC_DUPLICATE.

Originally we had a single VEC_DUPLICATE_EXPR and used TREE_CONSTANT
to mark constant nodes, but in response to last year's RFC, Richard B.
suggested it would be better to have separate codes for the constant
and non-constant cases. This allows VEC_DUPLICATE_EXPR to be treated
as a normal unary operation and avoids the previous need for treating
it as a GIMPLE_SINGLE_RHS.

It might make sense to use VEC_DUPLICATE_CST for all duplicated
vector constants, since it's a bit more compact than VECTOR_CST
in that case, and is potentially more efficient to process.
However, the nice thing about keeping it restricted to variable-length
vectors is that there is then no need to handle combinations of
VECTOR_CST and VEC_DUPLICATE_CST; a vector type will always use
VECTOR_CST or never use it.

The patch also adds a vec_duplicate_optab to go with VEC_DUPLICATE_EXPR.


2017-10-23 Richard Sandiford
	   Alan Hayward
	   David Sherwood

gcc/
	* doc/generic.texi (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): Document.
	(VEC_COND_EXPR): Add missing @tindex.
	* doc/md.texi (vec_duplicate@var{m}): Document.
	* tree.def (VEC_DUPLICATE_CST, VEC_DUPLICATE_EXPR): New tree codes.
	* tree-core.h (tree_base): Document that u.nelts and TREE_OVERFLOW
	are used for VEC_DUPLICATE_CST as well.
	(tree_vector): Access base.u.nelts directly.
	* tree.h (TREE_OVERFLOW): Add VEC_DUPLICATE_CST to the list of valid
	codes.
	(VEC_DUPLICATE_CST_ELT): New macro.
	(build_vec_duplicate_cst): Declare.
	* tree.c (tree_node_structure_for_code, tree_code_size, tree_size)
	(integer_zerop, integer_onep, integer_all_onesp, integer_truep)
	(real_zerop, real_onep, real_minus_onep, add_expr, initializer_zerop)
	(walk_tree_1, drop_tree_overflow): Handle VEC_DUPLICATE_CST.
	(build_vec_duplicate_cst): New function.
	(uniform_vector_p): Handle the new codes.
	(test_vec_duplicate_predicates_int): New function.
	(test_vec_duplicate_predicates_float): Likewise.
	(test_vec_duplicate_predicates): Likewise.
	(tree_c_tests): Call test_vec_duplicate_predicates.
	* cfgexpand.c (expand_debug_expr): Handle the new codes.
	* tree-pretty-print.c (dump_generic_node): Likewise.
	* dwarf2out.c (rtl_for_decl_init): Handle VEC_DUPLICATE_CST.
	* gimple-expr.h (is_gimple_constant): Likewise.
	* gimplify.c (gimplify_expr): Likewise.
	* graphite-isl-ast-to-gimple.c
	(translate_isl_ast_to_gimple::is_constant): Likewise.
	* graphite-scop-detection.c (scan_tree_for_params): Likewise.
	* ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Likewise.
	(func_checker::compare_operand): Likewise.
	* ipa-icf.c (sem_item::add_expr, sem_variable::equals): Likewise.
	* match.pd (negate_expr_p): Likewise.
	* print-tree.c (print_node): Likewise.
	* tree-chkp.c (chkp_find_bounds_1): Likewise.
	* tree-loop-distribution.c (const_with_all_bytes_same): Likewise.
	* tree-ssa-loop.c (for_each_index): Likewise.
	* tree-ssa-pre.c (create_component_ref_by_pieces_1): Likewise.
	* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
	(ao_ref_init_from_vn_reference): Likewise.
	* tree-vect-generic.c (ssa_uniform_vector_p): Likewise.
	* varasm.c (const_hash_1, compare_constant): Likewise.
	* fold-const.c (negate_expr_p, fold_negate_expr_1, const_binop)
	(fold_convert_const, operand_equal_p, fold_view_convert_expr)
	(exact_inverse, fold_checksum_tree): Likewise.
	(const_unop): Likewise.
	Fold VEC_DUPLICATE_EXPRs of a constant.
	(test_vec_duplicate_folding): New function.
	(fold_const_c_tests): Call it.
	* optabs.def (vec_duplicate_optab): New optab.
	* optabs-tree.c (optab_for_tree_code): Handle VEC_DUPLICATE_EXPR.
	* optabs.h (expand_vector_broadcast): Declare.
	* optabs.c (expand_vector_broadcast): Make non-static. Try using
	vec_duplicate_optab.
	* expr.c (store_constructor): Try using vec_duplicate_optab for
	uniform vectors.
	(const_vector_element): New function, split out from...
	(const_vector_from_tree): ...here.
	(expand_expr_real_2): Handle VEC_DUPLICATE_EXPR.
	(expand_expr_real_1): Handle VEC_DUPLICATE_CST.
	* internal-fn.c (expand_vector_ubsan_overflow): Use CONSTANT_P
	instead of checking for VECTOR_CST.
	* tree-cfg.c (verify_gimple_assign_unary): Handle VEC_DUPLICATE_EXPR.
	(verify_gimple_assign_single): Handle VEC_DUPLICATE_CST.
	* tree-inline.c (estimate_operator_cost): Handle VEC_DUPLICATE_EXPR.

Index: gcc/doc/generic.texi
===================================================================
--- gcc/doc/generic.texi	2017-10-23 11:38:53.934094740 +0100
+++ gcc/doc/generic.texi	2017-10-23 11:41:51.760448406 +0100
@@ -1036,6 +1036,7 @@ As this example indicates, the operands
 @tindex FIXED_CST
 @tindex COMPLEX_CST
 @tindex VECTOR_CST
+@tindex VEC_DUPLICATE_CST
 @tindex STRING_CST
 @findex TREE_STRING_LENGTH
 @findex TREE_STRING_POINTER
@@ -1089,6 +1090,14 @@ constant nodes. Each individual constan
 double constant node. The first operand is a @code{TREE_LIST} of the
 constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.
 
+@item VEC_DUPLICATE_CST
+These nodes represent a vector constant in which every element has the
+same scalar value. At present only variable-length vectors use
+@code{VEC_DUPLICATE_CST}; constant-length vectors use @code{VECTOR_CST}
+instead. The scalar element value is given by
+@code{VEC_DUPLICATE_CST_ELT} and has the same restrictions as the
+element of a @code{VECTOR_CST}.
+
 @item STRING_CST
 These nodes represent string-constants. The @code{TREE_STRING_LENGTH}
 returns the length of the string, as an @code{int}. The
@@ -1692,6 +1701,7 @@ a value from @code{enum annot_expr_kind}
 
 @node Vectors
 @subsection Vectors
+@tindex VEC_DUPLICATE_EXPR
 @tindex VEC_LSHIFT_EXPR
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
@@ -1703,9 +1713,14 @@ a value from @code{enum annot_expr_kind}
 @tindex VEC_PACK_TRUNC_EXPR
 @tindex VEC_PACK_SAT_EXPR
 @tindex VEC_PACK_FIX_TRUNC_EXPR
+@tindex VEC_COND_EXPR
 @tindex SAD_EXPR
 
 @table @code
+@item VEC_DUPLICATE_EXPR
+This node has a single operand and represents a vector in which every
+element is equal to that operand.
+
 @item VEC_LSHIFT_EXPR
 @itemx VEC_RSHIFT_EXPR
 These nodes represent whole vector left and right shifts, respectively.
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	2017-10-23 11:41:22.189466342 +0100
+++ gcc/doc/md.texi	2017-10-23 11:41:51.761413027 +0100
@@ -4888,6 +4888,17 @@ and operand 1 is parallel containing val
 the vector mode @var{m}, or a vector mode with the same element mode and
 smaller number of elements.
 
+@cindex @code{vec_duplicate@var{m}} instruction pattern
+@item @samp{vec_duplicate@var{m}}
+Initialize vector output operand 0 so that each element has the value given
+by scalar input operand 1. The vector has mode @var{m} and the scalar has
+the mode appropriate for one element of @var{m}.
+
+This pattern only handles duplicates of non-constant inputs. Constant
+vectors go through the @code{mov@var{m}} pattern instead.
+
+This pattern is not allowed to @code{FAIL}.
+
 @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
 @item @samp{vec_cmp@var{m}@var{n}}
 Output a vector comparison. Operand 0 of mode @var{n} is the destination for
Index: gcc/tree.def
===================================================================
--- gcc/tree.def	2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree.def	2017-10-23 11:41:51.774917721 +0100
@@ -304,6 +304,10 @@ DEFTREECODE (COMPLEX_CST, "complex_cst",
 /* Contents are in VECTOR_CST_ELTS field. */
 DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
 
+/* Represents a vector constant in which every element is equal to
+   VEC_DUPLICATE_CST_ELT. */
+DEFTREECODE (VEC_DUPLICATE_CST, "vec_duplicate_cst", tcc_constant, 0)
+
 /* Contents are TREE_STRING_LENGTH and the actual contents of the string. */
 DEFTREECODE (STRING_CST, "string_cst", tcc_constant, 0)
 
@@ -534,6 +538,9 @@ DEFTREECODE (TARGET_EXPR, "target_expr",
    1 and 2 are NULL. The operands are then taken from the cfg edges. */
 DEFTREECODE (COND_EXPR, "cond_expr", tcc_expression, 3)
 
+/* Represents a vector in which every element is equal to operand 0. */
+DEFTREECODE (VEC_DUPLICATE_EXPR, "vec_duplicate_expr", tcc_unary, 1)
+
 /* Vector conditional expression. It is like COND_EXPR, but with
    vector operands.
 
Index: gcc/tree-core.h
===================================================================
--- gcc/tree-core.h	2017-10-23 11:41:25.862065318 +0100
+++ gcc/tree-core.h	2017-10-23 11:41:51.771059237 +0100
@@ -975,7 +975,8 @@ struct GTY(()) tree_base {
     /* VEC length. This field is only used with TREE_VEC. */
     int length;
 
-    /* Number of elements. This field is only used with VECTOR_CST. */
+    /* Number of elements. This field is only used with VECTOR_CST
+       and VEC_DUPLICATE_CST. It is always 1 for VEC_DUPLICATE_CST. */
     unsigned int nelts;
 
     /* SSA version number. This field is only used with SSA_NAME. */
@@ -1065,7 +1066,7 @@ struct GTY(()) tree_base {
    public_flag:
 
        TREE_OVERFLOW in
-           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST
+           INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST, VEC_DUPLICATE_CST
 
        TREE_PUBLIC in
            VAR_DECL, FUNCTION_DECL
@@ -1332,7 +1333,7 @@ struct GTY(()) tree_complex {
 
 struct GTY(()) tree_vector {
   struct tree_typed typed;
-  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];
+  tree GTY ((length ("((tree) &%h)->base.u.nelts"))) elts[1];
 };
 
 struct GTY(()) tree_identifier {
Index: gcc/tree.h
===================================================================
--- gcc/tree.h	2017-10-23 11:41:23.517482774 +0100
+++ gcc/tree.h	2017-10-23 11:41:51.775882341 +0100
@@ -730,8 +730,8 @@ #define TREE_SYMBOL_REFERENCED(NODE) \
 #define TYPE_REF_CAN_ALIAS_ALL(NODE) \
   (PTR_OR_REF_CHECK (NODE)->base.static_flag)
 
-/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, or VECTOR_CST, this means
-   there was an overflow in folding. */
+/* In an INTEGER_CST, REAL_CST, COMPLEX_CST, VECTOR_CST or VEC_DUPLICATE_CST,
+   this means there was an overflow in folding. */
 
 #define TREE_OVERFLOW(NODE) (CST_CHECK (NODE)->base.public_flag)
 
@@ -1030,6 +1030,10 @@ #define VECTOR_CST_NELTS(NODE) (VECTOR_C
 #define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)
 #define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])
 
+/* In a VEC_DUPLICATE_CST node. */
+#define VEC_DUPLICATE_CST_ELT(NODE) \
+  (VEC_DUPLICATE_CST_CHECK (NODE)->vector.elts[0])
+
 /* Define fields and accessors for some special-purpose tree nodes. */
 
 #define IDENTIFIER_LENGTH(NODE) \
@@ -4025,6 +4029,7 @@ extern tree build_int_cst (tree, HOST_WI
 extern tree build_int_cstu (tree type, unsigned HOST_WIDE_INT cst);
 extern tree build_int_cst_type (tree, HOST_WIDE_INT);
 extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
+extern tree build_vec_duplicate_cst (tree, tree CXX_MEM_STAT_INFO);
 extern tree build_vector (tree, vec CXX_MEM_STAT_INFO);
 extern tree build_vector_from_ctor (tree, vec *);
 extern tree build_vector_from_val (tree, tree);
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	2017-10-23 11:41:23.515548300 +0100
+++ gcc/tree.c	2017-10-23 11:41:51.774917721 +0100
@@ -464,6 +464,7 @@ tree_node_structure_for_code (enum tree_
     case FIXED_CST: return TS_FIXED_CST;
     case COMPLEX_CST: return TS_COMPLEX;
     case VECTOR_CST: return TS_VECTOR;
+    case VEC_DUPLICATE_CST: return TS_VECTOR;
     case STRING_CST: return TS_STRING;
       /* tcc_exceptional cases. */
     case ERROR_MARK: return TS_COMMON;
@@ -816,6 +817,7 @@ tree_code_size (enum tree_code code)
         case FIXED_CST: return sizeof (struct tree_fixed_cst);
         case COMPLEX_CST: return sizeof (struct tree_complex);
         case VECTOR_CST: return sizeof (struct tree_vector);
+        case VEC_DUPLICATE_CST: return sizeof (struct tree_vector);
         case STRING_CST: gcc_unreachable ();
         default:
           return lang_hooks.tree_size (code);
@@ -875,6 +877,9 @@ tree_size (const_tree node)
       return (sizeof (struct tree_vector)
               + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));
 
+    case VEC_DUPLICATE_CST:
+      return sizeof (struct tree_vector);
+
     case STRING_CST:
       return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) + 1;
 
@@ -1682,6 +1687,30 @@ cst_and_fits_in_hwi (const_tree x)
           && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));
 }
 
+/* Build a new VEC_DUPLICATE_CST with type TYPE and operand EXP.
+
+   Note that this function is only suitable for callers that specifically
+   need a VEC_DUPLICATE_CST node. Use build_vector_from_val to duplicate
+   a general scalar into a general vector type. */
+
+tree
+build_vec_duplicate_cst (tree type, tree exp MEM_STAT_DECL)
+{
+  int length = sizeof (struct tree_vector);
+
+  record_node_allocation_statistics (VEC_DUPLICATE_CST, length);
+
+  tree t = ggc_alloc_cleared_tree_node_stat (length PASS_MEM_STAT);
+
+  TREE_SET_CODE (t, VEC_DUPLICATE_CST);
+  TREE_TYPE (t) = type;
+  t->base.u.nelts = 1;
+  VEC_DUPLICATE_CST_ELT (t) = exp;
+  TREE_CONSTANT (t) = 1;
+
+  return t;
+}
+
 /* Build a newly constructed VECTOR_CST node of length LEN. */
 
 tree
@@ -2343,6 +2372,8 @@ integer_zerop (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2369,6 +2400,8 @@ integer_onep (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return integer_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2407,6 +2440,9 @@ integer_all_onesp (const_tree expr)
       return 1;
     }
 
+  else if (TREE_CODE (expr) == VEC_DUPLICATE_CST)
+    return integer_all_onesp (VEC_DUPLICATE_CST_ELT (expr));
+
   else if (TREE_CODE (expr) != INTEGER_CST)
     return 0;
 
@@ -2463,7 +2499,7 @@ integer_nonzerop (const_tree expr)
 int
 integer_truep (const_tree expr)
 {
-  if (TREE_CODE (expr) == VECTOR_CST)
+  if (TREE_CODE (expr) == VECTOR_CST || TREE_CODE (expr) == VEC_DUPLICATE_CST)
     return integer_all_onesp (expr);
   return integer_onep (expr);
 }
@@ -2634,6 +2670,8 @@ real_zerop (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_zerop (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2662,6 +2700,8 @@ real_onep (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_onep (VEC_DUPLICATE_CST_ELT (expr));
     default:
       return false;
     }
@@ -2689,6 +2729,8 @@ real_minus_onep (const_tree expr)
             return false;
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return real_minus_onep (VEC_DUPLICATE_CST_ELT (expr));
default: return false; } @@ -7091,6 +7133,9 @@ add_expr (const_tree t, inchash::hash &h inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags); return; } + case VEC_DUPLICATE_CST: + inchash::add_expr (VEC_DUPLICATE_CST_ELT (t), hstate); + return; case SSA_NAME: /* We can just compare by pointer. */ hstate.add_wide_int (SSA_NAME_VERSION (t)); @@ -10345,6 +10390,9 @@ initializer_zerop (const_tree init) return true; } + case VEC_DUPLICATE_CST: + return initializer_zerop (VEC_DUPLICATE_CST_ELT (init)); + case CONSTRUCTOR: { unsigned HOST_WIDE_INT idx; @@ -10390,7 +10438,13 @@ uniform_vector_p (const_tree vec) gcc_assert (VECTOR_TYPE_P (TREE_TYPE (vec))); - if (TREE_CODE (vec) == VECTOR_CST) + if (TREE_CODE (vec) == VEC_DUPLICATE_CST) + return VEC_DUPLICATE_CST_ELT (vec); + + else if (TREE_CODE (vec) == VEC_DUPLICATE_EXPR) + return TREE_OPERAND (vec, 0); + + else if (TREE_CODE (vec) == VECTOR_CST) { first = VECTOR_CST_ELT (vec, 0); for (i = 1; i < VECTOR_CST_NELTS (vec); ++i) @@ -11095,6 +11149,7 @@ #define WALK_SUBTREE_TAIL(NODE) \ case REAL_CST: case FIXED_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: case STRING_CST: case BLOCK: case PLACEHOLDER_EXPR: @@ -12381,6 +12436,12 @@ drop_tree_overflow (tree t) elt = drop_tree_overflow (elt); } } + if (TREE_CODE (t) == VEC_DUPLICATE_CST) + { + tree *elt = &VEC_DUPLICATE_CST_ELT (t); + if (TREE_OVERFLOW (*elt)) + *elt = drop_tree_overflow (*elt); + } return t; } @@ -13798,6 +13859,92 @@ test_integer_constants () ASSERT_EQ (type, TREE_TYPE (zero)); } +/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs + for integral type TYPE. 
*/ + +static void +test_vec_duplicate_predicates_int (tree type) +{ + tree vec_type = build_vector_type (type, 4); + + tree zero = build_zero_cst (type); + tree vec_zero = build_vec_duplicate_cst (vec_type, zero); + ASSERT_TRUE (integer_zerop (vec_zero)); + ASSERT_FALSE (integer_onep (vec_zero)); + ASSERT_FALSE (integer_minus_onep (vec_zero)); + ASSERT_FALSE (integer_all_onesp (vec_zero)); + ASSERT_FALSE (integer_truep (vec_zero)); + ASSERT_TRUE (initializer_zerop (vec_zero)); + + tree one = build_one_cst (type); + tree vec_one = build_vec_duplicate_cst (vec_type, one); + ASSERT_FALSE (integer_zerop (vec_one)); + ASSERT_TRUE (integer_onep (vec_one)); + ASSERT_FALSE (integer_minus_onep (vec_one)); + ASSERT_FALSE (integer_all_onesp (vec_one)); + ASSERT_FALSE (integer_truep (vec_one)); + ASSERT_FALSE (initializer_zerop (vec_one)); + + tree minus_one = build_minus_one_cst (type); + tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one); + ASSERT_FALSE (integer_zerop (vec_minus_one)); + ASSERT_FALSE (integer_onep (vec_minus_one)); + ASSERT_TRUE (integer_minus_onep (vec_minus_one)); + ASSERT_TRUE (integer_all_onesp (vec_minus_one)); + ASSERT_TRUE (integer_truep (vec_minus_one)); + ASSERT_FALSE (initializer_zerop (vec_minus_one)); + + tree x = create_tmp_var_raw (type, "x"); + tree vec_x = build1 (VEC_DUPLICATE_EXPR, vec_type, x); + ASSERT_EQ (uniform_vector_p (vec_zero), zero); + ASSERT_EQ (uniform_vector_p (vec_one), one); + ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one); + ASSERT_EQ (uniform_vector_p (vec_x), x); +} + +/* Verify predicate handling of VEC_DUPLICATE_CSTs for floating-point + type TYPE. 
*/ + +static void +test_vec_duplicate_predicates_float (tree type) +{ + tree vec_type = build_vector_type (type, 4); + + tree zero = build_zero_cst (type); + tree vec_zero = build_vec_duplicate_cst (vec_type, zero); + ASSERT_TRUE (real_zerop (vec_zero)); + ASSERT_FALSE (real_onep (vec_zero)); + ASSERT_FALSE (real_minus_onep (vec_zero)); + ASSERT_TRUE (initializer_zerop (vec_zero)); + + tree one = build_one_cst (type); + tree vec_one = build_vec_duplicate_cst (vec_type, one); + ASSERT_FALSE (real_zerop (vec_one)); + ASSERT_TRUE (real_onep (vec_one)); + ASSERT_FALSE (real_minus_onep (vec_one)); + ASSERT_FALSE (initializer_zerop (vec_one)); + + tree minus_one = build_minus_one_cst (type); + tree vec_minus_one = build_vec_duplicate_cst (vec_type, minus_one); + ASSERT_FALSE (real_zerop (vec_minus_one)); + ASSERT_FALSE (real_onep (vec_minus_one)); + ASSERT_TRUE (real_minus_onep (vec_minus_one)); + ASSERT_FALSE (initializer_zerop (vec_minus_one)); + + ASSERT_EQ (uniform_vector_p (vec_zero), zero); + ASSERT_EQ (uniform_vector_p (vec_one), one); + ASSERT_EQ (uniform_vector_p (vec_minus_one), minus_one); +} + +/* Verify predicate handling of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs. */ + +static void +test_vec_duplicate_predicates () +{ + test_vec_duplicate_predicates_int (integer_type_node); + test_vec_duplicate_predicates_float (float_type_node); +} + /* Verify identifiers. */ static void @@ -13826,6 +13973,7 @@ test_labels () tree_c_tests () { test_integer_constants (); + test_vec_duplicate_predicates (); test_identifiers (); test_labels (); } Index: gcc/cfgexpand.c =================================================================== --- gcc/cfgexpand.c 2017-10-23 11:41:23.137358624 +0100 +++ gcc/cfgexpand.c 2017-10-23 11:41:51.760448406 +0100 @@ -5049,6 +5049,8 @@ expand_debug_expr (tree exp) case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: case VEC_PERM_EXPR: + case VEC_DUPLICATE_CST: + case VEC_DUPLICATE_EXPR: return NULL; /* Misc codes. 
*/ Index: gcc/tree-pretty-print.c =================================================================== --- gcc/tree-pretty-print.c 2017-10-23 11:38:53.934094740 +0100 +++ gcc/tree-pretty-print.c 2017-10-23 11:41:51.772023858 +0100 @@ -1802,6 +1802,12 @@ dump_generic_node (pretty_printer *pp, t } break; + case VEC_DUPLICATE_CST: + pp_string (pp, "{ "); + dump_generic_node (pp, VEC_DUPLICATE_CST_ELT (node), spc, flags, false); + pp_string (pp, ", ... }"); + break; + case FUNCTION_TYPE: case METHOD_TYPE: dump_generic_node (pp, TREE_TYPE (node), spc, flags, false); @@ -3231,6 +3237,15 @@ dump_generic_node (pretty_printer *pp, t pp_string (pp, " > "); break; + case VEC_DUPLICATE_EXPR: + pp_space (pp); + for (str = get_tree_code_name (code); *str; str++) + pp_character (pp, TOUPPER (*str)); + pp_string (pp, " < "); + dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false); + pp_string (pp, " > "); + break; + case VEC_UNPACK_HI_EXPR: pp_string (pp, " VEC_UNPACK_HI_EXPR < "); dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false); Index: gcc/dwarf2out.c =================================================================== --- gcc/dwarf2out.c 2017-10-23 11:41:24.407340836 +0100 +++ gcc/dwarf2out.c 2017-10-23 11:41:51.763342269 +0100 @@ -18862,6 +18862,7 @@ rtl_for_decl_init (tree init, tree type) switch (TREE_CODE (init)) { case VECTOR_CST: + case VEC_DUPLICATE_CST: break; case CONSTRUCTOR: if (TREE_CONSTANT (init)) Index: gcc/gimple-expr.h =================================================================== --- gcc/gimple-expr.h 2017-10-23 11:38:53.934094740 +0100 +++ gcc/gimple-expr.h 2017-10-23 11:41:51.765271511 +0100 @@ -134,6 +134,7 @@ is_gimple_constant (const_tree t) case FIXED_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: case STRING_CST: return true; Index: gcc/gimplify.c =================================================================== --- gcc/gimplify.c 2017-10-23 11:41:25.531270256 +0100 +++ gcc/gimplify.c 2017-10-23 
11:41:51.766236132 +0100 @@ -11506,6 +11506,7 @@ gimplify_expr (tree *expr_p, gimple_seq case STRING_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: /* Drop the overflow flag on constants, we do not want that in the GIMPLE IL. */ if (TREE_OVERFLOW_P (*expr_p)) Index: gcc/graphite-isl-ast-to-gimple.c =================================================================== --- gcc/graphite-isl-ast-to-gimple.c 2017-10-23 11:41:23.205065216 +0100 +++ gcc/graphite-isl-ast-to-gimple.c 2017-10-23 11:41:51.767200753 +0100 @@ -222,7 +222,8 @@ enum phi_node_kind return TREE_CODE (op) == INTEGER_CST || TREE_CODE (op) == REAL_CST || TREE_CODE (op) == COMPLEX_CST - || TREE_CODE (op) == VECTOR_CST; + || TREE_CODE (op) == VECTOR_CST + || TREE_CODE (op) == VEC_DUPLICATE_CST; } private: Index: gcc/graphite-scop-detection.c =================================================================== --- gcc/graphite-scop-detection.c 2017-10-23 11:41:25.533204730 +0100 +++ gcc/graphite-scop-detection.c 2017-10-23 11:41:51.767200753 +0100 @@ -1243,6 +1243,7 @@ scan_tree_for_params (sese_info_p s, tre case REAL_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: break; default: Index: gcc/ipa-icf-gimple.c =================================================================== --- gcc/ipa-icf-gimple.c 2017-10-23 11:38:53.934094740 +0100 +++ gcc/ipa-icf-gimple.c 2017-10-23 11:41:51.767200753 +0100 @@ -333,6 +333,7 @@ func_checker::compare_cst_or_decl (tree case INTEGER_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: case STRING_CST: case REAL_CST: { @@ -528,6 +529,7 @@ func_checker::compare_operand (tree t1, case INTEGER_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: case STRING_CST: case REAL_CST: case FUNCTION_DECL: Index: gcc/ipa-icf.c =================================================================== --- gcc/ipa-icf.c 2017-10-23 11:41:25.874639400 +0100 +++ gcc/ipa-icf.c 2017-10-23 11:41:51.768165374 +0100 @@ -1478,6 +1478,7 @@ 
sem_item::add_expr (const_tree exp, inch case STRING_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: inchash::add_expr (exp, hstate); break; case CONSTRUCTOR: @@ -2030,6 +2031,9 @@ sem_variable::equals (tree t1, tree t2) return 1; } + case VEC_DUPLICATE_CST: + return sem_variable::equals (VEC_DUPLICATE_CST_ELT (t1), + VEC_DUPLICATE_CST_ELT (t2)); case ARRAY_REF: case ARRAY_RANGE_REF: { Index: gcc/match.pd =================================================================== --- gcc/match.pd 2017-10-23 11:38:53.934094740 +0100 +++ gcc/match.pd 2017-10-23 11:41:51.768165374 +0100 @@ -958,6 +958,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (match negate_expr_p VECTOR_CST (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type)))) +(match negate_expr_p + VEC_DUPLICATE_CST + (if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type)))) /* (-A) * (-B) -> A * B */ (simplify Index: gcc/print-tree.c =================================================================== --- gcc/print-tree.c 2017-10-23 11:38:53.934094740 +0100 +++ gcc/print-tree.c 2017-10-23 11:41:51.769129995 +0100 @@ -783,6 +783,10 @@ print_node (FILE *file, const char *pref } break; + case VEC_DUPLICATE_CST: + print_node (file, "elt", VEC_DUPLICATE_CST_ELT (node), indent + 4); + break; + case COMPLEX_CST: print_node (file, "real", TREE_REALPART (node), indent + 4); print_node (file, "imag", TREE_IMAGPART (node), indent + 4); Index: gcc/tree-chkp.c =================================================================== --- gcc/tree-chkp.c 2017-10-23 11:41:23.201196268 +0100 +++ gcc/tree-chkp.c 2017-10-23 11:41:51.770094616 +0100 @@ -3800,6 +3800,7 @@ chkp_find_bounds_1 (tree ptr, tree ptr_s case INTEGER_CST: case COMPLEX_CST: case VECTOR_CST: + case VEC_DUPLICATE_CST: if (integer_zerop (ptr_src)) bounds = chkp_get_none_bounds (); else Index: gcc/tree-loop-distribution.c =================================================================== --- gcc/tree-loop-distribution.c 2017-10-23 
11:41:23.228278904 +0100
+++ gcc/tree-loop-distribution.c 2017-10-23 11:41:51.771059237 +0100
@@ -921,6 +921,9 @@ const_with_all_bytes_same (tree val)
           && CONSTRUCTOR_NELTS (val) == 0))
     return 0;
+  if (TREE_CODE (val) == VEC_DUPLICATE_CST)
+    return const_with_all_bytes_same (VEC_DUPLICATE_CST_ELT (val));
+
   if (real_zerop (val))
     {
       /* Only return 0 for +0.0, not for -0.0, which doesn't have
Index: gcc/tree-ssa-loop.c
===================================================================
--- gcc/tree-ssa-loop.c 2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree-ssa-loop.c 2017-10-23 11:41:51.772023858 +0100
@@ -616,6 +616,7 @@ for_each_index (tree *addr_p, bool (*cbc
         case STRING_CST:
         case RESULT_DECL:
         case VECTOR_CST:
+        case VEC_DUPLICATE_CST:
         case COMPLEX_CST:
         case INTEGER_CST:
         case REAL_CST:
Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c 2017-10-23 11:41:25.549647760 +0100
+++ gcc/tree-ssa-pre.c 2017-10-23 11:41:51.772023858 +0100
@@ -2675,6 +2675,7 @@ create_component_ref_by_pieces_1 (basic_
     case INTEGER_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case REAL_CST:
     case CONSTRUCTOR:
     case VAR_DECL:
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c 2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree-ssa-sccvn.c 2017-10-23 11:41:51.773953100 +0100
@@ -858,6 +858,7 @@ copy_reference_ops_from_ref (tree ref, v
         case INTEGER_CST:
         case COMPLEX_CST:
         case VECTOR_CST:
+        case VEC_DUPLICATE_CST:
         case REAL_CST:
         case FIXED_CST:
         case CONSTRUCTOR:
@@ -1050,6 +1051,7 @@ ao_ref_init_from_vn_reference (ao_ref *r
         case INTEGER_CST:
         case COMPLEX_CST:
         case VECTOR_CST:
+        case VEC_DUPLICATE_CST:
         case REAL_CST:
         case CONSTRUCTOR:
         case CONST_DECL:
Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c 2017-10-23 11:38:53.934094740 +0100
+++ gcc/tree-vect-generic.c 2017-10-23
11:41:51.773953100 +0100
@@ -1419,6 +1419,7 @@ lower_vec_perm (gimple_stmt_iterator *gs
 ssa_uniform_vector_p (tree op)
 {
   if (TREE_CODE (op) == VECTOR_CST
+      || TREE_CODE (op) == VEC_DUPLICATE_CST
       || TREE_CODE (op) == CONSTRUCTOR)
     return uniform_vector_p (op);
   if (TREE_CODE (op) == SSA_NAME)
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c 2017-10-23 11:41:25.822408600 +0100
+++ gcc/varasm.c 2017-10-23 11:41:51.775882341 +0100
@@ -3068,6 +3068,9 @@ const_hash_1 (const tree exp)
     CASE_CONVERT:
       return const_hash_1 (TREE_OPERAND (exp, 0)) * 7 + 2;
+    case VEC_DUPLICATE_CST:
+      return const_hash_1 (VEC_DUPLICATE_CST_ELT (exp)) * 7 + 3;
+
     default:
       /* A language specific constant. Just hash the code. */
       return code;
@@ -3158,6 +3161,10 @@ compare_constant (const tree t1, const t
         return 1;
       }
+    case VEC_DUPLICATE_CST:
+      return compare_constant (VEC_DUPLICATE_CST_ELT (t1),
+                               VEC_DUPLICATE_CST_ELT (t2));
+
     case CONSTRUCTOR:
       {
         vec<constructor_elt, va_gc> *v1, *v2;
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c 2017-10-23 11:41:23.535860278 +0100
+++ gcc/fold-const.c 2017-10-23 11:41:51.765271511 +0100
@@ -418,6 +418,9 @@ negate_expr_p (tree t)
         return true;
       }
+    case VEC_DUPLICATE_CST:
+      return negate_expr_p (VEC_DUPLICATE_CST_ELT (t));
+
     case COMPLEX_EXPR:
       return negate_expr_p (TREE_OPERAND (t, 0))
              && negate_expr_p (TREE_OPERAND (t, 1));
@@ -579,6 +582,14 @@ fold_negate_expr_1 (location_t loc, tree
         return build_vector (type, elts);
       }
+    case VEC_DUPLICATE_CST:
+      {
+        tree sub = fold_negate_expr (loc, VEC_DUPLICATE_CST_ELT (t));
+        if (!sub)
+          return NULL_TREE;
+        return build_vector_from_val (type, sub);
+      }
+
     case COMPLEX_EXPR:
       if (negate_expr_p (t))
         return fold_build2_loc (loc, COMPLEX_EXPR, type,
@@ -1436,6 +1447,16 @@ const_binop (enum tree_code code, tree a
       return build_vector (type, elts);
     }
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == VEC_DUPLICATE_CST)
+    {
+
      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1),
+                              VEC_DUPLICATE_CST_ELT (arg2));
+      if (!sub)
+        return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
+
   /* Shifts allow a scalar offset for a vector. */
   if (TREE_CODE (arg1) == VECTOR_CST
       && TREE_CODE (arg2) == INTEGER_CST)
@@ -1459,6 +1480,15 @@ const_binop (enum tree_code code, tree a
       return build_vector (type, elts);
     }
+
+  if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+      && TREE_CODE (arg2) == INTEGER_CST)
+    {
+      tree sub = const_binop (code, VEC_DUPLICATE_CST_ELT (arg1), arg2);
+      if (!sub)
+        return NULL_TREE;
+      return build_vector_from_val (TREE_TYPE (arg1), sub);
+    }
   return NULL_TREE;
 }
@@ -1652,6 +1682,13 @@ const_unop (enum tree_code code, tree ty
           if (i == count)
             return build_vector (type, elements);
         }
+      else if (TREE_CODE (arg0) == VEC_DUPLICATE_CST)
+        {
+          tree sub = const_unop (BIT_NOT_EXPR, TREE_TYPE (type),
+                                 VEC_DUPLICATE_CST_ELT (arg0));
+          if (sub)
+            return build_vector_from_val (type, sub);
+        }
       break;
     case TRUTH_NOT_EXPR:
@@ -1737,6 +1774,11 @@ const_unop (enum tree_code code, tree ty
         return res;
       }
+    case VEC_DUPLICATE_EXPR:
+      if (CONSTANT_CLASS_P (arg0))
+        return build_vector_from_val (type, arg0);
+      return NULL_TREE;
+
     default:
       break;
     }
@@ -2167,6 +2209,15 @@ fold_convert_const (enum tree_code code,
             }
           return build_vector (type, v);
         }
+      if (TREE_CODE (arg1) == VEC_DUPLICATE_CST
+          && (TYPE_VECTOR_SUBPARTS (type)
+              == TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg1))))
+        {
+          tree sub = fold_convert_const (code, TREE_TYPE (type),
+                                         VEC_DUPLICATE_CST_ELT (arg1));
+          if (sub)
+            return build_vector_from_val (type, sub);
+        }
     }
   return NULL_TREE;
 }
@@ -2953,6 +3004,10 @@ operand_equal_p (const_tree arg0, const_
           return 1;
         }
+      case VEC_DUPLICATE_CST:
+        return operand_equal_p (VEC_DUPLICATE_CST_ELT (arg0),
+                                VEC_DUPLICATE_CST_ELT (arg1), flags);
+
       case COMPLEX_CST:
         return (operand_equal_p (TREE_REALPART (arg0), TREE_REALPART (arg1),
                                  flags)
@@ -7492,6 +7547,20 @@ can_native_interpret_type_p (tree
type)
 static tree
 fold_view_convert_expr (tree type, tree expr)
 {
+  /* Recurse on duplicated vectors if the target type is also a vector
+     and if the elements line up. */
+  tree expr_type = TREE_TYPE (expr);
+  if (TREE_CODE (expr) == VEC_DUPLICATE_CST
+      && VECTOR_TYPE_P (type)
+      && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (expr_type)
+      && TYPE_SIZE (TREE_TYPE (type)) == TYPE_SIZE (TREE_TYPE (expr_type)))
+    {
+      tree sub = fold_view_convert_expr (TREE_TYPE (type),
+                                         VEC_DUPLICATE_CST_ELT (expr));
+      if (sub)
+        return build_vector_from_val (type, sub);
+    }
+
   /* We support up to 512-bit values (for V8DFmode). */
   unsigned char buffer[64];
   int len;
@@ -8891,6 +8960,15 @@ exact_inverse (tree type, tree cst)
         return build_vector (type, elts);
       }
+    case VEC_DUPLICATE_CST:
+      {
+        tree sub = exact_inverse (TREE_TYPE (type),
+                                  VEC_DUPLICATE_CST_ELT (cst));
+        if (!sub)
+          return NULL_TREE;
+        return build_vector_from_val (type, sub);
+      }
+
     default:
       return NULL_TREE;
     }
@@ -11969,6 +12047,9 @@ fold_checksum_tree (const_tree expr, str
           for (i = 0; i < (int) VECTOR_CST_NELTS (expr); ++i)
             fold_checksum_tree (VECTOR_CST_ELT (expr, i), ctx, ht);
           break;
+        case VEC_DUPLICATE_CST:
+          fold_checksum_tree (VEC_DUPLICATE_CST_ELT (expr), ctx, ht);
+          break;
         default:
           break;
         }
@@ -14436,6 +14517,36 @@ test_vector_folding ()
   ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));
 }
+/* Verify folding of VEC_DUPLICATE_CSTs and VEC_DUPLICATE_EXPRs.
*/
+
+static void
+test_vec_duplicate_folding ()
+{
+  tree type = build_vector_type (ssizetype, 4);
+  tree dup5 = build_vec_duplicate_cst (type, ssize_int (5));
+  tree dup3 = build_vec_duplicate_cst (type, ssize_int (3));
+
+  tree neg_dup5 = fold_unary (NEGATE_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (neg_dup5), ssize_int (-5));
+
+  tree not_dup5 = fold_unary (BIT_NOT_EXPR, type, dup5);
+  ASSERT_EQ (uniform_vector_p (not_dup5), ssize_int (-6));
+
+  tree dup5_plus_dup3 = fold_binary (PLUS_EXPR, type, dup5, dup3);
+  ASSERT_EQ (uniform_vector_p (dup5_plus_dup3), ssize_int (8));
+
+  tree dup5_lsl_2 = fold_binary (LSHIFT_EXPR, type, dup5, ssize_int (2));
+  ASSERT_EQ (uniform_vector_p (dup5_lsl_2), ssize_int (20));
+
+  tree size_vector = build_vector_type (sizetype, 4);
+  tree size_dup5 = fold_convert (size_vector, dup5);
+  ASSERT_EQ (uniform_vector_p (size_dup5), size_int (5));
+
+  tree dup5_expr = fold_unary (VEC_DUPLICATE_EXPR, type, ssize_int (5));
+  tree dup5_cst = build_vector_from_val (type, ssize_int (5));
+  ASSERT_TRUE (operand_equal_p (dup5_expr, dup5_cst, 0));
+}
+
 /* Run all of the selftests within this file.
*/
 void
@@ -14443,6 +14554,7 @@ fold_const_c_tests ()
 {
   test_arithmetic_folding ();
   test_vector_folding ();
+  test_vec_duplicate_folding ();
 }
 } // namespace selftest
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def 2017-10-23 11:38:53.934094740 +0100
+++ gcc/optabs.def 2017-10-23 11:41:51.769129995 +0100
@@ -364,3 +364,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I
 OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")
 OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
+
+OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
Index: gcc/optabs-tree.c
===================================================================
--- gcc/optabs-tree.c 2017-10-23 11:38:53.934094740 +0100
+++ gcc/optabs-tree.c 2017-10-23 11:41:51.768165374 +0100
@@ -210,6 +210,9 @@ optab_for_tree_code (enum tree_code code
       return TYPE_UNSIGNED (type) ?
         vec_pack_ufix_trunc_optab : vec_pack_sfix_trunc_optab;
+    case VEC_DUPLICATE_EXPR:
+      return vec_duplicate_optab;
+
     default:
       break;
     }
Index: gcc/optabs.h
===================================================================
--- gcc/optabs.h 2017-10-23 11:38:53.934094740 +0100
+++ gcc/optabs.h 2017-10-23 11:41:51.769129995 +0100
@@ -181,6 +181,7 @@ extern rtx simplify_expand_binop (machin
                                   enum optab_methods methods);
 extern bool force_expand_binop (machine_mode, optab, rtx, rtx, rtx, int,
                                 enum optab_methods);
+extern rtx expand_vector_broadcast (machine_mode, rtx);
 /* Generate code for a simple binary or unary operation. "Simple" in
    this case means "can be unambiguously described by a (mode, code)
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c 2017-10-23 11:41:41.549050496 +0100
+++ gcc/optabs.c 2017-10-23 11:41:51.769129995 +0100
@@ -367,7 +367,7 @@ force_expand_binop (machine_mode mode, o
    mode of OP must be the element mode of VMODE. If OP is a constant,
    then the return value will be a constant.
*/
-static rtx
+rtx
 expand_vector_broadcast (machine_mode vmode, rtx op)
 {
   enum insn_code icode;
@@ -380,6 +380,16 @@ expand_vector_broadcast (machine_mode vm
   if (CONSTANT_P (op))
     return gen_const_vec_duplicate (vmode, op);
+  icode = optab_handler (vec_duplicate_optab, vmode);
+  if (icode != CODE_FOR_nothing)
+    {
+      struct expand_operand ops[2];
+      create_output_operand (&ops[0], NULL_RTX, vmode);
+      create_input_operand (&ops[1], op, GET_MODE (op));
+      expand_insn (icode, 2, ops);
+      return ops[0].value;
+    }
+
   /* ??? If the target doesn't have a vec_init, then we have no easy way
      of performing this operation. Most of this sort of generic support
      is hidden away in the vector lowering support in gimple. */
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-10-23 11:41:39.187050437 +0100
+++ gcc/expr.c 2017-10-23 11:41:51.764306890 +0100
@@ -6572,7 +6572,8 @@ store_constructor (tree exp, rtx target,
         constructor_elt *ce;
         int i;
         int need_to_clear;
-        int icode = CODE_FOR_nothing;
+        insn_code icode = CODE_FOR_nothing;
+        tree elt;
         tree elttype = TREE_TYPE (type);
         int elt_size = tree_to_uhwi (TYPE_SIZE (elttype));
         machine_mode eltmode = TYPE_MODE (elttype);
@@ -6582,13 +6583,30 @@ store_constructor (tree exp, rtx target,
         unsigned n_elts;
         alias_set_type alias;
         bool vec_vec_init_p = false;
+        machine_mode mode = GET_MODE (target);
         gcc_assert (eltmode != BLKmode);
+        /* Try using vec_duplicate_optab for uniform vectors.
*/
+        if (!TREE_SIDE_EFFECTS (exp)
+            && VECTOR_MODE_P (mode)
+            && eltmode == GET_MODE_INNER (mode)
+            && ((icode = optab_handler (vec_duplicate_optab, mode))
+                != CODE_FOR_nothing)
+            && (elt = uniform_vector_p (exp)))
+          {
+            struct expand_operand ops[2];
+            create_output_operand (&ops[0], target, mode);
+            create_input_operand (&ops[1], expand_normal (elt), eltmode);
+            expand_insn (icode, 2, ops);
+            if (!rtx_equal_p (target, ops[0].value))
+              emit_move_insn (target, ops[0].value);
+            break;
+          }
+
         n_elts = TYPE_VECTOR_SUBPARTS (type);
-        if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
+        if (REG_P (target) && VECTOR_MODE_P (mode))
           {
-            machine_mode mode = GET_MODE (target);
             machine_mode emode = eltmode;
             if (CONSTRUCTOR_NELTS (exp)
@@ -6600,7 +6618,7 @@ store_constructor (tree exp, rtx target,
                             == n_elts);
                 emode = TYPE_MODE (etype);
               }
-            icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
+            icode = convert_optab_handler (vec_init_optab, mode, emode);
             if (icode != CODE_FOR_nothing)
               {
                 unsigned int i, n = n_elts;
@@ -6648,7 +6666,7 @@ store_constructor (tree exp, rtx target,
         if (need_to_clear && size > 0 && !vector)
           {
             if (REG_P (target))
-              emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+              emit_move_insn (target, CONST0_RTX (mode));
             else
               clear_storage (target, GEN_INT (size), BLOCK_OP_NORMAL);
             cleared = 1;
@@ -6656,7 +6674,7 @@ store_constructor (tree exp, rtx target,
         /* Inform later passes that the old value is dead. */
         if (!cleared && !vector && REG_P (target))
-          emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
+          emit_move_insn (target, CONST0_RTX (mode));
         if (MEM_P (target))
           alias = MEM_ALIAS_SET (target);
@@ -6707,8 +6725,7 @@ store_constructor (tree exp, rtx target,
         if (vector)
           emit_insn (GEN_FCN (icode) (target,
-                                      gen_rtx_PARALLEL (GET_MODE (target),
-                                                        vector)));
+                                      gen_rtx_PARALLEL (mode, vector)));
         break;
       }
@@ -7686,6 +7703,19 @@ expand_operands (tree exp0, tree exp1, r
 }
+/* Expand constant vector element ELT, which has mode MODE.
This is used
+   for members of VECTOR_CST and VEC_DUPLICATE_CST. */
+
+static rtx
+const_vector_element (scalar_mode mode, const_tree elt)
+{
+  if (TREE_CODE (elt) == REAL_CST)
+    return const_double_from_real_value (TREE_REAL_CST (elt), mode);
+  if (TREE_CODE (elt) == FIXED_CST)
+    return CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), mode);
+  return immed_wide_int_const (wi::to_wide (elt), mode);
+}
+
 /* Return a MEM that contains constant EXP. DEFER is as for
    output_constant_def and MODIFIER is as for expand_expr. */
@@ -9551,6 +9581,12 @@ #define REDUCE_BIT_FIELD(expr) (reduce_b
       target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
       return target;
+    case VEC_DUPLICATE_EXPR:
+      op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
+      target = expand_vector_broadcast (mode, op0);
+      gcc_assert (target);
+      return target;
+
     case BIT_INSERT_EXPR:
       {
         unsigned bitpos = tree_to_uhwi (treeop2);
@@ -10003,6 +10039,11 @@ expand_expr_real_1 (tree exp, rtx target
                              tmode, modifier);
       }
+    case VEC_DUPLICATE_CST:
+      op0 = const_vector_element (GET_MODE_INNER (mode),
+                                  VEC_DUPLICATE_CST_ELT (exp));
+      return gen_const_vec_duplicate (mode, op0);
+
     case CONST_DECL:
       if (modifier == EXPAND_WRITE)
         {
@@ -11764,8 +11805,7 @@ const_vector_from_tree (tree exp)
 {
   rtvec v;
   unsigned i, units;
-  tree elt;
-  machine_mode inner, mode;
+  machine_mode mode;
   mode = TYPE_MODE (TREE_TYPE (exp));
@@ -11776,23 +11816,12 @@ const_vector_from_tree (tree exp)
     return const_vector_mask_from_tree (exp);
   units = VECTOR_CST_NELTS (exp);
-  inner = GET_MODE_INNER (mode);
   v = rtvec_alloc (units);
   for (i = 0; i < units; ++i)
-    {
-      elt = VECTOR_CST_ELT (exp, i);
-
-      if (TREE_CODE (elt) == REAL_CST)
-        RTVEC_ELT (v, i) = const_double_from_real_value (TREE_REAL_CST (elt),
-                                                         inner);
-      else if (TREE_CODE (elt) == FIXED_CST)
-        RTVEC_ELT (v, i) = CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
-                                                         inner);
-      else
-        RTVEC_ELT (v, i) = immed_wide_int_const (wi::to_wide (elt), inner);
-    }
+    RTVEC_ELT (v,
i) = const_vector_element (GET_MODE_INNER (mode),
+                                             VECTOR_CST_ELT (exp, i));
   return gen_rtx_CONST_VECTOR (mode, v);
 }
Index: gcc/internal-fn.c
===================================================================
--- gcc/internal-fn.c 2017-10-23 11:41:23.529089619 +0100
+++ gcc/internal-fn.c 2017-10-23 11:41:51.767200753 +0100
@@ -1911,12 +1911,12 @@ expand_vector_ubsan_overflow (location_t
       emit_move_insn (cntvar, const0_rtx);
       emit_label (loop_lab);
     }
-  if (TREE_CODE (arg0) != VECTOR_CST)
+  if (!CONSTANT_CLASS_P (arg0))
     {
       rtx arg0r = expand_normal (arg0);
       arg0 = make_tree (TREE_TYPE (arg0), arg0r);
     }
-  if (TREE_CODE (arg1) != VECTOR_CST)
+  if (!CONSTANT_CLASS_P (arg1))
     {
       rtx arg1r = expand_normal (arg1);
       arg1 = make_tree (TREE_TYPE (arg1), arg1r);
Index: gcc/tree-cfg.c
===================================================================
--- gcc/tree-cfg.c 2017-10-23 11:41:25.864967029 +0100
+++ gcc/tree-cfg.c 2017-10-23 11:41:51.770094616 +0100
@@ -3803,6 +3803,17 @@ verify_gimple_assign_unary (gassign *stm
     case CONJ_EXPR:
       break;
+    case VEC_DUPLICATE_EXPR:
+      if (TREE_CODE (lhs_type) != VECTOR_TYPE
+          || !useless_type_conversion_p (TREE_TYPE (lhs_type), rhs1_type))
+        {
+          error ("vec_duplicate should be from a scalar to a like vector");
+          debug_generic_expr (lhs_type);
+          debug_generic_expr (rhs1_type);
+          return true;
+        }
+      return false;
+
     default:
       gcc_unreachable ();
     }
@@ -4473,6 +4484,7 @@ verify_gimple_assign_single (gassign *st
     case FIXED_CST:
     case COMPLEX_CST:
     case VECTOR_CST:
+    case VEC_DUPLICATE_CST:
     case STRING_CST:
       return res;
Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c 2017-10-23 11:41:25.833048208 +0100
+++ gcc/tree-inline.c 2017-10-23 11:41:51.771059237 +0100
@@ -4002,6 +4002,7 @@ estimate_operator_cost (enum tree_code c
     case VEC_PACK_FIX_TRUNC_EXPR:
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
+    case VEC_DUPLICATE_EXPR:
       return 1;