From patchwork Mon May 27 09:51:06 2019
X-Patchwork-Submitter: Prathamesh Kulkarni
X-Patchwork-Id: 165201
From: Prathamesh Kulkarni
Date: Mon, 27 May 2019 15:21:06 +0530
Subject: [AArch64] [SVE] PR88837 - Poor vector construction code in VL-specific mode
To: gcc Patches, Richard Sandiford

Hi,
The attached patch tries to improve initialization of fixed-length SVE
vectors. Its algorithm, worked out with help from Richard Sandiford, is
described in the comments for aarch64_sve_expand_vector_init() in the patch.
I verified that the tests added in the patch pass with qemu, and am trying
to run bootstrap+test on the patch in qemu.
Does the patch look OK?

Thanks,
Prathamesh

2019-05-27  Prathamesh Kulkarni
	    Richard Sandiford

	* vector-builder.h (vector_builder::count_dups): New method.
	* config/aarch64/aarch64-protos.h (aarch64_sve_expand_vector_init):
	Declare prototype.
	* config/aarch64/aarch64-sve.md (aarch64_sve_rev<mode>): Use @.
	(vec_init<mode><Vel>): New pattern.
	* config/aarch64/aarch64.c (emit_insr): New function.
	(aarch64_sve_expand_vector_init_handle_trailing_constants): Likewise.
	(aarch64_sve_expand_vector_init_insert_elems): Likewise.
	(aarch64_sve_expand_vector_init_handle_trailing_same_elem): Likewise.
	(aarch64_sve_expand_vector_init): Define two overloaded functions.

testsuite/
	* gcc.target/aarch64/sve/init_1.c: New test.
	* gcc.target/aarch64/sve/init_1_run.c: Likewise.
	* gcc.target/aarch64/sve/init_2.c: Likewise.
	* gcc.target/aarch64/sve/init_2_run.c: Likewise.
	* gcc.target/aarch64/sve/init_3.c: Likewise.
	* gcc.target/aarch64/sve/init_3_run.c: Likewise.
	* gcc.target/aarch64/sve/init_4.c: Likewise.
	* gcc.target/aarch64/sve/init_4_run.c: Likewise.
	* gcc.target/aarch64/sve/init_5.c: Likewise.
	* gcc.target/aarch64/sve/init_5_run.c: Likewise.
	* gcc.target/aarch64/sve/init_6.c: Likewise.
	* gcc.target/aarch64/sve/init_6_run.c: Likewise.
	* gcc.target/aarch64/sve/init_7.c: Likewise.
	* gcc.target/aarch64/sve/init_7_run.c: Likewise.
	* gcc.target/aarch64/sve/init_8.c: Likewise.
	* gcc.target/aarch64/sve/init_8_run.c: Likewise.
	* gcc.target/aarch64/sve/init_9.c: Likewise.
	* gcc.target/aarch64/sve/init_9_run.c: Likewise.
	* gcc.target/aarch64/sve/init_10.c: Likewise.
	* gcc.target/aarch64/sve/init_10_run.c: Likewise.
	* gcc.target/aarch64/sve/init_11.c: Likewise.
	* gcc.target/aarch64/sve/init_11_run.c: Likewise.
	* gcc.target/aarch64/sve/init_12.c: Likewise.
	* gcc.target/aarch64/sve/init_12_run.c: Likewise.

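Note for readers (not part of the patch): the fallback path described in the
comments for aarch64_sve_expand_vector_init() broadcasts the last element,
inserts the earlier ones with INSR in reverse order, and interleaves the even
and odd halves with ZIP1. A minimal sketch of that data flow on plain arrays
follows. It is only a model under stated assumptions: std::vector stands in
for an SVE register, and dup, insr, zip1, count_trailing_dups,
build_with_insr and build_interleaved are hypothetical helper names, not GCC
functions. For simplicity both halves use the INSR fallback here, whereas the
patch may load a constant half with emit_move_insn instead.

```cpp
// Illustrative model only, not part of the patch: std::vector stands in for
// an SVE register and every helper name here is hypothetical.
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Models a broadcast ("mov z0.s, wN"): each of the NLANES lanes gets X.
static Vec dup (int x, std::size_t nlanes)
{
  return Vec (nlanes, x);
}

// Models INSR: shift all lanes up by one, drop the top lane, put X in lane 0.
static void insr (Vec &v, int x)
{
  v.insert (v.begin (), x);
  v.pop_back ();
}

// Models ZIP1: interleave the low halves of A and B.
static Vec zip1 (const Vec &a, const Vec &b)
{
  Vec r;
  for (std::size_t i = 0; i < a.size () / 2; i++)
    {
      r.push_back (a[i]);
      r.push_back (b[i]);
    }
  return r;
}

// Number of trailing elements equal to the last one (always at least 1).
// Assumes ELTS is non-empty.
static std::size_t count_trailing_dups (const Vec &elts)
{
  std::size_t n = 1;
  while (n < elts.size () && elts[elts.size () - 1 - n] == elts.back ())
    n++;
  return n;
}

// Broadcast the last element, skip its trailing duplicates, then INSR the
// earlier elements in reverse order.  Lanes past ELTS.size () are don't-care.
static Vec build_with_insr (const Vec &elts, std::size_t nlanes)
{
  Vec v = dup (elts.back (), nlanes);
  for (std::size_t i = elts.size () - count_trailing_dups (elts); i-- > 0;)
    insr (v, elts[i]);
  return v;
}

// Split ELTS (assumed non-empty, even length) into even and odd elements,
// build each half separately and interleave them again with ZIP1.
static Vec build_interleaved (const Vec &elts)
{
  Vec even, odd;
  for (std::size_t i = 0; i + 1 < elts.size (); i += 2)
    {
      even.push_back (elts[i]);
      odd.push_back (elts[i + 1]);
    }
  Vec tmp1 = build_with_insr (even, elts.size ());
  Vec tmp2 = build_with_insr (odd, elts.size ());
  return zip1 (tmp1, tmp2);
}
```

In this model, an input shaped like the init_11.c test, {a, f, b, f, b, f, b, f},
needs two broadcasts, one insr and one zip1, which mirrors the instruction
sequence that test expects.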
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index b6c0d0a8eb6..f82728ed2d3 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -515,6 +515,7 @@ bool aarch64_maybe_expand_sve_subreg_move (rtx, rtx);
 void aarch64_split_sve_subreg_move (rtx, rtx, rtx);
 void aarch64_expand_prologue (void);
 void aarch64_expand_vector_init (rtx, rtx);
+void aarch64_sve_expand_vector_init (rtx, rtx);
 void aarch64_init_cumulative_args (CUMULATIVE_ARGS *, const_tree, rtx,
                                    const_tree, unsigned);
 void aarch64_init_expanders (void);
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index b9cb1fae98c..a4e0014eb3d 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -863,7 +863,7 @@
   "revb\t%0.h, %1/m, %2.h"
 )
 
-(define_insn "*aarch64_sve_rev<mode>"
+(define_insn "@aarch64_sve_rev<mode>"
   [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
         (unspec:SVE_ALL [(match_operand:SVE_ALL 1 "register_operand" "w")]
                         UNSPEC_REV))]
@@ -3207,3 +3207,15 @@
     DONE;
   }
 )
+
+;; Standard pattern name vec_init<mode><Vel>.
+
+(define_expand "vec_init<mode><Vel>"
+  [(match_operand:SVE_ALL 0 "register_operand" "")
+    (match_operand 1 "" "")]
+  "TARGET_SVE"
+  {
+    aarch64_sve_expand_vector_init (operands[0], operands[1]);
+    DONE;
+  }
+)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 83453d03095..8967e02524e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15244,6 +15244,261 @@ aarch64_expand_vector_init (rtx target, rtx vals)
     }
 }
 
+/* Emit RTL corresponding to:
+   insr TARGET, ELEM.  */
+
+static void
+emit_insr (rtx target, rtx elem)
+{
+  machine_mode mode = GET_MODE (target);
+  scalar_mode elem_mode = GET_MODE_INNER (mode);
+  elem = force_reg (elem_mode, elem);
+
+  insn_code icode = optab_handler (vec_shl_insert_optab, mode);
+  gcc_assert (icode != CODE_FOR_nothing);
+  emit_insn (GEN_FCN (icode) (target, target, elem));
+}
+
+/* Subroutine of aarch64_sve_expand_vector_init for handling
+   trailing constants.
+   This function works as follows:
+   (a) Create a new vector consisting of trailing constants.
+   (b) Initialize TARGET with the constant vector using emit_move_insn.
+   (c) Insert remaining elements in TARGET using insr.
+   NELTS is the total number of elements in original vector while
+
+   ??? The heuristic used is to do above only if number of constants
+   is at least half the total number of elements.  May need fine tuning.  */
+
+static bool
+aarch64_sve_expand_vector_init_handle_trailing_constants
+ (rtx target, const rtx_vector_builder &builder, int nelts, int nelts_reqd)
+{
+  machine_mode mode = GET_MODE (target);
+  scalar_mode elem_mode = GET_MODE_INNER (mode);
+  int n_trailing_constants = 0;
+
+  for (int i = nelts_reqd - 1;
+       i >= 0 && aarch64_legitimate_constant_p (elem_mode, builder.elt (i));
+       i--)
+    n_trailing_constants++;
+
+  if (n_trailing_constants >= nelts_reqd / 2)
+    {
+      rtx_vector_builder v (mode, 1, nelts);
+      for (int i = 0; i < nelts; i++)
+        v.quick_push (builder.elt (i + nelts_reqd - n_trailing_constants));
+      rtx const_vec = v.build ();
+      emit_move_insn (target, const_vec);
+
+      for (int i = nelts_reqd - n_trailing_constants - 1; i >= 0; i--)
+        emit_insr (target, builder.elt (i));
+
+      return true;
+    }
+
+  return false;
+}
+
+/* Subroutine of aarch64_sve_expand_vector_init.
+   Works as follows:
+   (a) Initialize TARGET by broadcasting element NELTS_REQD - 1 of BUILDER.
+   (b) Skip trailing elements from BUILDER, which are same as
+       element NELTS_REQD - 1.
+   (c) Insert earlier elements in reverse order in TARGET using insr.  */
+
+static void
+aarch64_sve_expand_vector_init_insert_elems (rtx target,
+                                             const rtx_vector_builder &builder,
+                                             int nelts_reqd)
+{
+  machine_mode mode = GET_MODE (target);
+  scalar_mode elem_mode = GET_MODE_INNER (mode);
+
+  struct expand_operand ops[2];
+  enum insn_code icode = optab_handler (vec_duplicate_optab, mode);
+  gcc_assert (icode != CODE_FOR_nothing);
+
+  create_output_operand (&ops[0], target, mode);
+  create_input_operand (&ops[1], builder.elt (nelts_reqd - 1), elem_mode);
+  expand_insn (icode, 2, ops);
+
+  int ndups = builder.count_dups (nelts_reqd - 1, -1, -1);
+  for (int i = nelts_reqd - ndups - 1; i >= 0; i--)
+    emit_insr (target, builder.elt (i));
+}
+
+/* Subroutine of aarch64_sve_expand_vector_init to handle case
+   when all trailing elements of builder are same.
+   This works as follows:
+   (a) Using expand_insn interface to broadcast last vector element in TARGET.
+   (b) Insert remaining elements in TARGET using insr.
+
+   ??? The heuristic used is to do above if number of same trailing elements
+   is at least 3/4 of total number of elements, loosely based on
+   heuristic from mostly_zeros_p.  May need fine-tuning.  */
+
+static bool
+aarch64_sve_expand_vector_init_handle_trailing_same_elem
+ (rtx target, const rtx_vector_builder &builder, int nelts_reqd)
+{
+  int ndups = builder.count_dups (nelts_reqd - 1, -1, -1);
+  if (ndups >= (3 * nelts_reqd) / 4)
+    {
+      aarch64_sve_expand_vector_init_insert_elems (target, builder,
+                                                   nelts_reqd - ndups + 1);
+      return true;
+    }
+
+  return false;
+}
+
+/* Initialize register TARGET from BUILDER.  NELTS is the constant number
+   of elements in BUILDER.
+
+   The function tries to initialize TARGET from BUILDER if it fits one
+   of the special cases outlined below.
+
+   Failing that, the function divides BUILDER into two sub-vectors:
+   v_even = even elements of BUILDER;
+   v_odd = odd elements of BUILDER;
+
+   and recursively calls itself with v_even and v_odd.
+
+   if (recursive call succeeded for v_even or v_odd)
+     TARGET = zip (v_even, v_odd)
+
+   The function returns true if it managed to build TARGET from BUILDER
+   with one of the special cases, false otherwise.
+
+   Example: {a, 1, b, 2, c, 3, d, 4}
+
+   The vector gets divided into:
+   v_even = {a, b, c, d}
+   v_odd = {1, 2, 3, 4}
+
+   aarch64_sve_expand_vector_init(v_odd) hits case 1 and
+   initialize tmp2 from constant vector v_odd using emit_move_insn.
+
+   aarch64_sve_expand_vector_init(v_even) fails since v_even contains
+   4 elements, so we construct tmp1 from v_even using insr:
+   tmp1 = dup(d)
+   insr tmp1, c
+   insr tmp1, b
+   insr tmp1, a
+
+   And finally:
+   TARGET = zip (tmp1, tmp2)
+   which sets TARGET to {a, 1, b, 2, c, 3, d, 4}.  */
+
+static bool
+aarch64_sve_expand_vector_init (rtx target, const rtx_vector_builder &builder,
+                                int nelts, int nelts_reqd)
+{
+  machine_mode mode = GET_MODE (target);
+
+  /* Case 1: Vector contains trailing constants.  */
+
+  if (aarch64_sve_expand_vector_init_handle_trailing_constants
+       (target, builder, nelts, nelts_reqd))
+    return true;
+
+  /* Case 2: Vector contains leading constants.  */
+
+  rtx_vector_builder rev_builder (mode, 1, nelts_reqd);
+  for (int i = 0; i < nelts_reqd; i++)
+    rev_builder.quick_push (builder.elt (nelts_reqd - i - 1));
+  rev_builder.finalize ();
+
+  if (aarch64_sve_expand_vector_init_handle_trailing_constants
+       (target, rev_builder, nelts, nelts_reqd))
+    {
+      emit_insn (gen_aarch64_sve_rev (mode, target, target));
+      return true;
+    }
+
+  /* Case 3: Vector contains trailing same element.  */
+
+  if (aarch64_sve_expand_vector_init_handle_trailing_same_elem
+       (target, builder, nelts_reqd))
+    return true;
+
+  /* Case 4: Vector contains leading same element.  */
+
+  if (aarch64_sve_expand_vector_init_handle_trailing_same_elem
+       (target, rev_builder, nelts_reqd) && nelts_reqd == nelts)
+    {
+      emit_insn (gen_aarch64_sve_rev (mode, target, target));
+      return true;
+    }
+
+  /* Avoid recursing below 4-elements.
+     ??? The threshold 4 may need fine-tuning.  */
+
+  if (nelts_reqd <= 4)
+    return false;
+
+  rtx_vector_builder v_even (mode, 1, nelts);
+  rtx_vector_builder v_odd (mode, 1, nelts);
+
+  for (int i = 0; i < nelts * 2; i += 2)
+    {
+      v_even.quick_push (builder.elt (i));
+      v_odd.quick_push (builder.elt (i + 1));
+    }
+
+  v_even.finalize ();
+  v_odd.finalize ();
+
+  rtx tmp1 = gen_reg_rtx (mode);
+  bool did_even_p = aarch64_sve_expand_vector_init (tmp1, v_even,
+                                                    nelts, nelts_reqd / 2);
+
+  rtx tmp2 = gen_reg_rtx (mode);
+  bool did_odd_p = aarch64_sve_expand_vector_init (tmp2, v_odd,
+                                                   nelts, nelts_reqd / 2);
+
+  if (!did_even_p && !did_odd_p)
+    return false;
+
+  /* Initialize v_even and v_odd using INSR if it didn't match any of the
+     special cases and zip v_even, v_odd.  */
+
+  if (!did_even_p)
+    aarch64_sve_expand_vector_init_insert_elems (tmp1, v_even, nelts_reqd / 2);
+
+  if (!did_odd_p)
+    aarch64_sve_expand_vector_init_insert_elems (tmp2, v_odd, nelts_reqd / 2);
+
+  rtvec v = gen_rtvec (2, tmp1, tmp2);
+  emit_set_insn (target, gen_rtx_UNSPEC (mode, v, UNSPEC_ZIP1));
+  return true;
+}
+
+/* Initialize register TARGET from the elements in PARALLEL rtx VALS.  */
+
+void
+aarch64_sve_expand_vector_init (rtx target, rtx vals)
+{
+  machine_mode mode = GET_MODE (target);
+  int nelts = XVECLEN (vals, 0);
+
+  rtx_vector_builder v (mode, 1, nelts);
+  for (int i = 0; i < nelts; i++)
+    v.quick_push (XVECEXP (vals, 0, i));
+  v.finalize ();
+
+  /* If neither sub-vectors of v could be initialized specially,
+     then use INSR to insert all elements from v into TARGET.
+     ??? This might not be optimal for vectors with large
+     initializers like 16-element or above.
+     For nelts < 4, it probably isn't useful to handle specially.  */
+
+  if (nelts < 4
+      || !aarch64_sve_expand_vector_init (target, v, nelts, nelts))
+    aarch64_sve_expand_vector_init_insert_elems (target, v, nelts);
+}
+
 static unsigned HOST_WIDE_INT
 aarch64_shift_truncation_mask (machine_mode mode)
 {
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_1.c b/gcc/testsuite/gcc.target/aarch64/sve/init_1.c
new file mode 100644
index 00000000000..c51876947fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 1.1: Trailing constants with stepped sequence.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b)
+{
+  return (vnx4si) { a, b, 1, 2, 3, 4, 5, 6 };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        ptrue   p0.s, vl8
+        index   z0.s, #1, #1
+        insr    z0.s, w1
+        insr    z0.s, w0
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tindex\t(z[0-9]+\.s), #1, #1\n\tinsr\t\1, w1\n\tinsr\t\1, w0} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_10.c b/gcc/testsuite/gcc.target/aarch64/sve/init_10.c
new file mode 100644
index 00000000000..7bca3f0ecc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_10.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 5.4: Interleaved repeating elements and non-repeating elements.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b, int c, int f)
+{
+  return (vnx4si) { a, f, b, f, c, f, c, f };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        mov     z0.s, w2
+        mov     z1.s, w3
+        insr    z0.s, w1
+        ptrue   p0.s, vl8
+        insr    z0.s, w0
+        zip1    z0.s, z0.s, z1.s
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w3\n\tmov\t(z[0-9]+\.s), w2\n.*\n\tinsr\t\2, w1\n\tinsr\t\2, w0\n\tzip1\t\2, \2, \1} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_10_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_10_run.c
new file mode 100644
index 00000000000..d9640e42ddd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_10_run.c
@@ -0,0 +1,21 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_10.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+  int c = 12;
+  int f = 13;
+
+  vnx4si v = foo (a, b, c, f);
+  int expected[] = { a, f, b, f, c, f, c, f };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_11.c b/gcc/testsuite/gcc.target/aarch64/sve/init_11.c
new file mode 100644
index 00000000000..b90895df436
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_11.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 5.5: Interleaved repeating elements and trailing same elements.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+vnx4si foo(int a, int b, int f)
+{
+  return (vnx4si) { a, f, b, f, b, f, b, f };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        mov     z0.s, w1
+        mov     z1.s, w2
+        insr    z0.s, w0
+        ptrue   p0.s, vl8
+        zip1    z0.s, z0.s, z1.s
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w1\n\tmov\t(z[0-9]+\.s), w2\n\tinsr\t\1, w0\n.*\tzip1\t\1, \1, \2} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_11_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_11_run.c
new file mode 100644
index 00000000000..8a99da45433
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_11_run.c
@@ -0,0 +1,20 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_11.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+  int f = 12;
+
+  vnx4si v = foo (a, b, f);
+  int expected[] = { a, f, b, f, b, f, b, f };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_12.c b/gcc/testsuite/gcc.target/aarch64/sve/init_12.c
new file mode 100644
index 00000000000..b36967d6d59
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_12.c
@@ -0,0 +1,30 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 5.5: Interleaved repeating elements and trailing same elements.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b, int f)
+{
+  return (vnx4si) { b, f, b, f, b, f, a, f };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        mov     z0.s, w0
+        mov     z1.s, w2
+        insr    z0.s, w1
+        ptrue   p0.s, vl8
+        insr    z0.s, w1
+        insr    z0.s, w1
+        zip1    z0.s, z0.s, z1.s
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w2\n\tmov\t(z[0-9]+\.s), w0\n.*\n\tinsr\t\2, w1\n\tinsr\t\2, w1\n\tinsr\t\2, w1\n\tzip1\t\2, \2, \1} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_12_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_12_run.c
new file mode 100644
index 00000000000..b77464c6b3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_12_run.c
@@ -0,0 +1,20 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_12.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+  int f = 12;
+
+  vnx4si v = foo (a, b, f);
+  int expected[] = { b, f, b, f, b, f, a, f };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_1_run.c
new file mode 100644
index 00000000000..c0cc5235da4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_1_run.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_1.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+
+  vnx4si v = foo (a, b);
+  int expected[] = { a, b, 1, 2, 3, 4, 5, 6 };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_2.c b/gcc/testsuite/gcc.target/aarch64/sve/init_2.c
new file mode 100644
index 00000000000..1ab7c4300e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 1.2: Trailing constants with repeating sequence.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b)
+{
+  return (vnx4si) { a, b, 2, 3, 2, 3, 2, 3 };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        ptrue   p0.s, vl8
+        adrp    x2, .LANCHOR0
+        add     x2, x2, :lo12:.LANCHOR0
+        ld1w    z0.s, p0/z, [x2]
+        insr    z0.s, w1
+        insr    z0.s, w0
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tld1w\t(z[0-9]+\.s), p[0-9]+/z, \[x[0-9]+\]\n\tinsr\t\1, w1\n\tinsr\t\1, w0} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_2_run.c
new file mode 100644
index 00000000000..0f3705d145b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_2_run.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_2.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+
+  vnx4si v = foo (a, b);
+  int expected[] = { a, b, 2, 3, 2, 3, 2, 3 };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_3.c b/gcc/testsuite/gcc.target/aarch64/sve/init_3.c
new file mode 100644
index 00000000000..ccf3fa85292
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 2.1: Leading constants with stepped sequence.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b)
+{
+  return (vnx4si) { 1, 2, 3, 4, 5, 6, a, b };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        ptrue   p0.s, vl8
+        index   z0.s, #6, #-1
+        insr    z0.s, w0
+        insr    z0.s, w1
+        rev     z0.s, z0.s
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tindex\t(z[0-9]+\.s), #6, #-1\n\tinsr\t\1, w0\n\tinsr\t\1, w1\n\trev\t\1, \1} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_3_run.c
new file mode 100644
index 00000000000..5df711dfc79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_3_run.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_3.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+
+  vnx4si v = foo (a, b);
+  int expected[] = { 1, 2, 3, 4, 5, 6, a, b };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_4.c b/gcc/testsuite/gcc.target/aarch64/sve/init_4.c
new file mode 100644
index 00000000000..b817dc5d9f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_4.c
@@ -0,0 +1,30 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 2.2: Leading constants with stepped sequence.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b)
+{
+  return (vnx4si) { 3, 2, 3, 2, 3, 2, b, a };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        ptrue   p0.s, vl8
+        adrp    x2, .LANCHOR0
+        add     x2, x2, :lo12:.LANCHOR0
+        ld1w    z0.s, p0/z, [x2]
+        insr    z0.s, w1
+        insr    z0.s, w0
+        rev     z0.s, z0.s
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tld1w\t(z[0-9]+\.s), p[0-9]+/z, \[x[0-9]+\]\n\tinsr\t\1, w1\n\tinsr\t\1, w0\n\trev\t\1, \1} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_4_run.c
new file mode 100644
index 00000000000..563353fe673
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_4_run.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_4.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+
+  vnx4si v = foo (a, b);
+  int expected[] = { 3, 2, 3, 2, 3, 2, b, a };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_5.c b/gcc/testsuite/gcc.target/aarch64/sve/init_5.c
new file mode 100644
index 00000000000..d662dfba8b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_5.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 3: Trailing same element.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b, int c)
+{
+  return (vnx4si) { a, b, c, c, c, c, c, c };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        mov     z0.s, w2
+        ptrue   p0.s, vl8
+        insr    z0.s, w1
+        insr    z0.s, w0
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w2\n.*\tinsr\t\1, w1\n\tinsr\t\1, w0} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_5_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_5_run.c
new file mode 100644
index 00000000000..ae444a17688
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_5_run.c
@@ -0,0 +1,20 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_5.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+  int c = 12;
+
+  vnx4si v = foo (a, b, c);
+  int expected[] = { a, b, c, c, c, c, c, c };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_6.c b/gcc/testsuite/gcc.target/aarch64/sve/init_6.c
new file mode 100644
index 00000000000..fd0e21dcb85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_6.c
@@ -0,0 +1,28 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 3: Trailing same element.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b, int c)
+{
+  return (vnx4si) { c, c, c, c, c, c, b, a };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        mov     z0.s, w2
+        ptrue   p0.s, vl8
+        insr    z0.s, w1
+        insr    z0.s, w0
+        rev     z0.s, z0.s
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w2\n.*\tinsr\t\1, w1\n\tinsr\t\1, w0\n\trev\t\1, \1} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_6_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_6_run.c
new file mode 100644
index 00000000000..d919f0ce0ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_6_run.c
@@ -0,0 +1,20 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_6.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+  int c = 12;
+
+  vnx4si v = foo (a, b, c);
+  int expected[] = { c, c, c, c, c, c, b, a };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_7.c b/gcc/testsuite/gcc.target/aarch64/sve/init_7.c
new file mode 100644
index 00000000000..5f3d82242d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_7.c
@@ -0,0 +1,32 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 5.1: All elements.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b, int c, int d, int e, int f, int g, int h)
+{
+  return (vnx4si) { a, b, c, d, e, f, g, h };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        mov     z0.s, w7
+        ptrue   p0.s, vl8
+        insr    z0.s, w6
+        insr    z0.s, w5
+        insr    z0.s, w4
+        insr    z0.s, w3
+        insr    z0.s, w2
+        insr    z0.s, w1
+        insr    z0.s, w0
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w7\n.*\tinsr\t\1, w6\n\tinsr\t\1, w5\n\tinsr\t\1, w4\n\tinsr\t\1, w3\n\tinsr\t\1, w2\n\tinsr\t\1, w1\n\tinsr\t\1, w0} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_7_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_7_run.c
new file mode 100644
index 00000000000..c9f040c6d4d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_7_run.c
@@ -0,0 +1,25 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_7.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+  int c = 12;
+  int d = 13;
+  int e = 14;
+  int f = 15;
+  int g = 16;
+  int h = 17;
+
+  vnx4si v = foo (a, b, c, d, e, f, g, h);
+  int expected[] = { a, b, c, d, e, f, g, h };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_8.c b/gcc/testsuite/gcc.target/aarch64/sve/init_8.c
new file mode 100644
index 00000000000..9a1869a2765
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_8.c
@@ -0,0 +1,32 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 5.2: Interleaved elements and constants.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b, int c, int d)
+{
+  return (vnx4si) { a, 1, b, 2, c, 3, d, 4 };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        ptrue   p0.s, vl8
+        mov     z0.s, w3
+        adrp    x3, .LANCHOR0
+        insr    z0.s, w2
+        add     x3, x3, :lo12:.LANCHOR0
+        insr    z0.s, w1
+        ld1w    z1.s, p0/z, [x3]
+        insr    z0.s, w0
+        zip1    z0.s, z0.s, z1.s
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w3\n\tadrp\t(x[0-9]+), \.LANCHOR0\n\tinsr\t\1, w2\n\tadd\t\2, \2, :lo12:\.LANCHOR0\n\tinsr\t\1, w1\n\tld1w\t(z[0-9]+\.s), p[0-9]+/z, \[\2\]\n\tinsr\t\1, w0\n\tzip1\t\1, \1, \3} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_8_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_8_run.c
new file mode 100644
index 00000000000..14a8ad44145
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_8_run.c
@@ -0,0 +1,21 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_8.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+  int c = 12;
+  int d = 13;
+
+  vnx4si v = foo (a, b, c, d);
+  int expected[] = { a, 1, b, 2, c, 3, d, 4 };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_9.c b/gcc/testsuite/gcc.target/aarch64/sve/init_9.c
new file mode 100644
index 00000000000..0ecbce848ef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_9.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target aarch64_asm_sve_ok } } */
+/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns -msve-vector-bits=256 --save-temps" } */
+
+/* Case 5.3: Repeated elements.  */
+
+#include <stdint.h>
+
+typedef int32_t vnx4si __attribute__((vector_size (32)));
+
+__attribute__((noipa))
+vnx4si foo(int a, int b)
+{
+  return (vnx4si) { a, b, a, b, a, b, a, b };
+}
+
+/*
+foo:
+.LFB0:
+        .cfi_startproc
+        mov     z0.s, w0
+        mov     z1.s, w1
+        ptrue   p0.s, vl8
+        zip1    z0.s, z0.s, z1.s
+        ret
+*/
+
+/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w0\n\tmov\t(z[0-9]+\.s), w1\n.*\tzip1\t\1, \1, \2} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_9_run.c b/gcc/testsuite/gcc.target/aarch64/sve/init_9_run.c
new file mode 100644
index 00000000000..6c67025c585
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/init_9_run.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256 --save-temps" } */
+
+#include "init_9.c"
+
+int main()
+{
+  int a = 10;
+  int b = 11;
+
+  vnx4si v = foo (a, b);
+  int expected[] = { a, b, a, b, a, b, a, b };
+
+  for (int i = 0; i < 8; i++)
+    if (v[i] != expected[i])
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/gcc/vector-builder.h b/gcc/vector-builder.h
index 9967daa6e4c..9f95b01bc3b 100644
--- a/gcc/vector-builder.h
+++ b/gcc/vector-builder.h
@@ -96,6 +96,7 @@ public:
   unsigned int encoded_nelts () const;
   bool encoded_full_vector_p () const;
   T elt (unsigned int) const;
+  unsigned int count_dups (int, int, int) const;
 
   bool operator == (const Derived &) const;
   bool operator != (const Derived &x) const { return !operator == (x); }
@@ -223,6 +224,23 @@ vector_builder::elt (unsigned int i) const
                          derived ()->step (prev, final));
 }
 
+/* Return the number of leading duplicate elements in the range
+   [START:END:STEP].  The value is always at least 1.
*/ + +template +unsigned int +vector_builder::count_dups (int start, int end, int step) const +{ + gcc_assert ((end - start) % step == 0); + + unsigned int ndups = 1; + for (int i = start + step; + i != end && derived ()->equal_p (elt (i), elt (start)); + i += step) + ndups++; + return ndups; +} + /* Change the encoding to NPATTERNS patterns of NELTS_PER_PATTERN each, but without changing the underlying vector. */