From patchwork Sat Sep 23 08:24:48 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 114122
Delivered-To: patch@linaro.org
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, dmalcolm@redhat.com, richard.sandiford@linaro.org
Cc: dmalcolm@redhat.com
Subject: Add more vec_duplicate simplifications
Date: Sat, 23 Sep 2017 09:24:48 +0100
Message-ID: <87d16hrfe7.fsf@linaro.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0

This patch adds a vec_duplicate_p helper that tests for constant
or non-constant vector duplicates.  Together with the existing
const_vec_duplicate_p, this complements the gen_vec_duplicate
and gen_const_vec_duplicate added by a previous patch.

The patch uses the new routines to add more rtx simplifications
involving vector duplicates.  These mirror simplifications that
we already do for CONST_VECTOR broadcasts and are needed for
variable-length SVE, which uses:

  (const:M (vec_duplicate:M X))

to represent constant broadcasts instead.  The simplifications
trigger on the testsuite for variable duplicates too, and in each
case I saw the change was an improvement.  E.g.:

- Several targets had this simplification in gcc.dg/pr49948.c
  when compiled at -O3:

    -Failed to match this instruction:
    +Successfully matched this instruction:
     (set (reg:DI 88)
    -    (subreg:DI (vec_duplicate:V2DI (reg/f:DI 75 [ _4 ])) 0))
    +    (reg/f:DI 75 [ _4 ]))

  On aarch64 this gives:

            ret
            .p2align 2
     .L8:
    +       adrp    x1, b
            sub     sp, sp, #80
    -       adrp    x2, b
    -       add     x1, sp, 12
    +       add     x2, sp, 12
            str     wzr, [x0, #:lo12:a]
    +       str     x2, [x1, #:lo12:b]
            mov     w0, 0
    -       dup     v0.2d, x1
    -       str     d0, [x2, #:lo12:b]
            add     sp, sp, 80
            ret
            .size   foo, .-foo

  On x86_64:

            jg      .L2
            leaq    -76(%rsp), %rax
            movl    $0, a(%rip)
    -       movq    %rax, -96(%rsp)
    -       movq    -96(%rsp), %xmm0
    -       punpcklqdq      %xmm0, %xmm0
    -       movq    %xmm0, b(%rip)
    +       movq    %rax, b(%rip)
     .L2:
            xorl    %eax, %eax
            ret

  etc.

- gcc.dg/torture/pr58018.c compiled at -O3 on aarch64 has an instance of:

     Trying 50, 52, 46 -> 53:
     Failed to match this instruction:
     (set (reg:V4SI 167)
    -    (and:V4SI (and:V4SI (vec_duplicate:V4SI (reg:SI 132 [ _165 ]))
    -            (reg:V4SI 209))
    -        (const_vector:V4SI [
    -                (const_int 1 [0x1])
    -                (const_int 1 [0x1])
    -                (const_int 1 [0x1])
    -                (const_int 1 [0x1])
    -            ])))
    +    (and:V4SI (vec_duplicate:V4SI (reg:SI 132 [ _165 ]))
    +        (reg:V4SI 209)))
     Successfully matched this instruction:
     (set (reg:V4SI 163 [ vect_patt_16.14 ])
         (vec_duplicate:V4SI (reg:SI 132 [ _165 ])))
    +Successfully matched this instruction:
    +(set (reg:V4SI 167)
    +    (and:V4SI (reg:V4SI 163 [ vect_patt_16.14 ])
    +        (reg:V4SI 209)))

  where (reg:SI 132) is the result of a scalar comparison and so
  is known to be 0 or 1.  This saves a MOVI and vector AND:

            cmp     w7, 4
            bls     .L15
            dup     v1.4s, w2
    -       lsr     w2, w1, 2
    +       dup     v2.4s, w6
            movi    v3.4s, 0
    -       mov     w0, 0
    -       movi    v2.4s, 0x1
    +       lsr     w2, w1, 2
            mvni    v0.4s, 0
    +       mov     w0, 0
            cmge    v1.4s, v1.4s, v3.4s
            and     v1.16b, v2.16b, v1.16b
    -       dup     v2.4s, w6
    -       and     v1.16b, v1.16b, v2.16b
            .p2align 3
     .L7:
            and     v0.16b, v0.16b, v1.16b

- powerpc64le has many instances of things like:

    -Failed to match this instruction:
    +Successfully matched this instruction:
     (set (reg:V4SI 161 [ vect_cst__24 ])
    -    (vec_select:V4SI (vec_duplicate:V4SI (vec_select:SI (reg:V4SI 143)
    -                (parallel [
    -                        (const_int 0 [0])
    -                    ])))
    -        (parallel [
    -                (const_int 2 [0x2])
    -                (const_int 3 [0x3])
    -                (const_int 0 [0])
    -                (const_int 1 [0x1])
    -            ])))
    +    (vec_duplicate:V4SI (vec_select:SI (reg:V4SI 143)
    +            (parallel [
    +                    (const_int 0 [0])
    +                ]))))

  This removes redundant XXPERMDIs from many tests.

The best way of testing the new simplifications seemed to be
via selftests.  The patch cribs part of David's patch here:
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00270.html .

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the testsuite assembly output on at least one
target per CPU directory, with the results above.  OK to install?

Richard


2017-09-22  Richard Sandiford
            David Malcolm
            Alan Hayward
            David Sherwood

gcc/
        * rtl.h (vec_duplicate_p): New function.
        * selftest-rtl.c (assert_rtx_eq_at): New function.
        * selftest-rtl.h (ASSERT_RTX_EQ): New macro.
        (assert_rtx_eq_at): Declare.
        * selftest.h (selftest::simplify_rtx_c_tests): Declare.
        * selftest-run-tests.c (selftest::run_tests): Call it.
        * simplify-rtx.c: Include selftest.h and selftest-rtl.h.
        (simplify_unary_operation_1): Recursively handle vector duplicates.
        (simplify_binary_operation_1): Likewise.  Handle VEC_SELECTs of
        vector duplicates.
        (simplify_subreg): Handle subregs of vector duplicates.
        (make_test_reg, test_vector_ops_duplicate, test_vector_ops)
        (selftest::simplify_rtx_c_tests): New functions.

Index: gcc/rtl.h
===================================================================
--- gcc/rtl.h  2017-09-21 11:53:16.490837253 +0100
+++ gcc/rtl.h  2017-09-23 09:14:44.753611808 +0100
@@ -2772,6 +2772,21 @@ const_vec_duplicate_p (T x, T *elt)
   return false;
 }
 
+/* Return true if X is a vector with a duplicated element value, either
+   constant or nonconstant.  Store the duplicated element in *ELT if so.  */
+
+template <typename T>
+inline bool
+vec_duplicate_p (T x, T *elt)
+{
+  if (GET_CODE (x) == VEC_DUPLICATE)
+    {
+      *elt = XEXP (x, 0);
+      return true;
+    }
+  return const_vec_duplicate_p (x, elt);
+}
+
 /* If X is a vector constant with a duplicated element value, return that
    element value, otherwise return X.  */
 
Index: gcc/selftest-rtl.c
===================================================================
--- gcc/selftest-rtl.c  2017-02-23 19:54:03.000000000 +0000
+++ gcc/selftest-rtl.c  2017-09-23 09:14:44.753611808 +0100
@@ -35,6 +35,29 @@ Software Foundation; either version 3, o
 
 namespace selftest {
 
+/* Compare rtx EXPECTED and ACTUAL using rtx_equal_p, calling
+   ::selftest::pass if they are equal, aborting if they are non-equal.
+   LOC is the effective location of the assertion, MSG describes it.  */
+
+void
+assert_rtx_eq_at (const location &loc, const char *msg,
+                  rtx expected, rtx actual)
+{
+  if (rtx_equal_p (expected, actual))
+    ::selftest::pass (loc, msg);
+  else
+    {
+      fprintf (stderr, "%s:%i: %s: FAIL: %s\n", loc.m_file, loc.m_line,
+               loc.m_function, msg);
+      fprintf (stderr, " expected: ");
+      print_rtl (stderr, expected);
+      fprintf (stderr, "\n actual: ");
+      print_rtl (stderr, actual);
+      fprintf (stderr, "\n");
+      abort ();
+    }
+}
+
 /* Compare rtx EXPECTED and ACTUAL by pointer equality, calling
    ::selftest::pass if they are equal, aborting if they are non-equal.
    LOC is the effective location of the assertion, MSG describes it.  */
Index: gcc/selftest-rtl.h
===================================================================
--- gcc/selftest-rtl.h  2017-02-23 19:54:03.000000000 +0000
+++ gcc/selftest-rtl.h  2017-09-23 09:14:44.753611808 +0100
@@ -47,6 +47,15 @@ #define ASSERT_RTL_DUMP_EQ_WITH_REUSE(EX
   assert_rtl_dump_eq (SELFTEST_LOCATION, (EXPECTED_DUMP), (RTX), \
                       (REUSE_MANAGER))
 
+#define ASSERT_RTX_EQ(EXPECTED, ACTUAL)                                \
+  SELFTEST_BEGIN_STMT                                                  \
+  const char *desc = "ASSERT_RTX_EQ (" #EXPECTED ", " #ACTUAL ")";     \
+  ::selftest::assert_rtx_eq_at (SELFTEST_LOCATION, desc, (EXPECTED),   \
+                                (ACTUAL));                             \
+  SELFTEST_END_STMT
+
+extern void assert_rtx_eq_at (const location &, const char *, rtx, rtx);
+
 /* Evaluate rtx EXPECTED and ACTUAL and compare them with ==
    (i.e. pointer equality), calling ::selftest::pass if they are
    equal, aborting if they are non-equal.  */
Index: gcc/selftest.h
===================================================================
--- gcc/selftest.h  2017-06-12 17:05:23.270759359 +0100
+++ gcc/selftest.h  2017-09-23 09:14:44.754563641 +0100
@@ -197,6 +197,7 @@ extern void tree_cfg_c_tests ();
 extern void vec_c_tests ();
 extern void wide_int_cc_tests ();
 extern void predict_c_tests ();
+extern void simplify_rtx_c_tests ();
 
 extern int num_passes;
 
Index: gcc/selftest-run-tests.c
===================================================================
--- gcc/selftest-run-tests.c  2017-06-12 17:05:23.505761496 +0100
+++ gcc/selftest-run-tests.c  2017-09-23 09:14:44.754563641 +0100
@@ -93,6 +93,7 @@ selftest::run_tests ()
 
   store_merging_c_tests ();
   predict_c_tests ();
+  simplify_rtx_c_tests ();
 
   /* Run any lang-specific selftests.  */
   lang_hooks.run_lang_selftests ();
Index: gcc/simplify-rtx.c
===================================================================
--- gcc/simplify-rtx.c  2017-09-23 08:56:06.723183240 +0100
+++ gcc/simplify-rtx.c  2017-09-23 09:14:44.755515474 +0100
@@ -33,6 +33,8 @@ Software Foundation; either version 3, o
 #include "diagnostic-core.h"
 #include "varasm.h"
 #include "flags.h"
+#include "selftest.h"
+#include "selftest-rtl.h"
 
 /* Simplification and canonicalization of RTL.  */
 
@@ -925,7 +927,7 @@ exact_int_to_float_conversion_p (const_r
 simplify_unary_operation_1 (enum rtx_code code, machine_mode mode, rtx op)
 {
   enum rtx_code reversed;
-  rtx temp;
+  rtx temp, elt;
   scalar_int_mode inner, int_mode, op_mode, op0_mode;
 
   switch (code)
@@ -1681,6 +1683,28 @@ simplify_unary_operation_1 (enum rtx_cod
       break;
     }
 
+  if (VECTOR_MODE_P (mode) && vec_duplicate_p (op, &elt))
+    {
+      /* Try applying the operator to ELT and see if that simplifies.
+         We can duplicate the result if so.
+
+         The reason we don't use simplify_gen_unary is that it isn't
+         necessarily a win to convert things like:
+
+           (neg:V (vec_duplicate:V (reg:S R)))
+
+         to:
+
+           (vec_duplicate:V (neg:S (reg:S R)))
+
+         The first might be done entirely in vector registers while the
+         second might need a move between register files.  */
+      temp = simplify_unary_operation (code, GET_MODE_INNER (mode),
+                                       elt, GET_MODE_INNER (GET_MODE (op)));
+      if (temp)
+        return gen_vec_duplicate (mode, temp);
+    }
+
   return 0;
 }
 
@@ -2138,7 +2162,7 @@ simplify_binary_operation (enum rtx_code
 simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
                              rtx op0, rtx op1, rtx trueop0, rtx trueop1)
 {
-  rtx tem, reversed, opleft, opright;
+  rtx tem, reversed, opleft, opright, elt0, elt1;
   HOST_WIDE_INT val;
   unsigned int width = GET_MODE_PRECISION (mode);
   scalar_int_mode int_mode, inner_mode;
@@ -3505,6 +3529,9 @@ simplify_binary_operation_1 (enum rtx_co
           gcc_assert (XVECLEN (trueop1, 0) == 1);
           gcc_assert (CONST_INT_P (XVECEXP (trueop1, 0, 0)));
 
+          if (vec_duplicate_p (trueop0, &elt0))
+            return elt0;
+
           if (GET_CODE (trueop0) == CONST_VECTOR)
             return CONST_VECTOR_ELT (trueop0, INTVAL (XVECEXP
                                                       (trueop1, 0, 0)));
@@ -3587,9 +3614,6 @@ simplify_binary_operation_1 (enum rtx_co
                                     tmp_op, gen_rtx_PARALLEL (VOIDmode, vec));
               return tmp;
             }
-          if (GET_CODE (trueop0) == VEC_DUPLICATE
-              && GET_MODE (XEXP (trueop0, 0)) == mode)
-            return XEXP (trueop0, 0);
         }
       else
         {
@@ -3598,6 +3622,11 @@ simplify_binary_operation_1 (enum rtx_co
                       == GET_MODE_INNER (GET_MODE (trueop0)));
           gcc_assert (GET_CODE (trueop1) == PARALLEL);
 
+          if (vec_duplicate_p (trueop0, &elt0))
+            /* It doesn't matter which elements are selected by trueop1,
+               because they are all the same.  */
+            return gen_vec_duplicate (mode, elt0);
+
           if (GET_CODE (trueop0) == CONST_VECTOR)
             {
               int elt_size = GET_MODE_UNIT_SIZE (mode);
@@ -3898,6 +3927,32 @@ simplify_binary_operation_1 (enum rtx_co
       gcc_unreachable ();
     }
 
+  if (mode == GET_MODE (op0)
+      && mode == GET_MODE (op1)
+      && vec_duplicate_p (op0, &elt0)
+      && vec_duplicate_p (op1, &elt1))
+    {
+      /* Try applying the operator to ELT and see if that simplifies.
+         We can duplicate the result if so.
+
+         The reason we don't use simplify_gen_binary is that it isn't
+         necessarily a win to convert things like:
+
+           (plus:V (vec_duplicate:V (reg:S R1))
+                   (vec_duplicate:V (reg:S R2)))
+
+         to:
+
+           (vec_duplicate:V (plus:S (reg:S R1) (reg:S R2)))
+
+         The first might be done entirely in vector registers while the
+         second might need a move between register files.  */
+      tem = simplify_binary_operation (code, GET_MODE_INNER (mode),
+                                       elt0, elt1);
+      if (tem)
+        return gen_vec_duplicate (mode, tem);
+    }
+
   return 0;
 }
 
@@ -6046,6 +6101,20 @@ simplify_subreg (machine_mode outermode,
   if (outermode == innermode && !byte)
     return op;
 
+  if (byte % GET_MODE_UNIT_SIZE (innermode) == 0)
+    {
+      rtx elt;
+
+      if (VECTOR_MODE_P (outermode)
+          && GET_MODE_INNER (outermode) == GET_MODE_INNER (innermode)
+          && vec_duplicate_p (op, &elt))
+        return gen_vec_duplicate (outermode, elt);
+
+      if (outermode == GET_MODE_INNER (innermode)
+          && vec_duplicate_p (op, &elt))
+        return elt;
+    }
+
   if (CONST_SCALAR_INT_P (op)
       || CONST_DOUBLE_AS_FLOAT_P (op)
       || GET_CODE (op) == CONST_FIXED
@@ -6351,3 +6420,125 @@ simplify_rtx (const_rtx x)
     }
   return NULL;
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Make a unique pseudo REG of mode MODE for use by selftests.  */
+
+static rtx
+make_test_reg (machine_mode mode)
+{
+  static int test_reg_num = LAST_VIRTUAL_REGISTER + 1;
+
+  return gen_rtx_REG (mode, test_reg_num++);
+}
+
+/* Test vector simplifications in which the operands and result have
+   vector mode MODE.  SCALAR_REG is a pseudo register that holds one
+   element of MODE.  */
+
+static void
+test_vector_ops_duplicate (machine_mode mode, rtx scalar_reg)
+{
+  scalar_mode inner_mode = GET_MODE_INNER (mode);
+  rtx duplicate = gen_rtx_VEC_DUPLICATE (mode, scalar_reg);
+  unsigned int nunits = GET_MODE_NUNITS (mode);
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+    {
+      /* Test some simple unary cases with VEC_DUPLICATE arguments.  */
+      rtx not_scalar_reg = gen_rtx_NOT (inner_mode, scalar_reg);
+      rtx duplicate_not = gen_rtx_VEC_DUPLICATE (mode, not_scalar_reg);
+      ASSERT_RTX_EQ (duplicate,
+                     simplify_unary_operation (NOT, mode,
+                                               duplicate_not, mode));
+
+      rtx neg_scalar_reg = gen_rtx_NEG (inner_mode, scalar_reg);
+      rtx duplicate_neg = gen_rtx_VEC_DUPLICATE (mode, neg_scalar_reg);
+      ASSERT_RTX_EQ (duplicate,
+                     simplify_unary_operation (NEG, mode,
+                                               duplicate_neg, mode));
+
+      /* Test some simple binary cases with VEC_DUPLICATE arguments.  */
+      ASSERT_RTX_EQ (duplicate,
+                     simplify_binary_operation (PLUS, mode, duplicate,
+                                                CONST0_RTX (mode)));
+
+      ASSERT_RTX_EQ (duplicate,
+                     simplify_binary_operation (MINUS, mode, duplicate,
+                                                CONST0_RTX (mode)));
+
+      ASSERT_RTX_PTR_EQ (CONST0_RTX (mode),
+                         simplify_binary_operation (MINUS, mode, duplicate,
+                                                    duplicate));
+    }
+
+  /* Test a scalar VEC_SELECT of a VEC_DUPLICATE.  */
+  rtx zero_par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, const0_rtx));
+  ASSERT_RTX_PTR_EQ (scalar_reg,
+                     simplify_binary_operation (VEC_SELECT, inner_mode,
+                                                duplicate, zero_par));
+
+  /* And again with the final element.  */
+  rtx last_index = gen_int_mode (GET_MODE_NUNITS (mode) - 1, word_mode);
+  rtx last_par = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, last_index));
+  ASSERT_RTX_PTR_EQ (scalar_reg,
+                     simplify_binary_operation (VEC_SELECT, inner_mode,
+                                                duplicate, last_par));
+
+  /* Test a scalar subreg of a VEC_DUPLICATE.  */
+  unsigned int offset = subreg_lowpart_offset (inner_mode, mode);
+  ASSERT_RTX_EQ (scalar_reg,
+                 simplify_gen_subreg (inner_mode, duplicate,
+                                      mode, offset));
+
+  machine_mode narrower_mode;
+  if (nunits > 2
+      && mode_for_vector (inner_mode, 2).exists (&narrower_mode)
+      && VECTOR_MODE_P (narrower_mode))
+    {
+      /* Test VEC_SELECT of a vector.  */
+      rtx vec_par
+        = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, const1_rtx, const0_rtx));
+      rtx narrower_duplicate
+        = gen_rtx_VEC_DUPLICATE (narrower_mode, scalar_reg);
+      ASSERT_RTX_EQ (narrower_duplicate,
+                     simplify_binary_operation (VEC_SELECT, narrower_mode,
+                                                duplicate, vec_par));
+
+      /* Test a vector subreg of a VEC_DUPLICATE.  */
+      unsigned int offset = subreg_lowpart_offset (narrower_mode, mode);
+      ASSERT_RTX_EQ (narrower_duplicate,
+                     simplify_gen_subreg (narrower_mode, duplicate,
+                                          mode, offset));
+    }
+}
+
+/* Verify some simplifications involving vectors.  */
+
+static void
+test_vector_ops ()
+{
+  for (unsigned int i = 0; i < NUM_MACHINE_MODES; ++i)
+    {
+      machine_mode mode = (machine_mode) i;
+      if (VECTOR_MODE_P (mode))
+        {
+          rtx scalar_reg = make_test_reg (GET_MODE_INNER (mode));
+          test_vector_ops_duplicate (mode, scalar_reg);
+        }
+    }
+}
+
+/* Run all of the selftests within this file.  */
+
+void
+simplify_rtx_c_tests ()
+{
+  test_vector_ops ();
+}
+
+} // namespace selftest
+
+#endif /* CHECKING_P */
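
Illustrative note, not part of the patch: the unary and VEC_SELECT
rules can be modelled outside GCC.  The sketch below is a toy, assuming
a much-reduced expression type; Expr, Code and every toy_* helper are
hypothetical names invented for this example and are not GCC APIs.  It
only tries to show the shape of the rule: the vector operation is
rewritten as a duplicate only when applying the operator to the scalar
element actually simplifies.

```cpp
// Toy model only: Expr, Code and the toy_* helpers are hypothetical
// stand-ins for rtx and the simplify-rtx.c routines, not GCC APIs.
#include <cassert>
#include <memory>
#include <utility>

enum class Code { Reg, Neg, Not, VecDuplicate };

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

struct Expr {
  Code code;
  int regno;   // only meaningful for Code::Reg
  ExprPtr op;  // operand of Neg, Not and VecDuplicate
};

inline ExprPtr make_reg(int regno) {
  return std::make_shared<Expr>(Expr{Code::Reg, regno, nullptr});
}

inline ExprPtr make_unary(Code code, ExprPtr op) {
  return std::make_shared<Expr>(Expr{code, 0, std::move(op)});
}

// Structural equality, playing the role that rtx_equal_p plays in the patch.
inline bool toy_equal(const ExprPtr &a, const ExprPtr &b) {
  if (!a || !b) return a == b;
  if (a->code != b->code) return false;
  if (a->code == Code::Reg) return a->regno == b->regno;
  return toy_equal(a->op, b->op);
}

// Same shape as vec_duplicate_p: report the repeated element via *elt.
inline bool toy_vec_duplicate_p(const ExprPtr &x, ExprPtr *elt) {
  if (x->code == Code::VecDuplicate) {
    *elt = x->op;
    return true;
  }
  return false;
}

// The only scalar rule modelled here: OP (OP X) -> X for Neg and Not.
// Returns null when nothing simplifies.
inline ExprPtr toy_simplify_scalar_unary(Code code, const ExprPtr &op) {
  if ((code == Code::Neg || code == Code::Not) && op->code == code)
    return op->op;
  return nullptr;
}

// Vector rule: rewrite OP (dup ELT) only if OP ELT simplified to TEMP,
// giving dup (TEMP); otherwise leave the vector operation alone.
inline ExprPtr toy_simplify_vector_unary(Code code, const ExprPtr &op) {
  ExprPtr elt;
  if (toy_vec_duplicate_p(op, &elt))
    if (ExprPtr temp = toy_simplify_scalar_unary(code, elt))
      return make_unary(Code::VecDuplicate, temp);
  return nullptr;
}

// Selecting any single element of a duplicate yields the element itself,
// so the index is irrelevant.
inline ExprPtr toy_select_element(const ExprPtr &vec, unsigned /*index*/) {
  ExprPtr elt;
  if (toy_vec_duplicate_p(vec, &elt))
    return elt;
  return nullptr;
}

// Checks in the spirit of test_vector_ops_duplicate.
inline void toy_selftest() {
  ExprPtr reg = make_reg(100);
  ExprPtr dup = make_unary(Code::VecDuplicate, reg);
  ExprPtr dup_neg = make_unary(Code::VecDuplicate, make_unary(Code::Neg, reg));
  // NEG (dup (NEG R)) folds to dup (R) ...
  assert(toy_equal(dup, toy_simplify_vector_unary(Code::Neg, dup_neg)));
  // ... but NEG (dup R) is left alone, because NEG R does not simplify.
  assert(!toy_simplify_vector_unary(Code::Neg, dup));
  // Any lane of a duplicate is the original scalar.
  assert(toy_select_element(dup, 3) == reg);
}
```

Calling toy_selftest () from a test driver exercises the three cases.
Returning null when the scalar operation does not simplify is the
toy's analogue of calling simplify_unary_operation rather than
simplify_gen_unary, for the reason given in the patch's comment.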