From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, james.greenhalgh@arm.com, marcus.shawcroft@arm.com
Subject: [AArch64] Use all SVE LD1RQ variants
Date: Fri, 26 Jan 2018 13:50:40 +0000
Message-ID: <87y3kkohjj.fsf@linaro.org>
X-Patchwork-Id: 125975

The fallback way of handling a repeated 128-bit constant vector for SVE
is to force the 128 bits to the constant pool and use LD1RQ to load it.
Previously the code always used the byte variant of LD1RQ (LD1RQB), with
a preceding BSWAP for big-endian targets.  However, that BSWAP doesn't
handle all cases correctly.  The simplest fix seemed to be to use the
LD1RQ variant appropriate for the element size.

This helps to fix some of the sve/slp_*.c tests for aarch64_be,
although a later patch is needed as well.

Tested on aarch64_be-elf and aarch64-linux-gnu.  OK to install?

Richard


2018-01-26  Richard Sandiford

gcc/
        * config/aarch64/aarch64-sve.md (sve_ld1rq): Replace with...
        (*sve_ld1rq): ... this new pattern.  Handle all element sizes,
        not just bytes.
        * config/aarch64/aarch64.c (aarch64_expand_sve_widened_duplicate):
        Remove BSWAP handling for big-endian targets and use the form of
        LD1RQ appropriate for the mode.

gcc/testsuite/
        * gcc.target/aarch64/sve/slp_2.c: Expect LD1RQD rather than LD1RQB.
        * gcc.target/aarch64/sve/slp_3.c: Expect LD1RQW rather than LD1RQB.
        * gcc.target/aarch64/sve/slp_4.c: Expect LD1RQH rather than LD1RQB.
Index: gcc/config/aarch64/aarch64-sve.md
===================================================================
--- gcc/config/aarch64/aarch64-sve.md	2018-01-26 13:26:50.176756711 +0000
+++ gcc/config/aarch64/aarch64-sve.md	2018-01-26 13:49:00.069630245 +0000
@@ -652,14 +652,14 @@ (define_insn "sve_ld1r"
 ;; Load 128 bits from memory and duplicate to fill a vector.  Since there
 ;; are so few operations on 128-bit "elements", we don't define a VNx1TI
 ;; and simply use vectors of bytes instead.
-(define_insn "sve_ld1rq"
-  [(set (match_operand:VNx16QI 0 "register_operand" "=w")
-        (unspec:VNx16QI
-          [(match_operand:VNx16BI 1 "register_operand" "Upl")
+(define_insn "*sve_ld1rq"
+  [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
+        (unspec:SVE_ALL
+          [(match_operand: 1 "register_operand" "Upl")
            (match_operand:TI 2 "aarch64_sve_ld1r_operand" "Uty")]
           UNSPEC_LD1RQ))]
   "TARGET_SVE"
-  "ld1rqb\t%0.b, %1/z, %2"
+  "ld1rq\t%0., %1/z, %2"
 )
 
 ;; Implement a predicate broadcast by shifting the low bit of the scalar
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2018-01-26 13:46:00.955822193 +0000
+++ gcc/config/aarch64/aarch64.c	2018-01-26 13:49:00.071630173 +0000
@@ -2787,16 +2787,7 @@ aarch64_expand_sve_widened_duplicate (rt
       return true;
     }
 
-  /* The bytes are loaded in little-endian order, so do a byteswap on
-     big-endian targets.  */
-  if (BYTES_BIG_ENDIAN)
-    {
-      src = simplify_unary_operation (BSWAP, src_mode, src, src_mode);
-      if (!src)
-        return NULL_RTX;
-    }
-
-  /* Use LD1RQ to load the 128 bits from memory.  */
+  /* Use LD1RQ[BHWD] to load the 128 bits from memory.  */
   src = force_const_mem (src_mode, src);
   if (!src)
     return false;
@@ -2808,8 +2799,12 @@ aarch64_expand_sve_widened_duplicate (rt
       src = replace_equiv_address (src, addr);
     }
 
-  rtx ptrue = force_reg (VNx16BImode, CONSTM1_RTX (VNx16BImode));
-  emit_insn (gen_sve_ld1rq (gen_lowpart (VNx16QImode, dest), ptrue, src));
+  machine_mode mode = GET_MODE (dest);
+  unsigned int elem_bytes = GET_MODE_UNIT_SIZE (mode);
+  machine_mode pred_mode = aarch64_sve_pred_mode (elem_bytes).require ();
+  rtx ptrue = force_reg (pred_mode, CONSTM1_RTX (pred_mode));
+  src = gen_rtx_UNSPEC (mode, gen_rtvec (2, ptrue, src), UNSPEC_LD1RQ);
+  emit_insn (gen_rtx_SET (dest, src));
   return true;
 }
 
Index: gcc/testsuite/gcc.target/aarch64/sve/slp_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/slp_2.c	2018-01-13 17:58:43.651957575 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/slp_2.c	2018-01-26 13:49:00.071630173 +0000
@@ -32,7 +32,7 @@ TEST_ALL (VEC_PERM)
 /* { dg-final { scan-assembler-times {\tld1rh\tz[0-9]+\.h, } 2 } } */
 /* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 3 } } */
 /* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 3 } } */
-/* { dg-final { scan-assembler-times {\tld1rqb\tz[0-9]+\.b, } 3 } } */
+/* { dg-final { scan-assembler-times {\tld1rqd\tz[0-9]+\.d, } 3 } } */
 /* { dg-final { scan-assembler-not {\tzip1\t} } } */
 /* { dg-final { scan-assembler-not {\tzip2\t} } } */
 
Index: gcc/testsuite/gcc.target/aarch64/sve/slp_3.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/slp_3.c	2018-01-13 17:58:43.651957575 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/slp_3.c	2018-01-26 13:49:00.071630173 +0000
@@ -36,7 +36,7 @@ TEST_ALL (VEC_PERM)
 /* 1 for each 16-bit type and 4 for double.  */
 /* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 7 } } */
 /* 1 for each 32-bit type.  */
-/* { dg-final { scan-assembler-times {\tld1rqb\tz[0-9]+\.b, } 3 } } */
+/* { dg-final { scan-assembler-times {\tld1rqw\tz[0-9]+\.s, } 3 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #41\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #25\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #31\n} 2 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/slp_4.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/slp_4.c	2018-01-13 17:58:43.651957575 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/slp_4.c	2018-01-26 13:49:00.072630137 +0000
@@ -38,7 +38,7 @@ TEST_ALL (VEC_PERM)
 /* 1 for each 8-bit type, 4 for each 32-bit type and 8 for double.  */
 /* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 22 } } */
 /* 1 for each 16-bit type.  */
-/* { dg-final { scan-assembler-times {\tld1rqb\tz[0-9]\.b, } 3 } } */
+/* { dg-final { scan-assembler-times {\tld1rqh\tz[0-9]\.h, } 3 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #99\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #11\n} 2 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #17\n} 2 } } */