From patchwork Fri Jan 26 13:54:42 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 125981
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com,
 james.greenhalgh@arm.com, marcus.shawcroft@arm.com,
 richard.sandiford@linaro.org
Cc: richard.earnshaw@arm.com, james.greenhalgh@arm.com,
 marcus.shawcroft@arm.com
Subject: [AArch64] Prefer LD1RQ for big-endian SVE
Date: Fri, 26 Jan 2018 13:54:42 +0000
Message-ID: <87tvv8ohct.fsf@linaro.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)

This patch deals with cases in which a CONST_VECTOR contains a
repeating bit pattern that is wider than one element but narrower
than 128 bits.  The current code:

* treats the repeating pattern as a single element
* uses the associated LD1R to load and replicate it (such as LD1RD
  for 64-bit patterns)
* uses a subreg to cast the result back to the original vector type

The problem is that for big-endian targets, the final cast is
effectively a form of element reverse.  E.g. say we're using LD1RD
to load 16-bit elements, with h being the high parts and l being
the low parts:

                         +-----+-----+-----+-----+-----+----
lanes                    |  0  |  1  |  2  |  3  |  4  | ...
                         +-----+-----+-----+-----+-----+----
memory bytes             |h0 l0 h1 l1 h2 l2 h3 l3 h0 l0 ....
                         +----------------------------------
                          V  V  V  V  V  V  V  V
               ----------+-----------------------+
register            .... |           0           |
after          ----------+-----------------------+  lsb
LD1RD          .... h3 l3 h0 l0 h1 l1 h2 l2 h3 l3|
               ----------------------------------+

               ----+-----+-----+-----+-----+-----+
expected       ... |  4  |  3  |  2  |  1  |  0  |
register       ----+-----+-----+-----+-----+-----+  lsb
contents       .... h0 l0 h3 l3 h2 l2 h1 l1 h0 l0|
               ----------------------------------+

A later patch fixes the handling of general subregs to account for
this, but it means that we need to do a REV instruction after the load.
It seems better to use LD1RQ[BHW] on a 128-bit pattern instead, since
that gets the endianness right without a separate fixup instruction.

This is another step towards fixing sve/slp_* for aarch64_be.

Tested on aarch64_be-elf and aarch64-linux-gnu.  OK to install?

Richard


2018-01-26  Richard Sandiford

gcc/
	* config/aarch64/aarch64.c (aarch64_expand_sve_const_vector): Prefer
	the TImode handling for big-endian targets.

gcc/testsuite/
	* gcc.target/aarch64/sve/slp_2.c: Expect LD1RQ to be used instead
	of LD1R[HWD] for multi-element constants on big-endian targets.
	* gcc.target/aarch64/sve/slp_3.c: Likewise.
	* gcc.target/aarch64/sve/slp_4.c: Likewise.

Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c 2018-01-26 13:49:00.071630173 +0000
+++ gcc/config/aarch64/aarch64.c 2018-01-26 13:51:19.665760175 +0000
@@ -2824,10 +2824,18 @@ aarch64_expand_sve_const_vector (rtx des
   /* The constant is a repeating seqeuence of at least two elements,
      where the repeating elements occupy no more than 128 bits.
      Get an integer representation of the replicated value.  */
-  unsigned int int_bits = GET_MODE_UNIT_BITSIZE (mode) * npatterns;
-  gcc_assert (int_bits <= 128);
-
-  scalar_int_mode int_mode = int_mode_for_size (int_bits, 0).require ();
+  scalar_int_mode int_mode;
+  if (BYTES_BIG_ENDIAN)
+    /* For now, always use LD1RQ to load the value on big-endian
+       targets, since the handling of smaller integers includes a
+       subreg that is semantically an element reverse.  */
+    int_mode = TImode;
+  else
+    {
+      unsigned int int_bits = GET_MODE_UNIT_BITSIZE (mode) * npatterns;
+      gcc_assert (int_bits <= 128);
+      int_mode = int_mode_for_size (int_bits, 0).require ();
+    }
   rtx int_value = simplify_gen_subreg (int_mode, src, mode, 0);
   if (int_value
       && aarch64_expand_sve_widened_duplicate (dest, int_mode, int_value))
Index: gcc/testsuite/gcc.target/aarch64/sve/slp_2.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/slp_2.c 2018-01-26 13:49:00.071630173 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/slp_2.c 2018-01-26 13:51:19.665760175 +0000
@@ -29,9 +29,12 @@ #define TEST_ALL(T) \
 
 TEST_ALL (VEC_PERM)
 
-/* { dg-final { scan-assembler-times {\tld1rh\tz[0-9]+\.h, } 2 } } */
-/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 3 } } */
-/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 3 } } */
+/* { dg-final { scan-assembler-times {\tld1rh\tz[0-9]+\.h, } 2 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqb\tz[0-9]+\.b, } 2 { target aarch64_big_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 3 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqh\tz[0-9]+\.h, } 3 { target aarch64_big_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 3 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqw\tz[0-9]+\.s, } 3 { target aarch64_big_endian } } } */
 /* { dg-final { scan-assembler-times {\tld1rqd\tz[0-9]+\.d, } 3 } } */
 /* { dg-final { scan-assembler-not {\tzip1\t} } } */
 /* { dg-final { scan-assembler-not {\tzip2\t} } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/slp_3.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/slp_3.c 2018-01-26 13:49:00.071630173 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/slp_3.c 2018-01-26 13:51:19.665760175 +0000
@@ -32,9 +32,12 @@ #define TEST_ALL(T) \
 TEST_ALL (VEC_PERM)
 
 /* 1 for each 8-bit type.  */
-/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 2 } } */
+/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 2 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqb\tz[0-9]+\.b, } 2 { target aarch64_big_endian } } } */
 /* 1 for each 16-bit type and 4 for double.  */
-/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 7 } } */
+/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 7 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqh\tz[0-9]+\.h, } 3 { target aarch64_big_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 4 { target aarch64_big_endian } } } */
 /* 1 for each 32-bit type.  */
 /* { dg-final { scan-assembler-times {\tld1rqw\tz[0-9]+\.s, } 3 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #41\n} 2 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/slp_4.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/slp_4.c 2018-01-26 13:49:00.072630137 +0000
+++ gcc/testsuite/gcc.target/aarch64/sve/slp_4.c 2018-01-26 13:51:19.665760175 +0000
@@ -36,7 +36,9 @@ #define TEST_ALL(T) \
 TEST_ALL (VEC_PERM)
 
 /* 1 for each 8-bit type, 4 for each 32-bit type and 8 for double.  */
-/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 22 } } */
+/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 22 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqb\tz[0-9]+\.b, } 2 { target aarch64_big_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rd\tz[0-9]+\.d, } 20 { target aarch64_big_endian } } } */
 /* 1 for each 16-bit type.  */
 /* { dg-final { scan-assembler-times {\tld1rqh\tz[0-9]\.h, } 3 } } */
 /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, #99\n} 2 } } */