From patchwork Wed Jul 12 16:33:42 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 107525
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org
Subject: [rs6000] Avoid rotates of floating-point modes
Date: Wed, 12 Jul 2017 17:33:42 +0100
Message-ID: <87wp7dsjh5.fsf@linaro.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

The little-endian VSX code uses rotates to swap the two 64-bit halves
of 128-bit scalar modes.  This is fine for TImode and V1TImode, but it
isn't really valid to use RTL rotates on floating-point modes like
KFmode and TFmode, and doing that triggered an assert added by the
SVE series.

This patch uses bit-casts to V1TImode instead.

Tested on powerpc64le-linux-gnu.  OK to install?

Richard


2017-07-12  Richard Sandiford

gcc/
	* config/rs6000/rs6000-protos.h (rs6000_emit_le_vsx_permute): Declare.
	* config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): Replace with...
	(rs6000_emit_le_vsx_permute): ...this.  Take the destination as input.
	Emit instructions rather than returning an expression.  Handle TFmode
	and KFmode by casting to V1TImode.
	(rs6000_emit_le_vsx_load): Update to use rs6000_emit_le_vsx_permute.
	(rs6000_emit_le_vsx_store): Likewise.
	* config/rs6000/vsx.md (VSX_LE_128I): New iterator.
	(*vsx_le_permute_): Use it instead of VSX_LE_128.
	(*vsx_le_undo_permute_): Likewise.
	(*vsx_le_perm_load_): Use rs6000_emit_le_vsx_permute to emit the
	split sequence.
	(*vsx_le_perm_store_): Likewise.

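For readers less familiar with the trick described above, here is a minimal
stand-alone sketch of the idea, not GCC code.  It assumes a hypothetical
struct u128 holding the two 64-bit halves and hypothetical helpers rotate64
and swap_halves_in_place; none of these names exist in GCC.  It shows that
rotating a 128-bit integer by 64 bits exchanges its halves, that doing so
twice restores the original value, and that an object of any other 128-bit
type can be handled by reinterpreting its bytes as that integer first, which
is roughly what the bit-cast to V1TImode does at the RTL level.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit integer value: two 64-bit halves.
   This is an illustration only, not a GCC type.  */
struct u128
{
  uint64_t hi;
  uint64_t lo;
};

/* Rotating a 128-bit integer by 64 bits simply exchanges its halves.  */
struct u128
rotate64 (struct u128 x)
{
  struct u128 r = { x.lo, x.hi };
  return r;
}

/* Swap the halves of an arbitrary 16-byte object (for example a 128-bit
   floating-point value) by reinterpreting its bytes as an integer,
   rotating the integer, and copying the bytes back.  The rotate is only
   ever applied to the integer view.  */
void
swap_halves_in_place (void *obj16)
{
  struct u128 bits;
  memcpy (&bits, obj16, sizeof bits);
  bits = rotate64 (bits);
  memcpy (obj16, &bits, sizeof bits);
}
```

Applying swap_halves_in_place twice leaves the object unchanged, which is
the property the *vsx_le_undo_permute_ pattern relies on.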
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h 2017-06-30 12:50:38.886633045 +0100
+++ gcc/config/rs6000/rs6000-protos.h 2017-07-12 16:30:38.728631839 +0100
@@ -151,6 +151,7 @@ extern rtx rs6000_longcall_ref (rtx);
 extern void rs6000_fatal_bad_address (rtx);
 extern rtx create_TOC_reference (rtx, rtx);
 extern void rs6000_split_multireg_move (rtx, rtx);
+extern void rs6000_emit_le_vsx_permute (rtx, rtx, machine_mode);
 extern void rs6000_emit_le_vsx_move (rtx, rtx, machine_mode);
 extern bool valid_sf_si_move (rtx, rtx, machine_mode);
 extern void rs6000_emit_move (rtx, rtx, machine_mode);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c 2017-07-08 11:37:45.740795846 +0100
+++ gcc/config/rs6000/rs6000.c 2017-07-12 16:30:38.732631678 +0100
@@ -10503,17 +10503,24 @@ rs6000_const_vec (machine_mode mode)
 /* Generate a permute rtx that represents an lxvd2x, stxvd2x, or xxpermdi
    for a VSX load or store operation.  */
 
-rtx
-rs6000_gen_le_vsx_permute (rtx source, machine_mode mode)
+void
+rs6000_emit_le_vsx_permute (rtx dest, rtx source, machine_mode mode)
 {
   /* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point, and
      128-bit integers if they are allowed in VSX registers.  */
-  if (FLOAT128_VECTOR_P (mode) || mode == TImode || mode == V1TImode)
-    return gen_rtx_ROTATE (mode, source, GEN_INT (64));
+  if (FLOAT128_VECTOR_P (mode))
+    {
+      dest = gen_lowpart (V1TImode, dest);
+      source = gen_lowpart (V1TImode, source);
+      mode = V1TImode;
+    }
+  if (mode == TImode || mode == V1TImode)
+    emit_insn (gen_rtx_SET (dest, gen_rtx_ROTATE (mode, source,
+                                                  GEN_INT (64))));
   else
     {
       rtx par = gen_rtx_PARALLEL (VOIDmode, rs6000_const_vec (mode));
-      return gen_rtx_VEC_SELECT (mode, source, par);
+      emit_insn (gen_rtx_SET (dest, gen_rtx_VEC_SELECT (mode, source, par)));
     }
 }
 
@@ -10523,8 +10530,6 @@ rs6000_gen_le_vsx_permute (rtx source, m
 void
 rs6000_emit_le_vsx_load (rtx dest, rtx source, machine_mode mode)
 {
-  rtx tmp, permute_mem, permute_reg;
-
   /* Use V2DImode to do swaps of types with 128-bit scalare parts (TImode,
      V1TImode).  */
   if (mode == TImode || mode == V1TImode)
@@ -10534,11 +10539,9 @@ rs6000_emit_le_vsx_load (rtx dest, rtx s
       source = adjust_address (source, V2DImode, 0);
     }
 
-  tmp = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (dest) : dest;
-  permute_mem = rs6000_gen_le_vsx_permute (source, mode);
-  permute_reg = rs6000_gen_le_vsx_permute (tmp, mode);
-  emit_insn (gen_rtx_SET (tmp, permute_mem));
-  emit_insn (gen_rtx_SET (dest, permute_reg));
+  rtx tmp = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (dest) : dest;
+  rs6000_emit_le_vsx_permute (tmp, source, mode);
+  rs6000_emit_le_vsx_permute (dest, tmp, mode);
 }
 
 /* Emit a little-endian store to vector memory location DEST from VSX
@@ -10547,8 +10550,6 @@ rs6000_emit_le_vsx_load (rtx dest, rtx s
 void
 rs6000_emit_le_vsx_store (rtx dest, rtx source, machine_mode mode)
 {
-  rtx tmp, permute_src, permute_tmp;
-
   /* This should never be called during or after reload, because it does
      not re-permute the source register.  It is intended only for use
      during expand.  */
@@ -10563,11 +10564,9 @@ rs6000_emit_le_vsx_store (rtx dest, rtx
       source = gen_lowpart (V2DImode, source);
     }
 
-  tmp = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (source) : source;
-  permute_src = rs6000_gen_le_vsx_permute (source, mode);
-  permute_tmp = rs6000_gen_le_vsx_permute (tmp, mode);
-  emit_insn (gen_rtx_SET (tmp, permute_src));
-  emit_insn (gen_rtx_SET (dest, permute_tmp));
+  rtx tmp = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (source) : source;
+  rs6000_emit_le_vsx_permute (tmp, source, mode);
+  rs6000_emit_le_vsx_permute (dest, tmp, mode);
 }
 
 /* Emit a sequence representing a little-endian VSX load or store,
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md 2017-06-30 12:50:38.889632907 +0100
+++ gcc/config/rs6000/vsx.md 2017-07-12 16:30:38.734631598 +0100
@@ -37,6 +37,10 @@ (define_mode_iterator VSX_LE_128 [(KF
                                   (TI "TARGET_VSX_TIMODE")
                                   V1TI])
 
+;; Same, but with just the integer modes.
+(define_mode_iterator VSX_LE_128I [(TI "TARGET_VSX_TIMODE")
+                                   V1TI])
+
 ;; Iterator for the 2 32-bit vector types
 (define_mode_iterator VSX_W [V4SF V4SI])
 
@@ -750,9 +754,9 @@ (define_split
 ;; special V1TI container class, which it is not appropriate to use vec_select
 ;; for the type.
 (define_insn "*vsx_le_permute_"
-  [(set (match_operand:VSX_LE_128 0 "nonimmediate_operand" "=,,Z")
-        (rotate:VSX_LE_128
-         (match_operand:VSX_LE_128 1 "input_operand" ",Z,")
+  [(set (match_operand:VSX_LE_128I 0 "nonimmediate_operand" "=,,Z")
+        (rotate:VSX_LE_128I
+         (match_operand:VSX_LE_128I 1 "input_operand" ",Z,")
          (const_int 64)))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "@
@@ -763,10 +767,10 @@ (define_insn "*vsx_le_permute_"
    (set_attr "type" "vecperm,vecload,vecstore")])
 
 (define_insn_and_split "*vsx_le_undo_permute_"
-  [(set (match_operand:VSX_LE_128 0 "vsx_register_operand" "=,")
-        (rotate:VSX_LE_128
-         (rotate:VSX_LE_128
-          (match_operand:VSX_LE_128 1 "vsx_register_operand" "0,")
+  [(set (match_operand:VSX_LE_128I 0 "vsx_register_operand" "=,")
+        (rotate:VSX_LE_128I
+         (rotate:VSX_LE_128I
+          (match_operand:VSX_LE_128I 1 "vsx_register_operand" "0,")
           (const_int 64))
          (const_int 64)))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX"
@@ -791,16 +795,15 @@ (define_insn_and_split "*vsx_le_perm_loa
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
-  [(set (match_dup 2)
-        (rotate:VSX_LE_128 (match_dup 1)
-                           (const_int 64)))
-   (set (match_dup 0)
-        (rotate:VSX_LE_128 (match_dup 2)
-                           (const_int 64)))]
+  [(const_int 0)]
   "
 {
-  operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[0])
-                                       : operands[0];
+  rtx tmp = (can_create_pseudo_p ()
+             ? gen_reg_rtx_and_attrs (operands[0])
+             : operands[0]);
+  rs6000_emit_le_vsx_permute (tmp, operands[1], mode);
+  rs6000_emit_le_vsx_permute (operands[0], tmp, mode);
+  DONE;
 }
   "
   [(set_attr "type" "vecload")
@@ -818,15 +821,14 @@ (define_split
   [(set (match_operand:VSX_LE_128 0 "memory_operand" "")
         (match_operand:VSX_LE_128 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed && !TARGET_P9_VECTOR"
-  [(set (match_dup 2)
-        (rotate:VSX_LE_128 (match_dup 1)
-                           (const_int 64)))
-   (set (match_dup 0)
-        (rotate:VSX_LE_128 (match_dup 2)
-                           (const_int 64)))]
+  [(const_int 0)]
 {
-  operands[2] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[0])
-                                       : operands[0];
+  rtx tmp = (can_create_pseudo_p ()
+             ? gen_reg_rtx_and_attrs (operands[0])
+             : operands[0]);
+  rs6000_emit_le_vsx_permute (tmp, operands[1], mode);
+  rs6000_emit_le_vsx_permute (operands[0], tmp, mode);
+  DONE;
 })
 
 ;; Peephole to catch memory to memory transfers for TImode if TImode landed in
@@ -850,16 +852,13 @@ (define_split
   [(set (match_operand:VSX_LE_128 0 "memory_operand" "")
         (match_operand:VSX_LE_128 1 "vsx_register_operand" ""))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR"
-  [(set (match_dup 1)
-        (rotate:VSX_LE_128 (match_dup 1)
-                           (const_int 64)))
-   (set (match_dup 0)
-        (rotate:VSX_LE_128 (match_dup 1)
-                           (const_int 64)))
-   (set (match_dup 1)
-        (rotate:VSX_LE_128 (match_dup 1)
-                           (const_int 64)))]
-  "")
+  [(const_int 0)]
+{
+  rs6000_emit_le_vsx_permute (operands[1], operands[1], mode);
+  rs6000_emit_le_vsx_permute (operands[0], operands[1], mode);
+  rs6000_emit_le_vsx_permute (operands[1], operands[1], mode);
+  DONE;
+})
 
 ;; Vector constants that can be generated with XXSPLTIB that was added in ISA
 ;; 3.0.  Both (const_vector [..]) and (vec_duplicate ...) forms are recognized.