From patchwork Tue Dec 11 11:27:32 2018
X-Patchwork-Submitter: "Richard Earnshaw \(lists\)"
X-Patchwork-Id: 153449
To: gcc-patches
From: "Richard Earnshaw (lists)"
Subject: [aarch64] PR target/87369 Prefer bsl/bit/bif for copysign
Message-ID: <39893456-e4ec-1a0b-0bed-9255917bfa41@arm.com>
Date: Tue, 11 Dec 2018 11:27:32 +0000

The copysign operations will almost always be performed on values in
floating-point registers.  As such, we do not want the compiler to
simplify the operations into code sequences that can only be done
using the general-purpose register set.  Unfortunately, this is what
is currently happening.

Fortunately, it seems quite unlikely that copysign() will be
subsequently followed by other logical operations on the values
involved, so I think it is acceptable to use an unspec here.  This
allows us to preserve the operation in a form that allows the register
allocator to make the right choice later on, without limitation on the
final form of the operation (well, if we do end up using the gp
register bank, we get a dead constant load that we cannot easily
eliminate at a late stage).
        PR target/87369
        * config/aarch64/iterators.md (sizem1): Add sizes for SFmode and DFmode.
        (Vbtype): Add SFmode mapping.
        * config/aarch64/aarch64.md (copysigndf3, copysignsf3): Delete.
        (copysign<GPF:mode>3): New expand pattern.
        (copysign<GPF:mode>3_insn): New insn pattern.

Applied to trunk

R.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 82af4d47f78..6657316c5dd 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -222,6 +222,7 @@ (define_c_enum "unspec" [
     UNSPEC_FADDA
     UNSPEC_REV_SUBREG
     UNSPEC_SPECULATION_TRACKER
+    UNSPEC_COPYSIGN
 ])
 
 (define_c_enum "unspecv" [
@@ -5987,49 +5988,47 @@ (define_expand "lrint<GPF:mode><GPI:mode>2"
 ;;   LDR d2, #(1 << 63)
 ;;   BSL v2.8b, [y], [x]
 ;;
-;; or another, equivalent, sequence using one of BSL/BIT/BIF.
-;; aarch64_simd_bsldf will select the best suited of these instructions
-;; to generate based on register allocation, and knows how to partially
-;; constant fold based on the values of X and Y, so expand through that.
-
-(define_expand "copysigndf3"
-  [(match_operand:DF 0 "register_operand")
-   (match_operand:DF 1 "register_operand")
-   (match_operand:DF 2 "register_operand")]
+;; or another, equivalent, sequence using one of BSL/BIT/BIF.  Because
+;; we expect these operations to nearly always operate on
+;; floating-point values, we do not want the operation to be
+;; simplified into a bit-field insert operation that operates on the
+;; integer side, since typically that would involve three inter-bank
+;; register copies.  As we do not expect copysign to be followed by
+;; other logical operations on the result, it seems preferable to keep
+;; this as an unspec operation, rather than exposing the underlying
+;; logic to the compiler.
+
+(define_expand "copysign<GPF:mode>3"
+  [(match_operand:GPF 0 "register_operand")
+   (match_operand:GPF 1 "register_operand")
+   (match_operand:GPF 2 "register_operand")]
   "TARGET_FLOAT && TARGET_SIMD"
 {
-  rtx mask = gen_reg_rtx (DImode);
-  emit_move_insn (mask, GEN_INT (HOST_WIDE_INT_1U << 63));
-  emit_insn (gen_aarch64_simd_bsldf (operands[0], mask,
-                                     operands[2], operands[1]));
+  rtx bitmask = gen_reg_rtx (<V_INT_EQUIV>mode);
+  emit_move_insn (bitmask, GEN_INT (HOST_WIDE_INT_M1U
+                                    << (GET_MODE_BITSIZE (<MODE>mode) - 1)));
+  emit_insn (gen_copysign<mode>3_insn (operands[0], operands[1], operands[2],
+                                       bitmask));
   DONE;
 }
 )
 
-;; As above, but we must first get to a 64-bit value if we wish to use
-;; aarch64_simd_bslv2sf.
-
-(define_expand "copysignsf3"
-  [(match_operand:SF 0 "register_operand")
-   (match_operand:SF 1 "register_operand")
-   (match_operand:SF 2 "register_operand")]
+(define_insn "copysign<GPF:mode>3_insn"
+  [(set (match_operand:GPF 0 "register_operand" "=w,w,w,r")
+        (unspec:GPF [(match_operand:GPF 1 "register_operand" "w,0,w,r")
+                     (match_operand:GPF 2 "register_operand" "w,w,0,0")
+                     (match_operand:<V_INT_EQUIV> 3 "register_operand" "0,w,w,X")]
+         UNSPEC_COPYSIGN))]
   "TARGET_FLOAT && TARGET_SIMD"
-{
-  rtx v_bitmask = gen_reg_rtx (V2SImode);
-
-  /* Juggle modes to get us in to a vector mode for BSL.  */
-  rtx op1 = lowpart_subreg (DImode, operands[1], SFmode);
-  rtx op2 = lowpart_subreg (V2SFmode, operands[2], SFmode);
-  rtx tmp = gen_reg_rtx (V2SFmode);
-  emit_move_insn (v_bitmask,
-                  aarch64_simd_gen_const_vector_dup (V2SImode,
-                                                     HOST_WIDE_INT_M1U << 31));
-  emit_insn (gen_aarch64_simd_bslv2sf (tmp, v_bitmask, op2, op1));
-  emit_move_insn (operands[0], lowpart_subreg (SFmode, tmp, V2SFmode));
-  DONE;
-}
+  "@
+   bsl\\t%0.<Vbtype>, %2.<Vbtype>, %1.<Vbtype>
+   bit\\t%0.<Vbtype>, %2.<Vbtype>, %3.<Vbtype>
+   bif\\t%0.<Vbtype>, %1.<Vbtype>, %3.<Vbtype>
+   bfxil\\t%<w1>0, %<w1>1, #0, <sizem1>"
+  [(set_attr "type" "neon_bsl,neon_bsl,neon_bsl,bfm")]
 )
 
+
 ;; For xorsign (x, y), we want to generate:
 ;;
 ;;   LDR d2, #1<<63
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index a80755734d6..ae75666167d 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -601,7 +601,8 @@ (define_mode_attr size [(QI "b") (HI "h") (SI "w")])
 (define_mode_attr sizen [(QI "8") (HI "16") (SI "32") (DI "64")])
 
 ;; Give the ordinal of the MSB in the mode
-(define_mode_attr sizem1 [(QI "#7") (HI "#15") (SI "#31") (DI "#63")])
+(define_mode_attr sizem1 [(QI "#7") (HI "#15") (SI "#31") (DI "#63")
+                          (HF "#15") (SF "#31") (DF "#63")])
 
 ;; Attribute to describe constants acceptable in logical operations
 (define_mode_attr lconst [(SI "K") (DI "L")])
@@ -687,7 +688,7 @@ (define_mode_attr Vbtype [(V8QI "8b") (V16QI "16b")
                           (V8HF "16b") (V2SF "8b")
                           (V4SF "16b") (V2DF "16b")
                           (DI "8b") (DF "8b")
-                          (SI "8b")])
+                          (SI "8b") (SF "8b")])
 
 ;; Define element mode for each vector mode.
 (define_mode_attr VEL [(V8QI "QI") (V16QI "QI") (VNx16QI "QI")