From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org
Subject: [committed][AArch64] Use UNSPEC_MERGE_PTRUE for comparisons
Date: Tue, 08 May 2018 10:56:53 +0100
Message-ID: <87bmdqsc62.fsf@linaro.org>

This patch rewrites the SVE comparison handling so that it uses
UNSPEC_MERGE_PTRUE for comparisons that are known to be predicated
on a PTRUE, for consistency with other patterns.  Specific unspecs
are then only needed for truly predicated floating-point comparisons,
such as those used in the expansion of UNEQ for flag_trapping_math.

The patch also makes sure that the comparison expanders attach
a REG_EQUAL note to instructions that use UNSPEC_MERGE_PTRUE, so that
passes can use the note as an alternative to the unspec pattern.
(This happens automatically for optabs; the problem was that this
code emits instruction patterns directly.)

There is no specific benefit on its own, but it lays the groundwork
for the next patch.

Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
Applied as r260029.
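To make the shape change concrete, here is a rough before/after sketch
of the rtl (schematic only, with operands abbreviated; it is not lifted
verbatim from the compiler).  An unsigned "lower than" comparison that
used to be described with a comparison-specific unspec:

    (set pred (unspec [ptrue op0 op1] UNSPEC_COND_LO))

is now described as an ordinary rtl comparison wrapped in the generic
merging unspec:

    (set pred (unspec [ptrue (ltu op0 op1)] UNSPEC_MERGE_PTRUE))

with a REG_EQUAL note of (ltu op0 op1) attached, so that passes that
don't look inside the unspec can still see what the instruction
computes.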
Richard


2018-05-08  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* config/aarch64/iterators.md (UNSPEC_COND_LO, UNSPEC_COND_LS)
	(UNSPEC_COND_HI, UNSPEC_COND_HS, UNSPEC_COND_UO): Delete.
	(SVE_INT_CMP, SVE_FP_CMP): New code iterators.
	(cmp_op, sve_imm_con): New code attributes.
	(SVE_COND_INT_CMP, imm_con): Delete.
	(cmp_op): Remove above unspecs from int attribute.
	* config/aarch64/aarch64-sve.md (*vec_cmp<cmp_op>_<mode>): Rename
	to...
	(*cmp<cmp_op><mode>): ...this.  Use UNSPEC_MERGE_PTRUE instead of
	comparison-specific unspecs.
	(*vec_cmp<cmp_op>_<mode>_ptest): Rename to...
	(*cmp<cmp_op><mode>_ptest): ...this and adjust likewise.
	(*vec_cmp<cmp_op>_<mode>_cc): Rename to...
	(*cmp<cmp_op><mode>_cc): ...this and adjust likewise.
	(*vec_fcm<cmp_op><mode>): Rename to...
	(*fcm<cmp_op><mode>): ...this and adjust likewise.
	(*vec_fcmuo<mode>): Rename to...
	(*fcmuo<mode>): ...this and adjust likewise.
	(*pred_fcm<cmp_op><mode>): New pattern.
	* config/aarch64/aarch64.c (aarch64_emit_unop, aarch64_emit_binop)
	(aarch64_emit_sve_ptrue_op, aarch64_emit_sve_ptrue_op_cc): New
	functions.
	(aarch64_unspec_cond_code): Remove handling of LTU, GTU, LEU, GEU
	and UNORDERED.
	(aarch64_gen_unspec_cond, aarch64_emit_unspec_cond): Delete.
	(aarch64_emit_sve_predicated_cond): New function.
	(aarch64_expand_sve_vec_cmp_int): Use aarch64_emit_sve_ptrue_op_cc.
	(aarch64_emit_unspec_cond_or): Replace with...
	(aarch64_emit_sve_or_conds): ...this new function.  Use
	aarch64_emit_sve_ptrue_op for the individual comparisons and
	aarch64_emit_binop to OR them together.
	(aarch64_emit_inverted_unspec_cond): Replace with...
	(aarch64_emit_sve_inverted_cond): ...this new function.  Use
	aarch64_emit_sve_ptrue_op for the comparison and aarch64_emit_unop
	to invert the result.
	(aarch64_expand_sve_vec_cmp_float): Update after the above changes.
	Use aarch64_emit_sve_ptrue_op for native comparisons.

Index: gcc/config/aarch64/iterators.md
===================================================================
--- gcc/config/aarch64/iterators.md	2018-05-01 19:31:04.341265575 +0100
+++ gcc/config/aarch64/iterators.md	2018-05-08 10:51:17.070995242 +0100
@@ -455,11 +455,6 @@ (define_c_enum "unspec"
     UNSPEC_COND_NE	; Used in aarch64-sve.md.
     UNSPEC_COND_GE	; Used in aarch64-sve.md.
     UNSPEC_COND_GT	; Used in aarch64-sve.md.
-    UNSPEC_COND_LO	; Used in aarch64-sve.md.
-    UNSPEC_COND_LS	; Used in aarch64-sve.md.
-    UNSPEC_COND_HS	; Used in aarch64-sve.md.
-    UNSPEC_COND_HI	; Used in aarch64-sve.md.
-    UNSPEC_COND_UO	; Used in aarch64-sve.md.
     UNSPEC_LASTB	; Used in aarch64-sve.md.
 ])
 
@@ -1189,6 +1184,12 @@ (define_code_iterator SVE_INT_UNARY [neg
 ;; SVE floating-point unary operations.
 (define_code_iterator SVE_FP_UNARY [neg abs sqrt])
 
+;; SVE integer comparisons.
+(define_code_iterator SVE_INT_CMP [lt le eq ne ge gt ltu leu geu gtu])
+
+;; SVE floating-point comparisons.
+(define_code_iterator SVE_FP_CMP [lt le eq ne ge gt])
+
 ;; -------------------------------------------------------------------
 ;; Code Attributes
 ;; -------------------------------------------------------------------
@@ -1252,6 +1253,18 @@ (define_code_attr CMP [(lt "LT") (le "LE
 			(ltu "LTU") (leu "LEU") (ne "NE") (geu "GEU") (gtu "GTU")])
 
+;; The AArch64 condition associated with an rtl comparison code.
+(define_code_attr cmp_op [(lt "lt")
+			  (le "le")
+			  (eq "eq")
+			  (ne "ne")
+			  (ge "ge")
+			  (gt "gt")
+			  (ltu "lo")
+			  (leu "ls")
+			  (geu "hs")
+			  (gtu "hi")])
+
 (define_code_attr fix_trunc_optab [(fix "fix_trunc")
 				   (unsigned_fix "fixuns_trunc")])
 
@@ -1358,6 +1371,18 @@ (define_code_attr sve_fp_op [(plus "fadd
 			     (abs "fabs")
 			     (sqrt "fsqrt")])
 
+;; The SVE immediate constraint to use for an rtl code.
+(define_code_attr sve_imm_con [(eq "vsc")
+			       (ne "vsc")
+			       (lt "vsc")
+			       (ge "vsc")
+			       (le "vsc")
+			       (gt "vsc")
+			       (ltu "vsd")
+			       (leu "vsd")
+			       (geu "vsd")
+			       (gtu "vsd")])
+
 ;; -------------------------------------------------------------------
 ;; Int Iterators.
 ;; -------------------------------------------------------------------
@@ -1483,12 +1508,6 @@ (define_int_iterator SVE_COND_INT_OP [UN
 (define_int_iterator SVE_COND_FP_OP [UNSPEC_COND_ADD UNSPEC_COND_SUB])
 
-(define_int_iterator SVE_COND_INT_CMP [UNSPEC_COND_LT UNSPEC_COND_LE
-				       UNSPEC_COND_EQ UNSPEC_COND_NE
-				       UNSPEC_COND_GE UNSPEC_COND_GT
-				       UNSPEC_COND_LO UNSPEC_COND_LS
-				       UNSPEC_COND_HS UNSPEC_COND_HI])
-
 (define_int_iterator SVE_COND_FP_CMP [UNSPEC_COND_LT UNSPEC_COND_LE
 				      UNSPEC_COND_EQ UNSPEC_COND_NE
 				      UNSPEC_COND_GE UNSPEC_COND_GT])
@@ -1730,23 +1749,7 @@ (define_int_attr cmp_op [(UNSPEC_COND_LT
 			 (UNSPEC_COND_EQ "eq")
 			 (UNSPEC_COND_NE "ne")
 			 (UNSPEC_COND_GE "ge")
-			 (UNSPEC_COND_GT "gt")
-			 (UNSPEC_COND_LO "lo")
-			 (UNSPEC_COND_LS "ls")
-			 (UNSPEC_COND_HS "hs")
-			 (UNSPEC_COND_HI "hi")])
-
-;; The constraint to use for an UNSPEC_COND_<cmp_op>.
-(define_int_attr imm_con [(UNSPEC_COND_EQ "vsc")
-			  (UNSPEC_COND_NE "vsc")
-			  (UNSPEC_COND_LT "vsc")
-			  (UNSPEC_COND_GE "vsc")
-			  (UNSPEC_COND_LE "vsc")
-			  (UNSPEC_COND_GT "vsc")
-			  (UNSPEC_COND_LO "vsd")
-			  (UNSPEC_COND_LS "vsd")
-			  (UNSPEC_COND_HS "vsd")
-			  (UNSPEC_COND_HI "vsd")])
+			 (UNSPEC_COND_GT "gt")])
 
 (define_int_attr sve_int_op [(UNSPEC_COND_ADD "add")
 			     (UNSPEC_COND_SUB "sub")
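As an illustration of how the new code attributes are used (a
hypothetical instantiation, not part of the patch itself):
instantiating *cmp<cmp_op><mode> for gtu gives <cmp_op> = "hi" and
<sve_imm_con> = "vsd", so the immediate alternative accepts an
unsigned comparison immediate and the pattern would emit something
along the lines of:

    cmphi   p0.s, p1/z, z2.s, #7

for a 32-bit element mode.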
+(define_insn "*cmp" [(set (match_operand: 0 "register_operand" "=Upa, Upa") (unspec: [(match_operand: 1 "register_operand" "Upl, Upl") - (match_operand:SVE_I 2 "register_operand" "w, w") - (match_operand:SVE_I 3 "aarch64_sve_cmp__operand" ", w")] - SVE_COND_INT_CMP)) + (SVE_INT_CMP: + (match_operand:SVE_I 2 "register_operand" "w, w") + (match_operand:SVE_I 3 "aarch64_sve_cmp__operand" ", w"))] + UNSPEC_MERGE_PTRUE)) (clobber (reg:CC CC_REGNUM))] "TARGET_SVE" "@ @@ -1307,17 +1308,19 @@ (define_insn "*vec_cmp_" cmp\t%0., %1/z, %2., %3." ) -;; Predicated integer comparison in which only the flags result is interesting. -(define_insn "*vec_cmp__ptest" +;; Integer comparisons predicated with a PTRUE in which only the flags result +;; is interesting. +(define_insn "*cmp_ptest" [(set (reg:CC CC_REGNUM) (compare:CC (unspec:SI [(match_operand: 1 "register_operand" "Upl, Upl") (unspec: [(match_dup 1) - (match_operand:SVE_I 2 "register_operand" "w, w") - (match_operand:SVE_I 3 "aarch64_sve_cmp__operand" ", w")] - SVE_COND_INT_CMP)] + (SVE_INT_CMP: + (match_operand:SVE_I 2 "register_operand" "w, w") + (match_operand:SVE_I 3 "aarch64_sve_cmp__operand" ", w"))] + UNSPEC_MERGE_PTRUE)] UNSPEC_PTEST_PTRUE) (const_int 0))) (clobber (match_scratch: 0 "=Upa, Upa"))] @@ -1327,59 +1330,76 @@ (define_insn "*vec_cmp__pt cmp\t%0., %1/z, %2., %3." ) -;; Predicated comparison in which both the flag and predicate results -;; are interesting. -(define_insn "*vec_cmp__cc" +;; Integer comparisons predicated with a PTRUE in which both the flag and +;; predicate results are interesting. +(define_insn "*cmp_cc" [(set (reg:CC CC_REGNUM) (compare:CC (unspec:SI [(match_operand: 1 "register_operand" "Upl, Upl") (unspec: [(match_dup 1) - (match_operand:SVE_I 2 "register_operand" "w, w") - (match_operand:SVE_I 3 "aarch64_sve_cmp__operand" ", w")] - SVE_COND_INT_CMP)] + (SVE_INT_CMP: + (match_operand:SVE_I 2 "register_operand" "w, w") + (match_operand:SVE_I 3 "aarch64_sve_cmp__operand" ", w"))] + UNSPEC_MERGE_PTRUE)] UNSPEC_PTEST_PTRUE) (const_int 0))) (set (match_operand: 0 "register_operand" "=Upa, Upa") (unspec: [(match_dup 1) - (match_dup 2) - (match_dup 3)] - SVE_COND_INT_CMP))] + (SVE_INT_CMP: + (match_dup 2) + (match_dup 3))] + UNSPEC_MERGE_PTRUE))] "TARGET_SVE" "@ cmp\t%0., %1/z, %2., #%3 cmp\t%0., %1/z, %2., %3." ) -;; Predicated floating-point comparison (excluding FCMUO, which doesn't -;; allow #0.0 as an operand). -(define_insn "*vec_fcm" +;; Floating-point comparisons predicated with a PTRUE. +(define_insn "*fcm" [(set (match_operand: 0 "register_operand" "=Upa, Upa") (unspec: [(match_operand: 1 "register_operand" "Upl, Upl") - (match_operand:SVE_F 2 "register_operand" "w, w") - (match_operand:SVE_F 3 "aarch64_simd_reg_or_zero" "Dz, w")] - SVE_COND_FP_CMP))] + (SVE_FP_CMP: + (match_operand:SVE_F 2 "register_operand" "w, w") + (match_operand:SVE_F 3 "aarch64_simd_reg_or_zero" "Dz, w"))] + UNSPEC_MERGE_PTRUE))] "TARGET_SVE" "@ fcm\t%0., %1/z, %2., #0.0 fcm\t%0., %1/z, %2., %3." ) -;; Predicated FCMUO. -(define_insn "*vec_fcmuo" +(define_insn "*fcmuo" [(set (match_operand: 0 "register_operand" "=Upa") (unspec: [(match_operand: 1 "register_operand" "Upl") - (match_operand:SVE_F 2 "register_operand" "w") - (match_operand:SVE_F 3 "register_operand" "w")] - UNSPEC_COND_UO))] + (unordered: + (match_operand:SVE_F 2 "register_operand" "w") + (match_operand:SVE_F 3 "register_operand" "w"))] + UNSPEC_MERGE_PTRUE))] "TARGET_SVE" "fcmuo\t%0., %1/z, %2., %3." ) +;; Predicated floating-point comparisons. 
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2018-05-08 09:42:03.397652851 +0100
+++ gcc/config/aarch64/aarch64.c	2018-05-08 10:51:17.069995281 +0100
@@ -1873,6 +1873,27 @@ aarch64_emit_move (rtx dest, rtx src)
 	  : emit_move_insn_1 (dest, src));
 }
 
+/* Apply UNOPTAB to OP and store the result in DEST.  */
+
+static void
+aarch64_emit_unop (rtx dest, optab unoptab, rtx op)
+{
+  rtx tmp = expand_unop (GET_MODE (dest), unoptab, op, dest, 0);
+  if (dest != tmp)
+    emit_move_insn (dest, tmp);
+}
+
+/* Apply BINOPTAB to OP0 and OP1 and store the result in DEST.  */
+
+static void
+aarch64_emit_binop (rtx dest, optab binoptab, rtx op0, rtx op1)
+{
+  rtx tmp = expand_binop (GET_MODE (dest), binoptab, op0, op1, dest, 0,
+			  OPTAB_DIRECT);
+  if (dest != tmp)
+    emit_move_insn (dest, tmp);
+}
+
 /* Split a 128-bit move operation into two 64-bit move operations,
    taking care to handle partial overlap of register to register
    copies.  Special cases are needed when moving between GP regs and
@@ -15675,6 +15696,34 @@ aarch64_sve_cmp_operand_p (rtx_code op_c
     }
 }
 
+/* Use predicated SVE instructions to implement the equivalent of:
+
+     (set TARGET OP)
+
+   given that PTRUE is an all-true predicate of the appropriate mode.  */
+
+static void
+aarch64_emit_sve_ptrue_op (rtx target, rtx ptrue, rtx op)
+{
+  rtx unspec = gen_rtx_UNSPEC (GET_MODE (target),
+			       gen_rtvec (2, ptrue, op),
+			       UNSPEC_MERGE_PTRUE);
+  rtx_insn *insn = emit_set_insn (target, unspec);
+  set_unique_reg_note (insn, REG_EQUAL, copy_rtx (op));
+}
+
+/* Likewise, but also clobber the condition codes.  */
+
+static void
+aarch64_emit_sve_ptrue_op_cc (rtx target, rtx ptrue, rtx op)
+{
+  rtx unspec = gen_rtx_UNSPEC (GET_MODE (target),
+			       gen_rtvec (2, ptrue, op),
+			       UNSPEC_MERGE_PTRUE);
+  rtx_insn *insn = emit_insn (gen_set_clobber_cc (target, unspec));
+  set_unique_reg_note (insn, REG_EQUAL, copy_rtx (op));
+}
+
 /* Return the UNSPEC_COND_* code for comparison CODE.  */
 
 static unsigned int
@@ -15694,35 +15743,33 @@ aarch64_unspec_cond_code (rtx_code code)
       return UNSPEC_COND_LE;
     case GE:
       return UNSPEC_COND_GE;
-    case LTU:
-      return UNSPEC_COND_LO;
-    case GTU:
-      return UNSPEC_COND_HI;
-    case LEU:
-      return UNSPEC_COND_LS;
-    case GEU:
-      return UNSPEC_COND_HS;
-    case UNORDERED:
-      return UNSPEC_COND_UO;
     default:
       gcc_unreachable ();
     }
 }
 
-/* Return an (unspec:PRED_MODE [PRED OP0 OP1] UNSPEC_COND_<X>) expression,
-   where <X> is the operation associated with comparison CODE.  */
+/* Emit:
 
-static rtx
-aarch64_gen_unspec_cond (rtx_code code, machine_mode pred_mode,
-			 rtx pred, rtx op0, rtx op1)
+      (set TARGET (unspec [PRED OP0 OP1] UNSPEC_COND_<X>))
+
+   where <X> is the operation associated with comparison CODE.  This form
+   of instruction is used when (and (CODE OP0 OP1) PRED) would have different
+   semantics, such as when PRED might not be all-true and when comparing
+   inactive lanes could have side effects.  */
+
+static void
+aarch64_emit_sve_predicated_cond (rtx target, rtx_code code,
+				  rtx pred, rtx op0, rtx op1)
 {
-  rtvec vec = gen_rtvec (3, pred, op0, op1);
-  return gen_rtx_UNSPEC (pred_mode, vec, aarch64_unspec_cond_code (code));
+  rtx unspec = gen_rtx_UNSPEC (GET_MODE (pred),
+			       gen_rtvec (3, pred, op0, op1),
+			       aarch64_unspec_cond_code (code));
+  emit_set_insn (target, unspec);
 }
 
-/* Expand an SVE integer comparison:
+/* Expand an SVE integer comparison using the SVE equivalent of:
 
-     TARGET = CODE (OP0, OP1).  */
+     (set TARGET (CODE OP0 OP1)).  */
 
 void
 aarch64_expand_sve_vec_cmp_int (rtx target, rtx_code code, rtx op0, rtx op1)
@@ -15734,78 +15781,53 @@ aarch64_expand_sve_vec_cmp_int (rtx targ
     op1 = force_reg (data_mode, op1);
 
   rtx ptrue = force_reg (pred_mode, CONSTM1_RTX (pred_mode));
-  rtx unspec = aarch64_gen_unspec_cond (code, pred_mode, ptrue, op0, op1);
-  emit_insn (gen_set_clobber_cc (target, unspec));
-}
-
-/* Emit an instruction:
-
-      (set TARGET (unspec:PRED_MODE [PRED OP0 OP1] UNSPEC_COND_<X>))
-
-   where <X> is the operation associated with comparison CODE.  */
-
-static void
-aarch64_emit_unspec_cond (rtx target, rtx_code code, machine_mode pred_mode,
-			  rtx pred, rtx op0, rtx op1)
-{
-  rtx unspec = aarch64_gen_unspec_cond (code, pred_mode, pred, op0, op1);
-  emit_set_insn (target, unspec);
+  rtx cond = gen_rtx_fmt_ee (code, pred_mode, op0, op1);
+  aarch64_emit_sve_ptrue_op_cc (target, ptrue, cond);
 }
 
-/* Emit:
+/* Emit the SVE equivalent of:
 
-      (set TMP1 (unspec:PRED_MODE [PTRUE OP0 OP1] UNSPEC_COND_<X1>))
-      (set TMP2 (unspec:PRED_MODE [PTRUE OP0 OP1] UNSPEC_COND_<X2>))
-      (set TARGET (and:PRED_MODE (ior:PRED_MODE TMP1 TMP2) PTRUE))
+      (set TMP1 (CODE1 OP0 OP1))
+      (set TMP2 (CODE2 OP0 OP1))
+      (set TARGET (ior:PRED_MODE TMP1 TMP2))
 
-   where <Xi> is the operation associated with comparison CODEi.  */
+   PTRUE is an all-true predicate with the same mode as TARGET.  */
 
 static void
-aarch64_emit_unspec_cond_or (rtx target, rtx_code code1, rtx_code code2,
-			     machine_mode pred_mode, rtx ptrue,
-			     rtx op0, rtx op1)
+aarch64_emit_sve_or_conds (rtx target, rtx_code code1, rtx_code code2,
			   rtx ptrue, rtx op0, rtx op1)
 {
+  machine_mode pred_mode = GET_MODE (ptrue);
   rtx tmp1 = gen_reg_rtx (pred_mode);
-  aarch64_emit_unspec_cond (tmp1, code1, pred_mode, ptrue, op0, op1);
+  aarch64_emit_sve_ptrue_op (tmp1, ptrue,
+			     gen_rtx_fmt_ee (code1, pred_mode, op0, op1));
   rtx tmp2 = gen_reg_rtx (pred_mode);
-  aarch64_emit_unspec_cond (tmp2, code2, pred_mode, ptrue, op0, op1);
-  emit_set_insn (target, gen_rtx_AND (pred_mode,
-				      gen_rtx_IOR (pred_mode, tmp1, tmp2),
-				      ptrue));
+  aarch64_emit_sve_ptrue_op (tmp2, ptrue,
+			     gen_rtx_fmt_ee (code2, pred_mode, op0, op1));
+  aarch64_emit_binop (target, ior_optab, tmp1, tmp2);
 }
 
-/* If CAN_INVERT_P, emit an instruction:
+/* Emit the SVE equivalent of:
 
-      (set TARGET (unspec:PRED_MODE [PRED OP0 OP1] UNSPEC_COND_<X>))
+      (set TMP (CODE OP0 OP1))
+      (set TARGET (not TMP))
 
-   where <X> is the operation associated with comparison CODE.  Otherwise
-   emit:
-
-      (set TMP (unspec:PRED_MODE [PRED OP0 OP1] UNSPEC_COND_<X>))
-      (set TARGET (and:PRED_MODE (not:PRED_MODE TMP) PTRUE))
-
-   where the second instructions sets TARGET to the inverse of TMP.  */
+   PTRUE is an all-true predicate with the same mode as TARGET.  */
 
 static void
-aarch64_emit_inverted_unspec_cond (rtx target, rtx_code code,
-				   machine_mode pred_mode, rtx ptrue, rtx pred,
-				   rtx op0, rtx op1, bool can_invert_p)
+aarch64_emit_sve_inverted_cond (rtx target, rtx ptrue, rtx_code code,
				rtx op0, rtx op1)
 {
-  if (can_invert_p)
-    aarch64_emit_unspec_cond (target, code, pred_mode, pred, op0, op1);
-  else
-    {
-      rtx tmp = gen_reg_rtx (pred_mode);
-      aarch64_emit_unspec_cond (tmp, code, pred_mode, pred, op0, op1);
-      emit_set_insn (target, gen_rtx_AND (pred_mode,
-					  gen_rtx_NOT (pred_mode, tmp),
-					  ptrue));
-    }
+  machine_mode pred_mode = GET_MODE (ptrue);
+  rtx tmp = gen_reg_rtx (pred_mode);
+  aarch64_emit_sve_ptrue_op (tmp, ptrue,
+			     gen_rtx_fmt_ee (code, pred_mode, op0, op1));
+  aarch64_emit_unop (target, one_cmpl_optab, tmp);
 }
 
-/* Expand an SVE floating-point comparison:
+/* Expand an SVE floating-point comparison using the SVE equivalent of:
 
-     TARGET = CODE (OP0, OP1)
+     (set TARGET (CODE OP0 OP1))
 
    If CAN_INVERT_P is true, the caller can also handle inverted results;
    return true if the result is in fact inverted.  */
@@ -15823,30 +15845,23 @@ aarch64_expand_sve_vec_cmp_float (rtx ta
     case UNORDERED:
       /* UNORDERED has no immediate form.  */
       op1 = force_reg (data_mode, op1);
-      aarch64_emit_unspec_cond (target, code, pred_mode, ptrue, op0, op1);
-      return false;
-
+      /* fall through */
     case LT:
     case LE:
     case GT:
     case GE:
    case EQ:
    case NE:
-      /* There is native support for the comparison.  */
-      aarch64_emit_unspec_cond (target, code, pred_mode, ptrue, op0, op1);
-      return false;
-
-    case ORDERED:
-      /* There is native support for the inverse comparison.  */
-      op1 = force_reg (data_mode, op1);
-      aarch64_emit_inverted_unspec_cond (target, UNORDERED,
-					 pred_mode, ptrue, ptrue, op0, op1,
-					 can_invert_p);
-      return can_invert_p;
+      {
+	/* There is native support for the comparison.  */
+	rtx cond = gen_rtx_fmt_ee (code, pred_mode, op0, op1);
+	aarch64_emit_sve_ptrue_op (target, ptrue, cond);
+	return false;
+      }
 
     case LTGT:
       /* This is a trapping operation (LT or GT).  */
-      aarch64_emit_unspec_cond_or (target, LT, GT, pred_mode, ptrue, op0, op1);
+      aarch64_emit_sve_or_conds (target, LT, GT, ptrue, op0, op1);
       return false;
 
     case UNEQ:
@@ -15854,38 +15869,59 @@ aarch64_expand_sve_vec_cmp_float (rtx ta
 	{
 	  /* This would trap for signaling NaNs.  */
 	  op1 = force_reg (data_mode, op1);
-	  aarch64_emit_unspec_cond_or (target, UNORDERED, EQ,
-				       pred_mode, ptrue, op0, op1);
+	  aarch64_emit_sve_or_conds (target, UNORDERED, EQ, ptrue, op0, op1);
 	  return false;
 	}
       /* fall through */
-
     case UNLT:
     case UNLE:
    case UNGT:
    case UNGE:
-      {
-	rtx ordered = ptrue;
-	if (flag_trapping_math)
-	  {
-	    /* Only compare the elements that are known to be ordered.  */
-	    ordered = gen_reg_rtx (pred_mode);
-	    op1 = force_reg (data_mode, op1);
-	    aarch64_emit_inverted_unspec_cond (ordered, UNORDERED, pred_mode,
-					       ptrue, ptrue, op0, op1, false);
-	  }
-	if (code == UNEQ)
-	  code = NE;
-	else
-	  code = reverse_condition_maybe_unordered (code);
-	aarch64_emit_inverted_unspec_cond (target, code, pred_mode, ptrue,
-					   ordered, op0, op1, can_invert_p);
-	return can_invert_p;
-      }
+      if (flag_trapping_math)
+	{
+	  /* Work out which elements are ordered.  */
+	  rtx ordered = gen_reg_rtx (pred_mode);
+	  op1 = force_reg (data_mode, op1);
+	  aarch64_emit_sve_inverted_cond (ordered, ptrue, UNORDERED, op0, op1);
+
+	  /* Test the opposite condition for the ordered elements,
+	     then invert the result.  */
+	  if (code == UNEQ)
+	    code = NE;
+	  else
+	    code = reverse_condition_maybe_unordered (code);
+	  if (can_invert_p)
+	    {
+	      aarch64_emit_sve_predicated_cond (target, code,
+						ordered, op0, op1);
+	      return true;
+	    }
+	  rtx tmp = gen_reg_rtx (pred_mode);
+	  aarch64_emit_sve_predicated_cond (tmp, code, ordered, op0, op1);
+	  aarch64_emit_unop (target, one_cmpl_optab, tmp);
+	  return false;
+	}
+      break;
+
+    case ORDERED:
+      /* ORDERED has no immediate form.  */
+      op1 = force_reg (data_mode, op1);
+      break;
 
     default:
      gcc_unreachable ();
    }
+
+  /* There is native support for the inverse comparison.  */
+  code = reverse_condition_maybe_unordered (code);
+  if (can_invert_p)
+    {
+      rtx cond = gen_rtx_fmt_ee (code, pred_mode, op0, op1);
+      aarch64_emit_sve_ptrue_op (target, ptrue, cond);
+      return true;
+    }
+  aarch64_emit_sve_inverted_cond (target, ptrue, code, op0, op1);
+  return false;
 }
 
 /* Expand an SVE vcond pattern with operands OPS.  DATA_MODE is the mode