From patchwork Thu May 5 21:37:07 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jim Wilson
X-Patchwork-Id: 67241
Date: Thu, 5 May 2016 14:37:07 -0700
Subject: [PATCH, ARM] use vmov.i64 to load 0 into FP reg if neon enabled
From: Jim Wilson
To: "gcc-patches@gcc.gnu.org"
Cc: Jim Wilson

For this simple testcase

double
sub (void)
{
  return 0.0;
}

Without the attached patch, an ARM compiler with neon support enabled gives

        vldr.64 d0, .L2

With the attached patch, an ARM compiler with neon enabled gives

        vmov.i64 d0, #0@ float

which is faster and smaller, as there is no load from a constant pool
entry.

There are a few ways to implement this. I added a neon value to the
arch attribute, which enables the new alternative only when TARGET_NEON
is set. Another way to do this would be a new constraint, like Dg, that
tests for both neon and 0.

I don't see any mention of targets that only support single-float in
the ARM ARM, so it isn't obvious how to handle that. I see no targets
that support both neon and single-float, but maybe I need to check for
that anyway?

Most of the patch involves renumbering constraints and the matching
attributes. The new alternative w/G must come before w/UvF, or else we
still get a constant pool reference. Otherwise the patch is pretty
small and simple.

We can do the same thing in the movdi pattern. I haven't tried writing
that yet.

This patch was tested with a bootstrap and make check in an armhf
schroot on an xgene box. There were no regressions.

OK to check in?

Jim

	* config/arm/arm.md (arch): Add neon.
	(arch_enabled): Return yes for arch neon when TARGET_NEON.
	* config/arm/vfp.md (movdf_vfp): Add w/G as alternative 3. Add
	neon_move as type for alt 3. Add arch attr enabling alt 3 for neon.
	Emit vmov.i64 for alt 3. Renumber alternatives 3 to 8. Adjust
	attributes for alt renumbering. Mark alt 3 as non-predicable.
	(thumb2_movdf_vfp): Likewise.

Index: config/arm/arm.md
===================================================================
--- config/arm/arm.md (revision 235793)
+++ config/arm/arm.md (working copy)
@@ -121,7 +121,7 @@
 ; arm_arch6. "v6t2" for Thumb-2 with arm_arch6. This attribute is
 ; used to compute attribute "enabled", use type "any" to enable an
 ; alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
   (const_string "any"))
 
 (define_attr "arch_enabled" "no,yes"
@@ -177,6 +177,10 @@
          (and (eq_attr "arch" "armv6_or_vfpv3")
               (match_test "arm_arch6 || TARGET_VFP3"))
          (const_string "yes")
+
+         (and (eq_attr "arch" "neon")
+              (match_test "TARGET_NEON"))
+         (const_string "yes")
         ]
 
         (const_string "no")))
Index: config/arm/vfp.md
===================================================================
--- config/arm/vfp.md (revision 235793)
+++ config/arm/vfp.md (working copy)
@@ -394,8 +394,8 @@
 ;; DFmode moves
 
 (define_insn "*movdf_vfp"
-  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w ,Uv,r, m,w,r")
-        (match_operand:DF 1 "soft_df_operand" " ?r,w,Dy,UvF,w ,mF,r,w,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w ,Uv,r, m,w,r")
+        (match_operand:DF 1 "soft_df_operand" " ?r,w,Dy,G,UvF,w ,mF,r,w,r"))]
   "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
    && ( register_operand (operands[0], DFmode)
        || register_operand (operands[1], DFmode))"
@@ -410,16 +410,18 @@
       case 2:
         gcc_assert (TARGET_VFP_DOUBLE);
         return \"vmov%?.f64\\t%P0, %1\";
-      case 3: case 4:
+      case 3:
+        return \"vmov.i64\\t%P0, #0@ float\";
+      case 4: case 5:
         return output_move_vfp (operands);
-      case 5: case 6:
+      case 6: case 7:
         return output_move_double (operands, true, NULL);
-      case 7:
+      case 8:
         if (TARGET_VFP_SINGLE)
           return \"vmov%?.f32\\t%0, %1\;vmov%?.f32\\t%p0, %p1\";
         else
           return \"vmov%?.f64\\t%P0, %P1\";
-      case 8:
+      case 9:
         return \"#\";
       default:
         gcc_unreachable ();
@@ -426,23 +428,24 @@
     }
     }
   "
-  [(set_attr "type" "f_mcrr,f_mrrc,fconstd,f_loadd,f_stored,\
+  [(set_attr "type" "f_mcrr,f_mrrc,fconstd,neon_move,f_loadd,f_stored,\
                      load2,store2,ffarithd,multiple")
-   (set (attr "length") (cond [(eq_attr "alternative" "5,6,8") (const_int 8)
-                               (eq_attr "alternative" "7")
+   (set (attr "length") (cond [(eq_attr "alternative" "6,7,9") (const_int 8)
+                               (eq_attr "alternative" "8")
                                 (if_then_else
                                  (match_test "TARGET_VFP_SINGLE")
                                  (const_int 8)
                                  (const_int 4))]
                               (const_int 4)))
-   (set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,*,*,1020,*,1020,*,*,*")
-   (set_attr "neg_pool_range" "*,*,*,1004,*,1004,*,*,*")]
+   (set_attr "predicable" "yes,yes,yes,no,yes,yes,yes,yes,yes,yes")
+   (set_attr "pool_range" "*,*,*,*,1020,*,1020,*,*,*")
+   (set_attr "neg_pool_range" "*,*,*,*,1004,*,1004,*,*,*")
+   (set_attr "arch" "any,any,any,neon,any,any,any,any,any,any")]
 )
 
 (define_insn "*thumb2_movdf_vfp"
-  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w ,Uv,r ,m,w,r")
-        (match_operand:DF 1 "soft_df_operand" " ?r,w,Dy,UvF,w, mF,r, w,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w ,Uv,r ,m,w,r")
+        (match_operand:DF 1 "soft_df_operand" " ?r,w,Dy,G,UvF,w, mF,r, w,r"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP
    && ( register_operand (operands[0], DFmode)
        || register_operand (operands[1], DFmode))"
@@ -457,11 +460,13 @@
       case 2:
         gcc_assert (TARGET_VFP_DOUBLE);
         return \"vmov%?.f64\\t%P0, %1\";
-      case 3: case 4:
+      case 3:
+        return \"vmov.i64\\t%P0, #0@ float\";
+      case 4: case 5:
         return output_move_vfp (operands);
-      case 5: case 6: case 8:
+      case 6: case 7: case 9:
         return output_move_double (operands, true, NULL);
-      case 7:
+      case 8:
         if (TARGET_VFP_SINGLE)
           return \"vmov%?.f32\\t%0, %1\;vmov%?.f32\\t%p0, %p1\";
         else
@@ -471,17 +476,18 @@
     }
     }
   "
-  [(set_attr "type" "f_mcrr,f_mrrc,fconstd,f_loadd,\
+  [(set_attr "type" "f_mcrr,f_mrrc,fconstd,neon_move,f_loadd,\
                      f_stored,load2,store2,ffarithd,multiple")
-   (set (attr "length") (cond [(eq_attr "alternative" "5,6,8") (const_int 8)
-                               (eq_attr "alternative" "7")
+   (set (attr "length") (cond [(eq_attr "alternative" "6,7,9") (const_int 8)
+                               (eq_attr "alternative" "8")
                                 (if_then_else
                                  (match_test "TARGET_VFP_SINGLE")
                                  (const_int 8)
                                  (const_int 4))]
                               (const_int 4)))
-   (set_attr "pool_range" "*,*,*,1018,*,4094,*,*,*")
-   (set_attr "neg_pool_range" "*,*,*,1008,*,0,*,*,*")]
+   (set_attr "pool_range" "*,*,*,*,1018,*,4094,*,*,*")
+   (set_attr "neg_pool_range" "*,*,*,*,1008,*,0,*,*,*")
+   (set_attr "arch" "any,any,any,neon,any,any,any,any,any,any")]
 )
 
 
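
To reproduce the comparison from the description locally, the testcase can be extended along the following lines. This is only a sketch: sub_nonzero, the file name and the compiler invocation in the comment are assumptions for illustration and are not part of the patch, and the instruction noted for each function is the expected result rather than verified output.

```c
/* Sketch only.  Compile to assembly and inspect the result, for
   example with (toolchain name, file name and flags are assumptions):
     arm-linux-gnueabihf-gcc -O2 -mfpu=neon -mfloat-abi=hard -S zero.c  */

/* Testcase from the description.  With the patch and neon enabled,
   this is expected to use vmov.i64 d0, #0 instead of a vldr.64 from
   a constant pool entry.  */
double
sub (void)
{
  return 0.0;
}

/* Hypothetical contrast case.  The new alternative only accepts
   constraint G (floating-point zero), so a constant like this one
   should still be loaded through the existing w/UvF alternative.  */
double
sub_nonzero (void)
{
  return 3.14159;
}
```

Comparing the assembly for the two functions, before and after applying the patch, should show the change affecting only the zero case.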