From patchwork Sat Apr 14 13:11:48 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 7811
Message-ID: <4F897794.7010402@codesourcery.com>
Date: Sat, 14 Apr 2012 14:11:48 +0100
From: Andrew Stubbs
To: "gcc-patches@gcc.gnu.org"
CC: Richard Earnshaw, "patches@linaro.org"
Subject: Re: [PATCH][ARM] NEON DImode neg
References: <4F4D12C5.9070805@codesourcery.com>
 <4F704189.4010302@codesourcery.com> <4F86F932.60606@arm.com>
 <4F897239.1090901@codesourcery.com>
In-Reply-To: <4F897239.1090901@codesourcery.com>

And now with the patch. :(

On 14/04/12 13:48, Andrew Stubbs wrote:
> On 12/04/12 16:48, Richard Earnshaw wrote:
>> If negation in Neon needs a scratch register, it seems to me to be
>> somewhat odd that we're disparaging the ARM version.
>>
>> Also, wouldn't it be sensible to support a variant that was
>> early-clobber on operand 0, but loaded immediate zero into that value
>> first:
>>
>> vmov Dd, #0
>> vsub Dd, Dd, Dm
>>
>> That way you'll never need more than two registers, whereas today you
>> want three.
>
> This patch implements the changes you suggested.
>
> I've done a full bootstrap and test and found no regressions.
>
> OK?
>
> Andrew
>
> P.S. This patch can't actually be committed until my "NEON DImode
> immediate constants" patch is approved and committed. (Without that the
> load #0 needs a constant pool, and loading constants this late has a bug
> at -O0.)

2012-04-12  Andrew Stubbs

	gcc/
	* config/arm/arm.md (negdi2): Use gen_negdi2_neon.
	* config/arm/neon.md (negdi2_neon): New insn.
	Also add splitters for core and NEON registers.

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 751997f..f1dbbf7 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4048,7 +4048,13 @@
         (neg:DI (match_operand:DI 1 "s_register_operand" "")))
     (clobber (reg:CC CC_REGNUM))])]
   "TARGET_EITHER"
-  ""
+  {
+    if (TARGET_NEON)
+      {
+        emit_insn (gen_negdi2_neon (operands[0], operands[1]));
+        DONE;
+      }
+  }
 )
 
 ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1).
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 3c88568..8c8b02d 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -922,6 +922,45 @@
                     (const_string "neon_int_3")))]
 )
 
+(define_insn "negdi2_neon"
+  [(set (match_operand:DI 0 "s_register_operand"         "=&w, w,r,&r")
+        (neg:DI (match_operand:DI 1 "s_register_operand" "  w, w,0, r")))
+   (clobber (match_scratch:DI 2                          "= X,&w,X, X"))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_NEON"
+  "#"
+  [(set_attr "length" "8")]
+)
+
+; Split negdi2_neon for vfp registers
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+        (neg:DI (match_operand:DI 1 "s_register_operand" "")))
+   (clobber (match_scratch:DI 2 ""))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (const_int 0))
+   (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1)))
+              (clobber (reg:CC CC_REGNUM))])]
+  {
+    if (!REG_P (operands[2]))
+      operands[2] = operands[0];
+  }
+)
+
+; Split negdi2_neon for core registers
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+        (neg:DI (match_operand:DI 1 "s_register_operand" "")))
+   (clobber (match_scratch:DI 2 ""))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_32BIT && reload_completed
+   && arm_general_register_operand (operands[0], DImode)"
+  [(parallel [(set (match_dup 0) (neg:DI (match_dup 1)))
+              (clobber (reg:CC CC_REGNUM))])]
+  ""
+)
+
 (define_insn "*umin3_neon"
   [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
         (umin:VDQIW (match_operand:VDQIW 1 "s_register_operand" "w")
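
For anyone following the two splitters, here is a rough C model of what each one is expected to compute. This is only an illustrative sketch, not part of the patch: the function names neg64_neon_model and neg64_core_model are made up for this note. The NEON version mirrors the "vmov Dd, #0" / "vsub Dd, Dd, Dm" sequence quoted above. The core-register version assumes the existing ARM negdi2 expands to a pair of 32-bit subtracts linked through the carry flag, which would be the reason the pattern clobbers CC.

```c
#include <stdint.h>

/* Hypothetical model of the NEON split: load zero into the
   destination (or the scratch), then do one 64-bit subtract.  */
static uint64_t neg64_neon_model(uint64_t x)
{
    uint64_t d = 0;   /* models: vmov Dd, #0      */
    d = d - x;        /* models: vsub Dd, Dd, Dm  */
    return d;
}

/* Hypothetical model of the core-register split: negate the low
   word, then negate the high word minus the borrow from the low
   word.  The borrow stands in for the condition flags.  */
static uint64_t neg64_core_model(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t borrow = (lo != 0);      /* 0 - lo borrows iff lo != 0 */
    uint32_t rlo = 0u - lo;           /* low half of the result     */
    uint32_t rhi = 0u - hi - borrow;  /* high half, minus borrow    */
    return ((uint64_t)rhi << 32) | rlo;
}
```

Both models should agree for every input. The NEON one needs no flag-like state at all, so the CC clobber matters only for the core-register alternatives.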