From patchwork Fri Aug 26 14:57:54 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ramana Radhakrishnan
X-Patchwork-Id: 3721
Date: Fri, 26 Aug 2011 15:57:54 +0100
Subject: Re: [Patch ARM] Fix vec_pack_trunc pattern for vectorize_with_neon_quad.
From: Ramana Radhakrishnan
To: gcc-patches
Cc: Patch Tracking, Ira Rosen

On 16 August 2011 15:20, Ramana Radhakrishnan wrote:
> Hi,
>
> While looking at a failure with regrename and
> -mvectorize-with-neon-quad I noticed that the early-clobber in this
> vec_pack_trunc pattern is superfluous given that we can use
> reg_overlap_mentioned_p to decide in which order we want to emit these
> 2 instructions. While it works around the problem in regrename.c I
> still think that the behaviour in regrename is a bit suspicious and
> needs some more investigation.
>

RichardS finally fixed the problem in data-flow, so we should be able
to turn on vectorize_with_neon_quad anyway.

Here's the patch that I had intended to commit as a workaround.
However, I think it's better to split this pattern further in the case
where the 2 source registers are equal, because otherwise you
pointlessly create a stall in the Neon pipe while waiting for the vmovn
result to arrive. Hence I'm not committing this patch.

Tests finished OK for this patch, btw.

cheers
Ramana

index 24dd941..2c60c5f 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -5631,14 +5631,29 @@
 ; the semantics of the instructions require.
 
 (define_insn "vec_pack_trunc_"
- [(set (match_operand: 0 "register_operand" "=&w")
+ [(set (match_operand: 0 "register_operand" "=w")
        (vec_concat:
                (truncate:
                        (match_operand:VN 1 "register_operand" "w"))
                (truncate:
                        (match_operand:VN 2 "register_operand" "w"))))]
  "TARGET_NEON && !BYTES_BIG_ENDIAN"
- "vmovn.i\t%e0, %q1\;vmovn.i\t%f0, %q2"
+ {
+   /* If operand1 and operand2 are identical, then the second
+      narrowing operation isn't needed as the values obtained
+      in both parts of the destination q register are identical.
+      This precludes the need for an early clobber in the destination
+      operand.  */
+   if (rtx_equal_p (operands[1], operands[2]))
+     return "vmovn.i\\t%e0, %q1\;vmov.i\\t%f0, %e0";
+   else
+     {
+       if (reg_overlap_mentioned_p (operands[0], operands[2]))
+         return "vmovn.i\\t%f0, %q2\;vmovn.i\\t%e0, %q1";
+       else
+         return "vmovn.i\\t%e0, %q1\;vmovn.i\\t%f0, %q2";
+     }
+ }
  [(set_attr "neon_type" "neon_shift_1")
   (set_attr "length" "8")]
 )
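
For anyone who wants to check the ordering argument without a Neon target, here is a minimal standalone C sketch of the idea. It is not GCC code. The register model (qreg_t), the helper names (vmovn, pack_trunc, choose_order, pack_matches_reference) and the test values are all hypothetical, 32-bit source lanes are assumed, and pointer equality stands in for reg_overlap_mentioned_p. It models only the case where the two source registers are distinct.

```c
#include <stdint.h>
#include <string.h>

/* Toy 128-bit Q register (hypothetical model, not a GCC type):
   bytes 0..7 are the low D half (%e0 in the patch),
   bytes 8..15 are the high D half (%f0).  */
typedef struct { uint8_t bytes[16]; } qreg_t;

enum order { LOW_FIRST, HIGH_FIRST, PATCH_ORDER };
enum alias { ALIAS_NONE, ALIAS_OP1, ALIAS_OP2 };

/* Model of one narrowing move on 32-bit lanes: read all four source
   lanes first, then write their low 16 bits into one D half of dst.
   Reading before writing mirrors a single instruction, so dst may
   alias src.  */
void vmovn(qreg_t *dst, int high_half, const qreg_t *src)
{
    uint32_t wide[4];
    uint16_t narrow[4];

    memcpy(wide, src->bytes, sizeof wide);
    for (int i = 0; i < 4; i++)
        narrow[i] = (uint16_t)wide[i];
    memcpy(dst->bytes + (high_half ? 8 : 0), narrow, sizeof narrow);
}

/* The two-instruction sequence, in either emission order.  */
void pack_trunc(qreg_t *dst, const qreg_t *op1, const qreg_t *op2,
                enum order ord)
{
    if (ord == HIGH_FIRST) {
        vmovn(dst, 1, op2);
        vmovn(dst, 0, op1);
    } else {
        vmovn(dst, 0, op1);
        vmovn(dst, 1, op2);
    }
}

/* Stand-in for the overlap test in the patch: in this toy model two
   registers overlap only when they are the same object.  */
enum order choose_order(const qreg_t *dst, const qreg_t *op2)
{
    return dst == op2 ? HIGH_FIRST : LOW_FIRST;
}

/* Returns 1 if packing with the given aliasing and order produces the
   same bytes as packing into a separate destination register.  */
int pack_matches_reference(enum alias al, enum order ord)
{
    const uint32_t v1[4] = { 0x000A0001u, 0x000B0002u, 0x000C0003u, 0x000D0004u };
    const uint32_t v2[4] = { 0x000E0005u, 0x000F0006u, 0x00100007u, 0x00110008u };
    qreg_t op1, op2, scratch, expected;
    qreg_t *dst;

    memcpy(op1.bytes, v1, sizeof v1);
    memcpy(op2.bytes, v2, sizeof v2);

    /* Reference result: destination distinct from both sources.  */
    pack_trunc(&expected, &op1, &op2, LOW_FIRST);

    dst = al == ALIAS_OP1 ? &op1 : al == ALIAS_OP2 ? &op2 : &scratch;
    if (ord == PATCH_ORDER)
        ord = choose_order(dst, &op2);
    pack_trunc(dst, &op1, &op2, ord);

    return memcmp(dst->bytes, expected.bytes, sizeof expected.bytes) == 0;
}
```

In this model, pack_matches_reference(ALIAS_OP2, LOW_FIRST) fails because the first narrowing move overwrites half of the second source before it is read. pack_matches_reference(al, PATCH_ORDER) succeeds for all three aliasing cases, which is the reason the early clobber can be dropped once the emission order depends on the overlap check.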