From patchwork Tue Oct 20 15:47:31 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
X-Patchwork-Id: 55312
Return-Path: <patchwork-forward+bncBDFONDVM3EGBBMWETGYQKGQEV3X55VY@linaro.org>
X-Original-To: linaro@patches.linaro.org
Delivered-To: linaro@patches.linaro.org
Received: from mail-lb0-f200.google.com (mail-lb0-f200.google.com
 [209.85.217.200])
 by patches.linaro.org (Postfix) with ESMTPS id ED18B23024
 for <linaro@patches.linaro.org>; Tue, 20 Oct 2015 15:48:03 +0000 (UTC)
Received: by lbbms9 with SMTP id ms9sf8645510lbb.3
 for <linaro@patches.linaro.org>; Tue, 20 Oct 2015 08:48:02 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender
 :delivered-to:message-id:date:from:user-agent:mime-version:to:cc
 :subject:references:in-reply-to:content-type:x-original-sender
 :x-original-authentication-results;
 bh=6ZiCoZXQ4MdDaTPpGxKrknWRscrO4Mg3qUcRWf7w8Bo=;
 b=DXqMB/86erMtq1PeyHQY9eWEt1ERbQ30vXDR8Uh2dD4K/VylioXMXHNQGD4LS25fmW
 BNRS9IJGBbgDikZJ4Qri3FJV5C50ALmHspLc4UGVsab5OUA4asPlOgQjQ/qCmzJkNA4k
 EtJPV97gP4ej9XJ3ugjEdE69YN5haHx6TyrD8v758lnTTpkBxNS2ZR7wZwFaWpMbTR+q
 SE7bwca3p5kS5peg2AEtEJNZ3i0xlqpUMZBFSWdCMJkv8y3A/KObnjgfyWsqdBrEJiVo
 nHWnLZk/Lu4nfPWfQDCJrRVWgMzGAWxycuDUQdJhaaoE0IXPmSFkcsWEj/2qh74wMOzg
 mbYw==
X-Gm-Message-State: ALoCoQl6Y4X8jd9R5ax6FdLrJWiLC37xF3DGjQVwmgdDgKtghzk3ZROpXgjZehiKKA2E0+VP0JHY
X-Received: by 10.194.109.233 with SMTP id hv9mr806984wjb.1.1445356082849;
 Tue, 20 Oct 2015 08:48:02 -0700 (PDT)
X-BeenThere: patchwork-forward@linaro.org
Received: by 10.25.24.170 with SMTP id 42ls95765lfy.3.gmail; Tue, 20 Oct 2015
 08:48:02 -0700 (PDT)
X-Received: by 10.25.24.27 with SMTP id o27mr1498705lfi.5.1445356082558;
 Tue, 20 Oct 2015 08:48:02 -0700 (PDT)
Received: from mail-lb0-x235.google.com (mail-lb0-x235.google.com.
 [2a00:1450:4010:c04::235]) by mx.google.com with ESMTPS id
 a135si2743115lfe.135.2015.10.20.08.48.02
 for <patchwork-forward@linaro.org>
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 20 Oct 2015 08:48:02 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 2a00:1450:4010:c04::235 as permitted sender)
 client-ip=2a00:1450:4010:c04::235; 
Received: by lbbwb3 with SMTP id wb3so18539355lbb.1
 for <patchwork-forward@linaro.org>;
 Tue, 20 Oct 2015 08:48:02 -0700 (PDT)
X-Received: by 10.112.163.131 with SMTP id yi3mr2321017lbb.36.1445356082255; 
 Tue, 20 Oct 2015 08:48:02 -0700 (PDT)
X-Forwarded-To: patchwork-forward@linaro.org
X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org
Delivered-To: patch@linaro.org
Received: by 10.112.59.35 with SMTP id w3csp2161181lbq;
 Tue, 20 Oct 2015 08:48:00 -0700 (PDT)
X-Received: by 10.66.216.39 with SMTP id on7mr4688986pac.73.1445356080573;
 Tue, 20 Oct 2015 08:48:00 -0700 (PDT)
Received: from sourceware.org (server1.sourceware.org. [209.132.180.131])
 by mx.google.com with ESMTPS id
 zo6si6082555pbc.29.2015.10.20.08.48.00 for <patch@linaro.org>
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 20 Oct 2015 08:48:00 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 gcc-patches-return-410677-patch=linaro.org@gcc.gnu.org
 designates 209.132.180.131 as permitted sender)
 client-ip=209.132.180.131; 
Received: (qmail 92540 invoked by alias); 20 Oct 2015 15:47:40 -0000
Mailing-List: list patchwork-forward@linaro.org;
 contact patchwork-forward+owners@linaro.org
Precedence: list
List-Id: <patchwork-forward.linaro.org>
List-Unsubscribe: <mailto:googlegroups-manage+836684582541+unsubscribe@googlegroups.com>, 
 <http://groups.google.com/a/linaro.org/group/patchwork-forward/subscribe>
List-Archive: <http://groups.google.com/a/linaro.org/group/patchwork-forward/>
List-Post: <http://groups.google.com/a/linaro.org/group/patchwork-forward/post>, 
 <mailto:patchwork-forward@linaro.org>
List-Help: <http://support.google.com/a/linaro.org/bin/topic.py?topic=25838>, 
 <mailto:patchwork-forward+help@linaro.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 92530 invoked by uid 89); 20 Oct 2015 15:47:39 -0000
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=0.6 required=5.0 tests=AWL, BAYES_50,
 LIKELY_SPAM_BODY, SPF_PASS autolearn=no version=3.3.2
X-HELO: eu-smtp-delivery-143.mimecast.com
Received: from eu-smtp-delivery-143.mimecast.com (HELO
 eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by
 sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
 Tue, 20 Oct 2015 15:47:36 +0000
Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com
 [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id
 uk-mta-24-Y_UzQhluR5ySg3Leb052UA-1; Tue, 20 Oct 2015 16:47:31 +0100
Received: from [10.2.207.50] ([10.1.2.79]) by cam-owa2.Emea.Arm.com with
 Microsoft SMTPSVC(6.0.3790.3959); Tue, 20 Oct 2015 16:47:31 +0100
Message-ID: <56266213.80705@arm.com>
Date: Tue, 20 Oct 2015 16:47:31 +0100
From: Kyrill Tkachov <kyrylo.tkachov@arm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: Marcus Shawcroft <marcus.shawcroft@gmail.com>
CC: GCC Patches <gcc-patches@gcc.gnu.org>,
 Marcus Shawcroft <marcus.shawcroft@arm.com>,
 Richard Earnshaw <Richard.Earnshaw@arm.com>,
 James Greenhalgh <james.greenhalgh@arm.com>
Subject: Re: [PATCH][AArch64][1/2] Add fmul-by-power-of-2+fcvt optimisation
References: <5624F6B3.1030407@arm.com>
 <CAFqB+PzN6BTpnk8hRbTzOiLnZXRWDjufOCRaoRy9u+92oD7zrw@mail.gmail.com>
In-Reply-To: <CAFqB+PzN6BTpnk8hRbTzOiLnZXRWDjufOCRaoRy9u+92oD7zrw@mail.gmail.com>
X-MC-Unique: Y_UzQhluR5ySg3Leb052UA-1
X-IsSubscribed: yes
X-Original-Sender: kyrylo.tkachov@arm.com
X-Original-Authentication-Results: mx.google.com; spf=pass (google.com:
 domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 2a00:1450:4010:c04::235 as permitted sender)
 smtp.mailfrom=patch+caf_=patchwork-forward=linaro.org@linaro.org;
 dkim=pass header.i=@gcc.gnu.org
X-Google-Group-Id: 836684582541

On 20/10/15 16:26, Marcus Shawcroft wrote:
> On 19 October 2015 at 14:57, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>
>> 2015-10-19  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>      * config/aarch64/aarch64.md
>>   (*aarch64_fcvt<su_optab><GPF:mode><GPI:mode>2_mult): New pattern.
>>      * config/aarch64/aarch64-simd.md
>>   (*aarch64_fcvt<su_optab><VDQF:mode><fcvt_target>2_mult): Likewise.
>>      * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle above patterns.
>>      (aarch64_fpconst_pow_of_2): New function.
>>      (aarch64_vec_fpconst_pow_of_2): Likewise.
>>      * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow_of_2): Declare
>>      prototype.
>>      (aarch64_vec_fpconst_pow_of_2): Likewise.
>>      * config/aarch64/predicates.md (aarch64_fp_pow2): New predicate.
>>      (aarch64_fp_vec_pow2): Likewise.
>>
>> 2015-10-19  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>      * gcc.target/aarch64/fmul_fcvt_1.c: New test.
>>      * gcc.target/aarch64/fmul_fcvt_2.c: Likewise.
> +    char buf[64];
> +    sprintf (buf, "fcvtz<su>\\t%%0.<Vtype>, %%1.<Vtype>, #%d", fbits);
>
> Prefer snprintf please.
>
> +  }
> +  [(set_attr "type" "neon_fp_to_int_<Vetype><q>")]
> +)
> +
> +
>
> Superflous blank line here ?
>
> +  *cost += rtx_cost (XEXP (x, 0), VOIDmode,
> +     (enum rtx_code) code, 0, speed);
>
> My understanding is the unnecessary use of enum  is now discouraged,
> (rtx_code) is sufficient in this case.
>
> +  int count = CONST_VECTOR_NUNITS (x);
> +  int i;
> +  for (i = 1; i < count; i++)
>
> Push the int into the for initializer.
> Push the rhs of the count assignment into the for condition and drop
> the definition of count.

Ok, done.

>
> +/* { dg-final { scan-assembler "fcvtzs\tw\[0-9\], s\[0-9\]*.*#2" } } */
>
> I'd prefer scan-assembler-times or do you have a particular reason to
> avoid it in these tests?

No reason.
Here's the patch updated as per your feedback.

How's this?

Thanks,
Kyrill

2015-10-20  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/aarch64/aarch64.md
(*aarch64_fcvt<su_optab><GPF:mode><GPI:mode>2_mult): New pattern.
     * config/aarch64/aarch64-simd.md
(*aarch64_fcvt<su_optab><VDQF:mode><fcvt_target>2_mult): Likewise.
     * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle above patterns.
     (aarch64_fpconst_pow_of_2): New function.
     (aarch64_vec_fpconst_pow_of_2): Likewise.
     * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow_of_2): Declare
     prototype.
     (aarch64_vec_fpconst_pow_of_2): Likewise.
     * config/aarch64/predicates.md (aarch64_fp_pow2): New predicate.
     (aarch64_fp_vec_pow2): Likewise.

2015-10-20  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/fmul_fcvt_1.c: New test.
     * gcc.target/aarch64/fmul_fcvt_2.c: Likewise.

commit dc55192dd5f54f69d659ea6cc703c5fc2dc9b88b
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Thu Oct 8 15:17:47 2015 +0100

    [AArch64] Add fmul+fcvt optimisation

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index a8ac8d3..309dcfb 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -294,12 +294,14 @@ enum aarch64_symbol_type aarch64_classify_symbol (rtx, rtx);
 enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx);
 enum reg_class aarch64_regno_regclass (unsigned);
 int aarch64_asm_preferred_eh_data_format (int, int);
+int aarch64_fpconst_pow_of_2 (rtx);
 machine_mode aarch64_hard_regno_caller_save_mode (unsigned, unsigned,
 						       machine_mode);
 int aarch64_hard_regno_mode_ok (unsigned, machine_mode);
 int aarch64_hard_regno_nregs (unsigned, machine_mode);
 int aarch64_simd_attr_length_move (rtx_insn *);
 int aarch64_uxt_size (int, HOST_WIDE_INT);
+int aarch64_vec_fpconst_pow_of_2 (rtx);
 rtx aarch64_final_eh_return_addr (void);
 rtx aarch64_legitimize_reload_address (rtx *, machine_mode, int, int, int);
 const char *aarch64_output_move_struct (rtx *operands);
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 167277e..cf1ff6d 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1654,6 +1654,26 @@ (define_insn "l<fcvt_pattern><su_optab><VDQF:mode><fcvt_target>2"
   [(set_attr "type" "neon_fp_to_int_<Vetype><q>")]
 )
 
+(define_insn "*aarch64_fcvt<su_optab><VDQF:mode><fcvt_target>2_mult"
+  [(set (match_operand:<FCVT_TARGET> 0 "register_operand" "=w")
+	(FIXUORS:<FCVT_TARGET> (unspec:<FCVT_TARGET>
+			       [(mult:VDQF
+	 (match_operand:VDQF 1 "register_operand" "w")
+	 (match_operand:VDQF 2 "aarch64_fp_vec_pow2" ""))]
+			       UNSPEC_FRINTZ)))]
+  "TARGET_SIMD
+   && IN_RANGE (aarch64_vec_fpconst_pow_of_2 (operands[2]), 1,
+		GET_MODE_BITSIZE (GET_MODE_INNER (<VDQF:MODE>mode)))"
+  {
+    int fbits = aarch64_vec_fpconst_pow_of_2 (operands[2]);
+    char buf[64];
+    snprintf (buf, 64, "fcvtz<su>\\t%%0.<Vtype>, %%1.<Vtype>, #%d", fbits);
+    output_asm_insn (buf, operands);
+    return "";
+  }
+  [(set_attr "type" "neon_fp_to_int_<Vetype><q>")]
+)
+
 (define_expand "<optab><VDQF:mode><fcvt_target>2"
   [(set (match_operand:<FCVT_TARGET> 0 "register_operand")
 	(FIXUORS:<FCVT_TARGET> (unspec:<FCVT_TARGET>
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e304d40..17e59e1 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6791,6 +6791,19 @@ cost_plus:
 	  else
 	    *cost += extra_cost->fp[GET_MODE (x) == DFmode].toint;
 	}
+
+      /* We can combine fmul by a power of 2 followed by a fcvt into a single
+	 fixed-point fcvt.  */
+      if (GET_CODE (x) == MULT
+	  && ((VECTOR_MODE_P (mode)
+	       && aarch64_vec_fpconst_pow_of_2 (XEXP (x, 1)) > 0)
+	      || aarch64_fpconst_pow_of_2 (XEXP (x, 1)) > 0))
+	{
+	  *cost += rtx_cost (XEXP (x, 0), VOIDmode, (rtx_code) code,
+			     0, speed);
+	  return true;
+	}
+
       *cost += rtx_cost (x, VOIDmode, (enum rtx_code) code, 0, speed);
       return true;
 
@@ -13357,6 +13370,52 @@ aarch64_reorg (void)
 #undef TARGET_MACHINE_DEPENDENT_REORG
 #define TARGET_MACHINE_DEPENDENT_REORG aarch64_reorg
 
+
+/* If X is a positive CONST_DOUBLE with a value that is a power of 2
+   return the log2 of that value.  Otherwise return -1.  */
+
+int
+aarch64_fpconst_pow_of_2 (rtx x)
+{
+  const REAL_VALUE_TYPE *r;
+
+  if (!CONST_DOUBLE_P (x))
+    return -1;
+
+  r = CONST_DOUBLE_REAL_VALUE (x);
+
+  if (REAL_VALUE_NEGATIVE (*r)
+      || REAL_VALUE_ISNAN (*r)
+      || REAL_VALUE_ISINF (*r)
+      || !real_isinteger (r, DFmode))
+    return -1;
+
+  return exact_log2 (real_to_integer (r));
+}
+
+/* If X is a vector of equal CONST_DOUBLE values and that value is
+   Y, return the aarch64_fpconst_pow_of_2 of Y.  Otherwise return -1.  */
+
+int
+aarch64_vec_fpconst_pow_of_2 (rtx x)
+{
+  if (GET_CODE (x) != CONST_VECTOR)
+    return -1;
+
+  if (GET_MODE_CLASS (GET_MODE (x)) != MODE_VECTOR_FLOAT)
+    return -1;
+
+  int firstval = aarch64_fpconst_pow_of_2 (CONST_VECTOR_ELT (x, 0));
+  if (firstval <= 0)
+    return -1;
+
+  for (int i = 1; i < CONST_VECTOR_NUNITS (x); i++)
+    if (aarch64_fpconst_pow_of_2 (CONST_VECTOR_ELT (x, i)) != firstval)
+      return -1;
+
+  return firstval;
+}
+
 /* Implement TARGET_PROMOTED_TYPE to promote __fp16 to float.  */
 static tree
 aarch64_promoted_type (const_tree t)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index a3ec371..35f9877 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4287,6 +4287,25 @@ (define_insn "l<fcvt_pattern><su_optab><GPF:mode><GPI:mode>2"
   [(set_attr "type" "f_cvtf2i")]
 )
 
+(define_insn "*aarch64_fcvt<su_optab><GPF:mode><GPI:mode>2_mult"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(FIXUORS:GPI
+	  (mult:GPF
+	    (match_operand:GPF 1 "register_operand" "w")
+	    (match_operand:GPF 2 "aarch64_fp_pow2" "F"))))]
+  "TARGET_FLOAT
+   && IN_RANGE (aarch64_fpconst_pow_of_2 (operands[2]), 1,
+		GET_MODE_BITSIZE (<GPI:MODE>mode))"
+  {
+    int fbits = aarch64_fpconst_pow_of_2 (operands[2]);
+    char buf[64];
+    snprintf (buf, 64, "fcvtz<su>\\t%%<GPI:w>0, %%<GPF:s>1, #%d", fbits);
+    output_asm_insn (buf, operands);
+    return "";
+  }
+  [(set_attr "type" "f_cvtf2i")]
+)
+
 ;; fma - no throw
 
 (define_insn "fma<mode>4"
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 8af4b81..1bcbf62 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -87,6 +87,13 @@ (define_predicate "aarch64_fp_compare_operand"
        (and (match_code "const_double")
 	    (match_test "aarch64_float_const_zero_rtx_p (op)"))))
 
+(define_predicate "aarch64_fp_pow2"
+  (and (match_code "const_double")
+	(match_test "aarch64_fpconst_pow_of_2 (op) > 0")))
+
+(define_predicate "aarch64_fp_vec_pow2"
+  (match_test "aarch64_vec_fpconst_pow_of_2 (op) > 0"))
+
 (define_predicate "aarch64_plus_immediate"
   (and (match_code "const_int")
        (ior (match_test "aarch64_uimm12_shift (INTVAL (op))")
diff --git a/gcc/testsuite/gcc.target/aarch64/fmul_fcvt_1.c b/gcc/testsuite/gcc.target/aarch64/fmul_fcvt_1.c
new file mode 100644
index 0000000..4e3ace7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fmul_fcvt_1.c
@@ -0,0 +1,129 @@
+/* { dg-do run } */
+/* { dg-options "-save-temps -O2 -fno-inline" } */
+
+#define FUNC_DEFS(__a)	\
+int			\
+sffoo##__a (float x)	\
+{			\
+  return x * __a##.0f;	\
+}			\
+			\
+unsigned int		\
+usffoo##__a (float x)	\
+{			\
+  return x * __a##.0f;	\
+}			\
+			\
+long			\
+lsffoo##__a (float x)	\
+{			\
+  return x * __a##.0f;	\
+}			\
+			\
+unsigned long		\
+ulsffoo##__a (float x)	\
+{			\
+  return x * __a##.0f;	\
+}
+
+#define FUNC_DEFD(__a)	\
+long			\
+dffoo##__a (double x)	\
+{			\
+  return x * __a##.0;	\
+}			\
+			\
+unsigned long		\
+udffoo##__a (double x)	\
+{			\
+  return x * __a##.0;	\
+}			\
+int			\
+sdffoo##__a (double x)	\
+{			\
+  return x * __a##.0;	\
+}			\
+			\
+unsigned int		\
+usdffoo##__a (double x)	\
+{			\
+  return x * __a##.0;	\
+}
+
+FUNC_DEFS (4)
+FUNC_DEFD (4)
+/* { dg-final { scan-assembler-times "fcvtzs\tw\[0-9\], s\[0-9\]*.*#2" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\], s\[0-9\]*.*#2" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\], d\[0-9\]*.*#2" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tw\[0-9\], d\[0-9\]*.*#2" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tw\[0-9\], s\[0-9\]*.*#2" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tx\[0-9\], s\[0-9\]*.*#2" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tx\[0-9\], d\[0-9\]*.*#2" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tw\[0-9\], d\[0-9\]*.*#2" 1 } } */
+
+FUNC_DEFS (8)
+FUNC_DEFD (8)
+/* { dg-final { scan-assembler-times "fcvtzs\tw\[0-9\], s\[0-9\]*.*#3" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\], s\[0-9\]*.*#3" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\], d\[0-9\]*.*#3" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tw\[0-9\], d\[0-9\]*.*#3" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tw\[0-9\], s\[0-9\]*.*#3" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tx\[0-9\], s\[0-9\]*.*#3" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tx\[0-9\], d\[0-9\]*.*#3" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tw\[0-9\], d\[0-9\]*.*#3" 1 } } */
+
+FUNC_DEFS (16)
+FUNC_DEFD (16)
+/* { dg-final { scan-assembler-times "fcvtzs\tw\[0-9\], s\[0-9\]*.*#4" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\], s\[0-9\]*.*#4" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tx\[0-9\], d\[0-9\]*.*#4" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tw\[0-9\], d\[0-9\]*.*#4" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tw\[0-9\], s\[0-9\]*.*#4" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tx\[0-9\], s\[0-9\]*.*#4" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tx\[0-9\], d\[0-9\]*.*#4" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzu\tw\[0-9\], d\[0-9\]*.*#4" 1 } } */
+
+
+#define FUNC_TESTS(__a, __b)					\
+do								\
+  {								\
+    if (sffoo##__a (__b) != (int)(__b * __a))			\
+      __builtin_abort ();					\
+    if (usffoo##__a (__b) != (unsigned int)(__b * __a))	\
+      __builtin_abort ();					\
+    if (lsffoo##__a (__b) != (long)(__b * __a))		\
+      __builtin_abort ();					\
+    if (ulsffoo##__a (__b) != (unsigned long)(__b * __a))	\
+      __builtin_abort ();					\
+  } while (0)
+
+#define FUNC_TESTD(__a, __b)					\
+do								\
+  {								\
+    if (dffoo##__a (__b) != (long)(__b * __a))			\
+      __builtin_abort ();					\
+    if (udffoo##__a (__b) != (unsigned long)(__b * __a))	\
+      __builtin_abort ();					\
+    if (sdffoo##__a (__b) != (int)(__b * __a))			\
+      __builtin_abort ();					\
+    if (usdffoo##__a (__b) != (unsigned int)(__b * __a))	\
+      __builtin_abort ();					\
+  } while (0)
+
+int
+main (void)
+{
+  float i;
+
+  for (i = -0.001; i < 32.0; i += 1.0f)
+    {
+      FUNC_TESTS (4, i);
+      FUNC_TESTS (8, i);
+      FUNC_TESTS (16, i);
+
+      FUNC_TESTD (4, i);
+      FUNC_TESTD (8, i);
+      FUNC_TESTD (16, i);
+    }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c b/gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c
new file mode 100644
index 0000000..d8a9335
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fmul_fcvt_2.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-options "-save-temps -O2 -ftree-vectorize -fno-inline" } */
+
+#define N 1024
+
+#define FUNC_DEF(__a)		\
+void				\
+foo##__a (float *a, int *b)	\
+{				\
+  int i;			\
+  for (i = 0; i < N; i++)	\
+    b[i] = a[i] * __a##.0f;	\
+}
+
+FUNC_DEF (4)
+FUNC_DEF (8)
+FUNC_DEF (16)
+
+int ints[N];
+float floats[N];
+
+void
+reset_ints (int *arr)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    arr[i] = 0;
+}
+
+void
+check_result (int *is, int n)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    if (is[i] != i * n)
+      __builtin_abort ();
+}
+
+#define FUNC_CHECK(__a)		\
+do				\
+  {				\
+    reset_ints (ints);		\
+    foo##__a (floats, ints);	\
+    check_result (ints, __a);	\
+  } while (0)
+
+
+int
+main (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    floats[i] = (float) i;
+
+  FUNC_CHECK (4);
+  FUNC_CHECK (8);
+  FUNC_CHECK (16);
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "fmul\tv\[0-9\]*.*" } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tv\[0-9\].4s, v\[0-9\].4s*.*#2" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tv\[0-9\].4s, v\[0-9\].4s*.*#3" 1 } } */
+/* { dg-final { scan-assembler-times "fcvtzs\tv\[0-9\].4s, v\[0-9\].4s*.*#4" 1 } } */