From patchwork Fri Apr 17 11:19:14 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kugan Vivekanandarajah
 <kugan.vivekanandarajah@linaro.org>
X-Patchwork-Id: 47283
Message-ID: <5530EC32.4030806@linaro.org>
Date: Fri, 17 Apr 2015 21:19:14 +1000
From: Kugan <kugan.vivekanandarajah@linaro.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: James Greenhalgh <james.greenhalgh@arm.com>
CC: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>,
 "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
 Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
 Richard Earnshaw <Richard.Earnshaw@arm.com>,
 Jim Wilson <jim.wilson@linaro.org>
Subject: Re: [AArch64][PR65375] Fix RTX cost for vector SET
References: <55066BCC.4010900@linaro.org> <5506AA24.3050108@arm.com>
 <5506CD7A.7030109@linaro.org> <5506D77B.5060909@linaro.org>
 <55070972.3000800@arm.com> <5507813E.3060106@linaro.org>
 <5513B390.2030201@linaro.org> <552D8FF7.5000105@linaro.org>
 <20150415092509.GA20852@arm.com> <552E4150.3020403@linaro.org>
 <20150415111854.GB22143@arm.com> <552E4C90.4070208@linaro.org>
In-Reply-To: <552E4C90.4070208@linaro.org>

>> My point is that adding your patch while keeping the logic at the top
>> which claims to catch ALL vector operations makes for less readable
>> code.
>>
>> At the very least you'll need to update this comment:
>>
>>   /* TODO: The cost infrastructure currently does not handle
>>      vector operations.  Assume that all vector operations
>>      are equally expensive.  */
>>
>> to make it clear that this doesn't catch vector set operations.
>>
>> But fixing the comment doesn't improve the messy code so I'd certainly
>> prefer to see one of the other approaches which have been discussed.
> 
> I see your point. Let me work on this based on your suggestions above.

Hi James,

Here is an attempt along these lines. Is this what you had in mind?
I have tried to keep the functionality as before so that we can tune
the parameters later. It is not fully tested yet.
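
As a rough illustration of the vector register copy costing in the
patch, here is a standalone sketch. It is not the GCC code itself:
VECTOR_REG_BYTES and vector_reg_copy_cost are hypothetical names, the
16-byte size assumes GET_MODE_SIZE (V4SImode), and COSTS_N_INSNS is
redefined locally on the assumption that one instruction costs 4 units.

```c
#include <assert.h>

/* Assumption: same scaling as GCC's COSTS_N_INSNS (one insn == 4 units).  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Assumption: one 128-bit vector register, i.e. GET_MODE_SIZE (V4SImode).  */
#define VECTOR_REG_BYTES 16

/* Hypothetical stand-in for the new vector branch of the SET/REG case:
   charge one instruction per vector register needed to hold MODE_BYTES.  */
static int
vector_reg_copy_cost (int mode_bytes)
{
  assert (mode_bytes > 0);
  int n_minus_1 = (mode_bytes - 1) / VECTOR_REG_BYTES;
  return COSTS_N_INSNS (n_minus_1 + 1);
}
```

Under these assumptions an 8-byte or 16-byte mode costs one instruction,
while a 32-byte mode that spans two registers costs two, so
vector_reg_copy_cost (32) returns COSTS_N_INSNS (2).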

Thanks,
Kugan
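
For reference, the way the new vect.load and vect.store fields are meant
to be picked up can be sketched in isolation as below. This is only a
simplified model: mode_kind, ldst_cost_table, store_cost and the sample
table values are hypothetical, and the addressing-mode cost that the
real code adds is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: same scaling as GCC's COSTS_N_INSNS (one insn == 4 units).  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Vector table with the two fields added by the patch.  */
struct vector_cost_table
{
  const int alu;
  const int load;
  const int store;
};

/* Hypothetical, trimmed-down scalar load/store table.  */
struct ldst_cost_table
{
  const int load;
  const int store;
};

/* Hypothetical replacement for the VECTOR_MODE_P / MODE_INT tests.  */
enum mode_kind { KIND_INT, KIND_VECTOR };

/* Illustrative values only; the vector entries follow the generic table.  */
static const struct vector_cost_table sample_vect =
  { COSTS_N_INSNS (1), COSTS_N_INSNS (1), COSTS_N_INSNS (1) };
static const struct ldst_cost_table sample_ldst = { COSTS_N_INSNS (1), 0 };

/* Base cost of one instruction, plus the per-class extra cost when
   optimising for speed, mirroring the shape of the SET-to-MEM case.  */
static int
store_cost (enum mode_kind kind, bool speed)
{
  int cost = COSTS_N_INSNS (1);
  if (speed)
    cost += (kind == KIND_VECTOR) ? sample_vect.store : sample_ldst.store;
  return cost;
}
```

With these sample values a vector store is charged the base instruction
plus vect.store when optimising for speed, and only the base
instruction when optimising for size.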

diff --git a/gcc/config/aarch64/aarch64-cost-tables.h b/gcc/config/aarch64/aarch64-cost-tables.h
index ae2b547..ed9432e 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -121,7 +121,9 @@ const struct cpu_cost_table thunderx_extra_costs =
   },
   /* Vector */
   {
-    COSTS_N_INSNS (1)	/* Alu.  */
+    COSTS_N_INSNS (1),	/* Alu.  */
+    COSTS_N_INSNS (1),	/* Load.  */
+    COSTS_N_INSNS (1)	/* Store.  */
   }
 };
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cba3c1a..c2d4a53 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5499,16 +5499,6 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
      above this default.  */
   *cost = COSTS_N_INSNS (1);
 
-  /* TODO: The cost infrastructure currently does not handle
-     vector operations.  Assume that all vector operations
-     are equally expensive.  */
-  if (VECTOR_MODE_P (mode))
-    {
-      if (speed)
-	*cost += extra_cost->vect.alu;
-      return true;
-    }
-
   switch (code)
     {
     case SET:
@@ -5523,7 +5513,9 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 	  if (speed)
 	    {
 	      rtx address = XEXP (op0, 0);
-	      if (GET_MODE_CLASS (mode) == MODE_INT)
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.store;
+	      else if (GET_MODE_CLASS (mode) == MODE_INT)
 		*cost += extra_cost->ldst.store;
 	      else if (mode == SFmode)
 		*cost += extra_cost->ldst.storef;
@@ -5544,10 +5536,17 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 
 	  /* Fall through.  */
 	case REG:
+	  if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1))
+	    {
+              /* The cost is 1 per vector-register copied.  */
+              int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
+			      / GET_MODE_SIZE (V4SImode);
+              *cost = COSTS_N_INSNS (n_minus_1 + 1);
+	    }
 	  /* const0_rtx is in general free, but we will use an
 	     instruction to set a register to 0.  */
-          if (REG_P (op1) || op1 == const0_rtx)
-            {
+	  else if (REG_P (op1) || op1 == const0_rtx)
+	    {
               /* The cost is 1 per register copied.  */
               int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
 			      / UNITS_PER_WORD;
@@ -5570,6 +5569,7 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 	      && (GET_MODE_BITSIZE (GET_MODE (XEXP (op1, 0)))
 		  >= INTVAL (XEXP (op0, 1))))
 	    op1 = XEXP (op1, 0);
+	  gcc_assert (!VECTOR_MODE_P (mode));
 
           if (CONST_INT_P (op1))
             {
@@ -5621,8 +5621,10 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
     case CONST_DOUBLE:
       if (speed)
 	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
 	  /* mov[df,sf]_aarch64.  */
-	  if (aarch64_float_const_representable_p (x))
+	  else if (aarch64_float_const_representable_p (x))
 	    /* FMOV (scalar immediate).  */
 	    *cost += extra_cost->fp[mode == DFmode].fpconst;
 	  else if (!aarch64_float_const_zero_rtx_p (x))
@@ -5650,7 +5652,9 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 	     approximation for the additional cost of the addressing
 	     mode.  */
 	  rtx address = XEXP (x, 0);
-	  if (GET_MODE_CLASS (mode) == MODE_INT)
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.load;
+	  else if (GET_MODE_CLASS (mode) == MODE_INT)
 	    *cost += extra_cost->ldst.load;
 	  else if (mode == SFmode)
 	    *cost += extra_cost->ldst.loadf;
@@ -5705,7 +5709,12 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
     case CLRSB:
     case CLZ:
       if (speed)
-        *cost += extra_cost->alu.clz;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.clz;
+	}
 
       return false;
 
@@ -5790,6 +5799,13 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
           return false;
         }
 
+      /* VCMP.  */
+      if (VECTOR_MODE_P (mode))
+	{
+	  if (speed)
+	    *cost += extra_cost->vect.alu;
+	  return true;
+	}
       return false;
 
     case MINUS:
@@ -5808,8 +5824,13 @@ cost_minus:
 	    *cost += rtx_cost (op0, MINUS, 0, speed);
 
 	    if (speed)
-	      /* SUB(S) (immediate).  */
-	      *cost += extra_cost->alu.arith;
+	      {
+		if (VECTOR_MODE_P (mode))
+		  *cost += extra_cost->vect.alu;
+		/* SUB(S) (immediate).  */
+		else
+		  *cost += extra_cost->alu.arith;
+	      }
 	    return true;
 
 	  }
@@ -5818,8 +5839,12 @@ cost_minus:
         if (aarch64_rtx_arith_op_extract_p (op1, mode))
 	  {
 	    if (speed)
-	      *cost += extra_cost->alu.arith_shift;
-
+	      {
+		if (VECTOR_MODE_P (mode))
+		  *cost += extra_cost->vect.alu;
+		else
+		  *cost += extra_cost->alu.arith_shift;
+	      }
 	    *cost += rtx_cost (XEXP (XEXP (op1, 0), 0),
 			       (enum rtx_code) GET_CODE (op1),
 			       0, speed);
@@ -5844,7 +5869,10 @@ cost_minus:
 
 	if (speed)
 	  {
-	    if (GET_MODE_CLASS (mode) == MODE_INT)
+	    if (VECTOR_MODE_P (mode))
+	      /* Vector SUB.  */
+	      *cost += extra_cost->vect.alu;
+	    else if (GET_MODE_CLASS (mode) == MODE_INT)
 	      /* SUB(S).  */
 	      *cost += extra_cost->alu.arith;
 	    else if (GET_MODE_CLASS (mode) == MODE_FLOAT)
@@ -5878,8 +5906,13 @@ cost_plus:
 	    *cost += rtx_cost (op0, PLUS, 0, speed);
 
 	    if (speed)
-	      /* ADD (immediate).  */
-	      *cost += extra_cost->alu.arith;
+	      {
+		if (VECTOR_MODE_P (mode))
+		  *cost += extra_cost->vect.alu;
+		/* ADD (immediate).  */
+		else
+		  *cost += extra_cost->alu.arith;
+	      }
 	    return true;
 	  }
 
@@ -5887,8 +5920,12 @@ cost_plus:
         if (aarch64_rtx_arith_op_extract_p (op0, mode))
 	  {
 	    if (speed)
-	      *cost += extra_cost->alu.arith_shift;
-
+	      {
+		if (VECTOR_MODE_P (mode))
+		  *cost += extra_cost->vect.alu;
+		else
+		  *cost += extra_cost->alu.arith_shift;
+	      }
 	    *cost += rtx_cost (XEXP (XEXP (op0, 0), 0),
 			       (enum rtx_code) GET_CODE (op0),
 			       0, speed);
@@ -5913,7 +5950,10 @@ cost_plus:
 
 	if (speed)
 	  {
-	    if (GET_MODE_CLASS (mode) == MODE_INT)
+	    if (VECTOR_MODE_P (mode))
+	      /* Vector ADD.  */
+	      *cost += extra_cost->vect.alu;
+	    else if (GET_MODE_CLASS (mode) == MODE_INT)
 	      /* ADD.  */
 	      *cost += extra_cost->alu.arith;
 	    else if (GET_MODE_CLASS (mode) == MODE_FLOAT)
@@ -5927,8 +5967,12 @@ cost_plus:
       *cost = COSTS_N_INSNS (1);
 
       if (speed)
-        *cost += extra_cost->alu.rev;
-
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.rev;
+	}
       return false;
 
     case IOR:
@@ -5936,10 +5980,14 @@ cost_plus:
         {
           *cost = COSTS_N_INSNS (1);
 
-          if (speed)
-            *cost += extra_cost->alu.rev;
-
-          return true;
+	  if (speed)
+	    {
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      else
+		*cost += extra_cost->alu.rev;
+	    }
+	  return true;
         }
     /* Fall through.  */
     case XOR:
@@ -5948,6 +5996,13 @@ cost_plus:
       op0 = XEXP (x, 0);
       op1 = XEXP (x, 1);
 
+      if (VECTOR_MODE_P (mode))
+	{
+	  if (speed)
+	    *cost += extra_cost->vect.alu;
+	  return true;
+	}
+
       if (code == AND
           && GET_CODE (op0) == MULT
           && CONST_INT_P (XEXP (op0, 1))
@@ -6013,10 +6068,15 @@ cost_plus:
       return false;
 
     case NOT:
-      /* MVN.  */
       if (speed)
-	*cost += extra_cost->alu.logical;
-
+	{
+	  /* Vector NOT.  */
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  /* MVN.  */
+	  else
+	    *cost += extra_cost->alu.logical;
+	}
       /* The logical instruction could have the shifted register form,
          but the cost is the same if the shift is processed as a separate
          instruction, so we don't bother with it here.  */
@@ -6057,13 +6117,18 @@ cost_plus:
 
       /* UXTB/UXTH.  */
       if (speed)
-	*cost += extra_cost->alu.extend;
-
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.extend;
+	}
       return false;
 
     case SIGN_EXTEND:
       if (MEM_P (XEXP (x, 0)))
 	{
+	  gcc_assert (!VECTOR_MODE_P (mode));
 	  /* LDRSH.  */
 	  if (speed)
 	    {
@@ -6078,7 +6143,12 @@ cost_plus:
 	}
 
       if (speed)
-	*cost += extra_cost->alu.extend;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.extend;
+	}
       return false;
 
     case ASHIFT:
@@ -6087,10 +6157,16 @@ cost_plus:
 
       if (CONST_INT_P (op1))
         {
-	  /* LSL (immediate), UBMF, UBFIZ and friends.  These are all
-	     aliases.  */
 	  if (speed)
-	    *cost += extra_cost->alu.shift;
+	    {
+	      /* VSHL (immediate).  */
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      /* LSL (immediate), UBMF, UBFIZ and friends.  These are all
+		 aliases.  */
+	      else
+		*cost += extra_cost->alu.shift;
+	    }
 
           /* We can incorporate zero/sign extend for free.  */
           if (GET_CODE (op0) == ZERO_EXTEND
@@ -6102,10 +6178,15 @@ cost_plus:
         }
       else
         {
-	  /* LSLV.  */
 	  if (speed)
-	    *cost += extra_cost->alu.shift_reg;
-
+	    {
+	      /* VSHL (register).  */
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      /* LSLV.  */
+	      else
+		*cost += extra_cost->alu.shift_reg;
+	    }
 	  return false;  /* All arguments need to be in registers.  */
         }
 
@@ -6120,18 +6201,27 @@ cost_plus:
 	{
 	  /* ASR (immediate) and friends.  */
 	  if (speed)
-	    *cost += extra_cost->alu.shift;
+	    {
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      else
+		*cost += extra_cost->alu.shift;
+	    }
 
 	  *cost += rtx_cost (op0, (enum rtx_code) code, 0, speed);
 	  return true;
 	}
       else
 	{
-
-	  /* ASR (register) and friends.  */
 	  if (speed)
-	    *cost += extra_cost->alu.shift_reg;
-
+	    {
+	      /* Vector ASR (register).  */
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      /* ASR (register) and friends.  */
+	      else
+		*cost += extra_cost->alu.shift_reg;
+	    }
 	  return false;  /* All arguments need to be in registers.  */
 	}
 
@@ -6179,7 +6269,12 @@ cost_plus:
     case SIGN_EXTRACT:
       /* UBFX/SBFX.  */
       if (speed)
-	*cost += extra_cost->alu.bfx;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.bfx;
+	}
 
       /* We can trust that the immediates used will be correct (there
 	 are no by-register forms), so we need only cost op0.  */
@@ -6196,7 +6291,9 @@ cost_plus:
     case UMOD:
       if (speed)
 	{
-	  if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT)
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT)
 	    *cost += (extra_cost->mult[GET_MODE (x) == DImode].add
 		      + extra_cost->mult[GET_MODE (x) == DImode].idiv);
 	  else if (GET_MODE (x) == DFmode)
@@ -6213,7 +6310,9 @@ cost_plus:
     case SQRT:
       if (speed)
 	{
-	  if (GET_MODE_CLASS (mode) == MODE_INT)
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else if (GET_MODE_CLASS (mode) == MODE_INT)
 	    /* There is no integer SQRT, so only DIV and UDIV can get
 	       here.  */
 	    *cost += extra_cost->mult[mode == DImode].idiv;
@@ -6245,7 +6344,12 @@ cost_plus:
       op2 = XEXP (x, 2);
 
       if (speed)
-	*cost += extra_cost->fp[mode == DFmode].fma;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->fp[mode == DFmode].fma;
+	}
 
       /* FMSUB, FNMADD, and FNMSUB are free.  */
       if (GET_CODE (op0) == NEG)
@@ -6285,7 +6389,13 @@ cost_plus:
 
     case FLOAT_EXTEND:
       if (speed)
-	*cost += extra_cost->fp[mode == DFmode].widen;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    /* Vector conversion.  */
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->fp[mode == DFmode].widen;
+	}
       return false;
 
     case FLOAT_TRUNCATE:
@@ -6311,8 +6421,13 @@ cost_plus:
         }
 
       if (speed)
-        *cost += extra_cost->fp[GET_MODE (x) == DFmode].toint;
-
+	{
+	  /* FCVT.  */
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->fp[GET_MODE (x) == DFmode].toint;
+	}
       *cost += rtx_cost (x, (enum rtx_code) code, 0, speed);
       return true;
 
@@ -6321,7 +6436,12 @@ cost_plus:
 	{
 	  /* FABS and FNEG are analogous.  */
 	  if (speed)
-	    *cost += extra_cost->fp[mode == DFmode].neg;
+	    {
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      else
+		*cost += extra_cost->fp[mode == DFmode].neg;
+	    }
 	}
       else
 	{
@@ -6338,10 +6458,13 @@ cost_plus:
     case SMIN:
       if (speed)
 	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
 	  /* FMAXNM/FMINNM/FMAX/FMIN.
 	     TODO: This may not be accurate for all implementations, but
 	     we do not model this in the cost tables.  */
-	  *cost += extra_cost->fp[mode == DFmode].addsub;
+	  else
+	    *cost += extra_cost->fp[mode == DFmode].addsub;
 	}
       return false;
 
diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 3ee7ebf..c8e1d2e 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -124,6 +124,8 @@ struct fp_cost_table
 struct vector_cost_table
 {
   const int alu;
+  const int load;
+  const int store;
 };
 
 struct cpu_cost_table
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 05e96a9..257902c 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -119,7 +119,9 @@ const struct cpu_cost_table generic_extra_costs =
   },
   /* Vector */
   {
-    COSTS_N_INSNS (1)	/* alu.  */
+    COSTS_N_INSNS (1),	/* alu.  */
+    COSTS_N_INSNS (1),	/* Load.  */
+    COSTS_N_INSNS (1)	/* Store.  */
   }
 };
 
@@ -220,7 +222,9 @@ const struct cpu_cost_table cortexa53_extra_costs =
   },
   /* Vector */
   {
-    COSTS_N_INSNS (1)	/* alu.  */
+    COSTS_N_INSNS (1),	/* alu.  */
+    COSTS_N_INSNS (1),	/* Load.  */
+    COSTS_N_INSNS (1)	/* Store.  */
   }
 };
 
@@ -321,7 +325,9 @@ const struct cpu_cost_table cortexa57_extra_costs =
   },
   /* Vector */
   {
-    COSTS_N_INSNS (1)  /* alu.  */
+    COSTS_N_INSNS (1),  /* alu.  */
+    COSTS_N_INSNS (1),  /* Load.  */
+    COSTS_N_INSNS (1)   /* Store.  */
   }
 };
 
@@ -422,7 +428,9 @@ const struct cpu_cost_table xgene1_extra_costs =
   },
   /* Vector */
   {
-    COSTS_N_INSNS (2)  /* alu.  */
+    COSTS_N_INSNS (2),  /* alu.  */
+    COSTS_N_INSNS (1),  /* Load.  */
+    COSTS_N_INSNS (1)   /* Store.  */
   }
 };