From patchwork Fri Apr 24 23:26:16 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kugan Vivekanandarajah
 <kugan.vivekanandarajah@linaro.org>
X-Patchwork-Id: 47570
Return-Path: <patchwork-forward+bncBD47NO755IHRBNVC5OUQKGQE6RWLRCY@linaro.org>
X-Original-To: linaro@patches.linaro.org
Delivered-To: linaro@patches.linaro.org
Received: from mail-wi0-f199.google.com (mail-wi0-f199.google.com
 [209.85.212.199])
 by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id B513020553
 for <linaro@patches.linaro.org>; Fri, 24 Apr 2015 23:26:47 +0000 (UTC)
Received: by wizk4 with SMTP id k4sf7329248wiz.2
 for <linaro@patches.linaro.org>; Fri, 24 Apr 2015 16:26:47 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender
 :delivered-to:message-id:date:from:user-agent:mime-version:to:cc
 :subject:references:in-reply-to:content-type:x-original-sender
 :x-original-authentication-results;
 bh=RhmeHQi7EqGiL8vJmUAK/2Sdcq6Jo9QRFU21SXQOVY8=;
 b=B1S1YZ4Rvjie1Gt50RCncTptHRtKSoJKw7V0MJT/Qkb6sYH4Haie5TPjobEMZl5N/v
 F4wRY3t9AHxxpsl+vvnnv8k60kZzLNvezzBxVAFMJfcVLdIMuFIcZk5RFncThvaXfhW3
 FharetvtZRzv+JoHPwMps8PfmgJ+KWB/SmslqqpHbQAI7bVLHX/0UmbuEUYaEgCjqo8A
 7FohYCE+i82GM3iX/ov1wlqVm66BVc8Cn6n2V53ga2MQ4lDxyDk7OZKu9NMVimeGodkY
 mHO0QzsuM5LWvkli7uLugAntAtxbA7s7/9WQLUOBkCbb+YWqe/80YS9mWjfGifrXp4S2
 yhvg==
X-Gm-Message-State: ALoCoQl0xGkfsG+UVF7R8rRbHVJODWbJqpMVymsozF1mnQ49N9tkxvuHh+M/zhVs/qnt2NmofLNi
X-Received: by 10.112.130.71 with SMTP id oc7mr412776lbb.23.1429918006854;
 Fri, 24 Apr 2015 16:26:46 -0700 (PDT)
X-BeenThere: patchwork-forward@linaro.org
Received: by 10.152.7.72 with SMTP id h8ls564676laa.71.gmail;
 Fri, 24 Apr 2015 16:26:46 -0700 (PDT)
X-Received: by 10.152.5.7 with SMTP id o7mr601546lao.51.1429918006687;
 Fri, 24 Apr 2015 16:26:46 -0700 (PDT)
Received: from mail-la0-x22e.google.com (mail-la0-x22e.google.com.
 [2a00:1450:4010:c03::22e]) by mx.google.com with ESMTPS id
 jf6si9284604lbc.170.2015.04.24.16.26.46
 for <patchwork-forward@linaro.org>
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Fri, 24 Apr 2015 16:26:46 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 2a00:1450:4010:c03::22e as permitted sender)
 client-ip=2a00:1450:4010:c03::22e; 
Received: by labbd9 with SMTP id bd9so45516832lab.2
 for <patchwork-forward@linaro.org>;
 Fri, 24 Apr 2015 16:26:46 -0700 (PDT)
X-Received: by 10.152.36.161 with SMTP id r1mr633376laj.88.1429918006426;
 Fri, 24 Apr 2015 16:26:46 -0700 (PDT)
X-Forwarded-To: patchwork-forward@linaro.org
X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org
Delivered-To: patch@linaro.org
Received: by 10.112.67.65 with SMTP id l1csp102128lbt;
 Fri, 24 Apr 2015 16:26:45 -0700 (PDT)
X-Received: by 10.66.141.109 with SMTP id rn13mr1223695pab.113.1429918004108; 
 Fri, 24 Apr 2015 16:26:44 -0700 (PDT)
Received: from sourceware.org (server1.sourceware.org. [209.132.180.131])
 by mx.google.com with ESMTPS id
 xs8si19405061pbc.108.2015.04.24.16.26.43 for <patch@linaro.org>
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Fri, 24 Apr 2015 16:26:44 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 gcc-patches-return-395995-patch=linaro.org@gcc.gnu.org
 designates 209.132.180.131 as permitted sender)
 client-ip=209.132.180.131; 
Received: (qmail 23866 invoked by alias); 24 Apr 2015 23:26:29 -0000
Mailing-List: list patchwork-forward@linaro.org;
 contact patchwork-forward+owners@linaro.org
Precedence: list
List-Id: <patchwork-forward.linaro.org>
List-Unsubscribe: <mailto:googlegroups-manage+836684582541+unsubscribe@googlegroups.com>, 
 <http://groups.google.com/a/linaro.org/group/patchwork-forward/subscribe>
List-Archive: <http://groups.google.com/a/linaro.org/group/patchwork-forward/>
List-Post: <http://groups.google.com/a/linaro.org/group/patchwork-forward/post>, 
 <mailto:patchwork-forward@linaro.org>
List-Help: <http://support.google.com/a/linaro.org/bin/topic.py?topic=25838>, 
 <mailto:patchwork-forward+help@linaro.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 23813 invoked by uid 89); 24 Apr 2015 23:26:28 -0000
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL, BAYES_00,
 RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-pa0-f42.google.com
Received: from mail-pa0-f42.google.com (HELO mail-pa0-f42.google.com)
 (209.85.220.42) by sourceware.org
 (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256
 encrypted) ESMTPS; Fri, 24 Apr 2015 23:26:25 +0000
Received: by pabsx10 with SMTP id sx10so61150473pab.3 for
 <gcc-patches@gcc.gnu.org>; Fri, 24 Apr 2015 16:26:23 -0700 (PDT)
X-Received: by 10.66.142.169 with SMTP id rx9mr1360146pab.84.1429917983822;
 Fri, 24 Apr 2015 16:26:23 -0700 (PDT)
Received: from [10.1.1.2] (58-6-183-210.dyn.iinet.net.au. [58.6.183.210]) by
 mx.google.com with ESMTPSA id
 sz7sm12239023pab.22.2015.04.24.16.26.20 (version=TLSv1.2
 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Fri, 24 Apr 2015 16:26:23 -0700 (PDT)
Message-ID: <553AD118.3010705@linaro.org>
Date: Sat, 25 Apr 2015 09:26:16 +1000
From: Kugan <kugan.vivekanandarajah@linaro.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: James Greenhalgh <james.greenhalgh@arm.com>
CC: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>,
 "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
 Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
 Richard Earnshaw <Richard.Earnshaw@arm.com>,
 Jim Wilson <jim.wilson@linaro.org>
Subject: Re: [AArch64][PR65375] Fix RTX cost for vector SET
References: <5506D77B.5060909@linaro.org> <55070972.3000800@arm.com>
 <5507813E.3060106@linaro.org> <5513B390.2030201@linaro.org>
 <552D8FF7.5000105@linaro.org> <20150415092509.GA20852@arm.com>
 <552E4150.3020403@linaro.org> <20150415111854.GB22143@arm.com>
 <552E4C90.4070208@linaro.org> <5530EC32.4030806@linaro.org>
 <20150420202225.GA7414@arm.com>
In-Reply-To: <20150420202225.GA7414@arm.com>
X-IsSubscribed: yes
X-Original-Sender: kugan.vivekanandarajah@linaro.org
X-Original-Authentication-Results: mx.google.com; spf=pass (google.com:
 domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 2a00:1450:4010:c03::22e as permitted sender)
 smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org; 
 dkim=pass header.i=@gcc.gnu.org
X-Google-Group-Id: 836684582541

On 21/04/15 06:22, James Greenhalgh wrote:
> On Fri, Apr 17, 2015 at 12:19:14PM +0100, Kugan wrote:
>>>> My point is that adding your patch while keeping the logic at the top
>>>> which claims to catch ALL vector operations makes for less readable
>>>> code.
>>>>
>>>> At the very least you'll need to update this comment:
>>>>
>>>>   /* TODO: The cost infrastructure currently does not handle
>>>>      vector operations.  Assume that all vector operations
>>>>      are equally expensive.  */
>>>>
>>>> to make it clear that this doesn't catch vector set operations.
>>>>
>>>> But fixing the comment doesn't improve the messy code so I'd certainly
>>>> prefer to see one of the other approaches which have been discussed.
>>>
>>> I see your point. Let me work on this based on your suggestions above.
>>
>> Hi James,
>>
>> Here is an attempt along this line. Is this what you have in mind?
>> Trying to keep functionality as before so that we can tune the
>> parameters later. Not fully tested yet.
> 
> Hi Kugan,
> 
> Sorry to have dropped out of the thread for a while, I'm currently
> travelling in the US.
> 
> This is along the lines of what I had in mind, thanks for digging through
> and doing it. It needs a little polishing, just neaten up the rough edges
> of comments and where they sit next to the new if conditionals, and of course,
> testing, and I have a few comments below.
> 
> Thanks,
> James
> 
>> diff --git a/gcc/config/aarch64/aarch64-cost-tables.h b/gcc/config/aarch64/aarch64-cost-tables.h
>> index ae2b547..ed9432e 100644
>> --- a/gcc/config/aarch64/aarch64-cost-tables.h
>> +++ b/gcc/config/aarch64/aarch64-cost-tables.h
>> @@ -121,7 +121,9 @@ const struct cpu_cost_table thunderx_extra_costs =
>>    },
>>    /* Vector */
>>    {
>> -    COSTS_N_INSNS (1)	/* Alu.  */
>> +    COSTS_N_INSNS (1),	/* Alu.  */
>> +    COSTS_N_INSNS (1),	/* Load.  */
>> +    COSTS_N_INSNS (1)	/* Store.  */
>>    }
>>  };
> 
> Can you push the Load/Stores in to the LD/ST section above and give
> them a name like loadv/storev.
> 
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index cba3c1a..c2d4a53 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
> 
> <snip>
> 
>> @@ -5570,6 +5569,7 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
>>  	      && (GET_MODE_BITSIZE (GET_MODE (XEXP (op1, 0)))
>>  		  >= INTVAL (XEXP (op0, 1))))
>>  	    op1 = XEXP (op1, 0);
>> +	  gcc_assert (!VECTOR_MODE_P (mode));
> 
> As Kyrill asked, please drop this.


Thanks for the review. I have updated the patch based on the comments
with some other minor changes. Bootstrapped and regression tested on
aarch64-none-linux-gnu with no new regressions. Is this OK for trunk?


Thanks,
Kugan


gcc/ChangeLog:

2015-04-24  Kugan Vivekanandarajah  <kuganv@linaro.org>
	    Jim Wilson  <jim.wilson@linaro.org>

	* config/arm/aarch-common-protos.h (struct mem_cost_table): Add
	new fields loadv and storev.
	* config/aarch64/aarch64-cost-tables.h (thunderx_extra_costs):
	Initialize loadv and storev.
	* config/arm/aarch-cost-tables.h (generic_extra_costs): Likewise.
	(cortexa53_extra_costs): Likewise.
	(cortexa57_extra_costs): Likewise.
	(xgene1_extra_costs): Likewise.
	* config/aarch64/aarch64.c (aarch64_rtx_costs): Update vector
	rtx_costs.
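
For readers following the SET/REG hunk below, the register-copy cost can
be sketched outside of GCC as follows. This is only an illustration, not
code from the patch: reg_copy_cost, VECTOR_REG_BYTES and the is_vector
flag are hypothetical stand-ins (for the n_minus_1 computation,
GET_MODE_SIZE (V4SImode) and VECTOR_MODE_P), and UNITS_PER_WORD is
assumed to be 8 as on AArch64. COSTS_N_INSNS mirrors the rtl.h
definition.

```c
#include <assert.h>

/* Mirrors GCC's COSTS_N_INSNS from rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Assumed values, for illustration only.  */
#define UNITS_PER_WORD 8	/* Bytes in an AArch64 general register.  */
#define VECTOR_REG_BYTES 16	/* Stand-in for GET_MODE_SIZE (V4SImode).  */

/* Hypothetical helper: cost of a register-to-register copy of a value
   that is MODE_BYTES wide.  One instruction is charged per register
   needed to hold the value; vector values are counted in 16-byte
   registers, everything else in word-sized registers.  */
static int
reg_copy_cost (int mode_bytes, int is_vector)
{
  assert (mode_bytes > 0);

  int reg_bytes = is_vector ? VECTOR_REG_BYTES : UNITS_PER_WORD;
  int n_minus_1 = (mode_bytes - 1) / reg_bytes;
  return COSTS_N_INSNS (n_minus_1 + 1);
}
```

Under these assumptions a 16-byte vector copy costs COSTS_N_INSNS (1),
whereas a 16-byte non-vector copy costs COSTS_N_INSNS (2) because it
occupies two word-sized registers.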

diff --git a/gcc/config/aarch64/aarch64-cost-tables.h b/gcc/config/aarch64/aarch64-cost-tables.h
index ae2b547..939125c 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -83,7 +83,9 @@ const struct cpu_cost_table thunderx_extra_costs =
     0,			/* N/A: Stm_regs_per_insn_subsequent.  */
     0,			/* Storef.  */
     0,			/* Stored.  */
-    COSTS_N_INSNS (1)  /* Store_unaligned.  */
+    COSTS_N_INSNS (1),	/* Store_unaligned.  */
+    COSTS_N_INSNS (1),	/* Loadv.  */
+    COSTS_N_INSNS (1)	/* Storev.  */
   },
   {
     /* FP SFmode */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cba3c1a..13425fc 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5499,16 +5499,6 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
      above this default.  */
   *cost = COSTS_N_INSNS (1);
 
-  /* TODO: The cost infrastructure currently does not handle
-     vector operations.  Assume that all vector operations
-     are equally expensive.  */
-  if (VECTOR_MODE_P (mode))
-    {
-      if (speed)
-	*cost += extra_cost->vect.alu;
-      return true;
-    }
-
   switch (code)
     {
     case SET:
@@ -5523,7 +5513,9 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 	  if (speed)
 	    {
 	      rtx address = XEXP (op0, 0);
-	      if (GET_MODE_CLASS (mode) == MODE_INT)
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->ldst.storev;
+	      else if (GET_MODE_CLASS (mode) == MODE_INT)
 		*cost += extra_cost->ldst.store;
 	      else if (mode == SFmode)
 		*cost += extra_cost->ldst.storef;
@@ -5544,15 +5536,22 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 
 	  /* Fall through.  */
 	case REG:
+	  /* The cost is one per vector-register copied.  */
+	  if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1))
+	    {
+	      int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
+			      / GET_MODE_SIZE (V4SImode);
+	      *cost = COSTS_N_INSNS (n_minus_1 + 1);
+	    }
 	  /* const0_rtx is in general free, but we will use an
 	     instruction to set a register to 0.  */
-          if (REG_P (op1) || op1 == const0_rtx)
-            {
-              /* The cost is 1 per register copied.  */
-              int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
+	  else if (REG_P (op1) || op1 == const0_rtx)
+	    {
+	      /* The cost is 1 per register copied.  */
+	      int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1)
 			      / UNITS_PER_WORD;
-              *cost = COSTS_N_INSNS (n_minus_1 + 1);
-            }
+	      *cost = COSTS_N_INSNS (n_minus_1 + 1);
+	    }
           else
 	    /* Cost is just the cost of the RHS of the set.  */
 	    *cost += rtx_cost (op1, SET, 1, speed);
@@ -5650,7 +5649,9 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 	     approximation for the additional cost of the addressing
 	     mode.  */
 	  rtx address = XEXP (x, 0);
-	  if (GET_MODE_CLASS (mode) == MODE_INT)
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->ldst.loadv;
+	  else if (GET_MODE_CLASS (mode) == MODE_INT)
 	    *cost += extra_cost->ldst.load;
 	  else if (mode == SFmode)
 	    *cost += extra_cost->ldst.loadf;
@@ -5667,6 +5668,14 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
     case NEG:
       op0 = XEXP (x, 0);
 
+      if (VECTOR_MODE_P (mode))
+	{
+	  if (speed)
+	    /* FNEG.  */
+	    *cost += extra_cost->vect.alu;
+	  return false;
+	}
+
       if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT)
        {
           if (GET_RTX_CLASS (GET_CODE (op0)) == RTX_COMPARE
@@ -5705,7 +5714,12 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
     case CLRSB:
     case CLZ:
       if (speed)
-        *cost += extra_cost->alu.clz;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.clz;
+	}
 
       return false;
 
@@ -5790,6 +5804,20 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
           return false;
         }
 
+      if (VECTOR_MODE_P (mode))
+	{
+	  /* Vector compare.  */
+	  if (speed)
+	    *cost += extra_cost->vect.alu;
+
+	  if (aarch64_float_const_zero_rtx_p (op1))
+	    {
+	      /* Vector cm (eq|ge|gt|lt|le) supports constant 0.0 for no extra
+		 cost.  */
+	      return true;
+	    }
+	  return false;
+	}
       return false;
 
     case MINUS:
@@ -5844,7 +5872,10 @@ cost_minus:
 
 	if (speed)
 	  {
-	    if (GET_MODE_CLASS (mode) == MODE_INT)
+	    if (VECTOR_MODE_P (mode))
+	      /* Vector SUB.  */
+	      *cost += extra_cost->vect.alu;
+	    else if (GET_MODE_CLASS (mode) == MODE_INT)
 	      /* SUB(S).  */
 	      *cost += extra_cost->alu.arith;
 	    else if (GET_MODE_CLASS (mode) == MODE_FLOAT)
@@ -5888,7 +5919,6 @@ cost_plus:
 	  {
 	    if (speed)
 	      *cost += extra_cost->alu.arith_shift;
-
 	    *cost += rtx_cost (XEXP (XEXP (op0, 0), 0),
 			       (enum rtx_code) GET_CODE (op0),
 			       0, speed);
@@ -5913,7 +5943,10 @@ cost_plus:
 
 	if (speed)
 	  {
-	    if (GET_MODE_CLASS (mode) == MODE_INT)
+	    if (VECTOR_MODE_P (mode))
+	      /* Vector ADD.  */
+	      *cost += extra_cost->vect.alu;
+	    else if (GET_MODE_CLASS (mode) == MODE_INT)
 	      /* ADD.  */
 	      *cost += extra_cost->alu.arith;
 	    else if (GET_MODE_CLASS (mode) == MODE_FLOAT)
@@ -5927,8 +5960,12 @@ cost_plus:
       *cost = COSTS_N_INSNS (1);
 
       if (speed)
-        *cost += extra_cost->alu.rev;
-
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.rev;
+	}
       return false;
 
     case IOR:
@@ -5936,10 +5973,14 @@ cost_plus:
         {
           *cost = COSTS_N_INSNS (1);
 
-          if (speed)
-            *cost += extra_cost->alu.rev;
-
-          return true;
+	  if (speed)
+	    {
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      else
+		*cost += extra_cost->alu.rev;
+	    }
+	  return true;
         }
     /* Fall through.  */
     case XOR:
@@ -5948,6 +5989,13 @@ cost_plus:
       op0 = XEXP (x, 0);
       op1 = XEXP (x, 1);
 
+      if (VECTOR_MODE_P (mode))
+	{
+	  if (speed)
+	    *cost += extra_cost->vect.alu;
+	  return true;
+	}
+
       if (code == AND
           && GET_CODE (op0) == MULT
           && CONST_INT_P (XEXP (op0, 1))
@@ -6013,10 +6061,15 @@ cost_plus:
       return false;
 
     case NOT:
-      /* MVN.  */
       if (speed)
-	*cost += extra_cost->alu.logical;
-
+	{
+	  /* Vector NOT.  */
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  /* MVN.  */
+	  else
+	    *cost += extra_cost->alu.logical;
+	}
       /* The logical instruction could have the shifted register form,
          but the cost is the same if the shift is processed as a separate
          instruction, so we don't bother with it here.  */
@@ -6055,10 +6108,15 @@ cost_plus:
 	  return true;
 	}
 
-      /* UXTB/UXTH.  */
       if (speed)
-	*cost += extra_cost->alu.extend;
-
+	{
+	  if (VECTOR_MODE_P (mode))
+	    /* UMOV.  */
+	    *cost += extra_cost->vect.alu;
+	  else
+	    /* UXTB/UXTH.  */
+	    *cost += extra_cost->alu.extend;
+	}
       return false;
 
     case SIGN_EXTEND:
@@ -6078,7 +6136,12 @@ cost_plus:
 	}
 
       if (speed)
-	*cost += extra_cost->alu.extend;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.extend;
+	}
       return false;
 
     case ASHIFT:
@@ -6087,10 +6150,16 @@ cost_plus:
 
       if (CONST_INT_P (op1))
         {
-	  /* LSL (immediate), UBMF, UBFIZ and friends.  These are all
-	     aliases.  */
 	  if (speed)
-	    *cost += extra_cost->alu.shift;
+	    {
+	      /* Vector shift (immediate).  */
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      /* LSL (immediate), UBMF, UBFIZ and friends.  These are all
+		 aliases.  */
+	      else
+		*cost += extra_cost->alu.shift;
+	    }
 
           /* We can incorporate zero/sign extend for free.  */
           if (GET_CODE (op0) == ZERO_EXTEND
@@ -6102,10 +6171,15 @@ cost_plus:
         }
       else
         {
-	  /* LSLV.  */
 	  if (speed)
-	    *cost += extra_cost->alu.shift_reg;
-
+	    {
+	      /* Vector shift (register).  */
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      /* LSLV.  */
+	      else
+		*cost += extra_cost->alu.shift_reg;
+	    }
 	  return false;  /* All arguments need to be in registers.  */
         }
 
@@ -6120,7 +6194,12 @@ cost_plus:
 	{
 	  /* ASR (immediate) and friends.  */
 	  if (speed)
-	    *cost += extra_cost->alu.shift;
+	    {
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      else
+		*cost += extra_cost->alu.shift;
+	    }
 
 	  *cost += rtx_cost (op0, (enum rtx_code) code, 0, speed);
 	  return true;
@@ -6130,8 +6209,12 @@ cost_plus:
 
 	  /* ASR (register) and friends.  */
 	  if (speed)
-	    *cost += extra_cost->alu.shift_reg;
-
+	    {
+	      if (VECTOR_MODE_P (mode))
+		*cost += extra_cost->vect.alu;
+	      else
+		*cost += extra_cost->alu.shift_reg;
+	    }
 	  return false;  /* All arguments need to be in registers.  */
 	}
 
@@ -6179,7 +6262,12 @@ cost_plus:
     case SIGN_EXTRACT:
       /* UBFX/SBFX.  */
       if (speed)
-	*cost += extra_cost->alu.bfx;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->alu.bfx;
+	}
 
       /* We can trust that the immediates used will be correct (there
 	 are no by-register forms), so we need only cost op0.  */
@@ -6196,7 +6284,9 @@ cost_plus:
     case UMOD:
       if (speed)
 	{
-	  if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT)
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT)
 	    *cost += (extra_cost->mult[GET_MODE (x) == DImode].add
 		      + extra_cost->mult[GET_MODE (x) == DImode].idiv);
 	  else if (GET_MODE (x) == DFmode)
@@ -6213,7 +6303,9 @@ cost_plus:
     case SQRT:
       if (speed)
 	{
-	  if (GET_MODE_CLASS (mode) == MODE_INT)
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else if (GET_MODE_CLASS (mode) == MODE_INT)
 	    /* There is no integer SQRT, so only DIV and UDIV can get
 	       here.  */
 	    *cost += extra_cost->mult[mode == DImode].idiv;
@@ -6245,7 +6337,12 @@ cost_plus:
       op2 = XEXP (x, 2);
 
       if (speed)
-	*cost += extra_cost->fp[mode == DFmode].fma;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->fp[mode == DFmode].fma;
+	}
 
       /* FMSUB, FNMADD, and FNMSUB are free.  */
       if (GET_CODE (op0) == NEG)
@@ -6285,12 +6382,24 @@ cost_plus:
 
     case FLOAT_EXTEND:
       if (speed)
-	*cost += extra_cost->fp[mode == DFmode].widen;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    /* Vector extend.  */
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->fp[mode == DFmode].widen;
+	}
       return false;
 
     case FLOAT_TRUNCATE:
       if (speed)
-	*cost += extra_cost->fp[mode == DFmode].narrow;
+	{
+	  if (VECTOR_MODE_P (mode))
+	    /* Vector truncate.  */
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->fp[mode == DFmode].narrow;
+	}
       return false;
 
     case FIX:
@@ -6311,13 +6420,23 @@ cost_plus:
         }
 
       if (speed)
-        *cost += extra_cost->fp[GET_MODE (x) == DFmode].toint;
-
+	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
+	  else
+	    *cost += extra_cost->fp[GET_MODE (x) == DFmode].toint;
+	}
       *cost += rtx_cost (x, (enum rtx_code) code, 0, speed);
       return true;
 
     case ABS:
-      if (GET_MODE_CLASS (mode) == MODE_FLOAT)
+      if (VECTOR_MODE_P (mode))
+	{
+	  /* ABS (vector).  */
+	  if (speed)
+	    *cost += extra_cost->vect.alu;
+	}
+      else if (GET_MODE_CLASS (mode) == MODE_FLOAT)
 	{
 	  /* FABS and FNEG are analogous.  */
 	  if (speed)
@@ -6338,10 +6457,13 @@ cost_plus:
     case SMIN:
       if (speed)
 	{
+	  if (VECTOR_MODE_P (mode))
+	    *cost += extra_cost->vect.alu;
 	  /* FMAXNM/FMINNM/FMAX/FMIN.
 	     TODO: This may not be accurate for all implementations, but
 	     we do not model this in the cost tables.  */
-	  *cost += extra_cost->fp[mode == DFmode].addsub;
+	  else
+	    *cost += extra_cost->fp[mode == DFmode].addsub;
 	}
       return false;
 
diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 3ee7ebf..29f7c99 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -102,6 +102,8 @@ struct mem_cost_table
   const int storef;		/* SFmode.  */
   const int stored;		/* DFmode.  */
   const int store_unaligned;	/* Extra for unaligned stores.  */
+  const int loadv;		/* Vector load.  */
+  const int storev;		/* Vector store.  */
 };
 
 struct fp_cost_table
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 05e96a9..809feb8 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -81,7 +81,9 @@ const struct cpu_cost_table generic_extra_costs =
     1,			/* stm_regs_per_insn_subsequent.  */
     COSTS_N_INSNS (2),	/* storef.  */
     COSTS_N_INSNS (3),	/* stored.  */
-    COSTS_N_INSNS (1)  /* store_unaligned.  */
+    COSTS_N_INSNS (1),	/* store_unaligned.  */
+    COSTS_N_INSNS (1),	/* loadv.  */
+    COSTS_N_INSNS (1)	/* storev.  */
   },
   {
     /* FP SFmode */
@@ -182,7 +184,9 @@ const struct cpu_cost_table cortexa53_extra_costs =
     2,				/* stm_regs_per_insn_subsequent.  */
     0,				/* storef.  */
     0,				/* stored.  */
-    COSTS_N_INSNS (1)		/* store_unaligned.  */
+    COSTS_N_INSNS (1),		/* store_unaligned.  */
+    COSTS_N_INSNS (1),		/* loadv.  */
+    COSTS_N_INSNS (1)		/* storev.  */
   },
   {
     /* FP SFmode */
@@ -283,7 +287,9 @@ const struct cpu_cost_table cortexa57_extra_costs =
     2,                         /* stm_regs_per_insn_subsequent.  */
     0,                         /* storef.  */
     0,                         /* stored.  */
-    COSTS_N_INSNS (1)          /* store_unaligned.  */
+    COSTS_N_INSNS (1),         /* store_unaligned.  */
+    COSTS_N_INSNS (1),         /* loadv.  */
+    COSTS_N_INSNS (1)          /* storev.  */
   },
   {
     /* FP SFmode */
@@ -385,6 +391,8 @@ const struct cpu_cost_table xgene1_extra_costs =
     0,                         /* storef.  */
     0,                         /* stored.  */
     0,                         /* store_unaligned.  */
+    COSTS_N_INSNS (1),         /* loadv.  */
+    COSTS_N_INSNS (1)          /* storev.  */
   },
   {
     /* FP SFmode */