From patchwork Fri Apr 24 23:26:16 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kugan Vivekanandarajah X-Patchwork-Id: 47570 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-wi0-f199.google.com (mail-wi0-f199.google.com [209.85.212.199]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id B513020553 for ; Fri, 24 Apr 2015 23:26:47 +0000 (UTC) Received: by wizk4 with SMTP id k4sf7329248wiz.2 for ; Fri, 24 Apr 2015 16:26:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id :list-unsubscribe:list-archive:list-post:list-help:sender :delivered-to:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type:x-original-sender :x-original-authentication-results; bh=RhmeHQi7EqGiL8vJmUAK/2Sdcq6Jo9QRFU21SXQOVY8=; b=B1S1YZ4Rvjie1Gt50RCncTptHRtKSoJKw7V0MJT/Qkb6sYH4Haie5TPjobEMZl5N/v F4wRY3t9AHxxpsl+vvnnv8k60kZzLNvezzBxVAFMJfcVLdIMuFIcZk5RFncThvaXfhW3 FharetvtZRzv+JoHPwMps8PfmgJ+KWB/SmslqqpHbQAI7bVLHX/0UmbuEUYaEgCjqo8A 7FohYCE+i82GM3iX/ov1wlqVm66BVc8Cn6n2V53ga2MQ4lDxyDk7OZKu9NMVimeGodkY mHO0QzsuM5LWvkli7uLugAntAtxbA7s7/9WQLUOBkCbb+YWqe/80YS9mWjfGifrXp4S2 yhvg== X-Gm-Message-State: ALoCoQl0xGkfsG+UVF7R8rRbHVJODWbJqpMVymsozF1mnQ49N9tkxvuHh+M/zhVs/qnt2NmofLNi X-Received: by 10.112.130.71 with SMTP id oc7mr412776lbb.23.1429918006854; Fri, 24 Apr 2015 16:26:46 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.7.72 with SMTP id h8ls564676laa.71.gmail; Fri, 24 Apr 2015 16:26:46 -0700 (PDT) X-Received: by 10.152.5.7 with SMTP id o7mr601546lao.51.1429918006687; Fri, 24 Apr 2015 16:26:46 -0700 (PDT) Received: from mail-la0-x22e.google.com (mail-la0-x22e.google.com. [2a00:1450:4010:c03::22e]) by mx.google.com with ESMTPS id jf6si9284604lbc.170.2015.04.24.16.26.46 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 24 Apr 2015 16:26:46 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c03::22e as permitted sender) client-ip=2a00:1450:4010:c03::22e; Received: by labbd9 with SMTP id bd9so45516832lab.2 for ; Fri, 24 Apr 2015 16:26:46 -0700 (PDT) X-Received: by 10.152.36.161 with SMTP id r1mr633376laj.88.1429918006426; Fri, 24 Apr 2015 16:26:46 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.67.65 with SMTP id l1csp102128lbt; Fri, 24 Apr 2015 16:26:45 -0700 (PDT) X-Received: by 10.66.141.109 with SMTP id rn13mr1223695pab.113.1429918004108; Fri, 24 Apr 2015 16:26:44 -0700 (PDT) Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id xs8si19405061pbc.108.2015.04.24.16.26.43 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 24 Apr 2015 16:26:44 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-return-395995-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Received: (qmail 23866 invoked by alias); 24 Apr 2015 23:26:29 -0000 Mailing-List: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 23813 invoked by uid 89); 24 Apr 2015 23:26:28 -0000 X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-pa0-f42.google.com Received: from mail-pa0-f42.google.com (HELO mail-pa0-f42.google.com) (209.85.220.42) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Fri, 24 Apr 2015 23:26:25 +0000 Received: by pabsx10 with SMTP id sx10so61150473pab.3 for ; Fri, 24 Apr 2015 16:26:23 -0700 (PDT) X-Received: by 10.66.142.169 with SMTP id rx9mr1360146pab.84.1429917983822; Fri, 24 Apr 2015 16:26:23 -0700 (PDT) Received: from [10.1.1.2] (58-6-183-210.dyn.iinet.net.au. [58.6.183.210]) by mx.google.com with ESMTPSA id sz7sm12239023pab.22.2015.04.24.16.26.20 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 24 Apr 2015 16:26:23 -0700 (PDT) Message-ID: <553AD118.3010705@linaro.org> Date: Sat, 25 Apr 2015 09:26:16 +1000 From: Kugan User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: James Greenhalgh CC: Kyrylo Tkachov , "gcc-patches@gcc.gnu.org" , Marcus Shawcroft , Richard Earnshaw , Jim Wilson Subject: Re: [AArch64][PR65375] Fix RTX cost for vector SET References: <5506D77B.5060909@linaro.org> <55070972.3000800@arm.com> <5507813E.3060106@linaro.org> <5513B390.2030201@linaro.org> <552D8FF7.5000105@linaro.org> <20150415092509.GA20852@arm.com> <552E4150.3020403@linaro.org> <20150415111854.GB22143@arm.com> <552E4C90.4070208@linaro.org> <5530EC32.4030806@linaro.org> <20150420202225.GA7414@arm.com> In-Reply-To: <20150420202225.GA7414@arm.com> X-IsSubscribed: yes X-Original-Sender: kugan.vivekanandarajah@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 2a00:1450:4010:c03::22e as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org; dkim=pass header.i=@gcc.gnu.org X-Google-Group-Id: 836684582541 On 21/04/15 06:22, James Greenhalgh wrote: > On Fri, Apr 17, 2015 at 12:19:14PM +0100, Kugan wrote: >>>> My point is that adding your patch while keeping the logic at the top >>>> which claims to catch ALL vector operations makes for less readable >>>> code. >>>> >>>> At the very least you'll need to update this comment: >>>> >>>> /* TODO: The cost infrastructure currently does not handle >>>> vector operations. Assume that all vector operations >>>> are equally expensive. */ >>>> >>>> to make it clear that this doesn't catch vector set operations. >>>> >>>> But fixing the comment doesn't improve the messy code so I'd certainly >>>> prefer to see one of the other approaches which have been discussed. >>> >>> I see your point. Let me work on this based on your suggestions above. >> >> Hi James, >> >> Here is an attempt along this line. Is this what you have in mind? >> Trying to keep functionality as before so that we can tune the >> parameters later. Not fully tested yet. > > Hi Kugan, > > Sorry to have dropped out of the thread for a while, I'm currently > travelling in the US. > > This is along the lines of what I had in mind, thanks for digging through > and doing it. It needs a little polishing, just neaten up the rough edges > of comments and where they sit next to the new if conditionals, and of course, > testing, and I have a few comments below. > > Thanks, > James > >> diff --git a/gcc/config/aarch64/aarch64-cost-tables.h b/gcc/config/aarch64/aarch64-cost-tables.h >> index ae2b547..ed9432e 100644 >> --- a/gcc/config/aarch64/aarch64-cost-tables.h >> +++ b/gcc/config/aarch64/aarch64-cost-tables.h >> @@ -121,7 +121,9 @@ const struct cpu_cost_table thunderx_extra_costs = >> }, >> /* Vector */ >> { >> - COSTS_N_INSNS (1) /* Alu. */ >> + COSTS_N_INSNS (1), /* Alu. */ >> + COSTS_N_INSNS (1), /* Load. */ >> + COSTS_N_INSNS (1) /* Store. */ >> } >> }; > > Can you push the Load/Stores in to the LD/ST section above and give > them a name like loadv/storev. > >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index cba3c1a..c2d4a53 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c > > > >> @@ -5570,6 +5569,7 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, >> && (GET_MODE_BITSIZE (GET_MODE (XEXP (op1, 0))) >> >= INTVAL (XEXP (op0, 1)))) >> op1 = XEXP (op1, 0); >> + gcc_assert (!VECTOR_MODE_P (mode)); > > As Kyrill asked, please drop this. Thanks for the review. I have updated the patch based on the comments with some other minor changes. Bootstrapped and regression tested on aarch64-none-linux-gnu with no-new regressions. Is this OK for trunk? Thanks, Kugan gcc/ChangeLog: 2015-04-24 Kugan Vivekanandarajah Jim Wilson * config/arm/aarch-common-protos.h (struct mem_cost_table): Added new fields loadv and storev. * config/aarch64/aarch64-cost-tables.h (thunderx_extra_costs): Initialize loadv and storev. * config/arm/aarch-cost-tables.h (generic_extra_costs): Likewise. (cortexa53_extra_costs): Likewise. (cortexa57_extra_costs): Likewise. (xgene1_extra_costs): Likewise. * config/aarch64/aarch64.c (aarch64_rtx_costs): Update vector rtx_costs. diff --git a/gcc/config/aarch64/aarch64-cost-tables.h b/gcc/config/aarch64/aarch64-cost-tables.h index ae2b547..939125c 100644 --- a/gcc/config/aarch64/aarch64-cost-tables.h +++ b/gcc/config/aarch64/aarch64-cost-tables.h @@ -83,7 +83,9 @@ const struct cpu_cost_table thunderx_extra_costs = 0, /* N/A: Stm_regs_per_insn_subsequent. */ 0, /* Storef. */ 0, /* Stored. */ - COSTS_N_INSNS (1) /* Store_unaligned. */ + COSTS_N_INSNS (1), /* Store_unaligned. */ + COSTS_N_INSNS (1), /* Loadv. */ + COSTS_N_INSNS (1) /* Storev. */ }, { /* FP SFmode */ diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index cba3c1a..13425fc 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -5499,16 +5499,6 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, above this default. */ *cost = COSTS_N_INSNS (1); - /* TODO: The cost infrastructure currently does not handle - vector operations. Assume that all vector operations - are equally expensive. */ - if (VECTOR_MODE_P (mode)) - { - if (speed) - *cost += extra_cost->vect.alu; - return true; - } - switch (code) { case SET: @@ -5523,7 +5513,9 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, if (speed) { rtx address = XEXP (op0, 0); - if (GET_MODE_CLASS (mode) == MODE_INT) + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->ldst.storev; + else if (GET_MODE_CLASS (mode) == MODE_INT) *cost += extra_cost->ldst.store; else if (mode == SFmode) *cost += extra_cost->ldst.storef; @@ -5544,15 +5536,22 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, /* Fall through. */ case REG: + /* The cost is one per vector-register copied. */ + if (VECTOR_MODE_P (GET_MODE (op0)) && REG_P (op1)) + { + int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) + / GET_MODE_SIZE (V4SImode); + *cost = COSTS_N_INSNS (n_minus_1 + 1); + } /* const0_rtx is in general free, but we will use an instruction to set a register to 0. */ - if (REG_P (op1) || op1 == const0_rtx) - { - /* The cost is 1 per register copied. */ - int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) + else if (REG_P (op1) || op1 == const0_rtx) + { + /* The cost is 1 per register copied. */ + int n_minus_1 = (GET_MODE_SIZE (GET_MODE (op0)) - 1) / UNITS_PER_WORD; - *cost = COSTS_N_INSNS (n_minus_1 + 1); - } + *cost = COSTS_N_INSNS (n_minus_1 + 1); + } else /* Cost is just the cost of the RHS of the set. */ *cost += rtx_cost (op1, SET, 1, speed); @@ -5650,7 +5649,9 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, approximation for the additional cost of the addressing mode. */ rtx address = XEXP (x, 0); - if (GET_MODE_CLASS (mode) == MODE_INT) + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->ldst.loadv; + else if (GET_MODE_CLASS (mode) == MODE_INT) *cost += extra_cost->ldst.load; else if (mode == SFmode) *cost += extra_cost->ldst.loadf; @@ -5667,6 +5668,14 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, case NEG: op0 = XEXP (x, 0); + if (VECTOR_MODE_P (mode)) + { + if (speed) + /* FNEG. */ + *cost += extra_cost->vect.alu; + return false; + } + if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT) { if (GET_RTX_CLASS (GET_CODE (op0)) == RTX_COMPARE @@ -5705,7 +5714,12 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, case CLRSB: case CLZ: if (speed) - *cost += extra_cost->alu.clz; + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->alu.clz; + } return false; @@ -5790,6 +5804,20 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED, return false; } + if (VECTOR_MODE_P (mode)) + { + /* Vector compare. */ + if (speed) + *cost += extra_cost->vect.alu; + + if (aarch64_float_const_zero_rtx_p (op1)) + { + /* Vector cm (eq|ge|gt|lt|le) supports constant 0.0 for no extra + cost. */ + return true; + } + return false; + } return false; case MINUS: @@ -5844,7 +5872,10 @@ cost_minus: if (speed) { - if (GET_MODE_CLASS (mode) == MODE_INT) + if (VECTOR_MODE_P (mode)) + /* Vector SUB. */ + *cost += extra_cost->vect.alu; + else if (GET_MODE_CLASS (mode) == MODE_INT) /* SUB(S). */ *cost += extra_cost->alu.arith; else if (GET_MODE_CLASS (mode) == MODE_FLOAT) @@ -5888,7 +5919,6 @@ cost_plus: { if (speed) *cost += extra_cost->alu.arith_shift; - *cost += rtx_cost (XEXP (XEXP (op0, 0), 0), (enum rtx_code) GET_CODE (op0), 0, speed); @@ -5913,7 +5943,10 @@ cost_plus: if (speed) { - if (GET_MODE_CLASS (mode) == MODE_INT) + if (VECTOR_MODE_P (mode)) + /* Vector ADD. */ + *cost += extra_cost->vect.alu; + else if (GET_MODE_CLASS (mode) == MODE_INT) /* ADD. */ *cost += extra_cost->alu.arith; else if (GET_MODE_CLASS (mode) == MODE_FLOAT) @@ -5927,8 +5960,12 @@ cost_plus: *cost = COSTS_N_INSNS (1); if (speed) - *cost += extra_cost->alu.rev; - + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->alu.rev; + } return false; case IOR: @@ -5936,10 +5973,14 @@ cost_plus: { *cost = COSTS_N_INSNS (1); - if (speed) - *cost += extra_cost->alu.rev; - - return true; + if (speed) + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->alu.rev; + } + return true; } /* Fall through. */ case XOR: @@ -5948,6 +5989,13 @@ cost_plus: op0 = XEXP (x, 0); op1 = XEXP (x, 1); + if (VECTOR_MODE_P (mode)) + { + if (speed) + *cost += extra_cost->vect.alu; + return true; + } + if (code == AND && GET_CODE (op0) == MULT && CONST_INT_P (XEXP (op0, 1)) @@ -6013,10 +6061,15 @@ cost_plus: return false; case NOT: - /* MVN. */ if (speed) - *cost += extra_cost->alu.logical; - + { + /* Vector NOT. */ + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + /* MVN. */ + else + *cost += extra_cost->alu.logical; + } /* The logical instruction could have the shifted register form, but the cost is the same if the shift is processed as a separate instruction, so we don't bother with it here. */ @@ -6055,10 +6108,15 @@ cost_plus: return true; } - /* UXTB/UXTH. */ if (speed) - *cost += extra_cost->alu.extend; - + { + if (VECTOR_MODE_P (mode)) + /* UMOV. */ + *cost += extra_cost->vect.alu; + else + /* UXTB/UXTH. */ + *cost += extra_cost->alu.extend; + } return false; case SIGN_EXTEND: @@ -6078,7 +6136,12 @@ cost_plus: } if (speed) - *cost += extra_cost->alu.extend; + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->alu.extend; + } return false; case ASHIFT: @@ -6087,10 +6150,16 @@ cost_plus: if (CONST_INT_P (op1)) { - /* LSL (immediate), UBMF, UBFIZ and friends. These are all - aliases. */ if (speed) - *cost += extra_cost->alu.shift; + { + /* Vector shift (immediate). */ + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + /* LSL (immediate), UBMF, UBFIZ and friends. These are all + aliases. */ + else + *cost += extra_cost->alu.shift; + } /* We can incorporate zero/sign extend for free. */ if (GET_CODE (op0) == ZERO_EXTEND @@ -6102,10 +6171,15 @@ cost_plus: } else { - /* LSLV. */ if (speed) - *cost += extra_cost->alu.shift_reg; - + { + /* Vector shift (register). */ + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + /* LSLV. */ + else + *cost += extra_cost->alu.shift_reg; + } return false; /* All arguments need to be in registers. */ } @@ -6120,7 +6194,12 @@ cost_plus: { /* ASR (immediate) and friends. */ if (speed) - *cost += extra_cost->alu.shift; + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->alu.shift; + } *cost += rtx_cost (op0, (enum rtx_code) code, 0, speed); return true; @@ -6130,8 +6209,12 @@ cost_plus: /* ASR (register) and friends. */ if (speed) - *cost += extra_cost->alu.shift_reg; - + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->alu.shift_reg; + } return false; /* All arguments need to be in registers. */ } @@ -6179,7 +6262,12 @@ cost_plus: case SIGN_EXTRACT: /* UBFX/SBFX. */ if (speed) - *cost += extra_cost->alu.bfx; + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->alu.bfx; + } /* We can trust that the immediates used will be correct (there are no by-register forms), so we need only cost op0. */ @@ -6196,7 +6284,9 @@ cost_plus: case UMOD: if (speed) { - if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT) + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else if (GET_MODE_CLASS (GET_MODE (x)) == MODE_INT) *cost += (extra_cost->mult[GET_MODE (x) == DImode].add + extra_cost->mult[GET_MODE (x) == DImode].idiv); else if (GET_MODE (x) == DFmode) @@ -6213,7 +6303,9 @@ cost_plus: case SQRT: if (speed) { - if (GET_MODE_CLASS (mode) == MODE_INT) + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else if (GET_MODE_CLASS (mode) == MODE_INT) /* There is no integer SQRT, so only DIV and UDIV can get here. */ *cost += extra_cost->mult[mode == DImode].idiv; @@ -6245,7 +6337,12 @@ cost_plus: op2 = XEXP (x, 2); if (speed) - *cost += extra_cost->fp[mode == DFmode].fma; + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->fp[mode == DFmode].fma; + } /* FMSUB, FNMADD, and FNMSUB are free. */ if (GET_CODE (op0) == NEG) @@ -6285,12 +6382,24 @@ cost_plus: case FLOAT_EXTEND: if (speed) - *cost += extra_cost->fp[mode == DFmode].widen; + { + if (VECTOR_MODE_P (mode)) + /*Vector truncate. */ + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->fp[mode == DFmode].widen; + } return false; case FLOAT_TRUNCATE: if (speed) - *cost += extra_cost->fp[mode == DFmode].narrow; + { + if (VECTOR_MODE_P (mode)) + /*Vector conversion. */ + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->fp[mode == DFmode].narrow; + } return false; case FIX: @@ -6311,13 +6420,23 @@ cost_plus: } if (speed) - *cost += extra_cost->fp[GET_MODE (x) == DFmode].toint; - + { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; + else + *cost += extra_cost->fp[GET_MODE (x) == DFmode].toint; + } *cost += rtx_cost (x, (enum rtx_code) code, 0, speed); return true; case ABS: - if (GET_MODE_CLASS (mode) == MODE_FLOAT) + if (VECTOR_MODE_P (mode)) + { + /* ABS (vector). */ + if (speed) + *cost += extra_cost->vect.alu; + } + else if (GET_MODE_CLASS (mode) == MODE_FLOAT) { /* FABS and FNEG are analogous. */ if (speed) @@ -6338,10 +6457,13 @@ cost_plus: case SMIN: if (speed) { + if (VECTOR_MODE_P (mode)) + *cost += extra_cost->vect.alu; /* FMAXNM/FMINNM/FMAX/FMIN. TODO: This may not be accurate for all implementations, but we do not model this in the cost tables. */ - *cost += extra_cost->fp[mode == DFmode].addsub; + else + *cost += extra_cost->fp[mode == DFmode].addsub; } return false; diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h index 3ee7ebf..29f7c99 100644 --- a/gcc/config/arm/aarch-common-protos.h +++ b/gcc/config/arm/aarch-common-protos.h @@ -102,6 +102,8 @@ struct mem_cost_table const int storef; /* SFmode. */ const int stored; /* DFmode. */ const int store_unaligned; /* Extra for unaligned stores. */ + const int loadv; /* Vector load. */ + const int storev; /* Vector store. */ }; struct fp_cost_table diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h index 05e96a9..809feb8 100644 --- a/gcc/config/arm/aarch-cost-tables.h +++ b/gcc/config/arm/aarch-cost-tables.h @@ -81,7 +81,9 @@ const struct cpu_cost_table generic_extra_costs = 1, /* stm_regs_per_insn_subsequent. */ COSTS_N_INSNS (2), /* storef. */ COSTS_N_INSNS (3), /* stored. */ - COSTS_N_INSNS (1) /* store_unaligned. */ + COSTS_N_INSNS (1), /* store_unaligned. */ + COSTS_N_INSNS (1), /* loadv. */ + COSTS_N_INSNS (1) /* storev. */ }, { /* FP SFmode */ @@ -182,7 +184,9 @@ const struct cpu_cost_table cortexa53_extra_costs = 2, /* stm_regs_per_insn_subsequent. */ 0, /* storef. */ 0, /* stored. */ - COSTS_N_INSNS (1) /* store_unaligned. */ + COSTS_N_INSNS (1), /* store_unaligned. */ + COSTS_N_INSNS (1), /* loadv. */ + COSTS_N_INSNS (1) /* storev. */ }, { /* FP SFmode */ @@ -283,7 +287,9 @@ const struct cpu_cost_table cortexa57_extra_costs = 2, /* stm_regs_per_insn_subsequent. */ 0, /* storef. */ 0, /* stored. */ - COSTS_N_INSNS (1) /* store_unaligned. */ + COSTS_N_INSNS (1), /* store_unaligned. */ + COSTS_N_INSNS (1), /* loadv. */ + COSTS_N_INSNS (1) /* storev. */ }, { /* FP SFmode */ @@ -385,6 +391,8 @@ const struct cpu_cost_table xgene1_extra_costs = 0, /* storef. */ 0, /* stored. */ 0, /* store_unaligned. */ + COSTS_N_INSNS (1), /* loadv. */ + COSTS_N_INSNS (1) /* storev. */ }, { /* FP SFmode */