From patchwork Thu Jan 12 03:43:52 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Hurugalawadi, Naveen"
X-Patchwork-Id: 91034
From: "Hurugalawadi, Naveen"
To: James Greenhalgh
CC: "gcc-patches@gcc.gnu.org", "Pinski, Andrew", Marcus Shawcroft, Richard Earnshaw
Subject: [PATCH/AARCH64] Add scheduler for Thunderx2t99
Date: Thu, 12 Jan 2017 03:43:52 +0000

Hi James,

The scheduling patch for vulcan was posted at the following link:
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01205.html

We have been working on that patch and have addressed the review
comments for thunderx2t99.

>> I tried lowering the repeat expressions as so:

Done.

>> split off the AdvSIMD/FP model from the main pipeline

Done.

>> A change like wiring the vulcan_f0 and vulcan_f1 reservations
>> to be cpu_units of a new define_automaton "vulcan_advsimd"

Done.

>> simplifying some of the remaining large expressions
>> (vulcan_asimd_load*_mult, vulcan_asimd_load*_elts) can bring the size down

I did not fully understand this comment. Could you please explain which
simplification you have in mind?

Please find attached the modified patch as per your suggestions and
comments. Please review the patch and let us know if it is okay.

Thanks,
Naveen

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index a7a4b33..4d39673 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -75,7 +75,7 @@ AARCH64_CORE("xgene1", xgene1, xgene1, 8A, AARCH64_FL_FOR_ARCH8, xge
 
 /* Broadcom ('B') cores. */
 AARCH64_CORE("thunderx2t99", thunderx2t99, cortexa57, 8_1A, AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
-AARCH64_CORE("vulcan", vulcan, cortexa57, 8_1A, AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
+AARCH64_CORE("vulcan", vulcan, vulcan, 8_1A, AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
 
 /* V8 big.LITTLE implementations. */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index bde4231..063559c 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -220,6 +220,7 @@
 (include "../arm/exynos-m1.md")
 (include "thunderx.md")
 (include "../arm/xgene1.md")
+(include "thunderx2t99.md")
 
 ;; -------------------------------------------------------------------
 ;; Jumps and other miscellaneous insns
diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
new file mode 100644
index 0000000..00d40f8
--- /dev/null
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -0,0 +1,513 @@
+;; Cavium ThunderX 2 CN99xx pipeline description
+;; Copyright (C) 2016-2017 Free Software Foundation, Inc.
+;;
+;; Contributed by Cavium, Broadcom and Mentor Embedded.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3. If not see
+;; .
+
+(define_automaton "thunderx2t99, thunderx2t99_advsimd, thunderx2t99_ldst")
+(define_automaton "thunderx2t99_mult")
+
+(define_cpu_unit "thunderx2t99_i0" "thunderx2t99")
+(define_cpu_unit "thunderx2t99_i1" "thunderx2t99")
+(define_cpu_unit "thunderx2t99_i2" "thunderx2t99")
+
+(define_cpu_unit "thunderx2t99_ls0" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls1" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_sd" "thunderx2t99_ldst")
+
+; Pseudo-units for multiply pipeline.
+
+(define_cpu_unit "thunderx2t99_i1m1" "thunderx2t99_mult")
+(define_cpu_unit "thunderx2t99_i1m2" "thunderx2t99_mult")
+(define_cpu_unit "thunderx2t99_i1m3" "thunderx2t99_mult")
+
+; Pseudo-units for load delay (assuming dcache hit).
+
+(define_cpu_unit "thunderx2t99_ls0d1" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls0d2" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls0d3" "thunderx2t99_ldst")
+
+(define_cpu_unit "thunderx2t99_ls1d1" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls1d2" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls1d3" "thunderx2t99_ldst")
+
+; Make some aliases for f0/f1.
+(define_cpu_unit "thunderx2t99_f0" "thunderx2t99_advsimd")
+(define_cpu_unit "thunderx2t99_f1" "thunderx2t99_advsimd")
+
+(define_reservation "thunderx2t99_i012" "thunderx2t99_i0|thunderx2t99_i1|thunderx2t99_i2")
+(define_reservation "thunderx2t99_ls01" "thunderx2t99_ls0|thunderx2t99_ls1")
+(define_reservation "thunderx2t99_f01" "thunderx2t99_f0|thunderx2t99_f1")
+
+(define_reservation "thunderx2t99_ls_both" "thunderx2t99_ls0+thunderx2t99_ls1")
+
+; A load with delay in the ls0/ls1 pipes.
+(define_reservation "thunderx2t99_l0delay" "thunderx2t99_ls0,\
+                                            thunderx2t99_ls0d1,thunderx2t99_ls0d2,\
+                                            thunderx2t99_ls0d3")
+(define_reservation "thunderx2t99_l1delay" "thunderx2t99_ls1,\
+                                            thunderx2t99_ls1d1,thunderx2t99_ls1d2,\
+                                            thunderx2t99_ls1d3")
+(define_reservation "thunderx2t99_l01delay" "thunderx2t99_l0delay|thunderx2t99_l1delay")
+
+;; Branch and call instructions.
+
+(define_insn_reservation "thunderx2t99_branch" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "call,branch"))
+  "thunderx2t99_i2")
+
+;; Integer arithmetic/logic instructions.
+
+; Plain register moves are handled by renaming, and don't create any uops.
+
+(define_insn_reservation "thunderx2t99_regmove" 0
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "mov_reg"))
+  "nothing")
+
+(define_insn_reservation "thunderx2t99_alu_basic" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "alu_imm,alu_sreg,alus_imm,alus_sreg,\
+                        adc_reg,adc_imm,adcs_reg,adcs_imm,\
+                        logic_reg,logic_imm,logics_reg,logics_imm,\
+                        csel,adr,mov_imm,shift_reg,shift_imm,bfm,\
+                        rbit,rev,extend,rotate_imm"))
+  "thunderx2t99_i012")
+
+(define_insn_reservation "thunderx2t99_alu_shift" 2
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "alu_shift_imm,alu_ext,alu_shift_reg,\
+                        alus_shift_imm,alus_ext,alus_shift_reg,\
+                        logic_shift_imm,logics_shift_reg"))
+  "thunderx2t99_i012,thunderx2t99_i012")
+
+(define_insn_reservation "thunderx2t99_div" 13
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "sdiv,udiv"))
+  "thunderx2t99_i1*3")
+
+(define_insn_reservation "thunderx2t99_madd" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "mla,smlal,umlal"))
+  "thunderx2t99_i1,thunderx2t99_i1m1,thunderx2t99_i1m2,thunderx2t99_i1m3,\
+   thunderx2t99_i012")
+
+; NOTE: smull, umull are used for "high part" multiplies too.
+(define_insn_reservation "thunderx2t99_mul" 4
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "mul,smull,umull"))
+  "thunderx2t99_i1,thunderx2t99_i1m1,thunderx2t99_i1m2,thunderx2t99_i1m3")
+
+(define_insn_reservation "thunderx2t99_countbits" 3
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "clz"))
+  "thunderx2t99_i1")
+
+;; Integer loads and stores.
+
+(define_insn_reservation "thunderx2t99_load_basic" 4
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "load1"))
+  "thunderx2t99_ls01")
+
+(define_insn_reservation "thunderx2t99_loadpair" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "load2"))
+  "thunderx2t99_i012,thunderx2t99_ls01")
+
+(define_insn_reservation "thunderx2t99_store_basic" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "store1"))
+  "thunderx2t99_ls01,thunderx2t99_sd")
+
+(define_insn_reservation "thunderx2t99_storepair_basic" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "store2"))
+  "thunderx2t99_ls01,thunderx2t99_sd")
+
+;; FP data processing instructions.
+
+(define_insn_reservation "thunderx2t99_fp_simple" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "ffariths,ffarithd,f_minmaxs,f_minmaxd"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_fp_addsub" 6
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "fadds,faddd"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_fp_cmp" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "fcmps,fcmpd"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_fp_divsqrt_s" 16
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "fdivs,fsqrts"))
+  "thunderx2t99_f0*3|thunderx2t99_f1*3")
+
+(define_insn_reservation "thunderx2t99_fp_divsqrt_d" 23
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "fdivd,fsqrtd"))
+  "thunderx2t99_f0*5|thunderx2t99_f1*5")
+
+(define_insn_reservation "thunderx2t99_fp_mul_mac" 6
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "fmuls,fmuld,fmacs,fmacd"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_frint" 7
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "f_rints,f_rintd"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_fcsel" 4
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "fcsel"))
+  "thunderx2t99_f01")
+
+;; FP miscellaneous instructions.
+
+(define_insn_reservation "thunderx2t99_fp_cvt" 7
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "f_cvtf2i,f_cvt,f_cvti2f"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_fp_mov" 4
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "fconsts,fconstd,fmov,f_mrc"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_fp_mov_to_gen" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "f_mcr"))
+  "thunderx2t99_f01")
+
+;; FP loads and stores.
+
+(define_insn_reservation "thunderx2t99_fp_load_basic" 4
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "f_loads,f_loadd"))
+  "thunderx2t99_ls01")
+
+(define_insn_reservation "thunderx2t99_fp_loadpair_basic" 4
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load1_2reg"))
+  "thunderx2t99_ls01*2")
+
+(define_insn_reservation "thunderx2t99_fp_store_basic" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "f_stores,f_stored"))
+  "thunderx2t99_ls01,thunderx2t99_sd")
+
+(define_insn_reservation "thunderx2t99_fp_storepair_basic" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store1_2reg"))
+  "thunderx2t99_ls01,(thunderx2t99_ls01+thunderx2t99_sd),thunderx2t99_sd")
+
+;; ASIMD integer instructions.
+
+(define_insn_reservation "thunderx2t99_asimd_int" 7
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_abd,neon_abd_q,\
+                        neon_arith_acc,neon_arith_acc_q,\
+                        neon_abs,neon_abs_q,\
+                        neon_add,neon_add_q,\
+                        neon_neg,neon_neg_q,\
+                        neon_add_long,neon_add_widen,\
+                        neon_add_halve,neon_add_halve_q,\
+                        neon_sub_long,neon_sub_widen,\
+                        neon_sub_halve,neon_sub_halve_q,\
+                        neon_add_halve_narrow_q,neon_sub_halve_narrow_q,\
+                        neon_qabs,neon_qabs_q,\
+                        neon_qadd,neon_qadd_q,\
+                        neon_qneg,neon_qneg_q,\
+                        neon_qsub,neon_qsub_q,\
+                        neon_minmax,neon_minmax_q,\
+                        neon_reduc_minmax,neon_reduc_minmax_q,\
+                        neon_mul_b,neon_mul_h,neon_mul_s,\
+                        neon_mul_b_q,neon_mul_h_q,neon_mul_s_q,\
+                        neon_sat_mul_b,neon_sat_mul_h,neon_sat_mul_s,\
+                        neon_sat_mul_b_q,neon_sat_mul_h_q,neon_sat_mul_s_q,\
+                        neon_mla_b,neon_mla_h,neon_mla_s,\
+                        neon_mla_b_q,neon_mla_h_q,neon_mla_s_q,\
+                        neon_mul_b_long,neon_mul_h_long,\
+                        neon_mul_s_long,neon_mul_d_long,\
+                        neon_sat_mul_b_long,neon_sat_mul_h_long,\
+                        neon_sat_mul_s_long,\
+                        neon_mla_b_long,neon_mla_h_long,neon_mla_s_long,\
+                        neon_sat_mla_b_long,neon_sat_mla_h_long,\
+                        neon_sat_mla_s_long,\
+                        neon_shift_acc,neon_shift_acc_q,\
+                        neon_shift_imm,neon_shift_imm_q,\
+                        neon_shift_reg,neon_shift_reg_q,\
+                        neon_shift_imm_long,neon_shift_imm_narrow_q,\
+                        neon_sat_shift_imm,neon_sat_shift_imm_q,\
+                        neon_sat_shift_reg,neon_sat_shift_reg_q,\
+                        neon_sat_shift_imm_narrow_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_reduc_add" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_reduc_add,neon_reduc_add_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_cmp" 7
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_compare,neon_compare_q,neon_compare_zero,\
+                        neon_tst,neon_tst_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_logic" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_logic,neon_logic_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_polynomial" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_mul_d_long"))
+  "thunderx2t99_f01")
+
+;; ASIMD floating-point instructions.
+
+(define_insn_reservation "thunderx2t99_asimd_fp_simple" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_fp_abs_s,neon_fp_abs_d,\
+                        neon_fp_abs_s_q,neon_fp_abs_d_q,\
+                        neon_fp_compare_s,neon_fp_compare_d,\
+                        neon_fp_compare_s_q,neon_fp_compare_d_q,\
+                        neon_fp_minmax_s,neon_fp_minmax_d,\
+                        neon_fp_minmax_s_q,neon_fp_minmax_d_q,\
+                        neon_fp_reduc_minmax_s,neon_fp_reduc_minmax_d,\
+                        neon_fp_reduc_minmax_s_q,neon_fp_reduc_minmax_d_q,\
+                        neon_fp_neg_s,neon_fp_neg_d,\
+                        neon_fp_neg_s_q,neon_fp_neg_d_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_fp_arith" 6
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_fp_abd_s,neon_fp_abd_d,\
+                        neon_fp_abd_s_q,neon_fp_abd_d_q,\
+                        neon_fp_addsub_s,neon_fp_addsub_d,\
+                        neon_fp_addsub_s_q,neon_fp_addsub_d_q,\
+                        neon_fp_reduc_add_s,neon_fp_reduc_add_d,\
+                        neon_fp_reduc_add_s_q,neon_fp_reduc_add_d_q,\
+                        neon_fp_mul_s,neon_fp_mul_d,\
+                        neon_fp_mul_s_q,neon_fp_mul_d_q,\
+                        neon_fp_mla_s,neon_fp_mla_d,\
+                        neon_fp_mla_s_q,neon_fp_mla_d_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_fp_conv" 7
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_fp_cvt_widen_s,neon_fp_cvt_narrow_d_q,\
+                        neon_fp_to_int_s,neon_fp_to_int_d,\
+                        neon_fp_to_int_s_q,neon_fp_to_int_d_q,\
+                        neon_fp_round_s,neon_fp_round_d,\
+                        neon_fp_round_s_q,neon_fp_round_d_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_fp_div_s" 16
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_fp_div_s,neon_fp_div_s_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_fp_div_d" 23
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_fp_div_d,neon_fp_div_d_q"))
+  "thunderx2t99_f01")
+
+;; ASIMD miscellaneous instructions.
+
+(define_insn_reservation "thunderx2t99_asimd_misc" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_rbit,\
+                        neon_bsl,neon_bsl_q,\
+                        neon_cls,neon_cls_q,\
+                        neon_cnt,neon_cnt_q,\
+                        neon_from_gp,neon_from_gp_q,\
+                        neon_dup,neon_dup_q,\
+                        neon_ext,neon_ext_q,\
+                        neon_ins,neon_ins_q,\
+                        neon_move,neon_move_q,\
+                        neon_fp_recpe_s,neon_fp_recpe_d,\
+                        neon_fp_recpe_s_q,neon_fp_recpe_d_q,\
+                        neon_fp_recpx_s,neon_fp_recpx_d,\
+                        neon_fp_recpx_s_q,neon_fp_recpx_d_q,\
+                        neon_rev,neon_rev_q,\
+                        neon_dup,neon_dup_q,\
+                        neon_permute,neon_permute_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_recip_step" 6
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_fp_recps_s,neon_fp_recps_s_q,\
+                        neon_fp_recps_d,neon_fp_recps_d_q,\
+                        neon_fp_rsqrts_s, neon_fp_rsqrts_s_q,\
+                        neon_fp_rsqrts_d, neon_fp_rsqrts_d_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_lut" 8
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_tbl1,neon_tbl1_q,neon_tbl2_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_elt_to_gr" 6
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_to_gp,neon_to_gp_q"))
+  "thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_ext" 7
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_shift_imm_narrow_q,neon_sat_shift_imm_narrow_q"))
+  "thunderx2t99_f01")
+
+;; ASIMD load instructions.
+
+; NOTE: These reservations attempt to model latency and throughput correctly,
+; but the cycle timing of unit allocation is not necessarily accurate (because
+; insns are split into uops, and those may be issued out-of-order).
+
+(define_insn_reservation "thunderx2t99_asimd_load1_1_mult" 4
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load1_1reg,neon_load1_1reg_q"))
+  "thunderx2t99_ls01")
+
+(define_insn_reservation "thunderx2t99_asimd_load1_2_mult" 4
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load1_2reg,neon_load1_2reg_q"))
+  "thunderx2t99_ls_both")
+
+(define_insn_reservation "thunderx2t99_asimd_load1_3_mult" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load1_3reg,neon_load1_3reg_q"))
+  "(thunderx2t99_ls_both,thunderx2t99_ls01)|(thunderx2t99_ls01,\
+   thunderx2t99_ls_both)")
+
+(define_insn_reservation "thunderx2t99_asimd_load1_4_mult" 6
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load1_4reg,neon_load1_4reg_q"))
+  "thunderx2t99_ls_both*2")
+
+(define_insn_reservation "thunderx2t99_asimd_load1_onelane" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load1_one_lane,neon_load1_one_lane_q"))
+  "thunderx2t99_l01delay,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_load1_all" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load1_all_lanes,neon_load1_all_lanes_q"))
+  "thunderx2t99_l01delay,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_load2" 5
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load2_2reg,neon_load2_2reg_q,\
+                        neon_load2_one_lane,neon_load2_one_lane_q,\
+                        neon_load2_all_lanes,neon_load2_all_lanes_q"))
+  "(thunderx2t99_l0delay,thunderx2t99_f01)|(thunderx2t99_l1delay,\
+   thunderx2t99_f01)")
+
+(define_insn_reservation "thunderx2t99_asimd_load3_mult" 8
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load3_3reg,neon_load3_3reg_q"))
+  "thunderx2t99_ls_both*3,(thunderx2t99_ls0d1+thunderx2t99_ls1d1),\
+   (thunderx2t99_ls0d2+thunderx2t99_ls1d2),\
+   (thunderx2t99_ls0d3+thunderx2t99_ls1d3),thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_load3_elts" 7
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load3_one_lane,neon_load3_one_lane_q,\
+                        neon_load3_all_lanes,neon_load3_all_lanes_q"))
+  "thunderx2t99_ls_both,thunderx2t99_l01delay,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_load4_mult" 8
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load4_4reg,neon_load4_4reg_q"))
+  "thunderx2t99_ls_both*4,(thunderx2t99_ls0d1+thunderx2t99_ls1d1),\
+   (thunderx2t99_ls0d2+thunderx2t99_ls1d2),\
+   (thunderx2t99_ls0d3+thunderx2t99_ls1d3),thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_load4_elts" 6
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_load4_one_lane,neon_load4_one_lane_q,\
+                        neon_load4_all_lanes,neon_load4_all_lanes_q"))
+  "thunderx2t99_ls_both*2,(thunderx2t99_ls0d1+thunderx2t99_ls1d1),\
+   (thunderx2t99_ls0d2+thunderx2t99_ls1d2),\
+   (thunderx2t99_ls0d3+thunderx2t99_ls1d3),thunderx2t99_f01")
+
+;; ASIMD store instructions.
+
+; Same note applies as for ASIMD load instructions.
+
+(define_insn_reservation "thunderx2t99_asimd_store1_1_mult" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store1_1reg,neon_store1_1reg_q"))
+  "thunderx2t99_ls01")
+
+(define_insn_reservation "thunderx2t99_asimd_store1_2_mult" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store1_2reg,neon_store1_2reg_q"))
+  "thunderx2t99_ls_both")
+
+(define_insn_reservation "thunderx2t99_asimd_store1_3_mult" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store1_3reg,neon_store1_3reg_q"))
+  "(thunderx2t99_ls_both,thunderx2t99_ls01)|(thunderx2t99_ls01,\
+   thunderx2t99_ls_both)")
+
+(define_insn_reservation "thunderx2t99_asimd_store1_4_mult" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store1_4reg,neon_store1_4reg_q"))
+  "thunderx2t99_ls_both*2")
+
+(define_insn_reservation "thunderx2t99_asimd_store1_onelane" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store1_one_lane,neon_store1_one_lane_q"))
+  "thunderx2t99_ls01,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_store2_mult" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store2_2reg,neon_store2_2reg_q"))
+  "thunderx2t99_ls_both,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_store2_onelane" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store2_one_lane,neon_store2_one_lane_q"))
+  "thunderx2t99_ls01,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_store3_mult" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store3_3reg,neon_store3_3reg_q"))
+  "thunderx2t99_ls_both*3,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_store3_onelane" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store3_one_lane,neon_store3_one_lane_q"))
+  "thunderx2t99_ls_both,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_store4_mult" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store4_4reg,neon_store4_4reg_q"))
+  "thunderx2t99_ls_both*4,thunderx2t99_f01")
+
+(define_insn_reservation "thunderx2t99_asimd_store4_onelane" 1
+  (and (eq_attr "tune" "thunderx2t99")
+       (eq_attr "type" "neon_store4_one_lane,neon_store4_one_lane_q"))
+  "thunderx2t99_ls_both,thunderx2t99_f01")