From patchwork Thu Jun 21 01:53:59 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139428 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1484852lji; Wed, 20 Jun 2018 19:17:13 -0700 (PDT) X-Google-Smtp-Source: ADUXVKL7ypSVIppHWnKYY5tfXnjafjmT7k2J3rRE1+yAYdp8tY78llfa48G7gzqpBEblnMRSe36V X-Received: by 2002:ac8:5416:: with SMTP id b22-v6mr21857440qtq.332.1529547433018; Wed, 20 Jun 2018 19:17:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547433; cv=none; d=google.com; s=arc-20160816; b=nj9DJ0HuraVALy8bfyQNKUkpsohCCPWHDCKahyjjtL3Fbnb8PnEo8QbwBgbU/fHFoN l+dVzhH4AofC6Fpx5n4zanuledhVaz8sT6kwCUHX8wVogjfL1C732wUjTy4GFEUXv9y4 vRNjSoe15CCaSOUQxdAuR0rw6VhgSF7JGgkVRcE/fqDZxROPXSFeHSeR2F/OLJASskAl FrDNNvLwlgybHrv9gQyit5vBiLLCEI4Im73y0Gt+oig+R1Clo5uBeDK0Vg82wdHygXXs j5CTrR5he/qmOPszJFGtgjscWTvDkYdP3S1PggW9bn2sdT+eB7P/vRyulHJxUzQ66Ubi B3ew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=DAvQ6d8C7WhBxzP4vcx1hFuoE7NEqSILwo/Uh1DOpfk=; b=P4pKQJUD9nDGApjDayDnYLYGK2WhNSMV6uUkNSQMeBoYjw9Cwk2XW2NOlHGJEcFyKu t+xXv9X3bBIxIB1YxGjRHPvMUP4wRKX1hG0o//Gk9MIf+fHso9Xj43gmXj9HYoOIRQtb VrfO0oI3CQC96KxAQFEPjFOurYmJsyOz0ThNXfC61lpEe51HZ0M8UFU+os2RJmUjNkCB rYD5XoxWkZeRXRQnhagP+lmfAMk380gwwoEoyiXYOGuq/51uy52eNkvpMqTBTnEoec1K ZgbVG2nW3bTH2Rx7zBcrpjrvWIMEP+OipmTHP/nxwgs5LwzX+YwMmJBrSfX3VW9ohDWB O+aQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=MWxRzx4G; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id 75-v6si3585259qki.95.2018.06.20.19.17.12 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:17:13 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=MWxRzx4G; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52598 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp9s-0004pI-F6 for patch@linaro.org; Wed, 20 Jun 2018 22:17:12 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39637) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVood-00052B-N9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoob-0003bI-Os for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:15 -0400 Received: from mail-pf0-x22a.google.com ([2607:f8b0:400e:c00::22a]:42129) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoob-0003af-FW for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: by mail-pf0-x22a.google.com with SMTP id w7-v6so683532pfn.9 for ; Wed, 20 Jun 2018 18:55:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=DAvQ6d8C7WhBxzP4vcx1hFuoE7NEqSILwo/Uh1DOpfk=; b=MWxRzx4GMNO4DAGIBuxFSOTcl1DHvFGagls5ki1R7inZcuW1pMMXAeT3qYVa7uIx0G lXpHLhvYFuSIpeJLlMcqdH/fgUeKCDWQ4iY0rgfXIu9Q94x+4H3AriOZtvrd8NOSb7MO n1Ix5VE6dYxvyLBeQN0AVFyICi2W/3l/SD5v8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=DAvQ6d8C7WhBxzP4vcx1hFuoE7NEqSILwo/Uh1DOpfk=; b=gsLvqoLiwR5IxexjjaGrSWyvM9WcLxCrbfNsR65hzSMkkWK7YFWxIp1S0ou3mipnMT YJPvkuqGHoyE8MLNPecJ5iq7R0SmBdwfrzAZeKIFzBD7zCLz0mbXPm57Q59qd8EUL7vS 8qGRLn/ARb7hhmw3eWu2ZNq0+/fsvba+GN/9muTojMbzR2oPQth1ksE13IMF9nseDkuA u+BKa9Z7lOSh8zzf6UnCSv4rGc1L9ASdVf0ajgDY2rACfE4wli4i8GpCucveZwCeG+QN D0jDBK2Sgbu0Xxe7biqHNJH7PlmmuM5wY3gSS9sajXcQcqH3RjxZzqzbKfwMlpjjwrgY t79A== X-Gm-Message-State: APt69E2jJMRB7h1GGgRp5fYICwFrGkzcV/77xxFa4rq0q0fysBqHLRVf Q4BKXfi7anHYvaXSgYgtUWH8LW0mFxc= X-Received: by 2002:a65:43c9:: with SMTP id n9-v6mr20527147pgp.399.1529546112151; Wed, 20 Jun 2018 18:55:12 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.10 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:59 -1000 Message-Id: <20180621015359.12018-36-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22a Subject: [Qemu-devel] [PATCH v5 35/35] target/arm: Implement ARMv8.2-DotProd X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" We've already added the helpers with an SVE patch, all that remains is to wire up the aa64 and aa32 translators. Enable the feature within -cpu max for CONFIG_USER_ONLY. Signed-off-by: Richard Henderson --- target/arm/cpu.h | 1 + linux-user/elfload.c | 1 + target/arm/cpu.c | 1 + target/arm/cpu64.c | 1 + target/arm/translate-a64.c | 36 +++++++++++++++++ target/arm/translate.c | 81 ++++++++++++++++++++++++++------------ 6 files changed, 96 insertions(+), 25 deletions(-) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 8488273c5b..23098474e1 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -1480,6 +1480,7 @@ enum arm_features { ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */ ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */ ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */ + ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */ ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */ ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions. */ }; diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 13bc78d0c8..bdb023b477 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -583,6 +583,7 @@ static uint32_t get_elf_hwcap(void) ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP); GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS); GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM); + GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP); GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA); #undef GET_FEATURE diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 8e4f4d8c21..95ac9c064d 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1794,6 +1794,7 @@ static void arm_max_initfn(Object *obj) set_feature(&cpu->env, ARM_FEATURE_V8_PMULL); set_feature(&cpu->env, ARM_FEATURE_CRC); set_feature(&cpu->env, ARM_FEATURE_V8_RDM); + set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD); set_feature(&cpu->env, ARM_FEATURE_V8_FCMA); #endif } diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index 0360d7efc5..3b4bc73ffa 100644 --- a/target/arm/cpu64.c +++ b/target/arm/cpu64.c @@ -250,6 +250,7 @@ static void aarch64_max_initfn(Object *obj) set_feature(&cpu->env, ARM_FEATURE_CRC); set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS); set_feature(&cpu->env, ARM_FEATURE_V8_RDM); + set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD); set_feature(&cpu->env, ARM_FEATURE_V8_FP16); set_feature(&cpu->env, ARM_FEATURE_V8_FCMA); set_feature(&cpu->env, ARM_FEATURE_SVE); diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 038e48278f..903d6233d3 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -640,6 +640,16 @@ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, vec_full_reg_size(s), gvec_op); } +/* Expand a 3-operand operation using an out-of-line helper. */ +static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, + int rn, int rm, int data, gen_helper_gvec_3 *fn) +{ + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + is_q ? 16 : 8, vec_full_reg_size(s), data, fn); +} + /* Expand a 3-operand + env pointer operation using * an out-of-line helper. */ @@ -11336,6 +11346,14 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } feature = ARM_FEATURE_V8_RDM; break; + case 0x02: /* SDOT (vector) */ + case 0x12: /* UDOT (vector) */ + if (size != MO_32) { + unallocated_encoding(s); + return; + } + feature = ARM_FEATURE_V8_DOTPROD; + break; case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -11389,6 +11407,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } return; + case 0x2: /* SDOT / UDOT */ + gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0, + u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b); + return; + case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -12568,6 +12591,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) return; } break; + case 0x0e: /* SDOT */ + case 0x1e: /* UDOT */ + if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + unallocated_encoding(s); + return; + } + break; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ @@ -12665,6 +12695,12 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } switch (16 * u + opcode) { + case 0x0e: /* SDOT */ + case 0x1e: /* UDOT */ + gen_gvec_op3_ool(s, is_q, rd, rn, rm, index, + u ? gen_helper_gvec_udot_idx_b + : gen_helper_gvec_sdot_idx_b); + return; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ diff --git a/target/arm/translate.c b/target/arm/translate.c index f405c82fb2..09f2852b29 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -7748,9 +7748,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) */ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) { - gen_helper_gvec_3_ptr *fn_gvec_ptr; - int rd, rn, rm, rot, size, opr_sz; - TCGv_ptr fpst; + gen_helper_gvec_3 *fn_gvec = NULL; + gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL; + int rd, rn, rm, opr_sz; + int data = 0; bool q; q = extract32(insn, 6, 1); @@ -7763,8 +7764,8 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) if ((insn & 0xfe200f10) == 0xfc200800) { /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */ - size = extract32(insn, 20, 1); - rot = extract32(insn, 23, 2); + int size = extract32(insn, 20, 1); + data = extract32(insn, 23, 2); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; @@ -7772,13 +7773,20 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah; } else if ((insn & 0xfea00f10) == 0xfc800800) { /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */ - size = extract32(insn, 20, 1); - rot = extract32(insn, 24, 1); + int size = extract32(insn, 20, 1); + data = extract32(insn, 24, 1); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; } fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh; + } else if ((insn & 0xfeb00f00) == 0xfc200d00) { + /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */ + bool u = extract32(insn, 4, 1); + if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + return 1; + } + fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b; } else { return 1; } @@ -7793,12 +7801,19 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) } opr_sz = (1 + q) * 8; - fpst = get_fpstatus_ptr(1); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), fpst, - opr_sz, opr_sz, rot, fn_gvec_ptr); - tcg_temp_free_ptr(fpst); + if (fn_gvec_ptr) { + TCGv_ptr fpst = get_fpstatus_ptr(1); + tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), fpst, + opr_sz, opr_sz, data, fn_gvec_ptr); + tcg_temp_free_ptr(fpst); + } else { + tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), + opr_sz, opr_sz, data, fn_gvec); + } return 0; } @@ -7812,8 +7827,10 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) { - int rd, rn, rm, rot, size, opr_sz; - TCGv_ptr fpst; + gen_helper_gvec_3 *fn_gvec = NULL; + gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL; + int rd, rn, rm, opr_sz; + int data = 0; bool q; q = extract32(insn, 6, 1); @@ -7826,12 +7843,21 @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) if ((insn & 0xff000f10) == 0xfe000800) { /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */ - rot = extract32(insn, 20, 2); - size = extract32(insn, 23, 1); + int size = extract32(insn, 23, 1); + data = extract32(insn, 20, 2); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; } + fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx + : gen_helper_gvec_fcmlah_idx); + } else if ((insn & 0xffb00f00) == 0xfe200d00) { + /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */ + int u = extract32(insn, 4, 1); + if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + return 1; + } + fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b; } else { return 1; } @@ -7846,14 +7872,19 @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) } opr_sz = (1 + q) * 8; - fpst = get_fpstatus_ptr(1); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), fpst, - opr_sz, opr_sz, rot, - size ? gen_helper_gvec_fcmlas_idx - : gen_helper_gvec_fcmlah_idx); - tcg_temp_free_ptr(fpst); + if (fn_gvec_ptr) { + TCGv_ptr fpst = get_fpstatus_ptr(1); + tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), fpst, + opr_sz, opr_sz, data, fn_gvec_ptr); + tcg_temp_free_ptr(fpst); + } else { + tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), + opr_sz, opr_sz, data, fn_gvec); + } return 0; }