From patchwork Sat Feb 17 18:22:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128677
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:21 -0800
Message-Id: <20180217182323.25885-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 05/67] target/arm: Implement SVE load vector/predicate
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 132 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  22 +++++++-
 2 files changed, 153 insertions(+), 1 deletion(-)

-- 
2.14.3

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 50cf2a1fdd..c0cccfda6f 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -46,6 +46,19 @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
  * Implement all of the translator functions referenced by the decoder.
  */
 
+/* Return the offset into CPUARMState of the predicate vector register Pn.
+ * Note for this purpose, FFR is P16. */
+static inline int pred_full_reg_offset(DisasContext *s, int regno)
+{
+    return offsetof(CPUARMState, vfp.pregs[regno]);
+}
+
+/* Return the byte size of the whole predicate register, VL / 64. */
+static inline int pred_full_reg_size(DisasContext *s)
+{
+    return s->sve_len >> 3;
+}
+
 /* Invoke a vector expander on two Zregs. */
 static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
                          int esz, int rd, int rn)
@@ -97,3 +110,122 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn)
 {
     do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
 }
+
+/*
+ *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
+ */
+
+/* Subroutine loading a vector register at VOFS of LEN bytes.
+ * The load should begin at the address Rn + IMM.
+ */
+
+#if UINTPTR_MAX == UINT32_MAX
+# define ptr i32
+#else
+# define ptr i64
+#endif
+
+static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
+                   int rn, int imm)
+{
+    uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
+    uint32_t len_remain = len % 8;
+    uint32_t nparts = len / 8 + ctpop8(len_remain);
+    int midx = get_mem_index(s);
+    TCGv_i64 addr, t0, t1;
+
+    addr = tcg_temp_new_i64();
+    t0 = tcg_temp_new_i64();
+
+    /* Note that unpredicated load/store of vector/predicate registers
+     * are defined as a stream of bytes, which equates to little-endian
+     * operations on larger quantities. There is no nice way to force
+     * a little-endian load for aarch64_be-linux-user out of line.
+     *
+     * Attempt to keep code expansion to a minimum by limiting the
+     * amount of unrolling done.
+     */
+    if (nparts <= 4) {
+        int i;
+
+        for (i = 0; i < len_align; i += 8) {
+            tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
+            tcg_gen_st_i64(t0, cpu_env, vofs + i);
+        }
+    } else {
+        TCGLabel *loop = gen_new_label();
+        TCGv_ptr i = TCGV_NAT_TO_PTR(glue(tcg_const_local_, ptr)(0));
+        TCGv_ptr dest;
+
+        gen_set_label(loop);
+
+        /* Minimize the number of local temps that must be re-read from
+         * the stack each iteration. Instead, re-compute values other
+         * than the loop counter.
+         */
+        dest = tcg_temp_new_ptr();
+        tcg_gen_addi_ptr(dest, i, imm);
+#if UINTPTR_MAX == UINT32_MAX
+        tcg_gen_extu_i32_i64(addr, TCGV_PTR_TO_NAT(dest));
+        tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
+#else
+        tcg_gen_add_i64(addr, TCGV_PTR_TO_NAT(dest), cpu_reg_sp(s, rn));
+#endif
+
+        tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
+
+        tcg_gen_add_ptr(dest, cpu_env, i);
+        tcg_gen_addi_ptr(i, i, 8);
+        tcg_gen_st_i64(t0, dest, vofs);
+        tcg_temp_free_ptr(dest);
+
+        glue(tcg_gen_brcondi_, ptr)(TCG_COND_LTU, TCGV_PTR_TO_NAT(i),
+                                    len_align, loop);
+        tcg_temp_free_ptr(i);
+    }
+
+    /* Predicate register loads can be any multiple of 2.
+     * Note that we still store the entire 64-bit unit into cpu_env.
+     */
+    if (len_remain) {
+        tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
+
+        switch (len_remain) {
+        case 2:
+        case 4:
+        case 8:
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
+            break;
+
+        case 6:
+            t1 = tcg_temp_new_i64();
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUL);
+            tcg_gen_addi_i64(addr, addr, 4);
+            tcg_gen_qemu_ld_i64(t1, addr, midx, MO_LEUW);
+            tcg_gen_deposit_i64(t0, t0, t1, 32, 32);
+            tcg_temp_free_i64(t1);
+            break;
+
+        default:
+            g_assert_not_reached();
+        }
+        tcg_gen_st_i64(t0, cpu_env, vofs + len_align);
+    }
+    tcg_temp_free_i64(addr);
+    tcg_temp_free_i64(t0);
+}
+
+#undef ptr
+
+static void trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+    int size = vec_full_reg_size(s);
+    do_ldr(s, vec_full_reg_offset(s, a->rd), size, a->rn, a->imm * size);
+}
+
+static void trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+    int size = pred_full_reg_size(s);
+    do_ldr(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size);
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 2c13a6024a..0c6a7ba34d 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -19,11 +19,17 @@
 # This file is processed by scripts/decodetree.py
 #
 
+###########################################################################
+# Named fields. These are primarily for disjoint fields.
+
+%imm9_16_10     16:s6 10:3
+
 ###########################################################################
 # Named attribute sets. These are used to make nice(er) names
 # when creating helpers common to those for the individual
 # instruction patterns.
 
+&rri            rd rn imm
 &rrr_esz        rd rn rm esz
 
 ###########################################################################
@@ -31,7 +37,13 @@
 # reduce the amount of duplication between instruction patterns.
 
 # Three operand with unused vector element size
-@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
+@rd_rn_rm_e0    ........ ... rm:5 ... ... rn:5 rd:5     &rrr_esz esz=0
+
+# Basic Load/Store with 9-bit immediate offset
+@pd_rn_i9       ........ ........ ...... rn:5 . rd:4    \
+                &rri imm=%imm9_16_10
+@rd_rn_i9       ........ ........ ...... rn:5 rd:5      \
+                &rri imm=%imm9_16_10
 
 ###########################################################################
 # Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
@@ -43,3 +55,11 @@ AND_zzz 00000100 00 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
 ORR_zzz         00000100 01 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
 EOR_zzz         00000100 10 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
 BIC_zzz         00000100 11 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
+
+### SVE Memory - 32-bit Gather and Unsized Contiguous Group
+
+# SVE load predicate register
+LDR_pri         10000101 10 ...... 000 ... ..... 0 ....         @pd_rn_i9
+
+# SVE load vector register
+LDR_zri         10000101 10 ...... 010 ... ..... .....          @rd_rn_i9
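
Illustration only, not part of the patch: a standalone C sketch of the arithmetic behind the decode field and the length handling in do_ldr. It assumes that "%imm9_16_10 16:s6 10:3" means insn[21:16] sign-extended as the high part and insn[12:10] as the low three bits, and that a 6-byte tail is assembled from a 4-byte and a 2-byte little-endian load. All helper names below (imm9_16_10, popcount8, plan_ldr, load_le, load6_le) are hypothetical and do not exist in QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: rebuild the signed 9-bit immediate from the split field. */
static int32_t imm9_16_10(uint32_t insn)
{
    uint32_t hi6 = (insn >> 16) & 0x3f;         /* insn[21:16] */
    int32_t hi = (int32_t)(hi6 ^ 0x20) - 0x20;  /* sign-extend 6 bits */
    int32_t lo = (int32_t)((insn >> 10) & 0x7); /* insn[12:10] */
    return hi * 8 + lo;                         /* hi:lo concatenated */
}

/* Hypothetical stand-in for ctpop8: count the set bits in a byte. */
static unsigned popcount8(uint8_t v)
{
    unsigned n = 0;
    for (; v; v >>= 1) {
        n += v & 1;
    }
    return n;
}

struct ldr_plan {
    uint32_t len_align;   /* bytes covered by whole 8-byte loads */
    uint32_t len_remain;  /* tail bytes: 0, 2, 4 or 6 */
    uint32_t nparts;      /* total number of guest loads needed */
};

/* Split LEN the same way do_ldr does before choosing unroll vs. loop. */
static struct ldr_plan plan_ldr(uint32_t len)
{
    struct ldr_plan p;
    assert(len % 2 == 0);  /* register sizes are multiples of 2 bytes */
    p.len_align = len & ~7u;
    p.len_remain = len % 8;
    /* A 6-byte tail needs two loads (4 + 2), hence the popcount. */
    p.nparts = len / 8 + popcount8((uint8_t)p.len_remain);
    return p;
}

/* Byte-stream (little-endian) load of NBYTES <= 8 from host memory. */
static uint64_t load_le(const uint8_t *p, unsigned nbytes)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < nbytes; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* 6-byte tail: 32-bit load, then a 16-bit load deposited at bit 32. */
static uint64_t load6_le(const uint8_t *p)
{
    uint64_t lo = load_le(p, 4);
    uint64_t hi = load_le(p + 4, 2);
    return lo | (hi << 32);
}
```

Under these assumptions, a 22-byte load splits into two 8-byte parts plus a 6-byte tail, giving nparts = 4, which is the largest count that still takes the unrolled path in do_ldr.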