From patchwork Wed May 16 22:29:44 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 136051
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Wed, 16 May 2018 15:29:44 -0700
Message-Id: <20180516223007.10256-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180516223007.10256-1-richard.henderson@linaro.org>
References: <20180516223007.10256-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v3-a 04/27] target/arm: Implement SVE load vector/predicate
Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 127 +++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  20 ++++++
 2 files changed, 147 insertions(+)

--
2.17.0

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 67d6db313e..5ec18a6aac 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -42,6 +42,20 @@
  * Implement all of the translator functions referenced by the decoder.
  */
 
+/* Return the offset into CPUARMState of the predicate vector register Pn.
+ * Note for this purpose, FFR is P16.
+ */
+static inline int pred_full_reg_offset(DisasContext *s, int regno)
+{
+    return offsetof(CPUARMState, vfp.pregs[regno]);
+}
+
+/* Return the byte size of the whole predicate register, VL / 64. */
+static inline int pred_full_reg_size(DisasContext *s)
+{
+    return s->sve_len >> 3;
+}
+
 /* Invoke a vector expander on two Zregs. */
 static bool do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
                          int esz, int rd, int rn)
@@ -100,3 +114,116 @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
 {
     return do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
 }
+
+/*
+ *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
+ */
+
+/* Subroutine loading a vector register at VOFS of LEN bytes.
+ * The load should begin at the address Rn + IMM.
+ */
+
+static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
+                   int rn, int imm)
+{
+    uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
+    uint32_t len_remain = len % 8;
+    uint32_t nparts = len / 8 + ctpop8(len_remain);
+    int midx = get_mem_index(s);
+    TCGv_i64 addr, t0, t1;
+
+    addr = tcg_temp_new_i64();
+    t0 = tcg_temp_new_i64();
+
+    /* Note that unpredicated load/store of vector/predicate registers
+     * are defined as a stream of bytes, which equates to little-endian
+     * operations on larger quantities. There is no nice way to force
+     * a little-endian load for aarch64_be-linux-user out of line.
+     *
+     * Attempt to keep code expansion to a minimum by limiting the
+     * amount of unrolling done.
+     */
+    if (nparts <= 4) {
+        int i;
+
+        for (i = 0; i < len_align; i += 8) {
+            tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
+            tcg_gen_st_i64(t0, cpu_env, vofs + i);
+        }
+    } else {
+        TCGLabel *loop = gen_new_label();
+        TCGv_ptr tp, i = tcg_const_local_ptr(0);
+
+        gen_set_label(loop);
+
+        /* Minimize the number of local temps that must be re-read from
+         * the stack each iteration. Instead, re-compute values other
+         * than the loop counter.
+         */
+        tp = tcg_temp_new_ptr();
+        tcg_gen_addi_ptr(tp, i, imm);
+        tcg_gen_extu_ptr_i64(addr, tp);
+        tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
+
+        tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
+
+        tcg_gen_add_ptr(tp, cpu_env, i);
+        tcg_gen_addi_ptr(i, i, 8);
+        tcg_gen_st_i64(t0, tp, vofs);
+        tcg_temp_free_ptr(tp);
+
+        tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
+        tcg_temp_free_ptr(i);
+    }
+
+    /* Predicate register loads can be any multiple of 2.
+     * Note that we still store the entire 64-bit unit into cpu_env.
+     */
+    if (len_remain) {
+        tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
+
+        switch (len_remain) {
+        case 2:
+        case 4:
+        case 8:
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
+            break;
+
+        case 6:
+            t1 = tcg_temp_new_i64();
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUL);
+            tcg_gen_addi_i64(addr, addr, 4);
+            tcg_gen_qemu_ld_i64(t1, addr, midx, MO_LEUW);
+            tcg_gen_deposit_i64(t0, t0, t1, 32, 32);
+            tcg_temp_free_i64(t1);
+            break;
+
+        default:
+            g_assert_not_reached();
+        }
+        tcg_gen_st_i64(t0, cpu_env, vofs + len_align);
+    }
+    tcg_temp_free_i64(addr);
+    tcg_temp_free_i64(t0);
+}
+
+static bool trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        int size = vec_full_reg_size(s);
+        int off = vec_full_reg_offset(s, a->rd);
+        do_ldr(s, off, size, a->rn, a->imm * size);
+    }
+    return true;
+}
+
+static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        int size = pred_full_reg_size(s);
+        int off = pred_full_reg_offset(s, a->rd);
+        do_ldr(s, off, size, a->rn, a->imm * size);
+    }
+    return true;
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 48dac9f71f..a2c4450e7c 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -19,11 +19,17 @@
 # This file is processed by scripts/decodetree.py
 #
 
+###########################################################################
+# Named fields. These are primarily for disjoint fields.
+
+%imm9_16_10     16:s6 10:3
+
 ###########################################################################
 # Named attribute sets. These are used to make nice(er) names
 # when creating helpers common to those for the individual
 # instruction patterns.
 
+&rri            rd rn imm
 &rrr_esz        rd rn rm esz
 
 ###########################################################################
@@ -33,6 +39,12 @@
 # Three operand with unused vector element size
 @rd_rn_rm_e0    ........ ... rm:5 ... ... rn:5 rd:5     &rrr_esz esz=0
 
+# Basic Load/Store with 9-bit immediate offset
+@pd_rn_i9       ........ ........ ...... rn:5 . rd:4    \
+                &rri imm=%imm9_16_10
+@rd_rn_i9       ........ ........ ...... rn:5 rd:5      \
+                &rri imm=%imm9_16_10
+
 ###########################################################################
 # Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
 
@@ -43,3 +55,11 @@ AND_zzz 00000100 00 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
 ORR_zzz         00000100 01 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
 EOR_zzz         00000100 10 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
 BIC_zzz         00000100 11 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
+
+### SVE Memory - 32-bit Gather and Unsized Contiguous Group
+
+# SVE load predicate register
+LDR_pri         10000101 10 ...... 000 ... ..... 0 ....         @pd_rn_i9
+
+# SVE load vector register
+LDR_zri         10000101 10 ...... 010 ... ..... .....          @rd_rn_i9
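
Illustration only, not part of the patch: the arithmetic above can be
checked outside QEMU with a small standalone C sketch. It models two
things under stated assumptions: how the split field %imm9_16_10 (a
signed 6-bit field at bit 16 followed by a 3-bit field at bit 10) forms
the 9-bit immediate, and how do_ldr divides LEN bytes into whole 64-bit
loads plus a remainder. The names sext_field, decode_imm9, struct
ldr_plan and plan_ldr are hypothetical and appear nowhere in QEMU; no TCG
code is generated here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the LEN-bit field of INSN at bit POS. */
static int32_t sext_field(uint32_t insn, int pos, int len)
{
    int32_t v = (int32_t)((insn >> pos) & ((1u << len) - 1));
    int32_t sign = (int32_t)(1u << (len - 1));
    return (v ^ sign) - sign;
}

/* Model of "%imm9_16_10 16:s6 10:3": the signed field insn[21:16]
 * supplies the high bits and the unsigned field insn[12:10] the low 3.
 */
static int32_t decode_imm9(uint32_t insn)
{
    return sext_field(insn, 16, 6) * 8 + (int32_t)((insn >> 10) & 7);
}

/* Hypothetical summary of how do_ldr splits a LEN-byte transfer. */
struct ldr_plan {
    uint32_t len_align;   /* bytes covered by whole 64-bit loads */
    uint32_t len_remain;  /* trailing bytes: 0, 2, 4 or 6 */
    uint32_t nparts;      /* 64-bit loads plus remainder accesses */
    int unrolled;         /* nonzero if the 64-bit loads are emitted inline */
};

static struct ldr_plan plan_ldr(uint32_t len)
{
    struct ldr_plan p;
    uint32_t r;

    assert(len % 2 == 0);            /* register sizes are multiples of 2 */
    p.len_align = len - len % 8;     /* same value as QEMU_ALIGN_DOWN(len, 8) */
    p.len_remain = len % 8;
    p.nparts = len / 8;
    for (r = p.len_remain; r; r &= r - 1) {
        p.nparts++;                  /* one access per set bit: 6 -> 4 + 2 */
    }
    p.unrolled = p.nparts <= 4;      /* same threshold as the patch */
    return p;
}
```

For example, a 30-byte predicate register (1920-bit vectors) gives
plan_ldr(30) = three 64-bit loads plus a 4-byte and a 2-byte access,
five parts in total, so the 64-bit loads would go through the loop
rather than being unrolled. The byte offset passed to do_ldr is then the
decoded immediate times the register size, as in trans_LDR_zri and
trans_LDR_pri.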