From patchwork Tue Jan 23 03:53:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 125472 Delivered-To: patch@linaro.org Received: by 10.46.66.141 with SMTP id h13csp1519223ljf; Mon, 22 Jan 2018 19:56:46 -0800 (PST) X-Google-Smtp-Source: AH8x227S/jKrlDG9Blc3naVXg8TI+UqGUPzp7VQDs+wA6/5vYdcM2R4R4boSSgpaNB0b+rZL6TX3 X-Received: by 10.37.144.11 with SMTP id s11mr1095581ybl.354.1516679806027; Mon, 22 Jan 2018 19:56:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516679806; cv=none; d=google.com; s=arc-20160816; b=ZGvVQFLRPh2++jms+gGRQOBXiA0wvcQl8bcfktdNKreOe4xiK9+KWgF+cQK6HJ9SKd c6lnycBJAFi6vU8nfWbzHme7cItvPZFW7rSiyXmO3ISLZ7EdRyYOpD7V9StsNI3qKHvF 0hGjbQa/Erpzac/fviYtGBS9o02yw1uRTjcW91FlJXEmlgjcdZ1SdO2rVl3uVjTjTD/r VrwY7aTLgoUq2xeCM5TGOkRiKOF17c9SxvaJZzPcDKMfdN5HZfR9br2vOuVXc/pFfWi6 S6aMyqT3h0XwCWP5Qr+h/NECqXTWi7h+OT1CMHQtx1kge4bBcuuvQWK3qyDoUYpHPZFf sjPA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=ALs+5LNCOW2HM0Ze3ARm/rJDwNfbOfdrpeBVt29uKZ0=; b=r9hFvCqLI3OZ9ZhJm1NtudW2B49ID9OdF4FB7yqwf+Q/wka6bEYSOt0IeQqcg3Cfxd TDJbAPD/6odyp1NiRXRgnV52BfnfSS2voEN50Sf5YqfelGm5mvB4zOGmnOZBPFrWVgo2 B5fcFMmjQyxk7Eg+KVNaXHQbXhDp4rqpxV2pV3TTqw4mqoawBos3Vr23fftt8LcQXOy4 qylpXY6qksg/jpFDCv5aL326mwWP/sUhjPNOEJvEjZN6tPgj+h/89EV7BLhczkl/s4zx +kMwkCHP8UbL7ttUHcL02xPR1cc64Re1tJjXDlLJw5FCVQoA2oyrXeG4Dcdjuresy15d 9Q0A== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Y0ysfKUu; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id q74si2234601ybg.139.2018.01.22.19.56.45 for (version=TLS1 cipher=AES128-SHA bits=128/128); Mon, 22 Jan 2018 19:56:46 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Y0ysfKUu; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:60730 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1edphV-0003BY-Br for patch@linaro.org; Mon, 22 Jan 2018 22:56:45 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:43197) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1edpen-0001sg-QY for qemu-devel@nongnu.org; Mon, 22 Jan 2018 22:53:59 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1edpel-00044r-Vq for qemu-devel@nongnu.org; Mon, 22 Jan 2018 22:53:57 -0500 Received: from mail-pg0-x242.google.com ([2607:f8b0:400e:c05::242]:34644) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1edpel-00043u-Mj for qemu-devel@nongnu.org; Mon, 22 Jan 2018 22:53:55 -0500 Received: by mail-pg0-x242.google.com with SMTP id r19so8779200pgn.1 for ; Mon, 22 Jan 2018 19:53:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ALs+5LNCOW2HM0Ze3ARm/rJDwNfbOfdrpeBVt29uKZ0=; b=Y0ysfKUuAjVLno29vP6kzVagc+wW2eFH8ZTgI76KCxRN20Bc+1c4Cs2Bod6ndDaIQj 1tKaEfo41K9g4VQNWNulTETnbge6qUcA4YgQBWq61wti+ItCIC9V7dglY/aaSefyv/go 6tY6VmejEYEHSR5TbXmHDX1Xo2rD7ACvn2d08= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ALs+5LNCOW2HM0Ze3ARm/rJDwNfbOfdrpeBVt29uKZ0=; b=KEp15D+5sUtrlos+Wgtx3I24lAYParpBoZj0T4wjtUgLf3VlhHO1fYKu6seURa6xtq 2mn8uxUEjkp7RfL4m/uSm94yMfVaiDQmF5Je0F0ga9ElTdJBRuTWKlULKZERw8MgsUIu GA184PXd7IpW0zOBsQ5iNaE7hxNsY0zbcn6zAW3MoVd465V7kDG9V/25TqiLQiIHiZTi wY4CzdNZ1Mv6Do+cLcJ/Urt+a4qFSmUDtQzYUzVDajLnuYVNUzYJjGDDz9YAozde/DtB /ErvOGwqdEfyq1/Ah3vm4jbu7nzosGmo52ndIg4UbV+5f7mnTSb77GY5gqKUOjeNSsro q2VQ== X-Gm-Message-State: AKwxytc9gxzFkPLb7TAjTRMH0+Jsv3wF3S6uAaM7SdBet1qn4N8MDRz6 ZCr4E2asFRXf5wMjnGKFMGlsf/4y27M= X-Received: by 10.99.167.75 with SMTP id w11mr8164144pgo.215.1516679634206; Mon, 22 Jan 2018 19:53:54 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-6-47.tukw.qwest.net. [174.21.6.47]) by smtp.gmail.com with ESMTPSA id t1sm3444680pfj.21.2018.01.22.19.53.52 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 22 Jan 2018 19:53:53 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 22 Jan 2018 19:53:45 -0800 Message-Id: <20180123035349.24538-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180123035349.24538-1-richard.henderson@linaro.org> References: <20180123035349.24538-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::242 Subject: [Qemu-devel] [PATCH v3 1/5] target/arm: Expand vector registers for SVE X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Change vfp.regs as a uint64_t to vfp.zregs as an ARMVectorReg. The previous patches have made the change in representation relatively painless. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- target/arm/cpu.h | 59 +++++++++++++++++++++++++++++++--------------- target/arm/machine.c | 35 ++++++++++++++++++++++++++- target/arm/translate-a64.c | 8 +++---- target/arm/translate.c | 7 +++--- 4 files changed, 81 insertions(+), 28 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/cpu.h b/target/arm/cpu.h index d2bb59eded..1854fe51a8 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -153,6 +153,42 @@ typedef struct { uint32_t base_mask; } TCR; +/* Define a maximum sized vector register. + * For 32-bit, this is a 128-bit NEON/AdvSIMD register. + * For 64-bit, this is a 2048-bit SVE register. + * + * Note that the mapping between S, D, and Q views of the register bank + * differs between AArch64 and AArch32. + * In AArch32: + * Qn = regs[n].d[1]:regs[n].d[0] + * Dn = regs[n / 2].d[n & 1] + * Sn = regs[n / 4].d[n % 4 / 2], + * bits 31..0 for even n, and bits 63..32 for odd n + * (and regs[16] to regs[31] are inaccessible) + * In AArch64: + * Zn = regs[n].d[*] + * Qn = regs[n].d[1]:regs[n].d[0] + * Dn = regs[n].d[0] + * Sn = regs[n].d[0] bits 31..0 + * + * This corresponds to the architecturally defined mapping between + * the two execution states, and means we do not need to explicitly + * map these registers when changing states. + * + * Align the data for use with TCG host vector operations. + */ + +#ifdef TARGET_AARCH64 +# define ARM_MAX_VQ 16 +#else +# define ARM_MAX_VQ 1 +#endif + +typedef struct ARMVectorReg { + uint64_t d[2 * ARM_MAX_VQ] QEMU_ALIGNED(16); +} ARMVectorReg; + + typedef struct CPUARMState { /* Regs for current mode. */ uint32_t regs[16]; @@ -477,22 +513,7 @@ typedef struct CPUARMState { /* VFP coprocessor state. */ struct { - /* VFP/Neon register state. Note that the mapping between S, D and Q - * views of the register bank differs between AArch64 and AArch32: - * In AArch32: - * Qn = regs[2n+1]:regs[2n] - * Dn = regs[n] - * Sn = regs[n/2] bits 31..0 for even n, and bits 63..32 for odd n - * (and regs[32] to regs[63] are inaccessible) - * In AArch64: - * Qn = regs[2n+1]:regs[2n] - * Dn = regs[2n] - * Sn = regs[2n] bits 31..0 - * This corresponds to the architecturally defined mapping between - * the two execution states, and means we do not need to explicitly - * map these registers when changing states. - */ - uint64_t regs[64]; + ARMVectorReg zregs[32]; uint32_t xregs[16]; /* We store these fpcsr fields separately for convenience. */ @@ -2769,7 +2790,7 @@ static inline void *arm_get_el_change_hook_opaque(ARMCPU *cpu) */ static inline uint64_t *aa32_vfp_dreg(CPUARMState *env, unsigned regno) { - return &env->vfp.regs[regno]; + return &env->vfp.zregs[regno >> 1].d[regno & 1]; } /** @@ -2778,7 +2799,7 @@ static inline uint64_t *aa32_vfp_dreg(CPUARMState *env, unsigned regno) */ static inline uint64_t *aa32_vfp_qreg(CPUARMState *env, unsigned regno) { - return &env->vfp.regs[2 * regno]; + return &env->vfp.zregs[regno].d[0]; } /** @@ -2787,7 +2808,7 @@ static inline uint64_t *aa32_vfp_qreg(CPUARMState *env, unsigned regno) */ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno) { - return &env->vfp.regs[2 * regno]; + return &env->vfp.zregs[regno].d[0]; } #endif diff --git a/target/arm/machine.c b/target/arm/machine.c index a85c2430d3..cb0e1c92bb 100644 --- a/target/arm/machine.c +++ b/target/arm/machine.c @@ -50,7 +50,40 @@ static const VMStateDescription vmstate_vfp = { .minimum_version_id = 3, .needed = vfp_needed, .fields = (VMStateField[]) { - VMSTATE_UINT64_ARRAY(env.vfp.regs, ARMCPU, 64), + /* For compatibility, store Qn out of Zn here. */ + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[0].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[1].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[2].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[3].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[4].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[5].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[6].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[7].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[8].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[9].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[10].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[11].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[12].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[13].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[14].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[15].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[16].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[17].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[18].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[19].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[20].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[21].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[22].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[23].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[24].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[25].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[26].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[27].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[28].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[29].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[30].d, ARMCPU, 0, 2), + VMSTATE_UINT64_SUB_ARRAY(env.vfp.zregs[31].d, ARMCPU, 0, 2), + /* The xregs array is a little awkward because element 1 (FPSCR) * requires a specific accessor, so we have to split it up in * the vmstate: diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index eed64c73e5..10eef870fe 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -517,8 +517,8 @@ static inline int vec_reg_offset(DisasContext *s, int regno, { int offs = 0; #ifdef HOST_WORDS_BIGENDIAN - /* This is complicated slightly because vfp.regs[2n] is - * still the low half and vfp.regs[2n+1] the high half + /* This is complicated slightly because vfp.zregs[n].d[0] is + * still the low half and vfp.zregs[n].d[1] the high half * of the 128 bit vector, even on big endian systems. * Calculate the offset assuming a fully bigendian 128 bits, * then XOR to account for the order of the two 64 bit halves. @@ -528,7 +528,7 @@ static inline int vec_reg_offset(DisasContext *s, int regno, #else offs += element * (1 << size); #endif - offs += offsetof(CPUARMState, vfp.regs[regno * 2]); + offs += offsetof(CPUARMState, vfp.zregs[regno]); assert_fp_access_checked(s); return offs; } @@ -537,7 +537,7 @@ static inline int vec_reg_offset(DisasContext *s, int regno, static inline int vec_full_reg_offset(DisasContext *s, int regno) { assert_fp_access_checked(s); - return offsetof(CPUARMState, vfp.regs[regno * 2]); + return offsetof(CPUARMState, vfp.zregs[regno]); } /* Return a newly allocated pointer to the vector register. */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 55826b7e5a..a8c13d3758 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -1512,13 +1512,12 @@ static inline void gen_vfp_st(DisasContext *s, int dp, TCGv_i32 addr) } } -static inline long -vfp_reg_offset (int dp, int reg) +static inline long vfp_reg_offset(bool dp, unsigned reg) { if (dp) { - return offsetof(CPUARMState, vfp.regs[reg]); + return offsetof(CPUARMState, vfp.zregs[reg >> 1].d[reg & 1]); } else { - long ofs = offsetof(CPUARMState, vfp.regs[reg >> 1]); + long ofs = offsetof(CPUARMState, vfp.zregs[reg >> 2].d[(reg >> 1) & 1]); if (reg & 1) { ofs += offsetof(CPU_DoubleU, l.upper); } else {