From patchwork Thu Aug 9 04:21:55 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 143666
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Wed, 8 Aug 2018 21:21:55 -0700
Message-Id: <20180809042206.15726-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180809042206.15726-1-richard.henderson@linaro.org>
References: <20180809042206.15726-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 09/20] target/arm: Handle SVE vector length changes in system mode
Cc: laurent.desnogues@gmail.com, peter.maydell@linaro.org, alex.bennee@linaro.org

SVE vector length can change when changing EL, or when writing
to one of the ZCR_ELn registers.

For correctness, our implementation requires that inaccessible
predicate bits are never set, which means noticing length changes
and zeroing the appropriate register bits.

Signed-off-by: Richard Henderson
---
 target/arm/cpu.h       |   4 ++
 target/arm/cpu64.c     |  42 --------------
 target/arm/helper.c    | 127 ++++++++++++++++++++++++++++++++++++-----
 target/arm/op_helper.c |   1 +
 4 files changed, 119 insertions(+), 55 deletions(-)

-- 
2.17.1

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index ed51a2f5aa..18b3c92c2e 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -910,6 +910,10 @@ int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
 int aarch64_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
 int aarch64_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq);
+void aarch64_sve_change_el(CPUARMState *env, int old_el, int new_el);
+#else
+static inline void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq) { }
+static inline void aarch64_sve_change_el(CPUARMState *env, int o, int n) { }
 #endif
 
 target_ulong do_arm_semihosting(CPUARMState *env);
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index ae650b608e..16272f1358 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -439,45 +439,3 @@ static void aarch64_cpu_register_types(void)
 }
 
 type_init(aarch64_cpu_register_types)
-
-/* The manual says that when SVE is enabled and VQ is widened the
- * implementation is allowed to zero the previously inaccessible
- * portion of the registers. The corollary to that is that when
- * SVE is enabled and VQ is narrowed we are also allowed to zero
- * the now inaccessible portion of the registers.
- *
- * The intent of this is that no predicate bit beyond VQ is ever set.
- * Which means that some operations on predicate registers themselves
- * may operate on full uint64_t or even unrolled across the maximum
- * uint64_t[4]. Performing 4 bits of host arithmetic unconditionally
- * may well be cheaper than conditionals to restrict the operation
- * to the relevant portion of a uint16_t[16].
- *
- * TODO: Need to call this for changes to the real system registers
- * and EL state changes.
- */
-void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
-{
-    int i, j;
-    uint64_t pmask;
-
-    assert(vq >= 1 && vq <= ARM_MAX_VQ);
-    assert(vq <= arm_env_get_cpu(env)->sve_max_vq);
-
-    /* Zap the high bits of the zregs. */
-    for (i = 0; i < 32; i++) {
-        memset(&env->vfp.zregs[i].d[2 * vq], 0, 16 * (ARM_MAX_VQ - vq));
-    }
-
-    /* Zap the high bits of the pregs and ffr. */
-    pmask = 0;
-    if (vq & 3) {
-        pmask = ~(-1ULL << (16 * (vq & 3)));
-    }
-    for (j = vq / 4; j < ARM_MAX_VQ / 4; j++) {
-        for (i = 0; i < 17; ++i) {
-            env->vfp.pregs[i].p[j] &= pmask;
-        }
-        pmask = 0;
-    }
-}
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 290b1a849e..fb79b27cf6 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4399,11 +4399,44 @@ static int sve_exception_el(CPUARMState *env, int el)
     return 0;
 }
 
+/*
+ * Given that SVE is enabled, return the vector length for EL.
+ */
+static uint32_t sve_zcr_len_for_el(CPUARMState *env, int el)
+{
+    ARMCPU *cpu = arm_env_get_cpu(env);
+    uint32_t zcr_len = cpu->sve_max_vq - 1;
+
+    if (el <= 1) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
+    }
+    if (el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
+    }
+    if (el < 3 && arm_feature(env, ARM_FEATURE_EL3)) {
+        zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
+    }
+    return zcr_len;
+}
+
 static void zcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
                       uint64_t value)
 {
+    int cur_el = arm_current_el(env);
+    int old_len = sve_zcr_len_for_el(env, cur_el);
+    int new_len;
+
     /* Bits other than [3:0] are RAZ/WI. */
     raw_write(env, ri, value & 0xf);
+
+    /*
+     * Because we arrived here, we know both FP and SVE are enabled;
+     * otherwise we would have trapped access to the ZCR_ELn register.
+     */
+    new_len = sve_zcr_len_for_el(env, cur_el);
+    if (new_len < old_len) {
+        aarch64_sve_narrow_vq(env, new_len + 1);
+    }
 }
 
 static const ARMCPRegInfo zcr_el1_reginfo = {
@@ -8100,8 +8133,11 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
     unsigned int new_el = env->exception.target_el;
     target_ulong addr = env->cp15.vbar_el[new_el];
     unsigned int new_mode = aarch64_pstate_mode(new_el, true);
+    unsigned int cur_el = arm_current_el(env);
 
-    if (arm_current_el(env) < new_el) {
+    aarch64_sve_change_el(env, cur_el, new_el);
+
+    if (cur_el < new_el) {
         /* Entry vector offset depends on whether the implemented EL
          * immediately lower than the target level is using AArch32 or AArch64
          */
@@ -12402,18 +12438,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
         if (sve_el != 0 && fp_el == 0) {
             zcr_len = 0;
         } else {
-            ARMCPU *cpu = arm_env_get_cpu(env);
-
-            zcr_len = cpu->sve_max_vq - 1;
-            if (current_el <= 1) {
-                zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[1]);
-            }
-            if (current_el < 2 && arm_feature(env, ARM_FEATURE_EL2)) {
-                zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[2]);
-            }
-            if (current_el < 3 && arm_feature(env, ARM_FEATURE_EL3)) {
-                zcr_len = MIN(zcr_len, 0xf & (uint32_t)env->vfp.zcr_el[3]);
-            }
+            zcr_len = sve_zcr_len_for_el(env, current_el);
         }
         flags |= zcr_len << ARM_TBFLAG_ZCR_LEN_SHIFT;
     } else {
@@ -12467,3 +12492,79 @@ void cpu_get_tb_cpu_state(CPUARMState *env, target_ulong *pc,
     *pflags = flags;
     *cs_base = 0;
 }
+
+#ifdef TARGET_AARCH64
+/*
+ * The manual says that when SVE is enabled and VQ is widened the
+ * implementation is allowed to zero the previously inaccessible
+ * portion of the registers. The corollary to that is that when
+ * SVE is enabled and VQ is narrowed we are also allowed to zero
+ * the now inaccessible portion of the registers.
+ *
+ * The intent of this is that no predicate bit beyond VQ is ever set.
+ * Which means that some operations on predicate registers themselves
+ * may operate on full uint64_t or even unrolled across the maximum
+ * uint64_t[4]. Performing 4 bits of host arithmetic unconditionally
+ * may well be cheaper than conditionals to restrict the operation
+ * to the relevant portion of a uint16_t[16].
+ */
+void aarch64_sve_narrow_vq(CPUARMState *env, unsigned vq)
+{
+    int i, j;
+    uint64_t pmask;
+
+    assert(vq >= 1 && vq <= ARM_MAX_VQ);
+    assert(vq <= arm_env_get_cpu(env)->sve_max_vq);
+
+    /* Zap the high bits of the zregs. */
+    for (i = 0; i < 32; i++) {
+        memset(&env->vfp.zregs[i].d[2 * vq], 0, 16 * (ARM_MAX_VQ - vq));
+    }
+
+    /* Zap the high bits of the pregs and ffr. */
+    pmask = 0;
+    if (vq & 3) {
+        pmask = ~(-1ULL << (16 * (vq & 3)));
+    }
+    for (j = vq / 4; j < ARM_MAX_VQ / 4; j++) {
+        for (i = 0; i < 17; ++i) {
+            env->vfp.pregs[i].p[j] &= pmask;
+        }
+        pmask = 0;
+    }
+}
+
+/*
+ * Notice a change in SVE vector size when changing EL.
+ */
+void aarch64_sve_change_el(CPUARMState *env, int old_el, int new_el)
+{
+    int old_len, new_len;
+
+    /* Nothing to do if no SVE. */
+    if (!arm_feature(env, ARM_FEATURE_SVE)) {
+        return;
+    }
+
+    /* Nothing to do if FP is disabled in either EL. */
+    if (fp_exception_el(env, old_el) || fp_exception_el(env, new_el)) {
+        return;
+    }
+
+    /*
+     * When FP is enabled, but SVE is disabled, the effective len is 0.
+     * ??? How should sve_exception_el interact with AArch32 state?
+     * That isn't included in the CheckSVEEnabled pseudocode, so is the
+     * host kernel required to explicitly disable SVE for an EL using aa32?
+     */
+    old_len = (sve_exception_el(env, old_el)
+               ? 0 : sve_zcr_len_for_el(env, old_el));
+    new_len = (sve_exception_el(env, new_el)
+               ? 0 : sve_zcr_len_for_el(env, new_el));
+
+    /* When changing vector length, clear inaccessible state. */
+    if (new_len < old_len) {
+        aarch64_sve_narrow_vq(env, new_len + 1);
+    }
+}
+#endif
diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index f728f25e4b..b9f920b3c4 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -1068,6 +1068,7 @@ void HELPER(exception_return)(CPUARMState *env)
                       "AArch64 EL%d PC 0x%" PRIx64 "\n",
                       cur_el, new_el, env->pc);
     }
+    aarch64_sve_change_el(env, cur_el, new_el);
 
     qemu_mutex_lock_iothread();
     arm_call_el_change_hook(arm_env_get_cpu(env));
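
Reviewer aid, not part of the patch: the predicate masking in aarch64_sve_narrow_vq and the length selection in sve_zcr_len_for_el can be tried in isolation with a minimal standalone sketch like the one below. Everything prefixed sketch_ or SKETCH_ is hypothetical: SketchPReg stands in for one entry of env->vfp.pregs, the zcr[] array stands in for env->vfp.zcr_el[], and the has_el2/has_el3 flags stand in for the arm_feature() checks. SKETCH_MAX_VQ is assumed to be 16, matching the uint64_t[4] layout described in the comment above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 16 quadwords maximum, so 16 * 16 = 256 predicate bits. */
#define SKETCH_MAX_VQ 16

/* Hypothetical stand-in for one predicate register: 16 bits per quadword. */
typedef struct {
    uint64_t p[SKETCH_MAX_VQ / 4];
} SketchPReg;

/* Clear every predicate bit belonging to quadword index vq or above. */
void sketch_narrow_preg(SketchPReg *r, unsigned vq)
{
    uint64_t pmask = 0;
    unsigned j;

    assert(vq >= 1 && vq <= SKETCH_MAX_VQ);

    /* A partially accessible word keeps its low 16 * (vq % 4) bits. */
    if (vq & 3) {
        pmask = ~(~0ULL << (16 * (vq & 3)));
    }
    for (j = vq / 4; j < SKETCH_MAX_VQ / 4; j++) {
        r->p[j] &= pmask;
        pmask = 0; /* every later word is entirely inaccessible */
    }
}

/* Return the smaller of two unsigned values. */
unsigned sketch_min(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

/*
 * Effective length (VQ - 1) seen at exception level el: the minimum of
 * the implemented maximum and the low four bits of each controlling
 * ZCR value.  zcr[1..3] model ZCR_EL1..ZCR_EL3; zcr[0] is unused.
 */
unsigned sketch_zcr_len_for_el(unsigned max_vq, const uint64_t zcr[4],
                               int el, int has_el2, int has_el3)
{
    unsigned len = max_vq - 1;

    if (el <= 1) {
        len = sketch_min(len, (unsigned)(zcr[1] & 0xf));
    }
    if (el < 2 && has_el2) {
        len = sketch_min(len, (unsigned)(zcr[2] & 0xf));
    }
    if (el < 3 && has_el3) {
        len = sketch_min(len, (unsigned)(zcr[3] & 0xf));
    }
    return len;
}
```

With all bits set and vq = 6, word 0 is left untouched, word 1 keeps only its low 32 bits, and words 2 and 3 are cleared. The length helper returns VQ - 1, which is why the patch passes new_len + 1 to aarch64_sve_narrow_vq.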