Message ID | 20201012153746.9996-10-peter.maydell@linaro.org |
---|---|
State | Accepted |
Headers | show |
Series | target/arm: Various v8.1M minor features | expand |
On 10/12/20 8:37 AM, Peter Maydell wrote: > @@ -198,8 +200,14 @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val) > /* > * M profile FPSCR is RES0 for the QC, STRIDE, FZ16, LEN bits > * and also for the trapped-exception-handling bits IxE. > + * From v8.1M with the low-overhead-loop extension bits > + * [18:16] are used for LTPSIZE and (since we don't implement > + * MVE) always read as 4 and ignore writes. > */ > val &= 0xf7c0009f; > + if (cpu_isar_feature(aa32_lob, cpu)) { > + val |= 4 << 16; > + } > } > > vfp_set_fpscr_to_host(env, val); > @@ -212,9 +220,18 @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val) > * (which are stored in fp_status), and the other RES0 bits > * in between, then we clear all of the low 16 bits. > */ > - env->vfp.xregs[ARM_VFP_FPSCR] = val & 0xf7c80000; > - env->vfp.vec_len = (val >> 16) & 7; > - env->vfp.vec_stride = (val >> 20) & 3; > + if (cpu_isar_feature(aa32_lob, cpu)) { > + /* > + * M-profile low-overhead-loop extension: [18:16] are LTPSIZE > + * and we keep them in vfp.xregs[]. > + */ > + env->vfp.xregs[ARM_VFP_FPSCR] = val & 0xf7cf0000; > + } else { > + /* Those bits might be the old-style short vector length/stride */ > + env->vfp.xregs[ARM_VFP_FPSCR] = val & 0xf7c80000; > + env->vfp.vec_len = (val >> 16) & 7; > + env->vfp.vec_stride = (val >> 20) & 3; > + } I think these two sets of masking are confusing. Perhaps usefully rearranged as if (!fp16) { val &= ~fz16; } vfp_set_fpscr_to_host(env, val); if (!m-profile) { vec_len = extract32(val, 16, 3); vec_stride = extract32(val, 20, 2); } val &= 0xf7c80000; if (lob) { val |= 4 << 16; } fpscr = val; Which then obviates the next patch. r~
On Tue, 13 Oct 2020 at 21:06, Richard Henderson <richard.henderson@linaro.org> wrote: > I think these two sets of masking are confusing. > Perhaps usefully rearranged as > > if (!fp16) { > val &= ~fz16; > } > vfp_set_fpscr_to_host(env, val); > > if (!m-profile) { > vec_len = extract32(val, 16, 3); > vec_stride = extract32(val, 20, 2); > } > val &= 0xf7c80000; > if (lob) { > val |= 4 << 16; > } > fpscr = val; Yeah, probably cleaner. The other thing I wondered about is whether we should be setting vec_len/vec_stride for an A-profile CPU which doesn't implement the short-vector extension (ie where MVFR0.FPShVec is zero). But that gets a bit awkward: v8A allows an implementation to make Stride and Len be RAZ, but v7A didn't permit that and so I think we would need to distinguish: * has short-vector support (eg Cortex-A9) * v8A, can implement FPSCR.{Stride,Len} as RAZ/WI * no short-vector support, Stride/Len can be written but the only effect is that some insns must UNDEF (eg Cortex-A7) I think at the moment we currently provide short-vector support for everything, which is wrong but wrong in the direction that means more guest code runs... thanks -- PMM
On 10/13/20 1:38 PM, Peter Maydell wrote: > * has short-vector support (eg Cortex-A9) > * v8A, can implement FPSCR.{Stride,Len} as RAZ/WI > * no short-vector support, Stride/Len can be written > but the only effect is that some insns must UNDEF > (eg Cortex-A7) Yep. The other thing I wondered is if it was worthwhile to go ahead and split out ltpsize now, even with MTE not implemented. Eventually the conditions here would look like if (m-profile) { if (mte) { ltpsize = [18:16]; } } else { if (!v8) { vec_len = [18:16]; vec_stride = [22:20]; } } but for now you could leave out the assignment to ltpsize and just leave it initialized to 4 since reset. r~
diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 186ee621a65..baae826f94f 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -255,6 +255,11 @@ static void arm_cpu_reset(DeviceState *dev) uint8_t *rom; uint32_t vecbase; + if (cpu_isar_feature(aa32_lob, cpu)) { + /* LTPSIZE is constant 4 if MVE not implemented */ + env->vfp.xregs[ARM_VFP_FPSCR] |= 4 << 16; + } + if (arm_feature(env, ARM_FEATURE_M_SECURITY)) { env->v7m.secure = true; } else { diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index 5666393ef79..350150adbf1 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -189,8 +189,10 @@ uint32_t vfp_get_fpscr(CPUARMState *env) void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val) { + ARMCPU *cpu = env_archcpu(env); + /* When ARMv8.2-FP16 is not supported, FZ16 is RES0. */ - if (!cpu_isar_feature(any_fp16, env_archcpu(env))) { + if (!cpu_isar_feature(any_fp16, cpu)) { val &= ~FPCR_FZ16; } @@ -198,8 +200,14 @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val) /* * M profile FPSCR is RES0 for the QC, STRIDE, FZ16, LEN bits * and also for the trapped-exception-handling bits IxE. + * From v8.1M with the low-overhead-loop extension bits + * [18:16] are used for LTPSIZE and (since we don't implement + * MVE) always read as 4 and ignore writes. */ val &= 0xf7c0009f; + if (cpu_isar_feature(aa32_lob, cpu)) { + val |= 4 << 16; + } } vfp_set_fpscr_to_host(env, val); @@ -212,9 +220,18 @@ void HELPER(vfp_set_fpscr)(CPUARMState *env, uint32_t val) * (which are stored in fp_status), and the other RES0 bits * in between, then we clear all of the low 16 bits. */ - env->vfp.xregs[ARM_VFP_FPSCR] = val & 0xf7c80000; - env->vfp.vec_len = (val >> 16) & 7; - env->vfp.vec_stride = (val >> 20) & 3; + if (cpu_isar_feature(aa32_lob, cpu)) { + /* + * M-profile low-overhead-loop extension: [18:16] are LTPSIZE + * and we keep them in vfp.xregs[]. + */ + env->vfp.xregs[ARM_VFP_FPSCR] = val & 0xf7cf0000; + } else { + /* Those bits might be the old-style short vector length/stride */ + env->vfp.xregs[ARM_VFP_FPSCR] = val & 0xf7c80000; + env->vfp.vec_len = (val >> 16) & 7; + env->vfp.vec_stride = (val >> 20) & 3; + } /* * The bit we set within fpscr_q is arbitrary; the register as a
If the M-profile low-overhead-branch extension is implemented, FPSCR bits [18:16] are a new field LTPSIZE. If MVE is not implemented (currently always true for us) then this field always reads as 4 and ignores writes. These bits used to be the vector-length field for the old short-vector extension, so we need to take care that they are not misinterpreted as setting vec_len. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- target/arm/cpu.c | 5 +++++ target/arm/vfp_helper.c | 25 +++++++++++++++++++++---- 2 files changed, 26 insertions(+), 4 deletions(-)