
[v3,15/21] KVM: arm64: Support FEAT_FPMR for guests

Message ID 20231205-arm64-2023-dpisa-v3-15-dbcbcd867a7f@kernel.org
State New
Series arm64: Support for 2023 DPISA extensions

Commit Message

Mark Brown Dec. 5, 2023, 4:48 p.m. UTC
FEAT_FPMR introduces a new system register, FPMR, which allows configuration
of floating point behaviour, currently for FP8-specific features. Allow use
of this in guests, disabling the trap while guests are running and saving
and restoring the value along with the rest of the floating point state.
Since FPMR is stored immediately after the main floating point state, we
share it with the hypervisor by adjusting the size of the shared region.

Access to FPMR is covered both by a register-specific trap, HCRX_EL2.EnFPM,
and by the overall floating point access trap, so we just unconditionally
disable the FPMR-specific trap and rely on the floating point access trap
to detect guest floating point usage.

Signed-off-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_arm.h        |  2 +-
 arch/arm64/include/asm/kvm_host.h       |  4 +++-
 arch/arm64/kvm/emulate-nested.c         |  9 +++++++++
 arch/arm64/kvm/fpsimd.c                 | 20 +++++++++++++++++---
 arch/arm64/kvm/hyp/include/hyp/switch.h |  7 ++++++-
 arch/arm64/kvm/sys_regs.c               | 11 +++++++++++
 6 files changed, 47 insertions(+), 6 deletions(-)

Comments

Mark Brown Dec. 7, 2023, 12:30 p.m. UTC | #1
On Thu, Dec 07, 2023 at 08:39:46AM +0000, Marc Zyngier wrote:
> Mark Brown <broonie@kernel.org> wrote:

> >  #define HCRX_GUEST_FLAGS \
> > -	(HCRX_EL2_SMPME | HCRX_EL2_TCR2En | \
> > +	(HCRX_EL2_SMPME | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM | \

> We really should start making all of these things conditional. See
> below.

Is there an overarching theory behind how these things are intended to
work?  I agree that I'd have expected more conditionality here; I was
trying to fit in with the existing pattern.  It's hard to follow what
the intention is, I think to some extent because things have evolved
over time.
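
As a minimal sketch of the kind of conditional flag computation being
discussed, with invented placeholder bit positions rather than the real
HCRX_EL2 encodings:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions for illustration only; the real HCRX_EL2
 * field encodings live in the generated sysreg headers. */
#define HCRX_EL2_SMPME   (1ULL << 0)
#define HCRX_EL2_TCR2En  (1ULL << 1)
#define HCRX_EL2_MSCEn   (1ULL << 2)
#define HCRX_EL2_MCE2    (1ULL << 3)
#define HCRX_EL2_EnFPM   (1ULL << 4)

/* Compute the guest HCRX_EL2 value from the detected CPU features
 * instead of hard-coding every enable bit unconditionally. */
static uint64_t hcrx_guest_flags(bool has_mops, bool has_fpmr)
{
	uint64_t flags = HCRX_EL2_SMPME | HCRX_EL2_TCR2En;

	if (has_mops)
		flags |= HCRX_EL2_MSCEn | HCRX_EL2_MCE2;
	if (has_fpmr)
		flags |= HCRX_EL2_EnFPM;	/* FPMR accesses don't trap */

	return flags;
}
```

Each enable bit is only set when the corresponding feature is present,
which is the conditionality being asked for above.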

> > @@ -517,7 +519,6 @@ struct kvm_vcpu_arch {
> >  	enum fp_type fp_type;
> >  	unsigned int sve_max_vl;
> >  	u64 svcr;
> > -	u64 fpmr;

> Why do this change here? Why isn't it done like that in the first place?

It didn't seem right to add the register to struct vcpu_sysreg before it
was handled by KVM.  As mentioned in the cover letter, this normally
wouldn't come up: KVM doesn't usually rely on the host kernel for
managing register state, so we add KVM support first and then enable the
host kernel.  For FPSIMD, though, we're reusing fpsimd_save(), so the
host kernel support needs to be in place when we enable KVM.

> >  	CGT_MDCR_TDE,
> > @@ -279,6 +281,12 @@ static const struct trap_bits coarse_trap_bits[] = {
> >  		.mask		= HCR_TTLBOS,
> >  		.behaviour	= BEHAVE_FORWARD_ANY,
> >  	},
> > +	[CGT_HCRX_EnFPM] = {
> > +		.index		= HCRX_EL2,
> > +		.value		= HCRX_EL2_EnFPM,
> > +		.mask		= HCRX_EL2_EnFPM,
> > +		.behaviour	= BEHAVE_FORWARD_ANY,

> This looks wrong. HCRX_EL2.EnFPM is an enable bit.

Right, it's the wrong way round.
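
To make the inversion concrete, here is a standalone sketch of the
mask/value matching that a coarse_trap_bits entry describes (placeholder
bit position, not the real encoding): EnFPM is an enable bit, so the
access should be forwarded when it is *clear*, meaning the match value
under the EnFPM mask must be 0 rather than EnFPM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HCRX_EL2_EnFPM (1ULL << 4)	/* placeholder bit position */

/* A trap entry matches when the masked register equals .value */
static bool trap_matches(uint64_t reg, uint64_t mask, uint64_t value)
{
	return (reg & mask) == value;
}
```

With .value = HCRX_EL2_EnFPM, as posted, the trap would fire exactly
when the feature is enabled, which is backwards.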

> > +static void *fpsimd_share_end(struct user_fpsimd_state *fpsimd)
> > +{
> > +	void *share_end = fpsimd + 1;
> > +
> > +	if (cpus_have_final_cap(ARM64_HAS_FPMR))
> > +		share_end += sizeof(u64);
> > +
> > +	return share_end;
> > +}

> This is horrible. Why can't you just have a new structure wrapping
> both user_fpsimd_state and fpmr? This is going to break in subtle
> ways, just like the SVE/SME stuff.

I agree that it's not great.  The main issue is that fpsimd_state is
already embedded in uw for hardened usercopy and is very widely
referenced by exactly which struct it's in, so I was guessing at what
would draw the fewest objections.  The obvious alternative is to add
FPMR to uw and share the whole thing with the hypervisor; if people
don't mind another field in uw I could do that.
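
As a sketch of the wrapper-struct alternative, using simplified
stand-in types rather than the real kernel definitions, the shared
region would then be described by a single type instead of by pointer
arithmetic:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct user_fpsimd_state; the real one holds
 * 32 x 128-bit V registers plus FPSR/FPCR. */
struct user_fpsimd_state {
	uint64_t vregs[64];
	uint32_t fpsr;
	uint32_t fpcr;
};

/* Hypothetical wrapper: one type describes everything shared with the
 * hypervisor, so no open-coded "fpsimd + 1 plus sizeof(u64)" is needed. */
struct shared_fp_state {
	struct user_fpsimd_state fpsimd;
	uint64_t fpmr;
};

static size_t fp_share_size(int has_fpmr)
{
	return has_fpmr ? sizeof(struct shared_fp_state)
			: sizeof(struct user_fpsimd_state);
}
```

The size of the shared mapping falls out of sizeof() on the wrapper,
rather than out of an adjacency assumption about two unrelated fields.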

> >  	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
> > +	if (cpus_have_final_cap(ARM64_HAS_FPMR)) {
> > +		WARN_ON_ONCE(&current->thread.fpmr + 1 != fpsimd_share_end(fpsimd));

> How can this happen?

It shouldn't, but it'd be bad if it did, so I put a check in to make
sure we haven't messed up.
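
The runtime WARN_ON_ONCE() could in principle become a compile-time
check instead.  A sketch with a simplified, assumed layout (the real
thread_struct and uw are much larger):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout mirroring the adjacency the WARN_ON_ONCE() verifies:
 * fpmr directly follows uw.fpsimd_state in the thread struct. */
struct fpsimd_state_like {
	uint64_t vregs[64];
	uint32_t fpsr;
	uint32_t fpcr;
};

struct thread_struct_like {
	struct { struct fpsimd_state_like fpsimd_state; } uw;
	uint64_t fpmr;
};

/* Fails the build, rather than warning at runtime, if a field is ever
 * inserted between the shared FPSIMD state and fpmr. */
_Static_assert(offsetof(struct thread_struct_like, fpmr) ==
	       offsetof(struct thread_struct_like, uw.fpsimd_state) +
	       sizeof(struct fpsimd_state_like),
	       "fpmr must immediately follow the shared FPSIMD state");
```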

> > +		vcpu->arch.host_fpmr = kern_hyp_va(&current->thread.fpmr);
> > +	}

> We really need to stop piling the save/restore of stuff that isn't
> advertised to the guest.

I'm not clear what you're referring to here.  The feature is advertised
to the guest via the ID registers, and in the past you've pushed back on
making things optional where the state is just a single register like
this.  Do you mean that we should be making this conditional on the
guest ID registers?  If so, is there a plan for how that's supposed to
work, for example setting flags when kvm_vcpu_run_pid_change() happens?

Mark Brown Dec. 7, 2023, 3:47 p.m. UTC | #2
On Thu, Dec 07, 2023 at 02:06:34PM +0000, Marc Zyngier wrote:
> Mark Brown <broonie@kernel.org> wrote:

> > > > @@ -517,7 +519,6 @@ struct kvm_vcpu_arch {
> > > >  	enum fp_type fp_type;
> > > >  	unsigned int sve_max_vl;
> > > >  	u64 svcr;
> > > > -	u64 fpmr;

> > > Why do this change here? Why isn't it done like that in the first place?

> > It didn't seem right to add the register to struct vcpu_sysreg before it
> > was handled by KVM.  As mentioned in the cover letter, this normally
> > wouldn't come up: KVM doesn't usually rely on the host kernel for
> > managing register state, so we add KVM support first and then enable the
> > host kernel.  For FPSIMD, though, we're reusing fpsimd_save(), so the
> > host kernel support needs to be in place when we enable KVM.

> That doesn't explain why you can't be upfront with it and populate the
> FPMR entry. In either case, you are wasting a u64.

So you'd rather just have the register listed in there as part of the
host support rather than initially excluding it?  Note that the current
series takes the same approach as is currently used for SVCR, which is
in a similar situation.

> > I agree that it's not great.  The main issue is that fpsimd_state is
> > already embedded in uw for hardened usercopy and is very widely
> > referenced by exactly which struct it's in, so I was guessing at what
> > would draw the fewest objections.  The obvious alternative is to add
> > FPMR to uw and share the whole thing with the hypervisor; if people
> > don't mind another field in uw I could do that.

> Either that, or you create a KVM-specific structure that contains
> these fields. If that results in KVM changes, so be it. But I won't
> take this sort of pointer arithmetic that assumes some pre-defined
> layout.

Moving fpsimd_state would have a big textual impact on the host code,
with consequent risk of creating merge conflicts, so I'd rather avoid
that.

> > > We really need to stop piling the save/restore of stuff that isn't
> > > advertised to the guest.

> > I'm not clear what you're referring to here.  The feature is advertised
> > to the guest via the ID registers, and in the past you've pushed back on
> > making things optional where the state is just a single register like
> > this.  Do you mean that we should be making this conditional on the
> > guest ID registers?  If so, is there a plan for how that's supposed to
> > work, for example setting flags when kvm_vcpu_run_pid_change() happens?

> See the beginning of this email. It is high time that we stop enabling
> everything by default, because this totally breaks VM migration. We
> already have a huge backlog of these things, and I don't want to add
> more of it.

> Which means that at the very least, enabling *any* feature also comes
> with sanitising the state one way or another when this feature is
> disabled by userspace.

> How this is being done is still a work in progress: my current plan is
> based on a set of trap bits that are computed on a per-VM basis, and
> some side state that indicates whether the trap handling is for
> emulation or feature disabling purpose. This will probably reuse the
> NV infrastructure which has an exhaustive list of the sysregs that can
> be trapped from EL0/EL1.

> At the very least, userspace shouldn't be able to observe the state
> that a guest isn't supposed to generate, and we should be mindful of
> not creating covert channels.

OK, that does seem much more like what I'd have expected the enablement
of these extra features to look like (and may well simplify a bunch of
the existing trap management), though it is a big change in approach.
It does greatly expand the scope of what's needed to virtualise the
feature, and there are a bunch of other in-flight series that would be
impacted (e.g. POR and GCS), so I'm expecting a lot of overlap until
the general mechanism lands.
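
A rough sketch of the per-VM trap computation described above, with
invented names, a placeholder bit value, and a single feature for
illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HCRX_EL2_EnFPM (1ULL << 4)	/* placeholder bit position */

/* Side state: why a trapped access is forwarded to the handler */
enum trap_purpose {
	TRAP_NONE,		/* access doesn't trap */
	TRAP_FEAT_DISABLED,	/* feature hidden from the guest: UNDEF */
	TRAP_EMULATE,		/* feature present but emulated */
};

struct vm_trap_state {
	uint64_t hcrx;		/* per-VM HCRX_EL2 enable bits */
	enum trap_purpose fpmr;	/* purpose of the FPMR trap, if any */
};

/* Derive trap state from the ID registers userspace exposed to the VM */
static void compute_vm_traps(struct vm_trap_state *ts, bool guest_has_fpmr)
{
	if (guest_has_fpmr) {
		ts->hcrx = HCRX_EL2_EnFPM;	/* accesses don't trap */
		ts->fpmr = TRAP_NONE;
	} else {
		ts->hcrx = 0;			/* EnFPM clear: trap */
		ts->fpmr = TRAP_FEAT_DISABLED;	/* handler injects UNDEF */
	}
}
```

When the feature is disabled by userspace the guest can neither read
nor write FPMR, so no state leaks in either direction.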

Based on that, I think it might make sense to push the KVM guest
support for these features out and deal with it separately as part of
the new approach to trap/feature management, probably by pulling the
KVM bits to the end of each series so they're there for people to see.
Does that sound reasonable to you?

Patch

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 9f9239d86900..95f3b44e7c3a 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -103,7 +103,7 @@ 
 #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
 
 #define HCRX_GUEST_FLAGS \
-	(HCRX_EL2_SMPME | HCRX_EL2_TCR2En | \
+	(HCRX_EL2_SMPME | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM | \
 	 (cpus_have_final_cap(ARM64_HAS_MOPS) ? (HCRX_EL2_MSCEn | HCRX_EL2_MCE2) : 0))
 #define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f8d98985a39c..9885adff06fa 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -391,6 +391,8 @@  enum vcpu_sysreg {
 	CNTP_CVAL_EL0,
 	CNTP_CTL_EL0,
 
+	FPMR,
+
 	/* Memory Tagging Extension registers */
 	RGSR_EL1,	/* Random Allocation Tag Seed Register */
 	GCR_EL1,	/* Tag Control Register */
@@ -517,7 +519,6 @@  struct kvm_vcpu_arch {
 	enum fp_type fp_type;
 	unsigned int sve_max_vl;
 	u64 svcr;
-	u64 fpmr;
 
 	/* Stage 2 paging state used by the hardware on next switch */
 	struct kvm_s2_mmu *hw_mmu;
@@ -576,6 +577,7 @@  struct kvm_vcpu_arch {
 	struct kvm_guest_debug_arch external_debug_state;
 
 	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
+	u64 *host_fpmr;					/* hyp VA */
 	struct task_struct *parent_task;
 
 	struct {
diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
index 06185216a297..802e5cde696f 100644
--- a/arch/arm64/kvm/emulate-nested.c
+++ b/arch/arm64/kvm/emulate-nested.c
@@ -67,6 +67,8 @@  enum cgt_group_id {
 	CGT_HCR_TTLBIS,
 	CGT_HCR_TTLBOS,
 
+	CGT_HCRX_EnFPM,
+
 	CGT_MDCR_TPMCR,
 	CGT_MDCR_TPM,
 	CGT_MDCR_TDE,
@@ -279,6 +281,12 @@  static const struct trap_bits coarse_trap_bits[] = {
 		.mask		= HCR_TTLBOS,
 		.behaviour	= BEHAVE_FORWARD_ANY,
 	},
+	[CGT_HCRX_EnFPM] = {
+		.index		= HCRX_EL2,
+		.value		= HCRX_EL2_EnFPM,
+		.mask		= HCRX_EL2_EnFPM,
+		.behaviour	= BEHAVE_FORWARD_ANY,
+	},
 	[CGT_MDCR_TPMCR] = {
 		.index		= MDCR_EL2,
 		.value		= MDCR_EL2_TPMCR,
@@ -478,6 +486,7 @@  static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = {
 	SR_TRAP(SYS_AIDR_EL1,		CGT_HCR_TID1),
 	SR_TRAP(SYS_SMIDR_EL1,		CGT_HCR_TID1),
 	SR_TRAP(SYS_CTR_EL0,		CGT_HCR_TID2),
+	SR_TRAP(SYS_FPMR,		CGT_HCRX_EnFPM),
 	SR_TRAP(SYS_CCSIDR_EL1,		CGT_HCR_TID2_TID4),
 	SR_TRAP(SYS_CCSIDR2_EL1,	CGT_HCR_TID2_TID4),
 	SR_TRAP(SYS_CLIDR_EL1,		CGT_HCR_TID2_TID4),
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index e3e611e30e91..dee078625d0d 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -14,6 +14,16 @@ 
 #include <asm/kvm_mmu.h>
 #include <asm/sysreg.h>
 
+static void *fpsimd_share_end(struct user_fpsimd_state *fpsimd)
+{
+	void *share_end = fpsimd + 1;
+
+	if (cpus_have_final_cap(ARM64_HAS_FPMR))
+		share_end += sizeof(u64);
+
+	return share_end;
+}
+
 void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
 {
 	struct task_struct *p = vcpu->arch.parent_task;
@@ -23,7 +33,7 @@  void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
 		return;
 
 	fpsimd = &p->thread.uw.fpsimd_state;
-	kvm_unshare_hyp(fpsimd, fpsimd + 1);
+	kvm_unshare_hyp(fpsimd, fpsimd_share_end(fpsimd));
 	put_task_struct(p);
 }
 
@@ -45,11 +55,15 @@  int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 	kvm_vcpu_unshare_task_fp(vcpu);
 
 	/* Make sure the host task fpsimd state is visible to hyp: */
-	ret = kvm_share_hyp(fpsimd, fpsimd + 1);
+	ret = kvm_share_hyp(fpsimd, fpsimd_share_end(fpsimd));
 	if (ret)
 		return ret;
 
 	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
+	if (cpus_have_final_cap(ARM64_HAS_FPMR)) {
+		WARN_ON_ONCE(&current->thread.fpmr + 1 != fpsimd_share_end(fpsimd));
+		vcpu->arch.host_fpmr = kern_hyp_va(&current->thread.fpmr);
+	}
 
 	/*
 	 * We need to keep current's task_struct pinned until its data has been
@@ -153,7 +167,7 @@  void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 		fp_state.sve_vl = vcpu->arch.sve_max_vl;
 		fp_state.sme_state = NULL;
 		fp_state.svcr = &vcpu->arch.svcr;
-		fp_state.fpmr = &vcpu->arch.fpmr;
+		fp_state.fpmr = &__vcpu_sys_reg(vcpu, FPMR);
 		fp_state.fp_type = &vcpu->arch.fp_type;
 
 		if (vcpu_has_sve(vcpu))
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index f99d8af0b9af..a51b21a2e26f 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -339,10 +339,15 @@  static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
 	isb();
 
 	/* Write out the host state if it's in the registers */
-	if (vcpu->arch.fp_state == FP_STATE_HOST_OWNED)
+	if (vcpu->arch.fp_state == FP_STATE_HOST_OWNED) {
 		__fpsimd_save_state(vcpu->arch.host_fpsimd_state);
+		if (cpus_have_final_cap(ARM64_HAS_FPMR))
+			*vcpu->arch.host_fpmr = read_sysreg_s(SYS_FPMR);
+	}
 
 	/* Restore the guest state */
+	if (cpus_have_final_cap(ARM64_HAS_FPMR))
+		write_sysreg_s(__vcpu_sys_reg(vcpu, FPMR), SYS_FPMR);
 	if (sve_guest)
 		__hyp_sve_restore_guest(vcpu);
 	else
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index b843da5e4bb9..0789fb632623 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1914,6 +1914,15 @@  static unsigned int hidden_user_visibility(const struct kvm_vcpu *vcpu,
 	.visibility = hidden_user_visibility,	\
 }
 
+static unsigned int fpmr_visibility(const struct kvm_vcpu *vcpu,
+				    const struct sys_reg_desc *rd)
+{
+	if (cpus_have_final_cap(ARM64_HAS_FPMR))
+		return 0;
+
+	return REG_HIDDEN;
+}
+
 /*
  * Since reset() callback and field val are not used for idregs, they will be
  * used for specific purposes for idregs.
@@ -2310,6 +2319,8 @@  static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 },
 	{ SYS_DESC(SYS_CTR_EL0), access_ctr },
 	{ SYS_DESC(SYS_SVCR), undef_access },
+	{ SYS_DESC(SYS_FPMR), access_rw, reset_unknown, FPMR,
+	  .visibility = fpmr_visibility },
 
 	{ PMU_SYS_REG(PMCR_EL0), .access = access_pmcr, .reset = reset_pmcr,
 	  .reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },