Message ID: 20211222124052.644626-23-jing2.liu@intel.com
State: New
Series: AMX Support in KVM
On Wed, Dec 22, 2021, Jing Liu wrote:
> @@ -1968,6 +1969,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         case MSR_IA32_XFD:
>                 ret = kvm_set_msr_common(vcpu, msr_info);
>                 if (!ret && data) {
> +                       vmx_disable_intercept_for_msr(vcpu, MSR_IA32_XFD, MSR_TYPE_RW);
> +                       vcpu->arch.xfd_out_of_sync = true;

xfd_out_of_sync is a poor name, as XFD _may_ be out of sync, or it may not. It's
also confusing that it's kept set after XFD is explicitly synchronized in
vcpu_enter_guest().

> +
>                         vcpu->arch.trap_nm = true;
>                         vmx_update_exception_bitmap(vcpu);

Ah, this is why #NM interception was made sticky many patches ago. More at the end.

>                 }
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index bf9d3051cd6c..0a00242a91e7 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -340,7 +340,7 @@ struct vcpu_vmx {
>         struct lbr_desc lbr_desc;
>
>         /* Save desired MSR intercept (read: pass-through) state */
> -#define MAX_POSSIBLE_PASSTHROUGH_MSRS  14
> +#define MAX_POSSIBLE_PASSTHROUGH_MSRS  15
>         struct {
>                 DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS);
>                 DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 3b756ff13103..10a08aa2aa45 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10024,6 +10024,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>         if (vcpu->arch.guest_fpu.xfd_err)
>                 wrmsrl(MSR_IA32_XFD_ERR, 0);
>
> +       if (vcpu->arch.xfd_out_of_sync)

Rather than adding a flag that tracks whether or not the MSR can be written by
the guest, can't this be:

        if (!vmx_test_msr_bitmap_write(vcpu->loaded_vmcs->msr_bitmap))
                fpu_sync_guest_vmexit_xfd_state();

That might be marginally slower than checking a dedicated flag?  But it has the
advantage of doing the correct thing for nested guests instead of penalizing them
with an unnecessary sync on every exit.  If performance of the check is an issue,
we could add a static key to skip the code unless at least one vCPU has triggered
the XFD crud, a la kvm_has_noapic_vcpu (which may or may not provide any real
performance benefits).

Speaking of nested, interception of #NM in vmx_update_exception_bitmap() is wrong
with respect to nested guests.  Until XFD is supported for L2, which I didn't see
in this series, #NM should not be intercepted while L2 is running.

For the earlier patch that introduced arch.trap_nm, if it's not too gross and not
racy, the code could be:

        if (is_guest_mode(vcpu))
                eb |= get_vmcs12(vcpu)->exception_bitmap;
        else {
                ...

                if (vcpu->arch.guest_fpu.fpstate.xfd)
                        eb |= (1u << NM_VECTOR);
        }

Though I'm ok with a semi-temporary flag if that's gross/racy.

Then this patch can change it to:

        if (is_guest_mode(vcpu))
                eb |= get_vmcs12(vcpu)->exception_bitmap;
        else {
                ...

                if (!vmx_test_msr_bitmap_write(vcpu->vmcs01.msr_bitmap))
                        eb |= (1u << NM_VECTOR);
        }

> +               fpu_sync_guest_vmexit_xfd_state();
> +
>         /*
>          * Consume any pending interrupts, including the possible source of
>          * VM-Exit on SVM and any ticks that occur between VM-Exit and now.
> --
> 2.27.0
>
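As a point of reference, the kvm_has_noapic_vcpu-style static key Sean mentions
could be wired up roughly as below. This is only a sketch: the key name and the
exact placement are illustrative assumptions rather than code from the series,
the bitmap test is written as in the review above (note it is VMX-specific,
which becomes relevant later in the thread), and the matching static_branch_dec()
on vCPU teardown is omitted.

        /*
         * Sketch of the static-key idea, modeled on kvm_has_noapic_vcpu.  The
         * key name (kvm_xfd_enabled_vcpus) is invented for illustration; the
         * branch stays a NOP until at least one vCPU has ever enabled XFD, so
         * guests that never touch AMX pay nothing in the common exit path.
         */
        DEFINE_STATIC_KEY_FALSE(kvm_xfd_enabled_vcpus);

        /* vmx_set_msr(): first time a vCPU writes a non-zero MSR_IA32_XFD. */
        static_branch_inc(&kvm_xfd_enabled_vcpus);

        /* vcpu_enter_guest(), after MSR_IA32_XFD_ERR has been cleared: */
        if (static_branch_unlikely(&kvm_xfd_enabled_vcpus) &&
            !vmx_test_msr_bitmap_write(vcpu->loaded_vmcs->msr_bitmap))
                fpu_sync_guest_vmexit_xfd_state();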
> From: Sean Christopherson <seanjc@google.com>
> Sent: Wednesday, December 29, 2021 9:05 AM
>
> On Wed, Dec 22, 2021, Jing Liu wrote:
> > @@ -1968,6 +1969,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >         case MSR_IA32_XFD:
> >                 ret = kvm_set_msr_common(vcpu, msr_info);
> >                 if (!ret && data) {
> > +                       vmx_disable_intercept_for_msr(vcpu, MSR_IA32_XFD, MSR_TYPE_RW);
> > +                       vcpu->arch.xfd_out_of_sync = true;
>
> xfd_out_of_sync is a poor name, as XFD _may_ be out of sync, or it may not. It's
> also confusing that it's kept set after XFD is explicitly synchronized in
> vcpu_enter_guest().

yes, sync_xfd_after_exit might be more accurate.

> > +
> >                         vcpu->arch.trap_nm = true;
> >                         vmx_update_exception_bitmap(vcpu);
>
> Ah, this is why #NM interception was made sticky many patches ago. More
> at the end.
>
> >                 }
> > diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> > index bf9d3051cd6c..0a00242a91e7 100644
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -340,7 +340,7 @@ struct vcpu_vmx {
> >         struct lbr_desc lbr_desc;
> >
> >         /* Save desired MSR intercept (read: pass-through) state */
> > -#define MAX_POSSIBLE_PASSTHROUGH_MSRS  14
> > +#define MAX_POSSIBLE_PASSTHROUGH_MSRS  15
> >         struct {
> >                 DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS);
> >                 DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS);
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 3b756ff13103..10a08aa2aa45 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -10024,6 +10024,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >         if (vcpu->arch.guest_fpu.xfd_err)
> >                 wrmsrl(MSR_IA32_XFD_ERR, 0);
> >
> > +       if (vcpu->arch.xfd_out_of_sync)
>
> Rather than adding a flag that tracks whether or not the MSR can be written by
> the guest, can't this be:
>
>         if (!vmx_test_msr_bitmap_write(vcpu->loaded_vmcs->msr_bitmap))
>                 fpu_sync_guest_vmexit_xfd_state();
>

We can use this.

> That might be marginally slower than checking a dedicated flag?  But it has the
> advantage of doing the correct thing for nested guests instead of penalizing them
> with an unnecessary sync on every exit.  If performance of the check is an issue,
> we could add a static key to skip the code unless at least one vCPU has triggered
> the XFD crud, a la kvm_has_noapic_vcpu (which may or may not provide any real
> performance benefits).
>
> Speaking of nested, interception of #NM in vmx_update_exception_bitmap() is wrong
> with respect to nested guests.  Until XFD is supported for L2, which I didn't see
> in this series, #NM should not be intercepted while L2 is running.

Can you remind what additional thing is required to support XFD for L2?

If only about performance I prefer to the current conservative approach
as the first step. As explained earlier, #NM should be rare if the guest
doesn't run AMX applications at all. Adding nested into this picture doesn't
make things a lot worse.

> For the earlier patch that introduced arch.trap_nm, if it's not too gross and not
> racy, the code could be:
>
>         if (is_guest_mode(vcpu))
>                 eb |= get_vmcs12(vcpu)->exception_bitmap;
>         else {
>                 ...
>
>                 if (vcpu->arch.guest_fpu.fpstate.xfd)
>                         eb |= (1u << NM_VECTOR);
>         }
>
> Though I'm ok with a semi-temporary flag if that's gross/racy.
>
> Then this patch can change it to:
>
>         if (is_guest_mode(vcpu))
>                 eb |= get_vmcs12(vcpu)->exception_bitmap;
>         else {
>                 ...
>
>                 if (!vmx_test_msr_bitmap_write(vcpu->vmcs01.msr_bitmap))
>                         eb |= (1u << NM_VECTOR);
>         }
>
> > +               fpu_sync_guest_vmexit_xfd_state();
> > +
> >         /*
> >          * Consume any pending interrupts, including the possible source of
> >          * VM-Exit on SVM and any ticks that occur between VM-Exit and now.
> > --
> > 2.27.0
> >

Thanks
Kevin
> From: Tian, Kevin
> Sent: Wednesday, December 29, 2021 11:35 AM
>
> > Speaking of nested, interception of #NM in vmx_update_exception_bitmap()
> > is wrong with respect to nested guests. Until XFD is supported for L2,
> > which I didn't see in this series, #NM should not be intercepted while
> > L2 is running.
>
> Can you remind what additional thing is required to support XFD for L2?

ok, at least we require additional work on when to disable write interception.
It can be done only when both L0 and L1 make the same decision, just like
what we did for many other VMCS settings...

> If only about performance I prefer to the current conservative approach
> as the first step. As explained earlier, #NM should be rare if the guest
> doesn't run AMX applications at all. Adding nested into this picture doesn't
> make things a lot worse.

All your comments incorporated except this one. As said, always trapping
#NM for L2 is not a big problem, as long as it's rare and doesn't break function.
Usually a relatively static scheme is safer than a dynamic one in case of
anything overlooked in the initial implementation. 😊

Thanks
Kevin
> From: Sean Christopherson <seanjc@google.com>
> Sent: Wednesday, December 29, 2021 9:05 AM
>
> > +       if (vcpu->arch.xfd_out_of_sync)
>
> Rather than adding a flag that tracks whether or not the MSR can be written by
> the guest, can't this be:
>
>         if (!vmx_test_msr_bitmap_write(vcpu->loaded_vmcs->msr_bitmap))
>                 fpu_sync_guest_vmexit_xfd_state();
>

And forgot to mention a change different from the above: the check sits in the
x86 common exit path, but the above is a VMX-specific check. I'm not sure
whether it's worth introducing another kvm_x86_ops callback just for this minor
usage. Introducing an extra flag (e.g. vcpu->arch.xfd_disable_interception)
sounds simpler here.

Thanks
Kevin
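For comparison, the alternative Kevin is weighing here (a dedicated vendor
callback instead of an extra flag) might look roughly like the following. This
is a hypothetical sketch only: the hook name ->xfd_needs_sync, its static call,
and the wiring are invented for illustration and are not part of the posted
series; the bitmap helper is assumed to be the vmx_test_msr_bitmap_write()
referenced in Sean's review.

        /*
         * Hypothetical sketch: let x86-common code ask the vendor module
         * whether the guest may have written MSR_IA32_XFD since the last
         * sync.  The hook and its name are invented for illustration; the
         * series instead uses a plain flag to avoid adding a kvm_x86_ops
         * member for this one user.
         */

        /* arch/x86/kvm/vmx/vmx.c: implementation of the hypothetical hook */
        static bool vmx_xfd_needs_sync(struct kvm_vcpu *vcpu)
        {
                /* Write interception is cleared only once the guest enables XFD. */
                return !vmx_test_msr_bitmap_write(to_vmx(vcpu)->loaded_vmcs->msr_bitmap,
                                                  MSR_IA32_XFD);
        }

        /* arch/x86/kvm/x86.c, vcpu_enter_guest(), after clearing MSR_IA32_XFD_ERR: */
        if (static_call(kvm_x86_xfd_needs_sync)(vcpu))
                fpu_sync_guest_vmexit_xfd_state();

The flag-based approach avoids the extra indirection and keeps the exit path
vendor-neutral, which is why the thread settles on it.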
On Wed, Dec 29, 2021, Tian, Kevin wrote:
> > From: Tian, Kevin
> > Sent: Wednesday, December 29, 2021 11:35 AM
> >
> > > Speaking of nested, interception of #NM in vmx_update_exception_bitmap()
> > > is wrong with respect to nested guests. Until XFD is supported for L2,
> > > which I didn't see in this series, #NM should not be intercepted while
> > > L2 is running.
> >
> > Can you remind what additional thing is required to support XFD for L2?
>
> ok, at least we require additional work on when to disable write interception.
> It can be done only when both L0 and L1 make the same decision, just like
> what we did for many other VMCS settings...

I'm not terribly concerned with exposing XFD to L2 right now, I'm much more
concerned with not breaking nVMX when XFD is exposed to L1.

> > If only about performance I prefer to the current conservative approach
> > as the first step. As explained earlier, #NM should be rare if the guest
> > doesn't run AMX applications at all. Adding nested into this picture doesn't
> > make things a lot worse.
>
> All your comments incorporated except this one. As said, always trapping
> #NM for L2 is not a big problem, as long as it's rare and doesn't break function.
> Usually a relatively static scheme is safer than a dynamic one in case of
> anything overlooked in the initial implementation. 😊

I strongly disagree, it's not automagically safer. Either way, we need to validate
that KVM does the correct thing with respect to vmcs12. E.g. does KVM correctly
reflect the #NM back to L2 when it's not intercepted by L1? Does it synthesize a
nested VM-Exit when it is intercepted by L1? The testing required doesn't magically
go away.

As posted, there is zero chance that the patches correctly handle #NM in L2,
because nested_vmx_l0_wants_exit() doesn't prevent an #NM from being forwarded
to L1. E.g. if L0 (KVM) and L1 both intercept #NM, handle_exception_nm() will
queue a #NM for injection and then synthesize a nested VM-Exit, i.e. the #NM from
L2 will get injected into L1 after the nested exit.

That also means handle_exception_nmi_irqoff() => handle_exception_nm() is fatally
broken on non-XFD hardware, as it will attempt RDMSR(MSR_IA32_XFD_ERR) if L1
intercepts #NM, since handle_exception_nmi_irqoff() will run before
__vmx_handle_exit() => nested_vmx_reflect_vmexit() checks whether L0 or L1 should
handle the exit.
> From: Sean Christopherson <seanjc@google.com>
> Sent: Thursday, December 30, 2021 1:26 AM
>
> On Wed, Dec 29, 2021, Tian, Kevin wrote:
> > > From: Tian, Kevin
> > > Sent: Wednesday, December 29, 2021 11:35 AM
> > > >
> > > > Speaking of nested, interception of #NM in vmx_update_exception_bitmap()
> > > > is wrong with respect to nested guests. Until XFD is supported for L2,
> > > > which I didn't see in this series, #NM should not be intercepted while
> > > > L2 is running.
> > >
> > > Can you remind what additional thing is required to support XFD for L2?
> >
> > ok, at least we require additional work on when to disable write interception.
> > It can be done only when both L0 and L1 make the same decision, just like
> > what we did for many other VMCS settings...
>
> I'm not terribly concerned with exposing XFD to L2 right now, I'm much more
> concerned with not breaking nVMX when XFD is exposed to L1.
>
> > > If only about performance I prefer to the current conservative approach
> > > as the first step. As explained earlier, #NM should be rare if the guest
> > > doesn't run AMX applications at all. Adding nested into this picture doesn't
> > > make things a lot worse.
> >
> > All your comments incorporated except this one. As said, always trapping
> > #NM for L2 is not a big problem, as long as it's rare and doesn't break function.
> > Usually a relatively static scheme is safer than a dynamic one in case of
> > anything overlooked in the initial implementation. 😊
>
> I strongly disagree, it's not automagically safer. Either way, we need to validate
> that KVM does the correct thing with respect to vmcs12. E.g. does KVM correctly
> reflect the #NM back to L2 when it's not intercepted by L1? Does it synthesize a
> nested VM-Exit when it is intercepted by L1? The testing required doesn't magically
> go away.
>
> As posted, there is zero chance that the patches correctly handle #NM in L2,
> because nested_vmx_l0_wants_exit() doesn't prevent an #NM from being forwarded
> to L1. E.g. if L0 (KVM) and L1 both intercept #NM, handle_exception_nm() will
> queue a #NM for injection and then synthesize a nested VM-Exit, i.e. the #NM from
> L2 will get injected into L1 after the nested exit.

Yes, it's incorrect behavior.

> That also means handle_exception_nmi_irqoff() => handle_exception_nm() is fatally
> broken on non-XFD hardware, as it will attempt RDMSR(MSR_IA32_XFD_ERR) if L1
> intercepts #NM, since handle_exception_nmi_irqoff() will run before
> __vmx_handle_exit() => nested_vmx_reflect_vmexit() checks whether L0 or L1 should
> handle the exit.

Ditto. Thanks for pointing out those facts that we obviously overlooked.

So handle_exception_nm() should neither blindly read xfd_err (the #NM might be
caused by L1 interception on a non-XFD platform) nor blindly queue a #NM
exception (triggered in L2) to L1 which intercepts #NM (and thus expects a vm-exit).

The former suggests that reading xfd_err should be made conditional, similar
to what we did in vmx_update_exception_bitmap(). The latter suggests the actual
exception is better postponed until __vmx_handle_exit().

We are looking at this change now.

And once the #NM handler works correctly for interception by either L0 or L1,
I'm not sure whether we still want to special-case L1 vs. L2 in
vmx_update_exception_bitmap(), since in the end L0 wants to intercept #NM to
save xfd_err for both L1 and L2 (if exposed with the xfd capability by L1).

Thanks
Kevin
> From: Tian, Kevin
> Sent: Thursday, December 30, 2021 9:28 AM
>
> > As posted, there is zero chance that the patches correctly handle #NM in L2,
> > because nested_vmx_l0_wants_exit() doesn't prevent an #NM from being forwarded
> > to L1. E.g. if L0 (KVM) and L1 both intercept #NM, handle_exception_nm() will
> > queue a #NM for injection and then synthesize a nested VM-Exit, i.e. the #NM from
> > L2 will get injected into L1 after the nested exit.
>
> Yes, it's incorrect behavior.
>
> > That also means handle_exception_nmi_irqoff() => handle_exception_nm() is fatally
> > broken on non-XFD hardware, as it will attempt RDMSR(MSR_IA32_XFD_ERR) if L1
> > intercepts #NM, since handle_exception_nmi_irqoff() will run before
> > __vmx_handle_exit() => nested_vmx_reflect_vmexit() checks whether L0 or L1 should
> > handle the exit.
>
> Ditto. Thanks for pointing out those facts that we obviously overlooked.
>
> So handle_exception_nm() should neither blindly read xfd_err (the #NM might be
> caused by L1 interception on a non-XFD platform) nor blindly queue a #NM
> exception (triggered in L2) to L1 which intercepts #NM (and thus expects a vm-exit).
>
> The former suggests that reading xfd_err should be made conditional, similar
> to what we did in vmx_update_exception_bitmap(). The latter suggests the actual
> exception is better postponed until __vmx_handle_exit().
>
> We are looking at this change now.
>
> And once the #NM handler works correctly for interception by either L0 or L1,
> I'm not sure whether we still want to special-case L1 vs. L2 in
> vmx_update_exception_bitmap(), since in the end L0 wants to intercept #NM to
> save xfd_err for both L1 and L2 (if exposed with the xfd capability by L1).

The new change is like below.

        static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu)
        {
                /*
                 * Save xfd_err to guest_fpu before interrupts are enabled, so
                 * the guest value is not clobbered by host activity before the
                 * guest has a chance to consume it.
                 *
                 * Since trapping #NM is started when xfd write interception is
                 * disabled, use this flag to guard the saving operation. This
                 * implies a no-op for a non-xfd #NM due to L1 interception.
                 *
                 * Queuing the exception is done in vmx_handle_exit.
                 */
                if (vcpu->arch.xfd_no_write_intercept)
                        rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err);
        }

In the final series it will first check vcpu->arch.guest_fpu.fpstate->xfd
before the disable-interception patch is applied, and then it becomes the above
form, similar to your suggestion on vmx_update_exception_bitmap().

Whether to check the msr_bitmap vs. an extra flag is an orthogonal open question.

Then:

        handle_exception_nmi(struct kvm_vcpu *vcpu)
        {
                ...
                if (is_machine_check(intr_info) || is_nmi(intr_info))
                        return 1; /* handled by handle_exception_nmi_irqoff() */

                /*
                 * Queue the exception here instead of in handle_nm_fault_irqoff().
                 * This ensures the nested_vmx check is not skipped so the vmexit
                 * can be reflected to L1 (when it intercepts #NM) before reaching
                 * this point.
                 */
                if (is_nm_fault(intr_info)) {
                        kvm_queue_exception(vcpu, NM_VECTOR);
                        return 1;
                }
                ...
        }

Then, regarding testing the non-AMX nested #NM usage, it might be difficult to
trigger it from a modern OS. As commented by the Linux #NM handler, it's
expected only for XFD or for math emulation when the FPU is missing. So we plan
to run a selftest in L1 which sets CR0.TS and then touches FPU registers, and
for the L1 kernel we will run two binaries, one trapping #NM and the other not.

Please let us know if you have a better suggestion for the testing here.

Thanks
Kevin
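The is_nm_fault() helper used in the snippet above is not shown in the thread;
assuming it follows the pattern of the existing exception predicates in
arch/x86/kvm/vmx/vmcs.h, it would be roughly:

        /*
         * Sketch of the helper assumed above, following the pattern of the
         * existing is_machine_check()/is_nmi() predicates in vmcs.h.
         */
        static inline bool is_nm_fault(u32 intr_info)
        {
                return is_exception_n(intr_info, NM_VECTOR);
        }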
> From: Tian, Kevin
> Sent: Thursday, December 30, 2021 3:05 PM
>
> The new change is like below.
>
>         static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu)
>         {
>                 /*
>                  * Save xfd_err to guest_fpu before interrupts are enabled, so
>                  * the guest value is not clobbered by host activity before the
>                  * guest has a chance to consume it.
>                  *
>                  * Since trapping #NM is started when xfd write interception is
>                  * disabled, use this flag to guard the saving operation. This
>                  * implies a no-op for a non-xfd #NM due to L1 interception.
>                  *
>                  * Queuing the exception is done in vmx_handle_exit.
>                  */
>                 if (vcpu->arch.xfd_no_write_intercept)
>                         rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err);
>         }
>
> In the final series it will first check vcpu->arch.guest_fpu.fpstate->xfd
> before the disable-interception patch is applied, and then it becomes the above
> form, similar to your suggestion on vmx_update_exception_bitmap().
>
> Whether to check the msr_bitmap vs. an extra flag is an orthogonal open question.
>
> Then:
>
>         handle_exception_nmi(struct kvm_vcpu *vcpu)
>         {
>                 ...
>                 if (is_machine_check(intr_info) || is_nmi(intr_info))
>                         return 1; /* handled by handle_exception_nmi_irqoff() */
>
>                 /*
>                  * Queue the exception here instead of in handle_nm_fault_irqoff().
>                  * This ensures the nested_vmx check is not skipped so the vmexit
>                  * can be reflected to L1 (when it intercepts #NM) before reaching
>                  * this point.
>                  */
>                 if (is_nm_fault(intr_info)) {
>                         kvm_queue_exception(vcpu, NM_VECTOR);
>                         return 1;
>                 }
>                 ...
>         }
>
> Then, regarding testing the non-AMX nested #NM usage, it might be difficult to
> trigger it from a modern OS. As commented by the Linux #NM handler, it's
> expected only for XFD or for math emulation when the FPU is missing. So we plan
> to run a selftest in L1 which sets CR0.TS and then touches FPU registers, and
> for the L1 kernel we will run two binaries, one trapping #NM and the other not.
>

We have verified this scenario and didn't find a problem. Basically the
selftest is like below:

        guest_code()
        {
                cr0 = read_cr0();
                cr0 |= X86_CR0_TS;
                write_cr0(cr0);
                asm volatile("fnop");
        }

        guest_nm_handler()
        {
                cr0 = read_cr0();
                cr0 &= ~X86_CR0_TS;
                write_cr0(cr0);
        }

We run the selftest in L1 to create a nested scenario.

When L1 intercepts #NM:

        (L2) fnop
        (L0) #NM vmexit
        (L0) reflect a virtual vmexit (reason #NM) to L1
        (L1) #NM vmexit
        (L1) queue #NM exception to L2
        (L2) guest_nm_handler()
        (L2) fnop (succeeds)

When L1 doesn't intercept #NM:

        (L2) fnop
        (L0) #NM vmexit
        (L0) queue #NM exception to L2
        (L2) guest_nm_handler()
        (L2) fnop (succeeds)

Please suggest if any more test is necessary here.

Thanks
Kevin
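A fleshed-out version of that guest-side pseudocode might look roughly like the
following. This is only a sketch of the test described above: the CR0 accessors
are standard inline-asm implementations, the nm_count counter is added here for
illustration, and how the #NM handler is installed (IDT setup, assertions) is
assumed to be provided by the selftest harness rather than shown.

        /* Sketch of the guest side of the CR0.TS #NM selftest (assumptions noted above). */
        static inline unsigned long read_cr0(void)
        {
                unsigned long cr0;

                asm volatile("mov %%cr0, %0" : "=r"(cr0));
                return cr0;
        }

        static inline void write_cr0(unsigned long cr0)
        {
                asm volatile("mov %0, %%cr0" :: "r"(cr0) : "memory");
        }

        /* Bumped by the handler so the test can assert exactly one #NM occurred. */
        static volatile int nm_count;

        /* #NM handler: clear CR0.TS so the faulting FPU instruction can be retried. */
        static void guest_nm_handler(void)
        {
                nm_count++;
                write_cr0(read_cr0() & ~X86_CR0_TS);
        }

        static void guest_code(void)
        {
                /* Set CR0.TS so the next x87/FPU instruction raises #NM ... */
                write_cr0(read_cr0() | X86_CR0_TS);

                /* ... then touch the FPU; expect exactly one #NM, handled above. */
                asm volatile("fnop");

                /* Harness-specific assertion on nm_count == 1 would go here. */
        }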
On 12/29/21 02:04, Sean Christopherson wrote:
> Speaking of nested, interception of #NM in vmx_update_exception_bitmap() is wrong
> with respect to nested guests. Until XFD is supported for L2, which I didn't see
> in this series, #NM should not be intercepted while L2 is running.

Why wouldn't L2 support XFD, since there are no new VMCS bits? As long as
L0 knows what to do with XFD and XFD_ERR, it will do the right thing no
matter if L1 or L2 is running.

Paolo
On Tue, Jan 04, 2022, Paolo Bonzini wrote:
> On 12/29/21 02:04, Sean Christopherson wrote:
> > Speaking of nested, interception of #NM in vmx_update_exception_bitmap() is wrong
> > with respect to nested guests. Until XFD is supported for L2, which I didn't see
> > in this series, #NM should not be intercepted while L2 is running.
>
> Why wouldn't L2 support XFD, since there are no new VMCS bits? As long as
> L0 knows what to do with XFD and XFD_ERR, it will do the right thing no
> matter if L1 or L2 is running.

I incorrectly assumed there was something L0 needed to do in order to support
nested XFD.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f7a661f35d1a..7b81d324b32a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -641,6 +641,7 @@ struct kvm_vcpu_arch {
 	bool tpr_access_reporting;
 	bool xsaves_enabled;
 	bool trap_nm;
+	bool xfd_out_of_sync;
 	u64 ia32_xss;
 	u64 microcode_version;
 	u64 arch_capabilities;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8595a3e8fbd8..7aaf905e13b4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -162,6 +162,7 @@ static u32 vmx_possible_passthrough_msrs[MAX_POSSIBLE_PASSTHROUGH_MSRS] = {
 	MSR_FS_BASE,
 	MSR_GS_BASE,
 	MSR_KERNEL_GS_BASE,
+	MSR_IA32_XFD,
 	MSR_IA32_XFD_ERR,
 #endif
 	MSR_IA32_SYSENTER_CS,
@@ -1968,6 +1969,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_XFD:
 		ret = kvm_set_msr_common(vcpu, msr_info);
 		if (!ret && data) {
+			vmx_disable_intercept_for_msr(vcpu, MSR_IA32_XFD, MSR_TYPE_RW);
+			vcpu->arch.xfd_out_of_sync = true;
+
 			vcpu->arch.trap_nm = true;
 			vmx_update_exception_bitmap(vcpu);
 		}
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index bf9d3051cd6c..0a00242a91e7 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -340,7 +340,7 @@ struct vcpu_vmx {
 	struct lbr_desc lbr_desc;
 
 	/* Save desired MSR intercept (read: pass-through) state */
-#define MAX_POSSIBLE_PASSTHROUGH_MSRS	14
+#define MAX_POSSIBLE_PASSTHROUGH_MSRS	15
 	struct {
 		DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS);
 		DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3b756ff13103..10a08aa2aa45 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10024,6 +10024,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.guest_fpu.xfd_err)
 		wrmsrl(MSR_IA32_XFD_ERR, 0);
 
+	if (vcpu->arch.xfd_out_of_sync)
+		fpu_sync_guest_vmexit_xfd_state();
+
 	/*
 	 * Consume any pending interrupts, including the possible source of
 	 * VM-Exit on SVM and any ticks that occur between VM-Exit and now.