
[v4,3/4] KVM: nSVM: implement the nested idle halt intercept

Message ID 20241022054810.23369-4-manali.shukla@amd.com
State New
Series Add support for the Idle HLT intercept feature

Commit Message

Manali Shukla Oct. 22, 2024, 5:48 a.m. UTC
Enable the Idle HLT intercept for the L2 guest when it is available on
the platform and the L2 guest is allowed to use it. This is achieved by
clearing the HLT intercept set by the L1 hypervisor when the intercepts
are merged in recalc_intercepts(). The Idle HLT intercept feature
requires that the HLT intercept is cleared and the Idle HLT intercept
is set in order to be properly enabled.

The nested Idle HLT intercept was tested by booting L1 and L2 guests
(all Linux) and verifying that only idle-HLT VM-exits occurred.

Signed-off-by: Manali Shukla <manali.shukla@amd.com>
---
 arch/x86/kvm/governed_features.h | 1 +
 arch/x86/kvm/svm/nested.c        | 8 ++++++++
 arch/x86/kvm/svm/svm.c           | 4 ++++
 3 files changed, 13 insertions(+)

Comments

Sean Christopherson Dec. 20, 2024, 1:01 a.m. UTC | #1
On Tue, Oct 22, 2024, Manali Shukla wrote:
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index d5314cb7dff4..feb241110f1a 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -178,6 +178,14 @@ void recalc_intercepts(struct vcpu_svm *svm)
>  	} else {
>  		WARN_ON(!(c->virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK));
>  	}
> +
> +	/*
> +	 * Clear the HLT intercept for L2 guest when the Idle HLT intercept feature
> +	 * is enabled on the platform and the guest can use the Idle HLT intercept
> +	 * feature.
> +	 */
> +	if (guest_can_use(&svm->vcpu, X86_FEATURE_IDLE_HLT))
> +		vmcb_clr_intercept(c, INTERCEPT_HLT);

This is wrong.  KVM needs to honor the intercept of vmcb12.  If L1 wants to
intercept HLT, then KVM needs to configure vmcb02 to intercept HLT, regardless
of whether or not L1 is utilizing INTERCEPT_IDLE_HLT.

Given how KVM currently handles intercepts for nested SVM, I'm pretty sure you
can simply do nothing.  recalc_intercepts() starts with KVM's intercepts (from
vmcb01), and adds in L1's intercepts.  So unless there is a special case, the
default behavior should Just Work.

	for (i = 0; i < MAX_INTERCEPT; i++)
		c->intercepts[i] = h->intercepts[i];

	...

	for (i = 0; i < MAX_INTERCEPT; i++)
		c->intercepts[i] |= g->intercepts[i];

KVM's approach creates all kinds of virtualization holes, e.g. L1 can utilize
IDLE_HLT even if the feature isn't advertised to L1.  But that's true for quite
literally all feature-based intercepts, so for better or worse, I don't think
it makes sense to try and change that approach for this feature.
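
Concretely, the default merge already covers this feature (illustrative
sketch only, assuming vmcb01 sets INTERCEPT_IDLE_HLT when the feature is
in use for L1; not the actual code flow, which uses two separate loops):

	/*
	 * h (vmcb01): KVM sets INTERCEPT_IDLE_HLT when it wants idle-HLT
	 *             exits for L1.
	 * g (vmcb12): L1 sets INTERCEPT_HLT (and/or INTERCEPT_IDLE_HLT)
	 *             for L2 as it sees fit.
	 * c (vmcb02): the OR of the two, so L1's HLT intercept is honored
	 *             and KVM's idle-HLT intercept is preserved.
	 */
	for (i = 0; i < MAX_INTERCEPT; i++)
		c->intercepts[i] = h->intercepts[i] | g->intercepts[i];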

> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index e86b79e975d3..38d546788fc6 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4425,6 +4425,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
> +	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_IDLE_HLT);
>  
>  	svm_recalc_instruction_intercepts(vcpu, svm);
>  
> @@ -5228,6 +5229,9 @@ static __init void svm_set_cpu_caps(void)
>  		if (vnmi)
>  			kvm_cpu_cap_set(X86_FEATURE_VNMI);
>  
> +		if (cpu_feature_enabled(X86_FEATURE_IDLE_HLT))
> +			kvm_cpu_cap_set(X86_FEATURE_IDLE_HLT);

kvm_cpu_cap_check_and_set() does this for you.

> +
>  		/* Nested VM can receive #VMEXIT instead of triggering #GP */
>  		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>  	}
> -- 
> 2.34.1
>
Manali Shukla Dec. 30, 2024, 7:05 a.m. UTC | #2
Hi Sean,

Thank you for reviewing my patches.

On 12/20/2024 6:31 AM, Sean Christopherson wrote:
> On Tue, Oct 22, 2024, Manali Shukla wrote:
>> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
>> index d5314cb7dff4..feb241110f1a 100644
>> --- a/arch/x86/kvm/svm/nested.c
>> +++ b/arch/x86/kvm/svm/nested.c
>> @@ -178,6 +178,14 @@ void recalc_intercepts(struct vcpu_svm *svm)
>>  	} else {
>>  		WARN_ON(!(c->virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK));
>>  	}
>> +
>> +	/*
>> +	 * Clear the HLT intercept for L2 guest when the Idle HLT intercept feature
>> +	 * is enabled on the platform and the guest can use the Idle HLT intercept
>> +	 * feature.
>> +	 */
>> +	if (guest_can_use(&svm->vcpu, X86_FEATURE_IDLE_HLT))
>> +		vmcb_clr_intercept(c, INTERCEPT_HLT);
> 
> This is wrong.  KVM needs to honor the intercept of vmcb12.  If L1 wants to
> intercept HLT, then KVM needs to configure vmcb02 to intercept HLT, regardless
> of whether or not L1 is utilizing INTERCEPT_IDLE_HLT.
> 
> Given how KVM currently handles intercepts for nested SVM, I'm pretty sure you
> can simply do nothing.  recalc_intercepts() starts with KVM's intercepts (from
> vmcb01), and adds in L1's intercepts.  So unless there is a special case, the
> default behavior should Just Work.
> 
> 	for (i = 0; i < MAX_INTERCEPT; i++)
> 		c->intercepts[i] = h->intercepts[i];
> 
> 	...
> 
> 	for (i = 0; i < MAX_INTERCEPT; i++)
> 		c->intercepts[i] |= g->intercepts[i];
> 
> KVM's approach creates all kinds of virtualization holes, e.g. L1 can utilize
> IDLE_HLT even if the feature isn't advertised to L1.  But that's true for quite
> literally all feature-based intercepts, so for better or worse, I don't think
> it makes sense to try and change that approach for this feature.
> 

Yeah, makes sense. I will remove the above condition in V5 so that the
vmcb12 HLT intercept is honored.

>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index e86b79e975d3..38d546788fc6 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -4425,6 +4425,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
>>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
>>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
>> +	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_IDLE_HLT);
>>  
>>  	svm_recalc_instruction_intercepts(vcpu, svm);
>>  
>> @@ -5228,6 +5229,9 @@ static __init void svm_set_cpu_caps(void)
>>  		if (vnmi)
>>  			kvm_cpu_cap_set(X86_FEATURE_VNMI);
>>  
>> +		if (cpu_feature_enabled(X86_FEATURE_IDLE_HLT))
>> +			kvm_cpu_cap_set(X86_FEATURE_IDLE_HLT);
> 
> kvm_cpu_cap_check_and_set() does this for you.
> 
>> +
>>  		/* Nested VM can receive #VMEXIT instead of triggering #GP */
>>  		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>>  	}
>> -- 
>> 2.34.1
>>
- Manali
Manali Shukla Dec. 30, 2024, 7:14 a.m. UTC | #3
On 12/20/2024 6:31 AM, Sean Christopherson wrote:

>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index e86b79e975d3..38d546788fc6 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -4425,6 +4425,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
>>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
>>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
>>  	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
>> +	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_IDLE_HLT);
>>  
>>  	svm_recalc_instruction_intercepts(vcpu, svm);
>>  
>> @@ -5228,6 +5229,9 @@ static __init void svm_set_cpu_caps(void)
>>  		if (vnmi)
>>  			kvm_cpu_cap_set(X86_FEATURE_VNMI);
>>  
>> +		if (cpu_feature_enabled(X86_FEATURE_IDLE_HLT))
>> +			kvm_cpu_cap_set(X86_FEATURE_IDLE_HLT);
> 
> kvm_cpu_cap_check_and_set() does this for you.
> 

Sure. I will use kvm_cpu_cap_check_and_set() in V5.
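
Something along these lines for V5 (untested sketch; the helper sets the
KVM cap only if the host CPU reports the feature):

	/* Replaces the open-coded cpu_feature_enabled() check above. */
	kvm_cpu_cap_check_and_set(X86_FEATURE_IDLE_HLT);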

>> +
>>  		/* Nested VM can receive #VMEXIT instead of triggering #GP */
>>  		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
>>  	}
>> -- 
>> 2.34.1
>>

- Manali

Patch

diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index ad463b1ed4e4..0154012721cb 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -17,6 +17,7 @@  KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
 KVM_GOVERNED_X86_FEATURE(VGIF)
 KVM_GOVERNED_X86_FEATURE(VNMI)
 KVM_GOVERNED_X86_FEATURE(LAM)
+KVM_GOVERNED_X86_FEATURE(IDLE_HLT)
 
 #undef KVM_GOVERNED_X86_FEATURE
 #undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index d5314cb7dff4..feb241110f1a 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -178,6 +178,14 @@  void recalc_intercepts(struct vcpu_svm *svm)
 	} else {
 		WARN_ON(!(c->virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK));
 	}
+
+	/*
+	 * Clear the HLT intercept for L2 guest when the Idle HLT intercept feature
+	 * is enabled on the platform and the guest can use the Idle HLT intercept
+	 * feature.
+	 */
+	if (guest_can_use(&svm->vcpu, X86_FEATURE_IDLE_HLT))
+		vmcb_clr_intercept(c, INTERCEPT_HLT);
 }
 
 /*
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e86b79e975d3..38d546788fc6 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4425,6 +4425,7 @@  static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_PFTHRESHOLD);
 	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VGIF);
 	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_VNMI);
+	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_IDLE_HLT);
 
 	svm_recalc_instruction_intercepts(vcpu, svm);
 
@@ -5228,6 +5229,9 @@  static __init void svm_set_cpu_caps(void)
 		if (vnmi)
 			kvm_cpu_cap_set(X86_FEATURE_VNMI);
 
+		if (cpu_feature_enabled(X86_FEATURE_IDLE_HLT))
+			kvm_cpu_cap_set(X86_FEATURE_IDLE_HLT);
+
 		/* Nested VM can receive #VMEXIT instead of triggering #GP */
 		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
 	}