mbox series

[0/4] x86: cpuid: improve support for broken CPUID configurations

Message ID 20220622144820.751402-1-mlevitsk@redhat.com
Headers show
Series x86: cpuid: improve support for broken CPUID configurations | expand

Message

Maxim Levitsky June 22, 2022, 2:48 p.m. UTC
This patch series aims to harden the cpuid code against the case when
the hypervisor exposes a broken CPUID configuration to the guest,
in the form of having a feature disabled but not features that depend on it.

This is the more generic way to fix kernel panic in aes-ni kernel driver,
which was triggered by CPUID configuration in which AVX is disabled but
not AVX2.

https://lore.kernel.org/all/20211103145231.GA4485@gondor.apana.org.au/T/

This was tested by booting a guest with AVX disabled and not AVX2,
and observing that both a warning is now printed in dmesg, and
that avx2 is gone from /proc/cpuinfo.

Best regards,
	Maxim Levitsky

Maxim Levitsky (4):
  perf/x86/intel/lbr: use setup_clear_cpu_cap instead of clear_cpu_cap
  x86/cpuid: refactor setup_clear_cpu_cap/clear_feature
  x86/cpuid: move filter_cpuid_features to cpuid-deps.c
  x86/cpuid: check for dependencies violations in CPUID and attempt to
    fix them

 arch/x86/events/intel/lbr.c       |  2 +-
 arch/x86/include/asm/cpufeature.h |  1 +
 arch/x86/kernel/cpu/common.c      | 50 +-----------------
 arch/x86/kernel/cpu/cpuid-deps.c  | 84 +++++++++++++++++++++++++++----
 4 files changed, 78 insertions(+), 59 deletions(-)

Comments

Dave Hansen June 22, 2022, 2:58 p.m. UTC | #1
On 6/22/22 07:48, Maxim Levitsky wrote:
> clear_cpu_cap(&boot_cpu_data) is very similar to setup_clear_cpu_cap
> except that the latter also sets a bit in 'cpu_caps_cleared' which
> later clears the same cap in secondary cpus, which is likely
> what is meant here.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>

Seems like a:

Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR")

would be in order.

Kan, does this change look right to you?

>  arch/x86/events/intel/lbr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> index 13179f31fe10fa..b08715172309a7 100644
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
> @@ -1860,7 +1860,7 @@ void __init intel_pmu_arch_lbr_init(void)
>  	return;
>  
>  clear_arch_lbr:
> -	clear_cpu_cap(&boot_cpu_data, X86_FEATURE_ARCH_LBR);
> +	setup_clear_cpu_cap(X86_FEATURE_ARCH_LBR);
>  }
>  
>  /**
Dave Hansen June 22, 2022, 3:07 p.m. UTC | #2
On 6/22/22 07:48, Maxim Levitsky wrote:
> No functional change intended.

It would be really nice to at least write a "why" sentence.  You wrote
the "what" (move code), but I have no idea why you are moving it.

>  arch/x86/kernel/cpu/common.c      | 46 -----------------------------
>  arch/x86/kernel/cpu/cpuid-deps.c  | 48 +++++++++++++++++++++++++++++++

That looks a wee bit odd for a code move.  Where did the 2 lines go?

> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index ea34cc31b0474f..3eb5fe0d654e63 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -147,6 +147,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
>  
>  extern void setup_clear_cpu_cap(unsigned int bit);
>  extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
> +extern void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn);
>  
>  #define setup_force_cpu_cap(bit) do { \
>  	set_cpu_cap(&boot_cpu_data, bit);	\
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 4730b0a58f24a5..4cc79971d2d847 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -620,52 +620,6 @@ __noendbr void cet_disable(void)
>  		wrmsrl(MSR_IA32_S_CET, 0);
>  }
>  
> -/*
...
> -}
>  
>  /*
>   * Naming convention should be: <Name> [(<Codename>)]

One, by leaving extra whitespace.

> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index d6777d07ba3302..bcb091d02a754b 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -131,3 +131,51 @@ void setup_clear_cpu_cap(unsigned int feature)
>  {
>  	clear_cpu_cap(&boot_cpu_data, feature);
>  }
> +
> +
> +/*

Two, by adding extra whitespace.
Maxim Levitsky June 22, 2022, 4:01 p.m. UTC | #3
On Wed, 2022-06-22 at 08:07 -0700, Dave Hansen wrote:
> On 6/22/22 07:48, Maxim Levitsky wrote:
> > No functional change intended.
> 
> It would be really nice to at least write a "why" sentence.  You wrote
> the "what" (move code), but I have no idea why you are moving it.

This is a good point, I will add a justification in V2.

> 
> >  arch/x86/kernel/cpu/common.c      | 46 -----------------------------
> >  arch/x86/kernel/cpu/cpuid-deps.c  | 48 +++++++++++++++++++++++++++++++
> 
> That looks a wee bit odd for a code move.  Where did the 2 lines go?
> 
> > diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> > index ea34cc31b0474f..3eb5fe0d654e63 100644
> > --- a/arch/x86/include/asm/cpufeature.h
> > +++ b/arch/x86/include/asm/cpufeature.h
> > @@ -147,6 +147,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
> >  
> >  extern void setup_clear_cpu_cap(unsigned int bit);
> >  extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
> > +extern void filter_cpuid_features(struct cpuinfo_x86 *c, bool warn);
> >  
> >  #define setup_force_cpu_cap(bit) do { \
> >  	set_cpu_cap(&boot_cpu_data, bit);	\
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index 4730b0a58f24a5..4cc79971d2d847 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -620,52 +620,6 @@ __noendbr void cet_disable(void)
> >  		wrmsrl(MSR_IA32_S_CET, 0);
> >  }
> >  
> > -/*
> ...
> > -}
> >  
> >  /*
> >   * Naming convention should be: <Name> [(<Codename>)]
> 
> One, by leaving extra whitespace.
> 
> > diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> > index d6777d07ba3302..bcb091d02a754b 100644
> > --- a/arch/x86/kernel/cpu/cpuid-deps.c
> > +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> > @@ -131,3 +131,51 @@ void setup_clear_cpu_cap(unsigned int feature)
> >  {
> >  	clear_cpu_cap(&boot_cpu_data, feature);
> >  }
> > +
> > +
> > +/*
> 
> Two, by adding extra whitespace.
> 

I will fix this, I didn't notice that I added new lines.
Next time I move code around, I'll check that the number count matches.

Thanks!
Best regards,
	Maxim Levitsky
Liang, Kan June 22, 2022, 7:32 p.m. UTC | #4
On 6/22/2022 2:57 PM, Liang, Kan wrote:
> 
> 
> On 6/22/2022 10:58 AM, Dave Hansen wrote:
>> On 6/22/22 07:48, Maxim Levitsky wrote:
>>> clear_cpu_cap(&boot_cpu_data) is very similar to setup_clear_cpu_cap
>>> except that the latter also sets a bit in 'cpu_caps_cleared' which
>>> later clears the same cap in secondary cpus, which is likely
>>> what is meant here.
>>>
>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>
>> Seems like a:
>>
>> Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR")
>>
>> would be in order.
>>
>> Kan, does this change look right to you?
> 
> For the current implementation, the Arch LBR feature should be either 
> supported by all the CPUs or disabled on all the CPUs. It cannot be only 
> enabled for partial CPUs, even in a hybrid platform.
> So the current code only check the boot CPU via static_cpu_has().
> 
> Ideally, Yes, I think it may be better to clear the bit for all CPUs, 
> which makes the capability consistent among CPUs.
>

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>

Thanks,
Kan


> Thanks,
> Kan
>>
>>>   arch/x86/events/intel/lbr.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
>>> index 13179f31fe10fa..b08715172309a7 100644
>>> --- a/arch/x86/events/intel/lbr.c
>>> +++ b/arch/x86/events/intel/lbr.c
>>> @@ -1860,7 +1860,7 @@ void __init intel_pmu_arch_lbr_init(void)
>>>       return;
>>>   clear_arch_lbr:
>>> -    clear_cpu_cap(&boot_cpu_data, X86_FEATURE_ARCH_LBR);
>>> +    setup_clear_cpu_cap(X86_FEATURE_ARCH_LBR);
>>>   }
>>>   /**
>>