diff mbox series

[1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability

Message ID 20191010171517.28782-2-suzuki.poulose@arm.com
State Superseded
Headers show
Series arm64: Fix support for systems without FP/SIMD | expand

Commit Message

Suzuki K Poulose Oct. 10, 2019, 5:15 p.m. UTC
The NO_FPSIMD capability is defined with scope SYSTEM, which implies
that the "absence" of FP/SIMD on at least one CPU is detected only
after all the SMP CPUs are brought up. However, we use the status
of this capability for every context switch. So, let us change
the scop to LOCAL_CPU to allow the detection of this capability
as and when the first CPU without FP is brought up.

Also, the current type allows hotplugged CPU to be brought up without
FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
up. Fix both of these issues by changing the capability to
BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

---
 arch/arm64/kernel/cpufeature.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.21.0

Comments

Dave Martin Oct. 11, 2019, 11:36 a.m. UTC | #1
On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

> that the "absence" of FP/SIMD on at least one CPU is detected only

> after all the SMP CPUs are brought up. However, we use the status

> of this capability for every context switch. So, let us change

> the scop to LOCAL_CPU to allow the detection of this capability

> as and when the first CPU without FP is brought up.

> 

> Also, the current type allows hotplugged CPU to be brought up without

> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

> up. Fix both of these issues by changing the capability to

> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

> 

> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

> Cc: Will Deacon <will@kernel.org>

> Cc: Mark Rutland <mark.rutland@arm.com>

> Cc: Catalin Marinas <catalin.marinas@arm.com>

> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> ---

>  arch/arm64/kernel/cpufeature.c | 2 +-

>  1 file changed, 1 insertion(+), 1 deletion(-)

> 

> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

> index 9323bcc40a58..0f9eace6c64b 100644

> --- a/arch/arm64/kernel/cpufeature.c

> +++ b/arch/arm64/kernel/cpufeature.c

> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

>  	{

>  		/* FP/SIMD is not implemented */

>  		.capability = ARM64_HAS_NO_FPSIMD,

> -		.type = ARM64_CPUCAP_SYSTEM_FEATURE,

> +		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,


ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

Although we have other things that smell like this (CPU errata for
example), I wonder whether inverting the meaning in the case would
make the situation easier to understand.

So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
IIUC.  We'd just need to invert the sense of the check in
system_supports_fpsimd().

>  		.min_field_value = 0,


(Does .min_field_value == 0 make sense, or is it even used?  I thought
only the default has_cpuid_feature() match logic uses that.)

>  		.matches = has_no_fpsimd,

>  	},


Cheers
---Dave
Suzuki K Poulose Oct. 11, 2019, 12:13 p.m. UTC | #2
Hi Dave

On 11/10/2019 12:36, Dave Martin wrote:
> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

>> that the "absence" of FP/SIMD on at least one CPU is detected only

>> after all the SMP CPUs are brought up. However, we use the status

>> of this capability for every context switch. So, let us change

>> the scop to LOCAL_CPU to allow the detection of this capability

>> as and when the first CPU without FP is brought up.

>>

>> Also, the current type allows hotplugged CPU to be brought up without

>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

>> up. Fix both of these issues by changing the capability to

>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

>>

>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

>> Cc: Will Deacon <will@kernel.org>

>> Cc: Mark Rutland <mark.rutland@arm.com>

>> Cc: Catalin Marinas <catalin.marinas@arm.com>

>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

>> ---

>>   arch/arm64/kernel/cpufeature.c | 2 +-

>>   1 file changed, 1 insertion(+), 1 deletion(-)

>>

>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

>> index 9323bcc40a58..0f9eace6c64b 100644

>> --- a/arch/arm64/kernel/cpufeature.c

>> +++ b/arch/arm64/kernel/cpufeature.c

>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

>>   	{

>>   		/* FP/SIMD is not implemented */

>>   		.capability = ARM64_HAS_NO_FPSIMD,

>> -		.type = ARM64_CPUCAP_SYSTEM_FEATURE,

>> +		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

> 

> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

> 

> Although we have other things that smell like this (CPU errata for

> example), I wonder whether inverting the meaning in the case would

> make the situation easier to understand.


Yes, it is indeed a disability, more on that below.

> 

> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

> IIUC.  We'd just need to invert the sense of the check in

> system_supports_fpsimd().


This is particularly something we want to avoid with this patch. We want
to make sure that we have the up-to-date status of the disability right
when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
you must wait until all the CPUs are up, unlike the negated capability.

> 

>>   		.min_field_value = 0,

> 

> (Does .min_field_value == 0 make sense, or is it even used?  I thought

> only the default has_cpuid_feature() match logic uses that.)


True, it is not used for this particular case.

Cheers
Suzuki
Dave Martin Oct. 11, 2019, 2:21 p.m. UTC | #3
On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> 

> On 11/10/2019 12:36, Dave Martin wrote:

> >On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

> >>The NO_FPSIMD capability is defined with scope SYSTEM, which implies

> >>that the "absence" of FP/SIMD on at least one CPU is detected only

> >>after all the SMP CPUs are brought up. However, we use the status

> >>of this capability for every context switch. So, let us change

> >>the scop to LOCAL_CPU to allow the detection of this capability

> >>as and when the first CPU without FP is brought up.

> >>

> >>Also, the current type allows hotplugged CPU to be brought up without

> >>FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

> >>up. Fix both of these issues by changing the capability to

> >>BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

> >>

> >>Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

> >>Cc: Will Deacon <will@kernel.org>

> >>Cc: Mark Rutland <mark.rutland@arm.com>

> >>Cc: Catalin Marinas <catalin.marinas@arm.com>

> >>Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> >>---

> >>  arch/arm64/kernel/cpufeature.c | 2 +-

> >>  1 file changed, 1 insertion(+), 1 deletion(-)

> >>

> >>diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

> >>index 9323bcc40a58..0f9eace6c64b 100644

> >>--- a/arch/arm64/kernel/cpufeature.c

> >>+++ b/arch/arm64/kernel/cpufeature.c

> >>@@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

> >>  	{

> >>  		/* FP/SIMD is not implemented */

> >>  		.capability = ARM64_HAS_NO_FPSIMD,

> >>-		.type = ARM64_CPUCAP_SYSTEM_FEATURE,

> >>+		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

> >

> >ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

> >

> >Although we have other things that smell like this (CPU errata for

> >example), I wonder whether inverting the meaning in the case would

> >make the situation easier to understand.

> 

> Yes, it is indeed a disability, more on that below.

> 

> >

> >So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

> >value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

> >IIUC.  We'd just need to invert the sense of the check in

> >system_supports_fpsimd().

> 

> This is particularly something we want to avoid with this patch. We want

> to make sure that we have the up-to-date status of the disability right

> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

> you must wait until all the CPUs are up, unlike the negated capability.


I don't see why waiting for the random defective early CPU to come up is
better than waiting for all the early CPUs to come up and then deciding.

Kernel-mode NEON aside, the status of this cap should not matter until
we enter userspace for the first time.

The only issue is if e.g., crypto drivers that can use kernel-mode NEON
probe for it before all early CPUs are up, and so cache the wrong
decision.  The current approach doesn't cope with that anyway AFAICT.

> >>  		.min_field_value = 0,

> >

> >(Does .min_field_value == 0 make sense, or is it even used?  I thought

> >only the default has_cpuid_feature() match logic uses that.)

> 

> True, it is not used for this particular case.


Ok, just wondering.

Cheers
---Dave
Suzuki K Poulose Oct. 11, 2019, 5:28 p.m. UTC | #4
On 11/10/2019 15:21, Dave Martin wrote:
> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

>>

>> On 11/10/2019 12:36, Dave Martin wrote:

>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

>>>> after all the SMP CPUs are brought up. However, we use the status

>>>> of this capability for every context switch. So, let us change

>>>> the scop to LOCAL_CPU to allow the detection of this capability

>>>> as and when the first CPU without FP is brought up.

>>>>

>>>> Also, the current type allows hotplugged CPU to be brought up without

>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

>>>> up. Fix both of these issues by changing the capability to

>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

>>>>

>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

>>>> Cc: Will Deacon <will@kernel.org>

>>>> Cc: Mark Rutland <mark.rutland@arm.com>

>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

>>>> ---

>>>>   arch/arm64/kernel/cpufeature.c | 2 +-

>>>>   1 file changed, 1 insertion(+), 1 deletion(-)

>>>>

>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

>>>> index 9323bcc40a58..0f9eace6c64b 100644

>>>> --- a/arch/arm64/kernel/cpufeature.c

>>>> +++ b/arch/arm64/kernel/cpufeature.c

>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

>>>>   	{

>>>>   		/* FP/SIMD is not implemented */

>>>>   		.capability = ARM64_HAS_NO_FPSIMD,

>>>> -		.type = ARM64_CPUCAP_SYSTEM_FEATURE,

>>>> +		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

>>>

>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

>>>

>>> Although we have other things that smell like this (CPU errata for

>>> example), I wonder whether inverting the meaning in the case would

>>> make the situation easier to understand.

>>

>> Yes, it is indeed a disability, more on that below.

>>

>>>

>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

>>> IIUC.  We'd just need to invert the sense of the check in

>>> system_supports_fpsimd().

>>

>> This is particularly something we want to avoid with this patch. We want

>> to make sure that we have the up-to-date status of the disability right

>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

>> you must wait until all the CPUs are up, unlike the negated capability.

> 

> I don't see why waiting for the random defective early CPU to come up is

> better than waiting for all the early CPUs to come up and then deciding.

> 

> Kernel-mode NEON aside, the status of this cap should not matter until

> we enter userspace for the first time.

> 

> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

> probe for it before all early CPUs are up, and so cache the wrong

> decision.  The current approach doesn't cope with that anyway AFAICT.


This approach does in fact. With LOCAL_CPU scope, the moment a defective
CPU turns up, we mark the "capability" and thus the kernel cannot use
the neon then onwards, unlike the existing case where we have time till
we boot all the CPUs (even when the boot CPU may be defective).

Cheers
Suzuki
Dave Martin Oct. 14, 2019, 2:52 p.m. UTC | #5
On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> 

> 

> On 11/10/2019 15:21, Dave Martin wrote:

> >On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

> >>

> >>On 11/10/2019 12:36, Dave Martin wrote:

> >>>On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

> >>>>The NO_FPSIMD capability is defined with scope SYSTEM, which implies

> >>>>that the "absence" of FP/SIMD on at least one CPU is detected only

> >>>>after all the SMP CPUs are brought up. However, we use the status

> >>>>of this capability for every context switch. So, let us change

> >>>>the scop to LOCAL_CPU to allow the detection of this capability

> >>>>as and when the first CPU without FP is brought up.

> >>>>

> >>>>Also, the current type allows hotplugged CPU to be brought up without

> >>>>FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

> >>>>up. Fix both of these issues by changing the capability to

> >>>>BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

> >>>>

> >>>>Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

> >>>>Cc: Will Deacon <will@kernel.org>

> >>>>Cc: Mark Rutland <mark.rutland@arm.com>

> >>>>Cc: Catalin Marinas <catalin.marinas@arm.com>

> >>>>Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> >>>>---

> >>>>  arch/arm64/kernel/cpufeature.c | 2 +-

> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)

> >>>>

> >>>>diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

> >>>>index 9323bcc40a58..0f9eace6c64b 100644

> >>>>--- a/arch/arm64/kernel/cpufeature.c

> >>>>+++ b/arch/arm64/kernel/cpufeature.c

> >>>>@@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

> >>>>  	{

> >>>>  		/* FP/SIMD is not implemented */

> >>>>  		.capability = ARM64_HAS_NO_FPSIMD,

> >>>>-		.type = ARM64_CPUCAP_SYSTEM_FEATURE,

> >>>>+		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

> >>>

> >>>ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

> >>>

> >>>Although we have other things that smell like this (CPU errata for

> >>>example), I wonder whether inverting the meaning in the case would

> >>>make the situation easier to understand.

> >>

> >>Yes, it is indeed a disability, more on that below.

> >>

> >>>

> >>>So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

> >>>value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

> >>>IIUC.  We'd just need to invert the sense of the check in

> >>>system_supports_fpsimd().

> >>

> >>This is particularly something we want to avoid with this patch. We want

> >>to make sure that we have the up-to-date status of the disability right

> >>when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

> >>you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

> >>you must wait until all the CPUs are up, unlike the negated capability.

> >

> >I don't see why waiting for the random defective early CPU to come up is

> >better than waiting for all the early CPUs to come up and then deciding.

> >

> >Kernel-mode NEON aside, the status of this cap should not matter until

> >we enter userspace for the first time.

> >

> >The only issue is if e.g., crypto drivers that can use kernel-mode NEON

> >probe for it before all early CPUs are up, and so cache the wrong

> >decision.  The current approach doesn't cope with that anyway AFAICT.

> 

> This approach does in fact. With LOCAL_CPU scope, the moment a defective

> CPU turns up, we mark the "capability" and thus the kernel cannot use

> the neon then onwards, unlike the existing case where we have time till

> we boot all the CPUs (even when the boot CPU may be defective).


I guess that makes sense.

I'm now wondering what happens if anything tries to use kernel-mode NEON
before SVE is initialised -- which doesn't happen until cpufeatures
configures the system features.

I don't think your proposed change makes anything worse here, but it may
need looking into.

Cheers
---Dave
Suzuki K Poulose Oct. 14, 2019, 3:45 p.m. UTC | #6
On 14/10/2019 15:52, Dave Martin wrote:
> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:

>>

>>

>> On 11/10/2019 15:21, Dave Martin wrote:

>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

>>>>

>>>> On 11/10/2019 12:36, Dave Martin wrote:

>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

>>>>>> after all the SMP CPUs are brought up. However, we use the status

>>>>>> of this capability for every context switch. So, let us change

>>>>>> the scop to LOCAL_CPU to allow the detection of this capability

>>>>>> as and when the first CPU without FP is brought up.

>>>>>>

>>>>>> Also, the current type allows hotplugged CPU to be brought up without

>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

>>>>>> up. Fix both of these issues by changing the capability to

>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

>>>>>>

>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

>>>>>> Cc: Will Deacon <will@kernel.org>

>>>>>> Cc: Mark Rutland <mark.rutland@arm.com>

>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

>>>>>> ---

>>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-

>>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)

>>>>>>

>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

>>>>>> index 9323bcc40a58..0f9eace6c64b 100644

>>>>>> --- a/arch/arm64/kernel/cpufeature.c

>>>>>> +++ b/arch/arm64/kernel/cpufeature.c

>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

>>>>>>   	{

>>>>>>   		/* FP/SIMD is not implemented */

>>>>>>   		.capability = ARM64_HAS_NO_FPSIMD,

>>>>>> -		.type = ARM64_CPUCAP_SYSTEM_FEATURE,

>>>>>> +		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

>>>>>

>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

>>>>>

>>>>> Although we have other things that smell like this (CPU errata for

>>>>> example), I wonder whether inverting the meaning in the case would

>>>>> make the situation easier to understand.

>>>>

>>>> Yes, it is indeed a disability, more on that below.

>>>>

>>>>>

>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

>>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

>>>>> IIUC.  We'd just need to invert the sense of the check in

>>>>> system_supports_fpsimd().

>>>>

>>>> This is particularly something we want to avoid with this patch. We want

>>>> to make sure that we have the up-to-date status of the disability right

>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

>>>> you must wait until all the CPUs are up, unlike the negated capability.

>>>

>>> I don't see why waiting for the random defective early CPU to come up is

>>> better than waiting for all the early CPUs to come up and then deciding.

>>>

>>> Kernel-mode NEON aside, the status of this cap should not matter until

>>> we enter userspace for the first time.

>>>

>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

>>> probe for it before all early CPUs are up, and so cache the wrong

>>> decision.  The current approach doesn't cope with that anyway AFAICT.

>>

>> This approach does in fact. With LOCAL_CPU scope, the moment a defective

>> CPU turns up, we mark the "capability" and thus the kernel cannot use

>> the neon then onwards, unlike the existing case where we have time till

>> we boot all the CPUs (even when the boot CPU may be defective).

> 

> I guess that makes sense.

> 

> I'm now wondering what happens if anything tries to use kernel-mode NEON

> before SVE is initialised -- which doesn't happen until cpufeatures

> configures the system features.

> 

> I don't think your proposed change makes anything worse here, but it may

> need looking into.


We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
is initialised ?

Suzuki
Dave Martin Oct. 14, 2019, 3:50 p.m. UTC | #7
On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
>

>

> On 14/10/2019 15:52, Dave Martin wrote:

> > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:

> >>

> >>

> >> On 11/10/2019 15:21, Dave Martin wrote:

> >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

> >>>>

> >>>> On 11/10/2019 12:36, Dave Martin wrote:

> >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

> >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

> >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

> >>>>>> after all the SMP CPUs are brought up. However, we use the status

> >>>>>> of this capability for every context switch. So, let us change

> >>>>>> the scop to LOCAL_CPU to allow the detection of this capability

> >>>>>> as and when the first CPU without FP is brought up.

> >>>>>>

> >>>>>> Also, the current type allows hotplugged CPU to be brought up without

> >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

> >>>>>> up. Fix both of these issues by changing the capability to

> >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

> >>>>>>

> >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

> >>>>>> Cc: Will Deacon <will@kernel.org>

> >>>>>> Cc: Mark Rutland <mark.rutland@arm.com>

> >>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

> >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> >>>>>> ---

> >>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-

> >>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)

> >>>>>>

> >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

> >>>>>> index 9323bcc40a58..0f9eace6c64b 100644

> >>>>>> --- a/arch/arm64/kernel/cpufeature.c

> >>>>>> +++ b/arch/arm64/kernel/cpufeature.c

> >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

> >>>>>>        {

> >>>>>>                /* FP/SIMD is not implemented */

> >>>>>>                .capability = ARM64_HAS_NO_FPSIMD,

> >>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,

> >>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

> >>>>>

> >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

> >>>>>

> >>>>> Although we have other things that smell like this (CPU errata for

> >>>>> example), I wonder whether inverting the meaning in the case would

> >>>>> make the situation easier to understand.

> >>>>

> >>>> Yes, it is indeed a disability, more on that below.

> >>>>

> >>>>>

> >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

> >>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

> >>>>> IIUC.  We'd just need to invert the sense of the check in

> >>>>> system_supports_fpsimd().

> >>>>

> >>>> This is particularly something we want to avoid with this patch. We want

> >>>> to make sure that we have the up-to-date status of the disability right

> >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

> >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

> >>>> you must wait until all the CPUs are up, unlike the negated capability.

> >>>

> >>> I don't see why waiting for the random defective early CPU to come up is

> >>> better than waiting for all the early CPUs to come up and then deciding.

> >>>

> >>> Kernel-mode NEON aside, the status of this cap should not matter until

> >>> we enter userspace for the first time.

> >>>

> >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

> >>> probe for it before all early CPUs are up, and so cache the wrong

> >>> decision.  The current approach doesn't cope with that anyway AFAICT.

> >>

> >> This approach does in fact. With LOCAL_CPU scope, the moment a defective

> >> CPU turns up, we mark the "capability" and thus the kernel cannot use

> >> the neon then onwards, unlike the existing case where we have time till

> >> we boot all the CPUs (even when the boot CPU may be defective).

> >

> > I guess that makes sense.

> >

> > I'm now wondering what happens if anything tries to use kernel-mode NEON

> > before SVE is initialised -- which doesn't happen until cpufeatures

> > configures the system features.

> >

> > I don't think your proposed change makes anything worse here, but it may

> > need looking into.

>

> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE

> is initialised ?


Could do, at least as an experiment.

Ard, do you have any thoughts on this?

Cheers
---Dave
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Ard Biesheuvel Oct. 14, 2019, 4:57 p.m. UTC | #8
On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:
>

> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:

> >

> >

> > On 14/10/2019 15:52, Dave Martin wrote:

> > > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:

> > >>

> > >>

> > >> On 11/10/2019 15:21, Dave Martin wrote:

> > >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

> > >>>>

> > >>>> On 11/10/2019 12:36, Dave Martin wrote:

> > >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

> > >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

> > >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

> > >>>>>> after all the SMP CPUs are brought up. However, we use the status

> > >>>>>> of this capability for every context switch. So, let us change

> > >>>>>> the scop to LOCAL_CPU to allow the detection of this capability

> > >>>>>> as and when the first CPU without FP is brought up.

> > >>>>>>

> > >>>>>> Also, the current type allows hotplugged CPU to be brought up without

> > >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

> > >>>>>> up. Fix both of these issues by changing the capability to

> > >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

> > >>>>>>

> > >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

> > >>>>>> Cc: Will Deacon <will@kernel.org>

> > >>>>>> Cc: Mark Rutland <mark.rutland@arm.com>

> > >>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

> > >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> > >>>>>> ---

> > >>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-

> > >>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)

> > >>>>>>

> > >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

> > >>>>>> index 9323bcc40a58..0f9eace6c64b 100644

> > >>>>>> --- a/arch/arm64/kernel/cpufeature.c

> > >>>>>> +++ b/arch/arm64/kernel/cpufeature.c

> > >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

> > >>>>>>        {

> > >>>>>>                /* FP/SIMD is not implemented */

> > >>>>>>                .capability = ARM64_HAS_NO_FPSIMD,

> > >>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,

> > >>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

> > >>>>>

> > >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

> > >>>>>

> > >>>>> Although we have other things that smell like this (CPU errata for

> > >>>>> example), I wonder whether inverting the meaning in the case would

> > >>>>> make the situation easier to understand.

> > >>>>

> > >>>> Yes, it is indeed a disability, more on that below.

> > >>>>

> > >>>>>

> > >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

> > >>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

> > >>>>> IIUC.  We'd just need to invert the sense of the check in

> > >>>>> system_supports_fpsimd().

> > >>>>

> > >>>> This is particularly something we want to avoid with this patch. We want

> > >>>> to make sure that we have the up-to-date status of the disability right

> > >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

> > >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

> > >>>> you must wait until all the CPUs are up, unlike the negated capability.

> > >>>

> > >>> I don't see why waiting for the random defective early CPU to come up is

> > >>> better than waiting for all the early CPUs to come up and then deciding.

> > >>>

> > >>> Kernel-mode NEON aside, the status of this cap should not matter until

> > >>> we enter userspace for the first time.

> > >>>

> > >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

> > >>> probe for it before all early CPUs are up, and so cache the wrong

> > >>> decision.  The current approach doesn't cope with that anyway AFAICT.

> > >>

> > >> This approach does in fact. With LOCAL_CPU scope, the moment a defective

> > >> CPU turns up, we mark the "capability" and thus the kernel cannot use

> > >> the neon then onwards, unlike the existing case where we have time till

> > >> we boot all the CPUs (even when the boot CPU may be defective).

> > >

> > > I guess that makes sense.

> > >

> > > I'm now wondering what happens if anything tries to use kernel-mode NEON

> > > before SVE is initialised -- which doesn't happen until cpufeatures

> > > configures the system features.

> > >

> > > I don't think your proposed change makes anything worse here, but it may

> > > need looking into.

> >

> > We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE

> > is initialised ?

>

> Could do, at least as an experiment.

>

> Ard, do you have any thoughts on this?

>


All in-kernel NEON code checks whether the NEON is usable, so I'd
expect that check to return 'false' if it is too early in the boot for
the NEON to be used at all.
Suzuki K Poulose Oct. 15, 2019, 9:44 a.m. UTC | #9
On 14/10/2019 17:57, Ard Biesheuvel wrote:
> On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:

>>

>> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:

>>>

>>>

>>> On 14/10/2019 15:52, Dave Martin wrote:

>>>> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:

>>>>>

>>>>>

>>>>> On 11/10/2019 15:21, Dave Martin wrote:

>>>>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

>>>>>>>

>>>>>>> On 11/10/2019 12:36, Dave Martin wrote:

>>>>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

>>>>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

>>>>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

>>>>>>>>> after all the SMP CPUs are brought up. However, we use the status

>>>>>>>>> of this capability for every context switch. So, let us change

>>>>>>>>> the scop to LOCAL_CPU to allow the detection of this capability

>>>>>>>>> as and when the first CPU without FP is brought up.

>>>>>>>>>

>>>>>>>>> Also, the current type allows hotplugged CPU to be brought up without

>>>>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

>>>>>>>>> up. Fix both of these issues by changing the capability to

>>>>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

>>>>>>>>>

>>>>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

>>>>>>>>> Cc: Will Deacon <will@kernel.org>

>>>>>>>>> Cc: Mark Rutland <mark.rutland@arm.com>

>>>>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

>>>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

>>>>>>>>> ---

>>>>>>>>>    arch/arm64/kernel/cpufeature.c | 2 +-

>>>>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)

>>>>>>>>>

>>>>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

>>>>>>>>> index 9323bcc40a58..0f9eace6c64b 100644

>>>>>>>>> --- a/arch/arm64/kernel/cpufeature.c

>>>>>>>>> +++ b/arch/arm64/kernel/cpufeature.c

>>>>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

>>>>>>>>>         {

>>>>>>>>>                 /* FP/SIMD is not implemented */

>>>>>>>>>                 .capability = ARM64_HAS_NO_FPSIMD,

>>>>>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,

>>>>>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

>>>>>>>>

>>>>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

>>>>>>>>

>>>>>>>> Although we have other things that smell like this (CPU errata for

>>>>>>>> example), I wonder whether inverting the meaning in the case would

>>>>>>>> make the situation easier to understand.

>>>>>>>

>>>>>>> Yes, it is indeed a disability, more on that below.

>>>>>>>

>>>>>>>>

>>>>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

>>>>>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

>>>>>>>> IIUC.  We'd just need to invert the sense of the check in

>>>>>>>> system_supports_fpsimd().

>>>>>>>

>>>>>>> This is particularly something we want to avoid with this patch. We want

>>>>>>> to make sure that we have the up-to-date status of the disability right

>>>>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

>>>>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

>>>>>>> you must wait until all the CPUs are up, unlike the negated capability.

>>>>>>

>>>>>> I don't see why waiting for the random defective early CPU to come up is

>>>>>> better than waiting for all the early CPUs to come up and then deciding.

>>>>>>

>>>>>> Kernel-mode NEON aside, the status of this cap should not matter until

>>>>>> we enter userspace for the first time.

>>>>>>

>>>>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

>>>>>> probe for it before all early CPUs are up, and so cache the wrong

>>>>>> decision.  The current approach doesn't cope with that anyway AFAICT.

>>>>>

>>>>> This approach does in fact. With LOCAL_CPU scope, the moment a defective

>>>>> CPU turns up, we mark the "capability" and thus the kernel cannot use

>>>>> the neon then onwards, unlike the existing case where we have time till

>>>>> we boot all the CPUs (even when the boot CPU may be defective).

>>>>

>>>> I guess that makes sense.

>>>>

>>>> I'm now wondering what happens if anything tries to use kernel-mode NEON

>>>> before SVE is initialised -- which doesn't happen until cpufeatures

>>>> configures the system features.

>>>>

>>>> I don't think your proposed change makes anything worse here, but it may

>>>> need looking into.

>>>

>>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE

>>> is initialised ?

>>

>> Could do, at least as an experiment.

>>

>> Ard, do you have any thoughts on this?

>>

> 

> All in-kernel NEON code checks whether the NEON is usable, so I'd

> expect that check to return 'false' if it is too early in the boot for

> the NEON to be used at all.


Ok. That implies, we need a check to make sure SVE set up is complete,
which we don't at the moment, as we default to assume FP/SIMD is available.

"system_can_use_fpsimd()" instead of the "system_supports_fpsimd() where
the former should indicate:

  system_supports_fpsimd() && sve_setup_complete()

Where the sve_setup_complete() can itself be a static key, initialized
very early if we have !CONFIG_SVE. Otherwise, set from sve_setup().


Thoughts ?
Suzuki
Ard Biesheuvel Oct. 15, 2019, 9:52 a.m. UTC | #10
On Tue, 15 Oct 2019 at 11:44, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>

>

>

> On 14/10/2019 17:57, Ard Biesheuvel wrote:

> > On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:

> >>

> >> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:

> >>>

> >>>

> >>> On 14/10/2019 15:52, Dave Martin wrote:

> >>>> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:

> >>>>>

> >>>>>

> >>>>> On 11/10/2019 15:21, Dave Martin wrote:

> >>>>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

> >>>>>>>

> >>>>>>> On 11/10/2019 12:36, Dave Martin wrote:

> >>>>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

> >>>>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

> >>>>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

> >>>>>>>>> after all the SMP CPUs are brought up. However, we use the status

> >>>>>>>>> of this capability for every context switch. So, let us change

> >>>>>>>>> the scop to LOCAL_CPU to allow the detection of this capability

> >>>>>>>>> as and when the first CPU without FP is brought up.

> >>>>>>>>>

> >>>>>>>>> Also, the current type allows hotplugged CPU to be brought up without

> >>>>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

> >>>>>>>>> up. Fix both of these issues by changing the capability to

> >>>>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

> >>>>>>>>>

> >>>>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

> >>>>>>>>> Cc: Will Deacon <will@kernel.org>

> >>>>>>>>> Cc: Mark Rutland <mark.rutland@arm.com>

> >>>>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

> >>>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> >>>>>>>>> ---

> >>>>>>>>>    arch/arm64/kernel/cpufeature.c | 2 +-

> >>>>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)

> >>>>>>>>>

> >>>>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

> >>>>>>>>> index 9323bcc40a58..0f9eace6c64b 100644

> >>>>>>>>> --- a/arch/arm64/kernel/cpufeature.c

> >>>>>>>>> +++ b/arch/arm64/kernel/cpufeature.c

> >>>>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

> >>>>>>>>>         {

> >>>>>>>>>                 /* FP/SIMD is not implemented */

> >>>>>>>>>                 .capability = ARM64_HAS_NO_FPSIMD,

> >>>>>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,

> >>>>>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

> >>>>>>>>

> >>>>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

> >>>>>>>>

> >>>>>>>> Although we have other things that smell like this (CPU errata for

> >>>>>>>> example), I wonder whether inverting the meaning in the case would

> >>>>>>>> make the situation easier to understand.

> >>>>>>>

> >>>>>>> Yes, it is indeed a disability, more on that below.

> >>>>>>>

> >>>>>>>>

> >>>>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

> >>>>>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

> >>>>>>>> IIUC.  We'd just need to invert the sense of the check in

> >>>>>>>> system_supports_fpsimd().

> >>>>>>>

> >>>>>>> This is particularly something we want to avoid with this patch. We want

> >>>>>>> to make sure that we have the up-to-date status of the disability right

> >>>>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

> >>>>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

> >>>>>>> you must wait until all the CPUs are up, unlike the negated capability.

> >>>>>>

> >>>>>> I don't see why waiting for the random defective early CPU to come up is

> >>>>>> better than waiting for all the early CPUs to come up and then deciding.

> >>>>>>

> >>>>>> Kernel-mode NEON aside, the status of this cap should not matter until

> >>>>>> we enter userspace for the first time.

> >>>>>>

> >>>>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

> >>>>>> probe for it before all early CPUs are up, and so cache the wrong

> >>>>>> decision.  The current approach doesn't cope with that anyway AFAICT.

> >>>>>

> >>>>> This approach does in fact. With LOCAL_CPU scope, the moment a defective

> >>>>> CPU turns up, we mark the "capability" and thus the kernel cannot use

> >>>>> the neon then onwards, unlike the existing case where we have time till

> >>>>> we boot all the CPUs (even when the boot CPU may be defective).

> >>>>

> >>>> I guess that makes sense.

> >>>>

> >>>> I'm now wondering what happens if anything tries to use kernel-mode NEON

> >>>> before SVE is initialised -- which doesn't happen until cpufeatures

> >>>> configures the system features.

> >>>>

> >>>> I don't think your proposed change makes anything worse here, but it may

> >>>> need looking into.

> >>>

> >>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE

> >>> is initialised ?

> >>

> >> Could do, at least as an experiment.

> >>

> >> Ard, do you have any thoughts on this?

> >>

> >

> > All in-kernel NEON code checks whether the NEON is usable, so I'd

> > expect that check to return 'false' if it is too early in the boot for

> > the NEON to be used at all.

>

> Ok. That implies, we need a check to make sure SVE set up is complete,

> which we don't at the moment, as we default to assume FP/SIMD is available.

>

> "system_can_use_fpsimd()" instead of the "system_supports_fpsimd() where

> the former should indicate:

>

>   system_supports_fpsimd() && sve_setup_complete()

>

> Where the sve_setup_complete() can itself be a static key, initialized

> very early if we have !CONFIG_SVE. Otherwise, set from sve_setup().

>

>

> Thoughts ?


Yes, that sounds reasonable. If we fold that into the implementation
of may_use_simd(), we shouldn't need any other changes to the clients
AFAICT
Dave Martin Oct. 15, 2019, 10:24 a.m. UTC | #11
On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:
> On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:

> >

> > On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:

> > >

> > >

> > > On 14/10/2019 15:52, Dave Martin wrote:

> > > > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:

> > > >>

> > > >>

> > > >> On 11/10/2019 15:21, Dave Martin wrote:

> > > >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

> > > >>>>

> > > >>>> On 11/10/2019 12:36, Dave Martin wrote:

> > > >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

> > > >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

> > > >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

> > > >>>>>> after all the SMP CPUs are brought up. However, we use the status

> > > >>>>>> of this capability for every context switch. So, let us change

> > > >>>>>> the scop to LOCAL_CPU to allow the detection of this capability

> > > >>>>>> as and when the first CPU without FP is brought up.

> > > >>>>>>

> > > >>>>>> Also, the current type allows hotplugged CPU to be brought up without

> > > >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

> > > >>>>>> up. Fix both of these issues by changing the capability to

> > > >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

> > > >>>>>>

> > > >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

> > > >>>>>> Cc: Will Deacon <will@kernel.org>

> > > >>>>>> Cc: Mark Rutland <mark.rutland@arm.com>

> > > >>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

> > > >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> > > >>>>>> ---

> > > >>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-

> > > >>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)

> > > >>>>>>

> > > >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

> > > >>>>>> index 9323bcc40a58..0f9eace6c64b 100644

> > > >>>>>> --- a/arch/arm64/kernel/cpufeature.c

> > > >>>>>> +++ b/arch/arm64/kernel/cpufeature.c

> > > >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

> > > >>>>>>        {

> > > >>>>>>                /* FP/SIMD is not implemented */

> > > >>>>>>                .capability = ARM64_HAS_NO_FPSIMD,

> > > >>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,

> > > >>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

> > > >>>>>

> > > >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

> > > >>>>>

> > > >>>>> Although we have other things that smell like this (CPU errata for

> > > >>>>> example), I wonder whether inverting the meaning in the case would

> > > >>>>> make the situation easier to understand.

> > > >>>>

> > > >>>> Yes, it is indeed a disability, more on that below.

> > > >>>>

> > > >>>>>

> > > >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

> > > >>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

> > > >>>>> IIUC.  We'd just need to invert the sense of the check in

> > > >>>>> system_supports_fpsimd().

> > > >>>>

> > > >>>> This is particularly something we want to avoid with this patch. We want

> > > >>>> to make sure that we have the up-to-date status of the disability right

> > > >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

> > > >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

> > > >>>> you must wait until all the CPUs are up, unlike the negated capability.

> > > >>>

> > > >>> I don't see why waiting for the random defective early CPU to come up is

> > > >>> better than waiting for all the early CPUs to come up and then deciding.

> > > >>>

> > > >>> Kernel-mode NEON aside, the status of this cap should not matter until

> > > >>> we enter userspace for the first time.

> > > >>>

> > > >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

> > > >>> probe for it before all early CPUs are up, and so cache the wrong

> > > >>> decision.  The current approach doesn't cope with that anyway AFAICT.

> > > >>

> > > >> This approach does in fact. With LOCAL_CPU scope, the moment a defective

> > > >> CPU turns up, we mark the "capability" and thus the kernel cannot use

> > > >> the neon then onwards, unlike the existing case where we have time till

> > > >> we boot all the CPUs (even when the boot CPU may be defective).

> > > >

> > > > I guess that makes sense.

> > > >

> > > > I'm now wondering what happens if anything tries to use kernel-mode NEON

> > > > before SVE is initialised -- which doesn't happen until cpufeatures

> > > > configures the system features.

> > > >

> > > > I don't think your proposed change makes anything worse here, but it may

> > > > need looking into.

> > >

> > > We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE

> > > is initialised ?

> >

> > Could do, at least as an experiment.

> >

> > Ard, do you have any thoughts on this?

> >

> 

> All in-kernel NEON code checks whether the NEON is usable, so I'd

> expect that check to return 'false' if it is too early in the boot for

> the NEON to be used at all.


My concern is that the check may be done once, at probe time, for crypto
drivers.  If probing happens before system_supports_fpsimd() has
stabilised, we may be stuck with the wrong probe decision.

So: are crypto drivers and kernel_mode_neon() users definitely only
probed _after_ all early CPUs are up?

Cheers
---Dave
Ard Biesheuvel Oct. 15, 2019, 10:30 a.m. UTC | #12
On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@arm.com> wrote:
>

> On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:

> > On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:

> > >

> > > On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:

> > > >

> > > >

> > > > On 14/10/2019 15:52, Dave Martin wrote:

> > > > > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:

> > > > >>

> > > > >>

> > > > >> On 11/10/2019 15:21, Dave Martin wrote:

> > > > >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

> > > > >>>>

> > > > >>>> On 11/10/2019 12:36, Dave Martin wrote:

> > > > >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

> > > > >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

> > > > >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

> > > > >>>>>> after all the SMP CPUs are brought up. However, we use the status

> > > > >>>>>> of this capability for every context switch. So, let us change

> > > > >>>>>> the scop to LOCAL_CPU to allow the detection of this capability

> > > > >>>>>> as and when the first CPU without FP is brought up.

> > > > >>>>>>

> > > > >>>>>> Also, the current type allows hotplugged CPU to be brought up without

> > > > >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

> > > > >>>>>> up. Fix both of these issues by changing the capability to

> > > > >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

> > > > >>>>>>

> > > > >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

> > > > >>>>>> Cc: Will Deacon <will@kernel.org>

> > > > >>>>>> Cc: Mark Rutland <mark.rutland@arm.com>

> > > > >>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

> > > > >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

> > > > >>>>>> ---

> > > > >>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-

> > > > >>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)

> > > > >>>>>>

> > > > >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

> > > > >>>>>> index 9323bcc40a58..0f9eace6c64b 100644

> > > > >>>>>> --- a/arch/arm64/kernel/cpufeature.c

> > > > >>>>>> +++ b/arch/arm64/kernel/cpufeature.c

> > > > >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

> > > > >>>>>>        {

> > > > >>>>>>                /* FP/SIMD is not implemented */

> > > > >>>>>>                .capability = ARM64_HAS_NO_FPSIMD,

> > > > >>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,

> > > > >>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

> > > > >>>>>

> > > > >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

> > > > >>>>>

> > > > >>>>> Although we have other things that smell like this (CPU errata for

> > > > >>>>> example), I wonder whether inverting the meaning in the case would

> > > > >>>>> make the situation easier to understand.

> > > > >>>>

> > > > >>>> Yes, it is indeed a disability, more on that below.

> > > > >>>>

> > > > >>>>>

> > > > >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

> > > > >>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

> > > > >>>>> IIUC.  We'd just need to invert the sense of the check in

> > > > >>>>> system_supports_fpsimd().

> > > > >>>>

> > > > >>>> This is particularly something we want to avoid with this patch. We want

> > > > >>>> to make sure that we have the up-to-date status of the disability right

> > > > >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

> > > > >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

> > > > >>>> you must wait until all the CPUs are up, unlike the negated capability.

> > > > >>>

> > > > >>> I don't see why waiting for the random defective early CPU to come up is

> > > > >>> better than waiting for all the early CPUs to come up and then deciding.

> > > > >>>

> > > > >>> Kernel-mode NEON aside, the status of this cap should not matter until

> > > > >>> we enter userspace for the first time.

> > > > >>>

> > > > >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

> > > > >>> probe for it before all early CPUs are up, and so cache the wrong

> > > > >>> decision.  The current approach doesn't cope with that anyway AFAICT.

> > > > >>

> > > > >> This approach does in fact. With LOCAL_CPU scope, the moment a defective

> > > > >> CPU turns up, we mark the "capability" and thus the kernel cannot use

> > > > >> the neon then onwards, unlike the existing case where we have time till

> > > > >> we boot all the CPUs (even when the boot CPU may be defective).

> > > > >

> > > > > I guess that makes sense.

> > > > >

> > > > > I'm now wondering what happens if anything tries to use kernel-mode NEON

> > > > > before SVE is initialised -- which doesn't happen until cpufeatures

> > > > > configures the system features.

> > > > >

> > > > > I don't think your proposed change makes anything worse here, but it may

> > > > > need looking into.

> > > >

> > > > We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE

> > > > is initialised ?

> > >

> > > Could do, at least as an experiment.

> > >

> > > Ard, do you have any thoughts on this?

> > >

> >

> > All in-kernel NEON code checks whether the NEON is usable, so I'd

> > expect that check to return 'false' if it is too early in the boot for

> > the NEON to be used at all.

>

> My concern is that the check may be done once, at probe time, for crypto

> drivers.  If probing happens before system_supports_fpsimd() has

> stabilised, we may be stuck with the wrong probe decision.

>

> So: are crypto drivers and kernel_mode_neon() users definitely only

> probed _after_ all early CPUs are up?

>


Isn't SMP already up when initcalls are processed?
Suzuki K Poulose Oct. 15, 2019, 1:03 p.m. UTC | #13
On 15/10/2019 11:30, Ard Biesheuvel wrote:
> On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@arm.com> wrote:

>>

>> On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:

>>> On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:

>>>>

>>>> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:

>>>>>

>>>>>

>>>>> On 14/10/2019 15:52, Dave Martin wrote:

>>>>>> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:

>>>>>>>

>>>>>>>

>>>>>>> On 11/10/2019 15:21, Dave Martin wrote:

>>>>>>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave

>>>>>>>>>

>>>>>>>>> On 11/10/2019 12:36, Dave Martin wrote:

>>>>>>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:

>>>>>>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies

>>>>>>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only

>>>>>>>>>>> after all the SMP CPUs are brought up. However, we use the status

>>>>>>>>>>> of this capability for every context switch. So, let us change

>>>>>>>>>>> the scop to LOCAL_CPU to allow the detection of this capability

>>>>>>>>>>> as and when the first CPU without FP is brought up.

>>>>>>>>>>>

>>>>>>>>>>> Also, the current type allows hotplugged CPU to be brought up without

>>>>>>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace

>>>>>>>>>>> up. Fix both of these issues by changing the capability to

>>>>>>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

>>>>>>>>>>>

>>>>>>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")

>>>>>>>>>>> Cc: Will Deacon <will@kernel.org>

>>>>>>>>>>> Cc: Mark Rutland <mark.rutland@arm.com>

>>>>>>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>

>>>>>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

>>>>>>>>>>> ---

>>>>>>>>>>>    arch/arm64/kernel/cpufeature.c | 2 +-

>>>>>>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)

>>>>>>>>>>>

>>>>>>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c

>>>>>>>>>>> index 9323bcc40a58..0f9eace6c64b 100644

>>>>>>>>>>> --- a/arch/arm64/kernel/cpufeature.c

>>>>>>>>>>> +++ b/arch/arm64/kernel/cpufeature.c

>>>>>>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {

>>>>>>>>>>>         {

>>>>>>>>>>>                 /* FP/SIMD is not implemented */

>>>>>>>>>>>                 .capability = ARM64_HAS_NO_FPSIMD,

>>>>>>>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,

>>>>>>>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

>>>>>>>>>>

>>>>>>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

>>>>>>>>>>

>>>>>>>>>> Although we have other things that smell like this (CPU errata for

>>>>>>>>>> example), I wonder whether inverting the meaning in the case would

>>>>>>>>>> make the situation easier to understand.

>>>>>>>>>

>>>>>>>>> Yes, it is indeed a disability, more on that below.

>>>>>>>>>

>>>>>>>>>>

>>>>>>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field

>>>>>>>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE

>>>>>>>>>> IIUC.  We'd just need to invert the sense of the check in

>>>>>>>>>> system_supports_fpsimd().

>>>>>>>>>

>>>>>>>>> This is particularly something we want to avoid with this patch. We want

>>>>>>>>> to make sure that we have the up-to-date status of the disability right

>>>>>>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE

>>>>>>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,

>>>>>>>>> you must wait until all the CPUs are up, unlike the negated capability.

>>>>>>>>

>>>>>>>> I don't see why waiting for the random defective early CPU to come up is

>>>>>>>> better than waiting for all the early CPUs to come up and then deciding.

>>>>>>>>

>>>>>>>> Kernel-mode NEON aside, the status of this cap should not matter until

>>>>>>>> we enter userspace for the first time.

>>>>>>>>

>>>>>>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON

>>>>>>>> probe for it before all early CPUs are up, and so cache the wrong

>>>>>>>> decision.  The current approach doesn't cope with that anyway AFAICT.

>>>>>>>

>>>>>>> This approach does in fact. With LOCAL_CPU scope, the moment a defective

>>>>>>> CPU turns up, we mark the "capability" and thus the kernel cannot use

>>>>>>> the neon then onwards, unlike the existing case where we have time till

>>>>>>> we boot all the CPUs (even when the boot CPU may be defective).

>>>>>>

>>>>>> I guess that makes sense.

>>>>>>

>>>>>> I'm now wondering what happens if anything tries to use kernel-mode NEON

>>>>>> before SVE is initialised -- which doesn't happen until cpufeatures

>>>>>> configures the system features.

>>>>>>

>>>>>> I don't think your proposed change makes anything worse here, but it may

>>>>>> need looking into.

>>>>>

>>>>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE

>>>>> is initialised ?

>>>>

>>>> Could do, at least as an experiment.

>>>>

>>>> Ard, do you have any thoughts on this?

>>>>

>>>

>>> All in-kernel NEON code checks whether the NEON is usable, so I'd

>>> expect that check to return 'false' if it is too early in the boot for

>>> the NEON to be used at all.

>>

>> My concern is that the check may be done once, at probe time, for crypto

>> drivers.  If probing happens before system_supports_fpsimd() has

>> stabilised, we may be stuck with the wrong probe decision.

>>

>> So: are crypto drivers and kernel_mode_neon() users definitely only

>> probed _after_ all early CPUs are up?

>>

> 

> Isn't SMP already up when initcalls are processed?


Not all of them. Booting with initcall_debug=1 shows the following :

--

  // trimmed //

[    0.000000] NR_IRQS:64 nr_irqs:64 0 
 
 

[    0.000000] GIC: Using split EOI/Deactivate mode 
 
 

[    0.000000] CPU0: found redistributor 0 region 0:0x000000002f100000 
 
 

[    0.000000] Architected cp15 timer(s) running at 100.00MHz (phys). 
 
 

[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 
0x171024e7e0, max_idle_ns: 440795205315 ns
[    0.000029] sched_clock: 56 bits at 100MHz, resolution 10ns, wraps every 
4398046511100ns
[    0.000989] Console: colour dummy device 80x25 

[    0.001049] Calibrating delay loop (skipped), value calculated using timer 
frequency.. 200.00 BogoMIPS (lpj=400000)
[    0.001149] pid_max: default: 32768 minimum: 301 

[    0.001549] Security Framework initialized 

[    0.001802] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes) 

[    0.001849] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes) 

[    0.004949] Initializing cgroup subsys io
[    0.005042] Initializing cgroup subsys memory
[    0.005079] Initializing cgroup subsys devices
[    0.005149] Initializing cgroup subsys perf_event
[    0.005255] Initializing cgroup subsys hugetlb
[    0.005255] Initializing cgroup subsys pids

[    0.006002] calling  cpu_suspend_init+0x0/0x78 @ 1 

[    0.006062] initcall cpu_suspend_init+0x0/0x78 returned 0 after 0 usecs
[    0.006149] calling  arm64_enable_runtime_services+0x0/0x200 @ 1
[    0.006225] EFI services will not be available. 

[    0.006249] initcall arm64_enable_runtime_services+0x0/0x200 returned 0 after 
0 usecs
[    0.006389] calling  asids_init+0x0/0xf8 @ 1
[    0.006449] ASID allocator initialised with 65536 entries 

[    0.006535] initcall asids_init+0x0/0xf8 returned 0 after 0 usecs
[    0.006553] calling  xen_guest_init+0x0/0x1d8 @ 1 

[    0.006649] initcall xen_guest_init+0x0/0x1d8 returned 0 after 0 usecs
[    0.006749] calling  spawn_ksoftirqd+0x0/0x40 @ 1 

[    0.007749] initcall spawn_ksoftirqd+0x0/0x40 returned 0 after 3906 usecs
[    0.007864] calling  init_workqueues+0x0/0x3ec @ 1 

[    0.019869] initcall init_workqueues+0x0/0x3ec returned 0 after 11718 usecs
[    0.019988] calling  migration_init+0x0/0x84 @ 1 

[    0.020082] initcall migration_init+0x0/0x84 returned 0 after 0 usecs
[    0.020189] calling  check_cpu_stall_init+0x0/0x28 @ 1 

[    0.020316] initcall check_cpu_stall_init+0x0/0x28 returned 0 after 0 usecs
[    0.020449] calling  rcu_spawn_gp_kthread+0x0/0x12c @ 1 

[    0.020971] initcall rcu_spawn_gp_kthread+0x0/0x12c returned 0 after 0 usecs
[    0.021049] calling  cpu_stop_init+0x0/0xe0 @ 1 

[    0.023815] initcall cpu_stop_init+0x0/0xe0 returned 0 after 3906 usecs
[    0.023922] calling  jump_label_init_module+0x0/0x20 @ 1 

[    0.023949] initcall jump_label_init_module+0x0/0x20 returned 0 after 0 usecs
[    0.024084] calling  its_pci_msi_init+0x0/0xec @ 1 

[    0.024249] /interrupt-controller@2f000000/its@2f020000: unable to locate ITS 
domain
[    0.024349] initcall its_pci_msi_init+0x0/0xec returned 0 after 0 usecs
[    0.024455] calling  its_pmsi_init+0x0/0xec @ 1 

[    0.024576] /interrupt-controller@2f000000/its@2f020000: unable to locate ITS 
domain
[    0.024669] initcall its_pmsi_init+0x0/0xec returned 0 after 0 usecs
[    0.024849] calling  tegra_init_fuse+0x0/0x150 @ 1 

[    0.025095] initcall tegra_init_fuse+0x0/0x150 returned 0 after 0 usecs
[    0.025231] calling  tegra_pmc_early_init+0x0/0xfc @ 1 

[    0.025749] initcall tegra_pmc_early_init+0x0/0xfc returned 0 after 0 usecs
[    0.025886] calling  rand_initialize+0x0/0x40 @ 1 

[    0.026849] initcall rand_initialize+0x0/0x40 returned 0 after 0 usecs
[    0.026949] calling  dummy_timer_register+0x0/0x54 @ 1 

[    0.027033] initcall dummy_timer_register+0x0/0x54 returned 0 after 0 usecs


[    0.035949] Detected PIPT I-cache on CPU1 

[    0.036049] CPU1: found redistributor 1 region 0:0x000000002f120000
[    0.036082] CPU1: Booted secondary processor [410fd0f0]
[    0.048049] Detected PIPT I-cache on CPU2 

[    0.048149] CPU2: found redistributor 2 region 0:0x000000002f140000
[    0.048168] CPU2: Booted secondary processor [410fd0f0]
[    0.060249] Detected PIPT I-cache on CPU3 

[    0.060349] CPU3: found redistributor 3 region 0:0x000000002f160000
[    0.060402] CPU3: Booted secondary processor [410fd0f0]
[    0.060620] Brought up 4 CPUs
[    0.060949] SMP: Total of 4 processors activated.


Cheers
Suzuki
Ard Biesheuvel Oct. 15, 2019, 1:11 p.m. UTC | #14
On Tue, 15 Oct 2019 at 15:03, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>

>

>

> On 15/10/2019 11:30, Ard Biesheuvel wrote:

> > On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@arm.com> wrote:

> >>

> >> On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:

> >>> On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:

> >>>>

> >>>> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:

> >>>>>

> >>>>>

> >>>>> On 14/10/2019 15:52, Dave Martin wrote:

> >>>>>>

...
> >>>>>> I'm now wondering what happens if anything tries to use kernel-mode NEON

> >>>>>> before SVE is initialised -- which doesn't happen until cpufeatures

> >>>>>> configures the system features.

> >>>>>>

> >>>>>> I don't think your proposed change makes anything worse here, but it may

> >>>>>> need looking into.

> >>>>>

> >>>>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE

> >>>>> is initialised ?

> >>>>

> >>>> Could do, at least as an experiment.

> >>>>

> >>>> Ard, do you have any thoughts on this?

> >>>>

> >>>

> >>> All in-kernel NEON code checks whether the NEON is usable, so I'd

> >>> expect that check to return 'false' if it is too early in the boot for

> >>> the NEON to be used at all.

> >>

> >> My concern is that the check may be done once, at probe time, for crypto

> >> drivers.  If probing happens before system_supports_fpsimd() has

> >> stabilised, we may be stuck with the wrong probe decision.

> >>

> >> So: are crypto drivers and kernel_mode_neon() users definitely only

> >> probed _after_ all early CPUs are up?

> >>

> >

> > Isn't SMP already up when initcalls are processed?

>

> Not all of them. Booting with initcall_debug=1 shows the following :

>

> --

>

>   // trimmed //

>

...
> [    0.027033] initcall dummy_timer_register+0x0/0x54 returned 0 after 0 usecs

>

>

> [    0.035949] Detected PIPT I-cache on CPU1

>

> [    0.036049] CPU1: found redistributor 1 region 0:0x000000002f120000

> [    0.036082] CPU1: Booted secondary processor [410fd0f0]

> [    0.048049] Detected PIPT I-cache on CPU2

>

> [    0.048149] CPU2: found redistributor 2 region 0:0x000000002f140000

> [    0.048168] CPU2: Booted secondary processor [410fd0f0]

> [    0.060249] Detected PIPT I-cache on CPU3

>

> [    0.060349] CPU3: found redistributor 3 region 0:0x000000002f160000

> [    0.060402] CPU3: Booted secondary processor [410fd0f0]

> [    0.060620] Brought up 4 CPUs

> [    0.060949] SMP: Total of 4 processors activated.

>

>


These are all early initcalls, which are actually documented as
running before SMP, and before 'pure' initcalls, which should only be
used to initialize global variables that cannot be initialized
statically. So I think we can safely disregard these as uses of kernel
mode NEON we should care about.

But I would still expect may_use_simd() to return the right value
here, independently of the logic that reasons about whether we have a
NEON in the first place.
Dave Martin Oct. 15, 2019, 2:05 p.m. UTC | #15
On Tue, Oct 15, 2019 at 12:30:15PM +0200, Ard Biesheuvel wrote:
> On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@arm.com> wrote:

> >

> > On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:


[...]

> > > All in-kernel NEON code checks whether the NEON is usable, so I'd

> > > expect that check to return 'false' if it is too early in the boot for

> > > the NEON to be used at all.

> >

> > My concern is that the check may be done once, at probe time, for crypto

> > drivers.  If probing happens before system_supports_fpsimd() has

> > stabilised, we may be stuck with the wrong probe decision.

> >

> > So: are crypto drivers and kernel_mode_neon() users definitely only

> > probed _after_ all early CPUs are up?

> >

> 

> Isn't SMP already up when initcalls are processed?


That was my original assumption when developing SVE.  I think I
convinced myself that it was valid, but it sounds worth reinvestigating.

Assuming the assumption _is_ valid, then dropping a suitable WARN() into
system_supports_fpsimd() or cpu_has_neon() or similar may be a good idea.

Cheers
---Dave
diff mbox series

Patch

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9323bcc40a58..0f9eace6c64b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1361,7 +1361,7 @@  static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		/* FP/SIMD is not implemented */
 		.capability = ARM64_HAS_NO_FPSIMD,
-		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
 		.min_field_value = 0,
 		.matches = has_no_fpsimd,
 	},