mbox series

[0/6] add system vulnerability sysfs entries

Message ID 20181206234408.1287689-1-jeremy.linton@arm.com
Headers show
Series add system vulnerability sysfs entries | expand

Message

Jeremy Linton Dec. 6, 2018, 11:44 p.m. UTC
Part of this series was originally by Mian Yousaf Kaukab.

Arm64 machines should be displaying a human readable
vulnerability status to speculative execution attacks in
/sys/devices/system/cpu/vulnerabilities 

This series enables that behavior by providing the expected
functions. Those functions expose the cpu errata and feature
states, as well as whether firmware is responding appropriately
to display the overall machine status. This means that in a
heterogeneous machine we will only claim the machine is mitigated
or safe if we are confident all booted cores are safe or
mitigated. Otherwise, we will display unknown or unsafe
depending on how much of the machine configuration can
be assured.

Jeremy Linton (2):
  arm64: add sysfs vulnerability show for meltdown
  arm64: add sysfs vulnerability show for spectre v2

Mian Yousaf Kaukab (4):
  arm64: kpti: move check for non-vulnerable CPUs to a function
  arm64: add sysfs vulnerability show for spectre v1
  arm64: add sysfs vulnerability show for speculative store bypass
  arm64: enable generic CPU vulnerabilites support

 arch/arm64/Kconfig             |   1 +
 arch/arm64/kernel/cpu_errata.c | 110 +++++++++++++++++++++++++++++++--
 arch/arm64/kernel/cpufeature.c |  45 +++++++++++---
 3 files changed, 143 insertions(+), 13 deletions(-)

-- 
2.17.2

Comments

Jeremy Linton Dec. 12, 2018, 3:48 p.m. UTC | #1
Hi Dave,

Thanks for looking at this!

On 12/13/2018 06:07 AM, Dave Martin wrote:
> On Thu, Dec 06, 2018 at 05:44:02PM -0600, Jeremy Linton wrote:

>> Part of this series was originally by Mian Yousaf Kaukab.

>>

>> Arm64 machines should be displaying a human readable

>> vulnerability status to speculative execution attacks in

>> /sys/devices/system/cpu/vulnerabilities

> 

> Is there any agreement on the strings that will be returned in there?

> 

> A quick search didn't find anything obvious upstream.  There is

> documentation proposed in [1], but I don't know what happened to it and

> it doesn't define the mitigation strings at all.  (I didn't follow the

> discussion, so there is likely background here I'm not aware of.)

> 

> If the mitigation strings are meaningful at all, they really ought to be

> documented somewhere since this is ABI.


I think they are in testing? But that documentation is missing the 
"Unknown" state which tends to be our default case for "we don't have a 
complete white/black list", or "mitigation disabled, we don't know if 
your vulnerable", etc.

I'm not sure I like the "Unknown" state, but we can try to add it to the 
documentation.

> 

>> This series enables that behavior by providing the expected

>> functions. Those functions expose the cpu errata and feature

>> states, as well as whether firmware is responding appropriately

>> to display the overall machine status. This means that in a

>> heterogeneous machine we will only claim the machine is mitigated

>> or safe if we are confident all booted cores are safe or

>> mitigated. Otherwise, we will display unknown or unsafe

>> depending on how much of the machine configuration can

>> be assured.

> 

> Can the vulnerability status change once we enter userspace?


Generally no, for heterogeneous machines I think the answer here is yes, 
a user could check the state, and have it read "Not affected" then bring 
another core online which causes the state to change at which point if 
they re-read /sys it may reflect another state. OTOH, given that we tend 
to default to mitigated usually this shouldn't be an issue unless 
someone has disabled the mitigation.


> 

> I see no locking or other concurrency protections, and various global

> variables that could be __ro_after_init if nothing will change them

> after boot.


I think the state changes are all protected due to the fact the bringing 
a core online/offline is serialized.

> 

> If they can change after boot, userspace has no way to be notified,


Is checking on hotplug notification sufficient?


> 

> (I haven't grokked the patches fully, so the answer to this question may

> be reasonably straightforward...)

> 

> 

> Cheers

> ---Dave

> 

> [1] https://lkml.org/lkml/2018/1/8/145

>
Dave Martin Dec. 13, 2018, 12:07 p.m. UTC | #2
On Thu, Dec 06, 2018 at 05:44:02PM -0600, Jeremy Linton wrote:
> Part of this series was originally by Mian Yousaf Kaukab.

> 

> Arm64 machines should be displaying a human readable

> vulnerability status to speculative execution attacks in

> /sys/devices/system/cpu/vulnerabilities 


Is there any agreement on the strings that will be returned in there?

A quick search didn't find anything obvious upstream.  There is
documentation proposed in [1], but I don't know what happened to it and
it doesn't define the mitigation strings at all.  (I didn't follow the
discussion, so there is likely background here I'm not aware of.)

If the mitigation strings are meaningful at all, they really ought to be
documented somewhere since this is ABI.

> This series enables that behavior by providing the expected

> functions. Those functions expose the cpu errata and feature

> states, as well as whether firmware is responding appropriately

> to display the overall machine status. This means that in a

> heterogeneous machine we will only claim the machine is mitigated

> or safe if we are confident all booted cores are safe or

> mitigated. Otherwise, we will display unknown or unsafe

> depending on how much of the machine configuration can

> be assured.


Can the vulnerability status change once we enter userspace?

I see no locking or other concurrency protections, and various global
variables that could be __ro_after_init if nothing will change them
after boot.

If they can change after boot, userspace has no way to be notified,

(I haven't grokked the patches fully, so the answer to this question may
be reasonably straightforward...)


Cheers
---Dave

[1] https://lkml.org/lkml/2018/1/8/145
Dave Martin Dec. 13, 2018, 7:26 p.m. UTC | #3
On Wed, Dec 12, 2018 at 09:48:03AM -0600, Jeremy Linton wrote:
> Hi Dave,

> 

> Thanks for looking at this!

> 

> On 12/13/2018 06:07 AM, Dave Martin wrote:

> >On Thu, Dec 06, 2018 at 05:44:02PM -0600, Jeremy Linton wrote:

> >>Part of this series was originally by Mian Yousaf Kaukab.

> >>

> >>Arm64 machines should be displaying a human readable

> >>vulnerability status to speculative execution attacks in

> >>/sys/devices/system/cpu/vulnerabilities

> >

> >Is there any agreement on the strings that will be returned in there?

> >

> >A quick search didn't find anything obvious upstream.  There is

> >documentation proposed in [1], but I don't know what happened to it and

> >it doesn't define the mitigation strings at all.  (I didn't follow the

> >discussion, so there is likely background here I'm not aware of.)

> >

> >If the mitigation strings are meaningful at all, they really ought to be

> >documented somewhere since this is ABI.

> 

> I think they are in testing? But that documentation is missing the "Unknown"

> state which tends to be our default case for "we don't have a complete

> white/black list", or "mitigation disabled, we don't know if your

> vulnerable", etc.

> 

> I'm not sure I like the "Unknown" state, but we can try to add it to the

> documentation.

> 

> >

> >>This series enables that behavior by providing the expected

> >>functions. Those functions expose the cpu errata and feature

> >>states, as well as whether firmware is responding appropriately

> >>to display the overall machine status. This means that in a

> >>heterogeneous machine we will only claim the machine is mitigated

> >>or safe if we are confident all booted cores are safe or

> >>mitigated. Otherwise, we will display unknown or unsafe

> >>depending on how much of the machine configuration can

> >>be assured.

> >

> >Can the vulnerability status change once we enter userspace?

> 

> Generally no, for heterogeneous machines I think the answer here is yes, a

> user could check the state, and have it read "Not affected" then bring

> another core online which causes the state to change at which point if they


This feels like a potential bug, since userspace may already be making
assumptions about the vulnerability state by this point.

Shouldn't we reject late cpus that are noncompliant with the status quo
(as for unreconcilable cpu feature mismatches)?  I thought we already
did this for some mitigations.

> re-read /sys it may reflect another state. OTOH, given that we tend to

> default to mitigated usually this shouldn't be an issue unless someone has

> disabled the mitigation.

> 

> 

> >

> >I see no locking or other concurrency protections, and various global

> >variables that could be __ro_after_init if nothing will change them

> >after boot.

> 

> I think the state changes are all protected due to the fact the bringing a

> core online/offline is serialized.


Only with each other, not with random sysfs code running on another
online cpu.

If the concurrently accessed variables are only of fundamental integer
types there is unlikely to be a problem in practice, since transforming
a single, small isolated access into a memcpy() or similar (although
permitted by the language) is very unlikely to be a performance win --
so the compiler is unlikely to do it.

This is rather theoretical: if nobody else is bothering with atomics or
locking here, it may not be worth us adding anything specific just for
arm64.

[...]

Cheers
---Dave