diff mbox series

[v2,1/6] arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB

Message ID 20241213165201.v2.1.I2040fa004dafe196243f67ebcc647cbedbb516e6@changeid
State New
Headers show
Series arm64: errata: Rework Spectre BHB mitigations to not assume "safe" | expand

Commit Message

Doug Anderson Dec. 14, 2024, 12:52 a.m. UTC
The code for detecting CPUs that are vulnerable to Spectre BHB was
based on a hardcoded list of CPU IDs that were known to be affected.
Unfortunately, the list mostly only contained the IDs of standard ARM
cores. The IDs for many cores that are minor variants of the standard
ARM cores (like many Qualcomm Kyro CPUs) weren't listed. This led the
code to assume that those variants were not affected.

Flip the code on its head and instead list CPU IDs for cores that are
known to be _not_ affected. Now CPUs will be assumed vulnerable until
added to the list saying that they're safe.

As of right now, the only CPU IDs added to the "unaffected" list are
ARM Cortex A35, A53, and A55. This list was created by looking at
older cores listed in cputype.h that weren't listed in the "affected"
list previously.

Unfortunately, while this solution is better than what we had before,
it's still an imperfect solution. Specifically there are two ways to
mitigate Spectre BHB and one of those ways is parameterized with a "k"
value indicating how many loops are needed to mitigate. If we have an
unknown CPU ID then we've got to guess about how to mitigate it. Since
more cores seem to be mitigated by looping (and because it's unlikely
that the needed FW code will be in place for FW mitigation for unknown
cores), we'll choose looping for unknown CPUs and choose the highest
"k" value of 32.

The downside of our guessing is that some CPUs may now report as
"mitigated" when in reality they should need a firmware mitigation.
We'll choose to put a WARN_ON splat in the logs in this case any time
we had to make a guess since guessing the right mitigation is pretty
awful. Hopefully this will encourage CPU vendors to add their CPU IDs
to the list.


Fixes: 558c303c9734 ("arm64: Mitigate spectre style branch history side channels")
Cc: stable@vger.kernel.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v2:
- New

 arch/arm64/kernel/proton-pack.c | 46 +++++++++++++++++++++++++++------
 1 file changed, 38 insertions(+), 8 deletions(-)

Comments

Julius Werner Dec. 14, 2024, 2:25 a.m. UTC | #1
I feel like this patch is maybe taking a bit of a wrong angle at
achieving what you're trying to do. The code seems to be structured in
a way that it has separate functions to test for each of the possible
mitigation options one at a time, and pushing the default case into
spectre_bhb_loop_affected() feels like a bit of a layering violation.
I think it would work the way you wrote it, but it depends heavily on
the order functions are called in is_spectre_bhb_affected(), which
seems counter-intuitive with the way the functions seem to be designed
as independent checks.

What do you think about an approach like this instead:

- Refactor max_bhb_k in spectre_bhb_loop_affected() to be a global
instead, which starts out as zero, is updated by
spectre_bhb_loop_affected(), and is directly read by
spectre_bhb_patch_loop_iter() (could probably remove the `scope`
argument from spectre_bhb_loop_affected() then).

- Add a function is_spectre_bhb_safe() that just checks if the MIDR is
in the new list you're adding

- Add an `if (is_spectre_bhb_safe()) return false;` to the top of
is_spectre_bhb_affected

- Change the `return false` into `return true` at the end of
is_spectre_bhb_affected (in fact, you can probably take out some of
the other calls that result in returning true as well then)

- In spectre_bhb_enable_mitigations(), at the end of the long if-else
chain, put a last else block that prints your WARN_ONCE(), sets the
max_bhb_k global to 32, and then does the same stuff that the `if
(spectre_bhb_loop_affected())` block would have installed (maybe
factoring that out into a helper function called from both cases).

I think that should implement the same "assume unsafe by default
unless explicitly listed as safe, check for all other mitigations
first, then default to k=32 loop if none found" approach, but makes it
a bit more obvious in the code.
Doug Anderson Dec. 16, 2024, 9:29 p.m. UTC | #2
Hi,

On Fri, Dec 13, 2024 at 6:25 PM Julius Werner <jwerner@chromium.org> wrote:
>
> I feel like this patch is maybe taking a bit of a wrong angle at
> achieving what you're trying to do. The code seems to be structured in
> a way that it has separate functions to test for each of the possible
> mitigation options one at a time, and pushing the default case into
> spectre_bhb_loop_affected() feels like a bit of a layering violation.
> I think it would work the way you wrote it, but it depends heavily on
> the order functions are called in is_spectre_bhb_affected(), which
> seems counter-intuitive with the way the functions seem to be designed
> as independent checks.
>
> What do you think about an approach like this instead:
>
> - Refactor max_bhb_k in spectre_bhb_loop_affected() to be a global
> instead, which starts out as zero, is updated by
> spectre_bhb_loop_affected(), and is directly read by
> spectre_bhb_patch_loop_iter() (could probably remove the `scope`
> argument from spectre_bhb_loop_affected() then).

Refactoring "max_bhb_k" would be a general cleanup and not related to
anything else here, right?

...but the function is_spectre_bhb_affected() is called from
"cpu_errata.c" and has a scope. Can we guarantee that it's always
"SCOPE_LOCAL_CPU"? I tried reading through the code and it's
_probably_ SCOPE_LOCAL_CPU most of the time, but it doesn't seem worth
it to add an assumption here for a small cleanup.

I'm not going to do this, though I will move "max_bhb_k" to be a
global for the suggestion below.


> - Add a function is_spectre_bhb_safe() that just checks if the MIDR is
> in the new list you're adding
>
> - Add an `if (is_spectre_bhb_safe()) return false;` to the top of
> is_spectre_bhb_affected

That seems reasonable to do and lets me get rid of the "safe" checks
from is_spectre_bhb_fw_affected() and spectre_bhb_loop_affected(), so
it sounds good.


> - Change the `return false` into `return true` at the end of
> is_spectre_bhb_affected (in fact, you can probably take out some of
> the other calls that result in returning true as well then)

I'm not sure you can take out the other calls. Specifically, both
spectre_bhb_loop_affected() and is_spectre_bhb_fw_affected() _need_ to
be called for each CPU so that they update static globals, right?
Maybe we could get rid of the call to supports_clearbhb(), but that
_would_ change things in ways that are not obvious. Specifically I
could believe that someone could have backported "clear BHB" to their
core but their core is otherwise listed as "loop affected". That would
affect "max_bhb_k". Maybe (?) it doesn't matter in this case, but I'd
rather not mess with it unless someone really wants me to and is sure
it's safe.


> - In spectre_bhb_enable_mitigations(), at the end of the long if-else
> chain, put a last else block that prints your WARN_ONCE(), sets the
> max_bhb_k global to 32, and then does the same stuff that the `if
> (spectre_bhb_loop_affected())` block would have installed (maybe
> factoring that out into a helper function called from both cases).

...or just reorder it so that the spectre_bhb_loop_affected() code is
last and it can be the "else". Then I can WARN_ONCE() if
spectre_bhb_loop_affected(). ...or I could just do the WARN_ONCE()
when I get to the end of is_spectre_bhb_affected() and set "max_bhb_k"
to 32 there. I'd actually rather do that so that "max_bhb_k" is
consistently set after is_spectre_bhb_affected() is called on all
cores regardless of whether or not some cores are unknown.
diff mbox series

Patch

diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
index da53722f95d4..39c5573c7527 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -841,13 +841,31 @@  enum bhb_mitigation_bits {
 };
 static unsigned long system_bhb_mitigations;
 
+static const struct midr_range spectre_bhb_firmware_mitigated_list[] = {
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
+	{},
+};
+
+static const struct midr_range spectre_bhb_safe_list[] = {
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
+	{},
+};
+
 /*
  * This must be called with SCOPE_LOCAL_CPU for each type of CPU, before any
  * SCOPE_SYSTEM call will give the right answer.
+ *
+ * NOTE: Unknown CPUs are reported as affected. In order to make this work
+ * and still keep the list short, only handle CPUs where:
+ * - supports_csv2p3() returned false
+ * - supports_clearbhb() returned false.
  */
 u8 spectre_bhb_loop_affected(int scope)
 {
-	u8 k = 0;
+	u8 k;
 	static u8 max_bhb_k;
 
 	if (scope == SCOPE_LOCAL_CPU) {
@@ -886,6 +904,16 @@  u8 spectre_bhb_loop_affected(int scope)
 			k = 11;
 		else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
 			k =  8;
+		else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_safe_list) ||
+			 is_midr_in_range_list(read_cpuid_id(), spectre_bhb_firmware_mitigated_list))
+			k =  0;
+		else {
+			WARN_ONCE(true,
+				 "Unrecognized CPU %#010x, assuming Spectre BHB vulnerable\n",
+				 read_cpuid_id());
+			/* Hopefully k = 32 handles the worst case for unknown CPUs */
+			k = 32;
+		}
 
 		max_bhb_k = max(max_bhb_k, k);
 	} else {
@@ -916,24 +944,26 @@  static enum mitigation_state spectre_bhb_get_cpu_fw_mitigation_state(void)
 	}
 }
 
+/*
+ * NOTE: Unknown CPUs are reported as affected. In order to make this work
+ * and still keep the list short, only handle CPUs where:
+ * - supports_csv2p3() returned false
+ * - supports_clearbhb() returned false.
+ * - spectre_bhb_loop_affected() returned 0.
+ */
 static bool is_spectre_bhb_fw_affected(int scope)
 {
 	static bool system_affected;
 	enum mitigation_state fw_state;
 	bool has_smccc = arm_smccc_1_1_get_conduit() != SMCCC_CONDUIT_NONE;
-	static const struct midr_range spectre_bhb_firmware_mitigated_list[] = {
-		MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
-		MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
-		{},
-	};
 	bool cpu_in_list = is_midr_in_range_list(read_cpuid_id(),
-					 spectre_bhb_firmware_mitigated_list);
+						 spectre_bhb_safe_list);
 
 	if (scope != SCOPE_LOCAL_CPU)
 		return system_affected;
 
 	fw_state = spectre_bhb_get_cpu_fw_mitigation_state();
-	if (cpu_in_list || (has_smccc && fw_state == SPECTRE_MITIGATED)) {
+	if (!cpu_in_list || (has_smccc && fw_state == SPECTRE_MITIGATED)) {
 		system_affected = true;
 		return true;
 	}