diff mbox series

[v2,03/13] firmware: google: Test spinlock on panic path to avoid lockups

Message ID 20220719195325.402745-4-gpiccoli@igalia.com
State Superseded
Headers show
Series The panic notifiers refactor strikes back - fixes/clean-ups | expand

Commit Message

Guilherme G. Piccoli July 19, 2022, 7:53 p.m. UTC
Currently the gsmi driver registers a panic notifier as well as
reboot and die notifiers. The callbacks registered are called in
atomic and very limited context - for instance, panic disables
preemption and local IRQs, also all secondary CPUs (not executing
the panic path) are shutdown.

With that said, taking a spinlock in this scenario is a dangerous
invitation for lockup scenarios. So, fix that by checking if the
spinlock is free to acquire in the panic notifier callback - if not,
bail-out and avoid a potential hang.

Fixes: 74c5b31c6618 ("driver: Google EFI SMI")
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: David Gow <davidgow@google.com>
Cc: Evan Green <evgreen@chromium.org>
Cc: Julius Werner <jwerner@chromium.org>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>

---

V2:
- do not use spin_trylock anymore, to avoid messing with
non-panic paths; now we just check the spinlock state in
the panic notifier before taking it. Thanks Evan for the
review/idea!

 drivers/firmware/google/gsmi.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Guilherme G. Piccoli Aug. 7, 2022, 3:38 p.m. UTC | #1
On 19/07/2022 16:53, Guilherme G. Piccoli wrote:
> Currently the gsmi driver registers a panic notifier as well as
> reboot and die notifiers. The callbacks registered are called in
> atomic and very limited context - for instance, panic disables
> preemption and local IRQs, also all secondary CPUs (not executing
> the panic path) are shutdown.
> 
> With that said, taking a spinlock in this scenario is a dangerous
> invitation for lockup scenarios. So, fix that by checking if the
> spinlock is free to acquire in the panic notifier callback - if not,
> bail-out and avoid a potential hang.
> 
> Fixes: 74c5b31c6618 ("driver: Google EFI SMI")
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: David Gow <davidgow@google.com>
> Cc: Evan Green <evgreen@chromium.org>
> Cc: Julius Werner <jwerner@chromium.org>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> 
> ---
> 
> V2:
> - do not use spin_trylock anymore, to avoid messing with
> non-panic paths; now we just check the spinlock state in
> the panic notifier before taking it. Thanks Evan for the
> review/idea!
> 
>  drivers/firmware/google/gsmi.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> [...]

Hi Evan, do you think this one is good now, based on your previous review?

Appreciate any feedback!
Cheers,


Guilherme
Evan Green Aug. 8, 2022, 5:07 a.m. UTC | #2
On Tue, Jul 19, 2022 at 12:55 PM Guilherme G. Piccoli
<gpiccoli@igalia.com> wrote:
>
> Currently the gsmi driver registers a panic notifier as well as
> reboot and die notifiers. The callbacks registered are called in
> atomic and very limited context - for instance, panic disables
> preemption and local IRQs, also all secondary CPUs (not executing
> the panic path) are shutdown.
>
> With that said, taking a spinlock in this scenario is a dangerous
> invitation for lockup scenarios. So, fix that by checking if the
> spinlock is free to acquire in the panic notifier callback - if not,
> bail-out and avoid a potential hang.
>
> Fixes: 74c5b31c6618 ("driver: Google EFI SMI")
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: David Gow <davidgow@google.com>
> Cc: Evan Green <evgreen@chromium.org>
> Cc: Julius Werner <jwerner@chromium.org>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>

Reviewed-by: Evan Green <evgreen@chromium.org>
Greg KH Aug. 8, 2022, 3:26 p.m. UTC | #3
On Mon, Aug 08, 2022 at 12:14:30PM -0300, Guilherme G. Piccoli wrote:
> On 08/08/2022 02:07, Evan Green wrote:
> > On Tue, Jul 19, 2022 at 12:55 PM Guilherme G. Piccoli
> > <gpiccoli@igalia.com> wrote:
> >>
> >> Currently the gsmi driver registers a panic notifier as well as
> >> reboot and die notifiers. The callbacks registered are called in
> >> atomic and very limited context - for instance, panic disables
> >> preemption and local IRQs, also all secondary CPUs (not executing
> >> the panic path) are shutdown.
> >>
> >> With that said, taking a spinlock in this scenario is a dangerous
> >> invitation for lockup scenarios. So, fix that by checking if the
> >> spinlock is free to acquire in the panic notifier callback - if not,
> >> bail-out and avoid a potential hang.
> >>
> >> Fixes: 74c5b31c6618 ("driver: Google EFI SMI")
> >> Cc: Ard Biesheuvel <ardb@kernel.org>
> >> Cc: David Gow <davidgow@google.com>
> >> Cc: Evan Green <evgreen@chromium.org>
> >> Cc: Julius Werner <jwerner@chromium.org>
> >> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
> > 
> > Reviewed-by: Evan Green <evgreen@chromium.org>
> 
> Thanks a bunch Evan!
> 
> Ard / Greg, do you think you could get this patch through your -next (or
> -fixes) trees? Not sure which tree is the most common for picking GSMI
> stuff.

Picking out an individual patch from a series with as many responses and
threads like this one is quite difficult.

Just resend this as a stand-alone patch if you want it applied
stand-alone as our tools want to apply a whole patch series at once.

> I'm trying to get these fixes merged individually in their trees to not
> stall the whole series and increase the burden of re-submitting.

The burden is on the submitter, not the maintainer as we have more
submitters than reviewers/maintainers.

thanks,

greg k-h
Greg KH Aug. 10, 2022, 12:54 p.m. UTC | #4
On Mon, Aug 08, 2022 at 12:37:46PM -0300, Guilherme G. Piccoli wrote:
> Let me clarify / ask something: this series, for example, is composed as
> a bunch of patches "centered" around the same idea, panic notifiers
> improvements/fixes. But its patches belong to completely different
> subsystems, like EFI/misc, architectures (alpha, parisc, arm), core
> kernel code, etc.
> 
> What is the best way of getting this merged?
> (a) Re-send individual patches with the respective Review/ACK tags to
> the proper subsystem, or;

Yes.

> (b) Wait until the whole series is ACKed/Reviewed, and a single
> maintainer (like you or Andrew, for example) would pick the whole series
> and apply at once, even if it spans across multiple parts of the kernel?

No, only do this after a kernel release cycle happens and there are
straggler patches that did not get picked up by the relevant subsystem
maintainers.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/drivers/firmware/google/gsmi.c b/drivers/firmware/google/gsmi.c
index adaa492c3d2d..3ef5f3c0b4e4 100644
--- a/drivers/firmware/google/gsmi.c
+++ b/drivers/firmware/google/gsmi.c
@@ -681,6 +681,14 @@  static struct notifier_block gsmi_die_notifier = {
 static int gsmi_panic_callback(struct notifier_block *nb,
 			       unsigned long reason, void *arg)
 {
+	/*
+	 * Perform the lock check before effectively trying
+	 * to acquire it on gsmi_shutdown_reason() to avoid
+	 * potential lockups in atomic context.
+	 */
+	if (spin_is_locked(&gsmi_dev.lock))
+		return NOTIFY_DONE;
+
 	gsmi_shutdown_reason(GSMI_SHUTDOWN_PANIC);
 	return NOTIFY_DONE;
 }