diff mbox series

[v1] ACPI: reboot: Increase the delay to avoid racing after writing to ACPI RESET_REG on AMD Milan platforms.

Message ID 20230607034403.2885-1-james.liu@hpe.com
State New
Headers show
Series [v1] ACPI: reboot: Increase the delay to avoid racing after writing to ACPI RESET_REG on AMD Milan platforms. | expand

Commit Message

James Liu June 7, 2023, 3:44 a.m. UTC
For AMD Milan platforms, the delay of 15ms is insufficient to avoid racing
of reboot mechanisms. That said, the AMD Milan processors don't reboot
in 15ms after invoking acpi_reset().

The proposed 50ms delay can effectively work around this issue.
This extended delay aligns better with ACPI v6.4 (i.e., sec. 4.8.4.6),
which indicates that ideally OSPM should execute spin loops on the CPUs
in the system following a write to this register.

Signed-off-by: James Liu <james.liu@hpe.com>
---
 drivers/acpi/reboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

James Liu June 8, 2023, 9:14 a.m. UTC | #1
On Wed, Jun 07, 2023 at 01:19:42PM +0200, Rafael J. Wysocki wrote:
> On Wed, Jun 7, 2023 at 5:44 AM James Liu <james.liu@hpe.com> wrote:
> >
> > For AMD Milan platforms, the delay of 15ms is insufficient to avoid racing
> > of reboot mechanisms. That said, the AMD Milan processors don't reboot
> > in 15ms after invoking acpi_reset().
> >
> > The proposed 50ms delay can effectively work around this issue.
> > This extended delay aligns better with ACPI v6.4 (i.e., sec. 4.8.4.6),
> > which indicates that ideally OSPM should execute spin loops on the CPUs
> > in the system following a write to this register.
> >
> > Signed-off-by: James Liu <james.liu@hpe.com>
> 
> Why do you want to affect everyone (including guest kernels running in
> virtual machines AFAICS) in order to address a problem specific to one
> platform?

I hoped to address this issue for any platform requiring a longer delay to
complete ACPI reset in time for any (maybe silicon-level) reasons. Some AMD Milan
platforms were the cases that we've found so far.

Except that, since ACPI spec indicates there should be a spin loop (long enough)
after the write instruction to Reset Register, I thought it should be no harms to
the other systems which well consider this spin loop when they claim to support
ACPI reboot.

Btw, I am just curious, why is the virtual machine mentioned here?

is the 50ms delay in acpi_reboot() for a guest OS on VM so long that some
unexpected behavior might happen?

> Wouldn't it be better to quirk that platform and document the quirk properly?

Yeah, it could be. Actually we considered this, and we will consider it again.

> > ---
> >  drivers/acpi/reboot.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/reboot.c b/drivers/acpi/reboot.c
> > index b79b7c99c237..002f7c7814a1 100644
> > --- a/drivers/acpi/reboot.c
> > +++ b/drivers/acpi/reboot.c
> > @@ -78,5 +78,5 @@ void acpi_reboot(void)
> >          * The 15ms delay has been found to be long enough for the system
> >          * to reboot on the affected platforms.
> >          */
> > -       mdelay(15);
> > +       mdelay(50);
> >  }
> > --
> > 2.40.1
> >
Rafael J. Wysocki June 12, 2023, 4:57 p.m. UTC | #2
On Thu, Jun 8, 2023 at 11:14 AM James Liu <james.liu@hpe.com> wrote:
>
> On Wed, Jun 07, 2023 at 01:19:42PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Jun 7, 2023 at 5:44 AM James Liu <james.liu@hpe.com> wrote:
> > >
> > > For AMD Milan platforms, the delay of 15ms is insufficient to avoid racing
> > > of reboot mechanisms. That said, the AMD Milan processors don't reboot
> > > in 15ms after invoking acpi_reset().
> > >
> > > The proposed 50ms delay can effectively work around this issue.
> > > This extended delay aligns better with ACPI v6.4 (i.e., sec. 4.8.4.6),
> > > which indicates that ideally OSPM should execute spin loops on the CPUs
> > > in the system following a write to this register.
> > >
> > > Signed-off-by: James Liu <james.liu@hpe.com>
> >
> > Why do you want to affect everyone (including guest kernels running in
> > virtual machines AFAICS) in order to address a problem specific to one
> > platform?
>
> I hoped to address this issue for any platform requiring a longer delay to
> complete ACPI reset in time for any (maybe silicon-level) reasons. Some AMD Milan
> platforms were the cases that we've found so far.

Do you know about any other?

> Except that, since ACPI spec indicates there should be a spin loop (long enough)
> after the write instruction to Reset Register, I thought it should be no harms to
> the other systems which well consider this spin loop when they claim to support
> ACPI reboot.
>
> Btw, I am just curious, why is the virtual machine mentioned here?

The new delay would be over 3 times larger, so some users might be
surprised by it at least potentially.

> is the 50ms delay in acpi_reboot() for a guest OS on VM so long that some
> unexpected behavior might happen?
>
> > Wouldn't it be better to quirk that platform and document the quirk properly?
>
> Yeah, it could be. Actually we considered this, and we will consider it again.
>
> > > ---
> > >  drivers/acpi/reboot.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/acpi/reboot.c b/drivers/acpi/reboot.c
> > > index b79b7c99c237..002f7c7814a1 100644
> > > --- a/drivers/acpi/reboot.c
> > > +++ b/drivers/acpi/reboot.c
> > > @@ -78,5 +78,5 @@ void acpi_reboot(void)
> > >          * The 15ms delay has been found to be long enough for the system
> > >          * to reboot on the affected platforms.
> > >          */
> > > -       mdelay(15);
> > > +       mdelay(50);
> > >  }
> > > --
diff mbox series

Patch

diff --git a/drivers/acpi/reboot.c b/drivers/acpi/reboot.c
index b79b7c99c237..002f7c7814a1 100644
--- a/drivers/acpi/reboot.c
+++ b/drivers/acpi/reboot.c
@@ -78,5 +78,5 @@  void acpi_reboot(void)
 	 * The 15ms delay has been found to be long enough for the system
 	 * to reboot on the affected platforms.
 	 */
-	mdelay(15);
+	mdelay(50);
 }