[v6,2/2] ACPI: Add quirks for AMD Renoir/Lucienne CPUs to force the D3 hint

Message ID 20210607173156.5548-2-mario.limonciello@amd.com
State New

Commit Message

Limonciello, Mario June 7, 2021, 5:31 p.m.
AMD systems from Renoir and Lucienne require that the NVME controller
is put into D3 over a Modern Standby / suspend-to-idle
cycle.  This is "typically" accomplished using the `StorageD3Enable`
property in the _DSD, but this property was introduced after many
of these systems launched and most OEM systems don't have it in
their BIOS.

On AMD Renoir, if these drives do not go into D3 over suspend-to-idle,
resume fails: the NVME controller is reset and a trace like this
appears in the kernel logs:
```
[   83.556118] nvme nvme0: I/O 161 QID 2 timeout, aborting
[   83.556178] nvme nvme0: I/O 162 QID 2 timeout, aborting
[   83.556187] nvme nvme0: I/O 163 QID 2 timeout, aborting
[   83.556196] nvme nvme0: I/O 164 QID 2 timeout, aborting
[   95.332114] nvme nvme0: I/O 25 QID 0 timeout, reset controller
[   95.332843] nvme nvme0: Abort status: 0x371
[   95.332852] nvme nvme0: Abort status: 0x371
[   95.332856] nvme nvme0: Abort status: 0x371
[   95.332859] nvme nvme0: Abort status: 0x371
[   95.332909] PM: dpm_run_callback(): pci_pm_resume+0x0/0xe0 returns -16
[   95.332936] nvme 0000:03:00.0: PM: failed to resume async: error -16
```

The Microsoft documentation for StorageD3Enable mentioned that Windows has
a hardcoded allowlist for D3 support, which was used for these platforms.
Introduce quirks to hardcode them for Linux as well.

As this property is now "standardized", OEM systems using AMD Cezanne and
newer APUs have adopted this property, and quirks like this should not be
necessary.

CC: Julian Sikorski <belegdol@gmail.com>
CC: Shyam-sundar S-k <Shyam-sundar.S-k@amd.com>
CC: Alexander Deucher <Alexander.Deucher@amd.com>
CC: Rafael J. Wysocki <rjw@rjwysocki.net>
CC: Prike Liang <prike.liang@amd.com>
Link: https://docs.microsoft.com/en-us/windows-hardware/design/component-guidelines/power-management-for-storage-hardware-devices-intro
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/acpi/device_pm.c |  3 +++
 drivers/acpi/x86/utils.c | 27 +++++++++++++++++++++++++++
 include/acpi/acpi_bus.h  |  5 +++++
 3 files changed, 35 insertions(+)

Changes from v4->v5:
 * Add this patch back in as it's been made apparent that the
   system needs to be hardcoded for these.
   Changes:
   - Drop Cezanne - it's now covered by StorageD3Enable
   - Rebase on top of acpi_storage_d3 outside of NVME
Changes from v5->v6:
 * Move the quirk check into drivers/acpi/x86/ as suggested by
   Rafael.
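
For readers following the check order the patch introduces, here is a minimal
userspace sketch, not the kernel code itself. The struct fw_node type, its
fields and the cpu_quirk parameter are hypothetical stand-ins for the ACPI
companion device, the _DSD property lookup and force_storage_d3(). Treating
only the value 1 as "enabled" is an assumption; the hunk in this patch does
not show that part of the function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the firmware data attached to a device. */
struct fw_node {
	bool has_storage_d3_enable;	/* is the _DSD property present? */
	uint8_t storage_d3_enable;	/* its value when present */
};

/* Simplified model of the check order in the patched acpi_storage_d3(). */
static bool storage_d3(bool cpu_quirk, const struct fw_node *node)
{
	if (cpu_quirk)				/* quirked CPU: force D3 */
		return true;
	if (!node)				/* no ACPI companion */
		return false;
	if (!node->has_storage_d3_enable)	/* property missing */
		return false;
	/* Assumption: only the value 1 enables D3. */
	return node->storage_d3_enable == 1;
}
```

The point of the ordering is that a quirked CPU gets D3 even when the
firmware carries no StorageD3Enable property at all.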

Comments

Julian Sikorski June 8, 2021, 6:01 a.m. | #1
On 07.06.2021 at 19:31, Mario Limonciello wrote:

Tested-by: Julian Sikorski <belegdol@gmail.com>

Rafael J. Wysocki June 8, 2021, 11:27 a.m. | #2
On Mon, Jun 7, 2021 at 7:32 PM Mario Limonciello
<mario.limonciello@amd.com> wrote:

> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index 1edb68d00b8e..985c17384192 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -1356,6 +1356,9 @@ bool acpi_storage_d3(struct device *dev)
>         struct acpi_device *adev = ACPI_COMPANION(dev);
>         u8 val;
>
> +       if (force_storage_d3())
> +               return true;
> +
>         if (!adev)
>                 return false;
>         if (fwnode_property_read_u8(acpi_fwnode_handle(adev), "StorageD3Enable",
> diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c
> index bdc1ba00aee9..2b8d5b3c876f 100644
> --- a/drivers/acpi/x86/utils.c
> +++ b/drivers/acpi/x86/utils.c
> @@ -135,3 +135,30 @@ bool acpi_device_always_present(struct acpi_device *adev)
>
>         return ret;
>  }
> +
> +/*
> + * AMD systems from Renoir and Lucienne *require* that the NVME controller
> + * is put into D3 over a Modern Standby / suspend-to-idle cycle.
> + *
> + * This is "typically" accomplished using the `StorageD3Enable`
> + * property in the _DSD that is checked via the `acpi_storage_d3` function
> + * but this property was introduced after many of these systems launched
> + * and most OEM systems don't have it in their BIOS.
> + *
> + * The Microsoft documentation for StorageD3Enable mentioned that Windows has
> + * a hardcoded allowlist for D3 support, which was used for these platforms.
> + *
> + * This allows quirking on Linux in a similar fashion.
> + */
> +const struct x86_cpu_id storage_d3_cpu_ids[] = {
> +       X86_MATCH_VENDOR_FAM_MODEL(AMD, 23, 96, NULL),  /* Renoir */
> +       X86_MATCH_VENDOR_FAM_MODEL(AMD, 23, 104, NULL), /* Lucienne */
> +       {}
> +};
> +
> +bool force_storage_d3(void)
> +{
> +       if (x86_match_cpu(storage_d3_cpu_ids))
> +               return true;
> +       return false;

Well, what about doing

  return x86_match_cpu(storage_d3_cpu_ids);

instead?
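
The suggestion relies on C converting a pointer to bool: any non-NULL pointer
becomes true and NULL becomes false. Below is a small userspace sketch of
that idea, not kernel code. The struct cpu_id type and the match_cpu() helper
are hypothetical stand-ins for struct x86_cpu_id and x86_match_cpu(), and
force_storage_d3() takes the family and model as parameters here only so the
example can run outside the kernel. The family/model numbers are the ones
from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct x86_cpu_id. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
};

/* Family/model pairs from the patch; an all-zero entry ends the table. */
static const struct cpu_id storage_d3_cpu_ids[] = {
	{ 23, 96 },	/* Renoir */
	{ 23, 104 },	/* Lucienne */
	{ 0, 0 }
};

/*
 * Hypothetical stand-in for x86_match_cpu(): returns the matching table
 * entry, or NULL when nothing matches.
 */
static const struct cpu_id *match_cpu(const struct cpu_id *table,
				      unsigned int family, unsigned int model)
{
	for (; table->family || table->model; table++) {
		if (table->family == family && table->model == model)
			return table;
	}
	return NULL;
}

/* The pointer result converts to bool: non-NULL is true, NULL is false. */
static bool force_storage_d3(unsigned int family, unsigned int model)
{
	return match_cpu(storage_d3_cpu_ids, family, model);
}
```

With this shape the explicit if/return true/return false sequence is no
longer needed.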

> +}
> diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> index 3a82faac5767..9b0ddbae5617 100644
> --- a/include/acpi/acpi_bus.h
> +++ b/include/acpi/acpi_bus.h
> @@ -607,11 +607,16 @@ int acpi_disable_wakeup_device_power(struct acpi_device *dev);
>
>  #ifdef CONFIG_X86
>  bool acpi_device_always_present(struct acpi_device *adev);
> +bool force_storage_d3(void);

This doesn't need to go into acpi_bus.h, because it will only be used
in device_pm.c.

You may as well put it into drivers/acpi/internal.h.
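
A rough sketch of what such a private declaration could look like follows.
The exact placement and any surrounding guards in drivers/acpi/internal.h are
assumptions; only the declaration/stub pair itself comes from the patch.
CONFIG_X86 is normally set by Kconfig and is left undefined here, so the
example compiles in userspace and uses the stub.

```c
#include <stdbool.h>

/*
 * Declaration/stub pair as it might appear in a private header.
 * With CONFIG_X86 set, the real function comes from drivers/acpi/x86/utils.c;
 * otherwise callers get an inline stub that never forces D3.
 */
#ifdef CONFIG_X86
bool force_storage_d3(void);
#else
static inline bool force_storage_d3(void)
{
	return false;
}
#endif
```

Keeping the pair in a private header means callers outside drivers/acpi/
cannot see the helper, which matches it having a single user in device_pm.c.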

>  #else
>  static inline bool acpi_device_always_present(struct acpi_device *adev)
>  {
>         return false;
>  }
> +static inline bool force_storage_d3(void)
> +{
> +       return false;
> +}
>  #endif
>
>  #ifdef CONFIG_PM
> --

Patch

diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index 1edb68d00b8e..985c17384192 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -1356,6 +1356,9 @@  bool acpi_storage_d3(struct device *dev)
 	struct acpi_device *adev = ACPI_COMPANION(dev);
 	u8 val;
 
+	if (force_storage_d3())
+		return true;
+
 	if (!adev)
 		return false;
 	if (fwnode_property_read_u8(acpi_fwnode_handle(adev), "StorageD3Enable",
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c
index bdc1ba00aee9..2b8d5b3c876f 100644
--- a/drivers/acpi/x86/utils.c
+++ b/drivers/acpi/x86/utils.c
@@ -135,3 +135,30 @@  bool acpi_device_always_present(struct acpi_device *adev)
 
 	return ret;
 }
+
+/*
+ * AMD systems from Renoir and Lucienne *require* that the NVME controller
+ * is put into D3 over a Modern Standby / suspend-to-idle cycle.
+ *
+ * This is "typically" accomplished using the `StorageD3Enable`
+ * property in the _DSD that is checked via the `acpi_storage_d3` function
+ * but this property was introduced after many of these systems launched
+ * and most OEM systems don't have it in their BIOS.
+ *
+ * The Microsoft documentation for StorageD3Enable mentioned that Windows has
+ * a hardcoded allowlist for D3 support, which was used for these platforms.
+ *
+ * This allows quirking on Linux in a similar fashion.
+ */
+const struct x86_cpu_id storage_d3_cpu_ids[] = {
+	X86_MATCH_VENDOR_FAM_MODEL(AMD, 23, 96, NULL),	/* Renoir */
+	X86_MATCH_VENDOR_FAM_MODEL(AMD, 23, 104, NULL),	/* Lucienne */
+	{}
+};
+
+bool force_storage_d3(void)
+{
+	if (x86_match_cpu(storage_d3_cpu_ids))
+		return true;
+	return false;
+}
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 3a82faac5767..9b0ddbae5617 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -607,11 +607,16 @@  int acpi_disable_wakeup_device_power(struct acpi_device *dev);
 
 #ifdef CONFIG_X86
 bool acpi_device_always_present(struct acpi_device *adev);
+bool force_storage_d3(void);
 #else
 static inline bool acpi_device_always_present(struct acpi_device *adev)
 {
 	return false;
 }
+static inline bool force_storage_d3(void)
+{
+	return false;
+}
 #endif
 
 #ifdef CONFIG_PM