Message ID | 20250110084413.80981-1-leo.lin@canonical.com |
---|---|
State | New |
Headers | show |
Series | USB: Prevent xhci from resuming root hub during suspend entrance | expand |
On Fri, Jan 10, 2025 at 04:44:10PM +0800, Yo-Jung (Leo) Lin wrote: > The commit d9b4067aef50 ("USB: Fix the issue of task recovery failure > caused by USB status when S4 wakes up") fixed an issue where if an USB > port change happens during the entering steps of hibernation, xhci driver > would attempt to resume the root hub, making the hibernation fail. > > System-wide suspend may fail due to the same reason, but this hasn't been > addressed yet. This has been found on HP ProOne 440[1], as well as on > some newer Dell all-in-one models. When suspend fails due to this reason, > the kernel would show the following messages: > > [ 74.245058] [165] usbcore:hub_suspend:3961: hub 2-0:1.0: hub_suspend > [ 74.245850] [165] usbcore:hcd_bus_suspend:2251: usb usb2: bus suspend, wakeup 0 > [ 74.250971] [3508] usbcore:usb_port_suspend:3554: usb 1-2: usb suspend, wakeup 1 > [ 74.263025] [11] usbcore:hub_suspend:3961: hub 1-0:1.0: hub_suspend > [ 74.264029] [11] usbcore:hcd_bus_suspend:2251: usb usb1: bus suspend, wakeup 0 > [ 74.265061] [11] xhci_hcd:xhci_bus_suspend:1779: xhci_hcd 0000:80:14.0: port 1-14 not suspended > [ 74.266020] [11] xhci_hcd:xhci_bus_suspend:1779: xhci_hcd 0000:80:14.0: port 1-8 not suspended > [ 74.266933] [11] xhci_hcd:xhci_bus_suspend:1779: xhci_hcd 0000:80:14.0: port 1-4 not suspended > [ 74.267758] [11] xhci_hcd:xhci_ring_cmd_db:369: xhci_hcd 0000:80:14.0: // Ding dong! > [ 74.268677] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 5 ep 6 > [ 74.269632] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 5 ep 5 > [ 74.270448] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 5 ep 3 > [ 74.271228] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 5 ep 0 > [ 74.271946] [11] xhci_hcd:xhci_ring_cmd_db:369: xhci_hcd 0000:80:14.0: // Ding dong! > [ 74.272802] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 6 ep 8 > [ 74.273533] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 6 ep 5 > [ 74.274233] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 6 ep 0 > [ 74.274982] [11] xhci_hcd:xhci_ring_cmd_db:369: xhci_hcd 0000:80:14.0: // Ding dong! > [ 74.275814] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 2 ep 0 > [ 74.281739] <intr> xhci_hcd:handle_port_status:1992: xhci_hcd 0000:80:14.0: Port change event, 1-8, id 8, portsc: 0x202a0 > [ 74.282453] <intr> xhci_hcd:handle_port_status:1998: xhci_hcd 0000:80:14.0: resume root hub > [ 74.283145] <intr> xhci_hcd:handle_port_status:2109: xhci_hcd 0000:80:14.0: handle_port_status: starting usb1 port polling. > [ 74.385425] e1000e: EEE TX LPI TIMER: 00000011 > [ 74.385543] [3508] pci_acpi:acpi_pci_set_power_state:1119: pcieport 0000:80:1c.0: power state changed by ACPI to D0 > [ 74.385722] xhci_hcd 0000:80:14.0: PM: pci_pm_suspend(): hcd_pci_suspend returns -16 > [ 74.385735] xhci_hcd 0000:80:14.0: PM: dpm_run_callback(): pci_pm_suspend returns -16 > [ 74.385743] xhci_hcd 0000:80:14.0: PM: failed to suspend async: error -16 > > To address this, extend the fix in that commit also to suspend. > > This patch was tested on top of next-20250109, by suspending every 2 > minutes consecutively for 300 times on the machine where the above error > messages were found. There's no suspend failure found during the test. > > [1] [PATCH v3] platform/x86/hp: Avoid spurious wakeup on HP ProOne 440 > https://lore.kernel.org/all/20240906053047.459036-1-kai.heng.feng@canonical.com/#t > > Signed-off-by: Yo-Jung (Leo) Lin <leo.lin@canonical.com> > --- > include/linux/pm.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/linux/pm.h b/include/linux/pm.h > index 08c37b83fea8..d71347357fb1 100644 > --- a/include/linux/pm.h > +++ b/include/linux/pm.h > @@ -571,7 +571,7 @@ const struct dev_pm_ops name = { \ > > #define PMSG_IS_AUTO(msg) (((msg).event & PM_EVENT_AUTO) != 0) > #define PMSG_NO_WAKEUP(msg) (((msg).event & \ > - (PM_EVENT_FREEZE | PM_EVENT_QUIESCE)) != 0) > + (PM_EVENT_FREEZE | PM_EVENT_QUIESCE | PM_EVENT_SUSPEND)) != 0) What tree is this against? It's not my usb trees as this #define is not present there. And why are you changing a define that affects multiple devices when it's only a USB issue you are seeing? confused, greg k-h
On Fri, Jan 10, 2025 at 04:44:10PM +0800, Yo-Jung (Leo) Lin wrote: > The commit d9b4067aef50 ("USB: Fix the issue of task recovery failure > caused by USB status when S4 wakes up") fixed an issue where if an USB > port change happens during the entering steps of hibernation, xhci driver > would attempt to resume the root hub, making the hibernation fail. > > System-wide suspend may fail due to the same reason, but this hasn't been > addressed yet. This has been found on HP ProOne 440[1], as well as on > some newer Dell all-in-one models. When suspend fails due to this reason, > the kernel would show the following messages: I believe this problem was discussed on the mailing list before, and it turned out that the issue was caused by a bug in the xhci-hcd driver, not a bug in the USB core. Basically, suspend is _supposed_ to fail if a wakeup event occurs while the suspend is in progress. As I recall, the bug in xhci-hcd was that it treats some non-wakeup events as if they were wakeup events. In particular, a port change on the root hub should be treated as a wakeup event if and only if the root hub is enabled for wakeup. Does xhci-hcd check for this before failing the suspend? This reasoning shows that your proposed fix is incorrect. Alan Stern
Hi Alan On Fri, Jan 10, 2025 at 11:44 PM Alan Stern <stern@rowland.harvard.edu> wrote: > > On Fri, Jan 10, 2025 at 04:44:10PM +0800, Yo-Jung (Leo) Lin wrote: > > The commit d9b4067aef50 ("USB: Fix the issue of task recovery failure > > caused by USB status when S4 wakes up") fixed an issue where if an USB > > port change happens during the entering steps of hibernation, xhci driver > > would attempt to resume the root hub, making the hibernation fail. > > > > System-wide suspend may fail due to the same reason, but this hasn't been > > addressed yet. This has been found on HP ProOne 440[1], as well as on > > some newer Dell all-in-one models. When suspend fails due to this reason, > > the kernel would show the following messages: > > I believe this problem was discussed on the mailing list before, and it > turned out that the issue was caused by a bug in the xhci-hcd driver, > not a bug in the USB core. Could you be more specific on which bug/thread it is? If you were mentioning thread about d9b4067aef50 ("USB: Fix the issue of task recovery failure caused by USB status when S4 wakes up"), the log in that commit message suggests that it happened on ehci, while here it happened on xhci. So this may be more general than just the xhci. > > Basically, suspend is _supposed_ to fail if a wakeup event occurs while > the suspend is in progress. As I recall, the bug in xhci-hcd was that > it treats some non-wakeup events as if they were wakeup events. > > In particular, a port change on the root hub should be treated as a > wakeup event if and only if the root hub is enabled for wakeup. Does > xhci-hcd check for this before failing the suspend? > > This reasoning shows that your proposed fix is incorrect. > Thanks for the feedback, This indeed isn't a correct way to address this. Will try to figure out some other ways. > Alan Stern Best, Leo
On Mon, Jan 13, 2025 at 04:14:07PM +0800, Yo-Jung (Leo) Lin wrote: > Hi Alan > > On Fri, Jan 10, 2025 at 11:44 PM Alan Stern <stern@rowland.harvard.edu> wrote: > > > > On Fri, Jan 10, 2025 at 04:44:10PM +0800, Yo-Jung (Leo) Lin wrote: > > > The commit d9b4067aef50 ("USB: Fix the issue of task recovery failure > > > caused by USB status when S4 wakes up") fixed an issue where if an USB > > > port change happens during the entering steps of hibernation, xhci driver > > > would attempt to resume the root hub, making the hibernation fail. > > > > > > System-wide suspend may fail due to the same reason, but this hasn't been > > > addressed yet. This has been found on HP ProOne 440[1], as well as on > > > some newer Dell all-in-one models. When suspend fails due to this reason, > > > the kernel would show the following messages: > > > > I believe this problem was discussed on the mailing list before, and it > > turned out that the issue was caused by a bug in the xhci-hcd driver, > > not a bug in the USB core. > > Could you be more specific on which bug/thread it is? > If you were mentioning thread about d9b4067aef50 ("USB: Fix the > issue of task recovery failure caused by USB status when S4 wakes up"), > the log in that commit message suggests that it happened on ehci, while > here it happened on xhci. So this may be more general than just the xhci. I was referring to the discussion in the email thread here: https://lore.kernel.org/linux-usb/7be0c87a-c00f-4346-8482-f41ef0249b57@rowland.harvard.edu/ > > Basically, suspend is _supposed_ to fail if a wakeup event occurs while > > the suspend is in progress. As I recall, the bug in xhci-hcd was that > > it treats some non-wakeup events as if they were wakeup events. > > > > In particular, a port change on the root hub should be treated as a > > wakeup event if and only if the root hub is enabled for wakeup. Does > > xhci-hcd check for this before failing the suspend? > > > > This reasoning shows that your proposed fix is incorrect. > > > Thanks for the feedback, This indeed isn't a correct way to address this. > Will try to figure out some other ways. Okay, good. Alan Stern
diff --git a/include/linux/pm.h b/include/linux/pm.h index 08c37b83fea8..d71347357fb1 100644 --- a/include/linux/pm.h +++ b/include/linux/pm.h @@ -571,7 +571,7 @@ const struct dev_pm_ops name = { \ #define PMSG_IS_AUTO(msg) (((msg).event & PM_EVENT_AUTO) != 0) #define PMSG_NO_WAKEUP(msg) (((msg).event & \ - (PM_EVENT_FREEZE | PM_EVENT_QUIESCE)) != 0) + (PM_EVENT_FREEZE | PM_EVENT_QUIESCE | PM_EVENT_SUSPEND)) != 0) /* * Device run-time power management status. *
The commit d9b4067aef50 ("USB: Fix the issue of task recovery failure caused by USB status when S4 wakes up") fixed an issue where if an USB port change happens during the entering steps of hibernation, xhci driver would attempt to resume the root hub, making the hibernation fail. System-wide suspend may fail due to the same reason, but this hasn't been addressed yet. This has been found on HP ProOne 440[1], as well as on some newer Dell all-in-one models. When suspend fails due to this reason, the kernel would show the following messages: [ 74.245058] [165] usbcore:hub_suspend:3961: hub 2-0:1.0: hub_suspend [ 74.245850] [165] usbcore:hcd_bus_suspend:2251: usb usb2: bus suspend, wakeup 0 [ 74.250971] [3508] usbcore:usb_port_suspend:3554: usb 1-2: usb suspend, wakeup 1 [ 74.263025] [11] usbcore:hub_suspend:3961: hub 1-0:1.0: hub_suspend [ 74.264029] [11] usbcore:hcd_bus_suspend:2251: usb usb1: bus suspend, wakeup 0 [ 74.265061] [11] xhci_hcd:xhci_bus_suspend:1779: xhci_hcd 0000:80:14.0: port 1-14 not suspended [ 74.266020] [11] xhci_hcd:xhci_bus_suspend:1779: xhci_hcd 0000:80:14.0: port 1-8 not suspended [ 74.266933] [11] xhci_hcd:xhci_bus_suspend:1779: xhci_hcd 0000:80:14.0: port 1-4 not suspended [ 74.267758] [11] xhci_hcd:xhci_ring_cmd_db:369: xhci_hcd 0000:80:14.0: // Ding dong! [ 74.268677] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 5 ep 6 [ 74.269632] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 5 ep 5 [ 74.270448] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 5 ep 3 [ 74.271228] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 5 ep 0 [ 74.271946] [11] xhci_hcd:xhci_ring_cmd_db:369: xhci_hcd 0000:80:14.0: // Ding dong! [ 74.272802] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 6 ep 8 [ 74.273533] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 6 ep 5 [ 74.274233] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 6 ep 0 [ 74.274982] [11] xhci_hcd:xhci_ring_cmd_db:369: xhci_hcd 0000:80:14.0: // Ding dong! [ 74.275814] <intr> xhci_hcd:handle_tx_event:2711: xhci_hcd 0000:80:14.0: Stopped on No-op or Link TRB for slot 2 ep 0 [ 74.281739] <intr> xhci_hcd:handle_port_status:1992: xhci_hcd 0000:80:14.0: Port change event, 1-8, id 8, portsc: 0x202a0 [ 74.282453] <intr> xhci_hcd:handle_port_status:1998: xhci_hcd 0000:80:14.0: resume root hub [ 74.283145] <intr> xhci_hcd:handle_port_status:2109: xhci_hcd 0000:80:14.0: handle_port_status: starting usb1 port polling. [ 74.385425] e1000e: EEE TX LPI TIMER: 00000011 [ 74.385543] [3508] pci_acpi:acpi_pci_set_power_state:1119: pcieport 0000:80:1c.0: power state changed by ACPI to D0 [ 74.385722] xhci_hcd 0000:80:14.0: PM: pci_pm_suspend(): hcd_pci_suspend returns -16 [ 74.385735] xhci_hcd 0000:80:14.0: PM: dpm_run_callback(): pci_pm_suspend returns -16 [ 74.385743] xhci_hcd 0000:80:14.0: PM: failed to suspend async: error -16 To address this, extend the fix in that commit also to suspend. This patch was tested on top of next-20250109, by suspending every 2 minutes consecutively for 300 times on the machine where the above error messages were found. There's no suspend failure found during the test. [1] [PATCH v3] platform/x86/hp: Avoid spurious wakeup on HP ProOne 440 https://lore.kernel.org/all/20240906053047.459036-1-kai.heng.feng@canonical.com/#t Signed-off-by: Yo-Jung (Leo) Lin <leo.lin@canonical.com> --- include/linux/pm.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)