diff mbox series

[v2,1/2] xhci: Add a quirk for full reset on removal

Message ID 20250515185227.1507363-2-royluo@google.com
State New
Headers show
Series Introduce XHCI_FULL_RESET_ON_REMOVE quirk for DWC3 | expand

Commit Message

Roy Luo May 15, 2025, 6:52 p.m. UTC
Commit 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state()
helper") introduced an optimization to xhci_reset() during xhci removal,
allowing it to bail out early without waiting for the reset to complete.

This behavior can cause issues on SNPS DWC3 USB controller with dual-role
capability. When the DWC3 controller exits host mode and removes xhci
while a reset is still in progress, and then tries to configure its
hardware for device mode, the ongoing reset leads to register access
issues; specifically, all register reads returns 0. These issues extend
beyond the xhci register space (which is expected during a reset) and
affect the entire DWC3 IP block, causing the DWC3 device mode to
malfunction.

To address this, introduce the `XHCI_FULL_RESET_ON_REMOVE` quirk. When this
quirk is set, xhci_reset() always completes its reset handshake, ensuring
the controller is in a fully reset state before proceeding.

Cc: stable@vger.kernel.org
Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper")
Signed-off-by: Roy Luo <royluo@google.com>
---
 drivers/usb/host/xhci-plat.c | 3 +++
 drivers/usb/host/xhci.c      | 8 +++++++-
 drivers/usb/host/xhci.h      | 1 +
 3 files changed, 11 insertions(+), 1 deletion(-)

Comments

Thinh Nguyen May 15, 2025, 11:42 p.m. UTC | #1
On Thu, May 15, 2025, Roy Luo wrote:
> Commit 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state()
> helper") introduced an optimization to xhci_reset() during xhci removal,
> allowing it to bail out early without waiting for the reset to complete.
> 
> This behavior can cause issues on SNPS DWC3 USB controller with dual-role
> capability. When the DWC3 controller exits host mode and removes xhci
> while a reset is still in progress, and then tries to configure its
> hardware for device mode, the ongoing reset leads to register access
> issues; specifically, all register reads returns 0. These issues extend
> beyond the xhci register space (which is expected during a reset) and
> affect the entire DWC3 IP block, causing the DWC3 device mode to
> malfunction.
> 
> To address this, introduce the `XHCI_FULL_RESET_ON_REMOVE` quirk. When this
> quirk is set, xhci_reset() always completes its reset handshake, ensuring
> the controller is in a fully reset state before proceeding.
> 
> Cc: stable@vger.kernel.org
> Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper")
> Signed-off-by: Roy Luo <royluo@google.com>
> ---
>  drivers/usb/host/xhci-plat.c | 3 +++
>  drivers/usb/host/xhci.c      | 8 +++++++-
>  drivers/usb/host/xhci.h      | 1 +
>  3 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
> index 3155e3a842da..19c5c26a8e63 100644
> --- a/drivers/usb/host/xhci-plat.c
> +++ b/drivers/usb/host/xhci-plat.c
> @@ -265,6 +265,9 @@ int xhci_plat_probe(struct platform_device *pdev, struct device *sysdev, const s
>  		if (device_property_read_bool(tmpdev, "xhci-skip-phy-init-quirk"))
>  			xhci->quirks |= XHCI_SKIP_PHY_INIT;
>  
> +		if (device_property_read_bool(tmpdev, "xhci-full-reset-on-remove-quirk"))
> +			xhci->quirks |= XHCI_FULL_RESET_ON_REMOVE;
> +
>  		device_property_read_u32(tmpdev, "imod-interval-ns",
>  					 &xhci->imod_interval);
>  	}
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 90eb491267b5..4f091d618c01 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -198,6 +198,7 @@ int xhci_reset(struct xhci_hcd *xhci, u64 timeout_us)
>  	u32 command;
>  	u32 state;
>  	int ret;
> +	unsigned int exit_state;
>  
>  	state = readl(&xhci->op_regs->status);
>  
> @@ -226,8 +227,13 @@ int xhci_reset(struct xhci_hcd *xhci, u64 timeout_us)
>  	if (xhci->quirks & XHCI_INTEL_HOST)
>  		udelay(1000);
>  
> +	if (xhci->quirks & XHCI_FULL_RESET_ON_REMOVE)
> +		exit_state = 0;

There's no state 0. Checking against that is odd. Couldn't we just use
xhci_handshake() equivalent instead?

In any case, this is basically a revert of this change:
6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper")

Can't we just revert or fix the above patch that causes a regression?

Thanks,
Thinh

> +	else
> +		exit_state = XHCI_STATE_REMOVING;
> +
>  	ret = xhci_handshake_check_state(xhci, &xhci->op_regs->command,
> -				CMD_RESET, 0, timeout_us, XHCI_STATE_REMOVING);
> +				CMD_RESET, 0, timeout_us, exit_state);
>  	if (ret)
>  		return ret;
>  
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index 242ab9fbc8ae..ac65af788298 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1637,6 +1637,7 @@ struct xhci_hcd {
>  #define XHCI_WRITE_64_HI_LO	BIT_ULL(47)
>  #define XHCI_CDNS_SCTX_QUIRK	BIT_ULL(48)
>  #define XHCI_ETRON_HOST	BIT_ULL(49)
> +#define XHCI_FULL_RESET_ON_REMOVE	BIT_ULL(50)
>  
>  	unsigned int		num_active_eps;
>  	unsigned int		limit_active_eps;
> -- 
> 2.49.0.1112.g889b7c5bd8-goog
>
Michał Pecio May 16, 2025, 6:33 a.m. UTC | #2
On Thu, 15 May 2025 23:42:50 +0000, Thinh Nguyen wrote:
> In any case, this is basically a revert of this change:
> 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state()
> helper")
> 
> Can't we just revert or fix the above patch that causes a regression?

Also note that 6ccb83d6c497 claimed to fix actual problems, so
disabling it on selected hardware could bring the old bug back:

> In some situations where xhci removal happens parallel to
> xhci_handshake, we encounter a scenario where the xhci_handshake
> can't succeed, and it polls until timeout.
> 
> If xhci_handshake runs until timeout it can on some platforms result
> in a long wait which might lead to a watchdog timeout.

But on the other hand, xhci_handshake() has long timeouts because
the handshakes themselves can take a surprisingly long time (and
sometimes still succeed), so any reliance on handshake completing
before timeout is frankly a bug in itself.

Regards,
Michal
Roy Luo May 16, 2025, 11:11 p.m. UTC | #3
> There's no state 0. Checking against that is odd. Couldn't we just use
> xhci_handshake() equivalent instead?

Ok, I will change it in the next version.

On Thu, May 15, 2025 at 11:33 PM Michał Pecio <michal.pecio@gmail.com> wrote:
>
> On Thu, 15 May 2025 23:42:50 +0000, Thinh Nguyen wrote:
> > In any case, this is basically a revert of this change:
> > 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state()
> > helper")
> >
> > Can't we just revert or fix the above patch that causes a regression?
>
> Also note that 6ccb83d6c497 claimed to fix actual problems, so
> disabling it on selected hardware could bring the old bug back:
>
> > In some situations where xhci removal happens parallel to
> > xhci_handshake, we encounter a scenario where the xhci_handshake
> > can't succeed, and it polls until timeout.
> >
> > If xhci_handshake runs until timeout it can on some platforms result
> > in a long wait which might lead to a watchdog timeout.

On top of this, xhci_handshake_check_state(XHCI_STATE_REMOVING)
is also used elsewhere like xhci_abort_cmd_ring(), so a simple revert is
off the table. Commit 6ccb83d6c497 did not specify which platform and
in what circumstance would xhci handshake timeout, adding a quirk for
DWC3 seems to be the better option here.

>
> But on the other hand, xhci_handshake() has long timeouts because
> the handshakes themselves can take a surprisingly long time (and
> sometimes still succeed), so any reliance on handshake completing
> before timeout is frankly a bug in itself.

This patch simply honors the contract between the software and
hardware, allowing the handshake to complete. It doesn't assume the
handshake will finish on time. If it times out, then it times out and
returns a failure.

Thanks,
Roy
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index 3155e3a842da..19c5c26a8e63 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -265,6 +265,9 @@  int xhci_plat_probe(struct platform_device *pdev, struct device *sysdev, const s
 		if (device_property_read_bool(tmpdev, "xhci-skip-phy-init-quirk"))
 			xhci->quirks |= XHCI_SKIP_PHY_INIT;
 
+		if (device_property_read_bool(tmpdev, "xhci-full-reset-on-remove-quirk"))
+			xhci->quirks |= XHCI_FULL_RESET_ON_REMOVE;
+
 		device_property_read_u32(tmpdev, "imod-interval-ns",
 					 &xhci->imod_interval);
 	}
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 90eb491267b5..4f091d618c01 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -198,6 +198,7 @@  int xhci_reset(struct xhci_hcd *xhci, u64 timeout_us)
 	u32 command;
 	u32 state;
 	int ret;
+	unsigned int exit_state;
 
 	state = readl(&xhci->op_regs->status);
 
@@ -226,8 +227,13 @@  int xhci_reset(struct xhci_hcd *xhci, u64 timeout_us)
 	if (xhci->quirks & XHCI_INTEL_HOST)
 		udelay(1000);
 
+	if (xhci->quirks & XHCI_FULL_RESET_ON_REMOVE)
+		exit_state = 0;
+	else
+		exit_state = XHCI_STATE_REMOVING;
+
 	ret = xhci_handshake_check_state(xhci, &xhci->op_regs->command,
-				CMD_RESET, 0, timeout_us, XHCI_STATE_REMOVING);
+				CMD_RESET, 0, timeout_us, exit_state);
 	if (ret)
 		return ret;
 
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 242ab9fbc8ae..ac65af788298 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1637,6 +1637,7 @@  struct xhci_hcd {
 #define XHCI_WRITE_64_HI_LO	BIT_ULL(47)
 #define XHCI_CDNS_SCTX_QUIRK	BIT_ULL(48)
 #define XHCI_ETRON_HOST	BIT_ULL(49)
+#define XHCI_FULL_RESET_ON_REMOVE	BIT_ULL(50)
 
 	unsigned int		num_active_eps;
 	unsigned int		limit_active_eps;