diff mbox series

[v6,4/5] xhci: introduce xhci->lost_power flag

Message ID 20241210-s2r-cdns-v6-4-28a17f9715a2@bootlin.com
State New
Headers show
Series Fix USB suspend on TI J7200 (cdns3-ti, cdns3, xhci) | expand

Commit Message

Théo Lebrun Dec. 10, 2024, 5:13 p.m. UTC
The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
expect a reset after resume. It is also used by some to enforce a XHCI
reset on resume (see needs-reset-on-resume DT prop).

Some wrappers are unsure beforehands if they will reset. Add a mechanism
to signal *at resume* if power has been lost. Parent devices can set
this flag, that defaults to false.

The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
controller. This is required as we do not know if a suspend will
trigger a reset, so the best guess is to avoid runtime PM.

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
---
 drivers/usb/host/xhci.c | 3 ++-
 drivers/usb/host/xhci.h | 6 ++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

Comments

Roger Quadros Dec. 12, 2024, 12:37 p.m. UTC | #1
On 10/12/2024 19:13, Théo Lebrun wrote:
> The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
> expect a reset after resume. It is also used by some to enforce a XHCI
> reset on resume (see needs-reset-on-resume DT prop).
> 
> Some wrappers are unsure beforehands if they will reset. Add a mechanism
> to signal *at resume* if power has been lost. Parent devices can set
> this flag, that defaults to false.
> 
> The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
> controller. This is required as we do not know if a suspend will
> trigger a reset, so the best guess is to avoid runtime PM.
> 
> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> ---
>  drivers/usb/host/xhci.c | 3 ++-
>  drivers/usb/host/xhci.h | 6 ++++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 5ebde8cae4fc44cdb997b0f61314e309bda56c0d..ae2c8daa206a87da24d58a62b0a0485ebf68cdd6 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1017,7 +1017,8 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
>  
>  	spin_lock_irq(&xhci->lock);
>  
> -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
> +	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME ||
> +	    xhci->broken_suspend || xhci->lost_power)
>  		reinit_xhc = true;
>  
>  	if (!reinit_xhc) {
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index 4914f0a10cff42dbc1448dcf7908534d582c848e..32526df75925989d40cbe7d59a187c945f498a30 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1645,6 +1645,12 @@ struct xhci_hcd {
>  	unsigned		broken_suspend:1;
>  	/* Indicates that omitting hcd is supported if root hub has no ports */
>  	unsigned		allow_single_roothub:1;
> +	/*
> +	 * Signal from upper stacks that we lost power during system-wide
> +	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
> +	 * it is safe for wrappers to not modify lost_power at resume.
> +	 */
> +	unsigned                lost_power:1;

I suppose this is private to XHCI driver and not legitimate to be accessed
by another driver after HCD is instantiated?
Doesn't access to xhci_hcd need to be serialized via xhci->lock?

Just curious, what happens if you don't include patch 4 and 5?
Is USB functionality still broken for you?

Doesn't XHCI driver detect that power was lost and issue a reset in any case
via the below code

        /* re-initialize the HC on Restore Error, or Host Controller Error */
        if ((temp & (STS_SRE | STS_HCE)) &&
            !(xhci->xhc_state & XHCI_STATE_REMOVING)) {
                reinit_xhc = true;
                if (!xhci->broken_suspend)
                        xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);
        }

>  	/* cached extended protocol port capabilities */
>  	struct xhci_port_cap	*port_caps;
>  	unsigned int		num_port_caps;
>
Théo Lebrun Dec. 13, 2024, 4:03 p.m. UTC | #2
On Thu Dec 12, 2024 at 1:37 PM CET, Roger Quadros wrote:
> On 10/12/2024 19:13, Théo Lebrun wrote:
> > The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
> > expect a reset after resume. It is also used by some to enforce a XHCI
> > reset on resume (see needs-reset-on-resume DT prop).
> > 
> > Some wrappers are unsure beforehands if they will reset. Add a mechanism
> > to signal *at resume* if power has been lost. Parent devices can set
> > this flag, that defaults to false.
> > 
> > The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
> > controller. This is required as we do not know if a suspend will
> > trigger a reset, so the best guess is to avoid runtime PM.
> > 
> > Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> > ---
> >  drivers/usb/host/xhci.c | 3 ++-
> >  drivers/usb/host/xhci.h | 6 ++++++
> >  2 files changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> > index 5ebde8cae4fc44cdb997b0f61314e309bda56c0d..ae2c8daa206a87da24d58a62b0a0485ebf68cdd6 100644
> > --- a/drivers/usb/host/xhci.c
> > +++ b/drivers/usb/host/xhci.c
> > @@ -1017,7 +1017,8 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
> >  
> >  	spin_lock_irq(&xhci->lock);
> >  
> > -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
> > +	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME ||
> > +	    xhci->broken_suspend || xhci->lost_power)
> >  		reinit_xhc = true;
> >  
> >  	if (!reinit_xhc) {
> > diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> > index 4914f0a10cff42dbc1448dcf7908534d582c848e..32526df75925989d40cbe7d59a187c945f498a30 100644
> > --- a/drivers/usb/host/xhci.h
> > +++ b/drivers/usb/host/xhci.h
> > @@ -1645,6 +1645,12 @@ struct xhci_hcd {
> >  	unsigned		broken_suspend:1;
> >  	/* Indicates that omitting hcd is supported if root hub has no ports */
> >  	unsigned		allow_single_roothub:1;
> > +	/*
> > +	 * Signal from upper stacks that we lost power during system-wide
> > +	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
> > +	 * it is safe for wrappers to not modify lost_power at resume.
> > +	 */
> > +	unsigned                lost_power:1;
>
> I suppose this is private to XHCI driver and not legitimate to be accessed
> by another driver after HCD is instantiated?

Yes it is private.

> Doesn't access to xhci_hcd need to be serialized via xhci->lock?

Good question. In theory maybe. In practice I don't see how
cdns_host_resume(), called by cdns_resume(), could clash with anything
else. I'll add that to be safe.

> Just curious, what happens if you don't include patch 4 and 5?
> Is USB functionality still broken for you?

No it works fine. Patches 4+5 are only there to avoid the below warning.
Logging "xHC error in resume" is a lie, so I want to avoid it.

> Doesn't XHCI driver detect that power was lost and issue a reset in any case
> via the below code
>
>         /* re-initialize the HC on Restore Error, or Host Controller Error */
>         if ((temp & (STS_SRE | STS_HCE)) &&
>             !(xhci->xhc_state & XHCI_STATE_REMOVING)) {
>                 reinit_xhc = true;
>                 if (!xhci->broken_suspend)
>                         xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);
>         }
>
> >  	/* cached extended protocol port capabilities */
> >  	struct xhci_port_cap	*port_caps;
> >  	unsigned int		num_port_caps;
> > 

I'll wait for your opinion on the [PATCH v6 2/5] email thread before
sending a new revision.

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Roger Quadros Dec. 17, 2024, 9 p.m. UTC | #3
On 13/12/2024 18:03, Théo Lebrun wrote:
> On Thu Dec 12, 2024 at 1:37 PM CET, Roger Quadros wrote:
>> On 10/12/2024 19:13, Théo Lebrun wrote:
>>> The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
>>> expect a reset after resume. It is also used by some to enforce a XHCI
>>> reset on resume (see needs-reset-on-resume DT prop).
>>>
>>> Some wrappers are unsure beforehands if they will reset. Add a mechanism
>>> to signal *at resume* if power has been lost. Parent devices can set
>>> this flag, that defaults to false.
>>>
>>> The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
>>> controller. This is required as we do not know if a suspend will
>>> trigger a reset, so the best guess is to avoid runtime PM.
>>>
>>> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
>>> ---
>>>  drivers/usb/host/xhci.c | 3 ++-
>>>  drivers/usb/host/xhci.h | 6 ++++++
>>>  2 files changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
>>> index 5ebde8cae4fc44cdb997b0f61314e309bda56c0d..ae2c8daa206a87da24d58a62b0a0485ebf68cdd6 100644
>>> --- a/drivers/usb/host/xhci.c
>>> +++ b/drivers/usb/host/xhci.c
>>> @@ -1017,7 +1017,8 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
>>>  
>>>  	spin_lock_irq(&xhci->lock);
>>>  
>>> -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
>>> +	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME ||
>>> +	    xhci->broken_suspend || xhci->lost_power)
>>>  		reinit_xhc = true;
>>>  
>>>  	if (!reinit_xhc) {
>>> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
>>> index 4914f0a10cff42dbc1448dcf7908534d582c848e..32526df75925989d40cbe7d59a187c945f498a30 100644
>>> --- a/drivers/usb/host/xhci.h
>>> +++ b/drivers/usb/host/xhci.h
>>> @@ -1645,6 +1645,12 @@ struct xhci_hcd {
>>>  	unsigned		broken_suspend:1;
>>>  	/* Indicates that omitting hcd is supported if root hub has no ports */
>>>  	unsigned		allow_single_roothub:1;
>>> +	/*
>>> +	 * Signal from upper stacks that we lost power during system-wide
>>> +	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
>>> +	 * it is safe for wrappers to not modify lost_power at resume.
>>> +	 */
>>> +	unsigned                lost_power:1;
>>
>> I suppose this is private to XHCI driver and not legitimate to be accessed
>> by another driver after HCD is instantiated?
> 
> Yes it is private.
> 
>> Doesn't access to xhci_hcd need to be serialized via xhci->lock?
> 
> Good question. In theory maybe. In practice I don't see how
> cdns_host_resume(), called by cdns_resume(), could clash with anything
> else. I'll add that to be safe.
> 
>> Just curious, what happens if you don't include patch 4 and 5?
>> Is USB functionality still broken for you?
> 
> No it works fine. Patches 4+5 are only there to avoid the below warning.
> Logging "xHC error in resume" is a lie, so I want to avoid it.

How is it a lie?
The XHCI controller did loose its save/restore state during a PM operation.
As far as XHCI is concerned this is an error. no?

> 
>> Doesn't XHCI driver detect that power was lost and issue a reset in any case
>> via the below code
>>
>>         /* re-initialize the HC on Restore Error, or Host Controller Error */
>>         if ((temp & (STS_SRE | STS_HCE)) &&
>>             !(xhci->xhc_state & XHCI_STATE_REMOVING)) {
>>                 reinit_xhc = true;
>>                 if (!xhci->broken_suspend)
>>                         xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);
>>         }
>>
>>>  	/* cached extended protocol port capabilities */
>>>  	struct xhci_port_cap	*port_caps;
>>>  	unsigned int		num_port_caps;
>>>
> 
> I'll wait for your opinion on the [PATCH v6 2/5] email thread before
> sending a new revision.

Sorry for the delay. I'm not an expert in PM but will give my opinion there anyways.

> 
> Thanks,
> 
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com
> 
>
Théo Lebrun Dec. 18, 2024, 5:49 p.m. UTC | #4
Hello Roger, Peter, Pawel, Greg, Mathias,

On Tue Dec 17, 2024 at 10:00 PM CET, Roger Quadros wrote:
>
>
> On 13/12/2024 18:03, Théo Lebrun wrote:
> > On Thu Dec 12, 2024 at 1:37 PM CET, Roger Quadros wrote:
> >> On 10/12/2024 19:13, Théo Lebrun wrote:
> >>> The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
> >>> expect a reset after resume. It is also used by some to enforce a XHCI
> >>> reset on resume (see needs-reset-on-resume DT prop).
> >>>
> >>> Some wrappers are unsure beforehands if they will reset. Add a mechanism
> >>> to signal *at resume* if power has been lost. Parent devices can set
> >>> this flag, that defaults to false.
> >>>
> >>> The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
> >>> controller. This is required as we do not know if a suspend will
> >>> trigger a reset, so the best guess is to avoid runtime PM.
> >>>
> >>> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> >>> ---
> >>>  drivers/usb/host/xhci.c | 3 ++-
> >>>  drivers/usb/host/xhci.h | 6 ++++++
> >>>  2 files changed, 8 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> >>> index 5ebde8cae4fc44cdb997b0f61314e309bda56c0d..ae2c8daa206a87da24d58a62b0a0485ebf68cdd6 100644
> >>> --- a/drivers/usb/host/xhci.c
> >>> +++ b/drivers/usb/host/xhci.c
> >>> @@ -1017,7 +1017,8 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
> >>>  
> >>>  	spin_lock_irq(&xhci->lock);
> >>>  
> >>> -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
> >>> +	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME ||
> >>> +	    xhci->broken_suspend || xhci->lost_power)
> >>>  		reinit_xhc = true;
> >>>  
> >>>  	if (!reinit_xhc) {
> >>> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> >>> index 4914f0a10cff42dbc1448dcf7908534d582c848e..32526df75925989d40cbe7d59a187c945f498a30 100644
> >>> --- a/drivers/usb/host/xhci.h
> >>> +++ b/drivers/usb/host/xhci.h
> >>> @@ -1645,6 +1645,12 @@ struct xhci_hcd {
> >>>  	unsigned		broken_suspend:1;
> >>>  	/* Indicates that omitting hcd is supported if root hub has no ports */
> >>>  	unsigned		allow_single_roothub:1;
> >>> +	/*
> >>> +	 * Signal from upper stacks that we lost power during system-wide
> >>> +	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
> >>> +	 * it is safe for wrappers to not modify lost_power at resume.
> >>> +	 */
> >>> +	unsigned                lost_power:1;
> >>
> >> I suppose this is private to XHCI driver and not legitimate to be accessed
> >> by another driver after HCD is instantiated?
> > 
> > Yes it is private.
> > 
> >> Doesn't access to xhci_hcd need to be serialized via xhci->lock?
> > 
> > Good question. In theory maybe. In practice I don't see how
> > cdns_host_resume(), called by cdns_resume(), could clash with anything
> > else. I'll add that to be safe.
> > 
> >> Just curious, what happens if you don't include patch 4 and 5?
> >> Is USB functionality still broken for you?
> > 
> > No it works fine. Patches 4+5 are only there to avoid the below warning.
> > Logging "xHC error in resume" is a lie, so I want to avoid it.
>
> How is it a lie?
> The XHCI controller did loose its save/restore state during a PM operation.
> As far as XHCI is concerned this is an error. no?

The `xhci->quirks & XHCI_RESET_ON_RESUME` is exactly the same thing;
both the quirk and the flag we add have for purpose:

1. skipping this block

	if (!reinit_xhc) {
		retval = xhci_handshake(&xhci->op_regs->status,
					STS_CNR, 0, 10 * 1000 * 1000);
		// ...
		xhci_restore_registers(xhci);
		xhci_set_cmd_ring_deq(xhci);
		command = readl(&xhci->op_regs->command);
		command |= CMD_CRS;
		writel(command, &xhci->op_regs->command);
		if (xhci_handshake(&xhci->op_regs->status,
			      STS_RESTORE, 0, 100 * 1000)) {
			// ...
		}
	}

2. avoiding this warning:

	xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);

I don't think the block skipped is important in resume duration (to be
confirmed). But the xhci_warn() is not desired: we do not want to log
warnings if we know it is expected.

I'll think some more about it.

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Théo Lebrun Jan. 8, 2025, 10:59 a.m. UTC | #5
On Wed Dec 18, 2024 at 6:49 PM CET, Théo Lebrun wrote:
> On Tue Dec 17, 2024 at 10:00 PM CET, Roger Quadros wrote:
> > On 13/12/2024 18:03, Théo Lebrun wrote:
> > > On Thu Dec 12, 2024 at 1:37 PM CET, Roger Quadros wrote:
> > >> On 10/12/2024 19:13, Théo Lebrun wrote:
> > >>> The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
> > >>> expect a reset after resume. It is also used by some to enforce a XHCI
> > >>> reset on resume (see needs-reset-on-resume DT prop).
> > >>>
> > >>> Some wrappers are unsure beforehands if they will reset. Add a mechanism
> > >>> to signal *at resume* if power has been lost. Parent devices can set
> > >>> this flag, that defaults to false.
> > >>>
> > >>> The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
> > >>> controller. This is required as we do not know if a suspend will
> > >>> trigger a reset, so the best guess is to avoid runtime PM.
> > >>>
> > >>> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
> > >>> ---
> > >>>  drivers/usb/host/xhci.c | 3 ++-
> > >>>  drivers/usb/host/xhci.h | 6 ++++++
> > >>>  2 files changed, 8 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> > >>> index 5ebde8cae4fc44cdb997b0f61314e309bda56c0d..ae2c8daa206a87da24d58a62b0a0485ebf68cdd6 100644
> > >>> --- a/drivers/usb/host/xhci.c
> > >>> +++ b/drivers/usb/host/xhci.c
> > >>> @@ -1017,7 +1017,8 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
> > >>>  
> > >>>  	spin_lock_irq(&xhci->lock);
> > >>>  
> > >>> -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
> > >>> +	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME ||
> > >>> +	    xhci->broken_suspend || xhci->lost_power)
> > >>>  		reinit_xhc = true;
> > >>>  
> > >>>  	if (!reinit_xhc) {
> > >>> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> > >>> index 4914f0a10cff42dbc1448dcf7908534d582c848e..32526df75925989d40cbe7d59a187c945f498a30 100644
> > >>> --- a/drivers/usb/host/xhci.h
> > >>> +++ b/drivers/usb/host/xhci.h
> > >>> @@ -1645,6 +1645,12 @@ struct xhci_hcd {
> > >>>  	unsigned		broken_suspend:1;
> > >>>  	/* Indicates that omitting hcd is supported if root hub has no ports */
> > >>>  	unsigned		allow_single_roothub:1;
> > >>> +	/*
> > >>> +	 * Signal from upper stacks that we lost power during system-wide
> > >>> +	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
> > >>> +	 * it is safe for wrappers to not modify lost_power at resume.
> > >>> +	 */
> > >>> +	unsigned                lost_power:1;
> > >>
> > >> I suppose this is private to XHCI driver and not legitimate to be accessed
> > >> by another driver after HCD is instantiated?
> > > 
> > > Yes it is private.
> > > 
> > >> Doesn't access to xhci_hcd need to be serialized via xhci->lock?
> > > 
> > > Good question. In theory maybe. In practice I don't see how
> > > cdns_host_resume(), called by cdns_resume(), could clash with anything
> > > else. I'll add that to be safe.
> > > 
> > >> Just curious, what happens if you don't include patch 4 and 5?
> > >> Is USB functionality still broken for you?
> > > 
> > > No it works fine. Patches 4+5 are only there to avoid the below warning.
> > > Logging "xHC error in resume" is a lie, so I want to avoid it.
> >
> > How is it a lie?
> > The XHCI controller did loose its save/restore state during a PM operation.
> > As far as XHCI is concerned this is an error. no?
>
> The `xhci->quirks & XHCI_RESET_ON_RESUME` is exactly the same thing;
> both the quirk and the flag we add have for purpose:
>
> 1. skipping this block
>
> 	if (!reinit_xhc) {
> 		retval = xhci_handshake(&xhci->op_regs->status,
> 					STS_CNR, 0, 10 * 1000 * 1000);
> 		// ...
> 		xhci_restore_registers(xhci);
> 		xhci_set_cmd_ring_deq(xhci);
> 		command = readl(&xhci->op_regs->command);
> 		command |= CMD_CRS;
> 		writel(command, &xhci->op_regs->command);
> 		if (xhci_handshake(&xhci->op_regs->status,
> 			      STS_RESTORE, 0, 100 * 1000)) {
> 			// ...
> 		}
> 	}
>
> 2. avoiding this warning:
>
> 	xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);
>
> I don't think the block skipped is important in resume duration (to be
> confirmed). But the xhci_warn() is not desired: we do not want to log
> warnings if we know it is expected.
>
> I'll think some more about it.

About this series, there were two discussions:

 - The desire to avoid putting the hardware init sequence of cdns3-ti
   inside  runtime_resume() as we don't do runtime PM.
   *That is fine and will be fixed for the next revision.*
   See [PATCH V6 2/5]: https://lore.kernel.org/lkml/8a1ed4be-c41c-46b6-ae25-33a6035b8c8d@kernel.org/

 - [PATCH V6 4/5] and [PATCH V6 5/5] are dedicated to avoiding a warning
   at XHCI resume on J7200:

      xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);

   https://lore.kernel.org/lkml/20241210-s2r-cdns-v6-4-28a17f9715a2@bootlin.com/
   https://lore.kernel.org/lkml/20241210-s2r-cdns-v6-5-28a17f9715a2@bootlin.com/

   Roger Quadros asked if we should not instead keep it, as there is
   indeed a reinit of the xHC block. I don't think we do: we don't want
   a warning at resume; IMO it would imply the reinit was unexpected.

   Proof is there is already a platform with a ->broken_suspend boolean
   that disables the warning even though there is a reinit. It doesn't
   log a warning as the reinit was expected.

   So we currently have:
    - xhci->broken_suspend: set at probe & implies the resume sequence
      won't work.
    - xhci->quirks & XHCI_RESET_ON_RESUME: set at probe & implies the
      controller reset during suspend.

   IIUC xhci->broken_suspend is NOT equivalent to "the controller reset
   during suspend". Else we wouldn't have both the broken_suspend flag
   and the XHCI_RESET_ON_RESUME quirk.

   In our case we want exactly the same thing as the
   XHCI_RESET_ON_RESUME quirk but updated at resume depending on what
   the wrapper driver detects.

   We could either:
   1. Update xhci->quirks at resume from upper layers.
   2. Introduce a xhci->lost_power flag. It would be strictly equivalent
      to the XHCI_RESET_ON_RESUME quirk BUT updated at resume from
      upper layers.

   @Mathias Nyman: what is your thought on the matter? Option (2) was
   the one taken in this series. Is there another option I am missing?

Thanks,

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
Mathias Nyman Jan. 8, 2025, 6:43 p.m. UTC | #6
On 8.1.2025 12.59, Théo Lebrun wrote:
> On Wed Dec 18, 2024 at 6:49 PM CET, Théo Lebrun wrote:
>> On Tue Dec 17, 2024 at 10:00 PM CET, Roger Quadros wrote:
>>> On 13/12/2024 18:03, Théo Lebrun wrote:
>>>> On Thu Dec 12, 2024 at 1:37 PM CET, Roger Quadros wrote:
>>>>> On 10/12/2024 19:13, Théo Lebrun wrote:
>>>>>> The XHCI_RESET_ON_RESUME quirk allows wrappers to signal that they
>>>>>> expect a reset after resume. It is also used by some to enforce a XHCI
>>>>>> reset on resume (see needs-reset-on-resume DT prop).
>>>>>>
>>>>>> Some wrappers are unsure beforehands if they will reset. Add a mechanism
>>>>>> to signal *at resume* if power has been lost. Parent devices can set
>>>>>> this flag, that defaults to false.
>>>>>>
>>>>>> The XHCI_RESET_ON_RESUME quirk still triggers a runtime_pm_get() on the
>>>>>> controller. This is required as we do not know if a suspend will
>>>>>> trigger a reset, so the best guess is to avoid runtime PM.
>>>>>>
>>>>>> Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
>>>>>> ---
>>>>>>   drivers/usb/host/xhci.c | 3 ++-
>>>>>>   drivers/usb/host/xhci.h | 6 ++++++
>>>>>>   2 files changed, 8 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
>>>>>> index 5ebde8cae4fc44cdb997b0f61314e309bda56c0d..ae2c8daa206a87da24d58a62b0a0485ebf68cdd6 100644
>>>>>> --- a/drivers/usb/host/xhci.c
>>>>>> +++ b/drivers/usb/host/xhci.c
>>>>>> @@ -1017,7 +1017,8 @@ int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
>>>>>>   
>>>>>>   	spin_lock_irq(&xhci->lock);
>>>>>>   
>>>>>> -	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
>>>>>> +	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME ||
>>>>>> +	    xhci->broken_suspend || xhci->lost_power)
>>>>>>   		reinit_xhc = true;
>>>>>>   
>>>>>>   	if (!reinit_xhc) {
>>>>>> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
>>>>>> index 4914f0a10cff42dbc1448dcf7908534d582c848e..32526df75925989d40cbe7d59a187c945f498a30 100644
>>>>>> --- a/drivers/usb/host/xhci.h
>>>>>> +++ b/drivers/usb/host/xhci.h
>>>>>> @@ -1645,6 +1645,12 @@ struct xhci_hcd {
>>>>>>   	unsigned		broken_suspend:1;
>>>>>>   	/* Indicates that omitting hcd is supported if root hub has no ports */
>>>>>>   	unsigned		allow_single_roothub:1;
>>>>>> +	/*
>>>>>> +	 * Signal from upper stacks that we lost power during system-wide
>>>>>> +	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
>>>>>> +	 * it is safe for wrappers to not modify lost_power at resume.
>>>>>> +	 */
>>>>>> +	unsigned                lost_power:1;
>>>>>
>>>>> I suppose this is private to XHCI driver and not legitimate to be accessed
>>>>> by another driver after HCD is instantiated?
>>>>
>>>> Yes it is private.
>>>>
>>>>> Doesn't access to xhci_hcd need to be serialized via xhci->lock?
>>>>
>>>> Good question. In theory maybe. In practice I don't see how
>>>> cdns_host_resume(), called by cdns_resume(), could clash with anything
>>>> else. I'll add that to be safe.
>>>>
>>>>> Just curious, what happens if you don't include patch 4 and 5?
>>>>> Is USB functionality still broken for you?
>>>>
>>>> No it works fine. Patches 4+5 are only there to avoid the below warning.
>>>> Logging "xHC error in resume" is a lie, so I want to avoid it.
>>>
>>> How is it a lie?
>>> The XHCI controller did loose its save/restore state during a PM operation.
>>> As far as XHCI is concerned this is an error. no?
>>
>> The `xhci->quirks & XHCI_RESET_ON_RESUME` is exactly the same thing;
>> both the quirk and the flag we add have for purpose:
>>
>> 1. skipping this block
>>
>> 	if (!reinit_xhc) {
>> 		retval = xhci_handshake(&xhci->op_regs->status,
>> 					STS_CNR, 0, 10 * 1000 * 1000);
>> 		// ...
>> 		xhci_restore_registers(xhci);
>> 		xhci_set_cmd_ring_deq(xhci);
>> 		command = readl(&xhci->op_regs->command);
>> 		command |= CMD_CRS;
>> 		writel(command, &xhci->op_regs->command);
>> 		if (xhci_handshake(&xhci->op_regs->status,
>> 			      STS_RESTORE, 0, 100 * 1000)) {
>> 			// ...
>> 		}
>> 	}
>>
>> 2. avoiding this warning:
>>
>> 	xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);
>>
>> I don't think the block skipped is important in resume duration (to be
>> confirmed). But the xhci_warn() is not desired: we do not want to log
>> warnings if we know it is expected.
>>
>> I'll think some more about it.
> 
> About this series, there were two discussions:
> 
>   - The desire to avoid putting the hardware init sequence of cdns3-ti
>     inside  runtime_resume() as we don't do runtime PM.
>     *That is fine and will be fixed for the next revision.*
>     See [PATCH V6 2/5]: https://lore.kernel.org/lkml/8a1ed4be-c41c-46b6-ae25-33a6035b8c8d@kernel.org/
> 
>   - [PATCH V6 4/5] and [PATCH V6 5/5] are dedicated to avoiding a warning
>     at XHCI resume on J7200:
> 
>        xhci_warn(xhci, "xHC error in resume, USBSTS 0x%x, Reinit\n", temp);
> 

Adding a new quirk or private xhci_cd meme


>     https://lore.kernel.org/lkml/20241210-s2r-cdns-v6-4-28a17f9715a2@bootlin.com/
>     https://lore.kernel.org/lkml/20241210-s2r-cdns-v6-5-28a17f9715a2@bootlin.com/
> 
>     Roger Quadros asked if we should not instead keep it, as there is
>     indeed a reinit of the xHC block. I don't think we do: we don't want
>     a warning at resume; IMO it would imply the reinit was unexpected.
> 
>     Proof is there is already a platform with a ->broken_suspend boolean
>     that disables the warning even though there is a reinit. It doesn't
>     log a warning as the reinit was expected.
> 
>     So we currently have:
>      - xhci->broken_suspend: set at probe & implies the resume sequence
>        won't work.
>      - xhci->quirks & XHCI_RESET_ON_RESUME: set at probe & implies the
>        controller reset during suspend.
> 
>     IIUC xhci->broken_suspend is NOT equivalent to "the controller reset
>     during suspend". Else we wouldn't have both the broken_suspend flag
>     and the XHCI_RESET_ON_RESUME quirk.
> 
>     In our case we want exactly the same thing as the
>     XHCI_RESET_ON_RESUME quirk but updated at resume depending on what
>     the wrapper driver detects.
> 
>     We could either:
>     1. Update xhci->quirks at resume from upper layers.
>     2. Introduce a xhci->lost_power flag. It would be strictly equivalent
>        to the XHCI_RESET_ON_RESUME quirk BUT updated at resume from
>        upper layers.
> 
>     @Mathias Nyman: what is your thought on the matter? Option (2) was
>     the one taken in this series. Is there another option I am missing?

This would be a fourth way the upper layers inform xhci_resume() that the
xHC host should be reset instead of restoring the registers.

option 1 creates the first dynamic xhci->quirk flag,
option 2 adds a xhci->lost_power flag that is touched outside of xhci driver.

Neither seem like a good idea just to get rid of a dev_warn() message.

Maybe its time to split xhci_resume() into xhci_reset_resume()
and xhci_restore_resume(), and let those upper layers decide what they need.

Doesn't cdns driver already have a xhci_plat_priv resume_quirk() function
called  during xhci_plat_resume(), before xhci_resume()?
Can that be used?

Thanks
Mathias
Théo Lebrun Jan. 29, 2025, 10:45 a.m. UTC | #7
Hello Mathias,

On Wed Jan 8, 2025 at 7:43 PM CET, Mathias Nyman wrote:
> This would be a fourth way the upper layers inform xhci_resume() that the
> xHC host should be reset instead of restoring the registers.
>
> option 1 creates the first dynamic xhci->quirk flag,
> option 2 adds a xhci->lost_power flag that is touched outside of xhci driver.
>
> Neither seem like a good idea just to get rid of a dev_warn() message.
>
> Maybe its time to split xhci_resume() into xhci_reset_resume()
> and xhci_restore_resume(), and let those upper layers decide what they need.
>
> Doesn't cdns driver already have a xhci_plat_priv resume_quirk() function
> called  during xhci_plat_resume(), before xhci_resume()?
> Can that be used?

I agree with your feeling: another solution is needed. Discussing the
topic gave some new ideas and I have a new solution that feels much
more appropriate.

### Opinion on splitting xhci_resume() into two implementations

About splitting xhci_resume() into two different implementations that is
picked by above layer: I am not convinced.

What would go into xhci_reset_resume() and xhci_restore_resume()? There
isn't a clear cut inbetween the reset procedure and the normal restore
procedure, as we might do a reset if the normal restore procedure
fails (with some logging that reset was unexpected).

We would probably end up with many small functions called by either,
which would blur the overall step sequence.

### Proposal

Instead, I propose we keep xhci_resume() as a single function but change
its signature from the current:

   int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg);

To this:

   int xhci_resume(struct xhci_hcd *xhci, bool power_lost,
                   bool is_auto_resume);

The key insight is that xhci_resume() extracts two information out of
the message:
 - whether we are after hibernation: msg.event == PM_EVENT_RESTORE,
 - whether this is an auto resume: msg.event == PM_EVENT_AUTO_RESUME.

First bulletpoint is somewhat wrong: driver wants to know if the device
did lose power, it doesn't care about hibernation per se. It just
happened to be that hibernation was the only case of power loss.
Knowing that, we can refactor to ask upper layers the right questions:
(1) "did we lose power?" and, (2) "is this an auto resume?".

Then, each caller is responsible for handing those booleans. If they
themselves receive pm_message_t messages (eg xhci-pci), then they do
the simple conversion:

      bool power_lost = msg.event == PM_EVENT_RESTORE;
      bool is_auto_resume = msg.event == PM_EVENT_AUTO_RESUME;

Others can do more more or less computation to pick a power_lost value.

### About xhci-plat power_lost value

In the case of xhci-plat, I think it will be:
 - A new power_lost field in `struct xhci_plat_priv`.
 - That gets set inside the cdns_role_driver::resume() callback of
   drivers/usb/cdns3/host.c.
 - Then xhci_plat_resume_common() computes power_lost being:
      power_lost = is_restore || priv->power_lost;

### About xhci_plat_priv::resume_quirk()

It isn't really useful to use. drivers/usb/cdns3/host.c can know whether
power was lost through its USB role resume() callback. From there to
the resume_quirk(), the boolean must be stored somewhere. That
somewhere can only be... inside xhci_plat_priv. So it is simpler if
xhci-plat retrieves the boolean directly.

xhci_plat_priv::resume_quirk() can change the power_lost value if it
wants to, that is fine. But in our case, the info isn't available from
there.

###

I am unsure if the above explanation is clear, so I'll be sending my
current work-in-progress series as an answer to this. The goal being
that you can give me your opinion on the proposal: ACK and we continue
in this direction, NACK and we keep digging.

I'll wait for the merge window to end before sending a proper revision.

Thanks!

--
Théo Lebrun, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
diff mbox series

Patch

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 5ebde8cae4fc44cdb997b0f61314e309bda56c0d..ae2c8daa206a87da24d58a62b0a0485ebf68cdd6 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1017,7 +1017,8 @@  int xhci_resume(struct xhci_hcd *xhci, pm_message_t msg)
 
 	spin_lock_irq(&xhci->lock);
 
-	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME || xhci->broken_suspend)
+	if (hibernated || xhci->quirks & XHCI_RESET_ON_RESUME ||
+	    xhci->broken_suspend || xhci->lost_power)
 		reinit_xhc = true;
 
 	if (!reinit_xhc) {
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 4914f0a10cff42dbc1448dcf7908534d582c848e..32526df75925989d40cbe7d59a187c945f498a30 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1645,6 +1645,12 @@  struct xhci_hcd {
 	unsigned		broken_suspend:1;
 	/* Indicates that omitting hcd is supported if root hub has no ports */
 	unsigned		allow_single_roothub:1;
+	/*
+	 * Signal from upper stacks that we lost power during system-wide
+	 * suspend. Its default value is based on XHCI_RESET_ON_RESUME, meaning
+	 * it is safe for wrappers to not modify lost_power at resume.
+	 */
+	unsigned                lost_power:1;
 	/* cached extended protocol port capabilities */
 	struct xhci_port_cap	*port_caps;
 	unsigned int		num_port_caps;