
usb: dwc3: core: Disable USB2.0 phy suspend when dwc3 acts as host role

Message ID b90a6c6878d9f386806619826ffac3504662f53d.1479202687.git.baolin.wang@linaro.org
State New

Commit Message

(Exiting) Baolin Wang Nov. 15, 2016, 9:41 a.m. UTC
When the dwc3 controller acts as host with a slow-speed device attached
(such as a mouse or keypad), unplugging that device causes the endpoint
deconfiguration command, which drops the endpoint's resources, to time
out. The xHCI log below shows the command timeout when disconnecting one
slow device:

[   99.807739] c0 xhci-hcd.0.auto: Port Status Change Event for port 1
[   99.814699] c0 xhci-hcd.0.auto: resume root hub
[   99.819992] c0 xhci-hcd.0.auto: handle_port_status: starting port
				   polling.
[   99.827808] c0 xhci-hcd.0.auto: get port status, actual port 0 status
				   = 0x202a0
[   99.835903] c0 xhci-hcd.0.auto: Get port status returned 0x10100
[   99.850052] c0 xhci-hcd.0.auto: clear port connect change, actual
				   port 0 status  = 0x2a0
[   99.859313] c0 xhci-hcd.0.auto: Cancel URB ffffffc01ed6cd00, dev 1,
				   ep 0x81, starting at offset 0xc406d210
[   99.869645] c0 xhci-hcd.0.auto: // Ding dong!
[   99.874776] c0 xhci-hcd.0.auto: Stopped on Transfer TRB
[   99.880713] c0 xhci-hcd.0.auto: Removing canceled TD starting at
				   0xc406d210 (dma).
[   99.889012] c0 xhci-hcd.0.auto: Finding endpoint context
[   99.895069] c0 xhci-hcd.0.auto: Cycle state = 0x1
[   99.900519] c0 xhci-hcd.0.auto: New dequeue segment =
				   ffffffc1112f0880 (virtual)
[   99.908655] c0 xhci-hcd.0.auto: New dequeue pointer = 0xc406d220 (DMA)
[   99.915927] c0 xhci-hcd.0.auto: Set TR Deq Ptr cmd, new deq seg =
				   ffffffc1112f0880 (0xc406d000 dma),
				   new deq ptr = ffffff8002175220
				   (0xc406d220 dma), new cycle = 1
[   99.931242] c0 xhci-hcd.0.auto: // Ding dong!
[   99.936360] c0 xhci-hcd.0.auto: Successful Set TR Deq Ptr cmd,
				   deq = @c406d220
[   99.944458] c0 xhci-hcd.0.auto: xhci_hub_status_data: stopping port
				   polling.
[  100.047619] c0 xhci-hcd.0.auto: xhci_drop_endpoint called for udev
				   ffffffc01ae08800
[  100.057002] c0 xhci-hcd.0.auto: drop ep 0x81, slot id 1, new drop
				   flags = 0x8, new add flags = 0x0
[  100.067878] c0 xhci-hcd.0.auto: xhci_check_bandwidth called for udev
				   ffffffc01ae08800
[  100.076868] c0 xhci-hcd.0.auto: New Input Control Context:

......

[  100.427252] c0 xhci-hcd.0.auto: // Ding dong!
[  105.430728] c0 xhci-hcd.0.auto: Command timeout
[  105.436029] c0 xhci-hcd.0.auto: Abort command ring
[  113.558223] c0 xhci-hcd.0.auto: Command completion event does not match
				   command
[  113.569778] c0 xhci-hcd.0.auto: Timeout while waiting for configure
				   endpoint command

The reason is that the USB phy is suspended, which disables the phy clock,
when the slow USB device is disconnected. This hangs the xHCI commands
that are still executing, because they depend on the phy clock.

Thus we should disable the USB 2.0 phy suspend feature when dwc3 acts as
host.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>

---
 drivers/usb/dwc3/core.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

-- 
1.7.9.5
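
To make the proposed register change concrete, here is a minimal, self-contained sketch of the bit manipulation the patch performs on GUSB2PHYCFG. It is a model rather than the driver code: `usb2_phycfg_for_mode`, the `enum dr_mode` values, and the boolean standing in for the `dwc->revision > DWC3_REVISION_194A` check are hypothetical names, and the SUSPHY bit position (bit 6) is assumed from the kernel's `core.h`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match DWC3_GUSB2PHYCFG_SUSPHY in drivers/usb/dwc3/core.h. */
#define GUSB2PHYCFG_SUSPHY (1u << 6)

/* Hypothetical stand-in for the kernel's enum usb_dr_mode. */
enum dr_mode { DR_MODE_PERIPHERAL, DR_MODE_HOST, DR_MODE_OTG };

/*
 * Model of the patched logic: cores newer than 1.94a enable USB 2.0 phy
 * suspend, then host and OTG configurations clear the bit again.
 */
static uint32_t usb2_phycfg_for_mode(uint32_t reg, bool newer_than_194a,
				     enum dr_mode mode)
{
	if (newer_than_194a)
		reg |= GUSB2PHYCFG_SUSPHY;

	if (mode == DR_MODE_HOST || mode == DR_MODE_OTG)
		reg &= ~GUSB2PHYCFG_SUSPHY;

	return reg;
}
```

Under this model, every host or OTG configuration ends up with phy suspend disabled, whether or not the board actually shows the timeout.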

Comments

Felipe Balbi Jan. 12, 2017, 7:49 a.m. UTC | #1
Hi,

Baolin Wang <baolin.wang@linaro.org> writes:
>> Baolin Wang <baolin.wang@linaro.org> writes:

>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>> index 9a4a5e4..0b646cf 100644
>>> --- a/drivers/usb/dwc3/core.c
>>> +++ b/drivers/usb/dwc3/core.c
>>> @@ -565,6 +565,20 @@ static int dwc3_phy_setup(struct dwc3 *dwc)
>>>       if (dwc->revision > DWC3_REVISION_194A)
>>>               reg |= DWC3_GUSB2PHYCFG_SUSPHY;
>>>
>>> +     /*
>>> +      * When dwc3 controller acts as host role with attaching one slow speed
>>> +      * device (like mouse or keypad). Then if we plugged out the slow speed
>>> +      * device, it will timeout to run the deconfiguration endpoint command.
>>> +      * The reason is it will suspend USB phy to disable phy clock when
>>> +      * disconnecting slow speed decice, which will affect the xHCI commands
>>> +      * executing.
>>> +      *
>>> +      * Thus we should disable USB 2.0 phy suspend feature when dwc3 acts as
>>> +      * host role.
>>> +      */
>>> +     if (dwc->dr_mode == USB_DR_MODE_HOST || dwc->dr_mode == USB_DR_MODE_OTG)
>>> +             reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;
>>
>> which version of the core you're using? Recent version (since 1.94A,
>
> My version is 2.80a.
>
>> IIRC) can manage core suspend automatically. Also, this patch of yours
>> will cause a power consumption regression.
>
> Yes, it can manage core suspend automatically, that is the problem.
> When plugging out one mouse or keypad device, the phy will suspend
> automatically to disable the phy clock. But now the disconnecting
> process is not finished, and some xHCI commands (like deconfiguration
> endpoint command to drop endpoint resources) need depend on the phy
> clock, which will hang on the system to timeout the command or abort
> command ring to halt the xHCI.
>
> I agree with you it will cause a power consumption regression, but it
> will cause serious problem if not. Do you have some suggestion?

Sorry for the long delay. This was lost in my inbox.

I'm not sure this patch is the best solution. There's no mention in
the Databook that we should avoid PHY suspend when acting as host. Adding
John here to see if he has any idea of how to fix this.

-- 
balbi
John Youn Jan. 12, 2017, 9:47 p.m. UTC | #2
On 1/11/2017 11:51 PM, Felipe Balbi wrote:
> sorry for the long delay. This was lost in my inbox.
>
> I'm not sure this patch is the best solution. There's no mention in
> Databook that we should avoid PHY suspend when acting as host. Adding
> John here to see if John has any idea of how to fix this.

I'm not familiar enough with the xHCI side of things to say.

I'll ask around to see if anyone has an idea.

Regards,
John
John Youn Jan. 19, 2017, 1:33 a.m. UTC | #3
On 1/16/2017 2:38 AM, Felipe Balbi wrote:
>
> Hi,
>
> John Youn <John.Youn@synopsys.com> writes:
>> I'm not familiar enough with XHCI side of things to say.
>>
>> I'll ask around to see if anyone has an idea.

Hi Felipe, Baolin,

I talked with a couple engineers here and the behavior is not
something that's expected in host mode.

Can you check that the value of the GCTL.RAMCLKSEL is set
appropriately? This affects where the core gets the clock signal
from. If it is getting it from the phy clock then you will likely have
this problem and will need to adjust it. Otherwise you should probably
use the existing quirk instead.

Regards,
John
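
As a reference for the check John suggests, the following is a small sketch of how the RAMCLKSEL field could be decoded from a GCTL register value. It assumes the field occupies GCTL bits [7:6] and that a value of 0 selects the bus clock, as reported later in this thread; the helper names are hypothetical and are not part of the dwc3 driver, and the meaning of the non-zero encodings should be taken from the Databook.

```c
#include <stdbool.h>
#include <stdint.h>

/* RAMCLKSEL is assumed to occupy GCTL bits [7:6]. */
#define GCTL_RAMCLKSEL_SHIFT 6
#define GCTL_RAMCLKSEL_MASK  0x3u

/* Hypothetical helper: extract the RAMCLKSEL field from a GCTL value. */
static unsigned int gctl_ramclksel(uint32_t gctl)
{
	return (gctl >> GCTL_RAMCLKSEL_SHIFT) & GCTL_RAMCLKSEL_MASK;
}

/* A field value of 0 is taken to mean "bus clock", per the thread. */
static bool ramclk_is_bus_clock(uint32_t gctl)
{
	return gctl_ramclksel(gctl) == 0;
}
```

A caller would read GCTL from the controller and pass the raw value to `ramclk_is_bus_clock()`; any other result means the RAM clock comes from a source other than the bus clock.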
(Exiting) Baolin Wang Jan. 19, 2017, 3:12 a.m. UTC | #4
Hi John,

On 19 January 2017 at 09:33, John Youn <John.Youn@synopsys.com> wrote:
> Hi Felipe, Baolin,
>
> I talked with a couple engineers here and the behavior is not
> something that's expected in host mode.
>
> Can you check that the value of the GCTL.RAMCLKSEL is set
> appropriately? This affects where the core gets the clock signal

In host mode, RAMCLKSEL (bits [7:6]) has its default value of 0, which
means it selects the bus clock.

> from. If it is getting it from the phy clock then you will likely have
> this problem and will need to adjust it. Otherwise you should probably
> use the existing quirk instead.

So if the bus clock is derived from the phy clock, we will have this
problem when the phy suspends? Yes, I can use the existing quirk, but I am
afraid this is a common problem for anyone using the mainline kernel.
Alternatively, we could add some documentation about enabling the phy
suspend feature to warn other people.

-- 
Baolin.wang
Best Regards
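
For comparison, the quirk-based alternative can be modeled in the same way. This sketch assumes that "the existing quirk" refers to the driver's `dis_u2_susphy_quirk` flag (set from the `snps,dis_u2_susphy_quirk` device property); the thread itself does not name the quirk. The `struct phy_cfg` type and the function name are hypothetical, and the SUSPHY bit position is assumed from the kernel's `core.h`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match DWC3_GUSB2PHYCFG_SUSPHY in drivers/usb/dwc3/core.h. */
#define GUSB2PHYCFG_SUSPHY (1u << 6)

/* Hypothetical, trimmed-down stand-in for the driver's per-controller state. */
struct phy_cfg {
	bool newer_than_194a;     /* models dwc->revision > DWC3_REVISION_194A */
	bool dis_u2_susphy_quirk; /* assumed to mirror the driver flag of that name */
};

static uint32_t usb2_phycfg_with_quirk(uint32_t reg, const struct phy_cfg *cfg)
{
	if (cfg->newer_than_194a)
		reg |= GUSB2PHYCFG_SUSPHY;

	/* Only controllers that opt in via the quirk lose USB 2.0 phy suspend. */
	if (cfg->dis_u2_susphy_quirk)
		reg &= ~GUSB2PHYCFG_SUSPHY;

	return reg;
}
```

The difference from the proposed patch is scope: the quirk is opt-in per controller, so platforms that do not see the timeout keep phy suspend and its power savings.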
John Youn Jan. 19, 2017, 3:31 a.m. UTC | #5
On 1/18/2017 7:12 PM, Baolin Wang wrote:
> Hi John,
>
> On 19 January 2017 at 09:33, John Youn <John.Youn@synopsys.com> wrote:
>> On 1/16/2017 2:38 AM, Felipe Balbi wrote:
>>>
>>> Hi,
>>>
>>> John Youn <John.Youn@synopsys.com> writes:
>>>>> Baolin Wang <baolin.wang@linaro.org> writes:
>>>>>>> Baolin Wang <baolin.wang@linaro.org> writes:
>>>>>>>> When dwc3 controller acts as host role with attaching slow speed device

>>>>>>>> (like mouse or keypad). Then if we plugged out the slow speed device,

>>>>>>>> it will timeout to run the deconfiguration endpoint command to drop the

>>>>>>>> endpoint's resources. Some xHCI command timeout log as below when

>>>>>>>> disconnecting one slow device:

>>>>>>>>

>>>>>>>> [   99.807739] c0 xhci-hcd.0.auto: Port Status Change Event for port 1
>>>>>>>> [   99.814699] c0 xhci-hcd.0.auto: resume root hub
>>>>>>>> [   99.819992] c0 xhci-hcd.0.auto: handle_port_status: starting port
>>>>>>>>                                  polling.
>>>>>>>> [   99.827808] c0 xhci-hcd.0.auto: get port status, actual port 0 status
>>>>>>>>                                  = 0x202a0
>>>>>>>> [   99.835903] c0 xhci-hcd.0.auto: Get port status returned 0x10100
>>>>>>>> [   99.850052] c0 xhci-hcd.0.auto: clear port connect change, actual
>>>>>>>>                                  port 0 status  = 0x2a0
>>>>>>>> [   99.859313] c0 xhci-hcd.0.auto: Cancel URB ffffffc01ed6cd00, dev 1,
>>>>>>>>                                  ep 0x81, starting at offset 0xc406d210
>>>>>>>> [   99.869645] c0 xhci-hcd.0.auto: // Ding dong!
>>>>>>>> [   99.874776] c0 xhci-hcd.0.auto: Stopped on Transfer TRB
>>>>>>>> [   99.880713] c0 xhci-hcd.0.auto: Removing canceled TD starting at
>>>>>>>>                                  0xc406d210 (dma).
>>>>>>>> [   99.889012] c0 xhci-hcd.0.auto: Finding endpoint context
>>>>>>>> [   99.895069] c0 xhci-hcd.0.auto: Cycle state = 0x1
>>>>>>>> [   99.900519] c0 xhci-hcd.0.auto: New dequeue segment =
>>>>>>>>                                  ffffffc1112f0880 (virtual)
>>>>>>>> [   99.908655] c0 xhci-hcd.0.auto: New dequeue pointer = 0xc406d220 (DMA)
>>>>>>>> [   99.915927] c0 xhci-hcd.0.auto: Set TR Deq Ptr cmd, new deq seg =
>>>>>>>>                                  ffffffc1112f0880 (0xc406d000 dma),
>>>>>>>>                                  new deq ptr = ffffff8002175220
>>>>>>>>                                  (0xc406d220 dma), new cycle = 1
>>>>>>>> [   99.931242] c0 xhci-hcd.0.auto: // Ding dong!
>>>>>>>> [   99.936360] c0 xhci-hcd.0.auto: Successful Set TR Deq Ptr cmd,
>>>>>>>>                                  deq = @c406d220
>>>>>>>> [   99.944458] c0 xhci-hcd.0.auto: xhci_hub_status_data: stopping port
>>>>>>>>                                  polling.
>>>>>>>> [  100.047619] c0 xhci-hcd.0.auto: xhci_drop_endpoint called for udev
>>>>>>>>                                  ffffffc01ae08800
>>>>>>>> [  100.057002] c0 xhci-hcd.0.auto: drop ep 0x81, slot id 1, new drop
>>>>>>>>                                  flags = 0x8, new add flags = 0x0
>>>>>>>> [  100.067878] c0 xhci-hcd.0.auto: xhci_check_bandwidth called for udev
>>>>>>>>                                  ffffffc01ae08800
>>>>>>>> [  100.076868] c0 xhci-hcd.0.auto: New Input Control Context:
>>>>>>>>
>>>>>>>> ......
>>>>>>>>
>>>>>>>> [  100.427252] c0 xhci-hcd.0.auto: // Ding dong!
>>>>>>>> [  105.430728] c0 xhci-hcd.0.auto: Command timeout
>>>>>>>> [  105.436029] c0 xhci-hcd.0.auto: Abort command ring
>>>>>>>> [  113.558223] c0 xhci-hcd.0.auto: Command completion event does not match
>>>>>>>>                                  command
>>>>>>>> [  113.569778] c0 xhci-hcd.0.auto: Timeout while waiting for configure
>>>>>>>>                                  endpoint command
>>>>>>>>
>>>>>>>> The reason is it will suspend USB phy to disable phy clock when
>>>>>>>> disconnecting the slow USB device, that will hang on the xHCI commands
>>>>>>>> executing which depends on the phy clock.
>>>>>>>>
>>>>>>>> Thus we should disable USB2.0 phy suspend feature when dwc3 acts as host
>>>>>>>> role.
>>>>>>>>
>>>>>>>> Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
>>>>>>>> ---
>>>>>>>>  drivers/usb/dwc3/core.c |   14 ++++++++++++++
>>>>>>>>  1 file changed, 14 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>>>>>>> index 9a4a5e4..0b646cf 100644
>>>>>>>> --- a/drivers/usb/dwc3/core.c
>>>>>>>> +++ b/drivers/usb/dwc3/core.c
>>>>>>>> @@ -565,6 +565,20 @@ static int dwc3_phy_setup(struct dwc3 *dwc)
>>>>>>>>       if (dwc->revision > DWC3_REVISION_194A)
>>>>>>>>               reg |= DWC3_GUSB2PHYCFG_SUSPHY;
>>>>>>>>
>>>>>>>> +     /*
>>>>>>>> +      * When dwc3 controller acts as host role with attaching one slow speed
>>>>>>>> +      * device (like mouse or keypad). Then if we plugged out the slow speed
>>>>>>>> +      * device, it will timeout to run the deconfiguration endpoint command.
>>>>>>>> +      * The reason is it will suspend USB phy to disable phy clock when
>>>>>>>> +      * disconnecting slow speed device, which will affect the xHCI commands
>>>>>>>> +      * executing.
>>>>>>>> +      *
>>>>>>>> +      * Thus we should disable USB 2.0 phy suspend feature when dwc3 acts as
>>>>>>>> +      * host role.
>>>>>>>> +      */
>>>>>>>> +     if (dwc->dr_mode == USB_DR_MODE_HOST || dwc->dr_mode == USB_DR_MODE_OTG)
>>>>>>>> +             reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;
>>>>>>>
>>>>>>> which version of the core you're using? Recent version (since 1.94A,
>>>>>>
>>>>>> My version is 2.80a.
>>>>>>
>>>>>>> IIRC) can manage core suspend automatically. Also, this patch of yours
>>>>>>> will cause a power consumption regression.
>>>>>>
>>>>>> Yes, it can manage core suspend automatically, that is the problem.
>>>>>> When plugging out one mouse or keypad device, the phy will suspend
>>>>>> automatically to disable the phy clock. But now the disconnecting
>>>>>> process is not finished, and some xHCI commands (like deconfiguration
>>>>>> endpoint command to drop endpoint resources) need depend on the phy
>>>>>> clock, which will hang on the system to timeout the command or abort
>>>>>> command ring to halt the xHCI.
>>>>>>
>>>>>> I agree with you it will cause a power consumption regression, but it
>>>>>> will cause serious problem if not. Do you have some suggestion?
>>>>>
>>>>> sorry for the long delay. This was lost in my inbox.
>>>>>
>>>>> I'm not sure this patch is the best solution. There's no mention in
>>>>> Databook that we should avoid PHY suspend when acting as host. Adding
>>>>> John here to see if John has any idea of how to fix this.
>>>>>
>>>>
>>>> I'm not familiar enough with XHCI side of things to say.
>>>>
>>>> I'll ask around to see if anyone has an idea.
>>>
>>
>> Hi Felipe, Baolin,
>>
>> I talked with a couple engineers here and the behavior is not
>> something that's expected in host mode.
>>
>> Can you check that the value of the GCTL.RAMCLKSEL is set
>> appropriately? This affects where the core gets the clock signal
>
> In host mode, the bit[6:7] for RAMCLKSEL is default value 0, which
> means it selects bus clock.
>
>> from. If it is getting it from the phy clock then you will likely have
>> this problem and will need to adjust it. Otherwise you should probably
>> use the existing quirk instead.
>
> So the bus clock is from the phy clock, then it will have this problem
> when suspending phy? Yes, I can use the existing quirk, but I am
> afraid it is one common problem if we use the mainline kernel. Or we
> can add some documentation for enabling the phy suspend feature to
> remind other people.
>

Hi Baolin,

If you suspend only the PHY, the PHY and the core are expected to run
from different clocks. That is not specific to host mode: when handling
LPM in device mode, for example, you could hit the same problem.

You would have to confirm how the clocks are wired on your own platform.

Regards,
John
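
For anyone following along, the RAMCLKSEL check discussed above could be sketched roughly as below. This is a standalone illustration, not driver code. The field position (GCTL bits [7:6]) and the meaning of value 0 (bus clock) are taken from this thread; the macro names, the enum, both helper functions, and the labels for values 1 and 2 are assumptions made for the example and should be checked against the databook for your core.

```c
#include <stdint.h>

/* Hypothetical names for this sketch; GCTL.RAMCLKSEL occupies bits [7:6]. */
#define GCTL_RAMCLKSEL_SHIFT	6
#define GCTL_RAMCLKSEL_MASK	(0x3u << GCTL_RAMCLKSEL_SHIFT)

/* Value 0 is from the thread; the other labels are assumptions. */
enum ramclk_src {
	RAMCLK_BUS = 0,		/* bus clock (default reported in host mode) */
	RAMCLK_PIPE = 1,	/* assumed: pipe clock */
	RAMCLK_PIPE_HALF = 2,	/* assumed: pipe clock / 2 */
	RAMCLK_OTHER = 3,	/* not decoded in this sketch */
};

/* Extract the RAMCLKSEL field from a raw GCTL register value. */
static enum ramclk_src gctl_ramclksel(uint32_t gctl)
{
	return (enum ramclk_src)((gctl & GCTL_RAMCLKSEL_MASK) >>
				 GCTL_RAMCLKSEL_SHIFT);
}

/* Map the decoded field to a printable label. */
static const char *ramclk_name(enum ramclk_src src)
{
	switch (src) {
	case RAMCLK_BUS:
		return "bus clock";
	case RAMCLK_PIPE:
		return "pipe clock";
	case RAMCLK_PIPE_HALF:
		return "pipe/2 clock";
	default:
		return "other";
	}
}
```

Passing the value read back from GCTL to `gctl_ramclksel()` only tells you what the core selects. Whether the selected bus clock itself depends on the PHY is a platform question, as noted above.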

Patch

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 9a4a5e4..0b646cf 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -565,6 +565,20 @@  static int dwc3_phy_setup(struct dwc3 *dwc)
 	if (dwc->revision > DWC3_REVISION_194A)
 		reg |= DWC3_GUSB2PHYCFG_SUSPHY;
 
+	/*
+	 * When dwc3 controller acts as host role with attaching one slow speed
+	 * device (like mouse or keypad). Then if we plugged out the slow speed
+	 * device, it will timeout to run the deconfiguration endpoint command.
+	 * The reason is it will suspend USB phy to disable phy clock when
+	 * disconnecting slow speed device, which will affect the xHCI commands
+	 * executing.
+	 *
+	 * Thus we should disable USB 2.0 phy suspend feature when dwc3 acts as
+	 * host role.
+	 */
+	if (dwc->dr_mode == USB_DR_MODE_HOST || dwc->dr_mode == USB_DR_MODE_OTG)
+		reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;
+
 	if (dwc->dis_u2_susphy_quirk)
 		reg &= ~DWC3_GUSB2PHYCFG_SUSPHY;
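
To compare the two options raised in the thread (the unconditional host/OTG check in this patch versus the existing `dis_u2_susphy_quirk`), the hunk above can be modelled as a small standalone function. This is only a sketch under stated assumptions: `struct fake_dwc3`, `susphy_setup()`, the `with_patch` flag, and the local enum are stand-ins invented for the example, and the numeric values used for the SUSPHY bit and the 1.94a revision are assumed rather than taken from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; check the driver headers. */
#define SUSPHY_BIT	(1u << 6)
#define REVISION_194A	0x5533194au

/* Local stand-in for the kernel's dual-role mode enum. */
enum dr_mode {
	DR_MODE_UNKNOWN,
	DR_MODE_HOST,
	DR_MODE_PERIPHERAL,
	DR_MODE_OTG,
};

/* Hypothetical stand-in for the fields of struct dwc3 used by the hunk. */
struct fake_dwc3 {
	uint32_t revision;
	enum dr_mode dr_mode;
	bool dis_u2_susphy_quirk;
};

/*
 * Mirror the order of operations in the hunk: enable SUSPHY on newer
 * cores, then clear it for host/OTG (only when modelling this patch),
 * then clear it if the existing quirk is set.
 */
static uint32_t susphy_setup(const struct fake_dwc3 *dwc, uint32_t reg,
			     bool with_patch)
{
	if (dwc->revision > REVISION_194A)
		reg |= SUSPHY_BIT;

	if (with_patch && (dwc->dr_mode == DR_MODE_HOST ||
			   dwc->dr_mode == DR_MODE_OTG))
		reg &= ~SUSPHY_BIT;

	if (dwc->dis_u2_susphy_quirk)
		reg &= ~SUSPHY_BIT;

	return reg;
}
```

With the patch modelled, every host or OTG configuration loses PHY suspend, which is the power regression mentioned in the review. With the quirk, only platforms that set it are affected, and peripheral-only setups keep SUSPHY either way.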