Message ID | 1519730526-22274-1-git-send-email-rogerq@ti.com |
---|---|
State | New |
Series | usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume |
Hi Roger, On 27 February 2018 at 19:22, Roger Quadros <rogerq@ti.com> wrote: > In the following test we get stuck by sleeping forever in _dwc3_set_mode() > after which dual-role switching doesn't work. > > On dra7-evm's dual-role port, > - Load g_zero gadget driver and enumerate to host > - suspend to mem > - disconnect USB cable to host and connect otg cable with Pen drive in it. > - resume system > - we sleep indefinitely in _dwc3_set_mode due to. > dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> > dwc3_gadget_stop()->wait_event_lock_irq() > > Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints > so we don't wait in dwc3_gadget_stop(). I am curious why the DWC3_DEPEVT_EPCMDCMPLT event was not triggered any more when you executed the DWC3_DEPCMD_ENDTRANSFER command? > > Signed-off-by: Roger Quadros <rogerq@ti.com> > --- > drivers/usb/dwc3/gadget.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 2bda4eb..0a360da 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc) > > void dwc3_gadget_exit(struct dwc3 *dwc) > { > + int epnum; > + unsigned long flags; > + > + spin_lock_irqsave(&dwc->lock, flags); > + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { > + struct dwc3_ep *dep = dwc->eps[epnum]; > + > + if (!dep) > + continue; > + > + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; > + } > + spin_unlock_irqrestore(&dwc->lock, flags); > + > usb_del_gadget_udc(&dwc->gadget); > dwc3_gadget_free_endpoints(dwc); > dma_free_coherent(dwc->sysdev, DWC3_BOUNCE_SIZE, dwc->bounce, > -- -- Baolin.wang Best Regards
Hi, Roger Quadros <rogerq@ti.com> writes: > In the following test we get stuck by sleeping forever in _dwc3_set_mode() > after which dual-role switching doesn't work. > > On dra7-evm's dual-role port, > - Load g_zero gadget driver and enumerate to host > - suspend to mem > - disconnect USB cable to host and connect otg cable with Pen drive in it. > - resume system > - we sleep indefinitely in _dwc3_set_mode due to. > dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> > dwc3_gadget_stop()->wait_event_lock_irq() > > Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints > so we don't wait in dwc3_gadget_stop(). > > Signed-off-by: Roger Quadros <rogerq@ti.com> > --- > drivers/usb/dwc3/gadget.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 2bda4eb..0a360da 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc) > > void dwc3_gadget_exit(struct dwc3 *dwc) > { > + int epnum; > + unsigned long flags; > + > + spin_lock_irqsave(&dwc->lock, flags); > + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { > + struct dwc3_ep *dep = dwc->eps[epnum]; > + > + if (!dep) > + continue; > + > + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; > + } > + spin_unlock_irqrestore(&dwc->lock, flags); > + > usb_del_gadget_udc(&dwc->gadget); > dwc3_gadget_free_endpoints(dwc); free endpoints is a better place for this. It's already going to free the memory anyway. Might as well clear all flags to 0 there. -- balbi
Hi Baolin, On 28/02/18 05:04, Baolin Wang wrote: > Hi Roger, > > On 27 February 2018 at 19:22, Roger Quadros <rogerq@ti.com> wrote: >> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >> after which dual-role switching doesn't work. >> >> On dra7-evm's dual-role port, >> - Load g_zero gadget driver and enumerate to host >> - suspend to mem >> - disconnect USB cable to host and connect otg cable with Pen drive in it. >> - resume system >> - we sleep indefinitely in _dwc3_set_mode due to. >> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >> dwc3_gadget_stop()->wait_event_lock_irq() >> >> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints >> so we don't wait in dwc3_gadget_stop(). > > I am curious why the DWC3_DEPEVT_EPCMDCMPLT event was not triggered > any more when you executed the DWC3_DEPCMD_ENDTRANSFER command? In this particular case the USB gadget has been disconnected from the host so we shouldn't be expecting any command completion events. 
> >> >> Signed-off-by: Roger Quadros <rogerq@ti.com> >> --- >> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++ >> 1 file changed, 14 insertions(+) >> >> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >> index 2bda4eb..0a360da 100644 >> --- a/drivers/usb/dwc3/gadget.c >> +++ b/drivers/usb/dwc3/gadget.c >> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc) >> >> void dwc3_gadget_exit(struct dwc3 *dwc) >> { >> + int epnum; >> + unsigned long flags; >> + >> + spin_lock_irqsave(&dwc->lock, flags); >> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { >> + struct dwc3_ep *dep = dwc->eps[epnum]; >> + >> + if (!dep) >> + continue; >> + >> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; >> + } >> + spin_unlock_irqrestore(&dwc->lock, flags); >> + >> usb_del_gadget_udc(&dwc->gadget); >> dwc3_gadget_free_endpoints(dwc); >> dma_free_coherent(dwc->sysdev, DWC3_BOUNCE_SIZE, dwc->bounce, >> -- > -- cheers, -roger Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Hi, Roger Quadros <rogerq@ti.com> writes: >> Roger Quadros <rogerq@ti.com> writes: >>> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >>> after which dual-role switching doesn't work. >>> >>> On dra7-evm's dual-role port, >>> - Load g_zero gadget driver and enumerate to host >>> - suspend to mem >>> - disconnect USB cable to host and connect otg cable with Pen drive in it. >>> - resume system >>> - we sleep indefinitely in _dwc3_set_mode due to. >>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >>> dwc3_gadget_stop()->wait_event_lock_irq() >>> >>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints >>> so we don't wait in dwc3_gadget_stop(). >>> >>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>> --- >>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++ >>> 1 file changed, 14 insertions(+) >>> >>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>> index 2bda4eb..0a360da 100644 >>> --- a/drivers/usb/dwc3/gadget.c >>> +++ b/drivers/usb/dwc3/gadget.c >>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc) >>> >>> void dwc3_gadget_exit(struct dwc3 *dwc) >>> { >>> + int epnum; >>> + unsigned long flags; >>> + >>> + spin_lock_irqsave(&dwc->lock, flags); >>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { >>> + struct dwc3_ep *dep = dwc->eps[epnum]; >>> + >>> + if (!dep) >>> + continue; >>> + >>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; >>> + } >>> + spin_unlock_irqrestore(&dwc->lock, flags); >>> + >>> usb_del_gadget_udc(&dwc->gadget); >>> dwc3_gadget_free_endpoints(dwc); >> >> free endpoints is a better place for this. It's already going to free >> the memory anyway. Might as well clear all flags to 0 there. >> > > But it won't solve the deadlock issue. 
Since dwc3_gadget_free_endpoints() > is called after usb_del_gadget_udc() and the deadlock happens when > > usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq() > > and DWC3_EP_END_TRANSFER_PENDING flag is set. indeed. Iterating twice over the entire endpoint list seems wasteful. Perhaps we just shouldn't wait when removing the UDC since that's essentially what this patch will do, right? If you clear the flag before calling ->udc_stop(), this means the loop in dwc3_gadget_stop() will do nothing. Might as well remove it. -- balbi
Hi Roger, On 5 March 2018 at 17:45, Roger Quadros <rogerq@ti.com> wrote: > Felipe, > > On 05/03/18 10:49, Felipe Balbi wrote: >> >> Hi, >> >> Roger Quadros <rogerq@ti.com> writes: >>>> Roger Quadros <rogerq@ti.com> writes: >>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >>>>> after which dual-role switching doesn't work. >>>>> >>>>> On dra7-evm's dual-role port, >>>>> - Load g_zero gadget driver and enumerate to host >>>>> - suspend to mem >>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it. >>>>> - resume system >>>>> - we sleep indefinitely in _dwc3_set_mode due to. >>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >>>>> dwc3_gadget_stop()->wait_event_lock_irq() >>>>> >>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints >>>>> so we don't wait in dwc3_gadget_stop(). >>>>> >>>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>>>> --- >>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++ >>>>> 1 file changed, 14 insertions(+) >>>>> >>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>>>> index 2bda4eb..0a360da 100644 >>>>> --- a/drivers/usb/dwc3/gadget.c >>>>> +++ b/drivers/usb/dwc3/gadget.c >>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc) >>>>> >>>>> void dwc3_gadget_exit(struct dwc3 *dwc) >>>>> { >>>>> + int epnum; >>>>> + unsigned long flags; >>>>> + >>>>> + spin_lock_irqsave(&dwc->lock, flags); >>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { >>>>> + struct dwc3_ep *dep = dwc->eps[epnum]; >>>>> + >>>>> + if (!dep) >>>>> + continue; >>>>> + >>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; >>>>> + } >>>>> + spin_unlock_irqrestore(&dwc->lock, flags); >>>>> + >>>>> usb_del_gadget_udc(&dwc->gadget); >>>>> dwc3_gadget_free_endpoints(dwc); >>>> >>>> free endpoints is a better place for this. It's already going to free >>>> the memory anyway. Might as well clear all flags to 0 there. 
>>>> >>> >>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints() >>> is called after usb_del_gadget_udc() and the deadlock happens when >>> >>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq() >>> >>> and DWC3_EP_END_TRANSFER_PENDING flag is set. >> >> indeed. Iterating twice over the entire endpoint list seems >> wasteful. Perhaps we just shouldn't wait when removing the UDC since >> that's essentially what this patch will do, right? If you clear the flag >> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop() >> will do nothing. Might as well remove it. >> > > This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear > in dwc3_gadget_stop() like we used to. This is perfectly fine, right? > > It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which > masks all interrupts and nobody will ever clear that flag if it was set. I don't think so. It cannot mask the endpoint events; please check which events are masked in the DEVTEN register. The reason we should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that the DWC3_DEPEVT_EPCMDCMPLT event is sometimes triggered more than 100us later, by which time we may already have freed the gadget IRQ, which will cause a crash. -- Baolin.wang Best Regards
Hi, Baolin Wang <baolin.wang@linaro.org> writes: >>> Roger Quadros <rogerq@ti.com> writes: >>>>> Roger Quadros <rogerq@ti.com> writes: >>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >>>>>> after which dual-role switching doesn't work. >>>>>> >>>>>> On dra7-evm's dual-role port, >>>>>> - Load g_zero gadget driver and enumerate to host >>>>>> - suspend to mem >>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it. >>>>>> - resume system >>>>>> - we sleep indefinitely in _dwc3_set_mode due to. >>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >>>>>> dwc3_gadget_stop()->wait_event_lock_irq() >>>>>> >>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints >>>>>> so we don't wait in dwc3_gadget_stop(). >>>>>> >>>>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>>>>> --- >>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++ >>>>>> 1 file changed, 14 insertions(+) >>>>>> >>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>>>>> index 2bda4eb..0a360da 100644 >>>>>> --- a/drivers/usb/dwc3/gadget.c >>>>>> +++ b/drivers/usb/dwc3/gadget.c >>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc) >>>>>> >>>>>> void dwc3_gadget_exit(struct dwc3 *dwc) >>>>>> { >>>>>> + int epnum; >>>>>> + unsigned long flags; >>>>>> + >>>>>> + spin_lock_irqsave(&dwc->lock, flags); >>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { >>>>>> + struct dwc3_ep *dep = dwc->eps[epnum]; >>>>>> + >>>>>> + if (!dep) >>>>>> + continue; >>>>>> + >>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; >>>>>> + } >>>>>> + spin_unlock_irqrestore(&dwc->lock, flags); >>>>>> + >>>>>> usb_del_gadget_udc(&dwc->gadget); >>>>>> dwc3_gadget_free_endpoints(dwc); >>>>> >>>>> free endpoints is a better place for this. It's already going to free >>>>> the memory anyway. Might as well clear all flags to 0 there. >>>>> >>>> >>>> But it won't solve the deadlock issue. 
Since dwc3_gadget_free_endpoints() >>>> is called after usb_del_gadget_udc() and the deadlock happens when >>>> >>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq() >>>> >>>> and DWC3_EP_END_TRANSFER_PENDING flag is set. >>> >>> indeed. Iterating twice over the entire endpoint list seems >>> wasteful. Perhaps we just shouldn't wait when removing the UDC since >>> that's essentially what this patch will do, right? If you clear the flag >>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop() >>> will do nothing. Might as well remove it. >>> >> >> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear >> in dwc3_gadget_stop() like we used to. This is perfectly fine, right? >> >> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which >> masks all interrupts and nobody will ever clear that flag if it was set. > > I don't think so. It can not mask the endpoint events, please check > the events which will be masked in DEVTEN register. The reason why we > should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that, > sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later > than 100us, but now we may have freed the gadget irq which will cause > crash. We could mask command complete events as soon as ->udc_stop() is called, right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN completely. /me goes check databook At least on revision 2.60a of the databook, bit 10 is reserved. I wonder if that's the start of all the problems. Anybody has access to older and newer databook revisions so we can cross-check? best -- balbi
On 05/03/18 13:06, Felipe Balbi wrote: > > Hi, > > Baolin Wang <baolin.wang@linaro.org> writes: >>>> Roger Quadros <rogerq@ti.com> writes: >>>>>> Roger Quadros <rogerq@ti.com> writes: >>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >>>>>>> after which dual-role switching doesn't work. >>>>>>> >>>>>>> On dra7-evm's dual-role port, >>>>>>> - Load g_zero gadget driver and enumerate to host >>>>>>> - suspend to mem >>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it. >>>>>>> - resume system >>>>>>> - we sleep indefinitely in _dwc3_set_mode due to. >>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >>>>>>> dwc3_gadget_stop()->wait_event_lock_irq() >>>>>>> >>>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints >>>>>>> so we don't wait in dwc3_gadget_stop(). >>>>>>> >>>>>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>>>>>> --- >>>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++ >>>>>>> 1 file changed, 14 insertions(+) >>>>>>> >>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>>>>>> index 2bda4eb..0a360da 100644 >>>>>>> --- a/drivers/usb/dwc3/gadget.c >>>>>>> +++ b/drivers/usb/dwc3/gadget.c >>>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc) >>>>>>> >>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc) >>>>>>> { >>>>>>> + int epnum; >>>>>>> + unsigned long flags; >>>>>>> + >>>>>>> + spin_lock_irqsave(&dwc->lock, flags); >>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { >>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum]; >>>>>>> + >>>>>>> + if (!dep) >>>>>>> + continue; >>>>>>> + >>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; >>>>>>> + } >>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags); >>>>>>> + >>>>>>> usb_del_gadget_udc(&dwc->gadget); >>>>>>> dwc3_gadget_free_endpoints(dwc); >>>>>> >>>>>> free endpoints is a better place for this. It's already going to free >>>>>> the memory anyway. 
Might as well clear all flags to 0 there. >>>>>> >>>>> >>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints() >>>>> is called after usb_del_gadget_udc() and the deadlock happens when >>>>> >>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq() >>>>> >>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set. >>>> >>>> indeed. Iterating twice over the entire endpoint list seems >>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since >>>> that's essentially what this patch will do, right? If you clear the flag >>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop() >>>> will do nothing. Might as well remove it. >>>> >>> >>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear >>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right? >>> >>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which >>> masks all interrupts and nobody will ever clear that flag if it was set. >> >> I don't think so. It can not mask the endpoint events, please check >> the events which will be masked in DEVTEN register. The reason why we >> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that, >> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later >> than 100us, but now we may have freed the gadget irq which will cause >> crash. > > We could mask command complete events as soon as ->udc_stop() is called, > right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN > completely. But which bit in DEVTEN says Endpoint events are disabled? > > /me goes check databook > > At least on revision 2.60a of the databook, bit 10 is reserved. I wonder > if that's the start of all the problems. Anybody has access to older and > newer databook revisions so we can cross-check? > I can access v2.40 and v3.10 books. 
bit 10 is reserved on both

Differences in v2.4 vs v3.10 are:

bit 8 reserved vs L1SUSPEN
bit 13 reserved vs StopOnDisconnectEn
bit 14 reserved vs L1WKUPEVTEN

--
cheers,
-roger

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
On 5 March 2018 at 19:14, Roger Quadros <rogerq@ti.com> wrote: > On 05/03/18 13:06, Felipe Balbi wrote: >> >> Hi, >> >> Baolin Wang <baolin.wang@linaro.org> writes: >>>>> Roger Quadros <rogerq@ti.com> writes: >>>>>>> Roger Quadros <rogerq@ti.com> writes: >>>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >>>>>>>> after which dual-role switching doesn't work. >>>>>>>> >>>>>>>> On dra7-evm's dual-role port, >>>>>>>> - Load g_zero gadget driver and enumerate to host >>>>>>>> - suspend to mem >>>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it. >>>>>>>> - resume system >>>>>>>> - we sleep indefinitely in _dwc3_set_mode due to. >>>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >>>>>>>> dwc3_gadget_stop()->wait_event_lock_irq() >>>>>>>> >>>>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints >>>>>>>> so we don't wait in dwc3_gadget_stop(). >>>>>>>> >>>>>>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>>>>>>> --- >>>>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++ >>>>>>>> 1 file changed, 14 insertions(+) >>>>>>>> >>>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>>>>>>> index 2bda4eb..0a360da 100644 >>>>>>>> --- a/drivers/usb/dwc3/gadget.c >>>>>>>> +++ b/drivers/usb/dwc3/gadget.c >>>>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc) >>>>>>>> >>>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc) >>>>>>>> { >>>>>>>> + int epnum; >>>>>>>> + unsigned long flags; >>>>>>>> + >>>>>>>> + spin_lock_irqsave(&dwc->lock, flags); >>>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { >>>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum]; >>>>>>>> + >>>>>>>> + if (!dep) >>>>>>>> + continue; >>>>>>>> + >>>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; >>>>>>>> + } >>>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags); >>>>>>>> + >>>>>>>> usb_del_gadget_udc(&dwc->gadget); >>>>>>>> dwc3_gadget_free_endpoints(dwc); >>>>>>> >>>>>>> free 
endpoints is a better place for this. It's already going to free >>>>>>> the memory anyway. Might as well clear all flags to 0 there. >>>>>>> >>>>>> >>>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints() >>>>>> is called after usb_del_gadget_udc() and the deadlock happens when >>>>>> >>>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq() >>>>>> >>>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set. >>>>> >>>>> indeed. Iterating twice over the entire endpoint list seems >>>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since >>>>> that's essentially what this patch will do, right? If you clear the flag >>>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop() >>>>> will do nothing. Might as well remove it. >>>>> >>>> >>>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear >>>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right? >>>> >>>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which >>>> masks all interrupts and nobody will ever clear that flag if it was set. >>> >>> I don't think so. It can not mask the endpoint events, please check >>> the events which will be masked in DEVTEN register. The reason why we >>> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that, >>> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later >>> than 100us, but now we may have freed the gadget irq which will cause >>> crash. >> >> We could mask command complete events as soon as ->udc_stop() is called, >> right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN >> completely. > > But which bit in DEVTEN says Endpoint events are disabled? When we set up the DWC3_DEPCMD_ENDTRANSFER command in dwc3_stop_active_transfer(), we could leave DWC3_DEPCMD_CMDIOC unset; then there will be no endpoint command complete interrupts, I think.
cmd |= DWC3_DEPCMD_CMDIOC; > >> >> /me goes check databook >> >> At least on revision 2.60a of the databook, bit 10 is reserved. I wonder >> if that's the start of all the problems. Anybody has access to older and >> newer databook revisions so we can cross-check? >> > > I can access v2.40 and v3.10 books. > > bit 10 is reserved on both > > Differences in v2.4 vs v3.10 are: > > bit 8 reserved vs L1SUSPEN > bit 13 reserved vs StopOnDisconnectEn > bit 14 reserved vs L1WKUPEVTEN > > -- > cheers, > -roger > > Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki -- Baolin.wang Best Regards
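The suggestion above, leaving CMDIOC out of the End Transfer command word, can be sketched in user-space C as follows. This is only an illustration: `build_end_transfer_cmd()` and its parameters are hypothetical, and the bit positions and the 7-bit resource index width are assumptions recalled from drivers/usb/dwc3/core.h that should be checked against the actual tree and databook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed field layout of the endpoint command register. These values are
 * recalled from drivers/usb/dwc3/core.h and are NOT taken from this thread;
 * verify them before relying on them.
 */
#define DEPCMD_ENDTRANSFER    (0x08u << 0)            /* command type: End Transfer */
#define DEPCMD_CMDIOC         (1u << 8)               /* interrupt on command completion */
#define DEPCMD_HIPRI_FORCERM  (1u << 11)              /* force removal */
#define DEPCMD_PARAM(x)       ((uint32_t)(x) << 16)   /* transfer resource index */

/*
 * Hypothetical helper: compose an End Transfer command word.
 * want_ioc == false models the proposal of not requesting an
 * Endpoint Command Complete event.
 */
static uint32_t build_end_transfer_cmd(uint8_t resource_index, bool force,
                                       bool want_ioc)
{
    uint32_t cmd = DEPCMD_ENDTRANSFER;

    /* Assumption: the resource index is a 7-bit field. */
    assert(resource_index <= 0x7f);

    if (force)
        cmd |= DEPCMD_HIPRI_FORCERM;
    if (want_ioc)
        cmd |= DEPCMD_CMDIOC;
    cmd |= DEPCMD_PARAM(resource_index);

    return cmd;
}
```

With `want_ioc` false the controller is not asked to raise a command complete event, so completion would have to be detected some other way.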
Hi, Baolin Wang <baolin.wang@linaro.org> writes: >>>>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc) >>>>>>>>> { >>>>>>>>> + int epnum; >>>>>>>>> + unsigned long flags; >>>>>>>>> + >>>>>>>>> + spin_lock_irqsave(&dwc->lock, flags); >>>>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { >>>>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum]; >>>>>>>>> + >>>>>>>>> + if (!dep) >>>>>>>>> + continue; >>>>>>>>> + >>>>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING; >>>>>>>>> + } >>>>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags); >>>>>>>>> + >>>>>>>>> usb_del_gadget_udc(&dwc->gadget); >>>>>>>>> dwc3_gadget_free_endpoints(dwc); >>>>>>>> >>>>>>>> free endpoints is a better place for this. It's already going to free >>>>>>>> the memory anyway. Might as well clear all flags to 0 there. >>>>>>>> >>>>>>> >>>>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints() >>>>>>> is called after usb_del_gadget_udc() and the deadlock happens when >>>>>>> >>>>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq() >>>>>>> >>>>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set. >>>>>> >>>>>> indeed. Iterating twice over the entire endpoint list seems >>>>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since >>>>>> that's essentially what this patch will do, right? If you clear the flag >>>>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop() >>>>>> will do nothing. Might as well remove it. >>>>>> >>>>> >>>>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear >>>>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right? >>>>> >>>>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which >>>>> masks all interrupts and nobody will ever clear that flag if it was set. >>>> >>>> I don't think so. It can not mask the endpoint events, please check >>>> the events which will be masked in DEVTEN register. 
The reason why we >>>> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that, >>>> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later >>>> than 100us, but now we may have freed the gadget irq which will cause >>>> crash. >>> >>> We could mask command complete events as soon as ->udc_stop() is called, >>> right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN >>> completely. >> >> But which bit in DEVTEN says Endpoint events are disabled? > > When we set up the DWC3_DEPCMD_ENDTRANSFER command in > dwc3_stop_active_transfer(), we can do not set DWC3_DEPCMD_CMDIOC, > then there will no endpoint command complete interrupts I think. > > cmd |= DWC3_DEPCMD_CMDIOC; I remember some part of the databook mandating CMDIOC to be set. We could test it out without and see if anything blows up. I would, however, require a lengthy comment explaining that we're deviating from databook revision x.yya, section foobar because $reasons. :-) -- balbi
Hi, Roger Quadros <rogerq@ti.com> writes: <snip> >>> When we set up the DWC3_DEPCMD_ENDTRANSFER command in >>> dwc3_stop_active_transfer(), we can do not set DWC3_DEPCMD_CMDIOC, >>> then there will no endpoint command complete interrupts I think. >>> >>> cmd |= DWC3_DEPCMD_CMDIOC; >> >> I remember some part of the databook mandating CMDIOC to be set. We >> could test it out without and see if anything blows up. I would, >> however, require a lengthy comment explaining that we're deviating from >> databook revision x.yya, section foobar because $reasons. :-) >> > > This is what the v3.10 databook says > > "When issuing an End Transfer command, software must set the CmdIOC > bit (field 8) so that an Endpoint Command Complete event is generated > after the transfer ends. This is necessary to synchronize the > conclusion of system bus traffic before the End Transfer command is > completed." > > with a note > > "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion > of the End Transfer command by polling the command active bit to be > cleared to 0." > > fyi. > > Rst_actbitlater - "Enable clearing of the command active bit for the > ENDXFER command after the command execution is completed. This bit is > valid in device mode only." > > So I'd prefer not to clear CMDIOC for all cases. > > Could we some how just tackle the dwc3_gadget_exit case like I did in > this patch? if you can send a version that doesn't iterate over all endpoints twice, sure. We still need a comment somewhere, and I fear we may get interrupts later in some cases. How would we deal with that? -- balbi
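The databook note quoted above (polling the command active bit when GUCTL2[Rst_actbitlater] is set) could look roughly like the bounded poll loop below. It is a user-space simulation, not driver code: `sim_reg`, `sim_readl()` and `poll_cmd_done()` are made up for illustration, and the command active bit position (bit 10 of the endpoint command register) is an assumption to verify against the databook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the command active bit; not stated in this thread. */
#define DEPCMD_CMDACT (1u << 10)

/* Hypothetical stand-in for a memory-mapped endpoint command register. */
struct sim_reg {
    uint32_t value;
    int reads_until_idle;   /* simulated latency; 0 means "never completes" */
};

/* Simulated register read: clears CMDACT after a fixed number of reads. */
static uint32_t sim_readl(struct sim_reg *r)
{
    if (r->reads_until_idle > 0 && --r->reads_until_idle == 0)
        r->value &= ~DEPCMD_CMDACT;
    return r->value;
}

/*
 * Poll until the command active bit reads back as 0, giving up after
 * max_polls reads. Returns true if the command completed in time.
 */
static bool poll_cmd_done(struct sim_reg *depcmd, int max_polls)
{
    assert(max_polls > 0);

    for (int i = 0; i < max_polls; i++) {
        if (!(sim_readl(depcmd) & DEPCMD_CMDACT))
            return true;
    }
    return false;
}
```

The bound on the number of polls matters for the same reason as the timeout discussed in this thread: an unbounded loop would hang if the bit never clears.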
On 09/03/18 11:26, Roger Quadros wrote: > On 09/03/18 11:23, Felipe Balbi wrote: >> >> Hi, >> >> Roger Quadros <rogerq@ti.com> writes: >> >> <snip> >> >>>>> When we set up the DWC3_DEPCMD_ENDTRANSFER command in >>>>> dwc3_stop_active_transfer(), we can do not set DWC3_DEPCMD_CMDIOC, >>>>> then there will no endpoint command complete interrupts I think. >>>>> >>>>> cmd |= DWC3_DEPCMD_CMDIOC; >>>> >>>> I remember some part of the databook mandating CMDIOC to be set. We >>>> could test it out without and see if anything blows up. I would, >>>> however, require a lengthy comment explaining that we're deviating from >>>> databook revision x.yya, section foobar because $reasons. :-) >>>> >>> >>> This is what the v3.10 databook says >>> >>> "When issuing an End Transfer command, software must set the CmdIOC >>> bit (field 8) so that an Endpoint Command Complete event is generated >>> after the transfer ends. This is necessary to synchronize the >>> conclusion of system bus traffic before the End Transfer command is >>> completed." >>> >>> with a note >>> >>> "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion >>> of the End Transfer command by polling the command active bit to be >>> cleared to 0." >>> >>> fyi. >>> >>> Rst_actbitlater - "Enable clearing of the command active bit for the >>> ENDXFER command after the command execution is completed. This bit is >>> valid in device mode only." >>> >>> So I'd prefer not to clear CMDIOC for all cases. >>> >>> Could we some how just tackle the dwc3_gadget_exit case like I did in >>> this patch? >> >> if you can send a version that doesn't iterate over all endpoints twice, >> sure. We still need a comment somewhere, and I fear we may get >> interrupts later in some cases. How would we deal with that? >> > > how about explicitly masking that interrupt? Is it possible? > Other easy option is to use wait_event_interruptible_lock_irq_timeout() instead of wait_event_lock_irq() in dwc3_gadget_stop(). 
Is a 200ms timeout sufficient? And after the first timeout we can assume all the others will time out too, so there is no point in waiting 200ms for each endpoint. -- cheers, -roger Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Hi, Roger Quadros <rogerq@ti.com> writes: >>> This is what the v3.10 databook says >>> >>> "When issuing an End Transfer command, software must set the CmdIOC >>> bit (field 8) so that an Endpoint Command Complete event is generated >>> after the transfer ends. This is necessary to synchronize the >>> conclusion of system bus traffic before the End Transfer command is >>> completed." >>> >>> with a note >>> >>> "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion >>> of the End Transfer command by polling the command active bit to be >>> cleared to 0." >>> >>> fyi. >>> >>> Rst_actbitlater - "Enable clearing of the command active bit for the >>> ENDXFER command after the command execution is completed. This bit is >>> valid in device mode only." >>> >>> So I'd prefer not to clear CMDIOC for all cases. >>> >>> Could we some how just tackle the dwc3_gadget_exit case like I did in >>> this patch? >> >> if you can send a version that doesn't iterate over all endpoints twice, >> sure. We still need a comment somewhere, and I fear we may get >> interrupts later in some cases. How would we deal with that? >> > > how about explicitly masking that interrupt? Is it possible? I think I showed that the bit is reserved on recent dwc3 core releases (anything 2.40a+, at least). -- balbi
Hi, Roger Quadros <rogerq@ti.com> writes: >>>>>> When we set up the DWC3_DEPCMD_ENDTRANSFER command in >>>>>> dwc3_stop_active_transfer(), we can do not set DWC3_DEPCMD_CMDIOC, >>>>>> then there will no endpoint command complete interrupts I think. >>>>>> >>>>>> cmd |= DWC3_DEPCMD_CMDIOC; >>>>> >>>>> I remember some part of the databook mandating CMDIOC to be set. We >>>>> could test it out without and see if anything blows up. I would, >>>>> however, require a lengthy comment explaining that we're deviating from >>>>> databook revision x.yya, section foobar because $reasons. :-) >>>>> >>>> >>>> This is what the v3.10 databook says >>>> >>>> "When issuing an End Transfer command, software must set the CmdIOC >>>> bit (field 8) so that an Endpoint Command Complete event is generated >>>> after the transfer ends. This is necessary to synchronize the >>>> conclusion of system bus traffic before the End Transfer command is >>>> completed." >>>> >>>> with a note >>>> >>>> "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion >>>> of the End Transfer command by polling the command active bit to be >>>> cleared to 0." >>>> >>>> fyi. >>>> >>>> Rst_actbitlater - "Enable clearing of the command active bit for the >>>> ENDXFER command after the command execution is completed. This bit is >>>> valid in device mode only." >>>> >>>> So I'd prefer not to clear CMDIOC for all cases. >>>> >>>> Could we some how just tackle the dwc3_gadget_exit case like I did in >>>> this patch? >>> >>> if you can send a version that doesn't iterate over all endpoints twice, >>> sure. We still need a comment somewhere, and I fear we may get >>> interrupts later in some cases. How would we deal with that? >>> >> >> how about explicitly masking that interrupt? Is it possible? >> > > Other easy option is to use wait_event_interruptible_lock_irq_timeout() > instead of wait_event_lock_irq() in dwc3_gadget_stop(). > > Is a 200ms timeout sufficient? 
And after the first timeout we assume all
> will timeout so no point in waiting 200ms for each endpoint.

We can do that. And I think some 5ms is more than enough :-) I'd be
surprised if it takes anything over some 200us for the EndTransfer
command to complete.

-- 
balbi
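The idea discussed above (use a short timed wait, and stop waiting once the first endpoint has timed out) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `stop_endpoints()`, `wait_fn`, the two stub waiters, `NUM_EPS` and the bit position of `EP_END_TRANSFER_PENDING` are hypothetical names and values, not the driver's code, and the stubs stand in for the kernel's timed wait.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_EPS                 32          /* assumed endpoint count */
#define EP_END_TRANSFER_PENDING (1u << 7)   /* illustrative bit position */

/* Hypothetical stand-in for the timed wait: returns true if the
 * End Transfer completion for endpoint `ep` arrived within `timeout_us`. */
typedef bool (*wait_fn)(int ep, unsigned int timeout_us);

static int wait_calls;  /* counts how many timed waits were actually issued */

/* Stub modelling hardware that never signals completion. */
static bool never_completes(int ep, unsigned int timeout_us)
{
    (void)ep;
    (void)timeout_us;
    wait_calls++;
    return false;
}

/* Stub modelling hardware that always signals completion in time. */
static bool always_completes(int ep, unsigned int timeout_us)
{
    (void)ep;
    (void)timeout_us;
    wait_calls++;
    return true;
}

/* Walk the non-control endpoints. Returns a bitmap of endpoints whose
 * completion was never observed. After the first timeout, the remaining
 * pending endpoints are marked as timed out without waiting again. */
static uint32_t stop_endpoints(uint32_t flags[NUM_EPS], wait_fn wait,
                               unsigned int timeout_us)
{
    uint32_t tmo_eps = 0;
    bool gave_up = false;

    for (int ep = 2; ep < NUM_EPS; ep++) {
        if (!(flags[ep] & EP_END_TRANSFER_PENDING))
            continue;

        /* Short-circuit: wait() is not called once gave_up is set. */
        if (gave_up || !wait(ep, timeout_us)) {
            gave_up = true;
            tmo_eps |= 1u << ep;
        }

        /* Either way, stop treating the endpoint as pending. */
        flags[ep] &= ~EP_END_TRANSFER_PENDING;
    }
    return tmo_eps;
}
```

With a 5 ms budget (`timeout_us = 5000`), the worst case in this sketch is a single 5 ms wait regardless of how many endpoints are pending, which is the point of skipping the wait after the first timeout.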
Hi Felipe, On 09/03/18 14:47, Roger Quadros wrote: > In the following test we get stuck by sleeping forever in _dwc3_set_mode() > after which dual-role switching doesn't work. > > On dra7-evm's dual-role port, > - Load g_zero gadget driver and enumerate to host > - suspend to mem > - disconnect USB cable to host and connect otg cable with Pen drive in it. > - resume system > - we sleep indefinitely in _dwc3_set_mode due to. > dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> > dwc3_gadget_stop()->wait_event_lock_irq() > > To fix this instead of waiting indefinitely with wait_event_lock_irq() > we use wait_event_interruptible_lock_irq_timeout() and print > and error message if there was a timeout. > > Signed-off-by: Roger Quadros <rogerq@ti.com> Thanks for picking this for -next. Is it better to have this in v4.16-rc fixes? and also stable? v4.12+ > --- > > Changelog: > > v2: > - use wait_event_interruptible_lock_irq_timeout() instead of wait_event_lock_irq() > > drivers/usb/dwc3/gadget.c | 23 ++++++++++++++++++++--- > 1 file changed, 20 insertions(+), 3 deletions(-) > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 2bda4eb..7c3a6e4 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -1950,6 +1950,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g) > struct dwc3 *dwc = gadget_to_dwc(g); > unsigned long flags; > int epnum; > + u32 tmo_eps = 0; > > spin_lock_irqsave(&dwc->lock, flags); > > @@ -1960,6 +1961,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g) > > for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) { > struct dwc3_ep *dep = dwc->eps[epnum]; > + int ret; > > if (!dep) > continue; > @@ -1967,9 +1969,24 @@ static int dwc3_gadget_stop(struct usb_gadget *g) > if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING)) > continue; > > - wait_event_lock_irq(dep->wait_end_transfer, > - !(dep->flags & DWC3_EP_END_TRANSFER_PENDING), > - dwc->lock); > + ret = 
wait_event_interruptible_lock_irq_timeout(dep->wait_end_transfer, > + !(dep->flags & DWC3_EP_END_TRANSFER_PENDING), > + dwc->lock, msecs_to_jiffies(5)); > + > + if (ret <= 0) { > + /* Timed out or interrupted! There's nothing much > + * we can do so we just log here and print which > + * endpoints timed out at the end. > + */ > + tmo_eps |= 1 << epnum; > + dep->flags &= DWC3_EP_END_TRANSFER_PENDING; > + } > + } > + > + if (tmo_eps) { > + dev_err(dwc->dev, > + "end transfer timed out on endpoints 0x%x [bitmap]\n", > + tmo_eps); > } > > out: > -- cheers, -roger Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
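The timeout path in the v2 patch does two pieces of bit bookkeeping: it records the endpoint number in a bitmap and it updates the endpoint flags. A minimal userspace sketch of both idioms follows. The helper names and the bit positions are assumptions made for illustration and are not taken from the driver headers. Note that clearing a single bit needs the complemented mask (`flags & ~flag`); without the `~`, the expression keeps only that bit and drops all the others.

```c
#include <stdbool.h>
#include <stdint.h>

#define EP_ENABLED              (1u << 0)   /* illustrative bit position */
#define EP_END_TRANSFER_PENDING (1u << 7)   /* illustrative bit position */

/* Clear `flag` and leave every other bit untouched. */
static uint32_t clear_flag(uint32_t flags, uint32_t flag)
{
    return flags & ~flag;
}

/* For contrast: without the complement, only `flag` survives. */
static uint32_t keep_only_flag(uint32_t flags, uint32_t flag)
{
    return flags & flag;
}

/* Record endpoint `epnum` (0..31) in the timed-out bitmap. */
static uint32_t mark_timed_out(uint32_t tmo_eps, unsigned int epnum)
{
    return tmo_eps | (1u << epnum);
}

/* Query whether endpoint `epnum` (0..31) is recorded in the bitmap. */
static bool timed_out(uint32_t tmo_eps, unsigned int epnum)
{
    return (tmo_eps >> epnum) & 1u;
}
```

Printed with `0x%x`, a bitmap built by `mark_timed_out()` for endpoints 2 and 5 reads `0x24`, which is how the `[bitmap]` value in the error message would be decoded.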
Hi,

Roger Quadros <rogerq@ti.com> writes:
> Hi Felipe,
>
> On 09/03/18 14:47, Roger Quadros wrote:
>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>> after which dual-role switching doesn't work.
>>
>> On dra7-evm's dual-role port,
>> - Load g_zero gadget driver and enumerate to host
>> - suspend to mem
>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>> - resume system
>> - we sleep indefinitely in _dwc3_set_mode due to.
>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>> dwc3_gadget_stop()->wait_event_lock_irq()
>>
>> To fix this instead of waiting indefinitely with wait_event_lock_irq()
>> we use wait_event_interruptible_lock_irq_timeout() and print
>> and error message if there was a timeout.
>>
>> Signed-off-by: Roger Quadros <rogerq@ti.com>
>
> Thanks for picking this for -next.
> Is it better to have this in v4.16-rc fixes?
> and also stable? v4.12+

Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
log ;-)

The best we can do now, is wait for -rc1 and manually send the commit to
stable.

-- 
balbi
On 16/03/18 13:00, Felipe Balbi wrote: > > Hi, > > Roger Quadros <rogerq@ti.com> writes: > >> Hi Felipe, >> >> On 09/03/18 14:47, Roger Quadros wrote: >>> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >>> after which dual-role switching doesn't work. >>> >>> On dra7-evm's dual-role port, >>> - Load g_zero gadget driver and enumerate to host >>> - suspend to mem >>> - disconnect USB cable to host and connect otg cable with Pen drive in it. >>> - resume system >>> - we sleep indefinitely in _dwc3_set_mode due to. >>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >>> dwc3_gadget_stop()->wait_event_lock_irq() >>> >>> To fix this instead of waiting indefinitely with wait_event_lock_irq() >>> we use wait_event_interruptible_lock_irq_timeout() and print >>> and error message if there was a timeout. >>> >>> Signed-off-by: Roger Quadros <rogerq@ti.com> >> >> Thanks for picking this for -next. >> Is it better to have this in v4.16-rc fixes? >> and also stable? v4.12+ > > Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit > log ;-) > > The best we can do now, is wait for -rc1 and manually send the commit to > stable. > That's fine. Thanks. -- cheers, -roger Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Hi, On 3/16/2018 3:03 PM, Roger Quadros wrote: > On 16/03/18 13:00, Felipe Balbi wrote: >> >> Hi, >> >> Roger Quadros <rogerq@ti.com> writes: >> >>> Hi Felipe, >>> >>> On 09/03/18 14:47, Roger Quadros wrote: >>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >>>> after which dual-role switching doesn't work. >>>> >>>> On dra7-evm's dual-role port, >>>> - Load g_zero gadget driver and enumerate to host >>>> - suspend to mem >>>> - disconnect USB cable to host and connect otg cable with Pen drive in it. >>>> - resume system >>>> - we sleep indefinitely in _dwc3_set_mode due to. >>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >>>> dwc3_gadget_stop()->wait_event_lock_irq() >>>> >>>> To fix this instead of waiting indefinitely with wait_event_lock_irq() >>>> we use wait_event_interruptible_lock_irq_timeout() and print >>>> and error message if there was a timeout. >>>> >>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>> >>> Thanks for picking this for -next. >>> Is it better to have this in v4.16-rc fixes? >>> and also stable? v4.12+ >> >> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit >> log ;-) >> >> The best we can do now, is wait for -rc1 and manually send the commit to >> stable. >> > > That's fine. Thanks. > Same issue seen in dwc3_gadget_ep_dequeue() function where also used wait_event_lock_irq() - as result infinite loop. Actually to fix this issue I updated condition of wait function from: !(dep->flags & DWC3_EP_END_TRANSFER_PENDING) to: !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED) Not, sure that this fix is fully correct because I'm familiar with dwc3, but this fix allow us to go forward with request dequeue. I think, need deeper investigation of infinite loop to catch root cause of it, before accept any of fixes. Thanks, Minas
Hi, Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> writes: >>>> On 09/03/18 14:47, Roger Quadros wrote: >>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode() >>>>> after which dual-role switching doesn't work. >>>>> >>>>> On dra7-evm's dual-role port, >>>>> - Load g_zero gadget driver and enumerate to host >>>>> - suspend to mem >>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it. >>>>> - resume system >>>>> - we sleep indefinitely in _dwc3_set_mode due to. >>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()-> >>>>> dwc3_gadget_stop()->wait_event_lock_irq() >>>>> >>>>> To fix this instead of waiting indefinitely with wait_event_lock_irq() >>>>> we use wait_event_interruptible_lock_irq_timeout() and print >>>>> and error message if there was a timeout. >>>>> >>>>> Signed-off-by: Roger Quadros <rogerq@ti.com> >>>> >>>> Thanks for picking this for -next. >>>> Is it better to have this in v4.16-rc fixes? >>>> and also stable? v4.12+ >>> >>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit >>> log ;-) >>> >>> The best we can do now, is wait for -rc1 and manually send the commit to >>> stable. >>> >> >> That's fine. Thanks. >> > > Same issue seen in dwc3_gadget_ep_dequeue() function where also used > wait_event_lock_irq() - as result infinite loop. how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded a gadget driver? > Actually to fix this issue I updated condition of wait function > from: > !(dep->flags & DWC3_EP_END_TRANSFER_PENDING) > to: > !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED) you're not fixing anything. You're, essentially, removing the entire end transfer pending logic. The whole idea of this is that we can disable the endpoint and wait for the End Transfer interrupt. When you add a check for the endpoint being enabled, then that code will never run and, thus, never wait for the End Transfer IRQ. 
If you manage to find a more reliable way of reproducing this, then make
sure to capture dwc3 tracepoints (see the documentation for details) and
let's start trying to figure out what's going on.

cheers

-- 
balbi
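One way to see why the modified wait condition no longer waits at all: for any two distinct single-bit flags `A` and `B`, the mask `A & B` is zero, so `!(flags & A & B)` is true for every value of `flags`. The sketch below checks this in userspace C. The bit positions and function names are illustrative assumptions, and the argument does not depend on which bits are chosen as long as they differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define EP_ENABLED              (1u << 0)   /* illustrative bit position */
#define EP_END_TRANSFER_PENDING (1u << 7)   /* illustrative bit position */

/* Original condition: done only once the pending bit has been cleared. */
static bool wait_done_original(uint32_t flags)
{
    return !(flags & EP_END_TRANSFER_PENDING);
}

/* Modified condition: `&` is left-associative, and the two masks share
 * no bits, so the masked value is always 0 and this always returns true. */
static bool wait_done_modified(uint32_t flags)
{
    return !(flags & EP_END_TRANSFER_PENDING & EP_ENABLED);
}
```

With the pending bit set, `wait_done_original()` is false (the caller would sleep), while `wait_done_modified()` is true for every input, so a wait built on it returns immediately.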
Hi, Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> writes: >>>>>> Thanks for picking this for -next. >>>>>> Is it better to have this in v4.16-rc fixes? >>>>>> and also stable? v4.12+ >>>>> >>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit >>>>> log ;-) >>>>> >>>>> The best we can do now, is wait for -rc1 and manually send the commit to >>>>> stable. >>>>> >>>> >>>> That's fine. Thanks. >>>> >>> >>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used >>> wait_event_lock_irq() - as result infinite loop. >> >> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded >> a gadget driver? >> > No, not during rmmod's. > We using our internal USB testing tool. Test case; ISOC OUT, transfer > size N frames. When host starts ISOC OUT traffic then the dwc3 based on > "Transfer not ready" event in frame F starts transfers staring from > frame F+4 (for bInterval=1) as result 4 requests, which already queued > on device side, remain incomplete. Function driver on some timeout > trying dequeue these 4 requests (without disabling EP) to complete test. > For IN ISOC's these requests completed on MISSED ISOC event, but for > ISOC OUT required call dequeue on some timeout. okay >>> Actually to fix this issue I updated condition of wait function >>> from: >>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING) >>> to: >>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED) >> >> you're not fixing anything. You're, essentially, removing the entire >> end transfer pending logic. > yes, you are right, but how to overcome this infinite loop? Replace > wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()? The best way here would be to figure why we're missing command complete IRQ in those cases. According to documentation, we *should* receive that interrupt, so why is it missing? -- balbi
Hi, On 3/19/2018 12:55 PM, Felipe Balbi wrote: > > Hi, > > Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> writes: >>>>>>> Thanks for picking this for -next. >>>>>>> Is it better to have this in v4.16-rc fixes? >>>>>>> and also stable? v4.12+ >>>>>> >>>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit >>>>>> log ;-) >>>>>> >>>>>> The best we can do now, is wait for -rc1 and manually send the commit to >>>>>> stable. >>>>>> >>>>> >>>>> That's fine. Thanks. >>>>> >>>> >>>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used >>>> wait_event_lock_irq() - as result infinite loop. >>> >>> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded >>> a gadget driver? >>> >> No, not during rmmod's. >> We using our internal USB testing tool. Test case; ISOC OUT, transfer >> size N frames. When host starts ISOC OUT traffic then the dwc3 based on >> "Transfer not ready" event in frame F starts transfers staring from >> frame F+4 (for bInterval=1) as result 4 requests, which already queued >> on device side, remain incomplete. Function driver on some timeout >> trying dequeue these 4 requests (without disabling EP) to complete test. >> For IN ISOC's these requests completed on MISSED ISOC event, but for >> ISOC OUT required call dequeue on some timeout. > > okay > >>>> Actually to fix this issue I updated condition of wait function >>>> from: >>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING) >>>> to: >>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED) >>> >>> you're not fixing anything. You're, essentially, removing the entire >>> end transfer pending logic. >> yes, you are right, but how to overcome this infinite loop? Replace >> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()? > > The best way here would be to figure why we're missing command complete > IRQ in those cases. According to documentation, we *should* receive that > interrupt, so why is it missing? 
>

Additional info on the test: the core configuration is HS-only mode, the test speed is HS, and the core version is v2.90a. Maybe it will help to understand the cause of the issue.
BTW, currently, to pass the ISOC OUT test described above, we just commented out wait_event_lock_irq() in the dwc3_gadget_ep_dequeue() function and successfully received the request completion in the function driver.

Thanks,
Minas
Hi, On 3/19/2018 3:36 PM, Minas Harutyunyan wrote: > Hi, > > On 3/19/2018 12:55 PM, Felipe Balbi wrote: >> >> Hi, >> >> Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> writes: >>>>>>>> Thanks for picking this for -next. >>>>>>>> Is it better to have this in v4.16-rc fixes? >>>>>>>> and also stable? v4.12+ >>>>>>> >>>>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit >>>>>>> log ;-) >>>>>>> >>>>>>> The best we can do now, is wait for -rc1 and manually send the commit to >>>>>>> stable. >>>>>>> >>>>>> >>>>>> That's fine. Thanks. >>>>>> >>>>> >>>>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used >>>>> wait_event_lock_irq() - as result infinite loop. >>>> >>>> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded >>>> a gadget driver? >>>> >>> No, not during rmmod's. >>> We using our internal USB testing tool. Test case; ISOC OUT, transfer >>> size N frames. When host starts ISOC OUT traffic then the dwc3 based on >>> "Transfer not ready" event in frame F starts transfers staring from >>> frame F+4 (for bInterval=1) as result 4 requests, which already queued >>> on device side, remain incomplete. Function driver on some timeout >>> trying dequeue these 4 requests (without disabling EP) to complete test. >>> For IN ISOC's these requests completed on MISSED ISOC event, but for >>> ISOC OUT required call dequeue on some timeout. >> >> okay >> >>>>> Actually to fix this issue I updated condition of wait function >>>>> from: >>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING) >>>>> to: >>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED) >>>> >>>> you're not fixing anything. You're, essentially, removing the entire >>>> end transfer pending logic. >>> yes, you are right, but how to overcome this infinite loop? Replace >>> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()? >> >> The best way here would be to figure why we're missing command complete >> IRQ in those cases. 
According to documentation, we *should* receive that >> interrupt, so why is it missing? >> > > Additional info on test. Core configuration is HS only mode, test speed > HS, core version v2.90a. Maybe it will help to understand cause of issue. > BTW, currently to pass above describe ISOC OUT test we just commented > wait_event_lock_irq() in dwc3_gadget_ep_dequeue() function and > successfully received request completion in function driver. > Thanks, > Minas > One more info: while function driver call dequeue, host periodically send control read command to get status of test from function - test In Progress or Finished. Thanks, Minas
Hi Filipe, On 3/19/2018 5:53 PM, Minas Harutyunyan wrote: > Hi, > > On 3/19/2018 3:36 PM, Minas Harutyunyan wrote: >> Hi, >> >> On 3/19/2018 12:55 PM, Felipe Balbi wrote: >>> >>> Hi, >>> >>> Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> writes: >>>>>>>>> Thanks for picking this for -next. >>>>>>>>> Is it better to have this in v4.16-rc fixes? >>>>>>>>> and also stable? v4.12+ >>>>>>>> >>>>>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit >>>>>>>> log ;-) >>>>>>>> >>>>>>>> The best we can do now, is wait for -rc1 and manually send the commit to >>>>>>>> stable. >>>>>>>> >>>>>>> >>>>>>> That's fine. Thanks. >>>>>>> >>>>>> >>>>>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used >>>>>> wait_event_lock_irq() - as result infinite loop. >>>>> >>>>> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded >>>>> a gadget driver? >>>>> >>>> No, not during rmmod's. >>>> We using our internal USB testing tool. Test case; ISOC OUT, transfer >>>> size N frames. When host starts ISOC OUT traffic then the dwc3 based on >>>> "Transfer not ready" event in frame F starts transfers staring from >>>> frame F+4 (for bInterval=1) as result 4 requests, which already queued >>>> on device side, remain incomplete. Function driver on some timeout >>>> trying dequeue these 4 requests (without disabling EP) to complete test. >>>> For IN ISOC's these requests completed on MISSED ISOC event, but for >>>> ISOC OUT required call dequeue on some timeout. >>> >>> okay >>> >>>>>> Actually to fix this issue I updated condition of wait function >>>>>> from: >>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING) >>>>>> to: >>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED) >>>>> >>>>> you're not fixing anything. You're, essentially, removing the entire >>>>> end transfer pending logic. >>>> yes, you are right, but how to overcome this infinite loop? 
Replace >>>> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()? >>> >>> The best way here would be to figure why we're missing command complete >>> IRQ in those cases. According to documentation, we *should* receive that >>> interrupt, so why is it missing? >>> >> >> Additional info on test. Core configuration is HS only mode, test speed >> HS, core version v2.90a. Maybe it will help to understand cause of issue. >> BTW, currently to pass above describe ISOC OUT test we just commented >> wait_event_lock_irq() in dwc3_gadget_ep_dequeue() function and >> successfully received request completion in function driver. >> Thanks, >> Minas >> > > One more info: while function driver call dequeue, host periodically > send control read command to get status of test from function - test In > Progress or Finished. > Thanks, > Minas > Your last dwc3 patch series allow us to successfully dequeuing remaining requests without falling in to infinite loop. Thank you, Minas
Hi, Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> writes: >>>>>>> Actually to fix this issue I updated condition of wait function >>>>>>> from: >>>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING) >>>>>>> to: >>>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED) >>>>>> >>>>>> you're not fixing anything. You're, essentially, removing the entire >>>>>> end transfer pending logic. >>>>> yes, you are right, but how to overcome this infinite loop? Replace >>>>> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()? >>>> >>>> The best way here would be to figure why we're missing command complete >>>> IRQ in those cases. According to documentation, we *should* receive that >>>> interrupt, so why is it missing? >>>> >>> >>> Additional info on test. Core configuration is HS only mode, test speed >>> HS, core version v2.90a. Maybe it will help to understand cause of issue. >>> BTW, currently to pass above describe ISOC OUT test we just commented >>> wait_event_lock_irq() in dwc3_gadget_ep_dequeue() function and >>> successfully received request completion in function driver. >>> Thanks, >>> Minas >>> >> >> One more info: while function driver call dequeue, host periodically >> send control read command to get status of test from function - test In >> Progress or Finished. >> Thanks, >> Minas >> > > Your last dwc3 patch series allow us to successfully dequeuing remaining > requests without falling in to infinite loop. that's cool, thanks :-) I'll just fix the documentation bug I introduced heh :-) -- balbi
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 2bda4eb..0a360da 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
 
 void dwc3_gadget_exit(struct dwc3 *dwc)
 {
+	int epnum;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dwc->lock, flags);
+	for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
+		struct dwc3_ep *dep = dwc->eps[epnum];
+
+		if (!dep)
+			continue;
+
+		dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
+	}
+	spin_unlock_irqrestore(&dwc->lock, flags);
+
 	usb_del_gadget_udc(&dwc->gadget);
 	dwc3_gadget_free_endpoints(dwc);
 	dma_free_coherent(dwc->sysdev, DWC3_BOUNCE_SIZE, dwc->bounce,
In the following test we get stuck by sleeping forever in _dwc3_set_mode()
after which dual-role switching doesn't work.

On dra7-evm's dual-role port,
- Load g_zero gadget driver and enumerate to host
- suspend to mem
- disconnect USB cable to host and connect otg cable with Pen drive in it.
- resume system
- we sleep indefinitely in _dwc3_set_mode due to.
  dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
  dwc3_gadget_stop()->wait_event_lock_irq()

Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
so we don't wait in dwc3_gadget_stop().

Signed-off-by: Roger Quadros <rogerq@ti.com>
---
 drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

-- 
cheers,
-roger

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki