diff mbox series

[2/2] usb: ehci: unlink itd/sitds from hardware list if the controller has stopped periodic schedule

Message ID 20230718112600.3969141-2-xu.yang_2@nxp.com
State New
Headers show
Series [1/2] usb: host: ehci-sched: add itd/sitd_unlink_urb() functions | expand

Commit Message

Xu Yang July 18, 2023, 11:26 a.m. UTC
In current design, the ehci driver will not unlink itd/sitds from the
hardware list when dequeue isochronous urbs. Rather just wait until they
complete normally or their time slot expires. However, this will cause
issues if the controller has stopped periodic schedule before finished
all periodic schedule. The urb will not be done forever in this case and
then usb_kill/poison_urb() will always wait there.

The ChipIdea IP exactly has a bug: if frame babble occurs during periodic
transfer, PE (PORTSC.bit2) will be cleared and the controller will stop
periodic schedule immediately. So if the user tries to kill or poison
related urb, it will wait there since the urb can't be done forever.

This patch will check if this issue occurs, then it will unlink itd/sitds
from the hardware list depends on the result.

Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
---
 drivers/usb/host/ehci-hcd.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

Comments

Xu Yang Aug. 8, 2023, 10:03 a.m. UTC | #1
Hi Alan,

> On Sat, Jul 29, 2023 at 06:55:07AM +0000, Xu Yang wrote:
> > Many thanks for your comments and suggestions.
> >
> > Yes, there is a "live" variable in scan_isoc() to handle the case where
> > root hub has stopped periodic schedule. I have rechecked the root cause
> > of the issue , that is the USB controller has disabled the port (PE cleared)
> > and asserted USBERRINT when frame babble is detected, but PEC is not
> > asserted. Therefore, the SW didn't aware that port has been disabled.
> > The periodic schedule keeps running at this moment. Then the SW keeps
> > sending packets to this port, but all of the transfers will fail. But periodic
> > schedule will also be disabled after a period of time. Finally, all of the linked
> > itds are penging there. The code can't enter into scan_isoc() if only USBERRINT
> > is asserted.
> 
> That's not true.  The io_watchdog timer continues to fire periodically
> (at 100 ms intervals) as long as isoc_count > 0.  The timer's handler is
> ehci_work(), which calls scan_isoc() if isoc_count > 0.

Yes, the io_watchdog timer will cleanup the isoc periodically as long as
isoc_count > 0. 

I did see all of the linked itds are pending there in my case at the end. After my
debug, I found the chipidea/host.c had set ehci->need_io_watchdog to 0 which
will have impact on turn_on_io_watchdog().

The host has enabled 1 intr endpoint and 2 isoc endpoints from USB camera.
Therefore, ehci->intr_count is always 1 and ehci->isoc_count is changing from
time to time during camera is running. When PE is cleared by HW, isoc_count
may be decreased to 0 after scan_isoc(). When turn_on_io_watchdog() is called,
EHCI_HRTIMER_IO_WATCHDOG event will not be enabled due to isoc_count=0
and need_io_watchdog=0 too. When isoc urb is submited again, enable_periodic()
will still not enable EHCI_HRTIMER_IO_WATCHDOG event due to periodic_count>0.
Then, the linked itds are pending there as long as intr urb has not been completed.

Thanks,
Xu Yang

> 
> > For this issue, I think the SW needs to aware that the port has been disabled
> > although PEC not asserted by HW. I will send another patch to fix this issue
> > later.
> 
> Yes, that sounds like a nasty problem.  The kernel wouldn't realize
> that the device had disconnected.
> 
> Alan Stern
Alan Stern Aug. 8, 2023, 3:13 p.m. UTC | #2
On Tue, Aug 08, 2023 at 10:03:39AM +0000, Xu Yang wrote:
> Hi Alan,
> 
> > On Sat, Jul 29, 2023 at 06:55:07AM +0000, Xu Yang wrote:
> > > Many thanks for your comments and suggestions.
> > >
> > > Yes, there is a "live" variable in scan_isoc() to handle the case where
> > > root hub has stopped periodic schedule. I have rechecked the root cause
> > > of the issue , that is the USB controller has disabled the port (PE cleared)
> > > and asserted USBERRINT when frame babble is detected, but PEC is not
> > > asserted. Therefore, the SW didn't aware that port has been disabled.
> > > The periodic schedule keeps running at this moment. Then the SW keeps
> > > sending packets to this port, but all of the transfers will fail. But periodic
> > > schedule will also be disabled after a period of time. Finally, all of the linked
> > > itds are penging there. The code can't enter into scan_isoc() if only USBERRINT
> > > is asserted.
> > 
> > That's not true.  The io_watchdog timer continues to fire periodically
> > (at 100 ms intervals) as long as isoc_count > 0.  The timer's handler is
> > ehci_work(), which calls scan_isoc() if isoc_count > 0.
> 
> Yes, the io_watchdog timer will cleanup the isoc periodically as long as
> isoc_count > 0. 
> 
> I did see all of the linked itds are pending there in my case at the end. After my
> debug, I found the chipidea/host.c had set ehci->need_io_watchdog to 0 which
> will have impact on turn_on_io_watchdog().
> 
> The host has enabled 1 intr endpoint and 2 isoc endpoints from USB camera.
> Therefore, ehci->intr_count is always 1 and ehci->isoc_count is changing from
> time to time during camera is running. When PE is cleared by HW, isoc_count
> may be decreased to 0 after scan_isoc(). When turn_on_io_watchdog() is called,
> EHCI_HRTIMER_IO_WATCHDOG event will not be enabled due to isoc_count=0
> and need_io_watchdog=0 too. When isoc urb is submited again, enable_periodic()
> will still not enable EHCI_HRTIMER_IO_WATCHDOG event due to periodic_count>0.

Ooh, that's a bug.  enable_periodic() should call turn_on_io_watchdog() 
no matter what value periodic_count has.  Would you like to fix it?

Alan Stern
diff mbox series

Patch

diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index a1930db0da1c..26dc1d1ae5e8 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -930,10 +930,41 @@  static int ehci_urb_dequeue(struct usb_hcd *hcd, struct urb *urb, int status)
 
 	if (usb_pipetype(urb->pipe) == PIPE_ISOCHRONOUS) {
 		/*
-		 * We don't expedite dequeue for isochronous URBs.
+		 * 1. We don't expedite dequeue for isochronous URBs.
 		 * Just wait until they complete normally or their
 		 * time slot expires.
+		 *
+		 * 2. The ChipIdea IP has a bug: if frame babble occurs,
+		 * PE will be cleared and the controller will stop periodic
+		 * schedule. So if we don't force dequeue this urb, it
+		 * won't be done forever. Here, a force dequeue is needed
+		 * for this case.
 		 */
+		unsigned 		i = HCS_N_PORTS (ehci->hcs_params);
+		bool 			need_force_dequeue = false;
+
+		while (i--) {
+			int pstatus;
+
+			pstatus = ehci_readl(ehci,
+					&ehci->regs->port_status[i]);
+
+			/* Any cleared PE means controller has stopped
+			 * periodic schedule.
+			 */
+			if (!(pstatus & PORT_PE)) {
+				need_force_dequeue = true;
+				break;
+			}
+		}
+
+		if (!need_force_dequeue)
+			goto done;
+
+		if (urb->dev->speed == USB_SPEED_HIGH)
+			itd_unlink_urb(ehci, urb);
+		else
+			sitd_unlink_urb(ehci, urb);
 	} else {
 		qh = (struct ehci_qh *) urb->hcpriv;
 		qh->unlink_reason |= QH_UNLINK_REQUESTED;