diff mbox series

USB: core: Make usb_start_wait_urb() interruptible

Message ID 20210829015825.GA297712@rowland.harvard.edu
State New
Headers show
Series USB: core: Make usb_start_wait_urb() interruptible | expand

Commit Message

Alan Stern Aug. 29, 2021, 1:58 a.m. UTC
usb_start_wait_urb() can be called from user processes by means of the
USBDEVFS_BULK and USBDEVFS_CONTROL ioctls in usbfs.  Consequently it
should not contain an uninterruptible wait of arbitrarily long length
(the timeout value is specified here by the user, so it can be
practically anything).  Doing so leads the kernel to complain about
"Task X blocked for more than N seconds", as found in testing by
syzbot:

INFO: task syz-executor.0:8700 blocked for more than 143 seconds.
      Not tainted 5.14.0-rc7-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.0  state:D stack:23192 pid: 8700 ppid:  8455 flags:0x00004004
Call Trace:
 context_switch kernel/sched/core.c:4681 [inline]
 __schedule+0xc07/0x11f0 kernel/sched/core.c:5938
 schedule+0x14b/0x210 kernel/sched/core.c:6017
 schedule_timeout+0x98/0x2f0 kernel/time/timer.c:1857
 do_wait_for_common+0x2da/0x480 kernel/sched/completion.c:85
 __wait_for_common kernel/sched/completion.c:106 [inline]
 wait_for_common kernel/sched/completion.c:117 [inline]
 wait_for_completion_timeout+0x46/0x60 kernel/sched/completion.c:157
 usb_start_wait_urb+0x167/0x550 drivers/usb/core/message.c:63
 do_proc_bulk+0x978/0x1080 drivers/usb/core/devio.c:1236
 proc_bulk drivers/usb/core/devio.c:1273 [inline]
 usbdev_do_ioctl drivers/usb/core/devio.c:2547 [inline]
 usbdev_ioctl+0x3441/0x6b10 drivers/usb/core/devio.c:2713
...

This patch fixes the problem by converting the uninterruptible wait to
an interruptible one.  For the most part this won't affect calls to
usb_start_wait_urb(), because they are made by kernel threads and so
can't receive most signals.

But in some cases such calls may occur in user threads in contexts
other than usbfs ioctls.  A signal in these circumstances could cause
a USB transfer to fail when otherwise it wouldn't.  The outcome
wouldn't be too dreadful, since USB transfers can fail at any time and
the system is prepared to handle these failures gracefully.  In
theory, for example, a signal might cause a driver's probe routine to
fail; in practice if the user doesn't want a probe to fail then he
shouldn't send interrupt signals to the probing process.

Overall, then, making these delays interruptible seems to be an
acceptable risk.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+ada0f7d3d9fd2016d927@syzkaller.appspotmail.com
CC: stable@vger.kernel.org

---


[as1964]


 drivers/usb/core/message.c |   12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Johan Hovold Aug. 30, 2021, 7:56 a.m. UTC | #1
On Sat, Aug 28, 2021 at 09:58:25PM -0400, Alan Stern wrote:
> usb_start_wait_urb() can be called from user processes by means of the

> USBDEVFS_BULK and USBDEVFS_CONTROL ioctls in usbfs.  Consequently it

> should not contain an uninterruptible wait of arbitrarily long length

> (the timeout value is specified here by the user, so it can be

> practically anything).  Doing so leads the kernel to complain about

> "Task X blocked for more than N seconds", as found in testing by

> syzbot:

> 

> INFO: task syz-executor.0:8700 blocked for more than 143 seconds.

>       Not tainted 5.14.0-rc7-syzkaller #0

> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

> task:syz-executor.0  state:D stack:23192 pid: 8700 ppid:  8455 flags:0x00004004

> Call Trace:

>  context_switch kernel/sched/core.c:4681 [inline]

>  __schedule+0xc07/0x11f0 kernel/sched/core.c:5938

>  schedule+0x14b/0x210 kernel/sched/core.c:6017

>  schedule_timeout+0x98/0x2f0 kernel/time/timer.c:1857

>  do_wait_for_common+0x2da/0x480 kernel/sched/completion.c:85

>  __wait_for_common kernel/sched/completion.c:106 [inline]

>  wait_for_common kernel/sched/completion.c:117 [inline]

>  wait_for_completion_timeout+0x46/0x60 kernel/sched/completion.c:157

>  usb_start_wait_urb+0x167/0x550 drivers/usb/core/message.c:63

>  do_proc_bulk+0x978/0x1080 drivers/usb/core/devio.c:1236

>  proc_bulk drivers/usb/core/devio.c:1273 [inline]

>  usbdev_do_ioctl drivers/usb/core/devio.c:2547 [inline]

>  usbdev_ioctl+0x3441/0x6b10 drivers/usb/core/devio.c:2713

> ...

> 

> This patch fixes the problem by converting the uninterruptible wait to

> an interruptible one.  For the most part this won't affect calls to

> usb_start_wait_urb(), because they are made by kernel threads and so

> can't receive most signals.

> 

> But in some cases such calls may occur in user threads in contexts

> other than usbfs ioctls.  A signal in these circumstances could cause

> a USB transfer to fail when otherwise it wouldn't.  The outcome

> wouldn't be too dreadful, since USB transfers can fail at any time and

> the system is prepared to handle these failures gracefully.  In

> theory, for example, a signal might cause a driver's probe routine to

> fail; in practice if the user doesn't want a probe to fail then he

> shouldn't send interrupt signals to the probing process.


While probe() triggered through sysfs or by module loading is one
example, the USB msg helpers are also called in a lot of other
user-thread contexts such as open() calls etc. It might even be that the
majority of these calls can be done from user threads (post
enumeration).

> Overall, then, making these delays interruptible seems to be an

> acceptable risk.


Possibly, but changing the API like this to fix the usbfs ioctls seems
like using a bit of a too big hammer to me, especially when backporting
to stable.

> Signed-off-by: Alan Stern <stern@rowland.harvard.edu>

> Reported-and-tested-by: syzbot+ada0f7d3d9fd2016d927@syzkaller.appspotmail.com

> CC: stable@vger.kernel.org


Johan
Alan Stern Aug. 30, 2021, 2:46 p.m. UTC | #2
On Mon, Aug 30, 2021 at 09:56:50AM +0200, Johan Hovold wrote:
> On Sat, Aug 28, 2021 at 09:58:25PM -0400, Alan Stern wrote:

> > This patch fixes the problem by converting the uninterruptible wait to

> > an interruptible one.  For the most part this won't affect calls to

> > usb_start_wait_urb(), because they are made by kernel threads and so

> > can't receive most signals.

> > 

> > But in some cases such calls may occur in user threads in contexts

> > other than usbfs ioctls.  A signal in these circumstances could cause

> > a USB transfer to fail when otherwise it wouldn't.  The outcome

> > wouldn't be too dreadful, since USB transfers can fail at any time and

> > the system is prepared to handle these failures gracefully.  In

> > theory, for example, a signal might cause a driver's probe routine to

> > fail; in practice if the user doesn't want a probe to fail then he

> > shouldn't send interrupt signals to the probing process.

> 

> While probe() triggered through sysfs or by module loading is one

> example, the USB msg helpers are also called in a lot of other

> user-thread contexts such as open() calls etc. It might even be that the

> majority of these calls can be done from user threads (post

> enumeration).


Could be.  It's not a well defined matter; it depends on what drivers 
are in use and how they are used.

Consider that a control message in a driver is likely to use the 
default USB_CTRL_[GS]ET_TIMEOUT value of 5 seconds.  Does it make sense 
to allow uninterruptible wait states to last as long as that?

And to what extent does it matter if we make these delays 
interruptible?  A signal delivered during a system call will be fielded 
when the call returns if not earlier; the only difference will be that 
now some USB messages may be aborted.  For things like SIGINT or 
SIGTERM this seems reasonable.  (I'm not so sure about things like 
SIGALRM, SIGIO, or SIGSTOP, though.)

> > Overall, then, making these delays interruptible seems to be an

> > acceptable risk.

> 

> Possibly, but changing the API like this to fix the usbfs ioctls seems

> like using a bit of a too big hammer to me, especially when backporting

> to stable.


Perhaps the stable backport could be delayed for a while (say, one 
release cycle).

Do you have alternative suggestions?  I don't think we want special 
interruptible versions of usb_control_msg() and usb_bulk_msg() just for 
use by usbfs.

Alan Stern
Oliver Neukum Aug. 30, 2021, 3:11 p.m. UTC | #3
On 30.08.21 16:46, Alan Stern wrote:
> Do you have alternative suggestions?  I don't think we want special 

> interruptible versions of usb_control_msg() and usb_bulk_msg() just for 

> use by usbfs.


Hi,

why not just pass a flag?

    Regards
        Oliver
Alan Stern Aug. 30, 2021, 4:09 p.m. UTC | #4
On Mon, Aug 30, 2021 at 05:11:53PM +0200, Oliver Neukum wrote:
> 

> On 30.08.21 16:46, Alan Stern wrote:

> > Do you have alternative suggestions?  I don't think we want special 

> > interruptible versions of usb_control_msg() and usb_bulk_msg() just for 

> > use by usbfs.

> 

> Hi,

> 

> why not just pass a flag?


We could.  But you're ignoring the question I asked earlier in that 
email: Is a 5-second uninterruptible delay (the default USB control 
timeout) acceptable in general?

Alan Stern
Oliver Neukum Aug. 31, 2021, 8:52 a.m. UTC | #5
On 30.08.21 18:09, Alan Stern wrote:
> On Mon, Aug 30, 2021 at 05:11:53PM +0200, Oliver Neukum wrote:

>> On 30.08.21 16:46, Alan Stern wrote:

>>> Do you have alternative suggestions?  I don't think we want special 

>>> interruptible versions of usb_control_msg() and usb_bulk_msg() just for 

>>> use by usbfs.

>> Hi,

>>

>> why not just pass a flag?

> We could.  But you're ignoring the question I asked earlier in that 

> email: Is a 5-second uninterruptible delay (the default USB control 

> timeout) acceptable in general?

We cannot change the five seconds, so this boils down to whether
we can always handle signals when we need to send control messages.
The answer to that is negative.

Suspend/resume, block IO, probe, ioctl(). This list will be rather long.

    Regards
        Oliver
Johan Hovold Aug. 31, 2021, 9:13 a.m. UTC | #6
On Mon, Aug 30, 2021 at 10:46:13AM -0400, Alan Stern wrote:
> On Mon, Aug 30, 2021 at 09:56:50AM +0200, Johan Hovold wrote:

> > On Sat, Aug 28, 2021 at 09:58:25PM -0400, Alan Stern wrote:

> > > This patch fixes the problem by converting the uninterruptible wait to

> > > an interruptible one.  For the most part this won't affect calls to

> > > usb_start_wait_urb(), because they are made by kernel threads and so

> > > can't receive most signals.

> > > 

> > > But in some cases such calls may occur in user threads in contexts

> > > other than usbfs ioctls.  A signal in these circumstances could cause

> > > a USB transfer to fail when otherwise it wouldn't.  The outcome

> > > wouldn't be too dreadful, since USB transfers can fail at any time and

> > > the system is prepared to handle these failures gracefully.  In

> > > theory, for example, a signal might cause a driver's probe routine to

> > > fail; in practice if the user doesn't want a probe to fail then he

> > > shouldn't send interrupt signals to the probing process.

> > 

> > While probe() triggered through sysfs or by module loading is one

> > example, the USB msg helpers are also called in a lot of other

> > user-thread contexts such as open() calls etc. It might even be that the

> > majority of these calls can be done from user threads (post

> > enumeration).

> 

> Could be.  It's not a well defined matter; it depends on what drivers 

> are in use and how they are used.


Right, but the commit message seemed to suggest that these helpers being
run from interruptible threads was the exception.

> Consider that a control message in a driver is likely to use the 

> default USB_CTRL_[GS]ET_TIMEOUT value of 5 seconds.  Does it make sense 

> to allow uninterruptible wait states to last as long as that?


Perhaps sometimes? I don't have a use case at hand, but I haven't
reviewed all drivers either.

The comment above usb_start_wait_urb() (which also needs to be updated,
by the way) even suggests that drivers should "implement their own
interruptible routines" so perhaps this has just gone unnoticed for 20
odd years. And the question then becomes, why didn't we use
interruptible sleep from the start?

And trying to answer that I find that that's precisely what we did, but
for some reason it was changed to uninterruptible sleep in v2.4.11
without a motivation (that I could easily find spelled out).

> And to what extent does it matter if we make these delays 

> interruptible?  A signal delivered during a system call will be fielded 

> when the call returns if not earlier; the only difference will be that 

> now some USB messages may be aborted.  For things like SIGINT or 

> SIGTERM this seems reasonable.  (I'm not so sure about things like 

> SIGALRM, SIGIO, or SIGSTOP, though.)


I'm not saying I'm necessarily against the change. It just seemed a bit
rushed to change the (stable) API like this while claiming it won't
affect most call sites.

> > > Overall, then, making these delays interruptible seems to be an

> > > acceptable risk.

> > 

> > Possibly, but changing the API like this to fix the usbfs ioctls seems

> > like using a bit of a too big hammer to me, especially when backporting

> > to stable.

> 

> Perhaps the stable backport could be delayed for a while (say, one 

> release cycle).


That might work.

> Do you have alternative suggestions?  I don't think we want special 

> interruptible versions of usb_control_msg() and usb_bulk_msg() just for 

> use by usbfs.


usbfs could carry a temporary local implementation as the documentation
for usb_start_wait_urb() currently suggests. I assume we can't limit the
usbfs timeouts.

Johan
Oliver Neukum Aug. 31, 2021, 10:47 a.m. UTC | #7
On 31.08.21 11:13, Johan Hovold wrote:
> The comment above usb_start_wait_urb() (which also needs to be updated,

> by the way) even suggests that drivers should "implement their own

> interruptible routines" so perhaps this has just gone unnoticed for 20

> odd years. And the question then becomes, why didn't we use

> interruptible sleep from the start?

>

> And trying to answer that I find that that's precisely what we did, but

> for some reason it was changed to uninterruptible sleep in v2.4.11

> without a motivation (that I could easily find spelled out).


I must admit that I do not remember. But there are a lot of situations
requiring control messages that do not allow signal delivery.

Take for example a device that is HID and storage. We are doing
HID error handling, which can involve a device reset. You absolutely
cannot deliver a signal right now, as you have a device that is in an
undefined
state in the block layer.

It looks to me very much like we need both versions and as a rule of thumb,
while you would use GFP_NOIO you must also use the uninterruptible
messaging.

    Regards
        Oliver
Oliver Neukum Aug. 31, 2021, 11:02 a.m. UTC | #8
On 31.08.21 11:13, Johan Hovold wrote:
> On Mon, Aug 30, 2021 at 10:46:13AM -0400, Alan Stern wrote:

>> Perhaps the stable backport could be delayed for a while (say, one 

>> release cycle).

> That might work.

>

>> Do you have alternative suggestions?  I don't think we want special 

>> interruptible versions of usb_control_msg() and usb_bulk_msg() just for 

>> use by usbfs.

> usbfs could carry a temporary local implementation as the documentation

> for usb_start_wait_urb() currently suggests. I assume we can't limit the

> usbfs timeouts.

Upon further considerations forcing user space to handle signals also
breaks the API, albeit in a more subtle manner. I'd suggest that we use
wait_event_killable_timeout(). And do it the way Alan initially disliked,
that is code a version for use by usbfs.

Thus we'd avoid the issue of having an unkillable process, but we do
not impose a need to handle signals.

    Regards
        Oliver
Johan Hovold Aug. 31, 2021, 11:10 a.m. UTC | #9
On Tue, Aug 31, 2021 at 11:13:59AM +0200, Johan Hovold wrote:
> On Mon, Aug 30, 2021 at 10:46:13AM -0400, Alan Stern wrote:


> > Consider that a control message in a driver is likely to use the 

> > default USB_CTRL_[GS]ET_TIMEOUT value of 5 seconds.  Does it make sense 

> > to allow uninterruptible wait states to last as long as that?

> 

> Perhaps sometimes? I don't have a use case at hand, but I haven't

> reviewed all drivers either.

> 

> The comment above usb_start_wait_urb() (which also needs to be updated,

> by the way) even suggests that drivers should "implement their own

> interruptible routines" so perhaps this has just gone unnoticed for 20

> odd years. And the question then becomes, why didn't we use

> interruptible sleep from the start?

> 

> And trying to answer that I find that that's precisely what we did, but

> for some reason it was changed to uninterruptible sleep in v2.4.11

> without a motivation (that I could easily find spelled out).


Here it is:

	https://lore.kernel.org/lkml/20010818013101.A7058@devserv.devel.redhat.com/

It's rationale does not seem valid anymore (i.e. the NULL deref), but
the example is still instructive.

If you close for example a v4l application you still expect the device
to be shut down orderly but with interruptible sleep all control
requests during shutdown will be aborted immediately.

Similar for USB serial devices which would for example fail to deassert
DTR/RTS, etc. (I just verified with the patch applied.)

Johan
Alan Stern Aug. 31, 2021, 5:03 p.m. UTC | #10
On Tue, Aug 31, 2021 at 01:10:32PM +0200, Johan Hovold wrote:
> On Tue, Aug 31, 2021 at 11:13:59AM +0200, Johan Hovold wrote:

> > On Mon, Aug 30, 2021 at 10:46:13AM -0400, Alan Stern wrote:

> 

> > > Consider that a control message in a driver is likely to use the 

> > > default USB_CTRL_[GS]ET_TIMEOUT value of 5 seconds.  Does it make sense 

> > > to allow uninterruptible wait states to last as long as that?

> > 

> > Perhaps sometimes? I don't have a use case at hand, but I haven't

> > reviewed all drivers either.

> > 

> > The comment above usb_start_wait_urb() (which also needs to be updated,

> > by the way) even suggests that drivers should "implement their own

> > interruptible routines" so perhaps this has just gone unnoticed for 20

> > odd years. And the question then becomes, why didn't we use

> > interruptible sleep from the start?

> > 

> > And trying to answer that I find that that's precisely what we did, but

> > for some reason it was changed to uninterruptible sleep in v2.4.11

> > without a motivation (that I could easily find spelled out).

> 

> Here it is:

> 

> 	https://lore.kernel.org/lkml/20010818013101.A7058@devserv.devel.redhat.com/

> 

> It's rationale does not seem valid anymore (i.e. the NULL deref), but

> the example is still instructive.

> 

> If you close for example a v4l application you still expect the device

> to be shut down orderly but with interruptible sleep all control

> requests during shutdown will be aborted immediately.

> 

> Similar for USB serial devices which would for example fail to deassert

> DTR/RTS, etc. (I just verified with the patch applied.)


On Tue, Aug 31, 2021 at 01:02:11PM +0200, Oliver Neukum wrote:
> Upon further considerations forcing user space to handle signals also

> breaks the API, albeit in a more subtle manner. I'd suggest that we use

> wait_event_killable_timeout(). And do it the way Alan initially disliked,

> that is code a version for use by usbfs.

>

> Thus we'd avoid the issue of having an unkillable process, but we do

> not impose a need to handle signals.


Okay, I'll play it safe and rewrite the patch, adding special-purpose 
routines just for usbfs.

Will wait_event_killable_timeout() prevent complaints about tasks being 
blocked for too long (the reason syzbot reported this in the first 
place)?

Alan Stern
Oliver Neukum Sept. 1, 2021, 8:16 a.m. UTC | #11
On 31.08.21 19:03, Alan Stern wrote:
>

> Will wait_event_killable_timeout() prevent complaints about tasks being 

> blocked for too long (the reason syzbot reported this in the first 

> place)?


Very good question. TASK_KILLABLE is its own task state, so I'd say
if it doesn't the test suite needs to be fixed.

    Regards
        Oliver
diff mbox series

Patch

Index: usb-devel/drivers/usb/core/message.c
===================================================================
--- usb-devel.orig/drivers/usb/core/message.c
+++ usb-devel/drivers/usb/core/message.c
@@ -60,12 +60,18 @@  static int usb_start_wait_urb(struct urb
 		goto out;
 
 	expire = timeout ? msecs_to_jiffies(timeout) : MAX_SCHEDULE_TIMEOUT;
-	if (!wait_for_completion_timeout(&ctx.done, expire)) {
+	retval = wait_for_completion_interruptible_timeout(&ctx.done, expire);
+	if (retval <= 0) {
 		usb_kill_urb(urb);
-		retval = (ctx.status == -ENOENT ? -ETIMEDOUT : ctx.status);
+		if (ctx.status != -ENOENT)	/* URB already completed */
+			retval = ctx.status;
+		else if (retval == 0)
+			retval = -ETIMEDOUT;
+		else
+			retval = -EINTR;
 
 		dev_dbg(&urb->dev->dev,
-			"%s timed out on ep%d%s len=%u/%u\n",
+			"%s timed out or interrupted on ep%d%s len=%u/%u\n",
 			current->comm,
 			usb_endpoint_num(&urb->ep->desc),
 			usb_urb_dir_in(urb) ? "in" : "out",