[0/2] Input: uinput - Multiple concurrency fixes in ff request handling

Message ID 20231207063406.556770-1-vi@endrift.com

Message

Vicki Pfau Dec. 7, 2023, 6:34 a.m. UTC
While investigating a report of a process hanging on submitting ff data to a
uinput-derived evdev handle I uncovered several issues regarding cross-thread
concurrency.

The first fix simply makes waiting on the completion object interruptible.
Without this, the submitting process cannot be interrupted, meaning it has to
wait either for the uinput-controlling process to read the data or for the
timeout to be reached. While this is the usual flow, the wait being
uninterruptible means that if the uinput-controlling process is misbehaving,
the submitting process cannot be killed, suspended, or otherwise interrupted
until the timeout is reached, which could take an annoyingly long time for
users.
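
As a rough illustration of what an interruptible wait changes for the caller,
here is a minimal userspace sketch of the three outcomes. It assumes the
return convention of the kernel's wait_for_completion_interruptible_timeout():
positive when the completion fired, 0 on timeout, and -ERESTARTSYS when a
signal arrived. The helper name submit_wait_result and the locally defined
SKETCH_ERESTARTSYS constant are made up for this example and are not taken
from the patch.

```c
#include <errno.h>

/*
 * ERESTARTSYS is kernel-internal and not exposed to userspace, so this
 * sketch defines its own copy of the value for illustration only.
 */
#define SKETCH_ERESTARTSYS 512

/*
 * Hypothetical helper: turn the result of an interruptible, timed wait
 * into the value the submitter would return.
 *   wait_ret  > 0 : request completed, pass through its own result
 *   wait_ret == 0 : timeout expired
 *   wait_ret  < 0 : wait was interrupted by a signal, propagate the error
 */
int submit_wait_result(long wait_ret, int request_retval)
{
	if (wait_ret < 0)
		return (int)wait_ret;
	if (wait_ret == 0)
		return -ETIMEDOUT;
	return request_retval;
}
```

With an uninterruptible wait only the last two cases exist, which is why the
submitter is stuck until the reader responds or the timeout expires.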

The second fix is probably more controversial, and I'm unsure if it's really
the best way to solve this issue. Namely, there exists a small but
reproducible window where closing a uinput device on the uinput side while
uploading ff data via an evdev handle in a separate process will lead to a
deadlock: the uinput ioctl claims the uinput mutex, flushes requests, then
tries to close the input device, which in turn tries to claim the evdev mutex.
Meanwhile, the ff upload has already claimed the evdev mutex and tries to
claim the uinput mutex, where it hangs indefinitely. Because the upload can
never claim the uinput mutex, it never notices that it should exit early, and
so it never releases the evdev mutex.
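
The lock ordering described above can be modelled as a wait-for cycle. The
following is only a toy model in plain C, with no real locks or kernel code
involved: every struct and function name in it is invented for this sketch.
Each task records the lock it is blocked on, each lock records its owner, and
a cycle in that chain means nobody can make progress.

```c
#include <stddef.h>

struct task;

/* A lock records which task currently owns it (NULL when free). */
struct lock { const struct task *owner; };

/* A task records the single lock it is blocked on, if any. */
struct task { const char *name; const struct lock *waiting_for; };

/*
 * Follow the chain "task waits for lock, lock is owned by task, ..."
 * starting at 'start'. Returns 1 if the chain leads back to 'start',
 * meaning no task on the chain can ever proceed; 0 otherwise.
 */
int is_deadlocked(const struct task *start)
{
	const struct task *t = start;
	int steps;

	for (steps = 0; steps < 16; steps++) {	/* bound the walk */
		if (!t->waiting_for || !t->waiting_for->owner)
			return 0;
		t = t->waiting_for->owner;
		if (t == start)
			return 1;
	}
	return 0;
}

/* Build the state from the description and report whether it deadlocks. */
int destroy_vs_upload_deadlocks(void)
{
	struct lock uinput_mutex = { NULL }, evdev_mutex = { NULL };
	struct task destroy = { "uinput destroy ioctl", NULL };
	struct task upload = { "ff upload via evdev", NULL };

	/* Destroy path: holds the uinput mutex, wants the evdev mutex. */
	uinput_mutex.owner = &destroy;
	destroy.waiting_for = &evdev_mutex;

	/* Upload path: holds the evdev mutex, wants the uinput mutex. */
	evdev_mutex.owner = &upload;
	upload.waiting_for = &uinput_mutex;

	return is_deadlocked(&destroy);
}
```

In this model the only ways out are for one side to stop holding its first
lock while waiting, or for both sides to take the locks in the same order.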

My approach to solving this involves temporarily releasing the uinput mutex
after flushing requests, allowing the upload to claim the mutex, then closing
the input device without the mutex being held, and finally reclaiming the
mutex to balance the mutex_unlock later on.
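
The order of operations can be sketched with a toy stand-in for the mutex.
This is not the patch itself: toy_mutex, toy_device and the helper functions
are hypothetical names for this example, and the "mutex" only tracks whether
it is held so that the sequence can be checked in a single thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a mutex; it only tracks whether it is held. */
struct toy_mutex { bool held; };

void toy_lock(struct toy_mutex *m)   { assert(!m->held); m->held = true; }
void toy_unlock(struct toy_mutex *m) { assert(m->held);  m->held = false; }

struct toy_device {
	struct toy_mutex mutex;
	int pending_requests;
	bool registered;
	bool unregistered_with_mutex_held;	/* what the approach avoids */
};

void flush_requests(struct toy_device *d)
{
	d->pending_requests = 0;
}

void unregister_device(struct toy_device *d)
{
	/* Record whether the caller still held the mutex at this point. */
	d->unregistered_with_mutex_held = d->mutex.held;
	d->registered = false;
}

/* Called with d->mutex held; returns with d->mutex held. */
void destroy_device(struct toy_device *d)
{
	flush_requests(d);
	toy_unlock(&d->mutex);	/* blocked uploads can now take the mutex */
	unregister_device(d);	/* may take other subsystems' locks */
	toy_lock(&d->mutex);	/* keep the caller's later unlock balanced */
}
```

The gap between the unlock and the relock is the delicate part: any state
another caller can reach in that window has to be safe to observe.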

I spent quite a while investigating other approaches while trying to come up
with the least hacky and simplest way to fix this. However, a proper fix might
be more involved and have to touch other subsystems, namely evdev, in which
case I would defer to Dmitry for a better fix, as he's a lot more familiar with
these subsystems.

I also suspect that there's a race condition with uinput_dev_event, as most
call sites are protected by the uinput device mutex, but not all of them.
Namely, it can be called via the input device's event function pointer, which
has no idea that the uinput mutex exists.  However, I haven't demonstrated that
there is actually an issue here, so I haven't attempted to fix it.

Vicki Pfau (2):
  Input: uinput - Allow uinput_request_submit wait interrupting
  Input: uinput - Release mutex while unregistering input device

 drivers/input/misc/uinput.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

Comments

Dmitry Torokhov Dec. 8, 2023, 7:58 p.m. UTC | #1
Hi Vicki,

On Wed, Dec 06, 2023 at 10:34:06PM -0800, Vicki Pfau wrote:
> Any pending requests may be holding a mutex from its own subsystem, e.g.
> evdev, while waiting to be able to claim the uinput device mutex.
> However, unregistering the device may try to claim that mutex, leading
> to a deadlock. To prevent this from happening, we need to temporarily
> give up the lock before calling input_unregister_device.

I do not think we can simply give up the lock, the whole thing with
UI_DEV_DESTROY allowing reusing connection to create a new input device
was a huge mistake because if you try to do UI_DEV_CREATE again on
the same fd you'll end up reusing whatever is in udev instance,
including the state and the mutex, and will make a huge mess.

I think the only reasonable way forward is change the driver so that no
ioctls are accepted after UI_DEV_DESTROY and then start untangling the
locking issues (possibly by dropping the lock on destroy after setting
the status - I think you will not observe the lockups you mention if
your application will stop using UI_DEV_DESTROY and simply closes the
fd).

> 
> Fixes: e8b95728f724 ("Input: uinput - avoid FF flush when destroying device")

This is not the commit that introduced the problem, it has been there
since forever.

Thanks.

Vicki Pfau Dec. 9, 2023, 3:24 a.m. UTC | #2
Hi Dmitry,

On 12/8/23 11:58, Dmitry Torokhov wrote:
> Hi Vicki,
> 
> On Wed, Dec 06, 2023 at 10:34:06PM -0800, Vicki Pfau wrote:
>> Any pending requests may be holding a mutex from its own subsystem, e.g.
>> evdev, while waiting to be able to claim the uinput device mutex.
>> However, unregistering the device may try to claim that mutex, leading
>> to a deadlock. To prevent this from happening, we need to temporarily
>> give up the lock before calling input_unregister_device.
> 
> I do not think we can simply give up the lock, the whole thing with
> UI_DEV_DESTROY allowing reusing connection to create a new input device
> was a huge mistake because if you try to do UI_DEV_CREATE again on
> the same fd you'll end up reusing whatever is in udev instance,
> including the state and the mutex, and will make a huge mess.

Yeah, I was curious why this was possible in the first place. It seemed
overcomplicated compared to just opening a new fd. I suppose that that makes
more sense, though it's a bit involved for this.

> 
> I think the only reasonable way forward is change the driver so that no
> ioctls are accepted after UI_DEV_DESTROY and then start untangling the
> locking issues (possibly by dropping the lock on destroy after setting
> the status - I think you will not observe the lockups you mention if
> your application will stop using UI_DEV_DESTROY and simply closes the
> fd).

This does sound like a reasonable way forward. Unfortunately, I don't have
access to the uinput-side application code, but I have been trying to work
with them to flatten out bugs in it. I can pass this suggestion along, though
there is still a reproducible deadlock that could theoretically happen with
other programs in the meantime (though the likelihood of it being hit without
actively trying for it is low).

> 
>>
>> Fixes: e8b95728f724 ("Input: uinput - avoid FF flush when destroying device")
> 
> This is not the commit that introduced the problem, it has been there
> since forever.

My mistake. If I prepare a v2, which I may not, I'll drop the line.
> 
> Thanks.
> 

Vicki