Message ID | 20230125-hid-unregister-leds-v2-1-689cc62fc878@diag.uniroma1.it |
---|---|
State | New |
Series | HID: manually unregister leds on device removal to prevent UAFs |
Hi Pietro,

On Jan 31 2023, Pietro Borrello wrote:
> Unregister the LED controllers before device removal, as
> bigben_set_led() may schedule bigben->worker after the structure has
> been freed, causing a use-after-free.
>
> Fixes: 4eb1b01de5b9 ("HID: hid-bigbenff: fix race condition for scheduled work during removal")
> Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
> ---
>  drivers/hid/hid-bigbenff.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
> index e8b16665860d..d3201b755595 100644
> --- a/drivers/hid/hid-bigbenff.c
> +++ b/drivers/hid/hid-bigbenff.c
> @@ -306,9 +306,14 @@ static enum led_brightness bigben_get_led(struct led_classdev *led)
>
>  static void bigben_remove(struct hid_device *hid)
>  {
> +	int n;
>  	struct bigben_device *bigben = hid_get_drvdata(hid);
>
>  	bigben->removed = true;
> +	for (n = 0; n < NUM_LEDS; n++) {
> +		if (bigben->leds[n])
> +			devm_led_classdev_unregister(&hid->dev, bigben->leds[n]);
> +	}
>  	cancel_work_sync(&bigben->worker);

I don't think this is the correct fix. It would seem that we are
suddenly making the assumption that the devm mechanism would do things
in the wrong order, when the devm_led_classdev_unregister() should be
called *before* the devm_free() of the struct bigben_device.

However, you can trigger a bug, and thus we can analyse a little bit
further what is happening:

* user calls a function on the LED
* bigben_set_led() is called
* .remove() is being called at roughly the same time:
  - bigben->removed is set to true
  - cancel_work_sync() is called
* at that point, bigben_set_led() can not crash because
  led_classdev_unregister() flushes all of its workers, and thus
  prevents the call for dev_kfree(struct bigben_device)
* but now bigben_set_led() calls schedule_work()
* led_classdev_unregister() is now done and devm_kfree() is called for
  struct bigben_device
* now the led worker kicks in, and tries to access struct bigben_device
  and dereferences it to get the value of bigben->removed (and
  bigben->report), which crashes.

So without your patch, the problem seems to be that we call a
schedule_work *after* we set bigben->removed to true and we call
cancel_work_sync().

And if you look at the hid-playstation driver, you'll see that the
schedule_work() call is encapsulated in a spinlock and a check to
ds->output_worker_initialized.

And this is why you can not reproduce on the hid-playstation driver,
because it is guarded against scheduling a worker when the driver is
being removed.

I think I prefer a lot more the playstation solution: having to manually
call a devm_release_free always feels wrong in a normal path. And also
by doing so, you might paper another problem that might happen on an
error path in probe for instance. Also, this means that the pattern you
saw is specific to some drivers, not all depending on how they make use
of workers.

Would you mind respinning that series with those comments?

Cheers,
Benjamin

>  	hid_hw_stop(hid);
>  }
>
> --
> 2.25.1
On Thu, 9 Feb 2023 at 09:55, Benjamin Tissoires <benjamin.tissoires@redhat.com> wrote:
>
> Hi Pietro,
>
> On Jan 31 2023, Pietro Borrello wrote:
> > Unregister the LED controllers before device removal, as
> > bigben_set_led() may schedule bigben->worker after the structure has
> > been freed, causing a use-after-free.
> >
> > Fixes: 4eb1b01de5b9 ("HID: hid-bigbenff: fix race condition for scheduled work during removal")
> > Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
> > ---
> >  drivers/hid/hid-bigbenff.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
> > index e8b16665860d..d3201b755595 100644
> > --- a/drivers/hid/hid-bigbenff.c
> > +++ b/drivers/hid/hid-bigbenff.c
> > @@ -306,9 +306,14 @@ static enum led_brightness bigben_get_led(struct led_classdev *led)
> >
> >  static void bigben_remove(struct hid_device *hid)
> >  {
> > +	int n;
> >  	struct bigben_device *bigben = hid_get_drvdata(hid);
> >
> >  	bigben->removed = true;
> > +	for (n = 0; n < NUM_LEDS; n++) {
> > +		if (bigben->leds[n])
> > +			devm_led_classdev_unregister(&hid->dev, bigben->leds[n]);
> > +	}
> >  	cancel_work_sync(&bigben->worker);
>
> I don't think this is the correct fix. It would seem that we are
> suddenly making the assumption that the devm mechanism would do things
> in the wrong order, when the devm_led_classdev_unregister() should be
> called *before* the devm_free() of the struct bigben_device.
>
> However, you can trigger a bug, and thus we can analyse a little bit
> further what is happening:
>
> * user calls a function on the LED
> * bigben_set_led() is called
> * .remove() is being called at roughly the same time:
>   - bigben->removed is set to true
>   - cancel_work_sync() is called
> * at that point, bigben_set_led() can not crash because
>   led_classdev_unregister() flushes all of its workers, and thus
>   prevents the call for dev_kfree(struct bigben_device)
> * but now bigben_set_led() calls schedule_work()
> * led_classdev_unregister() is now done and devm_kfree() is called for
>   struct bigben_device
> * now the led worker kicks in, and tries to access struct bigben_device
>   and dereferences it to get the value of bigben->removed (and
>   bigben->report), which crashes.
>
> So without your patch, the problem seems to be that we call a
> schedule_work *after* we set bigben->removed to true and we call
> cancel_work_sync().

Yes, this matches my intuition of what is happening here.
Thank you for the extensive description.

>
> And if you look at the hid-playstation driver, you'll see that the
> schedule_work() call is encapsulated in a spinlock and a check to
> ds->output_worker_initialized.
>
> And this is why you can not reproduce on the hid-playstation driver,
> because it is guarded against scheduling a worker when the driver is
> being removed.
>
> I think I prefer a lot more the playstation solution: having to manually
> call a devm_release_free always feels wrong in a normal path. And also
> by doing so, you might paper another problem that might happen on an
> error path in probe for instance. Also, this means that the pattern you
> saw is specific to some drivers, not all depending on how they make use
> of workers.
>

Yes, I agree this would be much cleaner.

> Would you mind respinning that series with those comments?

Sure, I'll work on that!

Best regards,
Pietro
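
For readers following the thread, the guard discussed above (take a lock, check a flag, only then schedule the work) can be sketched in plain userspace C. This is only a model under stated assumptions, not the driver code and not the eventual respin: struct model_dev, model_schedule_work() and model_remove() are hypothetical names, an atomic_flag spin loop stands in for the kernel spinlock, and a counter stands in for schedule_work().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the driver state; not kernel code. */
struct model_dev {
	atomic_flag lock;	/* plays the role of the driver spinlock */
	bool removed;		/* set under the lock once removal starts */
	int queued;		/* counts accepted schedule requests */
};

#define MODEL_DEV_INIT { ATOMIC_FLAG_INIT, false, 0 }

static void model_lock(struct model_dev *d)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;
}

static void model_unlock(struct model_dev *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/* Stand-in for the LED callback path: queue work only if not removed. */
static bool model_schedule_work(struct model_dev *d)
{
	bool accepted = false;

	model_lock(d);
	if (!d->removed) {
		d->queued++;	/* a driver would call schedule_work() here */
		accepted = true;
	}
	model_unlock(d);
	return accepted;
}

/* Stand-in for .remove(): publish the flag under the same lock. */
static void model_remove(struct model_dev *d)
{
	model_lock(d);
	d->removed = true;
	model_unlock(d);
	/* a driver would call cancel_work_sync() after this point */
}
```

Because the flag is tested and the work is queued inside one critical section, a request either lands before removal sets the flag (and is then cancelled by the later cancel step) or is refused; it cannot slip in after the cancel, which is the window described in the analysis above.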
diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
index e8b16665860d..d3201b755595 100644
--- a/drivers/hid/hid-bigbenff.c
+++ b/drivers/hid/hid-bigbenff.c
@@ -306,9 +306,14 @@ static enum led_brightness bigben_get_led(struct led_classdev *led)
 
 static void bigben_remove(struct hid_device *hid)
 {
+	int n;
 	struct bigben_device *bigben = hid_get_drvdata(hid);
 
 	bigben->removed = true;
+	for (n = 0; n < NUM_LEDS; n++) {
+		if (bigben->leds[n])
+			devm_led_classdev_unregister(&hid->dev, bigben->leds[n]);
+	}
 	cancel_work_sync(&bigben->worker);
 	hid_hw_stop(hid);
 }
Unregister the LED controllers before device removal, as
bigben_set_led() may schedule bigben->worker after the structure has
been freed, causing a use-after-free.

Fixes: 4eb1b01de5b9 ("HID: hid-bigbenff: fix race condition for scheduled work during removal")
Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it>
---
 drivers/hid/hid-bigbenff.c | 5 +++++
 1 file changed, 5 insertions(+)