Message ID | 20240625-led-class-device-leak-v1-1-9eb4436310c2@bootlin.com |
---|---|
State | Superseded |
Series | Revert "leds: led-core: Fix refcount leak in of_led_get()" |
Hi Luca,

On Tue, 25 Jun 2024 09:26:52 +0200
Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:

> This reverts commit da1afe8e6099980fe1e2fd7436dca284af9d3f29.
>
> Commit 699a8c7c4bd3 ("leds: Add of_led_get() and led_put()"), introduced in
> 5.5, added of_led_get() and led_put() but missed a put_device() in
> led_put(), thus creating a leak in case the consumer device is removed.
>
> Arguably device removal was not very popular, so this went apparently
> unnoticed until 2022. In January 2023 two different patches got merged to
> fix the same bug:
>
>  - commit da1afe8e6099 ("leds: led-core: Fix refcount leak in of_led_get()")
>  - commit 445110941eb9 ("leds: led-class: Add missing put_device() to led_put()")
>
> They fix the bug in two different ways, which creates no patch conflicts,
> and both were merged in v6.2. The result is that now there is one more
> put_device() than get_device()s, instead of one less.
>
> Arguably device removal is not very popular yet, so this apparently hasn't
> been noticed as well up to now. But it blew up here while I'm working with
> device tree overlay insertion and removal. The symptom is an apparently
> unrelated list of oopses on device removal, with reasons:
>
>   kernfs: can not remove 'uevent', no directory
>   kernfs: can not remove 'brightness', no directory
>   kernfs: can not remove 'max_brightness', no directory
>   ...
>
> Here sysfs fails removing attribute files, which is because the device name
> changed and so the sysfs path. This is because the device name string got
> corrupted, which is because it got freed too early and its memory reused.
>
> Different symptoms could appear in different use cases.
>
> Fix by removing one of the two fixes.
>
> The choice was to remove commit da1afe8e6099 because:
>
>  * it is calling put_device() inside of_led_get() just after getting the
>    device, thus it is basically not refcounting the LED device at all
>    during its entire lifetime
>  * it does not add a corresponding put_device() in led_get(), so it fixes
>    only the OF case
>
> The other fix (445110941eb9) is adding the put_device() in led_put() so it
> covers the entire lifetime, and it works even in the non-DT case.
>
> Fixes: da1afe8e6099 ("leds: led-core: Fix refcount leak in of_led_get()")
> Co-developed-by: Hervé Codina <herve.codina@bootlin.com>
> Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>

As there is a Co-developer, you have to add his/her Signed-off-by:
https://elixir.bootlin.com/linux/v6.10-rc5/source/Documentation/process/submitting-patches.rst#L494

So feel free to:
  a) Add Signed-off-by: Hervé Codina <herve.codina@bootlin.com>
or
  b) Remove Co-developed-by: Hervé Codina <herve.codina@bootlin.com>

Even if I participate in that fix, I will not be upset if you remove the
Co-developed-by :)

Best regards,
Hervé

diff --git a/drivers/leds/led-class.c b/drivers/leds/led-class.c
index 24fcff682b24..b23d2138cd83 100644
--- a/drivers/leds/led-class.c
+++ b/drivers/leds/led-class.c
@@ -258,7 +258,6 @@ struct led_classdev *of_led_get(struct device_node *np, int index)
 
 	led_dev = class_find_device_by_of_node(&leds_class, led_node);
 	of_node_put(led_node);
-	put_device(led_dev);
 
 	return led_module_get(led_dev);
 }

This reverts commit da1afe8e6099980fe1e2fd7436dca284af9d3f29.

Commit 699a8c7c4bd3 ("leds: Add of_led_get() and led_put()"), introduced in
5.5, added of_led_get() and led_put() but missed a put_device() in
led_put(), thus creating a leak in case the consumer device is removed.

Arguably device removal was not very popular, so this went apparently
unnoticed until 2022. In January 2023 two different patches got merged to
fix the same bug:

 - commit da1afe8e6099 ("leds: led-core: Fix refcount leak in of_led_get()")
 - commit 445110941eb9 ("leds: led-class: Add missing put_device() to led_put()")

They fix the bug in two different ways, which creates no patch conflicts,
and both were merged in v6.2. The result is that now there is one more
put_device() than get_device()s, instead of one less.

Arguably device removal is not very popular yet, so this apparently hasn't
been noticed as well up to now. But it blew up here while I'm working with
device tree overlay insertion and removal. The symptom is an apparently
unrelated list of oopses on device removal, with reasons:

  kernfs: can not remove 'uevent', no directory
  kernfs: can not remove 'brightness', no directory
  kernfs: can not remove 'max_brightness', no directory
  ...

Here sysfs fails removing attribute files, which is because the device name
changed and so the sysfs path. This is because the device name string got
corrupted, which is because it got freed too early and its memory reused.

Different symptoms could appear in different use cases.

Fix by removing one of the two fixes.

The choice was to remove commit da1afe8e6099 because:

 * it is calling put_device() inside of_led_get() just after getting the
   device, thus it is basically not refcounting the LED device at all
   during its entire lifetime
 * it does not add a corresponding put_device() in led_get(), so it fixes
   only the OF case

The other fix (445110941eb9) is adding the put_device() in led_put() so it
covers the entire lifetime, and it works even in the non-DT case.

Fixes: da1afe8e6099 ("leds: led-core: Fix refcount leak in of_led_get()")
Co-developed-by: Hervé Codina <herve.codina@bootlin.com>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
---
 drivers/leds/led-class.c | 1 -
 1 file changed, 1 deletion(-)

---
base-commit: 28ef3e64d0a22f6a29a1ea489293715a29623e52
change-id: 20240625-led-class-device-leak-6637a2821678

Best regards,
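
For readers who want to see the imbalance from the commit message in isolation, here is a minimal userspace sketch. It is a toy model only, not kernel code: struct toy_dev, toy_get(), toy_put() and toy_cycle() are hypothetical names invented for this illustration, and the "release" is reduced to setting a flag. It assumes the device starts with one reference held by its owner and that the class lookup in of_led_get() returns the device with one extra reference taken.

```c
#include <stdbool.h>

/* Toy stand-in for a refcounted device; NOT the kernel's struct device. */
struct toy_dev {
	int refcount;
	bool released;
};

static void toy_get(struct toy_dev *d)
{
	d->refcount++;
}

static void toy_put(struct toy_dev *d)
{
	/* A real release callback would free the device memory here. */
	if (--d->refcount == 0)
		d->released = true;
}

/*
 * Replay one consumer get/put cycle on a freshly created toy device.
 *
 * put_in_get: models the put_device() that da1afe8e6099 added to of_led_get()
 * put_in_put: models the put_device() that 445110941eb9 added to led_put()
 *
 * Returns the refcount left once the consumer is gone; 1 means balanced,
 * because the owner's initial reference is still held.
 */
int toy_cycle(bool put_in_get, bool put_in_put, bool *released)
{
	struct toy_dev led = { .refcount = 1, .released = false };

	toy_get(&led);		/* class lookup hands back a referenced device */
	if (put_in_get)
		toy_put(&led);	/* reference dropped right after the lookup */

	/* ... the consumer would use the LED here ... */

	if (put_in_put)
		toy_put(&led);	/* reference dropped when the consumer is done */

	*released = led.released;
	return led.refcount;
}
```

In this model, toy_cycle(false, false, ...) leaves the count at 2 (the original leak), toy_cycle(true, true, ...) drives it to 0 and marks the device released while the owner still holds its reference (the v6.2 state), and toy_cycle(false, true, ...) ends at 1 (the state after this revert). toy_cycle(true, false, ...) also ends at 1, but the consumer holds no reference while it uses the LED, which is the first reason given above for reverting da1afe8e6099 rather than 445110941eb9.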