diff mbox series

[02/13] leds: el15203000: Fix devm vs. non-devm ordering

Message ID 1667983694-15040-3-git-send-email-wangyufen@huawei.com
State New
Headers show
Series leds: Fix devm vs. non-devm ordering | expand

Commit Message

wangyufen Nov. 9, 2022, 8:48 a.m. UTC
When non-devm resources are allocated they mustn't be followed by devm
allocations, otherwise it will break the tear down ordering and might
lead to crashes or other bugs during ->remove() stage. Fix this by
wrapping mutex_destroy() call with devm_add_action_or_reset().

Fixes: fc19967bcb8f ("leds: add LED driver for EL15203000 board")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Cc: Oleh Kravchenko <oleg@kaa.org.ua>
---
 drivers/leds/leds-el15203000.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

Comments

Wei Yongjun Nov. 15, 2022, 2:06 a.m. UTC | #1
Hi Oleh,

On 2022/11/11 18:39, Oleh Kravchenko wrote:
> Hello Wang,
> 
>> 11 лист. 2022 р. о 11:21 wangyufen <wangyufen@huawei.com> написав(ла):
>>
>>
>> 在 2022/11/9 18:43, Oleh Kravchenko 写道:
>>>
>>>
>>>> 9 лист. 2022 р. о 12:25 wangyufen <wangyufen@huawei.com> написав(ла):
>>>>
>>>>
>>>> 在 2022/11/9 17:39, Oleh Kravchenko 写道:
>>>>
>>>>>> -static void el15203000_remove(struct spi_device *spi)
>>>>>>
>>>>> Is remove() callback from struct spi_driver deprecated?
>>>>>
>>>> It is not that remove() callback is deprecated,
>>>> it's that after wrapping mutex_destroy() call with devm_add_action_or_reset(),
>>>> remove() callback is unnecessary here.
>>>>
>>> When remove() is called, the memory allocated by devm_*() is valid.
>>> So what you try to fix here?
>>
>> Fix the &priv->lock used after destroy, for details, please see patch #0
>> LKML: Wang Yufen: [PATCH 00/13] leds: Fix devm vs. non-devm ordering
> 
> It doesn’t make any sense for me.
> You saying that remove() called before devm_* allocation
> if it true then set_brightness_delayed() will crash the system in anyway.
> 
> LED device has a parent SPI device; LED device can’t exist without SPI device.
> 
> So deallocation order should be next:
> 1. LED device devm_*()
> 2. SPI device remove()

The allocation order is as follows:

el15203000_probe()
  mutex_init(&priv->lock);
  el15203000_probe_dt(priv)
    device_for_each_child_node(priv->dev, child) {
      ...
      led->ldev.brightness_set_blocking = el15203000_set_blocking;
      ...
      devm_led_classdev_register_ext(priv->dev, &led->ldev, &init_data);
         dr = devres_alloc(devm_led_classdev_release, sizeof(*dr), GFP_KERNEL);
         <-- dr->node.release = devm_led_classdev_release()
         ...
         devres_add(parent, dr);
         <-- add dr->node to &priv->dev->devres_head

And the full deallocation order should be this:

1. SPI device .remove callback
2. LED device devm_*()
3. SPI device deallocation

spi_unregister_device()
  device_del()
    bus_remove_device()
      device_release_driver_internal()
        __device_release_driver()
          ...
	  device_remove()
	    spi_remove()   <-- call el15203000_remove() here, mutex_destroy(&priv->lock), lock destroy
          ...
          device_unbind_cleanup()
            devres_release_all()
              release_nodes()
              <-- traverse spi->dev->devres_head list and call dr->node.release in sequence.
                   devm_led_classdev_release()
                     led_classdev_unregister()
                     <-- flush set_brightness_work here, before the work flush, set_brightness_work may be sched.
                     <-- that is el15203000_set_blocking()..-> mutex_lock(&led->priv->lock) is called,
                     <-- this leads to the priv->lock use after destroy.
 put_device(&spi->dev) <-- spi device is deallocation in here


Regards,
Wei Yongjun
Wang Yufen Nov. 22, 2022, 1:10 a.m. UTC | #2
在 2022/11/15 10:06, Wei Yongjun 写道:
> Hi Oleh,
> 
> On 2022/11/11 18:39, Oleh Kravchenko wrote:
>> Hello Wang,
>>
>>> 11 лист. 2022 р. о 11:21 wangyufen <wangyufen@huawei.com> написав(ла):
>>>
>>>
>>> 在 2022/11/9 18:43, Oleh Kravchenko 写道:
>>>>
>>>>
>>>>> 9 лист. 2022 р. о 12:25 wangyufen <wangyufen@huawei.com> написав(ла):
>>>>>
>>>>>
>>>>> 在 2022/11/9 17:39, Oleh Kravchenko 写道:
>>>>>
>>>>>>> -static void el15203000_remove(struct spi_device *spi)
>>>>>>>
>>>>>> Is remove() callback from struct spi_driver deprecated?
>>>>>>
>>>>> It is not that remove() callback is deprecated,
>>>>> it's that after wrapping mutex_destroy() call with devm_add_action_or_reset(),
>>>>> remove() callback is unnecessary here.
>>>>>
>>>> When remove() is called, the memory allocated by devm_*() is valid.
>>>> So what you try to fix here?
>>>
>>> Fix the &priv->lock used after destroy, for details, please see patch #0
>>> LKML: Wang Yufen: [PATCH 00/13] leds: Fix devm vs. non-devm ordering
>>
>> It doesn’t make any sense for me.
>> You saying that remove() called before devm_* allocation
>> if it true then set_brightness_delayed() will crash the system in anyway.
>>
>> LED device has a parent SPI device; LED device can’t exist without SPI device.
>>
>> So deallocation order should be next:
>> 1. LED device devm_*()
>> 2. SPI device remove()
> 
> The allocation order is as follows:
> 
> el15203000_probe()
>    mutex_init(&priv->lock);
>    el15203000_probe_dt(priv)
>      device_for_each_child_node(priv->dev, child) {
>        ...
>        led->ldev.brightness_set_blocking = el15203000_set_blocking;
>        ...
>        devm_led_classdev_register_ext(priv->dev, &led->ldev, &init_data);
>           dr = devres_alloc(devm_led_classdev_release, sizeof(*dr), GFP_KERNEL);
>           <-- dr->node.release = devm_led_classdev_release()
>           ...
>           devres_add(parent, dr);
>           <-- add dr->node to &priv->dev->devres_head
> 
> And the full deallocation order should be this:
> 
> 1. SPI device .remove callback
> 2. LED device devm_*()
> 3. SPI device deallocation
> 
> spi_unregister_device()
>    device_del()
>      bus_remove_device()
>        device_release_driver_internal()
>          __device_release_driver()
>            ...
> 	  device_remove()
> 	    spi_remove()   <-- call el15203000_remove() here, mutex_destroy(&priv->lock), lock destroy
>            ...
>            device_unbind_cleanup()
>              devres_release_all()
>                release_nodes()
>                <-- traverse spi->dev->devres_head list and call dr->node.release in sequence.
>                     devm_led_classdev_release()
>                       led_classdev_unregister()
>                       <-- flush set_brightness_work here, before the work flush, set_brightness_work may be sched.
>                       <-- that is el15203000_set_blocking()..-> mutex_lock(&led->priv->lock) is called,
>                       <-- this leads to the priv->lock use after destroy.
>   put_device(&spi->dev) <-- spi device is deallocation in here
> 
>
Hi Oleh,

Judging from the deallocation order above, there is a issue that the 
&priv->lock used after destroy, right?

And thanks Wei for the detailed explanation.

Thanks,
Wang


> Regards,
> Wei Yongjun
>
Oleh Kravchenko Nov. 25, 2022, 10:21 p.m. UTC | #3
Hello Wang,

22.11.22 03:10, Wang Yufen пише:
> 
> Hi Oleh,
> 
> Judging from the deallocation order above, there is a issue that the &priv->lock used after destroy, right?
> 
> And thanks Wei for the detailed explanation.
> 
> Thanks,
> Wang

Sorry, guys.
The last russian missile attack made my work impossible.
I will try to verify all when I have the ability.
diff mbox series

Patch

diff --git a/drivers/leds/leds-el15203000.c b/drivers/leds/leds-el15203000.c
index 7e7b617..9be934e 100644
--- a/drivers/leds/leds-el15203000.c
+++ b/drivers/leds/leds-el15203000.c
@@ -287,10 +287,16 @@  static int el15203000_probe_dt(struct el15203000 *priv)
 	return ret;
 }
 
+static void el15203000_mutex_destroy(void *lock)
+{
+	mutex_destroy(lock);
+}
+
 static int el15203000_probe(struct spi_device *spi)
 {
 	struct el15203000	*priv;
 	size_t			count;
+	int			ret;
 
 	count = device_get_child_node_count(&spi->dev);
 	if (!count) {
@@ -312,15 +318,14 @@  static int el15203000_probe(struct spi_device *spi)
 
 	spi_set_drvdata(spi, priv);
 
+	ret = devm_add_action_or_reset(&spi->dev, el15203000_mutex_destroy,
+				       &priv->lock);
+	if (ret)
+		return ret;
+
 	return el15203000_probe_dt(priv);
 }
 
-static void el15203000_remove(struct spi_device *spi)
-{
-	struct el15203000 *priv = spi_get_drvdata(spi);
-
-	mutex_destroy(&priv->lock);
-}
 
 static const struct of_device_id el15203000_dt_ids[] = {
 	{ .compatible = "crane,el15203000", },
@@ -331,7 +336,6 @@  static void el15203000_remove(struct spi_device *spi)
 
 static struct spi_driver el15203000_driver = {
 	.probe		= el15203000_probe,
-	.remove		= el15203000_remove,
 	.driver = {
 		.name		= KBUILD_MODNAME,
 		.of_match_table	= el15203000_dt_ids,