diff mbox series

[v3] thermal/core: fix error paths in __thermal_cooling_device_register()

Message ID 20230112154721.452292-1-caleb.connolly@linaro.org
State New
Headers show
Series [v3] thermal/core: fix error paths in __thermal_cooling_device_register() | expand

Commit Message

Caleb Connolly Jan. 12, 2023, 3:47 p.m. UTC
There is in invalid call to thermal_cooling_device_destroy_sysfs() and
another to put_device() in the error paths here. Fix them.

Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()")
Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
---
Changes since v2:
 * Rework and simplify into one patch following Yang's suggestions.

V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/
---
 drivers/thermal/thermal_core.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Viresh Kumar Jan. 16, 2023, 4:23 a.m. UTC | #1
On 12-01-23, 15:47, Caleb Connolly wrote:
> There is in invalid call to thermal_cooling_device_destroy_sysfs() and
> another to put_device() in the error paths here. Fix them.
> 
> Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()")
> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
> ---
> Changes since v2:
>  * Rework and simplify into one patch following Yang's suggestions.
> 
> V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/
> ---
>  drivers/thermal/thermal_core.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)

Replied to an earlier version by mistake. Should it be like this ?

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index f17ab2316dbd..18db011d4d68 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -910,17 +910,15 @@ __thermal_cooling_device_register(struct device_node *np,

        ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
        if (ret)
-               goto out_kfree_type;
+               goto out_cdev_type;

        thermal_cooling_device_setup_sysfs(cdev);
        ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
-       if (ret) {
-               thermal_cooling_device_destroy_sysfs(cdev);
-               goto out_kfree_type;
-       }
+       if (ret)
+               goto out_cooling_dev;
        ret = device_register(&cdev->device);
        if (ret)
-               goto out_kfree_type;
+               goto out_put_device;

        /* Add 'this' new cdev to the global cdev list */
        mutex_lock(&thermal_list_lock);
@@ -939,11 +937,12 @@ __thermal_cooling_device_register(struct device_node *np,

        return cdev;

-out_kfree_type:
+out_put_device:
+       put_device(&cdev->device);
+out_cooling_dev:
        thermal_cooling_device_destroy_sysfs(cdev);
+out_cdev_type:
        kfree(cdev->type);
-       put_device(&cdev->device);
-       cdev = NULL;
 out_ida_remove:
        ida_free(&thermal_cdev_ida, id);
 out_kfree_cdev:
Caleb Connolly Jan. 16, 2023, 12:22 p.m. UTC | #2
On 16/01/2023 04:23, Viresh Kumar wrote:
> On 12-01-23, 15:47, Caleb Connolly wrote:
>> There is in invalid call to thermal_cooling_device_destroy_sysfs() and
>> another to put_device() in the error paths here. Fix them.
>>
>> Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()")
>> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
>> ---
>> Changes since v2:
>>  * Rework and simplify into one patch following Yang's suggestions.
>>
>> V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/
>> ---
>>  drivers/thermal/thermal_core.c | 11 ++++++-----
>>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> Replied to an earlier version by mistake. Should it be like this ?

apologies for all the subject rewording.

I'm not 100% clear on the benefits/drawbacks to the different
approaches, am I right in thinking the improvement here (compared to my
patch) is in giving each error condition it's own path so there isn't
any ambiguity?
> 
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index f17ab2316dbd..18db011d4d68 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -910,17 +910,15 @@ __thermal_cooling_device_register(struct device_node *np,
> 
>         ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
>         if (ret)
> -               goto out_kfree_type;
> +               goto out_cdev_type;
> 
>         thermal_cooling_device_setup_sysfs(cdev);
>         ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
> -       if (ret) {
> -               thermal_cooling_device_destroy_sysfs(cdev);
> -               goto out_kfree_type;
> -       }
> +       if (ret)
> +               goto out_cooling_dev;
>         ret = device_register(&cdev->device);
>         if (ret)
> -               goto out_kfree_type;
> +               goto out_put_device;
> 
>         /* Add 'this' new cdev to the global cdev list */
>         mutex_lock(&thermal_list_lock);
> @@ -939,11 +937,12 @@ __thermal_cooling_device_register(struct device_node *np,
> 
>         return cdev;
> 
> -out_kfree_type:
> +out_put_device:
> +       put_device(&cdev->device);
> +out_cooling_dev:
>         thermal_cooling_device_destroy_sysfs(cdev);
> +out_cdev_type:
As far as I can tell this is give or take the same as the two patches I
had in v2 [1]?
>         kfree(cdev->type);
> -       put_device(&cdev->device);
> -       cdev = NULL;
Is removing the explicit null assignment here intentional?

>  out_ida_remove:
>         ida_free(&thermal_cdev_ida, id);
>  out_kfree_cdev:
> 

I'm happy to go ahead an submit a v4 based on this.

[1]:
https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/
Yang Yingliang Jan. 16, 2023, 12:38 p.m. UTC | #3
On 2023/1/16 12:23, Viresh Kumar wrote:
> On 12-01-23, 15:47, Caleb Connolly wrote:
>> There is in invalid call to thermal_cooling_device_destroy_sysfs() and
>> another to put_device() in the error paths here. Fix them.
>>
>> Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()")
>> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
>> ---
>> Changes since v2:
>>   * Rework and simplify into one patch following Yang's suggestions.
>>
>> V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/
>> ---
>>   drivers/thermal/thermal_core.c | 11 ++++++-----
>>   1 file changed, 6 insertions(+), 5 deletions(-)
> Replied to an earlier version by mistake. Should it be like this ?
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index f17ab2316dbd..18db011d4d68 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -910,17 +910,15 @@ __thermal_cooling_device_register(struct device_node *np,
>
>          ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
>          if (ret)
> -               goto out_kfree_type;
> +               goto out_cdev_type;
>
>          thermal_cooling_device_setup_sysfs(cdev);
>          ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
> -       if (ret) {
> -               thermal_cooling_device_destroy_sysfs(cdev);
> -               goto out_kfree_type;
> -       }
> +       if (ret)
> +               goto out_cooling_dev;
>          ret = device_register(&cdev->device);
>          if (ret)
> -               goto out_kfree_type;
> +               goto out_put_device;
>
>          /* Add 'this' new cdev to the global cdev list */
>          mutex_lock(&thermal_list_lock);
> @@ -939,11 +937,12 @@ __thermal_cooling_device_register(struct device_node *np,
>
>          return cdev;
>
> -out_kfree_type:
> +out_put_device:
> +       put_device(&cdev->device);
> +out_cooling_dev:
>          thermal_cooling_device_destroy_sysfs(cdev);
> +out_cdev_type:
>          kfree(cdev->type);
The 'cdev' will be freed after calling put_device(), it can not be 
accessed anymore.

Thanks,
Yang
> -       put_device(&cdev->device);
> -       cdev = NULL;
>   out_ida_remove:
>          ida_free(&thermal_cdev_ida, id);
>   out_kfree_cdev:
>
Yang Yingliang Jan. 16, 2023, 12:52 p.m. UTC | #4
On 2023/1/12 23:47, Caleb Connolly wrote:
> There is in invalid call to thermal_cooling_device_destroy_sysfs() and
> another to put_device() in the error paths here. Fix them.
>
> Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()")
> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
> ---
> Changes since v2:
>   * Rework and simplify into one patch following Yang's suggestions.
>
> V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/
> ---
>   drivers/thermal/thermal_core.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index f17ab2316dbd..321d2a6300c4 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -909,15 +909,16 @@ __thermal_cooling_device_register(struct device_node *np,
>   	cdev->devdata = devdata;
>   
>   	ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
> -	if (ret)
> -		goto out_kfree_type;
> +	if (ret) {
> +		kfree(cdev->type);
> +		goto out_ida_remove;
> +	}
>   
>   	thermal_cooling_device_setup_sysfs(cdev);
>   	ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
> -	if (ret) {
> -		thermal_cooling_device_destroy_sysfs(cdev);
> +	if (ret)
>   		goto out_kfree_type;
> -	}
> +
Before device_initialize() which is called in device_register(), we can 
not call put_device().
So we can not use 'out_kfree_type' lable before calling device_register().

How about like fix it this:
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index f17ab2316dbd..b57a6db0efdf 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -909,14 +909,17 @@ __thermal_cooling_device_register(struct 
device_node *np,
         cdev->devdata = devdata;

         ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
-       if (ret)
-               goto out_kfree_type;
+       if (ret) {
+               kfree(cdev->type);
+               goto out_ida_remove;
+       }

         thermal_cooling_device_setup_sysfs(cdev);
         ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
         if (ret) {
                 thermal_cooling_device_destroy_sysfs(cdev);
-               goto out_kfree_type;
+               kfree(cdev->type);
+               goto out_ida_remove;
         }
         ret = device_register(&cdev->device);
         if (ret)

Thanks,
Yang
>   	ret = device_register(&cdev->device);
>   	if (ret)
>   		goto out_kfree_type;
Viresh Kumar Jan. 17, 2023, 4:36 a.m. UTC | #5
On 16-01-23, 20:38, Yang Yingliang wrote:
> The 'cdev' will be freed after calling put_device(), it can not be accessed
> anymore.

I surely missed the class's release callback :(

How about this then, it clears this in a better way, i.e. what entity
is responsible for what exactly. This can be divided in a few patches
for sure.

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index f17ab2316dbd..5555bf827a0f 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -774,6 +774,9 @@ static void thermal_release(struct device *dev)
        } else if (!strncmp(dev_name(dev), "cooling_device",
                            sizeof("cooling_device") - 1)) {
                cdev = to_cooling_device(dev);
+               thermal_cooling_device_destroy_sysfs(cdev);
+               kfree(cdev->type);
+               ida_free(&thermal_cdev_ida, cdev->id);
                kfree(cdev);
        }
 }
@@ -910,17 +913,18 @@ __thermal_cooling_device_register(struct device_node *np,

        ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
        if (ret)
-               goto out_kfree_type;
+               goto out_cdev_type;

        thermal_cooling_device_setup_sysfs(cdev);
        ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
+       if (ret)
+               goto out_cooling_dev;
+       ret = device_register(&cdev->device);
        if (ret) {
-               thermal_cooling_device_destroy_sysfs(cdev);
-               goto out_kfree_type;
+               /* thermal_release() handles rest of the cleanup */
+               put_device(&cdev->device);
+               return ret;
        }
-       ret = device_register(&cdev->device);
-       if (ret)
-               goto out_kfree_type;

        /* Add 'this' new cdev to the global cdev list */
        mutex_lock(&thermal_list_lock);
@@ -939,11 +943,10 @@ __thermal_cooling_device_register(struct device_node *np,

        return cdev;

-out_kfree_type:
+out_cooling_dev:
        thermal_cooling_device_destroy_sysfs(cdev);
+out_cdev_type:
        kfree(cdev->type);
-       put_device(&cdev->device);
-       cdev = NULL;
 out_ida_remove:
        ida_free(&thermal_cdev_ida, id);
 out_kfree_cdev:
@@ -1104,11 +1107,7 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)

        mutex_unlock(&thermal_list_lock);

-       ida_free(&thermal_cdev_ida, cdev->id);
-       device_del(&cdev->device);
-       thermal_cooling_device_destroy_sysfs(cdev);
-       kfree(cdev->type);
-       put_device(&cdev->device);
+       device_unregister(&cdev->device);
 }
 EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister);
Viresh Kumar Jan. 17, 2023, 4:40 a.m. UTC | #6
On 16-01-23, 12:22, Caleb Connolly wrote:
> I'm not 100% clear on the benefits/drawbacks to the different
> approaches, am I right in thinking the improvement here (compared to my
> patch) is in giving each error condition it's own path so there isn't
> any ambiguity?

There was still confusion about what should be called when, like
device_put() can't be called with just dev_set_name(), etc. Also I
made a mistake which Yang mentioned,

I have proposed a different way of fixing this all for once, lets see
how that goes.
Rafael J. Wysocki Jan. 17, 2023, 1:39 p.m. UTC | #7
On Tue, Jan 17, 2023 at 5:36 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 16-01-23, 20:38, Yang Yingliang wrote:
> > The 'cdev' will be freed after calling put_device(), it can not be accessed
> > anymore.
>
> I surely missed the class's release callback :(
>
> How about this then, it clears this in a better way, i.e. what entity
> is responsible for what exactly. This can be divided in a few patches
> for sure.

It looks good to me, but then I haven't caught the release bug too.

> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index f17ab2316dbd..5555bf827a0f 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -774,6 +774,9 @@ static void thermal_release(struct device *dev)
>         } else if (!strncmp(dev_name(dev), "cooling_device",
>                             sizeof("cooling_device") - 1)) {
>                 cdev = to_cooling_device(dev);
> +               thermal_cooling_device_destroy_sysfs(cdev);
> +               kfree(cdev->type);
> +               ida_free(&thermal_cdev_ida, cdev->id);
>                 kfree(cdev);
>         }
>  }
> @@ -910,17 +913,18 @@ __thermal_cooling_device_register(struct device_node *np,
>
>         ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
>         if (ret)
> -               goto out_kfree_type;
> +               goto out_cdev_type;
>
>         thermal_cooling_device_setup_sysfs(cdev);
>         ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
> +       if (ret)
> +               goto out_cooling_dev;
> +       ret = device_register(&cdev->device);
>         if (ret) {
> -               thermal_cooling_device_destroy_sysfs(cdev);
> -               goto out_kfree_type;
> +               /* thermal_release() handles rest of the cleanup */
> +               put_device(&cdev->device);
> +               return ret;
>         }
> -       ret = device_register(&cdev->device);
> -       if (ret)
> -               goto out_kfree_type;
>
>         /* Add 'this' new cdev to the global cdev list */
>         mutex_lock(&thermal_list_lock);
> @@ -939,11 +943,10 @@ __thermal_cooling_device_register(struct device_node *np,
>
>         return cdev;
>
> -out_kfree_type:
> +out_cooling_dev:
>         thermal_cooling_device_destroy_sysfs(cdev);
> +out_cdev_type:
>         kfree(cdev->type);
> -       put_device(&cdev->device);
> -       cdev = NULL;
>  out_ida_remove:
>         ida_free(&thermal_cdev_ida, id);
>  out_kfree_cdev:
> @@ -1104,11 +1107,7 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
>
>         mutex_unlock(&thermal_list_lock);
>
> -       ida_free(&thermal_cdev_ida, cdev->id);
> -       device_del(&cdev->device);
> -       thermal_cooling_device_destroy_sysfs(cdev);
> -       kfree(cdev->type);
> -       put_device(&cdev->device);
> +       device_unregister(&cdev->device);
>  }
>  EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister);
>
> --
> viresh
Caleb Connolly Jan. 17, 2023, 6:11 p.m. UTC | #8
On 17/01/2023 13:39, Rafael J. Wysocki wrote:
> On Tue, Jan 17, 2023 at 5:36 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>>
>> On 16-01-23, 20:38, Yang Yingliang wrote:
>>> The 'cdev' will be freed after calling put_device(), it can not be accessed
>>> anymore.
>>
>> I surely missed the class's release callback :(
>>
>> How about this then, it clears this in a better way, i.e. what entity
>> is responsible for what exactly. This can be divided in a few patches
>> for sure.
> 
> It looks good to me, but then I haven't caught the release bug too.

it's a fun one:

https://lore.kernel.org/linux-pm/20230101174342.58351-1-caleb.connolly@linaro.org/

I don't see any issues with this suggested patch, however I don't think
I could comfortably attach my SoB to it given the larger code reordering
and my complete lack of experience with this subsystem.

Would a Tested-by be acceptable?
> 
>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
>> index f17ab2316dbd..5555bf827a0f 100644
>> --- a/drivers/thermal/thermal_core.c
>> +++ b/drivers/thermal/thermal_core.c
>> @@ -774,6 +774,9 @@ static void thermal_release(struct device *dev)
>>         } else if (!strncmp(dev_name(dev), "cooling_device",
>>                             sizeof("cooling_device") - 1)) {
>>                 cdev = to_cooling_device(dev);
>> +               thermal_cooling_device_destroy_sysfs(cdev);
>> +               kfree(cdev->type);
>> +               ida_free(&thermal_cdev_ida, cdev->id);
>>                 kfree(cdev);
>>         }
>>  }
>> @@ -910,17 +913,18 @@ __thermal_cooling_device_register(struct device_node *np,
>>
>>         ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
>>         if (ret)
>> -               goto out_kfree_type;
>> +               goto out_cdev_type;
>>
>>         thermal_cooling_device_setup_sysfs(cdev);
>>         ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
>> +       if (ret)
>> +               goto out_cooling_dev;
>> +       ret = device_register(&cdev->device);
>>         if (ret) {
>> -               thermal_cooling_device_destroy_sysfs(cdev);
>> -               goto out_kfree_type;
>> +               /* thermal_release() handles rest of the cleanup */
>> +               put_device(&cdev->device);
>> +               return ret;
>>         }
>> -       ret = device_register(&cdev->device);
>> -       if (ret)
>> -               goto out_kfree_type;
>>
>>         /* Add 'this' new cdev to the global cdev list */
>>         mutex_lock(&thermal_list_lock);
>> @@ -939,11 +943,10 @@ __thermal_cooling_device_register(struct device_node *np,
>>
>>         return cdev;
>>
>> -out_kfree_type:
>> +out_cooling_dev:
>>         thermal_cooling_device_destroy_sysfs(cdev);
>> +out_cdev_type:
>>         kfree(cdev->type);
>> -       put_device(&cdev->device);
>> -       cdev = NULL;
>>  out_ida_remove:
>>         ida_free(&thermal_cdev_ida, id);
>>  out_kfree_cdev:
>> @@ -1104,11 +1107,7 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
>>
>>         mutex_unlock(&thermal_list_lock);
>>
>> -       ida_free(&thermal_cdev_ida, cdev->id);
>> -       device_del(&cdev->device);
>> -       thermal_cooling_device_destroy_sysfs(cdev);
>> -       kfree(cdev->type);
>> -       put_device(&cdev->device);
>> +       device_unregister(&cdev->device);
>>  }
>>  EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister);
>>
>> --
>> viresh
Viresh Kumar Jan. 18, 2023, 4:18 a.m. UTC | #9
On 17-01-23, 18:11, Caleb Connolly wrote:
> it's a fun one:
> 
> https://lore.kernel.org/linux-pm/20230101174342.58351-1-caleb.connolly@linaro.org/
> 
> I don't see any issues with this suggested patch, however I don't think
> I could comfortably attach my SoB to it given the larger code reordering
> and my complete lack of experience with this subsystem.
> 
> Would a Tested-by be acceptable?

Sure, lemme send the patch(es) formally then.
diff mbox series

Patch

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index f17ab2316dbd..321d2a6300c4 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -909,15 +909,16 @@  __thermal_cooling_device_register(struct device_node *np,
 	cdev->devdata = devdata;
 
 	ret = cdev->ops->get_max_state(cdev, &cdev->max_state);
-	if (ret)
-		goto out_kfree_type;
+	if (ret) {
+		kfree(cdev->type);
+		goto out_ida_remove;
+	}
 
 	thermal_cooling_device_setup_sysfs(cdev);
 	ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
-	if (ret) {
-		thermal_cooling_device_destroy_sysfs(cdev);
+	if (ret)
 		goto out_kfree_type;
-	}
+
 	ret = device_register(&cdev->device);
 	if (ret)
 		goto out_kfree_type;