Message ID | 20230112154721.452292-1-caleb.connolly@linaro.org |
---|---|
State | New |
Headers | show |
Series | [v3] thermal/core: fix error paths in __thermal_cooling_device_register() | expand |
On 12-01-23, 15:47, Caleb Connolly wrote: > There is in invalid call to thermal_cooling_device_destroy_sysfs() and > another to put_device() in the error paths here. Fix them. > > Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()") > Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> > --- > Changes since v2: > * Rework and simplify into one patch following Yang's suggestions. > > V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/ > --- > drivers/thermal/thermal_core.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) Replied to an earlier version by mistake. Should it be like this ? diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index f17ab2316dbd..18db011d4d68 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -910,17 +910,15 @@ __thermal_cooling_device_register(struct device_node *np, ret = cdev->ops->get_max_state(cdev, &cdev->max_state); if (ret) - goto out_kfree_type; + goto out_cdev_type; thermal_cooling_device_setup_sysfs(cdev); ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); - if (ret) { - thermal_cooling_device_destroy_sysfs(cdev); - goto out_kfree_type; - } + if (ret) + goto out_cooling_dev; ret = device_register(&cdev->device); if (ret) - goto out_kfree_type; + goto out_put_device; /* Add 'this' new cdev to the global cdev list */ mutex_lock(&thermal_list_lock); @@ -939,11 +937,12 @@ __thermal_cooling_device_register(struct device_node *np, return cdev; -out_kfree_type: +out_put_device: + put_device(&cdev->device); +out_cooling_dev: thermal_cooling_device_destroy_sysfs(cdev); +out_cdev_type: kfree(cdev->type); - put_device(&cdev->device); - cdev = NULL; out_ida_remove: ida_free(&thermal_cdev_ida, id); out_kfree_cdev:
On 16/01/2023 04:23, Viresh Kumar wrote: > On 12-01-23, 15:47, Caleb Connolly wrote: >> There is in invalid call to thermal_cooling_device_destroy_sysfs() and >> another to put_device() in the error paths here. Fix them. >> >> Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()") >> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> >> --- >> Changes since v2: >> * Rework and simplify into one patch following Yang's suggestions. >> >> V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/ >> --- >> drivers/thermal/thermal_core.c | 11 ++++++----- >> 1 file changed, 6 insertions(+), 5 deletions(-) > > Replied to an earlier version by mistake. Should it be like this ? apologies for all the subject rewording. I'm not 100% clear on the benefits/drawbacks to the different approaches, am I right in thinking the improvement here (compared to my patch) is in giving each error condition it's own path so there isn't any ambiguity? > > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c > index f17ab2316dbd..18db011d4d68 100644 > --- a/drivers/thermal/thermal_core.c > +++ b/drivers/thermal/thermal_core.c > @@ -910,17 +910,15 @@ __thermal_cooling_device_register(struct device_node *np, > > ret = cdev->ops->get_max_state(cdev, &cdev->max_state); > if (ret) > - goto out_kfree_type; > + goto out_cdev_type; > > thermal_cooling_device_setup_sysfs(cdev); > ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); > - if (ret) { > - thermal_cooling_device_destroy_sysfs(cdev); > - goto out_kfree_type; > - } > + if (ret) > + goto out_cooling_dev; > ret = device_register(&cdev->device); > if (ret) > - goto out_kfree_type; > + goto out_put_device; > > /* Add 'this' new cdev to the global cdev list */ > mutex_lock(&thermal_list_lock); > @@ -939,11 +937,12 @@ __thermal_cooling_device_register(struct device_node *np, > > return cdev; > > -out_kfree_type: > +out_put_device: > + put_device(&cdev->device); > +out_cooling_dev: > thermal_cooling_device_destroy_sysfs(cdev); > +out_cdev_type: As far as I can tell this is give or take the same as the two patches I had in v2 [1]? > kfree(cdev->type); > - put_device(&cdev->device); > - cdev = NULL; Is removing the explicit null assignment here intentional? > out_ida_remove: > ida_free(&thermal_cdev_ida, id); > out_kfree_cdev: > I'm happy to go ahead an submit a v4 based on this. [1]: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/
On 2023/1/16 12:23, Viresh Kumar wrote: > On 12-01-23, 15:47, Caleb Connolly wrote: >> There is in invalid call to thermal_cooling_device_destroy_sysfs() and >> another to put_device() in the error paths here. Fix them. >> >> Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()") >> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> >> --- >> Changes since v2: >> * Rework and simplify into one patch following Yang's suggestions. >> >> V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/ >> --- >> drivers/thermal/thermal_core.c | 11 ++++++----- >> 1 file changed, 6 insertions(+), 5 deletions(-) > Replied to an earlier version by mistake. Should it be like this ? > > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c > index f17ab2316dbd..18db011d4d68 100644 > --- a/drivers/thermal/thermal_core.c > +++ b/drivers/thermal/thermal_core.c > @@ -910,17 +910,15 @@ __thermal_cooling_device_register(struct device_node *np, > > ret = cdev->ops->get_max_state(cdev, &cdev->max_state); > if (ret) > - goto out_kfree_type; > + goto out_cdev_type; > > thermal_cooling_device_setup_sysfs(cdev); > ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); > - if (ret) { > - thermal_cooling_device_destroy_sysfs(cdev); > - goto out_kfree_type; > - } > + if (ret) > + goto out_cooling_dev; > ret = device_register(&cdev->device); > if (ret) > - goto out_kfree_type; > + goto out_put_device; > > /* Add 'this' new cdev to the global cdev list */ > mutex_lock(&thermal_list_lock); > @@ -939,11 +937,12 @@ __thermal_cooling_device_register(struct device_node *np, > > return cdev; > > -out_kfree_type: > +out_put_device: > + put_device(&cdev->device); > +out_cooling_dev: > thermal_cooling_device_destroy_sysfs(cdev); > +out_cdev_type: > kfree(cdev->type); The 'cdev' will be freed after calling put_device(), it can not be accessed anymore. Thanks, Yang > - put_device(&cdev->device); > - cdev = NULL; > out_ida_remove: > ida_free(&thermal_cdev_ida, id); > out_kfree_cdev: >
On 2023/1/12 23:47, Caleb Connolly wrote: > There is in invalid call to thermal_cooling_device_destroy_sysfs() and > another to put_device() in the error paths here. Fix them. > > Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()") > Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> > --- > Changes since v2: > * Rework and simplify into one patch following Yang's suggestions. > > V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/ > --- > drivers/thermal/thermal_core.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c > index f17ab2316dbd..321d2a6300c4 100644 > --- a/drivers/thermal/thermal_core.c > +++ b/drivers/thermal/thermal_core.c > @@ -909,15 +909,16 @@ __thermal_cooling_device_register(struct device_node *np, > cdev->devdata = devdata; > > ret = cdev->ops->get_max_state(cdev, &cdev->max_state); > - if (ret) > - goto out_kfree_type; > + if (ret) { > + kfree(cdev->type); > + goto out_ida_remove; > + } > > thermal_cooling_device_setup_sysfs(cdev); > ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); > - if (ret) { > - thermal_cooling_device_destroy_sysfs(cdev); > + if (ret) > goto out_kfree_type; > - } > + Before device_initialize() which is called in device_register(), we can not call put_device(). So we can not use 'out_kfree_type' lable before calling device_register(). How about like fix it this: diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index f17ab2316dbd..b57a6db0efdf 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -909,14 +909,17 @@ __thermal_cooling_device_register(struct device_node *np, cdev->devdata = devdata; ret = cdev->ops->get_max_state(cdev, &cdev->max_state); - if (ret) - goto out_kfree_type; + if (ret) { + kfree(cdev->type); + goto out_ida_remove; + } thermal_cooling_device_setup_sysfs(cdev); ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); if (ret) { thermal_cooling_device_destroy_sysfs(cdev); - goto out_kfree_type; + kfree(cdev->type); + goto out_ida_remove; } ret = device_register(&cdev->device); if (ret) Thanks, Yang > ret = device_register(&cdev->device); > if (ret) > goto out_kfree_type;
On 16-01-23, 20:38, Yang Yingliang wrote: > The 'cdev' will be freed after calling put_device(), it can not be accessed > anymore. I surely missed the class's release callback :( How about this then, it clears this in a better way, i.e. what entity is responsible for what exactly. This can be divided in a few patches for sure. diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index f17ab2316dbd..5555bf827a0f 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -774,6 +774,9 @@ static void thermal_release(struct device *dev) } else if (!strncmp(dev_name(dev), "cooling_device", sizeof("cooling_device") - 1)) { cdev = to_cooling_device(dev); + thermal_cooling_device_destroy_sysfs(cdev); + kfree(cdev->type); + ida_free(&thermal_cdev_ida, cdev->id); kfree(cdev); } } @@ -910,17 +913,18 @@ __thermal_cooling_device_register(struct device_node *np, ret = cdev->ops->get_max_state(cdev, &cdev->max_state); if (ret) - goto out_kfree_type; + goto out_cdev_type; thermal_cooling_device_setup_sysfs(cdev); ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); + if (ret) + goto out_cooling_dev; + ret = device_register(&cdev->device); if (ret) { - thermal_cooling_device_destroy_sysfs(cdev); - goto out_kfree_type; + /* thermal_release() handles rest of the cleanup */ + put_device(&cdev->device); + return ret; } - ret = device_register(&cdev->device); - if (ret) - goto out_kfree_type; /* Add 'this' new cdev to the global cdev list */ mutex_lock(&thermal_list_lock); @@ -939,11 +943,10 @@ __thermal_cooling_device_register(struct device_node *np, return cdev; -out_kfree_type: +out_cooling_dev: thermal_cooling_device_destroy_sysfs(cdev); +out_cdev_type: kfree(cdev->type); - put_device(&cdev->device); - cdev = NULL; out_ida_remove: ida_free(&thermal_cdev_ida, id); out_kfree_cdev: @@ -1104,11 +1107,7 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev) mutex_unlock(&thermal_list_lock); - ida_free(&thermal_cdev_ida, cdev->id); - device_del(&cdev->device); - thermal_cooling_device_destroy_sysfs(cdev); - kfree(cdev->type); - put_device(&cdev->device); + device_unregister(&cdev->device); } EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister);
On 16-01-23, 12:22, Caleb Connolly wrote: > I'm not 100% clear on the benefits/drawbacks to the different > approaches, am I right in thinking the improvement here (compared to my > patch) is in giving each error condition it's own path so there isn't > any ambiguity? There was still confusion about what should be called when, like device_put() can't be called with just dev_set_name(), etc. Also I made a mistake which Yang mentioned, I have proposed a different way of fixing this all for once, lets see how that goes.
On Tue, Jan 17, 2023 at 5:36 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 16-01-23, 20:38, Yang Yingliang wrote: > > The 'cdev' will be freed after calling put_device(), it can not be accessed > > anymore. > > I surely missed the class's release callback :( > > How about this then, it clears this in a better way, i.e. what entity > is responsible for what exactly. This can be divided in a few patches > for sure. It looks good to me, but then I haven't caught the release bug too. > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c > index f17ab2316dbd..5555bf827a0f 100644 > --- a/drivers/thermal/thermal_core.c > +++ b/drivers/thermal/thermal_core.c > @@ -774,6 +774,9 @@ static void thermal_release(struct device *dev) > } else if (!strncmp(dev_name(dev), "cooling_device", > sizeof("cooling_device") - 1)) { > cdev = to_cooling_device(dev); > + thermal_cooling_device_destroy_sysfs(cdev); > + kfree(cdev->type); > + ida_free(&thermal_cdev_ida, cdev->id); > kfree(cdev); > } > } > @@ -910,17 +913,18 @@ __thermal_cooling_device_register(struct device_node *np, > > ret = cdev->ops->get_max_state(cdev, &cdev->max_state); > if (ret) > - goto out_kfree_type; > + goto out_cdev_type; > > thermal_cooling_device_setup_sysfs(cdev); > ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); > + if (ret) > + goto out_cooling_dev; > + ret = device_register(&cdev->device); > if (ret) { > - thermal_cooling_device_destroy_sysfs(cdev); > - goto out_kfree_type; > + /* thermal_release() handles rest of the cleanup */ > + put_device(&cdev->device); > + return ret; > } > - ret = device_register(&cdev->device); > - if (ret) > - goto out_kfree_type; > > /* Add 'this' new cdev to the global cdev list */ > mutex_lock(&thermal_list_lock); > @@ -939,11 +943,10 @@ __thermal_cooling_device_register(struct device_node *np, > > return cdev; > > -out_kfree_type: > +out_cooling_dev: > thermal_cooling_device_destroy_sysfs(cdev); > +out_cdev_type: > kfree(cdev->type); > - put_device(&cdev->device); > - cdev = NULL; > out_ida_remove: > ida_free(&thermal_cdev_ida, id); > out_kfree_cdev: > @@ -1104,11 +1107,7 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev) > > mutex_unlock(&thermal_list_lock); > > - ida_free(&thermal_cdev_ida, cdev->id); > - device_del(&cdev->device); > - thermal_cooling_device_destroy_sysfs(cdev); > - kfree(cdev->type); > - put_device(&cdev->device); > + device_unregister(&cdev->device); > } > EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister); > > -- > viresh
On 17/01/2023 13:39, Rafael J. Wysocki wrote: > On Tue, Jan 17, 2023 at 5:36 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: >> >> On 16-01-23, 20:38, Yang Yingliang wrote: >>> The 'cdev' will be freed after calling put_device(), it can not be accessed >>> anymore. >> >> I surely missed the class's release callback :( >> >> How about this then, it clears this in a better way, i.e. what entity >> is responsible for what exactly. This can be divided in a few patches >> for sure. > > It looks good to me, but then I haven't caught the release bug too. it's a fun one: https://lore.kernel.org/linux-pm/20230101174342.58351-1-caleb.connolly@linaro.org/ I don't see any issues with this suggested patch, however I don't think I could comfortably attach my SoB to it given the larger code reordering and my complete lack of experience with this subsystem. Would a Tested-by be acceptable? > >> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c >> index f17ab2316dbd..5555bf827a0f 100644 >> --- a/drivers/thermal/thermal_core.c >> +++ b/drivers/thermal/thermal_core.c >> @@ -774,6 +774,9 @@ static void thermal_release(struct device *dev) >> } else if (!strncmp(dev_name(dev), "cooling_device", >> sizeof("cooling_device") - 1)) { >> cdev = to_cooling_device(dev); >> + thermal_cooling_device_destroy_sysfs(cdev); >> + kfree(cdev->type); >> + ida_free(&thermal_cdev_ida, cdev->id); >> kfree(cdev); >> } >> } >> @@ -910,17 +913,18 @@ __thermal_cooling_device_register(struct device_node *np, >> >> ret = cdev->ops->get_max_state(cdev, &cdev->max_state); >> if (ret) >> - goto out_kfree_type; >> + goto out_cdev_type; >> >> thermal_cooling_device_setup_sysfs(cdev); >> ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); >> + if (ret) >> + goto out_cooling_dev; >> + ret = device_register(&cdev->device); >> if (ret) { >> - thermal_cooling_device_destroy_sysfs(cdev); >> - goto out_kfree_type; >> + /* thermal_release() handles rest of the cleanup */ >> + put_device(&cdev->device); >> + return ret; >> } >> - ret = device_register(&cdev->device); >> - if (ret) >> - goto out_kfree_type; >> >> /* Add 'this' new cdev to the global cdev list */ >> mutex_lock(&thermal_list_lock); >> @@ -939,11 +943,10 @@ __thermal_cooling_device_register(struct device_node *np, >> >> return cdev; >> >> -out_kfree_type: >> +out_cooling_dev: >> thermal_cooling_device_destroy_sysfs(cdev); >> +out_cdev_type: >> kfree(cdev->type); >> - put_device(&cdev->device); >> - cdev = NULL; >> out_ida_remove: >> ida_free(&thermal_cdev_ida, id); >> out_kfree_cdev: >> @@ -1104,11 +1107,7 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev) >> >> mutex_unlock(&thermal_list_lock); >> >> - ida_free(&thermal_cdev_ida, cdev->id); >> - device_del(&cdev->device); >> - thermal_cooling_device_destroy_sysfs(cdev); >> - kfree(cdev->type); >> - put_device(&cdev->device); >> + device_unregister(&cdev->device); >> } >> EXPORT_SYMBOL_GPL(thermal_cooling_device_unregister); >> >> -- >> viresh
On 17-01-23, 18:11, Caleb Connolly wrote: > it's a fun one: > > https://lore.kernel.org/linux-pm/20230101174342.58351-1-caleb.connolly@linaro.org/ > > I don't see any issues with this suggested patch, however I don't think > I could comfortably attach my SoB to it given the larger code reordering > and my complete lack of experience with this subsystem. > > Would a Tested-by be acceptable? Sure, lemme send the patch(es) formally then.
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index f17ab2316dbd..321d2a6300c4 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -909,15 +909,16 @@ __thermal_cooling_device_register(struct device_node *np, cdev->devdata = devdata; ret = cdev->ops->get_max_state(cdev, &cdev->max_state); - if (ret) - goto out_kfree_type; + if (ret) { + kfree(cdev->type); + goto out_ida_remove; + } thermal_cooling_device_setup_sysfs(cdev); ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id); - if (ret) { - thermal_cooling_device_destroy_sysfs(cdev); + if (ret) goto out_kfree_type; - } + ret = device_register(&cdev->device); if (ret) goto out_kfree_type;
There is in invalid call to thermal_cooling_device_destroy_sysfs() and another to put_device() in the error paths here. Fix them. Fixes: c408b3d1d9bb ("thermal: Validate new state in cur_state_store()") Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org> --- Changes since v2: * Rework and simplify into one patch following Yang's suggestions. V2: https://lore.kernel.org/all/20230103171811.204196-1-caleb.connolly@linaro.org/ --- drivers/thermal/thermal_core.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)