diff mbox series

[v1,3/3] thermal/debugfs: Prevent use-after-free from occurring after cdev removal

Message ID 13503555.uLZWGnKmhe@kreacher
State New
Headers show
Series thermal/debugfs: Fix a memory leak on removal and locking | expand

Commit Message

Rafael J. Wysocki April 25, 2024, 1:57 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
run in parallel with thermal_debug_cdev_state_update() and it may free
the struct thermal_debugfs object used by the latter after it has been
checked against NULL.

If that happens, thermal_debug_cdev_state_update() will access memory
that has been freed already causing the kernel to crash.

Address this by using cdev->lock in thermal_debug_cdev_remove() around
the cdev->debugfs value check (in case the same cdev is removed at the
same time in two differet threads) and its reset to NULL.

Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_debugfs.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Lukasz Luba April 25, 2024, 10:05 p.m. UTC | #1
On 4/25/24 14:57, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
> run in parallel with thermal_debug_cdev_state_update() and it may free
> the struct thermal_debugfs object used by the latter after it has been
> checked against NULL.
> 
> If that happens, thermal_debug_cdev_state_update() will access memory
> that has been freed already causing the kernel to crash.
> 
> Address this by using cdev->lock in thermal_debug_cdev_remove() around
> the cdev->debugfs value check (in case the same cdev is removed at the
> same time in two differet threads) and its reset to NULL.

s/differet/different/

> 
> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> Cc :6.8+ <stable@vger.kernel.org> # 6.8+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/thermal/thermal_debugfs.c |   10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/drivers/thermal/thermal_debugfs.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_debugfs.c
> +++ linux-pm/drivers/thermal/thermal_debugfs.c
> @@ -503,15 +503,21 @@ void thermal_debug_cdev_add(struct therm
>    */
>   void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
>   {
> -	struct thermal_debugfs *thermal_dbg = cdev->debugfs;
> +	struct thermal_debugfs *thermal_dbg;
>   
> +	mutex_lock(&cdev->lock);
> +
> +	thermal_dbg = cdev->debugfs;
>   	if (!thermal_dbg)

mutex_unlock(&cdev->lock) missing here

>   		return;
>   
> +	cdev->debugfs = NULL;
> +
> +	mutex_unlock(&cdev->lock);
> +
>   	mutex_lock(&thermal_dbg->lock);
>   
>   	thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
> -	cdev->debugfs = NULL;
>   
>   	mutex_unlock(&thermal_dbg->lock);
>   
> 
> 
> 
>
Rafael J. Wysocki April 26, 2024, 9:18 a.m. UTC | #2
On Fri, Apr 26, 2024 at 12:05 AM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
>
>
> On 4/25/24 14:57, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
> > run in parallel with thermal_debug_cdev_state_update() and it may free
> > the struct thermal_debugfs object used by the latter after it has been
> > checked against NULL.
> >
> > If that happens, thermal_debug_cdev_state_update() will access memory
> > that has been freed already causing the kernel to crash.
> >
> > Address this by using cdev->lock in thermal_debug_cdev_remove() around
> > the cdev->debugfs value check (in case the same cdev is removed at the
> > same time in two differet threads) and its reset to NULL.
>
> s/differet/different/
>
> >
> > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> > Cc :6.8+ <stable@vger.kernel.org> # 6.8+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >   drivers/thermal/thermal_debugfs.c |   10 ++++++++--
> >   1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > Index: linux-pm/drivers/thermal/thermal_debugfs.c
> > ===================================================================
> > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c
> > +++ linux-pm/drivers/thermal/thermal_debugfs.c
> > @@ -503,15 +503,21 @@ void thermal_debug_cdev_add(struct therm
> >    */
> >   void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
> >   {
> > -     struct thermal_debugfs *thermal_dbg = cdev->debugfs;
> > +     struct thermal_debugfs *thermal_dbg;
> >
> > +     mutex_lock(&cdev->lock);
> > +
> > +     thermal_dbg = cdev->debugfs;
> >       if (!thermal_dbg)
>
> mutex_unlock(&cdev->lock) missing here

Good catch, thanks!

Ho-hum, I'm not sure why I haven't added it here ...

I'll send a v2 of this patch shortly.

> >               return;
> >
> > +     cdev->debugfs = NULL;
> > +
> > +     mutex_unlock(&cdev->lock);
> > +
> >       mutex_lock(&thermal_dbg->lock);
> >
> >       thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
> > -     cdev->debugfs = NULL;
> >
> >       mutex_unlock(&thermal_dbg->lock);
> >
> >
> >
> >
> >
>
Lukasz Luba April 26, 2024, 9:35 a.m. UTC | #3
On 4/26/24 10:28, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
> run in parallel with thermal_debug_cdev_state_update() and it may free
> the struct thermal_debugfs object used by the latter after it has been
> checked against NULL.
> 
> If that happens, thermal_debug_cdev_state_update() will access memory
> that has been freed already causing the kernel to crash.
> 
> Address this by using cdev->lock in thermal_debug_cdev_remove() around
> the cdev->debugfs value check (in case the same cdev is removed at the
> same time in two different threads) and its reset to NULL.
> 
> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> Cc :6.8+ <stable@vger.kernel.org> # 6.8+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> v1 -> v2: Add missing mutex_unlock() (Lukasz).
> 
> ---
>   drivers/thermal/thermal_debugfs.c |   14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
> 
> Index: linux-pm/drivers/thermal/thermal_debugfs.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_debugfs.c
> +++ linux-pm/drivers/thermal/thermal_debugfs.c
> @@ -505,15 +505,23 @@ void thermal_debug_cdev_add(struct therm
>    */
>   void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
>   {
> -	struct thermal_debugfs *thermal_dbg = cdev->debugfs;
> +	struct thermal_debugfs *thermal_dbg;
>   
> -	if (!thermal_dbg)
> +	mutex_lock(&cdev->lock);
> +
> +	thermal_dbg = cdev->debugfs;
> +	if (!thermal_dbg) {
> +		mutex_unlock(&cdev->lock);
>   		return;
> +	}
> +
> +	cdev->debugfs = NULL;
> +
> +	mutex_unlock(&cdev->lock);
>   
>   	mutex_lock(&thermal_dbg->lock);
>   
>   	thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
> -	cdev->debugfs = NULL;
>   
>   	mutex_unlock(&thermal_dbg->lock);
>   
> 
> 
> 
> 

It looks good now

Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Rafael J. Wysocki April 26, 2024, 9:54 a.m. UTC | #4
On Fri, Apr 26, 2024 at 11:35 AM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
>
>
> On 4/26/24 10:28, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
> > run in parallel with thermal_debug_cdev_state_update() and it may free
> > the struct thermal_debugfs object used by the latter after it has been
> > checked against NULL.
> >
> > If that happens, thermal_debug_cdev_state_update() will access memory
> > that has been freed already causing the kernel to crash.
> >
> > Address this by using cdev->lock in thermal_debug_cdev_remove() around
> > the cdev->debugfs value check (in case the same cdev is removed at the
> > same time in two different threads) and its reset to NULL.
> >
> > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> > Cc :6.8+ <stable@vger.kernel.org> # 6.8+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > v1 -> v2: Add missing mutex_unlock() (Lukasz).
> >
> > ---
> >   drivers/thermal/thermal_debugfs.c |   14 +++++++++++---
> >   1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > Index: linux-pm/drivers/thermal/thermal_debugfs.c
> > ===================================================================
> > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c
> > +++ linux-pm/drivers/thermal/thermal_debugfs.c
> > @@ -505,15 +505,23 @@ void thermal_debug_cdev_add(struct therm
> >    */
> >   void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
> >   {
> > -     struct thermal_debugfs *thermal_dbg = cdev->debugfs;
> > +     struct thermal_debugfs *thermal_dbg;
> >
> > -     if (!thermal_dbg)
> > +     mutex_lock(&cdev->lock);
> > +
> > +     thermal_dbg = cdev->debugfs;
> > +     if (!thermal_dbg) {
> > +             mutex_unlock(&cdev->lock);
> >               return;
> > +     }
> > +
> > +     cdev->debugfs = NULL;
> > +
> > +     mutex_unlock(&cdev->lock);
> >
> >       mutex_lock(&thermal_dbg->lock);
> >
> >       thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
> > -     cdev->debugfs = NULL;
> >
> >       mutex_unlock(&thermal_dbg->lock);
> >
> >
> >
> >
> >
>
> It looks good now
>
> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>

Thanks!
diff mbox series

Patch

Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -503,15 +503,21 @@  void thermal_debug_cdev_add(struct therm
  */
 void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev)
 {
-	struct thermal_debugfs *thermal_dbg = cdev->debugfs;
+	struct thermal_debugfs *thermal_dbg;
 
+	mutex_lock(&cdev->lock);
+
+	thermal_dbg = cdev->debugfs;
 	if (!thermal_dbg)
 		return;
 
+	cdev->debugfs = NULL;
+
+	mutex_unlock(&cdev->lock);
+
 	mutex_lock(&thermal_dbg->lock);
 
 	thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg);
-	cdev->debugfs = NULL;
 
 	mutex_unlock(&thermal_dbg->lock);