Message ID | 13503555.uLZWGnKmhe@kreacher |
---|---|
State | New |
Headers | show |
Series | thermal/debugfs: Fix a memory leak on removal and locking | expand |
On 4/25/24 14:57, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Since thermal_debug_cdev_remove() does not run under cdev->lock, it can > run in parallel with thermal_debug_cdev_state_update() and it may free > the struct thermal_debugfs object used by the latter after it has been > checked against NULL. > > If that happens, thermal_debug_cdev_state_update() will access memory > that has been freed already causing the kernel to crash. > > Address this by using cdev->lock in thermal_debug_cdev_remove() around > the cdev->debugfs value check (in case the same cdev is removed at the > same time in two differet threads) and its reset to NULL. s/differet/different/ > > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > Cc :6.8+ <stable@vger.kernel.org> # 6.8+ > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > drivers/thermal/thermal_debugfs.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > Index: linux-pm/drivers/thermal/thermal_debugfs.c > =================================================================== > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c > +++ linux-pm/drivers/thermal/thermal_debugfs.c > @@ -503,15 +503,21 @@ void thermal_debug_cdev_add(struct therm > */ > void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev) > { > - struct thermal_debugfs *thermal_dbg = cdev->debugfs; > + struct thermal_debugfs *thermal_dbg; > > + mutex_lock(&cdev->lock); > + > + thermal_dbg = cdev->debugfs; > if (!thermal_dbg) mutex_unlock(&cdev->lock) missing here > return; > > + cdev->debugfs = NULL; > + > + mutex_unlock(&cdev->lock); > + > mutex_lock(&thermal_dbg->lock); > > thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg); > - cdev->debugfs = NULL; > > mutex_unlock(&thermal_dbg->lock); > > > > >
On Fri, Apr 26, 2024 at 12:05 AM Lukasz Luba <lukasz.luba@arm.com> wrote: > > > > On 4/25/24 14:57, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > Since thermal_debug_cdev_remove() does not run under cdev->lock, it can > > run in parallel with thermal_debug_cdev_state_update() and it may free > > the struct thermal_debugfs object used by the latter after it has been > > checked against NULL. > > > > If that happens, thermal_debug_cdev_state_update() will access memory > > that has been freed already causing the kernel to crash. > > > > Address this by using cdev->lock in thermal_debug_cdev_remove() around > > the cdev->debugfs value check (in case the same cdev is removed at the > > same time in two differet threads) and its reset to NULL. > > s/differet/different/ > > > > > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > > Cc :6.8+ <stable@vger.kernel.org> # 6.8+ > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > --- > > drivers/thermal/thermal_debugfs.c | 10 ++++++++-- > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > Index: linux-pm/drivers/thermal/thermal_debugfs.c > > =================================================================== > > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c > > +++ linux-pm/drivers/thermal/thermal_debugfs.c > > @@ -503,15 +503,21 @@ void thermal_debug_cdev_add(struct therm > > */ > > void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev) > > { > > - struct thermal_debugfs *thermal_dbg = cdev->debugfs; > > + struct thermal_debugfs *thermal_dbg; > > > > + mutex_lock(&cdev->lock); > > + > > + thermal_dbg = cdev->debugfs; > > if (!thermal_dbg) > > mutex_unlock(&cdev->lock) missing here Good catch, thanks! Ho-hum, I'm not sure why I haven't added it here ... I'll send a v2 of this patch shortly. > > return; > > > > + cdev->debugfs = NULL; > > + > > + mutex_unlock(&cdev->lock); > > + > > mutex_lock(&thermal_dbg->lock); > > > > thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg); > > - cdev->debugfs = NULL; > > > > mutex_unlock(&thermal_dbg->lock); > > > > > > > > > > >
On 4/26/24 10:28, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Since thermal_debug_cdev_remove() does not run under cdev->lock, it can > run in parallel with thermal_debug_cdev_state_update() and it may free > the struct thermal_debugfs object used by the latter after it has been > checked against NULL. > > If that happens, thermal_debug_cdev_state_update() will access memory > that has been freed already causing the kernel to crash. > > Address this by using cdev->lock in thermal_debug_cdev_remove() around > the cdev->debugfs value check (in case the same cdev is removed at the > same time in two different threads) and its reset to NULL. > > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > Cc :6.8+ <stable@vger.kernel.org> # 6.8+ > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > --- > > v1 -> v2: Add missing mutex_unlock() (Lukasz). > > --- > drivers/thermal/thermal_debugfs.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > Index: linux-pm/drivers/thermal/thermal_debugfs.c > =================================================================== > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c > +++ linux-pm/drivers/thermal/thermal_debugfs.c > @@ -505,15 +505,23 @@ void thermal_debug_cdev_add(struct therm > */ > void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev) > { > - struct thermal_debugfs *thermal_dbg = cdev->debugfs; > + struct thermal_debugfs *thermal_dbg; > > - if (!thermal_dbg) > + mutex_lock(&cdev->lock); > + > + thermal_dbg = cdev->debugfs; > + if (!thermal_dbg) { > + mutex_unlock(&cdev->lock); > return; > + } > + > + cdev->debugfs = NULL; > + > + mutex_unlock(&cdev->lock); > > mutex_lock(&thermal_dbg->lock); > > thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg); > - cdev->debugfs = NULL; > > mutex_unlock(&thermal_dbg->lock); > > > > > It looks good now Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
On Fri, Apr 26, 2024 at 11:35 AM Lukasz Luba <lukasz.luba@arm.com> wrote: > > > > On 4/26/24 10:28, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > Since thermal_debug_cdev_remove() does not run under cdev->lock, it can > > run in parallel with thermal_debug_cdev_state_update() and it may free > > the struct thermal_debugfs object used by the latter after it has been > > checked against NULL. > > > > If that happens, thermal_debug_cdev_state_update() will access memory > > that has been freed already causing the kernel to crash. > > > > Address this by using cdev->lock in thermal_debug_cdev_remove() around > > the cdev->debugfs value check (in case the same cdev is removed at the > > same time in two different threads) and its reset to NULL. > > > > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information") > > Cc :6.8+ <stable@vger.kernel.org> # 6.8+ > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > --- > > > > v1 -> v2: Add missing mutex_unlock() (Lukasz). > > > > --- > > drivers/thermal/thermal_debugfs.c | 14 +++++++++++--- > > 1 file changed, 11 insertions(+), 3 deletions(-) > > > > Index: linux-pm/drivers/thermal/thermal_debugfs.c > > =================================================================== > > --- linux-pm.orig/drivers/thermal/thermal_debugfs.c > > +++ linux-pm/drivers/thermal/thermal_debugfs.c > > @@ -505,15 +505,23 @@ void thermal_debug_cdev_add(struct therm > > */ > > void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev) > > { > > - struct thermal_debugfs *thermal_dbg = cdev->debugfs; > > + struct thermal_debugfs *thermal_dbg; > > > > - if (!thermal_dbg) > > + mutex_lock(&cdev->lock); > > + > > + thermal_dbg = cdev->debugfs; > > + if (!thermal_dbg) { > > + mutex_unlock(&cdev->lock); > > return; > > + } > > + > > + cdev->debugfs = NULL; > > + > > + mutex_unlock(&cdev->lock); > > > > mutex_lock(&thermal_dbg->lock); > > > > thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg); > > - cdev->debugfs = NULL; > > > > mutex_unlock(&thermal_dbg->lock); > > > > > > > > > > > > It looks good now > > Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Thanks!
Index: linux-pm/drivers/thermal/thermal_debugfs.c =================================================================== --- linux-pm.orig/drivers/thermal/thermal_debugfs.c +++ linux-pm/drivers/thermal/thermal_debugfs.c @@ -503,15 +503,21 @@ void thermal_debug_cdev_add(struct therm */ void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev) { - struct thermal_debugfs *thermal_dbg = cdev->debugfs; + struct thermal_debugfs *thermal_dbg; + mutex_lock(&cdev->lock); + + thermal_dbg = cdev->debugfs; if (!thermal_dbg) return; + cdev->debugfs = NULL; + + mutex_unlock(&cdev->lock); + mutex_lock(&thermal_dbg->lock); thermal_debugfs_cdev_clear(&thermal_dbg->cdev_dbg); - cdev->debugfs = NULL; mutex_unlock(&thermal_dbg->lock);