diff mbox series

thermal: sysfs: Perform bounds check when storing thermal states

Message ID 20220705150002.2016207-1-varadgautam@google.com
State New
Headers show
Series thermal: sysfs: Perform bounds check when storing thermal states | expand

Commit Message

Varad Gautam July 5, 2022, 3 p.m. UTC
Check that a user-provided thermal state is within the maximum
thermal states supported by a given driver before attempting to
apply it. This prevents a subsequent OOB access in
thermal_cooling_device_stats_update() while performing
state-transition accounting on drivers that do not have this check
in their set_cur_state() handle.

Signed-off-by: Varad Gautam <varadgautam@google.com>
Cc: stable@vger.kernel.org
---
 drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Greg KH July 5, 2022, 4:18 p.m. UTC | #1
On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> Check that a user-provided thermal state is within the maximum
> thermal states supported by a given driver before attempting to
> apply it. This prevents a subsequent OOB access in
> thermal_cooling_device_stats_update() while performing
> state-transition accounting on drivers that do not have this check
> in their set_cur_state() handle.
> 
> Signed-off-by: Varad Gautam <varadgautam@google.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> index 1c4aac8464a7..0c6b0223b133 100644
> --- a/drivers/thermal/thermal_sysfs.c
> +++ b/drivers/thermal/thermal_sysfs.c
> @@ -607,7 +607,7 @@ cur_state_store(struct device *dev, struct device_attribute *attr,
>  		const char *buf, size_t count)
>  {
>  	struct thermal_cooling_device *cdev = to_cooling_device(dev);
> -	unsigned long state;
> +	unsigned long state, max_state;
>  	int result;
>  
>  	if (sscanf(buf, "%ld\n", &state) != 1)
> @@ -618,10 +618,20 @@ cur_state_store(struct device *dev, struct device_attribute *attr,
>  
>  	mutex_lock(&cdev->lock);
>  
> +	result = cdev->ops->get_max_state(cdev, &max_state);
> +	if (result)
> +		goto unlock;
> +
> +	if (state > max_state) {
> +		result = -EINVAL;
> +		goto unlock;
> +	}
> +
>  	result = cdev->ops->set_cur_state(cdev, state);

Why doesn't set_cur_state() check the max state before setting it?  Why
are the callers forced to always check it before?  That feels wrong...

thanks,

greg k-h
Zhang, Rui July 6, 2022, 8:51 a.m. UTC | #2
On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
> On Wed, Jul 6, 2022 at 8:45 AM Greg KH <gregkh@linuxfoundation.org>
> wrote:
> > 
> > On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> > > On Tue, Jul 5, 2022 at 6:18 PM Greg KH <
> > > gregkh@linuxfoundation.org> wrote:
> > > > 
> > > > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > > > Check that a user-provided thermal state is within the
> > > > > maximum
> > > > > thermal states supported by a given driver before attempting
> > > > > to
> > > > > apply it. This prevents a subsequent OOB access in
> > > > > thermal_cooling_device_stats_update() while performing
> > > > > state-transition accounting on drivers that do not have this
> > > > > check
> > > > > in their set_cur_state() handle.
> > > > > 
> > > > > Signed-off-by: Varad Gautam <varadgautam@google.com>
> > > > > Cc: stable@vger.kernel.org
> > > > > ---
> > > > >  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/drivers/thermal/thermal_sysfs.c
> > > > > b/drivers/thermal/thermal_sysfs.c
> > > > > index 1c4aac8464a7..0c6b0223b133 100644
> > > > > --- a/drivers/thermal/thermal_sysfs.c
> > > > > +++ b/drivers/thermal/thermal_sysfs.c
> > > > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev,
> > > > > struct device_attribute *attr,
> > > > >               const char *buf, size_t count)
> > > > >  {
> > > > >       struct thermal_cooling_device *cdev =
> > > > > to_cooling_device(dev);
> > > > > -     unsigned long state;
> > > > > +     unsigned long state, max_state;
> > > > >       int result;
> > > > > 
> > > > >       if (sscanf(buf, "%ld\n", &state) != 1)
> > > > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev,
> > > > > struct device_attribute *attr,
> > > > > 
> > > > >       mutex_lock(&cdev->lock);
> > > > > 
> > > > > +     result = cdev->ops->get_max_state(cdev, &max_state);
> > > > > +     if (result)
> > > > > +             goto unlock;
> > > > > +
> > > > > +     if (state > max_state) {
> > > > > +             result = -EINVAL;
> > > > > +             goto unlock;
> > > > > +     }
> > > > > +
> > > > >       result = cdev->ops->set_cur_state(cdev, state);
> > > > 
> > > > Why doesn't set_cur_state() check the max state before setting
> > > > it?  Why
> > > > are the callers forced to always check it before?  That feels
> > > > wrong...
> > > > 
> > > 
> > > The problem lies in thermal_cooling_device_stats_update(), not
> > > set_cur_state().
> > > 
> > > If ->set_cur_state() doesn't error out on invalid state,
> > > thermal_cooling_device_stats_update() does a:
> > > 
> > > stats->trans_table[stats->state * stats->max_states +
> > > new_state]++;
> > > 
> > > stats->trans_table reserves space depending on max_states, but
> > > we'd end up
> > > reading/writing outside it. cur_state_store() can prevent this
> > > regardless of
> > > the driver's ->set_cur_state() implementation.
> > 
> > Why wouldn't cur_state_store() check for an out-of-bounds condition
> > by
> > calling get_max_state() and then return an error if it is invalid,
> > preventing thermal_cooling_device_stats_update() from ever being
> > called?
> > 
> 
> That's what this patch does, it adds the out-of-bounds check.

No, I think Greg' question is
why cdev->ops->set_cur_state() return 0 when setting a cooling state
that exceeds the maximum cooling state?

thanks,
rui
> 
> > thanks,
> > 
> > greg k-h
Greg KH July 6, 2022, 9:21 a.m. UTC | #3
On Wed, Jul 06, 2022 at 04:51:59PM +0800, Zhang Rui wrote:
> On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
> > On Wed, Jul 6, 2022 at 8:45 AM Greg KH <gregkh@linuxfoundation.org>
> > wrote:
> > > 
> > > On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> > > > On Tue, Jul 5, 2022 at 6:18 PM Greg KH <
> > > > gregkh@linuxfoundation.org> wrote:
> > > > > 
> > > > > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > > > > Check that a user-provided thermal state is within the
> > > > > > maximum
> > > > > > thermal states supported by a given driver before attempting
> > > > > > to
> > > > > > apply it. This prevents a subsequent OOB access in
> > > > > > thermal_cooling_device_stats_update() while performing
> > > > > > state-transition accounting on drivers that do not have this
> > > > > > check
> > > > > > in their set_cur_state() handle.
> > > > > > 
> > > > > > Signed-off-by: Varad Gautam <varadgautam@google.com>
> > > > > > Cc: stable@vger.kernel.org
> > > > > > ---
> > > > > >  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/drivers/thermal/thermal_sysfs.c
> > > > > > b/drivers/thermal/thermal_sysfs.c
> > > > > > index 1c4aac8464a7..0c6b0223b133 100644
> > > > > > --- a/drivers/thermal/thermal_sysfs.c
> > > > > > +++ b/drivers/thermal/thermal_sysfs.c
> > > > > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev,
> > > > > > struct device_attribute *attr,
> > > > > >               const char *buf, size_t count)
> > > > > >  {
> > > > > >       struct thermal_cooling_device *cdev =
> > > > > > to_cooling_device(dev);
> > > > > > -     unsigned long state;
> > > > > > +     unsigned long state, max_state;
> > > > > >       int result;
> > > > > > 
> > > > > >       if (sscanf(buf, "%ld\n", &state) != 1)
> > > > > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev,
> > > > > > struct device_attribute *attr,
> > > > > > 
> > > > > >       mutex_lock(&cdev->lock);
> > > > > > 
> > > > > > +     result = cdev->ops->get_max_state(cdev, &max_state);
> > > > > > +     if (result)
> > > > > > +             goto unlock;
> > > > > > +
> > > > > > +     if (state > max_state) {
> > > > > > +             result = -EINVAL;
> > > > > > +             goto unlock;
> > > > > > +     }
> > > > > > +
> > > > > >       result = cdev->ops->set_cur_state(cdev, state);
> > > > > 
> > > > > Why doesn't set_cur_state() check the max state before setting
> > > > > it?  Why
> > > > > are the callers forced to always check it before?  That feels
> > > > > wrong...
> > > > > 
> > > > 
> > > > The problem lies in thermal_cooling_device_stats_update(), not
> > > > set_cur_state().
> > > > 
> > > > If ->set_cur_state() doesn't error out on invalid state,
> > > > thermal_cooling_device_stats_update() does a:
> > > > 
> > > > stats->trans_table[stats->state * stats->max_states +
> > > > new_state]++;
> > > > 
> > > > stats->trans_table reserves space depending on max_states, but
> > > > we'd end up
> > > > reading/writing outside it. cur_state_store() can prevent this
> > > > regardless of
> > > > the driver's ->set_cur_state() implementation.
> > > 
> > > Why wouldn't cur_state_store() check for an out-of-bounds condition
> > > by
> > > calling get_max_state() and then return an error if it is invalid,
> > > preventing thermal_cooling_device_stats_update() from ever being
> > > called?
> > > 
> > 
> > That's what this patch does, it adds the out-of-bounds check.
> 
> No, I think Greg' question is
> why cdev->ops->set_cur_state() return 0 when setting a cooling state
> that exceeds the maximum cooling state?

Yes, that is what I am asking, it should not allow a state to be
exceeded.

thanks,

greg k-h
Varad Gautam July 6, 2022, 10:01 a.m. UTC | #4
On Wed, Jul 6, 2022 at 11:21 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Jul 06, 2022 at 04:51:59PM +0800, Zhang Rui wrote:
> > On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
> > > On Wed, Jul 6, 2022 at 8:45 AM Greg KH <gregkh@linuxfoundation.org>
> > > wrote:
> > > >
> > > > On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> > > > > On Tue, Jul 5, 2022 at 6:18 PM Greg KH <
> > > > > gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > > > > > Check that a user-provided thermal state is within the
> > > > > > > maximum
> > > > > > > thermal states supported by a given driver before attempting
> > > > > > > to
> > > > > > > apply it. This prevents a subsequent OOB access in
> > > > > > > thermal_cooling_device_stats_update() while performing
> > > > > > > state-transition accounting on drivers that do not have this
> > > > > > > check
> > > > > > > in their set_cur_state() handle.
> > > > > > >
> > > > > > > Signed-off-by: Varad Gautam <varadgautam@google.com>
> > > > > > > Cc: stable@vger.kernel.org
> > > > > > > ---
> > > > > > >  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > > > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/thermal/thermal_sysfs.c
> > > > > > > b/drivers/thermal/thermal_sysfs.c
> > > > > > > index 1c4aac8464a7..0c6b0223b133 100644
> > > > > > > --- a/drivers/thermal/thermal_sysfs.c
> > > > > > > +++ b/drivers/thermal/thermal_sysfs.c
> > > > > > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev,
> > > > > > > struct device_attribute *attr,
> > > > > > >               const char *buf, size_t count)
> > > > > > >  {
> > > > > > >       struct thermal_cooling_device *cdev =
> > > > > > > to_cooling_device(dev);
> > > > > > > -     unsigned long state;
> > > > > > > +     unsigned long state, max_state;
> > > > > > >       int result;
> > > > > > >
> > > > > > >       if (sscanf(buf, "%ld\n", &state) != 1)
> > > > > > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev,
> > > > > > > struct device_attribute *attr,
> > > > > > >
> > > > > > >       mutex_lock(&cdev->lock);
> > > > > > >
> > > > > > > +     result = cdev->ops->get_max_state(cdev, &max_state);
> > > > > > > +     if (result)
> > > > > > > +             goto unlock;
> > > > > > > +
> > > > > > > +     if (state > max_state) {
> > > > > > > +             result = -EINVAL;
> > > > > > > +             goto unlock;
> > > > > > > +     }
> > > > > > > +
> > > > > > >       result = cdev->ops->set_cur_state(cdev, state);
> > > > > >
> > > > > > Why doesn't set_cur_state() check the max state before setting
> > > > > > it?  Why
> > > > > > are the callers forced to always check it before?  That feels
> > > > > > wrong...
> > > > > >
> > > > >
> > > > > The problem lies in thermal_cooling_device_stats_update(), not
> > > > > set_cur_state().
> > > > >
> > > > > If ->set_cur_state() doesn't error out on invalid state,
> > > > > thermal_cooling_device_stats_update() does a:
> > > > >
> > > > > stats->trans_table[stats->state * stats->max_states +
> > > > > new_state]++;
> > > > >
> > > > > stats->trans_table reserves space depending on max_states, but
> > > > > we'd end up
> > > > > reading/writing outside it. cur_state_store() can prevent this
> > > > > regardless of
> > > > > the driver's ->set_cur_state() implementation.
> > > >
> > > > Why wouldn't cur_state_store() check for an out-of-bounds condition
> > > > by
> > > > calling get_max_state() and then return an error if it is invalid,
> > > > preventing thermal_cooling_device_stats_update() from ever being
> > > > called?
> > > >
> > >
> > > That's what this patch does, it adds the out-of-bounds check.
> >
> > No, I think Greg' question is
> > why cdev->ops->set_cur_state() return 0 when setting a cooling state
> > that exceeds the maximum cooling state?
>
> Yes, that is what I am asking, it should not allow a state to be
> exceeded.
>

Indeed, it is upto the driver to return !0 from cdev->ops->set_cur_state()
when setting state > max - and it is a driver bug for not doing so.

But a buggy driver should not lead to cur_state_store() performing an OOB
access.

> thanks,
>
> greg k-h
Greg KH July 6, 2022, 10:21 a.m. UTC | #5
On Wed, Jul 06, 2022 at 12:01:19PM +0200, Varad Gautam wrote:
> On Wed, Jul 6, 2022 at 11:21 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Jul 06, 2022 at 04:51:59PM +0800, Zhang Rui wrote:
> > > On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
> > > > On Wed, Jul 6, 2022 at 8:45 AM Greg KH <gregkh@linuxfoundation.org>
> > > > wrote:
> > > > >
> > > > > On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> > > > > > On Tue, Jul 5, 2022 at 6:18 PM Greg KH <
> > > > > > gregkh@linuxfoundation.org> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > > > > > > Check that a user-provided thermal state is within the
> > > > > > > > maximum
> > > > > > > > thermal states supported by a given driver before attempting
> > > > > > > > to
> > > > > > > > apply it. This prevents a subsequent OOB access in
> > > > > > > > thermal_cooling_device_stats_update() while performing
> > > > > > > > state-transition accounting on drivers that do not have this
> > > > > > > > check
> > > > > > > > in their set_cur_state() handle.
> > > > > > > >
> > > > > > > > Signed-off-by: Varad Gautam <varadgautam@google.com>
> > > > > > > > Cc: stable@vger.kernel.org
> > > > > > > > ---
> > > > > > > >  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > > > > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/thermal/thermal_sysfs.c
> > > > > > > > b/drivers/thermal/thermal_sysfs.c
> > > > > > > > index 1c4aac8464a7..0c6b0223b133 100644
> > > > > > > > --- a/drivers/thermal/thermal_sysfs.c
> > > > > > > > +++ b/drivers/thermal/thermal_sysfs.c
> > > > > > > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev,
> > > > > > > > struct device_attribute *attr,
> > > > > > > >               const char *buf, size_t count)
> > > > > > > >  {
> > > > > > > >       struct thermal_cooling_device *cdev =
> > > > > > > > to_cooling_device(dev);
> > > > > > > > -     unsigned long state;
> > > > > > > > +     unsigned long state, max_state;
> > > > > > > >       int result;
> > > > > > > >
> > > > > > > >       if (sscanf(buf, "%ld\n", &state) != 1)
> > > > > > > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev,
> > > > > > > > struct device_attribute *attr,
> > > > > > > >
> > > > > > > >       mutex_lock(&cdev->lock);
> > > > > > > >
> > > > > > > > +     result = cdev->ops->get_max_state(cdev, &max_state);
> > > > > > > > +     if (result)
> > > > > > > > +             goto unlock;
> > > > > > > > +
> > > > > > > > +     if (state > max_state) {
> > > > > > > > +             result = -EINVAL;
> > > > > > > > +             goto unlock;
> > > > > > > > +     }
> > > > > > > > +
> > > > > > > >       result = cdev->ops->set_cur_state(cdev, state);
> > > > > > >
> > > > > > > Why doesn't set_cur_state() check the max state before setting
> > > > > > > it?  Why
> > > > > > > are the callers forced to always check it before?  That feels
> > > > > > > wrong...
> > > > > > >
> > > > > >
> > > > > > The problem lies in thermal_cooling_device_stats_update(), not
> > > > > > set_cur_state().
> > > > > >
> > > > > > If ->set_cur_state() doesn't error out on invalid state,
> > > > > > thermal_cooling_device_stats_update() does a:
> > > > > >
> > > > > > stats->trans_table[stats->state * stats->max_states +
> > > > > > new_state]++;
> > > > > >
> > > > > > stats->trans_table reserves space depending on max_states, but
> > > > > > we'd end up
> > > > > > reading/writing outside it. cur_state_store() can prevent this
> > > > > > regardless of
> > > > > > the driver's ->set_cur_state() implementation.
> > > > >
> > > > > Why wouldn't cur_state_store() check for an out-of-bounds condition
> > > > > by
> > > > > calling get_max_state() and then return an error if it is invalid,
> > > > > preventing thermal_cooling_device_stats_update() from ever being
> > > > > called?
> > > > >
> > > >
> > > > That's what this patch does, it adds the out-of-bounds check.
> > >
> > > No, I think Greg' question is
> > > why cdev->ops->set_cur_state() return 0 when setting a cooling state
> > > that exceeds the maximum cooling state?
> >
> > Yes, that is what I am asking, it should not allow a state to be
> > exceeded.
> >
> 
> Indeed, it is upto the driver to return !0 from cdev->ops->set_cur_state()
> when setting state > max - and it is a driver bug for not doing so.
> 
> But a buggy driver should not lead to cur_state_store() performing an OOB
> access.

Agreed, which is why the code that does the access should check before
it does so.  Right now you are relying on the sysfs code to do so, which
seems very wrong.

thanks,

greg k-h
Varad Gautam July 6, 2022, 12:30 p.m. UTC | #6
On Wed, Jul 6, 2022 at 12:21 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Jul 06, 2022 at 12:01:19PM +0200, Varad Gautam wrote:
> > On Wed, Jul 6, 2022 at 11:21 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Jul 06, 2022 at 04:51:59PM +0800, Zhang Rui wrote:
> > > > On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
> > > > > On Wed, Jul 6, 2022 at 8:45 AM Greg KH <gregkh@linuxfoundation.org>
> > > > > wrote:
> > > > > >
> > > > > > On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> > > > > > > On Tue, Jul 5, 2022 at 6:18 PM Greg KH <
> > > > > > > gregkh@linuxfoundation.org> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > > > > > > > Check that a user-provided thermal state is within the
> > > > > > > > > maximum
> > > > > > > > > thermal states supported by a given driver before attempting
> > > > > > > > > to
> > > > > > > > > apply it. This prevents a subsequent OOB access in
> > > > > > > > > thermal_cooling_device_stats_update() while performing
> > > > > > > > > state-transition accounting on drivers that do not have this
> > > > > > > > > check
> > > > > > > > > in their set_cur_state() handle.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Varad Gautam <varadgautam@google.com>
> > > > > > > > > Cc: stable@vger.kernel.org
> > > > > > > > > ---
> > > > > > > > >  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > > > > > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/thermal/thermal_sysfs.c
> > > > > > > > > b/drivers/thermal/thermal_sysfs.c
> > > > > > > > > index 1c4aac8464a7..0c6b0223b133 100644
> > > > > > > > > --- a/drivers/thermal/thermal_sysfs.c
> > > > > > > > > +++ b/drivers/thermal/thermal_sysfs.c
> > > > > > > > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev,
> > > > > > > > > struct device_attribute *attr,
> > > > > > > > >               const char *buf, size_t count)
> > > > > > > > >  {
> > > > > > > > >       struct thermal_cooling_device *cdev =
> > > > > > > > > to_cooling_device(dev);
> > > > > > > > > -     unsigned long state;
> > > > > > > > > +     unsigned long state, max_state;
> > > > > > > > >       int result;
> > > > > > > > >
> > > > > > > > >       if (sscanf(buf, "%ld\n", &state) != 1)
> > > > > > > > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev,
> > > > > > > > > struct device_attribute *attr,
> > > > > > > > >
> > > > > > > > >       mutex_lock(&cdev->lock);
> > > > > > > > >
> > > > > > > > > +     result = cdev->ops->get_max_state(cdev, &max_state);
> > > > > > > > > +     if (result)
> > > > > > > > > +             goto unlock;
> > > > > > > > > +
> > > > > > > > > +     if (state > max_state) {
> > > > > > > > > +             result = -EINVAL;
> > > > > > > > > +             goto unlock;
> > > > > > > > > +     }
> > > > > > > > > +
> > > > > > > > >       result = cdev->ops->set_cur_state(cdev, state);
> > > > > > > >
> > > > > > > > Why doesn't set_cur_state() check the max state before setting
> > > > > > > > it?  Why
> > > > > > > > are the callers forced to always check it before?  That feels
> > > > > > > > wrong...
> > > > > > > >
> > > > > > >
> > > > > > > The problem lies in thermal_cooling_device_stats_update(), not
> > > > > > > set_cur_state().
> > > > > > >
> > > > > > > If ->set_cur_state() doesn't error out on invalid state,
> > > > > > > thermal_cooling_device_stats_update() does a:
> > > > > > >
> > > > > > > stats->trans_table[stats->state * stats->max_states +
> > > > > > > new_state]++;
> > > > > > >
> > > > > > > stats->trans_table reserves space depending on max_states, but
> > > > > > > we'd end up
> > > > > > > reading/writing outside it. cur_state_store() can prevent this
> > > > > > > regardless of
> > > > > > > the driver's ->set_cur_state() implementation.
> > > > > >
> > > > > > Why wouldn't cur_state_store() check for an out-of-bounds condition
> > > > > > by
> > > > > > calling get_max_state() and then return an error if it is invalid,
> > > > > > preventing thermal_cooling_device_stats_update() from ever being
> > > > > > called?
> > > > > >
> > > > >
> > > > > That's what this patch does, it adds the out-of-bounds check.
> > > >
> > > > No, I think Greg' question is
> > > > why cdev->ops->set_cur_state() return 0 when setting a cooling state
> > > > that exceeds the maximum cooling state?
> > >
> > > Yes, that is what I am asking, it should not allow a state to be
> > > exceeded.
> > >
> >
> > Indeed, it is upto the driver to return !0 from cdev->ops->set_cur_state()
> > when setting state > max - and it is a driver bug for not doing so.
> >
> > But a buggy driver should not lead to cur_state_store() performing an OOB
> > access.
>
> Agreed, which is why the code that does the access should check before
> it does so.  Right now you are relying on the sysfs code to do so, which
> seems very wrong.
>

I see the point.

The OOB access happens in thermal_cooling_device_stats_update().

By placing the check in cur_state_store(), I'm trying to ensure
two things for a buggy driver:
1. The driver's cdev->ops->set_cur_state() doesn't get called if
the new state is > max state. This is to prevent the driver
from storing the new (invalid) state internally. If the driver
didn't realise/reject an invalid state, chances are it will try
to propagate it internally and take actions according to that,
which can have side effects on system stability.
2. The kernel doesn't do an OOB access in
thermal_cooling_device_stats_update().

> thanks,
>
> greg k-h
Greg KH July 6, 2022, 12:51 p.m. UTC | #7
On Wed, Jul 06, 2022 at 02:30:21PM +0200, Varad Gautam wrote:
> On Wed, Jul 6, 2022 at 12:21 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Jul 06, 2022 at 12:01:19PM +0200, Varad Gautam wrote:
> > > On Wed, Jul 6, 2022 at 11:21 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Jul 06, 2022 at 04:51:59PM +0800, Zhang Rui wrote:
> > > > > On Wed, 2022-07-06 at 09:16 +0200, Varad Gautam wrote:
> > > > > > On Wed, Jul 6, 2022 at 8:45 AM Greg KH <gregkh@linuxfoundation.org>
> > > > > > wrote:
> > > > > > >
> > > > > > > On Tue, Jul 05, 2022 at 11:02:50PM +0200, Varad Gautam wrote:
> > > > > > > > On Tue, Jul 5, 2022 at 6:18 PM Greg KH <
> > > > > > > > gregkh@linuxfoundation.org> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Jul 05, 2022 at 03:00:02PM +0000, Varad Gautam wrote:
> > > > > > > > > > Check that a user-provided thermal state is within the
> > > > > > > > > > maximum
> > > > > > > > > > thermal states supported by a given driver before attempting
> > > > > > > > > > to
> > > > > > > > > > apply it. This prevents a subsequent OOB access in
> > > > > > > > > > thermal_cooling_device_stats_update() while performing
> > > > > > > > > > state-transition accounting on drivers that do not have this
> > > > > > > > > > check
> > > > > > > > > > in their set_cur_state() handle.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Varad Gautam <varadgautam@google.com>
> > > > > > > > > > Cc: stable@vger.kernel.org
> > > > > > > > > > ---
> > > > > > > > > >  drivers/thermal/thermal_sysfs.c | 12 +++++++++++-
> > > > > > > > > >  1 file changed, 11 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/thermal/thermal_sysfs.c
> > > > > > > > > > b/drivers/thermal/thermal_sysfs.c
> > > > > > > > > > index 1c4aac8464a7..0c6b0223b133 100644
> > > > > > > > > > --- a/drivers/thermal/thermal_sysfs.c
> > > > > > > > > > +++ b/drivers/thermal/thermal_sysfs.c
> > > > > > > > > > @@ -607,7 +607,7 @@ cur_state_store(struct device *dev,
> > > > > > > > > > struct device_attribute *attr,
> > > > > > > > > >               const char *buf, size_t count)
> > > > > > > > > >  {
> > > > > > > > > >       struct thermal_cooling_device *cdev =
> > > > > > > > > > to_cooling_device(dev);
> > > > > > > > > > -     unsigned long state;
> > > > > > > > > > +     unsigned long state, max_state;
> > > > > > > > > >       int result;
> > > > > > > > > >
> > > > > > > > > >       if (sscanf(buf, "%ld\n", &state) != 1)
> > > > > > > > > > @@ -618,10 +618,20 @@ cur_state_store(struct device *dev,
> > > > > > > > > > struct device_attribute *attr,
> > > > > > > > > >
> > > > > > > > > >       mutex_lock(&cdev->lock);
> > > > > > > > > >
> > > > > > > > > > +     result = cdev->ops->get_max_state(cdev, &max_state);
> > > > > > > > > > +     if (result)
> > > > > > > > > > +             goto unlock;
> > > > > > > > > > +
> > > > > > > > > > +     if (state > max_state) {
> > > > > > > > > > +             result = -EINVAL;
> > > > > > > > > > +             goto unlock;
> > > > > > > > > > +     }
> > > > > > > > > > +
> > > > > > > > > >       result = cdev->ops->set_cur_state(cdev, state);
> > > > > > > > >
> > > > > > > > > Why doesn't set_cur_state() check the max state before setting
> > > > > > > > > it?  Why
> > > > > > > > > are the callers forced to always check it before?  That feels
> > > > > > > > > wrong...
> > > > > > > > >
> > > > > > > >
> > > > > > > > The problem lies in thermal_cooling_device_stats_update(), not
> > > > > > > > set_cur_state().
> > > > > > > >
> > > > > > > > If ->set_cur_state() doesn't error out on invalid state,
> > > > > > > > thermal_cooling_device_stats_update() does a:
> > > > > > > >
> > > > > > > > stats->trans_table[stats->state * stats->max_states +
> > > > > > > > new_state]++;
> > > > > > > >
> > > > > > > > stats->trans_table reserves space depending on max_states, but
> > > > > > > > we'd end up
> > > > > > > > reading/writing outside it. cur_state_store() can prevent this
> > > > > > > > regardless of
> > > > > > > > the driver's ->set_cur_state() implementation.
> > > > > > >
> > > > > > > Why wouldn't cur_state_store() check for an out-of-bounds condition
> > > > > > > by
> > > > > > > calling get_max_state() and then return an error if it is invalid,
> > > > > > > preventing thermal_cooling_device_stats_update() from ever being
> > > > > > > called?
> > > > > > >
> > > > > >
> > > > > > That's what this patch does, it adds the out-of-bounds check.
> > > > >
> > > > > No, I think Greg' question is
> > > > > why cdev->ops->set_cur_state() return 0 when setting a cooling state
> > > > > that exceeds the maximum cooling state?
> > > >
> > > > Yes, that is what I am asking, it should not allow a state to be
> > > > exceeded.
> > > >
> > >
> > > Indeed, it is upto the driver to return !0 from cdev->ops->set_cur_state()
> > > when setting state > max - and it is a driver bug for not doing so.
> > >
> > > But a buggy driver should not lead to cur_state_store() performing an OOB
> > > access.
> >
> > Agreed, which is why the code that does the access should check before
> > it does so.  Right now you are relying on the sysfs code to do so, which
> > seems very wrong.
> >
> 
> I see the point.
> 
> The OOB access happens in thermal_cooling_device_stats_update().
> 
> By placing the check in cur_state_store(), I'm trying to ensure
> two things for a buggy driver:

What in-kernel driver has this problem, and why not just fix it there?

> 1. The driver's cdev->ops->set_cur_state() doesn't get called if
> the new state is > max state. This is to prevent the driver
> from storing the new (invalid) state internally. If the driver
> didn't realise/reject an invalid state, chances are it will try
> to propagate it internally and take actions according to that,
> which can have side effects on system stability.

Again, set_cur_state() should check for max values, if not, it is broken
and that needs to be fixed in the driver.

> 2. The kernel doesn't do an OOB access in
> thermal_cooling_device_stats_update().

Then don't allow thermal_cooling_device_stats_update() to do an out of
band access by fixing it there too.  But again, your patch does not
solve that directly.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index 1c4aac8464a7..0c6b0223b133 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -607,7 +607,7 @@  cur_state_store(struct device *dev, struct device_attribute *attr,
 		const char *buf, size_t count)
 {
 	struct thermal_cooling_device *cdev = to_cooling_device(dev);
-	unsigned long state;
+	unsigned long state, max_state;
 	int result;
 
 	if (sscanf(buf, "%ld\n", &state) != 1)
@@ -618,10 +618,20 @@  cur_state_store(struct device *dev, struct device_attribute *attr,
 
 	mutex_lock(&cdev->lock);
 
+	result = cdev->ops->get_max_state(cdev, &max_state);
+	if (result)
+		goto unlock;
+
+	if (state > max_state) {
+		result = -EINVAL;
+		goto unlock;
+	}
+
 	result = cdev->ops->set_cur_state(cdev, state);
 	if (!result)
 		thermal_cooling_device_stats_update(cdev, state);
 
+unlock:
 	mutex_unlock(&cdev->lock);
 	return result ? result : count;
 }