diff mbox series

[v2] thermal: Fix a NULL pointer dereference

Message ID 1631041289-11804-1-git-send-email-quic_subbaram@quicinc.com
State New
Headers show
Series [v2] thermal: Fix a NULL pointer dereference | expand

Commit Message

Subbaraman Narayanamurthy Sept. 7, 2021, 7:01 p.m. UTC
of_parse_thermal_zones() parses the thermal-zones node and registers a
thermal_zone device for each subnode. However, if a thermal zone is
consuming a thermal sensor and that thermal sensor device hasn't probed
yet, an attempt to set trip_point_*_temp for that thermal zone device
can cause a NULL pointer dereference. Fix it.

 console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
 ...
 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
 ...
 Call trace:
  of_thermal_set_trip_temp+0x40/0xc4
  trip_point_temp_store+0xc0/0x1dc
  dev_attr_store+0x38/0x88
  sysfs_kf_write+0x64/0xc0
  kernfs_fop_write_iter+0x108/0x1d0
  vfs_write+0x2f4/0x368
  ksys_write+0x7c/0xec
  __arm64_sys_write+0x20/0x30
  el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
  do_el0_svc+0x28/0xa0
  el0_svc+0x14/0x24
  el0_sync_handler+0x88/0xec
  el0_sync+0x1c0/0x200

While at it, fix the possible NULL pointer dereference in other
functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
of_thermal_get_trend().

Cc: stable@vger.kernel.org
Suggested-by: David Collins <quic_collinsd@quicinc.com>
Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
---
Changes for v2:
- Added checks in of_thermal_get_temp(), of_thermal_set_emul_temp(), of_thermal_get_trend().

 drivers/thermal/thermal_of.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Subbaraman Narayanamurthy Oct. 5, 2021, 10:09 p.m. UTC | #1
On 9/17/21 1:06 PM, Subbaraman Narayanamurthy wrote:
> On 9/17/21 2:31 AM, Daniel Lezcano wrote:

>> On 07/09/2021 21:01, Subbaraman Narayanamurthy wrote:

>>> of_parse_thermal_zones() parses the thermal-zones node and registers a

>>> thermal_zone device for each subnode. However, if a thermal zone is

>>> consuming a thermal sensor and that thermal sensor device hasn't probed

>>> yet, an attempt to set trip_point_*_temp for that thermal zone device

>>> can cause a NULL pointer dereference. Fix it.

>>>

>>>  console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp

>>>  ...

>>>  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020

>>>  ...

>>>  Call trace:

>>>   of_thermal_set_trip_temp+0x40/0xc4

>>>   trip_point_temp_store+0xc0/0x1dc

>>>   dev_attr_store+0x38/0x88

>>>   sysfs_kf_write+0x64/0xc0

>>>   kernfs_fop_write_iter+0x108/0x1d0

>>>   vfs_write+0x2f4/0x368

>>>   ksys_write+0x7c/0xec

>>>   __arm64_sys_write+0x20/0x30

>>>   el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc

>>>   do_el0_svc+0x28/0xa0

>>>   el0_svc+0x14/0x24

>>>   el0_sync_handler+0x88/0xec

>>>   el0_sync+0x1c0/0x200

>>>

>>> While at it, fix the possible NULL pointer dereference in other

>>> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),

>>> of_thermal_get_trend().

>>>

>>> Cc: stable@vger.kernel.org

>>> Suggested-by: David Collins <quic_collinsd@quicinc.com>

>>> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>

>>> ---

>>> Changes for v2:

>>> - Added checks in of_thermal_get_temp(), of_thermal_set_emul_temp(), of_thermal_get_trend().

>>>

>>>  drivers/thermal/thermal_of.c | 9 ++++++---

>>>  1 file changed, 6 insertions(+), 3 deletions(-)

>>>

>>> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c

>>> index 6379f26..9233f7e 100644

>>> --- a/drivers/thermal/thermal_of.c

>>> +++ b/drivers/thermal/thermal_of.c

>>> @@ -89,7 +89,7 @@ static int of_thermal_get_temp(struct thermal_zone_device *tz,

>>>  {

>>>  	struct __thermal_zone *data = tz->devdata;

>>>  

>>> -	if (!data->ops->get_temp)

>>> +	if (!data->ops || !data->ops->get_temp)

>> comment (1)

>>

>> AFAICT, if data->ops != NULL then data->ops->get_temp is also != NULL

>> because of the code allocating/freeing the ops structure.

>>

>> The tests can be replaced by (!data->ops), no ?

> Thanks Daniel for reviewing the patch.

>

> I agree that even if a sensor module is unregistered, that would call

> "thermal_zone_of_sensor_unregister" which would eventually set NULL on

> get_temp() and get_trend() and "tzd->ops" as well.

>

> However, of_thermal_get_temp() is trying to call "data->ops->get_temp"

> which comes from a sensor driver when it registers. There is no

> guarantee that it would be non-NULL right?

>

> Thinking of which, I think having both checks would be valid.


Hi Daniel,
Do you still have any concerns with this change?

Thanks,
Subbaraman

>>>  		return -EINVAL;

>>>  

>>>  	return data->ops->get_temp(data->sensor_data, temp);

>>> @@ -186,6 +186,9 @@ static int of_thermal_set_emul_temp(struct thermal_zone_device *tz,

>>>  {

>>>  	struct __thermal_zone *data = tz->devdata;

>>>  

>>> +	if (!data->ops || !data->ops->set_emul_temp)

>>> +		return -EINVAL;

>>> +

>> comment (2)

>>

>> The test looks pointless here (I mean both of them).

>>

>> If of_thermal_set_emul_temp() is called it is because the callback was

>> set in thermal_zone_of_add_sensor().

>>

>> This one does:

>>

>> 	tz->ops = ops;

>>

>> and

>> 	if (ops->set_emul_temp)

>>                 tzd->ops->set_emul_temp = of_thermal_set_emul_temp;

>>

>> If I'm not wrong if we are called, then data->ops &&

>> data->ops->set_emul_temp is always true, right ?

>>

> I've not exercised this condition yet. However, the original problem

> we've observed was when thermal HAL was trying to set trip thresholds

> on a thermal zone device for which the sensor device is not probed yet.

> This had happened randomly because of vendor modules taking time to be

> loaded and probed. I don't know if there would be any userspace entity

> that can try to set emulated temperature for a thermal zone even before

> a sensor device is probed.

>

> Without a sensor driver probed, "tz->ops" would not have a valid pointer

> right? So, I think checking for "data->ops" should be good.

>

> Another possibility is, a sensor might not have "set_emul_temp" callback.

> So checking for "ops->set_emul_temp" should be still valid.

>

>>>  	return data->ops->set_emul_temp(data->sensor_data, temp);

>>>  }

>>>  

>>> @@ -194,7 +197,7 @@ static int of_thermal_get_trend(struct thermal_zone_device *tz, int trip,

>>>  {

>>>  	struct __thermal_zone *data = tz->devdata;

>>>  

>>> -	if (!data->ops->get_trend)

>>> +	if (!data->ops || !data->ops->get_trend)

>>>  		return -EINVAL;

>> Same comment as (1)

> of_thermal_get_trend() is trying to call "data->ops->get_trend" which

> comes from a sensor driver when it registers. From what I can see,

> there are lot of drivers which don't pass "get_trend" in their ops.

> So there is no guarantee that it would be non-NULL right?

>

> Thinking of which, I think having both checks would be valid.

>

>>>  	return data->ops->get_trend(data->sensor_data, trip, trend);

>>> @@ -301,7 +304,7 @@ static int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,

>>>  	if (trip >= data->ntrips || trip < 0)

>>>  		return -EDOM;

>>>  

>>> -	if (data->ops->set_trip_temp) {

>>> +	if (data->ops && data->ops->set_trip_temp) {

>> Same comment as (2)

> Without a sensor driver probed, "tz->ops" would not have a valid pointer right?

> So, I think checking for "data->ops" should be good. Another possibility is, a

> sensor device might not have "set_trip_temp" callback but just "set_trips".

>

> So checking for "data->ops->set_trip_temp" might be still valid.

>

>>>  		int ret;

>>>  

>>>  		ret = data->ops->set_trip_temp(data->sensor_data, trip, temp);

>>>
Daniel Lezcano Oct. 6, 2021, 9:39 a.m. UTC | #2
On 06/10/2021 00:09, Subbaraman Narayanamurthy wrote:

[ ... ]

>>> The tests can be replaced by (!data->ops), no ?

>> Thanks Daniel for reviewing the patch.

>>

>> I agree that even if a sensor module is unregistered, that would call

>> "thermal_zone_of_sensor_unregister" which would eventually set NULL on

>> get_temp() and get_trend() and "tzd->ops" as well.

>>

>> However, of_thermal_get_temp() is trying to call "data->ops->get_temp"

>> which comes from a sensor driver when it registers. There is no

>> guarantee that it would be non-NULL right?

>>

>> Thinking of which, I think having both checks would be valid.

> 

> Hi Daniel,

> Do you still have any concerns with this change?


Yes, let me answer to the initial patch.


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
diff mbox series

Patch

diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 6379f26..9233f7e 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -89,7 +89,7 @@  static int of_thermal_get_temp(struct thermal_zone_device *tz,
 {
 	struct __thermal_zone *data = tz->devdata;
 
-	if (!data->ops->get_temp)
+	if (!data->ops || !data->ops->get_temp)
 		return -EINVAL;
 
 	return data->ops->get_temp(data->sensor_data, temp);
@@ -186,6 +186,9 @@  static int of_thermal_set_emul_temp(struct thermal_zone_device *tz,
 {
 	struct __thermal_zone *data = tz->devdata;
 
+	if (!data->ops || !data->ops->set_emul_temp)
+		return -EINVAL;
+
 	return data->ops->set_emul_temp(data->sensor_data, temp);
 }
 
@@ -194,7 +197,7 @@  static int of_thermal_get_trend(struct thermal_zone_device *tz, int trip,
 {
 	struct __thermal_zone *data = tz->devdata;
 
-	if (!data->ops->get_trend)
+	if (!data->ops || !data->ops->get_trend)
 		return -EINVAL;
 
 	return data->ops->get_trend(data->sensor_data, trip, trend);
@@ -301,7 +304,7 @@  static int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip,
 	if (trip >= data->ntrips || trip < 0)
 		return -EDOM;
 
-	if (data->ops->set_trip_temp) {
+	if (data->ops && data->ops->set_trip_temp) {
 		int ret;
 
 		ret = data->ops->set_trip_temp(data->sensor_data, trip, temp);