diff mbox series

[v1,1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads

Message ID 3304112.44csPzL39Z@kreacher
State New
Headers show
Series thermal: core: Assorted improvements for v6.11 | expand

Commit Message

Rafael J. Wysocki May 10, 2024, 2:13 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reading the zone temperature via sysfs causes the driver callback to
be invoked, but it does not cause the thermal zone object to be updated.

This is problematic if the zone temperature read via sysfs differs from
the temperature value stored in the thermal zone object as it may cause
the kernel and user space to act against each other in some cases.

For this reason, make temp_show() trigger a zone temperature update if
the temperature returned by thermal_zone_get_temp() is different from
the temperature value stored in the thermal zone object.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_core.c  |    2 +-
 drivers/thermal/thermal_sysfs.c |    3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

Comments

Lukasz Luba May 13, 2024, 7:11 a.m. UTC | #1
Hi Rafael,

On 5/10/24 15:13, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Reading the zone temperature via sysfs causes the driver callback to
> be invoked, but it does not cause the thermal zone object to be updated.
> 
> This is problematic if the zone temperature read via sysfs differs from
> the temperature value stored in the thermal zone object as it may cause
> the kernel and user space to act against each other in some cases.
> 
> For this reason, make temp_show() trigger a zone temperature update if
> the temperature returned by thermal_zone_get_temp() is different from
> the temperature value stored in the thermal zone object.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/thermal/thermal_core.c  |    2 +-
>   drivers/thermal/thermal_sysfs.c |    3 +++
>   2 files changed, 4 insertions(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/thermal/thermal_sysfs.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_sysfs.c
> +++ linux-pm/drivers/thermal/thermal_sysfs.c
> @@ -42,6 +42,9 @@ temp_show(struct device *dev, struct dev
>   	if (ret)
>   		return ret;
>   
> +	if (temperature != READ_ONCE(tz->temperature))
> +		thermal_zone_device_update(tz, THERMAL_EVENT_TEMP_SAMPLE);

That's a bit problematic because it will trigger
governor->manage()

In case of IPA governor we relay on constant polling
period. We estimate the past power usage and current
thermal budget, to derive the next period power budget
for devices. I don't know if the internal PID algorithm
will be resilient enough to compensate this asynchronous
trigger caused from user-space.

We choose the period to be at least 1 frame (e.g. ~16ms)
to have good avg usage of CPUs and GPU. TBH I don't know
what would happen if someone reads the temp after e.g. 1ms
of last IPA trigger, but some devices (e.g. GPU) wasn't
loaded in that last 1ms delta...
I'm a bit more relaxed about CPUs because we use utilization
signal from runqueues (like the TEO util gov). That's a moving
avg signal which should keep some history, like low-pass
filter, so information is more resilient in that case.

Could we think about that IPA constant period usage?
I think I understand the proposal of your patch.
We might add a filter inside IPA to ignore such async
triggers in the .manage() callback.
What do you think?

Regards,
Lukasz
diff mbox series

Patch

Index: linux-pm/drivers/thermal/thermal_sysfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_sysfs.c
+++ linux-pm/drivers/thermal/thermal_sysfs.c
@@ -42,6 +42,9 @@  temp_show(struct device *dev, struct dev
 	if (ret)
 		return ret;
 
+	if (temperature != READ_ONCE(tz->temperature))
+		thermal_zone_device_update(tz, THERMAL_EVENT_TEMP_SAMPLE);
+
 	return sprintf(buf, "%d\n", temperature);
 }
 
Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -429,7 +429,7 @@  static void update_temperature(struct th
 	}
 
 	tz->last_temperature = tz->temperature;
-	tz->temperature = temp;
+	WRITE_ONCE(tz->temperature, temp);
 
 	trace_thermal_temperature(tz);