mbox series

[v1,0/4] thermal: core/ACPI: Fix processor cooling device regression

Message ID 2148907.irdbgypaU6@kreacher
Headers show
Series thermal: core/ACPI: Fix processor cooling device regression | expand

Message

Rafael J. Wysocki March 3, 2023, 7:18 p.m. UTC
Hi All,

As reported by Rui in this thread:

Link: https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/

some recent changes in the thermal core cause the CPU cooling devices
registered by the ACPI processor driver to become unusable in some cases
and somewhat crippled in general.

The problem is that the ACPI processor driver changes its ->get_max_state()
callback return value depending on whether or not cpufreq is available and
there is a cpufreq policy for a given CPU.  However, the thermal core has
always assumed that the return value of that callback will not change, which
in fact is relied on by the cooling device statistics code.  In particular,
when the ->get_max_state() grows, the memory buffer allocated for storing the
statistics will be too small and corruption may ensue as a result.

For this reason, the issue needs to be addressed in the ACPI processor driver
and not in the thermal core, but the core needs to help somewhat too.  Namely,
it needs to provide a helper allowing an interested driver to update the
max_state value for an already registered cooling device in certain situations
which will also cause the statistics to be rebuilt.

This series implements the above and for details please refer to the individual
patch chagelogs.

Please also note that it has only been lightly tested, so more testing and of
course review of it is welcome.

Thanks!