[V2,0/4] PM/Thermal: Enhance PCH overheat handling

Message ID 20220519143508.3803894-1-rui.zhang@intel.com

Zhang, Rui May 19, 2022, 2:35 p.m. UTC
On some Intel client platforms like SKL/KBL/CNL/CML, there is a
PCH thermal sensor that monitors the PCH temperature and blocks the system
from entering S0ix in case it overheats.

Commit ef63b043ac86 ("thermal: intel: pch: fix S0ix failure due to PCH
temperature above threshold") introduced a delay loop that gives the PCH
time to cool down so that S0ix entry is not blocked.

However, in practice, we found that the time it takes to cool the PCH down
below threshold highly depends on the initial PCH temperature when the
delay starts, as well as the ambient temperature.

For example, on a Dell XPS 9360 laptop, the problem can be triggered
1. when it is suspended with a heavy workload running,
or
2. when it is moved from New Hampshire to Florida.

In these cases, the 1 second delay is not sufficient. As a result, the
system stays in a shallower power state like PCx instead of S0ix, and
drains the battery without the user noticing.

In order to fix this, we
1. move the delay to the .suspend_noirq phase instead, in order to
   do the cooling when the system is in a more quiescent state.
2. extend the default overall cooling delay timeout to 60 seconds.
3. make sure the temperature is below threshold rather than equal to it.
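
As a rough illustration of points 2 and 3 (not the actual driver code), the
cooling wait can be sketched in plain userspace C as below. All names here
(wait_for_cooldown, the callback types, struct fake_sensor) are made up for
this sketch, and the step length and temperature units are assumptions. The
loop polls until the reading is strictly below the threshold, or gives up once
the overall timeout has elapsed.

```c
#include <stdbool.h>

/* Hypothetical callbacks for this sketch; not the driver's real interface. */
typedef int (*read_temp_fn)(void *ctx);                  /* current temperature */
typedef void (*sleep_ms_fn)(void *ctx, unsigned int ms); /* wait for one step */

/*
 * Poll until the temperature is strictly below the threshold, or until the
 * overall timeout has been used up. Returns true if the sensor cooled down.
 */
static bool wait_for_cooldown(read_temp_fn read_temp, sleep_ms_fn sleep_ms,
			      void *ctx, int threshold,
			      unsigned int step_ms, unsigned int timeout_ms)
{
	unsigned int waited = 0;

	for (;;) {
		/* "below threshold", not "equal to it" */
		if (read_temp(ctx) < threshold)
			return true;
		if (waited >= timeout_ms)
			return false;
		sleep_ms(ctx, step_ms);
		waited += step_ms;
	}
}

/* Simulated sensor (assumption): cools by a fixed amount per sleep step. */
struct fake_sensor {
	int temp;
	int drop_per_step;
	unsigned int slept_ms;
};

static int fake_read(void *ctx)
{
	return ((struct fake_sensor *)ctx)->temp;
}

static void fake_sleep(void *ctx, unsigned int ms)
{
	struct fake_sensor *s = ctx;

	s->slept_ms += ms;
	s->temp -= s->drop_per_step;
}
```

With the simulated sensor, a reading that sits exactly at the threshold and
never drops causes the wait to run for the whole timeout and report failure,
which is the behaviour point 3 asks for.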

Compared with V1, the last four patches are dropped from the series, and
we focus on the PCH overheat issue only. In addition, one of the patches
is split according to Rafael's suggestion.

thanks,
rui