diff mbox series

[V4,4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling

Message ID 20191204153930.9128-4-daniel.lezcano@linaro.org
State New
Headers show
Series [V4,1/4] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice | expand

Commit Message

Daniel Lezcano Dec. 4, 2019, 3:39 p.m. UTC
As we introduced the idle injection cooling device called
cpuidle_cooling, let's be consistent and rename the cpu_cooling to
cpufreq_cooling as this one mitigates with OPPs changes.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>

---
  V4:
    - Added Acked-by and Reviewed-by
  V3:
    - Fix missing name conversion (Viresh Kumar)
---
 Documentation/driver-api/thermal/exynos_thermal.rst  | 2 +-
 MAINTAINERS                                          | 2 +-
 drivers/thermal/Makefile                             | 2 +-
 drivers/thermal/clock_cooling.c                      | 2 +-
 drivers/thermal/{cpu_cooling.c => cpufreq_cooling.c} | 6 +++---
 include/linux/clock_cooling.h                        | 2 +-
 6 files changed, 8 insertions(+), 8 deletions(-)
 rename drivers/thermal/{cpu_cooling.c => cpufreq_cooling.c} (99%)

-- 
2.17.1

Comments

Martin Kepplinger Dec. 6, 2019, 11:33 a.m. UTC | #1
I tested this on the librem5-devkit and see the
cooling devices in sysfs. I configure ARM_PSCI_CPUIDLE, not ARM_CPUIDLE and
add the patch below in register the cooling device there. "psci_idle"
is listed as the cpuidle_driver.

That's what I'm running, in case you want to see it all:
https://source.puri.sm/martin.kepplinger/linux-next/commits/next-20191205/librem5_cpuidle_mainline_atf

so I add a trip temperature description like this:
https://source.puri.sm/martin.kepplinger/linux-next/commit/361f49f93ae2c477fd012790831cabd0ed976660

When I let the SoC heat up, cpuidle cooling won't kick it. In sysfs:

catting the relevant files in /sys/class/thermal after heating up,
if that makes sense:

87000
85000
85000
thermal-cpufreq-0
1
thermal-idle-0
0
thermal-idle-1                                                                  
0                                                                               
thermal-idle-2
0
thermal-idle-3
0

with ARM_CPUIDLE instead of ARM_PSCI_CPUIDLE (and registering the cooling dev
during cpuidle-arm.c init) I won't have a cpuidle driver and thus no cpu-sleep
state at all.

Can you see where the problem here lies?

thanks!

                                    martin

---
 drivers/cpuidle/cpuidle-psci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index f3c1a2396f98..de6e7f444a66 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -8,6 +8,7 @@
 
 #define pr_fmt(fmt) "CPUidle PSCI: " fmt
 
+#include <linux/cpu_cooling.h>
 #include <linux/cpuidle.h>
 #include <linux/cpumask.h>
 #include <linux/cpu_pm.h>
@@ -195,6 +196,8 @@ static int __init psci_idle_init_cpu(int cpu)
 	if (ret)
 		goto out_kfree_drv;
 
+	cpuidle_cooling_register(drv);
+
 	return 0;
 
 out_kfree_drv:
-- 
2.20.1
Daniel Lezcano Dec. 6, 2019, 2:15 p.m. UTC | #2
On 06/12/2019 12:33, Martin Kepplinger wrote:
> I tested this on the librem5-devkit and see the

> cooling devices in sysfs. I configure ARM_PSCI_CPUIDLE, not ARM_CPUIDLE and

> add the patch below in register the cooling device there. "psci_idle"

> is listed as the cpuidle_driver.

> 

> That's what I'm running, in case you want to see it all:

> https://source.puri.sm/martin.kepplinger/linux-next/commits/next-20191205/librem5_cpuidle_mainline_atf

> 

> so I add a trip temperature description like this:

> https://source.puri.sm/martin.kepplinger/linux-next/commit/361f49f93ae2c477fd012790831cabd0ed976660

> 

> When I let the SoC heat up, cpuidle cooling won't kick it. In sysfs:

> 

> catting the relevant files in /sys/class/thermal after heating up,

> if that makes sense:

> 

> 87000

> 85000

> 85000

> thermal-cpufreq-0

> 1

> thermal-idle-0

> 0

> thermal-idle-1                                                                  

> 0                                                                               

> thermal-idle-2

> 0

> thermal-idle-3

> 0

> 

> with ARM_CPUIDLE instead of ARM_PSCI_CPUIDLE (and registering the cooling dev

> during cpuidle-arm.c init) I won't have a cpuidle driver and thus no cpu-sleep

> state at all.

> 

> Can you see where the problem here lies?


Yes, I removed the registration via the DT.

Can you try the following:

diff --git a/drivers/cpuidle/dt_idle_states.c
b/drivers/cpuidle/dt_idle_states.c
index d06d21a9525d..01367ddec49a 100644
--- a/drivers/cpuidle/dt_idle_states.c
+++ b/drivers/cpuidle/dt_idle_states.c
@@ -13,6 +13,7 @@
 #include <linux/errno.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
+#include <linux/cpu_cooling.h>
 #include <linux/of.h>
 #include <linux/of_device.h>

@@ -205,6 +206,9 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,
 			err = -EINVAL;
 			break;
 		}
+
+		cpuidle_of_cooling_register(state_node, drv);
+
 		of_node_put(state_node);
 	}

That's a hack for the moment.


> ---

>  drivers/cpuidle/cpuidle-psci.c | 3 +++

>  1 file changed, 3 insertions(+)

> 

> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c

> index f3c1a2396f98..de6e7f444a66 100644

> --- a/drivers/cpuidle/cpuidle-psci.c

> +++ b/drivers/cpuidle/cpuidle-psci.c

> @@ -8,6 +8,7 @@

>  

>  #define pr_fmt(fmt) "CPUidle PSCI: " fmt

>  

> +#include <linux/cpu_cooling.h>

>  #include <linux/cpuidle.h>

>  #include <linux/cpumask.h>

>  #include <linux/cpu_pm.h>

> @@ -195,6 +196,8 @@ static int __init psci_idle_init_cpu(int cpu)

>  	if (ret)

>  		goto out_kfree_drv;

>  

> +	cpuidle_cooling_register(drv);

> +

>  	return 0;

>  

>  out_kfree_drv:

> 



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Martin Kepplinger Dec. 9, 2019, 9:54 a.m. UTC | #3
On 06.12.19 15:15, Daniel Lezcano wrote:
> On 06/12/2019 12:33, Martin Kepplinger wrote:

>> I tested this on the librem5-devkit and see the

>> cooling devices in sysfs. I configure ARM_PSCI_CPUIDLE, not ARM_CPUIDLE and

>> add the patch below in register the cooling device there. "psci_idle"

>> is listed as the cpuidle_driver.

>>

>> That's what I'm running, in case you want to see it all:

>> https://source.puri.sm/martin.kepplinger/linux-next/commits/next-20191205/librem5_cpuidle_mainline_atf

>>

>> so I add a trip temperature description like this:

>> https://source.puri.sm/martin.kepplinger/linux-next/commit/361f49f93ae2c477fd012790831cabd0ed976660

>>

>> When I let the SoC heat up, cpuidle cooling won't kick it. In sysfs:

>>

>> catting the relevant files in /sys/class/thermal after heating up,

>> if that makes sense:

>>

>> 87000

>> 85000

>> 85000

>> thermal-cpufreq-0

>> 1

>> thermal-idle-0

>> 0

>> thermal-idle-1                                                                  

>> 0                                                                               

>> thermal-idle-2

>> 0

>> thermal-idle-3

>> 0

>>

>> with ARM_CPUIDLE instead of ARM_PSCI_CPUIDLE (and registering the cooling dev

>> during cpuidle-arm.c init) I won't have a cpuidle driver and thus no cpu-sleep

>> state at all.

>>

>> Can you see where the problem here lies?

> 

> Yes, I removed the registration via the DT.

> 

> Can you try the following:

> 

> diff --git a/drivers/cpuidle/dt_idle_states.c

> b/drivers/cpuidle/dt_idle_states.c

> index d06d21a9525d..01367ddec49a 100644

> --- a/drivers/cpuidle/dt_idle_states.c

> +++ b/drivers/cpuidle/dt_idle_states.c

> @@ -13,6 +13,7 @@

>  #include <linux/errno.h>

>  #include <linux/kernel.h>

>  #include <linux/module.h>

> +#include <linux/cpu_cooling.h>

>  #include <linux/of.h>

>  #include <linux/of_device.h>

> 

> @@ -205,6 +206,9 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,

>  			err = -EINVAL;

>  			break;

>  		}

> +

> +		cpuidle_of_cooling_register(state_node, drv);

> +

>  		of_node_put(state_node);

>  	}

> 

> That's a hack for the moment.

> 


thanks. I could test that successfully. The only question would be: Is
is intentional how "non-aggressive" the cooling driver cools? I would
have expected it to basically inject more idle cycles earlier. I'd set
75 degrees as trip point and at 85 degress is would only inject about 30
(of 100).

You describe the "config values" in question in the documentation, but
I'm not sure what's the correct way to change them.

thanks,

                                      martin
Daniel Lezcano Dec. 9, 2019, 7:29 p.m. UTC | #4
On 09/12/2019 13:03, Daniel Lezcano wrote:
> On 09/12/2019 10:54, Martin Kepplinger wrote:

>>

>>

>> On 06.12.19 15:15, Daniel Lezcano wrote:

>>> On 06/12/2019 12:33, Martin Kepplinger wrote:

>>>> I tested this on the librem5-devkit and see the

>>>> cooling devices in sysfs. I configure ARM_PSCI_CPUIDLE, not ARM_CPUIDLE and

>>>> add the patch below in register the cooling device there. "psci_idle"

>>>> is listed as the cpuidle_driver.

>>>>

>>>> That's what I'm running, in case you want to see it all:

>>>> https://source.puri.sm/martin.kepplinger/linux-next/commits/next-20191205/librem5_cpuidle_mainline_atf

>>>>

>>>> so I add a trip temperature description like this:

>>>> https://source.puri.sm/martin.kepplinger/linux-next/commit/361f49f93ae2c477fd012790831cabd0ed976660

>>>>

>>>> When I let the SoC heat up, cpuidle cooling won't kick it. In sysfs:

>>>>

>>>> catting the relevant files in /sys/class/thermal after heating up,

>>>> if that makes sense:

>>>>

>>>> 87000

>>>> 85000

>>>> 85000

>>>> thermal-cpufreq-0

>>>> 1

>>>> thermal-idle-0

>>>> 0

>>>> thermal-idle-1                                                                  

>>>> 0                                                                               

>>>> thermal-idle-2

>>>> 0

>>>> thermal-idle-3

>>>> 0

>>>>

>>>> with ARM_CPUIDLE instead of ARM_PSCI_CPUIDLE (and registering the cooling dev

>>>> during cpuidle-arm.c init) I won't have a cpuidle driver and thus no cpu-sleep

>>>> state at all.

>>>>

>>>> Can you see where the problem here lies?

>>>

>>> Yes, I removed the registration via the DT.

>>>

>>> Can you try the following:

>>>

>>> diff --git a/drivers/cpuidle/dt_idle_states.c

>>> b/drivers/cpuidle/dt_idle_states.c

>>> index d06d21a9525d..01367ddec49a 100644

>>> --- a/drivers/cpuidle/dt_idle_states.c

>>> +++ b/drivers/cpuidle/dt_idle_states.c

>>> @@ -13,6 +13,7 @@

>>>  #include <linux/errno.h>

>>>  #include <linux/kernel.h>

>>>  #include <linux/module.h>

>>> +#include <linux/cpu_cooling.h>

>>>  #include <linux/of.h>

>>>  #include <linux/of_device.h>

>>>

>>> @@ -205,6 +206,9 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,

>>>  			err = -EINVAL;

>>>  			break;

>>>  		}

>>> +

>>> +		cpuidle_of_cooling_register(state_node, drv);

>>> +

>>>  		of_node_put(state_node);

>>>  	}

>>>

>>> That's a hack for the moment.

>>>

>>

>> thanks. I could test that successfully. The only question would be: Is

>> is intentional how "non-aggressive" the cooling driver cools? I would

>> have expected it to basically inject more idle cycles earlier. I'd set

>> 75 degrees as trip point and at 85 degress is would only inject about 30

>> (of 100).


By the way, how many CPUs are injecting idle cycle when the mitigation
happens ?

>> You describe the "config values" in question in the documentation, but

>> I'm not sure what's the correct way to change them.

> 

> That is difficult to say without knowing the board behavior. Are you

> able to profile the temperature with the load? How fast the temperature

> increases? The aggressive behavior of the cooling device will depend on

> the governor which depends on the slope of the temperature increase and

> the sampling.

> 

> Can you give the pointer to the git tree with the DT definition of your

> board?

> 

> You can try by changing the idle duration to 10ms instead of the default

> 4ms.

> 

> You can also change the cooling states in the DT <&state 20 70>, so it

> will begin to mitigate at state 20. But I wouldn't recommend that.

> 

> Do you have the energy power model, so we can try with the IPA governor?

> 

> 

> 



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Martin Kepplinger Dec. 10, 2019, 8:57 a.m. UTC | #5
On 09.12.19 20:29, Daniel Lezcano wrote:
> On 09/12/2019 13:03, Daniel Lezcano wrote:

>> On 09/12/2019 10:54, Martin Kepplinger wrote:

>>>

>>>

>>> On 06.12.19 15:15, Daniel Lezcano wrote:

>>>> On 06/12/2019 12:33, Martin Kepplinger wrote:

>>>>> I tested this on the librem5-devkit and see the

>>>>> cooling devices in sysfs. I configure ARM_PSCI_CPUIDLE, not ARM_CPUIDLE and

>>>>> add the patch below in register the cooling device there. "psci_idle"

>>>>> is listed as the cpuidle_driver.

>>>>>

>>>>> That's what I'm running, in case you want to see it all:

>>>>> https://source.puri.sm/martin.kepplinger/linux-next/commits/next-20191205/librem5_cpuidle_mainline_atf

>>>>>

>>>>> so I add a trip temperature description like this:

>>>>> https://source.puri.sm/martin.kepplinger/linux-next/commit/361f49f93ae2c477fd012790831cabd0ed976660

>>>>>

>>>>> When I let the SoC heat up, cpuidle cooling won't kick it. In sysfs:

>>>>>

>>>>> catting the relevant files in /sys/class/thermal after heating up,

>>>>> if that makes sense:

>>>>>

>>>>> 87000

>>>>> 85000

>>>>> 85000

>>>>> thermal-cpufreq-0

>>>>> 1

>>>>> thermal-idle-0

>>>>> 0

>>>>> thermal-idle-1                                                                  

>>>>> 0                                                                               

>>>>> thermal-idle-2

>>>>> 0

>>>>> thermal-idle-3

>>>>> 0

>>>>>

>>>>> with ARM_CPUIDLE instead of ARM_PSCI_CPUIDLE (and registering the cooling dev

>>>>> during cpuidle-arm.c init) I won't have a cpuidle driver and thus no cpu-sleep

>>>>> state at all.

>>>>>

>>>>> Can you see where the problem here lies?

>>>>

>>>> Yes, I removed the registration via the DT.

>>>>

>>>> Can you try the following:

>>>>

>>>> diff --git a/drivers/cpuidle/dt_idle_states.c

>>>> b/drivers/cpuidle/dt_idle_states.c

>>>> index d06d21a9525d..01367ddec49a 100644

>>>> --- a/drivers/cpuidle/dt_idle_states.c

>>>> +++ b/drivers/cpuidle/dt_idle_states.c

>>>> @@ -13,6 +13,7 @@

>>>>  #include <linux/errno.h>

>>>>  #include <linux/kernel.h>

>>>>  #include <linux/module.h>

>>>> +#include <linux/cpu_cooling.h>

>>>>  #include <linux/of.h>

>>>>  #include <linux/of_device.h>

>>>>

>>>> @@ -205,6 +206,9 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,

>>>>  			err = -EINVAL;

>>>>  			break;

>>>>  		}

>>>> +

>>>> +		cpuidle_of_cooling_register(state_node, drv);

>>>> +

>>>>  		of_node_put(state_node);

>>>>  	}

>>>>

>>>> That's a hack for the moment.

>>>>

>>>

>>> thanks. I could test that successfully. The only question would be: Is

>>> is intentional how "non-aggressive" the cooling driver cools? I would

>>> have expected it to basically inject more idle cycles earlier. I'd set

>>> 75 degrees as trip point and at 85 degress is would only inject about 30

>>> (of 100).

> 

> By the way, how many CPUs are injecting idle cycle when the mitigation

> happens ?


all 4 are injecting the same.

> 

>>> You describe the "config values" in question in the documentation, but

>>> I'm not sure what's the correct way to change them.

>>

>> That is difficult to say without knowing the board behavior. Are you

>> able to profile the temperature with the load? How fast the temperature

>> increases? The aggressive behavior of the cooling device will depend on

>> the governor which depends on the slope of the temperature increase and

>> the sampling.

>>

>> Can you give the pointer to the git tree with the DT definition of your

>> board?


https://source.puri.sm/martin.kepplinger/linux-next/blob/next-20191205/librem5_cpuidle_mainline_atf/arch/arm64/boot/dts/freescale/imx8mq-librem5-devkit.dts

you can browse in that branch.

>>

>> You can try by changing the idle duration to 10ms instead of the default

>> 4ms.


where is that set?

>>

>> You can also change the cooling states in the DT <&state 20 70>, so it

>> will begin to mitigate at state 20. But I wouldn't recommend that.


where would we assign that? I'm not sure who reads that -.-
it's still something to consider, but a longer idle duration makes more
sense, yes.

>>

>> Do you have the energy power model, so we can try with the IPA governor?

>>

>>


thanks for the reminder. I'd look at that later.

                               martin
Daniel Lezcano Dec. 10, 2019, 9:15 a.m. UTC | #6
On 10/12/2019 09:57, Martin Kepplinger wrote:
> 

> 

> On 09.12.19 20:29, Daniel Lezcano wrote:

>> On 09/12/2019 13:03, Daniel Lezcano wrote:

>>> On 09/12/2019 10:54, Martin Kepplinger wrote:

>>>>

>>>>

>>>> On 06.12.19 15:15, Daniel Lezcano wrote:

>>>>> On 06/12/2019 12:33, Martin Kepplinger wrote:

>>>>>> I tested this on the librem5-devkit and see the

>>>>>> cooling devices in sysfs. I configure ARM_PSCI_CPUIDLE, not ARM_CPUIDLE and

>>>>>> add the patch below in register the cooling device there. "psci_idle"

>>>>>> is listed as the cpuidle_driver.

>>>>>>

>>>>>> That's what I'm running, in case you want to see it all:

>>>>>> https://source.puri.sm/martin.kepplinger/linux-next/commits/next-20191205/librem5_cpuidle_mainline_atf

>>>>>>

>>>>>> so I add a trip temperature description like this:

>>>>>> https://source.puri.sm/martin.kepplinger/linux-next/commit/361f49f93ae2c477fd012790831cabd0ed976660

>>>>>>

>>>>>> When I let the SoC heat up, cpuidle cooling won't kick it. In sysfs:

>>>>>>

>>>>>> catting the relevant files in /sys/class/thermal after heating up,

>>>>>> if that makes sense:

>>>>>>

>>>>>> 87000

>>>>>> 85000

>>>>>> 85000

>>>>>> thermal-cpufreq-0

>>>>>> 1

>>>>>> thermal-idle-0

>>>>>> 0

>>>>>> thermal-idle-1                                                                  

>>>>>> 0                                                                               

>>>>>> thermal-idle-2

>>>>>> 0

>>>>>> thermal-idle-3

>>>>>> 0

>>>>>>

>>>>>> with ARM_CPUIDLE instead of ARM_PSCI_CPUIDLE (and registering the cooling dev

>>>>>> during cpuidle-arm.c init) I won't have a cpuidle driver and thus no cpu-sleep

>>>>>> state at all.

>>>>>>

>>>>>> Can you see where the problem here lies?

>>>>>

>>>>> Yes, I removed the registration via the DT.

>>>>>

>>>>> Can you try the following:

>>>>>

>>>>> diff --git a/drivers/cpuidle/dt_idle_states.c

>>>>> b/drivers/cpuidle/dt_idle_states.c

>>>>> index d06d21a9525d..01367ddec49a 100644

>>>>> --- a/drivers/cpuidle/dt_idle_states.c

>>>>> +++ b/drivers/cpuidle/dt_idle_states.c

>>>>> @@ -13,6 +13,7 @@

>>>>>  #include <linux/errno.h>

>>>>>  #include <linux/kernel.h>

>>>>>  #include <linux/module.h>

>>>>> +#include <linux/cpu_cooling.h>

>>>>>  #include <linux/of.h>

>>>>>  #include <linux/of_device.h>

>>>>>

>>>>> @@ -205,6 +206,9 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,

>>>>>  			err = -EINVAL;

>>>>>  			break;

>>>>>  		}

>>>>> +

>>>>> +		cpuidle_of_cooling_register(state_node, drv);

>>>>> +

>>>>>  		of_node_put(state_node);

>>>>>  	}

>>>>>

>>>>> That's a hack for the moment.

>>>>>

>>>>

>>>> thanks. I could test that successfully. The only question would be: Is

>>>> is intentional how "non-aggressive" the cooling driver cools? I would

>>>> have expected it to basically inject more idle cycles earlier. I'd set

>>>> 75 degrees as trip point and at 85 degress is would only inject about 30

>>>> (of 100).

>>

>> By the way, how many CPUs are injecting idle cycle when the mitigation

>> happens ?

> 

> all 4 are injecting the same.

> 

>>

>>>> You describe the "config values" in question in the documentation, but

>>>> I'm not sure what's the correct way to change them.

>>>

>>> That is difficult to say without knowing the board behavior. Are you

>>> able to profile the temperature with the load? How fast the temperature

>>> increases? The aggressive behavior of the cooling device will depend on

>>> the governor which depends on the slope of the temperature increase and

>>> the sampling.

>>>

>>> Can you give the pointer to the git tree with the DT definition of your

>>> board?

> 

> https://source.puri.sm/martin.kepplinger/linux-next/blob/next-20191205/librem5_cpuidle_mainline_atf/arch/arm64/boot/dts/freescale/imx8mq-librem5-devkit.dts

> 

> you can browse in that branch.

> 

>>>

>>> You can try by changing the idle duration to 10ms instead of the default

>>> 4ms.

> 

> where is that set?


diff --git a/drivers/thermal/cpuidle_cooling.c
b/drivers/thermal/cpuidle_cooling.c
index 369c5c613f6b..0793e722b2d2 100644
--- a/drivers/thermal/cpuidle_cooling.c
+++ b/drivers/thermal/cpuidle_cooling.c
@@ -192,7 +192,7 @@ __init cpuidle_of_cooling_register(struct
device_node *np,
                goto out_id;
        }

-       idle_inject_set_duration(ii_dev, 0, TICK_USEC);
+       idle_inject_set_duration(ii_dev, 0, 10000);

        idle_cdev->ii_dev = ii_dev;


>>>

>>> You can also change the cooling states in the DT <&state 20 70>, so it

>>> will begin to mitigate at state 20. But I wouldn't recommend that.

> 

> where would we assign that? I'm not sure who reads that -.-> it's still something to consider, but a longer idle duration makes more

> sense, yes.

> 

>>>

>>> Do you have the energy power model, so we can try with the IPA governor?

>>>

>>>

> 

> thanks for the reminder. I'd look at that later.

> 

>                                martin

> 



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Daniel Lezcano Dec. 11, 2019, 9:33 p.m. UTC | #7
Hi Martin,

I've a bug in the code.


diff --git a/drivers/thermal/cpuidle_cooling.c
b/drivers/thermal/cpuidle_cooling.c
index 369c5c613f6b..628ad707f247 100644
--- a/drivers/thermal/cpuidle_cooling.c
+++ b/drivers/thermal/cpuidle_cooling.c
@@ -192,7 +192,7 @@ __init cpuidle_of_cooling_register(struct
device_node *np,
                goto out_id;
        }

-       idle_inject_set_duration(ii_dev, 0, TICK_USEC);
+       idle_inject_set_duration(ii_dev, TICK_USEC, TICK_USEC);

        idle_cdev->ii_dev = ii_dev;


Let me know if that solves your issue.

  -- Daniel

On 10/12/2019 09:57, Martin Kepplinger wrote:
> 

> 

> On 09.12.19 20:29, Daniel Lezcano wrote:

>> On 09/12/2019 13:03, Daniel Lezcano wrote:

>>> On 09/12/2019 10:54, Martin Kepplinger wrote:

>>>>

>>>>

>>>> On 06.12.19 15:15, Daniel Lezcano wrote:

>>>>> On 06/12/2019 12:33, Martin Kepplinger wrote:

>>>>>> I tested this on the librem5-devkit and see the

>>>>>> cooling devices in sysfs. I configure ARM_PSCI_CPUIDLE, not ARM_CPUIDLE and

>>>>>> add the patch below in register the cooling device there. "psci_idle"

>>>>>> is listed as the cpuidle_driver.

>>>>>>

>>>>>> That's what I'm running, in case you want to see it all:

>>>>>> https://source.puri.sm/martin.kepplinger/linux-next/commits/next-20191205/librem5_cpuidle_mainline_atf

>>>>>>

>>>>>> so I add a trip temperature description like this:

>>>>>> https://source.puri.sm/martin.kepplinger/linux-next/commit/361f49f93ae2c477fd012790831cabd0ed976660

>>>>>>

>>>>>> When I let the SoC heat up, cpuidle cooling won't kick it. In sysfs:

>>>>>>

>>>>>> catting the relevant files in /sys/class/thermal after heating up,

>>>>>> if that makes sense:

>>>>>>

>>>>>> 87000

>>>>>> 85000

>>>>>> 85000

>>>>>> thermal-cpufreq-0

>>>>>> 1

>>>>>> thermal-idle-0

>>>>>> 0

>>>>>> thermal-idle-1                                                                  

>>>>>> 0                                                                               

>>>>>> thermal-idle-2

>>>>>> 0

>>>>>> thermal-idle-3

>>>>>> 0

>>>>>>

>>>>>> with ARM_CPUIDLE instead of ARM_PSCI_CPUIDLE (and registering the cooling dev

>>>>>> during cpuidle-arm.c init) I won't have a cpuidle driver and thus no cpu-sleep

>>>>>> state at all.

>>>>>>

>>>>>> Can you see where the problem here lies?

>>>>>

>>>>> Yes, I removed the registration via the DT.

>>>>>

>>>>> Can you try the following:

>>>>>

>>>>> diff --git a/drivers/cpuidle/dt_idle_states.c

>>>>> b/drivers/cpuidle/dt_idle_states.c

>>>>> index d06d21a9525d..01367ddec49a 100644

>>>>> --- a/drivers/cpuidle/dt_idle_states.c

>>>>> +++ b/drivers/cpuidle/dt_idle_states.c

>>>>> @@ -13,6 +13,7 @@

>>>>>  #include <linux/errno.h>

>>>>>  #include <linux/kernel.h>

>>>>>  #include <linux/module.h>

>>>>> +#include <linux/cpu_cooling.h>

>>>>>  #include <linux/of.h>

>>>>>  #include <linux/of_device.h>

>>>>>

>>>>> @@ -205,6 +206,9 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,

>>>>>  			err = -EINVAL;

>>>>>  			break;

>>>>>  		}

>>>>> +

>>>>> +		cpuidle_of_cooling_register(state_node, drv);

>>>>> +

>>>>>  		of_node_put(state_node);

>>>>>  	}

>>>>>

>>>>> That's a hack for the moment.

>>>>>

>>>>

>>>> thanks. I could test that successfully. The only question would be: Is

>>>> is intentional how "non-aggressive" the cooling driver cools? I would

>>>> have expected it to basically inject more idle cycles earlier. I'd set

>>>> 75 degrees as trip point and at 85 degress is would only inject about 30

>>>> (of 100).

>>

>> By the way, how many CPUs are injecting idle cycle when the mitigation

>> happens ?

> 

> all 4 are injecting the same.

> 

>>

>>>> You describe the "config values" in question in the documentation, but

>>>> I'm not sure what's the correct way to change them.

>>>

>>> That is difficult to say without knowing the board behavior. Are you

>>> able to profile the temperature with the load? How fast the temperature

>>> increases? The aggressive behavior of the cooling device will depend on

>>> the governor which depends on the slope of the temperature increase and

>>> the sampling.

>>>

>>> Can you give the pointer to the git tree with the DT definition of your

>>> board?

> 

> https://source.puri.sm/martin.kepplinger/linux-next/blob/next-20191205/librem5_cpuidle_mainline_atf/arch/arm64/boot/dts/freescale/imx8mq-librem5-devkit.dts

> 

> you can browse in that branch.

> 

>>>

>>> You can try by changing the idle duration to 10ms instead of the default

>>> 4ms.

> 

> where is that set?

> 

>>>

>>> You can also change the cooling states in the DT <&state 20 70>, so it

>>> will begin to mitigate at state 20. But I wouldn't recommend that.

> 

> where would we assign that? I'm not sure who reads that -.-

> it's still something to consider, but a longer idle duration makes more

> sense, yes.

> 

>>>

>>> Do you have the energy power model, so we can try with the IPA governor?

>>>

>>>

> 

> thanks for the reminder. I'd look at that later.

> 

>                                martin

> 



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Martin Kepplinger Dec. 12, 2019, 7:34 a.m. UTC | #8
On 11.12.19 22:33, Daniel Lezcano wrote:
> 

> Hi Martin,

> 

> I've a bug in the code.

> 

> 

> diff --git a/drivers/thermal/cpuidle_cooling.c

> b/drivers/thermal/cpuidle_cooling.c

> index 369c5c613f6b..628ad707f247 100644

> --- a/drivers/thermal/cpuidle_cooling.c

> +++ b/drivers/thermal/cpuidle_cooling.c

> @@ -192,7 +192,7 @@ __init cpuidle_of_cooling_register(struct

> device_node *np,

>                 goto out_id;

>         }

> 

> -       idle_inject_set_duration(ii_dev, 0, TICK_USEC);

> +       idle_inject_set_duration(ii_dev, TICK_USEC, TICK_USEC);

> 

>         idle_cdev->ii_dev = ii_dev;

> 

> 

> Let me know if that solves your issue.

> 


It does. I now won't heat up over the hot trip point at all. That's what
I was expecting. thanks!

                                   martin
diff mbox series

Patch

diff --git a/Documentation/driver-api/thermal/exynos_thermal.rst b/Documentation/driver-api/thermal/exynos_thermal.rst
index 5bd556566c70..d4e4a5b75805 100644
--- a/Documentation/driver-api/thermal/exynos_thermal.rst
+++ b/Documentation/driver-api/thermal/exynos_thermal.rst
@@ -67,7 +67,7 @@  TMU driver description:
 The exynos thermal driver is structured as::
 
 					Kernel Core thermal framework
-				(thermal_core.c, step_wise.c, cpu_cooling.c)
+				(thermal_core.c, step_wise.c, cpufreq_cooling.c)
 								^
 								|
 								|
diff --git a/MAINTAINERS b/MAINTAINERS
index d2e92a0360f2..26e4be914765 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16194,7 +16194,7 @@  L:	linux-pm@vger.kernel.org
 S:	Supported
 F:	Documentation/driver-api/thermal/cpu-cooling-api.rst
 F:	Documentation/driver-api/thermal/cpu-idle-cooling.rst
-F:	drivers/thermal/cpu_cooling.c
+F:	drivers/thermal/cpufreq_cooling.c
 F:	drivers/thermal/cpuidle_cooling.c
 F:	include/linux/cpu_cooling.h
 
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 9c8aa2d4bd28..5c98472ffd8b 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -19,7 +19,7 @@  thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)	+= user_space.o
 thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)	+= power_allocator.o
 
 # cpufreq cooling
-thermal_sys-$(CONFIG_CPU_FREQ_THERMAL)	+= cpu_cooling.o
+thermal_sys-$(CONFIG_CPU_FREQ_THERMAL)	+= cpufreq_cooling.o
 thermal_sys-$(CONFIG_CPU_IDLE_THERMAL)	+= cpuidle_cooling.o
 
 # clock cooling
diff --git a/drivers/thermal/clock_cooling.c b/drivers/thermal/clock_cooling.c
index 3ad3256c48fd..7cb3ae4b44ee 100644
--- a/drivers/thermal/clock_cooling.c
+++ b/drivers/thermal/clock_cooling.c
@@ -7,7 +7,7 @@ 
  *  Copyright (C) 2013	Texas Instruments Inc.
  *  Contact:  Eduardo Valentin <eduardo.valentin@ti.com>
  *
- *  Highly based on cpu_cooling.c.
+ *  Highly based on cpufreq_cooling.c.
  *  Copyright (C) 2012	Samsung Electronics Co., Ltd(http://www.samsung.com)
  *  Copyright (C) 2012  Amit Daniel <amit.kachhap@linaro.org>
  */
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpufreq_cooling.c
similarity index 99%
rename from drivers/thermal/cpu_cooling.c
rename to drivers/thermal/cpufreq_cooling.c
index 6b9865c786ba..3a3f9cf94b6d 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpufreq_cooling.c
@@ -1,6 +1,6 @@ 
 // SPDX-License-Identifier: GPL-2.0
 /*
- *  linux/drivers/thermal/cpu_cooling.c
+ *  linux/drivers/thermal/cpufreq_cooling.c
  *
  *  Copyright (C) 2012	Samsung Electronics Co., Ltd(http://www.samsung.com)
  *
@@ -694,7 +694,7 @@  of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 	u32 capacitance = 0;
 
 	if (!np) {
-		pr_err("cpu_cooling: OF node not available for cpu%d\n",
+		pr_err("cpufreq_cooling: OF node not available for cpu%d\n",
 		       policy->cpu);
 		return NULL;
 	}
@@ -705,7 +705,7 @@  of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 
 		cdev = __cpufreq_cooling_register(np, policy, capacitance);
 		if (IS_ERR(cdev)) {
-			pr_err("cpu_cooling: cpu%d failed to register as cooling device: %ld\n",
+			pr_err("cpufreq_cooling: cpu%d failed to register as cooling device: %ld\n",
 			       policy->cpu, PTR_ERR(cdev));
 			cdev = NULL;
 		}
diff --git a/include/linux/clock_cooling.h b/include/linux/clock_cooling.h
index b5cebf766e02..4b0a69863656 100644
--- a/include/linux/clock_cooling.h
+++ b/include/linux/clock_cooling.h
@@ -7,7 +7,7 @@ 
  *  Copyright (C) 2013	Texas Instruments Inc.
  *  Contact:  Eduardo Valentin <eduardo.valentin@ti.com>
  *
- *  Highly based on cpu_cooling.c.
+ *  Highly based on cpufreq_cooling.c.
  *  Copyright (C) 2012	Samsung Electronics Co., Ltd(http://www.samsung.com)
  *  Copyright (C) 2012  Amit Daniel <amit.kachhap@linaro.org>
  */