[1/2] arm64: dts: rockchip: Fix multiple thermal zones conflict in rk3399.dtsi

Message ID 20190604165802.7338-1-daniel.lezcano@linaro.org
State New
Headers show
Series
  • [1/2] arm64: dts: rockchip: Fix multiple thermal zones conflict in rk3399.dtsi
Related show

Commit Message

Daniel Lezcano June 4, 2019, 4:57 p.m.
Currently the common thermal zones definitions for the rk3399 assumes
multiple thermal zones are supported by the governors. This is not the
case and each thermal zone has its own governor instance acting
individually without collaboration with other governors.

As the cooling device for the CPU and the GPU thermal zones is the
same, each governors take different decisions for the same cooling
device leading to conflicting instructions and an erratic behavior.

As the cooling-maps is about to become an optional property, let's
remove the cpu cooling device map from the GPU thermal zone.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

---
 arch/arm64/boot/dts/rockchip/rk3399.dtsi | 9 ---------
 1 file changed, 9 deletions(-)

-- 
2.17.1

Comments

Daniel Lezcano June 8, 2019, 10:05 p.m. | #1
On 08/06/2019 00:18, Linus Walleij wrote:
> On Tue, Jun 4, 2019 at 6:58 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:

> 

>> The intelligent power allocator PID coefficient to be set in sysfs

>> are:

>>

>>     k_d: 0

>>     k_po: 79

>>     k_i: 10

>>     k_pu: 50

> 

> With all the other interesting parametrization in the device tree

> I kind of wonder why the PID regulator constants defaults are

> not set up from device tree?

> 

> Any specific reason?


None I'm aware of. I guess these constants are considered as tweak
values and not hardware related.

> To me it seems like the kind of stuff userpace will invariably just

> get wrong or forget about (somebody just runs a different

> distribution without the extra magic to set sysfs right) unless

> we supply good defaults.


I agree.

-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Robin Murphy June 14, 2019, 10:09 a.m. | #2
On 14/06/2019 10:35, Heiko Stuebner wrote:
> Hi Daniel,

> 

> Am Dienstag, 4. Juni 2019, 18:57:57 CEST schrieb Daniel Lezcano:

>> Currently the common thermal zones definitions for the rk3399 assumes

>> multiple thermal zones are supported by the governors. This is not the

>> case and each thermal zone has its own governor instance acting

>> individually without collaboration with other governors.

>>

>> As the cooling device for the CPU and the GPU thermal zones is the

>> same, each governors take different decisions for the same cooling

>> device leading to conflicting instructions and an erratic behavior.

>>

>> As the cooling-maps is about to become an optional property, let's

>> remove the cpu cooling device map from the GPU thermal zone.

>>

>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

>> ---

>>   arch/arm64/boot/dts/rockchip/rk3399.dtsi | 9 ---------

>>   1 file changed, 9 deletions(-)

>>

>> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>> index 196ac9b78076..e1357e0f60f7 100644

>> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>> @@ -821,15 +821,6 @@

>>   					type = "critical";

>>   				};

>>   			};

>> -

>> -			cooling-maps {

>> -				map0 {

>> -					trip = <&gpu_alert0>;

>> -					cooling-device =

>> -						<&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,

>> -						<&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;

>> -				};

>> -			};

>>   		};

>>   	};

> 

> my knowledge of the thermal framework is not that big, but what about the

> rk3399-devices which further detail the cooling-maps like rk3399-gru-kevin

> and the rk3399-nanopc-t4 with its fan-handling in the cooling-maps?


FWIW, my knowledge of thermal is probably even less :)

For NanoPC-T4 I think I more or less just took Odroid-XU3/4 as the best 
pwm-fan example and adapted that into the existing RK3399 zones in the 
manner which seemed most logical to my interpretation - if what was 
there wasn't right to begin with, then I may well have done that wrong too.

Robin.
Daniel Lezcano June 14, 2019, 1:03 p.m. | #3
On 14/06/2019 11:35, Heiko Stuebner wrote:
> Hi Daniel,

> 

> Am Dienstag, 4. Juni 2019, 18:57:57 CEST schrieb Daniel Lezcano:

>> Currently the common thermal zones definitions for the rk3399 assumes

>> multiple thermal zones are supported by the governors. This is not the

>> case and each thermal zone has its own governor instance acting

>> individually without collaboration with other governors.

>>

>> As the cooling device for the CPU and the GPU thermal zones is the

>> same, each governors take different decisions for the same cooling

>> device leading to conflicting instructions and an erratic behavior.

>>

>> As the cooling-maps is about to become an optional property, let's

>> remove the cpu cooling device map from the GPU thermal zone.

>>

>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

>> ---

>>  arch/arm64/boot/dts/rockchip/rk3399.dtsi | 9 ---------

>>  1 file changed, 9 deletions(-)

>>

>> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>> index 196ac9b78076..e1357e0f60f7 100644

>> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>> @@ -821,15 +821,6 @@

>>  					type = "critical";

>>  				};

>>  			};

>> -

>> -			cooling-maps {

>> -				map0 {

>> -					trip = <&gpu_alert0>;

>> -					cooling-device =

>> -						<&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,

>> -						<&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;

>> -				};

>> -			};

>>  		};

>>  	};

> 

> my knowledge of the thermal framework is not that big, but what about the

> rk3399-devices which further detail the cooling-maps like rk3399-gru-kevin

> and the rk3399-nanopc-t4 with its fan-handling in the cooling-maps?


The rk3399-gru-kevin is correct.

The rk3399-nanopc-t4 is not correct because the cpu and the gpu are
sharing the same cooling device (the fan). There are different
configurations:

1. The cpu cooling device for the CPU and the fan for the GPU

2. Different trip points on the CPU thermal zone, eg. one to for the CPU
cooling device and another one for the fan.

There are some variant for the above. If this board is not on battery,
you may want to give priority to the throughput, so activate the fan
first and then cool down the CPU. Or if you are on battery, you may want
to invert the trip points.

In any case, it is not possible to share the same cooling device for
different thermal zones.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Robin Murphy June 14, 2019, 2:02 p.m. | #4
On 14/06/2019 14:03, Daniel Lezcano wrote:
> On 14/06/2019 11:35, Heiko Stuebner wrote:

>> Hi Daniel,

>>

>> Am Dienstag, 4. Juni 2019, 18:57:57 CEST schrieb Daniel Lezcano:

>>> Currently the common thermal zones definitions for the rk3399 assumes

>>> multiple thermal zones are supported by the governors. This is not the

>>> case and each thermal zone has its own governor instance acting

>>> individually without collaboration with other governors.

>>>

>>> As the cooling device for the CPU and the GPU thermal zones is the

>>> same, each governors take different decisions for the same cooling

>>> device leading to conflicting instructions and an erratic behavior.

>>>

>>> As the cooling-maps is about to become an optional property, let's

>>> remove the cpu cooling device map from the GPU thermal zone.

>>>

>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

>>> ---

>>>   arch/arm64/boot/dts/rockchip/rk3399.dtsi | 9 ---------

>>>   1 file changed, 9 deletions(-)

>>>

>>> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>>> index 196ac9b78076..e1357e0f60f7 100644

>>> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>>> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>>> @@ -821,15 +821,6 @@

>>>   					type = "critical";

>>>   				};

>>>   			};

>>> -

>>> -			cooling-maps {

>>> -				map0 {

>>> -					trip = <&gpu_alert0>;

>>> -					cooling-device =

>>> -						<&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,

>>> -						<&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;

>>> -				};

>>> -			};

>>>   		};

>>>   	};

>>

>> my knowledge of the thermal framework is not that big, but what about the

>> rk3399-devices which further detail the cooling-maps like rk3399-gru-kevin

>> and the rk3399-nanopc-t4 with its fan-handling in the cooling-maps?

> 

> The rk3399-gru-kevin is correct.

> 

> The rk3399-nanopc-t4 is not correct because the cpu and the gpu are

> sharing the same cooling device (the fan). There are different

> configurations:

> 

> 1. The cpu cooling device for the CPU and the fan for the GPU

> 

> 2. Different trip points on the CPU thermal zone, eg. one to for the CPU

> cooling device and another one for the fan.

> 

> There are some variant for the above. If this board is not on battery,

> you may want to give priority to the throughput, so activate the fan

> first and then cool down the CPU. Or if you are on battery, you may want

> to invert the trip points.

> 

> In any case, it is not possible to share the same cooling device for

> different thermal zones.


OK, thanks for the clarification. I'll get my board set up again to 
figure out the best fix for rk3399-nanopc-t4 (FWIW most users are 
probably just using passive cooling or a plain DC fan anyway). You might 
want to raise this issue with the maintainers of 
arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi, since the 
everything-shared-by-everything approach in there was what I used as a 
reference.

Robin.
Daniel Lezcano June 14, 2019, 2:30 p.m. | #5
On 14/06/2019 16:02, Robin Murphy wrote:
> On 14/06/2019 14:03, Daniel Lezcano wrote:

>> On 14/06/2019 11:35, Heiko Stuebner wrote:

>>> Hi Daniel,

>>>

>>> Am Dienstag, 4. Juni 2019, 18:57:57 CEST schrieb Daniel Lezcano:

>>>> Currently the common thermal zones definitions for the rk3399 assumes

>>>> multiple thermal zones are supported by the governors. This is not the

>>>> case and each thermal zone has its own governor instance acting

>>>> individually without collaboration with other governors.

>>>>

>>>> As the cooling device for the CPU and the GPU thermal zones is the

>>>> same, each governors take different decisions for the same cooling

>>>> device leading to conflicting instructions and an erratic behavior.

>>>>

>>>> As the cooling-maps is about to become an optional property, let's

>>>> remove the cpu cooling device map from the GPU thermal zone.

>>>>

>>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

>>>> ---

>>>>   arch/arm64/boot/dts/rockchip/rk3399.dtsi | 9 ---------

>>>>   1 file changed, 9 deletions(-)

>>>>

>>>> diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>>>> b/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>>>> index 196ac9b78076..e1357e0f60f7 100644

>>>> --- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>>>> +++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi

>>>> @@ -821,15 +821,6 @@

>>>>                       type = "critical";

>>>>                   };

>>>>               };

>>>> -

>>>> -            cooling-maps {

>>>> -                map0 {

>>>> -                    trip = <&gpu_alert0>;

>>>> -                    cooling-device =

>>>> -                        <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,

>>>> -                        <&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;

>>>> -                };

>>>> -            };

>>>>           };

>>>>       };

>>>

>>> my knowledge of the thermal framework is not that big, but what about

>>> the

>>> rk3399-devices which further detail the cooling-maps like

>>> rk3399-gru-kevin

>>> and the rk3399-nanopc-t4 with its fan-handling in the cooling-maps?

>>

>> The rk3399-gru-kevin is correct.

>>

>> The rk3399-nanopc-t4 is not correct because the cpu and the gpu are

>> sharing the same cooling device (the fan). There are different

>> configurations:

>>

>> 1. The cpu cooling device for the CPU and the fan for the GPU

>>

>> 2. Different trip points on the CPU thermal zone, eg. one to for the CPU

>> cooling device and another one for the fan.

>>

>> There are some variant for the above. If this board is not on battery,

>> you may want to give priority to the throughput, so activate the fan

>> first and then cool down the CPU. Or if you are on battery, you may want

>> to invert the trip points.

>>

>> In any case, it is not possible to share the same cooling device for

>> different thermal zones.

> 

> OK, thanks for the clarification. I'll get my board set up again to

> figure out the best fix for rk3399-nanopc-t4 (FWIW most users are

> probably just using passive cooling or a plain DC fan anyway). You might

> want to raise this issue with the maintainers of

> arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi, since the

> everything-shared-by-everything approach in there was what I used as a

> reference.


Cc'ed: Kukjin Kim and Krzysztof Kozlowski

Easy :)


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

Patch

diff --git a/arch/arm64/boot/dts/rockchip/rk3399.dtsi b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
index 196ac9b78076..e1357e0f60f7 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399.dtsi
@@ -821,15 +821,6 @@ 
 					type = "critical";
 				};
 			};
-
-			cooling-maps {
-				map0 {
-					trip = <&gpu_alert0>;
-					cooling-device =
-						<&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
-						<&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
-				};
-			};
 		};
 	};