[3/3] PM: EM: Increase energy calculation precision

Message ID 20210625152603.25960-4-lukasz.luba@arm.com
State New
Series Improve EAS energy estimation and increase precision

Commit Message

Lukasz Luba June 25, 2021, 3:26 p.m. UTC
The Energy Model (EM) provides useful information about device power in
each performance state to other subsystems, such as the Energy Aware
Scheduler (EAS). The energy calculation in EAS is based on the EM function
em_cpu_energy(). The current implementation of that function uses
em_perf_state::cost as a pre-computed cost coefficient equal to:
cost = power * max_frequency / frequency.
The 'power' is expressed in milli-Watts (or in an abstract scale).

There are corner cases when the EAS energy calculation for two Performance
Domains (PDs) returns the same value, e.g. 10mW. The EAS compares these
values to choose the smaller one. It might happen that these values are
equal due to rounding error. In such a scenario, we need better precision,
e.g. 10000 times better. To provide this, increase the precision of
em_perf_state::cost.

This patch avoids the errors from rounding to milli-Watts, which might
occur in the EAS energy estimation for each Performance Domain (PD). The
rounding error is common for small tasks, which have small utilization
values.

The rest of the EM code doesn't change, em_perf_state::power is still
expressed in milli-Watts (or in abstract scale). Thus, all existing
platforms don't have to change their reported power. The same applies to
EM clients, like thermal or DTPM (they use em_perf_state::power).

Reported-by: CCJ Yeh <CCj.Yeh@mediatek.com>
Suggested-by: CCJ Yeh <CCj.Yeh@mediatek.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 include/linux/energy_model.h | 5 ++++-
 kernel/power/energy_model.c  | 3 ++-
 2 files changed, 6 insertions(+), 2 deletions(-)

Comments

Dietmar Eggemann July 5, 2021, 12:45 p.m. UTC | #1
On 25/06/2021 17:26, Lukasz Luba wrote:
> The Energy Model (EM) provides useful information about device power in
> each performance state to other subsystems like: Energy Aware Scheduler
> (EAS). The energy calculation in EAS does arithmetic operation based on
> the EM em_cpu_energy(). Current implementation of that function uses
> em_perf_state::cost as a pre-computed cost coefficient equal to:
> cost = power * max_frequency / frequency.
> The 'power' is expressed in milli-Watts (or in abstract scale).
>
> There are corner cases then the EAS energy calculation for two Performance
            ^^^^^^^^^^^^

Again, an easy-to-understand example describing the situation in which
this change brings a benefit would help.

> Domains (PDs) return the same value, e.g. 10mW. The EAS compares these
> values to choose smaller one. It might happen that this values are equal
> due to rounding error. In such scenario, we need better precision, e.g.
> 10000 times better. To provide this possibility increase the precision on
> the em_perf_state::cost.
>
> This patch allows to avoid the rounding to milli-Watt errors, which might
> occur in EAS energy estimation for each Performance Domains (PD). The
> rounding error is common for small tasks which have small utilization
> values.

What's the influence of the CPU utilization 'cpu_util_next()' here?

compute_energy()
    em_cpu_energy()
            return ps->cost * sum_util / scale_cpu
                              ^^^^^^^^

> The rest of the EM code doesn't change, em_perf_state::power is still
> expressed in milli-Watts (or in abstract scale). Thus, all existing
> platforms don't have to change their reported power. The same applies to

Not only existing platforms, since there are no changes. So why
highlight `existing` here?

> EM clients, like thermal or DTPM (they use em_perf_state::power).
>
> Reported-by: CCJ Yeh <CCj.Yeh@mediatek.com>
> Suggested-by: CCJ Yeh <CCj.Yeh@mediatek.com>
> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
> ---
>  include/linux/energy_model.h | 5 ++++-
>  kernel/power/energy_model.c  | 3 ++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index 2016f5a706e0..91037dd57e61 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -16,7 +16,10 @@
>   * @power:	The power consumed at this level (by 1 CPU or by a registered
>   *		device). It can be a total power: static and dynamic.
>   * @cost:	The cost coefficient associated with this level, used during
> - *		energy calculation. Equal to: power * max_frequency / frequency
> + *		energy calculation. Equal to:
> +		power * 10000 * max_frequency / frequency
> + *		To increase the energy estimation presision use different
> + *		scale in this coefficient than in @power field.
>   */
>  struct em_perf_state {
>  	unsigned long frequency;
> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
> index 0f4530b3a8cd..2724f0ac417d 100644
> --- a/kernel/power/energy_model.c
> +++ b/kernel/power/energy_model.c
> @@ -170,7 +170,8 @@ static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd,
>  	/* Compute the cost of each performance state. */
>  	fmax = (u64) table[nr_states - 1].frequency;
>  	for (i = 0; i < nr_states; i++) {
> -		table[i].cost = div64_u64(fmax * table[i].power,
> +		u64 power_res = (u64)table[i].power * 10000;
> +		table[i].cost = div64_u64(fmax * power_res,
>  					  table[i].frequency);
>  	}
>
Lukasz Luba July 6, 2021, 7:51 p.m. UTC | #2
On 7/5/21 1:45 PM, Dietmar Eggemann wrote:
> On 25/06/2021 17:26, Lukasz Luba wrote:
>> The Energy Model (EM) provides useful information about device power in
>> each performance state to other subsystems like: Energy Aware Scheduler
>> (EAS). The energy calculation in EAS does arithmetic operation based on
>> the EM em_cpu_energy(). Current implementation of that function uses
>> em_perf_state::cost as a pre-computed cost coefficient equal to:
>> cost = power * max_frequency / frequency.
>> The 'power' is expressed in milli-Watts (or in abstract scale).
>>
>> There are corner cases then the EAS energy calculation for two Performance
>              ^^^^^^^^^^^^
>
> Again, an easy to understand example to describe in which situation this
> change would bring a benefit would help.
>
>> Domains (PDs) return the same value, e.g. 10mW. The EAS compares these
>> values to choose smaller one. It might happen that this values are equal
>> due to rounding error. In such scenario, we need better precision, e.g.
>> 10000 times better. To provide this possibility increase the precision on
>> the em_perf_state::cost.
>>
>> This patch allows to avoid the rounding to milli-Watt errors, which might
>> occur in EAS energy estimation for each Performance Domains (PD). The
>> rounding error is common for small tasks which have small utilization
>> values.
>
> What's the influence of the CPU utilization 'cpu_util_next()' here?
>
> compute_energy()
>      em_cpu_energy()
>              return ps->cost * sum_util / scale_cpu
>                                ^^^^^^^^


This is where the rounding error occurs. If sum_util is small and
scale_cpu is e.g. 1024, then we have a small fraction here.
It depends on the EM 'cost', but for most platforms we have small
power and cost values, so we suffer from this rounding.
The example that I gave in my response to patch 2/3 shows this.

>> The rest of the EM code doesn't change, em_perf_state::power is still
>> expressed in milli-Watts (or in abstract scale). Thus, all existing
>> platforms don't have to change their reported power. The same applies to
>
> Not only existing platforms since there are no changes. So why
> highlighting `existing` here.?


I just wanted to be clear that it doesn't affect existing platforms
at all. We don't require power to be reported at a finer resolution,
e.g. micro-Watts.
Also, the clients in the kernel won't be affected, since they use
the EM 'power' field, not 'cost'.

Patch

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index 2016f5a706e0..91037dd57e61 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -16,7 +16,10 @@ 
  * @power:	The power consumed at this level (by 1 CPU or by a registered
  *		device). It can be a total power: static and dynamic.
  * @cost:	The cost coefficient associated with this level, used during
- *		energy calculation. Equal to: power * max_frequency / frequency
+ *		energy calculation. Equal to:
+ *		power * 10000 * max_frequency / frequency
+ *		To increase the energy estimation precision use different
+ *		scale in this coefficient than in @power field.
  */
 struct em_perf_state {
 	unsigned long frequency;
diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index 0f4530b3a8cd..2724f0ac417d 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -170,7 +170,8 @@  static int em_create_perf_table(struct device *dev, struct em_perf_domain *pd,
 	/* Compute the cost of each performance state. */
 	fmax = (u64) table[nr_states - 1].frequency;
 	for (i = 0; i < nr_states; i++) {
-		table[i].cost = div64_u64(fmax * table[i].power,
+		u64 power_res = (u64)table[i].power * 10000;
+		table[i].cost = div64_u64(fmax * power_res,
 					  table[i].frequency);
 	}