diff mbox series

[RFC,4/4] sched: cpufreq: Apply DVFS headroom to CFS only

Message ID 20230820210640.585311-5-qyousef@layalina.io
State New
Headers show
Series Fix dvfs_headroom escaping uclamp constraints | expand

Commit Message

Qais Yousef Aug. 20, 2023, 9:06 p.m. UTC
RT and Deadline have exact performance requirement when running. RT runs
at max or a specific OPP defined by uclamp_min. Deadline's OPP is
defined by its bandwidth. Both of which are known ahead of time and
don't require a headroom to grow into.

IRQs on the other hand have no specific performance requirement and
cruises along at whatever the current OPP happens to be when they occur.

Now they all have PELT pressure signals that does impact frequency
selection and task placement. The question is do they need DVFS
headroom?

I think the answer is no because when CFS is not running at all, these
pressure signal has no real impact on performance for RT, DL or IRQ.

If CFS util is not zero, we already add their pressure as an
*additional* headroom to account for the lost/stolen time. So I argue
that the pressure are headroom themselves and shouldn't need an
additional DVFS headroom applied on top.

In summary final outcome should be:

	CFS + DVFS headroom + (RT, DT, IRQ) pressure headroom

Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
---
 kernel/sched/core.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

Comments

Dietmar Eggemann Aug. 21, 2023, 4:41 p.m. UTC | #1
On 20/08/2023 23:06, Qais Yousef wrote:
> RT and Deadline have exact performance requirement when running. RT runs
> at max or a specific OPP defined by uclamp_min. Deadline's OPP is
> defined by its bandwidth. Both of which are known ahead of time and
> don't require a headroom to grow into.
> 
> IRQs on the other hand have no specific performance requirement and
> cruises along at whatever the current OPP happens to be when they occur.
> 
> Now they all have PELT pressure signals that does impact frequency
> selection and task placement. The question is do they need DVFS
> headroom?
> 
> I think the answer is no because when CFS is not running at all, these
> pressure signal has no real impact on performance for RT, DL or IRQ.
> 
> If CFS util is not zero, we already add their pressure as an
> *additional* headroom to account for the lost/stolen time. So I argue
> that the pressure are headroom themselves and shouldn't need an
> additional DVFS headroom applied on top.
> 
> In summary final outcome should be:
> 
> 	CFS + DVFS headroom + (RT, DT, IRQ) pressure headroom

I assume here you want to align the difference that EAS deals with
`util_cfs` vs `capacity` whereas power deals with `util` vs
`capacity_orig`? You want that power should only apply the 1.25 to util_cfs?

> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
> ---
>  kernel/sched/core.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 441d433c83cd..602e369753a3 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7438,10 +7438,11 @@ unsigned long effective_cpu_util(int cpu, unsigned long util_cfs,
>  	 * When there are no CFS RUNNABLE tasks, clamps are released and
>  	 * frequency will be gracefully reduced with the utilization decay.
>  	 */
> -	util = util_cfs + cpu_util_rt(rq);
>  	if (type == FREQUENCY_UTIL) {
> -		util = apply_dvfs_headroom(util);
> +		util = apply_dvfs_headroom(util_cfs) + cpu_util_rt(rq);
>  		util = uclamp_rq_util_with(rq, util, p);
> +	} else {
> +		util = util_cfs + cpu_util_rt(rq);
>  	}
>  
>  	dl_util = cpu_util_dl(rq);
> @@ -7473,12 +7474,9 @@ unsigned long effective_cpu_util(int cpu, unsigned long util_cfs,
>  	 *              max - irq
>  	 *   U' = irq + --------- * U
>  	 *                 max
> -	 *
> -	 * We only need to apply dvfs headroom to irq part since the util part
> -	 * already had it applied.
>  	 */
>  	util = scale_irq_capacity(util, irq, max);
> -	util += type ==  FREQUENCY_UTIL ? apply_dvfs_headroom(irq) : irq;
> +	util += irq;
>  
>  	/*
>  	 * Bandwidth required by DEADLINE must always be granted while, for
> @@ -7491,7 +7489,7 @@ unsigned long effective_cpu_util(int cpu, unsigned long util_cfs,
>  	 * an interface. So, we only do the latter for now.
>  	 */
>  	if (type == FREQUENCY_UTIL)
> -		util += apply_dvfs_headroom(cpu_bw_dl(rq));
> +		util += cpu_bw_dl(rq);
>  
>  	return min(max, util);
>  }
Qais Yousef Aug. 26, 2023, 8:27 p.m. UTC | #2
On 08/21/23 18:41, Dietmar Eggemann wrote:
> On 20/08/2023 23:06, Qais Yousef wrote:
> > RT and Deadline have exact performance requirement when running. RT runs
> > at max or a specific OPP defined by uclamp_min. Deadline's OPP is
> > defined by its bandwidth. Both of which are known ahead of time and
> > don't require a headroom to grow into.
> > 
> > IRQs on the other hand have no specific performance requirement and
> > cruises along at whatever the current OPP happens to be when they occur.
> > 
> > Now they all have PELT pressure signals that does impact frequency
> > selection and task placement. The question is do they need DVFS
> > headroom?
> > 
> > I think the answer is no because when CFS is not running at all, these
> > pressure signal has no real impact on performance for RT, DL or IRQ.
> > 
> > If CFS util is not zero, we already add their pressure as an
> > *additional* headroom to account for the lost/stolen time. So I argue
> > that the pressure are headroom themselves and shouldn't need an
> > additional DVFS headroom applied on top.
> > 
> > In summary final outcome should be:
> > 
> > 	CFS + DVFS headroom + (RT, DT, IRQ) pressure headroom
> 
> I assume here you want to align the difference that EAS deals with

This function is used on all systems that use schedutil - EAS being one of them
but not the only one. The definition isn't, and shouldn't, be tied to EAS.
I'm certainly intending this change for all possible users of schedutil.

> `util_cfs` vs `capacity` whereas power deals with `util` vs
> `capacity_orig`? You want that power should only apply the 1.25 to util_cfs?

I don't get what you're saying. But I think it's similar to what I'm saying.

To clarify. What I'm saying is that when we try to calculate the effective
util, CFS is the only entity in practice that interacts with DVFS. DL and RT by
design 'disable' DVFS and when they become runnable set the frequency to
a constant fixed point.  For them DVFS latencies are not acceptable - although
in practice they do take a single hit for the freq change on wake up. IRQ on
the other hand doesn't really care about DVFS. So we end up in practice that
CFS is the only entity that interacts with DVFS, so when we calculate the
DVFS headroom, we should only take its util into account.


Thanks!

--
Qais Yousef
diff mbox series

Patch

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 441d433c83cd..602e369753a3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7438,10 +7438,11 @@  unsigned long effective_cpu_util(int cpu, unsigned long util_cfs,
 	 * When there are no CFS RUNNABLE tasks, clamps are released and
 	 * frequency will be gracefully reduced with the utilization decay.
 	 */
-	util = util_cfs + cpu_util_rt(rq);
 	if (type == FREQUENCY_UTIL) {
-		util = apply_dvfs_headroom(util);
+		util = apply_dvfs_headroom(util_cfs) + cpu_util_rt(rq);
 		util = uclamp_rq_util_with(rq, util, p);
+	} else {
+		util = util_cfs + cpu_util_rt(rq);
 	}
 
 	dl_util = cpu_util_dl(rq);
@@ -7473,12 +7474,9 @@  unsigned long effective_cpu_util(int cpu, unsigned long util_cfs,
 	 *              max - irq
 	 *   U' = irq + --------- * U
 	 *                 max
-	 *
-	 * We only need to apply dvfs headroom to irq part since the util part
-	 * already had it applied.
 	 */
 	util = scale_irq_capacity(util, irq, max);
-	util += type ==  FREQUENCY_UTIL ? apply_dvfs_headroom(irq) : irq;
+	util += irq;
 
 	/*
 	 * Bandwidth required by DEADLINE must always be granted while, for
@@ -7491,7 +7489,7 @@  unsigned long effective_cpu_util(int cpu, unsigned long util_cfs,
 	 * an interface. So, we only do the latter for now.
 	 */
 	if (type == FREQUENCY_UTIL)
-		util += apply_dvfs_headroom(cpu_bw_dl(rq));
+		util += cpu_bw_dl(rq);
 
 	return min(max, util);
 }