
[v2,09/11] sched: test the cpu's capacity in wake affine

Message ID 1400860385-14555-10-git-send-email-vincent.guittot@linaro.org
State New
Headers show

Commit Message

Vincent Guittot May 23, 2014, 3:53 p.m. UTC
Currently the task always wakes affine on this_cpu if the latter is idle.
Before waking up the task on this_cpu, we now check that this_cpu's capacity
is not significantly reduced by RT tasks or irq activity.

Use cases where the number of irqs, or the time spent under irq, is
significant will benefit from this: a task that is woken up by an irq or
softirq will not use the same CPU as the irq (and softirq) handling but an
idle one that shares its LLC.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

Comments

Peter Zijlstra May 28, 2014, 10:58 a.m. UTC | #1
On Fri, May 23, 2014 at 05:53:03PM +0200, Vincent Guittot wrote:
> Currently the task always wakes affine on this_cpu if the latter is idle.
> Before waking up the task on this_cpu, we check that this_cpu capacity is not
> significantly reduced because of RT tasks or irq activity.
> 
> Use case where the number of irq and the time spent under irq is important
> will take benefit of this because the task that is woken up by irq or softirq
> will not use the same CPU than irq (and softirq) but a idle one which share
> its LLC.

OK, so I'm having a terrible time parsing the above.

So looking at the patch, you make balance false, even when this_load == 0,
whenever the effective power/capacity (Nico's patches haven't fully sunk in
yet) of this cpu is less than that of the previous cpu.

Is that right?

Now I'm only struggling to understand the rationale for this; it's got
LLC in there somewhere, but I'm failing to comprehend it.
Vincent Guittot May 28, 2014, 11:15 a.m. UTC | #2
On 28 May 2014 12:58, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, May 23, 2014 at 05:53:03PM +0200, Vincent Guittot wrote:
>> Currently the task always wakes affine on this_cpu if the latter is idle.
>> Before waking up the task on this_cpu, we check that this_cpu capacity is not
>> significantly reduced because of RT tasks or irq activity.
>>
>> Use case where the number of irq and the time spent under irq is important
>> will take benefit of this because the task that is woken up by irq or softirq
>> will not use the same CPU than irq (and softirq) but a idle one which share
>> its LLC.
>
> OK, so I'm having a terrible time parsing the above.
>
> So looking at the patch, you make balance false, even when this_load == 0,
> whenever the effective power/capacity (Nico's patches haven't fully sunk in
> yet) of this cpu is less than that of the previous cpu.
>
> Is that right?

yes,

>
> Now I'm only struggling to understand the rationale for this; it's got
> LLC in there somewhere, but I'm failing to comprehend it.

Ah.. I had assumed that wake_affine was only done at the MC or SMT
level, but after reading more deeply into the flags configuration of
all sched_domain levels, my assumption is not true.
So I need to test the sd flags to make sure that the CPUs share their
cache at this level
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Vincent Guittot Nov. 24, 2014, 1:23 p.m. UTC | #3
On 24 November 2014 at 01:34, Wanpeng Li <kernellwp@gmail.com> wrote:
> Hi Vincent,
> On 5/28/14, 7:15 PM, Vincent Guittot wrote:
>>
>> On 28 May 2014 12:58, Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>> On Fri, May 23, 2014 at 05:53:03PM +0200, Vincent Guittot wrote:
>
> [snip]
>>>
>>> Now I'm only struggling to understand the rationale for this; it's got
>>> LLC in there somewhere, but I'm failing to comprehend it.
>>
>> Ah.. I had assumed that wake_affine was only done at the MC or SMT
>> level, but after reading more deeply into the flags configuration of
>> all sched_domain levels, my assumption is not true.
>
>
> At what level should wake_affine happen?

In the 1st version of this patch, it was proposed to test the cpu_capacity
of an idle CPU only if the CPUs were sharing their cache

Regards,
Vincent
>
> Regards,
> Wanpeng Li
>
>> So I need to test the sd flags to make sure that the CPUs share their
>> cache at this level
>
>

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c01d8b6..e8a30f9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4241,6 +4241,7 @@  static int wake_wide(struct task_struct *p)
 static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 {
 	s64 this_load, load;
+	s64 this_eff_load, prev_eff_load;
 	int idx, this_cpu, prev_cpu;
 	struct task_group *tg;
 	unsigned long weight;
@@ -4284,21 +4285,21 @@  static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 	 * Otherwise check if either cpus are near enough in load to allow this
 	 * task to be woken on this_cpu.
 	 */
-	if (this_load > 0) {
-		s64 this_eff_load, prev_eff_load;
+	this_eff_load = 100;
+	this_eff_load *= power_of(prev_cpu);
+
+	prev_eff_load = 100 + (sd->imbalance_pct - 100) / 2;
+	prev_eff_load *= power_of(this_cpu);
 
-		this_eff_load = 100;
-		this_eff_load *= power_of(prev_cpu);
+	if (this_load > 0) {
 		this_eff_load *= this_load +
 			effective_load(tg, this_cpu, weight, weight);
 
-		prev_eff_load = 100 + (sd->imbalance_pct - 100) / 2;
-		prev_eff_load *= power_of(this_cpu);
 		prev_eff_load *= load + effective_load(tg, prev_cpu, 0, weight);
+	}
+
+	balanced = this_eff_load <= prev_eff_load;
 
-		balanced = this_eff_load <= prev_eff_load;
-	} else
-		balanced = true;
 	schedstat_inc(p, se.statistics.nr_wakeups_affine_attempts);
 
 	if (!balanced)