
[v3,07/12] sched: test the cpu's capacity in wake affine

Message ID 1404144343-18720-8-git-send-email-vincent.guittot@linaro.org
State New
Headers show

Commit Message

Vincent Guittot June 30, 2014, 4:05 p.m. UTC
Currently the task always wakes affine on this_cpu if the latter is idle.
Before waking up the task on this_cpu, we check that this_cpu's capacity is
not significantly reduced by RT tasks or irq activity.

Use cases with a large number of irqs and/or a lot of time spent under irq
will benefit from this, because a task that is woken up by an irq or softirq
will not run on the same CPU as that irq (and softirq) but on an idle CPU
that shares its cache.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

Comments

Rik van Riel July 9, 2014, 3:12 a.m. UTC | #1
On 06/30/2014 12:05 PM, Vincent Guittot wrote:
> Currently the task always wakes affine on this_cpu if the latter is
> idle. Before waking up the task on this_cpu, we check that this_cpu's
> capacity is not significantly reduced by RT tasks or irq activity.
> 
> Use cases with a large number of irqs and/or a lot of time spent
> under irq will benefit from this, because a task that is woken up by
> an irq or softirq will not run on the same CPU as that irq (and
> softirq) but on an idle CPU that shares its cache.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>

Acked-by: Rik van Riel <riel@redhat.com>


--
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Peter Zijlstra July 10, 2014, 11:06 a.m. UTC | #2
On Mon, Jun 30, 2014 at 06:05:38PM +0200, Vincent Guittot wrote:
> Currently the task always wakes affine on this_cpu if the latter is idle.
> Before waking up the task on this_cpu, we check that this_cpu's capacity
> is not significantly reduced by RT tasks or irq activity.
> 
> Use cases with a large number of irqs and/or a lot of time spent under
> irq will benefit from this, because a task that is woken up by an irq or
> softirq will not run on the same CPU as that irq (and softirq) but on an
> idle CPU that shares its cache.

The above doesn't seem to explain:

> +	} else if (!(sd->flags & SD_SHARE_PKG_RESOURCES)) {
> +		this_eff_load = 0;
> +	}
> +
> +	balanced = this_eff_load <= prev_eff_load;

this. Why would you unconditionally allow wake_affine across cache
domains?
Vincent Guittot July 10, 2014, 1:58 p.m. UTC | #3
On 10 July 2014 13:06, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, Jun 30, 2014 at 06:05:38PM +0200, Vincent Guittot wrote:
>> Currently the task always wakes affine on this_cpu if the latter is idle.
>> Before waking up the task on this_cpu, we check that this_cpu's capacity
>> is not significantly reduced by RT tasks or irq activity.
>>
>> Use cases with a large number of irqs and/or a lot of time spent under
>> irq will benefit from this, because a task that is woken up by an irq or
>> softirq will not run on the same CPU as that irq (and softirq) but on an
>> idle CPU that shares its cache.
>
> The above doesn't seem to explain:
>
>> +     } else if (!(sd->flags & SD_SHARE_PKG_RESOURCES)) {
>> +             this_eff_load = 0;
>> +     }
>> +
>> +     balanced = this_eff_load <= prev_eff_load;
>
> this. Why would you unconditionally allow wake_affine across cache
> domains?

The current policy is to use this_cpu if this_load <= 0. I want to
keep the current wake affine policy for all sched_domains that don't
share cache, so if the involved CPUs don't share a cache, I clear
this_eff_load to force balanced to be true. But when the CPUs share
their cache, we not only look at idleness but also at the available
capacity of the prev and local CPUs.

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7dff578..a23c938 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4240,6 +4240,7 @@  static int wake_wide(struct task_struct *p)
 static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 {
 	s64 this_load, load;
+	s64 this_eff_load, prev_eff_load;
 	int idx, this_cpu, prev_cpu;
 	struct task_group *tg;
 	unsigned long weight;
@@ -4283,21 +4284,23 @@  static int wake_affine(struct sched_domain *sd, struct task_struct *p, int sync)
 	 * Otherwise check if either cpus are near enough in load to allow this
 	 * task to be woken on this_cpu.
 	 */
-	if (this_load > 0) {
-		s64 this_eff_load, prev_eff_load;
+	this_eff_load = 100;
+	this_eff_load *= capacity_of(prev_cpu);
+
+	prev_eff_load = 100 + (sd->imbalance_pct - 100) / 2;
+	prev_eff_load *= capacity_of(this_cpu);
 
-		this_eff_load = 100;
-		this_eff_load *= capacity_of(prev_cpu);
+	if (this_load > 0) {
 		this_eff_load *= this_load +
 			effective_load(tg, this_cpu, weight, weight);
 
-		prev_eff_load = 100 + (sd->imbalance_pct - 100) / 2;
-		prev_eff_load *= capacity_of(this_cpu);
 		prev_eff_load *= load + effective_load(tg, prev_cpu, 0, weight);
+	} else if (!(sd->flags & SD_SHARE_PKG_RESOURCES)) {
+		this_eff_load = 0;
+	}
+
+	balanced = this_eff_load <= prev_eff_load;
 
-		balanced = this_eff_load <= prev_eff_load;
-	} else
-		balanced = true;
 	schedstat_inc(p, se.statistics.nr_wakeups_affine_attempts);
 
 	if (!balanced)