diff mbox

[V2,3/3] sched: Move idle_stamp up to the core

Message ID 1391728237-4441-4-git-send-email-daniel.lezcano@linaro.org
State New

Commit Message

Daniel Lezcano Feb. 6, 2014, 11:10 p.m. UTC
idle_balance() modifies the idle_stamp field of the rq, causing this
information to be shared across core.c and fair.c. Since the previous patch
lets us know whether the cpu is going to idle, encapsulate the idle_stamp
handling in core.c by moving it up to the caller. idle_balance() returns
true if a balance occurred and the cpu won't be idle, and false if no
balance happened and the cpu is going idle.

Cc: mingo@kernel.org
Cc: alex.shi@linaro.org
Cc: peterz@infradead.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
---
 kernel/sched/core.c  |   13 +++++++++++--
 kernel/sched/fair.c  |   14 ++++++--------
 kernel/sched/sched.h |    8 +-------
 3 files changed, 18 insertions(+), 17 deletions(-)

Comments

Daniel Lezcano Feb. 11, 2014, 12:07 p.m. UTC | #1
On 02/10/2014 11:04 AM, Preeti Murthy wrote:
> Hi Daniel,
>
> On Fri, Feb 7, 2014 at 4:40 AM, Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
>> idle_balance() modifies the idle_stamp field of the rq, causing this
>> information to be shared across core.c and fair.c. Since the previous patch
>> lets us know whether the cpu is going to idle, encapsulate the idle_stamp
>> handling in core.c by moving it up to the caller. idle_balance() returns
>> true if a balance occurred and the cpu won't be idle, and false if no
>> balance happened and the cpu is going idle.
>>
>> Cc: mingo@kernel.org
>> Cc: alex.shi@linaro.org
>> Cc: peterz@infradead.org
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
>> ---
>>   kernel/sched/core.c  |   13 +++++++++++--
>>   kernel/sched/fair.c  |   14 ++++++--------
>>   kernel/sched/sched.h |    8 +-------
>>   3 files changed, 18 insertions(+), 17 deletions(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 16b97dd..428ee4c 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2704,8 +2704,17 @@ need_resched:
>>
>>          pre_schedule(rq, prev);
>>
>> -       if (unlikely(!rq->nr_running))
>> -               idle_balance(rq);
>> +#ifdef CONFIG_SMP
>> +       if (unlikely(!rq->nr_running)) {
>> +               /*
>> +                * We must set idle_stamp _before_ calling idle_balance(), such
>> +                * that we measure the duration of idle_balance() as idle time.
>
> Should not this be "such that we *do not* measure the duration of idle_balance()
> as idle time?"

Actually, the initial code included the time spent in idle balance 
processing in the idle stamp. When I moved the idle stamp to core.c, 
the idle balance time was no longer measured (an unwanted change). That 
has been fixed, and to prevent it from happening again, we added a comment.
Alex Shi Feb. 13, 2014, 7:53 a.m. UTC | #2
On 02/07/2014 07:10 AM, Daniel Lezcano wrote:
> idle_balance() modifies the idle_stamp field of the rq, causing this
> information to be shared across core.c and fair.c. Since the previous patch
> lets us know whether the cpu is going to idle, encapsulate the idle_stamp
> handling in core.c by moving it up to the caller. idle_balance() returns
> true if a balance occurred and the cpu won't be idle, and false if no
> balance happened and the cpu is going idle.
> 
> Cc: mingo@kernel.org
> Cc: alex.shi@linaro.org
> Cc: peterz@infradead.org
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>

Reviewed-by: Alex Shi <alex.shi@linaro.org>

Patch

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 16b97dd..428ee4c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2704,8 +2704,17 @@ need_resched:
 
 	pre_schedule(rq, prev);
 
-	if (unlikely(!rq->nr_running))
-		idle_balance(rq);
+#ifdef CONFIG_SMP
+	if (unlikely(!rq->nr_running)) {
+		/*
+		 * We must set idle_stamp _before_ calling idle_balance(), such
+		 * that we measure the duration of idle_balance() as idle time.
+		 */
+		rq->idle_stamp = rq_clock(rq);
+		if (idle_balance(rq))
+			rq->idle_stamp = 0;
+	}
+#endif
 
 	put_prev_task(rq, prev);
 	next = pick_next_task(rq);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5ebc681..04fea77 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6531,7 +6531,7 @@ out:
  * idle_balance is called by schedule() if this_cpu is about to become
  * idle. Attempts to pull tasks from other CPUs.
  */
-void idle_balance(struct rq *this_rq)
+int idle_balance(struct rq *this_rq)
 {
 	struct sched_domain *sd;
 	int pulled_task = 0;
@@ -6539,10 +6539,8 @@ void idle_balance(struct rq *this_rq)
 	u64 curr_cost = 0;
 	int this_cpu = this_rq->cpu;
 
-	this_rq->idle_stamp = rq_clock(this_rq);
-
 	if (this_rq->avg_idle < sysctl_sched_migration_cost)
-		return;
+		return 0;
 
 	/*
 	 * Drop the rq->lock, but keep IRQ/preempt disabled.
@@ -6580,10 +6578,8 @@ void idle_balance(struct rq *this_rq)
 		interval = msecs_to_jiffies(sd->balance_interval);
 		if (time_after(next_balance, sd->last_balance + interval))
 			next_balance = sd->last_balance + interval;
-		if (pulled_task) {
-			this_rq->idle_stamp = 0;
+		if (pulled_task)
 			break;
-		}
 	}
 	rcu_read_unlock();
 
@@ -6594,7 +6590,7 @@ void idle_balance(struct rq *this_rq)
 	 * A task could have be enqueued in the meantime
 	 */
 	if (this_rq->nr_running && !pulled_task)
-		return;
+		return 1;
 
 	if (pulled_task || time_after(jiffies, this_rq->next_balance)) {
 		/*
@@ -6606,6 +6602,8 @@ void idle_balance(struct rq *this_rq)
 
 	if (curr_cost > this_rq->max_idle_balance_cost)
 		this_rq->max_idle_balance_cost = curr_cost;
+
+	return pulled_task;
 }
 
 /*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 1436219..c08c070 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1176,17 +1176,11 @@ extern const struct sched_class idle_sched_class;
 extern void update_group_power(struct sched_domain *sd, int cpu);
 
 extern void trigger_load_balance(struct rq *rq);
-extern void idle_balance(struct rq *this_rq);
+extern int idle_balance(struct rq *this_rq);
 
 extern void idle_enter_fair(struct rq *this_rq);
 extern void idle_exit_fair(struct rq *this_rq);
 
-#else	/* CONFIG_SMP */
-
-static inline void idle_balance(struct rq *rq)
-{
-}
-
 #endif
 
 extern void sysrq_sched_debug_show(void);