From patchwork Tue Feb 12 13:23:54 2013
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 14762
From: Vincent Guittot <vincent.guittot@linaro.org>
To: linux-kernel@vger.kernel.org, linaro-dev@lists.linaro.org,
 peterz@infradead.org, mingo@kernel.org, fweisbec@gmail.com,
 rostedt@goodmis.org, efault@gmx.de
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH v2] sched: fix wrong rq's runnable_avg update with rt task
Date: Tue, 12 Feb 2013 14:23:54 +0100
Message-Id: <1360675434-4259-1-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 1.7.9.5

When an RT task is scheduled on an idle CPU, the rq's load is not updated
because the CFS functions are not called. idle_balance, which runs just
before entering the idle function, then updates the rq's load on the
assumption that all the time elapsed since the last update was running
time. As a result, the rq's load of a CPU that only runs a periodic RT
task stays close to LOAD_AVG_MAX, whatever the actual running duration of
the RT task is.

A new idle_exit function is called when the prev task is the idle task, so
the elapsed time is accounted as idle time in the rq's load.

Changes since V1:
- move code out of the schedule function and create a pre_schedule
  callback for the idle class instead.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c      | 10 ++++++++++
 kernel/sched/idle_task.c |  7 +++++++
 kernel/sched/sched.h     |  5 +++++
 3 files changed, 22 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 81fa536..60951f1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1562,6 +1562,16 @@ static inline void dequeue_entity_load_avg(struct cfs_rq *cfs_rq,
 		se->avg.decay_count = atomic64_read(&cfs_rq->decay_counter);
 	} /* migrations, e.g. sleep=0 leave decay_count == 0 */
 }
+
+/*
+ * Update the rq's load with the elapsed idle time before a task is
+ * scheduled. If the newly scheduled task is not a CFS task, idle_exit will
+ * be the only way to update the runnable statistic.
+ */
+void idle_exit(int this_cpu, struct rq *this_rq)
+{
+	update_rq_runnable_avg(this_rq, 0);
+}
 #else
 static inline void update_entity_load_avg(struct sched_entity *se,
 					  int update_cfs_rq) {}
diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
index b6baf37..27cd379 100644
--- a/kernel/sched/idle_task.c
+++ b/kernel/sched/idle_task.c
@@ -13,6 +13,12 @@ select_task_rq_idle(struct task_struct *p, int sd_flag, int flags)
 {
 	return task_cpu(p); /* IDLE tasks as never migrated */
 }
+
+static void pre_schedule_idle(struct rq *rq, struct task_struct *prev)
+{
+	/* Update rq's load with elapsed idle time */
+	idle_exit(smp_processor_id(), rq);
+}
 #endif /* CONFIG_SMP */
 /*
  * Idle tasks are unconditionally rescheduled:
@@ -86,6 +92,7 @@ const struct sched_class idle_sched_class = {
 
 #ifdef CONFIG_SMP
 	.select_task_rq		= select_task_rq_idle,
+	.pre_schedule		= pre_schedule_idle,
 #endif
 
 	.set_curr_task          = set_curr_task_idle,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index fc88644..9707092 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -877,6 +877,7 @@ extern const struct sched_class idle_sched_class;
 
 extern void trigger_load_balance(struct rq *rq, int cpu);
 extern void idle_balance(int this_cpu, struct rq *this_rq);
+extern void idle_exit(int this_cpu, struct rq *this_rq);
 
 #else	/* CONFIG_SMP */
 
@@ -884,6 +885,10 @@ static inline void idle_balance(int cpu, struct rq *rq)
 {
 }
 
+static inline void idle_exit(int this_cpu, struct rq *this_rq)
+{
+}
+
 #endif
 
 extern void sysrq_sched_debug_show(void);
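
For reference, the effect described in the changelog can be reproduced with a
small stand-alone toy model (illustrative only, not kernel code: the decay
factor y with y^32 = 1/2 and the LOAD_AVG_MAX constant mirror the kernel's
per-entity load-tracking series, while the file name, function names and the
1-in-10 duty cycle are made up for the example). It shows that when the idle
periods between the runs of a periodic task are accounted as running time,
the tracked average saturates near LOAD_AVG_MAX, whereas accounting them as
idle keeps it proportional to the real duty cycle:

#include <stdio.h>
#include <math.h>

/* Approximate limit of the series 1024*(y^0 + y^1 + ...) with y^32 = 0.5 */
#define LOAD_AVG_MAX	47742

/* Decay factor per ~1ms period: y such that y^32 = 0.5 */
static double decay_y(void)
{
	return pow(0.5, 1.0 / 32.0);
}

/* Account one 1024us period; 'runnable' says whether the CPU was busy. */
static void account_period(double *sum, int runnable)
{
	*sum = *sum * decay_y() + (runnable ? 1024.0 : 0.0);
}

int main(void)
{
	double as_running = 0.0, as_idle = 0.0;
	int i;

	/* A periodic task: 1 busy period followed by 9 idle periods. */
	for (i = 0; i < 1000; i++) {
		int runnable = (i % 10) == 0;

		/* Idle time accounted as running time. */
		account_period(&as_running, 1);
		/* Idle time accounted as idle time. */
		account_period(&as_idle, runnable);
	}

	printf("idle counted as running: %3.0f%% of LOAD_AVG_MAX\n",
	       100.0 * as_running / LOAD_AVG_MAX);
	printf("idle counted as idle:    %3.0f%% of LOAD_AVG_MAX\n",
	       100.0 * as_idle / LOAD_AVG_MAX);
	return 0;
}

Built with something like "gcc toy.c -lm", the first estimate converges to
roughly 100% of LOAD_AVG_MAX while the second stays around the task's ~10%
duty cycle, which is the kind of difference the idle_exit()/pre_schedule_idle
hook above is meant to make for the rq's runnable_avg.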