From patchwork Fri Aug 4 13:40:21 2017
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 109400
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
	rjw@rjwysocki.net
Cc: juri.lelli@arm.com, dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com,
	viresh.kumar@linaro.org, Vincent Guittot
Subject: [RESEND PATCH v2 1/2] sched/rt: add utilization tracking
Date: Fri, 4 Aug 2017 15:40:21 +0200
Message-Id: <1501854022-22128-2-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1501854022-22128-1-git-send-email-vincent.guittot@linaro.org>
References: <1501854022-22128-1-git-send-email-vincent.guittot@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

The schedutil governor relies on cfs_rq's util_avg to choose the OPP
when cfs tasks are running. When the CPU is overloaded by cfs and rt
tasks, cfs tasks are preempted by rt tasks, and in this case util_avg
reflects the remaining capacity that is used by cfs tasks but not what
cfs tasks want to use.
In such a case, schedutil can select a lower OPP when a cfs task runs
even though the CPU is overloaded. In order to have a more accurate
view of the utilization of the CPU, we track the utilization that is
used by RT tasks. DL tasks are not taken into account as they have
their own utilization tracking mechanism. We don't use rt_avg, which
doesn't have the same dynamics as PELT and which can include IRQ time.

Signed-off-by: Vincent Guittot
---
Changes since v1:
- rebase on tip/sched/core

There were several comments on v1:
- As raised by Peter for v1, if IRQ time is taken into account in
  rt_avg, it will not be accounted in rq->clock_task. This means that
  cfs utilization is not affected by extra contributions or decays
  because of IRQ.
- Regarding the sync of rt and cfs utilization, both cfs and rt use the
  same rq->clock_task, so we have the same issue as cfs regarding
  blocked values: the utilization of idle cfs/rt rqs is not updated
  regularly but only when a load balance is triggered (more precisely,
  on a call to update_blocked_averages). I'd like to fix this issue for
  both cfs and rt with a separate patch that will ensure that
  utilization (and load) are updated regularly even for idle CPUs.
- One last open question is the location of the rt utilization function
  in fair.c. PELT-related functions should probably move into a
  dedicated pelt.c file. This would also help to address one comment
  about having a place to update the metrics of NOHZ idle CPUs.
  Thoughts?

 kernel/sched/fair.c  | 21 +++++++++++++++++++++
 kernel/sched/rt.c    |  9 +++++++++
 kernel/sched/sched.h |  3 +++
 3 files changed, 33 insertions(+)

-- 
2.7.4

Signed-off-by: Peter Zijlstra (Intel)
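For illustration only, and not part of the patch itself: the small
self-contained user-space program below approximates the PELT dynamic
that the new update_rt_rq_load_avg() reuses, i.e. a geometric series
with a ~32ms half-life sampled once per ~1ms window. The toy_avg,
toy_update() and toy_util_avg() names and the floating-point constants
are assumptions made for readability; they are not the kernel's
___update_load_avg() internals, which use fixed-point sums and handle
partial periods.

#include <math.h>
#include <stdio.h>

#define PELT_PERIOD_US	1024.0	/* one PELT window, roughly 1ms */
#define PELT_HALFLIFE	32	/* the signal halves every 32 windows (~32ms) */

struct toy_avg {
	double util_sum;	/* decayed sum of windows spent running */
	double max_sum;		/* decayed sum of all elapsed windows */
};

/*
 * Advance the signal by 'periods' windows; 'running' says whether an rt
 * task occupied the CPU during that span.  Both sums decay with the same
 * geometric series, so util_sum/max_sum tracks the fraction of CPU time
 * consumed, with older windows weighing geometrically less.
 */
static void toy_update(struct toy_avg *sa, unsigned int periods, int running)
{
	const double y = pow(0.5, 1.0 / PELT_HALFLIFE);	/* y^32 == 0.5 */

	while (periods--) {
		sa->util_sum = sa->util_sum * y + (running ? PELT_PERIOD_US : 0.0);
		sa->max_sum  = sa->max_sum  * y + PELT_PERIOD_US;
	}
}

static double toy_util_avg(const struct toy_avg *sa)
{
	/* scale to 0..1024, the range util_avg uses vs SCHED_CAPACITY_SCALE */
	return sa->max_sum > 0.0 ? 1024.0 * sa->util_sum / sa->max_sum : 0.0;
}

int main(void)
{
	struct toy_avg rt = { 0.0, 0.0 };

	toy_update(&rt, 100, 1);	/* ~100ms with an rt task running */
	printf("after rt burst : %4.0f/1024\n", toy_util_avg(&rt));

	toy_update(&rt, 100, 0);	/* ~100ms without rt activity: decay */
	printf("after idle span: %4.0f/1024\n", toy_util_avg(&rt));

	return 0;
}

Built with something like "cc -O2 toy_pelt.c -lm" (the file name is
arbitrary), it first reports a fully built-up signal after ~100ms of rt
activity and then shows it decaying over a ~100ms quiet span, which is
the blocked-utilization behaviour discussed in the notes above.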
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 008c514..50fe0a2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3012,6 +3012,17 @@ __update_load_avg_cfs_rq(u64 now, int cpu, struct cfs_rq *cfs_rq)
 			cfs_rq->curr != NULL, cfs_rq);
 }
 
+int update_rt_rq_load_avg(u64 now, int cpu, struct rt_rq *rt_rq, int running)
+{
+	int ret;
+
+	ret = ___update_load_avg(now, cpu, &rt_rq->avg, 0, running, NULL);
+
+
+	return ret;
+}
+
+
 /*
  * Signed add and clamp on underflow.
  *
@@ -3539,6 +3550,11 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq, bool update_freq)
 	return 0;
 }
 
+int update_rt_rq_load_avg(u64 now, int cpu, struct rt_rq *rt_rq, int running)
+{
+	return 0;
+}
+
 #define UPDATE_TG	0x0
 #define SKIP_AGE_LOAD	0x0
 
@@ -6932,6 +6948,10 @@ static void update_blocked_averages(int cpu)
 		if (cfs_rq_is_decayed(cfs_rq))
 			list_del_leaf_cfs_rq(cfs_rq);
 	}
+
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu, &rq->rt, 0);
+
+
 	rq_unlock_irqrestore(rq, &rf);
 }
 
@@ -6991,6 +7011,7 @@ static inline void update_blocked_averages(int cpu)
 	rq_lock_irqsave(rq, &rf);
 	update_rq_clock(rq);
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq, true);
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu, &rq->rt, 0);
 	rq_unlock_irqrestore(rq, &rf);
 }
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 45caf93..e72d572 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1534,6 +1534,8 @@ static struct task_struct *_pick_next_task_rt(struct rq *rq)
 	return p;
 }
 
+extern int update_rt_rq_load_avg(u64 now, int cpu, struct rt_rq *rt_rq, int running);
+
 static struct task_struct *
 pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 {
@@ -1579,6 +1581,10 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 
 	queue_push_tasks(rq);
 
+	if (p)
+		update_rt_rq_load_avg(rq_clock_task(rq), cpu_of(rq), rt_rq,
+				rq->curr->sched_class == &rt_sched_class);
+
 	return p;
 }
 
@@ -1586,6 +1592,8 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
 {
 	update_curr_rt(rq);
 
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu_of(rq), &rq->rt, 1);
+
 	/*
 	 * The previous task needs to be made eligible for pushing
 	 * if it is still active
@@ -2368,6 +2376,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued)
 	struct sched_rt_entity *rt_se = &p->rt;
 
 	update_curr_rt(rq);
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu_of(rq), &rq->rt, 1);
 
 	watchdog(rq, p);
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index eeef1a3..e7ee5b0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -524,6 +524,9 @@ struct rt_rq {
 	unsigned long rt_nr_total;
 	int overloaded;
 	struct plist_head pushable_tasks;
+
+	struct sched_avg avg;
+
 #ifdef HAVE_RT_PUSH_IPI
 	int push_flags;
 	int push_cpu;
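As a closing illustration of why the extra signal matters for OPP
selection: the sketch below adds an rt utilization on top of a cfs one
(both on the 0..1024 scale) before applying a schedutil-style 1.25
margin. pick_freq(), CAP_SCALE and the numbers in main() are
hypothetical and only hint at how a governor could consume rq->rt.avg;
the actual schedutil change is the subject of the next patch in this
series, not of this one.

#include <stdio.h>

#define CAP_SCALE	1024UL	/* same scale as SCHED_CAPACITY_SCALE */

/*
 * Hypothetical frequency selection: add the rt utilization tracked by
 * this patch on top of the cfs one before applying a schedutil-style
 * 1.25 margin, then clamp to the maximum frequency.
 */
static unsigned long pick_freq(unsigned long cfs_util, unsigned long rt_util,
			       unsigned long max_freq)
{
	unsigned long util = cfs_util + rt_util;
	unsigned long freq;

	if (util > CAP_SCALE)			/* CPU is overloaded */
		util = CAP_SCALE;

	freq = max_freq + (max_freq >> 2);	/* 1.25 * max_freq */
	freq = freq * util / CAP_SCALE;

	return freq > max_freq ? max_freq : freq;
}

int main(void)
{
	unsigned long max_freq = 2000000;	/* kHz */

	/* cfs alone under-estimates demand when rt preempts cfs tasks */
	printf("cfs only: %lu kHz\n", pick_freq(300, 0, max_freq));
	/* with the rt contribution added, the overload becomes visible */
	printf("cfs + rt: %lu kHz\n", pick_freq(300, 600, max_freq));

	return 0;
}

With the cfs utilization alone (300/1024) the toy requests well below
the maximum frequency; once the rt contribution (600/1024) is added,
the request saturates at max_freq, which is the behaviour the changelog
argues for on an overloaded CPU.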