From patchwork Tue Feb 6 08:32:21 2018
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 126957
Delivered-To: patch@linaro.org
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com, dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH 1/3] sched: Stop nohz stats when decayed
Date: Tue, 6 Feb 2018 09:32:21 +0100
Message-Id: <1517905943-24528-1-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <20180201181041.GF2269@hirez.programming.kicks-ass.net>
References: <20180201181041.GF2269@hirez.programming.kicks-ass.net>

Stop the periodic update of blocked load when all idle CPUs have fully
decayed. We introduce a new nohz.has_blocked flag that reflects whether
some idle CPUs have blocked load that has to be periodically updated.
nohz.has_blocked is set every time an idle CPU can have blocked load,
and it is cleared when no more blocked load is detected during an
update. We don't need atomic operations, only the right ordering when
updating nohz.idle_cpus_mask and nohz.has_blocked.
Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c  | 94 +++++++++++++++++++++++++++++++++++++++++-----------
 kernel/sched/sched.h |  1 +
 2 files changed, 75 insertions(+), 20 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7af1fa9..279f4b2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5383,8 +5383,9 @@ decay_load_missed(unsigned long load, unsigned long missed_updates, int idx)
 static struct {
 	cpumask_var_t idle_cpus_mask;
 	atomic_t nr_cpus;
+	int has_blocked;		/* Idle CPUs have blocked load */
 	unsigned long next_balance;     /* in jiffy units */
-	unsigned long next_stats;
+	unsigned long next_blocked;	/* Next update of blocked load in jiffies */
 } nohz ____cacheline_aligned;
 
 #endif /* CONFIG_NO_HZ_COMMON */
@@ -6951,6 +6952,7 @@ enum fbq_type { regular, remote, all };
 #define LBF_DST_PINNED  0x04
 #define LBF_SOME_PINNED	0x08
 #define LBF_NOHZ_STATS	0x10
+#define LBF_NOHZ_AGAIN	0x20
 
 struct lb_env {
 	struct sched_domain	*sd;
@@ -7335,8 +7337,6 @@ static void attach_tasks(struct lb_env *env)
 	rq_unlock(env->dst_rq, &rf);
 }
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 {
 	if (cfs_rq->load.weight)
@@ -7354,11 +7354,14 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 	return true;
 }
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
+
 static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq, *pos;
 	struct rq_flags rf;
+	bool done = true;
 
 	rq_lock_irqsave(rq, &rf);
 	update_rq_clock(rq);
@@ -7388,10 +7391,14 @@ static void update_blocked_averages(int cpu)
 		 */
 		if (cfs_rq_is_decayed(cfs_rq))
 			list_del_leaf_cfs_rq(cfs_rq);
+		else
+			done = false;
 	}
 
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
+	if (done)
+		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);
 }
@@ -7454,6 +7461,8 @@ static inline void update_blocked_averages(int cpu)
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
+	if (cfs_rq_is_decayed(cfs_rq))
+		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);
 }
@@ -7789,18 +7798,25 @@ group_type group_classify(struct sched_group *group,
 	return group_other;
 }
 
-static void update_nohz_stats(struct rq *rq)
+static bool update_nohz_stats(struct rq *rq)
 {
 #ifdef CONFIG_NO_HZ_COMMON
 	unsigned int cpu = rq->cpu;
 
+	if (!rq->has_blocked_load)
+		return false;
+
 	if (!cpumask_test_cpu(cpu, nohz.idle_cpus_mask))
-		return;
+		return false;
 
 	if (!time_after(jiffies, rq->last_blocked_load_update_tick))
-		return;
+		return true;
 
 	update_blocked_averages(cpu);
+
+	return rq->has_blocked_load;
+#else
+	return false;
 #endif
 }
 
@@ -7826,8 +7842,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
 		struct rq *rq = cpu_rq(i);
 
-		if (env->flags & LBF_NOHZ_STATS)
-			update_nohz_stats(rq);
+		if ((env->flags & LBF_NOHZ_STATS) && update_nohz_stats(rq))
+			env->flags |= LBF_NOHZ_AGAIN;
 
 		/* Bias balancing toward cpus of our domain */
 		if (local_group)
@@ -7979,18 +7995,15 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	struct sg_lb_stats *local = &sds->local_stat;
 	struct sg_lb_stats tmp_sgs;
 	int load_idx, prefer_sibling = 0;
+	int has_blocked = READ_ONCE(nohz.has_blocked);
 	bool overload = false;
 
 	if (child && child->flags & SD_PREFER_SIBLING)
 		prefer_sibling = 1;
 
 #ifdef CONFIG_NO_HZ_COMMON
-	if (env->idle == CPU_NEWLY_IDLE) {
+	if (env->idle == CPU_NEWLY_IDLE && has_blocked)
 		env->flags |= LBF_NOHZ_STATS;
-
-		if (cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd)))
-			nohz.next_stats = jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD);
-	}
 #endif
 
 	load_idx = get_sd_load_idx(env->sd, env->idle);
@@ -8046,6 +8059,15 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 		sg = sg->next;
 	} while (sg != env->sd->groups);
 
+#ifdef CONFIG_NO_HZ_COMMON
+	if ((env->flags & LBF_NOHZ_AGAIN) &&
+	    cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd))) {
+
+		WRITE_ONCE(nohz.next_blocked,
+				jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD));
+	}
+#endif
+
 	if (env->sd->flags & SD_NUMA)
 		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
 
@@ -9069,6 +9091,8 @@ static void nohz_balancer_kick(struct rq *rq)
 	struct sched_domain *sd;
 	int nr_busy, i, cpu = rq->cpu;
 	unsigned int flags = 0;
+	unsigned long has_blocked = READ_ONCE(nohz.has_blocked);
+	unsigned long next = READ_ONCE(nohz.next_blocked);
 
 	if (unlikely(rq->idle_balance))
 		return;
@@ -9086,7 +9110,7 @@ static void nohz_balancer_kick(struct rq *rq)
 	if (likely(!atomic_read(&nohz.nr_cpus)))
 		return;
 
-	if (time_after(now, nohz.next_stats))
+	if (time_after(now, next) && has_blocked)
 		flags = NOHZ_STATS_KICK;
 
 	if (time_before(now, nohz.next_balance))
@@ -9207,13 +9231,15 @@ void nohz_balance_enter_idle(int cpu)
 	if (!housekeeping_cpu(cpu, HK_FLAG_SCHED))
 		return;
 
+	rq->has_blocked_load = 1;
+
 	if (rq->nohz_tick_stopped)
-		return;
+		goto out;
 
 	/*
 	 * If we're a completely isolated CPU, we don't play.
 	 */
-	if (on_null_domain(cpu_rq(cpu)))
+	if (on_null_domain(rq))
 		return;
 
 	rq->nohz_tick_stopped = 1;
@@ -9222,6 +9248,13 @@ void nohz_balance_enter_idle(int cpu)
 	atomic_inc(&nohz.nr_cpus);
 
 	set_cpu_sd_state_idle(cpu);
+
+out:
+	/*
+	 * Each time a cpu enters idle, we assume that it has blocked load and
+	 * enable the periodic update of the load of idle cpus
+	 */
+	WRITE_ONCE(nohz.has_blocked, 1);
 }
 #else
 static inline void nohz_balancer_kick(struct rq *rq) { }
@@ -9374,6 +9407,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 	SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
 
+	/*
+	 * We assume there will be no idle load after this update and clear
+	 * the stats state. If a cpu enters idle in the mean time, it will
+	 * set the stats state and trigger another update of idle load.
+	 * Because a cpu that becomes idle is added to idle_cpus_mask before
+	 * setting the stats state, we are sure to not clear the state and not
+	 * check the load of an idle cpu.
+	 */
+	WRITE_ONCE(nohz.has_blocked, 0);
+
	for_each_cpu(balance_cpu, nohz.idle_cpus_mask) {
 		if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
 			continue;
@@ -9383,11 +9426,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 		 * work being done for other cpus. Next load
 		 * balancing owner will pick it up.
 		 */
-		if (need_resched())
-			break;
+		if (need_resched()) {
+			has_blocked_load = true;
+			goto abort;
+		}
 
 		rq = cpu_rq(balance_cpu);
 
+		update_blocked_averages(rq->cpu);
+		has_blocked_load |= rq->has_blocked_load;
+
 		/*
 		 * If time for next balance is due,
 		 * do the balance.
@@ -9400,7 +9448,6 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			cpu_load_update_idle(rq);
 			rq_unlock_irq(rq, &rf);
 
-			update_blocked_averages(rq->cpu);
 			if (flags & NOHZ_BALANCE_KICK)
 				rebalance_domains(rq, CPU_IDLE);
 		}
@@ -9415,7 +9462,13 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	if (flags & NOHZ_BALANCE_KICK)
 		rebalance_domains(this_rq, CPU_IDLE);
 
-	nohz.next_stats = next_stats;
+	WRITE_ONCE(nohz.next_blocked,
+		now + msecs_to_jiffies(LOAD_AVG_PERIOD));
+
+abort:
+	/* There is still blocked load, enable periodic update */
+	if (has_blocked_load)
+		WRITE_ONCE(nohz.has_blocked, 1);
 
 	/*
 	 * next_balance will be updated only when there is a need.
@@ -10046,6 +10099,7 @@ __init void init_sched_fair_class(void)
 
 #ifdef CONFIG_NO_HZ_COMMON
 	nohz.next_balance = jiffies;
+	nohz.next_blocked = jiffies;
 	zalloc_cpumask_var(&nohz.idle_cpus_mask, GFP_NOWAIT);
 #endif
 #endif /* SMP */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e200045..ad9b929 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -723,6 +723,7 @@ struct rq {
 #ifdef CONFIG_SMP
 	unsigned long last_load_update_tick;
 	unsigned long last_blocked_load_update_tick;
+	unsigned int has_blocked_load;
 #endif /* CONFIG_SMP */
 	unsigned int nohz_tick_stopped;
 	atomic_t nohz_flags;

From patchwork Tue Feb 6 08:32:22 2018
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 126958
Delivered-To: patch@linaro.org
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com, dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH 2/3] sched: reduce the periodic update duration
Date: Tue, 6 Feb 2018 09:32:22 +0100
Message-Id: <1517905943-24528-2-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <1517905943-24528-1-git-send-email-vincent.guittot@linaro.org>
References: <20180201181041.GF2269@hirez.programming.kicks-ass.net> <1517905943-24528-1-git-send-email-vincent.guittot@linaro.org>

Instead of using the
cfs_rq_is_decayed() helper, which monitors all *_avg and *_sum, we
create cfs_rq_has_blocked(), which only checks util_avg and load_avg.
We are only interested in these two values, which decay faster than the
*_sum, so we can stop the periodic update earlier.

Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 279f4b2..6998528 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7337,6 +7337,19 @@ static void attach_tasks(struct lb_env *env)
 	rq_unlock(env->dst_rq, &rf);
 }
 
+static inline bool cfs_rq_has_blocked(struct cfs_rq *cfs_rq)
+{
+	if (cfs_rq->avg.load_avg)
+		return true;
+
+	if (cfs_rq->avg.util_avg)
+		return true;
+
+	return false;
+}
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 {
 	if (cfs_rq->load.weight)
@@ -7354,8 +7367,6 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 	return true;
 }
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-
 static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -7391,7 +7402,9 @@ static void update_blocked_averages(int cpu)
 		 */
 		if (cfs_rq_is_decayed(cfs_rq))
 			list_del_leaf_cfs_rq(cfs_rq);
-		else
+
+		/* Don't need periodic decay once load/util_avg are null */
+		if (cfs_rq_has_blocked(cfs_rq))
 			done = false;
 	}
 
@@ -7461,7 +7474,7 @@ static inline void update_blocked_averages(int cpu)
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
-	if (cfs_rq_is_decayed(cfs_rq))
+	if (!cfs_rq_has_blocked(cfs_rq))
 		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);

From patchwork Tue Feb 6 08:32:23 2018
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 126959
Delivered-To: patch@linaro.org
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com, dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH 3/3] sched: update blocked load when newly idle
Date: Tue, 6 Feb 2018 09:32:23 +0100
Message-Id: <1517905943-24528-3-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <1517905943-24528-1-git-send-email-vincent.guittot@linaro.org>
References: <20180201181041.GF2269@hirez.programming.kicks-ass.net> <1517905943-24528-1-git-send-email-vincent.guittot@linaro.org>

When NEWLY_IDLE load balance is not triggered, we might need to update
the blocked load anyway. We can kick an ilb so an idle CPU will take
care of updating the blocked load, or we can try to update it locally
before entering idle. In the latter case, we reuse part of
nohz_idle_balance().
Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 102 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 84 insertions(+), 18 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6998528..256defe 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8829,6 +8829,9 @@ update_next_balance(struct sched_domain *sd, unsigned long *next_balance)
 		*next_balance = next;
 }
 
+static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, enum cpu_idle_type idle);
+static void kick_ilb(unsigned int flags);
+
 /*
  * idle_balance is called by schedule() if this_cpu is about to become
  * idle. Attempts to pull tasks from other CPUs.
@@ -8863,12 +8866,26 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 
 	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
 	    !this_rq->rd->overload) {
+		unsigned long has_blocked = READ_ONCE(nohz.has_blocked);
+		unsigned long next = READ_ONCE(nohz.next_blocked);
+
 		rcu_read_lock();
 		sd = rcu_dereference_check_sched_domain(this_rq->sd);
 		if (sd)
 			update_next_balance(sd, &next_balance);
 		rcu_read_unlock();
 
+		/*
+		 * Update blocked idle load if it has not been done for a
+		 * while. Try to do it locally before entering idle but kick an
+		 * ilb if it takes too much time and/or might delay the next
+		 * local wake up.
+		 */
+		if (has_blocked && time_after_eq(jiffies, next) &&
+		    (this_rq->avg_idle < sysctl_sched_migration_cost ||
+		     !_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE)))
+			kick_ilb(NOHZ_STATS_KICK);
+
 		goto out;
 	}
 
@@ -9393,30 +9410,24 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
 
 #ifdef CONFIG_NO_HZ_COMMON
 /*
- * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
- * rebalancing for all the cpus for whom scheduler ticks are stopped.
+ * Internal function that runs load balance for all idle cpus. The load balance
+ * can be a simple update of blocked load or a complete load balance with
+ * tasks movement, depending on flags.
+ * For newly idle mode, we abort the loop if it takes too much time and return
+ * false to notify that the loop has not been completed and an ilb should be
+ * kicked.
 */
-static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, enum cpu_idle_type idle)
 {
 	/* Earliest time when we have to do rebalance again */
 	unsigned long now = jiffies;
 	unsigned long next_balance = now + 60*HZ;
-	unsigned long next_stats = now + msecs_to_jiffies(LOAD_AVG_PERIOD);
+	bool has_blocked_load = false;
 	int update_next_balance = 0;
 	int this_cpu = this_rq->cpu;
-	unsigned int flags;
 	int balance_cpu;
+	int ret = false;
 	struct rq *rq;
-
-	if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK))
-		return false;
-
-	if (idle != CPU_IDLE) {
-		atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
-		return false;
-	}
-
-	flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+	u64 curr_cost = 0;
 
 	SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
 
@@ -9431,6 +9442,10 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	WRITE_ONCE(nohz.has_blocked, 0);
 
 	for_each_cpu(balance_cpu, nohz.idle_cpus_mask) {
+		u64 t0, domain_cost;
+
+		t0 = sched_clock_cpu(this_cpu);
+
 		if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
 			continue;
 
@@ -9444,6 +9459,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			goto abort;
 		}
 
+		/*
+		 * If the update is done while the CPU becomes idle, we abort
+		 * the update when its cost is higher than the average idle
+		 * time, in order to not delay a possible wake up.
+		 */
+		if (idle == CPU_NEWLY_IDLE && this_rq->avg_idle < curr_cost) {
+			has_blocked_load = true;
+			goto abort;
+		}
+
 		rq = cpu_rq(balance_cpu);
 
 		update_blocked_averages(rq->cpu);
@@ -9456,10 +9481,10 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 		if (time_after_eq(jiffies, rq->next_balance)) {
 			struct rq_flags rf;
 
-			rq_lock_irq(rq, &rf);
+			rq_lock_irqsave(rq, &rf);
 			update_rq_clock(rq);
 			cpu_load_update_idle(rq);
-			rq_unlock_irq(rq, &rf);
+			rq_unlock_irqrestore(rq, &rf);
 
 			if (flags & NOHZ_BALANCE_KICK)
 				rebalance_domains(rq, CPU_IDLE);
@@ -9469,15 +9494,27 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			next_balance = rq->next_balance;
 			update_next_balance = 1;
 		}
+
+		domain_cost = sched_clock_cpu(this_cpu) - t0;
+		curr_cost += domain_cost;
+
+	}
+
+	/* Newly idle CPU doesn't need an update */
+	if (idle != CPU_NEWLY_IDLE) {
+		update_blocked_averages(this_cpu);
+		has_blocked_load |= this_rq->has_blocked_load;
 	}
 
-	update_blocked_averages(this_cpu);
 	if (flags & NOHZ_BALANCE_KICK)
 		rebalance_domains(this_rq, CPU_IDLE);
 
 	WRITE_ONCE(nohz.next_blocked,
 		now + msecs_to_jiffies(LOAD_AVG_PERIOD));
 
+	/* The full idle balance loop has been done */
+	ret = true;
+
 abort:
 	/* There is still blocked load, enable periodic update */
 	if (has_blocked_load)
@@ -9491,6 +9528,35 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	if (likely(update_next_balance))
 		nohz.next_balance = next_balance;
 
+	return ret;
+}
+
+/*
+ * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
+ * rebalancing for all the cpus for whom scheduler ticks are stopped.
+ */
+static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+{
+	int this_cpu = this_rq->cpu;
+	unsigned int flags;
+
+	if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK))
+		return false;
+
+	if (idle != CPU_IDLE) {
+		atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+		return false;
+	}
+
+	/*
+	 * barrier, pairs with nohz_balance_enter_idle(), ensures ...
+	 */
+	flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+	if (!(flags & NOHZ_KICK_MASK))
+		return false;
+
+	_nohz_idle_balance(this_rq, flags, idle);
+	return true;
 }
 
 #else