From patchwork Mon Feb 12 08:07:52 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 127945
Delivered-To: patch@linaro.org
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
 valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com,
 dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH v3 1/3] sched: Stop nohz stats when decayed
Date: Mon, 12 Feb 2018 09:07:52 +0100
Message-Id: <1518422874-13216-2-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1518422874-13216-1-git-send-email-vincent.guittot@linaro.org>
References: <1518422874-13216-1-git-send-email-vincent.guittot@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Stop the periodic update of blocked load once the blocked load of all
idle CPUs has fully decayed. We introduce a new nohz.has_blocked flag
that reflects whether some idle CPUs have blocked load that has to be
periodically updated. nohz.has_blocked is set every time an idle CPU can
have blocked load, and it is cleared when no more blocked load has been
detected during an update. We don't need atomic operations; we only need
to make sure of the right ordering when updating nohz.idle_cpus_mask and
nohz.has_blocked.

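For illustration only, and not part of the patch: the ordering described
above can be modelled in user space with C11 atomics. This is a minimal
sketch under stated assumptions. Every toy_* name is hypothetical, the
halving step is a crude stand-in for the real PELT decay, and the per-CPU
load array is left unprotected here, whereas the kernel accesses the real
data under rq->lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-ins for nohz.idle_cpus_mask and nohz.has_blocked. */
static atomic_uint toy_idle_cpus_mask;
static atomic_int toy_has_blocked;
/* Stand-in for the per-rq blocked load (rq->lock protected in the kernel). */
static unsigned long toy_blocked_load[TOY_NR_CPUS];

/* Idle entry: publish the CPU in the mask first, then raise the flag. */
static void toy_enter_idle(int cpu, unsigned long load)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_blocked_load[cpu] = load;
	/* seq_cst read-modify-write: the mask store is ordered before the flag store */
	atomic_fetch_or(&toy_idle_cpus_mask, 1u << cpu);
	atomic_store(&toy_has_blocked, 1);
}

/* One periodic update. Returns true while some idle CPU still has load. */
static bool toy_update_idle_cpus(void)
{
	bool still_blocked = false;
	unsigned int mask;
	int cpu;

	/* Optimistically clear the flag, then read the mask after a full fence. */
	atomic_store(&toy_has_blocked, 0);
	atomic_thread_fence(memory_order_seq_cst);
	mask = atomic_load(&toy_idle_cpus_mask);

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!(mask & (1u << cpu)))
			continue;
		toy_blocked_load[cpu] /= 2;	/* crude stand-in for the decay */
		if (toy_blocked_load[cpu])
			still_blocked = true;
	}

	/* Re-arm the periodic update only if something is left to decay. */
	if (still_blocked)
		atomic_store(&toy_has_blocked, 1);
	return still_blocked;
}
```

In this model a CPU that enters idle concurrently with an update is
either seen in the mask by the updater, or its later flag store re-arms
the next update, which is the property the two barriers in the patch are
meant to provide.
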
Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c  | 118 ++++++++++++++++++++++++++++++++++++++++++---------
 kernel/sched/sched.h |   1 +
 2 files changed, 99 insertions(+), 20 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7af1fa9..e228d3d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5383,8 +5383,9 @@ decay_load_missed(unsigned long load, unsigned long missed_updates, int idx)
 static struct {
 	cpumask_var_t idle_cpus_mask;
 	atomic_t nr_cpus;
+	int has_blocked;		/* Idle CPUs have blocked load */
 	unsigned long next_balance;     /* in jiffy units */
-	unsigned long next_stats;
+	unsigned long next_blocked;	/* Next update of blocked load in jiffies */
 } nohz ____cacheline_aligned;
 
 #endif /* CONFIG_NO_HZ_COMMON */
@@ -6951,6 +6952,7 @@ enum fbq_type { regular, remote, all };
 #define LBF_DST_PINNED  0x04
 #define LBF_SOME_PINNED	0x08
 #define LBF_NOHZ_STATS	0x10
+#define LBF_NOHZ_AGAIN	0x20
 
 struct lb_env {
 	struct sched_domain	*sd;
@@ -7335,8 +7337,6 @@ static void attach_tasks(struct lb_env *env)
 	rq_unlock(env->dst_rq, &rf);
 }
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 {
 	if (cfs_rq->load.weight)
@@ -7354,11 +7354,14 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 	return true;
 }
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
+
 static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq, *pos;
 	struct rq_flags rf;
+	bool done = true;
 
 	rq_lock_irqsave(rq, &rf);
 	update_rq_clock(rq);
@@ -7388,10 +7391,14 @@ static void update_blocked_averages(int cpu)
 		 */
 		if (cfs_rq_is_decayed(cfs_rq))
 			list_del_leaf_cfs_rq(cfs_rq);
+		else
+			done = false;
 	}
 
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
+	if (done)
+		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);
 }
@@ -7454,6 +7461,8 @@ static inline void update_blocked_averages(int cpu)
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
+	if (cfs_rq_is_decayed(cfs_rq))
+		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);
 }
@@ -7789,18 +7798,25 @@ group_type group_classify(struct sched_group *group,
 	return group_other;
 }
 
-static void update_nohz_stats(struct rq *rq)
+static bool update_nohz_stats(struct rq *rq)
 {
 #ifdef CONFIG_NO_HZ_COMMON
 	unsigned int cpu = rq->cpu;
 
+	if (!rq->has_blocked_load)
+		return false;
+
 	if (!cpumask_test_cpu(cpu, nohz.idle_cpus_mask))
-		return;
+		return false;
 
 	if (!time_after(jiffies, rq->last_blocked_load_update_tick))
-		return;
+		return true;
 
 	update_blocked_averages(cpu);
+
+	return rq->has_blocked_load;
+#else
+	return false;
 #endif
 }
 
@@ -7826,8 +7842,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
 		struct rq *rq = cpu_rq(i);
 
-		if (env->flags & LBF_NOHZ_STATS)
-			update_nohz_stats(rq);
+		if ((env->flags & LBF_NOHZ_STATS) && update_nohz_stats(rq))
+			env->flags |= LBF_NOHZ_AGAIN;
 
 		/* Bias balancing toward cpus of our domain */
 		if (local_group)
@@ -7979,18 +7995,15 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	struct sg_lb_stats *local = &sds->local_stat;
 	struct sg_lb_stats tmp_sgs;
 	int load_idx, prefer_sibling = 0;
+	int has_blocked = READ_ONCE(nohz.has_blocked);
 	bool overload = false;
 
 	if (child && child->flags & SD_PREFER_SIBLING)
 		prefer_sibling = 1;
 
 #ifdef CONFIG_NO_HZ_COMMON
-	if (env->idle == CPU_NEWLY_IDLE) {
+	if (env->idle == CPU_NEWLY_IDLE && has_blocked)
 		env->flags |= LBF_NOHZ_STATS;
-
-		if (cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd)))
-			nohz.next_stats = jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD);
-	}
 #endif
 
 	load_idx = get_sd_load_idx(env->sd, env->idle);
@@ -8046,6 +8059,15 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 		sg = sg->next;
 	} while (sg != env->sd->groups);
 
+#ifdef CONFIG_NO_HZ_COMMON
+	if ((env->flags & LBF_NOHZ_AGAIN) &&
+	    cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd))) {
+
+		WRITE_ONCE(nohz.next_blocked,
+				jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD));
+	}
+#endif
+
 	if (env->sd->flags & SD_NUMA)
 		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
 
@@ -9069,6 +9091,8 @@ static void nohz_balancer_kick(struct rq *rq)
 	struct sched_domain *sd;
 	int nr_busy, i, cpu = rq->cpu;
 	unsigned int flags = 0;
+	unsigned long has_blocked = READ_ONCE(nohz.has_blocked);
+	unsigned long next_blocked = READ_ONCE(nohz.next_blocked);
 
 	if (unlikely(rq->idle_balance))
 		return;
@@ -9086,7 +9110,7 @@ static void nohz_balancer_kick(struct rq *rq)
 	if (likely(!atomic_read(&nohz.nr_cpus)))
 		return;
 
-	if (time_after(now, nohz.next_stats))
+	if (time_after(now, next_blocked) && has_blocked)
 		flags = NOHZ_STATS_KICK;
 
 	if (time_before(now, nohz.next_balance))
@@ -9207,13 +9231,26 @@ void nohz_balance_enter_idle(int cpu)
 	if (!housekeeping_cpu(cpu, HK_FLAG_SCHED))
 		return;
 
+	/*
+	 * Can be safely set without rq->lock held
+	 * If a clear happens, it will have seen last additions because
+	 * rq->lock is held during the check and the clear
+	 */
+	rq->has_blocked_load = 1;
+
+	/*
+	 * The tick is still stopped but load could have been added in the
+	 * meantime. We set the nohz.has_blocked flag to trigger a check of the
+	 * *_avg. The CPU is already part of nohz.idle_cpus_mask so the clear
+	 * of nohz.has_blocked can only happen after checking the new load
+	 */
 	if (rq->nohz_tick_stopped)
-		return;
+		goto out;
 
 	/*
 	 * If we're a completely isolated CPU, we don't play.
 	 */
-	if (on_null_domain(cpu_rq(cpu)))
+	if (on_null_domain(rq))
 		return;
 
 	rq->nohz_tick_stopped = 1;
@@ -9222,6 +9259,20 @@ void nohz_balance_enter_idle(int cpu)
 	atomic_inc(&nohz.nr_cpus);
 
 	set_cpu_sd_state_idle(cpu);
+
+	/*
+	 * Ensures that if nohz_idle_balance() fails to observe our
+	 * @idle_cpus_mask store, it must observe the @has_blocked
+	 * store.
+	 */
+	smp_mb__after_atomic();
+
+out:
+	/*
+	 * Each time a CPU enters idle, we assume that it has blocked load and
+	 * enable the periodic update of the load of idle CPUs
+	 */
+	WRITE_ONCE(nohz.has_blocked, 1);
 }
 #else
 static inline void nohz_balancer_kick(struct rq *rq) { }
@@ -9374,6 +9425,22 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 	SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
 
+	/*
+	 * We assume there will be no idle load after this update and clear
+	 * the has_blocked flag. If a CPU enters idle in the meantime, it will
+	 * set the has_blocked flag and trigger another update of idle load.
+	 * Because a CPU that becomes idle is added to idle_cpus_mask before
+	 * setting the flag, we are sure to not clear the state and not
+	 * check the load of an idle CPU.
+	 */
+	WRITE_ONCE(nohz.has_blocked, 0);
+
+	/*
+	 * Ensures that if we miss the CPU, we must see the has_blocked
+	 * store from nohz_balance_enter_idle().
+	 */
+	smp_mb();
+
 	for_each_cpu(balance_cpu, nohz.idle_cpus_mask) {
 		if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
 			continue;
@@ -9383,11 +9450,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 		 * work being done for other cpus. Next load
 		 * balancing owner will pick it up.
 		 */
-		if (need_resched())
-			break;
+		if (need_resched()) {
+			has_blocked_load = true;
+			goto abort;
+		}
 
 		rq = cpu_rq(balance_cpu);
 
+		update_blocked_averages(rq->cpu);
+		has_blocked_load |= rq->has_blocked_load;
+
 		/*
 		 * If time for next balance is due,
 		 * do the balance.
@@ -9400,7 +9472,6 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			cpu_load_update_idle(rq);
 			rq_unlock_irq(rq, &rf);
 
-			update_blocked_averages(rq->cpu);
 			if (flags & NOHZ_BALANCE_KICK)
 				rebalance_domains(rq, CPU_IDLE);
 		}
@@ -9415,7 +9486,13 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	if (flags & NOHZ_BALANCE_KICK)
 		rebalance_domains(this_rq, CPU_IDLE);
 
-	nohz.next_stats = next_stats;
+	WRITE_ONCE(nohz.next_blocked,
+		now + msecs_to_jiffies(LOAD_AVG_PERIOD));
+
+abort:
+	/* There is still blocked load, enable periodic update */
+	if (has_blocked_load)
+		WRITE_ONCE(nohz.has_blocked, 1);
 
 	/*
 	 * next_balance will be updated only when there is a need.
@@ -10046,6 +10123,7 @@ __init void init_sched_fair_class(void)
 
 #ifdef CONFIG_NO_HZ_COMMON
 	nohz.next_balance = jiffies;
+	nohz.next_blocked = jiffies;
 	zalloc_cpumask_var(&nohz.idle_cpus_mask, GFP_NOWAIT);
 #endif
 #endif /* SMP */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e200045..ad9b929 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -723,6 +723,7 @@ struct rq {
 #ifdef CONFIG_SMP
 	unsigned long last_load_update_tick;
 	unsigned long last_blocked_load_update_tick;
+	unsigned int has_blocked_load;
 #endif /* CONFIG_SMP */
 	unsigned int nohz_tick_stopped;
 	atomic_t nohz_flags;

From patchwork Mon Feb 12 08:07:53 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 127944
Delivered-To: patch@linaro.org
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
 valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com,
 dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH v3 2/3] sched: reduce the periodic update duration
Date: Mon, 12 Feb 2018 09:07:53 +0100
Message-Id: <1518422874-13216-3-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1518422874-13216-1-git-send-email-vincent.guittot@linaro.org>
References: <1518422874-13216-1-git-send-email-vincent.guittot@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Instead of using cfs_rq_is_decayed(), which monitors all *_avg and
*_sum, we create cfs_rq_has_blocked(), which only takes care of util_avg
and load_avg. We are only interested in these two values, which reach
zero before the *_sum values do, so we can stop the periodic update
earlier.

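For illustration only, and not part of the patch: a toy model of why
checking only the *_avg values ends the periodic update sooner. It
assumes a deliberately simplified PELT in which each period of
LOAD_AVG_PERIOD halves the sums and each average is its sum divided by
LOAD_AVG_MAX; the real kernel computation is more involved. All toy_*
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_LOAD_AVG_MAX 47742	/* same value as the kernel's LOAD_AVG_MAX */

struct toy_cfs_avg {
	uint64_t load_sum;
	uint64_t util_sum;
	unsigned long load_avg;
	unsigned long util_avg;
};

/* Mirrors the idea of cfs_rq_has_blocked(): only the averages matter. */
static bool toy_has_blocked(const struct toy_cfs_avg *a)
{
	return a->load_avg || a->util_avg;
}

/* Mirrors the idea of cfs_rq_is_decayed(): the sums must reach zero too. */
static bool toy_is_decayed(const struct toy_cfs_avg *a)
{
	return !a->load_sum && !a->util_sum;
}

/* Simplified decay: one period halves the sums, averages follow the sums. */
static void toy_decay_one_period(struct toy_cfs_avg *a)
{
	a->load_sum /= 2;
	a->util_sum /= 2;
	a->load_avg = (unsigned long)(a->load_sum / TOY_LOAD_AVG_MAX);
	a->util_avg = (unsigned long)(a->util_sum / TOY_LOAD_AVG_MAX);
}

/* Count the periods needed for each stop condition to become true. */
static void toy_count_periods(uint64_t start_sum, int *until_unblocked,
			      int *until_decayed)
{
	struct toy_cfs_avg a = {
		.load_sum = start_sum,
		.util_sum = start_sum,
		.load_avg = (unsigned long)(start_sum / TOY_LOAD_AVG_MAX),
		.util_avg = (unsigned long)(start_sum / TOY_LOAD_AVG_MAX),
	};
	int periods = 0;

	assert(until_unblocked && until_decayed);
	*until_unblocked = toy_has_blocked(&a) ? -1 : 0;

	while (!toy_is_decayed(&a)) {
		toy_decay_one_period(&a);
		periods++;
		if (*until_unblocked < 0 && !toy_has_blocked(&a))
			*until_unblocked = periods;
	}
	*until_decayed = periods;
}
```

Starting from sums of 4 * TOY_LOAD_AVG_MAX, the averages reach zero
after 3 periods in this model while the sums need 18, so a stop
condition based on the averages saves most of the periodic updates.
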
Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e228d3d..7566424 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7337,6 +7337,19 @@ static void attach_tasks(struct lb_env *env)
 	rq_unlock(env->dst_rq, &rf);
 }
 
+static inline bool cfs_rq_has_blocked(struct cfs_rq *cfs_rq)
+{
+	if (cfs_rq->avg.load_avg)
+		return true;
+
+	if (cfs_rq->avg.util_avg)
+		return true;
+
+	return false;
+}
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 {
 	if (cfs_rq->load.weight)
@@ -7354,8 +7367,6 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 	return true;
 }
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-
 static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -7391,7 +7402,9 @@ static void update_blocked_averages(int cpu)
 		 */
 		if (cfs_rq_is_decayed(cfs_rq))
 			list_del_leaf_cfs_rq(cfs_rq);
-		else
+
+		/* Don't need periodic decay once load/util_avg are null */
+		if (cfs_rq_has_blocked(cfs_rq))
 			done = false;
 	}
 
@@ -7461,7 +7474,7 @@ static inline void update_blocked_averages(int cpu)
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
-	if (cfs_rq_is_decayed(cfs_rq))
+	if (!cfs_rq_has_blocked(cfs_rq))
 		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);

From patchwork Mon Feb 12 08:07:54 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 127946
Delivered-To: patch@linaro.org
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
 valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com,
 dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH v3 3/3] sched: update blocked load when newly idle
Date: Mon, 12 Feb 2018 09:07:54 +0100
Message-Id: <1518422874-13216-4-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1518422874-13216-1-git-send-email-vincent.guittot@linaro.org>
References: <1518422874-13216-1-git-send-email-vincent.guittot@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

When a NEWLY_IDLE load balance is not triggered, we might still need to
update the blocked load. We can either kick an ilb, so that an idle CPU
takes care of updating the blocked load, or try to update it locally
before entering idle. In the latter case, we reuse part of
nohz_idle_balance().

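For illustration only, and not part of the patch: a user-space sketch of
the decision described above, assuming a fixed per-CPU update cost
instead of measured sched_clock_cpu() deltas. All toy_* names and
parameters are hypothetical. The local update walks the idle CPUs and
gives up as soon as the accumulated cost exceeds the expected idle time,
in which case the caller falls back to kicking an ilb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Try to update the blocked load of nr_idle CPUs locally. cost_ns[i] is
 * the assumed cost of updating CPU i. Returns false if the loop was
 * aborted because it would outlast the expected idle time.
 */
static bool toy_update_locally(const uint64_t *cost_ns, int nr_idle,
			       uint64_t avg_idle_ns, int *nr_updated)
{
	uint64_t curr_cost = 0;
	int i;

	assert(nr_updated);
	*nr_updated = 0;

	for (i = 0; i < nr_idle; i++) {
		/* Abort once the work already done exceeds the idle budget. */
		if (avg_idle_ns < curr_cost)
			return false;
		curr_cost += cost_ns[i];
		(*nr_updated)++;
	}
	return true;
}

/* Returns true when the newly idle CPU should kick an ilb instead. */
static bool toy_needs_ilb_kick(bool has_blocked, bool update_due,
			       uint64_t avg_idle_ns, uint64_t migration_cost_ns,
			       const uint64_t *cost_ns, int nr_idle)
{
	int nr_updated;

	/* Nothing to do when no blocked load is pending or it is too early. */
	if (!has_blocked || !update_due)
		return false;

	/* Idle period expected to be too short: do not even try locally. */
	if (avg_idle_ns < migration_cost_ns)
		return true;

	/* Otherwise try locally and kick only if the loop was aborted. */
	return !toy_update_locally(cost_ns, nr_idle, avg_idle_ns, &nr_updated);
}
```

With three idle CPUs costing 100 ns each and an expected idle time of
150 ns, the local loop in this sketch stops after two CPUs and reports
failure, which corresponds to the case where the patch kicks an ilb.
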
Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 102 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 84 insertions(+), 18 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7566424..3c11ef6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8829,6 +8829,9 @@ update_next_balance(struct sched_domain *sd, unsigned long *next_balance)
 		*next_balance = next;
 }
 
+static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, enum cpu_idle_type idle);
+static void kick_ilb(unsigned int flags);
+
 /*
  * idle_balance is called by schedule() if this_cpu is about to become
  * idle. Attempts to pull tasks from other CPUs.
@@ -8863,12 +8866,26 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 
 	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
 	    !this_rq->rd->overload) {
+		unsigned long has_blocked = READ_ONCE(nohz.has_blocked);
+		unsigned long next_blocked = READ_ONCE(nohz.next_blocked);
+
 		rcu_read_lock();
 		sd = rcu_dereference_check_sched_domain(this_rq->sd);
 		if (sd)
 			update_next_balance(sd, &next_balance);
 		rcu_read_unlock();
 
+		/*
+		 * Update blocked idle load if it has not been done for a
+		 * while. Try to do it locally before entering idle but kick an
+		 * ilb if it takes too much time and/or might delay next local
+		 * wake up
+		 */
+		if (has_blocked && time_after_eq(jiffies, next_blocked) &&
+		    (this_rq->avg_idle < sysctl_sched_migration_cost ||
+		    !_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE)))
+			kick_ilb(NOHZ_STATS_KICK);
+
 		goto out;
 	}
 
@@ -9411,30 +9428,24 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
 
 #ifdef CONFIG_NO_HZ_COMMON
 /*
- * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
- * rebalancing for all the cpus for whom scheduler ticks are stopped.
+ * Internal function that runs load balance for all idle cpus. The load balance
+ * can be a simple update of blocked load or a complete load balance with
+ * tasks movement depending on flags.
+ * For newly idle mode, we abort the loop if it takes too much time and return
+ * false to signal that the loop was not completed and an ilb should be kicked.
  */
-static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, enum cpu_idle_type idle)
 {
 	/* Earliest time when we have to do rebalance again */
 	unsigned long now = jiffies;
 	unsigned long next_balance = now + 60*HZ;
-	unsigned long next_stats = now + msecs_to_jiffies(LOAD_AVG_PERIOD);
+	bool has_blocked_load = false;
 	int update_next_balance = 0;
 	int this_cpu = this_rq->cpu;
-	unsigned int flags;
 	int balance_cpu;
+	int ret = false;
 	struct rq *rq;
-
-	if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK))
-		return false;
-
-	if (idle != CPU_IDLE) {
-		atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
-		return false;
-	}
-
-	flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+	u64 curr_cost = 0;
 
 	SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
 
@@ -9455,6 +9466,10 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	smp_mb();
 
 	for_each_cpu(balance_cpu, nohz.idle_cpus_mask) {
+		u64 t0, domain_cost;
+
+		t0 = sched_clock_cpu(this_cpu);
+
 		if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
 			continue;
 
@@ -9468,6 +9483,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			goto abort;
 		}
 
+		/*
+		 * If the update is done while CPU becomes idle, we abort
+		 * the update when its cost is higher than the average idle
+		 * time in order to not delay a possible wake up.
+		 */
+		if (idle == CPU_NEWLY_IDLE && this_rq->avg_idle < curr_cost) {
+			has_blocked_load = true;
+			goto abort;
+		}
+
 		rq = cpu_rq(balance_cpu);
 
 		update_blocked_averages(rq->cpu);
@@ -9480,10 +9505,10 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 		if (time_after_eq(jiffies, rq->next_balance)) {
 			struct rq_flags rf;
 
-			rq_lock_irq(rq, &rf);
+			rq_lock_irqsave(rq, &rf);
 			update_rq_clock(rq);
 			cpu_load_update_idle(rq);
-			rq_unlock_irq(rq, &rf);
+			rq_unlock_irqrestore(rq, &rf);
 
 			if (flags & NOHZ_BALANCE_KICK)
 				rebalance_domains(rq, CPU_IDLE);
@@ -9493,15 +9518,27 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			next_balance = rq->next_balance;
 			update_next_balance = 1;
 		}
+
+		domain_cost = sched_clock_cpu(this_cpu) - t0;
+		curr_cost += domain_cost;
+
+	}
+
+	/* Newly idle CPU doesn't need an update */
+	if (idle != CPU_NEWLY_IDLE) {
+		update_blocked_averages(this_cpu);
+		has_blocked_load |= this_rq->has_blocked_load;
 	}
 
-	update_blocked_averages(this_cpu);
 	if (flags & NOHZ_BALANCE_KICK)
 		rebalance_domains(this_rq, CPU_IDLE);
 
 	WRITE_ONCE(nohz.next_blocked,
 		now + msecs_to_jiffies(LOAD_AVG_PERIOD));
 
+	/* The full idle balance loop has been done */
+	ret = true;
+
 abort:
 	/* There is still blocked load, enable periodic update */
 	if (has_blocked_load)
@@ -9515,6 +9552,35 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	if (likely(update_next_balance))
 		nohz.next_balance = next_balance;
 
+	return ret;
+}
+
+/*
+ * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
+ * rebalancing for all the cpus for whom scheduler ticks are stopped.
+ */
+static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+{
+	int this_cpu = this_rq->cpu;
+	unsigned int flags;
+
+	if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK))
+		return false;
+
+	if (idle != CPU_IDLE) {
+		atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+		return false;
+	}
+
+	/*
+	 * barrier, pairs with nohz_balance_enter_idle(), ensures ...
+	 */
+	flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+	if (!(flags & NOHZ_KICK_MASK))
+		return false;
+
+	_nohz_idle_balance(this_rq, flags, idle);
+
 	return true;
 }
 #else