From patchwork Mon Nov 3 16:54:47 2014
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 40053
From: Vincent Guittot <vincent.guittot@linaro.org>
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
	preeti@linux.vnet.ibm.com, Morten.Rasmussen@arm.com,
	kamalesh@linux.vnet.ibm.com, linux-arm-kernel@lists.infradead.org
Cc: riel@redhat.com, efault@gmx.de, nicolas.pitre@linaro.org,
	linaro-kernel@lists.linaro.org, Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH v9 10/10] sched: move cfs task on a CPU with higher capacity
Date: Mon, 3 Nov 2014 17:54:47 +0100
Message-Id: <1415033687-23294-11-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1415033687-23294-1-git-send-email-vincent.guittot@linaro.org>
References: <1415033687-23294-1-git-send-email-vincent.guittot@linaro.org>

When a CPU is used to handle a lot of IRQs or some RT tasks, the remaining
capacity for CFS tasks can be significantly reduced. Once we detect such a
situation by comparing cpu_capacity_orig and cpu_capacity, we trigger an idle
load balance to check whether it's worth moving its tasks to an idle CPU.

Once the idle load balance has selected the busiest CPU, it will trigger an
active load balance in only two cases:
 - there is only 1 task on the busiest CPU.
 - we haven't been able to move a task off the busiest rq.

A CPU with reduced capacity falls under the 1st case, and it's worth actively
migrating its task if the idle CPU has full capacity. This test has been
added in need_active_balance.

As a side note, this will not generate more spurious ilbs because we already
trigger an ilb if there is more than 1 busy CPU. If this CPU is the only one
that has a task, we will trigger the ilb once to migrate the task.
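For reference, the capacity-reduction test mentioned above (comparing
cpu_capacity against cpu_capacity_orig with sd->imbalance_pct as the margin)
is the check_cpu_capacity() helper introduced earlier in this series; a
minimal sketch of that comparison, reproduced here only as a reading aid:

	/*
	 * Returns true when side activity (RT tasks, IRQs) has pushed the
	 * rq's remaining capacity (cpu_capacity) noticeably below its
	 * original capacity (cpu_capacity_orig). sd->imbalance_pct (e.g.
	 * 125, i.e. a 25% margin) sets the threshold; the multiplication
	 * keeps everything in integer arithmetic.
	 */
	static inline int
	check_cpu_capacity(struct rq *rq, struct sched_domain *sd)
	{
		return ((rq->cpu_capacity * sd->imbalance_pct) <
					(rq->cpu_capacity_orig * 100));
	}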
The nohz_kick_needed function has been cleaned up a bit while adding the new
test.

env.src_cpu and env.src_rq must be set unconditionally because they are used
in need_active_balance, which is called even if busiest->nr_running equals 1.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 74 ++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 53 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index db392a6..02e8f7f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6634,6 +6634,28 @@ static int need_active_balance(struct lb_env *env)
 		return 1;
 	}
 
+	/*
+	 * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
+	 * It's worth migrating the task if the src_cpu's capacity is reduced
+	 * because of other sched_class or IRQs whereas capacity stays
+	 * available on dst_cpu.
+	 */
+	if ((env->idle != CPU_NOT_IDLE) &&
+	    (env->src_rq->cfs.h_nr_running == 1)) {
+		unsigned long src_eff_capacity, dst_eff_capacity;
+
+		dst_eff_capacity = 100;
+		dst_eff_capacity *= capacity_of(env->dst_cpu);
+		dst_eff_capacity *= capacity_orig_of(env->src_cpu);
+
+		src_eff_capacity = sd->imbalance_pct;
+		src_eff_capacity *= capacity_of(env->src_cpu);
+		src_eff_capacity *= capacity_orig_of(env->dst_cpu);
+
+		if (src_eff_capacity < dst_eff_capacity)
+			return 1;
+	}
+
 	return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
 }
 
@@ -6733,6 +6755,9 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 
 	schedstat_add(sd, lb_imbalance[idle], env.imbalance);
 
+	env.src_cpu = busiest->cpu;
+	env.src_rq = busiest;
+
 	ld_moved = 0;
 	if (busiest->nr_running > 1) {
 		/*
@@ -6742,8 +6767,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		 * correctly treated as an imbalance.
 		 */
 		env.flags |= LBF_ALL_PINNED;
-		env.src_cpu = busiest->cpu;
-		env.src_rq = busiest;
 		env.loop_max = min(sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
@@ -7443,22 +7466,25 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 /*
  * Current heuristic for kicking the idle load balancer in the presence
- * of an idle cpu is the system.
+ * of an idle cpu in the system.
  *   - This rq has more than one task.
- *   - At any scheduler domain level, this cpu's scheduler group has multiple
- *     busy cpu's exceeding the group's capacity.
+ *   - This rq has at least one CFS task and the capacity of the CPU is
+ *     significantly reduced because of RT tasks or IRQs.
+ *   - At parent of LLC scheduler domain level, this cpu's scheduler group has
+ *     multiple busy cpu.
  *   - For SD_ASYM_PACKING, if the lower numbered cpu's in the scheduler
  *     domain span are idle.
  */
-static inline int nohz_kick_needed(struct rq *rq)
+static inline bool nohz_kick_needed(struct rq *rq)
 {
 	unsigned long now = jiffies;
 	struct sched_domain *sd;
 	struct sched_group_capacity *sgc;
 	int nr_busy, cpu = rq->cpu;
+	bool kick = false;
 
 	if (unlikely(rq->idle_balance))
-		return 0;
+		return false;
 
 	/*
 	 * We may be recently in ticked or tickless idle mode. At the first
@@ -7472,38 +7498,44 @@ static inline int nohz_kick_needed(struct rq *rq)
 	 * balancing.
 	 */
 	if (likely(!atomic_read(&nohz.nr_cpus)))
-		return 0;
+		return false;
 
 	if (time_before(now, nohz.next_balance))
-		return 0;
+		return false;
 
 	if (rq->nr_running >= 2)
-		goto need_kick;
+		return true;
 
 	rcu_read_lock();
 	sd = rcu_dereference(per_cpu(sd_busy, cpu));
-
 	if (sd) {
 		sgc = sd->groups->sgc;
 		nr_busy = atomic_read(&sgc->nr_busy_cpus);
 
-		if (nr_busy > 1)
-			goto need_kick_unlock;
+		if (nr_busy > 1) {
+			kick = true;
+			goto unlock;
+		}
+
 	}
 
-	sd = rcu_dereference(per_cpu(sd_asym, cpu));
+	sd = rcu_dereference(rq->sd);
+	if (sd) {
+		if ((rq->cfs.h_nr_running >= 1) &&
+				check_cpu_capacity(rq, sd)) {
+			kick = true;
+			goto unlock;
+		}
+	}
 
+	sd = rcu_dereference(per_cpu(sd_asym, cpu));
 	if (sd && (cpumask_first_and(nohz.idle_cpus_mask,
 				  sched_domain_span(sd)) < cpu))
-		goto need_kick_unlock;
+		kick = true;
 
+unlock:
 	rcu_read_unlock();
-	return 0;
-
-need_kick_unlock:
-	rcu_read_unlock();
-need_kick:
-	return 1;
+	return kick;
 }
 
 #else
 static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) { }
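
To make the effective-capacity comparison in the need_active_balance() hunk
above concrete, here is a small standalone sketch. It is plain userspace C
with hypothetical names and illustrative values, not kernel code; it assumes
imbalance_pct = 125, the usual default at most domain levels:

	/*
	 * Standalone illustration of the cross-multiplied capacity test
	 * added to need_active_balance(). Compares
	 *   100 * dst_cap / dst_cap_orig
	 * against
	 *   imbalance_pct * src_cap / src_cap_orig
	 * without dividing, as the patch does.
	 */
	#include <stdio.h>

	static int worth_active_balance(unsigned long src_cap,
					unsigned long src_cap_orig,
					unsigned long dst_cap,
					unsigned long dst_cap_orig,
					unsigned long imbalance_pct)
	{
		unsigned long dst_eff = 100 * dst_cap * src_cap_orig;
		unsigned long src_eff = imbalance_pct * src_cap * dst_cap_orig;

		return src_eff < dst_eff;
	}

	int main(void)
	{
		/*
		 * src CPU squeezed to ~60% by IRQ/RT activity, dst CPU idle
		 * at full capacity: 125 * 614 * 1024 < 100 * 1024 * 1024,
		 * so the single CFS task is worth migrating. Prints 1.
		 */
		printf("%d\n", worth_active_balance(614, 1024, 1024, 1024, 125));

		/*
		 * src CPU only mildly reduced (~90%): 125 * 922 * 1024 is
		 * above 100 * 1024 * 1024, so no active balance. Prints 0.
		 */
		printf("%d\n", worth_active_balance(922, 1024, 1024, 1024, 125));
		return 0;
	}

Cross-multiplying keeps the whole test in integer arithmetic: with
imbalance_pct = 125, the task is actively migrated only when the src CPU's
remaining capacity ratio drops below roughly 80% of the dst CPU's.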