From patchwork Fri Feb 27 15:54:14 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 45252
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
 preeti@linux.vnet.ibm.com, Morten.Rasmussen@arm.com,
 kamalesh@linux.vnet.ibm.com
Cc: riel@redhat.com, efault@gmx.de, nicolas.pitre@linaro.org,
 dietmar.eggemann@arm.com, linaro-kernel@lists.linaro.org, Vincent Guittot
Subject: [PATCH v10 11/11] sched: move cfs task on a CPU with higher capacity
Date: Fri, 27 Feb 2015 16:54:14 +0100
Message-Id: <1425052454-25797-12-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1425052454-25797-1-git-send-email-vincent.guittot@linaro.org>
References: <1425052454-25797-1-git-send-email-vincent.guittot@linaro.org>

When a CPU is used to handle a lot of IRQs or some RT tasks, the remaining
capacity for CFS tasks can be significantly reduced. Once we detect such a
situation by comparing cpu_capacity_orig and cpu_capacity, we trigger an idle
load balance to check if it's worth moving its tasks to an idle CPU.
It's worth trying to move the task before the CPU is fully utilized to
minimize the preemption by irq or RT tasks.

Once the idle load_balance has selected the busiest CPU, it will look for an
active load balance in only two cases:
- there is only 1 task on the busiest CPU.
- we haven't been able to move a task of the busiest rq.

A CPU with a reduced capacity is included in the 1st case, and it is worth
actively migrating its task if the idle CPU has more available capacity for
CFS tasks. This test has been added in need_active_balance.

As a sidenote, this will not generate more spurious ilb because we already
trigger an ilb if there is more than 1 busy cpu. If this cpu is the only one
that has a task, we will trigger the ilb once for migrating the task.

The nohz_kick_needed function has been cleaned up a bit while adding the new
test.

env.src_cpu and env.src_rq must be set unconditionally because they are used
in need_active_balance, which is called even if busiest->nr_running equals 1.

Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 69 ++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 47 insertions(+), 22 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7420d21..e70c315 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6855,6 +6855,19 @@ static int need_active_balance(struct lb_env *env)
 			return 1;
 	}
 
+	/*
+	 * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
+	 * It's worth migrating the task if the src_cpu's capacity is reduced
+	 * because of other sched_class or IRQs if more capacity stays
+	 * available on dst_cpu.
+	 */
+	if ((env->idle != CPU_NOT_IDLE) &&
+	    (env->src_rq->cfs.h_nr_running == 1)) {
+		if ((check_cpu_capacity(env->src_rq, sd)) &&
+		    (capacity_of(env->src_cpu)*sd->imbalance_pct < capacity_of(env->dst_cpu)*100))
+			return 1;
+	}
+
 	return unlikely(sd->nr_balance_failed > sd->cache_nice_tries+2);
 }
 
@@ -6954,6 +6967,9 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 
 	schedstat_add(sd, lb_imbalance[idle], env.imbalance);
 
+	env.src_cpu = busiest->cpu;
+	env.src_rq = busiest;
+
 	ld_moved = 0;
 	if (busiest->nr_running > 1) {
 		/*
@@ -6963,8 +6979,6 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		 * correctly treated as an imbalance.
 		 */
 		env.flags |= LBF_ALL_PINNED;
-		env.src_cpu = busiest->cpu;
-		env.src_rq = busiest;
 		env.loop_max = min(sysctl_sched_nr_migrate, busiest->nr_running);
 
 more_balance:
@@ -7664,22 +7678,25 @@ static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 /*
  * Current heuristic for kicking the idle load balancer in the presence
- * of an idle cpu is the system.
+ * of an idle cpu in the system.
  *   - This rq has more than one task.
- *   - At any scheduler domain level, this cpu's scheduler group has multiple
- *     busy cpu's exceeding the group's capacity.
+ *   - This rq has at least one CFS task and the capacity of the CPU is
+ *     significantly reduced because of RT tasks or IRQs.
+ *   - At parent of LLC scheduler domain level, this cpu's scheduler group has
+ *     multiple busy cpu.
  *   - For SD_ASYM_PACKING, if the lower numbered cpu's in the scheduler
  *     domain span are idle.
  */
-static inline int nohz_kick_needed(struct rq *rq)
+static inline bool nohz_kick_needed(struct rq *rq)
 {
 	unsigned long now = jiffies;
 	struct sched_domain *sd;
 	struct sched_group_capacity *sgc;
 	int nr_busy, cpu = rq->cpu;
+	bool kick = false;
 
 	if (unlikely(rq->idle_balance))
-		return 0;
+		return false;
 
 	/*
 	 * We may be recently in ticked or tickless idle mode. At the first
@@ -7693,38 +7710,46 @@ static inline int nohz_kick_needed(struct rq *rq)
 	 * balancing.
 	 */
 	if (likely(!atomic_read(&nohz.nr_cpus)))
-		return 0;
+		return false;
 
 	if (time_before(now, nohz.next_balance))
-		return 0;
+		return false;
 
 	if (rq->nr_running >= 2)
-		goto need_kick;
+		return true;
 
 	rcu_read_lock();
 	sd = rcu_dereference(per_cpu(sd_busy, cpu));
-
 	if (sd) {
 		sgc = sd->groups->sgc;
 		nr_busy = atomic_read(&sgc->nr_busy_cpus);
 
-		if (nr_busy > 1)
-			goto need_kick_unlock;
+		if (nr_busy > 1) {
+			kick = true;
+			goto unlock;
+		}
+
 	}
 
-	sd = rcu_dereference(per_cpu(sd_asym, cpu));
+	sd = rcu_dereference(rq->sd);
+	if (sd) {
+		if ((rq->cfs.h_nr_running >= 1) &&
+				check_cpu_capacity(rq, sd)) {
+			kick = true;
+			goto unlock;
+		}
+	}
 
+	sd = rcu_dereference(per_cpu(sd_asym, cpu));
 	if (sd && (cpumask_first_and(nohz.idle_cpus_mask,
-				  sched_domain_span(sd)) < cpu))
-		goto need_kick_unlock;
-
-	rcu_read_unlock();
-	return 0;
+				  sched_domain_span(sd)) < cpu)) {
+		kick = true;
+		goto unlock;
+	}
 
-need_kick_unlock:
+unlock:
 	rcu_read_unlock();
-need_kick:
-	return 1;
+	return kick;
 }
 #else
 static void nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) { }
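
For readers who want to check the arithmetic of the new need_active_balance
test outside the kernel, here is a minimal user-space sketch. It is not
kernel code. struct cpu_model, capacity_reduced() and worth_active_balance()
are hypothetical names invented for this illustration. The body of
check_cpu_capacity() is not part of this diff, so the sketch assumes it
compares the remaining capacity against the original capacity, scaled by
imbalance_pct, in the same way the diff compares the source and destination
CPUs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU values the patch reads. */
struct cpu_model {
	unsigned long capacity_orig;	/* capacity with no RT/IRQ pressure */
	unsigned long capacity;		/* capacity left for CFS tasks */
	unsigned int cfs_nr_running;	/* runnable CFS tasks on this CPU */
};

/*
 * Assumed form of the "capacity is significantly reduced" test:
 * remaining capacity, scaled by imbalance_pct, is below the original.
 */
bool capacity_reduced(const struct cpu_model *c, unsigned int imbalance_pct)
{
	return c->capacity * imbalance_pct < c->capacity_orig * 100;
}

/*
 * Mirrors the condition added to need_active_balance(): the destination
 * is idle, the source runs exactly one CFS task, the source capacity is
 * reduced, and the destination offers enough extra capacity.
 */
bool worth_active_balance(const struct cpu_model *src,
			  const struct cpu_model *dst,
			  bool dst_idle, unsigned int imbalance_pct)
{
	if (!dst_idle || src->cfs_nr_running != 1)
		return false;

	return capacity_reduced(src, imbalance_pct) &&
	       src->capacity * imbalance_pct < dst->capacity * 100;
}
```

With an illustrative imbalance_pct of 125 and a destination capacity of 1024,
a source CPU left with 600 out of 1024 passes both tests (600 * 125 = 75000 is
below 1024 * 100 = 102400). A source left with 900 does not (900 * 125 =
112500), so a small loss of capacity would not trigger a migration.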