From patchwork Tue Sep 30 08:41:08 2014
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 38129
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, riel@redhat.com,
 linux-kernel@vger.kernel.org
Cc: linaro-kernel@lists.linaro.org, Vincent Guittot
Subject: [PATCH] sched: fix spurious active migration
Date: Tue, 30 Sep 2014 10:41:08 +0200
Message-Id: <1412066468-4340-1-git-send-email-vincent.guittot@linaro.org>

Since commit caeb178c60f4 ("sched/fair: Make update_sd_pick_busiest() ...")
sd_pick_busiest returns a group that can be neither imbalanced nor
overloaded but is only more loaded than others. This change has been
introduced to ensure a better load balance in systems that are not
overloaded but, as a side effect, it can also generate useless active
migrations between groups.

Let's take the example of 3 tasks on a quad-core system. We will always
have an idle core, so the load balance will find a busiest group (core)
whenever an ILB is triggered, and it will force an active migration (once
above the nr_balance_failed threshold) so that the idle core becomes busy
but another core becomes idle. With the next ILB, the freshly idle core
will try to pull the task of a busy CPU.
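The ping-pong in the quad-core example can be reproduced with a toy model. The sketch below is not scheduler code: `simulate_ilb_rounds()` and `count_idle()` are hypothetical helpers, and the rule that every idle load balance ends in an active migration is a simplifying assumption (the real path first has to exceed the nr_balance_failed threshold).

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* quad-core example from the text */

/*
 * Toy model (hypothetical, not kernel code): nr_running[i] is the number
 * of tasks on CPU i. Each round stands for one idle load balance that
 * ends in an active migration: the first idle CPU pulls one task from
 * the first busy CPU. Returns the number of migrations performed.
 */
unsigned int simulate_ilb_rounds(unsigned int nr_running[NR_CPUS],
				 unsigned int rounds)
{
	unsigned int migrations = 0;

	assert(nr_running != NULL);

	for (unsigned int r = 0; r < rounds; r++) {
		int idle = -1, busy = -1;

		for (int cpu = 0; cpu < NR_CPUS; cpu++) {
			if (nr_running[cpu] == 0 && idle < 0)
				idle = cpu;
			else if (nr_running[cpu] > 0 && busy < 0)
				busy = cpu;
		}
		if (idle < 0 || busy < 0)
			break;	/* no idle CPU, or no task to pull */

		nr_running[busy]--;
		nr_running[idle]++;
		migrations++;
	}
	return migrations;
}

/* Number of CPUs without any task. */
unsigned int count_idle(const unsigned int nr_running[NR_CPUS])
{
	unsigned int idle = 0;

	assert(nr_running != NULL);

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		idle += (nr_running[cpu] == 0);
	return idle;
}
```

Starting from `{1, 1, 1, 0}`, every round moves one task and still leaves exactly one CPU idle, so in this model the number of migrations grows with the number of ILB rounds without the balance ever improving.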
The number of spurious active migrations is not so huge on a quad-core
system because the ILB is not triggered so much. But it becomes
significant as soon as you have more than one sched_domain level, like on
a dual cluster of quad cores, where the ILB is triggered every tick when
you have more than 1 busy_cpu.

We need to ensure that the migration generates a real improvement and will
not only move the avg_load imbalance to another CPU.

Before caeb178c60f4f93f1b45c0bc056b5cf6d217b67f, the filtering of such use
cases was ensured by the following test in f_b_g:

  if ((local->idle_cpus < busiest->idle_cpus) &&
      busiest->sum_nr_running <= busiest->group_weight)

This patch modifies the condition to take into account the situation where
the busiest group is not overloaded: if the diff between the number of
idle cpus in the 2 groups is less than or equal to 1 and the busiest group
is not overloaded, moving a task will not improve the load balance but
just move it.

A test with sysbench on a dual cluster of quad cores gives the following
results:

command: sysbench --test=cpu --num-threads=5 --max-time=5 run

The HZ is 200, which means that 1000 ticks have fired during the test.

-With Mainline, perf gives the following figures

Samples: 727  of event 'sched:sched_migrate_task'
Event count (approx.): 727
Overhead  Command          Shared Object  Symbol
........  ...............  .............  ..............
  12.52%  migration/1      [unknown]      [.] 00000000
  12.52%  migration/5      [unknown]      [.] 00000000
  12.52%  migration/7      [unknown]      [.] 00000000
  12.10%  migration/6      [unknown]      [.] 00000000
  11.83%  migration/0      [unknown]      [.] 00000000
  11.83%  migration/3      [unknown]      [.] 00000000
  11.14%  migration/4      [unknown]      [.] 00000000
  10.87%  migration/2      [unknown]      [.] 00000000
   2.75%  sysbench         [unknown]      [.] 00000000
   0.83%  swapper          [unknown]      [.] 00000000
   0.55%  ktps65090charge  [unknown]      [.] 00000000
   0.41%  mmcqd/1          [unknown]      [.] 00000000
   0.14%  perf             [unknown]      [.] 00000000

-With this patch, perf gives the following figures

Samples: 20  of event 'sched:sched_migrate_task'
Event count (approx.): 20
Overhead  Command          Shared Object  Symbol
........  ...............  .............  ..............
  80.00%  sysbench         [unknown]      [.] 00000000
  10.00%  swapper          [unknown]      [.] 00000000
   5.00%  ktps65090charge  [unknown]      [.] 00000000
   5.00%  migration/1      [unknown]      [.] 00000000

Signed-off-by: Vincent Guittot
Reviewed-by: Rik van Riel
---
 kernel/sched/fair.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2a1e6ac..adad532 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6425,13 +6425,14 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
 
 	if (env->idle == CPU_IDLE) {
 		/*
-		 * This cpu is idle. If the busiest group load doesn't
-		 * have more tasks than the number of available cpu's and
-		 * there is no imbalance between this and busiest group
-		 * wrt to idle cpu's, it is balanced.
+		 * This cpu is idle. If the busiest group is not overloaded
+		 * and there is no imbalance between this and busiest group
+		 * wrt to idle cpus, it is balanced. The imbalance becomes
+		 * significant if the diff is greater than 1 otherwise we
+		 * might end up to just move the imbalance on another group
 		 */
-		if ((local->idle_cpus < busiest->idle_cpus) &&
-		    busiest->sum_nr_running <= busiest->group_weight)
+		if ((local->idle_cpus <= (busiest->idle_cpus + 1)) &&
+		    !(busiest->group_type == group_overloaded))
 			goto out_balanced;
 	} else {
 		/*
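To see how the two conditions in the hunk differ, they can be lifted into standalone predicates. This is only a sketch under stated assumptions: `struct sg_stats` is a cut-down stand-in for the kernel's per-group statistics that keeps just the fields the two checks read, the enum mirrors the group types named in the commit message, and the function names `balanced_before_patch()` and `balanced_after_patch()` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stand-in for the group classification used by the patch. */
enum group_type { group_other, group_imbalanced, group_overloaded };

/* Simplified per-group statistics: only the fields read by the checks. */
struct sg_stats {
	unsigned int idle_cpus;		/* CPUs in the group with no task */
	unsigned int sum_nr_running;	/* tasks running in the group */
	unsigned int group_weight;	/* number of CPUs in the group */
	enum group_type group_type;
};

/* The condition removed by the patch: true means "goto out_balanced". */
bool balanced_before_patch(const struct sg_stats *local,
			   const struct sg_stats *busiest)
{
	assert(local != NULL && busiest != NULL);

	return local->idle_cpus < busiest->idle_cpus &&
	       busiest->sum_nr_running <= busiest->group_weight;
}

/* The condition added by the patch: true means "goto out_balanced". */
bool balanced_after_patch(const struct sg_stats *local,
			  const struct sg_stats *busiest)
{
	assert(local != NULL && busiest != NULL);

	return local->idle_cpus <= busiest->idle_cpus + 1 &&
	       busiest->group_type != group_overloaded;
}
```

As an illustration, assume the 5 sysbench threads are split 3 + 2 across the two quad-core clusters, so the local group has 2 idle cpus and the busiest group has 1. The removed check does not report this as balanced (2 < 1 is false), whereas the added check does (2 <= 1 + 1 and the busiest group is not overloaded). Moving a task in that state would only mirror the 3 + 2 split onto the other cluster.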