From patchwork Fri Jun  3 14:00:22 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 69257
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
 pjt@google.com
Cc: yuyang.du@intel.com, dietmar.eggemann@arm.com, bsegall@google.com,
 Vincent Guittot
Subject: [PATCH v2] sched: fix hierarchical order in rq->leaf_cfs_rq_list
Date: Fri, 3 Jun 2016 16:00:22 +0200
Message-Id: <1464962422-23931-1-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 1.9.1
X-Mailing-List: linux-kernel@vger.kernel.org

Fix the insertion of cfs_rq in rq->leaf_cfs_rq_list to ensure that a
child is always called before its parent.

The hierarchical order in the shares update list was introduced by
commit 67e86250f8ea ("sched: Introduce hierarchal order on shares update list").

With the current implementation, a child can still be put after its
parent.

Let's take the example of:

    root
     \
      b
      /\
      c d*
        |
        e*

with root -> b -> c already enqueued but not d -> e, so the
leaf_cfs_rq_list looks like:

    head -> c -> b -> root -> tail

The branch d -> e will be added the first time that they are enqueued,
starting with e, then d.

When e is added, its parent is not yet on the list, so e is put at the
tail:

    head -> c -> b -> root -> e -> tail

Then d is added at the head because its parent is already on the list:

    head -> d -> c -> b -> root -> e -> tail

e is not placed at the right position and will be called last, whereas
it should be called at the beginning.

Because enqueueing follows a bottom-up sequence, we are sure that we
will finish by adding either a cfs_rq without a parent or a cfs_rq whose
parent is already on the list. We can use this event to detect when we
have finished adding a new branch. For the others, whose parents have
not been added yet, we have to ensure that they are added after their
children, which were inserted in the preceding steps, and after any
potential parents that are already on the list. The easiest way is to
put the cfs_rq just after the last inserted one and to keep track of it
until the branch is fully added.

Signed-off-by: Vincent Guittot
---

v2:
- Add comments
- rename leaf_alone to tmp_alone_branch

 kernel/sched/core.c  |  1 +
 kernel/sched/fair.c  | 54 +++++++++++++++++++++++++++++++++++++++++++++-------
 kernel/sched/sched.h |  1 +
 3 files changed, 49 insertions(+), 7 deletions(-)

-- 
1.9.1

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 404c078..935267b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7386,6 +7386,7 @@ void __init sched_init(void)
 #ifdef CONFIG_FAIR_GROUP_SCHED
 		root_task_group.shares = ROOT_TASK_GROUP_LOAD;
 		INIT_LIST_HEAD(&rq->leaf_cfs_rq_list);
+		rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
 		/*
 		 * How much cpu bandwidth does root_task_group get?
 		 *
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 218f8e8..1a93c4e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -286,19 +286,59 @@ static inline struct cfs_rq *group_cfs_rq(struct sched_entity *grp)
 static inline void list_add_leaf_cfs_rq(struct cfs_rq *cfs_rq)
 {
 	if (!cfs_rq->on_list) {
+		struct rq *rq = rq_of(cfs_rq);
+		int cpu = cpu_of(rq);
 		/*
 		 * Ensure we either appear before our parent (if already
 		 * enqueued) or force our parent to appear after us when it is
-		 * enqueued.  The fact that we always enqueue bottom-up
-		 * reduces this to two cases.
+		 * enqueued. The fact that we always enqueue bottom-up
+		 * reduces this to two cases and a special case for the root
+		 * cfs_rq. Furthermore, it also means that we will always reset
+		 * tmp_alone_branch either when the branch is connected
+		 * to a tree or when we reach the beg of the tree
 		 */
 		if (cfs_rq->tg->parent &&
-		    cfs_rq->tg->parent->cfs_rq[cpu_of(rq_of(cfs_rq))]->on_list) {
-			list_add_rcu(&cfs_rq->leaf_cfs_rq_list,
-				&rq_of(cfs_rq)->leaf_cfs_rq_list);
-		} else {
+		    cfs_rq->tg->parent->cfs_rq[cpu]->on_list) {
+			/*
+			 * If parent is already on the list, we add the child
+			 * just before. Thanks to circular linked property of
+			 * the list, this means to put the child at the tail
+			 * of the list that starts by parent.
+			 */
+			list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
+				&(cfs_rq->tg->parent->cfs_rq[cpu]->leaf_cfs_rq_list));
+			/*
+			 * The branch is now connected to its tree so we can
+			 * reset tmp_alone_branch to the beginning of the
+			 * list.
+			 */
+			rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
+		} else if (!cfs_rq->tg->parent) {
+			/*
+			 * cfs rq without parent should be put
+			 * at the tail of the list.
+			 */
 			list_add_tail_rcu(&cfs_rq->leaf_cfs_rq_list,
-				&rq_of(cfs_rq)->leaf_cfs_rq_list);
+				&rq->leaf_cfs_rq_list);
+			/*
+			 * We have reached the beg of a tree so we can reset
+			 * tmp_alone_branch to the beginning of the list.
+			 */
+			rq->tmp_alone_branch = &rq->leaf_cfs_rq_list;
+		} else {
+			/*
+			 * The parent has not already been added so we want to
+			 * make sure that it will be put after us.
+			 * tmp_alone_branch points to the beg of the branch
+			 * where we will add parent.
+			 */
+			list_add_rcu(&cfs_rq->leaf_cfs_rq_list,
+				rq->tmp_alone_branch);
+			/*
+			 * update tmp_alone_branch to point to the new beg
+			 * of the branch
+			 */
+			rq->tmp_alone_branch = &cfs_rq->leaf_cfs_rq_list;
 		}
 
 		cfs_rq->on_list = 1;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e51145e..00cecd6 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -614,6 +614,7 @@ struct rq {
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	/* list of leaf cfs_rq on this cpu: */
 	struct list_head leaf_cfs_rq_list;
+	struct list_head *tmp_alone_branch;
 #endif /* CONFIG_FAIR_GROUP_SCHED */
 
 	/*
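
For illustration only, the insertion rule in list_add_leaf_cfs_rq() can
be replayed in userspace on the example tree from the changelog. This is
a minimal sketch and not kernel code: struct sim_rq, struct node,
add_leaf(), format_order() and simulate_example() are hypothetical
names, the list helpers are plain re-implementations of a circular
doubly linked list, and RCU and per-cpu handling are left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_insert(struct list_head *n, struct list_head *prev,
			struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Insert n right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
	list_insert(n, head, head->next);
}

/* Insert n right before head. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
	list_insert(n, head->prev, head);
}

/* Hypothetical stand-ins for struct rq and struct cfs_rq. */
struct sim_rq {
	struct list_head leaf_list;
	struct list_head *tmp_alone_branch;
};

struct node {
	const char *name;
	struct node *parent;
	int on_list;
	struct list_head link;
};

/* Same three cases as the patched list_add_leaf_cfs_rq(), without RCU. */
void add_leaf(struct sim_rq *rq, struct node *n)
{
	if (n->on_list)
		return;
	if (n->parent && n->parent->on_list) {
		/* Parent already listed: place the child just before it. */
		list_add_tail(&n->link, &n->parent->link);
		rq->tmp_alone_branch = &rq->leaf_list;
	} else if (!n->parent) {
		/* Root: goes to the tail, the branch is complete. */
		list_add_tail(&n->link, &rq->leaf_list);
		rq->tmp_alone_branch = &rq->leaf_list;
	} else {
		/* Parent not listed yet: chain after the last inserted node. */
		list_add(&n->link, rq->tmp_alone_branch);
		rq->tmp_alone_branch = &n->link;
	}
	n->on_list = 1;
}

/* Write the node names in list order, separated by single spaces. */
void format_order(const struct sim_rq *rq, char *out, size_t len)
{
	size_t used = 0;
	const struct list_head *p;

	if (len == 0)
		return;
	out[0] = '\0';
	for (p = rq->leaf_list.next; p != &rq->leaf_list; p = p->next) {
		const struct node *n = (const struct node *)
			((const char *)p - offsetof(struct node, link));
		int w = snprintf(out + used, len - used, "%s%s",
				 used ? " " : "", n->name);
		if (w < 0 || (size_t)w >= len - used)
			break;	/* stop on error or truncation */
		used += (size_t)w;
	}
}

/* Replay the changelog example: c, b, root first, then e, then d. */
void simulate_example(char *out, size_t len)
{
	struct sim_rq rq;
	struct node root = { .name = "root" };
	struct node b = { .name = "b", .parent = &root };
	struct node c = { .name = "c", .parent = &b };
	struct node d = { .name = "d", .parent = &b };
	struct node e = { .name = "e", .parent = &d };

	list_init(&rq.leaf_list);
	rq.tmp_alone_branch = &rq.leaf_list;
	add_leaf(&rq, &c);
	add_leaf(&rq, &b);
	add_leaf(&rq, &root);
	add_leaf(&rq, &e);
	add_leaf(&rq, &d);
	format_order(&rq, out, len);
}
```

Calling simulate_example(buf, sizeof(buf)) on a 64-byte buffer leaves
"e c d b root" in buf: every child sits before its parent, which is the
ordering the patch is meant to guarantee.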