From patchwork Mon Oct 17 11:49:55 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Dietmar Eggemann
X-Patchwork-Id: 77735
Delivered-To: patch@linaro.org
Received: by 10.140.97.247 with SMTP id m110csp354260qge;
 Mon, 17 Oct 2016 04:50:24 -0700 (PDT)
X-Received: by 10.98.214.194 with SMTP id a63mr37856322pfl.9.1476705024032;
 Mon, 17 Oct 2016 04:50:24 -0700 (PDT)
Return-Path:
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
 by mx.google.com with ESMTP id j2si30530105pfj.194.2016.10.17.04.50.23;
 Mon, 17 Oct 2016 04:50:24 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as
 permitted sender) client-ip=209.132.180.67;
Authentication-Results: mx.google.com;
 spf=pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as
 permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S934231AbcJQLuV (ORCPT + 27 others);
 Mon, 17 Oct 2016 07:50:21 -0400
Received: from foss.arm.com ([217.140.101.70]:59606 "EHLO foss.arm.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S932550AbcJQLuL (ORCPT );
 Mon, 17 Oct 2016 07:50:11 -0400
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 64F282B;
 Mon, 17 Oct 2016 04:49:57 -0700 (PDT)
Received: from [10.1.210.41] (e107985-lin.cambridge.arm.com [10.1.210.41])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id ECEAD3F218;
 Mon, 17 Oct 2016 04:49:55 -0700 (PDT)
Subject: Re: [v4.8-rc1 Regression] sched/fair: Apply more PELT fixes
To: Vincent Guittot, Joseph Salisbury
References: <57FE6302.6060103@canonical.com>
 <57FFADC8.2020602@canonical.com>
 <43c59cba-2044-1de2-0f78-8f346bd1e3cb@arm.com>
 <20161014151827.GA10379@linaro.org>
 <2bb765e7-8a5f-c525-a6ae-fbec6fae6354@canonical.com>
 <20161017090903.GA11962@linaro.org>
Cc: Ingo Molnar, Peter Zijlstra, Linus Torvalds, Thomas Gleixner, LKML,
 Mike Galbraith, omer.akram@canonical.com
From: Dietmar Eggemann
Message-ID: <4e15ad55-beeb-e860-0420-8f439d076758@arm.com>
Date: Mon, 17 Oct 2016 12:49:55 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.2.0
MIME-Version: 1.0
In-Reply-To: <20161017090903.GA11962@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Vincent,

On 17/10/16 10:09, Vincent Guittot wrote:
> On Friday 14 Oct 2016 at 12:04:02 (-0400), Joseph Salisbury wrote:
>> On 10/14/2016 11:18 AM, Vincent Guittot wrote:
>>> On Friday 14 Oct 2016 at 14:10:07 (+0100), Dietmar Eggemann wrote:
>>>> On 14/10/16 09:24, Vincent Guittot wrote:

[...]

> Could you try the patch below on top of the faulty kernel?
>
> ---
>  kernel/sched/fair.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 8b03fb5..8926685 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2902,7 +2902,8 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>   */
>  static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force)
>  {
> -	long delta = cfs_rq->avg.load_avg - cfs_rq->tg_load_avg_contrib;
> +	unsigned long load_avg = READ_ONCE(cfs_rq->avg.load_avg);
> +	long delta = load_avg - cfs_rq->tg_load_avg_contrib;
>
>  	/*
>  	 * No need to update load_avg for root_task_group as it is not used.
> @@ -2912,7 +2913,7 @@ static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force)
>
>  	if (force || abs(delta) > cfs_rq->tg_load_avg_contrib / 64) {
>  		atomic_long_add(delta, &cfs_rq->tg->load_avg);
> -		cfs_rq->tg_load_avg_contrib = cfs_rq->avg.load_avg;
> +		cfs_rq->tg_load_avg_contrib = load_avg;
>  	}
>  }

Tested it on an Ubuntu 16.10 Server (on top of the default
4.8.0-22-generic kernel) on a Lenovo T430 and it didn't help.

What seems to cure it is to get rid of the snippet removed by the diff
below (part of the commit mentioned earlier in this thread:
3d30544f0212).

BTW, I guess .tg_load_avg can reach ~300000-400000 on such a system
initially, because systemd creates all ~100 services (and therefore the
corresponding second-level tg's) at once. In my previous example, there
was 500ms between the creation of two tg's, so a lot of decaying went on
in between.

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 039de34f1521..16c692049fbf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -726,7 +726,6 @@ void post_init_entity_util_avg(struct sched_entity *se)
 	struct sched_avg *sa = &se->avg;
 	long cap = (long)(SCHED_CAPACITY_SCALE - cfs_rq->avg.util_avg) / 2;
 	u64 now = cfs_rq_clock_task(cfs_rq);
-	int tg_update;
 
 	if (cap > 0) {
 		if (cfs_rq->avg.util_avg != 0) {
@@ -759,10 +758,8 @@ void post_init_entity_util_avg(struct sched_entity *se)
 		}
 	}
 
-	tg_update = update_cfs_rq_load_avg(now, cfs_rq, false);
+	update_cfs_rq_load_avg(now, cfs_rq, false);
 	attach_entity_load_avg(cfs_rq, se);
-	if (tg_update)
-		update_tg_load_avg(cfs_rq, false);
 }
 
 #else /* !CONFIG_SMP */
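
For anyone following the thread, here is a rough userspace sketch of the propagation rule in the quoted update_tg_load_avg(): the cfs_rq's contribution is folded into the task group's load_avg only when forced, or when it has drifted by more than 1/64 of the last propagated value. This is a model under stated assumptions, not kernel code. The struct names tg_model and cfs_rq_model and the int return value are made up for illustration, a plain addition stands in for atomic_long_add(), and the root_task_group early return is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct tg_model {
	long load_avg;				/* models tg->load_avg */
};

struct cfs_rq_model {
	unsigned long load_avg;			/* models cfs_rq->avg.load_avg */
	unsigned long tg_load_avg_contrib;	/* last value propagated to the tg */
	struct tg_model *tg;
};

/*
 * Model of the quoted update_tg_load_avg() logic.
 * Returns 1 if the tg was updated, 0 otherwise (the kernel function
 * returns void; the return value exists only to make the model testable).
 */
static int update_tg_load_avg_model(struct cfs_rq_model *cfs_rq, int force)
{
	unsigned long load_avg;
	long delta;

	assert(cfs_rq && cfs_rq->tg);

	/* Read load_avg once so delta and the stored contrib agree. */
	load_avg = cfs_rq->load_avg;
	delta = (long)load_avg - (long)cfs_rq->tg_load_avg_contrib;

	if (force || labs(delta) > (long)(cfs_rq->tg_load_avg_contrib / 64)) {
		cfs_rq->tg->load_avg += delta;	/* atomic_long_add() in the kernel */
		cfs_rq->tg_load_avg_contrib = load_avg;
		return 1;
	}
	return 0;
}
```

With a previous contribution of 1024 the threshold is 1024 / 64 = 16, so a change of 6 is not propagated while a change of 17 is. A freshly created cfs_rq has a contribution of 0, so any non-zero load_avg is propagated at once.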