From patchwork Wed Oct 19 12:45:23 2016
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 78256
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
	dietmar.eggemann@arm.com, joseph.salisbury@canonical.com
Cc: joonwoop@codeaurora.org, stable@vger.kernel.org, Vincent Guittot
Subject: [PATCH] sched: fix wrong task group's load_avg
Date: Wed, 19 Oct 2016 14:45:23 +0200
Message-Id: <1476881123-10159-1-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: stable@vger.kernel.org

A regression has been reported with commit 3d30544f0212 ("sched/fair: Apply
more PELT fixes") when several
levels of task groups are involved and cpu_possible_mask != cpu_present_mask.

The root cause is that a group entity's load (tg_child->se[i]->avg.load_avg)
is initialized to scale_load_down(se->load.weight). During the creation of a
child task group, its group entities on the possible CPUs are attached to the
parent's cfs_rq (tg_parent) and their loads are added to the parent's load
(tg_parent->load_avg) with update_tg_load_avg().

But only the load on online CPUs is subsequently updated to reflect the real
load; the load on the other CPUs stays at its initial value. The result is a
tg_parent->load_avg that is higher than the real load, so the weight of the
group entities (tg_parent->se[i]->load.weight) on online CPUs is smaller than
it should be, and the task group gets less running time than it should.

This situation can be detected with /proc/sched_debug: the ".tg_load_avg" of
the task group will be much higher than the sum of the ".tg_load_avg_contrib"
of the task group's online cfs_rqs.

The load of group entities does not have to be initialized to anything other
than 0, because their load will increase when an entity is attached.

Fixes: 3d30544f0212 ("sched/fair: Apply more PELT fixes")
Reported-by: Joseph Salisbury
Signed-off-by: Vincent Guittot
Tested-by: Dietmar Eggemann
Cc: <stable@vger.kernel.org> # 4.8.x
---
 kernel/sched/fair.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8b03fb5..89776ac 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -690,7 +690,14 @@ void init_entity_runnable_average(struct sched_entity *se)
	 * will definitely be update (after enqueue).
	 */
	sa->period_contrib = 1023;
-	sa->load_avg = scale_load_down(se->load.weight);
+	/*
+	 * Tasks are initialized with full load to be seen as heavy tasks
+	 * until they get a chance to stabilize to their real load level.
+	 * Group entities are initialized with zero load to reflect the fact
+	 * that nothing has been attached to the task group yet.
+	 */
+	if (entity_is_task(se))
+		sa->load_avg = scale_load_down(se->load.weight);
	sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
	/*
	 * At this point, util_avg won't be used in select_task_rq_fair anyway

--
2.7.4
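For illustration only (not part of the patch), here is a minimal user-space
sketch of the accounting drift described in the changelog. The constants
NR_POSSIBLE, NR_ONLINE and NICE_0_LOAD are made-up stand-ins for
cpu_possible_mask, the online CPUs and scale_load_down(se->load.weight) of an
idle group entity; the loops only model the arithmetic, not the kernel's PELT
code or data structures.

/*
 * Illustrative model, assuming an 8-CPU possible mask of which only 4 CPUs
 * ever come online, and a task group that runs no tasks at all.
 */
#include <stdio.h>

#define NR_POSSIBLE	8	/* stand-in for cpu_possible_mask weight */
#define NR_ONLINE	4	/* stand-in for the online CPUs */
#define NICE_0_LOAD	1024	/* assumed initial group entity load_avg */

int main(void)
{
	unsigned long tg_load_avg = 0;
	int cpu;

	/*
	 * Child group creation: every possible CPU's group entity starts
	 * with a full initial load that is added to the parent's load_avg.
	 */
	for (cpu = 0; cpu < NR_POSSIBLE; cpu++)
		tg_load_avg += NICE_0_LOAD;

	/*
	 * Later updates run only on online CPUs; since the group runs
	 * nothing, their contributions decay back to 0 and are removed.
	 */
	for (cpu = 0; cpu < NR_ONLINE; cpu++)
		tg_load_avg -= NICE_0_LOAD;

	/*
	 * What remains is the stale contribution of the CPUs that never
	 * came online: 4 * 1024 = 4096 instead of 0, which then skews the
	 * weight computed for the group's online entities.
	 */
	printf("stale tg->load_avg: %lu\n", tg_load_avg);
	return 0;
}

With the patch applied, the initial contribution of a group entity is 0, so
CPUs that never come online add nothing to the parent's load_avg. The
mismatch itself can be spotted by comparing ".tg_load_avg" against the sum of
the online cfs_rqs' ".tg_load_avg_contrib" in /proc/sched_debug, as described
above.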