From patchwork Mon Jul 20 08:33:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 235757
From: Vincent Guittot
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
 mgorman@suse.de, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: valentin.schneider@arm.com, sashal@kernel.org, Vincent Guittot
Subject: [PATCH 5.4] sched/fair: handle case of task_h_load() returning 0
Date: Mon, 20 Jul 2020 10:33:45 +0200
Message-Id: <20200720083345.22101-1-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.17.1
Sender: stable-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: stable@vger.kernel.org

[ Upstream commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 ]

task_h_load() can return 0 in some situations, such as running stress-ng
mmapfork, which forks thousands of threads, in a sched group on a
224-core system. The load balancer doesn't handle this correctly:
env->imbalance never decreases, so it stops pulling tasks only after
reaching loop_max, which can be equal to the number of running tasks of
the cfs. Make sure that imbalance will be decreased by at least 1.

Misfit task is the other feature that doesn't handle this situation
correctly, although the problem is probably harder to hit there because
of the smaller number of CPUs and running tasks on heterogeneous systems.

We can't simply ensure that task_h_load() returns at least one because
that would require handling underflow in other places.

Signed-off-by: Vincent Guittot
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Valentin Schneider
Reviewed-by: Dietmar Eggemann
Tested-by: Dietmar Eggemann
Cc: # v5.4
cc: Sasha Levin
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
---
 kernel/sched/fair.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--
2.17.1

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2f81e4ae844e..9b16080093be 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3824,7 +3824,11 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
 		return;
 	}
 
-	rq->misfit_task_load = task_h_load(p);
+	/*
+	 * Make sure that misfit_task_load will not be null even if
+	 * task_h_load() returns 0.
+	 */
+	rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
 }
 
 #else /* CONFIG_SMP */
@@ -7407,7 +7411,15 @@ static int detach_tasks(struct lb_env *env)
 		if (!can_migrate_task(p, env))
 			goto next;
 
-		load = task_h_load(p);
+		/*
+		 * Depending of the number of CPUs and tasks and the
+		 * cgroup hierarchy, task_h_load() can return a null
+		 * value. Make sure that env->imbalance decreases
+		 * otherwise detach_tasks() will stop only after
+		 * detaching up to loop_max tasks.
+		 */
+		load = max_t(unsigned long, task_h_load(p), 1);
+
 		if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
 			goto next;
 