From patchwork Wed Jul 8 16:58:28 2015
X-Patchwork-Submitter: Xunlei Pang
X-Patchwork-Id: 50903
From: Xunlei Pang
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, Steven Rostedt, Juri Lelli, Ingo Molnar, Xunlei Pang
Subject: [PATCH v5 1/2] sched/rt: Check to push the task away after its affinity was changed
Date: Thu, 9 Jul 2015 00:58:28 +0800
Message-Id: <1436374709-26703-1-git-send-email-xlpang@126.com>
X-Mailer: git-send-email 1.9.1

From: Xunlei Pang

We may end up with an extra RT-overloaded runqueue because of task affinity,
so when the affinity of any runnable RT task is changed, we should check
whether a push is needed; otherwise real-time response is delayed for no
reason. Unfortunately, the current RT global scheduler does nothing about
this.

For example, take a 2-CPU system with two runnable FIFO tasks of the same
rt_priority bound to CPU0; call them rt1 (running) and rt2 (runnable). CPU1
has no RT tasks. If someone then sets the affinity of rt2 to 0x3 (i.e. CPU0
and CPU1), rt2 still cannot run until rt1 enters schedule(), which causes a
noticeable response latency for rt2.

This patch tries to push the task away as soon as it becomes migratable.

The patch also fixes a problem with move_queued_task() as called from
set_cpus_allowed_ptr(): when a lower-priority RT task is migrated because
its current CPU is not in the new affinity mask, it misses the chance to be
pushed away after move_queued_task(), because check_preempt_curr() called by
move_queued_task() doesn't set the need-resched flag for lower-priority
tasks.
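For illustration, the scenario above can be approximated from userspace with
the sketch below (not part of this patch). It assumes a 2-CPU machine and
root privileges for SCHED_FIFO; the helper names, the priority of 50 and the
one-second sleep are arbitrary choices for the example, and error handling
is omitted.

/*
 * Illustrative sketch only: roughly reproduces the rt1/rt2 scenario
 * on a 2-CPU box.  Requires root for SCHED_FIFO.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>
#include <sys/types.h>

static pid_t spawn_spinner(void)
{
	pid_t pid = fork();

	if (pid == 0)			/* child: burn CPU forever */
		for (;;)
			;
	return pid;
}

/* Bind @pid to CPUs [0, ncpus) and make it SCHED_FIFO at priority 50. */
static void make_fifo_on(pid_t pid, int ncpus)
{
	struct sched_param sp = { .sched_priority = 50 };
	cpu_set_t set;
	int cpu;

	CPU_ZERO(&set);
	for (cpu = 0; cpu < ncpus; cpu++)
		CPU_SET(cpu, &set);

	sched_setaffinity(pid, sizeof(set), &set);
	sched_setscheduler(pid, SCHED_FIFO, &sp);
}

int main(void)
{
	pid_t rt1 = spawn_spinner();
	pid_t rt2 = spawn_spinner();

	make_fifo_on(rt1, 1);	/* rt1: FIFO, bound to CPU0, ends up running */
	make_fifo_on(rt2, 1);	/* rt2: FIFO, bound to CPU0, runnable, starved */

	sleep(1);

	/*
	 * Widen rt2's affinity to 0x3 (CPU0 and CPU1).  Without the push in
	 * set_cpus_allowed_rt(), rt2 keeps waiting behind rt1 even though
	 * CPU1 is idle; with it, rt2 is pushed to CPU1 right away.
	 */
	make_fifo_on(rt2, 2);

	pause();		/* parent stays SCHED_OTHER and just waits */
	return 0;
}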
Signed-off-by: Xunlei Pang
---
v4->v5:
Avoid triggering the lockdep_pin stuff that Peter pointed out:
    lockdep_unpin_lock(&rq->lock);
    __balance_callback(rq);
    lockdep_pin_lock(&rq->lock);

 kernel/sched/core.c | 87 ++++++++++++++++++++++++++++++-----------------------
 kernel/sched/rt.c   | 18 ++++++++---
 2 files changed, 63 insertions(+), 42 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b803e1b..bb8f3ad 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1045,6 +1045,46 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
 }
 
 #ifdef CONFIG_SMP
+
+/* rq->lock is held */
+static void __balance_callback(struct rq *rq)
+{
+	struct callback_head *head, *next;
+	void (*func)(struct rq *rq);
+
+	head = rq->balance_callback;
+	rq->balance_callback = NULL;
+	while (head) {
+		func = (void (*)(struct rq *))head->func;
+		next = head->next;
+		head->next = NULL;
+		head = next;
+
+		func(rq);
+	}
+}
+
+/* preemption is disabled */
+static inline void balance_callback(struct rq *rq)
+{
+	if (unlikely(rq->balance_callback)) {
+		unsigned long flags;
+
+		raw_spin_lock_irqsave(&rq->lock, flags);
+		__balance_callback(rq);
+		raw_spin_unlock_irqrestore(&rq->lock, flags);
+	}
+}
+
+#else
+
+static inline void balance_callback(struct rq *rq)
+{
+}
+
+#endif
+
+#ifdef CONFIG_SMP
 /*
  * This is how migration works:
  *
@@ -1187,6 +1227,16 @@ int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
 	}
 
 	do_set_cpus_allowed(p, new_mask);
+	/*
+	 * rq->lock might get released during __balance_callback(),
+	 * but if there's any successful migrating of @p, task_cpu(p)
+	 * will obviously be in the new_mask, as p->pi_lock is never
+	 * released; Thus, subsequent cpumask_test_cpu() is true and
+	 * will make it return safely in such case.
+	 */
+	lockdep_unpin_lock(&rq->lock);
+	__balance_callback(rq);
+	lockdep_pin_lock(&rq->lock);
 
 	/* Can the task run on the task's current CPU? If so, we're done */
 	if (cpumask_test_cpu(task_cpu(p), new_mask))
@@ -2480,43 +2530,6 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	return rq;
 }
 
-#ifdef CONFIG_SMP
-
-/* rq->lock is NOT held, but preemption is disabled */
-static void __balance_callback(struct rq *rq)
-{
-	struct callback_head *head, *next;
-	void (*func)(struct rq *rq);
-	unsigned long flags;
-
-	raw_spin_lock_irqsave(&rq->lock, flags);
-	head = rq->balance_callback;
-	rq->balance_callback = NULL;
-	while (head) {
-		func = (void (*)(struct rq *))head->func;
-		next = head->next;
-		head->next = NULL;
-		head = next;
-
-		func(rq);
-	}
-	raw_spin_unlock_irqrestore(&rq->lock, flags);
-}
-
-static inline void balance_callback(struct rq *rq)
-{
-	if (unlikely(rq->balance_callback))
-		__balance_callback(rq);
-}
-
-#else
-
-static inline void balance_callback(struct rq *rq)
-{
-}
-
-#endif
-
 /**
  * schedule_tail - first thing a freshly forked thread must call.
  * @prev: the thread we just switched away from.
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 00816ee..d27061f 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2089,14 +2089,15 @@ static void set_cpus_allowed_rt(struct task_struct *p,
 
 	weight = cpumask_weight(new_mask);
 
+	rq = task_rq(p);
+
 	/*
-	 * Only update if the process changes its state from whether it
-	 * can migrate or not.
+	 * Skip updating the migration stuff if the process doesn't change
+	 * its migrate state, but still need to check if it can be pushed
+	 * away due to its new affinity.
 	 */
 	if ((p->nr_cpus_allowed > 1) == (weight > 1))
-		return;
-
-	rq = task_rq(p);
+		goto queue_push;
 
 	/*
 	 * The process used to be able to migrate OR it can now migrate
@@ -2113,6 +2114,13 @@ static void set_cpus_allowed_rt(struct task_struct *p,
 	}
 
 	update_rt_migration(&rq->rt);
+
+queue_push:
+	if (weight > 1 &&
+	    !task_running(rq, p) &&
+	    !test_tsk_need_resched(rq->curr) &&
+	    !cpumask_subset(new_mask, &p->cpus_allowed))
+		queue_push_tasks(rq);
 }
 
 /* Assumes rq->lock is held */
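
For reference, below is a minimal, kernel-independent sketch of the
detach-and-run callback list that __balance_callback() above walks. It is
illustrative only: struct rq is reduced to just the list head, and
callback_head::func is typed directly as void (*)(struct rq *), whereas the
real kernel reuses the generic struct callback_head and casts the function
pointer.

#include <stddef.h>
#include <stdio.h>

struct rq;

struct callback_head {
	struct callback_head *next;
	void (*func)(struct rq *rq);
};

struct rq {
	struct callback_head *balance_callback;	/* singly linked LIFO list */
};

/* Queue work to run later (caller ensures it is not already queued). */
static void queue_callback(struct rq *rq, struct callback_head *head,
			   void (*func)(struct rq *rq))
{
	head->func = func;
	head->next = rq->balance_callback;
	rq->balance_callback = head;
}

/* Detach the whole list first, then run each entry. */
static void run_callbacks(struct rq *rq)
{
	struct callback_head *head = rq->balance_callback, *next;
	void (*func)(struct rq *rq);

	rq->balance_callback = NULL;
	while (head) {
		func = head->func;
		next = head->next;
		head->next = NULL;	/* mark as no longer queued */
		head = next;

		func(rq);
	}
}

static void push_tasks(struct rq *rq) { printf("push_tasks(%p)\n", (void *)rq); }
static void pull_tasks(struct rq *rq) { printf("pull_tasks(%p)\n", (void *)rq); }

int main(void)
{
	struct rq rq = { .balance_callback = NULL };
	struct callback_head push_head = { NULL, NULL };
	struct callback_head pull_head = { NULL, NULL };

	queue_callback(&rq, &push_head, push_tasks);
	queue_callback(&rq, &pull_head, pull_tasks);
	run_callbacks(&rq);	/* runs pull_tasks(), then push_tasks() (LIFO) */
	return 0;
}

Detaching the list before running anything is what lets a callback, such as
the push work queued by queue_push_tasks() in the rt.c hunk above, queue
further work without this pass walking it again.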