From patchwork Sat Jan 5 17:49:03 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 13833
From: "Paul E. McKenney"
To: linux-kernel@vger.kernel.org
Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu,
	dhowells@redhat.com, edumazet@google.com, darren@dvhart.com,
	fweisbec@gmail.com, sbw@mit.edu, patches@linaro.org,
	"Paul E. McKenney", "Paul E. McKenney"
Subject: [PATCH tip/core/rcu 13/14] rcu: Abstract rcu_start_future_gp() from
	rcu_nocb_wait_gp()
Date: Sat, 5 Jan 2013 09:49:03 -0800
Message-Id: <1357408144-15830-13-git-send-email-paulmck@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.7.8
In-Reply-To: <1357408144-15830-1-git-send-email-paulmck@linux.vnet.ibm.com>
References: <20130105174844.GA14172@linux.vnet.ibm.com>
 <1357408144-15830-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney"

CPUs going idle will need to record the need for a future grace period,
but won't actually need to block waiting on it.  This commit therefore
splits rcu_start_future_gp(), which does the recording, from
rcu_nocb_wait_gp(), which now invokes rcu_start_future_gp() to do the
recording, after which rcu_nocb_wait_gp() does the waiting.

Signed-off-by: Paul E. McKenney
Signed-off-by: Paul E. McKenney
---
 kernel/rcutree.c        | 123 +++++++++++++++++++++++++++++++++++++++++++++--
 kernel/rcutree.h        |   2 +-
 kernel/rcutree_plugin.h | 104 ++++------------------------------------
 3 files changed, 130 insertions(+), 99 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8ca18ec..bd42feb 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -230,6 +230,7 @@ static ulong jiffies_till_next_fqs = RCU_JIFFIES_TILL_FORCE_QS;
 module_param(jiffies_till_first_fqs, ulong, 0644);
 module_param(jiffies_till_next_fqs, ulong, 0644);
 
+static void rcu_start_gp(struct rcu_state *rsp);
 static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *));
 static void force_quiescent_state(struct rcu_state *rsp);
 static int rcu_pending(int cpu);
@@ -1113,6 +1114,120 @@ static unsigned long rcu_cbs_completed(struct rcu_state *rsp,
 }
 
 /*
+ * Trace-event helper function for rcu_start_future_gp() and
+ * rcu_nocb_wait_gp().
+ */
+static void trace_rcu_future_gp(struct rcu_node *rnp, struct rcu_data *rdp,
+				unsigned long c, char *s)
+{
+	trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
+				      rnp->completed, c, rnp->level,
+				      rnp->grplo, rnp->grphi, s);
+}
+
+/*
+ * Start some future grace period, as needed to handle newly arrived
+ * callbacks.  The required future grace periods are recorded in each
+ * rcu_node structure's ->need_future_gp field.
+ *
+ * The caller must hold the specified rcu_node structure's ->lock.
+ */
+static unsigned long __maybe_unused
+rcu_start_future_gp(struct rcu_node *rnp, struct rcu_data *rdp)
+{
+	unsigned long c;
+	int i;
+	struct rcu_node *rnp_root = rcu_get_root(rdp->rsp);
+
+	/*
+	 * Pick up grace-period number for new callbacks.  If this
+	 * grace period is already marked as needed, return to the caller.
+	 */
+	c = rcu_cbs_completed(rdp->rsp, rnp);
+	trace_rcu_future_gp(rnp, rdp, c, "Startleaf");
+	if (rnp->need_future_gp[c & 0x1]) {
+		trace_rcu_future_gp(rnp, rdp, c, "Prestartleaf");
+		return c;
+	}
+
+	/*
+	 * If either this rcu_node structure or the root rcu_node structure
+	 * believes that a grace period is in progress, then we must wait
+	 * for the one following, which is in "c".  Because our request
+	 * will be noticed at the end of the current grace period, we don't
+	 * need to explicitly start one.
+	 */
+	if (rnp->gpnum != rnp->completed ||
+	    ACCESS_ONCE(rnp->gpnum) != ACCESS_ONCE(rnp->completed)) {
+		rnp->need_future_gp[c & 0x1]++;
+		trace_rcu_future_gp(rnp, rdp, c, "Startedleaf");
+		return c;
+	}
+
+	/*
+	 * There might be no grace period in progress.  If we don't already
+	 * hold it, acquire the root rcu_node structure's lock in order to
+	 * start one (if needed).
+	 */
+	if (rnp != rnp_root)
+		raw_spin_lock(&rnp_root->lock);
+
+	/*
+	 * Get a new grace-period number.  If there really is no grace
+	 * period in progress, it will be smaller than the one we obtained
+	 * earlier.  Adjust callbacks as needed.  Note that even no-CBs
+	 * CPUs have a ->nxtcompleted[] array, so no no-CBs checks needed.
+	 */
+	c = rcu_cbs_completed(rdp->rsp, rnp_root);
+	for (i = RCU_DONE_TAIL; i < RCU_NEXT_TAIL; i++)
+		if (ULONG_CMP_LT(c, rdp->nxtcompleted[i]))
+			rdp->nxtcompleted[i] = c;
+
+	/*
+	 * If the need for the required grace period is already
+	 * recorded, trace and leave.
+	 */
+	if (rnp_root->need_future_gp[c & 0x1]) {
+		trace_rcu_future_gp(rnp, rdp, c, "Prestartedroot");
+		goto unlock_out;
+	}
+
+	/* Record the need for the future grace period. */
+	rnp_root->need_future_gp[c & 0x1]++;
+
+	/* If a grace period is not already in progress, start one. */
+	if (rnp_root->gpnum != rnp_root->completed) {
+		trace_rcu_future_gp(rnp, rdp, c, "Startedleafroot");
+	} else {
+		trace_rcu_future_gp(rnp, rdp, c, "Startedroot");
+		rcu_start_gp(rdp->rsp);
+	}
+unlock_out:
+	if (rnp != rnp_root)
+		raw_spin_unlock(&rnp_root->lock);
+	return c;
+}
+
+/*
+ * Clean up any old requests for the just-ended grace period.  Also return
+ * whether any additional grace periods have been requested.  Also invoke
+ * rcu_nocb_gp_cleanup() in order to wake up any no-callbacks kthreads
+ * waiting for this grace period to complete.
+ */
+static int rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
+{
+	int c = rnp->completed;
+	int needmore;
+	struct rcu_data *rdp = this_cpu_ptr(rsp->rda);
+
+	rcu_nocb_gp_cleanup(rsp, rnp);
+	rnp->need_future_gp[c & 0x1] = 0;
+	needmore = rnp->need_future_gp[(c + 1) & 0x1];
+	trace_rcu_future_gp(rnp, rdp, c, needmore ?
"CleanupMore" : "Cleanup"); + return needmore; +} + +/* * If there is room, assign a ->completed number to any callbacks on * this CPU that have not already been assigned. Also accelerate any * callbacks that were previously assigned a ->completed number that has @@ -1350,9 +1465,9 @@ static int rcu_gp_init(struct rcu_state *rsp) rdp = this_cpu_ptr(rsp->rda); rcu_preempt_check_blocked_tasks(rnp); rnp->qsmask = rnp->qsmaskinit; - rnp->gpnum = rsp->gpnum; + ACCESS_ONCE(rnp->gpnum) = rsp->gpnum; WARN_ON_ONCE(rnp->completed != rsp->completed); - rnp->completed = rsp->completed; + ACCESS_ONCE(rnp->completed) = rsp->completed; if (rnp == rdp->mynode) rcu_start_gp_per_cpu(rsp, rnp, rdp); rcu_preempt_boost_start_gp(rnp); @@ -1433,11 +1548,11 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) */ rcu_for_each_node_breadth_first(rsp, rnp) { raw_spin_lock_irq(&rnp->lock); - rnp->completed = rsp->gpnum; + ACCESS_ONCE(rnp->completed) = rsp->gpnum; rdp = this_cpu_ptr(rsp->rda); if (rnp == rdp->mynode) __rcu_process_gp_end(rsp, rnp, rdp); - nocb += rcu_nocb_gp_cleanup(rsp, rnp); + nocb += rcu_future_gp_cleanup(rsp, rnp); raw_spin_unlock_irq(&rnp->lock); cond_resched(); } diff --git a/kernel/rcutree.h b/kernel/rcutree.h index 775d96c..d6a043e 100644 --- a/kernel/rcutree.h +++ b/kernel/rcutree.h @@ -535,7 +535,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp); static void increment_cpu_stall_ticks(void); static int rcu_nocb_needs_gp(struct rcu_state *rsp); static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq); -static int rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp); +static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp); static void rcu_init_one_nocb(struct rcu_node *rnp); static bool is_nocb_cpu(int cpu); static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp, diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h index e4037bd..71bafa8 100644 --- a/kernel/rcutree_plugin.h +++ b/kernel/rcutree_plugin.h @@ -2061,22 +2061,12 @@ static int rcu_nocb_needs_gp(struct rcu_state *rsp) } /* - * Clean up this rcu_node structure's no-CBs state at the end of - * a grace period, and also return whether any no-CBs CPU associated - * with this rcu_node structure needs another grace period. + * Wake up any no-CBs CPUs' kthreads that were waiting on the just-ended + * grace period. */ -static int rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) +static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) { - int c = rnp->completed; - int needmore; - - wake_up_all(&rnp->nocb_gp_wq[c & 0x1]); - rnp->need_future_gp[c & 0x1] = 0; - needmore = rnp->need_future_gp[(c + 1) & 0x1]; - trace_rcu_future_grace_period(rsp->name, rnp->gpnum, rnp->completed, - c, rnp->level, rnp->grplo, rnp->grphi, - needmore ? "CleanupMore" : "Cleanup"); - return needmore; + wake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]); } /* @@ -2214,84 +2204,16 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp) bool d; unsigned long flags; struct rcu_node *rnp = rdp->mynode; - struct rcu_node *rnp_root = rcu_get_root(rdp->rsp); raw_spin_lock_irqsave(&rnp->lock, flags); - c = rnp->completed + 2; - - /* Count our request for a grace period. */ - rnp->need_future_gp[c & 0x1]++; - trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum, - rnp->completed, c, rnp->level, - rnp->grplo, rnp->grphi, "Startleaf"); - - if (rnp->gpnum != rnp->completed) { - - /* - * This rcu_node structure believes that a grace period - * is in progress, so we are done. 
-		 * period ends, our request will be acted upon.
-		 */
-		trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
-					      rnp->completed, c, rnp->level,
-					      rnp->grplo, rnp->grphi,
-					      "Startedleaf");
-		raw_spin_unlock_irqrestore(&rnp->lock, flags);
-
-	} else {
-
-		/*
-		 * Might not be a grace period, check root rcu_node
-		 * structure to see if we must start one.
-		 */
-		if (rnp != rnp_root)
-			raw_spin_lock(&rnp_root->lock); /* irqs disabled. */
-		if (rnp_root->gpnum != rnp_root->completed) {
-			trace_rcu_future_grace_period(rdp->rsp->name,
-						      rnp->gpnum,
-						      rnp->completed,
-						      c, rnp->level,
-						      rnp->grplo, rnp->grphi,
-						      "Startedleafroot");
-			raw_spin_unlock(&rnp_root->lock); /* irqs disabled. */
-		} else {
-
-			/*
-			 * No grace period, so we need to start one.
-			 * The good news is that we can wait for exactly
-			 * one grace period instead of part of the current
-			 * grace period and all of the next grace period.
-			 * Adjust counters accordingly and start the
-			 * needed grace period.
-			 */
-			rnp->need_future_gp[c & 0x1]--;
-			c = rnp_root->completed + 1;
-			rnp->need_future_gp[c & 0x1]++;
-			rnp_root->need_future_gp[c & 0x1]++;
-			trace_rcu_future_grace_period(rdp->rsp->name,
-						      rnp->gpnum,
-						      rnp->completed,
-						      c, rnp->level,
-						      rnp->grplo, rnp->grphi,
-						      "Startedroot");
-			rcu_start_gp(rdp->rsp);
-			raw_spin_unlock(&rnp->lock);
-		}
-
-		/* Clean up locking and irq state. */
-		if (rnp != rnp_root)
-			raw_spin_unlock_irqrestore(&rnp->lock, flags);
-		else
-			local_irq_restore(flags);
-	}
+	c = rcu_start_future_gp(rnp, rdp);
+	raw_spin_unlock_irqrestore(&rnp->lock, flags);
 
 	/*
 	 * Wait for the grace period.  Do so interruptibly to avoid messing
 	 * up the load average.
 	 */
-	trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
-				      rnp->completed, c, rnp->level,
-				      rnp->grplo, rnp->grphi, "StartWait");
+	trace_rcu_future_gp(rnp, rdp, c, "StartWait");
 	for (;;) {
 		wait_event_interruptible(
 			rnp->nocb_gp_wq[c & 0x1],
@@ -2299,14 +2221,9 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp)
 		if (likely(d))
 			break;
 		flush_signals(current);
-		trace_rcu_future_grace_period(rdp->rsp->name,
-					      rnp->gpnum, rnp->completed, c,
-					      rnp->level, rnp->grplo,
-					      rnp->grphi, "ResumeWait");
+		trace_rcu_future_gp(rnp, rdp, c, "ResumeWait");
 	}
-	trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
-				      rnp->completed, c, rnp->level,
-				      rnp->grplo, rnp->grphi, "EndWait");
+	trace_rcu_future_gp(rnp, rdp, c, "EndWait");
 	smp_mb(); /* Ensure that CB invocation happens after GP end. */
 }
 
@@ -2413,9 +2330,8 @@ static int rcu_nocb_needs_gp(struct rcu_state *rsp)
 	return 0;
 }
 
-static int rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
+static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
 {
-	return 0;
 }
 
 static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq)
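
For readers following the ->need_future_gp[] bookkeeping outside the
kernel tree, below is a minimal stand-alone sketch of the same two-slot
scheme.  All names in it (toy_node, toy_record_future_gp,
toy_future_gp_cleanup) are invented for illustration and are not kernel
APIs; the sketch assumes, as rcu_cbs_completed() ensures in the patched
code, that no request ever runs more than two grace periods ahead of the
last completed one, so the low-order bit of the grace-period number is
enough to index the array.

#include <stdio.h>

/*
 * Toy model of the two-entry ->need_future_gp[] array (illustrative
 * only): because at most the next two grace periods can be requested
 * at any time, requests are counted in the slot selected by the
 * low-order bit of the grace-period number.
 */
struct toy_node {
	unsigned long completed;	/* last completed GP number */
	int need_future_gp[2];		/* requests, indexed by gp & 0x1 */
};

/* Record a request for grace period c (caller "holds the lock"). */
static void toy_record_future_gp(struct toy_node *n, unsigned long c)
{
	n->need_future_gp[c & 0x1]++;
}

/*
 * End grace period c: clear its slot, as rcu_future_gp_cleanup() does,
 * and report whether the other slot still holds requests.
 */
static int toy_future_gp_cleanup(struct toy_node *n, unsigned long c)
{
	int needmore;

	n->completed = c;
	n->need_future_gp[c & 0x1] = 0;
	needmore = n->need_future_gp[(c + 1) & 0x1];
	return needmore;
}

int main(void)
{
	struct toy_node n = { .completed = 4 };

	toy_record_future_gp(&n, 5);	/* request GP 5 */
	toy_record_future_gp(&n, 6);	/* request GP 6 */
	printf("after GP 5: needmore=%d\n", toy_future_gp_cleanup(&n, 5));
	printf("after GP 6: needmore=%d\n", toy_future_gp_cleanup(&n, 6));
	return 0;
}

Running this prints needmore=1 after grace period 5 (GP 6 is still
requested) and needmore=0 after grace period 6, which is exactly the
signal rcu_gp_cleanup() accumulates into "nocb" above.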
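
The nocb_gp_wq[c & 0x1] pairing between rcu_nocb_wait_gp() and
rcu_nocb_gp_cleanup() can be modeled the same way: a waiter for grace
period c sleeps on the queue selected by c's low-order bit, and the
cleanup path wakes exactly that queue when the grace period ends.  The
sketch below is again illustrative only, with invented names, using
POSIX condition variables in place of the kernel's wait queues and
wake_up_all():

#include <pthread.h>
#include <stdio.h>

/*
 * Toy model of the parity-indexed wait queues (invented names, not
 * kernel code).  The signed-difference test stands in for the kernel's
 * wraparound-safe ULONG_CMP_GE() check on ->completed.
 */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gp_wq[2] = {
	PTHREAD_COND_INITIALIZER, PTHREAD_COND_INITIALIZER
};
static unsigned long completed;

/* Wait until grace period c has completed. */
static void toy_wait_gp(unsigned long c)
{
	pthread_mutex_lock(&lock);
	while ((long)(completed - c) < 0)	/* !ULONG_CMP_GE(completed, c) */
		pthread_cond_wait(&gp_wq[c & 0x1], &lock);
	pthread_mutex_unlock(&lock);
}

/* Mark grace period c complete and wake the waiters on its queue. */
static void toy_end_gp(unsigned long c)
{
	pthread_mutex_lock(&lock);
	completed = c;
	pthread_cond_broadcast(&gp_wq[c & 0x1]);
	pthread_mutex_unlock(&lock);
}

static void *waiter(void *arg)
{
	unsigned long c = (unsigned long)arg;

	toy_wait_gp(c);
	printf("waiter saw GP %lu complete\n", c);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, waiter, (void *)5UL);
	toy_end_gp(5);		/* wakes the waiter sleeping on queue 5 & 0x1 */
	pthread_join(t, NULL);
	return 0;
}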