From patchwork Sat Dec 3 18:34:39 2011
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 5440
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu,
	dhowells@redhat.com, eric.dumazet@gmail.com, darren@dvhart.com,
	patches@linaro.org, "Paul E. McKenney", "Paul E. McKenney"
McKenney" Subject: [PATCH RFC tip/core/rcu 4/7] rcu: Adaptive dyntick-idle preparation Date: Sat, 3 Dec 2011 10:34:39 -0800 Message-Id: <1322937282-19846-4-git-send-email-paulmck@linux.vnet.ibm.com> X-Mailer: git-send-email 1.7.3.2 In-Reply-To: <20111203183417.GA18914@linux.vnet.ibm.com> References: <20111203183417.GA18914@linux.vnet.ibm.com> x-cbid: 11120318-6078-0000-0000-0000053199E9 From: Paul E. McKenney If there are other CPUs active at a given point in time, then there is a limit to what a given CPU can do to advance the current RCU grace period. Beyond this limit, attempting to force the RCU grace period forward will do nothing but consume energy burning CPU cycles. Therefore, this commit takes an adaptive approach to RCU_FAST_NO_HZ preparations for idle. It pushes the RCU core state machine for two cycles unconditionally, and then it will push from zero to three additional cycles, but only as long as the RCU core has work for this CPU to do immediately. The rcu_pending() function is used to check whether the RCU core has such work. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney --- kernel/rcutree_plugin.h | 54 +++++++++++++++++++++++++++++++++++++--------- 1 files changed, 43 insertions(+), 11 deletions(-) diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h index adb6e66..8cd9efe 100644 --- a/kernel/rcutree_plugin.h +++ b/kernel/rcutree_plugin.h @@ -1994,8 +1994,40 @@ static void rcu_prepare_for_idle(int cpu) #else /* #if !defined(CONFIG_RCU_FAST_NO_HZ) */ -#define RCU_NEEDS_CPU_FLUSHES 5 /* Allow for callback self-repost. */ +/* + * This code is invoked when a CPU goes idle, at which point we want + * to have the CPU do everything required for RCU so that it can enter + * the energy-efficient dyntick-idle mode. This is handled by a + * state machine implemented by rcu_prepare_for_idle() below. + * + * The following three proprocessor symbols control this state machine: + * + * RCU_IDLE_FLUSHES gives the maximum number of times that we will attempt + * to satisfy RCU. Beyond this point, it is better to incur a periodic + * scheduling-clock interrupt than to loop through the state machine + * at full power. + * RCU_IDLE_OPT_FLUSHES gives the number of RCU_IDLE_FLUSHES that are + * optional if RCU does not need anything immediately from this + * CPU, even if this CPU still has RCU callbacks queued. The first + * times through the state machine are mandatory: we need to give + * the state machine a chance to communicate a quiescent state + * to the RCU core. + * RCU_IDLE_GP_DELAY gives the number of jiffies that a CPU is permitted + * to sleep in dyntick-idle mode with RCU callbacks pending. This + * is sized to be roughly one RCU grace period. Those energy-efficiency + * benchmarkers who might otherwise be tempted to set this to a large + * number, be warned: Setting RCU_IDLE_GP_DELAY too high can hang your + * system. And if you are -that- concerned about energy efficiency, + * just power the system down and be done with it! + * + * The values below work well in practice. If future workloads require + * adjustment, they can be converted into kernel config parameters, though + * making the state machine smarter might be a better option. + */ +#define RCU_IDLE_FLUSHES 5 /* Number of dyntick-idle tries. */ +#define RCU_IDLE_OPT_FLUSHES 3 /* Optional dyntick-idle tries. */ #define RCU_IDLE_GP_DELAY 6 /* Roughly one grace period. 
+
 static DEFINE_PER_CPU(int, rcu_dyntick_drain);
 static DEFINE_PER_CPU(unsigned long, rcu_dyntick_holdoff);
 static DEFINE_PER_CPU(struct hrtimer, rcu_idle_gp_timer);
@@ -2110,17 +2142,17 @@ static void rcu_prepare_for_idle(int cpu)
 	/* Check and update the rcu_dyntick_drain sequencing. */
 	if (per_cpu(rcu_dyntick_drain, cpu) <= 0) {
 		/* First time through, initialize the counter. */
-		per_cpu(rcu_dyntick_drain, cpu) = RCU_NEEDS_CPU_FLUSHES;
-	} else if (--per_cpu(rcu_dyntick_drain, cpu) <= 0) {
+		per_cpu(rcu_dyntick_drain, cpu) = RCU_IDLE_FLUSHES;
+	} else if (per_cpu(rcu_dyntick_drain, cpu) <= RCU_IDLE_OPT_FLUSHES &&
+		   !rcu_pending(cpu)) {
 		/* Can we go dyntick-idle despite still having callbacks? */
-		if (!rcu_pending(cpu)) {
-			trace_rcu_prep_idle("Dyntick with callbacks");
-			per_cpu(rcu_dyntick_holdoff, cpu) = jiffies - 1;
-			hrtimer_start(&per_cpu(rcu_idle_gp_timer, cpu),
-				      rcu_idle_gp_wait, HRTIMER_MODE_REL);
-			return; /* Nothing more to do immediately. */
-		}
-
+		trace_rcu_prep_idle("Dyntick with callbacks");
+		per_cpu(rcu_dyntick_drain, cpu) = 0;
+		per_cpu(rcu_dyntick_holdoff, cpu) = jiffies - 1;
+		hrtimer_start(&per_cpu(rcu_idle_gp_timer, cpu),
+			      rcu_idle_gp_wait, HRTIMER_MODE_REL);
+		return; /* Nothing more to do immediately. */
+	} else if (--per_cpu(rcu_dyntick_drain, cpu) <= 0) {
 		/* We have hit the limit, so time to give up. */
 		per_cpu(rcu_dyntick_holdoff, cpu) = jiffies;
 		local_irq_restore(flags);
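
[Editor's note: for readers who want to see the adaptive drain logic in
isolation, the following is a minimal stand-alone C model of the sequencing
implemented by the hunk above.  It is a sketch, not kernel code: the
model_cpu struct, the model_rcu_pending() stub, and model_prepare_for_idle()
are invented for illustration, per-CPU variables become plain struct fields,
and the hrtimer arming and tracepoints are elided.]

	#include <stdbool.h>
	#include <stdio.h>

	#define RCU_IDLE_FLUSHES     5  /* Maximum passes through the state machine. */
	#define RCU_IDLE_OPT_FLUSHES 3  /* Passes that are optional if RCU is idle. */

	struct model_cpu {
		int drain;      /* Models per-CPU rcu_dyntick_drain. */
		bool holdoff;   /* Models rcu_dyntick_holdoff being set. */
	};

	/* Stub for rcu_pending(): does the RCU core need this CPU right now? */
	static bool model_rcu_pending(int pass)
	{
		return pass < 2;  /* Pretend work exists for the first two passes. */
	}

	/*
	 * Models one invocation of rcu_prepare_for_idle().  Returns true when
	 * the CPU may stop re-invoking the state machine and go dyntick-idle.
	 */
	static bool model_prepare_for_idle(struct model_cpu *c, int pass)
	{
		if (c->drain <= 0) {
			/* First time through: arm the countdown. */
			c->drain = RCU_IDLE_FLUSHES;
			return false;
		}
		if (c->drain <= RCU_IDLE_OPT_FLUSHES && !model_rcu_pending(pass)) {
			/* In the optional window and RCU needs nothing: stop early. */
			c->drain = 0;
			return true;
		}
		if (--c->drain <= 0) {
			/* Mandatory and optional passes exhausted: give up. */
			c->holdoff = true;
			return true;
		}
		return false;  /* Keep pushing the RCU core state machine. */
	}

	int main(void)
	{
		struct model_cpu cpu = { 0 };
		int pass = 0;

		while (!model_prepare_for_idle(&cpu, pass))
			pass++;
		printf("entered dyntick-idle after %d passes (holdoff=%d)\n",
		       pass, cpu.holdoff);
		return 0;
	}

[Running the model shows the adaptive behavior described in the commit
message: the first RCU_IDLE_FLUSHES - RCU_IDLE_OPT_FLUSHES = 2 passes are
unconditional, and with the stub reporting pending work only on those first
two passes, the CPU stops on pass 3, inside the optional-flush window,
rather than burning all five flushes.]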