From patchwork Sat Dec 3 18:34:39 2011
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 5440
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu,
	dhowells@redhat.com, eric.dumazet@gmail.com, darren@dvhart.com,
	patches@linaro.org, "Paul E. McKenney", "Paul E. McKenney"
McKenney" Subject: [PATCH RFC tip/core/rcu 4/7] rcu: Adaptive dyntick-idle preparation Date: Sat, 3 Dec 2011 10:34:39 -0800 Message-Id: <1322937282-19846-4-git-send-email-paulmck@linux.vnet.ibm.com> X-Mailer: git-send-email 1.7.3.2 In-Reply-To: <20111203183417.GA18914@linux.vnet.ibm.com> References: <20111203183417.GA18914@linux.vnet.ibm.com> x-cbid: 11120318-6078-0000-0000-0000053199E9 From: Paul E. McKenney If there are other CPUs active at a given point in time, then there is a limit to what a given CPU can do to advance the current RCU grace period. Beyond this limit, attempting to force the RCU grace period forward will do nothing but consume energy burning CPU cycles. Therefore, this commit takes an adaptive approach to RCU_FAST_NO_HZ preparations for idle. It pushes the RCU core state machine for two cycles unconditionally, and then it will push from zero to three additional cycles, but only as long as the RCU core has work for this CPU to do immediately. The rcu_pending() function is used to check whether the RCU core has such work. Signed-off-by: Paul E. McKenney Signed-off-by: Paul E. McKenney --- kernel/rcutree_plugin.h | 54 +++++++++++++++++++++++++++++++++++++--------- 1 files changed, 43 insertions(+), 11 deletions(-) diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h index adb6e66..8cd9efe 100644 --- a/kernel/rcutree_plugin.h +++ b/kernel/rcutree_plugin.h @@ -1994,8 +1994,40 @@ static void rcu_prepare_for_idle(int cpu) #else /* #if !defined(CONFIG_RCU_FAST_NO_HZ) */ -#define RCU_NEEDS_CPU_FLUSHES 5 /* Allow for callback self-repost. */ +/* + * This code is invoked when a CPU goes idle, at which point we want + * to have the CPU do everything required for RCU so that it can enter + * the energy-efficient dyntick-idle mode. This is handled by a + * state machine implemented by rcu_prepare_for_idle() below. + * + * The following three proprocessor symbols control this state machine: + * + * RCU_IDLE_FLUSHES gives the maximum number of times that we will attempt + * to satisfy RCU. Beyond this point, it is better to incur a periodic + * scheduling-clock interrupt than to loop through the state machine + * at full power. + * RCU_IDLE_OPT_FLUSHES gives the number of RCU_IDLE_FLUSHES that are + * optional if RCU does not need anything immediately from this + * CPU, even if this CPU still has RCU callbacks queued. The first + * times through the state machine are mandatory: we need to give + * the state machine a chance to communicate a quiescent state + * to the RCU core. + * RCU_IDLE_GP_DELAY gives the number of jiffies that a CPU is permitted + * to sleep in dyntick-idle mode with RCU callbacks pending. This + * is sized to be roughly one RCU grace period. Those energy-efficiency + * benchmarkers who might otherwise be tempted to set this to a large + * number, be warned: Setting RCU_IDLE_GP_DELAY too high can hang your + * system. And if you are -that- concerned about energy efficiency, + * just power the system down and be done with it! + * + * The values below work well in practice. If future workloads require + * adjustment, they can be converted into kernel config parameters, though + * making the state machine smarter might be a better option. + */ +#define RCU_IDLE_FLUSHES 5 /* Number of dyntick-idle tries. */ +#define RCU_IDLE_OPT_FLUSHES 3 /* Optional dyntick-idle tries. */ #define RCU_IDLE_GP_DELAY 6 /* Roughly one grace period. 
+
 static DEFINE_PER_CPU(int, rcu_dyntick_drain);
 static DEFINE_PER_CPU(unsigned long, rcu_dyntick_holdoff);
 static DEFINE_PER_CPU(struct hrtimer, rcu_idle_gp_timer);
@@ -2110,17 +2142,17 @@ static void rcu_prepare_for_idle(int cpu)
 	/* Check and update the rcu_dyntick_drain sequencing. */
 	if (per_cpu(rcu_dyntick_drain, cpu) <= 0) {
 		/* First time through, initialize the counter. */
-		per_cpu(rcu_dyntick_drain, cpu) = RCU_NEEDS_CPU_FLUSHES;
-	} else if (--per_cpu(rcu_dyntick_drain, cpu) <= 0) {
+		per_cpu(rcu_dyntick_drain, cpu) = RCU_IDLE_FLUSHES;
+	} else if (per_cpu(rcu_dyntick_drain, cpu) <= RCU_IDLE_OPT_FLUSHES &&
+		   !rcu_pending(cpu)) {
 		/* Can we go dyntick-idle despite still having callbacks? */
-		if (!rcu_pending(cpu)) {
-			trace_rcu_prep_idle("Dyntick with callbacks");
-			per_cpu(rcu_dyntick_holdoff, cpu) = jiffies - 1;
-			hrtimer_start(&per_cpu(rcu_idle_gp_timer, cpu),
-				      rcu_idle_gp_wait, HRTIMER_MODE_REL);
-			return; /* Nothing more to do immediately. */
-		}
-
+		trace_rcu_prep_idle("Dyntick with callbacks");
+		per_cpu(rcu_dyntick_drain, cpu) = 0;
+		per_cpu(rcu_dyntick_holdoff, cpu) = jiffies - 1;
+		hrtimer_start(&per_cpu(rcu_idle_gp_timer, cpu),
+			      rcu_idle_gp_wait, HRTIMER_MODE_REL);
+		return; /* Nothing more to do immediately. */
+	} else if (--per_cpu(rcu_dyntick_drain, cpu) <= 0) {
 		/* We have hit the limit, so time to give up. */
 		per_cpu(rcu_dyntick_holdoff, cpu) = jiffies;
 		local_irq_restore(flags);
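
[Editor's note: for readers who want to see the adaptive drain logic in
isolation, the following is a minimal stand-alone C model of the sequencing
implemented by the hunk above.  It is a sketch, not kernel code: the
model_cpu struct, the model_rcu_pending() stub, and model_prepare_for_idle()
are invented for illustration, per-CPU variables become plain struct fields,
and the hrtimer arming and tracepoints are elided.]

	#include <stdbool.h>
	#include <stdio.h>

	#define RCU_IDLE_FLUSHES     5  /* Maximum passes through the state machine. */
	#define RCU_IDLE_OPT_FLUSHES 3  /* Passes that are optional if RCU is idle. */

	struct model_cpu {
		int drain;      /* Models per-CPU rcu_dyntick_drain. */
		bool holdoff;   /* Models rcu_dyntick_holdoff being set. */
	};

	/* Stub for rcu_pending(): does the RCU core need this CPU right now? */
	static bool model_rcu_pending(int pass)
	{
		return pass < 2;  /* Pretend work exists for the first two passes. */
	}

	/*
	 * Models one invocation of rcu_prepare_for_idle().  Returns true when
	 * the CPU may stop re-invoking the state machine and go dyntick-idle.
	 */
	static bool model_prepare_for_idle(struct model_cpu *c, int pass)
	{
		if (c->drain <= 0) {
			/* First time through: arm the countdown. */
			c->drain = RCU_IDLE_FLUSHES;
			return false;
		}
		if (c->drain <= RCU_IDLE_OPT_FLUSHES && !model_rcu_pending(pass)) {
			/* In the optional window and RCU needs nothing: stop early. */
			c->drain = 0;
			return true;
		}
		if (--c->drain <= 0) {
			/* Mandatory and optional passes exhausted: give up. */
			c->holdoff = true;
			return true;
		}
		return false;  /* Keep pushing the RCU core state machine. */
	}

	int main(void)
	{
		struct model_cpu cpu = { 0 };
		int pass = 0;

		while (!model_prepare_for_idle(&cpu, pass))
			pass++;
		printf("entered dyntick-idle after %d passes (holdoff=%d)\n",
		       pass, cpu.holdoff);
		return 0;
	}

[Running the model shows the adaptive behavior described in the commit
message: the first RCU_IDLE_FLUSHES - RCU_IDLE_OPT_FLUSHES = 2 passes are
unconditional, and with the stub reporting pending work only on those first
two passes, the CPU stops on pass 3, inside the optional-flush window,
rather than burning all five flushes.]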