From patchwork Sat Jan 5 17:49:03 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 13833
From: "Paul E. McKenney"
To: linux-kernel@vger.kernel.org
Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu,
	dhowells@redhat.com, edumazet@google.com, darren@dvhart.com,
	fweisbec@gmail.com, sbw@mit.edu, patches@linaro.org,
	"Paul E. McKenney", "Paul E. McKenney"
Subject: [PATCH tip/core/rcu 13/14] rcu: Abstract rcu_start_future_gp() from
	rcu_nocb_wait_gp()
Date: Sat, 5 Jan 2013 09:49:03 -0800
Message-Id: <1357408144-15830-13-git-send-email-paulmck@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.7.8
In-Reply-To: <1357408144-15830-1-git-send-email-paulmck@linux.vnet.ibm.com>
References: <20130105174844.GA14172@linux.vnet.ibm.com>
 <1357408144-15830-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney"

CPUs going idle will need to record the need for a future grace period,
but won't actually need to block waiting on it.  This commit therefore
splits rcu_start_future_gp(), which does the recording, from
rcu_nocb_wait_gp(), which now invokes rcu_start_future_gp() to do the
recording, after which rcu_nocb_wait_gp() does the waiting.

Signed-off-by: Paul E. McKenney
Signed-off-by: Paul E. McKenney
---
 kernel/rcutree.c        | 123 +++++++++++++++++++++++++++++++++++++++++++++--
 kernel/rcutree.h        |   2 +-
 kernel/rcutree_plugin.h | 104 ++++------------------------------------
 3 files changed, 130 insertions(+), 99 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 8ca18ec..bd42feb 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -230,6 +230,7 @@ static ulong jiffies_till_next_fqs = RCU_JIFFIES_TILL_FORCE_QS;
 module_param(jiffies_till_first_fqs, ulong, 0644);
 module_param(jiffies_till_next_fqs, ulong, 0644);
 
+static void rcu_start_gp(struct rcu_state *rsp);
 static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *));
 static void force_quiescent_state(struct rcu_state *rsp);
 static int rcu_pending(int cpu);
@@ -1113,6 +1114,120 @@ static unsigned long rcu_cbs_completed(struct rcu_state *rsp,
 }
 
 /*
+ * Trace-event helper function for rcu_start_future_gp() and
+ * rcu_nocb_wait_gp().
+ */
+static void trace_rcu_future_gp(struct rcu_node *rnp, struct rcu_data *rdp,
+				unsigned long c, char *s)
+{
+	trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
+				      rnp->completed, c, rnp->level,
+				      rnp->grplo, rnp->grphi, s);
+}
+
+/*
+ * Start some future grace period, as needed to handle newly arrived
+ * callbacks.  The required future grace periods are recorded in each
+ * rcu_node structure's ->need_future_gp field.
+ *
+ * The caller must hold the specified rcu_node structure's ->lock.
+ */
+static unsigned long __maybe_unused
+rcu_start_future_gp(struct rcu_node *rnp, struct rcu_data *rdp)
+{
+	unsigned long c;
+	int i;
+	struct rcu_node *rnp_root = rcu_get_root(rdp->rsp);
+
+	/*
+	 * Pick up grace-period number for new callbacks.  If this
+	 * grace period is already marked as needed, return to the caller.
+	 */
+	c = rcu_cbs_completed(rdp->rsp, rnp);
+	trace_rcu_future_gp(rnp, rdp, c, "Startleaf");
+	if (rnp->need_future_gp[c & 0x1]) {
+		trace_rcu_future_gp(rnp, rdp, c, "Prestartleaf");
+		return c;
+	}
+
+	/*
+	 * If either this rcu_node structure or the root rcu_node structure
+	 * believes that a grace period is in progress, then we must wait
+	 * for the one following, which is in "c".  Because our request
+	 * will be noticed at the end of the current grace period, we don't
+	 * need to explicitly start one.
+	 */
+	if (rnp->gpnum != rnp->completed ||
+	    ACCESS_ONCE(rnp->gpnum) != ACCESS_ONCE(rnp->completed)) {
+		rnp->need_future_gp[c & 0x1]++;
+		trace_rcu_future_gp(rnp, rdp, c, "Startedleaf");
+		return c;
+	}
+
+	/*
+	 * There might be no grace period in progress.  If we don't already
+	 * hold it, acquire the root rcu_node structure's lock in order to
+	 * start one (if needed).
+	 */
+	if (rnp != rnp_root)
+		raw_spin_lock(&rnp_root->lock);
+
+	/*
+	 * Get a new grace-period number.  If there really is no grace
+	 * period in progress, it will be smaller than the one we obtained
+	 * earlier.  Adjust callbacks as needed.  Note that even no-CBs
+	 * CPUs have a ->nxtcompleted[] array, so no no-CBs checks needed.
+	 */
+	c = rcu_cbs_completed(rdp->rsp, rnp_root);
+	for (i = RCU_DONE_TAIL; i < RCU_NEXT_TAIL; i++)
+		if (ULONG_CMP_LT(c, rdp->nxtcompleted[i]))
+			rdp->nxtcompleted[i] = c;
+
+	/*
+	 * If the need for the required grace period is already
+	 * recorded, trace and leave.
+	 */
+	if (rnp_root->need_future_gp[c & 0x1]) {
+		trace_rcu_future_gp(rnp, rdp, c, "Prestartedroot");
+		goto unlock_out;
+	}
+
+	/* Record the need for the future grace period. */
+	rnp_root->need_future_gp[c & 0x1]++;
+
+	/* If a grace period is not already in progress, start one. */
+	if (rnp_root->gpnum != rnp_root->completed) {
+		trace_rcu_future_gp(rnp, rdp, c, "Startedleafroot");
+	} else {
+		trace_rcu_future_gp(rnp, rdp, c, "Startedroot");
+		rcu_start_gp(rdp->rsp);
+	}
+unlock_out:
+	if (rnp != rnp_root)
+		raw_spin_unlock(&rnp_root->lock);
+	return c;
+}
+
+/*
+ * Clean up any old requests for the just-ended grace period.  Also return
+ * whether any additional grace periods have been requested.  Also invoke
+ * rcu_nocb_gp_cleanup() in order to wake up any no-callbacks kthreads
+ * waiting for this grace period to complete.
+ */
+static int rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
+{
+	int c = rnp->completed;
+	int needmore;
+	struct rcu_data *rdp = this_cpu_ptr(rsp->rda);
+
+	rcu_nocb_gp_cleanup(rsp, rnp);
+	rnp->need_future_gp[c & 0x1] = 0;
+	needmore = rnp->need_future_gp[(c + 1) & 0x1];
+	trace_rcu_future_gp(rnp, rdp, c, needmore ?
"CleanupMore" : "Cleanup"); + return needmore; +} + +/* * If there is room, assign a ->completed number to any callbacks on * this CPU that have not already been assigned. Also accelerate any * callbacks that were previously assigned a ->completed number that has @@ -1350,9 +1465,9 @@ static int rcu_gp_init(struct rcu_state *rsp) rdp = this_cpu_ptr(rsp->rda); rcu_preempt_check_blocked_tasks(rnp); rnp->qsmask = rnp->qsmaskinit; - rnp->gpnum = rsp->gpnum; + ACCESS_ONCE(rnp->gpnum) = rsp->gpnum; WARN_ON_ONCE(rnp->completed != rsp->completed); - rnp->completed = rsp->completed; + ACCESS_ONCE(rnp->completed) = rsp->completed; if (rnp == rdp->mynode) rcu_start_gp_per_cpu(rsp, rnp, rdp); rcu_preempt_boost_start_gp(rnp); @@ -1433,11 +1548,11 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) */ rcu_for_each_node_breadth_first(rsp, rnp) { raw_spin_lock_irq(&rnp->lock); - rnp->completed = rsp->gpnum; + ACCESS_ONCE(rnp->completed) = rsp->gpnum; rdp = this_cpu_ptr(rsp->rda); if (rnp == rdp->mynode) __rcu_process_gp_end(rsp, rnp, rdp); - nocb += rcu_nocb_gp_cleanup(rsp, rnp); + nocb += rcu_future_gp_cleanup(rsp, rnp); raw_spin_unlock_irq(&rnp->lock); cond_resched(); } diff --git a/kernel/rcutree.h b/kernel/rcutree.h index 775d96c..d6a043e 100644 --- a/kernel/rcutree.h +++ b/kernel/rcutree.h @@ -535,7 +535,7 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp); static void increment_cpu_stall_ticks(void); static int rcu_nocb_needs_gp(struct rcu_state *rsp); static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq); -static int rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp); +static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp); static void rcu_init_one_nocb(struct rcu_node *rnp); static bool is_nocb_cpu(int cpu); static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp, diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h index e4037bd..71bafa8 100644 --- a/kernel/rcutree_plugin.h +++ b/kernel/rcutree_plugin.h @@ -2061,22 +2061,12 @@ static int rcu_nocb_needs_gp(struct rcu_state *rsp) } /* - * Clean up this rcu_node structure's no-CBs state at the end of - * a grace period, and also return whether any no-CBs CPU associated - * with this rcu_node structure needs another grace period. + * Wake up any no-CBs CPUs' kthreads that were waiting on the just-ended + * grace period. */ -static int rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) +static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) { - int c = rnp->completed; - int needmore; - - wake_up_all(&rnp->nocb_gp_wq[c & 0x1]); - rnp->need_future_gp[c & 0x1] = 0; - needmore = rnp->need_future_gp[(c + 1) & 0x1]; - trace_rcu_future_grace_period(rsp->name, rnp->gpnum, rnp->completed, - c, rnp->level, rnp->grplo, rnp->grphi, - needmore ? "CleanupMore" : "Cleanup"); - return needmore; + wake_up_all(&rnp->nocb_gp_wq[rnp->completed & 0x1]); } /* @@ -2214,84 +2204,16 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp) bool d; unsigned long flags; struct rcu_node *rnp = rdp->mynode; - struct rcu_node *rnp_root = rcu_get_root(rdp->rsp); raw_spin_lock_irqsave(&rnp->lock, flags); - c = rnp->completed + 2; - - /* Count our request for a grace period. */ - rnp->need_future_gp[c & 0x1]++; - trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum, - rnp->completed, c, rnp->level, - rnp->grplo, rnp->grphi, "Startleaf"); - - if (rnp->gpnum != rnp->completed) { - - /* - * This rcu_node structure believes that a grace period - * is in progress, so we are done. 
-		 * period ends, our request will be acted upon.
-		 */
-		trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
-					      rnp->completed, c, rnp->level,
-					      rnp->grplo, rnp->grphi,
-					      "Startedleaf");
-		raw_spin_unlock_irqrestore(&rnp->lock, flags);
-
-	} else {
-
-		/*
-		 * Might not be a grace period, check root rcu_node
-		 * structure to see if we must start one.
-		 */
-		if (rnp != rnp_root)
-			raw_spin_lock(&rnp_root->lock); /* irqs disabled. */
-		if (rnp_root->gpnum != rnp_root->completed) {
-			trace_rcu_future_grace_period(rdp->rsp->name,
-						      rnp->gpnum,
-						      rnp->completed,
-						      c, rnp->level,
-						      rnp->grplo, rnp->grphi,
-						      "Startedleafroot");
-			raw_spin_unlock(&rnp_root->lock); /* irqs disabled. */
-		} else {
-
-			/*
-			 * No grace period, so we need to start one.
-			 * The good news is that we can wait for exactly
-			 * one grace period instead of part of the current
-			 * grace period and all of the next grace period.
-			 * Adjust counters accordingly and start the
-			 * needed grace period.
-			 */
-			rnp->need_future_gp[c & 0x1]--;
-			c = rnp_root->completed + 1;
-			rnp->need_future_gp[c & 0x1]++;
-			rnp_root->need_future_gp[c & 0x1]++;
-			trace_rcu_future_grace_period(rdp->rsp->name,
-						      rnp->gpnum,
-						      rnp->completed,
-						      c, rnp->level,
-						      rnp->grplo, rnp->grphi,
-						      "Startedroot");
-			rcu_start_gp(rdp->rsp);
-			raw_spin_unlock(&rnp->lock);
-		}
-
-		/* Clean up locking and irq state. */
-		if (rnp != rnp_root)
-			raw_spin_unlock_irqrestore(&rnp->lock, flags);
-		else
-			local_irq_restore(flags);
-	}
+	c = rcu_start_future_gp(rnp, rdp);
+	raw_spin_unlock_irqrestore(&rnp->lock, flags);
 
 	/*
 	 * Wait for the grace period.  Do so interruptibly to avoid messing
 	 * up the load average.
 	 */
-	trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
-				      rnp->completed, c, rnp->level,
-				      rnp->grplo, rnp->grphi, "StartWait");
+	trace_rcu_future_gp(rnp, rdp, c, "StartWait");
 	for (;;) {
 		wait_event_interruptible(
 			rnp->nocb_gp_wq[c & 0x1],
@@ -2299,14 +2221,9 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp)
 		if (likely(d))
 			break;
 		flush_signals(current);
-		trace_rcu_future_grace_period(rdp->rsp->name,
-					      rnp->gpnum, rnp->completed, c,
-					      rnp->level, rnp->grplo,
-					      rnp->grphi, "ResumeWait");
+		trace_rcu_future_gp(rnp, rdp, c, "ResumeWait");
 	}
-	trace_rcu_future_grace_period(rdp->rsp->name, rnp->gpnum,
-				      rnp->completed, c, rnp->level,
-				      rnp->grplo, rnp->grphi, "EndWait");
+	trace_rcu_future_gp(rnp, rdp, c, "EndWait");
 	smp_mb(); /* Ensure that CB invocation happens after GP end. */
 }
 
@@ -2413,9 +2330,8 @@ static int rcu_nocb_needs_gp(struct rcu_state *rsp)
 	return 0;
 }
 
-static int rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
+static void rcu_nocb_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
 {
-	return 0;
 }
 
 static void rcu_nocb_gp_set(struct rcu_node *rnp, int nrq)
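
For readers following the ->need_future_gp[] bookkeeping outside the
kernel tree, below is a minimal stand-alone sketch of the same two-slot
scheme.  All names in it (toy_node, toy_record_future_gp,
toy_future_gp_cleanup) are invented for illustration and are not kernel
APIs; the sketch assumes, as rcu_cbs_completed() ensures in the patched
code, that no request ever runs more than two grace periods ahead of the
last completed one, so the low-order bit of the grace-period number is
enough to index the array.

#include <stdio.h>

/*
 * Toy model of the two-entry ->need_future_gp[] array (illustrative
 * only): because at most the next two grace periods can be requested
 * at any time, requests are counted in the slot selected by the
 * low-order bit of the grace-period number.
 */
struct toy_node {
	unsigned long completed;	/* last completed GP number */
	int need_future_gp[2];		/* requests, indexed by gp & 0x1 */
};

/* Record a request for grace period c (caller "holds the lock"). */
static void toy_record_future_gp(struct toy_node *n, unsigned long c)
{
	n->need_future_gp[c & 0x1]++;
}

/*
 * End grace period c: clear its slot, as rcu_future_gp_cleanup() does,
 * and report whether the other slot still holds requests.
 */
static int toy_future_gp_cleanup(struct toy_node *n, unsigned long c)
{
	int needmore;

	n->completed = c;
	n->need_future_gp[c & 0x1] = 0;
	needmore = n->need_future_gp[(c + 1) & 0x1];
	return needmore;
}

int main(void)
{
	struct toy_node n = { .completed = 4 };

	toy_record_future_gp(&n, 5);	/* request GP 5 */
	toy_record_future_gp(&n, 6);	/* request GP 6 */
	printf("after GP 5: needmore=%d\n", toy_future_gp_cleanup(&n, 5));
	printf("after GP 6: needmore=%d\n", toy_future_gp_cleanup(&n, 6));
	return 0;
}

Running this prints needmore=1 after grace period 5 (GP 6 is still
requested) and needmore=0 after grace period 6, which is exactly the
signal rcu_gp_cleanup() accumulates into "nocb" above.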
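
The nocb_gp_wq[c & 0x1] pairing between rcu_nocb_wait_gp() and
rcu_nocb_gp_cleanup() can be modeled the same way: a waiter for grace
period c sleeps on the queue selected by c's low-order bit, and the
cleanup path wakes exactly that queue when the grace period ends.  The
sketch below is again illustrative only, with invented names, using
POSIX condition variables in place of the kernel's wait queues and
wake_up_all():

#include <pthread.h>
#include <stdio.h>

/*
 * Toy model of the parity-indexed wait queues (invented names, not
 * kernel code).  The signed-difference test stands in for the kernel's
 * wraparound-safe ULONG_CMP_GE() check on ->completed.
 */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gp_wq[2] = {
	PTHREAD_COND_INITIALIZER, PTHREAD_COND_INITIALIZER
};
static unsigned long completed;

/* Wait until grace period c has completed. */
static void toy_wait_gp(unsigned long c)
{
	pthread_mutex_lock(&lock);
	while ((long)(completed - c) < 0)	/* !ULONG_CMP_GE(completed, c) */
		pthread_cond_wait(&gp_wq[c & 0x1], &lock);
	pthread_mutex_unlock(&lock);
}

/* Mark grace period c complete and wake the waiters on its queue. */
static void toy_end_gp(unsigned long c)
{
	pthread_mutex_lock(&lock);
	completed = c;
	pthread_cond_broadcast(&gp_wq[c & 0x1]);
	pthread_mutex_unlock(&lock);
}

static void *waiter(void *arg)
{
	unsigned long c = (unsigned long)arg;

	toy_wait_gp(c);
	printf("waiter saw GP %lu complete\n", c);
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, waiter, (void *)5UL);
	toy_end_gp(5);		/* wakes the waiter sleeping on queue 5 & 0x1 */
	pthread_join(t, NULL);
	return 0;
}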