From patchwork Wed Nov 2 20:30:24 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 4925 Return-Path: X-Original-To: patchwork@peony.canonical.com Delivered-To: patchwork@peony.canonical.com Received: from fiordland.canonical.com (fiordland.canonical.com [91.189.94.145]) by peony.canonical.com (Postfix) with ESMTP id E5E2323DC3 for ; Wed, 2 Nov 2011 20:34:00 +0000 (UTC) Received: from mail-fx0-f52.google.com (mail-fx0-f52.google.com [209.85.161.52]) by fiordland.canonical.com (Postfix) with ESMTP id D9F49A1868B for ; Wed, 2 Nov 2011 20:34:00 +0000 (UTC) Received: by mail-fx0-f52.google.com with SMTP id n26so1195523faa.11 for ; Wed, 02 Nov 2011 13:34:00 -0700 (PDT) Received: by 10.223.61.72 with SMTP id s8mr4307558fah.27.1320265985740; Wed, 02 Nov 2011 13:33:05 -0700 (PDT) X-Forwarded-To: linaro-patchwork@canonical.com X-Forwarded-For: patch@linaro.org linaro-patchwork@canonical.com Delivered-To: patches@linaro.org Received: by 10.152.14.103 with SMTP id o7cs64099lac; Wed, 2 Nov 2011 13:33:05 -0700 (PDT) Received: by 10.52.97.35 with SMTP id dx3mr6317627vdb.2.1320265983597; Wed, 02 Nov 2011 13:33:03 -0700 (PDT) Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com. [32.97.182.142]) by mx.google.com with ESMTPS id ka1si851121vcb.91.2011.11.02.13.33.02 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 02 Nov 2011 13:33:03 -0700 (PDT) Received-SPF: pass (google.com: domain of paulmck@linux.vnet.ibm.com designates 32.97.182.142 as permitted sender) client-ip=32.97.182.142; Authentication-Results: mx.google.com; spf=pass (google.com: domain of paulmck@linux.vnet.ibm.com designates 32.97.182.142 as permitted sender) smtp.mail=paulmck@linux.vnet.ibm.com Received: from /spool/local by e2.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 2 Nov 2011 16:31:12 -0400 Received: from d01relay03.pok.ibm.com ([9.56.227.235]) by e2.ny.us.ibm.com ([192.168.1.102]) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Wed, 2 Nov 2011 16:31:03 -0400 Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id pA2KUwFX200108 for ; Wed, 2 Nov 2011 16:30:58 -0400 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id pA2KUqkn029842 for ; Wed, 2 Nov 2011 16:30:58 -0400 Received: from paulmck-ThinkPad-W500 (sig-9-49-130-61.mts.ibm.com [9.49.130.61]) by d01av04.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id pA2KUpRN029759; Wed, 2 Nov 2011 16:30:52 -0400 Received: by paulmck-ThinkPad-W500 (Postfix, from userid 1000) id AAF0BEA767; Wed, 2 Nov 2011 13:30:51 -0700 (PDT) From: "Paul E. McKenney" To: linux-kernel@vger.kernel.org Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com, akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca, josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu, dhowells@redhat.com, eric.dumazet@gmail.com, darren@dvhart.com, patches@linaro.org, "Paul E. McKenney" Subject: [PATCH RFC tip/core/rcu 03/28] rcu: Avoid RCU-preempt expedited grace-period botch Date: Wed, 2 Nov 2011 13:30:24 -0700 Message-Id: <1320265849-5744-3-git-send-email-paulmck@linux.vnet.ibm.com> X-Mailer: git-send-email 1.7.3.2 In-Reply-To: <20111102203017.GA3830@linux.vnet.ibm.com> References: <20111102203017.GA3830@linux.vnet.ibm.com> x-cbid: 11110220-5112-0000-0000-0000019EA3B2 Because rcu_read_unlock_special() samples rcu_preempted_readers_exp(rnp) after dropping rnp->lock, the following sequence of events is possible: 1. Task A exits its RCU read-side critical section, and removes itself from the ->blkd_tasks list, releases rnp->lock, and is then preempted. Task B remains on the ->blkd_tasks list, and blocks the current expedited grace period. 2. Task B exits from its RCU read-side critical section and removes itself from the ->blkd_tasks list. Because it is the last task blocking the current expedited grace period, it ends that expedited grace period. 3. Task A resumes, and samples rcu_preempted_readers_exp(rnp) which of course indicates that nothing is blocking the nonexistent expedited grace period. Task A is again preempted. 4. Some other CPU starts an expedited grace period. There are several tasks blocking this expedited grace period queued on the same rcu_node structure that Task A was using in step 1 above. 5. Task A examines its state and incorrectly concludes that it was the last task blocking the expedited grace period on the current rcu_node structure. It therefore reports completion up the rcu_node tree. 6. The expedited grace period can then incorrectly complete before the tasks blocked on this same rcu_node structure exit their RCU read-side critical sections. Arbitrarily bad things happen. This commit therefore takes a snapshot of rcu_preempted_readers_exp(rnp) prior to dropping the lock, so that only the last task thinks that it is the last task, thus avoiding the failure scenario laid out above. Signed-off-by: Paul E. McKenney --- kernel/rcutree_plugin.h | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h index 4b9b9f8..7986053 100644 --- a/kernel/rcutree_plugin.h +++ b/kernel/rcutree_plugin.h @@ -312,6 +312,7 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) { int empty; int empty_exp; + int empty_exp_now; unsigned long flags; struct list_head *np; #ifdef CONFIG_RCU_BOOST @@ -382,8 +383,10 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) /* * If this was the last task on the current list, and if * we aren't waiting on any CPUs, report the quiescent state. - * Note that rcu_report_unblock_qs_rnp() releases rnp->lock. + * Note that rcu_report_unblock_qs_rnp() releases rnp->lock, + * so we must take a snapshot of the expedited state. */ + empty_exp_now = !rcu_preempted_readers_exp(rnp); if (!empty && !rcu_preempt_blocked_readers_cgp(rnp)) { trace_rcu_quiescent_state_report("preempt_rcu", rnp->gpnum, @@ -406,7 +409,7 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) * If this was the last task on the expedited lists, * then we need to report up the rcu_node hierarchy. */ - if (!empty_exp && !rcu_preempted_readers_exp(rnp)) + if (!empty_exp && empty_exp_now) rcu_report_exp_rnp(&rcu_preempt_state, rnp); } else { local_irq_restore(flags);