From patchwork Thu Sep 20 18:47:58 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 11592 Return-Path: X-Original-To: patchwork@peony.canonical.com Delivered-To: patchwork@peony.canonical.com Received: from fiordland.canonical.com (fiordland.canonical.com [91.189.94.145]) by peony.canonical.com (Postfix) with ESMTP id C23F523E54 for ; Thu, 20 Sep 2012 18:48:51 +0000 (UTC) Received: from mail-ie0-f180.google.com (mail-ie0-f180.google.com [209.85.223.180]) by fiordland.canonical.com (Postfix) with ESMTP id 42227A1823C for ; Thu, 20 Sep 2012 18:48:51 +0000 (UTC) Received: by mail-ie0-f180.google.com with SMTP id e10so3419662iej.11 for ; Thu, 20 Sep 2012 11:48:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-forwarded-to:x-forwarded-for:delivered-to:received-spf:from:to:cc :subject:date:message-id:x-mailer:in-reply-to:references :x-content-scanned:x-cbid:x-gm-message-state; bh=SMaJWjJu5woD6RzewIw5TkGMPEOGCHSDUTMhQwfSgRI=; b=i8RYPVQ4MqtmEMuW0jaR1Y9vKBJhMw55zVh2pK/7aWzwSQQNYMltL2xAHoXL5lO9PZ DKFs6z7IkSByF5FIOjCL1LvRzMuaDcgAPuWHNj+VG2s5t4ZvgwvHuBIeGrmzu+fL3m9y v+fSmcsZf/SMa8/CLi9XktVZZoPwePyMcruk4TBlx+rLxHz8u1wD2ciHTUVTIRbZaXiE PD75sW7IENPvdJQd2NKsGeNFIs7T8vzF5i7RU4p2f+swar6I3okg/pyiB5rsu8sz3iTb YhRkG3j2g8oTmZGv6c7EAdr39TnatE+AMTTaCZZLjozsaCOYIUDmdpLSmle+72N5Lf+w 8A6A== Received: by 10.50.154.227 with SMTP id vr3mr2952365igb.43.1348166931061; Thu, 20 Sep 2012 11:48:51 -0700 (PDT) X-Forwarded-To: linaro-patchwork@canonical.com X-Forwarded-For: patch@linaro.org linaro-patchwork@canonical.com Delivered-To: patches@linaro.org Received: by 10.50.184.232 with SMTP id ex8csp92207igc; Thu, 20 Sep 2012 11:48:50 -0700 (PDT) Received: by 10.182.8.6 with SMTP id n6mr2022483oba.39.1348166930522; Thu, 20 Sep 2012 11:48:50 -0700 (PDT) Received: from e38.co.us.ibm.com (e38.co.us.ibm.com. [32.97.110.159]) by mx.google.com with ESMTPS id lg5si5702058obb.110.2012.09.20.11.48.50 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 20 Sep 2012 11:48:50 -0700 (PDT) Received-SPF: pass (google.com: domain of paulmck@linux.vnet.ibm.com designates 32.97.110.159 as permitted sender) client-ip=32.97.110.159; Authentication-Results: mx.google.com; spf=pass (google.com: domain of paulmck@linux.vnet.ibm.com designates 32.97.110.159 as permitted sender) smtp.mail=paulmck@linux.vnet.ibm.com Received: from /spool/local by e38.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 20 Sep 2012 12:48:50 -0600 Received: from d03dlp03.boulder.ibm.com (9.17.202.179) by e38.co.us.ibm.com (192.168.1.138) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Thu, 20 Sep 2012 12:48:48 -0600 Received: from d03relay03.boulder.ibm.com (d03relay03.boulder.ibm.com [9.17.195.228]) by d03dlp03.boulder.ibm.com (Postfix) with ESMTP id 75F5419D8043 for ; Thu, 20 Sep 2012 12:48:37 -0600 (MDT) Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by d03relay03.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q8KImTe9207680 for ; Thu, 20 Sep 2012 12:48:31 -0600 Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q8KImNMr020735 for ; Thu, 20 Sep 2012 12:48:26 -0600 Received: from paulmck-ThinkPad-W500 ([9.47.24.72]) by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q8KImMb8020657; Thu, 20 Sep 2012 12:48:22 -0600 Received: by paulmck-ThinkPad-W500 (Postfix, from userid 1000) id A14ADE5222; Thu, 20 Sep 2012 11:48:21 -0700 (PDT) From: "Paul E. McKenney" To: linux-kernel@vger.kernel.org Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com, akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca, josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu, dhowells@redhat.com, eric.dumazet@gmail.com, darren@dvhart.com, fweisbec@gmail.com, sbw@mit.edu, patches@linaro.org, "Paul E. McKenney" Subject: [PATCH tip/core/rcu 02/23] rcu: Prevent initialization-time quiescent-state race Date: Thu, 20 Sep 2012 11:47:58 -0700 Message-Id: <1348166900-18716-2-git-send-email-paulmck@linux.vnet.ibm.com> X-Mailer: git-send-email 1.7.8 In-Reply-To: <1348166900-18716-1-git-send-email-paulmck@linux.vnet.ibm.com> References: <20120920184751.GA18657@linux.vnet.ibm.com> <1348166900-18716-1-git-send-email-paulmck@linux.vnet.ibm.com> X-Content-Scanned: Fidelis XPS MAILER x-cbid: 12092018-5518-0000-0000-000007CEF40F X-Gm-Message-State: ALoCoQkw5lHd9FSibbetgP1ZwiBnZMummNFxvVPBaaLYc38o2wNikh/Pred8TfHskPCo5SYQC2OT From: "Paul E. McKenney" The next step in reducing RCU's grace-period initialization latency on large systems will make this initialization preemptible. Unfortunately, making the grace-period initialization subject to interrupts (let alone preemption) exposes the following race on systems whose rcu_node tree contains more than one node: 1. CPU 31 starts initializing the grace period, including the first leaf rcu_node structures, and is then preempted. 2. CPU 0 refers to the first leaf rcu_node structure, and notes that a new grace period has started. It passes through a quiescent state shortly thereafter, and informs the RCU core of this rite of passage. 3. CPU 0 enters an RCU read-side critical section, acquiring a pointer to an RCU-protected data item. 4. CPU 31 takes an interrupt whose handler removes the data item referenced by CPU 0 from the data structure, and registers an RCU callback in order to free it. 5. CPU 31 resumes initializing the grace period, including its own rcu_node structure. In invokes rcu_start_gp_per_cpu(), which advances all callbacks, including the one registered in #4 above, to be handled by the current grace period. 6. The remaining CPUs pass through quiescent states and inform the RCU core, but CPU 0 remains in its RCU read-side critical section, still referencing the now-removed data item. 7. The grace period completes and all the callbacks are invoked, including the one that frees the data item that CPU 0 is still referencing. Oops!!! One way to avoid this race is to remove grace-period acceleration from rcu_start_gp_per_cpu(). Now, the only reason for this acceleration was to allow CPUs bringing RCU out of idle state to have their callbacks invoked after only one grace period, rather than the two grace periods that would otherwise be required. But this acceleration does not work when RCU grace-period initialization is moved to a kthread because the CPU posting the callback is no longer necessarily the CPU that is initializing the resulting grace period. This commit therefore removes this now-pointless (and soon to be dangerous) grace-period acceleration, thus avoiding the above race. Signed-off-by: Paul E. McKenney --- kernel/rcutree.c | 14 -------------- 1 files changed, 0 insertions(+), 14 deletions(-) diff --git a/kernel/rcutree.c b/kernel/rcutree.c index 5b4b093..0df9aaa 100644 --- a/kernel/rcutree.c +++ b/kernel/rcutree.c @@ -1021,20 +1021,6 @@ rcu_start_gp_per_cpu(struct rcu_state *rsp, struct rcu_node *rnp, struct rcu_dat /* Prior grace period ended, so advance callbacks for current CPU. */ __rcu_process_gp_end(rsp, rnp, rdp); - /* - * Because this CPU just now started the new grace period, we know - * that all of its callbacks will be covered by this upcoming grace - * period, even the ones that were registered arbitrarily recently. - * Therefore, advance all outstanding callbacks to RCU_WAIT_TAIL. - * - * Other CPUs cannot be sure exactly when the grace period started. - * Therefore, their recently registered callbacks must pass through - * an additional RCU_NEXT_READY stage, so that they will be handled - * by the next RCU grace period. - */ - rdp->nxttail[RCU_NEXT_READY_TAIL] = rdp->nxttail[RCU_NEXT_TAIL]; - rdp->nxttail[RCU_WAIT_TAIL] = rdp->nxttail[RCU_NEXT_TAIL]; - /* Set state so that this CPU will detect the next quiescent state. */ __note_new_gpnum(rsp, rnp, rdp); }