From patchwork Thu Sep 20 18:48:04 2012
X-Patchwork-Submitter: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
X-Patchwork-Id: 11588
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu,
	dhowells@redhat.com, eric.dumazet@gmail.com, darren@dvhart.com,
	fweisbec@gmail.com, sbw@mit.edu, patches@linaro.org,
	"Paul E. McKenney" <paul.mckenney@linaro.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: [PATCH tip/core/rcu 08/23] rcu: Provide OOM handler to motivate lazy RCU callbacks
Date: Thu, 20 Sep 2012 11:48:04 -0700
Message-Id: <1348166900-18716-8-git-send-email-paulmck@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.7.8
In-Reply-To: <1348166900-18716-1-git-send-email-paulmck@linux.vnet.ibm.com>
References: <20120920184751.GA18657@linux.vnet.ibm.com>
	<1348166900-18716-1-git-send-email-paulmck@linux.vnet.ibm.com>

From: "Paul E. McKenney" <paul.mckenney@linaro.org>

In kernels built with CONFIG_RCU_FAST_NO_HZ=y, CPUs can accumulate a
large number of lazy callbacks, which, as the name implies, will be slow
to be invoked.  This can be a problem on small-memory systems, where the
default 6-second sleep for CPUs having only lazy RCU callbacks could
well be fatal.  This commit therefore installs an OOM handler that
ensures that every CPU with lazy callbacks has at least one non-lazy
callback, in turn ensuring timely advancement for these callbacks.

Updated to fix a bug that disabled OOM killing, noted by Lai Jiangshan.

Updated to push the for_each_rcu_flavor() loop into
rcu_oom_notify_cpu(), thus reducing the number of IPIs, as suggested by
Steven Rostedt.  Also updated to make the for_each_online_cpu() loop
preemptible.  (Later, it might be good to use smp_call_function(), as
suggested by Peter Zijlstra.)

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
---
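(Background note, not part of the commit: the patch hangs off the
kernel's OOM notifier chain.  Below is a minimal, self-contained sketch
of that registration pattern, written as a toy module.  The demo_* names
are invented for illustration; register_oom_notifier(),
unregister_oom_notifier(), and struct notifier_block are the actual
interfaces the patch uses.)

/*
 * Toy module sketching the OOM-notifier registration pattern; the
 * demo_* names are hypothetical.
 */
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/oom.h>

static int demo_oom_notify(struct notifier_block *self,
			   unsigned long unused, void *nfreed)
{
	/*
	 * Invoked on the OOM path.  A handler that frees pages directly
	 * would increment *(unsigned long *)nfreed; like the patch
	 * below, this one frees nothing directly, so it leaves the
	 * count untouched.
	 */
	pr_info("demo: OOM notifier invoked\n");
	return NOTIFY_OK;
}

static struct notifier_block demo_oom_nb = {
	.notifier_call = demo_oom_notify,
};

static int __init demo_init(void)
{
	return register_oom_notifier(&demo_oom_nb);
}

static void __exit demo_exit(void)
{
	unregister_oom_notifier(&demo_oom_nb);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");

(Leaving *nfreed untouched is deliberate: as the commit log says,
because an uncertain amount of memory will be freed in some uncertain
timeframe, the handler does not claim to have freed anything.)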
 kernel/rcutree.h        |    5 ++-
 kernel/rcutree_plugin.h |   83 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index 117a150..effb273 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -315,8 +315,11 @@ struct rcu_data {
 	unsigned long n_rp_need_fqs;
 	unsigned long n_rp_need_nothing;
 
-	/* 6) _rcu_barrier() callback. */
+	/* 6) _rcu_barrier() and OOM callbacks. */
 	struct rcu_head barrier_head;
+#ifdef CONFIG_RCU_FAST_NO_HZ
+	struct rcu_head oom_head;
+#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
 
 	int cpu;
 	struct rcu_state *rsp;
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 7f3244c..5879636 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -25,6 +25,7 @@
  */
 
 #include <linux/delay.h>
+#include <linux/oom.h>
 
 #define RCU_KTHREAD_PRIO 1
 
@@ -2112,6 +2113,88 @@ static void rcu_idle_count_callbacks_posted(void)
 	__this_cpu_add(rcu_dynticks.nonlazy_posted, 1);
 }
 
+/*
+ * Data for flushing lazy RCU callbacks at OOM time.
+ */
+static atomic_t oom_callback_count;
+static DECLARE_WAIT_QUEUE_HEAD(oom_callback_wq);
+
+/*
+ * RCU OOM callback -- decrement the outstanding count and deliver the
+ * wake-up if we are the last one.
+ */
+static void rcu_oom_callback(struct rcu_head *rhp)
+{
+	if (atomic_dec_and_test(&oom_callback_count))
+		wake_up(&oom_callback_wq);
+}
+
+/*
+ * Post an rcu_oom_notify callback on the current CPU if it has at
+ * least one lazy callback.  This will unnecessarily post callbacks
+ * to CPUs that already have a non-lazy callback at the end of their
+ * callback list, but this is an infrequent operation, so accept some
+ * extra overhead to keep things simple.
+ */
+static void rcu_oom_notify_cpu(void *unused)
+{
+	struct rcu_state *rsp;
+	struct rcu_data *rdp;
+
+	for_each_rcu_flavor(rsp) {
+		rdp = __this_cpu_ptr(rsp->rda);
+		if (rdp->qlen_lazy != 0) {
+			atomic_inc(&oom_callback_count);
+			rsp->call(&rdp->oom_head, rcu_oom_callback);
+		}
+	}
+}
+
+/*
+ * If low on memory, ensure that each CPU has a non-lazy callback.
+ * This will wake up CPUs that have only lazy callbacks, in turn
+ * ensuring that they free up the corresponding memory in a timely
+ * manner.  Because an uncertain amount of memory will be freed in
+ * some uncertain timeframe, we do not claim to have freed anything.
+ */
+static int rcu_oom_notify(struct notifier_block *self,
+			  unsigned long notused, void *nfreed)
+{
+	int cpu;
+
+	/* Wait for callbacks from earlier instance to complete. */
+	wait_event(oom_callback_wq, atomic_read(&oom_callback_count) == 0);
+
+	/*
+	 * Prevent premature wakeup: ensure that all increments happen
+	 * before there is a chance of the counter reaching zero.
+	 */
+	atomic_set(&oom_callback_count, 1);
+
+	get_online_cpus();
+	for_each_online_cpu(cpu) {
+		smp_call_function_single(cpu, rcu_oom_notify_cpu, NULL, 1);
+		cond_resched();
+	}
+	put_online_cpus();
+
+	/* Unconditionally decrement: no need to wake ourselves up. */
+	atomic_dec(&oom_callback_count);
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block rcu_oom_nb = {
+	.notifier_call = rcu_oom_notify
+};
+
+static int __init rcu_register_oom_notifier(void)
+{
+	register_oom_notifier(&rcu_oom_nb);
+	return 0;
+}
+early_initcall(rcu_register_oom_notifier);
+
 #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
 
 #ifdef CONFIG_RCU_CPU_STALL_INFO
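(A further background note on the oom_callback_count handshake:
rcu_oom_notify() starts the counter at one so that the count cannot
reach zero, and thus cannot deliver a wakeup, until every per-CPU
increment has been posted; it then drops that initial reference at the
end.  This is the usual "bias count" completion idiom.  A standalone
userspace C11 sketch of the same idiom, with all names invented for
illustration, compiled with -pthread:)

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int callback_count;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done = PTHREAD_COND_INITIALIZER;

static void *worker(void *arg)	/* stands in for rcu_oom_callback() */
{
	(void)arg;
	/* Last decrementer delivers the wakeup, as in the patch. */
	if (atomic_fetch_sub(&callback_count, 1) == 1) {
		pthread_mutex_lock(&lock);
		pthread_cond_broadcast(&done);
		pthread_mutex_unlock(&lock);
	}
	return NULL;
}

int main(void)
{
	pthread_t tid[4];
	int i;

	/* Bias the count so it cannot hit zero prematurely. */
	atomic_store(&callback_count, 1);
	for (i = 0; i < 4; i++) {
		atomic_fetch_add(&callback_count, 1);
		pthread_create(&tid[i], NULL, worker, NULL);
	}
	/* Drop the bias; only now can the count reach zero. */
	atomic_fetch_sub(&callback_count, 1);

	/* Analogue of the wait_event() at rcu_oom_notify() entry. */
	pthread_mutex_lock(&lock);
	while (atomic_load(&callback_count) != 0)
		pthread_cond_wait(&done, &lock);
	pthread_mutex_unlock(&lock);

	printf("all callbacks completed\n");
	for (i = 0; i < 4; i++)
		pthread_join(tid[i], NULL);
	return 0;
}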