From patchwork Wed Nov 2 20:30:26 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 4912
From: "Paul E. McKenney"
To: linux-kernel@vger.kernel.org
Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com,
 akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
 josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
 peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu,
 dhowells@redhat.com, eric.dumazet@gmail.com, darren@dvhart.com,
 patches@linaro.org, "Paul E. McKenney"
Subject: [PATCH RFC tip/core/rcu 05/28] lockdep: Update documentation for lock-class leak detection
Date: Wed, 2 Nov 2011 13:30:26 -0700
Message-Id: <1320265849-5744-5-git-send-email-paulmck@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.7.3.2
In-Reply-To: <20111102203017.GA3830@linux.vnet.ibm.com>
References: <20111102203017.GA3830@linux.vnet.ibm.com>

There are a number of bugs that can leak or overuse lock classes,
which can cause the maximum number of lock classes (currently 8191)
to be exceeded. However, the documentation does not tell you how
to track down these problems.
This commit addresses this shortcoming.

Signed-off-by: Paul E. McKenney
---
 Documentation/lockdep-design.txt |   61 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/Documentation/lockdep-design.txt b/Documentation/lockdep-design.txt
index abf768c..383bb23 100644
--- a/Documentation/lockdep-design.txt
+++ b/Documentation/lockdep-design.txt
@@ -221,3 +221,64 @@ when the chain is validated for the first time, is then put into a hash
 table, which hash-table can be checked in a lockfree manner. If the
 locking chain occurs again later on, the hash table tells us that we
 dont have to validate the chain again.
+
+Troubleshooting:
+----------------
+
+The validator tracks a maximum of MAX_LOCKDEP_KEYS lock classes.
+Exceeding this number will trigger the following lockdep warning:
+
+	(DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS))
+
+By default, MAX_LOCKDEP_KEYS is currently set to 8191, and typical
+desktop systems have fewer than 1,000 lock classes, so this warning
+normally results from lock-class leakage or failure to properly
+initialize locks. These two problems are illustrated below:
+
+1. Repeated module loading and unloading while running the validator
+   will result in lock-class leakage. The issue here is that each
+   load of the module will create a new set of lock classes for that
+   module's locks, but module unloading does not remove old classes.
+   Therefore, if that module is loaded and unloaded repeatedly,
+   the number of lock classes will eventually reach the maximum.
+
+2. Using structures such as arrays that have large numbers of
+   locks that are not explicitly initialized. For example,
+   a hash table with 8192 buckets where each bucket has its
+   own spinlock_t will consume 8192 lock classes -unless- each
+   spinlock is initialized, for example, using spin_lock_init().
+   Failure to properly initialize the per-bucket spinlocks would
+   guarantee lock-class overflow. In contrast, a loop that called
+   spin_lock_init() on each lock would place all 8192 locks into a
+   single lock class.
+
+   The moral of this story is that you should always explicitly
+   initialize your locks.
+
+One might argue that the validator should be modified to allow lock
+classes to be reused. However, if you are tempted to make this argument,
+first review the code and think through the changes that would be
+required, keeping in mind that the lock classes to be removed are likely
+to be linked into the lock-dependency graph. This turns out to be
+harder to do than to say.
+
+Of course, if you do run out of lock classes, the next thing to do is
+to find the offending lock classes. First, the following command gives
+you the number of lock classes currently in use along with the maximum:
+
+	grep "lock-classes" /proc/lockdep_stats
+
+This command produces the following output on a modest Power system:
+
+	 lock-classes:                          748 [max: 8191]
+
+If the number allocated (748 above) increases continually over time,
+then there is likely a leak. The following command can be used to
+identify the leaking lock classes:
+
+	grep "BD" /proc/lockdep
+
+Run the command and save the output, then compare against the output
+from a later run of this command to identify the leakers. This same
+output can also help you find situations where lock initialization
+has been omitted.
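The save-and-compare step described in the troubleshooting text can be
sketched as a small shell helper. This is only an illustration and is not
part of the patch: the function name lockdep_growth and the snapshot file
names are hypothetical, and the sketch assumes that each /proc/lockdep line
ends with ": " followed by the lock-class name, which should be checked
against the output of the kernel actually being debugged.

```shell
#!/bin/sh
# Sketch: report lock-class names that occur more often in a later
# snapshot of `grep "BD" /proc/lockdep` than in an earlier one.
#
# ASSUMPTION: the class name is the text after the last ": " on each
# line. lockdep_growth and the file names below are hypothetical.
#
# Usage (snapshot file names are placeholders):
#   grep "BD" /proc/lockdep > lockdep.before
#   ... load and unload the suspect module a few times ...
#   grep "BD" /proc/lockdep > lockdep.after
#   lockdep_growth lockdep.before lockdep.after

lockdep_growth() {
    # $1 = earlier snapshot, $2 = later snapshot
    awk '
        # Strip everything up to and including the last ": ".
        { name = $0; sub(/.*: /, "", name) }

        # Count occurrences of each name, separately per snapshot.
        FILENAME == ARGV[1] { before[name]++; next }
        { after[name]++ }

        # Print only the names whose count grew between snapshots.
        END {
            for (name in after)
                if (after[name] > before[name])
                    printf "%d -> %d  %s\n", before[name], after[name], name
        }
    ' "$1" "$2" | sort
}
```

Counting by name rather than diffing whole lines is a deliberate choice in
this sketch: the BD counts of a healthy class may change between runs, so a
plain diff would flag lines that are not leaks. A name whose count keeps
growing across successive snapshots is a candidate for the module
load/unload case, and a name that appears thousands of times at once is a
candidate for the missing-initialization case.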