[RFC,tip/core/rcu,03/41] rcu: Add lockdep-RCU checks for simple self-deadlock

Message ID 1328125319-5205-3-git-send-email-paulmck@linux.vnet.ibm.com
State New
Headers show

Commit Message

Paul E. McKenney Feb. 1, 2012, 7:41 p.m.
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

It is illegal to have a grace period within a same-flavor RCU read-side
critical section, so this commit adds lockdep-RCU checks to splat when
such abuse is encountered.  This commit does not detect more elaborate
RCU deadlock situations.  These situations might be a job for lockdep
enhancements.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutiny.c        |    3 +++
 kernel/rcutiny_plugin.h |    4 ++++
 kernel/rcutree.c        |    6 ++++++
 kernel/rcutree_plugin.h |    3 +++
 kernel/srcu.c           |    4 ++++
 5 files changed, 20 insertions(+), 0 deletions(-)

Comments

Josh Triplett Feb. 2, 2012, 12:55 a.m. | #1
On Wed, Feb 01, 2012 at 11:41:21AM -0800, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> It is illegal to have a grace period within a same-flavor RCU read-side
> critical section, so this commit adds lockdep-RCU checks to splat when
> such abuse is encountered.  This commit does not detect more elaborate
> RCU deadlock situations.  These situations might be a job for lockdep
> enhancements.

Since doing so also violates the prohibition on blocking within an RCU
read-side critical section, wouldn't it suffice to call might_sleep() or
equivalent, which also detects other problems?  (Obviously this doesn't
apply to SRCU, but it applies to the other variants of RCU.)

> --- a/kernel/rcutiny.c
> +++ b/kernel/rcutiny.c
> @@ -319,6 +319,9 @@ static void rcu_process_callbacks(struct softirq_action *unused)
>   */
>  void synchronize_sched(void)
>  {
> +	rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map),
> +			   "Illegal grace period in RCU read-side "
> +			   "critical section");

This message doesn't seem entirely obvious to me.  A grace period didn't
occur; a synchronize call did, which tried to request a grace period
that can never happen.

> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -1816,6 +1816,9 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
>   */
>  void synchronize_sched(void)
>  {
> +	rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map),
> +			   "Illegal synchronize_sched() in RCU-sched "
> +			   "read-side critical section");
>  	if (rcu_blocking_is_gp())
>  		return;
>  	wait_rcu_gp(call_rcu_sched);
> @@ -1833,6 +1836,9 @@ EXPORT_SYMBOL_GPL(synchronize_sched);
>   */
>  void synchronize_rcu_bh(void)
>  {
> +	rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map),
> +			   "Illegal synchronize_sched() in RCU-bh "
> +			   "read-side critical section");

Copy-paste problem here: this should say synchronize_sched_bh.  (Or
perhaps it should say __func__. :) )

> --- a/kernel/srcu.c
> +++ b/kernel/srcu.c
> @@ -172,6 +172,10 @@ static void __synchronize_srcu(struct srcu_struct *sp, void (*sync_func)(void))
>  {
>  	int idx;
>  
> +	rcu_lockdep_assert(!lock_is_held(&sp->dep_map),
> +			   "Illegal SRCU grace period in same-type "
> +			   "SRCU read-side critical section");

Same issue with the message: a grace period didn't occur, and it never
will; a call to synchronize_srcu requesting a grace period occurred.

- Josh Triplett
Paul E. McKenney Feb. 2, 2012, 4:20 p.m. | #2
On Wed, Feb 01, 2012 at 04:55:54PM -0800, Josh Triplett wrote:
> On Wed, Feb 01, 2012 at 11:41:21AM -0800, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > It is illegal to have a grace period within a same-flavor RCU read-side
> > critical section, so this commit adds lockdep-RCU checks to splat when
> > such abuse is encountered.  This commit does not detect more elaborate
> > RCU deadlock situations.  These situations might be a job for lockdep
> > enhancements.
> 
> Since doing so also violates the prohibition on blocking within an RCU
> read-side critical section, wouldn't it suffice to call might_sleep() or
> equivalent, which also detects other problems?  (Obviously this doesn't
> apply to SRCU, but it applies to the other variants of RCU.)

Yes, but...

The advantage of the lockdep-RCU splat is that it gives you a better
hint as to where the RCU read-side critical section was entered, which
is very helpful when tracking these down, especially when they are
intermittent.

On of the downsides of the Linux kernel community being more RCU-savvy
is that the errors they now tend to commit are more complex.  ;-)

And yes, I should also well check for the other variants of RCU read-side
critical section (other than RCU).  Done.

I also glued the strings together to promote grepability as you suggest
later.  (But I leave it to you to get checkpatch.pl upgraded -- it currently
warns about long lines, but not about strings split across lines.)

> > --- a/kernel/rcutiny.c
> > +++ b/kernel/rcutiny.c
> > @@ -319,6 +319,9 @@ static void rcu_process_callbacks(struct softirq_action *unused)
> >   */
> >  void synchronize_sched(void)
> >  {
> > +	rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map),
> > +			   "Illegal grace period in RCU read-side "
> > +			   "critical section");
> 
> This message doesn't seem entirely obvious to me.  A grace period didn't
> occur; a synchronize call did, which tried to request a grace period
> that can never happen.

I suppose I might as well make it consistent with the other messages.  ;-)

> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -1816,6 +1816,9 @@ EXPORT_SYMBOL_GPL(call_rcu_bh);
> >   */
> >  void synchronize_sched(void)
> >  {
> > +	rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map),
> > +			   "Illegal synchronize_sched() in RCU-sched "
> > +			   "read-side critical section");
> >  	if (rcu_blocking_is_gp())
> >  		return;
> >  	wait_rcu_gp(call_rcu_sched);
> > @@ -1833,6 +1836,9 @@ EXPORT_SYMBOL_GPL(synchronize_sched);
> >   */
> >  void synchronize_rcu_bh(void)
> >  {
> > +	rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map),
> > +			   "Illegal synchronize_sched() in RCU-bh "
> > +			   "read-side critical section");
> 
> Copy-paste problem here: this should say synchronize_sched_bh.  (Or
> perhaps it should say __func__. :) )

Fixed, but will pass on __func__ for the moment.  Cool though it might
be to exercise varargs.  ;-)

> > --- a/kernel/srcu.c
> > +++ b/kernel/srcu.c
> > @@ -172,6 +172,10 @@ static void __synchronize_srcu(struct srcu_struct *sp, void (*sync_func)(void))
> >  {
> >  	int idx;
> >  
> > +	rcu_lockdep_assert(!lock_is_held(&sp->dep_map),
> > +			   "Illegal SRCU grace period in same-type "
> > +			   "SRCU read-side critical section");
> 
> Same issue with the message: a grace period didn't occur, and it never
> will; a call to synchronize_srcu requesting a grace period occurred.

Good catch, fixed!

							Thanx, Paul
Josh Triplett Feb. 2, 2012, 7:56 p.m. | #3
On Thu, Feb 02, 2012 at 08:20:17AM -0800, Paul E. McKenney wrote:
> On Wed, Feb 01, 2012 at 04:55:54PM -0800, Josh Triplett wrote:
> > On Wed, Feb 01, 2012 at 11:41:21AM -0800, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > 
> > > It is illegal to have a grace period within a same-flavor RCU read-side
> > > critical section, so this commit adds lockdep-RCU checks to splat when
> > > such abuse is encountered.  This commit does not detect more elaborate
> > > RCU deadlock situations.  These situations might be a job for lockdep
> > > enhancements.
> > 
> > Since doing so also violates the prohibition on blocking within an RCU
> > read-side critical section, wouldn't it suffice to call might_sleep() or
> > equivalent, which also detects other problems?  (Obviously this doesn't
> > apply to SRCU, but it applies to the other variants of RCU.)
> 
> Yes, but...
> 
> The advantage of the lockdep-RCU splat is that it gives you a better
> hint as to where the RCU read-side critical section was entered, which
> is very helpful when tracking these down, especially when they are
> intermittent.

Ah, fair enough.

> And yes, I should also well check for the other variants of RCU read-side
> critical section (other than RCU).  Done.

Oh?  What hadn't you checked for?

> I also glued the strings together to promote grepability as you suggest
> later.  (But I leave it to you to get checkpatch.pl upgraded -- it currently
> warns about long lines, but not about strings split across lines.)

It theoretically shouldn't warn about long lines that consist only of a
quoted string possibly followed by ',' or ');'; it has a check to ignore
those.  After you glued the strings together, what did you end up with?

As for adding a warning about strings broken across lines, that seems
sensible.  Some quick grepping suggests that doing so would catch a pile
of existing code, too.  Patch to follow momentarily.

- Josh Triplett
Paul E. McKenney Feb. 2, 2012, 8:42 p.m. | #4
On Thu, Feb 02, 2012 at 11:56:38AM -0800, Josh Triplett wrote:
> On Thu, Feb 02, 2012 at 08:20:17AM -0800, Paul E. McKenney wrote:
> > On Wed, Feb 01, 2012 at 04:55:54PM -0800, Josh Triplett wrote:
> > > On Wed, Feb 01, 2012 at 11:41:21AM -0800, Paul E. McKenney wrote:
> > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > 
> > > > It is illegal to have a grace period within a same-flavor RCU read-side
> > > > critical section, so this commit adds lockdep-RCU checks to splat when
> > > > such abuse is encountered.  This commit does not detect more elaborate
> > > > RCU deadlock situations.  These situations might be a job for lockdep
> > > > enhancements.
> > > 
> > > Since doing so also violates the prohibition on blocking within an RCU
> > > read-side critical section, wouldn't it suffice to call might_sleep() or
> > > equivalent, which also detects other problems?  (Obviously this doesn't
> > > apply to SRCU, but it applies to the other variants of RCU.)
> > 
> > Yes, but...
> > 
> > The advantage of the lockdep-RCU splat is that it gives you a better
> > hint as to where the RCU read-side critical section was entered, which
> > is very helpful when tracking these down, especially when they are
> > intermittent.
> 
> Ah, fair enough.
> 
> > And yes, I should also well check for the other variants of RCU read-side
> > critical section (other than RCU).  Done.
> 
> Oh?  What hadn't you checked for?

Things like synchronize_sched() in rcu_read_lock() critical section
and vice versa.

> > I also glued the strings together to promote grepability as you suggest
> > later.  (But I leave it to you to get checkpatch.pl upgraded -- it currently
> > warns about long lines, but not about strings split across lines.)
> 
> It theoretically shouldn't warn about long lines that consist only of a
> quoted string possibly followed by ',' or ');'; it has a check to ignore
> those.  After you glued the strings together, what did you end up with?

You are quite right -- it ignores the lines with long strings.

> As for adding a warning about strings broken across lines, that seems
> sensible.  Some quick grepping suggests that doing so would catch a pile
> of existing code, too.  Patch to follow momentarily.

Indeed, I might not be the only one to overgeneralize from the 80-character
warning.  ;-)

							Thanx, Paul
Josh Triplett Feb. 3, 2012, 9:04 a.m. | #5
On Thu, Feb 02, 2012 at 12:42:06PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 02, 2012 at 11:56:38AM -0800, Josh Triplett wrote:
> > On Thu, Feb 02, 2012 at 08:20:17AM -0800, Paul E. McKenney wrote:
> > > On Wed, Feb 01, 2012 at 04:55:54PM -0800, Josh Triplett wrote:
> > > > On Wed, Feb 01, 2012 at 11:41:21AM -0800, Paul E. McKenney wrote:
> > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > 
> > > > > It is illegal to have a grace period within a same-flavor RCU read-side
> > > > > critical section, so this commit adds lockdep-RCU checks to splat when
> > > > > such abuse is encountered.  This commit does not detect more elaborate
> > > > > RCU deadlock situations.  These situations might be a job for lockdep
> > > > > enhancements.
> > > > 
> > > > Since doing so also violates the prohibition on blocking within an RCU
> > > > read-side critical section, wouldn't it suffice to call might_sleep() or
> > > > equivalent, which also detects other problems?  (Obviously this doesn't
> > > > apply to SRCU, but it applies to the other variants of RCU.)
> > > 
> > > Yes, but...
> > > 
> > > The advantage of the lockdep-RCU splat is that it gives you a better
> > > hint as to where the RCU read-side critical section was entered, which
> > > is very helpful when tracking these down, especially when they are
> > > intermittent.
> > 
> > Ah, fair enough.
> > 
> > > And yes, I should also well check for the other variants of RCU read-side
> > > critical section (other than RCU).  Done.
> > 
> > Oh?  What hadn't you checked for?
> 
> Things like synchronize_sched() in rcu_read_lock() critical section
> and vice versa.

Ouch.  Good idea.

That also suggests another interesting possibility: lockdep could tag
pointers used in the flavor-specific rcu_dereference variants and
pointers used in the call_rcu variants to make sure nobody uses multiple
variants on the same pointer. :)  (Assuming we don't want
flavor-specific __rcu_* pointer tags.)

Speaking of which, could kfree_rcu require its argument to have the
__rcu type annotation?  We can't necessarily guarantee that for call_rcu
in all cases, but I think we can for kfree_rcu.

- Josh Triplett
Paul E. McKenney Feb. 3, 2012, 6:05 p.m. | #6
On Fri, Feb 03, 2012 at 01:04:49AM -0800, Josh Triplett wrote:
> On Thu, Feb 02, 2012 at 12:42:06PM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 02, 2012 at 11:56:38AM -0800, Josh Triplett wrote:
> > > On Thu, Feb 02, 2012 at 08:20:17AM -0800, Paul E. McKenney wrote:
> > > > On Wed, Feb 01, 2012 at 04:55:54PM -0800, Josh Triplett wrote:
> > > > > On Wed, Feb 01, 2012 at 11:41:21AM -0800, Paul E. McKenney wrote:
> > > > > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > > > > 
> > > > > > It is illegal to have a grace period within a same-flavor RCU read-side
> > > > > > critical section, so this commit adds lockdep-RCU checks to splat when
> > > > > > such abuse is encountered.  This commit does not detect more elaborate
> > > > > > RCU deadlock situations.  These situations might be a job for lockdep
> > > > > > enhancements.
> > > > > 
> > > > > Since doing so also violates the prohibition on blocking within an RCU
> > > > > read-side critical section, wouldn't it suffice to call might_sleep() or
> > > > > equivalent, which also detects other problems?  (Obviously this doesn't
> > > > > apply to SRCU, but it applies to the other variants of RCU.)
> > > > 
> > > > Yes, but...
> > > > 
> > > > The advantage of the lockdep-RCU splat is that it gives you a better
> > > > hint as to where the RCU read-side critical section was entered, which
> > > > is very helpful when tracking these down, especially when they are
> > > > intermittent.
> > > 
> > > Ah, fair enough.
> > > 
> > > > And yes, I should also well check for the other variants of RCU read-side
> > > > critical section (other than RCU).  Done.
> > > 
> > > Oh?  What hadn't you checked for?
> > 
> > Things like synchronize_sched() in rcu_read_lock() critical section
> > and vice versa.
> 
> Ouch.  Good idea.
> 
> That also suggests another interesting possibility: lockdep could tag
> pointers used in the flavor-specific rcu_dereference variants and
> pointers used in the call_rcu variants to make sure nobody uses multiple
> variants on the same pointer. :)  (Assuming we don't want
> flavor-specific __rcu_* pointer tags.)

Indeed, the last attempt to produce flavor-specific __rcu_* pointer tags
turned into quite a mess.  The other issue with it is that it looks like 
there are reasonable use cases for protecting a given pointer with multiple
flavors of RCU.  I don't know if any of them have actually made it into
mainline, but there have been a number of discussions involving them.

> Speaking of which, could kfree_rcu require its argument to have the
> __rcu type annotation?  We can't necessarily guarantee that for call_rcu
> in all cases, but I think we can for kfree_rcu.

It might make sense -- I have added it to my list of things to think
about for RCU.

							Thanx, Paul

Patch

diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 977296d..904cda9 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -319,6 +319,9 @@  static void rcu_process_callbacks(struct softirq_action *unused)
  */
 void synchronize_sched(void)
 {
+	rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map),
+			   "Illegal grace period in RCU read-side "
+			   "critical section");
 	cond_resched();
 }
 EXPORT_SYMBOL_GPL(synchronize_sched);
diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 9cb1ae4..dd672e7 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -706,6 +706,10 @@  EXPORT_SYMBOL_GPL(call_rcu);
  */
 void synchronize_rcu(void)
 {
+	rcu_lockdep_assert(!lock_is_held(&rcu_lock_map),
+			   "Illegal synchronize_rcu() in RCU read-side "
+			   "critical section");
+
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	if (!rcu_scheduler_active)
 		return;
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 6c4a672..d0f8e64 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1816,6 +1816,9 @@  EXPORT_SYMBOL_GPL(call_rcu_bh);
  */
 void synchronize_sched(void)
 {
+	rcu_lockdep_assert(!lock_is_held(&rcu_sched_lock_map),
+			   "Illegal synchronize_sched() in RCU-sched "
+			   "read-side critical section");
 	if (rcu_blocking_is_gp())
 		return;
 	wait_rcu_gp(call_rcu_sched);
@@ -1833,6 +1836,9 @@  EXPORT_SYMBOL_GPL(synchronize_sched);
  */
 void synchronize_rcu_bh(void)
 {
+	rcu_lockdep_assert(!lock_is_held(&rcu_bh_lock_map),
+			   "Illegal synchronize_sched() in RCU-bh "
+			   "read-side critical section");
 	if (rcu_blocking_is_gp())
 		return;
 	wait_rcu_gp(call_rcu_bh);
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 8bb35d7..cc07bce 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -688,6 +688,9 @@  EXPORT_SYMBOL_GPL(call_rcu);
  */
 void synchronize_rcu(void)
 {
+	rcu_lockdep_assert(!lock_is_held(&rcu_lock_map),
+			   "Illegal synchronize_rcu() in RCU read-side "
+			   "critical section");
 	if (!rcu_scheduler_active)
 		return;
 	wait_rcu_gp(call_rcu);
diff --git a/kernel/srcu.c b/kernel/srcu.c
index 0febf61..1959763 100644
--- a/kernel/srcu.c
+++ b/kernel/srcu.c
@@ -172,6 +172,10 @@  static void __synchronize_srcu(struct srcu_struct *sp, void (*sync_func)(void))
 {
 	int idx;
 
+	rcu_lockdep_assert(!lock_is_held(&sp->dep_map),
+			   "Illegal SRCU grace period in same-type "
+			   "SRCU read-side critical section");
+
 	idx = sp->completed;
 	mutex_lock(&sp->mutex);