diff mbox

[tip/core/rcu,11/15] rcu: Avoid spurious RCU CPU stall warnings

Message ID 1346352988-32444-11-git-send-email-paulmck@linux.vnet.ibm.com
State Accepted
Commit c96ea7cfdd88d0a67c970502bc5313fede34b86b
Headers show

Commit Message

Paul E. McKenney Aug. 30, 2012, 6:56 p.m. UTC
From: "Paul E. McKenney" <paul.mckenney@linaro.org>

If a given CPU avoids the idle loop but also avoids starting a new
RCU grace period for a full minute, RCU can issue spurious RCU CPU
stall warnings.  This commit fixes this issue by adding a check for
ongoing grace period to avoid these spurious stall warnings.

Reported-by: Becky Bruce <bgillbruce@gmail.com>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

Comments

Josh Triplett Aug. 31, 2012, 6:24 p.m. UTC | #1
On Thu, Aug 30, 2012 at 11:56:24AM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> If a given CPU avoids the idle loop but also avoids starting a new
> RCU grace period for a full minute, RCU can issue spurious RCU CPU
> stall warnings.  This commit fixes this issue by adding a check for
> ongoing grace period to avoid these spurious stall warnings.
> 
> Reported-by: Becky Bruce <bgillbruce@gmail.com>
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Reviewed-by: Josh Triplett <josh@joshtriplett.org>

> ---
>  kernel/rcutree.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index fbe43b0..e58097b 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -820,7 +820,8 @@ static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
>  	j = ACCESS_ONCE(jiffies);
>  	js = ACCESS_ONCE(rsp->jiffies_stall);
>  	rnp = rdp->mynode;
> -	if ((ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
> +	if (rcu_gp_in_progress(rsp) &&
> +	    (ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
>  
>  		/* We haven't checked in, so go dump stack. */
>  		print_cpu_stall(rsp);
> -- 
> 1.7.8
>
Peter Zijlstra Sept. 6, 2012, 2:56 p.m. UTC | #2
On Thu, 2012-08-30 at 11:56 -0700, Paul E. McKenney wrote:
> 
> If a given CPU avoids the idle loop but also avoids starting a new
> RCU grace period for a full minute, RCU can issue spurious RCU CPU
> stall warnings.  This commit fixes this issue by adding a check for
> ongoing grace period to avoid these spurious stall warnings. 

How would it avoid starting a new period for over a minute? fqs should
happen, right? And holding rcu_read_lock() for over a minute surely is a
bug.
Steven Rostedt Sept. 6, 2012, 3:07 p.m. UTC | #3
On Thu, 2012-09-06 at 16:56 +0200, Peter Zijlstra wrote:
> On Thu, 2012-08-30 at 11:56 -0700, Paul E. McKenney wrote:
> > 
> > If a given CPU avoids the idle loop but also avoids starting a new
> > RCU grace period for a full minute, RCU can issue spurious RCU CPU
> > stall warnings.  This commit fixes this issue by adding a check for
> > ongoing grace period to avoid these spurious stall warnings. 
> 
> How would it avoid starting a new period for over a minute? fqs should
> happen, right? And holding rcu_read_lock() for over a minute surely is a
> bug.

I can see this happening in test cases, but it would seem weird on a
normal system. That is, for preempt rcu, having a process scheduled out
holding an rcu_read_lock() for over a minute could happen on a really
stressed out system. But for such a case, I don't think a warning is out
of question.

-- Steve
Peter Zijlstra Sept. 6, 2012, 3:19 p.m. UTC | #4
On Thu, 2012-09-06 at 11:07 -0400, Steven Rostedt wrote:
> On Thu, 2012-09-06 at 16:56 +0200, Peter Zijlstra wrote:
> > On Thu, 2012-08-30 at 11:56 -0700, Paul E. McKenney wrote:
> > > 
> > > If a given CPU avoids the idle loop but also avoids starting a new
> > > RCU grace period for a full minute, RCU can issue spurious RCU CPU
> > > stall warnings.  This commit fixes this issue by adding a check for
> > > ongoing grace period to avoid these spurious stall warnings. 
> > 
> > How would it avoid starting a new period for over a minute? fqs should
> > happen, right? And holding rcu_read_lock() for over a minute surely is a
> > bug.
> 
> I can see this happening in test cases, but it would seem weird on a
> normal system. That is, for preempt rcu, having a process scheduled out
> holding an rcu_read_lock() for over a minute could happen on a really
> stressed out system. But for such a case, I don't think a warning is out
> of question.

One would hope that fqs would boost things.. but yeah, if your app is
spinning above the rcu boost prio you're still toast. But in that case
you're right, a warning is fully deserved.
Paul E. McKenney Sept. 6, 2012, 9:03 p.m. UTC | #5
On Thu, Sep 06, 2012 at 05:19:18PM +0200, Peter Zijlstra wrote:
> On Thu, 2012-09-06 at 11:07 -0400, Steven Rostedt wrote:
> > On Thu, 2012-09-06 at 16:56 +0200, Peter Zijlstra wrote:
> > > On Thu, 2012-08-30 at 11:56 -0700, Paul E. McKenney wrote:
> > > > 
> > > > If a given CPU avoids the idle loop but also avoids starting a new
> > > > RCU grace period for a full minute, RCU can issue spurious RCU CPU
> > > > stall warnings.  This commit fixes this issue by adding a check for
> > > > ongoing grace period to avoid these spurious stall warnings. 
> > > 
> > > How would it avoid starting a new period for over a minute? fqs should
> > > happen, right? And holding rcu_read_lock() for over a minute surely is a
> > > bug.
> > 
> > I can see this happening in test cases, but it would seem weird on a
> > normal system. That is, for preempt rcu, having a process scheduled out
> > holding an rcu_read_lock() for over a minute could happen on a really
> > stressed out system. But for such a case, I don't think a warning is out
> > of question.
> 
> One would hope that fqs would boost things.. but yeah, if your app is
> spinning above the rcu boost prio you're still toast. But in that case
> you're right, a warning is fully deserved.

Here are a few other ways that stalls can happen:

o	A CPU looping in an RCU read-side critical section.
	
o	A CPU looping with interrupts disabled.  This condition can
	result in RCU-sched and RCU-bh stalls.

o	A CPU looping with preemption disabled.  This condition can
	result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
	stalls.

o	A CPU looping with bottom halves disabled.  This condition can
	result in RCU-sched and RCU-bh stalls.

o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
	without invoking schedule().

o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
	happen to preempt a low-priority task in the middle of an RCU
	read-side critical section.   This is especially damaging if
	that low-priority task is not permitted to run on any other CPU,
	in which case the next RCU grace period can never complete, which
	will eventually cause the system to run out of memory and hang.
	While the system is in the process of running itself out of
	memory, you might see stall-warning messages.

o	A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
	is running at a higher priority than the RCU softirq threads.
	This will prevent RCU callbacks from ever being invoked,
	and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
	RCU grace periods from ever completing.  Either way, the
	system will eventually run out of memory and hang.  In the
	CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
	messages.

o	A hardware or software issue shuts off the scheduler-clock
	interrupt on a CPU that is not in dyntick-idle mode.  This
	problem really has happened, and seems to be most likely to
	result in RCU CPU stall warnings for CONFIG_NO_HZ=n kernels.

o	A bug in the RCU implementation.

o	A hardware failure.  This is quite unlikely, but has occurred
	at least once in real life.  A CPU failed in a running system,
	becoming unresponsive, but not causing an immediate crash.
	This resulted in a series of RCU CPU stall warnings, eventually
	leading the realization that the CPU had failed.

							Thanx, Paul
Steven Rostedt Sept. 6, 2012, 9:41 p.m. UTC | #6
On Thu, 2012-09-06 at 14:03 -0700, Paul E. McKenney wrote:

> Here are a few other ways that stalls can happen:
> 
> o	A CPU looping in an RCU read-side critical section.

For a minute? That's a bug.

> 	
> o	A CPU looping with interrupts disabled.  This condition can
> 	result in RCU-sched and RCU-bh stalls.

Also a bug.

> 
> o	A CPU looping with preemption disabled.  This condition can
> 	result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
> 	stalls.

Bug as well.

> 
> o	A CPU looping with bottom halves disabled.  This condition can
> 	result in RCU-sched and RCU-bh stalls.

Bug too.

> 
> o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
> 	without invoking schedule().

Another bug.

> 
> o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
> 	happen to preempt a low-priority task in the middle of an RCU
> 	read-side critical section.   This is especially damaging if
> 	that low-priority task is not permitted to run on any other CPU,
> 	in which case the next RCU grace period can never complete, which
> 	will eventually cause the system to run out of memory and hang.
> 	While the system is in the process of running itself out of
> 	memory, you might see stall-warning messages.

Buggy system.

> 
> o	A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
> 	is running at a higher priority than the RCU softirq threads.
> 	This will prevent RCU callbacks from ever being invoked,
> 	and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
> 	RCU grace periods from ever completing.  Either way, the
> 	system will eventually run out of memory and hang.  In the
> 	CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
> 	messages.

Not really a bug, but the developers need a spanking.

> 
> o	A hardware or software issue shuts off the scheduler-clock
> 	interrupt on a CPU that is not in dyntick-idle mode.  This
> 	problem really has happened, and seems to be most likely to
> 	result in RCU CPU stall warnings for CONFIG_NO_HZ=n kernels.

Driving the bug.

> 
> o	A bug in the RCU implementation.

Bug in the name.

> 
> o	A hardware failure.  This is quite unlikely, but has occurred
> 	at least once in real life.  A CPU failed in a running system,
> 	becoming unresponsive, but not causing an immediate crash.
> 	This resulted in a series of RCU CPU stall warnings, eventually
> 	leading the realization that the CPU had failed.

Hardware bug.

So, where's the "spurious RCU CPU stall warnings"?

All these cases deserve a warning.

-- Steve
Paul E. McKenney Sept. 6, 2012, 9:58 p.m. UTC | #7
On Thu, Sep 06, 2012 at 05:41:01PM -0400, Steven Rostedt wrote:
> On Thu, 2012-09-06 at 14:03 -0700, Paul E. McKenney wrote:
> 
> > Here are a few other ways that stalls can happen:
> > 
> > o	A CPU looping in an RCU read-side critical section.
> 
> For a minute? That's a bug.
> 
> > 	
> > o	A CPU looping with interrupts disabled.  This condition can
> > 	result in RCU-sched and RCU-bh stalls.
> 
> Also a bug.
> 
> > 
> > o	A CPU looping with preemption disabled.  This condition can
> > 	result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
> > 	stalls.
> 
> Bug as well.
> 
> > 
> > o	A CPU looping with bottom halves disabled.  This condition can
> > 	result in RCU-sched and RCU-bh stalls.
> 
> Bug too.
> 
> > 
> > o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
> > 	without invoking schedule().
> 
> Another bug.
> 
> > 
> > o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
> > 	happen to preempt a low-priority task in the middle of an RCU
> > 	read-side critical section.   This is especially damaging if
> > 	that low-priority task is not permitted to run on any other CPU,
> > 	in which case the next RCU grace period can never complete, which
> > 	will eventually cause the system to run out of memory and hang.
> > 	While the system is in the process of running itself out of
> > 	memory, you might see stall-warning messages.
> 
> Buggy system.
> 
> > 
> > o	A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
> > 	is running at a higher priority than the RCU softirq threads.
> > 	This will prevent RCU callbacks from ever being invoked,
> > 	and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
> > 	RCU grace periods from ever completing.  Either way, the
> > 	system will eventually run out of memory and hang.  In the
> > 	CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
> > 	messages.
> 
> Not really a bug, but the developers need a spanking.

And RCU does what it can, which is limited to a splat on the console.

> > o	A hardware or software issue shuts off the scheduler-clock
> > 	interrupt on a CPU that is not in dyntick-idle mode.  This
> > 	problem really has happened, and seems to be most likely to
> > 	result in RCU CPU stall warnings for CONFIG_NO_HZ=n kernels.
> 
> Driving the bug.
> 
> > 
> > o	A bug in the RCU implementation.
> 
> Bug in the name.
> 
> > 
> > o	A hardware failure.  This is quite unlikely, but has occurred
> > 	at least once in real life.  A CPU failed in a running system,
> > 	becoming unresponsive, but not causing an immediate crash.
> > 	This resulted in a series of RCU CPU stall warnings, eventually
> > 	leading the realization that the CPU had failed.
> 
> Hardware bug.
> 
> So, where's the "spurious RCU CPU stall warnings"?

I figured that would count as a bug in the RCU implementation.  ;-)

> All these cases deserve a warning.

Agreed, and that is the whole purpose of the stall warnings.

							Thanx, Paul
Steven Rostedt Sept. 6, 2012, 10:05 p.m. UTC | #8
On Thu, 2012-09-06 at 14:58 -0700, Paul E. McKenney wrote:

> > All these cases deserve a warning.
> 
> Agreed, and that is the whole purpose of the stall warnings.

Then let me ask the question again. According to the change log:

> If a given CPU avoids the idle loop but also avoids starting a new
> RCU grace period for a full minute, RCU can issue spurious RCU CPU
> stall warnings.  This commit fixes this issue by adding a check for
> ongoing grace period to avoid these spurious stall warnings.


I'm still confused by what is "this issue"? And why is it being fixed.
It sounds to me that the "issue" was a CPU avoiding starting a new RCU
grace period for a full minute. Which to me sounds like a bug in which
we *want* a warning. Why is this patch needed?

-- Steve
Paul E. McKenney Sept. 6, 2012, 10:22 p.m. UTC | #9
On Thu, Sep 06, 2012 at 06:05:53PM -0400, Steven Rostedt wrote:
> On Thu, 2012-09-06 at 14:58 -0700, Paul E. McKenney wrote:
> 
> > > All these cases deserve a warning.
> > 
> > Agreed, and that is the whole purpose of the stall warnings.
> 
> Then let me ask the question again. According to the change log:
> 
> > If a given CPU avoids the idle loop but also avoids starting a new
> > RCU grace period for a full minute, RCU can issue spurious RCU CPU
> > stall warnings.  This commit fixes this issue by adding a check for
> > ongoing grace period to avoid these spurious stall warnings.
> 
> I'm still confused by what is "this issue"? And why is it being fixed.
> It sounds to me that the "issue" was a CPU avoiding starting a new RCU
> grace period for a full minute. Which to me sounds like a bug in which
> we *want* a warning. Why is this patch needed?

Ah!

It is perfectly legal to avoid -starting- an RCU grace period for a
minute, or even longer.  If RCU has nothing to do, in other words, if no
one registers any RCU callbacks, then RCU need not start a grace period.

Of course, this would mean that it would eventually be a full minute
since the last start of a grace period.  This is not a problem, after
all, Linux went through a full ten years before experiencing its first
grace period.

But the stall-warning code just checked how long it had been since
the last start of a grace period, failing to note that this grace
period had long since completed.  So it splatted out a warning.
This warning was spurious in the sense that there was no bug aside
from the missing check that the grace period was still in progress.

And this commit fixes that bug in RCU.

							Thanx, Paul
Peter Zijlstra Sept. 7, 2012, 7 a.m. UTC | #10
On Thu, 2012-09-06 at 15:22 -0700, Paul E. McKenney wrote:
> Ah!
> 
> It is perfectly legal to avoid -starting- an RCU grace period for a
> minute, or even longer.  If RCU has nothing to do, in other words, if no
> one registers any RCU callbacks, then RCU need not start a grace period.
> 
> Of course, this would mean that it would eventually be a full minute
> since the last start of a grace period.  This is not a problem, after
> all, Linux went through a full ten years before experiencing its first
> grace period.
> 
> But the stall-warning code just checked how long it had been since
> the last start of a grace period, failing to note that this grace
> period had long since completed.  So it splatted out a warning.
> This warning was spurious in the sense that there was no bug aside
> from the missing check that the grace period was still in progress.
> 
> And this commit fixes that bug in RCU. 

OK, that makes sense.. it just looks like both Steve and me got confused
by the initial changelog.
Steven Rostedt Sept. 7, 2012, 2:42 p.m. UTC | #11
On Fri, 2012-09-07 at 09:00 +0200, Peter Zijlstra wrote:
> On Thu, 2012-09-06 at 15:22 -0700, Paul E. McKenney wrote:
> > Ah!
> > 
> > It is perfectly legal to avoid -starting- an RCU grace period for a
> > minute, or even longer.  If RCU has nothing to do, in other words, if no
> > one registers any RCU callbacks, then RCU need not start a grace period.
> > 
> > Of course, this would mean that it would eventually be a full minute
> > since the last start of a grace period.  This is not a problem, after
> > all, Linux went through a full ten years before experiencing its first
> > grace period.
> > 
> > But the stall-warning code just checked how long it had been since
> > the last start of a grace period, failing to note that this grace
> > period had long since completed.  So it splatted out a warning.
> > This warning was spurious in the sense that there was no bug aside
> > from the missing check that the grace period was still in progress.
> > 
> > And this commit fixes that bug in RCU. 
> 
> OK, that makes sense.. it just looks like both Steve and me got confused
> by the initial changelog. 

Right, I think the change log needs to be fixed ;-)

-- Steve
diff mbox

Patch

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index fbe43b0..e58097b 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -820,7 +820,8 @@  static void check_cpu_stall(struct rcu_state *rsp, struct rcu_data *rdp)
 	j = ACCESS_ONCE(jiffies);
 	js = ACCESS_ONCE(rsp->jiffies_stall);
 	rnp = rdp->mynode;
-	if ((ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
+	if (rcu_gp_in_progress(rsp) &&
+	    (ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
 
 		/* We haven't checked in, so go dump stack. */
 		print_cpu_stall(rsp);