[tip/core/rcu,09/15] rcu: Increasing rcu_barrier() concurrency

Message ID 1339794370-28119-9-git-send-email-paulmck@linux.vnet.ibm.com
State New

Commit Message

Paul E. McKenney June 15, 2012, 9:06 p.m.
From: "Paul E. McKenney" <paul.mckenney@linaro.org>

The traditional rcu_barrier() implementation has serialized all requests,
regardless of RCU flavor, and also does not coalesce concurrent requests.
In the past, this has been good and sufficient.

However, systems are getting larger and use of rcu_barrier() has been
increasing.  This commit therefore introduces a counter-based scheme
that allows _rcu_barrier() calls for the same flavor of RCU to take
advantage of each other's work.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcutree.c |   27 ++++++++++++++++++++++++++-
 kernel/rcutree.h |    2 ++
 2 files changed, 28 insertions(+), 1 deletions(-)

Comments

Josh Triplett June 15, 2012, 11:31 p.m. | #1
On Fri, Jun 15, 2012 at 02:06:04PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> 
> The traditional rcu_barrier() implementation has serialized all requests,
> regardless of RCU flavor, and also does not coalesce concurrent requests.
> In the past, this has been good and sufficient.
> 
> However, systems are getting larger and use of rcu_barrier() has been
> increasing.  This commit therefore introduces a counter-based scheme
> that allows _rcu_barrier() calls for the same flavor of RCU to take
> advantage of each other's work.
> 
> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> ---
>  kernel/rcutree.c |   27 ++++++++++++++++++++++++++-
>  kernel/rcutree.h |    2 ++
>  2 files changed, 28 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 93358d4..7c299d3 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -2291,13 +2291,32 @@ static void _rcu_barrier(struct rcu_state *rsp)
>  	unsigned long flags;
>  	struct rcu_data *rdp;
>  	struct rcu_data rd;
> +	unsigned long snap = ACCESS_ONCE(rsp->n_barrier_done);
> +	unsigned long snap_done;
>  
>  	init_rcu_head_on_stack(&rd.barrier_head);
>  
>  	/* Take mutex to serialize concurrent rcu_barrier() requests. */
>  	mutex_lock(&rsp->barrier_mutex);
>  
> -	smp_mb();  /* Prevent any prior operations from leaking in. */
> +	/*
> +	 * Ensure that all prior references, including to ->n_barrier_done,
> +	 * are ordered before the _rcu_barrier() machinery.
> +	 */
> +	smp_mb();  /* See above block comment. */

If checkpatch complains about the lack of a comment to the right of a
barrier even when the barrier has a comment directly above it, that
seems like a bug in checkpatch that needs fixing, to prevent developers
from having to add noise like "See above block comment.". :)

Also: what type of barriers do mutex_lock and mutex_unlock imply?  I
assume they imply some weaker barrier than smp_mb, but I'd still assume
they imply *some* barrier.

> +	/* Recheck ->n_barrier_done to see if others did our work for us. */
> +	snap_done = ACCESS_ONCE(rsp->n_barrier_done);
> +	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {

This calculation seems sufficiently clever that it merits an explanatory
comment.

> +		smp_mb();
> +		mutex_unlock(&rsp->barrier_mutex);
> +		return;
> +	}
> +
> +	/* Increment ->n_barrier_done to avoid duplicate work. */
> +	ACCESS_ONCE(rsp->n_barrier_done)++;

Interesting dissonance here: the use of ACCESS_ONCE with ++ implies
exactly two accesses, rather than exactly one.  What makes it safe to
not use atomic_inc here, but not safe to drop the ACCESS_ONCE?
Potential use of a cached value read earlier in the function?

- Josh Triplett
Steven Rostedt June 16, 2012, 12:21 a.m. | #2
On Fri, 2012-06-15 at 16:31 -0700, Josh Triplett wrote:
>   
> > -	smp_mb();  /* Prevent any prior operations from leaking in. */
> > +	/*
> > +	 * Ensure that all prior references, including to ->n_barrier_done,
> > +	 * are ordered before the _rcu_barrier() machinery.
> > +	 */
> > +	smp_mb();  /* See above block comment. */
> 
> If checkpatch complains about the lack of a comment to the right of a
> barrier even when the barrier has a comment directly above it, that
> seems like a bug in checkpatch that needs fixing, to prevent developers
> from having to add noise like "See above block comment.". :)


Yuck yuck yuck yuck!!!


Really, checkpatch is not the golden rule. I've copied an old checkpatch
from something like 2.6.38 or so and use that today, where it was still
reasonable. I've long abandoned the latest checkpatch, as it causes too
many false positives. Or nazi-like dictation.

My rule of thumb is this: if what checkpatch tells you to do makes the
format uglier, or makes it look stupid, it's a good idea to ignore the
checkpatch complaint.

I think in this case, you hit the latter.

-- Steve
Paul E. McKenney June 16, 2012, 12:48 a.m. | #3
On Fri, Jun 15, 2012 at 04:31:51PM -0700, Josh Triplett wrote:
> On Fri, Jun 15, 2012 at 02:06:04PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paul.mckenney@linaro.org>
> > 
> > The traditional rcu_barrier() implementation has serialized all requests,
> > regardless of RCU flavor, and also does not coalesce concurrent requests.
> > In the past, this has been good and sufficient.
> > 
> > However, systems are getting larger and use of rcu_barrier() has been
> > increasing.  This commit therefore introduces a counter-based scheme
> > that allows _rcu_barrier() calls for the same flavor of RCU to take
> > advantage of each other's work.
> > 
> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > ---
> >  kernel/rcutree.c |   27 ++++++++++++++++++++++++++-
> >  kernel/rcutree.h |    2 ++
> >  2 files changed, 28 insertions(+), 1 deletions(-)
> > 
> > diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> > index 93358d4..7c299d3 100644
> > --- a/kernel/rcutree.c
> > +++ b/kernel/rcutree.c
> > @@ -2291,13 +2291,32 @@ static void _rcu_barrier(struct rcu_state *rsp)
> >  	unsigned long flags;
> >  	struct rcu_data *rdp;
> >  	struct rcu_data rd;
> > +	unsigned long snap = ACCESS_ONCE(rsp->n_barrier_done);
> > +	unsigned long snap_done;
> >  
> >  	init_rcu_head_on_stack(&rd.barrier_head);
> >  
> >  	/* Take mutex to serialize concurrent rcu_barrier() requests. */
> >  	mutex_lock(&rsp->barrier_mutex);
> >  
> > -	smp_mb();  /* Prevent any prior operations from leaking in. */
> > +	/*
> > +	 * Ensure that all prior references, including to ->n_barrier_done,
> > +	 * are ordered before the _rcu_barrier() machinery.
> > +	 */
> > +	smp_mb();  /* See above block comment. */
> 
> If checkpatch complains about the lack of a comment to the right of a
> barrier even when the barrier has a comment directly above it, that
> seems like a bug in checkpatch that needs fixing, to prevent developers
> from having to add noise like "See above block comment.". :)

;-)

> Also: what type of barriers do mutex_lock and mutex_unlock imply?  I
> assume they imply some weaker barrier than smp_mb, but I'd still assume
> they imply *some* barrier.

mutex_lock() prevents code from leaving the critical section, but is
not guaranteed to prevent code from entering the critical section.

> > +	/* Recheck ->n_barrier_done to see if others did our work for us. */
> > +	snap_done = ACCESS_ONCE(rsp->n_barrier_done);
> > +	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {
> 
> This calculation seems sufficiently clever that it merits an explanatory
> comment.

I will see what I can come up with.
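In the meantime, the intent of the comparison can be sketched as a
standalone helper (illustrative name, not kernel code; plain `>=` stands
in for the kernel's wrap-safe ULONG_CMP_GE()):

```c
/*
 * Illustrative helper (not in the patch): decide whether some other
 * caller's complete rcu_barrier() has both started and finished since
 * we snapshotted ->n_barrier_done.  The counter is incremented once
 * when a barrier starts (becoming odd) and once when it completes
 * (becoming even), so one full barrier accounts for two increments.
 */
int barrier_done_since(unsigned long snap, unsigned long snap_done)
{
	/*
	 * (snap + 1) & ~0x1 rounds snap up to the next even value:
	 * the count at which any barrier already in flight at snapshot
	 * time will complete.  That in-flight barrier may have started
	 * before our prior references, so demand one further complete
	 * barrier on top of it (+2).
	 */
	return snap_done >= ((snap + 1) & ~0x1UL) + 2;
}
```

With snap even (no barrier in flight at snapshot time), the threshold is
snap + 2, i.e. one complete barrier after the snapshot; with snap odd,
the completion of the already-in-flight barrier alone is not enough.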

> > +		smp_mb();
> > +		mutex_unlock(&rsp->barrier_mutex);
> > +		return;
> > +	}
> > +
> > +	/* Increment ->n_barrier_done to avoid duplicate work. */
> > +	ACCESS_ONCE(rsp->n_barrier_done)++;
> 
> Interesting dissonance here: the use of ACCESS_ONCE with ++ implies
> exactly two accesses, rather than exactly one.  What makes it safe to
> not use atomic_inc here, but not safe to drop the ACCESS_ONCE?
> Potential use of a cached value read earlier in the function?

Or, worse yet, the compiler speculating the increment and then backing
it out if the early-exit path is taken.

							Thanx, Paul
Paul E. McKenney June 16, 2012, 12:49 a.m. | #4
On Fri, Jun 15, 2012 at 08:21:20PM -0400, Steven Rostedt wrote:
> On Fri, 2012-06-15 at 16:31 -0700, Josh Triplett wrote:
> >   
> > > -	smp_mb();  /* Prevent any prior operations from leaking in. */
> > > +	/*
> > > +	 * Ensure that all prior references, including to ->n_barrier_done,
> > > +	 * are ordered before the _rcu_barrier() machinery.
> > > +	 */
> > > +	smp_mb();  /* See above block comment. */
> > 
> > If checkpatch complains about the lack of a comment to the right of a
> > barrier even when the barrier has a comment directly above it, that
> > seems like a bug in checkpatch that needs fixing, to prevent developers
> > from having to add noise like "See above block comment.". :)
> 
> 
> Yuck yuck yuck yuck!!!
> 
> 
> Really, checkpatch is not the golden rule. I've copied an old checkpatch
> from something like 2.6.38 or so and use that today, where it was still
> reasonable. I've long abandoned the latest checkpatch, as it causes too
> many false positives. Or nazi-like dictation.
> 
> My rule of thumb is this: if what checkpatch tells you to do makes the
> format uglier, or makes it look stupid, it's a good idea to ignore the
> checkpatch complaint.
> 
> I think in this case, you hit the latter.

Heh.  I have been doing this "/* See above block comment. */" thing for
quite some time.  ;-)

							Thanx, Paul

Patch

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 93358d4..7c299d3 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -2291,13 +2291,32 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	unsigned long flags;
 	struct rcu_data *rdp;
 	struct rcu_data rd;
+	unsigned long snap = ACCESS_ONCE(rsp->n_barrier_done);
+	unsigned long snap_done;
 
 	init_rcu_head_on_stack(&rd.barrier_head);
 
 	/* Take mutex to serialize concurrent rcu_barrier() requests. */
 	mutex_lock(&rsp->barrier_mutex);
 
-	smp_mb();  /* Prevent any prior operations from leaking in. */
+	/*
+	 * Ensure that all prior references, including to ->n_barrier_done,
+	 * are ordered before the _rcu_barrier() machinery.
+	 */
+	smp_mb();  /* See above block comment. */
+
+	/* Recheck ->n_barrier_done to see if others did our work for us. */
+	snap_done = ACCESS_ONCE(rsp->n_barrier_done);
+	if (ULONG_CMP_GE(snap_done, ((snap + 1) & ~0x1) + 2)) {
+		smp_mb();
+		mutex_unlock(&rsp->barrier_mutex);
+		return;
+	}
+
+	/* Increment ->n_barrier_done to avoid duplicate work. */
+	ACCESS_ONCE(rsp->n_barrier_done)++;
+	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 1);
+	smp_mb(); /* Order ->n_barrier_done increment with below mechanism. */
 
 	/*
 	 * Initialize the count to one rather than to zero in order to
@@ -2368,6 +2387,12 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	if (atomic_dec_and_test(&rsp->barrier_cpu_count))
 		complete(&rsp->barrier_completion);
 
+	/* Increment ->n_barrier_done to prevent duplicate work. */
+	smp_mb(); /* Keep increment after above mechanism. */
+	ACCESS_ONCE(rsp->n_barrier_done)++;
+	WARN_ON_ONCE((rsp->n_barrier_done & 0x1) != 0);
+	smp_mb(); /* Keep increment before caller's subsequent code. */
+
 	/* Wait for all rcu_barrier_callback() callbacks to be invoked. */
 	wait_for_completion(&rsp->barrier_completion);
 
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index d9ac82f..a294f7f 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -389,6 +389,8 @@ struct rcu_state {
 	struct mutex barrier_mutex;		/* Guards barrier fields. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	struct completion barrier_completion;	/* Wake at barrier end. */
+	unsigned long n_barrier_done;		/* ++ at start and end of */
+						/*  _rcu_barrier(). */
 	raw_spinlock_t fqslock;			/* Only one task forcing */
 						/*  quiescent states. */
 	unsigned long jiffies_force_qs;		/* Time at which to invoke */