Message ID | 1322937282-19846-7-git-send-email-paulmck@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
On Sat, Dec 03, 2011 at 10:34:42AM -0800, Paul E. McKenney wrote: > From: Yong Zhang <yong.zhang0@gmail.com> > > RCU-lockdep will issue warnings given the following use pattern: > > rcu_read_lock(); > local_irq_disable(); > rcu_read_unlock(); > local_irq_enable(); > > However, this use pattern is legal except for the scheduler's runqueue > and priority-inheritance locks (and any other locks that the scheduler > might use during priority-inheritance operations). > > Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > --- > kernel/rcutree_plugin.h | 8 ++++++-- > 1 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h > index 8cd9efe..2020e8a 100644 > --- a/kernel/rcutree_plugin.h > +++ b/kernel/rcutree_plugin.h > @@ -401,8 +401,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) > > #ifdef CONFIG_RCU_BOOST > /* Unboost if we were boosted. */ > - if (rbmp) > + if (rbmp) { > + local_irq_save(flags); > rt_mutex_unlock(rbmp); > + local_irq_restore(flags); > + } > #endif /* #ifdef CONFIG_RCU_BOOST */ > > /* > @@ -1233,9 +1236,10 @@ static int rcu_boost(struct rcu_node *rnp) > lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class, > "rcu_boost_mutex"); > t->rcu_boost_mutex = &mtx; > - raw_spin_unlock_irqrestore(&rnp->lock, flags); > + raw_spin_unlock(&rnp->lock); /* rrupts remain disabled. */ > rt_mutex_lock(&mtx); /* Side effect: boosts task t's priority. */ > rt_mutex_unlock(&mtx); /* Keep lockdep happy. */ We permit rt_mutex_unlock() to be call with irq disabled, but rt_mutex_lock() is still not allowed. So this usage is not legal now. Sounds we should hold this patch on until a more suitable way is found. Thanks, Yong > + local_irq_restore(flags); > > return rnp->exp_tasks != NULL || rnp->boost_tasks != NULL; > } > -- > 1.7.3.2 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/
On Sat, 2011-12-03 at 10:34 -0800, Paul E. McKenney wrote: > From: Yong Zhang <yong.zhang0@gmail.com> > > RCU-lockdep will issue warnings given the following use pattern: > > rcu_read_lock(); > local_irq_disable(); > rcu_read_unlock(); > local_irq_enable(); > > However, this use pattern is legal except for the scheduler's runqueue > and priority-inheritance locks (and any other locks that the scheduler > might use during priority-inheritance operations). So what does this patch do? Make it not complain when you do the above? How often does this pattern actually happen? Can't be that often otherwise we'd have had more complaints, no?
On Mon, Dec 05, 2011 at 10:41:36AM +0100, Peter Zijlstra wrote: > On Sat, 2011-12-03 at 10:34 -0800, Paul E. McKenney wrote: > > From: Yong Zhang <yong.zhang0@gmail.com> > > > > RCU-lockdep will issue warnings given the following use pattern: > > > > rcu_read_lock(); > > local_irq_disable(); > > rcu_read_unlock(); > > local_irq_enable(); > > > > However, this use pattern is legal except for the scheduler's runqueue > > and priority-inheritance locks (and any other locks that the scheduler > > might use during priority-inheritance operations). > > So what does this patch do? Make it not complain when you do the above? It suppose to not complain but it bring other complain :( > How often does this pattern actually happen? IIRC, we have just one which is cured by commit [a841796: signal: align __lock_task_sighand() irq disabling and RCU] > Can't be that often > otherwise we'd have had more complaints, no? Yeah, So that also means we don't dedicated lock_class_key for mtx.wait_lock. Thanks, Yong
On Mon, Dec 05, 2011 at 05:19:24PM +0800, Yong Zhang wrote: > On Sat, Dec 03, 2011 at 10:34:42AM -0800, Paul E. McKenney wrote: > > From: Yong Zhang <yong.zhang0@gmail.com> > > > > RCU-lockdep will issue warnings given the following use pattern: > > > > rcu_read_lock(); > > local_irq_disable(); > > rcu_read_unlock(); > > local_irq_enable(); > > > > However, this use pattern is legal except for the scheduler's runqueue > > and priority-inheritance locks (and any other locks that the scheduler > > might use during priority-inheritance operations). > > > > Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > --- > > kernel/rcutree_plugin.h | 8 ++++++-- > > 1 files changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h > > index 8cd9efe..2020e8a 100644 > > --- a/kernel/rcutree_plugin.h > > +++ b/kernel/rcutree_plugin.h > > @@ -401,8 +401,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) > > > > #ifdef CONFIG_RCU_BOOST > > /* Unboost if we were boosted. */ > > - if (rbmp) > > + if (rbmp) { > > + local_irq_save(flags); > > rt_mutex_unlock(rbmp); > > + local_irq_restore(flags); > > + } > > #endif /* #ifdef CONFIG_RCU_BOOST */ > > > > /* > > @@ -1233,9 +1236,10 @@ static int rcu_boost(struct rcu_node *rnp) > > lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class, > > "rcu_boost_mutex"); > > t->rcu_boost_mutex = &mtx; > > - raw_spin_unlock_irqrestore(&rnp->lock, flags); > > + raw_spin_unlock(&rnp->lock); /* rrupts remain disabled. */ > > rt_mutex_lock(&mtx); /* Side effect: boosts task t's priority. */ > > rt_mutex_unlock(&mtx); /* Keep lockdep happy. */ > > We permit rt_mutex_unlock() to be call with irq disabled, > but rt_mutex_lock() is still not allowed. So this usage > is not legal now. Even after commit #5342e269b has been applied? The purpose of that commit was to allow rt_mutex_lock() to be called with irqs disabled. So, what am I missing? Thanx, Paul > Sounds we should hold this patch on until a more suitable > way is found. > > Thanks, > Yong > > > + local_irq_restore(flags); > > > > return rnp->exp_tasks != NULL || rnp->boost_tasks != NULL; > > } > > -- > > 1.7.3.2 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > > -- > Only stand for myself >
On Mon, Dec 05, 2011 at 06:03:46PM +0800, Yong Zhang wrote: > On Mon, Dec 05, 2011 at 10:41:36AM +0100, Peter Zijlstra wrote: > > On Sat, 2011-12-03 at 10:34 -0800, Paul E. McKenney wrote: > > > From: Yong Zhang <yong.zhang0@gmail.com> > > > > > > RCU-lockdep will issue warnings given the following use pattern: > > > > > > rcu_read_lock(); > > > local_irq_disable(); > > > rcu_read_unlock(); > > > local_irq_enable(); > > > > > > However, this use pattern is legal except for the scheduler's runqueue > > > and priority-inheritance locks (and any other locks that the scheduler > > > might use during priority-inheritance operations). > > > > So what does this patch do? Make it not complain when you do the above? > > It suppose to not complain but it bring other complain :( Again, even with commit #5342e269b applied? > > How often does this pattern actually happen? > > IIRC, we have just one which is cured by commit [a841796: signal: align > __lock_task_sighand() irq disabling and RCU] > > > Can't be that often > > otherwise we'd have had more complaints, no? Maybe, maybe not. To see the complaint, you have to have RCU_BOOST=y. This is used heavily in -rt, but I bet that there are config options that don't see much use in -rt. With this one, prevention is better than after-the-fact cure. Thanx, Paul > Yeah, > > So that also means we don't dedicated lock_class_key for mtx.wait_lock. > > Thanks, > Yong >
On Mon, Dec 05, 2011 at 08:45:05AM -0800, Paul E. McKenney wrote: > On Mon, Dec 05, 2011 at 05:19:24PM +0800, Yong Zhang wrote: > > On Sat, Dec 03, 2011 at 10:34:42AM -0800, Paul E. McKenney wrote: > > > From: Yong Zhang <yong.zhang0@gmail.com> > > > > > > RCU-lockdep will issue warnings given the following use pattern: > > > > > > rcu_read_lock(); > > > local_irq_disable(); > > > rcu_read_unlock(); > > > local_irq_enable(); > > > > > > However, this use pattern is legal except for the scheduler's runqueue > > > and priority-inheritance locks (and any other locks that the scheduler > > > might use during priority-inheritance operations). > > > > > > Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > --- > > > kernel/rcutree_plugin.h | 8 ++++++-- > > > 1 files changed, 6 insertions(+), 2 deletions(-) > > > > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h > > > index 8cd9efe..2020e8a 100644 > > > --- a/kernel/rcutree_plugin.h > > > +++ b/kernel/rcutree_plugin.h > > > @@ -401,8 +401,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) > > > > > > #ifdef CONFIG_RCU_BOOST > > > /* Unboost if we were boosted. */ > > > - if (rbmp) > > > + if (rbmp) { > > > + local_irq_save(flags); > > > rt_mutex_unlock(rbmp); > > > + local_irq_restore(flags); > > > + } > > > #endif /* #ifdef CONFIG_RCU_BOOST */ > > > > > > /* > > > @@ -1233,9 +1236,10 @@ static int rcu_boost(struct rcu_node *rnp) > > > lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class, > > > "rcu_boost_mutex"); > > > t->rcu_boost_mutex = &mtx; > > > - raw_spin_unlock_irqrestore(&rnp->lock, flags); > > > + raw_spin_unlock(&rnp->lock); /* rrupts remain disabled. */ > > > rt_mutex_lock(&mtx); /* Side effect: boosts task t's priority. */ > > > rt_mutex_unlock(&mtx); /* Keep lockdep happy. */ > > > > We permit rt_mutex_unlock() to be call with irq disabled, > > but rt_mutex_lock() is still not allowed. So this usage > > is not legal now. > > Even after commit #5342e269b has been applied? Yeah, because we call might_sleep() in rt_mutex_lock() unconditionally. But in this case the 'BUG: sleeping function called from invalid context at *' is obviously false positive. Maybe we could teach might_sleep() about this special case? Thanks, Yong
On Tue, Dec 06, 2011 at 09:26:35AM +0800, Yong Zhang wrote: > On Mon, Dec 05, 2011 at 08:45:05AM -0800, Paul E. McKenney wrote: > > On Mon, Dec 05, 2011 at 05:19:24PM +0800, Yong Zhang wrote: > > > On Sat, Dec 03, 2011 at 10:34:42AM -0800, Paul E. McKenney wrote: > > > > From: Yong Zhang <yong.zhang0@gmail.com> > > > > > > > > RCU-lockdep will issue warnings given the following use pattern: > > > > > > > > rcu_read_lock(); > > > > local_irq_disable(); > > > > rcu_read_unlock(); > > > > local_irq_enable(); > > > > > > > > However, this use pattern is legal except for the scheduler's runqueue > > > > and priority-inheritance locks (and any other locks that the scheduler > > > > might use during priority-inheritance operations). > > > > > > > > Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> > > > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > > --- > > > > kernel/rcutree_plugin.h | 8 ++++++-- > > > > 1 files changed, 6 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h > > > > index 8cd9efe..2020e8a 100644 > > > > --- a/kernel/rcutree_plugin.h > > > > +++ b/kernel/rcutree_plugin.h > > > > @@ -401,8 +401,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) > > > > > > > > #ifdef CONFIG_RCU_BOOST > > > > /* Unboost if we were boosted. */ > > > > - if (rbmp) > > > > + if (rbmp) { > > > > + local_irq_save(flags); > > > > rt_mutex_unlock(rbmp); > > > > + local_irq_restore(flags); > > > > + } > > > > #endif /* #ifdef CONFIG_RCU_BOOST */ > > > > > > > > /* > > > > @@ -1233,9 +1236,10 @@ static int rcu_boost(struct rcu_node *rnp) > > > > lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class, > > > > "rcu_boost_mutex"); > > > > t->rcu_boost_mutex = &mtx; > > > > - raw_spin_unlock_irqrestore(&rnp->lock, flags); > > > > + raw_spin_unlock(&rnp->lock); /* rrupts remain disabled. */ > > > > rt_mutex_lock(&mtx); /* Side effect: boosts task t's priority. */ > > > > rt_mutex_unlock(&mtx); /* Keep lockdep happy. */ > > > > > > We permit rt_mutex_unlock() to be call with irq disabled, > > > but rt_mutex_lock() is still not allowed. So this usage > > > is not legal now. > > > > Even after commit #5342e269b has been applied? > > Yeah, because we call might_sleep() in rt_mutex_lock() unconditionally. > But in this case the 'BUG: sleeping function called from invalid context > at *' is obviously false positive. > > Maybe we could teach might_sleep() about this special case? That sounds very good to me! Thanx, Paul
On Tue, 2011-12-06 at 09:26 +0800, Yong Zhang wrote: > Yeah, because we call might_sleep() in rt_mutex_lock() unconditionally. > But in this case the 'BUG: sleeping function called from invalid context > at *' is obviously false positive. Why can't this mutex acquisition not block? > Maybe we could teach might_sleep() about this special case? Sounds horrid.
On Tue, Dec 06, 2011 at 10:52:32AM +0100, Peter Zijlstra wrote: > On Tue, 2011-12-06 at 09:26 +0800, Yong Zhang wrote: > > > Yeah, because we call might_sleep() in rt_mutex_lock() unconditionally. > > But in this case the 'BUG: sleeping function called from invalid context > > at *' is obviously false positive. > > Why can't this mutex acquisition not block? It could block. The issue it's legal to call rt_mutex_lock() with irqs disabled and we don't want might_sleep() bite us on this special case. When we are going to sleep, we re-enable irq in __rt_mutex_slowlock(). > > > Maybe we could teach might_sleep() about this special case? > > Sounds horrid. Maybe, any alternative? Thanks, Yong
On Tue, 2011-12-06 at 10:52 +0100, Peter Zijlstra wrote: > On Tue, 2011-12-06 at 09:26 +0800, Yong Zhang wrote: > > > Yeah, because we call might_sleep() in rt_mutex_lock() unconditionally. > > But in this case the 'BUG: sleeping function called from invalid context > > at *' is obviously false positive. > > Why can't this mutex acquisition not block? Gaah!! I see, this 5342e269 patch is revolting.. guys that's really vile don't do that! I tried reading the RCU code but I gave up.. rcu_boost() does: rt_mutex_init_proxy_locked(); raw_spin_unlock_irqrestore(); rt_mutex_lock(); rt_mutex_unlock(); vs rcu_read_unlock_special()'s RCU_READ_UNLOCK_BLOCKED branch: rt_mutex_unlock(); The latter looks to be unbalanced because I can't actually find a matching lock. Also, all of that is ran with IRQs enabled. So what's the problem?
On Tue, 2011-12-06 at 18:05 +0800, Yong Zhang wrote: > On Tue, Dec 06, 2011 at 10:52:32AM +0100, Peter Zijlstra wrote: > > On Tue, 2011-12-06 at 09:26 +0800, Yong Zhang wrote: > > > > > Yeah, because we call might_sleep() in rt_mutex_lock() unconditionally. > > > But in this case the 'BUG: sleeping function called from invalid context > > > at *' is obviously false positive. > > > > Why can't this mutex acquisition not block? > > It could block. The issue it's legal to call rt_mutex_lock() with > irqs disabled and we don't want might_sleep() bite us on this Of course its legal (nobody will arrest you for it), but its also bloody horrid. > special case. When we are going to sleep, we re-enable irq in > __rt_mutex_slowlock(). > > > > > > Maybe we could teach might_sleep() about this special case? > > > > Sounds horrid. > > Maybe, any alternative? Maybe someone explain this mess first?
On Tue, 2011-12-06 at 11:32 +0100, Peter Zijlstra wrote: > > > > > > > Maybe we could teach might_sleep() about this special case? > > > > > > Sounds horrid. > > > > Maybe, any alternative? > > Maybe someone explain this mess first? Me too, because this looks like a hack that's just like a lie. The first lie to be said gets you out of a bit of trouble, but then something else happens based on that lie, in which you need to make another bigger lie. This new lie affects more people and requires new more ingenious lies to control the chaos. But eventually the lies required to keep everything going become so overwhelming that it all blows up in your face and you end up looking like a jackass. Why is rcu using rt_mutex_lock() in strange ways? It's lying about its use. And now this patch is the bigger lie to get around the issues of the first lie. Eventually this code will continue to expand largely based on these lies and will explode in our faces, and I feel sorry for the poor jackass that needs to fix it. Perhaps the real answer is that we need to create an API for priority inheritance, that things like RCU could use. Attach a task that another task requires to finish something and boost the priority of that task. Maybe even completions could use such a thing? -- Steve
On Tue, Dec 06, 2011 at 10:52:32AM +0100, Peter Zijlstra wrote: > On Tue, 2011-12-06 at 09:26 +0800, Yong Zhang wrote: > > > Yeah, because we call might_sleep() in rt_mutex_lock() unconditionally. > > But in this case the 'BUG: sleeping function called from invalid context > > at *' is obviously false positive. > > Why can't this mutex acquisition not block? Because its purpose in life is to pass its priority on to the lock holder. Thanx, Paul > > Maybe we could teach might_sleep() about this special case? > > Sounds horrid. >
On Tue, Dec 06, 2011 at 07:26:45AM -0500, Steven Rostedt wrote: > On Tue, 2011-12-06 at 11:32 +0100, Peter Zijlstra wrote: > > > > > > > > > > Maybe we could teach might_sleep() about this special case? > > > > > > > > Sounds horrid. > > > > > > Maybe, any alternative? > > > > Maybe someone explain this mess first? > > Me too, because this looks like a hack that's just like a lie. The first > lie to be said gets you out of a bit of trouble, but then something else > happens based on that lie, in which you need to make another bigger lie. > This new lie affects more people and requires new more ingenious lies to > control the chaos. But eventually the lies required to keep everything > going become so overwhelming that it all blows up in your face and you > end up looking like a jackass. > > Why is rcu using rt_mutex_lock() in strange ways? It's lying about its > use. And now this patch is the bigger lie to get around the issues of > the first lie. Eventually this code will continue to expand largely > based on these lies and will explode in our faces, and I feel sorry for > the poor jackass that needs to fix it. > > Perhaps the real answer is that we need to create an API for priority > inheritance, that things like RCU could use. Attach a task that another > task requires to finish something and boost the priority of that task. > Maybe even completions could use such a thing? I would be OK with that -- that was in fact the approach I was taking when I was advised to use mutexes instead. ;-) Thanx, Paul
On Tue, Dec 06, 2011 at 11:27:26AM +0100, Peter Zijlstra wrote: > On Tue, 2011-12-06 at 10:52 +0100, Peter Zijlstra wrote: > > On Tue, 2011-12-06 at 09:26 +0800, Yong Zhang wrote: > > > > > Yeah, because we call might_sleep() in rt_mutex_lock() unconditionally. > > > But in this case the 'BUG: sleeping function called from invalid context > > > at *' is obviously false positive. > > > > Why can't this mutex acquisition not block? > > Gaah!! I see, this 5342e269 patch is revolting.. guys that's really vile > don't do that! > > I tried reading the RCU code but I gave up.. rcu_boost() does: > > rt_mutex_init_proxy_locked(); > raw_spin_unlock_irqrestore(); > rt_mutex_lock(); > rt_mutex_unlock(); > > vs rcu_read_unlock_special()'s RCU_READ_UNLOCK_BLOCKED branch: > > rt_mutex_unlock(); > > > The latter looks to be unbalanced because I can't actually find a > matching lock. Also, all of that is ran with IRQs enabled. So what's the > problem? The rt_mutex_init_proxy_locked() creates the lock in held state, held by the RCU reader who is holding up the grace period. So rcu_read_unlock_special()'s rt_mutex_unlock() is balanced by the rt_mutex_init_proxy_locked(). The problem with the IRQs enabled is the following sequence: rcu_read_lock(); /* do stuff */ local_irq_save(flags); /* do more stuff */ rcu_read_unlock(); /* do even more stuff */ local_irq_restore(flags); This has been legal in the past, and might well be used in places that -rt does not exercise, hence the desire to explicitly legalize it. Thanx, Paul
On Tue, 2011-12-06 at 08:11 -0800, Paul E. McKenney wrote: > The problem with the IRQs enabled is the following sequence: > > rcu_read_lock(); > /* do stuff */ > local_irq_save(flags); > /* do more stuff */ > rcu_read_unlock(); > /* do even more stuff */ > local_irq_restore(flags); > > This has been legal in the past, and might well be used in places that > -rt does not exercise, hence the desire to explicitly legalize it. So why not make it strictly dis-allowed, even for !-rt and see what falls over? If there's lots of fallout we might need to reconsider, but wouldn't it be easier to all abide by the strictest rules than to try and frob stuff like was proposed?
On Tue, Dec 06, 2011 at 08:04:55AM -0800, Paul E. McKenney wrote: > On Tue, Dec 06, 2011 at 07:26:45AM -0500, Steven Rostedt wrote: > > On Tue, 2011-12-06 at 11:32 +0100, Peter Zijlstra wrote: > > > > > > > > > > > > > Maybe we could teach might_sleep() about this special case? > > > > > > > > > > Sounds horrid. > > > > > > > > Maybe, any alternative? > > > > > > Maybe someone explain this mess first? > > > > Me too, because this looks like a hack that's just like a lie. The first > > lie to be said gets you out of a bit of trouble, but then something else > > happens based on that lie, in which you need to make another bigger lie. > > This new lie affects more people and requires new more ingenious lies to > > control the chaos. But eventually the lies required to keep everything > > going become so overwhelming that it all blows up in your face and you > > end up looking like a jackass. > > > > Why is rcu using rt_mutex_lock() in strange ways? It's lying about its > > use. And now this patch is the bigger lie to get around the issues of > > the first lie. Eventually this code will continue to expand largely > > based on these lies and will explode in our faces, and I feel sorry for > > the poor jackass that needs to fix it. > > > > Perhaps the real answer is that we need to create an API for priority > > inheritance, that things like RCU could use. Attach a task that another > > task requires to finish something and boost the priority of that task. > > Maybe even completions could use such a thing? > > I would be OK with that -- that was in fact the approach I was taking > when I was advised to use mutexes instead. ;-) But in the spirit of full disclosure -- moving from explicit priority boosting to mutexes simplified the code greatly. It eliminated a bunch of races between one task boosting and the boostee exiting its RCU read-side critical section. The difficulty is (or maybe was) the need to block when boosting priority, which meant that there were races where the booster marked the task, dropped the rcu_node ->lock, the boostee exited its RCU read-side critical section and uselessly deboosted itself, then the booster incorrectly did the boost. But perhaps I was just suffering a failure of imagination. Thanx, Paul
On Tue, 2011-12-06 at 08:04 -0800, Paul E. McKenney wrote: > > Perhaps the real answer is that we need to create an API for priority > > inheritance, that things like RCU could use. Attach a task that another > > task requires to finish something and boost the priority of that task. > > Maybe even completions could use such a thing? > > I would be OK with that -- that was in fact the approach I was taking > when I was advised to use mutexes instead. ;-) Maybe we should rethink it. Using the makeshift mutex looks to be a short term hack. But if we are starting to build on it, it will end up being a horrible design, based off of a hack. A mutex is to provide mutual exclusion. If we start bastardizing it to do other things, it will become unmaintainable. I dare say that it's close to unmaintainable now ;) If we create a new API to handle inheritance, then perhaps it could be used for other things like workqueues and completions (in -rt only). -- Steve
On Tue, Dec 06, 2011 at 11:56:42AM -0500, Steven Rostedt wrote: > On Tue, 2011-12-06 at 08:04 -0800, Paul E. McKenney wrote: > > > > Perhaps the real answer is that we need to create an API for priority > > > inheritance, that things like RCU could use. Attach a task that another > > > task requires to finish something and boost the priority of that task. > > > Maybe even completions could use such a thing? > > > > I would be OK with that -- that was in fact the approach I was taking > > when I was advised to use mutexes instead. ;-) > > Maybe we should rethink it. Using the makeshift mutex looks to be a > short term hack. But if we are starting to build on it, it will end up > being a horrible design, based off of a hack. > > A mutex is to provide mutual exclusion. If we start bastardizing it to > do other things, it will become unmaintainable. I dare say that it's > close to unmaintainable now ;) > > If we create a new API to handle inheritance, then perhaps it could be > used for other things like workqueues and completions (in -rt only). Tough choice between yours and Peter's suggestion... 1. Re-introduce ugly races by eliminating the mutex. 2. Possibly have to deal with a new spate of lockdep-RCU splats. Decisions, decisions! ;-) I suppose that one approach is to start with Peter's approach, possibly adapting lockdep to explicitly check -- with an exception for srcu_read_lock_raw(), of course. If lockdep-RCU splats rain down too hard, then perhaps the explicit priority inheritance would be one potential umbrella. Thoughts? Thanx, Paul
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h index 8cd9efe..2020e8a 100644 --- a/kernel/rcutree_plugin.h +++ b/kernel/rcutree_plugin.h @@ -401,8 +401,11 @@ static noinline void rcu_read_unlock_special(struct task_struct *t) #ifdef CONFIG_RCU_BOOST /* Unboost if we were boosted. */ - if (rbmp) + if (rbmp) { + local_irq_save(flags); rt_mutex_unlock(rbmp); + local_irq_restore(flags); + } #endif /* #ifdef CONFIG_RCU_BOOST */ /* @@ -1233,9 +1236,10 @@ static int rcu_boost(struct rcu_node *rnp) lockdep_set_class_and_name(&mtx.wait_lock, &rcu_boost_class, "rcu_boost_mutex"); t->rcu_boost_mutex = &mtx; - raw_spin_unlock_irqrestore(&rnp->lock, flags); + raw_spin_unlock(&rnp->lock); /* rrupts remain disabled. */ rt_mutex_lock(&mtx); /* Side effect: boosts task t's priority. */ rt_mutex_unlock(&mtx); /* Keep lockdep happy. */ + local_irq_restore(flags); return rnp->exp_tasks != NULL || rnp->boost_tasks != NULL; }