
sched/core: Create new task with twice disabled preemption

Message ID 20140214123536.GA12304@arm.com

Commit Message

Catalin Marinas Feb. 14, 2014, 12:35 p.m. UTC
On Thu, Feb 13, 2014 at 07:51:56PM +0400, Kirill Tkhai wrote:
> The preemption state on entry to finish_task_switch() differs
> between the context_switch() and schedule_tail() cases.
> 
> In the first case it is disabled twice: once at the start of
> schedule() and once by the spin locking. In the second it is
> disabled only once: by the value set in init_task_preempt_count().
> 
> For archs without __ARCH_WANT_UNLOCKED_CTXSW set this means
> that all newly created tasks execute finish_arch_post_lock_switch()
> and post_schedule() with preemption enabled.
> 
> It seems a problem is possible in rare situations on arm64,
> when one freshly created thread preempts another before
> finish_arch_post_lock_switch() has finished. If the mm is the same,
> then TIF_SWITCH_MM on the second won't be set.
> 
> The second rare but possible issue is the zeroing of
> post_schedule() on the wrong CPU.
> 
> So, let's fix this and unify the preempt_count state.

An alternative to your patch:

Comments

Catalin Marinas Feb. 14, 2014, 3:49 p.m. UTC | #1
On Fri, Feb 14, 2014 at 12:44:01PM +0000, Kirill Tkhai wrote:
> On Fri, 14/02/2014 at 12:35 +0000, Catalin Marinas wrote:
> > On Thu, Feb 13, 2014 at 07:51:56PM +0400, Kirill Tkhai wrote:
> > > The preemption state on entry to finish_task_switch() differs
> > > between the context_switch() and schedule_tail() cases.
> > > 
> > > In the first case it is disabled twice: once at the start of
> > > schedule() and once by the spin locking. In the second it is
> > > disabled only once: by the value set in init_task_preempt_count().
> > > 
> > > For archs without __ARCH_WANT_UNLOCKED_CTXSW set this means
> > > that all newly created tasks execute finish_arch_post_lock_switch()
> > > and post_schedule() with preemption enabled.
> > > 
> > > It seems a problem is possible in rare situations on arm64,
> > > when one freshly created thread preempts another before
> > > finish_arch_post_lock_switch() has finished. If the mm is the same,
> > > then TIF_SWITCH_MM on the second won't be set.
> > > 
> > > The second rare but possible issue is the zeroing of
> > > post_schedule() on the wrong CPU.
> > > 
> > > So, let's fix this and unify the preempt_count state.
> > 
> > An alternative to your patch:
> 
> It looks better than the initial patch.
> 
> You may add my Acked-by if you want.

Thanks for the ack. But apart from arm64, are there any other problems
with running part of finish_task_switch() and post_schedule() with
preemption enabled?

finish_arch_post_lock_switch() is currently only used by arm and
arm64 (the former on UP only), and arm no longer has the preemption
problem (see commit bdae73cd374e2). So I can either disable preemption
around the schedule_tail() call on arm64, or do it globally as per
your patch or mine.

Peter, Ingo, any thoughts? Do we care about preempt count consistency
across finish_task_switch() and post_schedule()?

Thanks.

Patch

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b46131ef6aab..b932b6a4c716 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2210,6 +2210,10 @@  asmlinkage void schedule_tail(struct task_struct *prev)
 {
 	struct rq *rq = this_rq();
 
+#ifndef __ARCH_WANT_UNLOCKED_CTXSW
+	/* In this case, finish_task_switch reenables preemption */
+	preempt_disable();
+#endif
 	finish_task_switch(rq, prev);
 
 	/*
@@ -2218,10 +2222,7 @@  asmlinkage void schedule_tail(struct task_struct *prev)
 	 */
 	post_schedule(rq);
 
-#ifdef __ARCH_WANT_UNLOCKED_CTXSW
-	/* In this case, finish_task_switch does not reenable preemption */
 	preempt_enable();
-#endif
 	if (current->set_child_tid)
 		put_user(task_pid_vnr(current), current->set_child_tid);
 }