[V2,3/3] workqueue: Schedule work on non-idle cpu instead of current one

Message ID 7d5c2cac08134446583624cc124d46d9726bdb2d.1348736069.git.viresh.kumar@linaro.org
State Accepted
Headers show

Commit Message

Viresh Kumar Sept. 27, 2012, 9:04 a.m.
Workqueues queues work on current cpu, if the caller haven't passed a preferred
cpu. This may wake up an idle CPU, which is actually not required.

This work can be processed by any CPU and so we must select a non-idle CPU here.
This patch adds in support in workqueue framework to get preferred CPU details
from the scheduler, instead of using current CPU.

Most of the time when a work is queued, the current cpu isn't idle and so we
will choose it only. There are cases when a cpu is idle when it queues some
work. For example, consider following scenario:
- A cpu has programmed a timer and is IDLE now.
- CPU gets into interrupt handler due to timer and queues a work. As the CPU is
  currently IDLE, we can queue this work to some other CPU.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/workqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Tejun Heo Sept. 30, 2012, 8:54 a.m. | #1
Hello, Viresh.

On Thu, Sep 27, 2012 at 02:34:05PM +0530, Viresh Kumar wrote:
> - A cpu has programmed a timer and is IDLE now.
> - CPU gets into interrupt handler due to timer and queues a work. As the CPU is
>   currently IDLE, we can queue this work to some other CPU.

I'm still a bit confused, if the CPU is already running the IRQ
handler, the CPU is not idle by definition.  What am I missing here?

Thanks.
Viresh Kumar Sept. 30, 2012, 12:16 p.m. | #2
On 30 September 2012 14:24, Tejun Heo <tj@kernel.org> wrote:
> On Thu, Sep 27, 2012 at 02:34:05PM +0530, Viresh Kumar wrote:
>> - A cpu has programmed a timer and is IDLE now.
>> - CPU gets into interrupt handler due to timer and queues a work. As the CPU is
>>   currently IDLE, we can queue this work to some other CPU.
>
> I'm still a bit confused, if the CPU is already running the IRQ
> handler, the CPU is not idle by definition.  What am I missing here?

Hi Tejun,

For the scheduler CPU is idle, if all below are true:
- current task is idle task
- nr_running == 0
- wake_list is empty

And during these conditions, there can be a timer running in background.
And when we reach its interrupt handler, then also these conditions hold true
and local cpu is idle.

--
Viresh
Tejun Heo Oct. 1, 2012, 12:32 a.m. | #3
Hello,

On Sun, Sep 30, 2012 at 05:46:45PM +0530, Viresh Kumar wrote:
> For the scheduler CPU is idle, if all below are true:
> - current task is idle task
> - nr_running == 0
> - wake_list is empty
> 
> And during these conditions, there can be a timer running in background.
> And when we reach its interrupt handler, then also these conditions hold true
> and local cpu is idle.

It isn't about the CPU being actually idle?  Also, if it's only about
timers, shouldn't it be enough to implement it for timer and
delayed_work?

It would be great if you explain what you're trying to achieve how.  I
can't tell what you're aiming for and why that would be beneficial
from these submissions.

Thanks.
Viresh Kumar Oct. 1, 2012, 3:47 a.m. | #4
On 1 October 2012 06:02, Tejun Heo <tj@kernel.org> wrote:
> It isn't about the CPU being actually idle?

No. Being idle only from scheduler's perspective. :)

> Also, if it's only about timers, shouldn't it be enough to implement
> it for timer and delayed_work?

What if we need a timer, which must re-arm itself + schedule a work?
delayed_work will not be sufficient in that case, and we would need
to use normal work.

If i am not wrong, there can be other future users of this routine too.
@Vincent: Can you please comment on this?

> It would be great if you explain what you're trying to achieve how.  I
> can't tell what you're aiming for and why that would be beneficial
> from these submissions.

Following slides are implemented by Vincent and presented during LPC.
Please have a look at them, they explain the problem statement well:

http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/08/lpc2012-sched-timer-workqueue.pdf

Specifically slides: 12 & 19.

There aren't too many users with this behavior, but even a single user
will be sufficient not to let the cpu get idle at all. And that will result in
less power saving.

--
viresh
Vincent Guittot Oct. 1, 2012, 9:12 a.m. | #5
On 1 October 2012 05:47, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 1 October 2012 06:02, Tejun Heo <tj@kernel.org> wrote:
>> It isn't about the CPU being actually idle?
>
> No. Being idle only from scheduler's perspective. :)
>
>> Also, if it's only about timers, shouldn't it be enough to implement
>> it for timer and delayed_work?
>
> What if we need a timer, which must re-arm itself + schedule a work?
> delayed_work will not be sufficient in that case, and we would need
> to use normal work.
>
> If i am not wrong, there can be other future users of this routine too.
> @Vincent: Can you please comment on this?

The goal is to consolidate the place, where the selection of a target
CPU for running something is done. The scheduler should select the CPU
in order to have coherent behaviour of all framework.

A delayed work can be delayed for a long period and the target CPU
that have been selected for the timer could not be the best place to
queue a work any more.

The goal is to be able to migrate a work if the scheduler thinks that'
is necessary.

Vincent

>
>> It would be great if you explain what you're trying to achieve how.  I
>> can't tell what you're aiming for and why that would be beneficial
>> from these submissions.
>
> Following slides are implemented by Vincent and presented during LPC.
> Please have a look at them, they explain the problem statement well:
>
> http://www.linuxplumbersconf.org/2012/wp-content/uploads/2012/08/lpc2012-sched-timer-workqueue.pdf
>
> Specifically slides: 12 & 19.
>
> There aren't too many users with this behavior, but even a single user
> will be sufficient not to let the cpu get idle at all. And that will result in
> less power saving.
>
> --
> viresh

Patch

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 143fd8c..5fa4ba4 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1238,7 +1238,7 @@  static void __queue_work(unsigned int cpu, struct workqueue_struct *wq,
 		struct global_cwq *last_gcwq;
 
 		if (cpu == WORK_CPU_UNBOUND)
-			cpu = raw_smp_processor_id();
+			cpu = sched_select_cpu(0);
 
 		/*
 		 * It's multi cpu.  If @work was previously on a different