[1/4,resend] cpuidle : handle clockevent notify from the cpuidle framework

Message ID 1363868494-5503-1-git-send-email-daniel.lezcano@linaro.org
State New

Commit Message

Daniel Lezcano March 21, 2013, 12:21 p.m.
When a cpu enters a deep idle state, the local timers are stopped and
the time framework falls back to the timer device used as a broadcast
timer.

The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
when the idle state stops the local timer.

Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
drivers. If the flag is set, the cpuidle core code takes care of the
notification on behalf of the driver to avoid pointless code duplication.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Rajendra Nayak <rnayak@ti.com>
Cc: Sascha Hauer <kernel@pengutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/cpuidle/cpuidle.c |    9 +++++++++
 include/linux/cpuidle.h   |    1 +
 2 files changed, 10 insertions(+)
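
The behaviour described in the commit message can be sketched in user space. This is only an illustration, not the kernel code: the flag values are taken from the patch, but `mock_idle_state`, `mock_states`, `mock_idle_call()` and the two counters are hypothetical stand-ins for `struct cpuidle_state`, a driver's state table, `cpuidle_idle_call()` and the `clockevents_notify()` ENTER/EXIT calls.

```c
#include <assert.h>
#include <stdio.h>

/* Flag values as they appear in the include/linux/cpuidle.h hunk. */
#define CPUIDLE_FLAG_TIME_VALID 0x01
#define CPUIDLE_FLAG_COUPLED    0x02
#define CPUIDLE_FLAG_TIMER_STOP 0x04

/* Hypothetical, simplified stand-in for struct cpuidle_state. */
struct mock_idle_state {
	const char *name;
	unsigned int flags;
};

/* Counters standing in for the broadcast ENTER/EXIT notifications. */
static int broadcast_enter_calls;
static int broadcast_exit_calls;

static void mock_broadcast_enter(void) { broadcast_enter_calls++; }
static void mock_broadcast_exit(void)  { broadcast_exit_calls++; }

/* Hypothetical driver table: only the deep state stops the local timer. */
static const struct mock_idle_state mock_states[] = {
	{ "WFI",  CPUIDLE_FLAG_TIME_VALID },
	{ "deep", CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TIMER_STOP },
};

#define MOCK_STATE_COUNT (sizeof(mock_states) / sizeof(mock_states[0]))

/* Same shape as the patched core path: notify, enter the state, notify. */
static int mock_idle_call(int next_state)
{
	assert(next_state >= 0 && (size_t)next_state < MOCK_STATE_COUNT);

	if (mock_states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
		mock_broadcast_enter();

	/* A real driver would enter the low power state here. */
	printf("entering state %s\n", mock_states[next_state].name);

	if (mock_states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
		mock_broadcast_exit();

	return next_state;
}
```

With this arrangement the driver only declares the flag in its state table; the notification pair is issued in one place, and states without the flag are entered without touching the broadcast timer.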

Comments

Santosh Shilimkar March 21, 2013, 12:59 p.m. | #1
On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:
> When a cpu enters a deep idle state, the local timers are stopped and
> the time framework falls back to the timer device used as a broadcast
> timer.
> 
> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
> when the idle state stops the local timer.
> 
> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
> drivers. If the flag is set, the cpuidle core code takes care of the
> notification on behalf of the driver to avoid pointless code duplication.
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
> Cc: Rajendra Nayak <rnayak@ti.com>
> Cc: Sascha Hauer <kernel@pengutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/cpuidle/cpuidle.c |    9 +++++++++
>  include/linux/cpuidle.h   |    1 +
>  2 files changed, 10 insertions(+)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index eba6929..c500370 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -8,6 +8,7 @@
>   * This code is licenced under the GPL.
>   */
>  
> +#include <linux/clockchips.h>
>  #include <linux/kernel.h>
>  #include <linux/mutex.h>
>  #include <linux/sched.h>
> @@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
>  
>  	trace_cpu_idle_rcuidle(next_state, dev->cpu);
>  
> +	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
> +		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
> +				   &dev->cpu);
> +
Seems like good clean-up from drivers.
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

How about taking care of cpu_pm_notifiers() as well with some
additional flag for CPU and cluster power state. That can help
to reduce and consolidate the code. What you say ?

Regards,
Santosh

Daniel Lezcano March 21, 2013, 1:52 p.m. | #2
On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:
> On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:
>> When a cpu enters a deep idle state, the local timers are stopped and
>> the time framework falls back to the timer device used as a broadcast
>> timer.
>>
>> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
>> when the idle state stops the local timer.
>>
>> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
>> drivers. If the flag is set, the cpuidle core code takes care of the
>> notification on behalf of the driver to avoid pointless code duplication.
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Len Brown <lenb@kernel.org>
>> Cc: Linus Walleij <linus.walleij@linaro.org>
>> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
>> Cc: Rajendra Nayak <rnayak@ti.com>
>> Cc: Sascha Hauer <kernel@pengutronix.de>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>  drivers/cpuidle/cpuidle.c |    9 +++++++++
>>  include/linux/cpuidle.h   |    1 +
>>  2 files changed, 10 insertions(+)
>>
>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>> index eba6929..c500370 100644
>> --- a/drivers/cpuidle/cpuidle.c
>> +++ b/drivers/cpuidle/cpuidle.c
>> @@ -8,6 +8,7 @@
>>   * This code is licenced under the GPL.
>>   */
>>  
>> +#include <linux/clockchips.h>
>>  #include <linux/kernel.h>
>>  #include <linux/mutex.h>
>>  #include <linux/sched.h>
>> @@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
>>  
>>  	trace_cpu_idle_rcuidle(next_state, dev->cpu);
>>  
>> +	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
>> +		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
>> +				   &dev->cpu);
>> +
> Seems like good clean-up from drivers.
> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
> 
> How about taking care of cpu_pm_notifiers() as well with some
> additional flag for CPU and cluster power state. That can help
> to reduce and consolidate the code. What you say ?

Do you mean add a flag for different level of idle (idle, suspend, power
off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
drivers as a common framework ?
Daniel Lezcano March 21, 2013, 1:54 p.m. | #3
On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:

[ ... ]

> Seems like good clean-up from drivers.
> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

... btw, thanks for reviewing the patches :)

  -- Daniel
Santosh Shilimkar March 21, 2013, 2:04 p.m. | #4
On Thursday 21 March 2013 07:22 PM, Daniel Lezcano wrote:
> On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:
>> On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:
>>> When a cpu enters a deep idle state, the local timers are stopped and
>>> the time framework falls back to the timer device used as a broadcast
>>> timer.
>>>
>>> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
>>> when the idle state stops the local timer.
>>>
>>> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
>>> drivers. If the flag is set, the cpuidle core code takes care of the
>>> notification on behalf of the driver to avoid pointless code duplication.
>>>
>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>>> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Len Brown <lenb@kernel.org>
>>> Cc: Linus Walleij <linus.walleij@linaro.org>
>>> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>> Cc: Rajendra Nayak <rnayak@ti.com>
>>> Cc: Sascha Hauer <kernel@pengutronix.de>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> ---
>>>  drivers/cpuidle/cpuidle.c |    9 +++++++++
>>>  include/linux/cpuidle.h   |    1 +
>>>  2 files changed, 10 insertions(+)
>>>
>>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>>> index eba6929..c500370 100644
>>> --- a/drivers/cpuidle/cpuidle.c
>>> +++ b/drivers/cpuidle/cpuidle.c
>>> @@ -8,6 +8,7 @@
>>>   * This code is licenced under the GPL.
>>>   */
>>>  
>>> +#include <linux/clockchips.h>
>>>  #include <linux/kernel.h>
>>>  #include <linux/mutex.h>
>>>  #include <linux/sched.h>
>>> @@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
>>>  
>>>  	trace_cpu_idle_rcuidle(next_state, dev->cpu);
>>>  
>>> +	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
>>> +		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
>>> +				   &dev->cpu);
>>> +
>> Seems like good clean-up from drivers.
>> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>
>> How about taking care of cpu_pm_notifiers() as well with some
>> additional flag for CPU and cluster power state. That can help
>> to reduce and consolidate the code. What you say ?
> 
> Do you mean add a flag for different level of idle (idle, suspend, power
> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
> drivers as a common framework ?
> 
I mean only for CPUidle considering C-state already has the information
about CPU and cluster power state. For suspend, we by-default run the notifiers
so nothing needs to be done there.

You may not even need a framework. Just like we know in a C-state, timer
stops, same lines, we can say CPU state is going to be say off and hence
cpu_pm_enter() notifier needs to be called. And same way for cluster.

I still haven't given complete thought but thought crossed my mind
after looking at your patches.

Regards,
Santosh
Daniel Lezcano March 21, 2013, 2:41 p.m. | #5
On 03/21/2013 03:04 PM, Santosh Shilimkar wrote:
> On Thursday 21 March 2013 07:22 PM, Daniel Lezcano wrote:
>> On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:
>>> On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:
>>>> When a cpu enters a deep idle state, the local timers are stopped and
>>>> the time framework falls back to the timer device used as a broadcast
>>>> timer.
>>>>
>>>> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
>>>> when the idle state stops the local timer.
>>>>
>>>> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
>>>> drivers. If the flag is set, the cpuidle core code takes care of the
>>>> notification on behalf of the driver to avoid pointless code duplication.
>>>>
>>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>>>> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
>>>> Cc: Len Brown <lenb@kernel.org>
>>>> Cc: Linus Walleij <linus.walleij@linaro.org>
>>>> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>>> Cc: Rajendra Nayak <rnayak@ti.com>
>>>> Cc: Sascha Hauer <kernel@pengutronix.de>
>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>> ---
>>>>  drivers/cpuidle/cpuidle.c |    9 +++++++++
>>>>  include/linux/cpuidle.h   |    1 +
>>>>  2 files changed, 10 insertions(+)
>>>>
>>>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>>>> index eba6929..c500370 100644
>>>> --- a/drivers/cpuidle/cpuidle.c
>>>> +++ b/drivers/cpuidle/cpuidle.c
>>>> @@ -8,6 +8,7 @@
>>>>   * This code is licenced under the GPL.
>>>>   */
>>>>  
>>>> +#include <linux/clockchips.h>
>>>>  #include <linux/kernel.h>
>>>>  #include <linux/mutex.h>
>>>>  #include <linux/sched.h>
>>>> @@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
>>>>  
>>>>  	trace_cpu_idle_rcuidle(next_state, dev->cpu);
>>>>  
>>>> +	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
>>>> +		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
>>>> +				   &dev->cpu);
>>>> +
>>> Seems like good clean-up from drivers.
>>> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>>
>>> How about taking care of cpu_pm_notifiers() as well with some
>>> additional flag for CPU and cluster power state. That can help
>>> to reduce and consolidate the code. What you say ?
>>
>> Do you mean add a flag for different level of idle (idle, suspend, power
>> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
>> drivers as a common framework ?
>>
> I mean only for CPUidle considering C-state already has the information
> about CPU and cluster power state. For suspend, we by-default run the notifiers
> so nothing needs to be done there.
> 
> You may not even need a framework. Just like we know in a C-state, timer
> stops, same lines, we can say CPU state is going to be say off and hence
> cpu_pm_enter() notifier needs to be called. And same way for cluster.
> 
> I still haven't given complete thought but thought crossed my mind
> after looking at your patches.

Right, that could be interesting.

I see may be one issue with this approach: when we enter an idle state
with power off, some checking are done before cpu_pm_enter and that
could lead to abort the current idle routine.

By moving this to the cpuidle framework, we will invoke always
cpu_pm_enter/exit even if the idle enter routine failed.

But, IMO, the idea is good.

For example in cpuidle34xx:

static int __omap3_enter_idle(struct cpuidle_device *dev,
                                struct cpuidle_driver *drv,
                                int index)
{
        struct omap3_idle_statedata *cx = &omap3_idle_data[index];

        local_fiq_disable();

        if (omap_irq_pending() || need_resched())
                goto return_sleep_time;

	...

        if (cx->mpu_state == PWRDM_POWER_OFF)
                cpu_pm_enter();

	...
}

The same for omap4 and tegra2/3.

With your knowledge of omap, do you think it is possible to move
cpu_pm_enter before entering the idle routine ?

Thanks
  -- Daniel
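
The concern raised in this message can be illustrated with a small user-space model. It is a sketch under stated assumptions: `mock_cpu_pm_enter()` and `mock_cpu_pm_exit()` are counters standing in for `cpu_pm_enter()`/`cpu_pm_exit()`, `irq_pending` stands in for the driver's early checks, and both `driver_side_enter()` and `core_side_enter()` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Counters standing in for cpu_pm_enter() / cpu_pm_exit(). */
static int pm_enter_calls;
static int pm_exit_calls;

/* Stands in for checks such as omap_irq_pending() || need_resched(). */
static bool irq_pending;

static void mock_cpu_pm_enter(void) { pm_enter_calls++; }
static void mock_cpu_pm_exit(void)  { pm_exit_calls++; }

/*
 * Driver-side placement: the early check runs first, so the notifiers
 * are skipped when the idle routine aborts. Returns true if the low
 * power state was entered.
 */
static bool driver_side_enter(void)
{
	if (irq_pending)
		return false;

	mock_cpu_pm_enter();
	/* low power entry would happen here */
	mock_cpu_pm_exit();
	return true;
}

/* Driver callback with the notifier calls removed. */
static bool driver_enter_no_notify(void)
{
	return !irq_pending;
}

/*
 * Core-side placement: the notifiers wrap the driver callback, so they
 * run even when the callback aborts early.
 */
static bool core_side_enter(void)
{
	bool entered;

	mock_cpu_pm_enter();
	entered = driver_enter_no_notify();
	mock_cpu_pm_exit();
	return entered;
}
```

With `irq_pending` set, the driver-side variant leaves both counters untouched, while the core-side variant still runs one enter/exit pair for an idle entry that never happened.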
Santosh Shilimkar March 21, 2013, 2:52 p.m. | #6
On Thursday 21 March 2013 08:11 PM, Daniel Lezcano wrote:
> On 03/21/2013 03:04 PM, Santosh Shilimkar wrote:
>> On Thursday 21 March 2013 07:22 PM, Daniel Lezcano wrote:
>>> On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:
>>>> On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:
>>>>> When a cpu enters a deep idle state, the local timers are stopped and
>>>>> the time framework falls back to the timer device used as a broadcast
>>>>> timer.
>>>>>
>>>>> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
>>>>> when the idle state stops the local timer.
>>>>>
>>>>> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
>>>>> drivers. If the flag is set, the cpuidle core code takes care of the
>>>>> notification on behalf of the driver to avoid pointless code duplication.
>>>>>
>>>>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>>>>> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
>>>>> Cc: Len Brown <lenb@kernel.org>
>>>>> Cc: Linus Walleij <linus.walleij@linaro.org>
>>>>> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>>>> Cc: Rajendra Nayak <rnayak@ti.com>
>>>>> Cc: Sascha Hauer <kernel@pengutronix.de>
>>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>>> ---
>>>>>  drivers/cpuidle/cpuidle.c |    9 +++++++++
>>>>>  include/linux/cpuidle.h   |    1 +
>>>>>  2 files changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>>>>> index eba6929..c500370 100644
>>>>> --- a/drivers/cpuidle/cpuidle.c
>>>>> +++ b/drivers/cpuidle/cpuidle.c
>>>>> @@ -8,6 +8,7 @@
>>>>>   * This code is licenced under the GPL.
>>>>>   */
>>>>>  
>>>>> +#include <linux/clockchips.h>
>>>>>  #include <linux/kernel.h>
>>>>>  #include <linux/mutex.h>
>>>>>  #include <linux/sched.h>
>>>>> @@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
>>>>>  
>>>>>  	trace_cpu_idle_rcuidle(next_state, dev->cpu);
>>>>>  
>>>>> +	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
>>>>> +		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
>>>>> +				   &dev->cpu);
>>>>> +
>>>> Seems like good clean-up from drivers.
>>>> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>>>
>>>> How about taking care of cpu_pm_notifiers() as well with some
>>>> additional flag for CPU and cluster power state. That can help
>>>> to reduce and consolidate the code. What you say ?
>>>
>>> Do you mean add a flag for different level of idle (idle, suspend, power
>>> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
>>> drivers as a common framework ?
>>>
>> I mean only for CPUidle considering C-state already has the information
>> about CPU and cluster power state. For suspend, we by-default run the notifiers
>> so nothing needs to be done there.
>>
>> You may not even need a framework. Just like we know in a C-state, timer
>> stops, same lines, we can say CPU state is going to be say off and hence
>> cpu_pm_enter() notifier needs to be called. And same way for cluster.
>>
>> I still haven't given complete thought but thought crossed my mind
>> after looking at your patches.
> 
> Right, that could be interesting.
> 
> I see may be one issue with this approach: when we enter an idle state
> with power off, some checking are done before cpu_pm_enter and that
> could lead to abort the current idle routine.
> 
I see your point.

> By moving this to the cpuidle framework, we will invoke always
> cpu_pm_enter/exit even if the idle enter routine failed.
> 
If we at all decide to go on this path, we can always get around the
issues. The key is these notifiers have to be run very close to the
low power entry/exit since the saved context for CPU/CPU cluster
at wrong points would lead to many issues.

> But, IMO, the idea is good.
> 
> For example in cpuidle34xx:
> 
> static int __omap3_enter_idle(struct cpuidle_device *dev,
>                                 struct cpuidle_driver *drv,
>                                 int index)
> {
>         struct omap3_idle_statedata *cx = &omap3_idle_data[index];
> 
>         local_fiq_disable();
> 
>         if (omap_irq_pending() || need_resched())
>                 goto return_sleep_time;
> 
> 	...
> 
>         if (cx->mpu_state == PWRDM_POWER_OFF)
>                 cpu_pm_enter();
> 
> 	...
> }
> 
> The same for omap4 and tegra2/3.
> 
> With your knowledge of omap, do you think it is possible to move
> cpu_pm_enter before entering the idle routine ?
> 
I will get back on this topic after some experiments most likely
by next week.

Regards,
Santosh
Lorenzo Pieralisi March 21, 2013, 3:25 p.m. | #7
On Thu, Mar 21, 2013 at 02:52:21PM +0000, Santosh Shilimkar wrote:

[...]

> >>>> How about taking care of cpu_pm_notifiers() as well with some
> >>>> additional flag for CPU and cluster power state. That can help
> >>>> to reduce and consolidate the code. What you say ?
> >>>
> >>> Do you mean add a flag for different level of idle (idle, suspend, power
> >>> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
> >>> drivers as a common framework ?
> >>>
> >> I mean only for CPUidle considering C-state already has the information
> >> about CPU and cluster power state. For suspend, we by-default run the notifiers
> >> so nothing needs to be done there.
> >>
> >> You may not even need a framework. Just like we know in a C-state, timer
> >> stops, same lines, we can say CPU state is going to be say off and hence
> >> cpu_pm_enter() notifier needs to be called. And same way for cluster.
> >>
> >> I still haven't given complete thought but thought crossed my mind
> >> after looking at your patches.
> > 
> > Right, that could be interesting.
> > 
> > I see may be one issue with this approach: when we enter an idle state
> > with power off, some checking are done before cpu_pm_enter and that
> > could lead to abort the current idle routine.
> > 
> I see your point.
> 
> > By moving this to the cpuidle framework, we will invoke always
> > cpu_pm_enter/exit even if the idle enter routine failed.
> > 
> If we at all decide to go on this path, we can always get around the
> issues. The key is these notifiers have to be run very close to the
> low power entry/exit since the saved context for CPU/CPU cluster
> at wrong points would lead to many issues.
> 
> > But, IMO, the idea is good.
> > 
> > For example in cpuidle34xx:
> > 
> > static int __omap3_enter_idle(struct cpuidle_device *dev,
> >                                 struct cpuidle_driver *drv,
> >                                 int index)
> > {
> >         struct omap3_idle_statedata *cx = &omap3_idle_data[index];
> > 
> >         local_fiq_disable();
> > 
> >         if (omap_irq_pending() || need_resched())
> >                 goto return_sleep_time;
> > 
> > 	...
> > 
> >         if (cx->mpu_state == PWRDM_POWER_OFF)
> >                 cpu_pm_enter();
> > 
> > 	...
> > }
> > 
> > The same for omap4 and tegra2/3.
> > 
> > With your knowledge of omap, do you think it is possible to move
> > cpu_pm_enter before entering the idle routine ?
> > 
> I will get back on this topic after some experiments most likely
> by next week.

Looks like the way to go, we could enhance the notifiers to be able to
save/restore specific subsystems with the C-state flags defining which
ones.

Notifiers actions should not be disruptive so the only drawback in
executing those before entering the idle routine could possibly be
a waste of cycles in case the C-state entry fails but certainly something
to verify on all platforms using them.

Lorenzo
Rafael J. Wysocki March 22, 2013, 12:41 a.m. | #8
On Thursday, March 21, 2013 01:21:31 PM Daniel Lezcano wrote:
> When a cpu enters a deep idle state, the local timers are stopped and
> the time framework falls back to the timer device used as a broadcast
> timer.
> 
> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
> when the idle state stops the local timer.
> 
> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
> drivers. If the flag is set, the cpuidle core code takes care of the
> notification on behalf of the driver to avoid pointless code duplication.
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
> Cc: Rajendra Nayak <rnayak@ti.com>
> Cc: Sascha Hauer <kernel@pengutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>

All patches in the series applied to linux-pm.git/bleeding-edge and will be
moved to linux-next after build testing.

Thanks,
Rafael


> ---
>  drivers/cpuidle/cpuidle.c |    9 +++++++++
>  include/linux/cpuidle.h   |    1 +
>  2 files changed, 10 insertions(+)
> 
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index eba6929..c500370 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -8,6 +8,7 @@
>   * This code is licenced under the GPL.
>   */
>  
> +#include <linux/clockchips.h>
>  #include <linux/kernel.h>
>  #include <linux/mutex.h>
>  #include <linux/sched.h>
> @@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
>  
>  	trace_cpu_idle_rcuidle(next_state, dev->cpu);
>  
> +	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
> +		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
> +				   &dev->cpu);
> +
>  	if (cpuidle_state_is_coupled(dev, drv, next_state))
>  		entered_state = cpuidle_enter_state_coupled(dev, drv,
>  							    next_state);
>  	else
>  		entered_state = cpuidle_enter_state(dev, drv, next_state);
>  
> +	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
> +		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
> +				   &dev->cpu);
> +
>  	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
>  
>  	/* give the governor an opportunity to reflect on the outcome */
> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
> index 480c14d..a837b33 100644
> --- a/include/linux/cpuidle.h
> +++ b/include/linux/cpuidle.h
> @@ -57,6 +57,7 @@ struct cpuidle_state {
>  /* Idle State Flags */
>  #define CPUIDLE_FLAG_TIME_VALID	(0x01) /* is residency time measurable? */
>  #define CPUIDLE_FLAG_COUPLED	(0x02) /* state applies to multiple cpus */
> +#define CPUIDLE_FLAG_TIMER_STOP (0x04)  /* timer is stopped on this state */
>  
>  #define CPUIDLE_DRIVER_FLAGS_MASK (0xFFFF0000)
>  
>
Kevin Hilman March 22, 2013, 5:04 p.m. | #9
Daniel Lezcano <daniel.lezcano@linaro.org> writes:

> When a cpu enters a deep idle state, the local timers are stopped and
> the time framework falls back to the timer device used as a broadcast
> timer.
>
> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
> when the idle state stops the local timer.
>
> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
> drivers. If the flag is set, the cpuidle core code takes care of the
> notification on behalf of the driver to avoid pointless code duplication.

Nice cleanup.

Reviewed-by: Kevin Hilman <khilman@linaro.org>
Santosh Shilimkar March 26, 2013, 8:41 a.m. | #10
Lorenzo, Daniel,

On Thursday 21 March 2013 08:55 PM, Lorenzo Pieralisi wrote:
> On Thu, Mar 21, 2013 at 02:52:21PM +0000, Santosh Shilimkar wrote:
> 
> [...]
> 
>>>>>> How about taking care of cpu_pm_notifiers() as well with some
>>>>>> additional flag for CPU and cluster power state. That can help
>>>>>> to reduce and consolidate the code. What you say ?
>>>>>
>>>>> Do you mean add a flag for different level of idle (idle, suspend, power
>>>>> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
>>>>> drivers as a common framework ?
>>>>>
>>>> I mean only for CPUidle considering C-state already has the information
>>>> about CPU and cluster power state. For suspend, we by-default run the notifiers
>>>> so nothing needs to be done there.
>>>>
>>>> You may not even need a framework. Just like we know in a C-state, timer
>>>> stops, same lines, we can say CPU state is going to be say off and hence
>>>> cpu_pm_enter() notifier needs to be called. And same way for cluster.
>>>>
>>>> I still haven't given complete thought but thought crossed my mind
>>>> after looking at your patches.
>>>
>>> Right, that could be interesting.
>>>
>>> I see may be one issue with this approach: when we enter an idle state
>>> with power off, some checking are done before cpu_pm_enter and that
>>> could lead to abort the current idle routine.
>>>
>> I see your point.
>>
>>> By moving this to the cpuidle framework, we will invoke always
>>> cpu_pm_enter/exit even if the idle enter routine failed.
>>>
>> If we at all decide to go on this path, we can always get around the
>> issues. The key is these notifiers have to be run very close to the
>> low power entry/exit since the saved context for CPU/CPU cluster
>> at wrong points would lead to many issues.
>>
>>> But, IMO, the idea is good.
>>>
>>> For example in cpuidle34xx:
>>>
>>> static int __omap3_enter_idle(struct cpuidle_device *dev,
>>>                                 struct cpuidle_driver *drv,
>>>                                 int index)
>>> {
>>>         struct omap3_idle_statedata *cx = &omap3_idle_data[index];
>>>
>>>         local_fiq_disable();
>>>
>>>         if (omap_irq_pending() || need_resched())
>>>                 goto return_sleep_time;
>>>
>>> 	...
>>>
>>>         if (cx->mpu_state == PWRDM_POWER_OFF)
>>>                 cpu_pm_enter();
>>>
>>> 	...
>>> }
>>>
>>> The same for omap4 and tegra2/3.
>>>
>>> With your knowledge of omap, do you think it is possible to move
>>> cpu_pm_enter before entering the idle routine ?
>>>
>> I will get back on this topic after some experiments most likely
>> by next week.
> 
> Looks like the way to go, we could enhance the notifiers to be able to
> save/restore specific subsystems with the C-state flags defining which
> ones.
> 
> Notifiers actions should not be disruptive so the only drawback in
> executing those before entering the idle routine could possibly be
> a waste of cycles in case the C-state entry fails but certainly something
> to verify on all platforms using them.
>
After spending a few minutes on the idea, I made a few observations.

1. Since cpu_pm_enter() involves saving the context of CPU
co-processors like VFP and the debug subsystem, it should be called
after irqs and preemption are disabled and we are on our way
into idle entry. It can be called from generic idle code
but we just need to take care of the above.

2. cpu_cluster_pm_enter() is actually a bit tricky since its use
is not consistent between coupled idle entry and, say, smp_idle entry.
Cluster notifiers are called when the idle driver finds that
all the CPUs in that cluster are asleep and the current CPU
can take down the cluster with it. This notifier also
needs to be called late since, after the notifier call,
there shouldn't be any irq activity which might lead to
an incorrect context save for, say, the irq controller.

3. The same goes for the exit notifiers, which have to be called
before irqs and preemption are enabled again.

4. Coupled idle overall needs special consideration to
manage these notifiers from generic code.

Overall it seems to be doable in the direction you have
already started with the timer broadcast notifiers. You
can at least add this one to your queue of work :-)
I do not have many cycles left because of the other pile
of work, so I cannot carry out the changes myself, but you can
surely count on me for reviewing and testing the patches.

Regards,
Santosh
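
The ordering constraint in observations 1 and 3 above can be sketched as a user-space model. This is an assumption-laden illustration, not a proposal for the kernel code: every `mock_*` function and the `event_log` buffer are hypothetical, and the model covers only the per-CPU notifiers, not the cluster or coupled idle cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool irqs_disabled;
static char event_log[128];

/* Append an event name to a comma-separated log. */
static void log_event(const char *name)
{
	assert(strlen(event_log) + strlen(name) + 2 < sizeof(event_log));
	if (event_log[0] != '\0')
		strcat(event_log, ",");
	strcat(event_log, name);
}

static void mock_local_irq_disable(void)
{
	irqs_disabled = true;
	log_event("irq_off");
}

static void mock_local_irq_enable(void)
{
	irqs_disabled = false;
	log_event("irq_on");
}

/* Stand-in for cpu_pm_enter(): context save must run with irqs off. */
static void mock_cpu_pm_enter(void)
{
	assert(irqs_disabled);
	log_event("pm_enter");
}

/* Stand-in for cpu_pm_exit(): restore must finish before irqs come back. */
static void mock_cpu_pm_exit(void)
{
	assert(irqs_disabled);
	log_event("pm_exit");
}

/* One idle entry with the notifiers kept close to the low power state. */
static void mock_idle_sequence(void)
{
	mock_local_irq_disable();
	mock_cpu_pm_enter();
	log_event("low_power");
	mock_cpu_pm_exit();
	mock_local_irq_enable();
}
```

The assertions inside the two notifier stand-ins encode the rule from the list: any generic code that calls them with interrupts enabled would trip the check.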

Patch

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index eba6929..c500370 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -8,6 +8,7 @@ 
  * This code is licenced under the GPL.
  */
 
+#include <linux/clockchips.h>
 #include <linux/kernel.h>
 #include <linux/mutex.h>
 #include <linux/sched.h>
@@ -146,12 +147,20 @@  int cpuidle_idle_call(void)
 
 	trace_cpu_idle_rcuidle(next_state, dev->cpu);
 
+	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
+		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
+				   &dev->cpu);
+
 	if (cpuidle_state_is_coupled(dev, drv, next_state))
 		entered_state = cpuidle_enter_state_coupled(dev, drv,
 							    next_state);
 	else
 		entered_state = cpuidle_enter_state(dev, drv, next_state);
 
+	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
+		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
+				   &dev->cpu);
+
 	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
 
 	/* give the governor an opportunity to reflect on the outcome */
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 480c14d..a837b33 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -57,6 +57,7 @@  struct cpuidle_state {
 /* Idle State Flags */
 #define CPUIDLE_FLAG_TIME_VALID	(0x01) /* is residency time measurable? */
 #define CPUIDLE_FLAG_COUPLED	(0x02) /* state applies to multiple cpus */
+#define CPUIDLE_FLAG_TIMER_STOP (0x04)  /* timer is stopped on this state */
 
 #define CPUIDLE_DRIVER_FLAGS_MASK (0xFFFF0000)