[3/3] cpuidle/menu: add per cpu pm_qos_resume_latency consideration

Message ID 1484227624-6740-4-git-send-email-alex.shi@linaro.org
State New
Headers show

Commit Message

Alex Shi Jan. 12, 2017, 1:27 p.m.
Kernel or user may have special requirements on cpu response time; for
example, if an interrupt is pinned to a cpu, we don't want that cpu to
go into too deep a sleep state. This patch prevents that by considering
the per cpu resume_latency setting during sleep state selection in the
menu governor.

The pm_qos_resume_latency asks the device to respond within this time,
which is similar to a cpu C-state's entry_latency + exit_latency. But
since most cpu C-states either have no entry_latency or fold it into
exit_latency, we can simply treat this time requirement as the state's
exit_latency.

We can set the wanted latency value according to the value of
/sys/devices/system/cpu/cpuX/cpuidle/stateX/latency, to just a bit less
than the related state's latency value; the cpu is then prevented from
entering that state or any deeper one.

Signed-off-by: Alex Shi <alex.shi@linaro.org>

To: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/governors/menu.c | 7 +++++++
 1 file changed, 7 insertions(+)

-- 
2.8.1.101.g72d917a

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Rik van Riel Jan. 12, 2017, 8:03 p.m. | #1
On Thu, 2017-01-12 at 21:27 +0800, Alex Shi wrote:
> Kernel or user may have special requirement on cpu response time, like
> if a interrupt is pinned to a cpu, we don't want the cpu goes too deep
> sleep. This patch can prevent this thing happen by consider per cpu
> resume_latency setting in cpu sleep state selection in menu governor.
>
> [ ... ]


Acked-by: Rik van Riel <riel@redhat.com>


-- 
All rights reversed
Daniel Lezcano Jan. 19, 2017, 10:21 a.m. | #2
On Thu, Jan 19, 2017 at 05:25:37PM +0800, Alex Shi wrote:
> > That said, I have the feeling that is taking the wrong direction. Each time we
> > are entering idle, we check the latencies. Entering idle can be done thousands
> > of times per second. Wouldn't it make sense to disable the states not fulfilling
> > the constraints at the moment the latencies are changed ? As the idle states
> > have increasing exit latencies, setting an idle state limit to disable all
> > states after that limit may be more efficient than checking again and again in
> > the idle path, no ?
>
> You're right. Saving some checking is a good thing to do.

Hi Alex,

I think you missed the point.

What I am proposing is to change the current approach by disabling all the
states after a specific latency.

We add a specific internal function:

static int cpuidle_set_latency(struct cpuidle_driver *drv,
				struct cpuidle_device *dev,
				int latency)
{
	int i;

	for (i = 0; i < drv->state_count; i++) {
		struct cpuidle_state *s = &drv->states[i];

		if (s->exit_latency > latency)
			break;
	}

	/* Keep every state up to, but excluding, the first one whose
	 * exit latency exceeds the constraint. */
	dev->state_count = i;

	return 0;
}

This function is called from the notifier callback:

static int cpuidle_latency_notify(struct notifier_block *b,
                unsigned long l, void *v)
 {
-       wake_up_all_idle_cpus();
+       struct cpuidle_device *dev;
+       struct cpuidle_driver *drv;
+       int cpu;
+
+       cpuidle_pause_and_lock();
+       for_each_possible_cpu(cpu) {
+               dev = &per_cpu(cpuidle_dev, cpu);
+               drv = cpuidle_get_cpu_driver(dev);
+               cpuidle_set_latency(drv, dev, l);
+       }
+       cpuidle_resume_and_unlock();
+
         return NOTIFY_OK;
 }

-----------------------------------------------------------------------------

The menu governor becomes:

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index bba3c2af..87e58e3 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -352,7 +352,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
         * Find the idle state with the lowest power while satisfying
         * our constraints.
         */
-       for (i = data->last_state_idx + 1; i < drv->state_count; i++) {
+       for (i = data->last_state_idx + 1; i < dev->state_count; i++) {
                struct cpuidle_state *s = &drv->states[i];
                struct cpuidle_state_usage *su = &dev->states_usage[i];

... with a cleanup around latency_req.

At init time, drv->state_count and every cpu's dev->state_count are the same.

Well, that is the rough idea: instead of reading the latency when entering
idle, let's disable/enable the idle states when we set a new latency.

I did not check how that fits with the per cpu latency, but I think it will be
cleaner to change the approach rather than spreading latency dances around.

-----------------------------------------------------------------------------

And the cpuidle_device structure is changed to:

diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index b923c32..2fc966cb 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -88,6 +88,7 @@ struct cpuidle_device {
        cpumask_t               coupled_cpus;
        struct cpuidle_coupled  *coupled;
 #endif
+       int state_count;
 };
 
 DECLARE_PER_CPU(struct cpuidle_device *, cpuidle_devices);

Rafael J. Wysocki Jan. 19, 2017, 9:43 p.m. | #3
On Thu, Jan 19, 2017 at 11:21 AM, Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
> On Thu, Jan 19, 2017 at 05:25:37PM +0800, Alex Shi wrote:
>
> [ ... ]
>
> This function is called from the notifier callback:
>
> static int cpuidle_latency_notify(struct notifier_block *b,
>                 unsigned long l, void *v)
>  {
> -       wake_up_all_idle_cpus();
> +       struct cpuidle_device *dev;
> +       struct cpuidle_driver *drv;
> +
> +       cpuidle_pause_and_lock();
> +       for_each_possible_cpu(cpu) {
> +               dev = &per_cpu(cpuidle_dev, cpu);
> +               drv = cpuidle_get_cpu_driver(dev);
> +               cpuidle_set_latency(drv, dev, l);
> +       }
> +       cpuidle_resume_and_unlock();
> +
>         return NOTIFY_OK;
>  }

The above may be problematic if the constraints change relatively
often.  It is global and it will affect all of the CPUs in the system
every time and now think about systems with hundreds of them.

Thanks,
Rafael
Daniel Lezcano Jan. 20, 2017, 10:54 a.m. | #4
On Thu, Jan 19, 2017 at 10:43:23PM +0100, Rafael J. Wysocki wrote:

[ ... ]

> > This function is called from the notifier callback:
> >
> > [ ... ]
>
> The above may be problematic if the constraints change relatively
> often.  It is global and it will affect all of the CPUs in the system
> every time and now think about systems with hundreds of them.

Yeah, that could be problematic. The code snippet gives the general idea, but it
could be changed, for example, by a flag telling the cpus to update their
state_count when they enter idle. Or something like that.

But if you think the patchset is fine, that is ok; we can improve things afterwards.

  -- Daniel

-- 

 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Daniel Lezcano Jan. 23, 2017, 12:50 p.m. | #5
On Sun, Jan 22, 2017 at 09:31:44AM +0800, Alex Shi wrote:
> > Yeah, that could be problematic. The code snippet gives the general idea but it
> > could be changed by for example by a flag telling the cpus when they enter idle
> > to update their state_count. Or something like that.
>
> Yes, this idea could be helpful.
>
> But since the idle path isn't a hot path, a few memory accesses won't cost
> a lot. So I doubt the benefit would be measurable.


It won't be measurable (neither is reading the cpu device latency before
checking whether the latency req is zero), but it makes sense.

The idle routine is not a hot path, but it is a very special place where
interrupts are disabled, RCU is not usable, the tick is disabled, etc.

Perhaps it is not a problem for the moment, but it is probably worth
mentioning that using APIs from other subsystems in the idle select path
could be problematic, and perhaps it is time to think about another
approach for the future.


Patch

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 07e36bb..8d6d25c 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -19,6 +19,7 @@ 
 #include <linux/tick.h>
 #include <linux/sched.h>
 #include <linux/math64.h>
+#include <linux/cpu.h>
 
 /*
  * Please note when changing the tuning values:
@@ -280,17 +281,23 @@  static unsigned int get_typical_interval(struct menu_device *data)
 static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
 	struct menu_device *data = this_cpu_ptr(&menu_devices);
+	struct device *device = get_cpu_device(dev->cpu);
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
 	int i;
 	unsigned int interactivity_req;
 	unsigned int expected_interval;
 	unsigned long nr_iowaiters, cpu_load;
+	int resume_latency = dev_pm_qos_read_value(device);
 
 	if (data->needs_update) {
 		menu_update(drv, dev);
 		data->needs_update = 0;
 	}
 
+	/* resume_latency == 0 means no restriction */
+	if (resume_latency && resume_latency < latency_req)
+		latency_req = resume_latency;
+
 	/* Special case when user has set very strict latency requirement */
 	if (unlikely(latency_req == 0))
 		return 0;