diff mbox series

[RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier

Message ID 1537868328-13405-1-git-send-email-daniel.lezcano@linaro.org
State New
Headers show
Series [RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier | expand

Commit Message

Daniel Lezcano Sept. 25, 2018, 9:38 a.m. UTC
The function get_loadavg() returns almost always zero. To be more
precise, statistically speaking for a total of 1023379 times passing
to the function, the load is equal to zero 1020728 times, greater than
100, 610 times, the remaining is between 0 and 5.

I'm putting in question this metric. Is it worth to keep it?

Cc: Todd Kjos <tkjos@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Colin Cross <ccross@android.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

---
 drivers/cpuidle/governors/menu.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

-- 
2.7.4

Comments

Rafael J. Wysocki Oct. 3, 2018, 8:58 a.m. UTC | #1
On Tuesday, September 25, 2018 11:38:44 AM CEST Daniel Lezcano wrote:
> The function get_loadavg() returns almost always zero. To be more

> precise, statistically speaking for a total of 1023379 times passing

> to the function, the load is equal to zero 1020728 times, greater than

> 100, 610 times, the remaining is between 0 and 5.

> 

> I'm putting in question this metric. Is it worth to keep it?


The honest answer is that I'm not sure.

Using the CPU load for driving idle state selection is questionable in general
(there need not be any connection between the two in principle) and this
particular function is questionable either.  If it really makes no difference
in addition to that, then yes it would be good to get rid of it.

That said it looks like that may be quite workload-dependent, so maybe there
are workloads where it does make a difference?  Like networking or similar?

Thanks,
Rafael
Daniel Lezcano Oct. 3, 2018, 10:25 a.m. UTC | #2
On 03/10/2018 10:58, Rafael J. Wysocki wrote:
> On Tuesday, September 25, 2018 11:38:44 AM CEST Daniel Lezcano wrote:

>> The function get_loadavg() returns almost always zero. To be more

>> precise, statistically speaking for a total of 1023379 times passing

>> to the function, the load is equal to zero 1020728 times, greater than

>> 100, 610 times, the remaining is between 0 and 5.

>>

>> I'm putting in question this metric. Is it worth to keep it?

> 

> The honest answer is that I'm not sure.

> 

> Using the CPU load for driving idle state selection is questionable in general

> (there need not be any connection between the two in principle) and this

> particular function is questionable either.  If it really makes no difference

> in addition to that, then yes it would be good to get rid of it.

> 

> That said it looks like that may be quite workload-dependent, so maybe there

> are workloads where it does make a difference?  Like networking or similar?


Perhaps initially the load was not the same:


commit 69d25870f20c4b2563304f2b79c5300dd60a067e
Author: Arjan van de Ven <arjan@infradead.org>
Date:   Mon Sep 21 17:04:08 2009 -0700

    cpuidle: fix the menu governor to boost IO performance


The function get_loadavg() was:

static int get_loadavg(void)
{
       unsigned long this = this_cpu_load();

       return LOAD_INT(this) * 10 + LOAD_FRAC(this) / 10;
}

With:

unsigned long this_cpu_load(void)
{
        struct rq *this = this_rq();
        return this->cpu_load[0];
}


Now we pass the load we got from get_iowait_load() which is:

void get_iowait_load(unsigned long *nr_waiters, unsigned long *load)
{
        struct rq *rq = this_rq();
        *nr_waiters = atomic_read(&rq->nr_iowait);
        *load = rq->load.weight;
}

Colin Cross did some measurements a few years ago and concluded it is
zero most of the time, otherwise the number is so high it has no
meaning. The get_loadavg() call is removed from Android since 2011 [1].

I can hack my server (bi Xeon - 6 cores), add statistics regarding this
metric and let it run during several days, sure a lot of different
workloads will happen on it (compilation, network, throughput computing,
...) in a real use case manner. That is what I did with an ARM64
evaluation board.

However I won't be able to do that right now and that will take a bit of
time. If you think it is really useful, I can plan to do that in the
next weeks.

 -- Daniel

[1]
https://android.googlesource.com/kernel/common/+/4dedd9f124703207895777ac6e91dacde0f7cc17



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Rafael J. Wysocki Oct. 3, 2018, 11:09 a.m. UTC | #3
On Wed, Oct 3, 2018 at 12:25 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>

> On 03/10/2018 10:58, Rafael J. Wysocki wrote:

> > On Tuesday, September 25, 2018 11:38:44 AM CEST Daniel Lezcano wrote:

> >> The function get_loadavg() returns almost always zero. To be more

> >> precise, statistically speaking for a total of 1023379 times passing

> >> to the function, the load is equal to zero 1020728 times, greater than

> >> 100, 610 times, the remaining is between 0 and 5.

> >>

> >> I'm putting in question this metric. Is it worth to keep it?

> >

> > The honest answer is that I'm not sure.

> >

> > Using the CPU load for driving idle state selection is questionable in general

> > (there need not be any connection between the two in principle) and this

> > particular function is questionable either.  If it really makes no difference

> > in addition to that, then yes it would be good to get rid of it.

> >

> > That said it looks like that may be quite workload-dependent, so maybe there

> > are workloads where it does make a difference?  Like networking or similar?

>

> Perhaps initially the load was not the same:

>

>

> commit 69d25870f20c4b2563304f2b79c5300dd60a067e

> Author: Arjan van de Ven <arjan@infradead.org>

> Date:   Mon Sep 21 17:04:08 2009 -0700

>

>     cpuidle: fix the menu governor to boost IO performance

>

>

> The function get_loadavg() was:

>

> static int get_loadavg(void)

> {

>        unsigned long this = this_cpu_load();

>

>        return LOAD_INT(this) * 10 + LOAD_FRAC(this) / 10;

> }

>

> With:

>

> unsigned long this_cpu_load(void)

> {

>         struct rq *this = this_rq();

>         return this->cpu_load[0];

> }

>

>

> Now we pass the load we got from get_iowait_load() which is:

>

> void get_iowait_load(unsigned long *nr_waiters, unsigned long *load)

> {

>         struct rq *rq = this_rq();

>         *nr_waiters = atomic_read(&rq->nr_iowait);

>         *load = rq->load.weight;

> }


This would mean that commit 372ba8cb46b2 (cpuidle: menu: Lookup CPU
runqueues less) changed the behavior more than expected, though (CC
Mel).

In any case that change was made in 2014 which is after Android
stopped using get_loadavg().

> Colin Cross did some measurements a few years ago and concluded it is

> zero most of the time, otherwise the number is so high it has no

> meaning. The get_loadavg() call is removed from Android since 2011 [1].


This is a good indication that it doesn't really matter (any more perhaps).

> I can hack my server (bi Xeon - 6 cores), add statistics regarding this

> metric and let it run during several days, sure a lot of different

> workloads will happen on it (compilation, network, throughput computing,

> ...) in a real use case manner. That is what I did with an ARM64

> evaluation board.

>

> However I won't be able to do that right now and that will take a bit of

> time. If you think it is really useful, I can plan to do that in the

> next weeks.


Well, we may as well just make this change to follow Android and let
it go through all of the CI we have for one cycle.

> [1] https://android.googlesource.com/kernel/common/+/4dedd9f124703207895777ac6e91dacde0f7cc17


If you send a non-RFC patch to drop get_loadavg() (but you can drop it
altogether then, there are no other callers of it AFAICS), I'll queue
it up after the 4.20 (or whatever it turns out to be in the end) merge
window so all of the CIs out there have a chance to report issues (if
any).

Thanks,
Rafael
diff mbox series

Patch

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index e26a409..d939b8e 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -173,18 +173,10 @@  static inline int which_bucket(unsigned int duration, unsigned long nr_iowaiters
  * to be, the higher this multiplier, and thus the higher
  * the barrier to go to an expensive C state.
  */
-static inline int performance_multiplier(unsigned long nr_iowaiters, unsigned long load)
+static inline int performance_multiplier(unsigned long nr_iowaiters)
 {
-	int mult = 1;
-
-	/* for higher loadavg, we are more reluctant */
-
-	mult += 2 * get_loadavg(load);
-
 	/* for IO wait tasks (per cpu!) we add 5x each */
-	mult += 10 * nr_iowaiters;
-
-	return mult;
+	return 1 + 10 * nr_iowaiters;
 }
 
 static DEFINE_PER_CPU(struct menu_device, menu_devices);
@@ -359,7 +351,8 @@  static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		 * Use the performance multiplier and the user-configurable
 		 * latency_req to determine the maximum exit latency.
 		 */
-		interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
+		interactivity_req = data->predicted_us /
+			performance_multiplier(nr_iowaiters);
 		if (latency_req > interactivity_req)
 			latency_req = interactivity_req;
 	}