cpufreq: dt: Set default policy->transition_delay_ns

Message ID	16157eb75bb26cca73a0da930e49f2549b96fd65.1495429745.git.viresh.kumar@linaro.org
State	New
Headers
	Delivered-To: patch@linaro.org
	Received-SPF: pass (google.com: best guess record for domain of linux-pm-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67;
	From: Viresh Kumar <viresh.kumar@linaro.org>
	To: Rafael Wysocki <rjw@rjwysocki.net>, Viresh Kumar <viresh.kumar@linaro.org>
	Cc: linaro-kernel@lists.linaro.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, Vincent Guittot <vincent.guittot@linaro.org>
	Subject: [PATCH] cpufreq: dt: Set default policy->transition_delay_ns
	Date: Mon, 22 May 2017 10:40:14 +0530
	Message-Id: <16157eb75bb26cca73a0da930e49f2549b96fd65.1495429745.git.viresh.kumar@linaro.org>
	Sender: linux-pm-owner@vger.kernel.org
	Precedence: bulk

Viresh Kumar May 22, 2017, 5:10 a.m. UTC

The rate_limit_us for the schedutil governor is getting set to 500 ms by
default for the ARM64 hikey board. And it's way too much, even for the
default value. Let's set the default transition_delay_ns to something
more realistic (10 ms), while the userspace always has a chance to set
something it wants.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

---
 drivers/cpufreq/cpufreq-dt.c | 3 +++
 1 file changed, 3 insertions(+)

-- 
2.7.4

Brendan Jackman May 22, 2017, 10:45 a.m. UTC | #1

Hi Viresh,

On Mon, May 22 2017 at 05:10, Viresh Kumar wrote:
> The rate_limit_us for the schedutil governor is getting set to 500 ms by
> default for the ARM64 hikey board. And it's way too much, even for the
> default value. Let's set the default transition_delay_ns to something
> more realistic (10 ms), while the userspace always has a chance to set
> something it wants.

Just a thought - do you think we can treat the reported transition
latency as a proxy for the "cost" of freq transitions?  I.e. assume that
on platforms with very fast frequency switching it's probably cheap to
switch frequency and we want schedutil to respond quickly, whereas on
platforms with big latencies, frequency switches might be expensive and
we probably want hysteresis.

If that makes sense then maybe we could use 10 * transition_latency /
NSEC_PER_USEC, when transition_latency is reported? Otherwise 10ms seems
sensible to me..
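
To make the suggestion concrete, here is a rough userspace sketch of that default. The helper name default_delay_us() is made up for illustration, the constants are redefined locally with the same values as the kernel ones, and treating a latency of 0 as "not reported" is an assumption of this sketch rather than something the driver is known to do.

```c
#include <stdint.h>

/* Redefined locally for a userspace build; same values as in the kernel. */
#define NSEC_PER_USEC 1000U
#define USEC_PER_MSEC 1000U

/*
 * Hypothetical helper, not kernel code: default delay in microseconds,
 * computed as 10 * transition_latency / NSEC_PER_USEC, with a fixed 10 ms
 * fallback when the result is 0 (assumed here to mean "no usable latency
 * was reported").
 */
unsigned int default_delay_us(unsigned int transition_latency_ns)
{
	/* 64-bit intermediate so that 10 * latency cannot overflow. */
	uint64_t delay_us = (uint64_t)10 * transition_latency_ns / NSEC_PER_USEC;

	if (delay_us)
		return (unsigned int)delay_us;

	return 10 * USEC_PER_MSEC;
}
```

With the 500 us latency mentioned for hikey later in the thread, this would give 5000 us (5 ms) instead of the fixed 10 ms.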

Cheers,
Brendan

>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq-dt.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index c943787d761e..70eac3fd89ac 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -275,6 +275,9 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>
>  	policy->cpuinfo.transition_latency = transition_latency;
>
> +	/* Set the default transition delay to 10ms */
> +	policy->transition_delay_us = 10 * USEC_PER_MSEC;
> +
>  	return 0;
>
>  out_free_cpufreq_table:

Viresh Kumar May 22, 2017, 10:55 a.m. UTC | #2

On 22-05-17, 11:45, Brendan Jackman wrote:
> Hi Viresh,
> 
> On Mon, May 22 2017 at 05:10, Viresh Kumar wrote:
> > The rate_limit_us for the schedutil governor is getting set to 500 ms by
> > default for the ARM64 hikey board. And it's way too much, even for the
> > default value. Let's set the default transition_delay_ns to something
> > more realistic (10 ms), while the userspace always has a chance to set
> > something it wants.
> 
> Just a thought - do you think we can treat the reported transition
> latency as a proxy for the "cost" of freq transitions?  I.e. assume that
> on platforms with very fast frequency switching it's probably cheap to
> switch frequency and we want schedutil to respond quickly, whereas on
> platforms with big latencies, frequency switches might be expensive and
> we probably want hysteresis.
> 
> If that makes sense then maybe we could use 10 * transition_latency /
> NSEC_PER_USEC, when transition_latency is reported? Otherwise 10ms seems
> sensible to me..


So my platform (hikey) does provide transition-latency as 500 us. But
schedutil multiplies that by LATENCY_MULTIPLIER (1000) and that
makes rate_limit_us 500000 (500 ms), which is unacceptable.
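
The arithmetic can be checked with a small userspace model of that default. This is only a sketch: the function name is invented, and the constants are redefined locally with the values quoted in this thread.

```c
/* Redefined locally for a userspace build. */
#define NSEC_PER_USEC      1000U
#define LATENCY_MULTIPLIER 1000U	/* value quoted in this thread */

/*
 * Illustrative model (not the kernel function itself) of the default
 * rate_limit_us: the multiplier, scaled by the transition latency
 * converted from nanoseconds to microseconds.
 */
unsigned int model_default_rate_limit_us(unsigned int transition_latency_ns)
{
	unsigned int rate_limit_us = LATENCY_MULTIPLIER;
	unsigned int lat = transition_latency_ns / NSEC_PER_USEC;

	/* A latency below 1 us truncates to 0 and leaves the bare multiplier. */
	if (lat)
		rate_limit_us *= lat;

	return rate_limit_us;
}
```

For 500 us (500000 ns) this returns 500000 us, i.e. the 500 ms seen on hikey.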

@Rafael: Why does LATENCY_MULTIPLIER have such a high value? I am
not sure I completely understand why we have this multiplier :(

-- 
viresh

Leo Yan May 22, 2017, 11:17 a.m. UTC | #3

On Mon, May 22, 2017 at 04:25:22PM +0530, Viresh Kumar wrote:
> On 22-05-17, 11:45, Brendan Jackman wrote:
> > Hi Viresh,
> > 
> > On Mon, May 22 2017 at 05:10, Viresh Kumar wrote:
> > > The rate_limit_us for the schedutil governor is getting set to 500 ms by
> > > default for the ARM64 hikey board. And it's way too much, even for the
> > > default value. Let's set the default transition_delay_ns to something
> > > more realistic (10 ms), while the userspace always has a chance to set
> > > something it wants.
> > 
> > Just a thought - do you think we can treat the reported transition
> > latency as a proxy for the "cost" of freq transitions?  I.e. assume that
> > on platforms with very fast frequency switching it's probably cheap to
> > switch frequency and we want schedutil to respond quickly, whereas on
> > platforms with big latencies, frequency switches might be expensive and
> > we probably want hysteresis.
> > 
> > If that makes sense then maybe we could use 10 * transition_latency /
> > NSEC_PER_USEC, when transition_latency is reported? Otherwise 10ms seems
> > sensible to me..
> 
> So my platform (hikey) does provide transition-latency as 500 us. But
> schedutil multiplies that by LATENCY_MULTIPLIER (1000) and that
> makes rate_limit_us 500000 (500 ms), which is unacceptable.
> 
> @Rafael: Why does LATENCY_MULTIPLIER have such a high value? I am
> not sure I completely understand why we have this multiplier :(


This afternoon Amit pointed me to this patch; should we fix it as below?
Otherwise it seems we directly assign a value in 'ns' units to 'us'
without any conversion.


Thanks,
Leo Yan

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 76877a6..dcc90fc 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
                unsigned int lat;
 
                tunables->rate_limit_us = LATENCY_MULTIPLIER;
-               lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
+               lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
                if (lat)
                        tunables->rate_limit_us *= lat;
        }
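
To see what each divisor does to the resulting default, the calculation can be modelled in userspace with the divisor as a parameter. This is a sketch only: the function name is invented and the constants are redefined locally with the kernel's values.

```c
/* Redefined locally for a userspace build. */
#define NSEC_PER_USEC      1000U
#define NSEC_PER_MSEC      1000000U
#define LATENCY_MULTIPLIER 1000U	/* value quoted in this thread */

/*
 * Illustrative model of the calculation touched by the diff above, with
 * the divisor passed in so that both variants can be compared. The result
 * is always interpreted as microseconds (rate_limit_us).
 */
unsigned int model_rate_limit_us(unsigned int transition_latency_ns,
				 unsigned int divisor)
{
	unsigned int rate_limit_us = LATENCY_MULTIPLIER;
	unsigned int lat = transition_latency_ns / divisor;

	/* Integer division: a latency smaller than the divisor becomes 0. */
	if (lat)
		rate_limit_us *= lat;

	return rate_limit_us;
}
```

With a 500 us latency, dividing by NSEC_PER_USEC gives 500000 us (500 ms). Dividing by NSEC_PER_MSEC truncates the latency to 0, so the result is the bare multiplier, 1000 us (1 ms). A 3 ms latency would give 3000 us with the MSEC divisor.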

Viresh Kumar May 22, 2017, 11:27 a.m. UTC | #4

On 22-05-17, 19:17, Leo Yan wrote:
> This afternoon Amit pointed me to this patch; should we fix it as below?
> Otherwise it seems we directly assign a value in 'ns' units to 'us'
> without any conversion.
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 76877a6..dcc90fc 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
>                 unsigned int lat;
>  
>                 tunables->rate_limit_us = LATENCY_MULTIPLIER;
> -               lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> +               lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
>                 if (lat)
>                         tunables->rate_limit_us *= lat;
>         }


I will let Rafael comment as well. NSEC_PER_USEC is used in the
earlier governors as well (ondemand/conservative) in exactly the same
way as schedutil uses it.

-- 
viresh

Rafael J. Wysocki June 27, 2017, 12:15 a.m. UTC | #5

On Monday, May 22, 2017 04:57:27 PM Viresh Kumar wrote:
> On 22-05-17, 19:17, Leo Yan wrote:
> > This afternoon Amit pointed me to this patch; should we fix it as below?
> > Otherwise it seems we directly assign a value in 'ns' units to 'us'
> > without any conversion.
> > 
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 76877a6..dcc90fc 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
> >                 unsigned int lat;
> >  
> >                 tunables->rate_limit_us = LATENCY_MULTIPLIER;
> > -               lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> > +               lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
> >                 if (lat)
> >                         tunables->rate_limit_us *= lat;
> >         }
> 
> I will let Rafael comment as well. NSEC_PER_USEC is used in the
> earlier governors as well (ondemand/conservative) in exactly the same
> way as schedutil uses it.


The reason why it is used by schedutil is because the other governors used it
that way.  IOW, doesn't matter. :-)

Thanks,
Rafael

Viresh Kumar June 27, 2017, 4:20 a.m. UTC | #6

On 27-06-17, 02:15, Rafael J. Wysocki wrote:
> On Monday, May 22, 2017 04:57:27 PM Viresh Kumar wrote:
> > On 22-05-17, 19:17, Leo Yan wrote:
> > > This afternoon Amit pointed me to this patch; should we fix it as below?
> > > Otherwise it seems we directly assign a value in 'ns' units to 'us'
> > > without any conversion.
> > > 
> > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > > index 76877a6..dcc90fc 100644
> > > --- a/kernel/sched/cpufreq_schedutil.c
> > > +++ b/kernel/sched/cpufreq_schedutil.c
> > > @@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
> > >                 unsigned int lat;
> > >  
> > >                 tunables->rate_limit_us = LATENCY_MULTIPLIER;
> > > -               lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;

I think the above line is just fine and the below one is incorrect, as
we wanted to convert transition latency to usec here (i.e. in the
units of rate_limit_us).

> > > +               lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
> > >                 if (lat)
> > >                         tunables->rate_limit_us *= lat;
> > >         }
> > 
> > I will let Rafael comment as well. NSEC_PER_USEC is used in the
> > earlier governors as well (ondemand/conservative) in exactly the same
> > way as schedutil uses it.
> 
> The reason why it is used by schedutil is because the other governors used it
> that way.  IOW, doesn't matter. :-)

But I feel the value of LATENCY_MULTIPLIER (1000) is way too high. It currently
says that if freq-switching takes time X, then we should wait for 999X time
before we change the freq again.

Perhaps LATENCY_MULTIPLIER should be just 10 or 20 here. For a platform with
transition_latency 500 us, rate_limit_us comes to 500 ms, which is absurd. We
ideally want it to be around 10-20 ms here. And compared to other ARM platforms,
500 us transition_latency is very low. It normally is around 1-3 ms for ARM32
platforms.
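
As a quick check of what a smaller multiplier would yield, here is a userspace sketch with the multiplier passed as a parameter. The function name is invented for illustration, and the values 10 and 20 are only the candidates floated above, not settled choices.

```c
/* Redefined locally for a userspace build; same value as in the kernel. */
#define NSEC_PER_USEC 1000U

/*
 * Illustrative model: rate limit in microseconds for a given transition
 * latency (in nanoseconds) and a candidate multiplier.
 */
unsigned int model_rate_limit_with_multiplier(unsigned int transition_latency_ns,
					      unsigned int multiplier)
{
	unsigned int rate_limit_us = multiplier;
	unsigned int lat = transition_latency_ns / NSEC_PER_USEC;

	if (lat)
		rate_limit_us *= lat;

	return rate_limit_us;
}
```

For a 500 us latency, a multiplier of 10 gives 5000 us (5 ms) and 20 gives 10000 us (10 ms). For a 3 ms latency, a multiplier of 10 gives 30000 us (30 ms).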

@Rafael: Would it be fine to lower the value of LATENCY_MULTIPLIER?

-- 
viresh

Rafael J. Wysocki June 27, 2017, 4:08 p.m. UTC | #7

Hi,

On Tue, Jun 27, 2017 at 6:20 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 27-06-17, 02:15, Rafael J. Wysocki wrote:
>> On Monday, May 22, 2017 04:57:27 PM Viresh Kumar wrote:
>> > On 22-05-17, 19:17, Leo Yan wrote:
>> > > This afternoon Amit pointed me to this patch; should we fix it as below?
>> > > Otherwise it seems we directly assign a value in 'ns' units to 'us'
>> > > without any conversion.
>> > >
>> > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> > > index 76877a6..dcc90fc 100644
>> > > --- a/kernel/sched/cpufreq_schedutil.c
>> > > +++ b/kernel/sched/cpufreq_schedutil.c
>> > > @@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
>> > >                 unsigned int lat;
>> > >
>> > >                 tunables->rate_limit_us = LATENCY_MULTIPLIER;
>> > > -               lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
>
> I think the above line is just fine and the below one is incorrect, as
> we wanted to convert transition latency to usec here (i.e. in the
> units of rate_limit_us).
>
>> > > +               lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
>> > >                 if (lat)
>> > >                         tunables->rate_limit_us *= lat;
>> > >         }
>> >
>> > I will let Rafael comment as well. NSEC_PER_USEC is used in the
>> > earlier governors as well (ondemand/conservative) in exactly the same
>> > way as schedutil uses it.
>>
>> The reason why it is used by schedutil is because the other governors used it
>> that way.  IOW, doesn't matter. :-)
>
> But I feel the value of LATENCY_MULTIPLIER (1000) is way too high. It currently
> says that if freq-switching takes time X, then we should wait for 999X time
> before we change the freq again.
>
> Perhaps LATENCY_MULTIPLIER should be just 10 or 20 here. For a platform with
> transition_latency 500 us, rate_limit_us comes to 500 ms, which is absurd. We
> ideally want it to be around 10-20 ms here. And compared to other ARM platforms,
> 500 us transition_latency is very low. It normally is around 1-3 ms for ARM32
> platforms.
>
> @Rafael: Would it be fine to lower the value of LATENCY_MULTIPLIER?


We can do that, but then I think we need to compensate for the change
in the old governors code or there may be surprises.

Thanks,
Rafael

Viresh Kumar June 28, 2017, 4:14 a.m. UTC | #8

On 27-06-17, 18:08, Rafael J. Wysocki wrote:
> On Tue, Jun 27, 2017 at 6:20 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > @Rafael: Would it be fine to lower the value of LATENCY_MULTIPLIER?
> 
> We can do that, but then I think we need to compensate for the change
> in the old governors code or there may be surprises.

Why shouldn't we change the value of LATENCY_MULTIPLIER for old
governors as well? They use the same calculations and the sampling
rate there is also this bad (like rate_limit_us).

If we aren't going to change that for old governors, then we can
create a local version of LATENCY_MULTIPLIER for schedutil I believe.

-- 
viresh

Rafael J. Wysocki June 28, 2017, 8:52 p.m. UTC | #9

On Wednesday, June 28, 2017 09:44:55 AM Viresh Kumar wrote:
> On 27-06-17, 18:08, Rafael J. Wysocki wrote:
> > On Tue, Jun 27, 2017 at 6:20 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > @Rafael: Would it be fine to lower the value of LATENCY_MULTIPLIER?
> > 
> > We can do that, but then I think we need to compensate for the change
> > in the old governors code or there may be surprises.
> 
> Why shouldn't we change the value of LATENCY_MULTIPLIER for old
> governors as well? They use the same calculations and the sampling
> rate there is also this bad (like rate_limit_us).


On some systems.  On other systems it isn't.

> If we aren't going to change that for old governors, then we can
> create a local version of LATENCY_MULTIPLIER for schedutil I believe.


OK, so at least for intel_pstate and acpi-cpufreq we want a 10 ms default
which is what we have currently.

If you want to rework all that, make sure you preserve that.

Thanks,
Rafael
