[RFC,V2,0/6] cpufreq: transition-latency cleanups

Message ID	cover.1499853492.git.viresh.kumar@linaro.org
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: best guess record for domain of linux-pm-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; From: Viresh Kumar <viresh.kumar@linaro.org> To: Rafael Wysocki <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org>, linux-pm@vger.kernel.org, Vincent Guittot <vincent.guittot@linaro.org>, linux@dominikbrodowski.net, Ingo Molnar <mingo@redhat.com>, Jonathan Corbet <corbet@lwn.net>, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>, Sudeep Holla <sudeep.holla@arm.com> Subject: [RFC V2 0/6] cpufreq: transition-latency cleanups Date: Thu, 13 Jul 2017 11:10:51 +0530 Message-Id: <cover.1499853492.git.viresh.kumar@linaro.org> Sender: linux-pm-owner@vger.kernel.org Precedence: bulk
Series	cpufreq: transition-latency cleanups \| expand [RFC,V2,0/6] cpufreq: transition-latency cleanups [RFC,V2,2/6] cpufreq: schedutil: Set dynamic_switching to true [RFC,V2,3/6] cpufreq: governor: Drop min_sampling_rate [RFC,V2,5/6] cpufreq: Cap the default transition delay value to 10 ms [RFC,V2,6/6] cpufreq: arm_big_little: Make ->get_transition_latency() mandatory

Viresh Kumar July 13, 2017, 5:40 a.m. UTC

Hi Rafael,

Here is the V2 and sending it as RFC this time.

This series tries to cleanup the code around transition-latency and its
users.  Some of the old legacy code, which may not make much sense now,
is dropped as well.

Some code consolidation also done across governors.

Based of: pm/linux-next
Tested on: ARM64 Hikey board.

V1->V2:
- While we still get rid of the limitation of 10ms for using
  ondemand/conservative, but we preserve the earlier behavior where the
  transition latency set to CPUFREQ_ETERNAL would not allow use of
  ondemand/conservative governors. Thanks to Dominik for his feedback on
  that.

--
viresh

Viresh Kumar (6):
  cpufreq: Replace "max_transition_latency" with "dynamic_switching"
  cpufreq: schedutil: Set dynamic_switching to true
  cpufreq: governor: Drop min_sampling_rate
  cpufreq: Use transition_delay_us for legacy governors as well
  cpufreq: Cap the default transition delay value to 10 ms
  cpufreq: arm_big_little: Make ->get_transition_latency() mandatory

 Documentation/admin-guide/pm/cpufreq.rst |  8 -------
 drivers/cpufreq/arm_big_little.c         | 10 ++++-----
 drivers/cpufreq/cpufreq.c                |  8 +++----
 drivers/cpufreq/cpufreq_conservative.c   |  6 ------
 drivers/cpufreq/cpufreq_governor.c       | 17 ++-------------
 drivers/cpufreq/cpufreq_governor.h       |  3 +--
 drivers/cpufreq/cpufreq_ondemand.c       | 12 -----------
 include/linux/cpufreq.h                  | 36 ++++++++++++++++++++++++--------
 kernel/sched/cpufreq_schedutil.c         | 12 ++---------
 9 files changed, 40 insertions(+), 72 deletions(-)

-- 
2.13.0.71.gd7076ec9c9cb

Rafael J. Wysocki July 13, 2017, 4:19 p.m. UTC | #1

On Thu, Jul 13, 2017 at 7:40 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> There is no limitation in the ondemand or conservative governors which

> disallow the transition_latency to be greater than 10 ms.

>

> The max_transition_latency field is rather used to disallow automatic

> dynamic frequency switching for platforms which didn't wanted these

> governors to run.

>

> Replace max_transition_latency with a boolean (dynamic_switching) and

> check for transition_latency == CPUFREQ_ETERNAL along with that. This

> makes it pretty straight forward to read/understand now.


Well, using CPUFREQ_ETERNAL for that on the driver side is still not
particularly straightforward IMO, so maybe add a
"no_dynamic_switching" to the driver structure and set it to "true"
for the one driver in question?

Thanks,
Rafael

Rafael J. Wysocki July 13, 2017, 4:31 p.m. UTC | #2

On Thu, Jul 13, 2017 at 7:40 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> The policy->transition_delay_us field is used only by the schedutil

> governor currently, and this field describes how fast the driver wants

> the cpufreq governor to change CPUs frequency. It should rather be a

> common thing across all governors, as it doesn't have any schedutil

> dependency here.

>

> Create a new helper cpufreq_policy_transition_delay_us() to get the

> transition delay across all governors.

>

> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

> ---

>  drivers/cpufreq/cpufreq_governor.c |  9 +--------

>  include/linux/cpufreq.h            | 15 +++++++++++++++

>  kernel/sched/cpufreq_schedutil.c   | 11 +----------

>  3 files changed, 17 insertions(+), 18 deletions(-)

>

> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c

> index 858081f9c3d7..eed069ecfd5e 100644

> --- a/drivers/cpufreq/cpufreq_governor.c

> +++ b/drivers/cpufreq/cpufreq_governor.c

> @@ -389,7 +389,6 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)

>         struct dbs_governor *gov = dbs_governor_of(policy);

>         struct dbs_data *dbs_data;

>         struct policy_dbs_info *policy_dbs;

> -       unsigned int latency;

>         int ret = 0;

>

>         /* State should be equivalent to EXIT */

> @@ -428,13 +427,7 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)

>         if (ret)

>                 goto free_policy_dbs_info;

>

> -       /* policy latency is in ns. Convert it to us first */

> -       latency = policy->cpuinfo.transition_latency / 1000;

> -       if (latency == 0)

> -               latency = 1;

> -

> -       /* Bring kernel and HW constraints together */

> -       dbs_data->sampling_rate = LATENCY_MULTIPLIER * latency;

> +       dbs_data->sampling_rate = cpufreq_policy_transition_delay_us(policy);

>

>         if (!have_governor_per_policy())

>                 gov->gdbs_data = dbs_data;

> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h

> index 00e4c40a3249..14f0ab61ed17 100644

> --- a/include/linux/cpufreq.h

> +++ b/include/linux/cpufreq.h

> @@ -532,6 +532,21 @@ static inline void cpufreq_policy_apply_limits(struct cpufreq_policy *policy)

>                 __cpufreq_driver_target(policy, policy->min, CPUFREQ_RELATION_L);

>  }

>

> +static inline unsigned int

> +cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)

> +{

> +       unsigned int delay_us = LATENCY_MULTIPLIER, latency;

> +

> +       if (policy->transition_delay_us)

> +               return policy->transition_delay_us;

> +

> +       latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;

> +       if (latency)

> +               delay_us *= latency;

> +

> +       return delay_us;

> +}


Not in the header, please, and I don't think you need delay_us:

    latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
    if (latency)
        return latency * LATENCY_MULTIPLIER;

    return LATENCY_MULTIPLIER;

Thanks,
Rafael

Dominik Brodowski July 14, 2017, 7:01 a.m. UTC | #3

On Thu, Jul 13, 2017 at 06:19:53PM +0200, Rafael J. Wysocki wrote:
> On Thu, Jul 13, 2017 at 7:40 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:

> > There is no limitation in the ondemand or conservative governors which

> > disallow the transition_latency to be greater than 10 ms.

> >

> > The max_transition_latency field is rather used to disallow automatic

> > dynamic frequency switching for platforms which didn't wanted these

> > governors to run.

> >

> > Replace max_transition_latency with a boolean (dynamic_switching) and

> > check for transition_latency == CPUFREQ_ETERNAL along with that. This

> > makes it pretty straight forward to read/understand now.

> 

> Well, using CPUFREQ_ETERNAL for that on the driver side is still not

> particularly straightforward IMO, so maybe add a

> "no_dynamic_switching" to the driver structure and set it to "true"

> for the one driver in question?


IIRC it's not just one driver which sets the latency to CPUFREQ_ETERNAL, and
where dynamic switching might be harmful or at least lead to undefined
behavior.

Best,
	Dominik

Rafael J. Wysocki July 14, 2017, 11:11 a.m. UTC | #4

On Fri, Jul 14, 2017 at 9:01 AM, Dominik Brodowski
<linux@dominikbrodowski.net> wrote:
> On Thu, Jul 13, 2017 at 06:19:53PM +0200, Rafael J. Wysocki wrote:

>> On Thu, Jul 13, 2017 at 7:40 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:

>> > There is no limitation in the ondemand or conservative governors which

>> > disallow the transition_latency to be greater than 10 ms.

>> >

>> > The max_transition_latency field is rather used to disallow automatic

>> > dynamic frequency switching for platforms which didn't wanted these

>> > governors to run.

>> >

>> > Replace max_transition_latency with a boolean (dynamic_switching) and

>> > check for transition_latency == CPUFREQ_ETERNAL along with that. This

>> > makes it pretty straight forward to read/understand now.

>>

>> Well, using CPUFREQ_ETERNAL for that on the driver side is still not

>> particularly straightforward IMO, so maybe add a

>> "no_dynamic_switching" to the driver structure and set it to "true"

>> for the one driver in question?

>

> IIRC it's not just one driver which sets the latency to CPUFREQ_ETERNAL, and

> where dynamic switching might be harmful or at least lead to undefined

> behavior.


OK

Still, though, using CPUFREQ_ETERNAL to indicate the "no dynamic
switching" condition is somewhat convoluted, so why don't we have a
flag to *explicitly* say that instead?

Do you know which drivers they are or is it just all drivers that use
CPUFREQ_ETERNAL?

Thanks,
Rafael

Rafael J. Wysocki July 14, 2017, 10:06 p.m. UTC | #5

On Friday, July 14, 2017 01:11:58 PM Rafael J. Wysocki wrote:
> On Fri, Jul 14, 2017 at 9:01 AM, Dominik Brodowski

> <linux@dominikbrodowski.net> wrote:

> > On Thu, Jul 13, 2017 at 06:19:53PM +0200, Rafael J. Wysocki wrote:

> >> On Thu, Jul 13, 2017 at 7:40 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:

> >> > There is no limitation in the ondemand or conservative governors which

> >> > disallow the transition_latency to be greater than 10 ms.

> >> >

> >> > The max_transition_latency field is rather used to disallow automatic

> >> > dynamic frequency switching for platforms which didn't wanted these

> >> > governors to run.

> >> >

> >> > Replace max_transition_latency with a boolean (dynamic_switching) and

> >> > check for transition_latency == CPUFREQ_ETERNAL along with that. This

> >> > makes it pretty straight forward to read/understand now.

> >>

> >> Well, using CPUFREQ_ETERNAL for that on the driver side is still not

> >> particularly straightforward IMO, so maybe add a

> >> "no_dynamic_switching" to the driver structure and set it to "true"

> >> for the one driver in question?

> >

> > IIRC it's not just one driver which sets the latency to CPUFREQ_ETERNAL, and

> > where dynamic switching might be harmful or at least lead to undefined

> > behavior.

> 

> OK

> 

> Still, though, using CPUFREQ_ETERNAL to indicate the "no dynamic

> switching" condition is somewhat convoluted, so why don't we have a

> flag to *explicitly* say that instead?

> 

> Do you know which drivers they are or is it just all drivers that use

> CPUFREQ_ETERNAL?


Well, after the $subject patch it effectively is all drivers that use
CPUFREQ_ETERNAL anyway, so it looks like we actually can do a complete
switch-over.

Thanks,
Rafael

Dominik Brodowski July 15, 2017, 5:08 a.m. UTC | #6

On Sat, Jul 15, 2017 at 12:06:24AM +0200, Rafael J. Wysocki wrote:
> On Friday, July 14, 2017 01:11:58 PM Rafael J. Wysocki wrote:

> > On Fri, Jul 14, 2017 at 9:01 AM, Dominik Brodowski

> > <linux@dominikbrodowski.net> wrote:

> > > On Thu, Jul 13, 2017 at 06:19:53PM +0200, Rafael J. Wysocki wrote:

> > >> On Thu, Jul 13, 2017 at 7:40 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:

> > >> > There is no limitation in the ondemand or conservative governors which

> > >> > disallow the transition_latency to be greater than 10 ms.

> > >> >

> > >> > The max_transition_latency field is rather used to disallow automatic

> > >> > dynamic frequency switching for platforms which didn't wanted these

> > >> > governors to run.

> > >> >

> > >> > Replace max_transition_latency with a boolean (dynamic_switching) and

> > >> > check for transition_latency == CPUFREQ_ETERNAL along with that. This

> > >> > makes it pretty straight forward to read/understand now.

> > >>

> > >> Well, using CPUFREQ_ETERNAL for that on the driver side is still not

> > >> particularly straightforward IMO, so maybe add a

> > >> "no_dynamic_switching" to the driver structure and set it to "true"

> > >> for the one driver in question?

> > >

> > > IIRC it's not just one driver which sets the latency to CPUFREQ_ETERNAL, and

> > > where dynamic switching might be harmful or at least lead to undefined

> > > behavior.

> > 

> > OK

> > 

> > Still, though, using CPUFREQ_ETERNAL to indicate the "no dynamic

> > switching" condition is somewhat convoluted, so why don't we have a

> > flag to *explicitly* say that instead?

> > 

> > Do you know which drivers they are or is it just all drivers that use

> > CPUFREQ_ETERNAL?

> 

> Well, after the $subject patch it effectively is all drivers that use

> CPUFREQ_ETERNAL anyway, so it looks like we actually can do a complete

> switch-over.


Exactly. But lets take a quick look at the drivers ussing CPUFREQ_ETERNAL:

Using CPUFREQ_ETERNAL, as policy-setting drivers:

- intel_pstate.c - for the intel_pstate driver, which defers to the hardware
  to do frequency selection.

- longrun.c - hardware-based frequency selection.

=> Those drivers are not interested in kernel-based dynamic frequency
selection anyway.


Using CPUFREQ_ETERNAL as a fallback if transition_latency is unknown:
- arm_big_little.c
- arm_big_little_dt.c
- cpufreq-dt.c
- imx6q-cpufreq.c
- spear_cpufreq.c

=> As it seems to be an error case, it seems best to bail out on the
safe side and disallow dynamic frequency scaling. Platform experts might
know better, though.


Using CPUFREQ_ETERNAL unconditionally:
- cpufreq-nforce2.c - over a decade old driver; has a commented-out hack
  to mdelay(10ms) after each frequency transition. This smells like it might
  be unsafe to do dynamic switching more often than that.

- elanfreq.c - Has udelay(1ms+10ms) in transition path, so the same terms
  and conditions seem to apply.

- gx-suspmod.c - works by a mechanism which reminds me of CPU frequency
  throttling, but chipset- and not CPU-based.

- pmac32-cpufreq.c - for some models, it sets latency to ETERNAL. In some
  frequency switchign code, it has mdelay(10ms) calls.

- speedsstep-smi.c - this case was discussed previously.

=> For those drivers, dynamic frequency scaling should not be enabled IMO.


- sa1100-cpufreq.c and
- sa1110-cpufreq.c - If I remember correctly, those drivers were used for
  fast user-space based frequency scaling in the past. 

=> For these two drivers, enabling DFVS might be an option.


- sh-cpufreq.c - looks fast, but I have no clue.

- unicore2-cpufreq.c - same.

=> For those drivers, I have no clue. So to be on the safe side, I'd opt for
dynamic frequency scaling to be set to off.


To summarize: At first, I'd propose a *complete* switch-over from
CPUFREQ_ETERNAL to setting the flag "no DVFS" you have proposed. Then, one
might discuss with the maintainers of individual drivers/platforms on
whether to relax this rule for a few of those drivers (sa11x0, sh-cpufreq,
unicore2-cpufreq and the drivers using CPUFREQ_ETERNAL as a fallback).

Best,
	Dominik

Rafael J. Wysocki July 15, 2017, 12:26 p.m. UTC | #7

On Saturday, July 15, 2017 07:08:28 AM Dominik Brodowski wrote:
> On Sat, Jul 15, 2017 at 12:06:24AM +0200, Rafael J. Wysocki wrote:

> > On Friday, July 14, 2017 01:11:58 PM Rafael J. Wysocki wrote:

> > > On Fri, Jul 14, 2017 at 9:01 AM, Dominik Brodowski

> > > <linux@dominikbrodowski.net> wrote:

> > > > On Thu, Jul 13, 2017 at 06:19:53PM +0200, Rafael J. Wysocki wrote:

> > > >> On Thu, Jul 13, 2017 at 7:40 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:

> > > >> > There is no limitation in the ondemand or conservative governors which

> > > >> > disallow the transition_latency to be greater than 10 ms.

> > > >> >

> > > >> > The max_transition_latency field is rather used to disallow automatic

> > > >> > dynamic frequency switching for platforms which didn't wanted these

> > > >> > governors to run.

> > > >> >

> > > >> > Replace max_transition_latency with a boolean (dynamic_switching) and

> > > >> > check for transition_latency == CPUFREQ_ETERNAL along with that. This

> > > >> > makes it pretty straight forward to read/understand now.

> > > >>

> > > >> Well, using CPUFREQ_ETERNAL for that on the driver side is still not

> > > >> particularly straightforward IMO, so maybe add a

> > > >> "no_dynamic_switching" to the driver structure and set it to "true"

> > > >> for the one driver in question?

> > > >

> > > > IIRC it's not just one driver which sets the latency to CPUFREQ_ETERNAL, and

> > > > where dynamic switching might be harmful or at least lead to undefined

> > > > behavior.

> > > 

> > > OK

> > > 

> > > Still, though, using CPUFREQ_ETERNAL to indicate the "no dynamic

> > > switching" condition is somewhat convoluted, so why don't we have a

> > > flag to *explicitly* say that instead?

> > > 

> > > Do you know which drivers they are or is it just all drivers that use

> > > CPUFREQ_ETERNAL?

> > 

> > Well, after the $subject patch it effectively is all drivers that use

> > CPUFREQ_ETERNAL anyway, so it looks like we actually can do a complete

> > switch-over.

> 

> Exactly. But lets take a quick look at the drivers ussing CPUFREQ_ETERNAL:

> 

> Using CPUFREQ_ETERNAL, as policy-setting drivers:

> 

> - intel_pstate.c - for the intel_pstate driver, which defers to the hardware

>   to do frequency selection.

> 

> - longrun.c - hardware-based frequency selection.


That may or may not be hardware-based, but if the ->setpolicy callback is
present, transition_latency doesn't matter anyway.

> => Those drivers are not interested in kernel-based dynamic frequency

> selection anyway.


Right.

> 

> Using CPUFREQ_ETERNAL as a fallback if transition_latency is unknown:

> - arm_big_little.c

> - arm_big_little_dt.c

> - cpufreq-dt.c

> - imx6q-cpufreq.c

> - spear_cpufreq.c

> 

> => As it seems to be an error case, it seems best to bail out on the

> safe side and disallow dynamic frequency scaling. Platform experts might

> know better, though.

> 


Well, Viresh should know what to do for some of them at least. :-)

> Using CPUFREQ_ETERNAL unconditionally:

> - cpufreq-nforce2.c - over a decade old driver; has a commented-out hack

>   to mdelay(10ms) after each frequency transition. This smells like it might

>   be unsafe to do dynamic switching more often than that.

> 

> - elanfreq.c - Has udelay(1ms+10ms) in transition path, so the same terms

>   and conditions seem to apply.

> 

> - gx-suspmod.c - works by a mechanism which reminds me of CPU frequency

>   throttling, but chipset- and not CPU-based.

> 

> - pmac32-cpufreq.c - for some models, it sets latency to ETERNAL. In some

>   frequency switchign code, it has mdelay(10ms) calls.

> 

> - speedsstep-smi.c - this case was discussed previously.

> 

> => For those drivers, dynamic frequency scaling should not be enabled IMO.


Agreed.

> - sa1100-cpufreq.c and

> - sa1110-cpufreq.c - If I remember correctly, those drivers were used for

>   fast user-space based frequency scaling in the past. 

> 

> => For these two drivers, enabling DFVS might be an option.

> 


OK, I'm not familiar with these.

> - sh-cpufreq.c - looks fast, but I have no clue.

> 

> - unicore2-cpufreq.c - same.

> 

> => For those drivers, I have no clue. So to be on the safe side, I'd opt for

> dynamic frequency scaling to be set to off.

> 


Agreed.

> To summarize: At first, I'd propose a *complete* switch-over from

> CPUFREQ_ETERNAL to setting the flag "no DVFS" you have proposed.


So we seem to be in agreement over this.

> Then, one might discuss with the maintainers of individual drivers/platforms on

> whether to relax this rule for a few of those drivers (sa11x0, sh-cpufreq,

> unicore2-cpufreq and the drivers using CPUFREQ_ETERNAL as a fallback).


Right.

Thanks,
Rafael

Viresh Kumar July 17, 2017, 11:58 a.m. UTC | #8

On 15-07-17, 14:26, Rafael J. Wysocki wrote:
> On Saturday, July 15, 2017 07:08:28 AM Dominik Brodowski wrote:


> > Exactly. But lets take a quick look at the drivers ussing CPUFREQ_ETERNAL:

> > 

> > Using CPUFREQ_ETERNAL, as policy-setting drivers:

> > 

> > - intel_pstate.c - for the intel_pstate driver, which defers to the hardware

> >   to do frequency selection.

> > 

> > - longrun.c - hardware-based frequency selection.

> 

> That may or may not be hardware-based, but if the ->setpolicy callback is

> present, transition_latency doesn't matter anyway.


Right and to avoid confusion its probably better to avoid setting
transition_latency completely from them. I will try to include that in
my series.

> > => Those drivers are not interested in kernel-based dynamic frequency

> > selection anyway.

> 

> Right.

> 

> > 

> > Using CPUFREQ_ETERNAL as a fallback if transition_latency is unknown:

> > - arm_big_little.c

> > - arm_big_little_dt.c

> > - cpufreq-dt.c

> > - imx6q-cpufreq.c

> > - spear_cpufreq.c

> > 

> > => As it seems to be an error case, it seems best to bail out on the

> > safe side and disallow dynamic frequency scaling. Platform experts might

> > know better, though.

> > 

> 

> Well, Viresh should know what to do for some of them at least. :-)


Yeah, they just don't know how much time it takes to change the
frequency. We shouldn't disallow DVFS for them.

> > Using CPUFREQ_ETERNAL unconditionally:

> > - cpufreq-nforce2.c - over a decade old driver; has a commented-out hack

> >   to mdelay(10ms) after each frequency transition. This smells like it might

> >   be unsafe to do dynamic switching more often than that.

> > 

> > - elanfreq.c - Has udelay(1ms+10ms) in transition path, so the same terms

> >   and conditions seem to apply.

> > 

> > - gx-suspmod.c - works by a mechanism which reminds me of CPU frequency

> >   throttling, but chipset- and not CPU-based.

> > 

> > - pmac32-cpufreq.c - for some models, it sets latency to ETERNAL. In some

> >   frequency switchign code, it has mdelay(10ms) calls.

> > 

> > - speedsstep-smi.c - this case was discussed previously.

> > 

> > => For those drivers, dynamic frequency scaling should not be enabled IMO.

> 

> Agreed.


+1 and these are the drivers which should have this new variable set
to avoid DVFS.

Anyone who is setting CPUFREQ_ETERNAL unconditionally should be
setting the new flag.

> > To summarize: At first, I'd propose a *complete* switch-over from

> > CPUFREQ_ETERNAL to setting the flag "no DVFS" you have proposed.

> 

> So we seem to be in agreement over this.


The usage of CPUFREQ_ETERNAL had been confusing over the years, i.e.
some use it to not allow ondemand/conservative, while others use it as
they don't know their transition latencies.

A complete switch over may not be good for the later.

I would suggest we only move the platforms which set latency to
CPUFREQ_ETERNAL unconditionally to the "no DVFS" list. And everyone
else can still continue with CPUFREQ_ETERNAL. I have earlier proposed
finding their latencies dynamically and will try to include that for
them.

-- 
viresh

Rafael J. Wysocki July 17, 2017, 12:18 p.m. UTC | #9

On Monday, July 17, 2017 05:28:51 PM Viresh Kumar wrote:
> On 15-07-17, 14:26, Rafael J. Wysocki wrote:

> > On Saturday, July 15, 2017 07:08:28 AM Dominik Brodowski wrote:

> 

> > > Exactly. But lets take a quick look at the drivers ussing CPUFREQ_ETERNAL:

> > > 

> > > Using CPUFREQ_ETERNAL, as policy-setting drivers:

> > > 

> > > - intel_pstate.c - for the intel_pstate driver, which defers to the hardware

> > >   to do frequency selection.

> > > 

> > > - longrun.c - hardware-based frequency selection.

> > 

> > That may or may not be hardware-based, but if the ->setpolicy callback is

> > present, transition_latency doesn't matter anyway.

> 

> Right and to avoid confusion its probably better to avoid setting

> transition_latency completely from them. I will try to include that in

> my series.

> 

> > > => Those drivers are not interested in kernel-based dynamic frequency

> > > selection anyway.

> > 

> > Right.

> > 

> > > 

> > > Using CPUFREQ_ETERNAL as a fallback if transition_latency is unknown:

> > > - arm_big_little.c

> > > - arm_big_little_dt.c

> > > - cpufreq-dt.c

> > > - imx6q-cpufreq.c

> > > - spear_cpufreq.c

> > > 

> > > => As it seems to be an error case, it seems best to bail out on the

> > > safe side and disallow dynamic frequency scaling. Platform experts might

> > > know better, though.

> > > 

> > 

> > Well, Viresh should know what to do for some of them at least. :-)

> 

> Yeah, they just don't know how much time it takes to change the

> frequency. We shouldn't disallow DVFS for them.

> 

> > > Using CPUFREQ_ETERNAL unconditionally:

> > > - cpufreq-nforce2.c - over a decade old driver; has a commented-out hack

> > >   to mdelay(10ms) after each frequency transition. This smells like it might

> > >   be unsafe to do dynamic switching more often than that.

> > > 

> > > - elanfreq.c - Has udelay(1ms+10ms) in transition path, so the same terms

> > >   and conditions seem to apply.

> > > 

> > > - gx-suspmod.c - works by a mechanism which reminds me of CPU frequency

> > >   throttling, but chipset- and not CPU-based.

> > > 

> > > - pmac32-cpufreq.c - for some models, it sets latency to ETERNAL. In some

> > >   frequency switchign code, it has mdelay(10ms) calls.

> > > 

> > > - speedsstep-smi.c - this case was discussed previously.

> > > 

> > > => For those drivers, dynamic frequency scaling should not be enabled IMO.

> > 

> > Agreed.

> 

> +1 and these are the drivers which should have this new variable set

> to avoid DVFS.

> 

> Anyone who is setting CPUFREQ_ETERNAL unconditionally should be

> setting the new flag.

> 

> > > To summarize: At first, I'd propose a *complete* switch-over from

> > > CPUFREQ_ETERNAL to setting the flag "no DVFS" you have proposed.

> > 

> > So we seem to be in agreement over this.

> 

> The usage of CPUFREQ_ETERNAL had been confusing over the years, i.e.

> some use it to not allow ondemand/conservative, while others use it as

> they don't know their transition latencies.


So if we can identify these, we can introduce something like
CPUFREQ_LATENCY_UNKNOWN for them and get rid of CPUFREQ_ETERNAL.

> A complete switch over may not be good for the later.

> 

> I would suggest we only move the platforms which set latency to

> CPUFREQ_ETERNAL unconditionally to the "no DVFS" list. And everyone

> else can still continue with CPUFREQ_ETERNAL. I have earlier proposed

> finding their latencies dynamically and will try to include that for

> them.


I have to admint I'm not a super-fan of that (mostly because I tried to do
something similar in the past for genpd, but that didn't work out well IMO),
but as long as that is done in drivers and not in the core, I can live with it.

Thanks,
Rafael

[RFC,V2,0/6] cpufreq: transition-latency cleanups

Message

Comments