[RFC] cpufreq: send notifications for intermediate (stable) frequencies

Message ID 25738681139c04272b6d2ebeff244c6d36c893f7.1400133090.git.viresh.kumar@linaro.org
State New

Commit Message

Viresh Kumar May 15, 2014, 5:56 a.m. UTC
Douglas Anderson recently pointed out an interesting problem that caused his
udelay() to expire earlier than it should:
https://lkml.org/lkml/2014/5/13/766

While transitioning between frequencies, a few platforms may temporarily switch
to a stable frequency while waiting for the main PLL to stabilize.

For example, when we transition between very low frequencies on exynos, such as
between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
No CPUFREQ notification is sent for that. That means there's a period of time
when we're running at 800MHz but loops_per_jiffy is calibrated for somewhere
between 200MHz and 300MHz, so udelay() returns too early.
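
To put numbers on this, here is a minimal sketch in plain C (not kernel code).
It assumes a loop-based udelay() whose loop count scales linearly with the
calibrated frequency, so the real delay shrinks by calibrated/actual.
actual_delay_us() and its parameters are hypothetical names used only for
illustration.

```c
#include <assert.h>

/*
 * Illustration only: how long a loop-based delay really lasts when the
 * loop count was computed for calibrated_khz but the CPU runs at
 * actual_khz. Assumes delay time is inversely proportional to CPU speed.
 */
static unsigned long long actual_delay_us(unsigned long long requested_us,
					  unsigned long long calibrated_khz,
					  unsigned long long actual_khz)
{
	assert(actual_khz > 0);
	return requested_us * calibrated_khz / actual_khz;
}
```

Under these assumptions, a 100us delay calibrated at 200MHz lasts only about
25us while the CPU is running from the 800MHz PLL.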

To fix this in a generic way, let's introduce another callback, safe_freq(),
for the cpufreq drivers.

safe_freq() should return a stable intermediate frequency that a platform might
want to switch to before jumping to the frequency corresponding to 'index'. The
core will send the 'PRE' notification for this 'stable' frequency and the 'POST'
notification for the 'target' frequency. However, if ->target_index() fails, the
core will send 'POST' for the 'stable' frequency only.

Drivers must send the 'POST' notification for the 'stable' freq and the 'PRE'
notification for the 'target' freq. If they can't switch to the target
frequency, they don't need to send any notification.
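
The split of notifications described above can be modelled in user space as
follows. This is only a sketch of the ordering, not the kernel implementation:
notify(), driver_target_index() and core_set_target() are hypothetical
stand-ins, and the 'fail' flag simulates ->target_index() returning an error.

```c
#include <assert.h>
#include <stddef.h>

enum phase { PRE, POST };
enum sender { CORE, DRIVER };

struct event {
	enum phase phase;
	enum sender sender;
	unsigned int khz;	/* frequency the notification is about */
};

#define MAX_EVENTS 8
static struct event log_buf[MAX_EVENTS];
static size_t log_len;

static void notify(enum phase phase, enum sender sender, unsigned int khz)
{
	assert(log_len < MAX_EVENTS);
	log_buf[log_len++] = (struct event){ phase, sender, khz };
}

/* Stand-in for the driver's ->target_index(). */
static int driver_target_index(unsigned int stable_khz,
			       unsigned int target_khz, int fail)
{
	if (fail)
		return -1;	/* driver sends nothing on failure */
	notify(POST, DRIVER, stable_khz);
	notify(PRE, DRIVER, target_khz);
	return 0;
}

/* Stand-in for the core: PRE for 'stable', POST for 'target' or 'stable'. */
static int core_set_target(unsigned int stable_khz,
			   unsigned int target_khz, int fail)
{
	int ret;

	log_len = 0;
	notify(PRE, CORE, stable_khz);
	ret = driver_target_index(stable_khz, target_khz, fail);
	notify(POST, CORE, ret ? stable_khz : target_khz);
	return ret;
}
```

In this model a successful switch produces four notifications (two from the
core, two from the driver), and a failed one produces only the core's PRE and
POST for the 'stable' frequency.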

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
Doug/Stephen,

If this doesn't look too ugly, then I would need patches from you to fix your
platforms, as I am not familiar with the clk hierarchy of your platforms.

 drivers/cpufreq/cpufreq.c | 13 +++++++++++--
 include/linux/cpufreq.h   | 18 ++++++++++++++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

Comments

Doug Anderson May 15, 2014, 6:13 p.m. UTC | #1
Viresh,

On Wed, May 14, 2014 at 10:56 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Douglas Anderson, recently pointed out an interesting problem due to which his
> udelay() was expiring earlier than it should:
> https://lkml.org/lkml/2014/5/13/766
>
> While transitioning between frequencies few platforms may temporarily switch to
> a stable frequency, waiting for the main PLL to stabilize.
>
> For example: When we transition between very low frequencies on exynos, like
> between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
> No CPUFREQ notification is sent for that. That means there's a period of time
> when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
> and 300MHz. And so udelay behaves badly.
>
> To get this fixed in a generic way, lets introduce another callback safe_freq()
> for the cpufreq drivers.
>
> safe_freq() should return a stable intermediate frequency a platform might want
> to switch to, before jumping to the frequency corresponding to 'index'. Core
> will send the 'PRE' notification for this 'stable' frequency and 'POST' for the
> 'target' frequency. Though if ->target_index() fails, it will handle POST for
> 'stable' frequency only.
>
> Drivers must send 'POST' notification for 'stable' freq and 'PRE' for 'target'
> freq. If they can't switch to target frequency, they don't need to send any
> notification.

This will have the side effect of sending twice as many notifications.
 ...however it does allow for people registering for CPUFREQ
notifications to be more generic...

Thinking about it, I think you're right that this is the way to go.
The majority of the registrants of CPUFREQ that I see really ought to
be moved to common clock notifications (they are dealing with the fact
that a peripheral clock will get scaled as a side effect of CPUFREQ).
What's left is only a very small number of cases that would most
cleanly be dealt with by just seeing the extra notification.


> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> Doug/Stephen,
>
> If this doesn't look too ugly, then I would need patches from you to fix your
> platforms as I am not well aware of clk hierarchy of your platforms.

It probably makes sense to wait until Thomas Abraham's patch lands,
since he's redoing exynos cpufreq to use cpufreq-cpu0.  ...and maybe
Thomas would be willing to write this patch?


>  drivers/cpufreq/cpufreq.c | 13 +++++++++++--
>  include/linux/cpufreq.h   | 18 ++++++++++++++++++
>  2 files changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index a05c921..8d1cb4f 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1874,11 +1874,17 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
>
>                 if (notify) {
>                         freqs.old = policy->cur;
> -                       freqs.new = freq_table[index].frequency;
> +                       /* Switch to some safe intermediate freq */
> +                       if (cpufreq_driver->safe_freq)

What do you think about calling this get_safe_freq()?  It took me a
little while before I realized that this function didn't perform the
transition to the safe frequency--it just returned it.

...the comment adds extra confusion since it makes it sound like the
switch happens right here.


> +                               freqs.new = cpufreq_driver->safe_freq(policy,
> +                                                                     index);
> +                       else
> +                               freqs.new = freq_table[index].frequency;
>                         freqs.flags = 0;
>
>                         pr_debug("%s: cpu: %d, oldfreq: %u, new freq: %u\n",
> -                                __func__, policy->cpu, freqs.old, freqs.new);
> +                                __func__, policy->cpu, freqs.old,
> +                                freq_table[index].frequency);
>
>                         cpufreq_freq_transition_begin(policy, &freqs);
>                 }
> @@ -1887,6 +1893,9 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
>                 if (retval)
>                         pr_err("%s: Failed to change cpu frequency: %d\n",
>                                __func__, retval);
> +               else
> +                       /* Send POST notification for the target frequency */
> +                       freqs.new = freq_table[index].frequency;

Don't you need to set freqs.old to the safe_freq?

-Doug
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren May 15, 2014, 7:17 p.m. UTC | #2
On 05/14/2014 11:56 PM, Viresh Kumar wrote:
> Douglas Anderson, recently pointed out an interesting problem due to which his
> udelay() was expiring earlier than it should:
> https://lkml.org/lkml/2014/5/13/766
> 
> While transitioning between frequencies few platforms may temporarily switch to
> a stable frequency, waiting for the main PLL to stabilize.
> 
> For example: When we transition between very low frequencies on exynos, like
> between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
> No CPUFREQ notification is sent for that. That means there's a period of time
> when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
> and 300MHz. And so udelay behaves badly.
> 
> To get this fixed in a generic way, lets introduce another callback safe_freq()
> for the cpufreq drivers.
> 
> safe_freq() should return a stable intermediate frequency a platform might want
> to switch to, before jumping to the frequency corresponding to 'index'. Core
> will send the 'PRE' notification for this 'stable' frequency and 'POST' for the
> 'target' frequency. Though if ->target_index() fails, it will handle POST for
> 'stable' frequency only.
> 
> Drivers must send 'POST' notification for 'stable' freq and 'PRE' for 'target'
> freq. If they can't switch to target frequency, they don't need to send any
> notification.

This seems rather complex. Can't either the driver or the cpufreq core
be responsible for all of the notifications? Otherwise, the logic gets
rather complex, and spread between the core and the driver.

Perhaps the core should make separate calls into the driver to switch to
the temporary frequency and the final frequency, so it can manage all
the notifications. Probably best to use a separate function pointer for
the temporary change so the driver can easily know what it's doing.
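
A rough user-space sketch of that idea, with the core issuing every
notification and calling into the driver once per step. All names here
(driver_ops, get_intermediate, set_intermediate, set_target, core_switch) are
hypothetical and chosen only for illustration; none of them come from the
patch.

```c
#include <assert.h>
#include <stddef.h>

enum phase { PRE, POST };

struct note { enum phase phase; unsigned int old_khz, new_khz; };

/* Hypothetical driver interface: the driver never sends notifications. */
struct driver_ops {
	unsigned int (*get_intermediate)(unsigned int target_khz); /* 0 = none */
	int (*set_intermediate)(unsigned int khz);
	int (*set_target)(unsigned int khz);
};

#define MAX_NOTES 8
static struct note notes[MAX_NOTES];
static size_t nr_notes;
static unsigned int cur_khz = 200000;

static void notify(enum phase phase, unsigned int old_khz, unsigned int new_khz)
{
	assert(nr_notes < MAX_NOTES);
	notes[nr_notes++] = (struct note){ phase, old_khz, new_khz };
}

/* One PRE/POST pair issued by the core around one driver call. */
static int core_step(int (*set)(unsigned int khz), unsigned int new_khz)
{
	unsigned int old_khz = cur_khz;
	int ret;

	notify(PRE, old_khz, new_khz);
	ret = set(new_khz);
	if (!ret)
		cur_khz = new_khz;
	/* On failure, report that the CPU stayed at the old frequency. */
	notify(POST, old_khz, cur_khz);
	return ret;
}

static int core_switch(const struct driver_ops *ops, unsigned int target_khz)
{
	unsigned int mid = ops->get_intermediate ?
			   ops->get_intermediate(target_khz) : 0;
	int ret;

	assert(ops->set_target);
	if (mid) {
		assert(ops->set_intermediate);
		ret = core_step(ops->set_intermediate, mid);
		if (ret)
			return ret;
	}
	return core_step(ops->set_target, target_khz);
}

/* Demo driver that always parks the CPU on an 800MHz PLL first. */
static unsigned int demo_get_intermediate(unsigned int target_khz)
{
	(void)target_khz;
	return 800000;
}

static int demo_set(unsigned int khz)
{
	(void)khz;
	return 0;
}

static const struct driver_ops demo_ops = {
	.get_intermediate = demo_get_intermediate,
	.set_intermediate = demo_set,
	.set_target = demo_set,
};
```

With the demo driver, core_switch(&demo_ops, 300000) records two complete
PRE/POST pairs, 200MHz to 800MHz and then 800MHz to 300MHz, all sent from one
place.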
Doug Anderson May 15, 2014, 8:39 p.m. UTC | #3
Hi,

On Thu, May 15, 2014 at 12:17 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 05/14/2014 11:56 PM, Viresh Kumar wrote:
>> Douglas Anderson, recently pointed out an interesting problem due to which his
>> udelay() was expiring earlier than it should:
>> https://lkml.org/lkml/2014/5/13/766
>>
>> While transitioning between frequencies few platforms may temporarily switch to
>> a stable frequency, waiting for the main PLL to stabilize.
>>
>> For example: When we transition between very low frequencies on exynos, like
>> between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
>> No CPUFREQ notification is sent for that. That means there's a period of time
>> when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
>> and 300MHz. And so udelay behaves badly.
>>
>> To get this fixed in a generic way, lets introduce another callback safe_freq()
>> for the cpufreq drivers.
>>
>> safe_freq() should return a stable intermediate frequency a platform might want
>> to switch to, before jumping to the frequency corresponding to 'index'. Core
>> will send the 'PRE' notification for this 'stable' frequency and 'POST' for the
>> 'target' frequency. Though if ->target_index() fails, it will handle POST for
>> 'stable' frequency only.
>>
>> Drivers must send 'POST' notification for 'stable' freq and 'PRE' for 'target'
>> freq. If they can't switch to target frequency, they don't need to send any
>> notification.
>
> This seems rather complex. Can't either the driver or the cpufreq core
> be responsible for all of the notifications? Otherwise, the logic gets
> rather complex, and spread between the core and the driver.
>
> Perhaps the core should make separate calls into the driver to switch to
> the temporary frequency and the final frequency, so it can manage all
> the notifications. Probably best to use a separate function pointer for
> the temporary change so the driver can easily know what it's doing.

In the discussion about the exynos cpufreq redesign (atop
cpufreq-cpu0), it turns out that they've come up with a pretty
reasonable solution that also happens to solve our problem.  They
utilize an extra divider to make sure that the temporary PLL gets
divided down so that it's low enough.

It might mean that when going between 300 MHz and 500 MHz you will
transition through 400 MHz, but I'm quite OK with not sending out a
notification for that.

If something like that could work for tegra, then maybe we can drop
this whole thing and it will all just fix itself.  ;)

-Doug
Stephen Warren May 15, 2014, 8:51 p.m. UTC | #4
On 05/15/2014 02:39 PM, Doug Anderson wrote:
> Hi,
> 
> On Thu, May 15, 2014 at 12:17 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>> On 05/14/2014 11:56 PM, Viresh Kumar wrote:
>>> Douglas Anderson, recently pointed out an interesting problem due to which his
>>> udelay() was expiring earlier than it should:
>>> https://lkml.org/lkml/2014/5/13/766
>>>
>>> While transitioning between frequencies few platforms may temporarily switch to
>>> a stable frequency, waiting for the main PLL to stabilize.
>>>
>>> For example: When we transition between very low frequencies on exynos, like
>>> between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
>>> No CPUFREQ notification is sent for that. That means there's a period of time
>>> when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
>>> and 300MHz. And so udelay behaves badly.
>>>
>>> To get this fixed in a generic way, lets introduce another callback safe_freq()
>>> for the cpufreq drivers.
>>>
>>> safe_freq() should return a stable intermediate frequency a platform might want
>>> to switch to, before jumping to the frequency corresponding to 'index'. Core
>>> will send the 'PRE' notification for this 'stable' frequency and 'POST' for the
>>> 'target' frequency. Though if ->target_index() fails, it will handle POST for
>>> 'stable' frequency only.
>>>
>>> Drivers must send 'POST' notification for 'stable' freq and 'PRE' for 'target'
>>> freq. If they can't switch to target frequency, they don't need to send any
>>> notification.
>>
>> This seems rather complex. Can't either the driver or the cpufreq core
>> be responsible for all of the notifications? Otherwise, the logic gets
>> rather complex, and spread between the core and the driver.
>>
>> Perhaps the core should make separate calls into the driver to switch to
>> the temporary frequency and the final frequency, so it can manage all
>> the notifications. Probably best to use a separate function pointer for
>> the temporary change so the driver can easily know what it's doing.
> 
> In the discussion about the exynos cpufreq redesign (atop
> cpufreq-cpu0), it turns out that they've come up with a pretty
> reasonable solution that also happens to solve our problem.  They
> utilize an extra divider to make sure that the temporary PLL gets
> divided down so that it's low enough.
> 
> It might mean that going between 300 MHz and 500 MHz that you will
> transition through 400 MHz, but I'm quite OK with not sending out a
> notification for that.
> 
> If something like that could work for tegra, then maybe we can drop
> this whole thing and it will all just fix itself.  ;)

At least in the case of Tegra20 cpufreq, I don't think that will be
possible without changing the temporary clock source we use
(pll_p). The PLL that's used temporarily is also the root of all the
peripheral clocks, and hence can't be changed. We also only characterize
that PLL at the one specific frequency it was designed to run at.

That said, it looks like the CPU clock may support pll_p_out3 and 4 as
sources in addition to pll_p. I'm not sure if anything else uses those
divided pll_p outputs. Peter, perhaps you can comment? Also, since pll_p
itself runs at exactly 216MHz, pll_p_out3 and 4 can never go higher than
that, so we couldn't use this trick for transitions between two fast
clock rates. Note that Tegra20 does reparenting for any CPU clock rate
change, not just when changing to/from certain slow rates. None of the
other potential CPU clock parents seem any better.

It's possible that later Tegra SoCs have more freedom here, but I didn't
check.
Doug Anderson May 15, 2014, 8:58 p.m. UTC | #5
Stephen,

On Thu, May 15, 2014 at 1:51 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 05/15/2014 02:39 PM, Doug Anderson wrote:
>> Hi,
>>
>> On Thu, May 15, 2014 at 12:17 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
>>> On 05/14/2014 11:56 PM, Viresh Kumar wrote:
>>>> Douglas Anderson, recently pointed out an interesting problem due to which his
>>>> udelay() was expiring earlier than it should:
>>>> https://lkml.org/lkml/2014/5/13/766
>>>>
>>>> While transitioning between frequencies few platforms may temporarily switch to
>>>> a stable frequency, waiting for the main PLL to stabilize.
>>>>
>>>> For example: When we transition between very low frequencies on exynos, like
>>>> between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
>>>> No CPUFREQ notification is sent for that. That means there's a period of time
>>>> when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
>>>> and 300MHz. And so udelay behaves badly.
>>>>
>>>> To get this fixed in a generic way, lets introduce another callback safe_freq()
>>>> for the cpufreq drivers.
>>>>
>>>> safe_freq() should return a stable intermediate frequency a platform might want
>>>> to switch to, before jumping to the frequency corresponding to 'index'. Core
>>>> will send the 'PRE' notification for this 'stable' frequency and 'POST' for the
>>>> 'target' frequency. Though if ->target_index() fails, it will handle POST for
>>>> 'stable' frequency only.
>>>>
>>>> Drivers must send 'POST' notification for 'stable' freq and 'PRE' for 'target'
>>>> freq. If they can't switch to target frequency, they don't need to send any
>>>> notification.
>>>
>>> This seems rather complex. Can't either the driver or the cpufreq core
>>> be responsible for all of the notifications? Otherwise, the logic gets
>>> rather complex, and spread between the core and the driver.
>>>
>>> Perhaps the core should make separate calls into the driver to switch to
>>> the temporary frequency and the final frequency, so it can manage all
>>> the notifications. Probably best to use a separate function pointer for
>>> the temporary change so the driver can easily know what it's doing.
>>
>> In the discussion about the exynos cpufreq redesign (atop
>> cpufreq-cpu0), it turns out that they've come up with a pretty
>> reasonable solution that also happens to solve our problem.  They
>> utilize an extra divider to make sure that the temporary PLL gets
>> divided down so that it's low enough.
>>
>> It might mean that going between 300 MHz and 500 MHz that you will
>> transition through 400 MHz, but I'm quite OK with not sending out a
>> notification for that.
>>
>> If something like that could work for tegra, then maybe we can drop
>> this whole thing and it will all just fix itself.  ;)
>
> At least in the case of Tegra20 cpufreq, I don't think that will be
> possible at least without changing the temporary clock source we use
> (pll_p). The PLL that's use temporarily is also the root of all the
> peripheral clocks, and hence can't be changed. We also only characterize
> that PLL at the one specific frequency it was designed to run at.

It's interesting, in the exynos case they didn't change the PLL itself
but found an extra divider that I wasn't actually aware existed.  It
was located after the mux and before the cpu.


> That said, it looks like the CPU clock may support pll_p_out3 and 4 as
> sources in addition to pll_p. I'm not sure if anything else uses those
> divided pll_p outputs. Peter, perhaps you can comment? Also, since pll_p
> itself runs at exactly 216MHz, pll_p_out3 and 4 can never go higher than
> that, so we couldn't use this trick for transitions between two fast
> clock rates. Note that Tegra20 does reparenting for any CPU clock rate
> change, not just when changing to/from certain slow rates. None of the
> other potential CPU clock parents seem any better.

On exynos it won't necessarily transition to a frequency that's
between the start and end.  ...but at least with the trick mentioned
you can be sure that it's never _faster_ than either start or end.
The last I read through the code exynos always transitioned to a temp
PLL, though perhaps certain transitions could be optimized to avoid it
(if you're making a transition that doesn't need to relock).

...example transitions:
1.6 => 800 (temp) => 1.7
600 => 800 (temp) => 800
600 => 400 (temp) => 700
200 => 200 (temp) => 300
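
A small sketch of how such a divided-down temporary rate could be picked. It
assumes an 800 MHz temporary PLL, power-of-two dividers, and a limit equal to
the faster of the start and end rates; those assumptions happen to reproduce
the four transitions above, but the real exynos divider selection may differ.
temp_rate_khz() is a hypothetical helper, not an existing function.

```c
#include <assert.h>

/*
 * Pick the fastest pll_khz / 2^n that does not exceed the faster of the
 * old and new CPU rates. Illustration only; all rates are in kHz.
 */
static unsigned int temp_rate_khz(unsigned int pll_khz,
				  unsigned int old_khz,
				  unsigned int new_khz)
{
	unsigned int limit = old_khz > new_khz ? old_khz : new_khz;
	unsigned int div = 1;

	assert(pll_khz > 0 && limit > 0);
	while (pll_khz / div > limit)
		div *= 2;
	return pll_khz / div;
}
```

For example, temp_rate_khz(800000, 600000, 700000) gives 400000, matching the
"600 => 400 (temp) => 700" line.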

-Doug
Viresh Kumar May 16, 2014, 6:01 a.m. UTC | #6
On 15 May 2014 23:43, Doug Anderson <dianders@chromium.org> wrote:
> This will have the side effect of sending twice as many notifications.
>  ...however it does allow for people registering for CPUFREQ
> notifications to be more generic...

That's not a side effect of this approach but of the way platforms
handle the transition, and there is no way we can skip it. If we
did, we would leave room for the race to happen and udelay would
misbehave.

> Thinking about it, I think you're right that this is the way to go.

Correct :)

> It probably makes sense to wait until Thomas Abraham's patch lands,
> since he's redoing exynos cpufreq to use cpufreq-cpu0.  ...and maybe
> Thomas would be willing to write this patch?

But cpufreq-cpu0 doesn't handle this intermediate freq concept, so how will
you work around that? Anyway, we can get this working for tegra if
Stephen agrees.

> What do you think about calling this get_safe_freq().  It took me a
> little while before I realized that this function didn't perform the
> transition to the safe frequency--it just returned it.
>
> ...the comment adds extra confusion since it makes it sound like the
> switch happens right here.

Agree.

>> +                       /* Send POST notification for the target frequency */
>> +                       freqs.new = freq_table[index].frequency;
>
> Don't you need to set freqs.old to the safe_freq?

Agree..
Viresh Kumar May 16, 2014, 6:04 a.m. UTC | #7
On 16 May 2014 00:47, Stephen Warren <swarren@wwwdotorg.org> wrote:
> This seems rather complex. Can't either the driver or the cpufreq core
> be responsible for all of the notifications? Otherwise, the logic gets
> rather complex, and spread between the core and the driver.

I do agree with that, and that's why I added that 'ugly' statement.

> Perhaps the core should make separate calls into the driver to switch to
> the temporary frequency and the final frequency, so it can manage all
> the notifications. Probably best to use a separate function pointer for
> the temporary change so the driver can easily know what it's doing.

Hmm, that sounds like a much better approach. Let me try to code it.
Peter De Schrijver May 16, 2014, 9:28 a.m. UTC | #8
On Thu, May 15, 2014 at 10:58:26PM +0200, Doug Anderson wrote:
> Stephen,
> 
> On Thu, May 15, 2014 at 1:51 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> > On 05/15/2014 02:39 PM, Doug Anderson wrote:
> >> Hi,
> >>
> >> On Thu, May 15, 2014 at 12:17 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> >>> On 05/14/2014 11:56 PM, Viresh Kumar wrote:
> >>>> Douglas Anderson, recently pointed out an interesting problem due to which his
> >>>> udelay() was expiring earlier than it should:
> >>>> https://lkml.org/lkml/2014/5/13/766
> >>>>
> >>>> While transitioning between frequencies few platforms may temporarily switch to
> >>>> a stable frequency, waiting for the main PLL to stabilize.
> >>>>
> >>>> For example: When we transition between very low frequencies on exynos, like
> >>>> between 200MHz and 300MHz, we may temporarily switch to a PLL running at 800MHz.
> >>>> No CPUFREQ notification is sent for that. That means there's a period of time
> >>>> when we're running at 800MHz but loops_per_jiffy is calibrated at between 200MHz
> >>>> and 300MHz. And so udelay behaves badly.
> >>>>
> >>>> To get this fixed in a generic way, lets introduce another callback safe_freq()
> >>>> for the cpufreq drivers.
> >>>>
> >>>> safe_freq() should return a stable intermediate frequency a platform might want
> >>>> to switch to, before jumping to the frequency corresponding to 'index'. Core
> >>>> will send the 'PRE' notification for this 'stable' frequency and 'POST' for the
> >>>> 'target' frequency. Though if ->target_index() fails, it will handle POST for
> >>>> 'stable' frequency only.
> >>>>
> >>>> Drivers must send 'POST' notification for 'stable' freq and 'PRE' for 'target'
> >>>> freq. If they can't switch to target frequency, they don't need to send any
> >>>> notification.
> >>>
> >>> This seems rather complex. Can't either the driver or the cpufreq core
> >>> be responsible for all of the notifications? Otherwise, the logic gets
> >>> rather complex, and spread between the core and the driver.
> >>>
> >>> Perhaps the core should make separate calls into the driver to switch to
> >>> the temporary frequency and the final frequency, so it can manage all
> >>> the notifications. Probably best to use a separate function pointer for
> >>> the temporary change so the driver can easily know what it's doing.
> >>
> >> In the discussion about the exynos cpufreq redesign (atop
> >> cpufreq-cpu0), it turns out that they've come up with a pretty
> >> reasonable solution that also happens to solve our problem.  They
> >> utilize an extra divider to make sure that the temporary PLL gets
> >> divided down so that it's low enough.
> >>
> >> It might mean that going between 300 MHz and 500 MHz that you will
> >> transition through 400 MHz, but I'm quite OK with not sending out a
> >> notification for that.
> >>
> >> If something like that could work for tegra, then maybe we can drop
> >> this whole thing and it will all just fix itself.  ;)
> >
> > At least in the case of Tegra20 cpufreq, I don't think that will be
> > possible at least without changing the temporary clock source we use
> > (pll_p). The PLL that's use temporarily is also the root of all the
> > peripheral clocks, and hence can't be changed. We also only characterize
> > that PLL at the one specific frequency it was designed to run at.
> 
> It's interesting, in the exynos case they didn't change the PLL itself
> but found an extra divider that I wasn't actually aware existed.  It
> was located after the mux and before the cpu.
> 
> 

We do have a divider between the mux and clocksource as well, but we never use
it except from Tegra114 onwards for hw controlled thermal throttling (more as
an emergency measure in case sw fails to react in time). Tegra also has a
microsecond counter which could be used for udelay() on parts which don't have
the arch counter (Tegra20 and Tegra30). For Tegra114 and Tegra124 we use the
arch counter, so there shouldn't be a problem.
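
To illustrate why a counter-based udelay() is immune to CPU frequency changes,
here is a minimal user-space sketch. It assumes a free-running 32-bit
microsecond counter; read_us_fn, counter_udelay() and the fake counter are
hypothetical names and do not correspond to the actual Tegra code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t (*read_us_fn)(void);

/*
 * Busy-wait until at least 'usecs' microseconds have elapsed according
 * to the counter. The unsigned subtraction stays correct across counter
 * wrap-around, and nothing here depends on how fast the CPU runs.
 */
static void counter_udelay(read_us_fn read_us, uint32_t usecs)
{
	uint32_t start;

	assert(read_us);
	start = read_us();
	while ((uint32_t)(read_us() - start) < usecs)
		;
}

/* Fake counter for experimenting: advances 1us on every read. */
static uint32_t fake_now;

static uint32_t fake_read_us(void)
{
	return fake_now++;
}
```

Because the loop compares elapsed counter ticks rather than counting CPU
iterations, the delay length does not depend on loops_per_jiffy at all.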

Cheers,

Peter.

Patch

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a05c921..8d1cb4f 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1874,11 +1874,17 @@  int __cpufreq_driver_target(struct cpufreq_policy *policy,
 
 		if (notify) {
 			freqs.old = policy->cur;
-			freqs.new = freq_table[index].frequency;
+			/* Switch to some safe intermediate freq */
+			if (cpufreq_driver->safe_freq)
+				freqs.new = cpufreq_driver->safe_freq(policy,
+								      index);
+			else
+				freqs.new = freq_table[index].frequency;
 			freqs.flags = 0;
 
 			pr_debug("%s: cpu: %d, oldfreq: %u, new freq: %u\n",
-				 __func__, policy->cpu, freqs.old, freqs.new);
+				 __func__, policy->cpu, freqs.old,
+				 freq_table[index].frequency);
 
 			cpufreq_freq_transition_begin(policy, &freqs);
 		}
@@ -1887,6 +1893,9 @@  int __cpufreq_driver_target(struct cpufreq_policy *policy,
 		if (retval)
 			pr_err("%s: Failed to change cpu frequency: %d\n",
 			       __func__, retval);
+		else
+			/* Send POST notification for the target frequency */
+			freqs.new = freq_table[index].frequency;
 
 		if (notify)
 			cpufreq_freq_transition_end(policy, &freqs, retval);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 3f45889..b5ba275 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -226,6 +226,24 @@  struct cpufreq_driver {
 				 unsigned int relation);
 	int	(*target_index)	(struct cpufreq_policy *policy,
 				 unsigned int index);
+	/*
+	 * Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION
+	 * unset.
+	 *
+	 * safe_freq() should return a stable intermediate frequency a platform
+	 * might want to switch to, before jumping to the frequency
+	 * corresponding to 'index'. Core will send the 'PRE' notification for
+	 * this 'stable' frequency and 'POST' for the 'target' frequency. Though
+	 * if ->target_index() fails, it will handle POST for 'stable' frequency
+	 * only.
+	 *
+	 * Drivers must send 'POST' notification for 'stable' freq and 'PRE' for
+	 * 'target' freq. If they can't switch to target frequency, they don't
+	 * need to send any notification.
+	 *
+	 */
+	unsigned int (*safe_freq)(struct cpufreq_policy *policy,
+				  unsigned int index);
 
 	/* should be defined, if possible */
 	unsigned int	(*get)	(unsigned int cpu);