[13/13] cpufreq: make sure frequency transitions are serialized

Message ID CAKohpomGNSA4Pyg6Pn2SiawXPe=wMU7hx0Y7phjfNDV8J+JrLw@mail.gmail.com
State New

Commit Message

Viresh Kumar June 27, 2013, 4:56 a.m.
On 27 June 2013 03:27, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> Well, now, seeing that the locking around this seems to be kind of haphazard,
> I'm wondering what prevents two different threads from doing CPUFREQ_PRECHANGE
> concurrently in such a way that thread A will check transition_ongoing
> and thread B will check transition_ongoing and then both will set it if it
> was 'false' before.  And then one of them will trigger the WARN() in
> CPUFREQ_POSTCHANGE.
>
> Is there any protection in place and if so then how does it work?

cpufreq_notify_transition() is called from driver->target() which is
called from __cpufreq_driver_target(). __cpufreq_driver_target()
is called directly by governors and cpufreq_driver_target() otherwise.

cpufreq_driver_target() implements proper locking and so it is fine.
__cpufreq_driver_target() is called from governors, where calls are
serialized in the sense that two threads wouldn't call it at the same
time.
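
The race described in the question is a check-then-set on a plain flag.
Purely as an illustration, and not what this patch does (the patch relies
on callers already being serialized), here is a minimal userspace sketch
of making the check and the set a single atomic step with a C11
compare-and-exchange. The names transition_guard, transition_try_begin()
and transition_end() are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a transition-in-progress flag. */
struct transition_guard {
	atomic_bool ongoing;
};

/*
 * Try to start a transition. The compare-and-exchange tests for 'false'
 * and stores 'true' in one atomic step, so only one of two racing
 * callers can get 'true' back.
 */
static bool transition_try_begin(struct transition_guard *g)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&g->ongoing, &expected, true);
}

/* Finish a transition. Returns false if none was in progress. */
static bool transition_end(struct transition_guard *g)
{
	bool expected = true;

	return atomic_compare_exchange_strong(&g->ongoing, &expected, false);
}
```

With this shape, thread A and thread B cannot both observe 'false' and
both proceed: whichever loses the exchange sees a failed begin and can
back off.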

And so I thought this would work. But I just found a mistake in my code.
For multi-socket platforms with separate clock domains per socket/cluster,
a single instance of transition_ongoing isn't enough, so it must be
embedded in struct cpufreq_policy.
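
To make the per-policy point concrete, here is a small userspace sketch,
assuming one policy object per clock domain. struct policy_model and both
helpers are hypothetical stand-ins, not kernel code; only the field name
transition_ongoing is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, stripped-down stand-in for struct cpufreq_policy. */
struct policy_model {
	unsigned int cpu;        /* first CPU of the clock domain */
	bool transition_ongoing; /* one flag per clock domain */
};

/* Mark the start of a transition; false means one is already running. */
static bool policy_begin_transition(struct policy_model *p)
{
	if (p->transition_ongoing)
		return false;
	p->transition_ongoing = true;
	return true;
}

/* Mark the end of a transition; false means none was running. */
static bool policy_end_transition(struct policy_model *p)
{
	if (!p->transition_ongoing)
		return false;
	p->transition_ongoing = false;
	return true;
}
```

Two policies (say, one per cluster) can each be mid-transition at the
same time without tripping the check, which a single global flag would
wrongly report as an overlap.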

The patch below should fix this (also attached).

-------------x---------------------x-----------------

From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Wed, 19 Jun 2013 10:16:55 +0530
Subject: [PATCH] cpufreq: make sure frequency transitions are serialized

Whenever we change the frequency of a CPU, we call the PRECHANGE and
POSTCHANGE notifiers. They must be serialized, i.e. neither PRECHANGE nor
POSTCHANGE should be called twice in a row.

This can happen due to bugs in users of __cpufreq_driver_target() or in
the cpufreq drivers that send these notifications.

This patch adds some protection against this. We now keep track of the
last transition and check whether something went wrong.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 14 ++++++++++++++
 include/linux/cpufreq.h   |  1 +
 2 files changed, 15 insertions(+)

Comments

Rafael J. Wysocki June 27, 2013, 12:15 p.m. | #1
On Thursday, June 27, 2013 10:26:27 AM Viresh Kumar wrote:
> On 27 June 2013 03:27, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > Well, now, seeing that the locking around this seems to be kind of haphazard,
> > I'm wondering what prevents two different threads from doing CPUFREQ_PRECHANGE
> > concurrently in such a way that thread A will check transition_ongoing
> > and thread B will check transition_ongoing and then both will set it if it
> > was 'false' before.  And then one of them will trigger the WARN() in
> > CPUFREQ_POSTCHANGE.
> >
> > Is there any protection in place and if so then how does it work?
> 
> cpufreq_notify_transition() is called from driver->target() which is
> called from __cpufreq_driver_target(). __cpufreq_driver_target()
> is called directly by governors and cpufreq_driver_target() otherwise.
> 
> cpufreq_driver_target() implements proper locking and so it is fine.
> __cpufreq_driver_target() is called from governors, where calls are
> serialized in the sense that two threads wouldn't call it at the same
> time.
> 
> And so I thought this would work. But I just found a mistake in my code.
> For multi-socket platforms with separate clock domains per socket/cluster,
> a single instance of transition_ongoing isn't enough, so it must be
> embedded in struct cpufreq_policy.
> 
> The patch below should fix this (also attached).
> 
> -------------x---------------------x-----------------
> 
> From: Viresh Kumar <viresh.kumar@linaro.org>
> Date: Wed, 19 Jun 2013 10:16:55 +0530
> Subject: [PATCH] cpufreq: make sure frequency transitions are serialized
> 
> Whenever we change the frequency of a CPU, we call the PRECHANGE and
> POSTCHANGE notifiers. They must be serialized, i.e. neither PRECHANGE nor
> POSTCHANGE should be called twice in a row.
> 
> This can happen due to bugs in users of __cpufreq_driver_target() or in
> the cpufreq drivers that send these notifications.
> 
> This patch adds some protection against this. We now keep track of the
> last transition and check whether something went wrong.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

OK, queued up for 3.11.

Thanks,
Rafael


> ---
>  drivers/cpufreq/cpufreq.c | 14 ++++++++++++++
>  include/linux/cpufreq.h   |  1 +
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 2d53f47..75715f1 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -264,6 +264,12 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
>  	switch (state) {
> 
>  	case CPUFREQ_PRECHANGE:
> +		if (WARN(policy->transition_ongoing,
> +				"In middle of another frequency transition\n"))
> +			return;
> +
> +		policy->transition_ongoing = true;
> +
>  		/* detect if the driver reported a value as "old frequency"
>  		 * which is not equal to what the cpufreq core thinks is
>  		 * "old frequency".
> @@ -283,6 +289,12 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
>  		break;
> 
>  	case CPUFREQ_POSTCHANGE:
> +		if (WARN(!policy->transition_ongoing,
> +				"No frequency transition in progress\n"))
> +			return;
> +
> +		policy->transition_ongoing = false;
> +
>  		adjust_jiffies(CPUFREQ_POSTCHANGE, freqs);
>  		pr_debug("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
>  			(unsigned long)freqs->cpu);
> @@ -1458,6 +1470,8 @@ int __cpufreq_driver_target(struct cpufreq_policy *policy,
> 
>  	if (cpufreq_disabled())
>  		return -ENODEV;
> +	if (policy->transition_ongoing)
> +		return -EBUSY;
> 
>  	/* Make sure that target_freq is within supported range */
>  	if (target_freq > policy->max)
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 037d36a..8c13a45 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -115,6 +115,7 @@ struct cpufreq_policy {
> 
>  	struct kobject		kobj;
>  	struct completion	kobj_unregister;
> +	bool			transition_ongoing; /* Tracks transition status */
>  };
> 
>  #define CPUFREQ_ADJUST			(0)

Patch

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2d53f47..75715f1 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -264,6 +264,12 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 	switch (state) {

 	case CPUFREQ_PRECHANGE:
+		if (WARN(policy->transition_ongoing,
+				"In middle of another frequency transition\n"))
+			return;
+
+		policy->transition_ongoing = true;
+
 		/* detect if the driver reported a value as "old frequency"
 		 * which is not equal to what the cpufreq core thinks is
 		 * "old frequency".
@@ -283,6 +289,12 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
 		break;

 	case CPUFREQ_POSTCHANGE:
+		if (WARN(!policy->transition_ongoing,
+				"No frequency transition in progress\n"))
+			return;
+
+		policy->transition_ongoing = false;
+
 		adjust_jiffies(CPUFREQ_POSTCHANGE, freqs);
 		pr_debug("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
 			(unsigned long)freqs->cpu);
@@ -1458,6 +1470,8 @@  int __cpufreq_driver_target(struct cpufreq_policy *policy,

 	if (cpufreq_disabled())
 		return -ENODEV;
+	if (policy->transition_ongoing)
+		return -EBUSY;

 	/* Make sure that target_freq is within supported range */
 	if (target_freq > policy->max)
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 037d36a..8c13a45 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -115,6 +115,7 @@  struct cpufreq_policy {

 	struct kobject		kobj;
 	struct completion	kobj_unregister;
+	bool			transition_ongoing; /* Tracks transition status */
 };

 #define CPUFREQ_ADJUST			(0)
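
For readers who want to see the combined effect of the three hunks, the
following is a userspace model of the logic, not the kernel code itself.
All sim_* names are hypothetical. Where the patch calls WARN() and
returns, the model returns -EINVAL so that the rejection is observable
by the caller; that return value is an assumption of the model only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names; only transition_ongoing mirrors the patch. */
enum sim_state { SIM_PRECHANGE, SIM_POSTCHANGE };

struct sim_policy {
	unsigned int cur;        /* current frequency in kHz */
	bool transition_ongoing; /* set by PRECHANGE, cleared by POSTCHANGE */
};

/*
 * Model of the checks added to __cpufreq_notify_transition(). The kernel
 * code WARN()s and returns; the model returns -EINVAL instead.
 */
static int sim_notify_transition(struct sim_policy *p, enum sim_state state,
				 unsigned int new_freq)
{
	switch (state) {
	case SIM_PRECHANGE:
		if (p->transition_ongoing)
			return -EINVAL; /* second PRECHANGE in a row */
		p->transition_ongoing = true;
		break;
	case SIM_POSTCHANGE:
		if (!p->transition_ongoing)
			return -EINVAL; /* POSTCHANGE without PRECHANGE */
		p->transition_ongoing = false;
		p->cur = new_freq;
		break;
	}
	return 0;
}

/* Model of the early exit added to __cpufreq_driver_target(). */
static int sim_driver_target(struct sim_policy *p, unsigned int target_freq)
{
	int ret;

	if (p->transition_ongoing)
		return -EBUSY;

	ret = sim_notify_transition(p, SIM_PRECHANGE, target_freq);
	if (ret)
		return ret;
	/* A real driver would program the hardware here. */
	return sim_notify_transition(p, SIM_POSTCHANGE, target_freq);
}
```

One consequence visible in the model: if a driver sends PRECHANGE and
never follows it with POSTCHANGE, every later sim_driver_target() call on
that policy returns -EBUSY until the missing POSTCHANGE arrives.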