cpufreq: move policy kobj to policy->cpu at resume

Message ID 3b19a388891fe997c6c7bc14463453c27312edfa.1404968266.git.viresh.kumar@linaro.org
State New

Commit Message

Viresh Kumar July 10, 2014, 5:19 a.m. UTC
This is only relevant to implementations with multiple clusters, where clusters
have separate clock lines but all CPUs within a cluster share it.

Consider a dual cluster platform with 2 cores per cluster. During suspend we
start offlining CPUs from 1 to 3. When CPU2 is removed, policy->kobj would be
moved to CPU3 and when CPU3 goes down we wouldn't free policy or its kobj.

Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev().
We will recover the old policy and update policy->cpu from 3 to 2 via
update_policy_cpu().

But the kobj is still tied to CPU3 and wasn't moved to CPU2. We wouldn't create
a link for CPU2, but would try that while bringing CPU3 online, which will
report errors as CPU3 already has a kobj assigned to it.

This bug got introduced with commit 42f921a, which overlooked this scenario.

To fix this, let's move the kobj to the new policy->cpu while bringing the first
CPU of a cluster back.

Fixes: 42f921a ("cpufreq: remove sysfs files for CPUs which failed to come back after resume")
Cc: Stable <stable@vger.kernel.org> # 3.13+
Reported-by: Bu Yitian <ybu@qti.qualcomm.com>
Reported-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
Hi Rafael,

This is for 3.16 release, please take it once Yitian/Saravana test this out.

@Yitian/Saravana: Sorry for overlooking this when both of you reported this
first. I (and Srivatsa as well) was damn sure that this scenario is taken into
account in current code and a close look proved that wrong.

I couldn't test it out, can any of you please see if it fixes things for you?

 drivers/cpufreq/cpufreq.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
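
The failure sequence described in the commit message can be traced with a small
user-space model. This is only a sketch under simplifying assumptions:
struct policy_model, model_offline(), model_online() and suspend_resume_cycle()
are hypothetical names, not kernel APIs, and the sysfs collision is reduced to a
single integer comparison. It models one cluster (CPU2 and CPU3) and returns -1
where the real kernel would report a sysfs error.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified model of one cluster's policy (CPU2 and CPU3).
 * None of these names exist in the kernel.
 */
struct policy_model {
	int cpu;       /* models policy->cpu, the CPU managing the policy */
	int kobj_cpu;  /* CPU whose sysfs directory holds the policy kobject */
	int nr_online; /* CPUs of this cluster that are currently online */
};

/*
 * Take one CPU offline. If it managed the policy and its sibling is still
 * online, hand policy->cpu and the kobject over to the sibling.
 */
static void model_offline(struct policy_model *p, int cpu, int sibling)
{
	assert(p->nr_online > 0);
	p->nr_online--;
	if (p->nr_online > 0 && cpu == p->cpu) {
		p->cpu = sibling;
		p->kobj_cpu = sibling;
	}
	/* Last CPU down during suspend: policy and kobject are kept as-is. */
}

/*
 * Bring one CPU online. Returns 0 on success, or -1 if the link this CPU
 * needs would collide with the kobject already sitting in its directory.
 */
static int model_online(struct policy_model *p, int cpu, bool move_kobj)
{
	if (p->nr_online == 0) {
		/* First CPU of the cluster: recover the saved policy. */
		if (cpu != p->cpu) {
			p->cpu = cpu;              /* update_policy_cpu() */
			if (move_kobj)
				p->kobj_cpu = cpu; /* kobject_move() from the patch */
		}
		p->nr_online++;
		return 0;
	}
	/* Later CPU: needs a link in its own directory to the kobject. */
	if (cpu == p->kobj_cpu)
		return -1;
	p->nr_online++;
	return 0;
}

/* Suspend (CPU2, then CPU3 offline), then resume (CPU2, then CPU3 online). */
static int suspend_resume_cycle(bool move_kobj)
{
	struct policy_model p = { .cpu = 2, .kobj_cpu = 2, .nr_online = 2 };

	model_offline(&p, 2, 3);
	model_offline(&p, 3, 2);
	if (model_online(&p, 2, move_kobj) != 0)
		return -1;
	return model_online(&p, 3, move_kobj);
}
```

Under these assumptions, suspend_resume_cycle(false) returns -1 because the
kobject is still parked under CPU3 when CPU3 asks for its link, while
suspend_resume_cycle(true) returns 0.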

Comments

Bu, Yitian July 10, 2014, 7:08 a.m. UTC | #1
Hi Viresh:

I have verified this patch, all the issues I reported disappeared.
Thanks for the quick fix.

Best Regards

> -----Original Message-----
> From: Viresh Kumar [mailto:viresh.kumar@linaro.org]
> Sent: Thursday, July 10, 2014 1:19 PM
> To: rjw@rjwysocki.net
> Cc: linaro-kernel@lists.linaro.org; linux-pm@vger.kernel.org;
> arvind.chauhan@arm.com; srivatsa@mit.edu; skannan@codeaurora.org; Bu,
> Yitian; Viresh Kumar; Stable
> Subject: [PATCH] cpufreq: move policy kobj to policy->cpu at resume
> 
> This is only relevant to implementations with multiple clusters, where clusters
> have separate clock lines but all CPUs within a cluster share it.
> 
> Consider a dual cluster platform with 2 cores per cluster. During suspend we
> start offlining CPUs from 1 to 3. When CPU2 is removed, policy->kobj would be
> moved to CPU3 and when CPU3 goes down we wouldn't free policy or its kobj.
> 
> Now on resume, we will get CPU2 before CPU3 and will call
> __cpufreq_add_dev().
> We will recover the old policy and update policy->cpu from 3 to 2 from
> update_policy_cpu().
> 
> But the kobj is still tied to CPU3 and wasn't moved to CPU2. We wouldn't create
> a link for CPU2, but would try that while bringing CPU3 online. Which will
> report errors as CPU3 already has kobj assigned to it.
> 
> This bug got introduced with commit 42f921a, which overlooked this scenario.
> 
> To fix this, let's move kobj to the new policy->cpu while bringing first CPU of a
> cluster back.
> 
> Fixes: ("42f921a cpufreq: remove sysfs files for CPUs which failed to come back
> after resume")
> Cc: Stable <stable@vger.kernel.org> # 3.13+
> Reported-by: Bu Yitian <ybu@qti.qualcomm.com>
> Reported-by: Saravana Kannan <skannan@codeaurora.org>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
> Hi Rafael,
> 
> This is for 3.16 release, please take it once Yitian/Saravana test this out.
> 
> @Yitian/Saravana: Sorry for overlooking this when both of you reported this first.
> I (and Srivatsa as well) was damn sure that this scenario is taken into account in
> current code and a close look proved that wrong.
> 
> I couldn't test it out, can any of you please see if it fixes things for you?
> 
>  drivers/cpufreq/cpufreq.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 62259d2..6f02485 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1153,10 +1153,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>  	 * the creation of a brand new one. So we need to perform this update
>  	 * by invoking update_policy_cpu().
>  	 */
> -	if (recover_policy && cpu != policy->cpu)
> +	if (recover_policy && cpu != policy->cpu) {
>  		update_policy_cpu(policy, cpu);
> -	else
> +		WARN_ON(kobject_move(&policy->kobj, &dev->kobj));
> +	} else {
>  		policy->cpu = cpu;
> +	}
> 
>  	cpumask_copy(policy->cpus, cpumask_of(cpu));
> 
> --
> 2.0.0.rc2

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Viresh Kumar July 10, 2014, 7:10 a.m. UTC | #2
On 10 July 2014 12:38, Bu, Yitian <ybu@qti.qualcomm.com> wrote:
> Hi Viresh:
>
> I have verified this patch, all the issues I reported disappeared.
> Thanks for the quick fix.

Please don't top post. http://en.wikipedia.org/wiki/Posting_style

Rafael would be adding your:

Tested-by: Bu Yitian <ybu@qti.qualcomm.com>

Let us know if you don't want that.
Bu, Yitian July 10, 2014, 7:13 a.m. UTC | #3
> -----Original Message-----
> From: Viresh Kumar [mailto:viresh.kumar@linaro.org]
> Sent: Thursday, July 10, 2014 3:10 PM
> To: Bu, Yitian
> Cc: rjw@rjwysocki.net; linaro-kernel@lists.linaro.org;
> linux-pm@vger.kernel.org; arvind.chauhan@arm.com; srivatsa@mit.edu;
> skannan@codeaurora.org; Stable
> Subject: Re: [PATCH] cpufreq: move policy kobj to policy->cpu at resume
>
> On 10 July 2014 12:38, Bu, Yitian <ybu@qti.qualcomm.com> wrote:
> > Hi Viresh:
> >
> > I have verified this patch, all the issues I reported disappeared.
> > Thanks for the quick fix.
>
> Please don't top post. http://en.wikipedia.org/wiki/Posting_style
>
> Rafael would be adding your:
>
> Tested-by: Bu Yitian <ybu@qti.qualcomm.com>
>
> Let us know if you don't want that.

It is ok, thanks.
Srivatsa S. Bhat July 10, 2014, 11:15 a.m. UTC | #4
On 07/10/2014 10:49 AM, Viresh Kumar wrote:
> This is only relevant to implementations with multiple clusters, where clusters
> have separate clock lines but all CPUs within a cluster share it.
> 
> Consider a dual cluster platform with 2 cores per cluster. During suspend we
> start offlining CPUs from 1 to 3. When CPU2 is removed, policy->kobj would be
> moved to CPU3 and when CPU3 goes down we wouldn't free policy or its kobj.
> 
> Now on resume, we will get CPU2 before CPU3 and will call __cpufreq_add_dev().
> We will recover the old policy and update policy->cpu from 3 to 2 from
> update_policy_cpu().
> 
> But the kobj is still tied to CPU3 and wasn't moved to CPU2. We wouldn't create
> a link for CPU2, but would try that while bringing CPU3 online. Which will
> report errors as CPU3 already has kobj assigned to it.
> 
> This bug got introduced with commit 42f921a, which overlooked this scenario.
> 
> To fix this, let's move kobj to the new policy->cpu while bringing first CPU of a
> cluster back.
> 
> Fixes: ("42f921a cpufreq: remove sysfs files for CPUs which failed to come back after resume")
> Cc: Stable <stable@vger.kernel.org> # 3.13+
> Reported-by: Bu Yitian <ybu@qti.qualcomm.com>
> Reported-by: Saravana Kannan <skannan@codeaurora.org>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---

Looks good to me. But I think it would be better to move the invocation of
kobject_move() to update_policy_cpu() itself, so that update_policy_cpu()
will do all the work involved in updating the policy->cpu, as its name suggests.

With that small nit,

Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu>

Regards,
Srivatsa S. Bhat

> Hi Rafael,
> 
> This is for 3.16 release, please take it once Yitian/Saravana test this out.
> 
> @Yitian/Saravana: Sorry for overlooking this when both of you reported this
> first. I (and Srivatsa as well) was damn sure that this scenario is taken into
> account in current code and a close look proved that wrong.
> 
> I couldn't test it out, can any of you please see if it fixes things for you?
> 
>  drivers/cpufreq/cpufreq.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 62259d2..6f02485 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1153,10 +1153,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
>  	 * the creation of a brand new one. So we need to perform this update
>  	 * by invoking update_policy_cpu().
>  	 */
> -	if (recover_policy && cpu != policy->cpu)
> +	if (recover_policy && cpu != policy->cpu) {
>  		update_policy_cpu(policy, cpu);
> -	else
> +		WARN_ON(kobject_move(&policy->kobj, &dev->kobj));
> +	} else {
>  		policy->cpu = cpu;
> +	}
>  
>  	cpumask_copy(policy->cpus, cpumask_of(cpu));
>  
> 

Viresh Kumar July 10, 2014, 11:20 a.m. UTC | #5
On 10 July 2014 16:45, Srivatsa S. Bhat <srivatsa@mit.edu> wrote:
> Looks good to me. But I think it would be better to move the invocation of
> kobject_move() to update_policy_cpu() itself, so that update_policy_cpu()
> will do all the work involved in updating the policy->cpu, as its name suggests.

It's called from the remove path as well.
Srivatsa S. Bhat July 10, 2014, 11:22 a.m. UTC | #6
On 07/10/2014 04:50 PM, Viresh Kumar wrote:
> On 10 July 2014 16:45, Srivatsa S. Bhat <srivatsa@mit.edu> wrote:
>> Looks good to me. But I think it would be better to move the invocation of
>> kobject_move() to update_policy_cpu() itself, so that update_policy_cpu()
>> will do all the work involved in updating the policy->cpu, as its name suggests.
> 
> It's called from the remove path as well.
> 

I know.. That's why it makes even more sense to consolidate all the
work into one function. We can restructure cpufreq_nominate_new_policy_cpu()
such that the kobject_move() can be moved to update_policy_cpu().

Regards,
Srivatsa S. Bhat
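
The consolidation suggested above can be sketched in user space as follows.
This is an illustration only, not the V2 patch (which is not shown in this
thread): struct policy_sketch and every model_* function are hypothetical
stand-ins, and the choice to leave the policy CPU untouched when the move fails
is an assumption made for the sketch.

```c
#include <assert.h>

/* Hypothetical stand-in; not the kernel's struct cpufreq_policy. */
struct policy_sketch {
	int cpu;      /* models policy->cpu */
	int kobj_cpu; /* CPU whose directory holds the policy kobject */
};

/* Models kobject_move(): 0 on success, a negative value on failure. */
static int model_kobject_move(struct policy_sketch *p, int new_cpu)
{
	if (new_cpu < 0)
		return -1;
	p->kobj_cpu = new_cpu;
	return 0;
}

/*
 * One helper updates both the managing CPU and the kobject location, so
 * no caller can change one without the other.
 */
static int model_update_policy_cpu(struct policy_sketch *p, int new_cpu)
{
	int ret;

	assert(p);
	if (new_cpu == p->cpu)
		return 0;

	ret = model_kobject_move(p, new_cpu);
	if (ret)
		return ret; /* assumption: keep the old CPU if the move failed */

	p->cpu = new_cpu;
	return 0;
}

/* Remove path: a sibling takes over when the managing CPU goes away. */
static int model_nominate_new_policy_cpu(struct policy_sketch *p, int sibling)
{
	return model_update_policy_cpu(p, sibling);
}

/* Add path at resume: the first CPU back takes over the saved policy. */
static int model_recover_policy(struct policy_sketch *p, int cpu)
{
	return model_update_policy_cpu(p, cpu);
}
```

With both paths funnelled through one helper, the managing CPU and the kobject
location cannot drift apart in this model, which is the property the original
bug violated.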

Viresh Kumar July 10, 2014, 12:42 p.m. UTC | #7
On 10 July 2014 16:52, Srivatsa S. Bhat <srivatsa@mit.edu> wrote:
> I know.. That's why it makes even more sense to consolidate all the
> work into one function. We can restructure cpufreq_nominate_new_policy_cpu()
> such that the kobject_move() can be moved to update_policy_cpu().

Done.
Viresh Kumar July 11, 2014, 6:01 a.m. UTC | #8
On 10 July 2014 16:45, Srivatsa S. Bhat <srivatsa@mit.edu> wrote:
> Looks good to me. But I think it would be better to move the invocation of
> kobject_move() to update_policy_cpu() itself, so that update_policy_cpu()
> will do all the work involved in updating the policy->cpu, as its name suggests.
>
> With that small nit,
>
> Reviewed-by: Srivatsa S. Bhat <srivatsa@mit.edu>

Hi Rafael,

I had a chat with Srivatsa about this patch and the V2 version which
people aren't able to test yet.

I proposed that we take this patch as is and hold V2 for some time.
- V2 wouldn't apply cleanly to stable kernels for sure
- V2 isn't yet tested and V1 is.
- Saravana already proposed a patch which would remove most of what
  V2 is adding: http://www.spinics.net/lists/arm-kernel/msg346604.html

Our chats:

<vireshk> srivatsa, http://www.spinics.net/lists/arm-kernel/msg346604.html
<srivatsa> vireshk, thanks for the pointer.. will try to take a look by tonight..

<vireshk> srivatsa, Because saravana is actually looking to change much of
          the stuff, what about dropping the V2 fix that I sent yesterday and
          take V1 only for now?
<vireshk> srivatsa, As V1 would apply cleanly over stable kernels as well
<srivatsa> vireshk, sure, no problem... perhaps you can make the code
           reorganization of the kobject_move as a separate patch and submit
           it for merge window instead of -rc
<vireshk> srivatsa, I was even thinking of that as well
<vireshk> Have two patches, only first one for rc and second one for next release
<vireshk> srivatsa, But the second patch would be overwritten by Saravana,
          so might not be of any use
<srivatsa> vireshk, ah, i see.. in that case, we can hold off on the second
           patch for now.. just v1 should do..


What do you say?
Patch

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 62259d2..6f02485 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1153,10 +1153,12 @@ static int __cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	 * the creation of a brand new one. So we need to perform this update
 	 * by invoking update_policy_cpu().
 	 */
-	if (recover_policy && cpu != policy->cpu)
+	if (recover_policy && cpu != policy->cpu) {
 		update_policy_cpu(policy, cpu);
-	else
+		WARN_ON(kobject_move(&policy->kobj, &dev->kobj));
+	} else {
 		policy->cpu = cpu;
+	}
 
 	cpumask_copy(policy->cpus, cpumask_of(cpu));
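
One property of the hunk above is that a failed kobject_move() is only reported:
WARN_ON() flags a non-zero return value but does not undo the preceding
update_policy_cpu() call. The user-space sketch below illustrates that ordering.
MODEL_WARN_ON, model_kobject_move() and model_recover() are hypothetical
stand-ins, and the fail parameter exists only to force the error path.

```c
#include <assert.h>
#include <stdio.h>

/* Counts reported warnings so the behaviour can be inspected. */
static int warn_count;

/* Stand-in for WARN_ON(): report a true condition and return it. */
static int model_warn_on(int condition, const char *expr)
{
	if (condition) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return condition;
}
#define MODEL_WARN_ON(x) model_warn_on(!!(x), #x)

/* Stand-in for kobject_move(): 0 on success, negative value on failure. */
static int model_kobject_move(int *kobj_cpu, int new_cpu, int fail)
{
	if (fail)
		return -1;
	*kobj_cpu = new_cpu;
	return 0;
}

/* Same ordering as the patched branch: update the CPU, then move. */
static void model_recover(int *policy_cpu, int *kobj_cpu, int cpu, int fail)
{
	assert(policy_cpu && kobj_cpu);
	*policy_cpu = cpu; /* models update_policy_cpu() */
	MODEL_WARN_ON(model_kobject_move(kobj_cpu, cpu, fail));
}
```

In this model a failed move leaves the policy CPU at the new value and the
kobject at the old one, with a single warning logged, so the warning is a
diagnostic rather than a recovery mechanism.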