diff mbox series

[v2,01/10] drivers base/arch_topology: free cpumask cpus_to_visit

Message ID 20170706094948.8779-2-dietmar.eggemann@arm.com
State Accepted
Commit 5408211a8f290ace85147858f4e05e18b942f489
Headers show
Series [v2,01/10] drivers base/arch_topology: free cpumask cpus_to_visit | expand

Commit Message

Dietmar Eggemann July 6, 2017, 9:49 a.m. UTC
Free cpumask cpus_to_visit in case registering
init_cpu_capacity_notifier has failed or the parsing of the cpu
capacity-dmips-mhz property is done. The cpumask cpus_to_visit is
only used inside the notifier call init_cpu_capacity_callback.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

Acked-by: Vincent Guittot <vincent.guittot@linaro.org>

Tested-by: Juri Lelli <juri.lelli@arm.com>

Reviewed-by: Juri Lelli <juri.lelli@arm.com>

---
 drivers/base/arch_topology.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

-- 
2.11.0

Comments

Viresh Kumar July 6, 2017, 10:22 a.m. UTC | #1
On 06-07-17, 10:49, Dietmar Eggemann wrote:
> Free cpumask cpus_to_visit in case registering

> init_cpu_capacity_notifier has failed or the parsing of the cpu

> capacity-dmips-mhz property is done. The cpumask cpus_to_visit is

> only used inside the notifier call init_cpu_capacity_callback.

> 

> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

> Cc: Juri Lelli <juri.lelli@arm.com>

> Reported-by: Vincent Guittot <vincent.guittot@linaro.org>

> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

> Acked-by: Vincent Guittot <vincent.guittot@linaro.org>

> Tested-by: Juri Lelli <juri.lelli@arm.com>

> Reviewed-by: Juri Lelli <juri.lelli@arm.com>

> ---

>  drivers/base/arch_topology.c | 12 ++++++++++--

>  1 file changed, 10 insertions(+), 2 deletions(-)

> 

> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c

> index d1c33a85059e..f4832c662762 100644

> --- a/drivers/base/arch_topology.c

> +++ b/drivers/base/arch_topology.c

> @@ -206,6 +206,8 @@ static struct notifier_block init_cpu_capacity_notifier = {

>  

>  static int __init register_cpufreq_notifier(void)

>  {

> +	int ret;

> +

>  	/*

>  	 * on ACPI-based systems we need to use the default cpu capacity

>  	 * until we have the necessary code to parse the cpu capacity, so

> @@ -221,13 +223,19 @@ static int __init register_cpufreq_notifier(void)

>  

>  	cpumask_copy(cpus_to_visit, cpu_possible_mask);

>  

> -	return cpufreq_register_notifier(&init_cpu_capacity_notifier,

> -					 CPUFREQ_POLICY_NOTIFIER);

> +	ret = cpufreq_register_notifier(&init_cpu_capacity_notifier,

> +					CPUFREQ_POLICY_NOTIFIER);

> +

> +	if (ret)

> +		free_cpumask_var(cpus_to_visit);

> +

> +	return ret;

>  }

>  core_initcall(register_cpufreq_notifier);

>  

>  static void parsing_done_workfn(struct work_struct *work)

>  {

> +	free_cpumask_var(cpus_to_visit);

>  	cpufreq_unregister_notifier(&init_cpu_capacity_notifier,

>  					 CPUFREQ_POLICY_NOTIFIER);


As a general rule (and good coding practice), it is better to free resources
only after the users are gone. And so we should have changed the order here.
i.e. Unregister the notifier first and then free the cpumask.

And because of that we may end up crashing the kernel here.

Here is an example:

Consider that init_cpu_capacity_callback() is getting called concurrently on big
and LITTLE CPUs.


CPU0 (big)                            CPU4 (LITTLE)

                                      if (cap_parsing_failed || cap_parsing_done)
                                          return 0;

cap_parsing_done = true;
schedule_work(&parsing_done_work);

parsing_done_workfn(work)
  -> free_cpumask_var(cpus_to_visit);
  -> cpufreq_unregister_notifier()


                                      switch (val) {
                                          ...
                                          /* Touch cpus_to_visit and crash */


My assumption here is that the same notifier head can get called in parallel on
two CPUs as all I see there is a down_read() in __blocking_notifier_call_chain()
which shouldn't block parallel calls.

Maybe I am wrong :(

-- 
viresh
Juri Lelli July 6, 2017, 10:59 a.m. UTC | #2
Hi Viresh,

On 06/07/17 15:52, Viresh Kumar wrote:
> On 06-07-17, 10:49, Dietmar Eggemann wrote:


[...]

> >  static void parsing_done_workfn(struct work_struct *work)

> >  {

> > +	free_cpumask_var(cpus_to_visit);

> >  	cpufreq_unregister_notifier(&init_cpu_capacity_notifier,

> >  					 CPUFREQ_POLICY_NOTIFIER);

> 

> As a general rule (and good coding practice), it is better to free resources

> only after the users are gone. And so we should have changed the order here.

> i.e. Unregister the notifier first and then free the cpumask.

> 

> And because of that we may end up crashing the kernel here.

> 

> Here is an example:

> 

> Consider that init_cpu_capacity_callback() is getting called concurrently on big

> and LITTLE CPUs.

> 

> 

> CPU0 (big)                            CPU4 (LITTLE)

> 

>                                       if (cap_parsing_failed || cap_parsing_done)

>                                           return 0;

> 


But, in this case the policy notifier for LITTLE cluster has not been
executed yet, so the domain's CPUs have not yet been cleared out from
cpus_to_visit. CPU0 won't see the mask as empty then, right?

> cap_parsing_done = true;

> schedule_work(&parsing_done_work);

> 

> parsing_done_workfn(work)

>   -> free_cpumask_var(cpus_to_visit);

>   -> cpufreq_unregister_notifier()

> 

> 

>                                       switch (val) {

>                                           ...

>                                           /* Touch cpus_to_visit and crash */

> 

> 

> My assumption here is that the same notifier head can get called in parallel on

> two CPUs as all I see there is a down_read() in __blocking_notifier_call_chain()

> which shouldn't block parallel calls.

> 


If that's the case I'm wondering however if we need explicit
synchronization though. Otherwise both threads can read the mask as
full, clear only their bits and not schedule the workfn?

But, can the policies be concurrently initialized? Or is the
initialization process serialized or the different domains?

Thanks,

- Juri
Viresh Kumar July 6, 2017, 11:15 a.m. UTC | #3
On 06-07-17, 11:59, Juri Lelli wrote:
> On 06/07/17 15:52, Viresh Kumar wrote:


> > CPU0 (big)                            CPU4 (LITTLE)

> > 

> >                                       if (cap_parsing_failed || cap_parsing_done)

> >                                           return 0;

> > 

> 

> But, in this case the policy notifier for LITTLE cluster has not been

> executed yet,


Not necessarily. The cpufreq notifier with CPUFREQ_NOTIFY event can get called
again and again (as soon as the policy is changed, for example min/max changed
from sysfs). And so it is possible that the LITTLE cpus are already cleared from
the mask.

> so the domain's CPUs have not yet been cleared out from

> cpus_to_visit. CPU0 won't see the mask as empty then, right?


And so it can.

> > cap_parsing_done = true;

> > schedule_work(&parsing_done_work);

> > 

> > parsing_done_workfn(work)

> >   -> free_cpumask_var(cpus_to_visit);

> >   -> cpufreq_unregister_notifier()

> > 

> > 

> >                                       switch (val) {

> >                                           ...

> >                                           /* Touch cpus_to_visit and crash */

> > 

> > 

> > My assumption here is that the same notifier head can get called in parallel on

> > two CPUs as all I see there is a down_read() in __blocking_notifier_call_chain()

> > which shouldn't block parallel calls.

> > 

> 

> If that's the case I'm wondering however if we need explicit

> synchronization though. Otherwise both threads can read the mask as

> full, clear only their bits and not schedule the workfn?


Maybe not as the policies are created one by one only, not concurrently.

> But, can the policies be concurrently initialized? Or is the

> initialization process serialized or the different domains?


There can be complex cases here. For example consider this.

Only the little CPUs are brought online at boot. Their policy is set and they
are cleared from the cpus_to_visit mask. Now we try to bring any big CPU online
and at the same time try changing min/max from sysfs for the LITTLE CPU policy.

The notifier may get called concurrently here I believe and cause the problem I
mentioned earlier.

-- 
viresh
Dietmar Eggemann July 7, 2017, 3:50 p.m. UTC | #4
On 06/07/17 12:15, Viresh Kumar wrote:
> On 06-07-17, 11:59, Juri Lelli wrote:

>> On 06/07/17 15:52, Viresh Kumar wrote:


[...]

>>

>> If that's the case I'm wondering however if we need explicit

>> synchronization though. Otherwise both threads can read the mask as

>> full, clear only their bits and not schedule the workfn?

> 

> Maybe not as the policies are created one by one only, not concurrently.

> 

>> But, can the policies be concurrently initialized? Or is the

>> initialization process serialized or the different domains?

> 

> There can be complex cases here. For example consider this.

> 

> Only the little CPUs are brought online at boot. Their policy is set and they

> are cleared from the cpus_to_visit mask. Now we try to bring any big CPU online

> and at the same time try changing min/max from sysfs for the LITTLE CPU policy.

> 

> The notifier may get called concurrently here I believe and cause the problem I

> mentioned earlier.

> 


I chatted with Juri and your proposed fix to do the unregister before
the free makes sense to us. Thanks for spotting this! I will change in
the next version.
diff mbox series

Patch

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index d1c33a85059e..f4832c662762 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -206,6 +206,8 @@  static struct notifier_block init_cpu_capacity_notifier = {
 
 static int __init register_cpufreq_notifier(void)
 {
+	int ret;
+
 	/*
 	 * on ACPI-based systems we need to use the default cpu capacity
 	 * until we have the necessary code to parse the cpu capacity, so
@@ -221,13 +223,19 @@  static int __init register_cpufreq_notifier(void)
 
 	cpumask_copy(cpus_to_visit, cpu_possible_mask);
 
-	return cpufreq_register_notifier(&init_cpu_capacity_notifier,
-					 CPUFREQ_POLICY_NOTIFIER);
+	ret = cpufreq_register_notifier(&init_cpu_capacity_notifier,
+					CPUFREQ_POLICY_NOTIFIER);
+
+	if (ret)
+		free_cpumask_var(cpus_to_visit);
+
+	return ret;
 }
 core_initcall(register_cpufreq_notifier);
 
 static void parsing_done_workfn(struct work_struct *work)
 {
+	free_cpumask_var(cpus_to_visit);
 	cpufreq_unregister_notifier(&init_cpu_capacity_notifier,
 					 CPUFREQ_POLICY_NOTIFIER);
 }