diff mbox

cpufreq: create link to policy only for registered CPUs

Message ID a0469c333e08b3953bfe9e8356c6919c26578349.1473414746.git.viresh.kumar@linaro.org
State Superseded
Headers show

Commit Message

Viresh Kumar Sept. 9, 2016, 9:54 a.m. UTC
If a cpufreq driver is registered very early in the boot stage (e.g.
registered from postcore_initcall()), then cpufreq core may generate
kernel warnings for it.

In this case, the CPUs are registered as devices with the kernel only
after the cpufreq driver is registered, while the CPUs were brought
online way before that. And by the time cpufreq_add_dev() gets called,
the cpu device isn't stored in the per-cpu variable (cpu_sys_devices,)
which is read by get_cpu_device().

And so cpufreq core fails to get device for the CPU, for which
cpufreq_add_dev() was called in the first place and we will hit a
WARN_ON(!cpu_dev).

Even if we reuse the 'dev' parameter passed to cpufreq_add_dev() to
avoid that warning, there might be other CPUs online that share the
policy with the cpu for which cpufreq_add_dev() is called. And
eventually get_cpu_device() will return NULL for them as well, and we
will hit the same WARN_ON() again.

In order to fix these issues, change cpufreq core to create links to the
policy for a cpu only when cpufreq_add_dev() is called for that CPU.

Reuse the 'real_cpus' mask to track that as well.

Note that cpufreq_remove_dev() already handles removal of the links for
individual CPUs and cpufreq_add_dev() has aligned with that now.

Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

---
 drivers/cpufreq/cpufreq.c | 89 +++++++++++++++--------------------------------
 1 file changed, 28 insertions(+), 61 deletions(-)

-- 
2.7.1.410.g6faf27b

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Viresh Kumar Sept. 9, 2016, 11:22 a.m. UTC | #1
On 09-09-16, 12:16, Russell King - ARM Linux wrote:
> On Fri, Sep 09, 2016 at 03:24:14PM +0530, Viresh Kumar wrote:

> > If a cpufreq driver is registered very early in the boot stage (e.g.

> > registered from postcore_initcall()), then cpufreq core may generate

> > kernel warnings for it.

> > 

> > In this case, the CPUs are registered as devices with the kernel only

> > after the cpufreq driver is registered, while the CPUs were brought

> > online way before that.

> 

> This seems confusing, maybe:

> 

> "In this case, the CPUs are brought online, then the cpufreq driver is

> registered, and then the CPU topology devices are registered."

> 

> which gives more of a linear A happens, then B, then C.


Sure, thanks for the tip..

> > ... And by the time cpufreq_add_dev() gets called,

> > the cpu device isn't stored in the per-cpu variable (cpu_sys_devices,)

> > which is read by get_cpu_device().

> 

> s/And by/By/ or "However, by"

> 

> > And so cpufreq core fails to get device for the CPU, for which

> > cpufreq_add_dev() was called in the first place and we will hit a

> > WARN_ON(!cpu_dev).

> 

> s/And so/So the/

> 

> This isn't the WARN_ON() statement that's triggering for me.


The WARN_ON() that was triggering for you was already removed by a
patch from Rafael (see below), but with that patch, you would have hit
this WARN_ON() :(.

> > Even if we reuse the 'dev' parameter passed to cpufreq_add_dev() to

> > avoid that warning, there might be other CPUs online that share the

> > policy with the cpu for which cpufreq_add_dev() is called. And

> > eventually get_cpu_device() will return NULL for them as well, and we

> > will hit the same WARN_ON() again.

> 

> s/And eventually/Eventually/


Thanks for all the suggestions..

> > In order to fix these issues, change cpufreq core to create links to the

> > policy for a cpu only when cpufreq_add_dev() is called for that CPU.

> > 

> > Reuse the 'real_cpus' mask to track that as well.

> > 

> > Note that cpufreq_remove_dev() already handles removal of the links for

> > individual CPUs and cpufreq_add_dev() has aligned with that now.

> 

> I applied this patch, but I still get:

> 

> WARNING: CPU: 0 PID: 1 at drivers/cpufreq/cpufreq.c:1040 cpufreq_add_dev+0x144/0x634

> Modules linked in:

> CPU: 0 PID: 1 Comm: swapper Not tainted 4.8.0-rc5+ #1061

> Hardware name: Intel-Assabet

> Backtrace:

> [<c0212190>] (dump_backtrace) from [<c021249c>] (show_stack+0x18/0x1c)

>  r6:00000000 r5:c05ca11d r4:00000000

> [<c0212484>] (show_stack) from [<c036e84c>] (dump_stack+0x20/0x28)

> [<c036e82c>] (dump_stack) from [<c021f7ec>] (__warn+0xd0/0xfc)

> [<c021f71c>] (__warn) from [<c021f840>] (warn_slowpath_null+0x28/0x30)

>  r10:00000000 r8:c062927c r7:00000000 r6:00000000 r5:c063d288 r4:00000000

> [<c021f818>] (warn_slowpath_null) from [<c043dc84>] (cpufreq_add_dev+0x144/0x634)

> [<c043db40>] (cpufreq_add_dev) from [<c03dc43c>] (bus_probe_device+0x5c/0x84)

>  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:c062927c r5:c063d288

>  r4:c0640148

> [<c03dc3e0>] (bus_probe_device) from [<c03da7c4>] (device_add+0x390/0x520)

>  r6:c0629284 r5:00000000 r4:c062927c

> [<c03da434>] (device_add) from [<c03daad8>] (device_register+0x1c/0x20)

>  r10:c061d848 r8:c0603524 r7:00000001 r6:00000000 r5:c062927c r4:c062927c

> [<c03daabc>] (device_register) from [<c03df5e8>] (register_cpu+0x88/0xac)

>  r4:c0629274

> [<c03df560>] (register_cpu) from [<c0603544>] (topology_init+0x20/0x2c)

>  r7:c0646760 r6:c0623568 r5:c061d834 r4:00000000

> [<c0603524>] (topology_init) from [<c020974c>] (do_one_initcall+0xc0/0x178)

>  r4:00000004

> [<c020968c>] (do_one_initcall) from [<c0600e70>] (kernel_init_freeable+0xfc/0x1c4)

>  r10:c061d848 r9:00000000 r8:0000008c r7:c0646760 r6:c0623568 r5:c061d834

>  r4:00000004

> [<c0600d74>] (kernel_init_freeable) from [<c052b6cc>] (kernel_init+0x10/0xf4)

>  r10:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c052b6bc r4:00000000

> [<c052b6bc>] (kernel_init) from [<c020fcf0>] (ret_from_fork+0x14/0x24)

>  r4:00000000

> ---[ end trace d7209ea270f4f585 ]---

> 

> I'm afraid I rather predicted that after reading the patch but before

> running the test: the patch does nothing to solve the original warning,

> as the code path which gets us to that warning remains untouched by

> this patch.

> 

> The code path is:

> 

> static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)

> {

>         if (cpu_online(cpu))

>                 return cpufreq_online(cpu);

> 

> static int cpufreq_online(unsigned int cpu)

> {

>         policy = per_cpu(cpufreq_cpu_data, cpu);

>         if (policy) {

>         } else {

>                 new_policy = true;

>                 policy = cpufreq_policy_alloc(cpu);

> 

> static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)

> {

>         if (WARN_ON(!dev))

>                 return NULL;

> 

> The only change in your patch that affected this path was this:

> 

> -       if (cpu_online(cpu))

> -               return cpufreq_online(cpu);

> +       if (cpu_online(cpu)) {

> +               ret = cpufreq_online(cpu);

> +               if (ret)

> +                       return ret;

> +       }

> 

> which obviously has no bearing on that WARN_ON() firing.

> 

> Maybe I'm testing the wrong patch.


Thanks for testing it.. You need another patch from Rafael, which
should be in linux-next by now..

commit 3689ad7ed6a8 ("cpufreq: Drop unnecessary check from
cpufreq_policy_alloc()")

Both patches combined will fix the problem you were getting.

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 13fb589b6d2c..3a64136bf21b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -916,58 +916,18 @@  static struct kobj_type ktype_cpufreq = {
 	.release	= cpufreq_sysfs_release,
 };
 
-static int add_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
+static int add_cpu_dev_symlink(struct cpufreq_policy *policy,
+			       struct device *dev)
 {
-	struct device *cpu_dev;
-
-	pr_debug("%s: Adding symlink for CPU: %u\n", __func__, cpu);
-
-	if (!policy)
-		return 0;
-
-	cpu_dev = get_cpu_device(cpu);
-	if (WARN_ON(!cpu_dev))
-		return 0;
-
-	return sysfs_create_link(&cpu_dev->kobj, &policy->kobj, "cpufreq");
+	dev_dbg(dev, "%s: Adding symlink\n", __func__);
+	return sysfs_create_link(&dev->kobj, &policy->kobj, "cpufreq");
 }
 
-static void remove_cpu_dev_symlink(struct cpufreq_policy *policy, int cpu)
+static void remove_cpu_dev_symlink(struct cpufreq_policy *policy,
+				   struct device *dev)
 {
-	struct device *cpu_dev;
-
-	pr_debug("%s: Removing symlink for CPU: %u\n", __func__, cpu);
-
-	cpu_dev = get_cpu_device(cpu);
-	if (WARN_ON(!cpu_dev))
-		return;
-
-	sysfs_remove_link(&cpu_dev->kobj, "cpufreq");
-}
-
-/* Add/remove symlinks for all related CPUs */
-static int cpufreq_add_dev_symlink(struct cpufreq_policy *policy)
-{
-	unsigned int j;
-	int ret = 0;
-
-	/* Some related CPUs might not be present (physically hotplugged) */
-	for_each_cpu(j, policy->real_cpus) {
-		ret = add_cpu_dev_symlink(policy, j);
-		if (ret)
-			break;
-	}
-
-	return ret;
-}
-
-static void cpufreq_remove_dev_symlink(struct cpufreq_policy *policy)
-{
-	unsigned int j;
-
-	/* Some related CPUs might not be present (physically hotplugged) */
-	for_each_cpu(j, policy->real_cpus)
-		remove_cpu_dev_symlink(policy, j);
+	dev_dbg(dev, "%s: Removing symlink\n", __func__);
+	sysfs_remove_link(&dev->kobj, "cpufreq");
 }
 
 static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
@@ -999,7 +959,7 @@  static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
 			return ret;
 	}
 
-	return cpufreq_add_dev_symlink(policy);
+	return 0;
 }
 
 __weak struct cpufreq_governor *cpufreq_default_governor(void)
@@ -1129,7 +1089,6 @@  static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy, bool notify)
 
 	down_write(&policy->rwsem);
 	cpufreq_stats_free_table(policy);
-	cpufreq_remove_dev_symlink(policy);
 	kobj = &policy->kobj;
 	cmp = &policy->kobj_unregister;
 	up_write(&policy->rwsem);
@@ -1211,8 +1170,8 @@  static int cpufreq_online(unsigned int cpu)
 	if (new_policy) {
 		/* related_cpus should at least include policy->cpus. */
 		cpumask_copy(policy->related_cpus, policy->cpus);
-		/* Remember CPUs present at the policy creation time. */
-		cpumask_and(policy->real_cpus, policy->cpus, cpu_present_mask);
+		/* Clear mask of registered CPUs */
+		cpumask_clear(policy->real_cpus);
 	}
 
 	/*
@@ -1327,6 +1286,8 @@  static int cpufreq_online(unsigned int cpu)
 	return ret;
 }
 
+static void cpufreq_offline(unsigned int cpu);
+
 /**
  * cpufreq_add_dev - the cpufreq interface for a CPU device.
  * @dev: CPU device.
@@ -1336,22 +1297,28 @@  static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 {
 	struct cpufreq_policy *policy;
 	unsigned cpu = dev->id;
+	int ret;
 
 	dev_dbg(dev, "%s: adding CPU%u\n", __func__, cpu);
 
-	if (cpu_online(cpu))
-		return cpufreq_online(cpu);
+	if (cpu_online(cpu)) {
+		ret = cpufreq_online(cpu);
+		if (ret)
+			return ret;
+	}
 
-	/*
-	 * A hotplug notifier will follow and we will handle it as CPU online
-	 * then.  For now, just create the sysfs link, unless there is no policy
-	 * or the link is already present.
-	 */
+	/* Create sysfs link on CPU registration */
 	policy = per_cpu(cpufreq_cpu_data, cpu);
 	if (!policy || cpumask_test_and_set_cpu(cpu, policy->real_cpus))
 		return 0;
 
-	return add_cpu_dev_symlink(policy, cpu);
+	ret = add_cpu_dev_symlink(policy, dev);
+	if (ret) {
+		cpumask_clear_cpu(cpu, policy->real_cpus);
+		cpufreq_offline(cpu);
+	}
+
+	return ret;
 }
 
 static void cpufreq_offline(unsigned int cpu)
@@ -1432,7 +1399,7 @@  static void cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
 		cpufreq_offline(cpu);
 
 	cpumask_clear_cpu(cpu, policy->real_cpus);
-	remove_cpu_dev_symlink(policy, cpu);
+	remove_cpu_dev_symlink(policy, dev);
 
 	if (cpumask_empty(policy->real_cpus))
 		cpufreq_policy_free(policy, true);