Message ID | ed8fd187687cb4ea9afd0bc32107ca5abf03e679.1422580135.git.viresh.kumar@linaro.org |
---|---|
State | New |
On 30 January 2015 at 07:00, ethan zhao <ethan.zhao@oracle.com> wrote:
> It is another bug.

Hmm, let's see..

> Yes, here is one bug that should be fixed, but it seems not enough to avoid the issue
> completely. Did you test it?
> How about Thread B running here:
>
> Thread B: xenbus_thread()
>
> xenbus_thread()
> msg->u.watch.handle->callback()
> handle_vcpu_hotplug_event()
> vcpu_hotplug()
> cpu_down()
> __cpu_notify(CPU_POST_DEAD..)
> cpufreq_cpu_callback()
> __cpufreq_remove_dev_prepare()
> update_policy_cpu() {
>     ...
>     down_write(&policy->rwsem);
>     policy->last_cpu = policy->cpu;
>     policy->cpu = cpu;
>     up_write(&policy->rwsem);
>     ---->
> }
>
> And Thread A runs to the same position as above. Don't ignore my work
> blindly, that piece of bandage could save your time.

Oh, please don't misunderstand me. I didn't have any intention of showing
any kind of disrespect.

Okay, why do you say that the thread you have shown above has a bug in it?
It is just updating policy->cpu, and that shouldn't make anything unstable.
Please explain one more time, in detail, why you think you hit a different
bug. Also look at kref.h to see which piece of code has hit that WARN();
it will happen only in the case I have shown. Let's see if I missed
something :)

--
viresh
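For reference, the WARN() being referred to lives in kref_get(); this is a paraphrased sketch of how it looked in include/linux/kref.h around that time, not a verbatim copy:

```c
/* Approximation of kref_get() from kernels of that era. */
static inline void kref_get(struct kref *kref)
{
	/*
	 * If the refcount was already 0, the object is being (or has been)
	 * freed by another thread, so taking a new reference here is a bug:
	 * even after the increment the result is still below 2.
	 */
	WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2);
}
```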
On 30 January 2015 at 07:40, ethan zhao <ethan.zhao@oracle.com> wrote:
> For a PPC notification and xen-bus thread race, could you tell me a way to
> reproduce it by triggering the PPC notification and xen-bus events manually?
> You really want me to write some code into a test kernel to flood the PPC and
> xen-bus at the same time? If we could analyse the code and understand the issue
> clearly, we wouldn't have to wait for the users to yell out.

I thought you already have a test where you are hitting the issue you
originally reported. At least Santosh did confirm that he is hitting it
3/5 times in his kernel during boot..

My reasoning for why your observation doesn't fit here:

Copying from your earlier mail..

Thread A: Workqueue: kacpi_notify

acpi_processor_notify()
acpi_processor_ppc_has_changed()
cpufreq_update_policy()
cpufreq_cpu_get()
kobject_get()

This tries to increment the count, and the warning you have mentioned
happens because of:

WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2);

i.e. even after incrementing the count, it is < 2, which I believe means
it will be 1. That means we have tried to do kobject_get() on a kobject
for which kobject_put() has already been done.

Thread B: xenbus_thread()

xenbus_thread()
msg->u.watch.handle->callback()
handle_vcpu_hotplug_event()
vcpu_hotplug()
cpu_down()
__cpu_notify(CPU_DOWN_PREPARE..)
cpufreq_cpu_callback()
__cpufreq_remove_dev_prepare()
update_policy_cpu()
kobject_move()

Okay, where is the race or kobject_put() here? We are just moving
the kobject, and that has nothing to do with the refcount of the kobject.

Why do you see it as a race?
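For context, the lookup in Thread A follows roughly this pattern; a simplified sketch of cpufreq_cpu_get() from that era, with error handling trimmed (details differ between kernel versions):

```c
/* Simplified sketch of cpufreq_cpu_get(); not a verbatim copy. */
struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu)
{
	struct cpufreq_policy *policy = NULL;
	unsigned long flags;

	if (cpu >= nr_cpu_ids)
		return NULL;

	read_lock_irqsave(&cpufreq_driver_lock, flags);

	if (cpufreq_driver) {
		/* Look up the per-CPU policy pointer and take a reference. */
		policy = per_cpu(cpufreq_cpu_data, cpu);
		if (policy)
			kobject_get(&policy->kobj);
	}

	read_unlock_irqrestore(&cpufreq_driver_lock, flags);

	return policy;
}
```

The per-CPU pointer is read under cpufreq_driver_lock, and the reference is taken before the lock is dropped.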
On 30 January 2015 at 07:51, ethan zhao <ethan.zhao@oracle.com> wrote:
>> My reasoning for why your observation doesn't fit here:
>>
>> Copying from your earlier mail..
>>
>> Thread A: Workqueue: kacpi_notify
>>
>> acpi_processor_notify()
>> acpi_processor_ppc_has_changed()
>> cpufreq_update_policy()
>> cpufreq_cpu_get()
>> kobject_get()
>>
>> This tries to increment the count, and the warning you have mentioned
>> happens because of:
>>
>> WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2);
>>
>> i.e. even after incrementing the count, it is < 2, which I believe means
>> it will be 1. That means we have tried to do kobject_get() on a kobject
>> for which kobject_put() has already been done.
>>
>> Thread B: xenbus_thread()
>>
>> xenbus_thread()
>> msg->u.watch.handle->callback()
>> handle_vcpu_hotplug_event()
>> vcpu_hotplug()
>> cpu_down()
>> __cpu_notify(CPU_DOWN_PREPARE..)
>> cpufreq_cpu_callback()
>> __cpufreq_remove_dev_prepare()
>> update_policy_cpu()
>> kobject_move()
>>
>> Okay, where is the race or kobject_put() here? We are just moving
>> the kobject, and that has nothing to do with the refcount of the kobject.
>>
>> Why do you see it as a race?
>
> I mean that policy->cpu has been changed, that CPU is about to be taken down,
> and Thread A continues to get and update the policy for it blindly. That is
> what I call a 'race', not the refcount itself.

First of all, the WARN you had in your patch doesn't have anything to do
with the so-called race you just defined. It happens for the reason I
explained earlier.

Second, what if policy->cpu is getting updated? A policy manages a group of
CPUs, not a single CPU. There are still other CPUs online for that policy,
so kobject_get() on that policy->kobj is perfectly valid.
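To make the "group of CPUs" point concrete, here is a trimmed view of struct cpufreq_policy, keeping only the fields relevant to this thread; the real structure has many more members and the exact layout varies by kernel version:

```c
/* Trimmed sketch of struct cpufreq_policy; not the full definition. */
struct cpufreq_policy {
	cpumask_var_t		cpus;		/* online CPUs currently covered by this policy */
	cpumask_var_t		related_cpus;	/* online + offline CPUs sharing the clock line */
	unsigned int		cpu;		/* CPU currently managing this policy */
	struct kobject		kobj;		/* refcount that controls the policy's lifetime */
	struct rw_semaphore	rwsem;		/* serialises updates to the policy */
	/* ... many more members ... */
};
```

So updating policy->cpu when its current manager goes down only hands management over to another CPU in policy->cpus; it does not drop the kobject reference.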
On 30 January 2015 at 09:16, ethan zhao <ethan.zhao@oracle.com> wrote:
> You mean the policy is shared by all CPUs, so a PPC notification about one

A policy may or may not be shared by multiple CPUs. It all depends on your
system's configuration. All CPUs which share a clock line share a policy.
You can verify that from /sys/devices/system/cpu/cpu*/cpufreq/related_cpus,
which gives the CPUs that share a policy.

> CPU should update all CPUs' policy, right? Even if the requested CPU is
> shutting down.

CPUs sharing a policy have a single policy structure, and so the policy
would be updated for all CPUs that share it. Even if some CPU goes down,
the policy might still be up and running.

--
viresh
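A quick way to see this on a running system is to read that sysfs file for every CPU. A small, hypothetical user-space helper (standard cpufreq sysfs layout assumed):

```c
/* Hypothetical helper: show which CPUs share a cpufreq policy by
 * printing each CPU's related_cpus mask from sysfs. */
#include <stdio.h>

int main(void)
{
	char path[128], buf[256];

	for (int cpu = 0; cpu < 256; cpu++) {
		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%d/cpufreq/related_cpus",
			 cpu);
		FILE *f = fopen(path, "r");
		if (!f)
			continue;	/* CPU absent or cpufreq not registered */
		if (fgets(buf, sizeof(buf), f))
			printf("cpu%d shares a policy with: %s", cpu, buf);
		fclose(f);
	}
	return 0;
}
```

CPUs that print the same mask are served by one shared struct cpufreq_policy.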
On 31 January 2015 at 04:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> No, it isn't. cpufreq_cpu_data is a pointer and doesn't need any locks to
> protect it. What the lock does is to guarantee the callers of cpufreq_cpu_get()
> that the policy object won't go away from under them (it is used for some other
> purposes too, but they are unrelated).

Yeah, it's not just about locking. Otherwise, taking the lock only at the
bottom of this routine should have fixed it.

> That's almost correct. In addition to the above (albeit maybe unintentionally)
> the patch also fixes the possible race between two instances of
> __cpufreq_remove_dev_finish() with the same arguments running in parallel with
> each other. The proof is left as an exercise to the reader. :-)

Two instances of __cpufreq_remove_dev_finish() can't be called in parallel,
so it doesn't fix any race there :)

> Please try to improve the changelog ...

Yes.
On 2 February 2015 at 07:24, ethan zhao <ethan.zhao@oracle.com> wrote:
> The policy will be freed only when its last CPU is shutting down, so how does
> this happen while the system is booting?

I couldn't understand what you are asking here, though a policy will be
allocated for the first CPU of every policy during boot..

> The description of the patch would be wrong (the xen-bus call stack) --
> did xen-bus shut down every CPU down to the last one during booting?

I don't know about that..
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4473eba1d6b0..e3bf702b5588 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1409,9 +1409,10 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 	unsigned long flags;
 	struct cpufreq_policy *policy;
 
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, cpu);
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	per_cpu(cpufreq_cpu_data, cpu) = NULL;
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	if (!policy) {
 		pr_debug("%s: No cpu_data found\n", __func__);
@@ -1466,7 +1467,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 		}
 	}
 
-	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 
 	return 0;
 }
cpufreq_cpu_data is protected by cpufreq_driver_lock, and one of its users
has missed taking it. As a result we get this:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47 kobject_get+0x41/0x50()
Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon ...
Call Trace:
 [<ffffffff81661b14>] dump_stack+0x46/0x58
 [<ffffffff81072b61>] warn_slowpath_common+0x81/0xa0
 [<ffffffff81072c7a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff812e16d1>] kobject_get+0x41/0x50
 [<ffffffff815262a5>] cpufreq_cpu_get+0x75/0xc0
 [<ffffffff81527c3e>] cpufreq_update_policy+0x2e/0x1f0
 [<ffffffff810b8cb2>] ? up+0x32/0x50
 [<ffffffff81381aa9>] ? acpi_ns_get_node+0xcb/0xf2
 [<ffffffff81381efd>] ? acpi_evaluate_object+0x22c/0x252
 [<ffffffff813824f6>] ? acpi_get_handle+0x95/0xc0
 [<ffffffff81360967>] ? acpi_has_method+0x25/0x40
 [<ffffffff81391e08>] acpi_processor_ppc_has_changed+0x77/0x82
 [<ffffffff81089566>] ? move_linked_works+0x66/0x90
 [<ffffffff8138e8ed>] acpi_processor_notify+0x58/0xe7
 [<ffffffff8137410c>] acpi_ev_notify_dispatch+0x44/0x5c
 [<ffffffff8135f293>] acpi_os_execute_deferred+0x15/0x22
 [<ffffffff8108c910>] process_one_work+0x160/0x410
 [<ffffffff8108d05b>] worker_thread+0x11b/0x520
 [<ffffffff8108cf40>] ? rescuer_thread+0x380/0x380
 [<ffffffff81092421>] kthread+0xe1/0x100
 [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
 [<ffffffff81669ebc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
---[ end trace 89e66eb9795efdf7 ]---

And here is the race:

Thread A: Workqueue: kacpi_notify

acpi_processor_notify()
acpi_processor_ppc_has_changed()
cpufreq_update_policy()
cpufreq_cpu_get()
kobject_get()

Thread B: xenbus_thread()

xenbus_thread()
msg->u.watch.handle->callback()
handle_vcpu_hotplug_event()
vcpu_hotplug()
cpu_down()
__cpu_notify(CPU_POST_DEAD..)
cpufreq_cpu_callback()
__cpufreq_remove_dev_finish()
cpufreq_policy_put_kobj()
kobject_put()

cpufreq_cpu_get() reads the policy from the per-cpu variable
cpufreq_cpu_data under cpufreq_driver_lock, and once it gets a valid
policy it expects it not to be freed until cpufreq_cpu_put() is called.
The race happens because the other thread puts the kobject first and
clears cpufreq_cpu_data only later, and without these locks. So the first
thread finds a valid-looking policy structure, and before it can do
kobject_get() on it, the second one does the final kobject_put(). Hence
this WARN().

Fix this by setting cpufreq_cpu_data to NULL before putting the kobject,
and do it under the lock.

Cc: <stable@vger.kernel.org> # 3.12+
Reported-by: Ethan Zhao <ethan.zhao@oracle.com>
Reported-and-tested-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
@Santosh: I have changed the read locks to write locks here, so you will
need to test again.

 drivers/cpufreq/cpufreq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
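To illustrate the ordering problem outside the kernel, here is a minimal user-space model of the race (hypothetical code, not part of the patch): a shared pointer plays the role of per_cpu(cpufreq_cpu_data, cpu), a mutex plays the role of cpufreq_driver_lock, and the teardown drops the last reference before clearing the pointer, which is exactly the window the patch closes.

```c
/* Hypothetical user-space model of the race; not kernel code.
 * The broken teardown drops the last reference *before* clearing the
 * shared pointer, so a concurrent "get" can still find the object and
 * trip a refcount-below-2 check, just like kref's WARN_ON_ONCE(). */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct policy {
	atomic_int refcount;
};

static struct policy *shared;	/* models per_cpu(cpufreq_cpu_data, cpu) */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;	/* models cpufreq_driver_lock */

static struct policy *policy_get(void)	/* models cpufreq_cpu_get() */
{
	struct policy *p;

	pthread_mutex_lock(&lock);
	p = shared;
	if (p && atomic_fetch_add(&p->refcount, 1) + 1 < 2)
		fprintf(stderr, "WARN: get on an already-released object\n");
	pthread_mutex_unlock(&lock);
	return p;
}

static void *reader(void *arg)
{
	(void)arg;
	policy_get();
	return NULL;
}

int main(void)
{
	struct policy *p = calloc(1, sizeof(*p));
	pthread_t t;

	atomic_store(&p->refcount, 1);
	shared = p;

	pthread_create(&t, NULL, reader, NULL);

	/* Broken ordering: last put first, pointer cleared only later. */
	atomic_fetch_sub(&p->refcount, 1);	/* models the final kobject_put() */
	pthread_mutex_lock(&lock);
	shared = NULL;				/* too late: reader may already hold p */
	pthread_mutex_unlock(&lock);

	pthread_join(t, NULL);
	free(p);
	return 0;
}
```

With the fixed ordering, i.e. clearing the shared pointer under the lock first and only then dropping the reference, the reader can no longer observe a pointer to an object whose refcount has already hit zero.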