Message ID | 1415261798-9671-1-git-send-email-wangyijing@huawei.com |
---|---|
State | New |
Headers | show |
Maybe "fix glue dir race condition by not removing them" is a better title? On Thu, Nov 06, 2014 at 04:16:38PM +0800, Yijing Wang wrote: > There is a race condition when removing glue directory. > It can be reproduced in following test: > > path 1: Add first child device > device_add() > get_device_parent() > /*find parent from glue_dirs.list*/ > list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry) > if (k->parent == parent_kobj) { > kobj = kobject_get(k); > break; > } > .... > class_dir_create_and_add() > > path2: Remove last child device under glue dir > device_del() > cleanup_device_parent() > cleanup_glue_dir() > kobject_put(glue_dir); > > If path2 has been called cleanup_glue_dir(), but not > call kobject_put(glue_dir), the glue dir is still > in parent's kset list. Meanwhile, path1 find the glue > dir from the glue_dirs.list. Path2 may release glue dir > before path1 call kobject_get(). So kernel will report > the warning and bug_on. > > This fix keep glue dir around once it created suggested > by Tejun Heo. I think you prolly want to explain why this is okay / desired. e.g. list how the glue dir is used and how many of them are there and explain that there's no real benefit in removing them. ... > Signed-off-by: Yijing Wang <wangyijing@huawei.com> > Signed-off-by: Weng Meiling <wengmeiling.weng@huawei.com> > Cc: <stable@vger.kernel.org> #3.4+ Except for the above nits. Reviewed-by: Tejun Heo <tj@kernel.org>
On Thu, Nov 06, 2014 at 11:55:47AM -0500, Tejun Heo wrote: > Maybe "fix glue dir race condition by not removing them" is a better > title? > > On Thu, Nov 06, 2014 at 04:16:38PM +0800, Yijing Wang wrote: > > There is a race condition when removing glue directory. > > It can be reproduced in following test: > > > > path 1: Add first child device > > device_add() > > get_device_parent() > > /*find parent from glue_dirs.list*/ > > list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry) > > if (k->parent == parent_kobj) { > > kobj = kobject_get(k); > > break; > > } > > .... > > class_dir_create_and_add() > > > > path2: Remove last child device under glue dir > > device_del() > > cleanup_device_parent() > > cleanup_glue_dir() > > kobject_put(glue_dir); > > > > If path2 has been called cleanup_glue_dir(), but not > > call kobject_put(glue_dir), the glue dir is still > > in parent's kset list. Meanwhile, path1 find the glue > > dir from the glue_dirs.list. Path2 may release glue dir > > before path1 call kobject_get(). So kernel will report > > the warning and bug_on. > > > > This fix keep glue dir around once it created suggested > > by Tejun Heo. > > I think you prolly want to explain why this is okay / desired. > e.g. list how the glue dir is used and how many of them are there and > explain that there's no real benefit in removing them. I'd really _like_ to remove them if at all possible, as if there isn't any "children" in the subdirectory, there shouldn't be a need for that directory to be there. This seems to be the "classic" problem we have of a kref in a list that can be found while the last instance could be removed at the same time. I hate to just throw another lock at the problem, but wouldn't a lock to protect the list of glue_dirs be the answer here? thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On 2014/11/7 0:55, Tejun Heo wrote: > Maybe "fix glue dir race condition by not removing them" is a better > title? Yes, it's better, thank you! > > On Thu, Nov 06, 2014 at 04:16:38PM +0800, Yijing Wang wrote: >> There is a race condition when removing glue directory. >> It can be reproduced in following test: >> >> path 1: Add first child device >> device_add() >> get_device_parent() >> /*find parent from glue_dirs.list*/ >> list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry) >> if (k->parent == parent_kobj) { >> kobj = kobject_get(k); >> break; >> } >> .... >> class_dir_create_and_add() >> >> path2: Remove last child device under glue dir >> device_del() >> cleanup_device_parent() >> cleanup_glue_dir() >> kobject_put(glue_dir); >> >> If path2 has been called cleanup_glue_dir(), but not >> call kobject_put(glue_dir), the glue dir is still >> in parent's kset list. Meanwhile, path1 find the glue >> dir from the glue_dirs.list. Path2 may release glue dir >> before path1 call kobject_get(). So kernel will report >> the warning and bug_on. >> >> This fix keep glue dir around once it created suggested >> by Tejun Heo. > > I think you prolly want to explain why this is okay / desired. > e.g. list how the glue dir is used and how many of them are there and > explain that there's no real benefit in removing them. > Right, I will add the explanation. :) > ... >> Signed-off-by: Yijing Wang <wangyijing@huawei.com> >> Signed-off-by: Weng Meiling <wengmeiling.weng@huawei.com> >> Cc: <stable@vger.kernel.org> #3.4+ > > Except for the above nits. > > Reviewed-by: Tejun Heo <tj@kernel.org> Thanks! >
diff --git a/drivers/base/core.c b/drivers/base/core.c index 14d1629..5b7a3e9 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -761,7 +761,7 @@ static struct kobject *get_device_parent(struct device *dev, spin_lock(&dev->class->p->glue_dirs.list_lock); list_for_each_entry(k, &dev->class->p->glue_dirs.list, entry) if (k->parent == parent_kobj) { - kobj = kobject_get(k); + kobj = k; break; } spin_unlock(&dev->class->p->glue_dirs.list_lock); @@ -786,21 +786,6 @@ static struct kobject *get_device_parent(struct device *dev, return NULL; } -static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir) -{ - /* see if we live in a "glue" directory */ - if (!glue_dir || !dev->class || - glue_dir->kset != &dev->class->p->glue_dirs) - return; - - kobject_put(glue_dir); -} - -static void cleanup_device_parent(struct device *dev) -{ - cleanup_glue_dir(dev, dev->kobj.parent); -} - static int device_add_class_symlinks(struct device *dev) { int error; @@ -1094,7 +1079,6 @@ done: kobject_uevent(&dev->kobj, KOBJ_REMOVE); kobject_del(&dev->kobj); Error: - cleanup_device_parent(dev); if (parent) put_device(parent); name_error: @@ -1215,7 +1199,6 @@ void device_del(struct device *dev) blocking_notifier_call_chain(&dev->bus->p->bus_notifier, BUS_NOTIFY_REMOVED_DEVICE, dev); kobject_uevent(&dev->kobj, KOBJ_REMOVE); - cleanup_device_parent(dev); kobject_del(&dev->kobj); put_device(parent); } @@ -1873,7 +1856,6 @@ int device_move(struct device *dev, struct device *new_parent, __func__, new_parent ? dev_name(new_parent) : "<NULL>"); error = kobject_move(&dev->kobj, new_parent_kobj); if (error) { - cleanup_glue_dir(dev, new_parent_kobj); put_device(new_parent); goto out; } @@ -1902,7 +1884,6 @@ int device_move(struct device *dev, struct device *new_parent, set_dev_node(dev, dev_to_node(old_parent)); } } - cleanup_glue_dir(dev, new_parent_kobj); put_device(new_parent); goto out; }