Message ID | 0f6af89aa9093884d3668962ebc62383d8a209ec.1477563459.git.viresh.kumar@linaro.org |
---|---|
State | New |
Headers | show |
On Thu, 27 Oct 2016, Viresh Kumar wrote: > The kernel WARNs and then crashes today if wm8994_device_init() fails > after calling devm_regulator_bulk_get(). > > That happens because there are multiple devices involved here and the > order in which managed resources are freed isn't correct. > > The regulators are added as children of wm8994->dev. Whereas, > devm_regulator_bulk_get() receives wm8994->dev as the device, though it > gets the same regulators which were added as children of wm8994->dev > earlier. > > During failures, the children are removed first and the core eventually > calls regulator_unregister() for them. As regulator_put() was never done > for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at > > WARN_ON(rdev->open_count); > > And eventually it crashes from debugfs_remove_recursive(). Applied, thanks. > --------x------------------x---------------- > > wm8994 3-001a: Device is not a WM8994, ID is 0 > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0 > Modules linked in: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > [<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14) > [<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c) > [<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100) > [<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28) > [<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0) > [<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc) > [<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110) > [<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28) > [<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104) > [<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218) > [<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88) > [<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20) > [<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64) > [<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78) > [<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30) > [<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0) > [<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc) > [<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0) > [<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0) > [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c) > [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218) > [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8) > [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84) > [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170) > [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc) > [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114) > [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c) > ---[ end trace 0919d3d0bc998260 ]--- > > [snip..] > > Unable to handle kernel NULL pointer dereference at virtual address 00000078 > pgd = c0004000 > [00000078] *pgd=00000000 > Internal error: Oops: 5 [#1] PREEMPT SMP ARM > Modules linked in: > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.8.0-rc6-00154-g54fe84cbd50b #41 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > task: ee874000 task.stack: ee878000 > PC is at down_write+0x14/0x54 > LR is at debugfs_remove_recursive+0x30/0x150 > > [snip..] > > [<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150) > [<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac) > [<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c) > [<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc) > [<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0) > [<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0) > [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c) > [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218) > [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8) > [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84) > [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170) > [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc) > [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114) > [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c) > Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f) > ---[ end trace 0919d3d0bc998262 ]--- > > --------x------------------x---------------- > > Fix the kernel warnings and crashes by using regulator_bulk_get() > instead of devm_regulator_bulk_get() and explicitly freeing the supplies > in exit paths. > > Tested on Exynos 5250, dual core ARM A15 machine. > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> > Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> > --- > V3->V4: > - Send the right version of the patch, sent the older one by mistake earlier. > > V2->V3: > - Fixed a rebase conflict > - sending only one patch with Charles Ack > > V1->V2: > - Use regulator_bulk_free() instead of open coding it. > - Shorter backtrace > - Reworded the last paragraph to make it more clear > - Added a comment in code > > drivers/mfd/wm8994-core.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/drivers/mfd/wm8994-core.c b/drivers/mfd/wm8994-core.c > index 95e6bc55adbb..953d0790ffd5 100644 > --- a/drivers/mfd/wm8994-core.c > +++ b/drivers/mfd/wm8994-core.c > @@ -393,8 +393,13 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) > BUG(); > goto err; > } > - > - ret = devm_regulator_bulk_get(wm8994->dev, wm8994->num_supplies, > + > + /* > + * Can't use devres helper here as some of the supplies are provided by > + * wm8994->dev's children (regulators) and those regulators are > + * unregistered by the devres core before the supplies are freed. > + */ > + ret = regulator_bulk_get(wm8994->dev, wm8994->num_supplies, > wm8994->supplies); > if (ret != 0) { > dev_err(wm8994->dev, "Failed to get supplies: %d\n", ret); > @@ -404,7 +409,7 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) > ret = regulator_bulk_enable(wm8994->num_supplies, wm8994->supplies); > if (ret != 0) { > dev_err(wm8994->dev, "Failed to enable supplies: %d\n", ret); > - goto err; > + goto err_regulator_free; > } > > ret = wm8994_reg_read(wm8994, WM8994_SOFTWARE_RESET); > @@ -595,6 +600,8 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) > err_enable: > regulator_bulk_disable(wm8994->num_supplies, > wm8994->supplies); > +err_regulator_free: > + regulator_bulk_free(wm8994->num_supplies, wm8994->supplies); > err: > mfd_remove_devices(wm8994->dev); > return ret; > @@ -605,6 +612,7 @@ static void wm8994_device_exit(struct wm8994 *wm8994) > pm_runtime_disable(wm8994->dev); > wm8994_irq_exit(wm8994); > regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies); > + regulator_bulk_free(wm8994->num_supplies, wm8994->supplies); > mfd_remove_devices(wm8994->dev); > } > -- Lee Jones Linaro STMicroelectronics Landing Team Lead Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog
+ Jaehoon On 10 November 2016 at 09:39, Lee Jones <lee.jones@linaro.org> wrote: > On Thu, 27 Oct 2016, Viresh Kumar wrote: > >> The kernel WARNs and then crashes today if wm8994_device_init() fails >> after calling devm_regulator_bulk_get(). >> >> That happens because there are multiple devices involved here and the >> order in which managed resources are freed isn't correct. >> >> The regulators are added as children of wm8994->dev. Whereas, >> devm_regulator_bulk_get() receives wm8994->dev as the device, though it >> gets the same regulators which were added as children of wm8994->dev >> earlier. >> >> During failures, the children are removed first and the core eventually >> calls regulator_unregister() for them. As regulator_put() was never done >> for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at >> >> WARN_ON(rdev->open_count); >> >> And eventually it crashes from debugfs_remove_recursive(). > > Applied, thanks. Lee, did you intend to send this a fix for 4.9 rc? I would be nice as kernelci are reporting this issue (assuming it's the same). https://kernelci.org/boot/all/job/ulfh/kernel/v4.9-rc5-78-g52eb5d4531e4/ Kind regards Uffe > >> --------x------------------x---------------- >> >> wm8994 3-001a: Device is not a WM8994, ID is 0 >> ------------[ cut here ]------------ >> WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0 >> Modules linked in: >> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41 >> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >> [<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14) >> [<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c) >> [<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100) >> [<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28) >> [<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0) >> [<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc) >> [<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110) >> [<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28) >> [<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104) >> [<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218) >> [<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88) >> [<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20) >> [<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64) >> [<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78) >> [<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30) >> [<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0) >> [<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc) >> [<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0) >> [<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0) >> [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c) >> [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218) >> [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8) >> [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84) >> [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170) >> [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc) >> [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114) >> [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c) >> ---[ end trace 0919d3d0bc998260 ]--- >> >> [snip..] >> >> Unable to handle kernel NULL pointer dereference at virtual address 00000078 >> pgd = c0004000 >> [00000078] *pgd=00000000 >> Internal error: Oops: 5 [#1] PREEMPT SMP ARM >> Modules linked in: >> CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.8.0-rc6-00154-g54fe84cbd50b #41 >> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >> task: ee874000 task.stack: ee878000 >> PC is at down_write+0x14/0x54 >> LR is at debugfs_remove_recursive+0x30/0x150 >> >> [snip..] >> >> [<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150) >> [<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac) >> [<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c) >> [<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc) >> [<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0) >> [<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0) >> [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c) >> [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218) >> [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8) >> [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84) >> [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170) >> [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc) >> [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114) >> [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c) >> Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f) >> ---[ end trace 0919d3d0bc998262 ]--- >> >> --------x------------------x---------------- >> >> Fix the kernel warnings and crashes by using regulator_bulk_get() >> instead of devm_regulator_bulk_get() and explicitly freeing the supplies >> in exit paths. >> >> Tested on Exynos 5250, dual core ARM A15 machine. >> >> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> >> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> >> --- >> V3->V4: >> - Send the right version of the patch, sent the older one by mistake earlier. >> >> V2->V3: >> - Fixed a rebase conflict >> - sending only one patch with Charles Ack >> >> V1->V2: >> - Use regulator_bulk_free() instead of open coding it. >> - Shorter backtrace >> - Reworded the last paragraph to make it more clear >> - Added a comment in code >> >> drivers/mfd/wm8994-core.c | 14 +++++++++++--- >> 1 file changed, 11 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/mfd/wm8994-core.c b/drivers/mfd/wm8994-core.c >> index 95e6bc55adbb..953d0790ffd5 100644 >> --- a/drivers/mfd/wm8994-core.c >> +++ b/drivers/mfd/wm8994-core.c >> @@ -393,8 +393,13 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) >> BUG(); >> goto err; >> } >> - >> - ret = devm_regulator_bulk_get(wm8994->dev, wm8994->num_supplies, >> + >> + /* >> + * Can't use devres helper here as some of the supplies are provided by >> + * wm8994->dev's children (regulators) and those regulators are >> + * unregistered by the devres core before the supplies are freed. >> + */ >> + ret = regulator_bulk_get(wm8994->dev, wm8994->num_supplies, >> wm8994->supplies); >> if (ret != 0) { >> dev_err(wm8994->dev, "Failed to get supplies: %d\n", ret); >> @@ -404,7 +409,7 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) >> ret = regulator_bulk_enable(wm8994->num_supplies, wm8994->supplies); >> if (ret != 0) { >> dev_err(wm8994->dev, "Failed to enable supplies: %d\n", ret); >> - goto err; >> + goto err_regulator_free; >> } >> >> ret = wm8994_reg_read(wm8994, WM8994_SOFTWARE_RESET); >> @@ -595,6 +600,8 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) >> err_enable: >> regulator_bulk_disable(wm8994->num_supplies, >> wm8994->supplies); >> +err_regulator_free: >> + regulator_bulk_free(wm8994->num_supplies, wm8994->supplies); >> err: >> mfd_remove_devices(wm8994->dev); >> return ret; >> @@ -605,6 +612,7 @@ static void wm8994_device_exit(struct wm8994 *wm8994) >> pm_runtime_disable(wm8994->dev); >> wm8994_irq_exit(wm8994); >> regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies); >> + regulator_bulk_free(wm8994->num_supplies, wm8994->supplies); >> mfd_remove_devices(wm8994->dev); >> } >> > > -- > Lee Jones > Linaro STMicroelectronics Landing Team Lead > Linaro.org │ Open source software for ARM SoCs > Follow Linaro: Facebook | Twitter | Blog > _______________________________________________ > linaro-kernel mailing list > linaro-kernel@lists.linaro.org > https://lists.linaro.org/mailman/listinfo/linaro-kernel
On Thu, 27 Oct 2016, Viresh Kumar wrote: > The kernel WARNs and then crashes today if wm8994_device_init() fails > after calling devm_regulator_bulk_get(). > > That happens because there are multiple devices involved here and the > order in which managed resources are freed isn't correct. > > The regulators are added as children of wm8994->dev. Whereas, > devm_regulator_bulk_get() receives wm8994->dev as the device, though it > gets the same regulators which were added as children of wm8994->dev > earlier. > > During failures, the children are removed first and the core eventually > calls regulator_unregister() for them. As regulator_put() was never done > for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at > > WARN_ON(rdev->open_count); > > And eventually it crashes from debugfs_remove_recursive(). Is ... mfd: wm8994-core: disable regulators before removing them ... required as well, or is that separate? > --------x------------------x---------------- > > wm8994 3-001a: Device is not a WM8994, ID is 0 > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0 > Modules linked in: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > [<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14) > [<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c) > [<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100) > [<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28) > [<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0) > [<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc) > [<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110) > [<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28) > [<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104) > [<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218) > [<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88) > [<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20) > [<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64) > [<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78) > [<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30) > [<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0) > [<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc) > [<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0) > [<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0) > [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c) > [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218) > [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8) > [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84) > [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170) > [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc) > [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114) > [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c) > ---[ end trace 0919d3d0bc998260 ]--- > > [snip..] > > Unable to handle kernel NULL pointer dereference at virtual address 00000078 > pgd = c0004000 > [00000078] *pgd=00000000 > Internal error: Oops: 5 [#1] PREEMPT SMP ARM > Modules linked in: > CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.8.0-rc6-00154-g54fe84cbd50b #41 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > task: ee874000 task.stack: ee878000 > PC is at down_write+0x14/0x54 > LR is at debugfs_remove_recursive+0x30/0x150 > > [snip..] > > [<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150) > [<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac) > [<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c) > [<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc) > [<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0) > [<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0) > [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c) > [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218) > [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8) > [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84) > [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170) > [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc) > [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114) > [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c) > Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f) > ---[ end trace 0919d3d0bc998262 ]--- > > --------x------------------x---------------- > > Fix the kernel warnings and crashes by using regulator_bulk_get() > instead of devm_regulator_bulk_get() and explicitly freeing the supplies > in exit paths. > > Tested on Exynos 5250, dual core ARM A15 machine. > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> > Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> > --- > V3->V4: > - Send the right version of the patch, sent the older one by mistake earlier. > > V2->V3: > - Fixed a rebase conflict > - sending only one patch with Charles Ack > > V1->V2: > - Use regulator_bulk_free() instead of open coding it. > - Shorter backtrace > - Reworded the last paragraph to make it more clear > - Added a comment in code > > drivers/mfd/wm8994-core.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/drivers/mfd/wm8994-core.c b/drivers/mfd/wm8994-core.c > index 95e6bc55adbb..953d0790ffd5 100644 > --- a/drivers/mfd/wm8994-core.c > +++ b/drivers/mfd/wm8994-core.c > @@ -393,8 +393,13 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) > BUG(); > goto err; > } > - > - ret = devm_regulator_bulk_get(wm8994->dev, wm8994->num_supplies, > + > + /* > + * Can't use devres helper here as some of the supplies are provided by > + * wm8994->dev's children (regulators) and those regulators are > + * unregistered by the devres core before the supplies are freed. > + */ > + ret = regulator_bulk_get(wm8994->dev, wm8994->num_supplies, > wm8994->supplies); > if (ret != 0) { > dev_err(wm8994->dev, "Failed to get supplies: %d\n", ret); > @@ -404,7 +409,7 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) > ret = regulator_bulk_enable(wm8994->num_supplies, wm8994->supplies); > if (ret != 0) { > dev_err(wm8994->dev, "Failed to enable supplies: %d\n", ret); > - goto err; > + goto err_regulator_free; > } > > ret = wm8994_reg_read(wm8994, WM8994_SOFTWARE_RESET); > @@ -595,6 +600,8 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) > err_enable: > regulator_bulk_disable(wm8994->num_supplies, > wm8994->supplies); > +err_regulator_free: > + regulator_bulk_free(wm8994->num_supplies, wm8994->supplies); > err: > mfd_remove_devices(wm8994->dev); > return ret; > @@ -605,6 +612,7 @@ static void wm8994_device_exit(struct wm8994 *wm8994) > pm_runtime_disable(wm8994->dev); > wm8994_irq_exit(wm8994); > regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies); > + regulator_bulk_free(wm8994->num_supplies, wm8994->supplies); > mfd_remove_devices(wm8994->dev); > } > -- Lee Jones Linaro STMicroelectronics Landing Team Lead Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog
On 25-11-16, 10:14, Lee Jones wrote: > On Thu, 27 Oct 2016, Viresh Kumar wrote: > > > The kernel WARNs and then crashes today if wm8994_device_init() fails > > after calling devm_regulator_bulk_get(). > > > > That happens because there are multiple devices involved here and the > > order in which managed resources are freed isn't correct. > > > > The regulators are added as children of wm8994->dev. Whereas, > > devm_regulator_bulk_get() receives wm8994->dev as the device, though it > > gets the same regulators which were added as children of wm8994->dev > > earlier. > > > > During failures, the children are removed first and the core eventually > > calls regulator_unregister() for them. As regulator_put() was never done > > for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at > > > > WARN_ON(rdev->open_count); > > > > And eventually it crashes from debugfs_remove_recursive(). > > Is ... > > mfd: wm8994-core: disable regulators before removing them > > ... required as well, or is that separate? It would be better if we get that too. Anyway, the $subject patch has a dependency on it.. -- viresh
On 25-11-16, 10:41, Lee Jones wrote: > On Fri, 25 Nov 2016, Viresh Kumar wrote: > > > On 25-11-16, 10:14, Lee Jones wrote: > > > On Thu, 27 Oct 2016, Viresh Kumar wrote: > > > > > > > The kernel WARNs and then crashes today if wm8994_device_init() fails > > > > after calling devm_regulator_bulk_get(). > > > > > > > > That happens because there are multiple devices involved here and the > > > > order in which managed resources are freed isn't correct. > > > > > > > > The regulators are added as children of wm8994->dev. Whereas, > > > > devm_regulator_bulk_get() receives wm8994->dev as the device, though it > > > > gets the same regulators which were added as children of wm8994->dev > > > > earlier. > > > > > > > > During failures, the children are removed first and the core eventually > > > > calls regulator_unregister() for them. As regulator_put() was never done > > > > for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at > > > > > > > > WARN_ON(rdev->open_count); > > > > > > > > And eventually it crashes from debugfs_remove_recursive(). > > > > > > Is ... > > > > > > mfd: wm8994-core: disable regulators before removing them > > > > > > ... required as well, or is that separate? > > > > It would be better if we get that too. Anyway, the $subject patch has a > > dependency on it.. > > Which is ... ? I meant rebase dependency, nothing else. -- viresh
On Fri, 25 Nov 2016, Viresh Kumar wrote: > On 25-11-16, 10:14, Lee Jones wrote: > > On Thu, 27 Oct 2016, Viresh Kumar wrote: > > > > > The kernel WARNs and then crashes today if wm8994_device_init() fails > > > after calling devm_regulator_bulk_get(). > > > > > > That happens because there are multiple devices involved here and the > > > order in which managed resources are freed isn't correct. > > > > > > The regulators are added as children of wm8994->dev. Whereas, > > > devm_regulator_bulk_get() receives wm8994->dev as the device, though it > > > gets the same regulators which were added as children of wm8994->dev > > > earlier. > > > > > > During failures, the children are removed first and the core eventually > > > calls regulator_unregister() for them. As regulator_put() was never done > > > for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at > > > > > > WARN_ON(rdev->open_count); > > > > > > And eventually it crashes from debugfs_remove_recursive(). > > > > Is ... > > > > mfd: wm8994-core: disable regulators before removing them > > > > ... required as well, or is that separate? > > It would be better if we get that too. Anyway, the $subject patch has a > dependency on it.. Which is ... ? -- Lee Jones Linaro STMicroelectronics Landing Team Lead Linaro.org │ Open source software for ARM SoCs Follow Linaro: Facebook | Twitter | Blog
diff --git a/drivers/mfd/wm8994-core.c b/drivers/mfd/wm8994-core.c index 95e6bc55adbb..953d0790ffd5 100644 --- a/drivers/mfd/wm8994-core.c +++ b/drivers/mfd/wm8994-core.c @@ -393,8 +393,13 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) BUG(); goto err; } - - ret = devm_regulator_bulk_get(wm8994->dev, wm8994->num_supplies, + + /* + * Can't use devres helper here as some of the supplies are provided by + * wm8994->dev's children (regulators) and those regulators are + * unregistered by the devres core before the supplies are freed. + */ + ret = regulator_bulk_get(wm8994->dev, wm8994->num_supplies, wm8994->supplies); if (ret != 0) { dev_err(wm8994->dev, "Failed to get supplies: %d\n", ret); @@ -404,7 +409,7 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) ret = regulator_bulk_enable(wm8994->num_supplies, wm8994->supplies); if (ret != 0) { dev_err(wm8994->dev, "Failed to enable supplies: %d\n", ret); - goto err; + goto err_regulator_free; } ret = wm8994_reg_read(wm8994, WM8994_SOFTWARE_RESET); @@ -595,6 +600,8 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq) err_enable: regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies); +err_regulator_free: + regulator_bulk_free(wm8994->num_supplies, wm8994->supplies); err: mfd_remove_devices(wm8994->dev); return ret; @@ -605,6 +612,7 @@ static void wm8994_device_exit(struct wm8994 *wm8994) pm_runtime_disable(wm8994->dev); wm8994_irq_exit(wm8994); regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies); + regulator_bulk_free(wm8994->num_supplies, wm8994->supplies); mfd_remove_devices(wm8994->dev); }