diff mbox

[V4] mfd: wm8994-core: Don't use managed regulator bulk get API

Message ID 0f6af89aa9093884d3668962ebc62383d8a209ec.1477563459.git.viresh.kumar@linaro.org
State New
Headers show

Commit Message

Viresh Kumar Oct. 27, 2016, 10:20 a.m. UTC
The kernel WARNs and then crashes today if wm8994_device_init() fails
after calling devm_regulator_bulk_get().

That happens because there are multiple devices involved here and the
order in which managed resources are freed isn't correct.

The regulators are added as children of wm8994->dev.  Whereas,
devm_regulator_bulk_get() receives wm8994->dev as the device, though it
gets the same regulators which were added as children of wm8994->dev
earlier.

During failures, the children are removed first and the core eventually
calls regulator_unregister() for them. As regulator_put() was never done
for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at

	WARN_ON(rdev->open_count);

And eventually it crashes from debugfs_remove_recursive().

--------x------------------x----------------

 wm8994 3-001a: Device is not a WM8994, ID is 0
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0
 Modules linked in:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41
 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
 [<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14)
 [<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c)
 [<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100)
 [<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28)
 [<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0)
 [<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc)
 [<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110)
 [<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28)
 [<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104)
 [<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218)
 [<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88)
 [<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20)
 [<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64)
 [<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78)
 [<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30)
 [<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0)
 [<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc)
 [<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0)
 [<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)
 [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)
 [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)
 [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)
 [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)
 [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)
 [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
 [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)
 [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)
 ---[ end trace 0919d3d0bc998260 ]---

 [snip..]

 Unable to handle kernel NULL pointer dereference at virtual address 00000078
 pgd = c0004000
 [00000078] *pgd=00000000
 Internal error: Oops: 5 [#1] PREEMPT SMP ARM
 Modules linked in:
 CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.8.0-rc6-00154-g54fe84cbd50b #41
 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
 task: ee874000 task.stack: ee878000
 PC is at down_write+0x14/0x54
 LR is at debugfs_remove_recursive+0x30/0x150

 [snip..]

 [<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150)
 [<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac)
 [<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c)
 [<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc)
 [<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0)
 [<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)
 [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)
 [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)
 [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)
 [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)
 [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)
 [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)
 [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)
 [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)
 Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f)
 ---[ end trace 0919d3d0bc998262 ]---

--------x------------------x----------------

Fix the kernel warnings and crashes by using regulator_bulk_get()
instead of devm_regulator_bulk_get() and explicitly freeing the supplies
in exit paths.

Tested on Exynos 5250, dual core ARM A15 machine.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>

---
V3->V4:
- Send the right version of the patch, sent the older one by mistake earlier.

V2->V3:
- Fixed a rebase conflict
- sending only one patch with Charles Ack

V1->V2:
- Use regulator_bulk_free() instead of open coding it.
- Shorter backtrace
- Reworded the last paragraph to make it more clear
- Added a comment in code

 drivers/mfd/wm8994-core.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

-- 
2.7.1.410.g6faf27b

Comments

Lee Jones Nov. 10, 2016, 8:39 a.m. UTC | #1
On Thu, 27 Oct 2016, Viresh Kumar wrote:

> The kernel WARNs and then crashes today if wm8994_device_init() fails

> after calling devm_regulator_bulk_get().

> 

> That happens because there are multiple devices involved here and the

> order in which managed resources are freed isn't correct.

> 

> The regulators are added as children of wm8994->dev.  Whereas,

> devm_regulator_bulk_get() receives wm8994->dev as the device, though it

> gets the same regulators which were added as children of wm8994->dev

> earlier.

> 

> During failures, the children are removed first and the core eventually

> calls regulator_unregister() for them. As regulator_put() was never done

> for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at

> 

> 	WARN_ON(rdev->open_count);

> 

> And eventually it crashes from debugfs_remove_recursive().


Applied, thanks.

> --------x------------------x----------------

> 

>  wm8994 3-001a: Device is not a WM8994, ID is 0

>  ------------[ cut here ]------------

>  WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0

>  Modules linked in:

>  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41

>  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)

>  [<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14)

>  [<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c)

>  [<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100)

>  [<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28)

>  [<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0)

>  [<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc)

>  [<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110)

>  [<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28)

>  [<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104)

>  [<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218)

>  [<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88)

>  [<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20)

>  [<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64)

>  [<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78)

>  [<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30)

>  [<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0)

>  [<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc)

>  [<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0)

>  [<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)

>  [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)

>  [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)

>  [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)

>  [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)

>  [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)

>  [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)

>  [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)

>  [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)

>  ---[ end trace 0919d3d0bc998260 ]---

> 

>  [snip..]

> 

>  Unable to handle kernel NULL pointer dereference at virtual address 00000078

>  pgd = c0004000

>  [00000078] *pgd=00000000

>  Internal error: Oops: 5 [#1] PREEMPT SMP ARM

>  Modules linked in:

>  CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.8.0-rc6-00154-g54fe84cbd50b #41

>  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)

>  task: ee874000 task.stack: ee878000

>  PC is at down_write+0x14/0x54

>  LR is at debugfs_remove_recursive+0x30/0x150

> 

>  [snip..]

> 

>  [<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150)

>  [<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac)

>  [<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c)

>  [<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc)

>  [<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0)

>  [<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)

>  [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)

>  [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)

>  [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)

>  [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)

>  [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)

>  [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)

>  [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)

>  [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)

>  Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f)

>  ---[ end trace 0919d3d0bc998262 ]---

> 

> --------x------------------x----------------

> 

> Fix the kernel warnings and crashes by using regulator_bulk_get()

> instead of devm_regulator_bulk_get() and explicitly freeing the supplies

> in exit paths.

> 

> Tested on Exynos 5250, dual core ARM A15 machine.

> 

> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>

> ---

> V3->V4:

> - Send the right version of the patch, sent the older one by mistake earlier.

> 

> V2->V3:

> - Fixed a rebase conflict

> - sending only one patch with Charles Ack

> 

> V1->V2:

> - Use regulator_bulk_free() instead of open coding it.

> - Shorter backtrace

> - Reworded the last paragraph to make it more clear

> - Added a comment in code

> 

>  drivers/mfd/wm8994-core.c | 14 +++++++++++---

>  1 file changed, 11 insertions(+), 3 deletions(-)

> 

> diff --git a/drivers/mfd/wm8994-core.c b/drivers/mfd/wm8994-core.c

> index 95e6bc55adbb..953d0790ffd5 100644

> --- a/drivers/mfd/wm8994-core.c

> +++ b/drivers/mfd/wm8994-core.c

> @@ -393,8 +393,13 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>  		BUG();

>  		goto err;

>  	}

> -		

> -	ret = devm_regulator_bulk_get(wm8994->dev, wm8994->num_supplies,

> +

> +	/*

> +	 * Can't use devres helper here as some of the supplies are provided by

> +	 * wm8994->dev's children (regulators) and those regulators are

> +	 * unregistered by the devres core before the supplies are freed.

> +	 */

> +	ret = regulator_bulk_get(wm8994->dev, wm8994->num_supplies,

>  				 wm8994->supplies);

>  	if (ret != 0) {

>  		dev_err(wm8994->dev, "Failed to get supplies: %d\n", ret);

> @@ -404,7 +409,7 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>  	ret = regulator_bulk_enable(wm8994->num_supplies, wm8994->supplies);

>  	if (ret != 0) {

>  		dev_err(wm8994->dev, "Failed to enable supplies: %d\n", ret);

> -		goto err;

> +		goto err_regulator_free;

>  	}

>  

>  	ret = wm8994_reg_read(wm8994, WM8994_SOFTWARE_RESET);

> @@ -595,6 +600,8 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>  err_enable:

>  	regulator_bulk_disable(wm8994->num_supplies,

>  			       wm8994->supplies);

> +err_regulator_free:

> +	regulator_bulk_free(wm8994->num_supplies, wm8994->supplies);

>  err:

>  	mfd_remove_devices(wm8994->dev);

>  	return ret;

> @@ -605,6 +612,7 @@ static void wm8994_device_exit(struct wm8994 *wm8994)

>  	pm_runtime_disable(wm8994->dev);

>  	wm8994_irq_exit(wm8994);

>  	regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies);

> +	regulator_bulk_free(wm8994->num_supplies, wm8994->supplies);

>  	mfd_remove_devices(wm8994->dev);

>  }

>  


-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
Ulf Hansson Nov. 24, 2016, 2:45 p.m. UTC | #2
+ Jaehoon

On 10 November 2016 at 09:39, Lee Jones <lee.jones@linaro.org> wrote:
> On Thu, 27 Oct 2016, Viresh Kumar wrote:

>

>> The kernel WARNs and then crashes today if wm8994_device_init() fails

>> after calling devm_regulator_bulk_get().

>>

>> That happens because there are multiple devices involved here and the

>> order in which managed resources are freed isn't correct.

>>

>> The regulators are added as children of wm8994->dev.  Whereas,

>> devm_regulator_bulk_get() receives wm8994->dev as the device, though it

>> gets the same regulators which were added as children of wm8994->dev

>> earlier.

>>

>> During failures, the children are removed first and the core eventually

>> calls regulator_unregister() for them. As regulator_put() was never done

>> for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at

>>

>>       WARN_ON(rdev->open_count);

>>

>> And eventually it crashes from debugfs_remove_recursive().

>

> Applied, thanks.


Lee, did you intend to send this a fix for 4.9 rc? I would be nice as
kernelci are reporting this issue (assuming it's the same).

https://kernelci.org/boot/all/job/ulfh/kernel/v4.9-rc5-78-g52eb5d4531e4/

Kind regards
Uffe

>

>> --------x------------------x----------------

>>

>>  wm8994 3-001a: Device is not a WM8994, ID is 0

>>  ------------[ cut here ]------------

>>  WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0

>>  Modules linked in:

>>  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41

>>  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)

>>  [<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14)

>>  [<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c)

>>  [<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100)

>>  [<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28)

>>  [<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0)

>>  [<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc)

>>  [<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110)

>>  [<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28)

>>  [<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104)

>>  [<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218)

>>  [<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88)

>>  [<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20)

>>  [<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64)

>>  [<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78)

>>  [<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30)

>>  [<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0)

>>  [<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc)

>>  [<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0)

>>  [<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)

>>  [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)

>>  [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)

>>  [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)

>>  [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)

>>  [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)

>>  [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)

>>  [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)

>>  [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)

>>  ---[ end trace 0919d3d0bc998260 ]---

>>

>>  [snip..]

>>

>>  Unable to handle kernel NULL pointer dereference at virtual address 00000078

>>  pgd = c0004000

>>  [00000078] *pgd=00000000

>>  Internal error: Oops: 5 [#1] PREEMPT SMP ARM

>>  Modules linked in:

>>  CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.8.0-rc6-00154-g54fe84cbd50b #41

>>  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)

>>  task: ee874000 task.stack: ee878000

>>  PC is at down_write+0x14/0x54

>>  LR is at debugfs_remove_recursive+0x30/0x150

>>

>>  [snip..]

>>

>>  [<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150)

>>  [<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac)

>>  [<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c)

>>  [<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc)

>>  [<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0)

>>  [<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)

>>  [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)

>>  [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)

>>  [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)

>>  [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)

>>  [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)

>>  [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)

>>  [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)

>>  [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)

>>  Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f)

>>  ---[ end trace 0919d3d0bc998262 ]---

>>

>> --------x------------------x----------------

>>

>> Fix the kernel warnings and crashes by using regulator_bulk_get()

>> instead of devm_regulator_bulk_get() and explicitly freeing the supplies

>> in exit paths.

>>

>> Tested on Exynos 5250, dual core ARM A15 machine.

>>

>> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

>> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>

>> ---

>> V3->V4:

>> - Send the right version of the patch, sent the older one by mistake earlier.

>>

>> V2->V3:

>> - Fixed a rebase conflict

>> - sending only one patch with Charles Ack

>>

>> V1->V2:

>> - Use regulator_bulk_free() instead of open coding it.

>> - Shorter backtrace

>> - Reworded the last paragraph to make it more clear

>> - Added a comment in code

>>

>>  drivers/mfd/wm8994-core.c | 14 +++++++++++---

>>  1 file changed, 11 insertions(+), 3 deletions(-)

>>

>> diff --git a/drivers/mfd/wm8994-core.c b/drivers/mfd/wm8994-core.c

>> index 95e6bc55adbb..953d0790ffd5 100644

>> --- a/drivers/mfd/wm8994-core.c

>> +++ b/drivers/mfd/wm8994-core.c

>> @@ -393,8 +393,13 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>>               BUG();

>>               goto err;

>>       }

>> -

>> -     ret = devm_regulator_bulk_get(wm8994->dev, wm8994->num_supplies,

>> +

>> +     /*

>> +      * Can't use devres helper here as some of the supplies are provided by

>> +      * wm8994->dev's children (regulators) and those regulators are

>> +      * unregistered by the devres core before the supplies are freed.

>> +      */

>> +     ret = regulator_bulk_get(wm8994->dev, wm8994->num_supplies,

>>                                wm8994->supplies);

>>       if (ret != 0) {

>>               dev_err(wm8994->dev, "Failed to get supplies: %d\n", ret);

>> @@ -404,7 +409,7 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>>       ret = regulator_bulk_enable(wm8994->num_supplies, wm8994->supplies);

>>       if (ret != 0) {

>>               dev_err(wm8994->dev, "Failed to enable supplies: %d\n", ret);

>> -             goto err;

>> +             goto err_regulator_free;

>>       }

>>

>>       ret = wm8994_reg_read(wm8994, WM8994_SOFTWARE_RESET);

>> @@ -595,6 +600,8 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>>  err_enable:

>>       regulator_bulk_disable(wm8994->num_supplies,

>>                              wm8994->supplies);

>> +err_regulator_free:

>> +     regulator_bulk_free(wm8994->num_supplies, wm8994->supplies);

>>  err:

>>       mfd_remove_devices(wm8994->dev);

>>       return ret;

>> @@ -605,6 +612,7 @@ static void wm8994_device_exit(struct wm8994 *wm8994)

>>       pm_runtime_disable(wm8994->dev);

>>       wm8994_irq_exit(wm8994);

>>       regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies);

>> +     regulator_bulk_free(wm8994->num_supplies, wm8994->supplies);

>>       mfd_remove_devices(wm8994->dev);

>>  }

>>

>

> --

> Lee Jones

> Linaro STMicroelectronics Landing Team Lead

> Linaro.org │ Open source software for ARM SoCs

> Follow Linaro: Facebook | Twitter | Blog

> _______________________________________________

> linaro-kernel mailing list

> linaro-kernel@lists.linaro.org

> https://lists.linaro.org/mailman/listinfo/linaro-kernel
Lee Jones Nov. 25, 2016, 10:14 a.m. UTC | #3
On Thu, 27 Oct 2016, Viresh Kumar wrote:

> The kernel WARNs and then crashes today if wm8994_device_init() fails

> after calling devm_regulator_bulk_get().

> 

> That happens because there are multiple devices involved here and the

> order in which managed resources are freed isn't correct.

> 

> The regulators are added as children of wm8994->dev.  Whereas,

> devm_regulator_bulk_get() receives wm8994->dev as the device, though it

> gets the same regulators which were added as children of wm8994->dev

> earlier.

> 

> During failures, the children are removed first and the core eventually

> calls regulator_unregister() for them. As regulator_put() was never done

> for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at

> 

> 	WARN_ON(rdev->open_count);

> 

> And eventually it crashes from debugfs_remove_recursive().


Is ...

   mfd: wm8994-core: disable regulators before removing them

... required as well, or is that separate?

> --------x------------------x----------------

> 

>  wm8994 3-001a: Device is not a WM8994, ID is 0

>  ------------[ cut here ]------------

>  WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0

>  Modules linked in:

>  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41

>  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)

>  [<c010e24c>] (unwind_backtrace) from [<c010af38>] (show_stack+0x10/0x14)

>  [<c010af38>] (show_stack) from [<c032a1c4>] (dump_stack+0x88/0x9c)

>  [<c032a1c4>] (dump_stack) from [<c011a98c>] (__warn+0xe8/0x100)

>  [<c011a98c>] (__warn) from [<c011aa54>] (warn_slowpath_null+0x20/0x28)

>  [<c011aa54>] (warn_slowpath_null) from [<c0384a0c>] (regulator_unregister+0xc8/0xd0)

>  [<c0384a0c>] (regulator_unregister) from [<c0406434>] (release_nodes+0x16c/0x1dc)

>  [<c0406434>] (release_nodes) from [<c04039c4>] (__device_release_driver+0x8c/0x110)

>  [<c04039c4>] (__device_release_driver) from [<c0403a64>] (device_release_driver+0x1c/0x28)

>  [<c0403a64>] (device_release_driver) from [<c0402b24>] (bus_remove_device+0xd8/0x104)

>  [<c0402b24>] (bus_remove_device) from [<c03ffcd8>] (device_del+0x10c/0x218)

>  [<c03ffcd8>] (device_del) from [<c0404e4c>] (platform_device_del+0x1c/0x88)

>  [<c0404e4c>] (platform_device_del) from [<c0404ec4>] (platform_device_unregister+0xc/0x20)

>  [<c0404ec4>] (platform_device_unregister) from [<c0428bc0>] (mfd_remove_devices_fn+0x5c/0x64)

>  [<c0428bc0>] (mfd_remove_devices_fn) from [<c03ff9d8>] (device_for_each_child_reverse+0x4c/0x78)

>  [<c03ff9d8>] (device_for_each_child_reverse) from [<c04288c4>] (mfd_remove_devices+0x20/0x30)

>  [<c04288c4>] (mfd_remove_devices) from [<c042758c>] (wm8994_device_init+0x2ac/0x7f0)

>  [<c042758c>] (wm8994_device_init) from [<c04f14a8>] (i2c_device_probe+0x178/0x1fc)

>  [<c04f14a8>] (i2c_device_probe) from [<c04036fc>] (driver_probe_device+0x214/0x2c0)

>  [<c04036fc>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)

>  [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)

>  [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)

>  [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)

>  [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)

>  [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)

>  [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)

>  [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)

>  [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)

>  ---[ end trace 0919d3d0bc998260 ]---

> 

>  [snip..]

> 

>  Unable to handle kernel NULL pointer dereference at virtual address 00000078

>  pgd = c0004000

>  [00000078] *pgd=00000000

>  Internal error: Oops: 5 [#1] PREEMPT SMP ARM

>  Modules linked in:

>  CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       4.8.0-rc6-00154-g54fe84cbd50b #41

>  Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)

>  task: ee874000 task.stack: ee878000

>  PC is at down_write+0x14/0x54

>  LR is at debugfs_remove_recursive+0x30/0x150

> 

>  [snip..]

> 

>  [<c06e489c>] (down_write) from [<c02e9954>] (debugfs_remove_recursive+0x30/0x150)

>  [<c02e9954>] (debugfs_remove_recursive) from [<c0382b78>] (_regulator_put+0x24/0xac)

>  [<c0382b78>] (_regulator_put) from [<c0382c1c>] (regulator_put+0x1c/0x2c)

>  [<c0382c1c>] (regulator_put) from [<c0406434>] (release_nodes+0x16c/0x1dc)

>  [<c0406434>] (release_nodes) from [<c04035d4>] (driver_probe_device+0xec/0x2c0)

>  [<c04035d4>] (driver_probe_device) from [<c0403854>] (__driver_attach+0xac/0xb0)

>  [<c0403854>] (__driver_attach) from [<c0401a74>] (bus_for_each_dev+0x68/0x9c)

>  [<c0401a74>] (bus_for_each_dev) from [<c0402cf0>] (bus_add_driver+0x1a0/0x218)

>  [<c0402cf0>] (bus_add_driver) from [<c040406c>] (driver_register+0x78/0xf8)

>  [<c040406c>] (driver_register) from [<c04f20a0>] (i2c_register_driver+0x34/0x84)

>  [<c04f20a0>] (i2c_register_driver) from [<c01017d0>] (do_one_initcall+0x40/0x170)

>  [<c01017d0>] (do_one_initcall) from [<c0a00dbc>] (kernel_init_freeable+0x15c/0x1fc)

>  [<c0a00dbc>] (kernel_init_freeable) from [<c06e07b0>] (kernel_init+0x8/0x114)

>  [<c06e07b0>] (kernel_init) from [<c0107978>] (ret_from_fork+0x14/0x3c)

>  Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f)

>  ---[ end trace 0919d3d0bc998262 ]---

> 

> --------x------------------x----------------

> 

> Fix the kernel warnings and crashes by using regulator_bulk_get()

> instead of devm_regulator_bulk_get() and explicitly freeing the supplies

> in exit paths.

> 

> Tested on Exynos 5250, dual core ARM A15 machine.

> 

> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>

> ---

> V3->V4:

> - Send the right version of the patch, sent the older one by mistake earlier.

> 

> V2->V3:

> - Fixed a rebase conflict

> - sending only one patch with Charles Ack

> 

> V1->V2:

> - Use regulator_bulk_free() instead of open coding it.

> - Shorter backtrace

> - Reworded the last paragraph to make it more clear

> - Added a comment in code

> 

>  drivers/mfd/wm8994-core.c | 14 +++++++++++---

>  1 file changed, 11 insertions(+), 3 deletions(-)

> 

> diff --git a/drivers/mfd/wm8994-core.c b/drivers/mfd/wm8994-core.c

> index 95e6bc55adbb..953d0790ffd5 100644

> --- a/drivers/mfd/wm8994-core.c

> +++ b/drivers/mfd/wm8994-core.c

> @@ -393,8 +393,13 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>  		BUG();

>  		goto err;

>  	}

> -		

> -	ret = devm_regulator_bulk_get(wm8994->dev, wm8994->num_supplies,

> +

> +	/*

> +	 * Can't use devres helper here as some of the supplies are provided by

> +	 * wm8994->dev's children (regulators) and those regulators are

> +	 * unregistered by the devres core before the supplies are freed.

> +	 */

> +	ret = regulator_bulk_get(wm8994->dev, wm8994->num_supplies,

>  				 wm8994->supplies);

>  	if (ret != 0) {

>  		dev_err(wm8994->dev, "Failed to get supplies: %d\n", ret);

> @@ -404,7 +409,7 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>  	ret = regulator_bulk_enable(wm8994->num_supplies, wm8994->supplies);

>  	if (ret != 0) {

>  		dev_err(wm8994->dev, "Failed to enable supplies: %d\n", ret);

> -		goto err;

> +		goto err_regulator_free;

>  	}

>  

>  	ret = wm8994_reg_read(wm8994, WM8994_SOFTWARE_RESET);

> @@ -595,6 +600,8 @@ static int wm8994_device_init(struct wm8994 *wm8994, int irq)

>  err_enable:

>  	regulator_bulk_disable(wm8994->num_supplies,

>  			       wm8994->supplies);

> +err_regulator_free:

> +	regulator_bulk_free(wm8994->num_supplies, wm8994->supplies);

>  err:

>  	mfd_remove_devices(wm8994->dev);

>  	return ret;

> @@ -605,6 +612,7 @@ static void wm8994_device_exit(struct wm8994 *wm8994)

>  	pm_runtime_disable(wm8994->dev);

>  	wm8994_irq_exit(wm8994);

>  	regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies);

> +	regulator_bulk_free(wm8994->num_supplies, wm8994->supplies);

>  	mfd_remove_devices(wm8994->dev);

>  }

>  


-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
Viresh Kumar Nov. 25, 2016, 10:16 a.m. UTC | #4
On 25-11-16, 10:14, Lee Jones wrote:
> On Thu, 27 Oct 2016, Viresh Kumar wrote:

> 

> > The kernel WARNs and then crashes today if wm8994_device_init() fails

> > after calling devm_regulator_bulk_get().

> > 

> > That happens because there are multiple devices involved here and the

> > order in which managed resources are freed isn't correct.

> > 

> > The regulators are added as children of wm8994->dev.  Whereas,

> > devm_regulator_bulk_get() receives wm8994->dev as the device, though it

> > gets the same regulators which were added as children of wm8994->dev

> > earlier.

> > 

> > During failures, the children are removed first and the core eventually

> > calls regulator_unregister() for them. As regulator_put() was never done

> > for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at

> > 

> > 	WARN_ON(rdev->open_count);

> > 

> > And eventually it crashes from debugfs_remove_recursive().

> 

> Is ...

> 

>    mfd: wm8994-core: disable regulators before removing them

> 

> ... required as well, or is that separate?


It would be better if we get that too. Anyway, the $subject patch has a
dependency on it..

-- 
viresh
Viresh Kumar Nov. 25, 2016, 10:41 a.m. UTC | #5
On 25-11-16, 10:41, Lee Jones wrote:
> On Fri, 25 Nov 2016, Viresh Kumar wrote:

> 

> > On 25-11-16, 10:14, Lee Jones wrote:

> > > On Thu, 27 Oct 2016, Viresh Kumar wrote:

> > > 

> > > > The kernel WARNs and then crashes today if wm8994_device_init() fails

> > > > after calling devm_regulator_bulk_get().

> > > > 

> > > > That happens because there are multiple devices involved here and the

> > > > order in which managed resources are freed isn't correct.

> > > > 

> > > > The regulators are added as children of wm8994->dev.  Whereas,

> > > > devm_regulator_bulk_get() receives wm8994->dev as the device, though it

> > > > gets the same regulators which were added as children of wm8994->dev

> > > > earlier.

> > > > 

> > > > During failures, the children are removed first and the core eventually

> > > > calls regulator_unregister() for them. As regulator_put() was never done

> > > > for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at

> > > > 

> > > > 	WARN_ON(rdev->open_count);

> > > > 

> > > > And eventually it crashes from debugfs_remove_recursive().

> > > 

> > > Is ...

> > > 

> > >    mfd: wm8994-core: disable regulators before removing them

> > > 

> > > ... required as well, or is that separate?

> > 

> > It would be better if we get that too. Anyway, the $subject patch has a

> > dependency on it..

> 

> Which is ... ?


I meant rebase dependency, nothing else.

-- 
viresh
Lee Jones Nov. 25, 2016, 10:41 a.m. UTC | #6
On Fri, 25 Nov 2016, Viresh Kumar wrote:

> On 25-11-16, 10:14, Lee Jones wrote:

> > On Thu, 27 Oct 2016, Viresh Kumar wrote:

> > 

> > > The kernel WARNs and then crashes today if wm8994_device_init() fails

> > > after calling devm_regulator_bulk_get().

> > > 

> > > That happens because there are multiple devices involved here and the

> > > order in which managed resources are freed isn't correct.

> > > 

> > > The regulators are added as children of wm8994->dev.  Whereas,

> > > devm_regulator_bulk_get() receives wm8994->dev as the device, though it

> > > gets the same regulators which were added as children of wm8994->dev

> > > earlier.

> > > 

> > > During failures, the children are removed first and the core eventually

> > > calls regulator_unregister() for them. As regulator_put() was never done

> > > for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at

> > > 

> > > 	WARN_ON(rdev->open_count);

> > > 

> > > And eventually it crashes from debugfs_remove_recursive().

> > 

> > Is ...

> > 

> >    mfd: wm8994-core: disable regulators before removing them

> > 

> > ... required as well, or is that separate?

> 

> It would be better if we get that too. Anyway, the $subject patch has a

> dependency on it..


Which is ... ?

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
diff mbox

Patch

diff --git a/drivers/mfd/wm8994-core.c b/drivers/mfd/wm8994-core.c
index 95e6bc55adbb..953d0790ffd5 100644
--- a/drivers/mfd/wm8994-core.c
+++ b/drivers/mfd/wm8994-core.c
@@ -393,8 +393,13 @@  static int wm8994_device_init(struct wm8994 *wm8994, int irq)
 		BUG();
 		goto err;
 	}
-		
-	ret = devm_regulator_bulk_get(wm8994->dev, wm8994->num_supplies,
+
+	/*
+	 * Can't use devres helper here as some of the supplies are provided by
+	 * wm8994->dev's children (regulators) and those regulators are
+	 * unregistered by the devres core before the supplies are freed.
+	 */
+	ret = regulator_bulk_get(wm8994->dev, wm8994->num_supplies,
 				 wm8994->supplies);
 	if (ret != 0) {
 		dev_err(wm8994->dev, "Failed to get supplies: %d\n", ret);
@@ -404,7 +409,7 @@  static int wm8994_device_init(struct wm8994 *wm8994, int irq)
 	ret = regulator_bulk_enable(wm8994->num_supplies, wm8994->supplies);
 	if (ret != 0) {
 		dev_err(wm8994->dev, "Failed to enable supplies: %d\n", ret);
-		goto err;
+		goto err_regulator_free;
 	}
 
 	ret = wm8994_reg_read(wm8994, WM8994_SOFTWARE_RESET);
@@ -595,6 +600,8 @@  static int wm8994_device_init(struct wm8994 *wm8994, int irq)
 err_enable:
 	regulator_bulk_disable(wm8994->num_supplies,
 			       wm8994->supplies);
+err_regulator_free:
+	regulator_bulk_free(wm8994->num_supplies, wm8994->supplies);
 err:
 	mfd_remove_devices(wm8994->dev);
 	return ret;
@@ -605,6 +612,7 @@  static void wm8994_device_exit(struct wm8994 *wm8994)
 	pm_runtime_disable(wm8994->dev);
 	wm8994_irq_exit(wm8994);
 	regulator_bulk_disable(wm8994->num_supplies, wm8994->supplies);
+	regulator_bulk_free(wm8994->num_supplies, wm8994->supplies);
 	mfd_remove_devices(wm8994->dev);
 }