diff mbox

PM / Domains: Restore lock-less behaviour for the genpd syscore APIs

Message ID 1486482784-31734-1-git-send-email-ulf.hansson@linaro.org
State New
Headers show

Commit Message

Ulf Hansson Feb. 7, 2017, 3:53 p.m. UTC
The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of
*noirq() callbacks") went too far, as it not only changed to use locks for
the *noirq callbacks, but also for the genpd syscore APIs.

This cause the following error, reported by Geert Uytterhoeven:

"This causes the following BUG on all my Renesas arm32 boards, where the
system timer is an IRQ safe device:

BUG: sleeping function called from invalid context at
kernel/locking/mutex.c:232
in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram
CPU: 0 PID: 1751 Comm: s2ram Not tainted
4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354
Hardware name: Generic R8A7791 (Flattened Device Tree)
[<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)
[<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)
[<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)
[<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)
[<c06fedfc>] (mutex_lock) from [<c04de340>]
(genpd_syscore_switch+0x2c/0x7c)
[<c04de340>] (genpd_syscore_switch) from [<c05ec328>]
(sh_cmt_clock_event_suspend+0x18/0x28)
[<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]
(clockevents_suspend+0x40/0x54)
[<c027f9a4>] (clockevents_suspend) from [<c0276d48>]
(timekeeping_suspend+0x23c/0x278)
[<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]
(syscore_suspend+0x88/0x138)
[<c04cbb88>] (syscore_suspend) from [<c025f618>]
(suspend_devices_and_enter+0x290/0x470)
[<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]
(pm_suspend+0x228/0x280)
[<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)
[<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)
[<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)
[<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)
[<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)
[<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

To fix this problem, restore the lock-less behaviour. However only for the
genpd syscore APIs.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

---
 drivers/base/power/domain.c | 48 +++++++++++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 21 deletions(-)

-- 
1.9.1

Comments

Kevin Hilman Feb. 7, 2017, 6:29 p.m. UTC | #1
Ulf Hansson <ulf.hansson@linaro.org> writes:

> The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of

> *noirq() callbacks") went too far, as it not only changed to use locks for

> the *noirq callbacks, but also for the genpd syscore APIs.

>

> This cause the following error, reported by Geert Uytterhoeven:

>

> "This causes the following BUG on all my Renesas arm32 boards, where the

> system timer is an IRQ safe device:

>

> BUG: sleeping function called from invalid context at

> kernel/locking/mutex.c:232

> in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram

> CPU: 0 PID: 1751 Comm: s2ram Not tainted

> 4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354

> Hardware name: Generic R8A7791 (Flattened Device Tree)

> [<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)

> [<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)

> [<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)

> [<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)

> [<c06fedfc>] (mutex_lock) from [<c04de340>]

> (genpd_syscore_switch+0x2c/0x7c)

> [<c04de340>] (genpd_syscore_switch) from [<c05ec328>]

> (sh_cmt_clock_event_suspend+0x18/0x28)

> [<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]

> (clockevents_suspend+0x40/0x54)

> [<c027f9a4>] (clockevents_suspend) from [<c0276d48>]

> (timekeeping_suspend+0x23c/0x278)

> [<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]

> (syscore_suspend+0x88/0x138)

> [<c04cbb88>] (syscore_suspend) from [<c025f618>]

> (suspend_devices_and_enter+0x290/0x470)

> [<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]

> (pm_suspend+0x228/0x280)

> [<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)

> [<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)

> [<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)

> [<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)

> [<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)

> [<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

>

> To fix this problem, restore the lock-less behaviour. However only for the

> genpd syscore APIs.

>

> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>

> Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")

> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>


It's up to Rafael, but IMO, since this isn't yet in mainline, it would
be cleaner for the git history to revert the original, and rework the
patch to have just the relevant parts

Kevin
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Geert Uytterhoeven Feb. 7, 2017, 7:09 p.m. UTC | #2
On Tue, Feb 7, 2017 at 4:53 PM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of

> *noirq() callbacks") went too far, as it not only changed to use locks for

> the *noirq callbacks, but also for the genpd syscore APIs.

>

> This cause the following error, reported by Geert Uytterhoeven:

>

> "This causes the following BUG on all my Renesas arm32 boards, where the

> system timer is an IRQ safe device:

>

> BUG: sleeping function called from invalid context at

> kernel/locking/mutex.c:232

> in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram

> CPU: 0 PID: 1751 Comm: s2ram Not tainted

> 4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354

> Hardware name: Generic R8A7791 (Flattened Device Tree)

> [<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)

> [<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)

> [<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)

> [<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)

> [<c06fedfc>] (mutex_lock) from [<c04de340>]

> (genpd_syscore_switch+0x2c/0x7c)

> [<c04de340>] (genpd_syscore_switch) from [<c05ec328>]

> (sh_cmt_clock_event_suspend+0x18/0x28)

> [<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]

> (clockevents_suspend+0x40/0x54)

> [<c027f9a4>] (clockevents_suspend) from [<c0276d48>]

> (timekeeping_suspend+0x23c/0x278)

> [<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]

> (syscore_suspend+0x88/0x138)

> [<c04cbb88>] (syscore_suspend) from [<c025f618>]

> (suspend_devices_and_enter+0x290/0x470)

> [<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]

> (pm_suspend+0x228/0x280)

> [<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)

> [<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)

> [<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)

> [<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)

> [<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)

> [<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

>

> To fix this problem, restore the lock-less behaviour. However only for the

> genpd syscore APIs.

>

> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>

> Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")

> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>


Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>


Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lina Iyer Feb. 7, 2017, 9:52 p.m. UTC | #3
Hi Ulf,

On Tue, Feb 07 2017 at 08:53 -0700, Ulf Hansson wrote:
>The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of

>*noirq() callbacks") went too far, as it not only changed to use locks for

>the *noirq callbacks, but also for the genpd syscore APIs.

>

>This cause the following error, reported by Geert Uytterhoeven:

>

>"This causes the following BUG on all my Renesas arm32 boards, where the

>system timer is an IRQ safe device:

>

>BUG: sleeping function called from invalid context at

>kernel/locking/mutex.c:232

>in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram

>CPU: 0 PID: 1751 Comm: s2ram Not tainted

>4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354

>Hardware name: Generic R8A7791 (Flattened Device Tree)

>[<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)

>[<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)

>[<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)

>[<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)

>[<c06fedfc>] (mutex_lock) from [<c04de340>]

>(genpd_syscore_switch+0x2c/0x7c)

>[<c04de340>] (genpd_syscore_switch) from [<c05ec328>]

>(sh_cmt_clock_event_suspend+0x18/0x28)

>[<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]

>(clockevents_suspend+0x40/0x54)

>[<c027f9a4>] (clockevents_suspend) from [<c0276d48>]

>(timekeeping_suspend+0x23c/0x278)

>[<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]

>(syscore_suspend+0x88/0x138)

>[<c04cbb88>] (syscore_suspend) from [<c025f618>]

>(suspend_devices_and_enter+0x290/0x470)

>[<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]

>(pm_suspend+0x228/0x280)

>[<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)

>[<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)

>[<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)

>[<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)

>[<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)

>[<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

>

>To fix this problem, restore the lock-less behaviour. However only for the

>genpd syscore APIs.

>

>Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>

>Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")

>Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

>---

> drivers/base/power/domain.c | 48 +++++++++++++++++++++++++--------------------

> 1 file changed, 27 insertions(+), 21 deletions(-)

>

>diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c

>index 271e208..3a75fb1 100644

>--- a/drivers/base/power/domain.c

>+++ b/drivers/base/power/domain.c

>@@ -729,16 +729,17 @@ static bool genpd_dev_active_wakeup(struct generic_pm_domain *genpd,

> /**

>  * genpd_sync_power_off - Synchronously power off a PM domain and its masters.

>  * @genpd: PM domain to power off, if possible.

>+ * @use_lock: use the lock.


I am not too fond of this. It would rather be much cleaner if you could
overload the genpd_lock/unlock() calls to make this genpd lockless. A
flag for the genpd, perhaps that could turn the lock functions into a
NOP?

-- Lina

>  * @depth: nesting count for lockdep.

>  *

>  * Check if the given PM domain can be powered off (during system suspend or

>  * hibernation) and do that if so.  Also, in that case propagate to its masters.

>  *

>  * This function is only called in "noirq" and "syscore" stages of system power

>- * transitions, but since the "noirq" callbacks may be executed asynchronously,

>- * the lock must be held.

>+ * transitions. The "noirq" callbacks may be executed asynchronously, thus in

>+ * these cases the lock must be held.

>  */

>-static void genpd_sync_power_off(struct generic_pm_domain *genpd,

>+static void genpd_sync_power_off(struct generic_pm_domain *genpd, bool use_lock,

> 				 unsigned int depth)

> {

> 	struct gpd_link *link;

>@@ -759,22 +760,27 @@ static void genpd_sync_power_off(struct generic_pm_domain *genpd,

> 	list_for_each_entry(link, &genpd->slave_links, slave_node) {

> 		genpd_sd_counter_dec(link->master);

>

>-		genpd_lock_nested(link->master, depth + 1);

>-		genpd_sync_power_off(link->master, depth + 1);

>-		genpd_unlock(link->master);

>+		if (use_lock)

>+			genpd_lock_nested(link->master, depth + 1);

>+

>+		genpd_sync_power_off(link->master, use_lock, depth + 1);

>+

>+		if (use_lock)

>+			genpd_unlock(link->master);

> 	}

> }

>

> /**

>  * genpd_sync_power_on - Synchronously power on a PM domain and its masters.

>  * @genpd: PM domain to power on.

>+ * @use_lock: use the lock.

>  * @depth: nesting count for lockdep.

>  *

>  * This function is only called in "noirq" and "syscore" stages of system power

>- * transitions, but since the "noirq" callbacks may be executed asynchronously,

>- * the lock must be held.

>+ * transitions. The "noirq" callbacks may be executed asynchronously, thus in

>+ * these cases the lock must be held.

>  */

>-static void genpd_sync_power_on(struct generic_pm_domain *genpd,

>+static void genpd_sync_power_on(struct generic_pm_domain *genpd, bool use_lock,

> 				unsigned int depth)

> {

> 	struct gpd_link *link;

>@@ -785,9 +791,13 @@ static void genpd_sync_power_on(struct generic_pm_domain *genpd,

> 	list_for_each_entry(link, &genpd->slave_links, slave_node) {

> 		genpd_sd_counter_inc(link->master);

>

>-		genpd_lock_nested(link->master, depth + 1);

>-		genpd_sync_power_on(link->master, depth + 1);

>-		genpd_unlock(link->master);

>+		if (use_lock)

>+			genpd_lock_nested(link->master, depth + 1);

>+

>+		genpd_sync_power_on(link->master, use_lock, depth + 1);

>+

>+		if (use_lock)

>+			genpd_unlock(link->master);

> 	}

>

> 	_genpd_power_on(genpd, false);

>@@ -898,7 +908,7 @@ static int pm_genpd_suspend_noirq(struct device *dev)

>

> 	genpd_lock(genpd);

> 	genpd->suspended_count++;

>-	genpd_sync_power_off(genpd, 0);

>+	genpd_sync_power_off(genpd, true, 0);

> 	genpd_unlock(genpd);

>

> 	return 0;

>@@ -925,7 +935,7 @@ static int pm_genpd_resume_noirq(struct device *dev)

> 		return 0;

>

> 	genpd_lock(genpd);

>-	genpd_sync_power_on(genpd, 0);

>+	genpd_sync_power_on(genpd, true, 0);

> 	genpd->suspended_count--;

> 	genpd_unlock(genpd);

>

>@@ -1016,7 +1026,7 @@ static int pm_genpd_restore_noirq(struct device *dev)

> 		 */

> 		genpd->status = GPD_STATE_POWER_OFF;

>

>-	genpd_sync_power_on(genpd, 0);

>+	genpd_sync_power_on(genpd, true, 0);

> 	genpd_unlock(genpd);

>

> 	if (genpd->dev_ops.stop && genpd->dev_ops.start)

>@@ -1070,17 +1080,13 @@ static void genpd_syscore_switch(struct device *dev, bool suspend)

> 	if (!pm_genpd_present(genpd))

> 		return;

>

>-	genpd_lock(genpd);

>-

> 	if (suspend) {

> 		genpd->suspended_count++;

>-		genpd_sync_power_off(genpd, 0);

>+		genpd_sync_power_off(genpd, false, 0);

> 	} else {

>-		genpd_sync_power_on(genpd, 0);

>+		genpd_sync_power_on(genpd, false, 0);

> 		genpd->suspended_count--;

> 	}

>-

>-	genpd_unlock(genpd);

> }

>

> void pm_genpd_syscore_poweroff(struct device *dev)

>-- 

>1.9.1

>

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Feb. 8, 2017, 12:06 p.m. UTC | #4
On Wednesday, February 08, 2017 01:06:21 PM Ulf Hansson wrote:
> On 7 February 2017 at 19:29, Kevin Hilman <khilman@baylibre.com> wrote:

> > Ulf Hansson <ulf.hansson@linaro.org> writes:

> >

> >> The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of

> >> *noirq() callbacks") went too far, as it not only changed to use locks for

> >> the *noirq callbacks, but also for the genpd syscore APIs.

> >>

> >> This cause the following error, reported by Geert Uytterhoeven:

> >>

> >> "This causes the following BUG on all my Renesas arm32 boards, where the

> >> system timer is an IRQ safe device:

> >>

> >> BUG: sleeping function called from invalid context at

> >> kernel/locking/mutex.c:232

> >> in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram

> >> CPU: 0 PID: 1751 Comm: s2ram Not tainted

> >> 4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354

> >> Hardware name: Generic R8A7791 (Flattened Device Tree)

> >> [<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)

> >> [<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)

> >> [<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)

> >> [<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)

> >> [<c06fedfc>] (mutex_lock) from [<c04de340>]

> >> (genpd_syscore_switch+0x2c/0x7c)

> >> [<c04de340>] (genpd_syscore_switch) from [<c05ec328>]

> >> (sh_cmt_clock_event_suspend+0x18/0x28)

> >> [<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]

> >> (clockevents_suspend+0x40/0x54)

> >> [<c027f9a4>] (clockevents_suspend) from [<c0276d48>]

> >> (timekeeping_suspend+0x23c/0x278)

> >> [<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]

> >> (syscore_suspend+0x88/0x138)

> >> [<c04cbb88>] (syscore_suspend) from [<c025f618>]

> >> (suspend_devices_and_enter+0x290/0x470)

> >> [<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]

> >> (pm_suspend+0x228/0x280)

> >> [<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)

> >> [<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)

> >> [<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)

> >> [<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)

> >> [<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)

> >> [<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

> >>

> >> To fix this problem, restore the lock-less behaviour. However only for the

> >> genpd syscore APIs.

> >>

> >> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>

> >> Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")

> >> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

> >

> > It's up to Rafael, but IMO, since this isn't yet in mainline, it would

> > be cleaner for the git history to revert the original, and rework the

> > patch to have just the relevant parts

> 

> Is a revert making it cleaner, as it will becomes three changes

> instead of two? :-)


Kevin probably meant dropping the original commit altogether with should
be easy enough to do. :-)

> 

> I guess the only way to make the git history really clean, is whether

> Rafael can re-base his branch, and then squash this change into the

> commit it states to fix.

> 

> Anyway, I fine with any way. Seems like Rafael, will have to tell me what to do.


Frankly, I'd rather drop the problematic commit entirely and then you can
submit a replacement.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Feb. 8, 2017, 12:06 p.m. UTC | #5
On 7 February 2017 at 19:29, Kevin Hilman <khilman@baylibre.com> wrote:
> Ulf Hansson <ulf.hansson@linaro.org> writes:

>

>> The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of

>> *noirq() callbacks") went too far, as it not only changed to use locks for

>> the *noirq callbacks, but also for the genpd syscore APIs.

>>

>> This cause the following error, reported by Geert Uytterhoeven:

>>

>> "This causes the following BUG on all my Renesas arm32 boards, where the

>> system timer is an IRQ safe device:

>>

>> BUG: sleeping function called from invalid context at

>> kernel/locking/mutex.c:232

>> in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram

>> CPU: 0 PID: 1751 Comm: s2ram Not tainted

>> 4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354

>> Hardware name: Generic R8A7791 (Flattened Device Tree)

>> [<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)

>> [<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)

>> [<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)

>> [<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)

>> [<c06fedfc>] (mutex_lock) from [<c04de340>]

>> (genpd_syscore_switch+0x2c/0x7c)

>> [<c04de340>] (genpd_syscore_switch) from [<c05ec328>]

>> (sh_cmt_clock_event_suspend+0x18/0x28)

>> [<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]

>> (clockevents_suspend+0x40/0x54)

>> [<c027f9a4>] (clockevents_suspend) from [<c0276d48>]

>> (timekeeping_suspend+0x23c/0x278)

>> [<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]

>> (syscore_suspend+0x88/0x138)

>> [<c04cbb88>] (syscore_suspend) from [<c025f618>]

>> (suspend_devices_and_enter+0x290/0x470)

>> [<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]

>> (pm_suspend+0x228/0x280)

>> [<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)

>> [<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)

>> [<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)

>> [<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)

>> [<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)

>> [<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

>>

>> To fix this problem, restore the lock-less behaviour. However only for the

>> genpd syscore APIs.

>>

>> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>

>> Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")

>> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

>

> It's up to Rafael, but IMO, since this isn't yet in mainline, it would

> be cleaner for the git history to revert the original, and rework the

> patch to have just the relevant parts


Is a revert making it cleaner, as it will becomes three changes
instead of two? :-)

I guess the only way to make the git history really clean, is whether
Rafael can re-base his branch, and then squash this change into the
commit it states to fix.

Anyway, I fine with any way. Seems like Rafael, will have to tell me what to do.

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Feb. 8, 2017, 12:13 p.m. UTC | #6
On 8 February 2017 at 13:06, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, February 08, 2017 01:06:21 PM Ulf Hansson wrote:

>> On 7 February 2017 at 19:29, Kevin Hilman <khilman@baylibre.com> wrote:

>> > Ulf Hansson <ulf.hansson@linaro.org> writes:

>> >

>> >> The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of

>> >> *noirq() callbacks") went too far, as it not only changed to use locks for

>> >> the *noirq callbacks, but also for the genpd syscore APIs.

>> >>

>> >> This cause the following error, reported by Geert Uytterhoeven:

>> >>

>> >> "This causes the following BUG on all my Renesas arm32 boards, where the

>> >> system timer is an IRQ safe device:

>> >>

>> >> BUG: sleeping function called from invalid context at

>> >> kernel/locking/mutex.c:232

>> >> in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram

>> >> CPU: 0 PID: 1751 Comm: s2ram Not tainted

>> >> 4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354

>> >> Hardware name: Generic R8A7791 (Flattened Device Tree)

>> >> [<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)

>> >> [<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)

>> >> [<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)

>> >> [<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)

>> >> [<c06fedfc>] (mutex_lock) from [<c04de340>]

>> >> (genpd_syscore_switch+0x2c/0x7c)

>> >> [<c04de340>] (genpd_syscore_switch) from [<c05ec328>]

>> >> (sh_cmt_clock_event_suspend+0x18/0x28)

>> >> [<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]

>> >> (clockevents_suspend+0x40/0x54)

>> >> [<c027f9a4>] (clockevents_suspend) from [<c0276d48>]

>> >> (timekeeping_suspend+0x23c/0x278)

>> >> [<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]

>> >> (syscore_suspend+0x88/0x138)

>> >> [<c04cbb88>] (syscore_suspend) from [<c025f618>]

>> >> (suspend_devices_and_enter+0x290/0x470)

>> >> [<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]

>> >> (pm_suspend+0x228/0x280)

>> >> [<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)

>> >> [<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)

>> >> [<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)

>> >> [<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)

>> >> [<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)

>> >> [<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

>> >>

>> >> To fix this problem, restore the lock-less behaviour. However only for the

>> >> genpd syscore APIs.

>> >>

>> >> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>

>> >> Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")

>> >> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

>> >

>> > It's up to Rafael, but IMO, since this isn't yet in mainline, it would

>> > be cleaner for the git history to revert the original, and rework the

>> > patch to have just the relevant parts

>>

>> Is a revert making it cleaner, as it will becomes three changes

>> instead of two? :-)

>

> Kevin probably meant dropping the original commit altogether with should

> be easy enough to do. :-)


Okay.

>

>>

>> I guess the only way to make the git history really clean, is whether

>> Rafael can re-base his branch, and then squash this change into the

>> commit it states to fix.

>>

>> Anyway, I fine with any way. Seems like Rafael, will have to tell me what to do.

>

> Frankly, I'd rather drop the problematic commit entirely and then you can

> submit a replacement.


I can do that, however I only wanted to spare both yours and mine
time, which why I suggested you to squash the changes.

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Feb. 8, 2017, 12:15 p.m. UTC | #7
On Wednesday, February 08, 2017 01:13:37 PM Ulf Hansson wrote:
> On 8 February 2017 at 13:06, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:

> > On Wednesday, February 08, 2017 01:06:21 PM Ulf Hansson wrote:

> >> On 7 February 2017 at 19:29, Kevin Hilman <khilman@baylibre.com> wrote:

> >> > Ulf Hansson <ulf.hansson@linaro.org> writes:

> >> >

> >> >> The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of

> >> >> *noirq() callbacks") went too far, as it not only changed to use locks for

> >> >> the *noirq callbacks, but also for the genpd syscore APIs.

> >> >>

> >> >> This cause the following error, reported by Geert Uytterhoeven:

> >> >>

> >> >> "This causes the following BUG on all my Renesas arm32 boards, where the

> >> >> system timer is an IRQ safe device:

> >> >>

> >> >> BUG: sleeping function called from invalid context at

> >> >> kernel/locking/mutex.c:232

> >> >> in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram

> >> >> CPU: 0 PID: 1751 Comm: s2ram Not tainted

> >> >> 4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354

> >> >> Hardware name: Generic R8A7791 (Flattened Device Tree)

> >> >> [<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)

> >> >> [<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)

> >> >> [<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)

> >> >> [<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)

> >> >> [<c06fedfc>] (mutex_lock) from [<c04de340>]

> >> >> (genpd_syscore_switch+0x2c/0x7c)

> >> >> [<c04de340>] (genpd_syscore_switch) from [<c05ec328>]

> >> >> (sh_cmt_clock_event_suspend+0x18/0x28)

> >> >> [<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]

> >> >> (clockevents_suspend+0x40/0x54)

> >> >> [<c027f9a4>] (clockevents_suspend) from [<c0276d48>]

> >> >> (timekeeping_suspend+0x23c/0x278)

> >> >> [<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]

> >> >> (syscore_suspend+0x88/0x138)

> >> >> [<c04cbb88>] (syscore_suspend) from [<c025f618>]

> >> >> (suspend_devices_and_enter+0x290/0x470)

> >> >> [<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]

> >> >> (pm_suspend+0x228/0x280)

> >> >> [<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)

> >> >> [<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)

> >> >> [<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)

> >> >> [<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)

> >> >> [<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)

> >> >> [<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

> >> >>

> >> >> To fix this problem, restore the lock-less behaviour. However only for the

> >> >> genpd syscore APIs.

> >> >>

> >> >> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>

> >> >> Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")

> >> >> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

> >> >

> >> > It's up to Rafael, but IMO, since this isn't yet in mainline, it would

> >> > be cleaner for the git history to revert the original, and rework the

> >> > patch to have just the relevant parts

> >>

> >> Is a revert making it cleaner, as it will becomes three changes

> >> instead of two? :-)

> >

> > Kevin probably meant dropping the original commit altogether with should

> > be easy enough to do. :-)

> 

> Okay.

> 

> >

> >>

> >> I guess the only way to make the git history really clean, is whether

> >> Rafael can re-base his branch, and then squash this change into the

> >> commit it states to fix.

> >>

> >> Anyway, I fine with any way. Seems like Rafael, will have to tell me what to do.

> >

> > Frankly, I'd rather drop the problematic commit entirely and then you can

> > submit a replacement.

> 

> I can do that, however I only wanted to spare both yours and mine

> time, which why I suggested you to squash the changes.


That would leave a bisection pitfall behind, though, which I'd rather avoid.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Feb. 8, 2017, 12:35 p.m. UTC | #8
On 8 February 2017 at 13:15, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, February 08, 2017 01:13:37 PM Ulf Hansson wrote:

>> On 8 February 2017 at 13:06, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:

>> > On Wednesday, February 08, 2017 01:06:21 PM Ulf Hansson wrote:

>> >> On 7 February 2017 at 19:29, Kevin Hilman <khilman@baylibre.com> wrote:

>> >> > Ulf Hansson <ulf.hansson@linaro.org> writes:

>> >> >

>> >> >> The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of

>> >> >> *noirq() callbacks") went too far, as it not only changed to use locks for

>> >> >> the *noirq callbacks, but also for the genpd syscore APIs.

>> >> >>

>> >> >> This cause the following error, reported by Geert Uytterhoeven:

>> >> >>

>> >> >> "This causes the following BUG on all my Renesas arm32 boards, where the

>> >> >> system timer is an IRQ safe device:

>> >> >>

>> >> >> BUG: sleeping function called from invalid context at

>> >> >> kernel/locking/mutex.c:232

>> >> >> in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram

>> >> >> CPU: 0 PID: 1751 Comm: s2ram Not tainted

>> >> >> 4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354

>> >> >> Hardware name: Generic R8A7791 (Flattened Device Tree)

>> >> >> [<c020e9c4>] (unwind_backtrace) from [<c020a40c>] (show_stack+0x10/0x14)

>> >> >> [<c020a40c>] (show_stack) from [<c03f9818>] (dump_stack+0x7c/0x9c)

>> >> >> [<c03f9818>] (dump_stack) from [<c0240020>] (___might_sleep+0x124/0x160)

>> >> >> [<c0240020>] (___might_sleep) from [<c06fedfc>] (mutex_lock+0x18/0x60)

>> >> >> [<c06fedfc>] (mutex_lock) from [<c04de340>]

>> >> >> (genpd_syscore_switch+0x2c/0x7c)

>> >> >> [<c04de340>] (genpd_syscore_switch) from [<c05ec328>]

>> >> >> (sh_cmt_clock_event_suspend+0x18/0x28)

>> >> >> [<c05ec328>] (sh_cmt_clock_event_suspend) from [<c027f9a4>]

>> >> >> (clockevents_suspend+0x40/0x54)

>> >> >> [<c027f9a4>] (clockevents_suspend) from [<c0276d48>]

>> >> >> (timekeeping_suspend+0x23c/0x278)

>> >> >> [<c0276d48>] (timekeeping_suspend) from [<c04cbb88>]

>> >> >> (syscore_suspend+0x88/0x138)

>> >> >> [<c04cbb88>] (syscore_suspend) from [<c025f618>]

>> >> >> (suspend_devices_and_enter+0x290/0x470)

>> >> >> [<c025f618>] (suspend_devices_and_enter) from [<c025fa20>]

>> >> >> (pm_suspend+0x228/0x280)

>> >> >> [<c025fa20>] (pm_suspend) from [<c025e60c>] (state_store+0xac/0xcc)

>> >> >> [<c025e60c>] (state_store) from [<c0340c4c>] (kernfs_fop_write+0x160/0x19c)

>> >> >> [<c0340c4c>] (kernfs_fop_write) from [<c02e3054>] (__vfs_write+0x20/0x108)

>> >> >> [<c02e3054>] (__vfs_write) from [<c02e4424>] (vfs_write+0xb8/0x144)

>> >> >> [<c02e4424>] (vfs_write) from [<c02e5014>] (SyS_write+0x40/0x80)

>> >> >> [<c02e5014>] (SyS_write) from [<c0206cc0>] (ret_fast_syscall+0x0/0x34)"

>> >> >>

>> >> >> To fix this problem, restore the lock-less behaviour. However only for the

>> >> >> genpd syscore APIs.

>> >> >>

>> >> >> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>

>> >> >> Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks")

>> >> >> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

>> >> >

>> >> > It's up to Rafael, but IMO, since this isn't yet in mainline, it would

>> >> > be cleaner for the git history to revert the original, and rework the

>> >> > patch to have just the relevant parts

>> >>

>> >> Is a revert making it cleaner, as it will becomes three changes

>> >> instead of two? :-)

>> >

>> > Kevin probably meant dropping the original commit altogether with should

>> > be easy enough to do. :-)

>>

>> Okay.

>>

>> >

>> >>

>> >> I guess the only way to make the git history really clean, is whether

>> >> Rafael can re-base his branch, and then squash this change into the

>> >> commit it states to fix.

>> >>

>> >> Anyway, I fine with any way. Seems like Rafael, will have to tell me what to do.

>> >

>> > Frankly, I'd rather drop the problematic commit entirely and then you can

>> > submit a replacement.

>>

>> I can do that, however I only wanted to spare both yours and mine

>> time, which why I suggested you to squash the changes.

>

> That would leave a bisection pitfall behind, though, which I'd rather avoid.


Okay, I will send a new version of the original commit in a few
minutes (as you will/have dropped it from your tree).

Let's ignore about $subject patch then.

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Feb. 8, 2017, 12:41 p.m. UTC | #9
Hi Lina,

--
>> drivers/base/power/domain.c | 48

>> +++++++++++++++++++++++++--------------------

>> 1 file changed, 27 insertions(+), 21 deletions(-)

>>

>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c

>> index 271e208..3a75fb1 100644

>> --- a/drivers/base/power/domain.c

>> +++ b/drivers/base/power/domain.c

>> @@ -729,16 +729,17 @@ static bool genpd_dev_active_wakeup(struct

>> generic_pm_domain *genpd,

>> /**

>>  * genpd_sync_power_off - Synchronously power off a PM domain and its

>> masters.

>>  * @genpd: PM domain to power off, if possible.

>> + * @use_lock: use the lock.

>

>

> I am not too fond of this. It would rather be much cleaner if you could

> overload the genpd_lock/unlock() calls to make this genpd lockless. A

> flag for the genpd, perhaps that could turn the lock functions into a

> NOP?


That's not going to work, because this isn't about a lock-less genpd.

The problem occurs only during system suspend for a syscore device (in
this case the sh_cmt_clock_event device), calling pm_genpd_syscore*
APIs.

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 271e208..3a75fb1 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -729,16 +729,17 @@  static bool genpd_dev_active_wakeup(struct generic_pm_domain *genpd,
 /**
  * genpd_sync_power_off - Synchronously power off a PM domain and its masters.
  * @genpd: PM domain to power off, if possible.
+ * @use_lock: use the lock.
  * @depth: nesting count for lockdep.
  *
  * Check if the given PM domain can be powered off (during system suspend or
  * hibernation) and do that if so.  Also, in that case propagate to its masters.
  *
  * This function is only called in "noirq" and "syscore" stages of system power
- * transitions, but since the "noirq" callbacks may be executed asynchronously,
- * the lock must be held.
+ * transitions. The "noirq" callbacks may be executed asynchronously, thus in
+ * these cases the lock must be held.
  */
-static void genpd_sync_power_off(struct generic_pm_domain *genpd,
+static void genpd_sync_power_off(struct generic_pm_domain *genpd, bool use_lock,
 				 unsigned int depth)
 {
 	struct gpd_link *link;
@@ -759,22 +760,27 @@  static void genpd_sync_power_off(struct generic_pm_domain *genpd,
 	list_for_each_entry(link, &genpd->slave_links, slave_node) {
 		genpd_sd_counter_dec(link->master);
 
-		genpd_lock_nested(link->master, depth + 1);
-		genpd_sync_power_off(link->master, depth + 1);
-		genpd_unlock(link->master);
+		if (use_lock)
+			genpd_lock_nested(link->master, depth + 1);
+
+		genpd_sync_power_off(link->master, use_lock, depth + 1);
+
+		if (use_lock)
+			genpd_unlock(link->master);
 	}
 }
 
 /**
  * genpd_sync_power_on - Synchronously power on a PM domain and its masters.
  * @genpd: PM domain to power on.
+ * @use_lock: use the lock.
  * @depth: nesting count for lockdep.
  *
  * This function is only called in "noirq" and "syscore" stages of system power
- * transitions, but since the "noirq" callbacks may be executed asynchronously,
- * the lock must be held.
+ * transitions. The "noirq" callbacks may be executed asynchronously, thus in
+ * these cases the lock must be held.
  */
-static void genpd_sync_power_on(struct generic_pm_domain *genpd,
+static void genpd_sync_power_on(struct generic_pm_domain *genpd, bool use_lock,
 				unsigned int depth)
 {
 	struct gpd_link *link;
@@ -785,9 +791,13 @@  static void genpd_sync_power_on(struct generic_pm_domain *genpd,
 	list_for_each_entry(link, &genpd->slave_links, slave_node) {
 		genpd_sd_counter_inc(link->master);
 
-		genpd_lock_nested(link->master, depth + 1);
-		genpd_sync_power_on(link->master, depth + 1);
-		genpd_unlock(link->master);
+		if (use_lock)
+			genpd_lock_nested(link->master, depth + 1);
+
+		genpd_sync_power_on(link->master, use_lock, depth + 1);
+
+		if (use_lock)
+			genpd_unlock(link->master);
 	}
 
 	_genpd_power_on(genpd, false);
@@ -898,7 +908,7 @@  static int pm_genpd_suspend_noirq(struct device *dev)
 
 	genpd_lock(genpd);
 	genpd->suspended_count++;
-	genpd_sync_power_off(genpd, 0);
+	genpd_sync_power_off(genpd, true, 0);
 	genpd_unlock(genpd);
 
 	return 0;
@@ -925,7 +935,7 @@  static int pm_genpd_resume_noirq(struct device *dev)
 		return 0;
 
 	genpd_lock(genpd);
-	genpd_sync_power_on(genpd, 0);
+	genpd_sync_power_on(genpd, true, 0);
 	genpd->suspended_count--;
 	genpd_unlock(genpd);
 
@@ -1016,7 +1026,7 @@  static int pm_genpd_restore_noirq(struct device *dev)
 		 */
 		genpd->status = GPD_STATE_POWER_OFF;
 
-	genpd_sync_power_on(genpd, 0);
+	genpd_sync_power_on(genpd, true, 0);
 	genpd_unlock(genpd);
 
 	if (genpd->dev_ops.stop && genpd->dev_ops.start)
@@ -1070,17 +1080,13 @@  static void genpd_syscore_switch(struct device *dev, bool suspend)
 	if (!pm_genpd_present(genpd))
 		return;
 
-	genpd_lock(genpd);
-
 	if (suspend) {
 		genpd->suspended_count++;
-		genpd_sync_power_off(genpd, 0);
+		genpd_sync_power_off(genpd, false, 0);
 	} else {
-		genpd_sync_power_on(genpd, 0);
+		genpd_sync_power_on(genpd, false, 0);
 		genpd->suspended_count--;
 	}
-
-	genpd_unlock(genpd);
 }
 
 void pm_genpd_syscore_poweroff(struct device *dev)