diff mbox series

[v1] PM: sleep: core: Restrict power.set_active propagation

Message ID 6137505.lOV4Wx5bFT@rjwysocki.net
State New
Headers show
Series [v1] PM: sleep: core: Restrict power.set_active propagation | expand

Commit Message

Rafael J. Wysocki Feb. 8, 2025, 5:54 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Commit 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of
parents and children") exposed an issue related to simple_pm_bus_pm_ops
that uses pm_runtime_force_suspend() and pm_runtime_force_resume() as
bus type PM callbacks for the noirq phases of system-wide suspend and
resume.

The problem is that pm_runtime_force_suspend() does not distinguish
runtime-suspended devices from devices for which runtime PM has never
been enabled, so if it sees a device with runtime PM status set to
RPM_ACTIVE, it will assume that runtime PM is enabled for that device
and so it will attempt to suspend it with the help of its runtime PM
callbacks which may not be ready for that.  As it turns out, this 
causes simple_pm_bus_runtime_suspend() to crash due to a NULL pointer
dereference.

Another problem related to the above commit and simple_pm_bus_pm_ops is
that setting runtime PM status of a device handled by the latter to
RPM_ACTIVE will actually prevent it from being resumed because
pm_runtime_force_resume() only resumes devices with runtime PM status
set to RPM_SUSPENDED.

To mitigate these issues, do not allow power.set_active to propagate
beyond the parent of the device with DPM_FLAG_SMART_SUSPEND set that
will need to be resumed, which should be a sufficient stop-gap for the
time being, but they will need to be properly addressed in the future
because in general during system-wide resume it is necessary to resume
all devices in a dependency chain in which at least one device is going
to be resumed.

Fixes: 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of parents and children")
Closes: https://lore.kernel.org/linux-pm/1c2433d4-7e0f-4395-b841-b8eac7c25651@nvidia.com/
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/base/power/main.c |   21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

Comments

Rafael J. Wysocki Feb. 10, 2025, 5:21 p.m. UTC | #1
On Mon, Feb 10, 2025 at 10:31 AM Johan Hovold <johan@kernel.org> wrote:
>
> On Sat, Feb 08, 2025 at 06:54:28PM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Commit 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of
> > parents and children") exposed an issue related to simple_pm_bus_pm_ops
> > that uses pm_runtime_force_suspend() and pm_runtime_force_resume() as
> > bus type PM callbacks for the noirq phases of system-wide suspend and
>
> Despite the name of the driver these are plain device driver PM
> callbacks (not bus PM ops).

Right, I was confused.

BTW, I still think that it is invalid to use
pm_runtime_force_suspend() and pm_runtime_force_resume() as
system-wide PM callbacks without enabling runtime PM.  The system-wide
PM callbacks should be wrappers around these checking something (eg.
dev_get_drvdata(dev)) other than dev->power.runtime_status.  It is too
fragile the current way.

> > resume.
> >
> > The problem is that pm_runtime_force_suspend() does not distinguish
> > runtime-suspended devices from devices for which runtime PM has never
> > been enabled, so if it sees a device with runtime PM status set to
> > RPM_ACTIVE, it will assume that runtime PM is enabled for that device
> > and so it will attempt to suspend it with the help of its runtime PM
> > callbacks which may not be ready for that.  As it turns out, this
> > causes simple_pm_bus_runtime_suspend() to crash due to a NULL pointer
> > dereference.
> >
> > Another problem related to the above commit and simple_pm_bus_pm_ops is
> > that setting runtime PM status of a device handled by the latter to
> > RPM_ACTIVE will actually prevent it from being resumed because
> > pm_runtime_force_resume() only resumes devices with runtime PM status
> > set to RPM_SUSPENDED.
> >
> > To mitigate these issues, do not allow power.set_active to propagate
> > beyond the parent of the device with DPM_FLAG_SMART_SUSPEND set that
> > will need to be resumed, which should be a sufficient stop-gap for the
> > time being, but they will need to be properly addressed in the future
> > because in general during system-wide resume it is necessary to resume
> > all devices in a dependency chain in which at least one device is going
> > to be resumed.
>
> So this works as long as no parent of a device with
> DPM_FLAG_SMART_SUSPEND set is using pm_runtime_force_resume().
>
> This is the case in the systems I work on, but have you
> verified that this is currently generally true? Not many drivers use
> this flag, but it all depends on what their devices' parents' drivers
> do:
>
>         drivers/acpi/acpi_tad.c
>         drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>         drivers/i2c/busses/i2c-designware-platdrv.c
>         drivers/mfd/intel-lpss.c
>         drivers/pci/pcie/portdrv.c
>         drivers/pwm/pwm-lpss-platform.c
>         drivers/soundwire/intel_auxdevice.c
>
> Most of these look like ACPI drivers so nothing that would sit directly
> on a simple-pm-bus at least.

I'm not worried about these.

Also, I have a patch series to get the power.set_active propagation
into agreement with pm_runtime_force_suspend/resume().  I'll post it
tomorrow.
diff mbox series

Patch

--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1191,24 +1191,18 @@ 
 	return PMSG_ON;
 }
 
-static void dpm_superior_set_must_resume(struct device *dev, bool set_active)
+static void dpm_superior_set_must_resume(struct device *dev)
 {
 	struct device_link *link;
 	int idx;
 
-	if (dev->parent) {
+	if (dev->parent)
 		dev->parent->power.must_resume = true;
-		if (set_active)
-			dev->parent->power.set_active = true;
-	}
 
 	idx = device_links_read_lock();
 
-	list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node) {
+	list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node)
 		link->supplier->power.must_resume = true;
-		if (set_active)
-			link->supplier->power.set_active = true;
-	}
 
 	device_links_read_unlock(idx);
 }
@@ -1287,9 +1281,12 @@ 
 		dev->power.must_resume = true;
 
 	if (dev->power.must_resume) {
-		dev->power.set_active = dev->power.set_active ||
-			dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND);
-		dpm_superior_set_must_resume(dev, dev->power.set_active);
+		if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND)) {
+			dev->power.set_active = true;
+			if (dev->parent && !dev->parent->power.ignore_children)
+				dev->parent->power.set_active = true;
+		}
+		dpm_superior_set_must_resume(dev);
 	}
 
 Complete: