Message ID | 6137505.lOV4Wx5bFT@rjwysocki.net |
---|---|
State | New |
Headers | show |
Series | [v1] PM: sleep: core: Restrict power.set_active propagation | expand |
On Mon, Feb 10, 2025 at 10:31 AM Johan Hovold <johan@kernel.org> wrote: > > On Sat, Feb 08, 2025 at 06:54:28PM +0100, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > Commit 3775fc538f53 ("PM: sleep: core: Synchronize runtime PM status of > > parents and children") exposed an issue related to simple_pm_bus_pm_ops > > that uses pm_runtime_force_suspend() and pm_runtime_force_resume() as > > bus type PM callbacks for the noirq phases of system-wide suspend and > > Despite the name of the driver these are plain device driver PM > callbacks (not bus PM ops). Right, I was confused. BTW, I still think that it is invalid to use pm_runtime_force_suspend() and pm_runtime_force_resume() as system-wide PM callbacks without enabling runtime PM. The system-wide PM callbacks should be wrappers around these checking something (eg. dev_get_drvdata(dev)) other than dev->power.runtime_status. It is too fragile the current way. > > resume. > > > > The problem is that pm_runtime_force_suspend() does not distinguish > > runtime-suspended devices from devices for which runtime PM has never > > been enabled, so if it sees a device with runtime PM status set to > > RPM_ACTIVE, it will assume that runtime PM is enabled for that device > > and so it will attempt to suspend it with the help of its runtime PM > > callbacks which may not be ready for that. As it turns out, this > > causes simple_pm_bus_runtime_suspend() to crash due to a NULL pointer > > dereference. > > > > Another problem related to the above commit and simple_pm_bus_pm_ops is > > that setting runtime PM status of a device handled by the latter to > > RPM_ACTIVE will actually prevent it from being resumed because > > pm_runtime_force_resume() only resumes devices with runtime PM status > > set to RPM_SUSPENDED. > > > > To mitigate these issues, do not allow power.set_active to propagate > > beyond the parent of the device with DPM_FLAG_SMART_SUSPEND set that > > will need to be resumed, which should be a sufficient stop-gap for the > > time being, but they will need to be properly addressed in the future > > because in general during system-wide resume it is necessary to resume > > all devices in a dependency chain in which at least one device is going > > to be resumed. > > So this works as long as no parent of a device with > DPM_FLAG_SMART_SUSPEND set is using pm_runtime_force_resume(). > > This is the case in the systems I work on, but have you > verified that this is currently generally true? Not many drivers use > this flag, but it all depends on what their devices' parents' drivers > do: > > drivers/acpi/acpi_tad.c > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > drivers/i2c/busses/i2c-designware-platdrv.c > drivers/mfd/intel-lpss.c > drivers/pci/pcie/portdrv.c > drivers/pwm/pwm-lpss-platform.c > drivers/soundwire/intel_auxdevice.c > > Most of these look like ACPI drivers so nothing that would sit directly > on a simple-pm-bus at least. I'm not worried about these. Also, I have a patch series to get the power.set_active propagation into agreement with pm_runtime_force_suspend/resume(). I'll post it tomorrow.
--- a/drivers/base/power/main.c +++ b/drivers/base/power/main.c @@ -1191,24 +1191,18 @@ return PMSG_ON; } -static void dpm_superior_set_must_resume(struct device *dev, bool set_active) +static void dpm_superior_set_must_resume(struct device *dev) { struct device_link *link; int idx; - if (dev->parent) { + if (dev->parent) dev->parent->power.must_resume = true; - if (set_active) - dev->parent->power.set_active = true; - } idx = device_links_read_lock(); - list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node) { + list_for_each_entry_rcu_locked(link, &dev->links.suppliers, c_node) link->supplier->power.must_resume = true; - if (set_active) - link->supplier->power.set_active = true; - } device_links_read_unlock(idx); } @@ -1287,9 +1281,12 @@ dev->power.must_resume = true; if (dev->power.must_resume) { - dev->power.set_active = dev->power.set_active || - dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND); - dpm_superior_set_must_resume(dev, dev->power.set_active); + if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND)) { + dev->power.set_active = true; + if (dev->parent && !dev->parent->power.ignore_children) + dev->parent->power.set_active = true; + } + dpm_superior_set_must_resume(dev); } Complete: