Message ID | 6710d8c18ff54139cdc538763ba544187c5a0cee.1651041411.git.lukas@wunner.de |
---|---|
State | New |
Headers | show |
Series | [net] usbnet: smsc95xx: Fix deadlock on runtime resume | expand |
On Wed, Apr 27, 2022 at 08:41:49AM +0200, Lukas Wunner wrote: > Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended > smsc95xx_resume() to call phy_init_hw(). That function waits for the > device to runtime resume even though it is placed in the runtime resume > path, causing a deadlock. > > The problem is that phy_init_hw() calls down to smsc95xx_mdiobus_read(), > which never uses the _nopm variant of usbnet_read_cmd(). Amend it to > autosense that it's called from the runtime resume/suspend path and use > the _nopm variant if so. ... > diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c > index 4ef61f6b85df..82b8feaa5162 100644 > --- a/drivers/net/usb/smsc95xx.c > +++ b/drivers/net/usb/smsc95xx.c > @@ -285,11 +285,21 @@ static void smsc95xx_mdio_write_nopm(struct usbnet *dev, int idx, int regval) > __smsc95xx_mdio_write(dev, pdata->phydev->mdio.addr, idx, regval, 1); > } > > +static bool smsc95xx_in_pm(struct usbnet *dev) > +{ > +#ifdef CONFIG_PM > + return dev->udev->dev.power.runtime_status == RPM_RESUMING || > + dev->udev->dev.power.runtime_status == RPM_SUSPENDING; > +#else > + return false; > +#endif > +} This does not do what you want. You want to know if this function is being called in the resume pathway, but all it really tells you is whether the function is being called while a resume is in progress (and it doesn't even do that very precisely because the code does not use the runtime-pm spinlock). The resume could be running in a different thread, in which case you most definitely _would_ want to want for it to complete. Alan Stern
On Wed, Apr 27, 2022 at 08:41:49AM +0200, Lukas Wunner wrote: > Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended > smsc95xx_resume() to call phy_init_hw(). That function waits for the > device to runtime resume even though it is placed in the runtime resume > path, causing a deadlock. Hi Lukas You have looked at this code, tried a few different things, so this is probably a dumb question. Do you actually need to call phy_init_hw()? mdio_bus_phy_resume() will call phy_init_hw(). So long as you first resume the MAC and then the PHY, shouldn't this just work? Andrew
On Wed, Apr 27, 2022 at 05:38:51PM +0200, Lukas Wunner wrote: > On Wed, Apr 27, 2022 at 05:24:50PM +0200, Andrew Lunn wrote: > > On Wed, Apr 27, 2022 at 08:41:49AM +0200, Lukas Wunner wrote: > > > Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended > > > smsc95xx_resume() to call phy_init_hw(). That function waits for the > > > device to runtime resume even though it is placed in the runtime resume > > > path, causing a deadlock. > > > > You have looked at this code, tried a few different things, so this is > > probably a dumb question. > > > > Do you actually need to call phy_init_hw()? > > > > mdio_bus_phy_resume() will call phy_init_hw(). So long as you first > > resume the MAC and then the PHY, shouldn't this just work? > > mdio_bus_phy_resume() is only called for system sleep. But this is about > *runtime* PM. > > mdio_bus_phy_pm_ops does not define runtime PM callbacks. So runtime PM > is disabled on PHYs, no callback is invoked for them when the MAC runtime > suspends, hence the onus is on the MAC to runtime suspend the PHY (which > is a child of the MAC). Same on runtime resume. > > Let's say I enable runtime PM on the PHY and use pm_runtime_force_suspend() > to suspend the PHY from the MAC's ->runtime_suspend callback. At that > point the MAC already has status RPM_SUSPENDING. Boom, deadlock. > > The runtime PM core lacks the capability to declare that children should > be force runtime suspended before a device can runtime suspend, that's > the problem. This might work out if you copy the scheme we use for USB devices and interfaces. A USB interface is only a logical part of its parent device, and as such does not have a separate runtime power state of its own (in USB-2, at least). Therefore the USB core calls pm_runtime_no_callbacks() for each interface as it is created, and handles the runtime power management for the interface (i.e., invoking the interface driver's runtime_suspend and runtime_resume callbacks) from within the device's runtime PM routines -- independent of the PM core's notion of what the interface's power state should be. Similarly, you could call pm_runtime_no_callbacks() for the PHY when it is created, and manage the PHY's actual power state from within the MAC's runtime-PM routines directly (i.e., without going through the PM core). Alan Stern
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c index 4ef61f6b85df..82b8feaa5162 100644 --- a/drivers/net/usb/smsc95xx.c +++ b/drivers/net/usb/smsc95xx.c @@ -285,11 +285,21 @@ static void smsc95xx_mdio_write_nopm(struct usbnet *dev, int idx, int regval) __smsc95xx_mdio_write(dev, pdata->phydev->mdio.addr, idx, regval, 1); } +static bool smsc95xx_in_pm(struct usbnet *dev) +{ +#ifdef CONFIG_PM + return dev->udev->dev.power.runtime_status == RPM_RESUMING || + dev->udev->dev.power.runtime_status == RPM_SUSPENDING; +#else + return false; +#endif +} + static int smsc95xx_mdiobus_read(struct mii_bus *bus, int phy_id, int idx) { struct usbnet *dev = bus->priv; - return __smsc95xx_mdio_read(dev, phy_id, idx, 0); + return __smsc95xx_mdio_read(dev, phy_id, idx, smsc95xx_in_pm(dev)); } static int smsc95xx_mdiobus_write(struct mii_bus *bus, int phy_id, int idx, @@ -297,7 +307,7 @@ static int smsc95xx_mdiobus_write(struct mii_bus *bus, int phy_id, int idx, { struct usbnet *dev = bus->priv; - __smsc95xx_mdio_write(dev, phy_id, idx, regval, 0); + __smsc95xx_mdio_write(dev, phy_id, idx, regval, smsc95xx_in_pm(dev)); return 0; }
Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended smsc95xx_resume() to call phy_init_hw(). That function waits for the device to runtime resume even though it is placed in the runtime resume path, causing a deadlock. The problem is that phy_init_hw() calls down to smsc95xx_mdiobus_read(), which never uses the _nopm variant of usbnet_read_cmd(). Amend it to autosense that it's called from the runtime resume/suspend path and use the _nopm variant if so. Stacktrace for posterity: INFO: task kworker/2:1:49 blocked for more than 122 seconds. Workqueue: usb_hub_wq hub_event schedule rpm_resume __pm_runtime_resume usb_autopm_get_interface usbnet_read_cmd __smsc95xx_read_reg __smsc95xx_phy_wait_not_busy __smsc95xx_mdio_read smsc95xx_mdiobus_read __mdiobus_read mdiobus_read smsc_phy_reset phy_init_hw smsc95xx_resume usb_resume_interface usb_resume_both usb_runtime_resume __rpm_callback rpm_callback rpm_resume __pm_runtime_resume usb_autoresume_device hub_event process_one_work Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v5.10+ Cc: Andre Edich <andre.edich@microchip.com> --- drivers/net/usb/smsc95xx.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)