diff mbox series

[net] usbnet: smsc95xx: Fix deadlock on runtime resume

Message ID 6710d8c18ff54139cdc538763ba544187c5a0cee.1651041411.git.lukas@wunner.de
State New
Headers show
Series [net] usbnet: smsc95xx: Fix deadlock on runtime resume | expand

Commit Message

Lukas Wunner April 27, 2022, 6:41 a.m. UTC
Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended
smsc95xx_resume() to call phy_init_hw().  That function waits for the
device to runtime resume even though it is placed in the runtime resume
path, causing a deadlock.

The problem is that phy_init_hw() calls down to smsc95xx_mdiobus_read(),
which never uses the _nopm variant of usbnet_read_cmd().  Amend it to
autosense that it's called from the runtime resume/suspend path and use
the _nopm variant if so.

Stacktrace for posterity:

  INFO: task kworker/2:1:49 blocked for more than 122 seconds.
  Workqueue: usb_hub_wq hub_event
  schedule
  rpm_resume
  __pm_runtime_resume
  usb_autopm_get_interface
  usbnet_read_cmd
  __smsc95xx_read_reg
  __smsc95xx_phy_wait_not_busy
  __smsc95xx_mdio_read
  smsc95xx_mdiobus_read
  __mdiobus_read
  mdiobus_read
  smsc_phy_reset
  phy_init_hw
  smsc95xx_resume
  usb_resume_interface
  usb_resume_both
  usb_runtime_resume
  __rpm_callback
  rpm_callback
  rpm_resume
  __pm_runtime_resume
  usb_autoresume_device
  hub_event
  process_one_work

Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v5.10+
Cc: Andre Edich <andre.edich@microchip.com>
---
 drivers/net/usb/smsc95xx.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

Comments

Alan Stern April 27, 2022, 2 p.m. UTC | #1
On Wed, Apr 27, 2022 at 08:41:49AM +0200, Lukas Wunner wrote:
> Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended
> smsc95xx_resume() to call phy_init_hw().  That function waits for the
> device to runtime resume even though it is placed in the runtime resume
> path, causing a deadlock.
> 
> The problem is that phy_init_hw() calls down to smsc95xx_mdiobus_read(),
> which never uses the _nopm variant of usbnet_read_cmd().  Amend it to
> autosense that it's called from the runtime resume/suspend path and use
> the _nopm variant if so.

...

> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> index 4ef61f6b85df..82b8feaa5162 100644
> --- a/drivers/net/usb/smsc95xx.c
> +++ b/drivers/net/usb/smsc95xx.c
> @@ -285,11 +285,21 @@ static void smsc95xx_mdio_write_nopm(struct usbnet *dev, int idx, int regval)
>  	__smsc95xx_mdio_write(dev, pdata->phydev->mdio.addr, idx, regval, 1);
>  }
>  
> +static bool smsc95xx_in_pm(struct usbnet *dev)
> +{
> +#ifdef CONFIG_PM
> +	return dev->udev->dev.power.runtime_status == RPM_RESUMING ||
> +	       dev->udev->dev.power.runtime_status == RPM_SUSPENDING;
> +#else
> +	return false;
> +#endif
> +}

This does not do what you want.  You want to know if this function is 
being called in the resume pathway, but all it really tells you is 
whether the function is being called while a resume is in progress (and 
it doesn't even do that very precisely because the code does not use the 
runtime-pm spinlock).  The resume could be running in a different 
thread, in which case you most definitely _would_ want to want for it to 
complete.

Alan Stern
Andrew Lunn April 27, 2022, 3:24 p.m. UTC | #2
On Wed, Apr 27, 2022 at 08:41:49AM +0200, Lukas Wunner wrote:
> Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended
> smsc95xx_resume() to call phy_init_hw().  That function waits for the
> device to runtime resume even though it is placed in the runtime resume
> path, causing a deadlock.

Hi Lukas

You have looked at this code, tried a few different things, so this is
probably a dumb question.

Do you actually need to call phy_init_hw()?

mdio_bus_phy_resume() will call phy_init_hw(). So long as you first
resume the MAC and then the PHY, shouldn't this just work?

       Andrew
Alan Stern April 27, 2022, 6:19 p.m. UTC | #3
On Wed, Apr 27, 2022 at 05:38:51PM +0200, Lukas Wunner wrote:
> On Wed, Apr 27, 2022 at 05:24:50PM +0200, Andrew Lunn wrote:
> > On Wed, Apr 27, 2022 at 08:41:49AM +0200, Lukas Wunner wrote:
> > > Commit 05b35e7eb9a1 ("smsc95xx: add phylib support") amended
> > > smsc95xx_resume() to call phy_init_hw().  That function waits for the
> > > device to runtime resume even though it is placed in the runtime resume
> > > path, causing a deadlock.
> > 
> > You have looked at this code, tried a few different things, so this is
> > probably a dumb question.
> > 
> > Do you actually need to call phy_init_hw()?
> > 
> > mdio_bus_phy_resume() will call phy_init_hw(). So long as you first
> > resume the MAC and then the PHY, shouldn't this just work?
> 
> mdio_bus_phy_resume() is only called for system sleep.  But this is about
> *runtime* PM.
> 
> mdio_bus_phy_pm_ops does not define runtime PM callbacks.  So runtime PM
> is disabled on PHYs, no callback is invoked for them when the MAC runtime
> suspends, hence the onus is on the MAC to runtime suspend the PHY (which
> is a child of the MAC).  Same on runtime resume.
> 
> Let's say I enable runtime PM on the PHY and use pm_runtime_force_suspend()
> to suspend the PHY from the MAC's ->runtime_suspend callback.  At that
> point the MAC already has status RPM_SUSPENDING.  Boom, deadlock.
> 
> The runtime PM core lacks the capability to declare that children should
> be force runtime suspended before a device can runtime suspend, that's
> the problem.

This might work out if you copy the scheme we use for USB devices and 
interfaces.

A USB interface is only a logical part of its parent device, and as such 
does not have a separate runtime power state of its own (in USB-2, at 
least).  Therefore the USB core calls pm_runtime_no_callbacks() for each 
interface as it is created, and handles the runtime power management for 
the interface (i.e., invoking the interface driver's runtime_suspend and 
runtime_resume callbacks) from within the device's runtime PM routines 
-- independent of the PM core's notion of what the interface's power 
state should be.

Similarly, you could call pm_runtime_no_callbacks() for the PHY when it 
is created, and manage the PHY's actual power state from within the 
MAC's runtime-PM routines directly (i.e., without going through the PM 
core).

Alan Stern
diff mbox series

Patch

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 4ef61f6b85df..82b8feaa5162 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -285,11 +285,21 @@  static void smsc95xx_mdio_write_nopm(struct usbnet *dev, int idx, int regval)
 	__smsc95xx_mdio_write(dev, pdata->phydev->mdio.addr, idx, regval, 1);
 }
 
+static bool smsc95xx_in_pm(struct usbnet *dev)
+{
+#ifdef CONFIG_PM
+	return dev->udev->dev.power.runtime_status == RPM_RESUMING ||
+	       dev->udev->dev.power.runtime_status == RPM_SUSPENDING;
+#else
+	return false;
+#endif
+}
+
 static int smsc95xx_mdiobus_read(struct mii_bus *bus, int phy_id, int idx)
 {
 	struct usbnet *dev = bus->priv;
 
-	return __smsc95xx_mdio_read(dev, phy_id, idx, 0);
+	return __smsc95xx_mdio_read(dev, phy_id, idx, smsc95xx_in_pm(dev));
 }
 
 static int smsc95xx_mdiobus_write(struct mii_bus *bus, int phy_id, int idx,
@@ -297,7 +307,7 @@  static int smsc95xx_mdiobus_write(struct mii_bus *bus, int phy_id, int idx,
 {
 	struct usbnet *dev = bus->priv;
 
-	__smsc95xx_mdio_write(dev, phy_id, idx, regval, 0);
+	__smsc95xx_mdio_write(dev, phy_id, idx, regval, smsc95xx_in_pm(dev));
 	return 0;
 }