diff mbox series

usbnet_link_change() fails to call netif_carrier_on()

Message ID m34j43gwto.fsf@t19.piap.pl
State New
Headers show
Series usbnet_link_change() fails to call netif_carrier_on() | expand

Commit Message

Krzysztof Hałasa Nov. 19, 2024, 1:46 p.m. UTC
Hi,

ASIX AX88772B based USB 10/100 Ethernet adapter doesn't come
up ("carrier off"), despite the built-in 100BASE-FX PHY positive link
indication.

The problem appears to be in usbnet.c framework:

void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
{
	/* update link after link is reseted */
	if (link && !need_reset)
		netif_carrier_on(dev->net);
	else
		netif_carrier_off(dev->net);

	if (need_reset && link)
		usbnet_defer_kevent(dev, EVENT_LINK_RESET);
	else
		usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
}

I think the author's idea was a bit different than what the code really
ended doing. Especially when called with link = 1 and need_reset = 1.

It seems it may have already caused problems - possible workarounds:
- commit 7be4cb7189f7 ("net: usb: ax88179_178a: improve reset check")
- commit ecf848eb934b ("net: usb: ax88179_178a: fix link status when
  link is set to down/up")
Can't check those due to -ENOHW but ax88179_link_reset() adds
a netif_carrier_on() call on link ups.

Not sure about the "reset" name, though (and the comment in
usbnet_link_change()). It seems it's when the link goes up.

Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Fixes: ac64995da872 ("usbnet: introduce usbnet_link_change API")
Fixes: 4b49f58fff00 ("usbnet: handle link change")


The code has been introduced in 2013, by 4b49f58fff00 and a bunch of
related commits. Perhaps it visibly affects only AXIS and dm9601
adapters, though.

Comments

Andrew Lunn Nov. 19, 2024, 4:20 p.m. UTC | #1
On Tue, Nov 19, 2024 at 02:46:59PM +0100, Krzysztof Hałasa wrote:
> Hi,
> 
> ASIX AX88772B based USB 10/100 Ethernet adapter doesn't come
> up ("carrier off"), despite the built-in 100BASE-FX PHY positive link
> indication.
> 
> The problem appears to be in usbnet.c framework:
> 
> void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
> {
> 	/* update link after link is reseted */
> 	if (link && !need_reset)
> 		netif_carrier_on(dev->net);
> 	else
> 		netif_carrier_off(dev->net);
> 
> 	if (need_reset && link)
> 		usbnet_defer_kevent(dev, EVENT_LINK_RESET);
> 	else
> 		usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
> }

static int ax88772_phylink_setup(struct usbnet *dev)
{
        struct asix_common_private *priv = dev->driver_priv;
        phy_interface_t phy_if_mode;
        struct phylink *phylink;

        priv->phylink_config.dev = &dev->net->dev;
        priv->phylink_config.type = PHYLINK_NETDEV;
        priv->phylink_config.mac_capabilities = MAC_SYM_PAUSE | MAC_ASYM_PAUSE |
                MAC_10 | MAC_100;

etc.

This device is using phylink to manage the PHY. phylink will than
manage the carrier. It assumes it is solely responsible for the
carrier. So i think your fix is wrong. You probably should be removing
all code in this driver which touches the carrier.

	Andrew
Krzysztof Hałasa Nov. 21, 2024, 6:51 a.m. UTC | #2
Hi Andrew,
thanks for a looking at this.

Andrew Lunn <andrew@lunn.ch> writes:

>> void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
>> {
>>       /* update link after link is reseted */
>>       if (link && !need_reset)
>>               netif_carrier_on(dev->net);
>>       else
>>               netif_carrier_off(dev->net);
>>
>>       if (need_reset && link)
>>               usbnet_defer_kevent(dev, EVENT_LINK_RESET);
>>       else
>>               usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
>> }
>
> This device is using phylink to manage the PHY. phylink will than
> manage the carrier. It assumes it is solely responsible for the
> carrier. So i think your fix is wrong. You probably should be removing
> all code in this driver which touches the carrier.

Ok, I wasn't aware that phylink manages netdev's carrier state.

Then, is the patch wrong just because the asix driver shouldn't use the
function, or is it wrong because the function should work differently
(i.e., the semantics are different)?

Surely the function is broken, isn't it? Calling netif_carrier_off()
on link up event can't be right?


Now the ASIX driver, I'm looking at it for some time now. It consists
of two parts linked together. The ax88172a.c part doesn't use phylink,
while the main asix_devices.c does. So I'm leaving ax88172a.c alone for
now (while it could probably be better ported to the same framework,
i.e., phylink).

The main part uses usbnet.c, which does netif_carrier_{on,off}() in the
above usbnet_link_change(). I guess I can make it use directly
usbnet_defer_kevent() only so it won't be a problem.

Also, usbnet.c calls usbnet_defer_kevent() and thus netif_carrier_off()
in usbnet_probe, removing FLAG_LINK_INTR from asix_devices.c will stop
that.

The last place interacting with carrier status is asix_status(), called
about 8 times a second by usbnet.c intr_complete(). This is independent
of any MDIO traffic. Should I now remove this as well? I guess removing
asix_status would suffice.
Andrew Lunn Nov. 21, 2024, 2:25 p.m. UTC | #3
On Thu, Nov 21, 2024 at 07:51:24AM +0100, Krzysztof Hałasa wrote:
> Hi Andrew,
> thanks for a looking at this.
> 
> Andrew Lunn <andrew@lunn.ch> writes:
> 
> >> void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
> >> {
> >>       /* update link after link is reseted */
> >>       if (link && !need_reset)
> >>               netif_carrier_on(dev->net);
> >>       else
> >>               netif_carrier_off(dev->net);
> >>
> >>       if (need_reset && link)
> >>               usbnet_defer_kevent(dev, EVENT_LINK_RESET);
> >>       else
> >>               usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
> >> }
> >
> > This device is using phylink to manage the PHY. phylink will than
> > manage the carrier. It assumes it is solely responsible for the
> > carrier. So i think your fix is wrong. You probably should be removing
> > all code in this driver which touches the carrier.
> 
> Ok, I wasn't aware that phylink manages netdev's carrier state.
> 
> Then, is the patch wrong just because the asix driver shouldn't use the
> function, or is it wrong because the function should work differently
> (i.e., the semantics are different)?
> 
> Surely the function is broken, isn't it? Calling netif_carrier_off()
> on link up event can't be right?
> 
> 
> Now the ASIX driver, I'm looking at it for some time now. It consists
> of two parts linked together. The ax88172a.c part doesn't use phylink,
> while the main asix_devices.c does. So I'm leaving ax88172a.c alone for
> now (while it could probably be better ported to the same framework,
> i.e., phylink).
> 
> The main part uses usbnet.c, which does netif_carrier_{on,off}() in the
> above usbnet_link_change(). I guess I can make it use directly
> usbnet_defer_kevent() only so it won't be a problem.
> 
> Also, usbnet.c calls usbnet_defer_kevent() and thus netif_carrier_off()
> in usbnet_probe, removing FLAG_LINK_INTR from asix_devices.c will stop
> that.
> 
> The last place interacting with carrier status is asix_status(), called
> about 8 times a second by usbnet.c intr_complete(). This is independent
> of any MDIO traffic. Should I now remove this as well? I guess removing
> asix_status would suffice.

I've not looked at this driver in detail, nor usbnet. So i can only
make general comments. I do see there are a number of drivers which
re-invent the wheel and do their own PHY handling, when they should
allow Linux to do it via phylib/phylink.

When the MAC driver is using phylink, or phylib, it should not touch
the carrier, nor access the PHY directly. The exception can be during
probe, when it can turn the carrier off. What the MAC driver should be
doing is exposing its MDIO bus as a Linux MDIO bus. phylib will then
enumerate the bus and find the PHYs on it. The MAC driver which does
not have access to device tree then typically uses phy_find_first() to
find a PHY on the bus, and uses phy_connect() to bind the PHY to the
MAC. The MAC driver then uses phy_start() when the interface is
opened. phylib will poll the PHY for changing in link status, and call
the callback function registered via phy_connect() to let the MAC know
about what the PHY has negotiated. Other than that, the MAC driver
does nothing with the PHY.

It could well be there are historical discrepancies in usbnet, in that
having Linux drive the PHY is somewhat new for usbnet, historically
the wheel was reinvented, and maybe part of that is in the usbnet
core.

	Andrew
diff mbox series

Patch

--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1978,16 +1978,18 @@  EXPORT_SYMBOL(usbnet_manage_power);
 
 void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
 {
-	/* update link after link is reseted */
-	if (link && !need_reset)
+	if (link)
 		netif_carrier_on(dev->net);
 	else
 		netif_carrier_off(dev->net);
 
-	if (need_reset && link)
-		usbnet_defer_kevent(dev, EVENT_LINK_RESET);
-	else
-		usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
+	if (need_reset) {
+		/* update link after link is reset */
+		if (link)
+			usbnet_defer_kevent(dev, EVENT_LINK_RESET);
+		else
+			usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
+	}
 }
 EXPORT_SYMBOL(usbnet_link_change);