[net-next,v3,0/9] provide cable test support for the ksz886x switch

Message ID 20210526043037.9830-1-o.rempel@pengutronix.de
Message

Oleksij Rempel May 26, 2021, 4:30 a.m. UTC
Since we already have 5.13-rc3, I assume http://vger.kernel.org/~davem/net-next.html
is out of date.

changes v3:
- remove RFC tag

changes v2:
- use generic MII_* defines where possible
- rework phylink validate
- remove phylink get state function
- reorder cabletest patches to make PHY flag patch in the right order
- fix MDI-X detection

These patches provide support for cable testing on the ksz886x switches.
Since the switch has one special port, we needed to add phylink with
validation and an extra quirk for the PHY to signal that one port will not
provide valid cable testing reports.

Michael Grzeschik (2):
  net: phy: micrel: move phy reg offsets to common header
  net: dsa: microchip: ksz8795: add phylink support

Oleksij Rempel (7):
  net: phy: micrel: use consistent indention after define
  net: phy: micrel: apply resume errata workaround for ksz8873 and
    ksz8863
  net: phy/dsa micrel/ksz886x add MDI-X support
  net: phy: micrel: ksz8081 add MDI-X support
  net: dsa: microchip: ksz8795: add LINK_MD register support
  net: dsa: dsa_slave_phy_connect(): extend phy's flags with port
    specific phy flags
  net: phy: micrel: ksz886x/ksz8081: add cabletest support

 drivers/net/dsa/microchip/ksz8795.c     | 218 +++++++++----
 drivers/net/dsa/microchip/ksz8795_reg.h |  67 +---
 drivers/net/ethernet/micrel/ksz884x.c   | 105 +-----
 drivers/net/phy/micrel.c                | 403 +++++++++++++++++++++++-
 drivers/net/phy/phylink.c               |   2 +-
 include/linux/micrel_phy.h              |  16 +
 net/dsa/slave.c                         |   4 +
 7 files changed, 588 insertions(+), 227 deletions(-)

Comments

Jakub Kicinski May 26, 2021, 7:32 p.m. UTC | #1
On Wed, 26 May 2021 06:30:37 +0200 Oleksij Rempel wrote:
> +	if (phydev->dev_flags & MICREL_KSZ8_P1_ERRATA)
> +		return -ENOTSUPP;

EOPNOTSUPP
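
For context on the one-word review above: ENOTSUPP (524) is a kernel-internal
code defined in include/linux/errno.h and has no userspace definition, while
EOPNOTSUPP is the standard errno that the C library can translate. A minimal
sketch of the corrected check follows; SKETCH_P1_ERRATA and
sketch_cable_test_start() are hypothetical stand-ins, not the names or values
used by the patch.

```c
#include <errno.h>
#include <string.h>

/* Kernel-internal value from include/linux/errno.h, redefined here only for
 * illustration because userspace headers do not provide it. */
#define KERNEL_ENOTSUPP 524

/* Hypothetical flag bit standing in for MICREL_KSZ8_P1_ERRATA. */
#define SKETCH_P1_ERRATA (1u << 0)

/* Reject cable tests on the affected port with the userspace-visible errno. */
static int sketch_cable_test_start(unsigned int dev_flags)
{
	if (dev_flags & SKETCH_P1_ERRATA)
		return -EOPNOTSUPP;

	return 0;
}

/* Returns nonzero when the C library has a non-empty message for err. */
static int sketch_errno_has_message(int err)
{
	const char *msg = strerror(err);

	return msg != NULL && msg[0] != '\0';
}
```
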
Vladimir Oltean May 26, 2021, 10:01 p.m. UTC | #2
On Wed, May 26, 2021 at 06:30:29AM +0200, Oleksij Rempel wrote:
> From: Michael Grzeschik <m.grzeschik@pengutronix.de>
> 
> Some micrel devices share the same PHY register defines. This patch
> moves them to one common header so other drivers can reuse them.
> And reuse generic MII_* defines where possible.
> 
> Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> ---
>  drivers/net/dsa/microchip/ksz8795.c     | 119 ++++++++++++------------
>  drivers/net/dsa/microchip/ksz8795_reg.h |  62 ------------
>  drivers/net/ethernet/micrel/ksz884x.c   | 105 +++------------------
>  include/linux/micrel_phy.h              |  13 +++
>  4 files changed, 88 insertions(+), 211 deletions(-)
> 
> diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
> index ad509a57a945..ba065003623f 100644
> --- a/drivers/net/dsa/microchip/ksz8795.c
> +++ b/drivers/net/dsa/microchip/ksz8795.c
> @@ -15,6 +15,7 @@
>  #include <linux/phy.h>
>  #include <linux/etherdevice.h>
>  #include <linux/if_bridge.h>
> +#include <linux/micrel_phy.h>
>  #include <net/dsa.h>
>  #include <net/switchdev.h>
>  
> @@ -731,88 +732,88 @@ static void ksz8_r_phy(struct ksz_device *dev, u16 phy, u16 reg, u16 *val)
>  	u8 p = phy;
>  
>  	switch (reg) {
> -	case PHY_REG_CTRL:
> +	case MII_BMCR:
>  		ksz_pread8(dev, p, regs[P_NEG_RESTART_CTRL], &restart);
>  		ksz_pread8(dev, p, regs[P_SPEED_STATUS], &speed);
>  		ksz_pread8(dev, p, regs[P_FORCE_CTRL], &ctrl);
>  		if (restart & PORT_PHY_LOOPBACK)
> -			data |= PHY_LOOPBACK;
> +			data |= BMCR_LOOPBACK;
>  		if (ctrl & PORT_FORCE_100_MBIT)
> -			data |= PHY_SPEED_100MBIT;
> +			data |= BMCR_SPEED100;
>  		if (ksz_is_ksz88x3(dev)) {
>  			if ((ctrl & PORT_AUTO_NEG_ENABLE))
> -				data |= PHY_AUTO_NEG_ENABLE;
> +				data |= BMCR_ANENABLE;
>  		} else {
>  			if (!(ctrl & PORT_AUTO_NEG_DISABLE))
> -				data |= PHY_AUTO_NEG_ENABLE;
> +				data |= BMCR_ANENABLE;
>  		}
>  		if (restart & PORT_POWER_DOWN)
> -			data |= PHY_POWER_DOWN;
> +			data |= BMCR_PDOWN;
>  		if (restart & PORT_AUTO_NEG_RESTART)
> -			data |= PHY_AUTO_NEG_RESTART;
> +			data |= BMCR_ANRESTART;
>  		if (ctrl & PORT_FORCE_FULL_DUPLEX)
> -			data |= PHY_FULL_DUPLEX;
> +			data |= BMCR_FULLDPLX;
>  		if (speed & PORT_HP_MDIX)
> -			data |= PHY_HP_MDIX;
> +			data |= KSZ886X_BMCR_HP_MDIX;
>  		if (restart & PORT_FORCE_MDIX)
> -			data |= PHY_FORCE_MDIX;
> +			data |= KSZ886X_BMCR_FORCE_MDI;
>  		if (restart & PORT_AUTO_MDIX_DISABLE)
> -			data |= PHY_AUTO_MDIX_DISABLE;
> +			data |= KSZ886X_BMCR_DISABLE_AUTO_MDIX;
>  		if (restart & PORT_TX_DISABLE)
> -			data |= PHY_TRANSMIT_DISABLE;
> +			data |= KSZ886X_BMCR_DISABLE_TRANSMIT;
>  		if (restart & PORT_LED_OFF)
> -			data |= PHY_LED_DISABLE;
> +			data |= KSZ886X_BMCR_DISABLE_LED;
>  		break;

I am deeply confused as to what this function is doing. It is reading
the 8-bit port registers P_NEG_RESTART_CTRL, P_SPEED_STATUS and
P_FORCE_CTRL and stitching them into a 16-bit "MII_BMCR"?

What layout does this control register even have? Seeing as this is the
implementation of ksz_phy_read16(), I expect that MII_BMCR has the
layout specified in clause 22.2.4.1?

But clause 22 says register 0.5 is "Unidirectional enable", not
"PHY_HP_MDIX" (whatever that might be), and bits 0.4:0 are reserved and
must be written as zero and ignored on read.

> -	case PHY_REG_STATUS:
> +	case MII_BMSR:
>  		ksz_pread8(dev, p, regs[P_LINK_STATUS], &link);
> -		data = PHY_100BTX_FD_CAPABLE |
> -		       PHY_100BTX_CAPABLE |
> -		       PHY_10BT_FD_CAPABLE |
> -		       PHY_10BT_CAPABLE |
> -		       PHY_AUTO_NEG_CAPABLE;
> +		data = BMSR_100FULL |
> +		       BMSR_100HALF |
> +		       BMSR_10FULL |
> +		       BMSR_10HALF |
> +		       BMSR_ANEGCAPABLE;
>  		if (link & PORT_AUTO_NEG_COMPLETE)
> -			data |= PHY_AUTO_NEG_ACKNOWLEDGE;
> +			data |= BMSR_ANEGCOMPLETE;
>  		if (link & PORT_STAT_LINK_GOOD)
> -			data |= PHY_LINK_STATUS;
> +			data |= BMSR_LSTATUS;
>  		break;
> -	case PHY_REG_ID_1:
> +	case MII_PHYSID1:
>  		data = KSZ8795_ID_HI;
>  		break;
> -	case PHY_REG_ID_2:
> +	case MII_PHYSID2:
>  		if (ksz_is_ksz88x3(dev))
>  			data = KSZ8863_ID_LO;
>  		else
>  			data = KSZ8795_ID_LO;
>  		break;
> -	case PHY_REG_AUTO_NEGOTIATION:
> +	case MII_ADVERTISE:
>  		ksz_pread8(dev, p, regs[P_LOCAL_CTRL], &ctrl);
> -		data = PHY_AUTO_NEG_802_3;
> +		data = ADVERTISE_CSMA;
>  		if (ctrl & PORT_AUTO_NEG_SYM_PAUSE)
> -			data |= PHY_AUTO_NEG_SYM_PAUSE;
> +			data |= ADVERTISE_PAUSE_CAP;
>  		if (ctrl & PORT_AUTO_NEG_100BTX_FD)
> -			data |= PHY_AUTO_NEG_100BTX_FD;
> +			data |= ADVERTISE_100FULL;
>  		if (ctrl & PORT_AUTO_NEG_100BTX)
> -			data |= PHY_AUTO_NEG_100BTX;
> +			data |= ADVERTISE_100HALF;
>  		if (ctrl & PORT_AUTO_NEG_10BT_FD)
> -			data |= PHY_AUTO_NEG_10BT_FD;
> +			data |= ADVERTISE_10FULL;
>  		if (ctrl & PORT_AUTO_NEG_10BT)
> -			data |= PHY_AUTO_NEG_10BT;
> +			data |= ADVERTISE_10HALF;
>  		break;
> -	case PHY_REG_REMOTE_CAPABILITY:
> +	case MII_LPA:
>  		ksz_pread8(dev, p, regs[P_REMOTE_STATUS], &link);
> -		data = PHY_AUTO_NEG_802_3;
> +		data = LPA_SLCT;
>  		if (link & PORT_REMOTE_SYM_PAUSE)
> -			data |= PHY_AUTO_NEG_SYM_PAUSE;
> +			data |= LPA_PAUSE_CAP;
>  		if (link & PORT_REMOTE_100BTX_FD)
> -			data |= PHY_AUTO_NEG_100BTX_FD;
> +			data |= LPA_100FULL;
>  		if (link & PORT_REMOTE_100BTX)
> -			data |= PHY_AUTO_NEG_100BTX;
> +			data |= LPA_100HALF;
>  		if (link & PORT_REMOTE_10BT_FD)
> -			data |= PHY_AUTO_NEG_10BT_FD;
> +			data |= LPA_10FULL;
>  		if (link & PORT_REMOTE_10BT)
> -			data |= PHY_AUTO_NEG_10BT;
> -		if (data & ~PHY_AUTO_NEG_802_3)
> -			data |= PHY_REMOTE_ACKNOWLEDGE_NOT;
> +			data |= LPA_10HALF;
> +		if (data & ~LPA_SLCT)
> +			data |= LPA_LPACK;
>  		break;
>  	default:
>  		processed = false;
Vladimir Oltean May 26, 2021, 10:43 p.m. UTC | #3
On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> The ksz8873 and ksz8863 switches are affected by following errata:
> 
> | "Receiver error in 100BASE-TX mode following Soft Power Down"
> |
> | Some KSZ8873 devices may exhibit receiver errors after transitioning
> | from Soft Power Down mode to Normal mode, as controlled by register 195
> | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> | blocks may not start up properly, causing the PHY to miss data and
> | exhibit erratic behavior. The problem may appear on either port 1 or
> | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> | 10BASE-T.
> |
> | END USER IMPLICATIONS
> | When the failure occurs, the following symptoms are seen on the affected
> | port(s):
> | - The port is able to link
> | - LED0 blinks, even when there is no traffic
> | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> |   Errors, Rx CRC Errors, Rx Alignment Errors)
> | - Only a small fraction of packets is correctly received and forwarded
> |   through the switch. Most packets are dropped due to receive errors.
> |
> | The failing condition cannot be corrected by the following:
> | - Removing and reconnecting the cable
> | - Hardware reset
> | - Software Reset and PCS Reset bits in register 67 (0x43)
> |
> | Work around:
> | The problem can be corrected by setting and then clearing the Port Power
> | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> | separately for each affected port after returning from Soft Power Down
> | Mode to Normal Mode. The following procedure will ensure no further
> | issues due to this erratum. To enter Soft Power Down Mode, set register
> | 195 (0xC3), bits [1:0] = 10.
> |
> | To exit Soft Power Down Mode, follow these steps:
> | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> | 2. Wait 1ms minimum
> | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> 
> This patch implements steps 2...6 of the suggested workaround. The first
> step needs to be implemented in the switch driver.

Am I right in understanding that register 195 (0xc3) is not a port register?

To hit the erratum, you have to enter Soft Power Down in the first place,
presumably by writing register 0xc3 from somewhere, right?

Where does Linux write this register from?

Once we find that place that enters/exits Soft Power Down mode, can't we
just toggle the Port Power Down bits for each port, exactly like the ERR
workaround says, instead of fooling around with a PHY driver?

> 
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> ---
>  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> index 227d88db7d27..f03188ed953a 100644
> --- a/drivers/net/phy/micrel.c
> +++ b/drivers/net/phy/micrel.c
> @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
>  	return 0;
>  }
>  
> +static int ksz886x_resume(struct phy_device *phydev)
> +{
> +	int ret;
> +
> +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> +	 *
> +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> +	 * up properly, causing the PHY to miss data and exhibit erratic
> +	 * behavior.
> +	 */
> +	usleep_range(1000, 2000);
> +
> +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> +	if (ret)
> +		return ret;
> +
> +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> +}
> +
>  static int kszphy_get_sset_count(struct phy_device *phydev)
>  {
>  	return ARRAY_SIZE(kszphy_hw_stats);
> @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
>  	/* PHY_BASIC_FEATURES */
>  	.config_init	= kszphy_config_init,
>  	.suspend	= genphy_suspend,
> -	.resume		= genphy_resume,
> +	.resume		= ksz886x_resume,

Are you able to explain the relation between the call paths of
phy_resume() and the lifetime of the Soft Power Down setting of the
switch? How do we know that the PHYs are resumed after the switch has
exited Soft Power Down mode?

>  }, {
>  	.name		= "Micrel KSZ87XX Switch",
>  	/* PHY_BASIC_FEATURES */
> -- 
> 2.29.2
>
Andrew Lunn May 27, 2021, 3:16 p.m. UTC | #4
> >  	switch (reg) {
> > -	case PHY_REG_CTRL:
> > +	case MII_BMCR:
> >  		ksz_pread8(dev, p, regs[P_NEG_RESTART_CTRL], &restart);
> >  		ksz_pread8(dev, p, regs[P_SPEED_STATUS], &speed);
> >  		ksz_pread8(dev, p, regs[P_FORCE_CTRL], &ctrl);
> >  		if (restart & PORT_PHY_LOOPBACK)
> > -			data |= PHY_LOOPBACK;
> > +			data |= BMCR_LOOPBACK;
> >  		if (ctrl & PORT_FORCE_100_MBIT)
> > -			data |= PHY_SPEED_100MBIT;
> > +			data |= BMCR_SPEED100;
> >  		if (ksz_is_ksz88x3(dev)) {
> >  			if ((ctrl & PORT_AUTO_NEG_ENABLE))
> > -				data |= PHY_AUTO_NEG_ENABLE;
> > +				data |= BMCR_ANENABLE;
> >  		} else {
> >  			if (!(ctrl & PORT_AUTO_NEG_DISABLE))
> > -				data |= PHY_AUTO_NEG_ENABLE;
> > +				data |= BMCR_ANENABLE;
> >  		}
> >  		if (restart & PORT_POWER_DOWN)
> > -			data |= PHY_POWER_DOWN;
> > +			data |= BMCR_PDOWN;
> >  		if (restart & PORT_AUTO_NEG_RESTART)
> > -			data |= PHY_AUTO_NEG_RESTART;
> > +			data |= BMCR_ANRESTART;
> >  		if (ctrl & PORT_FORCE_FULL_DUPLEX)
> > -			data |= PHY_FULL_DUPLEX;
> > +			data |= BMCR_FULLDPLX;
> >  		if (speed & PORT_HP_MDIX)
> > -			data |= PHY_HP_MDIX;
> > +			data |= KSZ886X_BMCR_HP_MDIX;
> >  		if (restart & PORT_FORCE_MDIX)
> > -			data |= PHY_FORCE_MDIX;
> > +			data |= KSZ886X_BMCR_FORCE_MDI;
> >  		if (restart & PORT_AUTO_MDIX_DISABLE)
> > -			data |= PHY_AUTO_MDIX_DISABLE;
> > +			data |= KSZ886X_BMCR_DISABLE_AUTO_MDIX;
> >  		if (restart & PORT_TX_DISABLE)
> > -			data |= PHY_TRANSMIT_DISABLE;
> > +			data |= KSZ886X_BMCR_DISABLE_TRANSMIT;
> >  		if (restart & PORT_LED_OFF)
> > -			data |= PHY_LED_DISABLE;
> > +			data |= KSZ886X_BMCR_DISABLE_LED;
> >  		break;
> 
> I am deeply confused as to what this function is doing. It is reading
> the 8-bit port registers P_NEG_RESTART_CTRL, P_SPEED_STATUS and
> P_FORCE_CTRL and stitching them into a 16-bit "MII_BMCR"?

Sort of. Take a look at the datasheet for the ksz8841. It has clause
22 like registers which it exports to a PHY driver. It puts MDIX
control into the bottom of the BMCR. So this DSA driver is emulating
the ksz8841 so it can share the PHY driver.

    Andrew
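
The emulation described above can be sketched in isolation. The block below is
a rough, standalone illustration and not the driver code: the BMCR_* values
match include/uapi/linux/mii.h, the low vendor bit positions are assumptions
(bit 5 for HP MDIX follows the "register 0.5" remark earlier in the thread),
and every SK_* port register bit is a hypothetical placeholder except the Port
Power Down bit, which the erratum text places at bit 3.

```c
#include <stdint.h>

/* Standard clause 22 BMCR bits, values as in include/uapi/linux/mii.h. */
#define BMCR_FULLDPLX	0x0100
#define BMCR_PDOWN	0x0800
#define BMCR_SPEED100	0x2000
#define BMCR_LOOPBACK	0x4000

/* Vendor bits placed in the low BMCR bits; positions are assumptions. */
#define KSZ886X_BMCR_HP_MDIX	(1u << 5)
#define KSZ886X_BMCR_FORCE_MDI	(1u << 4)

/* Hypothetical 8-bit port register layouts, for illustration only. */
#define SK_PORT_PHY_LOOPBACK	(1u << 7)	/* in "restart" register */
#define SK_PORT_POWER_DOWN	(1u << 3)	/* in "restart" register */
#define SK_PORT_FORCE_MDIX	(1u << 1)	/* in "restart" register */
#define SK_PORT_HP_MDIX		(1u << 7)	/* in "speed" register */
#define SK_PORT_FORCE_100_MBIT	(1u << 6)	/* in "ctrl" register */
#define SK_PORT_FORCE_FULL_DUPLEX (1u << 5)	/* in "ctrl" register */

/* Stitch three 8-bit port registers into one emulated 16-bit BMCR value. */
static uint16_t sketch_read_bmcr(uint8_t restart, uint8_t speed, uint8_t ctrl)
{
	uint16_t data = 0;

	if (restart & SK_PORT_PHY_LOOPBACK)
		data |= BMCR_LOOPBACK;
	if (ctrl & SK_PORT_FORCE_100_MBIT)
		data |= BMCR_SPEED100;
	if (restart & SK_PORT_POWER_DOWN)
		data |= BMCR_PDOWN;
	if (ctrl & SK_PORT_FORCE_FULL_DUPLEX)
		data |= BMCR_FULLDPLX;
	if (speed & SK_PORT_HP_MDIX)
		data |= KSZ886X_BMCR_HP_MDIX;
	if (restart & SK_PORT_FORCE_MDIX)
		data |= KSZ886X_BMCR_FORCE_MDI;

	return data;
}
```

The point of the design is that a PHY driver written against the emulated
layout needs no knowledge of which 8-bit switch register each bit came from.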
Oleksij Rempel June 10, 2021, 10:20 a.m. UTC | #5
On Thu, May 27, 2021 at 01:13:04AM +0300, Vladimir Oltean wrote:
> On Wed, May 26, 2021 at 06:30:30AM +0200, Oleksij Rempel wrote:
> > From: Michael Grzeschik <m.grzeschik@pengutronix.de>
> > 
> > This patch adds the phylink support to the ksz8795 driver to provide
> > configuration exceptions on quirky KSZ8863 and KSZ8873 ports.
> > 
> > Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > ---
> >  drivers/net/dsa/microchip/ksz8795.c | 59 +++++++++++++++++++++++++++++
> >  1 file changed, 59 insertions(+)
> > 
> > diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
> > index ba065003623f..cf81ae87544d 100644
> > --- a/drivers/net/dsa/microchip/ksz8795.c
> > +++ b/drivers/net/dsa/microchip/ksz8795.c
> > @@ -18,6 +18,7 @@
> >  #include <linux/micrel_phy.h>
> >  #include <net/dsa.h>
> >  #include <net/switchdev.h>
> > +#include <linux/phylink.h>
> >  
> >  #include "ksz_common.h"
> >  #include "ksz8795_reg.h"
> > @@ -1420,11 +1421,69 @@ static int ksz8_setup(struct dsa_switch *ds)
> >  	return 0;
> >  }
> >  
> > +static void ksz8_validate(struct dsa_switch *ds, int port,
> > +			  unsigned long *supported,
> > +			  struct phylink_link_state *state)
> > +{
> > +	__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
> > +	struct ksz_device *dev = ds->priv;
> > +
> > +	if (port == dev->cpu_port) {
> > +		if (state->interface != PHY_INTERFACE_MODE_RMII &&
> > +		    state->interface != PHY_INTERFACE_MODE_MII &&
> > +		    state->interface != PHY_INTERFACE_MODE_NA)
> > +			goto unsupported;
> > +	} else if (port > dev->port_cnt) {
> > +		bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
> > +		dev_err(ds->dev, "Unsupported port: %i\n", port);
> > +		return;
> 
> Is this possible or do we just like to invent things to check?
> Unless I'm missing something, ksz8_switch_init() does:
> 
> 	dev->ds->num_ports = dev->port_cnt;
> 
> and dsa_port_phylink_validate() does:
> 
> 	ds->ops->phylink_validate(ds, dp->index, supported, state);
> 
> where dp->index is set to @port by dsa_port_touch() in this loop:
> 
> 	for (port = 0; port < ds->num_ports; port++) {
> 		dp = dsa_port_touch(ds, port);
> 		if (!dp)
> 			return -ENOMEM;
> 	}
> 
> So, if 0 <= dp->index < ds->num_ports == dev->port_cnt, what is the point?

good point

Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
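
The reachability argument in the quoted review can be checked mechanically. The
sketch below is a standalone model of the quoted call chain under the stated
assumption that num_ports equals port_cnt; sketch_port_check_can_fire() is a
hypothetical helper, not kernel code.

```c
#include <stdbool.h>

/* Model of the quoted chain: ports are created for 0 <= port < num_ports,
 * with num_ports == port_cnt, and each index is later passed to validate. */
static bool sketch_port_check_can_fire(int port_cnt)
{
	int num_ports = port_cnt;	/* dev->ds->num_ports = dev->port_cnt */
	int port;

	for (port = 0; port < num_ports; port++) {
		/* The condition from the patch under review. */
		if (port > port_cnt)
			return true;
	}

	return false;	/* the branch is never taken for any index in range */
}
```

Under this model the branch is dead code for every port count, which matches
the conclusion reached in the thread.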
Oleksij Rempel June 10, 2021, 11:49 a.m. UTC | #6
On Thu, May 27, 2021 at 01:43:29AM +0300, Vladimir Oltean wrote:
> On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> > The ksz8873 and ksz8863 switches are affected by following errata:
> > 
> > | "Receiver error in 100BASE-TX mode following Soft Power Down"
> > |
> > | Some KSZ8873 devices may exhibit receiver errors after transitioning
> > | from Soft Power Down mode to Normal mode, as controlled by register 195
> > | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> > | blocks may not start up properly, causing the PHY to miss data and
> > | exhibit erratic behavior. The problem may appear on either port 1 or
> > | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> > | 10BASE-T.
> > |
> > | END USER IMPLICATIONS
> > | When the failure occurs, the following symptoms are seen on the affected
> > | port(s):
> > | - The port is able to link
> > | - LED0 blinks, even when there is no traffic
> > | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> > |   Errors, Rx CRC Errors, Rx Alignment Errors)
> > | - Only a small fraction of packets is correctly received and forwarded
> > |   through the switch. Most packets are dropped due to receive errors.
> > |
> > | The failing condition cannot be corrected by the following:
> > | - Removing and reconnecting the cable
> > | - Hardware reset
> > | - Software Reset and PCS Reset bits in register 67 (0x43)
> > |
> > | Work around:
> > | The problem can be corrected by setting and then clearing the Port Power
> > | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> > | separately for each affected port after returning from Soft Power Down
> > | Mode to Normal Mode. The following procedure will ensure no further
> > | issues due to this erratum. To enter Soft Power Down Mode, set register
> > | 195 (0xC3), bits [1:0] = 10.
> > |
> > | To exit Soft Power Down Mode, follow these steps:
> > | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> > | 2. Wait 1ms minimum
> > | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> > | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> > | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> > | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> > 
> > This patch implements steps 2...6 of the suggested workaround. The first
> > step needs to be implemented in the switch driver.
> 
> Am I right in understanding that register 195 (0xc3) is not a port register?
> 
> To hit the erratum, you have to enter Soft Power Down in the first place,
> presumably by writing register 0xc3 from somewhere, right?
> 
> Where does Linux write this register from?
> 
> Once we find that place that enters/exits Soft Power Down mode, can't we
> just toggle the Port Power Down bits for each port, exactly like the ERR
> workaround says, instead of fooling around with a PHY driver?

The KSZ8873 switch is using multiple register mappings.
https://ww1.microchip.com/downloads/en/DeviceDoc/00002348A.pdf
Page 38:
"The MIIM interface is used to access the MII PHY registers defined in
this section. The SPI, I2C, and SMI interfaces can also be used to access
some of these registers. The latter three interfaces use a different
mapping mechanism than the MIIM interface."

This PHY driver is able to work directly over MIIM (MDIO). Or work with DSA over
integrated register translation mapping.

> > 
> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > ---
> >  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
> >  1 file changed, 21 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> > index 227d88db7d27..f03188ed953a 100644
> > --- a/drivers/net/phy/micrel.c
> > +++ b/drivers/net/phy/micrel.c
> > @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
> >  	return 0;
> >  }
> >  
> > +static int ksz886x_resume(struct phy_device *phydev)
> > +{
> > +	int ret;
> > +
> > +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> > +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> > +	 *
> > +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> > +	 * up properly, causing the PHY to miss data and exhibit erratic
> > +	 * behavior.
> > +	 */
> > +	usleep_range(1000, 2000);
> > +
> > +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > +}
> > +
> >  static int kszphy_get_sset_count(struct phy_device *phydev)
> >  {
> >  	return ARRAY_SIZE(kszphy_hw_stats);
> > @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
> >  	/* PHY_BASIC_FEATURES */
> >  	.config_init	= kszphy_config_init,
> >  	.suspend	= genphy_suspend,
> > -	.resume		= genphy_resume,
> > +	.resume		= ksz886x_resume,
> 
> Are you able to explain the relation between the call paths of
> phy_resume() and the lifetime of the Soft Power Down setting of the
> switch? How do we know that the PHYs are resumed after the switch has
> exited Soft Power Down mode?

The MII_BMCRs BMCR_PDOWN bit is mapped to the "register 29 (0x1D), bit
[3]" for the PHY0 and to "register 45 (0x2D), bit [3]" for the PHY1.

I assume, I'll need to add this comments to the commit message. Or do
you have other suggestions on how this should be implemented?

Regards,
Oleksij
Vladimir Oltean June 10, 2021, 1:04 p.m. UTC | #7
On Thu, Jun 10, 2021 at 01:49:20PM +0200, Oleksij Rempel wrote:
> On Thu, May 27, 2021 at 01:43:29AM +0300, Vladimir Oltean wrote:
> > On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> > > The ksz8873 and ksz8863 switches are affected by following errata:
> > > 
> > > | "Receiver error in 100BASE-TX mode following Soft Power Down"
> > > |
> > > | Some KSZ8873 devices may exhibit receiver errors after transitioning
> > > | from Soft Power Down mode to Normal mode, as controlled by register 195
> > > | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> > > | blocks may not start up properly, causing the PHY to miss data and
> > > | exhibit erratic behavior. The problem may appear on either port 1 or
> > > | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> > > | 10BASE-T.
> > > |
> > > | END USER IMPLICATIONS
> > > | When the failure occurs, the following symptoms are seen on the affected
> > > | port(s):
> > > | - The port is able to link
> > > | - LED0 blinks, even when there is no traffic
> > > | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> > > |   Errors, Rx CRC Errors, Rx Alignment Errors)
> > > | - Only a small fraction of packets is correctly received and forwarded
> > > |   through the switch. Most packets are dropped due to receive errors.
> > > |
> > > | The failing condition cannot be corrected by the following:
> > > | - Removing and reconnecting the cable
> > > | - Hardware reset
> > > | - Software Reset and PCS Reset bits in register 67 (0x43)
> > > |
> > > | Work around:
> > > | The problem can be corrected by setting and then clearing the Port Power
> > > | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> > > | separately for each affected port after returning from Soft Power Down
> > > | Mode to Normal Mode. The following procedure will ensure no further
> > > | issues due to this erratum. To enter Soft Power Down Mode, set register
> > > | 195 (0xC3), bits [1:0] = 10.
> > > |
> > > | To exit Soft Power Down Mode, follow these steps:
> > > | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> > > | 2. Wait 1ms minimum
> > > | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> > > | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> > > | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> > > | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> > > 
> > > This patch implements steps 2...6 of the suggested workaround. The first
> > > step needs to be implemented in the switch driver.
> > 
> > Am I right in understanding that register 195 (0xc3) is not a port register?
> > 
> > To hit the erratum, you have to enter Soft Power Down in the first place,
> > presumably by writing register 0xc3 from somewhere, right?
> > 
> > Where does Linux write this register from?
> > 
> > Once we find that place that enters/exits Soft Power Down mode, can't we
> > just toggle the Port Power Down bits for each port, exactly like the ERR
> > workaround says, instead of fooling around with a PHY driver?
> 
> The KSZ8873 switch is using multiple register mappings.
> https://ww1.microchip.com/downloads/en/DeviceDoc/00002348A.pdf
> Page 38:
> "The MIIM interface is used to access the MII PHY registers defined in
> this section. The SPI, I2C, and SMI interfaces can also be used to access
> some of these registers. The latter three interfaces use a different
> mapping mechanism than the MIIM interface."
> 
> This PHY driver is able to work directly over MIIM (MDIO). Or work with DSA over
> integrated register translation mapping.

This doesn't answer my question of where is the Soft Power Down mode enabled.

> > > 
> > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > > ---
> > >  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
> > >  1 file changed, 21 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> > > index 227d88db7d27..f03188ed953a 100644
> > > --- a/drivers/net/phy/micrel.c
> > > +++ b/drivers/net/phy/micrel.c
> > > @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
> > >  	return 0;
> > >  }
> > >  
> > > +static int ksz886x_resume(struct phy_device *phydev)
> > > +{
> > > +	int ret;
> > > +
> > > +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> > > +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> > > +	 *
> > > +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> > > +	 * up properly, causing the PHY to miss data and exhibit erratic
> > > +	 * behavior.
> > > +	 */
> > > +	usleep_range(1000, 2000);
> > > +
> > > +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > > +}
> > > +
> > >  static int kszphy_get_sset_count(struct phy_device *phydev)
> > >  {
> > >  	return ARRAY_SIZE(kszphy_hw_stats);
> > > @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
> > >  	/* PHY_BASIC_FEATURES */
> > >  	.config_init	= kszphy_config_init,
> > >  	.suspend	= genphy_suspend,
> > > -	.resume		= genphy_resume,
> > > +	.resume		= ksz886x_resume,
> > 
> > Are you able to explain the relation between the call paths of
> > phy_resume() and the lifetime of the Soft Power Down setting of the
> > switch? How do we know that the PHYs are resumed after the switch has
> > exited Soft Power Down mode?
> 
> The MII_BMCRs BMCR_PDOWN bit is mapped to the "register 29 (0x1D), bit
> [3]" for the PHY0 and to "register 45 (0x2D), bit [3]" for the PHY1.
> 
> I assume, I'll need to add this comments to the commit message. Or do
> you have other suggestions on how this should be implemented?

According to "3.2 Power Management" in the datasheet you shared:

There are 5 (five) operation modes under the power management function,
which is controlled by two bits in Register 195 (0xC3) and one bit in
Register 29 (0x1D), 45 (0x2D) as shown below:

Register 195 bit[1:0] = 00 Normal Operation Mode
Register 195 bit[1:0] = 01 Energy Detect Mode
Register 195 bit[1:0] = 10 Soft Power Down Mode
Register 195 bit[1:0] = 11 Power Saving Mode
Register 29, 45 bit 3 = 1 Port Based Power Down Mode

3.2.4 SOFT POWER DOWN MODE

The soft power down mode is entered by setting bit[1:0]=10 in register
195. When KSZ8873MLL/FLL/RLL is in this mode, all PLL clocks are
disabled, the PHY and the MAC are off, all internal registers values
will not change. When the host set bit[1:0]=00 in register 195, this
device will be back from current soft power down mode to normal
operation mode.

3.2.5 PORT-BASED POWER DOWN MODE

In addition, the KSZ8873MLL/FLL/RLL features a per-port power down mode.
To save power, a PHY port that is not in use can be powered down via
port control register 29 or 45 bit 3, or MIIM PHY register. It saves
about 15 mA per port.



From the above I understand that the first 4 power management modes are
global, and the 5th isn't.

You've explained how the PHY driver enters port-based power down mode.
But the ERR describes an issue being triggered by a global power down
mode. What you are describing is not what the ERR text is describing.

Excuse my perhaps stupid question, but have you triggered the issue
described by the erratum? Does this patch fix that? Where is the disconnect?
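
The distinction drawn above, a global mode in register 195 versus per-port
power down in registers 29 and 45, can be made concrete with a sketch of the
erratum's exit procedure at the switch register level. This is an illustration
against a simulated register file, not code from the series: regs[],
reg_write(), reg_rmw() and sleep_1ms() are hypothetical stand-ins for real
SPI/I2C/SMI accessors and a real delay.

```c
#include <stdint.h>

#define REG_POWER_MGMT	0xC3	/* register 195, bits [1:0] select the mode */
#define REG_PORT1_CTRL	0x1D	/* register 29 */
#define REG_PORT2_CTRL	0x2D	/* register 45 */
#define PM_MODE_MASK	0x03
#define PM_SOFT_PDOWN	0x02	/* bits [1:0] = 10 */
#define PORT_POWER_DOWN	(1u << 3)

static uint8_t regs[256];	/* simulated switch register file */
static unsigned int writes;	/* number of register writes issued */

static void reg_write(uint8_t addr, uint8_t val)
{
	regs[addr] = val;
	writes++;
}

/* Read-modify-write: replace only the bits selected by mask. */
static void reg_rmw(uint8_t addr, uint8_t mask, uint8_t val)
{
	reg_write(addr, (uint8_t)((regs[addr] & ~mask) | (val & mask)));
}

static void sleep_1ms(void)
{
	/* Placeholder: a driver would sleep for at least 1 ms here. */
}

static void enter_soft_power_down(void)
{
	reg_rmw(REG_POWER_MGMT, PM_MODE_MASK, PM_SOFT_PDOWN);
}

static void exit_soft_power_down(void)
{
	reg_rmw(REG_POWER_MGMT, PM_MODE_MASK, 0);			/* step 1 */
	sleep_1ms();							/* step 2 */
	reg_rmw(REG_PORT1_CTRL, PORT_POWER_DOWN, PORT_POWER_DOWN);	/* step 3 */
	reg_rmw(REG_PORT1_CTRL, PORT_POWER_DOWN, 0);			/* step 4 */
	reg_rmw(REG_PORT2_CTRL, PORT_POWER_DOWN, PORT_POWER_DOWN);	/* step 5 */
	reg_rmw(REG_PORT2_CTRL, PORT_POWER_DOWN, 0);			/* step 6 */
}
```

In this form, steps 3 to 6 only make sense immediately after step 1, which is
the ordering question the review raises about doing them from a PHY resume
callback.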
Oleksij Rempel June 10, 2021, 1:25 p.m. UTC | #8
On Thu, Jun 10, 2021 at 04:04:45PM +0300, Vladimir Oltean wrote:
> On Thu, Jun 10, 2021 at 01:49:20PM +0200, Oleksij Rempel wrote:
> > On Thu, May 27, 2021 at 01:43:29AM +0300, Vladimir Oltean wrote:
> > > On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> > > > The ksz8873 and ksz8863 switches are affected by following errata:
> > > > 
> > > > | "Receiver error in 100BASE-TX mode following Soft Power Down"
> > > > |
> > > > | Some KSZ8873 devices may exhibit receiver errors after transitioning
> > > > | from Soft Power Down mode to Normal mode, as controlled by register 195
> > > > | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> > > > | blocks may not start up properly, causing the PHY to miss data and
> > > > | exhibit erratic behavior. The problem may appear on either port 1 or
> > > > | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> > > > | 10BASE-T.
> > > > |
> > > > | END USER IMPLICATIONS
> > > > | When the failure occurs, the following symptoms are seen on the affected
> > > > | port(s):
> > > > | - The port is able to link
> > > > | - LED0 blinks, even when there is no traffic
> > > > | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> > > > |   Errors, Rx CRC Errors, Rx Alignment Errors)
> > > > | - Only a small fraction of packets is correctly received and forwarded
> > > > |   through the switch. Most packets are dropped due to receive errors.
> > > > |
> > > > | The failing condition cannot be corrected by the following:
> > > > | - Removing and reconnecting the cable
> > > > | - Hardware reset
> > > > | - Software Reset and PCS Reset bits in register 67 (0x43)
> > > > |
> > > > | Work around:
> > > > | The problem can be corrected by setting and then clearing the Port Power
> > > > | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> > > > | separately for each affected port after returning from Soft Power Down
> > > > | Mode to Normal Mode. The following procedure will ensure no further
> > > > | issues due to this erratum. To enter Soft Power Down Mode, set register
> > > > | 195 (0xC3), bits [1:0] = 10.
> > > > |
> > > > | To exit Soft Power Down Mode, follow these steps:
> > > > | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> > > > | 2. Wait 1ms minimum
> > > > | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> > > > | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> > > > | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> > > > | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> > > > 
> > > > This patch implements steps 2...6 of the suggested workaround. The first
> > > > step needs to be implemented in the switch driver.
> > > 
> > > Am I right in understanding that register 195 (0xc3) is not a port register?
> > > 
> > > To hit the erratum, you have to enter Soft Power Down in the first place,
> > > presumably by writing register 0xc3 from somewhere, right?
> > > 
> > > Where does Linux write this register from?
> > > 
> > > Once we find that place that enters/exits Soft Power Down mode, can't we
> > > just toggle the Port Power Down bits for each port, exactly like the ERR
> > > workaround says, instead of fooling around with a PHY driver?
> > 
> > The KSZ8873 switch is using multiple register mappings.
> > https://ww1.microchip.com/downloads/en/DeviceDoc/00002348A.pdf
> > Page 38:
> > "The MIIM interface is used to access the MII PHY registers defined in
> > this section. The SPI, I2C, and SMI interfaces can also be used to access
> > some of these registers. The latter three interfaces use a different
> > mapping mechanism than the MIIM interface."
> > 
> > This PHY driver is able to work directly over MIIM (MDIO). Or work with DSA over
> > integrated register translation mapping.
> 
> This doesn't answer my question of where is the Soft Power Down mode enabled.
> 
> > > > 
> > > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > > > ---
> > > >  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
> > > >  1 file changed, 21 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> > > > index 227d88db7d27..f03188ed953a 100644
> > > > --- a/drivers/net/phy/micrel.c
> > > > +++ b/drivers/net/phy/micrel.c
> > > > @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +static int ksz886x_resume(struct phy_device *phydev)
> > > > +{
> > > > +	int ret;
> > > > +
> > > > +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> > > > +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> > > > +	 *
> > > > +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> > > > +	 * up properly, causing the PHY to miss data and exhibit erratic
> > > > +	 * behavior.
> > > > +	 */
> > > > +	usleep_range(1000, 2000);
> > > > +
> > > > +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > > > +}
> > > > +
> > > >  static int kszphy_get_sset_count(struct phy_device *phydev)
> > > >  {
> > > >  	return ARRAY_SIZE(kszphy_hw_stats);
> > > > @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
> > > >  	/* PHY_BASIC_FEATURES */
> > > >  	.config_init	= kszphy_config_init,
> > > >  	.suspend	= genphy_suspend,
> > > > -	.resume		= genphy_resume,
> > > > +	.resume		= ksz886x_resume,
> > > 
> > > Are you able to explain the relation between the call paths of
> > > phy_resume() and the lifetime of the Soft Power Down setting of the
> > > switch? How do we know that the PHYs are resumed after the switch has
> > > exited Soft Power Down mode?
> > 
> > The MII_BMCRs BMCR_PDOWN bit is mapped to the "register 29 (0x1D), bit
> > [3]" for the PHY0 and to "register 45 (0x2D), bit [3]" for the PHY1.
> > 
> > I assume, I'll need to add this comments to the commit message. Or do
> > you have other suggestions on how this should be implemented?
> 
> According to "3.2 Power Management" in the datasheet you shared:
> 
> There are 5 (five) operation modes under the power management function,
> which is controlled by two bits in Register 195 (0xC3) and one bit in
> Register 29 (0x1D), 45 (0x2D) as shown below:
> 
> Register 195 bit[1:0] = 00 Normal Operation Mode
> Register 195 bit[1:0] = 01 Energy Detect Mode
> Register 195 bit[1:0] = 10 Soft Power Down Mode
> Register 195 bit[1:0] = 11 Power Saving Mode
> Register 29, 45 bit 3 = 1 Port Based Power Down Mode
> 
> 3.2.4 SOFT POWER DOWN MODE
> 
> The soft power down mode is entered by setting bit[1:0]=10 in register
> 195. When KSZ8873MLL/FLL/RLL is in this mode, all PLL clocks are
> disabled, the PHY and the MAC are off, all internal registers values
> will not change. When the host set bit[1:0]=00 in register 195, this
> device will be back from current soft power down mode to normal
> operation mode.
> 
> 3.2.5 PORT-BASED POWER DOWN MODE
> 
> In addition, the KSZ8873MLL/FLL/RLL features a per-port power down mode.
> To save power, a PHY port that is not in use can be powered down via
> port control register 29 or 45 bit 3, or MIIM PHY register. It saves
> about 15 mA per port.
> 
> 
> 
> From the above I understand that the first 4 power management modes are
> global, and the 5th isn't.
> 
> You've explained how the PHY driver enters port-based power down mode.
> But the ERR describes an issue being triggered by a global power down
> mode. What you are describing is not what the ERR text is describing.
> 
> Excuse my perhaps stupid question, but have you triggered the issue
> described by the erratum? Does this patch fix that? Where is the disconnect?

Yes, this issue was seen at an early point of development (back in 2019)
and was reproducible on system start, where the switch was in some default
state or in a state configured by the bootloader. I haven't tried to
reproduce it now. In other words, there is no need for the DSA driver to
provide global power management to trigger it.

Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Vladimir Oltean June 10, 2021, 6:18 p.m. UTC | #9
On Thu, Jun 10, 2021 at 03:25:05PM +0200, Oleksij Rempel wrote:
> Yes, this issue was seen at an early point of development (back in 2019)
> and was reproducible on system start, where the switch was in some default
> state or in a state configured by the bootloader. I haven't tried to
> reproduce it now. In other words, there is no need for the DSA driver to
> provide global power management to trigger it.

If you're sure about that then add it to the commit message or comments,
since this is not what the ERR description says.
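
To make the ordering under discussion concrete, the six exit steps from
the erratum text quoted earlier in the thread can be sketched against an
in-memory register array. This is an illustration only: struct fake_switch
and every helper name are hypothetical, the delay is recorded rather than
slept, and nothing here is the kernel API or the patch itself (which,
per its description, covers steps 2...6 through MII_BMCR/BMCR_PDOWN).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers and bits as given in the erratum text. */
#define KSZ_REG_PORT1_CTRL  29u        /* 0x1D */
#define KSZ_REG_PORT2_CTRL  45u        /* 0x2D */
#define KSZ_REG_PWR_MGMT    195u       /* 0xC3 */
#define KSZ_PWR_MODE_MASK   0x03u      /* bits [1:0] */
#define KSZ_PORT_PWR_DOWN   (1u << 3)  /* bit 3 */

/* Hypothetical stand-in for the switch: a flat register file plus a log
 * of which register each write touched, in order.
 */
struct fake_switch {
	uint8_t regs[256];
	uint8_t write_log[16];
	size_t n_writes;
	unsigned int slept_us;
};

static inline void fake_write(struct fake_switch *sw, uint8_t reg, uint8_t val)
{
	assert(sw->n_writes < sizeof(sw->write_log));
	sw->regs[reg] = val;
	sw->write_log[sw->n_writes++] = reg;
}

/* Record the requested delay instead of actually sleeping. */
static inline void fake_sleep_us(struct fake_switch *sw, unsigned int us)
{
	sw->slept_us += us;
}

/* Set and then clear the Port Power Down bit of one port register. */
static inline void toggle_port_power_down(struct fake_switch *sw, uint8_t reg)
{
	fake_write(sw, reg, (uint8_t)(sw->regs[reg] | KSZ_PORT_PWR_DOWN));
	fake_write(sw, reg, (uint8_t)(sw->regs[reg] & ~KSZ_PORT_PWR_DOWN));
}

static inline void exit_soft_power_down(struct fake_switch *sw)
{
	/* Step 1: register 195 bits [1:0] = 00 (global, switch side). */
	fake_write(sw, KSZ_REG_PWR_MGMT,
		   (uint8_t)(sw->regs[KSZ_REG_PWR_MGMT] & ~KSZ_PWR_MODE_MASK));
	/* Step 2: wait 1 ms minimum. */
	fake_sleep_us(sw, 1000);
	/* Steps 3-4: port 1; steps 5-6: port 2. */
	toggle_port_power_down(sw, KSZ_REG_PORT1_CTRL);
	toggle_port_power_down(sw, KSZ_REG_PORT2_CTRL);
}
```

In this model step 1 is the only global write; the remaining writes are
per port, which is the part a per-PHY resume hook can reach.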