mbox series

[RFC,net-next,0/5] net: dsa: add support for RTL8365MB-VC

Message ID 20210822193145.1312668-1-alvin@pqrs.dk
Headers show
Series net: dsa: add support for RTL8365MB-VC | expand

Message

Alvin Šipraga Aug. 22, 2021, 7:31 p.m. UTC
From: Alvin Šipraga <alsi@bang-olufsen.dk>

This series adds support for Realtek's RTL8365MB-VC, a 4+1 port
10/100/1000M Ethernet switch. The driver - rtl8365mb - was developed by
Michael Ramussen and myself.

Summary of patches:

  - The first patch in the series is a bugfix in realtek-smi which I
    found when writing the new rtl8365mb subdriver and building
    realtek-smi as a module. If desired, I can spin it off into a
    separate patch and target it for net (not net-next).

  - The second patch updates the dt-bindings for the new compatible
    string.

  - The third patch adds the 8 byte tag protocol driver.

  - The fourth patch adds the rtl8365mb subdriver - the main feature of
    this patch series.

  - The fifth patch adds a PHY driver for the internal PHYs found in the
    RTL8365MB-VC. This is not strictly necessary for the rtl8365mb
    driver to work, but it avoids using the poll-only Generic PHY driver
    directly by hooking into the IRQs made available by the switch
    driver.

This is my first time in the DSA subsystem, so I am submitting this as
an RFC patch series for now. Apologies if I have made some terrible
mistakes along the way. All feedback - no matter how minor - is thus
very welcome.

There is a lot more work that can be done on this driver, particularly
when it comes to offloading certain DSA features to the hardware. I
intend to revisit this later when I have more time. In the mean time,
the driver seems to be in a good enough state for upstream submission.

Finally, there is an outstanding issue in probing the PHY driver when
fw_devlink=on. This seems to be a generic problem with DSA drivers which
create a dependency between an internal interrupt-controller and child
PHY nodes, realtek-smi being one of them. See [1] for an ongoing
discussion about that. Since this seems to be an existing problem and
not unique to this new driver, I hope that it will not impede the
upstreaming of this patch series.

[1] https://lore.kernel.org/netdev/cd0d9c40-d07b-e2ab-b068-d0bcb4685d09@bang-olufsen.dk/

Alvin Šipraga (5):
  net: dsa: realtek-smi: fix mdio_free bug on module unload
  dt-bindings: net: dsa: realtek-smi: document new compatible rtl8365mb
  net: dsa: tag_rtl8_4: add realtek 8 byte protocol 4 tag
  net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC
  net: phy: realtek: add support for RTL8365MB-VC internal PHYs

 .../bindings/net/dsa/realtek-smi.txt          |    1 +
 drivers/net/dsa/Kconfig                       |    1 +
 drivers/net/dsa/Makefile                      |    2 +-
 drivers/net/dsa/realtek-smi-core.c            |   10 +
 drivers/net/dsa/realtek-smi-core.h            |    2 +
 drivers/net/dsa/rtl8365mb.c                   | 2124 +++++++++++++++++
 drivers/net/dsa/rtl8366rb.c                   |    8 +
 drivers/net/phy/realtek.c                     |    8 +
 include/net/dsa.h                             |    2 +
 net/dsa/Kconfig                               |    6 +
 net/dsa/Makefile                              |    1 +
 net/dsa/tag_rtl8_4.c                          |  178 ++
 12 files changed, 2342 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/dsa/rtl8365mb.c
 create mode 100644 net/dsa/tag_rtl8_4.c

Comments

Vladimir Oltean Aug. 22, 2021, 9:54 p.m. UTC | #1
On Sun, Aug 22, 2021 at 09:31:39PM +0200, Alvin Šipraga wrote:
> From: Alvin Šipraga <alsi@bang-olufsen.dk>
> 
> realtek-smi-core fails to unregister the slave MII bus on module unload,
> raising the following BUG warning:
> 
>     mdio_bus.c:650: BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
> 
>     kernel BUG at drivers/net/phy/mdio_bus.c:650!
>     Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP
>     Call trace:
>      mdiobus_free+0x4c/0x50
>      devm_mdiobus_free+0x18/0x20
>      release_nodes.isra.0+0x1c0/0x2b0
>      devres_release_all+0x38/0x58
>      device_release_driver_internal+0x124/0x1e8
>      driver_detach+0x54/0xe0
>      bus_remove_driver+0x60/0xd8
>      driver_unregister+0x34/0x60
>      platform_driver_unregister+0x18/0x20
>      realtek_smi_driver_exit+0x14/0x1c [realtek_smi]
> 
> Fix this by duly unregistering the slave MII bus with
> mdiobus_unregister. We do this in the DSA teardown path, since
> registration is performed in the DSA setup path.
> 
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver")
> Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
> ---
>  drivers/net/dsa/realtek-smi-core.c | 6 ++++++
>  drivers/net/dsa/realtek-smi-core.h | 1 +
>  drivers/net/dsa/rtl8366rb.c        | 8 ++++++++
>  3 files changed, 15 insertions(+)
> 
> diff --git a/drivers/net/dsa/realtek-smi-core.c b/drivers/net/dsa/realtek-smi-core.c
> index 8e49d4f85d48..6992b6b31db6 100644
> --- a/drivers/net/dsa/realtek-smi-core.c
> +++ b/drivers/net/dsa/realtek-smi-core.c
> @@ -383,6 +383,12 @@ int realtek_smi_setup_mdio(struct realtek_smi *smi)
>  	return ret;
>  }
>  
> +void realtek_smi_teardown_mdio(struct realtek_smi *smi)
> +{
> +	if (smi->slave_mii_bus)
> +		mdiobus_unregister(smi->slave_mii_bus);
> +}
> +
>  static int realtek_smi_probe(struct platform_device *pdev)
>  {
>  	const struct realtek_smi_variant *var;
> diff --git a/drivers/net/dsa/realtek-smi-core.h b/drivers/net/dsa/realtek-smi-core.h
> index fcf465f7f922..6cfa5f2df7ea 100644
> --- a/drivers/net/dsa/realtek-smi-core.h
> +++ b/drivers/net/dsa/realtek-smi-core.h
> @@ -119,6 +119,7 @@ struct realtek_smi_variant {
>  int realtek_smi_write_reg_noack(struct realtek_smi *smi, u32 addr,
>  				u32 data);
>  int realtek_smi_setup_mdio(struct realtek_smi *smi);
> +void realtek_smi_teardown_mdio(struct realtek_smi *smi);
>  
>  /* RTL8366 library helpers */
>  int rtl8366_mc_is_used(struct realtek_smi *smi, int mc_index, int *used);
> diff --git a/drivers/net/dsa/rtl8366rb.c b/drivers/net/dsa/rtl8366rb.c
> index a89093bc6c6a..6537fac7aba4 100644
> --- a/drivers/net/dsa/rtl8366rb.c
> +++ b/drivers/net/dsa/rtl8366rb.c
> @@ -982,6 +982,13 @@ static int rtl8366rb_setup(struct dsa_switch *ds)
>  	return 0;
>  }
>  
> +static void rtl8366rb_teardown(struct dsa_switch *ds)
> +{
> +	struct realtek_smi *smi = ds->priv;
> +
> +	realtek_smi_teardown_mdio(smi);
> +}
> +

Objection: dsa_switch_teardown has:

	if (ds->slave_mii_bus && ds->ops->phy_read)
		mdiobus_unregister(ds->slave_mii_bus);

The realtek_smi_setup_mdio function does:

	smi->ds->slave_mii_bus = smi->slave_mii_bus;

so I would expect that this would result in a double unregister on some
systems.

I haven't went through your new driver, but I wonder whether you have
the phy_read and phy_write methods implemented? Maybe that is the
difference?

>  static enum dsa_tag_protocol rtl8366_get_tag_protocol(struct dsa_switch *ds,
>  						      int port,
>  						      enum dsa_tag_protocol mp)
> @@ -1505,6 +1512,7 @@ static int rtl8366rb_detect(struct realtek_smi *smi)
>  static const struct dsa_switch_ops rtl8366rb_switch_ops = {
>  	.get_tag_protocol = rtl8366_get_tag_protocol,
>  	.setup = rtl8366rb_setup,
> +	.teardown = rtl8366rb_teardown,
>  	.phylink_mac_link_up = rtl8366rb_mac_link_up,
>  	.phylink_mac_link_down = rtl8366rb_mac_link_down,
>  	.get_strings = rtl8366_get_strings,
> -- 
> 2.32.0
>
Vladimir Oltean Aug. 22, 2021, 10:13 p.m. UTC | #2
On Sun, Aug 22, 2021 at 09:31:41PM +0200, Alvin Šipraga wrote:
> From: Alvin Šipraga <alsi@bang-olufsen.dk>
> 
> This commit implements a basic version of the 8 byte tag protocol used
> in the Realtek RTL8365MB-VC switch, which carries with it a protocol
> version of 0x04.
> 
> The implementation itself only handles the parsing of the EtherType
> value and Realtek protocol version, together with the source or
> destination port fields. The rest is left unimplemented for now.
> 
> The tag format is described in a confidential document provided to my
> employer - Bang & Olufsen a/s - by Realtek Semiconductor Corp.
> Permission has been granted by Realtek to publish this driver based on
> that material, together with an extract from the document describing the
> tag format and its fields.  It is hoped that this will help future
> implementors who do not have access to the material but who wish to
> extend the functionality of drivers for chips which use this protocol.
> 
> Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
> ---
>  include/net/dsa.h    |   2 +
>  net/dsa/Kconfig      |   6 ++
>  net/dsa/Makefile     |   1 +
>  net/dsa/tag_rtl8_4.c | 178 +++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 187 insertions(+)
>  create mode 100644 net/dsa/tag_rtl8_4.c
> 
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index 0c2cba45fa79..6d8b5a11d99a 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -51,6 +51,7 @@ struct phylink_link_state;
>  #define DSA_TAG_PROTO_SEVILLE_VALUE		21
>  #define DSA_TAG_PROTO_BRCM_LEGACY_VALUE		22
>  #define DSA_TAG_PROTO_SJA1110_VALUE		23
> +#define DSA_TAG_PROTO_RTL8_4_VALUE		24
>  
>  enum dsa_tag_protocol {
>  	DSA_TAG_PROTO_NONE		= DSA_TAG_PROTO_NONE_VALUE,
> @@ -77,6 +78,7 @@ enum dsa_tag_protocol {
>  	DSA_TAG_PROTO_OCELOT_8021Q	= DSA_TAG_PROTO_OCELOT_8021Q_VALUE,
>  	DSA_TAG_PROTO_SEVILLE		= DSA_TAG_PROTO_SEVILLE_VALUE,
>  	DSA_TAG_PROTO_SJA1110		= DSA_TAG_PROTO_SJA1110_VALUE,
> +	DSA_TAG_PROTO_RTL8_4		= DSA_TAG_PROTO_RTL8_4_VALUE,
>  };
>  
>  struct dsa_switch;
> diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig
> index 548285539752..470a2f3e8f75 100644
> --- a/net/dsa/Kconfig
> +++ b/net/dsa/Kconfig
> @@ -99,6 +99,12 @@ config NET_DSA_TAG_RTL4_A
>  	  Realtek switches with 4 byte protocol A tags, sich as found in
>  	  the Realtek RTL8366RB.
>  
> +config NET_DSA_TAG_RTL8_4
> +	tristate "Tag driver for Realtek 8 byte protocol 4 tags"
> +	help
> +	  Say Y or M if you want to enable support for tagging frames for Realtek
> +	  switches with 8 byte protocol 4 tags, such as the Realtek RTL8365MB-VC.
> +
>  config NET_DSA_TAG_OCELOT
>  	tristate "Tag driver for Ocelot family of switches, using NPI port"
>  	depends on MSCC_OCELOT_SWITCH_LIB || \
> diff --git a/net/dsa/Makefile b/net/dsa/Makefile
> index 67ea009f242c..01282817e80e 100644
> --- a/net/dsa/Makefile
> +++ b/net/dsa/Makefile
> @@ -11,6 +11,7 @@ obj-$(CONFIG_NET_DSA_TAG_GSWIP) += tag_gswip.o
>  obj-$(CONFIG_NET_DSA_TAG_HELLCREEK) += tag_hellcreek.o
>  obj-$(CONFIG_NET_DSA_TAG_KSZ) += tag_ksz.o
>  obj-$(CONFIG_NET_DSA_TAG_RTL4_A) += tag_rtl4_a.o
> +obj-$(CONFIG_NET_DSA_TAG_RTL8_4) += tag_rtl8_4.o
>  obj-$(CONFIG_NET_DSA_TAG_LAN9303) += tag_lan9303.o
>  obj-$(CONFIG_NET_DSA_TAG_MTK) += tag_mtk.o
>  obj-$(CONFIG_NET_DSA_TAG_OCELOT) += tag_ocelot.o
> diff --git a/net/dsa/tag_rtl8_4.c b/net/dsa/tag_rtl8_4.c
> new file mode 100644
> index 000000000000..1082bafb6a2e
> --- /dev/null
> +++ b/net/dsa/tag_rtl8_4.c
> @@ -0,0 +1,178 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Handler for Realtek 8 byte switch tags
> + *
> + * Copyright (C) 2021 Alvin Šipraga <alsi@bang-olufsen.dk>
> + *
> + * NOTE: Currently only supports protocol "4" found in the RTL8365MB, hence
> + * named tag_rtl8_4.
> + *
> + * This "proprietary tag" header has the following format:
> + *
> + *  -------------------------------------------
> + *  | MAC DA | MAC SA | 8 byte tag | Type | ...
> + *  -------------------------------------------
> + *     _______________/            \______________________________________
> + *    /                                                                   \
> + *  0                                  7|8                                 15
> + *  |-----------------------------------+-----------------------------------|---
> + *  |                               (16-bit)                                | ^
> + *  |                       Realtek EtherType [0x8899]                      | |
> + *  |-----------------------------------+-----------------------------------| 8
> + *  |              (8-bit)              |              (8-bit)              |
> + *  |          Protocol [0x04]          |              REASON               | b
> + *  |-----------------------------------+-----------------------------------| y
> + *  |   (1)  | (1) | (2) |   (1)  | (3) | (1)  | (1) |    (1)    |   (5)    | t
> + *  | FID_EN |  X  | FID | PRI_EN | PRI | KEEP |  X  | LEARN_DIS |    X     | e
> + *  |-----------------------------------+-----------------------------------| s
> + *  |   (1)  |                       (15-bit)                               | |
> + *  |  ALLOW |                        TX/RX                                 | v
> + *  |-----------------------------------+-----------------------------------|---
> + *
> + * With the following field descriptions:
> + *
> + *    field      | description
> + *   ------------+-------------
> + *    Realtek    | 0x8899: indicates that this is a proprietary Realtek tag;
> + *     EtherType |         note that Realtek uses the same EtherType for
> + *               |         other incompatible tag formats (e.g. tag_rtl4_a.c)
> + *    Protocol   | 0x04: indicates that this tag conforms to this format
> + *    X          | reserved
> + *   ------------+-------------
> + *    REASON     | reason for forwarding packet to CPU
> + *    FID_EN     | 1: packet has an FID
> + *               | 0: no FID
> + *    FID        | FID of packet (if FID_EN=1)
> + *    PRI_EN     | 1: force priority of packet
> + *               | 0: don't force priority
> + *    PRI        | priority of packet (if PRI_EN=1)
> + *    KEEP       | preserve packet VLAN tag format

What does it mean to preserve packet VLAN tag format? Trying to
understand if the sane thing is to clear or set this bit. Does it mean
to strip the VLAN tag on egress if the VLAN is configured as
egress-untagged on the port?

> + *    LEARN_DIS  | don't learn the source MAC address of the packet
> + *    ALLOW      | 1: treat TX/RX field as an allowance port mask, meaning the
> + *               |    packet may only be forwarded to ports specified in the
> + *               |    mask
> + *               | 0: no allowance port mask, TX/RX field is the forwarding
> + *               |    port mask
> + *    TX/RX      | TX (switch->CPU): port number the packet was received on
> + *               | RX (CPU->switch): forwarding port mask (if ALLOW=0)
> + *               |                   allowance port mask (if ALLOW=1)
> + */
> +
> +#include <linux/etherdevice.h>
> +#include <linux/bits.h>
> +
> +#include "dsa_priv.h"
> +
> +#define RTL8_4_TAG_LEN			8
> +#define RTL8_4_ETHERTYPE		0x8899
> +/* 0x04 = RTL8365MB DSA protocol
> + */
> +#define RTL8_4_PROTOCOL_RTL8365MB	0x04
> +
> +static struct sk_buff *rtl8_4_tag_xmit(struct sk_buff *skb,
> +				       struct net_device *dev)
> +{
> +	struct dsa_port *dp = dsa_slave_to_port(dev);
> +	__be16 *p;
> +	u8 *tag;
> +	u16 out;
> +
> +	/* Pad out so that the (stripped) packet is at least 64 bytes long
> +	 * (including FCS), otherwise the switch will drop the packet.
> +	 * Then we need an additional 8 bytes for the Realtek tag.
> +	 */
> +	if (__skb_put_padto(skb, ETH_ZLEN + RTL8_4_TAG_LEN, false))
> +		return NULL;
> +
> +	skb_push(skb, RTL8_4_TAG_LEN);
> +
> +	dsa_alloc_etype_header(skb, RTL8_4_TAG_LEN);
> +	tag = dsa_etype_header_pos_tx(skb);
> +
> +	/* Set Realtek EtherType */
> +	p = (__be16 *)tag;

You would have much fewer (zero) type casts if "tag" was a "__be16 *" in
the first place. Additionally, you would not need "p" and you could work
with tag[0], tag[1], tag[2], tag[3].

> +	*p = htons(RTL8_4_ETHERTYPE);
> +
> +	/* Set Protocol; zero REASON */
> +	p = (__be16 *)(tag + 2);
> +	*p = htons(RTL8_4_PROTOCOL_RTL8365MB << 8);
> +
> +	/* Zero FID_EN, FID, PRI_EN, PRI, KEEP, LEARN_DIS */

Please set LEARN_DIS. DSA has better ways to control what needs to be
learned and what doesn't. Packets injected into a standalone port
surely shouldn't have their MAC SA learned. Grep for "tx_fwd_offload" in
the kernel, see what it takes to transmit a packet with
skb->offload_fwd_mark = true, and you can clear LEARN_DIS and set ALLOW
then.

> +	p = (__be16 *)(tag + 4);
> +	*p = 0;
> +
> +	/* Zero ALLOW; set RX (CPU->switch) forwarding port mask */
> +	p = (__be16 *)(tag + 6);
> +	out = BIT(dp->index);

Set but unused variable.

> +	*p = htons(~(1 << 15) & BIT(dp->index));

I am deeply confused by this line.

~(1 << 15) is GENMASK(14, 0)
By AND-ing it with BIT(dp->index), what do you gain?

> +
> +	return skb;
> +}
> +
> +static struct sk_buff *rtl8_4_tag_rcv(struct sk_buff *skb,
> +				      struct net_device *dev)
> +{
> +	__be16 *p;
> +	u16 etype;
> +	u8 proto;
> +	u8 *tag;
> +	u8 port;
> +	u16 tmp;
> +
> +	if (unlikely(!pskb_may_pull(skb, RTL8_4_TAG_LEN)))
> +		return NULL;
> +
> +	tag = dsa_etype_header_pos_rx(skb);
> +
> +	/* Parse Realtek EtherType */
> +	p = (__be16 *)tag;

Same comment about it being more productive for "tag" to be "__be16 *".

> +	etype = ntohs(*p);
> +	if (unlikely(etype != RTL8_4_ETHERTYPE)) {
> +		netdev_dbg(dev, "non-realtek ethertype 0x%04x\n", etype);
> +		return NULL;
> +	}
> +
> +	/* Parse Protocol; ignore REASON */
> +	p = (__be16 *)(tag + 2);
> +	tmp = ntohs(*p);
> +	proto = tmp >> 8;
> +	if (unlikely(proto != RTL8_4_PROTOCOL_RTL8365MB)) {
> +		netdev_dbg(dev, "unknown realtek protocol 0x%02x\n", proto);
> +		return NULL;
> +	}
> +
> +	/* Ignore FID_EN, FID, PRI_EN, PRI, KEEP, LEARN_DIS */
> +	p = (__be16 *)(tag + 4);

Delete then?

> +
> +	/* Ignore ALLOW; parse TX (switch->CPU) */
> +	p = (__be16 *)(tag + 6);
> +	tmp = ntohs(*p);
> +	port = tmp & 0xf; /* Port number is the LSB 4 bits */
> +
> +	skb->dev = dsa_master_find_slave(dev, 0, port);
> +	if (!skb->dev) {
> +		netdev_dbg(dev, "could not find slave for port %d\n", port);
> +		return NULL;
> +	}
> +
> +	/* Remove tag and recalculate checksum */
> +	skb_pull_rcsum(skb, RTL8_4_TAG_LEN);
> +
> +	dsa_strip_etype_header(skb, RTL8_4_TAG_LEN);
> +
> +	skb->offload_fwd_mark = 1;

At the very least, please use

	dsa_default_offload_fwd_mark(skb);

which does the right thing when the port is not offloading the bridge.

Also tell us more about REASON and ALLOW. Is there a bit in the RX tag
which denotes that the packet was forwarded only to the host?

> +
> +	return skb;
> +}
> +
> +static const struct dsa_device_ops rtl8_4_netdev_ops = {
> +	.name = "rtl8_4",
> +	.proto = DSA_TAG_PROTO_RTL8_4,
> +	.xmit = rtl8_4_tag_xmit,
> +	.rcv = rtl8_4_tag_rcv,
> +	.needed_headroom = RTL8_4_TAG_LEN,
> +};
> +module_dsa_tag_driver(rtl8_4_netdev_ops);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_ALIAS_DSA_TAG_DRIVER(DSA_TAG_PROTO_RTL8_4);
> -- 
> 2.32.0
>
Alvin Šipraga Aug. 22, 2021, 10:42 p.m. UTC | #3
Hi Vladimir,

On 8/22/21 11:54 PM, Vladimir Oltean wrote:
> On Sun, Aug 22, 2021 at 09:31:39PM +0200, Alvin Šipraga wrote:

>> From: Alvin Šipraga <alsi@bang-olufsen.dk>

>>

>> realtek-smi-core fails to unregister the slave MII bus on module unload,

>> raising the following BUG warning:

>>

>>      mdio_bus.c:650: BUG_ON(bus->state != MDIOBUS_UNREGISTERED);

>>

>>      kernel BUG at drivers/net/phy/mdio_bus.c:650!

>>      Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP

>>      Call trace:

>>       mdiobus_free+0x4c/0x50

>>       devm_mdiobus_free+0x18/0x20

>>       release_nodes.isra.0+0x1c0/0x2b0

>>       devres_release_all+0x38/0x58

>>       device_release_driver_internal+0x124/0x1e8

>>       driver_detach+0x54/0xe0

>>       bus_remove_driver+0x60/0xd8

>>       driver_unregister+0x34/0x60

>>       platform_driver_unregister+0x18/0x20

>>       realtek_smi_driver_exit+0x14/0x1c [realtek_smi]

>>

>> Fix this by duly unregistering the slave MII bus with

>> mdiobus_unregister. We do this in the DSA teardown path, since

>> registration is performed in the DSA setup path.

>>

>> Cc: Linus Walleij <linus.walleij@linaro.org>

>> Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver")

>> Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>

>> ---

>>   drivers/net/dsa/realtek-smi-core.c | 6 ++++++

>>   drivers/net/dsa/realtek-smi-core.h | 1 +

>>   drivers/net/dsa/rtl8366rb.c        | 8 ++++++++

>>   3 files changed, 15 insertions(+)

>>

>> diff --git a/drivers/net/dsa/realtek-smi-core.c b/drivers/net/dsa/realtek-smi-core.c

>> index 8e49d4f85d48..6992b6b31db6 100644

>> --- a/drivers/net/dsa/realtek-smi-core.c

>> +++ b/drivers/net/dsa/realtek-smi-core.c

>> @@ -383,6 +383,12 @@ int realtek_smi_setup_mdio(struct realtek_smi *smi)

>>   	return ret;

>>   }

>>   

>> +void realtek_smi_teardown_mdio(struct realtek_smi *smi)

>> +{

>> +	if (smi->slave_mii_bus)

>> +		mdiobus_unregister(smi->slave_mii_bus);

>> +}

>> +

>>   static int realtek_smi_probe(struct platform_device *pdev)

>>   {

>>   	const struct realtek_smi_variant *var;

>> diff --git a/drivers/net/dsa/realtek-smi-core.h b/drivers/net/dsa/realtek-smi-core.h

>> index fcf465f7f922..6cfa5f2df7ea 100644

>> --- a/drivers/net/dsa/realtek-smi-core.h

>> +++ b/drivers/net/dsa/realtek-smi-core.h

>> @@ -119,6 +119,7 @@ struct realtek_smi_variant {

>>   int realtek_smi_write_reg_noack(struct realtek_smi *smi, u32 addr,

>>   				u32 data);

>>   int realtek_smi_setup_mdio(struct realtek_smi *smi);

>> +void realtek_smi_teardown_mdio(struct realtek_smi *smi);

>>   

>>   /* RTL8366 library helpers */

>>   int rtl8366_mc_is_used(struct realtek_smi *smi, int mc_index, int *used);

>> diff --git a/drivers/net/dsa/rtl8366rb.c b/drivers/net/dsa/rtl8366rb.c

>> index a89093bc6c6a..6537fac7aba4 100644

>> --- a/drivers/net/dsa/rtl8366rb.c

>> +++ b/drivers/net/dsa/rtl8366rb.c

>> @@ -982,6 +982,13 @@ static int rtl8366rb_setup(struct dsa_switch *ds)

>>   	return 0;

>>   }

>>   

>> +static void rtl8366rb_teardown(struct dsa_switch *ds)

>> +{

>> +	struct realtek_smi *smi = ds->priv;

>> +

>> +	realtek_smi_teardown_mdio(smi);

>> +}

>> +

> 

> Objection: dsa_switch_teardown has:

> 

> 	if (ds->slave_mii_bus && ds->ops->phy_read)

> 		mdiobus_unregister(ds->slave_mii_bus);


This is unregistering an mdiobus registered in dsa_switch_setup:

	if (!ds->slave_mii_bus && ds->ops->phy_read) {
		ds->slave_mii_bus = devm_mdiobus_alloc(ds->dev);
		if (!ds->slave_mii_bus) {
			err = -ENOMEM;
			goto teardown;
		}

		dsa_slave_mii_bus_init(ds);

		err = mdiobus_register(ds->slave_mii_bus);
		if (err < 0)
			goto teardown;
	}

However, we don't enter this codepath because:

- ds->slave_mii_bus is already set in the call to ds->ops->setup() 
before the code snippet above;
- ds->ops->phy_read is not set.

We don't want to either, since we want to use of_mdiobus_register().

> 

> The realtek_smi_setup_mdio function does:

> 

> 	smi->ds->slave_mii_bus = smi->slave_mii_bus;

> 

> so I would expect that this would result in a double unregister on some

> systems.

> 

> I haven't went through your new driver, but I wonder whether you have

> the phy_read and phy_write methods implemented? Maybe that is the

> difference?


Right, phy_read/phy_write are not set in the dsa_switch_ops of 
rtl8365mb. So we should be safe.

It did get me thinking that it would be nice if dsa_register_switch() 
could call of_mdiobus_register() when necessary, since the snippet above 
(and its call to dsa_slave_mii_bus_init()) is almost same as 
realtek_smi_setup_mdio(). It would simplify some logic in realtek-smi 
drivers and obviate the need for this patch. I am not sure what the 
right approach to this would be but with some pointers I can give it a shot.

> 

>>   static enum dsa_tag_protocol rtl8366_get_tag_protocol(struct dsa_switch *ds,

>>   						      int port,

>>   						      enum dsa_tag_protocol mp)

>> @@ -1505,6 +1512,7 @@ static int rtl8366rb_detect(struct realtek_smi *smi)

>>   static const struct dsa_switch_ops rtl8366rb_switch_ops = {

>>   	.get_tag_protocol = rtl8366_get_tag_protocol,

>>   	.setup = rtl8366rb_setup,

>> +	.teardown = rtl8366rb_teardown,

>>   	.phylink_mac_link_up = rtl8366rb_mac_link_up,

>>   	.phylink_mac_link_down = rtl8366rb_mac_link_down,

>>   	.get_strings = rtl8366_get_strings,

>> -- 

>> 2.32.0

>>
Alvin Šipraga Aug. 22, 2021, 10:50 p.m. UTC | #4
Hi Andrew,

On 8/23/21 12:02 AM, Andrew Lunn wrote:
> On Sun, Aug 22, 2021 at 09:31:41PM +0200, Alvin Šipraga wrote:

>> diff --git a/net/dsa/Kconfig b/net/dsa/Kconfig

>> index 548285539752..470a2f3e8f75 100644

>> --- a/net/dsa/Kconfig

>> +++ b/net/dsa/Kconfig

>> @@ -99,6 +99,12 @@ config NET_DSA_TAG_RTL4_A

>>   	  Realtek switches with 4 byte protocol A tags, sich as found in

>>   	  the Realtek RTL8366RB.

>>   

>> +config NET_DSA_TAG_RTL8_4

>> +	tristate "Tag driver for Realtek 8 byte protocol 4 tags"

>> +	help

>> +	  Say Y or M if you want to enable support for tagging frames for Realtek

>> +	  switches with 8 byte protocol 4 tags, such as the Realtek RTL8365MB-VC.

>> +

> 

> Hi Alvin

> 

> This file is sorted based on the tristate text. As such, the

> NET_DSA_TAG_RTL4_A is in the wrong place. So i can understand why you

> put it here, but place move it after the Qualcom driver.


Thanks - I'll fix it in v2.

> 

>> @@ -11,6 +11,7 @@ obj-$(CONFIG_NET_DSA_TAG_GSWIP) += tag_gswip.o

>>   obj-$(CONFIG_NET_DSA_TAG_HELLCREEK) += tag_hellcreek.o

>>   obj-$(CONFIG_NET_DSA_TAG_KSZ) += tag_ksz.o

>>   obj-$(CONFIG_NET_DSA_TAG_RTL4_A) += tag_rtl4_a.o

>> +obj-$(CONFIG_NET_DSA_TAG_RTL8_4) += tag_rtl8_4.o

>>   obj-$(CONFIG_NET_DSA_TAG_LAN9303) += tag_lan9303.o

>>   obj-$(CONFIG_NET_DSA_TAG_MTK) += tag_mtk.o

>>   obj-$(CONFIG_NET_DSA_TAG_OCELOT) += tag_ocelot.o

> 

> The CONFIG_NET_DSA_TAG_RTL4_A is also in the wrong place...


Ditto.

> 

>> diff --git a/net/dsa/tag_rtl8_4.c b/net/dsa/tag_rtl8_4.c

>> new file mode 100644

>> index 000000000000..1082bafb6a2e

>> --- /dev/null

>> +++ b/net/dsa/tag_rtl8_4.c

>> @@ -0,0 +1,178 @@

>> +// SPDX-License-Identifier: GPL-2.0

>> +/*

>> + * Handler for Realtek 8 byte switch tags

>> + *

>> + * Copyright (C) 2021 Alvin Šipraga <alsi@bang-olufsen.dk>

>> + *

>> + * NOTE: Currently only supports protocol "4" found in the RTL8365MB, hence

>> + * named tag_rtl8_4.

>> + *

>> + * This "proprietary tag" header has the following format:

> 

> I think they are all proprietary. At least, there is no

> standardization across vendors.


I'll remove in v2.

> 

>> + *

>> + *  -------------------------------------------

>> + *  | MAC DA | MAC SA | 8 byte tag | Type | ...

>> + *  -------------------------------------------

>> + *     _______________/            \______________________________________

>> + *    /                                                                   \

>> + *  0                                  7|8                                 15

>> + *  |-----------------------------------+-----------------------------------|---

>> + *  |                               (16-bit)                                | ^

>> + *  |                       Realtek EtherType [0x8899]                      | |

>> + *  |-----------------------------------+-----------------------------------| 8

>> + *  |              (8-bit)              |              (8-bit)              |

>> + *  |          Protocol [0x04]          |              REASON               | b

>> + *  |-----------------------------------+-----------------------------------| y

>> + *  |   (1)  | (1) | (2) |   (1)  | (3) | (1)  | (1) |    (1)    |   (5)    | t

>> + *  | FID_EN |  X  | FID | PRI_EN | PRI | KEEP |  X  | LEARN_DIS |    X     | e

>> + *  |-----------------------------------+-----------------------------------| s

>> + *  |   (1)  |                       (15-bit)                               | |

>> + *  |  ALLOW |                        TX/RX                                 | v

>> + *  |-----------------------------------+-----------------------------------|---

>> + *

>> + * With the following field descriptions:

>> + *

>> + *    field      | description

>> + *   ------------+-------------

>> + *    Realtek    | 0x8899: indicates that this is a proprietary Realtek tag;

>> + *     EtherType |         note that Realtek uses the same EtherType for

>> + *               |         other incompatible tag formats (e.g. tag_rtl4_a.c)

>> + *    Protocol   | 0x04: indicates that this tag conforms to this format

>> + *    X          | reserved

>> + *   ------------+-------------

>> + *    REASON     | reason for forwarding packet to CPU

> 

> Are you allowed to document reason? This could be interesting for some

> of the future work.


Unfortunately the reason field is undocumented. The vendor driver 
doesn't contain any parsing code for the CPU tag so we are left to 
guess. One obvious example would be trapped packets, since the switch 
lets you configure what to do with those (drop, forward to CPU, etc.).

I have not had a reason to look into this yet, otherwise I would have 
documented whatever I knew about it. Hope it's OK for now.

> 

>> + *    FID_EN     | 1: packet has an FID

>> + *               | 0: no FID

>> + *    FID        | FID of packet (if FID_EN=1)

>> + *    PRI_EN     | 1: force priority of packet

>> + *               | 0: don't force priority

>> + *    PRI        | priority of packet (if PRI_EN=1)

>> + *    KEEP       | preserve packet VLAN tag format

>> + *    LEARN_DIS  | don't learn the source MAC address of the packet

>> + *    ALLOW      | 1: treat TX/RX field as an allowance port mask, meaning the

>> + *               |    packet may only be forwarded to ports specified in the

>> + *               |    mask

>> + *               | 0: no allowance port mask, TX/RX field is the forwarding

>> + *               |    port mask

>> + *    TX/RX      | TX (switch->CPU): port number the packet was received on

>> + *               | RX (CPU->switch): forwarding port mask (if ALLOW=0)

>> + *               |                   allowance port mask (if ALLOW=1)

> 

> There are some interesting fields here. It will be interesting to see

> what we can do with the switch.


This is exactly why I asked for Realtek's permission to publish the 
details. :-)

> 

>> + */

>> +

>> +#include <linux/etherdevice.h>

>> +#include <linux/bits.h>

>> +

>> +#include "dsa_priv.h"

>> +

>> +#define RTL8_4_TAG_LEN			8

>> +#define RTL8_4_ETHERTYPE		0x8899

> 

> Please add this to include/uapi/linux/if_ether.h


I believe Realtek uses this EtherType for a bunch of unrelated 
protocols, so I'm not sure this is a good idea. See [1] for a similar 
discussion on the mailing list a while back. What do you think?

[1] 
https://lore.kernel.org/netdev/CACRpkdYQthFgjwVzHyK3DeYUOdcYyWmdjDPG=Rf9B3VrJ12Rzg@mail.gmail.com/

> 

>> +static struct sk_buff *rtl8_4_tag_rcv(struct sk_buff *skb,

>> +				      struct net_device *dev)

>> +{

>> +	__be16 *p;

>> +	u16 etype;

>> +	u8 proto;

>> +	u8 *tag;

>> +	u8 port;

>> +	u16 tmp;

>> +

>> +	if (unlikely(!pskb_may_pull(skb, RTL8_4_TAG_LEN)))

>> +		return NULL;

>> +

>> +	tag = dsa_etype_header_pos_rx(skb);

>> +

>> +	/* Parse Realtek EtherType */

>> +	p = (__be16 *)tag;

>> +	etype = ntohs(*p);

>> +	if (unlikely(etype != RTL8_4_ETHERTYPE)) {

>> +		netdev_dbg(dev, "non-realtek ethertype 0x%04x\n", etype);

> 

> You probably want to rate limit these sorts of prints. You have

> limited controller of what frames come in, and if somebody can craft

> bad frames, they can make you send all your time printing messages to

> the log.


OK, I think I saw some rate limited version of netdev_dbg so I'll bring 
that in for v2. Thanks for the tip.

> 

>      Andrew

>
Vladimir Oltean Aug. 22, 2021, 11:10 p.m. UTC | #5
On Sun, Aug 22, 2021 at 10:42:23PM +0000, Alvin Šipraga wrote:
> > Objection: dsa_switch_teardown has:
> >
> > 	if (ds->slave_mii_bus && ds->ops->phy_read)
> > 		mdiobus_unregister(ds->slave_mii_bus);
>
> This is unregistering an mdiobus registered in dsa_switch_setup:
>
> 	if (!ds->slave_mii_bus && ds->ops->phy_read) {
> 		ds->slave_mii_bus = devm_mdiobus_alloc(ds->dev);
> 		if (!ds->slave_mii_bus) {
> 			err = -ENOMEM;
> 			goto teardown;
> 		}
>
> 		dsa_slave_mii_bus_init(ds);
>
> 		err = mdiobus_register(ds->slave_mii_bus);
> 		if (err < 0)
> 			goto teardown;
> 	}
>
> However, we don't enter this codepath because:
>
> - ds->slave_mii_bus is already set in the call to ds->ops->setup()
> before the code snippet above;
> - ds->ops->phy_read is not set.
>
> We don't want to either, since we want to use of_mdiobus_register().
>
> >
> > The realtek_smi_setup_mdio function does:
> >
> > 	smi->ds->slave_mii_bus = smi->slave_mii_bus;
> >
> > so I would expect that this would result in a double unregister on some
> > systems.
> >
> > I haven't went through your new driver, but I wonder whether you have
> > the phy_read and phy_write methods implemented? Maybe that is the
> > difference?
>
> Right, phy_read/phy_write are not set in the dsa_switch_ops of
> rtl8365mb. So we should be safe.

Correct, DSA only unregisters the ds->slave_mii_bus it has registered,
which is provided when the driver implements phy_read and/or phy_write
but does not set/register ds->slave_mii_bus itself. The patch is fine.

>
> It did get me thinking that it would be nice if dsa_register_switch()
> could call of_mdiobus_register() when necessary, since the snippet above
> (and its call to dsa_slave_mii_bus_init()) is almost same as
> realtek_smi_setup_mdio(). It would simplify some logic in realtek-smi
> drivers and obviate the need for this patch. I am not sure what the
> right approach to this would be but with some pointers I can give it a shot.

I don't think DSA could call of_mdiobus_register, since you would need
to pass it the OF node you want and the read/write ops for the bus and
its name and a place to store it (one DSA switch might have more than
one MDIO bus), and I just fail to see the point of doing that.

The whole point of having ds->slave_mii_bus (either allocated by the
driver or by DSA) is to access the PHY registers of a port under a very
narrow set of assumptions: it implicitly assumes that the port N has a
PHY at MDIO address N, as opposed to doing the usual which is to follow
the phy-handle, and that there is a single internal MDIO bus. DSA will
do this as last resort in dsa_slave_phy_setup. But if you use
of_mdiobus_register, just put a phy-handle in the device tree and be
done, you don't need ds->ops->phy_read or ds->ops->phy_write, nor can
you/should you overload these pointers for DSA to do the
of_mdiobus_register for you.

> >
> >>   static enum dsa_tag_protocol rtl8366_get_tag_protocol(struct dsa_switch *ds,
> >>   						      int port,
> >>   						      enum dsa_tag_protocol mp)
> >> @@ -1505,6 +1512,7 @@ static int rtl8366rb_detect(struct realtek_smi *smi)
> >>   static const struct dsa_switch_ops rtl8366rb_switch_ops = {
> >>   	.get_tag_protocol = rtl8366_get_tag_protocol,
> >>   	.setup = rtl8366rb_setup,
> >> +	.teardown = rtl8366rb_teardown,
> >>   	.phylink_mac_link_up = rtl8366rb_mac_link_up,
> >>   	.phylink_mac_link_down = rtl8366rb_mac_link_down,
> >>   	.get_strings = rtl8366_get_strings,
> >> --
> >> 2.32.0
> >>
Andrew Lunn Aug. 22, 2021, 11:14 p.m. UTC | #6
> >> + */
> >> +
> >> +#include <linux/etherdevice.h>
> >> +#include <linux/bits.h>
> >> +
> >> +#include "dsa_priv.h"
> >> +
> >> +#define RTL8_4_TAG_LEN			8
> >> +#define RTL8_4_ETHERTYPE		0x8899
> > 
> > Please add this to include/uapi/linux/if_ether.h

Maybe call it ETH_P_REALTEK, and comment /* Multiple Proprietary
protocols */ ?

If you do it in an individual patch, you can explain more in the
commit message about it being used for different protocols by Realtek,
and that no assumptions should be made when trying to decode it.

	  Andrew
Alvin Šipraga Aug. 22, 2021, 11:27 p.m. UTC | #7
Hi Andrew,

On 8/23/21 1:14 AM, Andrew Lunn wrote:
>>>> + */

>>>> +

>>>> +#include <linux/etherdevice.h>

>>>> +#include <linux/bits.h>

>>>> +

>>>> +#include "dsa_priv.h"

>>>> +

>>>> +#define RTL8_4_TAG_LEN			8

>>>> +#define RTL8_4_ETHERTYPE		0x8899

>>>

>>> Please add this to include/uapi/linux/if_ether.h

> 

> Maybe call it ETH_P_REALTEK, and comment /* Multiple Proprietary

> protocols */ ?

> 

> If you do it in an individual patch, you can explain more in the

> commit message about it being used for different protocols by Realtek,

> and that no assumptions should be made when trying to decode it.


Sounds good, I'll do that in v2.

> 

> 	  Andrew

>
Vladimir Oltean Aug. 22, 2021, 11:45 p.m. UTC | #8
On Sun, Aug 22, 2021 at 11:37:28PM +0000, Alvin Šipraga wrote:
> >>>> +	skb->offload_fwd_mark = 1;
> >>>
> >>> At the very least, please use
> >>>
> >>> 	dsa_default_offload_fwd_mark(skb);
> >>>
> >>> which does the right thing when the port is not offloading the bridge.
> >>
> >> Sure. Can you elaborate on what you mean by "at the very least"? Can it
> >> be improved even further?
> >
> > The elaboration is right below. skb->offload_fwd_mark should be set to
> > zero for packets that have been forwarded only to the host (like packets
> > that have hit a trapping rule). I guess the switch will denote this
> > piece of info through the REASON code.
>
> Yes, I think it will be communicated in REASON too. I haven't gotten to
> deciphering the contents of this field since it has not been needed so
> far: the ports are fully isolated and all bridging is done in software.

In that case, setting skb->offload_fwd_mark to true is absolutely wrong,
since the bridge is told that no software forwarding should be done
between ports, as it was already done in hardware (see nbp_switchdev_allowed_egress).

I wonder how this has ever worked? Are you completely sure that bridging
is done in software?
Alvin Šipraga Aug. 23, 2021, 12:28 a.m. UTC | #9
On 8/23/21 1:45 AM, Vladimir Oltean wrote:
> On Sun, Aug 22, 2021 at 11:37:28PM +0000, Alvin Šipraga wrote:

>>>>>> +	skb->offload_fwd_mark = 1;

>>>>>

>>>>> At the very least, please use

>>>>>

>>>>> 	dsa_default_offload_fwd_mark(skb);

>>>>>

>>>>> which does the right thing when the port is not offloading the bridge.

>>>>

>>>> Sure. Can you elaborate on what you mean by "at the very least"? Can it

>>>> be improved even further?

>>>

>>> The elaboration is right below. skb->offload_fwd_mark should be set to

>>> zero for packets that have been forwarded only to the host (like packets

>>> that have hit a trapping rule). I guess the switch will denote this

>>> piece of info through the REASON code.

>>

>> Yes, I think it will be communicated in REASON too. I haven't gotten to

>> deciphering the contents of this field since it has not been needed so

>> far: the ports are fully isolated and all bridging is done in software.

> 

> In that case, setting skb->offload_fwd_mark to true is absolutely wrong,

> since the bridge is told that no software forwarding should be done

> between ports, as it was already done in hardware (see nbp_switchdev_allowed_egress).

> 

> I wonder how this has ever worked? Are you completely sure that bridging

> is done in software?


You are absolutely right, and indeed I checked just now and the bridging 
is not working at all.

Deleting the line (i.e. skb->offload_fwd_mark = 0) restores the expected 
bridging behaviour.

Based on what you have said, do I understand correctly that 
offload_fwd_mark shouldn't be set until bridge hardware offloading has 
been implemented?

Thanks for your detailed review so far.
Vladimir Oltean Aug. 23, 2021, 12:31 a.m. UTC | #10
On Mon, Aug 23, 2021 at 12:28:51AM +0000, Alvin Šipraga wrote:
> On 8/23/21 1:45 AM, Vladimir Oltean wrote:
> > On Sun, Aug 22, 2021 at 11:37:28PM +0000, Alvin Šipraga wrote:
> >>>>>> +	skb->offload_fwd_mark = 1;
> >>>>>
> >>>>> At the very least, please use
> >>>>>
> >>>>> 	dsa_default_offload_fwd_mark(skb);
> >>>>>
> >>>>> which does the right thing when the port is not offloading the bridge.
> >>>>
> >>>> Sure. Can you elaborate on what you mean by "at the very least"? Can it
> >>>> be improved even further?
> >>>
> >>> The elaboration is right below. skb->offload_fwd_mark should be set to
> >>> zero for packets that have been forwarded only to the host (like packets
> >>> that have hit a trapping rule). I guess the switch will denote this
> >>> piece of info through the REASON code.
> >>
> >> Yes, I think it will be communicated in REASON too. I haven't gotten to
> >> deciphering the contents of this field since it has not been needed so
> >> far: the ports are fully isolated and all bridging is done in software.
> >
> > In that case, setting skb->offload_fwd_mark to true is absolutely wrong,
> > since the bridge is told that no software forwarding should be done
> > between ports, as it was already done in hardware (see nbp_switchdev_allowed_egress).
> >
> > I wonder how this has ever worked? Are you completely sure that bridging
> > is done in software?
>
> You are absolutely right, and indeed I checked just now and the bridging
> is not working at all.
>
> Deleting the line (i.e. skb->offload_fwd_mark = 0) restores the expected
> bridging behaviour.
>
> Based on what you have said, do I understand correctly that
> offload_fwd_mark shouldn't be set until bridge hardware offloading has
> been implemented?
>
> Thanks for your detailed review so far.

So back to my initial suggestion:

| At the very least, please use
|
| 	dsa_default_offload_fwd_mark(skb);
|
| which does the right thing when the port is not offloading the bridge.

This way, you won't have to touch this code even after you start
implementing .port_bridge_join and .port_bridge_leave. It deals with
both cases.