mbox series

[RFC,net-next,0/9] Get rid of the switchdev transactional model

Message ID 20201217015822.826304-1-vladimir.oltean@nxp.com
Headers show
Series Get rid of the switchdev transactional model | expand

Message

Vladimir Oltean Dec. 17, 2020, 1:58 a.m. UTC
This series comes after the late realization that the prepare/commit
separation imposed by switchdev does not help literally anybody:
https://patchwork.kernel.org/project/netdevbpf/patch/20201212203901.351331-1-vladimir.oltean@nxp.com/

We should kill it before it inflicts even more damage to the error
handling logic in drivers.

Vladimir Oltean (9):
  net: switchdev: remove the transaction structure from port object
    notifiers
  net: switchdev: delete switchdev_port_obj_add_now
  net: switchdev: remove the transaction structure from port attributes
  net: dsa: remove the transactional logic from ageing time notifiers
  net: dsa: remove the transactional logic from MDB entries
  net: dsa: remove the transactional logic from VLAN objects
  net: dsa: remove obsolete comment about switchdev transactions
  mlxsw: spectrum_switchdev: remove transactional logic for VLAN objects
  net: switchdev: delete the transaction object

 drivers/net/dsa/b53/b53_common.c              |  42 +++----
 drivers/net/dsa/b53/b53_priv.h                |  15 +--
 drivers/net/dsa/bcm_sf2.c                     |   2 -
 drivers/net/dsa/bcm_sf2_cfp.c                 |   7 +-
 drivers/net/dsa/dsa_loop.c                    |  32 ++----
 drivers/net/dsa/hirschmann/hellcreek.c        |  18 +--
 drivers/net/dsa/lan9303-core.c                |  12 +-
 drivers/net/dsa/lantiq_gswip.c                |  40 +++----
 drivers/net/dsa/microchip/ksz8795.c           |  14 +--
 drivers/net/dsa/microchip/ksz9477.c           |  40 +++----
 drivers/net/dsa/microchip/ksz_common.c        |  23 +---
 drivers/net/dsa/microchip/ksz_common.h        |   8 +-
 drivers/net/dsa/mt7530.c                      |  20 +---
 drivers/net/dsa/mv88e6xxx/chip.c              |  72 ++++++------
 drivers/net/dsa/ocelot/felix.c                |  32 +++---
 drivers/net/dsa/qca8k.c                       |  20 +---
 drivers/net/dsa/realtek-smi-core.h            |   9 +-
 drivers/net/dsa/rtl8366.c                     |  36 +++---
 drivers/net/dsa/rtl8366rb.c                   |   1 -
 drivers/net/dsa/sja1105/sja1105.h             |   3 +-
 drivers/net/dsa/sja1105/sja1105_devlink.c     |   9 +-
 drivers/net/dsa/sja1105/sja1105_main.c        |  48 +++-----
 .../marvell/prestera/prestera_switchdev.c     |  44 ++------
 .../mellanox/mlxsw/spectrum_switchdev.c       | 102 ++++-------------
 drivers/net/ethernet/mscc/ocelot.c            |  32 ++----
 drivers/net/ethernet/mscc/ocelot_net.c        |  39 ++-----
 drivers/net/ethernet/rocker/rocker.h          |   6 +-
 drivers/net/ethernet/rocker/rocker_main.c     |  61 +++--------
 drivers/net/ethernet/rocker/rocker_ofdpa.c    |  23 ++--
 drivers/net/ethernet/ti/cpsw_switchdev.c      |  37 ++-----
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c       |  80 +++++---------
 include/net/dsa.h                             |  11 +-
 include/net/switchdev.h                       |  24 +---
 include/soc/mscc/ocelot.h                     |   3 +-
 net/dsa/dsa_priv.h                            |  27 ++---
 net/dsa/port.c                                | 103 +++++++-----------
 net/dsa/slave.c                               |  56 +++-------
 net/dsa/switch.c                              |  74 ++-----------
 net/switchdev/switchdev.c                     | 101 ++---------------
 39 files changed, 408 insertions(+), 918 deletions(-)

Comments

Florian Fainelli Dec. 17, 2020, 2:08 a.m. UTC | #1
On 12/16/2020 5:58 PM, Vladimir Oltean wrote:
> For many drivers, the .port_mdb_prepare callback was not a good opportunity
> to avoid any error condition, and they would suppress errors found during
> the actual commit phase.
> 
> Where a logical separation between the prepare and the commit phase
> existed, the function that used to implement the .port_mdb_prepare
> callback still exists, but now it is called directly from .port_mdb_add,
> which was modified to return an int code.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Florian Fainelli Dec. 17, 2020, 2:14 a.m. UTC | #2
On 12/16/2020 5:58 PM, Vladimir Oltean wrote:
> Since the introduction of the switchdev API, port objects were
> transmitted to drivers for offloading using a two-step transactional
> model, with a prepare phase that was supposed to catch all errors, and a
> commit phase that was supposed to never fail.
> 
> Some classes of failures can never be avoided, like hardware access, or
> memory allocation. In the latter case, merely attempting to move the
> memory allocation to the preparation phase makes it impossible to avoid
> memory leaks, since commit 91cf8eceffc1 ("switchdev: Remove unused
> transaction item queue") which has removed the unused mechanism of
> passing on the allocated memory between one phase and another.
> 
> It is time we admit that separating the preparation from the commit
> phase is something that is best left for the driver to decide, and not
> something that should be baked into the API, especially since there are
> no switchdev callers that depend on this.
> 
> This patch removes the struct switchdev_trans member from switchdev port
> object notifier structures, and converts drivers to not look at this
> member.
> 
> Where driver conversion is trivial (like in the case of the Marvell
> Prestera driver, NXP DPAA2 switch, TI CPSW, and Rocker drivers), it is
> done in this patch.
> 
> Where driver conversion needs more attention (DSA, Mellanox Spectrum),
> the conversion is left for subsequent patches and here we only fake the
> prepare/commit phases at a lower level, just not in the switchdev
> notifier itself.
> 
> Where the code has a natural structure that is best left alone as a
> preparation and a commit phase (as in the case of the Ocelot switch),
> that structure is left in place, just made to not depend upon the
> switchdev transactional model.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Vladimir Oltean Dec. 17, 2020, 11:04 a.m. UTC | #3
On Thu, Dec 17, 2020 at 03:58:19AM +0200, Vladimir Oltean wrote:
> It should be the driver's business to logically separate its VLAN
> offloading into a preparation and a commit phase, and some drivers don't
> need / can't do this.
> 
> So remove the transactional shim from DSA and let drivers to propagate
> errors directly from the .port_vlan_add callback.
> 
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
[...]
> diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> index 65124bc3ddfb..bd00ef6296f9 100644
> --- a/net/dsa/switch.c
> +++ b/net/dsa/switch.c
> @@ -217,35 +217,13 @@ static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
>  	return false;
>  }
>  
> -static int dsa_switch_vlan_prepare(struct dsa_switch *ds,
> -				   struct dsa_notifier_vlan_info *info)
> -{
> -	int port, err;
> -
> -	if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
> -		return -EOPNOTSUPP;
> -
> -	for (port = 0; port < ds->num_ports; port++) {
> -		if (dsa_switch_vlan_match(ds, port, info)) {
> -			err = ds->ops->port_vlan_prepare(ds, port, info->vlan);
> -			if (err)
> -				return err;
> -		}
> -	}
> -
> -	return 0;
> -}
> -
>  static int dsa_switch_vlan_add(struct dsa_switch *ds,
>  			       struct dsa_notifier_vlan_info *info)
>  {
>  	int port;
>  
> -	if (switchdev_trans_ph_prepare(info->trans))
> -		return dsa_switch_vlan_prepare(ds, info);
> -
>  	if (!ds->ops->port_vlan_add)
> -		return 0;
> +		return -EOPNOTSUPP;
>  
>  	for (port = 0; port < ds->num_ports; port++)
>  		if (dsa_switch_vlan_match(ds, port, info))
> -- 
> 2.25.1
> 

For anybody who wants to test, please paste this instead of the existing
dsa_switch_vlan_add, to propagate the errors:

static int dsa_switch_vlan_add(struct dsa_switch *ds,
			       struct dsa_notifier_vlan_info *info)
{
	int err = 0;
	int port;

	if (!ds->ops->port_vlan_add)
		return -EOPNOTSUPP;

	for (port = 0; port < ds->num_ports; port++) {
		if (dsa_switch_vlan_match(ds, port, info)) {
			err = ds->ops->port_vlan_add(ds, port, info->vlan);
			if (err)
				break;
		}
	}

	return err;
}
Vladimir Oltean Dec. 17, 2020, 11:43 a.m. UTC | #4
On Thu, Dec 17, 2020 at 01:04:26PM +0200, Vladimir Oltean wrote:
> On Thu, Dec 17, 2020 at 03:58:19AM +0200, Vladimir Oltean wrote:
> > It should be the driver's business to logically separate its VLAN
> > offloading into a preparation and a commit phase, and some drivers don't
> > need / can't do this.
> > 
> > So remove the transactional shim from DSA and let drivers to propagate
> > errors directly from the .port_vlan_add callback.
> > 
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > ---
> [...]
> > diff --git a/net/dsa/switch.c b/net/dsa/switch.c
> > index 65124bc3ddfb..bd00ef6296f9 100644
> > --- a/net/dsa/switch.c
> > +++ b/net/dsa/switch.c
> > @@ -217,35 +217,13 @@ static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
> >  	return false;
> >  }
> >  
> > -static int dsa_switch_vlan_prepare(struct dsa_switch *ds,
> > -				   struct dsa_notifier_vlan_info *info)
> > -{
> > -	int port, err;
> > -
> > -	if (!ds->ops->port_vlan_prepare || !ds->ops->port_vlan_add)
> > -		return -EOPNOTSUPP;
> > -
> > -	for (port = 0; port < ds->num_ports; port++) {
> > -		if (dsa_switch_vlan_match(ds, port, info)) {
> > -			err = ds->ops->port_vlan_prepare(ds, port, info->vlan);
> > -			if (err)
> > -				return err;
> > -		}
> > -	}
> > -
> > -	return 0;
> > -}
> > -
> >  static int dsa_switch_vlan_add(struct dsa_switch *ds,
> >  			       struct dsa_notifier_vlan_info *info)
> >  {
> >  	int port;
> >  
> > -	if (switchdev_trans_ph_prepare(info->trans))
> > -		return dsa_switch_vlan_prepare(ds, info);
> > -
> >  	if (!ds->ops->port_vlan_add)
> > -		return 0;
> > +		return -EOPNOTSUPP;
> >  
> >  	for (port = 0; port < ds->num_ports; port++)
> >  		if (dsa_switch_vlan_match(ds, port, info))
> > -- 
> > 2.25.1
> > 
> 
> For anybody who wants to test, please paste this instead of the existing
> dsa_switch_vlan_add, to propagate the errors:
> 
> static int dsa_switch_vlan_add(struct dsa_switch *ds,
> 			       struct dsa_notifier_vlan_info *info)
> {
> 	int err = 0;
> 	int port;
> 
> 	if (!ds->ops->port_vlan_add)
> 		return -EOPNOTSUPP;
> 
> 	for (port = 0; port < ds->num_ports; port++) {
> 		if (dsa_switch_vlan_match(ds, port, info)) {
> 			err = ds->ops->port_vlan_add(ds, port, info->vlan);
> 			if (err)
> 				break;
> 		}
> 	}
> 
> 	return err;
> }

Having said that, now I finally understand one of the ways in which the
prepare phase was actually useful for something. Sometimes it takes
removing all the code before things like this are visible...

DSA uses notifiers to propagate a switchdev object to multiple ports.
For example, every VLAN is installed not only on the front-panel user
port, but also on the CPU port. Iterating through the port list, we find
that the addition on the user port may succeed, but the addition on the
CPU port might fail.

So, at the very least, we should attempt a rollback on error (iterating
backwards through the port list).

But there is a second problem. Nobody prevents you from adding a VLAN
twice on the same port.

bridge vlan add dev swp0 vid 100
bridge vlan add dev swp0 vid 90-110

If the second command fails at VLAN 110 and we attempt a rollback, we
would end up deleting VLAN 100 too, which was set up by the first command.

This is where the prepare/commit phase could come in handy. First all
the ports confirm that the VLAN range 90-110 is ok, and only then we
proceed to commit it. That would avoid the first problem.

However there's the other problem, which is that artificially splitting
into a prepare and a commit phase will still inevitably leave some
"catastrophic" failures unhandled in the commit phase. We typically
throw our hands up in the air when it comes to these, but normally if we
bother to do something about consistency in the first place, we should
have rollback logic too. It's just that the rollback will not be very
simple. We would need to know, for each VLAN, if it already existed or
not. And if it existed, what flags it had. Either that, or we start
disallowing the user, from the bridge code, to add VLAN objects that
already exist.
Kurt Kanzenbach Dec. 18, 2020, 8:49 a.m. UTC | #5
Hi Vladimir,

On Thu Dec 17 2020, Vladimir Oltean wrote:
> It should be the driver's business to logically separate its VLAN

> offloading into a preparation and a commit phase, and some drivers don't

> need / can't do this.

>

> So remove the transactional shim from DSA and let drivers to propagate

> errors directly from the .port_vlan_add callback.

>

> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>


Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>


for the hellcreek part.

Thanks,
Kurt
Linus Walleij Dec. 27, 2020, 1:32 p.m. UTC | #6
On Thu, Dec 17, 2020 at 2:59 AM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:

> This series comes after the late realization that the prepare/commit

> separation imposed by switchdev does not help literally anybody:

> https://patchwork.kernel.org/project/netdevbpf/patch/20201212203901.351331-1-vladimir.oltean@nxp.com/

>

> We should kill it before it inflicts even more damage to the error

> handling logic in drivers.


I agree with the goal and the series make the kernel less
complex so:
Acked-by: Linus Walleij <linus.walleij@linaro.org>

for the series.

Yours,
Linus Walleij
Jiri Pirko Dec. 28, 2020, 9:59 a.m. UTC | #7
Thu, Dec 17, 2020 at 02:58:13AM CET, vladimir.oltean@nxp.com wrote:
>This series comes after the late realization that the prepare/commit

>separation imposed by switchdev does not help literally anybody:

>https://patchwork.kernel.org/project/netdevbpf/patch/20201212203901.351331-1-vladimir.oltean@nxp.com/

>

>We should kill it before it inflicts even more damage to the error

>handling logic in drivers.


Awesome, I totally like this patchset. To be honest, I didn't see much
or a point to do the transaction model from the start, yet I remember
there were people requiring it. I guess they are fine now without
it or they don't care anymore.

Acked-by: Jiri Pirko <jiri@nvidia.com>