mbox series

[net-next,00/11] Marvell Prestera driver implementation of devlink functionality.

Message ID 20210609151602.29004-1-oleksandr.mazur@plvision.eu
Headers show
Series Marvell Prestera driver implementation of devlink functionality. | expand

Message

Oleksandr Mazur June 9, 2021, 3:15 p.m. UTC
Prestera Switchdev driver implements a set of devlink-based features,
that include both debug functionality (traps with trap statistics), as well
as functional rate limiter that is based on the devlink kernel API (interfaces).

The core prestera-devlink functionality is implemented in the prestera_devlink.c.

The patch series also extends the existing devlink kernel API with a list of core
features:
 - devlink: add API for both publish/unpublish port parameters.
 - devlink: add port parameters-specific ops, as current design makes it impossible
   to register one parameter for multiple ports, and effectively distinguish for
   what port parameter_op is called.
 - devlink: add trap_drop_counter_get callback for driver to register - make it possible
   to keep track of how many packets have been dropped (hard) by the switch device, before
   the packets even made it to the devlink subsystem (e.g. dropped due to RXDMA buffer
   overflow).

The core features that extend current functionality of prestera Switchdev driver:
 - add storm control (BUM control) functionality, that is driven through the
   devlink (per) port parameters.
 - add logic for driver traps and drops registration (also traps with DROP action).
 - add documentation for prestera driver traps and drops group.

Oleksandr Mazur (10):
  net: core: devlink: add dropped stats traps field
  net: core: devlink: add port_params_ops for devlink port parameters
    altering
  testing: selftests: net: forwarding: add devlink-required
    functionality to test (hard) dropped stats field
  drivers: net: netdevsim: add devlink trap_drop_counter_get
    implementation
  testing: selftests: drivers: net: netdevsim: devlink: add test case
    for hard drop statistics
  drivers: net: netdevsim: add devlink port params usage
  net: marvell: prestera: devlink: add traps/groups implementation
  net: marvell: prestera: devlink: add traps with DROP action
  net: marvell: prestera: add storm control (rate limiter)
    implementation
  documentation: networking: devlink: add prestera switched driver
    Documentation

Sudarsana Reddy Kalluru (1):
  net: core: devlink: add apis to publish/unpublish port params

 Documentation/networking/devlink/prestera.rst | 167 +++++
 .../net/ethernet/marvell/prestera/prestera.h  |   9 +
 .../marvell/prestera/prestera_devlink.c       | 664 +++++++++++++++++-
 .../marvell/prestera/prestera_devlink.h       |   3 +
 .../ethernet/marvell/prestera/prestera_dsa.c  |   3 +
 .../ethernet/marvell/prestera/prestera_dsa.h  |   1 +
 .../ethernet/marvell/prestera/prestera_hw.c   |  60 ++
 .../ethernet/marvell/prestera/prestera_hw.h   |  20 +
 .../ethernet/marvell/prestera/prestera_rxtx.c |   7 +-
 drivers/net/netdevsim/dev.c                   | 108 ++-
 drivers/net/netdevsim/netdevsim.h             |   3 +
 include/net/devlink.h                         |  35 +
 net/core/devlink.c                            | 139 +++-
 .../drivers/net/netdevsim/devlink_trap.sh     |  10 +
 .../selftests/net/forwarding/devlink_lib.sh   |  26 +
 15 files changed, 1238 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/networking/devlink/prestera.rst

Comments

Oleksandr Mazur June 11, 2021, 1:19 p.m. UTC | #1
>>  On Wed, Jun 09, 2021 at 06:16:00PM +0300, Oleksandr Mazur wrote:
> Storm control (BUM) provides a mechanism to limit rate of ingress
> > port traffic (matched by type). Devlink port parameter API is used:
> > driver registers a set of per-port parameters that can be accessed to both
> > get/set per-port per-type rate limit.
> > Add new FW command - RATE_LIMIT_MODE_SET.

> Hi Oleksandr

> Just expanding on the two comments you already received about this.

> We often see people miss that switchdev is about. It is not about
> writing switch drivers. It is about writing network stack
> accelerators. You take a feature of the Linux network stack and you
> accelerate it by offloading it to the hardware. So look around the
> network stack and see how you configure it to perform rate limiting of
> broadcast traffic ingress. Once you have found a suitable mechanism,
> accelerate it via offloading.

> If you find Linux has no way to perform a feature the hardware could
> accelerate, you first need to add a pure software version of that
> feature to the network stack, and then add acceleration support for
> it.


Hello Andrew, Ido, Nikolay,
I appreciate your time and comments provided over this patchset, though i have a few questions to ask, if you don't mind:

1. Does it mean that in order to support storm control in switchdev driver i need to implement software storm control in bridge driver,
and then using the switchdev attributes (notifiers) mechanism offload the configuration itself to the HW?
2. Is there any chance of keeping devlink solution untill the discussed (storm control implemented in the bridge driver) mechanism will be ready/implemented?

Anyway, it relies on the port param API from devlink which is already present in the kernel API.
Andrew Lunn June 11, 2021, 5:08 p.m. UTC | #2
On Fri, Jun 11, 2021 at 01:19:13PM +0000, Oleksandr Mazur wrote:
> >>  On Wed, Jun 09, 2021 at 06:16:00PM +0300, Oleksandr Mazur wrote:

> > Storm control (BUM) provides a mechanism to limit rate of ingress

> > > port traffic (matched by type). Devlink port parameter API is used:

> > > driver registers a set of per-port parameters that can be accessed to both

> > > get/set per-port per-type rate limit.

> > > Add new FW command - RATE_LIMIT_MODE_SET.

> 

> > Hi Oleksandr

> 

> > Just expanding on the two comments you already received about this.

> 

> > We often see people miss that switchdev is about. It is not about

> > writing switch drivers. It is about writing network stack

> > accelerators. You take a feature of the Linux network stack and you

> > accelerate it by offloading it to the hardware. So look around the

> > network stack and see how you configure it to perform rate limiting of

> > broadcast traffic ingress. Once you have found a suitable mechanism,

> > accelerate it via offloading.

> 

> > If you find Linux has no way to perform a feature the hardware could

> > accelerate, you first need to add a pure software version of that

> > feature to the network stack, and then add acceleration support for

> > it.

> 

> 

> Hello Andrew, Ido, Nikolay,

> I appreciate your time and comments provided over this patchset, though i have a few questions to ask, if you don't mind:

> 


> 1. Does it mean that in order to support storm control in switchdev

> driver i need to implement software storm control in bridge driver,

> and then using the switchdev attributes (notifiers) mechanism

> offload the configuration itself to the HW?


Hi Oleksandr

Not necessarily. Is storm control anything more than ingress packet
matching and rate limiting?

I'm not TC expert, but look for example at
https://man7.org/linux/man-pages/man8/tc-police.8.html

and the example:

# tc qdisc add dev eth0 handle ffff: ingress
# tc filter add dev eth0 parent ffff: u32 \
                   match u32 0 0 \
                   police rate 1mbit burst 100k

Replace the "match u32 0 0" with something which matches on broadcast
frames.  Maybe "flower dst_mac ff:ff:ff:ff:ff:ff"

So there is a software solution. Now accelerate it.

> 2. Is there any chance of keeping devlink solution untill the

> discussed (storm control implemented in the bridge driver) mechanism

> will be ready/implemented?


No. Please do it correctly from the beginning. No hacks.

    Andrew
Ido Schimmel June 13, 2021, 7:34 a.m. UTC | #3
On Fri, Jun 11, 2021 at 07:08:00PM +0200, Andrew Lunn wrote:
> On Fri, Jun 11, 2021 at 01:19:13PM +0000, Oleksandr Mazur wrote:

> > >>  On Wed, Jun 09, 2021 at 06:16:00PM +0300, Oleksandr Mazur wrote:

> > > Storm control (BUM) provides a mechanism to limit rate of ingress

> > > > port traffic (matched by type). Devlink port parameter API is used:

> > > > driver registers a set of per-port parameters that can be accessed to both

> > > > get/set per-port per-type rate limit.

> > > > Add new FW command - RATE_LIMIT_MODE_SET.

> > 

> > > Hi Oleksandr

> > 

> > > Just expanding on the two comments you already received about this.

> > 

> > > We often see people miss that switchdev is about. It is not about

> > > writing switch drivers. It is about writing network stack

> > > accelerators. You take a feature of the Linux network stack and you

> > > accelerate it by offloading it to the hardware. So look around the

> > > network stack and see how you configure it to perform rate limiting of

> > > broadcast traffic ingress. Once you have found a suitable mechanism,

> > > accelerate it via offloading.

> > 

> > > If you find Linux has no way to perform a feature the hardware could

> > > accelerate, you first need to add a pure software version of that

> > > feature to the network stack, and then add acceleration support for

> > > it.

> > 

> > 

> > Hello Andrew, Ido, Nikolay,

> > I appreciate your time and comments provided over this patchset, though i have a few questions to ask, if you don't mind:

> > 

> 

> > 1. Does it mean that in order to support storm control in switchdev

> > driver i need to implement software storm control in bridge driver,

> > and then using the switchdev attributes (notifiers) mechanism

> > offload the configuration itself to the HW?

> 

> Hi Oleksandr

> 

> Not necessarily. Is storm control anything more than ingress packet

> matching and rate limiting?

> 

> I'm not TC expert, but look for example at

> https://man7.org/linux/man-pages/man8/tc-police.8.html

> 

> and the example:

> 

> # tc qdisc add dev eth0 handle ffff: ingress

> # tc filter add dev eth0 parent ffff: u32 \

>                    match u32 0 0 \

>                    police rate 1mbit burst 100k

> 

> Replace the "match u32 0 0" with something which matches on broadcast

> frames.  Maybe "flower dst_mac ff:ff:ff:ff:ff:ff"

> 

> So there is a software solution. Now accelerate it.


Storm control also needs the ability to limit other types of flooded
traffic such unknown unicast and unregistered multicast packets. The
entity which classifies packets as such is the bridge, which happens
after the ingress hook.

I see two options to support storm control in Linux:

1. By adding support in the bridge itself as a new bridge slave option.
Something like:

# ip link set dev swp1 type bridge_slave \
	storm_control type { uuc | umc | bc} rate RATE mode { packet | byte }

I suspect this similar to more traditional implementations that users
might be used to and also maps nicely to hardware implementations

2. Teaching tc to call into the bridge to classify a packet. Not sure a
whole new classifier is needed for this. Maybe just extend flower with a
new key: dst_mac_type { uuc | umc }. I personally find this a bit weird,
but it is more flexible and allows to reuse existing actions

> 

> > 2. Is there any chance of keeping devlink solution untill the

> > discussed (storm control implemented in the bridge driver) mechanism

> > will be ready/implemented?

> 

> No. Please do it correctly from the beginning. No hacks.


+1
Oleksandr Mazur June 17, 2021, 5:30 p.m. UTC | #4
> Prestera Switchdev driver implements a set of devlink-based features,
> that include both debug functionality (traps with trap statistics), as well
> as functional rate limiter that is based on the devlink kernel API (interfaces).

> The core prestera-devlink functionality is implemented in the prestera_devlink.c.

> The patch series also extends the existing devlink kernel API with a list of core
> features:
>  - devlink: add API for both publish/unpublish port parameters.
>  - devlink: add port parameters-specific ops, as current design makes it impossible
>    to register one parameter for multiple ports, and effectively distinguish for
>   what port parameter_op is called.

As we discussed the storm control (BUM) done via devlink port params topic, and agreed that it shouldn't be done using the devlink API itself, there's an open question i'd like to address: the patch series included, for what i think, list of needed and benefitial changes, and those are the following patches:

> Oleksandr Mazur (2):
...
>  net: core: devlink: add port_params_ops for devlink port parameters altering
>  drivers: net: netdevsim: add devlink port params usage
 
> Sudarsana Reddy Kalluru (1):
>  net: core: devlink: add apis to publish/unpublish port params

So, should i create a new patch series that would include all of them?

Because in that case the series itself would lack an actual HW usage of it. The only usage would be limited to the netdevsim driver.
Andrew Lunn June 17, 2021, 7:44 p.m. UTC | #5
On Thu, Jun 17, 2021 at 05:30:07PM +0000, Oleksandr Mazur wrote:
> > Prestera Switchdev driver implements a set of devlink-based features,

> > that include both debug functionality (traps with trap statistics), as well

> > as functional rate limiter that is based on the devlink kernel API (interfaces).

> 

> > The core prestera-devlink functionality is implemented in the prestera_devlink.c.

> 

> > The patch series also extends the existing devlink kernel API with a list of core

> > features:

> >  - devlink: add API for both publish/unpublish port parameters.

> >  - devlink: add port parameters-specific ops, as current design makes it impossible

> >    to register one parameter for multiple ports, and effectively distinguish for

> >   what port parameter_op is called.

> 

> As we discussed the storm control (BUM) done via devlink port params topic, and agreed that it shouldn't be done using the devlink API itself, there's an open question i'd like to address: the patch series included, for what i think, list of needed and benefitial changes, and those are the following patches:


Please wrap your emails at around 70 characters.

> 

> > Oleksandr Mazur (2):

> ...

> >  net: core: devlink: add port_params_ops for devlink port parameters altering

> >  drivers: net: netdevsim: add devlink port params usage

>  

> > Sudarsana Reddy Kalluru (1):

> >  net: core: devlink: add apis to publish/unpublish port params

> 

> So, should i create a new patch series that would include all of them?

> 

> Because in that case the series itself would lack an actual HW usage of it. The only usage would be limited to the netdevsim driver.


We generally don't add APIs without a user. And in this case, i'm not
sure netdevsim is a valid user. Can you refactor some other driver to
make use of the new code? If not, i would suggest they are not merged
at the moment. When you do have a valid use case, you can post them
again.

	Andrew