mbox series

[RFC,net-next,00/22] nexthop: Add support for nexthop objects offload

Message ID 20200908091037.2709823-1-idosch@idosch.org
Headers show
Series nexthop: Add support for nexthop objects offload | expand

Message

Ido Schimmel Sept. 8, 2020, 9:10 a.m. UTC
From: Ido Schimmel <idosch@nvidia.com>

Note: I'm aware that 22 patches is a lot and I will split it for the
non-RFC submission. Sending all in one piece to see if there are general
comments regarding the interface. Also, most of the patches are very
small.

This patch set adds support for nexthop objects offload with a dummy
implementation over netdevsim. mlxsw support will be added later.

The general idea is very similar to route offload in that notifications
are sent whenever nexthop objects are changed. A listener can veto the
change and the error will be communicated to user space with extack.

To keep listeners as simple as possible, they not only receive
notifications for the nexthop object that is changed, but also for all
the other objects affected by this change. For example, when a single
nexthop is replaced, a replace notification is sent for the single
nexthop, but also for all the nexthop groups this nexthop is member in.
This relieves listeners from the need to track such dependencies.

To simplify things further for listeners, the notification info does not
contain the raw nexthop data structures (e.g., 'struct nexthop'), but
less complex data structures into which the raw data structures are
parsed into.

Tested with a new selftest over netdevsim and with fib_nexthops.sh:

Tests passed: 164
Tests failed:   0

Patch set overview:

Patches #1-#4 perform small cleanups and covert the existing nexthop
notification chain to a blocking one, so that device drivers could block
when programming nexthops to hardware. This is safe because all
notifications are emitted from a process context.

Patches #5-#8 introduce the aforementioned data structures and convert
existing listeners (i.e., the VXLAN driver) to use them.

Patches #9-#10 add a new RTNH_F_TRAP flag and the ability to set it and
RTNH_F_OFFLOAD on nexthops. This flag is used by netdevsim for testing
purposes and will also be used by mlxsw. These flags are consistent with
the existing RTM_F_OFFLOAD and RTM_F_TRAP flags.

Patches #11-#18 gradually add the new nexthop notifications.

Patches #19-#22 add a dummy implementation for nexthop offload over
netdevsim and a selftest to exercise both good and bad flows.

Ido Schimmel (22):
  nexthop: Remove unused function declaration from header file
  nexthop: Convert to blocking notification chain
  nexthop: Only emit a notification when nexthop is actually deleted
  selftests: fib_nexthops: Test cleanup of FDB entries following nexthop
    deletion
  nexthop: Add nexthop notification data structures
  nexthop: Pass extack to nexthop notifier
  nexthop: Prepare new notification info
  nexthop: vxlan: Convert to new notification info
  rtnetlink: Add RTNH_F_TRAP flag
  nexthop: Allow setting "offload" and "trap" indications on nexthops
  nexthop: Emit a notification when a nexthop is added
  nexthop: Emit a notification when a nexthop group is replaced
  nexthop: Emit a notification when a single nexthop is replaced
  nexthop: Emit a notification when a nexthop group is modified
  nexthop: Emit a notification when a nexthop group is reduced
  nexthop: Pass extack to register_nexthop_notifier()
  nexthop: Replay nexthops when registering a notifier
  nexthop: Remove in-kernel route notifications when nexthop changes
  netdevsim: Add devlink resource for nexthops
  netdevsim: Add dummy implementation for nexthop offload
  netdevsim: Allow programming routes with nexthop objects
  selftests: netdevsim: Add test for nexthop offload API

 .../networking/devlink/netdevsim.rst          |   3 +-
 drivers/net/netdevsim/dev.c                   |   6 +
 drivers/net/netdevsim/fib.c                   | 265 +++++++++++-
 drivers/net/netdevsim/netdevsim.h             |   1 +
 drivers/net/vxlan.c                           |  12 +-
 include/net/netns/nexthop.h                   |   2 +-
 include/net/nexthop.h                         |  45 +-
 include/uapi/linux/rtnetlink.h                |   6 +-
 net/ipv4/fib_semantics.c                      |   2 +
 net/ipv4/fib_trie.c                           |   9 -
 net/ipv4/nexthop.c                            | 262 ++++++++++-
 net/ipv6/route.c                              |   5 -
 .../drivers/net/netdevsim/nexthop.sh          | 408 ++++++++++++++++++
 tools/testing/selftests/net/fib_nexthops.sh   |  14 +
 14 files changed, 985 insertions(+), 55 deletions(-)
 create mode 100755 tools/testing/selftests/drivers/net/netdevsim/nexthop.sh

Comments

David Ahern Sept. 8, 2020, 2:34 p.m. UTC | #1
On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Currently, the delete notification is emitted from the error path of
> nexthop_add() and replace_nexthop(), which can be confusing to listeners
> as they are not familiar with the nexthop.
> 
> Instead, only emit the notification when the nexthop is actually
> deleted. The following sub-cases are covered:
> 
> 1. User space deletes the nexthop
> 2. The nexthop is deleted by the kernel due to a netdev event (e.g.,
>    nexthop device going down)
> 3. A group is deleted because its last nexthop is being deleted
> 4. The network namespace of the nexthop device is deleted
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  net/ipv4/nexthop.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>
Jiri Pirko Sept. 8, 2020, 2:39 p.m. UTC | #2
Tue, Sep 08, 2020 at 11:10:18AM CEST, idosch@idosch.org wrote:
>From: Ido Schimmel <idosch@nvidia.com>
>
>Currently, the delete notification is emitted from the error path of
>nexthop_add() and replace_nexthop(), which can be confusing to listeners
>as they are not familiar with the nexthop.
>
>Instead, only emit the notification when the nexthop is actually
>deleted. The following sub-cases are covered:

Well, in theory, this might break some very odd app that is adding a
route and checking the errors using this notification. My opinion is to
allow this breakage to happen, but I'm usually too benevolent :)


>
>1. User space deletes the nexthop
>2. The nexthop is deleted by the kernel due to a netdev event (e.g.,
>   nexthop device going down)
>3. A group is deleted because its last nexthop is being deleted
>4. The network namespace of the nexthop device is deleted
>
>Signed-off-by: Ido Schimmel <idosch@nvidia.com>
>---
> net/ipv4/nexthop.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
>index 13d9219a9aa1..8c0f17c6863c 100644
>--- a/net/ipv4/nexthop.c
>+++ b/net/ipv4/nexthop.c
>@@ -870,8 +870,6 @@ static void __remove_nexthop_fib(struct net *net, struct nexthop *nh)
> 	bool do_flush = false;
> 	struct fib_info *fi;
> 
>-	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
>-
> 	list_for_each_entry(fi, &nh->fi_list, nh_list) {
> 		fi->fib_flags |= RTNH_F_DEAD;
> 		do_flush = true;
>@@ -909,6 +907,8 @@ static void __remove_nexthop(struct net *net, struct nexthop *nh,
> static void remove_nexthop(struct net *net, struct nexthop *nh,
> 			   struct nl_info *nlinfo)
> {
>+	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
>+
> 	/* remove from the tree */
> 	rb_erase(&nh->rb_node, &net->nexthop.rb_root);
> 
>-- 
>2.26.2
>
David Ahern Sept. 8, 2020, 3:14 p.m. UTC | #3
On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> Add a function that can be called by device drivers to set "offload" or
> "trap" indication on nexthops following nexthop notifications.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  include/net/nexthop.h |  1 +
>  net/ipv4/nexthop.c    | 21 +++++++++++++++++++++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/include/net/nexthop.h b/include/net/nexthop.h
> index 0bde1aa867c0..4147681e86d2 100644
> --- a/include/net/nexthop.h
> +++ b/include/net/nexthop.h
> @@ -146,6 +146,7 @@ struct nh_notifier_info {
>  
>  int register_nexthop_notifier(struct net *net, struct notifier_block *nb);
>  int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
> +void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap);

how about nexthop_set_hw_flags? consistency with current nexthop_get_
... naming

>  
>  /* caller is holding rcu or rtnl; no reference taken to nexthop */
>  struct nexthop *nexthop_find_by_id(struct net *net, u32 id);
> diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> index 70c8ab6906ec..71605c612458 100644
> --- a/net/ipv4/nexthop.c
> +++ b/net/ipv4/nexthop.c
> @@ -2080,6 +2080,27 @@ int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
>  }
>  EXPORT_SYMBOL(unregister_nexthop_notifier);
>  
> +void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap)
> +{
> +	struct nexthop *nexthop;
> +
> +	rcu_read_lock();
> +
> +	nexthop = nexthop_find_by_id(net, id);
> +	if (!nexthop)
> +		goto out;
> +
> +	nexthop->nh_flags &= ~(RTNH_F_OFFLOAD | RTNH_F_TRAP);
> +	if (offload)
> +		nexthop->nh_flags |= RTNH_F_OFFLOAD;
> +	if (trap)
> +		nexthop->nh_flags |= RTNH_F_TRAP;
> +
> +out:
> +	rcu_read_unlock();
> +}
> +EXPORT_SYMBOL(nexthop_hw_flags_set);
> +
>  static void __net_exit nexthop_net_exit(struct net *net)
>  {
>  	rtnl_lock();
>
David Ahern Sept. 8, 2020, 3:34 p.m. UTC | #4
On 9/8/20 3:10 AM, Ido Schimmel wrote:
> From: Ido Schimmel <idosch@nvidia.com>
> 
> This will be used by the next patch which extends the function to replay
> all the existing nexthops to the notifier block being registered.
> 
> Device drivers will be able to pass extack to the function since it is
> passed to them upon reload from devlink.
> 
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
>  drivers/net/vxlan.c   | 3 ++-
>  include/net/nexthop.h | 3 ++-
>  net/ipv4/nexthop.c    | 3 ++-
>  3 files changed, 6 insertions(+), 3 deletions(-)
> 

Reviewed-by: David Ahern <dsahern@gmail.com>
Ido Schimmel Sept. 11, 2020, 3:29 p.m. UTC | #5
On Tue, Sep 08, 2020 at 09:14:37AM -0600, David Ahern wrote:
> On 9/8/20 3:10 AM, Ido Schimmel wrote:
> > From: Ido Schimmel <idosch@nvidia.com>
> > 
> > Add a function that can be called by device drivers to set "offload" or
> > "trap" indication on nexthops following nexthop notifications.
> > 
> > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > ---
> >  include/net/nexthop.h |  1 +
> >  net/ipv4/nexthop.c    | 21 +++++++++++++++++++++
> >  2 files changed, 22 insertions(+)
> > 
> > diff --git a/include/net/nexthop.h b/include/net/nexthop.h
> > index 0bde1aa867c0..4147681e86d2 100644
> > --- a/include/net/nexthop.h
> > +++ b/include/net/nexthop.h
> > @@ -146,6 +146,7 @@ struct nh_notifier_info {
> >  
> >  int register_nexthop_notifier(struct net *net, struct notifier_block *nb);
> >  int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb);
> > +void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap);
> 
> how about nexthop_set_hw_flags? consistency with current nexthop_get_
> ... naming

Sure. I opted for consistency with fib_alias_hw_flags_set() and
fib6_info_hw_flags_set(), but I'll change to be consistent with nexthop
code.

> 
> >  
> >  /* caller is holding rcu or rtnl; no reference taken to nexthop */
> >  struct nexthop *nexthop_find_by_id(struct net *net, u32 id);
> > diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
> > index 70c8ab6906ec..71605c612458 100644
> > --- a/net/ipv4/nexthop.c
> > +++ b/net/ipv4/nexthop.c
> > @@ -2080,6 +2080,27 @@ int unregister_nexthop_notifier(struct net *net, struct notifier_block *nb)
> >  }
> >  EXPORT_SYMBOL(unregister_nexthop_notifier);
> >  
> > +void nexthop_hw_flags_set(struct net *net, u32 id, bool offload, bool trap)
> > +{
> > +	struct nexthop *nexthop;
> > +
> > +	rcu_read_lock();
> > +
> > +	nexthop = nexthop_find_by_id(net, id);
> > +	if (!nexthop)
> > +		goto out;
> > +
> > +	nexthop->nh_flags &= ~(RTNH_F_OFFLOAD | RTNH_F_TRAP);
> > +	if (offload)
> > +		nexthop->nh_flags |= RTNH_F_OFFLOAD;
> > +	if (trap)
> > +		nexthop->nh_flags |= RTNH_F_TRAP;
> > +
> > +out:
> > +	rcu_read_unlock();
> > +}
> > +EXPORT_SYMBOL(nexthop_hw_flags_set);
> > +
> >  static void __net_exit nexthop_net_exit(struct net *net)
> >  {
> >  	rtnl_lock();
> > 
>