mbox series

[bpf-next,v2,0/4] Expose network namespace cookies to user space

Message ID 20210219095149.50346-1-lmb@cloudflare.com
Headers show
Series Expose network namespace cookies to user space | expand

Message

Lorenz Bauer Feb. 19, 2021, 9:51 a.m. UTC
We're working on a user space control plane for the BPF sk_lookup
hook [1]. The hook attaches to a network namespace and allows
control over which socket receives a new connection / packet.

Roughly, applications can give a socket to our user space component
to participate in custom bind semantics. This creates an edge case
where  an application can provide us with a socket that lives in
a different network namespace than our BPF sk_lookup program.
We'd like to return an error in this case.

Additionally, we have some user space state that is tied to the
network namespace. We currently use the inode of the nsfs entry
in a directory name, but this is suffers from inode reuse.

I'm proposing to fix both of these issues by adding a new
SO_NETNS_COOKIE socket option as well as a NS_GET_COOKIE ioctl.
Using these we get a stable, unique identifier for a network
namespace and check whether a socket belongs to the "correct"
namespace.

NS_GET_COOKIE could be renamed to NS_GET_NET_COOKIE. I kept the
name generic because it seems like other namespace types could
benefit from a cookie as well.

1: https://www.kernel.org/doc/html/latest/bpf/prog_sk_lookup.html

Changes in v2:
- Rebase on top of Eric Dumazet's netns cookie simplification

Lorenz Bauer (4):
  net: add SO_NETNS_COOKIE socket option
  nsfs: add an ioctl to discover the network namespace cookie
  tools/testing: add test for NS_GET_COOKIE
  tools/testing: add a selftest for SO_NETNS_COOKIE

 arch/alpha/include/uapi/asm/socket.h          |  2 +
 arch/mips/include/uapi/asm/socket.h           |  2 +
 arch/parisc/include/uapi/asm/socket.h         |  2 +
 arch/sparc/include/uapi/asm/socket.h          |  2 +
 fs/nsfs.c                                     |  8 +++
 include/uapi/asm-generic/socket.h             |  2 +
 include/uapi/linux/nsfs.h                     |  2 +
 net/core/sock.c                               | 11 ++++
 tools/testing/selftests/net/.gitignore        |  1 +
 tools/testing/selftests/net/Makefile          |  2 +-
 tools/testing/selftests/net/config            |  1 +
 tools/testing/selftests/net/so_netns_cookie.c | 61 +++++++++++++++++++
 tools/testing/selftests/nsfs/.gitignore       |  1 +
 tools/testing/selftests/nsfs/Makefile         |  2 +-
 tools/testing/selftests/nsfs/config           |  1 +
 tools/testing/selftests/nsfs/netns.c          | 57 +++++++++++++++++
 16 files changed, 155 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/net/so_netns_cookie.c
 create mode 100644 tools/testing/selftests/nsfs/netns.c

Comments

Eric Dumazet Feb. 19, 2021, 11:49 a.m. UTC | #1
On 2/19/21 10:51 AM, Lorenz Bauer wrote:
> We need to distinguish which network namespace a socket belongs to.
> BPF has the useful bpf_get_netns_cookie helper for this, but accessing
> it from user space isn't possible. Add a read-only socket option that
> returns the netns cookie, similar to SO_COOKIE. If network namespaces
> are disabled, SO_NETNS_COOKIE returns the cookie of init_net.
> 
> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
> ---


> diff --git a/net/core/sock.c b/net/core/sock.c
> index 0ed98f20448a..de4644aeb58d 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1614,6 +1614,17 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
>  		v.val = sk->sk_bound_dev_if;
>  		break;
>  
> +	case SO_NETNS_COOKIE:
> +		lv = sizeof(u64);
> +		if (len < lv)
> +			return -EINVAL;

	if (len != lv)
		return -EINVAL;

(There is no reason to support bigger value before at least hundred years)

> +#ifdef CONFIG_NET_NS
> +		v.val64 = sock_net(sk)->net_cookie;
> +#else
> +		v.val64 = init_net.net_cookie;
> +#endif
> +		break;
> +

Why using this ugly #ifdef ?

The following should work just fine, even if CONFIG_NET_NS is not set.

v.val64 = sock_net(sk)->net_cookie;


	

>  	default:
>  		/* We implement the SO_SNDLOWAT etc to not be settable
>  		 * (1003.1g 7).
>
Lorenz Bauer Feb. 19, 2021, 12:23 p.m. UTC | #2
On Fri, 19 Feb 2021 at 11:49, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> > +     case SO_NETNS_COOKIE:
> > +             lv = sizeof(u64);
> > +             if (len < lv)
> > +                     return -EINVAL;
>
>         if (len != lv)
>                 return -EINVAL;
>
> (There is no reason to support bigger value before at least hundred years)

Sorry that was copy pasta from SO_COOKIE which uses the same check. I'll
change it to your suggestion. Want me to fix SO_COOKIE as well?

>
> > +#ifdef CONFIG_NET_NS
> > +             v.val64 = sock_net(sk)->net_cookie;
> > +#else
> > +             v.val64 = init_net.net_cookie;
> > +#endif
> > +             break;
> > +
>
> Why using this ugly #ifdef ?
>
> The following should work just fine, even if CONFIG_NET_NS is not set.
>
> v.val64 = sock_net(sk)->net_cookie;

I looked at sock_net and didn't understand how it avoids a compile error
so I didn't use it, thanks for pointing this out.