[RFC,0/3] Allow sk_lookup UDP return traffic to egress.

Message ID 20240913-reverse-sk-lookup-v1-0-e721ea003d4c@cloudflare.com

Tiago Lam Sept. 13, 2024, 9:39 a.m. UTC
Currently, sk_lookup allows an eBPF program to run on the ingress socket
lookup path, and accept traffic not only on a range of addresses, but
also on a range of ports. At Cloudflare we use sk_lookup for two main
cases:
1. Sharing a single port between multiple services - i.e. two services
   (or more) use disjoint IP ranges but share the same port;
2. Receiving traffic on all ports - i.e. a service which accepts traffic
   on specific IP ranges but any port [1].

However, one main challenge we face while using sk_lookup for these use
cases is how to source return UDP traffic:
- On point 1. above, the range of addresses is sometimes not local
  (i.e. there are no local routes for them on the server), which means
  we need IP_TRANSPARENT set to be able to egress traffic from addresses
  we've received traffic on (or simply IP_FREEBIND in the case of IPv6);
- And on point 2. above, allowing traffic to a range of ports means a
  service could get traffic on multiple ports, but currently there's no
  way to set the source UDP port egress traffic should be sourced from -
  it's possible to receive the original destination port using the
  IP_ORIGDSTADDR ancillary message in recvmsg, but not to set it in
  sendmsg.
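
For reference, the receive half of point 2 can be sketched roughly as
below. This is only an illustration of the existing IP_RECVORIGDSTADDR /
IP_ORIGDSTADDR interface for IPv4; enable_origdst() and find_origdst()
are hypothetical helper names, not part of this series.

```c
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Fallback for older libc headers; 20 is the value in the Linux UAPI. */
#ifndef IP_ORIGDSTADDR
#define IP_ORIGDSTADDR 20
#endif
#ifndef IP_RECVORIGDSTADDR
#define IP_RECVORIGDSTADDR IP_ORIGDSTADDR
#endif

/* Ask the kernel to attach an IP_ORIGDSTADDR cmsg to received datagrams. */
int enable_origdst(int fd)
{
	int one = 1;

	return setsockopt(fd, IPPROTO_IP, IP_RECVORIGDSTADDR,
			  &one, sizeof(one));
}

/*
 * Hypothetical helper: scan the control messages filled in by recvmsg()
 * and copy out the original destination address and port.
 * Returns 0 on success, -1 if no such cmsg is present.
 */
int find_origdst(struct msghdr *msg, struct sockaddr_in *out)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm != NULL; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == IPPROTO_IP &&
		    cm->cmsg_type == IP_ORIGDSTADDR &&
		    cm->cmsg_len >= CMSG_LEN(sizeof(*out))) {
			memcpy(out, CMSG_DATA(cm), sizeof(*out));
			return 0;
		}
	}
	return -1;
}
```

A service would call enable_origdst() once on its socket and then
find_origdst() on the msghdr after each recvmsg() to learn which port
the datagram was originally addressed to.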

Both of these limitations can be worked around, but in a sub-optimal
way. Using IP_TRANSPARENT, for instance, requires special privileges.
And while one could use UDP connected sockets to send return traffic,
creating a connected socket for each distinct address UDP traffic is
received on has performance implications.
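
As an illustration of the connected-socket workaround, a minimal sketch
might look like the following. open_reply_socket() is a hypothetical
name, and the sketch assumes IPv4 and that `local` is an address the
host can bind to; for non-local addresses the bind would additionally
need IP_FREEBIND or IP_TRANSPARENT, as noted above.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a UDP socket bound to `local` (the address
 * and port the request arrived on) and connected to `peer`, so replies
 * leave with the expected source. Returns the fd, or -1 on error.
 */
int open_reply_socket(const struct sockaddr_in *local,
		      const struct sockaddr_in *peer)
{
	int one = 1;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	/*
	 * Lets the bind succeed when another socket that also set
	 * SO_REUSEADDR already holds this address and port.
	 */
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0 ||
	    bind(fd, (const struct sockaddr *)local, sizeof(*local)) < 0 ||
	    connect(fd, (const struct sockaddr *)peer, sizeof(*peer)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Each distinct (local, peer) pair needs its own socket here, which is
where the per-address cost mentioned above comes from.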

Given sk_lookup allows services to accept traffic on a range of
addresses or ports, it seems sensible to also allow return traffic to
proceed through as well, without needing extra configuration or setup.

This patch set does exactly that by performing a reverse socket lookup
on the egress path: it checks whether the egress socket matches the
socket that the attached sk_lookup eBPF program selects for the traffic
being sent. If it does, traffic is allowed to proceed.
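
To make the idea concrete, here is a small user-space model of the
check. It is not the kernel code from this series: struct steer_rule,
steer_lookup() and reverse_lookup_allows() are hypothetical names, and a
static rule table stands in for whatever the attached eBPF program would
decide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one sk_lookup steering rule (host byte order). */
struct steer_rule {
	uint32_t net;      /* IPv4 network */
	uint32_t mask;     /* netmask */
	uint16_t port_lo;  /* first port, inclusive */
	uint16_t port_hi;  /* last port, inclusive */
	int sock_id;       /* stand-in for the selected socket */
};

/* Ingress direction: which socket would receive traffic to addr:port? */
int steer_lookup(const struct steer_rule *rules, size_t n,
		 uint32_t addr, uint16_t port)
{
	for (size_t i = 0; i < n; i++) {
		const struct steer_rule *r = &rules[i];

		if ((addr & r->mask) == r->net &&
		    port >= r->port_lo && port <= r->port_hi)
			return r->sock_id;
	}
	return -1; /* no rule matched */
}

/*
 * Egress direction: allow sock_id to send from src_addr:src_port only
 * if the ingress lookup for that same tuple would select sock_id.
 */
bool reverse_lookup_allows(const struct steer_rule *rules, size_t n,
			   int sock_id, uint32_t src_addr, uint16_t src_port)
{
	return sock_id >= 0 &&
	       steer_lookup(rules, n, src_addr, src_port) == sock_id;
}
```

In this model a socket may only source traffic from an address and port
it would also have received traffic on, which is the property the
reverse lookup is meant to enforce.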

The downside is that this runs on the egress hot path, although this
work tries to minimise its impact by only performing the reverse socket
lookup when necessary. Further performance measurements are still to be
taken, but we're reaching out early for feedback to see what the
technical concerns are and whether we can address them.

[1] https://blog.cloudflare.com/how-we-built-spectrum/

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Tiago Lam <tiagolam@cloudflare.com>
---
Tiago Lam (3):
      ipv4: Run a reverse sk_lookup on sendmsg.
      ipv6: Run a reverse sk_lookup on sendmsg.
      bpf: Add sk_lookup test to use ORIGDSTADDR cmsg.

 include/net/ip.h                                   |  1 +
 net/ipv4/ip_sockglue.c                             | 11 ++++
 net/ipv4/udp.c                                     | 33 +++++++++-
 net/ipv6/datagram.c                                | 76 ++++++++++++++++++++++
 net/ipv6/udp.c                                     |  8 ++-
 tools/testing/selftests/bpf/prog_tests/sk_lookup.c | 70 +++++++++++++-------
 6 files changed, 174 insertions(+), 25 deletions(-)
---
base-commit: da3ea35007d0af457a0afc87e84fddaebc4e0b63
change-id: 20240909-reverse-sk-lookup-f7bf36292bc4

Best regards,