diff mbox series

[bpf-next,v2,5/8] libbpf, xsk: select AF_XDP BPF program based on kernel version

Message ID 20210119155013.154808-6-bjorn.topel@gmail.com
State Superseded
Headers show
Series Introduce bpf_redirect_xsk() helper | expand

Commit Message

Björn Töpel Jan. 19, 2021, 3:50 p.m. UTC
From: Björn Töpel <bjorn.topel@intel.com>

Add detection for kernel version, and adapt the BPF program based on
kernel support. This way, users will get the best possible performance
from the BPF program.

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Marek Majtyka  <alardam@gmail.com>
---
 tools/lib/bpf/libbpf.c          |  2 +-
 tools/lib/bpf/libbpf_internal.h |  2 ++
 tools/lib/bpf/libbpf_probes.c   | 16 -------------
 tools/lib/bpf/xsk.c             | 41 ++++++++++++++++++++++++++++++---
 4 files changed, 41 insertions(+), 20 deletions(-)

Comments

Toke Høiland-Jørgensen Jan. 20, 2021, 12:52 p.m. UTC | #1
Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>

>

> Add detection for kernel version, and adapt the BPF program based on

> kernel support. This way, users will get the best possible performance

> from the BPF program.


Please do explicit feature detection instead of relying on the kernel
version number; some distro kernels are known to have a creative notion
of their own version, which is not really related to the features they
actually support (I'm sure you know which one I'm referring to ;)).

-Toke
Björn Töpel Jan. 20, 2021, 1:25 p.m. UTC | #2
On 2021-01-20 13:52, Toke Høiland-Jørgensen wrote:
> Björn Töpel <bjorn.topel@gmail.com> writes:

> 

>> From: Björn Töpel <bjorn.topel@intel.com>

>>

>> Add detection for kernel version, and adapt the BPF program based on

>> kernel support. This way, users will get the best possible performance

>> from the BPF program.

> 

> Please do explicit feature detection instead of relying on the kernel

> version number; some distro kernels are known to have a creative notion

> of their own version, which is not really related to the features they

> actually support (I'm sure you know which one I'm referring to ;)).

>


Right. For a *new* helper, like bpf_redirect_xsk, we rely on rejection
from the verifier to detect support. What about "bpf_redirect_map() now
supports passing return value as flags"? Any ideas how to do that in a
robust, non-version number-based scheme?


Thanks,
Björn
Björn Töpel Jan. 20, 2021, 2:49 p.m. UTC | #3
On 2021-01-20 14:25, Björn Töpel wrote:
> On 2021-01-20 13:52, Toke Høiland-Jørgensen wrote:

>> Björn Töpel <bjorn.topel@gmail.com> writes:

>>

>>> From: Björn Töpel <bjorn.topel@intel.com>

>>>

>>> Add detection for kernel version, and adapt the BPF program based on

>>> kernel support. This way, users will get the best possible performance

>>> from the BPF program.

>>

>> Please do explicit feature detection instead of relying on the kernel

>> version number; some distro kernels are known to have a creative notion

>> of their own version, which is not really related to the features they

>> actually support (I'm sure you know which one I'm referring to ;)).

>>

> 

> Right. For a *new* helper, like bpf_redirect_xsk, we rely on rejection

> from the verifier to detect support. What about "bpf_redirect_map() now

> supports passing return value as flags"? Any ideas how to do that in a

> robust, non-version number-based scheme?

>


Just so that I understand this correctly. Red^WSome distro vendors
backport the world, and call that franken kernel, say, 3.10. Is that
interpretation correct? My hope was that wasn't the case. :-(

Would it make sense with some kind of BPF-specific "supported features"
mechanism? Something else with a bigger scope (whole kernel)?



Cheers,
Björn
Toke Høiland-Jørgensen Jan. 20, 2021, 2:56 p.m. UTC | #4
Björn Töpel <bjorn.topel@intel.com> writes:

> On 2021-01-20 13:52, Toke Høiland-Jørgensen wrote:

>> Björn Töpel <bjorn.topel@gmail.com> writes:

>> 

>>> From: Björn Töpel <bjorn.topel@intel.com>

>>>

>>> Add detection for kernel version, and adapt the BPF program based on

>>> kernel support. This way, users will get the best possible performance

>>> from the BPF program.

>> 

>> Please do explicit feature detection instead of relying on the kernel

>> version number; some distro kernels are known to have a creative notion

>> of their own version, which is not really related to the features they

>> actually support (I'm sure you know which one I'm referring to ;)).

>>

>

> Right. For a *new* helper, like bpf_redirect_xsk, we rely on rejection

> from the verifier to detect support. What about "bpf_redirect_map() now

> supports passing return value as flags"? Any ideas how to do that in a

> robust, non-version number-based scheme?


Well, having a BPF program pass in a flag of '1' with an invalid lookup
and checking if it returns 1 or 0. But how to do that from libbpf, hmm,
good question. BPF_PROG_RUN()?

An alternative could be to default to a program that will handle both
cases in the BPF code, and make it opt-in to use the optimised versions
if the user knows their kernel supports it?

-Toke
Toke Høiland-Jørgensen Jan. 20, 2021, 3:11 p.m. UTC | #5
Björn Töpel <bjorn.topel@intel.com> writes:

> On 2021-01-20 14:25, Björn Töpel wrote:

>> On 2021-01-20 13:52, Toke Høiland-Jørgensen wrote:

>>> Björn Töpel <bjorn.topel@gmail.com> writes:

>>>

>>>> From: Björn Töpel <bjorn.topel@intel.com>

>>>>

>>>> Add detection for kernel version, and adapt the BPF program based on

>>>> kernel support. This way, users will get the best possible performance

>>>> from the BPF program.

>>>

>>> Please do explicit feature detection instead of relying on the kernel

>>> version number; some distro kernels are known to have a creative notion

>>> of their own version, which is not really related to the features they

>>> actually support (I'm sure you know which one I'm referring to ;)).

>>>

>> 

>> Right. For a *new* helper, like bpf_redirect_xsk, we rely on rejection

>> from the verifier to detect support. What about "bpf_redirect_map() now

>> supports passing return value as flags"? Any ideas how to do that in a

>> robust, non-version number-based scheme?

>>

>

> Just so that I understand this correctly. Red^WSome distro vendors

> backport the world, and call that franken kernel, say, 3.10. Is that

> interpretation correct? My hope was that wasn't the case. :-(


Yup, indeed. All kernels shipped for the entire lifetime of RHEL8 think
they are v4.18.0... :/

I don't think we're the only ones doing it (there are examples in the
embedded world as well, for instance, and not sure about the other
enterprise distros), but RHEL is probably the most extreme example.

We could patch the version check in the distro-supplied version of
libbpf, of course, but that doesn't help anyone using upstream versions,
and given the prevalence of vendoring libbpf, I fear that going with the
version check will just result in a bad user experience...

> Would it make sense with some kind of BPF-specific "supported

> features" mechanism? Something else with a bigger scope (whole

> kernel)?


Heh, in my opinion, yeah. Seems like we'll finally get it for XDP, but
for BPF in general the approach has always been probing AFAICT.

For the particular case of arguments to helpers, I suppose the verifier
could technically validate value ranges for flags arguments, say. That
would be nice as an early reject anyway, but I'm not sure if it is
possible to add after-the-fact without breaking existing programs
because the verifier can't prove the argument is within the valid range.
And of course it doesn't help you with compatibility with
already-released kernels.

-Toke
Björn Töpel Jan. 20, 2021, 3:27 p.m. UTC | #6
On 2021-01-20 16:11, Toke Høiland-Jørgensen wrote:
> Björn Töpel <bjorn.topel@intel.com> writes:

> 

>> On 2021-01-20 14:25, Björn Töpel wrote:

>>> On 2021-01-20 13:52, Toke Høiland-Jørgensen wrote:

>>>> Björn Töpel <bjorn.topel@gmail.com> writes:

>>>>

>>>>> From: Björn Töpel <bjorn.topel@intel.com>

>>>>>

>>>>> Add detection for kernel version, and adapt the BPF program based on

>>>>> kernel support. This way, users will get the best possible performance

>>>>> from the BPF program.

>>>>

>>>> Please do explicit feature detection instead of relying on the kernel

>>>> version number; some distro kernels are known to have a creative notion

>>>> of their own version, which is not really related to the features they

>>>> actually support (I'm sure you know which one I'm referring to ;)).

>>>>

>>>

>>> Right. For a *new* helper, like bpf_redirect_xsk, we rely on rejection

>>> from the verifier to detect support. What about "bpf_redirect_map() now

>>> supports passing return value as flags"? Any ideas how to do that in a

>>> robust, non-version number-based scheme?

>>>

>>

>> Just so that I understand this correctly. Red^WSome distro vendors

>> backport the world, and call that franken kernel, say, 3.10. Is that

>> interpretation correct? My hope was that wasn't the case. :-(

> 

> Yup, indeed. All kernels shipped for the entire lifetime of RHEL8 think

> they are v4.18.0... :/

> 

> I don't think we're the only ones doing it (there are examples in the

> embedded world as well, for instance, and not sure about the other

> enterprise distros), but RHEL is probably the most extreme example.

> 

> We could patch the version check in the distro-supplied version of

> libbpf, of course, but that doesn't help anyone using upstream versions,

> and given the prevalence of vendoring libbpf, I fear that going with the

> version check will just result in a bad user experience...

>


Ok! Thanks for clearing that out!

>> Would it make sense with some kind of BPF-specific "supported

>> features" mechanism? Something else with a bigger scope (whole

>> kernel)?

> 

> Heh, in my opinion, yeah. Seems like we'll finally get it for XDP, but

> for BPF in general the approach has always been probing AFAICT.

> 

> For the particular case of arguments to helpers, I suppose the verifier

> could technically validate value ranges for flags arguments, say. That

> would be nice as an early reject anyway, but I'm not sure if it is

> possible to add after-the-fact without breaking existing programs

> because the verifier can't prove the argument is within the valid range.

> And of course it doesn't help you with compatibility with

> already-released kernels.

>


Hmm, think I have a way forward. I'll use BPF_PROG_TEST_RUN.

If the load fail for the new helper, fallback to bpf_redirect_map(). Use
BPF_PROG_TEST_RUN to make sure that "action via flags" passes.

That should work for you guys as well, right? I'll take a stab at it.


Björn
Toke Høiland-Jørgensen Jan. 20, 2021, 5:30 p.m. UTC | #7
Björn Töpel <bjorn.topel@intel.com> writes:

> On 2021-01-20 16:11, Toke Høiland-Jørgensen wrote:

>> Björn Töpel <bjorn.topel@intel.com> writes:

>> 

>>> On 2021-01-20 14:25, Björn Töpel wrote:

>>>> On 2021-01-20 13:52, Toke Høiland-Jørgensen wrote:

>>>>> Björn Töpel <bjorn.topel@gmail.com> writes:

>>>>>

>>>>>> From: Björn Töpel <bjorn.topel@intel.com>

>>>>>>

>>>>>> Add detection for kernel version, and adapt the BPF program based on

>>>>>> kernel support. This way, users will get the best possible performance

>>>>>> from the BPF program.

>>>>>

>>>>> Please do explicit feature detection instead of relying on the kernel

>>>>> version number; some distro kernels are known to have a creative notion

>>>>> of their own version, which is not really related to the features they

>>>>> actually support (I'm sure you know which one I'm referring to ;)).

>>>>>

>>>>

>>>> Right. For a *new* helper, like bpf_redirect_xsk, we rely on rejection

>>>> from the verifier to detect support. What about "bpf_redirect_map() now

>>>> supports passing return value as flags"? Any ideas how to do that in a

>>>> robust, non-version number-based scheme?

>>>>

>>>

>>> Just so that I understand this correctly. Red^WSome distro vendors

>>> backport the world, and call that franken kernel, say, 3.10. Is that

>>> interpretation correct? My hope was that wasn't the case. :-(

>> 

>> Yup, indeed. All kernels shipped for the entire lifetime of RHEL8 think

>> they are v4.18.0... :/

>> 

>> I don't think we're the only ones doing it (there are examples in the

>> embedded world as well, for instance, and not sure about the other

>> enterprise distros), but RHEL is probably the most extreme example.

>> 

>> We could patch the version check in the distro-supplied version of

>> libbpf, of course, but that doesn't help anyone using upstream versions,

>> and given the prevalence of vendoring libbpf, I fear that going with the

>> version check will just result in a bad user experience...

>>

>

> Ok! Thanks for clearing that out!

>

>>> Would it make sense with some kind of BPF-specific "supported

>>> features" mechanism? Something else with a bigger scope (whole

>>> kernel)?

>> 

>> Heh, in my opinion, yeah. Seems like we'll finally get it for XDP, but

>> for BPF in general the approach has always been probing AFAICT.

>> 

>> For the particular case of arguments to helpers, I suppose the verifier

>> could technically validate value ranges for flags arguments, say. That

>> would be nice as an early reject anyway, but I'm not sure if it is

>> possible to add after-the-fact without breaking existing programs

>> because the verifier can't prove the argument is within the valid range.

>> And of course it doesn't help you with compatibility with

>> already-released kernels.

>>

>

> Hmm, think I have a way forward. I'll use BPF_PROG_TEST_RUN.

>

> If the load fail for the new helper, fallback to bpf_redirect_map(). Use

> BPF_PROG_TEST_RUN to make sure that "action via flags" passes.

>

> That should work for you guys as well, right? I'll take a stab at it.


Yup, think so - SGTM! :)

-Toke
Alexei Starovoitov Jan. 20, 2021, 6:25 p.m. UTC | #8
On Wed, Jan 20, 2021 at 7:27 AM Björn Töpel <bjorn.topel@intel.com> wrote:
>

> >> Would it make sense with some kind of BPF-specific "supported

> >> features" mechanism? Something else with a bigger scope (whole

> >> kernel)?

> >

> > Heh, in my opinion, yeah. Seems like we'll finally get it for XDP, but

> > for BPF in general the approach has always been probing AFAICT.

> >

> > For the particular case of arguments to helpers, I suppose the verifier

> > could technically validate value ranges for flags arguments, say. That

> > would be nice as an early reject anyway, but I'm not sure if it is

> > possible to add after-the-fact without breaking existing programs

> > because the verifier can't prove the argument is within the valid range.

> > And of course it doesn't help you with compatibility with

> > already-released kernels.

> >

>

> Hmm, think I have a way forward. I'll use BPF_PROG_TEST_RUN.

>

> If the load fail for the new helper, fallback to bpf_redirect_map(). Use

> BPF_PROG_TEST_RUN to make sure that "action via flags" passes.


+1 to Toke's point. No version checks please.
One way to detect is to try prog_load. Search for FEAT_* in libbpf.
Another approach is to scan vmlinux BTF for necessary helpers.
Currently libbpf is relying on the former.
I think going forward would be good to detect features via BTF.
It's going to be much faster and won't create noise for audit that
could be looking at prog_load calls.
Björn Töpel Jan. 20, 2021, 6:30 p.m. UTC | #9
On 2021-01-20 19:25, Alexei Starovoitov wrote:
> On Wed, Jan 20, 2021 at 7:27 AM Björn Töpel <bjorn.topel@intel.com> wrote:

>>

>>>> Would it make sense with some kind of BPF-specific "supported

>>>> features" mechanism? Something else with a bigger scope (whole

>>>> kernel)?

>>>

>>> Heh, in my opinion, yeah. Seems like we'll finally get it for XDP, but

>>> for BPF in general the approach has always been probing AFAICT.

>>>

>>> For the particular case of arguments to helpers, I suppose the verifier

>>> could technically validate value ranges for flags arguments, say. That

>>> would be nice as an early reject anyway, but I'm not sure if it is

>>> possible to add after-the-fact without breaking existing programs

>>> because the verifier can't prove the argument is within the valid range.

>>> And of course it doesn't help you with compatibility with

>>> already-released kernels.

>>>

>>

>> Hmm, think I have a way forward. I'll use BPF_PROG_TEST_RUN.

>>

>> If the load fail for the new helper, fallback to bpf_redirect_map(). Use

>> BPF_PROG_TEST_RUN to make sure that "action via flags" passes.

> 

> +1 to Toke's point. No version checks please.

> One way to detect is to try prog_load. Search for FEAT_* in libbpf.

> Another approach is to scan vmlinux BTF for necessary helpers.

> Currently libbpf is relying on the former.

> I think going forward would be good to detect features via BTF.

> It's going to be much faster and won't create noise for audit that

> could be looking at prog_load calls.

> 


Thanks Alexei. I'll explore both options for the next spin!


Björn
diff mbox series

Patch

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 2abbc3800568..6a53adf14a9c 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -693,7 +693,7 @@  bpf_object__add_programs(struct bpf_object *obj, Elf_Data *sec_data,
 	return 0;
 }
 
-static __u32 get_kernel_version(void)
+__u32 get_kernel_version(void)
 {
 	__u32 major, minor, patch;
 	struct utsname info;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 969d0ac592ba..dafb780e2dd2 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -349,4 +349,6 @@  struct bpf_core_relo {
 	enum bpf_core_relo_kind kind;
 };
 
+__u32 get_kernel_version(void);
+
 #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index ecaae2927ab8..aae0231371d0 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -48,22 +48,6 @@  static int get_vendor_id(int ifindex)
 	return strtol(buf, NULL, 0);
 }
 
-static int get_kernel_version(void)
-{
-	int version, subversion, patchlevel;
-	struct utsname utsn;
-
-	/* Return 0 on failure, and attempt to probe with empty kversion */
-	if (uname(&utsn))
-		return 0;
-
-	if (sscanf(utsn.release, "%d.%d.%d",
-		   &version, &subversion, &patchlevel) != 3)
-		return 0;
-
-	return (version << 16) + (subversion << 8) + patchlevel;
-}
-
 static void
 probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
 	   size_t insns_cnt, char *buf, size_t buf_len, __u32 ifindex)
diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index e3e41ceeb1bc..c8642c6cb5d6 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -23,6 +23,7 @@ 
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <linux/sockios.h>
+#include <linux/version.h>
 #include <net/if.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
@@ -46,6 +47,11 @@ 
  #define PF_XDP AF_XDP
 #endif
 
+enum xsk_prog {
+	XSK_PROG_FALLBACK,
+	XSK_PROG_REDIRECT_FLAGS,
+};
+
 struct xsk_umem {
 	struct xsk_ring_prod *fill_save;
 	struct xsk_ring_cons *comp_save;
@@ -351,6 +357,13 @@  int xsk_umem__create_v0_0_2(struct xsk_umem **umem_ptr, void *umem_area,
 COMPAT_VERSION(xsk_umem__create_v0_0_2, xsk_umem__create, LIBBPF_0.0.2)
 DEFAULT_VERSION(xsk_umem__create_v0_0_4, xsk_umem__create, LIBBPF_0.0.4)
 
+static enum xsk_prog get_xsk_prog(void)
+{
+	__u32 kver = get_kernel_version();
+
+	return kver < KERNEL_VERSION(5, 3, 0) ? XSK_PROG_FALLBACK : XSK_PROG_REDIRECT_FLAGS;
+}
+
 static int xsk_load_xdp_prog(struct xsk_socket *xsk)
 {
 	static const int log_buf_size = 16 * 1024;
@@ -358,7 +371,7 @@  static int xsk_load_xdp_prog(struct xsk_socket *xsk)
 	char log_buf[log_buf_size];
 	int err, prog_fd;
 
-	/* This is the C-program:
+	/* This is the fallback C-program:
 	 * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
 	 * {
 	 *     int ret, index = ctx->rx_queue_index;
@@ -414,9 +427,31 @@  static int xsk_load_xdp_prog(struct xsk_socket *xsk)
 		/* The jumps are to this instruction */
 		BPF_EXIT_INSN(),
 	};
-	size_t insns_cnt = sizeof(prog) / sizeof(struct bpf_insn);
 
-	prog_fd = bpf_load_program(BPF_PROG_TYPE_XDP, prog, insns_cnt,
+	/* This is the post-5.3 kernel C-program:
+	 * SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
+	 * {
+	 *     return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS);
+	 * }
+	 */
+	struct bpf_insn prog_redirect_flags[] = {
+		/* r2 = *(u32 *)(r1 + 16) */
+		BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 16),
+		/* r1 = xskmap[] */
+		BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd),
+		/* r3 = XDP_PASS */
+		BPF_MOV64_IMM(BPF_REG_3, 2),
+		/* call bpf_redirect_map */
+		BPF_EMIT_CALL(BPF_FUNC_redirect_map),
+		BPF_EXIT_INSN(),
+	};
+	size_t insns_cnt[] = {sizeof(prog) / sizeof(struct bpf_insn),
+			      sizeof(prog_redirect_flags) / sizeof(struct bpf_insn),
+	};
+	struct bpf_insn *progs[] = {prog, prog_redirect_flags};
+	enum xsk_prog option = get_xsk_prog();
+
+	prog_fd = bpf_load_program(BPF_PROG_TYPE_XDP, progs[option], insns_cnt[option],
 				   "LGPL-2.1 or BSD-2-Clause", 0, log_buf,
 				   log_buf_size);
 	if (prog_fd < 0) {