mbox series

[bpf-next,0/3] xdpsock: allow mmap2 usage for 32bits

Message ID 20190813102318.5521-1-ivan.khoronzhuk@linaro.org
Headers show
Series xdpsock: allow mmap2 usage for 32bits | expand

Message

Ivan Khoronzhuk Aug. 13, 2019, 10:23 a.m. UTC
This patchset contains several improvements for af_xdp socket umem
mappings for 32bit systems. Also, there is one more patch outside of
othis series that can be applied to another tree and related to mmap2
af_xdp umem offsets:
"mm: mmap: increase sockets maximum memory size pgoff for 32bits"
https://lkml.org/lkml/2019/8/12/549

Based on bpf-next/master

Ivan Khoronzhuk (3):
  libbpf: add asm/unistd.h to xsk to get __NR_mmap2
  xdp: xdp_umem: replace kmap on vmap for umem map
  samples: bpf: syscal_nrs: use mmap2 if defined

 net/xdp/xdp_umem.c         | 16 ++++++++++++----
 samples/bpf/syscall_nrs.c  |  5 +++++
 samples/bpf/tracex5_kern.c | 11 +++++++++++
 tools/lib/bpf/xsk.c        |  1 +
 4 files changed, 29 insertions(+), 4 deletions(-)

-- 
2.17.1

Comments

Jonathan Lemon Aug. 13, 2019, 5:41 p.m. UTC | #1
On 13 Aug 2019, at 3:23, Ivan Khoronzhuk wrote:

> For arm32 xdp sockets mmap2 is preferred, so use it if it's defined.

>

> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>


Doesn't this change the application API?
-- 
Jonathan


> ---

>  samples/bpf/syscall_nrs.c  |  5 +++++

>  samples/bpf/tracex5_kern.c | 11 +++++++++++

>  2 files changed, 16 insertions(+)

>

> diff --git a/samples/bpf/syscall_nrs.c b/samples/bpf/syscall_nrs.c

> index 516e255cbe8f..2dec94238350 100644

> --- a/samples/bpf/syscall_nrs.c

> +++ b/samples/bpf/syscall_nrs.c

> @@ -9,5 +9,10 @@ void syscall_defines(void)

>  	COMMENT("Linux system call numbers.");

>  	SYSNR(__NR_write);

>  	SYSNR(__NR_read);

> +#ifdef __NR_mmap2

> +	SYSNR(__NR_mmap2);

> +#else

>  	SYSNR(__NR_mmap);

> +#endif

> +

>  }

> diff --git a/samples/bpf/tracex5_kern.c b/samples/bpf/tracex5_kern.c

> index f57f4e1ea1ec..300350ad299a 100644

> --- a/samples/bpf/tracex5_kern.c

> +++ b/samples/bpf/tracex5_kern.c

> @@ -68,12 +68,23 @@ PROG(SYS__NR_read)(struct pt_regs *ctx)

>  	return 0;

>  }

>

> +#ifdef __NR_mmap2

> +PROG(SYS__NR_mmap2)(struct pt_regs *ctx)

> +{

> +	char fmt[] = "mmap2\n";

> +

> +	bpf_trace_printk(fmt, sizeof(fmt));

> +	return 0;

> +}

> +#else

>  PROG(SYS__NR_mmap)(struct pt_regs *ctx)

>  {

>  	char fmt[] = "mmap\n";

> +

>  	bpf_trace_printk(fmt, sizeof(fmt));

>  	return 0;

>  }

> +#endif

>

>  char _license[] SEC("license") = "GPL";

>  u32 _version SEC("version") = LINUX_VERSION_CODE;

> -- 

> 2.17.1
Andrii Nakryiko Aug. 13, 2019, 11:38 p.m. UTC | #2
On Tue, Aug 13, 2019 at 3:24 AM Ivan Khoronzhuk
<ivan.khoronzhuk@linaro.org> wrote:
>

> That's needed to get __NR_mmap2 when mmap2 syscall is used.

>

> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

> ---

>  tools/lib/bpf/xsk.c | 1 +

>  1 file changed, 1 insertion(+)

>

> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c

> index 5007b5d4fd2c..f2fc40f9804c 100644

> --- a/tools/lib/bpf/xsk.c

> +++ b/tools/lib/bpf/xsk.c

> @@ -12,6 +12,7 @@

>  #include <stdlib.h>

>  #include <string.h>

>  #include <unistd.h>

> +#include <asm/unistd.h>


asm/unistd.h is not present in Github libbpf projection. Is there any
way to avoid including this header? Generally, libbpf can't easily use
all of kernel headers, we need to re-implemented all the extra used
stuff for Github version of libbpf, so we try to minimize usage of new
headers that are not just plain uapi headers from include/uapi.

>  #include <arpa/inet.h>

>  #include <asm/barrier.h>

>  #include <linux/compiler.h>

> --

> 2.17.1

>
Yonghong Song Aug. 14, 2019, 12:32 a.m. UTC | #3
On 8/13/19 3:23 AM, Ivan Khoronzhuk wrote:
> That's needed to get __NR_mmap2 when mmap2 syscall is used.


It seems I did not have this issue on x64 machine e.g., Fedora 29.
My glibc version is 2.28. gcc 8.2.1.

What is your particular system glibc version?
So needing kernel asm/unistd.h is because of older glibc on your
system, or something else? Could you clarify?

> 

> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

> ---

>   tools/lib/bpf/xsk.c | 1 +

>   1 file changed, 1 insertion(+)

> 

> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c

> index 5007b5d4fd2c..f2fc40f9804c 100644

> --- a/tools/lib/bpf/xsk.c

> +++ b/tools/lib/bpf/xsk.c

> @@ -12,6 +12,7 @@

>   #include <stdlib.h>

>   #include <string.h>

>   #include <unistd.h>

> +#include <asm/unistd.h>

>   #include <arpa/inet.h>

>   #include <asm/barrier.h>

>   #include <linux/compiler.h>

>
Björn Töpel Aug. 14, 2019, 1:32 p.m. UTC | #4
On Wed, 14 Aug 2019 at 13:57, Ivan Khoronzhuk
<ivan.khoronzhuk@linaro.org> wrote:
>

> On Wed, Aug 14, 2019 at 12:24:05PM +0300, Ivan Khoronzhuk wrote:

> >On Tue, Aug 13, 2019 at 04:38:13PM -0700, Andrii Nakryiko wrote:

> >

> >Hi, Andrii

> >

> >>On Tue, Aug 13, 2019 at 3:24 AM Ivan Khoronzhuk

> >><ivan.khoronzhuk@linaro.org> wrote:

> >>>

> >>>That's needed to get __NR_mmap2 when mmap2 syscall is used.

> >>>

> >>>Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

> >>>---

> >>> tools/lib/bpf/xsk.c | 1 +

> >>> 1 file changed, 1 insertion(+)

> >>>

> >>>diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c

> >>>index 5007b5d4fd2c..f2fc40f9804c 100644

> >>>--- a/tools/lib/bpf/xsk.c

> >>>+++ b/tools/lib/bpf/xsk.c

> >>>@@ -12,6 +12,7 @@

> >>> #include <stdlib.h>

> >>> #include <string.h>

> >>> #include <unistd.h>

> >>>+#include <asm/unistd.h>

> >>

> >>asm/unistd.h is not present in Github libbpf projection. Is there any

> >

> >Look on includes from

> >tools/lib/bpf/libpf.c

> >tools/lib/bpf/bpf.c

> >

> >That's how it's done... Copping headers to arch/arm will not

> >solve this, it includes both of them anyway, and anyway it needs

> >asm/unistd.h inclusion here, only because xsk.c needs __NR_*

> >

> >

>

> There is one more radical solution for this I can send, but I'm not sure how it

> can impact on other syscals/arches...

>

> Looks like:

>

>

> diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile

> index 9312066a1ae3..8b2f8ff7ce44 100644

> --- a/tools/lib/bpf/Makefile

> +++ b/tools/lib/bpf/Makefile

> @@ -113,6 +113,7 @@ override CFLAGS += -Werror -Wall

>  override CFLAGS += -fPIC

>  override CFLAGS += $(INCLUDES)

>  override CFLAGS += -fvisibility=hidden

> +override CFLAGS += -D_FILE_OFFSET_BITS=64

>


Hmm, isn't this glibc-ism? Does is it work for, say, musl or bionic?

If this is portable, and works on 32-, and 64-bit archs, I'm happy
with the patch. :-)


Björn

>  ifeq ($(VERBOSE),1)

>    Q =

> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c

> index f2fc40f9804c..ff2d03b8380d 100644

> --- a/tools/lib/bpf/xsk.c

> +++ b/tools/lib/bpf/xsk.c

> @@ -75,23 +75,6 @@ struct xsk_nl_info {

>         int fd;

>  };

>

> -/* For 32-bit systems, we need to use mmap2 as the offsets are 64-bit.

> - * Unfortunately, it is not part of glibc.

> - */

> -static inline void *xsk_mmap(void *addr, size_t length, int prot, int flags,

> -                            int fd, __u64 offset)

> -{

> -#ifdef __NR_mmap2

> -       unsigned int page_shift = __builtin_ffs(getpagesize()) - 1;

> -       long ret = syscall(__NR_mmap2, addr, length, prot, flags, fd,

> -                          (off_t)(offset >> page_shift));

> -

> -       return (void *)ret;

> -#else

> -       return mmap(addr, length, prot, flags, fd, offset);

> -#endif

> -}

> -

>  int xsk_umem__fd(const struct xsk_umem *umem)

>  {

>         return umem ? umem->fd : -EINVAL;

> @@ -211,10 +194,9 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,

>                 goto out_socket;

>         }

>

> -       map = xsk_mmap(NULL, off.fr.desc +

> -                      umem->config.fill_size * sizeof(__u64),

> -                      PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,

> -                      umem->fd, XDP_UMEM_PGOFF_FILL_RING);

> +       map = mmap(NULL, off.fr.desc + umem->config.fill_size * sizeof(__u64),

> +                  PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, umem->fd,

> +                  XDP_UMEM_PGOFF_FILL_RING);

>         if (map == MAP_FAILED) {

>                 err = -errno;

>                 goto out_socket;

> @@ -228,10 +210,9 @@ int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, __u64 size,

>         fill->ring = map + off.fr.desc;

>         fill->cached_cons = umem->config.fill_size;

>

> -       map = xsk_mmap(NULL,

> -                      off.cr.desc + umem->config.comp_size * sizeof(__u64),

> -                      PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,

> -                      umem->fd, XDP_UMEM_PGOFF_COMPLETION_RING);

> +       map = mmap(NULL, off.cr.desc + umem->config.comp_size * sizeof(__u64),

> +                  PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, umem->fd,

> +                  XDP_UMEM_PGOFF_COMPLETION_RING);

>         if (map == MAP_FAILED) {

>                 err = -errno;

>                 goto out_mmap;

> @@ -552,11 +533,10 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,

>         }

>

>         if (rx) {

> -               rx_map = xsk_mmap(NULL, off.rx.desc +

> -                                 xsk->config.rx_size * sizeof(struct xdp_desc),

> -                                 PROT_READ | PROT_WRITE,

> -                                 MAP_SHARED | MAP_POPULATE,

> -                                 xsk->fd, XDP_PGOFF_RX_RING);

> +               rx_map = mmap(NULL, off.rx.desc +

> +                             xsk->config.rx_size * sizeof(struct xdp_desc),

> +                             PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,

> +                             xsk->fd, XDP_PGOFF_RX_RING);

>                 if (rx_map == MAP_FAILED) {

>                         err = -errno;

>                         goto out_socket;

> @@ -571,11 +551,10 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,

>         xsk->rx = rx;

>

>         if (tx) {

> -               tx_map = xsk_mmap(NULL, off.tx.desc +

> -                                 xsk->config.tx_size * sizeof(struct xdp_desc),

> -                                 PROT_READ | PROT_WRITE,

> -                                 MAP_SHARED | MAP_POPULATE,

> -                                 xsk->fd, XDP_PGOFF_TX_RING);

> +               tx_map = mmap(NULL, off.tx.desc +

> +                             xsk->config.tx_size * sizeof(struct xdp_desc),

> +                             PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE,

> +                             xsk->fd, XDP_PGOFF_TX_RING);

>                 if (tx_map == MAP_FAILED) {

>                         err = -errno;

>                         goto out_mmap_rx;

>

>

> If maintainers are ready to accept this I can send.

> What do you say?

>

> --

> Regards,

> Ivan Khoronzhuk
Yonghong Song Aug. 14, 2019, 3:51 p.m. UTC | #5
On 8/14/19 2:24 AM, Ivan Khoronzhuk wrote:
> On Tue, Aug 13, 2019 at 04:38:13PM -0700, Andrii Nakryiko wrote:

> 

> Hi, Andrii

> 

>> On Tue, Aug 13, 2019 at 3:24 AM Ivan Khoronzhuk

>> <ivan.khoronzhuk@linaro.org> wrote:

>>>

>>> That's needed to get __NR_mmap2 when mmap2 syscall is used.

>>>

>>> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>

>>> ---

>>>  tools/lib/bpf/xsk.c | 1 +

>>>  1 file changed, 1 insertion(+)

>>>

>>> diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c

>>> index 5007b5d4fd2c..f2fc40f9804c 100644

>>> --- a/tools/lib/bpf/xsk.c

>>> +++ b/tools/lib/bpf/xsk.c

>>> @@ -12,6 +12,7 @@

>>>  #include <stdlib.h>

>>>  #include <string.h>

>>>  #include <unistd.h>

>>> +#include <asm/unistd.h>

>>

>> asm/unistd.h is not present in Github libbpf projection. Is there any

> 

> Look on includes from

> tools/lib/bpf/libpf.c

> tools/lib/bpf/bpf.c

> 

> That's how it's done... Copping headers to arch/arm will not

> solve this, it includes both of them anyway, and anyway it needs

> asm/unistd.h inclusion here, only because xsk.c needs __NR_*

> 

> 

>> way to avoid including this header? Generally, libbpf can't easily use

>> all of kernel headers, we need to re-implemented all the extra used

>> stuff for Github version of libbpf, so we try to minimize usage of new

>> headers that are not just plain uapi headers from include/uapi.

> 

> Yes I know, it's far away from real number of changes needed.

> I faced enough about this already and kernel headers, especially

> for arm32 it's a bit decency problem. But this patch it's part of

> normal one. I have couple issues despite this normally fixed mmap2

> that is the same even if uapi includes are coppied to tools/arch/arm.

> 

> In continuation of kernel headers inclusion and arm build:

> 

> For instance, what about this rough "kernel headers" hack:

> https://github.com/ikhorn/af_xdp_stuff/commit/aa645ccca4d844f404ec3c2b27402d4d7848d1b5 


The ".syntax unified" is mentioned a couple of times
in bcc mailing list as well. llvm bpf backend might
be able to solve it. I have not looked at the details though.

> 

> or this one related for arm32 only:

> https://github.com/ikhorn/af_xdp_stuff/commit/2c6c6d538605aac39600dcb3c9b66de11c70b963 


This may not work if bpf program tries to handle kernel headers.
bpf program may get wrong layout.

Anyway, the above two comments are irrelevant to this patch set
and if needed should be discussed separately.

> 

> 

> I have more...

> 

>>

>>>  #include <arpa/inet.h>

>>>  #include <asm/barrier.h>

>>>  #include <linux/compiler.h>

>>> -- 

>>> 2.17.1

>>>

>