
[bpf-next,v1,0/4] bpf, sockmap: Fix data loss and panic issues

Message ID 20250407142234.47591-1-jiayuan.chen@linux.dev

Message

Jiayuan Chen April 7, 2025, 2:21 p.m. UTC
I was writing a benchmark based on sockmap + TCP and discovered several
issues:

1. When EAGAIN occurs, the direction of the skb is incorrect, causing data
   loss on retry.
2. When sending partial data, the offset is not recorded, leading to
   duplicate data being sent on retry.
3. An unexpected BUG_ON() check in skb_linearize is triggered.
4. The memory of psock->ingress_skb is not limited by the socket buffer
   and memcg.

Issues 1, 2, and 3 are described in each patch's commit message.
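
To make issues 1 and 2 concrete, here is a minimal user-space sketch of
the retry bookkeeping they call for. It is not the code in
net/core/skmsg.c; struct work_state, struct sink, sink_send() and
send_with_resume() are hypothetical names invented for this
illustration. The idea shown is only that the offset, the remaining
length and the direction are stored with the pending item, so a retry
after EAGAIN resumes where the previous attempt stopped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-item retry state (illustrative, not the kernel's). */
struct work_state {
	size_t off;     /* bytes already sent */
	size_t len;     /* bytes still to send */
	int    ingress; /* direction chosen when the item was first queued */
};

/* Hypothetical sink that accepts at most `room` bytes before blocking. */
struct sink {
	char   buf[64];
	size_t used;
	size_t room;
};

/* Copy up to `room` bytes; return bytes taken or -EAGAIN if none fit. */
static long sink_send(struct sink *s, const char *data, size_t len)
{
	size_t n = len < s->room ? len : s->room;

	if (n > sizeof(s->buf) - s->used)
		n = sizeof(s->buf) - s->used;
	if (n == 0)
		return -EAGAIN;
	memcpy(s->buf + s->used, data, n);
	s->used += n;
	s->room -= n;
	return (long)n;
}

/*
 * Send as much as possible. On -EAGAIN the state keeps off/len/ingress
 * unchanged since the last successful chunk, so the caller can retry
 * later without losing or duplicating data.
 */
static int send_with_resume(struct sink *s, const char *data,
			    struct work_state *st)
{
	while (st->len > 0) {
		long n = sink_send(s, data + st->off, st->len);

		if (n < 0)
			return (int)n;
		assert((size_t)n <= st->len);
		st->off += (size_t)n;
		st->len -= (size_t)n;
	}
	return 0;
}
```

A caller would keep one struct work_state per pending item and call
send_with_resume() again once the sink has room. Without the saved
offset, the retry would start from byte 0 and resend data that was
already delivered.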

Regarding issue 4, this patchset does not cover it as it is difficult to
handle in practice, and I am still working on it.

Here is a brief description of the issue:
When using sockmap for skb/stream redirection, if the receiving end does
not perform read operations, all data will be buffered in ingress_skb.

For example:
'''
// set memory limit to 5000M
cgcreate -g memory:myGroup
cgset -r memory.max="5000M" myGroup

// start benchmark and disable consumer from reading
cgexec -g "memory:myGroup" ./bench sockmap -c 2 -p 1 -a --rx-verdict-ingress --delay-consumer=-1 -d 100
Iter   0 ( 29.179us): Send Speed 2668.548 MB/s (20360.406 calls/s), ... Rcv Speed    0.000 MB/s (   0.000 calls/s)
Iter   1 ( -7.237us): Send Speed 2694.467 MB/s (20557.149 calls/s), ... Rcv Speed    0.000 MB/s (   0.000 calls/s)
Iter   2 ( -1.918us): Send Speed 2693.404 MB/s (20548.039 calls/s), ... Rcv Speed    0.000 MB/s (   0.000 calls/s)
Iter   3 ( -0.684us): Send Speed 2693.138 MB/s (20548.014 calls/s), ... Rcv Speed    0.000 MB/s (   0.000 calls/s)
Iter   4 (  7.879us): Send Speed 2698.620 MB/s (20588.838 calls/s), ... Rcv Speed    0.000 MB/s (   0.000 calls/s)
Iter   5 ( -3.224us): Send Speed 2696.553 MB/s (20573.066 calls/s), ... Rcv Speed    0.000 MB/s (   0.000 calls/s)
Iter   6 ( -5.409us): Send Speed 2699.705 MB/s (20597.111 calls/s), ... Rcv Speed    0.000 MB/s (   0.000 calls/s)
Iter   7 ( -0.439us): Send Speed 2699.691 MB/s (20597.009 calls/s), ... Rcv Speed    0.000 MB/s (   0.000 calls/s)
...

// memory usage is not limited
cat /proc/slabinfo | grep skb
skbuff_small_head   11824024 11824024    704   46    8 : tunables    0    0    0 : slabdata 257044 257044      0
skbuff_fclone_cache 11822080 11822080    512   32    4 : tunables    0    0    0 : slabdata 369440 369440      0
'''
Thus, a simple socket in a large file upload/download model can eat the
entire OS memory.

We must charge the skb memory to psock->sk, and if we do not want to lose
skbs, we need to feed the error back to read_sock/read_skb when the
enqueue operation on psock->ingress_skb fails.
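
As a rough user-space model of that idea (not a proposed kernel change),
the sketch below charges each enqueued packet against a byte budget and
returns an error to the producer when the budget is exhausted. struct
budget_queue, queue_charge(), queue_uncharge() and produce() are
hypothetical names, and the choice of -ENOMEM as the error is an
assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical receive-side budget; not the kernel's accounting API. */
struct budget_queue {
	size_t charged; /* bytes currently accounted to the queue */
	size_t limit;   /* stand-in for the socket buffer / memcg limit */
};

/* Account `bytes` to the queue, or fail if the limit would be exceeded. */
static int queue_charge(struct budget_queue *q, size_t bytes)
{
	if (bytes > q->limit - q->charged)
		return -ENOMEM; /* producer must stop and retry later */
	q->charged += bytes;
	return 0;
}

/* Release `bytes` once the consumer has read them. */
static void queue_uncharge(struct budget_queue *q, size_t bytes)
{
	assert(bytes <= q->charged);
	q->charged -= bytes;
}

/*
 * Try to enqueue `n` packets of `pkt` bytes each. Stop at the first
 * failed charge and report it through `err`, so the reader side can
 * back off instead of buffering without bound.
 */
static size_t produce(struct budget_queue *q, size_t pkt, size_t n, int *err)
{
	size_t accepted = 0;

	*err = 0;
	while (accepted < n) {
		*err = queue_charge(q, pkt);
		if (*err)
			break;
		accepted++;
	}
	return accepted;
}
```

With a limit of 1000 bytes and 300-byte packets, the fourth enqueue
fails and the producer sees the error. Once the consumer uncharges one
packet, the next enqueue succeeds again.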

---
Another stability-related patch of mine is also waiting for review, if
the maintainers can spare some time from their busy schedules:
https://lore.kernel.org/bpf/20250317092257.68760-1-jiayuan.chen@linux.dev/T/#t


Jiayuan Chen (4):
  bpf, sockmap: Fix data lost during EAGAIN retries
  bpf, sockmap: fix duplicated data transmission
  bpf, sockmap: Fix panic when calling skb_linearize
  selftest/bpf/benchs: Add benchmark for sockmap usage

 net/core/skmsg.c                              |  48 +-
 tools/testing/selftests/bpf/Makefile          |   2 +
 tools/testing/selftests/bpf/bench.c           |   4 +
 .../selftests/bpf/benchs/bench_sockmap.c      | 599 ++++++++++++++++++
 .../selftests/bpf/progs/bench_sockmap_prog.c  |  65 ++
 5 files changed, 697 insertions(+), 21 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_sockmap.c
 create mode 100644 tools/testing/selftests/bpf/progs/bench_sockmap_prog.c

Comments

patchwork-bot+netdevbpf@kernel.org April 10, 2025, 3:10 a.m. UTC | #1
Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon,  7 Apr 2025 22:21:19 +0800 you wrote:
> I was writing a benchmark based on sockmap + TCP and discovered several
> issues:
> 
> 1. When EAGAIN occurs, the direction of skb is incorrect, causing data
>    loss when retry.
> 2. When sending partial data, the offset is not recorded, leading to
>    duplicate data being sent when retry.
> 3. An unexpected BUG_ON() judgment in skb_linearize is triggered.
> 4. The memory of psock->ingress_skb is not limited by the socket buffer
>    and memcg.
> 
> [...]

Here is the summary with links:
  - [bpf-next,v1,1/4] bpf, sockmap: Fix data lost during EAGAIN retries
    https://git.kernel.org/bpf/bpf-next/c/7683167196bd
  - [bpf-next,v1,2/4] bpf, sockmap: fix duplicated data transmission
    https://git.kernel.org/bpf/bpf-next/c/3b4f14b79428
  - [bpf-next,v1,3/4] bpf, sockmap: Fix panic when calling skb_linearize
    https://git.kernel.org/bpf/bpf-next/c/5ca2e29f6834
  - [bpf-next,v1,4/4] selftest/bpf/benchs: Add benchmark for sockmap usage
    https://git.kernel.org/bpf/bpf-next/c/7b2fa44de5e7

You are awesome, thank you!

John Fastabend April 10, 2025, 5:50 a.m. UTC | #2
On 2025-04-10 03:10:37, patchwork-bot+netdevbpf@kernel.org wrote:
> Hello:
> 
> This series was applied to bpf/bpf-next.git (master)
> by Alexei Starovoitov <ast@kernel.org>:
> 
> On Mon,  7 Apr 2025 22:21:19 +0800 you wrote:
> > I was writing a benchmark based on sockmap + TCP and discovered several
> > issues:
> > 
> > 1. When EAGAIN occurs, the direction of skb is incorrect, causing data
> >    loss when retry.
> > 2. When sending partial data, the offset is not recorded, leading to
> >    duplicate data being sent when retry.
> > 3. An unexpected BUG_ON() judgment in skb_linearize is triggered.
> > 4. The memory of psock->ingress_skb is not limited by the socket buffer
> >    and memcg.
> > 
> > [...]

LGTM thanks for the fixes Jiayuan. Good to see someone working through
all the cases.

Already merged, but ACK from me.

