[bpf,v4,2/2] bpf, sockmap: sk_prot needs inuse_idx set for proc stats

Message ID 20210712195546.423990-3-john.fastabend@gmail.com
State New
Series bpf, sockmap: fix potential memory leak

Commit Message

John Fastabend July 12, 2021, 7:55 p.m. UTC
Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
We currently do not set this correctly from sockmap side. The result is
reading sock stats '/proc/net/sockstat' gives incorrect values. The
socket counter is incremented correctly, but because we don't set the
counter correctly when we replace sk_prot we may omit the decrement.

To get the correct inuse_idx value move the core_initcall that initializes
the tcp/udp proto handlers to late_initcall. This way it is initialized
after TCP/UDP has the chance to assign the inuse_idx value from the
register protocol handler.

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/ipv4/tcp_bpf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jakub Sitnicki July 13, 2021, 7:47 a.m. UTC | #1
On Mon, Jul 12, 2021 at 09:55 PM CEST, John Fastabend wrote:
> Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> We currently do not set this correctly from sockmap side. The result is
> reading sock stats '/proc/net/sockstat' gives incorrect values. The
> socket counter is incremented correctly, but because we don't set the
> counter correctly when we replace sk_prot we may omit the decrement.
>
> To get the correct inuse_idx value move the core_initcall that initializes
> the tcp/udp proto handlers to late_initcall. This way it is initialized
> after TCP/UDP has the chance to assign the inuse_idx value from the
> register protocol handler.
>
> Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  net/ipv4/tcp_bpf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index f26916a62f25..d3e9386b493e 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -503,7 +503,7 @@ static int __init tcp_bpf_v4_build_proto(void)
>  	tcp_bpf_rebuild_protos(tcp_bpf_prots[TCP_BPF_IPV4], &tcp_prot);
>  	return 0;
>  }
> -core_initcall(tcp_bpf_v4_build_proto);
> +late_initcall(tcp_bpf_v4_build_proto);
>
>  static int tcp_bpf_assert_proto_ops(struct proto *ops)
>  {

The corresponding change for udp_bpf is missing. I've posted it separately [1]
to save us an iteration. Hope you don't mind.

[1] https://lore.kernel.org/bpf/20210713074401.475209-1-jakub@cloudflare.com/

Cong Wang July 14, 2021, 12:56 a.m. UTC | #2
On Mon, Jul 12, 2021 at 12:56 PM John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Proc socket stats use sk_prot->inuse_idx value to record inuse sock stats.
> We currently do not set this correctly from sockmap side. The result is
> reading sock stats '/proc/net/sockstat' gives incorrect values. The
> socket counter is incremented correctly, but because we don't set the
> counter correctly when we replace sk_prot we may omit the decrement.
>
> To get the correct inuse_idx value move the core_initcall that initializes
> the tcp/udp proto handlers to late_initcall. This way it is initialized
> after TCP/UDP has the chance to assign the inuse_idx value from the
> register protocol handler.
>
> Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

For IPv6, I think the module is always loaded before we can
trigger tcp_bpf_check_v6_needs_rebuild(). So,

Reviewed-by: Cong Wang <cong.wang@bytedance.com>


Thanks.
Patch

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index f26916a62f25..d3e9386b493e 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -503,7 +503,7 @@  static int __init tcp_bpf_v4_build_proto(void)
 	tcp_bpf_rebuild_protos(tcp_bpf_prots[TCP_BPF_IPV4], &tcp_prot);
 	return 0;
 }
-core_initcall(tcp_bpf_v4_build_proto);
+late_initcall(tcp_bpf_v4_build_proto);
 
 static int tcp_bpf_assert_proto_ops(struct proto *ops)
 {