From patchwork Wed Feb 3 04:16:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375854 From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 01/19] bpf: rename BPF_STREAM_PARSER to BPF_SOCK_MAP Date: Tue, 2 Feb 2021 20:16:18 -0800 Message-Id: <20210203041636.38555-2-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: 
<20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang Before we add non-TCP support, it is necessary to rename BPF_STREAM_PARSER, as it will no longer be specific to TCP and does not have to be a parser either. This patch renames BPF_STREAM_PARSER to BPF_SOCK_MAP so that sock_map.c can hopefully be protocol-independent. Also, improve its Kconfig description to avoid confusion. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/bpf.h | 4 ++-- include/linux/bpf_types.h | 2 +- include/net/tcp.h | 4 ++-- include/net/udp.h | 4 ++-- net/Kconfig | 13 ++++++------- net/core/Makefile | 2 +- net/ipv4/Makefile | 2 +- net/ipv4/tcp_bpf.c | 4 ++-- 8 files changed, 17 insertions(+), 18 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 321966fc35db..b5af6a4e9927 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1771,7 +1771,7 @@ static inline void bpf_map_offload_map_free(struct bpf_map *map) } #endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */ -#if defined(CONFIG_BPF_STREAM_PARSER) +#if defined(CONFIG_BPF_SOCK_MAP) int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, struct bpf_prog *old, u32 which); int sock_map_get_from_fd(const union bpf_attr *attr, struct bpf_prog *prog); @@ -1804,7 +1804,7 @@ static inline int sock_map_update_elem_sys(struct bpf_map *map, void *key, void { return -EOPNOTSUPP; } -#endif /* CONFIG_BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SOCK_MAP */ #if defined(CONFIG_INET) && defined(CONFIG_BPF_SYSCALL) void bpf_sk_reuseport_detach(struct sock *sk); diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index 99f7fd657d87..6e27726ae578 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -103,7 +103,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_HASH_OF_MAPS, htab_of_maps_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_DEVMAP, dev_map_ops) 
BPF_MAP_TYPE(BPF_MAP_TYPE_DEVMAP_HASH, dev_map_hash_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_SK_STORAGE, sk_storage_map_ops) -#if defined(CONFIG_BPF_STREAM_PARSER) +#if defined(CONFIG_BPF_SOCK_MAP) BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKMAP, sock_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKHASH, sock_hash_ops) #endif diff --git a/include/net/tcp.h b/include/net/tcp.h index 4bb42fb19711..be66571ad122 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2207,14 +2207,14 @@ void tcp_update_ulp(struct sock *sk, struct proto *p, struct sk_msg; struct sk_psock; -#ifdef CONFIG_BPF_STREAM_PARSER +#ifdef CONFIG_BPF_SOCK_MAP struct proto *tcp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); void tcp_bpf_clone(const struct sock *sk, struct sock *newsk); #else static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk) { } -#endif /* CONFIG_BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SOCK_MAP */ #ifdef CONFIG_NET_SOCK_MSG int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, u32 bytes, diff --git a/include/net/udp.h b/include/net/udp.h index 877832bed471..0ff921e6b866 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -511,9 +511,9 @@ static inline struct sk_buff *udp_rcv_segment(struct sock *sk, return segs; } -#ifdef CONFIG_BPF_STREAM_PARSER +#ifdef CONFIG_BPF_SOCK_MAP struct sk_psock; struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); -#endif /* BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SOCK_MAP */ #endif /* _UDP_H */ diff --git a/net/Kconfig b/net/Kconfig index f4c32d982af6..0cc0805a8127 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -305,20 +305,19 @@ config BPF_JIT /proc/sys/net/core/bpf_jit_harden (optional) /proc/sys/net/core/bpf_jit_kallsyms (optional) -config BPF_STREAM_PARSER - bool "enable BPF STREAM_PARSER" +config BPF_SOCK_MAP + bool "enable BPF socket maps" depends on INET depends on BPF_SYSCALL depends on CGROUP_BPF select STREAM_PARSER select NET_SOCK_MSG help - Enabling this allows a stream parser to be used with - 
BPF_MAP_TYPE_SOCKMAP. + Enabling this allows skb parser and verdict to be used with + BPF_MAP_TYPE_SOCKMAP or BPF_MAP_TYPE_SOCKHASH. - BPF_MAP_TYPE_SOCKMAP provides a map type to use with network sockets. - It can be used to enforce socket policy, implement socket redirects, - etc. + This provides a BPF map type to use with network sockets. It can + be used to enforce socket policy, implement socket redirects, etc. config NET_FLOW_LIMIT bool diff --git a/net/core/Makefile b/net/core/Makefile index 3e2c378e5f31..e7c1bdaadefd 100644 --- a/net/core/Makefile +++ b/net/core/Makefile @@ -28,7 +28,7 @@ obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o obj-$(CONFIG_LWTUNNEL) += lwtunnel.o obj-$(CONFIG_LWTUNNEL_BPF) += lwt_bpf.o -obj-$(CONFIG_BPF_STREAM_PARSER) += sock_map.o +obj-$(CONFIG_BPF_SOCK_MAP) += sock_map.o obj-$(CONFIG_DST_CACHE) += dst_cache.o obj-$(CONFIG_HWBM) += hwbm.o obj-$(CONFIG_NET_DEVLINK) += devlink.o diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index 5b77a46885b9..f72f84d1b982 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -62,7 +62,7 @@ obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o obj-$(CONFIG_TCP_CONG_YEAH) += tcp_yeah.o obj-$(CONFIG_TCP_CONG_ILLINOIS) += tcp_illinois.o obj-$(CONFIG_NET_SOCK_MSG) += tcp_bpf.o -obj-$(CONFIG_BPF_STREAM_PARSER) += udp_bpf.o +obj-$(CONFIG_BPF_SOCK_MAP) += udp_bpf.o obj-$(CONFIG_NETLABEL) += cipso_ipv4.o obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \ diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index bc7d2a586e18..2252f1d90676 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -229,7 +229,7 @@ int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, } EXPORT_SYMBOL_GPL(tcp_bpf_sendmsg_redir); -#ifdef CONFIG_BPF_STREAM_PARSER +#ifdef CONFIG_BPF_SOCK_MAP static bool tcp_bpf_stream_read(const struct sock *sk) { struct sk_psock *psock; @@ -629,4 +629,4 @@ void tcp_bpf_clone(const struct sock *sk, struct sock *newsk) if 
(prot == &tcp_bpf_prots[family][TCP_BPF_BASE]) newsk->sk_prot = sk->sk_prot_creator; } -#endif /* CONFIG_BPF_STREAM_PARSER */ +#endif /* CONFIG_BPF_SOCK_MAP */ 
From patchwork Wed Feb 3 04:16:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375853 From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 02/19] skmsg: get rid of struct sk_psock_parser Date: Tue, 2 Feb 2021 20:16:19 -0800 Message-Id: <20210203041636.38555-3-xiyou.wangcong@gmail.com> 
X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang struct sk_psock_parser is embedded in sk_psock, but it is unnecessary because skb verdict also uses ->saved_data_ready. We can simply fold these fields into sk_psock. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/skmsg.h | 16 +++++------- net/core/skmsg.c | 58 ++++++++++++++++--------------------------- net/core/sock_map.c | 8 +++--- 3 files changed, 31 insertions(+), 51 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 8edbbf5f2f93..56d641df3b0c 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -70,12 +70,6 @@ struct sk_psock_link { void *link_raw; }; -struct sk_psock_parser { - struct strparser strp; - bool enabled; - void (*saved_data_ready)(struct sock *sk); -}; - struct sk_psock_work_state { struct sk_buff *skb; u32 len; @@ -90,7 +84,8 @@ struct sk_psock { u32 eval; struct sk_msg *cork; struct sk_psock_progs progs; - struct sk_psock_parser parser; + struct strparser strp; + bool bpf_running; struct sk_buff_head ingress_skb; struct list_head ingress_msg; unsigned long state; @@ -100,6 +95,7 @@ struct sk_psock { void (*saved_unhash)(struct sock *sk); void (*saved_close)(struct sock *sk, long timeout); void (*saved_write_space)(struct sock *sk); + void (*saved_data_ready)(struct sock *sk); struct proto *sk_proto; struct sk_psock_work_state work_state; struct work_struct work; @@ -400,8 +396,8 @@ static inline void sk_psock_put(struct sock *sk, struct sk_psock *psock) static inline void sk_psock_data_ready(struct sock *sk, struct sk_psock *psock) { - if (psock->parser.enabled) - psock->parser.saved_data_ready(sk); + if (psock->bpf_running) + psock->saved_data_ready(sk); else sk->sk_data_ready(sk); } @@ 
-440,6 +436,6 @@ static inline bool sk_psock_strp_enabled(struct sk_psock *psock) { if (!psock) return false; - return psock->parser.enabled; + return psock->bpf_running; } #endif /* _LINUX_SKMSG_H */ diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 1261512d6807..f72fcb03d25c 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -653,7 +653,7 @@ static void sk_psock_destroy_deferred(struct work_struct *gc) /* Parser has been stopped */ if (psock->progs.skb_parser) - strp_done(&psock->parser.strp); + strp_done(&psock->strp); cancel_work_sync(&psock->work); @@ -750,14 +750,6 @@ static int sk_psock_bpf_run(struct sk_psock *psock, struct bpf_prog *prog, return bpf_prog_run_pin_on_cpu(prog, skb); } -static struct sk_psock *sk_psock_from_strp(struct strparser *strp) -{ - struct sk_psock_parser *parser; - - parser = container_of(strp, struct sk_psock_parser, strp); - return container_of(parser, struct sk_psock, parser); -} - static void sk_psock_skb_redirect(struct sk_buff *skb) { struct sk_psock *psock_other; @@ -899,7 +891,7 @@ static int sk_psock_strp_read_done(struct strparser *strp, int err) static int sk_psock_strp_parse(struct strparser *strp, struct sk_buff *skb) { - struct sk_psock *psock = sk_psock_from_strp(strp); + struct sk_psock *psock = container_of(strp, struct sk_psock, strp); struct bpf_prog *prog; int ret = skb->len; @@ -923,10 +915,10 @@ static void sk_psock_strp_data_ready(struct sock *sk) psock = sk_psock(sk); if (likely(psock)) { if (tls_sw_has_ctx_rx(sk)) { - psock->parser.saved_data_ready(sk); + psock->saved_data_ready(sk); } else { write_lock_bh(&sk->sk_callback_lock); - strp_data_ready(&psock->parser.strp); + strp_data_ready(&psock->strp); write_unlock_bh(&sk->sk_callback_lock); } } @@ -1009,57 +1001,49 @@ int sk_psock_init_strp(struct sock *sk, struct sk_psock *psock) .parse_msg = sk_psock_strp_parse, }; - psock->parser.enabled = false; - return strp_init(&psock->parser.strp, sk, &cb); + psock->bpf_running = false; + return 
strp_init(&psock->strp, sk, &cb); } void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock) { - struct sk_psock_parser *parser = &psock->parser; - - if (parser->enabled) + if (psock->bpf_running) return; - parser->saved_data_ready = sk->sk_data_ready; + psock->saved_data_ready = sk->sk_data_ready; sk->sk_data_ready = sk_psock_verdict_data_ready; sk->sk_write_space = sk_psock_write_space; - parser->enabled = true; + psock->bpf_running = true; } void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock) { - struct sk_psock_parser *parser = &psock->parser; - - if (parser->enabled) + if (psock->bpf_running) return; - parser->saved_data_ready = sk->sk_data_ready; + psock->saved_data_ready = sk->sk_data_ready; sk->sk_data_ready = sk_psock_strp_data_ready; sk->sk_write_space = sk_psock_write_space; - parser->enabled = true; + psock->bpf_running = true; } void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock) { - struct sk_psock_parser *parser = &psock->parser; - - if (!parser->enabled) + if (!psock->bpf_running) return; - sk->sk_data_ready = parser->saved_data_ready; - parser->saved_data_ready = NULL; - strp_stop(&parser->strp); - parser->enabled = false; + sk->sk_data_ready = psock->saved_data_ready; + psock->saved_data_ready = NULL; + strp_stop(&psock->strp); + psock->bpf_running = false; } void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock) { - struct sk_psock_parser *parser = &psock->parser; - - if (!parser->enabled) + if (!psock->bpf_running) return; - sk->sk_data_ready = parser->saved_data_ready; - parser->saved_data_ready = NULL; - parser->enabled = false; + sk->sk_data_ready = psock->saved_data_ready; + psock->saved_data_ready = NULL; + psock->bpf_running = false; } diff --git a/net/core/sock_map.c b/net/core/sock_map.c index d758fb83c884..37ff8e13e4cc 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -148,9 +148,9 @@ static void sock_map_del_link(struct sock *sk, struct bpf_map *map = link->map; struct 
bpf_stab *stab = container_of(map, struct bpf_stab, map); - if (psock->parser.enabled && stab->progs.skb_parser) + if (psock->bpf_running && stab->progs.skb_parser) strp_stop = true; - if (psock->parser.enabled && stab->progs.skb_verdict) + if (psock->bpf_running && stab->progs.skb_verdict) verdict_stop = true; list_del(&link->list); sk_psock_free_link(link); @@ -283,14 +283,14 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, goto out_drop; write_lock_bh(&sk->sk_callback_lock); - if (skb_parser && skb_verdict && !psock->parser.enabled) { + if (skb_parser && skb_verdict && !psock->bpf_running) { ret = sk_psock_init_strp(sk, psock); if (ret) goto out_unlock_drop; psock_set_prog(&psock->progs.skb_verdict, skb_verdict); psock_set_prog(&psock->progs.skb_parser, skb_parser); sk_psock_start_strp(sk, psock); - } else if (!skb_parser && skb_verdict && !psock->parser.enabled) { + } else if (!skb_parser && skb_verdict && !psock->bpf_running) { psock_set_prog(&psock->progs.skb_verdict, skb_verdict); sk_psock_start_verdict(sk,psock); } 
From patchwork Wed Feb 3 04:16:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375851 From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 05/19] sock_map: introduce BPF_SK_SKB_VERDICT Date: Tue, 2 Feb 2021 20:16:22 -0800 Message-Id: <20210203041636.38555-6-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang I was planning to reuse BPF_SK_SKB_STREAM_VERDICT but its name is confusing and more importantly it seems kTLS relies on it to deliver sk_msg too. To avoid messing up kTLS, we can just reuse the stream verdict code but introduce a new type of eBPF program, skb_verdict. Users are not allowed to set stream_verdict and skb_verdict at the same time. 
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/skmsg.h | 3 +++ include/uapi/linux/bpf.h | 1 + kernel/bpf/syscall.c | 1 + net/core/skmsg.c | 4 +++- net/core/sock_map.c | 23 ++++++++++++++++++++++- tools/bpf/bpftool/common.c | 1 + tools/bpf/bpftool/prog.c | 1 + tools/include/uapi/linux/bpf.h | 1 + 8 files changed, 33 insertions(+), 2 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 218566ac4fa1..cb79b1afa556 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -58,6 +58,7 @@ struct sk_psock_progs { struct bpf_prog *msg_parser; struct bpf_prog *stream_parser; struct bpf_prog *stream_verdict; + struct bpf_prog *skb_verdict; }; enum sk_psock_state_bits { @@ -428,6 +429,7 @@ static inline void psock_progs_drop(struct sk_psock_progs *progs) psock_set_prog(&progs->msg_parser, NULL); psock_set_prog(&progs->stream_parser, NULL); psock_set_prog(&progs->stream_verdict, NULL); + psock_set_prog(&progs->skb_verdict, NULL); } int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb); @@ -482,5 +484,6 @@ void skb_bpf_ext_redirect_clear(struct sk_buff *skb) ext->flags = 0; ext->sk_redir = NULL; } + #endif /* CONFIG_NET_SOCK_MSG */ #endif /* _LINUX_SKMSG_H */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c001766adcbc..c1a412ebfb08 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -247,6 +247,7 @@ enum bpf_attach_type { BPF_XDP_CPUMAP, BPF_SK_LOOKUP, BPF_XDP, + BPF_SK_SKB_VERDICT, __MAX_BPF_ATTACH_TYPE }; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index e5999d86c76e..a56549fc2825 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2936,6 +2936,7 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type) return BPF_PROG_TYPE_SK_MSG; case BPF_SK_SKB_STREAM_PARSER: case BPF_SK_SKB_STREAM_VERDICT: + case BPF_SK_SKB_VERDICT: return BPF_PROG_TYPE_SK_SKB; case BPF_LIRC_MODE2: return 
BPF_PROG_TYPE_LIRC_MODE2; diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 51446fe63be5..ecbd6f0d49a5 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -688,7 +688,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock) rcu_assign_sk_user_data(sk, NULL); if (psock->progs.stream_parser) sk_psock_stop_strp(sk, psock); - else if (psock->progs.stream_verdict) + else if (psock->progs.stream_verdict || psock->progs.skb_verdict) sk_psock_stop_verdict(sk, psock); write_unlock_bh(&sk->sk_callback_lock); sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED); @@ -966,6 +966,8 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb, } prog = READ_ONCE(psock->progs.stream_verdict); + if (!prog) + prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { skb_bpf_ext_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 521663582982..f827f1ecefcc 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -152,6 +152,8 @@ static void sock_map_del_link(struct sock *sk, strp_stop = true; if (psock->bpf_running && stab->progs.stream_verdict) verdict_stop = true; + if (psock->bpf_running && stab->progs.skb_verdict) + verdict_stop = true; list_del(&link->list); sk_psock_free_link(link); } @@ -224,7 +226,7 @@ static struct sk_psock *sock_map_psock_get_checked(struct sock *sk) static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, struct sock *sk) { - struct bpf_prog *msg_parser, *stream_parser, *stream_verdict; + struct bpf_prog *msg_parser, *stream_parser, *stream_verdict, *skb_verdict; struct sk_psock *psock; int ret; @@ -253,6 +255,15 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, } } + skb_verdict = READ_ONCE(progs->skb_verdict); + if (skb_verdict) { + skb_verdict = bpf_prog_inc_not_zero(skb_verdict); + if (IS_ERR(skb_verdict)) { + ret = PTR_ERR(skb_verdict); + goto out_put_msg_parser; + } + } + psock = 
sock_map_psock_get_checked(sk); if (IS_ERR(psock)) { ret = PTR_ERR(psock); @@ -262,6 +273,7 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, if (psock) { if ((msg_parser && READ_ONCE(psock->progs.msg_parser)) || (stream_parser && READ_ONCE(psock->progs.stream_parser)) || + (skb_verdict && READ_ONCE(psock->progs.skb_verdict)) || (stream_verdict && READ_ONCE(psock->progs.stream_verdict))) { sk_psock_put(sk, psock); ret = -EBUSY; @@ -293,6 +305,9 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, } else if (!stream_parser && stream_verdict && !psock->bpf_running) { psock_set_prog(&psock->progs.stream_verdict, stream_verdict); sk_psock_start_verdict(sk,psock); + } else if (!stream_verdict && skb_verdict && !psock->bpf_running) { + psock_set_prog(&psock->progs.skb_verdict, skb_verdict); + sk_psock_start_verdict(sk, psock); } write_unlock_bh(&sk->sk_callback_lock); return 0; @@ -301,6 +316,9 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, out_drop: sk_psock_put(sk, psock); out_progs: + if (skb_verdict) + bpf_prog_put(skb_verdict); +out_put_msg_parser: if (msg_parser) bpf_prog_put(msg_parser); out_put_stream_parser: @@ -1467,6 +1485,9 @@ int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, case BPF_SK_SKB_STREAM_VERDICT: pprog = &progs->stream_verdict; break; + case BPF_SK_SKB_VERDICT: + pprog = &progs->skb_verdict; + break; default: return -EOPNOTSUPP; } diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c index 65303664417e..1828bba19020 100644 --- a/tools/bpf/bpftool/common.c +++ b/tools/bpf/bpftool/common.c @@ -57,6 +57,7 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = { [BPF_SK_SKB_STREAM_PARSER] = "sk_skb_stream_parser", [BPF_SK_SKB_STREAM_VERDICT] = "sk_skb_stream_verdict", + [BPF_SK_SKB_VERDICT] = "sk_skb_verdict", [BPF_SK_MSG_VERDICT] = "sk_msg_verdict", [BPF_LIRC_MODE2] = "lirc_mode2", [BPF_FLOW_DISSECTOR] = "flow_dissector", 
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 1fe3ba255bad..a78d8c03b7ea 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -76,6 +76,7 @@ enum dump_mode { static const char * const attach_type_strings[] = { [BPF_SK_SKB_STREAM_PARSER] = "stream_parser", [BPF_SK_SKB_STREAM_VERDICT] = "stream_verdict", + [BPF_SK_SKB_VERDICT] = "skb_verdict", [BPF_SK_MSG_VERDICT] = "msg_verdict", [BPF_FLOW_DISSECTOR] = "flow_dissector", [__MAX_BPF_ATTACH_TYPE] = NULL, diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index c001766adcbc..c1a412ebfb08 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -247,6 +247,7 @@ enum bpf_attach_type { BPF_XDP_CPUMAP, BPF_SK_LOOKUP, BPF_XDP, + BPF_SK_SKB_VERDICT, __MAX_BPF_ATTACH_TYPE }; 
From patchwork Wed Feb 3 04:16:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375852 From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 08/19] udp: implement ->read_sock() for sockmap Date: Tue, 2 Feb 2021 20:16:25 -0800 Message-Id: <20210203041636.38555-9-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/net/udp.h | 2 ++ net/ipv4/af_inet.c | 1 + net/ipv4/udp.c | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 37 insertions(+) diff --git a/include/net/udp.h b/include/net/udp.h index 13f9354dbd3e..b6b75cabf4e4 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -327,6 +327,8 @@ struct sock *__udp6_lib_lookup(struct net *net, struct sk_buff *skb); struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb, __be16 sport, __be16 dport); +int udp_read_sock(struct sock *sk, read_descriptor_t *desc, + sk_read_actor_t recv_actor); /* UDP uses skb->dev_scratch to cache as much information as possible and avoid * possibly multiple cache miss on dequeue() diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index d184d9379a92..4a4c6d3d2786 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -1072,6 +1072,7 @@ const struct proto_ops inet_dgram_ops = { .getsockopt = sock_common_getsockopt, .sendmsg = inet_sendmsg, .sendmsg_locked = udp_sendmsg_locked, + .read_sock = 
udp_read_sock, .recvmsg = inet_recvmsg, .mmap = sock_no_mmap, .sendpage = inet_sendpage, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 635e1e8b2968..6dffbcec0b51 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1792,6 +1792,40 @@ struct sk_buff *__skb_recv_udp(struct sock *sk, unsigned int flags, } EXPORT_SYMBOL(__skb_recv_udp); +int udp_read_sock(struct sock *sk, read_descriptor_t *desc, + sk_read_actor_t recv_actor) +{ + struct sk_buff *skb; + int copied = 0, err; + + while (1) { + int offset = 0; + + skb = __skb_recv_udp(sk, 0, 1, &offset, &err); + if (!skb) + break; + if (offset < skb->len) { + int used; + size_t len; + + len = skb->len - offset; + used = recv_actor(desc, skb, offset, len); + if (used <= 0) { + if (!copied) + copied = used; + break; + } else if (used <= len) { + copied += used; + offset += used; + } + } + if (!desc->count) + break; + } + + return copied; +} + /* * This should be easy, if there is something there we * return it, otherwise we block. From patchwork Wed Feb 3 04:16:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375850 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8F37C433E6 for ; Wed, 3 Feb 2021 04:20:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9DDF964F61 for ; Wed, 3 Feb 2021 04:20:41 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232969AbhBCETk (ORCPT ); Tue, 2 Feb 2021 23:19:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36746 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232760AbhBCESY (ORCPT ); Tue, 2 Feb 2021 23:18:24 -0500 Received: from mail-oo1-xc30.google.com (mail-oo1-xc30.google.com [IPv6:2607:f8b0:4864:20::c30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C4028C061793; Tue, 2 Feb 2021 20:17:07 -0800 (PST) Received: by mail-oo1-xc30.google.com with SMTP id y21so2387069oot.12; Tue, 02 Feb 2021 20:17:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=hu2PSvf1ubuS5CTlM1297EMYv4VI1bTTMBsgsZV1bMQ=; b=Xopi7NxJGZjMD6E96E6Zwqp1FGbnAnTTS9fpOU3UDAKc2HXN/q98ycxqbFqFn41q7v gzbQJQKHI3E+lfnNWYHG8HhpaofH6PI29MpySujaIcGP0REiKB0Xi/dQgUD5WPJy5YA0 mg4nVdrQlP7nkWt0IsXg0tVTKQkW0wCaKJFCzaQI9zzHDGEm6WVpAVgxjlApeyDUTown 0bE0SNP3/GlLFfd/tXKz5PA4xZRLJwhBpRQpYqTtbv9IA0jjjRFxcu0541J6/qvqS8LX R8tquy6MKOVbFa+En0bk2xBqtCigC2yXb8oq1NT2cyooKQASt/3JFSTEiRT+IWvuK3TU HXQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hu2PSvf1ubuS5CTlM1297EMYv4VI1bTTMBsgsZV1bMQ=; b=ZK587+h5oaGcJeGc2VSUKe8rOzwOGFW4SeVS+/6exOZ3T1caMsUmC+fySNVEtLUlii sHpDGhgFf1G0pslglVSP9Y09wv6/0BtuEjVO3MvRJJIjhOqUIHgsKgBH3szdJGoX7DBS LqH8PGpeCQ63tRCnYAq8qrHT0cl6kvWknmgRgZeaPrQV0woxxhFOAnYjsgzGSBDhPYaf 71T/UT0ikalPJcrmWjmev6tC5OgaGj6wZZgX4MLFUlvCbA0L1XRMP/xXy4WVFaSP4QAK thfxmI/0sw3QiqwYA+37HyRxJ9r0kEp51moy7s1BFs0xLZIYysdZ0yGG4T9r3mIqHlse dWVw== X-Gm-Message-State: AOAM532rWY7fmF3z12PgTdVOZcyP5ufqpjqNnpMX/r8ZugvWMmNI7OXa sHnPxfnmM3MGt97vyozbRlFSaGNpkbv8nw== X-Google-Smtp-Source: 
ABdhPJwYbPn6zl136gqq8TKcxOfa7RepL+MBrMNS/3yzIgNRTS22izXE3VQPaSHuLulohXaKev/aoA== X-Received: by 2002:a4a:d1de:: with SMTP id a30mr810423oos.43.1612325827011; Tue, 02 Feb 2021 20:17:07 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:06 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 09/19] udp: add ->read_sock() and ->sendmsg_locked() to ipv6 Date: Tue, 2 Feb 2021 20:16:26 -0800 Message-Id: <20210203041636.38555-10-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang Similarly, udpv6_sendmsg() takes lock_sock() internally, so we have to build ->sendmsg_locked() on top of it. For ->read_sock(), we can simply reuse udp_read_sock().
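The locked/unlocked split described above can be sketched in plain userspace C. This is only an illustrative model, not the kernel code: struct toy_sock, toy_lock(), toy_release() and the toy_sendmsg*() functions are hypothetical names invented for this sketch, and the "lock" is a simple flag rather than lock_sock().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a socket; NOT the kernel's struct sock. */
struct toy_sock {
	bool locked;        /* true while the toy "socket lock" is held */
	size_t bytes_sent;  /* state that must only change under the lock */
};

static void toy_lock(struct toy_sock *sk)
{
	assert(!sk->locked);    /* the toy lock is not recursive */
	sk->locked = true;
}

static void toy_release(struct toy_sock *sk)
{
	assert(sk->locked);
	sk->locked = false;
}

/* Shared body: takes the lock only if the caller does not already hold it. */
static int toy_sendmsg_common(struct toy_sock *sk, size_t len, bool locked)
{
	if (!locked)
		toy_lock(sk);

	assert(sk->locked);     /* either we took it or the caller did */
	sk->bytes_sent += len;

	if (!locked)
		toy_release(sk);
	return (int)len;
}

/* Entry point for callers that do not hold the lock. */
int toy_sendmsg(struct toy_sock *sk, size_t len)
{
	return toy_sendmsg_common(sk, len, false);
}

/* Entry point for callers that already hold the lock. */
int toy_sendmsg_locked(struct toy_sock *sk, size_t len)
{
	return toy_sendmsg_common(sk, len, true);
}
```

With a single shared body, every exit path (including early error returns) has to honor the flag, otherwise the locked variant would drop a lock that its caller still expects to hold.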
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/net/ipv6.h | 1 + net/ipv4/udp.c | 1 + net/ipv6/af_inet6.c | 2 ++ net/ipv6/udp.c | 27 +++++++++++++++++++++------ 4 files changed, 25 insertions(+), 6 deletions(-) diff --git a/include/net/ipv6.h b/include/net/ipv6.h index bd1f396cc9c7..48b6850dae85 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -1119,6 +1119,7 @@ int inet6_hash_connect(struct inet_timewait_death_row *death_row, int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size); int inet6_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int flags); +int udpv6_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len); /* * reassembly.c diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 6dffbcec0b51..3acb1be73131 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -1825,6 +1825,7 @@ int udp_read_sock(struct sock *sk, read_descriptor_t *desc, return copied; } +EXPORT_SYMBOL(udp_read_sock); /* * This should be easy, if there is something there we diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index f091fe9b4da5..63c2d024f572 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -714,7 +714,9 @@ const struct proto_ops inet6_dgram_ops = { .setsockopt = sock_common_setsockopt, /* ok */ .getsockopt = sock_common_getsockopt, /* ok */ .sendmsg = inet6_sendmsg, /* retpoline's sake */ + .sendmsg_locked = udpv6_sendmsg_locked, .recvmsg = inet6_recvmsg, /* retpoline's sake */ + .read_sock = udp_read_sock, .mmap = sock_no_mmap, .sendpage = sock_no_sendpage, .set_peek_off = sk_set_peek_off, diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 66ebdfc83c95..c52ea171060d 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1272,7 +1272,7 @@ static int udp_v6_push_pending_frames(struct sock *sk) return err; } -int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) +static int __udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len, bool 
locked) { struct ipv6_txoptions opt_space; struct udp_sock *up = udp_sk(sk); @@ -1361,7 +1361,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) * There are pending frames. * The socket lock must be held while it's corked. */ - lock_sock(sk); + if (!locked) + lock_sock(sk); if (likely(up->pending)) { if (unlikely(up->pending != AF_INET6)) { release_sock(sk); @@ -1370,7 +1371,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) dst = NULL; goto do_append_data; } - release_sock(sk); + if (!locked) + release_sock(sk); } ulen += sizeof(struct udphdr); @@ -1533,11 +1535,13 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) goto out; } - lock_sock(sk); + if (!locked) + lock_sock(sk); if (unlikely(up->pending)) { /* The socket is already corked while preparing it. */ /* ... which is an evident application bug. --ANK */ - release_sock(sk); + if (!locked) + release_sock(sk); net_dbg_ratelimited("udp cork app bug 2\n"); err = -EINVAL; @@ -1562,7 +1566,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) if (err > 0) err = np->recverr ? 
net_xmit_errno(err) : 0; - release_sock(sk); + if (!locked) + release_sock(sk); out: dst_release(dst); @@ -1593,6 +1598,16 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) goto out; } +int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) +{ + return __udpv6_sendmsg(sk, msg, len, false); +} + +int udpv6_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len) +{ + return __udpv6_sendmsg(sk, msg, len, true); +} + void udpv6_destroy_sock(struct sock *sk) { struct udp_sock *up = udp_sk(sk); From patchwork Wed Feb 3 04:16:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375849 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.9 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1DD5C43332 for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7F88A64F61 for ; Wed, 3 Feb 2021 04:20:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232909AbhBCET6 (ORCPT ); Tue, 2 Feb 2021 23:19:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231944AbhBCES1 (ORCPT ); Tue, 2 Feb 2021 23:18:27 -0500 Received: from mail-ot1-x336.google.com (mail-ot1-x336.google.com [IPv6:2607:f8b0:4864:20::336]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id BBA5EC061797; Tue, 2 Feb 2021 20:17:10 -0800 (PST) Received: by mail-ot1-x336.google.com with SMTP id d7so22161932otf.3; Tue, 02 Feb 2021 20:17:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FbcI6wt1YBzeWqrusTkpbaZfnU8cH2xe50gLcZjha8Y=; b=bTgpZ8EpJ+U6hBDnEOA1mhM+6MfCw0fSbK9KY39fKP6MA1CytaAePTJjaFfL2nBkU2 +TmPYYU3yeWxWm/U2JfCk2zrDl+JHfIFKisv7oQXxFdALRcLiQ1rqoeqcFUstb06eCb2 N/roDNssfhs33lO730uAnym3KqWZqACvrwJSSWataNBTMEwVc/Ff3szcNQkpxE8ERFdY C31QPsTmdd4IJwKCXn2OjFrI19fUH0j+6LXAvRJ/TAcEtl09ztHoQqfGD6fCZUdg01Oa vNXxZEwGpB4L1WwaXKfuyZu5gBoQJ8h1qzFib3gL+SxXvSVurgwX6vyYuZSk7YFK3TQK 0TFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FbcI6wt1YBzeWqrusTkpbaZfnU8cH2xe50gLcZjha8Y=; b=eZGnXsqzuDOArdUdpaAB5zMNBzQa5mkJU9YtoK8T2VPM4VOplJ5hNlRBp/wsZn/hQg QYH9dYN46A1m1wqXx47LMEsHzcXMcflyEExS7bULxUZmteDORUrdIMWAi1sOARsMcbLp 5fhGmFRPjB6YVZXZx+y8LAZEa4jRwVuhRJyH+rV4puuycU3oHuuo2CaGQuFAOKHHbqjR 4Q8NJFHeXdzZ3WZS6bATuXEHukN2U6dxCo0sRZeeA+yYEUA7Rz8NkvM96Ky6G8ItUmTb 3jOQUrtnRpwAOdJAvN9UUf5nDYpMfOZVnCzKc4WFXss/CiNyogRME/eD2tiJlMXjmhCn GqoA== X-Gm-Message-State: AOAM532K5SXXEr/562ywzybQzK+MdwMDWybg7xMmO/bBLy1Lx0X/cz0I 0wYyf1jVxG7I9G4KMyz/DzREyJV3tBBELQ== X-Google-Smtp-Source: ABdhPJxzQ9wsLkDoFo4vhjLHrIA+8tYBqV99ZetqsvSJCXTrcFWw0ItPVRG9mIMxga/h5rlJzLfaIg== X-Received: by 2002:a9d:6c85:: with SMTP id c5mr760176otr.300.1612325830009; Tue, 02 Feb 2021 20:17:10 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:09 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org 
Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 11/19] af_unix: implement ->read_sock() for sockmap Date: Tue, 2 Feb 2021 20:16:28 -0800 Message-Id: <20210203041636.38555-12-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/unix/af_unix.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 4e1fa4ecbcfb..9315c4f4c27a 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -662,6 +662,7 @@ static ssize_t unix_stream_splice_read(struct socket *, loff_t *ppos, static int __unix_dgram_sendmsg(struct sock*, struct msghdr *, size_t); static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t); static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int); +int unix_read_sock(struct sock *sk, read_descriptor_t *desc, sk_read_actor_t recv_actor); static int unix_dgram_connect(struct socket *, struct sockaddr *, int, int); static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t); @@ -739,6 +740,7 @@ static const struct proto_ops unix_dgram_ops = { .listen = sock_no_listen, .shutdown = unix_shutdown, .sendmsg = unix_dgram_sendmsg, + .read_sock = unix_read_sock, .sendmsg_locked = __unix_dgram_sendmsg, .recvmsg = unix_dgram_recvmsg, .mmap = sock_no_mmap, @@ -2190,6 +2192,50 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, return err; } +int unix_read_sock(struct sock *sk, read_descriptor_t *desc, + sk_read_actor_t 
recv_actor) +{ + unsigned int flags = MSG_DONTWAIT; + struct unix_sock *u = unix_sk(sk); + struct sk_buff *skb; + int copied = 0; + + while (1) { + int offset, err; + + mutex_lock(&u->iolock); + skb = __skb_recv_datagram(sk, &sk->sk_receive_queue, flags, + &offset, &err); + if (!skb) { + mutex_unlock(&u->iolock); + break; + } + + if (offset < skb->len) { + int used; + size_t len; + + len = skb->len - offset; + used = recv_actor(desc, skb, offset, len); + if (used <= 0) { + if (!copied) + copied = used; + mutex_unlock(&u->iolock); + break; + } else if (used <= len) { + copied += used; + offset += used; + } + } + mutex_unlock(&u->iolock); + + if (!desc->count) + break; + } + + return copied; +} + /* * Sleep until more data has arrived. But check for races.. */ From patchwork Wed Feb 3 04:16:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375847 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B04DC433DB for ; Wed, 3 Feb 2021 04:21:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1AF0264F74 for ; Wed, 3 Feb 2021 04:21:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233115AbhBCEUr (ORCPT ); Tue, 2 Feb 2021 23:20:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36770 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232767AbhBCESa (ORCPT ); Tue, 2 Feb 2021 23:18:30 -0500 Received: from mail-ot1-x333.google.com (mail-ot1-x333.google.com [IPv6:2607:f8b0:4864:20::333]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F178C0617AA; Tue, 2 Feb 2021 20:17:15 -0800 (PST) Received: by mail-ot1-x333.google.com with SMTP id f6so22095833ots.9; Tue, 02 Feb 2021 20:17:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ffOSXHVfVukIRx64nQpYmmUG63thxvE/1brmM+JdKMs=; b=fJUbts7FCGnPppInUfONfJXujv3VIsSNHcxo918F3cp9xP4h7nq2Uav+3Wppjzpk8R HkB0SesEzijeJuNfXyjKIxpp3p/OhtS12CY6xqqlmj2N7txw3sqryBUhBxAWsSvCUogR OVN7gWyW9qfQgM5NKS0YlrKiIb8afyfiRXQ9gXiCbZN/qd4HMNjBk07Ez+PHxDbkqa9t FN20E4vf4d7kOmTJLw151lZqtCCv5BD2jePtZ1IuMzc1hUBfzKlE9m8s9Xi2MdmTkUuU XwzQUFJKxFUHFFwtVgywH7djnS2v7N6QaebiZ0KyySE2//cqf3NkQ+dpac9Vv5HEdCZz b1SA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ffOSXHVfVukIRx64nQpYmmUG63thxvE/1brmM+JdKMs=; b=MckenP15hOUHS60V4MqXjlAEbC4dX8mneRjNOdj5XxuZo/zipYT2aouz44uKWU6kmJ LpBVPTcO2WQgBy/18DtQT9jZN5PVICk05qEXBt/wH8tdjcIXg+nAlDhAiLFx0sQKnAwa F5AOAloiubekwL8au3oCADDy3gB+3xmVSXc7tO9GzXeXKiThj0qbtKHzUHMTbRTBXnsg wYvzhlCU/5DJaMP5xdRJhLnZhaIh7l70ZwejJNs2deyQHO2aN23hsPZGrogZx3dfPNYf db5ROO3FpgnkuiG7XkBXz4PTiHpZepN6fzEIPA8u2WUKtdkVaA+suhhoQzIZo/Y3ptVK btsQ== X-Gm-Message-State: AOAM531WlwkITOYCuRcXDXWITfd4PlO3Digj0iYI65nhNPHhmzll8/k+ eSHfbU6xhAB7vYQE7vxPiRRKplpJkZW8OQ== X-Google-Smtp-Source: ABdhPJyVbJM6ZSydt4Uu3bZbe2jRyPOKOlHNwVJf8qqzIFFxqZzaxmGn0y5DarDq8xypZe+Cy9P9NQ== X-Received: by 2002:a9d:71c6:: with SMTP id z6mr807246otj.276.1612325834550; Tue, 02 Feb 2021 20:17:14 -0800 (PST) Received: from unknown.attlocal.net 
([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.17.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:13 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 14/19] skmsg: extract __tcp_bpf_recvmsg() and tcp_bpf_wait_data() Date: Tue, 2 Feb 2021 20:16:31 -0800 Message-Id: <20210203041636.38555-15-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang Although these two functions are currently used only by TCP, they are not TCP-specific: both operate on skmsg and ingress_msg, so they fit well in net/core/skmsg.c. We will also need them for non-TCP sockets, so rename them, move them to skmsg.c, and export them to modules.
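To illustrate why the receive helper is protocol-agnostic, here is a heavily simplified userspace model of draining a queue of ingress messages with consume and peek semantics. It is a sketch under stated assumptions, not the kernel implementation: struct toy_msg, struct toy_queue and toy_queue_recv() are hypothetical names, flat byte buffers stand in for scatterlists, and memory accounting is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MSGS 4

/* One queued message: 'len' bytes remain, starting at data + off. */
struct toy_msg {
	const char *data;
	size_t off;
	size_t len;
};

/* Toy ingress queue; valid entries are msgs[head] .. msgs[tail - 1]. */
struct toy_queue {
	struct toy_msg msgs[TOY_MAX_MSGS];
	size_t head;
	size_t tail;
};

/*
 * Copy up to 'len' bytes from the queue into 'buf'.
 * With peek == false, copied bytes are consumed and fully drained
 * messages are dropped from the queue; with peek == true the queue
 * is left untouched. Returns the number of bytes copied.
 */
static size_t toy_queue_recv(struct toy_queue *q, char *buf, size_t len,
			     bool peek)
{
	size_t copied = 0;
	size_t i = q->head;

	assert(q->head <= q->tail && q->tail <= TOY_MAX_MSGS);

	while (copied < len && i < q->tail) {
		struct toy_msg *m = &q->msgs[i];
		size_t n = m->len < len - copied ? m->len : len - copied;

		memcpy(buf + copied, m->data + m->off, n);
		copied += n;

		if (!peek) {
			m->off += n;
			m->len -= n;
			if (m->len == 0)
				q->head = i + 1;  /* message fully drained */
		}
		i++;
	}
	return copied;
}
```

Nothing in this loop depends on how the messages were produced, which is the property that lets the same helper serve TCP and non-TCP sockets.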
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/skmsg.h | 4 ++ include/net/tcp.h | 2 - net/core/skmsg.c | 104 +++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_bpf.c | 106 +----------------------------------------- net/tls/tls_sw.c | 4 +- 5 files changed, 112 insertions(+), 108 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index cb94d0f89c08..0e52fc5521a0 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -125,6 +125,10 @@ int sk_msg_zerocopy_from_iter(struct sock *sk, struct iov_iter *from, struct sk_msg *msg, u32 bytes); int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from, struct sk_msg *msg, u32 bytes); +int sk_msg_wait_data(struct sock *sk, struct sk_psock *psock, int flags, + long timeo, int *err); +int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, + int len, int flags); static inline void sk_msg_check_to_free(struct sk_msg *msg, u32 i, u32 bytes) { diff --git a/include/net/tcp.h b/include/net/tcp.h index c2fff35859b6..b314aee5800d 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2194,8 +2194,6 @@ static inline void tcp_bpf_clone(const struct sock *sk, struct sock *newsk) #ifdef CONFIG_NET_SOCK_MSG int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg, u32 bytes, int flags); -int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, - struct msghdr *msg, int len, int flags); #endif /* CONFIG_NET_SOCK_MSG */ #ifdef CONFIG_CGROUP_BPF diff --git a/net/core/skmsg.c b/net/core/skmsg.c index ecbd6f0d49a5..8e3edbdf4c7c 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -399,6 +399,110 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from, } EXPORT_SYMBOL_GPL(sk_msg_memcopy_from_iter); +int sk_msg_wait_data(struct sock *sk, struct sk_psock *psock, int flags, + long timeo, int *err) +{ + DEFINE_WAIT_FUNC(wait, woken_wake_function); + int ret = 0; + + if 
(sk->sk_shutdown & RCV_SHUTDOWN) + return 1; + + if (!timeo) + return ret; + + add_wait_queue(sk_sleep(sk), &wait); + sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); + ret = sk_wait_event(sk, &timeo, + !list_empty(&psock->ingress_msg) || + !skb_queue_empty(&sk->sk_receive_queue), &wait); + sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); + remove_wait_queue(sk_sleep(sk), &wait); + return ret; +} +EXPORT_SYMBOL_GPL(sk_msg_wait_data); + +/* Receive sk_msg from psock->ingress_msg to @msg. */ +int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg, + int len, int flags) +{ + struct iov_iter *iter = &msg->msg_iter; + int peek = flags & MSG_PEEK; + struct sk_msg *msg_rx; + int i, copied = 0; + + msg_rx = list_first_entry_or_null(&psock->ingress_msg, + struct sk_msg, list); + + while (copied != len) { + struct scatterlist *sge; + + if (unlikely(!msg_rx)) + break; + + i = msg_rx->sg.start; + do { + struct page *page; + int copy; + + sge = sk_msg_elem(msg_rx, i); + copy = sge->length; + page = sg_page(sge); + if (copied + copy > len) + copy = len - copied; + copy = copy_page_to_iter(page, sge->offset, copy, iter); + if (!copy) + return copied ? copied : -EFAULT; + + copied += copy; + if (likely(!peek)) { + sge->offset += copy; + sge->length -= copy; + if (!msg_rx->skb) + sk_mem_uncharge(sk, copy); + msg_rx->sg.size -= copy; + + if (!sge->length) { + sk_msg_iter_var_next(i); + if (!msg_rx->skb) + put_page(page); + } + } else { + /* Lets not optimize peek case if copy_page_to_iter + * didn't copy the entire length lets just break. 
+ */ + if (copy != sge->length) + return copied; + sk_msg_iter_var_next(i); + } + + if (copied == len) + break; + } while (i != msg_rx->sg.end); + + if (unlikely(peek)) { + if (msg_rx == list_last_entry(&psock->ingress_msg, + struct sk_msg, list)) + break; + msg_rx = list_next_entry(msg_rx, list); + continue; + } + + msg_rx->sg.start = i; + if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) { + list_del(&msg_rx->list); + if (msg_rx->skb) + consume_skb(msg_rx->skb); + kfree(msg_rx); + } + msg_rx = list_first_entry_or_null(&psock->ingress_msg, + struct sk_msg, list); + } + + return copied; +} +EXPORT_SYMBOL_GPL(sk_msg_recvmsg); + static struct sk_msg *sk_psock_create_ingress_msg(struct sock *sk, struct sk_buff *skb) { diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index 16e00802ccba..3c0206a4f0e0 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -10,86 +10,6 @@ #include #include -int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock, - struct msghdr *msg, int len, int flags) -{ - struct iov_iter *iter = &msg->msg_iter; - int peek = flags & MSG_PEEK; - struct sk_msg *msg_rx; - int i, copied = 0; - - msg_rx = list_first_entry_or_null(&psock->ingress_msg, - struct sk_msg, list); - - while (copied != len) { - struct scatterlist *sge; - - if (unlikely(!msg_rx)) - break; - - i = msg_rx->sg.start; - do { - struct page *page; - int copy; - - sge = sk_msg_elem(msg_rx, i); - copy = sge->length; - page = sg_page(sge); - if (copied + copy > len) - copy = len - copied; - copy = copy_page_to_iter(page, sge->offset, copy, iter); - if (!copy) - return copied ? copied : -EFAULT; - - copied += copy; - if (likely(!peek)) { - sge->offset += copy; - sge->length -= copy; - if (!msg_rx->skb) - sk_mem_uncharge(sk, copy); - msg_rx->sg.size -= copy; - - if (!sge->length) { - sk_msg_iter_var_next(i); - if (!msg_rx->skb) - put_page(page); - } - } else { - /* Lets not optimize peek case if copy_page_to_iter - * didn't copy the entire length lets just break. 
- */ - if (copy != sge->length) - return copied; - sk_msg_iter_var_next(i); - } - - if (copied == len) - break; - } while (i != msg_rx->sg.end); - - if (unlikely(peek)) { - if (msg_rx == list_last_entry(&psock->ingress_msg, - struct sk_msg, list)) - break; - msg_rx = list_next_entry(msg_rx, list); - continue; - } - - msg_rx->sg.start = i; - if (!sge->length && msg_rx->sg.start == msg_rx->sg.end) { - list_del(&msg_rx->list); - if (msg_rx->skb) - consume_skb(msg_rx->skb); - kfree(msg_rx); - } - msg_rx = list_first_entry_or_null(&psock->ingress_msg, - struct sk_msg, list); - } - - return copied; -} -EXPORT_SYMBOL_GPL(__tcp_bpf_recvmsg); - static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock, struct sk_msg *msg, u32 apply_bytes, int flags) { @@ -243,28 +163,6 @@ static bool tcp_bpf_stream_read(const struct sock *sk) return !empty; } -static int tcp_bpf_wait_data(struct sock *sk, struct sk_psock *psock, - int flags, long timeo, int *err) -{ - DEFINE_WAIT_FUNC(wait, woken_wake_function); - int ret = 0; - - if (sk->sk_shutdown & RCV_SHUTDOWN) - return 1; - - if (!timeo) - return ret; - - add_wait_queue(sk_sleep(sk), &wait); - sk_set_bit(SOCKWQ_ASYNC_WAITDATA, sk); - ret = sk_wait_event(sk, &timeo, - !list_empty(&psock->ingress_msg) || - !skb_queue_empty(&sk->sk_receive_queue), &wait); - sk_clear_bit(SOCKWQ_ASYNC_WAITDATA, sk); - remove_wait_queue(sk_sleep(sk), &wait); - return ret; -} - static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int nonblock, int flags, int *addr_len) { @@ -284,13 +182,13 @@ static int tcp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, } lock_sock(sk); msg_bytes_ready: - copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags); + copied = sk_msg_recvmsg(sk, psock, msg, len, flags); if (!copied) { int data, err = 0; long timeo; timeo = sock_rcvtimeo(sk, nonblock); - data = tcp_bpf_wait_data(sk, psock, flags, timeo, &err); + data = sk_msg_wait_data(sk, psock, flags, timeo, &err); if (data) { if 
(!sk_psock_queue_empty(psock)) goto msg_bytes_ready; diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 01d933ae5f16..1dcb34dfd56b 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -1789,8 +1789,8 @@ int tls_sw_recvmsg(struct sock *sk, skb = tls_wait_data(sk, psock, flags, timeo, &err); if (!skb) { if (psock) { - int ret = __tcp_bpf_recvmsg(sk, psock, - msg, len, flags); + int ret = sk_msg_recvmsg(sk, psock, msg, len, + flags); if (ret > 0) { decrypted += ret; From patchwork Wed Feb 3 04:16:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375848 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70145C43381 for ; Wed, 3 Feb 2021 04:21:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4D57664F67 for ; Wed, 3 Feb 2021 04:21:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233179AbhBCEU4 (ORCPT ); Tue, 2 Feb 2021 23:20:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232852AbhBCES4 (ORCPT ); Tue, 2 Feb 2021 23:18:56 -0500 Received: from mail-oo1-xc2c.google.com (mail-oo1-xc2c.google.com [IPv6:2607:f8b0:4864:20::c2c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1EC8C0617AB; Tue, 2 Feb 2021 20:17:16 -0800 
From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com,
wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 15/19] udp: implement udp_bpf_recvmsg() for sockmap Date: Tue, 2 Feb 2021 20:16:32 -0800 Message-Id: <20210203041636.38555-16-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang We have to implement udp_bpf_recvmsg() to replace the ->recvmsg() to retrieve skmsg from ingress_msg. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- net/ipv4/udp_bpf.c | 64 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 63 insertions(+), 1 deletion(-) diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c index 595836088e85..9a37ba056575 100644 --- a/net/ipv4/udp_bpf.c +++ b/net/ipv4/udp_bpf.c @@ -4,6 +4,68 @@ #include #include #include +#include + +#include "udp_impl.h" + +static struct proto *udpv6_prot_saved __read_mostly; + +static int sk_udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, + int noblock, int flags, int *addr_len) +{ +#if IS_ENABLED(CONFIG_IPV6) + if (sk->sk_family == AF_INET6) + return udpv6_prot_saved->recvmsg(sk, msg, len, noblock, flags, + addr_len); +#endif + return udp_prot.recvmsg(sk, msg, len, noblock, flags, addr_len); +} + +static int udp_bpf_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, + int nonblock, int flags, int *addr_len) +{ + struct sk_psock *psock; + int copied, ret; + + if (unlikely(flags & MSG_ERRQUEUE)) + return inet_recv_error(sk, msg, len, addr_len); + + psock = sk_psock_get(sk); + if (unlikely(!psock)) + return sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + + lock_sock(sk); + if (sk_psock_queue_empty(psock)) { + ret = sk_udp_recvmsg(sk, msg, len, nonblock, flags, 
addr_len); + goto out; + } + +msg_bytes_ready: + copied = sk_msg_recvmsg(sk, psock, msg, len, flags); + if (!copied) { + int data, err = 0; + long timeo; + + timeo = sock_rcvtimeo(sk, nonblock); + data = sk_msg_wait_data(sk, psock, flags, timeo, &err); + if (data) { + if (!sk_psock_queue_empty(psock)) + goto msg_bytes_ready; + ret = sk_udp_recvmsg(sk, msg, len, nonblock, flags, addr_len); + goto out; + } + if (err) { + ret = err; + goto out; + } + copied = -EAGAIN; + } + ret = copied; +out: + release_sock(sk); + sk_psock_put(sk, psock); + return ret; +} enum { UDP_BPF_IPV4, @@ -11,7 +73,6 @@ enum { UDP_BPF_NUM_PROTS, }; -static struct proto *udpv6_prot_saved __read_mostly; static DEFINE_SPINLOCK(udpv6_prot_lock); static struct proto udp_bpf_prots[UDP_BPF_NUM_PROTS]; @@ -20,6 +81,7 @@ static void udp_bpf_rebuild_protos(struct proto *prot, const struct proto *base) *prot = *base; prot->unhash = sock_map_unhash; prot->close = sock_map_close; + prot->recvmsg = udp_bpf_recvmsg; } static void udp_bpf_check_v6_needs_rebuild(struct proto *ops) From patchwork Wed Feb 3 04:16:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375845
From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 16/19] af_unix: implement unix_dgram_bpf_recvmsg() Date: Tue, 2 Feb 2021 20:16:33 -0800 Message-Id: <20210203041636.38555-17-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang We have to implement unix_dgram_bpf_recvmsg() to replace the original ->recvmsg() so that skmsg can be retrieved from ingress_msg. AF_UNIX is again special here because of the lack of sk_prot->recvmsg(). I simply add a special case inside unix_dgram_recvmsg() that calls sk->sk_prot->recvmsg() directly when sk->sk_prot is no longer unix_proto.
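To illustrate the special case described above, here is a minimal userspace sketch of the dispatch. It is not the kernel code: all demo_* names are hypothetical stand-ins. demo_base_proto plays the role of unix_proto (which has no ->recvmsg()), and demo_bpf_proto plays the role of the rebuilt proto that sockmap installs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock / struct proto; not kernel types. */
struct demo_sock;

struct demo_proto {
	int (*recvmsg)(struct demo_sock *sk, char *buf, size_t len);
};

struct demo_sock {
	const struct demo_proto *prot;
};

/* Plain receive path, in the role of __unix_dgram_recvmsg(). */
static int demo_plain_recvmsg(struct demo_sock *sk, char *buf, size_t len)
{
	(void)sk; (void)buf; (void)len;
	return 1;
}

/* Override, in the role of unix_dgram_bpf_recvmsg(). */
static int demo_bpf_recvmsg(struct demo_sock *sk, char *buf, size_t len)
{
	(void)sk; (void)buf; (void)len;
	return 2;
}

/* The base proto has no recvmsg hook, as assumed for unix_proto. */
static const struct demo_proto demo_base_proto = { .recvmsg = NULL };
static const struct demo_proto demo_bpf_proto = { .recvmsg = demo_bpf_recvmsg };

/*
 * Socket-level entry point: call through the proto only when it has
 * been replaced, otherwise fall back to the plain path.
 */
static int demo_recvmsg(struct demo_sock *sk, char *buf, size_t len)
{
	if (sk->prot != &demo_base_proto)
		return sk->prot->recvmsg(sk, buf, len);
	return demo_plain_recvmsg(sk, buf, len);
}

/* Usage: the same entry point reaches either path depending on prot. */
static void demo_dispatch_selfcheck(void)
{
	struct demo_sock plain = { .prot = &demo_base_proto };
	struct demo_sock redirected = { .prot = &demo_bpf_proto };
	char buf[4];

	assert(demo_recvmsg(&plain, buf, sizeof(buf)) == 1);
	assert(demo_recvmsg(&redirected, buf, sizeof(buf)) == 2);
}
```

Comparing the proto pointer, rather than testing ->recvmsg for NULL, mirrors the check in the patch and keeps the fast path for sockets that were never added to a sockmap.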
Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/net/af_unix.h | 3 +++ net/unix/af_unix.c | 21 ++++++++++++++++--- net/unix/unix_bpf.c | 49 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 70 insertions(+), 3 deletions(-) diff --git a/include/net/af_unix.h b/include/net/af_unix.h index fa75f899e88a..f6c43667e995 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -82,6 +82,9 @@ static inline struct unix_sock *unix_sk(const struct sock *sk) long unix_inq_len(struct sock *sk); long unix_outq_len(struct sock *sk); +int __unix_dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t size, + int nonblock, int flags, int *addr_len); + #ifdef CONFIG_SYSCTL int unix_sysctl_register(struct net *net); void unix_sysctl_unregister(struct net *net); diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 21c4406f879b..eebcd6f7ef88 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2094,11 +2094,11 @@ static void unix_copy_addr(struct msghdr *msg, struct sock *sk) } } -static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, - size_t size, int flags) +int __unix_dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t size, + int nonblock, int flags, int *addr_len) { struct scm_cookie scm; - struct sock *sk = sock->sk; + struct socket *sock = sk->sk_socket; struct unix_sock *u = unix_sk(sk); struct sk_buff *skb, *last; long timeo; @@ -2201,6 +2201,21 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, return err; } +static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, + int flags) +{ + struct sock *sk = sock->sk; + int addr_len = 0; + +#ifdef CONFIG_BPF_SOCK_MAP + if (sk->sk_prot != &unix_proto) + return sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT, + flags & ~MSG_DONTWAIT, &addr_len); +#endif + return __unix_dgram_recvmsg(sk, msg, size, flags & MSG_DONTWAIT, + flags, &addr_len); +} + int 
unix_read_sock(struct sock *sk, read_descriptor_t *desc, sk_read_actor_t recv_actor) { diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c index 2e6a26ec4958..570261fd18cd 100644 --- a/net/unix/unix_bpf.c +++ b/net/unix/unix_bpf.c @@ -5,6 +5,54 @@ #include #include +static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg, + size_t len, int nonblock, int flags, + int *addr_len) +{ + struct sk_psock *psock; + int copied, ret; + + psock = sk_psock_get(sk); + if (unlikely(!psock)) + return __unix_dgram_recvmsg(sk, msg, len, nonblock, flags, + addr_len); + + lock_sock(sk); + if (!skb_queue_empty(&sk->sk_receive_queue) && + sk_psock_queue_empty(psock)) { + ret = __unix_dgram_recvmsg(sk, msg, len, nonblock, flags, + addr_len); + goto out; + } + +msg_bytes_ready: + copied = sk_msg_recvmsg(sk, psock, msg, len, flags); + if (!copied) { + int data, err = 0; + long timeo; + + timeo = sock_rcvtimeo(sk, nonblock); + data = sk_msg_wait_data(sk, psock, flags, timeo, &err); + if (data) { + if (!sk_psock_queue_empty(psock)) + goto msg_bytes_ready; + ret = __unix_dgram_recvmsg(sk, msg, len, nonblock, + flags, addr_len); + goto out; + } + if (err) { + ret = err; + goto out; + } + copied = -EAGAIN; + } + ret = copied; +out: + release_sock(sk); + sk_psock_put(sk, psock); + return ret; +} + static struct proto *unix_prot_saved __read_mostly; static DEFINE_SPINLOCK(unix_prot_lock); static struct proto unix_bpf_prot; @@ -13,6 +61,7 @@ static void unix_bpf_rebuild_protos(struct proto *prot, const struct proto *base { *prot = *base; prot->close = sock_map_close; + prot->recvmsg = unix_dgram_bpf_recvmsg; } static void unix_bpf_check_needs_rebuild(struct proto *ops) From patchwork Wed Feb 3 04:16:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375846
From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 19/19] selftests/bpf: add test case for redirection between udp and unix Date: Tue, 2 Feb 2021 20:16:36 -0800 Message-Id: <20210203041636.38555-20-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- .../selftests/bpf/prog_tests/sockmap_listen.c | 226 ++++++++++++++++++ 1 file changed, 226
insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c index 8f52302165a6..e0c2a0a4f501 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c @@ -1798,6 +1798,228 @@ static void test_unix_redir(struct test_sockmap_listen *skel, struct bpf_map *ma unix_skb_redir_to_connected(skel, map, sotype); } +static void udp_unix_redir_to_connected(int family, int sock_mapfd, + int verd_mapfd, enum redir_mode mode) +{ + const char *log_prefix = redir_mode_str(mode); + struct sockaddr_storage addr; + int c0, c1, p0, p1; + unsigned int pass; + socklen_t len; + int err, n; + int sfd[2]; + u64 value; + u32 key; + char b; + + zero_verdict_count(verd_mapfd); + + if (socketpair(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, sfd)) + return; + c0 = sfd[0], p0 = sfd[1]; + + p1 = socket_loopback(family, SOCK_DGRAM | SOCK_NONBLOCK); + if (p1 < 0) + goto close; + len = sizeof(addr); + err = xgetsockname(p1, sockaddr(&addr), &len); + if (err) + goto close_peer1; + + c1 = xsocket(family, SOCK_DGRAM | SOCK_NONBLOCK, 0); + if (c1 < 0) + goto close_peer1; + err = xconnect(c1, sockaddr(&addr), len); + if (err) + goto close_cli1; + err = xgetsockname(c1, sockaddr(&addr), &len); + if (err) + goto close_cli1; + err = xconnect(p1, sockaddr(&addr), len); + if (err) + goto close_cli1; + + key = 0; + value = p0; + err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST); + if (err) + goto close_cli1; + + key = 1; + value = p1; + err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST); + if (err) + goto close_cli1; + + n = write(c1, "a", 1); + if (n < 0) + FAIL_ERRNO("%s: write", log_prefix); + if (n == 0) + FAIL("%s: incomplete write", log_prefix); + if (n < 1) + goto close_cli1; + + key = SK_PASS; + err = xbpf_map_lookup_elem(verd_mapfd, &key, &pass); + if (err) + goto close_cli1; + if (pass != 1) + FAIL("%s: want pass count 1,
have %d", log_prefix, pass); + + n = read(mode == REDIR_INGRESS ? p0 : c0, &b, 1); + if (n < 0) + FAIL_ERRNO("%s: read", log_prefix); + if (n == 0) + FAIL("%s: incomplete read", log_prefix); + +close_cli1: + xclose(c1); +close_peer1: + xclose(p1); +close: + xclose(c0); + xclose(p0); +} + +static void udp_unix_skb_redir_to_connected(struct test_sockmap_listen *skel, + struct bpf_map *inner_map, int family) +{ + int verdict = bpf_program__fd(skel->progs.prog_skb_verdict); + int verdict_map = bpf_map__fd(skel->maps.verdict_map); + int sock_map = bpf_map__fd(inner_map); + int err; + + err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0); + if (err) + return; + + skel->bss->test_ingress = false; + udp_unix_redir_to_connected(family, sock_map, verdict_map, REDIR_EGRESS); + skel->bss->test_ingress = true; + udp_unix_redir_to_connected(family, sock_map, verdict_map, REDIR_INGRESS); + + xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT); +} + +static void unix_udp_redir_to_connected(int family, int sock_mapfd, + int verd_mapfd, enum redir_mode mode) +{ + const char *log_prefix = redir_mode_str(mode); + struct sockaddr_storage addr; + int c0, c1, p0, p1; + unsigned int pass; + socklen_t len; + int err, n; + int sfd[2]; + u64 value; + u32 key; + char b; + + zero_verdict_count(verd_mapfd); + + p0 = socket_loopback(family, SOCK_DGRAM | SOCK_NONBLOCK); + if (p0 < 0) + return; + len = sizeof(addr); + err = xgetsockname(p0, sockaddr(&addr), &len); + if (err) + goto close_peer0; + + c0 = xsocket(family, SOCK_DGRAM | SOCK_NONBLOCK, 0); + if (c0 < 0) + goto close_peer0; + err = xconnect(c0, sockaddr(&addr), len); + if (err) + goto close_cli0; + err = xgetsockname(c0, sockaddr(&addr), &len); + if (err) + goto close_cli0; + err = xconnect(p0, sockaddr(&addr), len); + if (err) + goto close_cli0; + + if (socketpair(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, sfd)) + goto close_cli0; + c1 = sfd[0], p1 = sfd[1]; + + key = 0; + value = p0; + err = 
xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST); + if (err) + goto close; + + key = 1; + value = p1; + err = xbpf_map_update_elem(sock_mapfd, &key, &value, BPF_NOEXIST); + if (err) + goto close; + + n = write(c1, "a", 1); + if (n < 0) + FAIL_ERRNO("%s: write", log_prefix); + if (n == 0) + FAIL("%s: incomplete write", log_prefix); + if (n < 1) + goto close; + + key = SK_PASS; + err = xbpf_map_lookup_elem(verd_mapfd, &key, &pass); + if (err) + goto close; + if (pass != 1) + FAIL("%s: want pass count 1, have %d", log_prefix, pass); + + n = read(mode == REDIR_INGRESS ? p0 : c0, &b, 1); + if (n < 0) + FAIL_ERRNO("%s: read", log_prefix); + if (n == 0) + FAIL("%s: incomplete read", log_prefix); + +close: + xclose(c1); + xclose(p1); +close_cli0: + xclose(c0); +close_peer0: + xclose(p0); + +} + +static void unix_udp_skb_redir_to_connected(struct test_sockmap_listen *skel, + struct bpf_map *inner_map, int family) +{ + int verdict = bpf_program__fd(skel->progs.prog_skb_verdict); + int verdict_map = bpf_map__fd(skel->maps.verdict_map); + int sock_map = bpf_map__fd(inner_map); + int err; + + err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0); + if (err) + return; + + skel->bss->test_ingress = false; + unix_udp_redir_to_connected(family, sock_map, verdict_map, REDIR_EGRESS); + skel->bss->test_ingress = true; + unix_udp_redir_to_connected(family, sock_map, verdict_map, REDIR_INGRESS); + + xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT); +} + +static void test_udp_unix_redir(struct test_sockmap_listen *skel, struct bpf_map *map, + int family) +{ + const char *family_name, *map_name; + char s[MAX_TEST_NAME]; + + family_name = family_str(family); + map_name = map_type_str(map); + snprintf(s, sizeof(s), "%s %s %s", map_name, family_name, __func__); + if (!test__start_subtest(s)) + return; + udp_unix_skb_redir_to_connected(skel, map, family); + unix_udp_skb_redir_to_connected(skel, map, family); +} + static void test_reuseport(struct 
test_sockmap_listen *skel, struct bpf_map *map, int family, int sotype) { @@ -1864,6 +2086,8 @@ void test_sockmap_listen(void) test_udp_redir(skel, skel->maps.sock_map, AF_INET); test_udp_redir(skel, skel->maps.sock_map, AF_INET6); test_unix_redir(skel, skel->maps.sock_map, SOCK_DGRAM); + test_udp_unix_redir(skel, skel->maps.sock_map, AF_INET); + test_udp_unix_redir(skel, skel->maps.sock_map, AF_INET6); skel->bss->test_sockmap = false; run_tests(skel, skel->maps.sock_hash, AF_INET); @@ -1871,6 +2095,8 @@ void test_sockmap_listen(void) test_udp_redir(skel, skel->maps.sock_hash, AF_INET); test_udp_redir(skel, skel->maps.sock_hash, AF_INET6); test_unix_redir(skel, skel->maps.sock_hash, SOCK_DGRAM); + test_udp_unix_redir(skel, skel->maps.sock_hash, AF_INET); + test_udp_unix_redir(skel, skel->maps.sock_hash, AF_INET6); test_sockmap_listen__destroy(skel); }
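For readers who want to try the socket setup outside the selftest harness, the AF_UNIX half of udp_unix_redir_to_connected() can be sketched in plain userspace C. This is only an illustration under stated assumptions: demo_unix_dgram_roundtrip() is a hypothetical helper, no sockmap or BPF program is involved, and the byte simply travels from one end of the socketpair to the other instead of being redirected.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one byte across a non-blocking AF_UNIX
 * datagram socketpair and read it back from the peer.
 * Returns 0 on success, -1 on any failure.
 */
static int demo_unix_dgram_roundtrip(char sent, char *got)
{
	int sfd[2];
	int ret = -1;
	ssize_t n;

	/* Same socket type and flags as the selftest uses for sfd[]. */
	if (socketpair(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0, sfd))
		return -1;

	n = write(sfd[0], &sent, 1);
	if (n != 1)
		goto out;

	/* The datagram is already queued on the peer, so this does not block. */
	n = read(sfd[1], got, 1);
	if (n != 1)
		goto out;

	ret = 0;
out:
	close(sfd[0]);
	close(sfd[1]);
	return ret;
}

/* Usage: the byte written on one end is the byte read on the other. */
static void demo_roundtrip_selfcheck(void)
{
	char got = 0;

	assert(demo_unix_dgram_roundtrip('a', &got) == 0);
	assert(got == 'a');
}
```

In the selftest itself the read happens on a different socket (p0 or c0, depending on the redirect mode), because the verdict program moves the skb between the UDP and AF_UNIX sockets stored in the map.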