From patchwork Wed Feb 3 04:16:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 375851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5834FC433E6 for ; Wed, 3 Feb 2021 04:19:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1E92B64E4D for ; Wed, 3 Feb 2021 04:19:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232920AbhBCETP (ORCPT ); Tue, 2 Feb 2021 23:19:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36704 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232731AbhBCESR (ORCPT ); Tue, 2 Feb 2021 23:18:17 -0500 Received: from mail-oi1-x22b.google.com (mail-oi1-x22b.google.com [IPv6:2607:f8b0:4864:20::22b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA3B7C061788; Tue, 2 Feb 2021 20:17:01 -0800 (PST) Received: by mail-oi1-x22b.google.com with SMTP id x71so25345614oia.9; Tue, 02 Feb 2021 20:17:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=SP/1stuU44pVBk7OBBQnq1cmGdnBzdf1fs54qnZXI28=; b=LUc4vFqaH026CJlyG/ik+ARGw3xO0txEaARW0fIlhtbGytcVs2aWJ0QxIWwrMMDqfF pbxLHHx61mDNswnOt1+rzimONAYwkYMEE5/SG3Via5jgdpnVUfXNNWg8+6F4WEFQuUZy /WFqWYqIdYeFusQMQfvvPu4u/cGyX3EJVBcKBafttplcVbevt/EgU5nte/MViGe1khgR 5kISf0TD92mztcofBpMQMrpa3mYGbHGQM000FntbJjMe8e+Wnx0WadaeiMhM/YKGEM+B IauOdBBBZSSaCZYmMIGtwaDhFNqFhFkHqwj0y7jeyQl2oRuA4MpTZMvN7vMPXsT6CLUC JMCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SP/1stuU44pVBk7OBBQnq1cmGdnBzdf1fs54qnZXI28=; b=t8f99NPauMAq/i3fP5Ll1CPB7cUNR4qxrnZZUbCH/VPYR1XV99iCNPTlWZxke5t1Ra ZDtTqCYARFsxo3ZB6au5d44t8nhFKwbcPouYAT3hSdtdv0QcPFpQxqhA8PNqyAaPMhCs MLlypfgyMtwV2Bly0haHKxtI6Vm6ltpU/u5s7d6kL8DHWwhxEMCM+kGgDUCG5g6kRzd8 tzU+zmmF13t9NDAwcrPm8saL1l/et/bNH6IE2IKMbwaumWLRdvRb0qOd/QE9ui5fmuNZ XaCRwm/PLiiqYEOi5hIOB2iYXRoPPcVSlaoLxEiWpL9Vh6Mvlq6Qdcf3iY90OG0FPQkE iSjg== X-Gm-Message-State: AOAM532zpw5Ua78wZKDA302cIdYvDVXN4y6ietSKQQM0ZH47hmNWpKbR jyf+A1UCJWt3zmOMwLIPH2PoRWBV+BmCVg== X-Google-Smtp-Source: ABdhPJyiCpNN6z8AhRrFCetB7GZ2hLGoIiiShe30NMW5a1OooszLg9txuOvFWG/lZV5l8I8lEDuaHg== X-Received: by 2002:aca:e103:: with SMTP id y3mr783973oig.11.1612325821171; Tue, 02 Feb 2021 20:17:01 -0800 (PST) Received: from unknown.attlocal.net ([2600:1700:65a0:ab60:90c4:ffea:6079:8a0c]) by smtp.gmail.com with ESMTPSA id s10sm209978ool.35.2021.02.02.20.16.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Feb 2021 20:17:00 -0800 (PST) From: Cong Wang To: netdev@vger.kernel.org Cc: bpf@vger.kernel.org, duanxiongchun@bytedance.com, wangdongdong.6@bytedance.com, jiang.wang@bytedance.com, Cong Wang , John Fastabend , Daniel Borkmann , Jakub Sitnicki , Lorenz Bauer Subject: [Patch bpf-next 05/19] sock_map: introduce BPF_SK_SKB_VERDICT Date: Tue, 2 Feb 2021 20:16:22 -0800 Message-Id: <20210203041636.38555-6-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203041636.38555-1-xiyou.wangcong@gmail.com> References: <20210203041636.38555-1-xiyou.wangcong@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang I was planning to reuse BPF_SK_SKB_STREAM_VERDICT but its name is confusing and more importantly it seems kTLS relies on it to deliver sk_msg too. To avoid messing up kTLS, we can just reuse the stream verdict code but introduce a new type of eBPF program, skb_verdict. Users are not allowed to set stream_verdict and skb_verdict at the same time. Cc: John Fastabend Cc: Daniel Borkmann Cc: Jakub Sitnicki Cc: Lorenz Bauer Signed-off-by: Cong Wang --- include/linux/skmsg.h | 3 +++ include/uapi/linux/bpf.h | 1 + kernel/bpf/syscall.c | 1 + net/core/skmsg.c | 4 +++- net/core/sock_map.c | 23 ++++++++++++++++++++++- tools/bpf/bpftool/common.c | 1 + tools/bpf/bpftool/prog.c | 1 + tools/include/uapi/linux/bpf.h | 1 + 8 files changed, 33 insertions(+), 2 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index 218566ac4fa1..cb79b1afa556 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -58,6 +58,7 @@ struct sk_psock_progs { struct bpf_prog *msg_parser; struct bpf_prog *stream_parser; struct bpf_prog *stream_verdict; + struct bpf_prog *skb_verdict; }; enum sk_psock_state_bits { @@ -428,6 +429,7 @@ static inline void psock_progs_drop(struct sk_psock_progs *progs) psock_set_prog(&progs->msg_parser, NULL); psock_set_prog(&progs->stream_parser, NULL); psock_set_prog(&progs->stream_verdict, NULL); + psock_set_prog(&progs->skb_verdict, NULL); } int sk_psock_tls_strp_read(struct sk_psock *psock, struct sk_buff *skb); @@ -482,5 +484,6 @@ void skb_bpf_ext_redirect_clear(struct sk_buff *skb) ext->flags = 0; ext->sk_redir = NULL; } + #endif /* CONFIG_NET_SOCK_MSG */ #endif /* _LINUX_SKMSG_H */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c001766adcbc..c1a412ebfb08 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -247,6 +247,7 @@ enum bpf_attach_type { BPF_XDP_CPUMAP, BPF_SK_LOOKUP, BPF_XDP, + BPF_SK_SKB_VERDICT, __MAX_BPF_ATTACH_TYPE }; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index e5999d86c76e..a56549fc2825 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2936,6 +2936,7 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type) return BPF_PROG_TYPE_SK_MSG; case BPF_SK_SKB_STREAM_PARSER: case BPF_SK_SKB_STREAM_VERDICT: + case BPF_SK_SKB_VERDICT: return BPF_PROG_TYPE_SK_SKB; case BPF_LIRC_MODE2: return BPF_PROG_TYPE_LIRC_MODE2; diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 51446fe63be5..ecbd6f0d49a5 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -688,7 +688,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock) rcu_assign_sk_user_data(sk, NULL); if (psock->progs.stream_parser) sk_psock_stop_strp(sk, psock); - else if (psock->progs.stream_verdict) + else if (psock->progs.stream_verdict || psock->progs.skb_verdict) sk_psock_stop_verdict(sk, psock); write_unlock_bh(&sk->sk_callback_lock); sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED); @@ -966,6 +966,8 @@ static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb, } prog = READ_ONCE(psock->progs.stream_verdict); + if (!prog) + prog = READ_ONCE(psock->progs.skb_verdict); if (likely(prog)) { skb_bpf_ext_redirect_clear(skb); ret = sk_psock_bpf_run(psock, prog, skb); diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 521663582982..f827f1ecefcc 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -152,6 +152,8 @@ static void sock_map_del_link(struct sock *sk, strp_stop = true; if (psock->bpf_running && stab->progs.stream_verdict) verdict_stop = true; + if (psock->bpf_running && stab->progs.skb_verdict) + verdict_stop = true; list_del(&link->list); sk_psock_free_link(link); } @@ -224,7 +226,7 @@ static struct sk_psock *sock_map_psock_get_checked(struct sock *sk) static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, struct sock *sk) { - struct bpf_prog *msg_parser, *stream_parser, *stream_verdict; + struct bpf_prog *msg_parser, *stream_parser, *stream_verdict, *skb_verdict; struct sk_psock *psock; int ret; @@ -253,6 +255,15 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, } } + skb_verdict = READ_ONCE(progs->skb_verdict); + if (skb_verdict) { + skb_verdict = bpf_prog_inc_not_zero(skb_verdict); + if (IS_ERR(skb_verdict)) { + ret = PTR_ERR(skb_verdict); + goto out_put_msg_parser; + } + } + psock = sock_map_psock_get_checked(sk); if (IS_ERR(psock)) { ret = PTR_ERR(psock); @@ -262,6 +273,7 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, if (psock) { if ((msg_parser && READ_ONCE(psock->progs.msg_parser)) || (stream_parser && READ_ONCE(psock->progs.stream_parser)) || + (skb_verdict && READ_ONCE(psock->progs.skb_verdict)) || (stream_verdict && READ_ONCE(psock->progs.stream_verdict))) { sk_psock_put(sk, psock); ret = -EBUSY; @@ -293,6 +305,9 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, } else if (!stream_parser && stream_verdict && !psock->bpf_running) { psock_set_prog(&psock->progs.stream_verdict, stream_verdict); sk_psock_start_verdict(sk,psock); + } else if (!stream_verdict && skb_verdict && !psock->bpf_running) { + psock_set_prog(&psock->progs.skb_verdict, skb_verdict); + sk_psock_start_verdict(sk, psock); } write_unlock_bh(&sk->sk_callback_lock); return 0; @@ -301,6 +316,9 @@ static int sock_map_link(struct bpf_map *map, struct sk_psock_progs *progs, out_drop: sk_psock_put(sk, psock); out_progs: + if (skb_verdict) + bpf_prog_put(skb_verdict); +out_put_msg_parser: if (msg_parser) bpf_prog_put(msg_parser); out_put_stream_parser: @@ -1467,6 +1485,9 @@ int sock_map_prog_update(struct bpf_map *map, struct bpf_prog *prog, case BPF_SK_SKB_STREAM_VERDICT: pprog = &progs->stream_verdict; break; + case BPF_SK_SKB_VERDICT: + pprog = &progs->skb_verdict; + break; default: return -EOPNOTSUPP; } diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c index 65303664417e..1828bba19020 100644 --- a/tools/bpf/bpftool/common.c +++ b/tools/bpf/bpftool/common.c @@ -57,6 +57,7 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = { [BPF_SK_SKB_STREAM_PARSER] = "sk_skb_stream_parser", [BPF_SK_SKB_STREAM_VERDICT] = "sk_skb_stream_verdict", + [BPF_SK_SKB_VERDICT] = "sk_skb_verdict", [BPF_SK_MSG_VERDICT] = "sk_msg_verdict", [BPF_LIRC_MODE2] = "lirc_mode2", [BPF_FLOW_DISSECTOR] = "flow_dissector", diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 1fe3ba255bad..a78d8c03b7ea 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -76,6 +76,7 @@ enum dump_mode { static const char * const attach_type_strings[] = { [BPF_SK_SKB_STREAM_PARSER] = "stream_parser", [BPF_SK_SKB_STREAM_VERDICT] = "stream_verdict", + [BPF_SK_SKB_VERDICT] = "skb_verdict", [BPF_SK_MSG_VERDICT] = "msg_verdict", [BPF_FLOW_DISSECTOR] = "flow_dissector", [__MAX_BPF_ATTACH_TYPE] = NULL, diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index c001766adcbc..c1a412ebfb08 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -247,6 +247,7 @@ enum bpf_attach_type { BPF_XDP_CPUMAP, BPF_SK_LOOKUP, BPF_XDP, + BPF_SK_SKB_VERDICT, __MAX_BPF_ATTACH_TYPE };