From patchwork Mon Jan 25 12:45:11 2021
X-Patchwork-Submitter: Hangbin Liu
X-Patchwork-Id: 371158
From: Hangbin Liu
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, Toke Høiland-Jørgensen, Jiri Benc,
    Jesper Dangaard Brouer, Eelco Chaudron, ast@kernel.org, Daniel Borkmann,
    Lorenzo Bianconi, David Ahern, Andrii Nakryiko, Alexei Starovoitov,
    John Fastabend, Maciej Fijalkowski, Hangbin Liu
Subject: [PATCHv17 bpf-next 1/6] bpf: run devmap xdp_prog on flush instead of bulk enqueue
Date: Mon, 25 Jan 2021 20:45:11 +0800
Message-Id: <20210125124516.3098129-2-liuhangbin@gmail.com>
In-Reply-To: <20210125124516.3098129-1-liuhangbin@gmail.com>
References: <20210122074652.2981711-1-liuhangbin@gmail.com>
 <20210125124516.3098129-1-liuhangbin@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

From: Jesper Dangaard Brouer

This changes the devmap XDP program support to run the program when the
bulk queue is flushed instead of before the frame is enqueued. This has
a couple of benefits:

- It "sorts" the packets by destination devmap entry and then runs the
  same BPF program on all the packets in sequence. This ensures that we
  keep the XDP program and destination device properties hot in I-cache.

- It makes the multicast implementation simpler because it can just
  enqueue packets using bq_enqueue() without having to deal with the
  devmap program at all.

The drawback is that if the devmap program drops the packet, the enqueue
step is redundant. However, arguably this is mostly visible in a
micro-benchmark, and with more mixed traffic the I-cache benefit should
win out.

The bq_xmit_all() logic is also refactored and the error label is
removed. When bq_xmit_all() is called from bq_enqueue(), another packet
will always be enqueued immediately after, so clearing dev_rx, xdp_prog
and flush_node in bq_xmit_all() is redundant. Let's move the clearing to
__dev_flush(), and only check the fields once in bq_enqueue() since they
are all modified together.

The performance impact of just this patch is as follows, using
xdp_redirect_map from samples/bpf and sending packets via the pktgen
command:

 ./pktgen_sample03_burst_single_flow.sh -i eno1 -d $dst_ip -m $dst_mac -t 10 -s 64

There is about +/- 0.1M deviation in the native tests; performance
improves for the base case but drops back somewhat with an xdp devmap
prog attached.

Version          | Test                        | Generic | Native | Native + 2nd xdp_prog
5.10 rc6         | xdp_redirect_map i40e->i40e |  2.0M   |  9.1M  |  8.0M
5.10 rc6         | xdp_redirect_map i40e->veth |  1.7M   | 11.0M  |  9.7M
5.10 rc6 + patch | xdp_redirect_map i40e->i40e |  2.0M   |  9.5M  |  7.5M
5.10 rc6 + patch | xdp_redirect_map i40e->veth |  1.7M   | 11.6M  |  9.1M

[1] https://lore.kernel.org/bpf/20210122025007.2968381-1-liuhangbin@gmail.com

Reviewed-by: Maciej Fijalkowski
Acked-by: Toke Høiland-Jørgensen
Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: Hangbin Liu
Acked-by: John Fastabend
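
[ Editor's note: to make the new control flow easier to follow, below is a
  minimal, self-contained C sketch of the "filter the whole bulk at flush
  time, then transmit the survivors" idea. It is an illustration only, not
  part of the patch; the names merely mirror bq_xmit_all() and
  dev_map_bpf_prog_run(), and the types are simplified placeholders. ]

/* Illustration only: simplified stand-ins, not the kernel types or APIs. */
struct frame { int len; };

struct bulk_queue {
	struct frame *q[16];
	int count;
	int (*xdp_prog)(struct frame *);	/* devmap program; may be NULL */
};

/* Mirrors dev_map_bpf_prog_run(): run the devmap program once over the
 * whole bulk and keep only the frames it passes.
 */
static int run_prog_on_bulk(struct bulk_queue *bq)
{
	int i, nframes = 0;

	if (!bq->xdp_prog)
		return bq->count;

	for (i = 0; i < bq->count; i++) {
		if (bq->xdp_prog(bq->q[i]))	/* "XDP_PASS" */
			bq->q[nframes++] = bq->q[i];
		/* dropped frames would be returned to the frame pool here */
	}
	return nframes;
}

/* Mirrors the refactored bq_xmit_all(): filter at flush time, then
 * transmit whatever survived and reset the queue.
 */
static void flush_bulk(struct bulk_queue *bq,
		       int (*xmit)(struct frame **q, int n))
{
	int to_send = run_prog_on_bulk(bq);

	if (to_send)
		xmit(bq->q, to_send);
	bq->count = 0;
}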
---
v17: a) rename to_sent to to_send.
     b) clear bq dev_rx, xdp_prog and flush_node in __dev_flush().
v16: a) refactor bq_xmit_all logic and remove the error label
v15: a) do not use unlikely when checking bq->xdp_prog
     b) return sent frames for dev_map_bpf_prog_run()
v14: no update, only rebase the code
v13: pass in xdp_prog through __xdp_enqueue()
v2-v12: this patch was not present
---
 kernel/bpf/devmap.c | 146 +++++++++++++++++++++++++-------------------
 1 file changed, 84 insertions(+), 62 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index f6e9c68afdd4..bf8b6b5c9cab 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -57,6 +57,7 @@ struct xdp_dev_bulk_queue {
 	struct list_head flush_node;
 	struct net_device *dev;
 	struct net_device *dev_rx;
+	struct bpf_prog *xdp_prog;
 	unsigned int count;
 };
 
@@ -327,46 +328,92 @@ bool dev_map_can_have_prog(struct bpf_map *map)
 	return false;
 }
 
+static int dev_map_bpf_prog_run(struct bpf_prog *xdp_prog,
+				struct xdp_frame **frames, int n,
+				struct net_device *dev)
+{
+	struct xdp_txq_info txq = { .dev = dev };
+	struct xdp_buff xdp;
+	int i, nframes = 0;
+
+	for (i = 0; i < n; i++) {
+		struct xdp_frame *xdpf = frames[i];
+		u32 act;
+		int err;
+
+		xdp_convert_frame_to_buff(xdpf, &xdp);
+		xdp.txq = &txq;
+
+		act = bpf_prog_run_xdp(xdp_prog, &xdp);
+		switch (act) {
+		case XDP_PASS:
+			err = xdp_update_frame_from_buff(&xdp, xdpf);
+			if (unlikely(err < 0))
+				xdp_return_frame_rx_napi(xdpf);
+			else
+				frames[nframes++] = xdpf;
+			break;
+		default:
+			bpf_warn_invalid_xdp_action(act);
+			fallthrough;
+		case XDP_ABORTED:
+			trace_xdp_exception(dev, xdp_prog, act);
+			fallthrough;
+		case XDP_DROP:
+			xdp_return_frame_rx_napi(xdpf);
+			break;
+		}
+	}
+	return nframes; /* sent frames count */
+}
+
 static void bq_xmit_all(struct xdp_dev_bulk_queue *bq, u32 flags)
 {
 	struct net_device *dev = bq->dev;
-	int sent = 0, drops = 0, err = 0;
+	unsigned int cnt = bq->count;
+	int drops = 0, err = 0;
+	int to_send = cnt;
+	int sent = cnt;
 	int i;
 
-	if (unlikely(!bq->count))
+	if (unlikely(!cnt))
 		return;
 
-	for (i = 0; i < bq->count; i++) {
+	for (i = 0; i < cnt; i++) {
 		struct xdp_frame *xdpf = bq->q[i];
 
 		prefetch(xdpf);
 	}
 
-	sent = dev->netdev_ops->ndo_xdp_xmit(dev, bq->count, bq->q, flags);
+	if (bq->xdp_prog) {
+		to_send = dev_map_bpf_prog_run(bq->xdp_prog, bq->q, cnt, dev);
+		if (!to_send) {
+			sent = 0;
+			goto out;
+		}
+		drops = cnt - to_send;
+	}
+
+	sent = dev->netdev_ops->ndo_xdp_xmit(dev, to_send, bq->q, flags);
 	if (sent < 0) {
 		err = sent;
 		sent = 0;
-		goto error;
+
+		/* If ndo_xdp_xmit fails with an errno, no frames have been
+		 * xmit'ed and it's our responsibility to them free all.
+		 */
+		for (i = 0; i < cnt - drops; i++) {
+			struct xdp_frame *xdpf = bq->q[i];
+
+			xdp_return_frame_rx_napi(xdpf);
+		}
 	}
-	drops = bq->count - sent;
 out:
+	drops = cnt - sent;
 	bq->count = 0;
 
 	trace_xdp_devmap_xmit(bq->dev_rx, dev, sent, drops, err);
-	bq->dev_rx = NULL;
-	__list_del_clearprev(&bq->flush_node);
 	return;
-error:
-	/* If ndo_xdp_xmit fails with an errno, no frames have been
-	 * xmit'ed and it's our responsibility to them free all.
-	 */
-	for (i = 0; i < bq->count; i++) {
-		struct xdp_frame *xdpf = bq->q[i];
-
-		xdp_return_frame_rx_napi(xdpf);
-		drops++;
-	}
-	goto out;
 }
 
 /* __dev_flush is called from xdp_do_flush() which _must_ be signaled
@@ -384,8 +431,12 @@ void __dev_flush(void)
 {
 	struct list_head *flush_list = this_cpu_ptr(&dev_flush_list);
 	struct xdp_dev_bulk_queue *bq, *tmp;
 
-	list_for_each_entry_safe(bq, tmp, flush_list, flush_node)
+	list_for_each_entry_safe(bq, tmp, flush_list, flush_node) {
 		bq_xmit_all(bq, XDP_XMIT_FLUSH);
+		bq->dev_rx = NULL;
+		bq->xdp_prog = NULL;
+		__list_del_clearprev(&bq->flush_node);
+	}
 }
 
 /* rcu_read_lock (from syscall and BPF contexts) ensures that if a delete and/or
@@ -408,7 +459,7 @@ struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key)
  * Thus, safe percpu variable access.
  */
 static void bq_enqueue(struct net_device *dev, struct xdp_frame *xdpf,
-		       struct net_device *dev_rx)
+		       struct net_device *dev_rx, struct bpf_prog *xdp_prog)
 {
 	struct list_head *flush_list = this_cpu_ptr(&dev_flush_list);
 	struct xdp_dev_bulk_queue *bq = this_cpu_ptr(dev->xdp_bulkq);
@@ -419,18 +470,22 @@ static void bq_enqueue(struct net_device *dev, struct xdp_frame *xdpf,
 	/* Ingress dev_rx will be the same for all xdp_frame's in
 	 * bulk_queue, because bq stored per-CPU and must be flushed
 	 * from net_device drivers NAPI func end.
+	 *
+	 * Do the same with xdp_prog and flush_list since these fields
+	 * are only ever modified together.
 	 */
-	if (!bq->dev_rx)
+	if (!bq->dev_rx) {
 		bq->dev_rx = dev_rx;
+		bq->xdp_prog = xdp_prog;
+		list_add(&bq->flush_node, flush_list);
+	}
 
 	bq->q[bq->count++] = xdpf;
-
-	if (!bq->flush_node.prev)
-		list_add(&bq->flush_node, flush_list);
 }
 
 static inline int __xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
-				struct net_device *dev_rx)
+				struct net_device *dev_rx,
+				struct bpf_prog *xdp_prog)
 {
 	struct xdp_frame *xdpf;
 	int err;
@@ -446,42 +501,14 @@ static inline int __xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
 	if (unlikely(!xdpf))
 		return -EOVERFLOW;
 
-	bq_enqueue(dev, xdpf, dev_rx);
+	bq_enqueue(dev, xdpf, dev_rx, xdp_prog);
 	return 0;
 }
 
-static struct xdp_buff *dev_map_run_prog(struct net_device *dev,
-					 struct xdp_buff *xdp,
-					 struct bpf_prog *xdp_prog)
-{
-	struct xdp_txq_info txq = { .dev = dev };
-	u32 act;
-
-	xdp_set_data_meta_invalid(xdp);
-	xdp->txq = &txq;
-
-	act = bpf_prog_run_xdp(xdp_prog, xdp);
-	switch (act) {
-	case XDP_PASS:
-		return xdp;
-	case XDP_DROP:
-		break;
-	default:
-		bpf_warn_invalid_xdp_action(act);
-		fallthrough;
-	case XDP_ABORTED:
-		trace_xdp_exception(dev, xdp_prog, act);
-		break;
-	}
-
-	xdp_return_buff(xdp);
-	return NULL;
-}
-
 int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
 		    struct net_device *dev_rx)
 {
-	return __xdp_enqueue(dev, xdp, dev_rx);
+	return __xdp_enqueue(dev, xdp, dev_rx, NULL);
 }
 
 int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
@@ -489,12 +516,7 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
 {
 	struct net_device *dev = dst->dev;
 
-	if (dst->xdp_prog) {
-		xdp = dev_map_run_prog(dev, xdp, dst->xdp_prog);
-		if (!xdp)
-			return 0;
-	}
-	return __xdp_enqueue(dev, xdp, dev_rx);
+	return __xdp_enqueue(dev, xdp, dev_rx, dst->xdp_prog);
 }
 
 int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,

From patchwork Thu Feb 4 14:03:13 2021
X-Patchwork-Submitter: Hangbin Liu
X-Patchwork-Id: 376746
From: Hangbin Liu
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, Toke Høiland-Jørgensen, Jiri Benc,
    Jesper Dangaard Brouer, Eelco Chaudron, ast@kernel.org, Daniel Borkmann,
    Lorenzo Bianconi, David Ahern, Andrii Nakryiko, Alexei Starovoitov,
    John Fastabend, Maciej Fijalkowski, Hangbin Liu
Subject: [PATCHv18 bpf-next 2/6] bpf: add a new bpf argument type ARG_CONST_MAP_PTR_OR_NULL
Date: Thu, 4 Feb 2021 22:03:13 +0800
Message-Id: <20210204140317.384296-3-liuhangbin@gmail.com>
In-Reply-To: <20210204140317.384296-1-liuhangbin@gmail.com>
References: <20210125124516.3098129-1-liuhangbin@gmail.com>
 <20210204140317.384296-1-liuhangbin@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

Add a new bpf argument type, ARG_CONST_MAP_PTR_OR_NULL, which can be used
when we want to allow a NULL pointer for a map parameter. A bpf helper
that uses this type needs to take care and check whether the map is NULL.

Acked-by: Toke Høiland-Jørgensen
Acked-by: John Fastabend
Signed-off-by: Hangbin Liu

---
v13-v18: no update
v11-v12: rebase the patch to latest bpf-next
v10: remove the useless CONST_PTR_TO_MAP_OR_NULL and the copy-pasted comment.
v9: merge the patch from [1] into this series.
v1-v8: this patch was not present

[1] https://lore.kernel.org/bpf/20200715070001.2048207-1-liuhangbin@gmail.com/
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 10 ++++++----
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 321966fc35db..b0777c8c03fd 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -296,6 +296,7 @@ enum bpf_arg_type {
 	ARG_CONST_ALLOC_SIZE_OR_ZERO,	/* number of allocated bytes requested */
 	ARG_PTR_TO_BTF_ID_SOCK_COMMON,	/* pointer to in-kernel sock_common or bpf-mirrored bpf_sock */
 	ARG_PTR_TO_PERCPU_BTF_ID,	/* pointer to in-kernel percpu type */
+	ARG_CONST_MAP_PTR_OR_NULL,	/* const argument used as pointer to bpf_map or NULL */
 	__BPF_ARG_TYPE_MAX,
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 5e09632efddb..50a17f80358b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -445,7 +445,8 @@ static bool arg_type_may_be_null(enum bpf_arg_type type)
 	       type == ARG_PTR_TO_MEM_OR_NULL ||
 	       type == ARG_PTR_TO_CTX_OR_NULL ||
 	       type == ARG_PTR_TO_SOCKET_OR_NULL ||
-	       type == ARG_PTR_TO_ALLOC_MEM_OR_NULL;
+	       type == ARG_PTR_TO_ALLOC_MEM_OR_NULL ||
+	       type == ARG_CONST_MAP_PTR_OR_NULL;
 }
 
 /* Determine whether the function releases some resources allocated by another
@@ -4112,6 +4113,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
 	[ARG_CONST_SIZE_OR_ZERO]	= &scalar_types,
 	[ARG_CONST_ALLOC_SIZE_OR_ZERO]	= &scalar_types,
 	[ARG_CONST_MAP_PTR]		= &const_map_ptr_types,
+	[ARG_CONST_MAP_PTR_OR_NULL]	= &const_map_ptr_types,
 	[ARG_PTR_TO_CTX]		= &context_types,
 	[ARG_PTR_TO_CTX_OR_NULL]	= &context_types,
 	[ARG_PTR_TO_SOCK_COMMON]	= &sock_types,
@@ -4257,9 +4259,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		meta->ref_obj_id = reg->ref_obj_id;
 	}
 
-	if (arg_type == ARG_CONST_MAP_PTR) {
-		/* bpf_map_xxx(map_ptr) call: remember that map_ptr */
-		meta->map_ptr = reg->map_ptr;
+	if (arg_type == ARG_CONST_MAP_PTR ||
+	    arg_type == ARG_CONST_MAP_PTR_OR_NULL) {
+		meta->map_ptr = register_is_null(reg) ? NULL : reg->map_ptr;
 	} else if (arg_type == ARG_PTR_TO_MAP_KEY) {
 		/* bpf_map_xxx(..., map_ptr, ..., key) call:
 		 *	check that [key, key + map->key_size) are within
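
[ Editor's note: a minimal BPF-program-side sketch of what this argument
  type enables. It assumes a tree with this series applied, so that the
  bpf_redirect_map_multi() helper and the BPF_F_EXCLUDE_INGRESS flag added
  elsewhere in the series are available; the section name and map layout
  are illustrative placeholders, not part of this patch. With the helper's
  second argument declared as ARG_CONST_MAP_PTR_OR_NULL, the verifier now
  accepts NULL for the exclude map, and the helper must handle that case. ]

// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
	__uint(key_size, sizeof(int));
	__uint(value_size, sizeof(int));
	__uint(max_entries, 32);
} forward_map SEC(".maps");

SEC("xdp")
int redirect_no_exclude(struct xdp_md *ctx)
{
	/* Passing NULL for the exclude map is accepted by the verifier
	 * because the argument is ARG_CONST_MAP_PTR_OR_NULL.
	 */
	return bpf_redirect_map_multi(&forward_map, NULL, BPF_F_EXCLUDE_INGRESS);
}

char _license[] SEC("license") = "GPL";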

From patchwork Mon Jan 25 12:45:14 2021
X-Patchwork-Submitter: Hangbin Liu
X-Patchwork-Id: 371159
From:
Hangbin Liu To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?= , Jiri Benc , Jesper Dangaard Brouer , Eelco Chaudron , ast@kernel.org, Daniel Borkmann , Lorenzo Bianconi , David Ahern , Andrii Nakryiko , Alexei Starovoitov , John Fastabend , Maciej Fijalkowski , Hangbin Liu Subject: [PATCHv17 bpf-next 4/6] sample/bpf: add xdp_redirect_map_multicast test Date: Mon, 25 Jan 2021 20:45:14 +0800 Message-Id: <20210125124516.3098129-5-liuhangbin@gmail.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210125124516.3098129-1-liuhangbin@gmail.com> References: <20210122074652.2981711-1-liuhangbin@gmail.com> <20210125124516.3098129-1-liuhangbin@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This is a sample for xdp multicast. In the sample we could forward all packets between given interfaces. There is also an option -X that could enable 2nd xdp_prog on egress interface. Acked-by: Toke Høiland-Jørgensen Signed-off-by: Hangbin Liu --- v16-v17: no update v15: use bpf_object__find_program_by_name() instead of bpf_object__find_program_by_title() v13-v14: no update, only rebase the code v12: add devmap xdp_prog on egress support v10-v11: no update v9: use NULL directly for arg2 and redefine the maps with btf format v5: add a null_map as we have strict the arg2 to ARG_CONST_MAP_PTR. Move the testing part to bpf selftest in next patch. v4: no update. v3: add rxcnt map to show the packet transmit speed. v2: no update. --- samples/bpf/Makefile | 3 + samples/bpf/xdp_redirect_map_multi_kern.c | 87 +++++++ samples/bpf/xdp_redirect_map_multi_user.c | 302 ++++++++++++++++++++++ 3 files changed, 392 insertions(+) create mode 100644 samples/bpf/xdp_redirect_map_multi_kern.c create mode 100644 samples/bpf/xdp_redirect_map_multi_user.c diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index d06144613ca2..539af70b5a98 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -41,6 +41,7 @@ tprogs-y += test_map_in_map tprogs-y += per_socket_stats_example tprogs-y += xdp_redirect tprogs-y += xdp_redirect_map +tprogs-y += xdp_redirect_map_multi tprogs-y += xdp_redirect_cpu tprogs-y += xdp_monitor tprogs-y += xdp_rxq_info @@ -99,6 +100,7 @@ test_map_in_map-objs := test_map_in_map_user.o per_socket_stats_example-objs := cookie_uid_helper_example.o xdp_redirect-objs := xdp_redirect_user.o xdp_redirect_map-objs := xdp_redirect_map_user.o +xdp_redirect_map_multi-objs := xdp_redirect_map_multi_user.o xdp_redirect_cpu-objs := xdp_redirect_cpu_user.o xdp_monitor-objs := xdp_monitor_user.o xdp_rxq_info-objs := xdp_rxq_info_user.o @@ -160,6 +162,7 @@ always-y += tcp_tos_reflect_kern.o always-y += tcp_dumpstats_kern.o always-y += xdp_redirect_kern.o always-y += xdp_redirect_map_kern.o +always-y += xdp_redirect_map_multi_kern.o always-y += xdp_redirect_cpu_kern.o always-y += xdp_monitor_kern.o always-y += xdp_rxq_info_kern.o diff --git a/samples/bpf/xdp_redirect_map_multi_kern.c b/samples/bpf/xdp_redirect_map_multi_kern.c new file mode 100644 index 000000000000..e422340d1251 --- /dev/null +++ b/samples/bpf/xdp_redirect_map_multi_kern.c @@ -0,0 +1,87 @@ +// SPDX-License-Identifier: GPL-2.0 +#define KBUILD_MODNAME "foo" +#include +#include +#include +#include +#include +#include + +struct { + __uint(type, BPF_MAP_TYPE_DEVMAP_HASH); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); + __uint(max_entries, 32); +} forward_map_general SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_DEVMAP_HASH); + 
__uint(key_size, sizeof(int)); + __uint(value_size, sizeof(struct bpf_devmap_val)); + __uint(max_entries, 32); +} forward_map_native SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY); + __type(key, u32); + __type(value, long); + __uint(max_entries, 1); +} rxcnt SEC(".maps"); + +/* map to store egress interfaces mac addresses, set the + * max_entries to 1 and extend it in user sapce prog. + */ +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __type(key, u32); + __type(value, __be64); + __uint(max_entries, 1); +} mac_map SEC(".maps"); + +static int xdp_redirect_map(struct xdp_md *ctx, void *forward_map) +{ + long *value; + u32 key = 0; + + /* count packet in global counter */ + value = bpf_map_lookup_elem(&rxcnt, &key); + if (value) + *value += 1; + + return bpf_redirect_map_multi(forward_map, NULL, BPF_F_EXCLUDE_INGRESS); +} + +SEC("xdp_redirect_general") +int xdp_redirect_map_general(struct xdp_md *ctx) +{ + return xdp_redirect_map(ctx, &forward_map_general); +} + +SEC("xdp_redirect_native") +int xdp_redirect_map_native(struct xdp_md *ctx) +{ + return xdp_redirect_map(ctx, &forward_map_native); +} + +SEC("xdp_devmap/map_prog") +int xdp_devmap_prog(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + u32 key = ctx->egress_ifindex; + struct ethhdr *eth = data; + __be64 *mac; + u64 nh_off; + + nh_off = sizeof(*eth); + if (data + nh_off > data_end) + return XDP_DROP; + + mac = bpf_map_lookup_elem(&mac_map, &key); + if (mac) + __builtin_memcpy(eth->h_source, mac, ETH_ALEN); + + return XDP_PASS; +} + +char _license[] SEC("license") = "GPL"; diff --git a/samples/bpf/xdp_redirect_map_multi_user.c b/samples/bpf/xdp_redirect_map_multi_user.c new file mode 100644 index 000000000000..84cdbbed20b7 --- /dev/null +++ b/samples/bpf/xdp_redirect_map_multi_user.c @@ -0,0 +1,302 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "bpf_util.h" +#include +#include + +#define MAX_IFACE_NUM 32 + +static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; +static int ifaces[MAX_IFACE_NUM] = {}; +static int rxcnt_map_fd; + +static void int_exit(int sig) +{ + __u32 prog_id = 0; + int i; + + for (i = 0; ifaces[i] > 0; i++) { + if (bpf_get_link_xdp_id(ifaces[i], &prog_id, xdp_flags)) { + printf("bpf_get_link_xdp_id failed\n"); + exit(1); + } + if (prog_id) + bpf_set_link_xdp_fd(ifaces[i], -1, xdp_flags); + } + + exit(0); +} + +static void poll_stats(int interval) +{ + unsigned int nr_cpus = bpf_num_possible_cpus(); + __u64 values[nr_cpus], prev[nr_cpus]; + + memset(prev, 0, sizeof(prev)); + + while (1) { + __u64 sum = 0; + __u32 key = 0; + int i; + + sleep(interval); + assert(bpf_map_lookup_elem(rxcnt_map_fd, &key, values) == 0); + for (i = 0; i < nr_cpus; i++) + sum += (values[i] - prev[i]); + if (sum) + printf("Forwarding %10llu pkt/s\n", sum / interval); + memcpy(prev, values, sizeof(values)); + } +} + +static int get_mac_addr(unsigned int ifindex, void *mac_addr) +{ + char ifname[IF_NAMESIZE]; + struct ifreq ifr; + int fd, ret = -1; + + fd = socket(AF_INET, SOCK_DGRAM, 0); + if (fd < 0) + return ret; + + if (!if_indextoname(ifindex, ifname)) + goto err_out; + + strcpy(ifr.ifr_name, ifname); + + if (ioctl(fd, SIOCGIFHWADDR, &ifr) != 0) + goto err_out; + + memcpy(mac_addr, ifr.ifr_hwaddr.sa_data, 6 * sizeof(char)); + ret = 0; + +err_out: + close(fd); + return ret; +} + +static int 
update_mac_map(struct bpf_object *obj) +{ + int i, ret = -1, mac_map_fd; + unsigned char mac_addr[6]; + unsigned int ifindex; + + mac_map_fd = bpf_object__find_map_fd_by_name(obj, "mac_map"); + if (mac_map_fd < 0) { + printf("find mac map fd failed\n"); + return ret; + } + + for (i = 0; ifaces[i] > 0; i++) { + ifindex = ifaces[i]; + + ret = get_mac_addr(ifindex, mac_addr); + if (ret < 0) { + printf("get interface %d mac failed\n", ifindex); + return ret; + } + + ret = bpf_map_update_elem(mac_map_fd, &ifindex, mac_addr, 0); + if (ret) { + perror("bpf_update_elem mac_map_fd"); + return ret; + } + } + + return 0; +} + +static void usage(const char *prog) +{ + fprintf(stderr, + "usage: %s [OPTS] ...\n" + "OPTS:\n" + " -S use skb-mode\n" + " -N enforce native mode\n" + " -F force loading prog\n" + " -X load xdp program on egress\n", + prog); +} + +int main(int argc, char **argv) +{ + int i, ret, opt, forward_map_fd, max_ifindex = 0; + struct bpf_program *ingress_prog, *egress_prog; + int ingress_prog_fd, egress_prog_fd = 0; + struct bpf_devmap_val devmap_val; + bool attach_egress_prog = false; + char ifname[IF_NAMESIZE]; + struct bpf_map *mac_map; + struct bpf_object *obj; + unsigned int ifindex; + char filename[256]; + + while ((opt = getopt(argc, argv, "SNFX")) != -1) { + switch (opt) { + case 'S': + xdp_flags |= XDP_FLAGS_SKB_MODE; + break; + case 'N': + /* default, set below */ + break; + case 'F': + xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST; + break; + case 'X': + attach_egress_prog = true; + break; + default: + usage(basename(argv[0])); + return 1; + } + } + + if (!(xdp_flags & XDP_FLAGS_SKB_MODE)) { + xdp_flags |= XDP_FLAGS_DRV_MODE; + } else if (attach_egress_prog) { + printf("Load xdp program on egress with SKB mode not supported yet\n"); + return 1; + } + + if (optind == argc) { + printf("usage: %s ...\n", argv[0]); + return 1; + } + + printf("Get interfaces"); + for (i = 0; i < MAX_IFACE_NUM && argv[optind + i]; i++) { + ifaces[i] = if_nametoindex(argv[optind + i]); + if (!ifaces[i]) + ifaces[i] = strtoul(argv[optind + i], NULL, 0); + if (!if_indextoname(ifaces[i], ifname)) { + perror("Invalid interface name or i"); + return 1; + } + + /* Find the largest index number */ + if (ifaces[i] > max_ifindex) + max_ifindex = ifaces[i]; + + printf(" %d", ifaces[i]); + } + printf("\n"); + + snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); + + obj = bpf_object__open(filename); + if (libbpf_get_error(obj)) { + printf("ERROR: opening BPF object file failed\n"); + obj = NULL; + goto err_out; + } + + /* Reset the map size to max ifindex + 1 */ + if (attach_egress_prog) { + mac_map = bpf_object__find_map_by_name(obj, "mac_map"); + ret = bpf_map__resize(mac_map, max_ifindex + 1); + if (ret < 0) { + printf("ERROR: reset mac map size failed\n"); + goto err_out; + } + } + + /* load BPF program */ + if (bpf_object__load(obj)) { + printf("ERROR: loading BPF object file failed\n"); + goto err_out; + } + + if (xdp_flags & XDP_FLAGS_SKB_MODE) { + ingress_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_general"); + forward_map_fd = bpf_object__find_map_fd_by_name(obj, "forward_map_general"); + } else { + ingress_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_native"); + forward_map_fd = bpf_object__find_map_fd_by_name(obj, "forward_map_native"); + } + if (!ingress_prog || forward_map_fd < 0) { + printf("finding ingress_prog/forward_map in obj file failed\n"); + goto err_out; + } + + ingress_prog_fd = bpf_program__fd(ingress_prog); + if (ingress_prog_fd < 0) { + 
printf("find ingress_prog fd failed\n"); + goto err_out; + } + + rxcnt_map_fd = bpf_object__find_map_fd_by_name(obj, "rxcnt"); + if (rxcnt_map_fd < 0) { + printf("bpf_object__find_map_fd_by_name failed\n"); + goto err_out; + } + + if (attach_egress_prog) { + /* Update mac_map with all egress interfaces' mac addr */ + if (update_mac_map(obj) < 0) { + printf("Error: update mac map failed"); + goto err_out; + } + + /* Find egress prog fd */ + egress_prog = bpf_object__find_program_by_name(obj, "xdp_devmap_prog"); + if (!egress_prog) { + printf("finding egress_prog in obj file failed\n"); + goto err_out; + } + egress_prog_fd = bpf_program__fd(egress_prog); + if (egress_prog_fd < 0) { + printf("find egress_prog fd failed\n"); + goto err_out; + } + } + + /* Remove attached program when program is interrupted or killed */ + signal(SIGINT, int_exit); + signal(SIGTERM, int_exit); + + /* Init forward multicast groups */ + for (i = 0; ifaces[i] > 0; i++) { + ifindex = ifaces[i]; + + /* bind prog_fd to each interface */ + ret = bpf_set_link_xdp_fd(ifindex, ingress_prog_fd, xdp_flags); + if (ret) { + printf("Set xdp fd failed on %d\n", ifindex); + goto err_out; + } + + /* Add all the interfaces to forward group and attach + * egress devmap programe if exist + */ + devmap_val.ifindex = ifindex; + devmap_val.bpf_prog.fd = egress_prog_fd; + ret = bpf_map_update_elem(forward_map_fd, &ifindex, &devmap_val, 0); + if (ret) { + perror("bpf_map_update_elem forward_map"); + goto err_out; + } + } + + poll_stats(2); + + return 0; + +err_out: + return 1; +} From patchwork Mon Jan 25 12:45:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Hangbin Liu X-Patchwork-Id: 371223 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24934C433DB for ; Tue, 26 Jan 2021 03:06:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CA44322EBF for ; Tue, 26 Jan 2021 03:05:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728406AbhAYMvC (ORCPT ); Mon, 25 Jan 2021 07:51:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728341AbhAYMtg (ORCPT ); Mon, 25 Jan 2021 07:49:36 -0500 Received: from mail-pg1-x529.google.com (mail-pg1-x529.google.com [IPv6:2607:f8b0:4864:20::529]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F122C061788; Mon, 25 Jan 2021 04:47:06 -0800 (PST) Received: by mail-pg1-x529.google.com with SMTP id p18so8785791pgm.11; Mon, 25 Jan 2021 04:47:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=bZrOlHeVqPz5jBwq004JVt5/tLnjwy2Oh1cW8pMRtEo=; b=PnML375Ybfn+M7RqZ1Xn5ulYNzhMRtw5lS4+q7BT3vx5qBJ8LbLzCnDlwUu3P/5wPQ 
From: Hangbin Liu
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, Toke Høiland-Jørgensen, Jiri Benc,
    Jesper Dangaard Brouer, Eelco Chaudron, ast@kernel.org, Daniel Borkmann,
    Lorenzo Bianconi, David Ahern, Andrii Nakryiko, Alexei Starovoitov,
    John Fastabend, Maciej Fijalkowski, Hangbin Liu
Subject: [PATCHv17 bpf-next 6/6] selftests/bpf: add xdp_redirect_multi test
Date: Mon, 25 Jan 2021 20:45:16 +0800
Message-Id: <20210125124516.3098129-7-liuhangbin@gmail.com>
In-Reply-To: <20210125124516.3098129-1-liuhangbin@gmail.com>
References: <20210122074652.2981711-1-liuhangbin@gmail.com>
 <20210125124516.3098129-1-liuhangbin@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

Add a bpf selftest for the new helper xdp_redirect_map_multi(). In this
test we have 3 forward groups and 1 exclude group. The test redirects
each interface's packets to all the interfaces in the forward group and
excludes the interface in the exclude map. We also test both DEVMAP and
DEVMAP_HASH with xdp generic and drv modes. For more details, see the
test script. Here is the test result.
]# ./test_xdp_redirect_multi.sh Pass: xdpgeneric arp ns1-2 Pass: xdpgeneric arp ns1-3 Pass: xdpgeneric arp ns1-4 Pass: xdpgeneric ping ns1-2 Pass: xdpgeneric ping ns1-3 Pass: xdpgeneric ping ns1-4 Pass: xdpgeneric ping6 ns2-1 Pass: xdpgeneric ping6 ns2-3 Pass: xdpgeneric ping6 ns2-4 Pass: xdpdrv arp ns1-2 Pass: xdpdrv arp ns1-3 Pass: xdpdrv arp ns1-4 Pass: xdpdrv ping ns1-2 Pass: xdpdrv ping ns1-3 Pass: xdpdrv ping ns1-4 Pass: xdpdrv ping6 ns2-1 Pass: xdpdrv ping6 ns2-3 Pass: xdpdrv ping6 ns2-4 Pass: xdpegress mac ns1-2 Pass: xdpegress mac ns1-3 Pass: xdpegress mac ns1-4 Pass: xdpegress ping ns1-2 Pass: xdpegress ping ns1-3 Pass: xdpegress ping ns1-4 Summary: PASS 24, FAIL 0 Acked-by: Toke Høiland-Jørgensen Signed-off-by: Hangbin Liu Acked-by: John Fastabend --- v16-v17: no update v15: use bpf_object__find_program_by_name instead of bpf_object__find_program_by_title v14: no update, only rebase the code v13: remove setrlimit v12: add devmap prog test on egress v9: use NULL directly for arg2 and redefine the maps with btf format --- tools/testing/selftests/bpf/Makefile | 3 +- .../bpf/progs/xdp_redirect_multi_kern.c | 111 ++++++++ .../selftests/bpf/test_xdp_redirect_multi.sh | 208 +++++++++++++++ .../selftests/bpf/xdp_redirect_multi.c | 252 ++++++++++++++++++ 4 files changed, 573 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/bpf/progs/xdp_redirect_multi_kern.c create mode 100755 tools/testing/selftests/bpf/test_xdp_redirect_multi.sh create mode 100644 tools/testing/selftests/bpf/xdp_redirect_multi.c diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 63d6288e419c..621dceddb249 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -51,6 +51,7 @@ TEST_FILES = xsk_prereqs.sh \ # Order correspond to 'make run_tests' order TEST_PROGS := test_kmod.sh \ test_xdp_redirect.sh \ + test_xdp_redirect_multi.sh \ test_xdp_meta.sh \ test_xdp_veth.sh \ test_offload.py \ @@ -80,7 +81,7 @@ TEST_PROGS_EXTENDED := with_addr.sh \ TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \ flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \ - xdpxceiver + xdpxceiver xdp_redirect_multi TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read diff --git a/tools/testing/selftests/bpf/progs/xdp_redirect_multi_kern.c b/tools/testing/selftests/bpf/progs/xdp_redirect_multi_kern.c new file mode 100644 index 000000000000..dce4df40d9de --- /dev/null +++ b/tools/testing/selftests/bpf/progs/xdp_redirect_multi_kern.c @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0 +#define KBUILD_MODNAME "foo" +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +struct { + __uint(type, BPF_MAP_TYPE_DEVMAP); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); + __uint(max_entries, 128); +} forward_map_v4 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_DEVMAP_HASH); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); + __uint(max_entries, 128); +} forward_map_v6 SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_DEVMAP_HASH); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); + __uint(max_entries, 128); +} forward_map_all SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_DEVMAP_HASH); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(struct bpf_devmap_val)); + __uint(max_entries, 128); +} forward_map_egress SEC(".maps"); + +struct { + 
__uint(type, BPF_MAP_TYPE_DEVMAP_HASH); + __uint(key_size, sizeof(int)); + __uint(value_size, sizeof(int)); + __uint(max_entries, 128); +} exclude_map SEC(".maps"); + +/* map to store egress interfaces mac addresses */ +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __type(key, __u32); + __type(value, __be64); + __uint(max_entries, 128); +} mac_map SEC(".maps"); + +SEC("xdp_redirect_map_multi") +int xdp_redirect_map_multi_prog(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + struct ethhdr *eth = data; + __u16 h_proto; + __u64 nh_off; + + nh_off = sizeof(*eth); + if (data + nh_off > data_end) + return XDP_DROP; + + h_proto = eth->h_proto; + + if (h_proto == bpf_htons(ETH_P_IP)) + return bpf_redirect_map_multi(&forward_map_v4, &exclude_map, + BPF_F_EXCLUDE_INGRESS); + else if (h_proto == bpf_htons(ETH_P_IPV6)) + return bpf_redirect_map_multi(&forward_map_v6, &exclude_map, + BPF_F_EXCLUDE_INGRESS); + else + return bpf_redirect_map_multi(&forward_map_all, NULL, + BPF_F_EXCLUDE_INGRESS); +} + +/* The following 2 progs are for 2nd devmap prog testing */ +SEC("xdp_redirect_map_ingress") +int xdp_redirect_map_all_prog(struct xdp_md *ctx) +{ + return bpf_redirect_map_multi(&forward_map_egress, NULL, BPF_F_EXCLUDE_INGRESS); +} + +SEC("xdp_devmap/map_prog") +int xdp_devmap_prog(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + __u32 key = ctx->egress_ifindex; + struct ethhdr *eth = data; + __u64 nh_off; + __be64 *mac; + + nh_off = sizeof(*eth); + if (data + nh_off > data_end) + return XDP_DROP; + + mac = bpf_map_lookup_elem(&mac_map, &key); + if (mac) + __builtin_memcpy(eth->h_source, mac, ETH_ALEN); + + return XDP_PASS; +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/test_xdp_redirect_multi.sh b/tools/testing/selftests/bpf/test_xdp_redirect_multi.sh new file mode 100755 index 000000000000..6503751fdca5 --- /dev/null +++ b/tools/testing/selftests/bpf/test_xdp_redirect_multi.sh @@ -0,0 +1,208 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Test topology: +# - - - - - - - - - - - - - - - - - - - - - - - - - +# | veth1 veth2 veth3 veth4 | ... init net +# - -| - - - - - - | - - - - - - | - - - - - - | - - +# --------- --------- --------- --------- +# | veth0 | | veth0 | | veth0 | | veth0 | ... +# --------- --------- --------- --------- +# ns1 ns2 ns3 ns4 +# +# Forward maps: +# Forward map_all has interfaces: veth1, veth2, veth3, veth4, ... (All traffic except IPv4, IPv6) +# Forward map_v4 has interfaces: veth1, veth3, veth4, ... (For IPv4 traffic only) +# Forward map_v6 has interfaces: veth2, veth3, veth4, ... 
(For IPv6 traffic only) +# Forward map_egress has all interfaces and redirect all pkts +# Exclude Groups: +# Exclude map: veth3 (assume ns3 is in black list) +# Map type: +# map_v4 use DEVMAP, others use DEVMAP_HASH +# +# Test modules: +# XDP modes: generic, native, native + egress_prog +# +# Test cases: +# ARP(we didn't block ARP for ns3): +# ns1 -> gw: ns2, ns3, ns4 should receive the arp request +# IPv4: +# ns1 -> ns2 (block), ns1 -> ns3 (block), ns1 -> ns4 (pass) +# IPv6 +# ns2 -> ns1 (block), ns2 -> ns3 (block), ns2 -> ns4 (pass) +# egress_prog: +# all ping test should pass, the src mac should be egress interface's mac +# + + +# netns numbers +NUM=4 +IFACES="" +DRV_MODE="xdpgeneric xdpdrv xdpegress" +PASS=0 +FAIL=0 + +test_pass() +{ + echo "Pass: $@" + PASS=$((PASS + 1)) +} + +test_fail() +{ + echo "fail: $@" + FAIL=$((FAIL + 1)) +} + +clean_up() +{ + for i in $(seq $NUM); do + ip link del veth$i 2> /dev/null + ip netns del ns$i 2> /dev/null + done +} + +# Kselftest framework requirement - SKIP code is 4. +check_env() +{ + ip link set dev lo xdpgeneric off &>/dev/null + if [ $? -ne 0 ];then + echo "selftests: [SKIP] Could not run test without the ip xdpgeneric support" + exit 4 + fi + + which tcpdump &>/dev/null + if [ $? -ne 0 ];then + echo "selftests: [SKIP] Could not run test without tcpdump" + exit 4 + fi +} + +setup_ns() +{ + local mode=$1 + IFACES="" + + if [ "$mode" = "xdpegress" ]; then + mode="xdpdrv" + fi + + for i in $(seq $NUM); do + ip netns add ns$i + ip link add veth$i type veth peer name veth0 netns ns$i + ip link set veth$i up + ip -n ns$i link set veth0 up + + ip -n ns$i addr add 192.0.2.$i/24 dev veth0 + ip -n ns$i addr add 2001:db8::$i/64 dev veth0 + ip -n ns$i link set veth0 $mode obj \ + xdp_dummy.o sec xdp_dummy &> /dev/null || \ + { test_fail "Unable to load dummy xdp" && exit 1; } + IFACES="$IFACES veth$i" + veth_mac[$i]=$(ip link show veth$i | awk '/link\/ether/ {print $2}') + done +} + +do_egress_tests() +{ + local mode=$1 + + # mac test + ip netns exec ns2 tcpdump -e -i veth0 -nn -l -e &> mac_ns1-2_${mode}.log & + ip netns exec ns3 tcpdump -e -i veth0 -nn -l -e &> mac_ns1-3_${mode}.log & + ip netns exec ns4 tcpdump -e -i veth0 -nn -l -e &> mac_ns1-4_${mode}.log & + ip netns exec ns1 ping 192.0.2.254 -c 4 &> /dev/null + sleep 2 + pkill -9 tcpdump + + # mac check + grep -q "${veth_mac[2]} > ff:ff:ff:ff:ff:ff" mac_ns1-2_${mode}.log && \ + test_pass "$mode mac ns1-2" || test_fail "$mode mac ns1-2" + grep -q "${veth_mac[3]} > ff:ff:ff:ff:ff:ff" mac_ns1-3_${mode}.log && \ + test_pass "$mode mac ns1-3" || test_fail "$mode mac ns1-3" + grep -q "${veth_mac[4]} > ff:ff:ff:ff:ff:ff" mac_ns1-4_${mode}.log && \ + test_pass "$mode mac ns1-4" || test_fail "$mode mac ns1-4" + + # ping test + ip netns exec ns1 ping 192.0.2.2 -c 4 &> /dev/null && \ + test_pass "$mode ping ns1-2" || test_fail "$mode ping ns1-2" + ip netns exec ns1 ping 192.0.2.3 -c 4 &> /dev/null && \ + test_pass "$mode ping ns1-3" || test_fail "$mode ping ns1-3" + ip netns exec ns1 ping 192.0.2.4 -c 4 &> /dev/null && \ + test_pass "$mode ping ns1-4" || test_fail "$mode ping ns1-4" +} + +do_ping_tests() +{ + local mode=$1 + + # arp test + ip netns exec ns2 tcpdump -i veth0 -nn -l -e &> arp_ns1-2_${mode}.log & + ip netns exec ns3 tcpdump -i veth0 -nn -l -e &> arp_ns1-3_${mode}.log & + ip netns exec ns4 tcpdump -i veth0 -nn -l -e &> arp_ns1-4_${mode}.log & + ip netns exec ns1 ping 192.0.2.254 -c 4 &> /dev/null + sleep 2 + pkill -9 tcpdump + grep -q "Request who-has 192.0.2.254 tell 192.0.2.1" 
arp_ns1-2_${mode}.log && \ + test_pass "$mode arp ns1-2" || test_fail "$mode arp ns1-2" + grep -q "Request who-has 192.0.2.254 tell 192.0.2.1" arp_ns1-3_${mode}.log && \ + test_pass "$mode arp ns1-3" || test_fail "$mode arp ns1-3" + grep -q "Request who-has 192.0.2.254 tell 192.0.2.1" arp_ns1-4_${mode}.log && \ + test_pass "$mode arp ns1-4" || test_fail "$mode arp ns1-4" + + # ping test + ip netns exec ns1 ping 192.0.2.2 -c 4 &> /dev/null && \ + test_fail "$mode ping ns1-2" || test_pass "$mode ping ns1-2" + ip netns exec ns1 ping 192.0.2.3 -c 4 &> /dev/null && \ + test_fail "$mode ping ns1-3" || test_pass "$mode ping ns1-3" + ip netns exec ns1 ping 192.0.2.4 -c 4 &> /dev/null && \ + test_pass "$mode ping ns1-4" || test_fail "$mode ping ns1-4" + + # ping6 test + ip netns exec ns2 ping6 2001:db8::1 -c 4 &> /dev/null && \ + test_fail "$mode ping6 ns2-1" || test_pass "$mode ping6 ns2-1" + ip netns exec ns2 ping6 2001:db8::3 -c 4 &> /dev/null && \ + test_fail "$mode ping6 ns2-3" || test_pass "$mode ping6 ns2-3" + ip netns exec ns2 ping6 2001:db8::4 -c 4 &> /dev/null && \ + test_pass "$mode ping6 ns2-4" || test_fail "$mode ping6 ns2-4" +} + +do_tests() +{ + local mode=$1 + local drv_p + + case ${mode} in + xdpdrv) drv_p="-N";; + xdpegress) drv_p="-X";; + xdpgeneric) drv_p="-S";; + esac + + ./xdp_redirect_multi $drv_p $IFACES &> xdp_redirect_${mode}.log & + xdp_pid=$! + sleep 10 + + if [ "$mode" = "xdpegress" ]; then + do_egress_tests $mode + else + do_ping_tests $mode + fi + + kill $xdp_pid +} + +trap clean_up 0 2 3 6 9 + +check_env +rm -f xdp_redirect_*.log arp_ns*.log mac_ns*.log + +for mode in ${DRV_MODE}; do + setup_ns $mode + do_tests $mode + sleep 10 + clean_up + sleep 5 +done + +echo "Summary: PASS $PASS, FAIL $FAIL" +[ $FAIL -eq 0 ] && exit 0 || exit 1 diff --git a/tools/testing/selftests/bpf/xdp_redirect_multi.c b/tools/testing/selftests/bpf/xdp_redirect_multi.c new file mode 100644 index 000000000000..b43cd3c9eefd --- /dev/null +++ b/tools/testing/selftests/bpf/xdp_redirect_multi.c @@ -0,0 +1,252 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "bpf_util.h" +#include +#include + +#define MAX_IFACE_NUM 32 + +static __u32 xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; +static int ifaces[MAX_IFACE_NUM] = {}; + +static void int_exit(int sig) +{ + __u32 prog_id = 0; + int i; + + for (i = 0; ifaces[i] > 0; i++) { + if (bpf_get_link_xdp_id(ifaces[i], &prog_id, xdp_flags)) { + printf("bpf_get_link_xdp_id failed\n"); + exit(1); + } + if (prog_id) + bpf_set_link_xdp_fd(ifaces[i], -1, xdp_flags); + } + + exit(0); +} + +static int get_mac_addr(unsigned int ifindex, void *mac_addr) +{ + char ifname[IF_NAMESIZE]; + struct ifreq ifr; + int fd, ret = -1; + + fd = socket(AF_INET, SOCK_DGRAM, 0); + if (fd < 0) + return ret; + + if (!if_indextoname(ifindex, ifname)) + goto err_out; + + strcpy(ifr.ifr_name, ifname); + + if (ioctl(fd, SIOCGIFHWADDR, &ifr) != 0) + goto err_out; + + memcpy(mac_addr, ifr.ifr_hwaddr.sa_data, 6 * sizeof(char)); + ret = 0; + +err_out: + close(fd); + return ret; +} + +static void usage(const char *prog) +{ + fprintf(stderr, + "usage: %s [OPTS] ...\n" + "OPTS:\n" + " -S use skb-mode\n" + " -N enforce native mode\n" + " -F force loading prog\n" + " -X load xdp program on egress\n", + prog); +} + +int main(int argc, char **argv) +{ + int prog_fd, group_all, group_v4, group_v6, exclude, mac_map; + struct bpf_program 
*ingress_prog, *egress_prog; + struct bpf_prog_load_attr prog_load_attr = { + .prog_type = BPF_PROG_TYPE_UNSPEC, + }; + int i, ret, opt, egress_prog_fd = 0; + struct bpf_devmap_val devmap_val; + bool attach_egress_prog = false; + unsigned char mac_addr[6]; + char ifname[IF_NAMESIZE]; + struct bpf_object *obj; + unsigned int ifindex; + char filename[256]; + + while ((opt = getopt(argc, argv, "SNFX")) != -1) { + switch (opt) { + case 'S': + xdp_flags |= XDP_FLAGS_SKB_MODE; + break; + case 'N': + /* default, set below */ + break; + case 'F': + xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST; + break; + case 'X': + attach_egress_prog = true; + break; + default: + usage(basename(argv[0])); + return 1; + } + } + + if (!(xdp_flags & XDP_FLAGS_SKB_MODE)) { + xdp_flags |= XDP_FLAGS_DRV_MODE; + } else if (attach_egress_prog) { + printf("Load xdp program on egress with SKB mode not supported yet\n"); + goto err_out; + } + + if (optind == argc) { + printf("usage: %s ...\n", argv[0]); + goto err_out; + } + + printf("Get interfaces"); + for (i = 0; i < MAX_IFACE_NUM && argv[optind + i]; i++) { + ifaces[i] = if_nametoindex(argv[optind + i]); + if (!ifaces[i]) + ifaces[i] = strtoul(argv[optind + i], NULL, 0); + if (!if_indextoname(ifaces[i], ifname)) { + perror("Invalid interface name or i"); + goto err_out; + } + printf(" %d", ifaces[i]); + } + printf("\n"); + + snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); + prog_load_attr.file = filename; + + if (bpf_prog_load_xattr(&prog_load_attr, &obj, &prog_fd)) + goto err_out; + + if (attach_egress_prog) + group_all = bpf_object__find_map_fd_by_name(obj, "forward_map_egress"); + else + group_all = bpf_object__find_map_fd_by_name(obj, "forward_map_all"); + group_v4 = bpf_object__find_map_fd_by_name(obj, "forward_map_v4"); + group_v6 = bpf_object__find_map_fd_by_name(obj, "forward_map_v6"); + exclude = bpf_object__find_map_fd_by_name(obj, "exclude_map"); + mac_map = bpf_object__find_map_fd_by_name(obj, "mac_map"); + + if (group_all < 0 || group_v4 < 0 || group_v6 < 0 || exclude < 0 || + mac_map < 0) { + printf("bpf_object__find_map_fd_by_name failed\n"); + goto err_out; + } + + if (attach_egress_prog) { + /* Find ingress/egress prog for 2nd xdp prog */ + ingress_prog = bpf_object__find_program_by_name(obj, "xdp_redirect_map_all_prog"); + egress_prog = bpf_object__find_program_by_name(obj, "xdp_devmap_prog"); + if (!ingress_prog || !egress_prog) { + printf("finding ingress/egress_prog in obj file failed\n"); + goto err_out; + } + prog_fd = bpf_program__fd(ingress_prog); + egress_prog_fd = bpf_program__fd(egress_prog); + if (prog_fd < 0 || egress_prog_fd < 0) { + printf("find egress_prog fd failed\n"); + goto err_out; + } + } + + signal(SIGINT, int_exit); + signal(SIGTERM, int_exit); + + /* Init forward multicast groups and exclude group */ + for (i = 0; ifaces[i] > 0; i++) { + ifindex = ifaces[i]; + + if (attach_egress_prog) { + ret = get_mac_addr(ifindex, mac_addr); + if (ret < 0) { + printf("get interface %d mac failed\n", ifindex); + goto err_out; + } + ret = bpf_map_update_elem(mac_map, &ifindex, mac_addr, 0); + if (ret) { + perror("bpf_update_elem mac_map failed\n"); + goto err_out; + } + } + + /* Add all the interfaces to group all */ + devmap_val.ifindex = ifindex; + devmap_val.bpf_prog.fd = egress_prog_fd; + ret = bpf_map_update_elem(group_all, &ifindex, &devmap_val, 0); + if (ret) { + perror("bpf_map_update_elem"); + goto err_out; + } + + /* For testing: remove the 1st interfaces from group v6 */ + if (i != 0) { + ret = bpf_map_update_elem(group_v6, 
&ifindex, &ifindex, 0); + if (ret) { + perror("bpf_map_update_elem"); + goto err_out; + } + } + + /* For testing: remove the 2nd interfaces from group v4 */ + if (i != 1) { + ret = bpf_map_update_elem(group_v4, &ifindex, &ifindex, 0); + if (ret) { + perror("bpf_map_update_elem"); + goto err_out; + } + } + + /* For testing: add the 3rd interfaces to exclude map */ + if (i == 2) { + ret = bpf_map_update_elem(exclude, &ifindex, &ifindex, 0); + if (ret) { + perror("bpf_map_update_elem"); + goto err_out; + } + } + + /* bind prog_fd to each interface */ + ret = bpf_set_link_xdp_fd(ifindex, prog_fd, xdp_flags); + if (ret) { + printf("Set xdp fd failed on %d\n", ifindex); + goto err_out; + } + } + + /* sleep some time for testing */ + sleep(999); + + return 0; + +err_out: + return 1; +}
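
[ Editor's note: the sample loader (patch 4/6) and the selftest loader above
  share the same core libbpf sequence for wiring a forwarding group entry to
  an egress devmap program. A condensed, self-contained sketch of that
  sequence follows; the object file, program and map names are taken from
  the selftest above, and error handling is trimmed. This is a recap for
  reference, not additional functionality. ]

// SPDX-License-Identifier: GPL-2.0
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
#include <linux/bpf.h>
#include <linux/if_link.h>
#include <net/if.h>

int attach_multicast(const char *ifname)
{
	struct bpf_program *ingress, *egress;
	struct bpf_devmap_val val = {};
	struct bpf_object *obj;
	int ifindex = if_nametoindex(ifname);
	int map_fd;

	obj = bpf_object__open("xdp_redirect_multi_kern.o");
	if (libbpf_get_error(obj) || bpf_object__load(obj))
		return -1;

	ingress = bpf_object__find_program_by_name(obj, "xdp_redirect_map_all_prog");
	egress  = bpf_object__find_program_by_name(obj, "xdp_devmap_prog");
	map_fd  = bpf_object__find_map_fd_by_name(obj, "forward_map_egress");
	if (!ingress || !egress || map_fd < 0)
		return -1;

	/* The devmap entry carries both the target ifindex and the fd of the
	 * egress program that will run on the frame before transmission.
	 */
	val.ifindex = ifindex;
	val.bpf_prog.fd = bpf_program__fd(egress);
	if (bpf_map_update_elem(map_fd, &ifindex, &val, 0))
		return -1;

	/* Attach the ingress program in native (driver) mode. */
	return bpf_set_link_xdp_fd(ifindex, bpf_program__fd(ingress),
				   XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_DRV_MODE);
}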