From patchwork Fri Jul 30 06:18:16 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jussi Maki <joamaki@gmail.com>
X-Patchwork-Id: 492118
Return-Path: <netdev-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-12.3 required=3.0 tests=BAYES_00,
 DATE_IN_PAST_96_XX, 
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FORGED_FROMDOMAIN,
 FREEMAIL_FROM, 
 HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
 MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT
 autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 83555C4338F
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:26 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id 6C9ED60F22
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:26 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S238252AbhHDMqh (ORCPT <rfc822;netdev@archiver.kernel.org>);
 Wed, 4 Aug 2021 08:46:37 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59218 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by
 vger.kernel.org with ESMTP id S238185AbhHDMqX (ORCPT
 <rfc822;netdev@vger.kernel.org>); Wed, 4 Aug 2021 08:46:23 -0400
Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com
 [IPv6:2a00:1450:4864:20::436])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22DC5C0617A2;
 Wed,  4 Aug 2021 05:45:48 -0700 (PDT)
Received: by mail-wr1-x436.google.com with SMTP id k4so2138966wrc.0;
 Wed, 04 Aug 2021 05:45:48 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=Cq8acZAxJnjl34wGyykc2nFnDcJrQD98GnognT8PzKE=;
 b=Ft+83SVWaYUi4DEDs4M+rB6Ugjy7qASC89kZK1gcVnS9xuRGZH0Ggqn+5RJkaHdq18
 6WYTwAZliEJAYb0Gkmhh4j2Mh3aeo8Fr0cPSm7oJ7kGccx2P/tXT00F+NJP+4mzpor2l
 v+UItD6ze26zp4P5lKBLNRftaxSAW5r3eHYKtw6Oe5mP/5iSVwv/zc7P4Bu9EUZ8vfNJ
 iDZlJ44+xv4+WoNAK9FGnNpyikIMwOwI3lpCifWQw66SSi+rpKY/l4TgrWS2cXe2kBT4
 ziKhNvT+opbiH1QoQG7wfOqki6ykK0Jzt+tzcmLlQMREYNJ+2UyzyqG4+loAvkdrgosT
 I/pQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=Cq8acZAxJnjl34wGyykc2nFnDcJrQD98GnognT8PzKE=;
 b=ecHy/x+WCVImjbdGEGpE0fX/cTCoaoBvM2PQQmE9WDdSdBIXGmEtQ0+UWJnhClnYWR
 iZDg/RD8gcqNH4duyl+gRlvDORIIcRGUq+2ta43CukDbV0/qzxSwng3MyOON81r62O60
 vaQ7+8QjCgLBjHd0eHf30UlUAuf66WOTPp3AayU3ZPZHMq1vbr3994tKHPshWShncr3L
 NkFFbo4Gx8wFUWbiR//xV0nGx64y+LSlNxyT0BLUX77GIa3PHt2cUhp9ZpRJORpX2q15
 6UQoyZIih4IVOVuRNU297e/UUHoZob7iHEckKOWHd5A+qekZPH1TeYnxGJ8h6VHSpAI5
 Vnsw==
X-Gm-Message-State: AOAM5314oBrob5nW/Lbz0PdvQdA1b5vRj49wvJNT+U8yQBxq4QpVYy7K
 pkjfHfYZ3/0r/nRo9Ee8fHI6aJZnHQ67nVw=
X-Google-Smtp-Source: ABdhPJyKRz6a7QDEPe02EpwLvraUiaxK6AD8cBqOZwo0j/LTWnsLhOtfpNZHYQbx4NxQzgopY9iyYw==
X-Received: by 2002:a5d:6146:: with SMTP id y6mr27944652wrt.278.1628081146402; 
 Wed, 04 Aug 2021 05:45:46 -0700 (PDT)
Received: from localhost.localdomain ([77.109.191.101])
 by smtp.gmail.com with ESMTPSA id
 y4sm2257923wmi.22.2021.08.04.05.45.45
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 04 Aug 2021 05:45:45 -0700 (PDT)
From: Jussi Maki <joamaki@gmail.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, daniel@iogearbox.net, j.vosburgh@gmail.com,
 andy@greyhouse.net, vfalico@gmail.com, andrii@kernel.org,
 maciej.fijalkowski@intel.com, magnus.karlsson@intel.com,
 Jussi Maki <joamaki@gmail.com>
Subject: [PATCH bpf-next v5 1/7] net: bonding: Refactor bond_xmit_hash for
 use with xdp_buff
Date: Fri, 30 Jul 2021 06:18:16 +0000
Message-Id: <20210730061822.6600-2-joamaki@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210730061822.6600-1-joamaki@gmail.com>
References: <20210609135537.1460244-1-joamaki@gmail.com>
 <20210730061822.6600-1-joamaki@gmail.com>
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

In preparation for adding XDP support to the bonding driver
refactor the packet hashing functions to be able to work with
any linear data buffer without an skb.

Signed-off-by: Jussi Maki <joamaki@gmail.com>
---
 drivers/net/bonding/bond_main.c | 147 +++++++++++++++++++-------------
 1 file changed, 90 insertions(+), 57 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d22d78303311..dcec5cc4dab1 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3611,55 +3611,80 @@ static struct notifier_block bond_netdev_notifier = {
 
 /*---------------------------- Hashing Policies -----------------------------*/
 
+/* Helper to access data in a packet, with or without a backing skb.
+ * If skb is given the data is linearized if necessary via pskb_may_pull.
+ */
+static inline const void *bond_pull_data(struct sk_buff *skb,
+					 const void *data, int hlen, int n)
+{
+	if (likely(n <= hlen))
+		return data;
+	else if (skb && likely(pskb_may_pull(skb, n)))
+		return skb->head;
+
+	return NULL;
+}
+
 /* L2 hash helper */
-static inline u32 bond_eth_hash(struct sk_buff *skb)
+static inline u32 bond_eth_hash(struct sk_buff *skb, const void *data, int mhoff, int hlen)
 {
-	struct ethhdr *ep, hdr_tmp;
+	struct ethhdr *ep;
 
-	ep = skb_header_pointer(skb, 0, sizeof(hdr_tmp), &hdr_tmp);
-	if (ep)
-		return ep->h_dest[5] ^ ep->h_source[5] ^ ep->h_proto;
-	return 0;
+	data = bond_pull_data(skb, data, hlen, mhoff + sizeof(struct ethhdr));
+	if (!data)
+		return 0;
+
+	ep = (struct ethhdr *)(data + mhoff);
+	return ep->h_dest[5] ^ ep->h_source[5] ^ ep->h_proto;
 }
 
-static bool bond_flow_ip(struct sk_buff *skb, struct flow_keys *fk,
-			 int *noff, int *proto, bool l34)
+static bool bond_flow_ip(struct sk_buff *skb, struct flow_keys *fk, const void *data,
+			 int hlen, __be16 l2_proto, int *nhoff, int *ip_proto, bool l34)
 {
 	const struct ipv6hdr *iph6;
 	const struct iphdr *iph;
 
-	if (skb->protocol == htons(ETH_P_IP)) {
-		if (unlikely(!pskb_may_pull(skb, *noff + sizeof(*iph))))
+	if (l2_proto == htons(ETH_P_IP)) {
+		data = bond_pull_data(skb, data, hlen, *nhoff + sizeof(*iph));
+		if (!data)
 			return false;
-		iph = (const struct iphdr *)(skb->data + *noff);
+
+		iph = (const struct iphdr *)(data + *nhoff);
 		iph_to_flow_copy_v4addrs(fk, iph);
-		*noff += iph->ihl << 2;
+		*nhoff += iph->ihl << 2;
 		if (!ip_is_fragment(iph))
-			*proto = iph->protocol;
-	} else if (skb->protocol == htons(ETH_P_IPV6)) {
-		if (unlikely(!pskb_may_pull(skb, *noff + sizeof(*iph6))))
+			*ip_proto = iph->protocol;
+	} else if (l2_proto == htons(ETH_P_IPV6)) {
+		data = bond_pull_data(skb, data, hlen, *nhoff + sizeof(*iph6));
+		if (!data)
 			return false;
-		iph6 = (const struct ipv6hdr *)(skb->data + *noff);
+
+		iph6 = (const struct ipv6hdr *)(data + *nhoff);
 		iph_to_flow_copy_v6addrs(fk, iph6);
-		*noff += sizeof(*iph6);
-		*proto = iph6->nexthdr;
+		*nhoff += sizeof(*iph6);
+		*ip_proto = iph6->nexthdr;
 	} else {
 		return false;
 	}
 
-	if (l34 && *proto >= 0)
-		fk->ports.ports = skb_flow_get_ports(skb, *noff, *proto);
+	if (l34 && *ip_proto >= 0)
+		fk->ports.ports = __skb_flow_get_ports(skb, *nhoff, *ip_proto, data, hlen);
 
 	return true;
 }
 
-static u32 bond_vlan_srcmac_hash(struct sk_buff *skb)
+static u32 bond_vlan_srcmac_hash(struct sk_buff *skb, const void *data, int mhoff, int hlen)
 {
-	struct ethhdr *mac_hdr = (struct ethhdr *)skb_mac_header(skb);
+	struct ethhdr *mac_hdr;
 	u32 srcmac_vendor = 0, srcmac_dev = 0;
 	u16 vlan;
 	int i;
 
+	data = bond_pull_data(skb, data, hlen, mhoff + sizeof(struct ethhdr));
+	if (!data)
+		return 0;
+	mac_hdr = (struct ethhdr *)(data + mhoff);
+
 	for (i = 0; i < 3; i++)
 		srcmac_vendor = (srcmac_vendor << 8) | mac_hdr->h_source[i];
 
@@ -3675,26 +3700,25 @@ static u32 bond_vlan_srcmac_hash(struct sk_buff *skb)
 }
 
 /* Extract the appropriate headers based on bond's xmit policy */
-static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
-			      struct flow_keys *fk)
+static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb, const void *data,
+			      __be16 l2_proto, int nhoff, int hlen, struct flow_keys *fk)
 {
 	bool l34 = bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER34;
-	int noff, proto = -1;
+	int ip_proto = -1;
 
 	switch (bond->params.xmit_policy) {
 	case BOND_XMIT_POLICY_ENCAP23:
 	case BOND_XMIT_POLICY_ENCAP34:
 		memset(fk, 0, sizeof(*fk));
 		return __skb_flow_dissect(NULL, skb, &flow_keys_bonding,
-					  fk, NULL, 0, 0, 0, 0);
+					  fk, data, l2_proto, nhoff, hlen, 0);
 	default:
 		break;
 	}
 
 	fk->ports.ports = 0;
 	memset(&fk->icmp, 0, sizeof(fk->icmp));
-	noff = skb_network_offset(skb);
-	if (!bond_flow_ip(skb, fk, &noff, &proto, l34))
+	if (!bond_flow_ip(skb, fk, data, hlen, l2_proto, &nhoff, &ip_proto, l34))
 		return false;
 
 	/* ICMP error packets contains at least 8 bytes of the header
@@ -3702,22 +3726,20 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb,
 	 * to correlate ICMP error packets within the same flow which
 	 * generated the error.
 	 */
-	if (proto == IPPROTO_ICMP || proto == IPPROTO_ICMPV6) {
-		skb_flow_get_icmp_tci(skb, &fk->icmp, skb->data,
-				      skb_transport_offset(skb),
-				      skb_headlen(skb));
-		if (proto == IPPROTO_ICMP) {
+	if (ip_proto == IPPROTO_ICMP || ip_proto == IPPROTO_ICMPV6) {
+		skb_flow_get_icmp_tci(skb, &fk->icmp, data, nhoff, hlen);
+		if (ip_proto == IPPROTO_ICMP) {
 			if (!icmp_is_err(fk->icmp.type))
 				return true;
 
-			noff += sizeof(struct icmphdr);
-		} else if (proto == IPPROTO_ICMPV6) {
+			nhoff += sizeof(struct icmphdr);
+		} else if (ip_proto == IPPROTO_ICMPV6) {
 			if (!icmpv6_is_err(fk->icmp.type))
 				return true;
 
-			noff += sizeof(struct icmp6hdr);
+			nhoff += sizeof(struct icmp6hdr);
 		}
-		return bond_flow_ip(skb, fk, &noff, &proto, l34);
+		return bond_flow_ip(skb, fk, data, hlen, l2_proto, &nhoff, &ip_proto, l34);
 	}
 
 	return true;
@@ -3733,33 +3755,26 @@ static u32 bond_ip_hash(u32 hash, struct flow_keys *flow)
 	return hash >> 1;
 }
 
-/**
- * bond_xmit_hash - generate a hash value based on the xmit policy
- * @bond: bonding device
- * @skb: buffer to use for headers
- *
- * This function will extract the necessary headers from the skb buffer and use
- * them to generate a hash based on the xmit_policy set in the bonding device
+/* Generate hash based on xmit policy. If @skb is given it is used to linearize
+ * the data as required, but this function can be used without it if the data is
+ * known to be linear (e.g. with xdp_buff).
  */
-u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
+static u32 __bond_xmit_hash(struct bonding *bond, struct sk_buff *skb, const void *data,
+			    __be16 l2_proto, int mhoff, int nhoff, int hlen)
 {
 	struct flow_keys flow;
 	u32 hash;
 
-	if (bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP34 &&
-	    skb->l4_hash)
-		return skb->hash;
-
 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_VLAN_SRCMAC)
-		return bond_vlan_srcmac_hash(skb);
+		return bond_vlan_srcmac_hash(skb, data, mhoff, hlen);
 
 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER2 ||
-	    !bond_flow_dissect(bond, skb, &flow))
-		return bond_eth_hash(skb);
+	    !bond_flow_dissect(bond, skb, data, l2_proto, nhoff, hlen, &flow))
+		return bond_eth_hash(skb, data, mhoff, hlen);
 
 	if (bond->params.xmit_policy == BOND_XMIT_POLICY_LAYER23 ||
 	    bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP23) {
-		hash = bond_eth_hash(skb);
+		hash = bond_eth_hash(skb, data, mhoff, hlen);
 	} else {
 		if (flow.icmp.id)
 			memcpy(&hash, &flow.icmp, sizeof(hash));
@@ -3770,6 +3785,25 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
 	return bond_ip_hash(hash, &flow);
 }
 
+/**
+ * bond_xmit_hash - generate a hash value based on the xmit policy
+ * @bond: bonding device
+ * @skb: buffer to use for headers
+ *
+ * This function will extract the necessary headers from the skb buffer and use
+ * them to generate a hash based on the xmit_policy set in the bonding device
+ */
+u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
+{
+	if (bond->params.xmit_policy == BOND_XMIT_POLICY_ENCAP34 &&
+	    skb->l4_hash)
+		return skb->hash;
+
+	return __bond_xmit_hash(bond, skb, skb->head, skb->protocol,
+				skb->mac_header, skb->network_header,
+				skb_headlen(skb));
+}
+
 /*-------------------------- Device entry points ----------------------------*/
 
 void bond_work_init_all(struct bonding *bond)
@@ -4399,8 +4433,7 @@ static netdev_tx_t bond_xmit_roundrobin(struct sk_buff *skb,
 	return bond_tx_drop(bond_dev, skb);
 }
 
-static struct slave *bond_xmit_activebackup_slave_get(struct bonding *bond,
-						      struct sk_buff *skb)
+static struct slave *bond_xmit_activebackup_slave_get(struct bonding *bond)
 {
 	return rcu_dereference(bond->curr_active_slave);
 }
@@ -4414,7 +4447,7 @@ static netdev_tx_t bond_xmit_activebackup(struct sk_buff *skb,
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct slave *slave;
 
-	slave = bond_xmit_activebackup_slave_get(bond, skb);
+	slave = bond_xmit_activebackup_slave_get(bond);
 	if (slave)
 		return bond_dev_queue_xmit(bond, skb, slave->dev);
 
@@ -4712,7 +4745,7 @@ static struct net_device *bond_xmit_get_slave(struct net_device *master_dev,
 		slave = bond_xmit_roundrobin_slave_get(bond, skb);
 		break;
 	case BOND_MODE_ACTIVEBACKUP:
-		slave = bond_xmit_activebackup_slave_get(bond, skb);
+		slave = bond_xmit_activebackup_slave_get(bond);
 		break;
 	case BOND_MODE_8023AD:
 	case BOND_MODE_XOR:

From patchwork Fri Jul 30 06:18:17 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jussi Maki <joamaki@gmail.com>
X-Patchwork-Id: 492764
Return-Path: <netdev-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-12.3 required=3.0 tests=BAYES_00,
 DATE_IN_PAST_96_XX, 
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FORGED_FROMDOMAIN,
 FREEMAIL_FROM, 
 HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
 MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT
 autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 0ADE5C4320E
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:29 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id E5C5360EB2
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:28 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S238190AbhHDMqj (ORCPT <rfc822;netdev@archiver.kernel.org>);
 Wed, 4 Aug 2021 08:46:39 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59220 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by
 vger.kernel.org with ESMTP id S238244AbhHDMqY (ORCPT
 <rfc822;netdev@vger.kernel.org>); Wed, 4 Aug 2021 08:46:24 -0400
Received: from mail-wm1-x330.google.com (mail-wm1-x330.google.com
 [IPv6:2a00:1450:4864:20::330])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7ADEEC0617A3;
 Wed,  4 Aug 2021 05:45:49 -0700 (PDT)
Received: by mail-wm1-x330.google.com with SMTP id
 a192-20020a1c7fc90000b0290253b32e8796so3637728wmd.0; 
 Wed, 04 Aug 2021 05:45:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=6Vv+kISC72wCvpQeciFaDZJ9IKz+oFeH/BbvTmATNf8=;
 b=D3cJ0RTfTEyRzH1mbzWaTaR9iJo3gPA5/EPpDaAs3ty4c9iiPLVgkxiASzWtvEUjEE
 UmlIsH79Rf8vZMbpjPu6Ci0hAZCp8jHL+Ubjkwc7c9Brr8q2ScLjVoqtS9ciS01fTC8/
 I1tsnCSSi/EKlGstvZKZcojfQz1YHJgXkJdjbcNV5EUFLUVL62blK6ss3qYEEm/bDU9i
 TaAxbtwrvPWyBlQNN9muifmcZKhNJ4T4jq4l94DAsX+PSElbmamrF3W0ZtRDwcf6ac9W
 JwrnBONXlZL1RpVmXMah574FDd5yJaZG9aiGtl75uK/JBH4kIvMeixigA9tHySjnInBk
 Rmzg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=6Vv+kISC72wCvpQeciFaDZJ9IKz+oFeH/BbvTmATNf8=;
 b=j9Aw5wh9vtd3MCWbUqBror9qaValrIpAasyAyOKIbMa6Xs7qX+Y88A413PX/4gu3Op
 U6/GKAhV2b8WsOvAtLlFaJYfON5lVzsE+GpuhjE/fxfLKzBZMIfwUWOszZtSLItrNQPK
 fwEI6JJD8xjcV5YKd2H7NwjuPiDUDeMzs2Iltcy7faJhYWnA4RezsOPEPN4XeGUW01UT
 8s2aIo73us1ei0V8bETxubGolxfLdVUX7Jt2V4pXQUjZCCpc5uPQb6W/4cEv0AxeFGsC
 /67mtL8nFlR4FRcJWxK34xycmB9bJyvqH6M88bz1uK/FQac2uSrjZ2oSeyt403EUbhku
 WjTQ==
X-Gm-Message-State: AOAM531RONggrroXF0hpR06wGAmqDISS87UOsJDtPknn36WVphKazGt8
 kqdhLmcYL5VxMJMul55KKZRIFVZ56MTzN4c=
X-Google-Smtp-Source: ABdhPJxdKJHMAbnPpmBQF/qYfX30GssS2XGrTnFB8UKfQCSlvR9n5ODgxRSxBS3n3QJpyCUjsqJnCQ==
X-Received: by 2002:a05:600c:2298:: with SMTP id
 24mr13600942wmf.62.1628081147736; 
 Wed, 04 Aug 2021 05:45:47 -0700 (PDT)
Received: from localhost.localdomain ([77.109.191.101])
 by smtp.gmail.com with ESMTPSA id
 y4sm2257923wmi.22.2021.08.04.05.45.46
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 04 Aug 2021 05:45:47 -0700 (PDT)
From: Jussi Maki <joamaki@gmail.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, daniel@iogearbox.net, j.vosburgh@gmail.com,
 andy@greyhouse.net, vfalico@gmail.com, andrii@kernel.org,
 maciej.fijalkowski@intel.com, magnus.karlsson@intel.com,
 Jussi Maki <joamaki@gmail.com>
Subject: [PATCH bpf-next v5 2/7] net: core: Add support for XDP redirection
 to slave device
Date: Fri, 30 Jul 2021 06:18:17 +0000
Message-Id: <20210730061822.6600-3-joamaki@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210730061822.6600-1-joamaki@gmail.com>
References: <20210609135537.1460244-1-joamaki@gmail.com>
 <20210730061822.6600-1-joamaki@gmail.com>
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

This adds the ndo_xdp_get_xmit_slave hook for transforming XDP_TX
into XDP_REDIRECT after BPF program run when the ingress device
is a bond slave.

The dev_xdp_prog_count is exposed so that slave devices can be checked
for loaded XDP programs in order to avoid the situation where both
bond master and slave have programs loaded according to xdp_state.

Signed-off-by: Jussi Maki <joamaki@gmail.com>
---
 include/linux/filter.h    | 13 ++++++++++++-
 include/linux/netdevice.h |  6 ++++++
 net/core/dev.c            | 13 ++++++++++++-
 net/core/filter.c         | 25 +++++++++++++++++++++++++
 4 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index ba36989f711a..7ea1cc378042 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -761,6 +761,10 @@ static inline u32 bpf_prog_run_clear_cb(const struct bpf_prog *prog,
 
 DECLARE_BPF_DISPATCHER(xdp)
 
+DECLARE_STATIC_KEY_FALSE(bpf_master_redirect_enabled_key);
+
+u32 xdp_master_redirect(struct xdp_buff *xdp);
+
 static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
 					    struct xdp_buff *xdp)
 {
@@ -768,7 +772,14 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog,
 	 * under local_bh_disable(), which provides the needed RCU protection
 	 * for accessing map entries.
 	 */
-	return __BPF_PROG_RUN(prog, xdp, BPF_DISPATCHER_FUNC(xdp));
+	u32 act = __BPF_PROG_RUN(prog, xdp, BPF_DISPATCHER_FUNC(xdp));
+
+	if (static_branch_unlikely(&bpf_master_redirect_enabled_key)) {
+		if (act == XDP_TX && netif_is_bond_slave(xdp->rxq->dev))
+			act = xdp_master_redirect(xdp);
+	}
+
+	return act;
 }
 
 void bpf_prog_change_xdp(struct bpf_prog *prev_prog, struct bpf_prog *prog);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 42f6f866d5f3..a380786429e1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1321,6 +1321,9 @@ struct netdev_net_notifier {
  *	that got dropped are freed/returned via xdp_return_frame().
  *	Returns negative number, means general error invoking ndo, meaning
  *	no frames were xmit'ed and core-caller will free all frames.
+ * struct net_device *(*ndo_xdp_get_xmit_slave)(struct net_device *dev,
+ *					        struct xdp_buff *xdp);
+ *      Get the xmit slave of master device based on the xdp_buff.
  * int (*ndo_xsk_wakeup)(struct net_device *dev, u32 queue_id, u32 flags);
  *      This function is used to wake up the softirq, ksoftirqd or kthread
  *	responsible for sending and/or receiving packets on a specific
@@ -1539,6 +1542,8 @@ struct net_device_ops {
 	int			(*ndo_xdp_xmit)(struct net_device *dev, int n,
 						struct xdp_frame **xdp,
 						u32 flags);
+	struct net_device *	(*ndo_xdp_get_xmit_slave)(struct net_device *dev,
+							  struct xdp_buff *xdp);
 	int			(*ndo_xsk_wakeup)(struct net_device *dev,
 						  u32 queue_id, u32 flags);
 	struct devlink_port *	(*ndo_get_devlink_port)(struct net_device *dev);
@@ -4071,6 +4076,7 @@ typedef int (*bpf_op_t)(struct net_device *dev, struct netdev_bpf *bpf);
 int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		      int fd, int expected_fd, u32 flags);
 int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
+u8 dev_xdp_prog_count(struct net_device *dev);
 u32 dev_xdp_prog_id(struct net_device *dev, enum bpf_xdp_mode mode);
 
 int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index 3ee58876e8f5..27023ea933dd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9353,7 +9353,7 @@ static struct bpf_prog *dev_xdp_prog(struct net_device *dev,
 	return dev->xdp_state[mode].prog;
 }
 
-static u8 dev_xdp_prog_count(struct net_device *dev)
+u8 dev_xdp_prog_count(struct net_device *dev)
 {
 	u8 count = 0;
 	int i;
@@ -9363,6 +9363,7 @@ static u8 dev_xdp_prog_count(struct net_device *dev)
 			count++;
 	return count;
 }
+EXPORT_SYMBOL_GPL(dev_xdp_prog_count);
 
 u32 dev_xdp_prog_id(struct net_device *dev, enum bpf_xdp_mode mode)
 {
@@ -9456,6 +9457,8 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
 {
 	unsigned int num_modes = hweight32(flags & XDP_FLAGS_MODES);
 	struct bpf_prog *cur_prog;
+	struct net_device *upper;
+	struct list_head *iter;
 	enum bpf_xdp_mode mode;
 	bpf_op_t bpf_op;
 	int err;
@@ -9494,6 +9497,14 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
 		return -EBUSY;
 	}
 
+	/* don't allow if an upper device already has a program */
+	netdev_for_each_upper_dev_rcu(dev, upper, iter) {
+		if (dev_xdp_prog_count(upper) > 0) {
+			NL_SET_ERR_MSG(extack, "Cannot attach when an upper device already has a program");
+			return -EEXIST;
+		}
+	}
+
 	cur_prog = dev_xdp_prog(dev, mode);
 	/* can't replace attached prog with link */
 	if (link && cur_prog) {
diff --git a/net/core/filter.c b/net/core/filter.c
index faf29fd82276..ff62cd39046d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3950,6 +3950,31 @@ void bpf_clear_redirect_map(struct bpf_map *map)
 	}
 }
 
+DEFINE_STATIC_KEY_FALSE(bpf_master_redirect_enabled_key);
+EXPORT_SYMBOL_GPL(bpf_master_redirect_enabled_key);
+
+u32 xdp_master_redirect(struct xdp_buff *xdp)
+{
+	struct net_device *master, *slave;
+	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
+
+	master = netdev_master_upper_dev_get_rcu(xdp->rxq->dev);
+	slave = master->netdev_ops->ndo_xdp_get_xmit_slave(master, xdp);
+	if (slave && slave != xdp->rxq->dev) {
+		/* The target device is different from the receiving device, so
+		 * redirect it to the new device.
+		 * Using XDP_REDIRECT gets the correct behaviour from XDP enabled
+		 * drivers to unmap the packet from their rx ring.
+		 */
+		ri->tgt_index = slave->ifindex;
+		ri->map_id = INT_MAX;
+		ri->map_type = BPF_MAP_TYPE_UNSPEC;
+		return XDP_REDIRECT;
+	}
+	return XDP_TX;
+}
+EXPORT_SYMBOL_GPL(xdp_master_redirect);
+
 int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
 		    struct bpf_prog *xdp_prog)
 {

From patchwork Fri Jul 30 06:18:18 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jussi Maki <joamaki@gmail.com>
X-Patchwork-Id: 492117
Return-Path: <netdev-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-12.3 required=3.0 tests=BAYES_00,
 DATE_IN_PAST_96_XX, 
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FORGED_FROMDOMAIN,
 FREEMAIL_FROM, 
 HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
 MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED,
 USER_AGENT_GIT
 autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 6ACAAC4338F
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:30 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id 4D49460EB2
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S238226AbhHDMql (ORCPT <rfc822;netdev@archiver.kernel.org>);
 Wed, 4 Aug 2021 08:46:41 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59222 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by
 vger.kernel.org with ESMTP id S238214AbhHDMqY (ORCPT
 <rfc822;netdev@vger.kernel.org>); Wed, 4 Aug 2021 08:46:24 -0400
Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com
 [IPv6:2a00:1450:4864:20::332])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1AEBC0617A4;
 Wed,  4 Aug 2021 05:45:50 -0700 (PDT)
Received: by mail-wm1-x332.google.com with SMTP id n11so1140416wmd.2;
 Wed, 04 Aug 2021 05:45:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=auVShO/KOPLd/Z9skkoeLwX+SW55ROIfbzPJUZIBiDs=;
 b=t/O05RMpyuty0pLBsHSYZx/WLGgNwg+GeWetmTByCN/o0JX7N40sTtA5Luw9iRdgih
 QFRzD0qyyxqcIMKUc8dTQefd3uqbhBH5i7/4hIcLdvHO+6yS0XJZ2LPojx65NBlpsxkx
 ud5ctWFzRs8TMw0nCLGYML4MHBg7248rEMstL0SbM/dywSZOtTn0xkzee2vy53p1bqpL
 3hkL0g0WRE6eMAWIg76WRPem5MRuvM7W3PgS2BmFBpzvRy+wuXAq7UI4EQfJywuQkY/l
 q/zhV7U/10GqqRMpLInCVuuc3CKNCucLwffTTq3+uj/8DO/MeqhNV5CcCrwyPQ2WciHw
 1+lQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=auVShO/KOPLd/Z9skkoeLwX+SW55ROIfbzPJUZIBiDs=;
 b=bIIDF1+JCCx/cP7bA+l2WRcXUeIxzhs98nCYmVarJfuq57DY+ddREG5L2AdOZ/yXKy
 z9jCy5q9Cv+DqQFs3UI5CoPkhu/4BqsblsgQLUCuz4PE2bP6T4yXUWEf4uVJtagLhbFA
 W0bVAWxX5YNFzmanTnP32u7+i1hMyv7GMFBO9/oEaBfGcMGPFjrRt4KnVsn5/PWz8IWo
 pw3C4HS5ZqbKkWPFbtHo4/Icu6bnw3zI3X3Pqos3myVGpT9Xb+ljPUqNBpOPdfjAY/oM
 c/IqgCA3D4cQKJxOfCd6y+8TIfb4rAtDVHEWOP7Mbn6huYnxpDmc1k2VYusK5Ki0qFEL
 Jj3w==
X-Gm-Message-State: AOAM532t3fzMAx2IYLuDKE2H9dms4fmY+5r5gJvNZDAgwtDGor6r9l5J
 0gVPzaFrHr02W/58NSYyYeXN70vy4B8w0kQ=
X-Google-Smtp-Source: ABdhPJxIPQNwmvIwSuz7gf8D16n4SUfMXTO4pTSUl6berccDOGSU1nBnKp29L9MN1jXmTQbjIM1TqQ==
X-Received: by 2002:a1c:3886:: with SMTP id f128mr9806007wma.85.1628081148852; 
 Wed, 04 Aug 2021 05:45:48 -0700 (PDT)
Received: from localhost.localdomain ([77.109.191.101])
 by smtp.gmail.com with ESMTPSA id
 y4sm2257923wmi.22.2021.08.04.05.45.47
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 04 Aug 2021 05:45:48 -0700 (PDT)
From: Jussi Maki <joamaki@gmail.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, daniel@iogearbox.net, j.vosburgh@gmail.com,
 andy@greyhouse.net, vfalico@gmail.com, andrii@kernel.org,
 maciej.fijalkowski@intel.com, magnus.karlsson@intel.com,
 Jussi Maki <joamaki@gmail.com>
Subject: [PATCH bpf-next v5 3/7] net: bonding: Add XDP support to the
 bonding driver
Date: Fri, 30 Jul 2021 06:18:18 +0000
Message-Id: <20210730061822.6600-4-joamaki@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210730061822.6600-1-joamaki@gmail.com>
References: <20210609135537.1460244-1-joamaki@gmail.com>
 <20210730061822.6600-1-joamaki@gmail.com>
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

XDP is implemented in the bonding driver by transparently delegating
the XDP program loading, removal and xmit operations to the bonding
slave devices. The overall goal of this work is that XDP programs
can be attached to a bond device *without* any further changes (or
awareness) necessary to the program itself, meaning the same XDP
program can be attached to a native device but also a bonding device.

Semantics of XDP_TX when attached to a bond are equivalent in such
setting to the case when a tc/BPF program would be attached to the
bond, meaning transmitting the packet out of the bond itself using one
of the bond's configured xmit methods to select a slave device (rather
than XDP_TX on the slave itself). Handling of XDP_TX to transmit
using the configured bonding mechanism is therefore implemented by
rewriting the BPF program return value in bpf_prog_run_xdp. To avoid
performance impact this check is guarded by a static key, which is
incremented when a XDP program is loaded onto a bond device. This
approach was chosen to avoid changes to drivers implementing XDP. If
the slave device does not match the receive device, then XDP_REDIRECT
is transparently used to perform the redirection in order to have
the network driver release the packet from its RX ring.  The bonding
driver hashing functions have been refactored to allow reuse with
xdp_buff's to avoid code duplication.

The motivation for this change is to enable use of bonding (and
802.3ad) in hairpinning L4 load-balancers such as [1] implemented with
XDP and also to transparently support bond devices for projects that
use XDP given most modern NICs have dual port adapters.  An alternative
to this approach would be to implement 802.3ad in user-space and
implement the bonding load-balancing in the XDP program itself, but
is rather a cumbersome endeavor in terms of slave device management
(e.g. by watching netlink) and requires separate programs for native
vs bond cases for the orchestrator. A native in-kernel implementation
overcomes these issues and provides more flexibility.

Below are benchmark results done on two machines with 100Gbit
Intel E810 (ice) NIC and with 32-core 3970X on sending machine, and
16-core 3950X on receiving machine. 64 byte packets were sent with
pktgen-dpdk at full rate. Two issues [2, 3] were identified with the
ice driver, so the tests were performed with iommu=off and patch [2]
applied. Additionally the bonding round robin algorithm was modified
to use per-cpu tx counters as high CPU load (50% vs 10%) and high rate
of cache misses were caused by the shared rr_tx_counter (see patch
2/3). The statistics were collected using "sar -n dev -u 1 10".

 -----------------------|  CPU  |--| rxpck/s |--| txpck/s |----
 without patch (1 dev):
   XDP_DROP:              3.15%      48.6Mpps
   XDP_TX:                3.12%      18.3Mpps     18.3Mpps
   XDP_DROP (RSS):        9.47%      116.5Mpps
   XDP_TX (RSS):          9.67%      25.3Mpps     24.2Mpps
 -----------------------
 with patch, bond (1 dev):
   XDP_DROP:              3.14%      46.7Mpps
   XDP_TX:                3.15%      13.9Mpps     13.9Mpps
   XDP_DROP (RSS):        10.33%     117.2Mpps
   XDP_TX (RSS):          10.64%     25.1Mpps     24.0Mpps
 -----------------------
 with patch, bond (2 devs):
   XDP_DROP:              6.27%      92.7Mpps
   XDP_TX:                6.26%      17.6Mpps     17.5Mpps
   XDP_DROP (RSS):       11.38%      117.2Mpps
   XDP_TX (RSS):         14.30%      28.7Mpps     27.4Mpps
 --------------------------------------------------------------

RSS: Receive Side Scaling, e.g. the packets were sent to a range of
destination IPs.

[1]: https://cilium.io/blog/2021/05/20/cilium-110#standalonelb
[2]: https://lore.kernel.org/bpf/20210601113236.42651-1-maciej.fijalkowski@intel.com/T/#t
[3]: https://lore.kernel.org/bpf/CAHn8xckNXci+X_Eb2WMv4uVYjO2331UWB2JLtXr_58z0Av8+8A@mail.gmail.com/

Signed-off-by: Jussi Maki <joamaki@gmail.com>
---
 drivers/net/bonding/bond_main.c | 309 +++++++++++++++++++++++++++++++-
 include/net/bonding.h           |   1 +
 2 files changed, 309 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index dcec5cc4dab1..fcd01acd1c83 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -317,6 +317,19 @@ bool bond_sk_check(struct bonding *bond)
 	}
 }
 
+static bool bond_xdp_check(struct bonding *bond)
+{
+	switch (BOND_MODE(bond)) {
+	case BOND_MODE_ROUNDROBIN:
+	case BOND_MODE_ACTIVEBACKUP:
+	case BOND_MODE_8023AD:
+	case BOND_MODE_XOR:
+		return true;
+	default:
+		return false;
+	}
+}
+
 /*---------------------------------- VLAN -----------------------------------*/
 
 /* In the following 2 functions, bond_vlan_rx_add_vid and bond_vlan_rx_kill_vid,
@@ -2133,6 +2146,41 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 		bond_update_slave_arr(bond, NULL);
 
 
+	if (!slave_dev->netdev_ops->ndo_bpf ||
+	    !slave_dev->netdev_ops->ndo_xdp_xmit) {
+		if (bond->xdp_prog) {
+			NL_SET_ERR_MSG(extack, "Slave does not support XDP");
+			slave_err(bond_dev, slave_dev, "Slave does not support XDP\n");
+			res = -EOPNOTSUPP;
+			goto err_sysfs_del;
+		}
+	} else {
+		struct netdev_bpf xdp = {
+			.command = XDP_SETUP_PROG,
+			.flags   = 0,
+			.prog    = bond->xdp_prog,
+			.extack  = extack,
+		};
+
+		if (dev_xdp_prog_count(slave_dev) > 0) {
+			NL_SET_ERR_MSG(extack,
+				       "Slave has XDP program loaded, please unload before enslaving");
+			slave_err(bond_dev, slave_dev,
+				  "Slave has XDP program loaded, please unload before enslaving\n");
+			res = -EOPNOTSUPP;
+			goto err_sysfs_del;
+		}
+
+		res = slave_dev->netdev_ops->ndo_bpf(slave_dev, &xdp);
+		if (res < 0) {
+			/* ndo_bpf() sets extack error message */
+			slave_dbg(bond_dev, slave_dev, "Error %d calling ndo_bpf\n", res);
+			goto err_sysfs_del;
+		}
+		if (bond->xdp_prog)
+			bpf_prog_inc(bond->xdp_prog);
+	}
+
 	slave_info(bond_dev, slave_dev, "Enslaving as %s interface with %s link\n",
 		   bond_is_active_slave(new_slave) ? "an active" : "a backup",
 		   new_slave->link != BOND_LINK_DOWN ? "an up" : "a down");
@@ -2252,6 +2300,17 @@ static int __bond_release_one(struct net_device *bond_dev,
 	/* recompute stats just before removing the slave */
 	bond_get_stats(bond->dev, &bond->bond_stats);
 
+	if (bond->xdp_prog) {
+		struct netdev_bpf xdp = {
+			.command = XDP_SETUP_PROG,
+			.flags   = 0,
+			.prog	 = NULL,
+			.extack  = NULL,
+		};
+		if (slave_dev->netdev_ops->ndo_bpf(slave_dev, &xdp))
+			slave_warn(bond_dev, slave_dev, "failed to unload XDP program\n");
+	}
+
 	bond_upper_dev_unlink(bond, slave);
 	/* unregister rx_handler early so bond_handle_frame wouldn't be called
 	 * for this slave anymore.
@@ -3635,7 +3694,7 @@ static inline u32 bond_eth_hash(struct sk_buff *skb, const void *data, int mhoff
 		return 0;
 
 	ep = (struct ethhdr *)(data + mhoff);
-	return ep->h_dest[5] ^ ep->h_source[5] ^ ep->h_proto;
+	return ep->h_dest[5] ^ ep->h_source[5] ^ be16_to_cpu(ep->h_proto);
 }
 
 static bool bond_flow_ip(struct sk_buff *skb, struct flow_keys *fk, const void *data,
@@ -3804,6 +3863,26 @@ u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb)
 				skb_headlen(skb));
 }
 
+/**
+ * bond_xmit_hash_xdp - generate a hash value based on the xmit policy
+ * @bond: bonding device
+ * @xdp: buffer to use for headers
+ *
+ * The XDP variant of bond_xmit_hash.
+ */
+static u32 bond_xmit_hash_xdp(struct bonding *bond, struct xdp_buff *xdp)
+{
+	struct ethhdr *eth;
+
+	if (xdp->data + sizeof(struct ethhdr) > xdp->data_end)
+		return 0;
+
+	eth = (struct ethhdr *)xdp->data;
+
+	return __bond_xmit_hash(bond, NULL, xdp->data, eth->h_proto, 0,
+				sizeof(struct ethhdr), xdp->data_end - xdp->data);
+}
+
 /*-------------------------- Device entry points ----------------------------*/
 
 void bond_work_init_all(struct bonding *bond)
@@ -4420,6 +4499,47 @@ static struct slave *bond_xmit_roundrobin_slave_get(struct bonding *bond,
 	return NULL;
 }
 
+static struct slave *bond_xdp_xmit_roundrobin_slave_get(struct bonding *bond,
+							struct xdp_buff *xdp)
+{
+	struct slave *slave;
+	int slave_cnt;
+	u32 slave_id;
+	const struct ethhdr *eth;
+	void *data = xdp->data;
+
+	if (data + sizeof(struct ethhdr) > xdp->data_end)
+		goto non_igmp;
+
+	eth = (struct ethhdr *)data;
+	data += sizeof(struct ethhdr);
+
+	/* See comment on IGMP in bond_xmit_roundrobin_slave_get() */
+	if (eth->h_proto == htons(ETH_P_IP)) {
+		const struct iphdr *iph;
+
+		if (data + sizeof(struct iphdr) > xdp->data_end)
+			goto non_igmp;
+
+		iph = (struct iphdr *)data;
+
+		if (iph->protocol == IPPROTO_IGMP) {
+			slave = rcu_dereference(bond->curr_active_slave);
+			if (slave)
+				return slave;
+			return bond_get_slave_by_id(bond, 0);
+		}
+	}
+
+non_igmp:
+	slave_cnt = READ_ONCE(bond->slave_cnt);
+	if (likely(slave_cnt)) {
+		slave_id = bond_rr_gen_slave_id(bond) % slave_cnt;
+		return bond_get_slave_by_id(bond, slave_id);
+	}
+	return NULL;
+}
+
 static netdev_tx_t bond_xmit_roundrobin(struct sk_buff *skb,
 					struct net_device *bond_dev)
 {
@@ -4635,6 +4755,22 @@ static struct slave *bond_xmit_3ad_xor_slave_get(struct bonding *bond,
 	return slave;
 }
 
+static struct slave *bond_xdp_xmit_3ad_xor_slave_get(struct bonding *bond,
+						     struct xdp_buff *xdp)
+{
+	struct bond_up_slave *slaves;
+	unsigned int count;
+	u32 hash;
+
+	hash = bond_xmit_hash_xdp(bond, xdp);
+	slaves = rcu_dereference(bond->usable_slaves);
+	count = slaves ? READ_ONCE(slaves->count) : 0;
+	if (unlikely(!count))
+		return NULL;
+
+	return slaves->arr[hash % count];
+}
+
 /* Use this Xmit function for 3AD as well as XOR modes. The current
  * usable slave array is formed in the control path. The xmit function
  * just calculates hash and sends the packet out.
@@ -4919,6 +5055,174 @@ static netdev_tx_t bond_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return ret;
 }
 
+static struct net_device *
+bond_xdp_get_xmit_slave(struct net_device *bond_dev, struct xdp_buff *xdp)
+{
+	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *slave;
+
+	/* Caller needs to hold rcu_read_lock() */
+
+	switch (BOND_MODE(bond)) {
+	case BOND_MODE_ROUNDROBIN:
+		slave = bond_xdp_xmit_roundrobin_slave_get(bond, xdp);
+		break;
+
+	case BOND_MODE_ACTIVEBACKUP:
+		slave = bond_xmit_activebackup_slave_get(bond);
+		break;
+
+	case BOND_MODE_8023AD:
+	case BOND_MODE_XOR:
+		slave = bond_xdp_xmit_3ad_xor_slave_get(bond, xdp);
+		break;
+
+	default:
+		/* Should never happen. Mode guarded by bond_xdp_check() */
+		netdev_err(bond_dev, "Unknown bonding mode %d for xdp xmit\n", BOND_MODE(bond));
+		WARN_ON_ONCE(1);
+		return NULL;
+	}
+
+	if (slave)
+		return slave->dev;
+
+	return NULL;
+}
+
+static int bond_xdp_xmit(struct net_device *bond_dev,
+			 int n, struct xdp_frame **frames, u32 flags)
+{
+	int nxmit, err = -ENXIO;
+
+	rcu_read_lock();
+
+	for (nxmit = 0; nxmit < n; nxmit++) {
+		struct xdp_frame *frame = frames[nxmit];
+		struct xdp_frame *frames1[] = {frame};
+		struct net_device *slave_dev;
+		struct xdp_buff xdp;
+
+		xdp_convert_frame_to_buff(frame, &xdp);
+
+		slave_dev = bond_xdp_get_xmit_slave(bond_dev, &xdp);
+		if (!slave_dev) {
+			err = -ENXIO;
+			break;
+		}
+
+		err = slave_dev->netdev_ops->ndo_xdp_xmit(slave_dev, 1, frames1, flags);
+		if (err < 1)
+			break;
+	}
+
+	rcu_read_unlock();
+
+	/* If error happened on the first frame then we can pass the error up, otherwise
+	 * report the number of frames that were xmitted.
+	 */
+	if (err < 0)
+		return (nxmit == 0 ? err : nxmit);
+
+	return nxmit;
+}
+
+static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog,
+			struct netlink_ext_ack *extack)
+{
+	struct bonding *bond = netdev_priv(dev);
+	struct list_head *iter;
+	struct slave *slave, *rollback_slave;
+	struct bpf_prog *old_prog;
+	struct netdev_bpf xdp = {
+		.command = XDP_SETUP_PROG,
+		.flags   = 0,
+		.prog    = prog,
+		.extack  = extack,
+	};
+	int err;
+
+	ASSERT_RTNL();
+
+	if (!bond_xdp_check(bond))
+		return -EOPNOTSUPP;
+
+	old_prog = bond->xdp_prog;
+	bond->xdp_prog = prog;
+
+	bond_for_each_slave(bond, slave, iter) {
+		struct net_device *slave_dev = slave->dev;
+
+		if (!slave_dev->netdev_ops->ndo_bpf ||
+		    !slave_dev->netdev_ops->ndo_xdp_xmit) {
+			NL_SET_ERR_MSG(extack, "Slave device does not support XDP");
+			slave_err(dev, slave_dev, "Slave does not support XDP\n");
+			err = -EOPNOTSUPP;
+			goto err;
+		}
+
+		if (dev_xdp_prog_count(slave_dev) > 0) {
+			NL_SET_ERR_MSG(extack,
+				       "Slave has XDP program loaded, please unload before enslaving");
+			slave_err(dev, slave_dev,
+				  "Slave has XDP program loaded, please unload before enslaving\n");
+			err = -EOPNOTSUPP;
+			goto err;
+		}
+
+		err = slave_dev->netdev_ops->ndo_bpf(slave_dev, &xdp);
+		if (err < 0) {
+			/* ndo_bpf() sets extack error message */
+			slave_err(dev, slave_dev, "Error %d calling ndo_bpf\n", err);
+			goto err;
+		}
+		if (prog)
+			bpf_prog_inc(prog);
+	}
+
+	if (old_prog)
+		bpf_prog_put(old_prog);
+
+	if (prog)
+		static_branch_inc(&bpf_master_redirect_enabled_key);
+	else
+		static_branch_dec(&bpf_master_redirect_enabled_key);
+
+	return 0;
+
+err:
+	/* unwind the program changes */
+	bond->xdp_prog = old_prog;
+	xdp.prog = old_prog;
+	xdp.extack = NULL; /* do not overwrite original error */
+
+	bond_for_each_slave(bond, rollback_slave, iter) {
+		struct net_device *slave_dev = rollback_slave->dev;
+		int err_unwind;
+
+		if (slave == rollback_slave)
+			break;
+
+		err_unwind = slave_dev->netdev_ops->ndo_bpf(slave_dev, &xdp);
+		if (err_unwind < 0)
+			slave_err(dev, slave_dev,
+				  "Error %d when unwinding XDP program change\n", err_unwind);
+		else if (xdp.prog)
+			bpf_prog_inc(xdp.prog);
+	}
+	return err;
+}
+
+static int bond_xdp(struct net_device *dev, struct netdev_bpf *xdp)
+{
+	switch (xdp->command) {
+	case XDP_SETUP_PROG:
+		return bond_xdp_set(dev, xdp->prog, xdp->extack);
+	default:
+		return -EINVAL;
+	}
+}
+
 static u32 bond_mode_bcast_speed(struct slave *slave, u32 speed)
 {
 	if (speed == 0 || speed == SPEED_UNKNOWN)
@@ -5005,6 +5309,9 @@ static const struct net_device_ops bond_netdev_ops = {
 	.ndo_features_check	= passthru_features_check,
 	.ndo_get_xmit_slave	= bond_xmit_get_slave,
 	.ndo_sk_get_lower_dev	= bond_sk_get_lower_dev,
+	.ndo_bpf		= bond_xdp,
+	.ndo_xdp_xmit           = bond_xdp_xmit,
+	.ndo_xdp_get_xmit_slave = bond_xdp_get_xmit_slave,
 };
 
 static const struct device_type bond_type = {
diff --git a/include/net/bonding.h b/include/net/bonding.h
index 625d9c72dee3..b91c365e4e95 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -258,6 +258,7 @@ struct bonding {
 	/* protecting ipsec_list */
 	spinlock_t ipsec_lock;
 #endif /* CONFIG_XFRM_OFFLOAD */
+	struct bpf_prog *xdp_prog;
 };
 
 #define bond_slave_get_rcu(dev) \

From patchwork Fri Jul 30 06:18:19 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jussi Maki <joamaki@gmail.com>
X-Patchwork-Id: 492765
Return-Path: <netdev-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-12.3 required=3.0 tests=BAYES_00,
 DATE_IN_PAST_96_XX, 
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FORGED_FROMDOMAIN,
 FREEMAIL_FROM, 
 HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
 MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT
 autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id EF162C4320A
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:24 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id D43D760560
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:24 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S238206AbhHDMqg (ORCPT <rfc822;netdev@archiver.kernel.org>);
 Wed, 4 Aug 2021 08:46:36 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59224 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by
 vger.kernel.org with ESMTP id S238212AbhHDMqY (ORCPT
 <rfc822;netdev@vger.kernel.org>); Wed, 4 Aug 2021 08:46:24 -0400
Received: from mail-wr1-x432.google.com (mail-wr1-x432.google.com
 [IPv6:2a00:1450:4864:20::432])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7AB6BC0617A5;
 Wed,  4 Aug 2021 05:45:51 -0700 (PDT)
Received: by mail-wr1-x432.google.com with SMTP id p5so2099266wro.7;
 Wed, 04 Aug 2021 05:45:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=FwNUMow4yx8eykTioVcbeIXw0nALj5ZFKgzUGGqXCuI=;
 b=iqigE/qpNtNi4z//7iL6mOm4SslaiugFIMkW0jN02Ww+RdS4UlYCTXZG5klmdKiiSD
 grHyc9bVXLKQr8MfocTLimBPDMVe4wGL5Kkxr8OhVLKTk1dv1DKLVM4MUFsJek40uwzZ
 EBg5T6e4nOhFwGFB5DtYBN2cjClbaPfEbySrxk/IaGMeKLU9zGJ/6RyxKLU0ZNigOvQJ
 mccAWmh050yug3cASmRaLfbxT10yp7sg8rGBm9tw9kfIGKWwM6OgBXNr2CbpDeboi+e4
 LJJy9Aj/PX/YWC8QM1sIfFrgDtQbfFpqKlRNvk71Tq++2I09+Lm30Gx2z77vNBwaVzT/
 DMLg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=FwNUMow4yx8eykTioVcbeIXw0nALj5ZFKgzUGGqXCuI=;
 b=Mpf2jYBlxaVLp4Agl6UpvPW9BjVjZnItvQGIL85YyaLYqnlU6oCggSxA0noWnawEgE
 BsCWVhm8mfvQPj6WXxDLXp6+dTXlIym68tknLmkKXXV1ZwFjSPlD8yTG4GIWfinJuEmW
 ImAIS4VgbbWzM7xmlj5L895wKI6g4rtXKpt6fgNfLXwvxn/cHdX8w5PMEkuCZonHCmOR
 KBWMZn8FbAzxLcTZgErTBFNH0fAjgbYa0Vb8JvXvO32y13lO5ervMJK5ynSmJYoBssjV
 RkPRC35A7PuoM72MdOp4ucStvedWJ+pkYr6VWV4jmvRtnXA77Ngtlz7lY0alubES+ry3
 ru2g==
X-Gm-Message-State: AOAM530JI0iaz0TH+EydxKdI/mZ6ELJigmnYb30YocJNd+Fh+D2eUizp
 Y8dA2eNkstssAGKYI560ApGsdkbO3hAofFk=
X-Google-Smtp-Source: ABdhPJzJJrD1RQ9ztGi79k09b4fIYXYyXWW86QrOp0Ng2p1eZVx0KH/JbctGhHmHA8SuG2HBj0+SPA==
X-Received: by 2002:adf:f5d1:: with SMTP id
 k17mr29043590wrp.123.1628081149740; 
 Wed, 04 Aug 2021 05:45:49 -0700 (PDT)
Received: from localhost.localdomain ([77.109.191.101])
 by smtp.gmail.com with ESMTPSA id
 y4sm2257923wmi.22.2021.08.04.05.45.48
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 04 Aug 2021 05:45:49 -0700 (PDT)
From: Jussi Maki <joamaki@gmail.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, daniel@iogearbox.net, j.vosburgh@gmail.com,
 andy@greyhouse.net, vfalico@gmail.com, andrii@kernel.org,
 maciej.fijalkowski@intel.com, magnus.karlsson@intel.com,
 Jussi Maki <joamaki@gmail.com>
Subject: [PATCH bpf-next v5 4/7] devmap: Exclude XDP broadcast to master device
Date: Fri, 30 Jul 2021 06:18:19 +0000
Message-Id: <20210730061822.6600-5-joamaki@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210730061822.6600-1-joamaki@gmail.com>
References: <20210609135537.1460244-1-joamaki@gmail.com>
 <20210730061822.6600-1-joamaki@gmail.com>
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

If the ingress device is bond slave, do not broadcast back
through it or the bond master.

Signed-off-by: Jussi Maki <joamaki@gmail.com>
---
 kernel/bpf/devmap.c | 69 +++++++++++++++++++++++++++++++++++++++------
 1 file changed, 60 insertions(+), 9 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 542e94fa30b4..f02d04540c0c 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -534,10 +534,9 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
 	return __xdp_enqueue(dev, xdp, dev_rx, dst->xdp_prog);
 }
 
-static bool is_valid_dst(struct bpf_dtab_netdev *obj, struct xdp_buff *xdp,
-			 int exclude_ifindex)
+static bool is_valid_dst(struct bpf_dtab_netdev *obj, struct xdp_buff *xdp)
 {
-	if (!obj || obj->dev->ifindex == exclude_ifindex ||
+	if (!obj ||
 	    !obj->dev->netdev_ops->ndo_xdp_xmit)
 		return false;
 
@@ -562,17 +561,48 @@ static int dev_map_enqueue_clone(struct bpf_dtab_netdev *obj,
 	return 0;
 }
 
+static inline bool is_ifindex_excluded(int *excluded, int num_excluded, int ifindex)
+{
+	while (num_excluded--) {
+		if (ifindex == excluded[num_excluded])
+			return true;
+	}
+	return false;
+}
+
+/* Get ifindex of each upper device. 'indexes' must be able to hold at
+ * least MAX_NEST_DEV elements.
+ * Returns the number of ifindexes added.
+ */
+static int get_upper_ifindexes(struct net_device *dev, int *indexes)
+{
+	struct net_device *upper;
+	struct list_head *iter;
+	int n = 0;
+
+	netdev_for_each_upper_dev_rcu(dev, upper, iter) {
+		indexes[n++] = upper->ifindex;
+	}
+	return n;
+}
+
 int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
 			  struct bpf_map *map, bool exclude_ingress)
 {
 	struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
-	int exclude_ifindex = exclude_ingress ? dev_rx->ifindex : 0;
 	struct bpf_dtab_netdev *dst, *last_dst = NULL;
+	int excluded_devices[1+MAX_NEST_DEV];
 	struct hlist_head *head;
 	struct xdp_frame *xdpf;
+	int num_excluded = 0;
 	unsigned int i;
 	int err;
 
+	if (exclude_ingress) {
+		num_excluded = get_upper_ifindexes(dev_rx, excluded_devices);
+		excluded_devices[num_excluded++] = dev_rx->ifindex;
+	}
+
 	xdpf = xdp_convert_buff_to_frame(xdp);
 	if (unlikely(!xdpf))
 		return -EOVERFLOW;
@@ -581,7 +611,10 @@ int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
 		for (i = 0; i < map->max_entries; i++) {
 			dst = rcu_dereference_check(dtab->netdev_map[i],
 						    rcu_read_lock_bh_held());
-			if (!is_valid_dst(dst, xdp, exclude_ifindex))
+			if (!is_valid_dst(dst, xdp))
+				continue;
+
+			if (is_ifindex_excluded(excluded_devices, num_excluded, dst->dev->ifindex))
 				continue;
 
 			/* we only need n-1 clones; last_dst enqueued below */
@@ -601,7 +634,11 @@ int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
 			head = dev_map_index_hash(dtab, i);
 			hlist_for_each_entry_rcu(dst, head, index_hlist,
 						 lockdep_is_held(&dtab->index_lock)) {
-				if (!is_valid_dst(dst, xdp, exclude_ifindex))
+				if (!is_valid_dst(dst, xdp))
+					continue;
+
+				if (is_ifindex_excluded(excluded_devices, num_excluded,
+							dst->dev->ifindex))
 					continue;
 
 				/* we only need n-1 clones; last_dst enqueued below */
@@ -675,18 +712,27 @@ int dev_map_redirect_multi(struct net_device *dev, struct sk_buff *skb,
 			   bool exclude_ingress)
 {
 	struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
-	int exclude_ifindex = exclude_ingress ? dev->ifindex : 0;
 	struct bpf_dtab_netdev *dst, *last_dst = NULL;
+	int excluded_devices[1+MAX_NEST_DEV];
 	struct hlist_head *head;
 	struct hlist_node *next;
+	int num_excluded = 0;
 	unsigned int i;
 	int err;
 
+	if (exclude_ingress) {
+		num_excluded = get_upper_ifindexes(dev, excluded_devices);
+		excluded_devices[num_excluded++] = dev->ifindex;
+	}
+
 	if (map->map_type == BPF_MAP_TYPE_DEVMAP) {
 		for (i = 0; i < map->max_entries; i++) {
 			dst = rcu_dereference_check(dtab->netdev_map[i],
 						    rcu_read_lock_bh_held());
-			if (!dst || dst->dev->ifindex == exclude_ifindex)
+			if (!dst)
+				continue;
+
+			if (is_ifindex_excluded(excluded_devices, num_excluded, dst->dev->ifindex))
 				continue;
 
 			/* we only need n-1 clones; last_dst enqueued below */
@@ -700,12 +746,17 @@ int dev_map_redirect_multi(struct net_device *dev, struct sk_buff *skb,
 				return err;
 
 			last_dst = dst;
+
 		}
 	} else { /* BPF_MAP_TYPE_DEVMAP_HASH */
 		for (i = 0; i < dtab->n_buckets; i++) {
 			head = dev_map_index_hash(dtab, i);
 			hlist_for_each_entry_safe(dst, next, head, index_hlist) {
-				if (!dst || dst->dev->ifindex == exclude_ifindex)
+				if (!dst)
+					continue;
+
+				if (is_ifindex_excluded(excluded_devices, num_excluded,
+							dst->dev->ifindex))
 					continue;
 
 				/* we only need n-1 clones; last_dst enqueued below */

From patchwork Fri Jul 30 06:18:20 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jussi Maki <joamaki@gmail.com>
X-Patchwork-Id: 492119
Return-Path: <netdev-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-17.3 required=3.0 tests=BAYES_00,
 DATE_IN_PAST_96_XX, 
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FORGED_FROMDOMAIN,
 FREEMAIL_FROM, 
 HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
 MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING, SPF_HELO_NONE, SPF_PASS,
 USER_AGENT_GIT
 autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 0D6C8C4320A
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:22 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id DF2EF60FC3
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:21 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S238260AbhHDMqc (ORCPT <rfc822;netdev@archiver.kernel.org>);
 Wed, 4 Aug 2021 08:46:32 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59226 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by
 vger.kernel.org with ESMTP id S238242AbhHDMqY (ORCPT
 <rfc822;netdev@vger.kernel.org>); Wed, 4 Aug 2021 08:46:24 -0400
Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com
 [IPv6:2a00:1450:4864:20::334])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 803A7C0617B0;
 Wed,  4 Aug 2021 05:45:52 -0700 (PDT)
Received: by mail-wm1-x334.google.com with SMTP id
 l34-20020a05600c1d22b02902573c214807so3839332wms.2; 
 Wed, 04 Aug 2021 05:45:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=iKwt2CNIQOHR6e39/4WlE1XM4D8qqPBdFFLhXvUbs2c=;
 b=boG4J51h4t/f1s3OiPbetfQ7OhQlL6uGY30SuA7uQXq1kOHGWxqVuYizPCrX0o5OhP
 DmFPyy3YCgvhkva15Qn5pPRtfw8XWlJBqHy80yLsgeMZewm9uKhtrOY69kbiez7Aumqc
 Alns+Jo+W04lqYqindQKriNvDMKhIX+/C7bSSUSdkMpIvxOVbnt8jMxP6fmZbsJGevRY
 yywFw84Agp8xdf32zsbPYySgxxF+jb+MlHEPvlu4UYPPqoumpSj5KGCQHNKXxZB27Wsg
 neOaR+hA/WJs8hk4chwrCg+1SG5otVJzEzyC6OzboCks9av5AZew7gK5F3Xbcy//HCxB
 kFLg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=iKwt2CNIQOHR6e39/4WlE1XM4D8qqPBdFFLhXvUbs2c=;
 b=g70sShYCpW70GLEy36tJEucDLoLFlND2PiI0auPLAFD5NysAqt1t/dkh5/PCn+3hn8
 ly23WLRI+yV0TA6E0CmlM1aFHUXhQbCLVrT9tjO+hQQj57mgXJLyEj0v3/36oUc5JzSF
 wBCWULW3ON0P7Rd1tnY41iJ0Kue9XpH8XfWsdgODRpwt2i0ZkKROcZnunV0vYOG1HMQL
 sKYpJEXk9tFOfdq62MXhV4oomuJfqSMlWxDzrAhRMvKYCFXDtsQnwnnSYXCPLE9bUNTg
 vxaqu94TPcDKEfn3DaIGTP6DQCrDqHS/R0S//k9AAm5d/z33RTxPxpmHeUYupxgDqa5b
 /erg==
X-Gm-Message-State: AOAM531y8gGvHVOAv58NIJSW7nRqZI+3yUjMgLpRNq8ne+bYFwdmRaHV
 K6oCCEl2HkdmUmDu9NPg10CiTuEQDJVffKc=
X-Google-Smtp-Source: ABdhPJwRgeMhf3YyaXeKxcn8edAsZHYnpsHC9QuXSvNw1VQjSQm5ByyDzree3KSzSKZgg+Btk8lthQ==
X-Received: by 2002:a05:600c:ad6:: with SMTP id
 c22mr27774930wmr.114.1628081150838; 
 Wed, 04 Aug 2021 05:45:50 -0700 (PDT)
Received: from localhost.localdomain ([77.109.191.101])
 by smtp.gmail.com with ESMTPSA id
 y4sm2257923wmi.22.2021.08.04.05.45.49
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 04 Aug 2021 05:45:50 -0700 (PDT)
From: Jussi Maki <joamaki@gmail.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, daniel@iogearbox.net, j.vosburgh@gmail.com,
 andy@greyhouse.net, vfalico@gmail.com, andrii@kernel.org,
 maciej.fijalkowski@intel.com, magnus.karlsson@intel.com,
 Jussi Maki <joamaki@gmail.com>
Subject: [PATCH bpf-next v5 5/7] net: core: Allow
 netdev_lower_get_next_private_rcu in bh context
Date: Fri, 30 Jul 2021 06:18:20 +0000
Message-Id: <20210730061822.6600-6-joamaki@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210730061822.6600-1-joamaki@gmail.com>
References: <20210609135537.1460244-1-joamaki@gmail.com>
 <20210730061822.6600-1-joamaki@gmail.com>
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

For the XDP bonding slave lookup to work in the NAPI poll context
in which the redudant rcu_read_lock() has been removed we have to
follow the same approach as in [1] and modify the WARN_ON to also
check rcu_read_lock_bh_held().

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=694cea395fded425008e93cd90cfdf7a451674af

Signed-off-by: Jussi Maki <joamaki@gmail.com>
---
 net/core/dev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 27023ea933dd..ae1aecf97b58 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -7588,7 +7588,7 @@ void *netdev_lower_get_next_private_rcu(struct net_device *dev,
 {
 	struct netdev_adjacent *lower;
 
-	WARN_ON_ONCE(!rcu_read_lock_held());
+	WARN_ON_ONCE(!rcu_read_lock_held() && !rcu_read_lock_bh_held());
 
 	lower = list_entry_rcu((*iter)->next, struct netdev_adjacent, list);
 

From patchwork Fri Jul 30 06:18:22 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jussi Maki <joamaki@gmail.com>
X-Patchwork-Id: 492763
Return-Path: <netdev-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-12.3 required=3.0 tests=BAYES_00,
 DATE_IN_PAST_96_XX, 
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FORGED_FROMDOMAIN,
 FREEMAIL_FROM, 
 HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
 MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT
 autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
 by smtp.lore.kernel.org (Postfix) with ESMTP id F0682C4320E
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:32 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by mail.kernel.org (Postfix) with ESMTP id D862760F22
 for <netdev@archiver.kernel.org>;
 Wed,  4 Aug 2021 12:46:32 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S238278AbhHDMqm (ORCPT <rfc822;netdev@archiver.kernel.org>);
 Wed, 4 Aug 2021 08:46:42 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59162 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by
 vger.kernel.org with ESMTP id S237381AbhHDMqZ (ORCPT
 <rfc822;netdev@vger.kernel.org>); Wed, 4 Aug 2021 08:46:25 -0400
Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com
 [IPv6:2a00:1450:4864:20::333])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 636DEC0617B9;
 Wed,  4 Aug 2021 05:45:54 -0700 (PDT)
Received: by mail-wm1-x333.google.com with SMTP id k4so1132556wms.3;
 Wed, 04 Aug 2021 05:45:54 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=gZ6FaJLLG9ZRfjblU9BytNupe0ldFYDaj/P6zGG+tqI=;
 b=Axwl2dUl30hJitQcAU6KxQM4vFP8a4KthJjUGHl9WIH1MXJiYtwtyqPR2d+Dt2tKJO
 eFNmZhjZPqHlw+XovxwGL3COErToXjcyEZJY5dd11t7XsrY6yisAPzNvGdLBsm1w279F
 NcA02JGVYW7hPYP8++1tg8dFRv9tZQms7934BuI1jfBPUbWGVUY1gw4P6/PogOvk0x28
 sAK26tFDiWsITqnzKH3CSE+90OVMGcWh0Y1SROVXf150gO8n8n1ACSyYIONfqvUU9xcH
 X/1eJPRzeYNOoZxd82RThHOTWhwgEhhIoEutAGbRUrkcI3DXoYEmFEpAZO/TyYU6631c
 9flA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references;
 bh=gZ6FaJLLG9ZRfjblU9BytNupe0ldFYDaj/P6zGG+tqI=;
 b=SzANa2f5pn5Uolk76lEC5xaDlCGDI+VnJ5PGZlQI6y/M6bRkKUhXAZZmLLX/iAcBBW
 xJ41TtPchUWkJYXBUKwrymgBXYjIBTS46Wu2LhuNsVacZ2ybej3wrZcMUHjsMuRLUg6m
 APNL1Ex95wYf3sWoL3RtPuQfDwmB10cz5M7a5ZQ5XmaZ2kWGVczoPADKZrUHCfMT5FTI
 hYlbLpZ/AxTm8Lml/hm5Aeu1nUxXKqhhO61xr/TK33KAAzgTuzjPm2S7KY031qQASGI0
 gDn3VCrmQvTyox3k9vtROA3N4SDTazbh857naUROZQQzgHWHDnrq/ZLLbiSEZuVKJWtW
 oLcw==
X-Gm-Message-State: AOAM531ZjxvVWxr/hRFz+Looodx0fdZi+gshJyoNCz2wTluRkOat6sqS
 H38etzXnnvytL2efxcZZ84eGLlBELHt8iY8=
X-Google-Smtp-Source: ABdhPJwT5ZH+/uSr+nXj+0KVvd/zhWpWkyL3iG229DtwYzKIUgOLGrUg69+a5KPyXDeRsqyy2IQOCw==
X-Received: by 2002:a1c:c91a:: with SMTP id
 f26mr27524978wmb.162.1628081152597; 
 Wed, 04 Aug 2021 05:45:52 -0700 (PDT)
Received: from localhost.localdomain ([77.109.191.101])
 by smtp.gmail.com with ESMTPSA id
 y4sm2257923wmi.22.2021.08.04.05.45.51
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 04 Aug 2021 05:45:52 -0700 (PDT)
From: Jussi Maki <joamaki@gmail.com>
To: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org, daniel@iogearbox.net, j.vosburgh@gmail.com,
 andy@greyhouse.net, vfalico@gmail.com, andrii@kernel.org,
 maciej.fijalkowski@intel.com, magnus.karlsson@intel.com,
 Jussi Maki <joamaki@gmail.com>
Subject: [PATCH bpf-next v5 7/7] selftests/bpf: Add tests for XDP bonding
Date: Fri, 30 Jul 2021 06:18:22 +0000
Message-Id: <20210730061822.6600-8-joamaki@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210730061822.6600-1-joamaki@gmail.com>
References: <20210609135537.1460244-1-joamaki@gmail.com>
 <20210730061822.6600-1-joamaki@gmail.com>
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

Add a test suite to test XDP bonding implementation
over a pair of veth devices.

Signed-off-by: Jussi Maki <joamaki@gmail.com>
---
 .../selftests/bpf/prog_tests/xdp_bonding.c    | 533 ++++++++++++++++++
 1 file changed, 533 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c b/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c
new file mode 100644
index 000000000000..506fd9dc8aef
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c
@@ -0,0 +1,533 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/**
+ * Test XDP bonding support
+ *
+ * Sets up two bonded veth pairs between two fresh namespaces
+ * and verifies that XDP_TX program loaded on a bond device
+ * are correctly loaded onto the slave devices and XDP_TX'd
+ * packets are balanced using bonding.
+ */
+
+#define _GNU_SOURCE
+#include <sched.h>
+#include <net/if.h>
+#include <linux/if_link.h>
+#include "test_progs.h"
+#include "network_helpers.h"
+#include <linux/if_bonding.h>
+#include <linux/limits.h>
+#include <linux/udp.h>
+
+#include "xdp_dummy.skel.h"
+#include "xdp_redirect_multi_kern.skel.h"
+#include "xdp_tx.skel.h"
+
+#define BOND1_MAC {0x00, 0x11, 0x22, 0x33, 0x44, 0x55}
+#define BOND1_MAC_STR "00:11:22:33:44:55"
+#define BOND2_MAC {0x00, 0x22, 0x33, 0x44, 0x55, 0x66}
+#define BOND2_MAC_STR "00:22:33:44:55:66"
+#define NPACKETS 100
+
+static int root_netns_fd = -1;
+
+static void restore_root_netns(void)
+{
+	ASSERT_OK(setns(root_netns_fd, CLONE_NEWNET), "restore_root_netns");
+}
+
+int setns_by_name(char *name)
+{
+	int nsfd, err;
+	char nspath[PATH_MAX];
+
+	snprintf(nspath, sizeof(nspath), "%s/%s", "/var/run/netns", name);
+	nsfd = open(nspath, O_RDONLY | O_CLOEXEC);
+	if (nsfd < 0)
+		return -1;
+
+	err = setns(nsfd, CLONE_NEWNET);
+	close(nsfd);
+	return err;
+}
+
+static int get_rx_packets(const char *iface)
+{
+	FILE *f;
+	char line[512];
+	int iface_len = strlen(iface);
+
+	f = fopen("/proc/net/dev", "r");
+	if (!f)
+		return -1;
+
+	while (fgets(line, sizeof(line), f)) {
+		char *p = line;
+
+		while (*p == ' ')
+			p++; /* skip whitespace */
+		if (!strncmp(p, iface, iface_len)) {
+			p += iface_len;
+			if (*p++ != ':')
+				continue;
+			while (*p == ' ')
+				p++; /* skip whitespace */
+			while (*p && *p != ' ')
+				p++; /* skip rx bytes */
+			while (*p == ' ')
+				p++; /* skip whitespace */
+			fclose(f);
+			return atoi(p);
+		}
+	}
+	fclose(f);
+	return -1;
+}
+
+#define MAX_BPF_LINKS 8
+
+struct skeletons {
+	struct xdp_dummy *xdp_dummy;
+	struct xdp_tx *xdp_tx;
+	struct xdp_redirect_multi_kern *xdp_redirect_multi_kern;
+
+	int nlinks;
+	struct bpf_link *links[MAX_BPF_LINKS];
+};
+
+static int xdp_attach(struct skeletons *skeletons, struct bpf_program *prog, char *iface)
+{
+	struct bpf_link *link;
+	int ifindex;
+
+	ifindex = if_nametoindex(iface);
+	if (!ASSERT_GT(ifindex, 0, "get ifindex"))
+		return -1;
+
+	if (!ASSERT_LE(skeletons->nlinks, MAX_BPF_LINKS, "too many XDP programs attached"))
+		return -1;
+
+	link = bpf_program__attach_xdp(prog, ifindex);
+	if (!ASSERT_OK_PTR(link, "attach xdp program"))
+		return -1;
+
+	skeletons->links[skeletons->nlinks++] = link;
+	return 0;
+}
+
+enum {
+	BOND_ONE_NO_ATTACH = 0,
+	BOND_BOTH_AND_ATTACH,
+};
+
+static const char * const mode_names[] = {
+	[BOND_MODE_ROUNDROBIN]   = "balance-rr",
+	[BOND_MODE_ACTIVEBACKUP] = "active-backup",
+	[BOND_MODE_XOR]          = "balance-xor",
+	[BOND_MODE_BROADCAST]    = "broadcast",
+	[BOND_MODE_8023AD]       = "802.3ad",
+	[BOND_MODE_TLB]          = "balance-tlb",
+	[BOND_MODE_ALB]          = "balance-alb",
+};
+
+static const char * const xmit_policy_names[] = {
+	[BOND_XMIT_POLICY_LAYER2]       = "layer2",
+	[BOND_XMIT_POLICY_LAYER34]      = "layer3+4",
+	[BOND_XMIT_POLICY_LAYER23]      = "layer2+3",
+	[BOND_XMIT_POLICY_ENCAP23]      = "encap2+3",
+	[BOND_XMIT_POLICY_ENCAP34]      = "encap3+4",
+};
+
+static int bonding_setup(struct skeletons *skeletons, int mode, int xmit_policy,
+			 int bond_both_attach)
+{
+#define SYS(fmt, ...)						\
+	({							\
+		char cmd[1024];					\
+		snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__);	\
+		if (!ASSERT_OK(system(cmd), cmd))		\
+			return -1;				\
+	})
+
+	SYS("ip netns add ns_dst");
+	SYS("ip link add veth1_1 type veth peer name veth2_1 netns ns_dst");
+	SYS("ip link add veth1_2 type veth peer name veth2_2 netns ns_dst");
+
+	SYS("ip link add bond1 type bond mode %s xmit_hash_policy %s",
+	    mode_names[mode], xmit_policy_names[xmit_policy]);
+	SYS("ip link set bond1 up address " BOND1_MAC_STR " addrgenmode none");
+	SYS("ip -netns ns_dst link add bond2 type bond mode %s xmit_hash_policy %s",
+	    mode_names[mode], xmit_policy_names[xmit_policy]);
+	SYS("ip -netns ns_dst link set bond2 up address " BOND2_MAC_STR " addrgenmode none");
+
+	SYS("ip link set veth1_1 master bond1");
+	if (bond_both_attach == BOND_BOTH_AND_ATTACH) {
+		SYS("ip link set veth1_2 master bond1");
+	} else {
+		SYS("ip link set veth1_2 up addrgenmode none");
+
+		if (xdp_attach(skeletons, skeletons->xdp_dummy->progs.xdp_dummy_prog, "veth1_2"))
+			return -1;
+	}
+
+	SYS("ip -netns ns_dst link set veth2_1 master bond2");
+
+	if (bond_both_attach == BOND_BOTH_AND_ATTACH)
+		SYS("ip -netns ns_dst link set veth2_2 master bond2");
+	else
+		SYS("ip -netns ns_dst link set veth2_2 up addrgenmode none");
+
+	/* Load a dummy program on sending side as with veth peer needs to have a
+	 * XDP program loaded as well.
+	 */
+	if (xdp_attach(skeletons, skeletons->xdp_dummy->progs.xdp_dummy_prog, "bond1"))
+		return -1;
+
+	if (bond_both_attach == BOND_BOTH_AND_ATTACH) {
+		if (!ASSERT_OK(setns_by_name("ns_dst"), "set netns to ns_dst"))
+			return -1;
+
+		if (xdp_attach(skeletons, skeletons->xdp_tx->progs.xdp_tx, "bond2"))
+			return -1;
+
+		restore_root_netns();
+	}
+
+	return 0;
+
+#undef SYS
+}
+
+static void bonding_cleanup(struct skeletons *skeletons)
+{
+	restore_root_netns();
+	while (skeletons->nlinks) {
+		skeletons->nlinks--;
+		bpf_link__detach(skeletons->links[skeletons->nlinks]);
+	}
+	ASSERT_OK(system("ip link delete bond1"), "delete bond1");
+	ASSERT_OK(system("ip link delete veth1_1"), "delete veth1_1");
+	ASSERT_OK(system("ip link delete veth1_2"), "delete veth1_2");
+	ASSERT_OK(system("ip netns delete ns_dst"), "delete ns_dst");
+}
+
+static int send_udp_packets(int vary_dst_ip)
+{
+	struct ethhdr eh = {
+		.h_source = BOND1_MAC,
+		.h_dest = BOND2_MAC,
+		.h_proto = htons(ETH_P_IP),
+	};
+	uint8_t buf[128] = {};
+	struct iphdr *iph = (struct iphdr *)(buf + sizeof(eh));
+	struct udphdr *uh = (struct udphdr *)(buf + sizeof(eh) + sizeof(*iph));
+	int i, s = -1;
+	int ifindex;
+
+	s = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
+	if (!ASSERT_GE(s, 0, "socket"))
+		goto err;
+
+	ifindex = if_nametoindex("bond1");
+	if (!ASSERT_GT(ifindex, 0, "get bond1 ifindex"))
+		goto err;
+
+	memcpy(buf, &eh, sizeof(eh));
+	iph->ihl = 5;
+	iph->version = 4;
+	iph->tos = 16;
+	iph->id = 1;
+	iph->ttl = 64;
+	iph->protocol = IPPROTO_UDP;
+	iph->saddr = 1;
+	iph->daddr = 2;
+	iph->tot_len = htons(sizeof(buf) - ETH_HLEN);
+	iph->check = 0;
+
+	for (i = 1; i <= NPACKETS; i++) {
+		int n;
+		struct sockaddr_ll saddr_ll = {
+			.sll_ifindex = ifindex,
+			.sll_halen = ETH_ALEN,
+			.sll_addr = BOND2_MAC,
+		};
+
+		/* vary the UDP destination port for even distribution with roundrobin/xor modes */
+		uh->dest++;
+
+		if (vary_dst_ip)
+			iph->daddr++;
+
+		n = sendto(s, buf, sizeof(buf), 0, (struct sockaddr *)&saddr_ll, sizeof(saddr_ll));
+		if (!ASSERT_EQ(n, sizeof(buf), "sendto"))
+			goto err;
+	}
+
+	return 0;
+
+err:
+	if (s >= 0)
+		close(s);
+	return -1;
+}
+
+void test_xdp_bonding_with_mode(struct skeletons *skeletons, char *name, int mode, int xmit_policy)
+{
+	int bond1_rx;
+
+	if (!test__start_subtest(name))
+		return;
+
+	if (bonding_setup(skeletons, mode, xmit_policy, BOND_BOTH_AND_ATTACH))
+		goto out;
+
+	if (send_udp_packets(xmit_policy != BOND_XMIT_POLICY_LAYER34))
+		goto out;
+
+	bond1_rx = get_rx_packets("bond1");
+	ASSERT_EQ(bond1_rx, NPACKETS, "expected more received packets");
+
+	switch (mode) {
+	case BOND_MODE_ROUNDROBIN:
+	case BOND_MODE_XOR: {
+		int veth1_rx = get_rx_packets("veth1_1");
+		int veth2_rx = get_rx_packets("veth1_2");
+		int diff = abs(veth1_rx - veth2_rx);
+
+		ASSERT_GE(veth1_rx + veth2_rx, NPACKETS, "expected more packets");
+
+		switch (xmit_policy) {
+		case BOND_XMIT_POLICY_LAYER2:
+			ASSERT_GE(diff, NPACKETS,
+				  "expected packets on only one of the interfaces");
+			break;
+		case BOND_XMIT_POLICY_LAYER23:
+		case BOND_XMIT_POLICY_LAYER34:
+			ASSERT_LT(diff, NPACKETS/2,
+				  "expected even distribution of packets");
+			break;
+		default:
+			PRINT_FAIL("Unimplemented xmit_policy=%d\n", xmit_policy);
+			break;
+		}
+		break;
+	}
+	case BOND_MODE_ACTIVEBACKUP: {
+		int veth1_rx = get_rx_packets("veth1_1");
+		int veth2_rx = get_rx_packets("veth1_2");
+		int diff = abs(veth1_rx - veth2_rx);
+
+		ASSERT_GE(diff, NPACKETS,
+			  "expected packets on only one of the interfaces");
+		break;
+	}
+	default:
+		PRINT_FAIL("Unimplemented xmit_policy=%d\n", xmit_policy);
+		break;
+	}
+
+out:
+	bonding_cleanup(skeletons);
+}
+
+
+/* Test the broadcast redirection using xdp_redirect_map_multi_prog and adding
+ * all the interfaces to it and checking that broadcasting won't send the packet
+ * to neither the ingress bond device (bond2) or its slave (veth2_1).
+ */
+void test_xdp_bonding_redirect_multi(struct skeletons *skeletons)
+{
+	static const char * const ifaces[] = {"bond2", "veth2_1", "veth2_2"};
+	int veth1_1_rx, veth1_2_rx;
+	int err;
+
+	if (!test__start_subtest("xdp_bonding_redirect_multi"))
+		return;
+
+	if (bonding_setup(skeletons, BOND_MODE_ROUNDROBIN, BOND_XMIT_POLICY_LAYER23,
+			  BOND_ONE_NO_ATTACH))
+		goto out;
+
+
+	if (!ASSERT_OK(setns_by_name("ns_dst"), "could not set netns to ns_dst"))
+		goto out;
+
+	/* populate the devmap with the relevant interfaces */
+	for (int i = 0; i < ARRAY_SIZE(ifaces); i++) {
+		int ifindex = if_nametoindex(ifaces[i]);
+		int map_fd = bpf_map__fd(skeletons->xdp_redirect_multi_kern->maps.map_all);
+
+		if (!ASSERT_GT(ifindex, 0, "could not get interface index"))
+			goto out;
+
+		err = bpf_map_update_elem(map_fd, &ifindex, &ifindex, 0);
+		if (!ASSERT_OK(err, "add interface to map_all"))
+			goto out;
+	}
+
+	if (xdp_attach(skeletons,
+		       skeletons->xdp_redirect_multi_kern->progs.xdp_redirect_map_multi_prog,
+		       "bond2"))
+		goto out;
+
+	restore_root_netns();
+
+	if (send_udp_packets(BOND_MODE_ROUNDROBIN))
+		goto out;
+
+	veth1_1_rx = get_rx_packets("veth1_1");
+	veth1_2_rx = get_rx_packets("veth1_2");
+
+	ASSERT_EQ(veth1_1_rx, 0, "expected no packets on veth1_1");
+	ASSERT_GE(veth1_2_rx, NPACKETS, "expected packets on veth1_2");
+
+out:
+	restore_root_netns();
+	bonding_cleanup(skeletons);
+}
+
+/* Test that XDP programs cannot be attached to both the bond master and slaves simultaneously */
+void test_xdp_bonding_attach(struct skeletons *skeletons)
+{
+	struct bpf_link *link = NULL;
+	struct bpf_link *link2 = NULL;
+	int veth, bond;
+	int err;
+
+	if (!test__start_subtest("xdp_bonding_attach"))
+		return;
+
+	if (!ASSERT_OK(system("ip link add veth type veth"), "add veth"))
+		goto out;
+	if (!ASSERT_OK(system("ip link add bond type bond"), "add bond"))
+		goto out;
+
+	veth = if_nametoindex("veth");
+	if (!ASSERT_GE(veth, 0, "if_nametoindex veth"))
+		goto out;
+	bond = if_nametoindex("bond");
+	if (!ASSERT_GE(bond, 0, "if_nametoindex bond"))
+		goto out;
+
+	/* enslaving with a XDP program loaded fails */
+	link = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, veth);
+	if (!ASSERT_OK_PTR(link, "attach program to veth"))
+		goto out;
+
+	err = system("ip link set veth master bond");
+	if (!ASSERT_NEQ(err, 0, "attaching slave with xdp program expected to fail"))
+		goto out;
+
+	bpf_link__detach(link);
+	link = NULL;
+
+	err = system("ip link set veth master bond");
+	if (!ASSERT_OK(err, "set veth master"))
+		goto out;
+
+	/* attaching to slave when master has no program is allowed */
+	link = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, veth);
+	if (!ASSERT_OK_PTR(link, "attach program to slave when enslaved"))
+		goto out;
+
+	/* attaching to master not allowed when slave has program loaded */
+	link2 = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, bond);
+	if (!ASSERT_ERR_PTR(link2, "attach program to master when slave has program"))
+		goto out;
+
+	bpf_link__detach(link);
+	link = NULL;
+
+	/* attaching XDP program to master allowed when slave has no program */
+	link = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, bond);
+	if (!ASSERT_OK_PTR(link, "attach program to master"))
+		goto out;
+
+	/* attaching to slave not allowed when master has program loaded */
+	link2 = bpf_program__attach_xdp(skeletons->xdp_dummy->progs.xdp_dummy_prog, bond);
+	ASSERT_ERR_PTR(link2, "attach program to slave when master has program");
+
+out:
+	if (link)
+		bpf_link__detach(link);
+	if (link2)
+		bpf_link__detach(link2);
+
+	system("ip link del veth");
+	system("ip link del bond");
+}
+
+static int libbpf_debug_print(enum libbpf_print_level level,
+			      const char *format, va_list args)
+{
+	if (level != LIBBPF_WARN)
+		vprintf(format, args);
+	return 0;
+}
+
+struct bond_test_case {
+	char *name;
+	int mode;
+	int xmit_policy;
+};
+
+static	struct bond_test_case bond_test_cases[] = {
+	{ "xdp_bonding_roundrobin", BOND_MODE_ROUNDROBIN, BOND_XMIT_POLICY_LAYER23, },
+	{ "xdp_bonding_activebackup", BOND_MODE_ACTIVEBACKUP, BOND_XMIT_POLICY_LAYER23 },
+
+	{ "xdp_bonding_xor_layer2", BOND_MODE_XOR, BOND_XMIT_POLICY_LAYER2, },
+	{ "xdp_bonding_xor_layer23", BOND_MODE_XOR, BOND_XMIT_POLICY_LAYER23, },
+	{ "xdp_bonding_xor_layer34", BOND_MODE_XOR, BOND_XMIT_POLICY_LAYER34, },
+};
+
+void test_xdp_bonding(void)
+{
+	libbpf_print_fn_t old_print_fn;
+	struct skeletons skeletons = {};
+	int i;
+
+	old_print_fn = libbpf_set_print(libbpf_debug_print);
+
+	root_netns_fd = open("/proc/self/ns/net", O_RDONLY);
+	if (!ASSERT_GE(root_netns_fd, 0, "open /proc/self/ns/net"))
+		goto out;
+
+	skeletons.xdp_dummy = xdp_dummy__open_and_load();
+	if (!ASSERT_OK_PTR(skeletons.xdp_dummy, "xdp_dummy__open_and_load"))
+		goto out;
+
+	skeletons.xdp_tx = xdp_tx__open_and_load();
+	if (!ASSERT_OK_PTR(skeletons.xdp_tx, "xdp_tx__open_and_load"))
+		goto out;
+
+	skeletons.xdp_redirect_multi_kern = xdp_redirect_multi_kern__open_and_load();
+	if (!ASSERT_OK_PTR(skeletons.xdp_redirect_multi_kern,
+			   "xdp_redirect_multi_kern__open_and_load"))
+		goto out;
+
+	test_xdp_bonding_attach(&skeletons);
+
+	for (i = 0; i < ARRAY_SIZE(bond_test_cases); i++) {
+		struct bond_test_case *test_case = &bond_test_cases[i];
+
+		test_xdp_bonding_with_mode(
+			&skeletons,
+			test_case->name,
+			test_case->mode,
+			test_case->xmit_policy);
+	}
+
+	test_xdp_bonding_redirect_multi(&skeletons);
+
+out:
+	if (skeletons.xdp_dummy)
+		xdp_dummy__destroy(skeletons.xdp_dummy);
+	if (skeletons.xdp_tx)
+		xdp_tx__destroy(skeletons.xdp_tx);
+	if (skeletons.xdp_redirect_multi_kern)
+		xdp_redirect_multi_kern__destroy(skeletons.xdp_redirect_multi_kern);
+
+	libbpf_set_print(old_print_fn);
+	if (root_netns_fd)
+		close(root_netns_fd);
+}