From patchwork Wed Nov 18 22:30:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 327812 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD433C71156 for ; Wed, 18 Nov 2020 22:30:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8379A246DE for ; Wed, 18 Nov 2020 22:30:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727147AbgKRWaf (ORCPT ); Wed, 18 Nov 2020 17:30:35 -0500 Received: from correo.us.es ([193.147.175.20]:51152 "EHLO mail.us.es" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726163AbgKRWaV (ORCPT ); Wed, 18 Nov 2020 17:30:21 -0500 Received: from antivirus1-rhel7.int (unknown [192.168.2.11]) by mail.us.es (Postfix) with ESMTP id AC23618FD84 for ; Wed, 18 Nov 2020 23:30:19 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 9ADDFDA8FB for ; Wed, 18 Nov 2020 23:30:19 +0100 (CET) Received: by antivirus1-rhel7.int (Postfix, from userid 99) id 999D5DA8F7; Wed, 18 Nov 2020 23:30:19 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 4B9C2DA902; Wed, 18 Nov 2020 23:30:17 +0100 (CET) Received: from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int (F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); Wed, 18 Nov 2020 23:30:17 +0100 (CET) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int) Received: from localhost.localdomain (unknown [90.77.255.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: pneira@us.es) by entrada.int (Postfix) with ESMTPSA id 0753D42514A8; Wed, 18 Nov 2020 23:30:16 +0100 (CET) X-SMTPAUTHUS: auth mail.us.es From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, razor@blackwall.org, tobias@waldekranz.com, jeremy@azazel.net Subject: [PATCH net-next,v4 2/9] netfilter: flowtable: add xmit path types Date: Wed, 18 Nov 2020 23:30:04 +0100 Message-Id: <20201118223011.3216-3-pablo@netfilter.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201118223011.3216-1-pablo@netfilter.org> References: <20201118223011.3216-1-pablo@netfilter.org> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add the xmit_type field that defines the two supported xmit paths in the flowtable data plane, which are the neighbour and the xfrm xmit paths. This patch prepares for new flowtable xmit path types to come. Signed-off-by: Pablo Neira Ayuso --- v4: no changes include/net/netfilter/nf_flow_table.h | 11 +++++++-- net/netfilter/nf_flow_table_core.c | 1 + net/netfilter/nf_flow_table_ip.c | 32 ++++++++++++++++++--------- net/netfilter/nft_flow_offload.c | 20 +++++++++++++++-- 4 files changed, 50 insertions(+), 14 deletions(-) diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h index 54c4d5c908a5..7d477be06913 100644 --- a/include/net/netfilter/nf_flow_table.h +++ b/include/net/netfilter/nf_flow_table.h @@ -89,6 +89,11 @@ enum flow_offload_tuple_dir { FLOW_OFFLOAD_DIR_MAX = IP_CT_DIR_MAX }; +enum flow_offload_xmit_type { + FLOW_OFFLOAD_XMIT_NEIGH = 0, + FLOW_OFFLOAD_XMIT_XFRM, +}; + struct flow_offload_tuple { union { struct in_addr src_v4; @@ -111,7 +116,8 @@ struct flow_offload_tuple { /* All members above are keys for lookups, see flow_offload_hash(). */ struct { } __hash; - u8 dir; + u8 dir:6, + xmit_type:2; u16 mtu; @@ -158,7 +164,8 @@ static inline __s32 nf_flow_timeout_delta(unsigned int timeout) struct nf_flow_route { struct { - struct dst_entry *dst; + struct dst_entry *dst; + enum flow_offload_xmit_type xmit_type; } tuple[FLOW_OFFLOAD_DIR_MAX]; }; diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c index 55fca71ace26..57dd8e40e474 100644 --- a/net/netfilter/nf_flow_table_core.c +++ b/net/netfilter/nf_flow_table_core.c @@ -95,6 +95,7 @@ static int flow_offload_fill_route(struct flow_offload *flow, } flow_tuple->iifidx = other_dst->dev->ifindex; + flow_tuple->xmit_type = route->tuple[dir].xmit_type; flow_tuple->dst_cache = dst; return 0; diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c index a698dbe28ef5..03314cc06e04 100644 --- a/net/netfilter/nf_flow_table_ip.c +++ b/net/netfilter/nf_flow_table_ip.c @@ -220,10 +220,20 @@ static bool nf_flow_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu) return true; } -static int nf_flow_offload_dst_check(struct dst_entry *dst) +static inline struct dst_entry * +nft_flow_dst(struct flow_offload_tuple_rhash *tuplehash) { - if (unlikely(dst_xfrm(dst))) + return tuplehash->tuple.dst_cache; +} + +static int nf_flow_offload_dst_check(struct flow_offload_tuple_rhash *tuplehash) +{ + struct dst_entry *dst; + + if (unlikely(tuplehash->tuple.xmit_type == FLOW_OFFLOAD_XMIT_XFRM)) { + dst = nft_flow_dst(tuplehash); return dst_check(dst, 0) ? 0 : -1; + } return 0; } @@ -265,8 +275,6 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, dir = tuplehash->tuple.dir; flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]); - rt = (struct rtable *)flow->tuplehash[dir].tuple.dst_cache; - outdev = rt->dst.dev; if (unlikely(nf_flow_exceeds_mtu(skb, flow->tuplehash[dir].tuple.mtu))) return NF_ACCEPT; @@ -280,7 +288,7 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, flow_offload_refresh(flow_table, flow); - if (nf_flow_offload_dst_check(&rt->dst)) { + if (nf_flow_offload_dst_check(tuplehash)) { flow_offload_teardown(flow); return NF_ACCEPT; } @@ -295,13 +303,16 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, if (flow_table->flags & NF_FLOWTABLE_COUNTER) nf_ct_acct_update(flow->ct, tuplehash->tuple.dir, skb->len); - if (unlikely(dst_xfrm(&rt->dst))) { + rt = (struct rtable *)tuplehash->tuple.dst_cache; + + if (unlikely(tuplehash->tuple.xmit_type == FLOW_OFFLOAD_XMIT_XFRM)) { memset(skb->cb, 0, sizeof(struct inet_skb_parm)); IPCB(skb)->iif = skb->dev->ifindex; IPCB(skb)->flags = IPSKB_FORWARDED; return nf_flow_xmit_xfrm(skb, state, &rt->dst); } + outdev = rt->dst.dev; skb->dev = outdev; nexthop = rt_nexthop(rt, flow->tuplehash[!dir].tuple.src_v4.s_addr); skb_dst_set_noref(skb, &rt->dst); @@ -506,8 +517,6 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, dir = tuplehash->tuple.dir; flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]); - rt = (struct rt6_info *)flow->tuplehash[dir].tuple.dst_cache; - outdev = rt->dst.dev; if (unlikely(nf_flow_exceeds_mtu(skb, flow->tuplehash[dir].tuple.mtu))) return NF_ACCEPT; @@ -518,7 +527,7 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, flow_offload_refresh(flow_table, flow); - if (nf_flow_offload_dst_check(&rt->dst)) { + if (nf_flow_offload_dst_check(tuplehash)) { flow_offload_teardown(flow); return NF_ACCEPT; } @@ -536,13 +545,16 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, if (flow_table->flags & NF_FLOWTABLE_COUNTER) nf_ct_acct_update(flow->ct, tuplehash->tuple.dir, skb->len); - if (unlikely(dst_xfrm(&rt->dst))) { + rt = (struct rt6_info *)tuplehash->tuple.dst_cache; + + if (unlikely(tuplehash->tuple.xmit_type == FLOW_OFFLOAD_XMIT_NEIGH)) { memset(skb->cb, 0, sizeof(struct inet6_skb_parm)); IP6CB(skb)->iif = skb->dev->ifindex; IP6CB(skb)->flags = IP6SKB_FORWARDED; return nf_flow_xmit_xfrm(skb, state, &rt->dst); } + outdev = rt->dst.dev; skb->dev = outdev; nexthop = rt6_nexthop(rt, &flow->tuplehash[!dir].tuple.src_v6); skb_dst_set_noref(skb, &rt->dst); diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c index 3a6c84fb2c90..1da2bb24f6c0 100644 --- a/net/netfilter/nft_flow_offload.c +++ b/net/netfilter/nft_flow_offload.c @@ -19,6 +19,22 @@ struct nft_flow_offload { struct nft_flowtable *flowtable; }; +static enum flow_offload_xmit_type nft_xmit_type(struct dst_entry *dst) +{ + if (dst_xfrm(dst)) + return FLOW_OFFLOAD_XMIT_XFRM; + + return FLOW_OFFLOAD_XMIT_NEIGH; +} + +static void nft_default_forward_path(struct nf_flow_route *route, + struct dst_entry *dst_cache, + enum ip_conntrack_dir dir) +{ + route->tuple[dir].dst = dst_cache; + route->tuple[dir].xmit_type = nft_xmit_type(dst_cache); +} + static int nft_flow_route(const struct nft_pktinfo *pkt, const struct nf_conn *ct, struct nf_flow_route *route, @@ -44,8 +60,8 @@ static int nft_flow_route(const struct nft_pktinfo *pkt, if (!other_dst) return -ENOENT; - route->tuple[dir].dst = this_dst; - route->tuple[!dir].dst = other_dst; + nft_default_forward_path(route, this_dst, dir); + nft_default_forward_path(route, other_dst, !dir); return 0; } From patchwork Wed Nov 18 22:30:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 327816 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21D4FC63697 for ; Wed, 18 Nov 2020 22:30:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C1415246DE for ; Wed, 18 Nov 2020 22:30:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726837AbgKRWaX (ORCPT ); Wed, 18 Nov 2020 17:30:23 -0500 Received: from correo.us.es ([193.147.175.20]:51180 "EHLO mail.us.es" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726304AbgKRWaW (ORCPT ); Wed, 18 Nov 2020 17:30:22 -0500 Received: from antivirus1-rhel7.int (unknown [192.168.2.11]) by mail.us.es (Postfix) with ESMTP id 6C5E518FDA0 for ; Wed, 18 Nov 2020 23:30:20 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 5DEC5DA730 for ; Wed, 18 Nov 2020 23:30:20 +0100 (CET) Received: by antivirus1-rhel7.int (Postfix, from userid 99) id 5CA76DA72F; Wed, 18 Nov 2020 23:30:20 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 0AECCDA8FE; Wed, 18 Nov 2020 23:30:18 +0100 (CET) Received: from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int (F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); Wed, 18 Nov 2020 23:30:18 +0100 (CET) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int) Received: from localhost.localdomain (unknown [90.77.255.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: pneira@us.es) by entrada.int (Postfix) with ESMTPSA id BA6FB42514A8; Wed, 18 Nov 2020 23:30:17 +0100 (CET) X-SMTPAUTHUS: auth mail.us.es From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, razor@blackwall.org, tobias@waldekranz.com, jeremy@azazel.net Subject: [PATCH net-next, v4 3/9] net: resolve forwarding path from virtual netdevice and HW destination address Date: Wed, 18 Nov 2020 23:30:05 +0100 Message-Id: <20201118223011.3216-4-pablo@netfilter.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201118223011.3216-1-pablo@netfilter.org> References: <20201118223011.3216-1-pablo@netfilter.org> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This patch adds dev_fill_forward_path() which resolves the path to reach the real netdevice from the IP forwarding side. This function takes as input the netdevice and the destination hardware address and it walks down the devices calling .ndo_fill_forward_path() for each device until the real device is found. For instance, assuming the following topology: IP forwarding / \ br0 eth0 / \ eth1 eth2 . . . ethX ab:cd:ef:ab:cd:ef where eth1 and eth2 are bridge ports and eth0 provides WAN connectivity. ethX is the interface in another box which is connected to the eth1 bridge port. For packets going through IP forwarding to br0 whose destination MAC address is ab:cd:ef:ab:cd:ef, dev_fill_forward_path() provides the following path: br0 -> eth1 .ndo_fill_forward_path for br0 looks up at the FDB for the bridge port from the destination MAC address to get the bridge port eth1. This information allows to create a fast path that bypasses the classic bridge and IP forwarding paths, so packets go directly from the bridge port eth1 to eth0 (wan interface) and vice versa. fast path .------------------------. / \ | IP forwarding | | / \ \/ | br0 eth0 . / \ -> eth1 eth2 . . . ethX ab:cd:ef:ab:cd:ef Signed-off-by: Pablo Neira Ayuso --- v4: - Check for possible device stack overflow in dev_fill_forward_path(), per Jakub Kicinski. - Remove memset for the net_device_path_ctx structure in dev_fill_forward_path() and avoid full memset of struct net_device_path_stack. include/linux/netdevice.h | 27 +++++++++++++++++++++++++++ net/core/dev.c | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 63 insertions(+) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 03433a4c929e..ef4fc0eefee0 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -833,6 +833,27 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev, struct sk_buff *skb, struct net_device *sb_dev); +enum net_device_path_type { + DEV_PATH_ETHERNET = 0, +}; + +struct net_device_path { + enum net_device_path_type type; + const struct net_device *dev; +}; + +#define NET_DEVICE_PATH_STACK_MAX 5 + +struct net_device_path_stack { + int num_paths; + struct net_device_path path[NET_DEVICE_PATH_STACK_MAX]; +}; + +struct net_device_path_ctx { + const struct net_device *dev; + const u8 *daddr; +}; + enum tc_setup_type { TC_SETUP_QDISC_MQPRIO, TC_SETUP_CLSU32, @@ -1279,6 +1300,8 @@ struct netdev_net_notifier { * struct net_device *(*ndo_get_peer_dev)(struct net_device *dev); * If a device is paired with a peer device, return the peer instance. * The caller must be under RCU read context. + * int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path); + * Get the forwarding path to reach the real device from the HW destination address */ struct net_device_ops { int (*ndo_init)(struct net_device *dev); @@ -1487,6 +1510,8 @@ struct net_device_ops { int (*ndo_tunnel_ctl)(struct net_device *dev, struct ip_tunnel_parm *p, int cmd); struct net_device * (*ndo_get_peer_dev)(struct net_device *dev); + int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, + struct net_device_path *path); }; /** @@ -2824,6 +2849,8 @@ void dev_remove_offload(struct packet_offload *po); int dev_get_iflink(const struct net_device *dev); int dev_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb); +int dev_fill_forward_path(const struct net_device *dev, const u8 *daddr, + struct net_device_path_stack *stack); struct net_device *__dev_get_by_flags(struct net *net, unsigned short flags, unsigned short mask); struct net_device *dev_get_by_name(struct net *net, const char *name); diff --git a/net/core/dev.c b/net/core/dev.c index 4bfdcd6b20e8..d65bab678485 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -846,6 +846,42 @@ int dev_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb) } EXPORT_SYMBOL_GPL(dev_fill_metadata_dst); +int dev_fill_forward_path(const struct net_device *dev, const u8 *daddr, + struct net_device_path_stack *stack) +{ + const struct net_device *last_dev; + struct net_device_path_ctx ctx = { + .dev = dev, + .daddr = daddr, + }; + struct net_device_path *path; + int ret = 0, k; + + stack->num_paths = 0; + while (ctx.dev && ctx.dev->netdev_ops->ndo_fill_forward_path) { + last_dev = ctx.dev; + k = stack->num_paths++; + if (WARN_ON_ONCE(k >= NET_DEVICE_PATH_STACK_MAX)) + return -1; + + path = &stack->path[k]; + memset(path, 0, sizeof(struct net_device_path)); + + ret = ctx.dev->netdev_ops->ndo_fill_forward_path(&ctx, path); + if (ret < 0) + return -1; + + if (WARN_ON_ONCE(last_dev == ctx.dev)) + return -1; + } + path = &stack->path[stack->num_paths++]; + path->type = DEV_PATH_ETHERNET; + path->dev = ctx.dev; + + return ret; +} +EXPORT_SYMBOL_GPL(dev_fill_forward_path); + /** * __dev_get_by_name - find a device by its name * @net: the applicable net namespace From patchwork Wed Nov 18 22:30:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 327815 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9922FC63798 for ; Wed, 18 Nov 2020 22:30:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5628D246DE for ; Wed, 18 Nov 2020 22:30:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726924AbgKRWaZ (ORCPT ); Wed, 18 Nov 2020 17:30:25 -0500 Received: from correo.us.es ([193.147.175.20]:51210 "EHLO mail.us.es" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726756AbgKRWaX (ORCPT ); Wed, 18 Nov 2020 17:30:23 -0500 Received: from antivirus1-rhel7.int (unknown [192.168.2.11]) by mail.us.es (Postfix) with ESMTP id C800518FDA5 for ; Wed, 18 Nov 2020 23:30:21 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id B97C6DA901 for ; Wed, 18 Nov 2020 23:30:21 +0100 (CET) Received: by antivirus1-rhel7.int (Postfix, from userid 99) id AE306DA8F7; Wed, 18 Nov 2020 23:30:21 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 86E33DA8F3; Wed, 18 Nov 2020 23:30:19 +0100 (CET) Received: from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int (F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); Wed, 18 Nov 2020 23:30:19 +0100 (CET) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int) Received: from localhost.localdomain (unknown [90.77.255.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: pneira@us.es) by entrada.int (Postfix) with ESMTPSA id 432BF42514A8; Wed, 18 Nov 2020 23:30:19 +0100 (CET) X-SMTPAUTHUS: auth mail.us.es From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, razor@blackwall.org, tobias@waldekranz.com, jeremy@azazel.net Subject: [PATCH net-next, v4 5/9] bridge: resolve forwarding path for bridge devices Date: Wed, 18 Nov 2020 23:30:07 +0100 Message-Id: <20201118223011.3216-6-pablo@netfilter.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201118223011.3216-1-pablo@netfilter.org> References: <20201118223011.3216-1-pablo@netfilter.org> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add .ndo_fill_forward_path for bridge devices. Signed-off-by: Pablo Neira Ayuso --- v4: - Check for valid bridge dst pointer after READ_ONCE, per Nikolay Aleksandrov. include/linux/netdevice.h | 1 + net/bridge/br_device.c | 27 +++++++++++++++++++++++++++ 2 files changed, 28 insertions(+) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index e9690e1a6559..281551c70536 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -836,6 +836,7 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev, enum net_device_path_type { DEV_PATH_ETHERNET = 0, DEV_PATH_VLAN, + DEV_PATH_BRIDGE, }; struct net_device_path { diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c index 387403931a63..ce03d82e384a 100644 --- a/net/bridge/br_device.c +++ b/net/bridge/br_device.c @@ -391,6 +391,32 @@ static int br_del_slave(struct net_device *dev, struct net_device *slave_dev) return br_del_if(br, slave_dev); } +static int br_fill_forward_path(struct net_device_path_ctx *ctx, + struct net_device_path *path) +{ + struct net_bridge_fdb_entry *f; + struct net_bridge_port *dst; + struct net_bridge *br; + + if (netif_is_bridge_port(ctx->dev)) + return -1; + + br = netdev_priv(ctx->dev); + f = br_fdb_find_rcu(br, ctx->daddr, 0); + if (!f || !f->dst) + return -1; + + dst = READ_ONCE(f->dst); + if (!dst) + return -1; + + path->type = DEV_PATH_BRIDGE; + path->dev = dst->br->dev; + ctx->dev = dst->dev; + + return 0; +} + static const struct ethtool_ops br_ethtool_ops = { .get_drvinfo = br_getinfo, .get_link = ethtool_op_get_link, @@ -425,6 +451,7 @@ static const struct net_device_ops br_netdev_ops = { .ndo_bridge_setlink = br_setlink, .ndo_bridge_dellink = br_dellink, .ndo_features_check = passthru_features_check, + .ndo_fill_forward_path = br_fill_forward_path, }; static struct device_type br_type = { From patchwork Wed Nov 18 22:30:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 327813 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 282BAC64EBC for ; Wed, 18 Nov 2020 22:30:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id ECF6E246DE for ; Wed, 18 Nov 2020 22:30:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727078AbgKRWaa (ORCPT ); Wed, 18 Nov 2020 17:30:30 -0500 Received: from correo.us.es ([193.147.175.20]:51238 "EHLO mail.us.es" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726304AbgKRWa0 (ORCPT ); Wed, 18 Nov 2020 17:30:26 -0500 Received: from antivirus1-rhel7.int (unknown [192.168.2.11]) by mail.us.es (Postfix) with ESMTP id 6BDD018FD89 for ; Wed, 18 Nov 2020 23:30:23 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 5C3DFDA722 for ; Wed, 18 Nov 2020 23:30:23 +0100 (CET) Received: by antivirus1-rhel7.int (Postfix, from userid 99) id 5191BDA8FE; Wed, 18 Nov 2020 23:30:23 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id D4A7CDA722; Wed, 18 Nov 2020 23:30:20 +0100 (CET) Received: from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int (F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); Wed, 18 Nov 2020 23:30:20 +0100 (CET) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int) Received: from localhost.localdomain (unknown [90.77.255.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: pneira@us.es) by entrada.int (Postfix) with ESMTPSA id 90B0642514A8; Wed, 18 Nov 2020 23:30:20 +0100 (CET) X-SMTPAUTHUS: auth mail.us.es From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, razor@blackwall.org, tobias@waldekranz.com, jeremy@azazel.net Subject: [PATCH net-next, v4 7/9] netfilter: flowtable: use dev_fill_forward_path() to obtain egress device Date: Wed, 18 Nov 2020 23:30:09 +0100 Message-Id: <20201118223011.3216-8-pablo@netfilter.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201118223011.3216-1-pablo@netfilter.org> References: <20201118223011.3216-1-pablo@netfilter.org> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The egress device in the tuple is obtained from route. Use dev_fill_forward_path() instead to provide the real egress device for this flow whenever this is available. The new FLOW_OFFLOAD_XMIT_DIRECT type uses dev_queue_xmit() to transmit ethernet frames. Cache the source and destination hardware address to use dev_queue_xmit() to transfer packets. The FLOW_OFFLOAD_XMIT_DIRECT replaces FLOW_OFFLOAD_XMIT_NEIGH if dev_fill_forward_path() finds a direct transmit path. Signed-off-by: Pablo Neira Ayuso --- v4: no changes. include/net/netfilter/nf_flow_table.h | 16 +++++- net/netfilter/nf_flow_table_core.c | 35 ++++++++++--- net/netfilter/nf_flow_table_ip.c | 72 +++++++++++++++++++++------ net/netfilter/nft_flow_offload.c | 25 ++++++++-- 4 files changed, 118 insertions(+), 30 deletions(-) diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h index 963f99fb1c06..83110e4705c0 100644 --- a/include/net/netfilter/nf_flow_table.h +++ b/include/net/netfilter/nf_flow_table.h @@ -92,6 +92,7 @@ enum flow_offload_tuple_dir { enum flow_offload_xmit_type { FLOW_OFFLOAD_XMIT_NEIGH = 0, FLOW_OFFLOAD_XMIT_XFRM, + FLOW_OFFLOAD_XMIT_DIRECT, }; struct flow_offload_tuple { @@ -120,8 +121,14 @@ struct flow_offload_tuple { xmit_type:2; u16 mtu; - - struct dst_entry *dst_cache; + union { + struct dst_entry *dst_cache; + struct { + u32 ifidx; + u8 h_source[ETH_ALEN]; + u8 h_dest[ETH_ALEN]; + } out; + }; }; struct flow_offload_tuple_rhash { @@ -168,6 +175,11 @@ struct nf_flow_route { struct { u32 ifindex; } in; + struct { + u32 ifindex; + u8 h_source[ETH_ALEN]; + u8 h_dest[ETH_ALEN]; + } out; enum flow_offload_xmit_type xmit_type; } tuple[FLOW_OFFLOAD_DIR_MAX]; }; diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c index 27b4315d7b96..751bc1c27c16 100644 --- a/net/netfilter/nf_flow_table_core.c +++ b/net/netfilter/nf_flow_table_core.c @@ -81,9 +81,6 @@ static int flow_offload_fill_route(struct flow_offload *flow, struct flow_offload_tuple *flow_tuple = &flow->tuplehash[dir].tuple; struct dst_entry *dst = route->tuple[dir].dst; - if (!dst_hold_safe(route->tuple[dir].dst)) - return -1; - switch (flow_tuple->l3proto) { case NFPROTO_IPV4: flow_tuple->mtu = ip_dst_mtu_maybe_forward(dst, true); @@ -94,12 +91,36 @@ static int flow_offload_fill_route(struct flow_offload *flow, } flow_tuple->iifidx = route->tuple[dir].in.ifindex; + + switch (route->tuple[dir].xmit_type) { + case FLOW_OFFLOAD_XMIT_DIRECT: + memcpy(flow_tuple->out.h_dest, route->tuple[dir].out.h_dest, + ETH_ALEN); + memcpy(flow_tuple->out.h_source, route->tuple[dir].out.h_source, + ETH_ALEN); + flow_tuple->out.ifidx = route->tuple[dir].out.ifindex; + break; + case FLOW_OFFLOAD_XMIT_XFRM: + case FLOW_OFFLOAD_XMIT_NEIGH: + if (!dst_hold_safe(route->tuple[dir].dst)) + return -1; + + flow_tuple->dst_cache = dst; + break; + } flow_tuple->xmit_type = route->tuple[dir].xmit_type; - flow_tuple->dst_cache = dst; return 0; } +static void nft_flow_dst_release(struct flow_offload *flow, + enum flow_offload_tuple_dir dir) +{ + if (flow->tuplehash[dir].tuple.xmit_type == FLOW_OFFLOAD_XMIT_NEIGH || + flow->tuplehash[dir].tuple.xmit_type == FLOW_OFFLOAD_XMIT_XFRM) + dst_release(flow->tuplehash[dir].tuple.dst_cache); +} + int flow_offload_route_init(struct flow_offload *flow, const struct nf_flow_route *route) { @@ -118,7 +139,7 @@ int flow_offload_route_init(struct flow_offload *flow, return 0; err_route_reply: - dst_release(route->tuple[FLOW_OFFLOAD_DIR_ORIGINAL].dst); + nft_flow_dst_release(flow, FLOW_OFFLOAD_DIR_ORIGINAL); return err; } @@ -169,8 +190,8 @@ static void flow_offload_fixup_ct(struct nf_conn *ct) static void flow_offload_route_release(struct flow_offload *flow) { - dst_release(flow->tuplehash[FLOW_OFFLOAD_DIR_ORIGINAL].tuple.dst_cache); - dst_release(flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].tuple.dst_cache); + nft_flow_dst_release(flow, FLOW_OFFLOAD_DIR_ORIGINAL); + nft_flow_dst_release(flow, FLOW_OFFLOAD_DIR_REPLY); } void flow_offload_free(struct flow_offload *flow) diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c index 03314cc06e04..d10ba09a41b5 100644 --- a/net/netfilter/nf_flow_table_ip.c +++ b/net/netfilter/nf_flow_table_ip.c @@ -248,6 +248,24 @@ static unsigned int nf_flow_xmit_xfrm(struct sk_buff *skb, return NF_STOLEN; } +static unsigned int nf_flow_queue_xmit(struct net *net, struct sk_buff *skb, + const struct flow_offload_tuple_rhash *tuplehash, + unsigned short type) +{ + struct net_device *outdev; + + outdev = dev_get_by_index_rcu(net, tuplehash->tuple.out.ifidx); + if (!outdev) + return NF_DROP; + + skb->dev = outdev; + dev_hard_header(skb, skb->dev, type, tuplehash->tuple.out.h_dest, + tuplehash->tuple.out.h_source, skb->len); + dev_queue_xmit(skb); + + return NF_STOLEN; +} + unsigned int nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *state) @@ -262,6 +280,7 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, unsigned int thoff; struct iphdr *iph; __be32 nexthop; + int ret; if (skb->protocol != htons(ETH_P_IP)) return NF_ACCEPT; @@ -303,22 +322,32 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, if (flow_table->flags & NF_FLOWTABLE_COUNTER) nf_ct_acct_update(flow->ct, tuplehash->tuple.dir, skb->len); - rt = (struct rtable *)tuplehash->tuple.dst_cache; - if (unlikely(tuplehash->tuple.xmit_type == FLOW_OFFLOAD_XMIT_XFRM)) { + rt = (struct rtable *)tuplehash->tuple.dst_cache; memset(skb->cb, 0, sizeof(struct inet_skb_parm)); IPCB(skb)->iif = skb->dev->ifindex; IPCB(skb)->flags = IPSKB_FORWARDED; return nf_flow_xmit_xfrm(skb, state, &rt->dst); } - outdev = rt->dst.dev; - skb->dev = outdev; - nexthop = rt_nexthop(rt, flow->tuplehash[!dir].tuple.src_v4.s_addr); - skb_dst_set_noref(skb, &rt->dst); - neigh_xmit(NEIGH_ARP_TABLE, outdev, &nexthop, skb); + switch (tuplehash->tuple.xmit_type) { + case FLOW_OFFLOAD_XMIT_NEIGH: + rt = (struct rtable *)tuplehash->tuple.dst_cache; + outdev = rt->dst.dev; + skb->dev = outdev; + nexthop = rt_nexthop(rt, flow->tuplehash[!dir].tuple.src_v4.s_addr); + skb_dst_set_noref(skb, &rt->dst); + neigh_xmit(NEIGH_ARP_TABLE, outdev, &nexthop, skb); + ret = NF_STOLEN; + break; + case FLOW_OFFLOAD_XMIT_DIRECT: + ret = nf_flow_queue_xmit(state->net, skb, tuplehash, ETH_P_IP); + if (ret == NF_DROP) + flow_offload_teardown(flow); + break; + } - return NF_STOLEN; + return ret; } EXPORT_SYMBOL_GPL(nf_flow_offload_ip_hook); @@ -504,6 +533,7 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, struct net_device *outdev; struct ipv6hdr *ip6h; struct rt6_info *rt; + int ret; if (skb->protocol != htons(ETH_P_IPV6)) return NF_ACCEPT; @@ -545,21 +575,31 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, if (flow_table->flags & NF_FLOWTABLE_COUNTER) nf_ct_acct_update(flow->ct, tuplehash->tuple.dir, skb->len); - rt = (struct rt6_info *)tuplehash->tuple.dst_cache; - if (unlikely(tuplehash->tuple.xmit_type == FLOW_OFFLOAD_XMIT_NEIGH)) { + rt = (struct rt6_info *)tuplehash->tuple.dst_cache; memset(skb->cb, 0, sizeof(struct inet6_skb_parm)); IP6CB(skb)->iif = skb->dev->ifindex; IP6CB(skb)->flags = IP6SKB_FORWARDED; return nf_flow_xmit_xfrm(skb, state, &rt->dst); } - outdev = rt->dst.dev; - skb->dev = outdev; - nexthop = rt6_nexthop(rt, &flow->tuplehash[!dir].tuple.src_v6); - skb_dst_set_noref(skb, &rt->dst); - neigh_xmit(NEIGH_ND_TABLE, outdev, nexthop, skb); + switch (tuplehash->tuple.xmit_type) { + case FLOW_OFFLOAD_XMIT_NEIGH: + rt = (struct rt6_info *)tuplehash->tuple.dst_cache; + outdev = rt->dst.dev; + skb->dev = outdev; + nexthop = rt6_nexthop(rt, &flow->tuplehash[!dir].tuple.src_v6); + skb_dst_set_noref(skb, &rt->dst); + neigh_xmit(NEIGH_ND_TABLE, outdev, nexthop, skb); + ret = NF_STOLEN; + break; + case FLOW_OFFLOAD_XMIT_DIRECT: + ret = nf_flow_queue_xmit(state->net, skb, tuplehash, ETH_P_IPV6); + if (ret == NF_DROP) + flow_offload_teardown(flow); + break; + } - return NF_STOLEN; + return ret; } EXPORT_SYMBOL_GPL(nf_flow_offload_ipv6_hook); diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c index 15f5a3b38253..68d03569c293 100644 --- a/net/netfilter/nft_flow_offload.c +++ b/net/netfilter/nft_flow_offload.c @@ -39,12 +39,11 @@ static void nft_default_forward_path(struct nf_flow_route *route, static int nft_dev_fill_forward_path(const struct nf_flow_route *route, const struct dst_entry *dst_cache, const struct nf_conn *ct, - enum ip_conntrack_dir dir, + enum ip_conntrack_dir dir, u8 *ha, struct net_device_path_stack *stack) { const void *daddr = &ct->tuplehash[!dir].tuple.src.u3; struct net_device *dev = dst_cache->dev; - unsigned char ha[ETH_ALEN]; struct neighbour *n; u8 nud_state; @@ -66,10 +65,14 @@ static int nft_dev_fill_forward_path(const struct nf_flow_route *route, struct nft_forward_info { const struct net_device *indev; + u8 h_source[ETH_ALEN]; + u8 h_dest[ETH_ALEN]; + enum flow_offload_xmit_type xmit_type; }; static void nft_dev_path_info(const struct net_device_path_stack *stack, - struct nft_forward_info *info) + struct nft_forward_info *info, + unsigned char *ha) { const struct net_device_path *path; int i; @@ -79,10 +82,14 @@ static void nft_dev_path_info(const struct net_device_path_stack *stack, switch (path->type) { case DEV_PATH_ETHERNET: info->indev = path->dev; + memcpy(info->h_dest, path->dev->dev_addr, ETH_ALEN); + memcpy(info->h_source, ha, ETH_ALEN); break; case DEV_PATH_VLAN: break; case DEV_PATH_BRIDGE: + memcpy(info->h_dest, path->dev->dev_addr, ETH_ALEN); + info->xmit_type = FLOW_OFFLOAD_XMIT_DIRECT; break; } } @@ -113,14 +120,22 @@ static void nft_dev_forward_path(struct nf_flow_route *route, const struct dst_entry *dst = route->tuple[dir].dst; struct net_device_path_stack stack; struct nft_forward_info info = {}; + unsigned char ha[ETH_ALEN]; - if (nft_dev_fill_forward_path(route, dst, ct, dir, &stack) >= 0) - nft_dev_path_info(&stack, &info); + if (nft_dev_fill_forward_path(route, dst, ct, dir, ha, &stack) >= 0) + nft_dev_path_info(&stack, &info, ha); if (!info.indev || !nft_flowtable_find_dev(info.indev, ft)) return; route->tuple[!dir].in.ifindex = info.indev->ifindex; + + if (info.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT) { + memcpy(route->tuple[dir].out.h_dest, info.h_source, ETH_ALEN); + memcpy(route->tuple[dir].out.h_source, info.h_dest, ETH_ALEN); + route->tuple[dir].out.ifindex = info.indev->ifindex; + route->tuple[dir].xmit_type = info.xmit_type; + } } static int nft_flow_route(const struct nft_pktinfo *pkt, From patchwork Wed Nov 18 22:30:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 327814 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 435CDC64E7A for ; Wed, 18 Nov 2020 22:30:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 111DD246DC for ; Wed, 18 Nov 2020 22:30:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727037AbgKRWa2 (ORCPT ); Wed, 18 Nov 2020 17:30:28 -0500 Received: from correo.us.es ([193.147.175.20]:51266 "EHLO mail.us.es" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726883AbgKRWa1 (ORCPT ); Wed, 18 Nov 2020 17:30:27 -0500 Received: from antivirus1-rhel7.int (unknown [192.168.2.11]) by mail.us.es (Postfix) with ESMTP id 22C7518FDB8 for ; Wed, 18 Nov 2020 23:30:24 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 113C2DA730 for ; Wed, 18 Nov 2020 23:30:24 +0100 (CET) Received: by antivirus1-rhel7.int (Postfix, from userid 99) id 0397BDA8FF; Wed, 18 Nov 2020 23:30:24 +0100 (CET) Received: from antivirus1-rhel7.int (localhost [127.0.0.1]) by antivirus1-rhel7.int (Postfix) with ESMTP id 7802FDA73D; Wed, 18 Nov 2020 23:30:21 +0100 (CET) Received: from 192.168.1.97 (192.168.1.97) by antivirus1-rhel7.int (F-Secure/fsigk_smtp/550/antivirus1-rhel7.int); Wed, 18 Nov 2020 23:30:21 +0100 (CET) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/antivirus1-rhel7.int) Received: from localhost.localdomain (unknown [90.77.255.23]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: pneira@us.es) by entrada.int (Postfix) with ESMTPSA id 2EF9642514A8; Wed, 18 Nov 2020 23:30:21 +0100 (CET) X-SMTPAUTHUS: auth mail.us.es From: Pablo Neira Ayuso To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org, razor@blackwall.org, tobias@waldekranz.com, jeremy@azazel.net Subject: [PATCH net-next,v4 8/9] netfilter: flowtable: add vlan support Date: Wed, 18 Nov 2020 23:30:10 +0100 Message-Id: <20201118223011.3216-9-pablo@netfilter.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201118223011.3216-1-pablo@netfilter.org> References: <20201118223011.3216-1-pablo@netfilter.org> MIME-Version: 1.0 X-Virus-Scanned: ClamAV using ClamSMTP Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add the vlan id and protocol to the flow tuple to uniquely identify flows from the receive path. For the transmit path, dev_hard_header() on the vlan device push the headers. This patch includes support for two VLAN headers (QinQ) from the ingress path. Signed-off-by: Pablo Neira Ayuso --- v4: no changes. include/net/netfilter/nf_flow_table.h | 15 +++- net/netfilter/nf_flow_table_core.c | 6 ++ net/netfilter/nf_flow_table_ip.c | 108 +++++++++++++++++++++----- net/netfilter/nft_flow_offload.c | 25 +++++- 4 files changed, 131 insertions(+), 23 deletions(-) diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h index 83110e4705c0..e65bbbd982cb 100644 --- a/include/net/netfilter/nf_flow_table.h +++ b/include/net/netfilter/nf_flow_table.h @@ -95,6 +95,8 @@ enum flow_offload_xmit_type { FLOW_OFFLOAD_XMIT_DIRECT, }; +#define NF_FLOW_TABLE_VLAN_MAX 2 + struct flow_offload_tuple { union { struct in_addr src_v4; @@ -113,13 +115,17 @@ struct flow_offload_tuple { u8 l3proto; u8 l4proto; + struct { + u16 id; + __be16 proto; + } in_vlan[NF_FLOW_TABLE_VLAN_MAX]; /* All members above are keys for lookups, see flow_offload_hash(). */ struct { } __hash; - u8 dir:6, - xmit_type:2; - + u8 dir:4, + xmit_type:2, + in_vlan_num:2; u16 mtu; union { struct dst_entry *dst_cache; @@ -174,6 +180,9 @@ struct nf_flow_route { struct dst_entry *dst; struct { u32 ifindex; + u16 vid[NF_FLOW_TABLE_VLAN_MAX]; + __be16 vproto[NF_FLOW_TABLE_VLAN_MAX]; + u8 num_vlans; } in; struct { u32 ifindex; diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c index 751bc1c27c16..c4322aeb3557 100644 --- a/net/netfilter/nf_flow_table_core.c +++ b/net/netfilter/nf_flow_table_core.c @@ -80,6 +80,7 @@ static int flow_offload_fill_route(struct flow_offload *flow, { struct flow_offload_tuple *flow_tuple = &flow->tuplehash[dir].tuple; struct dst_entry *dst = route->tuple[dir].dst; + int i; switch (flow_tuple->l3proto) { case NFPROTO_IPV4: @@ -91,6 +92,11 @@ static int flow_offload_fill_route(struct flow_offload *flow, } flow_tuple->iifidx = route->tuple[dir].in.ifindex; + for (i = 0; i < route->tuple[dir].in.num_vlans; i++) { + flow_tuple->in_vlan[i].id = route->tuple[dir].in.vid[i]; + flow_tuple->in_vlan[i].proto = route->tuple[dir].in.vproto[i]; + } + flow_tuple->in_vlan_num = route->tuple[dir].in.num_vlans; switch (route->tuple[dir].xmit_type) { case FLOW_OFFLOAD_XMIT_DIRECT: diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c index d10ba09a41b5..c1604de1612c 100644 --- a/net/netfilter/nf_flow_table_ip.c +++ b/net/netfilter/nf_flow_table_ip.c @@ -159,17 +159,35 @@ static bool ip_has_options(unsigned int thoff) return thoff != sizeof(struct iphdr); } +static void nf_flow_tuple_vlan(struct sk_buff *skb, + struct flow_offload_tuple *tuple) +{ + if (skb_vlan_tag_present(skb)) { + tuple->in_vlan[0].id = skb_vlan_tag_get(skb); + tuple->in_vlan[0].proto = skb->vlan_proto; + } + if (skb->protocol == htons(ETH_P_8021Q)) { + struct vlan_ethhdr *veth = (struct vlan_ethhdr *)skb_mac_header(skb); + + tuple->in_vlan[1].id = ntohs(veth->h_vlan_TCI); + tuple->in_vlan[1].proto = skb->protocol; + } +} + static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, struct flow_offload_tuple *tuple) { - unsigned int thoff, hdrsize; + unsigned int thoff, hdrsize, offset = 0; struct flow_ports *ports; struct iphdr *iph; - if (!pskb_may_pull(skb, sizeof(*iph))) + if (skb->protocol == htons(ETH_P_8021Q)) + offset += VLAN_HLEN; + + if (!pskb_may_pull(skb, sizeof(*iph) + offset)) return -1; - iph = ip_hdr(skb); + iph = (struct iphdr *)(skb_network_header(skb) + offset); thoff = iph->ihl * 4; if (ip_is_fragment(iph) || @@ -191,11 +209,11 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, return -1; thoff = iph->ihl * 4; - if (!pskb_may_pull(skb, thoff + hdrsize)) + if (!pskb_may_pull(skb, thoff + hdrsize + offset)) return -1; - iph = ip_hdr(skb); - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); + iph = (struct iphdr *)(skb_network_header(skb) + offset); + ports = (struct flow_ports *)(skb_network_header(skb) + thoff + offset); tuple->src_v4.s_addr = iph->saddr; tuple->dst_v4.s_addr = iph->daddr; @@ -204,6 +222,7 @@ static int nf_flow_tuple_ip(struct sk_buff *skb, const struct net_device *dev, tuple->l3proto = AF_INET; tuple->l4proto = iph->protocol; tuple->iifidx = dev->ifindex; + nf_flow_tuple_vlan(skb, tuple); return 0; } @@ -248,6 +267,37 @@ static unsigned int nf_flow_xmit_xfrm(struct sk_buff *skb, return NF_STOLEN; } +static bool nf_flow_skb_vlan_protocol(const struct sk_buff *skb, __be16 proto) +{ + if (skb->protocol == htons(ETH_P_8021Q)) { + struct vlan_ethhdr *veth; + + veth = (struct vlan_ethhdr *)skb_mac_header(skb); + if (veth->h_vlan_encapsulated_proto == proto) + return true; + } + + return false; +} + +static void nf_flow_vlan_pop(struct sk_buff *skb, + struct flow_offload_tuple_rhash *tuplehash) +{ + struct vlan_hdr *vlan_hdr; + int i; + + for (i = 0; i < tuplehash->tuple.in_vlan_num; i++) { + if (skb_vlan_tag_present(skb)) { + __vlan_hwaccel_clear_tag(skb); + continue; + } + vlan_hdr = (struct vlan_hdr *)skb->data; + __skb_pull(skb, VLAN_HLEN); + vlan_set_encap_proto(skb, vlan_hdr); + skb_reset_network_header(skb); + } +} + static unsigned int nf_flow_queue_xmit(struct net *net, struct sk_buff *skb, const struct flow_offload_tuple_rhash *tuplehash, unsigned short type) @@ -280,9 +330,11 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, unsigned int thoff; struct iphdr *iph; __be32 nexthop; + u32 offset = 0; int ret; - if (skb->protocol != htons(ETH_P_IP)) + if (skb->protocol != htons(ETH_P_IP) && + !nf_flow_skb_vlan_protocol(skb, htons(ETH_P_IP))) return NF_ACCEPT; if (nf_flow_tuple_ip(skb, state->in, &tuple) < 0) @@ -298,11 +350,15 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, if (unlikely(nf_flow_exceeds_mtu(skb, flow->tuplehash[dir].tuple.mtu))) return NF_ACCEPT; - if (skb_try_make_writable(skb, sizeof(*iph))) + if (skb->protocol == htons(ETH_P_8021Q)) + offset += VLAN_HLEN; + + if (skb_try_make_writable(skb, sizeof(*iph) + offset)) return NF_DROP; - thoff = ip_hdr(skb)->ihl * 4; - if (nf_flow_state_check(flow, ip_hdr(skb)->protocol, skb, thoff)) + iph = (struct iphdr *)(skb_network_header(skb) + offset); + thoff = (iph->ihl * 4) + offset; + if (nf_flow_state_check(flow, iph->protocol, skb, thoff)) return NF_ACCEPT; flow_offload_refresh(flow_table, flow); @@ -312,6 +368,9 @@ nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, return NF_ACCEPT; } + nf_flow_vlan_pop(skb, tuplehash); + thoff -= offset; + if (nf_flow_nat_ip(flow, skb, thoff, dir) < 0) return NF_DROP; @@ -479,14 +538,17 @@ static int nf_flow_nat_ipv6(const struct flow_offload *flow, static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, struct flow_offload_tuple *tuple) { - unsigned int thoff, hdrsize; + unsigned int thoff, hdrsize, offset = 0; struct flow_ports *ports; struct ipv6hdr *ip6h; - if (!pskb_may_pull(skb, sizeof(*ip6h))) + if (skb->protocol == htons(ETH_P_8021Q)) + offset += VLAN_HLEN; + + if (!pskb_may_pull(skb, sizeof(*ip6h) + offset)) return -1; - ip6h = ipv6_hdr(skb); + ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); switch (ip6h->nexthdr) { case IPPROTO_TCP: @@ -503,11 +565,11 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, return -1; thoff = sizeof(*ip6h); - if (!pskb_may_pull(skb, thoff + hdrsize)) + if (!pskb_may_pull(skb, thoff + hdrsize + offset)) return -1; - ip6h = ipv6_hdr(skb); - ports = (struct flow_ports *)(skb_network_header(skb) + thoff); + ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); + ports = (struct flow_ports *)(skb_network_header(skb) + thoff + offset); tuple->src_v6 = ip6h->saddr; tuple->dst_v6 = ip6h->daddr; @@ -516,6 +578,7 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, tuple->l3proto = AF_INET6; tuple->l4proto = ip6h->nexthdr; tuple->iifidx = dev->ifindex; + nf_flow_tuple_vlan(skb, tuple); return 0; } @@ -533,9 +596,11 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, struct net_device *outdev; struct ipv6hdr *ip6h; struct rt6_info *rt; + u32 offset = 0; int ret; - if (skb->protocol != htons(ETH_P_IPV6)) + if (skb->protocol != htons(ETH_P_IPV6) && + !nf_flow_skb_vlan_protocol(skb, htons(ETH_P_IPV6))) return NF_ACCEPT; if (nf_flow_tuple_ipv6(skb, state->in, &tuple) < 0) @@ -551,8 +616,11 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, if (unlikely(nf_flow_exceeds_mtu(skb, flow->tuplehash[dir].tuple.mtu))) return NF_ACCEPT; - if (nf_flow_state_check(flow, ipv6_hdr(skb)->nexthdr, skb, - sizeof(*ip6h))) + if (skb->protocol == htons(ETH_P_8021Q)) + offset += VLAN_HLEN; + + ip6h = (struct ipv6hdr *)(skb_network_header(skb) + offset); + if (nf_flow_state_check(flow, ip6h->nexthdr, skb, sizeof(*ip6h))) return NF_ACCEPT; flow_offload_refresh(flow_table, flow); @@ -562,6 +630,8 @@ nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb, return NF_ACCEPT; } + nf_flow_vlan_pop(skb, tuplehash); + if (skb_try_make_writable(skb, sizeof(*ip6h))) return NF_DROP; diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c index 68d03569c293..7a19cadc7021 100644 --- a/net/netfilter/nft_flow_offload.c +++ b/net/netfilter/nft_flow_offload.c @@ -65,6 +65,10 @@ static int nft_dev_fill_forward_path(const struct nf_flow_route *route, struct nft_forward_info { const struct net_device *indev; + const struct net_device *vlandev; + __u16 vid[NF_FLOW_TABLE_VLAN_MAX]; + __be16 vproto[NF_FLOW_TABLE_VLAN_MAX]; + u8 num_vlans; u8 h_source[ETH_ALEN]; u8 h_dest[ETH_ALEN]; enum flow_offload_xmit_type xmit_type; @@ -86,6 +90,16 @@ static void nft_dev_path_info(const struct net_device_path_stack *stack, memcpy(info->h_source, ha, ETH_ALEN); break; case DEV_PATH_VLAN: + if (info->num_vlans >= NF_FLOW_TABLE_VLAN_MAX) { + info->indev = NULL; + break; + } + if (!info->vlandev) + info->vlandev = path->dev; + + info->vid[info->num_vlans] = path->vlan.id; + info->vproto[info->num_vlans] = path->vlan.proto; + info->num_vlans++; break; case DEV_PATH_BRIDGE: memcpy(info->h_dest, path->dev->dev_addr, ETH_ALEN); @@ -121,6 +135,7 @@ static void nft_dev_forward_path(struct nf_flow_route *route, struct net_device_path_stack stack; struct nft_forward_info info = {}; unsigned char ha[ETH_ALEN]; + int i; if (nft_dev_fill_forward_path(route, dst, ct, dir, ha, &stack) >= 0) nft_dev_path_info(&stack, &info, ha); @@ -129,11 +144,19 @@ static void nft_dev_forward_path(struct nf_flow_route *route, return; route->tuple[!dir].in.ifindex = info.indev->ifindex; + for (i = 0; i < info.num_vlans; i++) { + route->tuple[!dir].in.vid[i] = info.vid[i]; + route->tuple[!dir].in.vproto[i] = info.vproto[i]; + } + route->tuple[!dir].in.num_vlans = info.num_vlans; if (info.xmit_type == FLOW_OFFLOAD_XMIT_DIRECT) { memcpy(route->tuple[dir].out.h_dest, info.h_source, ETH_ALEN); memcpy(route->tuple[dir].out.h_source, info.h_dest, ETH_ALEN); - route->tuple[dir].out.ifindex = info.indev->ifindex; + if (info.vlandev) + route->tuple[dir].out.ifindex = info.vlandev->ifindex; + else + route->tuple[dir].out.ifindex = info.indev->ifindex; route->tuple[dir].xmit_type = info.xmit_type; } }