From patchwork Tue Apr 13 03:15:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xuan Zhuo X-Patchwork-Id: 420727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B044C43460 for ; Tue, 13 Apr 2021 03:15:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6EE97613B7 for ; Tue, 13 Apr 2021 03:15:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344343AbhDMDPx (ORCPT ); Mon, 12 Apr 2021 23:15:53 -0400 Received: from out30-44.freemail.mail.aliyun.com ([115.124.30.44]:47384 "EHLO out30-44.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236976AbhDMDPo (ORCPT ); Mon, 12 Apr 2021 23:15:44 -0400 X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R181e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=e01e04423; MF=xuanzhuo@linux.alibaba.com; NM=1; PH=DS; RN=15; SR=0; TI=SMTPD_---0UVPME4D_1618283723; Received: from localhost(mailfrom:xuanzhuo@linux.alibaba.com fp:SMTPD_---0UVPME4D_1618283723) by smtp.aliyun-inc.com(127.0.0.1); Tue, 13 Apr 2021 11:15:24 +0800 From: Xuan Zhuo To: netdev@vger.kernel.org Cc: "Michael S. Tsirkin" , Jason Wang , "David S. 
Miller" , Jakub Kicinski , =?utf-8?b?QmrDtnJuIFTDtnBl?= =?utf-8?q?l?= , Magnus Karlsson , Jonathan Lemon , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , virtualization@lists.linux-foundation.org, bpf@vger.kernel.org, "dust . li"
Subject: [PATCH net-next v4 02/10] netdevice: add priv_flags IFF_NOT_USE_DMA_ADDR
Date: Tue, 13 Apr 2021 11:15:15 +0800
Message-Id: <20210413031523.73507-3-xuanzhuo@linux.alibaba.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
References: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Some drivers, such as virtio-net, do not use DMA addresses directly.
Upper-level frameworks such as xdp socket need to be aware of this, so
add a new priv_flag, IFF_NOT_USE_DMA_ADDR.

Signed-off-by: Xuan Zhuo
---
 include/linux/netdevice.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 86e4bd08c2f1..78b2a8b2c31d 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1598,6 +1598,7 @@ typedef u64 netdev_priv_flags_t;
  * @IFF_LIVE_RENAME_OK: rename is allowed while device is up and running
  * @IFF_TX_SKB_NO_LINEAR: device/driver is capable of xmitting frames with
  * skb_headlen(skb) == 0 (data starts from frag0)
+ * @IFF_NOT_USE_DMA_ADDR: driver not use dma addr directly.
such as virtio-net */ enum netdev_priv_flags { IFF_802_1Q_VLAN_BIT, @@ -1632,6 +1633,7 @@ enum netdev_priv_flags { IFF_L3MDEV_RX_HANDLER_BIT, IFF_LIVE_RENAME_OK_BIT, IFF_TX_SKB_NO_LINEAR_BIT, + IFF_NOT_USE_DMA_ADDR_BIT, }; #define __IFF_BIT(bit) ((netdev_priv_flags_t)1 << (bit)) @@ -1669,6 +1671,7 @@ enum netdev_priv_flags { #define IFF_L3MDEV_RX_HANDLER __IFF(L3MDEV_RX_HANDLER) #define IFF_LIVE_RENAME_OK __IFF(LIVE_RENAME_OK) #define IFF_TX_SKB_NO_LINEAR __IFF(TX_SKB_NO_LINEAR) +#define IFF_NOT_USE_DMA_ADDR __IFF(NOT_USE_DMA_ADDR) /* Specifies the type of the struct net_device::ml_priv pointer */ enum netdev_ml_priv_type { From patchwork Tue Apr 13 03:15:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xuan Zhuo X-Patchwork-Id: 420728 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69580C4360C for ; Tue, 13 Apr 2021 03:15:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4CABD613AF for ; Tue, 13 Apr 2021 03:15:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344348AbhDMDPy (ORCPT ); Mon, 12 Apr 2021 23:15:54 -0400 Received: from out30-132.freemail.mail.aliyun.com ([115.124.30.132]:41457 "EHLO out30-132.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242040AbhDMDPq (ORCPT ); Mon, 12 Apr 2021 23:15:46 -0400 X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R151e4; CH=green; DM=||false|; DS=||; 
FP=0|-1|-1|-1|0|-1|-1|-1; HT=e01e04426; MF=xuanzhuo@linux.alibaba.com; NM=1; PH=DS; RN=15; SR=0; TI=SMTPD_---0UVPZqgd_1618283724;
Received: from localhost(mailfrom:xuanzhuo@linux.alibaba.com fp:SMTPD_---0UVPZqgd_1618283724) by smtp.aliyun-inc.com(127.0.0.1); Tue, 13 Apr 2021 11:15:24 +0800
From: Xuan Zhuo
To: netdev@vger.kernel.org
Cc: "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Jakub Kicinski , =?utf-8?b?QmrDtnJuIFTDtnBl?= =?utf-8?q?l?= , Magnus Karlsson , Jonathan Lemon , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , virtualization@lists.linux-foundation.org, bpf@vger.kernel.org, "dust . li"
Subject: [PATCH net-next v4 03/10] virtio-net: add priv_flags IFF_NOT_USE_DMA_ADDR
Date: Tue, 13 Apr 2021 11:15:16 +0800
Message-Id: <20210413031523.73507-4-xuanzhuo@linux.alibaba.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
References: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

virtio-net does not use DMA addresses directly, so set the priv_flag
IFF_NOT_USE_DMA_ADDR on it.

Signed-off-by: Xuan Zhuo
---
 drivers/net/virtio_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index bb4ea9dbc16b..52653e234a20 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3007,7 +3007,7 @@ static int virtnet_probe(struct virtio_device *vdev)
 	/* Set up network device as normal.
*/ dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE | - IFF_TX_SKB_NO_LINEAR; + IFF_TX_SKB_NO_LINEAR | IFF_NOT_USE_DMA_ADDR; dev->netdev_ops = &virtnet_netdev; dev->features = NETIF_F_HIGHDMA; From patchwork Tue Apr 13 03:15:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xuan Zhuo X-Patchwork-Id: 420726 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3D7AC433B4 for ; Tue, 13 Apr 2021 03:15:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 81365613AE for ; Tue, 13 Apr 2021 03:15:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344372AbhDMDP5 (ORCPT ); Mon, 12 Apr 2021 23:15:57 -0400 Received: from out30-133.freemail.mail.aliyun.com ([115.124.30.133]:49170 "EHLO out30-133.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344292AbhDMDPr (ORCPT ); Mon, 12 Apr 2021 23:15:47 -0400 X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R851e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=e01e04423; MF=xuanzhuo@linux.alibaba.com; NM=1; PH=DS; RN=15; SR=0; TI=SMTPD_---0UVPXAs6_1618283725; Received: from localhost(mailfrom:xuanzhuo@linux.alibaba.com fp:SMTPD_---0UVPXAs6_1618283725) by smtp.aliyun-inc.com(127.0.0.1); Tue, 13 Apr 2021 11:15:26 +0800 From: Xuan Zhuo To: netdev@vger.kernel.org Cc: "Michael S. Tsirkin" , Jason Wang , "David S. 
Miller" , Jakub Kicinski , =?utf-8?b?QmrDtnJuIFTDtnBl?= =?utf-8?q?l?= , Magnus Karlsson , Jonathan Lemon , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , virtualization@lists.linux-foundation.org, bpf@vger.kernel.org, "dust . li"
Subject: [PATCH net-next v4 06/10] virtio-net: unify the code for recycling the xmit ptr
Date: Tue, 13 Apr 2021 11:15:19 +0800
Message-Id: <20210413031523.73507-7-xuanzhuo@linux.alibaba.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
References: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Recycling old xmit buffers currently has to handle two pointer types,
"skb" and "xdp frame", and does so in two nearly identical but
independent implementations: one in free_old_xmit_skbs() and one in
virtnet_xdp_xmit(). This duplication makes it inconvenient to add new
pointer types later.

So extract the common loop into a helper, __free_old_xmit(), and call it
from both places to reclaim old xmit pointers. Rename
free_old_xmit_skbs() to free_old_xmit().

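As a rough illustration of the idea described above, the following is a userspace sketch, not the driver code: it tags the low bit of a pointer to distinguish two buffer types and walks them in a single loop that fills caller-owned statistics. All `model_*` names and `MODEL_XDP_FLAG` are hypothetical stand-ins, and the actual release of each buffer is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
#define MODEL_XDP_FLAG ((uintptr_t)1)	/* low pointer bit marks a frame */

struct model_skb   { unsigned int len; };
struct model_frame { unsigned int len; };
struct model_stats { unsigned long packets; unsigned long bytes; };

/* Tag a frame pointer so it can share one queue with skb pointers. */
static void *model_xdp_to_ptr(struct model_frame *frame)
{
	return (void *)((uintptr_t)frame | MODEL_XDP_FLAG);
}

static bool model_is_xdp_frame(const void *ptr)
{
	return ((uintptr_t)ptr & MODEL_XDP_FLAG) != 0;
}

static struct model_frame *model_ptr_to_xdp(void *ptr)
{
	return (struct model_frame *)((uintptr_t)ptr & ~MODEL_XDP_FLAG);
}

/*
 * One accounting loop for both pointer types. A real driver would also
 * release each buffer here; this model only updates the statistics.
 */
static void model_reclaim(void **used, size_t n, struct model_stats *stats)
{
	for (size_t i = 0; i < n; i++) {
		void *ptr = used[i];

		if (!model_is_xdp_frame(ptr))
			stats->bytes += ((struct model_skb *)ptr)->len;
		else
			stats->bytes += model_ptr_to_xdp(ptr)->len;
		stats->packets++;
	}
}
```

Passing the statistics structure in from the caller is what lets two call sites with different bookkeeping share the same loop.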
Signed-off-by: Xuan Zhuo --- drivers/net/virtio_net.c | 86 ++++++++++++++++++---------------------- 1 file changed, 38 insertions(+), 48 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 52653e234a20..f3752b254965 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -264,6 +264,30 @@ static struct xdp_frame *ptr_to_xdp(void *ptr) return (struct xdp_frame *)((unsigned long)ptr & ~VIRTIO_XDP_FLAG); } +static void __free_old_xmit(struct send_queue *sq, bool in_napi, + struct virtnet_sq_stats *stats) +{ + unsigned int len; + void *ptr; + + while ((ptr = virtqueue_get_buf(sq->vq, &len)) != NULL) { + if (likely(!is_xdp_frame(ptr))) { + struct sk_buff *skb = ptr; + + pr_debug("Sent skb %p\n", skb); + + stats->bytes += skb->len; + napi_consume_skb(skb, in_napi); + } else { + struct xdp_frame *frame = ptr_to_xdp(ptr); + + stats->bytes += frame->len; + xdp_return_frame(frame); + } + stats->packets++; + } +} + /* Converting between virtqueue no. and kernel tx/rx queue no. * 0:rx0 1:tx0 2:rx1 3:tx1 ... 2N:rxN 2N+1:txN 2N+2:cvq */ @@ -525,15 +549,12 @@ static int virtnet_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, u32 flags) { struct virtnet_info *vi = netdev_priv(dev); + struct virtnet_sq_stats stats = {}; struct receive_queue *rq = vi->rq; struct bpf_prog *xdp_prog; struct send_queue *sq; - unsigned int len; - int packets = 0; - int bytes = 0; int nxmit = 0; int kicks = 0; - void *ptr; int ret; int i; @@ -552,20 +573,7 @@ static int virtnet_xdp_xmit(struct net_device *dev, } /* Free up any pending old buffers before queueing new ones. 
*/ - while ((ptr = virtqueue_get_buf(sq->vq, &len)) != NULL) { - if (likely(is_xdp_frame(ptr))) { - struct xdp_frame *frame = ptr_to_xdp(ptr); - - bytes += frame->len; - xdp_return_frame(frame); - } else { - struct sk_buff *skb = ptr; - - bytes += skb->len; - napi_consume_skb(skb, false); - } - packets++; - } + __free_old_xmit(sq, false, &stats); for (i = 0; i < n; i++) { struct xdp_frame *xdpf = frames[i]; @@ -582,8 +590,8 @@ static int virtnet_xdp_xmit(struct net_device *dev, } out: u64_stats_update_begin(&sq->stats.syncp); - sq->stats.bytes += bytes; - sq->stats.packets += packets; + sq->stats.bytes += stats.bytes; + sq->stats.packets += stats.packets; sq->stats.xdp_tx += n; sq->stats.xdp_tx_drops += n - nxmit; sq->stats.kicks += kicks; @@ -1406,39 +1414,21 @@ static int virtnet_receive(struct receive_queue *rq, int budget, return stats.packets; } -static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi) +static void free_old_xmit(struct send_queue *sq, bool in_napi) { - unsigned int len; - unsigned int packets = 0; - unsigned int bytes = 0; - void *ptr; - - while ((ptr = virtqueue_get_buf(sq->vq, &len)) != NULL) { - if (likely(!is_xdp_frame(ptr))) { - struct sk_buff *skb = ptr; - - pr_debug("Sent skb %p\n", skb); + struct virtnet_sq_stats stats = {}; - bytes += skb->len; - napi_consume_skb(skb, in_napi); - } else { - struct xdp_frame *frame = ptr_to_xdp(ptr); - - bytes += frame->len; - xdp_return_frame(frame); - } - packets++; - } + __free_old_xmit(sq, in_napi, &stats); /* Avoid overhead when no packets have been processed * happens when called speculatively from start_xmit. 
*/ - if (!packets) + if (!stats.packets) return; u64_stats_update_begin(&sq->stats.syncp); - sq->stats.bytes += bytes; - sq->stats.packets += packets; + sq->stats.bytes += stats.bytes; + sq->stats.packets += stats.packets; u64_stats_update_end(&sq->stats.syncp); } @@ -1463,7 +1453,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq) return; if (__netif_tx_trylock(txq)) { - free_old_xmit_skbs(sq, true); + free_old_xmit(sq, true); __netif_tx_unlock(txq); } @@ -1548,7 +1538,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget) txq = netdev_get_tx_queue(vi->dev, index); __netif_tx_lock(txq, raw_smp_processor_id()); - free_old_xmit_skbs(sq, true); + free_old_xmit(sq, true); __netif_tx_unlock(txq); virtqueue_napi_complete(napi, sq->vq, 0); @@ -1617,7 +1607,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) bool use_napi = sq->napi.weight; /* Free up any pending old buffers before queueing new ones. */ - free_old_xmit_skbs(sq, false); + free_old_xmit(sq, false); if (use_napi && kick) virtqueue_enable_cb_delayed(sq->vq); @@ -1661,7 +1651,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) if (!use_napi && unlikely(!virtqueue_enable_cb_delayed(sq->vq))) { /* More just got used, free them then recheck. 
*/ - free_old_xmit_skbs(sq, false); + free_old_xmit(sq, false); if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) { netif_start_subqueue(dev, qnum); virtqueue_disable_cb(sq->vq); From patchwork Tue Apr 13 03:15:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xuan Zhuo X-Patchwork-Id: 420725 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 544DAC433ED for ; Tue, 13 Apr 2021 03:15:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3A188613AF for ; Tue, 13 Apr 2021 03:15:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344378AbhDMDP7 (ORCPT ); Mon, 12 Apr 2021 23:15:59 -0400 Received: from out30-131.freemail.mail.aliyun.com ([115.124.30.131]:38989 "EHLO out30-131.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344316AbhDMDPs (ORCPT ); Mon, 12 Apr 2021 23:15:48 -0400 X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R201e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=e01e04400; MF=xuanzhuo@linux.alibaba.com; NM=1; PH=DS; RN=15; SR=0; TI=SMTPD_---0UVPZqh-_1618283726; Received: from localhost(mailfrom:xuanzhuo@linux.alibaba.com fp:SMTPD_---0UVPZqh-_1618283726) by smtp.aliyun-inc.com(127.0.0.1); Tue, 13 Apr 2021 11:15:26 +0800 From: Xuan Zhuo To: netdev@vger.kernel.org Cc: "Michael S. Tsirkin" , Jason Wang , "David S. 
Miller" , Jakub Kicinski , =?utf-8?b?QmrDtnJuIFTDtnBl?= =?utf-8?q?l?= , Magnus Karlsson , Jonathan Lemon , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , virtualization@lists.linux-foundation.org, bpf@vger.kernel.org, "dust . li"
Subject: [PATCH net-next v4 08/10] virtio-net: xsk zero copy xmit setup
Date: Tue, 13 Apr 2021 11:15:21 +0800
Message-Id: <20210413031523.73507-9-xuanzhuo@linux.alibaba.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
References: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

xsk is a high-performance mechanism for receiving and sending packets.
This patch implements binding an xsk pool to, and unbinding it from, a
virtio-net send queue for xsk zero copy xmit.

xsk zero copy xmit depends on tx napi, so binding fails with an error if
tx napi is not enabled on the queue. The entire operation runs under the
protection of rtnl_lock.

While xsk is active on a queue, ethtool is prevented from modifying the
tx napi setting of that queue.

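The three rules stated above (bind needs a valid queue, bind needs tx napi, a bound queue keeps its napi weight) can be modelled in userspace roughly as follows. This is a simplified sketch under assumptions: the `model_*` names and `MODEL_MAX_QUEUES` are hypothetical, and the rtnl_lock and RCU handling that the patch relies on is left out entirely.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_QUEUES 4	/* hypothetical fixed size for the model */

struct model_sq {
	int napi_weight;	/* 0 means tx napi is off for this queue */
	void *pool;		/* non-NULL while an xsk pool is bound */
};

struct model_dev {
	unsigned int queue_pairs;
	struct model_sq sq[MODEL_MAX_QUEUES];
};

/* Bind a pool: the queue must exist and must have tx napi enabled. */
static int model_pool_enable(struct model_dev *dev, void *pool,
			     unsigned int qid)
{
	if (qid >= dev->queue_pairs)
		return -EINVAL;
	if (!dev->sq[qid].napi_weight)
		return -EPERM;
	dev->sq[qid].pool = pool;
	return 0;
}

/* Unbind the pool from a queue. */
static int model_pool_disable(struct model_dev *dev, unsigned int qid)
{
	if (qid >= dev->queue_pairs)
		return -EINVAL;
	dev->sq[qid].pool = NULL;
	return 0;
}

/* Apply a new tx napi weight, skipping queues that have a pool bound. */
static void model_set_tx_napi_weight(struct model_dev *dev, int weight)
{
	for (unsigned int i = 0; i < dev->queue_pairs; i++) {
		if (dev->sq[i].pool)
			continue;
		dev->sq[i].napi_weight = weight;
	}
}
```

In this model a weight change silently skips bound queues rather than failing, which matches the `continue` used in the virtnet_set_coalesce() hunk of this patch.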
Signed-off-by: Xuan Zhuo Reviewed-by: Dust Li --- drivers/net/virtio_net.c | 78 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 77 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index f52a25091322..8242a9e9f17d 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -22,6 +22,7 @@ #include #include #include +#include static int napi_weight = NAPI_POLL_WEIGHT; module_param(napi_weight, int, 0444); @@ -133,6 +134,11 @@ struct send_queue { struct virtnet_sq_stats stats; struct napi_struct napi; + + struct { + /* xsk pool */ + struct xsk_buff_pool __rcu *pool; + } xsk; }; /* Internal representation of a receive virtqueue */ @@ -2249,8 +2255,19 @@ static int virtnet_set_coalesce(struct net_device *dev, if (napi_weight ^ vi->sq[0].napi.weight) { if (dev->flags & IFF_UP) return -EBUSY; - for (i = 0; i < vi->max_queue_pairs; i++) + for (i = 0; i < vi->max_queue_pairs; i++) { + /* xsk xmit depend on the tx napi. So if xsk is active, + * prevent modifications to tx napi. + */ + rcu_read_lock(); + if (rcu_dereference(vi->sq[i].xsk.pool)) { + rcu_read_unlock(); + continue; + } + rcu_read_unlock(); + vi->sq[i].napi.weight = napi_weight; + } } return 0; @@ -2518,11 +2535,70 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, return err; } +static int virtnet_xsk_pool_enable(struct net_device *dev, + struct xsk_buff_pool *pool, + u16 qid) +{ + struct virtnet_info *vi = netdev_priv(dev); + struct send_queue *sq; + + if (qid >= vi->curr_queue_pairs) + return -EINVAL; + + sq = &vi->sq[qid]; + + /* xsk zerocopy depend on the tx napi. + * + * xsk zerocopy xmit is driven by the tx interrupt. When the device is + * not busy, napi will be called continuously to send data. When the + * device is busy, wait for the notification interrupt after the + * hardware has finished processing the data, and continue to send data + * in napi. 
+ */ + if (!sq->napi.weight) + return -EPERM; + + rcu_read_lock(); + /* Here is already protected by rtnl_lock, so rcu_assign_pointer is + * safe. + */ + rcu_assign_pointer(sq->xsk.pool, pool); + rcu_read_unlock(); + + return 0; +} + +static int virtnet_xsk_pool_disable(struct net_device *dev, u16 qid) +{ + struct virtnet_info *vi = netdev_priv(dev); + struct send_queue *sq; + + if (qid >= vi->curr_queue_pairs) + return -EINVAL; + + sq = &vi->sq[qid]; + + /* Here is already protected by rtnl_lock, so rcu_assign_pointer is + * safe. + */ + rcu_assign_pointer(sq->xsk.pool, NULL); + + synchronize_net(); /* Sync with the XSK wakeup and with NAPI. */ + + return 0; +} + static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp) { switch (xdp->command) { case XDP_SETUP_PROG: return virtnet_xdp_set(dev, xdp->prog, xdp->extack); + case XDP_SETUP_XSK_POOL: + if (xdp->xsk.pool) + return virtnet_xsk_pool_enable(dev, xdp->xsk.pool, + xdp->xsk.queue_id); + else + return virtnet_xsk_pool_disable(dev, xdp->xsk.queue_id); default: return -EINVAL; } From patchwork Tue Apr 13 03:15:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xuan Zhuo X-Patchwork-Id: 420724 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F4AEC433ED for ; Tue, 13 Apr 2021 03:15:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 819D5613AE for ; Tue, 13 Apr 2021 03:15:44 +0000 (UTC) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344397AbhDMDQC (ORCPT ); Mon, 12 Apr 2021 23:16:02 -0400
Received: from out4436.biz.mail.alibaba.com ([47.88.44.36]:53388 "EHLO out4436.biz.mail.alibaba.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344323AbhDMDPv (ORCPT ); Mon, 12 Apr 2021 23:15:51 -0400
X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R961e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=e01e04423; MF=xuanzhuo@linux.alibaba.com; NM=1; PH=DS; RN=15; SR=0; TI=SMTPD_---0UVPME4p_1618283727;
Received: from localhost(mailfrom:xuanzhuo@linux.alibaba.com fp:SMTPD_---0UVPME4p_1618283727) by smtp.aliyun-inc.com(127.0.0.1); Tue, 13 Apr 2021 11:15:27 +0800
From: Xuan Zhuo
To: netdev@vger.kernel.org
Cc: "Michael S. Tsirkin" , Jason Wang , "David S. Miller" , Jakub Kicinski , =?utf-8?b?QmrDtnJuIFTDtnBl?= =?utf-8?q?l?= , Magnus Karlsson , Jonathan Lemon , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , virtualization@lists.linux-foundation.org, bpf@vger.kernel.org, "dust . li"
Subject: [PATCH net-next v4 10/10] virtio-net: xsk zero copy xmit kick by threshold
Date: Tue, 13 Apr 2021 11:15:23 +0800
Message-Id: <20210413031523.73507-11-xuanzhuo@linux.alibaba.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
References: <20210413031523.73507-1-xuanzhuo@linux.alibaba.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Testing shows that kicking after every packet gives unstable
performance, and that kicking only once after all the packets have been
sent does not perform well either. So add a module parameter,
xsk_kick_thr, to specify how many packets are sent before a kick is
issued. 8 is a relatively stable value with the best performance.

Here is the pps measured for xsk_kick_thr values from 1 to 64.

thr PPS           | thr PPS           | thr PPS
1   2924116.74247 | 23  3683263.04348 | 45  2777907.22963
2   3441010.57191 | 24  3078880.13043 | 46  2781376.21739
3   3636728.72378 | 25  2859219.57656 | 47  2777271.91304
4   3637518.61468 | 26  2851557.9593  | 48  2800320.56575
5   3651738.16251 | 27  2834783.54408 | 49  2813039.87599
6   3652176.69231 | 28  2847012.41472 | 50  3445143.01839
7   3665415.80602 | 29  2860633.91304 | 51  3666918.01281
8   3665045.16555 | 30  2857903.5786  | 52  3059929.2709
9   3671023.2401  | 31  2835589.98963 | 53  2831515.21739
10  3669532.23274 | 32  2862827.88706 | 54  3451804.07204
11  3666160.37749 | 33  2871855.96696 | 55  3654975.92385
12  3674951.44813 | 34  3434456.44816 | 56  3676198.3188
13  3667447.57331 | 35  3656918.54177 | 57  3684740.85619
14  3018846.0503  | 36  3596921.16722 | 58  3060958.8594
15  2792773.84505 | 37  3603460.63903 | 59  2828874.57191
16  3430596.3602  | 38  3595410.87666 | 60  3459926.11027
17  3660525.85806 | 39  3604250.17819 | 61  3685444.47599
18  3045627.69054 | 40  3596542.28428 | 62  3049959.0809
19  2841542.94177 | 41  3600705.16054 | 63  2806280.04013
20  2830475.97348 | 42  3019833.71191 | 64  3448494.3913
21  2845655.55789 | 43  2752951.93264 |
22  3450389.84365 | 44  2753107.27164 |

The data show that performance is poor when xsk_kick_thr is relatively
small, and that it becomes irregular and unstable once the value is
greater than 13. The results look similar from 3 to 13, so I chose 8 as
the default value.

The test environment is qemu + vhost-net. I modified vhost-net to drop
the packets sent by the vm directly, so that the vm can reach higher cpu
usage. Without this change, the cpu usage of the processes in the vm and
of softirqd is too low, and there is no obvious difference in the test
data.

During the test, the softirq cpu usage reached 100%, and the vhost
process on the host also reached 100% cpu. Each xsk_kick_thr value was
run for 300s; the pps was recorded every second and the average was
taken.

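To make the batching rule concrete, here is a small userspace model of how many kicks a batch would produce for a given threshold. It is a sketch under assumptions: `model_count_kicks` is a hypothetical name, every kick is assumed to result in a notification, and the pending-descriptor path of the real function is ignored.

```c
/*
 * Hypothetical model of the kick-by-threshold rule; not driver code.
 * Returns the number of kicks issued while sending `packets` packets.
 */
static int model_count_kicks(int packets, int kick_thr)
{
	int need_kick = 0;	/* packets queued since the last kick */
	int kicks = 0;

	for (int i = 0; i < packets; i++) {
		++need_kick;
		/* Same comparison as in the patch: strictly greater than. */
		if (need_kick > kick_thr) {
			++kicks;
			need_kick = 0;
		}
	}

	/* Flush whatever is still queued at the end of the batch. */
	if (need_kick)
		++kicks;

	return kicks;
}
```

Because the comparison is strictly greater than, a kick fires once kick_thr + 1 packets are pending, so in this model a threshold of 8 groups packets in nines.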
Signed-off-by: Xuan Zhuo Reviewed-by: Dust Li --- drivers/net/virtio_net.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index c441d6bf1510..4e360bfc2cf0 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -28,9 +28,11 @@ static int napi_weight = NAPI_POLL_WEIGHT; module_param(napi_weight, int, 0444); static bool csum = true, gso = true, napi_tx = true; +static int xsk_kick_thr = 8; module_param(csum, bool, 0444); module_param(gso, bool, 0444); module_param(napi_tx, bool, 0644); +module_param(xsk_kick_thr, int, 0644); /* FIXME: MTU in config. */ #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN) @@ -2690,6 +2692,7 @@ static int virtnet_xsk_xmit_batch(struct send_queue *sq, struct xdp_desc desc; int err, packet = 0; int ret = -EAGAIN; + int need_kick = 0; if (sq->xsk.last_desc.addr) { if (sq->vq->num_free < 2 + MAX_SKB_FRAGS) @@ -2700,6 +2703,7 @@ static int virtnet_xsk_xmit_batch(struct send_queue *sq, return -EBUSY; ++packet; + ++need_kick; --budget; sq->xsk.last_desc.addr = 0; } @@ -2723,11 +2727,22 @@ static int virtnet_xsk_xmit_batch(struct send_queue *sq, } ++packet; + ++need_kick; + if (need_kick > xsk_kick_thr) { + if (virtqueue_kick_prepare(sq->vq) && + virtqueue_notify(sq->vq)) + ++stats->kicks; + + need_kick = 0; + } } if (packet) { - if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) - ++stats->kicks; + if (need_kick) { + if (virtqueue_kick_prepare(sq->vq) && + virtqueue_notify(sq->vq)) + ++stats->kicks; + } *done = packet; stats->xdp_tx += packet;