From patchwork Sat Jan 16 02:59:22 2021
X-Patchwork-Submitter: Xuan Zhuo
X-Patchwork-Id: 365421
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Cc: "Michael S. Tsirkin", Jason Wang, "David S. Miller", Jakub Kicinski,
    Björn Töpel, Magnus Karlsson, Jonathan Lemon, Alexei Starovoitov,
    Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
    Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh,
    virtualization@lists.linux-foundation.org, bpf@vger.kernel.org
Subject: [PATCH net-next v2 1/7] xsk: support get page for drv
Date: Sat, 16 Jan 2021 10:59:22 +0800
Message-Id: <8df6697163e7074c59b0d9c6fbf8d07e820ae988.1610765285.git.xuanzhuo@linux.alibaba.com>

For some drivers, such as virtio-net, DMA is not configured when binding
an xsk; the page is looked up at send time instead. This patch adds a
field, need_dma, to the XDP_SETUP_XSK_POOL request issued during pool
setup. A device that does not use DMA should set it to false, which makes
the core skip its DMA-mapping check. A helper, xsk_buff_raw_get_page(),
is added so that such a driver can get the page backing a given addr.
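[ Editor's note: a minimal sketch of how a driver that opts out of DMA
  mapping might consume the two new hooks. The mydrv_* names and helper
  bodies are hypothetical; only need_dma, xsk_buff_raw_get_page() and
  xsk_buff_raw_get_data() come from this series. ]

	static int mydrv_bpf(struct net_device *dev, struct netdev_bpf *bpf)
	{
		switch (bpf->command) {
		case XDP_SETUP_XSK_POOL:
			/* Tell the core not to expect pool->dma_pages. */
			bpf->xsk.need_dma = false;
			return mydrv_xsk_setup(dev, bpf->xsk.pool,
					       bpf->xsk.queue_id);
		default:
			return -EINVAL;
		}
	}

	static void mydrv_xmit_one(struct mydrv_queue *q, struct xdp_desc *desc)
	{
		/* Page-based lookup instead of xsk_buff_raw_get_dma(). */
		struct page *page = xsk_buff_raw_get_page(q->pool, desc->addr);
		void *va = xsk_buff_raw_get_data(q->pool, desc->addr);

		mydrv_queue_buffer(q, page, offset_in_page(va), desc->len);
	}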
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 include/linux/netdevice.h   |  1 +
 include/net/xdp_sock_drv.h  | 10 ++++++++++
 include/net/xsk_buff_pool.h |  1 +
 net/xdp/xsk_buff_pool.c     | 10 +++++++++-
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 5b94907..b452ade 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -914,6 +914,7 @@ struct netdev_bpf {
 		struct {
 			struct xsk_buff_pool *pool;
 			u16 queue_id;
+			bool need_dma;
 		} xsk;
 	};
 };
diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 4e295541..e9c7e25 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -100,6 +100,11 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
 	return xp_raw_get_data(pool, addr);
 }
 
+static inline struct page *xsk_buff_raw_get_page(struct xsk_buff_pool *pool, u64 addr)
+{
+	return xp_raw_get_page(pool, addr);
+}
+
 static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool)
 {
 	struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp);
@@ -232,6 +237,11 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
 	return NULL;
 }
 
+static inline struct page *xsk_buff_raw_get_page(struct xsk_buff_pool *pool, u64 addr)
+{
+	return NULL;
+}
+
 static inline void xsk_buff_dma_sync_for_cpu(struct xdp_buff *xdp, struct xsk_buff_pool *pool)
 {
 }
diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index eaa8386..2dcfa54 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -108,6 +108,7 @@ int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev,
 bool xp_can_alloc(struct xsk_buff_pool *pool, u32 count);
 void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr);
 dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr);
+struct page *xp_raw_get_page(struct xsk_buff_pool *pool, u64 addr);
 static inline dma_addr_t xp_get_dma(struct xdp_buff_xsk *xskb)
 {
 	return xskb->dma;
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index 20598ee..6d0cc9f 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -166,12 +166,13 @@ static int __xp_assign_dev(struct xsk_buff_pool *pool,
 	bpf.command = XDP_SETUP_XSK_POOL;
 	bpf.xsk.pool = pool;
 	bpf.xsk.queue_id = queue_id;
+	bpf.xsk.need_dma = true;
 
 	err = netdev->netdev_ops->ndo_bpf(netdev, &bpf);
 	if (err)
 		goto err_unreg_pool;
 
-	if (!pool->dma_pages) {
+	if (bpf.xsk.need_dma && !pool->dma_pages) {
 		WARN(1, "Driver did not DMA map zero-copy buffers");
 		err = -EINVAL;
 		goto err_unreg_xsk;
@@ -535,6 +536,13 @@ void *xp_raw_get_data(struct xsk_buff_pool *pool, u64 addr)
 }
 EXPORT_SYMBOL(xp_raw_get_data);
 
+struct page *xp_raw_get_page(struct xsk_buff_pool *pool, u64 addr)
+{
+	addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
+	return pool->umem->pgs[addr >> PAGE_SHIFT];
+}
+EXPORT_SYMBOL(xp_raw_get_page);
+
 dma_addr_t xp_raw_get_dma(struct xsk_buff_pool *pool, u64 addr)
 {
 	addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
From patchwork Sat Jan 16 02:59:24 2021
X-Patchwork-Submitter: Xuan Zhuo
X-Patchwork-Id: 365005
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Subject: [PATCH net-next v2 3/7] xsk, virtio-net: prepare for support xsk zerocopy xmit
Date: Sat, 16 Jan 2021 10:59:24 +0800

Split free_old_xmit_skbs(): the new sub-function __free_old_xmit_ptr()
hands the packet and byte counts back to the caller, which makes it
convenient to combine with other statistics, and it takes the
'xsk_wakeup' parameter that xsk processing will need. Also factor the
netif stop check out of start_xmit() into virtnet_sq_stop_check(), which
will be reused when xsk support is added.
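[ Editor's note: a sketch of the resulting call pattern, not part of the
  diff. Paths that keep their own counters call the new sub-function
  directly, the skb path keeps the old wrapper, and both transmit paths
  now share one stop/start helper. ]

	unsigned int packets = 0, bytes = 0;

	/* e.g. virtnet_xdp_xmit(): reclaim completed buffers, keep counts */
	__free_old_xmit_ptr(sq, false, true, &packets, &bytes);

	/* skb paths: old behaviour, counts folded into sq->stats */
	free_old_xmit_skbs(sq, true);

	/* start_xmit() and, later, the xsk path: shared stop/start logic */
	virtnet_sq_stop_check(sq, false);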
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 95 ++++++++++++++++++++++++++----------------------
 1 file changed, 52 insertions(+), 43 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e707c31..9013328 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -262,6 +262,11 @@ struct padded_vnet_hdr {
 	char padding[4];
 };
 
+static void __free_old_xmit_ptr(struct send_queue *sq, bool in_napi,
+				bool xsk_wakeup,
+				unsigned int *_packets, unsigned int *_bytes);
+static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi);
+
 static bool is_xdp_frame(void *ptr)
 {
 	return (unsigned long)ptr & VIRTIO_XDP_FLAG;
@@ -375,6 +380,37 @@ static void skb_xmit_done(struct virtqueue *vq)
 		netif_wake_subqueue(vi->dev, vq2txq(vq));
 }
 
+static void virtnet_sq_stop_check(struct send_queue *sq, bool in_napi)
+{
+	struct virtnet_info *vi = sq->vq->vdev->priv;
+	struct net_device *dev = vi->dev;
+	int qnum = sq - vi->sq;
+
+	/* If running out of space, stop queue to avoid getting packets that we
+	 * are then unable to transmit.
+	 * An alternative would be to force queuing layer to requeue the skb by
+	 * returning NETDEV_TX_BUSY. However, NETDEV_TX_BUSY should not be
+	 * returned in a normal path of operation: it means that driver is not
+	 * maintaining the TX queue stop/start state properly, and causes
+	 * the stack to do a non-trivial amount of useless work.
+	 * Since most packets only take 1 or 2 ring slots, stopping the queue
+	 * early means 16 slots are typically wasted.
+	 */
+	if (sq->vq->num_free < 2 + MAX_SKB_FRAGS) {
+		netif_stop_subqueue(dev, qnum);
+		if (!sq->napi.weight &&
+		    unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
+			/* More just got used, free them then recheck. */
+			free_old_xmit_skbs(sq, in_napi);
+			if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS) {
+				netif_start_subqueue(dev, qnum);
+				virtqueue_disable_cb(sq->vq);
+			}
+		}
+	}
+}
+
 #define MRG_CTX_HEADER_SHIFT 22
 static void *mergeable_len_to_ctx(unsigned int truesize,
 				  unsigned int headroom)
@@ -522,13 +558,11 @@ static int virtnet_xdp_xmit(struct net_device *dev,
 	struct receive_queue *rq = vi->rq;
 	struct bpf_prog *xdp_prog;
 	struct send_queue *sq;
-	unsigned int len;
 	int packets = 0;
 	int bytes = 0;
 	int drops = 0;
 	int kicks = 0;
 	int ret, err;
-	void *ptr;
 	int i;
 
 	/* Only allow ndo_xdp_xmit if XDP is loaded on dev, as this
@@ -546,24 +580,7 @@ static int virtnet_xdp_xmit(struct net_device *dev,
 		goto out;
 	}
 
-	/* Free up any pending old buffers before queueing new ones. */
-	while ((ptr = virtqueue_get_buf(sq->vq, &len)) != NULL) {
-		if (likely(is_xdp_frame(ptr))) {
-			struct virtnet_xdp_type *xtype;
-			struct xdp_frame *frame;
-
-			xtype = ptr_to_xtype(ptr);
-			frame = xtype_get_ptr(xtype);
-			bytes += frame->len;
-			xdp_return_frame(frame);
-		} else {
-			struct sk_buff *skb = ptr;
-
-			bytes += skb->len;
-			napi_consume_skb(skb, false);
-		}
-		packets++;
-	}
+	__free_old_xmit_ptr(sq, false, true, &packets, &bytes);
 
 	for (i = 0; i < n; i++) {
 		struct xdp_frame *xdpf = frames[i];
@@ -1400,7 +1417,9 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
 	return stats.packets;
 }
 
-static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
+static void __free_old_xmit_ptr(struct send_queue *sq, bool in_napi,
+				bool xsk_wakeup,
+				unsigned int *_packets, unsigned int *_bytes)
 {
 	unsigned int packets = 0;
 	unsigned int bytes = 0;
@@ -1434,6 +1453,17 @@ static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
 		packets++;
 	}
 
+	*_packets = packets;
+	*_bytes = bytes;
+}
+
+static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
+{
+	unsigned int packets = 0;
+	unsigned int bytes = 0;
+
+	__free_old_xmit_ptr(sq, in_napi, true, &packets, &bytes);
+
 	/* Avoid overhead when no packets have been processed
 	 * happens when called speculatively from start_xmit.
 	 */
@@ -1649,28 +1679,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		nf_reset_ct(skb);
 	}
 
-	/* If running out of space, stop queue to avoid getting packets that we
-	 * are then unable to transmit.
-	 * An alternative would be to force queuing layer to requeue the skb by
-	 * returning NETDEV_TX_BUSY. However, NETDEV_TX_BUSY should not be
-	 * returned in a normal path of operation: it means that driver is not
-	 * maintaining the TX queue stop/start state properly, and causes
-	 * the stack to do a non-trivial amount of useless work.
-	 * Since most packets only take 1 or 2 ring slots, stopping the queue
-	 * early means 16 slots are typically wasted.
-	 */
-	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
-		netif_stop_subqueue(dev, qnum);
-		if (!use_napi &&
-		    unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
-			/* More just got used, free them then recheck. */
-			free_old_xmit_skbs(sq, false);
-			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
-				netif_start_subqueue(dev, qnum);
-				virtqueue_disable_cb(sq->vq);
-			}
-		}
-	}
+	virtnet_sq_stop_check(sq, false);
 
 	if (kick || netif_xmit_stopped(txq)) {
 		if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) {

From patchwork Sat Jan 16 02:59:25 2021
X-Patchwork-Submitter: Xuan Zhuo
X-Patchwork-Id: 365418
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Subject: [PATCH net-next v2 4/7] virtio-net, xsk: support xsk enable/disable
Date: Sat, 16 Jan 2021 10:59:25 +0800

On enable, a number of struct virtnet_xsk_hdr entries are allocated to
hold the state of each packet and its virtio hdr. This number is bounded
by the module parameters below. When every virtnet_xsk_hdr is in use, or
sq->vq->num_free of virtio-net is too small, the device is considered
busy.

* xsk_num_max: upper bound on the number of xsk hdr entries
* xsk_num_percent: upper bound as a percentage of the virtio ring size;
  the actual hdr count is the minimum of xsk_num_max and this percentage
  of the ring size (see the worked example below)
* xsk_budget: the budget for one xsk run
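[ Editor's note: a worked example of the sizing rule with the default
  parameters (xsk_num_max = 1024, xsk_num_percent = 80); the ring sizes
  are illustrative only. ]

	n = virtqueue_get_vring_size(sq->vq);
	n = min(xsk_num_max, n * (xsk_num_percent % 100) / 100);

	/* ring size  256: n = min(1024,  256 * 80 / 100) =  204
	 * ring size 4096: n = min(1024, 4096 * 80 / 100) = 1024
	 * Small rings are bounded by the percentage, large rings by
	 * xsk_num_max.
	 */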
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9013328..a62d456 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -22,10 +22,19 @@
 #include <net/route.h>
 #include <net/xdp.h>
 #include <net/net_failover.h>
+#include <net/xdp_sock_drv.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
 
+static int xsk_num_max = 1024;
+static int xsk_num_percent = 80;
+static int xsk_budget = 128;
+
+module_param(xsk_num_max, int, 0644);
+module_param(xsk_num_percent, int, 0644);
+module_param(xsk_budget, int, 0644);
+
 static bool csum = true, gso = true, napi_tx = true;
 module_param(csum, bool, 0444);
 module_param(gso, bool, 0444);
@@ -149,6 +158,15 @@ struct send_queue {
 	struct virtnet_sq_stats stats;
 
 	struct napi_struct napi;
+
+	struct {
+		struct xsk_buff_pool __rcu *pool;
+		struct virtnet_xsk_hdr __rcu *hdr;
+
+		u64 hdr_con;
+		u64 hdr_pro;
+		u64 hdr_n;
+	} xsk;
 };
 
 /* Internal representation of a receive virtqueue */
@@ -2540,11 +2558,90 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 	return err;
 }
 
+static int virtnet_xsk_pool_enable(struct net_device *dev,
+				   struct xsk_buff_pool *pool,
+				   u16 qid)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	struct send_queue *sq = &vi->sq[qid];
+	struct virtnet_xsk_hdr *hdr;
+	int n, ret = 0;
+
+	if (qid >= dev->real_num_rx_queues || qid >= dev->real_num_tx_queues)
+		return -EINVAL;
+
+	if (qid >= vi->curr_queue_pairs)
+		return -EINVAL;
+
+	rcu_read_lock();
+
+	ret = -EBUSY;
+	if (rcu_dereference(sq->xsk.pool))
+		goto end;
+
+	/* check last xsk wait for hdr been free */
+	if (rcu_dereference(sq->xsk.hdr))
+		goto end;
+
+	n = virtqueue_get_vring_size(sq->vq);
+	n = min(xsk_num_max, n * (xsk_num_percent % 100) / 100);
+
+	ret = -ENOMEM;
+	hdr = kcalloc(n, sizeof(struct virtnet_xsk_hdr), GFP_ATOMIC);
+	if (!hdr)
+		goto end;
+
+	memset(&sq->xsk, 0, sizeof(sq->xsk));
+
+	sq->xsk.hdr_pro = n;
+	sq->xsk.hdr_n = n;
+
+	rcu_assign_pointer(sq->xsk.pool, pool);
+	rcu_assign_pointer(sq->xsk.hdr, hdr);
+
+	ret = 0;
+end:
+	rcu_read_unlock();
+
+	return ret;
+}
+
+static int virtnet_xsk_pool_disable(struct net_device *dev, u16 qid)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	struct send_queue *sq = &vi->sq[qid];
+	struct virtnet_xsk_hdr *hdr = NULL;
+
+	if (qid >= dev->real_num_rx_queues || qid >= dev->real_num_tx_queues)
+		return -EINVAL;
+
+	if (qid >= vi->curr_queue_pairs)
+		return -EINVAL;
+
+	rcu_assign_pointer(sq->xsk.pool, NULL);
+
+	if (sq->xsk.hdr_pro - sq->xsk.hdr_con == sq->xsk.hdr_n)
+		hdr = rcu_replace_pointer(sq->xsk.hdr, hdr, true);
+
+	synchronize_rcu(); /* Sync with the XSK wakeup and with NAPI. */
+
+	kfree(hdr);
+
+	return 0;
+}
+
 static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 {
 	switch (xdp->command) {
 	case XDP_SETUP_PROG:
 		return virtnet_xdp_set(dev, xdp->prog, xdp->extack);
+	case XDP_SETUP_XSK_POOL:
+		xdp->xsk.need_dma = false;
+		if (xdp->xsk.pool)
+			return virtnet_xsk_pool_enable(dev, xdp->xsk.pool,
+						       xdp->xsk.queue_id);
+		else
+			return virtnet_xsk_pool_disable(dev, xdp->xsk.queue_id);
 	default:
 		return -EINVAL;
 	}

From patchwork Sat Jan 16 02:59:26 2021
X-Patchwork-Submitter: Xuan Zhuo
X-Patchwork-Id: 365419
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Subject: [PATCH net-next v2 5/7] virtio-net, xsk: realize the function of xsk packet sending
Date: Sat, 16 Jan 2021 10:59:26 +0800
Message-Id: <9e1f5a4b633887ce1f66e39bc762b8497a379a43.1610765285.git.xuanzhuo@linux.alibaba.com>

virtnet_xsk_run() is called from virtnet_poll_tx(), the tx interrupt
handling function. The send path gets descriptors from the xsk tx queue
and assembles them for transmission. Compared with other drivers, the
special point is that the pages of the xsk data are used here instead of
DMA addresses, because the virtio interface does not take DMA addresses
(a trace of the scatter-gather sizing follows below).
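[ Editor's note: a trace of the scatter-gather sizing in
  virtnet_xsk_xmit() below, assuming PAGE_SIZE = 4096, desc->len = 9000
  and the descriptor's data starting 3000 bytes into its first page. ]

	n = 2;                      /* hdr slot + first data page          */
	m = 9000 - (4096 - 3000);   /* 7904 bytes beyond the first page    */
	n += 7904 >> PAGE_SHIFT;    /* one more full page  -> n = 3        */
	/* 7904 & PAGE_MASK != 0 */ /* plus a partial tail -> n = 4        */

	/* The copy loop then fills sg entries 1..3 with 1096, 4096 and
	 * 3808 bytes, each backed by the page returned from
	 * xsk_buff_raw_get_page(pool, addr + copied).
	 */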
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 200 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 197 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a62d456..42aa9ad 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -119,6 +119,8 @@ struct virtnet_xsk_hdr {
 	u32 len;
 };
 
+#define VIRTNET_STATE_XSK_WAKEUP 1
+
 #define VIRTNET_SQ_STAT(m)	offsetof(struct virtnet_sq_stats, m)
 #define VIRTNET_RQ_STAT(m)	offsetof(struct virtnet_rq_stats, m)
 
@@ -163,9 +165,12 @@ struct send_queue {
 		struct xsk_buff_pool __rcu *pool;
 		struct virtnet_xsk_hdr __rcu *hdr;
 
+		unsigned long state;
 		u64 hdr_con;
 		u64 hdr_pro;
 		u64 hdr_n;
+		struct xdp_desc last_desc;
+		bool wait_slot;
 	} xsk;
 };
 
@@ -284,6 +289,8 @@ static void __free_old_xmit_ptr(struct send_queue *sq, bool in_napi,
 				bool xsk_wakeup,
 				unsigned int *_packets, unsigned int *_bytes);
 static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi);
+static int virtnet_xsk_run(struct send_queue *sq,
+			   struct xsk_buff_pool *pool, int budget);
 
 static bool is_xdp_frame(void *ptr)
 {
@@ -1590,6 +1597,8 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
 	struct virtnet_info *vi = sq->vq->vdev->priv;
 	unsigned int index = vq2txq(sq->vq);
 	struct netdev_queue *txq;
+	struct xsk_buff_pool *pool;
+	int work = 0;
 
 	if (unlikely(is_xdp_raw_buffer_queue(vi, index))) {
 		/* We don't need to enable cb for XDP */
@@ -1599,15 +1608,26 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
 
 	txq = netdev_get_tx_queue(vi->dev, index);
 	__netif_tx_lock(txq, raw_smp_processor_id());
-	free_old_xmit_skbs(sq, true);
+
+	rcu_read_lock();
+	pool = rcu_dereference(sq->xsk.pool);
+	if (pool) {
+		work = virtnet_xsk_run(sq, pool, budget);
+		rcu_read_unlock();
+	} else {
+		rcu_read_unlock();
+		free_old_xmit_skbs(sq, true);
+	}
+
 	__netif_tx_unlock(txq);
 
-	virtqueue_napi_complete(napi, sq->vq, 0);
+	if (work < budget)
+		virtqueue_napi_complete(napi, sq->vq, 0);
 
 	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
 		netif_tx_wake_queue(txq);
 
-	return 0;
+	return work;
 }
 
 static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
@@ -2647,6 +2667,180 @@ static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 	}
 }
 
+static int virtnet_xsk_xmit(struct send_queue *sq, struct xsk_buff_pool *pool,
+			    struct xdp_desc *desc)
+{
+	struct virtnet_info *vi = sq->vq->vdev->priv;
+	void *data, *ptr;
+	struct page *page;
+	struct virtnet_xsk_hdr *xskhdr;
+	u32 idx, offset, n, i, copy, copied;
+	u64 addr;
+	int err, m;
+
+	addr = desc->addr;
+
+	data = xsk_buff_raw_get_data(pool, addr);
+	offset = offset_in_page(data);
+
+	/* one for hdr, one for the first page */
+	n = 2;
+	m = desc->len - (PAGE_SIZE - offset);
+	if (m > 0) {
+		n += m >> PAGE_SHIFT;
+		if (m & PAGE_MASK)
+			++n;
+
+		n = min_t(u32, n, ARRAY_SIZE(sq->sg));
+	}
+
+	idx = sq->xsk.hdr_con % sq->xsk.hdr_n;
+	xskhdr = &sq->xsk.hdr[idx];
+
+	/* xskhdr->hdr has been memset to zero, so not need to clear again */
+
+	sg_init_table(sq->sg, n);
+	sg_set_buf(sq->sg, &xskhdr->hdr, vi->hdr_len);
+
+	copied = 0;
+	for (i = 1; i < n; ++i) {
+		copy = min_t(int, desc->len - copied, PAGE_SIZE - offset);
+
+		page = xsk_buff_raw_get_page(pool, addr + copied);
+
+		sg_set_page(sq->sg + i, page, copy, offset);
+		copied += copy;
+		if (offset)
+			offset = 0;
+	}
+
+	xskhdr->len = desc->len;
+	ptr = xdp_to_ptr(&xskhdr->type);
+
+	err = virtqueue_add_outbuf(sq->vq, sq->sg, n, ptr, GFP_ATOMIC);
+	if (unlikely(err))
+		sq->xsk.last_desc = *desc;
+	else
+		sq->xsk.hdr_con++;
+
+	return err;
+}
+
+static bool virtnet_xsk_dev_is_full(struct send_queue *sq)
+{
+	if (sq->vq->num_free < 2 + MAX_SKB_FRAGS)
+		return true;
+
+	if (sq->xsk.hdr_con == sq->xsk.hdr_pro)
+		return true;
+
+	return false;
+}
+
+static int virtnet_xsk_xmit_zc(struct send_queue *sq,
+			       struct xsk_buff_pool *pool, unsigned int budget)
+{
+	struct xdp_desc desc;
+	int err, packet = 0;
+	int ret = -EAGAIN;
+
+	if (sq->xsk.last_desc.addr) {
+		err = virtnet_xsk_xmit(sq, pool, &sq->xsk.last_desc);
+		if (unlikely(err))
+			return -EBUSY;
+
+		++packet;
+		sq->xsk.last_desc.addr = 0;
+	}
+
+	while (budget-- > 0) {
+		if (virtnet_xsk_dev_is_full(sq)) {
+			ret = -EBUSY;
+			break;
+		}
+
+		if (!xsk_tx_peek_desc(pool, &desc)) {
+			/* done */
+			ret = 0;
+			break;
+		}
+
+		err = virtnet_xsk_xmit(sq, pool, &desc);
+		if (unlikely(err)) {
+			ret = -EBUSY;
+			break;
+		}
+
+		++packet;
+	}
+
+	if (packet) {
+		xsk_tx_release(pool);
+
+		if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) {
+			u64_stats_update_begin(&sq->stats.syncp);
+			sq->stats.kicks++;
+			u64_stats_update_end(&sq->stats.syncp);
+		}
+	}
+
+	return ret;
+}
+
+static int virtnet_xsk_run(struct send_queue *sq,
+			   struct xsk_buff_pool *pool, int budget)
+{
+	int err, ret = 0;
+	unsigned int _packets = 0;
+	unsigned int _bytes = 0;
+
+	sq->xsk.wait_slot = false;
+
+	__free_old_xmit_ptr(sq, true, false, &_packets, &_bytes);
+
+	err = virtnet_xsk_xmit_zc(sq, pool, xsk_budget);
+	if (!err) {
+		struct xdp_desc desc;
+
+		clear_bit(VIRTNET_STATE_XSK_WAKEUP, &sq->xsk.state);
+		xsk_set_tx_need_wakeup(pool);
+
+		/* Race breaker. If new is coming after last xmit
+		 * but before flag change
+		 */
+		if (!xsk_tx_peek_desc(pool, &desc))
+			goto end;
+
+		set_bit(VIRTNET_STATE_XSK_WAKEUP, &sq->xsk.state);
+		xsk_clear_tx_need_wakeup(pool);
+
+		sq->xsk.last_desc = desc;
+		ret = budget;
+		goto end;
+	}
+
+	xsk_clear_tx_need_wakeup(pool);
+
+	if (err == -EAGAIN) {
+		ret = budget;
+		goto end;
+	}
+
+	__free_old_xmit_ptr(sq, true, false, &_packets, &_bytes);
+
+	if (!virtnet_xsk_dev_is_full(sq)) {
+		ret = budget;
+		goto end;
+	}
+
+	sq->xsk.wait_slot = true;
+
+	virtnet_sq_stop_check(sq, true);
+end:
+	return ret;
+}
+
 static int virtnet_get_phys_port_name(struct net_device *dev, char *buf,
 				      size_t len)
 {
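[ Editor's note: the clear/set dance in virtnet_xsk_run() above is the
  standard need_wakeup race breaker, shown here in isolation rather than
  as new code. After enabling the wakeup flag the driver must peek the tx
  ring once more, otherwise a descriptor queued between the last send and
  the flag change would sit until the next unrelated kick. ]

	clear_bit(VIRTNET_STATE_XSK_WAKEUP, &sq->xsk.state);
	xsk_set_tx_need_wakeup(pool);	/* from now on userspace kicks us */

	if (xsk_tx_peek_desc(pool, &desc)) {
		/* something slipped in: undo and resume from this desc */
		set_bit(VIRTNET_STATE_XSK_WAKEUP, &sq->xsk.state);
		xsk_clear_tx_need_wakeup(pool);
		sq->xsk.last_desc = desc;
	}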
From patchwork Sat Jan 16 02:59:27 2021
X-Patchwork-Submitter: Xuan Zhuo
X-Patchwork-Id: 365420
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Subject: [PATCH net-next v2 6/7] virtio-net, xsk: implement xsk wakeup callback
Date: Sat, 16 Jan 2021 10:59:27 +0800
Message-Id: <2abdfb0b319d4075b68d50d2be9f441b75735e64.1610765285.git.xuanzhuo@linux.alibaba.com>

Since there is no interface to directly ask virtio to generate a tx
interrupt, the wakeup callback sends some of the pending data itself,
which triggers a new tx interrupt. A side benefit is lower transmission
latency: there is no need to wait for a tx interrupt before the softirq
starts.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 42aa9ad..e552c2d 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2841,6 +2841,56 @@ static int virtnet_xsk_run(struct send_queue *sq,
 	return ret;
 }
 
+static int virtnet_xsk_wakeup(struct net_device *dev, u32 qid, u32 flag)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	struct send_queue *sq;
+	struct xsk_buff_pool *pool;
+	struct netdev_queue *txq;
+
+	if (!netif_running(dev))
+		return -ENETDOWN;
+
+	if (qid >= vi->curr_queue_pairs)
+		return -EINVAL;
+
+	sq = &vi->sq[qid];
+
+	rcu_read_lock();
+
+	pool = rcu_dereference(sq->xsk.pool);
+	if (!pool)
+		goto end;
+
+	if (test_and_set_bit(VIRTNET_STATE_XSK_WAKEUP, &sq->xsk.state))
+		goto end;
+
+	txq = netdev_get_tx_queue(dev, qid);
+
+	local_bh_disable();
+	__netif_tx_lock(txq, raw_smp_processor_id());
+
+	/* Send part of the package directly to reduce the delay in sending the
+	 * package, and this can actively trigger the tx interrupts.
+	 *
+	 * If the package is not processed, then continue processing in the
+	 * subsequent tx interrupt(virtnet_poll_tx).
+	 *
+	 * If no packet is sent out, the ring of the device is full. In this
+	 * case, we will still get a tx interrupt response. Then we will deal
+	 * with the subsequent packet sending work.
+	 */
+	virtnet_xsk_run(sq, pool, xsk_budget);
+
+	__netif_tx_unlock(txq);
+	local_bh_enable();
+
+end:
+	rcu_read_unlock();
+	return 0;
+}
+
 static int virtnet_get_phys_port_name(struct net_device *dev, char *buf,
 				      size_t len)
 {
@@ -2895,6 +2945,7 @@ static int virtnet_set_features(struct net_device *dev,
 	.ndo_vlan_rx_kill_vid	= virtnet_vlan_rx_kill_vid,
 	.ndo_bpf		= virtnet_xdp,
 	.ndo_xdp_xmit		= virtnet_xdp_xmit,
+	.ndo_xsk_wakeup		= virtnet_xsk_wakeup,
 	.ndo_features_check	= passthru_features_check,
 	.ndo_get_phys_port_name	= virtnet_get_phys_port_name,
 	.ndo_set_features	= virtnet_set_features,
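[ Editor's note: the userspace half of this contract, sketched with the
  libbpf AF_XDP helpers of that era (<bpf/xsk.h>); the function name and
  error handling are illustrative. When the driver sets the tx
  need_wakeup flag, the application triggers virtnet_xsk_wakeup() via an
  empty sendto() on the xsk fd. ]

#include <sys/socket.h>
#include <bpf/xsk.h>

static void kick_tx(struct xsk_socket *xsk, struct xsk_ring_prod *tx)
{
	if (xsk_ring_prod__needs_wakeup(tx))
		/* EAGAIN/EBUSY here simply mean "try again later" */
		sendto(xsk_socket__fd(xsk), NULL, 0, MSG_DONTWAIT, NULL, 0);
}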