From patchwork Fri Dec 25 07:24:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: wangyunjian X-Patchwork-Id: 352533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0C4BC433E6 for ; Fri, 25 Dec 2020 07:25:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AF5FF230FF for ; Fri, 25 Dec 2020 07:25:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726322AbgLYHZY (ORCPT ); Fri, 25 Dec 2020 02:25:24 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:9926 "EHLO szxga06-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725787AbgLYHZX (ORCPT ); Fri, 25 Dec 2020 02:25:23 -0500 Received: from DGGEMS412-HUB.china.huawei.com (unknown [172.30.72.59]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4D2JMw64gjzhs3n; Fri, 25 Dec 2020 15:23:56 +0800 (CST) Received: from localhost (10.174.243.127) by DGGEMS412-HUB.china.huawei.com (10.3.19.212) with Microsoft SMTP Server id 14.3.498.0; Fri, 25 Dec 2020 15:24:26 +0800 From: wangyunjian To: , , , CC: , , , , , Yunjian Wang Subject: [PATCH net v5 1/2] vhost_net: fix ubuf refcount incorrectly when sendmsg fails Date: Fri, 25 Dec 2020 15:24:25 +0800 Message-ID: <1608881065-7572-1-git-send-email-wangyunjian@huawei.com> X-Mailer: git-send-email 1.9.5.msysgit.1 MIME-Version: 1.0 X-Originating-IP: [10.174.243.127] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Yunjian Wang Currently the vhost_zerocopy_callback() maybe be called to decrease the refcount when sendmsg fails in tun. The error handling in vhost handle_tx_zerocopy() will try to decrease the same refcount again. This is wrong. To fix this issue, we only call vhost_net_ubuf_put() when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS. Fixes: 0690899b4d45 ("tun: experimental zero copy tx support") Signed-off-by: Yunjian Wang Acked-by: Willem de Bruijn Acked-by: Michael S. Tsirkin Acked-by: Jason Wang --- drivers/vhost/net.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 531a00d703cd..c8784dfafdd7 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -863,6 +863,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) size_t len, total_len = 0; int err; struct vhost_net_ubuf_ref *ubufs; + struct ubuf_info *ubuf; bool zcopy_used; int sent_pkts = 0; @@ -895,9 +896,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) /* use msg_control to pass vhost zerocopy ubuf info to skb */ if (zcopy_used) { - struct ubuf_info *ubuf; ubuf = nvq->ubuf_info + nvq->upend_idx; - vq->heads[nvq->upend_idx].id = cpu_to_vhost32(vq, head); vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS; ubuf->callback = vhost_zerocopy_callback; @@ -927,7 +926,8 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) err = sock->ops->sendmsg(sock, &msg, len); if (unlikely(err < 0)) { if (zcopy_used) { - vhost_net_ubuf_put(ubufs); + if (vq->heads[ubuf->desc].len == VHOST_DMA_IN_PROGRESS) + vhost_net_ubuf_put(ubufs); nvq->upend_idx = ((unsigned)nvq->upend_idx - 1) % UIO_MAXIOV; } From patchwork Fri Dec 25 07:24:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: wangyunjian X-Patchwork-Id: 352450 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1428C433DB for ; Fri, 25 Dec 2020 07:25:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8520F2312F for ; Fri, 25 Dec 2020 07:25:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727971AbgLYHZY (ORCPT ); Fri, 25 Dec 2020 02:25:24 -0500 Received: from szxga06-in.huawei.com ([45.249.212.32]:9927 "EHLO szxga06-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725873AbgLYHZX (ORCPT ); Fri, 25 Dec 2020 02:25:23 -0500 Received: from DGGEMS402-HUB.china.huawei.com (unknown [172.30.72.60]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4D2JN51JP4zhv3M; Fri, 25 Dec 2020 15:24:05 +0800 (CST) Received: from localhost (10.174.243.127) by DGGEMS402-HUB.china.huawei.com (10.3.19.202) with Microsoft SMTP Server id 14.3.498.0; Fri, 25 Dec 2020 15:24:34 +0800 From: wangyunjian To: , , , CC: , , , , , Yunjian Wang Subject: [PATCH net v5 2/2] vhost_net: fix tx queue stuck when sendmsg fails Date: Fri, 25 Dec 2020 15:24:33 +0800 Message-ID: <1608881073-19004-1-git-send-email-wangyunjian@huawei.com> X-Mailer: git-send-email 1.9.5.msysgit.1 MIME-Version: 1.0 X-Originating-IP: [10.174.243.127] X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Yunjian Wang Currently the driver doesn't drop a packet which can't be sent by tun (e.g bad packet). In this case, the driver will always process the same packet lead to the tx queue stuck. To fix this issue: 1. in the case of persistent failure (e.g bad packet), the driver can skip this descriptor by ignoring the error. 2. in the case of transient failure (e.g -ENOBUFS, -EAGAIN and -ENOMEM), the driver schedules the worker to try again. Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server") Signed-off-by: Yunjian Wang Acked-by: Willem de Bruijn --- drivers/vhost/net.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index c8784dfafdd7..01558fb2c552 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock) msg.msg_flags &= ~MSG_MORE; } - /* TODO: Check specific error and bomb out unless ENOBUFS? */ err = sock->ops->sendmsg(sock, &msg, len); - if (unlikely(err < 0)) { + if (unlikely(err == -EAGAIN || err == -ENOMEM || err == -ENOBUFS)) { vhost_discard_vq_desc(vq, 1); vhost_net_enable_vq(net, vq); break; } - if (err != len) + if (err >= 0 && err != len) pr_debug("Truncated TX packet: len %d != %zd\n", err, len); done: @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) msg.msg_flags &= ~MSG_MORE; } - /* TODO: Check specific error and bomb out unless ENOBUFS? */ err = sock->ops->sendmsg(sock, &msg, len); if (unlikely(err < 0)) { if (zcopy_used) { @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) nvq->upend_idx = ((unsigned)nvq->upend_idx - 1) % UIO_MAXIOV; } - vhost_discard_vq_desc(vq, 1); - vhost_net_enable_vq(net, vq); - break; + if (err == -EAGAIN || err == -ENOMEM || err == -ENOBUFS) { + vhost_discard_vq_desc(vq, 1); + vhost_net_enable_vq(net, vq); + break; + } } - if (err != len) + if (err >= 0 && err != len) pr_debug("Truncated TX packet: " " len %d != %zd\n", err, len); if (!zcopy_used)