From patchwork Tue Jul 13 08:46:40 2021
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 477355
From: Xie Yongji
To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com,
 sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org,
 christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org,
 viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net,
 mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org,
 gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com
Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org,
 netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Subject: [PATCH v9 01/17] iova: Export alloc_iova_fast() and free_iova_fast()
Date: Tue, 13 Jul 2021 16:46:40 +0800
Message-Id: <20210713084656.232-2-xieyongji@bytedance.com>
In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com>
References: <20210713084656.232-1-xieyongji@bytedance.com>
X-Mailing-List: netdev@vger.kernel.org

Export alloc_iova_fast() and free_iova_fast() so that modules can use
them to improve iova allocation efficiency.

Signed-off-by: Xie Yongji
---
 drivers/iommu/iova.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index b6cf5f16123b..3941ed6bb99b 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -521,6 +521,7 @@ alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
 
 	return new_iova->pfn_lo;
 }
+EXPORT_SYMBOL_GPL(alloc_iova_fast);
 
 /**
  * free_iova_fast - free iova pfn range into rcache
@@ -538,6 +539,7 @@ free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long size)
 
 	free_iova(iovad, pfn);
 }
+EXPORT_SYMBOL_GPL(free_iova_fast);
 
 #define fq_ring_for_each(i, fq) \
 	for ((i) = (fq)->head; (i) != (fq)->tail; (i) = ((i) + 1) % IOVA_FQ_SIZE)

From patchwork Tue Jul 13 08:46:41 2021
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 475102
From: Xie Yongji
Subject: [PATCH v9 02/17] file: Export receive_fd() to modules
Date: Tue, 13 Jul 2021 16:46:41 +0800
Message-Id: <20210713084656.232-3-xieyongji@bytedance.com>
In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com>
References: <20210713084656.232-1-xieyongji@bytedance.com>
X-Mailing-List: netdev@vger.kernel.org

Export receive_fd() so that modules can use it to pass file descriptors
between processes without skipping any of the security checks.

Signed-off-by: Xie Yongji
---
 fs/file.c            | 6 ++++++
 include/linux/file.h | 7 +++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 86dc9956af32..210e540672aa 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -1134,6 +1134,12 @@ int receive_fd_replace(int new_fd, struct file *file, unsigned int o_flags)
 	return new_fd;
 }
 
+int receive_fd(struct file *file, unsigned int o_flags)
+{
+	return __receive_fd(file, NULL, o_flags);
+}
+EXPORT_SYMBOL_GPL(receive_fd);
+
 static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags)
 {
 	int err = -EBADF;
diff --git a/include/linux/file.h b/include/linux/file.h
index 2de2e4613d7b..51e830b4fe3a 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -94,6 +94,9 @@ extern void fd_install(unsigned int fd, struct file *file);
 
 extern int __receive_fd(struct file *file, int __user *ufd,
 			unsigned int o_flags);
+
+extern int receive_fd(struct file *file, unsigned int o_flags);
+
 static inline int receive_fd_user(struct file *file, int __user *ufd,
 				  unsigned int o_flags)
 {
@@ -101,10 +104,6 @@ static inline int receive_fd_user(struct file *file, int __user *ufd,
 		return -EFAULT;
 	return __receive_fd(file, ufd, o_flags);
 }
-static inline int receive_fd(struct file *file, unsigned int o_flags)
-{
-	return __receive_fd(file, NULL, o_flags);
-}
 int receive_fd_replace(int new_fd, struct file *file, unsigned int o_flags);
 
 extern void flush_delayed_fput(void);

From patchwork Tue Jul 13 08:46:42 2021
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 475101
From: Xie Yongji
Subject: [PATCH v9 03/17] vdpa: Fix code indentation
Date: Tue, 13 Jul 2021 16:46:42 +0800
Message-Id: <20210713084656.232-4-xieyongji@bytedance.com>
In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com>
References: <20210713084656.232-1-xieyongji@bytedance.com>
X-Mailing-List: netdev@vger.kernel.org

Use tabs to indent the code instead of spaces.

Signed-off-by: Xie Yongji
---
 include/linux/vdpa.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 7c49bc5a2b71..f822490db584 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -342,25 +342,25 @@ static inline struct device *vdpa_get_dma_dev(struct vdpa_device *vdev)
 
 static inline void vdpa_reset(struct vdpa_device *vdev)
 {
-        const struct vdpa_config_ops *ops = vdev->config;
+	const struct vdpa_config_ops *ops = vdev->config;
 
 	vdev->features_valid = false;
-        ops->set_status(vdev, 0);
+	ops->set_status(vdev, 0);
 }
 
 static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)
 {
-        const struct vdpa_config_ops *ops = vdev->config;
+	const struct vdpa_config_ops *ops = vdev->config;
 
 	vdev->features_valid = true;
-        return ops->set_features(vdev, features);
+	return ops->set_features(vdev, features);
 }
 
 
 static inline void vdpa_get_config(struct vdpa_device *vdev, unsigned offset,
 				   void *buf, unsigned int len)
 {
-        const struct vdpa_config_ops *ops = vdev->config;
+	const struct vdpa_config_ops *ops = vdev->config;
 
 	/*
 	 * Config accesses aren't supposed to trigger before features are set.
From patchwork Tue Jul 13 08:46:43 2021
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 477354
From: Xie Yongji
Subject: [PATCH v9 04/17] vdpa: Fail the vdpa_reset() if fail to set device status to zero
Date: Tue, 13 Jul 2021 16:46:43 +0800
Message-Id: <20210713084656.232-5-xieyongji@bytedance.com>
In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com>
References: <20210713084656.232-1-xieyongji@bytedance.com>
X-Mailing-List: netdev@vger.kernel.org

Re-read the device status after a reset to ensure that it has been set
to zero. If it is still non-zero when the timeout expires, fail
vdpa_reset().

Signed-off-by: Xie Yongji
---
 include/linux/vdpa.h | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index f822490db584..198c30e84b5d 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * struct vdpa_calllback - vDPA callback definition.
@@ -340,12 +341,24 @@ static inline struct device *vdpa_get_dma_dev(struct vdpa_device *vdev)
 	return vdev->dma_dev;
 }
 
-static inline void vdpa_reset(struct vdpa_device *vdev)
+#define VDPA_RESET_TIMEOUT_MS 1000
+
+static inline int vdpa_reset(struct vdpa_device *vdev)
 {
 	const struct vdpa_config_ops *ops = vdev->config;
+	int timeout = 0;
 
 	vdev->features_valid = false;
 	ops->set_status(vdev, 0);
+	while (ops->get_status(vdev)) {
+		timeout += 20;
+		if (timeout > VDPA_RESET_TIMEOUT_MS)
+			return -EIO;
+
+		msleep(20);
+	}
+
+	return 0;
 }
 
 static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)

From patchwork Tue Jul 13 08:46:44 2021
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 477353
From: Xie Yongji
Subject: [PATCH v9 05/17] vhost-vdpa: Fail the vhost_vdpa_set_status() on reset failure
Date: Tue, 13 Jul 2021 16:46:44 +0800
Message-Id: <20210713084656.232-6-xieyongji@bytedance.com>
In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com>
References: <20210713084656.232-1-xieyongji@bytedance.com>
X-Mailing-List: netdev@vger.kernel.org

Re-read the device status after a reset to ensure that it has been set
to zero. If it is still non-zero when the timeout expires, fail
vhost_vdpa_set_status().

Signed-off-by: Xie Yongji
---
 drivers/vhost/vdpa.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index bb374a801bda..62b6d911c57d 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -157,7 +157,7 @@ static long vhost_vdpa_set_status(struct vhost_vdpa *v, u8 __user *statusp)
 	struct vdpa_device *vdpa = v->vdpa;
 	const struct vdpa_config_ops *ops = vdpa->config;
 	u8 status, status_old;
-	int nvqs = v->nvqs;
+	int timeout = 0, nvqs = v->nvqs;
 	u16 i;
 
 	if (copy_from_user(&status, statusp, sizeof(status)))
@@ -173,6 +173,15 @@ static long vhost_vdpa_set_status(struct vhost_vdpa *v, u8 __user *statusp)
 		return -EINVAL;
 
 	ops->set_status(vdpa, status);
+	if (status == 0) {
+		while (ops->get_status(vdpa)) {
+			timeout += 20;
+			if (timeout > VDPA_RESET_TIMEOUT_MS)
+				return -EIO;
+
+			msleep(20);
+		}
+	}
 
 	if ((status & VIRTIO_CONFIG_S_DRIVER_OK) && !(status_old & VIRTIO_CONFIG_S_DRIVER_OK))
 		for (i = 0; i < nvqs; i++)

From patchwork Tue Jul 13 08:46:45 2021
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 475100
From: Xie Yongji
Subject: [PATCH v9 06/17] vhost-vdpa: Handle the failure of vdpa_reset()
Date: Tue, 13 Jul 2021 16:46:45 +0800
Message-Id: <20210713084656.232-7-xieyongji@bytedance.com>
In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com>
References: <20210713084656.232-1-xieyongji@bytedance.com>
X-Mailing-List: netdev@vger.kernel.org

vdpa_reset() may now fail. Check its return value and fail
vhost_vdpa_open() on error.

Signed-off-by: Xie Yongji
---
 drivers/vhost/vdpa.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 62b6d911c57d..8615756306ec 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -116,12 +116,13 @@ static void vhost_vdpa_unsetup_vq_irq(struct vhost_vdpa *v, u16 qid)
 	irq_bypass_unregister_producer(&vq->call_ctx.producer);
 }
 
-static void vhost_vdpa_reset(struct vhost_vdpa *v)
+static int vhost_vdpa_reset(struct vhost_vdpa *v)
 {
 	struct vdpa_device *vdpa = v->vdpa;
 
-	vdpa_reset(vdpa);
 	v->in_batch = 0;
+
+	return vdpa_reset(vdpa);
 }
 
 static long vhost_vdpa_get_device_id(struct vhost_vdpa *v, u8 __user *argp)
@@ -871,7 +872,9 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
 		return -EBUSY;
 
 	nvqs = v->nvqs;
-	vhost_vdpa_reset(v);
+	r = vhost_vdpa_reset(v);
+	if (r)
+		goto err;
 
 	vqs = kmalloc_array(nvqs, sizeof(*vqs), GFP_KERNEL);
 	if (!vqs) {

From patchwork Tue Jul 13 08:46:46 2021
X-Patchwork-Submitter: Yongji Xie
X-Patchwork-Id: 477352
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234945AbhGMIum (ORCPT ); Tue, 13 Jul 2021 04:50:42
-0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50258 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235021AbhGMIui (ORCPT ); Tue, 13 Jul 2021 04:50:38 -0400 Received: from mail-pj1-x1030.google.com (mail-pj1-x1030.google.com [IPv6:2607:f8b0:4864:20::1030]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D178C0613EE for ; Tue, 13 Jul 2021 01:47:49 -0700 (PDT) Received: by mail-pj1-x1030.google.com with SMTP id p9so11712591pjl.3 for ; Tue, 13 Jul 2021 01:47:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=EjR09xl/s+Zlx+uuKDPZmG38VH07hTVA8QsJdDWGoMM=; b=0VNyBgTY7yxOchYvhwFlmk9k2udQOWCUUMpM8nhTR7Bk4VZtxT4ZI8f98TzkKevJk5 dYLlzzFa5wsVMHn13aVEoDurgoQ/B97Q+JhhlkS6OR7p2ukAzqbqaIA6heoBr2mvaW2u LWKNdvE1D06+3O4nambhTYRGp9OSBjhVsvc+sS1fRJwI0tKIi1LcEHcgiIroKNvO+Tk9 Y9lhU4YFRflDnAO1ytSChBfqW/0Xf2PueeNxk3B7oGv7K2ZJPXdQOsXRxJSUBz/FAPuC yq81H3J4EOgBB4At7Bf6X83H6Bjq4AOEWbBPDSkXpZtUwXWHYKXE5SKnGfz6IjR2oj55 XrbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=EjR09xl/s+Zlx+uuKDPZmG38VH07hTVA8QsJdDWGoMM=; b=emmCio/EAEBs3SE1D2z/lXJsSfdUcNiQiTxhyCDhPHCF4w/JhNwNGDXLZojOcNfsuv 3gEejXrRAd6AwYoqKj49usdsnLto+csYtbsyj4nkrXspiDWcbk4BJKp8I1S1poPqdlCp SuNzBua82mpMjeFE3X/YfUxDZjRHkvEReQl4Rmiyjm1qEqdijrplRzUk/XExq3RNvMQU O9stZaQ86YicwRg5cwCkNT24AUyjtT/jK7eXOgYSSnKs3tMuh6sfkIKBP5wy3LmetR+f a0LEyRnkLW5TTKXutHU4F2a72CSChVDkvNi7MPUQIZtvN8wAT7FUhQPDFFk2ZPtJtSkG 8Ccw== X-Gm-Message-State: AOAM530PcUSTH4ZdyAMwNd+ExaHSAk8AO3OFIIHXvmgBlOflkeYdk54A cCBzxQWXJwAsNtx+NbX0gwYp X-Google-Smtp-Source: ABdhPJw5LSlu0ju1WyptJVUqPCDp3E+vXWICCBRXlUH6oKTubBRS+RfF2LpFzJEBJMPewe5KBLVzVw== X-Received: by 
2002:a17:902:7610:b029:12b:f9f:727 with SMTP id k16-20020a1709027610b029012b0f9f0727mr2734371pll.65.1626166068835; Tue, 13 Jul 2021 01:47:48 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id i12sm9336715pjj.9.2021.07.13.01.47.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:48 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 07/17] virtio: Don't set FAILED status bit on device index allocation failure Date: Tue, 13 Jul 2021 16:46:46 +0800 Message-Id: <20210713084656.232-8-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org We don't need to set FAILED status bit on device index allocation failure since the device initialization hasn't been started yet. Signed-off-by: Xie Yongji --- drivers/virtio/virtio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index 4b15c00c0a0a..a15beb6b593b 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -338,7 +338,7 @@ int register_virtio_device(struct virtio_device *dev) /* Assign a unique device index and hence name. 
*/ err = ida_simple_get(&virtio_index_ida, 0, 0, GFP_KERNEL); if (err < 0) - goto out; + return err; dev->index = err; dev_set_name(&dev->dev, "virtio%u", dev->index); From patchwork Tue Jul 13 08:46:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 475099 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 082A8C11F68 for ; Tue, 13 Jul 2021 08:48:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EA2786101D for ; Tue, 13 Jul 2021 08:48:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235207AbhGMIuv (ORCPT ); Tue, 13 Jul 2021 04:50:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235103AbhGMIum (ORCPT ); Tue, 13 Jul 2021 04:50:42 -0400 Received: from mail-pf1-x433.google.com (mail-pf1-x433.google.com [IPv6:2607:f8b0:4864:20::433]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 16E92C0613DD for ; Tue, 13 Jul 2021 01:47:53 -0700 (PDT) Received: by mail-pf1-x433.google.com with SMTP id m83so11256319pfd.0 for ; Tue, 13 Jul 2021 01:47:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; 
bh=2fP/MAmI7gUka5e/DKUQ2mjQlxJxe0/QRKPw5yBwyTk=; b=rpVEQUV1X70OmbrrQTJJKvk5vCPqC4uyyQYqM8mgOg9WDN/eKRObLhREIgC9AhlQtG eUZsoe1ZtDcx9fRxn5K5cTmakEDk81uPA+xScHZGg3Tqu81c80BChtmTqJDJYMUUT2FX I8vD0sFKqshhBkFve+LFV2d7iGakKJ6UO++oA6/QFcNuQGOsSjYmiPWVYrZl3U10Hqmm N6egKhnknsyBTg02r1Hz+54X/yG9BUJz2HuqTsK2D2UErp3uOrJUQlLV7QFYJqAnHhic z3Jy2j8PSswSLNBDJg5a+jUolEiDfEm0gqiykb9N4UFR//SYSeYkWWUQ83/UgrjRRwug POjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2fP/MAmI7gUka5e/DKUQ2mjQlxJxe0/QRKPw5yBwyTk=; b=TH7ziiXnDoDVR+BgfP7uZMHS2Xnr1yTxmTSq6Rjgj17EDOUVEK4A9+ug6LkyPAVEvi JNNkMW3UA4EpMJIKNuVhwtRGbqh/xYMtjSpWNdTgAZzsG1FsjHHXHRa9rVfACaxb9Zmh VD46mLdXnT7pAHW/dyNxz2/k79P4o/8CLRQYXJ/wqRPeewGoF6+E5UWKsB0tL2EdLSYL Tc7n3VWNxie8xGMlxecwr6jP7mEWqBrMxVsVDXH+AVVZnNJQbr37R+8wV3rxYib9XcH0 rUq/P7/tZW1rIM3sN7NH1P47kIlZp/gtPRZm3b+nvCjOMybj9SIIa7+Jsy2QN6W7Is62 USEw== X-Gm-Message-State: AOAM5329mSkKzx4Sq8Jnaa8nQrYmx2RyjozT8a31NDt76sjhUyVUGz/5 L334OvTaavgeukg9/wTd9uxF X-Google-Smtp-Source: ABdhPJyBkUB9QNAx3I8j5SvXv3BBPI0eQfjjVmRO0YZO2mJNlMqzj7TJp+Nb22Xnek5hLerTt0F38A== X-Received: by 2002:a65:5c89:: with SMTP id a9mr3285855pgt.207.1626166072648; Tue, 13 Jul 2021 01:47:52 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id e2sm21544494pgh.5.2021.07.13.01.47.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:52 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: 
songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 08/17] virtio_config: Add a return value to reset function Date: Tue, 13 Jul 2021 16:46:47 +0800 Message-Id: <20210713084656.232-9-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This adds a return value to reset function so that we can handle the reset failure later. No functional changes. Signed-off-by: Xie Yongji --- arch/um/drivers/virtio_uml.c | 4 +++- drivers/platform/mellanox/mlxbf-tmfifo.c | 4 +++- drivers/remoteproc/remoteproc_virtio.c | 4 +++- drivers/s390/virtio/virtio_ccw.c | 6 ++++-- drivers/virtio/virtio_pci_legacy.c | 4 +++- drivers/virtio/virtio_pci_modern.c | 4 +++- drivers/virtio/virtio_vdpa.c | 4 +++- include/linux/virtio_config.h | 3 ++- 8 files changed, 24 insertions(+), 9 deletions(-) diff --git a/arch/um/drivers/virtio_uml.c b/arch/um/drivers/virtio_uml.c index 4412d6febade..ca02deaf9b32 100644 --- a/arch/um/drivers/virtio_uml.c +++ b/arch/um/drivers/virtio_uml.c @@ -828,11 +828,13 @@ static void vu_set_status(struct virtio_device *vdev, u8 status) vu_dev->status = status; } -static void vu_reset(struct virtio_device *vdev) +static int vu_reset(struct virtio_device *vdev) { struct virtio_uml_device *vu_dev = to_virtio_uml_device(vdev); vu_dev->status = 0; + + return 0; } static void vu_del_vq(struct virtqueue *vq) diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c b/drivers/platform/mellanox/mlxbf-tmfifo.c index 38800e86ed8a..e3c513c2d4fa 100644 --- a/drivers/platform/mellanox/mlxbf-tmfifo.c +++ b/drivers/platform/mellanox/mlxbf-tmfifo.c @@ -989,11 +989,13 @@ static void mlxbf_tmfifo_virtio_set_status(struct 
virtio_device *vdev, } /* Reset the device. Not much here for now. */ -static void mlxbf_tmfifo_virtio_reset(struct virtio_device *vdev) +static int mlxbf_tmfifo_virtio_reset(struct virtio_device *vdev) { struct mlxbf_tmfifo_vdev *tm_vdev = mlxbf_vdev_to_tmfifo(vdev); tm_vdev->status = 0; + + return 0; } /* Read the value of a configuration field. */ diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c index cf4d54e98e6a..975c845b3187 100644 --- a/drivers/remoteproc/remoteproc_virtio.c +++ b/drivers/remoteproc/remoteproc_virtio.c @@ -191,7 +191,7 @@ static void rproc_virtio_set_status(struct virtio_device *vdev, u8 status) dev_dbg(&vdev->dev, "status: %d\n", status); } -static void rproc_virtio_reset(struct virtio_device *vdev) +static int rproc_virtio_reset(struct virtio_device *vdev) { struct rproc_vdev *rvdev = vdev_to_rvdev(vdev); struct fw_rsc_vdev *rsc; @@ -200,6 +200,8 @@ static void rproc_virtio_reset(struct virtio_device *vdev) rsc->status = 0; dev_dbg(&vdev->dev, "reset !\n"); + + return 0; } /* provide the vdev features as retrieved from the firmware */ diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c index d35e7a3f7067..5221cdad531d 100644 --- a/drivers/s390/virtio/virtio_ccw.c +++ b/drivers/s390/virtio/virtio_ccw.c @@ -710,14 +710,14 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs, return ret; } -static void virtio_ccw_reset(struct virtio_device *vdev) +static int virtio_ccw_reset(struct virtio_device *vdev) { struct virtio_ccw_device *vcdev = to_vc_device(vdev); struct ccw1 *ccw; ccw = ccw_device_dma_zalloc(vcdev->cdev, sizeof(*ccw)); if (!ccw) - return; + return -ENOMEM; /* Zero status bits. 
*/ vcdev->dma_area->status = 0; @@ -729,6 +729,8 @@ static void virtio_ccw_reset(struct virtio_device *vdev) ccw->cda = 0; ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_RESET); ccw_device_dma_free(vcdev->cdev, ccw, sizeof(*ccw)); + + return 0; } static u64 virtio_ccw_get_features(struct virtio_device *vdev) diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c index d62e9835aeec..0b5d95e3efa1 100644 --- a/drivers/virtio/virtio_pci_legacy.c +++ b/drivers/virtio/virtio_pci_legacy.c @@ -89,7 +89,7 @@ static void vp_set_status(struct virtio_device *vdev, u8 status) iowrite8(status, vp_dev->ioaddr + VIRTIO_PCI_STATUS); } -static void vp_reset(struct virtio_device *vdev) +static int vp_reset(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); /* 0 status means a reset. */ @@ -99,6 +99,8 @@ static void vp_reset(struct virtio_device *vdev) ioread8(vp_dev->ioaddr + VIRTIO_PCI_STATUS); /* Flush pending VQ/configuration callbacks. */ vp_synchronize_vectors(vdev); + + return 0; } static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector) diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index 30654d3a0b41..b0cde3b2f0ff 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -158,7 +158,7 @@ static void vp_set_status(struct virtio_device *vdev, u8 status) vp_modern_set_status(&vp_dev->mdev, status); } -static void vp_reset(struct virtio_device *vdev) +static int vp_reset(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); struct virtio_pci_modern_device *mdev = &vp_dev->mdev; @@ -174,6 +174,8 @@ static void vp_reset(struct virtio_device *vdev) msleep(1); /* Flush pending VQ/configuration callbacks. 
*/ vp_synchronize_vectors(vdev); + + return 0; } static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector) diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c index ff43f9b62b2f..3e666f70e829 100644 --- a/drivers/virtio/virtio_vdpa.c +++ b/drivers/virtio/virtio_vdpa.c @@ -97,11 +97,13 @@ static void virtio_vdpa_set_status(struct virtio_device *vdev, u8 status) return ops->set_status(vdpa, status); } -static void virtio_vdpa_reset(struct virtio_device *vdev) +static int virtio_vdpa_reset(struct virtio_device *vdev) { struct vdpa_device *vdpa = vd_get_vdpa(vdev); vdpa_reset(vdpa); + + return 0; } static bool virtio_vdpa_notify(struct virtqueue *vq) diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index 8519b3ae5d52..203407992c30 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -47,6 +47,7 @@ struct virtio_shm_region { * After this, status and feature negotiation must be done again * Device must not be reset from its vq/config callbacks, or in * parallel with being added/removed. + * Returns 0 on success or error status * @find_vqs: find virtqueues and instantiate them. 
* vdev: the virtio_device * nvqs: the number of virtqueues to find @@ -82,7 +83,7 @@ struct virtio_config_ops { u32 (*generation)(struct virtio_device *vdev); u8 (*get_status)(struct virtio_device *vdev); void (*set_status)(struct virtio_device *vdev, u8 status); - void (*reset)(struct virtio_device *vdev); + int (*reset)(struct virtio_device *vdev); int (*find_vqs)(struct virtio_device *, unsigned nvqs, struct virtqueue *vqs[], vq_callback_t *callbacks[], const char * const names[], const bool *ctx, From patchwork Tue Jul 13 08:46:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 477351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 318E1C11F66 for ; Tue, 13 Jul 2021 08:48:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1F064613B9 for ; Tue, 13 Jul 2021 08:48:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235188AbhGMIvC (ORCPT ); Tue, 13 Jul 2021 04:51:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235157AbhGMIut (ORCPT ); Tue, 13 Jul 2021 04:50:49 -0400 Received: from mail-pf1-x42e.google.com (mail-pf1-x42e.google.com [IPv6:2607:f8b0:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8D329C0613AA for ; Tue, 13 Jul 2021 01:47:56 -0700 (PDT) Received: by 
mail-pf1-x42e.google.com with SMTP id b12so18897699pfv.6 for ; Tue, 13 Jul 2021 01:47:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=79+lwfgqPqTMCDJYOtelsxZhwWsCxwv7dGGvPGe+0Sw=; b=XPYV6AqgUbZp6+vdppARBpAYCPqTsziLqNwvNsrjLuNKQsh6GS3q19GuTFueLqfI3W 2mpioxtWWYo5lHGzcp+Jutobs4GmHGUTQiIvyRJOrhRaSYEd2g13fYjJdx8IqQbpVtg+ U5M3F2NmfDR26UxFi92Yzg5HX/fYV6AYnZdE6d7ugOH+Y9vAV4e5UbOMWrmsG3AqiEzX NUPQt8B7lzQDca2Wt8iUkhnTLPaw8kJJmv71Rv4l+iNrRzjIvV51afp0ikEMovNJIa3w l3GoRQ485RyPX6RthpBauL0J2/vlNzokMcWZNI71JmewCwtjndkIJDGd4VS80mquizHp zreQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=79+lwfgqPqTMCDJYOtelsxZhwWsCxwv7dGGvPGe+0Sw=; b=N3vL0uB5LqBpEgCo0AE/HhMcu7n5BxXCAYt6AMGdVDTqu3c6U6sAFJacUh6I+yBS8Y VtZebNXatOpDCXsZTShJTt7i92Cn2QId39sfCkFj5g0yNOT97Xd0ugtmoSxr2br6o2++ QraUa+Ohjv1vsTKgh8rlu/NqBQ9aEKTACSxU+9MmAYR01hDwoI4/O255UoZWV1BvivKU 7/mMRxg6kpIGQHFgyeYimEVIZNvKVLG3UX4M0qvNi0YndIGQwsYE0oodMa52w1nes2B3 V2jQYca+BCfYt3nz+5yxdB1UUwg9Cxdo63kMn6mFPiwHPyvVkowVKj7OSh1/Lk0arbry zCPQ== X-Gm-Message-State: AOAM530Bsr62AWf1E4I54EoNbAFcRE2muc/jVcNbo/kGHdcrW7j9pVba wRa1jnQyvHtReZeoI/x45JRj X-Google-Smtp-Source: ABdhPJya/3d83k8xmEEz7mxN/KfDFTYEWfRDpJqmURKX4t0jIlbAdO7JalzXWyaKhiHPR7hvAQh/zQ== X-Received: by 2002:aa7:808b:0:b029:2ef:cdd4:8297 with SMTP id v11-20020aa7808b0000b02902efcdd48297mr3574498pff.27.1626166076154; Tue, 13 Jul 2021 01:47:56 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id q21sm11607775pff.55.2021.07.13.01.47.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:55 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, 
sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 09/17] virtio-vdpa: Handle the failure of vdpa_reset() Date: Tue, 13 Jul 2021 16:46:48 +0800 Message-Id: <20210713084656.232-10-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org vdpa_reset() may now fail. Check its return value and propagate the failure from virtio_vdpa_reset().
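As a rough illustration of the forwarding described above, the sketch below shows a transport-level reset callback that returns whatever its backend reset returns, reached through a function-pointer ops table. All names here (demo_backend, demo_config_ops, demo_transport_reset and so on) are hypothetical and only loosely modelled on the shape of the patch; the choice of -EIO for the failing case is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical backend whose reset can be made to fail. */
struct demo_backend {
	int fail_reset;		/* non-zero makes the backend reset fail */
	unsigned char status;
};

struct demo_device;

/* Ops table whose reset callback reports errors as an int. */
struct demo_config_ops {
	int (*reset)(struct demo_device *dev);
};

struct demo_device {
	const struct demo_config_ops *config;
	struct demo_backend *backend;
};

/* Backend reset: clears the status only when it succeeds. */
static int demo_backend_reset(struct demo_backend *b)
{
	if (b->fail_reset)
		return -EIO;	/* error code chosen for the sketch */
	b->status = 0;
	return 0;
}

/* Transport callback: forwards the backend result instead of returning 0. */
static int demo_transport_reset(struct demo_device *dev)
{
	assert(dev != NULL && dev->backend != NULL);
	return demo_backend_reset(dev->backend);
}

static const struct demo_config_ops demo_ops = {
	.reset = demo_transport_reset,
};
```

A caller that goes through dev->config->reset(dev) then sees the backend's error directly, so it can decide whether to continue initialisation.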
Signed-off-by: Xie Yongji --- drivers/virtio/virtio_vdpa.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c index 3e666f70e829..ebbd8471bbee 100644 --- a/drivers/virtio/virtio_vdpa.c +++ b/drivers/virtio/virtio_vdpa.c @@ -101,9 +101,7 @@ static int virtio_vdpa_reset(struct virtio_device *vdev) { struct vdpa_device *vdpa = vd_get_vdpa(vdev); - vdpa_reset(vdpa); - - return 0; + return vdpa_reset(vdpa); } static bool virtio_vdpa_notify(struct virtqueue *vq) From patchwork Tue Jul 13 08:46:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 477350 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF8EFC07E96 for ; Tue, 13 Jul 2021 08:48:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BAFA26101D for ; Tue, 13 Jul 2021 08:48:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235314AbhGMIvP (ORCPT ); Tue, 13 Jul 2021 04:51:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50332 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235229AbhGMIvA (ORCPT ); Tue, 13 Jul 2021 04:51:00 -0400 Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7C823C06178A for ; Tue, 13 Jul 2021 01:48:00 
-0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id p17so10208349plf.12 for ; Tue, 13 Jul 2021 01:48:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LsGekY4BvYvIO5Z8AgcJapYB7A/KQEKxuOgJFJnV+F0=; b=Et7FbWZVS2cx+fpTA8oSkjyi22zfg/JVBDFbigj20lktndc6w8VizCfnQbd8DLvnBV VNEhNSeBTNpJFAegw09Rg0mEfMyli4GMRdum0q4qn3U74w57SPrdejSMPmNtu9yMiPCF TiTyNrOHaIEsr8Bx2ZBmhMhJIuXrBF5pSlk5FJueUhyp4hfHn5krzzUtF6Q9Pl85RJaq SKkCpcUcgTDMKntqvaiRWItj7upm0Y9Xhp3HloIdnzr5/LU16LGuAsmTpAo3w7IY0Gn5 K4vArTZXMxMG7swExsxAVe8b+arAc/H+tSgN2OIWI2JOdcmaiAlQmmKNmF3cbmgNWuUL EKrA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LsGekY4BvYvIO5Z8AgcJapYB7A/KQEKxuOgJFJnV+F0=; b=MrQU0ExGZof0SNMqB/+enLqgg+0CExIk29rXbzyVPpIBvv6Ffwu1wIpn0WCk9omtk8 5VM/8eQbObhGTbjDWUk4HHfmu8LQ1QzVG2hJorDbiVl33VHFGu+DaGXMhN6vHHoES8aB ZANYN2m82K2LM9Z7Kod/PVwvU2FnlvQoeA4WXk1vje4qbHP3aHntquhAcM3ZFHZdfsIB 1UBGCmJedjyk2uwOn7caZaeQyM/GAsdmFQXs8JdKuc/X/G0M4YX66llbkW/joKDmL+lC GUUcGyNo41daSwNtX/38emZToXprDdV6a44pDHKtm74YQlSLiP8SbjYH9Hxg76y9WJru C0mA== X-Gm-Message-State: AOAM531f1G0uxRccB4gpRy/AF9dnU1dx32yzItvrjpwEb4QfllIjJtHE WAPsyIT6PW8ChZ0K0GFcb3VF X-Google-Smtp-Source: ABdhPJzFJWinFNSX5sjhEi7EUxUYGzqDtd7WH488DhV7Pgm2CsQABcNDQNUVm8LAg6MjG7Qz0MbUVw== X-Received: by 2002:a17:90a:5b07:: with SMTP id o7mr3372998pji.35.1626166080091; Tue, 13 Jul 2021 01:48:00 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id x10sm2437739pgj.73.2021.07.13.01.47.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:47:59 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, 
parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 10/17] virtio: Handle device reset failure in register_virtio_device() Date: Tue, 13 Jul 2021 16:46:49 +0800 Message-Id: <20210713084656.232-11-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The device reset may fail in virtio-vdpa case now, so add checks to its return value and fail the register_virtio_device(). Signed-off-by: Xie Yongji --- drivers/virtio/virtio.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index a15beb6b593b..8df75425fb43 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -349,7 +349,9 @@ int register_virtio_device(struct virtio_device *dev) /* We always start by resetting the device, in case a previous * driver messed it up. This also tests that code path a little. */ - dev->config->reset(dev); + err = dev->config->reset(dev); + if (err) + goto err_reset; /* Acknowledge that we've seen the device. 
*/ virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE); @@ -362,10 +364,13 @@ int register_virtio_device(struct virtio_device *dev) */ err = device_add(&dev->dev); if (err) - ida_simple_remove(&virtio_index_ida, dev->index); -out: - if (err) - virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED); + goto err_add; + + return 0; +err_add: + virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED); +err_reset: + ida_simple_remove(&virtio_index_ida, dev->index); return err; } EXPORT_SYMBOL_GPL(register_virtio_device); From patchwork Tue Jul 13 08:46:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 475098 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88371C11F69 for ; Tue, 13 Jul 2021 08:48:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 72F7A6101D for ; Tue, 13 Jul 2021 08:48:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235292AbhGMIvN (ORCPT ); Tue, 13 Jul 2021 04:51:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235225AbhGMIvA (ORCPT ); Tue, 13 Jul 2021 04:51:00 -0400 Received: from mail-pj1-x1029.google.com (mail-pj1-x1029.google.com [IPv6:2607:f8b0:4864:20::1029]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 24B99C061794 for ; Tue, 13 Jul 2021 01:48:04 -0700 (PDT) Received: by 
mail-pj1-x1029.google.com with SMTP id oj10-20020a17090b4d8ab0290172f77377ebso1679662pjb.0 for ; Tue, 13 Jul 2021 01:48:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=CGl/RjOOzzXlpDeA69Ku950YCkBGnktZEIYihkRI6vQ=; b=f2iY73K9/iJbrFo2MgGFRZQ2tnpbs3ImXGvgYWrH8jbRwZtYrG0H49iFKE/gsgwKLy wVtyRbuwPA0zRTedHVys3XUYskroarJowEQpt+ZqI5TnDw7Zp23rmxpefDYE2fUT31OK Yx8EzxFOTu/Y54tuXZUybHIRyzZazhpojXPhI62Q8vKEpNQgs633F50M2v6JbT1SWtju /ZtqBhhqTVZGT9Le/4u/OKCsHRigEKlFME1X9DTpvhSS8l1I7+BAFOhoRdXOT1yNydWd YxmuE2ov3/MNlpLC8cpCwId08XpgeT4hgIX6doOocMtwfbp46o2SIwhFi6wDRfTBj+Ar hnBw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=CGl/RjOOzzXlpDeA69Ku950YCkBGnktZEIYihkRI6vQ=; b=eEa+hoHzY22PEk3zI1CiF30B/fcs574SZcROfVnrwDbK+RH20qIGw5AmbO9YPR6jb5 sDQO1Ps0v9r7sxktIt75ANStTnYZl2NLvPowoGpeUZc+J22v6VEV3c6o7+EPKwiMfJy8 xeDmn3riq6+vuuYEOjZykOQl9AzO4cp12llDW0DpXYk5UXyQry6eGfyOY8mtLmJyXFCT tSm+nCvgWBqMEZjYxTZaSvHy9WDFLy0FO/zuXUIunf5QcCJxbABn0NsCMHv/yYD843Hz 03ybV4GyjaLG0IC2s4jLCOjwQ/aZNgDAUdRQrhJDsZg88AJ6GM39j3mIRyjwQCGYzh7b IrmQ== X-Gm-Message-State: AOAM530YAuD4eZggRTiWt1NsOZ0R2DP39VOvQ3Di7My/b9ErBdsFkbrz fWVtMLKvI/rL2ePhKdhhz0msJ7T32/Qy X-Google-Smtp-Source: ABdhPJxyTVjwlLv6HFUW/HmnbuC6yzJVkXeQ5coeCepXZBwWX13HH6cyjAijtoaEpSnWsq9zQv6N4Q== X-Received: by 2002:a17:903:4051:b029:12a:181c:9305 with SMTP id n17-20020a1709034051b029012a181c9305mr2713300pla.25.1626166083725; Tue, 13 Jul 2021 01:48:03 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id s15sm18640281pfw.207.2021.07.13.01.48.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:03 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, 
jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 11/17] vhost-iotlb: Add an opaque pointer for vhost IOTLB Date: Tue, 13 Jul 2021 16:46:50 +0800 Message-Id: <20210713084656.232-12-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add an opaque pointer to each vhost IOTLB mapping, and introduce vhost_iotlb_add_range_ctx() to accept it.
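The pattern described above, a context-aware variant plus a thin wrapper that keeps the old signature, can be sketched in plain C. This is only an illustration under simplifying assumptions: the demo_* names are hypothetical, mappings are kept in a singly linked list rather than the kernel's interval tree, and -EINVAL for an inverted range is a choice made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified mapping record. */
struct demo_map {
	uint64_t start, last, addr;
	unsigned int perm;
	void *opaque;		/* caller-owned context, never dereferenced here */
	struct demo_map *next;
};

struct demo_iotlb {
	struct demo_map *head;
	unsigned int nmaps;
};

/* Context-aware variant: stores the opaque pointer with the new range. */
static int demo_add_range_ctx(struct demo_iotlb *iotlb, uint64_t start,
			      uint64_t last, uint64_t addr,
			      unsigned int perm, void *opaque)
{
	struct demo_map *map;

	assert(iotlb != NULL);
	if (last < start)
		return -EINVAL;	/* error code chosen for the sketch */

	map = malloc(sizeof(*map));
	if (!map)
		return -ENOMEM;

	map->start = start;
	map->last = last;
	map->addr = addr;
	map->perm = perm;
	map->opaque = opaque;
	map->next = iotlb->head;	/* newest mapping goes to the front */
	iotlb->head = map;
	iotlb->nmaps++;
	return 0;
}

/* Old entry point: same signature as before, passes a NULL context. */
static int demo_add_range(struct demo_iotlb *iotlb, uint64_t start,
			  uint64_t last, uint64_t addr, unsigned int perm)
{
	return demo_add_range_ctx(iotlb, start, last, addr, perm, NULL);
}

/* Release all mappings; the opaque pointers stay owned by the caller. */
static void demo_free(struct demo_iotlb *iotlb)
{
	while (iotlb->head) {
		struct demo_map *next = iotlb->head->next;

		free(iotlb->head);
		iotlb->head = next;
	}
	iotlb->nmaps = 0;
}
```

Keeping the original function as a wrapper means existing callers need no changes, while new callers can attach per-mapping context through the _ctx variant.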
Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vhost/iotlb.c | 20 ++++++++++++++++---- include/linux/vhost_iotlb.h | 3 +++ 2 files changed, 19 insertions(+), 4 deletions(-) diff --git a/drivers/vhost/iotlb.c b/drivers/vhost/iotlb.c index 0582079e4bcc..670d56c879e5 100644 --- a/drivers/vhost/iotlb.c +++ b/drivers/vhost/iotlb.c @@ -36,19 +36,21 @@ void vhost_iotlb_map_free(struct vhost_iotlb *iotlb, EXPORT_SYMBOL_GPL(vhost_iotlb_map_free); /** - * vhost_iotlb_add_range - add a new range to vhost IOTLB + * vhost_iotlb_add_range_ctx - add a new range to vhost IOTLB * @iotlb: the IOTLB * @start: start of the IOVA range * @last: last of IOVA range * @addr: the address that is mapped to @start * @perm: access permission of this range + * @opaque: the opaque pointer for the new mapping * * Returns an error last is smaller than start or memory allocation * fails */ -int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, - u64 start, u64 last, - u64 addr, unsigned int perm) +int vhost_iotlb_add_range_ctx(struct vhost_iotlb *iotlb, + u64 start, u64 last, + u64 addr, unsigned int perm, + void *opaque) { struct vhost_iotlb_map *map; @@ -71,6 +73,7 @@ int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, map->last = last; map->addr = addr; map->perm = perm; + map->opaque = opaque; iotlb->nmaps++; vhost_iotlb_itree_insert(map, &iotlb->root); @@ -80,6 +83,15 @@ int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, return 0; } +EXPORT_SYMBOL_GPL(vhost_iotlb_add_range_ctx); + +int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, + u64 start, u64 last, + u64 addr, unsigned int perm) +{ + return vhost_iotlb_add_range_ctx(iotlb, start, last, + addr, perm, NULL); +} EXPORT_SYMBOL_GPL(vhost_iotlb_add_range); /** diff --git a/include/linux/vhost_iotlb.h b/include/linux/vhost_iotlb.h index 6b09b786a762..2d0e2f52f938 100644 --- a/include/linux/vhost_iotlb.h +++ b/include/linux/vhost_iotlb.h @@ -17,6 +17,7 @@ struct vhost_iotlb_map { u32 perm; u32 
flags_padding; u64 __subtree_last; + void *opaque; }; #define VHOST_IOTLB_FLAG_RETIRE 0x1 @@ -29,6 +30,8 @@ struct vhost_iotlb { unsigned int flags; }; +int vhost_iotlb_add_range_ctx(struct vhost_iotlb *iotlb, u64 start, u64 last, + u64 addr, unsigned int perm, void *opaque); int vhost_iotlb_add_range(struct vhost_iotlb *iotlb, u64 start, u64 last, u64 addr, unsigned int perm); void vhost_iotlb_del_range(struct vhost_iotlb *iotlb, u64 start, u64 last); From patchwork Tue Jul 13 08:46:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 475097 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4508C11F66 for ; Tue, 13 Jul 2021 08:48:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D06FD613A9 for ; Tue, 13 Jul 2021 08:48:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235237AbhGMIvS (ORCPT ); Tue, 13 Jul 2021 04:51:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50300 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235238AbhGMIvB (ORCPT ); Tue, 13 Jul 2021 04:51:01 -0400 Received: from mail-pg1-x533.google.com (mail-pg1-x533.google.com [IPv6:2607:f8b0:4864:20::533]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D6E7C0613B9 for ; Tue, 13 Jul 2021 01:48:08 -0700 (PDT) Received: by mail-pg1-x533.google.com with SMTP id a2so21004814pgi.6 for ; 
Tue, 13 Jul 2021 01:48:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tkBBc1IawD+Ffhgxr5n+Z15KbqmeeJxiazdLqHh1xrc=; b=yszZshd37Q1r8rl2w4Lwrj6YFf82n3z6RFM2/AVb64c2vPdKXY1wme59FfefIDnkgr ZeHW2AajsEfK2rxQLc6OuYZkgjbUBXJA+gcejIcjQVScRmmzU//ovNfFgpCztM6evjZb /n+WP+ydd3YlkZ8SQbNriexfCms41b+6n5JyrSnt1kqA40n300sAvayGKIsgDO660Dag xPMXs0Ybb8+DJM4LihYMuZRid7CateX2g6ffuDfJjjN9xcWY2pbOZU1RjT72UvyZtG7E Iu7G8pek5Vi+CTDYAPuzeMfjRbD3yyHp8cITHLj2FoMlmhTbDWiVkQDUq3fyefNyyPld e0XA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tkBBc1IawD+Ffhgxr5n+Z15KbqmeeJxiazdLqHh1xrc=; b=gWeXFohFG8TepYKQFetF0Yyo4zee1CcI+juFeuo9PTBJ6ixiW7OSyjZ7lnE2WrpKNp ZoU4fOiM7Z3V7wB0hhW40/VdnGRZRei66qxl9YbHk07MqYVFoAxO1U7fUp23/7iSLYzA aRTLpL7/CAPJMfzoFWO3nQKmticARxwiFLyGTkpJ+bUgN5Z1dth42ek9g5RqBtTMSFMy FnNL1W5Cykl2ynuls9Wi0mt75TldsjjZ6GlYlp/weLwGJTzOMiZptYiKCbjxOedk8daI /UjrrJp6CwTeiaDax3MSWG3CDL5cg5OyribmIx7zkTjSWOoqMIUJ9KUMz0JRUTV8EFC1 dazw== X-Gm-Message-State: AOAM531YmjFSmdvai6iVjy0dRFGaXkj64DIb8fSqsRubX4j3HDKMJzpG EPP00pqLu/YvmtWGUzFFN0Rv X-Google-Smtp-Source: ABdhPJzZdDWqJmTNf93gwkTO/YfKFCx5ihMhJT+DOlHvR2RbC+0OzE+HpikVZnSzaRW2shba6Z0k9A== X-Received: by 2002:aa7:8704:0:b029:328:c7ca:fe33 with SMTP id b4-20020aa787040000b0290328c7cafe33mr3425253pfo.12.1626166087820; Tue, 13 Jul 2021 01:48:07 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id bf18sm4475167pjb.46.2021.07.13.01.48.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:07 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, 
christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 12/17] vdpa: Add an opaque pointer for vdpa_config_ops.dma_map() Date: Tue, 13 Jul 2021 16:46:51 +0800 Message-Id: <20210713084656.232-13-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add an opaque pointer for DMA mapping. Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vdpa/vdpa_sim/vdpa_sim.c | 6 +++--- drivers/vhost/vdpa.c | 2 +- include/linux/vdpa.h | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c index 72167963421b..f456f4baf86d 100644 --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c @@ -542,14 +542,14 @@ static int vdpasim_set_map(struct vdpa_device *vdpa, } static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size, - u64 pa, u32 perm) + u64 pa, u32 perm, void *opaque) { struct vdpasim *vdpasim = vdpa_to_sim(vdpa); int ret; spin_lock(&vdpasim->iommu_lock); - ret = vhost_iotlb_add_range(vdpasim->iommu, iova, iova + size - 1, pa, - perm); + ret = vhost_iotlb_add_range_ctx(vdpasim->iommu, iova, iova + size - 1, + pa, perm, opaque); spin_unlock(&vdpasim->iommu_lock); return ret; diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 8615756306ec..f60a513dac7c 100644 
--- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -578,7 +578,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, return r; if (ops->dma_map) { - r = ops->dma_map(vdpa, iova, size, pa, perm); + r = ops->dma_map(vdpa, iova, size, pa, perm, NULL); } else if (ops->set_map) { if (!v->in_batch) r = ops->set_map(vdpa, dev->iotlb); diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 198c30e84b5d..4903e67c690b 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -268,7 +268,7 @@ struct vdpa_config_ops { /* DMA ops */ int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb); int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size, - u64 pa, u32 perm); + u64 pa, u32 perm, void *opaque); int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size); /* Free device resources */ From patchwork Tue Jul 13 08:46:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 477349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0972C07E95 for ; Tue, 13 Jul 2021 08:48:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A2111613A9 for ; Tue, 13 Jul 2021 08:48:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235139AbhGMIv2 (ORCPT ); Tue, 13 Jul 2021 04:51:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50278 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S235018AbhGMIvL (ORCPT ); Tue, 13 Jul 2021 04:51:11 -0400 Received: from mail-pf1-x434.google.com (mail-pf1-x434.google.com [IPv6:2607:f8b0:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D78BEC0613A0 for ; Tue, 13 Jul 2021 01:48:11 -0700 (PDT) Received: by mail-pf1-x434.google.com with SMTP id x16so18888851pfa.13 for ; Tue, 13 Jul 2021 01:48:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=QQuejJgNoEGyjo9tRdqGEQx2lQUe8kg/2ppmDnsJJAA=; b=unjSdw/881HgzjyZtUmmNHKre9MjzPnNfw0dZmJe0wi8SfaN2BXtbXsbdgycORb6qy CDTL+Qr9i5T3/weikPBzLhR+DSS6XfPrFZ0iQ/BSm6Rk7qiH5M7kSgYp2MJleNwlyfol vZCl9PV+5+N9XKukRMjOBXgYIcnk22iQiPRO2/DTv6okYz6rw+zj7W7lSU6+476LJVcn 2BPhqvP2u7lIK+Qdy/jXOonjh+hOdl+q/9CQqHqZHN7RjQFYmwWcjl0kk12fG2fS4C/Z Yzb0lI7wdxWXuWtJKO5acsPEWr0+bSE1X1DfnvQd4xh+Mv/SMXS70DwhUJnZZqsTODMW LhnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QQuejJgNoEGyjo9tRdqGEQx2lQUe8kg/2ppmDnsJJAA=; b=MR5TeUGY4m2OJqs+BTX8IQinOhEtXhJw+U4Tv+SW7LyYJyyq3PPT8N9VZEWComMUq0 O42ByXWvMuFpGddGtY9kGaPotq7NC5EiTD4Xuls12z+K503L17n+wPI35sepD9tkhP4O RXzy7TeEhgAi+Txptg4bNzl+leYtipFL++lOCEFlWTBzlRF/T0ACJrtQh+thl+iuF7mT GvquY8jU7gyF/EDwExS2a/1jfJP44kVsHQ8g6iqINCvOm55v3SQPtMFOyP5MjMkzGC6v OVGzca6HGZ5Ielv0lSPmJbpgYSCBjjH83sZaiogkIMYpNcHlgCGdoeuDlp0E6hqjNAkd +Q6w== X-Gm-Message-State: AOAM5306P2ZRi8htMuShKI0BkVnLpXszvxFrLYXZ0Y23VcHG/XSKC1hE meaOI8jVf7H7xmPfYJCZ569h X-Google-Smtp-Source: ABdhPJz3XYFBOb+mfR3r9wlTER8M1X/HEO0osRKjG8afubA8hE7WiCvA7RRSBu5qwBv/Xtfq2SZxsg== X-Received: by 2002:a63:8b41:: with SMTP id j62mr3285495pge.435.1626166091443; Tue, 13 Jul 2021 01:48:11 -0700 (PDT) Received: from localhost ([139.177.225.253]) by 
smtp.gmail.com with ESMTPSA id 2sm21343999pgz.26.2021.07.13.01.48.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:11 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: [PATCH v9 13/17] vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap() Date: Tue, 13 Jul 2021 16:46:52 +0800 Message-Id: <20210713084656.232-14-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The next patch adds support for VA mapping/unmapping, so first factor out the PA mapping/unmapping logic to make the code more readable.
Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vhost/vdpa.c | 53 +++++++++++++++++++++++++++++++++------------------- 1 file changed, 34 insertions(+), 19 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index f60a513dac7c..3e260038beba 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -511,7 +511,7 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, return r; } -static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) +static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last) { struct vhost_dev *dev = &v->vdev; struct vhost_iotlb *iotlb = dev->iotlb; @@ -533,6 +533,11 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) } } +static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) +{ + return vhost_vdpa_pa_unmap(v, start, last); +} + static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v) { struct vhost_dev *dev = &v->vdev; @@ -613,37 +618,28 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) } } -static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, - struct vhost_iotlb_msg *msg) +static int vhost_vdpa_pa_map(struct vhost_vdpa *v, + u64 iova, u64 size, u64 uaddr, u32 perm) { struct vhost_dev *dev = &v->vdev; - struct vhost_iotlb *iotlb = dev->iotlb; struct page **page_list; unsigned long list_size = PAGE_SIZE / sizeof(struct page *); unsigned int gup_flags = FOLL_LONGTERM; unsigned long npages, cur_base, map_pfn, last_pfn = 0; unsigned long lock_limit, sz2pin, nchunks, i; - u64 iova = msg->iova; + u64 start = iova; long pinned; int ret = 0; - if (msg->iova < v->range.first || - msg->iova + msg->size - 1 > v->range.last) - return -EINVAL; - - if (vhost_iotlb_itree_first(iotlb, msg->iova, - msg->iova + msg->size - 1)) - return -EEXIST; - /* Limit the use of memory for bookkeeping */ page_list = (struct page **) __get_free_page(GFP_KERNEL); if (!page_list) return 
-ENOMEM; - if (msg->perm & VHOST_ACCESS_WO) + if (perm & VHOST_ACCESS_WO) gup_flags |= FOLL_WRITE; - npages = PAGE_ALIGN(msg->size + (iova & ~PAGE_MASK)) >> PAGE_SHIFT; + npages = PAGE_ALIGN(size + (iova & ~PAGE_MASK)) >> PAGE_SHIFT; if (!npages) { ret = -EINVAL; goto free; @@ -657,7 +653,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, goto unlock; } - cur_base = msg->uaddr & PAGE_MASK; + cur_base = uaddr & PAGE_MASK; iova &= PAGE_MASK; nchunks = 0; @@ -688,7 +684,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT; ret = vhost_vdpa_map(v, iova, csize, map_pfn << PAGE_SHIFT, - msg->perm); + perm); if (ret) { /* * Unpin the pages that are left unmapped @@ -717,7 +713,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, /* Pin the rest chunk */ ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT, - map_pfn << PAGE_SHIFT, msg->perm); + map_pfn << PAGE_SHIFT, perm); out: if (ret) { if (nchunks) { @@ -736,13 +732,32 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, for (pfn = map_pfn; pfn <= last_pfn; pfn++) unpin_user_page(pfn_to_page(pfn)); } - vhost_vdpa_unmap(v, msg->iova, msg->size); + vhost_vdpa_unmap(v, start, size); } unlock: mmap_read_unlock(dev->mm); free: free_page((unsigned long)page_list); return ret; + +} + +static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, + struct vhost_iotlb_msg *msg) +{ + struct vhost_dev *dev = &v->vdev; + struct vhost_iotlb *iotlb = dev->iotlb; + + if (msg->iova < v->range.first || + msg->iova + msg->size - 1 > v->range.last) + return -EINVAL; + + if (vhost_iotlb_itree_first(iotlb, msg->iova, + msg->iova + msg->size - 1)) + return -EEXIST; + + return vhost_vdpa_pa_map(v, msg->iova, msg->size, msg->uaddr, + msg->perm); } static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, From patchwork Tue Jul 13 08:46:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 475096 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FDBDC11F66 for ; Tue, 13 Jul 2021 08:48:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 74AD460725 for ; Tue, 13 Jul 2021 08:48:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235256AbhGMIvg (ORCPT ); Tue, 13 Jul 2021 04:51:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235278AbhGMIvM (ORCPT ); Tue, 13 Jul 2021 04:51:12 -0400 Received: from mail-pj1-x1030.google.com (mail-pj1-x1030.google.com [IPv6:2607:f8b0:4864:20::1030]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 89D4DC0613AE for ; Tue, 13 Jul 2021 01:48:15 -0700 (PDT) Received: by mail-pj1-x1030.google.com with SMTP id cu14so6409508pjb.0 for ; Tue, 13 Jul 2021 01:48:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=y4ufJe1pqoQtqMP3ixyl76pbDy1hGYJP4XFdOY1YVyY=; b=WQuU/gG7ZfTNI2qZpsTs3nFbw3nGQz58SeTex5OzuWpcg0kawm2pwWU3SS9wmThtD7 8SuQKUzHbZdL+eLavT5+YRBAi0zmAoF3t2E3SoNDTO3c4D0gyq5q5vOJ64II6vfAd4aT 71QQPF9/laUlyzOfQoVzNja3HoTyWs8WAGyHW2tFN9KbReARF+Jsge6Z3XnKKTf4ks8U 
utVVPAsVmo7WGuvvgX6H5ZATzkWQ/yddy7LlXwLGRWxnXXGAB0PPn7iJMOENuO0eZ444 qQrj04dP1ATna4I3q9+iU/DxS0eJuKeELHTy5OsXbFmUU0q1Ngqw9zUHXdbVI6/h072e BWMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=y4ufJe1pqoQtqMP3ixyl76pbDy1hGYJP4XFdOY1YVyY=; b=KH+rLsOxyeCkN/E8ftbI9Snc7GrR30Pb6UfymkU5QZDM9+CgYRfjusl0g/Vpkkleoe ZIwPS/a9zrBE6zbLzmjMh4KvOAkXjf6okgkBfrSYffJ4ZOQYTBIOgNcgCB4bflK5uZB1 /f999DLRp2eVbfkqdq4Vg31aZzbFXeijMCOhHZySCox/bt6B9pVhorvq785gnxac8E9b 2Wc/VxIQ7Bg5X2zwGy+zYSZnJkNOndem9iztGUoafUaX8v2a8K5K5OcYI3gRGSD0DtAA H2E7sVMSGt1BEcjJrvJdn/GgBbEcTHNkEhtGhqPR9jT8xOgrJQHbv9RcWV5Mzj4wXR+G aKEw== X-Gm-Message-State: AOAM533UOT5/s7jBTuI9g2JZ0cv3FtArctIEQXfeBeLwyms8fENOhnM9 TlgAEBoyGD2Hjly+rWg+eRFF X-Google-Smtp-Source: ABdhPJwqa943zq1Ntq/jM0YGq1xzB0bh9LWs6ajK6UMzkrpe/h3ODorWGfeeWAcPP8rS7//MZIB7gQ== X-Received: by 2002:a17:902:b188:b029:11b:1549:da31 with SMTP id s8-20020a170902b188b029011b1549da31mr2646108plr.7.1626166095039; Tue, 13 Jul 2021 01:48:15 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id y6sm20418758pgk.79.2021.07.13.01.48.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:14 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org Subject: 
[PATCH v9 14/17] vdpa: Support transferring virtual addressing during DMA mapping Date: Tue, 13 Jul 2021 16:46:53 +0800 Message-Id: <20210713084656.232-15-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This patch introduces an attribute for vDPA devices that indicates whether virtual addresses can be used. If the vDPA device driver sets it, the vhost-vdpa bus driver does not pin user pages and passes the userspace virtual address instead of the physical address during DMA mapping. The corresponding vma->vm_file and offset are also passed as an opaque pointer. Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vdpa/ifcvf/ifcvf_main.c | 2 +- drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +- drivers/vdpa/vdpa.c | 9 +++- drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +- drivers/vdpa/virtio_pci/vp_vdpa.c | 2 +- drivers/vhost/vdpa.c | 99 ++++++++++++++++++++++++++++++++++----- include/linux/vdpa.h | 19 ++++++-- 7 files changed, 116 insertions(+), 19 deletions(-) diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c index b4b89a66607f..fc848aafe5a9 100644 --- a/drivers/vdpa/ifcvf/ifcvf_main.c +++ b/drivers/vdpa/ifcvf/ifcvf_main.c @@ -492,7 +492,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id) } adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa, - dev, &ifc_vdpa_ops, NULL); + dev, &ifc_vdpa_ops, NULL, false); if (adapter == NULL) { IFCVF_ERR(pdev, "Failed to allocate vDPA structure"); return -ENOMEM; diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 683452c41a36..79cb2f130bf0 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -2035,7 +2035,7 @@ static int mlx5_vdpa_dev_add(struct vdpa_mgmt_dev *v_mdev, const char *name) max_vqs =
min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS); ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops, - name); + name, false); if (IS_ERR(ndev)) return PTR_ERR(ndev); diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index d77d59811389..41377df674d5 100644 --- a/drivers/vdpa/vdpa.c +++ b/drivers/vdpa/vdpa.c @@ -71,6 +71,7 @@ static void vdpa_release_dev(struct device *d) * @config: the bus operations that is supported by this device * @size: size of the parent structure that contains private data * @name: name of the vdpa device; optional. + * @use_va: indicate whether virtual address must be used by this device * * Driver should use vdpa_alloc_device() wrapper macro instead of * using this directly. @@ -80,7 +81,8 @@ static void vdpa_release_dev(struct device *d) */ struct vdpa_device *__vdpa_alloc_device(struct device *parent, const struct vdpa_config_ops *config, - size_t size, const char *name) + size_t size, const char *name, + bool use_va) { struct vdpa_device *vdev; int err = -EINVAL; @@ -91,6 +93,10 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent, if (!!config->dma_map != !!config->dma_unmap) goto err; + /* It should only work for the device that use on-chip IOMMU */ + if (use_va && !(config->dma_map || config->set_map)) + goto err; + err = -ENOMEM; vdev = kzalloc(size, GFP_KERNEL); if (!vdev) @@ -106,6 +112,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent, vdev->index = err; vdev->config = config; vdev->features_valid = false; + vdev->use_va = use_va; if (name) err = dev_set_name(&vdev->dev, "%s", name); diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c index f456f4baf86d..472d573d755c 100644 --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c @@ -250,7 +250,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr) ops = &vdpasim_config_ops; vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, - 
dev_attr->name); + dev_attr->name, false); if (!vdpasim) goto err_alloc; diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c index ee723c645aec..c828b9f9d92a 100644 --- a/drivers/vdpa/virtio_pci/vp_vdpa.c +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c @@ -435,7 +435,7 @@ static int vp_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id) return ret; vp_vdpa = vdpa_alloc_device(struct vp_vdpa, vdpa, - dev, &vp_vdpa_ops, NULL); + dev, &vp_vdpa_ops, NULL, false); if (vp_vdpa == NULL) { dev_err(dev, "vp_vdpa: Failed to allocate vDPA structure\n"); return -ENOMEM; diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 3e260038beba..0db701fa935f 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -533,8 +533,28 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last) } } +static void vhost_vdpa_va_unmap(struct vhost_vdpa *v, u64 start, u64 last) +{ + struct vhost_dev *dev = &v->vdev; + struct vhost_iotlb *iotlb = dev->iotlb; + struct vhost_iotlb_map *map; + struct vdpa_map_file *map_file; + + while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) { + map_file = (struct vdpa_map_file *)map->opaque; + fput(map_file->file); + kfree(map_file); + vhost_iotlb_map_free(iotlb, map); + } +} + static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) { + struct vdpa_device *vdpa = v->vdpa; + + if (vdpa->use_va) + return vhost_vdpa_va_unmap(v, start, last); + return vhost_vdpa_pa_unmap(v, start, last); } @@ -569,21 +589,21 @@ static int perm_to_iommu_flags(u32 perm) return flags | IOMMU_CACHE; } -static int vhost_vdpa_map(struct vhost_vdpa *v, - u64 iova, u64 size, u64 pa, u32 perm) +static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova, + u64 size, u64 pa, u32 perm, void *opaque) { struct vhost_dev *dev = &v->vdev; struct vdpa_device *vdpa = v->vdpa; const struct vdpa_config_ops *ops = vdpa->config; int r = 0; - r = vhost_iotlb_add_range(dev->iotlb, iova, iova + size - 
1, - pa, perm); + r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1, + pa, perm, opaque); if (r) return r; if (ops->dma_map) { - r = ops->dma_map(vdpa, iova, size, pa, perm, NULL); + r = ops->dma_map(vdpa, iova, size, pa, perm, opaque); } else if (ops->set_map) { if (!v->in_batch) r = ops->set_map(vdpa, dev->iotlb); @@ -591,13 +611,15 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, r = iommu_map(v->domain, iova, pa, size, perm_to_iommu_flags(perm)); } - - if (r) + if (r) { vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1); - else + return r; + } + + if (!vdpa->use_va) atomic64_add(size >> PAGE_SHIFT, &dev->mm->pinned_vm); - return r; + return 0; } static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) @@ -618,6 +640,56 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) } } +static int vhost_vdpa_va_map(struct vhost_vdpa *v, + u64 iova, u64 size, u64 uaddr, u32 perm) +{ + struct vhost_dev *dev = &v->vdev; + u64 offset, map_size, map_iova = iova; + struct vdpa_map_file *map_file; + struct vm_area_struct *vma; + int ret = 0; + + mmap_read_lock(dev->mm); + + while (size) { + vma = find_vma(dev->mm, uaddr); + if (!vma) { + ret = -EINVAL; + break; + } + map_size = min(size, vma->vm_end - uaddr); + if (!(vma->vm_file && (vma->vm_flags & VM_SHARED) && + !(vma->vm_flags & (VM_IO | VM_PFNMAP)))) + goto next; + + map_file = kzalloc(sizeof(*map_file), GFP_KERNEL); + if (!map_file) { + ret = -ENOMEM; + break; + } + offset = (vma->vm_pgoff << PAGE_SHIFT) + uaddr - vma->vm_start; + map_file->offset = offset; + map_file->file = get_file(vma->vm_file); + ret = vhost_vdpa_map(v, map_iova, map_size, uaddr, + perm, map_file); + if (ret) { + fput(map_file->file); + kfree(map_file); + break; + } +next: + size -= map_size; + uaddr += map_size; + map_iova += map_size; + } + if (ret) + vhost_vdpa_unmap(v, iova, map_iova - iova); + + mmap_read_unlock(dev->mm); + + return ret; +} + static int vhost_vdpa_pa_map(struct vhost_vdpa *v,
u64 iova, u64 size, u64 uaddr, u32 perm) { @@ -684,7 +756,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v, csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT; ret = vhost_vdpa_map(v, iova, csize, map_pfn << PAGE_SHIFT, - perm); + perm, NULL); if (ret) { /* * Unpin the pages that are left unmapped @@ -713,7 +785,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v, /* Pin the rest chunk */ ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT, - map_pfn << PAGE_SHIFT, perm); + map_pfn << PAGE_SHIFT, perm, NULL); out: if (ret) { if (nchunks) { @@ -746,6 +818,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, struct vhost_iotlb_msg *msg) { struct vhost_dev *dev = &v->vdev; + struct vdpa_device *vdpa = v->vdpa; struct vhost_iotlb *iotlb = dev->iotlb; if (msg->iova < v->range.first || @@ -756,6 +829,10 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, msg->iova + msg->size - 1)) return -EEXIST; + if (vdpa->use_va) + return vhost_vdpa_va_map(v, msg->iova, msg->size, + msg->uaddr, msg->perm); + return vhost_vdpa_pa_map(v, msg->iova, msg->size, msg->uaddr, msg->perm); } diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 4903e67c690b..b605ce09a5bd 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -66,6 +66,7 @@ struct vdpa_mgmt_dev; * @config: the configuration ops for this device. * @index: device index * @features_valid: were features initialized? for legacy guests + * @use_va: indicate whether virtual address must be used by this device * @nvqs: maximum number of supported virtqueues * @mdev: management device pointer; caller must setup when registering device as part * of dev_add() mgmtdev ops callback before invoking _vdpa_register_device(). 
@@ -76,6 +77,7 @@ struct vdpa_device { const struct vdpa_config_ops *config; unsigned int index; bool features_valid; + bool use_va; int nvqs; struct vdpa_mgmt_dev *mdev; }; @@ -91,6 +93,16 @@ struct vdpa_iova_range { }; /** + * Corresponding file area for device memory mapping + * @file: vma->vm_file for the mapping + * @offset: mapping offset in the vm_file + */ +struct vdpa_map_file { + struct file *file; + u64 offset; +}; + +/** * struct vdpa_config_ops - operations for configuring a vDPA device. * Note: vDPA device drivers are required to implement all of the * operations unless it is mentioned to be optional in the following @@ -277,14 +289,15 @@ struct vdpa_config_ops { struct vdpa_device *__vdpa_alloc_device(struct device *parent, const struct vdpa_config_ops *config, - size_t size, const char *name); + size_t size, const char *name, + bool use_va); -#define vdpa_alloc_device(dev_struct, member, parent, config, name) \ +#define vdpa_alloc_device(dev_struct, member, parent, config, name, use_va) \ container_of(__vdpa_alloc_device( \ parent, config, \ sizeof(dev_struct) + \ BUILD_BUG_ON_ZERO(offsetof( \ - dev_struct, member)), name), \ + dev_struct, member)), name, use_va), \ dev_struct, member) int vdpa_register_device(struct vdpa_device *vdev, int nvqs); From patchwork Tue Jul 13 08:46:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 477348 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A78AC07E95 for ; 
From: Xie Yongji
To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com,
 sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org,
 christian.brauner@canonical.com, rdunlap@infradead.org,
 willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk,
 bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com,
 dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org,
 zhe.he@windriver.com, xiaodong.liu@intel.com
Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org,
 netdev@vger.kernel.org, kvm@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v9 15/17] vduse: Implement an MMU-based IOMMU driver
Date: Tue, 13 Jul 2021 16:46:54 +0800
Message-Id: <20210713084656.232-16-xieyongji@bytedance.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com>
References: <20210713084656.232-1-xieyongji@bytedance.com>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

This implements an MMU-based IOMMU driver to support mapping
kernel dma buffer into userspace. The basic idea behind it is
treating MMU (VA->PA) as IOMMU (IOVA->PA).
The driver sets up an MMU mapping instead of an IOMMU mapping for the
DMA transfer, so that the userspace process can use its own virtual
address to access the DMA buffer in the kernel.

To avoid security issues, a bounce-buffering mechanism is introduced
to prevent userspace from accessing the original buffer directly.

Signed-off-by: Xie Yongji
Acked-by: Jason Wang
---
 drivers/vdpa/vdpa_user/iova_domain.c | 545 +++++++++++++++++++++++++++++++++++
 drivers/vdpa/vdpa_user/iova_domain.h |  73 +++++
 2 files changed, 618 insertions(+)
 create mode 100644 drivers/vdpa/vdpa_user/iova_domain.c
 create mode 100644 drivers/vdpa/vdpa_user/iova_domain.h

diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/iova_domain.c
new file mode 100644
index 000000000000..ad45026f5423
--- /dev/null
+++ b/drivers/vdpa/vdpa_user/iova_domain.c
@@ -0,0 +1,545 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * MMU-based IOMMU implementation
+ *
+ * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved.
+ * + * Author: Xie Yongji + * + */ + +#include +#include +#include +#include +#include +#include + +#include "iova_domain.h" + +static int vduse_iotlb_add_range(struct vduse_iova_domain *domain, + u64 start, u64 last, + u64 addr, unsigned int perm, + struct file *file, u64 offset) +{ + struct vdpa_map_file *map_file; + int ret; + + map_file = kmalloc(sizeof(*map_file), GFP_ATOMIC); + if (!map_file) + return -ENOMEM; + + map_file->file = get_file(file); + map_file->offset = offset; + + ret = vhost_iotlb_add_range_ctx(domain->iotlb, start, last, + addr, perm, map_file); + if (ret) { + fput(map_file->file); + kfree(map_file); + return ret; + } + return 0; +} + +static void vduse_iotlb_del_range(struct vduse_iova_domain *domain, + u64 start, u64 last) +{ + struct vdpa_map_file *map_file; + struct vhost_iotlb_map *map; + + while ((map = vhost_iotlb_itree_first(domain->iotlb, start, last))) { + map_file = (struct vdpa_map_file *)map->opaque; + fput(map_file->file); + kfree(map_file); + vhost_iotlb_map_free(domain->iotlb, map); + } +} + +int vduse_domain_set_map(struct vduse_iova_domain *domain, + struct vhost_iotlb *iotlb) +{ + struct vdpa_map_file *map_file; + struct vhost_iotlb_map *map; + u64 start = 0ULL, last = ULLONG_MAX; + int ret; + + spin_lock(&domain->iotlb_lock); + vduse_iotlb_del_range(domain, start, last); + + for (map = vhost_iotlb_itree_first(iotlb, start, last); map; + map = vhost_iotlb_itree_next(map, start, last)) { + map_file = (struct vdpa_map_file *)map->opaque; + ret = vduse_iotlb_add_range(domain, map->start, map->last, + map->addr, map->perm, + map_file->file, + map_file->offset); + if (ret) + goto err; + } + spin_unlock(&domain->iotlb_lock); + + return 0; +err: + vduse_iotlb_del_range(domain, start, last); + spin_unlock(&domain->iotlb_lock); + return ret; +} + +void vduse_domain_clear_map(struct vduse_iova_domain *domain, + struct vhost_iotlb *iotlb) +{ + struct vhost_iotlb_map *map; + u64 start = 0ULL, last = ULLONG_MAX; + + 
spin_lock(&domain->iotlb_lock); + for (map = vhost_iotlb_itree_first(iotlb, start, last); map; + map = vhost_iotlb_itree_next(map, start, last)) { + vduse_iotlb_del_range(domain, map->start, map->last); + } + spin_unlock(&domain->iotlb_lock); +} + +static int vduse_domain_map_bounce_page(struct vduse_iova_domain *domain, + u64 iova, u64 size, u64 paddr) +{ + struct vduse_bounce_map *map; + u64 last = iova + size - 1; + + while (iova <= last) { + map = &domain->bounce_maps[iova >> PAGE_SHIFT]; + if (!map->bounce_page) { + map->bounce_page = alloc_page(GFP_ATOMIC); + if (!map->bounce_page) + return -ENOMEM; + } + map->orig_phys = paddr; + paddr += PAGE_SIZE; + iova += PAGE_SIZE; + } + return 0; +} + +static void vduse_domain_unmap_bounce_page(struct vduse_iova_domain *domain, + u64 iova, u64 size) +{ + struct vduse_bounce_map *map; + u64 last = iova + size - 1; + + while (iova <= last) { + map = &domain->bounce_maps[iova >> PAGE_SHIFT]; + map->orig_phys = INVALID_PHYS_ADDR; + iova += PAGE_SIZE; + } +} + +static void do_bounce(phys_addr_t orig, void *addr, size_t size, + enum dma_data_direction dir) +{ + unsigned long pfn = PFN_DOWN(orig); + unsigned int offset = offset_in_page(orig); + char *buffer; + unsigned int sz = 0; + + while (size) { + sz = min_t(size_t, PAGE_SIZE - offset, size); + + buffer = kmap_atomic(pfn_to_page(pfn)); + if (dir == DMA_TO_DEVICE) + memcpy(addr, buffer + offset, sz); + else + memcpy(buffer + offset, addr, sz); + kunmap_atomic(buffer); + + size -= sz; + pfn++; + addr += sz; + offset = 0; + } +} + +static void vduse_domain_bounce(struct vduse_iova_domain *domain, + dma_addr_t iova, size_t size, + enum dma_data_direction dir) +{ + struct vduse_bounce_map *map; + unsigned int offset; + void *addr; + size_t sz; + + if (iova >= domain->bounce_size) + return; + + while (size) { + map = &domain->bounce_maps[iova >> PAGE_SHIFT]; + offset = offset_in_page(iova); + sz = min_t(size_t, PAGE_SIZE - offset, size); + + if (WARN_ON(!map->bounce_page || + 
map->orig_phys == INVALID_PHYS_ADDR)) + return; + + addr = page_address(map->bounce_page) + offset; + do_bounce(map->orig_phys + offset, addr, sz, dir); + size -= sz; + iova += sz; + } +} + +static struct page * +vduse_domain_get_coherent_page(struct vduse_iova_domain *domain, u64 iova) +{ + u64 start = iova & PAGE_MASK; + u64 last = start + PAGE_SIZE - 1; + struct vhost_iotlb_map *map; + struct page *page = NULL; + + spin_lock(&domain->iotlb_lock); + map = vhost_iotlb_itree_first(domain->iotlb, start, last); + if (!map) + goto out; + + page = pfn_to_page((map->addr + iova - map->start) >> PAGE_SHIFT); + get_page(page); +out: + spin_unlock(&domain->iotlb_lock); + + return page; +} + +static struct page * +vduse_domain_get_bounce_page(struct vduse_iova_domain *domain, u64 iova) +{ + struct vduse_bounce_map *map; + struct page *page = NULL; + + spin_lock(&domain->iotlb_lock); + map = &domain->bounce_maps[iova >> PAGE_SHIFT]; + if (!map->bounce_page) + goto out; + + page = map->bounce_page; + get_page(page); +out: + spin_unlock(&domain->iotlb_lock); + + return page; +} + +static void +vduse_domain_free_bounce_pages(struct vduse_iova_domain *domain) +{ + struct vduse_bounce_map *map; + unsigned long pfn, bounce_pfns; + + bounce_pfns = domain->bounce_size >> PAGE_SHIFT; + + for (pfn = 0; pfn < bounce_pfns; pfn++) { + map = &domain->bounce_maps[pfn]; + if (WARN_ON(map->orig_phys != INVALID_PHYS_ADDR)) + continue; + + if (!map->bounce_page) + continue; + + __free_page(map->bounce_page); + map->bounce_page = NULL; + } +} + +void vduse_domain_reset_bounce_map(struct vduse_iova_domain *domain) +{ + if (!domain->bounce_map) + return; + + spin_lock(&domain->iotlb_lock); + if (!domain->bounce_map) + goto unlock; + + vduse_iotlb_del_range(domain, 0, domain->bounce_size - 1); + domain->bounce_map = 0; +unlock: + spin_unlock(&domain->iotlb_lock); +} + +static int vduse_domain_init_bounce_map(struct vduse_iova_domain *domain) +{ + int ret = 0; + + if (domain->bounce_map) + return 
0; + + spin_lock(&domain->iotlb_lock); + if (domain->bounce_map) + goto unlock; + + ret = vduse_iotlb_add_range(domain, 0, domain->bounce_size - 1, + 0, VHOST_MAP_RW, domain->file, 0); + if (ret) + goto unlock; + + domain->bounce_map = 1; +unlock: + spin_unlock(&domain->iotlb_lock); + return ret; +} + +static dma_addr_t +vduse_domain_alloc_iova(struct iova_domain *iovad, + unsigned long size, unsigned long limit) +{ + unsigned long shift = iova_shift(iovad); + unsigned long iova_len = iova_align(iovad, size) >> shift; + unsigned long iova_pfn; + + /* + * Freeing non-power-of-two-sized allocations back into the IOVA caches + * will come back to bite us badly, so we have to waste a bit of space + * rounding up anything cacheable to make sure that can't happen. The + * order of the unadjusted size will still match upon freeing. + */ + if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1))) + iova_len = roundup_pow_of_two(iova_len); + iova_pfn = alloc_iova_fast(iovad, iova_len, limit >> shift, true); + + return iova_pfn << shift; +} + +static void vduse_domain_free_iova(struct iova_domain *iovad, + dma_addr_t iova, size_t size) +{ + unsigned long shift = iova_shift(iovad); + unsigned long iova_len = iova_align(iovad, size) >> shift; + + free_iova_fast(iovad, iova >> shift, iova_len); +} + +dma_addr_t vduse_domain_map_page(struct vduse_iova_domain *domain, + struct page *page, unsigned long offset, + size_t size, enum dma_data_direction dir, + unsigned long attrs) +{ + struct iova_domain *iovad = &domain->stream_iovad; + unsigned long limit = domain->bounce_size - 1; + phys_addr_t pa = page_to_phys(page) + offset; + dma_addr_t iova = vduse_domain_alloc_iova(iovad, size, limit); + + if (!iova) + return DMA_MAPPING_ERROR; + + if (vduse_domain_init_bounce_map(domain)) + goto err; + + if (vduse_domain_map_bounce_page(domain, (u64)iova, (u64)size, pa)) + goto err; + + if (dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL) + vduse_domain_bounce(domain, iova, size, 
DMA_TO_DEVICE); + + return iova; +err: + vduse_domain_free_iova(iovad, iova, size); + return DMA_MAPPING_ERROR; +} + +void vduse_domain_unmap_page(struct vduse_iova_domain *domain, + dma_addr_t dma_addr, size_t size, + enum dma_data_direction dir, unsigned long attrs) +{ + struct iova_domain *iovad = &domain->stream_iovad; + + if (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL) + vduse_domain_bounce(domain, dma_addr, size, DMA_FROM_DEVICE); + + vduse_domain_unmap_bounce_page(domain, (u64)dma_addr, (u64)size); + vduse_domain_free_iova(iovad, dma_addr, size); +} + +void *vduse_domain_alloc_coherent(struct vduse_iova_domain *domain, + size_t size, dma_addr_t *dma_addr, + gfp_t flag, unsigned long attrs) +{ + struct iova_domain *iovad = &domain->consistent_iovad; + unsigned long limit = domain->iova_limit; + dma_addr_t iova = vduse_domain_alloc_iova(iovad, size, limit); + void *orig = alloc_pages_exact(size, flag); + + if (!iova || !orig) + goto err; + + spin_lock(&domain->iotlb_lock); + if (vduse_iotlb_add_range(domain, (u64)iova, (u64)iova + size - 1, + virt_to_phys(orig), VHOST_MAP_RW, + domain->file, (u64)iova)) { + spin_unlock(&domain->iotlb_lock); + goto err; + } + spin_unlock(&domain->iotlb_lock); + + *dma_addr = iova; + + return orig; +err: + *dma_addr = DMA_MAPPING_ERROR; + if (orig) + free_pages_exact(orig, size); + if (iova) + vduse_domain_free_iova(iovad, iova, size); + + return NULL; +} + +void vduse_domain_free_coherent(struct vduse_iova_domain *domain, size_t size, + void *vaddr, dma_addr_t dma_addr, + unsigned long attrs) +{ + struct iova_domain *iovad = &domain->consistent_iovad; + struct vhost_iotlb_map *map; + struct vdpa_map_file *map_file; + phys_addr_t pa; + + spin_lock(&domain->iotlb_lock); + map = vhost_iotlb_itree_first(domain->iotlb, (u64)dma_addr, + (u64)dma_addr + size - 1); + if (WARN_ON(!map)) { + spin_unlock(&domain->iotlb_lock); + return; + } + map_file = (struct vdpa_map_file *)map->opaque; + fput(map_file->file); + kfree(map_file); 
+ pa = map->addr; + vhost_iotlb_map_free(domain->iotlb, map); + spin_unlock(&domain->iotlb_lock); + + vduse_domain_free_iova(iovad, dma_addr, size); + free_pages_exact(phys_to_virt(pa), size); +} + +static vm_fault_t vduse_domain_mmap_fault(struct vm_fault *vmf) +{ + struct vduse_iova_domain *domain = vmf->vma->vm_private_data; + unsigned long iova = vmf->pgoff << PAGE_SHIFT; + struct page *page; + + if (!domain) + return VM_FAULT_SIGBUS; + + if (iova < domain->bounce_size) + page = vduse_domain_get_bounce_page(domain, iova); + else + page = vduse_domain_get_coherent_page(domain, iova); + + if (!page) + return VM_FAULT_SIGBUS; + + vmf->page = page; + + return 0; +} + +static const struct vm_operations_struct vduse_domain_mmap_ops = { + .fault = vduse_domain_mmap_fault, +}; + +static int vduse_domain_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct vduse_iova_domain *domain = file->private_data; + + vma->vm_flags |= VM_DONTDUMP | VM_DONTEXPAND; + vma->vm_private_data = domain; + vma->vm_ops = &vduse_domain_mmap_ops; + + return 0; +} + +static int vduse_domain_release(struct inode *inode, struct file *file) +{ + struct vduse_iova_domain *domain = file->private_data; + + spin_lock(&domain->iotlb_lock); + vduse_iotlb_del_range(domain, 0, ULLONG_MAX); + vduse_domain_free_bounce_pages(domain); + spin_unlock(&domain->iotlb_lock); + put_iova_domain(&domain->stream_iovad); + put_iova_domain(&domain->consistent_iovad); + vhost_iotlb_free(domain->iotlb); + vfree(domain->bounce_maps); + kfree(domain); + + return 0; +} + +static const struct file_operations vduse_domain_fops = { + .owner = THIS_MODULE, + .mmap = vduse_domain_mmap, + .release = vduse_domain_release, +}; + +void vduse_domain_destroy(struct vduse_iova_domain *domain) +{ + fput(domain->file); +} + +struct vduse_iova_domain * +vduse_domain_create(unsigned long iova_limit, size_t bounce_size) +{ + struct vduse_iova_domain *domain; + struct file *file; + struct vduse_bounce_map *map; + unsigned long 
pfn, bounce_pfns; + + bounce_pfns = PAGE_ALIGN(bounce_size) >> PAGE_SHIFT; + if (iova_limit <= bounce_size) + return NULL; + + domain = kzalloc(sizeof(*domain), GFP_KERNEL); + if (!domain) + return NULL; + + domain->iotlb = vhost_iotlb_alloc(0, 0); + if (!domain->iotlb) + goto err_iotlb; + + domain->iova_limit = iova_limit; + domain->bounce_size = PAGE_ALIGN(bounce_size); + domain->bounce_maps = vzalloc(bounce_pfns * + sizeof(struct vduse_bounce_map)); + if (!domain->bounce_maps) + goto err_map; + + for (pfn = 0; pfn < bounce_pfns; pfn++) { + map = &domain->bounce_maps[pfn]; + map->orig_phys = INVALID_PHYS_ADDR; + } + file = anon_inode_getfile("[vduse-domain]", &vduse_domain_fops, + domain, O_RDWR); + if (IS_ERR(file)) + goto err_file; + + domain->file = file; + spin_lock_init(&domain->iotlb_lock); + init_iova_domain(&domain->stream_iovad, + PAGE_SIZE, IOVA_START_PFN); + init_iova_domain(&domain->consistent_iovad, + PAGE_SIZE, bounce_pfns); + + return domain; +err_file: + vfree(domain->bounce_maps); +err_map: + vhost_iotlb_free(domain->iotlb); +err_iotlb: + kfree(domain); + return NULL; +} + +int vduse_domain_init(void) +{ + return iova_cache_get(); +} + +void vduse_domain_exit(void) +{ + iova_cache_put(); +} diff --git a/drivers/vdpa/vdpa_user/iova_domain.h b/drivers/vdpa/vdpa_user/iova_domain.h new file mode 100644 index 000000000000..5fc41d01c412 --- /dev/null +++ b/drivers/vdpa/vdpa_user/iova_domain.h @@ -0,0 +1,73 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * MMU-based IOMMU implementation + * + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved. 
+ * + * Author: Xie Yongji + * + */ + +#ifndef _VDUSE_IOVA_DOMAIN_H +#define _VDUSE_IOVA_DOMAIN_H + +#include +#include +#include + +#define IOVA_START_PFN 1 + +#define INVALID_PHYS_ADDR (~(phys_addr_t)0) + +struct vduse_bounce_map { + struct page *bounce_page; + u64 orig_phys; +}; + +struct vduse_iova_domain { + struct iova_domain stream_iovad; + struct iova_domain consistent_iovad; + struct vduse_bounce_map *bounce_maps; + size_t bounce_size; + unsigned long iova_limit; + int bounce_map; + struct vhost_iotlb *iotlb; + spinlock_t iotlb_lock; + struct file *file; +}; + +int vduse_domain_set_map(struct vduse_iova_domain *domain, + struct vhost_iotlb *iotlb); + +void vduse_domain_clear_map(struct vduse_iova_domain *domain, + struct vhost_iotlb *iotlb); + +dma_addr_t vduse_domain_map_page(struct vduse_iova_domain *domain, + struct page *page, unsigned long offset, + size_t size, enum dma_data_direction dir, + unsigned long attrs); + +void vduse_domain_unmap_page(struct vduse_iova_domain *domain, + dma_addr_t dma_addr, size_t size, + enum dma_data_direction dir, unsigned long attrs); + +void *vduse_domain_alloc_coherent(struct vduse_iova_domain *domain, + size_t size, dma_addr_t *dma_addr, + gfp_t flag, unsigned long attrs); + +void vduse_domain_free_coherent(struct vduse_iova_domain *domain, size_t size, + void *vaddr, dma_addr_t dma_addr, + unsigned long attrs); + +void vduse_domain_reset_bounce_map(struct vduse_iova_domain *domain); + +void vduse_domain_destroy(struct vduse_iova_domain *domain); + +struct vduse_iova_domain *vduse_domain_create(unsigned long iova_limit, + size_t bounce_size); + +int vduse_domain_init(void); + +void vduse_domain_exit(void); + +#endif /* _VDUSE_IOVA_DOMAIN_H */ From patchwork Tue Jul 13 08:46:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 475095 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
From: Xie Yongji
To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com,
 sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org,
 christian.brauner@canonical.com, rdunlap@infradead.org,
 willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk,
 bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com,
 dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org,
 zhe.he@windriver.com, xiaodong.liu@intel.com
Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org,
 netdev@vger.kernel.org, kvm@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v9 16/17] vduse: Introduce VDUSE - vDPA Device in Userspace
Date: Tue, 13 Jul 2021 16:46:55 +0800
Message-Id: <20210713084656.232-17-xieyongji@bytedance.com>
X-Mailer:
 git-send-email 2.25.1
In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com>
References: <20210713084656.232-1-xieyongji@bytedance.com>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

This VDUSE driver enables implementing software-emulated vDPA
devices in userspace. The vDPA device is created by
ioctl(VDUSE_CREATE_DEV) on /dev/vduse/control. Then a char device
interface (/dev/vduse/$NAME) is exported to userspace for device
emulation.

In order to make the device emulation more secure, the device's
control path is handled in the kernel. A message mechanism is
introduced to forward some dataplane-related control messages to
userspace.

In the data path, the DMA buffer is mapped into the userspace
address space in different ways depending on the vDPA bus to which
the vDPA device is attached. In the virtio-vdpa case, the MMU-based
IOMMU driver is used to achieve that. In the vhost-vdpa case, the
DMA buffer resides in a userspace memory region which can be shared
with the VDUSE userspace process by transferring the shmfd.

For more details on VDUSE design and usage, please see the follow-on
Documentation commit.
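
For readers who want a feel for the message mechanism before reading
vduse_dev.c, the request/response pairing can be modelled in plain C
as below. This is only a rough userspace sketch, not the driver code:
the names model_dev, model_msg, model_send(), model_read() and
model_write() are hypothetical, a fixed array stands in for the
driver's send and receive lists, and locking, wait queues and the
request timeout are left out. It assumes what the patch shows: each
request gets a request_id, a read hands the oldest pending request to
userspace, and a write completes the request with the matching id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_MSGS 8

/* MODEL_FREE is 0 so a zero-initialized device has every slot free. */
enum model_state { MODEL_FREE, MODEL_SENT, MODEL_RECEIVED, MODEL_DONE };

struct model_msg {
	uint32_t request_id;
	int result;
	enum model_state state;
};

struct model_dev {
	uint32_t next_id;	/* plays the role of a per-device id counter */
	struct model_msg msgs[MODEL_MAX_MSGS];
};

/* Kernel side: queue a request; returns 0 and its id, or -1 if full. */
static int model_send(struct model_dev *dev, uint32_t *id)
{
	assert(dev && id);
	for (size_t i = 0; i < MODEL_MAX_MSGS; i++) {
		struct model_msg *m = &dev->msgs[i];

		if (m->state == MODEL_FREE) {
			m->request_id = dev->next_id++;
			m->result = -1;
			m->state = MODEL_SENT;
			*id = m->request_id;
			return 0;
		}
	}
	return -1;
}

/*
 * Userspace read(): pick the pending request with the lowest id and mark
 * it as received. Returns -1 if nothing is pending. Id wraparound is
 * ignored in this sketch.
 */
static int model_read(struct model_dev *dev, uint32_t *id)
{
	struct model_msg *oldest = NULL;

	assert(dev && id);
	for (size_t i = 0; i < MODEL_MAX_MSGS; i++) {
		struct model_msg *m = &dev->msgs[i];

		if (m->state == MODEL_SENT &&
		    (!oldest || m->request_id < oldest->request_id))
			oldest = m;
	}
	if (!oldest)
		return -1;

	oldest->state = MODEL_RECEIVED;
	*id = oldest->request_id;
	return 0;
}

/*
 * Userspace write(): complete the received request whose id matches.
 * Returns -1 for an unknown id or a request that was never read.
 */
static int model_write(struct model_dev *dev, uint32_t id, int result)
{
	assert(dev);
	for (size_t i = 0; i < MODEL_MAX_MSGS; i++) {
		struct model_msg *m = &dev->msgs[i];

		if (m->state == MODEL_RECEIVED && m->request_id == id) {
			m->result = result;
			m->state = MODEL_DONE;
			return 0;
		}
	}
	return -1;
}
```

In this model a response is accepted only for a request that userspace
has already read, which is the reason for keeping sent and received
requests in separate states.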
Signed-off-by: Xie Yongji --- Documentation/userspace-api/ioctl/ioctl-number.rst | 1 + drivers/vdpa/Kconfig | 10 + drivers/vdpa/Makefile | 1 + drivers/vdpa/vdpa_user/Makefile | 5 + drivers/vdpa/vdpa_user/vduse_dev.c | 1502 ++++++++++++++++++++ include/uapi/linux/vduse.h | 221 +++ 6 files changed, 1740 insertions(+) create mode 100644 drivers/vdpa/vdpa_user/Makefile create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c create mode 100644 include/uapi/linux/vduse.h diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index 1409e40e6345..293ca3aef358 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -300,6 +300,7 @@ Code Seq# Include File Comments 'z' 10-4F drivers/s390/crypto/zcrypt_api.h conflict! '|' 00-7F linux/media.h 0x80 00-1F linux/fb.h +0x81 00-1F linux/vduse.h 0x89 00-06 arch/x86/include/asm/sockios.h 0x89 0B-DF linux/sockios.h 0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig index a503c1b2bfd9..6e23bce6433a 100644 --- a/drivers/vdpa/Kconfig +++ b/drivers/vdpa/Kconfig @@ -33,6 +33,16 @@ config VDPA_SIM_BLOCK vDPA block device simulator which terminates IO request in a memory buffer. +config VDPA_USER + tristate "VDUSE (vDPA Device in Userspace) support" + depends on EVENTFD && MMU && HAS_DMA + select DMA_OPS + select VHOST_IOTLB + select IOMMU_IOVA + help + With VDUSE it is possible to emulate a vDPA Device + in a userspace program. 
+ config IFCVF tristate "Intel IFC VF vDPA driver" depends on PCI_MSI diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile index 67fe7f3d6943..f02ebed33f19 100644 --- a/drivers/vdpa/Makefile +++ b/drivers/vdpa/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_VDPA) += vdpa.o obj-$(CONFIG_VDPA_SIM) += vdpa_sim/ +obj-$(CONFIG_VDPA_USER) += vdpa_user/ obj-$(CONFIG_IFCVF) += ifcvf/ obj-$(CONFIG_MLX5_VDPA) += mlx5/ obj-$(CONFIG_VP_VDPA) += virtio_pci/ diff --git a/drivers/vdpa/vdpa_user/Makefile b/drivers/vdpa/vdpa_user/Makefile new file mode 100644 index 000000000000..260e0b26af99 --- /dev/null +++ b/drivers/vdpa/vdpa_user/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 + +vduse-y := vduse_dev.o iova_domain.o + +obj-$(CONFIG_VDPA_USER) += vduse.o diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c new file mode 100644 index 000000000000..c994a4a4660c --- /dev/null +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -0,0 +1,1502 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * VDUSE: vDPA Device in Userspace + * + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved. 
+ * + * Author: Xie Yongji + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "iova_domain.h" + +#define DRV_AUTHOR "Yongji Xie " +#define DRV_DESC "vDPA Device in Userspace" +#define DRV_LICENSE "GPL v2" + +#define VDUSE_DEV_MAX (1U << MINORBITS) +#define VDUSE_MAX_BOUNCE_SIZE (64 * 1024 * 1024) +#define VDUSE_IOVA_SIZE (128 * 1024 * 1024) +#define VDUSE_REQUEST_TIMEOUT 30 + +struct vduse_virtqueue { + u16 index; + u16 num_max; + u32 num; + u64 desc_addr; + u64 driver_addr; + u64 device_addr; + struct vdpa_vq_state state; + bool ready; + bool kicked; + spinlock_t kick_lock; + spinlock_t irq_lock; + struct eventfd_ctx *kickfd; + struct vdpa_callback cb; + struct work_struct inject; +}; + +struct vduse_dev; + +struct vduse_vdpa { + struct vdpa_device vdpa; + struct vduse_dev *dev; +}; + +struct vduse_dev { + struct vduse_vdpa *vdev; + struct device *dev; + struct vduse_virtqueue *vqs; + struct vduse_iova_domain *domain; + char *name; + struct mutex lock; + spinlock_t msg_lock; + u64 msg_unique; + wait_queue_head_t waitq; + struct list_head send_list; + struct list_head recv_list; + struct vdpa_callback config_cb; + struct work_struct inject; + spinlock_t irq_lock; + int minor; + bool connected; + u64 api_version; + u64 device_features; + u64 driver_features; + u32 device_id; + u32 vendor_id; + u32 generation; + u32 config_size; + void *config; + u8 status; + u32 vq_num; + u32 vq_align; +}; + +struct vduse_dev_msg { + struct vduse_dev_request req; + struct vduse_dev_response resp; + struct list_head list; + wait_queue_head_t waitq; + bool completed; +}; + +struct vduse_control { + u64 api_version; +}; + +static DEFINE_MUTEX(vduse_lock); +static DEFINE_IDR(vduse_idr); + +static dev_t vduse_major; +static struct class *vduse_class; +static struct cdev vduse_ctrl_cdev; +static struct cdev vduse_cdev; +static 
struct workqueue_struct *vduse_irq_wq; + +static u32 allowed_device_id[] = { + VIRTIO_ID_BLOCK, +}; + +static inline struct vduse_dev *vdpa_to_vduse(struct vdpa_device *vdpa) +{ + struct vduse_vdpa *vdev = container_of(vdpa, struct vduse_vdpa, vdpa); + + return vdev->dev; +} + +static inline struct vduse_dev *dev_to_vduse(struct device *dev) +{ + struct vdpa_device *vdpa = dev_to_vdpa(dev); + + return vdpa_to_vduse(vdpa); +} + +static struct vduse_dev_msg *vduse_find_msg(struct list_head *head, + uint32_t request_id) +{ + struct vduse_dev_msg *msg; + + list_for_each_entry(msg, head, list) { + if (msg->req.request_id == request_id) { + list_del(&msg->list); + return msg; + } + } + + return NULL; +} + +static struct vduse_dev_msg *vduse_dequeue_msg(struct list_head *head) +{ + struct vduse_dev_msg *msg = NULL; + + if (!list_empty(head)) { + msg = list_first_entry(head, struct vduse_dev_msg, list); + list_del(&msg->list); + } + + return msg; +} + +static void vduse_enqueue_msg(struct list_head *head, + struct vduse_dev_msg *msg) +{ + list_add_tail(&msg->list, head); +} + +static int vduse_dev_msg_sync(struct vduse_dev *dev, + struct vduse_dev_msg *msg) +{ + int ret; + + init_waitqueue_head(&msg->waitq); + spin_lock(&dev->msg_lock); + msg->req.request_id = dev->msg_unique++; + vduse_enqueue_msg(&dev->send_list, msg); + wake_up(&dev->waitq); + spin_unlock(&dev->msg_lock); + + wait_event_killable_timeout(msg->waitq, msg->completed, + VDUSE_REQUEST_TIMEOUT * HZ); + spin_lock(&dev->msg_lock); + if (!msg->completed) { + list_del(&msg->list); + msg->resp.result = VDUSE_REQ_RESULT_FAILED; + } + ret = (msg->resp.result == VDUSE_REQ_RESULT_OK) ? 
0 : -EIO; + spin_unlock(&dev->msg_lock); + + return ret; +} + +static int vduse_dev_get_vq_state_packed(struct vduse_dev *dev, + struct vduse_virtqueue *vq, + struct vdpa_vq_state_packed *packed) +{ + struct vduse_dev_msg msg = { 0 }; + int ret; + + msg.req.type = VDUSE_GET_VQ_STATE; + msg.req.vq_state.index = vq->index; + + ret = vduse_dev_msg_sync(dev, &msg); + if (ret) + return ret; + + packed->last_avail_counter = + msg.resp.vq_state.packed.last_avail_counter; + packed->last_avail_idx = msg.resp.vq_state.packed.last_avail_idx; + packed->last_used_counter = msg.resp.vq_state.packed.last_used_counter; + packed->last_used_idx = msg.resp.vq_state.packed.last_used_idx; + + return 0; +} + +static int vduse_dev_get_vq_state_split(struct vduse_dev *dev, + struct vduse_virtqueue *vq, + struct vdpa_vq_state_split *split) +{ + struct vduse_dev_msg msg = { 0 }; + int ret; + + msg.req.type = VDUSE_GET_VQ_STATE; + msg.req.vq_state.index = vq->index; + + ret = vduse_dev_msg_sync(dev, &msg); + if (ret) + return ret; + + split->avail_index = msg.resp.vq_state.split.avail_index; + + return 0; +} + +static int vduse_dev_set_status(struct vduse_dev *dev, u8 status) +{ + struct vduse_dev_msg msg = { 0 }; + + msg.req.type = VDUSE_SET_STATUS; + msg.req.s.status = status; + + return vduse_dev_msg_sync(dev, &msg); +} + +static int vduse_dev_update_iotlb(struct vduse_dev *dev, + u64 start, u64 last) +{ + struct vduse_dev_msg msg = { 0 }; + + if (last < start) + return -EINVAL; + + msg.req.type = VDUSE_UPDATE_IOTLB; + msg.req.iova.start = start; + msg.req.iova.last = last; + + return vduse_dev_msg_sync(dev, &msg); +} + +static ssize_t vduse_dev_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ + struct file *file = iocb->ki_filp; + struct vduse_dev *dev = file->private_data; + struct vduse_dev_msg *msg; + int size = sizeof(struct vduse_dev_request); + ssize_t ret; + + if (iov_iter_count(to) < size) + return -EINVAL; + + spin_lock(&dev->msg_lock); + while (1) { + msg = 
vduse_dequeue_msg(&dev->send_list); + if (msg) + break; + + ret = -EAGAIN; + if (file->f_flags & O_NONBLOCK) + goto unlock; + + spin_unlock(&dev->msg_lock); + ret = wait_event_interruptible_exclusive(dev->waitq, + !list_empty(&dev->send_list)); + if (ret) + return ret; + + spin_lock(&dev->msg_lock); + } + spin_unlock(&dev->msg_lock); + ret = copy_to_iter(&msg->req, size, to); + spin_lock(&dev->msg_lock); + if (ret != size) { + ret = -EFAULT; + vduse_enqueue_msg(&dev->send_list, msg); + goto unlock; + } + vduse_enqueue_msg(&dev->recv_list, msg); +unlock: + spin_unlock(&dev->msg_lock); + + return ret; +} + +static ssize_t vduse_dev_write_iter(struct kiocb *iocb, struct iov_iter *from) +{ + struct file *file = iocb->ki_filp; + struct vduse_dev *dev = file->private_data; + struct vduse_dev_response resp; + struct vduse_dev_msg *msg; + size_t ret; + + ret = copy_from_iter(&resp, sizeof(resp), from); + if (ret != sizeof(resp)) + return -EINVAL; + + spin_lock(&dev->msg_lock); + msg = vduse_find_msg(&dev->recv_list, resp.request_id); + if (!msg) { + ret = -ENOENT; + goto unlock; + } + + memcpy(&msg->resp, &resp, sizeof(resp)); + msg->completed = 1; + wake_up(&msg->waitq); +unlock: + spin_unlock(&dev->msg_lock); + + return ret; +} + +static __poll_t vduse_dev_poll(struct file *file, poll_table *wait) +{ + struct vduse_dev *dev = file->private_data; + __poll_t mask = 0; + + poll_wait(file, &dev->waitq, wait); + + if (!list_empty(&dev->send_list)) + mask |= EPOLLIN | EPOLLRDNORM; + if (!list_empty(&dev->recv_list)) + mask |= EPOLLOUT | EPOLLWRNORM; + + return mask; +} + +static void vduse_dev_reset(struct vduse_dev *dev) +{ + int i; + struct vduse_iova_domain *domain = dev->domain; + + /* The coherent mappings are handled in vduse_dev_free_coherent() */ + if (domain->bounce_map) + vduse_domain_reset_bounce_map(domain); + + dev->driver_features = 0; + dev->generation++; + spin_lock(&dev->irq_lock); + dev->config_cb.callback = NULL; + dev->config_cb.private = NULL; + 
spin_unlock(&dev->irq_lock); + flush_work(&dev->inject); + + for (i = 0; i < dev->vq_num; i++) { + struct vduse_virtqueue *vq = &dev->vqs[i]; + + vq->ready = false; + vq->desc_addr = 0; + vq->driver_addr = 0; + vq->device_addr = 0; + vq->num = 0; + memset(&vq->state, 0, sizeof(vq->state)); + + spin_lock(&vq->kick_lock); + vq->kicked = false; + if (vq->kickfd) + eventfd_ctx_put(vq->kickfd); + vq->kickfd = NULL; + spin_unlock(&vq->kick_lock); + + spin_lock(&vq->irq_lock); + vq->cb.callback = NULL; + vq->cb.private = NULL; + spin_unlock(&vq->irq_lock); + flush_work(&vq->inject); + } +} + +static int vduse_vdpa_set_vq_address(struct vdpa_device *vdpa, u16 idx, + u64 desc_area, u64 driver_area, + u64 device_area) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + vq->desc_addr = desc_area; + vq->driver_addr = driver_area; + vq->device_addr = device_area; + + return 0; +} + +static void vduse_vdpa_kick_vq(struct vdpa_device *vdpa, u16 idx) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + spin_lock(&vq->kick_lock); + if (!vq->ready) + goto unlock; + + if (vq->kickfd) + eventfd_signal(vq->kickfd, 1); + else + vq->kicked = true; +unlock: + spin_unlock(&vq->kick_lock); +} + +static void vduse_vdpa_set_vq_cb(struct vdpa_device *vdpa, u16 idx, + struct vdpa_callback *cb) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + spin_lock(&vq->irq_lock); + vq->cb.callback = cb->callback; + vq->cb.private = cb->private; + spin_unlock(&vq->irq_lock); +} + +static void vduse_vdpa_set_vq_num(struct vdpa_device *vdpa, u16 idx, u32 num) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + vq->num = num; +} + +static void vduse_vdpa_set_vq_ready(struct vdpa_device *vdpa, + u16 idx, bool ready) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + 
vq->ready = ready; +} + +static bool vduse_vdpa_get_vq_ready(struct vdpa_device *vdpa, u16 idx) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + return vq->ready; +} + +static int vduse_vdpa_set_vq_state(struct vdpa_device *vdpa, u16 idx, + const struct vdpa_vq_state *state) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + if (dev->driver_features & BIT_ULL(VIRTIO_F_RING_PACKED)) { + vq->state.packed.last_avail_counter = + state->packed.last_avail_counter; + vq->state.packed.last_avail_idx = state->packed.last_avail_idx; + vq->state.packed.last_used_counter = + state->packed.last_used_counter; + vq->state.packed.last_used_idx = state->packed.last_used_idx; + } else + vq->state.split.avail_index = state->split.avail_index; + + return 0; +} + +static int vduse_vdpa_get_vq_state(struct vdpa_device *vdpa, u16 idx, + struct vdpa_vq_state *state) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + struct vduse_virtqueue *vq = &dev->vqs[idx]; + + if (dev->driver_features & BIT_ULL(VIRTIO_F_RING_PACKED)) + return vduse_dev_get_vq_state_packed(dev, vq, &state->packed); + + return vduse_dev_get_vq_state_split(dev, vq, &state->split); +} + +static u32 vduse_vdpa_get_vq_align(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->vq_align; +} + +static u64 vduse_vdpa_get_features(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->device_features; +} + +static int vduse_vdpa_set_features(struct vdpa_device *vdpa, u64 features) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + dev->driver_features = features; + return 0; +} + +static void vduse_vdpa_set_config_cb(struct vdpa_device *vdpa, + struct vdpa_callback *cb) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + spin_lock(&dev->irq_lock); + dev->config_cb.callback = cb->callback; + dev->config_cb.private = cb->private; + 
spin_unlock(&dev->irq_lock); +} + +static u16 vduse_vdpa_get_vq_num_max(struct vdpa_device *vdpa, u16 idx) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->vqs[idx].num_max; +} + +static u32 vduse_vdpa_get_device_id(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->device_id; +} + +static u32 vduse_vdpa_get_vendor_id(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->vendor_id; +} + +static u8 vduse_vdpa_get_status(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->status; +} + +static void vduse_vdpa_set_status(struct vdpa_device *vdpa, u8 status) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + if (vduse_dev_set_status(dev, status)) + return; + + dev->status = status; + if (status == 0) + vduse_dev_reset(dev); +} + +static size_t vduse_vdpa_get_config_size(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->config_size; +} + +static void vduse_vdpa_get_config(struct vdpa_device *vdpa, unsigned int offset, + void *buf, unsigned int len) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + /* Reject offset first so the subtraction below cannot wrap around */ + if (offset > dev->config_size || + len > dev->config_size - offset) + return; + + memcpy(buf, dev->config + offset, len); +} + +static void vduse_vdpa_set_config(struct vdpa_device *vdpa, unsigned int offset, + const void *buf, unsigned int len) +{ + /* Now we only support read-only configuration space */ +} + +static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->generation; +} + +static int vduse_vdpa_set_map(struct vdpa_device *vdpa, + struct vhost_iotlb *iotlb) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + int ret; + + ret = vduse_domain_set_map(dev->domain, iotlb); + if (ret) + return ret; + + ret = vduse_dev_update_iotlb(dev, 0ULL, ULLONG_MAX); + if (ret) { + vduse_domain_clear_map(dev->domain, iotlb); + return ret; + } + + return 0; +} +
+static void vduse_vdpa_free(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + dev->vdev = NULL; +} + +static const struct vdpa_config_ops vduse_vdpa_config_ops = { + .set_vq_address = vduse_vdpa_set_vq_address, + .kick_vq = vduse_vdpa_kick_vq, + .set_vq_cb = vduse_vdpa_set_vq_cb, + .set_vq_num = vduse_vdpa_set_vq_num, + .set_vq_ready = vduse_vdpa_set_vq_ready, + .get_vq_ready = vduse_vdpa_get_vq_ready, + .set_vq_state = vduse_vdpa_set_vq_state, + .get_vq_state = vduse_vdpa_get_vq_state, + .get_vq_align = vduse_vdpa_get_vq_align, + .get_features = vduse_vdpa_get_features, + .set_features = vduse_vdpa_set_features, + .set_config_cb = vduse_vdpa_set_config_cb, + .get_vq_num_max = vduse_vdpa_get_vq_num_max, + .get_device_id = vduse_vdpa_get_device_id, + .get_vendor_id = vduse_vdpa_get_vendor_id, + .get_status = vduse_vdpa_get_status, + .set_status = vduse_vdpa_set_status, + .get_config_size = vduse_vdpa_get_config_size, + .get_config = vduse_vdpa_get_config, + .set_config = vduse_vdpa_set_config, + .get_generation = vduse_vdpa_get_generation, + .set_map = vduse_vdpa_set_map, + .free = vduse_vdpa_free, +}; + +static dma_addr_t vduse_dev_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + return vduse_domain_map_page(domain, page, offset, size, dir, attrs); +} + +static void vduse_dev_unmap_page(struct device *dev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction dir, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + return vduse_domain_unmap_page(domain, dma_addr, size, dir, attrs); +} + +static void *vduse_dev_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_addr, gfp_t flag, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + 
struct vduse_iova_domain *domain = vdev->domain; + unsigned long iova; + void *addr; + + *dma_addr = DMA_MAPPING_ERROR; + addr = vduse_domain_alloc_coherent(domain, size, + (dma_addr_t *)&iova, flag, attrs); + if (!addr) + return NULL; + + *dma_addr = (dma_addr_t)iova; + + return addr; +} + +static void vduse_dev_free_coherent(struct device *dev, size_t size, + void *vaddr, dma_addr_t dma_addr, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + vduse_domain_free_coherent(domain, size, vaddr, dma_addr, attrs); +} + +static size_t vduse_dev_max_mapping_size(struct device *dev) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + return domain->bounce_size; +} + +static const struct dma_map_ops vduse_dev_dma_ops = { + .map_page = vduse_dev_map_page, + .unmap_page = vduse_dev_unmap_page, + .alloc = vduse_dev_alloc_coherent, + .free = vduse_dev_free_coherent, + .max_mapping_size = vduse_dev_max_mapping_size, +}; + +static unsigned int perm_to_file_flags(u8 perm) +{ + unsigned int flags = 0; + + switch (perm) { + case VDUSE_ACCESS_WO: + flags |= O_WRONLY; + break; + case VDUSE_ACCESS_RO: + flags |= O_RDONLY; + break; + case VDUSE_ACCESS_RW: + flags |= O_RDWR; + break; + default: + WARN(1, "invalid vhost IOTLB permission\n"); + break; + } + + return flags; +} + +static int vduse_kickfd_setup(struct vduse_dev *dev, + struct vduse_vq_eventfd *eventfd) +{ + struct eventfd_ctx *ctx = NULL; + struct vduse_virtqueue *vq; + u32 index; + + if (eventfd->index >= dev->vq_num) + return -EINVAL; + + index = array_index_nospec(eventfd->index, dev->vq_num); + vq = &dev->vqs[index]; + if (eventfd->fd >= 0) { + ctx = eventfd_ctx_fdget(eventfd->fd); + if (IS_ERR(ctx)) + return PTR_ERR(ctx); + } else if (eventfd->fd != VDUSE_EVENTFD_DEASSIGN) + return 0; + + spin_lock(&vq->kick_lock); + if (vq->kickfd) + eventfd_ctx_put(vq->kickfd); + vq->kickfd = ctx; + if
(vq->ready && vq->kicked && vq->kickfd) { + eventfd_signal(vq->kickfd, 1); + vq->kicked = false; + } + spin_unlock(&vq->kick_lock); + + return 0; +} + +static bool vduse_dev_is_ready(struct vduse_dev *dev) +{ + int i; + + for (i = 0; i < dev->vq_num; i++) + if (!dev->vqs[i].num_max) + return false; + + return true; +} + +static void vduse_dev_irq_inject(struct work_struct *work) +{ + struct vduse_dev *dev = container_of(work, struct vduse_dev, inject); + + spin_lock_irq(&dev->irq_lock); + if (dev->config_cb.callback) + dev->config_cb.callback(dev->config_cb.private); + spin_unlock_irq(&dev->irq_lock); +} + +static void vduse_vq_irq_inject(struct work_struct *work) +{ + struct vduse_virtqueue *vq = container_of(work, + struct vduse_virtqueue, inject); + + spin_lock_irq(&vq->irq_lock); + if (vq->ready && vq->cb.callback) + vq->cb.callback(vq->cb.private); + spin_unlock_irq(&vq->irq_lock); +} + +static long vduse_dev_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + struct vduse_dev *dev = file->private_data; + void __user *argp = (void __user *)arg; + int ret; + + switch (cmd) { + case VDUSE_IOTLB_GET_FD: { + struct vduse_iotlb_entry entry; + struct vhost_iotlb_map *map; + struct vdpa_map_file *map_file; + struct vduse_iova_domain *domain = dev->domain; + struct file *f = NULL; + + ret = -EFAULT; + if (copy_from_user(&entry, argp, sizeof(entry))) + break; + + ret = -EINVAL; + if (entry.start > entry.last) + break; + + spin_lock(&domain->iotlb_lock); + map = vhost_iotlb_itree_first(domain->iotlb, + entry.start, entry.last); + if (map) { + map_file = (struct vdpa_map_file *)map->opaque; + f = get_file(map_file->file); + entry.offset = map_file->offset; + entry.start = map->start; + entry.last = map->last; + entry.perm = map->perm; + } + spin_unlock(&domain->iotlb_lock); + ret = -EINVAL; + if (!f) + break; + + ret = -EFAULT; + if (copy_to_user(argp, &entry, sizeof(entry))) { + fput(f); + break; + } + ret = receive_fd(f, 
perm_to_file_flags(entry.perm)); + fput(f); + break; + } + case VDUSE_DEV_GET_FEATURES: + ret = put_user(dev->driver_features, (u64 __user *)argp); + break; + case VDUSE_DEV_SET_CONFIG: { + struct vduse_config_data config; + unsigned long size = offsetof(struct vduse_config_data, + buffer); + + ret = -EFAULT; + if (copy_from_user(&config, argp, size)) + break; + + /* Reject offset first so the subtraction below cannot wrap around */ + ret = -EINVAL; + if (config.offset > dev->config_size || + config.length == 0 || + config.length > dev->config_size - config.offset) + break; + + ret = -EFAULT; + if (copy_from_user(dev->config + config.offset, argp + size, + config.length)) + break; + + ret = 0; + break; + } + case VDUSE_DEV_INJECT_IRQ: + ret = 0; + queue_work(vduse_irq_wq, &dev->inject); + break; + case VDUSE_VQ_SETUP: { + struct vduse_vq_config config; + u32 index; + + ret = -EFAULT; + if (copy_from_user(&config, argp, sizeof(config))) + break; + + ret = -EINVAL; + if (config.index >= dev->vq_num) + break; + + index = array_index_nospec(config.index, dev->vq_num); + dev->vqs[index].num_max = config.max_size; + ret = 0; + break; + } + case VDUSE_VQ_GET_INFO: { + struct vduse_vq_info vq_info; + struct vduse_virtqueue *vq; + u32 index; + + ret = -EFAULT; + if (copy_from_user(&vq_info, argp, sizeof(vq_info))) + break; + + ret = -EINVAL; + if (vq_info.index >= dev->vq_num) + break; + + index = array_index_nospec(vq_info.index, dev->vq_num); + vq = &dev->vqs[index]; + vq_info.desc_addr = vq->desc_addr; + vq_info.driver_addr = vq->driver_addr; + vq_info.device_addr = vq->device_addr; + vq_info.num = vq->num; + + if (dev->driver_features & BIT_ULL(VIRTIO_F_RING_PACKED)) { + vq_info.packed.last_avail_counter = + vq->state.packed.last_avail_counter; + vq_info.packed.last_avail_idx = + vq->state.packed.last_avail_idx; + vq_info.packed.last_used_counter = + vq->state.packed.last_used_counter; + vq_info.packed.last_used_idx = + vq->state.packed.last_used_idx; + } else + vq_info.split.avail_index = + vq->state.split.avail_index; + + vq_info.ready = vq->ready; + + ret = -EFAULT; + if
(copy_to_user(argp, &vq_info, sizeof(vq_info))) + break; + + ret = 0; + break; + } + case VDUSE_VQ_SETUP_KICKFD: { + struct vduse_vq_eventfd eventfd; + + ret = -EFAULT; + if (copy_from_user(&eventfd, argp, sizeof(eventfd))) + break; + + ret = vduse_kickfd_setup(dev, &eventfd); + break; + } + case VDUSE_VQ_INJECT_IRQ: { + u32 index; + + ret = -EFAULT; + if (get_user(index, (u32 __user *)argp)) + break; + + ret = -EINVAL; + if (index >= dev->vq_num) + break; + + ret = 0; + index = array_index_nospec(index, dev->vq_num); + queue_work(vduse_irq_wq, &dev->vqs[index].inject); + break; + } + default: + ret = -ENOIOCTLCMD; + break; + } + + return ret; +} + +static int vduse_dev_release(struct inode *inode, struct file *file) +{ + struct vduse_dev *dev = file->private_data; + + spin_lock(&dev->msg_lock); + /* Make sure the inflight messages can be processed after reconnection */ + list_splice_init(&dev->recv_list, &dev->send_list); + spin_unlock(&dev->msg_lock); + dev->connected = false; + + return 0; +} + +static struct vduse_dev *vduse_dev_get_from_minor(int minor) +{ + struct vduse_dev *dev; + + mutex_lock(&vduse_lock); + dev = idr_find(&vduse_idr, minor); + mutex_unlock(&vduse_lock); + + return dev; +} + +static int vduse_dev_open(struct inode *inode, struct file *file) +{ + int ret; + struct vduse_dev *dev = vduse_dev_get_from_minor(iminor(inode)); + + if (!dev) + return -ENODEV; + + ret = -EBUSY; + mutex_lock(&dev->lock); + if (dev->connected) + goto unlock; + + ret = 0; + dev->connected = true; + file->private_data = dev; +unlock: + mutex_unlock(&dev->lock); + + return ret; +} + +static const struct file_operations vduse_dev_fops = { + .owner = THIS_MODULE, + .open = vduse_dev_open, + .release = vduse_dev_release, + .read_iter = vduse_dev_read_iter, + .write_iter = vduse_dev_write_iter, + .poll = vduse_dev_poll, + .unlocked_ioctl = vduse_dev_ioctl, + .compat_ioctl = compat_ptr_ioctl, + .llseek = noop_llseek, +}; + +static struct vduse_dev *vduse_dev_create(void) +{ +
struct vduse_dev *dev = kzalloc(sizeof(*dev), GFP_KERNEL); + + if (!dev) + return NULL; + + mutex_init(&dev->lock); + spin_lock_init(&dev->msg_lock); + INIT_LIST_HEAD(&dev->send_list); + INIT_LIST_HEAD(&dev->recv_list); + spin_lock_init(&dev->irq_lock); + + INIT_WORK(&dev->inject, vduse_dev_irq_inject); + init_waitqueue_head(&dev->waitq); + + return dev; +} + +static void vduse_dev_destroy(struct vduse_dev *dev) +{ + kfree(dev); +} + +static struct vduse_dev *vduse_find_dev(const char *name) +{ + struct vduse_dev *dev; + int id; + + idr_for_each_entry(&vduse_idr, dev, id) + if (!strcmp(dev->name, name)) + return dev; + + return NULL; +} + +static int vduse_destroy_dev(char *name) +{ + struct vduse_dev *dev = vduse_find_dev(name); + + if (!dev) + return -EINVAL; + + mutex_lock(&dev->lock); + if (dev->vdev || dev->connected) { + mutex_unlock(&dev->lock); + return -EBUSY; + } + dev->connected = true; + mutex_unlock(&dev->lock); + + vduse_dev_reset(dev); + device_destroy(vduse_class, MKDEV(MAJOR(vduse_major), dev->minor)); + idr_remove(&vduse_idr, dev->minor); + kvfree(dev->config); + kfree(dev->vqs); + vduse_domain_destroy(dev->domain); + kfree(dev->name); + vduse_dev_destroy(dev); + module_put(THIS_MODULE); + + return 0; +} + +static bool device_is_allowed(u32 device_id) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(allowed_device_id); i++) + if (allowed_device_id[i] == device_id) + return true; + + return false; +} + +static bool features_is_valid(u64 features) +{ + if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) + return false; + + /* Now we only support read-only configuration space */ + if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)) + return false; + + return true; +} + +static bool vduse_validate_config(struct vduse_dev_config *config) +{ + if (config->bounce_size > VDUSE_MAX_BOUNCE_SIZE) + return false; + + if (config->vq_align > PAGE_SIZE) + return false; + + if (config->config_size > PAGE_SIZE) + return false; + + if 
(!device_is_allowed(config->device_id)) + return false; + + if (!features_is_valid(config->features)) + return false; + + return true; +} + +static int vduse_create_dev(struct vduse_dev_config *config, + void *config_buf, u64 api_version) +{ + int i, ret; + struct vduse_dev *dev; + + ret = -EEXIST; + if (vduse_find_dev(config->name)) + goto err; + + ret = -ENOMEM; + dev = vduse_dev_create(); + if (!dev) + goto err; + + dev->api_version = api_version; + dev->device_features = config->features; + dev->device_id = config->device_id; + dev->vendor_id = config->vendor_id; + dev->name = kstrdup(config->name, GFP_KERNEL); + if (!dev->name) + goto err_str; + + dev->domain = vduse_domain_create(VDUSE_IOVA_SIZE - 1, + config->bounce_size); + if (!dev->domain) + goto err_domain; + + dev->config = config_buf; + dev->config_size = config->config_size; + dev->vq_align = config->vq_align; + dev->vq_num = config->vq_num; + dev->vqs = kcalloc(dev->vq_num, sizeof(*dev->vqs), GFP_KERNEL); + if (!dev->vqs) + goto err_vqs; + + for (i = 0; i < dev->vq_num; i++) { + dev->vqs[i].index = i; + INIT_WORK(&dev->vqs[i].inject, vduse_vq_irq_inject); + spin_lock_init(&dev->vqs[i].kick_lock); + spin_lock_init(&dev->vqs[i].irq_lock); + } + + ret = idr_alloc(&vduse_idr, dev, 1, VDUSE_DEV_MAX, GFP_KERNEL); + if (ret < 0) + goto err_idr; + + dev->minor = ret; + dev->dev = device_create(vduse_class, NULL, + MKDEV(MAJOR(vduse_major), dev->minor), + NULL, "%s", config->name); + if (IS_ERR(dev->dev)) { + ret = PTR_ERR(dev->dev); + goto err_dev; + } + __module_get(THIS_MODULE); + + return 0; +err_dev: + idr_remove(&vduse_idr, dev->minor); +err_idr: + kfree(dev->vqs); +err_vqs: + vduse_domain_destroy(dev->domain); +err_domain: + kfree(dev->name); +err_str: + vduse_dev_destroy(dev); +err: + kvfree(config_buf); + return ret; +} + +static long vduse_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + int ret; + void __user *argp = (void __user *)arg; + struct vduse_control *control = 
file->private_data; + + mutex_lock(&vduse_lock); + switch (cmd) { + case VDUSE_GET_API_VERSION: + ret = put_user(control->api_version, (u64 __user *)argp); + break; + case VDUSE_SET_API_VERSION: { + u64 api_version; + + ret = -EFAULT; + if (get_user(api_version, (u64 __user *)argp)) + break; + + ret = -EINVAL; + if (api_version > VDUSE_API_VERSION) + break; + + ret = 0; + control->api_version = api_version; + break; + } + case VDUSE_CREATE_DEV: { + struct vduse_dev_config config; + unsigned long size = offsetof(struct vduse_dev_config, config); + void *buf; + + ret = -EFAULT; + if (copy_from_user(&config, argp, size)) + break; + + ret = -EINVAL; + if (vduse_validate_config(&config) == false) + break; + + buf = vmemdup_user(argp + size, config.config_size); + if (IS_ERR(buf)) { + ret = PTR_ERR(buf); + break; + } + config.name[VDUSE_NAME_MAX - 1] = '\0'; + ret = vduse_create_dev(&config, buf, control->api_version); + break; + } + case VDUSE_DESTROY_DEV: { + char name[VDUSE_NAME_MAX]; + + ret = -EFAULT; + if (copy_from_user(name, argp, VDUSE_NAME_MAX)) + break; + + name[VDUSE_NAME_MAX - 1] = '\0'; + ret = vduse_destroy_dev(name); + break; + } + default: + ret = -EINVAL; + break; + } + mutex_unlock(&vduse_lock); + + return ret; +} + +static int vduse_release(struct inode *inode, struct file *file) +{ + struct vduse_control *control = file->private_data; + + kfree(control); + return 0; +} + +static int vduse_open(struct inode *inode, struct file *file) +{ + struct vduse_control *control; + + control = kmalloc(sizeof(struct vduse_control), GFP_KERNEL); + if (!control) + return -ENOMEM; + + control->api_version = VDUSE_API_VERSION; + file->private_data = control; + + return 0; +} + +static const struct file_operations vduse_ctrl_fops = { + .owner = THIS_MODULE, + .open = vduse_open, + .release = vduse_release, + .unlocked_ioctl = vduse_ioctl, + .compat_ioctl = compat_ptr_ioctl, + .llseek = noop_llseek, +}; + +static char *vduse_devnode(struct device *dev, umode_t *mode) 
+{ + return kasprintf(GFP_KERNEL, "vduse/%s", dev_name(dev)); +} + +static void vduse_mgmtdev_release(struct device *dev) +{ +} + +static struct device vduse_mgmtdev = { + .init_name = "vduse", + .release = vduse_mgmtdev_release, +}; + +static struct vdpa_mgmt_dev mgmt_dev; + +static int vduse_dev_init_vdpa(struct vduse_dev *dev, const char *name) +{ + struct vduse_vdpa *vdev; + int ret; + + if (dev->vdev) + return -EEXIST; + + vdev = vdpa_alloc_device(struct vduse_vdpa, vdpa, dev->dev, + &vduse_vdpa_config_ops, name, true); + if (!vdev) + return -ENOMEM; + + dev->vdev = vdev; + vdev->dev = dev; + vdev->vdpa.dev.dma_mask = &vdev->vdpa.dev.coherent_dma_mask; + ret = dma_set_mask_and_coherent(&vdev->vdpa.dev, DMA_BIT_MASK(64)); + if (ret) { + put_device(&vdev->vdpa.dev); + return ret; + } + set_dma_ops(&vdev->vdpa.dev, &vduse_dev_dma_ops); + vdev->vdpa.dma_dev = &vdev->vdpa.dev; + vdev->vdpa.mdev = &mgmt_dev; + + return 0; +} + +static int vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name) +{ + struct vduse_dev *dev; + int ret; + + mutex_lock(&vduse_lock); + dev = vduse_find_dev(name); + if (!dev || !vduse_dev_is_ready(dev)) { + mutex_unlock(&vduse_lock); + return -EINVAL; + } + ret = vduse_dev_init_vdpa(dev, name); + mutex_unlock(&vduse_lock); + if (ret) + return ret; + + ret = _vdpa_register_device(&dev->vdev->vdpa, dev->vq_num); + if (ret) { + put_device(&dev->vdev->vdpa.dev); + return ret; + } + + return 0; +} + +static void vdpa_dev_del(struct vdpa_mgmt_dev *mdev, struct vdpa_device *dev) +{ + _vdpa_unregister_device(dev); +} + +static const struct vdpa_mgmtdev_ops vdpa_dev_mgmtdev_ops = { + .dev_add = vdpa_dev_add, + .dev_del = vdpa_dev_del, +}; + +static struct virtio_device_id id_table[] = { + { VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID }, + { 0 }, +}; + +static struct vdpa_mgmt_dev mgmt_dev = { + .device = &vduse_mgmtdev, + .id_table = id_table, + .ops = &vdpa_dev_mgmtdev_ops, +}; + +static int vduse_mgmtdev_init(void) +{ + int ret; + + ret = 
device_register(&vduse_mgmtdev); + if (ret) + return ret; + + ret = vdpa_mgmtdev_register(&mgmt_dev); + if (ret) + goto err; + + return 0; +err: + device_unregister(&vduse_mgmtdev); + return ret; +} + +static void vduse_mgmtdev_exit(void) +{ + vdpa_mgmtdev_unregister(&mgmt_dev); + device_unregister(&vduse_mgmtdev); +} + +static int vduse_init(void) +{ + int ret; + struct device *dev; + + vduse_class = class_create(THIS_MODULE, "vduse"); + if (IS_ERR(vduse_class)) + return PTR_ERR(vduse_class); + + vduse_class->devnode = vduse_devnode; + + ret = alloc_chrdev_region(&vduse_major, 0, VDUSE_DEV_MAX, "vduse"); + if (ret) + goto err_chardev_region; + + /* /dev/vduse/control */ + cdev_init(&vduse_ctrl_cdev, &vduse_ctrl_fops); + vduse_ctrl_cdev.owner = THIS_MODULE; + ret = cdev_add(&vduse_ctrl_cdev, vduse_major, 1); + if (ret) + goto err_ctrl_cdev; + + dev = device_create(vduse_class, NULL, vduse_major, NULL, "control"); + if (IS_ERR(dev)) { + ret = PTR_ERR(dev); + goto err_device; + } + + /* /dev/vduse/$DEVICE */ + cdev_init(&vduse_cdev, &vduse_dev_fops); + vduse_cdev.owner = THIS_MODULE; + ret = cdev_add(&vduse_cdev, MKDEV(MAJOR(vduse_major), 1), + VDUSE_DEV_MAX - 1); + if (ret) + goto err_cdev; + + vduse_irq_wq = alloc_workqueue("vduse-irq", + WQ_HIGHPRI | WQ_SYSFS | WQ_UNBOUND, 0); + if (!vduse_irq_wq) { + /* ret is still 0 from cdev_add(), so set an error code explicitly */ + ret = -ENOMEM; + goto err_wq; + } + + ret = vduse_domain_init(); + if (ret) + goto err_domain; + + ret = vduse_mgmtdev_init(); + if (ret) + goto err_mgmtdev; + + return 0; +err_mgmtdev: + vduse_domain_exit(); +err_domain: + destroy_workqueue(vduse_irq_wq); +err_wq: + cdev_del(&vduse_cdev); +err_cdev: + device_destroy(vduse_class, vduse_major); +err_device: + cdev_del(&vduse_ctrl_cdev); +err_ctrl_cdev: + unregister_chrdev_region(vduse_major, VDUSE_DEV_MAX); +err_chardev_region: + class_destroy(vduse_class); + return ret; +} +module_init(vduse_init); + +static void vduse_exit(void) +{ + vduse_mgmtdev_exit(); + vduse_domain_exit(); + destroy_workqueue(vduse_irq_wq); + cdev_del(&vduse_cdev); +
device_destroy(vduse_class, vduse_major); + cdev_del(&vduse_ctrl_cdev); + unregister_chrdev_region(vduse_major, VDUSE_DEV_MAX); + class_destroy(vduse_class); +} +module_exit(vduse_exit); + +MODULE_LICENSE(DRV_LICENSE); +MODULE_AUTHOR(DRV_AUTHOR); +MODULE_DESCRIPTION(DRV_DESC); diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h new file mode 100644 index 000000000000..585fcce398af --- /dev/null +++ b/include/uapi/linux/vduse.h @@ -0,0 +1,221 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_VDUSE_H_ +#define _UAPI_VDUSE_H_ + +#include + +#define VDUSE_BASE 0x81 + +/* The ioctls for control device (/dev/vduse/control) */ + +#define VDUSE_API_VERSION 0 + +/* + * Get the version of the VDUSE API that the kernel supports (VDUSE_API_VERSION). + * This is used for future extension. + */ +#define VDUSE_GET_API_VERSION _IOR(VDUSE_BASE, 0x00, __u64) + +/* Set the version of the VDUSE API that userspace supports. */ +#define VDUSE_SET_API_VERSION _IOW(VDUSE_BASE, 0x01, __u64) + +/* + * The basic configuration of a VDUSE device, which is used by + * VDUSE_CREATE_DEV ioctl to create a VDUSE device. + */ +struct vduse_dev_config { +#define VDUSE_NAME_MAX 256 + char name[VDUSE_NAME_MAX]; /* vduse device name, needs to be NUL terminated */ + __u32 vendor_id; /* virtio vendor id */ + __u32 device_id; /* virtio device id */ + __u64 features; /* virtio features */ + __u64 bounce_size; /* the size of bounce buffer for data transfer */ + __u32 vq_num; /* the number of virtqueues */ + __u32 vq_align; /* the allocation alignment of virtqueue's metadata */ + __u32 reserved[15]; /* for future use */ + __u32 config_size; /* the size of the configuration space */ + __u8 config[0]; /* the buffer of the configuration space */ +}; + +/* Create a VDUSE device which is represented by a char device (/dev/vduse/$NAME) */ +#define VDUSE_CREATE_DEV _IOW(VDUSE_BASE, 0x02, struct vduse_dev_config) + +/* + * Destroy a VDUSE device.
Make sure there are no more references + * to the char device (/dev/vduse/$NAME). + */ +#define VDUSE_DESTROY_DEV _IOW(VDUSE_BASE, 0x03, char[VDUSE_NAME_MAX]) + +/* The ioctls for VDUSE device (/dev/vduse/$NAME) */ + +/* + * The information of one IOVA region, which is retrieved from + * VDUSE_IOTLB_GET_FD ioctl. + */ +struct vduse_iotlb_entry { + __u64 offset; /* the mmap offset on returned file descriptor */ + __u64 start; /* start of the IOVA range: [start, last] */ + __u64 last; /* last of the IOVA range: [start, last] */ +#define VDUSE_ACCESS_RO 0x1 +#define VDUSE_ACCESS_WO 0x2 +#define VDUSE_ACCESS_RW 0x3 + __u8 perm; /* access permission of this region */ +}; + +/* + * Find the first IOVA region that overlaps with the range [start, last] + * and return the corresponding file descriptor. A return value of -EINVAL + * means no such IOVA region exists. The caller should set the start and last fields. + */ +#define VDUSE_IOTLB_GET_FD _IOWR(VDUSE_BASE, 0x10, struct vduse_iotlb_entry) + +/* + * Get the negotiated virtio features. It's a subset of the features in + * struct vduse_dev_config which can be accepted by virtio driver. It's + * only valid after FEATURES_OK status bit is set. + */ +#define VDUSE_DEV_GET_FEATURES _IOR(VDUSE_BASE, 0x11, __u64) + +/* + * The information that is used by VDUSE_DEV_SET_CONFIG ioctl to update + * device configuration space. + */ +struct vduse_config_data { + __u32 offset; /* offset from the beginning of configuration space */ + __u32 length; /* the length to write to configuration space */ + __u8 buffer[0]; /* buffer used to write from */ +}; + +/* Set device configuration space */ +#define VDUSE_DEV_SET_CONFIG _IOW(VDUSE_BASE, 0x12, struct vduse_config_data) + +/* + * Inject a config interrupt. It's usually used to notify virtio driver + * that device configuration space has changed.
+ */ +#define VDUSE_DEV_INJECT_IRQ _IO(VDUSE_BASE, 0x13) + +/* + * The basic configuration of a virtqueue, which is used by + * VDUSE_VQ_SETUP ioctl to setup a virtqueue. + */ +struct vduse_vq_config { + __u32 index; /* virtqueue index */ + __u16 max_size; /* the max size of virtqueue */ +}; + +/* + * Setup the specified virtqueue. Make sure all virtqueues have been + * configured before the device is attached to vDPA bus. + */ +#define VDUSE_VQ_SETUP _IOW(VDUSE_BASE, 0x14, struct vduse_vq_config) + +struct vduse_vq_state_split { + __u16 avail_index; /* available index */ +}; + +struct vduse_vq_state_packed { + __u16 last_avail_counter:1; /* last driver ring wrap counter observed by device */ + __u16 last_avail_idx:15; /* device available index */ + __u16 last_used_counter:1; /* device ring wrap counter */ + __u16 last_used_idx:15; /* used index */ +}; + +/* + * The information of a virtqueue, which is retrieved from + * VDUSE_VQ_GET_INFO ioctl. + */ +struct vduse_vq_info { + __u32 index; /* virtqueue index */ + __u32 num; /* the size of virtqueue */ + __u64 desc_addr; /* address of desc area */ + __u64 driver_addr; /* address of driver area */ + __u64 device_addr; /* address of device area */ + union { + struct vduse_vq_state_split split; /* split virtqueue state */ + struct vduse_vq_state_packed packed; /* packed virtqueue state */ + }; + __u8 ready; /* ready status of virtqueue */ +}; + +/* Get the specified virtqueue's information. Caller should set index field. */ +#define VDUSE_VQ_GET_INFO _IOWR(VDUSE_BASE, 0x15, struct vduse_vq_info) + +/* + * The eventfd configuration for the specified virtqueue. It's used by + * VDUSE_VQ_SETUP_KICKFD ioctl to setup kick eventfd. + */ +struct vduse_vq_eventfd { + __u32 index; /* virtqueue index */ +#define VDUSE_EVENTFD_DEASSIGN -1 + int fd; /* eventfd, -1 means de-assigning the eventfd */ +}; + +/* + * Setup kick eventfd for specified virtqueue. 
The kick eventfd is used + * by VDUSE kernel module to notify userspace to consume the avail vring. + */ +#define VDUSE_VQ_SETUP_KICKFD _IOW(VDUSE_BASE, 0x16, struct vduse_vq_eventfd) + +/* + * Inject an interrupt for specific virtqueue. It's used to notify virtio driver + * to consume the used vring. + */ +#define VDUSE_VQ_INJECT_IRQ _IOW(VDUSE_BASE, 0x17, __u32) + +/* The control messages definition for read/write on /dev/vduse/$NAME */ + +enum vduse_req_type { + /* Get the state for specified virtqueue from userspace */ + VDUSE_GET_VQ_STATE, + /* Set the device status */ + VDUSE_SET_STATUS, + /* + * Notify userspace to update the memory mapping for specified + * IOVA range via VDUSE_IOTLB_GET_FD ioctl + */ + VDUSE_UPDATE_IOTLB, +}; + +struct vduse_vq_state { + __u32 index; /* virtqueue index */ + union { + struct vduse_vq_state_split split; /* split virtqueue state */ + struct vduse_vq_state_packed packed; /* packed virtqueue state */ + }; +}; + +struct vduse_dev_status { + __u8 status; /* device status */ +}; + +struct vduse_iova_range { + __u64 start; /* start of the IOVA range: [start, last] */ + __u64 last; /* last of the IOVA range: [start, last] */ +}; + +struct vduse_dev_request { + __u32 type; /* request type */ + __u32 request_id; /* request id */ + __u32 reserved[2]; /* for future use */ + union { + struct vduse_vq_state vq_state; /* virtqueue state, only use index */ + struct vduse_dev_status s; /* device status */ + struct vduse_iova_range iova; /* IOVA range for updating */ + __u32 padding[16]; /* padding */ + }; +}; + +struct vduse_dev_response { + __u32 request_id; /* corresponding request id */ +#define VDUSE_REQ_RESULT_OK 0x00 +#define VDUSE_REQ_RESULT_FAILED 0x01 + __u32 result; /* the result of request */ + __u32 reserved[2]; /* for future use */ + union { + struct vduse_vq_state vq_state; /* virtqueue state */ + __u32 padding[16]; /* padding */ + }; +}; + +#endif /* _UAPI_VDUSE_H_ */ From patchwork Tue Jul 13 08:46:56 2021 Content-Type:
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 477347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BABFC11F67 for ; Tue, 13 Jul 2021 08:49:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 47467613C2 for ; Tue, 13 Jul 2021 08:49:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235350AbhGMIwO (ORCPT ); Tue, 13 Jul 2021 04:52:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235424AbhGMIvo (ORCPT ); Tue, 13 Jul 2021 04:51:44 -0400 Received: from mail-pj1-x1035.google.com (mail-pj1-x1035.google.com [IPv6:2607:f8b0:4864:20::1035]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 72DC7C061793 for ; Tue, 13 Jul 2021 01:48:27 -0700 (PDT) Received: by mail-pj1-x1035.google.com with SMTP id me13-20020a17090b17cdb0290173bac8b9c9so980097pjb.3 for ; Tue, 13 Jul 2021 01:48:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=uMm6a9p0EzX9DUYqGCBIk5+JBvK8mtKaW/5v/siepz8=; b=FwrQiD+0uRPLUN+6Ju88xP5C02Bvk9OqVShw9WcsJdiLxu+YCLfDld1XCGfjLU1nSl 918yKE4A5ColszxQDG/4pM1sXcDtN4ghfirvX+RsqtkNJyYamqqNi2VjbvDtJuJMdMEH 
DBnVAKFBCd1A3Mu8H5eyl/2LzIPpp/+RJW7nXs1sv+9wD9NX2G2ZW51nr6oAKS/uh/Re no24BT5m/sJN3evL+WSKCld+Bzy52NIw84s+0QKBgTE+n5jtfL7elaULaVgn89viBuwz oYoSGCLRyfky8JDQWJQ44g6XrECuYo7lecJLBNwO2UB7NYtQMZoIpWP32eSw3XP6Zz0y tt/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=uMm6a9p0EzX9DUYqGCBIk5+JBvK8mtKaW/5v/siepz8=; b=XapcdQGY8ylb6ggbantzyZjL742jOrWWZ82Ipw4JdJ1sysqNbTB7laSS6PE31aGVf8 Gkm3qqaL/E4LUaXXo+xXxNlxW9t0Xmw5OgRRu/QmihdD6OxYbmwDywO8DI2FXcSv1cAb tyM5+GLXq7BLBopKE+k8oG7+Uv8lTiQAn2t4Pac+ZwOhEAWWqgXl21PIw/0UgW0LyeHV HeKFen89lB8FuAqocPv5Td2ZXnhQ5pycBDm+xBiFoEbBLDbz4tJvlAMbKaGoo650EJd/ ZKRyK88+LSeQcIc5dhFMi7fToulv0R85QI509dXkIJkAc+4jiAelh2G5ugEy58UPl304 FpIA== X-Gm-Message-State: AOAM531SSgxRdyrir/c6RaD1N5fbTRkwnhKBw7hz41ACQ6vn+jS5fwj7 wotxmV4f0Q1hpytgclmwO1cF X-Google-Smtp-Source: ABdhPJy41kKAoDHW9SvVVXjbdWKoO4Hb5f5MeAoUzDaUJYwvEJY6gTCT1v+c76dyqcdtmYhkN5UdhA== X-Received: by 2002:a17:90a:8585:: with SMTP id m5mr3319949pjn.224.1626166106915; Tue, 13 Jul 2021 01:48:26 -0700 (PDT) Received: from localhost ([139.177.225.253]) by smtp.gmail.com with ESMTPSA id a23sm17961927pff.43.2021.07.13.01.48.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 01:48:26 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com, joro@8bytes.org, gregkh@linuxfoundation.org, zhe.he@windriver.com, xiaodong.liu@intel.com Cc: songmuchun@bytedance.com, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org, 
linux-kernel@vger.kernel.org Subject: [PATCH v9 17/17] Documentation: Add documentation for VDUSE Date: Tue, 13 Jul 2021 16:46:56 +0800 Message-Id: <20210713084656.232-18-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210713084656.232-1-xieyongji@bytedance.com> References: <20210713084656.232-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org VDUSE (vDPA Device in Userspace) is a framework to support implementing software-emulated vDPA devices in userspace. This document is intended to clarify the VDUSE design and usage. Signed-off-by: Xie Yongji --- Documentation/userspace-api/index.rst | 1 + Documentation/userspace-api/vduse.rst | 248 ++++++++++++++++++++++++++++++++++ 2 files changed, 249 insertions(+) create mode 100644 Documentation/userspace-api/vduse.rst diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst index 0b5eefed027e..c432be070f67 100644 --- a/Documentation/userspace-api/index.rst +++ b/Documentation/userspace-api/index.rst @@ -27,6 +27,7 @@ place where this information is gathered. iommu media/index sysfs-platform_profile + vduse .. only:: subproject and html diff --git a/Documentation/userspace-api/vduse.rst b/Documentation/userspace-api/vduse.rst new file mode 100644 index 000000000000..2c0d56d4b2da --- /dev/null +++ b/Documentation/userspace-api/vduse.rst @@ -0,0 +1,248 @@ +================================== +VDUSE - "vDPA Device in Userspace" +================================== + +A vDPA (virtio data path acceleration) device is a device that uses a +datapath which complies with the virtio specifications together with a +vendor-specific control path. vDPA devices can be either physically located +on the hardware or emulated by software. VDUSE is a framework that makes it +possible to implement software-emulated vDPA devices in userspace.
To +make the device emulation more secure, the emulated vDPA device's +control path is handled in the kernel and only the data path is +implemented in userspace. + +Note that the VDUSE framework currently supports only the virtio block device, +which reduces security risks when the userspace process that implements +the data path is run by an unprivileged user. Support for other device +types can be added once the security issues of the corresponding device +drivers are clarified or fixed. + +Start/Stop VDUSE devices +------------------------ + +VDUSE devices are started as follows: + +1. Create a new VDUSE instance with ioctl(VDUSE_CREATE_DEV) on + /dev/vduse/control. + +2. Set up each virtqueue with ioctl(VDUSE_VQ_SETUP) on /dev/vduse/$NAME. + +3. Begin processing VDUSE messages from /dev/vduse/$NAME. The first + messages will arrive while attaching the VDUSE instance to the vDPA bus. + +4. Send the VDPA_CMD_DEV_NEW netlink message to attach the VDUSE + instance to the vDPA bus. + +VDUSE devices are stopped as follows: + +1. Send the VDPA_CMD_DEV_DEL netlink message to detach the VDUSE + instance from the vDPA bus. + +2. Close the file descriptor referring to /dev/vduse/$NAME. + +3. Destroy the VDUSE instance with ioctl(VDUSE_DESTROY_DEV) on + /dev/vduse/control. + +The netlink messages can be sent via the vdpa tool in iproute2 or with the +sample code below: + +..
code-block:: c + + static int netlink_add_vduse(const char *name, enum vdpa_command cmd) + { + struct nl_sock *nlsock; + struct nl_msg *msg; + int famid; + + nlsock = nl_socket_alloc(); + if (!nlsock) + return -ENOMEM; + + if (genl_connect(nlsock)) + goto free_sock; + + famid = genl_ctrl_resolve(nlsock, VDPA_GENL_NAME); + if (famid < 0) + goto close_sock; + + msg = nlmsg_alloc(); + if (!msg) + goto close_sock; + + if (!genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, famid, 0, 0, cmd, 0)) + goto nla_put_failure; + + NLA_PUT_STRING(msg, VDPA_ATTR_DEV_NAME, name); + if (cmd == VDPA_CMD_DEV_NEW) + NLA_PUT_STRING(msg, VDPA_ATTR_MGMTDEV_DEV_NAME, "vduse"); + + if (nl_send_sync(nlsock, msg)) + goto close_sock; + + nl_close(nlsock); + nl_socket_free(nlsock); + + return 0; + nla_put_failure: + nlmsg_free(msg); + close_sock: + nl_close(nlsock); + free_sock: + nl_socket_free(nlsock); + return -1; + } + +How VDUSE works +--------------- + +As mentioned above, a VDUSE device is created by ioctl(VDUSE_CREATE_DEV) on +/dev/vduse/control. With this ioctl, userspace can specify some basic configuration +such as the device name (which uniquely identifies a VDUSE device), virtio features, +virtio configuration space, bounce buffer size and so on for this emulated device. Then +a char device interface (/dev/vduse/$NAME) is exported to userspace for device +emulation. Userspace can use the VDUSE_VQ_SETUP ioctl on /dev/vduse/$NAME to +add per-virtqueue configuration such as the max size of the virtqueue to the device. + +After the initialization, the VDUSE device can be attached to the vDPA bus via +the VDPA_CMD_DEV_NEW netlink message. Userspace needs to read() from and write() to +/dev/vduse/$NAME to receive control messages from, and reply to, the VDUSE kernel +module as follows: + +..
code-block:: c + + static int vduse_message_handler(int dev_fd) + { + int len; + struct vduse_dev_request req; + struct vduse_dev_response resp = { 0 }; + + len = read(dev_fd, &req, sizeof(req)); + if (len != sizeof(req)) + return -1; + + resp.request_id = req.request_id; + + switch (req.type) { + + /* handle different types of message */ + + } + + len = write(dev_fd, &resp, sizeof(resp)); + if (len != sizeof(resp)) + return -1; + + return 0; + } + +There are now three types of messages introduced by the VDUSE framework: + +- VDUSE_GET_VQ_STATE: Get the state of a virtqueue. Userspace should return + the avail index for a split virtqueue, or the device/driver ring wrap counters + and the avail and used indexes for a packed virtqueue. + +- VDUSE_SET_STATUS: Set the device status. Userspace should follow + the virtio spec: https://docs.oasis-open.org/virtio/virtio/v1.1/virtio-v1.1.html + to process this message. For example, fail to set the FEATURES_OK device + status bit if the device cannot accept the negotiated virtio features + obtained from the VDUSE_DEV_GET_FEATURES ioctl. + +- VDUSE_UPDATE_IOTLB: Notify userspace to update the memory mapping for the specified + IOVA range. Userspace should first remove the old mapping, then set up the new + mapping via the VDUSE_IOTLB_GET_FD ioctl. + +After the DRIVER_OK status bit is set via the VDUSE_SET_STATUS message, userspace is +able to start the dataplane processing with the help of the ioctls below: + +- VDUSE_IOTLB_GET_FD: Find the first IOVA region that overlaps with the specified + range [start, last] and return the corresponding file descriptor. In vhost-vdpa + cases, it might be a full chunk of guest RAM. And in virtio-vdpa cases, it should + be the whole bounce buffer or the memory region that stores one virtqueue's + metadata (descriptor table, available ring and used ring). Userspace can access + this IOVA region by passing fd and corresponding size, offset, perm to mmap(). + For example: + +..
code-block:: c + + static int perm_to_prot(uint8_t perm) + { + int prot = 0; + + switch (perm) { + case VDUSE_ACCESS_WO: + prot |= PROT_WRITE; + break; + case VDUSE_ACCESS_RO: + prot |= PROT_READ; + break; + case VDUSE_ACCESS_RW: + prot |= PROT_READ | PROT_WRITE; + break; + } + + return prot; + } + + static void *iova_to_va(int dev_fd, uint64_t iova, uint64_t *len) + { + int fd; + void *addr; + size_t size; + struct vduse_iotlb_entry entry; + + entry.start = iova; + entry.last = iova; + fd = ioctl(dev_fd, VDUSE_IOTLB_GET_FD, &entry); + if (fd < 0) + return NULL; + + size = entry.last - entry.start + 1; + *len = entry.last - iova + 1; + addr = mmap(0, size, perm_to_prot(entry.perm), MAP_SHARED, + fd, entry.offset); + close(fd); + if (addr == MAP_FAILED) + return NULL; + + /* + * Use a data structure such as a linked list to store + * the iotlb mapping. The munmap(2) should be called for the + * cached mapping when the corresponding VDUSE_UPDATE_IOTLB + * message is received or the device is reset. + */ + + return addr + iova - entry.start; + } + +- VDUSE_VQ_GET_INFO: Get the specified virtqueue's information including the size, + the IOVAs of descriptor table, available ring and used ring, the state + and the ready status. The IOVAs should be passed to the VDUSE_IOTLB_GET_FD ioctl + so that userspace can access the descriptor table, available ring and used ring. + +- VDUSE_VQ_SETUP_KICKFD: Set up the kick eventfd for the specified virtqueue. + The kick eventfd is used by VDUSE kernel module to notify userspace to consume + the available ring. + +- VDUSE_VQ_INJECT_IRQ: Inject an interrupt for specific virtqueue. It's used to + notify virtio driver to consume the used ring. + +More details on the uAPI can be found in include/uapi/linux/vduse.h. + +MMU-based IOMMU Driver +---------------------- + +The VDUSE framework implements an MMU-based on-chip IOMMU driver to support +mapping the kernel DMA buffer into the userspace IOVA region dynamically.
+This is mainly designed for the virtio-vdpa case (kernel virtio drivers). + +The basic idea behind this driver is to treat the MMU (VA->PA) as an IOMMU (IOVA->PA). +The driver sets up an MMU mapping instead of an IOMMU mapping for the DMA transfer +so that the userspace process can use its virtual addresses to access +the DMA buffer in the kernel. + +To avoid security issues, a bounce-buffering mechanism is introduced to +prevent userspace from directly accessing the original buffer, which may contain other +kernel data. During mapping and unmapping, the driver copies the data from +the original buffer to the bounce buffer and back, depending on the direction of +the transfer. The bounce-buffer addresses are mapped into the user address +space instead of the original ones.
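The direction-dependent copying described above can be illustrated with a small, self-contained userspace sketch. This is not the VDUSE kernel code and it is not part of this patch. The names struct bounce_map, bounce_map_buf(), bounce_unmap_buf(), enum xfer_dir and BOUNCE_SIZE are hypothetical and exist only for this illustration. Zeroing the bounce area before use is likewise an assumption of the sketch, not something the document states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed size; a real device takes it from bounce_size. */
#define BOUNCE_SIZE 64

/* Hypothetical transfer directions, seen from the driver's side. */
enum xfer_dir {
	XFER_TO_DEVICE,		/* driver -> device, e.g. a write request */
	XFER_FROM_DEVICE,	/* device -> driver, e.g. read data */
	XFER_BIDIRECTIONAL,
};

struct bounce_map {
	void *orig;			/* original buffer, never exposed */
	size_t len;
	enum xfer_dir dir;
	unsigned char bounce[BOUNCE_SIZE]; /* the only memory the device side sees */
};

/*
 * "Map": hand out the bounce area instead of the original buffer.
 * Data is copied in only if the device is going to read it.
 * Returns NULL if the request does not fit into the bounce area.
 */
static void *bounce_map_buf(struct bounce_map *m, void *orig, size_t len,
			    enum xfer_dir dir)
{
	if (len > BOUNCE_SIZE)
		return NULL;

	m->orig = orig;
	m->len = len;
	m->dir = dir;
	/* Assumption of this sketch: clear stale bytes before exposing. */
	memset(m->bounce, 0, sizeof(m->bounce));
	if (dir == XFER_TO_DEVICE || dir == XFER_BIDIRECTIONAL)
		memcpy(m->bounce, orig, len);

	return m->bounce;
}

/*
 * "Unmap": copy back into the original buffer only if the device
 * was allowed to produce data for the driver.
 */
static void bounce_unmap_buf(struct bounce_map *m)
{
	assert(m->orig != NULL);	/* must have been mapped before */

	if (m->dir == XFER_FROM_DEVICE || m->dir == XFER_BIDIRECTIONAL)
		memcpy(m->orig, m->bounce, m->len);
	m->orig = NULL;
	m->len = 0;
}
```

In this sketch, a write to the bounce area of an XFER_TO_DEVICE mapping never reaches the original buffer. Data written to an XFER_FROM_DEVICE mapping becomes visible in the original buffer only after bounce_unmap_buf() is called.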