From patchwork Wed Mar 31 08:05:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 413407 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21DD6C433F1 for ; Wed, 31 Mar 2021 08:07:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DC0CB619C9 for ; Wed, 31 Mar 2021 08:07:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234340AbhCaIGy (ORCPT ); Wed, 31 Mar 2021 04:06:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47358 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234123AbhCaIGe (ORCPT ); Wed, 31 Mar 2021 04:06:34 -0400 Received: from mail-pj1-x102b.google.com (mail-pj1-x102b.google.com [IPv6:2607:f8b0:4864:20::102b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BF276C061574 for ; Wed, 31 Mar 2021 01:06:21 -0700 (PDT) Received: by mail-pj1-x102b.google.com with SMTP id x7-20020a17090a2b07b02900c0ea793940so761062pjc.2 for ; Wed, 31 Mar 2021 01:06:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Qd7q0+7mMIaCO7kNrN03miiLmUUotvU02ar15rGLAYA=; b=TC4wgya5iSyGQZApckwILsZ07cID8JY+wSUKJkkjSK6QenG/Nmjt6rFLbaVUVwLWuR qZUC8Y0GjUhiGVid8GxHZKbu9gf4fU+lvhfmVfPgGy+5jgQjVK7BNWsvr6SfgOEkycaZ 5vq2ipI4hpOZ0O2rO+5j+j/r4ClYjWkyP2C+gZCHd4C/hmSUIOJRWa8irRXib6jNL6bu kD4W62kB3IbxjCgNwxzpm1RKEIwtcxPZUTI1DX3Y9RXd+XOhqYwQUafQpdnDpvkcgH0M D8Gqf8VvrAIYcVLhc6Mm5fymUe/OERG1iFMO5tcrA4RGKFIGu9cfuHfcaCK9xR+LNdNI 8BIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Qd7q0+7mMIaCO7kNrN03miiLmUUotvU02ar15rGLAYA=; b=gn77aHI+s5xghUXDqjALfUv2JcU5DHKcHBRecZhXfM02g5Ut8vVp1rrEL83snTqqQk SQ0PjKOAnq+360E7UIZl0GCpNAIRhZDmPOXHjKW1Wvx8gWN7yoqWWWS7DEciJo3qiAVh gPf7V1D0Xm6YAoz4dMrDiFQ1OPuziawlf59n6gP00yW0WO5LcxOb0ONKqW4VuTQjsnds vjDADZ/KqeKH+M99Ly6B/P15QSOKeI3D6YHY+ktfZ++nYpmqt2yscGUWndYmDAOZKT15 lazwZJRWcp6yk+yHC9BDC26bIVIkmUFpNIuqkiB7ZmP3LyomUwxRLuZSy8KDjPXQ84YT zweA== X-Gm-Message-State: AOAM532XxjqNUqVdCT9mGLWS+fR6vIKXDGYOizY5E6jzUOEQuFj1OJaY PWzhoeNAJPBcH6tlVN5BDrZY X-Google-Smtp-Source: ABdhPJydrF0nk0d3cE3vx8KgidDGa8Zfry5KovS8f39HDenv+8gD9ih8HKxbhTFZ2J9TJmZpWj0qlg== X-Received: by 2002:a17:90a:fb83:: with SMTP id cp3mr2363661pjb.33.1617177981360; Wed, 31 Mar 2021 01:06:21 -0700 (PDT) Received: from localhost ([139.177.225.243]) by smtp.gmail.com with ESMTPSA id 23sm1644744pgo.53.2021.03.31.01.06.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 01:06:20 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH v6 01/10] file: Export receive_fd() to modules Date: Wed, 31 Mar 2021 16:05:10 +0800 Message-Id: <20210331080519.172-2-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331080519.172-1-xieyongji@bytedance.com> References: <20210331080519.172-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Export receive_fd() so that some modules can use it to pass file descriptor between processes without missing any security stuffs. Signed-off-by: Xie Yongji --- fs/file.c | 6 ++++++ include/linux/file.h | 7 +++---- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/file.c b/fs/file.c index dab120b71e44..d7d957217576 100644 --- a/fs/file.c +++ b/fs/file.c @@ -1108,6 +1108,12 @@ int __receive_fd(int fd, struct file *file, int __user *ufd, unsigned int o_flag return new_fd; } +int receive_fd(struct file *file, unsigned int o_flags) +{ + return __receive_fd(-1, file, NULL, o_flags); +} +EXPORT_SYMBOL(receive_fd); + static int ksys_dup3(unsigned int oldfd, unsigned int newfd, int flags) { int err = -EBADF; diff --git a/include/linux/file.h b/include/linux/file.h index 225982792fa2..4667f9567d3e 100644 --- a/include/linux/file.h +++ b/include/linux/file.h @@ -94,6 +94,9 @@ extern void fd_install(unsigned int fd, struct file *file); extern int __receive_fd(int fd, struct file *file, int __user *ufd, unsigned int o_flags); + +extern int receive_fd(struct file *file, unsigned int o_flags); + static inline int receive_fd_user(struct file *file, int __user *ufd, unsigned int o_flags) { @@ -101,10 +104,6 @@ static inline int receive_fd_user(struct file *file, int __user *ufd, return -EFAULT; return __receive_fd(-1, file, ufd, o_flags); } -static inline int receive_fd(struct file *file, unsigned int o_flags) -{ - return __receive_fd(-1, file, NULL, o_flags); -} static inline int receive_fd_replace(int fd, struct file *file, unsigned int o_flags) { return __receive_fd(fd, file, NULL, o_flags); From patchwork Wed Mar 31 08:05:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 413408 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DDE70C433E4 for ; Wed, 31 Mar 2021 08:07:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A8627619D8 for ; Wed, 31 Mar 2021 08:07:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234327AbhCaIGw (ORCPT ); Wed, 31 Mar 2021 04:06:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234101AbhCaIG3 (ORCPT ); Wed, 31 Mar 2021 04:06:29 -0400 Received: from mail-pg1-x533.google.com (mail-pg1-x533.google.com [IPv6:2607:f8b0:4864:20::533]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DDD7FC06175F for ; Wed, 31 Mar 2021 01:06:28 -0700 (PDT) Received: by mail-pg1-x533.google.com with SMTP id q10so2449928pgj.2 for ; Wed, 31 Mar 2021 01:06:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=f7Vhf0sKPolvZdmmoQbWwDj+eTIPxki+8IuZB8c0Kq4=; b=r75Rzrne30uVZcUUdNkOzD94W4ROnkMv7fAN57hx5SsqQiycs7arb/U8T5gr66kqY2 fsJYjTgVulWItUoSriq/FcqQzPwx3Fz0XkL0dAwW6DR6TiQZ5fh9zqob5EY+4Shmuo59 UI6B0NIdzdqg38KAIztpGrWMwp/WjmNC3nfJE9c7zNUk4GE3UIQHNBQrvi1D/vVoIO+h hsdx37JQ3jSo2oFjNRGiYasaDMC3P1a1pxm9YUHLvG+1y1GhwqybDH8jya7I36j0Of6Q RB3uA8sHEhyNHiGFYH28UOTktfqnH1hbLdQ+8d4laajfaXV5EVw6897aE7LGfaIq5Agv Oyzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=f7Vhf0sKPolvZdmmoQbWwDj+eTIPxki+8IuZB8c0Kq4=; b=BsNdzT55gB2CC7ga/B6gUx+Nq2CLNcVN/M3MhV59ziKlfsHmHXumj6HYppNEfTvz2L sIALBIBkzbWbJtrtSPIgHtM/mMSf2S9fdzqzEm1eYhWM5khzrQ0akYzJiuPhVZMLzPvr oby70y7/mbyP6CGIY1Ch1nUDWV2KTfHsf3YRvbKVKtWJ+9aVNeQRnWAG825umerF+JYL MUQW1futy96nXdYDiQTqIJ9EodAJi/uzrquU6dMOcCaYXTKOa30ATbpKGg2Df5H2rR2z nvj5icDx6SmIPSIrROkqGoizu8p5HsCButGK5LEFZJZpLkqIfrjmqM8J6cfTmf/MxXme RXyg== X-Gm-Message-State: AOAM531LeJjg7CTMPgIuJLpNRfqgY3FVVsaY1eBmEvajzuq9wfIMdcmF 63ECT//6zffJM3RvXgjeiyfP X-Google-Smtp-Source: ABdhPJw5SdAYJYwggqFPCWKc2zs9T35K7EWcA/8fPQg9WcIHxPAKmkso7yE0S4NhyNQOP74BaSA/yw== X-Received: by 2002:a63:1144:: with SMTP id 4mr2075169pgr.333.1617177988462; Wed, 31 Mar 2021 01:06:28 -0700 (PDT) Received: from localhost ([139.177.225.243]) by smtp.gmail.com with ESMTPSA id 4sm1293560pjl.51.2021.03.31.01.06.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 01:06:28 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH v6 03/10] vhost-vdpa: protect concurrent access to vhost device iotlb Date: Wed, 31 Mar 2021 16:05:12 +0800 Message-Id: <20210331080519.172-4-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331080519.172-1-xieyongji@bytedance.com> References: <20210331080519.172-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Use vhost_dev->mutex to protect vhost device iotlb from concurrent access. Fixes: 4c8cf318("vhost: introduce vDPA-based backend") Cc: stable@vger.kernel.org Signed-off-by: Xie Yongji Acked-by: Jason Wang Reviewed-by: Stefano Garzarella --- drivers/vhost/vdpa.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 3947fbc2d1d5..63b28d3aee7c 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -725,9 +725,11 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, const struct vdpa_config_ops *ops = vdpa->config; int r = 0; + mutex_lock(&dev->mutex); + r = vhost_dev_check_owner(dev); if (r) - return r; + goto unlock; switch (msg->type) { case VHOST_IOTLB_UPDATE: @@ -748,6 +750,8 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, r = -EINVAL; break; } +unlock: + mutex_unlock(&dev->mutex); return r; } From patchwork Wed Mar 31 08:05:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 413406 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CFE72C433FB for ; Wed, 31 Mar 2021 08:07:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BB137619D6 for ; Wed, 31 Mar 2021 08:07:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234358AbhCaIG5 (ORCPT ); Wed, 31 Mar 2021 04:06:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47374 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234218AbhCaIGk (ORCPT ); Wed, 31 Mar 2021 04:06:40 -0400 Received: from mail-pg1-x534.google.com (mail-pg1-x534.google.com [IPv6:2607:f8b0:4864:20::534]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C810C061574 for ; Wed, 31 Mar 2021 01:06:39 -0700 (PDT) Received: by mail-pg1-x534.google.com with SMTP id h25so13656063pgm.3 for ; Wed, 31 Mar 2021 01:06:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=rtJRvLLxqYfbQp3k0gxj2srSIbASh6xiOG9Mep0BSmk=; b=w5hDeZv0IPaw+gqFRpEaMlZdT7oT4DNs6bmIznvJK7+yn5Op4/jEIFUooqO/yf7Odf foqq+oRANL2424ymkzYHi9jqsgdZ5LWNeumtpctkH/Hd+LoXsWnwJp1aZ88R3Q1X7y2Z FAWenN3c5A3UX6ZnkfyoM28Eri6ELfGfCT3mUSRH0zh4kXBbLEmRTdB/mDR+ZU20SLef MBTaBZhn6XNz7Gq113Pw45gBApk6HQ58W4e9dXp2kXsfuOu+Nq6sT6q6Xm7SCJp4VuXa U5JLZJazm/OgF0+fwBP1zXBjCt6WN15OALNq0ZkozXUACtF0PKU4xgtQofTuVi9Jktcp nnAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=rtJRvLLxqYfbQp3k0gxj2srSIbASh6xiOG9Mep0BSmk=; b=KF2racky1QqeBCtlpTtzH/+68re2ez2lC50Jtjb6XD1LB4KzSDqdVJOt5Ah3tX1EQp OojE8K7B3eFIXHEfqvNR4A7tZ2ufqMSn58xDG1azO/EuwNvEz7QCujKIlDrWQxzd+xus sWSQawYADxSWqIgpJ+luDLMuMKBEDz+W8y93MjL0P0GBLvTlIrB2f/S2r02QtcjuDDyh ASoLEk1S3l4K5tKbrXL4SpaQM7Mj6KPPjV3/IyeECvBqzShTLCPPiU4/I7r0tjP6gqut 2BsBXS+h6t2iOKiby2zkWK6XSqFoxDaWPHulJ8tVSddI3FZCinmVPGt7sDpctGQBvvix oUfg== X-Gm-Message-State: AOAM5337Ho6YwBQTAdxcMZS9wd1cCnasf4KaxgTkrAj/1f2LaqAoA7zg TnEPSoOuiXu8+W2xkUEzUPGw X-Google-Smtp-Source: ABdhPJxykFPDlbo5t2zr7wx1yYTSCAqgzyvVTf6J3sWPsHKgXsLVdyCoeIH5gAzysavU9FcOJicdAg== X-Received: by 2002:a63:2321:: with SMTP id j33mr2142411pgj.120.1617177999238; Wed, 31 Mar 2021 01:06:39 -0700 (PDT) Received: from localhost ([139.177.225.243]) by smtp.gmail.com with ESMTPSA id d24sm1333563pjw.37.2021.03.31.01.06.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 01:06:38 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH v6 06/10] vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap() Date: Wed, 31 Mar 2021 16:05:15 +0800 Message-Id: <20210331080519.172-7-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331080519.172-1-xieyongji@bytedance.com> References: <20210331080519.172-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The upcoming patch is going to support VA mapping/unmapping. So let's factor out the logic of PA mapping/unmapping firstly to make the code more readable. Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vhost/vdpa.c | 53 +++++++++++++++++++++++++++++++++------------------- 1 file changed, 34 insertions(+), 19 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 22cab98610a1..f9aab9013745 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -483,7 +483,7 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, return r; } -static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) +static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last) { struct vhost_dev *dev = &v->vdev; struct vhost_iotlb *iotlb = dev->iotlb; @@ -505,6 +505,11 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) } } +static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) +{ + return vhost_vdpa_pa_unmap(v, start, last); +} + static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v) { struct vhost_dev *dev = &v->vdev; @@ -585,37 +590,28 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) } } -static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, - struct vhost_iotlb_msg *msg) +static int vhost_vdpa_pa_map(struct vhost_vdpa *v, + u64 iova, u64 size, u64 uaddr, u32 perm) { struct vhost_dev *dev = &v->vdev; - struct vhost_iotlb *iotlb = dev->iotlb; struct page **page_list; unsigned long list_size = PAGE_SIZE / sizeof(struct page *); unsigned int gup_flags = FOLL_LONGTERM; unsigned long npages, cur_base, map_pfn, last_pfn = 0; unsigned long lock_limit, sz2pin, nchunks, i; - u64 iova = msg->iova; + u64 start = iova; long pinned; int ret = 0; - if (msg->iova < v->range.first || - msg->iova + msg->size - 1 > v->range.last) - return -EINVAL; - - if (vhost_iotlb_itree_first(iotlb, msg->iova, - msg->iova + msg->size - 1)) - return -EEXIST; - /* Limit the use of memory for bookkeeping */ page_list = (struct page **) __get_free_page(GFP_KERNEL); if (!page_list) return -ENOMEM; - if (msg->perm & VHOST_ACCESS_WO) + if (perm & VHOST_ACCESS_WO) gup_flags |= FOLL_WRITE; - npages = PAGE_ALIGN(msg->size + (iova & ~PAGE_MASK)) >> PAGE_SHIFT; + npages = PAGE_ALIGN(size + (iova & ~PAGE_MASK)) >> PAGE_SHIFT; if (!npages) { ret = -EINVAL; goto free; @@ -629,7 +625,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, goto unlock; } - cur_base = msg->uaddr & PAGE_MASK; + cur_base = uaddr & PAGE_MASK; iova &= PAGE_MASK; nchunks = 0; @@ -660,7 +656,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT; ret = vhost_vdpa_map(v, iova, csize, map_pfn << PAGE_SHIFT, - msg->perm); + perm); if (ret) { /* * Unpin the pages that are left unmapped @@ -689,7 +685,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, /* Pin the rest chunk */ ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT, - map_pfn << PAGE_SHIFT, msg->perm); + map_pfn << PAGE_SHIFT, perm); out: if (ret) { if (nchunks) { @@ -708,13 +704,32 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, for (pfn = map_pfn; pfn <= last_pfn; pfn++) unpin_user_page(pfn_to_page(pfn)); } - vhost_vdpa_unmap(v, msg->iova, msg->size); + vhost_vdpa_unmap(v, start, size); } unlock: mmap_read_unlock(dev->mm); free: free_page((unsigned long)page_list); return ret; + +} + +static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, + struct vhost_iotlb_msg *msg) +{ + struct vhost_dev *dev = &v->vdev; + struct vhost_iotlb *iotlb = dev->iotlb; + + if (msg->iova < v->range.first || + msg->iova + msg->size - 1 > v->range.last) + return -EINVAL; + + if (vhost_iotlb_itree_first(iotlb, msg->iova, + msg->iova + msg->size - 1)) + return -EEXIST; + + return vhost_vdpa_pa_map(v, msg->iova, msg->size, msg->uaddr, + msg->perm); } static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, From patchwork Wed Mar 31 08:05:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 413405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E7D3C433E8 for ; Wed, 31 Mar 2021 08:07:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7597B619D3 for ; Wed, 31 Mar 2021 08:07:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234433AbhCaIH2 (ORCPT ); Wed, 31 Mar 2021 04:07:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47468 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234339AbhCaIGy (ORCPT ); Wed, 31 Mar 2021 04:06:54 -0400 Received: from mail-pj1-x102f.google.com (mail-pj1-x102f.google.com [IPv6:2607:f8b0:4864:20::102f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6A7DAC061574 for ; Wed, 31 Mar 2021 01:06:43 -0700 (PDT) Received: by mail-pj1-x102f.google.com with SMTP id bt4so9083636pjb.5 for ; Wed, 31 Mar 2021 01:06:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=BB90qgCfILId3MapNDXsVwvW/dnzqmU0Xngr4EHg7NE=; b=zWCPPcn56B/X4wsm7blaiQP9JpsrLCJvqR7HeXXyzU9f5Mvy4ceV9o0CGaxh1ihVkk JFNRqP3M5ZtecIL2gwgTsaQ6YbSiinHZ/YZRccoRUpqwjo0F5zqPSRHwnjEjcu9xH9ZQ ZXYQcqWNJnwK1FjFo46nsnyNYgUX1R5bWI76G6323aBmZ70Uz9WDmx78qsHNnIxFUWqI PXva6He++d3k4tw5F+n4aMm6UbzALcwg2lV0fNuCkEuxOIXyu76JsufAYHcVYEbgsuqP N4bId6sbx6FY+hQLC3ajZdDAAfeOCPxOxiPTBg03Bsy6zz5g/3XESwWmboB0r/loe2W/ f49g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=BB90qgCfILId3MapNDXsVwvW/dnzqmU0Xngr4EHg7NE=; b=cMPva/gX+GL+0vMC9LcCU5gRvnUq2gEoGAhM0tY+AbnveAP8jU85q5rA7/6BoYEhRP yEBjiQQqbifp/Rvi2+eOUJNts/7QS5pV0KiZNPZvp/vz5X9riAB/OAjAVeiJ4ksCJ32G ws0zYJ9j89JNm/VQpit0W2UzjxmMJMv6m9w9iY1JkA5TwBL6+fdYqjTmCidMI/WuisgA NbDroxcq7Zz3ErvBdaUTuKURbuaQFEXs5IO4dWPMgYaVL40VBM1/zFDkgwwZHrnC5RO9 jCvzY2MvAslx+9Kx9g0CeREw07mkvkJ4eQL0bjZtn4Js5RlqAeerHutV0c50KBq72GVC gsQg== X-Gm-Message-State: AOAM5335xX81PE9j6N2BcRkAvM/JsOlds6R+6AXOsDkmLcNsY/V+KeMm vEeE3YsyQhLGjeJK+D/6hI+u X-Google-Smtp-Source: ABdhPJyItYyFbM3hjECkaDDhxrMFvyvjl8NXJdGBO7SSkT2pahGdLCCBd9GQeL329BxXqsrfSR1oVA== X-Received: by 2002:a17:90a:6906:: with SMTP id r6mr2351240pjj.235.1617178002927; Wed, 31 Mar 2021 01:06:42 -0700 (PDT) Received: from localhost ([139.177.225.243]) by smtp.gmail.com with ESMTPSA id p1sm1292201pfn.22.2021.03.31.01.06.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 01:06:42 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH v6 07/10] vdpa: Support transferring virtual addressing during DMA mapping Date: Wed, 31 Mar 2021 16:05:16 +0800 Message-Id: <20210331080519.172-8-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331080519.172-1-xieyongji@bytedance.com> References: <20210331080519.172-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This patch introduces an attribute for vDPA device to indicate whether virtual address can be used. If vDPA device driver set it, vhost-vdpa bus driver will not pin user page and transfer userspace virtual address instead of physical address during DMA mapping. And corresponding vma->vm_file and offset will be also passed as an opaque pointer. Suggested-by: Jason Wang Signed-off-by: Xie Yongji Acked-by: Jason Wang --- drivers/vdpa/ifcvf/ifcvf_main.c | 2 +- drivers/vdpa/mlx5/net/mlx5_vnet.c | 2 +- drivers/vdpa/vdpa.c | 9 +++- drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +- drivers/vdpa/virtio_pci/vp_vdpa.c | 2 +- drivers/vhost/vdpa.c | 99 ++++++++++++++++++++++++++++++++++----- include/linux/vdpa.h | 19 ++++++-- 7 files changed, 116 insertions(+), 19 deletions(-) diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c index d555a6a5d1ba..aee013f3eb5f 100644 --- a/drivers/vdpa/ifcvf/ifcvf_main.c +++ b/drivers/vdpa/ifcvf/ifcvf_main.c @@ -431,7 +431,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id) } adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa, - dev, &ifc_vdpa_ops, NULL); + dev, &ifc_vdpa_ops, NULL, false); if (adapter == NULL) { IFCVF_ERR(pdev, "Failed to allocate vDPA structure"); return -ENOMEM; diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 71397fdafa6a..fb62ebcf464a 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1982,7 +1982,7 @@ static int mlx5v_probe(struct auxiliary_device *adev, max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS); ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops, - NULL); + NULL, false); if (IS_ERR(ndev)) return PTR_ERR(ndev); diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c index 5cffce67cab0..97fbac276c72 100644 --- a/drivers/vdpa/vdpa.c +++ b/drivers/vdpa/vdpa.c @@ -71,6 +71,7 @@ static void vdpa_release_dev(struct device *d) * @config: the bus operations that is supported by this device * @size: size of the parent structure that contains private data * @name: name of the vdpa device; optional. + * @use_va: indicate whether virtual address must be used by this device * * Driver should use vdpa_alloc_device() wrapper macro instead of * using this directly. @@ -80,7 +81,8 @@ static void vdpa_release_dev(struct device *d) */ struct vdpa_device *__vdpa_alloc_device(struct device *parent, const struct vdpa_config_ops *config, - size_t size, const char *name) + size_t size, const char *name, + bool use_va) { struct vdpa_device *vdev; int err = -EINVAL; @@ -91,6 +93,10 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent, if (!!config->dma_map != !!config->dma_unmap) goto err; + /* It should only work for the device that use on-chip IOMMU */ + if (use_va && !(config->dma_map || config->set_map)) + goto err; + err = -ENOMEM; vdev = kzalloc(size, GFP_KERNEL); if (!vdev) @@ -106,6 +112,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent, vdev->index = err; vdev->config = config; vdev->features_valid = false; + vdev->use_va = use_va; if (name) err = dev_set_name(&vdev->dev, "%s", name); diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c index ff331f088baf..d26334e9a412 100644 --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c @@ -235,7 +235,7 @@ struct vdpasim *vdpasim_create(struct vdpasim_dev_attr *dev_attr) ops = &vdpasim_config_ops; vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops, - dev_attr->name); + dev_attr->name, false); if (!vdpasim) goto err_alloc; diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c index 1321a2fcd088..03b36aed48d6 100644 --- a/drivers/vdpa/virtio_pci/vp_vdpa.c +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c @@ -377,7 +377,7 @@ static int vp_vdpa_probe(struct pci_dev *pdev, const struct pci_device_id *id) return ret; vp_vdpa = vdpa_alloc_device(struct vp_vdpa, vdpa, - dev, &vp_vdpa_ops, NULL); + dev, &vp_vdpa_ops, NULL, false); if (vp_vdpa == NULL) { dev_err(dev, "vp_vdpa: Failed to allocate vDPA structure\n"); return -ENOMEM; diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index f9aab9013745..613ea400e0e5 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -505,8 +505,28 @@ static void vhost_vdpa_pa_unmap(struct vhost_vdpa *v, u64 start, u64 last) } } +static void vhost_vdpa_va_unmap(struct vhost_vdpa *v, u64 start, u64 last) +{ + struct vhost_dev *dev = &v->vdev; + struct vhost_iotlb *iotlb = dev->iotlb; + struct vhost_iotlb_map *map; + struct vdpa_map_file *map_file; + + while ((map = vhost_iotlb_itree_first(iotlb, start, last)) != NULL) { + map_file = (struct vdpa_map_file *)map->opaque; + fput(map_file->file); + kfree(map_file); + vhost_iotlb_map_free(iotlb, map); + } +} + static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last) { + struct vdpa_device *vdpa = v->vdpa; + + if (vdpa->use_va) + return vhost_vdpa_va_unmap(v, start, last); + return vhost_vdpa_pa_unmap(v, start, last); } @@ -541,21 +561,21 @@ static int perm_to_iommu_flags(u32 perm) return flags | IOMMU_CACHE; } -static int vhost_vdpa_map(struct vhost_vdpa *v, - u64 iova, u64 size, u64 pa, u32 perm) +static int vhost_vdpa_map(struct vhost_vdpa *v, u64 iova, + u64 size, u64 pa, u32 perm, void *opaque) { struct vhost_dev *dev = &v->vdev; struct vdpa_device *vdpa = v->vdpa; const struct vdpa_config_ops *ops = vdpa->config; int r = 0; - r = vhost_iotlb_add_range(dev->iotlb, iova, iova + size - 1, - pa, perm); + r = vhost_iotlb_add_range_ctx(dev->iotlb, iova, iova + size - 1, + pa, perm, opaque); if (r) return r; if (ops->dma_map) { - r = ops->dma_map(vdpa, iova, size, pa, perm, NULL); + r = ops->dma_map(vdpa, iova, size, pa, perm, opaque); } else if (ops->set_map) { if (!v->in_batch) r = ops->set_map(vdpa, dev->iotlb); @@ -563,13 +583,15 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, r = iommu_map(v->domain, iova, pa, size, perm_to_iommu_flags(perm)); } - - if (r) + if (r) { vhost_iotlb_del_range(dev->iotlb, iova, iova + size - 1); - else + return r; + } + + if (!vdpa->use_va) atomic64_add(size >> PAGE_SHIFT, &dev->mm->pinned_vm); - return r; + return 0; } static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) @@ -590,6 +612,56 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) } } +static int vhost_vdpa_va_map(struct vhost_vdpa *v, + u64 iova, u64 size, u64 uaddr, u32 perm) +{ + struct vhost_dev *dev = &v->vdev; + u64 offset, map_size, map_iova = iova; + struct vdpa_map_file *map_file; + struct vm_area_struct *vma; + int ret; + + mmap_read_lock(dev->mm); + + while (size) { + vma = find_vma(dev->mm, uaddr); + if (!vma) { + ret = -EINVAL; + break; + } + map_size = min(size, vma->vm_end - uaddr); + if (!(vma->vm_file && (vma->vm_flags & VM_SHARED) && + !(vma->vm_flags & (VM_IO | VM_PFNMAP)))) + goto next; + + map_file = kzalloc(sizeof(*map_file), GFP_KERNEL); + if (!map_file) { + ret = -ENOMEM; + break; + } + offset = (vma->vm_pgoff << PAGE_SHIFT) + uaddr - vma->vm_start; + map_file->offset = offset; + map_file->file = get_file(vma->vm_file); + ret = vhost_vdpa_map(v, map_iova, map_size, uaddr, + perm, map_file); + if (ret) { + fput(map_file->file); + kfree(map_file); + break; + } +next: + size -= map_size; + uaddr += map_size; + map_iova += map_size; + } + if (ret) + vhost_vdpa_unmap(v, iova, map_iova - iova); + + mmap_read_unlock(dev->mm); + + return ret; +} + static int vhost_vdpa_pa_map(struct vhost_vdpa *v, u64 iova, u64 size, u64 uaddr, u32 perm) { @@ -656,7 +728,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v, csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT; ret = vhost_vdpa_map(v, iova, csize, map_pfn << PAGE_SHIFT, - perm); + perm, NULL); if (ret) { /* * Unpin the pages that are left unmapped @@ -685,7 +757,7 @@ static int vhost_vdpa_pa_map(struct vhost_vdpa *v, /* Pin the rest chunk */ ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT, - map_pfn << PAGE_SHIFT, perm); + map_pfn << PAGE_SHIFT, perm, NULL); out: if (ret) { if (nchunks) { @@ -718,6 +790,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, struct vhost_iotlb_msg *msg) { struct vhost_dev *dev = &v->vdev; + struct vdpa_device *vdpa = v->vdpa; struct vhost_iotlb *iotlb = dev->iotlb; if (msg->iova < v->range.first || @@ -728,6 +801,10 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, msg->iova + msg->size - 1)) return -EEXIST; + if (vdpa->use_va) + return vhost_vdpa_va_map(v, msg->iova, msg->size, + msg->uaddr, msg->perm); + return vhost_vdpa_pa_map(v, msg->iova, msg->size, msg->uaddr, msg->perm); } diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index b01f7c9096bf..e67404e4b23e 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -44,6 +44,7 @@ struct vdpa_mgmt_dev; * @config: the configuration ops for this device. * @index: device index * @features_valid: were features initialized? for legacy guests + * @use_va: indicate whether virtual address must be used by this device * @nvqs: maximum number of supported virtqueues * @mdev: management device pointer; caller must setup when registering device as part * of dev_add() mgmtdev ops callback before invoking _vdpa_register_device(). @@ -54,6 +55,7 @@ struct vdpa_device { const struct vdpa_config_ops *config; unsigned int index; bool features_valid; + bool use_va; int nvqs; struct vdpa_mgmt_dev *mdev; }; @@ -69,6 +71,16 @@ struct vdpa_iova_range { }; /** + * Corresponding file area for device memory mapping + * @file: vma->vm_file for the mapping + * @offset: mapping offset in the vm_file + */ +struct vdpa_map_file { + struct file *file; + u64 offset; +}; + +/** * vDPA_config_ops - operations for configuring a vDPA device. * Note: vDPA device drivers are required to implement all of the * operations unless it is mentioned to be optional in the following @@ -250,14 +262,15 @@ struct vdpa_config_ops { struct vdpa_device *__vdpa_alloc_device(struct device *parent, const struct vdpa_config_ops *config, - size_t size, const char *name); + size_t size, const char *name, + bool use_va); -#define vdpa_alloc_device(dev_struct, member, parent, config, name) \ +#define vdpa_alloc_device(dev_struct, member, parent, config, name, use_va) \ container_of(__vdpa_alloc_device( \ parent, config, \ sizeof(dev_struct) + \ BUILD_BUG_ON_ZERO(offsetof( \ - dev_struct, member)), name), \ + dev_struct, member)), name, use_va), \ dev_struct, member) int vdpa_register_device(struct vdpa_device *vdev, int nvqs); From patchwork Wed Mar 31 08:05:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yongji Xie X-Patchwork-Id: 413404 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 843ECC433ED for ; Wed, 31 Mar 2021 08:07:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5F9D3619D6 for ; Wed, 31 Mar 2021 08:07:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234443AbhCaIHb (ORCPT ); Wed, 31 Mar 2021 04:07:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47530 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234355AbhCaIG5 (ORCPT ); Wed, 31 Mar 2021 04:06:57 -0400 Received: from mail-pj1-x1033.google.com (mail-pj1-x1033.google.com [IPv6:2607:f8b0:4864:20::1033]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BC3ACC06174A for ; Wed, 31 Mar 2021 01:06:56 -0700 (PDT) Received: by mail-pj1-x1033.google.com with SMTP id x21-20020a17090a5315b029012c4a622e4aso812369pjh.2 for ; Wed, 31 Mar 2021 01:06:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=DrGe+wSNv+5GrL/VvyrIF4G5z2Kf+xg5gmEDQCgKf7s=; b=DqzZ8q+OkgiXiUnK9PFEYEY4GPyyJUNJwl9trmeo91/ckMTVGZku5qS+JYUp6zbltv 7JvdkVX4/kn4rtl7cFCIfEfQP+rRDpBLK3lx/1fTetBxfqIxNrXCCGHQhn3Cs9UAAe8j dEfKQZNP/uRhRK/+NsTGf2ALmlqAwKGqnklVdGVTMTTpLvLxqs+ujglBcrnEeLW9F8st IYRilOFnC/ht3Qb86fq7BfeHYXVqAkQTbbVeDhg84tYsOWIxpLX+liISgJbcKto8LEu7 bc1z4i3IgIPP24H5wRqaa2c6qh0L8tCPEDgWo7QOxQQdgBRuH6lxgUd+3n+ii0sHchYm WAxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=DrGe+wSNv+5GrL/VvyrIF4G5z2Kf+xg5gmEDQCgKf7s=; b=PM9g1GOdG/3Y+BPPAV3wqstrj2T6J70j9q2gOMtEB8G9nJWVhzdvghCgTC7IgSQsI1 qxlWdV6sydY35i6RQOuq7Zo2I8wu5HFHBijATiPLzGFY85kS7naFG7TkEGAWXdPbqi6M QvAXACxZK3PLG08zo3YDd5DQ0WrMv/R1uNGSDWNKmrjZ2BQxFPffyfT+D+JVAhorKEHH V3Y8PexDS64VwyrYuyC41XixzzetASFzDfLJXt/+SyT282Sf0eRpIEA3m5GLUyWaTuo6 b5emgw0UCa48riocChWS/KWNFDl7a756Q1xaY7SiRpSmn87CCrh2WHBVLk36Geu+Bc42 HWcA== X-Gm-Message-State: AOAM531SOWX7EHuqFy2PNGB/r7z877gGUg1XtcC1HvNbnl0nXyJQ8X5V qqoGvbYGoqS+1oqqJjAb6ad8 X-Google-Smtp-Source: ABdhPJzUZ0iCW38qBzP+z9ejwSF3QR26iU/iND3BjH9uOIBsmROmq3cVYSj9jxodUlUSY5l5upx6Gg== X-Received: by 2002:a17:90b:3551:: with SMTP id lt17mr2338546pjb.89.1617178016286; Wed, 31 Mar 2021 01:06:56 -0700 (PDT) Received: from localhost ([139.177.225.243]) by smtp.gmail.com with ESMTPSA id w79sm1324798pfc.87.2021.03.31.01.06.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Mar 2021 01:06:55 -0700 (PDT) From: Xie Yongji To: mst@redhat.com, jasowang@redhat.com, stefanha@redhat.com, sgarzare@redhat.com, parav@nvidia.com, hch@infradead.org, christian.brauner@canonical.com, rdunlap@infradead.org, willy@infradead.org, viro@zeniv.linux.org.uk, axboe@kernel.dk, bcrl@kvack.org, corbet@lwn.net, mika.penttila@nextfour.com, dan.carpenter@oracle.com Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [PATCH v6 10/10] Documentation: Add documentation for VDUSE Date: Wed, 31 Mar 2021 16:05:19 +0800 Message-Id: <20210331080519.172-11-xieyongji@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210331080519.172-1-xieyongji@bytedance.com> References: <20210331080519.172-1-xieyongji@bytedance.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org VDUSE (vDPA Device in Userspace) is a framework to support implementing software-emulated vDPA devices in userspace. This document is intended to clarify the VDUSE design and usage. Signed-off-by: Xie Yongji Acked-by: Jason Wang --- Documentation/userspace-api/index.rst | 1 + Documentation/userspace-api/vduse.rst | 212 ++++++++++++++++++++++++++++++++++ 2 files changed, 213 insertions(+) create mode 100644 Documentation/userspace-api/vduse.rst diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst index acd2cc2a538d..f63119130898 100644 --- a/Documentation/userspace-api/index.rst +++ b/Documentation/userspace-api/index.rst @@ -24,6 +24,7 @@ place where this information is gathered. ioctl/index iommu media/index + vduse .. only:: subproject and html diff --git a/Documentation/userspace-api/vduse.rst b/Documentation/userspace-api/vduse.rst new file mode 100644 index 000000000000..8c4e2b2df8bb --- /dev/null +++ b/Documentation/userspace-api/vduse.rst @@ -0,0 +1,212 @@ +================================== +VDUSE - "vDPA Device in Userspace" +================================== + +vDPA (virtio data path acceleration) device is a device that uses a +datapath which complies with the virtio specifications with vendor +specific control path. vDPA devices can be both physically located on +the hardware or emulated by software. VDUSE is a framework that makes it +possible to implement software-emulated vDPA devices in userspace. + +How VDUSE works +------------ +Each userspace vDPA device is created by the VDUSE_CREATE_DEV ioctl on +the character device (/dev/vduse/control). Then a device file with the +specified name (/dev/vduse/$NAME) will appear, which can be used to +implement the userspace vDPA device's control path and data path. + +To implement control path, a message-based communication protocol and some +types of control messages are introduced in the VDUSE framework: + +- VDUSE_SET_VQ_ADDR: Set the vring address of virtqueue. + +- VDUSE_SET_VQ_NUM: Set the size of virtqueue + +- VDUSE_SET_VQ_READY: Set ready status of virtqueue + +- VDUSE_GET_VQ_READY: Get ready status of virtqueue + +- VDUSE_SET_VQ_STATE: Set the state for virtqueue + +- VDUSE_GET_VQ_STATE: Get the state for virtqueue + +- VDUSE_SET_FEATURES: Set virtio features supported by the driver + +- VDUSE_GET_FEATURES: Get virtio features supported by the device + +- VDUSE_SET_STATUS: Set the device status + +- VDUSE_GET_STATUS: Get the device status + +- VDUSE_SET_CONFIG: Write to device specific configuration space + +- VDUSE_GET_CONFIG: Read from device specific configuration space + +- VDUSE_UPDATE_IOTLB: Notify userspace to update the memory mapping in device IOTLB + +Those control messages are mostly based on the vdpa_config_ops in +include/linux/vdpa.h which defines a unified interface to control +different types of vdpa device. Userspace needs to read()/write() +on the VDUSE device file to receive/reply those control messages +from/to VDUSE kernel module as follows: + +.. code-block:: c + + static int vduse_message_handler(int dev_fd) + { + int len; + struct vduse_dev_request req; + struct vduse_dev_response resp; + + len = read(dev_fd, &req, sizeof(req)); + if (len != sizeof(req)) + return -1; + + resp.request_id = req.request_id; + + switch (req.type) { + + /* handle different types of message */ + + } + + len = write(dev_fd, &resp, sizeof(resp)); + if (len != sizeof(resp)) + return -1; + + return 0; + } + +In the data path, vDPA device's iova regions will be mapped into userspace +with the help of VDUSE_IOTLB_GET_FD ioctl on the VDUSE device file: + +- VDUSE_IOTLB_GET_FD: get the file descriptor to the first overlapped iova region. + Userspace can access this iova region by passing fd and corresponding size, offset, + perm to mmap(). For example: + +.. code-block:: c + + static int perm_to_prot(uint8_t perm) + { + int prot = 0; + + switch (perm) { + case VDUSE_ACCESS_WO: + prot |= PROT_WRITE; + break; + case VDUSE_ACCESS_RO: + prot |= PROT_READ; + break; + case VDUSE_ACCESS_RW: + prot |= PROT_READ | PROT_WRITE; + break; + } + + return prot; + } + + static void *iova_to_va(int dev_fd, uint64_t iova, uint64_t *len) + { + int fd; + void *addr; + size_t size; + struct vduse_iotlb_entry entry; + + entry.start = iova; + entry.last = iova + 1; + fd = ioctl(dev_fd, VDUSE_IOTLB_GET_FD, &entry); + if (fd < 0) + return NULL; + + size = entry.last - entry.start + 1; + *len = entry.last - iova + 1; + addr = mmap(0, size, perm_to_prot(entry.perm), MAP_SHARED, + fd, entry.offset); + close(fd); + if (addr == MAP_FAILED) + return NULL; + + /* do something to cache this iova region */ + + return addr + iova - entry.start; + } + +Besides, the following ioctls on the VDUSE device file are provided to support +interrupt injection and setting up eventfd for virtqueue kicks: + +- VDUSE_VQ_SETUP_KICKFD: set the kickfd for virtqueue, this eventfd is used + by VDUSE kernel module to notify userspace to consume the vring. + +- VDUSE_INJECT_VQ_IRQ: inject an interrupt for specific virtqueue + +- VDUSE_INJECT_CONFIG_IRQ: inject a config interrupt + +Register VDUSE device on vDPA bus +--------------------------------- +In order to make the VDUSE device work, administrator needs to use the management +API (netlink) to register it on vDPA bus. Some sample codes are show below: + +.. code-block:: c + + static int netlink_add_vduse(const char *name, int device_id) + { + struct nl_sock *nlsock; + struct nl_msg *msg; + int famid; + + nlsock = nl_socket_alloc(); + if (!nlsock) + return -ENOMEM; + + if (genl_connect(nlsock)) + goto free_sock; + + famid = genl_ctrl_resolve(nlsock, VDPA_GENL_NAME); + if (famid < 0) + goto close_sock; + + msg = nlmsg_alloc(); + if (!msg) + goto close_sock; + + if (!genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, famid, 0, 0, + VDPA_CMD_DEV_NEW, 0)) + goto nla_put_failure; + + NLA_PUT_STRING(msg, VDPA_ATTR_DEV_NAME, name); + NLA_PUT_STRING(msg, VDPA_ATTR_MGMTDEV_DEV_NAME, "vduse"); + NLA_PUT_U32(msg, VDPA_ATTR_DEV_ID, device_id); + + if (nl_send_sync(nlsock, msg)) + goto close_sock; + + nl_close(nlsock); + nl_socket_free(nlsock); + + return 0; + nla_put_failure: + nlmsg_free(msg); + close_sock: + nl_close(nlsock); + free_sock: + nl_socket_free(nlsock); + return -1; + } + +MMU-based IOMMU Driver +---------------------- +VDUSE framework implements an MMU-based on-chip IOMMU driver to support +mapping the kernel DMA buffer into the userspace iova region dynamically. +This is mainly designed for virtio-vdpa case (kernel virtio drivers). + +The basic idea behind this driver is treating MMU (VA->PA) as IOMMU (IOVA->PA). +The driver will set up MMU mapping instead of IOMMU mapping for the DMA transfer +so that the userspace process is able to use its virtual address to access +the DMA buffer in kernel. + +And to avoid security issue, a bounce-buffering mechanism is introduced to +prevent userspace accessing the original buffer directly which may contain other +kernel data. During the mapping, unmapping, the driver will copy the data from +the original buffer to the bounce buffer and back, depending on the direction of +the transfer. And the bounce-buffer addresses will be mapped into the user address +space instead of the original one.