mbox series

[RFC,00/24] Control VQ support in vDPA

Message ID 20200924032125.18619-1-jasowang@redhat.com
Headers show
Series Control VQ support in vDPA | expand

Message

Jason Wang Sept. 24, 2020, 3:21 a.m. UTC
Hi All:

This series tries to add the support for control virtqueue in vDPA.

Control virtqueue is used by networking device for accepting various
commands from the driver. It's a must to support multiqueue and other
configurations.

When used by vhost-vDPA bus driver for VM, the control virtqueue
should be shadowed via userspace VMM (Qemu) instead of being assigned
directly to Guest. This is because Qemu needs to know the device state
in order to start and stop device correctly (e.g for Live Migration).

This requies to isolate the memory mapping for control virtqueue
presented by vhost-vDPA to prevent guest from accesing it directly.

To achieve this, vDPA introduce two new abstractions:

- address space: identified through address space id (ASID) and a set
                 of memory mapping in maintained
- virtqueue group: the minimal set of virtqueues that must share an
                 address space

Device needs to advertise the following attributes to vDPA:

- the number of address spaces supported in the device
- the number of virtqueue groups supported in the device
- the mappings from a specific virtqueue to its virtqueue groups

The mappings from virtqueue to virtqueue groups is fixed and defined
by vDPA device driver. E.g:

- For the device that has hardware ASID support, it can simply
  advertise a per virtqueue virtqueue group.
- For the device that does not have hardware ASID support, it can
  simply advertise a single virtqueue group that contains all
  virtqueues. Or if it wants a software emulated control virtqueue, it
  can advertise two virtqueue groups, one is for cvq, another is for
  the rest virtqueues.

vDPA also allow to change the association between virtqueue group and
address space. So in the case of control virtqueue, userspace
VMM(Qemu) may use a dedicated address space for the control virtqueue
group to isolate the memory mapping.

The vhost/vhost-vDPA is also extend for the userspace to:

- query the number of virtqueue groups and address spaces supported by
  the device
- query the virtqueue group for a specific virtqueue
- assocaite a virtqueue group with an address space
- send ASID based IOTLB commands

This will help userspace VMM(Qemu) to detect whether the control vq
could be supported and isolate memory mappings of control virtqueue
from the others.

To demonstrate the usage, vDPA simulator is extended to support
setting MAC address via a emulated control virtqueue. Please refer
patch 24 for more implementation details.

Please review.

Note that patch 1 and a equivalent of patch 2 have been posted in the
list. Those two are requirement for this series to work, so I add them
here.

Thank

Jason Wang (24):
  vhost-vdpa: fix backend feature ioctls
  vhost-vdpa: fix vqs leak in vhost_vdpa_open()
  vhost: move the backend feature bits to vhost_types.h
  virtio-vdpa: don't set callback if virtio doesn't need it
  vhost-vdpa: passing iotlb to IOMMU mapping helpers
  vhost-vdpa: switch to use vhost-vdpa specific IOTLB
  vdpa: add the missing comment for nvqs in struct vdpa_device
  vdpa: introduce virtqueue groups
  vdpa: multiple address spaces support
  vdpa: introduce config operations for associating ASID to a virtqueue
    group
  vhost_iotlb: split out IOTLB initialization
  vhost: support ASID in IOTLB API
  vhost-vdpa: introduce ASID based IOTLB
  vhost-vdpa: introduce uAPI to get the number of virtqueue groups
  vhost-vdpa: introduce uAPI to get the number of address spaces
  vhost-vdpa: uAPI to get virtqueue group id
  vhost-vdpa: introduce uAPI to set group ASID
  vhost-vdpa: support ASID based IOTLB API
  vdpa_sim: use separated iov for reading and writing
  vdpa_sim: advertise VIRTIO_NET_F_MTU
  vdpa_sim: advertise VIRTIO_NET_F_MAC
  vdpa_sim: factor out buffer completion logic
  vdpa_sim: filter destination mac address
  vdpasim: control virtqueue support

 drivers/vdpa/ifcvf/ifcvf_main.c   |   9 +-
 drivers/vdpa/mlx5/net/mlx5_vnet.c |  11 +-
 drivers/vdpa/vdpa.c               |   8 +-
 drivers/vdpa/vdpa_sim/vdpa_sim.c  | 293 ++++++++++++++++++++++++------
 drivers/vhost/iotlb.c             |  23 ++-
 drivers/vhost/vdpa.c              | 259 ++++++++++++++++++++------
 drivers/vhost/vhost.c             |  23 ++-
 drivers/vhost/vhost.h             |   4 +-
 drivers/virtio/virtio_vdpa.c      |   2 +-
 include/linux/vdpa.h              |  42 ++++-
 include/linux/vhost_iotlb.h       |   2 +
 include/uapi/linux/vhost.h        |  19 +-
 include/uapi/linux/vhost_types.h  |  10 +-
 13 files changed, 556 insertions(+), 149 deletions(-)

Comments

Eli Cohen Sept. 24, 2020, 7:16 a.m. UTC | #1
On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:
> Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")

> introduces two malfunction backend features ioctls:

> 

> 1) the ioctls was blindly added to vring ioctl instead of vdpa device

>    ioctl

> 2) vhost_set_backend_features() was called when dev mutex has already

>    been held which will lead a deadlock

> 


I assume this patch requires some patch in qemu as well. Do you have
such patch?

> This patch fixes the above issues.

> 

> Cc: Eli Cohen <elic@nvidia.com>

> Reported-by: Zhu Lingshan <lingshan.zhu@intel.com>

> Fixes: 653055b9acd4 ("vhost-vdpa: support get/set backend features")

> Signed-off-by: Jason Wang <jasowang@redhat.com>

> ---

>  drivers/vhost/vdpa.c | 30 ++++++++++++++++--------------

>  1 file changed, 16 insertions(+), 14 deletions(-)

> 

> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c

> index 3fab94f88894..796fe979f997 100644

> --- a/drivers/vhost/vdpa.c

> +++ b/drivers/vhost/vdpa.c

> @@ -353,8 +353,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,

>  	struct vdpa_callback cb;

>  	struct vhost_virtqueue *vq;

>  	struct vhost_vring_state s;

> -	u64 __user *featurep = argp;

> -	u64 features;

>  	u32 idx;

>  	long r;

>  

> @@ -381,18 +379,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,

>  

>  		vq->last_avail_idx = vq_state.avail_index;

>  		break;

> -	case VHOST_GET_BACKEND_FEATURES:

> -		features = VHOST_VDPA_BACKEND_FEATURES;

> -		if (copy_to_user(featurep, &features, sizeof(features)))

> -			return -EFAULT;

> -		return 0;

> -	case VHOST_SET_BACKEND_FEATURES:

> -		if (copy_from_user(&features, featurep, sizeof(features)))

> -			return -EFAULT;

> -		if (features & ~VHOST_VDPA_BACKEND_FEATURES)

> -			return -EOPNOTSUPP;

> -		vhost_set_backend_features(&v->vdev, features);

> -		return 0;

>  	}

>  

>  	r = vhost_vring_ioctl(&v->vdev, cmd, argp);

> @@ -440,8 +426,20 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,

>  	struct vhost_vdpa *v = filep->private_data;

>  	struct vhost_dev *d = &v->vdev;

>  	void __user *argp = (void __user *)arg;

> +	u64 __user *featurep = argp;

> +	u64 features;

>  	long r;

>  

> +	if (cmd == VHOST_SET_BACKEND_FEATURES) {

> +		r = copy_from_user(&features, featurep, sizeof(features));

> +		if (r)

> +			return r;

> +		if (features & ~VHOST_VDPA_BACKEND_FEATURES)

> +			return -EOPNOTSUPP;

> +		vhost_set_backend_features(&v->vdev, features);

> +		return 0;

> +	}

> +

>  	mutex_lock(&d->mutex);

>  

>  	switch (cmd) {

> @@ -476,6 +474,10 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,

>  	case VHOST_VDPA_SET_CONFIG_CALL:

>  		r = vhost_vdpa_set_config_call(v, argp);

>  		break;

> +	case VHOST_GET_BACKEND_FEATURES:

> +		features = VHOST_VDPA_BACKEND_FEATURES;

> +		r = copy_to_user(featurep, &features, sizeof(features));

> +		break;

>  	default:

>  		r = vhost_dev_ioctl(&v->vdev, cmd, argp);

>  		if (r == -ENOIOCTLCMD)

> -- 

> 2.20.1

>
Jason Wang Sept. 24, 2020, 7:26 a.m. UTC | #2
On 2020/9/24 下午3:16, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:
>> Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
>> introduces two malfunction backend features ioctls:
>>
>> 1) the ioctls was blindly added to vring ioctl instead of vdpa device
>>     ioctl
>> 2) vhost_set_backend_features() was called when dev mutex has already
>>     been held which will lead a deadlock
>>
> I assume this patch requires some patch in qemu as well. Do you have
> such patch?
>

It's this series: [PATCH 0/3] Vhost-vDPA: batch IOTLB updating.

You were copied.

Thanks
Eli Cohen Sept. 24, 2020, 7:38 a.m. UTC | #3
On Thu, Sep 24, 2020 at 03:26:08PM +0800, Jason Wang wrote:
> 

> On 2020/9/24 下午3:16, Eli Cohen wrote:

> > On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:

> > > Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")

> > > introduces two malfunction backend features ioctls:

> > > 

> > > 1) the ioctls was blindly added to vring ioctl instead of vdpa device

> > >     ioctl

> > > 2) vhost_set_backend_features() was called when dev mutex has already

> > >     been held which will lead a deadlock

> > > 

> > I assume this patch requires some patch in qemu as well. Do you have

> > such patch?

> > 

> 

> It's this series: [PATCH 0/3] Vhost-vDPA: batch IOTLB updating.

> 

> You were copied.

> 


Right, I miss those.
Thanks.
Michael S. Tsirkin Sept. 24, 2020, 7:50 a.m. UTC | #4
On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:
> Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")
> introduces two malfunction backend features ioctls:
> 
> 1) the ioctls was blindly added to vring ioctl instead of vdpa device
>    ioctl
> 2) vhost_set_backend_features() was called when dev mutex has already
>    been held which will lead a deadlock
> 
> This patch fixes the above issues.
> 
> Cc: Eli Cohen <elic@nvidia.com>
> Reported-by: Zhu Lingshan <lingshan.zhu@intel.com>
> Fixes: 653055b9acd4 ("vhost-vdpa: support get/set backend features")
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Don't we want the fixes queued right now, as opposed to the rest of the
RFC?

> ---
>  drivers/vhost/vdpa.c | 30 ++++++++++++++++--------------
>  1 file changed, 16 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 3fab94f88894..796fe979f997 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -353,8 +353,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
>  	struct vdpa_callback cb;
>  	struct vhost_virtqueue *vq;
>  	struct vhost_vring_state s;
> -	u64 __user *featurep = argp;
> -	u64 features;
>  	u32 idx;
>  	long r;
>  
> @@ -381,18 +379,6 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
>  
>  		vq->last_avail_idx = vq_state.avail_index;
>  		break;
> -	case VHOST_GET_BACKEND_FEATURES:
> -		features = VHOST_VDPA_BACKEND_FEATURES;
> -		if (copy_to_user(featurep, &features, sizeof(features)))
> -			return -EFAULT;
> -		return 0;
> -	case VHOST_SET_BACKEND_FEATURES:
> -		if (copy_from_user(&features, featurep, sizeof(features)))
> -			return -EFAULT;
> -		if (features & ~VHOST_VDPA_BACKEND_FEATURES)
> -			return -EOPNOTSUPP;
> -		vhost_set_backend_features(&v->vdev, features);
> -		return 0;
>  	}
>  
>  	r = vhost_vring_ioctl(&v->vdev, cmd, argp);
> @@ -440,8 +426,20 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>  	struct vhost_vdpa *v = filep->private_data;
>  	struct vhost_dev *d = &v->vdev;
>  	void __user *argp = (void __user *)arg;
> +	u64 __user *featurep = argp;
> +	u64 features;
>  	long r;
>  
> +	if (cmd == VHOST_SET_BACKEND_FEATURES) {
> +		r = copy_from_user(&features, featurep, sizeof(features));
> +		if (r)
> +			return r;
> +		if (features & ~VHOST_VDPA_BACKEND_FEATURES)
> +			return -EOPNOTSUPP;
> +		vhost_set_backend_features(&v->vdev, features);
> +		return 0;
> +	}
> +
>  	mutex_lock(&d->mutex);
>  
>  	switch (cmd) {
> @@ -476,6 +474,10 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>  	case VHOST_VDPA_SET_CONFIG_CALL:
>  		r = vhost_vdpa_set_config_call(v, argp);
>  		break;
> +	case VHOST_GET_BACKEND_FEATURES:
> +		features = VHOST_VDPA_BACKEND_FEATURES;
> +		r = copy_to_user(featurep, &features, sizeof(features));
> +		break;
>  	default:
>  		r = vhost_dev_ioctl(&v->vdev, cmd, argp);
>  		if (r == -ENOIOCTLCMD)
> -- 
> 2.20.1
Jason Wang Sept. 24, 2020, 8:28 a.m. UTC | #5
On 2020/9/24 下午3:50, Michael S. Tsirkin wrote:
> On Thu, Sep 24, 2020 at 11:21:02AM +0800, Jason Wang wrote:

>> Commit 653055b9acd4 ("vhost-vdpa: support get/set backend features")

>> introduces two malfunction backend features ioctls:

>>

>> 1) the ioctls was blindly added to vring ioctl instead of vdpa device

>>     ioctl

>> 2) vhost_set_backend_features() was called when dev mutex has already

>>     been held which will lead a deadlock

>>

>> This patch fixes the above issues.

>>

>> Cc: Eli Cohen<elic@nvidia.com>

>> Reported-by: Zhu Lingshan<lingshan.zhu@intel.com>

>> Fixes: 653055b9acd4 ("vhost-vdpa: support get/set backend features")

>> Signed-off-by: Jason Wang<jasowang@redhat.com>

> Don't we want the fixes queued right now, as opposed to the rest of the

> RFC?



Yes, actually I've posted in before[1].

Adding the patch here is to simplify the work for the guys that want to 
do the work on top. E.g for Cindy to start the Qemu prototype.

Thanks

[1] https://www.spinics.net/lists/netdev/msg681247.html
Stefan Hajnoczi Sept. 24, 2020, 10:17 a.m. UTC | #6
On Thu, Sep 24, 2020 at 11:21:01AM +0800, Jason Wang wrote:
> This series tries to add the support for control virtqueue in vDPA.

Please include documentation for both driver authors and vhost-vdpa
ioctl users. vhost-vdpa ioctls are only documented with a single
sentence. Please add full information on arguments, return values, and a
high-level explanation of the feature (like this cover letter) to
introduce the API.

What is the policy for using virtqueue groups? My guess is:
1. virtio_vdpa simply enables all virtqueue groups.
2. vhost_vdpa relies on userspace policy on how to use virtqueue groups.
   Are the semantics of virtqueue groups documented somewhere so
   userspace knows what to do? If a vDPA driver author decides to create
   N virtqueue groups, N/2 virtqueue groups, or just 1 virtqueue group,
   how will userspace know what to do?

Maybe a document is needed to describe the recommended device-specific
virtqueue groups that vDPA drivers should implement (e.g. "put the net
control vq into its own virtqueue group")?

This could become messy with guidelines. For example, drivers might be
shipped that aren't usable for certain use cases just because the author
didn't know that a certain virtqueue grouping is advantageous.

BTW I like how general this feature is. It seems to allow vDPA devices
to be split into sub-devices for further passthrough. Who will write the
first vDPA-on-vDPA driver? :)

Stefan
Jason Wang Sept. 25, 2020, 11:36 a.m. UTC | #7
On 2020/9/24 下午6:17, Stefan Hajnoczi wrote:
> On Thu, Sep 24, 2020 at 11:21:01AM +0800, Jason Wang wrote:

>> This series tries to add the support for control virtqueue in vDPA.

> Please include documentation for both driver authors and vhost-vdpa

> ioctl users. vhost-vdpa ioctls are only documented with a single

> sentence. Please add full information on arguments, return values, and a

> high-level explanation of the feature (like this cover letter) to

> introduce the API.



Right, this is in the TODO list. (And we probably need to start with 
documenting vDPA bus operations first).


>

> What is the policy for using virtqueue groups? My guess is:

> 1. virtio_vdpa simply enables all virtqueue groups.

> 2. vhost_vdpa relies on userspace policy on how to use virtqueue groups.

>     Are the semantics of virtqueue groups documented somewhere so

>     userspace knows what to do? If a vDPA driver author decides to create

>     N virtqueue groups, N/2 virtqueue groups, or just 1 virtqueue group,

>     how will userspace know what to do?



So the mapping from virtqueue to virtqueue group is mandated by the vDPA 
device(driver). vDPA bus driver (like vhost-vDPA), can only change the 
association between virtqueue groups and ASID.

By default, it is required all virtqueue groups to be associated to 
address space 0. This make sure virtio_vdpa can work without any special 
groups/asid configuration.

I admit we need document all those semantics/polices.


>

> Maybe a document is needed to describe the recommended device-specific

> virtqueue groups that vDPA drivers should implement (e.g. "put the net

> control vq into its own virtqueue group")?



Yes, note that this depends on the hardware capability actually. It can 
only put control vq in other virtqueue group if:

1) hardware support to isolate control vq DMA from the rest virtqueues 
(PASID or simply using PA (translated address) for control vq)
or
2) the control vq is emulated by vDPA device driver (like vdpa_sim did).


>

> This could become messy with guidelines. For example, drivers might be

> shipped that aren't usable for certain use cases just because the author

> didn't know that a certain virtqueue grouping is advantageous.



Right.


>

> BTW I like how general this feature is. It seems to allow vDPA devices

> to be split into sub-devices for further passthrough. Who will write the

> first vDPA-on-vDPA driver? :)



Yes, that's an interesting question. For now, I can imagine we can 
emulated a SRIOV based virtio-net VFs via this.

If we want to expose the ASID setting to guest as well, it probably 
needs more thought.

Thanks


>

> Stefan
Eugenio Perez Martin Sept. 29, 2020, 2:40 p.m. UTC | #8
On Thu, Sep 24, 2020 at 5:24 AM Jason Wang <jasowang@redhat.com> wrote:
>
> This patch introduces the support of ASID based IOTLB by tagging IOTLB
> with a unique ASID. This is a must for supporting ASID based vhost
> IOTLB API by the following patches.
>
> IOTLB were stored in a hlist and new IOTLB will be allocated when a
> new ASID is seen via IOTLB API and destoryed when there's no mapping
> associated with an ASID.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/vdpa.c | 94 +++++++++++++++++++++++++++++++++-----------
>  1 file changed, 72 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 6552987544d7..1ba7e95619b5 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -34,13 +34,21 @@ enum {
>
>  #define VHOST_VDPA_DEV_MAX (1U << MINORBITS)
>
> +#define VHOST_VDPA_IOTLB_BUCKETS 16
> +
> +struct vhost_vdpa_as {
> +       struct hlist_node hash_link;
> +       struct vhost_iotlb iotlb;
> +       u32 id;
> +};
> +
>  struct vhost_vdpa {
>         struct vhost_dev vdev;
>         struct iommu_domain *domain;
>         struct vhost_virtqueue *vqs;
>         struct completion completion;
>         struct vdpa_device *vdpa;
> -       struct vhost_iotlb *iotlb;
> +       struct hlist_head as[VHOST_VDPA_IOTLB_BUCKETS];
>         struct device dev;
>         struct cdev cdev;
>         atomic_t opened;
> @@ -49,12 +57,64 @@ struct vhost_vdpa {
>         int minor;
>         struct eventfd_ctx *config_ctx;
>         int in_batch;
> +       int used_as;

Hi!

The variable `used_as` is not used anywhere outside this commit, and
in this commit is only tracking the number os AS added, not being able
to query it or using it by limiting them or anything like that.

If I'm right, could we consider deleting it? Or am I missing some usage of it?

I smoke tested all the series deleting that variable and everything
seems right to me.

Thanks!

>  };
>
>  static DEFINE_IDA(vhost_vdpa_ida);
>
>  static dev_t vhost_vdpa_major;
>
> +static struct vhost_vdpa_as *asid_to_as(struct vhost_vdpa *v, u32 asid)
> +{
> +       struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
> +       struct vhost_vdpa_as *as;
> +
> +       hlist_for_each_entry(as, head, hash_link)
> +               if (as->id == asid)
> +                       return as;
> +
> +       return NULL;
> +}
> +
> +static struct vhost_vdpa_as *vhost_vdpa_alloc_as(struct vhost_vdpa *v, u32 asid)
> +{
> +       struct hlist_head *head = &v->as[asid % VHOST_VDPA_IOTLB_BUCKETS];
> +       struct vhost_vdpa_as *as;
> +
> +       if (asid_to_as(v, asid))
> +               return NULL;
> +
> +       as = kmalloc(sizeof(*as), GFP_KERNEL);
> +       if (!as)
> +               return NULL;
> +
> +       vhost_iotlb_init(&as->iotlb, 0, 0);
> +       as->id = asid;
> +       hlist_add_head(&as->hash_link, head);
> +       ++v->used_as;
> +
> +       return as;
> +}
> +
> +static int vhost_vdpa_remove_as(struct vhost_vdpa *v, u32 asid)
> +{
> +       struct vhost_vdpa_as *as = asid_to_as(v, asid);
> +
> +       /* Remove default address space is not allowed */
> +       if (asid == 0)
> +               return -EINVAL;
> +
> +       if (!as)
> +               return -EINVAL;
> +
> +       hlist_del(&as->hash_link);
> +       vhost_iotlb_reset(&as->iotlb);
> +       kfree(as);
> +       --v->used_as;
> +
> +       return 0;
> +}
> +
>  static void handle_vq_kick(struct vhost_work *work)
>  {
>         struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> @@ -513,15 +573,6 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
>         }
>  }
>
> -static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
> -{
> -       struct vhost_iotlb *iotlb = v->iotlb;
> -
> -       vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
> -       kfree(v->iotlb);
> -       v->iotlb = NULL;
> -}
> -
>  static int perm_to_iommu_flags(u32 perm)
>  {
>         int flags = 0;
> @@ -681,7 +732,8 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, u32 asid,
>         struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
>         struct vdpa_device *vdpa = v->vdpa;
>         const struct vdpa_config_ops *ops = vdpa->config;
> -       struct vhost_iotlb *iotlb = v->iotlb;
> +       struct vhost_vdpa_as *as = asid_to_as(v, 0);
> +       struct vhost_iotlb *iotlb = &as->iotlb;
>         int r = 0;
>
>         if (asid != 0)
> @@ -775,6 +827,7 @@ static void vhost_vdpa_cleanup(struct vhost_vdpa *v)
>  {
>         vhost_dev_cleanup(&v->vdev);
>         kfree(v->vdev.vqs);
> +       vhost_vdpa_remove_as(v, 0);
>  }
>
>  static int vhost_vdpa_open(struct inode *inode, struct file *filep)
> @@ -807,23 +860,18 @@ static int vhost_vdpa_open(struct inode *inode, struct file *filep)
>         vhost_dev_init(dev, vqs, nvqs, 0, 0, 0, false,
>                        vhost_vdpa_process_iotlb_msg);
>
> -       dev->iotlb = vhost_iotlb_alloc(0, 0);
> -       if (!dev->iotlb) {
> -               r = -ENOMEM;
> -               goto err_init_iotlb;
> -       }
> +       if (!vhost_vdpa_alloc_as(v, 0))
> +               goto err_alloc_as;
>
>         r = vhost_vdpa_alloc_domain(v);
>         if (r)
> -               goto err_alloc_domain;
> +               goto err_alloc_as;
>
>         filep->private_data = v;
>
>         return 0;
>
> -err_alloc_domain:
> -       vhost_vdpa_iotlb_free(v);
> -err_init_iotlb:
> +err_alloc_as:
>         vhost_vdpa_cleanup(v);
>  err:
>         atomic_dec(&v->opened);
> @@ -851,7 +899,6 @@ static int vhost_vdpa_release(struct inode *inode, struct file *filep)
>         filep->private_data = NULL;
>         vhost_vdpa_reset(v);
>         vhost_dev_stop(&v->vdev);
> -       vhost_vdpa_iotlb_free(v);
>         vhost_vdpa_free_domain(v);
>         vhost_vdpa_config_put(v);
>         vhost_vdpa_clean_irq(v);
> @@ -950,7 +997,7 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
>         const struct vdpa_config_ops *ops = vdpa->config;
>         struct vhost_vdpa *v;
>         int minor;
> -       int r;
> +       int i, r;
>
>         /* Only support 1 address space */
>         if (vdpa->ngroups != 1)
> @@ -1002,6 +1049,9 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
>         init_completion(&v->completion);
>         vdpa_set_drvdata(vdpa, v);
>
> +       for (i = 0; i < VHOST_VDPA_IOTLB_BUCKETS; i++)
> +               INIT_HLIST_HEAD(&v->as[i]);
> +
>         return 0;
>
>  err:
> --
> 2.20.1
>
Eli Cohen Sept. 30, 2020, 11:26 a.m. UTC | #9
On Thu, Sep 24, 2020 at 11:21:06AM +0800, Jason Wang wrote:
> To prepare for the ASID support for vhost-vdpa, try to pass IOTLB
> object to dma helpers.

Maybe it's worth mentioning here that this patch does not change any
functionality and is presented as a preparation for passing different
iotlb's instead of using dev->iotlb

> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/vdpa.c | 40 ++++++++++++++++++++++------------------
>  1 file changed, 22 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index 9c641274b9f3..74bef1c15a70 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -489,10 +489,11 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>  	return r;
>  }
>  
> -static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
> +static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
> +				   struct vhost_iotlb *iotlb,
> +				   u64 start, u64 last)
>  {
>  	struct vhost_dev *dev = &v->vdev;
> -	struct vhost_iotlb *iotlb = dev->iotlb;
>  	struct vhost_iotlb_map *map;
>  	struct page *page;
>  	unsigned long pfn, pinned;
> @@ -514,8 +515,9 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>  static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
>  {
>  	struct vhost_dev *dev = &v->vdev;
> +	struct vhost_iotlb *iotlb = dev->iotlb;
>  
> -	vhost_vdpa_iotlb_unmap(v, 0ULL, 0ULL - 1);
> +	vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
>  	kfree(dev->iotlb);
>  	dev->iotlb = NULL;
>  }
> @@ -542,15 +544,14 @@ static int perm_to_iommu_flags(u32 perm)
>  	return flags | IOMMU_CACHE;
>  }
>  
> -static int vhost_vdpa_map(struct vhost_vdpa *v,
> +static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
>  			  u64 iova, u64 size, u64 pa, u32 perm)
>  {
> -	struct vhost_dev *dev = &v->vdev;
>  	struct vdpa_device *vdpa = v->vdpa;
>  	const struct vdpa_config_ops *ops = vdpa->config;
>  	int r = 0;
>  
> -	r = vhost_iotlb_add_range(dev->iotlb, iova, iova + size - 1,
> +	r = vhost_iotlb_add_range(iotlb, iova, iova + size - 1,
>  				  pa, perm);
>  	if (r)
>  		return r;
> @@ -559,7 +560,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
>  		r = ops->dma_map(vdpa, iova, size, pa, perm);
>  	} else if (ops->set_map) {
>  		if (!v->in_batch)
> -			r = ops->set_map(vdpa, dev->iotlb);
> +			r = ops->set_map(vdpa, iotlb);
>  	} else {
>  		r = iommu_map(v->domain, iova, pa, size,
>  			      perm_to_iommu_flags(perm));
> @@ -568,29 +569,30 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
>  	return r;
>  }
>  
> -static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size)
> +static void vhost_vdpa_unmap(struct vhost_vdpa *v,
> +			     struct vhost_iotlb *iotlb,
> +			     u64 iova, u64 size)
>  {
> -	struct vhost_dev *dev = &v->vdev;
>  	struct vdpa_device *vdpa = v->vdpa;
>  	const struct vdpa_config_ops *ops = vdpa->config;
>  
> -	vhost_vdpa_iotlb_unmap(v, iova, iova + size - 1);
> +	vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>  
>  	if (ops->dma_map) {
>  		ops->dma_unmap(vdpa, iova, size);
>  	} else if (ops->set_map) {
>  		if (!v->in_batch)
> -			ops->set_map(vdpa, dev->iotlb);
> +			ops->set_map(vdpa, iotlb);
>  	} else {
>  		iommu_unmap(v->domain, iova, size);
>  	}
>  }
>  
>  static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
> +					   struct vhost_iotlb *iotlb,
>  					   struct vhost_iotlb_msg *msg)
>  {
>  	struct vhost_dev *dev = &v->vdev;
> -	struct vhost_iotlb *iotlb = dev->iotlb;
>  	struct page **page_list;
>  	unsigned long list_size = PAGE_SIZE / sizeof(struct page *);
>  	unsigned int gup_flags = FOLL_LONGTERM;
> @@ -644,7 +646,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>  			if (last_pfn && (this_pfn != last_pfn + 1)) {
>  				/* Pin a contiguous chunk of memory */
>  				csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT;
> -				if (vhost_vdpa_map(v, iova, csize,
> +				if (vhost_vdpa_map(v, iotlb, iova, csize,
>  						   map_pfn << PAGE_SHIFT,
>  						   msg->perm))
>  					goto out;
> @@ -660,11 +662,12 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>  	}
>  
>  	/* Pin the rest chunk */
> -	ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT,
> +	ret = vhost_vdpa_map(v, iotlb, iova,
> +			     (last_pfn - map_pfn + 1) << PAGE_SHIFT,
>  			     map_pfn << PAGE_SHIFT, msg->perm);
>  out:
>  	if (ret) {
> -		vhost_vdpa_unmap(v, msg->iova, msg->size);
> +		vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
>  		atomic64_sub(npages, &dev->mm->pinned_vm);
>  	}
>  	mmap_read_unlock(dev->mm);
> @@ -678,6 +681,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>  	struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
>  	struct vdpa_device *vdpa = v->vdpa;
>  	const struct vdpa_config_ops *ops = vdpa->config;
> +	struct vhost_iotlb *iotlb = dev->iotlb;
>  	int r = 0;
>  
>  	r = vhost_dev_check_owner(dev);
> @@ -686,17 +690,17 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>  
>  	switch (msg->type) {
>  	case VHOST_IOTLB_UPDATE:
> -		r = vhost_vdpa_process_iotlb_update(v, msg);
> +		r = vhost_vdpa_process_iotlb_update(v, iotlb, msg);
>  		break;
>  	case VHOST_IOTLB_INVALIDATE:
> -		vhost_vdpa_unmap(v, msg->iova, msg->size);
> +		vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
>  		break;
>  	case VHOST_IOTLB_BATCH_BEGIN:
>  		v->in_batch = true;
>  		break;
>  	case VHOST_IOTLB_BATCH_END:
>  		if (v->in_batch && ops->set_map)
> -			ops->set_map(vdpa, dev->iotlb);
> +			ops->set_map(vdpa, iotlb);
>  		v->in_batch = false;
>  		break;
>  	default:
> -- 
> 2.20.1
>
Eli Cohen Oct. 1, 2020, 1:21 p.m. UTC | #10
On Thu, Sep 24, 2020 at 11:21:10AM +0800, Jason Wang wrote:
> This patches introduces the multiple address spaces support for vDPA
> device. This idea is to identify a specific address space via an
> dedicated identifier - ASID.
> 
> During vDPA device allocation, vDPA device driver needs to report the
> number of address spaces supported by the device then the DMA mapping
> ops of the vDPA device needs to be extended to support ASID.
> 
> This helps to isolate the DMA among the virtqueues. E.g in the case of
> virtio-net, the control virtqueue will not be assigned directly to
> guest.
> 
> This RFC patch only converts for the device that wants its own
> IOMMU/DMA translation logic. So it will rejects the device with more
> that 1 address space that depends on platform IOMMU. The plan to

This is not apparent from the code. Instead you enforce number of groups
to 1.

> moving all the DMA mapping logic to the vDPA device driver instead of
> doing it in vhost-vDPA (otherwise it could result a very complicated
> APIs and actually vhost-vDPA doesn't care about how the actual
> composition/emulation were done in the device driver).
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vdpa/ifcvf/ifcvf_main.c   |  2 +-
>  drivers/vdpa/mlx5/net/mlx5_vnet.c |  5 +++--
>  drivers/vdpa/vdpa.c               |  4 +++-
>  drivers/vdpa/vdpa_sim/vdpa_sim.c  | 10 ++++++----
>  drivers/vhost/vdpa.c              | 14 +++++++++-----
>  include/linux/vdpa.h              | 23 ++++++++++++++++-------
>  6 files changed, 38 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index e6a0be374e51..86cdf5f8bcae 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -440,7 +440,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  
>  	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
>  				    dev, &ifc_vdpa_ops,
> -				    IFCVF_MAX_QUEUE_PAIRS * 2, 1);
> +				    IFCVF_MAX_QUEUE_PAIRS * 2, 1, 1);
>  
>  	if (adapter == NULL) {
>  		IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 4e480f4f754e..db7404e121bf 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
>  	return mvdev->generation;
>  }
>  
> -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
> +			     struct vhost_iotlb *iotlb)
>  {
>  	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>  	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> @@ -1931,7 +1932,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
>  	max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
>  
>  	ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> -				 2 * mlx5_vdpa_max_qps(max_vqs), 1);
> +				 2 * mlx5_vdpa_max_qps(max_vqs), 1, 1);
>  	if (IS_ERR(ndev))
>  		return ndev;
>  
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 46399746ec7c..05195fa7865d 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -63,6 +63,7 @@ static void vdpa_release_dev(struct device *d)
>   * @config: the bus operations that is supported by this device
>   * @nvqs: number of virtqueues supported by this device
>   * @ngroups: number of groups supported by this device
> + * @nas: number of address spaces supported by this device
>   * @size: size of the parent structure that contains private data
>   *
>   * Driver should use vdpa_alloc_device() wrapper macro instead of
> @@ -74,7 +75,7 @@ static void vdpa_release_dev(struct device *d)
>  struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>  					const struct vdpa_config_ops *config,
>  					int nvqs, unsigned int ngroups,
> -					size_t size)
> +					unsigned int nas, size_t size)
>  {
>  	struct vdpa_device *vdev;
>  	int err = -EINVAL;
> @@ -102,6 +103,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>  	vdev->features_valid = false;
>  	vdev->nvqs = nvqs;
>  	vdev->ngroups = ngroups;
> +	vdev->nas = nas;
>  
>  	err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
>  	if (err)
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 6669c561bc6e..5dc04ec271bb 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -354,7 +354,7 @@ static struct vdpasim *vdpasim_create(void)
>  		ops = &vdpasim_net_config_ops;
>  
>  	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> -				    VDPASIM_VQ_NUM, 1);
> +				    VDPASIM_VQ_NUM, 1, 1);
>  	if (!vdpasim)
>  		goto err_alloc;
>  
> @@ -581,7 +581,7 @@ static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
>  	return vdpasim->generation;
>  }
>  
> -static int vdpasim_set_map(struct vdpa_device *vdpa,
> +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
>  			   struct vhost_iotlb *iotlb)
>  {
>  	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -608,7 +608,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
>  	return ret;
>  }
>  
> -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> +			   u64 iova, u64 size,
>  			   u64 pa, u32 perm)
>  {
>  	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -622,7 +623,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
>  	return ret;
>  }
>  
> -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
> +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> +			     u64 iova, u64 size)
>  {
>  	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>  
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index ec3c94f706c1..eeefcd971e3f 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -557,10 +557,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
>  		return r;
>  
>  	if (ops->dma_map) {
> -		r = ops->dma_map(vdpa, iova, size, pa, perm);
> +		r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
>  	} else if (ops->set_map) {
>  		if (!v->in_batch)
> -			r = ops->set_map(vdpa, iotlb);
> +			r = ops->set_map(vdpa, 0, iotlb);
>  	} else {
>  		r = iommu_map(v->domain, iova, pa, size,
>  			      perm_to_iommu_flags(perm));
> @@ -579,10 +579,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
>  	vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>  
>  	if (ops->dma_map) {
> -		ops->dma_unmap(vdpa, iova, size);
> +		ops->dma_unmap(vdpa, 0, iova, size);
>  	} else if (ops->set_map) {
>  		if (!v->in_batch)
> -			ops->set_map(vdpa, iotlb);
> +			ops->set_map(vdpa, 0, iotlb);
>  	} else {
>  		iommu_unmap(v->domain, iova, size);
>  	}
> @@ -700,7 +700,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>  		break;
>  	case VHOST_IOTLB_BATCH_END:
>  		if (v->in_batch && ops->set_map)
> -			ops->set_map(vdpa, iotlb);
> +			ops->set_map(vdpa, 0, iotlb);
>  		v->in_batch = false;
>  		break;
>  	default:
> @@ -949,6 +949,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
>  	int minor;
>  	int r;
>  
> +	/* Only support 1 address space */
> +	if (vdpa->ngroups != 1)
> +		return -ENOTSUPP;
> +

Did you mean to check agains vdpa->nas?

>  	/* Currently, we only accept the network devices. */
>  	if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
>  		return -ENOTSUPP;
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index d829512efd27..1e1163daa352 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -43,6 +43,8 @@ struct vdpa_vq_state {
>   * @index: device index
>   * @features_valid: were features initialized? for legacy guests
>   * @nvqs: the number of virtqueues
> + * @ngroups: the number of virtqueue groups
> + * @nas: the number of address spaces
>   */
>  struct vdpa_device {
>  	struct device dev;
> @@ -52,6 +54,7 @@ struct vdpa_device {
>  	bool features_valid;
>  	int nvqs;
>  	unsigned int ngroups;
> +	unsigned int nas;
>  };
>  
>  /**
> @@ -161,6 +164,7 @@ struct vdpa_device {
>   *				Needed for device that using device
>   *				specific DMA translation (on-chip IOMMU)
>   *				@vdev: vdpa device
> + *				@asid: address space identifier
>   *				@iotlb: vhost memory mapping to be
>   *				used by the vDPA
>   *				Returns integer: success (0) or error (< 0)
> @@ -169,6 +173,7 @@ struct vdpa_device {
>   *				specific DMA translation (on-chip IOMMU)
>   *				and preferring incremental map.
>   *				@vdev: vdpa device
> + *				@asid: address space identifier
>   *				@iova: iova to be mapped
>   *				@size: size of the area
>   *				@pa: physical address for the map
> @@ -180,6 +185,7 @@ struct vdpa_device {
>   *				specific DMA translation (on-chip IOMMU)
>   *				and preferring incremental unmap.
>   *				@vdev: vdpa device
> + *				@asid: address space identifier
>   *				@iova: iova to be unmapped
>   *				@size: size of the area
>   *				Returns integer: success (0) or error (< 0)
> @@ -225,10 +231,12 @@ struct vdpa_config_ops {
>  	u32 (*get_generation)(struct vdpa_device *vdev);
>  
>  	/* DMA ops */
> -	int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> -	int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
> -		       u64 pa, u32 perm);
> -	int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
> +	int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
> +		       struct vhost_iotlb *iotlb);
> +	int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
> +		       u64 iova, u64 size, u64 pa, u32 perm);
> +	int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> +			 u64 iova, u64 size);
>  
>  	/* Free device resources */
>  	void (*free)(struct vdpa_device *vdev);
> @@ -237,11 +245,12 @@ struct vdpa_config_ops {
>  struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>  					const struct vdpa_config_ops *config,
>  					int nvqs, unsigned int ngroups,
> -					size_t size);
> +					unsigned int nas, size_t size);
>  
> -#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
> +#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, \
> +			  ngroups, nas)				    \
>  			  container_of(__vdpa_alloc_device( \
> -				       parent, config, nvqs, ngroups, \
> +				       parent, config, nvqs, ngroups, nas,  \
>  				       sizeof(dev_struct) + \
>  				       BUILD_BUG_ON_ZERO(offsetof( \
>  				       dev_struct, member))), \
> -- 
> 2.20.1
>
Eli Cohen Oct. 1, 2020, 1:23 p.m. UTC | #11
On Thu, Sep 24, 2020 at 11:21:10AM +0800, Jason Wang wrote:
> This patches introduces the multiple address spaces support for vDPA
> device. This idea is to identify a specific address space via an
> dedicated identifier - ASID.
> 
> During vDPA device allocation, vDPA device driver needs to report the
> number of address spaces supported by the device then the DMA mapping
> ops of the vDPA device needs to be extended to support ASID.
> 
> This helps to isolate the DMA among the virtqueues. E.g in the case of
> virtio-net, the control virtqueue will not be assigned directly to
> guest.
> 
> This RFC patch only converts for the device that wants its own
> IOMMU/DMA translation logic. So it will rejects the device with more
> that 1 address space that depends on platform IOMMU. The plan to
> moving all the DMA mapping logic to the vDPA device driver instead of
> doing it in vhost-vDPA (otherwise it could result a very complicated
> APIs and actually vhost-vDPA doesn't care about how the actual
> composition/emulation were done in the device driver).
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vdpa/ifcvf/ifcvf_main.c   |  2 +-
>  drivers/vdpa/mlx5/net/mlx5_vnet.c |  5 +++--
>  drivers/vdpa/vdpa.c               |  4 +++-
>  drivers/vdpa/vdpa_sim/vdpa_sim.c  | 10 ++++++----
>  drivers/vhost/vdpa.c              | 14 +++++++++-----
>  include/linux/vdpa.h              | 23 ++++++++++++++++-------
>  6 files changed, 38 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index e6a0be374e51..86cdf5f8bcae 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -440,7 +440,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  
>  	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
>  				    dev, &ifc_vdpa_ops,
> -				    IFCVF_MAX_QUEUE_PAIRS * 2, 1);
> +				    IFCVF_MAX_QUEUE_PAIRS * 2, 1, 1);
>  
>  	if (adapter == NULL) {
>  		IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> index 4e480f4f754e..db7404e121bf 100644
> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
> @@ -1788,7 +1788,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
>  	return mvdev->generation;
>  }
>  
> -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
> +			     struct vhost_iotlb *iotlb)
>  {
>  	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>  	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
> @@ -1931,7 +1932,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
>  	max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
>  
>  	ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
> -				 2 * mlx5_vdpa_max_qps(max_vqs), 1);
> +				 2 * mlx5_vdpa_max_qps(max_vqs), 1, 1);
>  	if (IS_ERR(ndev))
>  		return ndev;
>  
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index 46399746ec7c..05195fa7865d 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -63,6 +63,7 @@ static void vdpa_release_dev(struct device *d)
>   * @config: the bus operations that is supported by this device
>   * @nvqs: number of virtqueues supported by this device
>   * @ngroups: number of groups supported by this device
> + * @nas: number of address spaces supported by this device
>   * @size: size of the parent structure that contains private data
>   *
>   * Driver should use vdpa_alloc_device() wrapper macro instead of
> @@ -74,7 +75,7 @@ static void vdpa_release_dev(struct device *d)
>  struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>  					const struct vdpa_config_ops *config,
>  					int nvqs, unsigned int ngroups,
> -					size_t size)
> +					unsigned int nas, size_t size)
>  {
>  	struct vdpa_device *vdev;
>  	int err = -EINVAL;
> @@ -102,6 +103,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>  	vdev->features_valid = false;
>  	vdev->nvqs = nvqs;
>  	vdev->ngroups = ngroups;
> +	vdev->nas = nas;
>  
>  	err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
>  	if (err)
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index 6669c561bc6e..5dc04ec271bb 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -354,7 +354,7 @@ static struct vdpasim *vdpasim_create(void)
>  		ops = &vdpasim_net_config_ops;
>  
>  	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
> -				    VDPASIM_VQ_NUM, 1);
> +				    VDPASIM_VQ_NUM, 1, 1);
>  	if (!vdpasim)
>  		goto err_alloc;
>  
> @@ -581,7 +581,7 @@ static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
>  	return vdpasim->generation;
>  }
>  
> -static int vdpasim_set_map(struct vdpa_device *vdpa,
> +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
>  			   struct vhost_iotlb *iotlb)
>  {
>  	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -608,7 +608,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
>  	return ret;
>  }
>  
> -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
> +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
> +			   u64 iova, u64 size,
>  			   u64 pa, u32 perm)
>  {
>  	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
> @@ -622,7 +623,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
>  	return ret;
>  }
>  
> -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
> +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
> +			     u64 iova, u64 size)
>  {
>  	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>  
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index ec3c94f706c1..eeefcd971e3f 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -557,10 +557,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
>  		return r;
>  
>  	if (ops->dma_map) {
> -		r = ops->dma_map(vdpa, iova, size, pa, perm);
> +		r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
>  	} else if (ops->set_map) {
>  		if (!v->in_batch)
> -			r = ops->set_map(vdpa, iotlb);
> +			r = ops->set_map(vdpa, 0, iotlb);
>  	} else {
>  		r = iommu_map(v->domain, iova, pa, size,
>  			      perm_to_iommu_flags(perm));
> @@ -579,10 +579,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
>  	vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>  
>  	if (ops->dma_map) {
> -		ops->dma_unmap(vdpa, iova, size);
> +		ops->dma_unmap(vdpa, 0, iova, size);
>  	} else if (ops->set_map) {
>  		if (!v->in_batch)
> -			ops->set_map(vdpa, iotlb);
> +			ops->set_map(vdpa, 0, iotlb);
>  	} else {
>  		iommu_unmap(v->domain, iova, size);
>  	}
> @@ -700,7 +700,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>  		break;
>  	case VHOST_IOTLB_BATCH_END:
>  		if (v->in_batch && ops->set_map)
> -			ops->set_map(vdpa, iotlb);
> +			ops->set_map(vdpa, 0, iotlb);
>  		v->in_batch = false;
>  		break;
>  	default:
> @@ -949,6 +949,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
>  	int minor;
>  	int r;
>  
> +	/* Only support 1 address space */
> +	if (vdpa->ngroups != 1)
> +		return -ENOTSUPP;

Checkpatch warning:  prefer EOPNOTSUPP

> +
>  	/* Currently, we only accept the network devices. */
>  	if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
>  		return -ENOTSUPP;
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index d829512efd27..1e1163daa352 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -43,6 +43,8 @@ struct vdpa_vq_state {
>   * @index: device index
>   * @features_valid: were features initialized? for legacy guests
>   * @nvqs: the number of virtqueues
> + * @ngroups: the number of virtqueue groups
> + * @nas: the number of address spaces
>   */
>  struct vdpa_device {
>  	struct device dev;
> @@ -52,6 +54,7 @@ struct vdpa_device {
>  	bool features_valid;
>  	int nvqs;
>  	unsigned int ngroups;
> +	unsigned int nas;
>  };
>  
>  /**
> @@ -161,6 +164,7 @@ struct vdpa_device {
>   *				Needed for device that using device
>   *				specific DMA translation (on-chip IOMMU)
>   *				@vdev: vdpa device
> + *				@asid: address space identifier
>   *				@iotlb: vhost memory mapping to be
>   *				used by the vDPA
>   *				Returns integer: success (0) or error (< 0)
> @@ -169,6 +173,7 @@ struct vdpa_device {
>   *				specific DMA translation (on-chip IOMMU)
>   *				and preferring incremental map.
>   *				@vdev: vdpa device
> + *				@asid: address space identifier
>   *				@iova: iova to be mapped
>   *				@size: size of the area
>   *				@pa: physical address for the map
> @@ -180,6 +185,7 @@ struct vdpa_device {
>   *				specific DMA translation (on-chip IOMMU)
>   *				and preferring incremental unmap.
>   *				@vdev: vdpa device
> + *				@asid: address space identifier
>   *				@iova: iova to be unmapped
>   *				@size: size of the area
>   *				Returns integer: success (0) or error (< 0)
> @@ -225,10 +231,12 @@ struct vdpa_config_ops {
>  	u32 (*get_generation)(struct vdpa_device *vdev);
>  
>  	/* DMA ops */
> -	int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
> -	int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
> -		       u64 pa, u32 perm);
> -	int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
> +	int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
> +		       struct vhost_iotlb *iotlb);
> +	int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
> +		       u64 iova, u64 size, u64 pa, u32 perm);
> +	int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
> +			 u64 iova, u64 size);
>  
>  	/* Free device resources */
>  	void (*free)(struct vdpa_device *vdev);
> @@ -237,11 +245,12 @@ struct vdpa_config_ops {
>  struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>  					const struct vdpa_config_ops *config,
>  					int nvqs, unsigned int ngroups,
> -					size_t size);
> +					unsigned int nas, size_t size);
>  
> -#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
> +#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, \
> +			  ngroups, nas)				    \
>  			  container_of(__vdpa_alloc_device( \
> -				       parent, config, nvqs, ngroups, \
> +				       parent, config, nvqs, ngroups, nas,  \
>  				       sizeof(dev_struct) + \
>  				       BUILD_BUG_ON_ZERO(offsetof( \
>  				       dev_struct, member))), \
> -- 
> 2.20.1
>
Jason Wang Oct. 9, 2020, 2:01 a.m. UTC | #12
On 2020/9/30 下午7:26, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:06AM +0800, Jason Wang wrote:
>> To prepare for the ASID support for vhost-vdpa, try to pass IOTLB
>> object to dma helpers.
> Maybe it's worth mentioning here that this patch does not change any
> functionality and is presented as a preparation for passing different
> iotlb's instead of using dev->iotlb


Right, let me add them in the next version.

Thanks


>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/vhost/vdpa.c | 40 ++++++++++++++++++++++------------------
>>   1 file changed, 22 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index 9c641274b9f3..74bef1c15a70 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -489,10 +489,11 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>>   	return r;
>>   }
>>   
>> -static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>> +static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v,
>> +				   struct vhost_iotlb *iotlb,
>> +				   u64 start, u64 last)
>>   {
>>   	struct vhost_dev *dev = &v->vdev;
>> -	struct vhost_iotlb *iotlb = dev->iotlb;
>>   	struct vhost_iotlb_map *map;
>>   	struct page *page;
>>   	unsigned long pfn, pinned;
>> @@ -514,8 +515,9 @@ static void vhost_vdpa_iotlb_unmap(struct vhost_vdpa *v, u64 start, u64 last)
>>   static void vhost_vdpa_iotlb_free(struct vhost_vdpa *v)
>>   {
>>   	struct vhost_dev *dev = &v->vdev;
>> +	struct vhost_iotlb *iotlb = dev->iotlb;
>>   
>> -	vhost_vdpa_iotlb_unmap(v, 0ULL, 0ULL - 1);
>> +	vhost_vdpa_iotlb_unmap(v, iotlb, 0ULL, 0ULL - 1);
>>   	kfree(dev->iotlb);
>>   	dev->iotlb = NULL;
>>   }
>> @@ -542,15 +544,14 @@ static int perm_to_iommu_flags(u32 perm)
>>   	return flags | IOMMU_CACHE;
>>   }
>>   
>> -static int vhost_vdpa_map(struct vhost_vdpa *v,
>> +static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
>>   			  u64 iova, u64 size, u64 pa, u32 perm)
>>   {
>> -	struct vhost_dev *dev = &v->vdev;
>>   	struct vdpa_device *vdpa = v->vdpa;
>>   	const struct vdpa_config_ops *ops = vdpa->config;
>>   	int r = 0;
>>   
>> -	r = vhost_iotlb_add_range(dev->iotlb, iova, iova + size - 1,
>> +	r = vhost_iotlb_add_range(iotlb, iova, iova + size - 1,
>>   				  pa, perm);
>>   	if (r)
>>   		return r;
>> @@ -559,7 +560,7 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
>>   		r = ops->dma_map(vdpa, iova, size, pa, perm);
>>   	} else if (ops->set_map) {
>>   		if (!v->in_batch)
>> -			r = ops->set_map(vdpa, dev->iotlb);
>> +			r = ops->set_map(vdpa, iotlb);
>>   	} else {
>>   		r = iommu_map(v->domain, iova, pa, size,
>>   			      perm_to_iommu_flags(perm));
>> @@ -568,29 +569,30 @@ static int vhost_vdpa_map(struct vhost_vdpa *v,
>>   	return r;
>>   }
>>   
>> -static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size)
>> +static void vhost_vdpa_unmap(struct vhost_vdpa *v,
>> +			     struct vhost_iotlb *iotlb,
>> +			     u64 iova, u64 size)
>>   {
>> -	struct vhost_dev *dev = &v->vdev;
>>   	struct vdpa_device *vdpa = v->vdpa;
>>   	const struct vdpa_config_ops *ops = vdpa->config;
>>   
>> -	vhost_vdpa_iotlb_unmap(v, iova, iova + size - 1);
>> +	vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>>   
>>   	if (ops->dma_map) {
>>   		ops->dma_unmap(vdpa, iova, size);
>>   	} else if (ops->set_map) {
>>   		if (!v->in_batch)
>> -			ops->set_map(vdpa, dev->iotlb);
>> +			ops->set_map(vdpa, iotlb);
>>   	} else {
>>   		iommu_unmap(v->domain, iova, size);
>>   	}
>>   }
>>   
>>   static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>> +					   struct vhost_iotlb *iotlb,
>>   					   struct vhost_iotlb_msg *msg)
>>   {
>>   	struct vhost_dev *dev = &v->vdev;
>> -	struct vhost_iotlb *iotlb = dev->iotlb;
>>   	struct page **page_list;
>>   	unsigned long list_size = PAGE_SIZE / sizeof(struct page *);
>>   	unsigned int gup_flags = FOLL_LONGTERM;
>> @@ -644,7 +646,7 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>>   			if (last_pfn && (this_pfn != last_pfn + 1)) {
>>   				/* Pin a contiguous chunk of memory */
>>   				csize = (last_pfn - map_pfn + 1) << PAGE_SHIFT;
>> -				if (vhost_vdpa_map(v, iova, csize,
>> +				if (vhost_vdpa_map(v, iotlb, iova, csize,
>>   						   map_pfn << PAGE_SHIFT,
>>   						   msg->perm))
>>   					goto out;
>> @@ -660,11 +662,12 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v,
>>   	}
>>   
>>   	/* Pin the rest chunk */
>> -	ret = vhost_vdpa_map(v, iova, (last_pfn - map_pfn + 1) << PAGE_SHIFT,
>> +	ret = vhost_vdpa_map(v, iotlb, iova,
>> +			     (last_pfn - map_pfn + 1) << PAGE_SHIFT,
>>   			     map_pfn << PAGE_SHIFT, msg->perm);
>>   out:
>>   	if (ret) {
>> -		vhost_vdpa_unmap(v, msg->iova, msg->size);
>> +		vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
>>   		atomic64_sub(npages, &dev->mm->pinned_vm);
>>   	}
>>   	mmap_read_unlock(dev->mm);
>> @@ -678,6 +681,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>>   	struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev);
>>   	struct vdpa_device *vdpa = v->vdpa;
>>   	const struct vdpa_config_ops *ops = vdpa->config;
>> +	struct vhost_iotlb *iotlb = dev->iotlb;
>>   	int r = 0;
>>   
>>   	r = vhost_dev_check_owner(dev);
>> @@ -686,17 +690,17 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>>   
>>   	switch (msg->type) {
>>   	case VHOST_IOTLB_UPDATE:
>> -		r = vhost_vdpa_process_iotlb_update(v, msg);
>> +		r = vhost_vdpa_process_iotlb_update(v, iotlb, msg);
>>   		break;
>>   	case VHOST_IOTLB_INVALIDATE:
>> -		vhost_vdpa_unmap(v, msg->iova, msg->size);
>> +		vhost_vdpa_unmap(v, iotlb, msg->iova, msg->size);
>>   		break;
>>   	case VHOST_IOTLB_BATCH_BEGIN:
>>   		v->in_batch = true;
>>   		break;
>>   	case VHOST_IOTLB_BATCH_END:
>>   		if (v->in_batch && ops->set_map)
>> -			ops->set_map(vdpa, dev->iotlb);
>> +			ops->set_map(vdpa, iotlb);
>>   		v->in_batch = false;
>>   		break;
>>   	default:
>> -- 
>> 2.20.1
>>
Jason Wang Oct. 9, 2020, 3:51 a.m. UTC | #13
On 2020/10/1 下午9:21, Eli Cohen wrote:
> On Thu, Sep 24, 2020 at 11:21:10AM +0800, Jason Wang wrote:
>> This patches introduces the multiple address spaces support for vDPA
>> device. This idea is to identify a specific address space via an
>> dedicated identifier - ASID.
>>
>> During vDPA device allocation, vDPA device driver needs to report the
>> number of address spaces supported by the device then the DMA mapping
>> ops of the vDPA device needs to be extended to support ASID.
>>
>> This helps to isolate the DMA among the virtqueues. E.g in the case of
>> virtio-net, the control virtqueue will not be assigned directly to
>> guest.
>>
>> This RFC patch only converts for the device that wants its own
>> IOMMU/DMA translation logic. So it will rejects the device with more
>> that 1 address space that depends on platform IOMMU. The plan to
> This is not apparent from the code. Instead you enforce number of groups
> to 1.


Yes, will fix.


>
>> moving all the DMA mapping logic to the vDPA device driver instead of
>> doing it in vhost-vDPA (otherwise it could result a very complicated
>> APIs and actually vhost-vDPA doesn't care about how the actual
>> composition/emulation were done in the device driver).
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/vdpa/ifcvf/ifcvf_main.c   |  2 +-
>>   drivers/vdpa/mlx5/net/mlx5_vnet.c |  5 +++--
>>   drivers/vdpa/vdpa.c               |  4 +++-
>>   drivers/vdpa/vdpa_sim/vdpa_sim.c  | 10 ++++++----
>>   drivers/vhost/vdpa.c              | 14 +++++++++-----
>>   include/linux/vdpa.h              | 23 ++++++++++++++++-------
>>   6 files changed, 38 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
>> index e6a0be374e51..86cdf5f8bcae 100644
>> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
>> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
>> @@ -440,7 +440,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>   
>>   	adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
>>   				    dev, &ifc_vdpa_ops,
>> -				    IFCVF_MAX_QUEUE_PAIRS * 2, 1);
>> +				    IFCVF_MAX_QUEUE_PAIRS * 2, 1, 1);
>>   
>>   	if (adapter == NULL) {
>>   		IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
>> diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> index 4e480f4f754e..db7404e121bf 100644
>> --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
>> @@ -1788,7 +1788,8 @@ static u32 mlx5_vdpa_get_generation(struct vdpa_device *vdev)
>>   	return mvdev->generation;
>>   }
>>   
>> -static int mlx5_vdpa_set_map(struct vdpa_device *vdev, struct vhost_iotlb *iotlb)
>> +static int mlx5_vdpa_set_map(struct vdpa_device *vdev, unsigned int asid,
>> +			     struct vhost_iotlb *iotlb)
>>   {
>>   	struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
>>   	struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev);
>> @@ -1931,7 +1932,7 @@ void *mlx5_vdpa_add_dev(struct mlx5_core_dev *mdev)
>>   	max_vqs = min_t(u32, max_vqs, MLX5_MAX_SUPPORTED_VQS);
>>   
>>   	ndev = vdpa_alloc_device(struct mlx5_vdpa_net, mvdev.vdev, mdev->device, &mlx5_vdpa_ops,
>> -				 2 * mlx5_vdpa_max_qps(max_vqs), 1);
>> +				 2 * mlx5_vdpa_max_qps(max_vqs), 1, 1);
>>   	if (IS_ERR(ndev))
>>   		return ndev;
>>   
>> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
>> index 46399746ec7c..05195fa7865d 100644
>> --- a/drivers/vdpa/vdpa.c
>> +++ b/drivers/vdpa/vdpa.c
>> @@ -63,6 +63,7 @@ static void vdpa_release_dev(struct device *d)
>>    * @config: the bus operations that is supported by this device
>>    * @nvqs: number of virtqueues supported by this device
>>    * @ngroups: number of groups supported by this device
>> + * @nas: number of address spaces supported by this device
>>    * @size: size of the parent structure that contains private data
>>    *
>>    * Driver should use vdpa_alloc_device() wrapper macro instead of
>> @@ -74,7 +75,7 @@ static void vdpa_release_dev(struct device *d)
>>   struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>>   					const struct vdpa_config_ops *config,
>>   					int nvqs, unsigned int ngroups,
>> -					size_t size)
>> +					unsigned int nas, size_t size)
>>   {
>>   	struct vdpa_device *vdev;
>>   	int err = -EINVAL;
>> @@ -102,6 +103,7 @@ struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>>   	vdev->features_valid = false;
>>   	vdev->nvqs = nvqs;
>>   	vdev->ngroups = ngroups;
>> +	vdev->nas = nas;
>>   
>>   	err = dev_set_name(&vdev->dev, "vdpa%u", vdev->index);
>>   	if (err)
>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> index 6669c561bc6e..5dc04ec271bb 100644
>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> @@ -354,7 +354,7 @@ static struct vdpasim *vdpasim_create(void)
>>   		ops = &vdpasim_net_config_ops;
>>   
>>   	vdpasim = vdpa_alloc_device(struct vdpasim, vdpa, NULL, ops,
>> -				    VDPASIM_VQ_NUM, 1);
>> +				    VDPASIM_VQ_NUM, 1, 1);
>>   	if (!vdpasim)
>>   		goto err_alloc;
>>   
>> @@ -581,7 +581,7 @@ static u32 vdpasim_get_generation(struct vdpa_device *vdpa)
>>   	return vdpasim->generation;
>>   }
>>   
>> -static int vdpasim_set_map(struct vdpa_device *vdpa,
>> +static int vdpasim_set_map(struct vdpa_device *vdpa, unsigned int asid,
>>   			   struct vhost_iotlb *iotlb)
>>   {
>>   	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>> @@ -608,7 +608,8 @@ static int vdpasim_set_map(struct vdpa_device *vdpa,
>>   	return ret;
>>   }
>>   
>> -static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
>> +static int vdpasim_dma_map(struct vdpa_device *vdpa, unsigned int asid,
>> +			   u64 iova, u64 size,
>>   			   u64 pa, u32 perm)
>>   {
>>   	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>> @@ -622,7 +623,8 @@ static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size,
>>   	return ret;
>>   }
>>   
>> -static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size)
>> +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, unsigned int asid,
>> +			     u64 iova, u64 size)
>>   {
>>   	struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
>>   
>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>> index ec3c94f706c1..eeefcd971e3f 100644
>> --- a/drivers/vhost/vdpa.c
>> +++ b/drivers/vhost/vdpa.c
>> @@ -557,10 +557,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, struct vhost_iotlb *iotlb,
>>   		return r;
>>   
>>   	if (ops->dma_map) {
>> -		r = ops->dma_map(vdpa, iova, size, pa, perm);
>> +		r = ops->dma_map(vdpa, 0, iova, size, pa, perm);
>>   	} else if (ops->set_map) {
>>   		if (!v->in_batch)
>> -			r = ops->set_map(vdpa, iotlb);
>> +			r = ops->set_map(vdpa, 0, iotlb);
>>   	} else {
>>   		r = iommu_map(v->domain, iova, pa, size,
>>   			      perm_to_iommu_flags(perm));
>> @@ -579,10 +579,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v,
>>   	vhost_vdpa_iotlb_unmap(v, iotlb, iova, iova + size - 1);
>>   
>>   	if (ops->dma_map) {
>> -		ops->dma_unmap(vdpa, iova, size);
>> +		ops->dma_unmap(vdpa, 0, iova, size);
>>   	} else if (ops->set_map) {
>>   		if (!v->in_batch)
>> -			ops->set_map(vdpa, iotlb);
>> +			ops->set_map(vdpa, 0, iotlb);
>>   	} else {
>>   		iommu_unmap(v->domain, iova, size);
>>   	}
>> @@ -700,7 +700,7 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev,
>>   		break;
>>   	case VHOST_IOTLB_BATCH_END:
>>   		if (v->in_batch && ops->set_map)
>> -			ops->set_map(vdpa, iotlb);
>> +			ops->set_map(vdpa, 0, iotlb);
>>   		v->in_batch = false;
>>   		break;
>>   	default:
>> @@ -949,6 +949,10 @@ static int vhost_vdpa_probe(struct vdpa_device *vdpa)
>>   	int minor;
>>   	int r;
>>   
>> +	/* Only support 1 address space */
>> +	if (vdpa->ngroups != 1)
>> +		return -ENOTSUPP;
>> +
> Did you mean to check agains vdpa->nas?


I think we should check both.

Thanks


>
>>   	/* Currently, we only accept the network devices. */
>>   	if (ops->get_device_id(vdpa) != VIRTIO_ID_NET)
>>   		return -ENOTSUPP;
>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>> index d829512efd27..1e1163daa352 100644
>> --- a/include/linux/vdpa.h
>> +++ b/include/linux/vdpa.h
>> @@ -43,6 +43,8 @@ struct vdpa_vq_state {
>>    * @index: device index
>>    * @features_valid: were features initialized? for legacy guests
>>    * @nvqs: the number of virtqueues
>> + * @ngroups: the number of virtqueue groups
>> + * @nas: the number of address spaces
>>    */
>>   struct vdpa_device {
>>   	struct device dev;
>> @@ -52,6 +54,7 @@ struct vdpa_device {
>>   	bool features_valid;
>>   	int nvqs;
>>   	unsigned int ngroups;
>> +	unsigned int nas;
>>   };
>>   
>>   /**
>> @@ -161,6 +164,7 @@ struct vdpa_device {
>>    *				Needed for device that using device
>>    *				specific DMA translation (on-chip IOMMU)
>>    *				@vdev: vdpa device
>> + *				@asid: address space identifier
>>    *				@iotlb: vhost memory mapping to be
>>    *				used by the vDPA
>>    *				Returns integer: success (0) or error (< 0)
>> @@ -169,6 +173,7 @@ struct vdpa_device {
>>    *				specific DMA translation (on-chip IOMMU)
>>    *				and preferring incremental map.
>>    *				@vdev: vdpa device
>> + *				@asid: address space identifier
>>    *				@iova: iova to be mapped
>>    *				@size: size of the area
>>    *				@pa: physical address for the map
>> @@ -180,6 +185,7 @@ struct vdpa_device {
>>    *				specific DMA translation (on-chip IOMMU)
>>    *				and preferring incremental unmap.
>>    *				@vdev: vdpa device
>> + *				@asid: address space identifier
>>    *				@iova: iova to be unmapped
>>    *				@size: size of the area
>>    *				Returns integer: success (0) or error (< 0)
>> @@ -225,10 +231,12 @@ struct vdpa_config_ops {
>>   	u32 (*get_generation)(struct vdpa_device *vdev);
>>   
>>   	/* DMA ops */
>> -	int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
>> -	int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
>> -		       u64 pa, u32 perm);
>> -	int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
>> +	int (*set_map)(struct vdpa_device *vdev, unsigned int asid,
>> +		       struct vhost_iotlb *iotlb);
>> +	int (*dma_map)(struct vdpa_device *vdev, unsigned int asid,
>> +		       u64 iova, u64 size, u64 pa, u32 perm);
>> +	int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
>> +			 u64 iova, u64 size);
>>   
>>   	/* Free device resources */
>>   	void (*free)(struct vdpa_device *vdev);
>> @@ -237,11 +245,12 @@ struct vdpa_config_ops {
>>   struct vdpa_device *__vdpa_alloc_device(struct device *parent,
>>   					const struct vdpa_config_ops *config,
>>   					int nvqs, unsigned int ngroups,
>> -					size_t size);
>> +					unsigned int nas, size_t size);
>>   
>> -#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, ngroups) \
>> +#define vdpa_alloc_device(dev_struct, member, parent, config, nvqs, \
>> +			  ngroups, nas)				    \
>>   			  container_of(__vdpa_alloc_device( \
>> -				       parent, config, nvqs, ngroups, \
>> +				       parent, config, nvqs, ngroups, nas,  \
>>   				       sizeof(dev_struct) + \
>>   				       BUILD_BUG_ON_ZERO(offsetof( \
>>   				       dev_struct, member))), \
>> -- 
>> 2.20.1
>>
Jason Wang Oct. 9, 2020, 3:52 a.m. UTC | #14
On 2020/10/1 下午9:23, Eli Cohen wrote:
>>   
>> +	/* Only support 1 address space */
>> +	if (vdpa->ngroups != 1)
>> +		return -ENOTSUPP;
> Checkpatch warning:  prefer EOPNOTSUPP
>

Will fix.

Thanks