mbox series

[v3,00/11] iio: new DMABUF based API, v3

Message ID 20230403154800.215924-1-paul@crapouillou.net
Headers show
Series iio: new DMABUF based API, v3 | expand

Message

Paul Cercueil April 3, 2023, 3:47 p.m. UTC
Hi Jonathan,

Here's the v3 of my patchset that introduces a new interface based on
DMABUF objects to complement the fileio API, and adds write() support to
the existing fileio API.

It changed quite a lot since V2; the IIO subsystem is now just a DMABUF
importer, and all the complexity related to handling creation, deletion
and export of DMABUFs (including DMA mapping etc.) is gone.

This new interface will be used by Libiio. The code is ready[1] and will
be merged to the main branch as soon as the kernel bits are accepted
upstream.

Note that Libiio (and its server counterpart, iiod) use this new
interface in two different ways:
- by memory-mapping the DMABUFs to access the sample data directly,
  which is much faster than using the existing fileio API as the sample
  data does not need to be copied;
- by passing the DMABUFs around directly to the USB stack, in a
  device-to-device zero-copy fashion, using a new DMABUF interface for
  the USB (FunctionFS to be exact) stack, which is being upstreamed in
  parallel of this patchset [2].

As for write() support, Nuno (Cc'd) said he will work on upstreaming the
DAC counterpart of adc/adi-axi-adc.c in the next few weeks, so there
will be a user for the buffer write() support. I hope you are okay with
this - otherwise, we can just wait until this work is done, and I still
benefit from sending this patchset early to get feedback.

Finally, the dmaengine implementation for this new interface requires a
new dmaengine API function, since dmaengine_prep_slave_sg() will always
transfer the full scatterlist unconditionally, while we want to be able
to transfer an arbitrary amount of bytes from/to the DMABUF. Since
scatterlists seem to be going away soon, the new API function will take
an array of DMA addresses + lengths. I am open to suggestions if anybody
(especially Vinod) have a better design in mind.

Cheers,
-Paul

[1]: https://github.com/analogdevicesinc/libiio/pull/928
[2]: https://lore.kernel.org/linux-usb/425c1b8ea20002c6344a574cd094b4c715c67ba6.camel@crapouillou.net/T/#t

---
Changelog:
* Patches 01-02 are new;
* Patches [03/11], [05/11] didn't change;
* Patch [04/11]:
    - Reorganize arguments to iio_dma_buffer_io()
    - Change 'is_write' argument to 'is_from_user'
    - Change (__force char *) to (__force __user char *), in
      iio_dma_buffer_write(), since we only want to drop the "const".
* Patch [07/11]:
    - Get rid of the old IOCTLs. The IIO subsystem does not create or
      manage DMABUFs anymore, and only attaches/detaches externally
      created DMABUFs.
    - Add IIO_BUFFER_DMABUF_CYCLIC to the supported flags.
* Patch [09/11]:
    Update code to provide the functions that will be used as callbacks
    for the new IOCTLs.
* Patch [10/11]:
    Use the new dmaengine_prep_slave_dma_array(), and adapt the code to
    work with the new functions introduced in industrialio-buffer-dma.c.
* Patch [11/11]:
    Update the documentation to reflect the new API.

---
Alexandru Ardelean (1):
  iio: buffer-dma: split iio_dma_buffer_fileio_free() function

Paul Cercueil (10):
  dmaengine: Add API function dmaengine_prep_slave_dma_array()
  dmaengine: dma-axi-dmac: Implement device_prep_slave_dma_array
  iio: buffer-dma: Get rid of outgoing queue
  iio: buffer-dma: Enable buffer write support
  iio: buffer-dmaengine: Support specifying buffer direction
  iio: buffer-dmaengine: Enable write support
  iio: core: Add new DMABUF interface infrastructure
  iio: buffer-dma: Enable support for DMABUFs
  iio: buffer-dmaengine: Support new DMABUF based userspace API
  Documentation: iio: Document high-speed DMABUF based API

 Documentation/iio/dmabuf_api.rst              |  59 +++
 Documentation/iio/index.rst                   |   2 +
 drivers/dma/dma-axi-dmac.c                    |  41 ++
 drivers/iio/adc/adi-axi-adc.c                 |   3 +-
 drivers/iio/buffer/industrialio-buffer-dma.c  | 331 +++++++++++---
 .../buffer/industrialio-buffer-dmaengine.c    |  77 +++-
 drivers/iio/industrialio-buffer.c             | 402 ++++++++++++++++++
 include/linux/dmaengine.h                     |  16 +
 include/linux/iio/buffer-dma.h                |  40 +-
 include/linux/iio/buffer-dmaengine.h          |   5 +-
 include/linux/iio/buffer_impl.h               |  22 +
 include/uapi/linux/iio/buffer.h               |  22 +
 12 files changed, 939 insertions(+), 81 deletions(-)
 create mode 100644 Documentation/iio/dmabuf_api.rst

Comments

Nuno Sá April 4, 2023, 7:32 a.m. UTC | #1
On Mon, 2023-04-03 at 17:47 +0200, Paul Cercueil wrote:
> Add the necessary infrastructure to the IIO core to support a new
> optional DMABUF based interface.
> 
> With this new interface, DMABUF objects (externally created) can be
> attached to a IIO buffer, and subsequently used for data transfer.
> 
> A userspace application can then use this interface to share DMABUF
> objects between several interfaces, allowing it to transfer data in a
> zero-copy fashion, for instance between IIO and the USB stack.
> 
> The userspace application can also memory-map the DMABUF objects, and
> access the sample data directly. The advantage of doing this vs. the
> read() interface is that it avoids an extra copy of the data between
> the
> kernel and userspace. This is particularly userful for high-speed
> devices which produce several megabytes or even gigabytes of data per
> second.
> 
> As part of the interface, 3 new IOCTLs have been added:
> 
> IIO_BUFFER_DMABUF_ATTACH_IOCTL(int fd):
>  Attach the DMABUF object identified by the given file descriptor to
> the
>  buffer.
> 
> IIO_BUFFER_DMABUF_DETACH_IOCTL(int fd):
>  Detach the DMABUF object identified by the given file descriptor
> from
>  the buffer. Note that closing the IIO buffer's file descriptor will
>  automatically detach all previously attached DMABUF objects.
> 
> IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
>  Request a data transfer to/from the given DMABUF object. Its file
>  descriptor, as well as the transfer size and flags are provided in
> the
>  "iio_dmabuf" structure.
> 
> These three IOCTLs have to be performed on the IIO buffer's file
> descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> 
> ---
> v2: Only allow the new IOCTLs on the buffer FD created with
>     IIO_BUFFER_GET_FD_IOCTL().
> 
> v3: - Get rid of the old IOCTLs. The IIO subsystem does not create or
>     manage DMABUFs anymore, and only attaches/detaches externally
>     created DMABUFs.
>     - Add IIO_BUFFER_DMABUF_CYCLIC to the supported flags.
> ---
>  drivers/iio/industrialio-buffer.c | 402
> ++++++++++++++++++++++++++++++
>  include/linux/iio/buffer_impl.h   |  22 ++
>  include/uapi/linux/iio/buffer.h   |  22 ++
>  3 files changed, 446 insertions(+)
> 
> diff --git a/drivers/iio/industrialio-buffer.c
> b/drivers/iio/industrialio-buffer.c
> index 80c78bd6bbef..5d88e098b3e7 100644
> --- a/drivers/iio/industrialio-buffer.c
> +++ b/drivers/iio/industrialio-buffer.c
> @@ -13,10 +13,14 @@
>  #include <linux/kernel.h>
>  #include <linux/export.h>
>  #include <linux/device.h>
> +#include <linux/dma-buf.h>
> +#include <linux/dma-fence.h>
> +#include <linux/dma-resv.h>
>  #include <linux/file.h>
>  #include <linux/fs.h>
>  #include <linux/cdev.h>
>  #include <linux/slab.h>
> +#include <linux/mm.h>
>  #include <linux/poll.h>
>  #include <linux/sched/signal.h>
>  
> @@ -28,11 +32,41 @@
>  #include <linux/iio/buffer.h>
>  #include <linux/iio/buffer_impl.h>
>  
> +#define DMABUF_ENQUEUE_TIMEOUT_MS 5000
> +
> +struct iio_dma_fence;
> +
> +struct iio_dmabuf_priv {
> +       struct list_head entry;
> +       struct kref ref;
> +
> +       struct iio_buffer *buffer;
> +       struct iio_dma_buffer_block *block;
> +
> +       u64 context;
> +       spinlock_t lock;
> +
> +       struct dma_buf_attachment *attach;
> +       struct iio_dma_fence *fence;
> +};
> +
> +struct iio_dma_fence {
> +       struct dma_fence base;
> +       struct iio_dmabuf_priv *priv;
> +       struct sg_table *sgt;
> +       enum dma_data_direction dir;
> +};
> +
>  static const char * const iio_endian_prefix[] = {
>         [IIO_BE] = "be",
>         [IIO_LE] = "le",
>  };
>  
> +static inline struct iio_dma_fence *to_iio_dma_fence(struct
> dma_fence *fence)
> +{
> +       return container_of(fence, struct iio_dma_fence, base);
> +}
> +

Kind of a nitpick but I only see this being used once so I would maybe
use plain 'container_of()' as you are already doing for:

... = container_of(ref, struct iio_dmabuf_priv, ref);

So I would at least advocate for consistency. I would also probably
ditch the inline but I guess that is more a matter of style/preference.

>  static bool iio_buffer_is_active(struct iio_buffer *buf)
>  {
>         return !list_empty(&buf->buffer_list);
> @@ -329,6 +363,7 @@ void iio_buffer_init(struct iio_buffer *buffer)
>  {
> 

...

> +       priv = attach->importer_priv;
> +       list_del_init(&priv->entry);
> +
> +       iio_buffer_dmabuf_put(attach);
> +       iio_buffer_dmabuf_put(attach);
> +

Is this intended? Looks suspicious...

> +out_dmabuf_put:
> +       dma_buf_put(dmabuf);
> +
> +       return ret;
> +}
> +
> +static const char *
> +iio_buffer_dma_fence_get_driver_name(struct dma_fence *fence)
> +{
> +       return "iio";
> +}
> +
> +static void iio_buffer_dma_fence_release(struct dma_fence *fence)
> +{
> +       struct iio_dma_fence *iio_fence = to_iio_dma_fence(fence);
> +
> +       kfree(iio_fence);
> +}
> +
> +static const struct dma_fence_ops iio_buffer_dma_fence_ops = {
> +       .get_driver_name        =
> iio_buffer_dma_fence_get_driver_name,
> +       .get_timeline_name      =
> iio_buffer_dma_fence_get_driver_name,
> +       .release                = iio_buffer_dma_fence_release,
> +};
> +
> +static int iio_buffer_enqueue_dmabuf(struct iio_dev_buffer_pair *ib,
> +                                    struct iio_dmabuf __user
> *iio_dmabuf_req,
> +                                    bool nonblock)
> +{
> +       struct iio_buffer *buffer = ib->buffer;
> +       struct iio_dmabuf iio_dmabuf;
> +       struct dma_buf_attachment *attach;
> +       struct iio_dmabuf_priv *priv;
> +       enum dma_data_direction dir;
> +       struct iio_dma_fence *fence;
> +       struct dma_buf *dmabuf;
> +       struct sg_table *sgt;
> +       unsigned long timeout;
> +       bool dma_to_ram;
> +       bool cyclic;
> +       int ret;
> +
> +       if (copy_from_user(&iio_dmabuf, iio_dmabuf_req,
> sizeof(iio_dmabuf)))
> +               return -EFAULT;
> +
> +       if (iio_dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
> +               return -EINVAL;
> +
> +       cyclic = iio_dmabuf.flags & IIO_BUFFER_DMABUF_CYCLIC;
> +
> +       /* Cyclic flag is only supported on output buffers */
> +       if (cyclic && buffer->direction != IIO_BUFFER_DIRECTION_OUT)
> +               return -EINVAL;
> +
> +       dmabuf = dma_buf_get(iio_dmabuf.fd);
> +       if (IS_ERR(dmabuf))
> +               return PTR_ERR(dmabuf);
> +
> +       if (!iio_dmabuf.bytes_used || iio_dmabuf.bytes_used > dmabuf-
> >size) {
> +               ret = -EINVAL;
> +               goto err_dmabuf_put;
> +       }
> +
> +       attach = iio_buffer_find_attachment(ib->indio_dev, dmabuf);
> +       if (IS_ERR(attach)) {
> +               ret = PTR_ERR(attach);
> +               goto err_dmabuf_put;
> +       }
> +
> +       priv = attach->importer_priv;
> +
> +       dma_to_ram = buffer->direction == IIO_BUFFER_DIRECTION_IN;
> +       dir = dma_to_ram ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> +
> +       sgt = dma_buf_map_attachment(attach, dir);
> +       if (IS_ERR(sgt)) {
> +               ret = PTR_ERR(sgt);
> +               pr_err("Unable to map attachment: %d\n", ret);

dev_err()? We should be able to reach the iio_dev

> +               goto err_attachment_put;
> +       }
> +
> +       fence = kmalloc(sizeof(*fence), GFP_KERNEL);
> +       if (!fence) {
> +               ret = -ENOMEM;
> +               goto err_unmap_attachment;
> +       }
> +
> 

...

>  static const struct file_operations iio_buffer_chrdev_fileops = {
>         .owner = THIS_MODULE,
>         .llseek = noop_llseek,
>         .read = iio_buffer_read,
>         .write = iio_buffer_write,
> +       .unlocked_ioctl = iio_buffer_chrdev_ioctl,
> +       .compat_ioctl = compat_ptr_ioctl,
>         .poll = iio_buffer_poll,
>         .release = iio_buffer_chrdev_release,
>  };

Hmmm, what about the legacy buffer? We should also support this
interface using it, right? Otherwise, using one of the new IOCTL in
iio_device_buffer_ioctl() (or /dev/iio:device0) will error out.

- Nuno Sá
Paul Cercueil April 4, 2023, 7:42 a.m. UTC | #2
Hi Hillf,

Le mardi 04 avril 2023 à 09:59 +0800, Hillf Danton a écrit :
> On 3 Apr 2023 17:47:50 +0200 Paul Cercueil <paul@crapouillou.net>
> > This function can be used to initiate a scatter-gather DMA transfer
> > where the DMA addresses and lengths are located inside arrays.
> > 
> > The major difference with dmaengine_prep_slave_sg() is that it
> > supports
> > specifying the lengths of each DMA transfer; as trying to override
> > the
> > length of the transfer with dmaengine_prep_slave_sg() is a very
> > tedious
> > process. The introduction of a new API function is also justified
> > by the
> > fact that scatterlists are on their way out.
> 
> Given sg's wayout and conceptually iovec and kvec (in
> include/linux/uio.h),
> what you add should have been dma_vec to ease people making use of
> it.
> 
>         struct dma_vec {
>                 dma_addr_t      addr;
>                 size_t          len;
>         };

Well it's not too late ;)

Thanks for the feedback.

Cheers,
-Paul

> > 
> > Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > 
> > ---
> > v3: New patch
> > ---
> >  include/linux/dmaengine.h | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> > index c3656e590213..62efa28c009a 100644
> > --- a/include/linux/dmaengine.h
> > +++ b/include/linux/dmaengine.h
> > @@ -912,6 +912,11 @@ struct dma_device {
> >         struct dma_async_tx_descriptor
> > *(*device_prep_dma_interrupt)(
> >                 struct dma_chan *chan, unsigned long flags);
> >  
> > +       struct dma_async_tx_descriptor
> > *(*device_prep_slave_dma_array)(
> > +               struct dma_chan *chan, dma_addr_t *addrs,
> > +               size_t *lengths, size_t nb,
> > +               enum dma_transfer_direction direction,
> > +               unsigned long flags);
> 
> Then the callback looks like
> 
>         struct dma_async_tx_descriptor *(*device_prep_slave_vec)(
>                 struct dma_chan *chan,
>                 struct dma_vec *vec,
>                 int nvec,
>                 enum dma_transfer_direction direction,
>                 unsigned long flags);
Paul Cercueil April 4, 2023, 7:55 a.m. UTC | #3
Hi Nuno,

Le mardi 04 avril 2023 à 09:32 +0200, Nuno Sá a écrit :
> On Mon, 2023-04-03 at 17:47 +0200, Paul Cercueil wrote:
> > Add the necessary infrastructure to the IIO core to support a new
> > optional DMABUF based interface.
> > 
> > With this new interface, DMABUF objects (externally created) can be
> > attached to a IIO buffer, and subsequently used for data transfer.
> > 
> > A userspace application can then use this interface to share DMABUF
> > objects between several interfaces, allowing it to transfer data in
> > a
> > zero-copy fashion, for instance between IIO and the USB stack.
> > 
> > The userspace application can also memory-map the DMABUF objects,
> > and
> > access the sample data directly. The advantage of doing this vs.
> > the
> > read() interface is that it avoids an extra copy of the data
> > between
> > the
> > kernel and userspace. This is particularly userful for high-speed
> > devices which produce several megabytes or even gigabytes of data
> > per
> > second.
> > 
> > As part of the interface, 3 new IOCTLs have been added:
> > 
> > IIO_BUFFER_DMABUF_ATTACH_IOCTL(int fd):
> >  Attach the DMABUF object identified by the given file descriptor
> > to
> > the
> >  buffer.
> > 
> > IIO_BUFFER_DMABUF_DETACH_IOCTL(int fd):
> >  Detach the DMABUF object identified by the given file descriptor
> > from
> >  the buffer. Note that closing the IIO buffer's file descriptor
> > will
> >  automatically detach all previously attached DMABUF objects.
> > 
> > IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
> >  Request a data transfer to/from the given DMABUF object. Its file
> >  descriptor, as well as the transfer size and flags are provided in
> > the
> >  "iio_dmabuf" structure.
> > 
> > These three IOCTLs have to be performed on the IIO buffer's file
> > descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
> > 
> > Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > 
> > ---
> > v2: Only allow the new IOCTLs on the buffer FD created with
> >     IIO_BUFFER_GET_FD_IOCTL().
> > 
> > v3: - Get rid of the old IOCTLs. The IIO subsystem does not create
> > or
> >     manage DMABUFs anymore, and only attaches/detaches externally
> >     created DMABUFs.
> >     - Add IIO_BUFFER_DMABUF_CYCLIC to the supported flags.
> > ---
> >  drivers/iio/industrialio-buffer.c | 402
> > ++++++++++++++++++++++++++++++
> >  include/linux/iio/buffer_impl.h   |  22 ++
> >  include/uapi/linux/iio/buffer.h   |  22 ++
> >  3 files changed, 446 insertions(+)
> > 
> > diff --git a/drivers/iio/industrialio-buffer.c
> > b/drivers/iio/industrialio-buffer.c
> > index 80c78bd6bbef..5d88e098b3e7 100644
> > --- a/drivers/iio/industrialio-buffer.c
> > +++ b/drivers/iio/industrialio-buffer.c
> > @@ -13,10 +13,14 @@
> >  #include <linux/kernel.h>
> >  #include <linux/export.h>
> >  #include <linux/device.h>
> > +#include <linux/dma-buf.h>
> > +#include <linux/dma-fence.h>
> > +#include <linux/dma-resv.h>
> >  #include <linux/file.h>
> >  #include <linux/fs.h>
> >  #include <linux/cdev.h>
> >  #include <linux/slab.h>
> > +#include <linux/mm.h>
> >  #include <linux/poll.h>
> >  #include <linux/sched/signal.h>
> >  
> > @@ -28,11 +32,41 @@
> >  #include <linux/iio/buffer.h>
> >  #include <linux/iio/buffer_impl.h>
> >  
> > +#define DMABUF_ENQUEUE_TIMEOUT_MS 5000
> > +
> > +struct iio_dma_fence;
> > +
> > +struct iio_dmabuf_priv {
> > +       struct list_head entry;
> > +       struct kref ref;
> > +
> > +       struct iio_buffer *buffer;
> > +       struct iio_dma_buffer_block *block;
> > +
> > +       u64 context;
> > +       spinlock_t lock;
> > +
> > +       struct dma_buf_attachment *attach;
> > +       struct iio_dma_fence *fence;
> > +};
> > +
> > +struct iio_dma_fence {
> > +       struct dma_fence base;
> > +       struct iio_dmabuf_priv *priv;
> > +       struct sg_table *sgt;
> > +       enum dma_data_direction dir;
> > +};
> > +
> >  static const char * const iio_endian_prefix[] = {
> >         [IIO_BE] = "be",
> >         [IIO_LE] = "le",
> >  };
> >  
> > +static inline struct iio_dma_fence *to_iio_dma_fence(struct
> > dma_fence *fence)
> > +{
> > +       return container_of(fence, struct iio_dma_fence, base);
> > +}
> > +
> 
> Kind of a nitpick but I only see this being used once so I would
> maybe
> use plain 'container_of()' as you are already doing for:
> 
> ... = container_of(ref, struct iio_dmabuf_priv, ref);
> 
> So I would at least advocate for consistency. I would also probably
> ditch the inline but I guess that is more a matter of
> style/preference.

Yep, at least it should be consistent.

> 
> >  static bool iio_buffer_is_active(struct iio_buffer *buf)
> >  {
> >         return !list_empty(&buf->buffer_list);
> > @@ -329,6 +363,7 @@ void iio_buffer_init(struct iio_buffer *buffer)
> >  {
> > 
> 
> ...
> 
> > +       priv = attach->importer_priv;
> > +       list_del_init(&priv->entry);
> > +
> > +       iio_buffer_dmabuf_put(attach);
> > +       iio_buffer_dmabuf_put(attach);
> > +
> 
> Is this intended? Looks suspicious...

It is intended, yes. You want to release the dma_buf_attachment that's
created in iio_buffer_attach_dmabuf(), and you need to call
iio_buffer_find_attachment() to get a pointer to it, which also gets a
second reference - so it needs to unref twice.

> 
> > +out_dmabuf_put:
> > +       dma_buf_put(dmabuf);
> > +
> > +       return ret;
> > +}
> > +
> > +static const char *
> > +iio_buffer_dma_fence_get_driver_name(struct dma_fence *fence)
> > +{
> > +       return "iio";
> > +}
> > +
> > +static void iio_buffer_dma_fence_release(struct dma_fence *fence)
> > +{
> > +       struct iio_dma_fence *iio_fence = to_iio_dma_fence(fence);
> > +
> > +       kfree(iio_fence);
> > +}
> > +
> > +static const struct dma_fence_ops iio_buffer_dma_fence_ops = {
> > +       .get_driver_name        =
> > iio_buffer_dma_fence_get_driver_name,
> > +       .get_timeline_name      =
> > iio_buffer_dma_fence_get_driver_name,
> > +       .release                = iio_buffer_dma_fence_release,
> > +};
> > +
> > +static int iio_buffer_enqueue_dmabuf(struct iio_dev_buffer_pair
> > *ib,
> > +                                    struct iio_dmabuf __user
> > *iio_dmabuf_req,
> > +                                    bool nonblock)
> > +{
> > +       struct iio_buffer *buffer = ib->buffer;
> > +       struct iio_dmabuf iio_dmabuf;
> > +       struct dma_buf_attachment *attach;
> > +       struct iio_dmabuf_priv *priv;
> > +       enum dma_data_direction dir;
> > +       struct iio_dma_fence *fence;
> > +       struct dma_buf *dmabuf;
> > +       struct sg_table *sgt;
> > +       unsigned long timeout;
> > +       bool dma_to_ram;
> > +       bool cyclic;
> > +       int ret;
> > +
> > +       if (copy_from_user(&iio_dmabuf, iio_dmabuf_req,
> > sizeof(iio_dmabuf)))
> > +               return -EFAULT;
> > +
> > +       if (iio_dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
> > +               return -EINVAL;
> > +
> > +       cyclic = iio_dmabuf.flags & IIO_BUFFER_DMABUF_CYCLIC;
> > +
> > +       /* Cyclic flag is only supported on output buffers */
> > +       if (cyclic && buffer->direction !=
> > IIO_BUFFER_DIRECTION_OUT)
> > +               return -EINVAL;
> > +
> > +       dmabuf = dma_buf_get(iio_dmabuf.fd);
> > +       if (IS_ERR(dmabuf))
> > +               return PTR_ERR(dmabuf);
> > +
> > +       if (!iio_dmabuf.bytes_used || iio_dmabuf.bytes_used >
> > dmabuf-
> > > size) {
> > +               ret = -EINVAL;
> > +               goto err_dmabuf_put;
> > +       }
> > +
> > +       attach = iio_buffer_find_attachment(ib->indio_dev, dmabuf);
> > +       if (IS_ERR(attach)) {
> > +               ret = PTR_ERR(attach);
> > +               goto err_dmabuf_put;
> > +       }
> > +
> > +       priv = attach->importer_priv;
> > +
> > +       dma_to_ram = buffer->direction == IIO_BUFFER_DIRECTION_IN;
> > +       dir = dma_to_ram ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
> > +
> > +       sgt = dma_buf_map_attachment(attach, dir);
> > +       if (IS_ERR(sgt)) {
> > +               ret = PTR_ERR(sgt);
> > +               pr_err("Unable to map attachment: %d\n", ret);
> 
> dev_err()? We should be able to reach the iio_dev

Should work with (&ib->indio_dev->dev), yes.

> 
> > +               goto err_attachment_put;
> > +       }
> > +
> > +       fence = kmalloc(sizeof(*fence), GFP_KERNEL);
> > +       if (!fence) {
> > +               ret = -ENOMEM;
> > +               goto err_unmap_attachment;
> > +       }
> > +
> > 
> 
> ...
> 
> >  static const struct file_operations iio_buffer_chrdev_fileops = {
> >         .owner = THIS_MODULE,
> >         .llseek = noop_llseek,
> >         .read = iio_buffer_read,
> >         .write = iio_buffer_write,
> > +       .unlocked_ioctl = iio_buffer_chrdev_ioctl,
> > +       .compat_ioctl = compat_ptr_ioctl,
> >         .poll = iio_buffer_poll,
> >         .release = iio_buffer_chrdev_release,
> >  };
> 
> Hmmm, what about the legacy buffer? We should also support this
> interface using it, right? Otherwise, using one of the new IOCTL in
> iio_device_buffer_ioctl() (or /dev/iio:device0) will error out.

According to Jonathan the old chardev route is deprecated, and it's
fine not to support the IOCTL there.

Cheers,
-Paul
Nuno Sá April 4, 2023, 8:21 a.m. UTC | #4
On Tue, 2023-04-04 at 09:55 +0200, Paul Cercueil wrote:
> Hi Nuno,
> 
> Le mardi 04 avril 2023 à 09:32 +0200, Nuno Sá a écrit :
> > On Mon, 2023-04-03 at 17:47 +0200, Paul Cercueil wrote:
> > > Add the necessary infrastructure to the IIO core to support a new
> > > optional DMABUF based interface.
> > > 
> > > With this new interface, DMABUF objects (externally created) can be
> > > attached to a IIO buffer, and subsequently used for data transfer.
> > > 
> > > A userspace application can then use this interface to share DMABUF
> > > objects between several interfaces, allowing it to transfer data in
> > > a
> > > zero-copy fashion, for instance between IIO and the USB stack.
> > > 
> > > The userspace application can also memory-map the DMABUF objects,
> > > and
> > > access the sample data directly. The advantage of doing this vs.
> > > the
> > > read() interface is that it avoids an extra copy of the data
> > > between
> > > the
> > > kernel and userspace. This is particularly userful for high-speed
> > > devices which produce several megabytes or even gigabytes of data
> > > per
> > > second.
> > > 
> > > As part of the interface, 3 new IOCTLs have been added:
> > > 
> > > IIO_BUFFER_DMABUF_ATTACH_IOCTL(int fd):
> > >  Attach the DMABUF object identified by the given file descriptor
> > > to
> > > the
> > >  buffer.
> > > 
> > > IIO_BUFFER_DMABUF_DETACH_IOCTL(int fd):
> > >  Detach the DMABUF object identified by the given file descriptor
> > > from
> > >  the buffer. Note that closing the IIO buffer's file descriptor
> > > will
> > >  automatically detach all previously attached DMABUF objects.
> > > 
> > > IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
> > >  Request a data transfer to/from the given DMABUF object. Its file
> > >  descriptor, as well as the transfer size and flags are provided in
> > > the
> > >  "iio_dmabuf" structure.
> > > 
> > > These three IOCTLs have to be performed on the IIO buffer's file
> > > descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
> > > 
> > > Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > > 
> > > ---
> > > v2: Only allow the new IOCTLs on the buffer FD created with
> > >     IIO_BUFFER_GET_FD_IOCTL().
> > > 
> > > v3: - Get rid of the old IOCTLs. The IIO subsystem does not create
> > > or
> > >     manage DMABUFs anymore, and only attaches/detaches externally
> > >     created DMABUFs.
> > >     - Add IIO_BUFFER_DMABUF_CYCLIC to the supported flags.
> > > ---
> > >  drivers/iio/industrialio-buffer.c | 402
> > > ++++++++++++++++++++++++++++++
> > >  include/linux/iio/buffer_impl.h   |  22 ++
> > >  include/uapi/linux/iio/buffer.h   |  22 ++
> > >  3 files changed, 446 insertions(+)
> > > 
> > > diff --git a/drivers/iio/industrialio-buffer.c
> > > b/drivers/iio/industrialio-buffer.c
> > > index 80c78bd6bbef..5d88e098b3e7 100644
> > > --- a/drivers/iio/industrialio-buffer.c
> > > +++ b/drivers/iio/industrialio-buffer.c
> > > @@ -13,10 +13,14 @@
> > >  #include <linux/kernel.h>
> > >  #include <linux/export.h>
> > >  #include <linux/device.h>
> > > +#include <linux/dma-buf.h>
> > > +#include <linux/dma-fence.h>
> > > +#include <linux/dma-resv.h>
> > >  #include <linux/file.h>
> > >  #include <linux/fs.h>
> > >  #include <linux/cdev.h>
> > >  #include <linux/slab.h>
> > > +#include <linux/mm.h>
> > >  #include <linux/poll.h>
> > >  #include <linux/sched/signal.h>
> > >  
> > > @@ -28,11 +32,41 @@
> > >  #include <linux/iio/buffer.h>
> > >  #include <linux/iio/buffer_impl.h>
> > >  
> > > +#define DMABUF_ENQUEUE_TIMEOUT_MS 5000
> > > +
> > > +struct iio_dma_fence;
> > > +
> > > +struct iio_dmabuf_priv {
> > > +       struct list_head entry;
> > > +       struct kref ref;
> > > +
> > > +       struct iio_buffer *buffer;
> > > +       struct iio_dma_buffer_block *block;
> > > +
> > > +       u64 context;
> > > +       spinlock_t lock;
> > > +
> > > +       struct dma_buf_attachment *attach;
> > > +       struct iio_dma_fence *fence;
> > > +};
> > > +
> > > +struct iio_dma_fence {
> > > +       struct dma_fence base;
> > > +       struct iio_dmabuf_priv *priv;
> > > +       struct sg_table *sgt;
> > > +       enum dma_data_direction dir;
> > > +};
> > > +
> > >  static const char * const iio_endian_prefix[] = {
> > >         [IIO_BE] = "be",
> > >         [IIO_LE] = "le",
> > >  };
> > >  
> > > +static inline struct iio_dma_fence *to_iio_dma_fence(struct
> > > dma_fence *fence)
> > > +{
> > > +       return container_of(fence, struct iio_dma_fence, base);
> > > +}
> > > +
> > 
> > Kind of a nitpick but I only see this being used once so I would
> > maybe
> > use plain 'container_of()' as you are already doing for:
> > 
> > ... = container_of(ref, struct iio_dmabuf_priv, ref);
> > 
> > So I would at least advocate for consistency. I would also probably
> > ditch the inline but I guess that is more a matter of
> > style/preference.
> 
> Yep, at least it should be consistent.
> 
> > 
> > >  static bool iio_buffer_is_active(struct iio_buffer *buf)
> > >  {
> > >         return !list_empty(&buf->buffer_list);
> > > @@ -329,6 +363,7 @@ void iio_buffer_init(struct iio_buffer *buffer)
> > >  {
> > > 
> > 
> > ...
> > 
> > > +       priv = attach->importer_priv;
> > > +       list_del_init(&priv->entry);
> > > +
> > > +       iio_buffer_dmabuf_put(attach);
> > > +       iio_buffer_dmabuf_put(attach);
> > > +
> > 
> > Is this intended? Looks suspicious...
> 
> It is intended, yes. You want to release the dma_buf_attachment that's
> created in iio_buffer_attach_dmabuf(), and you need to call
> iio_buffer_find_attachment() to get a pointer to it, which also gets a
> second reference - so it needs to unref twice.
> 

I see..

...


> > 
> > > +out_dmabuf_put:
> > > +       dma_buf_put(dmabuf);
> > > +
> > > +       return ret;
> > > +}
> > > 
> > 
> > Hmmm, what about the legacy buffer? We should also support this
> > interface using it, right? Otherwise, using one of the new IOCTL in
> > iio_device_buffer_ioctl() (or /dev/iio:device0) will error out.
> 
> According to Jonathan the old chardev route is deprecated, and it's
> fine not to support the IOCTL there.
> 

Oh, alright then... Better that way indeed!

- Nuno Sá
Christian König April 4, 2023, 8:54 a.m. UTC | #5
Am 04.04.23 um 09:42 schrieb Paul Cercueil:
> Hi Hillf,
>
> Le mardi 04 avril 2023 à 09:59 +0800, Hillf Danton a écrit :
>> On 3 Apr 2023 17:47:50 +0200 Paul Cercueil <paul@crapouillou.net>
>>> This function can be used to initiate a scatter-gather DMA transfer
>>> where the DMA addresses and lengths are located inside arrays.
>>>
>>> The major difference with dmaengine_prep_slave_sg() is that it
>>> supports
>>> specifying the lengths of each DMA transfer; as trying to override
>>> the
>>> length of the transfer with dmaengine_prep_slave_sg() is a very
>>> tedious
>>> process. The introduction of a new API function is also justified
>>> by the
>>> fact that scatterlists are on their way out.
>> Given sg's wayout and conceptually iovec and kvec (in
>> include/linux/uio.h),
>> what you add should have been dma_vec to ease people making use of
>> it.
>>
>>          struct dma_vec {
>>                  dma_addr_t      addr;
>>                  size_t          len;
>>          };
> Well it's not too late ;)

Yeah adding that is pretty much the job I have on my TODO list for quite 
some time.

I wouldn't mind if you start adding that and provide helper functions in 
DMA-buf to convert from/to an sg_table.

This way we can migrate the interface over to a new design over time.

Regards,
Christian.

>
> Thanks for the feedback.
>
> Cheers,
> -Paul
>
>>> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
>>>
>>> ---
>>> v3: New patch
>>> ---
>>>   include/linux/dmaengine.h | 16 ++++++++++++++++
>>>   1 file changed, 16 insertions(+)
>>>
>>> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
>>> index c3656e590213..62efa28c009a 100644
>>> --- a/include/linux/dmaengine.h
>>> +++ b/include/linux/dmaengine.h
>>> @@ -912,6 +912,11 @@ struct dma_device {
>>>          struct dma_async_tx_descriptor
>>> *(*device_prep_dma_interrupt)(
>>>                  struct dma_chan *chan, unsigned long flags);
>>>   
>>> +       struct dma_async_tx_descriptor
>>> *(*device_prep_slave_dma_array)(
>>> +               struct dma_chan *chan, dma_addr_t *addrs,
>>> +               size_t *lengths, size_t nb,
>>> +               enum dma_transfer_direction direction,
>>> +               unsigned long flags);
>> Then the callback looks like
>>
>>          struct dma_async_tx_descriptor *(*device_prep_slave_vec)(
>>                  struct dma_chan *chan,
>>                  struct dma_vec *vec,
>>                  int nvec,
>>                  enum dma_transfer_direction direction,
>>                  unsigned long flags);
Vinod Koul April 12, 2023, 5:23 p.m. UTC | #6
On 03-04-23, 17:47, Paul Cercueil wrote:
> This function can be used to initiate a scatter-gather DMA transfer
> where the DMA addresses and lengths are located inside arrays.
> 
> The major difference with dmaengine_prep_slave_sg() is that it supports
> specifying the lengths of each DMA transfer; as trying to override the
> length of the transfer with dmaengine_prep_slave_sg() is a very tedious
> process. The introduction of a new API function is also justified by the
> fact that scatterlists are on their way out.

Do we need a new API for this? why not use device_prep_interleaved_dma?

> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> 
> ---
> v3: New patch
> ---
>  include/linux/dmaengine.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> index c3656e590213..62efa28c009a 100644
> --- a/include/linux/dmaengine.h
> +++ b/include/linux/dmaengine.h
> @@ -912,6 +912,11 @@ struct dma_device {
>  	struct dma_async_tx_descriptor *(*device_prep_dma_interrupt)(
>  		struct dma_chan *chan, unsigned long flags);
>  
> +	struct dma_async_tx_descriptor *(*device_prep_slave_dma_array)(
> +		struct dma_chan *chan, dma_addr_t *addrs,
> +		size_t *lengths, size_t nb,
> +		enum dma_transfer_direction direction,
> +		unsigned long flags);
>  	struct dma_async_tx_descriptor *(*device_prep_slave_sg)(
>  		struct dma_chan *chan, struct scatterlist *sgl,
>  		unsigned int sg_len, enum dma_transfer_direction direction,
> @@ -974,6 +979,17 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_slave_single(
>  						  dir, flags, NULL);
>  }
>  
> +static inline struct dma_async_tx_descriptor *dmaengine_prep_slave_dma_array(
> +	struct dma_chan *chan, dma_addr_t *addrs, size_t *lengths,
> +	size_t nb, enum dma_transfer_direction dir, unsigned long flags)
> +{
> +	if (!chan || !chan->device || !chan->device->device_prep_slave_dma_array)
> +		return NULL;
> +
> +	return chan->device->device_prep_slave_dma_array(chan, addrs, lengths,
> +							 nb, dir, flags);
> +}
> +
>  static inline struct dma_async_tx_descriptor *dmaengine_prep_slave_sg(
>  	struct dma_chan *chan, struct scatterlist *sgl,	unsigned int sg_len,
>  	enum dma_transfer_direction dir, unsigned long flags)
> -- 
> 2.39.2
Paul Cercueil April 13, 2023, 7:59 a.m. UTC | #7
Hi Vinod,

Le mercredi 12 avril 2023 à 22:53 +0530, Vinod Koul a écrit :
> On 03-04-23, 17:47, Paul Cercueil wrote:
> > This function can be used to initiate a scatter-gather DMA transfer
> > where the DMA addresses and lengths are located inside arrays.
> > 
> > The major difference with dmaengine_prep_slave_sg() is that it
> > supports
> > specifying the lengths of each DMA transfer; as trying to override
> > the
> > length of the transfer with dmaengine_prep_slave_sg() is a very
> > tedious
> > process. The introduction of a new API function is also justified
> > by the
> > fact that scatterlists are on their way out.
> 
> Do we need a new API for this? why not use
> device_prep_interleaved_dma?

I admit that I discarded the interleaved DMA without trying it, because
reading the doc, e.g. the one for "struct data_chunk", It looked like
it was not usable for when the DMA addresses are scattered in memory;
it assumes that the following DMA addresses will always come after the
previous one.

Cheers,
-Paul 

> > 
> > Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > 
> > ---
> > v3: New patch
> > ---
> >  include/linux/dmaengine.h | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
> > index c3656e590213..62efa28c009a 100644
> > --- a/include/linux/dmaengine.h
> > +++ b/include/linux/dmaengine.h
> > @@ -912,6 +912,11 @@ struct dma_device {
> >         struct dma_async_tx_descriptor
> > *(*device_prep_dma_interrupt)(
> >                 struct dma_chan *chan, unsigned long flags);
> >  
> > +       struct dma_async_tx_descriptor
> > *(*device_prep_slave_dma_array)(
> > +               struct dma_chan *chan, dma_addr_t *addrs,
> > +               size_t *lengths, size_t nb,
> > +               enum dma_transfer_direction direction,
> > +               unsigned long flags);
> >         struct dma_async_tx_descriptor *(*device_prep_slave_sg)(
> >                 struct dma_chan *chan, struct scatterlist *sgl,
> >                 unsigned int sg_len, enum dma_transfer_direction
> > direction,
> > @@ -974,6 +979,17 @@ static inline struct dma_async_tx_descriptor
> > *dmaengine_prep_slave_single(
> >                                                   dir, flags,
> > NULL);
> >  }
> >  
> > +static inline struct dma_async_tx_descriptor
> > *dmaengine_prep_slave_dma_array(
> > +       struct dma_chan *chan, dma_addr_t *addrs, size_t *lengths,
> > +       size_t nb, enum dma_transfer_direction dir, unsigned long
> > flags)
> > +{
> > +       if (!chan || !chan->device || !chan->device-
> > >device_prep_slave_dma_array)
> > +               return NULL;
> > +
> > +       return chan->device->device_prep_slave_dma_array(chan,
> > addrs, lengths,
> > +                                                        nb, dir,
> > flags);
> > +}
> > +
> >  static inline struct dma_async_tx_descriptor
> > *dmaengine_prep_slave_sg(
> >         struct dma_chan *chan, struct scatterlist *sgl, unsigned
> > int sg_len,
> >         enum dma_transfer_direction dir, unsigned long flags)
> > -- 
> > 2.39.2
>
Jonathan Cameron April 16, 2023, 2:24 p.m. UTC | #8
On Mon,  3 Apr 2023 17:47:52 +0200
Paul Cercueil <paul@crapouillou.net> wrote:

> The buffer-dma code was using two queues, incoming and outgoing, to
> manage the state of the blocks in use.
> 
> While this totally works, it adds some complexity to the code,
> especially since the code only manages 2 blocks. It is much easier to
> just check each block's state manually, and keep a counter for the next
> block to dequeue.
> 
> Since the new DMABUF based API wouldn't use the outgoing queue anyway,
> getting rid of it now makes the upcoming changes simpler.
> 
> With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can
> be removed.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> 
> ---
> v2: - Only remove the outgoing queue, and keep the incoming queue, as we
>       want the buffer to start streaming data as soon as it is enabled.
>     - Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the
>       same as IIO_BLOCK_STATE_DONE.

I'm not that familiar with this code, but with my understanding this makes
sense.   I think it is independent of the earlier patches and is a useful
change in it's own right.  As such, does it make sense to pick this up
ahead of the rest of the series? I'm assuming that discussion on the
rest will take a while.  No great rush as too late for the coming merge
window anyway.

Thanks,

Jonathan
Jonathan Cameron April 16, 2023, 2:35 p.m. UTC | #9
On Mon,  3 Apr 2023 17:47:54 +0200
Paul Cercueil <paul@crapouillou.net> wrote:

> Update the devm_iio_dmaengine_buffer_setup() function to support
> specifying the buffer direction.
> 
> Update the iio_dmaengine_buffer_submit() function to handle input
> buffers as well as output buffers.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>

Just one trivial question inline.

Jonathan

> diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
> index 5f85ba38e6f6..592d2aa9044c 100644
> --- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
> +++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
> @@ -64,14 +64,25 @@ static int iio_dmaengine_buffer_submit_block(struct iio_dma_buffer_queue *queue,
>  	struct dmaengine_buffer *dmaengine_buffer =
>  		iio_buffer_to_dmaengine_buffer(&queue->buffer);
>  	struct dma_async_tx_descriptor *desc;
> +	enum dma_transfer_direction dma_dir;
> +	size_t max_size;
>  	dma_cookie_t cookie;
>  
> -	block->bytes_used = min(block->size, dmaengine_buffer->max_size);
> -	block->bytes_used = round_down(block->bytes_used,
> -			dmaengine_buffer->align);
> +	max_size = min(block->size, dmaengine_buffer->max_size);
> +	max_size = round_down(max_size, dmaengine_buffer->align);
> +
> +	if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
> +		block->bytes_used = max_size;
> +		dma_dir = DMA_DEV_TO_MEM;
> +	} else {
> +		dma_dir = DMA_MEM_TO_DEV;
> +	}
> +
> +	if (!block->bytes_used || block->bytes_used > max_size)
> +		return -EINVAL;

Two paths to here.  Either DIRECTION_IN in which we just set things
up so conditions being checked are always fine (unless max_size == 0?
Can that happen?), or !DIRECTION_IN.
So why not move this into the else {} branch above?

>  
>  	desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
> -		block->phys_addr, block->bytes_used, DMA_DEV_TO_MEM,
> +		block->phys_addr, block->bytes_used, dma_dir,
>  		DMA_PREP_INTERRUPT);
>  	if (!desc)
>  		return -ENOMEM;
Jonathan Cameron April 16, 2023, 3:04 p.m. UTC | #10
On Mon,  3 Apr 2023 17:47:56 +0200
Paul Cercueil <paul@crapouillou.net> wrote:

> Add the necessary infrastructure to the IIO core to support a new
> optional DMABUF based interface.
> 
> With this new interface, DMABUF objects (externally created) can be
> attached to a IIO buffer, and subsequently used for data transfer.
> 
> A userspace application can then use this interface to share DMABUF
> objects between several interfaces, allowing it to transfer data in a
> zero-copy fashion, for instance between IIO and the USB stack.
> 
> The userspace application can also memory-map the DMABUF objects, and
> access the sample data directly. The advantage of doing this vs. the
> read() interface is that it avoids an extra copy of the data between the
> kernel and userspace. This is particularly userful for high-speed
> devices which produce several megabytes or even gigabytes of data per
> second.

I like numbers to support a patch.  Any nice ones to throw in here
as examples of expected rates?

> 
> As part of the interface, 3 new IOCTLs have been added:
> 
> IIO_BUFFER_DMABUF_ATTACH_IOCTL(int fd):
>  Attach the DMABUF object identified by the given file descriptor to the
>  buffer.
> 
> IIO_BUFFER_DMABUF_DETACH_IOCTL(int fd):
>  Detach the DMABUF object identified by the given file descriptor from
>  the buffer. Note that closing the IIO buffer's file descriptor will
>  automatically detach all previously attached DMABUF objects.
> 
> IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
>  Request a data transfer to/from the given DMABUF object. Its file
>  descriptor, as well as the transfer size and flags are provided in the
>  "iio_dmabuf" structure.
> 
> These three IOCTLs have to be performed on the IIO buffer's file
> descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> 

Trivial comments from me.  I don't (yet) understand dmabuf well enough
to know if that part is right or not.  Not sure I will ever find the time
so relying on those who are more familiar with it to tell me if that code
is correct.

Thanks,

Jonathan



>  static int iio_buffer_chrdev_release(struct inode *inode, struct file *filep)
>  {
>  	struct iio_dev_buffer_pair *ib = filep->private_data;
>  	struct iio_dev *indio_dev = ib->indio_dev;
>  	struct iio_buffer *buffer = ib->buffer;
> +	struct iio_dmabuf_priv *priv, *tmp;
>  
>  	wake_up(&buffer->pollq);
>  
> +	/* Close all attached DMABUFs */
> +	list_for_each_entry_safe(priv, tmp, &buffer->dmabufs, entry) {
> +		list_del_init(&priv->entry);
> +		iio_buffer_dmabuf_put(priv->attach);
> +	}
> +
> +	/* TODO: Is it safe? Can "ib" be freed here? */

No idea :)  However that need resolving before we apply this.

> +	if (!list_empty(&buffer->dmabufs))
> +		dev_warn(&indio_dev->dev, "Buffer FD closed with active transfers\n");
> +
>  	kfree(ib);
>  	clear_bit(IIO_BUSY_BIT_POS, &buffer->flags);
>  	iio_device_put(indio_dev);
> @@ -1515,11 +1591,337 @@ static int iio_buffer_chrdev_release(struct inode *inode, struct file *filep)
>  	return 0;
>  }
>  




> +static int iio_buffer_enqueue_dmabuf(struct iio_dev_buffer_pair *ib,
> +				     struct iio_dmabuf __user *iio_dmabuf_req,
> +				     bool nonblock)
> +{

...

> +
> +	ret = buffer->access->enqueue_dmabuf(buffer, priv->block, sgt,
> +					     iio_dmabuf.bytes_used, cyclic);
> +	if (ret)

Hmm. Is there an easy way to perhaps avoid a function with multiple
error handling paths like we have here.  Perhaps drag the
extra stuff from the the dmabuf_done() function into this if (ret)
then goto err_fence_put;?  I'm not sure if that would make this even
harder to read however.

> +		iio_buffer_signal_dmabuf_done(attach, ret);
> +
> +	dma_buf_put(dmabuf);
> +
> +	return ret;
> +
> +err_resv_unlock:
> +	dma_resv_unlock(dmabuf->resv);
> +err_fence_put:
> +	dma_fence_put(&fence->base);
> +err_unmap_attachment:
> +	dma_buf_unmap_attachment(attach, sgt, dir);
> +err_attachment_put:
> +	iio_buffer_dmabuf_put(attach);
> +err_dmabuf_put:
> +	dma_buf_put(dmabuf);
> +
> +	return ret;
> +}
> +
> +void iio_buffer_signal_dmabuf_done(struct dma_buf_attachment *attach, int ret)
> +{
> +	struct iio_dmabuf_priv *priv = attach->importer_priv;
> +	struct iio_dma_fence *fence = priv->fence;
> +	enum dma_data_direction dir = fence->dir;
> +	struct sg_table *sgt = fence->sgt;
> +
> +	dma_fence_get(&fence->base);
> +	fence->base.error = ret;
> +	dma_fence_signal(&fence->base);
> +	dma_fence_put(&fence->base);
> +
> +	dma_buf_unmap_attachment(attach, sgt, dir);
> +	iio_buffer_dmabuf_put(attach);
> +}
> +EXPORT_SYMBOL_GPL(iio_buffer_signal_dmabuf_done);

...

> diff --git a/include/linux/iio/buffer_impl.h b/include/linux/iio/buffer_impl.h
> index 89c3fd7c29ca..a8a490091277 100644
> --- a/include/linux/iio/buffer_impl.h
> +++ b/include/linux/iio/buffer_impl.h
> @@ -9,8 +9,11 @@
>  #include <uapi/linux/iio/buffer.h>
>  #include <linux/iio/buffer.h>
>  
> +struct dma_buf_attachment;
>  struct iio_dev;
> +struct iio_dma_buffer_block;
>  struct iio_buffer;
> +struct sg_table;
>  
>  /**
>   * INDIO_BUFFER_FLAG_FIXED_WATERMARK - Watermark level of the buffer can not be
> @@ -39,6 +42,9 @@ struct iio_buffer;
>   *                      device stops sampling. Calles are balanced with @enable.
>   * @release:		called when the last reference to the buffer is dropped,
>   *			should free all resources allocated by the buffer
> + * @alloc_dmabuf:	called from userspace via ioctl to allocate one DMABUF.

Looks like you missed updating the docs.

> + * @enqueue_dmabuf:	called from userspace via ioctl to queue this DMABUF
> + *			object to this buffer. Requires a valid DMABUF fd.
>   * @modes:		Supported operating modes by this buffer type
>   * @flags:		A bitmask combination of INDIO_BUFFER_FLAG_*
>   *
> @@ -68,6 +74,14 @@ struct iio_buffer_access_funcs {
>  
>  	void (*release)(struct iio_buffer *buffer);
>  
> +	struct iio_dma_buffer_block * (*attach_dmabuf)(struct iio_buffer *buffer,
> +						       struct dma_buf_attachment *attach);
> +	void (*detach_dmabuf)(struct iio_buffer *buffer,
> +			      struct iio_dma_buffer_block *block);
> +	int (*enqueue_dmabuf)(struct iio_buffer *buffer,
> +			      struct iio_dma_buffer_block *block,
> +			      struct sg_table *sgt, size_t size, bool cyclic);
> +
>  	unsigned int modes;
>  	unsigned int flags;
>  };
Jonathan Cameron April 16, 2023, 3:10 p.m. UTC | #11
On Mon,  3 Apr 2023 17:47:58 +0200
Paul Cercueil <paul@crapouillou.net> wrote:

> Implement iio_dma_buffer_attach_dmabuf(), iio_dma_buffer_detach_dmabuf()
> and iio_dma_buffer_transfer_dmabuf(), which can then be used by the IIO
> DMA buffer implementations.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Hi Paul,

A few superficially comments.

Jonathan

> 
> ---
> v3: Update code to provide the functions that will be used as callbacks
> for the new IOCTLs.
> ---
>  drivers/iio/buffer/industrialio-buffer-dma.c | 157 +++++++++++++++++--
>  include/linux/iio/buffer-dma.h               |  24 +++
>  2 files changed, 168 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
> index e14814e0d4c8..422bd784fd1e 100644
> --- a/drivers/iio/buffer/industrialio-buffer-dma.c
> +++ b/drivers/iio/buffer/industrialio-buffer-dma.c
...

> @@ -412,8 +448,12 @@ static void iio_dma_buffer_submit_block(struct iio_dma_buffer_queue *queue,
>  
>  	block->state = IIO_BLOCK_STATE_ACTIVE;
>  	iio_buffer_block_get(block);
> +

Trivial, but I'd rather not see unrelated white space changes in a patch
doing anything else.

>  	ret = queue->ops->submit(queue, block);
>  	if (ret) {
> +		if (!block->fileio)
> +			iio_buffer_signal_dmabuf_done(block->attach, ret);
> +
>  		/*
>  		 * This is a bit of a problem and there is not much we can do
>  		 * other then wait for the buffer to be disabled and re-enabled
> @@ -645,6 +685,97 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
>  }
>  EXPORT_SYMBOL_GPL(iio_dma_buffer_data_available);

...

> +int iio_dma_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
> +				  struct iio_dma_buffer_block *block,
> +				  struct sg_table *sgt,
> +				  size_t size, bool cyclic)
> +{
> +	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
> +	int ret = 0;

No need to init.

> +
> +	mutex_lock(&queue->lock);
> +	ret = iio_dma_can_enqueue_block(block);
> +	if (ret < 0)
> +		goto out_mutex_unlock;
> +
> +	block->bytes_used = size;
> +	block->cyclic = cyclic;
> +	block->sg_table = sgt;
> +
> +	iio_dma_buffer_enqueue(queue, block);
> +
> +out_mutex_unlock:
> +	mutex_unlock(&queue->lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(iio_dma_buffer_enqueue_dmabuf);

Obviously an unrelated activity but good to namespace these
in a future patch set.

> +
>  /**
>   * iio_dma_buffer_set_bytes_per_datum() - DMA buffer set_bytes_per_datum callback
>   * @buffer: Buffer to set the bytes-per-datum for
> diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
> index 490b93f76fa8..e5e5817e99db 100644
> --- a/include/linux/iio/buffer-dma.h
> +++ b/include/linux/iio/buffer-dma.h

>  /**
>   * enum iio_block_state - State of a struct iio_dma_buffer_block
> @@ -41,6 +43,7 @@ enum iio_block_state {
>   * @queue: Parent DMA buffer queue
>   * @kref: kref used to manage the lifetime of block
>   * @state: Current state of the block
> + * @fileio: True if this buffer is used for fileio mode

Docs need update for the other two new elements.

>   */
>  struct iio_dma_buffer_block {
>  	/* May only be accessed by the owner of the block */
> @@ -63,6 +66,11 @@ struct iio_dma_buffer_block {
>  	 * queue->list_lock if the block is not owned by the core.
>  	 */
>  	enum iio_block_state state;
> +
> +	bool fileio;
> +
> +	struct dma_buf_attachment *attach;
> +	struct sg_table *sg_table;
>  };
Paul Cercueil April 18, 2023, 8:08 a.m. UTC | #12
Hi Jonathan,

Le dimanche 16 avril 2023 à 15:24 +0100, Jonathan Cameron a écrit :
> On Mon,  3 Apr 2023 17:47:52 +0200
> Paul Cercueil <paul@crapouillou.net> wrote:
> 
> > The buffer-dma code was using two queues, incoming and outgoing, to
> > manage the state of the blocks in use.
> > 
> > While this totally works, it adds some complexity to the code,
> > especially since the code only manages 2 blocks. It is much easier
> > to
> > just check each block's state manually, and keep a counter for the
> > next
> > block to dequeue.
> > 
> > Since the new DMABUF based API wouldn't use the outgoing queue
> > anyway,
> > getting rid of it now makes the upcoming changes simpler.
> > 
> > With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and
> > can
> > be removed.
> > 
> > Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > 
> > ---
> > v2: - Only remove the outgoing queue, and keep the incoming queue,
> > as we
> >       want the buffer to start streaming data as soon as it is
> > enabled.
> >     - Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally
> > the
> >       same as IIO_BLOCK_STATE_DONE.
> 
> I'm not that familiar with this code, but with my understanding this
> makes
> sense.   I think it is independent of the earlier patches and is a
> useful
> change in it's own right.  As such, does it make sense to pick this
> up
> ahead of the rest of the series? I'm assuming that discussion on the
> rest will take a while.  No great rush as too late for the coming
> merge
> window anyway.

Actually, you can pick patches 3 to 6 (when all have been acked). They
add write support for buffer-dma implementations; which is a dependency
for the rest of the patchset, but they can live on their own.

Cheers,
-Paul
Jonathan Cameron May 1, 2023, 4:25 p.m. UTC | #13
On Tue, 18 Apr 2023 10:08:21 +0200
Paul Cercueil <paul@crapouillou.net> wrote:

> Hi Jonathan,
> 
> Le dimanche 16 avril 2023 à 15:24 +0100, Jonathan Cameron a écrit :
> > On Mon,  3 Apr 2023 17:47:52 +0200
> > Paul Cercueil <paul@crapouillou.net> wrote:
> >   
> > > The buffer-dma code was using two queues, incoming and outgoing, to
> > > manage the state of the blocks in use.
> > > 
> > > While this totally works, it adds some complexity to the code,
> > > especially since the code only manages 2 blocks. It is much easier
> > > to
> > > just check each block's state manually, and keep a counter for the
> > > next
> > > block to dequeue.
> > > 
> > > Since the new DMABUF based API wouldn't use the outgoing queue
> > > anyway,
> > > getting rid of it now makes the upcoming changes simpler.
> > > 
> > > With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and
> > > can
> > > be removed.
> > > 
> > > Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> > > 
> > > ---
> > > v2: - Only remove the outgoing queue, and keep the incoming queue,
> > > as we
> > >       want the buffer to start streaming data as soon as it is
> > > enabled.
> > >     - Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally
> > > the
> > >       same as IIO_BLOCK_STATE_DONE.  
> > 
> > I'm not that familiar with this code, but with my understanding this
> > makes
> > sense.   I think it is independent of the earlier patches and is a
> > useful
> > change in it's own right.  As such, does it make sense to pick this
> > up
> > ahead of the rest of the series? I'm assuming that discussion on the
> > rest will take a while.  No great rush as too late for the coming
> > merge
> > window anyway.  
> 
> Actually, you can pick patches 3 to 6 (when all have been acked). They
> add write support for buffer-dma implementations; which is a dependency
> for the rest of the patchset, but they can live on their own.

Remind me of that in the cover letter for v4.

Thanks,

Jonathan

> 
> Cheers,
> -Paul