diff mbox series

[v1] media: videobuf2: Allow applications customize data offsets of capture buffers

Message ID 20220322132329.6527-1-dmitry.osipenko@collabora.com
State New
Headers show
Series [v1] media: videobuf2: Allow applications customize data offsets of capture buffers | expand

Commit Message

Dmitry Osipenko March 22, 2022, 1:23 p.m. UTC
Use data offsets provided by applications for multi-planar capture
buffers. This allows V4L to import and use dma-bufs exported by other
subsystems in cases where application wants to customize data offsets
of capture buffers in order to meet hardware alignment requirements of
both dma-buf exporter and importer.

This feature is wanted for providing a better support of media hardware
found on Chromebooks. In particular display and camera ISP hardware of
Rockchip and MediaTek SoCs require special handling by userspace because
display h/w has specific alignment requirements that don't match default
alignments expected by V4L and there is a need to customize the data
offsets in case of multi-planar formats.

Some drivers already have preliminary support for data offsets
customization of capture buffers, like NVIDIA Tegra video decoder driver
for example, and V4L allows applications to provide data offsets for
multi-planar output buffers, let's support such customization for the
capture buffers as well.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
---
 Documentation/userspace-api/media/v4l/buffer.rst | 9 ++++++++-
 drivers/media/common/videobuf2/videobuf2-v4l2.c  | 7 +++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

Comments

Nicolas Dufresne March 23, 2022, 7:21 p.m. UTC | #1
Le mercredi 23 mars 2022 à 17:28 +0300, Dmitry Osipenko a écrit :
> Hi Nicolas,
> 
> On 3/23/22 16:05, Nicolas Dufresne wrote:
> > Hi Dmitry,
> > 
> > thanks for giving a second look a this issue.
> > 
> > Le mardi 22 mars 2022 à 16:23 +0300, Dmitry Osipenko a écrit :
> > > Use data offsets provided by applications for multi-planar capture
> > > buffers. This allows V4L to import and use dma-bufs exported by other
> > > subsystems in cases where application wants to customize data offsets
> > > of capture buffers in order to meet hardware alignment requirements of
> > > both dma-buf exporter and importer.
> > > 
> > > This feature is wanted for providing a better support of media hardware
> > > found on Chromebooks. In particular display and camera ISP hardware of
> > > Rockchip and MediaTek SoCs require special handling by userspace because
> > > display h/w has specific alignment requirements that don't match default
> > > alignments expected by V4L and there is a need to customize the data
> > > offsets in case of multi-planar formats.
> > > 
> > > Some drivers already have preliminary support for data offsets
> > > customization of capture buffers, like NVIDIA Tegra video decoder driver
> > > for example, and V4L allows applications to provide data offsets for
> > > multi-planar output buffers, let's support such customization for the
> > > capture buffers as well.
> > > 
> > > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > > ---
> > >  Documentation/userspace-api/media/v4l/buffer.rst | 9 ++++++++-
> > >  drivers/media/common/videobuf2/videobuf2-v4l2.c  | 7 +++++++
> > >  2 files changed, 15 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
> > > index 4638ec64db00..75b1929e2acb 100644
> > > --- a/Documentation/userspace-api/media/v4l/buffer.rst
> > > +++ b/Documentation/userspace-api/media/v4l/buffer.rst
> > > @@ -369,13 +369,20 @@ struct v4l2_plane
> > >        - ``data_offset``
> > >        - Offset in bytes to video data in the plane. Drivers must set this
> > >  	field when ``type`` refers to a capture stream, applications when
> > > -	it refers to an output stream.
> > > +	it refers to an output or capture stream.
> > 
> > There is a clear contradiction in this paragraph. Both the driver and the
> > application MUST set the data_offset.
> 
> I'm not sure where the contradiction is. Application must initialize the
> data_offset and driver must set data_offset too, if it's appropriate to
> do that for a particular driver.
> 
> > Would it be possible to demo your idea by implementing this in a virtual driver
> > ? vivid already have data_offset for capture in some cases, you could verify if
> > your idea works without any conflict in this scenario.
> 
> I actually considered implementing it in the vivid driver, but vivid
> driver already sets the data_offset to fixed values [1], so I decided
> that not to change it.
> 
> But maybe we actually could extend the vivid driver by accepting
> data_offset from userspace for the cases where the fixed offset value is
> zero in the driver.. not sure.

The is the core of the issue, how do you represent both a driver use of
data_offset and a userland provided data_offset at the same time. Contradiction
might be the wrong term, but minimally there is a large gap in the specification
for which I don't have an easy answer.

> 
> [1]
> https://elixir.bootlin.com/linux/latest/source/drivers/media/test-drivers/vivid/vivid-vid-cap.c#L172
> 
> I verified my idea using the NVIDIA Tegra video decoder driver, which
> already takes data_offsets for capture planes [3] and libvdpau-tegra
> imports DRM dma-bufs into the V4L driver [4][5].
> 
> [3]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/media/platform/nvidia/tegra-vde/v4l2.c#n236
> [4]
> https://github.com/grate-driver/libvdpau-tegra/blob/master/src/decoder.c#L685
> [5]
> https://github.com/grate-driver/libvdpau-tegra/blob/master/src/v4l2.c#L359
> 
> The plan is to extend RK ISP driver with support of data_offsets next,
> once we'll agree that this acceptable approach and we don't actually
> need go back to implementing the new VIDIOC_EXT_QBUF ioctl [6].
> 
> [6]
> https://patchwork.linuxtv.org/project/linux-media/cover/20210114180738.1758707-1-helen.koike@collabora.com/
> 
> This patch solves the problem for userspace when it wants to import
> buffers into V4L in case of multi-planar formats, but it doesn't cover
> all other possible cases that may require offsets customization too. On
> the other hand, it's easy to start accepting data_offset for the capture
> MPLANES without introducing new UAPIs, so I decided that will be best to
> start easy with the simplest solution.
Dmitry Osipenko March 24, 2022, 6:20 p.m. UTC | #2
On 3/23/22 22:21, Nicolas Dufresne wrote:
> Le mercredi 23 mars 2022 à 17:28 +0300, Dmitry Osipenko a écrit :
>> Hi Nicolas,
>>
>> On 3/23/22 16:05, Nicolas Dufresne wrote:
>>> Hi Dmitry,
>>>
>>> thanks for giving a second look a this issue.
>>>
>>> Le mardi 22 mars 2022 à 16:23 +0300, Dmitry Osipenko a écrit :
>>>> Use data offsets provided by applications for multi-planar capture
>>>> buffers. This allows V4L to import and use dma-bufs exported by other
>>>> subsystems in cases where application wants to customize data offsets
>>>> of capture buffers in order to meet hardware alignment requirements of
>>>> both dma-buf exporter and importer.
>>>>
>>>> This feature is wanted for providing a better support of media hardware
>>>> found on Chromebooks. In particular display and camera ISP hardware of
>>>> Rockchip and MediaTek SoCs require special handling by userspace because
>>>> display h/w has specific alignment requirements that don't match default
>>>> alignments expected by V4L and there is a need to customize the data
>>>> offsets in case of multi-planar formats.
>>>>
>>>> Some drivers already have preliminary support for data offsets
>>>> customization of capture buffers, like NVIDIA Tegra video decoder driver
>>>> for example, and V4L allows applications to provide data offsets for
>>>> multi-planar output buffers, let's support such customization for the
>>>> capture buffers as well.
>>>>
>>>> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>>>> ---
>>>>  Documentation/userspace-api/media/v4l/buffer.rst | 9 ++++++++-
>>>>  drivers/media/common/videobuf2/videobuf2-v4l2.c  | 7 +++++++
>>>>  2 files changed, 15 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
>>>> index 4638ec64db00..75b1929e2acb 100644
>>>> --- a/Documentation/userspace-api/media/v4l/buffer.rst
>>>> +++ b/Documentation/userspace-api/media/v4l/buffer.rst
>>>> @@ -369,13 +369,20 @@ struct v4l2_plane
>>>>        - ``data_offset``
>>>>        - Offset in bytes to video data in the plane. Drivers must set this
>>>>  	field when ``type`` refers to a capture stream, applications when
>>>> -	it refers to an output stream.
>>>> +	it refers to an output or capture stream.
>>>
>>> There is a clear contradiction in this paragraph. Both the driver and the
>>> application MUST set the data_offset.
>>
>> I'm not sure where the contradiction is. Application must initialize the
>> data_offset and driver must set data_offset too, if it's appropriate to
>> do that for a particular driver.
>>
>>> Would it be possible to demo your idea by implementing this in a virtual driver
>>> ? vivid already have data_offset for capture in some cases, you could verify if
>>> your idea works without any conflict in this scenario.
>>
>> I actually considered implementing it in the vivid driver, but vivid
>> driver already sets the data_offset to fixed values [1], so I decided
>> that not to change it.
>>
>> But maybe we actually could extend the vivid driver by accepting
>> data_offset from userspace for the cases where the fixed offset value is
>> zero in the driver.. not sure.
> 
> The is the core of the issue, how do you represent both a driver use of
> data_offset and a userland provided data_offset at the same time. Contradiction
> might be the wrong term, but minimally there is a large gap in the specification
> for which I don't have an easy answer.

In the doc-comment I said: "Handling of application's offsets is
driver-dependent, application must use the resulting buffer offset."

This can be made stricter using a new
V4L2_BUF_FLAG_USE_CAPTURE_MPLANE_DATA_OFFSET, for example. But then it
doesn't feel good anymore to me because output buffers don't require a
special flag, hence it feels inconsistent.

There is another variant with adding new data_offset fields to the
v4l2_plane_pix_format and then will be possible to set data offsets
using VIDIOC_S_FMT for both S/MPLANEs, assuming that we won't need more
than 3 offsets for SPLANEs. But this is a much more invasive and
questionable UAPI extension.

--- >8 ---
@@ -2268,12 +2268,14 @@ struct v4l2_mpeg_vbi_fmt_ivtv {
  *                     this plane will be used
  * @bytesperline:      distance in bytes between the leftmost pixels in two
  *                     adjacent lines
+ * @data_offset                offset in bytes to the beginning of the
plane's data
  * @reserved:          drivers and applications must zero this array
  */
 struct v4l2_plane_pix_format {
        __u32           sizeimage;
        __u32           bytesperline;
-       __u16           reserved[6];
+       __u32           data_offset[3];
 } __attribute__ ((packed));
--- >8 ---

The root of the problem is that DRM UAPI is more flexible and allows to
customize offsets for both S/MPLANEs, while V4L doesn't allow to do it
at all. I'm exploring all the potential options, so far neither of the
proposed variants is ideal.
Dmitry Osipenko April 22, 2022, 10:50 p.m. UTC | #3
On 3/25/22 16:11, Dave Stevenson wrote:
> Hi Nicolas & Dmitry
> 
> On Fri, 25 Mar 2022 at 12:32, Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
>>
>> Le jeudi 24 mars 2022 à 21:20 +0300, Dmitry Osipenko a écrit :
>>> The root of the problem is that DRM UAPI is more flexible and allows to
>>> customize offsets for both S/MPLANEs, while V4L doesn't allow to do it
>>> at all. I'm exploring all the potential options, so far neither of the
>>> proposed variants is ideal.
>>
>> In GStreamer kmssink, the way DRM is used, is that if you have 2 planes in your
>> pixel format, but only received 1 DMABuf, we will pass this DMABuf twice (well
>> GEM handles, but twice), with appropriate offset.
>>
>> With this in mind, the idea for V4L2 could be to always resort to MPLANE for
>> this purpose. The tricky part for userland is that it needs to know the dual
>> pixel format and map that accordingly. That is a bit difficult and this is
>> something Helen was trying to address with the v4l2_buffer_ext (that and
>> allowing space to store DRM Modifiers in the future).
>>
>> The second challenge is the overhead. In DRM, as we "prime" the DMABuf into
>> handles, this gives a kernel object to store any relevant information about the
>> buffer. So having it duplicate can be done at no cost. In V4L2, the driver would
>> need to handle that more often. Specially that despite the recommendation
>> (except for primary buffer decoder, were this is mandatory), we don't force a
>> strict DMABuf / Buffer IDX  mapping.
> 
> To throw another use case of data offsets into the mix, I'm keeping a
> watching eye on implementing stereoscopic capture into libcamera where
> we want to present the same buffer to the ISP twice (once for each
> eye) with either a vertical or horizontal offset between the two
> passes.
> Adding a data_offset of either a half line or half the frame is the
> easiest way around that one, although it could potentially be
> accommodated through the selection API setting a compose region
> instead.

Hi Dave,

Thank you for the suggestion about the stereoscopic capture! If you'll
manage to test this patch and it will work for you, then please feel
free to give yours tested-by.

I'll need to do couple extra checks to ensure that we really need this
new feature for the Chromebooks, this will take time. But if you'll be
able to confirm that this variant of the patch works for yours case with
the stereo cams, then we may try to proceed using the current variant
right away.
Dmitry Osipenko June 29, 2022, 3:37 p.m. UTC | #4
On 5/19/22 13:22, Dmitry Osipenko wrote:
...
>> On 3/25/22 13:32, Nicolas Dufresne wrote:
>>> Le jeudi 24 mars 2022 à 21:20 +0300, Dmitry Osipenko a écrit :
>>>> The root of the problem is that DRM UAPI is more flexible and allows to
>>>> customize offsets for both S/MPLANEs, while V4L doesn't allow to do it
>>>> at all. I'm exploring all the potential options, so far neither of the
>>>> proposed variants is ideal.
>>>
>>> In GStreamer kmssink, the way DRM is used, is that if you have 2 planes in your
>>> pixel format, but only received 1 DMABuf, we will pass this DMABuf twice (well
>>> GEM handles, but twice), with appropriate offset.
>>>
>>> With this in mind, the idea for V4L2 could be to always resort to MPLANE for
>>> this purpose. The tricky part for userland is that it needs to know the dual
>>> pixel format and map that accordingly. That is a bit difficult and this is
>>> something Helen was trying to address with the v4l2_buffer_ext (that and
>>> allowing space to store DRM Modifiers in the future).
>>
>> FYI: here is Helen's last patch series. Since Helen is no longer active in
>> the media subsystem, someone else who is sufficiently motivated would have to
>> take over.
>>
>> https://patchwork.linuxtv.org/project/linux-media/cover/20210114180738.1758707-1-helen.koike@collabora.com/
>>
>> I'm not enthusiastic about messing with data_offset: it was - in hindsight - a
>> bad idea.
> 
> I'm aware of the Helen's work. To me the addition of the new IOCTLs that
> partially duplicate the older ones doesn't feel like the best approach.
> But since you're good with it, then I'll try to refresh the Helen's work
> for 5.20 and we'll see where it will go.
> 

Hello Hans,

I'm having hard time convincing myself that the addition of the new IOCTLs is the right approach for us. Helen was adding the new QBUF IOCTLs that supposed to:

1. Support DRM modifiers.
2. Support per-plane dma-buf offsets.
3. Support new features that "are good to have", like 64bit timestamp and unified multi/single planar formats. 

The idea of using DRM modifiers is obsoleted now. Mixing DRM with V4L will be a mess and nobody really needs that feature anymore because there are not that many compressed framebuffer formats needed by V4L drivers.

The dma-buf offsets is the main thing that is really wanted by multiple users and we can support it by extending the current UAPI, which should be the best option IMO. It will be simple to implement and doesn't require much changes form the existing applications.

If we will ever really need to introduce a brand new UAPI, then I think it should be covering more things like deprecating the creation of multiple buffers at once, making buffers independent and reusable, ability to destroy buffer and etc. This will be a lot of work that doesn't worth the effort today since everybody in userspace already adapted to the current state of V4L.

I typed another variant of the patch that extends the struct v4l2_plane with the new buffer offset field and curious what you're thinking about it. It works fine by replacing the data_offset with the new buf_offset in the Tegra decoder driver since it already supports the data_offset which was previously unused because initially I thought that it was about buffer offsets when implemented V4L support for the Tegra driver. If we will agree on this approach then RKISP and MTK should be the next drivers that may get support for the dma-buf offsets.

Or maybe you have good arguments for the addition of new IOCTLs? Other ideas?

Here is the patch in question:

--- >8 ---

Subject: [PATCH] media: videobuf2: Introduce new buf_offset field for dma-buf
 memory planes

Add new optional buf_offset field to struct v4l2_plane of the videdev2
UAPI and support it in the V4L2 core. This new field designates the offset
from the start of a memory plane's dma-buf.

The primary user of this new field are downstream software stacks that
transition to upstream kernel. ChromeOS is one of those stacks and Android
might be the other. The downstream kernel fork of ChromeOS re-uses the
data_offset field for the dma-buf offsets, but data_offset has a different
meaning, and thus, this approach isn't acceptable for the upstream. The
reason why it's needed by ChromeOS is the integration with GPU subsystem
that uses DRM. In some cases there is disagreement between V4L and DRM
drivers on the memory plane offsets, which is difficult to resolve without
overriding the offsets in V4L.

Since not all drivers will support the new buf_offset, the new capability
flag "supports_mem_plane_offset" is added to the struct vb2_queue. Drivers
supporting dma-buf offsets will have to explicitly enable the new capability.

The buf_offset field is meant to be application-specific because there is
no generic way for conveying the h/w requirements to userspace in V4L.

Note that only drivers can validate the dma-buf offsets properly because
of the different h/w alignment requirements. Hence we can't support the
dma-buf offsets handling in the V4L core and it needs to be done within
drivers in the buf_prepare() callback of the vb2_ops.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
---
 drivers/media/common/videobuf2/videobuf2-core.c |  2 ++
 drivers/media/common/videobuf2/videobuf2-v4l2.c | 10 ++++++++++
 include/media/videobuf2-core.h                  |  6 ++++++
 include/uapi/linux/videodev2.h                  |  5 ++++-
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index b203c1e26353..431fe1c8578c 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -1278,6 +1278,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
 		vb->planes[plane].bytesused = 0;
 		vb->planes[plane].length = 0;
 		vb->planes[plane].m.fd = 0;
+		vb->planes[plane].buf_offset = 0;
 		vb->planes[plane].data_offset = 0;
 
 		/* Acquire each plane's memory */
@@ -1323,6 +1324,7 @@ static int __prepare_dmabuf(struct vb2_buffer *vb)
 		vb->planes[plane].bytesused = planes[plane].bytesused;
 		vb->planes[plane].length = planes[plane].length;
 		vb->planes[plane].m.fd = planes[plane].m.fd;
+		vb->planes[plane].buf_offset = planes[plane].buf_offset;
 		vb->planes[plane].data_offset = planes[plane].data_offset;
 	}
 
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 075d24ebf44c..e79c8ec03d00 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -219,6 +219,13 @@ static int vb2_fill_vb2_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b
 					b->m.planes[plane].m.fd;
 				planes[plane].length =
 					b->m.planes[plane].length;
+				planes[plane].buf_offset =
+					b->m.planes[plane].buf_offset;
+
+				if (planes[plane].buf_offset && !q->supports_mem_plane_offset) {
+					dprintk(q, 1, "dma-buf offset unsupported by the queue\n");
+					return -EINVAL;
+				}
 			}
 			break;
 		default:
@@ -294,6 +301,7 @@ static int vb2_fill_vb2_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b
 			break;
 		}
 
+		planes[0].buf_offset = 0;
 		planes[0].data_offset = 0;
 		if (V4L2_TYPE_IS_OUTPUT(b->type)) {
 			if (b->bytesused == 0)
@@ -528,6 +536,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
 				pdst->m.userptr = psrc->m.userptr;
 			else if (q->memory == VB2_MEMORY_DMABUF)
 				pdst->m.fd = psrc->m.fd;
+			pdst->buf_offset = psrc->buf_offset;
 			pdst->data_offset = psrc->data_offset;
 			memset(pdst->reserved, 0, sizeof(pdst->reserved));
 		}
@@ -612,6 +621,7 @@ static int __fill_vb2_buffer(struct vb2_buffer *vb, struct vb2_plane *planes)
 			planes[plane].length = vbuf->planes[plane].length;
 		}
 		planes[plane].bytesused = vbuf->planes[plane].bytesused;
+		planes[plane].buf_offset = vbuf->planes[plane].buf_offset;
 		planes[plane].data_offset = vbuf->planes[plane].data_offset;
 	}
 	return 0;
diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
index 5468b633b9d2..195381837a41 100644
--- a/include/media/videobuf2-core.h
+++ b/include/media/videobuf2-core.h
@@ -171,6 +171,8 @@ struct vb2_mem_ops {
  *		descriptor associated with this plane.
  * @data_offset:	offset in the plane to the start of data; usually 0,
  *		unless there is a header in front of the data.
+ * @buf_offset:	offset in the memory buffer to the start of plane; always 0
+ *		for non-dmabuf buffers.
  *
  * Should contain enough information to be able to cover all the fields
  * of &struct v4l2_plane at videodev2.h.
@@ -188,6 +190,7 @@ struct vb2_plane {
 		int		fd;
 	} m;
 	unsigned int		data_offset;
+	unsigned int		buf_offset;
 };
 
 /**
@@ -506,6 +509,8 @@ struct vb2_buf_ops {
  *		->finish().
  * @non_coherent_mem: when set queue will attempt to allocate buffers using
  *		non-coherent memory.
+ * @supports_mem_plane_offset: when set, a non-zero buf_offset is allowed
+ *		for dma-buf memory planes.
  * @lock:	pointer to a mutex that protects the &struct vb2_queue. The
  *		driver can set this to a mutex to let the v4l2 core serialize
  *		the queuing ioctls. If the driver wants to handle locking
@@ -586,6 +591,7 @@ struct vb2_queue {
 	unsigned int			uses_requests:1;
 	unsigned int			allow_cache_hints:1;
 	unsigned int			non_coherent_mem:1;
+	unsigned int			supports_mem_plane_offset:1;
 
 	struct mutex			*lock;
 	void				*owner;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 5311ac4fde35..caf33fe0ae79 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -996,6 +996,8 @@ struct v4l2_requestbuffers {
  * @m:			union of @mem_offset, @userptr and @fd
  * @data_offset:	offset in the plane to the start of data; usually 0,
  *			unless there is a header in front of the data
+ * @buf_offset		offset in the memory buffer to the start of plane;
+ *			always 0 for non-dmabuf buffers
  * @reserved:		drivers and applications must zero this array
  *
  * Multi-planar buffers consist of one or more planes, e.g. an YCbCr buffer
@@ -1012,7 +1014,8 @@ struct v4l2_plane {
 		__s32		fd;
 	} m;
 	__u32			data_offset;
-	__u32			reserved[11];
+	__u32			buf_offset;
+	__u32			reserved[7];
 };
 
 /**
diff mbox series

Patch

diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
index 4638ec64db00..75b1929e2acb 100644
--- a/Documentation/userspace-api/media/v4l/buffer.rst
+++ b/Documentation/userspace-api/media/v4l/buffer.rst
@@ -369,13 +369,20 @@  struct v4l2_plane
       - ``data_offset``
       - Offset in bytes to video data in the plane. Drivers must set this
 	field when ``type`` refers to a capture stream, applications when
-	it refers to an output stream.
+	it refers to an output or capture stream.
 
 	.. note::
 
 	   That data_offset is included  in ``bytesused``. So the
 	   size of the image in the plane is ``bytesused``-``data_offset``
 	   at offset ``data_offset`` from the start of the plane.
+
+	   For capture planes data_offset may be specified by applications
+	   and by drivers. Driver may override application's offset or error
+	   out buffer if offset can't be satisfied by hardware. This allows
+	   applications to customize data offsets of imported dma-bufs.
+	   Handling of application's offsets is driver-dependent, application
+	   must use the resulting buffer offset.
     * - __u32
       - ``reserved[11]``
       - Reserved for future use. Should be zeroed by drivers and
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 6edf4508c636..929107a431cc 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -263,6 +263,13 @@  static int vb2_fill_vb2_v4l2_buffer(struct vb2_buffer *vb, struct v4l2_buffer *b
 						psrc->bytesused : pdst->length;
 				pdst->data_offset = psrc->data_offset;
 			}
+		} else {
+			for (plane = 0; plane < vb->num_planes; ++plane) {
+				struct vb2_plane *pdst = &planes[plane];
+				struct v4l2_plane *psrc = &b->m.planes[plane];
+
+				pdst->data_offset = psrc->data_offset;
+			}
 		}
 	} else {
 		/*