mbox series

[v6,00/18] Add DELETE_BUF ioctl

Message ID 20230901124414.48497-1-benjamin.gaignard@collabora.com
Headers show
Series Add DELETE_BUF ioctl | expand

Message

Benjamin Gaignard Sept. 1, 2023, 12:43 p.m. UTC
Unlike when resolution change on keyframes, dynamic resolution change
on inter frames doesn't allow to do a stream off/on sequence because
it is need to keep all previous references alive to decode inter frames.
This constraint have two main problems:
- more memory consumption.
- more buffers in use.
To solve these issue this series introduce DELETE_BUFS ioctl and remove
the 32 buffers limit per queue.

VP9 conformance tests using fluster give a score of 210/305.
The 24 resize inter tests (vp90-2-21-resize_inter_* files) are ok
but require to use postprocessor.

Kernel branch is available here:
https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v6

GStreamer branch to use DELETE_BUF ioctl and testing dynamic resolution
change is here:
https://gitlab.freedesktop.org/benjamin.gaignard1/gstreamer/-/commits/VP9_drc

changes in version 6:
- Get a patch per driver to use vb2_get_buffer() instead of directly access
  to queue buffers array.
- Add lock in vb2_core_delete_buf()
- Use vb2_buffer instead of index
- Fix various comments
- Change buffer index name to BUFFER_INDEX_MASK
- Stop spamming kernel log with unbalanced counters

changes in version 5:
- Rework offset cookie encoding pattern is n ow the first patch of the
  serie.
- Use static array instead of allocated one for postprocessor buffers.

changes in version 4:
- Stop using Xarray, instead let queues decide about their own maximum
  number of buffer and allocate bufs array given that value.
- Rework offset cookie encoding pattern.
- Change DELETE_BUF to DELETE_BUFS because it now usable for
  range of buffer to be symetrical of CREATE_BUFS.
- Add fixes tags on couple of Verisilicon related patches.
- Be smarter in Verisilicon postprocessor buffers management.
- Rebase on top of v6.4

changes in version 3:
- Use Xarray API to store allocated video buffers.
- No module parameter to limit the number of buffer per queue.
- Use Xarray inside Verisilicon driver to store postprocessor buffers
  and remove VB2_MAX_FRAME limit.
- Allow Versilicon driver to change of resolution while streaming
- Various fixes the Verisilicon VP9 code to improve fluster score.
 
changes in version 2:
- Use a dynamic array and not a list to keep trace of allocated buffers.
  Not use IDR interface because it is marked as deprecated in kernel
  documentation.
- Add a module parameter to limit the number of buffer per queue.
- Add DELETE_BUF ioctl and m2m helpers.

Regards,
Benjamin
 
Benjamin Gaignard (18):
  media: videobuf2: Rework offset 'cookie' encoding pattern
  media: videobuf2: Stop spamming kernel log with all queue counter
  media: videobuf2: Use vb2_buffer instead of index
  media: amphion: Use vb2_get_buffer() instead of directly access to
    buffers array
  media: mediatek: jpeg: Use vb2_get_buffer() instead of directly access
    to buffers array
  media: mediatek: vdec: Use vb2_get_buffer() instead of directly access
    to buffers array
  media: sti: hva: Use vb2_get_buffer() instead of directly access to
    buffers array
  media: visl: Use vb2_get_buffer() instead of directly access to
    buffers array
  media: atomisp: Use vb2_get_buffer() instead of directly access to
    buffers array
  media: videobuf2: Access vb2_queue bufs array through helper functions
  media: videobuf2: Be more flexible on the number of queue stored
    buffers
  media: verisilicon: Refactor postprocessor to store more buffers
  media: verisilicon: Store chroma and motion vectors offset
  media: verisilicon: vp9: Use destination buffer height to compute
    chroma offset
  media: verisilicon: postproc: Fix down scale test
  media: verisilicon: vp9: Allow to change resolution while streaming
  media: v4l2: Add DELETE_BUFS ioctl
  media: v4l2: Add mem2mem helpers for DELETE_BUFS ioctl

 .../userspace-api/media/v4l/user-func.rst     |   1 +
 .../media/v4l/vidioc-delete-bufs.rst          |  73 ++++
 .../media/common/videobuf2/videobuf2-core.c   | 379 ++++++++++++------
 .../media/common/videobuf2/videobuf2-v4l2.c   |  99 ++++-
 drivers/media/dvb-core/dvb_vb2.c              |   6 +-
 drivers/media/platform/amphion/vpu_dbg.c      |  22 +-
 .../platform/mediatek/jpeg/mtk_jpeg_core.c    |   6 +-
 .../vcodec/decoder/vdec/vdec_vp9_req_lat_if.c |   2 +-
 drivers/media/platform/st/sti/hva/hva-v4l2.c  |   4 +
 drivers/media/platform/verisilicon/hantro.h   |   9 +-
 .../media/platform/verisilicon/hantro_drv.c   |   4 +-
 .../platform/verisilicon/hantro_g2_vp9_dec.c  |  10 +-
 .../media/platform/verisilicon/hantro_hw.h    |   4 +-
 .../platform/verisilicon/hantro_postproc.c    |  95 ++++-
 .../media/platform/verisilicon/hantro_v4l2.c  |  27 +-
 drivers/media/test-drivers/vim2m.c            |   1 +
 drivers/media/test-drivers/visl/visl-dec.c    |  28 +-
 drivers/media/v4l2-core/v4l2-dev.c            |   1 +
 drivers/media/v4l2-core/v4l2-ioctl.c          |  17 +
 drivers/media/v4l2-core/v4l2-mem2mem.c        |  20 +
 .../staging/media/atomisp/pci/atomisp_ioctl.c |   2 +-
 include/media/v4l2-ioctl.h                    |   4 +
 include/media/v4l2-mem2mem.h                  |  12 +
 include/media/videobuf2-core.h                |  29 +-
 include/media/videobuf2-v4l2.h                |  11 +
 include/uapi/linux/videodev2.h                |  16 +
 26 files changed, 664 insertions(+), 218 deletions(-)
 create mode 100644 Documentation/userspace-api/media/v4l/vidioc-delete-bufs.rst

Comments

Hans Verkuil Sept. 4, 2023, 11:24 a.m. UTC | #1
Hi Benjamin,

On 01/09/2023 14:44, Benjamin Gaignard wrote:
> Add 'max_allowed_buffers' field in vb2_queue struct to let drivers decide
> how many buffers could be stored in a queue.
> This request 'bufs' array to be allocated at queue init time and freed
> when releasing the queue.
> By default VB2_MAX_FRAME remains the limit.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
>  .../media/common/videobuf2/videobuf2-core.c   | 25 +++++++++++++------
>  include/media/videobuf2-core.h                |  4 ++-
>  2 files changed, 20 insertions(+), 9 deletions(-)
> 

This patch breaks v4l2-compliance. I see lots of issues when running the
test-media script in v4l-utils, contrib/test, among them memory leaks
and use-after-free.

I actually tested using virtme with the build scripts, but that in turn
calls the test-media script in a qemu environment, and it is at the moment
a bit tricky to set up, unless you run a debian 12 distro.

I will email the test logs directly to you since they are a bit large (>5000 lines).

Regards,

	Hans



> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index 15b583ce0c69..dc7f6b59d237 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -411,7 +411,7 @@ static void init_buffer_cache_hints(struct vb2_queue *q, struct vb2_buffer *vb)
>   */
>  static bool vb2_queue_add_buffer(struct vb2_queue *q, struct vb2_buffer *vb, unsigned int index)
>  {
> -	if (index < VB2_MAX_FRAME && !q->bufs[index]) {
> +	if (index < q->max_allowed_buffers && !q->bufs[index]) {
>  		q->bufs[index] = vb;
>  		vb->index = index;
>  		vb->vb2_queue = q;
> @@ -428,7 +428,7 @@ static bool vb2_queue_add_buffer(struct vb2_queue *q, struct vb2_buffer *vb, uns
>   */
>  static void vb2_queue_remove_buffer(struct vb2_queue *q, struct vb2_buffer *vb)
>  {
> -	if (vb->index < VB2_MAX_FRAME) {
> +	if (vb->index < q->max_allowed_buffers) {
>  		q->bufs[vb->index] = NULL;
>  		vb->vb2_queue = NULL;
>  	}
> @@ -449,9 +449,9 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>  	struct vb2_buffer *vb;
>  	int ret;
>  
> -	/* Ensure that q->num_buffers+num_buffers is below VB2_MAX_FRAME */
> +	/* Ensure that q->num_buffers+num_buffers is below q->max_allowed_buffers */
>  	num_buffers = min_t(unsigned int, num_buffers,
> -			    VB2_MAX_FRAME - q->num_buffers);
> +			    q->max_allowed_buffers - q->num_buffers);
>  
>  	for (buffer = 0; buffer < num_buffers; ++buffer) {
>  		/* Allocate vb2 buffer structures */
> @@ -862,9 +862,9 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>  	/*
>  	 * Make sure the requested values and current defaults are sane.
>  	 */
> -	WARN_ON(q->min_buffers_needed > VB2_MAX_FRAME);
> +	WARN_ON(q->min_buffers_needed > q->max_allowed_buffers);
>  	num_buffers = max_t(unsigned int, *count, q->min_buffers_needed);
> -	num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
> +	num_buffers = min_t(unsigned int, num_buffers, q->max_allowed_buffers);
>  	memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
>  	/*
>  	 * Set this now to ensure that drivers see the correct q->memory value
> @@ -980,7 +980,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>  	bool no_previous_buffers = !q->num_buffers;
>  	int ret;
>  
> -	if (q->num_buffers == VB2_MAX_FRAME) {
> +	if (q->num_buffers == q->max_allowed_buffers) {
>  		dprintk(q, 1, "maximum number of buffers already allocated\n");
>  		return -ENOBUFS;
>  	}
> @@ -1009,7 +1009,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>  			return -EINVAL;
>  	}
>  
> -	num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
> +	num_buffers = min(*count, q->max_allowed_buffers - q->num_buffers);
>  
>  	if (requested_planes && requested_sizes) {
>  		num_planes = requested_planes;
> @@ -2519,6 +2519,14 @@ int vb2_core_queue_init(struct vb2_queue *q)
>  
>  	q->memory = VB2_MEMORY_UNKNOWN;
>  
> +	if (!q->max_allowed_buffers)
> +		q->max_allowed_buffers = VB2_MAX_FRAME;
> +
> +	/* The maximum is limited by offset cookie encoding pattern */
> +	q->max_allowed_buffers = min_t(unsigned int, q->max_allowed_buffers, BUFFER_INDEX_MASK);
> +
> +	q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
> +
>  	if (q->buf_struct_size == 0)
>  		q->buf_struct_size = sizeof(struct vb2_buffer);
>  
> @@ -2543,6 +2551,7 @@ void vb2_core_queue_release(struct vb2_queue *q)
>  	__vb2_queue_cancel(q);
>  	mutex_lock(&q->mmap_lock);
>  	__vb2_queue_free(q, q->num_buffers);
> +	kfree(q->bufs);
>  	mutex_unlock(&q->mmap_lock);
>  }
>  EXPORT_SYMBOL_GPL(vb2_core_queue_release);
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index cd3ff1cd759d..97153c69583f 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -558,6 +558,7 @@ struct vb2_buf_ops {
>   * @dma_dir:	DMA mapping direction.
>   * @bufs:	videobuf2 buffer structures
>   * @num_buffers: number of allocated/used buffers
> + * @max_allowed_buffers: upper limit of number of allocated/used buffers
>   * @queued_list: list of buffers currently queued from userspace
>   * @queued_count: number of buffers queued and ready for streaming.
>   * @owned_by_drv_count: number of buffers owned by the driver
> @@ -619,8 +620,9 @@ struct vb2_queue {
>  	struct mutex			mmap_lock;
>  	unsigned int			memory;
>  	enum dma_data_direction		dma_dir;
> -	struct vb2_buffer		*bufs[VB2_MAX_FRAME];
> +	struct vb2_buffer		**bufs;
>  	unsigned int			num_buffers;
> +	unsigned int			max_allowed_buffers;
>  
>  	struct list_head		queued_list;
>  	unsigned int			queued_count;
Benjamin Gaignard Sept. 4, 2023, 11:46 a.m. UTC | #2
Le 04/09/2023 à 13:24, Hans Verkuil a écrit :
> Hi Benjamin,
>
> On 01/09/2023 14:44, Benjamin Gaignard wrote:
>> Add 'max_allowed_buffers' field in vb2_queue struct to let drivers decide
>> how many buffers could be stored in a queue.
>> This request 'bufs' array to be allocated at queue init time and freed
>> when releasing the queue.
>> By default VB2_MAX_FRAME remains the limit.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>>   .../media/common/videobuf2/videobuf2-core.c   | 25 +++++++++++++------
>>   include/media/videobuf2-core.h                |  4 ++-
>>   2 files changed, 20 insertions(+), 9 deletions(-)
>>
> This patch breaks v4l2-compliance. I see lots of issues when running the
> test-media script in v4l-utils, contrib/test, among them memory leaks
> and use-after-free.
>
> I actually tested using virtme with the build scripts, but that in turn
> calls the test-media script in a qemu environment, and it is at the moment
> a bit tricky to set up, unless you run a debian 12 distro.
>
> I will email the test logs directly to you since they are a bit large (>5000 lines).

I will try to reproduce this error on my side to fix it.

Regards,
Benjamin

>
> Regards,
>
> 	Hans
>
>
>
>> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
>> index 15b583ce0c69..dc7f6b59d237 100644
>> --- a/drivers/media/common/videobuf2/videobuf2-core.c
>> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
>> @@ -411,7 +411,7 @@ static void init_buffer_cache_hints(struct vb2_queue *q, struct vb2_buffer *vb)
>>    */
>>   static bool vb2_queue_add_buffer(struct vb2_queue *q, struct vb2_buffer *vb, unsigned int index)
>>   {
>> -	if (index < VB2_MAX_FRAME && !q->bufs[index]) {
>> +	if (index < q->max_allowed_buffers && !q->bufs[index]) {
>>   		q->bufs[index] = vb;
>>   		vb->index = index;
>>   		vb->vb2_queue = q;
>> @@ -428,7 +428,7 @@ static bool vb2_queue_add_buffer(struct vb2_queue *q, struct vb2_buffer *vb, uns
>>    */
>>   static void vb2_queue_remove_buffer(struct vb2_queue *q, struct vb2_buffer *vb)
>>   {
>> -	if (vb->index < VB2_MAX_FRAME) {
>> +	if (vb->index < q->max_allowed_buffers) {
>>   		q->bufs[vb->index] = NULL;
>>   		vb->vb2_queue = NULL;
>>   	}
>> @@ -449,9 +449,9 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>>   	struct vb2_buffer *vb;
>>   	int ret;
>>   
>> -	/* Ensure that q->num_buffers+num_buffers is below VB2_MAX_FRAME */
>> +	/* Ensure that q->num_buffers+num_buffers is below q->max_allowed_buffers */
>>   	num_buffers = min_t(unsigned int, num_buffers,
>> -			    VB2_MAX_FRAME - q->num_buffers);
>> +			    q->max_allowed_buffers - q->num_buffers);
>>   
>>   	for (buffer = 0; buffer < num_buffers; ++buffer) {
>>   		/* Allocate vb2 buffer structures */
>> @@ -862,9 +862,9 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>>   	/*
>>   	 * Make sure the requested values and current defaults are sane.
>>   	 */
>> -	WARN_ON(q->min_buffers_needed > VB2_MAX_FRAME);
>> +	WARN_ON(q->min_buffers_needed > q->max_allowed_buffers);
>>   	num_buffers = max_t(unsigned int, *count, q->min_buffers_needed);
>> -	num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
>> +	num_buffers = min_t(unsigned int, num_buffers, q->max_allowed_buffers);
>>   	memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
>>   	/*
>>   	 * Set this now to ensure that drivers see the correct q->memory value
>> @@ -980,7 +980,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>>   	bool no_previous_buffers = !q->num_buffers;
>>   	int ret;
>>   
>> -	if (q->num_buffers == VB2_MAX_FRAME) {
>> +	if (q->num_buffers == q->max_allowed_buffers) {
>>   		dprintk(q, 1, "maximum number of buffers already allocated\n");
>>   		return -ENOBUFS;
>>   	}
>> @@ -1009,7 +1009,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>>   			return -EINVAL;
>>   	}
>>   
>> -	num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
>> +	num_buffers = min(*count, q->max_allowed_buffers - q->num_buffers);
>>   
>>   	if (requested_planes && requested_sizes) {
>>   		num_planes = requested_planes;
>> @@ -2519,6 +2519,14 @@ int vb2_core_queue_init(struct vb2_queue *q)
>>   
>>   	q->memory = VB2_MEMORY_UNKNOWN;
>>   
>> +	if (!q->max_allowed_buffers)
>> +		q->max_allowed_buffers = VB2_MAX_FRAME;
>> +
>> +	/* The maximum is limited by offset cookie encoding pattern */
>> +	q->max_allowed_buffers = min_t(unsigned int, q->max_allowed_buffers, BUFFER_INDEX_MASK);
>> +
>> +	q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
>> +
>>   	if (q->buf_struct_size == 0)
>>   		q->buf_struct_size = sizeof(struct vb2_buffer);
>>   
>> @@ -2543,6 +2551,7 @@ void vb2_core_queue_release(struct vb2_queue *q)
>>   	__vb2_queue_cancel(q);
>>   	mutex_lock(&q->mmap_lock);
>>   	__vb2_queue_free(q, q->num_buffers);
>> +	kfree(q->bufs);
>>   	mutex_unlock(&q->mmap_lock);
>>   }
>>   EXPORT_SYMBOL_GPL(vb2_core_queue_release);
>> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
>> index cd3ff1cd759d..97153c69583f 100644
>> --- a/include/media/videobuf2-core.h
>> +++ b/include/media/videobuf2-core.h
>> @@ -558,6 +558,7 @@ struct vb2_buf_ops {
>>    * @dma_dir:	DMA mapping direction.
>>    * @bufs:	videobuf2 buffer structures
>>    * @num_buffers: number of allocated/used buffers
>> + * @max_allowed_buffers: upper limit of number of allocated/used buffers
>>    * @queued_list: list of buffers currently queued from userspace
>>    * @queued_count: number of buffers queued and ready for streaming.
>>    * @owned_by_drv_count: number of buffers owned by the driver
>> @@ -619,8 +620,9 @@ struct vb2_queue {
>>   	struct mutex			mmap_lock;
>>   	unsigned int			memory;
>>   	enum dma_data_direction		dma_dir;
>> -	struct vb2_buffer		*bufs[VB2_MAX_FRAME];
>> +	struct vb2_buffer		**bufs;
>>   	unsigned int			num_buffers;
>> +	unsigned int			max_allowed_buffers;
>>   
>>   	struct list_head		queued_list;
>>   	unsigned int			queued_count;
>
Hans Verkuil Sept. 4, 2023, 2:09 p.m. UTC | #3
Hi Benjamin,

This patch can be folded into 11/18 to make it work properly.

vb2_core_queue_release() is also called when the file handle of the video device is
closed and it is also the owner of the currently allocated buffers. This will free
q->bufs, but queue_init isn't called a second time for non-m2m devices. So move
the allocation of q->bufs from queue_init to reqbufs/create_buffers.

And when releasing the file handle we also check if there is no owner at all:
in that case vb2_queue_release() must still be called to free q->bufs.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
---
 drivers/media/common/videobuf2/videobuf2-core.c | 15 +++++++++++++--
 drivers/media/common/videobuf2/videobuf2-v4l2.c |  4 ++--
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index dc7f6b59d237..202d7c80ffd2 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -859,6 +859,12 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 			return 0;
 	}

+	if (!q->bufs) {
+		q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
+		if (!q->bufs)
+			return -ENOMEM;
+	}
+
 	/*
 	 * Make sure the requested values and current defaults are sane.
 	 */
@@ -985,6 +991,12 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
 		return -ENOBUFS;
 	}

+	if (!q->bufs) {
+		q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
+		if (!q->bufs)
+			return -ENOMEM;
+	}
+
 	if (no_previous_buffers) {
 		if (q->waiting_in_dqbuf && *count) {
 			dprintk(q, 1, "another dup()ped fd is waiting for a buffer\n");
@@ -2525,8 +2537,6 @@ int vb2_core_queue_init(struct vb2_queue *q)
 	/* The maximum is limited by offset cookie encoding pattern */
 	q->max_allowed_buffers = min_t(unsigned int, q->max_allowed_buffers, BUFFER_INDEX_MASK);

-	q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
-
 	if (q->buf_struct_size == 0)
 		q->buf_struct_size = sizeof(struct vb2_buffer);

@@ -2552,6 +2562,7 @@ void vb2_core_queue_release(struct vb2_queue *q)
 	mutex_lock(&q->mmap_lock);
 	__vb2_queue_free(q, q->num_buffers);
 	kfree(q->bufs);
+	q->bufs = NULL;
 	mutex_unlock(&q->mmap_lock);
 }
 EXPORT_SYMBOL_GPL(vb2_core_queue_release);
diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index 8ba658ad9891..104fc5c4f574 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -1148,7 +1148,7 @@ int _vb2_fop_release(struct file *file, struct mutex *lock)

 	if (lock)
 		mutex_lock(lock);
-	if (file->private_data == vdev->queue->owner) {
+	if (!vdev->queue->owner || file->private_data == vdev->queue->owner) {
 		vb2_queue_release(vdev->queue);
 		vdev->queue->owner = NULL;
 	}
@@ -1276,7 +1276,7 @@ void vb2_video_unregister_device(struct video_device *vdev)
 	 */
 	get_device(&vdev->dev);
 	video_unregister_device(vdev);
-	if (vdev->queue && vdev->queue->owner) {
+	if (vdev->queue) {
 		struct mutex *lock = vdev->queue->lock ?
 			vdev->queue->lock : vdev->lock;
Hans Verkuil Sept. 4, 2023, 3:05 p.m. UTC | #4
A follow-up to my follow-up...

I realized that it is wise to do the allocation with mmap_lock held, since that is also
held when freeing q->bufs.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
---
 .../media/common/videobuf2/videobuf2-core.c   | 28 +++++++++----------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
index fe15e583b52a..dd25937c6dc8 100644
--- a/drivers/media/common/videobuf2/videobuf2-core.c
+++ b/drivers/media/common/videobuf2/videobuf2-core.c
@@ -818,7 +818,7 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 	unsigned plane_sizes[VB2_MAX_PLANES] = { };
 	bool non_coherent_mem = flags & V4L2_MEMORY_FLAG_NON_COHERENT;
 	unsigned int i;
-	int ret;
+	int ret = 0;

 	if (q->streaming) {
 		dprintk(q, 1, "streaming active\n");
@@ -859,12 +859,6 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 			return 0;
 	}

-	if (!q->bufs) {
-		q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
-		if (!q->bufs)
-			return -ENOMEM;
-	}
-
 	/*
 	 * Make sure the requested values and current defaults are sane.
 	 */
@@ -877,8 +871,14 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
 	 * in the queue_setup op.
 	 */
 	mutex_lock(&q->mmap_lock);
+	if (!q->bufs)
+		q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
+	if (!q->bufs)
+		ret = -ENOMEM;
 	q->memory = memory;
 	mutex_unlock(&q->mmap_lock);
+	if (ret)
+		return ret;
 	set_queue_coherency(q, non_coherent_mem);

 	/*
@@ -984,19 +984,13 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
 	unsigned plane_sizes[VB2_MAX_PLANES] = { };
 	bool non_coherent_mem = flags & V4L2_MEMORY_FLAG_NON_COHERENT;
 	bool no_previous_buffers = !q->num_buffers;
-	int ret;
+	int ret = 0;

 	if (q->num_buffers == q->max_allowed_buffers) {
 		dprintk(q, 1, "maximum number of buffers already allocated\n");
 		return -ENOBUFS;
 	}

-	if (!q->bufs) {
-		q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
-		if (!q->bufs)
-			return -ENOMEM;
-	}
-
 	if (no_previous_buffers) {
 		if (q->waiting_in_dqbuf && *count) {
 			dprintk(q, 1, "another dup()ped fd is waiting for a buffer\n");
@@ -1009,7 +1003,13 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
 		 */
 		mutex_lock(&q->mmap_lock);
 		q->memory = memory;
+		if (!q->bufs)
+			q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
+		if (!q->bufs)
+			ret = -ENOMEM;
 		mutex_unlock(&q->mmap_lock);
+		if (ret)
+			return ret;
 		q->waiting_for_buffers = !q->is_output;
 		set_queue_coherency(q, non_coherent_mem);
 	} else {
Hans Verkuil Sept. 4, 2023, 3:17 p.m. UTC | #5
On 01/09/2023 14:43, Benjamin Gaignard wrote:
> Only report unbalanced queue counters do avoid spamming kernel log
> with useless information.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
>  .../media/common/videobuf2/videobuf2-core.c   | 69 +++++++++++--------
>  1 file changed, 41 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index cf3b9f5b69b7..85e561e46899 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -537,16 +537,18 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
>  				  q->cnt_prepare_streaming != q->cnt_unprepare_streaming ||
>  				  q->cnt_wait_prepare != q->cnt_wait_finish;
>  
> -		if (unbalanced || debug) {
> -			pr_info("counters for queue %p:%s\n", q,
> -				unbalanced ? " UNBALANCED!" : "");
> -			pr_info("     setup: %u start_streaming: %u stop_streaming: %u\n",
> -				q->cnt_queue_setup, q->cnt_start_streaming,
> -				q->cnt_stop_streaming);
> -			pr_info("     prepare_streaming: %u unprepare_streaming: %u\n",
> -				q->cnt_prepare_streaming, q->cnt_unprepare_streaming);
> -			pr_info("     wait_prepare: %u wait_finish: %u\n",
> -				q->cnt_wait_prepare, q->cnt_wait_finish);
> +		if (unbalanced) {
> +			pr_info("unbalanced counters for queue %p\n", q);

End the pr_info with ':' (i.e. "unbalanced counters for queue %p:\n")

> +			if (q->cnt_start_streaming != q->cnt_stop_streaming)
> +				pr_info("     setup: %u start_streaming: %u stop_streaming: %u\n",
> +					q->cnt_queue_setup, q->cnt_start_streaming,
> +					q->cnt_stop_streaming);
> +			if (q->cnt_prepare_streaming != q->cnt_unprepare_streaming)
> +				pr_info("     prepare_streaming: %u unprepare_streaming: %u\n",
> +					q->cnt_prepare_streaming, q->cnt_unprepare_streaming);
> +			if (q->cnt_wait_prepare != q->cnt_wait_finish)
> +				pr_info("     wait_prepare: %u wait_finish: %u\n",
> +					q->cnt_wait_prepare, q->cnt_wait_finish);
>  		}
>  		q->cnt_queue_setup = 0;
>  		q->cnt_wait_prepare = 0;
> @@ -567,24 +569,35 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
>  				  vb->cnt_buf_prepare != vb->cnt_buf_finish ||
>  				  vb->cnt_buf_init != vb->cnt_buf_cleanup;
>  
> -		if (unbalanced || debug) {
> -			pr_info("   counters for queue %p, buffer %d:%s\n",
> -				q, buffer, unbalanced ? " UNBALANCED!" : "");
> -			pr_info("     buf_init: %u buf_cleanup: %u buf_prepare: %u buf_finish: %u\n",
> -				vb->cnt_buf_init, vb->cnt_buf_cleanup,
> -				vb->cnt_buf_prepare, vb->cnt_buf_finish);
> -			pr_info("     buf_out_validate: %u buf_queue: %u buf_done: %u buf_request_complete: %u\n",
> -				vb->cnt_buf_out_validate, vb->cnt_buf_queue,
> -				vb->cnt_buf_done, vb->cnt_buf_request_complete);
> -			pr_info("     alloc: %u put: %u prepare: %u finish: %u mmap: %u\n",
> -				vb->cnt_mem_alloc, vb->cnt_mem_put,
> -				vb->cnt_mem_prepare, vb->cnt_mem_finish,
> -				vb->cnt_mem_mmap);
> -			pr_info("     get_userptr: %u put_userptr: %u\n",
> -				vb->cnt_mem_get_userptr, vb->cnt_mem_put_userptr);
> -			pr_info("     attach_dmabuf: %u detach_dmabuf: %u map_dmabuf: %u unmap_dmabuf: %u\n",
> -				vb->cnt_mem_attach_dmabuf, vb->cnt_mem_detach_dmabuf,
> -				vb->cnt_mem_map_dmabuf, vb->cnt_mem_unmap_dmabuf);
> +		if (unbalanced) {
> +			pr_info("unbalanced counters for queue %p, buffer %d\n",

End with : here as well.

> +				q, buffer);
> +			if (vb->cnt_buf_init != vb->cnt_buf_cleanup)
> +				pr_info("     buf_init: %u buf_cleanup: %u\n",
> +					vb->cnt_buf_init, vb->cnt_buf_cleanup);
> +			if (vb->cnt_buf_prepare != vb->cnt_buf_finish)
> +				pr_info("     buf_prepare: %u buf_finish: %u\n",
> +					vb->cnt_buf_prepare, vb->cnt_buf_finish);
> +			if (vb->cnt_buf_queue != vb->cnt_buf_done)
> +				pr_info("     buf_out_validate: %u buf_queue: %u buf_done: %u buf_request_complete: %u\n",
> +					vb->cnt_buf_out_validate, vb->cnt_buf_queue,
> +					vb->cnt_buf_done, vb->cnt_buf_request_complete);
> +			if (vb->cnt_mem_alloc != vb->cnt_mem_put)
> +				pr_info("     alloc: %u put: %u\n",
> +					vb->cnt_mem_alloc, vb->cnt_mem_put);
> +			if (vb->cnt_mem_prepare != vb->cnt_mem_finish)
> +				pr_info("     prepare: %u finish: %u\n",
> +					vb->cnt_mem_prepare, vb->cnt_mem_finish);
> +			pr_info("     mmap: %u\n", vb->cnt_mem_mmap);

Drop this, and also drop the cnt_mem_mmap field. I don't think this is
interesting.

> +			if (vb->cnt_mem_get_userptr != vb->cnt_mem_put_userptr)
> +				pr_info("     get_userptr: %u put_userptr: %u\n",
> +					vb->cnt_mem_get_userptr, vb->cnt_mem_put_userptr);
> +			if (vb->cnt_mem_attach_dmabuf != vb->cnt_mem_detach_dmabuf)
> +				pr_info("     attach_dmabuf: %u detach_dmabuf: %u\n",
> +					vb->cnt_mem_attach_dmabuf, vb->cnt_mem_detach_dmabuf);
> +			if (vb->cnt_mem_map_dmabuf != vb->cnt_mem_unmap_dmabuf)
> +				pr_info("     map_dmabuf: %u unmap_dmabuf: %u\n",
> +					vb->cnt_mem_map_dmabuf, vb->cnt_mem_unmap_dmabuf);
>  			pr_info("     get_dmabuf: %u num_users: %u vaddr: %u cookie: %u\n",
>  				vb->cnt_mem_get_dmabuf,
>  				vb->cnt_mem_num_users,

Same for cnt_mem_vaddr and cnt_mem_cookie. It's not interesting. But let's keep get_dmabuf and num_users
for now.

Regards,

	Hans
Hans Verkuil Sept. 4, 2023, 3:19 p.m. UTC | #6
On 01/09/2023 14:44, Benjamin Gaignard wrote:
> Add 'max_allowed_buffers' field in vb2_queue struct to let drivers decide
> how many buffers could be stored in a queue.
> This request 'bufs' array to be allocated at queue init time and freed
> when releasing the queue.
> By default VB2_MAX_FRAME remains the limit.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
>  .../media/common/videobuf2/videobuf2-core.c   | 25 +++++++++++++------
>  include/media/videobuf2-core.h                |  4 ++-
>  2 files changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/media/common/videobuf2/videobuf2-core.c b/drivers/media/common/videobuf2/videobuf2-core.c
> index 15b583ce0c69..dc7f6b59d237 100644
> --- a/drivers/media/common/videobuf2/videobuf2-core.c
> +++ b/drivers/media/common/videobuf2/videobuf2-core.c
> @@ -411,7 +411,7 @@ static void init_buffer_cache_hints(struct vb2_queue *q, struct vb2_buffer *vb)
>   */
>  static bool vb2_queue_add_buffer(struct vb2_queue *q, struct vb2_buffer *vb, unsigned int index)
>  {
> -	if (index < VB2_MAX_FRAME && !q->bufs[index]) {
> +	if (index < q->max_allowed_buffers && !q->bufs[index]) {
>  		q->bufs[index] = vb;
>  		vb->index = index;
>  		vb->vb2_queue = q;
> @@ -428,7 +428,7 @@ static bool vb2_queue_add_buffer(struct vb2_queue *q, struct vb2_buffer *vb, uns
>   */
>  static void vb2_queue_remove_buffer(struct vb2_queue *q, struct vb2_buffer *vb)
>  {
> -	if (vb->index < VB2_MAX_FRAME) {
> +	if (vb->index < q->max_allowed_buffers) {
>  		q->bufs[vb->index] = NULL;
>  		vb->vb2_queue = NULL;
>  	}
> @@ -449,9 +449,9 @@ static int __vb2_queue_alloc(struct vb2_queue *q, enum vb2_memory memory,
>  	struct vb2_buffer *vb;
>  	int ret;
>  
> -	/* Ensure that q->num_buffers+num_buffers is below VB2_MAX_FRAME */
> +	/* Ensure that q->num_buffers+num_buffers is below q->max_allowed_buffers */
>  	num_buffers = min_t(unsigned int, num_buffers,
> -			    VB2_MAX_FRAME - q->num_buffers);
> +			    q->max_allowed_buffers - q->num_buffers);
>  
>  	for (buffer = 0; buffer < num_buffers; ++buffer) {
>  		/* Allocate vb2 buffer structures */
> @@ -862,9 +862,9 @@ int vb2_core_reqbufs(struct vb2_queue *q, enum vb2_memory memory,
>  	/*
>  	 * Make sure the requested values and current defaults are sane.
>  	 */
> -	WARN_ON(q->min_buffers_needed > VB2_MAX_FRAME);
> +	WARN_ON(q->min_buffers_needed > q->max_allowed_buffers);
>  	num_buffers = max_t(unsigned int, *count, q->min_buffers_needed);
> -	num_buffers = min_t(unsigned int, num_buffers, VB2_MAX_FRAME);
> +	num_buffers = min_t(unsigned int, num_buffers, q->max_allowed_buffers);
>  	memset(q->alloc_devs, 0, sizeof(q->alloc_devs));
>  	/*
>  	 * Set this now to ensure that drivers see the correct q->memory value
> @@ -980,7 +980,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>  	bool no_previous_buffers = !q->num_buffers;
>  	int ret;
>  
> -	if (q->num_buffers == VB2_MAX_FRAME) {
> +	if (q->num_buffers == q->max_allowed_buffers) {
>  		dprintk(q, 1, "maximum number of buffers already allocated\n");
>  		return -ENOBUFS;
>  	}
> @@ -1009,7 +1009,7 @@ int vb2_core_create_bufs(struct vb2_queue *q, enum vb2_memory memory,
>  			return -EINVAL;
>  	}
>  
> -	num_buffers = min(*count, VB2_MAX_FRAME - q->num_buffers);
> +	num_buffers = min(*count, q->max_allowed_buffers - q->num_buffers);
>  
>  	if (requested_planes && requested_sizes) {
>  		num_planes = requested_planes;
> @@ -2519,6 +2519,14 @@ int vb2_core_queue_init(struct vb2_queue *q)
>  
>  	q->memory = VB2_MEMORY_UNKNOWN;
>  
> +	if (!q->max_allowed_buffers)
> +		q->max_allowed_buffers = VB2_MAX_FRAME;
> +
> +	/* The maximum is limited by offset cookie encoding pattern */
> +	q->max_allowed_buffers = min_t(unsigned int, q->max_allowed_buffers, BUFFER_INDEX_MASK);

I think this should be 'BUFFER_INDEX_MASK + 1'.

> +
> +	q->bufs = kcalloc(q->max_allowed_buffers, sizeof(*q->bufs), GFP_KERNEL);
> +
>  	if (q->buf_struct_size == 0)
>  		q->buf_struct_size = sizeof(struct vb2_buffer);
>  
> @@ -2543,6 +2551,7 @@ void vb2_core_queue_release(struct vb2_queue *q)
>  	__vb2_queue_cancel(q);
>  	mutex_lock(&q->mmap_lock);
>  	__vb2_queue_free(q, q->num_buffers);
> +	kfree(q->bufs);
>  	mutex_unlock(&q->mmap_lock);
>  }
>  EXPORT_SYMBOL_GPL(vb2_core_queue_release);
> diff --git a/include/media/videobuf2-core.h b/include/media/videobuf2-core.h
> index cd3ff1cd759d..97153c69583f 100644
> --- a/include/media/videobuf2-core.h
> +++ b/include/media/videobuf2-core.h
> @@ -558,6 +558,7 @@ struct vb2_buf_ops {
>   * @dma_dir:	DMA mapping direction.
>   * @bufs:	videobuf2 buffer structures
>   * @num_buffers: number of allocated/used buffers
> + * @max_allowed_buffers: upper limit of number of allocated/used buffers
>   * @queued_list: list of buffers currently queued from userspace
>   * @queued_count: number of buffers queued and ready for streaming.
>   * @owned_by_drv_count: number of buffers owned by the driver
> @@ -619,8 +620,9 @@ struct vb2_queue {
>  	struct mutex			mmap_lock;
>  	unsigned int			memory;
>  	enum dma_data_direction		dma_dir;
> -	struct vb2_buffer		*bufs[VB2_MAX_FRAME];
> +	struct vb2_buffer		**bufs;
>  	unsigned int			num_buffers;
> +	unsigned int			max_allowed_buffers;
>  
>  	struct list_head		queued_list;
>  	unsigned int			queued_count;

Regards,

	Hans
Hans Verkuil Sept. 4, 2023, 3:23 p.m. UTC | #7
On 01/09/2023 14:43, Benjamin Gaignard wrote:
> Unlike when resolution change on keyframes, dynamic resolution change
> on inter frames doesn't allow to do a stream off/on sequence because
> it is need to keep all previous references alive to decode inter frames.
> This constraint have two main problems:
> - more memory consumption.
> - more buffers in use.
> To solve these issue this series introduce DELETE_BUFS ioctl and remove
> the 32 buffers limit per queue.
> 
> VP9 conformance tests using fluster give a score of 210/305.
> The 24 resize inter tests (vp90-2-21-resize_inter_* files) are ok
> but require to use postprocessor.
> 
> Kernel branch is available here:
> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/remove_vb2_queue_limit_v6
> 
> GStreamer branch to use DELETE_BUF ioctl and testing dynamic resolution
> change is here:
> https://gitlab.freedesktop.org/benjamin.gaignard1/gstreamer/-/commits/VP9_drc

FYI: I still need to review and test patches 17 and 18. Either tomorrow or Wednesday.

Regards,

	Hans

> 
> changes in version 6:
> - Get a patch per driver to use vb2_get_buffer() instead of directly access
>   to queue buffers array.
> - Add lock in vb2_core_delete_buf()
> - Use vb2_buffer instead of index
> - Fix various comments
> - Change buffer index name to BUFFER_INDEX_MASK
> - Stop spamming kernel log with unbalanced counters
> 
> changes in version 5:
> - Rework offset cookie encoding pattern is n ow the first patch of the
>   serie.
> - Use static array instead of allocated one for postprocessor buffers.
> 
> changes in version 4:
> - Stop using Xarray, instead let queues decide about their own maximum
>   number of buffer and allocate bufs array given that value.
> - Rework offset cookie encoding pattern.
> - Change DELETE_BUF to DELETE_BUFS because it now usable for
>   range of buffer to be symetrical of CREATE_BUFS.
> - Add fixes tags on couple of Verisilicon related patches.
> - Be smarter in Verisilicon postprocessor buffers management.
> - Rebase on top of v6.4
> 
> changes in version 3:
> - Use Xarray API to store allocated video buffers.
> - No module parameter to limit the number of buffer per queue.
> - Use Xarray inside Verisilicon driver to store postprocessor buffers
>   and remove VB2_MAX_FRAME limit.
> - Allow Versilicon driver to change of resolution while streaming
> - Various fixes the Verisilicon VP9 code to improve fluster score.
>  
> changes in version 2:
> - Use a dynamic array and not a list to keep trace of allocated buffers.
>   Not use IDR interface because it is marked as deprecated in kernel
>   documentation.
> - Add a module parameter to limit the number of buffer per queue.
> - Add DELETE_BUF ioctl and m2m helpers.
> 
> Regards,
> Benjamin
>  
> Benjamin Gaignard (18):
>   media: videobuf2: Rework offset 'cookie' encoding pattern
>   media: videobuf2: Stop spamming kernel log with all queue counter
>   media: videobuf2: Use vb2_buffer instead of index
>   media: amphion: Use vb2_get_buffer() instead of directly access to
>     buffers array
>   media: mediatek: jpeg: Use vb2_get_buffer() instead of directly access
>     to buffers array
>   media: mediatek: vdec: Use vb2_get_buffer() instead of directly access
>     to buffers array
>   media: sti: hva: Use vb2_get_buffer() instead of directly access to
>     buffers array
>   media: visl: Use vb2_get_buffer() instead of directly access to
>     buffers array
>   media: atomisp: Use vb2_get_buffer() instead of directly access to
>     buffers array
>   media: videobuf2: Access vb2_queue bufs array through helper functions
>   media: videobuf2: Be more flexible on the number of queue stored
>     buffers
>   media: verisilicon: Refactor postprocessor to store more buffers
>   media: verisilicon: Store chroma and motion vectors offset
>   media: verisilicon: vp9: Use destination buffer height to compute
>     chroma offset
>   media: verisilicon: postproc: Fix down scale test
>   media: verisilicon: vp9: Allow to change resolution while streaming
>   media: v4l2: Add DELETE_BUFS ioctl
>   media: v4l2: Add mem2mem helpers for DELETE_BUFS ioctl
> 
>  .../userspace-api/media/v4l/user-func.rst     |   1 +
>  .../media/v4l/vidioc-delete-bufs.rst          |  73 ++++
>  .../media/common/videobuf2/videobuf2-core.c   | 379 ++++++++++++------
>  .../media/common/videobuf2/videobuf2-v4l2.c   |  99 ++++-
>  drivers/media/dvb-core/dvb_vb2.c              |   6 +-
>  drivers/media/platform/amphion/vpu_dbg.c      |  22 +-
>  .../platform/mediatek/jpeg/mtk_jpeg_core.c    |   6 +-
>  .../vcodec/decoder/vdec/vdec_vp9_req_lat_if.c |   2 +-
>  drivers/media/platform/st/sti/hva/hva-v4l2.c  |   4 +
>  drivers/media/platform/verisilicon/hantro.h   |   9 +-
>  .../media/platform/verisilicon/hantro_drv.c   |   4 +-
>  .../platform/verisilicon/hantro_g2_vp9_dec.c  |  10 +-
>  .../media/platform/verisilicon/hantro_hw.h    |   4 +-
>  .../platform/verisilicon/hantro_postproc.c    |  95 ++++-
>  .../media/platform/verisilicon/hantro_v4l2.c  |  27 +-
>  drivers/media/test-drivers/vim2m.c            |   1 +
>  drivers/media/test-drivers/visl/visl-dec.c    |  28 +-
>  drivers/media/v4l2-core/v4l2-dev.c            |   1 +
>  drivers/media/v4l2-core/v4l2-ioctl.c          |  17 +
>  drivers/media/v4l2-core/v4l2-mem2mem.c        |  20 +
>  .../staging/media/atomisp/pci/atomisp_ioctl.c |   2 +-
>  include/media/v4l2-ioctl.h                    |   4 +
>  include/media/v4l2-mem2mem.h                  |  12 +
>  include/media/videobuf2-core.h                |  29 +-
>  include/media/videobuf2-v4l2.h                |  11 +
>  include/uapi/linux/videodev2.h                |  16 +
>  26 files changed, 664 insertions(+), 218 deletions(-)
>  create mode 100644 Documentation/userspace-api/media/v4l/vidioc-delete-bufs.rst
>
Jernej Škrabec Sept. 12, 2023, 3:51 p.m. UTC | #8
Dne torek, 12. september 2023 ob 10:41:10 CEST je Benjamin Gaignard 
napisal(a):
> Le 11/09/2023 à 18:36, Jernej Škrabec a écrit :
> > Dne ponedeljek, 11. september 2023 ob 10:55:02 CEST je Benjamin Gaignard
> > 
> > napisal(a):
> >> Le 10/09/2023 à 15:21, Jernej Škrabec a écrit :
> >>> Hi Benjamin!
> >>> 
> >>> Dne petek, 01. september 2023 ob 14:44:10 CEST je Benjamin Gaignard
> >>> 
> >>> napisal(a):
> >>>> Source and destination buffer height may not be the same because
> >>>> alignment constraint are different.
> >>>> Use destination height to compute chroma offset because we target
> >>>> this buffer as hardware output.
> >>>> 
> >>>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> >>>> Fixes: e2da465455ce ("media: hantro: Support VP9 on the G2 core")
> >>>> ---
> >>>> 
> >>>>    drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c | 4 +---
> >>>>    1 file changed, 1 insertion(+), 3 deletions(-)
> >>>> 
> >>>> diff --git a/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
> >>>> b/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c index
> >>>> 6db1c32fce4d..1f3f5e7ce978 100644
> >>>> --- a/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
> >>>> +++ b/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
> >>>> @@ -93,9 +93,7 @@ static int start_prepare_run(struct hantro_ctx *ctx,
> >>>> const struct v4l2_ctrl_vp9_ static size_t chroma_offset(const struct
> >>>> hantro_ctx *ctx,
> >>>> 
> >>>>    			    const struct v4l2_ctrl_vp9_frame
> >>> 
> >>> *dec_params)
> >>> 
> >>>>    {
> >>>> 
> >>>> -	int bytes_per_pixel = dec_params->bit_depth == 8 ? 1 : 2;
> >>>> -
> >>>> -	return ctx->src_fmt.width * ctx->src_fmt.height * bytes_per_pixel;
> >>>> +	return ctx->dst_fmt.width * ctx->dst_fmt.height * ctx->bit_depth /
> >>> 
> >>> 8;
> >>> 
> >>> Commit message doesn't mention bit_depth change at all. While I think
> >>> there is no difference between dec_params->bit_depth and ctx->bit_depth,
> >>> you shouldn't just use ordinary division. If bit_depth is 10, it will be
> >>> rounded down. And if you decide to use bit_depth from context, please
> >>> remove dec_params argument.
> >> 
> >> I will change this patch and create a helpers function for chroma and
> >> motion vectors offsets that VP9 and HEVC code will use since they are
> >> identical. I don't see issue with the division. If you have in mind a
> >> solution please write it so I could test it.
> > 
> > Solution is same as the code that you removed:
> > int bytes_per_pixel = dec_params->bit_depth == 8 ? 1 : 2;
> > 
> > Or alternatively:
> > int bytes_per_pixel = DIV_ROUND_UP(dec_params->bit_depth, 8);
> > 
> > Consider bit_depth being 10. With old code you get 2, with yours you get
> > 1.
> 
> The old code is wrong ;-)
> If the format depth is 10 bits per pixel then chroma offset (in bytes)
> formula is width * height * 10 / 8 not width * height * 16 / 8.
> 
> I have already confirm that with HEVC on the same hardware.

Ok, mention of bit_depth issue in commit log would be great. It talks only 
about width and height.

In any case, are width and/or height always dividable by 8?

Best regards,
Jernej

> 
> Regards,
> Benjamin
> 
> > Best regards,
> > Jernej
> > 
> >> Regards,
> >> Benjamin
> >> 
> >>> Best regards,
> >>> Jernej
> >>> 
> >>>>    }
> >>>>    
> >>>>    static size_t mv_offset(const struct hantro_ctx *ctx,