Message ID | 20220908083739.11586-1-laurent.pinchart@ideasonboard.com |
---|---|
State | New |
Headers | show |
Series | [v3] media: imx: imx7-media-csi: Add support for fast-tracking queued buffers | expand |
Hi Paul, On Fri, Nov 18, 2022 at 06:20:10PM +0900, Paul Elder wrote: > Gentle ping Thanks for the ping. This patch is now part of a pull request for v6.2: https://patchwork.linuxtv.org/project/linux-media/patch/Y3lbLjxTe1P5TJsm@pendragon.ideasonboard.com/ > On Thu, Sep 08, 2022 at 11:37:39AM +0300, Laurent Pinchart wrote: > > From: Paul Elder <paul.elder@ideasonboard.com> > > > > The CSI hardware compatible with this driver handles buffers using a > > ping-pong mechanism with two sets of destination addresses. Normally, > > when an interrupt comes in to signal the completion of one buffer, say > > FB1, it assigns the next buffer in the queue to the next FB1, and the > > hardware starts to capture into FB2 in the meantime. > > > > In a buffer underrun situation, in the above example without loss of > > generality, if a new buffer is queued before the interrupt for FB1 comes > > in, we can program the buffer into FB2 (which is programmed with a dummy > > buffer, as there is a buffer underrun). > > > > This of course races with the interrupt that signals FB1 completion, as > > once that interrupt comes in, we are no longer guaranteed that the > > programming of FB2 was in time and must assume it was too late. This > > race is resolved partly by locking the programming of FB2. If it came > > after the interrupt for FB1, then the variable that is used to determine > > which FB to program would have been swapped by the interrupt handler. > > > > This alone isn't sufficient, however, because the interrupt could still > > be generated (thus the hardware starts capturing into the other fb) > > while the fast-tracking routine has the irq lock. Thus, after > > programming the fb register to fast-track the buffer, the isr also must > > be checked to confirm that an interrupt didn't come in the meantime. If > > it has, we must assume that programming the register for the > > fast-tracked buffer was not in time, and queue the buffer normally. > > > > Signed-off-by: Paul Elder <paul.elder@ideasonboard.com> > > Acked-by: Rui Miguel Silva <rmfrfs@gmail.com> > > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > --- > > Changes since v2: > > > > - Update comments > > > > Changes since v1: > > > > - Fix the potential race condition where the interrupt comes in while > > the fast tracking routine has the irqlock > > - Change return value from int to bool > > --- > > drivers/staging/media/imx/imx7-media-csi.c | 76 ++++++++++++++++++++++ > > 1 file changed, 76 insertions(+) > > > > diff --git a/drivers/staging/media/imx/imx7-media-csi.c b/drivers/staging/media/imx/imx7-media-csi.c > > index a0553c24cce4..ac35cbc16e73 100644 > > --- a/drivers/staging/media/imx/imx7-media-csi.c > > +++ b/drivers/staging/media/imx/imx7-media-csi.c > > @@ -1296,12 +1296,88 @@ static int imx7_csi_video_buf_prepare(struct vb2_buffer *vb) > > return 0; > > } > > > > +static bool imx7_csi_fast_track_buffer(struct imx7_csi *csi, > > + struct imx7_csi_vb2_buffer *buf) > > +{ > > + unsigned long flags; > > + dma_addr_t phys; > > + int buf_num; > > + u32 isr; > > + > > + if (!csi->is_streaming) > > + return false; > > + > > + phys = vb2_dma_contig_plane_dma_addr(&buf->vbuf.vb2_buf, 0); > > + > > + /* > > + * buf_num holds the framebuffer ID of the most recently (*not* the > > + * next anticipated) triggered interrupt. Without loss of generality, > > + * if buf_num is 0, the hardware is capturing to FB2. If FB1 has been > > + * programmed with a dummy buffer (as indicated by active_vb2_buf[0] > > + * being NULL), then we can fast-track the new buffer by programming > > + * its address in FB1 before the hardware completes FB2, instead of > > + * adding it to the buffer queue and incurring a delay of one > > + * additional frame. > > + * > > + * The irqlock prevents races with the interrupt handler that updates > > + * buf_num when it programs the next buffer, but we can still race with > > + * the hardware if we program the buffer in FB1 just after the hardware > > + * completes FB2 and switches to FB1 and before buf_num can be updated > > + * by the interrupt handler for FB2. The fast-tracked buffer would > > + * then be ignored by the hardware while the driver would think it has > > + * successfully been processed. > > + * > > + * To avoid this problem, if we can't avoid the race, we can detect > > + * that we have lost it by checking, after programming the buffer in > > + * FB1, if the interrupt flag indicating completion of FB2 has been > > + * raised. If that is not the case, fast-tracking succeeded, and we can > > + * update active_vb2_buf[0]. Otherwise, we may or may not have lost the > > + * race (as the interrupt flag may have been raised just after > > + * programming FB1 and before we read the interrupt status register), > > + * and we need to assume the worst case of a race loss and queue the > > + * buffer through the slow path. > > + */ > > + > > + spin_lock_irqsave(&csi->irqlock, flags); > > + > > + buf_num = csi->buf_num; > > + if (csi->active_vb2_buf[buf_num]) { > > + spin_unlock_irqrestore(&csi->irqlock, flags); > > + return false; > > + } > > + > > + imx7_csi_update_buf(csi, phys, buf_num); > > + > > + isr = imx7_csi_reg_read(csi, CSI_CSISR); > > + if (isr & (buf_num ? BIT_DMA_TSF_DONE_FB1 : BIT_DMA_TSF_DONE_FB2)) { > > + /* > > + * The interrupt for the /other/ FB just came (the isr hasn't > > + * run yet though, because we have the lock here); we can't be > > + * sure we've programmed buf_num FB in time, so queue the buffer > > + * to the buffer queue normally. No need to undo writing the FB > > + * register, since we won't return it as active_vb2_buf is NULL, > > + * so it's okay to potentially write it to both FB1 and FB2; > > + * only the one where it was queued normally will be returned. > > + */ > > + spin_unlock_irqrestore(&csi->irqlock, flags); > > + return false; > > + } > > + > > + csi->active_vb2_buf[buf_num] = buf; > > + > > + spin_unlock_irqrestore(&csi->irqlock, flags); > > + return true; > > +} > > + > > static void imx7_csi_video_buf_queue(struct vb2_buffer *vb) > > { > > struct imx7_csi *csi = vb2_get_drv_priv(vb->vb2_queue); > > struct imx7_csi_vb2_buffer *buf = to_imx7_csi_vb2_buffer(vb); > > unsigned long flags; > > > > + if (imx7_csi_fast_track_buffer(csi, buf)) > > + return; > > + > > spin_lock_irqsave(&csi->q_lock, flags); > > > > list_add_tail(&buf->list, &csi->ready_q);
diff --git a/drivers/staging/media/imx/imx7-media-csi.c b/drivers/staging/media/imx/imx7-media-csi.c index a0553c24cce4..ac35cbc16e73 100644 --- a/drivers/staging/media/imx/imx7-media-csi.c +++ b/drivers/staging/media/imx/imx7-media-csi.c @@ -1296,12 +1296,88 @@ static int imx7_csi_video_buf_prepare(struct vb2_buffer *vb) return 0; } +static bool imx7_csi_fast_track_buffer(struct imx7_csi *csi, + struct imx7_csi_vb2_buffer *buf) +{ + unsigned long flags; + dma_addr_t phys; + int buf_num; + u32 isr; + + if (!csi->is_streaming) + return false; + + phys = vb2_dma_contig_plane_dma_addr(&buf->vbuf.vb2_buf, 0); + + /* + * buf_num holds the framebuffer ID of the most recently (*not* the + * next anticipated) triggered interrupt. Without loss of generality, + * if buf_num is 0, the hardware is capturing to FB2. If FB1 has been + * programmed with a dummy buffer (as indicated by active_vb2_buf[0] + * being NULL), then we can fast-track the new buffer by programming + * its address in FB1 before the hardware completes FB2, instead of + * adding it to the buffer queue and incurring a delay of one + * additional frame. + * + * The irqlock prevents races with the interrupt handler that updates + * buf_num when it programs the next buffer, but we can still race with + * the hardware if we program the buffer in FB1 just after the hardware + * completes FB2 and switches to FB1 and before buf_num can be updated + * by the interrupt handler for FB2. The fast-tracked buffer would + * then be ignored by the hardware while the driver would think it has + * successfully been processed. + * + * To avoid this problem, if we can't avoid the race, we can detect + * that we have lost it by checking, after programming the buffer in + * FB1, if the interrupt flag indicating completion of FB2 has been + * raised. If that is not the case, fast-tracking succeeded, and we can + * update active_vb2_buf[0]. Otherwise, we may or may not have lost the + * race (as the interrupt flag may have been raised just after + * programming FB1 and before we read the interrupt status register), + * and we need to assume the worst case of a race loss and queue the + * buffer through the slow path. + */ + + spin_lock_irqsave(&csi->irqlock, flags); + + buf_num = csi->buf_num; + if (csi->active_vb2_buf[buf_num]) { + spin_unlock_irqrestore(&csi->irqlock, flags); + return false; + } + + imx7_csi_update_buf(csi, phys, buf_num); + + isr = imx7_csi_reg_read(csi, CSI_CSISR); + if (isr & (buf_num ? BIT_DMA_TSF_DONE_FB1 : BIT_DMA_TSF_DONE_FB2)) { + /* + * The interrupt for the /other/ FB just came (the isr hasn't + * run yet though, because we have the lock here); we can't be + * sure we've programmed buf_num FB in time, so queue the buffer + * to the buffer queue normally. No need to undo writing the FB + * register, since we won't return it as active_vb2_buf is NULL, + * so it's okay to potentially write it to both FB1 and FB2; + * only the one where it was queued normally will be returned. + */ + spin_unlock_irqrestore(&csi->irqlock, flags); + return false; + } + + csi->active_vb2_buf[buf_num] = buf; + + spin_unlock_irqrestore(&csi->irqlock, flags); + return true; +} + static void imx7_csi_video_buf_queue(struct vb2_buffer *vb) { struct imx7_csi *csi = vb2_get_drv_priv(vb->vb2_queue); struct imx7_csi_vb2_buffer *buf = to_imx7_csi_vb2_buffer(vb); unsigned long flags; + if (imx7_csi_fast_track_buffer(csi, buf)) + return; + spin_lock_irqsave(&csi->q_lock, flags); list_add_tail(&buf->list, &csi->ready_q);