tilcdc: vblank wait timed out

Message ID 1893554f-cdd7-4069-9b66-f47a155612e2@ti.com
State New

Commit Message

Jyri Sarha Dec. 12, 2016, 11:26 a.m.
On 12/05/16 19:07, Bartosz Golaszewski wrote:
> 2016-12-05 12:01 GMT+01:00 Bartosz Golaszewski <bgolaszewski@baylibre.com>:
>> Hi Jyri,
>>
>> I pulled your recent tilcdc pull request on top of v4.9 and Sekhar's
>> davinci branch (+ vga dac DT)[1].
>>
>> I'm getting "vblank wait timed out" errors when running simple modetest[2].
>>
>> This error happened before with the drm_bridge series[3], but went
>> away at one of the subsequent patch versions.
>>
>> Could you please verify that you see the same error and advise on what
>> could be the reason? I'm investigating on my own as well.
>>
>> Best regards,
>> Bartosz Golaszewski
>>
>> [1] https://github.com/brgl/linux/tree/tilcdc/modetest_error
>> [2] http://pastebin.com/rCM44Uds
>> [3] http://www.spinics.net/lists/dri-devel/msg123732.html
>

Sorry, I almost forgot about this problem.

> This seems like some END_OF_FRAME0 interrupt-related race condition.

I cannot see any race related to vblank event sending. The
drm_modeset_lock_crtc() is there exactly for all ongoing operations to
complete before shutting down the crtc.

I think the problem is a missing END_OF_FRAME0 interrupt when a sync lost
interrupt flood happens.
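
As a toy illustration of that hypothesis (plain userspace C, not driver code): `struct model_lcdc`, `model_frame_tick()` and `model_wait_for_vblank()` are hypothetical names, and time is counted in simulated frame periods rather than milliseconds. The sketch only shows that if END_OF_FRAME0 never arrives, a waiter that watches the vblank counter times out regardless of how long it is allowed to wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the controller state; not the real tilcdc structures. */
struct model_lcdc {
	unsigned int vblank_count;	/* bumped once per END_OF_FRAME0 */
	bool frame_done_irq_lost;	/* true while END_OF_FRAME0 is not delivered */
};

/* One simulated frame period: END_OF_FRAME0 fires unless it is lost. */
static void model_frame_tick(struct model_lcdc *lcdc)
{
	if (!lcdc->frame_done_irq_lost)
		lcdc->vblank_count++;
}

/*
 * Wait up to max_frames simulated frame periods for vblank_count to change.
 * Returns true if a vblank was seen, false if the wait timed out.
 */
static bool model_wait_for_vblank(struct model_lcdc *lcdc, int max_frames)
{
	unsigned int start = lcdc->vblank_count;
	int i;

	assert(max_frames >= 0);

	for (i = 0; i < max_frames; i++) {
		model_frame_tick(lcdc);
		if (lcdc->vblank_count != start)
			return true;
	}

	return false;	/* the "vblank wait timed out" case */
}
```

In this model, doubling `max_frames` changes nothing while `frame_done_irq_lost` is set; the wait can only succeed if something else completes it.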

> Increasing the timeout in drm_atomic_helper_wait_for_vblanks() from 50
> to 100 and dropping drm_modeset_lock_crtc()/drm_modeset_unlock_crtc()

Not taking the lock causes drm_crtc_vblank_off() to be called in
tilcdc_crtc_disable() before the timeout happens. However, this is
racy because there is a pending commit still ongoing and executing in
parallel with the recovery work.

> in tilcdc_crtc_recover_work() makes the warning disappear. Also:
> calling drm_crtc_vblank_off() additionally before locking the crtc in
> tilcdc_crtc_recover_work() also seems to fix the issue 90% of times. I

I wonder what happens in that 10% of the time when that does not
help...

> have been unable to figure out a reliable solution today though.
>

Does the attached patch help with the issue?


Best regards,
Jyri

Comments

Bartosz Golaszewski Dec. 12, 2016, 12:16 p.m. | #1
2016-12-12 12:26 GMT+01:00 Jyri Sarha <jsarha@ti.com>:
> On 12/05/16 19:07, Bartosz Golaszewski wrote:
>> 2016-12-05 12:01 GMT+01:00 Bartosz Golaszewski <bgolaszewski@baylibre.com>:
>>> Hi Jyri,
>>>
>>> I pulled your recent tilcdc pull request on top of v4.9 and Sekhar's
>>> davinci branch (+ vga dac DT)[1].
>>>
>>> I'm getting "vblank wait timed out" errors when running simple modetest[2].
>>>
>>> This error happened before with the drm_bridge series[3], but went
>>> away at one of the subsequent patch versions.
>>>
>>> Could you please verify that you see the same error and advise on what
>>> could be the reason? I'm investigating on my own as well.
>>>
>>> Best regards,
>>> Bartosz Golaszewski
>>>
>>> [1] https://github.com/brgl/linux/tree/tilcdc/modetest_error
>>> [2] http://pastebin.com/rCM44Uds
>>> [3] http://www.spinics.net/lists/dri-devel/msg123732.html
>>
>
> Sorry, I almost forgot about this problem.
>
>> This seems like some END_OF_FRAME0 interrupt-related race condition.
>
> I cannot see any race related to vblank event sending. The
> drm_modeset_lock_crtc() is there exactly for all ongoing operations to
> complete before shutting down the crtc.
>
> I think the problem is a missing END_OF_FRAME0 interrupt when a sync lost
> interrupt flood happens.
>

Indeed sounds like a probable cause.

>> Increasing the timeout in drm_atomic_helper_wait_for_vblanks() from 50
>> to 100 and dropping drm_modeset_lock_crtc()/drm_modeset_unlock_crtc()
>
> Not taking the lock causes drm_crtc_vblank_off() to be called in
> tilcdc_crtc_disable() before the timeout happens. However, this is
> racy because there is a pending commit still ongoing and executing in
> parallel with the recovery work.
>

Sure, I just did it to see if it would change anything.

>> in tilcdc_crtc_recover_work() makes the warning disappear. Also:
>> calling drm_crtc_vblank_off() additionally before locking the crtc in
>> tilcdc_crtc_recover_work() also seems to fix the issue 90% of times. I
>
> I wonder what happens in that 10% of the time when that does not
> help...
>
>> have been unable to figure out a reliable solution today though.
>>
>
> Does the attached patch help with the issue?
>

No it doesn't - I'm still getting the warning.

Thanks,
Bartosz

Patch

From 35957a10ed12188a3f8ed63c694929eb1bb3d9b6 Mon Sep 17 00:00:00 2001
From: Jyri Sarha <jsarha@ti.com>
Date: Mon, 12 Dec 2016 12:44:47 +0200
Subject: [PATCH] drm/tilcdc: Send pending vblank event before recovery work

Send the pending vblank event without waiting for the END_OF_FRAME0
interrupt. Sometimes the END_OF_FRAME0 interrupt does not come when
LCDC suffers from the sync lost flood and this causes a nasty warning
in drm_atomic_helper_wait_for_vblanks().

Signed-off-by: Jyri Sarha <jsarha@ti.com>
---
 drivers/gpu/drm/tilcdc/tilcdc_crtc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
index 9942b05..00ce9d2 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
@@ -566,9 +566,19 @@  static void tilcdc_crtc_recover_work(struct work_struct *work)
 	struct tilcdc_crtc *tilcdc_crtc =
 		container_of(work, struct tilcdc_crtc, recover_work);
 	struct drm_crtc *crtc = &tilcdc_crtc->base;
+	struct drm_device *dev = crtc->dev;
+	struct drm_pending_vblank_event *event;
+	unsigned long flags;
 
 	dev_info(crtc->dev->dev, "%s: Reset CRTC", __func__);
 
+	spin_lock_irqsave(&dev->event_lock, flags);
+	event = tilcdc_crtc->event;
+	tilcdc_crtc->event = NULL;
+	if (event)
+		drm_crtc_send_vblank_event(crtc, event);
+	spin_unlock_irqrestore(&dev->event_lock, flags);
+
 	drm_modeset_lock_crtc(crtc, NULL);
 
 	if (!tilcdc_crtc_is_on(crtc))
-- 
1.9.1
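
The pattern the patch adds to tilcdc_crtc_recover_work(), claiming the pending event and clearing the pointer inside one critical section so that the event can be sent at most once, can be modelled in userspace. This is only a sketch under stated assumptions: `struct pending_event`, `struct model_crtc`, `model_send_event()` and `model_flush_pending_event()` are hypothetical stand-ins, and a pthread mutex plays the role of `dev->event_lock`. None of it is the driver's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct drm_pending_vblank_event. */
struct pending_event {
	int delivered;	/* counts how many times the event was sent */
};

/* Hypothetical stand-in for the parts of struct tilcdc_crtc used here. */
struct model_crtc {
	pthread_mutex_t event_lock;	/* plays the role of dev->event_lock */
	struct pending_event *event;	/* plays the role of tilcdc_crtc->event */
};

/* Stand-in for drm_crtc_send_vblank_event(): only records the delivery. */
static void model_send_event(struct pending_event *ev)
{
	assert(ev != NULL);
	ev->delivered++;
}

/*
 * Take the pending event and clear the pointer under the lock, then send
 * it while still holding the lock, mirroring the order used in the patch.
 * Returns 1 if an event was sent, 0 if none was pending.
 */
static int model_flush_pending_event(struct model_crtc *crtc)
{
	struct pending_event *ev;

	pthread_mutex_lock(&crtc->event_lock);
	ev = crtc->event;
	crtc->event = NULL;
	if (ev)
		model_send_event(ev);
	pthread_mutex_unlock(&crtc->event_lock);

	return ev != NULL;
}
```

Because the pointer is cleared in the same critical section that reads it, a second caller (in the driver, presumably the END_OF_FRAME0 handler taking the same lock) would find no event and send nothing. The model says nothing about whether this is sufficient to avoid the warning on real hardware.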