[RFC,4/4,v2] drm_hwcomposer: Try to fallback if GLCompisition fails

Message ID 1516749399-29504-5-git-send-email-john.stultz@linaro.org
State New
Headers show
Series
  • drm_hwcomposer: Changes to support HiKey/HiKey960
Related show

Commit Message

John Stultz Jan. 23, 2018, 11:16 p.m.
When using drm_hwcomposer with the hikey board, the resulting
display shows lots of tearing.

This seems to be due to EGLcomposition not initializing
properly, potentially due to I'm guessing limitations of what
the  utgard mali driver can do. I've noted that with the
HiKey960 board, this patch is *not* necessary.

Hacking around a bit, I found that since the glworker code
isn't running properly, we never call glFinish(), which
is required to fix the tearing.

Ideas for a better way to implement this would be greatly
appreciated!

Cc: Marissa Wall <marissaw@google.com>
Cc: Sean Paul <seanpaul@google.com>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Robert Foss <robert.foss@collabora.com>
Cc: Matt Szczesiak <matt.szczesiak@arm.com>
Cc: Liviu Dudau <Liviu.Dudau@arm.com>
Cc: David Hanna <david.hanna11@gmail.com>
Cc: Rob Herring <rob.herring@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>

---
v2:
* Simplified, focusing on the key glFinsh() call
---
 drmdisplaycompositor.cpp | 4 ++++
 1 file changed, 4 insertions(+)

-- 
2.7.4

Comments

Sean Paul Jan. 24, 2018, 3:26 p.m. | #1
On Tue, Jan 23, 2018 at 03:16:39PM -0800, John Stultz wrote:
> When using drm_hwcomposer with the hikey board, the resulting
> display shows lots of tearing.
> 
> This seems to be due to EGLcomposition not initializing
> properly, potentially due to I'm guessing limitations of what
> the  utgard mali driver can do. I've noted that with the
> HiKey960 board, this patch is *not* necessary.
> 
> Hacking around a bit, I found that since the glworker code
> isn't running properly, we never call glFinish(), which
> is required to fix the tearing.
> 
> Ideas for a better way to implement this would be greatly
> appreciated!

Sounds like you'll have to dig into the gl compositor to fix this. I think
chances are quite good there's a better way than below.

Good luck!

> 
> Cc: Marissa Wall <marissaw@google.com>
> Cc: Sean Paul <seanpaul@google.com>
> Cc: Dmitry Shmidt <dimitrysh@google.com>
> Cc: Robert Foss <robert.foss@collabora.com>
> Cc: Matt Szczesiak <matt.szczesiak@arm.com>
> Cc: Liviu Dudau <Liviu.Dudau@arm.com>
> Cc: David Hanna <david.hanna11@gmail.com>
> Cc: Rob Herring <rob.herring@linaro.org>
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
> v2:
> * Simplified, focusing on the key glFinsh() call
> ---
>  drmdisplaycompositor.cpp | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drmdisplaycompositor.cpp b/drmdisplaycompositor.cpp
> index 3a20b31..eb0b77a 100644
> --- a/drmdisplaycompositor.cpp
> +++ b/drmdisplaycompositor.cpp
> @@ -439,6 +439,10 @@ int DrmDisplayCompositor::PrepareFrame(DrmDisplayComposition *display_comp) {
>  
>        fb.set_release_fence_fd(ret);
>        ret = 0;
> +    } else {
> +      /*If we're not doing anything, block to avoid tearing */
> +      glFinish();
> +      return 0;
>      }
>    }
>  
> -- 
> 2.7.4
>
John Stultz Jan. 24, 2018, 7:26 p.m. | #2
On Wed, Jan 24, 2018 at 7:26 AM, Sean Paul <seanpaul@chromium.org> wrote:
> On Tue, Jan 23, 2018 at 03:16:39PM -0800, John Stultz wrote:
>> When using drm_hwcomposer with the hikey board, the resulting
>> display shows lots of tearing.
>>
>> This seems to be due to EGLcomposition not initializing
>> properly, potentially due to I'm guessing limitations of what
>> the  utgard mali driver can do. I've noted that with the
>> HiKey960 board, this patch is *not* necessary.
>>
>> Hacking around a bit, I found that since the glworker code
>> isn't running properly, we never call glFinish(), which
>> is required to fix the tearing.
>>
>> Ideas for a better way to implement this would be greatly
>> appreciated!
>
> Sounds like you'll have to dig into the gl compositor to fix this. I think
> chances are quite good there's a better way than below.

Yes, I did spent a little bit of time earlier trying to rewrite the gl
shader to try to build for the utgard level hardware, but wasn't very
successful. At a deeper level I guess I'm not sure the glcompositor is
useful in this case, since we're doing single plane client side
compositing (as short of the glFinish bit, not running it doesn't seem
to keep things from working).  But I'll look into that again.

Again forgive as I really a bit in the dark on most graphics details,
but the other more basic question I'm a bit unsure of is, does this
patch even make any functional sense?  If we're not using the
glcompositor and are using the atomic commit in ApplyFrame(), why the
need for glFinish to avoid tearing?  Is it that the we commit the
frame atomically to the display, but if we don't block w/ glFinish()
the gpu is still drawing into it?  It seems we'd want a buffer
specific fence to wait on rather then waiting for all gl calls to
complete (if I'm understanding how glFinish() works).

Thanks again for your review feedback!

thanks
-john
Rob Herring Jan. 24, 2018, 7:58 p.m. | #3
On Wed, Jan 24, 2018 at 1:26 PM, John Stultz <john.stultz@linaro.org> wrote:
> On Wed, Jan 24, 2018 at 7:26 AM, Sean Paul <seanpaul@chromium.org> wrote:
>> On Tue, Jan 23, 2018 at 03:16:39PM -0800, John Stultz wrote:
>>> When using drm_hwcomposer with the hikey board, the resulting
>>> display shows lots of tearing.
>>>
>>> This seems to be due to EGLcomposition not initializing
>>> properly, potentially due to I'm guessing limitations of what
>>> the  utgard mali driver can do. I've noted that with the
>>> HiKey960 board, this patch is *not* necessary.
>>>
>>> Hacking around a bit, I found that since the glworker code
>>> isn't running properly, we never call glFinish(), which
>>> is required to fix the tearing.
>>>
>>> Ideas for a better way to implement this would be greatly
>>> appreciated!
>>
>> Sounds like you'll have to dig into the gl compositor to fix this. I think
>> chances are quite good there's a better way than below.
>
> Yes, I did spent a little bit of time earlier trying to rewrite the gl
> shader to try to build for the utgard level hardware, but wasn't very
> successful. At a deeper level I guess I'm not sure the glcompositor is
> useful in this case, since we're doing single plane client side
> compositing (as short of the glFinish bit, not running it doesn't seem
> to keep things from working).  But I'll look into that again.
>
> Again forgive as I really a bit in the dark on most graphics details,
> but the other more basic question I'm a bit unsure of is, does this
> patch even make any functional sense?

I'd think not. The only thing that makes me question that is I've seen
glFinish calls in gralloc (framebuffer) cases. But those cases were
prior to explicit fence support.

>  If we're not using the
> glcompositor and are using the atomic commit in ApplyFrame(), why the
> need for glFinish to avoid tearing?  Is it that the we commit the
> frame atomically to the display, but if we don't block w/ glFinish()
> the gpu is still drawing into it?  It seems we'd want a buffer
> specific fence to wait on rather then waiting for all gl calls to
> complete (if I'm understanding how glFinish() works).

That seems like the right approach. Are we failing to pass fences
associated with input layers to DRM?

Rob
Alexandru-Cosmin Gheorghe Jan. 31, 2018, 6:51 p.m. | #4
Hi John,

I took your patches for a spin on Hikey960. Some findings:
Even with this patch I'm getting some tearing on Hikey960, not a lot as you
reported on Hikey, still there are some small black squares, less than 10 of
aproximetly 20x20.
So I investigated a little bit through drm_hwcomposer and found some
issues, maybe they could help you somehow, see bellow:
 Wed, Jan 24, 2018 at 01:58:55PM -0600, Rob Herring wrote:
> On Wed, Jan 24, 2018 at 1:26 PM, John Stultz <john.stultz@linaro.org> wrote:
> > On Wed, Jan 24, 2018 at 7:26 AM, Sean Paul <seanpaul@chromium.org> wrote:
> >> On Tue, Jan 23, 2018 at 03:16:39PM -0800, John Stultz wrote:
> >>> When using drm_hwcomposer with the hikey board, the resulting
> >>> display shows lots of tearing.
> >>>
> >>> This seems to be due to EGLcomposition not initializing
> >>> properly, potentially due to I'm guessing limitations of what
> >>> the  utgard mali driver can do. I've noted that with the
> >>> HiKey960 board, this patch is *not* necessary.
> >>>


> >>> Hacking around a bit, I found that since the glworker code
> >>> isn't running properly, we never call glFinish(), which
> >>> is required to fix the tearing.
> >>>
> >>> Ideas for a better way to implement this would be greatly
> >>> appreciated!
> >>
> >> Sounds like you'll have to dig into the gl compositor to fix this. I think
> >> chances are quite good there's a better way than below.
> >
> > Yes, I did spent a little bit of time earlier trying to rewrite the gl
> > shader to try to build for the utgard level hardware, but wasn't very
> > successful. At a deeper level I guess I'm not sure the glcompositor is
> > useful in this case, since we're doing single plane client side
> > compositing (as short of the glFinish bit, not running it doesn't seem
> > to keep things from working).  But I'll look into that again.
> >
> > Again forgive as I really a bit in the dark on most graphics details,
> > but the other more basic question I'm a bit unsure of is, does this
> > patch even make any functional sense?
>
> I'd think not. The only thing that makes me question that is I've seen
> glFinish calls in gralloc (framebuffer) cases. But those cases were
> prior to explicit fence support.
>
> >  If we're not using the
> > glcompositor and are using the atomic commit in ApplyFrame(), why the
> > need for glFinish to avoid tearing?  Is it that the we commit the
> > frame atomically to the display, but if we don't block w/ glFinish()
> > the gpu is still drawing into it?  It seems we'd want a buffer
> > specific fence to wait on rather then waiting for all gl calls to
> > complete (if I'm understanding how glFinish() works).
>
> That seems like the right approach. Are we failing to pass fences
> associated with input layers to DRM?
>
> Rob
> _______________________________________________
It seems that we don't pass any explicit fences to the kernel for
IN_FENCE_FD property. This particular line seems wrong.

@@ -593,14 +594,17 @@ int
DrmDisplayCompositor::CommitFrame(DrmDisplayComposition *display_comp,
       else
         rotation |= DRM_MODE_ROTATE_0;

-      if (fence_fd < 0) {
+      if (fence_fd >= 0) {

Also, if you track the value of the OUT_FENCE_PTR property, it seems
that it gets lost. As far I was able to track it down it seem that
it should have been used by AddFenceToRetireFence, and then be passed to
SurfaceFlinger through retire_fence parameter of PresentDisplay. But,
that doesn't happen because  AddFenceToRetireFence get's called each
time on a new DrmDisplayComposition object, before calling
ApplyComposition.


Both of this findings should have explained the tearing, however I
hacked something today and it didn't fix it for Hikey960, maybe
you have more luck.

Thank you,
Alex Gheorghe


> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
John Stultz Jan. 31, 2018, 7:03 p.m. | #5
On Wed, Jan 31, 2018 at 10:51 AM, Alexandru-Cosmin Gheorghe
<Alexandru-Cosmin.Gheorghe@arm.com> wrote:
> Hi John,
>
> I took your patches for a spin on Hikey960. Some findings:
> Even with this patch I'm getting some tearing on Hikey960, not a lot as you
> reported on Hikey, still there are some small black squares, less than 10 of
> aproximetly 20x20.

Sweet! I'm glad you were able to get it up and running!  I think the
black squares are a separate thing, more on that below.

> So I investigated a little bit through drm_hwcomposer and found some
> issues, maybe they could help you somehow, see bellow:

Very much appreciate the extra review and eyes on the code!

>> > Yes, I did spent a little bit of time earlier trying to rewrite the gl
>> > shader to try to build for the utgard level hardware, but wasn't very
>> > successful. At a deeper level I guess I'm not sure the glcompositor is
>> > useful in this case, since we're doing single plane client side
>> > compositing (as short of the glFinish bit, not running it doesn't seem
>> > to keep things from working).  But I'll look into that again.
>> >
>> > Again forgive as I really a bit in the dark on most graphics details,
>> > but the other more basic question I'm a bit unsure of is, does this
>> > patch even make any functional sense?
>>
>> I'd think not. The only thing that makes me question that is I've seen
>> glFinish calls in gralloc (framebuffer) cases. But those cases were
>> prior to explicit fence support.
>>
>> >  If we're not using the
>> > glcompositor and are using the atomic commit in ApplyFrame(), why the
>> > need for glFinish to avoid tearing?  Is it that the we commit the
>> > frame atomically to the display, but if we don't block w/ glFinish()
>> > the gpu is still drawing into it?  It seems we'd want a buffer
>> > specific fence to wait on rather then waiting for all gl calls to
>> > complete (if I'm understanding how glFinish() works).
>>
>> That seems like the right approach. Are we failing to pass fences
>> associated with input layers to DRM?
>> _______________________________________________
> It seems that we don't pass any explicit fences to the kernel for
> IN_FENCE_FD property. This particular line seems wrong.
>
> @@ -593,14 +594,17 @@ int
> DrmDisplayCompositor::CommitFrame(DrmDisplayComposition *display_comp,
>        else
>          rotation |= DRM_MODE_ROTATE_0;
>
> -      if (fence_fd < 0) {
> +      if (fence_fd >= 0) {

I'll give that a whirl.

> Also, if you track the value of the OUT_FENCE_PTR property, it seems
> that it gets lost. As far I was able to track it down it seem that
> it should have been used by AddFenceToRetireFence, and then be passed to
> SurfaceFlinger through retire_fence parameter of PresentDisplay. But,
> that doesn't happen because  AddFenceToRetireFence get's called each
> time on a new DrmDisplayComposition object, before calling
> ApplyComposition.

I'll look into this some more as well. I've not yet gotten into the
details of the fence handling in drm_hwcomposer yet, so I'll have to
check that out.

> Both of this findings should have explained the tearing, however I
> hacked something today and it didn't fix it for Hikey960, maybe
> you have more luck.

The issue with the black/grey blocks is a separate problem with the
4.9 based kernels that I'm still chasing down, we see it (though less
often) w/o the hwcomposer, and I think it has to do with something off
in the CMA allocations. We don't see it with the 4.4 or the 4.14 based
tree i'm also prepping.

Thanks again for the feedback! I'm really happy to have your input
here, so if you do generate any further changes, let me know and I'll
try to integrate them into the patch set!

thanks
-john
John Stultz Feb. 13, 2018, 4:43 a.m. | #6
On Wed, Jan 31, 2018 at 11:03 AM, John Stultz <john.stultz@linaro.org> wrote:
> On Wed, Jan 31, 2018 at 10:51 AM, Alexandru-Cosmin Gheorghe
> <Alexandru-Cosmin.Gheorghe@arm.com> wrote:
>> It seems that we don't pass any explicit fences to the kernel for
>> IN_FENCE_FD property. This particular line seems wrong.
>>
>> @@ -593,14 +594,17 @@ int
>> DrmDisplayCompositor::CommitFrame(DrmDisplayComposition *display_comp,
>>        else
>>          rotation |= DRM_MODE_ROTATE_0;
>>
>> -      if (fence_fd < 0) {
>> +      if (fence_fd >= 0) {
>
> I'll give that a whirl.

So I did give that a whirl after you suggested it, but it ended up
causing nothing to be displayed, and I unfortunately didn't have time
right then to dig much further.

Rob however re-found this issue today, and I've been digging a bit
more. At least with the HiKey board, it seem the trouble is when the
IN_FENCE_FD is added to the pset, the atomic commit calls start to
fail. I dug in and it seems we're catching in the kernel on the  if
(file->f_op != &sync_file_fops) check in sync_file_fdget().

I'm now trying to trace back to where the in fence was provided from
to see what might be going wrong there.

Curious if this is anything you ran across in your attempts?

thanks
-john
Alexandru-Cosmin Gheorghe Feb. 13, 2018, 10:45 a.m. | #7
On Mon, Feb 12, 2018 at 08:43:10PM -0800, John Stultz wrote:
> On Wed, Jan 31, 2018 at 11:03 AM, John Stultz <john.stultz@linaro.org> wrote:
> > On Wed, Jan 31, 2018 at 10:51 AM, Alexandru-Cosmin Gheorghe
> > <Alexandru-Cosmin.Gheorghe@arm.com> wrote:
> >> It seems that we don't pass any explicit fences to the kernel for
> >> IN_FENCE_FD property. This particular line seems wrong.
> >>
> >> @@ -593,14 +594,17 @@ int
> >> DrmDisplayCompositor::CommitFrame(DrmDisplayComposition *display_comp,
> >>        else
> >>          rotation |= DRM_MODE_ROTATE_0;
> >>
> >> -      if (fence_fd < 0) {
> >> +      if (fence_fd >= 0) {
> >
> > I'll give that a whirl.
>
> So I did give that a whirl after you suggested it, but it ended up
> causing nothing to be displayed, and I unfortunately didn't have time
> right then to dig much further.
>
> Rob however re-found this issue today, and I've been digging a bit
> more. At least with the HiKey board, it seem the trouble is when the
> IN_FENCE_FD is added to the pset, the atomic commit calls start to
> fail. I dug in and it seems we're catching in the kernel on the  if
> (file->f_op != &sync_file_fops) check in sync_file_fdget().
>
> I'm now trying to trace back to where the in fence was provided from
> to see what might be going wrong there.
I would be surprised if this fence is not created by the GPU driver.
But, the whole Android stack is new to me, so I might be wrong.
>
> Curious if this is anything you ran across in your attempts?
Sorry, on Hikey960, I don't have this problem. Are you using the
same 4.9 kernel from your branch(hikey-hwcomposer-deps)?
>
> thanks
> -john

Regards,
Alex Gheorghe
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
John Stultz Feb. 13, 2018, 10:03 p.m. | #8
On Tue, Feb 13, 2018 at 2:45 AM, Alexandru-Cosmin Gheorghe
<Alexandru-Cosmin.Gheorghe@arm.com> wrote:
> On Mon, Feb 12, 2018 at 08:43:10PM -0800, John Stultz wrote:
>> On Wed, Jan 31, 2018 at 11:03 AM, John Stultz <john.stultz@linaro.org> wrote:
>> > On Wed, Jan 31, 2018 at 10:51 AM, Alexandru-Cosmin Gheorghe
>> > <Alexandru-Cosmin.Gheorghe@arm.com> wrote:
>> >> It seems that we don't pass any explicit fences to the kernel for
>> >> IN_FENCE_FD property. This particular line seems wrong.
>> >>
>> >> @@ -593,14 +594,17 @@ int
>> >> DrmDisplayCompositor::CommitFrame(DrmDisplayComposition *display_comp,
>> >>        else
>> >>          rotation |= DRM_MODE_ROTATE_0;
>> >>
>> >> -      if (fence_fd < 0) {
>> >> +      if (fence_fd >= 0) {
>> >
>> > I'll give that a whirl.
>>
>> So I did give that a whirl after you suggested it, but it ended up
>> causing nothing to be displayed, and I unfortunately didn't have time
>> right then to dig much further.
>>
>> Rob however re-found this issue today, and I've been digging a bit
>> more. At least with the HiKey board, it seem the trouble is when the
>> IN_FENCE_FD is added to the pset, the atomic commit calls start to
>> fail. I dug in and it seems we're catching in the kernel on the  if
>> (file->f_op != &sync_file_fops) check in sync_file_fdget().
>>
>> I'm now trying to trace back to where the in fence was provided from
>> to see what might be going wrong there.
> I would be surprised if this fence is not created by the GPU driver.
> But, the whole Android stack is new to me, so I might be wrong.

Yes, I went digging on this last night and it ends up the utgard
driver we use (r7p0) isn't giving us a dmabuf fence, instead we're
getting a mali_sync_fence.

There's a newer kernel driver release (r8p1), which seems to correct
this, but it also is version tied to the binary blob, which I don't
have a matching update for. I spent a bit of time trying to hack out
just the dmabuf fence bits, but it rarely boots all the way and
frequently locks the board up, so I'm probably mucking the library
driver interface in some way.

Its a well known sad song.

>> Curious if this is anything you ran across in your attempts?
> Sorry, on Hikey960, I don't have this problem. Are you using the
> same 4.9 kernel from your branch(hikey-hwcomposer-deps)?

Thanks for confirming! I'll be double checking on the hikey960 soon. I
suspect the bifrost driver is giving back the right structure if it
was working for you.

thanks
-john

Patch

diff --git a/drmdisplaycompositor.cpp b/drmdisplaycompositor.cpp
index 3a20b31..eb0b77a 100644
--- a/drmdisplaycompositor.cpp
+++ b/drmdisplaycompositor.cpp
@@ -439,6 +439,10 @@  int DrmDisplayCompositor::PrepareFrame(DrmDisplayComposition *display_comp) {
 
       fb.set_release_fence_fd(ret);
       ret = 0;
+    } else {
+      /*If we're not doing anything, block to avoid tearing */
+      glFinish();
+      return 0;
     }
   }