
[v1] drm/msm/dp: use dp_hpd_plug_handle() and dp_hpd_unplug_handle() directly

Message ID 1711656246-3483-2-git-send-email-quic_khsieh@quicinc.com
State New

Commit Message

Kuogee Hsieh March 28, 2024, 8:04 p.m. UTC
For the internal HPD case, hpd_event_thread is created to handle HPD
interrupts generated by the HPD block of the DP controller. It converts
HPD interrupts into events and executes them under hpd_event_thread
context. For the external HPD case, HPD events are delivered by way of
dp_bridge_hpd_notify() under thread context. Since they are already
executed under thread context, there is no reason to hand those
events over to hpd_event_thread. Hence dp_hpd_plug_handle() and
dp_hpd_unplug_handle() are called directly from dp_bridge_hpd_notify().

Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
---
 drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
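
For illustration only, and not part of the patch: a minimal userspace C
sketch of the two dispatch paths the commit message describes. Every
sketch_* name is hypothetical. They loosely model dp_add_event(), the
hpd_event_thread loop, and the dp_hpd_plug_handle() /
dp_hpd_unplug_handle() pair, and they assume the handlers do nothing but
flip a link_ready flag, which the real handlers certainly do not.

```c
#include <stddef.h>

/* Hypothetical event IDs, loosely mirroring EV_HPD_PLUG_INT / EV_HPD_UNPLUG_INT. */
enum sketch_event { SKETCH_EV_PLUG, SKETCH_EV_UNPLUG };

#define SKETCH_QUEUE_LEN 8

struct sketch_dp {
	int link_ready;                            /* 1 once a plug was handled */
	enum sketch_event queue[SKETCH_QUEUE_LEN]; /* ring buffer of pending events */
	size_t head, tail;
};

/* Stand-ins for the plug/unplug handlers (assumption: they only flip a flag). */
static void sketch_plug_handle(struct sketch_dp *dp)   { dp->link_ready = 1; }
static void sketch_unplug_handle(struct sketch_dp *dp) { dp->link_ready = 0; }

/* Queued path: only records the event. Returns 0 on success, -1 if full. */
static int sketch_add_event(struct sketch_dp *dp, enum sketch_event ev)
{
	size_t next = (dp->tail + 1) % SKETCH_QUEUE_LEN;

	if (next == dp->head)
		return -1;
	dp->queue[dp->tail] = ev;
	dp->tail = next;
	return 0;
}

/* One pass of the worker: drains the queue in FIFO order, returns events handled. */
static int sketch_event_thread_run(struct sketch_dp *dp)
{
	int handled = 0;

	while (dp->head != dp->tail) {
		enum sketch_event ev = dp->queue[dp->head];

		dp->head = (dp->head + 1) % SKETCH_QUEUE_LEN;
		if (ev == SKETCH_EV_PLUG)
			sketch_plug_handle(dp);
		else
			sketch_unplug_handle(dp);
		handled++;
	}
	return handled;
}

/* Behaviour before the patch: the notify callback defers to the worker. */
static void sketch_notify_queued(struct sketch_dp *dp, int connected)
{
	if (!dp->link_ready && connected)
		sketch_add_event(dp, SKETCH_EV_PLUG);
	else if (dp->link_ready && !connected)
		sketch_add_event(dp, SKETCH_EV_UNPLUG);
}

/* Behaviour after the patch: the notify callback runs the handler itself. */
static void sketch_notify_direct(struct sketch_dp *dp, int connected)
{
	if (!dp->link_ready && connected)
		sketch_plug_handle(dp);
	else if (dp->link_ready && !connected)
		sketch_unplug_handle(dp);
}
```

In the queued variant, link_ready stays unchanged until the worker runs;
in the direct variant it has already changed when the notify call returns.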

Comments

Abhinav Kumar March 28, 2024, 8:24 p.m. UTC | #1
+ Johan and Bjorn for FYI

On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
> For internal HPD case, hpd_event_thread is created to handle HPD
> interrupts generated by HPD block of DP controller. It converts
> HPD interrupts into events and executed them under hpd_event_thread
> context. For external HPD case, HPD events is delivered by way of
> dp_bridge_hpd_notify() under thread context. Since they are executed
> under thread context already, there is no reason to hand over those
> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
> dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().
> 
> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> ---
>   drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 

Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")

Looks right to me,

Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Abhinav Kumar March 28, 2024, 11:41 p.m. UTC | #2
On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
> On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>
>>
>>
>> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
>>> Quoting Abhinav Kumar (2024-03-28 13:24:34)
>>>> + Johan and Bjorn for FYI
>>>>
>>>> On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
>>>>> For internal HPD case, hpd_event_thread is created to handle HPD
>>>>> interrupts generated by HPD block of DP controller. It converts
>>>>> HPD interrupts into events and executed them under hpd_event_thread
>>>>> context. For external HPD case, HPD events is delivered by way of
>>>>> dp_bridge_hpd_notify() under thread context. Since they are executed
>>>>> under thread context already, there is no reason to hand over those
>>>>> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
>>>>> dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().
>>>>>
>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>>> ---
>>>>>     drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
>>>>>     1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>
>>>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
>>>
>>> Is this a bug fix or an optimization? The commit text doesn't tell me.
>>>
>>
>> I would say both.
>>
>> optimization as it avoids the need to go through the hpd_event thread
>> processing.
>>
>> bug fix because once you go through the hpd event thread processing it
>> exposes and often breaks the already fragile hpd handling state machine
>> which can be avoided in this case.
> 
> Please add a description for the particular issue that was observed
> and how it is fixed by the patch.
> 
> Otherwise consider there to be an implicit NAK for all HPD-related
> patches unless it is a series that moves link training to the enable
> path and drops the HPD state machine completely.
> 
> I really mean it. We should stop beating a dead horse unless there is
> a grave bug that must be fixed.
> 

I think the commit message is explaining the issue well enough.

This was not fixing an issue we observed, so I cannot describe an exact 
failure scenario; it came purely from a code walkthrough.

Like Kuogee wrote, the hpd event thread was there to handle events coming 
out of the hpd_isr for internal hpd cases. For the hpd_notify coming 
from pmic_glink or any other external hpd cases, there is no need to 
put this through the hpd event thread because this will only make things 
worse by exposing the race conditions of the state machine.

Moving link training to enable and removal of the hpd event thread will be 
worked on, but delaying obvious things we can fix does not make sense.

>>
>>>>
>>>> Looks right to me,
>>>>
>>>> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
> 
> 
>
Dmitry Baryshkov March 29, 2024, 12:10 a.m. UTC | #3
On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>
>
>
> On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
> > On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
> >>
> >>
> >>
> >> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
> >>> Quoting Abhinav Kumar (2024-03-28 13:24:34)
> >>>> + Johan and Bjorn for FYI
> >>>>
> >>>> On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
> >>>>> For internal HPD case, hpd_event_thread is created to handle HPD
> >>>>> interrupts generated by HPD block of DP controller. It converts
> >>>>> HPD interrupts into events and executed them under hpd_event_thread
> >>>>> context. For external HPD case, HPD events is delivered by way of
> >>>>> dp_bridge_hpd_notify() under thread context. Since they are executed
> >>>>> under thread context already, there is no reason to hand over those
> >>>>> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
> >>>>> dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().
> >>>>>
> >>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> >>>>> ---
> >>>>>     drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
> >>>>>     1 file changed, 3 insertions(+), 2 deletions(-)
> >>>>>
> >>>>
> >>>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
> >>>
> >>> Is this a bug fix or an optimization? The commit text doesn't tell me.
> >>>
> >>
> >> I would say both.
> >>
> >> optimization as it avoids the need to go through the hpd_event thread
> >> processing.
> >>
> >> bug fix because once you go through the hpd event thread processing it
> >> exposes and often breaks the already fragile hpd handling state machine
> >> which can be avoided in this case.
> >
> > Please add a description for the particular issue that was observed
> > and how it is fixed by the patch.
> >
> > Otherwise consider there to be an implicit NAK for all HPD-related
> > patches unless it is a series that moves link training to the enable
> > path and drops the HPD state machine completely.
> >
> > I really mean it. We should stop beating a dead horse unless there is
> > a grave bug that must be fixed.
> >
>
> I think the commit message is explaining the issue well enough.
>
> This was not fixing any issue we saw to explain you the exact scenario
> of things which happened but this is just from code walkthrough.
>
> Like kuogee wrote, hpd event thread was there so handle events coming
> out of the hpd_isr for internal hpd cases. For the hpd_notify coming
> from pmic_glink or any other extnernal hpd cases, there is no need to
> put this through the hpd event thread because this will only make things
> worse of exposing the race conditions of the state machine.
>
> Moving link training to enable and removal of hpd event thread will be
> worked on but delaying obvious things we can fix does not make sense.
Bjorn Andersson March 29, 2024, 1:46 a.m. UTC | #4
On Thu, Mar 28, 2024 at 02:21:14PM -0700, Abhinav Kumar wrote:
> 
> 
> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
> > Quoting Abhinav Kumar (2024-03-28 13:24:34)
> > > + Johan and Bjorn for FYI
> > > 
> > > On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
> > > > For internal HPD case, hpd_event_thread is created to handle HPD
> > > > interrupts generated by HPD block of DP controller. It converts
> > > > HPD interrupts into events and executed them under hpd_event_thread
> > > > context. For external HPD case, HPD events is delivered by way of
> > > > dp_bridge_hpd_notify() under thread context. Since they are executed
> > > > under thread context already, there is no reason to hand over those
> > > > events to hpd_event_thread. Hence dp_hpd_plug_handle() and
> > > > dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().
> > > > 
> > > > Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> > > > ---
> > > >    drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
> > > >    1 file changed, 3 insertions(+), 2 deletions(-)
> > > > 
> > > 
> > > Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
> > 
> > Is this a bug fix or an optimization? The commit text doesn't tell me.
> > 
> 
> I would say both.
> 
> optimization as it avoids the need to go through the hpd_event thread
> processing.
> 
> bug fix because once you go through the hpd event thread processing it
> exposes and often breaks the already fragile hpd handling state machine
> which can be avoided in this case.
> 

It removes the main users of the thread, but there are still code paths
which will post events on the thread.

I think I like the direction this is taking, but does it really fix the
whole problem, or just patch one case?


PS. Please read go/upstream and switch to b4, to avoid some practical
issues with the way you posted this patch.

Thanks,
Bjorn

> > > 
> > > Looks right to me,
> > > 
> > > Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Abhinav Kumar March 29, 2024, 2:15 a.m. UTC | #5
On 3/28/2024 5:10 PM, Dmitry Baryshkov wrote:
> On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>
>>
>>
>> On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
>>> On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
>>>>> Quoting Abhinav Kumar (2024-03-28 13:24:34)
>>>>>> + Johan and Bjorn for FYI
>>>>>>
>>>>>> On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
>>>>>>> For internal HPD case, hpd_event_thread is created to handle HPD
>>>>>>> interrupts generated by HPD block of DP controller. It converts
>>>>>>> HPD interrupts into events and executed them under hpd_event_thread
>>>>>>> context. For external HPD case, HPD events is delivered by way of
>>>>>>> dp_bridge_hpd_notify() under thread context. Since they are executed
>>>>>>> under thread context already, there is no reason to hand over those
>>>>>>> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
>>>>>>> dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().
>>>>>>>
>>>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>>>>> ---
>>>>>>>      drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
>>>>>>>      1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>
>>>>>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
>>>>>
>>>>> Is this a bug fix or an optimization? The commit text doesn't tell me.
>>>>>
>>>>
>>>> I would say both.
>>>>
>>>> optimization as it avoids the need to go through the hpd_event thread
>>>> processing.
>>>>
>>>> bug fix because once you go through the hpd event thread processing it
>>>> exposes and often breaks the already fragile hpd handling state machine
>>>> which can be avoided in this case.
>>>
>>> Please add a description for the particular issue that was observed
>>> and how it is fixed by the patch.
>>>
>>> Otherwise consider there to be an implicit NAK for all HPD-related
>>> patches unless it is a series that moves link training to the enable
>>> path and drops the HPD state machine completely.
>>>
>>> I really mean it. We should stop beating a dead horse unless there is
>>> a grave bug that must be fixed.
>>>
>>
>> I think the commit message is explaining the issue well enough.
>>
>> This was not fixing any issue we saw to explain you the exact scenario
>> of things which happened but this is just from code walkthrough.
>>
>> Like kuogee wrote, hpd event thread was there so handle events coming
>> out of the hpd_isr for internal hpd cases. For the hpd_notify coming
>> from pmic_glink or any other extnernal hpd cases, there is no need to
>> put this through the hpd event thread because this will only make things
>> worse of exposing the race conditions of the state machine.
>>
>> Moving link training to enable and removal of hpd event thread will be
>> worked on but delaying obvious things we can fix does not make sense.
> 
>  From the commit message this feels like an optimisation rather than a
> fix. And granted the fragility of the HPD state machine, I'd prefer to
> stay away from optimisations. As far as I understood from the history
> of the last revert, we'd better make sure that HPD handling goes only
> through the HPD event thread.
> 

I think you are mixing the two. We tried to send the events through 
DRM's hpd_notify, which ended up in a bad way, and btw, that's still not 
resolved. Even though I have seen reports that things are fine with the 
revert, we consistently end up in a disconnected state with all the 
reverts and fixes in our x1e80100 DP setup.

I plan to investigate that issue properly in the next week and try to 
make some sense of it all.

In fact, this patch is removing one more user of the hpd event thread, 
which is the direction in which we all want to head.

On whether this is an optimization or a bug fix: I think by avoiding the hpd 
event thread (which should never have been used for hpd_notify updates, 
hence a bug) we are avoiding the possibility of more race conditions.

So, this has my R-b and it holds. Up to you.
Abhinav Kumar March 29, 2024, 2:37 a.m. UTC | #6
On 3/28/2024 6:46 PM, Bjorn Andersson wrote:
> On Thu, Mar 28, 2024 at 02:21:14PM -0700, Abhinav Kumar wrote:
>>
>>
>> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
>>> Quoting Abhinav Kumar (2024-03-28 13:24:34)
>>>> + Johan and Bjorn for FYI
>>>>
>>>> On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
>>>>> For internal HPD case, hpd_event_thread is created to handle HPD
>>>>> interrupts generated by HPD block of DP controller. It converts
>>>>> HPD interrupts into events and executed them under hpd_event_thread
>>>>> context. For external HPD case, HPD events is delivered by way of
>>>>> dp_bridge_hpd_notify() under thread context. Since they are executed
>>>>> under thread context already, there is no reason to hand over those
>>>>> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
>>>>> dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().
>>>>>
>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>>> ---
>>>>>     drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
>>>>>     1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>
>>>>
>>>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
>>>
>>> Is this a bug fix or an optimization? The commit text doesn't tell me.
>>>
>>
>> I would say both.
>>
>> optimization as it avoids the need to go through the hpd_event thread
>> processing.
>>
>> bug fix because once you go through the hpd event thread processing it
>> exposes and often breaks the already fragile hpd handling state machine
>> which can be avoided in this case.
>>
> 
> It removes the main users of the thread, but there's still code paths
> which will post events on the thread.
> 
> I think I like the direction this is taking, but does it really fix the
> whole problem, or just patch one case?
> 

So Kuogee's idea behind this is that non-hpd_isr events need not go through 
the event thread at all.

We did not run into any special scenario or issue without this. It was a 
code walkthrough fix.

> 
> PS. Please read go/upstream and switch to b4, to avoid some practical
> issues with the way you posted this patch.
> 
> Thanks,
> Bjorn
> 

Just to elaborate on the practical issues so that developers know what you 
encountered:

-> no need for v1 on the PATCH
-> somehow this patch was linked "in-reply-to" another patch 
https://lore.kernel.org/all/1711656246-3483-2-git-send-email-quic_khsieh@quicinc.com/ 
. This is quite strange and I am not sure how it happened, but we will 
double check whether we did something wrong here.

Thanks for sharing these.


>>>>
>>>> Looks right to me,
>>>>
>>>> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Dmitry Baryshkov March 29, 2024, 3:23 a.m. UTC | #7
On Fri, 29 Mar 2024 at 04:16, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>
>
>
> On 3/28/2024 5:10 PM, Dmitry Baryshkov wrote:
> > On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
> >>
> >>
> >>
> >> On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
> >>> On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
> >>>>> Quoting Abhinav Kumar (2024-03-28 13:24:34)
> >>>>>> + Johan and Bjorn for FYI
> >>>>>>
> >>>>>> On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
> >>>>>>> For internal HPD case, hpd_event_thread is created to handle HPD
> >>>>>>> interrupts generated by HPD block of DP controller. It converts
> >>>>>>> HPD interrupts into events and executed them under hpd_event_thread
> >>>>>>> context. For external HPD case, HPD events is delivered by way of
> >>>>>>> dp_bridge_hpd_notify() under thread context. Since they are executed
> >>>>>>> under thread context already, there is no reason to hand over those
> >>>>>>> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
> >>>>>>> dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().
> >>>>>>>
> >>>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
> >>>>>>> ---
> >>>>>>>      drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
> >>>>>>>      1 file changed, 3 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>
> >>>>>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
> >>>>>
> >>>>> Is this a bug fix or an optimization? The commit text doesn't tell me.
> >>>>>
> >>>>
> >>>> I would say both.
> >>>>
> >>>> optimization as it avoids the need to go through the hpd_event thread
> >>>> processing.
> >>>>
> >>>> bug fix because once you go through the hpd event thread processing it
> >>>> exposes and often breaks the already fragile hpd handling state machine
> >>>> which can be avoided in this case.
> >>>
> >>> Please add a description for the particular issue that was observed
> >>> and how it is fixed by the patch.
> >>>
> >>> Otherwise consider there to be an implicit NAK for all HPD-related
> >>> patches unless it is a series that moves link training to the enable
> >>> path and drops the HPD state machine completely.
> >>>
> >>> I really mean it. We should stop beating a dead horse unless there is
> >>> a grave bug that must be fixed.
> >>>
> >>
> >> I think the commit message is explaining the issue well enough.
> >>
> >> This was not fixing any issue we saw to explain you the exact scenario
> >> of things which happened but this is just from code walkthrough.
> >>
> >> Like kuogee wrote, hpd event thread was there so handle events coming
> >> out of the hpd_isr for internal hpd cases. For the hpd_notify coming
> >> from pmic_glink or any other extnernal hpd cases, there is no need to
> >> put this through the hpd event thread because this will only make things
> >> worse of exposing the race conditions of the state machine.
> >>
> >> Moving link training to enable and removal of hpd event thread will be
> >> worked on but delaying obvious things we can fix does not make sense.
> >
> >  From the commit message this feels like an optimisation rather than a
> > fix. And granted the fragility of the HPD state machine, I'd prefer to
> > stay away from optimisations. As far as I understood from the history
> > of the last revert, we'd better make sure that HPD handling goes only
> > through the HPD event thread.
> >
>
> I think you are mixing the two. We tried to send the events through
> DRM's hpd_notify which ended up in a bad way and btw, thats still not
> resolved even though I have seen reports that things are fine with the
> revert, we are consistently able to see us ending up in a disconnected
> state with all the reverts and fixes in our x1e80100 DP setup.
>
> I plan to investigate that issue properly in the next week and try to
> make some sense of it all.
>
> In fact, this patch is removing one more user of the hpd event thread
> which is the direction in which we all want to head towards.

As I stated earlier, from my point of view it doesn't make sense to
rework the HPD thread in small steps.

> On whether this is an optimization or a bug fix. I think by avoiding hpd
> event thread (which should have never been used for hpd_notify updates,
> hence a bug) we are avoiding the possibility of more race conditions.

I think that the HPD event thread serializes handling of events, so
avoiding it increases the possibility of a race condition.

>
> So, this has my R-b and it holds. Upto you.

I'd wait for a proper description of the issue that was observed and
how it is solved by this patch.
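
For illustration only: a minimal userspace C sketch of the ordering concern
raised above, under strong assumptions. Every sketch_* name is hypothetical,
and the handler here applies events unconditionally; the real handlers check
the HPD state and take locks, so this interleaving is not claimed to occur in
the driver. It only shows why a single queue preserves arrival order, while
mixing queued and direct handling may not.

```c
#include <stddef.h>

/* Hypothetical event IDs for the sketch. */
enum sketch_event { SKETCH_EV_PLUG, SKETCH_EV_UNPLUG };

#define SKETCH_QUEUE_LEN 8

struct sketch_dp {
	int link_ready;                            /* last state applied */
	enum sketch_event queue[SKETCH_QUEUE_LEN]; /* ring buffer of pending events */
	size_t head, tail;
};

/* Record an event for later handling. Returns 0 on success, -1 if full. */
static int sketch_add_event(struct sketch_dp *dp, enum sketch_event ev)
{
	size_t next = (dp->tail + 1) % SKETCH_QUEUE_LEN;

	if (next == dp->head)
		return -1;
	dp->queue[dp->tail] = ev;
	dp->tail = next;
	return 0;
}

/* Assumption: the handler applies the event unconditionally. */
static void sketch_handle(struct sketch_dp *dp, enum sketch_event ev)
{
	dp->link_ready = (ev == SKETCH_EV_PLUG);
}

/* Worker pass: handle everything pending, oldest first. */
static void sketch_drain(struct sketch_dp *dp)
{
	while (dp->head != dp->tail) {
		sketch_handle(dp, dp->queue[dp->head]);
		dp->head = (dp->head + 1) % SKETCH_QUEUE_LEN;
	}
}

/* Plug then unplug, both queued: handled in arrival order. */
static int sketch_all_queued(void)
{
	struct sketch_dp dp = {0};

	sketch_add_event(&dp, SKETCH_EV_PLUG);
	sketch_add_event(&dp, SKETCH_EV_UNPLUG);
	sketch_drain(&dp);
	return dp.link_ready; /* unplug was applied last */
}

/* Plug queued but still pending, unplug handled directly before the drain. */
static int sketch_mixed(void)
{
	struct sketch_dp dp = {0};

	sketch_add_event(&dp, SKETCH_EV_PLUG);  /* arrives first, waits in queue */
	sketch_handle(&dp, SKETCH_EV_UNPLUG);   /* arrives second, runs first */
	sketch_drain(&dp);                      /* stale plug is applied last */
	return dp.link_ready;
}
```

With both events queued the final state reflects the unplug; in the mixed
case the stale plug is applied last, so the final state no longer matches
the order in which the events arrived.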
Abhinav Kumar March 29, 2024, 5:47 a.m. UTC | #8
On 3/28/2024 8:23 PM, Dmitry Baryshkov wrote:
> On Fri, 29 Mar 2024 at 04:16, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>
>>
>>
>> On 3/28/2024 5:10 PM, Dmitry Baryshkov wrote:
>>> On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
>>>>> On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
>>>>>>> Quoting Abhinav Kumar (2024-03-28 13:24:34)
>>>>>>>> + Johan and Bjorn for FYI
>>>>>>>>
>>>>>>>> On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
>>>>>>>>> For internal HPD case, hpd_event_thread is created to handle HPD
>>>>>>>>> interrupts generated by HPD block of DP controller. It converts
>>>>>>>>> HPD interrupts into events and executed them under hpd_event_thread
>>>>>>>>> context. For external HPD case, HPD events is delivered by way of
>>>>>>>>> dp_bridge_hpd_notify() under thread context. Since they are executed
>>>>>>>>> under thread context already, there is no reason to hand over those
>>>>>>>>> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
>>>>>>>>> dp_hpd_unplug_hanlde() are called directly at dp_bridge_hpd_notify().
>>>>>>>>>
>>>>>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>>>>>>> ---
>>>>>>>>>       drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
>>>>>>>>>       1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>
>>>>>>>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
>>>>>>>
>>>>>>> Is this a bug fix or an optimization? The commit text doesn't tell me.
>>>>>>>
>>>>>>
>>>>>> I would say both.
>>>>>>
>>>>>> optimization as it avoids the need to go through the hpd_event thread
>>>>>> processing.
>>>>>>
>>>>>> bug fix because once you go through the hpd event thread processing it
>>>>>> exposes and often breaks the already fragile hpd handling state machine
>>>>>> which can be avoided in this case.
>>>>>
>>>>> Please add a description for the particular issue that was observed
>>>>> and how it is fixed by the patch.
>>>>>
>>>>> Otherwise consider there to be an implicit NAK for all HPD-related
>>>>> patches unless it is a series that moves link training to the enable
>>>>> path and drops the HPD state machine completely.
>>>>>
>>>>> I really mean it. We should stop beating a dead horse unless there is
>>>>> a grave bug that must be fixed.
>>>>>
>>>>
>>>> I think the commit message is explaining the issue well enough.
>>>>
>>>> This was not fixing any issue we saw to explain you the exact scenario
>>>> of things which happened but this is just from code walkthrough.
>>>>
>>>> Like kuogee wrote, hpd event thread was there so handle events coming
>>>> out of the hpd_isr for internal hpd cases. For the hpd_notify coming
>>>> from pmic_glink or any other extnernal hpd cases, there is no need to
>>>> put this through the hpd event thread because this will only make things
>>>> worse of exposing the race conditions of the state machine.
>>>>
>>>> Moving link training to enable and removal of hpd event thread will be
>>>> worked on but delaying obvious things we can fix does not make sense.
>>>
>>>   From the commit message this feels like an optimisation rather than a
>>> fix. And granted the fragility of the HPD state machine, I'd prefer to
>>> stay away from optimisations. As far as I understood from the history
>>> of the last revert, we'd better make sure that HPD handling goes only
>>> through the HPD event thread.
>>>
>>
>> I think you are mixing the two. We tried to send the events through
>> DRM's hpd_notify which ended up in a bad way and btw, thats still not
>> resolved even though I have seen reports that things are fine with the
>> revert, we are consistently able to see us ending up in a disconnected
>> state with all the reverts and fixes in our x1e80100 DP setup.
>>
>> I plan to investigate that issue properly in the next week and try to
>> make some sense of it all.
>>
>> In fact, this patch is removing one more user of the hpd event thread
>> which is the direction in which we all want to head towards.
> 
> As I stated earlier, from my point of view it doesn't make sense to
> rework the HPD thread in small steps.
> 
>> On whether this is an optimization or a bug fix. I think by avoiding hpd
>> event thread (which should have never been used for hpd_notify updates,
>> hence a bug) we are avoiding the possibility of more race conditions.
> 
> I think that the HPD event thread serializes handling of events, so
> avoiding it increases the possibility of a race condition.
> 
>>
>> So, this has my R-b and it holds. Upto you.
> 
> I'd wait for a proper description of the issue that was observed and
> how it is solved by this patch.
> 

This was a code walkthrough fix, as I wrote a few times. If there is no 
merit in pushing this, let's ignore it and stop discussing.
Abhinav Kumar April 6, 2024, 12:04 a.m. UTC | #9
On 3/28/2024 10:47 PM, Abhinav Kumar wrote:
> 
> 
> On 3/28/2024 8:23 PM, Dmitry Baryshkov wrote:
>> On Fri, 29 Mar 2024 at 04:16, Abhinav Kumar 
>> <quic_abhinavk@quicinc.com> wrote:
>>>
>>>
>>>
>>> On 3/28/2024 5:10 PM, Dmitry Baryshkov wrote:
>>>> On Fri, 29 Mar 2024 at 01:42, Abhinav Kumar 
>>>> <quic_abhinavk@quicinc.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 3/28/2024 3:50 PM, Dmitry Baryshkov wrote:
>>>>>> On Thu, 28 Mar 2024 at 23:21, Abhinav Kumar 
>>>>>> <quic_abhinavk@quicinc.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 3/28/2024 1:58 PM, Stephen Boyd wrote:
>>>>>>>> Quoting Abhinav Kumar (2024-03-28 13:24:34)
>>>>>>>>> + Johan and Bjorn for FYI
>>>>>>>>>
>>>>>>>>> On 3/28/2024 1:04 PM, Kuogee Hsieh wrote:
>>>>>>>>>> For internal HPD case, hpd_event_thread is created to handle HPD
>>>>>>>>>> interrupts generated by HPD block of DP controller. It converts
>>>>>>>>>> HPD interrupts into events and executed them under 
>>>>>>>>>> hpd_event_thread
>>>>>>>>>> context. For external HPD case, HPD events is delivered by way of
>>>>>>>>>> dp_bridge_hpd_notify() under thread context. Since they are 
>>>>>>>>>> executed
>>>>>>>>>> under thread context already, there is no reason to hand over 
>>>>>>>>>> those
>>>>>>>>>> events to hpd_event_thread. Hence dp_hpd_plug_handle() and
>>>>>>>>>> dp_hpd_unplug_hanlde() are called directly at 
>>>>>>>>>> dp_bridge_hpd_notify().
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>>>>>>>>>> ---
>>>>>>>>>>       drivers/gpu/drm/msm/dp/dp_display.c | 5 +++--
>>>>>>>>>>       1 file changed, 3 insertions(+), 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Fixes: 542b37efc20e ("drm/msm/dp: Implement hpd_notify()")
>>>>>>>>
>>>>>>>> Is this a bug fix or an optimization? The commit text doesn't 
>>>>>>>> tell me.
>>>>>>>>
>>>>>>>
>>>>>>> I would say both.
>>>>>>>
>>>>>>> optimization as it avoids the need to go through the hpd_event 
>>>>>>> thread
>>>>>>> processing.
>>>>>>>
>>>>>>> bug fix because once you go through the hpd event thread 
>>>>>>> processing it
>>>>>>> exposes and often breaks the already fragile hpd handling state 
>>>>>>> machine
>>>>>>> which can be avoided in this case.
>>>>>>
>>>>>> Please add a description for the particular issue that was observed
>>>>>> and how it is fixed by the patch.
>>>>>>
>>>>>> Otherwise consider there to be an implicit NAK for all HPD-related
>>>>>> patches unless it is a series that moves link training to the enable
>>>>>> path and drops the HPD state machine completely.
>>>>>>
>>>>>> I really mean it. We should stop beating a dead horse unless there is
>>>>>> a grave bug that must be fixed.
>>>>>>
>>>>>
>>>>> I think the commit message is explaining the issue well enough.
>>>>>
>>>>> This was not fixing any issue we saw to explain you the exact scenario
>>>>> of things which happened but this is just from code walkthrough.
>>>>>
>>>>> Like kuogee wrote, hpd event thread was there so handle events coming
>>>>> out of the hpd_isr for internal hpd cases. For the hpd_notify coming
>>>>> from pmic_glink or any other extnernal hpd cases, there is no need to
>>>>> put this through the hpd event thread because this will only make 
>>>>> things
>>>>> worse of exposing the race conditions of the state machine.
>>>>>
>>>>> Moving link training to enable and removal of hpd event thread will be
>>>>> worked on but delaying obvious things we can fix does not make sense.
>>>>
>>>>   From the commit message this feels like an optimisation rather than a
>>>> fix. And granted the fragility of the HPD state machine, I'd prefer to
>>>> stay away from optimisations. As far as I understood from the history
>>>> of the last revert, we'd better make sure that HPD handling goes only
>>>> through the HPD event thread.
>>>>
>>>
>>> I think you are mixing the two. We tried to send the events through
>>> DRM's hpd_notify which ended up in a bad way and btw, thats still not
>>> resolved even though I have seen reports that things are fine with the
>>> revert, we are consistently able to see us ending up in a disconnected
>>> state with all the reverts and fixes in our x1e80100 DP setup.
>>>
>>> I plan to investigate that issue properly in the next week and try to
>>> make some sense of it all.
>>>
>>> In fact, this patch is removing one more user of the hpd event thread
>>> which is the direction in which we all want to head towards.
>>
>> As I stated earlier, from my point of view it doesn't make sense to
>> rework the HPD thread in small steps.
>>
>>> On whether this is an optimization or a bug fix. I think by avoiding hpd
>>> event thread (which should have never been used for hpd_notify updates,
>>> hence a bug) we are avoiding the possibility of more race conditions.
>>
>> I think that the HPD event thread serializes handling of events, so
>> avoiding it increases the possibility of a race condition.
>>
>>>
>>> So, this has my R-b and it holds. Upto you.
>>
>> I'd wait for a proper description of the issue that was observed and
>> how it is solved by this patch.
>>
> 
> This was a code walkthrough fix as I wrote a few times. If there no 
> merit in pushing this, lets ignore it and stop discussing.
> 

Ok, so after we debugged the HPD issue, we have found the issue and 
why this change will actually help. I am going to post a V2 with more 
details in the commit text. We can discuss after that.

Patch

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 94dd60f..0476ad9 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -1706,7 +1706,8 @@  void dp_bridge_hpd_notify(struct drm_bridge *bridge,
 	status &= ~0x80000000;
 
 	if (!dp_display->link_ready && status == connector_status_connected)
-		dp_add_event(dp, EV_HPD_PLUG_INT, 0, 0);
+		dp_hpd_plug_handle(dp, 0);
 	else if (dp_display->link_ready && status == connector_status_disconnected)
-		dp_add_event(dp, EV_HPD_UNPLUG_INT, 0, 0);
+		dp_hpd_unplug_handle(dp, 0);
+
 }