Revert "venus: firmware: Correct non-pix start and end addresses"

Message ID 20230207102254.1446461-1-javierm@redhat.com
State Accepted
Commit f95b8ea79c47c0ad3d18f45ad538f9970e414d1f
Series Revert "venus: firmware: Correct non-pix start and end addresses"

Commit Message

Javier Martinez Canillas Feb. 7, 2023, 10:22 a.m. UTC
This reverts commit a837e5161cfffbb3242cc0eb574f8bf65fd32640, which broke
probing of the venus driver, at least on the SC7180 SoC HP X2 Chromebook:

  [   11.455782] qcom-venus aa00000.video-codec: Adding to iommu group 11
  [   11.506980] qcom-venus aa00000.video-codec: non legacy binding
  [   12.143432] qcom-venus aa00000.video-codec: failed to reset venus core
  [   12.156440] qcom-venus: probe of aa00000.video-codec failed with error -110

Matthias Kaehlcke also reported that the same change caused a regression on
SC7180 and SC7280 that prevents AOSS from entering sleep mode during system
suspend. So let's revert this commit for now to fix both issues.

Fixes: a837e5161cff ("venus: firmware: Correct non-pix start and end addresses")
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
---

 drivers/media/platform/qcom/venus/firmware.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
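
A note on the `-110` in the probe log above: on arm64 Linux, which uses the
generic errno table, 110 is `ETIMEDOUT`, so the probe gave up because something
timed out (the log line just before it points at the core reset). The sketch
below only illustrates decoding such a kernel-style return value in user space.
The helper names `is_timeout()` and `describe_ret()` are made up for this note
and are not part of the venus driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if a kernel-style negative return code is a timeout. */
static int is_timeout(int ret)
{
	return ret == -ETIMEDOUT;
}

/* Hypothetical helper: formats "<ret> (<strerror text>)" into buf. */
static void describe_ret(int ret, char *buf, size_t len)
{
	assert(ret < 0);	/* kernel-style codes are negative errno values */
	snprintf(buf, len, "%d (%s)", ret, strerror(-ret));
}
```

Calling `describe_ret(-ETIMEDOUT, buf, sizeof(buf))` fills `buf` with the
numeric code followed by the C library's description of the error.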

Comments

Thorsten Leemhuis Feb. 15, 2023, 10:53 a.m. UTC | #1
On 11.02.23 15:27, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 10.02.23 11:07, Javier Martinez Canillas wrote:
>> On 2/10/23 10:22, Vikash Garodia wrote:
>>
>>>> So what should we do about this folks? Since not allowing the driver to probe on
>>>> at least SC7180 is a quite serious regression, can we revert for now until a proper
>>>> fix is figured out?
>>>
>>> I am able to repro this issue on sc7180 and discussing with firmware team on the cause
>>> of reset failure. The original patch was raised for fixing rare SMMU faults during warm
>>> boot of video hardware. Hence looking to understand the regressing part before we
>>> proceed to revert.
>>
>> Great, if you are working on a proper fix then that would be much better indeed.
> 
> Yeah, that's great, but OTOH: there is almost certainly just one week
> before 6.2 will be released. Ideally this should be fixed by then.
> Vikash, do you think that's in the cards? If not: why not revert this
> now to make sure 6.2 works fine?

Hmm, no reply. And we meanwhile have Wednesday and 6.2 is almost
certainly going to be out on Sunday. And the problem was called "a quite
serious regression" above. So why not quickly fix this with the revert,
as proposed earlier?

Vikash? Javier?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot ignore-activity
Javier Martinez Canillas Feb. 15, 2023, 10:57 a.m. UTC | #2
On Wed, Feb 15, 2023 at 11:53 AM Linux regression tracking (Thorsten
Leemhuis) <regressions@leemhuis.info> wrote:
>
> On 11.02.23 15:27, Linux regression tracking (Thorsten Leemhuis) wrote:
> > On 10.02.23 11:07, Javier Martinez Canillas wrote:
> >> On 2/10/23 10:22, Vikash Garodia wrote:
> >>
> >>>> So what should we do about this folks? Since not allowing the driver to probe on
> >>>> at least SC7180 is a quite serious regression, can we revert for now until a proper
> >>>> fix is figured out?
> >>>
> >>> I am able to repro this issue on sc7180 and discussing with firmware team on the cause
> >>> of reset failure. The original patch was raised for fixing rare SMMU faults during warm
> >>> boot of video hardware. Hence looking to understand the regressing part before we
> >>> proceed to revert.
> >>
> >> Great, if you are working on a proper fix then that would be much better indeed.
> >
> > Yeah, that's great, but OTOH: there is almost certainly just one week
> > before 6.2 will be released. Ideally this should be fixed by then.
> > Vikash, do you think that's in the cards? If not: why not revert this
> > now to make sure 6.2 works fine?
>
> Hmm, no reply. And we meanwhile have Wednesday and 6.2 is almost
> certainly going to be out on Sunday. And the problem was called "a quite
> serious regression" above. So why not quickly fix this with the revert,
> as proposed earlier?
>
> Vikash? Javier?
>

I agree with you, that we should land this revert and then properly
fix the page fault issue in v6.3.

But it's not my call, the v4l2/media folks have to decide that.
Thorsten Leemhuis Feb. 15, 2023, 1:18 p.m. UTC | #3
On 15.02.23 11:57, Javier Martinez Canillas wrote:
> On Wed, Feb 15, 2023 at 11:53 AM Linux regression tracking (Thorsten
> Leemhuis) <regressions@leemhuis.info> wrote:
>> On 11.02.23 15:27, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 10.02.23 11:07, Javier Martinez Canillas wrote:
>>>> On 2/10/23 10:22, Vikash Garodia wrote:
>>>>
>>>>>> So what should we do about this folks? Since not allowing the driver to probe on
>>>>>> at least SC7180 is a quite serious regression, can we revert for now until a proper
>>>>>> fix is figured out?
>>>>> I am able to repro this issue on sc7180 and discussing with firmware team on the cause
>>>>> of reset failure. The original patch was raised for fixing rare SMMU faults during warm
>>>>> boot of video hardware. Hence looking to understand the regressing part before we
>>>>> proceed to revert.
>>>> Great, if you are working on a proper fix then that would be much better indeed.
>>> Yeah, that's great, but OTOH: there is almost certainly just one week
>>> before 6.2 will be released. Ideally this should be fixed by then.
>>> Vikash, do you think that's in the cards? If not: why not revert this
>>> now to make sure 6.2 works fine?
>> Hmm, no reply. And we meanwhile have Wednesday and 6.2 is almost
>> certainly going to be out on Sunday. And the problem was called "a quite
>> serious regression" above. So why not quickly fix this with the revert,
>> as proposed earlier?
>> Vikash? Javier?
>
> I agree with you, that we should land this revert and then properly
> fix the page fault issue in v6.3.
> 
> But it's not my call, the v4l2/media folks have to decide that.

In that case: Mauro, what's your opinion here?

Thread starts here:
https://lore.kernel.org/lkml/20230207102254.1446461-1-javierm@redhat.com/

Regression report:
https://lore.kernel.org/lkml/Y9LSMap%2BjRxbtpC8@google.com/

Ciao, Thorsten
Thorsten Leemhuis Feb. 21, 2023, 3:03 p.m. UTC | #4
On 15.02.23 14:18, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 15.02.23 11:57, Javier Martinez Canillas wrote:
>> On Wed, Feb 15, 2023 at 11:53 AM Linux regression tracking (Thorsten
>> Leemhuis) <regressions@leemhuis.info> wrote:
>>> On 11.02.23 15:27, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 10.02.23 11:07, Javier Martinez Canillas wrote:
>>>>> On 2/10/23 10:22, Vikash Garodia wrote:
>>>>>
>>>>>>> So what should we do about this folks? Since not allowing the driver to probe on
>>>>>>> at least SC7180 is a quite serious regression, can we revert for now until a proper
>>>>>>> fix is figured out?
>>>>>> I am able to repro this issue on sc7180 and discussing with firmware team on the cause
>>>>>> of reset failure. The original patch was raised for fixing rare SMMU faults during warm
>>>>>> boot of video hardware. Hence looking to understand the regressing part before we
>>>>>> proceed to revert.
>>>>> Great, if you are working on a proper fix then that would be much better indeed.
>>>> Yeah, that's great, but OTOH: there is almost certainly just one week
>>>> before 6.2 will be released. Ideally this should be fixed by then.
>>>> Vikash, do you think that's in the cards? If not: why not revert this
>>>> now to make sure 6.2 works fine?
>>> Hmm, no reply. And we meanwhile have Wednesday and 6.2 is almost
>>> certainly going to be out on Sunday. And the problem was called "a quite
>>> serious regression" above. So why not quickly fix this with the revert,
>>> as proposed earlier?
>>> Vikash? Javier?
>>
>> I agree with you, that we should land this revert and then properly
>> fix the page fault issue in v6.3.
>>
>> But it's not my call, the v4l2/media folks have to decide that.
> 
> In that case: Mauro, what's your opinion here?
> 
> Thread starts here:
> https://lore.kernel.org/lkml/20230207102254.1446461-1-javierm@redhat.com/
> 
> Regression report:
> https://lore.kernel.org/lkml/Y9LSMap%2BjRxbtpC8@google.com/

No reply from Mauro and Linus chose to not apply the revert I pointed
him to. That at this point leads to the question:

Vikash, did you or somebody else make any progress to fix this properly?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

#regzbot poke
Vikash Garodia Feb. 23, 2023, 5:45 a.m. UTC | #5
Hi All,

>-----Original Message-----
>From: Thorsten Leemhuis <regressions@leemhuis.info>
>Sent: Tuesday, February 21, 2023 8:33 PM
>To: Vikash Garodia <vgarodia@qti.qualcomm.com>
>Cc: linux-kernel@vger.kernel.org; mka@chromium.org; Albert Esteve
><aesteve@redhat.com>; stanimir.varbanov@linaro.org; Enric Balletbo i Serra
><eballetb@redhat.com>; Andy Gross <agross@kernel.org>; Bjorn Andersson
><andersson@kernel.org>; Konrad Dybcio <konrad.dybcio@linaro.org>; Stanimir
>Varbanov <stanimir.k.varbanov@gmail.com>; Vikash Garodia (QUIC)
><quic_vgarodia@quicinc.com>; linux-arm-msm@vger.kernel.org;
>linux-media@vger.kernel.org; Fritz Koenig <frkoenig@google.com>; Dikshita Agarwal
>(QUIC) <quic_dikshita@quicinc.com>; Rajeshwar Kurapaty (QUIC)
><quic_rkurapat@quicinc.com>; Javier Martinez Canillas <javierm@redhat.com>;
>Linux regressions mailing list <regressions@lists.linux.dev>; Mauro Carvalho
>Chehab <mchehab@kernel.org>
>Subject: Re: [PATCH] Revert "venus: firmware: Correct non-pix start and end
>addresses"
>
>On 15.02.23 14:18, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 15.02.23 11:57, Javier Martinez Canillas wrote:
>>> On Wed, Feb 15, 2023 at 11:53 AM Linux regression tracking (Thorsten
>>> Leemhuis) <regressions@leemhuis.info> wrote:
>>>> On 11.02.23 15:27, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>> On 10.02.23 11:07, Javier Martinez Canillas wrote:
>>>>>> On 2/10/23 10:22, Vikash Garodia wrote:
>>>>>>
>>>>>>>> So what should we do about this folks? Since not allowing the
>>>>>>>> driver to probe on at least SC7180 is a quite serious
>>>>>>>> regression, can we revert for now until a proper fix is figured out?
>>>>>>> I am able to repro this issue on sc7180 and discussing with
>>>>>>> firmware team on the cause of reset failure. The original patch
>>>>>>> was raised for fixing rare SMMU faults during warm boot of video
>>>>>>> hardware. Hence looking to understand the regressing part before we
>>>>>>> proceed to revert.
>>>>>> Great, if you are working on a proper fix then that would be much
>>>>>> better indeed.
>>>>> Yeah, that's great, but OTOH: there is almost certainly just one
>>>>> week before 6.2 will be released. Ideally this should be fixed by then.
>>>>> Vikash, do you think that's in the cards? If not: why not revert
>>>>> this now to make sure 6.2 works fine?
>>>> Hmm, no reply. And we meanwhile have Wednesday and 6.2 is almost
>>>> certainly going to be out on Sunday. And the problem was called "a
>>>> quite serious regression" above. So why not quickly fix this with
>>>> the revert, as proposed earlier?
>>>> Vikash? Javier?
>>>
>>> I agree with you, that we should land this revert and then properly
>>> fix the page fault issue in v6.3.
>>>
>>> But it's not my call, the v4l2/media folks have to decide that.
>>
>> In that case: Mauro, what's your opinion here?
>>
>> Thread starts here:
>> https://lore.kernel.org/lkml/20230207102254.1446461-1-javierm@redhat.com/
>>
>> Regression report:
>> https://lore.kernel.org/lkml/Y9LSMap%2BjRxbtpC8@google.com/
>
>No reply from Mauro and Linus chose to not apply the revert I pointed him to.
>That at this point leads to the question:
>
>Vikash, did you or somebody else make any progress to fix this properly?

We tried different settings for the registers and arrived at the conclusion that
the original configuration was proper. There is no need to explicitly configure
the secure non-pixel region when there is no support for the use case. So, in summary,
we are good to have the revert.

Stan, could you please help with the revert and a pull request having this revert
along with other pending changes?

>Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>--
>Everything you wanna know about Linux kernel regression tracking:
>https://linux-regtracking.leemhuis.info/about/#tldr
>If I did something stupid, please tell me, as explained on that page.
>
>#regzbot poke
Javier Martinez Canillas Feb. 23, 2023, 8:05 a.m. UTC | #6
Vikash Garodia <vgarodia@qti.qualcomm.com> writes:

Hello Vikash,

> Hi All,
>

[...]

>>
>>No reply from Mauro and Linus chose to not apply the revert I pointed him to.
>>That at this point leads to the question:
>>
>>Vikash, did you or somebody else make any progress to fix this properly?
>
> We tried different settings for the registers and arrived at the conclusion that
> the original configuration was proper. There is no need to explicitly configure
> the secure non-pixel region when there is no support for the use case. So, in summary,
> we are good to have the revert.
>

Perfect. Thanks a lot for looking at this.

> Stan, could you please help with the revert and a pull request having this revert
> along with other pending changes?
>

Other fix posted is "media: venus: dec: Fix capture formats enumeration order":

https://patchwork.kernel.org/project/linux-media/patch/20230210081835.2054482-1-javierm@redhat.com/
Javier Martinez Canillas Feb. 28, 2023, 4:03 p.m. UTC | #7
Javier Martinez Canillas <javierm@redhat.com> writes:

> Vikash Garodia <vgarodia@qti.qualcomm.com> writes:
>
> Hello Vikash,
>
>> Hi All,
>>
>
> [...]
>
>>>
>>>No reply from Mauro and Linus chose to not apply the revert I pointed him to.
>>>That at this point leads to the question:
>>>
>>>Vikash, did you or somebody else make any progress to fix this properly?
>>
>> We tried different settings for the registers and arrived at the conclusion that
>> the original configuration was proper. There is no need to explicitly configure
>> the secure non-pixel region when there is no support for the use case. So, in summary,
>> we are good to have the revert.
>>
>
> Perfect. Thanks a lot for looking at this.
>
>> Stan, could you please help with the revert and a pull request having this revert
>> along with other pending changes?
>>
>
> Other fix posted is "media: venus: dec: Fix capture formats enumeration order":
>
> https://patchwork.kernel.org/project/linux-media/patch/20230210081835.2054482-1-javierm@redhat.com/
>

Vikash,

Could you or someone else from QC please review/ack these two patches,
since it seems that Stanimir has moved on and may no longer be working
on this driver?
Leonard Lausen April 1, 2023, 8:53 p.m. UTC | #8
Hi Javier, Dikshita, Stan,

the revert wasn't applied to the v6.2 series. Can you please apply it and include it in v6.2.10?

March 6, 2023 at 5:43 AM, "Javier Martinez Canillas" <javierm@redhat.com> wrote:
>> On 3/1/2023 3:15 PM, Dikshita Agarwal wrote:
>>> On 2/28/2023 9:33 PM, Javier Martinez Canillas wrote:
>>>> Javier Martinez Canillas<javierm@redhat.com>  writes:
>>>>> Vikash Garodia<vgarodia@qti.qualcomm.com>  writes:
>>>>>
>>>>>> Stan, could you please help with the revert and a pull request having this revert
>>>>>> along with other pending changes?
>>>>>>
>>>>> Other fix posted is "media: venus: dec: Fix capture formats enumeration order":
>>>>>
>>>>> https://patchwork.kernel.org/project/linux-media/patch/20230210081835.2054482-1-javierm@redhat.com/
>>
>> Hi Javier,
>>
>> Thanks for this patch "media: venus: dec: Fix capture formats
>> enumeration order".
>>
>> Somehow I can't find it in my mailbox to be able to reply there.
>>
>> Could you please explain what is the regression you see here?
>>
>
>You can find the thread and explanation of the issue here:
>
>https://lore.kernel.org/lkml/Y+KPW18o%2FDa+N8UI@google.com/T/
>
>But Stanimir already picked it and sent a PR for v6.3 including it.

While "media: venus: dec: Fix capture formats enumeration order" may have been
applied to v6.3, this still leaves the regression introduced by "venus:
firmware: Correct non-pix start and end addresses". As pointed out by Matthias
Kaehlcke, the commit prevents SC7180 and SC7280 AOSS from entering sleep mode
during system suspend. This is a serious regression in the v6.2 kernel series.

Best regards,
Leonard Lausen
Thorsten Leemhuis April 2, 2023, 5:02 a.m. UTC | #9
On 01.04.23 22:53, Leonard Lausen wrote:
> Hi Javier, Dikshita, Stan,
> 
> the revert wasn't applied to the v6.2 series. Can you please apply it and include it in v6.2.10?
> 
> March 6, 2023 at 5:43 AM, "Javier Martinez Canillas" <javierm@redhat.com> wrote:
>>> On 3/1/2023 3:15 PM, Dikshita Agarwal wrote:
>>>> On 2/28/2023 9:33 PM, Javier Martinez Canillas wrote:
>>>>> Javier Martinez Canillas<javierm@redhat.com>  writes:
>>>>>> Vikash Garodia<vgarodia@qti.qualcomm.com>  writes:
>>>>>>
>>>>>>> Stan, could you please help with the revert and a pull request having this revert
>>>>>>> along with other pending changes?
>>>>>>>
>>>>>> Other fix posted is "media: venus: dec: Fix capture formats enumeration order":
>>>>>>
>>>>>> https://patchwork.kernel.org/project/linux-media/patch/20230210081835.2054482-1-javierm@redhat.com/
>>>
>>> Hi Javier,
>>>
>>> Thanks for this patch "media: venus: dec: Fix capture formats
>>> enumeration order".
>>>
>>> Somehow I can't find it in my mailbox to be able to reply there.
>>>
>>> Could you please explain what is the regression you see here?
>>>
>>
>> You can find the thread and explanation of the issue here:
>>
>> https://lore.kernel.org/lkml/Y+KPW18o%2FDa+N8UI@google.com/T/
>>
>> But Stanimir already picked it and sent a PR for v6.3 including it.
> 
> While "media: venus: dec: Fix capture formats enumeration order" may have been
> applied to v6.3,

To me it looks like it was submitted[1], but not yet applied even to the
media tree[2] -- wild guess: maybe due to the problems mentioned in [3]? Or am
I missing something?

[1]
https://lore.kernel.org/all/20230329211655.100276-1-stanimir.k.varbanov@gmail.com/
[2] https://git.linuxtv.org/media_tree.git/log/?h=fixes
[3]
https://lore.kernel.org/all/20230329214310.2503484-1-jenkins@linuxtv.org/

> this still leaves the regression introduced by "venus:
> firmware: Correct non-pix start and end addresses". As pointed out by Matthias
> Kaehlcke, the commit prevents SC7180 and SC7280 AOSS from entering sleep mode
> during system suspend. This is a serious regression in the v6.2 kernel series.

That fix has been sitting in the media tree for a while and afaics still
hasn't been sent to Linus (which is needed to get this fixed in 6.2.y).

Mauro, could you maybe take care of that?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

Patch

diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c
index 142d4c74017c..d59ecf776715 100644
--- a/drivers/media/platform/qcom/venus/firmware.c
+++ b/drivers/media/platform/qcom/venus/firmware.c
@@ -38,8 +38,8 @@  static void venus_reset_cpu(struct venus_core *core)
 	writel(fw_size, wrapper_base + WRAPPER_FW_END_ADDR);
 	writel(0, wrapper_base + WRAPPER_CPA_START_ADDR);
 	writel(fw_size, wrapper_base + WRAPPER_CPA_END_ADDR);
-	writel(0, wrapper_base + WRAPPER_NONPIX_START_ADDR);
-	writel(0, wrapper_base + WRAPPER_NONPIX_END_ADDR);
+	writel(fw_size, wrapper_base + WRAPPER_NONPIX_START_ADDR);
+	writel(fw_size, wrapper_base + WRAPPER_NONPIX_END_ADDR);
 
 	if (IS_V6(core)) {
 		/* Bring XTSS out of reset */
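
To see what the hunk above leaves in place once the revert is applied, here is
a minimal user-space sketch. It assumes a plain array in place of the MMIO
`wrapper_base`, and the `FAKE_*` register indices and `fake_writel()` are
invented for illustration (the real offsets live in the driver headers and are
not shown in this thread). It only mirrors the order and values of the six
writes; it says nothing about how the hardware interprets them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices; NOT the real Venus wrapper offsets. */
enum {
	FAKE_FW_START,
	FAKE_FW_END,
	FAKE_CPA_START,
	FAKE_CPA_END,
	FAKE_NONPIX_START,
	FAKE_NONPIX_END,
	FAKE_NUM_REGS
};

/* Stand-in for writel(): stores into an array instead of device memory. */
static void fake_writel(uint32_t val, uint32_t *regs, int idx)
{
	assert(idx >= 0 && idx < FAKE_NUM_REGS);
	regs[idx] = val;
}

/* Mirrors the address-range writes in venus_reset_cpu() after the revert. */
static void program_ranges(uint32_t *regs, uint32_t fw_size)
{
	fake_writel(0, regs, FAKE_FW_START);
	fake_writel(fw_size, regs, FAKE_FW_END);
	fake_writel(0, regs, FAKE_CPA_START);
	fake_writel(fw_size, regs, FAKE_CPA_END);
	/* Restored by the revert: start and end both set to fw_size. */
	fake_writel(fw_size, regs, FAKE_NONPIX_START);
	fake_writel(fw_size, regs, FAKE_NONPIX_END);
}

/* Length of a [start, end) window in this toy model. */
static uint32_t range_len(const uint32_t *regs, int start, int end)
{
	return regs[end] - regs[start];
}
```

In this toy model the FW and CPA windows span `fw_size` bytes, while the
non-pixel window has start equal to end and therefore zero length. The reverted
commit had instead written 0 to both non-pixel registers.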