diff mbox series

[net] wifi: ath12k: properly set single_chip_mlo_supp to true in ath12k_core_alloc()

Message ID 20250303-topic-ath12k-fix-crash-v1-1-f871d4e4d968@linaro.org
State New
Headers show
Series [net] wifi: ath12k: properly set single_chip_mlo_supp to true in ath12k_core_alloc() | expand

Commit Message

Neil Armstrong March 3, 2025, 3 p.m. UTC
In commit 46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to single_chip_mlo_supp")
the line:
	ab->mlo_capable_flags = ATH12K_INTRA_DEVICE_MLO_SUPPORT;
was incorrectly updated to:
	ab->single_chip_mlo_supp = false;
leading to always disabling INTRA_DEVICE_MLO even if the device supports it.

The firmware "WLAN.HMT.1.1.c5-00156-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1"
crashes on driver initialization with:
 ath12k_pci 0000:01:00.0: chip_id 0x2 chip_family 0x4 board_id 0x3d soc_id 0x40170200
 ath12k_pci 0000:01:00.0: fw_version 0x110f009c fw_build_timestamp 2024-05-30 11:35 fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HMT.1.1.c5-00156-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1
 ath12k_pci 0000:01:00.0: ignore reset dev flags 0x200
 ath12k_pci 0000:01:00.0: failed to receive wmi unified ready event: -110
 ath12k_pci 0000:01:00.0: failed to start core: -110
 failed to send QMI message
 ath12k_pci 0000:01:00.0: qmi failed to send mode request, mode: 4, err = -5
 ath12k_pci 0000:01:00.0: qmi failed to send wlan mode off

With ab->single_chip_mlo_supp set to True, firmware loads nominally.

Fixes: 46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to single_chip_mlo_supp")
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
Bisect log for reference:
The bisect leaded to:
git bisect start 'v6.14-rc4' 'v6.12'
git bisect good 5757b31666277e2b177b406e48878dc48d587a46
git bisect bad d78794d4f4dbeac0a39e15d2fbc8e917741b5b7c
git bisect bad cf33d96f50903214226b379b3f10d1f262dae018
git bisect good 12e070eb6964b341b41677fd260af5a305316a1f
git bisect bad 6917d207b469ee81e6dc7f8ccca29c234a16916d
git bisect good 4fefbc66dfb356145633e571475be2459d73ce16
git bisect bad a6ac667467b642c94928c24ac2eb40d20110983c
git bisect bad b05d30c2b6df7e2172b18bf1baee9b202f9c6b53
git bisect good 56dcbf0b520796e26b2bbe5686bdd305ad924954
git bisect bad d302ac65ac938516487f57ae20f11e9cf6327606
git bisect good 8c2143702d0719a0357600bca0236900781ffc78
git bisect good a5686ae820fa7ab03226a3b0ff529720b7bac599
git bisect bad 6f245ea0ec6c29b90c8fa4fdf6e178c646125d7e
git bisect bad 46d16f7e1d1413ad7ff99c1334d8874623717745
---
 drivers/net/wireless/ath/ath12k/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


---
base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1

Best regards,

Comments

Vasanthakumar Thiagarajan March 19, 2025, 11:32 a.m. UTC | #1
On 3/19/2025 3:57 PM, Krzysztof Kozlowski wrote:
> On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
>>>>> ---
>>>>> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
>>>>> change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
>>>>>
>>>>> Best regards,
>>>>
>>>> NAK since this will break QCN
>>>> There is a series under internal review to address MLO issues for WCN chipsets
>>>
>>> ???
>>>
>>> The original commit is wrong, this fixes the conversion, nothing else.
>>
>> Nope. Driver changes to enable MLO with WCN chipset are not there yet.
>> Setting the mlo capability flag without having required driver changes
>> for WCN chipset will likely result in firmware crash. So the recommendation
>> is to enable MLO (in WCN) only when all the necessary driver changes
>> (in development, public posting in near future) are in place.
> Really, these are your answers? There is regression and first reply is
> upstream should wait for whatever you do internally. Second answer is
> the same - public posting in near future?
> 

May be I was not clear in my response. I was not telling MLO bug fixes were
in the development. Actually the MLO feature itself is not enabled
yet with WCN chip sets. Any code changes enabling it without full feature
support would result in firmware crashes with the existing firmware binaries
available in upstream.

Vasanth
Dmitry Baryshkov March 19, 2025, 11:51 a.m. UTC | #2
On Wed, Mar 19, 2025 at 05:02:39PM +0530, Vasanthakumar Thiagarajan wrote:
> 
> 
> On 3/19/2025 3:57 PM, Krzysztof Kozlowski wrote:
> > On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
> > > > > > ---
> > > > > > base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
> > > > > > change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
> > > > > > 
> > > > > > Best regards,
> > > > > 
> > > > > NAK since this will break QCN
> > > > > There is a series under internal review to address MLO issues for WCN chipsets
> > > > 
> > > > ???
> > > > 
> > > > The original commit is wrong, this fixes the conversion, nothing else.
> > > 
> > > Nope. Driver changes to enable MLO with WCN chipset are not there yet.
> > > Setting the mlo capability flag without having required driver changes
> > > for WCN chipset will likely result in firmware crash. So the recommendation
> > > is to enable MLO (in WCN) only when all the necessary driver changes
> > > (in development, public posting in near future) are in place.
> > Really, these are your answers? There is regression and first reply is
> > upstream should wait for whatever you do internally. Second answer is
> > the same - public posting in near future?
> > 
> 
> May be I was not clear in my response. I was not telling MLO bug fixes were
> in the development. Actually the MLO feature itself is not enabled
> yet with WCN chip sets. Any code changes enabling it without full feature
> support would result in firmware crashes with the existing firmware binaries
> available in upstream.

Is there an undocumented change of the behaviour in the commit
46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to
single_chip_mlo_supp")?
Vasanthakumar Thiagarajan March 19, 2025, 12:54 p.m. UTC | #3
On 3/19/2025 5:21 PM, Dmitry Baryshkov wrote:
> On Wed, Mar 19, 2025 at 05:02:39PM +0530, Vasanthakumar Thiagarajan wrote:
>>
>>
>> On 3/19/2025 3:57 PM, Krzysztof Kozlowski wrote:
>>> On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
>>>>>>> ---
>>>>>>> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
>>>>>>> change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
>>>>>>>
>>>>>>> Best regards,
>>>>>>
>>>>>> NAK since this will break QCN
>>>>>> There is a series under internal review to address MLO issues for WCN chipsets
>>>>>
>>>>> ???
>>>>>
>>>>> The original commit is wrong, this fixes the conversion, nothing else.
>>>>
>>>> Nope. Driver changes to enable MLO with WCN chipset are not there yet.
>>>> Setting the mlo capability flag without having required driver changes
>>>> for WCN chipset will likely result in firmware crash. So the recommendation
>>>> is to enable MLO (in WCN) only when all the necessary driver changes
>>>> (in development, public posting in near future) are in place.
>>> Really, these are your answers? There is regression and first reply is
>>> upstream should wait for whatever you do internally. Second answer is
>>> the same - public posting in near future?
>>>
>>
>> May be I was not clear in my response. I was not telling MLO bug fixes were
>> in the development. Actually the MLO feature itself is not enabled
>> yet with WCN chip sets. Any code changes enabling it without full feature
>> support would result in firmware crashes with the existing firmware binaries
>> available in upstream.
> 
> Is there an undocumented change of the behaviour in the commit
> 46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to
> single_chip_mlo_supp")?
> 

diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c

-       if (resp.single_chip_mlo_support_valid) {
-               if (resp.single_chip_mlo_support)
-                       ab->mlo_capable_flags |= ATH12K_INTRA_DEVICE_MLO_SUPPORT;
-               else
-                       ab->mlo_capable_flags &= ~ATH12K_INTRA_DEVICE_MLO_SUPPORT;
-       }

The above logic seems to keep the initialized intra MLO support even when
single_chip_mlo_support_valid is not set. The above code removal is correct as
MLO support can not be enabled in host when firmware does not advertise it.

diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c

+       ab->single_chip_mlo_supp = false;


diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c

+       if (resp.single_chip_mlo_support_valid &&
+           resp.single_chip_mlo_support)
+               ab->single_chip_mlo_supp = true;

The above code does it in right way. Overriding firmware MLO capability as done
in the submitted patch under review is obviously wrong. The firmware used to report
the issue seems to have an odd behavior: 1. it does not seem to advertise MLO
capability in single_chip_mlo_support bit and 2. expects configurations to enable
MLO from host. None of the WCN firmware available in upstream either advertises
MLO capability or expects configurations to enable MLO from host.

Vasanth
Dmitry Baryshkov March 19, 2025, 1:33 p.m. UTC | #4
On Wed, Mar 19, 2025 at 06:24:57PM +0530, Vasanthakumar Thiagarajan wrote:
> 
> 
> On 3/19/2025 5:21 PM, Dmitry Baryshkov wrote:
> > On Wed, Mar 19, 2025 at 05:02:39PM +0530, Vasanthakumar Thiagarajan wrote:
> > > 
> > > 
> > > On 3/19/2025 3:57 PM, Krzysztof Kozlowski wrote:
> > > > On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
> > > > > > > > ---
> > > > > > > > base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
> > > > > > > > change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
> > > > > > > > 
> > > > > > > > Best regards,
> > > > > > > 
> > > > > > > NAK since this will break QCN
> > > > > > > There is a series under internal review to address MLO issues for WCN chipsets
> > > > > > 
> > > > > > ???
> > > > > > 
> > > > > > The original commit is wrong, this fixes the conversion, nothing else.
> > > > > 
> > > > > Nope. Driver changes to enable MLO with WCN chipset are not there yet.
> > > > > Setting the mlo capability flag without having required driver changes
> > > > > for WCN chipset will likely result in firmware crash. So the recommendation
> > > > > is to enable MLO (in WCN) only when all the necessary driver changes
> > > > > (in development, public posting in near future) are in place.
> > > > Really, these are your answers? There is regression and first reply is
> > > > upstream should wait for whatever you do internally. Second answer is
> > > > the same - public posting in near future?
> > > > 
> > > 
> > > May be I was not clear in my response. I was not telling MLO bug fixes were
> > > in the development. Actually the MLO feature itself is not enabled
> > > yet with WCN chip sets. Any code changes enabling it without full feature
> > > support would result in firmware crashes with the existing firmware binaries
> > > available in upstream.
> > 
> > Is there an undocumented change of the behaviour in the commit
> > 46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to
> > single_chip_mlo_supp")?
> > 
> 
> diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
> 
> -       if (resp.single_chip_mlo_support_valid) {
> -               if (resp.single_chip_mlo_support)
> -                       ab->mlo_capable_flags |= ATH12K_INTRA_DEVICE_MLO_SUPPORT;
> -               else
> -                       ab->mlo_capable_flags &= ~ATH12K_INTRA_DEVICE_MLO_SUPPORT;
> -       }
> 
> The above logic seems to keep the initialized intra MLO support even when
> single_chip_mlo_support_valid is not set. The above code removal is correct as
> MLO support can not be enabled in host when firmware does not advertise it.

Ack

> 
> diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
> 

You skipped an important chunk:

-	ab->mlo_capable_flags = ATH12K_INTRA_DEVICE_MLO_SUPPORT;
> +       ab->single_chip_mlo_supp = false;

Is this an _undocumented_ change? I think it is. If the developer has
described that the commit disables MLO, there would be no such
questions.

> diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
> 
> +       if (resp.single_chip_mlo_support_valid &&
> +           resp.single_chip_mlo_support)
> +               ab->single_chip_mlo_supp = true;
> 
> The above code does it in right way. Overriding firmware MLO capability as done
> in the submitted patch under review is obviously wrong. The firmware used to report
> the issue seems to have an odd behavior: 1. it does not seem to advertise MLO
> capability in single_chip_mlo_support bit and 2. expects configurations to enable
> MLO from host. None of the WCN firmware available in upstream either advertises
> MLO capability or expects configurations to enable MLO from host.

Additionally, from the commit message:

    For the WCN7850 family of chipsets, since the event is not supported,
    assumption is made that single chip MLO is supported.

However the code doesn't contain that change. Instead single chip MLO is
unconditionally disabled.

I guess, Neil's change should be reworked to be limited to WCN7850 only,
however it must be done as it is what was expected from the commit
message.
Vasanthakumar Thiagarajan March 19, 2025, 5:23 p.m. UTC | #5
On 3/19/2025 7:03 PM, Dmitry Baryshkov wrote:
> On Wed, Mar 19, 2025 at 06:24:57PM +0530, Vasanthakumar Thiagarajan wrote:
>>
>>
>> On 3/19/2025 5:21 PM, Dmitry Baryshkov wrote:
>>> On Wed, Mar 19, 2025 at 05:02:39PM +0530, Vasanthakumar Thiagarajan wrote:
>>>>
>>>>
>>>> On 3/19/2025 3:57 PM, Krzysztof Kozlowski wrote:
>>>>> On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
>>>>>>>>> ---
>>>>>>>>> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
>>>>>>>>> change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> NAK since this will break QCN
>>>>>>>> There is a series under internal review to address MLO issues for WCN chipsets
>>>>>>>
>>>>>>> ???
>>>>>>>
>>>>>>> The original commit is wrong, this fixes the conversion, nothing else.
>>>>>>
>>>>>> Nope. Driver changes to enable MLO with WCN chipset are not there yet.
>>>>>> Setting the mlo capability flag without having required driver changes
>>>>>> for WCN chipset will likely result in firmware crash. So the recommendation
>>>>>> is to enable MLO (in WCN) only when all the necessary driver changes
>>>>>> (in development, public posting in near future) are in place.
>>>>> Really, these are your answers? There is regression and first reply is
>>>>> upstream should wait for whatever you do internally. Second answer is
>>>>> the same - public posting in near future?
>>>>>
>>>>
>>>> May be I was not clear in my response. I was not telling MLO bug fixes were
>>>> in the development. Actually the MLO feature itself is not enabled
>>>> yet with WCN chip sets. Any code changes enabling it without full feature
>>>> support would result in firmware crashes with the existing firmware binaries
>>>> available in upstream.
>>>
>>> Is there an undocumented change of the behaviour in the commit
>>> 46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to
>>> single_chip_mlo_supp")?
>>>
>>
>> diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
>>
>> -       if (resp.single_chip_mlo_support_valid) {
>> -               if (resp.single_chip_mlo_support)
>> -                       ab->mlo_capable_flags |= ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>> -               else
>> -                       ab->mlo_capable_flags &= ~ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>> -       }
>>
>> The above logic seems to keep the initialized intra MLO support even when
>> single_chip_mlo_support_valid is not set. The above code removal is correct as
>> MLO support can not be enabled in host when firmware does not advertise it.
> 
> Ack
> 
>>
>> diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
>>
> 
> You skipped an important chunk:
> 
> -	ab->mlo_capable_flags = ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>> +       ab->single_chip_mlo_supp = false;
> 
> Is this an _undocumented_ change? I think it is. If the developer has
> described that the commit disables MLO, there would be no such
> questions.
> 

Right. Since MLO was WCN was not supported, this default configuration was
changed to avoid potential issues due to partial MLO configurations with
official upstream firmware binaries.

>> diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
>>
>> +       if (resp.single_chip_mlo_support_valid &&
>> +           resp.single_chip_mlo_support)
>> +               ab->single_chip_mlo_supp = true;
>>
>> The above code does it in right way. Overriding firmware MLO capability as done
>> in the submitted patch under review is obviously wrong. The firmware used to report
>> the issue seems to have an odd behavior: 1. it does not seem to advertise MLO
>> capability in single_chip_mlo_support bit and 2. expects configurations to enable
>> MLO from host. None of the WCN firmware available in upstream either advertises
>> MLO capability or expects configurations to enable MLO from host.
> 
> Additionally, from the commit message:
> 
>      For the WCN7850 family of chipsets, since the event is not supported,
>      assumption is made that single chip MLO is supported.
> 
> However the code doesn't contain that change. Instead single chip MLO is
> unconditionally disabled.
>

This text in commit message seems to be a miss. This should have been
removed as the merged patch version completely disables MLO for WCN
chip.

Vasanth
Jeff Johnson March 19, 2025, 6:24 p.m. UTC | #6
On 3/19/2025 2:46 AM, Baochen Qiang wrote:
>>>>>> fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HMT.1.1.c5-00156-
>>>>>> QCAHMTSWPL_V1.0_V2.0_SILICONZ-1
> 
> this FW version is not upstream yet, why are you testing with it?
> 
> Generally we only support upstrmea driver + upstream FW.

Thanks for pointing this out Baochen. I was wondering why I wasn't seeing any
issues on my laptop. As others have commented, due to differences between
firmware configuration and release processes, please avoid using firmware that
isn't in https://git.codelinaro.org/clo/ath-firmware/ath12k-firmware (which
gets promoted to linux-firmware).

/jeff
Jeff Johnson March 19, 2025, 6:32 p.m. UTC | #7
On 3/19/2025 3:27 AM, Krzysztof Kozlowski wrote:
> On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
>>>>> ---
>>>>> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
>>>>> change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
>>>>>
>>>>> Best regards,
>>>>
>>>> NAK since this will break QCN
>>>> There is a series under internal review to address MLO issues for WCN chipsets
>>>
>>> ???
>>>
>>> The original commit is wrong, this fixes the conversion, nothing else.
>>
>> Nope. Driver changes to enable MLO with WCN chipset are not there yet.
>> Setting the mlo capability flag without having required driver changes
>> for WCN chipset will likely result in firmware crash. So the recommendation
>> is to enable MLO (in WCN) only when all the necessary driver changes
>> (in development, public posting in near future) are in place.
> Really, these are your answers? There is regression and first reply is
> upstream should wait for whatever you do internally. Second answer is
> the same - public posting in near future?
> 
> Can you start working with the upstream instead?

There is a lot going on in this thread. I want to address the big picture. It
is no secret that Qualcomm has historically focused on downstream drivers, and
upstream was mostly an afterthought. But that mindset has changed. Qualcomm is
fully embracing upstream kernel development, and has actively recruited (and
is still recruiting!) experienced upstream Linux Kernel engineers. And in
places where there are shortcoming, Qualcomm has partnered with companies like
Linaro to bring in needed support. So Qualcomm is very much "working with the
upstream." We may not be working well sometimes, but many of us are still
inexperienced with working with the upstream. We are coming up to speed.
Specifically for Wi-Fi, we have a large number of engineers who have primarily
worked on downstream code who are now working on upstream (including me!). And
we still have the issue that many of the products we are shipping now still
have a lot of downstream DNA, especially when it comes to firmware interfaces.
So please bear with us as we learn and evolve.

Please keep the constructive feedback coming. And remember, the more detailed
the feedback, the easier it is to incorporate the feedback.

Thanks!
/jeff
Krzysztof Kozlowski March 20, 2025, 7:21 a.m. UTC | #8
On 19/03/2025 12:25, Dmitry Baryshkov wrote:
> On Wed, Mar 19, 2025 at 11:29:18AM +0100, Krzysztof Kozlowski wrote:
>> On 19/03/2025 10:46, Baochen Qiang wrote:
>>>
>>>
>>> On 3/19/2025 5:12 PM, neil.armstrong@linaro.org wrote:
>>>> Hi,
>>>>
>>>> On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
>>>>>
>>>>>
>>>>> On 3/19/2025 1:34 PM, Neil Armstrong wrote:
>>>>>> On 18/03/2025 17:35, Jeff Johnson wrote:
>>>>>>> On 3/3/2025 7:00 AM, Neil Armstrong wrote:
>>>>>>>> In commit 46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to
>>>>>>>> single_chip_mlo_supp")
>>>>>>>> the line:
>>>>>>>>     ab->mlo_capable_flags = ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>>>>>>>> was incorrectly updated to:
>>>>>>>>     ab->single_chip_mlo_supp = false;
>>>>>>>> leading to always disabling INTRA_DEVICE_MLO even if the device supports it.
>>>>>>>>
>>>>>>>> The firmware "WLAN.HMT.1.1.c5-00156-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1"
>>>>>>>> crashes on driver initialization with:
>>>>>>>>   ath12k_pci 0000:01:00.0: chip_id 0x2 chip_family 0x4 board_id 0x3d soc_id 0x40170200
>>>>>>>>   ath12k_pci 0000:01:00.0: fw_version 0x110f009c fw_build_timestamp 2024-05-30 11:35
>>>>>>>> fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HMT.1.1.c5-00156-
>>>>>>>> QCAHMTSWPL_V1.0_V2.0_SILICONZ-1
>>>
>>> this FW version is not upstream yet, why are you testing with it?
>>>
>>> Generally we only support upstrmea driver + upstream FW.
>> FW does not have to be upstream. We work, here in upstream, with all
>> sort of vendors and all sorts of devices, for which vendors might not
>> send yet their FW or we are unclear about licensing rules.
> 
> If you are working with non-supported firmware, then, basically, you are
> on your own. I think you'd get the same response from any other vendor
> shipping firmware files. Consider reporting an issue to i915 or amdgpu
> driver _and_ stating that you are using some non-standard firmware files
> that were not provided to you by the corresponding team.


FW extracted from whatever-we-got-there is entire history of all WiFi
drivers... And no one asked here for support.


Best regards,
Krzysztof
Vasanthakumar Thiagarajan March 20, 2025, 7:23 a.m. UTC | #9
On 3/19/2025 7:03 PM, Dmitry Baryshkov wrote:
> On Wed, Mar 19, 2025 at 06:24:57PM +0530, Vasanthakumar Thiagarajan wrote:
>>
>>
>> On 3/19/2025 5:21 PM, Dmitry Baryshkov wrote:
>>> On Wed, Mar 19, 2025 at 05:02:39PM +0530, Vasanthakumar Thiagarajan wrote:
>>>>
>>>>
>>>> On 3/19/2025 3:57 PM, Krzysztof Kozlowski wrote:
>>>>> On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
>>>>>>>>> ---
>>>>>>>>> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
>>>>>>>>> change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>> NAK since this will break QCN
>>>>>>>> There is a series under internal review to address MLO issues for WCN chipsets
>>>>>>>
>>>>>>> ???
>>>>>>>
>>>>>>> The original commit is wrong, this fixes the conversion, nothing else.
>>>>>>
>>>>>> Nope. Driver changes to enable MLO with WCN chipset are not there yet.
>>>>>> Setting the mlo capability flag without having required driver changes
>>>>>> for WCN chipset will likely result in firmware crash. So the recommendation
>>>>>> is to enable MLO (in WCN) only when all the necessary driver changes
>>>>>> (in development, public posting in near future) are in place.
>>>>> Really, these are your answers? There is regression and first reply is
>>>>> upstream should wait for whatever you do internally. Second answer is
>>>>> the same - public posting in near future?
>>>>>
>>>>
>>>> May be I was not clear in my response. I was not telling MLO bug fixes were
>>>> in the development. Actually the MLO feature itself is not enabled
>>>> yet with WCN chip sets. Any code changes enabling it without full feature
>>>> support would result in firmware crashes with the existing firmware binaries
>>>> available in upstream.
>>>
>>> Is there an undocumented change of the behaviour in the commit
>>> 46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to
>>> single_chip_mlo_supp")?
>>>
>>
>> diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
>>
>> -       if (resp.single_chip_mlo_support_valid) {
>> -               if (resp.single_chip_mlo_support)
>> -                       ab->mlo_capable_flags |= ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>> -               else
>> -                       ab->mlo_capable_flags &= ~ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>> -       }
>>
>> The above logic seems to keep the initialized intra MLO support even when
>> single_chip_mlo_support_valid is not set. The above code removal is correct as
>> MLO support can not be enabled in host when firmware does not advertise it.
> 
> Ack
> 
>>
>> diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
>>
> 
> You skipped an important chunk:
> 
> -	ab->mlo_capable_flags = ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>> +       ab->single_chip_mlo_supp = false;
> 
> Is this an _undocumented_ change? I think it is. If the developer has
> described that the commit disables MLO, there would be no such
> questions.
> 
>> diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
>>
>> +       if (resp.single_chip_mlo_support_valid &&
>> +           resp.single_chip_mlo_support)
>> +               ab->single_chip_mlo_supp = true;
>>
>> The above code does it in right way. Overriding firmware MLO capability as done
>> in the submitted patch under review is obviously wrong. The firmware used to report
>> the issue seems to have an odd behavior: 1. it does not seem to advertise MLO
>> capability in single_chip_mlo_support bit and 2. expects configurations to enable
>> MLO from host. None of the WCN firmware available in upstream either advertises
>> MLO capability or expects configurations to enable MLO from host.
> 
> Additionally, from the commit message:
> 
>      For the WCN7850 family of chipsets, since the event is not supported,
>      assumption is made that single chip MLO is supported.
> 
> However the code doesn't contain that change. Instead single chip MLO is
> unconditionally disabled.
> 
> I guess, Neil's change should be reworked to be limited to WCN7850 only,
> however it must be done as it is what was expected from the commit
> message.
> 

There has been lots of rework gone in to ath12k driver towards enabling
MLO support for QCN chip set. As of now, MLO boot and runtime configurations
are restricted to  QCN chipset. WCN will not work with those MLO changes.
Re-enabling single_chip_mlo_supp equivalent (single_chip_mlo_supp got cleaned up in 
ath/linux-next) will create issues when MLO gets enabled for WCN. Driver
changes/hacks to support this non-upstream firmware specific behavior will
become a major challenge over a period as new features are getting added in driver/firmware.

Vasanth
Neil Armstrong March 20, 2025, 10:06 a.m. UTC | #10
Hi,

On 20/03/2025 08:23, Vasanthakumar Thiagarajan wrote:
> 
> 
> On 3/19/2025 7:03 PM, Dmitry Baryshkov wrote:
>> On Wed, Mar 19, 2025 at 06:24:57PM +0530, Vasanthakumar Thiagarajan wrote:
>>>
>>>
>>> On 3/19/2025 5:21 PM, Dmitry Baryshkov wrote:
>>>> On Wed, Mar 19, 2025 at 05:02:39PM +0530, Vasanthakumar Thiagarajan wrote:
>>>>>
>>>>>
>>>>> On 3/19/2025 3:57 PM, Krzysztof Kozlowski wrote:
>>>>>> On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
>>>>>>>>>> ---
>>>>>>>>>> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
>>>>>>>>>> change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> NAK since this will break QCN
>>>>>>>>> There is a series under internal review to address MLO issues for WCN chipsets
>>>>>>>>
>>>>>>>> ???
>>>>>>>>
>>>>>>>> The original commit is wrong, this fixes the conversion, nothing else.
>>>>>>>
>>>>>>> Nope. Driver changes to enable MLO with WCN chipset are not there yet.
>>>>>>> Setting the mlo capability flag without having required driver changes
>>>>>>> for WCN chipset will likely result in firmware crash. So the recommendation
>>>>>>> is to enable MLO (in WCN) only when all the necessary driver changes
>>>>>>> (in development, public posting in near future) are in place.
>>>>>> Really, these are your answers? There is regression and first reply is
>>>>>> upstream should wait for whatever you do internally. Second answer is
>>>>>> the same - public posting in near future?
>>>>>>
>>>>>
>>>>> May be I was not clear in my response. I was not telling MLO bug fixes were
>>>>> in the development. Actually the MLO feature itself is not enabled
>>>>> yet with WCN chip sets. Any code changes enabling it without full feature
>>>>> support would result in firmware crashes with the existing firmware binaries
>>>>> available in upstream.
>>>>
>>>> Is there an undocumented change of the behaviour in the commit
>>>> 46d16f7e1d14 ("wifi: ath12k: rename mlo_capable_flags to
>>>> single_chip_mlo_supp")?
>>>>
>>>
>>> diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
>>>
>>> -       if (resp.single_chip_mlo_support_valid) {
>>> -               if (resp.single_chip_mlo_support)
>>> -                       ab->mlo_capable_flags |= ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>>> -               else
>>> -                       ab->mlo_capable_flags &= ~ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>>> -       }
>>>
>>> The above logic seems to keep the initialized intra MLO support even when
>>> single_chip_mlo_support_valid is not set. The above code removal is correct as
>>> MLO support can not be enabled in host when firmware does not advertise it.
>>
>> Ack
>>
>>>
>>> diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
>>>
>>
>> You skipped an important chunk:
>>
>> -    ab->mlo_capable_flags = ATH12K_INTRA_DEVICE_MLO_SUPPORT;
>>> +       ab->single_chip_mlo_supp = false;
>>
>> Is this an _undocumented_ change? I think it is. If the developer has
>> described that the commit disables MLO, there would be no such
>> questions.
>>
>>> diff --git a/drivers/net/wireless/ath/ath12k/qmi.c b/drivers/net/wireless/ath/ath12k/qmi.c
>>>
>>> +       if (resp.single_chip_mlo_support_valid &&
>>> +           resp.single_chip_mlo_support)
>>> +               ab->single_chip_mlo_supp = true;
>>>
>>> The above code does it in right way. Overriding firmware MLO capability as done
>>> in the submitted patch under review is obviously wrong. The firmware used to report
>>> the issue seems to have an odd behavior: 1. it does not seem to advertise MLO
>>> capability in single_chip_mlo_support bit and 2. expects configurations to enable
>>> MLO from host. None of the WCN firmware available in upstream either advertises
>>> MLO capability or expects configurations to enable MLO from host.
>>
>> Additionally, from the commit message:
>>
>>      For the WCN7850 family of chipsets, since the event is not supported,
>>      assumption is made that single chip MLO is supported.
>>
>> However the code doesn't contain that change. Instead single chip MLO is
>> unconditionally disabled.
>>
>> I guess, Neil's change should be reworked to be limited to WCN7850 only,
>> however it must be done as it is what was expected from the commit
>> message.
>>
> 
> There has been lots of rework gone in to ath12k driver towards enabling
> MLO support for QCN chip set. As of now, MLO boot and runtime configurations
> are restricted to  QCN chipset. WCN will not work with those MLO changes.
> Re-enabling single_chip_mlo_supp equivalent (single_chip_mlo_supp got cleaned up in ath/linux-next) will create issues when MLO gets enabled for WCN. Driver
> changes/hacks to support this non-upstream firmware specific behavior will
> become a major challenge over a period as new features are getting added in driver/firmware.

Making sure a driver doesn't regress with a fw is a major challenge for everyone,
this is why we CI test the kernel on different platform and make sure
it still works over time.

Please share your roadmap with the community and add some documentation
explaining which firmwares & features are currently supported and which
features are in development. It's something usually done in a TODO and
ROADMAP file in the driver directory.

And honestly, please report a warning in the kernel when using a non-supported
firmware version, because Qualcomm ships hundreds of different firmwares
with cryptic namings working with different driver versions and the situation
is honestly impossible to handle for developers & customers when setting up
systems based on mainline Linux.

Finally, it seems you only test the WCN on an x86 platform, and since you ship
the WCN along (at least) the SM8550, SM8650 and SM8750 SoCs (which is like millions
of phones) and since those SoCs are pretty well supported in mainline Linux,
these targets should be validated aswell since they have different power-up
requirements than the standalone PCIe card.

Last note, there's no "upstream firmware", the linux-firmware repository is a handy
collection of firmwares shared by vendors in a central place with clear reditribution
licencing, but it cannot be considered as "upstream" like the Linux code is.
So expliciting which firmware version is supported in the documentation and the
code is the preferred way.

Thanks,
Neil

> 
> Vasanth
Neil Armstrong March 20, 2025, 10:14 a.m. UTC | #11
Hi Jeff,

On 19/03/2025 19:32, Jeff Johnson wrote:
> On 3/19/2025 3:27 AM, Krzysztof Kozlowski wrote:
>> On 19/03/2025 10:06, Vasanthakumar Thiagarajan wrote:
>>>>>> ---
>>>>>> base-commit: 7eb172143d5508b4da468ed59ee857c6e5e01da6
>>>>>> change-id: 20250303-topic-ath12k-fix-crash-49e9055c61a1
>>>>>>
>>>>>> Best regards,
>>>>>
>>>>> NAK since this will break QCN
>>>>> There is a series under internal review to address MLO issues for WCN chipsets
>>>>
>>>> ???
>>>>
>>>> The original commit is wrong, this fixes the conversion, nothing else.
>>>
>>> Nope. Driver changes to enable MLO with WCN chipset are not there yet.
>>> Setting the mlo capability flag without having required driver changes
>>> for WCN chipset will likely result in firmware crash. So the recommendation
>>> is to enable MLO (in WCN) only when all the necessary driver changes
>>> (in development, public posting in near future) are in place.
>> Really, these are your answers? There is regression and first reply is
>> upstream should wait for whatever you do internally. Second answer is
>> the same - public posting in near future?
>>
>> Can you start working with the upstream instead?
> 
> There is a lot going on in this thread. I want to address the big picture. It
> is no secret that Qualcomm has historically focused on downstream drivers, and
> upstream was mostly an afterthought. But that mindset has changed. Qualcomm is
> fully embracing upstream kernel development, and has actively recruited (and
> is still recruiting!) experienced upstream Linux Kernel engineers. And in
> places where there are shortcoming, Qualcomm has partnered with companies like
> Linaro to bring in needed support. So Qualcomm is very much "working with the
> upstream." We may not be working well sometimes, but many of us are still
> inexperienced with working with the upstream. We are coming up to speed.
> Specifically for Wi-Fi, we have a large number of engineers who have primarily
> worked on downstream code who are now working on upstream (including me!). And
> we still have the issue that many of the products we are shipping now still
> have a lot of downstream DNA, especially when it comes to firmware interfaces.
> So please bear with us as we learn and evolve.
> 
> Please keep the constructive feedback coming. And remember, the more detailed
> the feedback, the easier it is to incorporate the feedback.

Thanks for all the feedback, all this escalated mainly because the 46d16f7e1d14
commit doesn't do what is written in the commit message, and there were no clear
sign MLO support with WCN firmares would be removed shortly and redeveloped later.

As final note for this patch, do you have any timeline for when MLO would be enabled
again on WCN chips ?

I'm happy Qualcomm is embracing mainline development, but Qualcomm still ships
hundreds of different incompatible firmware and downstream drivers variant in the
field, and this makes system integration very hard.

And as you mention the partnership with Linaro, we are very happy to help
and support your mainline development over time by reviewing, testing and
reporting.

Thanks,
Neil

> 
> Thanks!
> /jeff
>
diff mbox series

Patch

diff --git a/drivers/net/wireless/ath/ath12k/core.c b/drivers/net/wireless/ath/ath12k/core.c
index 0606116d6b9c491b6ede401b2e1aedfb619339a8..33aba5fceec946fad5a47a11a4d86b7be96e682e 100644
--- a/drivers/net/wireless/ath/ath12k/core.c
+++ b/drivers/net/wireless/ath/ath12k/core.c
@@ -1927,7 +1927,7 @@  struct ath12k_base *ath12k_core_alloc(struct device *dev, size_t priv_size,
 	ab->dev = dev;
 	ab->hif.bus = bus;
 	ab->qmi.num_radios = U8_MAX;
-	ab->single_chip_mlo_supp = false;
+	ab->single_chip_mlo_supp = true;
 
 	/* Device index used to identify the devices in a group.
 	 *