mbox series

[v2,0/4] soundwire: qcom: stablity fixes

Message ID 20230525133812.30841-1-srinivas.kandagatla@linaro.org
Headers show
Series soundwire: qcom: stablity fixes | expand

Message

Srinivas Kandagatla May 25, 2023, 1:38 p.m. UTC
During x13s audio testing we hit few corner cases due to issues
in codec drivers and some obvious code bugs.

Here are the fixes for those issues, mostly the issues are around
devices loosing the sync in between runtime pm suspend resume path.

With codec fixes along with these fixes, audio on x13s is pretty stable.

Thanks,
Srini

Changes since v1:
	- dropped runtime pm changes patch as unable to reproduced it anymore
	- fixed clk stop flag as suggested by Pierre
	- rebased on top of linux-next

Srinivas Kandagatla (4):
  soundwire: qcom: update status correctly with mask
  soundwire: qcom: wait for fifo to be empty before suspend
  soundwire: qcom: add software workaround for bus clash interrupt
    assertion
  soundwire: qcom: set clk stop need reset flag at runtime

 drivers/soundwire/qcom.c | 100 ++++++++++++++++++++++++++-------------
 1 file changed, 68 insertions(+), 32 deletions(-)

Comments

Johan Hovold June 5, 2023, 9:08 a.m. UTC | #1
On Thu, May 25, 2023 at 02:38:08PM +0100, Srinivas Kandagatla wrote:
> During x13s audio testing we hit few corner cases due to issues
> in codec drivers and some obvious code bugs.
> 
> Here are the fixes for those issues, mostly the issues are around
> devices loosing the sync in between runtime pm suspend resume path.
> 
> With codec fixes along with these fixes, audio on x13s is pretty stable.
> 
> Thanks,
> Srini
> 
> Changes since v1:
> 	- dropped runtime pm changes patch as unable to reproduced it anymore
> 	- fixed clk stop flag as suggested by Pierre
> 	- rebased on top of linux-next

I tried to update to this series on my 6.4-rc5 branch for the X13s and
the above changes appear to lead to breakages again.

Specifically, with the updated clk stop flag (simple_clk_stop_capable) I
see:

[   14.789533] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1
[   14.789717] qcom-soundwire 3330000.soundwire-controller: qcom_swrm_irq_handler: SWR bus clsh detected
[   14.796164] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1

and without the runtime pm patch that you dropped in v2 I get
intermittent (e.g. twice in five boots):

[   11.527301] snd-sc8280xp sound: ASoC: adding FE link failed
[   11.527409] snd-sc8280xp sound: ASoC: topology: could not load header: -517
[   11.527557] qcom-apm gprsvc:service:2:1: tplg component load failed-517
[   11.527640] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22
[   11.528079] snd-sc8280xp sound: ASoC: failed to instantiate card -22
[   11.533388] snd-sc8280xp: probe of sound failed with error -22

Again, this was with 6.4-rc5, but these problems are likely still there
also with linux-next.

Johan
Srinivas Kandagatla June 7, 2023, 9:36 a.m. UTC | #2
On 05/06/2023 10:08, Johan Hovold wrote:
> On Thu, May 25, 2023 at 02:38:08PM +0100, Srinivas Kandagatla wrote:
>> During x13s audio testing we hit few corner cases due to issues
>> in codec drivers and some obvious code bugs.
>>
>> Here are the fixes for those issues, mostly the issues are around
>> devices loosing the sync in between runtime pm suspend resume path.
>>
>> With codec fixes along with these fixes, audio on x13s is pretty stable.
>>
>> Thanks,
>> Srini
>>
>> Changes since v1:
>> 	- dropped runtime pm changes patch as unable to reproduced it anymore
>> 	- fixed clk stop flag as suggested by Pierre
>> 	- rebased on top of linux-next
> 
> I tried to update to this series on my 6.4-rc5 branch for the X13s and
> the above changes appear to lead to breakages again.
> 
These patches are based on linux next, we can not cleanly apply them on 
rc5 i guess without the depended patches.

I have tried these patches with your rc4 branch along with other 
depended patches my branch is at: 
https://git.linaro.org/people/srinivas.kandagatla/linux.git/log/?h=wip/sc8280xp-v6.4-rc4

this works fine for me, not seeing any issues so far.
tested both speakers and headset.


> Specifically, with the updated clk stop flag (simple_clk_stop_capable) I
> see:
> 
> [   14.789533] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1
> [   14.789717] qcom-soundwire 3330000.soundwire-controller: qcom_swrm_irq_handler: SWR bus clsh detected
> [   14.796164] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1
> 

> and without the runtime pm patch that you dropped in v2 I get
> intermittent (e.g. twice in five boots):
> 
> [   11.527301] snd-sc8280xp sound: ASoC: adding FE link failed
> [   11.527409] snd-sc8280xp sound: ASoC: topology: could not load header: -517
Never seen this, looks like some corrupted tplg to me.. I might be wrong.

> [   11.527557] qcom-apm gprsvc:service:2:1: tplg component load failed-517
> [   11.527640] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22
> [   11.528079] snd-sc8280xp sound: ASoC: failed to instantiate card -22
> [   11.533388] snd-sc8280xp: probe of sound failed with error -22
> 
> Again, this was with 6.4-rc5, but these problems are likely still there
> also with linux-next.

Have you tried linux-next?


--srini
> 
> Johan
Johan Hovold June 8, 2023, 10:11 a.m. UTC | #3
On Wed, Jun 07, 2023 at 10:36:40AM +0100, Srinivas Kandagatla wrote:
> On 05/06/2023 10:08, Johan Hovold wrote:
> > On Thu, May 25, 2023 at 02:38:08PM +0100, Srinivas Kandagatla wrote:
> >> During x13s audio testing we hit few corner cases due to issues
> >> in codec drivers and some obvious code bugs.
> >>
> >> Here are the fixes for those issues, mostly the issues are around
> >> devices loosing the sync in between runtime pm suspend resume path.
> >>
> >> With codec fixes along with these fixes, audio on x13s is pretty stable.

> >> Changes since v1:
> >> 	- dropped runtime pm changes patch as unable to reproduced it anymore
> >> 	- fixed clk stop flag as suggested by Pierre
> >> 	- rebased on top of linux-next
> > 
> > I tried to update to this series on my 6.4-rc5 branch for the X13s and
> > the above changes appear to lead to breakages again.
> > 
> These patches are based on linux next, we can not cleanly apply them on 
> rc5 i guess without the depended patches.

Yeah, I did the corresponding changes you did in v2 to v1 applied to
6.4-rc5 instead of backporting the dependencies.

> I have tried these patches with your rc4 branch along with other 
> depended patches my branch is at: 
> https://git.linaro.org/people/srinivas.kandagatla/linux.git/log/?h=wip/sc8280xp-v6.4-rc4
> 
> this works fine for me, not seeing any issues so far.
> tested both speakers and headset.
> 
> > Specifically, with the updated clk stop flag (simple_clk_stop_capable) I
> > see:
> > 
> > [   14.789533] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1
> > [   14.789717] qcom-soundwire 3330000.soundwire-controller: qcom_swrm_irq_handler: SWR bus clsh detected
> > [   14.796164] wcd9380-codec sdw:0:0217:010d:00:3: Slave 1 state check1: UNATTACHED, status was 1

So as we discussed of list, this particular issue was due to a mistake I
made when "backporting" your v2 where I forgot to invert the test when
replacing simple_clk_stop_capable with clk_stop_mode1.

> > and without the runtime pm patch that you dropped in v2 I get
> > intermittent (e.g. twice in five boots):
> > 
> > [   11.527301] snd-sc8280xp sound: ASoC: adding FE link failed
> > [   11.527409] snd-sc8280xp sound: ASoC: topology: could not load header: -517
> Never seen this, looks like some corrupted tplg to me.. I might be wrong.
> 
> > [   11.527557] qcom-apm gprsvc:service:2:1: tplg component load failed-517
> > [   11.527640] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22
> > [   11.528079] snd-sc8280xp sound: ASoC: failed to instantiate card -22
> > [   11.533388] snd-sc8280xp: probe of sound failed with error -22
> > 
> > Again, this was with 6.4-rc5, but these problems are likely still there
> > also with linux-next.
> 
> Have you tried linux-next?

No, not yet, but I just triggered the above once more after not having
seen with my latest -rc5 branch for a while (e.g. 20 reboots?):

[   11.430131] qcom-soundwire 3210000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
[   11.431741] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw])
[   11.431933] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw])
[   11.435406] qcom-soundwire 3330000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
[   11.449286] qcom-soundwire 3250000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
[   11.450632] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2
[   11.453155] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2
[   11.456511] wsa883x-codec sdw:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2
[   11.562623] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23
[   11.585766] snd-sc8280xp sound: ASoC: adding FE link failed
[   11.585872] snd-sc8280xp sound: ASoC: topology: could not load header: -517
[   11.586021] qcom-apm gprsvc:service:2:1: tplg component load failed-517
[   11.586100] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22
[   11.586530] snd-sc8280xp sound: ASoC: failed to instantiate card -22
[   11.591831] snd-sc8280xp: probe of sound failed with error -22

I don't think I've ever seen it before dropping the runtime PM patch as
you did in v2, and I hit it twice fairly quickly after dropping it. And
now again.

I'm not saying that the runtime PM patch is necessarily correct, and
perhaps it is just changes in timing that lead to the above, but we
definitely have a bug here.

My X13s is sitting here in the above state right now if there's some
state you want me to inspect.

This is the branch I'm using:

	https://github.com/jhovold/linux/tree/wip/sc8280xp-v6.4-rc5

with commit

	c0ab29445663 ("soundwire: qcom: enable runtime pm before controller is registered")

reverted.

Johan
Johan Hovold June 8, 2023, 12:40 p.m. UTC | #4
On Thu, Jun 08, 2023 at 12:11:45PM +0200, Johan Hovold wrote:
> On Wed, Jun 07, 2023 at 10:36:40AM +0100, Srinivas Kandagatla wrote:

> No, not yet, but I just triggered the above once more after not having
> seen with my latest -rc5 branch for a while (e.g. 20 reboots?):
> 
> [   11.430131] qcom-soundwire 3210000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
> [   11.431741] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw])
> [   11.431933] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw])
> [   11.435406] qcom-soundwire 3330000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
> [   11.449286] qcom-soundwire 3250000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
> [   11.450632] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2
> [   11.453155] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2
> [   11.456511] wsa883x-codec sdw:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2
> [   11.562623] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23
> [   11.585766] snd-sc8280xp sound: ASoC: adding FE link failed
> [   11.585872] snd-sc8280xp sound: ASoC: topology: could not load header: -517
> [   11.586021] qcom-apm gprsvc:service:2:1: tplg component load failed-517
> [   11.586100] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22
> [   11.586530] snd-sc8280xp sound: ASoC: failed to instantiate card -22
> [   11.591831] snd-sc8280xp: probe of sound failed with error -22
> 
> I don't think I've ever seen it before dropping the runtime PM patch as
> you did in v2, and I hit it twice fairly quickly after dropping it. And
> now again.
> 
> I'm not saying that the runtime PM patch is necessarily correct, and
> perhaps it is just changes in timing that lead to the above, but we
> definitely have a bug here.

I searched my notes and realised that I have seen this once also with
the runtime pm patch. So the fact that happened to see it more often
after dropping it is likely due to changes in timing.

Looking at the above log it seems like we hit a probe deferral somewhere
as some resource is not available yet, and this is eventually turned
into a hard failure that breaks audio as the error is propagated up the
stack.

Johan
Srinivas Kandagatla June 8, 2023, 12:45 p.m. UTC | #5
On 08/06/2023 13:40, Johan Hovold wrote:
> On Thu, Jun 08, 2023 at 12:11:45PM +0200, Johan Hovold wrote:
>> On Wed, Jun 07, 2023 at 10:36:40AM +0100, Srinivas Kandagatla wrote:
> 
>> No, not yet, but I just triggered the above once more after not having
>> seen with my latest -rc5 branch for a while (e.g. 20 reboots?):
>>
>> [   11.430131] qcom-soundwire 3210000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
>> [   11.431741] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:4 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw])
>> [   11.431933] wcd938x_codec audio-codec: bound sdw:0:0217:010d:00:3 (ops wcd938x_sdw_component_ops [snd_soc_wcd938x_sdw])
>> [   11.435406] qcom-soundwire 3330000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
>> [   11.449286] qcom-soundwire 3250000.soundwire-controller: Qualcomm Soundwire controller v1.6.0 Registered
>> [   11.450632] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2
>> [   11.453155] wsa883x-codec sdw:0:0217:0202:00:1: WSA883X Version 1_1, Variant: WSA8835_V2
>> [   11.456511] wsa883x-codec sdw:0:0217:0202:00:2: WSA883X Version 1_1, Variant: WSA8835_V2
>> [   11.562623] q6apm-dai 3000000.remoteproc:glink-edge:gpr:service@1:dais: Adding to iommu group 23
>> [   11.585766] snd-sc8280xp sound: ASoC: adding FE link failed
>> [   11.585872] snd-sc8280xp sound: ASoC: topology: could not load header: -517
>> [   11.586021] qcom-apm gprsvc:service:2:1: tplg component load failed-517
>> [   11.586100] qcom-apm gprsvc:service:2:1: ASoC: error at snd_soc_component_probe on gprsvc:service:2:1: -22
>> [   11.586530] snd-sc8280xp sound: ASoC: failed to instantiate card -22
>> [   11.591831] snd-sc8280xp: probe of sound failed with error -22
>>
>> I don't think I've ever seen it before dropping the runtime PM patch as
>> you did in v2, and I hit it twice fairly quickly after dropping it. And
>> now again.
>>
>> I'm not saying that the runtime PM patch is necessarily correct, and
>> perhaps it is just changes in timing that lead to the above, but we
>> definitely have a bug here.
> 
> I searched my notes and realised that I have seen this once also with
> the runtime pm patch. So the fact that happened to see it more often
> after dropping it is likely due to changes in timing.
> 
> Looking at the above log it seems like we hit a probe deferral somewhere
> as some resource is not available yet, and this is eventually turned
> into a hard failure that breaks audio as the error is propagated up the
> stack.

I was looking at this too, And I think this change should help..
Not tried it though


----------------------------------->cut<----------------------------------
diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
index cdabfe2feceb..b57532d6b163 100644
--- a/drivers/soundwire/qcom.c
+++ b/drivers/soundwire/qcom.c
@@ -543,7 +543,7 @@ static int qcom_swrm_enumerate(struct sdw_bus *bus)
  		list_for_each_entry_safe(slave, _s, &bus->slaves, node) {
  			if (sdw_compare_devid(slave, id) == 0) {
  				qcom_swrm_set_slave_dev_num(bus, slave, i);
-				if (!slave->prop.simple_clk_stop_capable)
+				if (slave->prop.clk_stop_mode1)
  					ctrl->clock_stop_not_supported = true;

  				found = true;
diff --git a/sound/soc/qcom/qdsp6/topology.c 
b/sound/soc/qcom/qdsp6/topology.c
index cccc59b570b9..3c50a560bc84 100644
--- a/sound/soc/qcom/qdsp6/topology.c
+++ b/sound/soc/qcom/qdsp6/topology.c
@@ -1276,10 +1276,8 @@ int audioreach_tplg_init(struct snd_soc_component 
*component)
  	}

  	ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw);
-	if (ret < 0) {
+	if (ret < 0)
  		dev_err(dev, "tplg component load failed%d\n", ret);
-		ret = -EINVAL;
-	}

  	release_firmware(fw);
  err:

----------------------------------->cut<----------------------------------

> 
> Johan
Johan Hovold June 8, 2023, 1:05 p.m. UTC | #6
On Thu, Jun 08, 2023 at 01:45:22PM +0100, Srinivas Kandagatla wrote:
> On 08/06/2023 13:40, Johan Hovold wrote:

> > Looking at the above log it seems like we hit a probe deferral somewhere
> > as some resource is not available yet, and this is eventually turned
> > into a hard failure that breaks audio as the error is propagated up the
> > stack.
> 
> I was looking at this too, And I think this change should help..
> Not tried it though

>   	ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw);
> -	if (ret < 0) {
> +	if (ret < 0)
>   		dev_err(dev, "tplg component load failed%d\n", ret);
> -		ret = -EINVAL;
> -	}

That in itself only seems to make the problem worse with new follow-on
errors. Looks like a dereference of an error pointer in a codec driver:

[   12.086999] snd-sc8280xp sound: ASoC: adding FE link failed
[   12.087231] snd-sc8280xp sound: ASoC: topology: could not load header: -517
[   12.087547] qcom-apm gprsvc:service:2:1: tplg component load failed-517
[   12.088865] snd-sc8280xp sound: ASoC: failed to instantiate card -517
[   12.117093] genirq: Flags mismatch irq 289. 00002001 (HPHR PDM WD INT) vs. 00002001 (HPHR PDM WD INT)
[   12.117504] wcd938x_codec audio-codec: Failed to request HPHR WD interrupt (-16)
[   12.117664] genirq: Flags mismatch irq 290. 00002001 (HPHL PDM WD INT) vs. 00002001 (HPHL PDM WD INT)
[   12.117861] wcd938x_codec audio-codec: Failed to request HPHL WD interrupt (-16)
[   12.118010] genirq: Flags mismatch irq 291. 00002001 (AUX PDM WD INT) vs. 00002001 (AUX PDM WD INT)
[   12.118197] wcd938x_codec audio-codec: Failed to request Aux WD interrupt (-16)
[   12.118579] genirq: Flags mismatch irq 292. 00002001 (mbhc sw intr) vs. 00002001 (mbhc sw intr)
[   12.118763] wcd938x_codec audio-codec: Failed to request mbhc interrupts -16
[   12.122415] snd-sc8280xp sound: ASoC: Parent card not yet available, widget card binding deferred
[   12.126740] Unable to handle kernel paging request at virtual address fffffffffffffff8
[   12.126856] Mem abort info:
[   12.126903]   ESR = 0x0000000096000004
[   12.126961]   EC = 0x25: DABT (current EL), IL = 32 bits
[   12.127036]   SET = 0, FnV = 0
[   12.127085]   EA = 0, S1PTW = 0
[   12.127135]   FSC = 0x04: level 0 translation fault
[   12.127205] Data abort info:
[   12.127250]   ISV = 0, ISS = 0x00000004
[   12.127308]   CM = 0, WnR = 0
[   12.127356] swapper pgtable: 4k pages, 48-bit VAs, pgdp=000000028e41b000
[   12.127445] [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
[   12.127624] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[   12.127674] Modules linked in: q6apm_dai q6apm_lpass_dais snd_soc_wsa883x q6prm_clocks snd_q6dsp_common q6prm michael_mic cbc des_generic libdes ecb algif_skcipher md5 algif_hash af_alg ip6_tables xt_LOG nf_log_syslog ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_conntrack snd_q6apm nf_conntrack libcrc32c nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter r8152 mii qrtr_mhi panel_edp snd_soc_hdmi_codec venus_dec venus_enc fastrpc gpio_sbu_mux rpmsg_ctrl apr videobuf2_dma_contig rpmsg_char qrtr_smd qcom_spmi_adc_tm5 videobuf2_memops snd_soc_lpass_va_macro snd_soc_lpass_tx_macro snd_soc_lpass_wsa_macro snd_soc_lpass_rx_macro snd_soc_lpass_macro_common ath11k_pci qcom_pm8008_regulator ath11k mac80211 venus_core pmic_glink_altmode qcom_battmgr msm libarc4 v4l2_mem2mem hci_uart videobuf2_v4l2 gpu_sched btqca videodev bluetooth drm_display_helper cfg80211 phy_qcom_qmp_combo videobuf2_common leds_qcom_lpg qcom_spmi_temp_alarm drm_dp_aux_bus ecdh_generic qcom_spmi_adc5 drm_kms_helper mc ecc led_class_multicolor rtc_pm8xxx mhi industrialio
[   12.127941]  qcom_pon rfkill syscopyarea snd_soc_sc8280xp reboot_mode qcom_vadc_common sysfillrect nvmem_qcom_spmi_sdam qcom_pm8008 snd_soc_wcd938x sysimgblt snd_soc_qcom_common snd_soc_qcom_sdw videocc_sc8280xp typec phy_qcom_edp qcom_stats regmap_i2c snd_soc_wcd938x_sdw phy_qcom_qmp_usb pinctrl_sc8280xp_lpass_lpi phy_qcom_snps_femto_v2 qcom_q6v5_pas icc_bwmon soundwire_qcom regmap_sdw pinctrl_lpass_lpi snd_soc_wcd_mbhc lpasscc_sc8280xp qcom_pil_info snd_soc_core qcom_common snd_compress qcom_glink_smem qcom_q6v5 snd_pcm qcom_sysmon snd_timer icc_osm_l3 mdt_loader qrtr qcom_wdt qcom_rng snd pmic_glink pdr_interface soundcore socinfo qmi_helpers pwm_bl soundwire_bus drm dm_mod ip_tables x_tables ipv6 pcie_qcom crc8 phy_qcom_qmp_pcie nvme nvme_core hid_multitouch i2c_qcom_geni i2c_hid_of i2c_hid i2c_core
[   12.129101] CPU: 1 PID: 158 Comm: kworker/u16:7 Not tainted 6.4.0-rc5 #62
[   12.129153] Hardware name: LENOVO 21BYZ9SRUS/21BYZ9SRUS, BIOS N3HET53W (1.25 ) 10/12/2022
[   12.129209] Workqueue: events_unbound deferred_probe_work_func
[   12.129267] pstate: a0400005 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   12.129319] pc : wcd_mbhc_start+0x28/0x380 [snd_soc_wcd_mbhc]
[   12.129381] lr : wcd938x_codec_set_jack+0x28/0x48 [snd_soc_wcd938x]
[   12.129445] sp : ffff80000a523950
[   12.129471] x29: ffff80000a523950 x28: ffff408fd176f080 x27: ffff408fd176c880
[   12.129530] x26: ffffac710f2c0b20 x25: 0002000000000000 x24: 00000000fffffdf4
[   12.129588] x23: ffff408fd176c880 x22: ffff408fd176d518 x21: ffff408fd176f080
[   12.129645] x20: ffff408fc5c30c80 x19: fffffffffffffff0 x18: 0000000000000020
[   12.129701] x17: 0000000000000000 x16: ffffac710e565dac x15: ffffffffffffffff
[   12.129758] x14: ffff40905779741b x13: ffffffffffffffff x12: 0000000000000000
[   12.129814] x11: ffff409b340586f0 x10: ffffac710fbcf1e0 x9 : 0000000000000000
[   12.129870] x8 : ffff408fd9288700 x7 : 0000000000000000 x6 : ffff408fd77891a0
[   12.129926] x5 : ffff408fd176f750 x4 : 0000000000000000 x3 : ffffac70c8662310
[   12.132271] x2 : ffff408fd176d518 x1 : ffff408fc3480158 x0 : fffffffffffffff0
[   12.133443] Call trace:
[   12.134003]  wcd_mbhc_start+0x28/0x380 [snd_soc_wcd_mbhc]
[   12.134567]  wcd938x_codec_set_jack+0x28/0x48 [snd_soc_wcd938x]
[   12.135131]  snd_soc_component_set_jack+0x28/0x8c [snd_soc_core]
[   12.135709]  qcom_snd_wcd_jack_setup+0x7c/0x19c [snd_soc_qcom_common]
[   12.136278]  sc8280xp_snd_init+0x20/0x2c [snd_soc_sc8280xp]
[   12.137080]  snd_soc_link_init+0x28/0x90 [snd_soc_core]
[   12.139427]  snd_soc_bind_card+0x628/0xbfc [snd_soc_core]
[   12.141761]  snd_soc_register_card+0xec/0x104 [snd_soc_core]
[   12.144082]  devm_snd_soc_register_card+0x4c/0xa4 [snd_soc_core]
[   12.146402]  sc8280xp_platform_probe+0xf0/0x108 [snd_soc_sc8280xp]
[   12.148694]  platform_probe+0x68/0xd8
[   12.150969]  really_probe+0x184/0x3c8
[   12.153257]  __driver_probe_device+0x7c/0x16c
[   12.155532]  driver_probe_device+0x3c/0x110
[   12.157787]  __device_attach_driver+0xbc/0x158
[   12.160056]  bus_for_each_drv+0x84/0xe0
[   12.162304]  __device_attach+0xa8/0x1d4
[   12.164518]  device_initial_probe+0x14/0x20
[   12.166716]  bus_probe_device+0xb0/0xb4
[   12.168924]  deferred_probe_work_func+0xa0/0xf4
[   12.171100]  process_one_work+0x288/0x5bc
[   12.173267]  worker_thread+0x74/0x450
[   12.175409]  kthread+0x124/0x128
[   12.177555]  ret_from_fork+0x10/0x20
[   12.179683] Code: fa401804 54001ae0 a9025bf5 aa0003f3 (f9400415) 
[   12.181815] ---[ end trace 0000000000000000 ]---

Johan
Srinivas Kandagatla June 8, 2023, 2:23 p.m. UTC | #7
On 08/06/2023 14:05, Johan Hovold wrote:
> On Thu, Jun 08, 2023 at 01:45:22PM +0100, Srinivas Kandagatla wrote:
>> On 08/06/2023 13:40, Johan Hovold wrote:
> 
>>> Looking at the above log it seems like we hit a probe deferral somewhere
>>> as some resource is not available yet, and this is eventually turned
>>> into a hard failure that breaks audio as the error is propagated up the
>>> stack.
>>
>> I was looking at this too, And I think this change should help..
>> Not tried it though
> 
>>    	ret = snd_soc_tplg_component_load(component, &audioreach_tplg_ops, fw);
>> -	if (ret < 0) {
>> +	if (ret < 0)
>>    		dev_err(dev, "tplg component load failed%d\n", ret);
>> -		ret = -EINVAL;
>> -	}
> 
> That in itself only seems to make the problem worse with new follow-on
> errors. Looks like a dereference of an error pointer in a codec driver:

Thanks Johan,

I will try to see if we can fix this properly.

--srini
Johan Hovold July 10, 2023, 8:17 a.m. UTC | #8
On Thu, May 25, 2023 at 02:38:09PM +0100, Srinivas Kandagatla wrote:
> SoundWire device status can be incorrectly updated without
> proper mask, fix this by adding a mask before updating the status.
> 
> Fixes: c7d49c76d1d5 ("soundwire: qcom: add support to new interrupts")
> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
> ---

When rebasing on 6.5-rc1, I noticed that this one was apparently never
merged along with the rest of the series.

Any idea how this could have happened?

And can we get this one into 6.5 as well?

>  drivers/soundwire/qcom.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/soundwire/qcom.c b/drivers/soundwire/qcom.c
> index aad5942e5980..9440787e924b 100644
> --- a/drivers/soundwire/qcom.c
> +++ b/drivers/soundwire/qcom.c
> @@ -515,7 +515,7 @@ static int qcom_swrm_get_alert_slave_dev_num(struct qcom_swrm_ctrl *ctrl)
>  		status = (val >> (dev_num * SWRM_MCP_SLV_STATUS_SZ));
>  
>  		if ((status & SWRM_MCP_SLV_STATUS_MASK) == SDW_SLAVE_ALERT) {
> -			ctrl->status[dev_num] = status;
> +			ctrl->status[dev_num] = status & SWRM_MCP_SLV_STATUS_MASK;
>  			return dev_num;
>  		}
>  	}

Johan