[v2,0/9] sound: Use -EPROBE_DEFER instead of i915 module loading.

Message ID 20230719164141.228073-1-maarten.lankhorst@linux.intel.com

Message

Maarten Lankhorst July 19, 2023, 4:41 p.m. UTC
Explicitly loading i915 becomes a problem when upstreaming the new Intel driver
for Tiger Lake and higher graphics (xe). Because the sound driver loads i915
itself, it doesn't wait for xe to load, and fails completely before xe is ready.

-EPROBE_DEFER has to be returned before any device is created in probe(),
otherwise the removal of the device will cause EPROBE_DEFER to try again
in an infinite loop.
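
The ordering constraint above can be illustrated with a small userspace model.
This is only a sketch: fake_dev, fake_probe and fake_driver_core are
hypothetical names, not kernel APIs, and the retry loop is a simplified
stand-in for the driver core's deferred-probe handling. The point it shows is
that the defer check happens before anything is created, so a deferral leaves
nothing to tear down.

```c
#include <assert.h>
#include <stdbool.h>

#define EPROBE_DEFER 517	/* same numeric value the kernel uses */

/* Hypothetical stand-in for real driver state. */
struct fake_dev {
	bool gpu_bound;		/* has the GPU audio component bound yet? */
	int children_created;	/* child devices currently registered */
	int total_creations;	/* how often children were ever created */
};

/* Defer first, create devices afterwards. */
static int fake_probe(struct fake_dev *dev)
{
	if (!dev->gpu_bound)
		return -EPROBE_DEFER;	/* nothing created, nothing to undo */

	dev->children_created++;
	dev->total_creations++;
	return 0;
}

/* Toy driver core: retry probe until it stops deferring or we give up. */
static int fake_driver_core(struct fake_dev *dev, int bind_after, int max_tries)
{
	int ret = -EPROBE_DEFER;
	int i;

	for (i = 0; i < max_tries && ret == -EPROBE_DEFER; i++) {
		if (i == bind_after)
			dev->gpu_bound = true;	/* GPU driver finished loading */
		ret = fake_probe(dev);
	}

	/* On success, children were created exactly once. */
	assert(ret != 0 || dev->children_created == 1);
	return ret;
}
```

In this model the deferred attempts never touch children_created, which is the
property the real probe() functions need in order to avoid the retry loop
described above.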

The conversion is done in gradual steps. First I add an argument to
snd_hdac_i915_init to allow for -EPROBE_DEFER so I can convert each driver
separately. Then I convert each driver to move snd_hdac_i915_init out of the
workqueue. Finally I drop the ability to choose modprobe behavior after the
last user is converted.
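
As a rough illustration of the transitional step, here is a userspace sketch
of an init function that takes the extra argument. The names fake_bus and
fake_i915_init, the modprobe_can_bind field and the -ENODEV return on the
legacy path are assumptions made for the example; the real
snd_hdac_i915_init is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

#define EPROBE_DEFER 517
#define ENODEV 19

/* Hypothetical model state; not the real hdac structures. */
struct fake_bus {
	bool gpu_bound;		/* GPU audio component already bound */
	bool modprobe_can_bind;	/* would loading the GPU module bind it? */
	int modprobe_calls;	/* how often the legacy path was taken */
};

/* Transitional form: each caller chooses the behaviour explicitly. */
static int fake_i915_init(struct fake_bus *bus, bool allow_modprobe)
{
	assert(bus);

	if (bus->gpu_bound)
		return 0;

	if (!allow_modprobe)
		return -EPROBE_DEFER;	/* converted caller: let the core retry */

	/* Unconverted caller: legacy path that loads the GPU module itself. */
	bus->modprobe_calls++;
	bus->gpu_bound = bus->modprobe_can_bind;
	return bus->gpu_bound ? 0 : -ENODEV;
}
```

Once every caller passes false, the flag and the legacy branch can be dropped,
which corresponds to the last patch in the series.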

I suspect the avs and skylake drivers used snd_hdac_i915_init purely for the
modprobe, but I don't have the hardware to test if it can be safely removed.
It can still be done easily in a followup patch to simplify probing.

---
New since first version:

- snd_hda_core.gpu_bind is added as a mechanism to force GPU binding,
  for testing. snd_hda_core.gpu_bind=0 disables waiting for the GPU to
  bind, snd_hda_core.gpu_bind=1 forces waiting for the GPU to bind. The
  default depends on whether the kernel was booted with nomodeset.
- Incorporated all review feedback.
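
The gpu_bind semantics can be sketched as a small decision function. This is
a userspace model under two assumptions that the text above does not state:
that the automatic default is encoded as -1, and that booting with nomodeset
makes the default "do not wait". should_wait_for_gpu is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: -1 = automatic default, 0 = never wait, 1 = always wait. */
static bool should_wait_for_gpu(int gpu_bind, bool booted_with_nomodeset)
{
	assert(gpu_bind >= -1 && gpu_bind <= 1);

	if (gpu_bind < 0)
		return !booted_with_nomodeset;	/* default follows nomodeset */

	return gpu_bind != 0;	/* explicit override wins */
}
```

An explicit 0 or 1 overrides the nomodeset check in this model, which matches
the stated purpose of forcing either behaviour for testing.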

Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: Cezary Rojewski <cezary.rojewski@intel.com>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Cc: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Cc: Bard Liao <yung-chuan.liao@linux.intel.com>
Cc: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Cc: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Daniel Baluta <daniel.baluta@nxp.com>
Cc: alsa-devel@alsa-project.org
Cc: linux-kernel@vger.kernel.org
Cc: sound-open-firmware@alsa-project.org

Maarten Lankhorst (9):
  ALSA: hda/intel: Fix error handling in azx_probe()
  ALSA: hda/i915: Allow override of gpu binding.
  ALSA: hda/i915: Add an allow_modprobe argument to snd_hdac_i915_init
  ALSA: hda/i915: Allow xe as match for i915_component_master_match
  ASoC: Intel: avs: Move snd_hdac_i915_init to before probe_work.
  ASoC: Intel: Skylake: Move snd_hdac_i915_init to before probe_work.
  ALSA: hda/intel: Move snd_hdac_i915_init to before probe_work.
  ASoC: SOF: Intel: Remove deferred probe for SOF
  ALSA: hda/i915: Remove extra argument from snd_hdac_i915_init

 sound/hda/hdac_i915.c         | 25 ++++++++-------
 sound/pci/hda/hda_intel.c     | 60 ++++++++++++++++++-----------------
 sound/soc/intel/avs/core.c    | 13 +++++---
 sound/soc/intel/skylake/skl.c | 31 ++++++------------
 sound/soc/sof/Kconfig         | 19 -----------
 sound/soc/sof/core.c          | 38 ++--------------------
 sound/soc/sof/intel/Kconfig   |  1 -
 sound/soc/sof/intel/hda.c     | 32 +++++++++++--------
 sound/soc/sof/sof-pci-dev.c   |  3 +-
 sound/soc/sof/sof-priv.h      |  5 ---
 10 files changed, 85 insertions(+), 142 deletions(-)

Comments

Kai Vehmanen July 21, 2023, 10:06 a.m. UTC | #1
Hi,

On Wed, 19 Jul 2023, Maarten Lankhorst wrote:

> Explicitly loading i915 becomes a problem when upstreaming the new intel driver
> for Tiger Lake and higher graphics (xe). By loading i915, it doesn't wait for
> driver load of xe, and will fail completely before it loads.
> 
> -EPROBE_DEFER has to be returned before any device is created in probe(),
> otherwise the removal of the device will cause EPROBE_DEFER to try again
> in an infinite loop.

Thanks, the series looks good to me now. We'll need to adopt the new gpu_bind
parameter in a number of CI systems (where we test without i915/xe), but
this looks perfectly doable.

I'll give my

Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>

... for the hdac_i915.c changes. For AVS and SOF, I'd ask for some
more review time to allow Cezary, Pierre et al. to weigh in. I don't
personally recall, for example, where we've used
CONFIG_SOF_FORCE_PROBE_WORKQUEUE, or whether we have grounds to keep it
even if the workqueue is no longer set for HDA codec support.

Br, Kai

Peter Ujfalusi July 21, 2023, 11:34 a.m. UTC | #2
On 19/07/2023 19:41, Maarten Lankhorst wrote:
> Add missing pci_set_drvdata(pci, NULL) call on error.
> 
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> ---
>  sound/pci/hda/hda_intel.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> index ef831770ca7da..0d2d6bc6c75ef 100644
> --- a/sound/pci/hda/hda_intel.c
> +++ b/sound/pci/hda/hda_intel.c
> @@ -2188,6 +2188,7 @@ static int azx_probe(struct pci_dev *pci,
>  	return 0;
>  
>  out_free:
> +	pci_set_drvdata(pci, NULL);
The original patch added this:
f4c482a4d0b3 ("ALSA: hda - Fix yet another race of vga_switcheroo registration")

but got removed later by:
20a24225d8f9 ("ALSA: PCI: Remove superfluous pci_set_drvdata(pci, NULL) at remove")

and partially added back (to azx_remove) by:
e81478bbe7a1 ("ALSA: hda: fix general protection fault in azx_runtime_idle")

I guess it should do no harm to add it back...

>  	snd_card_free(card);
>  	return err;
>  }

Reviewed-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>

Peter Ujfalusi July 21, 2023, 12:19 p.m. UTC | #3
Hi Maarten,

On 19/07/2023 19:41, Maarten Lankhorst wrote:
> Explicitly loading i915 becomes a problem when upstreaming the new intel driver
> for Tiger Lake and higher graphics (xe). By loading i915, it doesn't wait for
> driver load of xe, and will fail completely before it loads.
> 
> -EPROBE_DEFER has to be returned before any device is created in probe(),
> otherwise the removal of the device will cause EPROBE_DEFER to try again
> in an infinite loop.
> 
> The conversion is done in gradual steps. First I add an argument to
> snd_hdac_i915_init to allow for -EPROBE_DEFER so I can convert each driver
> separately. Then I convert each driver to move snd_hdac_i915_init out of the
> workqueue. Finally I drop the ability to choose modprobe behavior after the
> last user is converted.
> 
> I suspect the avs and skylake drivers used snd_hdac_i915_init purely for the
> modprobe, but I don't have the hardware to test if it can be safely removed.
> It can still be done easily in a followup patch to simplify probing.

Apart from the few comments I had, this looks great and works OK on the
machines I have tested (iow, no regression so far).

Thank you for the work!

Pierre-Louis Bossart July 24, 2023, 10:15 a.m. UTC | #4
On 7/21/23 13:34, Péter Ujfalusi wrote:
> 
> 
> On 19/07/2023 19:41, Maarten Lankhorst wrote:
>> Add missing pci_set_drvdata(pci, NULL) call on error.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> ---
>>  sound/pci/hda/hda_intel.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
>> index ef831770ca7da..0d2d6bc6c75ef 100644
>> --- a/sound/pci/hda/hda_intel.c
>> +++ b/sound/pci/hda/hda_intel.c
>> @@ -2188,6 +2188,7 @@ static int azx_probe(struct pci_dev *pci,
>>  	return 0;
>>  
>>  out_free:
>> +	pci_set_drvdata(pci, NULL);
> The original patch added this:
> f4c482a4d0b3 ("ALSA: hda - Fix yet another race of vga_switcheroo registration")
> 
> but got removed later by:
> 20a24225d8f9 ("ALSA: PCI: Remove superfluous pci_set_drvdata(pci, NULL) at remove")
> 
> and partially added back (to azx_remove) by:
> e81478bbe7a1 ("ALSA: hda: fix general protection fault in azx_runtime_idle")
> 
> I guess it should do no harm to add it back...
> 
>>  	snd_card_free(card);
>>  	return err;
>>  }
> 
> Reviewed-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>

Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

Takashi Iwai July 31, 2023, 3:51 p.m. UTC | #5
On Wed, 19 Jul 2023 18:41:32 +0200,
Maarten Lankhorst wrote:
> 
> Explicitly loading i915 becomes a problem when upstreaming the new intel driver
> for Tiger Lake and higher graphics (xe). By loading i915, it doesn't wait for
> driver load of xe, and will fail completely before it loads.
> 
> -EPROBE_DEFER has to be returned before any device is created in probe(),
> otherwise the removal of the device will cause EPROBE_DEFER to try again
> in an infinite loop.
> 
> The conversion is done in gradual steps. First I add an argument to
> snd_hdac_i915_init to allow for -EPROBE_DEFER so I can convert each driver
> separately. Then I convert each driver to move snd_hdac_i915_init out of the
> workqueue. Finally I drop the ability to choose modprobe behavior after the
> last user is converted.
> 
> I suspect the avs and skylake drivers used snd_hdac_i915_init purely for the
> modprobe, but I don't have the hardware to test if it can be safely removed.
> It can still be done easily in a followup patch to simplify probing.
> 
> ---
> New since first version:
> 
> - snd_hda_core.gpu_bind is added as a mechanism to force gpu binding,
>   for testing. snd_hda_core.gpu_bind=0 forces waiting for GPU bind to
>   off, snd_hda_core.gpu_bind=1 forces waiting for gpu bind. Default
>   setting depends on whether kernel booted with nomodeset.
> - Incorporated all feedback review.

Maarten, are you working on v3 patch set?
Or, for moving forward, should we merge v2 now and fix the rest based
on that later?


thanks,

Takashi

Maarten Lankhorst July 31, 2023, 4:37 p.m. UTC | #6
Hey,

Den 2023-07-31 kl. 17:51, skrev Takashi Iwai:
> On Wed, 19 Jul 2023 18:41:32 +0200,
> Maarten Lankhorst wrote:
>> Explicitly loading i915 becomes a problem when upstreaming the new intel driver
>> for Tiger Lake and higher graphics (xe). By loading i915, it doesn't wait for
>> driver load of xe, and will fail completely before it loads.
>>
>> -EPROBE_DEFER has to be returned before any device is created in probe(),
>> otherwise the removal of the device will cause EPROBE_DEFER to try again
>> in an infinite loop.
>>
>> The conversion is done in gradual steps. First I add an argument to
>> snd_hdac_i915_init to allow for -EPROBE_DEFER so I can convert each driver
>> separately. Then I convert each driver to move snd_hdac_i915_init out of the
>> workqueue. Finally I drop the ability to choose modprobe behavior after the
>> last user is converted.
>>
>> I suspect the avs and skylake drivers used snd_hdac_i915_init purely for the
>> modprobe, but I don't have the hardware to test if it can be safely removed.
>> It can still be done easily in a followup patch to simplify probing.
>>
>> ---
>> New since first version:
>>
>> - snd_hda_core.gpu_bind is added as a mechanism to force gpu binding,
>>    for testing. snd_hda_core.gpu_bind=0 forces waiting for GPU bind to
>>    off, snd_hda_core.gpu_bind=1 forces waiting for gpu bind. Default
>>    setting depends on whether kernel booted with nomodeset.
>> - Incorporated all feedback review.
> Maarten, are you working on v3 patch set?
> Or, for moving forward, should we merge v2 now and fix the rest based
> on that later?

I've been working on a small change to keep the workqueue in SOF and 
only move the binding to the probe function to match what snd-hda-intel 
is doing, but I don't know if that is needed?

It was a bit unclear to me based on feedback if I should try to kill the 
workqueue on all drivers (but with no way to test), or keep it around.

Cheers,

~Maarten

Takashi Iwai July 31, 2023, 7:32 p.m. UTC | #7
On Mon, 31 Jul 2023 18:37:36 +0200,
Maarten Lankhorst wrote:
> 
> Hey,
> 
> Den 2023-07-31 kl. 17:51, skrev Takashi Iwai:
> > On Wed, 19 Jul 2023 18:41:32 +0200,
> > Maarten Lankhorst wrote:
> >> Explicitly loading i915 becomes a problem when upstreaming the new intel driver
> >> for Tiger Lake and higher graphics (xe). By loading i915, it doesn't wait for
> >> driver load of xe, and will fail completely before it loads.
> >> 
> >> -EPROBE_DEFER has to be returned before any device is created in probe(),
> >> otherwise the removal of the device will cause EPROBE_DEFER to try again
> >> in an infinite loop.
> >> 
> >> The conversion is done in gradual steps. First I add an argument to
> >> snd_hdac_i915_init to allow for -EPROBE_DEFER so I can convert each driver
> >> separately. Then I convert each driver to move snd_hdac_i915_init out of the
> >> workqueue. Finally I drop the ability to choose modprobe behavior after the
> >> last user is converted.
> >> 
> >> I suspect the avs and skylake drivers used snd_hdac_i915_init purely for the
> >> modprobe, but I don't have the hardware to test if it can be safely removed.
> >> It can still be done easily in a followup patch to simplify probing.
> >> 
> >> ---
> >> New since first version:
> >> 
> >> - snd_hda_core.gpu_bind is added as a mechanism to force gpu binding,
> >>    for testing. snd_hda_core.gpu_bind=0 forces waiting for GPU bind to
> >>    off, snd_hda_core.gpu_bind=1 forces waiting for gpu bind. Default
> >>    setting depends on whether kernel booted with nomodeset.
> >> - Incorporated all feedback review.
> > Maarten, are you working on v3 patch set?
> > Or, for moving forward, should we merge v2 now and fix the rest based
> > on that later?
> 
> I've been working on a small change to keep the workqueue in SOF and
> only move the binding to the probe function to match what
> snd-hda-intel is doing, but I don't know if that is needed?
> 
> It was a bit unclear to me based on feedback if I should try to kill
> the workqueue on all drivers (but with no way to test), or keep it
> around.

I guess it's still safer to keep the workqueue in many drivers.  There
can be modprobe or firmware loading at any later stage.
We can get rid of the workqueue later, once we have confirmed that it's
really safe.

So, if you can work on the patch set in that regard, it's fine, I can
wait for it.


thanks,

Takashi

Maarten Lankhorst Aug. 1, 2023, 7:27 a.m. UTC | #8
Hey,

Den 2023-07-31 kl. 21:32, skrev Takashi Iwai:
> On Mon, 31 Jul 2023 18:37:36 +0200,
> Maarten Lankhorst wrote:
>> Hey,
>>
>> Den 2023-07-31 kl. 17:51, skrev Takashi Iwai:
>>> On Wed, 19 Jul 2023 18:41:32 +0200,
>>> Maarten Lankhorst wrote:
>>>> Explicitly loading i915 becomes a problem when upstreaming the new intel driver
>>>> for Tiger Lake and higher graphics (xe). By loading i915, it doesn't wait for
>>>> driver load of xe, and will fail completely before it loads.
>>>>
>>>> -EPROBE_DEFER has to be returned before any device is created in probe(),
>>>> otherwise the removal of the device will cause EPROBE_DEFER to try again
>>>> in an infinite loop.
>>>>
>>>> The conversion is done in gradual steps. First I add an argument to
>>>> snd_hdac_i915_init to allow for -EPROBE_DEFER so I can convert each driver
>>>> separately. Then I convert each driver to move snd_hdac_i915_init out of the
>>>> workqueue. Finally I drop the ability to choose modprobe behavior after the
>>>> last user is converted.
>>>>
>>>> I suspect the avs and skylake drivers used snd_hdac_i915_init purely for the
>>>> modprobe, but I don't have the hardware to test if it can be safely removed.
>>>> It can still be done easily in a followup patch to simplify probing.
>>>>
>>>> ---
>>>> New since first version:
>>>>
>>>> - snd_hda_core.gpu_bind is added as a mechanism to force gpu binding,
>>>>     for testing. snd_hda_core.gpu_bind=0 forces waiting for GPU bind to
>>>>     off, snd_hda_core.gpu_bind=1 forces waiting for gpu bind. Default
>>>>     setting depends on whether kernel booted with nomodeset.
>>>> - Incorporated all feedback review.
>>> Maarten, are you working on v3 patch set?
>>> Or, for moving forward, should we merge v2 now and fix the rest based
>>> on that later?
>> I've been working on a small change to keep the workqueue in SOF and
>> only move the binding to the probe function to match what
>> snd-hda-intel is doing, but I don't know if that is needed?
>>
>> It was a bit unclear to me based on feedback if I should try to kill
>> the workqueue on all drivers (but with no way to test), or keep it
>> around.
> I guess it's still safer to keep the workqueue in many drivers.  There
> can be modprobe or firmware loading at any later stage.
> We can get rid of the workqueue once after confirming that it's really
> safe, too.
>
> So, if you can work on the patch set in that regard, it's fine, I can
> wait for it.

I've finished that patch, but it caused regressions (oops) while 
rebooting. I think it's safer to kill the workqueue for SOC, and then 
convert all other drivers later.

Cheers,

~Maarten

Pierre-Louis Bossart Aug. 1, 2023, 4:32 p.m. UTC | #9
> I've been working on a small change to keep the workqueue in SOF and
> only move the binding to the probe function to match what snd-hda-intel
> is doing, but I don't know if that is needed?
> 
> It was a bit unclear to me based on feedback if I should try to kill the
> workqueue on all drivers (but with no way to test), or keep it around.

My understanding is that we only want to move the binding to the probe
function and leave the workqueue removal for another day - possibly never.

Mark Brown Aug. 4, 2023, 11:59 a.m. UTC | #10
On Fri, Aug 04, 2023 at 12:47:54PM +0200, Maarten Lankhorst wrote:
> On 2023-08-01 18:32, Pierre-Louis Bossart wrote:

> This mail can be applied with git am -c.
> ------8<---------
> Now that we can use -EPROBE_DEFER, it's no longer required to spin off

Don't do this, it breaks my automation and means I very nearly just
skipped the patch entirely since it looked like the middle of some x86
discussion.

Maarten Lankhorst Aug. 4, 2023, 2:31 p.m. UTC | #11
Hey,

Den 2023-08-04 kl. 13:59, skrev Mark Brown:
> On Fri, Aug 04, 2023 at 12:47:54PM +0200, Maarten Lankhorst wrote:
>> On 2023-08-01 18:32, Pierre-Louis Bossart wrote:
>> This mail can be applied with git am -c.
>> ------8<---------
>> Now that we can use -EPROBE_DEFER, it's no longer required to spin off
> Don't do this, it breaks my automation and means I very nearly just
> skipped the patch entirely since it looked like the middle of some x86
> discussion.

Yeah, it's replacing the patch from earlier. I can resend, but that means
having to add all the acks, r-b's, etc. :)

If you have scripts that do that, all the better.

Cheers,

~Maarten

Mark Brown Aug. 4, 2023, 2:34 p.m. UTC | #12
On Fri, Aug 04, 2023 at 04:31:21PM +0200, Maarten Lankhorst wrote:
> Den 2023-08-04 kl. 13:59, skrev Mark Brown:

> > > On 2023-08-01 18:32, Pierre-Louis Bossart wrote:
> > > This mail can be applied with git am -c.
> > > ------8<---------

> > > Now that we can use -EPROBE_DEFER, it's no longer required to spin off
> > Don't do this, it breaks my automation and means I very nearly just
> > skipped the patch entirely since it looked like the middle of some x86
> > discussion.

> Yeah, it's replacing the patch from earlier. I can resend, but means having
> to add all acks, r-b'd, etc. :)

*Definitely* do not do that:

Please don't send new patches in reply to old patches or series, this
makes it harder for both people and tools to understand what is going
on - it can bury things in mailboxes and make it difficult to keep track
of what current patches are, both for the new patches and the old ones.

> If you have scripts that do that, all the better.

If you're using b4 then b4 trailers --update.