Message ID | 20210817190057.255264-1-pierre-louis.bossart@linux.intel.com |
---|---|
Series | driver core: kick deferred probe from delayed context |
On Tue, Aug 17, 2021 at 02:00:56PM -0500, Pierre-Louis Bossart wrote:
> The premise of the deferred probe implementation is that a successful driver binding is a proxy for the resources provided by this driver becoming available. While this is a correct assumption in most of the cases, there are exceptions to the rule such as
>
> a) the use of request_firmware_nowait(). In this case, the resources may become available when the 'cont' callback completes, for example if the firmware needs to be downloaded and executed on a SoC core or DSP.
>
> b) a split implementation of the probe with a workqueue when one or more request_module() calls are required: a synchronous probe prevents other drivers from probing, impacting boot time, and an async probe is not allowed to avoid a deadlock. This is the case on all Intel audio platforms, with request_module() being required for the i915 display audio and HDaudio external codecs.
>
> In these cases, there is no way to notify the deferred probe infrastructure of the enablement of resources after the driver binding.

Then just wait for it to happen naturally?

> The driver_deferred_probe_trigger() function is currently used 'anytime a driver is successfully bound to a device', this patch suggests exporting it so that drivers can kick off re-probing of deferred devices at the end of a deferred processing.

I really do not want to export this as it will get really messy very quickly with different drivers/busses attempting to call this.

Either handle it in your driver (why do you have to defer probe at all, just succeed and move on to register the needed stuff after you are initialized) or rely on the driver core here.

thanks,

greg k-h

On Wed, Aug 18, 2021 at 07:44:39AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Aug 17, 2021 at 02:00:56PM -0500, Pierre-Louis Bossart wrote:
> > In these cases, there is no way to notify the deferred probe infrastructure of the enablement of resources after the driver binding.
>
> Then just wait for it to happen naturally?

Through what mechanism will it happen naturally? Deferred probe currently only does things if things are being registered or if probes complete.

> > The driver_deferred_probe_trigger() function is currently used 'anytime a driver is successfully bound to a device', this patch suggests exporting it so that drivers can kick off re-probing of deferred devices at the end of a deferred processing.
>
> I really do not want to export this as it will get really messy very quickly with different drivers/busses attempting to call this.

I'm not sure I see the mess here - it's just queueing some work, one of the things that the workqueue stuff does well is handle things getting scheduled while they're already queued. Honestly having understood their problem I think we need to be adding these calls into all the resource provider APIs.

> Either handle it in your driver (why do you have to defer probe at all, just succeed and move on to register the needed stuff after you are initialized) or rely on the driver core here.

That's exactly what they're doing currently and the driver core isn't delivering.

Driver A is slow to start up and providing a resource to driver B, this gets handled in driver A by succeeding immediately and then registering the resource once the startup has completed. Unfortunately while that was happening not only has driver B registered and deferred but the rest of the probes/defers in the system have completed so the deferred probe mechanism is idle. Nothing currently tells the deferred probe mechanism that a new resource is now available so it never retries the probe of driver B. The only way I can see to fix this without modifying the driver core is to make driver A block during probe but that would at best slow down boot.

The issue is that the driver core is using drivers completing probe as a proxy for resources becoming available. That works most of the time because most probes are fully synchronous but it breaks down if a resource provider registers resources outside of probe, we might still be fine if system boot is still happening and something else probes but only through luck.

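The driver A / driver B stall described above can be reproduced with a small userspace toy model. This is only a sketch under stated assumptions, not kernel code: every `toy_*` name is hypothetical, the deferred list is a fixed array, and the only thing taken from the thread is the rule that a successful bind is what kicks the deferred list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of deferred probe; every name here is hypothetical. */
#define TOY_MAX_DEVS 8

struct toy_dev {
	const char *name;
	const bool *needs;	/* resource the device waits for, NULL if none */
	bool bound;
};

static struct toy_dev *toy_deferred[TOY_MAX_DEVS];
static size_t toy_n_deferred;

/* Probe one device: bind it, or park it on the deferred list. */
static bool toy_try_probe(struct toy_dev *dev)
{
	if (dev->needs && !*dev->needs) {
		assert(toy_n_deferred < TOY_MAX_DEVS);
		toy_deferred[toy_n_deferred++] = dev;
		return false;
	}
	dev->bound = true;
	return true;
}

/* Retry the deferred list until a full pass binds nothing new. */
static void toy_deferred_trigger(void)
{
	bool progress = true;

	while (progress && toy_n_deferred > 0) {
		struct toy_dev *pending[TOY_MAX_DEVS];
		size_t n = toy_n_deferred;

		for (size_t i = 0; i < n; i++)
			pending[i] = toy_deferred[i];
		toy_n_deferred = 0;
		progress = false;
		for (size_t i = 0; i < n; i++)
			if (toy_try_probe(pending[i]))
				progress = true;
	}
}

/* In this model a successful bind is the only event that kicks the list. */
static bool toy_driver_bind(struct toy_dev *dev)
{
	if (!toy_try_probe(dev))
		return false;
	toy_deferred_trigger();
	return true;
}

/*
 * Replay the scenario from the thread and report whether consumer B ends
 * up bound. 'kick' models the provider calling the trigger itself once
 * its asynchronous work has finished.
 */
static bool toy_scenario(bool kick)
{
	bool dsp_ready = false;
	struct toy_dev a = { "provider-A", NULL, false };
	struct toy_dev b = { "consumer-B", &dsp_ready, false };

	toy_n_deferred = 0;
	toy_driver_bind(&b);	/* B defers: the resource is missing */
	toy_driver_bind(&a);	/* A binds at once; the retry of B still fails */
	dsp_ready = true;	/* A's deferred work completes later */
	if (kick)
		toy_deferred_trigger();
	return b.bound;
}
```

In this model `toy_scenario(false)` leaves B unbound even though its resource eventually appears, while `toy_scenario(true)` binds it, which is the gap the exported trigger is meant to close.
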
On Wed, Aug 18, 2021 at 12:57:36PM +0100, Mark Brown wrote:
> On Wed, Aug 18, 2021 at 07:44:39AM +0200, Greg Kroah-Hartman wrote:
> > On Tue, Aug 17, 2021 at 02:00:56PM -0500, Pierre-Louis Bossart wrote:
> > > In these cases, there is no way to notify the deferred probe infrastructure of the enablement of resources after the driver binding.
> >
> > Then just wait for it to happen naturally?
>
> Through what mechanism will it happen naturally? Deferred probe currently only does things if things are being registered or if probes complete.
>
> > > The driver_deferred_probe_trigger() function is currently used 'anytime a driver is successfully bound to a device', this patch suggests exporting it so that drivers can kick off re-probing of deferred devices at the end of a deferred processing.
> >
> > I really do not want to export this as it will get really messy very quickly with different drivers/busses attempting to call this.
>
> I'm not sure I see the mess here - it's just queueing some work, one of the things that the workqueue stuff does well is handle things getting scheduled while they're already queued. Honestly having understood their problem I think we need to be adding these calls into all the resource provider APIs.
>
> > Either handle it in your driver (why do you have to defer probe at all, just succeed and move on to register the needed stuff after you are initialized) or rely on the driver core here.
>
> That's exactly what they're doing currently and the driver core isn't delivering.
>
> Driver A is slow to start up and providing a resource to driver B, this gets handled in driver A by succeeding immediately and then registering the resource once the startup has completed. Unfortunately while that was happening not only has driver B registered and deferred but the rest of the probes/defers in the system have completed so the deferred probe mechanism is idle. Nothing currently tells the deferred probe mechanism that a new resource is now available so it never retries the probe of driver B. The only way I can see to fix this without modifying the driver core is to make driver A block during probe but that would at best slow down boot.
>
> The issue is that the driver core is using drivers completing probe as a proxy for resources becoming available. That works most of the time because most probes are fully synchronous but it breaks down if a resource provider registers resources outside of probe, we might still be fine if system boot is still happening and something else probes but only through luck.

The driver core is not using that as a proxy, that is up to the driver itself or not. All probe means is "yes, this driver binds to this device, thank you!" for that specific bus/class type. That's all, if the driver needs to go off and do real work before it can properly control the device, wonderful, have it go and do that async.

So if you know you should be binding to the device, great, kick off some other work and return success from probe. There's no reason you have to delay or defer for no good reason, right?

But yes, if you do get new resources, the probe should be called again, that's what the deferred logic is for (or is that the link logic, I can't recall). This shouldn't be a new thing, no needing to call the driver core directly like this at all, it should "just happen", right?

thanks,

greg k-h

On Wed, Aug 18, 2021 at 03:22:19PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Aug 18, 2021 at 12:57:36PM +0100, Mark Brown wrote:
> > The issue is that the driver core is using drivers completing probe as a proxy for resources becoming available. That works most of the time because most probes are fully synchronous but it breaks down if a resource provider registers resources outside of probe, we might still be fine if system boot is still happening and something else probes but only through luck.
>
> The driver core is not using that as a proxy, that is up to the driver itself or not. All probe means is "yes, this driver binds to this device, thank you!" for that specific bus/class type. That's all, if the driver needs to go off and do real work before it can properly control the device, wonderful, have it go and do that async.

Right, which is what is happening here - but the deferred probe machinery in the core is reading more into the probe succeeding than it should.

> So if you know you should be binding to the device, great, kick off some other work and return success from probe. There's no reason you have to delay or defer for no good reason, right?

The driver that's deferring isn't the one that takes a long time to probe - the driver that's deferring depends on the driver that takes a long time to probe, it defers because the resource it needs isn't available when it tries to probe as the slow device is still doing its thing asynchronously. The problem is that the driver core isn't going back and attempting to probe the deferred device again once the driver that took a long time has provided resources.

> But yes, if you do get new resources, the probe should be called again, that's what the deferred logic is for (or is that the link logic, I can't recall). This shouldn't be a new thing, no needing to call the driver core directly like this at all, it should "just happen", right?

How specifically do new resources becoming available directly cause a new probe deferral run at the moment? I can't see anything that resource provider APIs are doing to say that a new resource has become available, this patch is trying to provide something they can do.

>>> The issue is that the driver core is using drivers completing probe as a proxy for resources becoming available. That works most of the time because most probes are fully synchronous but it breaks down if a resource provider registers resources outside of probe, we might still be fine if system boot is still happening and something else probes but only through luck.
>
>> The driver core is not using that as a proxy, that is up to the driver itself or not. All probe means is "yes, this driver binds to this device, thank you!" for that specific bus/class type. That's all, if the driver needs to go off and do real work before it can properly control the device, wonderful, have it go and do that async.
>
> Right, which is what is happening here - but the deferred probe machinery in the core is reading more into the probe succeeding than it should.

I think Greg was referring to the use of the PROBE_PREFER_ASYNCHRONOUS probe type. We tried just that and got a nice WARN_ON because we are using request_module() to deal with HDaudio codecs. The details are in [1] but the kernel code is unambiguous...

	/*
	 * We don't allow synchronous module loading from async. Module
	 * init may invoke async_synchronize_full() which will end up
	 * waiting for this task which already is waiting for the module
	 * loading to complete, leading to a deadlock.
	 */
	WARN_ON_ONCE(wait && current_is_async());

The reason why we use a workqueue is because we are otherwise painted into a corner by conflicting requirements.

a) we have to use request_module()
b) we cannot use the async probe because of the request_module()
c) we have to avoid blocking on boot

I understand the resistance to exporting this function, no one in our team was really happy about it, but no one could find an alternate solution. If there is something better, I am all ears.

Thanks
-Pierre

[1] https://github.com/thesofproject/linux/pull/3079

On Wed, Aug 18, 2021 at 7:52 AM Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> wrote:
> >>> The issue is that the driver core is using drivers completing probe as a proxy for resources becoming available. That works most of the time because most probes are fully synchronous but it breaks down if a resource provider registers resources outside of probe, we might still be fine if system boot is still happening and something else probes but only through luck.
> >
> >> The driver core is not using that as a proxy, that is up to the driver itself or not. All probe means is "yes, this driver binds to this device, thank you!" for that specific bus/class type. That's all, if the driver needs to go off and do real work before it can properly control the device, wonderful, have it go and do that async.
> >
> > Right, which is what is happening here - but the deferred probe machinery in the core is reading more into the probe succeeding than it should.
>
> I think Greg was referring to the use of the PROBE_PREFER_ASYNCHRONOUS probe type. We tried just that and got a nice WARN_ON because we are using request_module() to deal with HDaudio codecs. The details are in [1] but the kernel code is unambiguous...
>
>	/*
>	 * We don't allow synchronous module loading from async. Module
>	 * init may invoke async_synchronize_full() which will end up
>	 * waiting for this task which already is waiting for the module
>	 * loading to complete, leading to a deadlock.
>	 */
>	WARN_ON_ONCE(wait && current_is_async());
>
> The reason why we use a workqueue is because we are otherwise painted in a corner by conflicting requirements.
>
> a) we have to use request_module()
> b) we cannot use the async probe because of the request_module()
> c) we have to avoid blocking on boot
>
> I understand the resistance to exporting this function, no one in our team was really happy about it, but no one could find an alternate solution. If there is something better, I am all ears.

Additionally you mentioned that the consumer is unknown to the producer, so you are not able, for example, to use the newly exported device_driver_attach() to directly trigger the unblocked dependency.

On Wed, Aug 18, 2021 at 09:51:51AM -0500, Pierre-Louis Bossart wrote:
> >>> The issue is that the driver core is using drivers completing probe as a proxy for resources becoming available. That works most of the time because most probes are fully synchronous but it breaks down if a resource provider registers resources outside of probe, we might still be fine if system boot is still happening and something else probes but only through luck.
> >
> >> The driver core is not using that as a proxy, that is up to the driver itself or not. All probe means is "yes, this driver binds to this device, thank you!" for that specific bus/class type. That's all, if the driver needs to go off and do real work before it can properly control the device, wonderful, have it go and do that async.
> >
> > Right, which is what is happening here - but the deferred probe machinery in the core is reading more into the probe succeeding than it should.
>
> I think Greg was referring to the use of the PROBE_PREFER_ASYNCHRONOUS probe type. We tried just that and got a nice WARN_ON because we are using request_module() to deal with HDaudio codecs. The details are in [1] but the kernel code is unambiguous...
>
>	/*
>	 * We don't allow synchronous module loading from async. Module
>	 * init may invoke async_synchronize_full() which will end up
>	 * waiting for this task which already is waiting for the module
>	 * loading to complete, leading to a deadlock.
>	 */
>	WARN_ON_ONCE(wait && current_is_async());
>
> The reason why we use a workqueue is because we are otherwise painted in a corner by conflicting requirements.
>
> a) we have to use request_module()

Wait, why?

module loading is async, use auto-loading when the hardware/device is found and reported to userspace. Forcing a module to load by the kernel is not always wise as the module is not always present in the filesystem at that point in time at boot (think modules on the filesystem, not in the initramfs).

Try fixing this issue and maybe it will resolve itself as you should be working async.

thanks,

greg k-h

>> a) we have to use request_module()
>
> Wait, why?
>
> module loading is async, use auto-loading when the hardware/device is found and reported to userspace. Forcing a module to load by the kernel is not always wise as the module is not always present in the filesystem at that point in time at boot (think modules on the filesystem, not in the initramfs).
>
> Try fixing this issue and maybe it will resolve itself as you should be working async.

It's been that way for a very long time (2015?) for HDAudio support, see sound/pci/hda/hda_bind.c. It's my understanding that it was a conscious design decision to use vendor-specific modules, if available, and fall back to generic modules if the first pass failed.

Takashi, you may want to chime in...

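The two-pass scheme described above (vendor-specific module first, generic module if that fails) can be illustrated with a small userspace sketch. It is not the code in sound/pci/hda/hda_bind.c: `toy_load_module()` is a hypothetical stand-in for a synchronous module load, and the module names and the "installed" list are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of modules present on disk; names are made up. */
static const char *const toy_installed[] = {
	"codec-vendor-10ec",
	"codec-generic",
};

/* Hypothetical stand-in for a synchronous module load: true if found. */
static bool toy_load_module(const char *name)
{
	size_t n = sizeof(toy_installed) / sizeof(toy_installed[0]);

	assert(name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(toy_installed[i], name) == 0)
			return true;
	return false;
}

/*
 * Two-pass selection: try the vendor-specific module first, then fall
 * back to the generic one. Returns the module name that was loaded, or
 * NULL if neither pass succeeded.
 */
static const char *toy_bind_codec(const char *vendor_module)
{
	if (vendor_module && toy_load_module(vendor_module))
		return vendor_module;
	if (toy_load_module("codec-generic"))
		return "codec-generic";
	return NULL;
}
```

The second pass can only start once the caller knows the first load failed, so each load has to be waited on; that is one way to read why a synchronous request_module() is hard to replace with userspace auto-loading here.
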
On Wed, Aug 18, 2021 at 10:53:07AM -0500, Pierre-Louis Bossart wrote:
> >> a) we have to use request_module()
> >
> > Wait, why?
> >
> > module loading is async, use auto-loading when the hardware/device is found and reported to userspace. Forcing a module to load by the kernel is not always wise as the module is not always present in the filesystem at that point in time at boot (think modules on the filesystem, not in the initramfs).
> >
> > Try fixing this issue and maybe it will resolve itself as you should be working async.
>
> It's been that way for a very long time (2015?) for HDAudio support, see sound/pci/hda/hda_bind.c. It's my understanding that it was a conscious design decision to use vendor-specific modules, if available, and fallback to generic modules if the first pass failed.

If it has been this way for so long, what has caused the sudden change to need to export this and call this function?

On Wed, Aug 18, 2021 at 06:49:51PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Aug 18, 2021 at 10:53:07AM -0500, Pierre-Louis Bossart wrote:
> > It's been that way for a very long time (2015?) for HDAudio support, see sound/pci/hda/hda_bind.c. It's my understanding that it was a conscious design decision to use vendor-specific modules, if available, and fallback to generic modules if the first pass failed.
>
> If it has been this way for so long, what has caused the sudden change to need to export this and call this function?

The usage predates the hardware that requires firmware downloads - that's very new.

>>>> a) we have to use request_module()
>>>
>>> Wait, why?
>>>
>>> module loading is async, use auto-loading when the hardware/device is found and reported to userspace. Forcing a module to load by the kernel is not always wise as the module is not always present in the filesystem at that point in time at boot (think modules on the filesystem, not in the initramfs).
>>>
>>> Try fixing this issue and maybe it will resolve itself as you should be working async.
>>
>> It's been that way for a very long time (2015?) for HDAudio support, see sound/pci/hda/hda_bind.c. It's my understanding that it was a conscious design decision to use vendor-specific modules, if available, and fallback to generic modules if the first pass failed.
>
> If it has been this way for so long, what has caused the sudden change to need to export this and call this function?

Fair question, I did not provide all the context with a cover letter that was already quite long. Here are more details:

In the existing Intel audio drivers, we have a PCI device that gets probed first. The PCI driver initializes the DSP and exposes what the audio DSP can do, but the platform-specific configuration for a given board is handled by a child device [1]. We have all kinds of hard-coded lookup tables to figure out what the board is and what machine driver should be used based on the presence of other ACPI devices and/or DMI quirks [2][3]. We must have used this solution since 2010, mainly because 'the other OS' does not rely on platform firmware for a description of the audio capabilities.

In the 'soon' future, that machine driver will be probed with its own ACPI ID and become generic, with all the information related to the board described in platform firmware and parsed by the driver. This is how the 'simple card' works today in Device Tree environments, platform firmware describes how host-provided components are connected to 3rd-party components. I cannot provide more details at this time since this is not yet a publicly available specification (this specification work does take place in a standardization body).

That change in how the machine driver gets probed creates a new problem we didn't have before: this generic machine driver will probe in the early stages of the boot, long before the DSP and audio codecs are initialized/available.

I initially looked at the component framework to try to express dependencies. It's really not clear to me if this is the 'right' direction, for ASoC-based solutions we already have components that register with a core. I also started looking at other proposals that were made over the years, this problem of expressing dependencies is not new. No real luck.

In the end, since the DeviceTree-based solutions based on deferred probes work fine for the same type of usages, I tried to reuse the same deferred probe mechanism. The only reason why I needed to export this function is to work around the request_module() use.

I am not claiming any award for architecture, this is clearly a domain-specific corner case. I did try the async probe, I consulted with Mark Brown, had an internal review with Dan Williams and Andy Shevchenko. While nobody cheered, it seemed like this export was 'reasonable' compared to a re-architecture of the HDaudio/HDMI support - which is a really scary proposition.

There is no immediate rush to make this change in this kernel cycle or the next, I am open to alternatives, but I wanted to make sure we don't have any Linux plumbing issues by the time the specification becomes public and is used by 'the other OS'.

Does this help get more context?

[1] https://elixir.bootlin.com/linux/latest/source/sound/soc/sof/core.c#L234
[2] https://elixir.bootlin.com/linux/latest/source/sound/soc/intel/common/soc-acpi-intel-tgl-match.c#L323
[3] https://elixir.bootlin.com/linux/latest/source/sound/soc/intel/boards/sof_sdw.c#L50

On Wed, Aug 18, 2021 at 01:09:44PM -0500, Pierre-Louis Bossart wrote:
> I initially looked at the component framework to try to express dependencies. It's really not clear to me if this is the 'right' direction, for ASoC-based solutions we already have components that register with a core.

Historically (long before both deferred probe and the component framework) ASoC used to implement a mechanism that essentially did deferred probe for the dependencies - it'd maintain its own lists of dependencies and then tell the machine driver and all the components when the card was ready. Once deferred probe was there we dropped all the open coded deferral stuff since it was just reimplementing what deferred probe does in a slightly more complicated fashion (it tracked the dependencies in a finer grained manner, though the result wasn't any different). See b19e6e7b76 (ASoC: core: Use driver core probe deferral) for the conversion.

What ASoC is doing with the cards is fundamentally the same thing as what the component helpers are doing, we could in theory convert to using that but unlike with probe deferral it doesn't really save us any work and we'd still need all the card level tracking we've got to connect the various bits of the card together and order things. If we were starting from scratch we would probably use components but there's far more pressing things to be getting on with otherwise.