[1/2] soundwire: bus: Don't filter slave alerts

Message ID	20230119165104.3433290-1-ckeepax@opensource.cirrus.com
State	Superseded
Headers	show Return-Path: <alsa-devel-bounces@alsa-project.org> From: Charles Keepax <ckeepax@opensource.cirrus.com> To: <vkoul@kernel.org> Subject: [PATCH 1/2] soundwire: bus: Don't filter slave alerts Date: Thu, 19 Jan 2023 16:51:03 +0000 Message-ID: <20230119165104.3433290-1-ckeepax@opensource.cirrus.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain Precedence: list Cc: alsa-devel@alsa-project.org, patches@opensource.cirrus.com, pierre-louis.bossart@linux.intel.com, linux-kernel@vger.kernel.org, sanyog.r.kale@intel.com, yung-chuan.liao@linux.intel.com Errors-To: alsa-devel-bounces@alsa-project.org Sender: "Alsa-devel" <alsa-devel-bounces@alsa-project.org>
Series	[1/2] soundwire: bus: Don't filter slave alerts \| expand [1/2] soundwire: bus: Don't filter slave alerts

Charles Keepax Jan. 19, 2023, 4:51 p.m. UTC

Currently the SoundWire core will loop handling slave alerts but it will
only handle those present when the alert was first raised. This causes
some issues with the Cadence SoundWire IP, which only generates an IRQ
when alert changes state. This means that if a new alert arrives whilst
old alerts are being handled it will not be handled in the currently
loop and then no further alerts will be processed since alert never
changes state to trigger a new IRQ.

Correct this issue by allowing the core to handle all pending alerts in
the IRQ handling loop. The code will still only loop up to
SDW_READ_INTR_CLEAR_RETRY times, so it shouldn't be possible for it get
completely stuck and if you are generating IRQs faster than you can
handle them you likely have bigger problems anyway.

Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
---
 drivers/soundwire/bus.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

Charles Keepax Jan. 20, 2023, 10:14 a.m. UTC | #1

On Thu, Jan 19, 2023 at 11:27:14AM -0600, Pierre-Louis Bossart wrote:
> On 1/19/23 10:51, Charles Keepax wrote:
> > Currently the SoundWire core will loop handling slave alerts but it will
> > only handle those present when the alert was first raised. This causes
> > some issues with the Cadence SoundWire IP, which only generates an IRQ
> > when alert changes state. This means that if a new alert arrives whilst
> > old alerts are being handled it will not be handled in the currently
> > loop and then no further alerts will be processed since alert never
> > changes state to trigger a new IRQ.
> > 
> > Correct this issue by allowing the core to handle all pending alerts in
> > the IRQ handling loop. The code will still only loop up to
> > SDW_READ_INTR_CLEAR_RETRY times, so it shouldn't be possible for it get
> > completely stuck and if you are generating IRQs faster than you can
> > handle them you likely have bigger problems anyway.
> 
> The change makes sense, but it's a bit odd to change the way the
> interrupts are handled because of a specific design. The bus should be
> able to deal with various designs, not force a one-size-fits-all policy
> that may not be quite right in all cases.
> 
> Could we have a new flag at the bus level that says that peripheral
> interrupts are not filtered, and set if for the Intel case?
> 
> We could similarly make the SDW_READ_INTR_CLEAR_RETRY constant
> bus/platform specific. The SoundWire spec mandates that we re-read the
> status after clearing the interrupt, but it doesn't say how to deal with
> recurring interrupts.

Perhaps I should have phrased the commit message differently
here. To be honest I am not really convince the old code makes
a huge amount of sense. So I would prefer not to add a flag
enabling the weird behaviour.

I would be of the opinion that there are really two options
for IRQ handling code like this that make sense:

1) Loop until the IRQs are handled, ie. it is the soundwire
core's responsibility to make sure all the IRQs are handled
before moving on.

2) Just handle the IRQs available when the function is called,
ie. it is the drivers responsibility to keep calling the core
until the IRQs are handled.

That way there is a clearly defined who that is responsible.
The old code is a weird mix of the two where most of the time
it is the soundwire core's responsibly to handle recurring
IRQs unless a new one happens in which case it is the drivers
responsibilty to recall the core.

Also the new code will still work for drivers that have level
IRQs and recall the core, without any modification of those
drivers. So I don't see what anyone would be gaining from the
old system.

Regarding making the clear retries platform specific that makes
sense to me but is clearly a separate patch. I will add it onto
my soundwire todo list.

Thanks,
Charles

> > Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
> > ---
> >  drivers/soundwire/bus.c | 12 ++++--------
> >  1 file changed, 4 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
> > index 633d411b64f35..daee2cca94a4d 100644
> > --- a/drivers/soundwire/bus.c
> > +++ b/drivers/soundwire/bus.c
> > @@ -1560,7 +1560,7 @@ static int sdw_handle_slave_alerts(struct sdw_slave *slave)
> >  	unsigned long port;
> >  	bool slave_notify;
> >  	u8 sdca_cascade = 0;
> > -	u8 buf, buf2[2], _buf, _buf2[2];
> > +	u8 buf, buf2[2];
> >  	bool parity_check;
> >  	bool parity_quirk;
> >  
> > @@ -1716,9 +1716,9 @@ static int sdw_handle_slave_alerts(struct sdw_slave *slave)
> >  				"SDW_SCP_INT1 recheck read failed:%d\n", ret);
> >  			goto io_err;
> >  		}
> > -		_buf = ret;
> > +		buf = ret;
> >  
> > -		ret = sdw_nread_no_pm(slave, SDW_SCP_INTSTAT2, 2, _buf2);
> > +		ret = sdw_nread_no_pm(slave, SDW_SCP_INTSTAT2, 2, buf2);
> >  		if (ret < 0) {
> >  			dev_err(&slave->dev,
> >  				"SDW_SCP_INT2/3 recheck read failed:%d\n", ret);
> > @@ -1736,12 +1736,8 @@ static int sdw_handle_slave_alerts(struct sdw_slave *slave)
> >  		}
> >  
> >  		/*
> > -		 * Make sure no interrupts are pending, but filter to limit loop
> > -		 * to interrupts identified in the first status read
> > +		 * Make sure no interrupts are pending
> >  		 */
> > -		buf &= _buf;
> > -		buf2[0] &= _buf2[0];
> > -		buf2[1] &= _buf2[1];
> >  		stat = buf || buf2[0] || buf2[1] || sdca_cascade;
> >  
> >  		/*

Pierre-Louis Bossart Jan. 20, 2023, 4:11 p.m. UTC | #2

On 1/20/23 04:14, Charles Keepax wrote:
> On Thu, Jan 19, 2023 at 11:27:14AM -0600, Pierre-Louis Bossart wrote:
>> On 1/19/23 10:51, Charles Keepax wrote:
>>> Currently the SoundWire core will loop handling slave alerts but it will
>>> only handle those present when the alert was first raised. This causes
>>> some issues with the Cadence SoundWire IP, which only generates an IRQ
>>> when alert changes state. This means that if a new alert arrives whilst
>>> old alerts are being handled it will not be handled in the currently
>>> loop and then no further alerts will be processed since alert never
>>> changes state to trigger a new IRQ.
>>>
>>> Correct this issue by allowing the core to handle all pending alerts in
>>> the IRQ handling loop. The code will still only loop up to
>>> SDW_READ_INTR_CLEAR_RETRY times, so it shouldn't be possible for it get
>>> completely stuck and if you are generating IRQs faster than you can
>>> handle them you likely have bigger problems anyway.
>>
>> The change makes sense, but it's a bit odd to change the way the
>> interrupts are handled because of a specific design. The bus should be
>> able to deal with various designs, not force a one-size-fits-all policy
>> that may not be quite right in all cases.
>>
>> Could we have a new flag at the bus level that says that peripheral
>> interrupts are not filtered, and set if for the Intel case?
>>
>> We could similarly make the SDW_READ_INTR_CLEAR_RETRY constant
>> bus/platform specific. The SoundWire spec mandates that we re-read the
>> status after clearing the interrupt, but it doesn't say how to deal with
>> recurring interrupts.
> 
> Perhaps I should have phrased the commit message differently
> here. To be honest I am not really convince the old code makes
> a huge amount of sense. So I would prefer not to add a flag
> enabling the weird behaviour.
> 
> I would be of the opinion that there are really two options
> for IRQ handling code like this that make sense:
> 
> 1) Loop until the IRQs are handled, ie. it is the soundwire
> core's responsibility to make sure all the IRQs are handled
> before moving on.
> 
> 2) Just handle the IRQs available when the function is called,
> ie. it is the drivers responsibility to keep calling the core
> until the IRQs are handled.
> 
> That way there is a clearly defined who that is responsible.
> The old code is a weird mix of the two where most of the time
> it is the soundwire core's responsibly to handle recurring
> IRQs unless a new one happens in which case it is the drivers
> responsibilty to recall the core.
> 
> Also the new code will still work for drivers that have level
> IRQs and recall the core, without any modification of those
> drivers. So I don't see what anyone would be gaining from the
> old system.

I think the intent of the 'old code' was the option 2), expect that it's
broken on Intel platforms and not possible because of the hardware design.

I am good with your two suggested options.

> Regarding making the clear retries platform specific that makes
> sense to me but is clearly a separate patch. I will add it onto
> my soundwire todo list.

yes, it's a separate patch indeed.

Pierre-Louis Bossart Jan. 20, 2023, 4:20 p.m. UTC | #3

On 1/20/23 03:59, Charles Keepax wrote:
> On Thu, Jan 19, 2023 at 11:12:04AM -0600, Pierre-Louis Bossart wrote:
>> No objection on this addition, just a couple of comments to improve it:
>>
>>>  EXPORT_SYMBOL(sdw_bus_master_add);
>>> @@ -158,6 +183,8 @@ static int sdw_delete_slave(struct device *dev, void *data)
>>>  	mutex_lock(&bus->bus_lock);
>>>  
>>>  	if (slave->dev_num) { /* clear dev_num if assigned */
>>> +		irq_dispose_mapping(irq_find_mapping(bus->domain, slave->dev_num));
>>> +
>>
>> could this be done conditionally. e.g.
>>
>> if (slave->prop.irq)
>>     irq_dispose_mapping(irq_find_mapping(bus->domain, slave->dev_num));
>>
>>> +			slave->irq = irq_create_mapping(bus->domain, dev_num);
>>> +			if (!slave->irq) {
>>> +				dev_err(bus->dev, "Failed to map IRQ\n");
>>> +				return -EINVAL;
>>> +			}
>>
>> ...and here....
>>
>> if (slave->prop.irq) {
>> 	slave->irq = irq_create_mapping(bus->domain, dev_num);
>> 	if (!slave->irq) {
>> 		dev_err(bus->dev, "Failed to map IRQ\n");
>> 		return -EINVAL;
>> 	}
>> }
>>
> 
> Yeah I am happy to make those conditional, I guess it is cleaner
> to not map IRQs if they wont be used.

ok

> 
>>> @@ -369,6 +371,7 @@ struct sdw_dpn_prop {
>>>   * @clock_reg_supported: the Peripheral implements the clock base and scale
>>>   * registers introduced with the SoundWire 1.2 specification. SDCA devices
>>>   * do not need to set this boolean property as the registers are required.
>>> + * @irq: call actual IRQ handler on slave, as well as callback
>>>   */
>>>  struct sdw_slave_prop {
>>>  	u32 mipi_revision;
>>> @@ -393,6 +396,7 @@ struct sdw_slave_prop {
>>>  	u8 scp_int1_mask;
>>>  	u32 quirks;
>>>  	bool clock_reg_supported;
>>> +	bool irq;
>>
>> this can be confused with the 'wake_capable' property.
>>
>> maybe 'out_of_band_irq' ?
>>
> 
> Yes I struggle on the name a bit and then just gave up and
> went with plain "irq", hard to know what to call it. Not sure
> out_of_band is quite right since it not really out of band,
> handle_nested_irq pretty much basically boils down to a function
> call really. Maybe something like "map_irq", or "use_domain_irq"?

Naming is hard. I use 'in-band wake' for SoundWire-based notifications,
so I used 'out-of-band' for non-SoundWire stuff.

use_domain_irq sounds goods to me, it's different enough that confusions
are not possible.

>> There should be an explanation and something checking that both are not
>> used concurrently.
> 
> I will try to expand the explanation a litte, but I dont see any
> reason to block calling both handlers, no ill effects would come
> for a driver having both and it is useful if any soundwire
> specific steps are needed that arn't on other control buses.

I think it's problematic if the peripheral tries to wake-up the manager
from clock-stop with both an in-band wake (i.e. drive the data line
high) and a separate GPIO-based interrupt. It's asking for trouble IMHO.
We spent hours in the MIPI team to make sure there were no races between
the manager-initiated restarts and peripheral-initiated restarts, adding
a 3rd mechanism in the mix gives me a migraine already.

Charles Keepax Jan. 23, 2023, 2:53 p.m. UTC | #4

On Fri, Jan 20, 2023 at 10:20:50AM -0600, Pierre-Louis Bossart wrote:
> On 1/20/23 03:59, Charles Keepax wrote:
> > On Thu, Jan 19, 2023 at 11:12:04AM -0600, Pierre-Louis Bossart wrote:
> >> There should be an explanation and something checking that both are not
> >> used concurrently.
> > 
> > I will try to expand the explanation a litte, but I dont see any
> > reason to block calling both handlers, no ill effects would come
> > for a driver having both and it is useful if any soundwire
> > specific steps are needed that arn't on other control buses.
> 
> I think it's problematic if the peripheral tries to wake-up the manager
> from clock-stop with both an in-band wake (i.e. drive the data line
> high) and a separate GPIO-based interrupt. It's asking for trouble IMHO.
> We spent hours in the MIPI team to make sure there were no races between
> the manager-initiated restarts and peripheral-initiated restarts, adding
> a 3rd mechanism in the mix gives me a migraine already.

Apologies but I am struggling see why this has any bearing on
the case of a device that does both an in-band and out-of-band
wake. The code we are adding in this patch will only be called in the
in-band case. handle_nested_irq doesn't do any hardware magic or
schedule any threads, it just calls a function that was provided
when the client called request_threaded_irq. The only guarantee
of atomicity you have on the interrupt_callback is sdw_dev_lock
and that is being held across both calls after the patch.

Could you be a little more specific on what you mean by this
represents a 3rd mechanism, to me this isn't a new mechanism just
an extra callback? Say for example this patch added an
interrupt_callback_early to sdw_slave_ops that is called just
before interrupt_callback.

@@ -1681,6 +1681,9 @@ static int sdw_handle_slave_alerts(struct sdw_slave *slave)
                                struct device *dev = &slave->dev;
                                struct sdw_driver *drv = drv_to_sdw_driver(dev->driver);

+                         if (drv->ops && drv->ops->interrupt_callback_early)
+                                 drv->ops->interrupt_callback_early(slave);
+
                                if (drv->ops && drv->ops->interrupt_callback) {
                                        slave_intr.sdca_cascade = sdca_cascade;
                                        slave_intr.control_port = clear;

Would that similarly worry you? As in is it the client driver
writer dealing with 2 points of entry that worries you, or
something deeper relating to the IRQs?

Also if it helps I could go over in a little more detail how
the IRQs on our device works and why that means I would prefer
to have the option to use both. There are alternatives but they
arn't really as pretty.

Thanks,
Charles

Pierre-Louis Bossart Jan. 23, 2023, 3:50 p.m. UTC | #5

On 1/23/23 08:53, Charles Keepax wrote:
> On Fri, Jan 20, 2023 at 10:20:50AM -0600, Pierre-Louis Bossart wrote:
>> On 1/20/23 03:59, Charles Keepax wrote:
>>> On Thu, Jan 19, 2023 at 11:12:04AM -0600, Pierre-Louis Bossart wrote:
>>>> There should be an explanation and something checking that both are not
>>>> used concurrently.
>>>
>>> I will try to expand the explanation a litte, but I dont see any
>>> reason to block calling both handlers, no ill effects would come
>>> for a driver having both and it is useful if any soundwire
>>> specific steps are needed that arn't on other control buses.
>>
>> I think it's problematic if the peripheral tries to wake-up the manager
>> from clock-stop with both an in-band wake (i.e. drive the data line
>> high) and a separate GPIO-based interrupt. It's asking for trouble IMHO.
>> We spent hours in the MIPI team to make sure there were no races between
>> the manager-initiated restarts and peripheral-initiated restarts, adding
>> a 3rd mechanism in the mix gives me a migraine already.
> 
> Apologies but I am struggling see why this has any bearing on
> the case of a device that does both an in-band and out-of-band
> wake. The code we are adding in this patch will only be called in the
> in-band case. handle_nested_irq doesn't do any hardware magic or
> schedule any threads, it just calls a function that was provided
> when the client called request_threaded_irq. The only guarantee
> of atomicity you have on the interrupt_callback is sdw_dev_lock
> and that is being held across both calls after the patch.
> 
> Could you be a little more specific on what you mean by this
> represents a 3rd mechanism, to me this isn't a new mechanism just
> an extra callback? Say for example this patch added an
> interrupt_callback_early to sdw_slave_ops that is called just
> before interrupt_callback.

Well, the main concern is exiting the clock-stop. That is handled by the
manager and could be done
a) as the result of the framework deciding that something needs to be
done (typically as a result of user/applications starting a stream)
b) by the device with an in-band wake in case of e.g. jack detection or
acoustic events detected
c) same as b) but with a separate out-of-band interrupt.

I'd like to make sure b) and c) are mutually-exclusive options, and that
the device will not throw BOTH an in-band wake and an external interrupt.

Richard Fitzgerald Jan. 23, 2023, 4:08 p.m. UTC | #6

On 23/01/2023 15:50, Pierre-Louis Bossart wrote:
> 
> 
> On 1/23/23 08:53, Charles Keepax wrote:
>> On Fri, Jan 20, 2023 at 10:20:50AM -0600, Pierre-Louis Bossart wrote:
>>> On 1/20/23 03:59, Charles Keepax wrote:
>>>> On Thu, Jan 19, 2023 at 11:12:04AM -0600, Pierre-Louis Bossart wrote:
>>>>> There should be an explanation and something checking that both are not
>>>>> used concurrently.
>>>>
>>>> I will try to expand the explanation a litte, but I dont see any
>>>> reason to block calling both handlers, no ill effects would come
>>>> for a driver having both and it is useful if any soundwire
>>>> specific steps are needed that arn't on other control buses.
>>>
>>> I think it's problematic if the peripheral tries to wake-up the manager
>>> from clock-stop with both an in-band wake (i.e. drive the data line
>>> high) and a separate GPIO-based interrupt. It's asking for trouble IMHO.
>>> We spent hours in the MIPI team to make sure there were no races between
>>> the manager-initiated restarts and peripheral-initiated restarts, adding
>>> a 3rd mechanism in the mix gives me a migraine already.
>>
>> Apologies but I am struggling see why this has any bearing on
>> the case of a device that does both an in-band and out-of-band
>> wake. The code we are adding in this patch will only be called in the
>> in-band case. handle_nested_irq doesn't do any hardware magic or
>> schedule any threads, it just calls a function that was provided
>> when the client called request_threaded_irq. The only guarantee
>> of atomicity you have on the interrupt_callback is sdw_dev_lock
>> and that is being held across both calls after the patch.
>>
>> Could you be a little more specific on what you mean by this
>> represents a 3rd mechanism, to me this isn't a new mechanism just
>> an extra callback? Say for example this patch added an
>> interrupt_callback_early to sdw_slave_ops that is called just
>> before interrupt_callback.
> 
> Well, the main concern is exiting the clock-stop. That is handled by the
> manager and could be done
> a) as the result of the framework deciding that something needs to be
> done (typically as a result of user/applications starting a stream)
> b) by the device with an in-band wake in case of e.g. jack detection or
> acoustic events detected
> c) same as b) but with a separate out-of-band interrupt.
> 
> I'd like to make sure b) and c) are mutually-exclusive options, and that
> the device will not throw BOTH an in-band wake and an external interrupt.

Why would it be a problem if the device did (b) and (c)?
(c) is completely invisible to the SoundWire core and not something
that it has to handle. The handler for an out-of-band interrupt must
call pm_runtime_get_sync() or pm_runtime_resume_and_get() and that
would wake its own driver and the host controller.

Pierre-Louis Bossart Jan. 23, 2023, 4:38 p.m. UTC | #7

On 1/23/23 10:08, Richard Fitzgerald wrote:
> On 23/01/2023 15:50, Pierre-Louis Bossart wrote:
>>
>>
>> On 1/23/23 08:53, Charles Keepax wrote:
>>> On Fri, Jan 20, 2023 at 10:20:50AM -0600, Pierre-Louis Bossart wrote:
>>>> On 1/20/23 03:59, Charles Keepax wrote:
>>>>> On Thu, Jan 19, 2023 at 11:12:04AM -0600, Pierre-Louis Bossart wrote:
>>>>>> There should be an explanation and something checking that both
>>>>>> are not
>>>>>> used concurrently.
>>>>>
>>>>> I will try to expand the explanation a litte, but I dont see any
>>>>> reason to block calling both handlers, no ill effects would come
>>>>> for a driver having both and it is useful if any soundwire
>>>>> specific steps are needed that arn't on other control buses.
>>>>
>>>> I think it's problematic if the peripheral tries to wake-up the manager
>>>> from clock-stop with both an in-band wake (i.e. drive the data line
>>>> high) and a separate GPIO-based interrupt. It's asking for trouble
>>>> IMHO.
>>>> We spent hours in the MIPI team to make sure there were no races
>>>> between
>>>> the manager-initiated restarts and peripheral-initiated restarts,
>>>> adding
>>>> a 3rd mechanism in the mix gives me a migraine already.
>>>
>>> Apologies but I am struggling see why this has any bearing on
>>> the case of a device that does both an in-band and out-of-band
>>> wake. The code we are adding in this patch will only be called in the
>>> in-band case. handle_nested_irq doesn't do any hardware magic or
>>> schedule any threads, it just calls a function that was provided
>>> when the client called request_threaded_irq. The only guarantee
>>> of atomicity you have on the interrupt_callback is sdw_dev_lock
>>> and that is being held across both calls after the patch.
>>>
>>> Could you be a little more specific on what you mean by this
>>> represents a 3rd mechanism, to me this isn't a new mechanism just
>>> an extra callback? Say for example this patch added an
>>> interrupt_callback_early to sdw_slave_ops that is called just
>>> before interrupt_callback.
>>
>> Well, the main concern is exiting the clock-stop. That is handled by the
>> manager and could be done
>> a) as the result of the framework deciding that something needs to be
>> done (typically as a result of user/applications starting a stream)
>> b) by the device with an in-band wake in case of e.g. jack detection or
>> acoustic events detected
>> c) same as b) but with a separate out-of-band interrupt.
>>
>> I'd like to make sure b) and c) are mutually-exclusive options, and that
>> the device will not throw BOTH an in-band wake and an external interrupt.
> 
> Why would it be a problem if the device did (b) and (c)?
> (c) is completely invisible to the SoundWire core and not something
> that it has to handle. The handler for an out-of-band interrupt must
> call pm_runtime_get_sync() or pm_runtime_resume_and_get() and that
> would wake its own driver and the host controller.

The Intel hardware has a power optimization for the clock-stop, which
leads to different paths to wake the system. The SoundWire IP can deal
with the data line staying high, but in the optimized mode the wakes are
signaled as DSP interrupts at a higher level. That's why we added this
intel_link_process_wakeen_event() function called from
hda_dsp_interrupt_thread().

So yes on paper everything would work nicely, but that's asking for
trouble with races left and right. In other words, unless you have a
very good reason for using two wake-up mechanisms, pick a single one.

(a) and (c) are very similar in that all the exit is handled by
pm_runtime so I am not worried too much. I do worry about paths that
were never tested and never planned for.

Charles Keepax Jan. 23, 2023, 5:07 p.m. UTC | #8

On Mon, Jan 23, 2023 at 09:50:15AM -0600, Pierre-Louis Bossart wrote:
> On 1/23/23 08:53, Charles Keepax wrote:
> > On Fri, Jan 20, 2023 at 10:20:50AM -0600, Pierre-Louis Bossart wrote:
> >> On 1/20/23 03:59, Charles Keepax wrote:
> >>> On Thu, Jan 19, 2023 at 11:12:04AM -0600, Pierre-Louis Bossart wrote:
> >>>> There should be an explanation and something checking that both are not
> >>>> used concurrently.
> >>>
> >>> I will try to expand the explanation a litte, but I dont see any
> >>> reason to block calling both handlers, no ill effects would come
> >>> for a driver having both and it is useful if any soundwire
> >>> specific steps are needed that arn't on other control buses.
> >>
> >> I think it's problematic if the peripheral tries to wake-up the manager
> >> from clock-stop with both an in-band wake (i.e. drive the data line
> >> high) and a separate GPIO-based interrupt. It's asking for trouble IMHO.
> >> We spent hours in the MIPI team to make sure there were no races between
> >> the manager-initiated restarts and peripheral-initiated restarts, adding
> >> a 3rd mechanism in the mix gives me a migraine already.
> > 
> > Apologies but I am struggling see why this has any bearing on
> > the case of a device that does both an in-band and out-of-band
> > wake. The code we are adding in this patch will only be called in the
> > in-band case. handle_nested_irq doesn't do any hardware magic or
> > schedule any threads, it just calls a function that was provided
> > when the client called request_threaded_irq. The only guarantee
> > of atomicity you have on the interrupt_callback is sdw_dev_lock
> > and that is being held across both calls after the patch.
> > 
> > Could you be a little more specific on what you mean by this
> > represents a 3rd mechanism, to me this isn't a new mechanism just
> > an extra callback? Say for example this patch added an
> > interrupt_callback_early to sdw_slave_ops that is called just
> > before interrupt_callback.
> 
> Well, the main concern is exiting the clock-stop. That is handled by the
> manager and could be done
> a) as the result of the framework deciding that something needs to be
> done (typically as a result of user/applications starting a stream)
> b) by the device with an in-band wake in case of e.g. jack detection or
> acoustic events detected
> c) same as b) but with a separate out-of-band interrupt.
> 
> I'd like to make sure b) and c) are mutually-exclusive options, and that
> the device will not throw BOTH an in-band wake and an external interrupt.

I think this is the bit I don't follow why does this patch have
anything to do with whether b) and c) are mutually-exclusive
options?

This patch lets you register an IRQ handler to the in-band IRQs,
it has nothing to do with whether you register an IRQ handler to
some out of band IRQ and there is nothing in the current
framework that will prevent someone doing that. Adding a
check that forces someone to choose between using
handle_nested_irq or interrupt_callback will also have no
bearing on whether they attach an IRQ handler to an out-of-band
IRQ.

Thanks,
Charles

Richard Fitzgerald Jan. 23, 2023, 5:17 p.m. UTC | #9

On 23/01/2023 16:38, Pierre-Louis Bossart wrote:
> 
> 
> On 1/23/23 10:08, Richard Fitzgerald wrote:
>> On 23/01/2023 15:50, Pierre-Louis Bossart wrote:
>>>
>>>
>>> On 1/23/23 08:53, Charles Keepax wrote:
>>>> On Fri, Jan 20, 2023 at 10:20:50AM -0600, Pierre-Louis Bossart wrote:
>>>>> On 1/20/23 03:59, Charles Keepax wrote:
>>>>>> On Thu, Jan 19, 2023 at 11:12:04AM -0600, Pierre-Louis Bossart wrote:
>>>>>>> There should be an explanation and something checking that both
>>>>>>> are not
>>>>>>> used concurrently.
>>>>>>
>>>>>> I will try to expand the explanation a litte, but I dont see any
>>>>>> reason to block calling both handlers, no ill effects would come
>>>>>> for a driver having both and it is useful if any soundwire
>>>>>> specific steps are needed that arn't on other control buses.
>>>>>
>>>>> I think it's problematic if the peripheral tries to wake-up the manager
>>>>> from clock-stop with both an in-band wake (i.e. drive the data line
>>>>> high) and a separate GPIO-based interrupt. It's asking for trouble
>>>>> IMHO.
>>>>> We spent hours in the MIPI team to make sure there were no races
>>>>> between
>>>>> the manager-initiated restarts and peripheral-initiated restarts,
>>>>> adding
>>>>> a 3rd mechanism in the mix gives me a migraine already.
>>>>
>>>> Apologies but I am struggling see why this has any bearing on
>>>> the case of a device that does both an in-band and out-of-band
>>>> wake. The code we are adding in this patch will only be called in the
>>>> in-band case. handle_nested_irq doesn't do any hardware magic or
>>>> schedule any threads, it just calls a function that was provided
>>>> when the client called request_threaded_irq. The only guarantee
>>>> of atomicity you have on the interrupt_callback is sdw_dev_lock
>>>> and that is being held across both calls after the patch.
>>>>
>>>> Could you be a little more specific on what you mean by this
>>>> represents a 3rd mechanism, to me this isn't a new mechanism just
>>>> an extra callback? Say for example this patch added an
>>>> interrupt_callback_early to sdw_slave_ops that is called just
>>>> before interrupt_callback.
>>>
>>> Well, the main concern is exiting the clock-stop. That is handled by the
>>> manager and could be done
>>> a) as the result of the framework deciding that something needs to be
>>> done (typically as a result of user/applications starting a stream)
>>> b) by the device with an in-band wake in case of e.g. jack detection or
>>> acoustic events detected
>>> c) same as b) but with a separate out-of-band interrupt.
>>>
>>> I'd like to make sure b) and c) are mutually-exclusive options, and that
>>> the device will not throw BOTH an in-band wake and an external interrupt.
>>
>> Why would it be a problem if the device did (b) and (c)?
>> (c) is completely invisible to the SoundWire core and not something
>> that it has to handle. The handler for an out-of-band interrupt must
>> call pm_runtime_get_sync() or pm_runtime_resume_and_get() and that
>> would wake its own driver and the host controller.
> 
> The Intel hardware has a power optimization for the clock-stop, which
> leads to different paths to wake the system. The SoundWire IP can deal
> with the data line staying high, but in the optimized mode the wakes are
> signaled as DSP interrupts at a higher level. That's why we added this
> intel_link_process_wakeen_event() function called from
> hda_dsp_interrupt_thread().
> 
> So yes on paper everything would work nicely, but that's asking for
> trouble with races left and right. In other words, unless you have a

Wake up from a hard INT is simply a runtime_resume of the codec driver.
That is no different from ASoC runtime resuming the driver to perform
some audio activity, or to access a volatile register. An event caused
a runtime-resume - the driver and the host controller must resume.

The Intel code _must_ be able to safely wakeup from clock-stop if
something runtime-resumes the codec driver. ASoC relies on that, and
pm_runtime would be broken if that doesn't work.

> very good reason for using two wake-up mechanisms, pick a single one.
> 
> (a) and (c) are very similar in that all the exit is handled by
> pm_runtime so I am not worried too much. I do worry about paths that
> were never tested and never planned for.

Pierre-Louis Bossart Jan. 23, 2023, 6:07 p.m. UTC | #10

On 1/23/23 11:17, Richard Fitzgerald wrote:
> On 23/01/2023 16:38, Pierre-Louis Bossart wrote:
>>
>>
>> On 1/23/23 10:08, Richard Fitzgerald wrote:
>>> On 23/01/2023 15:50, Pierre-Louis Bossart wrote:
>>>>
>>>>
>>>> On 1/23/23 08:53, Charles Keepax wrote:
>>>>> On Fri, Jan 20, 2023 at 10:20:50AM -0600, Pierre-Louis Bossart wrote:
>>>>>> On 1/20/23 03:59, Charles Keepax wrote:
>>>>>>> On Thu, Jan 19, 2023 at 11:12:04AM -0600, Pierre-Louis Bossart
>>>>>>> wrote:
>>>>>>>> There should be an explanation and something checking that both
>>>>>>>> are not
>>>>>>>> used concurrently.
>>>>>>>
>>>>>>> I will try to expand the explanation a litte, but I dont see any
>>>>>>> reason to block calling both handlers, no ill effects would come
>>>>>>> for a driver having both and it is useful if any soundwire
>>>>>>> specific steps are needed that arn't on other control buses.
>>>>>>
>>>>>> I think it's problematic if the peripheral tries to wake-up the
>>>>>> manager
>>>>>> from clock-stop with both an in-band wake (i.e. drive the data line
>>>>>> high) and a separate GPIO-based interrupt. It's asking for trouble
>>>>>> IMHO.
>>>>>> We spent hours in the MIPI team to make sure there were no races
>>>>>> between
>>>>>> the manager-initiated restarts and peripheral-initiated restarts,
>>>>>> adding
>>>>>> a 3rd mechanism in the mix gives me a migraine already.
>>>>>
>>>>> Apologies but I am struggling see why this has any bearing on
>>>>> the case of a device that does both an in-band and out-of-band
>>>>> wake. The code we are adding in this patch will only be called in the
>>>>> in-band case. handle_nested_irq doesn't do any hardware magic or
>>>>> schedule any threads, it just calls a function that was provided
>>>>> when the client called request_threaded_irq. The only guarantee
>>>>> of atomicity you have on the interrupt_callback is sdw_dev_lock
>>>>> and that is being held across both calls after the patch.
>>>>>
>>>>> Could you be a little more specific on what you mean by this
>>>>> represents a 3rd mechanism, to me this isn't a new mechanism just
>>>>> an extra callback? Say for example this patch added an
>>>>> interrupt_callback_early to sdw_slave_ops that is called just
>>>>> before interrupt_callback.
>>>>
>>>> Well, the main concern is exiting the clock-stop. That is handled by
>>>> the
>>>> manager and could be done
>>>> a) as the result of the framework deciding that something needs to be
>>>> done (typically as a result of user/applications starting a stream)
>>>> b) by the device with an in-band wake in case of e.g. jack detection or
>>>> acoustic events detected
>>>> c) same as b) but with a separate out-of-band interrupt.
>>>>
>>>> I'd like to make sure b) and c) are mutually-exclusive options, and
>>>> that
>>>> the device will not throw BOTH an in-band wake and an external
>>>> interrupt.
>>>
>>> Why would it be a problem if the device did (b) and (c)?
>>> (c) is completely invisible to the SoundWire core and not something
>>> that it has to handle. The handler for an out-of-band interrupt must
>>> call pm_runtime_get_sync() or pm_runtime_resume_and_get() and that
>>> would wake its own driver and the host controller.
>>
>> The Intel hardware has a power optimization for the clock-stop, which
>> leads to different paths to wake the system. The SoundWire IP can deal
>> with the data line staying high, but in the optimized mode the wakes are
>> signaled as DSP interrupts at a higher level. That's why we added this
>> intel_link_process_wakeen_event() function called from
>> hda_dsp_interrupt_thread().
>>
>> So yes on paper everything would work nicely, but that's asking for
>> trouble with races left and right. In other words, unless you have a
> 
> Wake up from a hard INT is simply a runtime_resume of the codec driver.
> That is no different from ASoC runtime resuming the driver to perform
> some audio activity, or to access a volatile register. An event caused
> a runtime-resume - the driver and the host controller must resume.
> 
> The Intel code _must_ be able to safely wakeup from clock-stop if
> something runtime-resumes the codec driver. ASoC relies on that, and
> pm_runtime would be broken if that doesn't work.

Like I said before, the Intel code will work with either b) or c).

Using both to exit clock stop is not a recommended/tested solution, and
it's not something I have a burning desire to look into. If you register
an external IRQ, then pretty please describe your device as not
'wake_capable'.

>> very good reason for using two wake-up mechanisms, pick a single one.
>>
>> (a) and (c) are very similar in that all the exit is handled by
>> pm_runtime so I am not worried too much. I do worry about paths that
>> were never tested and never planned for.

[1/2] soundwire: bus: Don't filter slave alerts

Commit Message

Comments

Patch