[3/3] soundwire: bus: Fix lost UNATTACH when re-enumerating

Message ID 20220825122241.273090-4-rf@opensource.cirrus.com
State New
Series soundwire: Fixes for spurious and missing UNATTACH

Commit Message

Richard Fitzgerald Aug. 25, 2022, 12:22 p.m. UTC
Rearrange sdw_handle_slave_status() so that any peripherals
on device #0 that are given a device ID are reported as
unattached. This ensures that UNATTACH status is not lost.

Handle unenumerated devices first and update the
sdw_slave_status array to indicate IDs that must have become
UNATTACHED.

Look for UNATTACHED devices after this so we can pick up
peripherals that were UNATTACHED in the original PING status
and those that were still ATTACHED at the time of the PING but
then reverted to unenumerated and were found by
sdw_program_device_num().

As sdw_handle_slave_status() is always processing a snapshot of
a PING from some time in the past, it is possible that the status
is changing while sdw_handle_slave_status() is running.

A peripheral could report attached in the PING, but detach and
revert to device #0 and then be found in the loop in
sdw_program_device_num(). Previously the code would not have
updated slave->status to UNATTACHED because there was never a
PING with that status. If the slave->status is not updated to
UNATTACHED the next PING will report it as ATTACHED, but its
slave->status is already ATTACHED so the re-attach will not be
properly handled.

This situation happens fairly frequently with multiple
peripherals on a bus that are intentionally reset (for example
after downloading firmware).

Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
---
 drivers/soundwire/bus.c | 39 ++++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

Comments

Richard Fitzgerald Aug. 29, 2022, 9:50 a.m. UTC | #1
On 26/08/2022 09:06, Pierre-Louis Bossart wrote:
> 
<SNIP>
> 
> Thanks for the detailed answer, this sequence of events will certainly
> defeat the Cadence IP and the way sticky bits were handled.
> 
> The UNATTACHED case was assumed to be a really rare case of losing sync,
> i.e. a SOFT_RESET in SoundWire parlance.
> 
> If you explicitly do a device reset, that would be a new scenario that
> was not considered before on any of the existing SoundWire commercial
> devices. It's however something we need to support, and your work here
> is much appreciated.
> 
> I still think we should re-check the actual status from a PING frame, in
> order to work with more current data than the sticky bits taken at an
> earlier time, but that would only be a minor improvement.
> 
> I also have a vague feeling that additional work is needed to make sure
> the DAIs are not used before that second enumeration and all firmware
> download complete. I did a couple of tests last year where I used the
> debugfs interface to issue a device reset command while streaming audio,
> and the detach/reattach was not handled at the ASoC level.
> 
> I really don't see any logical flaws in your patch as is, so
> 
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> 

I have pushed an alternative fix that waits until it sees an UNATTACHED
status before reprogramming the device ID.
https://lore.kernel.org/lkml/20220829094458.1169504-1-rf@opensource.cirrus.com/T/#t

I've tested it with 4 amps on the same bus, all being reset after their
firmware has been downloaded.

I leave it to you to choose which fix you prefer. The second fix is
simpler and I didn't see any problems in testing.

Richard Fitzgerald Aug. 30, 2022, 9 a.m. UTC | #2
On 25/08/2022 13:22, Richard Fitzgerald wrote:
> Rearrange sdw_handle_slave_status() so that any peripherals
> on device #0 that are given a device ID are reported as
> unattached. This ensures that UNATTACH status is not lost.
> 
> Handle unenumerated devices first and update the
> sdw_slave_status array to indicate IDs that must have become
> UNATTACHED.
> 

Don't use this patch!
I found there's a race condition with the Cadence interrupts.
Use my alternative fix.

Patch

diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index bb8ce26c68b3..1212148ac251 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -718,7 +718,8 @@  void sdw_extract_slave_id(struct sdw_bus *bus,
 }
 EXPORT_SYMBOL(sdw_extract_slave_id);
 
-static int sdw_program_device_num(struct sdw_bus *bus)
+static int sdw_program_device_num(struct sdw_bus *bus,
+				  enum sdw_slave_status status[])
 {
 	u8 buf[SDW_NUM_DEV_ID_REGISTERS] = {0};
 	struct sdw_slave *slave, *_s;
@@ -776,6 +777,12 @@  static int sdw_program_device_num(struct sdw_bus *bus)
 					return ret;
 				}
 
+				/*
+				 * It could have dropped off the bus since the
+				 * PING response so update the status array.
+				 */
+				status[slave->dev_num] = SDW_SLAVE_UNATTACHED;
+
 				break;
 			}
 		}
@@ -1735,10 +1742,21 @@  int sdw_handle_slave_status(struct sdw_bus *bus,
 {
 	enum sdw_slave_status prev_status;
 	struct sdw_slave *slave;
+	bool programmed_dev_num = false;
 	bool attached_initializing;
 	int i, ret = 0;
 
-	/* first check if any Slaves fell off the bus */
+	/* Handle any unenumerated peripherals */
+	if (status[0] == SDW_SLAVE_ATTACHED) {
+		dev_dbg(bus->dev, "Slave attached, programming device number\n");
+		ret = sdw_program_device_num(bus, status);
+		if (ret < 0)
+			dev_warn(bus->dev, "Slave attach failed: %d\n", ret);
+
+		programmed_dev_num = true;
+	}
+
+	/* Check if any fell off the bus */
 	for (i = 1; i <= SDW_MAX_DEVICES; i++) {
 		mutex_lock(&bus->bus_lock);
 		if (test_bit(i, bus->assigned) == false) {
@@ -1764,17 +1782,12 @@  int sdw_handle_slave_status(struct sdw_bus *bus,
 		}
 	}
 
-	if (status[0] == SDW_SLAVE_ATTACHED) {
-		dev_dbg(bus->dev, "Slave attached, programming device number\n");
-		ret = sdw_program_device_num(bus);
-		if (ret < 0)
-			dev_err(bus->dev, "Slave attach failed: %d\n", ret);
-		/*
-		 * programming a device number will have side effects,
-		 * so we deal with other devices at a later time
-		 */
-		return ret;
-	}
+	/*
+	 * programming a device number will have side effects,
+	 * so we deal with other devices at a later time
+	 */
+	if (programmed_dev_num)
+		return 0;
 
 	/* Continue to check other slave statuses */
 	for (i = 1; i <= SDW_MAX_DEVICES; i++) {