diff mbox series

[RFC,07/16] soundwire: bus: add API for BPT protocol

Message ID 20231207222944.663893-8-pierre-louis.bossart@linux.intel.com
State New
Headers show
Series soundwire/ASoC: speed-up downloads with BTP/BRA protocol | expand

Commit Message

Pierre-Louis Bossart Dec. 7, 2023, 10:29 p.m. UTC
Add definitions and helpers for the BPT/BRA protocol. Peripheral
drivers (aka ASoC codec drivers) can use this API to send bulk data
such as firmware or tables.

The API is only available when no other audio streams have been
allocated, and only one BTP/BRA stream is allowed per link.  To avoid
the addition of yet another lock, the refcount tests are handled in
the stream master_runtime alloc/free routines where the bus_lock is
already held. Another benefit of this approach is that the same
bus_lock is used to handle runtime and port linked lists, which
reduces the potential for misaligned configurations.

In addition to exclusion with audio streams, BPT transfers have a lot
of overhead, specifically registers writes are needed to enable
transport in DP0. In addition, most DMAs don't handle too well very
small data sets.

This patch suggests a minimum bound of 64 bytes, for smaller transfers
codec drivers should rely on the regular read/write commands in
Column0.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 drivers/soundwire/bus.c       | 77 +++++++++++++++++++++++++++++++++++
 drivers/soundwire/bus.h       | 18 ++++++++
 drivers/soundwire/stream.c    | 30 ++++++++++++++
 include/linux/soundwire/sdw.h | 76 ++++++++++++++++++++++++++++++++++
 4 files changed, 201 insertions(+)

Comments

Charles Keepax Dec. 12, 2023, 12:19 p.m. UTC | #1
On Thu, Dec 07, 2023 at 04:29:35PM -0600, Pierre-Louis Bossart wrote:
> This patch suggests a minimum bound of 64 bytes, for smaller transfers

128 in the code.

> +int sdw_bpt_close_stream(struct sdw_bus *bus,
> +			 struct sdw_slave *slave,
> +			 enum sdw_bpt_type mode,
> +			 struct sdw_bpt_msg *msg)
> +{

Seems weird to pass the message to close?

> +int sdw_bpt_open_stream(struct sdw_bus *bus,
> +			struct sdw_slave *slave,
> +			enum sdw_bpt_type mode,
> +			struct sdw_bpt_msg *msg)

Ditto, here does it make sense to pass the msg to open? I guess
the idea is that one only sends a single message in one
open->send->wait->close cycle? Would be much nicer if multiple
send/waits could be done in a single open/close, or if the
limitation is unavoidable why split out open/send into separate
calls at all? Just have send and wait and wrap open/close into
them.

> +	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
> +		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
> +			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
> +		return -EINVAL;
> +	}

Should this really be here? My understanding is this alignment is
a limitation of specific hardware so should this check not be in
the Cadence master code.

> +int sdw_bpt_send_async(struct sdw_bus *bus,
> +		       struct sdw_slave *slave,
> +		       struct sdw_bpt_msg *msg)
> +{
> +	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
> +		return -EINVAL;

Does this check make sense since this was already checked in
open? I guess the user could pass in a different message at this
stage, but that I guess is part of what feels weird about passing
the message into open.

> +/**
> + * struct sdw_btp_msg - Message structure
> + * @addr: Start Register address accessed in the Slave
> + * @len: number of bytes to transfer. More than 64Kb can be transferred
> + * but a practical limit of SDW_BPT_MSG_MAX_BYTES is enforced.

Where is the 64kb coming from here?

> +/*
> + * Add a minimum number of bytes for BPT transfer sizes. BPT has a lot of
> + * overhead, enabling/disabling a stream costs 6 write commands, plus the bank
> + * switch that could be delayed in time. In addition, transferring very small
> + * data sets over DMA is known to be problematic on multiple platforms.
> + */
> +#define SDW_BPT_MSG_MIN_BYTES      128
> +

Is it really necessary for the core to enforce a minimum transfer
size (well except for the required alignment thing)? Yes small
transfers don't obviously make sense, but there is nothing inherently
wrong with doing one, which makes me feel it is excessive for the
core to block more than it has to here.

I would be of the opinion its up to driver writers if they have
some reason to do small BRA transfers. Firstly, since we are so
keen on allowing BRA and normal writes to overlap, one could see
use-cases when you want to do something through BRA such that it
doesn't block the normal command stream. Also there is already 1
feature on cs42l43 that can only be accessed through BRA, yes that
is a heroically questionable hardware design choice. Whilst we are
mostly ignoring that for now, I can see this being something some
other hardware teams decide to heroically do at some point as well.

Thanks,
Charles
Vinod Koul Dec. 18, 2023, 11:54 a.m. UTC | #2
On 07-12-23, 16:29, Pierre-Louis Bossart wrote:
> Add definitions and helpers for the BPT/BRA protocol. Peripheral
> drivers (aka ASoC codec drivers) can use this API to send bulk data
> such as firmware or tables.
> 
> The API is only available when no other audio streams have been
> allocated, and only one BTP/BRA stream is allowed per link.  To avoid
> the addition of yet another lock, the refcount tests are handled in
> the stream master_runtime alloc/free routines where the bus_lock is
> already held. Another benefit of this approach is that the same
> bus_lock is used to handle runtime and port linked lists, which
> reduces the potential for misaligned configurations.
> 
> In addition to exclusion with audio streams, BPT transfers have a lot
> of overhead, specifically registers writes are needed to enable
> transport in DP0. In addition, most DMAs don't handle too well very
> small data sets.
> 
> This patch suggests a minimum bound of 64 bytes, for smaller transfers
> codec drivers should rely on the regular read/write commands in
> Column0.
> 
> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
> ---
>  drivers/soundwire/bus.c       | 77 +++++++++++++++++++++++++++++++++++
>  drivers/soundwire/bus.h       | 18 ++++++++
>  drivers/soundwire/stream.c    | 30 ++++++++++++++
>  include/linux/soundwire/sdw.h | 76 ++++++++++++++++++++++++++++++++++
>  4 files changed, 201 insertions(+)
> 
> diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
> index f3fec15c3112..e5758d2ed88f 100644
> --- a/drivers/soundwire/bus.c
> +++ b/drivers/soundwire/bus.c
> @@ -2015,3 +2015,80 @@ void sdw_clear_slave_status(struct sdw_bus *bus, u32 request)
>  	}
>  }
>  EXPORT_SYMBOL(sdw_clear_slave_status);
> +
> +int sdw_bpt_close_stream(struct sdw_bus *bus,
> +			 struct sdw_slave *slave,
> +			 enum sdw_bpt_type mode,
> +			 struct sdw_bpt_msg *msg)
> +{
> +	int ret;
> +
> +	ret = bus->ops->bpt_close_stream(bus, slave, mode, msg);
> +	if (ret < 0)
> +		dev_err(bus->dev, "BPT stream close, err %d\n", ret);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(sdw_bpt_close_stream);
> +
> +int sdw_bpt_open_stream(struct sdw_bus *bus,
> +			struct sdw_slave *slave,
> +			enum sdw_bpt_type mode,
> +			struct sdw_bpt_msg *msg)
> +{
> +	int ret;
> +
> +	/* only Bulk Register Access (BRA) is supported for now */
> +	if (mode != SDW_BRA)
> +		return -EINVAL;
> +
> +	if (msg->len < SDW_BPT_MSG_MIN_BYTES) {
> +		dev_err(bus->dev, "BPT message length %d, min supported %d\n",
> +			msg->len, SDW_BPT_MSG_MIN_BYTES);
> +		return -EINVAL;
> +	}
> +
> +	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
> +		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
> +			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
> +		return -EINVAL;
> +	}

Is this a protocol requirement?

> +
> +	/* check device is enumerated */
> +	if (slave->dev_num == SDW_ENUM_DEV_NUM ||
> +	    slave->dev_num > SDW_MAX_DEVICES)
> +		return -ENODEV;
> +
> +	/* make sure all callbacks are defined */
> +	if (!bus->ops->bpt_open_stream ||
> +	    !bus->ops->bpt_close_stream ||
> +	    !bus->ops->bpt_send_async ||
> +	    !bus->ops->bpt_wait)
> +		return -ENOTSUPP;

should this not be checked at probe time, if device declares the support

> +
> +	ret = bus->ops->bpt_open_stream(bus, slave, mode, msg);
> +	if (ret < 0)
> +		dev_err(bus->dev, "BPT stream open, err %d\n", ret);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(sdw_bpt_open_stream);

can we open multiple times (i dont see a check preventing that), how do
we close..?

Re-iterating my comment on documentation patch, can we do with a async api
and wait api, that makes symantics a lot simpler, right..?

> +
> +int sdw_bpt_send_async(struct sdw_bus *bus,
> +		       struct sdw_slave *slave,
> +		       struct sdw_bpt_msg *msg)
> +{
> +	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
> +		return -EINVAL;
> +
> +	return bus->ops->bpt_send_async(bus, slave, msg);
> +}
> +EXPORT_SYMBOL(sdw_bpt_send_async);

Can we call this multiple times after open, it is unclear to me. Can you
please add kernel-doc comments about the APIs here as well

>  struct sdw_master_ops {
>  	int (*read_prop)(struct sdw_bus *bus);
> @@ -869,6 +913,20 @@ struct sdw_master_ops {
>  	void (*new_peripheral_assigned)(struct sdw_bus *bus,
>  					struct sdw_slave *slave,
>  					int dev_num);
> +	int (*bpt_open_stream)(struct sdw_bus *bus,
> +			       struct sdw_slave *slave,
> +			       enum sdw_bpt_type mode,
> +			       struct sdw_bpt_msg *msg);
> +	int (*bpt_close_stream)(struct sdw_bus *bus,
> +				struct sdw_slave *slave,
> +				enum sdw_bpt_type mode,
> +				struct sdw_bpt_msg *msg);
> +	int (*bpt_send_async)(struct sdw_bus *bus,
> +			      struct sdw_slave *slave,
> +			      struct sdw_bpt_msg *msg);
> +	int (*bpt_wait)(struct sdw_bus *bus,
> +			struct sdw_slave *slave,
> +			struct sdw_bpt_msg *msg);

do we need both bus and slave, that was a mistake in orignal design IMO.
We should fix that for bpt_ apis
Pierre-Louis Bossart Dec. 18, 2023, 1:12 p.m. UTC | #3
>> +int sdw_bpt_open_stream(struct sdw_bus *bus,
>> +			struct sdw_slave *slave,
>> +			enum sdw_bpt_type mode,
>> +			struct sdw_bpt_msg *msg)
>> +{
>> +	int ret;
>> +
>> +	/* only Bulk Register Access (BRA) is supported for now */
>> +	if (mode != SDW_BRA)
>> +		return -EINVAL;
>> +
>> +	if (msg->len < SDW_BPT_MSG_MIN_BYTES) {
>> +		dev_err(bus->dev, "BPT message length %d, min supported %d\n",
>> +			msg->len, SDW_BPT_MSG_MIN_BYTES);
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
>> +		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
>> +			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
>> +		return -EINVAL;
>> +	}
> 
> Is this a protocol requirement?

No, it's an implementation requirement.

We could move this to host-specific parts but then the codec drivers
will have to know about alignment requirements for each host they are
use with. IOW, it's more work for codec drivers if we don't have a
minimum bar for alignment requirement across all platforms.

> 
>> +
>> +	/* check device is enumerated */
>> +	if (slave->dev_num == SDW_ENUM_DEV_NUM ||
>> +	    slave->dev_num > SDW_MAX_DEVICES)
>> +		return -ENODEV;
>> +
>> +	/* make sure all callbacks are defined */
>> +	if (!bus->ops->bpt_open_stream ||
>> +	    !bus->ops->bpt_close_stream ||
>> +	    !bus->ops->bpt_send_async ||
>> +	    !bus->ops->bpt_wait)
>> +		return -ENOTSUPP;
> 
> should this not be checked at probe time, if device declares the support

sdw_bpt_open_stream() would be called by the peripheral driver (or
regmap as a proxy). The peripheral driver could also decide to check for
those callback during its probe, but that's beyond the scope of this
patchset.

These checks are just there for paranoia, in case a peripheral driver
uses BTP/BRA on a host where they are not supported.

It's not science-fiction, we see AMD- and INTEL-based platforms using
the same SoundWire-based codecs.

>> +	ret = bus->ops->bpt_open_stream(bus, slave, mode, msg);
>> +	if (ret < 0)
>> +		dev_err(bus->dev, "BPT stream open, err %d\n", ret);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(sdw_bpt_open_stream);
> 
> can we open multiple times (i dont see a check preventing that), how do
> we close..?

there's a refcount preventing multiples BTP streams from being opened.

> Re-iterating my comment on documentation patch, can we do with a async api
> and wait api, that makes symantics a lot simpler, right..?

see reply in previous email, combining open+send is weird IMHO.

>> +
>> +int sdw_bpt_send_async(struct sdw_bus *bus,
>> +		       struct sdw_slave *slave,
>> +		       struct sdw_bpt_msg *msg)
>> +{
>> +	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
>> +		return -EINVAL;
>> +
>> +	return bus->ops->bpt_send_async(bus, slave, msg);
>> +}
>> +EXPORT_SYMBOL(sdw_bpt_send_async);
> 
> Can we call this multiple times after open, it is unclear to me. Can you
> please add kernel-doc comments about the APIs here as well

This can be called multiple times but it's useless: all the buffers are
prepared in the open() stage. This is the moral equivalent of a trigger
step, just enable data transfers.

> 
>>  struct sdw_master_ops {
>>  	int (*read_prop)(struct sdw_bus *bus);
>> @@ -869,6 +913,20 @@ struct sdw_master_ops {
>>  	void (*new_peripheral_assigned)(struct sdw_bus *bus,
>>  					struct sdw_slave *slave,
>>  					int dev_num);
>> +	int (*bpt_open_stream)(struct sdw_bus *bus,
>> +			       struct sdw_slave *slave,
>> +			       enum sdw_bpt_type mode,
>> +			       struct sdw_bpt_msg *msg);
>> +	int (*bpt_close_stream)(struct sdw_bus *bus,
>> +				struct sdw_slave *slave,
>> +				enum sdw_bpt_type mode,
>> +				struct sdw_bpt_msg *msg);
>> +	int (*bpt_send_async)(struct sdw_bus *bus,
>> +			      struct sdw_slave *slave,
>> +			      struct sdw_bpt_msg *msg);
>> +	int (*bpt_wait)(struct sdw_bus *bus,
>> +			struct sdw_slave *slave,
>> +			struct sdw_bpt_msg *msg);
> 
> do we need both bus and slave, that was a mistake in orignal design IMO.
> We should fix that for bpt_ apis

No disagreement. All the routines follow the same template, if we change
one we should also change the others.

The main question as discussed with Charles is whether we want to pass
the 'msg' argument in all routines.
Charles Keepax Dec. 18, 2023, 2:57 p.m. UTC | #4
On Mon, Dec 18, 2023 at 02:12:36PM +0100, Pierre-Louis Bossart wrote:
> > Is this a protocol requirement?
> 
> No, it's an implementation requirement.
> 
> We could move this to host-specific parts but then the codec drivers
> will have to know about alignment requirements for each host they are
> use with. IOW, it's more work for codec drivers if we don't have a
> minimum bar for alignment requirement across all platforms.
> 

I do certainly see that side of the argument and it does probably
warrant some thought as to how a slave might learn the alignment
requirements. I guess maybe some sort of core helper function to
return the alignment? Or putting it in properties the slave can
access? One could even keep the check here, but just pull the
value from something system specific.

The danger with putting it in the core is IMHO:

a) It rules out certain use-cases, generally I think its a bad
idea if the framework design prohibits stuff the underlying bus
could do because someone will, at some point, want to do it.

b) The core limit could get a bit out of hand once more
controllers are added. The core limit needs to be a multiple of
all the controller limits, if a controller comes along with a
weird alignment requirement, that gets problematic fast.

Thanks,
Charles
Pierre-Louis Bossart Dec. 18, 2023, 4:44 p.m. UTC | #5
On 12/18/23 08:57, Charles Keepax wrote:
> On Mon, Dec 18, 2023 at 02:12:36PM +0100, Pierre-Louis Bossart wrote:
>>> Is this a protocol requirement?
>>
>> No, it's an implementation requirement.
>>
>> We could move this to host-specific parts but then the codec drivers
>> will have to know about alignment requirements for each host they are
>> use with. IOW, it's more work for codec drivers if we don't have a
>> minimum bar for alignment requirement across all platforms.
>>
> 
> I do certainly see that side of the argument and it does probably
> warrant some thought as to how a slave might learn the alignment
> requirements. I guess maybe some sort of core helper function to
> return the alignment? Or putting it in properties the slave can
> access? One could even keep the check here, but just pull the
> value from something system specific.
> 
> The danger with putting it in the core is IMHO:
> 
> a) It rules out certain use-cases, generally I think its a bad
> idea if the framework design prohibits stuff the underlying bus
> could do because someone will, at some point, want to do it.

SoundWire has lots of fancy and borderline nebulous concepts, my take is
"let's do few things and do them well". We can always revisit new usages
later, for now my main objective is "speed up downloads".

> b) The core limit could get a bit out of hand once more
> controllers are added. The core limit needs to be a multiple of
> all the controller limits, if a controller comes along with a
> weird alignment requirement, that gets problematic fast.

I don't have any information on other controllers, but I wouldn't be
surprised if most of the quirks are due to peripheral limitations and
ambiguous interpretations of what 'ACK' means. I tried writing to
reserved parts of the memory or non-existent registers and nothing bad
was reported...
diff mbox series

Patch

diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
index f3fec15c3112..e5758d2ed88f 100644
--- a/drivers/soundwire/bus.c
+++ b/drivers/soundwire/bus.c
@@ -2015,3 +2015,80 @@  void sdw_clear_slave_status(struct sdw_bus *bus, u32 request)
 	}
 }
 EXPORT_SYMBOL(sdw_clear_slave_status);
+
+int sdw_bpt_close_stream(struct sdw_bus *bus,
+			 struct sdw_slave *slave,
+			 enum sdw_bpt_type mode,
+			 struct sdw_bpt_msg *msg)
+{
+	int ret;
+
+	ret = bus->ops->bpt_close_stream(bus, slave, mode, msg);
+	if (ret < 0)
+		dev_err(bus->dev, "BPT stream close, err %d\n", ret);
+
+	return ret;
+}
+EXPORT_SYMBOL(sdw_bpt_close_stream);
+
+int sdw_bpt_open_stream(struct sdw_bus *bus,
+			struct sdw_slave *slave,
+			enum sdw_bpt_type mode,
+			struct sdw_bpt_msg *msg)
+{
+	int ret;
+
+	/* only Bulk Register Access (BRA) is supported for now */
+	if (mode != SDW_BRA)
+		return -EINVAL;
+
+	if (msg->len < SDW_BPT_MSG_MIN_BYTES) {
+		dev_err(bus->dev, "BPT message length %d, min supported %d\n",
+			msg->len, SDW_BPT_MSG_MIN_BYTES);
+		return -EINVAL;
+	}
+
+	if (msg->len % SDW_BPT_MSG_BYTE_ALIGNMENT) {
+		dev_err(bus->dev, "BPT message length %d is not a multiple of %d bytes\n",
+			msg->len, SDW_BPT_MSG_BYTE_ALIGNMENT);
+		return -EINVAL;
+	}
+
+	/* check device is enumerated */
+	if (slave->dev_num == SDW_ENUM_DEV_NUM ||
+	    slave->dev_num > SDW_MAX_DEVICES)
+		return -ENODEV;
+
+	/* make sure all callbacks are defined */
+	if (!bus->ops->bpt_open_stream ||
+	    !bus->ops->bpt_close_stream ||
+	    !bus->ops->bpt_send_async ||
+	    !bus->ops->bpt_wait)
+		return -ENOTSUPP;
+
+	ret = bus->ops->bpt_open_stream(bus, slave, mode, msg);
+	if (ret < 0)
+		dev_err(bus->dev, "BPT stream open, err %d\n", ret);
+
+	return ret;
+}
+EXPORT_SYMBOL(sdw_bpt_open_stream);
+
+int sdw_bpt_send_async(struct sdw_bus *bus,
+		       struct sdw_slave *slave,
+		       struct sdw_bpt_msg *msg)
+{
+	if (msg->len > SDW_BPT_MSG_MAX_BYTES)
+		return -EINVAL;
+
+	return bus->ops->bpt_send_async(bus, slave, msg);
+}
+EXPORT_SYMBOL(sdw_bpt_send_async);
+
+int sdw_bpt_wait(struct sdw_bus *bus,
+		 struct sdw_slave *slave,
+		 struct sdw_bpt_msg *msg)
+{
+	return bus->ops->bpt_wait(bus, slave, msg);
+}
+EXPORT_SYMBOL(sdw_bpt_wait);
diff --git a/drivers/soundwire/bus.h b/drivers/soundwire/bus.h
index fda6b24ac2da..936852ab3154 100644
--- a/drivers/soundwire/bus.h
+++ b/drivers/soundwire/bus.h
@@ -72,6 +72,24 @@  struct sdw_msg {
 	bool page;
 };
 
+/**
+ * struct sdw_btp_msg - Message structure
+ * @addr: Start Register address accessed in the Slave
+ * @len: number of bytes to transfer. More than 64Kb can be transferred
+ * but a practical limit of SDW_BPT_MSG_MAX_BYTES is enforced.
+ * @dev_num: Slave device number
+ * @flags: transfer flags, indicate if xfer is read or write
+ * @buf: message data buffer (filled by host for write, filled
+ * by Peripheral hardware for reads)
+ */
+struct sdw_bpt_msg {
+	u32 addr;
+	u32 len;
+	u8 dev_num;
+	u8 flags;
+	u8 *buf;
+};
+
 #define SDW_DOUBLE_RATE_FACTOR		2
 #define SDW_STRM_RATE_GROUPING		1
 
diff --git a/drivers/soundwire/stream.c b/drivers/soundwire/stream.c
index f9762610f051..b097b1a808f9 100644
--- a/drivers/soundwire/stream.c
+++ b/drivers/soundwire/stream.c
@@ -1189,6 +1189,20 @@  static struct sdw_master_runtime
 	struct sdw_master_runtime *m_rt, *walk_m_rt;
 	struct list_head *insert_after;
 
+	if (stream->type == SDW_STREAM_BPT) {
+		if (bus->stream_refcount > 0 || bus->bpt_stream_refcount > 0) {
+			dev_err(bus->dev, "%s: invalid configuration %d streams %d BPT streams\n",
+				__func__, bus->stream_refcount, bus->bpt_stream_refcount);
+			return ERR_PTR(-EBUSY);
+		}
+	} else {
+		if (bus->bpt_stream_refcount > 0) {
+			dev_err(bus->dev, "%s: BTP stream already allocated\n",
+				__func__);
+			return ERR_PTR(-EAGAIN);
+		}
+	}
+
 	m_rt = kzalloc(sizeof(*m_rt), GFP_KERNEL);
 	if (!m_rt)
 		return NULL;
@@ -1217,6 +1231,8 @@  static struct sdw_master_runtime
 	m_rt->stream = stream;
 
 	bus->stream_refcount++;
+	if (stream->type == SDW_STREAM_BPT)
+		bus->bpt_stream_refcount++;
 
 	return m_rt;
 }
@@ -1265,6 +1281,8 @@  static void sdw_master_rt_free(struct sdw_master_runtime *m_rt,
 	list_del(&m_rt->bus_node);
 	kfree(m_rt);
 
+	if (stream->type == SDW_STREAM_BPT)
+		bus->bpt_stream_refcount--;
 	bus->stream_refcount--;
 }
 
@@ -1941,6 +1959,12 @@  int sdw_stream_add_master(struct sdw_bus *bus,
 	m_rt = sdw_master_rt_find(bus, stream);
 	if (!m_rt) {
 		m_rt = sdw_master_rt_alloc(bus, stream);
+		if (IS_ERR(m_rt)) {
+			ret = PTR_ERR(m_rt);
+			dev_err(bus->dev, "%s: Master runtime alloc failed for stream:%s: %d\n",
+				__func__, stream->name, ret);
+			goto unlock;
+		}
 		if (!m_rt) {
 			dev_err(bus->dev, "%s: Master runtime alloc failed for stream:%s\n",
 				__func__, stream->name);
@@ -2056,6 +2080,12 @@  int sdw_stream_add_slave(struct sdw_slave *slave,
 		 * So, allocate m_rt and add Slave to it.
 		 */
 		m_rt = sdw_master_rt_alloc(slave->bus, stream);
+		if (IS_ERR(m_rt)) {
+			ret = PTR_ERR(m_rt);
+			dev_err(&slave->dev, "%s: Master runtime alloc failed for stream:%s: %d\n",
+				__func__, stream->name, ret);
+			goto unlock;
+		}
 		if (!m_rt) {
 			dev_err(&slave->dev, "%s: Master runtime alloc failed for stream:%s\n",
 				__func__, stream->name);
diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
index 1dda65d91bad..e54c5bbd2b91 100644
--- a/include/linux/soundwire/sdw.h
+++ b/include/linux/soundwire/sdw.h
@@ -835,6 +835,46 @@  struct sdw_defer {
 	struct sdw_msg *msg;
 };
 
+/*
+ * Add a practical limit to BPT transfer sizes. BPT is typically used
+ * to transfer firmware, and larger firmware transfers will increase
+ * the cold latency beyond typical OS or user requirements.
+ */
+#define SDW_BPT_MSG_MAX_BYTES  (1024 * 1024)
+
+/*
+ * Add a minimum number of bytes for BPT transfer sizes. BPT has a lot of
+ * overhead, enabling/disabling a stream costs 6 write commands, plus the bank
+ * switch that could be delayed in time. In addition, transferring very small
+ * data sets over DMA is known to be problematic on multiple platforms.
+ */
+#define SDW_BPT_MSG_MIN_BYTES      128
+
+/*
+ * Add an alignment requirement for BPT transfer sizes. BPT is
+ * typically with DMAs which are most-efficient with data aligned to a
+ * cache line.  For example HDaudio DMAs strongly recommend to use
+ * multiples of 128 bytes.
+ * This could be a platform-specific value, but that would be problematic for
+ * codec drivers which are typically not platform-specific by definition.
+ * It's probably simpler for everyone to align on a baseline and use regular
+ * read/write commands for the remaining bytes.
+ */
+#define SDW_BPT_MSG_BYTE_ALIGNMENT 128
+
+/**
+ * sdw_bpt_type - SoundWire Bulk Payload Transfer (BPT) type
+ * @SDW_BRA: Soundwire Bulk Register Access (BRA)
+ *
+ * The SoundWire specification allows for implementation-defined
+ * solutions, they would be added after BRA.
+ */
+enum sdw_bpt_type {
+	SDW_BRA,
+};
+
+struct sdw_bpt_msg;
+
 /**
  * struct sdw_master_ops - Master driver ops
  * @read_prop: Read Master properties
@@ -850,6 +890,10 @@  struct sdw_defer {
  * @get_device_num: Callback for vendor-specific device_number allocation
  * @put_device_num: Callback for vendor-specific device_number release
  * @new_peripheral_assigned: Callback to handle enumeration of new peripheral.
+ * @bpt_open_stream: reserve resources for BPT stream
+ * @bpt_close_stream: release resources for BPT stream
+ * @bpt_send_async: send message using BTP protocol
+ * @bpt_wait: wait for message completion using BTP protocol
  */
 struct sdw_master_ops {
 	int (*read_prop)(struct sdw_bus *bus);
@@ -869,6 +913,20 @@  struct sdw_master_ops {
 	void (*new_peripheral_assigned)(struct sdw_bus *bus,
 					struct sdw_slave *slave,
 					int dev_num);
+	int (*bpt_open_stream)(struct sdw_bus *bus,
+			       struct sdw_slave *slave,
+			       enum sdw_bpt_type mode,
+			       struct sdw_bpt_msg *msg);
+	int (*bpt_close_stream)(struct sdw_bus *bus,
+				struct sdw_slave *slave,
+				enum sdw_bpt_type mode,
+				struct sdw_bpt_msg *msg);
+	int (*bpt_send_async)(struct sdw_bus *bus,
+			      struct sdw_slave *slave,
+			      struct sdw_bpt_msg *msg);
+	int (*bpt_wait)(struct sdw_bus *bus,
+			struct sdw_slave *slave,
+			struct sdw_bpt_msg *msg);
 };
 
 /**
@@ -905,6 +963,8 @@  struct sdw_master_ops {
  * synchronization will be used even if a stream only uses a single
  * SoundWire segment.
  * @stream_refcount: number of streams currently using this bus
+ * @btp_stream_refcount: number of BTP streams currently using this bus (should
+ * be zero or one, multiple streams per link is not supported).
  */
 struct sdw_bus {
 	struct device *dev;
@@ -935,6 +995,7 @@  struct sdw_bus {
 	bool multi_link;
 	int hw_sync_min_links;
 	int stream_refcount;
+	int bpt_stream_refcount;
 };
 
 int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent,
@@ -1052,6 +1113,21 @@  int sdw_bus_exit_clk_stop(struct sdw_bus *bus);
 int sdw_compare_devid(struct sdw_slave *slave, struct sdw_slave_id id);
 void sdw_extract_slave_id(struct sdw_bus *bus, u64 addr, struct sdw_slave_id *id);
 
+int sdw_bpt_open_stream(struct sdw_bus *bus,
+			struct sdw_slave *slave,
+			enum sdw_bpt_type mode,
+			struct sdw_bpt_msg *msg);
+int sdw_bpt_close_stream(struct sdw_bus *bus,
+			 struct sdw_slave *slave,
+			 enum sdw_bpt_type mode,
+			 struct sdw_bpt_msg *msg);
+int sdw_bpt_send_async(struct sdw_bus *bus,
+		       struct sdw_slave *slave,
+		       struct sdw_bpt_msg *msg);
+int sdw_bpt_wait(struct sdw_bus *bus,
+		 struct sdw_slave *slave,
+		 struct sdw_bpt_msg *msg);
+
 #if IS_ENABLED(CONFIG_SOUNDWIRE)
 
 int sdw_stream_add_slave(struct sdw_slave *slave,