diff mbox series

bus: mhi: host: Allocate entire MHI control config once

Message ID 20250328165913.3380933-1-jeff.hugo@oss.qualcomm.com
State Superseded
Headers show
Series bus: mhi: host: Allocate entire MHI control config once | expand

Commit Message

Jeff Hugo March 28, 2025, 4:59 p.m. UTC
From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>

MHI control configurations such as channel context, event context, command
context and rings, are currently allocated individually. During MHI
initialization MHI bus driver needs to configure the address space in
which this control configuration resides. Since different component of the
config is being allocated separately, only logical solution is to give the
entire RAM address space, as they could be anywhere.

As per MHI specification the MHI control configuration address space should
not be more them 4GB.

Since the current implementation is in violation of MHI specification.
Allocate a single giant DMA buffer for MHI control configurations and
limit the configuration address space to that buffer.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
---
 drivers/bus/mhi/host/init.c     | 259 +++++++++++++++++---------------
 drivers/bus/mhi/host/internal.h |   1 -
 include/linux/mhi.h             |   7 +
 3 files changed, 143 insertions(+), 124 deletions(-)

Comments

Manivannan Sadhasivam April 8, 2025, 7:01 a.m. UTC | #1
On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
> 
> MHI control configurations such as channel context, event context, command
> context and rings, are currently allocated individually. During MHI
> initialization MHI bus driver needs to configure the address space in
> which this control configuration resides. Since different component of the
> config is being allocated separately, only logical solution is to give the
> entire RAM address space, as they could be anywhere.
> 

This is fine...

> As per MHI specification the MHI control configuration address space should
> not be more them 4GB.
> 

Where exactly this limitation is specified in the spec? The spec supports full
64 bit address space for the MHI control/data structures. But due to the device
DMA limitations, MHI controller drivers often use 32 bit address space. But
that's not a spec limitation.

> Since the current implementation is in violation of MHI specification.

How?

> Allocate a single giant DMA buffer for MHI control configurations and
> limit the configuration address space to that buffer.
> 

I don't think this could work for all devices. For instance, some ath11k devices
use a fixed reserved region in host address space for MHICTRL/BASE.

- Mani
Jeff Hugo April 8, 2025, 2:56 p.m. UTC | #2
On 4/8/2025 1:01 AM, Manivannan Sadhasivam wrote:
> On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
>> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
>>
>> MHI control configurations such as channel context, event context, command
>> context and rings, are currently allocated individually. During MHI
>> initialization MHI bus driver needs to configure the address space in
>> which this control configuration resides. Since different component of the
>> config is being allocated separately, only logical solution is to give the
>> entire RAM address space, as they could be anywhere.
>>
> 
> This is fine...

We tripped over this when experimenting with an automotive market 
product. The FW for that product had a rather strict interpretation of 
the spec, which we confirmed with the spec owner.

In the specific FW implementation, the device maps the entire MHI space 
of shared structures in a single ATU entry. The device cannot map an 
entire 64-bit address space, and it expects all of the shared structures 
in a single compact range.

This applies to the control structures, not the data buffers per the 
device implementation.

This restriction seems backed by the spec.  I can't find a reason why 
the device is invalid, if limited.  I don't think this should break 
anything, but more on that below.

> 
>> As per MHI specification the MHI control configuration address space should
>> not be more them 4GB.
>>
> 
> Where exactly this limitation is specified in the spec? The spec supports full
> 64 bit address space for the MHI control/data structures. But due to the device
> DMA limitations, MHI controller drivers often use 32 bit address space. But
> that's not a spec limitation.

Its not the clearest thing, sadly.

Document 80-NF223-11 Rev AB "MHI spec v1.2" Section 6.2 "MHI Registers" 
table 6-19 (page 106) -

Describing MHICTRLLIMIT: "The most significant 32 bits of MHICTRLBASE 
and MHICTRLLIMIT registers must be equal."

This means we have a 4GB range (32-bit) to play with in a 64-bit address 
space.  If the upper 32-bits of the 64-bit address for both the base and 
the limit must be the same, then the range of addresses from the base to 
the limit can only vary the lower 32-bits.

Invalid:
BASE: 0x0
LIMIT: 0xffffffff_ffffffff

Valid:
BASE: 0x0f_00000000
LIMIT: 0x0f_ffffffff

>> Since the current implementation is in violation of MHI specification.

For example mhi_init_dev_ctxt()

We allocate the chan_ctxt with dma_alloc_coherent() as an individual 
allocation.  In the case of AIC100, the device can access the full 
64-bit address space, but the DMA engine is limited to a 32-bit transfer 
size.  The chan_ctxt probably won't be larger than 4GB, so the size is 
rather irrelevant.  Can be allocated anywhere.  Lets say that it gets 
put in the lower 32-bit address space - 0x0_XXXXXXXX

Then a little bit later we allocate er_ctxt with a different 
dma_alloc_coherent() instance.  Being a unique allocation, it is not 
tied to the chan_ctxt and can exist anywhere.  Lets assume that it gets 
put somewhere in the non-lower 32-bits - 0x1000_XXXXXXXX

Now we have a problem because we cannot describe a single range covering 
both of these allocations via MHICTRLBASE/MHICTRLLIMIT where the upper 
32-bits of both registers is the same.

>> Allocate a single giant DMA buffer for MHI control configurations and
>> limit the configuration address space to that buffer.
>>
> 
> I don't think this could work for all devices. For instance, some ath11k devices
> use a fixed reserved region in host address space for MHICTRL/BASE.

Why would we be unable to allocate all of the control structures in a 
single allocation out of that reserved region?  Is it larger than 4GB in 
size?

-Jeff
Manivannan Sadhasivam April 25, 2025, 5:37 a.m. UTC | #3
+ ath11k list, Jeff and Baochen (for question regarding the use of reserved
memory for allocating the MHI data structures in host)

On Tue, Apr 08, 2025 at 08:56:43AM -0600, Jeff Hugo wrote:
> On 4/8/2025 1:01 AM, Manivannan Sadhasivam wrote:
> > On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
> > > From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
> > > 
> > > MHI control configurations such as channel context, event context, command
> > > context and rings, are currently allocated individually. During MHI
> > > initialization MHI bus driver needs to configure the address space in
> > > which this control configuration resides. Since different component of the
> > > config is being allocated separately, only logical solution is to give the
> > > entire RAM address space, as they could be anywhere.
> > > 
> > 
> > This is fine...
> 
> We tripped over this when experimenting with an automotive market product.
> The FW for that product had a rather strict interpretation of the spec,
> which we confirmed with the spec owner.
> 
> In the specific FW implementation, the device maps the entire MHI space of
> shared structures in a single ATU entry. The device cannot map an entire
> 64-bit address space, and it expects all of the shared structures in a
> single compact range.
> 
> This applies to the control structures, not the data buffers per the device
> implementation.
> 
> This restriction seems backed by the spec.  I can't find a reason why the
> device is invalid, if limited.  I don't think this should break anything,
> but more on that below.
> 

Yes, atleast I have heard about that limitation before.

> > 
> > > As per MHI specification the MHI control configuration address space should
> > > not be more them 4GB.
> > > 
> > 
> > Where exactly this limitation is specified in the spec? The spec supports full
> > 64 bit address space for the MHI control/data structures. But due to the device
> > DMA limitations, MHI controller drivers often use 32 bit address space. But
> > that's not a spec limitation.
> 
> Its not the clearest thing, sadly.
> 
> Document 80-NF223-11 Rev AB "MHI spec v1.2" Section 6.2 "MHI Registers"
> table 6-19 (page 106) -
> 
> Describing MHICTRLLIMIT: "The most significant 32 bits of MHICTRLBASE and
> MHICTRLLIMIT registers must be equal."
> 
> This means we have a 4GB range (32-bit) to play with in a 64-bit address
> space.  If the upper 32-bits of the 64-bit address for both the base and the
> limit must be the same, then the range of addresses from the base to the
> limit can only vary the lower 32-bits.
> 
> Invalid:
> BASE: 0x0
> LIMIT: 0xffffffff_ffffffff
> 
> Valid:
> BASE: 0x0f_00000000
> LIMIT: 0x0f_ffffffff
> 

Ah. Didn't spot this before, thanks for explaining!

> > > Since the current implementation is in violation of MHI specification.
> 
> For example mhi_init_dev_ctxt()
> 
> We allocate the chan_ctxt with dma_alloc_coherent() as an individual
> allocation.  In the case of AIC100, the device can access the full 64-bit
> address space, but the DMA engine is limited to a 32-bit transfer size.  The
> chan_ctxt probably won't be larger than 4GB, so the size is rather
> irrelevant.  Can be allocated anywhere.  Lets say that it gets put in the
> lower 32-bit address space - 0x0_XXXXXXXX
> 
> Then a little bit later we allocate er_ctxt with a different
> dma_alloc_coherent() instance.  Being a unique allocation, it is not tied to
> the chan_ctxt and can exist anywhere.  Lets assume that it gets put
> somewhere in the non-lower 32-bits - 0x1000_XXXXXXXX
> 
> Now we have a problem because we cannot describe a single range covering
> both of these allocations via MHICTRLBASE/MHICTRLLIMIT where the upper
> 32-bits of both registers is the same.
> 

Yes, I get it. I do not have issues in allocating all the structures in one go.
But the problem is with MHICTRL_BASE/LIMIT. More below.

> > > Allocate a single giant DMA buffer for MHI control configurations and
> > > limit the configuration address space to that buffer.
> > > 
> > 
> > I don't think this could work for all devices. For instance, some ath11k devices
> > use a fixed reserved region in host address space for MHICTRL/BASE.
> 
> Why would we be unable to allocate all of the control structures in a single
> allocation out of that reserved region?  Is it larger than 4GB in size?
> 

I was confused by the fact that ath11k driver adds an offset of 0x1000000 to the
reserved region for the iova_start:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/ath/ath11k/mhi.c?h=v6.15-rc3#n331

So this means the firmware will only map the memory from reserved + 0x1000000
for MHI data structures. But even with current implementation, the MHI stack
doesn't know anything about the offset, because it just relies on
dma_alloc_coherent() API which will only honor the reserved region pointed by
'memory' property in the node (without the offset).

So I'm not sure how the firmware makes sure that the data structures lives
within the offset region. This is more of a question to ath11k folks.

But your series is not going to make it any worse. Sorry about the confusion.

- Mani
Manivannan Sadhasivam April 25, 2025, 5:40 a.m. UTC | #4
On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
> 
> MHI control configurations such as channel context, event context, command
> context and rings, are currently allocated individually. During MHI
> initialization MHI bus driver needs to configure the address space in
> which this control configuration resides. Since different component of the
> config is being allocated separately, only logical solution is to give the
> entire RAM address space, as they could be anywhere.
> 
> As per MHI specification the MHI control configuration address space should
> not be more them 4GB.

Could you please add the spec limitation here as you described in your reply?

> 
> Since the current implementation is in violation of MHI specification.

Here too.

> Allocate a single giant DMA buffer for MHI control configurations and
> limit the configuration address space to that buffer.
> 
> Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
> Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>

Other than that, patch looks good to me.

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

- Mani

> ---
>  drivers/bus/mhi/host/init.c     | 259 +++++++++++++++++---------------
>  drivers/bus/mhi/host/internal.h |   1 -
>  include/linux/mhi.h             |   7 +
>  3 files changed, 143 insertions(+), 124 deletions(-)
> 
> diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
> index 13e7a55f54ff..03afbfb8a96f 100644
> --- a/drivers/bus/mhi/host/init.c
> +++ b/drivers/bus/mhi/host/init.c
> @@ -159,21 +159,21 @@ static struct attribute *mhi_dev_attrs[] = {
>  };
>  ATTRIBUTE_GROUPS(mhi_dev);
>  
> -/* MHI protocol requires the transfer ring to be aligned with ring length */
> -static int mhi_alloc_aligned_ring(struct mhi_controller *mhi_cntrl,
> -				  struct mhi_ring *ring,
> -				  u64 len)
> +static void mhi_init_ring_base(struct mhi_ring *ring)
>  {
> -	ring->alloc_size = len + (len - 1);
> -	ring->pre_aligned = dma_alloc_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
> -					       &ring->dma_handle, GFP_KERNEL);
> -	if (!ring->pre_aligned)
> -		return -ENOMEM;
> -
> -	ring->iommu_base = (ring->dma_handle + (len - 1)) & ~(len - 1);
> +	ring->iommu_base = (ring->dma_handle + (ring->len - 1)) & ~(ring->len - 1);
>  	ring->base = ring->pre_aligned + (ring->iommu_base - ring->dma_handle);
> +}
>  
> -	return 0;
> +/* MHI protocol requires the transfer ring to be aligned with ring length */
> +static void mhi_init_ring(struct mhi_controller *mhi_cntrl, struct mhi_ring *ring,
> +			  size_t el_size, size_t offset)
> +{
> +	ring->el_size = el_size;
> +	ring->len = ring->el_size * ring->elements;
> +	ring->pre_aligned = mhi_cntrl->ctrl_config_base + offset;
> +	ring->dma_handle = mhi_cntrl->ctrl_config_dma + offset;
> +	mhi_init_ring_base(ring);
>  }
>  
>  void mhi_deinit_free_irq(struct mhi_controller *mhi_cntrl)
> @@ -265,40 +265,134 @@ void mhi_deinit_dev_ctxt(struct mhi_controller *mhi_cntrl)
>  	mhi_cmd = mhi_cntrl->mhi_cmd;
>  	for (i = 0; i < NR_OF_CMD_RINGS; i++, mhi_cmd++) {
>  		ring = &mhi_cmd->ring;
> -		dma_free_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
> -				  ring->pre_aligned, ring->dma_handle);
>  		ring->base = NULL;
>  		ring->iommu_base = 0;
>  	}
>  
> -	dma_free_coherent(mhi_cntrl->cntrl_dev,
> -			  sizeof(*mhi_ctxt->cmd_ctxt) * NR_OF_CMD_RINGS,
> -			  mhi_ctxt->cmd_ctxt, mhi_ctxt->cmd_ctxt_addr);
> -
>  	mhi_event = mhi_cntrl->mhi_event;
>  	for (i = 0; i < mhi_cntrl->total_ev_rings; i++, mhi_event++) {
>  		if (mhi_event->offload_ev)
>  			continue;
>  
>  		ring = &mhi_event->ring;
> -		dma_free_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
> -				  ring->pre_aligned, ring->dma_handle);
>  		ring->base = NULL;
>  		ring->iommu_base = 0;
>  	}
>  
> -	dma_free_coherent(mhi_cntrl->cntrl_dev, sizeof(*mhi_ctxt->er_ctxt) *
> -			  mhi_cntrl->total_ev_rings, mhi_ctxt->er_ctxt,
> -			  mhi_ctxt->er_ctxt_addr);
> -
> -	dma_free_coherent(mhi_cntrl->cntrl_dev, sizeof(*mhi_ctxt->chan_ctxt) *
> -			  mhi_cntrl->max_chan, mhi_ctxt->chan_ctxt,
> -			  mhi_ctxt->chan_ctxt_addr);
> -
> +	dma_free_coherent(mhi_cntrl->cntrl_dev, mhi_cntrl->ctrl_config_size,
> +			  mhi_cntrl->ctrl_config_base, mhi_cntrl->ctrl_config_dma);
>  	kfree(mhi_ctxt);
>  	mhi_cntrl->mhi_ctxt = NULL;
>  }
>  
> +/*
> + * Control Configuration Address Space Layout
> + *              +------------------------------------------+
> + *              |              Channel Context             |
> + *              +------------------------------------------+
> + *              |           Array of Channel Rings         |
> + *              +------------------------------------------+
> + *              |              Event Context               |
> + *              +------------------------------------------+
> + *              |           Array of Event Rings           |
> + *              +------------------------------------------+
> + *              |              Command Context             |
> + *              +------------------------------------------+
> + *              |           Array of Command Rings         |
> + *              +------------------------------------------+
> + */
> +static int mhi_alloc_control_space(struct mhi_controller *mhi_cntrl, struct mhi_ctxt *mhi_ctxt)
> +{
> +	size_t chan_ctxt_offset, cmd_ctxt_offset, er_ctxt_offset;
> +	size_t chan_ctxt_size, cmd_ctxt_size, er_ctxt_size;
> +	size_t el_size = sizeof(struct mhi_ring_element);
> +	u32 ev_rings = mhi_cntrl->total_ev_rings;
> +	struct mhi_event *mhi_event;
> +	size_t alloc_size, offset = 0;
> +	struct mhi_chan *mhi_chan;
> +	struct mhi_cmd *mhi_cmd;
> +	struct mhi_ring *ring;
> +	int i;
> +
> +	chan_ctxt_offset = offset;
> +	chan_ctxt_size = sizeof(*mhi_ctxt->chan_ctxt) * mhi_cntrl->max_chan;
> +	chan_ctxt_size = PAGE_ALIGN(chan_ctxt_size);
> +	offset += chan_ctxt_size;
> +	for (i = 0, mhi_chan = mhi_cntrl->mhi_chan; i < mhi_cntrl->max_chan; i++, mhi_chan++) {
> +		ring = &mhi_chan->tre_ring;
> +		if (!ring->elements)
> +			continue;
> +		alloc_size = ring->elements * el_size;
> +		alloc_size = PAGE_ALIGN(alloc_size << 1);
> +		/* Temporarily save offset here */
> +		ring->pre_aligned = (void *)offset;
> +		offset += alloc_size;
> +	}
> +
> +	er_ctxt_offset = offset;
> +	er_ctxt_size = sizeof(*mhi_ctxt->er_ctxt) * ev_rings;
> +	er_ctxt_size = PAGE_ALIGN(er_ctxt_size);
> +	offset += er_ctxt_size;
> +	for (i = 0, mhi_event = mhi_cntrl->mhi_event; i < ev_rings; i++, mhi_event++) {
> +		ring = &mhi_event->ring;
> +		alloc_size = ring->elements * el_size;
> +		alloc_size = PAGE_ALIGN(alloc_size << 1);
> +		/* Temporarily save offset here */
> +		ring->pre_aligned = (void *)offset;
> +		offset += alloc_size;
> +	}
> +
> +	cmd_ctxt_offset = offset;
> +	cmd_ctxt_size = sizeof(*mhi_ctxt->cmd_ctxt) * NR_OF_CMD_RINGS;
> +	cmd_ctxt_size = PAGE_ALIGN(cmd_ctxt_size);
> +	offset += cmd_ctxt_size;
> +	for (i = 0, mhi_cmd = mhi_cntrl->mhi_cmd; i < NR_OF_CMD_RINGS; i++, mhi_cmd++) {
> +		ring = &mhi_cmd->ring;
> +		alloc_size = CMD_EL_PER_RING * el_size;
> +		alloc_size = PAGE_ALIGN(alloc_size << 1);
> +		/* Temporarily save offset here */
> +		ring->pre_aligned = (void *)offset;
> +		offset += alloc_size;
> +	}
> +
> +	mhi_cntrl->ctrl_config_size = offset;
> +	mhi_cntrl->ctrl_config_base = dma_alloc_coherent(mhi_cntrl->cntrl_dev,
> +							 mhi_cntrl->ctrl_config_size,
> +							 &mhi_cntrl->ctrl_config_dma,
> +							 GFP_KERNEL);
> +	if (!mhi_cntrl->ctrl_config_base)
> +		return -ENOMEM;
> +
> +	/* Setup channel ctxt */
> +	mhi_ctxt->chan_ctxt = mhi_cntrl->ctrl_config_base + chan_ctxt_offset;
> +	mhi_ctxt->chan_ctxt_addr = mhi_cntrl->ctrl_config_dma + chan_ctxt_offset;
> +	for (i = 0, mhi_chan = mhi_cntrl->mhi_chan; i < mhi_cntrl->max_chan; i++, mhi_chan++) {
> +		ring = &mhi_chan->tre_ring;
> +		if (!ring->elements)
> +			continue;
> +		mhi_init_ring(mhi_cntrl, ring, el_size, (size_t)ring->pre_aligned);
> +	}
> +
> +	/* Setup event context */
> +	mhi_ctxt->er_ctxt = mhi_cntrl->ctrl_config_base + er_ctxt_offset;
> +	mhi_ctxt->er_ctxt_addr = mhi_cntrl->ctrl_config_dma + er_ctxt_offset;
> +	for (i = 0, mhi_event = mhi_cntrl->mhi_event; i < ev_rings; i++, mhi_event++) {
> +		ring = &mhi_event->ring;
> +		mhi_init_ring(mhi_cntrl, ring, el_size, (size_t)ring->pre_aligned);
> +	}
> +
> +	/* Setup cmd context */
> +	mhi_ctxt->cmd_ctxt = mhi_cntrl->ctrl_config_base + cmd_ctxt_offset;
> +	mhi_ctxt->cmd_ctxt_addr = mhi_cntrl->ctrl_config_dma + cmd_ctxt_offset;
> +	for (i = 0, mhi_cmd = mhi_cntrl->mhi_cmd; i < NR_OF_CMD_RINGS; i++, mhi_cmd++) {
> +		ring = &mhi_cmd->ring;
> +		ring->elements = CMD_EL_PER_RING;
> +		mhi_init_ring(mhi_cntrl, ring, el_size, (size_t)ring->pre_aligned);
> +	}
> +
> +	return 0;
> +}
> +
>  int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
>  {
>  	struct mhi_ctxt *mhi_ctxt;
> @@ -309,7 +403,7 @@ int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
>  	struct mhi_event *mhi_event;
>  	struct mhi_cmd *mhi_cmd;
>  	u32 tmp;
> -	int ret = -ENOMEM, i;
> +	int ret, i;
>  
>  	atomic_set(&mhi_cntrl->dev_wake, 0);
>  	atomic_set(&mhi_cntrl->pending_pkts, 0);
> @@ -318,14 +412,12 @@ int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
>  	if (!mhi_ctxt)
>  		return -ENOMEM;
>  
> -	/* Setup channel ctxt */
> -	mhi_ctxt->chan_ctxt = dma_alloc_coherent(mhi_cntrl->cntrl_dev,
> -						 sizeof(*mhi_ctxt->chan_ctxt) *
> -						 mhi_cntrl->max_chan,
> -						 &mhi_ctxt->chan_ctxt_addr,
> -						 GFP_KERNEL);
> -	if (!mhi_ctxt->chan_ctxt)
> -		goto error_alloc_chan_ctxt;
> +	/* Allocate control configuration */
> +	ret = mhi_alloc_control_space(mhi_cntrl, mhi_ctxt);
> +	if (ret) {
> +		kfree(mhi_ctxt);
> +		return ret;
> +	}
>  
>  	mhi_chan = mhi_cntrl->mhi_chan;
>  	chan_ctxt = mhi_ctxt->chan_ctxt;
> @@ -350,15 +442,6 @@ int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
>  		mhi_chan->tre_ring.db_addr = (void __iomem *)&chan_ctxt->wp;
>  	}
>  
> -	/* Setup event context */
> -	mhi_ctxt->er_ctxt = dma_alloc_coherent(mhi_cntrl->cntrl_dev,
> -					       sizeof(*mhi_ctxt->er_ctxt) *
> -					       mhi_cntrl->total_ev_rings,
> -					       &mhi_ctxt->er_ctxt_addr,
> -					       GFP_KERNEL);
> -	if (!mhi_ctxt->er_ctxt)
> -		goto error_alloc_er_ctxt;
> -
>  	er_ctxt = mhi_ctxt->er_ctxt;
>  	mhi_event = mhi_cntrl->mhi_event;
>  	for (i = 0; i < mhi_cntrl->total_ev_rings; i++, er_ctxt++,
> @@ -379,12 +462,6 @@ int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
>  		er_ctxt->msivec = cpu_to_le32(mhi_event->irq);
>  		mhi_event->db_cfg.db_mode = true;
>  
> -		ring->el_size = sizeof(struct mhi_ring_element);
> -		ring->len = ring->el_size * ring->elements;
> -		ret = mhi_alloc_aligned_ring(mhi_cntrl, ring, ring->len);
> -		if (ret)
> -			goto error_alloc_er;
> -
>  		/*
>  		 * If the read pointer equals to the write pointer, then the
>  		 * ring is empty
> @@ -396,28 +473,10 @@ int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
>  		ring->ctxt_wp = &er_ctxt->wp;
>  	}
>  
> -	/* Setup cmd context */
> -	ret = -ENOMEM;
> -	mhi_ctxt->cmd_ctxt = dma_alloc_coherent(mhi_cntrl->cntrl_dev,
> -						sizeof(*mhi_ctxt->cmd_ctxt) *
> -						NR_OF_CMD_RINGS,
> -						&mhi_ctxt->cmd_ctxt_addr,
> -						GFP_KERNEL);
> -	if (!mhi_ctxt->cmd_ctxt)
> -		goto error_alloc_er;
> -
>  	mhi_cmd = mhi_cntrl->mhi_cmd;
>  	cmd_ctxt = mhi_ctxt->cmd_ctxt;
>  	for (i = 0; i < NR_OF_CMD_RINGS; i++, mhi_cmd++, cmd_ctxt++) {
>  		struct mhi_ring *ring = &mhi_cmd->ring;
> -
> -		ring->el_size = sizeof(struct mhi_ring_element);
> -		ring->elements = CMD_EL_PER_RING;
> -		ring->len = ring->el_size * ring->elements;
> -		ret = mhi_alloc_aligned_ring(mhi_cntrl, ring, ring->len);
> -		if (ret)
> -			goto error_alloc_cmd;
> -
>  		ring->rp = ring->wp = ring->base;
>  		cmd_ctxt->rbase = cpu_to_le64(ring->iommu_base);
>  		cmd_ctxt->rp = cmd_ctxt->wp = cmd_ctxt->rbase;
> @@ -428,43 +487,6 @@ int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
>  	mhi_cntrl->mhi_ctxt = mhi_ctxt;
>  
>  	return 0;
> -
> -error_alloc_cmd:
> -	for (--i, --mhi_cmd; i >= 0; i--, mhi_cmd--) {
> -		struct mhi_ring *ring = &mhi_cmd->ring;
> -
> -		dma_free_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
> -				  ring->pre_aligned, ring->dma_handle);
> -	}
> -	dma_free_coherent(mhi_cntrl->cntrl_dev,
> -			  sizeof(*mhi_ctxt->cmd_ctxt) * NR_OF_CMD_RINGS,
> -			  mhi_ctxt->cmd_ctxt, mhi_ctxt->cmd_ctxt_addr);
> -	i = mhi_cntrl->total_ev_rings;
> -	mhi_event = mhi_cntrl->mhi_event + i;
> -
> -error_alloc_er:
> -	for (--i, --mhi_event; i >= 0; i--, mhi_event--) {
> -		struct mhi_ring *ring = &mhi_event->ring;
> -
> -		if (mhi_event->offload_ev)
> -			continue;
> -
> -		dma_free_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
> -				  ring->pre_aligned, ring->dma_handle);
> -	}
> -	dma_free_coherent(mhi_cntrl->cntrl_dev, sizeof(*mhi_ctxt->er_ctxt) *
> -			  mhi_cntrl->total_ev_rings, mhi_ctxt->er_ctxt,
> -			  mhi_ctxt->er_ctxt_addr);
> -
> -error_alloc_er_ctxt:
> -	dma_free_coherent(mhi_cntrl->cntrl_dev, sizeof(*mhi_ctxt->chan_ctxt) *
> -			  mhi_cntrl->max_chan, mhi_ctxt->chan_ctxt,
> -			  mhi_ctxt->chan_ctxt_addr);
> -
> -error_alloc_chan_ctxt:
> -	kfree(mhi_ctxt);
> -
> -	return ret;
>  }
>  
>  int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
> @@ -475,6 +497,7 @@ int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
>  	struct mhi_event *mhi_event;
>  	void __iomem *base = mhi_cntrl->regs;
>  	struct device *dev = &mhi_cntrl->mhi_dev->dev;
> +	dma_addr_t mhi_ctrl_limit = mhi_cntrl->ctrl_config_dma + mhi_cntrl->ctrl_config_size - 1;
>  	struct {
>  		u32 offset;
>  		u32 val;
> @@ -505,11 +528,11 @@ int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
>  		},
>  		{
>  			MHICTRLBASE_HIGHER,
> -			upper_32_bits(mhi_cntrl->iova_start),
> +			upper_32_bits(mhi_cntrl->ctrl_config_dma),
>  		},
>  		{
>  			MHICTRLBASE_LOWER,
> -			lower_32_bits(mhi_cntrl->iova_start),
> +			lower_32_bits(mhi_cntrl->ctrl_config_dma),
>  		},
>  		{
>  			MHIDATABASE_HIGHER,
> @@ -521,11 +544,11 @@ int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
>  		},
>  		{
>  			MHICTRLLIMIT_HIGHER,
> -			upper_32_bits(mhi_cntrl->iova_stop),
> +			upper_32_bits(mhi_ctrl_limit),
>  		},
>  		{
>  			MHICTRLLIMIT_LOWER,
> -			lower_32_bits(mhi_cntrl->iova_stop),
> +			lower_32_bits(mhi_ctrl_limit),
>  		},
>  		{
>  			MHIDATALIMIT_HIGHER,
> @@ -622,8 +645,6 @@ void mhi_deinit_chan_ctxt(struct mhi_controller *mhi_cntrl,
>  	if (!chan_ctxt->rbase) /* Already uninitialized */
>  		return;
>  
> -	dma_free_coherent(mhi_cntrl->cntrl_dev, tre_ring->alloc_size,
> -			  tre_ring->pre_aligned, tre_ring->dma_handle);
>  	vfree(buf_ring->base);
>  
>  	buf_ring->base = tre_ring->base = NULL;
> @@ -649,26 +670,18 @@ int mhi_init_chan_ctxt(struct mhi_controller *mhi_cntrl,
>  	struct mhi_ring *tre_ring;
>  	struct mhi_chan_ctxt *chan_ctxt;
>  	u32 tmp;
> -	int ret;
>  
>  	buf_ring = &mhi_chan->buf_ring;
>  	tre_ring = &mhi_chan->tre_ring;
> -	tre_ring->el_size = sizeof(struct mhi_ring_element);
> -	tre_ring->len = tre_ring->el_size * tre_ring->elements;
>  	chan_ctxt = &mhi_cntrl->mhi_ctxt->chan_ctxt[mhi_chan->chan];
> -	ret = mhi_alloc_aligned_ring(mhi_cntrl, tre_ring, tre_ring->len);
> -	if (ret)
> -		return -ENOMEM;
> +	mhi_init_ring_base(tre_ring);
>  
>  	buf_ring->el_size = sizeof(struct mhi_buf_info);
>  	buf_ring->len = buf_ring->el_size * buf_ring->elements;
>  	buf_ring->base = vzalloc(buf_ring->len);
>  
> -	if (!buf_ring->base) {
> -		dma_free_coherent(mhi_cntrl->cntrl_dev, tre_ring->alloc_size,
> -				  tre_ring->pre_aligned, tre_ring->dma_handle);
> +	if (!buf_ring->base)
>  		return -ENOMEM;
> -	}
>  
>  	tmp = le32_to_cpu(chan_ctxt->chcfg);
>  	tmp &= ~CHAN_CTX_CHSTATE_MASK;
> diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
> index ce566f7d2e92..bedaf248c49a 100644
> --- a/drivers/bus/mhi/host/internal.h
> +++ b/drivers/bus/mhi/host/internal.h
> @@ -215,7 +215,6 @@ struct mhi_ring {
>  	size_t el_size;
>  	size_t len;
>  	size_t elements;
> -	size_t alloc_size;
>  	void __iomem *db_addr;
>  };
>  
> diff --git a/include/linux/mhi.h b/include/linux/mhi.h
> index 059dc94d20bb..eb5e0181bea7 100644
> --- a/include/linux/mhi.h
> +++ b/include/linux/mhi.h
> @@ -325,6 +325,9 @@ struct mhi_controller_config {
>   * @mhi_event: MHI event ring configurations table
>   * @mhi_cmd: MHI command ring configurations table
>   * @mhi_ctxt: MHI device context, shared memory between host and device
> + * @ctrl_config_size: Total allocated size of control configuration buffer
> + * @ctrl_config_dma: Base bus address of control configuration buffer
> + * @ctrl_config_base: Base CPU address of control configuration buffer
>   * @pm_mutex: Mutex for suspend/resume operation
>   * @pm_lock: Lock for protecting MHI power management state
>   * @timeout_ms: Timeout in ms for state transitions
> @@ -403,6 +406,10 @@ struct mhi_controller {
>  	struct mhi_cmd *mhi_cmd;
>  	struct mhi_ctxt *mhi_ctxt;
>  
> +	size_t ctrl_config_size;
> +	dma_addr_t ctrl_config_dma;
> +	void *ctrl_config_base;
> +
>  	struct mutex pm_mutex;
>  	rwlock_t pm_lock;
>  	u32 timeout_ms;
> -- 
> 2.34.1
>
Jeff Hugo April 25, 2025, 4:10 p.m. UTC | #5
On 4/24/2025 11:37 PM, Manivannan Sadhasivam wrote:
> + ath11k list, Jeff and Baochen (for question regarding the use of reserved
> memory for allocating the MHI data structures in host)
> 
> On Tue, Apr 08, 2025 at 08:56:43AM -0600, Jeff Hugo wrote:
>> On 4/8/2025 1:01 AM, Manivannan Sadhasivam wrote:
>>> On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
>>>> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
>>>>
>>>> MHI control configurations such as channel context, event context, command
>>>> context and rings, are currently allocated individually. During MHI
>>>> initialization MHI bus driver needs to configure the address space in
>>>> which this control configuration resides. Since different component of the
>>>> config is being allocated separately, only logical solution is to give the
>>>> entire RAM address space, as they could be anywhere.
>>>>
>>>
>>> This is fine...
>>
>> We tripped over this when experimenting with an automotive market product.
>> The FW for that product had a rather strict interpretation of the spec,
>> which we confirmed with the spec owner.
>>
>> In the specific FW implementation, the device maps the entire MHI space of
>> shared structures in a single ATU entry. The device cannot map an entire
>> 64-bit address space, and it expects all of the shared structures in a
>> single compact range.
>>
>> This applies to the control structures, not the data buffers per the device
>> implementation.
>>
>> This restriction seems backed by the spec.  I can't find a reason why the
>> device is invalid, if limited.  I don't think this should break anything,
>> but more on that below.
>>
> 
> Yes, atleast I have heard about that limitation before.
> 
>>>
>>>> As per MHI specification the MHI control configuration address space should
>>>> not be more them 4GB.
>>>>
>>>
>>> Where exactly this limitation is specified in the spec? The spec supports full
>>> 64 bit address space for the MHI control/data structures. But due to the device
>>> DMA limitations, MHI controller drivers often use 32 bit address space. But
>>> that's not a spec limitation.
>>
>> Its not the clearest thing, sadly.
>>
>> Document 80-NF223-11 Rev AB "MHI spec v1.2" Section 6.2 "MHI Registers"
>> table 6-19 (page 106) -
>>
>> Describing MHICTRLLIMIT: "The most significant 32 bits of MHICTRLBASE and
>> MHICTRLLIMIT registers must be equal."
>>
>> This means we have a 4GB range (32-bit) to play with in a 64-bit address
>> space.  If the upper 32-bits of the 64-bit address for both the base and the
>> limit must be the same, then the range of addresses from the base to the
>> limit can only vary the lower 32-bits.
>>
>> Invalid:
>> BASE: 0x0
>> LIMIT: 0xffffffff_ffffffff
>>
>> Valid:
>> BASE: 0x0f_00000000
>> LIMIT: 0x0f_ffffffff
>>
> 
> Ah. Didn't spot this before, thanks for explaining!
> 
>>>> Since the current implementation is in violation of MHI specification.
>>
>> For example mhi_init_dev_ctxt()
>>
>> We allocate the chan_ctxt with dma_alloc_coherent() as an individual
>> allocation.  In the case of AIC100, the device can access the full 64-bit
>> address space, but the DMA engine is limited to a 32-bit transfer size.  The
>> chan_ctxt probably won't be larger than 4GB, so the size is rather
>> irrelevant.  Can be allocated anywhere.  Lets say that it gets put in the
>> lower 32-bit address space - 0x0_XXXXXXXX
>>
>> Then a little bit later we allocate er_ctxt with a different
>> dma_alloc_coherent() instance.  Being a unique allocation, it is not tied to
>> the chan_ctxt and can exist anywhere.  Lets assume that it gets put
>> somewhere in the non-lower 32-bits - 0x1000_XXXXXXXX
>>
>> Now we have a problem because we cannot describe a single range covering
>> both of these allocations via MHICTRLBASE/MHICTRLLIMIT where the upper
>> 32-bits of both registers is the same.
>>
> 
> Yes, I get it. I do not have issues in allocating all the structures in one go.
> But the problem is with MHICTRL_BASE/LIMIT. More below.
> 
>>>> Allocate a single giant DMA buffer for MHI control configurations and
>>>> limit the configuration address space to that buffer.
>>>>
>>>
>>> I don't think this could work for all devices. For instance, some ath11k devices
>>> use a fixed reserved region in host address space for MHICTRL/BASE.
>>
>> Why would we be unable to allocate all of the control structures in a single
>> allocation out of that reserved region?  Is it larger than 4GB in size?
>>
> 
> I was confused by the fact that ath11k driver adds an offset of 0x1000000 to the
> reserved region for the iova_start:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/ath/ath11k/mhi.c?h=v6.15-rc3#n331
> 
> So this means the firmware will only map the memory from reserved + 0x1000000
> for MHI data structures. But even with current implementation, the MHI stack
> doesn't know anything about the offset, because it just relies on
> dma_alloc_coherent() API which will only honor the reserved region pointed by
> 'memory' property in the node (without the offset).
> 
> So I'm not sure how the firmware makes sure that the data structures lives
> within the offset region. This is more of a question to ath11k folks.
> 
> But your series is not going to make it any worse. Sorry about the confusion.

No problem.  I appreciate the sanity check.

-Jeff
Jeff Hugo April 25, 2025, 4:11 p.m. UTC | #6
On 4/24/2025 11:40 PM, Manivannan Sadhasivam wrote:
> On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
>> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
>>
>> MHI control configurations such as channel context, event context, command
>> context and rings, are currently allocated individually. During MHI
>> initialization MHI bus driver needs to configure the address space in
>> which this control configuration resides. Since different component of the
>> config is being allocated separately, only logical solution is to give the
>> entire RAM address space, as they could be anywhere.
>>
>> As per MHI specification the MHI control configuration address space should
>> not be more them 4GB.
> 
> Could you please add the spec limitation here as you described in your reply?

Will do.

> 
>>
>> Since the current implementation is in violation of MHI specification.
> 
> Here too.
> 
>> Allocate a single giant DMA buffer for MHI control configurations and
>> limit the configuration address space to that buffer.
>>
>> Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
>> Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
> 
> Other than that, patch looks good to me.
> 
> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> 
> - Mani
Jeff Johnson April 25, 2025, 4:56 p.m. UTC | #7
On 4/24/2025 10:37 PM, Manivannan Sadhasivam wrote:
> I was confused by the fact that ath11k driver adds an offset of 0x1000000 to the
> reserved region for the iova_start:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/ath/ath11k/mhi.c?h=v6.15-rc3#n331

This predates my role as maintainer.
I was an internal reviewer, but in internal review (as well as the public v1)
the code was using dedicated 'qcom' DT entries and did not have an offset.

So I looked at the entire patch history:
https://lore.kernel.org/linux-wireless/?q=s%3A%22ath11k%3A+Use+reserved+host+DDR+addresses+from+DT+for+PCI+devices%22

The offset was added in v2.

There is nothing that describes why the offset was added, and nobody
questioned it.


> 
> So this means the firmware will only map the memory from reserved + 0x1000000
> for MHI data structures. But even with current implementation, the MHI stack
> doesn't know anything about the offset, because it just relies on
> dma_alloc_coherent() API which will only honor the reserved region pointed by
> 'memory' property in the node (without the offset).
> 
> So I'm not sure how the firmware makes sure that the data structures lives
> within the offset region. This is more of a question to ath11k folks.
> 
> But your series is not going to make it any worse. Sorry about the confusion.
> 
> - Mani
>
Baochen Qiang April 28, 2025, 1:37 a.m. UTC | #8
On 4/8/2025 10:56 PM, Jeff Hugo wrote:
> On 4/8/2025 1:01 AM, Manivannan Sadhasivam wrote:
>> On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
>>> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
>>>
>>> MHI control configurations such as channel context, event context, command
>>> context and rings, are currently allocated individually. During MHI
>>> initialization MHI bus driver needs to configure the address space in
>>> which this control configuration resides. Since different component of the
>>> config is being allocated separately, only logical solution is to give the
>>> entire RAM address space, as they could be anywhere.
>>>
>>
>> This is fine...
> 
> We tripped over this when experimenting with an automotive market product. The FW for that
> product had a rather strict interpretation of the spec, which we confirmed with the spec
> owner.
> 
> In the specific FW implementation, the device maps the entire MHI space of shared
> structures in a single ATU entry. The device cannot map an entire 64-bit address space,
> and it expects all of the shared structures in a single compact range.
> 
> This applies to the control structures, not the data buffers per the device implementation.
> 
> This restriction seems backed by the spec.  I can't find a reason why the device is
> invalid, if limited.  I don't think this should break anything, but more on that below.
> 
>>
>>> As per MHI specification the MHI control configuration address space should
>>> not be more them 4GB.
>>>
>>
>> Where exactly this limitation is specified in the spec? The spec supports full
>> 64 bit address space for the MHI control/data structures. But due to the device
>> DMA limitations, MHI controller drivers often use 32 bit address space. But
>> that's not a spec limitation.
> 
> Its not the clearest thing, sadly.
> 
> Document 80-NF223-11 Rev AB "MHI spec v1.2" Section 6.2 "MHI Registers" table 6-19 (page
> 106) -
> 
> Describing MHICTRLLIMIT: "The most significant 32 bits of MHICTRLBASE and MHICTRLLIMIT
> registers must be equal."
> 
> This means we have a 4GB range (32-bit) to play with in a 64-bit address space.  If the
> upper 32-bits of the 64-bit address for both the base and the limit must be the same, then
> the range of addresses from the base to the limit can only vary the lower 32-bits.
> 
> Invalid:
> BASE: 0x0
> LIMIT: 0xffffffff_ffffffff
> 
> Valid:
> BASE: 0x0f_00000000
> LIMIT: 0x0f_ffffffff

as an MHI controller driver, ath11k is doing

		mhi_ctrl->iova_start = 0;
		mhi_ctrl->iova_stop = ab_pci->dma_mask;

where ab_pci->dma_mask is defined as

		ab_pci->dma_mask = DMA_BIT_MASK(36)

clearly this does not align with the 32bit requirement above, however there is no issue hit.

What happens if the most significant 32 bits of MHICTRLBASE and MHICTRLLIMIT is not equal?
SYS_ERR or RDDM?


> 
>>> Since the current implementation is in violation of MHI specification.
> 
> For example mhi_init_dev_ctxt()
> 
> We allocate the chan_ctxt with dma_alloc_coherent() as an individual allocation.  In the
> case of AIC100, the device can access the full 64-bit address space, but the DMA engine is
> limited to a 32-bit transfer size.  The chan_ctxt probably won't be larger than 4GB, so
> the size is rather irrelevant.  Can be allocated anywhere.  Lets say that it gets put in
> the lower 32-bit address space - 0x0_XXXXXXXX
> 
> Then a little bit later we allocate er_ctxt with a different dma_alloc_coherent()
> instance.  Being a unique allocation, it is not tied to the chan_ctxt and can exist
> anywhere.  Lets assume that it gets put somewhere in the non-lower 32-bits - 0x1000_XXXXXXXX
> 
> Now we have a problem because we cannot describe a single range covering both of these
> allocations via MHICTRLBASE/MHICTRLLIMIT where the upper 32-bits of both registers is the
> same.
> 
>>> Allocate a single giant DMA buffer for MHI control configurations and
>>> limit the configuration address space to that buffer.
>>>
>>
>> I don't think this could work for all devices. For instance, some ath11k devices
>> use a fixed reserved region in host address space for MHICTRL/BASE.
> 
> Why would we be unable to allocate all of the control structures in a single allocation
> out of that reserved region?  Is it larger than 4GB in size?
> 
> -Jeff
> 
>
Jeff Hugo April 28, 2025, 3:09 p.m. UTC | #9
On 4/27/2025 7:37 PM, Baochen Qiang wrote:
> 
> 
> On 4/8/2025 10:56 PM, Jeff Hugo wrote:
>> On 4/8/2025 1:01 AM, Manivannan Sadhasivam wrote:
>>> On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
>>>> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
>>>>
>>>> MHI control configurations such as channel context, event context, command
>>>> context and rings, are currently allocated individually. During MHI
>>>> initialization MHI bus driver needs to configure the address space in
>>>> which this control configuration resides. Since different component of the
>>>> config is being allocated separately, only logical solution is to give the
>>>> entire RAM address space, as they could be anywhere.
>>>>
>>>
>>> This is fine...
>>
>> We tripped over this when experimenting with an automotive market product. The FW for that
>> product had a rather strict interpretation of the spec, which we confirmed with the spec
>> owner.
>>
>> In the specific FW implementation, the device maps the entire MHI space of shared
>> structures in a single ATU entry. The device cannot map an entire 64-bit address space,
>> and it expects all of the shared structures in a single compact range.
>>
>> This applies to the control structures, not the data buffers per the device implementation.
>>
>> This restriction seems backed by the spec.  I can't find a reason why the device is
>> invalid, if limited.  I don't think this should break anything, but more on that below.
>>
>>>
>>>> As per MHI specification the MHI control configuration address space should
>>>> not be more them 4GB.
>>>>
>>>
>>> Where exactly this limitation is specified in the spec? The spec supports full
>>> 64 bit address space for the MHI control/data structures. But due to the device
>>> DMA limitations, MHI controller drivers often use 32 bit address space. But
>>> that's not a spec limitation.
>>
>> Its not the clearest thing, sadly.
>>
>> Document 80-NF223-11 Rev AB "MHI spec v1.2" Section 6.2 "MHI Registers" table 6-19 (page
>> 106) -
>>
>> Describing MHICTRLLIMIT: "The most significant 32 bits of MHICTRLBASE and MHICTRLLIMIT
>> registers must be equal."
>>
>> This means we have a 4GB range (32-bit) to play with in a 64-bit address space.  If the
>> upper 32-bits of the 64-bit address for both the base and the limit must be the same, then
>> the range of addresses from the base to the limit can only vary the lower 32-bits.
>>
>> Invalid:
>> BASE: 0x0
>> LIMIT: 0xffffffff_ffffffff
>>
>> Valid:
>> BASE: 0x0f_00000000
>> LIMIT: 0x0f_ffffffff
> 
> as an MHI controller driver, ath11k is doing
> 
> 		mhi_ctrl->iova_start = 0;
> 		mhi_ctrl->iova_stop = ab_pci->dma_mask;
> 
> where ab_pci->dma_mask is defined as
> 
> 		ab_pci->dma_mask = DMA_BIT_MASK(36)
> 
> clearly this does not align with the 32bit requirement above, however there is no issue hit.
> 
> What happens if the most significant 32 bits of MHICTRLBASE and MHICTRLLIMIT is not equal?
> SYS_ERR or RDDM?

On the device we experienced this with, we would be able to transition 
to M0, but the later the device would have an internal fault and crash.

-Jeff
Manivannan Sadhasivam April 28, 2025, 4:04 p.m. UTC | #10
On Fri, Apr 25, 2025 at 09:56:05AM -0700, Jeff Johnson wrote:
> On 4/24/2025 10:37 PM, Manivannan Sadhasivam wrote:
> > I was confused by the fact that ath11k driver adds an offset of 0x1000000 to the
> > reserved region for the iova_start:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/ath/ath11k/mhi.c?h=v6.15-rc3#n331
> 
> This predates my role as maintainer.
> I was an internal reviewer, but in internal review (as well as the public v1)
> the code was using dedicated 'qcom' DT entries and did not have an offset.
> 
> So I looked at the entire patch history:
> https://lore.kernel.org/linux-wireless/?q=s%3A%22ath11k%3A+Use+reserved+host+DDR+addresses+from+DT+for+PCI+devices%22
> 
> The offset was added in v2.
> 
> There is nothing that describes why the offset was added, and nobody
> questioned it.
> 

Hmm. I'm not sure how this works in the firmware. If firmware uses iova_start as
the start address for the ATU mapping of the MHI data structures, then it won't
be able to access the structures allocated prior to this offset (which would be
very likely).

So the fact that the firmware still works with this offset means it doesn't use
iova at all. But it would be good to remove this offset from the code anyway.

- Mani
Baochen Qiang April 29, 2025, 1:44 a.m. UTC | #11
On 4/28/2025 11:09 PM, Jeff Hugo wrote:
> On 4/27/2025 7:37 PM, Baochen Qiang wrote:
>>
>>
>> On 4/8/2025 10:56 PM, Jeff Hugo wrote:
>>> On 4/8/2025 1:01 AM, Manivannan Sadhasivam wrote:
>>>> On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
>>>>> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
>>>>>
>>>>> MHI control configurations such as channel context, event context, command
>>>>> context and rings, are currently allocated individually. During MHI
>>>>> initialization MHI bus driver needs to configure the address space in
>>>>> which this control configuration resides. Since different component of the
>>>>> config is being allocated separately, only logical solution is to give the
>>>>> entire RAM address space, as they could be anywhere.
>>>>>
>>>>
>>>> This is fine...
>>>
>>> We tripped over this when experimenting with an automotive market product. The FW for that
>>> product had a rather strict interpretation of the spec, which we confirmed with the spec
>>> owner.
>>>
>>> In the specific FW implementation, the device maps the entire MHI space of shared
>>> structures in a single ATU entry. The device cannot map an entire 64-bit address space,
>>> and it expects all of the shared structures in a single compact range.
>>>
>>> This applies to the control structures, not the data buffers per the device
>>> implementation.
>>>
>>> This restriction seems backed by the spec.  I can't find a reason why the device is
>>> invalid, if limited.  I don't think this should break anything, but more on that below.
>>>
>>>>
>>>>> As per MHI specification the MHI control configuration address space should
>>>>> not be more them 4GB.
>>>>>
>>>>
>>>> Where exactly this limitation is specified in the spec? The spec supports full
>>>> 64 bit address space for the MHI control/data structures. But due to the device
>>>> DMA limitations, MHI controller drivers often use 32 bit address space. But
>>>> that's not a spec limitation.
>>>
>>> Its not the clearest thing, sadly.
>>>
>>> Document 80-NF223-11 Rev AB "MHI spec v1.2" Section 6.2 "MHI Registers" table 6-19 (page
>>> 106) -
>>>
>>> Describing MHICTRLLIMIT: "The most significant 32 bits of MHICTRLBASE and MHICTRLLIMIT
>>> registers must be equal."
>>>
>>> This means we have a 4GB range (32-bit) to play with in a 64-bit address space.  If the
>>> upper 32-bits of the 64-bit address for both the base and the limit must be the same, then
>>> the range of addresses from the base to the limit can only vary the lower 32-bits.
>>>
>>> Invalid:
>>> BASE: 0x0
>>> LIMIT: 0xffffffff_ffffffff
>>>
>>> Valid:
>>> BASE: 0x0f_00000000
>>> LIMIT: 0x0f_ffffffff
>>
>> as an MHI controller driver, ath11k is doing
>>
>>         mhi_ctrl->iova_start = 0;
>>         mhi_ctrl->iova_stop = ab_pci->dma_mask;
>>
>> where ab_pci->dma_mask is defined as
>>
>>         ab_pci->dma_mask = DMA_BIT_MASK(36)
>>
>> clearly this does not align with the 32bit requirement above, however there is no issue
>> hit.
>>
>> What happens if the most significant 32 bits of MHICTRLBASE and MHICTRLLIMIT is not equal?
>> SYS_ERR or RDDM?
> 
> On the device we experienced this with, we would be able to transition to M0, but the
> later the device would have an internal fault and crash.

so with this patch, MHICTRLBASE/MHICTRLLIMIT are untied from iova_start/iova_stop, meaning
current ath11k settings does not break the requirement, hence no ath11k change needed, right?

> 
> -Jeff
Jeff Hugo May 12, 2025, 6:33 p.m. UTC | #12
On 4/28/2025 7:44 PM, Baochen Qiang wrote:
> 
> 
> On 4/28/2025 11:09 PM, Jeff Hugo wrote:
>> On 4/27/2025 7:37 PM, Baochen Qiang wrote:
>>>
>>>
>>> On 4/8/2025 10:56 PM, Jeff Hugo wrote:
>>>> On 4/8/2025 1:01 AM, Manivannan Sadhasivam wrote:
>>>>> On Fri, Mar 28, 2025 at 10:59:13AM -0600, Jeff Hugo wrote:
>>>>>> From: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
>>>>>>
>>>>>> MHI control configurations such as channel context, event context, command
>>>>>> context and rings, are currently allocated individually. During MHI
>>>>>> initialization MHI bus driver needs to configure the address space in
>>>>>> which this control configuration resides. Since different component of the
>>>>>> config is being allocated separately, only logical solution is to give the
>>>>>> entire RAM address space, as they could be anywhere.
>>>>>>
>>>>>
>>>>> This is fine...
>>>>
>>>> We tripped over this when experimenting with an automotive market product. The FW for that
>>>> product had a rather strict interpretation of the spec, which we confirmed with the spec
>>>> owner.
>>>>
>>>> In the specific FW implementation, the device maps the entire MHI space of shared
>>>> structures in a single ATU entry. The device cannot map an entire 64-bit address space,
>>>> and it expects all of the shared structures in a single compact range.
>>>>
>>>> This applies to the control structures, not the data buffers per the device
>>>> implementation.
>>>>
>>>> This restriction seems backed by the spec.  I can't find a reason why the device is
>>>> invalid, if limited.  I don't think this should break anything, but more on that below.
>>>>
>>>>>
>>>>>> As per MHI specification the MHI control configuration address space should
>>>>>> not be more them 4GB.
>>>>>>
>>>>>
>>>>> Where exactly this limitation is specified in the spec? The spec supports full
>>>>> 64 bit address space for the MHI control/data structures. But due to the device
>>>>> DMA limitations, MHI controller drivers often use 32 bit address space. But
>>>>> that's not a spec limitation.
>>>>
>>>> Its not the clearest thing, sadly.
>>>>
>>>> Document 80-NF223-11 Rev AB "MHI spec v1.2" Section 6.2 "MHI Registers" table 6-19 (page
>>>> 106) -
>>>>
>>>> Describing MHICTRLLIMIT: "The most significant 32 bits of MHICTRLBASE and MHICTRLLIMIT
>>>> registers must be equal."
>>>>
>>>> This means we have a 4GB range (32-bit) to play with in a 64-bit address space.  If the
>>>> upper 32-bits of the 64-bit address for both the base and the limit must be the same, then
>>>> the range of addresses from the base to the limit can only vary the lower 32-bits.
>>>>
>>>> Invalid:
>>>> BASE: 0x0
>>>> LIMIT: 0xffffffff_ffffffff
>>>>
>>>> Valid:
>>>> BASE: 0x0f_00000000
>>>> LIMIT: 0x0f_ffffffff
>>>
>>> as an MHI controller driver, ath11k is doing
>>>
>>>          mhi_ctrl->iova_start = 0;
>>>          mhi_ctrl->iova_stop = ab_pci->dma_mask;
>>>
>>> where ab_pci->dma_mask is defined as
>>>
>>>          ab_pci->dma_mask = DMA_BIT_MASK(36)
>>>
>>> clearly this does not align with the 32bit requirement above, however there is no issue
>>> hit.
>>>
>>> What happens if the most significant 32 bits of MHICTRLBASE and MHICTRLLIMIT is not equal?
>>> SYS_ERR or RDDM?
>>
>> On the device we experienced this with, we would be able to transition to M0, but the
>> later the device would have an internal fault and crash.
> 
> so with this patch, MHICTRLBASE/MHICTRLLIMIT are untied from iova_start/iova_stop, meaning
> current ath11k settings does not break the requirement, hence no ath11k change needed, right?

I think so.

-Jeff
diff mbox series

Patch

diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c
index 13e7a55f54ff..03afbfb8a96f 100644
--- a/drivers/bus/mhi/host/init.c
+++ b/drivers/bus/mhi/host/init.c
@@ -159,21 +159,21 @@  static struct attribute *mhi_dev_attrs[] = {
 };
 ATTRIBUTE_GROUPS(mhi_dev);
 
-/* MHI protocol requires the transfer ring to be aligned with ring length */
-static int mhi_alloc_aligned_ring(struct mhi_controller *mhi_cntrl,
-				  struct mhi_ring *ring,
-				  u64 len)
+static void mhi_init_ring_base(struct mhi_ring *ring)
 {
-	ring->alloc_size = len + (len - 1);
-	ring->pre_aligned = dma_alloc_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
-					       &ring->dma_handle, GFP_KERNEL);
-	if (!ring->pre_aligned)
-		return -ENOMEM;
-
-	ring->iommu_base = (ring->dma_handle + (len - 1)) & ~(len - 1);
+	ring->iommu_base = (ring->dma_handle + (ring->len - 1)) & ~(ring->len - 1);
 	ring->base = ring->pre_aligned + (ring->iommu_base - ring->dma_handle);
+}
 
-	return 0;
+/* MHI protocol requires the transfer ring to be aligned with ring length */
+static void mhi_init_ring(struct mhi_controller *mhi_cntrl, struct mhi_ring *ring,
+			  size_t el_size, size_t offset)
+{
+	ring->el_size = el_size;
+	ring->len = ring->el_size * ring->elements;
+	ring->pre_aligned = mhi_cntrl->ctrl_config_base + offset;
+	ring->dma_handle = mhi_cntrl->ctrl_config_dma + offset;
+	mhi_init_ring_base(ring);
 }
 
 void mhi_deinit_free_irq(struct mhi_controller *mhi_cntrl)
@@ -265,40 +265,134 @@  void mhi_deinit_dev_ctxt(struct mhi_controller *mhi_cntrl)
 	mhi_cmd = mhi_cntrl->mhi_cmd;
 	for (i = 0; i < NR_OF_CMD_RINGS; i++, mhi_cmd++) {
 		ring = &mhi_cmd->ring;
-		dma_free_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
-				  ring->pre_aligned, ring->dma_handle);
 		ring->base = NULL;
 		ring->iommu_base = 0;
 	}
 
-	dma_free_coherent(mhi_cntrl->cntrl_dev,
-			  sizeof(*mhi_ctxt->cmd_ctxt) * NR_OF_CMD_RINGS,
-			  mhi_ctxt->cmd_ctxt, mhi_ctxt->cmd_ctxt_addr);
-
 	mhi_event = mhi_cntrl->mhi_event;
 	for (i = 0; i < mhi_cntrl->total_ev_rings; i++, mhi_event++) {
 		if (mhi_event->offload_ev)
 			continue;
 
 		ring = &mhi_event->ring;
-		dma_free_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
-				  ring->pre_aligned, ring->dma_handle);
 		ring->base = NULL;
 		ring->iommu_base = 0;
 	}
 
-	dma_free_coherent(mhi_cntrl->cntrl_dev, sizeof(*mhi_ctxt->er_ctxt) *
-			  mhi_cntrl->total_ev_rings, mhi_ctxt->er_ctxt,
-			  mhi_ctxt->er_ctxt_addr);
-
-	dma_free_coherent(mhi_cntrl->cntrl_dev, sizeof(*mhi_ctxt->chan_ctxt) *
-			  mhi_cntrl->max_chan, mhi_ctxt->chan_ctxt,
-			  mhi_ctxt->chan_ctxt_addr);
-
+	dma_free_coherent(mhi_cntrl->cntrl_dev, mhi_cntrl->ctrl_config_size,
+			  mhi_cntrl->ctrl_config_base, mhi_cntrl->ctrl_config_dma);
 	kfree(mhi_ctxt);
 	mhi_cntrl->mhi_ctxt = NULL;
 }
 
+/*
+ * Control Configuration Address Space Layout
+ *              +------------------------------------------+
+ *              |              Channel Context             |
+ *              +------------------------------------------+
+ *              |           Array of Channel Rings         |
+ *              +------------------------------------------+
+ *              |              Event Context               |
+ *              +------------------------------------------+
+ *              |           Array of Event Rings           |
+ *              +------------------------------------------+
+ *              |              Command Context             |
+ *              +------------------------------------------+
+ *              |           Array of Command Rings         |
+ *              +------------------------------------------+
+ */
+static int mhi_alloc_control_space(struct mhi_controller *mhi_cntrl, struct mhi_ctxt *mhi_ctxt)
+{
+	size_t chan_ctxt_offset, cmd_ctxt_offset, er_ctxt_offset;
+	size_t chan_ctxt_size, cmd_ctxt_size, er_ctxt_size;
+	size_t el_size = sizeof(struct mhi_ring_element);
+	u32 ev_rings = mhi_cntrl->total_ev_rings;
+	struct mhi_event *mhi_event;
+	size_t alloc_size, offset = 0;
+	struct mhi_chan *mhi_chan;
+	struct mhi_cmd *mhi_cmd;
+	struct mhi_ring *ring;
+	int i;
+
+	chan_ctxt_offset = offset;
+	chan_ctxt_size = sizeof(*mhi_ctxt->chan_ctxt) * mhi_cntrl->max_chan;
+	chan_ctxt_size = PAGE_ALIGN(chan_ctxt_size);
+	offset += chan_ctxt_size;
+	for (i = 0, mhi_chan = mhi_cntrl->mhi_chan; i < mhi_cntrl->max_chan; i++, mhi_chan++) {
+		ring = &mhi_chan->tre_ring;
+		if (!ring->elements)
+			continue;
+		alloc_size = ring->elements * el_size;
+		alloc_size = PAGE_ALIGN(alloc_size << 1);
+		/* Temporarily save offset here */
+		ring->pre_aligned = (void *)offset;
+		offset += alloc_size;
+	}
+
+	er_ctxt_offset = offset;
+	er_ctxt_size = sizeof(*mhi_ctxt->er_ctxt) * ev_rings;
+	er_ctxt_size = PAGE_ALIGN(er_ctxt_size);
+	offset += er_ctxt_size;
+	for (i = 0, mhi_event = mhi_cntrl->mhi_event; i < ev_rings; i++, mhi_event++) {
+		ring = &mhi_event->ring;
+		alloc_size = ring->elements * el_size;
+		alloc_size = PAGE_ALIGN(alloc_size << 1);
+		/* Temporarily save offset here */
+		ring->pre_aligned = (void *)offset;
+		offset += alloc_size;
+	}
+
+	cmd_ctxt_offset = offset;
+	cmd_ctxt_size = sizeof(*mhi_ctxt->cmd_ctxt) * NR_OF_CMD_RINGS;
+	cmd_ctxt_size = PAGE_ALIGN(cmd_ctxt_size);
+	offset += cmd_ctxt_size;
+	for (i = 0, mhi_cmd = mhi_cntrl->mhi_cmd; i < NR_OF_CMD_RINGS; i++, mhi_cmd++) {
+		ring = &mhi_cmd->ring;
+		alloc_size = CMD_EL_PER_RING * el_size;
+		alloc_size = PAGE_ALIGN(alloc_size << 1);
+		/* Temporarily save offset here */
+		ring->pre_aligned = (void *)offset;
+		offset += alloc_size;
+	}
+
+	mhi_cntrl->ctrl_config_size = offset;
+	mhi_cntrl->ctrl_config_base = dma_alloc_coherent(mhi_cntrl->cntrl_dev,
+							 mhi_cntrl->ctrl_config_size,
+							 &mhi_cntrl->ctrl_config_dma,
+							 GFP_KERNEL);
+	if (!mhi_cntrl->ctrl_config_base)
+		return -ENOMEM;
+
+	/* Setup channel ctxt */
+	mhi_ctxt->chan_ctxt = mhi_cntrl->ctrl_config_base + chan_ctxt_offset;
+	mhi_ctxt->chan_ctxt_addr = mhi_cntrl->ctrl_config_dma + chan_ctxt_offset;
+	for (i = 0, mhi_chan = mhi_cntrl->mhi_chan; i < mhi_cntrl->max_chan; i++, mhi_chan++) {
+		ring = &mhi_chan->tre_ring;
+		if (!ring->elements)
+			continue;
+		mhi_init_ring(mhi_cntrl, ring, el_size, (size_t)ring->pre_aligned);
+	}
+
+	/* Setup event context */
+	mhi_ctxt->er_ctxt = mhi_cntrl->ctrl_config_base + er_ctxt_offset;
+	mhi_ctxt->er_ctxt_addr = mhi_cntrl->ctrl_config_dma + er_ctxt_offset;
+	for (i = 0, mhi_event = mhi_cntrl->mhi_event; i < ev_rings; i++, mhi_event++) {
+		ring = &mhi_event->ring;
+		mhi_init_ring(mhi_cntrl, ring, el_size, (size_t)ring->pre_aligned);
+	}
+
+	/* Setup cmd context */
+	mhi_ctxt->cmd_ctxt = mhi_cntrl->ctrl_config_base + cmd_ctxt_offset;
+	mhi_ctxt->cmd_ctxt_addr = mhi_cntrl->ctrl_config_dma + cmd_ctxt_offset;
+	for (i = 0, mhi_cmd = mhi_cntrl->mhi_cmd; i < NR_OF_CMD_RINGS; i++, mhi_cmd++) {
+		ring = &mhi_cmd->ring;
+		ring->elements = CMD_EL_PER_RING;
+		mhi_init_ring(mhi_cntrl, ring, el_size, (size_t)ring->pre_aligned);
+	}
+
+	return 0;
+}
+
 int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
 {
 	struct mhi_ctxt *mhi_ctxt;
@@ -309,7 +403,7 @@  int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
 	struct mhi_event *mhi_event;
 	struct mhi_cmd *mhi_cmd;
 	u32 tmp;
-	int ret = -ENOMEM, i;
+	int ret, i;
 
 	atomic_set(&mhi_cntrl->dev_wake, 0);
 	atomic_set(&mhi_cntrl->pending_pkts, 0);
@@ -318,14 +412,12 @@  int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
 	if (!mhi_ctxt)
 		return -ENOMEM;
 
-	/* Setup channel ctxt */
-	mhi_ctxt->chan_ctxt = dma_alloc_coherent(mhi_cntrl->cntrl_dev,
-						 sizeof(*mhi_ctxt->chan_ctxt) *
-						 mhi_cntrl->max_chan,
-						 &mhi_ctxt->chan_ctxt_addr,
-						 GFP_KERNEL);
-	if (!mhi_ctxt->chan_ctxt)
-		goto error_alloc_chan_ctxt;
+	/* Allocate control configuration */
+	ret = mhi_alloc_control_space(mhi_cntrl, mhi_ctxt);
+	if (ret) {
+		kfree(mhi_ctxt);
+		return ret;
+	}
 
 	mhi_chan = mhi_cntrl->mhi_chan;
 	chan_ctxt = mhi_ctxt->chan_ctxt;
@@ -350,15 +442,6 @@  int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
 		mhi_chan->tre_ring.db_addr = (void __iomem *)&chan_ctxt->wp;
 	}
 
-	/* Setup event context */
-	mhi_ctxt->er_ctxt = dma_alloc_coherent(mhi_cntrl->cntrl_dev,
-					       sizeof(*mhi_ctxt->er_ctxt) *
-					       mhi_cntrl->total_ev_rings,
-					       &mhi_ctxt->er_ctxt_addr,
-					       GFP_KERNEL);
-	if (!mhi_ctxt->er_ctxt)
-		goto error_alloc_er_ctxt;
-
 	er_ctxt = mhi_ctxt->er_ctxt;
 	mhi_event = mhi_cntrl->mhi_event;
 	for (i = 0; i < mhi_cntrl->total_ev_rings; i++, er_ctxt++,
@@ -379,12 +462,6 @@  int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
 		er_ctxt->msivec = cpu_to_le32(mhi_event->irq);
 		mhi_event->db_cfg.db_mode = true;
 
-		ring->el_size = sizeof(struct mhi_ring_element);
-		ring->len = ring->el_size * ring->elements;
-		ret = mhi_alloc_aligned_ring(mhi_cntrl, ring, ring->len);
-		if (ret)
-			goto error_alloc_er;
-
 		/*
 		 * If the read pointer equals to the write pointer, then the
 		 * ring is empty
@@ -396,28 +473,10 @@  int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
 		ring->ctxt_wp = &er_ctxt->wp;
 	}
 
-	/* Setup cmd context */
-	ret = -ENOMEM;
-	mhi_ctxt->cmd_ctxt = dma_alloc_coherent(mhi_cntrl->cntrl_dev,
-						sizeof(*mhi_ctxt->cmd_ctxt) *
-						NR_OF_CMD_RINGS,
-						&mhi_ctxt->cmd_ctxt_addr,
-						GFP_KERNEL);
-	if (!mhi_ctxt->cmd_ctxt)
-		goto error_alloc_er;
-
 	mhi_cmd = mhi_cntrl->mhi_cmd;
 	cmd_ctxt = mhi_ctxt->cmd_ctxt;
 	for (i = 0; i < NR_OF_CMD_RINGS; i++, mhi_cmd++, cmd_ctxt++) {
 		struct mhi_ring *ring = &mhi_cmd->ring;
-
-		ring->el_size = sizeof(struct mhi_ring_element);
-		ring->elements = CMD_EL_PER_RING;
-		ring->len = ring->el_size * ring->elements;
-		ret = mhi_alloc_aligned_ring(mhi_cntrl, ring, ring->len);
-		if (ret)
-			goto error_alloc_cmd;
-
 		ring->rp = ring->wp = ring->base;
 		cmd_ctxt->rbase = cpu_to_le64(ring->iommu_base);
 		cmd_ctxt->rp = cmd_ctxt->wp = cmd_ctxt->rbase;
@@ -428,43 +487,6 @@  int mhi_init_dev_ctxt(struct mhi_controller *mhi_cntrl)
 	mhi_cntrl->mhi_ctxt = mhi_ctxt;
 
 	return 0;
-
-error_alloc_cmd:
-	for (--i, --mhi_cmd; i >= 0; i--, mhi_cmd--) {
-		struct mhi_ring *ring = &mhi_cmd->ring;
-
-		dma_free_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
-				  ring->pre_aligned, ring->dma_handle);
-	}
-	dma_free_coherent(mhi_cntrl->cntrl_dev,
-			  sizeof(*mhi_ctxt->cmd_ctxt) * NR_OF_CMD_RINGS,
-			  mhi_ctxt->cmd_ctxt, mhi_ctxt->cmd_ctxt_addr);
-	i = mhi_cntrl->total_ev_rings;
-	mhi_event = mhi_cntrl->mhi_event + i;
-
-error_alloc_er:
-	for (--i, --mhi_event; i >= 0; i--, mhi_event--) {
-		struct mhi_ring *ring = &mhi_event->ring;
-
-		if (mhi_event->offload_ev)
-			continue;
-
-		dma_free_coherent(mhi_cntrl->cntrl_dev, ring->alloc_size,
-				  ring->pre_aligned, ring->dma_handle);
-	}
-	dma_free_coherent(mhi_cntrl->cntrl_dev, sizeof(*mhi_ctxt->er_ctxt) *
-			  mhi_cntrl->total_ev_rings, mhi_ctxt->er_ctxt,
-			  mhi_ctxt->er_ctxt_addr);
-
-error_alloc_er_ctxt:
-	dma_free_coherent(mhi_cntrl->cntrl_dev, sizeof(*mhi_ctxt->chan_ctxt) *
-			  mhi_cntrl->max_chan, mhi_ctxt->chan_ctxt,
-			  mhi_ctxt->chan_ctxt_addr);
-
-error_alloc_chan_ctxt:
-	kfree(mhi_ctxt);
-
-	return ret;
 }
 
 int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
@@ -475,6 +497,7 @@  int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
 	struct mhi_event *mhi_event;
 	void __iomem *base = mhi_cntrl->regs;
 	struct device *dev = &mhi_cntrl->mhi_dev->dev;
+	dma_addr_t mhi_ctrl_limit = mhi_cntrl->ctrl_config_dma + mhi_cntrl->ctrl_config_size - 1;
 	struct {
 		u32 offset;
 		u32 val;
@@ -505,11 +528,11 @@  int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
 		},
 		{
 			MHICTRLBASE_HIGHER,
-			upper_32_bits(mhi_cntrl->iova_start),
+			upper_32_bits(mhi_cntrl->ctrl_config_dma),
 		},
 		{
 			MHICTRLBASE_LOWER,
-			lower_32_bits(mhi_cntrl->iova_start),
+			lower_32_bits(mhi_cntrl->ctrl_config_dma),
 		},
 		{
 			MHIDATABASE_HIGHER,
@@ -521,11 +544,11 @@  int mhi_init_mmio(struct mhi_controller *mhi_cntrl)
 		},
 		{
 			MHICTRLLIMIT_HIGHER,
-			upper_32_bits(mhi_cntrl->iova_stop),
+			upper_32_bits(mhi_ctrl_limit),
 		},
 		{
 			MHICTRLLIMIT_LOWER,
-			lower_32_bits(mhi_cntrl->iova_stop),
+			lower_32_bits(mhi_ctrl_limit),
 		},
 		{
 			MHIDATALIMIT_HIGHER,
@@ -622,8 +645,6 @@  void mhi_deinit_chan_ctxt(struct mhi_controller *mhi_cntrl,
 	if (!chan_ctxt->rbase) /* Already uninitialized */
 		return;
 
-	dma_free_coherent(mhi_cntrl->cntrl_dev, tre_ring->alloc_size,
-			  tre_ring->pre_aligned, tre_ring->dma_handle);
 	vfree(buf_ring->base);
 
 	buf_ring->base = tre_ring->base = NULL;
@@ -649,26 +670,18 @@  int mhi_init_chan_ctxt(struct mhi_controller *mhi_cntrl,
 	struct mhi_ring *tre_ring;
 	struct mhi_chan_ctxt *chan_ctxt;
 	u32 tmp;
-	int ret;
 
 	buf_ring = &mhi_chan->buf_ring;
 	tre_ring = &mhi_chan->tre_ring;
-	tre_ring->el_size = sizeof(struct mhi_ring_element);
-	tre_ring->len = tre_ring->el_size * tre_ring->elements;
 	chan_ctxt = &mhi_cntrl->mhi_ctxt->chan_ctxt[mhi_chan->chan];
-	ret = mhi_alloc_aligned_ring(mhi_cntrl, tre_ring, tre_ring->len);
-	if (ret)
-		return -ENOMEM;
+	mhi_init_ring_base(tre_ring);
 
 	buf_ring->el_size = sizeof(struct mhi_buf_info);
 	buf_ring->len = buf_ring->el_size * buf_ring->elements;
 	buf_ring->base = vzalloc(buf_ring->len);
 
-	if (!buf_ring->base) {
-		dma_free_coherent(mhi_cntrl->cntrl_dev, tre_ring->alloc_size,
-				  tre_ring->pre_aligned, tre_ring->dma_handle);
+	if (!buf_ring->base)
 		return -ENOMEM;
-	}
 
 	tmp = le32_to_cpu(chan_ctxt->chcfg);
 	tmp &= ~CHAN_CTX_CHSTATE_MASK;
diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
index ce566f7d2e92..bedaf248c49a 100644
--- a/drivers/bus/mhi/host/internal.h
+++ b/drivers/bus/mhi/host/internal.h
@@ -215,7 +215,6 @@  struct mhi_ring {
 	size_t el_size;
 	size_t len;
 	size_t elements;
-	size_t alloc_size;
 	void __iomem *db_addr;
 };
 
diff --git a/include/linux/mhi.h b/include/linux/mhi.h
index 059dc94d20bb..eb5e0181bea7 100644
--- a/include/linux/mhi.h
+++ b/include/linux/mhi.h
@@ -325,6 +325,9 @@  struct mhi_controller_config {
  * @mhi_event: MHI event ring configurations table
  * @mhi_cmd: MHI command ring configurations table
  * @mhi_ctxt: MHI device context, shared memory between host and device
+ * @ctrl_config_size: Total allocated size of control configuration buffer
+ * @ctrl_config_dma: Base bus address of control configuration buffer
+ * @ctrl_config_base: Base CPU address of control configuration buffer
  * @pm_mutex: Mutex for suspend/resume operation
  * @pm_lock: Lock for protecting MHI power management state
  * @timeout_ms: Timeout in ms for state transitions
@@ -403,6 +406,10 @@  struct mhi_controller {
 	struct mhi_cmd *mhi_cmd;
 	struct mhi_ctxt *mhi_ctxt;
 
+	size_t ctrl_config_size;
+	dma_addr_t ctrl_config_dma;
+	void *ctrl_config_base;
+
 	struct mutex pm_mutex;
 	rwlock_t pm_lock;
 	u32 timeout_ms;