
[RESEND,v4,1/6] dmaengine: dw: Add peripheral bus width verification

Message ID 20240802075100.6475-2-fancer.lancer@gmail.com
State New
Series dmaengine: dw: Fix src/dst addr width misconfig

Commit Message

Serge Semin Aug. 2, 2024, 7:50 a.m. UTC
Currently the src_addr_width and dst_addr_width fields of the
dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
properly aligned data passed to the target device. It's done just by
converting the passed peripheral bus width to the encoded value using the
__ffs() function. This implementation has several problematic sides:

1. __ffs() is undefined if no bit is set in the passed value. Thus if the
specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
an unexpected value depending on the platform-specific implementation.

2. The DW AHB DMA engine only permits power-of-2 transfer widths, limited
by the DMAH_Mk_HDATA_WIDTH IP-core synthesis parameter. Specifying a
bus width outside of those constraints will definitely cause an
unexpected result, since the destination register will be updated only
partly, contrary to what the client driver implied.

Let's fix all of that by adding a peripheral bus width verification
method and calling it in dwc_config(), which is supposed to be executed
before preparing any transfer. The new method will make sure that the
passed source or destination address width is valid, and if it is
undefined the driver will just fall back to the 1-byte transfer width.
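
For context, here is a minimal sketch (not part of the patch) of how a
client driver typically supplies the address width in question via the
generic dmaengine API; the FIFO address and burst value below are made up:

/* Illustrative client-side configuration; error handling trimmed. */
#include <linux/dmaengine.h>

static int client_setup_tx(struct dma_chan *chan, dma_addr_t fifo_dma)
{
	struct dma_slave_config cfg = {
		.direction      = DMA_MEM_TO_DEV,
		.dst_addr       = fifo_dma,
		/*
		 * Must be a power-of-2 no larger than the peripheral data
		 * bus width. DMA_SLAVE_BUSWIDTH_UNDEFINED (0) used to reach
		 * __ffs(0); with this patch it falls back to 1-byte
		 * transfers instead.
		 */
		.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
		.dst_maxburst   = 8,
	};

	return dmaengine_slave_config(chan, &cfg);
}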

Fixes: 029a40e97d0d ("dmaengine: dw: provide DMA capabilities")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>

---

Changelog v2:
- Add a note to the commit message about having the verification
  method called in the dwc_config() function. (Andy)
- Add hyphen to "1byte" in the in-situ comment. (Andy)
- Convert "err" to "ret" variable. (Andy)
---
 drivers/dma/dw/core.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

Comments

Andy Shevchenko Sept. 14, 2024, 7:12 p.m. UTC | #1
On Fri, Aug 02, 2024 at 10:50:46AM +0300, Serge Semin wrote:
> Currently the src_addr_width and dst_addr_width fields of the
> dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
> CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
> properly aligned data passed to the target device. It's done just by
> converting the passed peripheral bus width to the encoded value using the
> __ffs() function. This implementation has several problematic sides:
> 
> 1. __ffs() is undefined if no bit exist in the passed value. Thus if the
> specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
> unexpected value depending on the platform-specific implementation.
> 
> 2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
> by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
> bus-width out of that constraints scope will definitely cause unexpected
> result since the destination reg will be only partly touched than the
> client driver implied.
> 
> Let's fix all of that by adding the peripheral bus width verification
> method and calling it in dwc_config() which is supposed to be executed
> before preparing any transfer. The new method will make sure that the
> passed source or destination address width is valid and if undefined then
> the driver will just fallback to the 1-byte width transfer.

This patch broke the Intel Merrifield iDMA32 + SPI PXA2xx configuration
for me. Since it's the first patch in the series and the rest most likely
depends on it, and we are almost at the release date, I propose to roll
back and start again after v6.12-rc1 is out. Vinod, can we revert the
entire series, please?
Serge Semin Sept. 14, 2024, 7:22 p.m. UTC | #2
On Sat, Sep 14, 2024 at 10:12:35PM +0300, Andy Shevchenko wrote:
> On Fri, Aug 02, 2024 at 10:50:46AM +0300, Serge Semin wrote:
> > Currently the src_addr_width and dst_addr_width fields of the
> > dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
> > CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
> > properly aligned data passed to the target device. It's done just by
> > converting the passed peripheral bus width to the encoded value using the
> > __ffs() function. This implementation has several problematic sides:
> > 
> > 1. __ffs() is undefined if no bit exist in the passed value. Thus if the
> > specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
> > unexpected value depending on the platform-specific implementation.
> > 
> > 2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
> > by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
> > bus-width out of that constraints scope will definitely cause unexpected
> > result since the destination reg will be only partly touched than the
> > client driver implied.
> > 
> > Let's fix all of that by adding the peripheral bus width verification
> > method and calling it in dwc_config() which is supposed to be executed
> > before preparing any transfer. The new method will make sure that the
> > passed source or destination address width is valid and if undefined then
> > the driver will just fallback to the 1-byte width transfer.
> 
> This patch broke Intel Merrifield iDMA32 + SPI PXA2xx configuration to
> me. Since it's first in the series and most likely the rest is
> dependent and we are almost at the release date I propose to roll back
> and start again after v6.12-rc1 will be out. Vinod, can we revert the
> entire series, please?

I guess that's not the best option, since the patch has already been
backported to the stable kernels anyway. Rolling it back from all of
them seems tiresome. Let's at least try to fix the just discovered
problem?

Could you please provide more details about what exactly is happening?

-Serge(y)

Ferry Toth Sept. 15, 2024, 9:06 p.m. UTC | #3
Hi,

Op 14-09-2024 om 21:22 schreef Serge Semin:
> On Sat, Sep 14, 2024 at 10:12:35PM +0300, Andy Shevchenko wrote:
>> On Fri, Aug 02, 2024 at 10:50:46AM +0300, Serge Semin wrote:
>>> Currently the src_addr_width and dst_addr_width fields of the
>>> dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
>>> CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
>>> properly aligned data passed to the target device. It's done just by
>>> converting the passed peripheral bus width to the encoded value using the
>>> __ffs() function. This implementation has several problematic sides:
>>>
>>> 1. __ffs() is undefined if no bit exist in the passed value. Thus if the
>>> specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
>>> unexpected value depending on the platform-specific implementation.
>>>
>>> 2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
>>> by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
>>> bus-width out of that constraints scope will definitely cause unexpected
>>> result since the destination reg will be only partly touched than the
>>> client driver implied.
>>>
>>> Let's fix all of that by adding the peripheral bus width verification
>>> method and calling it in dwc_config() which is supposed to be executed
>>> before preparing any transfer. The new method will make sure that the
>>> passed source or destination address width is valid and if undefined then
>>> the driver will just fallback to the 1-byte width transfer.
>> This patch broke Intel Merrifield iDMA32 + SPI PXA2xx configuration to
>> me. Since it's first in the series and most likely the rest is
>> dependent and we are almost at the release date I propose to roll back
>> and start again after v6.12-rc1 will be out. Vinod, can we revert the
>> entire series, please?
Indeed, all six need to be reverted due to the dependency.
> I guess it's not the best option, since the patch has already been
> backported to the stable kernels anyway. Rolling back it from all of
> them seems tiresome. Let's at least try to fix the just discovered
> problem?
>
> Could you please provide more details about what exactly happening?

I can reproduce it (after working around another issue with the following
tip from Andy: the DMA module is loaded _after_ the SPI one, and for some
reason the DMA engine APIs haven't returned deferred probe, hence the SPI
driver considered the absence of DMA non-fatal even though we have an ACPI
description for it. So you may try to manually unload the SPI driver and
load it again, and it should start showing DMA).

On 6.11.0-rc6 I get:
root@edison:~# ./spidev_test -D /dev/spidev5.1 -l
spi mode: 0x20
bits per word: 8
max speed: 500000 Hz (500 kHz)
can't send spi message: Device or resource busy
Aborted (core dumped)

And on 6.11.0-rc7 with the 6 patches reverted:

root@edison:~# ./spidev_test -D /dev/spidev5.1 -l -v
spi mode: 0x20
bits per word: 8
max speed: 500000 Hz (500 kHz)
TX | FF FF FF FF FF FF 40 00 00 00 00 95 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF F0 0D  |......@.........................|
RX | FF FF FF FF FF FF 40 00 00 00 00 95 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF F0 0D  |......@.........................|
Andy Shevchenko Sept. 16, 2024, 11:43 a.m. UTC | #4
On Sat, Sep 14, 2024 at 10:22:22PM +0300, Serge Semin wrote:
> On Sat, Sep 14, 2024 at 10:12:35PM +0300, Andy Shevchenko wrote:
> > On Fri, Aug 02, 2024 at 10:50:46AM +0300, Serge Semin wrote:
> > > Currently the src_addr_width and dst_addr_width fields of the
> > > dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
> > > CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
> > > properly aligned data passed to the target device. It's done just by
> > > converting the passed peripheral bus width to the encoded value using the
> > > __ffs() function. This implementation has several problematic sides:
> > > 
> > > 1. __ffs() is undefined if no bit exist in the passed value. Thus if the
> > > specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
> > > unexpected value depending on the platform-specific implementation.
> > > 
> > > 2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
> > > by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
> > > bus-width out of that constraints scope will definitely cause unexpected
> > > result since the destination reg will be only partly touched than the
> > > client driver implied.
> > > 
> > > Let's fix all of that by adding the peripheral bus width verification
> > > method and calling it in dwc_config() which is supposed to be executed
> > > before preparing any transfer. The new method will make sure that the
> > > passed source or destination address width is valid and if undefined then
> > > the driver will just fallback to the 1-byte width transfer.
> > 
> > This patch broke Intel Merrifield iDMA32 + SPI PXA2xx configuration to
> > me. Since it's first in the series and most likely the rest is
> > dependent and we are almost at the release date I propose to roll back
> > and start again after v6.12-rc1 will be out. Vinod, can we revert the
> > entire series, please?
> 
> I guess it's not the best option, since the patch has already been
> backported to the stable kernels anyway. Rolling back it from all of
> them seems tiresome. Let's at least try to fix the just discovered
> problem?

Please, provide one we can test!

> Could you please provide more details about what exactly happening?

Sure. AFAICT the only problematic line is this:

        else if (!is_power_of_2(reg_width) || reg_width > max_width)

in your patch; it may trigger, for example, when max_width == 0.
According to my brief investigation, this happens due to the following.

The DMA slave configuration is copied twice in the DW DMA code:
1) when the respective filter function triggers (see the acpi/of glue code);
2) when the channel is about to be allocated.

The iDMA32 has only a single master, and hence m_master == p_master,
BUT the filter function in the acpi code is universal and it copies
the wrong (from the iDMA32 perspective) value to p_master.
As a result, when you retrieve max_width, the lookup uses p_master,
which is defined as 1 (sic!), and hence max_width ends up being 0.
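
Roughly, assuming the ACPI glue still looks like the simplified sketch
below (not the verbatim code), the path is:

/* Simplified sketch of drivers/dma/dw/acpi.c and the later width lookup. */
static bool dw_dma_acpi_filter(struct dma_chan *chan, void *param)
{
	struct acpi_dma_spec *dma_spec = param;
	struct dw_dma_slave slave = {
		.dma_dev  = dma_spec->dev,
		.src_id   = dma_spec->slave_id,
		.dst_id   = dma_spec->slave_id,
		.m_master = 0,
		.p_master = 1,	/* wrong for iDMA32, which has a single master */
	};

	return dw_dma_filter(chan, &slave);	/* copies the masters into dwc->dws */
}

/*
 * Later, dwc_verify_p_buswidth() does:
 *   max_width = dw->pdata->data_width[dwc->dws.p_master];
 * On iDMA32 only data_width[0] is populated, so data_width[1] == 0 and any
 * valid reg_width trips the "reg_width > max_width" check.
 */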

I don't know how to quick-fix this, as the proper fix seems to be to
provide the correct data in the first place.

Any ideas, patches we may test?
Andy Shevchenko Sept. 16, 2024, 11:45 a.m. UTC | #5
On Mon, Sep 16, 2024 at 02:43:48PM +0300, Andy Shevchenko wrote:
> On Sat, Sep 14, 2024 at 10:22:22PM +0300, Serge Semin wrote:
> > On Sat, Sep 14, 2024 at 10:12:35PM +0300, Andy Shevchenko wrote:
> > > On Fri, Aug 02, 2024 at 10:50:46AM +0300, Serge Semin wrote:
> > > > Currently the src_addr_width and dst_addr_width fields of the
> > > > dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
> > > > CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
> > > > properly aligned data passed to the target device. It's done just by
> > > > converting the passed peripheral bus width to the encoded value using the
> > > > __ffs() function. This implementation has several problematic sides:
> > > > 
> > > > 1. __ffs() is undefined if no bit exist in the passed value. Thus if the
> > > > specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
> > > > unexpected value depending on the platform-specific implementation.
> > > > 
> > > > 2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
> > > > by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
> > > > bus-width out of that constraints scope will definitely cause unexpected
> > > > result since the destination reg will be only partly touched than the
> > > > client driver implied.
> > > > 
> > > > Let's fix all of that by adding the peripheral bus width verification
> > > > method and calling it in dwc_config() which is supposed to be executed
> > > > before preparing any transfer. The new method will make sure that the
> > > > passed source or destination address width is valid and if undefined then
> > > > the driver will just fallback to the 1-byte width transfer.
> > > 
> > > This patch broke Intel Merrifield iDMA32 + SPI PXA2xx configuration to
> > > me. Since it's first in the series and most likely the rest is
> > > dependent and we are almost at the release date I propose to roll back
> > > and start again after v6.12-rc1 will be out. Vinod, can we revert the
> > > entire series, please?
> > 
> > I guess it's not the best option, since the patch has already been
> > backported to the stable kernels anyway. Rolling back it from all of
> > them seems tiresome. Let's at least try to fix the just discovered
> > problem?
> 
> Please, provide one we can test!
> 
> > Could you please provide more details about what exactly happening?
> 
> Sure. AFAICT the only problematic line is this:
> 
>         else if (!is_power_of_2(reg_width) || reg_width > max_width)
> 
> in your patch, and it may trigger, for example, when max_width == 0.
> This, in accordance with my brief investigation, happens due to the following.
> 
> The DMA slave configuration is being copied twice in DW DMA code:
> 1) when respective filter function triggers (see acpi/of glue code);
> 2) when the channel is about to be allocated.
> 
> The iDMA32 has only a single master, and hence m_master == p_master,
> BUT the filter function in the acpi code is universal and it copies
> the wrong (from the iDMA32 perspective) value to p_master.
> As the result, when you retrieve the max_width, it takes the value from
> p_master, which is defined to 1 (sic!), and hence assigns it to 0.
> 
> I don't know how to quickfix this as the proper fix seems to provide
> the correct data in the first place.
> 
> Any ideas, patches we may test?

P.S. In your defense, it seems that your change actually revealed an
inconsistency in the existing code. But still, it caused a regression.
Andy Shevchenko Sept. 16, 2024, 11:57 a.m. UTC | #6
On Mon, Sep 16, 2024 at 02:45:41PM +0300, Andy Shevchenko wrote:
> On Mon, Sep 16, 2024 at 02:43:48PM +0300, Andy Shevchenko wrote:
> > On Sat, Sep 14, 2024 at 10:22:22PM +0300, Serge Semin wrote:
> > > On Sat, Sep 14, 2024 at 10:12:35PM +0300, Andy Shevchenko wrote:
> > > > On Fri, Aug 02, 2024 at 10:50:46AM +0300, Serge Semin wrote:
> > > > > Currently the src_addr_width and dst_addr_width fields of the
> > > > > dma_slave_config structure are mapped to the CTLx.SRC_TR_WIDTH and
> > > > > CTLx.DST_TR_WIDTH fields of the peripheral bus side in order to have the
> > > > > properly aligned data passed to the target device. It's done just by
> > > > > converting the passed peripheral bus width to the encoded value using the
> > > > > __ffs() function. This implementation has several problematic sides:
> > > > > 
> > > > > 1. __ffs() is undefined if no bit exist in the passed value. Thus if the
> > > > > specified addr-width is DMA_SLAVE_BUSWIDTH_UNDEFINED, __ffs() may return
> > > > > unexpected value depending on the platform-specific implementation.
> > > > > 
> > > > > 2. DW AHB DMA-engine permits having the power-of-2 transfer width limited
> > > > > by the DMAH_Mk_HDATA_WIDTH IP-core synthesize parameter. Specifying
> > > > > bus-width out of that constraints scope will definitely cause unexpected
> > > > > result since the destination reg will be only partly touched than the
> > > > > client driver implied.
> > > > > 
> > > > > Let's fix all of that by adding the peripheral bus width verification
> > > > > method and calling it in dwc_config() which is supposed to be executed
> > > > > before preparing any transfer. The new method will make sure that the
> > > > > passed source or destination address width is valid and if undefined then
> > > > > the driver will just fallback to the 1-byte width transfer.
> > > > 
> > > > This patch broke Intel Merrifield iDMA32 + SPI PXA2xx configuration to
> > > > me. Since it's first in the series and most likely the rest is
> > > > dependent and we are almost at the release date I propose to roll back
> > > > and start again after v6.12-rc1 will be out. Vinod, can we revert the
> > > > entire series, please?
> > > 
> > > I guess it's not the best option, since the patch has already been
> > > backported to the stable kernels anyway. Rolling back it from all of
> > > them seems tiresome. Let's at least try to fix the just discovered
> > > problem?
> > 
> > Please, provide one we can test!
> > 
> > > Could you please provide more details about what exactly happening?
> > 
> > Sure. AFAICT the only problematic line is this:
> > 
> >         else if (!is_power_of_2(reg_width) || reg_width > max_width)
> > 
> > in your patch, and it may trigger, for example, when max_width == 0.
> > This, in accordance with my brief investigation, happens due to the following.
> > 
> > The DMA slave configuration is being copied twice in DW DMA code:
> > 1) when respective filter function triggers (see acpi/of glue code);
> > 2) when the channel is about to be allocated.
> > 
> > The iDMA32 has only a single master, and hence m_master == p_master,
> > BUT the filter function in the acpi code is universal and it copies
> > the wrong (from the iDMA32 perspective) value to p_master.
> > As the result, when you retrieve the max_width, it takes the value from
> > p_master, which is defined to 1 (sic!), and hence assigns it to 0.

Okay, that was the theory. Now I have made a hack patch, i.e. supplied 0
in acpi.c in the filter function, and everything starts to work. So my
theory is correct.
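
For reference, the test hack was roughly this one-liner in the ACPI filter
(illustrative only, not a proper fix):

--- a/drivers/dma/dw/acpi.c
+++ b/drivers/dma/dw/acpi.c
@@ ... @@ static bool dw_dma_acpi_filter(struct dma_chan *chan, void *param)
 		.m_master = 0,
-		.p_master = 1,
+		.p_master = 0,	/* hack: iDMA32 has a single master */
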

> > I don't know how to quickfix this as the proper fix seems to provide
> > the correct data in the first place.
> > 
> > Any ideas, patches we may test?
> 
> P.S. for your advocacy it seems that your change actually revealed an
> inconsistency in the existing code. But still, it made a regression.
Andy Shevchenko Sept. 16, 2024, 12:46 p.m. UTC | #7
On Mon, Sep 16, 2024 at 02:57:56PM +0300, Andy Shevchenko wrote:
> On Mon, Sep 16, 2024 at 02:45:41PM +0300, Andy Shevchenko wrote:

...

> Okay that was the theory, now I made a hack patch, i.e. supply 0 in acpi.c in
> the filter function and everything starts to work. So, my theory is correct.

FWIW, I have reapplied the whole series and it seems fine, so the only
regression is the one I described in the previous replies.
Serge Semin Sept. 19, 2024, 2:04 p.m. UTC | #8
Hi Andy, Ferry

On Mon, Sep 16, 2024 at 03:46:44PM +0300, Andy Shevchenko wrote:
> On Mon, Sep 16, 2024 at 02:57:56PM +0300, Andy Shevchenko wrote:
> > On Mon, Sep 16, 2024 at 02:45:41PM +0300, Andy Shevchenko wrote:
> 
> ...
> 
> > Okay that was the theory, now I made a hack patch, i.e. supply 0 in acpi.c in
> > the filter function and everything starts to work. So, my theory is correct.
> 
> FWIW, I have reapplied the whole series and it seems fine, so the only regression
> is that I described in the previous replies.

I've just submitted a fix for the issue based on Andy's and my
investigation of the problem. You've been Cc'ed. Please test the patch
out on your hardware:
https://lore.kernel.org/dmaengine/20240919135854.16124-1-fancer.lancer@gmail.com

Hope it works.

-Serge(y)


Patch

diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c
index 4b3402156eae..d4c694b0f55a 100644
--- a/drivers/dma/dw/core.c
+++ b/drivers/dma/dw/core.c
@@ -16,6 +16,7 @@ 
 #include <linux/init.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
+#include <linux/log2.h>
 #include <linux/mm.h>
 #include <linux/module.h>
 #include <linux/slab.h>
@@ -780,10 +781,43 @@  bool dw_dma_filter(struct dma_chan *chan, void *param)
 }
 EXPORT_SYMBOL_GPL(dw_dma_filter);
 
+static int dwc_verify_p_buswidth(struct dma_chan *chan)
+{
+	struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
+	struct dw_dma *dw = to_dw_dma(chan->device);
+	u32 reg_width, max_width;
+
+	if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
+		reg_width = dwc->dma_sconfig.dst_addr_width;
+	else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM)
+		reg_width = dwc->dma_sconfig.src_addr_width;
+	else /* DMA_MEM_TO_MEM */
+		return 0;
+
+	max_width = dw->pdata->data_width[dwc->dws.p_master];
+
+	/* Fall-back to 1-byte transfer width if undefined */
+	if (reg_width == DMA_SLAVE_BUSWIDTH_UNDEFINED)
+		reg_width = DMA_SLAVE_BUSWIDTH_1_BYTE;
+	else if (!is_power_of_2(reg_width) || reg_width > max_width)
+		return -EINVAL;
+	else /* bus width is valid */
+		return 0;
+
+	/* Update undefined addr width value */
+	if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV)
+		dwc->dma_sconfig.dst_addr_width = reg_width;
+	else /* DMA_DEV_TO_MEM */
+		dwc->dma_sconfig.src_addr_width = reg_width;
+
+	return 0;
+}
+
 static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
 {
 	struct dw_dma_chan *dwc = to_dw_dma_chan(chan);
 	struct dw_dma *dw = to_dw_dma(chan->device);
+	int ret;
 
 	memcpy(&dwc->dma_sconfig, sconfig, sizeof(*sconfig));
 
@@ -792,6 +826,10 @@  static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
 	dwc->dma_sconfig.dst_maxburst =
 		clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst);
 
+	ret = dwc_verify_p_buswidth(chan);
+	if (ret)
+		return ret;
+
 	dw->encode_maxburst(dwc, &dwc->dma_sconfig.src_maxburst);
 	dw->encode_maxburst(dwc, &dwc->dma_sconfig.dst_maxburst);