[03/32] spi: dw: Fix driving MOSI low while recieving

Message ID 20201107081420.60325-4-damien.lemoal@wdc.com
State New
Headers show
Series
  • [01/32] of: Fix property supplier parsing
Related show

Commit Message

Damien Le Moal Nov. 7, 2020, 8:13 a.m.
Fix for the Synopsis DesignWare SPI mmio driver taken from the work
by Sean Anderson for the U-Boot project. Sean comments:

The resting state of MOSI is high when nothing is driving it. If we
drive it low while recieving, it looks like we are transmitting 0x00
instead of transmitting nothing. This can confuse slaves (like SD cards)
which allow new commands to be sent over MOSI while they are returning
data over MISO. The return of MOSI from 0 to 1 at the end of recieving
a byte can look like a start bit and a transmission bit to an SD card.
This will cause the card to become out-of-sync with the SPI device, as
it thinks the device has already started transmitting two bytes of a new
command. The mmc-spi driver will not detect the R1 response from the SD
card, since it is sent too early, and offset by two bits. This patch
fixes transfer errors when using SD cards with dw spi.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 drivers/spi/spi-dw-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Serge Semin Nov. 9, 2020, 2:48 p.m. | #1
On Mon, Nov 09, 2020 at 02:14:22PM +0000, Mark Brown wrote:
> On Mon, Nov 09, 2020 at 08:47:10AM -0500, Sean Anderson wrote:
> > On 11/9/20 8:29 AM, Mark Brown wrote:
> 
> > > In this case it also looks like the controller hardware requires
> > > transmit data and therefore should be setting SPI_MUST_TX and just
> > > removing the in driver default anyway, though that will have no effect
> > > one way or anther on the issue you're seeing.
> 

> > There is a recieve-only mode, but it is not used by this driver. Perhaps
> > it should be.
> 
> I'd expect it'd perform better, especially on systems that are
> apparently struggling for CPU bandwidth like yours seems to.

Well, it might seem a good idea to use that mode, but there are multiple problems
you may get in implementing it.

First of all the Receive-only mode is having a limited number bytes to receive
at once. It's just 64KB. So in order to implement it you'd need to split the
bigger transfers up, and feed the DMA engine with smaller chunks one-by-one.

Secondly the Receive-only mode will make the DW SSI controller to constantly receive
the data from the SPI bus and to put it into the Rx FIFO. So your DMA engine will
have to keep up with extracting the data from there on time, otherwise you'll
end up with Rx FIFO overflow error eventually. The problem will be actual for the
DMA engines/system buses, which are slower than the SPI bus speed, second for the
DMA engines with no hardware accelerated LLP traversal support (like on our DWC DMA
controller). The second problem can be also fixed by splitting the transfers up as
it has been already implemented in the spi-dw-dma.c. But the first problem can't be
fixed, but just workarounded by limiting the SPI bus frequency so the DMA engine
would keep up with incoming data traffic.

-Sergey

> 
> > > Please also try to avoid the use of master/slave terminology where
> > > reasonable, controller and device tend to work for SPI (though MOSI/MISO
> > > are going to be harder to shift).
> 
> > Here I use it to draw distinction between the SPI master and the SPI
> > slave, which are both devices in different contexts. 
> 
> If you find the use of device to refer to the device being controlled
> confusing consider also using something like client device instead,
> there's a number of ways to do it (there's a list in Documentation IIRC).
Serge Semin Nov. 9, 2020, 7:19 p.m. | #2
On Mon, Nov 09, 2020 at 02:14:22PM +0000, Mark Brown wrote:
> On Mon, Nov 09, 2020 at 08:47:10AM -0500, Sean Anderson wrote:
> > On 11/9/20 8:29 AM, Mark Brown wrote:
> > > On Sat, Nov 07, 2020 at 05:13:51PM +0900, Damien Le Moal wrote:
> > > 

> > >> The resting state of MOSI is high when nothing is driving it. If we
> > >> drive it low while recieving, it looks like we are transmitting 0x00
> > >> instead of transmitting nothing. This can confuse slaves (like SD cards)
> > >> which allow new commands to be sent over MOSI while they are returning
> > >> data over MISO. The return of MOSI from 0 to 1 at the end of recieving
> > >> a byte can look like a start bit and a transmission bit to an SD card.

Yeah, that's what we've also experienced on our systems. We've worked
around the problem in exactly the same way as you have. But we haven't
dared to send it out as the solution seemed a bit hackish.

> > > 
> > > If client devices are interpreting the transmitted data then I would
> > > expect the drivers for that hardware to be ensuring that whatever we
> > > transmit matches what the device is expecting.  We shouldn't be putting
> > > a hack in a particular controller driver to paper over things, that will
> > > mean that the device will break when used with other controllers and if
> > > different devices have different requirements then obviously we can't
> > > satisfy them.  There is not meaningfully a general specification for SPI
> > > which says what happens when signals are idle, it's all specific to the
> > > client device.
> > > 
> > > In this case it also looks like the controller hardware requires
> > > transmit data and therefore should be setting SPI_MUST_TX and just
> > > removing the in driver default anyway, though that will have no effect
> > > one way or anther on the issue you're seeing.
> 

> > There is a recieve-only mode, but it is not used by this driver. Perhaps
> > it should be.
> 
> I'd expect it'd perform better, especially on systems that are
> apparently struggling for CPU bandwidth like yours seems to.

CPU-wise. RO-mode won't help in that case. Moreover it will be even
more errors-prone for the systems with small CPU bandwidth. As I said
the Receive-only mode will make the SPI controller automatically
receiving data from the SPI bus and putting it into the Rx FIFO. If
CPU is either busy with something else or too slow in fetching the
data from the Rx FIFO, the FIFO will be eventually overflown with
data, which we need to avoid at all cost.

As I see it the Receive-only mode is only acceptable in the next two
situations:

1) Rx-only DMA. But only if the DMA-engine and system bus are fast
enough to fetch the incoming data on time. (Note for example in our
system some DWC DMA-engine channels don't work well with the DW APB
SSI working with full-speed, so we had to set constraints on the DWC
DMA channels being used in conjunction with the DW APB SSI
controller.)

2) Rx-only with atomic CPU utilization. In order to make sure that the
CPU keeps up with fetching the data from the Rx FIFO, we have to
disable the local CPU IRQs while performing the Rx-only transfers, so
to prevent the Rx FIFO overflow while the CPU is doing something else.
Needless to say that such approach should be utilized only as a last
resort, if we have no choice but to run the Receive-only transfers.
Because locking the CPU for God knows how much time may cause the
system interactivity degradation. For instance, a possible use-case of
that design is when the controller is communicating with the
SPI-devices with native DW APB SSI chip-select attached. BTW You can
also find that design implemented in the kernel 5.10 spi-dw-core.c
driver in context of the SPI-memory operations (with my last patches
merged in). In particular I had to use it to handle the CPU-based
EEPROM-read mode.

So in all other cases for normal CPU-based SPI-transfers when
GPIO-based chip-select is available the safest solution would be to
use a normal Push-Pull mode. In this case we have no risk in getting
the Rx FIFO overflow unless there is a bug in the code, which is
fixable anyway.

Getting back to the patch. In fact I don't really see how the
Receive-only mode will help us with solving the problem noted in the
patch log. As Mark said the problem with the Tx data on Rx-only
transfers should be fixed on the client side. If an subordinate
SPI-device needs a specific value to be received in that case, then
that value should be somehow provided to the SPI-controller anyway.
So the native Rx-only mode of the DW APB SSI controller won't help.
Currently it's possible to be done only by executing a Full-duplex
SPI-transfer with the Tx-buffer being pre-initialized with that
value.

Another possible solution for the problem would be to fix the SPI core
so aside with tx_buf being set to the NULL-pointer, a client driver
would provide a default level or some specific value being put to the
SPI bus on Rx-only transfers. If an SPI-controller is capable of
satisfying the request, then it will accept the transfer. If it's not,
then the SPI core may try to convert the Rx-only transfer into the
Full-duplex transfer with the Tx-buffer being initialized with the
requested level.

-Sergey
Mark Brown Nov. 9, 2020, 8:20 p.m. | #3
On Mon, Nov 09, 2020 at 10:19:09PM +0300, Serge Semin wrote:
> On Mon, Nov 09, 2020 at 02:14:22PM +0000, Mark Brown wrote:

> > I'd expect it'd perform better, especially on systems that are
> > apparently struggling for CPU bandwidth like yours seems to.

> CPU-wise. RO-mode won't help in that case. Moreover it will be even
> more errors-prone for the systems with small CPU bandwidth. As I said

Right, these are two separate issues - one is that the client device
has requirements on the transmit data at times when the driver isn't
defining what should be transmitted, the other is that the controller
driver is using full duplex mode even for single duplex data.  I just
happened to notice the second issue while reviewing the change - there
shouldn't be any code for setting the dummy transmit pattern in the
driver in the first place.

> 2) Rx-only with atomic CPU utilization. In order to make sure that the
> CPU keeps up with fetching the data from the Rx FIFO, we have to
> disable the local CPU IRQs while performing the Rx-only transfers, so
> to prevent the Rx FIFO overflow while the CPU is doing something else.

...

> So in all other cases for normal CPU-based SPI-transfers when
> GPIO-based chip-select is available the safest solution would be to
> use a normal Push-Pull mode. In this case we have no risk in getting
> the Rx FIFO overflow unless there is a bug in the code, which is
> fixable anyway.

I'm not clear why we would have issues with the FIFO overflowing in PIO
mode, especially with a GPIO chip select - even if we're forced to tell
the controller how big the transfer is if we're using a GPIO chip select
we could just tell it we're doing a series of FIFO sized transfers?

> Another possible solution for the problem would be to fix the SPI core
> so aside with tx_buf being set to the NULL-pointer, a client driver
> would provide a default level or some specific value being put to the
> SPI bus on Rx-only transfers. If an SPI-controller is capable of
> satisfying the request, then it will accept the transfer. If it's not,
> then the SPI core may try to convert the Rx-only transfer into the
> Full-duplex transfer with the Tx-buffer being initialized with the
> requested level.

We do have support in the core for creating dummy data buffers for
controllers that can't do half duplex - that's the SPI_MUST_TX and
matching SPI_MUST_RX that I mentioned in my initial reply.  Currently we
always zero fill transmit buffers, the expected semantic is that if the
client driver isn't supplying data that means the device doesn't care
what gets sent and it's not clear to me that it isn't sensible to just
keep that like I said earlier, I don't know how common it's going to be
to need this since most half duplex transfers generally are half duplex.
The whole point with the SPI_MUST_ flags was to remove the need for
controller drivers to open code handling this, it was adding
complication and supporting configuration of the dummy data feels like
it's adding room for things to go wrong.
Serge Semin Nov. 9, 2020, 9:05 p.m. | #4
On Mon, Nov 09, 2020 at 08:20:52PM +0000, Mark Brown wrote:
> On Mon, Nov 09, 2020 at 10:19:09PM +0300, Serge Semin wrote:
> > On Mon, Nov 09, 2020 at 02:14:22PM +0000, Mark Brown wrote:
> 
> > > I'd expect it'd perform better, especially on systems that are
> > > apparently struggling for CPU bandwidth like yours seems to.
> 
> > CPU-wise. RO-mode won't help in that case. Moreover it will be even
> > more errors-prone for the systems with small CPU bandwidth. As I said
> 
> Right, these are two separate issues - one is that the client device
> has requirements on the transmit data at times when the driver isn't
> defining what should be transmitted, the other is that the controller
> driver is using full duplex mode even for single duplex data.  I just
> happened to notice the second issue while reviewing the change - there
> shouldn't be any code for setting the dummy transmit pattern in the
> driver in the first place.
> 
> > 2) Rx-only with atomic CPU utilization. In order to make sure that the
> > CPU keeps up with fetching the data from the Rx FIFO, we have to
> > disable the local CPU IRQs while performing the Rx-only transfers, so
> > to prevent the Rx FIFO overflow while the CPU is doing something else.
> 
> ...
> 
> > So in all other cases for normal CPU-based SPI-transfers when
> > GPIO-based chip-select is available the safest solution would be to
> > use a normal Push-Pull mode. In this case we have no risk in getting
> > the Rx FIFO overflow unless there is a bug in the code, which is
> > fixable anyway.
> 

> I'm not clear why we would have issues with the FIFO overflowing in PIO
> mode, especially with a GPIO chip select - even if we're forced to tell
> the controller how big the transfer is if we're using a GPIO chip select

> we could just tell it we're doing a series of FIFO sized transfers?

Hm, you are right. Splitting the Rx-only transfers on the chunks with
lengths smaller than the FIFO size indeed would have solved the
problem of the Rx FIFO overflow with GPIO-based CS hardware. Don't
really know how I missed that. Most likely because when concerning the
Tx-only/Rx-only/EEPROM-read modes I always think about the native
chip-select automatic assertion/de-assertion, in which case we have no
other way but to provide the SPI-transfers/message atomicity.

> 
> > Another possible solution for the problem would be to fix the SPI core
> > so aside with tx_buf being set to the NULL-pointer, a client driver
> > would provide a default level or some specific value being put to the
> > SPI bus on Rx-only transfers. If an SPI-controller is capable of
> > satisfying the request, then it will accept the transfer. If it's not,
> > then the SPI core may try to convert the Rx-only transfer into the
> > Full-duplex transfer with the Tx-buffer being initialized with the
> > requested level.
> 

> We do have support in the core for creating dummy data buffers for
> controllers that can't do half duplex - that's the SPI_MUST_TX and
> matching SPI_MUST_RX that I mentioned in my initial reply.  Currently we
> always zero fill transmit buffers, the expected semantic is that if the
> client driver isn't supplying data that means the device doesn't care
> what gets sent and it's not clear to me that it isn't sensible to just
> keep that like I said earlier,

> I don't know how common it's going to be
> to need this since most half duplex transfers generally are half duplex.
> The whole point with the SPI_MUST_ flags was to remove the need for
> controller drivers to open code handling this, it was adding
> complication and supporting configuration of the dummy data feels like
> it's adding room for things to go wrong.

If by general Rx-only half-duplex transfers you meant that the client
SPI-device shall just not care what the MOSI level, then the only
acceptable solution of the noted in this patch problem is to fix the
client driver. Since in case of the MMC-SPI client device sometimes it
does care about the level.

-Sergey

Patch

diff --git a/drivers/spi/spi-dw-core.c b/drivers/spi/spi-dw-core.c
index 841c85247f01..c2ef1d8d46d5 100644
--- a/drivers/spi/spi-dw-core.c
+++ b/drivers/spi/spi-dw-core.c
@@ -137,7 +137,7 @@  static inline u32 rx_max(struct dw_spi *dws)
 static void dw_writer(struct dw_spi *dws)
 {
 	u32 max = tx_max(dws);
-	u16 txw = 0;
+	u16 txw = 0xffff;
 
 	while (max--) {
 		if (dws->tx) {