diff mbox series

tty: max310x: work around regmap->regcache data corruption

Message ID bd91db46c50615bc1d1d62beb659fa7f62386446.1701446070.git.jan.kundrat@cesnet.cz
State New
Headers show
Series tty: max310x: work around regmap->regcache data corruption | expand

Commit Message

Jan Kundrát Dec. 1, 2023, 2:51 p.m. UTC
The TL;DR summary is that the regmap_noinc_write spills over the data
that are correctly written to the HW also to the following registers in
the regcache. As a result, regcache then contains user-controlled
garbage which will be used later for bit updates on unrelated registers.

This patch is a "wrong" fix; a real fix would involve fixing regmap
and/or regcache, but that code has too many indirections for my little
mind.

I was investigating a regression that happened somewhere between 5.12.4
(plus 14 of our patches) and v6.5.9 (plus 7 of our patches). Our
MAX14830 UART would work fine the first time, but when our application
opens the UART the second time it just wouldn't send anything over the
physical TX pin. With the help of a logical analyzer, I found out that
the kernel was sending value 0xcd to the MODE1 register, which on this
chip is a request to set the UART's TX pin to the Hi-Z mode and to
switch off RX completely. That's certainly not the intention of the
code, but that's what I was seeing on the physical SPI bus, and also in
the log when I instrumented the regmap layer.

It turned out that one of the *data* bytes which were sent over the UART
was 0xdd, and that this *data byte* somehow ended up in the regcache's
idea about the value within the MODE1 register. When the UART is opened
up the next time and max310x_startup updates a single unrelated bit in
MODE1, that code consults the regcache, notices the 0xdd data byte in
there, and ends up sending 0xcd over SPI.

Here's what dump_stack() shows:

 max310x spi1.2: regcache_write: reg 0x9 value 0xdd
 max310x spi1.2: PWNED
 CPU: 1 PID: 26 Comm: kworker/1:1 Not tainted 6.5.9-7-g9e090fe75fd8 #7
 Hardware name: Marvell Armada 380/385 (Device Tree)
 Workqueue: events max310x_tx_proc
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x40/0x4c
  dump_stack_lvl from regcache_write+0xc0/0xc4
  regcache_write from _regmap_raw_write_impl+0x178/0x828
  _regmap_raw_write_impl from _regmap_raw_write+0xb8/0x134
  _regmap_raw_write from regmap_noinc_write+0x130/0x178
  regmap_noinc_write from max310x_tx_proc+0xd4/0x1a4
  max310x_tx_proc from process_one_work+0x21c/0x4e4
  process_one_work from worker_thread+0x50/0x54c
  worker_thread from kthread+0xe0/0xfc
  kthread from ret_from_fork+0x14/0x28

Clearly, regmap_noinc_write of a register 0x00 (that's the TX FIFO on
this chip) has no business updating register 0x09, but that's what was
happening here. The regmap_config is already set up in a way that
register 0x00 is marked precious and volatile, so it has no business
going through the cache at all. Also, the documentation for
regmap_noinc_write suggests that this driver was using the regmap
infrastructure correctly, and that the real bug is somewhere in
regmap/regcache where a call to regmap_noinc_write end up updating an
unrelated register in regcache.

Until regmap/regcache is fixed, let's just use an adapted version of the
old code that bypasses regmap altogether, and just sends out an SPI
transaction.

This is related to my commit 3f42b142ea1171967e40e10e4b0241c0d6d28d41
("serial: max310x: fix IO data corruption in batched operations") which
introduced usage of regmap_noinc_write() to this driver. That commit is
a fixup of commit 285e76fc049c4d32c772eea9460a7ef28a193802 ("serial:
max310x: use regmap methods for SPI batch operations") which started
using regmap_raw_write(), which was however also a wrong function.

Fixes: 3f42b142ea11 ("serial: max310x: fix IO data corruption in batched operations")
Fixes: 285e76fc049c ("serial: max310x: use regmap methods for SPI batch operations")
Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
To: Mark Brown <broonie@kernel.org>
To: Cosmin Tanislav <cosmin.tanislav@analog.com>
To: linux-serial@vger.kernel.org
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: linux-kernel@vger.kernel.org
---
 drivers/tty/serial/max310x.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

Comments

Hugo Villeneuve Dec. 1, 2023, 6:27 p.m. UTC | #1
On Fri, 1 Dec 2023 15:51:51 +0100
Jan Kundrát <jan.kundrat@cesnet.cz> wrote:

> The TL;DR summary is that the regmap_noinc_write spills over the data
> that are correctly written to the HW also to the following registers in
> the regcache. As a result, regcache then contains user-controlled
> garbage which will be used later for bit updates on unrelated registers.
> 
> This patch is a "wrong" fix; a real fix would involve fixing regmap
> and/or regcache, but that code has too many indirections for my little
> mind.
> 
> I was investigating a regression that happened somewhere between 5.12.4
> (plus 14 of our patches) and v6.5.9 (plus 7 of our patches). Our
> MAX14830 UART would work fine the first time, but when our application
> opens the UART the second time it just wouldn't send anything over the
> physical TX pin. With the help of a logical analyzer, I found out that
> the kernel was sending value 0xcd to the MODE1 register, which on this
> chip is a request to set the UART's TX pin to the Hi-Z mode and to
> switch off RX completely. That's certainly not the intention of the
> code, but that's what I was seeing on the physical SPI bus, and also in
> the log when I instrumented the regmap layer.
> 
> It turned out that one of the *data* bytes which were sent over the UART
> was 0xdd, and that this *data byte* somehow ended up in the regcache's
> idea about the value within the MODE1 register. When the UART is opened
> up the next time and max310x_startup updates a single unrelated bit in
> MODE1, that code consults the regcache, notices the 0xdd data byte in
> there, and ends up sending 0xcd over SPI.
> 
> Here's what dump_stack() shows:
> 
>  max310x spi1.2: regcache_write: reg 0x9 value 0xdd
>  max310x spi1.2: PWNED
>  CPU: 1 PID: 26 Comm: kworker/1:1 Not tainted 6.5.9-7-g9e090fe75fd8 #7
>  Hardware name: Marvell Armada 380/385 (Device Tree)
>  Workqueue: events max310x_tx_proc
>   unwind_backtrace from show_stack+0x10/0x14
>   show_stack from dump_stack_lvl+0x40/0x4c
>   dump_stack_lvl from regcache_write+0xc0/0xc4
>   regcache_write from _regmap_raw_write_impl+0x178/0x828
>   _regmap_raw_write_impl from _regmap_raw_write+0xb8/0x134
>   _regmap_raw_write from regmap_noinc_write+0x130/0x178
>   regmap_noinc_write from max310x_tx_proc+0xd4/0x1a4
>   max310x_tx_proc from process_one_work+0x21c/0x4e4
>   process_one_work from worker_thread+0x50/0x54c
>   worker_thread from kthread+0xe0/0xfc
>   kthread from ret_from_fork+0x14/0x28
> 
> Clearly, regmap_noinc_write of a register 0x00 (that's the TX FIFO on
> this chip) has no business updating register 0x09, but that's what was
> happening here. The regmap_config is already set up in a way that
> register 0x00 is marked precious and volatile, so it has no business
> going through the cache at all. Also, the documentation for
> regmap_noinc_write suggests that this driver was using the regmap
> infrastructure correctly, and that the real bug is somewhere in
> regmap/regcache where a call to regmap_noinc_write end up updating an
> unrelated register in regcache.

Hi Jan,
it is funny, as I am preparing to send a patch for the sc16is7xx driver
to convert FIFO R/W to use the _noinc_ versions of regmap functions,
inspired by your patch 3f42b142ea11 ("serial: max310x: fix IO data
corruption in batched operations").

I am testing on a custom board with two SC16IS752 in SPI mode.

Here is our current FIFO write code:

  regcache_cache_bypass(one->regmap, true);
  regmap_raw_write(one->regmap, SC16IS7XX_THR_REG, s->buf, to_send);
  regcache_cache_bypass(one->regmap, false);

I am converting it to _noinc_ version to be able to remove the manual
(and ugly) cache control workaround to this:

  regmap_noinc_write(one->regmap, SC16IS7XX_THR_REG, s->buf, to_send);

SC16IS7XX_THR_REG is already in precious and volatile, and I also
have put it in the noinc list.

To confirm that this works ok, I have put debug traces in some regmap
functions, and escpecially a trace in regcache_write() to indicate if
regmap is caching or not the register.

Here is an example when writing 01234567890123456789 (20 bytes) to the
Tx FIFO:

sc16is7xx spi0.0: sc16is7xx_tx_proc(): entry
sc16is7xx spi0.0: sc16is7xx_handle_tx(): entry
sc16is7xx spi0.0: regcache_read() not caching volatile reg $08
spi0.0-port0: regmap_read:  [08] 40
sc16is7xx spi0.0: regcache_write() not caching volatile reg $08
sc16is7xx spi0.0: _regmap_raw_write_impl() reg     = $00
sc16is7xx spi0.0: _regmap_raw_write_impl() val_len = 20
sc16is7xx spi0.0: regcache_write() not caching volatile reg $00
spi0.0-port0: regmap_write: [00] 30 31 32 33 34 35 36 37 38 39...
spi0.0-port0: regmap_write: [00] 36 37 38 39
...

With this I have confirmed that regmap _noinc_ works as intended, with
regcache_write() indicating it is not caching the volatile register 00
(THR).

I hope this can help you with your investigation, let me know if I can
help more.

Hugo Villeneuve.


 
> Until regmap/regcache is fixed, let's just use an adapted version of the
> old code that bypasses regmap altogether, and just sends out an SPI
> transaction.
> 
> This is related to my commit 3f42b142ea1171967e40e10e4b0241c0d6d28d41
> ("serial: max310x: fix IO data corruption in batched operations") which
> introduced usage of regmap_noinc_write() to this driver. That commit is
> a fixup of commit 285e76fc049c4d32c772eea9460a7ef28a193802 ("serial:
> max310x: use regmap methods for SPI batch operations") which started
> using regmap_raw_write(), which was however also a wrong function.
> 
> Fixes: 3f42b142ea11 ("serial: max310x: fix IO data corruption in batched operations")
> Fixes: 285e76fc049c ("serial: max310x: use regmap methods for SPI batch operations")
> Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
> To: Mark Brown <broonie@kernel.org>
> To: Cosmin Tanislav <cosmin.tanislav@analog.com>
> To: linux-serial@vger.kernel.org
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: linux-kernel@vger.kernel.org
> ---
>  drivers/tty/serial/max310x.c | 30 ++++++++++++++++++++++++------
>  1 file changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/tty/serial/max310x.c b/drivers/tty/serial/max310x.c
> index c44237470bee..79797b573723 100644
> --- a/drivers/tty/serial/max310x.c
> +++ b/drivers/tty/serial/max310x.c
> @@ -663,16 +663,34 @@ static u32 max310x_set_ref_clk(struct device *dev, struct max310x_port *s,
>  
>  static void max310x_batch_write(struct uart_port *port, u8 *txbuf, unsigned int len)
>  {
> -	struct max310x_one *one = to_max310x_port(port);
> -
> -	regmap_noinc_write(one->regmap, MAX310X_THR_REG, txbuf, len);
> +	const u8 header = (port->iobase * 0x20 + MAX310X_THR_REG) | MAX310X_WRITE_BIT;
> +	struct spi_transfer xfer[] = {
> +		{
> +			.tx_buf = &header,
> +			.len = 1,
> +		},
> +		{
> +			.tx_buf = txbuf,
> +			.len = len,
> +		},
> +	};
> +	spi_sync_transfer(to_spi_device(port->dev), xfer, ARRAY_SIZE(xfer));
>  }
>  
>  static void max310x_batch_read(struct uart_port *port, u8 *rxbuf, unsigned int len)
>  {
> -	struct max310x_one *one = to_max310x_port(port);
> -
> -	regmap_noinc_read(one->regmap, MAX310X_RHR_REG, rxbuf, len);
> +	const u8 header = port->iobase * 0x20 + MAX310X_RHR_REG;
> +	struct spi_transfer xfer[] = {
> +		{
> +			.tx_buf = &header,
> +			.len = 1,
> +		},
> +		{
> +			.rx_buf = rxbuf,
> +			.len = len,
> +		},
> +	};
> +	spi_sync_transfer(to_spi_device(port->dev), xfer, ARRAY_SIZE(xfer));
>  }
>  
>  static void max310x_handle_rx(struct uart_port *port, unsigned int rxlen)
> -- 
> 2.42.0
> 
> 
>
Hugo Villeneuve Dec. 1, 2023, 9:40 p.m. UTC | #2
On Fri, 1 Dec 2023 13:27:36 -0500
Hugo Villeneuve <hugo@hugovil.com> wrote:

> On Fri, 1 Dec 2023 15:51:51 +0100
> Jan Kundrát <jan.kundrat@cesnet.cz> wrote:
> 
> > The TL;DR summary is that the regmap_noinc_write spills over the data
> > that are correctly written to the HW also to the following registers in
> > the regcache. As a result, regcache then contains user-controlled
> > garbage which will be used later for bit updates on unrelated registers.
> > 
> > This patch is a "wrong" fix; a real fix would involve fixing regmap
> > and/or regcache, but that code has too many indirections for my little
> > mind.
> > 
> > I was investigating a regression that happened somewhere between 5.12.4
> > (plus 14 of our patches) and v6.5.9 (plus 7 of our patches). Our
> > MAX14830 UART would work fine the first time, but when our application
> > opens the UART the second time it just wouldn't send anything over the
> > physical TX pin. With the help of a logical analyzer, I found out that
> > the kernel was sending value 0xcd to the MODE1 register, which on this
> > chip is a request to set the UART's TX pin to the Hi-Z mode and to
> > switch off RX completely. That's certainly not the intention of the
> > code, but that's what I was seeing on the physical SPI bus, and also in
> > the log when I instrumented the regmap layer.
> > 
> > It turned out that one of the *data* bytes which were sent over the UART
> > was 0xdd, and that this *data byte* somehow ended up in the regcache's
> > idea about the value within the MODE1 register. When the UART is opened
> > up the next time and max310x_startup updates a single unrelated bit in
> > MODE1, that code consults the regcache, notices the 0xdd data byte in
> > there, and ends up sending 0xcd over SPI.
> > 
> > Here's what dump_stack() shows:
> > 
> >  max310x spi1.2: regcache_write: reg 0x9 value 0xdd
> >  max310x spi1.2: PWNED
> >  CPU: 1 PID: 26 Comm: kworker/1:1 Not tainted 6.5.9-7-g9e090fe75fd8 #7
> >  Hardware name: Marvell Armada 380/385 (Device Tree)
> >  Workqueue: events max310x_tx_proc
> >   unwind_backtrace from show_stack+0x10/0x14
> >   show_stack from dump_stack_lvl+0x40/0x4c
> >   dump_stack_lvl from regcache_write+0xc0/0xc4
> >   regcache_write from _regmap_raw_write_impl+0x178/0x828
> >   _regmap_raw_write_impl from _regmap_raw_write+0xb8/0x134
> >   _regmap_raw_write from regmap_noinc_write+0x130/0x178
> >   regmap_noinc_write from max310x_tx_proc+0xd4/0x1a4
> >   max310x_tx_proc from process_one_work+0x21c/0x4e4
> >   process_one_work from worker_thread+0x50/0x54c
> >   worker_thread from kthread+0xe0/0xfc
> >   kthread from ret_from_fork+0x14/0x28
> > 
> > Clearly, regmap_noinc_write of a register 0x00 (that's the TX FIFO on
> > this chip) has no business updating register 0x09, but that's what was
> > happening here. The regmap_config is already set up in a way that
> > register 0x00 is marked precious and volatile, so it has no business
> > going through the cache at all. Also, the documentation for
> > regmap_noinc_write suggests that this driver was using the regmap
> > infrastructure correctly, and that the real bug is somewhere in
> > regmap/regcache where a call to regmap_noinc_write end up updating an
> > unrelated register in regcache.
> 
> Hi Jan,
> it is funny, as I am preparing to send a patch for the sc16is7xx driver
> to convert FIFO R/W to use the _noinc_ versions of regmap functions,
> inspired by your patch 3f42b142ea11 ("serial: max310x: fix IO data
> corruption in batched operations").
> 
> I am testing on a custom board with two SC16IS752 in SPI mode.

Hi,
I ran the tests on Greg's tty-next tree.

Hugo Villeneuve.


> Here is our current FIFO write code:
> 
>   regcache_cache_bypass(one->regmap, true);
>   regmap_raw_write(one->regmap, SC16IS7XX_THR_REG, s->buf, to_send);
>   regcache_cache_bypass(one->regmap, false);
> 
> I am converting it to _noinc_ version to be able to remove the manual
> (and ugly) cache control workaround to this:
> 
>   regmap_noinc_write(one->regmap, SC16IS7XX_THR_REG, s->buf, to_send);
> 
> SC16IS7XX_THR_REG is already in precious and volatile, and I also
> have put it in the noinc list.
> 
> To confirm that this works ok, I have put debug traces in some regmap
> functions, and escpecially a trace in regcache_write() to indicate if
> regmap is caching or not the register.
> 
> Here is an example when writing 01234567890123456789 (20 bytes) to the
> Tx FIFO:
> 
> sc16is7xx spi0.0: sc16is7xx_tx_proc(): entry
> sc16is7xx spi0.0: sc16is7xx_handle_tx(): entry
> sc16is7xx spi0.0: regcache_read() not caching volatile reg $08
> spi0.0-port0: regmap_read:  [08] 40
> sc16is7xx spi0.0: regcache_write() not caching volatile reg $08
> sc16is7xx spi0.0: _regmap_raw_write_impl() reg     = $00
> sc16is7xx spi0.0: _regmap_raw_write_impl() val_len = 20
> sc16is7xx spi0.0: regcache_write() not caching volatile reg $00
> spi0.0-port0: regmap_write: [00] 30 31 32 33 34 35 36 37 38 39...
> spi0.0-port0: regmap_write: [00] 36 37 38 39
> ...
> 
> With this I have confirmed that regmap _noinc_ works as intended, with
> regcache_write() indicating it is not caching the volatile register 00
> (THR).
> 
> I hope this can help you with your investigation, let me know if I can
> help more.
> 
> Hugo Villeneuve.
> 
> 
>  
> > Until regmap/regcache is fixed, let's just use an adapted version of the
> > old code that bypasses regmap altogether, and just sends out an SPI
> > transaction.
> > 
> > This is related to my commit 3f42b142ea1171967e40e10e4b0241c0d6d28d41
> > ("serial: max310x: fix IO data corruption in batched operations") which
> > introduced usage of regmap_noinc_write() to this driver. That commit is
> > a fixup of commit 285e76fc049c4d32c772eea9460a7ef28a193802 ("serial:
> > max310x: use regmap methods for SPI batch operations") which started
> > using regmap_raw_write(), which was however also a wrong function.
> > 
> > Fixes: 3f42b142ea11 ("serial: max310x: fix IO data corruption in batched operations")
> > Fixes: 285e76fc049c ("serial: max310x: use regmap methods for SPI batch operations")
> > Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
> > To: Mark Brown <broonie@kernel.org>
> > To: Cosmin Tanislav <cosmin.tanislav@analog.com>
> > To: linux-serial@vger.kernel.org
> > Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> > Cc: linux-kernel@vger.kernel.org
> > ---
> >  drivers/tty/serial/max310x.c | 30 ++++++++++++++++++++++++------
> >  1 file changed, 24 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/tty/serial/max310x.c b/drivers/tty/serial/max310x.c
> > index c44237470bee..79797b573723 100644
> > --- a/drivers/tty/serial/max310x.c
> > +++ b/drivers/tty/serial/max310x.c
> > @@ -663,16 +663,34 @@ static u32 max310x_set_ref_clk(struct device *dev, struct max310x_port *s,
> >  
> >  static void max310x_batch_write(struct uart_port *port, u8 *txbuf, unsigned int len)
> >  {
> > -	struct max310x_one *one = to_max310x_port(port);
> > -
> > -	regmap_noinc_write(one->regmap, MAX310X_THR_REG, txbuf, len);
> > +	const u8 header = (port->iobase * 0x20 + MAX310X_THR_REG) | MAX310X_WRITE_BIT;
> > +	struct spi_transfer xfer[] = {
> > +		{
> > +			.tx_buf = &header,
> > +			.len = 1,
> > +		},
> > +		{
> > +			.tx_buf = txbuf,
> > +			.len = len,
> > +		},
> > +	};
> > +	spi_sync_transfer(to_spi_device(port->dev), xfer, ARRAY_SIZE(xfer));
> >  }
> >  
> >  static void max310x_batch_read(struct uart_port *port, u8 *rxbuf, unsigned int len)
> >  {
> > -	struct max310x_one *one = to_max310x_port(port);
> > -
> > -	regmap_noinc_read(one->regmap, MAX310X_RHR_REG, rxbuf, len);
> > +	const u8 header = port->iobase * 0x20 + MAX310X_RHR_REG;
> > +	struct spi_transfer xfer[] = {
> > +		{
> > +			.tx_buf = &header,
> > +			.len = 1,
> > +		},
> > +		{
> > +			.rx_buf = rxbuf,
> > +			.len = len,
> > +		},
> > +	};
> > +	spi_sync_transfer(to_spi_device(port->dev), xfer, ARRAY_SIZE(xfer));
> >  }
> >  
> >  static void max310x_handle_rx(struct uart_port *port, unsigned int rxlen)
> > -- 
> > 2.42.0
> > 
> > 
> > 
>
Hugo Villeneuve Dec. 1, 2023, 10:16 p.m. UTC | #3
On Fri, 1 Dec 2023 21:41:48 +0000
Mark Brown <broonie@kernel.org> wrote:

> On Fri, Dec 01, 2023 at 04:38:46PM -0500, Hugo Villeneuve wrote:
> > Mark Brown <broonie@kernel.org> wrote:
> 
> > > If you're working on that driver it'd also be good to update the current
> > > use of cache bypass for the enhanced features/interrupt identification
> > > register (and anything else in there, that did seem to be the only one)
> > > to use regmap ranges instead - that'd remove the need for the efr_lock
> > > and be a much more sensible/idiomatic use of the regmap APIs.
> 
> > I will also look to remove the efr_lock, altough it has more
> > implications since this ship has some registers that share a common
> > address, and selected by bits in other registers, and I think this
> > is why there is this efr_lock.
> 
> Right, the registers sharing a common address with the register selected
> by bits in another register is what regmap ranges support - the less
> creative use of this is banked blocks of registers with a selector
> register which picks which page bank is in use, that's moderately common
> especially for TI.

Hi Mark,
thanks for the info, I was not aware of that, and will look into it.

Hugo Villeneuve.
Mark Brown Dec. 4, 2023, 4:35 p.m. UTC | #4
On Mon, Dec 04, 2023 at 11:29:05AM -0500, Hugo Villeneuve wrote:

> Do you have an example of a driver which is using regmap ranges like it
> should be done in this driver, that is using the exact same address for
> two or more registers? I found an example, but it doesn't seem
> applicable to the sc16is7xx driver because the two registers do not
> share a common address, for example they have addresses like 0x01 and
> 0x81, even though with the proper page selection, they finally map to
> address 0x01.

I don't understand what you mean here - you say that the addresses both
have addresses 0x1 and 0x81 but map to address 0x1.  What does the 0x81
refer to?  The comments in the driver seemed to indicate that there was
a single address which mapped to multiple underlying registers...

Searching for struct regmap_range_cfg should show a lot of users in
mainline.
Hugo Villeneuve Dec. 4, 2023, 5:01 p.m. UTC | #5
On Mon, 4 Dec 2023 16:35:30 +0000
Mark Brown <broonie@kernel.org> wrote:

> On Mon, Dec 04, 2023 at 11:29:05AM -0500, Hugo Villeneuve wrote:
> 
> > Do you have an example of a driver which is using regmap ranges like it
> > should be done in this driver, that is using the exact same address for
> > two or more registers? I found an example, but it doesn't seem
> > applicable to the sc16is7xx driver because the two registers do not
> > share a common address, for example they have addresses like 0x01 and
> > 0x81, even though with the proper page selection, they finally map to
> > address 0x01.
> 
> I don't understand what you mean here - you say that the addresses both
> have addresses 0x1 and 0x81 but map to address 0x1.  What does the 0x81
> refer to?  The comments in the driver seemed to indicate that there was
> a single address which mapped to multiple underlying registers...

Hi,
I was referring to an example in da9063-i2c.c where they have
these two registers:

#define	DA9063_REG_STATUS_A		0x01
#define	DA9063_REG_SEQ			0x81

To access one or the other, you must select page 0 or 1 in page config
selection register at address 0x00. It makes sense to me for this case.

But for the sc16is7xx, for example you have these two
independent registers, sharing the exact same address:

#define SC16IS7XX_IIR_REG		(0x02) /* Interrupt Identification */
#define SC16IS7XX_FCR_REG		(0x02) /* FIFO control */

I am not sure if regmap range can be used with this configuration.
Assuming regmap range would be properly setup, when we call
regmap_read(regmap, SC16IS7XX_IIR_REG, &val), how does regmap would
know that we want to access SC16IS7XX_IIR_REG and not SC16IS7XX_FCR_REG?

> Searching for struct regmap_range_cfg should show a lot of users in
> mainline.

Yes, I am trying to find a good example but I must download and read the
datasheet for each one. If you could point to an IC/driver that uses
regmap_range similar to IC sc16is7xx, it would really help.

Thank you
Hugo Villeneuve
Hugo Villeneuve Dec. 4, 2023, 7:41 p.m. UTC | #6
On Mon, 4 Dec 2023 19:16:57 +0000
Mark Brown <broonie@kernel.org> wrote:

> On Mon, Dec 04, 2023 at 01:59:22PM -0500, Hugo Villeneuve wrote:
> 
> > Based on tas2770.c, I am assuming I could redefine the code to look
> > like this:
> 
> > /* SC16IS7XX register definitions */
> > #define SC16IS7XX_REG(page, reg)        ((page * 128) + reg)
> > 
> > #define SC16IS7XX_RHR_REG	SC16IS7XX_REG(0, 0x00) /* RX FIFO */
> > #define SC16IS7XX_THR_REG	SC16IS7XX_REG(0, 0x00) /* TX FIFO */
> > #define SC16IS7XX_IER_REG	SC16IS7XX_REG(0, 0x01) /* Interrupt enable */
> > #define SC16IS7XX_IIR_REG	SC16IS7XX_REG(0, 0x02) /* Interrupt Identification (read) */
> > #define SC16IS7XX_FCR_REG	SC16IS7XX_REG(0, 0x02) /* FIFO control (write) */
> > #define SC16IS7XX_MCR_REG	SC16IS7XX_REG(0, 0x04) /* Modem Control */
> > #define SC16IS7XX_LSR_REG	SC16IS7XX_REG(0, 0x05) /* Line Status */
> > #define SC16IS7XX_MSR_REG	SC16IS7XX_REG(0, 0x06) /* Modem Status */
> > #define SC16IS7XX_SPR_REG	SC16IS7XX_REG(0, 0x07) /* Scratch Pad */
> > ...
> > #define SC16IS7XX_EFR_REG	SC16IS7XX_REG(1, 0x02) /* Enhanced Features */
> > #define SC16IS7XX_XON1_REG	SC16IS7XX_REG(1, 0x04) /* Xon1 word */
> > #define SC16IS7XX_XON2_REG	SC16IS7XX_REG(1, 0x05) /* Xon2 word */
> > #define SC16IS7XX_XOFF1_REG	SC16IS7XX_REG(1, 0x06) /* Xoff1 word */
> > #define SC16IS7XX_XOFF2_REG	SC16IS7XX_REG(1, 0x07) /* Xoff2 word */
> 
> > static const struct regmap_range_cfg sc16is7xx_regmap_ranges[] = {
> > 	{
> > 		.range_min = 0,
> > 		.range_max = 1 * 128,
> > 		.selector_reg = SC16IS7XX_LCR_REG,
> > 		.selector_mask = 0xff,
> > 		.selector_shift = 0,
> > 		.window_start = 0,
> > 		.window_len = 128,
> > 	},
> > };
> 
> That looks plausible - I'd tend to make the range just completely
> non-physical (eg, start at some unrepresentable value) so there's no
> possible ambiguity with a physical address.  I'm also not clear why
> you've made the window be 128 registers wide if it's just this range

I simply took what was in tas2770.c as a starting point.

> from 2-7 that gets switched around, I'd do something more like
> 
> #define SC16IS7XX_RANGE_BASE 0x1000
> #define SC16IS7XX_WINDOW_LEN 6
> #define SC16IS7XX_PAGED_REG(page, reg) \
> 	(SC16IS7XX_RANGE_BASE + (SC16IS7XX_WINDOW_LEN * page) + reg)
> 
> 	.range_min = SC16IS7XX_RANGE_BASE,
> 	.range_max = SC16IS7XX_RANGE_BASE + (2 * SC16IS7XX_WINDOW_LEN),
> 	.window_start = 2,
> 	.window_len = SC16IS7XX_WINDOW_LEN,
> 
> You could apply a - 2 offset to reg as well to make the defines for the
> registers clearer.  The above means that any untranslated register you
> see in I/O traces should be immediately obvious (and more likely to trip
> up error handling and tell you about it if it goes wrong) and hopefully
> also makes it easier to reason about what's going on.

Ok.

> > But here, selecting the proper "page" is not obvious,
> > because to select page 1, we need to write a fixed value of 0xBF to
> > the LCR register.
> 
> > And when selecting page 0, we must write the previous value that was
> > in LCR _before_ we made the switch to page 1...
> 
> > How do we do that?
> 
> That's one reason why having the range cover all the registers gets
> confusing.  That said the code does have special casing for a selector
> register in the middle of the range since there were some devices that
> just put the paging register in the middle of the range it controls for
> some innovative region, the core will notice that this is a selector
> register and not do any paging for that register.

You are talking about the selector being inside the range, and I
understand that because I have looked at _regmap_select_page()
comments and logic.

But that is not was my question was about. Here a pseudo code
example to select "page" 1:

1. save original value of LCR register.
2. write 0xBF to LCR register
3. access desired register in page 1
4. restore original LCR value saved in step 1

How do you do that with regmap range?

Hugo.
Hugo Villeneuve Dec. 4, 2023, 8:02 p.m. UTC | #7
On Mon, 4 Dec 2023 19:48:05 +0000
Mark Brown <broonie@kernel.org> wrote:

> On Mon, Dec 04, 2023 at 02:41:36PM -0500, Hugo Villeneuve wrote:
> 
> > But that is not was my question was about. Here a pseudo code
> > example to select "page" 1:
> 
> > 1. save original value of LCR register.
> > 2. write 0xBF to LCR register
> > 3. access desired register in page 1
> > 4. restore original LCR value saved in step 1
> 
> > How do you do that with regmap range?
> 
> Are you saying that the selector has other, non-selector functions?

Yes! There is no bit or bit range in that register that is used to
select a praticular set of registers or "page". It is only when
writing the special magic value of $BF that the IC switches to "page"
1. And if the value is NOT $BF, then it switches back to "page" 0.

When I told you if you could point to other IC/drivers that had the
same configuration, I tough you were aware of this. That explains some
of the confusion.

> This is truly innovative hardware,...

Well, I would not say innovative, but more "crappy" hardware design :)

> ...generally the selector is just a 
> bitfield that you write paging values to.

This is also what I am accustomed to normally.

>  You'd need to extend the core
> so that it knows about this quirk, right now that's not possible and
> we'll just leave the window pointing at whatever was last accessed.

Ok. I am not sure that adding support for it would make sense, since I
do not know of other ICs that could reuse this very specific and
particular method for switching "paged" registers.

Thank you,
Hugo Villeneuve
Mark Brown Dec. 4, 2023, 10:09 p.m. UTC | #8
On Mon, Dec 04, 2023 at 03:02:24PM -0500, Hugo Villeneuve wrote:
> Mark Brown <broonie@kernel.org> wrote:

> > This is truly innovative hardware,...

> Well, I would not say innovative, but more "crappy" hardware design :)

I didn't say it was *good* innovation.

> >  You'd need to extend the core
> > so that it knows about this quirk, right now that's not possible and
> > we'll just leave the window pointing at whatever was last accessed.

> Ok. I am not sure that adding support for it would make sense, since I
> do not know of other ICs that could reuse this very specific and
> particular method for switching "paged" registers.

Yeah, I'm drawing a blank there.  The thing that springs to mind is
optimisation with wanting to always be on a particular page for fast
interrupt handling or something but that feels rather thin.
Hugo Villeneuve Dec. 5, 2023, 3:52 p.m. UTC | #9
On Fri, 1 Dec 2023 18:34:38 +0000
Mark Brown <broonie@kernel.org> wrote:

> On Fri, Dec 01, 2023 at 01:27:36PM -0500, Hugo Villeneuve wrote:
> 
> > it is funny, as I am preparing to send a patch for the sc16is7xx driver
> > to convert FIFO R/W to use the _noinc_ versions of regmap functions,
> > inspired by your patch 3f42b142ea11 ("serial: max310x: fix IO data
> > corruption in batched operations").
> 
> If you're working on that driver it'd also be good to update the current
> use of cache bypass for the enhanced features/interrupt identification
> register (and anything else in there, that did seem to be the only one)
> to use regmap ranges instead - that'd remove the need for the efr_lock
> and be a much more sensible/idiomatic use of the regmap APIs.

Hi Mark,
after our discussion about regmap range, it seems that the
efr_lock will need to stay.

In fact, all of this helped me to uncover another case where an
additional lock would be needed.

Hugo Villeneuve
diff mbox series

Patch

diff --git a/drivers/tty/serial/max310x.c b/drivers/tty/serial/max310x.c
index c44237470bee..79797b573723 100644
--- a/drivers/tty/serial/max310x.c
+++ b/drivers/tty/serial/max310x.c
@@ -663,16 +663,34 @@  static u32 max310x_set_ref_clk(struct device *dev, struct max310x_port *s,
 
 static void max310x_batch_write(struct uart_port *port, u8 *txbuf, unsigned int len)
 {
-	struct max310x_one *one = to_max310x_port(port);
-
-	regmap_noinc_write(one->regmap, MAX310X_THR_REG, txbuf, len);
+	const u8 header = (port->iobase * 0x20 + MAX310X_THR_REG) | MAX310X_WRITE_BIT;
+	struct spi_transfer xfer[] = {
+		{
+			.tx_buf = &header,
+			.len = 1,
+		},
+		{
+			.tx_buf = txbuf,
+			.len = len,
+		},
+	};
+	spi_sync_transfer(to_spi_device(port->dev), xfer, ARRAY_SIZE(xfer));
 }
 
 static void max310x_batch_read(struct uart_port *port, u8 *rxbuf, unsigned int len)
 {
-	struct max310x_one *one = to_max310x_port(port);
-
-	regmap_noinc_read(one->regmap, MAX310X_RHR_REG, rxbuf, len);
+	const u8 header = port->iobase * 0x20 + MAX310X_RHR_REG;
+	struct spi_transfer xfer[] = {
+		{
+			.tx_buf = &header,
+			.len = 1,
+		},
+		{
+			.rx_buf = rxbuf,
+			.len = len,
+		},
+	};
+	spi_sync_transfer(to_spi_device(port->dev), xfer, ARRAY_SIZE(xfer));
 }
 
 static void max310x_handle_rx(struct uart_port *port, unsigned int rxlen)