Message ID | 20240519193109.122466-1-doug@schmorgal.com |
---|---|
State | New |
Headers | show |
Series | serial: pxa: Disable TX interrupt if there is no more data to transmit | expand |
On 19. 05. 24, 21:31, Doug Brown wrote: > If a TX interrupt occurs and no new data gets loaded into the TX FIFO, > the UART will never fire another TX interrupt until the UART_IER_THRI > flag is toggled off and back on. If nothing ever calls stop_tx(), this > effectively results in transmissions getting hung up until another > unrelated UART IRQ occurs, such as an RX interrupt. > > The driver used to do this correctly until the transition to > uart_port_tx_limited(). This didn't matter until the behavior of > __uart_port_tx changed in commit 7bfb915a597a ("serial: core: only stop > transmit when HW fifo is empty"). > > Fixes: d11cc8c3c4b6 ("tty: serial: use uart_port_tx_limited()") > Signed-off-by: Doug Brown <doug@schmorgal.com> > --- > > Note: I based this on v6.9 instead of tty-next since it's a fix; please > let me know if that was the wrong move and I would be happy to resubmit > it based on tty-next. The patch changes ever so slightly because of the > circ_buf -> kfifo transition. The only difference is it needs this > condition in the "if" statement instead: > > kfifo_is_empty(&up->port.state->port.xmit_fifo) > > This has been tested to work properly on tty-next as well. > > drivers/tty/serial/pxa.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/drivers/tty/serial/pxa.c b/drivers/tty/serial/pxa.c > index e395ff29c1a2..8abb85bee87c 100644 > --- a/drivers/tty/serial/pxa.c > +++ b/drivers/tty/serial/pxa.c > @@ -176,6 +176,14 @@ static void transmit_chars(struct uart_pxa_port *up) > { > u8 ch; > > + /* If there is nothing left to transmit, disable the TX interrupt. > + * Otherwise we can get stuck waiting for another IRQ that never happens. > + */ > + if (uart_circ_empty(&up->port.state->xmit)) { > + serial_pxa_stop_tx(&up->port); > + return; > + } This does not make sense. If the circ buf is empty, uart_port_tx_limited() should stop the TX already. You simply revert to the state before 7bfb915a597a, but on a per-driver basis. IOW all drivers using the helper would have exactly this issue after 7bfb915a597a. What driver was 7bfb915a597a about after all? The commit log of the commit is hopeless (it's very vague) in this respect: commit 7bfb915a597a301abb892f620fe5c283a9fdbd77 Author: Jonas Gorski <jonas.gorski@gmail.com> Date: Sun Mar 3 16:08:07 2024 +0100 serial: core: only stop transmit when HW fifo is empty If the circular buffer is empty, it just means we fit all characters to send into the HW fifo, but not that the hardware finished transmitting them. So if we immediately call stop_tx() after that, this may abort any pending characters in the HW fifo, and cause dropped characters on the console. Fix this by only stopping tx when the tx HW fifo is actually empty. Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers") (And it barely fixes 8275b48b2780 per se. 8275b48b2780 only moved the processing to a single place, most/all of the drivers already did it that way.) /me confused. Perhaps, it should be reverted and the affected driver should just pass UART_TX_NOSTOP instead? > uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2, > true, > serial_out(up, UART_TX, ch), thanks,
Hi On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote: > > On 19. 05. 24, 21:31, Doug Brown wrote: > > If a TX interrupt occurs and no new data gets loaded into the TX FIFO, > > the UART will never fire another TX interrupt until the UART_IER_THRI > > flag is toggled off and back on. If nothing ever calls stop_tx(), this > > effectively results in transmissions getting hung up until another > > unrelated UART IRQ occurs, such as an RX interrupt. > > > > The driver used to do this correctly until the transition to > > uart_port_tx_limited(). This didn't matter until the behavior of > > __uart_port_tx changed in commit 7bfb915a597a ("serial: core: only stop > > transmit when HW fifo is empty"). > > > > Fixes: d11cc8c3c4b6 ("tty: serial: use uart_port_tx_limited()") > > Signed-off-by: Doug Brown <doug@schmorgal.com> > > --- > > > > Note: I based this on v6.9 instead of tty-next since it's a fix; please > > let me know if that was the wrong move and I would be happy to resubmit > > it based on tty-next. The patch changes ever so slightly because of the > > circ_buf -> kfifo transition. The only difference is it needs this > > condition in the "if" statement instead: > > > > kfifo_is_empty(&up->port.state->port.xmit_fifo) > > > > This has been tested to work properly on tty-next as well. > > > > drivers/tty/serial/pxa.c | 8 ++++++++ > > 1 file changed, 8 insertions(+) > > > > diff --git a/drivers/tty/serial/pxa.c b/drivers/tty/serial/pxa.c > > index e395ff29c1a2..8abb85bee87c 100644 > > --- a/drivers/tty/serial/pxa.c > > +++ b/drivers/tty/serial/pxa.c > > @@ -176,6 +176,14 @@ static void transmit_chars(struct uart_pxa_port *up) > > { > > u8 ch; > > > > + /* If there is nothing left to transmit, disable the TX interrupt. > > + * Otherwise we can get stuck waiting for another IRQ that never happens. > > + */ > > + if (uart_circ_empty(&up->port.state->xmit)) { > > + serial_pxa_stop_tx(&up->port); > > + return; > > + } > > This does not make sense. If the circ buf is empty, > uart_port_tx_limited() should stop the TX already. You simply revert to > the state before 7bfb915a597a, but on a per-driver basis. > > IOW all drivers using the helper would have exactly this issue after > 7bfb915a597a. > > What driver was 7bfb915a597a about after all? The commit log of the > commit is hopeless (it's very vague) in this respect: > commit 7bfb915a597a301abb892f620fe5c283a9fdbd77 > Author: Jonas Gorski <jonas.gorski@gmail.com> > Date: Sun Mar 3 16:08:07 2024 +0100 > > serial: core: only stop transmit when HW fifo is empty > > If the circular buffer is empty, it just means we fit all characters to > send into the HW fifo, but not that the hardware finished transmitting > them. > > So if we immediately call stop_tx() after that, this may abort any > pending characters in the HW fifo, and cause dropped characters on the > console. > > Fix this by only stopping tx when the tx HW fifo is actually empty. > > Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers") > > > > (And it barely fixes 8275b48b2780 per se. 8275b48b2780 only moved the > processing to a single place, most/all of the drivers already did it > that way.) > > /me confused. > > Perhaps, it should be reverted and the affected driver should just pass > UART_TX_NOSTOP instead? > > > uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2, > > true, > > serial_out(up, UART_TX, ch), > > thanks, Sorry, I should have added that this is for bcm63xx_uart.c. My reasoning for doing it this way and not via a flag was: 1. The kernel doc for uart_ops::stop_tx() says "The driver should stop transmitting characters as soon as possible." 2. bcm63xx_uart.c's stop_tx() does exactly that, by immediately stopping tx (leading to potentially lost output). 3. Therefore uart_port_tx_limited() is wrong by calling stop_tx() while the transmitter is still busy, and the core issue is this. So adding the check for the hw fifo being empty seemed to me as the "right" fix, fixing it for all drivers that abort tx in stop_tx(). Best Regards, Jonas
Hi Jonas and Jiri, On 5/20/2024 2:30 AM, Jonas Gorski wrote: > Hi > > On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote: >> >> This does not make sense. If the circ buf is empty, >> uart_port_tx_limited() should stop the TX already. You simply revert to >> the state before 7bfb915a597a, but on a per-driver basis. >> >> IOW all drivers using the helper would have exactly this issue after >> 7bfb915a597a. I tested more, and it appears you are correct, Jiri. I think 7bfb915a597a has broken other drivers too. For some background on this, after upgrading to 6.9 I observed that transmits would stop working on my PXA168-based device with the pxa driver until I caused another UART IRQ to fire, for example by typing a character in the console. I bisected the problem to 7bfb915a597a. I don't have a bunch of hardware to test with, but I cobbled together a way to emulate a BeagleBoard in an old Linaro QEMU branch, and the omap- serial driver also appears to be affected. Shortly after boot, the UART just completely hangs and all transmits stop working. I can't even type a character to recover it like I can with the pxa driver. Reverting 7bfb915a597a fixes the issue. I think the reason the pxa driver recovers when I type a character is because it always checks the THRE bit of the LSR in the IRQ handler, even if the IIR doesn't say there's a TX interrupt ready. (This appears to be a workaround for an erratum in the PXA serial IP...) >> Perhaps, it should be reverted and the affected driver should just pass >> UART_TX_NOSTOP instead? >> >>> uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2, >>> true, >>> serial_out(up, UART_TX, ch), I can't speak for whether that would fix the issue Jonas originally saw, but it does look like it needs to be reverted because other drivers are also broken due to this change. > 1. The kernel doc for uart_ops::stop_tx() says "The driver should stop > transmitting characters as soon as possible." Very interesting. I guess the correct behavior depends on the exact interpretation of "stop transmitting characters as soon as possible". :-) It looks like, at least for the existing 16550-esque drivers that are using the helper, they are treating stop_tx() more as a signal to disable the TX interrupt, but still allow any characters already in the FIFO to go out. All I know is, in the PXA UART, I was observing the TX IRQ firing when we had no data to load in the FIFO, and we weren't disabling the interrupt. This causes a 16550A-style UART to get hung up because the TX interrupt never fires again unless you disable it and re-enable it. Doug
Hi Jonas, >> On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote: >>> >>> Perhaps, it should be reverted and the affected driver should just pass >>> UART_TX_NOSTOP instead? >>> >>>> uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2, >>>> true, >>>> serial_out(up, UART_TX, ch), Does the bcm63xx_uart driver work correctly if 7bfb915a597a is reverted and UART_TX_NOSTOP is used instead? I was able to further confirm on hardware (BeagleBoard) that omap-serial is also broken after 7bfb915a597a. I'm trying to figure out how to safely revert it while also allowing bcm63xx to continue to work properly. Thanks, Doug
Hi, On Sun, 2 Jun 2024 at 19:24, Doug Brown <doug@schmorgal.com> wrote: > > Hi Jonas, > > >> On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote: > >>> > >>> Perhaps, it should be reverted and the affected driver should just pass > >>> UART_TX_NOSTOP instead? > >>> > >>>> uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2, > >>>> true, > >>>> serial_out(up, UART_TX, ch), > Does the bcm63xx_uart driver work correctly if 7bfb915a597a is reverted > and UART_TX_NOSTOP is used instead? > > I was able to further confirm on hardware (BeagleBoard) that omap-serial > is also broken after 7bfb915a597a. I'm trying to figure out how to > safely revert it while also allowing bcm63xx to continue to work properly. So I went through all users of uart_port_tx(_limited()), and expectedly most do just disable the IRQ (or do nothing) in their stop_tx() call back. The notable exceptions I could identify are: - bcm63xx_uart: disables the emitter as well - mxs-uart: disables the emitter (introduced and uses UART_TX_NOSTOP) - sprd_serial: stops alls dma transfers as well - atmel_serial: stops dma and disables emitter as well I suspect sprd_serial and atmel_serial to not work properly with the old behavior, but I have no devices where I could test this. There are a few more exceptions in those that do not use uart_port_tx(_limited()), like I've seen drivers blocking in stop_tx() until the FIFO is empty. Not sure if that is expected. The quick solutions for bcm63xx_uart to either pass UART_TX_NOSTOP [1], or make stop_tx() (and start_tx()) not touch the emitter (or revert the uart_port_tx_limited() usage). Regardless of the chosen quick fix, it feels like stop_tx() is the wrong thing to use for uart_port_tx(), and just happened to do the right thing for the majority or drivers. But the correct function (something along "done submitting tx") also doesn't exist. But I'm no maintainer for tty/serial, nor do I have much (intimate) knowledge there. This is just from looking at it from the outside, and my thoughts may very well be wrong. [1] https://lore.kernel.org/lkml/20240225151426.1342285-1-jonas.gorski@gmail.com/ Best Regards, Jonas
Hi Jonas, On 6/4/2024 6:04 AM, Jonas Gorski wrote: > Hi, > > On Sun, 2 Jun 2024 at 19:24, Doug Brown <doug@schmorgal.com> wrote: >> >> Hi Jonas, >> >>>> On Mon, 20 May 2024 at 08:41, Jiri Slaby <jirislaby@kernel.org> wrote: >>>>> >>>>> Perhaps, it should be reverted and the affected driver should just pass >>>>> UART_TX_NOSTOP instead? >>>>> >>>>>> uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2, >>>>>> true, >>>>>> serial_out(up, UART_TX, ch), >> Does the bcm63xx_uart driver work correctly if 7bfb915a597a is reverted >> and UART_TX_NOSTOP is used instead? >> >> I was able to further confirm on hardware (BeagleBoard) that omap-serial >> is also broken after 7bfb915a597a. I'm trying to figure out how to >> safely revert it while also allowing bcm63xx to continue to work properly. > > So I went through all users of uart_port_tx(_limited()), and > expectedly most do just disable the IRQ (or do nothing) in their > stop_tx() call back. > > The notable exceptions I could identify are: > > - bcm63xx_uart: disables the emitter as well > - mxs-uart: disables the emitter (introduced and uses UART_TX_NOSTOP) > - sprd_serial: stops alls dma transfers as well > - atmel_serial: stops dma and disables emitter as well > > I suspect sprd_serial and atmel_serial to not work properly with the > old behavior, but I have no devices where I could test this. I see what you mean. Unfortunately I also don't have the ability to test either of those drivers. It looks like atmel_serial didn't previously have a stop_tx() anywhere in the code that was refactored to use uart_port_tx(), so I'm assuming it should also have UART_TX_NOSTOP added. I'm not confident enough to submit a patch for it though. sprd_serial prior to the uart_port_tx_limited() refactor was calling stop_tx() when uart_circ_empty() was true -- in fact it was almost identical to the pxa driver's code -- so I feel like any bug in there related to stopping too early would have predated any of these changes. > Regardless of the chosen quick fix, it feels like stop_tx() is the > wrong thing to use for uart_port_tx(), and just happened to do the > right thing for the majority or drivers. But the correct function > (something along "done submitting tx") also doesn't exist. That makes complete sense to me. > But I'm no maintainer for tty/serial, nor do I have much (intimate) > knowledge there. This is just from looking at it from the outside, and > my thoughts may very well be wrong. I'm in the same boat. I wouldn't feel comfortable submitting a large fix for something like that. There are so many hardware platforms that are using this code, it seems very intimidating to work on. Thanks for the analysis, and thanks for linking your previous patch! I'll work on submitting a quick fix for this at least. I think if we go back to the approach your earlier patchset used, it will fix the issue I'm seeing on omap-serial and pxa while also working properly on bcm63xx_uart. Doug
diff --git a/drivers/tty/serial/pxa.c b/drivers/tty/serial/pxa.c index e395ff29c1a2..8abb85bee87c 100644 --- a/drivers/tty/serial/pxa.c +++ b/drivers/tty/serial/pxa.c @@ -176,6 +176,14 @@ static void transmit_chars(struct uart_pxa_port *up) { u8 ch; + /* If there is nothing left to transmit, disable the TX interrupt. + * Otherwise we can get stuck waiting for another IRQ that never happens. + */ + if (uart_circ_empty(&up->port.state->xmit)) { + serial_pxa_stop_tx(&up->port); + return; + } + uart_port_tx_limited(&up->port, ch, up->port.fifosize / 2, true, serial_out(up, UART_TX, ch),
If a TX interrupt occurs and no new data gets loaded into the TX FIFO, the UART will never fire another TX interrupt until the UART_IER_THRI flag is toggled off and back on. If nothing ever calls stop_tx(), this effectively results in transmissions getting hung up until another unrelated UART IRQ occurs, such as an RX interrupt. The driver used to do this correctly until the transition to uart_port_tx_limited(). This didn't matter until the behavior of __uart_port_tx changed in commit 7bfb915a597a ("serial: core: only stop transmit when HW fifo is empty"). Fixes: d11cc8c3c4b6 ("tty: serial: use uart_port_tx_limited()") Signed-off-by: Doug Brown <doug@schmorgal.com> --- Note: I based this on v6.9 instead of tty-next since it's a fix; please let me know if that was the wrong move and I would be happy to resubmit it based on tty-next. The patch changes ever so slightly because of the circ_buf -> kfifo transition. The only difference is it needs this condition in the "if" statement instead: kfifo_is_empty(&up->port.state->port.xmit_fifo) This has been tested to work properly on tty-next as well. drivers/tty/serial/pxa.c | 8 ++++++++ 1 file changed, 8 insertions(+)