diff mbox series

[6/8] serial: qcom-geni: fix console corruption

Message ID 20240902152451.862-7-johan+linaro@kernel.org
State Superseded
Headers show
Series [1/8] serial: qcom-geni: fix fifo polling timeout | expand

Commit Message

Johan Hovold Sept. 2, 2024, 3:24 p.m. UTC
The Qualcomm serial console implementation is broken and can lose
characters when the serial port is also used for tty output.

Specifically, the console code only waits for the current tx command to
complete when all data has already been written to the fifo. When there
are on-going longer transfers this often means that console output is
lost when the console code inadvertently "hijacks" the current tx
command instead of starting a new one.

This can, for example, be observed during boot when console output that
should have been interspersed with init output is truncated:

	[    9.462317] qcom-snps-eusb2-hsphy fde000.phy: Registered Qcom-eUSB2 phy
	[  OK  ] Found device KBG50ZNS256G KIOXIA Wi[    9.471743ndows.
	[    9.539915] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller

Add a new state variable to track how much data has been written to the
fifo and use it to determine when the fifo and shift register are both
empty. This is needed since there is currently no other known way to
determine when the shift register is empty.

This in turn allows the console code to interrupt long transfers without
losing data.

Note that the oops-in-progress case is similarly broken as it does not
cancel any active command and also waits for the wrong status flag when
attempting to drain the fifo (TX_FIFO_NOT_EMPTY_EN is only set when
cancelling a command leaves data in the fifo).

Fixes: c4f528795d1a ("tty: serial: msm_geni_serial: Add serial driver support for GENI based QUP")
Fixes: a1fee899e5be ("tty: serial: qcom_geni_serial: Fix softlock")
Fixes: 9e957a155005 ("serial: qcom-geni: Don't cancel/abort if we can't get the port lock")
Cc: stable@vger.kernel.org	# 4.17
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
---
 drivers/tty/serial/qcom_geni_serial.c | 48 ++++++++++++++-------------
 1 file changed, 25 insertions(+), 23 deletions(-)

Comments

Douglas Anderson Sept. 4, 2024, 9:51 p.m. UTC | #1
Hi,

On Mon, Sep 2, 2024 at 8:26 AM Johan Hovold <johan+linaro@kernel.org> wrote:
>
> +static void qcom_geni_serial_drain_fifo(struct uart_port *uport)
> +{
> +       struct qcom_geni_serial_port *port = to_dev_port(uport);
> +
> +       if (!qcom_geni_serial_main_active(uport))
> +               return;

It seems like all callers already do the check and only ever call you
if the port is active. Do you really need to re-check?


> @@ -308,6 +311,17 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
>         return qcom_geni_serial_poll_bitfield(uport, offset, field, set ? field : 0);
>  }
>
> +static void qcom_geni_serial_drain_fifo(struct uart_port *uport)
> +{
> +       struct qcom_geni_serial_port *port = to_dev_port(uport);
> +
> +       if (!qcom_geni_serial_main_active(uport))
> +               return;
> +
> +       qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GP_LENGTH,
> +                       port->tx_queued);

nit: indent "port->tx_queued" to match open parenthesis?

...also: as the kernel test robot reported, w/ certain CONFIGs this is
defined / not used.

Aside from the nit / robot issue, this solution looks reasonable to
me. It's been long enough that I've already paged out much of the past
digging I did into this driver, but this seems like it should work.
Feel free to add my Reviewed-by when the robot issue is fixed.



-Doug
Johan Hovold Sept. 5, 2024, 8:57 a.m. UTC | #2
On Wed, Sep 04, 2024 at 02:51:15PM -0700, Doug Anderson wrote:
> On Mon, Sep 2, 2024 at 8:26 AM Johan Hovold <johan+linaro@kernel.org> wrote:
> >
> > +static void qcom_geni_serial_drain_fifo(struct uart_port *uport)
> > +{
> > +       struct qcom_geni_serial_port *port = to_dev_port(uport);
> > +
> > +       if (!qcom_geni_serial_main_active(uport))
> > +               return;
> 
> It seems like all callers already do the check and only ever call you
> if the port is active. Do you really need to re-check?

I wanted to make the helper self-contained and work in both cases. But
since I ended up only using this helper only in the console code and
will need to move it anyway (under the console ifdef), perhaps I can
consider dropping it. But then again, it's just one register read.

> > @@ -308,6 +311,17 @@ static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
> >         return qcom_geni_serial_poll_bitfield(uport, offset, field, set ? field : 0);
> >  }
> >
> > +static void qcom_geni_serial_drain_fifo(struct uart_port *uport)
> > +{
> > +       struct qcom_geni_serial_port *port = to_dev_port(uport);
> > +
> > +       if (!qcom_geni_serial_main_active(uport))
> > +               return;
> > +
> > +       qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GP_LENGTH,
> > +                       port->tx_queued);
> 
> nit: indent "port->tx_queued" to match open parenthesis?

No, I don't use open-parenthesis alignment unless that's the
(consistent) style of the code I'm changing (e.g. to avoid unnecessary
realignments when symbol names change and to make a point about
checkpatch --pedantic warnings not being part of the coding style).
 
> ...also: as the kernel test robot reported, w/ certain CONFIGs this is
> defined / not used.

Yes, I need to move the helper under the console ifdef. I was just
waiting to see if there was any further feedback before respinning.

> Aside from the nit / robot issue, this solution looks reasonable to
> me. It's been long enough that I've already paged out much of the past
> digging I did into this driver, but this seems like it should work.
> Feel free to add my Reviewed-by when the robot issue is fixed.

Thanks for reviewing.

Johan
diff mbox series

Patch

diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 7029c39a9a21..be620c5703f5 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -131,6 +131,7 @@  struct qcom_geni_serial_port {
 	bool brk;
 
 	unsigned int tx_remaining;
+	unsigned int tx_queued;
 	int wakeup_irq;
 	bool rx_tx_swap;
 	bool cts_rts_swap;
@@ -144,6 +145,8 @@  static const struct uart_ops qcom_geni_uart_pops;
 static struct uart_driver qcom_geni_console_driver;
 static struct uart_driver qcom_geni_uart_driver;
 
+static void qcom_geni_serial_cancel_tx_cmd(struct uart_port *uport);
+
 static inline struct qcom_geni_serial_port *to_dev_port(struct uart_port *uport)
 {
 	return container_of(uport, struct qcom_geni_serial_port, uport);
@@ -308,6 +311,17 @@  static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
 	return qcom_geni_serial_poll_bitfield(uport, offset, field, set ? field : 0);
 }
 
+static void qcom_geni_serial_drain_fifo(struct uart_port *uport)
+{
+	struct qcom_geni_serial_port *port = to_dev_port(uport);
+
+	if (!qcom_geni_serial_main_active(uport))
+		return;
+
+	qcom_geni_serial_poll_bitfield(uport, SE_GENI_M_GP_LENGTH, GP_LENGTH,
+			port->tx_queued);
+}
+
 static void qcom_geni_serial_setup_tx(struct uart_port *uport, u32 xmit_size)
 {
 	u32 m_cmd;
@@ -476,7 +490,6 @@  static void qcom_geni_serial_console_write(struct console *co, const char *s,
 	struct qcom_geni_serial_port *port;
 	bool locked = true;
 	unsigned long flags;
-	u32 geni_status;
 
 	WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);
 
@@ -490,34 +503,20 @@  static void qcom_geni_serial_console_write(struct console *co, const char *s,
 	else
 		uart_port_lock_irqsave(uport, &flags);
 
-	geni_status = readl(uport->membase + SE_GENI_STATUS);
+	if (qcom_geni_serial_main_active(uport)) {
+		/* Wait for completion or drain FIFO */
+		if (!locked || port->tx_remaining == 0)
+			qcom_geni_serial_poll_tx_done(uport);
+		else
+			qcom_geni_serial_drain_fifo(uport);
 
-	if (!locked) {
-		/*
-		 * We can only get here if an oops is in progress then we were
-		 * unable to get the lock. This means we can't safely access
-		 * our state variables like tx_remaining. About the best we
-		 * can do is wait for the FIFO to be empty before we start our
-		 * transfer, so we'll do that.
-		 */
-		qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
-					  M_TX_FIFO_NOT_EMPTY_EN, false);
-	} else if ((geni_status & M_GENI_CMD_ACTIVE) && !port->tx_remaining) {
-		/*
-		 * It seems we can't interrupt existing transfers if all data
-		 * has been sent, in which case we need to look for done first.
-		 */
-		qcom_geni_serial_poll_tx_done(uport);
+		qcom_geni_serial_cancel_tx_cmd(uport);
 	}
 
 	__qcom_geni_serial_console_write(uport, s, count);
 
-
-	if (locked) {
-		if (port->tx_remaining)
-			qcom_geni_serial_setup_tx(uport, port->tx_remaining);
+	if (locked)
 		uart_port_unlock_irqrestore(uport, flags);
-	}
 }
 
 static void handle_rx_console(struct uart_port *uport, u32 bytes, bool drop)
@@ -698,6 +697,7 @@  static void qcom_geni_serial_cancel_tx_cmd(struct uart_port *uport)
 	writel(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
 
 	port->tx_remaining = 0;
+	port->tx_queued = 0;
 }
 
 static void qcom_geni_serial_handle_rx_fifo(struct uart_port *uport, bool drop)
@@ -924,6 +924,7 @@  static void qcom_geni_serial_handle_tx_fifo(struct uart_port *uport,
 	if (!port->tx_remaining) {
 		qcom_geni_serial_setup_tx(uport, pending);
 		port->tx_remaining = pending;
+		port->tx_queued = 0;
 
 		irq_en = readl(uport->membase + SE_GENI_M_IRQ_EN);
 		if (!(irq_en & M_TX_FIFO_WATERMARK_EN))
@@ -932,6 +933,7 @@  static void qcom_geni_serial_handle_tx_fifo(struct uart_port *uport,
 	}
 
 	qcom_geni_serial_send_chunk_fifo(uport, chunk);
+	port->tx_queued += chunk;
 
 	/*
 	 * The tx fifo watermark is level triggered and latched. Though we had