diff mbox series

serial: 8250: fix panic due to PSLVERR

Message ID 20250403090336.16643-1-cuiyunhui@bytedance.com
State New
Headers show
Series serial: 8250: fix panic due to PSLVERR | expand

Commit Message

yunhui cui April 3, 2025, 9:03 a.m. UTC
When the PSLVERR_RESP_EN parameter is set to 1, the device generates
an error response if an attempt is made to read an empty RBR (Receive
Buffer Register) while the FIFO is enabled.

In serial8250_do_startup, calling serial_port_out(port, UART_LCR,
UART_LCR_WLEN8) triggers dw8250_check_lcr(), which invokes
dw8250_force_idle() and serial8250_clear_and_reinit_fifos(). The latter
function enables the FIFO via serial_out(p, UART_FCR, p->fcr).
Execution proceeds to the dont_test_tx_en label:
...
serial_port_in(port, UART_RX);
This satisfies the PSLVERR trigger condition.

Because another CPU(e.g., using printk) is accessing the UART (UART
is busy), the current CPU fails the check (value & ~UART_LCR_SPAR) ==
(lcr & ~UART_LCR_SPAR), causing it to enter dw8250_force_idle().

To resolve this issue, relevant serial_port_out operations should be
placed in a critical section, and UART_RX data should only be read
when the UART_LSR DR bit is set.

Panic message:
[    0.442336] Oops - unknown exception [#1]
[    0.442337] Modules linked in:
[    0.442339] CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Tainted: G        W          6.12.13-00102-gf1f43e345877 #1
[    0.442342] Tainted: [W]=WARN
[    0.442343] epc : dw8250_serial_in32+0x1e/0x4a
[    0.442351]  ra : serial8250_do_startup+0x2c8/0x88e
[    0.442354] epc : ffffffff8064efca ra : ffffffff8064af28 sp : ffff8f8000103990
[    0.442355]  gp : ffffffff815bad28 tp : ffffaf807e36d400 t0 : ffffaf80804cf080
[    0.442356]  t1 : 0000000000000001 t2 : 0000000000000000 s0 : ffff8f80001039a0
[    0.442358]  s1 : ffffffff81626fc0 a0 : ffffffff81626fc0 a1 : 0000000000000000
[    0.442359]  a2 : 0000000000000000 a3 : 0000000000000000 a4 : ffffffff81626fc0
[    0.442360]  a5 : ffff8f800012d900 a6 : 000000000000000f a7 : 000000000fc648c1
[    0.442361]  s2 : 0000000000000000 s3 : 0000000200000022 s4 : 0000000000000000
[    0.442362]  s5 : ffffffff81626fc0 s6 : ffffaf8085227000 s7 : ffffffff81073c58
[    0.442363]  s8 : 0000000000500000 s9 : ffffaf80851a5a60 s10: ffffaf80851a5a60
[    0.442365]  s11: ffffffff80e85980 t3 : ffffaf807e324600 t4 : 0000000000000002
[    0.442365]  t5 : 0000000000000003 t6 : ffffaf80804cf072
[    0.442366] status: 0000000200000120 badaddr: 0000000000000000 cause: 0000000000000013
[    0.442368] [<ffffffff8064efca>] dw8250_serial_in32+0x1e/0x4a
[    0.442371] [<ffffffff8064af28>] serial8250_do_startup+0x2c8/0x88e
[    0.442373] [<ffffffff8064b514>] serial8250_startup+0x26/0x2e
[    0.442375] [<ffffffff806428a2>] uart_startup+0x13a/0x308
[    0.442377] [<ffffffff80642aa4>] uart_port_activate+0x34/0x50
[    0.442378] [<ffffffff8062ab6a>] tty_port_open+0xb4/0x110
[    0.442383] [<ffffffff8063f548>] uart_open+0x22/0x36
[    0.442389] [<ffffffff806234b4>] tty_open+0x1be/0x5e6
[    0.442396] [<ffffffff802f2d52>] chrdev_open+0x10a/0x2a8
[    0.442400] [<ffffffff802e7ab6>] do_dentry_open+0xf6/0x34e
[    0.442405] [<ffffffff802e9456>] vfs_open+0x2a/0xb4
[    0.442408] [<ffffffff80300124>] path_openat+0x676/0xf36
[    0.442410] [<ffffffff80300a58>] do_filp_open+0x74/0xfa
[    0.442412] [<ffffffff802e9900>] file_open_name+0x84/0x144
[    0.442414] [<ffffffff802e99f6>] filp_open+0x36/0x54
[    0.442416] [<ffffffff80a01232>] console_on_rootfs+0x26/0x70
[    0.442420] [<ffffffff80a0154e>] kernel_init_freeable+0x2d2/0x30e
[    0.442422] [<ffffffff8099c730>] kernel_init+0x2a/0x15e
[    0.442427] [<ffffffff809a7666>] ret_from_fork+0xe/0x1c
[    0.442430] Code: e022 e406 0800 4683 0c15 691c 872a 96bb 00d5 97b6 (439c) 851b
[    0.442432] ---[ end trace 0000000000000000 ]---
[    0.442434] Kernel panic - not syncing: Fatal exception in interrupt
[    0.442435] SMP: stopping secondary CPUs
[    0.451111] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
---
 drivers/tty/serial/8250/8250_port.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

Comments

Andy Shevchenko April 4, 2025, 10:58 a.m. UTC | #1
On Fri, Apr 04, 2025 at 10:44:09AM +0800, yunhui cui wrote:
> On Thu, Apr 3, 2025 at 7:36 PM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Thu, Apr 03, 2025 at 05:03:36PM +0800, Yunhui Cui wrote:

...

> > > To resolve this issue, relevant serial_port_out operations should be
> >
> > serial_port_out()
> 
> Okay.
> 
> >
> > > placed in a critical section, and UART_RX data should only be read
> > > when the UART_LSR DR bit is set.
> >
> > The last one is made in the common code, are you sure that all supported UARTs
> > will be okay with such a change?
> 
> This change enhances code robustness without being intrusive.

It is intrusive as it touches the core part affecting basically
_all_ of the 8250-based drivers.

Yes, it's small, but still it needs to be done carefully with commit message
pointing out to the other 8250 datasheets to show that this is _not_ DW
specific change.

...

> > > Panic message:
> >
> > Please, read this
> > https://www.kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages
> > and act accordingly.
> 
> Okay, I'll update the next version to follow the guideline: 'Avoid
> directly copying full dmesg output (e.g., timestamps, registers, and
> stack dumps); instead, extract the critical call chain.'

and make it short, e.g. ~3-5 lines only.
yunhui cui April 7, 2025, 1 p.m. UTC | #2
Hi Andy,

On Fri, Apr 4, 2025 at 6:21 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Fri, Apr 04, 2025 at 10:31:25AM +0800, yunhui cui wrote:
> > On Thu, Apr 3, 2025 at 7:50 PM Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> wrote:
> > > On Thu, Apr 03, 2025 at 02:36:16PM +0300, Andy Shevchenko wrote:
> > > > On Thu, Apr 03, 2025 at 05:03:36PM +0800, Yunhui Cui wrote:
> > >
> > > A couple of more questions here:
> > > 1) what is the DW IP version and where did you get the PSLVERR_RESP_EN
> > > parameter from?
> > > 2) what is the setting of the UART_16550_COMPATIBLE parameter?
> >
> > 1): Refer to: https://www.synopsys.com/dw/ipdir.php?c=DW_apb_uart
>
> I don't understand this. I asked about version of the IP, I have datasheets
> already for many of them, I can't find PSLVERR_RESP_EN there, that's why the
> question.

You can check the link:
https://iccircle.com/static/upload/img20240313113905.pdf for the
relevant introduction.

>
> > 2): data->uart_16550_compatible == 0
>
> Thanks!
>
> --
> With Best Regards,
> Andy Shevchenko
>
>

Thanks,
Yunhui
yunhui cui April 7, 2025, 1:22 p.m. UTC | #3
Hi Andy,


On Fri, Apr 4, 2025 at 6:58 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Fri, Apr 04, 2025 at 10:44:09AM +0800, yunhui cui wrote:
> > On Thu, Apr 3, 2025 at 7:36 PM Andy Shevchenko
> > <andriy.shevchenko@linux.intel.com> wrote:
> > > On Thu, Apr 03, 2025 at 05:03:36PM +0800, Yunhui Cui wrote:
>
> ...
>
> > > > To resolve this issue, relevant serial_port_out operations should be
> > >
> > > serial_port_out()
> >
> > Okay.
> >
> > >
> > > > placed in a critical section, and UART_RX data should only be read
> > > > when the UART_LSR DR bit is set.
> > >
> > > The last one is made in the common code, are you sure that all supported UARTs
> > > will be okay with such a change?
> >
> > This change enhances code robustness without being intrusive.
>
> It is intrusive as it touches the core part affecting basically
> _all_ of the 8250-based drivers.
>
> Yes, it's small, but still it needs to be done carefully with commit message
> pointing out to the other 8250 datasheets to show that this is _not_ DW
> specific change.

serial8250_clear_fifos is already part of the serial8250_do_startup
process. The purpose of adding it to the critical section is to
prevent the FIFO from being cleared while the UART is in use.

Similarly, serial_port_out(port, UART_LCR, UART_LCR_WLEN8);

It is a correct logic to check if the data is ready before reading,
which prevents the FIFO from being enabled by other CPUs before
executing serial_port_in(port, UART_RX).

>
> ...
>
> > > > Panic message:
> > >
> > > Please, read this
> > > https://www.kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages
> > > and act accordingly.
> >
> > Okay, I'll update the next version to follow the guideline: 'Avoid
> > directly copying full dmesg output (e.g., timestamps, registers, and
> > stack dumps); instead, extract the critical call chain.'
>
> and make it short, e.g. ~3-5 lines only.

Okay.

>
> --
> With Best Regards,
> Andy Shevchenko
>
>

Thanks,
Yunhui
Andy Shevchenko April 7, 2025, 2:34 p.m. UTC | #4
On Mon, Apr 07, 2025 at 09:00:22PM +0800, yunhui cui wrote:
> On Fri, Apr 4, 2025 at 6:21 PM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Fri, Apr 04, 2025 at 10:31:25AM +0800, yunhui cui wrote:
> > > On Thu, Apr 3, 2025 at 7:50 PM Andy Shevchenko
> > > <andriy.shevchenko@linux.intel.com> wrote:
> > > > On Thu, Apr 03, 2025 at 02:36:16PM +0300, Andy Shevchenko wrote:
> > > > > On Thu, Apr 03, 2025 at 05:03:36PM +0800, Yunhui Cui wrote:
> > > >
> > > > A couple of more questions here:
> > > > 1) what is the DW IP version and where did you get the PSLVERR_RESP_EN
> > > > parameter from?
> > > > 2) what is the setting of the UART_16550_COMPATIBLE parameter?
> > >
> > > 1): Refer to: https://www.synopsys.com/dw/ipdir.php?c=DW_apb_uart
> >
> > I don't understand this. I asked about version of the IP, I have datasheets
> > already for many of them, I can't find PSLVERR_RESP_EN there, that's why the
> > question.
> 
> You can check the link:
> https://iccircle.com/static/upload/img20240313113905.pdf for the
> relevant introduction.

This is helpful, thank you!
Andy Shevchenko April 7, 2025, 2:37 p.m. UTC | #5
On Mon, Apr 07, 2025 at 09:22:08PM +0800, yunhui cui wrote:
> On Fri, Apr 4, 2025 at 6:58 PM Andy Shevchenko
> <andriy.shevchenko@linux.intel.com> wrote:
> > On Fri, Apr 04, 2025 at 10:44:09AM +0800, yunhui cui wrote:
> > > On Thu, Apr 3, 2025 at 7:36 PM Andy Shevchenko
> > > <andriy.shevchenko@linux.intel.com> wrote:
> > > > On Thu, Apr 03, 2025 at 05:03:36PM +0800, Yunhui Cui wrote:

...

> > > > > To resolve this issue, relevant serial_port_out operations should be
> > > >
> > > > serial_port_out()
> > >
> > > Okay.
> > >
> > > > > placed in a critical section, and UART_RX data should only be read
> > > > > when the UART_LSR DR bit is set.
> > > >
> > > > The last one is made in the common code, are you sure that all supported UARTs
> > > > will be okay with such a change?
> > >
> > > This change enhances code robustness without being intrusive.
> >
> > It is intrusive as it touches the core part affecting basically
> > _all_ of the 8250-based drivers.
> >
> > Yes, it's small, but still it needs to be done carefully with commit message
> > pointing out to the other 8250 datasheets to show that this is _not_ DW
> > specific change.
> 
> serial8250_clear_fifos is already part of the serial8250_do_startup
> process. The purpose of adding it to the critical section is to
> prevent the FIFO from being cleared while the UART is in use.
> 
> Similarly, serial_port_out(port, UART_LCR, UART_LCR_WLEN8);
> 
> It is a correct logic to check if the data is ready before reading,
> which prevents the FIFO from being enabled by other CPUs before
> executing serial_port_in(port, UART_RX).

Yes, I understand what you are doing there, my question is about possible
side-effects on non-DW (non-Synopsys) IPs.
diff mbox series

Patch

diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index 3f256e96c722..6909c81109db 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2264,13 +2264,16 @@  int serial8250_do_startup(struct uart_port *port)
 	 * Clear the FIFO buffers and disable them.
 	 * (they will be reenabled in set_termios())
 	 */
+	uart_port_lock_irqsave(port, &flags);
 	serial8250_clear_fifos(up);
+	uart_port_unlock_irqrestore(port, flags);
 
 	/*
 	 * Clear the interrupt registers.
 	 */
-	serial_port_in(port, UART_LSR);
-	serial_port_in(port, UART_RX);
+	lsr = serial_port_in(port, UART_LSR);
+	if (lsr & UART_LSR_DR)
+		serial_port_in(port, UART_RX);
 	serial_port_in(port, UART_IIR);
 	serial_port_in(port, UART_MSR);
 
@@ -2380,9 +2383,9 @@  int serial8250_do_startup(struct uart_port *port)
 	/*
 	 * Now, initialize the UART
 	 */
+	uart_port_lock_irqsave(port, &flags);
 	serial_port_out(port, UART_LCR, UART_LCR_WLEN8);
 
-	uart_port_lock_irqsave(port, &flags);
 	if (up->port.flags & UPF_FOURPORT) {
 		if (!up->port.irq)
 			up->port.mctrl |= TIOCM_OUT1;
@@ -2435,8 +2438,9 @@  int serial8250_do_startup(struct uart_port *port)
 	 * saved flags to avoid getting false values from polling
 	 * routines or the previous session.
 	 */
-	serial_port_in(port, UART_LSR);
-	serial_port_in(port, UART_RX);
+	lsr = serial_port_in(port, UART_LSR);
+	if (lsr & UART_LSR_DR)
+		serial_port_in(port, UART_RX);
 	serial_port_in(port, UART_IIR);
 	serial_port_in(port, UART_MSR);
 	up->lsr_saved_flags = 0;