diff mbox series

[RFC] serial: core: Use pm_runtime_get_sync() in uart_start()

Message ID 20231124122258.1050-1-xuewen.yan@unisoc.com
State New
Headers show
Series [RFC] serial: core: Use pm_runtime_get_sync() in uart_start() | expand

Commit Message

Xuewen Yan Nov. 24, 2023, 12:22 p.m. UTC
The commit 84a9582fd203("serial: core: Start managing serial controllers to enable runtime PM")
use the pm_runtime_get() after uart_port_lock() which would close the irq and disable preement.
At this time,  pm_runtime_get may cause the following two problems:

(1) deadlock in try_to_wake_up:

uart_write()
uart_port_lock() <<< get lock
__uart_start
__pm_runtime_resume
rpm_resume
queue_work_on
try_to_wake_up
_printk
uart_console_write
...
uart_port_lock() <<< wait forever

(2) scheduling while atomic:
uart_write()
uart_port_lock() <<< get lock
__uart_start
__pm_runtime_resume
rpm_resume
chedule() << sleep

So let us use pm_runtime_get_sync() to prevent these.

Fixes: 84a9582fd203 ("serial: core: Start managing serial controllers to enable runtime PM")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 drivers/tty/serial/serial_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Xuewen Yan Nov. 27, 2023, 1:54 a.m. UTC | #1
Hi Tony

On Sat, Nov 25, 2023 at 3:48 PM Tony Lindgren <tony@atomide.com> wrote:
>
> * Xuewen Yan <xuewen.yan@unisoc.com> [231124 12:25]:
> > The commit 84a9582fd203("serial: core: Start managing serial controllers to enable runtime PM")
> > use the pm_runtime_get() after uart_port_lock() which would close the irq and disable preement.
> > At this time,  pm_runtime_get may cause the following two problems:
> >
> > (1) deadlock in try_to_wake_up:
> >
> > uart_write()
> > uart_port_lock() <<< get lock
> > __uart_start
> > __pm_runtime_resume
> > rpm_resume
> > queue_work_on
> > try_to_wake_up
> > _printk
> > uart_console_write
> > ...
> > uart_port_lock() <<< wait forever
> >
> > (2) scheduling while atomic:
> > uart_write()
> > uart_port_lock() <<< get lock
> > __uart_start
> > __pm_runtime_resume
> > rpm_resume
> > chedule() << sleep
>
> Are these spinlock vs raw spinlock nesting warnings from lockdep? If so
> can you please post the full warnings somewhere? Or if some extra steps
> are needed to reproduce please describe that too.

Indeed, we use pr_info in scheduler in our own kernel tree, and this
deadlock happended.
I would try to use printk_deferred and re-test.
And I also notice the warning was reported by syzbot:

https://lore.kernel.org/all/0000000000006f01f00608a16cea@google.com/
https://lore.kernel.org/all/000000000000e7765006072e9591@google.com/

These are also because the pm_runtime_put().

Thanks!

>
> Chances are very high that your serial port is already runtime active at
> this point unless you manually idle it so that's why I'm wondering as
> all that likely is happening here is a check on the runtime PM usage count.
>
> > So let us use pm_runtime_get_sync() to prevent these.
>
> We need to fix this some other way as we can't use pm_runtime_get_sync()
> here. The sync call variants require setting pm_runtime_irq_safe() for the
> device. And this is what we really want to avoid as it takes a permanent
> usage count on the parent device.
>
> What we want to do here is to get runtime PM to wake-up the device if idle
> and try tx again after runtime PM resume as needed.
>
> Just guessing at this point.. To me it sounds like the fix might be to
> use a raw spinlock for uart_port_lock() and uart_port_unlock()?
>
> Regards,
>
> Tony
>
>
> > Fixes: 84a9582fd203 ("serial: core: Start managing serial controllers to enable runtime PM")
> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> > ---
> >  drivers/tty/serial/serial_core.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> > index f1348a509552..902f7ed35f4d 100644
> > --- a/drivers/tty/serial/serial_core.c
> > +++ b/drivers/tty/serial/serial_core.c
> > @@ -145,7 +145,7 @@ static void __uart_start(struct uart_state *state)
> >       port_dev = port->port_dev;
> >
> >       /* Increment the runtime PM usage count for the active check below */
> > -     err = pm_runtime_get(&port_dev->dev);
> > +     err = pm_runtime_get_sync(&port_dev->dev);
> >       if (err < 0 && err != -EINPROGRESS) {
> >               pm_runtime_put_noidle(&port_dev->dev);
> >               return;
> > @@ -159,7 +159,7 @@ static void __uart_start(struct uart_state *state)
> >       if (!pm_runtime_enabled(port->dev) || pm_runtime_active(port->dev))
> >               port->ops->start_tx(port);
> >       pm_runtime_mark_last_busy(&port_dev->dev);
> > -     pm_runtime_put_autosuspend(&port_dev->dev);
> > +     pm_runtime_put_sync_autosuspend(&port_dev->dev);
> >  }
> >
> >  static void uart_start(struct tty_struct *tty)
> > --
> > 2.25.1
> >
diff mbox series

Patch

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index f1348a509552..902f7ed35f4d 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -145,7 +145,7 @@  static void __uart_start(struct uart_state *state)
 	port_dev = port->port_dev;
 
 	/* Increment the runtime PM usage count for the active check below */
-	err = pm_runtime_get(&port_dev->dev);
+	err = pm_runtime_get_sync(&port_dev->dev);
 	if (err < 0 && err != -EINPROGRESS) {
 		pm_runtime_put_noidle(&port_dev->dev);
 		return;
@@ -159,7 +159,7 @@  static void __uart_start(struct uart_state *state)
 	if (!pm_runtime_enabled(port->dev) || pm_runtime_active(port->dev))
 		port->ops->start_tx(port);
 	pm_runtime_mark_last_busy(&port_dev->dev);
-	pm_runtime_put_autosuspend(&port_dev->dev);
+	pm_runtime_put_sync_autosuspend(&port_dev->dev);
 }
 
 static void uart_start(struct tty_struct *tty)