[0/2] serial: Fix problems when serial transfer is happening at suspend time

Message ID 20240523232216.3148367-1-dianders@chromium.org

Message

Doug Anderson May 23, 2024, 11:22 p.m. UTC
This is a set of two patches that fix problems related to suspending
while a serial transfer is going on. The two patches are independent
from each other and can land in any order. The only thing tying them
together is that I used the same test to reproduce both of them.
Specifically, I could reproduce my problems by logging in via an
agetty on the debug serial port (which was _not_ used for kernel
console) and running:
  cat /var/log/messages
...and then (via an SSH session) forcing a few suspend/resume cycles.

The first patch solves a problem that is probably more major. It was
introduced recently and has even shown up in stable trees.
Suspend/resume testing in ChromeOS test labs is hitting the problem
fixed by this patch. The fix hasn't been tested in labs, but when I
reproduced the problem locally I could see that the fix worked. IMO it
should land ASAP.

The second patch fixes an ancient problem that I only found because I
was trying to reproduce the first problem. Given how long it's been
around it's probably not urgent but it would be nice to get fixed.


Douglas Anderson (2):
  serial: port: Don't block system suspend even if bytes are left to
    xmit
  serial: qcom-geni: Fix qcom_geni_serial_stop_tx_fifo() while xfer

 drivers/tty/serial/qcom_geni_serial.c | 45 +++++++++++++++++++++++++--
 drivers/tty/serial/serial_port.c      | 10 ++++++
 2 files changed, 52 insertions(+), 3 deletions(-)

Comments

Andy Shevchenko May 27, 2024, 5:54 p.m. UTC | #1
On Thu, May 23, 2024 at 04:22:12PM -0700, Douglas Anderson wrote:
> Recently, suspend testing on sc7180-trogdor based devices has started
> to sometimes fail with messages like this:
> 
>   port a88000.serial:0.0: PM: calling pm_runtime_force_suspend+0x0/0xf8 @ 28934, parent: a88000.serial:0
>   port a88000.serial:0.0: PM: dpm_run_callback(): pm_runtime_force_suspend+0x0/0xf8 returns -16
>   port a88000.serial:0.0: PM: pm_runtime_force_suspend+0x0/0xf8 returned -16 after 33 usecs
>   port a88000.serial:0.0: PM: failed to suspend: error -16
> 
> I could reproduce these problems by logging in via an agetty on the
> debug serial port (which was _not_ used for kernel console) and
> running:
>   cat /var/log/messages
> ...and then (via an SSH session) forcing a few suspend/resume cycles.
> 
> Tracing through the code and doing some printf debugging shows that

printf()

...or...

printf()-based

> the -16 (-EBUSY) comes from the recently added
> serial_port_runtime_suspend().
> 
> The idea of the serial_port_runtime_suspend() function is to prevent
> the port from being _runtime_ suspended if it still has bytes left to
> transmit. Having bytes left to transmit isn't a reason to block
> _system_ suspend, though. The DEFINE_RUNTIME_DEV_PM_OPS() used by the
> serial_port code means that the system suspend function will be
> pm_runtime_force_suspend(). In pm_runtime_force_suspend() we can see
> that before calling the runtime suspend function we'll call
> pm_runtime_disable(). This should be a reliable way to detect that
> we're called from system suspend and that we shouldn't look for
> busyness.

...

> +	/*
> +	 * We only want to check the busyness of the port if PM Runtime is
> +	 * enabled. Specifically PM Runtime will be disabled by
> +	 * pm_runtime_force_suspend() during system suspend and we don't want
> +	 * to block system suspend even if there is data still left to
> +	 * transmit. We only want to block regulator PM Runtime transitions.

regular

> +	 */
> +	if (!pm_runtime_enabled(dev))
> +		return 0;
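
To make the reasoning in the commit message easier to follow, here is a
minimal userspace sketch of the two suspend paths. It is only a model, not
the kernel code: struct model_dev, model_runtime_suspend() and
model_force_suspend() are hypothetical names invented for illustration, and
the two fields stand in for pm_runtime_enabled(dev) and for "bytes left to
transmit". The sketch assumes, as the commit message describes, that the
system suspend path disables PM Runtime before calling the runtime suspend
callback.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct device. */
struct model_dev {
	bool runtime_pm_enabled;	/* models pm_runtime_enabled(dev) */
	size_t tx_bytes_pending;	/* models bytes left to transmit */
};

/*
 * Models the runtime suspend callback with the early return proposed in
 * the patch: busyness is only checked while PM Runtime is enabled.
 */
static int model_runtime_suspend(const struct model_dev *dev)
{
	assert(dev != NULL);

	/* PM Runtime disabled: assume we are on the system suspend path. */
	if (!dev->runtime_pm_enabled)
		return 0;

	/* Regular PM Runtime transition: refuse while data is pending. */
	return dev->tx_bytes_pending ? -EBUSY : 0;
}

/*
 * Models the system suspend path described in the commit message:
 * PM Runtime is disabled first, then the runtime suspend callback runs.
 */
static int model_force_suspend(struct model_dev *dev)
{
	dev->runtime_pm_enabled = false;
	return model_runtime_suspend(dev);
}
```

In this model, a device with pending bytes still gets -EBUSY (-16) from a
regular runtime suspend attempt, while the system suspend path returns 0
because the busyness check is skipped once PM Runtime has been disabled.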