serial: 8250_dw: Improve unwritable LCR workaround

Message ID 1380069549-9176-1-git-send-email-tim.kryger@linaro.org
State New
Headers show

Commit Message

Tim Kryger Sept. 25, 2013, 12:39 a.m.
The Designware UART has a limitation where it ignores writes into the
LCR if the UART is busy.  The current workaround stashes a copy of the
last written LCR and writes it back down to the hardware if it receives
a special busy interrupt which is raised when a write was ignored.

Unfortunately, interrupts are typically disabled prior to performing a
sequence of register writes that include the LCR so the point at which
the retry occurs is too late.  An example is serial8250_do_set_termios()
where an ignored LCR write results in the baud divisor not being set and
instead a garbage character is sent out the transmitter.

Furthermore, since serial_port_out() offers no way to indicate failure,
a serious effort must be made to ensure that the LCR is actually updated
before returning back to the caller.  This is difficult, however, as a
UART that was busy during the first attempt is likely to still be busy
when a subsequent attempt is made unless some extra action is taken.

This updated workaround takes the extreme action of clearing the TX/RX
FIFOs and reading the receive buffer before writing down the LCR in the
hope that doing so will force the UART into an idle state.  While this
may seem unnecessarily aggressive, writes to the LCR are used to change
the baud rate, parity, stop bit, or data length so the data that may be
lost is likely not important.  Admittedly, this is far from ideal but it
seems to be the best that can be done given the hardware limitations.

Lastly, the revised workaround doesn't touch the LCR in the ISR, so it
avoids the possibility of a "serial8250: too much work for irq" lock up.
This problem is rare in real situations but can be reproduced easily by
wiring up two UARTs and running the following commands.

  # stty -F /dev/ttyS1 echo
  # stty -F /dev/ttyS2 echo
  # cat /dev/ttyS1 &
  [1] 375
  # echo asdf > /dev/ttyS1
  asdf

  [   27.700000] serial8250: too much work for irq96
  [   27.700000] serial8250: too much work for irq96
  [   27.710000] serial8250: too much work for irq96
  [   27.710000] serial8250: too much work for irq96
  [   27.720000] serial8250: too much work for irq96
  [   27.720000] serial8250: too much work for irq96
  [   27.730000] serial8250: too much work for irq96
  [   27.730000] serial8250: too much work for irq96
  [   27.740000] serial8250: too much work for irq96

Signed-off-by: Tim Kryger <tim.kryger@linaro.org>
Reviewed-by: Matt Porter <matt.porter@linaro.org>
Reviewed-by: Markus Mayer <markus.mayer@linaro.org>
---
 drivers/tty/serial/8250/8250_dw.c | 57 +++++++++++++++++++++++++++------------
 1 file changed, 40 insertions(+), 17 deletions(-)

Comments

Heikki Krogerus Sept. 25, 2013, 11:42 a.m. | #1
Hi Tim,

On Tue, Sep 24, 2013 at 05:39:09PM -0700, Tim Kryger wrote:
> The Designware UART has a limitation where it ignores writes into the
> LCR if the UART is busy.  The current workaround stashes a copy of the
> last written LCR and writes it back down to the hardware if it receives
> a special busy interrupt which is raised when a write was ignored.
> 
> Unfortunately, interrupts are typically disabled prior to performing a
> sequence of register writes that include the LCR so the point at which
> the retry occurs is too late.  An example is serial8250_do_set_termios()
> where an ignored LCR write results in the baud divisor not being set and
> instead a garbage character is sent out the transmitter.
> 
> Furthermore, since serial_port_out() offers no way to indicate failure,
> a serious effort must be made to ensure that the LCR is actually updated
> before returning back to the caller.  This is difficult, however, as a
> UART that was busy during the first attempt is likely to still be busy
> when a subsequent attempt is made unless some extra action is taken.
> 
> This updated workaround takes the extreme action of clearing the TX/RX
> FIFOs and reading the receive buffer before writing down the LCR in the
> hope that doing so will force the UART into an idle state.  While this
> may seem unnecessarily aggressive, writes to the LCR are used to change
> the baud rate, parity, stop bit, or data length so the data that may be
> lost is likely not important.  Admittedly, this is far from ideal but it
> seems to be the best that can be done given the hardware limitations.

<snip>

> @@ -76,17 +75,35 @@ static inline int dw8250_modify_msr(struct uart_port *p, int offset, int value)
>  	return value;
>  }
>  
> +/* The UART will ignore writes to LCR when busy we take aggressive steps
> + * to ensure that it is idle before attempting to write to LCR */
> +static void dw8250_force_idle(struct uart_port *p)
> +{
> +	serial8250_clear_and_reinit_fifos(container_of
> +					  (p, struct uart_8250_port, port));
> +	(void)p->serial_in(p, UART_LSR);
> +	(void)p->serial_in(p, UART_MSR);
> +	(void)p->serial_in(p, UART_RX);
> +}

This looks pretty brutal. Is it really necessary?

>  static void dw8250_serial_out(struct uart_port *p, int offset, int value)
>  {
>  	struct dw8250_data *d = p->private_data;
>  
> -	if (offset == UART_LCR)
> -		d->last_lcr = value;
> -
> -	if (offset == UART_MCR)
> -		d->last_mcr = value;
> -
> -	writeb(value, p->membase + (offset << p->regshift));
> +	if (offset == UART_LCR) {
> +		int tries = 1000;
> +		while (tries--) {
> +			if (value == p->serial_in(p, UART_LCR))
> +				return;
> +			dw8250_force_idle(p);
> +			writeb(value, p->membase + (UART_LCR << p->regshift));
> +		}
> +		dev_err(p->dev, "Couldn't set LCR to %d\n", value);

Is it not enough to simply poll USR[0] to see when the UART becomes
free?
Tim Kryger Sept. 25, 2013, 10:47 p.m. | #2
On Wed, Sep 25, 2013 at 4:42 AM, Heikki Krogerus
<heikki.krogerus@linux.intel.com> wrote:
> Hi Tim,
>
> On Tue, Sep 24, 2013 at 05:39:09PM -0700, Tim Kryger wrote:
>> The Designware UART has a limitation where it ignores writes into the
>> LCR if the UART is busy.  The current workaround stashes a copy of the
>> last written LCR and writes it back down to the hardware if it receives
>> a special busy interrupt which is raised when a write was ignored.
>>
>> Unfortunately, interrupts are typically disabled prior to performing a
>> sequence of register writes that include the LCR so the point at which
>> the retry occurs is too late.  An example is serial8250_do_set_termios()
>> where an ignored LCR write results in the baud divisor not being set and
>> instead a garbage character is sent out the transmitter.
>>
>> Furthermore, since serial_port_out() offers no way to indicate failure,
>> a serious effort must be made to ensure that the LCR is actually updated
>> before returning back to the caller.  This is difficult, however, as a
>> UART that was busy during the first attempt is likely to still be busy
>> when a subsequent attempt is made unless some extra action is taken.
>>
>> This updated workaround takes the extreme action of clearing the TX/RX
>> FIFOs and reading the receive buffer before writing down the LCR in the
>> hope that doing so will force the UART into an idle state.  While this
>> may seem unnecessarily aggressive, writes to the LCR are used to change
>> the baud rate, parity, stop bit, or data length so the data that may be
>> lost is likely not important.  Admittedly, this is far from ideal but it
>> seems to be the best that can be done given the hardware limitations.
>
> <snip>
>
>> @@ -76,17 +75,35 @@ static inline int dw8250_modify_msr(struct uart_port *p, int offset, int value)
>>       return value;
>>  }
>>
>> +/* The UART will ignore writes to LCR when busy we take aggressive steps
>> + * to ensure that it is idle before attempting to write to LCR */
>> +static void dw8250_force_idle(struct uart_port *p)
>> +{
>> +     serial8250_clear_and_reinit_fifos(container_of
>> +                                       (p, struct uart_8250_port, port));
>> +     (void)p->serial_in(p, UART_LSR);
>> +     (void)p->serial_in(p, UART_MSR);
>> +     (void)p->serial_in(p, UART_RX);
>> +}
>
> This looks pretty brutal. Is it really necessary?

Clearing the RX FIFO and receive buffer are necessary.  The hardware
reports a busy status if there are any unread characters waiting.

Reading the LSR and MSR may not be essential.  I suspect some
interrupts that are cleared with these reads could trigger a busy
status but the documentation is vague.

>>  static void dw8250_serial_out(struct uart_port *p, int offset, int value)
>>  {
>>       struct dw8250_data *d = p->private_data;
>>
>> -     if (offset == UART_LCR)
>> -             d->last_lcr = value;
>> -
>> -     if (offset == UART_MCR)
>> -             d->last_mcr = value;
>> -
>> -     writeb(value, p->membase + (offset << p->regshift));
>> +     if (offset == UART_LCR) {
>> +             int tries = 1000;
>> +             while (tries--) {
>> +                     if (value == p->serial_in(p, UART_LCR))
>> +                             return;
>> +                     dw8250_force_idle(p);
>> +                     writeb(value, p->membase + (UART_LCR << p->regshift));
>> +             }
>> +             dev_err(p->dev, "Couldn't set LCR to %d\n", value);
>
> Is it not enough to simply poll USR[0] to see when the UART becomes
> free?

Unfortunately not.  The LCR is modified while holding the spinlock
with local interrupts disabled so the ISR is deprived of the
opportunity to empty the FIFO.

Given that the hardware will never become idle so long as there are
characters in the RX FIFO, the only option is to forcibly empty it
here.

> --
> heikki
Greg KH Sept. 26, 2013, 10:46 p.m. | #3
On Tue, Sep 24, 2013 at 05:39:09PM -0700, Tim Kryger wrote:
> The Designware UART has a limitation where it ignores writes into the
> LCR if the UART is busy.  The current workaround stashes a copy of the
> last written LCR and writes it back down to the hardware if it receives
> a special busy interrupt which is raised when a write was ignored.
> 
> Unfortunately, interrupts are typically disabled prior to performing a
> sequence of register writes that include the LCR so the point at which
> the retry occurs is too late.  An example is serial8250_do_set_termios()
> where an ignored LCR write results in the baud divisor not being set and
> instead a garbage character is sent out the transmitter.
> 
> Furthermore, since serial_port_out() offers no way to indicate failure,
> a serious effort must be made to ensure that the LCR is actually updated
> before returning back to the caller.  This is difficult, however, as a
> UART that was busy during the first attempt is likely to still be busy
> when a subsequent attempt is made unless some extra action is taken.
> 
> This updated workaround takes the extreme action of clearing the TX/RX
> FIFOs and reading the receive buffer before writing down the LCR in the
> hope that doing so will force the UART into an idle state.  While this
> may seem unnecessarily aggressive, writes to the LCR are used to change
> the baud rate, parity, stop bit, or data length so the data that may be
> lost is likely not important.  Admittedly, this is far from ideal but it
> seems to be the best that can be done given the hardware limitations.
> 
> Lastly, the revised workaround doesn't touch the LCR in the ISR, so it
> avoids the possibility of a "serial8250: too much work for irq" lock up.
> This problem is rare in real situations but can be reproduced easily by
> wiring up two UARTs and running the following commands.
> 
>   # stty -F /dev/ttyS1 echo
>   # stty -F /dev/ttyS2 echo
>   # cat /dev/ttyS1 &
>   [1] 375
>   # echo asdf > /dev/ttyS1
>   asdf
> 
>   [   27.700000] serial8250: too much work for irq96
>   [   27.700000] serial8250: too much work for irq96
>   [   27.710000] serial8250: too much work for irq96
>   [   27.710000] serial8250: too much work for irq96
>   [   27.720000] serial8250: too much work for irq96
>   [   27.720000] serial8250: too much work for irq96
>   [   27.730000] serial8250: too much work for irq96
>   [   27.730000] serial8250: too much work for irq96
>   [   27.740000] serial8250: too much work for irq96
> 
> Signed-off-by: Tim Kryger <tim.kryger@linaro.org>
> Reviewed-by: Matt Porter <matt.porter@linaro.org>
> Reviewed-by: Markus Mayer <markus.mayer@linaro.org>
> ---
>  drivers/tty/serial/8250/8250_dw.c | 57 +++++++++++++++++++++++++++------------
>  1 file changed, 40 insertions(+), 17 deletions(-)

This patch no longer applies to my tty-next tree, can you please refresh
it and resend so that I can apply it?

thanks,

greg k-h
Heikki Krogerus Sept. 27, 2013, 7:49 a.m. | #4
Hi,

On Wed, Sep 25, 2013 at 03:47:14PM -0700, Tim Kryger wrote:
> On Wed, Sep 25, 2013 at 4:42 AM, Heikki Krogerus
> <heikki.krogerus@linux.intel.com> wrote:
> > On Tue, Sep 24, 2013 at 05:39:09PM -0700, Tim Kryger wrote:
> >>  static void dw8250_serial_out(struct uart_port *p, int offset, int value)
> >>  {
> >>       struct dw8250_data *d = p->private_data;
> >>
> >> -     if (offset == UART_LCR)
> >> -             d->last_lcr = value;
> >> -
> >> -     if (offset == UART_MCR)
> >> -             d->last_mcr = value;
> >> -
> >> -     writeb(value, p->membase + (offset << p->regshift));
> >> +     if (offset == UART_LCR) {
> >> +             int tries = 1000;
> >> +             while (tries--) {
> >> +                     if (value == p->serial_in(p, UART_LCR))
> >> +                             return;
> >> +                     dw8250_force_idle(p);
> >> +                     writeb(value, p->membase + (UART_LCR << p->regshift));
> >> +             }
> >> +             dev_err(p->dev, "Couldn't set LCR to %d\n", value);
> >
> > Is it not enough to simply poll USR[0] to see when the UART becomes
> > free?
> 
> Unfortunately not.  The LCR is modified while holding the spinlock
> with local interrupts disabled so the ISR is deprived of the
> opportunity to empty the FIFO.
> 
> Given that the hardware will never become idle so long as there are
> characters in the RX FIFO, the only option is to forcibly empty it
> here.

OK. The LCR issue is only a problem when the UART is configured with
the busy functionality (UART_16550_COMPATIBLE == NO). Otherwise LRC is
always accessible and this is not necessary.

We need a flag for the UART_16550_COMPATIBLE configuration, and
use this WA based on that.

Thanks,
Tim Kryger Oct. 1, 2013, 5:13 p.m. | #5
On Fri, Sep 27, 2013 at 12:49 AM, Heikki Krogerus
<heikki.krogerus@linux.intel.com> wrote:

> OK. The LCR issue is only a problem when the UART is configured with
> the busy functionality (UART_16550_COMPATIBLE == NO). Otherwise LRC is
> always accessible and this is not necessary.

Reading through the latest DW databook, I believe you are correct.

> We need a flag for the UART_16550_COMPATIBLE configuration, and
> use this WA based on that.

If I adjust the workaround to force the UART idle only when an LCR
write was ignored rather than doing it prior to any LCR write, would
that address your concerns?

Patch

diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
index daf710f..ffbb69c 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -56,7 +56,6 @@ 
 
 
 struct dw8250_data {
-	int		last_lcr;
 	int		last_mcr;
 	int		line;
 	struct clk	*clk;
@@ -76,17 +75,35 @@  static inline int dw8250_modify_msr(struct uart_port *p, int offset, int value)
 	return value;
 }
 
+/* The UART will ignore writes to LCR when busy we take aggressive steps
+ * to ensure that it is idle before attempting to write to LCR */
+static void dw8250_force_idle(struct uart_port *p)
+{
+	serial8250_clear_and_reinit_fifos(container_of
+					  (p, struct uart_8250_port, port));
+	(void)p->serial_in(p, UART_LSR);
+	(void)p->serial_in(p, UART_MSR);
+	(void)p->serial_in(p, UART_RX);
+}
+
 static void dw8250_serial_out(struct uart_port *p, int offset, int value)
 {
 	struct dw8250_data *d = p->private_data;
 
-	if (offset == UART_LCR)
-		d->last_lcr = value;
-
-	if (offset == UART_MCR)
-		d->last_mcr = value;
-
-	writeb(value, p->membase + (offset << p->regshift));
+	if (offset == UART_LCR) {
+		int tries = 1000;
+		while (tries--) {
+			if (value == p->serial_in(p, UART_LCR))
+				return;
+			dw8250_force_idle(p);
+			writeb(value, p->membase + (UART_LCR << p->regshift));
+		}
+		dev_err(p->dev, "Couldn't set LCR to %d\n", value);
+	} else {
+		if (offset == UART_MCR)
+			d->last_mcr = value;
+		writeb(value, p->membase + (offset << p->regshift));
+	}
 }
 
 static unsigned int dw8250_serial_in(struct uart_port *p, int offset)
@@ -107,13 +124,20 @@  static void dw8250_serial_out32(struct uart_port *p, int offset, int value)
 {
 	struct dw8250_data *d = p->private_data;
 
-	if (offset == UART_LCR)
-		d->last_lcr = value;
-
-	if (offset == UART_MCR)
-		d->last_mcr = value;
-
-	writel(value, p->membase + (offset << p->regshift));
+	if (offset == UART_LCR) {
+		int tries = 1000;
+		while (tries--) {
+			if (value == p->serial_in(p, UART_LCR))
+				return;
+			dw8250_force_idle(p);
+			writel(value, p->membase + (UART_LCR << p->regshift));
+		}
+		dev_err(p->dev, "Couldn't set LCR to %d\n", value);
+	} else {
+		if (offset == UART_MCR)
+			d->last_mcr = value;
+		writel(value, p->membase + (offset << p->regshift));
+	}
 }
 
 static unsigned int dw8250_serial_in32(struct uart_port *p, int offset)
@@ -131,9 +155,8 @@  static int dw8250_handle_irq(struct uart_port *p)
 	if (serial8250_handle_irq(p, iir)) {
 		return 1;
 	} else if ((iir & UART_IIR_BUSY) == UART_IIR_BUSY) {
-		/* Clear the USR and write the LCR again. */
+		/* Clear the USR */
 		(void)p->serial_in(p, d->usr_reg);
-		p->serial_out(p, UART_LCR, d->last_lcr);
 
 		return 1;
 	}