[v4] usbserial: cp210x - icount support for parity error checking

Message ID 838f09f9-4063-1c2c-8b4d-c18dee6c18de@jrr.cz
State Superseded
Headers show
Series
  • [v4] usbserial: cp210x - icount support for parity error checking
Related show

Commit Message

Jerry June 22, 2020, 3:13 p.m.
The current version of cp210x driver doesn't provide any way to detect
a parity error in received data from userspace. Some serial protocols like
STM32 bootloader protect data only by parity so application needs to
know whether parity error happened to repeat peripheral data reading.

Added support for icount (ioctl TIOCGICOUNT) which sends GET_COMM_STATUS
command to CP210X and according received flags increments fields for
parity error, frame error, break and overrun. An application can detect
an error condition after reading data from ttyUSB and reacts adequately.
There is no impact for applications which don't call ioctl TIOCGICOUNT.

The flag "hardware overrun" is not examined because CP2102 sets this bit
for the first received byte after openning of port which was previously
closed with some unreaded data in buffer. This is confusing and checking
"queue overrun" flag seems be enough.

Signed-off-by: Jaromír Škorpil <Jerry@jrr.cz>
---
v2: Simplified counting - only queue overrun checked
v3: Changed description + UTF8 name
v4: Corrected endian

  cp210x.c |   43 ++++++++++++++++++++++++++++++++++++++-----
  1 file changed, 38 insertions(+), 5 deletions(-)

Comments

Johan Hovold July 3, 2020, 7:45 a.m. | #1
On Wed, Jul 01, 2020 at 09:28:25PM +0200, Jerry wrote:
> Johan Hovold wrote on 7/1/20 5:42 PM:

> > It would be better if could detect both types of overrun.

> >

> > Did you try moving the purge command at close to after disabling the

> > uart?

> >

> > Or perhaps we can add a "dummy" comm-status command after disabling the

> > uart?


> I try to describe more details about this overrun problem:

> 1) I tried only CP2102 because our company uses it, I have no idea whether 

> the same apply for CP2104,2105... or not, I don't have another chip.

> 2) Maybe I should note I'm always using even parity (because of STM32 

> bootloader protocol).

> 3) I thought the problem is created by unreaded data when closing because 

> overrun was reported after closing if GET_COMM_STATUS shown positive 

> ulAmountInInQueue before closing. Later I discovered that if I close the 

> port, wait, send some data from outside, then open it, I will also get 

> HW_OVERRUN flag.

> 4) So currently I suppose that problem is usually created by any incoming 

> data after disabling interface. If I remove UART_DISABLE from close method, 

> no overrun ever reported.

> 5) Unfortunately this overrun is not reported immediately after port 

> opening but later after receiving first byte. I didn't find any way how to 

> purge nor dummy read this flag before transmitting data.

> 6) I didn't find this problem in a chip errata and nobody complaining about it.

> 7) I successfully reproduced the same problem in MS Windows 10. If some 

> data arrived to closed port, then I open it, everything is OK but after 

> sending request and receiving reply I often get overrun indication from 

> Win32 API ClearCommError()

> 8) I removed HW_OVERRUN checking because I don't want to break anything but 

> maybe Linux driver should work the same way as Windows and don't hide this 

> problem?

> 9) I needed just to detect parity error from user space and things 

> complicate.  :-)


Heh, yeah, it tends to be that way. :) But thanks for the great summary
of your findings!

I think it probably makes most sense to keep the error in as it's a
quirk of the device rather than risk missing an actual overrun.

The problem here is that we're sort of violating the API and turning
TIOCGICOUNT into a polling interface instead of just returning our error
and interrupt counters. This means will always undercount any errors for
a start.

The chip appears to have a mechanism for reporting errors in-band, but
that would amount to processing every character received to look for the
escape char, which adds overhead. I'm guessing that interface would also
insert an overrun error when returning the first character.

But perhaps that it was we should be using as it allows parity the
errors to be reported using the normal in-band methods. If we only
enable it when parity checking is enabled then the overhead seems
justified.

I did a quick test here and the event insertion appears to work well for
parity errors. Didn't manage to get it to report break events yet, and
modem-status changes are being buffered and not reported until the next
character. But in theory we could expand the implementation to provide
more features later.

I'll see if I can cook something up quickly.

> >>    /*

> >> - * Read how many bytes are waiting in the TX queue.

> >> + * Read how many bytes are waiting in the TX queue and update error counters.

> >>     */

> >> -static int cp210x_get_tx_queue_byte_count(struct usb_serial_port *port,

> >> -		u32 *count)

> >> +static int cp210x_get_comm_status(struct usb_serial_port *port,

> >> +		u32 *tx_count)

> >>    {

> >>    	struct usb_serial *serial = port->serial;

> >>    	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);

> >> @@ -855,7 +865,17 @@ static int cp210x_get_tx_queue_byte_coun

> >>    			0, port_priv->bInterfaceNumber, sts, sizeof(*sts),

> >>    			USB_CTRL_GET_TIMEOUT);

> >>    	if (result == sizeof(*sts)) {

> >> -		*count = le32_to_cpu(sts->ulAmountInOutQueue);

> >> +		u32 flags = le32_to_cpu(sts->ulErrors);

> >> +		if (flags & CP210X_SERIAL_ERR_BREAK)

> >> +			port->icount.brk++;

> >> +		if (flags & CP210X_SERIAL_ERR_FRAME)

> >> +			port->icount.frame++;

> >> +		if (flags & CP210X_SERIAL_ERR_QUEUE_OVERRUN)

> >> +			port->icount.overrun++;

> >> +		if (flags & CP210X_SERIAL_ERR_PARITY)

> >> +			port->icount.parity++;

> >> +		if (tx_count)

> >> +			*tx_count = le32_to_cpu(sts->ulAmountInOutQueue);

> >>    		result = 0;

> >>    	} else {

> >>    		dev_err(&port->dev, "failed to get comm status: %d\n", result);

> >> @@ -873,13 +893,26 @@ static bool cp210x_tx_empty(struct usb_s

> >>    	int err;

> >>    	u32 count;

> >>    

> >> -	err = cp210x_get_tx_queue_byte_count(port, &count);

> >> +	err = cp210x_get_comm_status(port, &count);

> >>    	if (err)

> >>    		return true;

> >>    

> >>    	return !count;

> >>    }

> > It seems a bit weird to be updating the error counters while polling

> > for tx-empty during close. How about having cp210x_get_comm_status()

> > take two u32 pointers for tx_count and errors and only pass in one or

> > the other from cp210x_tx_empty() and cp210x_get_icount() respectively?


> I suppose that GET_COMM_STATUS reads and CLEAR pending error flags (not 

> explained in datasheet but behaves that way). So if some errors are 

> reported during cp210x_tx_empty() it can be either counted immediately or 

> lost. I'm not sure if cp210x_tx_empty() is called ONLY during close but 

> seems be more consistent to count every detected error regardless who calls 

> GET_COMM_STATUS.


tx_empty() is called when waiting for data to be drained for example
during close but also on when changing terminal settings with TCSADRAIN
or on tcdrain().

Since we wouldn't be counting errors during normal operation it seems
arbitrary to be counting during these seemingly unrelated calls. Better
to only poll from TIOCGICOUNT, if we decide to go with that approach at
all.

> >> +static int cp210x_get_icount(struct tty_struct *tty,

> >> +		struct serial_icounter_struct *icount)

> >> +{

> >> +	struct usb_serial_port *port = tty->driver_data;

> >> +	int result;

> >> +

> >> +	result = cp210x_get_comm_status(port, NULL);

> >> +	if (result)

> >> +		return result;

> > And then you update the error counters here, preferably under the port

> > lock.


> I wasn't sure about port lock. I looked in several usb drivers (ftdi, 

> pl2303) and it seems that port->icount.xxx increments are usually done out 

> of lock. But if you wish I put increments into 

> spin_lock_irqsave(&port->lock, flags), correct?


Correct.

It's quite possible that that's missing elsewhere, but at least those
counters are updated in completion callbacks which do not run in
parallel unlike ioctls(), which are not serialised.

Johan
Johan Hovold July 3, 2020, 3:01 p.m. | #2
On Fri, Jul 03, 2020 at 09:45:39AM +0200, Johan Hovold wrote:
> On Wed, Jul 01, 2020 at 09:28:25PM +0200, Jerry wrote:

> > Johan Hovold wrote on 7/1/20 5:42 PM:

> > > It would be better if could detect both types of overrun.

> > >

> > > Did you try moving the purge command at close to after disabling the

> > > uart?

> > >

> > > Or perhaps we can add a "dummy" comm-status command after disabling the

> > > uart?

> 

> > I try to describe more details about this overrun problem:

> > 1) I tried only CP2102 because our company uses it, I have no idea whether 

> > the same apply for CP2104,2105... or not, I don't have another chip.

> > 2) Maybe I should note I'm always using even parity (because of STM32 

> > bootloader protocol).

> > 3) I thought the problem is created by unreaded data when closing because 

> > overrun was reported after closing if GET_COMM_STATUS shown positive 

> > ulAmountInInQueue before closing. Later I discovered that if I close the 

> > port, wait, send some data from outside, then open it, I will also get 

> > HW_OVERRUN flag.

> > 4) So currently I suppose that problem is usually created by any incoming 

> > data after disabling interface. If I remove UART_DISABLE from close method, 

> > no overrun ever reported.

> > 5) Unfortunately this overrun is not reported immediately after port 

> > opening but later after receiving first byte. I didn't find any way how to 

> > purge nor dummy read this flag before transmitting data.

> > 6) I didn't find this problem in a chip errata and nobody complaining about it.

> > 7) I successfully reproduced the same problem in MS Windows 10. If some 

> > data arrived to closed port, then I open it, everything is OK but after 

> > sending request and receiving reply I often get overrun indication from 

> > Win32 API ClearCommError()

> > 8) I removed HW_OVERRUN checking because I don't want to break anything but 

> > maybe Linux driver should work the same way as Windows and don't hide this 

> > problem?

> > 9) I needed just to detect parity error from user space and things 

> > complicate.  :-)

> 

> Heh, yeah, it tends to be that way. :) But thanks for the great summary

> of your findings!

> 

> I think it probably makes most sense to keep the error in as it's a

> quirk of the device rather than risk missing an actual overrun.

> 

> The problem here is that we're sort of violating the API and turning

> TIOCGICOUNT into a polling interface instead of just returning our error

> and interrupt counters. This means will always undercount any errors for

> a start.

> 

> The chip appears to have a mechanism for reporting errors in-band, but

> that would amount to processing every character received to look for the

> escape char, which adds overhead. I'm guessing that interface would also

> insert an overrun error when returning the first character.

> 

> But perhaps that it was we should be using as it allows parity the

> errors to be reported using the normal in-band methods. If we only

> enable it when parity checking is enabled then the overhead seems

> justified.

> 

> I did a quick test here and the event insertion appears to work well for

> parity errors. Didn't manage to get it to report break events yet, and

> modem-status changes are being buffered and not reported until the next

> character. But in theory we could expand the implementation to provide

> more features later.

> 

> I'll see if I can cook something up quickly.


Would you mind giving the below a try and see how that works in your
setup?

With this patch you'd be able to use PARMRK to be notified of any parity
errors, but I also wired up TIOCGICOUNT if you prefer to use that.

Also, could try and see if your device detects breaks properly? Mine
just return a NUL char.

Johan



From 5fc7de670489a6651e023c325e674666d65cfe14 Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan@kernel.org>

Date: Fri, 3 Jul 2020 16:39:14 +0200
Subject: [PATCH] USB: serial: add support for line-status events

Add support for line-status events that can be used to detect and report
parity errors.

Enable the device's event-insertion mode whenever input-parity checking
is requested. This will insert line and modem status events into the
data stream.

Note that modem-status changes appear to be buffered until a character
is received and is therefore left unimplemented.

On at least one type of these chips, line breaks are also not detected
properly and is just reported as a NUL character. I'm therefore not
enabling event-insertion when !IGNBRK is requested.

Also wire up TIOCGICOUNT to allow for reading out the line-status
counters.

Signed-off-by: Johan Hovold <johan@kernel.org>

---
 drivers/usb/serial/cp210x.c | 207 +++++++++++++++++++++++++++++++++++-
 1 file changed, 204 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c
index f5143eedbc48..b5f8176ee7ab 100644
--- a/drivers/usb/serial/cp210x.c
+++ b/drivers/usb/serial/cp210x.c
@@ -50,6 +50,9 @@ static void cp210x_release(struct usb_serial *);
 static int cp210x_port_probe(struct usb_serial_port *);
 static int cp210x_port_remove(struct usb_serial_port *);
 static void cp210x_dtr_rts(struct usb_serial_port *p, int on);
+static void cp210x_process_read_urb(struct urb *urb);
+static void cp210x_enable_event_mode(struct usb_serial_port *port);
+static void cp210x_disable_event_mode(struct usb_serial_port *port);
 
 static const struct usb_device_id id_table[] = {
 	{ USB_DEVICE(0x045B, 0x0053) }, /* Renesas RX610 RX-Stick */
@@ -253,9 +256,21 @@ struct cp210x_serial_private {
 	bool			use_actual_rate;
 };
 
+enum cp210x_event_state {
+	ES_DATA,
+	ES_ESCAPE,
+	ES_LSR,
+	ES_LSR_DATA_0,
+	ES_LSR_DATA_1,
+	ES_MSR
+};
+
 struct cp210x_port_private {
 	__u8			bInterfaceNumber;
 	bool			has_swapped_line_ctl;
+	bool			event_mode;
+	enum cp210x_event_state event_state;
+	u8 lsr;
 };
 
 static struct usb_serial_driver cp210x_device = {
@@ -274,12 +289,14 @@ static struct usb_serial_driver cp210x_device = {
 	.tx_empty		= cp210x_tx_empty,
 	.tiocmget		= cp210x_tiocmget,
 	.tiocmset		= cp210x_tiocmset,
+	.get_icount		= usb_serial_generic_get_icount,
 	.attach			= cp210x_attach,
 	.disconnect		= cp210x_disconnect,
 	.release		= cp210x_release,
 	.port_probe		= cp210x_port_probe,
 	.port_remove		= cp210x_port_remove,
-	.dtr_rts		= cp210x_dtr_rts
+	.dtr_rts		= cp210x_dtr_rts,
+	.process_read_urb	= cp210x_process_read_urb,
 };
 
 static struct usb_serial_driver * const serial_drivers[] = {
@@ -401,6 +418,15 @@ struct cp210x_comm_status {
  */
 #define PURGE_ALL		0x000f
 
+/* CP210X_EMBED_EVENTS */
+#define CP210X_ESCCHAR		0xff
+
+#define CP210X_LSR_OVERRUN	BIT(1)
+#define CP210X_LSR_PARITY	BIT(2)
+#define CP210X_LSR_FRAME	BIT(3)
+#define CP210X_LSR_BREAK	BIT(4)
+
+
 /* CP210X_GET_FLOW/CP210X_SET_FLOW read/write these 0x10 bytes */
 struct cp210x_flow_ctl {
 	__le32	ulControlHandshake;
@@ -807,6 +833,7 @@ static int cp210x_get_line_ctl(struct usb_serial_port *port, u16 *ctl)
 
 static int cp210x_open(struct tty_struct *tty, struct usb_serial_port *port)
 {
+	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);
 	int result;
 
 	result = cp210x_write_u16_reg(port, CP210X_IFC_ENABLE, UART_ENABLE);
@@ -819,20 +846,151 @@ static int cp210x_open(struct tty_struct *tty, struct usb_serial_port *port)
 	cp210x_get_termios(tty, port);
 
 	/* The baud rate must be initialised on cp2104 */
-	if (tty)
+	if (tty) {
 		cp210x_change_speed(tty, port, NULL);
 
-	return usb_serial_generic_open(tty, port);
+		/* FIXME: handle from set_termios() only */
+		if (I_INPCK(tty))
+			cp210x_enable_event_mode(port);
+	}
+
+	result = usb_serial_generic_open(tty, port);
+	if (result)
+		goto err_disable;
+
+	return 0;
+
+err_disable:
+	cp210x_write_u16_reg(port, CP210X_IFC_ENABLE, UART_DISABLE);
+	port_priv->event_mode = false;
+
+	return result;
 }
 
 static void cp210x_close(struct usb_serial_port *port)
 {
+	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);
+
 	usb_serial_generic_close(port);
 
 	/* Clear both queues; cp2108 needs this to avoid an occasional hang */
 	cp210x_write_u16_reg(port, CP210X_PURGE, PURGE_ALL);
 
 	cp210x_write_u16_reg(port, CP210X_IFC_ENABLE, UART_DISABLE);
+
+	/* Disabling the interface disables event-insertion mode. */
+	port_priv->event_mode = false;
+}
+
+static char cp210x_process_lsr(struct usb_serial_port *port, unsigned char lsr)
+{
+	char flag = TTY_NORMAL;
+
+	if (lsr & CP210X_LSR_BREAK) {
+		flag = TTY_BREAK;
+		port->icount.brk++;
+		/* FIXME: no need to process sysrq chars without breaks... */
+		usb_serial_handle_break(port);
+	} else if (lsr & CP210X_LSR_PARITY) {
+		flag = TTY_PARITY;
+		port->icount.parity++;
+	} else if (lsr & CP210X_LSR_FRAME) {
+		flag = TTY_FRAME;
+		port->icount.frame++;
+	}
+
+	if (lsr & CP210X_LSR_OVERRUN) {
+		port->icount.overrun++;
+		tty_insert_flip_char(&port->port, 0, TTY_OVERRUN);
+	}
+
+	return flag;
+}
+
+static bool cp210x_process_event_char(struct usb_serial_port *port, unsigned char *ch, char *flag)
+{
+	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);
+
+	switch(port_priv->event_state) {
+	case ES_DATA:
+		if (*ch == CP210X_ESCCHAR) {
+			port_priv->event_state = ES_ESCAPE;
+			break;
+		}
+		return false;
+	case ES_ESCAPE:
+		switch (*ch) {
+		case 0:
+			dev_dbg(&port->dev, "%s - escape char\n", __func__);
+			*ch = CP210X_ESCCHAR;
+			port_priv->event_state = ES_DATA;
+			return false;
+		case 1:
+			port_priv->event_state = ES_LSR_DATA_0;
+			break;
+		case 2:
+			port_priv->event_state = ES_LSR;
+			break;
+		case 3:
+			port_priv->event_state = ES_MSR;
+			break;
+		default:
+			dev_err(&port->dev, "malformed event 0x%02x\n", *ch);
+			port_priv->event_state = ES_DATA;
+			break;
+		}
+		break;
+	case ES_LSR_DATA_0:
+		port_priv->lsr = *ch;
+		port_priv->event_state = ES_LSR_DATA_1;
+		break;
+	case ES_LSR_DATA_1:
+		dev_dbg(&port->dev, "%s - lsr = 0x%02x, data = 0x%02x\n",
+				__func__, port_priv->lsr, *ch);
+		*flag = cp210x_process_lsr(port, port_priv->lsr);
+		port_priv->event_state = ES_DATA;
+		return false;
+	case ES_LSR:
+		dev_dbg(&port->dev, "%s - lsr = 0x%02x\n", __func__, *ch);
+		port_priv->lsr = *ch;
+		cp210x_process_lsr(port, port_priv->lsr);
+		port_priv->event_state = ES_DATA;
+		break;
+	case ES_MSR:
+		dev_dbg(&port->dev, "%s - msr = 0x%02x\n", __func__, *ch);
+		/* unimplemented */
+		port_priv->event_state = ES_DATA;
+		break;
+	}
+
+	return true;
+}
+
+static void cp210x_process_read_urb(struct urb *urb)
+{
+	struct usb_serial_port *port = urb->context;
+	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);
+	unsigned char *ch = urb->transfer_buffer;
+	char flag;
+	int i;
+
+	if (!urb->actual_length)
+		return;
+
+	if (!port_priv->event_mode && !(port->port.console && port->sysrq)) {
+		tty_insert_flip_string(&port->port, ch, urb->actual_length);
+	} else {
+		for (i = 0; i < urb->actual_length; i++, ch++) {
+			flag = TTY_NORMAL;
+
+			if (cp210x_process_event_char(port, ch, &flag))
+				continue;
+
+			if (!usb_serial_handle_sysrq_char(port, *ch))
+				tty_insert_flip_char(&port->port, *ch, flag);
+		}
+	}
+	tty_flip_buffer_push(&port->port);
 }
 
 /*
@@ -1148,6 +1306,41 @@ static void cp210x_change_speed(struct tty_struct *tty,
 	tty_encode_baud_rate(tty, baud, baud);
 }
 
+static void cp210x_enable_event_mode(struct usb_serial_port *port)
+{
+	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);
+	int ret;
+
+	if (port_priv->event_mode)
+		return;
+
+	port_priv->event_state = ES_DATA;
+	port_priv->event_mode = true;
+
+	ret = cp210x_write_u16_reg(port, CP210X_EMBED_EVENTS, CP210X_ESCCHAR);
+	if (ret) {
+		dev_err(&port->dev, "failed to enable events: %d\n", ret);
+		port_priv->event_mode = false;
+	}
+}
+
+static void cp210x_disable_event_mode(struct usb_serial_port *port)
+{
+	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);
+	int ret;
+
+	if (!port_priv->event_mode)
+		return;
+
+	ret = cp210x_write_u16_reg(port, CP210X_EMBED_EVENTS, 0);
+	if (ret) {
+		dev_err(&port->dev, "failed to disable events: %d\n", ret);
+		return;
+	}
+
+	port_priv->event_mode = false;
+}
+
 static void cp210x_set_termios(struct tty_struct *tty,
 		struct usb_serial_port *port, struct ktermios *old_termios)
 {
@@ -1270,6 +1463,14 @@ static void cp210x_set_termios(struct tty_struct *tty,
 				sizeof(flow_ctl));
 	}
 
+	/*
+	 * Enable event-insertion mode only if input parity checking is
+	 * enabled for now.
+	 */
+	if (I_INPCK(tty))
+		cp210x_enable_event_mode(port);
+	else
+		cp210x_disable_event_mode(port);
 }
 
 static int cp210x_tiocmset(struct tty_struct *tty,
-- 
2.26.2
Jerry July 3, 2020, 6:45 p.m. | #3
Johan Hovold wrote on 7/3/20 9:45 AM:
> Heh, yeah, it tends to be that way. :) But thanks for the great summary

> of your findings!

>

> I think it probably makes most sense to keep the error in as it's a

> quirk of the device rather than risk missing an actual overrun.

OK
> The problem here is that we're sort of violating the API and turning

> TIOCGICOUNT into a polling interface instead of just returning our error

> and interrupt counters. This means will always undercount any errors for

> a start.

>

> The chip appears to have a mechanism for reporting errors in-band, but

> that would amount to processing every character received to look for the

> escape char, which adds overhead. I'm guessing that interface would also

> insert an overrun error when returning the first character.

Yes, it is posible to use EMBED_EVENTS and chip will insert event codes 
into stream of data bytes. But it will cost some processing power and if 
transmitted data contains ESC char it will be escaped so it will cost some 
additional bandwidth. In the worst case (all received data = ESC) it means 
double bandwidth.

I have found such solution here:
https://github.com/fortian/cp210x/blob/master/cp210x.c
but it is quite complex and I expected argument that it costs too much 
(especially when using maximum baudrate) so I suggested simple way which 
doesn't have an impact until ioctl is called.
> But perhaps that it was we should be using as it allows parity the

> errors to be reported using the normal in-band methods. If we only

> enable it when parity checking is enabled then the overhead seems

> justified.

>

> I did a quick test here and the event insertion appears to work well for

> parity errors. Didn't manage to get it to report break events yet, and

> modem-status changes are being buffered and not reported until the next

> character. But in theory we could expand the implementation to provide

> more features later.

>

> I'll see if I can cook something up quickly.


>> I suppose that GET_COMM_STATUS reads and CLEAR pending error flags (not

>> explained in datasheet but behaves that way). So if some errors are

>> reported during cp210x_tx_empty() it can be either counted immediately or

>> lost. I'm not sure if cp210x_tx_empty() is called ONLY during close but

>> seems be more consistent to count every detected error regardless who calls

>> GET_COMM_STATUS.

> tx_empty() is called when waiting for data to be drained for example

> during close but also on when changing terminal settings with TCSADRAIN

> or on tcdrain().

But losing an error during tcdrain() is definitely wrong. It is common to 
send (write) some request, then call tcdrain to be sure that whole request 
was transmitted and then receive response. Calling tcdrain is necessary in 
combination with GPIO manipulation but it can also be useful to measure 
correct timeout because I need to know that data was already transmitted to 
target (it can take long time for slow baudrate). There is no reason to 
discard error flags during such waiting.

> Correct.

>

> It's quite possible that that's missing elsewhere, but at least those

> counters are updated in completion callbacks which do not run in

> parallel unlike ioctls(), which are not serialised.

>

> Johan


Jerry
Johan Hovold July 6, 2020, 7:51 a.m. | #4
On Fri, Jul 03, 2020 at 08:45:45PM +0200, Jerry wrote:
> Johan Hovold wrote on 7/3/20 9:45 AM:


> > The problem here is that we're sort of violating the API and turning

> > TIOCGICOUNT into a polling interface instead of just returning our error

> > and interrupt counters. This means will always undercount any errors for

> > a start.

> >

> > The chip appears to have a mechanism for reporting errors in-band, but

> > that would amount to processing every character received to look for the

> > escape char, which adds overhead. I'm guessing that interface would also

> > insert an overrun error when returning the first character.


> Yes, it is posible to use EMBED_EVENTS and chip will insert event codes 

> into stream of data bytes. But it will cost some processing power and if 

> transmitted data contains ESC char it will be escaped so it will cost some 

> additional bandwidth. In the worst case (all received data = ESC) it means 

> double bandwidth.


Yeah, but sending a stream of ESC chars isn't a particularly interesting
application anyway. ;)

> I have found such solution here:

> https://github.com/fortian/cp210x/blob/master/cp210x.c

> but it is quite complex and I expected argument that it costs too much 

> (especially when using maximum baudrate) so I suggested simple way which 

> doesn't have an impact until ioctl is called.


It will definitely have some overhead, but since it allows for proper
posix behaviour and things like break detection I think it's warranted
(but only when those features are requested of course).

It may be possible to get the best of both worlds here if we poll for
changes only if input parity checking is disabled. You get the
statistics and could still use the Linux-specific TIOCGICOUNT ioctl to
check for errors indirectly.

> > But perhaps that it was we should be using as it allows parity the

> > errors to be reported using the normal in-band methods. If we only

> > enable it when parity checking is enabled then the overhead seems

> > justified.

> >

> > I did a quick test here and the event insertion appears to work well for

> > parity errors. Didn't manage to get it to report break events yet, and

> > modem-status changes are being buffered and not reported until the next

> > character. But in theory we could expand the implementation to provide

> > more features later.

> >

> > I'll see if I can cook something up quickly.

> 

> >> I suppose that GET_COMM_STATUS reads and CLEAR pending error flags (not

> >> explained in datasheet but behaves that way). So if some errors are

> >> reported during cp210x_tx_empty() it can be either counted immediately or

> >> lost. I'm not sure if cp210x_tx_empty() is called ONLY during close but

> >> seems be more consistent to count every detected error regardless who calls

> >> GET_COMM_STATUS.


> > tx_empty() is called when waiting for data to be drained for example

> > during close but also on when changing terminal settings with TCSADRAIN

> > or on tcdrain().


> But losing an error during tcdrain() is definitely wrong. It is common to 

> send (write) some request, then call tcdrain to be sure that whole request 

> was transmitted and then receive response. Calling tcdrain is necessary in 

> combination with GPIO manipulation but it can also be useful to measure 

> correct timeout because I need to know that data was already transmitted to 

> target (it can take long time for slow baudrate). There is no reason to 

> discard error flags during such waiting.


Yes, you're right of course. Since comm-status request clears the flags
they need to be accounted for whenever it's used.

Johan
Johan Hovold July 6, 2020, 9:08 a.m. | #5
On Mon, Jun 22, 2020 at 05:13:41PM +0200, Jaromír Škorpil wrote:
> The current version of cp210x driver doesn't provide any way to detect

> a parity error in received data from userspace. Some serial protocols like

> STM32 bootloader protect data only by parity so application needs to

> know whether parity error happened to repeat peripheral data reading.

> 

> Added support for icount (ioctl TIOCGICOUNT) which sends GET_COMM_STATUS

> command to CP210X and according received flags increments fields for

> parity error, frame error, break and overrun. An application can detect

> an error condition after reading data from ttyUSB and reacts adequately.

> There is no impact for applications which don't call ioctl TIOCGICOUNT.

> 

> The flag "hardware overrun" is not examined because CP2102 sets this bit

> for the first received byte after openning of port which was previously

> closed with some unreaded data in buffer. This is confusing and checking

> "queue overrun" flag seems be enough.

> 

> Signed-off-by: Jaromír Škorpil <Jerry@jrr.cz>

> ---

> v2: Simplified counting - only queue overrun checked

> v3: Changed description + UTF8 name

> v4: Corrected endian


So let's go with this and then I can add support for in-band reporting
on top.

I was gonna apply it and the missing locking, but noticed that the patch
is white-space damaged. At least some leading tabs have been converted.

Try sending the patch yourself (e.g. using git-send-email) and make sure
you can apply it using git-am.

>   cp210x.c |   43 ++++++++++++++++++++++++++++++++++++++-----

>   1 file changed, 38 insertions(+), 5 deletions(-)

> 

> diff -up linux-5.8-rc1/drivers/usb/serial/cp210x.c j/drivers/usb/serial/cp210x.c

> --- linux-5.8-rc1/drivers/usb/serial/cp210x.c

> +++ j/drivers/usb/serial/cp210x.c

> @@ -43,6 +43,8 @@ static int cp210x_tiocmget(struct tty_st

>   static int cp210x_tiocmset(struct tty_struct *, unsigned int, unsigned int);

>   static int cp210x_tiocmset_port(struct usb_serial_port *port,

>   		unsigned int, unsigned int);

> +static int cp210x_get_icount(struct tty_struct *tty,

> +		struct serial_icounter_struct *icount);

>   static void cp210x_break_ctl(struct tty_struct *, int);

>   static int cp210x_attach(struct usb_serial *);

>   static void cp210x_disconnect(struct usb_serial *);

> @@ -274,6 +276,7 @@ static struct usb_serial_driver cp210x_d

>   	.tx_empty		= cp210x_tx_empty,

>   	.tiocmget		= cp210x_tiocmget,

>   	.tiocmset		= cp210x_tiocmset,

> +	.get_icount		= cp210x_get_icount,

>   	.attach			= cp210x_attach,

>   	.disconnect		= cp210x_disconnect,

>   	.release		= cp210x_release,

> @@ -393,6 +396,13 @@ struct cp210x_comm_status {

>   	u8       bReserved;

>   } __packed;

>   

> +/* cp210x_comm_status::ulErrors */

> +#define CP210X_SERIAL_ERR_BREAK	BIT(0)

> +#define CP210X_SERIAL_ERR_FRAME	BIT(1)

> +#define CP210X_SERIAL_ERR_HW_OVERRUN	BIT(2)

> +#define CP210X_SERIAL_ERR_QUEUE_OVERRUN	BIT(3)

> +#define CP210X_SERIAL_ERR_PARITY	BIT(4)


Can you drop the SERIAL_ infix here?

> +

>   /*

>    * CP210X_PURGE - 16 bits passed in wValue of USB request.

>    * SiLabs app note AN571 gives a strange description of the 4 bits:

> @@ -836,10 +846,10 @@ static void cp210x_close(struct usb_seri

>   }

>   

>   /*

> - * Read how many bytes are waiting in the TX queue.

> + * Read how many bytes are waiting in the TX queue and update error counters.

>    */

> -static int cp210x_get_tx_queue_byte_count(struct usb_serial_port *port,

> -		u32 *count)

> +static int cp210x_get_comm_status(struct usb_serial_port *port,

> +		u32 *tx_count)

>   {

>   	struct usb_serial *serial = port->serial;

>   	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);

> @@ -855,7 +865,17 @@ static int cp210x_get_tx_queue_byte_coun

>   			0, port_priv->bInterfaceNumber, sts, sizeof(*sts),

>   			USB_CTRL_GET_TIMEOUT);

>   	if (result == sizeof(*sts)) {

> -		*count = le32_to_cpu(sts->ulAmountInOutQueue);

> +		u32 flags = le32_to_cpu(sts->ulErrors);


Here should be a newline.

> +		if (flags & CP210X_SERIAL_ERR_BREAK)

> +			port->icount.brk++;

> +		if (flags & CP210X_SERIAL_ERR_FRAME)

> +			port->icount.frame++;

> +		if (flags & CP210X_SERIAL_ERR_QUEUE_OVERRUN)

> +			port->icount.overrun++;

> +		if (flags & CP210X_SERIAL_ERR_PARITY)

> +			port->icount.parity++;


And these icount increments should be done under the port->lock as
TIOCGICOUNT can be called in parallel.

> +		if (tx_count)

> +			*tx_count = le32_to_cpu(sts->ulAmountInOutQueue);

>   		result = 0;

>   	} else {

>   		dev_err(&port->dev, "failed to get comm status: %d\n", result);

> @@ -873,13 +893,26 @@ static bool cp210x_tx_empty(struct usb_s

>   	int err;

>   	u32 count;

>   

> -	err = cp210x_get_tx_queue_byte_count(port, &count);

> +	err = cp210x_get_comm_status(port, &count);

>   	if (err)

>   		return true;

>   

>   	return !count;

>   }

>   

> +static int cp210x_get_icount(struct tty_struct *tty,

> +		struct serial_icounter_struct *icount)

> +{

> +	struct usb_serial_port *port = tty->driver_data;

> +	int result;

> +

> +	result = cp210x_get_comm_status(port, NULL);

> +	if (result)

> +		return result;

> +

> +	return usb_serial_generic_get_icount(tty, icount);

> +}

> +

>   /*

>    * cp210x_get_termios

>    * Reads the baud rate, data bits, parity, stop bits and flow control mode

> 

> 

> 


Johan
Jerry July 6, 2020, 11:47 a.m. | #6
Johan Hovold wrote on 7/3/20 5:01 PM:
> On Fri, Jul 03, 2020 at 09:45:39AM +0200, Johan Hovold wrote:

> Would you mind giving the below a try and see how that works in your

> setup?

>

> With this patch you'd be able to use PARMRK to be notified of any parity

> errors, but I also wired up TIOCGICOUNT if you prefer to use that.

>

> Also, could try and see if your device detects breaks properly? Mine

> just return a NUL char.

>

> Johan

>

I tried your patch. It detects the parity error correctly for my 
application. Unfortunately I'm not able currently to verify a break 
reception due to holiday, I'll probably try it next week when back at work 
(I need to recompile the device firmware).

An interesting thing that your patch don't report any overrun. I'm not able 
to create a real overrun (any idea?) but it doesn't report any fake overrun 
compared to GET_COMM_STATUS.

The question could be a value of CP210X_ESCCHAR. You selected 0xFF and this 
value can be quite common in received data because an erased flash memory 
is full of 0xFF. When I read an empty memory it means double USB bandwidth 
so for example 0xFE seems be better... but I understand that it depend on 
application and it is hard to globally select this value.

I'll do some more tests but your solution seems be fully acceptable for my 
needs. For me TIOCGICOUNT is enough (I just resend request when an error 
detected) but for other situation it would be very nice to know exactly 
which byte was malformed through PARMRK.

Jerry
Johan Hovold July 6, 2020, 1:59 p.m. | #7
On Mon, Jul 06, 2020 at 01:47:50PM +0200, Jerry wrote:
> Johan Hovold wrote on 7/3/20 5:01 PM:

> > On Fri, Jul 03, 2020 at 09:45:39AM +0200, Johan Hovold wrote:

> > Would you mind giving the below a try and see how that works in your

> > setup?

> >

> > With this patch you'd be able to use PARMRK to be notified of any parity

> > errors, but I also wired up TIOCGICOUNT if you prefer to use that.

> >

> > Also, could try and see if your device detects breaks properly? Mine

> > just return a NUL char.


> I tried your patch. It detects the parity error correctly for my 

> application. Unfortunately I'm not able currently to verify a break 

> reception due to holiday, I'll probably try it next week when back at work 

> (I need to recompile the device firmware).


Cool, thanks.

I noticed that I can get comm-status to report a break condition when
event-insertion mode is disabled, but it just results in a NUL char in
event mode (the error flag isn't set then).

Perhaps buggy firmware, unless there's some other command for masking
events. Haven't quite understood how the EVENTMASK requests are supposed
to be used. Replacing the break char (SET_CHAR) didn't help at least.

> An interesting thing that your patch don't report any overrun. I'm not able 

> to create a real overrun (any idea?) but it doesn't report any fake overrun 

> compared to GET_COMM_STATUS.


Yeah, I noticed that too, although I had a feeling the fake overrun
didn't even always show up when sending while closed here either. 

Not sure how best to trigger an overrun since we rely on the read urb to
report it. Perhaps pausing the read urbs, filling up the device fifo and
then submitting the urbs again could work? Would need to patch the
driver quite a bit for that though.

> The question could be a value of CP210X_ESCCHAR. You selected 0xFF and this 

> value can be quite common in received data because an erased flash memory 

> is full of 0xFF. When I read an empty memory it means double USB bandwidth 

> so for example 0xFE seems be better... but I understand that it depend on 

> application and it is hard to globally select this value.


Good point! I think we should definitely avoid 0xff.

> I'll do some more tests but your solution seems be fully acceptable for my 

> needs. For me TIOCGICOUNT is enough (I just resend request when an error 

> detected) but for other situation it would be very nice to know exactly 

> which byte was malformed through PARMRK.


That's good to hear. I'll respin the generic (event-based) solution
then.

Thanks,
Johan
Jerry July 8, 2020, 9:05 p.m. | #8
Johan Hovold wrote on 7/6/20 3:59 PM:
> On Mon, Jul 06, 2020 at 01:47:50PM +0200, Jerry wrote:

>> Johan Hovold wrote on 7/3/20 5:01 PM:

>>> Also, could try and see if your device detects breaks properly? Mine

>>> just return a NUL char.

I've done some experiments with CP2102 receiving a break.
It seems that chip always receives 0x00 for the start of break (with 
correct parity when even parity set, wrong for odd parity) and later 
(probably after 250 ms) it also sets break flag in GET_COMM_STATUS.
I don't see any indication of the break event in data. I tried to change 
some things in your solution but without success.

I also haven't ever seen Frame error (neither way). I tried several ways 
(different tx/rx baudrate, receive a parity data without parity enabled, 
generating shorter breaks) and I suppose that CP2102 can't indicate framing 
error.

Luckily I haven't found any problem with parity checking.  :-)

> I noticed that I can get comm-status to report a break condition when

> event-insertion mode is disabled, but it just results in a NUL char in

> event mode (the error flag isn't set then).

>

> Perhaps buggy firmware, unless there's some other command for masking

> events. Haven't quite understood how the EVENTMASK requests are supposed

> to be used. Replacing the break char (SET_CHAR) didn't help at least.

Neither I understand EVENTMASK in AN571 but I suppose that it is used for 
W32 API WaitCommEvent() because of very similar flag list
https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-waitcommevent
maybe somebody else can explain possible usage...
>> An interesting thing that your patch don't report any overrun. I'm not able

>> to create a real overrun (any idea?) but it doesn't report any fake overrun

>> compared to GET_COMM_STATUS.

> Yeah, I noticed that too, although I had a feeling the fake overrun

> didn't even always show up when sending while closed here either.

You are right, the fake HW overrun in GET_COMM_STATUS isn't reported always 
but very often.
>

> Thanks,

> Johan


Jerry
Johan Hovold July 13, 2020, 10:54 a.m. | #9
On Wed, Jul 08, 2020 at 11:05:29PM +0200, Jerry wrote:
> Johan Hovold wrote on 7/6/20 3:59 PM:

> > On Mon, Jul 06, 2020 at 01:47:50PM +0200, Jerry wrote:

> >> Johan Hovold wrote on 7/3/20 5:01 PM:

> >>> Also, could try and see if your device detects breaks properly? Mine

> >>> just return a NUL char.


> I've done some experiments with CP2102 receiving a break.

> It seems that chip always receives 0x00 for the start of break (with 

> correct parity when even parity set, wrong for odd parity) and later 

> (probably after 250 ms) it also sets break flag in GET_COMM_STATUS.

> I don't see any indication of the break event in data. I tried to change 

> some things in your solution but without success.


Ok, thanks for testing! The SERIAL_BREAK_CHAR in ulFlowReplace probably
needs to be set for breaks to be reported in-band, but unfortunately
that doesn't seem to have any effect on CP2102.

> I also haven't ever seen Frame error (neither way). I tried several ways 

> (different tx/rx baudrate, receive a parity data without parity enabled, 

> generating shorter breaks) and I suppose that CP2102 can't indicate framing 

> error.

> 

> Luckily I haven't found any problem with parity checking.  :-)


That's good.

I've been giving this some more thought and decided that it's probably
best not to extend TIOCGICOUNT with a COMM_STATUS request after all.

TIOCGICOUNT is really only supposed to be used to retrieve the
modem-status interrupts in concert with TIOCMIWAIT, but the ioctl was
later amended with some error statistics as well. As I mentioned before,
the ioctl is linux-specific and the statistics counter are mostly
undocumented and the behaviour varies from driver to driver. And don't
think adding another device-specific implementation which essentially
polls for errors (rather than counts them) is a good idea.

Since you confirmed that the event based implementation works for your
use case I think we should stick to that as it's allows for the normal
POSIX mechanisms for detecting parity errors.

Johan

Patch

diff -up linux-5.8-rc1/drivers/usb/serial/cp210x.c j/drivers/usb/serial/cp210x.c
--- linux-5.8-rc1/drivers/usb/serial/cp210x.c
+++ j/drivers/usb/serial/cp210x.c
@@ -43,6 +43,8 @@  static int cp210x_tiocmget(struct tty_st
  static int cp210x_tiocmset(struct tty_struct *, unsigned int, unsigned int);
  static int cp210x_tiocmset_port(struct usb_serial_port *port,
  		unsigned int, unsigned int);
+static int cp210x_get_icount(struct tty_struct *tty,
+		struct serial_icounter_struct *icount);
  static void cp210x_break_ctl(struct tty_struct *, int);
  static int cp210x_attach(struct usb_serial *);
  static void cp210x_disconnect(struct usb_serial *);
@@ -274,6 +276,7 @@  static struct usb_serial_driver cp210x_d
  	.tx_empty		= cp210x_tx_empty,
  	.tiocmget		= cp210x_tiocmget,
  	.tiocmset		= cp210x_tiocmset,
+	.get_icount		= cp210x_get_icount,
  	.attach			= cp210x_attach,
  	.disconnect		= cp210x_disconnect,
  	.release		= cp210x_release,
@@ -393,6 +396,13 @@  struct cp210x_comm_status {
  	u8       bReserved;
  } __packed;
  
+/* cp210x_comm_status::ulErrors */
+#define CP210X_SERIAL_ERR_BREAK	BIT(0)
+#define CP210X_SERIAL_ERR_FRAME	BIT(1)
+#define CP210X_SERIAL_ERR_HW_OVERRUN	BIT(2)
+#define CP210X_SERIAL_ERR_QUEUE_OVERRUN	BIT(3)
+#define CP210X_SERIAL_ERR_PARITY	BIT(4)
+
  /*
   * CP210X_PURGE - 16 bits passed in wValue of USB request.
   * SiLabs app note AN571 gives a strange description of the 4 bits:
@@ -836,10 +846,10 @@  static void cp210x_close(struct usb_seri
  }
  
  /*
- * Read how many bytes are waiting in the TX queue.
+ * Read how many bytes are waiting in the TX queue and update error counters.
   */
-static int cp210x_get_tx_queue_byte_count(struct usb_serial_port *port,
-		u32 *count)
+static int cp210x_get_comm_status(struct usb_serial_port *port,
+		u32 *tx_count)
  {
  	struct usb_serial *serial = port->serial;
  	struct cp210x_port_private *port_priv = usb_get_serial_port_data(port);
@@ -855,7 +865,17 @@  static int cp210x_get_tx_queue_byte_coun
  			0, port_priv->bInterfaceNumber, sts, sizeof(*sts),
  			USB_CTRL_GET_TIMEOUT);
  	if (result == sizeof(*sts)) {
-		*count = le32_to_cpu(sts->ulAmountInOutQueue);
+		u32 flags = le32_to_cpu(sts->ulErrors);
+		if (flags & CP210X_SERIAL_ERR_BREAK)
+			port->icount.brk++;
+		if (flags & CP210X_SERIAL_ERR_FRAME)
+			port->icount.frame++;
+		if (flags & CP210X_SERIAL_ERR_QUEUE_OVERRUN)
+			port->icount.overrun++;
+		if (flags & CP210X_SERIAL_ERR_PARITY)
+			port->icount.parity++;
+		if (tx_count)
+			*tx_count = le32_to_cpu(sts->ulAmountInOutQueue);
  		result = 0;
  	} else {
  		dev_err(&port->dev, "failed to get comm status: %d\n", result);
@@ -873,13 +893,26 @@  static bool cp210x_tx_empty(struct usb_s
  	int err;
  	u32 count;
  
-	err = cp210x_get_tx_queue_byte_count(port, &count);
+	err = cp210x_get_comm_status(port, &count);
  	if (err)
  		return true;
  
  	return !count;
  }
  
+static int cp210x_get_icount(struct tty_struct *tty,
+		struct serial_icounter_struct *icount)
+{
+	struct usb_serial_port *port = tty->driver_data;
+	int result;
+
+	result = cp210x_get_comm_status(port, NULL);
+	if (result)
+		return result;
+
+	return usb_serial_generic_get_icount(tty, icount);
+}
+
  /*
   * cp210x_get_termios
   * Reads the baud rate, data bits, parity, stop bits and flow control mode