Message ID | 20240610152420.v4.2.I65a6430ab75f74d20c28b5c5f819dd5b8455933d@changeid |
---|---|
State | New |
Headers | show |
Series | serial: qcom-geni: Overhaul TX handling to fix crashes/hangs | expand |
On Mon, 10 Jun 2024, Douglas Anderson wrote: > The current uart_fifo_timeout() returns jiffies, which is not always > the most convenient for callers. Add a variant uart_fifo_timeout_ms() > that returns the timeout in milliseconds. > > NOTES: > - msecs_to_jiffies() rounds up, unlike nsecs_to_jiffies(). This is > because msecs_to_jiffies() is actually intended for device drivers > to calculate timeout value. This means we don't need to take the max > of the timeout and "1" since the timeout will always be > 0 ms (we > add 20 ms of slop). > - uart_fifo_timeout_ms() returns "unsigned int" but we leave > uart_fifo_timeout() returning "unsigned long". This matches the > types of msecs_to_jiffies(). > > Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > > Changes in v4: > - New > > include/linux/serial_core.h | 15 +++++++++++---- > 1 file changed, 11 insertions(+), 4 deletions(-) > > diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h > index 8cb65f50e830..97968acfd564 100644 > --- a/include/linux/serial_core.h > +++ b/include/linux/serial_core.h > @@ -889,14 +889,21 @@ unsigned int uart_get_divisor(struct uart_port *port, unsigned int baud); > /* > * Calculates FIFO drain time. > */ > -static inline unsigned long uart_fifo_timeout(struct uart_port *port) > +static inline unsigned int uart_fifo_timeout_ms(struct uart_port *port) > { > u64 fifo_timeout = (u64)READ_ONCE(port->frame_time) * port->fifosize; > + unsigned int fifo_timeout_ms = div_u64(fifo_timeout, NSEC_PER_MSEC); > > - /* Add .02 seconds of slop */ > - fifo_timeout += 20 * NSEC_PER_MSEC; > + /* > + * Add .02 seconds of slop. This also helps account for the fact that > + * when we converted from ns to ms that we didn't round up. > + */ > + return fifo_timeout_ms + 20; > +} > > - return max(nsecs_to_jiffies(fifo_timeout), 1UL); > +static inline unsigned long uart_fifo_timeout(struct uart_port *port) > +{ > + return msecs_to_jiffies(uart_fifo_timeout_ms(port)); > } Hi, This is definitely towards the right direction! However, it now does double conversion, first div_u64() and then msecs_to_jiffies(). Perhaps it would be better to retain the nsecs version (maybe rename it to _ns for consistency) and add _ms variant that does the nsec -> msec conversion.
Hi, On Wed, Jun 12, 2024 at 12:38 AM Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> wrote: > > On Mon, 10 Jun 2024, Douglas Anderson wrote: > > > The current uart_fifo_timeout() returns jiffies, which is not always > > the most convenient for callers. Add a variant uart_fifo_timeout_ms() > > that returns the timeout in milliseconds. > > > > NOTES: > > - msecs_to_jiffies() rounds up, unlike nsecs_to_jiffies(). This is > > because msecs_to_jiffies() is actually intended for device drivers > > to calculate timeout value. This means we don't need to take the max > > of the timeout and "1" since the timeout will always be > 0 ms (we > > add 20 ms of slop). > > - uart_fifo_timeout_ms() returns "unsigned int" but we leave > > uart_fifo_timeout() returning "unsigned long". This matches the > > types of msecs_to_jiffies(). > > > > Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > --- > > > > Changes in v4: > > - New > > > > include/linux/serial_core.h | 15 +++++++++++---- > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h > > index 8cb65f50e830..97968acfd564 100644 > > --- a/include/linux/serial_core.h > > +++ b/include/linux/serial_core.h > > @@ -889,14 +889,21 @@ unsigned int uart_get_divisor(struct uart_port *port, unsigned int baud); > > /* > > * Calculates FIFO drain time. > > */ > > -static inline unsigned long uart_fifo_timeout(struct uart_port *port) > > +static inline unsigned int uart_fifo_timeout_ms(struct uart_port *port) > > { > > u64 fifo_timeout = (u64)READ_ONCE(port->frame_time) * port->fifosize; > > + unsigned int fifo_timeout_ms = div_u64(fifo_timeout, NSEC_PER_MSEC); > > > > - /* Add .02 seconds of slop */ > > - fifo_timeout += 20 * NSEC_PER_MSEC; > > + /* > > + * Add .02 seconds of slop. This also helps account for the fact that > > + * when we converted from ns to ms that we didn't round up. > > + */ > > + return fifo_timeout_ms + 20; > > +} > > > > - return max(nsecs_to_jiffies(fifo_timeout), 1UL); > > +static inline unsigned long uart_fifo_timeout(struct uart_port *port) > > +{ > > + return msecs_to_jiffies(uart_fifo_timeout_ms(port)); > > } > > Hi, > > This is definitely towards the right direction! However, it now does > double conversion, first div_u64() and then msecs_to_jiffies(). Perhaps it > would be better to retain the nsecs version (maybe rename it to _ns for > consistency) and add _ms variant that does the nsec -> msec conversion. I spent a bit of time thinking about it and I don't agree. If you feel very strongly about it or someone else wants to jump in and break the tie then I can look again, but: 1. The comment before nsecs_to_jiffies() specifically states that it's not supposed to be used for this purpose. Specifically, it says: * Unlike {m,u}secs_to_jiffies, type of input is not unsigned int but u64. * And this doesn't return MAX_JIFFY_OFFSET since this function is designed * for scheduler, not for use in device drivers to calculate timeout value. ...so switching away from nsecs_to_jiffies() to msecs_to_jiffies() is arguably a "bugfix", or at least avoids using the API in a way that's against the documentation. 2. As mentioned in the commit message, nsecs_to_jiffies() truncates where msecs_to_jiffies() rounds up. Presumably this difference is related to the comment above where the "ns" version is intended for the scheduler. Using the "ms" version allows us to get rid of the extra call to "max()" which is a benefit. Technically since the timeout is at least 20 ms the minimum HZ is 100 I guess we didn't need the max anyway, but I guess someone thought it was cleaner and now we can definitely get rid of it. 3. These functions are inline anyway, so I don't think it's causing a huge bloat of instructions. In fact, moving from 64-bit math to 32-bit math sooner could make the code smaller. 4. I don't feel like it hurts the readability to convert down to ms and then to jiffies. In fact, IMO it helps since it makes it more obvious that we're working with ms.
On Wed, 12 Jun 2024, Doug Anderson wrote: > On Wed, Jun 12, 2024 at 12:38 AM Ilpo Järvinen > <ilpo.jarvinen@linux.intel.com> wrote: > > > > On Mon, 10 Jun 2024, Douglas Anderson wrote: > > > > > The current uart_fifo_timeout() returns jiffies, which is not always > > > the most convenient for callers. Add a variant uart_fifo_timeout_ms() > > > that returns the timeout in milliseconds. > > > > > > NOTES: > > > - msecs_to_jiffies() rounds up, unlike nsecs_to_jiffies(). This is > > > because msecs_to_jiffies() is actually intended for device drivers > > > to calculate timeout value. This means we don't need to take the max > > > of the timeout and "1" since the timeout will always be > 0 ms (we > > > add 20 ms of slop). > > > - uart_fifo_timeout_ms() returns "unsigned int" but we leave > > > uart_fifo_timeout() returning "unsigned long". This matches the > > > types of msecs_to_jiffies(). > > > > > > Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > > --- > > > > > > Changes in v4: > > > - New > > > > > > include/linux/serial_core.h | 15 +++++++++++---- > > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > > > diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h > > > index 8cb65f50e830..97968acfd564 100644 > > > --- a/include/linux/serial_core.h > > > +++ b/include/linux/serial_core.h > > > @@ -889,14 +889,21 @@ unsigned int uart_get_divisor(struct uart_port *port, unsigned int baud); > > > /* > > > * Calculates FIFO drain time. > > > */ > > > -static inline unsigned long uart_fifo_timeout(struct uart_port *port) > > > +static inline unsigned int uart_fifo_timeout_ms(struct uart_port *port) > > > { > > > u64 fifo_timeout = (u64)READ_ONCE(port->frame_time) * port->fifosize; > > > + unsigned int fifo_timeout_ms = div_u64(fifo_timeout, NSEC_PER_MSEC); > > > > > > - /* Add .02 seconds of slop */ > > > - fifo_timeout += 20 * NSEC_PER_MSEC; > > > + /* > > > + * Add .02 seconds of slop. This also helps account for the fact that > > > + * when we converted from ns to ms that we didn't round up. > > > + */ > > > + return fifo_timeout_ms + 20; > > > +} > > > > > > - return max(nsecs_to_jiffies(fifo_timeout), 1UL); > > > +static inline unsigned long uart_fifo_timeout(struct uart_port *port) > > > +{ > > > + return msecs_to_jiffies(uart_fifo_timeout_ms(port)); > > > } > > > > Hi, > > > > This is definitely towards the right direction! However, it now does > > double conversion, first div_u64() and then msecs_to_jiffies(). Perhaps it > > would be better to retain the nsecs version (maybe rename it to _ns for > > consistency) and add _ms variant that does the nsec -> msec conversion. > > I spent a bit of time thinking about it and I don't agree. If you feel > very strongly about it or someone else wants to jump in and break the > tie then I can look again, but: > > 1. The comment before nsecs_to_jiffies() specifically states that it's > not supposed to be used for this purpose. Specifically, it says: > > * Unlike {m,u}secs_to_jiffies, type of input is not unsigned int but u64. > * And this doesn't return MAX_JIFFY_OFFSET since this function is designed > * for scheduler, not for use in device drivers to calculate timeout value. > > ...so switching away from nsecs_to_jiffies() to msecs_to_jiffies() is > arguably a "bugfix", or at least avoids using the API in a way that's > against the documentation. Okay, I see. However, there's no way around using u64 here even with your version that does not use nsecs_to_jiffies() because nsecs is the most useful form of input when starting from frame_time, usecs is a bit coarse-grained for higher data rates. > 2. As mentioned in the commit message, nsecs_to_jiffies() truncates > where msecs_to_jiffies() rounds up. Presumably this difference is > related to the comment above where the "ns" version is intended for > the scheduler. Using the "ms" version allows us to get rid of the > extra call to "max()" which is a benefit. Technically since the > timeout is at least 20 ms the minimum HZ is 100 I guess we didn't need > the max anyway, but I guess someone thought it was cleaner and now we > can definitely get rid of it. > > 3. These functions are inline anyway, so I don't think it's causing a > huge bloat of instructions. In fact, moving from 64-bit math to 32-bit > math sooner could make the code smaller. > > 4. I don't feel like it hurts the readability to convert down to ms > and then to jiffies. In fact, IMO it helps since it makes it more > obvious that we're working with ms. I'd be lying if I'd say I feel strongly about it but my only argument involves doing an extra divide which is somewhat costly. It's a plain 32-bit divide though so not as bad as the u64 one that is unavoidable.
Hi, On Wed, Jun 12, 2024 at 11:56 PM Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> wrote: > > On Wed, 12 Jun 2024, Doug Anderson wrote: > > On Wed, Jun 12, 2024 at 12:38 AM Ilpo Järvinen > > <ilpo.jarvinen@linux.intel.com> wrote: > > > > > > On Mon, 10 Jun 2024, Douglas Anderson wrote: > > > > > > > The current uart_fifo_timeout() returns jiffies, which is not always > > > > the most convenient for callers. Add a variant uart_fifo_timeout_ms() > > > > that returns the timeout in milliseconds. > > > > > > > > NOTES: > > > > - msecs_to_jiffies() rounds up, unlike nsecs_to_jiffies(). This is > > > > because msecs_to_jiffies() is actually intended for device drivers > > > > to calculate timeout value. This means we don't need to take the max > > > > of the timeout and "1" since the timeout will always be > 0 ms (we > > > > add 20 ms of slop). > > > > - uart_fifo_timeout_ms() returns "unsigned int" but we leave > > > > uart_fifo_timeout() returning "unsigned long". This matches the > > > > types of msecs_to_jiffies(). > > > > > > > > Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> > > > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > > > --- > > > > > > > > Changes in v4: > > > > - New > > > > > > > > include/linux/serial_core.h | 15 +++++++++++---- > > > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h > > > > index 8cb65f50e830..97968acfd564 100644 > > > > --- a/include/linux/serial_core.h > > > > +++ b/include/linux/serial_core.h > > > > @@ -889,14 +889,21 @@ unsigned int uart_get_divisor(struct uart_port *port, unsigned int baud); > > > > /* > > > > * Calculates FIFO drain time. > > > > */ > > > > -static inline unsigned long uart_fifo_timeout(struct uart_port *port) > > > > +static inline unsigned int uart_fifo_timeout_ms(struct uart_port *port) > > > > { > > > > u64 fifo_timeout = (u64)READ_ONCE(port->frame_time) * port->fifosize; > > > > + unsigned int fifo_timeout_ms = div_u64(fifo_timeout, NSEC_PER_MSEC); > > > > > > > > - /* Add .02 seconds of slop */ > > > > - fifo_timeout += 20 * NSEC_PER_MSEC; > > > > + /* > > > > + * Add .02 seconds of slop. This also helps account for the fact that > > > > + * when we converted from ns to ms that we didn't round up. > > > > + */ > > > > + return fifo_timeout_ms + 20; > > > > +} > > > > > > > > - return max(nsecs_to_jiffies(fifo_timeout), 1UL); > > > > +static inline unsigned long uart_fifo_timeout(struct uart_port *port) > > > > +{ > > > > + return msecs_to_jiffies(uart_fifo_timeout_ms(port)); > > > > } > > > > > > Hi, > > > > > > This is definitely towards the right direction! However, it now does > > > double conversion, first div_u64() and then msecs_to_jiffies(). Perhaps it > > > would be better to retain the nsecs version (maybe rename it to _ns for > > > consistency) and add _ms variant that does the nsec -> msec conversion. > > > > I spent a bit of time thinking about it and I don't agree. If you feel > > very strongly about it or someone else wants to jump in and break the > > tie then I can look again, but: > > > > 1. The comment before nsecs_to_jiffies() specifically states that it's > > not supposed to be used for this purpose. Specifically, it says: > > > > * Unlike {m,u}secs_to_jiffies, type of input is not unsigned int but u64. > > * And this doesn't return MAX_JIFFY_OFFSET since this function is designed > > * for scheduler, not for use in device drivers to calculate timeout value. > > > > ...so switching away from nsecs_to_jiffies() to msecs_to_jiffies() is > > arguably a "bugfix", or at least avoids using the API in a way that's > > against the documentation. > > Okay, I see. However, there's no way around using u64 here even with your > version that does not use nsecs_to_jiffies() because nsecs is the most > useful form of input when starting from frame_time, usecs is a bit > coarse-grained for higher data rates. Right. We have to start with u64 because the frame time is in ns and we can only fit ~4 seconds worth of ns in 32-bits. That seems iffy. > > 2. As mentioned in the commit message, nsecs_to_jiffies() truncates > > where msecs_to_jiffies() rounds up. Presumably this difference is > > related to the comment above where the "ns" version is intended for > > the scheduler. Using the "ms" version allows us to get rid of the > > extra call to "max()" which is a benefit. Technically since the > > timeout is at least 20 ms the minimum HZ is 100 I guess we didn't need > > the max anyway, but I guess someone thought it was cleaner and now we > > can definitely get rid of it. > > > > 3. These functions are inline anyway, so I don't think it's causing a > > huge bloat of instructions. In fact, moving from 64-bit math to 32-bit > > math sooner could make the code smaller. > > > > 4. I don't feel like it hurts the readability to convert down to ms > > and then to jiffies. In fact, IMO it helps since it makes it more > > obvious that we're working with ms. > > I'd be lying if I'd say I feel strongly about it Fair enough. If someone wants to throw in an opinion and tiebreak then they can. > but my only argument > involves doing an extra divide which is somewhat costly. It's a > plain 32-bit divide though so not as bad as the u64 one that is > unavoidable. We shouldn't be calling this in a loop anyway, so it's unlikely to matter. In any case, I'd note that with the old code we had: 1. 64-bit multiply (time * fifosize) 2. 64-bit addition (result + 20ms) 3. 64-bit => 32-bit division (to jiffies) 4. 32-bit comparison against the value 1. 5. Conditional setting of the value to 1. Now we have: 1. 64-bit multiply (time * fifosize) 2. 64-bit => 32-bit division (to ms) 3. 32-bit addition with a small immediate (20) 4. 32-bit addition (div round up) if HZ != 1000 5. 32-bit division (div round up) if HZ != 1000 I didn't try disassembling to see what the compiler did and it would be different for each compiler / ISA / optimization level / value of HZ, but I guess my point is that while we have one more divide (unless HZ == 1000) we may have one less conditional. We're also tending to do our math with small immediates which some ISAs can handle more efficiently. I think the real answer, though, is that this doesn't really matter and that we should pick the solution that's cleaner/easier to understand. I'm still in favor of the patch as it is. As I said, if folks feel really strongly then it doesn't matter and I can change it, but otherwise I'd rather keep it the way it is. -Doug
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h index 8cb65f50e830..97968acfd564 100644 --- a/include/linux/serial_core.h +++ b/include/linux/serial_core.h @@ -889,14 +889,21 @@ unsigned int uart_get_divisor(struct uart_port *port, unsigned int baud); /* * Calculates FIFO drain time. */ -static inline unsigned long uart_fifo_timeout(struct uart_port *port) +static inline unsigned int uart_fifo_timeout_ms(struct uart_port *port) { u64 fifo_timeout = (u64)READ_ONCE(port->frame_time) * port->fifosize; + unsigned int fifo_timeout_ms = div_u64(fifo_timeout, NSEC_PER_MSEC); - /* Add .02 seconds of slop */ - fifo_timeout += 20 * NSEC_PER_MSEC; + /* + * Add .02 seconds of slop. This also helps account for the fact that + * when we converted from ns to ms that we didn't round up. + */ + return fifo_timeout_ms + 20; +} - return max(nsecs_to_jiffies(fifo_timeout), 1UL); +static inline unsigned long uart_fifo_timeout(struct uart_port *port) +{ + return msecs_to_jiffies(uart_fifo_timeout_ms(port)); } /* Base timer interval for polling */
The current uart_fifo_timeout() returns jiffies, which is not always the most convenient for callers. Add a variant uart_fifo_timeout_ms() that returns the timeout in milliseconds. NOTES: - msecs_to_jiffies() rounds up, unlike nsecs_to_jiffies(). This is because msecs_to_jiffies() is actually intended for device drivers to calculate timeout value. This means we don't need to take the max of the timeout and "1" since the timeout will always be > 0 ms (we add 20 ms of slop). - uart_fifo_timeout_ms() returns "unsigned int" but we leave uart_fifo_timeout() returning "unsigned long". This matches the types of msecs_to_jiffies(). Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> --- Changes in v4: - New include/linux/serial_core.h | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)