mbox series

[v2,0/2] serial: 8250: Fixes for Oxford Semiconductor 950 UARTs

Message ID alpine.DEB.2.21.2106260539240.37803@angie.orcam.me.uk
Headers show
Series serial: 8250: Fixes for Oxford Semiconductor 950 UARTs | expand

Message

Maciej W. Rozycki June 26, 2021, 4:11 a.m. UTC
Hi,

 Here's v2 of the outstanding fixes for Oxford Semiconductor 950 UARTs, 
comprising an chip-specific clock handling implementation for accurate 
baud rate selection including higher baud rates and FIFO rx trigger level 
configuration.

 NB I have since had an opportunity to check what transceiver chips have 
been used with the option card I have been experimenting with, and they've 
turned out to be the Zywyn ZT3243F, specified for up to 1Mbps, so no 
surprise they eventually gave up as the rates I have tried increased, and 
quite interesting it was only at 4Mbps that they did it.

 Please apply.

  Maciej

Comments

andy@surfacebook.localdomain July 12, 2021, 8:15 p.m. UTC | #1
Sat, Jun 26, 2021 at 06:11:32AM +0200, Maciej W. Rozycki kirjoitti:
> Oxford Semiconductor PCIe (Tornado) serial port devices are driven by a

> fixed 62.5MHz clock input derived from the 100MHz PCI Express clock.

> 

> We currently drive the device using its default oversampling rate of 16 

> and the clock prescaler disabled, consequently yielding the baud base of 

> 3906250.  This base is inadequate for some of the high-speed baud rates 

> such as 460800bps, for which the closest rate possible can be obtained 

> by dividing the baud base by 8, yielding the baud rate of 488281.25bps, 

> which is off by 5.9638%.  This is enough for data communication to break 

> with the remote end talking actual 460800bps where missed stop bits have 

> been observed.

> 

> We can do better however, by taking advantage of a reduced oversampling 

> rate, which can be set to any integer value from 4 to 16 inclusive by 

> programming the TCR register, and by using the clock prescaler, which 

> can be set to any value from 1 to 63.875 in increments of 0.125 in the 

> CPR/CPR2 register pair[1][2][3][4].  The prescaler has to be explicitly 

> enabled though by setting bit 7 in the MCR or otherwise it is bypassed 

> as if the value of 1 was used.

> 

> By using these parameters rates from 15625000bps down to 1bps can be 

> obtained, with either exact or highly-accurate actual bit rates for 

> standard and many non-standard rates.

> 

> Make use of these features then as follows:

> 

> - Set the baud base to 15625000, reflecting the minimum oversampling 

>   rate of 4 with the clock prescaler and divisor both set to 1.

> 

> - Set the MCR mask and force bits in the UART template so as to have 

>   MCR[7] always set and then have 8250 core propagate those settings, if 

>   supplied as non-zero, overriding the ALPHA_KLUDGE_MCR default.

> 

> - Override the `get_divisor' handler and determine a good combination of

>   parameters by using a lookup table with predetermined value pairs of 

>   the oversampling rate and the clock prescaler and finding a pair that 

>   divides the input clock such that the quotient, when rounded to the 

>   nearest integer, deviates the least from the exact result.  Calculate 

>   the clock divisor accordingly.

> 

>   Scale the resulting oversampling rate (only by powers of two) if 

>   possible so as to maximise it, reducing the divisor accordingly, and 

>   avoid a divisor overflow for very low baud rates by scaling the 

>   oversampling rate and/or the prescaler even if that causes some 

>   accuracy loss.


>   Also handle the historic spd_cust feature so as to allow one to set 

>   all the three parameters manually to arbitrary values, by keeping the 

>   low 16 bits for the divisor and then putting TCR in bits 19:16 and 

>   CPR/CPR2 in bits 28:20, sanitising the bit pattern supplied such as

>   to clamp CPR/CPR2 values between 0.000 and 0.875 inclusive to 1.000.

>   This preserves compatibility with any existing setups, that is where 

>   requesting a custom divisor that only has any bits set among the low 

>   16 the oversampling rate of 16 and the clock prescaler of 1 will be 

>   used.


Please no. We really would like to get rid of that ugly hack. The BOTHER exists
for ages.

>   Finally abuse the `frac' argument to store the determined bit patterns 

>   for the TCR, CPR and CPR2 registers.

> 

> - Override the `set_divisor' handler so as to set the TCR, CPR and CPR2 

>   registers from the `frac' value supplied.  Set the divisor as usually.

> 

> With the baud base set to 15625000 and the unsigned 16-bit UART_DIV_MAX 

> limitation imposed by `serial8250_get_baud_rate' standard baud rates 

> below 300bps become unavailable in the regular way, e.g. the rate of 

> 200bps requires the baud base to be divided by 78125 and that is beyond 

> the unsigned 16-bit range.  The historic spd_cust feature can still be 

> used to obtain such rates if so required.

> 

> Here are the figures for the standard and some non-standard baud rates 

> (including those quoted in Oxford Semiconductor documentation), giving 

> the requested rate (r), the actual rate yielded (a) and its deviation 

> from the requested rate (d), and the values of the oversampling rate 

> (tcr), the clock prescaler (cpr) and the divisor (div) produced by the 

> new `get_divisor' handler:

> 

> r: 15625000, a: 15625000.00, d:  0.0000%, tcr:  4, cpr:  1.000, div:     1

> r: 12500000, a: 12500000.00, d:  0.0000%, tcr:  5, cpr:  1.000, div:     1

> r: 10416666, a: 10416666.67, d:  0.0000%, tcr:  6, cpr:  1.000, div:     1

> r:  8928571, a:  8928571.43, d:  0.0000%, tcr:  7, cpr:  1.000, div:     1

> r:  7812500, a:  7812500.00, d:  0.0000%, tcr:  8, cpr:  1.000, div:     1

> r:  4000000, a:  4000000.00, d:  0.0000%, tcr:  5, cpr:  3.125, div:     1

> r:  3686400, a:  3676470.59, d: -0.2694%, tcr:  8, cpr:  2.125, div:     1

> r:  3500000, a:  3496503.50, d: -0.0999%, tcr: 13, cpr:  1.375, div:     1

> r:  3000000, a:  2976190.48, d: -0.7937%, tcr: 14, cpr:  1.500, div:     1

> r:  2500000, a:  2500000.00, d:  0.0000%, tcr: 10, cpr:  2.500, div:     1

> r:  2000000, a:  2000000.00, d:  0.0000%, tcr: 10, cpr:  3.125, div:     1

> r:  1843200, a:  1838235.29, d: -0.2694%, tcr: 16, cpr:  2.125, div:     1

> r:  1500000, a:  1492537.31, d: -0.4975%, tcr:  5, cpr:  8.375, div:     1

> r:  1152000, a:  1152073.73, d:  0.0064%, tcr: 14, cpr:  3.875, div:     1

> r:   921600, a:   919117.65, d: -0.2694%, tcr: 16, cpr:  2.125, div:     2

> r:   576000, a:   576036.87, d:  0.0064%, tcr: 14, cpr:  3.875, div:     2

> r:   460800, a:   460829.49, d:  0.0064%, tcr:  7, cpr:  3.875, div:     5

> r:   230400, a:   230414.75, d:  0.0064%, tcr: 14, cpr:  3.875, div:     5

> r:   115200, a:   115207.37, d:  0.0064%, tcr: 14, cpr:  1.250, div:    31

> r:    57600, a:    57603.69, d:  0.0064%, tcr:  8, cpr:  3.875, div:    35

> r:    38400, a:    38402.46, d:  0.0064%, tcr: 14, cpr:  3.875, div:    30

> r:    19200, a:    19201.23, d:  0.0064%, tcr:  8, cpr:  3.875, div:   105

> r:     9600, a:     9600.06, d:  0.0006%, tcr:  9, cpr:  1.125, div:   643

> r:     4800, a:     4799.98, d: -0.0004%, tcr:  7, cpr:  2.875, div:   647

> r:     2400, a:     2400.02, d:  0.0008%, tcr:  9, cpr:  2.250, div:  1286

> r:     1200, a:     1200.00, d:  0.0000%, tcr: 14, cpr:  2.875, div:  1294

> r:      300, a:      300.00, d:  0.0000%, tcr: 11, cpr:  2.625, div:  7215

> r:      200, a:      200.00, d:  0.0000%, tcr: 16, cpr:  1.250, div: 15625

> r:      150, a:      150.00, d:  0.0000%, tcr: 13, cpr:  2.250, div: 14245

> r:      134, a:      134.00, d:  0.0000%, tcr: 11, cpr:  2.625, div: 16153

> r:      110, a:      110.00, d:  0.0000%, tcr: 12, cpr:  1.000, div: 47348

> r:       75, a:       75.00, d:  0.0000%, tcr:  4, cpr:  5.875, div: 35461

> r:       50, a:       50.00, d:  0.0000%, tcr: 16, cpr:  1.250, div: 62500

> r:       25, a:       25.00, d:  0.0000%, tcr: 16, cpr:  2.500, div: 62500

> r:        4, a:        4.00, d:  0.0000%, tcr: 16, cpr: 20.000, div: 48828

> r:        2, a:        2.00, d:  0.0000%, tcr: 16, cpr: 40.000, div: 48828

> r:        1, a:        1.00, d:  0.0000%, tcr: 16, cpr: 63.875, div: 61154

> 

> References:

> 

> [1] "OXPCIe200 PCI Express Multi-Port Bridge", Oxford Semiconductor,

>     Inc., DS-0045, 10 Nov 2008, Section "950 Mode", pp. 64-65

> 

> [2] "OXPCIe952 PCI Express Bridge to Dual Serial & Parallel Port",

>     Oxford Semiconductor, Inc., DS-0046, Mar 06 08, Section "950 Mode", 

>     p. 20

> 

> [3] "OXPCIe954 PCI Express Bridge to Quad Serial Port", Oxford

>     Semiconductor, Inc., DS-0047, Feb 08, Section "950 Mode", p. 20

> 

> [4] "OXPCIe958 PCI Express Bridge to Octal Serial Port", Oxford

>     Semiconductor, Inc., DS-0048, Feb 08, Section "950 Mode", p. 20


Is it possible to reduce a commit message by shifting some stuff to the
dedicated documentation?

...

>  drivers/tty/serial/8250/8250_pci.c  |  331 ++++++++++++++++++++++++++++--------


Can we, please, split the quirk driver first as it's done in a lot of examples
(_exar, _mid, _lpss, _...) and then modify it?

...

> +/* Tornado-specific constants for the TCR and CPR registers; see below.  */

> +#define OXSEMI_TORNADO_TCR_MASK	0xf

> +#define OXSEMI_TORNADO_CPR_MASK	0x1ff

> +#define OXSEMI_TORNADO_CPR_MIN	8

> +

> +/*

> + * Determine the oversampling rate, the clock prescaler, and the clock

> + * divisor for the requested baud rate.  The clock rate is 62.5 MHz,

> + * which is four times the baud base, and the prescaler increments in

> + * steps of 1/8.  Therefore to make calculations on integers we need

> + * to use a scaled clock rate, which is the baud base multiplied by 32

> + * (or our assumed UART clock rate multiplied by 2).

> + *

> + * The allowed oversampling rates are from 4 up to 16 inclusive (values

> + * from 0 to 3 inclusive map to 16).  Likewise the clock prescaler allows

> + * values between 1.000 and 63.875 inclusive (operation for values from

> + * 0.000 to 0.875 has not been specified).  The clock divisor is the usual

> + * unsigned 16-bit integer.

> + *

> + * For the most accurate baud rate we use a table of predetermined

> + * oversampling rates and clock prescalers that records all possible

> + * products of the two parameters in the range from 4 up to 255 inclusive,

> + * and additionally 335 for the 1500000bps rate, with the prescaler scaled

> + * by 8.  The table is sorted by the decreasing value of the oversampling

> + * rate and ties are resolved by sorting by the decreasing value of the

> + * product.  This way preference is given to higher oversampling rates.

> + *

> + * We iterate over the table and choose the product of an oversampling

> + * rate and a clock prescaler that gives the lowest integer division

> + * result deviation, or if an exact integer divider is found we stop

> + * looking for right away.  We do some fixup if the resulting clock

> + * divisor required would be out of its unsigned 16-bit integer range.

> + *

> + * Finally we abuse the supposed fractional part returned to encode the

> + * 4-bit value of the oversampling rate and the 9-bit value of the clock

> + * prescaler which will end up in the TCR and CPR/CPR2 registers.

> + */

> +static unsigned int pci_oxsemi_tornado_get_divisor(struct uart_port *port,

> +						   unsigned int baud,

> +						   unsigned int *frac)

> +{

> +	static u8 p[][2] = {

> +		{ 16, 14, }, { 16, 13, }, { 16, 12, }, { 16, 11, },

> +		{ 16, 10, }, { 16,  9, }, { 16,  8, }, { 15, 17, },

> +		{ 15, 16, }, { 15, 15, }, { 15, 14, }, { 15, 13, },

> +		{ 15, 12, }, { 15, 11, }, { 15, 10, }, { 15,  9, },

> +		{ 15,  8, }, { 14, 18, }, { 14, 17, }, { 14, 14, },

> +		{ 14, 13, }, { 14, 12, }, { 14, 11, }, { 14, 10, },

> +		{ 14,  9, }, { 14,  8, }, { 13, 19, }, { 13, 18, },

> +		{ 13, 17, }, { 13, 13, }, { 13, 12, }, { 13, 11, },

> +		{ 13, 10, }, { 13,  9, }, { 13,  8, }, { 12, 19, },

> +		{ 12, 18, }, { 12, 17, }, { 12, 11, }, { 12,  9, },

> +		{ 12,  8, }, { 11, 23, }, { 11, 22, }, { 11, 21, },

> +		{ 11, 20, }, { 11, 19, }, { 11, 18, }, { 11, 17, },

> +		{ 11, 11, }, { 11, 10, }, { 11,  9, }, { 11,  8, },

> +		{ 10, 25, }, { 10, 23, }, { 10, 20, }, { 10, 19, },

> +		{ 10, 17, }, { 10, 10, }, { 10,  9, }, { 10,  8, },

> +		{  9, 27, }, {  9, 23, }, {  9, 21, }, {  9, 19, },

> +		{  9, 18, }, {  9, 17, }, {  9,  9, }, {  9,  8, },

> +		{  8, 31, }, {  8, 29, }, {  8, 23, }, {  8, 19, },

> +		{  8, 17, }, {  8,  8, }, {  7, 35, }, {  7, 31, },

> +		{  7, 29, }, {  7, 25, }, {  7, 23, }, {  7, 21, },

> +		{  7, 19, }, {  7, 17, }, {  7, 15, }, {  7, 14, },

> +		{  7, 13, }, {  7, 12, }, {  7, 11, }, {  7, 10, },

> +		{  7,  9, }, {  7,  8, }, {  6, 41, }, {  6, 37, },

> +		{  6, 31, }, {  6, 29, }, {  6, 23, }, {  6, 19, },

> +		{  6, 17, }, {  6, 13, }, {  6, 11, }, {  6, 10, },

> +		{  6,  9, }, {  6,  8, }, {  5, 67, }, {  5, 47, },

> +		{  5, 43, }, {  5, 41, }, {  5, 37, }, {  5, 31, },

> +		{  5, 29, }, {  5, 25, }, {  5, 23, }, {  5, 19, },

> +		{  5, 17, }, {  5, 15, }, {  5, 13, }, {  5, 11, },

> +		{  5, 10, }, {  5,  9, }, {  5,  8, }, {  4, 61, },

> +		{  4, 59, }, {  4, 53, }, {  4, 47, }, {  4, 43, },

> +		{  4, 41, }, {  4, 37, }, {  4, 31, }, {  4, 29, },

> +		{  4, 23, }, {  4, 19, }, {  4, 17, }, {  4, 13, },

> +		{  4,  9, }, {  4,  8, },

> +	};


Oh là là! Please, use rational best approximation algorithm instead (check CONFIG_RATIONAL).


-- 
With Best Regards,
Andy Shevchenko
Maciej W. Rozycki July 13, 2021, 1:52 a.m. UTC | #2
Hi Andy,

 Something wrong with your "From:" header; I've fixed it up based on a 
best guess basis.

On Mon, 12 Jul 2021, andy@surfacebook.localdomain wrote:

> >   Also handle the historic spd_cust feature so as to allow one to set 

> >   all the three parameters manually to arbitrary values, by keeping the 

> >   low 16 bits for the divisor and then putting TCR in bits 19:16 and 

> >   CPR/CPR2 in bits 28:20, sanitising the bit pattern supplied such as

> >   to clamp CPR/CPR2 values between 0.000 and 0.875 inclusive to 1.000.

> >   This preserves compatibility with any existing setups, that is where 

> >   requesting a custom divisor that only has any bits set among the low 

> >   16 the oversampling rate of 16 and the clock prescaler of 1 will be 

> >   used.

> 

> Please no. We really would like to get rid of that ugly hack. The BOTHER exists

> for ages.


 I have actually carefully considered it before submission and:

1. it remains a supported user API with a tool included with contemporary 
   distributions, and

2. with this device you can't set all the possible whole-number baud 
   rates let alone UART clock frequencies with the BOTHER API, and

3. it doesn't hurt.

If you'd like to get rid of SPD_CUST, then just do so, but until then I 
fail to see a point to have it supported with some devices but not other 
ones.

 NB if you do get to it, then please consider adding an equally flexible 
API too, e.g. for fractional baud rates (134.5bps haha); I won't mind if 
it's less hackish though.

> > References:

> > 

> > [1] "OXPCIe200 PCI Express Multi-Port Bridge", Oxford Semiconductor,

> >     Inc., DS-0045, 10 Nov 2008, Section "950 Mode", pp. 64-65

> > 

> > [2] "OXPCIe952 PCI Express Bridge to Dual Serial & Parallel Port",

> >     Oxford Semiconductor, Inc., DS-0046, Mar 06 08, Section "950 Mode", 

> >     p. 20

> > 

> > [3] "OXPCIe954 PCI Express Bridge to Quad Serial Port", Oxford

> >     Semiconductor, Inc., DS-0047, Feb 08, Section "950 Mode", p. 20

> > 

> > [4] "OXPCIe958 PCI Express Bridge to Octal Serial Port", Oxford

> >     Semiconductor, Inc., DS-0048, Feb 08, Section "950 Mode", p. 20

> 

> Is it possible to reduce a commit message by shifting some stuff to the

> dedicated documentation?


 The relevant stuff has been included as comments along with actual code 
already, and the rest is the usual submission-time rationale.  This will 
be the initial source of information when someone studies the history of 
this code (`git log').

 I don't consider it cast in stone however, so if there's any particular 
piece you'd like to see elsewhere, then please point out to me what to 
move and where.  Or give any guidance other than just: "Rewrite it!"  

 (Yes I often have troubles figuring out the real intent of some changes 
made say 15 years ago that have turned out broken after all those years 
and whose change description is simply too terse now that the lore has 
been lost.)

> >  drivers/tty/serial/8250/8250_pci.c  |  331 ++++++++++++++++++++++++++++--------

> 

> Can we, please, split the quirk driver first as it's done in a lot of examples

> (_exar, _mid, _lpss, _...) and then modify it?


 I have found it unclear where the line is drawn between having support 
code included with 8250_pci.c proper and having it split off to a separate 
file.  All the device-specific files seem to provide complex handling, 
well beyond just calculating the clock.

 I'll be happy to split it off however (with a suitable preparatory 
change) if there is a consensus in favour to doing so.

> > +/*

> > + * Determine the oversampling rate, the clock prescaler, and the clock

> > + * divisor for the requested baud rate.  The clock rate is 62.5 MHz,

> > + * which is four times the baud base, and the prescaler increments in

> > + * steps of 1/8.  Therefore to make calculations on integers we need

> > + * to use a scaled clock rate, which is the baud base multiplied by 32

> > + * (or our assumed UART clock rate multiplied by 2).

> > + *

> > + * The allowed oversampling rates are from 4 up to 16 inclusive (values

> > + * from 0 to 3 inclusive map to 16).  Likewise the clock prescaler allows

> > + * values between 1.000 and 63.875 inclusive (operation for values from

> > + * 0.000 to 0.875 has not been specified).  The clock divisor is the usual

> > + * unsigned 16-bit integer.

> > + *

> > + * For the most accurate baud rate we use a table of predetermined

> > + * oversampling rates and clock prescalers that records all possible

> > + * products of the two parameters in the range from 4 up to 255 inclusive,

> > + * and additionally 335 for the 1500000bps rate, with the prescaler scaled

> > + * by 8.  The table is sorted by the decreasing value of the oversampling

> > + * rate and ties are resolved by sorting by the decreasing value of the

> > + * product.  This way preference is given to higher oversampling rates.

> > + *

> > + * We iterate over the table and choose the product of an oversampling

> > + * rate and a clock prescaler that gives the lowest integer division

> > + * result deviation, or if an exact integer divider is found we stop

> > + * looking for right away.  We do some fixup if the resulting clock

> > + * divisor required would be out of its unsigned 16-bit integer range.

> > + *

> > + * Finally we abuse the supposed fractional part returned to encode the

> > + * 4-bit value of the oversampling rate and the 9-bit value of the clock

> > + * prescaler which will end up in the TCR and CPR/CPR2 registers.

> > + */

> > +static unsigned int pci_oxsemi_tornado_get_divisor(struct uart_port *port,

> > +						   unsigned int baud,

> > +						   unsigned int *frac)

> > +{

> > +	static u8 p[][2] = {

> > +		{ 16, 14, }, { 16, 13, }, { 16, 12, }, { 16, 11, },

> > +		{ 16, 10, }, { 16,  9, }, { 16,  8, }, { 15, 17, },

> > +		{ 15, 16, }, { 15, 15, }, { 15, 14, }, { 15, 13, },

> > +		{ 15, 12, }, { 15, 11, }, { 15, 10, }, { 15,  9, },

> > +		{ 15,  8, }, { 14, 18, }, { 14, 17, }, { 14, 14, },

> > +		{ 14, 13, }, { 14, 12, }, { 14, 11, }, { 14, 10, },

> > +		{ 14,  9, }, { 14,  8, }, { 13, 19, }, { 13, 18, },

> > +		{ 13, 17, }, { 13, 13, }, { 13, 12, }, { 13, 11, },

> > +		{ 13, 10, }, { 13,  9, }, { 13,  8, }, { 12, 19, },

> > +		{ 12, 18, }, { 12, 17, }, { 12, 11, }, { 12,  9, },

> > +		{ 12,  8, }, { 11, 23, }, { 11, 22, }, { 11, 21, },

> > +		{ 11, 20, }, { 11, 19, }, { 11, 18, }, { 11, 17, },

> > +		{ 11, 11, }, { 11, 10, }, { 11,  9, }, { 11,  8, },

> > +		{ 10, 25, }, { 10, 23, }, { 10, 20, }, { 10, 19, },

> > +		{ 10, 17, }, { 10, 10, }, { 10,  9, }, { 10,  8, },

> > +		{  9, 27, }, {  9, 23, }, {  9, 21, }, {  9, 19, },

> > +		{  9, 18, }, {  9, 17, }, {  9,  9, }, {  9,  8, },

> > +		{  8, 31, }, {  8, 29, }, {  8, 23, }, {  8, 19, },

> > +		{  8, 17, }, {  8,  8, }, {  7, 35, }, {  7, 31, },

> > +		{  7, 29, }, {  7, 25, }, {  7, 23, }, {  7, 21, },

> > +		{  7, 19, }, {  7, 17, }, {  7, 15, }, {  7, 14, },

> > +		{  7, 13, }, {  7, 12, }, {  7, 11, }, {  7, 10, },

> > +		{  7,  9, }, {  7,  8, }, {  6, 41, }, {  6, 37, },

> > +		{  6, 31, }, {  6, 29, }, {  6, 23, }, {  6, 19, },

> > +		{  6, 17, }, {  6, 13, }, {  6, 11, }, {  6, 10, },

> > +		{  6,  9, }, {  6,  8, }, {  5, 67, }, {  5, 47, },

> > +		{  5, 43, }, {  5, 41, }, {  5, 37, }, {  5, 31, },

> > +		{  5, 29, }, {  5, 25, }, {  5, 23, }, {  5, 19, },

> > +		{  5, 17, }, {  5, 15, }, {  5, 13, }, {  5, 11, },

> > +		{  5, 10, }, {  5,  9, }, {  5,  8, }, {  4, 61, },

> > +		{  4, 59, }, {  4, 53, }, {  4, 47, }, {  4, 43, },

> > +		{  4, 41, }, {  4, 37, }, {  4, 31, }, {  4, 29, },

> > +		{  4, 23, }, {  4, 19, }, {  4, 17, }, {  4, 13, },

> > +		{  4,  9, }, {  4,  8, },

> > +	};

> 

> Oh là là! Please, use rational best approximation algorithm instead 

> (check CONFIG_RATIONAL).


 Thanks for the pointer, I didn't know we had this piece.

 However how is it supposed to apply here?  The denominator is always 8,
so we can rule it out (by multiplying the dividend by 8, which this piece 
does, so that the divisor is a whole number), but the numerator has to be 
a product of three integers, from a different range each ([4,16], [8,511], 
[1, 65535]) as noted above.

 Essentially we need to find such three integers (with extra constraints) 
the product of which is closest to (500000000 / baud_rate) -- which IMHO 
amounts to factorisation, an NP-complete problem as you have been surely 
aware (and the whole world relies on), and I have decided that this simple 
table-driven approximation is good enough to handle the usual baud rates, 
especially the higher ones.  For several baud rates it gives more accurate 
results (lower deviation) than the factors proposed in the manufacturer's 
datasheets.

 I just fail to see how your proposed algorithm could be factored in here, 
but I'll be happy to be proved wrong, so I'll appreciate guidance.

 In any case thank you for your review, always appreciated!

  Maciej
Andy Shevchenko July 13, 2021, 7:19 a.m. UTC | #3
On Tue, Jul 13, 2021 at 4:52 AM Maciej W. Rozycki <macro@orcam.me.uk> wrote:
>  Something wrong with your "From:" header; I've fixed it up based on a

> best guess basis.


Ah, yes, I have to fix it locally. Thanks!

> On Mon, 12 Jul 2021, andy@surfacebook.localdomain wrote:

>

> > >   Also handle the historic spd_cust feature so as to allow one to set

> > >   all the three parameters manually to arbitrary values, by keeping the

> > >   low 16 bits for the divisor and then putting TCR in bits 19:16 and

> > >   CPR/CPR2 in bits 28:20, sanitising the bit pattern supplied such as

> > >   to clamp CPR/CPR2 values between 0.000 and 0.875 inclusive to 1.000.

> > >   This preserves compatibility with any existing setups, that is where

> > >   requesting a custom divisor that only has any bits set among the low

> > >   16 the oversampling rate of 16 and the clock prescaler of 1 will be

> > >   used.

> >

> > Please no. We really would like to get rid of that ugly hack. The BOTHER exists

> > for ages.

>

>  I have actually carefully considered it before submission and:

>

> 1. it remains a supported user API with a tool included with contemporary

>    distributions, and


What supported API?

> 2. with this device you can't set all the possible whole-number baud

>    rates let alone UART clock frequencies with the BOTHER API, and


How does SPD_CUST make it different?

> 3. it doesn't hurt.


It hurts development a lot.

> If you'd like to get rid of SPD_CUST, then just do so, but until then I

> fail to see a point to have it supported with some devices but not other

> ones.


It _is_ the current state of affairs. Most of the contemporary drivers
do not support this "feature" at all.

>  NB if you do get to it, then please consider adding an equally flexible

> API too, e.g. for fractional baud rates (134.5bps haha); I won't mind if

> it's less hackish though.


Why do you need fractional baud rates for the small speeds? I do not
believe we have any good use case for that. And 1/2 from 134 is less
than 0.5% which is tolerable by UART by definition.

So, please no, drop it.

> > > References:

> > >

> > > [1] "OXPCIe200 PCI Express Multi-Port Bridge", Oxford Semiconductor,

> > >     Inc., DS-0045, 10 Nov 2008, Section "950 Mode", pp. 64-65

> > >

> > > [2] "OXPCIe952 PCI Express Bridge to Dual Serial & Parallel Port",

> > >     Oxford Semiconductor, Inc., DS-0046, Mar 06 08, Section "950 Mode",

> > >     p. 20

> > >

> > > [3] "OXPCIe954 PCI Express Bridge to Quad Serial Port", Oxford

> > >     Semiconductor, Inc., DS-0047, Feb 08, Section "950 Mode", p. 20

> > >

> > > [4] "OXPCIe958 PCI Express Bridge to Octal Serial Port", Oxford

> > >     Semiconductor, Inc., DS-0048, Feb 08, Section "950 Mode", p. 20

> >

> > Is it possible to reduce a commit message by shifting some stuff to the

> > dedicated documentation?

>

>  The relevant stuff has been included as comments along with actual code

> already, and the rest is the usual submission-time rationale.  This will

> be the initial source of information when someone studies the history of

> this code (`git log').


I do not object to this, but perhaps in the form of documentation it
would serve a better job (no-one will need to go deep into the Git
history for this, especially non-developer people who just got a
tarball, for example).

>  I don't consider it cast in stone however, so if there's any particular

> piece you'd like to see elsewhere, then please point out to me what to

> move and where.  Or give any guidance other than just: "Rewrite it!"


At least that table with divisors and deviation with accompanying
text. But I dare to say 90-95% of the commit message and leave
something like "Here is a new driver. documentation is there." To
where? Documentation/admin-guide seems most suitable right now
(looking at the presence of auxdisplay folder), however I think that
maybe dedicated folder like Documentation/hardware-notes maybe better.

+Cc: Mauro. What do you think about this? We need a folder where we
rather describe hardware features and maybe some driver implementation
details.

>  (Yes I often have troubles figuring out the real intent of some changes

> made say 15 years ago that have turned out broken after all those years

> and whose change description is simply too terse now that the lore has

> been lost.)

>

> > >  drivers/tty/serial/8250/8250_pci.c  |  331 ++++++++++++++++++++++++++++--------

> >

> > Can we, please, split the quirk driver first as it's done in a lot of examples

> > (_exar, _mid, _lpss, _...) and then modify it?

>

>  I have found it unclear where the line is drawn between having support

> code included with 8250_pci.c proper and having it split off to a separate

> file.  All the device-specific files seem to provide complex handling,

> well beyond just calculating the clock.


Lines of code in the current 8250_pci in conjunction with expansion.
To me 331 (okay, it's something like 280?) LOC + sounds like a very
good justification to split.

>  I'll be happy to split it off however (with a suitable preparatory

> change) if there is a consensus in favour to doing so.


If you have a consensus with yourself :-) Maintaining 8250_pci is a burden.
You may look into the history of 8250_pci (and you will often see my
name there) how it was shrinking in time.

> > > +/*

> > > + * Determine the oversampling rate, the clock prescaler, and the clock

> > > + * divisor for the requested baud rate.  The clock rate is 62.5 MHz,

> > > + * which is four times the baud base, and the prescaler increments in

> > > + * steps of 1/8.  Therefore to make calculations on integers we need

> > > + * to use a scaled clock rate, which is the baud base multiplied by 32

> > > + * (or our assumed UART clock rate multiplied by 2).

> > > + *

> > > + * The allowed oversampling rates are from 4 up to 16 inclusive (values

> > > + * from 0 to 3 inclusive map to 16).  Likewise the clock prescaler allows

> > > + * values between 1.000 and 63.875 inclusive (operation for values from

> > > + * 0.000 to 0.875 has not been specified).  The clock divisor is the usual

> > > + * unsigned 16-bit integer.

> > > + *

> > > + * For the most accurate baud rate we use a table of predetermined

> > > + * oversampling rates and clock prescalers that records all possible

> > > + * products of the two parameters in the range from 4 up to 255 inclusive,

> > > + * and additionally 335 for the 1500000bps rate, with the prescaler scaled

> > > + * by 8.  The table is sorted by the decreasing value of the oversampling

> > > + * rate and ties are resolved by sorting by the decreasing value of the

> > > + * product.  This way preference is given to higher oversampling rates.

> > > + *

> > > + * We iterate over the table and choose the product of an oversampling

> > > + * rate and a clock prescaler that gives the lowest integer division

> > > + * result deviation, or if an exact integer divider is found we stop

> > > + * looking for right away.  We do some fixup if the resulting clock


for it right

> > > + * divisor required would be out of its unsigned 16-bit integer range.

> > > + *

> > > + * Finally we abuse the supposed fractional part returned to encode the

> > > + * 4-bit value of the oversampling rate and the 9-bit value of the clock

> > > + * prescaler which will end up in the TCR and CPR/CPR2 registers.

> > > + */

> > > +static unsigned int pci_oxsemi_tornado_get_divisor(struct uart_port *port,

> > > +                                              unsigned int baud,

> > > +                                              unsigned int *frac)

> > > +{

> > > +   static u8 p[][2] = {

> > > +           { 16, 14, }, { 16, 13, }, { 16, 12, }, { 16, 11, },

> > > +           { 16, 10, }, { 16,  9, }, { 16,  8, }, { 15, 17, },

> > > +           { 15, 16, }, { 15, 15, }, { 15, 14, }, { 15, 13, },

> > > +           { 15, 12, }, { 15, 11, }, { 15, 10, }, { 15,  9, },

> > > +           { 15,  8, }, { 14, 18, }, { 14, 17, }, { 14, 14, },

> > > +           { 14, 13, }, { 14, 12, }, { 14, 11, }, { 14, 10, },

> > > +           { 14,  9, }, { 14,  8, }, { 13, 19, }, { 13, 18, },

> > > +           { 13, 17, }, { 13, 13, }, { 13, 12, }, { 13, 11, },

> > > +           { 13, 10, }, { 13,  9, }, { 13,  8, }, { 12, 19, },

> > > +           { 12, 18, }, { 12, 17, }, { 12, 11, }, { 12,  9, },

> > > +           { 12,  8, }, { 11, 23, }, { 11, 22, }, { 11, 21, },

> > > +           { 11, 20, }, { 11, 19, }, { 11, 18, }, { 11, 17, },

> > > +           { 11, 11, }, { 11, 10, }, { 11,  9, }, { 11,  8, },

> > > +           { 10, 25, }, { 10, 23, }, { 10, 20, }, { 10, 19, },

> > > +           { 10, 17, }, { 10, 10, }, { 10,  9, }, { 10,  8, },

> > > +           {  9, 27, }, {  9, 23, }, {  9, 21, }, {  9, 19, },

> > > +           {  9, 18, }, {  9, 17, }, {  9,  9, }, {  9,  8, },

> > > +           {  8, 31, }, {  8, 29, }, {  8, 23, }, {  8, 19, },

> > > +           {  8, 17, }, {  8,  8, }, {  7, 35, }, {  7, 31, },

> > > +           {  7, 29, }, {  7, 25, }, {  7, 23, }, {  7, 21, },

> > > +           {  7, 19, }, {  7, 17, }, {  7, 15, }, {  7, 14, },

> > > +           {  7, 13, }, {  7, 12, }, {  7, 11, }, {  7, 10, },

> > > +           {  7,  9, }, {  7,  8, }, {  6, 41, }, {  6, 37, },

> > > +           {  6, 31, }, {  6, 29, }, {  6, 23, }, {  6, 19, },

> > > +           {  6, 17, }, {  6, 13, }, {  6, 11, }, {  6, 10, },

> > > +           {  6,  9, }, {  6,  8, }, {  5, 67, }, {  5, 47, },

> > > +           {  5, 43, }, {  5, 41, }, {  5, 37, }, {  5, 31, },

> > > +           {  5, 29, }, {  5, 25, }, {  5, 23, }, {  5, 19, },

> > > +           {  5, 17, }, {  5, 15, }, {  5, 13, }, {  5, 11, },

> > > +           {  5, 10, }, {  5,  9, }, {  5,  8, }, {  4, 61, },

> > > +           {  4, 59, }, {  4, 53, }, {  4, 47, }, {  4, 43, },

> > > +           {  4, 41, }, {  4, 37, }, {  4, 31, }, {  4, 29, },

> > > +           {  4, 23, }, {  4, 19, }, {  4, 17, }, {  4, 13, },

> > > +           {  4,  9, }, {  4,  8, },

> > > +   };

> >

> > Oh là là! Please, use rational best approximation algorithm instead

> > (check CONFIG_RATIONAL).

>

>  Thanks for the pointer, I didn't know we had this piece.

>

>  However how is it supposed to apply here?  The denominator is always 8,

> so we can rule it out (by multiplying the dividend by 8, which this piece

> does, so that the divisor is a whole number), but the numerator has to be

> a product of three integers, from a different range each ([4,16], [8,511],

> [1, 65535]) as noted above.

>

>  Essentially we need to find such three integers (with extra constraints)

> the product of which is closest to (500000000 / baud_rate) -- which IMHO

> amounts to factorisation, an NP-complete problem as you have been surely

> aware (and the whole world relies on), and I have decided that this simple

> table-driven approximation is good enough to handle the usual baud rates,

> especially the higher ones.  For several baud rates it gives more accurate

> results (lower deviation) than the factors proposed in the manufacturer's

> datasheets.


And my point is to calculate is always based on the asked baud rate.
Yes. I understand what you wrote above and sometimes only brute force
can be used, but in the kernel we have integer arithmetics which helps
a lot besides the fact of bits twiddlings.

>  I just fail to see how your proposed algorithm could be factored in here,

> but I'll be happy to be proved wrong, so I'll appreciate guidance.


It's possible that it doesn't fit in the current form or for all three
integers. Just give some time and think about it. Maybe you can come
up with a better idea. I usually point to one case I have solved [1]
to show that ugly tables can be dropped (in some cases it makes sense
to leave them, though).

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/spi/spi-pxa2xx.c?id=9df461eca18f5395ee84670cdba6755dddec1898

>  In any case thank you for your review, always appreciated!


You/re welcome!

-- 
With Best Regards,
Andy Shevchenko
Maciej W. Rozycki July 15, 2021, 7:49 p.m. UTC | #4
On Tue, 13 Jul 2021, Andy Shevchenko wrote:

> > > Please no. We really would like to get rid of that ugly hack. The BOTHER exists

> > > for ages.

> >

> >  I have actually carefully considered it before submission and:

> >

> > 1. it remains a supported user API with a tool included with contemporary

> >    distributions, and

> 

> What supported API?


 The TIOCGSERIAL/TIOCSSERIAL ioctls.  It's not clear to me why you're 
asking; as a serial driver expert you must have been surely aware of their 
existence.

> > 2. with this device you can't set all the possible whole-number baud

> >    rates let alone UART clock frequencies with the BOTHER API, and

> 

> How does SPD_CUST make it different?


 You can program the divider registers directly with any bit pattern 
supported by hardware.  You don't know what use for it has been out there 
in the field for the last ~30 years.

> > 3. it doesn't hurt.

> 

> It hurts development a lot.


 It is there and the presence or absence of the feature for OxSemi devices 
does not affect anything but OxSemi support code.

> > If you'd like to get rid of SPD_CUST, then just do so, but until then I

> > fail to see a point to have it supported with some devices but not other

> > ones.

> 

> It _is_ the current state of affairs. Most of the contemporary drivers

> do not support this "feature" at all.


 It is currently a supported feature for OxSemi devices (and most if not 
all 8250 derivatives), and I don't see a reason to selectively remove it 
with this specifice submission.  There may be installations out there that 
rely on it -- there have been shortcomings in the implementation, which 
are the motivation for this patch series, but mind that we've had support 
for OxSemi Tornado devices since 2008.  That's a lot of time.

 Driver writers for other UART implementations may choose to have it or 
not to with their new code written from scratch.  As a matter of interest 
I've checked zs.c, one of the serial hw drivers I had considerable input 
to and it uses its own ZS_BPS_TO_BRG macro rather than `uart_get_divisor', 
so it does not support this feature (and dz.c only has a set of 15 fixed 
baud rates, which is where the original termio B<rate> macro bit patterns, 
as well as the EXTA and EXTB macros for the externally supplied clock 
chosen by the 16th bit pattern, come from).

 And as I say: if you want to remove it, then do it globally and tell 
people that as from Linux x.y.z this feature is no longer supported, so 
that is clear and unambiguos and some poor IT soul out there does not get 
hit by a random obscure driver feature removal with a kernel upgrade.

 NB in the course of this effort I've learnt the hard way that `setserial' 
is even invoked automatically nowadays in startup/shutdown scripts for 
port state saving and restoration, so a random unannounced feature removal 
here can wreak havoc with people's installations they have configured 
years ago.

> >  NB if you do get to it, then please consider adding an equally flexible

> > API too, e.g. for fractional baud rates (134.5bps haha); I won't mind if

> > it's less hackish though.

> 

> Why do you need fractional baud rates for the small speeds? I do not

> believe we have any good use case for that. And 1/2 from 134 is less

> than 0.5% which is tolerable by UART by definition.


 Just pointing out the incompleteness of the implementation should 
SPD_CUST be removed.

> So, please no, drop it.


 Based on my consideration given above, no, I maintain keeping the feature 
is the right approach, at least for this change concerned.  After all its 
purpose is to correct baud rate setting and not to remove vaguely related 
features.

> >  The relevant stuff has been included as comments along with actual code

> > already, and the rest is the usual submission-time rationale.  This will

> > be the initial source of information when someone studies the history of

> > this code (`git log').

> 

> I do not object to this, but perhaps in the form of documentation it

> would serve a better job (no-one will need to go deep into the Git

> history for this, especially non-developer people who just got a

> tarball, for example).


 Why would a non-developer need to dive into it while also hesitating to 
go through the git log?  Moreover, why would a developer hesitate digging 
through the git log while trying to understand code?

 Maybe I'm old-fashioned, but coming from long before we had any sensible 
repository (umm, the MIPS port and I believe m68k had CVS in the old days, 
but try to find anything with CVS!) let alone git I can't overrate my 
appreciation for its features, and our requirement for sensible change 
descriptions is not there for nothing.  So looking through `git log' is 
the first thing I do when I want to understand why a piece of code unknown 
to me looks like it does.

> >  I don't consider it cast in stone however, so if there's any particular

> > piece you'd like to see elsewhere, then please point out to me what to

> > move and where.  Or give any guidance other than just: "Rewrite it!"

> 

> At least that table with divisors and deviation with accompanying

> text. But I dare to say 90-95% of the commit message and leave

> something like "Here is a new driver. documentation is there." To

> where? Documentation/admin-guide seems most suitable right now

> (looking at the presence of auxdisplay folder), however I think that

> maybe dedicated folder like Documentation/hardware-notes maybe better.


 Not a new driver really, but a bug fix for the broken calculation we 
currently have given how inaccurately the particular input clock rate 
chosen by the designer of the ASIC gets divided at least for some of our 
standard baud rates.  You may have seen with my previous fix that the 
original submitter for the OxSemi handlers didn't even get the base baud 
rate right (and that stuff has been included as vendor-supplied drivers 
with various OxSemi-based serial port card manufacturers!), so I can't 
really tell what happened there.

 It looks to me like nice if not ultimate UART hardware ruined by broken 
software, sigh.  Which may have contributed to the withdrawal of the ASICs 
from manufacturing as with the extra quality stripped by missing software 
support obviously they may not have been able to compete solely by price 
with cheap UARTs made elsewhere that provide a basic feature set only (and 
no documentation).

 NB it does DMA as one would expect especially at the higher baud rates, 
and earlier OxSemi PCI or discrete UARTs have similar features (including 
the clock prescaler), none of which we handle.  If we started, only then I 
guess we could call it a new driver.  I have the datasheets, but I can't 
dedicate more time I'm afraid beyond what is absolutely necessary to get 
the broken PCIe stuff right.

 OK though, you've convinced me.

> +Cc: Mauro. What do you think about this? We need a folder where we

> rather describe hardware features and maybe some driver implementation

> details.


 Not for serial hardware drivers, but a while ago I committed what now 
lives at Documentation/networking/device_drivers/fddi/defza.rst along with 
several other networking driver notes, so I imagine we can have a similar 
collection for 8250 stuff and the like.

> >  I have found it unclear where the line is drawn between having support

> > code included with 8250_pci.c proper and having it split off to a separate

> > file.  All the device-specific files seem to provide complex handling,

> > well beyond just calculating the clock.

> 

> Lines of code in the current 8250_pci in conjunction with expansion.

> To me 331 (okay, it's something like 280?) LOC + sounds like a very

> good justification to split.


 OK, the size argument alone makes some sense to me, though OTOH splitting 
sources into many files prevents the compiler from doing interprocedural 
optimisations, which can only be partially compensated by LTO (if enabled 
in the first place).

 I'll see what I can do here anyway.  Mind that non-PCIe OxSemi stuff 
remains incomplete, as I noted above, so it'll be a partial driver anyway.

> > > > + * We iterate over the table and choose the product of an oversampling

> > > > + * rate and a clock prescaler that gives the lowest integer division

> > > > + * result deviation, or if an exact integer divider is found we stop

> > > > + * looking for right away.  We do some fixup if the resulting clock

> 

> for it right


 Fixed, thanks!

> >  Essentially we need to find such three integers (with extra constraints)

> > the product of which is closest to (500000000 / baud_rate) -- which IMHO

> > amounts to factorisation, an NP-complete problem as you have been surely

> > aware (and the whole world relies on), and I have decided that this simple

> > table-driven approximation is good enough to handle the usual baud rates,

> > especially the higher ones.  For several baud rates it gives more accurate

> > results (lower deviation) than the factors proposed in the manufacturer's

> > datasheets.

> 

> And my point is to calculate is always based on the asked baud rate.

> Yes. I understand what you wrote above and sometimes only brute force

> can be used, but in the kernel we have integer arithmetics which helps

> a lot besides the fact of bits twiddlings.


 Well, the algorithm is for finding the closest rational number expressed 
by a numerator and a denominator to the required one.  We have no problem 
with that, because the divisor is integer (which is of course a rational 
number too, but finding the numerator where the denominator if constant 1 
is trivial).

> >  I just fail to see how your proposed algorithm could be factored in here,

> > but I'll be happy to be proved wrong, so I'll appreciate guidance.

> 

> It's possible that it doesn't fit in the current form or for all three

> integers. Just give some time and think about it. Maybe you can come

> up with a better idea. I usually point to one case I have solved [1]

> to show that ugly tables can be dropped (in some cases it makes sense

> to leave them, though).

> 

> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/spi/spi-pxa2xx.c?id=9df461eca18f5395ee84670cdba6755dddec1898


 Well, I considered a naïve algorithm, but it would be quadratic, it would 
make a lot of redundant calculations, and division is not exactly a fast 
operation; some targets we support have no hardware division instruction 
even.  Needless to say it would still have to encode the ranges supported.  
And any code size gain from the naïve algorithm might still not match the 
268 bytes the table consumes, especially with RISC targets.

 I think my proposal is a reasonable compromise that addresses the problem 
at hand, and if you or anyone else finds a better solution sometime, then 
you are free to improve it.  Perfect is the enemy of good enough, so the 
potential existence of an ultimate solution that hasn't been discovered 
yet shouldn't block progress in a hope that the solution gets found soon 
enough.

 I'll repost the outstanding pieces of the series with the adjustments I 
have agreed to make.

  Maciej
Andy Shevchenko July 19, 2021, 2:55 p.m. UTC | #5
On Thu, Jul 15, 2021 at 10:49 PM Maciej W. Rozycki <macro@orcam.me.uk> wrote:
> On Tue, 13 Jul 2021, Andy Shevchenko wrote:

>

> > > > Please no. We really would like to get rid of that ugly hack. The BOTHER exists

> > > > for ages.

> > >

> > >  I have actually carefully considered it before submission and:

> > >

> > > 1. it remains a supported user API with a tool included with contemporary

> > >    distributions, and

> >

> > What supported API?

>

>  The TIOCGSERIAL/TIOCSSERIAL ioctls.  It's not clear to me why you're

> asking; as a serial driver expert you must have been surely aware of their

> existence.


Yes, I know how it's used. It's the ugliest hack in the serial support in Linux.
Telling that baud rate is 38400 and supplying something completely different,
non-standard stuff.

> > > 2. with this device you can't set all the possible whole-number baud

> > >    rates let alone UART clock frequencies with the BOTHER API, and

> >

> > How does SPD_CUST make it different?

>

>  You can program the divider registers directly with any bit pattern

> supported by hardware.


And again, how does it differ from BOTHER paths (except being a hack)?
You supply baud rate with 1 baud granularity and program hardware
registers accordingly. (Yes, I remember you pointed out the sub-HZ
frequencies, but I don't buy it as a very good argument because the
only significant error can be achieved at baud rates under the 100).

>  You don't know what use for it has been out there

> in the field for the last ~30 years.


Is it a question or statement?

> > > 3. it doesn't hurt.

> >

> > It hurts development a lot.

>

>  It is there and the presence or absence of the feature for OxSemi devices

> does not affect anything but OxSemi support code.


It's a pity. But what support code needs the SPD_CUST nowadays?

> > > If you'd like to get rid of SPD_CUST, then just do so, but until then I

> > > fail to see a point to have it supported with some devices but not other

> > > ones.

> >

> > It _is_ the current state of affairs. Most of the contemporary drivers

> > do not support this "feature" at all.

>

>  It is currently a supported feature for OxSemi devices (and most if not

> all 8250 derivatives), and I don't see a reason to selectively remove it

> with this specifice submission.  There may be installations out there that

> rely on it -- there have been shortcomings in the implementation, which

> are the motivation for this patch series, but mind that we've had support

> for OxSemi Tornado devices since 2008.  That's a lot of time.

>

>  Driver writers for other UART implementations may choose to have it or

> not to with their new code written from scratch.  As a matter of interest

> I've checked zs.c, one of the serial hw drivers I had considerable input

> to and it uses its own ZS_BPS_TO_BRG macro rather than `uart_get_divisor',

> so it does not support this feature (and dz.c only has a set of 15 fixed

> baud rates, which is where the original termio B<rate> macro bit patterns,

> as well as the EXTA and EXTB macros for the externally supplied clock

> chosen by the 16th bit pattern, come from).

>

>  And as I say: if you want to remove it, then do it globally and tell

> people that as from Linux x.y.z this feature is no longer supported, so

> that is clear and unambiguos and some poor IT soul out there does not get

> hit by a random obscure driver feature removal with a kernel upgrade.


I had had this in my plans [2] but suddenly I had no time to accomplish it. :-(

>  NB in the course of this effort I've learnt the hard way that `setserial'

> is even invoked automatically nowadays in startup/shutdown scripts for

> port state saving and restoration, so a random unannounced feature removal

> here can wreak havoc with people's installations they have configured

> years ago.


I believe that for these years spd_cust shouldn't be used at all. OK,
it seems the first thing we may do is to patch the user space to give
a fat warning that spd_cust is deprecated and should be avoided.
And... it seems already there [3].

> > >  NB if you do get to it, then please consider adding an equally flexible

> > > API too, e.g. for fractional baud rates (134.5bps haha); I won't mind if

> > > it's less hackish though.

> >

> > Why do you need fractional baud rates for the small speeds? I do not

> > believe we have any good use case for that. And 1/2 from 134 is less

> > than 0.5% which is tolerable by UART by definition.

>

>  Just pointing out the incompleteness of the implementation should

> SPD_CUST be removed.

>

> > So, please no, drop it.

>

>  Based on my consideration given above, no, I maintain keeping the feature

> is the right approach, at least for this change concerned.  After all its

> purpose is to correct baud rate setting and not to remove vaguely related

> features.


I see your point. My question here, does this series extend SPD_CUST
to additional (newly supported?) baud rates? If so, I would be against
that extension. Supporting deprecated stuff is okay for a while.

To the end of the discussion may be good to read also these [4] and [5]


> > >  The relevant stuff has been included as comments along with actual code

> > > already, and the rest is the usual submission-time rationale.  This will

> > > be the initial source of information when someone studies the history of

> > > this code (`git log').

> >

> > I do not object to this, but perhaps in the form of documentation it

> > would serve a better job (no-one will need to go deep into the Git

> > history for this, especially non-developer people who just got a

> > tarball, for example).

>

>  Why would a non-developer need to dive into it while also hesitating to

> go through the git log?  Moreover, why would a developer hesitate digging

> through the git log while trying to understand code?

>

>  Maybe I'm old-fashioned, but coming from long before we had any sensible

> repository (umm, the MIPS port and I believe m68k had CVS in the old days,

> but try to find anything with CVS!) let alone git I can't overrate my

> appreciation for its features, and our requirement for sensible change

> descriptions is not there for nothing.  So looking through `git log' is

> the first thing I do when I want to understand why a piece of code unknown

> to me looks like it does.


Tell me how to look into git logs without cloning a repository (I
think it's possible through some borken web UI somehow, but never
heard about it). In any case, either in the commit message or in the
code it will be in the repo, so not a big deal to me after all.

> > >  I don't consider it cast in stone however, so if there's any particular

> > > piece you'd like to see elsewhere, then please point out to me what to

> > > move and where.  Or give any guidance other than just: "Rewrite it!"

> >

> > At least that table with divisors and deviation with accompanying

> > text. But I dare to say 90-95% of the commit message and leave

> > something like "Here is a new driver. documentation is there." To

> > where? Documentation/admin-guide seems most suitable right now

> > (looking at the presence of auxdisplay folder), however I think that

> > maybe dedicated folder like Documentation/hardware-notes maybe better.

>

>  Not a new driver really, but a bug fix for the broken calculation we

> currently have given how inaccurately the particular input clock rate

> chosen by the designer of the ASIC gets divided at least for some of our

> standard baud rates.  You may have seen with my previous fix that the

> original submitter for the OxSemi handlers didn't even get the base baud

> rate right (and that stuff has been included as vendor-supplied drivers

> with various OxSemi-based serial port card manufacturers!), so I can't

> really tell what happened there.


Yeah, it seems it has never worked before.

>  It looks to me like nice if not ultimate UART hardware ruined by broken

> software, sigh.  Which may have contributed to the withdrawal of the ASICs

> from manufacturing as with the extra quality stripped by missing software

> support obviously they may not have been able to compete solely by price

> with cheap UARTs made elsewhere that provide a basic feature set only (and

> no documentation).

>

>  NB it does DMA as one would expect especially at the higher baud rates,

> and earlier OxSemi PCI or discrete UARTs have similar features (including

> the clock prescaler), none of which we handle.  If we started, only then I

> guess we could call it a new driver.  I have the datasheets, but I can't

> dedicate more time I'm afraid beyond what is absolutely necessary to get

> the broken PCIe stuff right.

>

>  OK though, you've convinced me.

>

> > +Cc: Mauro. What do you think about this? We need a folder where we

> > rather describe hardware features and maybe some driver implementation

> > details.

>

>  Not for serial hardware drivers, but a while ago I committed what now

> lives at Documentation/networking/device_drivers/fddi/defza.rst along with

> several other networking driver notes, so I imagine we can have a similar

> collection for 8250 stuff and the like.


To me any folder is more or less okayish as long as it's there, under
the Documentation :-)

> > >  I have found it unclear where the line is drawn between having support

> > > code included with 8250_pci.c proper and having it split off to a separate

> > > file.  All the device-specific files seem to provide complex handling,

> > > well beyond just calculating the clock.

> >

> > Lines of code in the current 8250_pci in conjunction with expansion.

> > To me 331 (okay, it's something like 280?) LOC + sounds like a very

> > good justification to split.

>

>  OK, the size argument alone makes some sense to me, though OTOH splitting

> sources into many files prevents the compiler from doing interprocedural

> optimisations, which can only be partially compensated by LTO (if enabled

> in the first place).

>

>  I'll see what I can do here anyway.  Mind that non-PCIe OxSemi stuff

> remains incomplete, as I noted above, so it'll be a partial driver anyway.


Thanks for doing that anyway!

> > > > > + * We iterate over the table and choose the product of an oversampling

> > > > > + * rate and a clock prescaler that gives the lowest integer division

> > > > > + * result deviation, or if an exact integer divider is found we stop

> > > > > + * looking for right away.  We do some fixup if the resulting clock

> >

> > for it right

>

>  Fixed, thanks!

>

> > >  Essentially we need to find such three integers (with extra constraints)

> > > the product of which is closest to (500000000 / baud_rate) -- which IMHO

> > > amounts to factorisation, an NP-complete problem as you have been surely

> > > aware (and the whole world relies on), and I have decided that this simple

> > > table-driven approximation is good enough to handle the usual baud rates,

> > > especially the higher ones.  For several baud rates it gives more accurate

> > > results (lower deviation) than the factors proposed in the manufacturer's

> > > datasheets.

> >

> > And my point is to calculate is always based on the asked baud rate.

> > Yes. I understand what you wrote above and sometimes only brute force

> > can be used, but in the kernel we have integer arithmetics which helps

> > a lot besides the fact of bits twiddlings.

>

>  Well, the algorithm is for finding the closest rational number expressed

> by a numerator and a denominator to the required one.  We have no problem

> with that, because the divisor is integer (which is of course a rational

> number too, but finding the numerator where the denominator if constant 1

> is trivial).

>

> > >  I just fail to see how your proposed algorithm could be factored in here,

> > > but I'll be happy to be proved wrong, so I'll appreciate guidance.

> >

> > It's possible that it doesn't fit in the current form or for all three

> > integers. Just give some time and think about it. Maybe you can come

> > up with a better idea. I usually point to one case I have solved [1]

> > to show that ugly tables can be dropped (in some cases it makes sense

> > to leave them, though).

> >

> > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/spi/spi-pxa2xx.c?id=9df461eca18f5395ee84670cdba6755dddec1898

>

>  Well, I considered a naïve algorithm, but it would be quadratic, it would

> make a lot of redundant calculations, and division is not exactly a fast

> operation; some targets we support have no hardware division instruction

> even.  Needless to say it would still have to encode the ranges supported.

> And any code size gain from the naïve algorithm might still not match the

> 268 bytes the table consumes, especially with RISC targets.

>

>  I think my proposal is a reasonable compromise that addresses the problem

> at hand, and if you or anyone else finds a better solution sometime, then

> you are free to improve it.  Perfect is the enemy of good enough, so the

> potential existence of an ultimate solution that hasn't been discovered

> yet shouldn't block progress in a hope that the solution gets found soon

> enough.


Fair enough.

>  I'll repost the outstanding pieces of the series with the adjustments I

> have agreed to make.


Thanks!

[2]: Hmm... It's funny how some things may easily disappear from the
internet. I'll look again tomorrow for the links. Okay, I have the
discussion locally available, I may forward you the entire thread.
[3]: https://elixir.bootlin.com/linux/latest/source/drivers/tty/serial/serial_core.c#L986
[4]: https://lore.kernel.org/linux-serial/CAHhAz+hrmLiRJ77cM=CYj+iyH8aUJ64R6FtAwGqqB2pOS0n0aQ@mail.gmail.com/T/#u
[5]: https://github.com/npat-efault/picocom/blob/master/termios2.txt


--
With Best Regards,
Andy Shevchenko
Maciej W. Rozycki Feb. 12, 2022, 8:41 a.m. UTC | #6
On Mon, 19 Jul 2021, Andy Shevchenko wrote:

> >  The TIOCGSERIAL/TIOCSSERIAL ioctls.  It's not clear to me why you're
> > asking; as a serial driver expert you must have been surely aware of their
> > existence.
> 
> Yes, I know how it's used. It's the ugliest hack in the serial support in Linux.
> Telling that baud rate is 38400 and supplying something completely different,
> non-standard stuff.

 Well, the interface reflects how hardware used to work actually.

 My DECstation 5000/200 box has an ASIC providing a quad UART, or a serial
line multiplexer really (handled by drivers/tty/serial/dz.c BTW), which is 
a design dating back to 1970 and a discrete DZ11 interface for Unibus[1], 
and the device has a 4-bit baud rate selector for the internal generator 
with individual settings from 0x0 to 0xe selecting the standard baud rates 
between 50 and 9600 baud, and then 0xf selects an external baud generator 
input, which in this particular machine lets you switch between 19200 and 
38400 through a board CSR.  See how this arrangement exactly matches the 
EXTA/EXTB macros aka B19200/B38400 (now you have found out where these 
have come from!).

 The original DZ11 interface did not expect such high serial communication 
speeds of course and hence the shortage of control bits for the internal 
baud rate generator.

 And then the SPD_CUST API has been similarly invented for an analogous 
shortcoming of the old version of the terminal I/O interface (who could 
have thought speeds beyond 38400 would ever be required!) and we continue 
carrying this historical baggage, as the cost of backwards compatibility.

 Note that the DECstation 5000/200 is a 1990 design, so not much older
than Linux and we just adapted to the world at the time.  I wouldn't mind 
a saner API to set the raw clock divider, but nobody has come up with a 
more up-to-date solution.

> >  You can program the divider registers directly with any bit pattern
> > supported by hardware.
> 
> And again, how does it differ from BOTHER paths (except being a hack)?
> You supply baud rate with 1 baud granularity and program hardware
> registers accordingly. (Yes, I remember you pointed out the sub-HZ
> frequencies, but I don't buy it as a very good argument because the
> only significant error can be achieved at baud rates under the 100).

 With this device owing to the internal 16-bit divisor range limitation 
you can't even set the baud rates under 100 (or under 239 actually) with 
BOTHER.  This has nothing to do with HZ; it's just that 15625000 / 65535 
works out at 238.422.

 Plus all the world is not modems and teletypes nowadays anymore.  People 
wire all kinds of weird equipment to serial ports and the more flexibility 
we give them in driving it the better.

> >  You don't know what use for it has been out there
> > in the field for the last ~30 years.
> 
> Is it a question or statement?

 It's a statement.

> > > > 3. it doesn't hurt.
> > >
> > > It hurts development a lot.
> >
> >  It is there and the presence or absence of the feature for OxSemi devices
> > does not affect anything but OxSemi support code.
> 
> It's a pity. But what support code needs the SPD_CUST nowadays?

 I don't know.  But there is no alternative raw interface and I think it 
is important to give people good tools for their work.

> >  It is currently a supported feature for OxSemi devices (and most if not
> > all 8250 derivatives), and I don't see a reason to selectively remove it
> > with this specifice submission.  There may be installations out there that
> > rely on it -- there have been shortcomings in the implementation, which
> > are the motivation for this patch series, but mind that we've had support
> > for OxSemi Tornado devices since 2008.  That's a lot of time.
> >
> >  Driver writers for other UART implementations may choose to have it or
> > not to with their new code written from scratch.  As a matter of interest
> > I've checked zs.c, one of the serial hw drivers I had considerable input
> > to and it uses its own ZS_BPS_TO_BRG macro rather than `uart_get_divisor',
> > so it does not support this feature (and dz.c only has a set of 15 fixed
> > baud rates, which is where the original termio B<rate> macro bit patterns,
> > as well as the EXTA and EXTB macros for the externally supplied clock
> > chosen by the 16th bit pattern, come from).
> >
> >  And as I say: if you want to remove it, then do it globally and tell
> > people that as from Linux x.y.z this feature is no longer supported, so
> > that is clear and unambiguos and some poor IT soul out there does not get
> > hit by a random obscure driver feature removal with a kernel upgrade.
> 
> I had had this in my plans [2] but suddenly I had no time to accomplish 
> it. :-(

 So I do hope you can realise why perfect is the enemy of good enough.  
There was nothing wrong in particular with my proposed bug fix for the 
clock calculation and yet half a year on and we got nowhere only because 
my proposal was not perfect enough and then I was preempted with other 
stuff.

> >  NB in the course of this effort I've learnt the hard way that `setserial'
> > is even invoked automatically nowadays in startup/shutdown scripts for
> > port state saving and restoration, so a random unannounced feature removal
> > here can wreak havoc with people's installations they have configured
> > years ago.
> 
> I believe that for these years spd_cust shouldn't be used at all. OK,
> it seems the first thing we may do is to patch the user space to give
> a fat warning that spd_cust is deprecated and should be avoided.
> And... it seems already there [3].

 Yet no equally functional alternative.  Call it BRAW or something and 
pass a 32-bit or 64-bit cookie for the driver to interpret.  And then you 
can remove the spd_cust/38400bps hack.

> >  Just pointing out the incompleteness of the implementation should
> > SPD_CUST be removed.
> >
> > > So, please no, drop it.
> >
> >  Based on my consideration given above, no, I maintain keeping the feature
> > is the right approach, at least for this change concerned.  After all its
> > purpose is to correct baud rate setting and not to remove vaguely related
> > features.
> 
> I see your point. My question here, does this series extend SPD_CUST
> to additional (newly supported?) baud rates? If so, I would be against
> that extension. Supporting deprecated stuff is okay for a while.

 It does allow one to program the full clock divider range of the OxSemi 
devices.  I find it appropriate according to my engineer's code of good 
practices.  And it doesn't cause any burden for non-OxSemi code.

> To the end of the discussion may be good to read also these [4] and [5]

 Thank you for the pointers, however there's nothing new to me there, 
sorry.

> >  Not for serial hardware drivers, but a while ago I committed what now
> > lives at Documentation/networking/device_drivers/fddi/defza.rst along with
> > several other networking driver notes, so I imagine we can have a similar
> > collection for 8250 stuff and the like.
> 
> To me any folder is more or less okayish as long as it's there, under
> the Documentation :-)

 I've chosen Documentation/tty/device_drivers/oxsemi-tornado.rst then, 
following the pattern.

> >  OK, the size argument alone makes some sense to me, though OTOH splitting
> > sources into many files prevents the compiler from doing interprocedural
> > optimisations, which can only be partially compensated by LTO (if enabled
> > in the first place).
> >
> >  I'll see what I can do here anyway.  Mind that non-PCIe OxSemi stuff
> > remains incomplete, as I noted above, so it'll be a partial driver anyway.
> 
> Thanks for doing that anyway!

 So I have had a look at how it has been done for other drivers and I have 
now convinced myself against such a split.  The primary reason for this 
conclusion is that there is no basic infrastructure for such a split and 
the ultimate result is code duplication with no clear benefit to justify 
it.

 Take for example the most recently separated dedicated 8250_pericom.c 
driver: its `pericom8250_probe' function duplicates pieces extracted from 
`pciserial_init_one' (so if a change has to be applied to either function, 
then all its siblings have to be manually verified as to whether the same 
change has to be applied there as well; I bet that it won't happen as 
required and pieces of code will slowly diverge and suffer from bitrot) 
and then PCI error handling and suspend/remove support have been removed 
for no reason.

 I suppose common code could be factored out from `pciserial_init_one' and 
the other routines private to 8250_pci.c either moved to a library archive 
to be linked into individual drivers or to a subdriver along the lines of 
drivers/net/ethernet/amd/7990.c or drivers/scsi/esp_scsi.c.  But it is not 
the case at the moment, so in my opinion the split has been premature.

 Originally I thought that those separated drivers are merely a source 
code split across multiple files to make maintenance easier, all to be 
built and linked together under the SERIAL_8250_PCI option, rather than 
standalone drivers.  It is clearly not the case though.

> >  I'll repost the outstanding pieces of the series with the adjustments I
> > have agreed to make.
> 
> Thanks!

 I shall be posting v3 right away along with fixes for serveral issues I 
have come across in the course of this development.  I will be happy to 
address any bugs, functional issues or coding style problems that may have 
escaped my attention, however I consider this version otherwise final and 
I am not going to make this code a separate driver or remove the custom 
baud rate divisor support code.

 The rationale behind this is that AFAIK it is not the Linux kernel policy 
for any of its subsystems to require any other bugs or missing features to 
be addressed as a prerequisite for a bug fix or even a functional 
improvement to be accepted.

 If this remains unacceptable, then so be it.  I have already gone above 
and beyond with this submission and I have put enough time into it, and I 
am not going to invest anymore, as this is likely to exceed what long-term 
maintenance as a local patch is going to take on one hand, and the change 
has been published so people who might need it can chase it and install 
locally themselves.  This is not my pet project, but rather an ad hoc 
effort to fix issues I have come across by chance.

References:

[1] "DZ11 asynchronous serial line interface", 
    <https://gunkies.org/wiki/DZ11_asynchronous_serial_line_interface>

  Maciej
Andy Shevchenko March 18, 2022, 1:40 p.m. UTC | #7
On Sat, Feb 12, 2022 at 08:41:19AM +0000, Maciej W. Rozycki wrote:
> On Mon, 19 Jul 2021, Andy Shevchenko wrote:
> 
> > >  The TIOCGSERIAL/TIOCSSERIAL ioctls.  It's not clear to me why you're
> > > asking; as a serial driver expert you must have been surely aware of their
> > > existence.
> > 
> > Yes, I know how it's used. It's the ugliest hack in the serial support in Linux.
> > Telling that baud rate is 38400 and supplying something completely different,
> > non-standard stuff.
> 
>  Well, the interface reflects how hardware used to work actually.
> 
>  My DECstation 5000/200 box has an ASIC providing a quad UART, or a serial
> line multiplexer really (handled by drivers/tty/serial/dz.c BTW), which is 
> a design dating back to 1970 and a discrete DZ11 interface for Unibus[1], 
> and the device has a 4-bit baud rate selector for the internal generator 
> with individual settings from 0x0 to 0xe selecting the standard baud rates 
> between 50 and 9600 baud, and then 0xf selects an external baud generator 
> input, which in this particular machine lets you switch between 19200 and 
> 38400 through a board CSR.  See how this arrangement exactly matches the 
> EXTA/EXTB macros aka B19200/B38400 (now you have found out where these 
> have come from!).
> 
>  The original DZ11 interface did not expect such high serial communication 
> speeds of course and hence the shortage of control bits for the internal 
> baud rate generator.
> 
>  And then the SPD_CUST API has been similarly invented for an analogous 
> shortcoming of the old version of the terminal I/O interface (who could 
> have thought speeds beyond 38400 would ever be required!) and we continue 
> carrying this historical baggage, as the cost of backwards compatibility.
> 
>  Note that the DECstation 5000/200 is a 1990 design, so not much older
> than Linux and we just adapted to the world at the time.  I wouldn't mind 
> a saner API to set the raw clock divider, but nobody has come up with a 
> more up-to-date solution.

What do you mean by the last sentence? The BOTHER is exactly what is new
ABI for the baud rate setting without using any ugly hacks in the kernel.

> > >  You can program the divider registers directly with any bit pattern
> > > supported by hardware.
> > 
> > And again, how does it differ from BOTHER paths (except being a hack)?
> > You supply baud rate with 1 baud granularity and program hardware
> > registers accordingly. (Yes, I remember you pointed out the sub-HZ
> > frequencies, but I don't buy it as a very good argument because the
> > only significant error can be achieved at baud rates under the 100).
> 
>  With this device owing to the internal 16-bit divisor range limitation 
> you can't even set the baud rates under 100 (or under 239 actually) with 
> BOTHER.  This has nothing to do with HZ; it's just that 15625000 / 65535 
> works out at 238.422.

I'm lost here. BOTHER requests whatever user wants to have with 1 baud
granularity. How can't it work? The only limitations here are the hw
limitations, but it doesn't affect the kernel <--> user space interfaces.

>  Plus all the world is not modems and teletypes nowadays anymore.  People 
> wire all kinds of weird equipment to serial ports and the more flexibility 
> we give them in driving it the better.
> 
> > >  You don't know what use for it has been out there
> > > in the field for the last ~30 years.
> > 
> > Is it a question or statement?
> 
>  It's a statement.

Use of SPD_CUST? Yeah, this weirdness may have spread a lot, but some of the
commits in the Linux kernel suggests that SPD_CUST is discouraged to be used.

> > > > > 3. it doesn't hurt.
> > > >
> > > > It hurts development a lot.
> > >
> > >  It is there and the presence or absence of the feature for OxSemi devices
> > > does not affect anything but OxSemi support code.
> > 
> > It's a pity. But what support code needs the SPD_CUST nowadays?
> 
>  I don't know.  But there is no alternative raw interface and I think it 
> is important to give people good tools for their work.

BOTHER _is_ the one. It seems to me you never tried it.

> > >  It is currently a supported feature for OxSemi devices (and most if not
> > > all 8250 derivatives), and I don't see a reason to selectively remove it
> > > with this specifice submission.  There may be installations out there that
> > > rely on it -- there have been shortcomings in the implementation, which
> > > are the motivation for this patch series, but mind that we've had support
> > > for OxSemi Tornado devices since 2008.  That's a lot of time.
> > >
> > >  Driver writers for other UART implementations may choose to have it or
> > > not to with their new code written from scratch.  As a matter of interest
> > > I've checked zs.c, one of the serial hw drivers I had considerable input
> > > to and it uses its own ZS_BPS_TO_BRG macro rather than `uart_get_divisor',
> > > so it does not support this feature (and dz.c only has a set of 15 fixed
> > > baud rates, which is where the original termio B<rate> macro bit patterns,
> > > as well as the EXTA and EXTB macros for the externally supplied clock
> > > chosen by the 16th bit pattern, come from).
> > >
> > >  And as I say: if you want to remove it, then do it globally and tell
> > > people that as from Linux x.y.z this feature is no longer supported, so
> > > that is clear and unambiguos and some poor IT soul out there does not get
> > > hit by a random obscure driver feature removal with a kernel upgrade.
> > 
> > I had had this in my plans [2] but suddenly I had no time to accomplish 
> > it. :-(
> 
>  So I do hope you can realise why perfect is the enemy of good enough.  

I hope you can realise that as well. Because BOTHER is already supported and
what you are doing is uglification of the code and resurrecting an evil.

> There was nothing wrong in particular with my proposed bug fix for the 
> clock calculation and yet half a year on and we got nowhere only because 
> my proposal was not perfect enough and then I was preempted with other 
> stuff.

> > >  NB in the course of this effort I've learnt the hard way that `setserial'
> > > is even invoked automatically nowadays in startup/shutdown scripts for
> > > port state saving and restoration, so a random unannounced feature removal
> > > here can wreak havoc with people's installations they have configured
> > > years ago.
> > 
> > I believe that for these years spd_cust shouldn't be used at all. OK,
> > it seems the first thing we may do is to patch the user space to give
> > a fat warning that spd_cust is deprecated and should be avoided.
> > And... it seems already there [3].
> 
>  Yet no equally functional alternative.  Call it BRAW or something and 
> pass a 32-bit or 64-bit cookie for the driver to interpret.  And then you 
> can remove the spd_cust/38400bps hack.

Have you tried BOTHER? It's already there for a long time. Don't tell me that
it's not an equivalent. It's better than that.

> > >  Just pointing out the incompleteness of the implementation should
> > > SPD_CUST be removed.
> > >
> > > > So, please no, drop it.
> > >
> > >  Based on my consideration given above, no, I maintain keeping the feature
> > > is the right approach, at least for this change concerned.  After all its
> > > purpose is to correct baud rate setting and not to remove vaguely related
> > > features.
> > 
> > I see your point. My question here, does this series extend SPD_CUST
> > to additional (newly supported?) baud rates? If so, I would be against
> > that extension. Supporting deprecated stuff is okay for a while.
> 
>  It does allow one to program the full clock divider range of the OxSemi 
> devices.  I find it appropriate according to my engineer's code of good 
> practices.  And it doesn't cause any burden for non-OxSemi code.

How BOTHER does prevent you doing the same?

> > To the end of the discussion may be good to read also these [4] and [5]
> 
>  Thank you for the pointers, however there's nothing new to me there, 
> sorry.
> 
> > >  Not for serial hardware drivers, but a while ago I committed what now
> > > lives at Documentation/networking/device_drivers/fddi/defza.rst along with
> > > several other networking driver notes, so I imagine we can have a similar
> > > collection for 8250 stuff and the like.
> > 
> > To me any folder is more or less okayish as long as it's there, under
> > the Documentation :-)
> 
>  I've chosen Documentation/tty/device_drivers/oxsemi-tornado.rst then, 
> following the pattern.
> 
> > >  OK, the size argument alone makes some sense to me, though OTOH splitting
> > > sources into many files prevents the compiler from doing interprocedural
> > > optimisations, which can only be partially compensated by LTO (if enabled
> > > in the first place).
> > >
> > >  I'll see what I can do here anyway.  Mind that non-PCIe OxSemi stuff
> > > remains incomplete, as I noted above, so it'll be a partial driver anyway.
> > 
> > Thanks for doing that anyway!
> 
>  So I have had a look at how it has been done for other drivers and I have 
> now convinced myself against such a split.  The primary reason for this 
> conclusion is that there is no basic infrastructure for such a split and 
> the ultimate result is code duplication with no clear benefit to justify 
> it.

Justification for split is to keep certain quirks out of the scope of the
generic driver. I'm not sure what duplication you are talking about if the
LOC statistics shows otherwise.

>  Take for example the most recently separated dedicated 8250_pericom.c 
> driver: its `pericom8250_probe' function duplicates pieces extracted from 
> `pciserial_init_one' (so if a change has to be applied to either function, 
> then all its siblings have to be manually verified as to whether the same 
> change has to be applied there as well; I bet that it won't happen as 
> required and pieces of code will slowly diverge and suffer from bitrot) 
> and then PCI error handling and suspend/remove support have been removed 
> for no reason.

You may not want to get the idea, it's fine. The rationale is simple:
isolate quirks for certain platform(s) in one place. Each platform
in a separate module.

>  I suppose common code could be factored out from `pciserial_init_one' and 
> the other routines private to 8250_pci.c either moved to a library archive 
> to be linked into individual drivers or to a subdriver along the lines of 
> drivers/net/ethernet/amd/7990.c or drivers/scsi/esp_scsi.c.  But it is not 
> the case at the moment, so in my opinion the split has been premature.
> 
>  Originally I thought that those separated drivers are merely a source 
> code split across multiple files to make maintenance easier, all to be 
> built and linked together under the SERIAL_8250_PCI option, rather than 
> standalone drivers.  It is clearly not the case though.
> 
> > >  I'll repost the outstanding pieces of the series with the adjustments I
> > > have agreed to make.
> > 
> > Thanks!
> 
>  I shall be posting v3 right away along with fixes for serveral issues I 
> have come across in the course of this development.  I will be happy to 
> address any bugs, functional issues or coding style problems that may have 
> escaped my attention, however I consider this version otherwise final and 
> I am not going to make this code a separate driver or remove the custom 
> baud rate divisor support code.
> 
>  The rationale behind this is that AFAIK it is not the Linux kernel policy 
> for any of its subsystems to require any other bugs or missing features to 
> be addressed as a prerequisite for a bug fix or even a functional 
> improvement to be accepted.
> 
>  If this remains unacceptable, then so be it.  I have already gone above 
> and beyond with this submission and I have put enough time into it, and I 
> am not going to invest anymore, as this is likely to exceed what long-term 
> maintenance as a local patch is going to take on one hand, and the change 
> has been published so people who might need it can chase it and install 
> locally themselves.  This is not my pet project, but rather an ad hoc 
> effort to fix issues I have come across by chance.
> 
> References:
> 
> [1] "DZ11 asynchronous serial line interface", 
>     <https://gunkies.org/wiki/DZ11_asynchronous_serial_line_interface>
> 
>   Maciej
Maciej W. Rozycki March 23, 2022, 9:59 p.m. UTC | #8
On Fri, 18 Mar 2022, Andy Shevchenko wrote:

> >  It does allow one to program the full clock divider range of the OxSemi 
> > devices.  I find it appropriate according to my engineer's code of good 
> > practices.  And it doesn't cause any burden for non-OxSemi code.
> 
> How BOTHER does prevent you doing the same?

 It does not allow you to set arbitrary serial port clock rates.  You can 
only set integer baud rates, and then only those that do not exceed the 
[1;65535] clock divisor range.

> >  So I have had a look at how it has been done for other drivers and I have 
> > now convinced myself against such a split.  The primary reason for this 
> > conclusion is that there is no basic infrastructure for such a split and 
> > the ultimate result is code duplication with no clear benefit to justify 
> > it.
> 
> Justification for split is to keep certain quirks out of the scope of the
> generic driver. I'm not sure what duplication you are talking about if the
> LOC statistics shows otherwise.

 All the init/remove code is almost the same across all the devices.  And 
suspend/resume and PCI error handling code has been removed from the split 
off devices, and for the functional regression to be fixed:

1. this code would have to be replicated, or

2. handlers from the generic 8250_pci.c driver exported and referred to, 
   or

3. some kind of a helper library (or a core module) created providing this 
   stuff to 8250_*.c drivers as required.

 I guess the latter is the minimum that could convince me this driver
framework is usable for implementing device-specific drivers (as I find 
the other variants rather miserable hacks).

 Plus there would have to be clear information provided to the users as 
otherwise people will be rather confused as to why 3 out of their 4 16x50 
PCI/e serial cards work with 8250_pci.c while the remaining one does not 
(probably broken, or is it?).

> You may not want to get the idea, it's fine. The rationale is simple:
> isolate quirks for certain platform(s) in one place. Each platform
> in a separate module.

 What is a platform in your terminology?  A PCI/e option card you can 
install in about any modern computer?  I usually think of platforms as 
specific families of computers rather than option cards.  Variants of 
otherwise the same device are usually handled with a single driver in 
Linux.

  Maciej
Andy Shevchenko March 24, 2022, 11:28 a.m. UTC | #9
On Wed, Mar 23, 2022 at 09:59:28PM +0000, Maciej W. Rozycki wrote:
> On Fri, 18 Mar 2022, Andy Shevchenko wrote:
> 
> > >  It does allow one to program the full clock divider range of the OxSemi 
> > > devices.  I find it appropriate according to my engineer's code of good 
> > > practices.  And it doesn't cause any burden for non-OxSemi code.
> > 
> > How BOTHER does prevent you doing the same?
> 
>  It does not allow you to set arbitrary serial port clock rates.  You can 
> only set integer baud rates,

Why do you need fractional baud rates? What is the practical use of that, please?

>	and then only those that do not exceed the [1;65535] clock divisor
>	range.

Can you be more specific as I can't see how it's possible in practice? In
several 8250 drivers we are able to set whatever we want to the limits of
the hardware.

> > >  So I have had a look at how it has been done for other drivers and I have 
> > > now convinced myself against such a split.  The primary reason for this 
> > > conclusion is that there is no basic infrastructure for such a split and 
> > > the ultimate result is code duplication with no clear benefit to justify 
> > > it.
> > 
> > Justification for split is to keep certain quirks out of the scope of the
> > generic driver. I'm not sure what duplication you are talking about if the
> > LOC statistics shows otherwise.
> 
>  All the init/remove code is almost the same across all the devices.

Each of the platform has its own constraints and what you see as a repetition
is just a similarity. If you have an idea of the common probe function, please
share.

Also, don't forget the memory footprint case at run time. In embedded world
we do not need 8250 code that is not supported by the platform in question.

The split allows to disable / remove the code that is not needed.

> And suspend/resume and PCI error handling code has been removed from the
> split off devices,

This is managed by PCI core. Any specifics, please?

>	and for the functional regression to be fixed:
> 
> 1. this code would have to be replicated, or
> 
> 2. handlers from the generic 8250_pci.c driver exported and referred to, 
>    or
> 
> 3. some kind of a helper library (or a core module) created providing this 
>    stuff to 8250_*.c drivers as required.

Which functional regression? You mean if it will be found then it needs to
be fixed in several places?

>  I guess the latter is the minimum that could convince me this driver
> framework is usable for implementing device-specific drivers (as I find 
> the other variants rather miserable hacks).
> 
>  Plus there would have to be clear information provided to the users as 
> otherwise people will be rather confused as to why 3 out of their 4 16x50 
> PCI/e serial cards work with 8250_pci.c while the remaining one does not 
> (probably broken, or is it?).

The default configuration after the split assumes that the driver is enabled
by the very same kernel configuration. Otherwise distributions will choose
what they consider better for their users and customers.

> > You may not want to get the idea, it's fine. The rationale is simple:
> > isolate quirks for certain platform(s) in one place. Each platform
> > in a separate module.
> 
>  What is a platform in your terminology?  A PCI/e option card you can 
> install in about any modern computer?  I usually think of platforms as 
> specific families of computers rather than option cards.  Variants of 
> otherwise the same device are usually handled with a single driver in 
> Linux.

It might be PCIe card, it might be soldered on the motherboard, it can
be part of the SoC.

By platform I assume the certain SoC + certain discrete components wired
in a certain way (PCB level). In your case it's a motherboard with PCIe
serial port card.