diff mbox series

serial: Revert serial: core: Fix serial core port id to not use port->line

Message ID 98a891fd-5a1f-6568-a12c-28577126a42@sealevel.com
State New
Headers show
Series serial: Revert serial: core: Fix serial core port id to not use port->line | expand

Commit Message

Matthew Howell Aug. 28, 2023, 8:41 p.m. UTC
From: Matthew Howell <matthew.howell@sealevel.com>
XR17V35X cards seemingly unable to register serial port. Confirmed on 
Sealevel 7202C, 7204EC, and Exar XR17V352 reference board. 
dmesg states: "Couldn't register serial port 0, irq 24, type 2, error -22"

I first identified the problem when I pulled down 6.6-rc1 and I was able 
to trace it to d962de6ae51f9b76ad736220077cda83084090b1. I understand that this 
commit is noted as being reverted in 1ef2c2df1199, but I was only able to 
resolve the issue by reverting d962de6ae51f myself using this patch.

I suggest reverting using this patch unless someone more knowledgeable 
about what these changes actually do has a better suggestion.

Revert "serial: core: serial core port id to not use port->line"
This reverts commit d962de6ae51f9b76ad736220077cda83084090b1

Link: https://lore.kernel.org/all/20230725054216.45696-3-tony@atomide.com/

Fixes: d962de6ae51f9b76ad736220077cda83084090b1 ("serial: core: Fix serial 
core port id to not use port->line")
Signed-off-by: Matthew Howell <matthew.howell@sealevel.com>
---

Comments

Tony Lindgren Aug. 29, 2023, 3:52 a.m. UTC | #1
Hi,

* Matthew Howell <matthew.howell@sealevel.com> [230828 20:41]:
> From: Matthew Howell <matthew.howell@sealevel.com>
> XR17V35X cards seemingly unable to register serial port. Confirmed on 
> Sealevel 7202C, 7204EC, and Exar XR17V352 reference board. 
> dmesg states: "Couldn't register serial port 0, irq 24, type 2, error -22"
> 
> I first identified the problem when I pulled down 6.6-rc1 and I was able 
> to trace it to d962de6ae51f9b76ad736220077cda83084090b1. I understand that this 
> commit is noted as being reverted in 1ef2c2df1199, but I was only able to 
> resolve the issue by reverting d962de6ae51f myself using this patch.

Thanks for the report. Do you maybe mean 6.5-rc1 instead of 6.6-rc1 above?

If so, I suspect the issue you are reporting got already fixed during the
-rc cycle for v6.5 kernel.

> I suggest reverting using this patch unless someone more knowledgeable 
> about what these changes actually do has a better suggestion.

Can you please test with v6.5 kernel? It has the two fixes below that
sounds like you may have been missing:

a4a79e03bab5 ("serial: core: Revert port_id use")
04c7f60ca477 ("serial: core: Fix serial core port id, including multiport devices")

Note how commit a4a79e03bab5 already did a partial revert of what you're
suggesting.

If you already have these two commits, then let's investigate further to
see what might be still wrong.

Regards,

Tony
Matthew Howell Aug. 29, 2023, 1:42 p.m. UTC | #2
On Tue, 29 Aug 2023, Tony Lindgren wrote:

> Hi,
> 
> * Matthew Howell <matthew.howell@sealevel.com> [230828 20:41]:
> > From: Matthew Howell <matthew.howell@sealevel.com>
> > XR17V35X cards seemingly unable to register serial port. Confirmed on
> > Sealevel 7202C, 7204EC, and Exar XR17V352 reference board.
> > dmesg states: "Couldn't register serial port 0, irq 24, type 2, error -22"
> >
> > I first identified the problem when I pulled down 6.6-rc1 and I was able
> > to trace it to d962de6ae51f9b76ad736220077cda83084090b1. I understand that this
> > commit is noted as being reverted in 1ef2c2df1199, but I was only able to
> > resolve the issue by reverting d962de6ae51f myself using this patch.
> 
> Thanks for the report. Do you maybe mean 6.5-rc1 instead of 6.6-rc1 above?

Apologies, I meant 6.5, no RC. Specifically, I first found this issue on 
the v6.5 tag (2dde18cd1d8f). I then rolled back until I traced the issue 
down to the patch in question (d962de6ae51f). Even more specifically, 
according to my test notes I tested the following commits, with results as 
indicated:

04c7f60ca477ffbf7b7910320482335050f0d23a -> Not working
3d9e6f556e235ddcdc9f73600fdd46fe1736b090 -> Not working
3c4f8333b582487a2d1e02171f1465531cde53e3 -> Not working
a4a79e03bab57729bd8046d22bf3666912e586fb -> Not working
1ef2c2df11997b8135f34adcf2c200d3b4aacbe9 -> Not working
d962de6ae51f9b76ad736220077cda83084090b1 -> Not working
282069845af388b08d622ad192b831dcd0549c62 -> Working
e6d34ced01bc3aaad616b9446bbaa96cd04617c4 -> Working
748c5ea8b8796ae8ee80b8d3a3d940570b588d59 -> Working
868a9fd9480785952336e5f119e1f75877c423a8 -> Working

> > If so, I suspect the issue you are reporting got already fixed during the
> -rc cycle for v6.5 kernel.
> 
> > I suggest reverting using this patch unless someone more knowledgeable
> > about what these changes actually do has a better suggestion.
> 
> Can you please test with v6.5 kernel? It has the two fixes below that
> sounds like you may have been missing:
> 
> a4a79e03bab5 ("serial: core: Revert port_id use")
> 04c7f60ca477 ("serial: core: Fix serial core port id, including multiport devices")
> 
> Note how commit a4a79e03bab5 already did a partial revert of what you're
> suggesting.
> 
> If you already have these two commits, then let's investigate further to
> see what might be still wrong.

Looking at the results in my notes (see above), I had tested those 
two commits specifically and found they still had the same issue.

What I can say for certain is that of the commits I have tested: 

1) Commits before d962de6ae51f work on the hardware I have tested 
2) Commits after d962de6ae51f don't work on the hardware I have tested
3) Pulling v6.5 and reverting d962de6ae51f with git revert resolves the 
issue

> Regards,
> 
> Tony
>
Tony Lindgren Aug. 29, 2023, 8:15 p.m. UTC | #3
* Matthew Howell <matthew.howell@sealevel.com> [230829 13:42]:
> On Tue, 29 Aug 2023, Tony Lindgren wrote:
> 
> > Hi,
> > 
> > * Matthew Howell <matthew.howell@sealevel.com> [230828 20:41]:
> > > From: Matthew Howell <matthew.howell@sealevel.com>
> > > XR17V35X cards seemingly unable to register serial port. Confirmed on
> > > Sealevel 7202C, 7204EC, and Exar XR17V352 reference board.
> > > dmesg states: "Couldn't register serial port 0, irq 24, type 2, error -22"
> > >
> > > I first identified the problem when I pulled down 6.6-rc1 and I was able
> > > to trace it to d962de6ae51f9b76ad736220077cda83084090b1. I understand that this
> > > commit is noted as being reverted in 1ef2c2df1199, but I was only able to
> > > resolve the issue by reverting d962de6ae51f myself using this patch.
> > 
> > Thanks for the report. Do you maybe mean 6.5-rc1 instead of 6.6-rc1 above?
> 
> Apologies, I meant 6.5, no RC. Specifically, I first found this issue on 
> the v6.5 tag (2dde18cd1d8f). I then rolled back until I traced the issue 
> down to the patch in question (d962de6ae51f). Even more specifically, 
> according to my test notes I tested the following commits, with results as 
> indicated:
> 
> 04c7f60ca477ffbf7b7910320482335050f0d23a -> Not working
> 3d9e6f556e235ddcdc9f73600fdd46fe1736b090 -> Not working
> 3c4f8333b582487a2d1e02171f1465531cde53e3 -> Not working
> a4a79e03bab57729bd8046d22bf3666912e586fb -> Not working
> 1ef2c2df11997b8135f34adcf2c200d3b4aacbe9 -> Not working
> d962de6ae51f9b76ad736220077cda83084090b1 -> Not working
> 282069845af388b08d622ad192b831dcd0549c62 -> Working
> e6d34ced01bc3aaad616b9446bbaa96cd04617c4 -> Working
> 748c5ea8b8796ae8ee80b8d3a3d940570b588d59 -> Working
> 868a9fd9480785952336e5f119e1f75877c423a8 -> Working

OK

> What I can say for certain is that of the commits I have tested: 
> 
> 1) Commits before d962de6ae51f work on the hardware I have tested 
> 2) Commits after d962de6ae51f don't work on the hardware I have tested
> 3) Pulling v6.5 and reverting d962de6ae51f with git revert resolves the 
> issue

OK. To me it seems uart.port.port_id should be always 0 in exar_pci_probe()
and get automatically allocated in serial_base_port_add(). Sounds like this
is not a duplicate port_id issue though but something else as it sounds
like you're not getting duplicate sysfs entry related errors.

If it is a port_id conflict I'm not sure why commit 3d9e6f556e23 is not
working for your as it has commit a4a79e03bab5 ("serial: core: Revert
port_id use"). Care to check that again, or maybe try with v6.5 with just
the commit below reverted?

04c7f60ca477 ("serial: core: Fix serial core port id, including multiport devices")

Dmesg output might help also to figure out if this happens on the first
port or the second port.

Not sure yet where the -22 error here comes from.

Regards,

Tony
Matthew Howell Aug. 31, 2023, 2:58 p.m. UTC | #4
On Tue, 29 Aug 2023, Tony Lindgren wrote:

> ⚠Caution: External email. Exercise extreme caution with links or attachments.⚠
> 
> 
> * Matthew Howell <matthew.howell@sealevel.com> [230829 13:42]:
> > On Tue, 29 Aug 2023, Tony Lindgren wrote:
> >
> > > Hi,
> > >
> > > * Matthew Howell <matthew.howell@sealevel.com> [230828 20:41]:
> > > > From: Matthew Howell <matthew.howell@sealevel.com>
> > > > XR17V35X cards seemingly unable to register serial port. Confirmed on
> > > > Sealevel 7202C, 7204EC, and Exar XR17V352 reference board.
> > > > dmesg states: "Couldn't register serial port 0, irq 24, type 2, error -22"
> > > >
> > > > I first identified the problem when I pulled down 6.6-rc1 and I was able
> > > > to trace it to d962de6ae51f9b76ad736220077cda83084090b1. I understand that this
> > > > commit is noted as being reverted in 1ef2c2df1199, but I was only able to
> > > > resolve the issue by reverting d962de6ae51f myself using this patch.
> > >
> > > Thanks for the report. Do you maybe mean 6.5-rc1 instead of 6.6-rc1 above?
> >
> > Apologies, I meant 6.5, no RC. Specifically, I first found this issue on
> > the v6.5 tag (2dde18cd1d8f). I then rolled back until I traced the issue
> > down to the patch in question (d962de6ae51f). Even more specifically,
> > according to my test notes I tested the following commits, with results as
> > indicated:
> >
> > 04c7f60ca477ffbf7b7910320482335050f0d23a -> Not working
> > 3d9e6f556e235ddcdc9f73600fdd46fe1736b090 -> Not working
> > 3c4f8333b582487a2d1e02171f1465531cde53e3 -> Not working
> > a4a79e03bab57729bd8046d22bf3666912e586fb -> Not working
> > 1ef2c2df11997b8135f34adcf2c200d3b4aacbe9 -> Not working
> > d962de6ae51f9b76ad736220077cda83084090b1 -> Not working
> > 282069845af388b08d622ad192b831dcd0549c62 -> Working
> > e6d34ced01bc3aaad616b9446bbaa96cd04617c4 -> Working
> > 748c5ea8b8796ae8ee80b8d3a3d940570b588d59 -> Working
> > 868a9fd9480785952336e5f119e1f75877c423a8 -> Working
> 
> OK
> 
> > What I can say for certain is that of the commits I have tested:
> >
> > 1) Commits before d962de6ae51f work on the hardware I have tested
> > 2) Commits after d962de6ae51f don't work on the hardware I have tested
> > 3) Pulling v6.5 and reverting d962de6ae51f with git revert resolves the
> > issue
> 
> OK. To me it seems uart.port.port_id should be always 0 in exar_pci_probe()
> and get automatically allocated in serial_base_port_add(). Sounds like this
> is not a duplicate port_id issue though but something else as it sounds
> like you're not getting duplicate sysfs entry related errors.
>
> If it is a port_id conflict I'm not sure why commit 3d9e6f556e23 is not
> working for your as it has commit a4a79e03bab5 ("serial: core: Revert
> port_id use"). Care to check that again, or maybe try with v6.5 with just
> the commit below reverted?
> 
> 04c7f60ca477 ("serial: core: Fix serial core port id, including multiport devices")

Just tried that, but no difference. Same error.
 
> Dmesg output might help also to figure out if this happens on the first
> port or the second port.

The full error in dmesg is:
[Aug30 15:48] exar_serial 0000:04:00.0: Couldn't register serial port 0, irq 24, type 2, error -22

This is on a 2-port adapter. I don't see any indication in dmesg that it 
attempted to register the other port.

> Not sure yet where the -22 error here comes from.
>
> Regards,
> 
> Tony
>
Tony Lindgren Sept. 1, 2023, 4:47 a.m. UTC | #5
* Matthew Howell <matthew.howell@sealevel.com> [230831 14:58]:
> On Tue, 29 Aug 2023, Tony Lindgren wrote:
> > If it is a port_id conflict I'm not sure why commit 3d9e6f556e23 is not
> > working for your as it has commit a4a79e03bab5 ("serial: core: Revert
> > port_id use"). Care to check that again, or maybe try with v6.5 with just
> > the commit below reverted?
> > 
> > 04c7f60ca477 ("serial: core: Fix serial core port id, including multiport devices")
> 
> Just tried that, but no difference. Same error.

OK thanks to testing it. So it's starting to look like the issue is
somehow related to the serial8250_setup_port() change in commit d962de6ae51f
("serial: core: Fix serial core port id to not use port->line").

The experimental patch below should confirm if the issue is related to the
port_id usage or serial8250_setup_port(). Care to give this a try against
v6.5 without other patches?

> > Dmesg output might help also to figure out if this happens on the first
> > port or the second port.
> 
> The full error in dmesg is:
> [Aug30 15:48] exar_serial 0000:04:00.0: Couldn't register serial port 0, irq 24, type 2, error -22
> 
> This is on a 2-port adapter. I don't see any indication in dmesg that it 
> attempted to register the other port.

OK. Are there other 8520 related uarts probing before that?

Regards,

Tony

8< --------------------
diff --git a/drivers/tty/serial/serial_base_bus.c b/drivers/tty/serial/serial_base_bus.c
--- a/drivers/tty/serial/serial_base_bus.c
+++ b/drivers/tty/serial/serial_base_bus.c
@@ -169,7 +169,7 @@ struct serial_port_device *serial_base_port_add(struct uart_port *port,
 	err = serial_base_device_init(port, &port_dev->dev,
 				      &ctrl_dev->dev, &serial_port_type,
 				      serial_base_port_release,
-				      port->ctrl_id, port->port_id);
+				      port->ctrl_id, port->line);
 	if (err)
 		goto err_put_device;
Matthew Howell Sept. 1, 2023, 6:48 p.m. UTC | #6
On Fri, 1 Sep 2023, Tony Lindgren wrote:
> * Matthew Howell <matthew.howell@sealevel.com> [230831 14:58]:
> > On Tue, 29 Aug 2023, Tony Lindgren wrote:
> > > If it is a port_id conflict I'm not sure why commit 3d9e6f556e23 is not
> > > working for your as it has commit a4a79e03bab5 ("serial: core: Revert
> > > port_id use"). Care to check that again, or maybe try with v6.5 with just
> > > the commit below reverted?
> > >
> > > 04c7f60ca477 ("serial: core: Fix serial core port id, including multiport devices")
> >
> > Just tried that, but no difference. Same error.
> 
> OK thanks to testing it. So it's starting to look like the issue is
> somehow related to the serial8250_setup_port() change in commit d962de6ae51f
> ("serial: core: Fix serial core port id to not use port->line").
> 
> The experimental patch below should confirm if the issue is related to the
> port_id usage or serial8250_setup_port(). Care to give this a try against
> v6.5 without other patches?

For some reason I am unable to apply the patch using either git apply or 
the patch command. I made the change manually though and the issue still 
occurs.
 
> > > Dmesg output might help also to figure out if this happens on the first
> > > port or the second port.
> >
> > The full error in dmesg is:
> > [Aug30 15:48] exar_serial 0000:04:00.0: Couldn't register serial port 0, irq 24, type 2, error -22
> >
> > This is on a 2-port adapter. I don't see any indication in dmesg that it
> > attempted to register the other port.
> 
> OK. Are there other 8520 related uarts probing before that?

Not that I can see. The only earlier entries I see are from before I load 
the patched driver. If I add another card in I can see that both are 
probed, but both have the error:

[  +0.005929] exar_serial 0000:01:00.0: Couldn't register serial port 0, 
irq 24, type 2, error -22
[  +0.003431] exar_serial 0000:04:00.0: Couldn't register serial port 0, 
irq 25, type 2, error -22

04:00.0 is a Sealevel card and the other is a reference Exar XR17V352.

> Regards,
> 
> Tony
> 
> 8< --------------------
> diff --git a/drivers/tty/serial/serial_base_bus.c b/drivers/tty/serial/serial_base_bus.c
> --- a/drivers/tty/serial/serial_base_bus.c
> +++ b/drivers/tty/serial/serial_base_bus.c
> @@ -169,7 +169,7 @@ struct serial_port_device *serial_base_port_add(struct uart_port *port,
>         err = serial_base_device_init(port, &port_dev->dev,
>                                       &ctrl_dev->dev, &serial_port_type,
>                                       serial_base_port_release,
> -                                     port->ctrl_id, port->port_id);
> +                                     port->ctrl_id, port->line);
>         if (err)
>                 goto err_put_device;
> 
>
Tony Lindgren Sept. 2, 2023, 4:34 a.m. UTC | #7
* Matthew Howell <matthew.howell@sealevel.com> [230901 18:48]:
> On Fri, 1 Sep 2023, Tony Lindgren wrote:
> > * Matthew Howell <matthew.howell@sealevel.com> [230831 14:58]:
> > > On Tue, 29 Aug 2023, Tony Lindgren wrote:
> > > > If it is a port_id conflict I'm not sure why commit 3d9e6f556e23 is not
> > > > working for your as it has commit a4a79e03bab5 ("serial: core: Revert
> > > > port_id use"). Care to check that again, or maybe try with v6.5 with just
> > > > the commit below reverted?
> > > >
> > > > 04c7f60ca477 ("serial: core: Fix serial core port id, including multiport devices")
> > >
> > > Just tried that, but no difference. Same error.
> > 
> > OK thanks to testing it. So it's starting to look like the issue is
> > somehow related to the serial8250_setup_port() change in commit d962de6ae51f
> > ("serial: core: Fix serial core port id to not use port->line").
> > 
> > The experimental patch below should confirm if the issue is related to the
> > port_id usage or serial8250_setup_port(). Care to give this a try against
> > v6.5 without other patches?
> 
> For some reason I am unable to apply the patch using either git apply or 
> the patch command. I made the change manually though and the issue still 
> occurs.

OK. If the patch did not apply against v6.5, can you please verify you don't
have other patches applied like your revert? I don't think the patch I sent
is white space damanged or anything. Doing git diff v6.5.. should show you
what might be different :)

> > > > Dmesg output might help also to figure out if this happens on the first
> > > > port or the second port.
> > >
> > > The full error in dmesg is:
> > > [Aug30 15:48] exar_serial 0000:04:00.0: Couldn't register serial port 0, irq 24, type 2, error -22
> > >
> > > This is on a 2-port adapter. I don't see any indication in dmesg that it
> > > attempted to register the other port.
> > 
> > OK. Are there other 8520 related uarts probing before that?
> 
> Not that I can see. The only earlier entries I see are from before I load 
> the patched driver. If I add another card in I can see that both are 
> probed, but both have the error:
> 
> [  +0.005929] exar_serial 0000:01:00.0: Couldn't register serial port 0, 
> irq 24, type 2, error -22
> [  +0.003431] exar_serial 0000:04:00.0: Couldn't register serial port 0, 
> irq 25, type 2, error -22
> 
> 04:00.0 is a Sealevel card and the other is a reference Exar XR17V352.

Not sure what you mean with the 8250 entries from before loading the
patched driver..

Maybe things go wrong already somewhere earlier if the integrated 8250
port(s) don't show up either? If so, maybe this issue is somehow machine
specific rather than 8250_exar specific.

Can you please post or email me your full working dmesg output, failing
dmesg output, the kernel .config used, and kernel command line?

Also, do you have any suggestions for a commonly available pcie 8250_exar
card I could try to reproduce this issue with?

Regards,

Tony
Tony Lindgren Sept. 5, 2023, 3:55 p.m. UTC | #8
* Matthew Howell <matthew.howell@sealevel.com> [230905 15:05]:
> On Sat, 2 Sep 2023, Tony Lindgren wrote:
> > OK. If the patch did not apply against v6.5, can you please verify you don't
> > have other patches applied like your revert? I don't think the patch I sent
> > is white space damanged or anything. Doing git diff v6.5.. should show you
> > what might be different :)
> 
> It shouldn't have had any patches applied to. Just verified again by 
> running git diff after pulling the v6.5 branch, but git diff shows no 
> differences. 
> 
> I suspect alpine may be mangling the text on my end in some way. If I 
> apply the changes manually and then run git diff v6.5 the patch looks the 
> same as what you provided, but tab/spaces are different.

OK thanks for checking.

> > Not sure what you mean with the 8250 entries from before loading the
> > patched driver..
> > 
> > Maybe things go wrong already somewhere earlier if the integrated 8250
> > port(s) don't show up either? If so, maybe this issue is somehow machine
> > specific rather than 8250_exar specific.
> 
> I should have been more specific there. I was actually referring to 
> 8250_exar entries, not 8250 entries. I have not had any issues with the 
> base 8250 driver loading.

Ah OK sorry I misunderstood.

> HOWEVER, I did just find something very interesting. When I first found 
> the issue my running kernel was still an RC version (6.5-RC4, I 
> believe). The issue did NOT occur in the running kernel, or when building 
> 8250_exar from the 6.5-RC4 source. I expected the issue to exist in the 
> running kernel after I updated to 6.5, but this is NOT the the case. 
> XR17V35X devices still work in my running kernel. It is only when I build 
> from source AND the source contains the port_id changes that the issue 
> occurs. My current kernel is 6.5.0-1-MANJARO. 
> 
> Could I be doing something wrong here that for some reason only
> manifests itself in combination with the port_id change? 
> 
> The only things I can think of are:
> 1) insmod does not account for dependencies, so in theory I could be 
> failing to build and load some other required module. However, modprobe 
> indicates 8250_exar has no dependencies, so I didn't think this should be 
> an issue.

If you are not using modprobe, and have CONFIG_SERIAL_CORE=m, you
need to load serial_base.ko. I don't think we can build the core stuff as
as serial_core.ko without renaming serial_core.c to something else. Looks
like your config has SERIAL_CORFE built-in though, and without the serial
core stuff you'd likely get "Unknown symbol in module" error loading
8250_exar.

> 2) The Arch/Manjaro Kernel I am running does not actually have the port_id 
> change, even though it should. Do you know of an a wy to determine this?

Well I guess you could check the patches applied to that kernel, but
presumably it's v6.5 for that part.

> The general build procedure I have been using is:
> 
> ## Clone v6.5 tagged kernel source
> git clone --depth=1 https://github.com/torvalds/linux.git --branch v6.5
> 
> ## Link symvers
> ln -s /usr/lib/modules/$(uname -r)/build/Module.symvers . 
> 
> ## Copy existing config
> zcat /proc/config.gz > .config
> 
> ## Make sure 8250_exar is built as a module. Disable auto-version.
> sed -i '/CONFIG_SERIAL_8250_EXAR=/c\CONFIG_SERIAL_8250_EXAR=m' .config
> sed -i '/CONFIG_LOCALVERSION_AUTO=/c\CONFIG_LOCALVERSION_AUTO=n' .config
> sed -i '/CONFIG_LOCALVERSION=/c\CONFIG_LOCALVERSION=""' .config
> make modules_prepare LOCALVERSION=-MANJARO EXTRAVERSION=-1
> 
> ## Apply patch, if applicable
> patch -p1 < patch.diff
> 
> ## Build and load module
> make M=drivers/tty/serial/8250/
> sudo rmmod 8250_exar

Maybe check if rmmod 8250_exar now somehow causes the following insmod
8250_exar attempts to fail?

> sudo insmod drivers/tty/serial/8250/8250_exar.ko
> 
> > Can you please post or email me your full working dmesg output, failing
> > dmesg output, the kernel .config used, and kernel command line?
> 
> ---
> Kernel Command Line:
> quiet splash resume=UUID=46a37dda-0d60-4ed1-94ea-9219fbe85dde udev.log_priority=3 iomem=relaxed
> ---
> 
> ---
> dmesg start
> Note: Everything before [ 1149.943049] is prior to loading the module 
> built from source. The successful version looks the same, except instead 
> of the error message I see the same ttyS4 and ttyS5 at MMIO... messages 
> that appeared before. 

OK yeah thanks, not seeing anything wrong early in the dmesg.

Regards,

Tony
Tony Lindgren Sept. 5, 2023, 4:51 p.m. UTC | #9
* Matthew Howell <matthew.howell@sealevel.com> [230905 16:43]:
> On Tue, 5 Sep 2023, Tony Lindgren wrote:
> > Maybe check if rmmod 8250_exar now somehow causes the following insmod
> > 8250_exar attempts to fail?
> 
> Could you clarify what you mean? It is at that stage that I normally see 
> the error in dmesg unless I have reverted the port id patch. In other 
> words, if I just load it as-is I get the error in question.
> 
> Do you mean to try loading the installed kernel module with insmod?
> If that is what you mean, I just tried loading the included binary with 
> insmod but did not get the error and it loaded correctly. I loaded it 
> with:
> 
> sudo insmod /usr/lib/modules/6.5.0-1-MANJARO/kernel/drivers/tty/serial/8250/8250_exar.ko.zst 

I meant maybe reloading 8250_exar fails. So the test I would do is build
build a plain v6.5 kernel, boot it, modprobe 8250_exar, rmmod 8250_exar,
and then again modprobe 8250_exar.

So maybe the first modprobe 8250_exar works after boot, but the second
modprobe 8250_exar won't?

> Do you see anything concerning or possibly incorrect with the way I am 
> building the 8250_exar module?

No I don't see how that would make a difference.

Regards,

Tony
Matthew Howell Sept. 11, 2023, 1:04 p.m. UTC | #10
On Tue, 5 Sep 2023, Tony Lindgren wrote:
> * Matthew Howell <matthew.howell@sealevel.com> [230905 16:43]:
> > On Tue, 5 Sep 2023, Tony Lindgren wrote:
> > > Maybe check if rmmod 8250_exar now somehow causes the following insmod
> > > 8250_exar attempts to fail?
> >
> > Could you clarify what you mean? It is at that stage that I normally see
> > the error in dmesg unless I have reverted the port id patch. In other
> > words, if I just load it as-is I get the error in question.
> >
> > Do you mean to try loading the installed kernel module with insmod?
> > If that is what you mean, I just tried loading the included binary with
> > insmod but did not get the error and it loaded correctly. I loaded it
> > with:
> >
> > sudo insmod /usr/lib/modules/6.5.0-1-MANJARO/kernel/drivers/tty/serial/8250/8250_exar.ko.zst
> 
> I meant maybe reloading 8250_exar fails. So the test I would do is build
> build a plain v6.5 kernel, boot it, modprobe 8250_exar, rmmod 8250_exar,
> and then again modprobe 8250_exar.

Sorry for the late reply, had some Manjaro/Arch specific issues installing 
the plain v6.5 kernel and so I switched over to Ubuntu 23.04. 

The results are as follows:

1. I confirmed I get the same behavior on Ubuntu as I do on Manjaro as far 
as compiling and loading 8250_exar from v6.5 using insmod. In this case, 
the running kernel was v6.2.

2. I installed the v6.5 kernel package and confirmed the issue does NOT 
occur 'out of the box'. In other words, the 8250_exar module 'bundled' / 
installed with the Ubuntu v6.5 kernel package does not have issues loading 
the ports.

3. I then attempted to insmod the 8250_exar module I built from the v6.5 
source. This still results in the same dmesg error.

4. I then built the entire v6.5 kernel from source and installed it on 
Ubuntu 23.04. I used the .config file from the Ubuntu v6.5 kernel. This 
behaves the same as the v6.5 kernel package. In other words, the 8250_exar 
module bundled / installed with the kernel has no issues, even when I 
compile it myself.

5. Finally, I tested to see if insmod vs modprobe was significant by 
unloading and reloading the installed / bundled 8250_exar with both 
modprobe and rmmod/insmod. There were no differences between the two 
methods. For insmod, I tried this against both the 8250_exar module in 
/lib/modules and the 8250_Exar module in the original source directory.   

In conclusion, it seems like the 8250_exar module works as expected when 
running against exactly the kernel it was built against, but fails when 
running against any other kernel, even if that kernel is practically 
identical. I also would have expected some symbol or symver error if the 
kernels actually differed in some meaningful way.

So it seems like this might not be an issue for typical 'users' of the 
kernel, but may impact people who need to build and load 8250_exar 
manually for whatever reason, unless I am just building the module 
incorrectly somehow. 

Any chance you could try building the module with your setup and then send 
it over? I changed the EXTRAVERSION and LOCALVERSION of my build such that 
the final version string is 6.5.0-090823-TESTING and I think your module 
would need to match for me to load it.

> So maybe the first modprobe 8250_exar works after boot, but the second
> modprobe 8250_exar won't?
> 
> > Do you see anything concerning or possibly incorrect with the way I am
> > building the 8250_exar module?
> 
> No I don't see how that would make a difference.
> 
> Regards,
> 
> Tony
> 
>
Greg KH Sept. 11, 2023, 1:15 p.m. UTC | #11
On Mon, Sep 11, 2023 at 09:04:14AM -0400, Matthew Howell wrote:
> So it seems like this might not be an issue for typical 'users' of the 
> kernel, but may impact people who need to build and load 8250_exar 
> manually for whatever reason, unless I am just building the module 
> incorrectly somehow. 

Yes, Linux does not support building a module against one source
tree/configuration and attempting to load it into a different kernel at
all.  You are lucky that this was the only thing that broke :)

Thanks for the testing and letting us know that all is good with the
tree as-is.

greg k-h
Matthew Howell Sept. 11, 2023, 1:58 p.m. UTC | #12
On Mon, 11 Sep 2023, Greg KH wrote:
> On Mon, Sep 11, 2023 at 09:04:14AM -0400, Matthew Howell wrote:
> > So it seems like this might not be an issue for typical 'users' of the
> > kernel, but may impact people who need to build and load 8250_exar
> > manually for whatever reason, unless I am just building the module
> > incorrectly somehow.
> 
> Yes, Linux does not support building a module against one source
> tree/configuration and attempting to load it into a different kernel at
> all.  You are lucky that this was the only thing that broke :)

Ah, I had been under the impression that all that mattered for module 
compatibility was a matching .config and 'base' kernel version (ie, 6.5), 
so long as no backports or ABI changes had been made by the distro. I 
guess that is not actually the case though.

> Thanks for the testing and letting us know that all is good with the
> tree as-is.
> 
> greg k-h
>
diff mbox series

Patch

diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 3449f8790e46..ab4da98332ab 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -497,7 +497,6 @@  static struct uart_8250_port *serial8250_setup_port(int index)
 
 	up = &serial8250_ports[index];
 	up->port.line = index;
-	up->port.port_id = index;
 
 	serial8250_init_port(up);
 	if (!base_ops)
@@ -1041,7 +1040,6 @@  int serial8250_register_8250_port(const struct uart_8250_port *up)
 			uart_remove_one_port(&serial8250_reg, &uart->port);
 
 		uart->port.ctrl_id	= up->port.ctrl_id;
-		uart->port.port_id	= up->port.port_id;
 		uart->port.iobase       = up->port.iobase;
 		uart->port.membase      = up->port.membase;
 		uart->port.irq          = up->port.irq;
diff --git a/drivers/tty/serial/serial_base_bus.c b/drivers/tty/serial/serial_base_bus.c
index 3dfcf20c4eb6..53ae87ae337b 100644
--- a/drivers/tty/serial/serial_base_bus.c
+++ b/drivers/tty/serial/serial_base_bus.c
@@ -169,7 +169,8 @@  struct serial_port_device *serial_base_port_add(struct uart_port *port,
 	err = serial_base_device_init(port, &port_dev->dev,
 				      &ctrl_dev->dev, &serial_port_type,
 				      serial_base_port_release,
-				      port->ctrl_id, port->port_id);
+				      port->line);
+
 	if (err)
 		goto err_put_device;
 
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index a156d2ed8d9e..201813d888df 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -460,7 +460,6 @@  struct uart_port {
 	int			(*iso7816_config)(struct uart_port *,
 						  struct serial_iso7816 *iso7816);
 	unsigned int		ctrl_id;		/* optional serial core controller id */
-	unsigned int		port_id;		/* optional serial core port id */
 	unsigned int		irq;			/* irq number */
 	unsigned long		irqflags;		/* irq flags  */
 	unsigned int		uartclk;		/* base uart clock */