mbox series

[v4,00/14] Introduce the lwIP network stack

Message ID cover.1718638104.git.jerome.forissier@linaro.org
Headers show
Series Introduce the lwIP network stack | expand

Message

Jerome Forissier June 17, 2024, 3:32 p.m. UTC
This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
stack [2] [3] as an alternative to the current implementation in net/,
selectable with Kconfig, and ultimately keep only lwIP if possible. Some
reasons for doing so are:
- Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
has some additional lwIP and Mbed TLS patches to do so. With that it
becomes possible to fetch and launch a distro installer such as Debian
etc. using a secure, authenticated connection directly from the U-Boot
shell. Several use cases:
  * Authentication: prevent MITM attack (third party replacing the
binary with a different one)
  * Confidentiality: prevent third parties from grabbing a copy of the
image as it is being downloaded
  * Allow connection to servers that do not support plain HTTP anymore
(this is becoming more and more common on the Internet these days)
- Possibly benefit from additional features implemented in lwIP
- Less code to maintain in U-Boot

Prior to applying this series, the lwIP stack needs to be added as a
Git subtree with the following command:

 $  git subtree add --squash --prefix lib/lwip/lwip https://git.savannah.gnu.org/git/lwip.git STABLE-2_2_0_RELEASE

The first patch renames some enums in order to avoid a conflict when a
later patch enables the lwIP library.

The second patch introduces a new Kconfig symbol: NET_LWIP, which selects
the lwIP implementation instead of the current one (NET). Contrary to the
approach chosen by Maxim in [1], NET_LWIP and NET cannot be enabled
simultaneously. The rationale is we want to start from a clean state and
not pull potentially duplicated functionality from both stacks. Note
however that a few files are still built in net/, they are the ones
related to ethernet device management and the ethernet bootflow.

The third patch splits the net.h header into net-legacy.h, net-common.h,
net-lwip.h, leaving net.h as a simple wrapper.

The fourth patch introduces the Makefile to build lwIP when NET_LWIP is
enabled.

The subsequent patches implement various network-oriented commands and
features: dhcp, dns, ping, tftpboot, wget.

A number of features are currently incompatible with NET_LWIP: SANDBOX,
DFU_TFTP, FASTBOOT, SPL_NET. All make assumptions on how the network
stack is implemented and/or pull sybols that are not trivially exported
from lwIP. Some interface rework may be needed.

Due to the above and in order to provide some level of testing, a new QEMU
configuration is introduced (qemu_arm64_lwip_defconfig) which is the same
as qemu_arm64_defconfig but with NET_LWIP and CMD_*_LWIP enabled.
Tests are added to test/py/tests/test_net.py for that configuration.

Changes in v4:

- Fixed the DHCP algorithm which was missing a sys_timeout() call in
the "fine timer" callback. This should close the issue that Tom R.
reported with his Raspberry Pi 3 (it does fix it on mine).
- The DHCP exchange timeout is increased from 2 to 10 seconds
- The DHCP exchange can be interrupted with Ctrl-C.
- "net: introduce alternative implementation as net-lwip/": rework
dependencies. A few symbols have 'depends on !NET_LWIP' and in addition
'NET_LWIP depends on !SANDBOX'. Sandbox, DSA and fastboot are
unsupported, because they are deeply welded to the current stack.
- All network commands (dns, ping, tftp and wget):
  * Get rid of global variables (Ilias A.)
  * Use printf() rather than log_info()
- "net-lwip: add ping command": use packet count instead of
timeout, fix code style (Ilias A.)
- Add "net: split cmd/net.c into cmd/net.c and cmd/net-common.c"
extracted from the wget patch (Ilias A.).
- Add "net: split include/net.h into net{,-common,-legacy,-lwip}.h"
(Ilias A.)
- Add "flash: prefix error codes with FL_" which is required to
avoid name clashes when splitting net.h
- Reworked the initialization of the lwIP stack. One and only
one network interface (struct netif) is added for the duration
of the command that uses that interface. That's commit "net-lwip:
add DHCP support and dhcp commmand".
- Drop "test: dm: dsa, eth: disable tests when CONFIG_NET_LWIP=y",
not needed now that NET_LWIP depend on !SANDBOX.
- qemu_arm64_lwip_defconfig now enables CMD_DNS and CMD_WGET (so
that all the supported network commands are available).

Changes in v3:

- Make NET_LWIP a Kconfig choice in patch "net: introduce alternative
implementation as net-lwip/" (Tom R.)
- Drop the patch introducing lwIP as a Git subtree and document the git
command in the cover letter instead (Tom R.)
- "net-lwip: add TFTP support and tftpboot command": use the same
"Bytes transferred =" message as in the legacy implementation (Tom R.,
Maxim U.)
- Drop "test/py: net: add _lwip variants of dhcp, ping and tftpboot
tests" which is not needed anymore.
- Add missing kfree() calls in cmd/net-common.c and fix the parsing of
decimal address in net-lwip/wget.c (patch "net-lwip: add wget command")
(Maxim U.)
- "net-lwip: add ping command": drop the ICMP payload (Ilias A.). Set
the sequence number to zero when entering ping_loop().

Changes in v2:

** Address comments from Ilias A.

- "net-lwip: add wget command"
Implement the wget_with_dns() function to do most of the wget work and
call it from do_wget(). This allows to simplify patch "net-lwip: add
support for EFI_HTTP_BOOT".

- "net-lwip: import net command from cmd/net.c"
Move a few functions from cmd/net.c to a new file cmd/net-common.c
rather than duplicating then in cmd/net-lwip.c.

- "net-lwip: add support for EFI_HTTP_BOOT"
Since wget_with_dns() is now implemented in "net-lwip: add wget command",
just enable the wget command when the lwIP stack is enabled and
EFI_HTTP_BOOT is requested.

** Address comments from Tom R.

- "net-lwip: add DHCP support and dhcp commmand",
  "net-lwip: add TFTP support and tftpboot command",
  "net-lwip: add ping command",
  "net-lwip: add dns command",
  "net-lwip: add wget command"
Do not introduce new CMD_XXX_LWIP symbols and use existing CMD_XXX
instead.

- "configs: add qemu_arm64_lwip_defconfig"
Use #include <configs/qemu_arm64_defconfig>.

- "net-lwip: import lwIP library under lib/lwip"
Patch removed and replaced by the introduction of a Git subtree:
"Squashed 'lib/lwip/lwip/' content from commit 0a0452b2c3".

Note that I have not yet addressed your comments on "test: dm: dsa,
eth: disable tests when CONFIG_NET_LWIP=y"). I need some more time
for that and I think running CI on this v2 will help better understand
what is needed for v3.

** Miscellaneous improvements

- "net: introduce alternative implementation as net-lwip/":

* Make DFU_OVER_TFTP not DFU_TFTP incompatible with NET_LWIP. It seems
quite natural to supplement "depends on NET" with "&& !NET_LWIP".
* Make PROT_*_LWIP not visible by removing the Kconfig prompt.

[1] https://lore.kernel.org/all/20231127125726.3735-1-maxim.uvarov@linaro.org/
[2] https://www.nongnu.org/lwip/
[3] https://en.wikipedia.org/wiki/LwIP

CC: Javier Tia <javier.tia@linaro.org>

Jerome Forissier (14):
  flash: prefix error codes with FL_
  net: introduce alternative implementation as net-lwip/
  net: split include/net.h into net{,-common,-legacy,-lwip}.h
  net-lwip: build lwIP
  net-lwip: add DHCP support and dhcp commmand
  net-lwip: add TFTP support and tftpboot command
  net-lwip: add ping command
  net-lwip: add dns command
  net: split cmd/net.c into cmd/net.c and cmd/net-common.c
  net-lwip: add wget command
  cmd: bdinfo: enable -e when CONFIG_CMD_NET_LWIP=y
  configs: add qemu_arm64_lwip_defconfig
  MAINTAINERS: net-lwip: add myself as a maintainer
  CI: add qemu_arm64_lwip to the test matrix

 .azure-pipelines.yml              |   7 +
 Kconfig                           |  41 ++
 MAINTAINERS                       |  11 +
 Makefile                          |  12 +-
 board/cobra5272/flash.c           |  26 +-
 board/freescale/m5253demo/flash.c |   6 +-
 boot/Kconfig                      |   5 +-
 cmd/Kconfig                       |  44 ++
 cmd/Makefile                      |   7 +
 cmd/bdinfo.c                      |   5 +-
 cmd/net-common.c                  | 109 ++++
 cmd/net-lwip.c                    |  45 ++
 cmd/net.c                         | 115 ----
 common/Kconfig                    |   2 +-
 common/board_r.c                  |   4 +-
 common/flash.c                    |  44 +-
 common/spl/Kconfig                |   1 +
 configs/qemu_arm64_lwip_defconfig |   4 +
 drivers/dfu/Kconfig               |   1 +
 drivers/fastboot/Kconfig          |   1 +
 drivers/mtd/cfi_flash.c           |  36 +-
 drivers/net/Kconfig               |   3 +-
 drivers/net/phy/Kconfig           |   2 +-
 drivers/usb/gadget/Kconfig        |   2 +-
 include/flash.h                   |  20 +-
 include/net-common.h              | 408 +++++++++++++
 include/net-legacy.h              | 649 ++++++++++++++++++++
 include/net-lwip.h                |  36 ++
 include/net.h                     | 944 +-----------------------------
 lib/Makefile                      |   2 +
 lib/lwip/Makefile                 |  55 ++
 lib/lwip/u-boot/arch/cc.h         |  43 ++
 lib/lwip/u-boot/arch/sys_arch.h   |   0
 lib/lwip/u-boot/limits.h          |   0
 lib/lwip/u-boot/lwipopts.h        | 197 +++++++
 net-lwip/Kconfig                  |  37 ++
 net-lwip/Makefile                 |  18 +
 net-lwip/dhcp.c                   | 108 ++++
 net-lwip/dns.c                    | 117 ++++
 net-lwip/eth_internal.h           |  35 ++
 net-lwip/net-lwip.c               | 270 +++++++++
 net-lwip/ping.c                   | 174 ++++++
 net-lwip/tftp.c                   | 226 +++++++
 net-lwip/wget.c                   | 269 +++++++++
 net/Kconfig                       |  14 -
 45 files changed, 3008 insertions(+), 1147 deletions(-)
 create mode 100644 cmd/net-common.c
 create mode 100644 cmd/net-lwip.c
 create mode 100644 configs/qemu_arm64_lwip_defconfig
 create mode 100644 include/net-common.h
 create mode 100644 include/net-legacy.h
 create mode 100644 include/net-lwip.h
 create mode 100644 lib/lwip/Makefile
 create mode 100644 lib/lwip/u-boot/arch/cc.h
 create mode 100644 lib/lwip/u-boot/arch/sys_arch.h
 create mode 100644 lib/lwip/u-boot/limits.h
 create mode 100644 lib/lwip/u-boot/lwipopts.h
 create mode 100644 net-lwip/Kconfig
 create mode 100644 net-lwip/Makefile
 create mode 100644 net-lwip/dhcp.c
 create mode 100644 net-lwip/dns.c
 create mode 100644 net-lwip/eth_internal.h
 create mode 100644 net-lwip/net-lwip.c
 create mode 100644 net-lwip/ping.c
 create mode 100644 net-lwip/tftp.c
 create mode 100644 net-lwip/wget.c

Comments

Tom Rini June 18, 2024, 8:21 p.m. UTC | #1
On Mon, Jun 17, 2024 at 05:32:52PM +0200, Jerome Forissier wrote:

> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
> stack [2] [3] as an alternative to the current implementation in net/,
> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
> reasons for doing so are:
> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
> has some additional lwIP and Mbed TLS patches to do so. With that it
> becomes possible to fetch and launch a distro installer such as Debian
> etc. using a secure, authenticated connection directly from the U-Boot
> shell. Several use cases:
>   * Authentication: prevent MITM attack (third party replacing the
> binary with a different one)
>   * Confidentiality: prevent third parties from grabbing a copy of the
> image as it is being downloaded
>   * Allow connection to servers that do not support plain HTTP anymore
> (this is becoming more and more common on the Internet these days)
> - Possibly benefit from additional features implemented in lwIP
> - Less code to maintain in U-Boot

So, on a Pi 3 (rpi_3_defconfig) I see this now (and it passes normally):
========================================== FAILURES ===========================================
___________________________________ test_efi_helloworld_net ___________________________________
test/py/tests/test_efi_loader.py:163: in test_efi_helloworld_net
    assert expected_text in output
E   AssertionError: assert 'Hello, world' in 'No UEFI binary known at 200000'
------------------------------------ Captured stdout call -------------------------------------
U-Boot> tftpboot 200000 EFI/arm64/helloworld.efi
Using smsc95xx_eth device
TFTP from server 192.168.1.10; our IP address is 192.168.1.100
Filename 'EFI/arm64/helloworld.efi'.
Load address: 0x200000
Loading:
Bytes transferred = 4528 (11b0 hex)
U-Boot> U-Boot> crc32 200000 $filesize
CRC32 for 00200000 ... 002011af ==> 2b466005
U-Boot> U-Boot> bootefi 200000
No UEFI binary known at 200000
U-Boot>
=================================== short test summary info =================================== 
If I disable that test, it moves on to failing the same exact way for
grub. If I disable the grub test too. After that, oh, a bunch of other
tests get skipped because CMD_NET and similar aren't enabled now, and
the tests are wrong. I'll post that as another patch by itself. After
correcting for that, we're seemingly noticeably slower as I need to
increase the timeout for tftp'ing my 83MiB FIT image I use for kernel
testing. We no longer have the estimated speed message, so I can't as
easily say how much slower it is. After increasing the timeout, the
kernel boot test does work.

I can note that normally it takes ~18ms to get a dhcp reply, but with
lwIP it's now 132ms, and previously the kernel loaded at 2.7MiB/s
(which, not great) but if that has a similar level of slowdown, could
well explain it.
Ilias Apalodimas June 19, 2024, 7:24 a.m. UTC | #2
Hi Tom

On Tue, 18 Jun 2024 at 23:21, Tom Rini <trini@konsulko.com> wrote:
>
> On Mon, Jun 17, 2024 at 05:32:52PM +0200, Jerome Forissier wrote:
>
> > This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
> > library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
> > stack [2] [3] as an alternative to the current implementation in net/,
> > selectable with Kconfig, and ultimately keep only lwIP if possible. Some
> > reasons for doing so are:
> > - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
> > has some additional lwIP and Mbed TLS patches to do so. With that it
> > becomes possible to fetch and launch a distro installer such as Debian
> > etc. using a secure, authenticated connection directly from the U-Boot
> > shell. Several use cases:
> >   * Authentication: prevent MITM attack (third party replacing the
> > binary with a different one)
> >   * Confidentiality: prevent third parties from grabbing a copy of the
> > image as it is being downloaded
> >   * Allow connection to servers that do not support plain HTTP anymore
> > (this is becoming more and more common on the Internet these days)
> > - Possibly benefit from additional features implemented in lwIP
> > - Less code to maintain in U-Boot
>
> So, on a Pi 3 (rpi_3_defconfig) I see this now (and it passes normally):
> ========================================== FAILURES ===========================================
> ___________________________________ test_efi_helloworld_net ___________________________________
> test/py/tests/test_efi_loader.py:163: in test_efi_helloworld_net
>     assert expected_text in output
> E   AssertionError: assert 'Hello, world' in 'No UEFI binary known at 200000'
> ------------------------------------ Captured stdout call -------------------------------------
> U-Boot> tftpboot 200000 EFI/arm64/helloworld.efi
> Using smsc95xx_eth device
> TFTP from server 192.168.1.10; our IP address is 192.168.1.100
> Filename 'EFI/arm64/helloworld.efi'.
> Load address: 0x200000
> Loading:
> Bytes transferred = 4528 (11b0 hex)
> U-Boot> U-Boot> crc32 200000 $filesize
> CRC32 for 00200000 ... 002011af ==> 2b466005
> U-Boot> U-Boot> bootefi 200000
> No UEFI binary known at 200000
> U-Boot>
> =================================== short test summary info ===================================
> If I disable that test, it moves on to failing the same exact way for
> grub. If I disable the grub test too. After that, oh, a bunch of other
> tests get skipped because CMD_NET and similar aren't enabled now, and
> the tests are wrong. I'll post that as another patch by itself. After
> correcting for that, we're seemingly noticeably slower as I need to
> increase the timeout for tftp'ing my 83MiB FIT image I use for kernel
> testing. We no longer have the estimated speed message, so I can't as
> easily say how much slower it is. After increasing the timeout, the
> kernel boot test does work.
>
> I can note that normally it takes ~18ms to get a dhcp reply, but with
> lwIP it's now 132ms, and previously the kernel loaded at 2.7MiB/s
> (which, not great) but if that has a similar level of slowdown, could
> well explain it.
>

Thanks for taking the time. We'll run the pytests before v5. That
being said, my wget tests were faster with lwIP last time I checked.

Thanks
/Ilias
> --
> Tom
Jerome Forissier June 19, 2024, 2:47 p.m. UTC | #3
On 6/19/24 09:24, Ilias Apalodimas wrote:
> Hi Tom
> 
> On Tue, 18 Jun 2024 at 23:21, Tom Rini <trini@konsulko.com> wrote:
>>
>> On Mon, Jun 17, 2024 at 05:32:52PM +0200, Jerome Forissier wrote:
>>
>>> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
>>> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
>>> stack [2] [3] as an alternative to the current implementation in net/,
>>> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
>>> reasons for doing so are:
>>> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
>>> has some additional lwIP and Mbed TLS patches to do so. With that it
>>> becomes possible to fetch and launch a distro installer such as Debian
>>> etc. using a secure, authenticated connection directly from the U-Boot
>>> shell. Several use cases:
>>>   * Authentication: prevent MITM attack (third party replacing the
>>> binary with a different one)
>>>   * Confidentiality: prevent third parties from grabbing a copy of the
>>> image as it is being downloaded
>>>   * Allow connection to servers that do not support plain HTTP anymore
>>> (this is becoming more and more common on the Internet these days)
>>> - Possibly benefit from additional features implemented in lwIP
>>> - Less code to maintain in U-Boot
>>
>> So, on a Pi 3 (rpi_3_defconfig) I see this now (and it passes normally):
>> ========================================== FAILURES ===========================================
>> ___________________________________ test_efi_helloworld_net ___________________________________
>> test/py/tests/test_efi_loader.py:163: in test_efi_helloworld_net
>>     assert expected_text in output
>> E   AssertionError: assert 'Hello, world' in 'No UEFI binary known at 200000'
>> ------------------------------------ Captured stdout call -------------------------------------
>> U-Boot> tftpboot 200000 EFI/arm64/helloworld.efi
>> Using smsc95xx_eth device
>> TFTP from server 192.168.1.10; our IP address is 192.168.1.100
>> Filename 'EFI/arm64/helloworld.efi'.
>> Load address: 0x200000
>> Loading:
>> Bytes transferred = 4528 (11b0 hex)
>> U-Boot> U-Boot> crc32 200000 $filesize
>> CRC32 for 00200000 ... 002011af ==> 2b466005
>> U-Boot> U-Boot> bootefi 200000
>> No UEFI binary known at 200000
>> U-Boot>
>> =================================== short test summary info ===================================
>> If I disable that test, it moves on to failing the same exact way for
>> grub. If I disable the grub test too. After that, oh, a bunch of other
>> tests get skipped because CMD_NET and similar aren't enabled now, and
>> the tests are wrong. I'll post that as another patch by itself. After
>> correcting for that, we're seemingly noticeably slower as I need to
>> increase the timeout for tftp'ing my 83MiB FIT image I use for kernel
>> testing. We no longer have the estimated speed message, so I can't as
>> easily say how much slower it is. After increasing the timeout, the
>> kernel boot test does work.
>>
>> I can note that normally it takes ~18ms to get a dhcp reply, but with
>> lwIP it's now 132ms, and previously the kernel loaded at 2.7MiB/s
>> (which, not great) but if that has a similar level of slowdown, could
>> well explain it.
>>
> 
> Thanks for taking the time. We'll run the pytests before v5. That
> being said, my wget tests were faster with lwIP last time I checked.

The reason for the slower TFTP is the block size. lwIP supports only the
default (512 bytes), while the legacy stack supports the 'blksize' option
and sets CONFIG_TFTP_BLOCKSIZE=1468 by default.
Tom Rini June 19, 2024, 3:07 p.m. UTC | #4
On Wed, Jun 19, 2024 at 04:47:08PM +0200, Jerome Forissier wrote:
> 
> 
> On 6/19/24 09:24, Ilias Apalodimas wrote:
> > Hi Tom
> > 
> > On Tue, 18 Jun 2024 at 23:21, Tom Rini <trini@konsulko.com> wrote:
> >>
> >> On Mon, Jun 17, 2024 at 05:32:52PM +0200, Jerome Forissier wrote:
> >>
> >>> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
> >>> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
> >>> stack [2] [3] as an alternative to the current implementation in net/,
> >>> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
> >>> reasons for doing so are:
> >>> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
> >>> has some additional lwIP and Mbed TLS patches to do so. With that it
> >>> becomes possible to fetch and launch a distro installer such as Debian
> >>> etc. using a secure, authenticated connection directly from the U-Boot
> >>> shell. Several use cases:
> >>>   * Authentication: prevent MITM attack (third party replacing the
> >>> binary with a different one)
> >>>   * Confidentiality: prevent third parties from grabbing a copy of the
> >>> image as it is being downloaded
> >>>   * Allow connection to servers that do not support plain HTTP anymore
> >>> (this is becoming more and more common on the Internet these days)
> >>> - Possibly benefit from additional features implemented in lwIP
> >>> - Less code to maintain in U-Boot
> >>
> >> So, on a Pi 3 (rpi_3_defconfig) I see this now (and it passes normally):
> >> ========================================== FAILURES ===========================================
> >> ___________________________________ test_efi_helloworld_net ___________________________________
> >> test/py/tests/test_efi_loader.py:163: in test_efi_helloworld_net
> >>     assert expected_text in output
> >> E   AssertionError: assert 'Hello, world' in 'No UEFI binary known at 200000'
> >> ------------------------------------ Captured stdout call -------------------------------------
> >> U-Boot> tftpboot 200000 EFI/arm64/helloworld.efi
> >> Using smsc95xx_eth device
> >> TFTP from server 192.168.1.10; our IP address is 192.168.1.100
> >> Filename 'EFI/arm64/helloworld.efi'.
> >> Load address: 0x200000
> >> Loading:
> >> Bytes transferred = 4528 (11b0 hex)
> >> U-Boot> U-Boot> crc32 200000 $filesize
> >> CRC32 for 00200000 ... 002011af ==> 2b466005
> >> U-Boot> U-Boot> bootefi 200000
> >> No UEFI binary known at 200000
> >> U-Boot>
> >> =================================== short test summary info ===================================
> >> If I disable that test, it moves on to failing the same exact way for
> >> grub. If I disable the grub test too. After that, oh, a bunch of other
> >> tests get skipped because CMD_NET and similar aren't enabled now, and
> >> the tests are wrong. I'll post that as another patch by itself. After
> >> correcting for that, we're seemingly noticeably slower as I need to
> >> increase the timeout for tftp'ing my 83MiB FIT image I use for kernel
> >> testing. We no longer have the estimated speed message, so I can't as
> >> easily say how much slower it is. After increasing the timeout, the
> >> kernel boot test does work.
> >>
> >> I can note that normally it takes ~18ms to get a dhcp reply, but with
> >> lwIP it's now 132ms, and previously the kernel loaded at 2.7MiB/s
> >> (which, not great) but if that has a similar level of slowdown, could
> >> well explain it.
> >>
> > 
> > Thanks for taking the time. We'll run the pytests before v5. That
> > being said, my wget tests were faster with lwIP last time I checked.
> 
> The reason for the slower TFTP is the block size. lwIP supports only the
> default (512 bytes), while the legacy stack supports the 'blksize' option
> and sets CONFIG_TFTP_BLOCKSIZE=1468 by default.

Ouch. Can we ask if upstream is agreeable to making that configurable
some way, and then we utilize that? I'm not looking forward to lots of
performance hit reports.

But please note that my dhcp request/reply is also taking 10x as long,
and in v3 it wasn't working at all? I feel like there's possibly another
issue lurking here.
Jerome Forissier June 19, 2024, 4:45 p.m. UTC | #5
On 6/19/24 17:07, Tom Rini wrote:
> On Wed, Jun 19, 2024 at 04:47:08PM +0200, Jerome Forissier wrote:
>>
>>
>> On 6/19/24 09:24, Ilias Apalodimas wrote:
>>> Hi Tom
>>>
>>> On Tue, 18 Jun 2024 at 23:21, Tom Rini <trini@konsulko.com> wrote:
>>>>
>>>> On Mon, Jun 17, 2024 at 05:32:52PM +0200, Jerome Forissier wrote:
>>>>
>>>>> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
>>>>> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
>>>>> stack [2] [3] as an alternative to the current implementation in net/,
>>>>> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
>>>>> reasons for doing so are:
>>>>> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
>>>>> has some additional lwIP and Mbed TLS patches to do so. With that it
>>>>> becomes possible to fetch and launch a distro installer such as Debian
>>>>> etc. using a secure, authenticated connection directly from the U-Boot
>>>>> shell. Several use cases:
>>>>>   * Authentication: prevent MITM attack (third party replacing the
>>>>> binary with a different one)
>>>>>   * Confidentiality: prevent third parties from grabbing a copy of the
>>>>> image as it is being downloaded
>>>>>   * Allow connection to servers that do not support plain HTTP anymore
>>>>> (this is becoming more and more common on the Internet these days)
>>>>> - Possibly benefit from additional features implemented in lwIP
>>>>> - Less code to maintain in U-Boot
>>>>
>>>> So, on a Pi 3 (rpi_3_defconfig) I see this now (and it passes normally):
>>>> ========================================== FAILURES ===========================================
>>>> ___________________________________ test_efi_helloworld_net ___________________________________
>>>> test/py/tests/test_efi_loader.py:163: in test_efi_helloworld_net
>>>>     assert expected_text in output
>>>> E   AssertionError: assert 'Hello, world' in 'No UEFI binary known at 200000'
>>>> ------------------------------------ Captured stdout call -------------------------------------
>>>> U-Boot> tftpboot 200000 EFI/arm64/helloworld.efi
>>>> Using smsc95xx_eth device
>>>> TFTP from server 192.168.1.10; our IP address is 192.168.1.100
>>>> Filename 'EFI/arm64/helloworld.efi'.
>>>> Load address: 0x200000
>>>> Loading:
>>>> Bytes transferred = 4528 (11b0 hex)
>>>> U-Boot> U-Boot> crc32 200000 $filesize
>>>> CRC32 for 00200000 ... 002011af ==> 2b466005
>>>> U-Boot> U-Boot> bootefi 200000
>>>> No UEFI binary known at 200000
>>>> U-Boot>
>>>> =================================== short test summary info ===================================
>>>> If I disable that test, it moves on to failing the same exact way for
>>>> grub. If I disable the grub test too. After that, oh, a bunch of other
>>>> tests get skipped because CMD_NET and similar aren't enabled now, and
>>>> the tests are wrong. I'll post that as another patch by itself. After
>>>> correcting for that, we're seemingly noticeably slower as I need to
>>>> increase the timeout for tftp'ing my 83MiB FIT image I use for kernel
>>>> testing. We no longer have the estimated speed message, so I can't as
>>>> easily say how much slower it is. After increasing the timeout, the
>>>> kernel boot test does work.
>>>>
>>>> I can note that normally it takes ~18ms to get a dhcp reply, but with
>>>> lwIP it's now 132ms, and previously the kernel loaded at 2.7MiB/s
>>>> (which, not great) but if that has a similar level of slowdown, could
>>>> well explain it.
>>>>
>>>
>>> Thanks for taking the time. We'll run the pytests before v5. That
>>> being said, my wget tests were faster with lwIP last time I checked.
>>
>> The reason for the slower TFTP is the block size. lwIP supports only the
>> default (512 bytes), while the legacy stack supports the 'blksize' option
>> and sets CONFIG_TFTP_BLOCKSIZE=1468 by default.
> 
> Ouch. Can we ask if upstream is agreeable to making that configurable
> some way, and then we utilize that? I'm not looking forward to lots of
> performance hit reports.

I tried a quick & dirty hack to hardcode the blksize option in the lwIP
stack and I can confirm I get the same speed as with the legacy
implementation or even slightly better (legacy: 822 KB/s, hacked lwIP:
843 KB/s, upstream lwIP: 343 KB/s). That's on QEMU with py3tftp running on
my laptop as a TFTP server. I would probably get better results with a
good TFTP server and real hardware (RPi...) but at least this shows lwIP
can be on par with the U-Boot stack.

I can consider writing a better patch for lwIP and submitting it upstream.
 
> But please note that my dhcp request/reply is also taking 10x as long,
> and in v3 it wasn't working at all? I feel like there's possibly another
> issue lurking here.

This I need to investigate more on my RPi. I'll keep you posted. Thanks
again for testing.
Peter Robinson June 19, 2024, 5:19 p.m. UTC | #6
On Wed, 19 Jun 2024 at 16:06, Jerome Forissier
<jerome.forissier@linaro.org> wrote:
>
>
>
> On 6/19/24 09:24, Ilias Apalodimas wrote:
> > Hi Tom
> >
> > On Tue, 18 Jun 2024 at 23:21, Tom Rini <trini@konsulko.com> wrote:
> >>
> >> On Mon, Jun 17, 2024 at 05:32:52PM +0200, Jerome Forissier wrote:
> >>
> >>> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
> >>> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
> >>> stack [2] [3] as an alternative to the current implementation in net/,
> >>> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
> >>> reasons for doing so are:
> >>> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
> >>> has some additional lwIP and Mbed TLS patches to do so. With that it
> >>> becomes possible to fetch and launch a distro installer such as Debian
> >>> etc. using a secure, authenticated connection directly from the U-Boot
> >>> shell. Several use cases:
> >>>   * Authentication: prevent MITM attack (third party replacing the
> >>> binary with a different one)
> >>>   * Confidentiality: prevent third parties from grabbing a copy of the
> >>> image as it is being downloaded
> >>>   * Allow connection to servers that do not support plain HTTP anymore
> >>> (this is becoming more and more common on the Internet these days)
> >>> - Possibly benefit from additional features implemented in lwIP
> >>> - Less code to maintain in U-Boot
> >>
> >> So, on a Pi 3 (rpi_3_defconfig) I see this now (and it passes normally):
> >> ========================================== FAILURES ===========================================
> >> ___________________________________ test_efi_helloworld_net ___________________________________
> >> test/py/tests/test_efi_loader.py:163: in test_efi_helloworld_net
> >>     assert expected_text in output
> >> E   AssertionError: assert 'Hello, world' in 'No UEFI binary known at 200000'
> >> ------------------------------------ Captured stdout call -------------------------------------
> >> U-Boot> tftpboot 200000 EFI/arm64/helloworld.efi
> >> Using smsc95xx_eth device
> >> TFTP from server 192.168.1.10; our IP address is 192.168.1.100
> >> Filename 'EFI/arm64/helloworld.efi'.
> >> Load address: 0x200000
> >> Loading:
> >> Bytes transferred = 4528 (11b0 hex)
> >> U-Boot> U-Boot> crc32 200000 $filesize
> >> CRC32 for 00200000 ... 002011af ==> 2b466005
> >> U-Boot> U-Boot> bootefi 200000
> >> No UEFI binary known at 200000
> >> U-Boot>
> >> =================================== short test summary info ===================================
> >> If I disable that test, it moves on to failing the same exact way for
> >> grub. If I disable the grub test too. After that, oh, a bunch of other
> >> tests get skipped because CMD_NET and similar aren't enabled now, and
> >> the tests are wrong. I'll post that as another patch by itself. After
> >> correcting for that, we're seemingly noticeably slower as I need to
> >> increase the timeout for tftp'ing my 83MiB FIT image I use for kernel
> >> testing. We no longer have the estimated speed message, so I can't as
> >> easily say how much slower it is. After increasing the timeout, the
> >> kernel boot test does work.
> >>
> >> I can note that normally it takes ~18ms to get a dhcp reply, but with
> >> lwIP it's now 132ms, and previously the kernel loaded at 2.7MiB/s
> >> (which, not great) but if that has a similar level of slowdown, could
> >> well explain it.
> >>
> >
> > Thanks for taking the time. We'll run the pytests before v5. That
> > being said, my wget tests were faster with lwIP last time I checked.
>
> The reason for the slower TFTP is the block size. lwIP supports only the
> default (512 bytes), while the legacy stack supports the 'blksize' option
> and sets CONFIG_TFTP_BLOCKSIZE=1468 by default.

This number looks like it's derived from standard ethernet frame
(1500) minus default packet sizes to get to 1468. Does LWIP have a
concept of MTU? I would have thought it would be better to use MTU if
so because it then means it's not hard coded and it can change based
on what the driver/HW supports?

Peter
Jerome Forissier June 20, 2024, 8:05 a.m. UTC | #7
On 6/19/24 19:19, Peter Robinson wrote:
> On Wed, 19 Jun 2024 at 16:06, Jerome Forissier
> <jerome.forissier@linaro.org> wrote:
>>
>>
>>
>> On 6/19/24 09:24, Ilias Apalodimas wrote:
>>> Hi Tom
>>>
>>> On Tue, 18 Jun 2024 at 23:21, Tom Rini <trini@konsulko.com> wrote:
>>>>
>>>> On Mon, Jun 17, 2024 at 05:32:52PM +0200, Jerome Forissier wrote:
>>>>
>>>>> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
>>>>> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
>>>>> stack [2] [3] as an alternative to the current implementation in net/,
>>>>> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
>>>>> reasons for doing so are:
>>>>> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
>>>>> has some additional lwIP and Mbed TLS patches to do so. With that it
>>>>> becomes possible to fetch and launch a distro installer such as Debian
>>>>> etc. using a secure, authenticated connection directly from the U-Boot
>>>>> shell. Several use cases:
>>>>>   * Authentication: prevent MITM attack (third party replacing the
>>>>> binary with a different one)
>>>>>   * Confidentiality: prevent third parties from grabbing a copy of the
>>>>> image as it is being downloaded
>>>>>   * Allow connection to servers that do not support plain HTTP anymore
>>>>> (this is becoming more and more common on the Internet these days)
>>>>> - Possibly benefit from additional features implemented in lwIP
>>>>> - Less code to maintain in U-Boot
>>>>
>>>> So, on a Pi 3 (rpi_3_defconfig) I see this now (and it passes normally):
>>>> ========================================== FAILURES ===========================================
>>>> ___________________________________ test_efi_helloworld_net ___________________________________
>>>> test/py/tests/test_efi_loader.py:163: in test_efi_helloworld_net
>>>>     assert expected_text in output
>>>> E   AssertionError: assert 'Hello, world' in 'No UEFI binary known at 200000'
>>>> ------------------------------------ Captured stdout call -------------------------------------
>>>> U-Boot> tftpboot 200000 EFI/arm64/helloworld.efi
>>>> Using smsc95xx_eth device
>>>> TFTP from server 192.168.1.10; our IP address is 192.168.1.100
>>>> Filename 'EFI/arm64/helloworld.efi'.
>>>> Load address: 0x200000
>>>> Loading:
>>>> Bytes transferred = 4528 (11b0 hex)
>>>> U-Boot> U-Boot> crc32 200000 $filesize
>>>> CRC32 for 00200000 ... 002011af ==> 2b466005
>>>> U-Boot> U-Boot> bootefi 200000
>>>> No UEFI binary known at 200000
>>>> U-Boot>
>>>> =================================== short test summary info ===================================
>>>> If I disable that test, it moves on to failing the same exact way for
>>>> grub. If I disable the grub test too. After that, oh, a bunch of other
>>>> tests get skipped because CMD_NET and similar aren't enabled now, and
>>>> the tests are wrong. I'll post that as another patch by itself. After
>>>> correcting for that, we're seemingly noticeably slower as I need to
>>>> increase the timeout for tftp'ing my 83MiB FIT image I use for kernel
>>>> testing. We no longer have the estimated speed message, so I can't as
>>>> easily say how much slower it is. After increasing the timeout, the
>>>> kernel boot test does work.
>>>>
>>>> I can note that normally it takes ~18ms to get a dhcp reply, but with
>>>> lwIP it's now 132ms, and previously the kernel loaded at 2.7MiB/s
>>>> (which, not great) but if that has a similar level of slowdown, could
>>>> well explain it.
>>>>
>>>
>>> Thanks for taking the time. We'll run the pytests before v5. That
>>> being said, my wget tests were faster with lwIP last time I checked.
>>
>> The reason for the slower TFTP is the block size. lwIP supports only the
>> default (512 bytes), while the legacy stack supports the 'blksize' option
>> and sets CONFIG_TFTP_BLOCKSIZE=1468 by default.
> 
> This number looks like it's derived from standard ethernet frame
> (1500) minus default packet sizes to get to 1468. Does LWIP have a
> concept of MTU? 

It does (netif->mtu), although it does not have path MTU discovery. But for
the use cases we consider (mostly booting from the LAN) it shouldn't matter.

> I would have thought it would be better to use MTU if
> so because it then means it's not hard coded and it can change based
> on what the driver/HW supports?

I agree but for maximum flexibility I plan to extend the lwIP API and set
blksize to the mtu in the caller (net-lwip/tftp.c):

  tftp_init_client(&ctx);  // Provide the callbacks
  tftp_client_set_bksize(netif->mtu);  // New function
  tftp_get(&ctx, addr, port, fname, mode);  // Start the transfer
Tim Harvey June 20, 2024, 5:10 p.m. UTC | #8
On Mon, Jun 17, 2024 at 8:33 AM Jerome Forissier
<jerome.forissier@linaro.org> wrote:
>
> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
> stack [2] [3] as an alternative to the current implementation in net/,
> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
> reasons for doing so are:
> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
> has some additional lwIP and Mbed TLS patches to do so. With that it
> becomes possible to fetch and launch a distro installer such as Debian
> etc. using a secure, authenticated connection directly from the U-Boot
> shell. Several use cases:
>   * Authentication: prevent MITM attack (third party replacing the
> binary with a different one)
>   * Confidentiality: prevent third parties from grabbing a copy of the
> image as it is being downloaded
>   * Allow connection to servers that do not support plain HTTP anymore
> (this is becoming more and more common on the Internet these days)
> - Possibly benefit from additional features implemented in lwIP
> - Less code to maintain in U-Boot
>
> Prior to applying this series, the lwIP stack needs to be added as a
> Git subtree with the following command:
>
>  $  git subtree add --squash --prefix lib/lwip/lwip https://git.savannah.gnu.org/git/lwip.git STABLE-2_2_0_RELEASE
>
> The first patch renames some enums in order to avoid a conflict when a
> later patch enables the lwIP library.
>
> The second patch introduces a new Kconfig symbol: NET_LWIP, which selects
> the lwIP implementation instead of the current one (NET). Contrary to the
> approach chosen by Maxim in [1], NET_LWIP and NET cannot be enabled
> simultaneously. The rationale is we want to start from a clean state and
> not pull potentially duplicated functionality from both stacks. Note
> however that a few files are still built in net/, they are the ones
> related to ethernet device management and the ethernet bootflow.
>
> The third patch splits the net.h header into net-legacy.h, net-common.h,
> net-lwip.h, leaving net.h as a simple wrapper.
>
> The fourth patch introduces the Makefile to build lwIP when NET_LWIP is
> enabled.
>
> The subsequent patches implement various network-oriented commands and
> features: dhcp, dns, ping, tftpboot, wget.
>
> A number of features are currently incompatible with NET_LWIP: SANDBOX,
> DFU_TFTP, FASTBOOT, SPL_NET. All make assumptions on how the network
> stack is implemented and/or pull sybols that are not trivially exported
> from lwIP. Some interface rework may be needed.
>
> Due to the above and in order to provide some level of testing, a new QEMU
> configuration is introduced (qemu_arm64_lwip_defconfig) which is the same
> as qemu_arm64_defconfig but with NET_LWIP and CMD_*_LWIP enabled.
> Tests are added to test/py/tests/test_net.py for that configuration.
>

Hi Jermone,

I've given this a spin on an imx8mm-venice-gw73xx-0x via
imx8mm_venice_defconfig and I can't get DHCP to work (didn't work in
v3 either):

$ diff defconfig configs/imx8mm_venice_defconfig
68,69c68,71
< CONFIG_CMD_DNS_LWIP=y
< CONFIG_CMD_WGET_LWIP=y
---
> CONFIG_CMD_DHCP6=y
> CONFIG_CMD_TFTPPUT=y
> CONFIG_SYS_DISABLE_AUTOLOAD=y
> CONFIG_CMD_WGET=y
88,90c90,94
< CONFIG_NET_LWIP=y
< CONFIG_LWIP_DEBUG=y
< CONFIG_LWIP_ASSERT=y
---
> CONFIG_NET_RANDOM_ETHADDR=y
> CONFIG_IP_DEFRAG=y
> CONFIG_TFTP_BLOCKSIZE=4096
> CONFIG_PROT_TCP_SACK=y
> CONFIG_IPV6=y

Target:
u-boot=> net list
eth0 : ethernet@30be0000 00:d0:12:b5:f8:41 active
u-boot=> dhcp || echo fail
eth0: ethernet@30be0000 00:d0:12:b5:f8:41 active
netif_set_ipaddr: netif address being changed
netif: added interface 0 addr 192.168.1.1 netmask 0.0.0.0 gw 0.0.0.0
etharp_request: sending ARP request.
etharp_raw: sending raw ARP packet.
ethernet_output: sending packet 000000007df00bf0
dhcp_start(netif=000000007df2af40) 0dhcp_start(): mallocing new DHCP client
dhcp_start(): allocated dhcp
dhcp_start(): starting DHCP configuration
dhcp_discover()
transaction id xid(42021)
dhcp_discover: making request
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
ip4_output_if: 0IP header:
+-------------------------------+
| 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
+-------------------------------+
|        0      |000|       0   | (id, flags, offset)
+-------------------------------+
|  255  |   17  |    0xba9d     | (ttl, proto, chksum)
+-------------------------------+
|    0  |    0  |    0  |    0  | (src)
+-------------------------------+
|  255  |  255  |  255  |  255  | (dest)
+-------------------------------+
ip4_output_if: call netif->output()
ethernet_output: sending packet 000000007df2c010
dhcp_discover: deleting()
dhcp_discover: SELECTING
dhcp_discover(): set request timeout 2000 msecs
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:28:28:5d:bb:16:9f, type:8899
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:00:27:0e:0d:74:ba, type:806
etharp_update_arp_entry: 172.24.20.24 - 00:27:0e:0d:74:ba
etharp_find_entry: found empty entry 0
etharp_find_entry: no empty entry found and not allowed to recycle
etharp_input: incoming ARP request
etharp_input: ARP request was not for us.
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:98:90:96:ac:d7:63, type:806
etharp_update_arp_entry: 172.24.20.3 - 98:90:96:ac:d7:63
etharp_find_entry: found empty entry 0
etharp_find_entry: no empty entry found and not allowed to recycle
etharp_input: incoming ARP request
etharp_input: ARP request was not for us.
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:84:2b:2b:4e:5a:9a, type:806
etharp_update_arp_entry: 172.24.0.1 - 84:2b:2b:4e:5a:9a
etharp_find_entry: found empty entry 0
etharp_find_entry: no empty entry found and not allowed to recycle
etharp_input: incoming ARP request
etharp_input: ARP request was not for us.
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:cc:96:e5:1e:a0:7b, type:800
ip_input: iphdr->dest 0xffff1fac netif->ip_addr 0x101a8c0 (0x0, 0x0, 0xffff1fac)
ip4_input: UDP packet to DHCP client port 138
ip4_input: packet not for us.
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:d4:be:d9:9f:5a:b4, type:800
ip_input: iphdr->dest 0xffffffff netif->ip_addr 0x101a8c0 (0x0, 0x0, 0xffffffff)
ip4_input: packet accepted on interface 0ip4_input:
IP header:
+-------------------------------+
| 4 | 5 |  0x00 |        68     | (v, hl, tos, len)
+-------------------------------+
|    19472      |000|       0   | (id, flags, offset)
+-------------------------------+
|  128  |   17  |    0x2e51     | (ttl, proto, chksum)
+-------------------------------+
|  172  |   24  |   20  |   48  | (src)
+-------------------------------+
|  255  |  255  |  255  |  255  | (dest)
+-------------------------------+
ip4_input: p->len 68 p->tot_len 68
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:00:0b:ab:2e:fa:51, type:806
etharp_update_arp_entry: 172.24.24.181 - 00:0b:ab:2e:fa:51
etharp_find_entry: found empty entry 0
etharp_find_entry: no empty entry found and not allowed to recycle
etharp_input: incoming ARP request
etharp_input: ARP request was not for us.
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:98:90:96:ac:d7:63, type:806
etharp_update_arp_entry: 172.24.20.3 - 98:90:96:ac:d7:63
etharp_find_entry: found empty entry 0
etharp_find_entry: no empty entry found and not allowed to recycle
etharp_input: incoming ARP request
etharp_input: ARP request was not for us.
ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:84:2b:2b:4e:5a:9a, type:806
etharp_update_arp_entry: 172.24.0.1 - 84:2b:2b:4e:5a:9a
etharp_find_entry: found empty entry 0
etharp_find_entry: no empty entry found and not allowed to recycle
etharp_input: incoming ARP request
etharp_input: ARP request was not for us.
fail

Even using ping before DHCP fails:
u-boot=> setenv ipaddr 192.168.1.1; ping 192.168.1.146 || echo fail
Using ethernet@30be0000 device
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ip4_route: No route to 192.168.1.146
ping failed; host 192.168.1.146 is not alive
fail

Best Regards,

Tim
Jerome Forissier June 21, 2024, 12:59 p.m. UTC | #9
On 6/20/24 19:10, Tim Harvey wrote:
> On Mon, Jun 17, 2024 at 8:33 AM Jerome Forissier
> <jerome.forissier@linaro.org> wrote:
>>
>> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
>> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
>> stack [2] [3] as an alternative to the current implementation in net/,
>> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
>> reasons for doing so are:
>> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
>> has some additional lwIP and Mbed TLS patches to do so. With that it
>> becomes possible to fetch and launch a distro installer such as Debian
>> etc. using a secure, authenticated connection directly from the U-Boot
>> shell. Several use cases:
>>   * Authentication: prevent MITM attack (third party replacing the
>> binary with a different one)
>>   * Confidentiality: prevent third parties from grabbing a copy of the
>> image as it is being downloaded
>>   * Allow connection to servers that do not support plain HTTP anymore
>> (this is becoming more and more common on the Internet these days)
>> - Possibly benefit from additional features implemented in lwIP
>> - Less code to maintain in U-Boot
>>
>> Prior to applying this series, the lwIP stack needs to be added as a
>> Git subtree with the following command:
>>
>>  $  git subtree add --squash --prefix lib/lwip/lwip https://git.savannah.gnu.org/git/lwip.git STABLE-2_2_0_RELEASE
>>
>> The first patch renames some enums in order to avoid a conflict when a
>> later patch enables the lwIP library.
>>
>> The second patch introduces a new Kconfig symbol: NET_LWIP, which selects
>> the lwIP implementation instead of the current one (NET). Contrary to the
>> approach chosen by Maxim in [1], NET_LWIP and NET cannot be enabled
>> simultaneously. The rationale is we want to start from a clean state and
>> not pull potentially duplicated functionality from both stacks. Note
>> however that a few files are still built in net/, they are the ones
>> related to ethernet device management and the ethernet bootflow.
>>
>> The third patch splits the net.h header into net-legacy.h, net-common.h,
>> net-lwip.h, leaving net.h as a simple wrapper.
>>
>> The fourth patch introduces the Makefile to build lwIP when NET_LWIP is
>> enabled.
>>
>> The subsequent patches implement various network-oriented commands and
>> features: dhcp, dns, ping, tftpboot, wget.
>>
>> A number of features are currently incompatible with NET_LWIP: SANDBOX,
>> DFU_TFTP, FASTBOOT, SPL_NET. All make assumptions on how the network
>> stack is implemented and/or pull sybols that are not trivially exported
>> from lwIP. Some interface rework may be needed.
>>
>> Due to the above and in order to provide some level of testing, a new QEMU
>> configuration is introduced (qemu_arm64_lwip_defconfig) which is the same
>> as qemu_arm64_defconfig but with NET_LWIP and CMD_*_LWIP enabled.
>> Tests are added to test/py/tests/test_net.py for that configuration.
>>
> 
> Hi Jermone,
> 
> I've given this a spin on an imx8mm-venice-gw73xx-0x via
> imx8mm_venice_defconfig and I can't get DHCP to work (didn't work in
> v3 either):
> 
> $ diff defconfig configs/imx8mm_venice_defconfig
> 68,69c68,71
> < CONFIG_CMD_DNS_LWIP=y
> < CONFIG_CMD_WGET_LWIP=y
> ---
>> CONFIG_CMD_DHCP6=y
>> CONFIG_CMD_TFTPPUT=y
>> CONFIG_SYS_DISABLE_AUTOLOAD=y
>> CONFIG_CMD_WGET=y
> 88,90c90,94
> < CONFIG_NET_LWIP=y
> < CONFIG_LWIP_DEBUG=y
> < CONFIG_LWIP_ASSERT=y
> ---
>> CONFIG_NET_RANDOM_ETHADDR=y
>> CONFIG_IP_DEFRAG=y
>> CONFIG_TFTP_BLOCKSIZE=4096
>> CONFIG_PROT_TCP_SACK=y
>> CONFIG_IPV6=y
> 
> Target:
> u-boot=> net list
> eth0 : ethernet@30be0000 00:d0:12:b5:f8:41 active
> u-boot=> dhcp || echo fail
> eth0: ethernet@30be0000 00:d0:12:b5:f8:41 active
> netif_set_ipaddr: netif address being changed
> netif: added interface 0 addr 192.168.1.1 netmask 0.0.0.0 gw 0.0.0.0
> etharp_request: sending ARP request.
> etharp_raw: sending raw ARP packet.
> ethernet_output: sending packet 000000007df00bf0
> dhcp_start(netif=000000007df2af40) 0dhcp_start(): mallocing new DHCP client
> dhcp_start(): allocated dhcp
> dhcp_start(): starting DHCP configuration
> dhcp_discover()
> transaction id xid(42021)
> dhcp_discover: making request
> dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
> ip4_output_if: 0IP header:
> +-------------------------------+
> | 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
> +-------------------------------+
> |        0      |000|       0   | (id, flags, offset)
> +-------------------------------+
> |  255  |   17  |    0xba9d     | (ttl, proto, chksum)
> +-------------------------------+
> |    0  |    0  |    0  |    0  | (src)
> +-------------------------------+
> |  255  |  255  |  255  |  255  | (dest)
> +-------------------------------+
> ip4_output_if: call netif->output()
> ethernet_output: sending packet 000000007df2c010
> dhcp_discover: deleting()
> dhcp_discover: SELECTING
> dhcp_discover(): set request timeout 2000 msecs
> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:28:28:5d:bb:16:9f, type:8899
> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:00:27:0e:0d:74:ba, type:806
> etharp_update_arp_entry: 172.24.20.24 - 00:27:0e:0d:74:ba
> etharp_find_entry: found empty entry 0
> etharp_find_entry: no empty entry found and not allowed to recycle
> etharp_input: incoming ARP request
> etharp_input: ARP request was not for us.
> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:98:90:96:ac:d7:63, type:806
> etharp_update_arp_entry: 172.24.20.3 - 98:90:96:ac:d7:63
> etharp_find_entry: found empty entry 0
> etharp_find_entry: no empty entry found and not allowed to recycle
> etharp_input: incoming ARP request
> etharp_input: ARP request was not for us.
> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:84:2b:2b:4e:5a:9a, type:806
> etharp_update_arp_entry: 172.24.0.1 - 84:2b:2b:4e:5a:9a
> etharp_find_entry: found empty entry 0
> etharp_find_entry: no empty entry found and not allowed to recycle
> etharp_input: incoming ARP request
> etharp_input: ARP request was not for us.
> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:cc:96:e5:1e:a0:7b, type:800
> ip_input: iphdr->dest 0xffff1fac netif->ip_addr 0x101a8c0 (0x0, 0x0, 0xffff1fac)
> ip4_input: UDP packet to DHCP client port 138
> ip4_input: packet not for us.

That's weird. I think what could happen is the interface already has
a static IP assigned when it is registered to the lwIP stack and thus
DHCP never receives the packed because it is dropped before.
I have reworked the initialization code in v5, lwIP will get to know
only one interface at a time and I make sure that interface is registered
without an IP address when doing DHCP. Hopefully that should fix the issue.

Thanks for testing!
Simon Glass June 21, 2024, 2:57 p.m. UTC | #10
Hi Jerome,

On Mon, 17 Jun 2024 at 09:33, Jerome Forissier
<jerome.forissier@linaro.org> wrote:
>
> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
> stack [2] [3] as an alternative to the current implementation in net/,
> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
> reasons for doing so are:
> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
> has some additional lwIP and Mbed TLS patches to do so. With that it
> becomes possible to fetch and launch a distro installer such as Debian
> etc. using a secure, authenticated connection directly from the U-Boot
> shell. Several use cases:
>   * Authentication: prevent MITM attack (third party replacing the
> binary with a different one)
>   * Confidentiality: prevent third parties from grabbing a copy of the
> image as it is being downloaded
>   * Allow connection to servers that do not support plain HTTP anymore
> (this is becoming more and more common on the Internet these days)
> - Possibly benefit from additional features implemented in lwIP
> - Less code to maintain in U-Boot
>
> Prior to applying this series, the lwIP stack needs to be added as a
> Git subtree with the following command:
>
>  $  git subtree add --squash --prefix lib/lwip/lwip https://git.savannah.gnu.org/git/lwip.git STABLE-2_2_0_RELEASE
>
> The first patch renames some enums in order to avoid a conflict when a
> later patch enables the lwIP library.
>
> The second patch introduces a new Kconfig symbol: NET_LWIP, which selects
> the lwIP implementation instead of the current one (NET). Contrary to the
> approach chosen by Maxim in [1], NET_LWIP and NET cannot be enabled
> simultaneously. The rationale is we want to start from a clean state and
> not pull potentially duplicated functionality from both stacks. Note
> however that a few files are still built in net/, they are the ones
> related to ethernet device management and the ethernet bootflow.
>
> The third patch splits the net.h header into net-legacy.h, net-common.h,
> net-lwip.h, leaving net.h as a simple wrapper.
>
> The fourth patch introduces the Makefile to build lwIP when NET_LWIP is
> enabled.
>
> The subsequent patches implement various network-oriented commands and
> features: dhcp, dns, ping, tftpboot, wget.
>
> A number of features are currently incompatible with NET_LWIP: SANDBOX,
> DFU_TFTP, FASTBOOT, SPL_NET. All make assumptions on how the network
> stack is implemented and/or pull sybols that are not trivially exported
> from lwIP. Some interface rework may be needed.
>
> Due to the above and in order to provide some level of testing, a new QEMU
> configuration is introduced (qemu_arm64_lwip_defconfig) which is the same
> as qemu_arm64_defconfig but with NET_LWIP and CMD_*_LWIP enabled.
> Tests are added to test/py/tests/test_net.py for that configuration.
>
> Changes in v4:
>
> - Fixed the DHCP algorithm which was missing a sys_timeout() call in
> the "fine timer" callback. This should close the issue that Tom R.
> reported with his Raspberry Pi 3 (it does fix it on mine).
> - The DHCP exchange timeout is increased from 2 to 10 seconds
> - The DHCP exchange can be interrupted with Ctrl-C.
> - "net: introduce alternative implementation as net-lwip/": rework
> dependencies. A few symbols have 'depends on !NET_LWIP' and in addition
> 'NET_LWIP depends on !SANDBOX'. Sandbox, DSA and fastboot are
> unsupported, because they are deeply welded to the current stack.
> - All network commands (dns, ping, tftp and wget):
>   * Get rid of global variables (Ilias A.)
>   * Use printf() rather than log_info()
> - "net-lwip: add ping command": use packet count instead of
> timeout, fix code style (Ilias A.)
> - Add "net: split cmd/net.c into cmd/net.c and cmd/net-common.c"
> extracted from the wget patch (Ilias A.).
> - Add "net: split include/net.h into net{,-common,-legacy,-lwip}.h"
> (Ilias A.)
> - Add "flash: prefix error codes with FL_" which is required to
> avoid name clashes when splitting net.h
> - Reworked the initialization of the lwIP stack. One and only
> one network interface (struct netif) is added for the duration
> of the command that uses that interface. That's commit "net-lwip:
> add DHCP support and dhcp commmand".
> - Drop "test: dm: dsa, eth: disable tests when CONFIG_NET_LWIP=y",
> not needed now that NET_LWIP depend on !SANDBOX.
> - qemu_arm64_lwip_defconfig now enables CMD_DNS and CMD_WGET (so
> that all the supported network commands are available).
>
> Changes in v3:
>
> - Make NET_LWIP a Kconfig choice in patch "net: introduce alternative
> implementation as net-lwip/" (Tom R.)
> - Drop the patch introducing lwIP as a Git subtree and document the git
> command in the cover letter instead (Tom R.)
> - "net-lwip: add TFTP support and tftpboot command": use the same
> "Bytes transferred =" message as in the legacy implementation (Tom R.,
> Maxim U.)
> - Drop "test/py: net: add _lwip variants of dhcp, ping and tftpboot
> tests" which is not needed anymore.
> - Add missing kfree() calls in cmd/net-common.c and fix the parsing of
> decimal address in net-lwip/wget.c (patch "net-lwip: add wget command")
> (Maxim U.)
> - "net-lwip: add ping command": drop the ICMP payload (Ilias A.). Set
> the sequence number to zero when entering ping_loop().
>
> Changes in v2:
>
> ** Address comments from Ilias A.
>
> - "net-lwip: add wget command"
> Implement the wget_with_dns() function to do most of the wget work and
> call it from do_wget(). This allows to simplify patch "net-lwip: add
> support for EFI_HTTP_BOOT".
>
> - "net-lwip: import net command from cmd/net.c"
> Move a few functions from cmd/net.c to a new file cmd/net-common.c
> rather than duplicating then in cmd/net-lwip.c.
>
> - "net-lwip: add support for EFI_HTTP_BOOT"
> Since wget_with_dns() is now implemented in "net-lwip: add wget command",
> just enable the wget command when the lwIP stack is enabled and
> EFI_HTTP_BOOT is requested.
>
> ** Address comments from Tom R.
>
> - "net-lwip: add DHCP support and dhcp commmand",
>   "net-lwip: add TFTP support and tftpboot command",
>   "net-lwip: add ping command",
>   "net-lwip: add dns command",
>   "net-lwip: add wget command"
> Do not introduce new CMD_XXX_LWIP symbols and use existing CMD_XXX
> instead.
>
> - "configs: add qemu_arm64_lwip_defconfig"
> Use #include <configs/qemu_arm64_defconfig>.
>
> - "net-lwip: import lwIP library under lib/lwip"
> Patch removed and replaced by the introduction of a Git subtree:
> "Squashed 'lib/lwip/lwip/' content from commit 0a0452b2c3".
>
> Note that I have not yet addressed your comments on "test: dm: dsa,
> eth: disable tests when CONFIG_NET_LWIP=y"). I need some more time
> for that and I think running CI on this v2 will help better understand
> what is needed for v3.
>
> ** Miscellaneous improvements
>
> - "net: introduce alternative implementation as net-lwip/":
>
> * Make DFU_OVER_TFTP not DFU_TFTP incompatible with NET_LWIP. It seems
> quite natural to supplement "depends on NET" with "&& !NET_LWIP".
> * Make PROT_*_LWIP not visible by removing the Kconfig prompt.
>
> [1] https://lore.kernel.org/all/20231127125726.3735-1-maxim.uvarov@linaro.org/
> [2] https://www.nongnu.org/lwip/
> [3] https://en.wikipedia.org/wiki/LwIP
>
> CC: Javier Tia <javier.tia@linaro.org>
>
> Jerome Forissier (14):
>   flash: prefix error codes with FL_
>   net: introduce alternative implementation as net-lwip/
>   net: split include/net.h into net{,-common,-legacy,-lwip}.h
>   net-lwip: build lwIP
>   net-lwip: add DHCP support and dhcp commmand
>   net-lwip: add TFTP support and tftpboot command
>   net-lwip: add ping command
>   net-lwip: add dns command
>   net: split cmd/net.c into cmd/net.c and cmd/net-common.c
>   net-lwip: add wget command
>   cmd: bdinfo: enable -e when CONFIG_CMD_NET_LWIP=y
>   configs: add qemu_arm64_lwip_defconfig
>   MAINTAINERS: net-lwip: add myself as a maintainer
>   CI: add qemu_arm64_lwip to the test matrix
>
>  .azure-pipelines.yml              |   7 +
>  Kconfig                           |  41 ++
>  MAINTAINERS                       |  11 +
>  Makefile                          |  12 +-
>  board/cobra5272/flash.c           |  26 +-
>  board/freescale/m5253demo/flash.c |   6 +-
>  boot/Kconfig                      |   5 +-
>  cmd/Kconfig                       |  44 ++
>  cmd/Makefile                      |   7 +
>  cmd/bdinfo.c                      |   5 +-
>  cmd/net-common.c                  | 109 ++++
>  cmd/net-lwip.c                    |  45 ++
>  cmd/net.c                         | 115 ----
>  common/Kconfig                    |   2 +-
>  common/board_r.c                  |   4 +-
>  common/flash.c                    |  44 +-
>  common/spl/Kconfig                |   1 +
>  configs/qemu_arm64_lwip_defconfig |   4 +
>  drivers/dfu/Kconfig               |   1 +
>  drivers/fastboot/Kconfig          |   1 +
>  drivers/mtd/cfi_flash.c           |  36 +-
>  drivers/net/Kconfig               |   3 +-
>  drivers/net/phy/Kconfig           |   2 +-
>  drivers/usb/gadget/Kconfig        |   2 +-
>  include/flash.h                   |  20 +-
>  include/net-common.h              | 408 +++++++++++++
>  include/net-legacy.h              | 649 ++++++++++++++++++++
>  include/net-lwip.h                |  36 ++
>  include/net.h                     | 944 +-----------------------------
>  lib/Makefile                      |   2 +
>  lib/lwip/Makefile                 |  55 ++
>  lib/lwip/u-boot/arch/cc.h         |  43 ++
>  lib/lwip/u-boot/arch/sys_arch.h   |   0
>  lib/lwip/u-boot/limits.h          |   0
>  lib/lwip/u-boot/lwipopts.h        | 197 +++++++
>  net-lwip/Kconfig                  |  37 ++
>  net-lwip/Makefile                 |  18 +
>  net-lwip/dhcp.c                   | 108 ++++
>  net-lwip/dns.c                    | 117 ++++
>  net-lwip/eth_internal.h           |  35 ++
>  net-lwip/net-lwip.c               | 270 +++++++++
>  net-lwip/ping.c                   | 174 ++++++
>  net-lwip/tftp.c                   | 226 +++++++
>  net-lwip/wget.c                   | 269 +++++++++
>  net/Kconfig                       |  14 -
>  45 files changed, 3008 insertions(+), 1147 deletions(-)
>  create mode 100644 cmd/net-common.c
>  create mode 100644 cmd/net-lwip.c
>  create mode 100644 configs/qemu_arm64_lwip_defconfig
>  create mode 100644 include/net-common.h
>  create mode 100644 include/net-legacy.h
>  create mode 100644 include/net-lwip.h
>  create mode 100644 lib/lwip/Makefile
>  create mode 100644 lib/lwip/u-boot/arch/cc.h
>  create mode 100644 lib/lwip/u-boot/arch/sys_arch.h
>  create mode 100644 lib/lwip/u-boot/limits.h
>  create mode 100644 lib/lwip/u-boot/lwipopts.h
>  create mode 100644 net-lwip/Kconfig
>  create mode 100644 net-lwip/Makefile
>  create mode 100644 net-lwip/dhcp.c
>  create mode 100644 net-lwip/dns.c
>  create mode 100644 net-lwip/eth_internal.h
>  create mode 100644 net-lwip/net-lwip.c
>  create mode 100644 net-lwip/ping.c
>  create mode 100644 net-lwip/tftp.c
>  create mode 100644 net-lwip/wget.c

Would it be possible to put net-lwip in net/lwip? We should try to
keep the directory structure clean / minimal.

Regards,
Simon
Tim Harvey June 21, 2024, 4:08 p.m. UTC | #11
On Fri, Jun 21, 2024 at 5:59 AM Jerome Forissier
<jerome.forissier@linaro.org> wrote:
>
>
>
> On 6/20/24 19:10, Tim Harvey wrote:
> > On Mon, Jun 17, 2024 at 8:33 AM Jerome Forissier
> > <jerome.forissier@linaro.org> wrote:
> >>
> >> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
> >> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
> >> stack [2] [3] as an alternative to the current implementation in net/,
> >> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
> >> reasons for doing so are:
> >> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
> >> has some additional lwIP and Mbed TLS patches to do so. With that it
> >> becomes possible to fetch and launch a distro installer such as Debian
> >> etc. using a secure, authenticated connection directly from the U-Boot
> >> shell. Several use cases:
> >>   * Authentication: prevent MITM attack (third party replacing the
> >> binary with a different one)
> >>   * Confidentiality: prevent third parties from grabbing a copy of the
> >> image as it is being downloaded
> >>   * Allow connection to servers that do not support plain HTTP anymore
> >> (this is becoming more and more common on the Internet these days)
> >> - Possibly benefit from additional features implemented in lwIP
> >> - Less code to maintain in U-Boot
> >>
> >> Prior to applying this series, the lwIP stack needs to be added as a
> >> Git subtree with the following command:
> >>
> >>  $  git subtree add --squash --prefix lib/lwip/lwip https://git.savannah.gnu.org/git/lwip.git STABLE-2_2_0_RELEASE
> >>
> >> The first patch renames some enums in order to avoid a conflict when a
> >> later patch enables the lwIP library.
> >>
> >> The second patch introduces a new Kconfig symbol: NET_LWIP, which selects
> >> the lwIP implementation instead of the current one (NET). Contrary to the
> >> approach chosen by Maxim in [1], NET_LWIP and NET cannot be enabled
> >> simultaneously. The rationale is we want to start from a clean state and
> >> not pull potentially duplicated functionality from both stacks. Note
> >> however that a few files are still built in net/, they are the ones
> >> related to ethernet device management and the ethernet bootflow.
> >>
> >> The third patch splits the net.h header into net-legacy.h, net-common.h,
> >> net-lwip.h, leaving net.h as a simple wrapper.
> >>
> >> The fourth patch introduces the Makefile to build lwIP when NET_LWIP is
> >> enabled.
> >>
> >> The subsequent patches implement various network-oriented commands and
> >> features: dhcp, dns, ping, tftpboot, wget.
> >>
> >> A number of features are currently incompatible with NET_LWIP: SANDBOX,
> >> DFU_TFTP, FASTBOOT, SPL_NET. All make assumptions on how the network
> >> stack is implemented and/or pull sybols that are not trivially exported
> >> from lwIP. Some interface rework may be needed.
> >>
> >> Due to the above and in order to provide some level of testing, a new QEMU
> >> configuration is introduced (qemu_arm64_lwip_defconfig) which is the same
> >> as qemu_arm64_defconfig but with NET_LWIP and CMD_*_LWIP enabled.
> >> Tests are added to test/py/tests/test_net.py for that configuration.
> >>
> >
> > Hi Jermone,
> >
> > I've given this a spin on an imx8mm-venice-gw73xx-0x via
> > imx8mm_venice_defconfig and I can't get DHCP to work (didn't work in
> > v3 either):
> >
> > $ diff defconfig configs/imx8mm_venice_defconfig
> > 68,69c68,71
> > < CONFIG_CMD_DNS_LWIP=y
> > < CONFIG_CMD_WGET_LWIP=y
> > ---
> >> CONFIG_CMD_DHCP6=y
> >> CONFIG_CMD_TFTPPUT=y
> >> CONFIG_SYS_DISABLE_AUTOLOAD=y
> >> CONFIG_CMD_WGET=y
> > 88,90c90,94
> > < CONFIG_NET_LWIP=y
> > < CONFIG_LWIP_DEBUG=y
> > < CONFIG_LWIP_ASSERT=y
> > ---
> >> CONFIG_NET_RANDOM_ETHADDR=y
> >> CONFIG_IP_DEFRAG=y
> >> CONFIG_TFTP_BLOCKSIZE=4096
> >> CONFIG_PROT_TCP_SACK=y
> >> CONFIG_IPV6=y
> >
> > Target:
> > u-boot=> net list
> > eth0 : ethernet@30be0000 00:d0:12:b5:f8:41 active
> > u-boot=> dhcp || echo fail
> > eth0: ethernet@30be0000 00:d0:12:b5:f8:41 active
> > netif_set_ipaddr: netif address being changed
> > netif: added interface 0 addr 192.168.1.1 netmask 0.0.0.0 gw 0.0.0.0
> > etharp_request: sending ARP request.
> > etharp_raw: sending raw ARP packet.
> > ethernet_output: sending packet 000000007df00bf0
> > dhcp_start(netif=000000007df2af40) 0dhcp_start(): mallocing new DHCP client
> > dhcp_start(): allocated dhcp
> > dhcp_start(): starting DHCP configuration
> > dhcp_discover()
> > transaction id xid(42021)
> > dhcp_discover: making request
> > dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
> > ip4_output_if: 0IP header:
> > +-------------------------------+
> > | 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
> > +-------------------------------+
> > |        0      |000|       0   | (id, flags, offset)
> > +-------------------------------+
> > |  255  |   17  |    0xba9d     | (ttl, proto, chksum)
> > +-------------------------------+
> > |    0  |    0  |    0  |    0  | (src)
> > +-------------------------------+
> > |  255  |  255  |  255  |  255  | (dest)
> > +-------------------------------+
> > ip4_output_if: call netif->output()
> > ethernet_output: sending packet 000000007df2c010
> > dhcp_discover: deleting()
> > dhcp_discover: SELECTING
> > dhcp_discover(): set request timeout 2000 msecs
> > ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:28:28:5d:bb:16:9f, type:8899
> > ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:00:27:0e:0d:74:ba, type:806
> > etharp_update_arp_entry: 172.24.20.24 - 00:27:0e:0d:74:ba
> > etharp_find_entry: found empty entry 0
> > etharp_find_entry: no empty entry found and not allowed to recycle
> > etharp_input: incoming ARP request
> > etharp_input: ARP request was not for us.
> > ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:98:90:96:ac:d7:63, type:806
> > etharp_update_arp_entry: 172.24.20.3 - 98:90:96:ac:d7:63
> > etharp_find_entry: found empty entry 0
> > etharp_find_entry: no empty entry found and not allowed to recycle
> > etharp_input: incoming ARP request
> > etharp_input: ARP request was not for us.
> > ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:84:2b:2b:4e:5a:9a, type:806
> > etharp_update_arp_entry: 172.24.0.1 - 84:2b:2b:4e:5a:9a
> > etharp_find_entry: found empty entry 0
> > etharp_find_entry: no empty entry found and not allowed to recycle
> > etharp_input: incoming ARP request
> > etharp_input: ARP request was not for us.
> > ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:cc:96:e5:1e:a0:7b, type:800
> > ip_input: iphdr->dest 0xffff1fac netif->ip_addr 0x101a8c0 (0x0, 0x0, 0xffff1fac)
> > ip4_input: UDP packet to DHCP client port 138
> > ip4_input: packet not for us.
>
> That's weird. I think what could happen is the interface already has
> a static IP assigned when it is registered to the lwIP stack and thus
> DHCP never receives the packed because it is dropped before.
> I have reworked the initialization code in v5, lwIP will get to know
> only one interface at a time and I make sure that interface is registered
> without an IP address when doing DHCP. Hopefully that should fix the issue.
>
> Thanks for testing!
>

I tried your to-upstream/v5-wip branch
(042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
you have something else to try/test?

Best Regards,

Tim
Jerome Forissier June 21, 2024, 5:55 p.m. UTC | #12
Hi Simon,

On 6/21/24 16:57, Simon Glass wrote:
> Hi Jerome,
> 
> On Mon, 17 Jun 2024 at 09:33, Jerome Forissier
> <jerome.forissier@linaro.org> wrote:
>>
>> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
>> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
>> stack [2] [3] as an alternative to the current implementation in net/,
>> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
>> reasons for doing so are:
>> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
>> has some additional lwIP and Mbed TLS patches to do so. With that it
>> becomes possible to fetch and launch a distro installer such as Debian
>> etc. using a secure, authenticated connection directly from the U-Boot
>> shell. Several use cases:
>>   * Authentication: prevent MITM attack (third party replacing the
>> binary with a different one)
>>   * Confidentiality: prevent third parties from grabbing a copy of the
>> image as it is being downloaded
>>   * Allow connection to servers that do not support plain HTTP anymore
>> (this is becoming more and more common on the Internet these days)
>> - Possibly benefit from additional features implemented in lwIP
>> - Less code to maintain in U-Boot
>>
>> Prior to applying this series, the lwIP stack needs to be added as a
>> Git subtree with the following command:
>>
>>  $  git subtree add --squash --prefix lib/lwip/lwip https://git.savannah.gnu.org/git/lwip.git STABLE-2_2_0_RELEASE
>>
>> The first patch renames some enums in order to avoid a conflict when a
>> later patch enables the lwIP library.
>>
>> The second patch introduces a new Kconfig symbol: NET_LWIP, which selects
>> the lwIP implementation instead of the current one (NET). Contrary to the
>> approach chosen by Maxim in [1], NET_LWIP and NET cannot be enabled
>> simultaneously. The rationale is we want to start from a clean state and
>> not pull potentially duplicated functionality from both stacks. Note
>> however that a few files are still built in net/, they are the ones
>> related to ethernet device management and the ethernet bootflow.
>>
>> The third patch splits the net.h header into net-legacy.h, net-common.h,
>> net-lwip.h, leaving net.h as a simple wrapper.
>>
>> The fourth patch introduces the Makefile to build lwIP when NET_LWIP is
>> enabled.
>>
>> The subsequent patches implement various network-oriented commands and
>> features: dhcp, dns, ping, tftpboot, wget.
>>
>> A number of features are currently incompatible with NET_LWIP: SANDBOX,
>> DFU_TFTP, FASTBOOT, SPL_NET. All make assumptions on how the network
>> stack is implemented and/or pull sybols that are not trivially exported
>> from lwIP. Some interface rework may be needed.
>>
>> Due to the above and in order to provide some level of testing, a new QEMU
>> configuration is introduced (qemu_arm64_lwip_defconfig) which is the same
>> as qemu_arm64_defconfig but with NET_LWIP and CMD_*_LWIP enabled.
>> Tests are added to test/py/tests/test_net.py for that configuration.
>>
>> Changes in v4:
>>
>> - Fixed the DHCP algorithm which was missing a sys_timeout() call in
>> the "fine timer" callback. This should close the issue that Tom R.
>> reported with his Raspberry Pi 3 (it does fix it on mine).
>> - The DHCP exchange timeout is increased from 2 to 10 seconds
>> - The DHCP exchange can be interrupted with Ctrl-C.
>> - "net: introduce alternative implementation as net-lwip/": rework
>> dependencies. A few symbols have 'depends on !NET_LWIP' and in addition
>> 'NET_LWIP depends on !SANDBOX'. Sandbox, DSA and fastboot are
>> unsupported, because they are deeply welded to the current stack.
>> - All network commands (dns, ping, tftp and wget):
>>   * Get rid of global variables (Ilias A.)
>>   * Use printf() rather than log_info()
>> - "net-lwip: add ping command": use packet count instead of
>> timeout, fix code style (Ilias A.)
>> - Add "net: split cmd/net.c into cmd/net.c and cmd/net-common.c"
>> extracted from the wget patch (Ilias A.).
>> - Add "net: split include/net.h into net{,-common,-legacy,-lwip}.h"
>> (Ilias A.)
>> - Add "flash: prefix error codes with FL_" which is required to
>> avoid name clashes when splitting net.h
>> - Reworked the initialization of the lwIP stack. One and only
>> one network interface (struct netif) is added for the duration
>> of the command that uses that interface. That's commit "net-lwip:
>> add DHCP support and dhcp commmand".
>> - Drop "test: dm: dsa, eth: disable tests when CONFIG_NET_LWIP=y",
>> not needed now that NET_LWIP depend on !SANDBOX.
>> - qemu_arm64_lwip_defconfig now enables CMD_DNS and CMD_WGET (so
>> that all the supported network commands are available).
>>
>> Changes in v3:
>>
>> - Make NET_LWIP a Kconfig choice in patch "net: introduce alternative
>> implementation as net-lwip/" (Tom R.)
>> - Drop the patch introducing lwIP as a Git subtree and document the git
>> command in the cover letter instead (Tom R.)
>> - "net-lwip: add TFTP support and tftpboot command": use the same
>> "Bytes transferred =" message as in the legacy implementation (Tom R.,
>> Maxim U.)
>> - Drop "test/py: net: add _lwip variants of dhcp, ping and tftpboot
>> tests" which is not needed anymore.
>> - Add missing kfree() calls in cmd/net-common.c and fix the parsing of
>> decimal address in net-lwip/wget.c (patch "net-lwip: add wget command")
>> (Maxim U.)
>> - "net-lwip: add ping command": drop the ICMP payload (Ilias A.). Set
>> the sequence number to zero when entering ping_loop().
>>
>> Changes in v2:
>>
>> ** Address comments from Ilias A.
>>
>> - "net-lwip: add wget command"
>> Implement the wget_with_dns() function to do most of the wget work and
>> call it from do_wget(). This allows to simplify patch "net-lwip: add
>> support for EFI_HTTP_BOOT".
>>
>> - "net-lwip: import net command from cmd/net.c"
>> Move a few functions from cmd/net.c to a new file cmd/net-common.c
>> rather than duplicating then in cmd/net-lwip.c.
>>
>> - "net-lwip: add support for EFI_HTTP_BOOT"
>> Since wget_with_dns() is now implemented in "net-lwip: add wget command",
>> just enable the wget command when the lwIP stack is enabled and
>> EFI_HTTP_BOOT is requested.
>>
>> ** Address comments from Tom R.
>>
>> - "net-lwip: add DHCP support and dhcp commmand",
>>   "net-lwip: add TFTP support and tftpboot command",
>>   "net-lwip: add ping command",
>>   "net-lwip: add dns command",
>>   "net-lwip: add wget command"
>> Do not introduce new CMD_XXX_LWIP symbols and use existing CMD_XXX
>> instead.
>>
>> - "configs: add qemu_arm64_lwip_defconfig"
>> Use #include <configs/qemu_arm64_defconfig>.
>>
>> - "net-lwip: import lwIP library under lib/lwip"
>> Patch removed and replaced by the introduction of a Git subtree:
>> "Squashed 'lib/lwip/lwip/' content from commit 0a0452b2c3".
>>
>> Note that I have not yet addressed your comments on "test: dm: dsa,
>> eth: disable tests when CONFIG_NET_LWIP=y"). I need some more time
>> for that and I think running CI on this v2 will help better understand
>> what is needed for v3.
>>
>> ** Miscellaneous improvements
>>
>> - "net: introduce alternative implementation as net-lwip/":
>>
>> * Make DFU_OVER_TFTP not DFU_TFTP incompatible with NET_LWIP. It seems
>> quite natural to supplement "depends on NET" with "&& !NET_LWIP".
>> * Make PROT_*_LWIP not visible by removing the Kconfig prompt.
>>
>> [1] https://lore.kernel.org/all/20231127125726.3735-1-maxim.uvarov@linaro.org/
>> [2] https://www.nongnu.org/lwip/
>> [3] https://en.wikipedia.org/wiki/LwIP
>>
>> CC: Javier Tia <javier.tia@linaro.org>
>>
>> Jerome Forissier (14):
>>   flash: prefix error codes with FL_
>>   net: introduce alternative implementation as net-lwip/
>>   net: split include/net.h into net{,-common,-legacy,-lwip}.h
>>   net-lwip: build lwIP
>>   net-lwip: add DHCP support and dhcp commmand
>>   net-lwip: add TFTP support and tftpboot command
>>   net-lwip: add ping command
>>   net-lwip: add dns command
>>   net: split cmd/net.c into cmd/net.c and cmd/net-common.c
>>   net-lwip: add wget command
>>   cmd: bdinfo: enable -e when CONFIG_CMD_NET_LWIP=y
>>   configs: add qemu_arm64_lwip_defconfig
>>   MAINTAINERS: net-lwip: add myself as a maintainer
>>   CI: add qemu_arm64_lwip to the test matrix
>>
>>  .azure-pipelines.yml              |   7 +
>>  Kconfig                           |  41 ++
>>  MAINTAINERS                       |  11 +
>>  Makefile                          |  12 +-
>>  board/cobra5272/flash.c           |  26 +-
>>  board/freescale/m5253demo/flash.c |   6 +-
>>  boot/Kconfig                      |   5 +-
>>  cmd/Kconfig                       |  44 ++
>>  cmd/Makefile                      |   7 +
>>  cmd/bdinfo.c                      |   5 +-
>>  cmd/net-common.c                  | 109 ++++
>>  cmd/net-lwip.c                    |  45 ++
>>  cmd/net.c                         | 115 ----
>>  common/Kconfig                    |   2 +-
>>  common/board_r.c                  |   4 +-
>>  common/flash.c                    |  44 +-
>>  common/spl/Kconfig                |   1 +
>>  configs/qemu_arm64_lwip_defconfig |   4 +
>>  drivers/dfu/Kconfig               |   1 +
>>  drivers/fastboot/Kconfig          |   1 +
>>  drivers/mtd/cfi_flash.c           |  36 +-
>>  drivers/net/Kconfig               |   3 +-
>>  drivers/net/phy/Kconfig           |   2 +-
>>  drivers/usb/gadget/Kconfig        |   2 +-
>>  include/flash.h                   |  20 +-
>>  include/net-common.h              | 408 +++++++++++++
>>  include/net-legacy.h              | 649 ++++++++++++++++++++
>>  include/net-lwip.h                |  36 ++
>>  include/net.h                     | 944 +-----------------------------
>>  lib/Makefile                      |   2 +
>>  lib/lwip/Makefile                 |  55 ++
>>  lib/lwip/u-boot/arch/cc.h         |  43 ++
>>  lib/lwip/u-boot/arch/sys_arch.h   |   0
>>  lib/lwip/u-boot/limits.h          |   0
>>  lib/lwip/u-boot/lwipopts.h        | 197 +++++++
>>  net-lwip/Kconfig                  |  37 ++
>>  net-lwip/Makefile                 |  18 +
>>  net-lwip/dhcp.c                   | 108 ++++
>>  net-lwip/dns.c                    | 117 ++++
>>  net-lwip/eth_internal.h           |  35 ++
>>  net-lwip/net-lwip.c               | 270 +++++++++
>>  net-lwip/ping.c                   | 174 ++++++
>>  net-lwip/tftp.c                   | 226 +++++++
>>  net-lwip/wget.c                   | 269 +++++++++
>>  net/Kconfig                       |  14 -
>>  45 files changed, 3008 insertions(+), 1147 deletions(-)
>>  create mode 100644 cmd/net-common.c
>>  create mode 100644 cmd/net-lwip.c
>>  create mode 100644 configs/qemu_arm64_lwip_defconfig
>>  create mode 100644 include/net-common.h
>>  create mode 100644 include/net-legacy.h
>>  create mode 100644 include/net-lwip.h
>>  create mode 100644 lib/lwip/Makefile
>>  create mode 100644 lib/lwip/u-boot/arch/cc.h
>>  create mode 100644 lib/lwip/u-boot/arch/sys_arch.h
>>  create mode 100644 lib/lwip/u-boot/limits.h
>>  create mode 100644 lib/lwip/u-boot/lwipopts.h
>>  create mode 100644 net-lwip/Kconfig
>>  create mode 100644 net-lwip/Makefile
>>  create mode 100644 net-lwip/dhcp.c
>>  create mode 100644 net-lwip/dns.c
>>  create mode 100644 net-lwip/eth_internal.h
>>  create mode 100644 net-lwip/net-lwip.c
>>  create mode 100644 net-lwip/ping.c
>>  create mode 100644 net-lwip/tftp.c
>>  create mode 100644 net-lwip/wget.c
> 
> Would it be possible to put net-lwip in net/lwip? We should try to
> keep the directory structure clean / minimal.

Sure.
Jerome Forissier June 21, 2024, 5:56 p.m. UTC | #13
On 6/21/24 18:08, Tim Harvey wrote:
> On Fri, Jun 21, 2024 at 5:59 AM Jerome Forissier
> <jerome.forissier@linaro.org> wrote:
>>
>>
>>
>> On 6/20/24 19:10, Tim Harvey wrote:
>>> On Mon, Jun 17, 2024 at 8:33 AM Jerome Forissier
>>> <jerome.forissier@linaro.org> wrote:
>>>>
>>>> This is a rework of a patch series by Maxim Uvarov: "net/lwip: add lwip
>>>> library for the network stack" [1]. The goal is to introduce the lwIP TCP/IP
>>>> stack [2] [3] as an alternative to the current implementation in net/,
>>>> selectable with Kconfig, and ultimately keep only lwIP if possible. Some
>>>> reasons for doing so are:
>>>> - Make the support of HTTPS in the wget command easier. Javier T. (CC'd)
>>>> has some additional lwIP and Mbed TLS patches to do so. With that it
>>>> becomes possible to fetch and launch a distro installer such as Debian
>>>> etc. using a secure, authenticated connection directly from the U-Boot
>>>> shell. Several use cases:
>>>>   * Authentication: prevent MITM attack (third party replacing the
>>>> binary with a different one)
>>>>   * Confidentiality: prevent third parties from grabbing a copy of the
>>>> image as it is being downloaded
>>>>   * Allow connection to servers that do not support plain HTTP anymore
>>>> (this is becoming more and more common on the Internet these days)
>>>> - Possibly benefit from additional features implemented in lwIP
>>>> - Less code to maintain in U-Boot
>>>>
>>>> Prior to applying this series, the lwIP stack needs to be added as a
>>>> Git subtree with the following command:
>>>>
>>>>  $  git subtree add --squash --prefix lib/lwip/lwip https://git.savannah.gnu.org/git/lwip.git STABLE-2_2_0_RELEASE
>>>>
>>>> The first patch renames some enums in order to avoid a conflict when a
>>>> later patch enables the lwIP library.
>>>>
>>>> The second patch introduces a new Kconfig symbol: NET_LWIP, which selects
>>>> the lwIP implementation instead of the current one (NET). Contrary to the
>>>> approach chosen by Maxim in [1], NET_LWIP and NET cannot be enabled
>>>> simultaneously. The rationale is we want to start from a clean state and
>>>> not pull potentially duplicated functionality from both stacks. Note
>>>> however that a few files are still built in net/, they are the ones
>>>> related to ethernet device management and the ethernet bootflow.
>>>>
>>>> The third patch splits the net.h header into net-legacy.h, net-common.h,
>>>> net-lwip.h, leaving net.h as a simple wrapper.
>>>>
>>>> The fourth patch introduces the Makefile to build lwIP when NET_LWIP is
>>>> enabled.
>>>>
>>>> The subsequent patches implement various network-oriented commands and
>>>> features: dhcp, dns, ping, tftpboot, wget.
>>>>
>>>> A number of features are currently incompatible with NET_LWIP: SANDBOX,
>>>> DFU_TFTP, FASTBOOT, SPL_NET. All make assumptions on how the network
>>>> stack is implemented and/or pull sybols that are not trivially exported
>>>> from lwIP. Some interface rework may be needed.
>>>>
>>>> Due to the above and in order to provide some level of testing, a new QEMU
>>>> configuration is introduced (qemu_arm64_lwip_defconfig) which is the same
>>>> as qemu_arm64_defconfig but with NET_LWIP and CMD_*_LWIP enabled.
>>>> Tests are added to test/py/tests/test_net.py for that configuration.
>>>>
>>>
>>> Hi Jermone,
>>>
>>> I've given this a spin on an imx8mm-venice-gw73xx-0x via
>>> imx8mm_venice_defconfig and I can't get DHCP to work (didn't work in
>>> v3 either):
>>>
>>> $ diff defconfig configs/imx8mm_venice_defconfig
>>> 68,69c68,71
>>> < CONFIG_CMD_DNS_LWIP=y
>>> < CONFIG_CMD_WGET_LWIP=y
>>> ---
>>>> CONFIG_CMD_DHCP6=y
>>>> CONFIG_CMD_TFTPPUT=y
>>>> CONFIG_SYS_DISABLE_AUTOLOAD=y
>>>> CONFIG_CMD_WGET=y
>>> 88,90c90,94
>>> < CONFIG_NET_LWIP=y
>>> < CONFIG_LWIP_DEBUG=y
>>> < CONFIG_LWIP_ASSERT=y
>>> ---
>>>> CONFIG_NET_RANDOM_ETHADDR=y
>>>> CONFIG_IP_DEFRAG=y
>>>> CONFIG_TFTP_BLOCKSIZE=4096
>>>> CONFIG_PROT_TCP_SACK=y
>>>> CONFIG_IPV6=y
>>>
>>> Target:
>>> u-boot=> net list
>>> eth0 : ethernet@30be0000 00:d0:12:b5:f8:41 active
>>> u-boot=> dhcp || echo fail
>>> eth0: ethernet@30be0000 00:d0:12:b5:f8:41 active
>>> netif_set_ipaddr: netif address being changed
>>> netif: added interface 0 addr 192.168.1.1 netmask 0.0.0.0 gw 0.0.0.0
>>> etharp_request: sending ARP request.
>>> etharp_raw: sending raw ARP packet.
>>> ethernet_output: sending packet 000000007df00bf0
>>> dhcp_start(netif=000000007df2af40) 0dhcp_start(): mallocing new DHCP client
>>> dhcp_start(): allocated dhcp
>>> dhcp_start(): starting DHCP configuration
>>> dhcp_discover()
>>> transaction id xid(42021)
>>> dhcp_discover: making request
>>> dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
>>> ip4_output_if: 0IP header:
>>> +-------------------------------+
>>> | 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
>>> +-------------------------------+
>>> |        0      |000|       0   | (id, flags, offset)
>>> +-------------------------------+
>>> |  255  |   17  |    0xba9d     | (ttl, proto, chksum)
>>> +-------------------------------+
>>> |    0  |    0  |    0  |    0  | (src)
>>> +-------------------------------+
>>> |  255  |  255  |  255  |  255  | (dest)
>>> +-------------------------------+
>>> ip4_output_if: call netif->output()
>>> ethernet_output: sending packet 000000007df2c010
>>> dhcp_discover: deleting()
>>> dhcp_discover: SELECTING
>>> dhcp_discover(): set request timeout 2000 msecs
>>> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:28:28:5d:bb:16:9f, type:8899
>>> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:00:27:0e:0d:74:ba, type:806
>>> etharp_update_arp_entry: 172.24.20.24 - 00:27:0e:0d:74:ba
>>> etharp_find_entry: found empty entry 0
>>> etharp_find_entry: no empty entry found and not allowed to recycle
>>> etharp_input: incoming ARP request
>>> etharp_input: ARP request was not for us.
>>> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:98:90:96:ac:d7:63, type:806
>>> etharp_update_arp_entry: 172.24.20.3 - 98:90:96:ac:d7:63
>>> etharp_find_entry: found empty entry 0
>>> etharp_find_entry: no empty entry found and not allowed to recycle
>>> etharp_input: incoming ARP request
>>> etharp_input: ARP request was not for us.
>>> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:84:2b:2b:4e:5a:9a, type:806
>>> etharp_update_arp_entry: 172.24.0.1 - 84:2b:2b:4e:5a:9a
>>> etharp_find_entry: found empty entry 0
>>> etharp_find_entry: no empty entry found and not allowed to recycle
>>> etharp_input: incoming ARP request
>>> etharp_input: ARP request was not for us.
>>> ethernet_input: dest:ff:ff:ff:ff:ff:ff, src:cc:96:e5:1e:a0:7b, type:800
>>> ip_input: iphdr->dest 0xffff1fac netif->ip_addr 0x101a8c0 (0x0, 0x0, 0xffff1fac)
>>> ip4_input: UDP packet to DHCP client port 138
>>> ip4_input: packet not for us.
>>
>> That's weird. I think what could happen is the interface already has
>> a static IP assigned when it is registered to the lwIP stack and thus
>> DHCP never receives the packed because it is dropped before.
>> I have reworked the initialization code in v5, lwIP will get to know
>> only one interface at a time and I make sure that interface is registered
>> without an IP address when doing DHCP. Hopefully that should fix the issue.
>>
>> Thanks for testing!
>>
> 
> I tried your to-upstream/v5-wip branch
> (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
> you have something else to try/test?

Not at the moment, unfortunately. I'll take a closer look at what this
message "packet not for us" means on Monday.

Thanks,
Fabio Estevam June 21, 2024, 6:42 p.m. UTC | #14
Hi Tim and Jerome,

On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:

> I tried your to-upstream/v5-wip branch
> (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
> you have something else to try/test?

Yes, when I tested older versions from Maxim I could never get lwIP to
work with i.MX.

Jerome,

Please try to run the lwIP series on any i.MX board, if possible.

Thanks
Maxim Uvarov June 22, 2024, 8:08 a.m. UTC | #15
пт, 21 июн. 2024 г. в 21:42, Fabio Estevam <festevam@gmail.com>:
>
> Hi Tim and Jerome,
>
> On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:
>
> > I tried your to-upstream/v5-wip branch
> > (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
> > you have something else to try/test?
>
> Yes, when I tested older versions from Maxim I could never get lwIP to
> work with i.MX.
>
> Jerome,
>
> Please try to run the lwIP series on any i.MX board, if possible.
>
> Thanks

Packet not for us means that incoming packet DST MAC does not match to
the MAC address inside lwip. I.e. to MAC address set into lwip when
lwip_init was done. Original U-Boot network stack does not compare
MACs but lwip does. There is something specific on this board, in
general lwip with debug should print out
MAC used during initialization. This MAC should match to the MAC from
the incoming packet.
Tim Harvey June 24, 2024, 10:28 p.m. UTC | #16
On Sat, Jun 22, 2024 at 1:09 AM Maxim Uvarov <muvarov@gmail.com> wrote:
>
> пт, 21 июн. 2024 г. в 21:42, Fabio Estevam <festevam@gmail.com>:
> >
> > Hi Tim and Jerome,
> >
> > On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:
> >
> > > I tried your to-upstream/v5-wip branch
> > > (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
> > > you have something else to try/test?
> >
> > Yes, when I tested older versions from Maxim I could never get lwIP to
> > work with i.MX.
> >
> > Jerome,
> >
> > Please try to run the lwIP series on any i.MX board, if possible.
> >
> > Thanks
>
> Packet not for us means that incoming packet DST MAC does not match to
> the MAC address inside lwip. I.e. to MAC address set into lwip when
> lwip_init was done. Original U-Boot network stack does not compare
> MACs but lwip does. There is something specific on this board, in
> general lwip with debug should print out
> MAC used during initialization. This MAC should match to the MAC from
> the incoming packet.
>

It seems 'packet not for us' can mean a lot of things.

I added a bit of debugging around 'DHCP packet accepted' and found I'm
not receiving any packets from my DHCP server. So I connected directly
to another board (isolated network) where I ran my own server and
tcpdump and I don't see packets coming from lwip:

without lwip my server shows:
# tcpdump -i eth1
tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
length 262144 bytes
tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
length 262144 bytes
22:18:01.043992 IP (tos 0x0, ttl 255, id 23, offset 0, flags [DF],
proto UDP (17), length 328)
22:18:01.044391 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
proto UDP (17), length 328)
22:18:01.044454 IP (tos 0x0, ttl 255, id 24, offset 0, flags [DF],
proto UDP (17), length 328)
22:18:01.044752 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
proto UDP (17), length 328)

# tcpdump -i eth1
tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
length 262144 bytes
22:22:41.602178  [|llc]
22:22:41.709978  [|llc]
22:22:41.867947  [|llc]
22:22:42.105202  [|llc]
22:22:42.502091  [|llc]
22:22:43.219079  [|llc]
22:22:44.497979  [|llc]
22:22:45.776884  [|llc]
22:22:47.054773  [|llc]
22:22:48.333667  [|llc]
22:22:49.611559  [|llc]
22:22:50.890469  [|llc]

What actual hardware has this been tested with? I think Tom mentioned
he tested with an rpi of some sort.

I don't know what the meaning of the llc msg is above but you can see
there is no DHCP request.

Best Regards,

Tim
Jerome Forissier June 25, 2024, 8:02 a.m. UTC | #17
On 6/25/24 00:28, Tim Harvey wrote:
> On Sat, Jun 22, 2024 at 1:09 AM Maxim Uvarov <muvarov@gmail.com> wrote:
>>
>> пт, 21 июн. 2024 г. в 21:42, Fabio Estevam <festevam@gmail.com>:
>>>
>>> Hi Tim and Jerome,
>>>
>>> On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:
>>>
>>>> I tried your to-upstream/v5-wip branch
>>>> (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
>>>> you have something else to try/test?
>>>
>>> Yes, when I tested older versions from Maxim I could never get lwIP to
>>> work with i.MX.
>>>
>>> Jerome,
>>>
>>> Please try to run the lwIP series on any i.MX board, if possible.
>>>
>>> Thanks
>>
>> Packet not for us means that incoming packet DST MAC does not match to
>> the MAC address inside lwip. I.e. to MAC address set into lwip when
>> lwip_init was done. Original U-Boot network stack does not compare
>> MACs but lwip does. There is something specific on this board, in
>> general lwip with debug should print out
>> MAC used during initialization. This MAC should match to the MAC from
>> the incoming packet.
>>
> 
> It seems 'packet not for us' can mean a lot of things.

Yeah :-/ in this case I believe the traces are caused by unrelated traffic
(UDP port 138 is used by NetBIOS).


> I added a bit of debugging around 'DHCP packet accepted' and found I'm
> not receiving any packets from my DHCP server. So I connected directly
> to another board (isolated network) where I ran my own server and
> tcpdump and I don't see packets coming from lwip:

Ha! That's interesting.

> without lwip my server shows:
> # tcpdump -i eth1
> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> length 262144 bytes
> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> length 262144 bytes
> 22:18:01.043992 IP (tos 0x0, ttl 255, id 23, offset 0, flags [DF],
> proto UDP (17), length 328)
> 22:18:01.044391 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
> proto UDP (17), length 328)
> 22:18:01.044454 IP (tos 0x0, ttl 255, id 24, offset 0, flags [DF],
> proto UDP (17), length 328)
> 22:18:01.044752 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
> proto UDP (17), length 328)
> 
> # tcpdump -i eth1
> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> length 262144 bytes
> 22:22:41.602178  [|llc]
> 22:22:41.709978  [|llc]
> 22:22:41.867947  [|llc]
> 22:22:42.105202  [|llc]
> 22:22:42.502091  [|llc]
> 22:22:43.219079  [|llc]
> 22:22:44.497979  [|llc]
> 22:22:45.776884  [|llc]
> 22:22:47.054773  [|llc]
> 22:22:48.333667  [|llc]
> 22:22:49.611559  [|llc]
> 22:22:50.890469  [|llc]
> 
> What actual hardware has this been tested with? I think Tom mentioned
> he tested with an rpi of some sort.

Yes, I believe he tested on RPi 3B  and me too. Ilias has tested on NVIDIA
Jetson Nano.
 
> I don't know what the meaning of the llc msg is above but you can see
> there is no DHCP request.

Some more tracing is needed. Can you try enabling all traces in
lib/lwip/u-boot/lwipopts.h? i.e., replace all LWIP_DBG_OFF with
LWIP_DBG_ON in the #if defined(CONFIG_LWIP_DEBUG) section. And of course
enabled CONFIG_LWIP_DEBUG.
I do have a i.MX8M Plus EVK board but will not be able to use it before
tomorrow.

Thanks,
Tim Harvey June 26, 2024, 4 p.m. UTC | #18
On Tue, Jun 25, 2024 at 1:02 AM Jerome Forissier
<jerome.forissier@linaro.org> wrote:
>
> On 6/25/24 00:28, Tim Harvey wrote:
> > On Sat, Jun 22, 2024 at 1:09 AM Maxim Uvarov <muvarov@gmail.com> wrote:
> >>
> >> пт, 21 июн. 2024 г. в 21:42, Fabio Estevam <festevam@gmail.com>:
> >>>
> >>> Hi Tim and Jerome,
> >>>
> >>> On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:
> >>>
> >>>> I tried your to-upstream/v5-wip branch
> >>>> (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
> >>>> you have something else to try/test?
> >>>
> >>> Yes, when I tested older versions from Maxim I could never get lwIP to
> >>> work with i.MX.
> >>>
> >>> Jerome,
> >>>
> >>> Please try to run the lwIP series on any i.MX board, if possible.
> >>>
> >>> Thanks
> >>
> >> Packet not for us means that incoming packet DST MAC does not match to
> >> the MAC address inside lwip. I.e. to MAC address set into lwip when
> >> lwip_init was done. Original U-Boot network stack does not compare
> >> MACs but lwip does. There is something specific on this board, in
> >> general lwip with debug should print out
> >> MAC used during initialization. This MAC should match to the MAC from
> >> the incoming packet.
> >>
> >
> > It seems 'packet not for us' can mean a lot of things.
>
> Yeah :-/ in this case I believe the traces are caused by unrelated traffic
> (UDP port 138 is used by NetBIOS).
>
>
> > I added a bit of debugging around 'DHCP packet accepted' and found I'm
> > not receiving any packets from my DHCP server. So I connected directly
> > to another board (isolated network) where I ran my own server and
> > tcpdump and I don't see packets coming from lwip:
>
> Ha! That's interesting.
>
> > without lwip my server shows:
> > # tcpdump -i eth1
> > tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> > length 262144 bytes
> > tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> > length 262144 bytes
> > 22:18:01.043992 IP (tos 0x0, ttl 255, id 23, offset 0, flags [DF],
> > proto UDP (17), length 328)
> > 22:18:01.044391 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
> > proto UDP (17), length 328)
> > 22:18:01.044454 IP (tos 0x0, ttl 255, id 24, offset 0, flags [DF],
> > proto UDP (17), length 328)
> > 22:18:01.044752 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
> > proto UDP (17), length 328)
> >
> > # tcpdump -i eth1
> > tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> > length 262144 bytes
> > 22:22:41.602178  [|llc]
> > 22:22:41.709978  [|llc]
> > 22:22:41.867947  [|llc]
> > 22:22:42.105202  [|llc]
> > 22:22:42.502091  [|llc]
> > 22:22:43.219079  [|llc]
> > 22:22:44.497979  [|llc]
> > 22:22:45.776884  [|llc]
> > 22:22:47.054773  [|llc]
> > 22:22:48.333667  [|llc]
> > 22:22:49.611559  [|llc]
> > 22:22:50.890469  [|llc]
> >
> > What actual hardware has this been tested with? I think Tom mentioned
> > he tested with an rpi of some sort.
>
> Yes, I believe he tested on RPi 3B  and me too. Ilias has tested on NVIDIA
> Jetson Nano.
>
> > I don't know what the meaning of the llc msg is above but you can see
> > there is no DHCP request.
>
> Some more tracing is needed. Can you try enabling all traces in
> lib/lwip/u-boot/lwipopts.h? i.e., replace all LWIP_DBG_OFF with
> LWIP_DBG_ON in the #if defined(CONFIG_LWIP_DEBUG) section. And of course
> enabled CONFIG_LWIP_DEBUG.

Here's what I see with all those enabled on an imx8mm-venice-gw73xx-0x:
Net:   eth0: ethernet@30be0000 [PRIME]
GSC     : boot watchdog disabled
Thermal protection:enabled at 96C
Hit any key to stop autoboot:  0
u-boot=> dhcp || echo fail
netif: added interface et IP addr 0.0.0.0 netmask 0.0.0.0 gw 0.0.0.0
netif: setting default interface et
dhcp_start(netif=00000000fdf23bb0) et0
dhcp_start(): mallocing new DHCP client
dhcp_start(): allocated dhcp
dhcp_start(): starting DHCP configuration
udp_bind(ipaddr = 0.0.0.0, port = 68)
udp_bind: bound to 0.0.0.0, port 68)
udp_connect: connected to 0.0.0.0, port 67)
dhcp_discover()
pbuf_alloc(length=308)
pbuf_alloc(length=308) == 00000000fdf23d10
transaction id xid(42021)
dhcp_discover: making request
dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
pbuf_add_header: old 00000000fdf23d5e new 00000000fdf23d56 (8)
udp_send: added header in given pbuf 00000000fdf23d10
udp_send: sending datagram of length 316
udp_send: UDP packet length 316
inet_chksum_pseudo(): checksumming pbuf 00000000fdf23d10 (has next
0000000000000000)
inet_chksum_pseudo(): pbuf chain lwip_chksum()=807
udp_send: UDP checksum 0xf7f8
udp_send: ip_output_if (,,,,0x11,)
pbuf_add_header: old 00000000fdf23d56 new 00000000fdf23d42 (20)
ip4_output_if: et0
IP header:
+-------------------------------+
| 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
+-------------------------------+
|        0      |000|       0   | (id, flags, offset)
+-------------------------------+
|  255  |   17  |    0xba9d     | (ttl, proto, chksum)
+-------------------------------+
|    0  |    0  |    0  |    0  | (src)
+-------------------------------+
|  255  |  255  |  255  |  255  | (dest)
+-------------------------------+
ip4_output_if: call netif->output()
pbuf_add_header: old 00000000fdf23d42 new 00000000fdf23d34 (14)
ethernet_output: sending packet 00000000fdf23d10
^^^ at this point a 350 byte packet gets sent to fecmxc_send for the
NIC which looks like this via print_hex_dump_bytes():
00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
00000010: 01 50 00 00 00 00 ff 11 ba 9d 00 00 00 00 ff ff  .P..............
00000020: ff ff 00 44 00 43 01 3c f8 f7 01 01 06 00 00 04  ...D.C.<........
00000030: 20 21 00 00 00 00 00 00 00 00 00 00 00 00 00 00   !..............
00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 05  ......c.Sc5..9..
00000120: dc 37 03 01 03 1c ff 00 00 00 00 00 00 00 00 00  .7..............
00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00        ..............

on my dhcp server 'tcpdump -i eth1 -X -vvv' shows:
tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
length 262144 bytes
15:34:25.341167  [|llc]
0x0000:  0000 0000 0000 0000 0000 343d f2fd 0000  ..........4=....
0x0010:  0000 5e01 5e01 8000 0100 0000 0000 0000  ..^.^...........
0x0020:  0000 0000 0000 ffff ffff ffff 00d0 12ba  ................
0x0030:  f8cc 0800 4500 0150 001e 0000 ff11 ba7f  ....E..P........
0x0040:  0000 0000 ffff ffff 0044 0043 013c 6d78  .........D.C.<mx
0x0050:  0101 0600 1255 994f 0000 0000 0000 0000  .....U.O........
0x0060:  0000 0000 0000 0000 0000 0000 00d0 12ba  ................
0x0070:  f8cc 0000 0000 0000 0000 0000 0000 0000  ................
0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0100:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0110:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0120:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0130:  0000 0000 0000 0000 0000 0000 6382 5363  ............c.Sc
0x0140:  3501 0139 0205 dc37 0301 031c ff00 0000  5..9...7........

Comparing to w/o LWIP:
u-boot=> dhcp || echo fail
BOOTP broadcast 1
fecmxc_send ethernet@30be0000 342
00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
00000010: 01 48 00 00 40 00 ff 11 7a a5 00 00 00 00 ff ff  .H..@...z.......
00000020: ff ff 00 44 00 43 01 34 00 00 01 01 06 00 12 bb  ...D.C.4........
00000030: 82 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .d..............
00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 02  ......c.Sc5..9..
00000120: 40 5d 02 00 16 5e 03 01 00 00 3c 0c 55 2d 42 6f  @]...^....<.U-Bo
00000130: 6f 74 2e 61 72 6d 76 38 37 05 01 03 06 0c 11 ff  ot.armv87.......
00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000150: 00 00 00 00 00 00                                ......
^^^ Note the packet is 8 bytes shorter

and on the dhcp server:
tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
length 262144 bytes
15:45:12.548077 IP (tos 0x0, ttl 255, id 0, offset 0, flags [DF],
proto UDP (17), length 328)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [no cksum] BOOTP/DHCP,
Request from 00:d0:12:ba:f8:cc (oui Unknown), length 300, xid
0x12bb8264, Flags [none] (0x0000)
  Client-Ethernet-Address 00:d0:12:ba:f8:cc (oui Unknown)
  Vendor-rfc1048 Extensions
    Magic Cookie 0x63825363
    DHCP-Message (53), length 1: Discover
    MSZ (57), length 2: 576
    ARCH (93), length 2: 22
    NDI (94), length 3: 1.0.0
    Vendor-Class (60), length 12: "U-Boot.armv8"
    Parameter-Request (55), length 5:
      Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server (6),
Hostname (12)
      RP (17)
    END (255), length 0
    PAD (0), length 0, occurs 22
0x0000:  4500 0148 0000 4000 ff11 7aa5 0000 0000  E..H..@...z.....
0x0010:  ffff ffff 0044 0043 0134 0000 0101 0600  .....D.C.4......
0x0020:  12bb 8264 0000 0000 0000 0000 0000 0000  ...d............
0x0030:  0000 0000 0000 0000 00d0 12ba f8cc 0000  ................
0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0050:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0060:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0070:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
0x0100:  0000 0000 0000 0000 6382 5363 3501 0139  ........c.Sc5..9
0x0110:  0202 405d 0200 165e 0301 0000 3c0c 552d  ..@]...^....<.U-
0x0120:  426f 6f74 2e61 726d 7638 3705 0103 060c  Boot.armv87.....
0x0130:  11ff 0000 0000 0000 0000 0000 0000 0000  ................
0x0140:  0000 0000 0000 0000                      ........

> I do have a i.MX8M Plus EVK board but will not be able to use it before
> tomorrow.

Please let me know your findings. I can test imx8mp as well but would
have to reconfigure everything.

Best Regards,

Tim
Jerome Forissier June 28, 2024, 9:50 a.m. UTC | #19
Hi Tim,

On 6/26/24 18:00, Tim Harvey wrote:
> On Tue, Jun 25, 2024 at 1:02 AM Jerome Forissier
> <jerome.forissier@linaro.org> wrote:
>>
>> On 6/25/24 00:28, Tim Harvey wrote:
>>> On Sat, Jun 22, 2024 at 1:09 AM Maxim Uvarov <muvarov@gmail.com> wrote:
>>>>
>>>> пт, 21 июн. 2024 г. в 21:42, Fabio Estevam <festevam@gmail.com>:
>>>>>
>>>>> Hi Tim and Jerome,
>>>>>
>>>>> On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:
>>>>>
>>>>>> I tried your to-upstream/v5-wip branch
>>>>>> (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
>>>>>> you have something else to try/test?
>>>>>
>>>>> Yes, when I tested older versions from Maxim I could never get lwIP to
>>>>> work with i.MX.
>>>>>
>>>>> Jerome,
>>>>>
>>>>> Please try to run the lwIP series on any i.MX board, if possible.
>>>>>
>>>>> Thanks
>>>>
>>>> Packet not for us means that incoming packet DST MAC does not match to
>>>> the MAC address inside lwip. I.e. to MAC address set into lwip when
>>>> lwip_init was done. Original U-Boot network stack does not compare
>>>> MACs but lwip does. There is something specific on this board, in
>>>> general lwip with debug should print out
>>>> MAC used during initialization. This MAC should match to the MAC from
>>>> the incoming packet.
>>>>
>>>
>>> It seems 'packet not for us' can mean a lot of things.
>>
>> Yeah :-/ in this case I believe the traces are caused by unrelated traffic
>> (UDP port 138 is used by NetBIOS).
>>
>>
>>> I added a bit of debugging around 'DHCP packet accepted' and found I'm
>>> not receiving any packets from my DHCP server. So I connected directly
>>> to another board (isolated network) where I ran my own server and
>>> tcpdump and I don't see packets coming from lwip:
>>
>> Ha! That's interesting.
>>
>>> without lwip my server shows:
>>> # tcpdump -i eth1
>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
>>> length 262144 bytes
>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
>>> length 262144 bytes
>>> 22:18:01.043992 IP (tos 0x0, ttl 255, id 23, offset 0, flags [DF],
>>> proto UDP (17), length 328)
>>> 22:18:01.044391 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
>>> proto UDP (17), length 328)
>>> 22:18:01.044454 IP (tos 0x0, ttl 255, id 24, offset 0, flags [DF],
>>> proto UDP (17), length 328)
>>> 22:18:01.044752 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
>>> proto UDP (17), length 328)
>>>
>>> # tcpdump -i eth1
>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
>>> length 262144 bytes
>>> 22:22:41.602178  [|llc]
>>> 22:22:41.709978  [|llc]
>>> 22:22:41.867947  [|llc]
>>> 22:22:42.105202  [|llc]
>>> 22:22:42.502091  [|llc]
>>> 22:22:43.219079  [|llc]
>>> 22:22:44.497979  [|llc]
>>> 22:22:45.776884  [|llc]
>>> 22:22:47.054773  [|llc]
>>> 22:22:48.333667  [|llc]
>>> 22:22:49.611559  [|llc]
>>> 22:22:50.890469  [|llc]
>>>
>>> What actual hardware has this been tested with? I think Tom mentioned
>>> he tested with an rpi of some sort.
>>
>> Yes, I believe he tested on RPi 3B  and me too. Ilias has tested on NVIDIA
>> Jetson Nano.
>>
>>> I don't know what the meaning of the llc msg is above but you can see
>>> there is no DHCP request.
>>
>> Some more tracing is needed. Can you try enabling all traces in
>> lib/lwip/u-boot/lwipopts.h? i.e., replace all LWIP_DBG_OFF with
>> LWIP_DBG_ON in the #if defined(CONFIG_LWIP_DEBUG) section. And of course
>> enabled CONFIG_LWIP_DEBUG.
> 
> Here's what I see with all those enabled on an imx8mm-venice-gw73xx-0x:
> Net:   eth0: ethernet@30be0000 [PRIME]
> GSC     : boot watchdog disabled
> Thermal protection:enabled at 96C
> Hit any key to stop autoboot:  0
> u-boot=> dhcp || echo fail
> netif: added interface et IP addr 0.0.0.0 netmask 0.0.0.0 gw 0.0.0.0
> netif: setting default interface et
> dhcp_start(netif=00000000fdf23bb0) et0
> dhcp_start(): mallocing new DHCP client
> dhcp_start(): allocated dhcp
> dhcp_start(): starting DHCP configuration
> udp_bind(ipaddr = 0.0.0.0, port = 68)
> udp_bind: bound to 0.0.0.0, port 68)
> udp_connect: connected to 0.0.0.0, port 67)
> dhcp_discover()
> pbuf_alloc(length=308)
> pbuf_alloc(length=308) == 00000000fdf23d10
> transaction id xid(42021)
> dhcp_discover: making request
> dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
> pbuf_add_header: old 00000000fdf23d5e new 00000000fdf23d56 (8)
> udp_send: added header in given pbuf 00000000fdf23d10
> udp_send: sending datagram of length 316
> udp_send: UDP packet length 316
> inet_chksum_pseudo(): checksumming pbuf 00000000fdf23d10 (has next
> 0000000000000000)
> inet_chksum_pseudo(): pbuf chain lwip_chksum()=807
> udp_send: UDP checksum 0xf7f8
> udp_send: ip_output_if (,,,,0x11,)
> pbuf_add_header: old 00000000fdf23d56 new 00000000fdf23d42 (20)
> ip4_output_if: et0
> IP header:
> +-------------------------------+
> | 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
> +-------------------------------+
> |        0      |000|       0   | (id, flags, offset)
> +-------------------------------+
> |  255  |   17  |    0xba9d     | (ttl, proto, chksum)
> +-------------------------------+
> |    0  |    0  |    0  |    0  | (src)
> +-------------------------------+
> |  255  |  255  |  255  |  255  | (dest)
> +-------------------------------+
> ip4_output_if: call netif->output()
> pbuf_add_header: old 00000000fdf23d42 new 00000000fdf23d34 (14)
> ethernet_output: sending packet 00000000fdf23d10
> ^^^ at this point a 350 byte packet gets sent to fecmxc_send for the
> NIC which looks like this via print_hex_dump_bytes():
> 00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
> 00000010: 01 50 00 00 00 00 ff 11 ba 9d 00 00 00 00 ff ff  .P..............
> 00000020: ff ff 00 44 00 43 01 3c f8 f7 01 01 06 00 00 04  ...D.C.<........
> 00000030: 20 21 00 00 00 00 00 00 00 00 00 00 00 00 00 00   !..............
> 00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
> 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 05  ......c.Sc5..9..
> 00000120: dc 37 03 01 03 1c ff 00 00 00 00 00 00 00 00 00  .7..............
> 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00        ..............

Up to this point everything *seems* OK

> on my dhcp server 'tcpdump -i eth1 -X -vvv' shows:
> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> length 262144 bytes
> 15:34:25.341167  [|llc]
> 0x0000:  0000 0000 0000 0000 0000 343d f2fd 0000  ..........4=....
> 0x0010:  0000 5e01 5e01 8000 0100 0000 0000 0000  ..^.^...........
> 0x0020:  0000 0000 0000 ffff ffff ffff 00d0 12ba  ................
> 0x0030:  f8cc 0800 4500 0150 001e 0000 ff11 ba7f  ....E..P........
> 0x0040:  0000 0000 ffff ffff 0044 0043 013c 6d78  .........D.C.<mx
> 0x0050:  0101 0600 1255 994f 0000 0000 0000 0000  .....U.O........
> 0x0060:  0000 0000 0000 0000 0000 0000 00d0 12ba  ................
> 0x0070:  f8cc 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0100:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0110:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0120:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0130:  0000 0000 0000 0000 0000 0000 6382 5363  ............c.Sc
> 0x0140:  3501 0139 0205 dc37 0301 031c ff00 0000  5..9...7........

But here it shows the DHCP packet is not received.

> Comparing to w/o LWIP:
> u-boot=> dhcp || echo fail
> BOOTP broadcast 1
> fecmxc_send ethernet@30be0000 342
> 00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
> 00000010: 01 48 00 00 40 00 ff 11 7a a5 00 00 00 00 ff ff  .H..@...z.......
> 00000020: ff ff 00 44 00 43 01 34 00 00 01 01 06 00 12 bb  ...D.C.4........
> 00000030: 82 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .d..............
> 00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
> 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 02  ......c.Sc5..9..
> 00000120: 40 5d 02 00 16 5e 03 01 00 00 3c 0c 55 2d 42 6f  @]...^....<.U-Bo
> 00000130: 6f 74 2e 61 72 6d 76 38 37 05 01 03 06 0c 11 ff  ot.armv87.......
> 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> 00000150: 00 00 00 00 00 00                                ......
> ^^^ Note the packet is 8 bytes shorter

Yes but that's just padding IIUC.

> and on the dhcp server:
> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> length 262144 bytes
> 15:45:12.548077 IP (tos 0x0, ttl 255, id 0, offset 0, flags [DF],
> proto UDP (17), length 328)
>     0.0.0.0.bootpc > 255.255.255.255.bootps: [no cksum] BOOTP/DHCP,
> Request from 00:d0:12:ba:f8:cc (oui Unknown), length 300, xid
> 0x12bb8264, Flags [none] (0x0000)
>   Client-Ethernet-Address 00:d0:12:ba:f8:cc (oui Unknown)
>   Vendor-rfc1048 Extensions
>     Magic Cookie 0x63825363
>     DHCP-Message (53), length 1: Discover
>     MSZ (57), length 2: 576
>     ARCH (93), length 2: 22
>     NDI (94), length 3: 1.0.0
>     Vendor-Class (60), length 12: "U-Boot.armv8"
>     Parameter-Request (55), length 5:
>       Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server (6),
> Hostname (12)
>       RP (17)
>     END (255), length 0
>     PAD (0), length 0, occurs 22
> 0x0000:  4500 0148 0000 4000 ff11 7aa5 0000 0000  E..H..@...z.....
> 0x0010:  ffff ffff 0044 0043 0134 0000 0101 0600  .....D.C.4......
> 0x0020:  12bb 8264 0000 0000 0000 0000 0000 0000  ...d............
> 0x0030:  0000 0000 0000 0000 00d0 12ba f8cc 0000  ................
> 0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0050:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0060:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0070:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0100:  0000 0000 0000 0000 6382 5363 3501 0139  ........c.Sc5..9
> 0x0110:  0202 405d 0200 165e 0301 0000 3c0c 552d  ..@]...^....<.U-
> 0x0120:  426f 6f74 2e61 726d 7638 3705 0103 060c  Boot.armv87.....
> 0x0130:  11ff 0000 0000 0000 0000 0000 0000 0000  ................
> 0x0140:  0000 0000 0000 0000                      ........

All good there, as expected.

>> I do have a i.MX8M Plus EVK board but will not be able to use it before
>> tomorrow.
> 
> Please let me know your findings. I can test imx8mp as well but would
> have to reconfigure everything.

So I did some tests on my imx8mp and I got DHCP to work on the second
ethernet port (ENET2) but *not* on the first one (ENET1). They appear to
be using different drivers (drivers/net/fec_mxc.c for ENET1 and
drivers/net/dwc_eth_qos.c for ENET2). No matter what I do, there is no
packet going out of ENET1.

With a network cable plugged into ENET2 (ethernet@30bf0000):

...
Net:   eth0: ethernet@30be0000, eth1: ethernet@30bf0000 [PRIME]
Hit any key to stop autoboot:  0
u-boot=> 
u-boot=> dhcp
ethernet@30bf0000 Waiting for PHY auto negotiation to complete....... done
DHCP client bound to address 192.168.0.39 (2066 ms)
u-boot=> 

Now moving the cable to ENET1 (ethernet@30be0000) and resetting the board:

...
Net:   eth0: ethernet@30be0000, eth1: ethernet@30bf0000 [PRIME]
Hit any key to stop autoboot:  0
u-boot=> 
u-boot=> setenv ethact ethernet@30be0000
u-boot=> dhcp
u-boot=> echo $?
1

I enabled the debug traces in fecmxc_send() but I see nothing wrong,
the function is called with a packet and it returns success yet nothing
is sent out on the wire. Any idea what could be wrong?
My plan is to add more traces to the driver and compare NET vs. NET_LWIP.

Regards,
Tim Harvey June 28, 2024, 3:48 p.m. UTC | #20
On Fri, Jun 28, 2024 at 2:50 AM Jerome Forissier
<jerome.forissier@linaro.org> wrote:
>
> Hi Tim,
>
> On 6/26/24 18:00, Tim Harvey wrote:
> > On Tue, Jun 25, 2024 at 1:02 AM Jerome Forissier
> > <jerome.forissier@linaro.org> wrote:
> >>
> >> On 6/25/24 00:28, Tim Harvey wrote:
> >>> On Sat, Jun 22, 2024 at 1:09 AM Maxim Uvarov <muvarov@gmail.com> wrote:
> >>>>
> >>>> пт, 21 июн. 2024 г. в 21:42, Fabio Estevam <festevam@gmail.com>:
> >>>>>
> >>>>> Hi Tim and Jerome,
> >>>>>
> >>>>> On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:
> >>>>>
> >>>>>> I tried your to-upstream/v5-wip branch
> >>>>>> (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
> >>>>>> you have something else to try/test?
> >>>>>
> >>>>> Yes, when I tested older versions from Maxim I could never get lwIP to
> >>>>> work with i.MX.
> >>>>>
> >>>>> Jerome,
> >>>>>
> >>>>> Please try to run the lwIP series on any i.MX board, if possible.
> >>>>>
> >>>>> Thanks
> >>>>
> >>>> Packet not for us means that incoming packet DST MAC does not match to
> >>>> the MAC address inside lwip. I.e. to MAC address set into lwip when
> >>>> lwip_init was done. Original U-Boot network stack does not compare
> >>>> MACs but lwip does. There is something specific on this board, in
> >>>> general lwip with debug should print out
> >>>> MAC used during initialization. This MAC should match to the MAC from
> >>>> the incoming packet.
> >>>>
> >>>
> >>> It seems 'packet not for us' can mean a lot of things.
> >>
> >> Yeah :-/ in this case I believe the traces are caused by unrelated traffic
> >> (UDP port 138 is used by NetBIOS).
> >>
> >>
> >>> I added a bit of debugging around 'DHCP packet accepted' and found I'm
> >>> not receiving any packets from my DHCP server. So I connected directly
> >>> to another board (isolated network) where I ran my own server and
> >>> tcpdump and I don't see packets coming from lwip:
> >>
> >> Ha! That's interesting.
> >>
> >>> without lwip my server shows:
> >>> # tcpdump -i eth1
> >>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> >>> length 262144 bytes
> >>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> >>> length 262144 bytes
> >>> 22:18:01.043992 IP (tos 0x0, ttl 255, id 23, offset 0, flags [DF],
> >>> proto UDP (17), length 328)
> >>> 22:18:01.044391 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
> >>> proto UDP (17), length 328)
> >>> 22:18:01.044454 IP (tos 0x0, ttl 255, id 24, offset 0, flags [DF],
> >>> proto UDP (17), length 328)
> >>> 22:18:01.044752 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
> >>> proto UDP (17), length 328)
> >>>
> >>> # tcpdump -i eth1
> >>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> >>> length 262144 bytes
> >>> 22:22:41.602178  [|llc]
> >>> 22:22:41.709978  [|llc]
> >>> 22:22:41.867947  [|llc]
> >>> 22:22:42.105202  [|llc]
> >>> 22:22:42.502091  [|llc]
> >>> 22:22:43.219079  [|llc]
> >>> 22:22:44.497979  [|llc]
> >>> 22:22:45.776884  [|llc]
> >>> 22:22:47.054773  [|llc]
> >>> 22:22:48.333667  [|llc]
> >>> 22:22:49.611559  [|llc]
> >>> 22:22:50.890469  [|llc]
> >>>
> >>> What actual hardware has this been tested with? I think Tom mentioned
> >>> he tested with an rpi of some sort.
> >>
> >> Yes, I believe he tested on RPi 3B  and me too. Ilias has tested on NVIDIA
> >> Jetson Nano.
> >>
> >>> I don't know what the meaning of the llc msg is above but you can see
> >>> there is no DHCP request.
> >>
> >> Some more tracing is needed. Can you try enabling all traces in
> >> lib/lwip/u-boot/lwipopts.h? i.e., replace all LWIP_DBG_OFF with
> >> LWIP_DBG_ON in the #if defined(CONFIG_LWIP_DEBUG) section. And of course
> >> enabled CONFIG_LWIP_DEBUG.
> >
> > Here's what I see with all those enabled on an imx8mm-venice-gw73xx-0x:
> > Net:   eth0: ethernet@30be0000 [PRIME]
> > GSC     : boot watchdog disabled
> > Thermal protection:enabled at 96C
> > Hit any key to stop autoboot:  0
> > u-boot=> dhcp || echo fail
> > netif: added interface et IP addr 0.0.0.0 netmask 0.0.0.0 gw 0.0.0.0
> > netif: setting default interface et
> > dhcp_start(netif=00000000fdf23bb0) et0
> > dhcp_start(): mallocing new DHCP client
> > dhcp_start(): allocated dhcp
> > dhcp_start(): starting DHCP configuration
> > udp_bind(ipaddr = 0.0.0.0, port = 68)
> > udp_bind: bound to 0.0.0.0, port 68)
> > udp_connect: connected to 0.0.0.0, port 67)
> > dhcp_discover()
> > pbuf_alloc(length=308)
> > pbuf_alloc(length=308) == 00000000fdf23d10
> > transaction id xid(42021)
> > dhcp_discover: making request
> > dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
> > pbuf_add_header: old 00000000fdf23d5e new 00000000fdf23d56 (8)
> > udp_send: added header in given pbuf 00000000fdf23d10
> > udp_send: sending datagram of length 316
> > udp_send: UDP packet length 316
> > inet_chksum_pseudo(): checksumming pbuf 00000000fdf23d10 (has next
> > 0000000000000000)
> > inet_chksum_pseudo(): pbuf chain lwip_chksum()=807
> > udp_send: UDP checksum 0xf7f8
> > udp_send: ip_output_if (,,,,0x11,)
> > pbuf_add_header: old 00000000fdf23d56 new 00000000fdf23d42 (20)
> > ip4_output_if: et0
> > IP header:
> > +-------------------------------+
> > | 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
> > +-------------------------------+
> > |        0      |000|       0   | (id, flags, offset)
> > +-------------------------------+
> > |  255  |   17  |    0xba9d     | (ttl, proto, chksum)
> > +-------------------------------+
> > |    0  |    0  |    0  |    0  | (src)
> > +-------------------------------+
> > |  255  |  255  |  255  |  255  | (dest)
> > +-------------------------------+
> > ip4_output_if: call netif->output()
> > pbuf_add_header: old 00000000fdf23d42 new 00000000fdf23d34 (14)
> > ethernet_output: sending packet 00000000fdf23d10
> > ^^^ at this point a 350 byte packet gets sent to fecmxc_send for the
> > NIC which looks like this via print_hex_dump_bytes():
> > 00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
> > 00000010: 01 50 00 00 00 00 ff 11 ba 9d 00 00 00 00 ff ff  .P..............
> > 00000020: ff ff 00 44 00 43 01 3c f8 f7 01 01 06 00 00 04  ...D.C.<........
> > 00000030: 20 21 00 00 00 00 00 00 00 00 00 00 00 00 00 00   !..............
> > 00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
> > 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 05  ......c.Sc5..9..
> > 00000120: dc 37 03 01 03 1c ff 00 00 00 00 00 00 00 00 00  .7..............
> > 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00        ..............
>
> Up to this point everything *seems* OK
>
> > on my dhcp server 'tcpdump -i eth1 -X -vvv' shows:
> > tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> > length 262144 bytes
> > 15:34:25.341167  [|llc]
> > 0x0000:  0000 0000 0000 0000 0000 343d f2fd 0000  ..........4=....
> > 0x0010:  0000 5e01 5e01 8000 0100 0000 0000 0000  ..^.^...........
> > 0x0020:  0000 0000 0000 ffff ffff ffff 00d0 12ba  ................
> > 0x0030:  f8cc 0800 4500 0150 001e 0000 ff11 ba7f  ....E..P........
> > 0x0040:  0000 0000 ffff ffff 0044 0043 013c 6d78  .........D.C.<mx
> > 0x0050:  0101 0600 1255 994f 0000 0000 0000 0000  .....U.O........
> > 0x0060:  0000 0000 0000 0000 0000 0000 00d0 12ba  ................
> > 0x0070:  f8cc 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0100:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0110:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0120:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0130:  0000 0000 0000 0000 0000 0000 6382 5363  ............c.Sc
> > 0x0140:  3501 0139 0205 dc37 0301 031c ff00 0000  5..9...7........
>
> But here it shows the DHCP packet is not received.
>
> > Comparing to w/o LWIP:
> > u-boot=> dhcp || echo fail
> > BOOTP broadcast 1
> > fecmxc_send ethernet@30be0000 342
> > 00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
> > 00000010: 01 48 00 00 40 00 ff 11 7a a5 00 00 00 00 ff ff  .H..@...z.......
> > 00000020: ff ff 00 44 00 43 01 34 00 00 01 01 06 00 12 bb  ...D.C.4........
> > 00000030: 82 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .d..............
> > 00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
> > 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 02  ......c.Sc5..9..
> > 00000120: 40 5d 02 00 16 5e 03 01 00 00 3c 0c 55 2d 42 6f  @]...^....<.U-Bo
> > 00000130: 6f 74 2e 61 72 6d 76 38 37 05 01 03 06 0c 11 ff  ot.armv87.......
> > 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> > 00000150: 00 00 00 00 00 00                                ......
> > ^^^ Note the packet is 8 bytes shorter
>
> Yes but that's just padding IIUC.
>
> > and on the dhcp server:
> > tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> > length 262144 bytes
> > 15:45:12.548077 IP (tos 0x0, ttl 255, id 0, offset 0, flags [DF],
> > proto UDP (17), length 328)
> >     0.0.0.0.bootpc > 255.255.255.255.bootps: [no cksum] BOOTP/DHCP,
> > Request from 00:d0:12:ba:f8:cc (oui Unknown), length 300, xid
> > 0x12bb8264, Flags [none] (0x0000)
> >   Client-Ethernet-Address 00:d0:12:ba:f8:cc (oui Unknown)
> >   Vendor-rfc1048 Extensions
> >     Magic Cookie 0x63825363
> >     DHCP-Message (53), length 1: Discover
> >     MSZ (57), length 2: 576
> >     ARCH (93), length 2: 22
> >     NDI (94), length 3: 1.0.0
> >     Vendor-Class (60), length 12: "U-Boot.armv8"
> >     Parameter-Request (55), length 5:
> >       Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server (6),
> > Hostname (12)
> >       RP (17)
> >     END (255), length 0
> >     PAD (0), length 0, occurs 22
> > 0x0000:  4500 0148 0000 4000 ff11 7aa5 0000 0000  E..H..@...z.....
> > 0x0010:  ffff ffff 0044 0043 0134 0000 0101 0600  .....D.C.4......
> > 0x0020:  12bb 8264 0000 0000 0000 0000 0000 0000  ...d............
> > 0x0030:  0000 0000 0000 0000 00d0 12ba f8cc 0000  ................
> > 0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0050:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0060:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0070:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0100:  0000 0000 0000 0000 6382 5363 3501 0139  ........c.Sc5..9
> > 0x0110:  0202 405d 0200 165e 0301 0000 3c0c 552d  ..@]...^....<.U-
> > 0x0120:  426f 6f74 2e61 726d 7638 3705 0103 060c  Boot.armv87.....
> > 0x0130:  11ff 0000 0000 0000 0000 0000 0000 0000  ................
> > 0x0140:  0000 0000 0000 0000                      ........
>
> All good there, as expected.
>
> >> I do have a i.MX8M Plus EVK board but will not be able to use it before
> >> tomorrow.
> >
> > Please let me know your findings. I can test imx8mp as well but would
> > have to reconfigure everything.
>
> So I did some tests on my imx8mp and I got DHCP to work on the second
> ethernet port (ENET2) but *not* on the first one (ENET1). They appear to
> be using different drivers (drivers/net/fec_mxc.c for ENET1 and
> drivers/net/dwc_eth_qos.c for ENET2). No matter what I do, there is no
> packet going out of ENET1.
>
> With a network cable plugged into ENET2 (ethernet@30bf0000):
>
> ...
> Net:   eth0: ethernet@30be0000, eth1: ethernet@30bf0000 [PRIME]
> Hit any key to stop autoboot:  0
> u-boot=>
> u-boot=> dhcp
> ethernet@30bf0000 Waiting for PHY auto negotiation to complete....... done
> DHCP client bound to address 192.168.0.39 (2066 ms)
> u-boot=>
>
> Now moving the cable to ENET1 (ethernet@30be0000) and resetting the board:
>
> ...
> Net:   eth0: ethernet@30be0000, eth1: ethernet@30bf0000 [PRIME]
> Hit any key to stop autoboot:  0
> u-boot=>
> u-boot=> setenv ethact ethernet@30be0000
> u-boot=> dhcp
> u-boot=> echo $?
> 1
>
> I enabled the debug traces in fecmxc_send() but I see nothing wrong,
> the function is called with a packet and it returns success yet nothing
> is sent out on the wire. Any idea what could be wrong?
> My plan is to add more traces to the driver and compare NET vs. NET_LWIP.
>

Jerome,

That is consistent with what I see then. The imx8mm (and all imx all
the way back to imx28) use the FEC ethernet MAC so I would assume you
see the issue on all of those SOC's. The imx8mp introduced a 2nd
different MAC, the eqos which you are showing working.

If you look at my results above a packet 'is' received by the DHCP
server but its missing something where tcpdump shows an '[|llc]' and I
don't know what that means - something wrong with the link layer?

I feel like something is wrong in the header or packet alignment.
Maybe the key is to understand what the tcpdump output of '[llc]'
means:
- lwip fec mac destination tcpdump shows '15:34:25.341167  [|llc]'
- no-lwip fec mac destination tcptdump shows '15:45:12.548077 IP (tos
0x0, ttl 255, id 0, offset 0, flags [DF], proto UDP (17), length 328)

(Adding Joe and Ramon, the network maintainers to cc)

Regards,

Tim
Jerome Forissier July 1, 2024, 9:58 a.m. UTC | #21
On 6/28/24 17:48, Tim Harvey wrote:
> On Fri, Jun 28, 2024 at 2:50 AM Jerome Forissier
> <jerome.forissier@linaro.org> wrote:
>>
>> Hi Tim,
>>
>> On 6/26/24 18:00, Tim Harvey wrote:
>>> On Tue, Jun 25, 2024 at 1:02 AM Jerome Forissier
>>> <jerome.forissier@linaro.org> wrote:
>>>>
>>>> On 6/25/24 00:28, Tim Harvey wrote:
>>>>> On Sat, Jun 22, 2024 at 1:09 AM Maxim Uvarov <muvarov@gmail.com> wrote:
>>>>>>
>>>>>> пт, 21 июн. 2024 г. в 21:42, Fabio Estevam <festevam@gmail.com>:
>>>>>>>
>>>>>>> Hi Tim and Jerome,
>>>>>>>
>>>>>>> On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:
>>>>>>>
>>>>>>>> I tried your to-upstream/v5-wip branch
>>>>>>>> (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
>>>>>>>> you have something else to try/test?
>>>>>>>
>>>>>>> Yes, when I tested older versions from Maxim I could never get lwIP to
>>>>>>> work with i.MX.
>>>>>>>
>>>>>>> Jerome,
>>>>>>>
>>>>>>> Please try to run the lwIP series on any i.MX board, if possible.
>>>>>>>
>>>>>>> Thanks
>>>>>>
>>>>>> Packet not for us means that incoming packet DST MAC does not match to
>>>>>> the MAC address inside lwip. I.e. to MAC address set into lwip when
>>>>>> lwip_init was done. Original U-Boot network stack does not compare
>>>>>> MACs but lwip does. There is something specific on this board, in
>>>>>> general lwip with debug should print out
>>>>>> MAC used during initialization. This MAC should match to the MAC from
>>>>>> the incoming packet.
>>>>>>
>>>>>
>>>>> It seems 'packet not for us' can mean a lot of things.
>>>>
>>>> Yeah :-/ in this case I believe the traces are caused by unrelated traffic
>>>> (UDP port 138 is used by NetBIOS).
>>>>
>>>>
>>>>> I added a bit of debugging around 'DHCP packet accepted' and found I'm
>>>>> not receiving any packets from my DHCP server. So I connected directly
>>>>> to another board (isolated network) where I ran my own server and
>>>>> tcpdump and I don't see packets coming from lwip:
>>>>
>>>> Ha! That's interesting.
>>>>
>>>>> without lwip my server shows:
>>>>> # tcpdump -i eth1
>>>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
>>>>> length 262144 bytes
>>>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
>>>>> length 262144 bytes
>>>>> 22:18:01.043992 IP (tos 0x0, ttl 255, id 23, offset 0, flags [DF],
>>>>> proto UDP (17), length 328)
>>>>> 22:18:01.044391 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
>>>>> proto UDP (17), length 328)
>>>>> 22:18:01.044454 IP (tos 0x0, ttl 255, id 24, offset 0, flags [DF],
>>>>> proto UDP (17), length 328)
>>>>> 22:18:01.044752 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
>>>>> proto UDP (17), length 328)
>>>>>
>>>>> # tcpdump -i eth1
>>>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
>>>>> length 262144 bytes
>>>>> 22:22:41.602178  [|llc]
>>>>> 22:22:41.709978  [|llc]
>>>>> 22:22:41.867947  [|llc]
>>>>> 22:22:42.105202  [|llc]
>>>>> 22:22:42.502091  [|llc]
>>>>> 22:22:43.219079  [|llc]
>>>>> 22:22:44.497979  [|llc]
>>>>> 22:22:45.776884  [|llc]
>>>>> 22:22:47.054773  [|llc]
>>>>> 22:22:48.333667  [|llc]
>>>>> 22:22:49.611559  [|llc]
>>>>> 22:22:50.890469  [|llc]
>>>>>
>>>>> What actual hardware has this been tested with? I think Tom mentioned
>>>>> he tested with an rpi of some sort.
>>>>
>>>> Yes, I believe he tested on RPi 3B  and me too. Ilias has tested on NVIDIA
>>>> Jetson Nano.
>>>>
>>>>> I don't know what the meaning of the llc msg is above but you can see
>>>>> there is no DHCP request.
>>>>
>>>> Some more tracing is needed. Can you try enabling all traces in
>>>> lib/lwip/u-boot/lwipopts.h? i.e., replace all LWIP_DBG_OFF with
>>>> LWIP_DBG_ON in the #if defined(CONFIG_LWIP_DEBUG) section. And of course
>>>> enabled CONFIG_LWIP_DEBUG.
>>>
>>> Here's what I see with all those enabled on an imx8mm-venice-gw73xx-0x:
>>> Net:   eth0: ethernet@30be0000 [PRIME]
>>> GSC     : boot watchdog disabled
>>> Thermal protection:enabled at 96C
>>> Hit any key to stop autoboot:  0
>>> u-boot=> dhcp || echo fail
>>> netif: added interface et IP addr 0.0.0.0 netmask 0.0.0.0 gw 0.0.0.0
>>> netif: setting default interface et
>>> dhcp_start(netif=00000000fdf23bb0) et0
>>> dhcp_start(): mallocing new DHCP client
>>> dhcp_start(): allocated dhcp
>>> dhcp_start(): starting DHCP configuration
>>> udp_bind(ipaddr = 0.0.0.0, port = 68)
>>> udp_bind: bound to 0.0.0.0, port 68)
>>> udp_connect: connected to 0.0.0.0, port 67)
>>> dhcp_discover()
>>> pbuf_alloc(length=308)
>>> pbuf_alloc(length=308) == 00000000fdf23d10
>>> transaction id xid(42021)
>>> dhcp_discover: making request
>>> dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
>>> pbuf_add_header: old 00000000fdf23d5e new 00000000fdf23d56 (8)
>>> udp_send: added header in given pbuf 00000000fdf23d10
>>> udp_send: sending datagram of length 316
>>> udp_send: UDP packet length 316
>>> inet_chksum_pseudo(): checksumming pbuf 00000000fdf23d10 (has next
>>> 0000000000000000)
>>> inet_chksum_pseudo(): pbuf chain lwip_chksum()=807
>>> udp_send: UDP checksum 0xf7f8
>>> udp_send: ip_output_if (,,,,0x11,)
>>> pbuf_add_header: old 00000000fdf23d56 new 00000000fdf23d42 (20)
>>> ip4_output_if: et0
>>> IP header:
>>> +-------------------------------+
>>> | 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
>>> +-------------------------------+
>>> |        0      |000|       0   | (id, flags, offset)
>>> +-------------------------------+
>>> |  255  |   17  |    0xba9d     | (ttl, proto, chksum)
>>> +-------------------------------+
>>> |    0  |    0  |    0  |    0  | (src)
>>> +-------------------------------+
>>> |  255  |  255  |  255  |  255  | (dest)
>>> +-------------------------------+
>>> ip4_output_if: call netif->output()
>>> pbuf_add_header: old 00000000fdf23d42 new 00000000fdf23d34 (14)
>>> ethernet_output: sending packet 00000000fdf23d10
>>> ^^^ at this point a 350 byte packet gets sent to fecmxc_send for the
>>> NIC which looks like this via print_hex_dump_bytes():
>>> 00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
>>> 00000010: 01 50 00 00 00 00 ff 11 ba 9d 00 00 00 00 ff ff  .P..............
>>> 00000020: ff ff 00 44 00 43 01 3c f8 f7 01 01 06 00 00 04  ...D.C.<........
>>> 00000030: 20 21 00 00 00 00 00 00 00 00 00 00 00 00 00 00   !..............
>>> 00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
>>> 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 05  ......c.Sc5..9..
>>> 00000120: dc 37 03 01 03 1c ff 00 00 00 00 00 00 00 00 00  .7..............
>>> 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00        ..............
>>
>> Up to this point everything *seems* OK
>>
>>> on my dhcp server 'tcpdump -i eth1 -X -vvv' shows:
>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
>>> length 262144 bytes
>>> 15:34:25.341167  [|llc]
>>> 0x0000:  0000 0000 0000 0000 0000 343d f2fd 0000  ..........4=....
>>> 0x0010:  0000 5e01 5e01 8000 0100 0000 0000 0000  ..^.^...........
>>> 0x0020:  0000 0000 0000 ffff ffff ffff 00d0 12ba  ................
>>> 0x0030:  f8cc 0800 4500 0150 001e 0000 ff11 ba7f  ....E..P........
>>> 0x0040:  0000 0000 ffff ffff 0044 0043 013c 6d78  .........D.C.<mx
>>> 0x0050:  0101 0600 1255 994f 0000 0000 0000 0000  .....U.O........
>>> 0x0060:  0000 0000 0000 0000 0000 0000 00d0 12ba  ................
>>> 0x0070:  f8cc 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0100:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0110:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0120:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0130:  0000 0000 0000 0000 0000 0000 6382 5363  ............c.Sc
>>> 0x0140:  3501 0139 0205 dc37 0301 031c ff00 0000  5..9...7........
>>
>> But here it shows the DHCP packet is not received.
>>
>>> Comparing to w/o LWIP:
>>> u-boot=> dhcp || echo fail
>>> BOOTP broadcast 1
>>> fecmxc_send ethernet@30be0000 342
>>> 00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
>>> 00000010: 01 48 00 00 40 00 ff 11 7a a5 00 00 00 00 ff ff  .H..@...z.......
>>> 00000020: ff ff 00 44 00 43 01 34 00 00 01 01 06 00 12 bb  ...D.C.4........
>>> 00000030: 82 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .d..............
>>> 00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
>>> 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 02  ......c.Sc5..9..
>>> 00000120: 40 5d 02 00 16 5e 03 01 00 00 3c 0c 55 2d 42 6f  @]...^....<.U-Bo
>>> 00000130: 6f 74 2e 61 72 6d 76 38 37 05 01 03 06 0c 11 ff  ot.armv87.......
>>> 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>> 00000150: 00 00 00 00 00 00                                ......
>>> ^^^ Note the packet is 8 bytes shorter
>>
>> Yes but that's just padding IIUC.
>>
>>> and on the dhcp server:
>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
>>> length 262144 bytes
>>> 15:45:12.548077 IP (tos 0x0, ttl 255, id 0, offset 0, flags [DF],
>>> proto UDP (17), length 328)
>>>     0.0.0.0.bootpc > 255.255.255.255.bootps: [no cksum] BOOTP/DHCP,
>>> Request from 00:d0:12:ba:f8:cc (oui Unknown), length 300, xid
>>> 0x12bb8264, Flags [none] (0x0000)
>>>   Client-Ethernet-Address 00:d0:12:ba:f8:cc (oui Unknown)
>>>   Vendor-rfc1048 Extensions
>>>     Magic Cookie 0x63825363
>>>     DHCP-Message (53), length 1: Discover
>>>     MSZ (57), length 2: 576
>>>     ARCH (93), length 2: 22
>>>     NDI (94), length 3: 1.0.0
>>>     Vendor-Class (60), length 12: "U-Boot.armv8"
>>>     Parameter-Request (55), length 5:
>>>       Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server (6),
>>> Hostname (12)
>>>       RP (17)
>>>     END (255), length 0
>>>     PAD (0), length 0, occurs 22
>>> 0x0000:  4500 0148 0000 4000 ff11 7aa5 0000 0000  E..H..@...z.....
>>> 0x0010:  ffff ffff 0044 0043 0134 0000 0101 0600  .....D.C.4......
>>> 0x0020:  12bb 8264 0000 0000 0000 0000 0000 0000  ...d............
>>> 0x0030:  0000 0000 0000 0000 00d0 12ba f8cc 0000  ................
>>> 0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0050:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0060:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0070:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0100:  0000 0000 0000 0000 6382 5363 3501 0139  ........c.Sc5..9
>>> 0x0110:  0202 405d 0200 165e 0301 0000 3c0c 552d  ..@]...^....<.U-
>>> 0x0120:  426f 6f74 2e61 726d 7638 3705 0103 060c  Boot.armv87.....
>>> 0x0130:  11ff 0000 0000 0000 0000 0000 0000 0000  ................
>>> 0x0140:  0000 0000 0000 0000                      ........
>>
>> All good there, as expected.
>>
>>>> I do have a i.MX8M Plus EVK board but will not be able to use it before
>>>> tomorrow.
>>>
>>> Please let me know your findings. I can test imx8mp as well but would
>>> have to reconfigure everything.
>>
>> So I did some tests on my imx8mp and I got DHCP to work on the second
>> ethernet port (ENET2) but *not* on the first one (ENET1). They appear to
>> be using different drivers (drivers/net/fec_mxc.c for ENET1 and
>> drivers/net/dwc_eth_qos.c for ENET2). No matter what I do, there is no
>> packet going out of ENET1.
>>
>> With a network cable plugged into ENET2 (ethernet@30bf0000):
>>
>> ...
>> Net:   eth0: ethernet@30be0000, eth1: ethernet@30bf0000 [PRIME]
>> Hit any key to stop autoboot:  0
>> u-boot=>
>> u-boot=> dhcp
>> ethernet@30bf0000 Waiting for PHY auto negotiation to complete....... done
>> DHCP client bound to address 192.168.0.39 (2066 ms)
>> u-boot=>
>>
>> Now moving the cable to ENET1 (ethernet@30be0000) and resetting the board:
>>
>> ...
>> Net:   eth0: ethernet@30be0000, eth1: ethernet@30bf0000 [PRIME]
>> Hit any key to stop autoboot:  0
>> u-boot=>
>> u-boot=> setenv ethact ethernet@30be0000
>> u-boot=> dhcp
>> u-boot=> echo $?
>> 1
>>
>> I enabled the debug traces in fecmxc_send() but I see nothing wrong,
>> the function is called with a packet and it returns success yet nothing
>> is sent out on the wire. Any idea what could be wrong?
>> My plan is to add more traces to the driver and compare NET vs. NET_LWIP.
>>
> 
> Jerome,
> 
> That is consistent with what I see then. The imx8mm (and all imx all
> the way back to imx28) use the FEC ethernet MAC so I would assume you
> see the issue on all of those SOC's. The imx8mp introduced a 2nd
> different MAC, the eqos which you are showing working.
> 
> If you look at my results above a packet 'is' received by the DHCP
> server but its missing something where tcpdump shows an '[|llc]' and I
> don't know what that means - something wrong with the link layer?
> 
> I feel like something is wrong in the header or packet alignment.

Good catch :) Here is a quick workaroud that works for me:

diff --git a/drivers/net/fec_mxc.c b/drivers/net/fec_mxc.c
index 2dc1364beec..85bf1a2b52e 100644
--- a/drivers/net/fec_mxc.c
+++ b/drivers/net/fec_mxc.c
@@ -658,7 +658,7 @@ static void fecmxc_halt(struct udevice *dev)
  * @param[in] length Data count in bytes
  * Return: 0 on success
  */
-static int fecmxc_send(struct udevice *dev, void *packet, int length)
+static int fecmxc_send_(struct udevice *dev, void *packet, int length)
 {
 	unsigned int status;
 	u32 size;
@@ -796,6 +796,21 @@ out:
 	return ret;
 }
 
+static int fecmxc_send(struct udevice *dev, void *packet, int length)
+{
+	void *pp;
+	int ret;
+
+	pp = memalign(64, length);
+	if (!pp)
+		return -ENOMEM;
+	memcpy(pp, packet, length);
+	ret = fecmxc_send_(dev, pp, length);
+	free(pp);
+
+	return ret;
+}
+
 /**
  * Pull one frame from the card
  * @param[in] dev Our ethernet device to handle


I believe the root of the problem is that lwIP uses malloc() to allocate
the outgoing packets and doesn't enforce any alignement. I need to
revisit that wrt. PKTALIGN.

Thanks,
Tim Harvey July 1, 2024, 4:06 p.m. UTC | #22
On Mon, Jul 1, 2024 at 2:58 AM Jerome Forissier
<jerome.forissier@linaro.org> wrote:
>
>
>
> On 6/28/24 17:48, Tim Harvey wrote:
> > On Fri, Jun 28, 2024 at 2:50 AM Jerome Forissier
> > <jerome.forissier@linaro.org> wrote:
> >>
> >> Hi Tim,
> >>
> >> On 6/26/24 18:00, Tim Harvey wrote:
> >>> On Tue, Jun 25, 2024 at 1:02 AM Jerome Forissier
> >>> <jerome.forissier@linaro.org> wrote:
> >>>>
> >>>> On 6/25/24 00:28, Tim Harvey wrote:
> >>>>> On Sat, Jun 22, 2024 at 1:09 AM Maxim Uvarov <muvarov@gmail.com> wrote:
> >>>>>>
> >>>>>> пт, 21 июн. 2024 г. в 21:42, Fabio Estevam <festevam@gmail.com>:
> >>>>>>>
> >>>>>>> Hi Tim and Jerome,
> >>>>>>>
> >>>>>>> On Fri, Jun 21, 2024 at 1:08 PM Tim Harvey <tharvey@gateworks.com> wrote:
> >>>>>>>
> >>>>>>>> I tried your to-upstream/v5-wip branch
> >>>>>>>> (042bea36eb9731079a3d7afffe3774d79e06ac5d) and it behaves the same. Do
> >>>>>>>> you have something else to try/test?
> >>>>>>>
> >>>>>>> Yes, when I tested older versions from Maxim I could never get lwIP to
> >>>>>>> work with i.MX.
> >>>>>>>
> >>>>>>> Jerome,
> >>>>>>>
> >>>>>>> Please try to run the lwIP series on any i.MX board, if possible.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>
> >>>>>> Packet not for us means that incoming packet DST MAC does not match to
> >>>>>> the MAC address inside lwip. I.e. to MAC address set into lwip when
> >>>>>> lwip_init was done. Original U-Boot network stack does not compare
> >>>>>> MACs but lwip does. There is something specific on this board, in
> >>>>>> general lwip with debug should print out
> >>>>>> MAC used during initialization. This MAC should match to the MAC from
> >>>>>> the incoming packet.
> >>>>>>
> >>>>>
> >>>>> It seems 'packet not for us' can mean a lot of things.
> >>>>
> >>>> Yeah :-/ in this case I believe the traces are caused by unrelated traffic
> >>>> (UDP port 138 is used by NetBIOS).
> >>>>
> >>>>
> >>>>> I added a bit of debugging around 'DHCP packet accepted' and found I'm
> >>>>> not receiving any packets from my DHCP server. So I connected directly
> >>>>> to another board (isolated network) where I ran my own server and
> >>>>> tcpdump and I don't see packets coming from lwip:
> >>>>
> >>>> Ha! That's interesting.
> >>>>
> >>>>> without lwip my server shows:
> >>>>> # tcpdump -i eth1
> >>>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> >>>>> length 262144 bytes
> >>>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> >>>>> length 262144 bytes
> >>>>> 22:18:01.043992 IP (tos 0x0, ttl 255, id 23, offset 0, flags [DF],
> >>>>> proto UDP (17), length 328)
> >>>>> 22:18:01.044391 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
> >>>>> proto UDP (17), length 328)
> >>>>> 22:18:01.044454 IP (tos 0x0, ttl 255, id 24, offset 0, flags [DF],
> >>>>> proto UDP (17), length 328)
> >>>>> 22:18:01.044752 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none],
> >>>>> proto UDP (17), length 328)
> >>>>>
> >>>>> # tcpdump -i eth1
> >>>>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> >>>>> length 262144 bytes
> >>>>> 22:22:41.602178  [|llc]
> >>>>> 22:22:41.709978  [|llc]
> >>>>> 22:22:41.867947  [|llc]
> >>>>> 22:22:42.105202  [|llc]
> >>>>> 22:22:42.502091  [|llc]
> >>>>> 22:22:43.219079  [|llc]
> >>>>> 22:22:44.497979  [|llc]
> >>>>> 22:22:45.776884  [|llc]
> >>>>> 22:22:47.054773  [|llc]
> >>>>> 22:22:48.333667  [|llc]
> >>>>> 22:22:49.611559  [|llc]
> >>>>> 22:22:50.890469  [|llc]
> >>>>>
> >>>>> What actual hardware has this been tested with? I think Tom mentioned
> >>>>> he tested with an rpi of some sort.
> >>>>
> >>>> Yes, I believe he tested on RPi 3B  and me too. Ilias has tested on NVIDIA
> >>>> Jetson Nano.
> >>>>
> >>>>> I don't know what the meaning of the llc msg is above but you can see
> >>>>> there is no DHCP request.
> >>>>
> >>>> Some more tracing is needed. Can you try enabling all traces in
> >>>> lib/lwip/u-boot/lwipopts.h? i.e., replace all LWIP_DBG_OFF with
> >>>> LWIP_DBG_ON in the #if defined(CONFIG_LWIP_DEBUG) section. And of course
> >>>> enabled CONFIG_LWIP_DEBUG.
> >>>
> >>> Here's what I see with all those enabled on an imx8mm-venice-gw73xx-0x:
> >>> Net:   eth0: ethernet@30be0000 [PRIME]
> >>> GSC     : boot watchdog disabled
> >>> Thermal protection:enabled at 96C
> >>> Hit any key to stop autoboot:  0
> >>> u-boot=> dhcp || echo fail
> >>> netif: added interface et IP addr 0.0.0.0 netmask 0.0.0.0 gw 0.0.0.0
> >>> netif: setting default interface et
> >>> dhcp_start(netif=00000000fdf23bb0) et0
> >>> dhcp_start(): mallocing new DHCP client
> >>> dhcp_start(): allocated dhcp
> >>> dhcp_start(): starting DHCP configuration
> >>> udp_bind(ipaddr = 0.0.0.0, port = 68)
> >>> udp_bind: bound to 0.0.0.0, port 68)
> >>> udp_connect: connected to 0.0.0.0, port 67)
> >>> dhcp_discover()
> >>> pbuf_alloc(length=308)
> >>> pbuf_alloc(length=308) == 00000000fdf23d10
> >>> transaction id xid(42021)
> >>> dhcp_discover: making request
> >>> dhcp_discover: sendto(DISCOVER, IP_ADDR_BROADCAST, LWIP_IANA_PORT_DHCP_SERVER)
> >>> pbuf_add_header: old 00000000fdf23d5e new 00000000fdf23d56 (8)
> >>> udp_send: added header in given pbuf 00000000fdf23d10
> >>> udp_send: sending datagram of length 316
> >>> udp_send: UDP packet length 316
> >>> inet_chksum_pseudo(): checksumming pbuf 00000000fdf23d10 (has next
> >>> 0000000000000000)
> >>> inet_chksum_pseudo(): pbuf chain lwip_chksum()=807
> >>> udp_send: UDP checksum 0xf7f8
> >>> udp_send: ip_output_if (,,,,0x11,)
> >>> pbuf_add_header: old 00000000fdf23d56 new 00000000fdf23d42 (20)
> >>> ip4_output_if: et0
> >>> IP header:
> >>> +-------------------------------+
> >>> | 4 | 5 |  0x00 |       336     | (v, hl, tos, len)
> >>> +-------------------------------+
> >>> |        0      |000|       0   | (id, flags, offset)
> >>> +-------------------------------+
> >>> |  255  |   17  |    0xba9d     | (ttl, proto, chksum)
> >>> +-------------------------------+
> >>> |    0  |    0  |    0  |    0  | (src)
> >>> +-------------------------------+
> >>> |  255  |  255  |  255  |  255  | (dest)
> >>> +-------------------------------+
> >>> ip4_output_if: call netif->output()
> >>> pbuf_add_header: old 00000000fdf23d42 new 00000000fdf23d34 (14)
> >>> ethernet_output: sending packet 00000000fdf23d10
> >>> ^^^ at this point a 350 byte packet gets sent to fecmxc_send for the
> >>> NIC which looks like this via print_hex_dump_bytes():
> >>> 00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
> >>> 00000010: 01 50 00 00 00 00 ff 11 ba 9d 00 00 00 00 ff ff  .P..............
> >>> 00000020: ff ff 00 44 00 43 01 3c f8 f7 01 01 06 00 00 04  ...D.C.<........
> >>> 00000030: 20 21 00 00 00 00 00 00 00 00 00 00 00 00 00 00   !..............
> >>> 00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
> >>> 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 05  ......c.Sc5..9..
> >>> 00000120: dc 37 03 01 03 1c ff 00 00 00 00 00 00 00 00 00  .7..............
> >>> 00000130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00        ..............
> >>
> >> Up to this point everything *seems* OK
> >>
> >>> on my dhcp server 'tcpdump -i eth1 -X -vvv' shows:
> >>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> >>> length 262144 bytes
> >>> 15:34:25.341167  [|llc]
> >>> 0x0000:  0000 0000 0000 0000 0000 343d f2fd 0000  ..........4=....
> >>> 0x0010:  0000 5e01 5e01 8000 0100 0000 0000 0000  ..^.^...........
> >>> 0x0020:  0000 0000 0000 ffff ffff ffff 00d0 12ba  ................
> >>> 0x0030:  f8cc 0800 4500 0150 001e 0000 ff11 ba7f  ....E..P........
> >>> 0x0040:  0000 0000 ffff ffff 0044 0043 013c 6d78  .........D.C.<mx
> >>> 0x0050:  0101 0600 1255 994f 0000 0000 0000 0000  .....U.O........
> >>> 0x0060:  0000 0000 0000 0000 0000 0000 00d0 12ba  ................
> >>> 0x0070:  f8cc 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0100:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0110:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0120:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0130:  0000 0000 0000 0000 0000 0000 6382 5363  ............c.Sc
> >>> 0x0140:  3501 0139 0205 dc37 0301 031c ff00 0000  5..9...7........
> >>
> >> But here it shows the DHCP packet is not received.
> >>
> >>> Comparing to w/o LWIP:
> >>> u-boot=> dhcp || echo fail
> >>> BOOTP broadcast 1
> >>> fecmxc_send ethernet@30be0000 342
> >>> 00000000: ff ff ff ff ff ff 00 d0 12 ba f8 cc 08 00 45 00  ..............E.
> >>> 00000010: 01 48 00 00 40 00 ff 11 7a a5 00 00 00 00 ff ff  .H..@...z.......
> >>> 00000020: ff ff 00 44 00 43 01 34 00 00 01 01 06 00 12 bb  ...D.C.4........
> >>> 00000030: 82 64 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .d..............
> >>> 00000040: 00 00 00 00 00 00 00 d0 12 ba f8 cc 00 00 00 00  ................
> >>> 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000110: 00 00 00 00 00 00 63 82 53 63 35 01 01 39 02 02  ......c.Sc5..9..
> >>> 00000120: 40 5d 02 00 16 5e 03 01 00 00 3c 0c 55 2d 42 6f  @]...^....<.U-Bo
> >>> 00000130: 6f 74 2e 61 72 6d 76 38 37 05 01 03 06 0c 11 ff  ot.armv87.......
> >>> 00000140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>> 00000150: 00 00 00 00 00 00                                ......
> >>> ^^^ Note the packet is 8 bytes shorter
> >>
> >> Yes but that's just padding IIUC.
> >>
> >>> and on the dhcp server:
> >>> tcpdump: listening on eth1, link-type EN10MB (Ethernet), snapshot
> >>> length 262144 bytes
> >>> 15:45:12.548077 IP (tos 0x0, ttl 255, id 0, offset 0, flags [DF],
> >>> proto UDP (17), length 328)
> >>>     0.0.0.0.bootpc > 255.255.255.255.bootps: [no cksum] BOOTP/DHCP,
> >>> Request from 00:d0:12:ba:f8:cc (oui Unknown), length 300, xid
> >>> 0x12bb8264, Flags [none] (0x0000)
> >>>   Client-Ethernet-Address 00:d0:12:ba:f8:cc (oui Unknown)
> >>>   Vendor-rfc1048 Extensions
> >>>     Magic Cookie 0x63825363
> >>>     DHCP-Message (53), length 1: Discover
> >>>     MSZ (57), length 2: 576
> >>>     ARCH (93), length 2: 22
> >>>     NDI (94), length 3: 1.0.0
> >>>     Vendor-Class (60), length 12: "U-Boot.armv8"
> >>>     Parameter-Request (55), length 5:
> >>>       Subnet-Mask (1), Default-Gateway (3), Domain-Name-Server (6),
> >>> Hostname (12)
> >>>       RP (17)
> >>>     END (255), length 0
> >>>     PAD (0), length 0, occurs 22
> >>> 0x0000:  4500 0148 0000 4000 ff11 7aa5 0000 0000  E..H..@...z.....
> >>> 0x0010:  ffff ffff 0044 0043 0134 0000 0101 0600  .....D.C.4......
> >>> 0x0020:  12bb 8264 0000 0000 0000 0000 0000 0000  ...d............
> >>> 0x0030:  0000 0000 0000 0000 00d0 12ba f8cc 0000  ................
> >>> 0x0040:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0050:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0060:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0070:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0080:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0090:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x00f0:  0000 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0100:  0000 0000 0000 0000 6382 5363 3501 0139  ........c.Sc5..9
> >>> 0x0110:  0202 405d 0200 165e 0301 0000 3c0c 552d  ..@]...^....<.U-
> >>> 0x0120:  426f 6f74 2e61 726d 7638 3705 0103 060c  Boot.armv87.....
> >>> 0x0130:  11ff 0000 0000 0000 0000 0000 0000 0000  ................
> >>> 0x0140:  0000 0000 0000 0000                      ........
> >>
> >> All good there, as expected.
> >>
> >>>> I do have a i.MX8M Plus EVK board but will not be able to use it before
> >>>> tomorrow.
> >>>
> >>> Please let me know your findings. I can test imx8mp as well but would
> >>> have to reconfigure everything.
> >>
> >> So I did some tests on my imx8mp and I got DHCP to work on the second
> >> ethernet port (ENET2) but *not* on the first one (ENET1). They appear to
> >> be using different drivers (drivers/net/fec_mxc.c for ENET1 and
> >> drivers/net/dwc_eth_qos.c for ENET2). No matter what I do, there is no
> >> packet going out of ENET1.
> >>
> >> With a network cable plugged into ENET2 (ethernet@30bf0000):
> >>
> >> ...
> >> Net:   eth0: ethernet@30be0000, eth1: ethernet@30bf0000 [PRIME]
> >> Hit any key to stop autoboot:  0
> >> u-boot=>
> >> u-boot=> dhcp
> >> ethernet@30bf0000 Waiting for PHY auto negotiation to complete....... done
> >> DHCP client bound to address 192.168.0.39 (2066 ms)
> >> u-boot=>
> >>
> >> Now moving the cable to ENET1 (ethernet@30be0000) and resetting the board:
> >>
> >> ...
> >> Net:   eth0: ethernet@30be0000, eth1: ethernet@30bf0000 [PRIME]
> >> Hit any key to stop autoboot:  0
> >> u-boot=>
> >> u-boot=> setenv ethact ethernet@30be0000
> >> u-boot=> dhcp
> >> u-boot=> echo $?
> >> 1
> >>
> >> I enabled the debug traces in fecmxc_send() but I see nothing wrong,
> >> the function is called with a packet and it returns success yet nothing
> >> is sent out on the wire. Any idea what could be wrong?
> >> My plan is to add more traces to the driver and compare NET vs. NET_LWIP.
> >>
> >
> > Jerome,
> >
> > That is consistent with what I see then. The imx8mm (and all imx all
> > the way back to imx28) use the FEC ethernet MAC so I would assume you
> > see the issue on all of those SOC's. The imx8mp introduced a 2nd
> > different MAC, the eqos which you are showing working.
> >
> > If you look at my results above a packet 'is' received by the DHCP
> > server but its missing something where tcpdump shows an '[|llc]' and I
> > don't know what that means - something wrong with the link layer?
> >
> > I feel like something is wrong in the header or packet alignment.
>
> Good catch :) Here is a quick workaroud that works for me:
>
> diff --git a/drivers/net/fec_mxc.c b/drivers/net/fec_mxc.c
> index 2dc1364beec..85bf1a2b52e 100644
> --- a/drivers/net/fec_mxc.c
> +++ b/drivers/net/fec_mxc.c
> @@ -658,7 +658,7 @@ static void fecmxc_halt(struct udevice *dev)
>   * @param[in] length Data count in bytes
>   * Return: 0 on success
>   */
> -static int fecmxc_send(struct udevice *dev, void *packet, int length)
> +static int fecmxc_send_(struct udevice *dev, void *packet, int length)
>  {
>         unsigned int status;
>         u32 size;
> @@ -796,6 +796,21 @@ out:
>         return ret;
>  }
>
> +static int fecmxc_send(struct udevice *dev, void *packet, int length)
> +{
> +       void *pp;
> +       int ret;
> +
> +       pp = memalign(64, length);
> +       if (!pp)
> +               return -ENOMEM;
> +       memcpy(pp, packet, length);
> +       ret = fecmxc_send_(dev, pp, length);
> +       free(pp);
> +
> +       return ret;
> +}
> +
>  /**
>   * Pull one frame from the card
>   * @param[in] dev Our ethernet device to handle
>
>
> I believe the root of the problem is that lwIP uses malloc() to allocate
> the outgoing packets and doesn't enforce any alignement. I need to
> revisit that wrt. PKTALIGN.
>

Yes, this makes sense. This driver uses DMA.

Glad we got to the bottom of it!

This hack does resolve the issue.

Best Regards,

Tim