mbox series

[v3,net,0/8] eth: bnxt: fix several bugs in the bnxt module

Message ID 20250309134219.91670-1-ap420073@gmail.com
Headers show
Series eth: bnxt: fix several bugs in the bnxt module | expand

Message

Taehee Yoo March 9, 2025, 1:42 p.m. UTC
The first fixes setting incorrect skb->truesize.
When xdp-mb prog returns XDP_PASS, skb is allocated and initialized.
Currently, The truesize is calculated as BNXT_RX_PAGE_SIZE *
sinfo->nr_frags, but sinfo->nr_frags is flushed by napi_build_skb().
So, it stores sinfo before calling napi_build_skb() and then use it
for calculate truesize.

The second fixes kernel panic in the bnxt_queue_mem_alloc().
The bnxt_queue_mem_alloc() accesses rx ring descriptor.
rx ring descriptors are allocated when the interface is up and it's
freed when the interface is down.
So, if bnxt_queue_mem_alloc() is called when the interface is down,
kernel panic occurs.
This patch makes the bnxt_queue_mem_alloc() return -ENETDOWN if rx ring
descriptors are not allocated.

The third patch fixes kernel panic in the bnxt_queue_{start | stop}().
When a queue is restarted bnxt_queue_{start | stop}() are called.
These functions set MRU to 0 to stop packet flow and then to set up the
remaining things.
MRU variable is a member of vnic_info[] the first vnic_info is for
default and the second is for ntuple.
The first vnic_info is always allocated when interface is up, but the
second is allocated only when ntuple is enabled.
(ethtool -K eth0 ntuple <on | off>).
Currently, the bnxt_queue_{start | stop}() access
vnic_info[BNXT_VNIC_NTUPLE] regardless of whether ntuple is enabled or
not.
So kernel panic occurs.
This patch make the bnxt_queue_{start | stop}() use bp->nr_vnics instead
of BNXT_VNIC_NTUPLE.

The fourth patch fixes a warning due to checksum state.
The bnxt_rx_pkt() checks whether skb->ip_summed is not CHECKSUM_NONE
before updating ip_summed. if ip_summed is not CHECKSUM_NONE, it WARNS
about it. However, the bnxt_xdp_build_skb() is called in XDP-MB-PASS
path and it updates ip_summed earlier than bnxt_rx_pkt().
So, in the XDP-MB-PASS path, the bnxt_rx_pkt() always warns about
checksum.
Updating ip_summed at the bnxt_xdp_build_skb() is unnecessary and
duplicate, so it is removed.

The fifth patch fixes a kernel panic in the
bnxt_get_queue_stats{rx | tx}().
The bnxt_get_queue_stats{rx | tx}() callback functions are called when
a queue is resetting.
These internally access rx and tx rings without null check, but rings
are allocated and initialized when the interface is up.
So, these functions are called when the interface is down, it
occurs a kernel panic.

The sixth patch fixes memory leak in queue reset logic.
When a queue is resetting, tpa_info is allocated for the new queue and
tpa_info for an old queue is not used anymore.
So it should be freed, but not.

The seventh patch makes net_devmem_unbind_dmabuf() ignore -ENETDOWN.
When devmem socket is closed, net_devmem_unbind_dmabuf() is called to
unbind/release resources.
If interface is down, the driver returns -ENETDOWN.
The -ENETDOWN return value is not an actual error, because the interface
will release resources when the interface is down.
So, net_devmem_unbind_dmabuf() needs to ignore -ENETDOWN.

The last patch adds XDP testcases to
tools/testing/selftests/drivers/net/ping.py.

v3:
 - Copy nr_frags instead of full copy. (1/8)
 - Add Review tags from Somnath. (3/8)
 - Add new patch for fixing kernel panic in the
   bnxt_get_queue_stats{rx | tx}(). (5/8)
 - Add new patch for fixing memory leak in queue reset. (6/8)

v2:
 - Do not use num_frags in the bnxt_xdp_build_skb(). (1/6)
 - Add Review tags from Somnath and Jakub. (2/6)
 - Add new patch for fixing checksum warning. (4/6)
 - Add new patch for fixing warning in net_devmem_unbind_dmabuf(). (5/6)
 - Add new XDP testcases to ping.py (6/6)

Taehee Yoo (8):
  eth: bnxt: fix truesize for mb-xdp-pass case
  eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()
  eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue
    restart logic
  eth: bnxt: do not update checksum in bnxt_xdp_build_skb()
  eth: bnxt: fix kernel panic in the bnxt_get_queue_stats{rx | tx}
  eth: bnxt: fix memory leak in queue reset
  net: devmem: do not WARN conditionally after netdev_rx_queue_restart()
  selftests: drv-net: add xdp cases for ping.py

 drivers/net/ethernet/broadcom/bnxt/bnxt.c     |  25 ++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |  13 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |   3 +-
 net/core/devmem.c                             |   4 +-
 tools/testing/selftests/drivers/net/ping.py   | 200 ++++++++++++++++--
 .../testing/selftests/net/lib/xdp_dummy.bpf.c |   6 +
 6 files changed, 220 insertions(+), 31 deletions(-)

Comments

Somnath Kotur March 10, 2025, 2:59 a.m. UTC | #1
On Sun, Mar 9, 2025 at 7:13 PM Taehee Yoo <ap420073@gmail.com> wrote:
>
> When qstats-get operation is executed, callbacks of netdev_stats_ops
> are called. The bnxt_get_queue_stats{rx | tx} collect per-queue stats
> from sw_stats in the rings.
> But {rx | tx | cp}_ring are allocated when the interface is up.
> So, these rings are not allocated when the interface is down.
>
> The qstats-get is allowed even if the interface is down. However,
> the bnxt_get_queue_stats{rx | tx}() accesses cp_ring and tx_ring
> without null check.
> So, it needs to avoid accessing rings if the interface is down.
>
> Reproducer:
>  ip link set $interface down
>  ./cli.py --spec netdev.yaml --dump qstats-get
> OR
>  ip link set $interface down
>  python ./stats.py
>
> Splat looks like:
>  BUG: kernel NULL pointer dereference, address: 0000000000000000
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 1680fa067 P4D 1680fa067 PUD 16be3b067 PMD 0
>  Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
>  CPU: 0 UID: 0 PID: 1495 Comm: python3 Not tainted 6.14.0-rc4+ #32 5cd0f999d5a15c574ac72b3e4b907341
>  Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
>  RIP: 0010:bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en]
>  Code: c6 87 b5 18 00 00 02 eb a2 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 01
>  RSP: 0018:ffffabef43cdb7e0 EFLAGS: 00010282
>  RAX: 0000000000000000 RBX: ffffffffc04c8710 RCX: 0000000000000000
>  RDX: ffffabef43cdb858 RSI: 0000000000000000 RDI: ffff8d504e850000
>  RBP: ffff8d506c9f9c00 R08: 0000000000000004 R09: ffff8d506bcd901c
>  R10: 0000000000000015 R11: ffff8d506bcd9000 R12: 0000000000000000
>  R13: ffffabef43cdb8c0 R14: ffff8d504e850000 R15: 0000000000000000
>  FS:  00007f2c5462b080(0000) GS:ffff8d575f600000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000000000 CR3: 0000000167fd0000 CR4: 00000000007506f0
>  PKRU: 55555554
>  Call Trace:
>   <TASK>
>   ? __die+0x20/0x70
>   ? page_fault_oops+0x15a/0x460
>   ? sched_balance_find_src_group+0x58d/0xd10
>   ? exc_page_fault+0x6e/0x180
>   ? asm_exc_page_fault+0x22/0x30
>   ? bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en cdd546fd48563c280cfd30e9647efa420db07bf1]
>   netdev_nl_stats_by_netdev+0x2b1/0x4e0
>   ? xas_load+0x9/0xb0
>   ? xas_find+0x183/0x1d0
>   ? xa_find+0x8b/0xe0
>   netdev_nl_qstats_get_dumpit+0xbf/0x1e0
>   genl_dumpit+0x31/0x90
>   netlink_dump+0x1a8/0x360
>
> Fixes: af7b3b4adda5 ("eth: bnxt: support per-queue statistics")
> Signed-off-by: Taehee Yoo <ap420073@gmail.com>
> ---
>
> v3:
>  - Patch added.
>
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 9afb2c5072b1..bee12d9b57ab 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -15384,6 +15384,9 @@ static void bnxt_get_queue_stats_rx(struct net_device *dev, int i,
>         struct bnxt_cp_ring_info *cpr;
>         u64 *sw;
>
> +       if (!bp->bnapi)
> +               return;
> +
>         cpr = &bp->bnapi[i]->cp_ring;
>         sw = cpr->stats.sw_stats;
>
> @@ -15407,6 +15410,9 @@ static void bnxt_get_queue_stats_tx(struct net_device *dev, int i,
>         struct bnxt_napi *bnapi;
>         u64 *sw;
>
> +       if (!bp->tx_ring)
> +               return;
> +
>         bnapi = bp->tx_ring[bp->tx_ring_map[i]].bnapi;
>         sw = bnapi->cp_ring.stats.sw_stats;
>
> --
> 2.34.1
>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>