diff mbox series

[v3,net,2/8] eth: bnxt: return fail if interface is down in bnxt_queue_mem_alloc()

Message ID 20250309134219.91670-3-ap420073@gmail.com
State New
Headers show
Series eth: bnxt: fix several bugs in the bnxt module | expand

Commit Message

Taehee Yoo March 9, 2025, 1:42 p.m. UTC
The bnxt_queue_mem_alloc() is called to allocate new queue memory when
a queue is restarted.
It internally accesses rx buffer descriptor corresponding to the index.
The rx buffer descriptor is allocated and set when the interface is up
and it's freed when the interface is down.
So, if queue is restarted if interface is down, kernel panic occurs.

Splat looks like:
 BUG: unable to handle page fault for address: 000000000000b240
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 3 UID: 0 PID: 1563 Comm: ncdevmem2 Not tainted 6.14.0-rc2+ #9 844ddba6e7c459cafd0bf4db9a3198e
 Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
 RIP: 0010:bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en]
 Code: 41 54 4d 89 c4 4d 69 c0 c0 05 00 00 55 48 89 f5 53 48 89 fb 4c 8d b5 40 05 00 00 48 83 ec 15
 RSP: 0018:ffff9dcc83fef9e8 EFLAGS: 00010202
 RAX: ffffffffc0457720 RBX: ffff934ed8d40000 RCX: 0000000000000000
 RDX: 000000000000001f RSI: ffff934ea508f800 RDI: ffff934ea508f808
 RBP: ffff934ea508f800 R08: 000000000000b240 R09: ffff934e84f4b000
 R10: ffff9dcc83fefa30 R11: ffff934e84f4b000 R12: 000000000000001f
 R13: ffff934ed8d40ac0 R14: ffff934ea508fd40 R15: ffff934e84f4b000
 FS:  00007fa73888c740(0000) GS:ffff93559f780000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000000b240 CR3: 0000000145a2e000 CR4: 00000000007506f0
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __die+0x20/0x70
  ? page_fault_oops+0x15a/0x460
  ? exc_page_fault+0x6e/0x180
  ? asm_exc_page_fault+0x22/0x30
  ? __pfx_bnxt_queue_mem_alloc+0x10/0x10 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
  ? bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
  netdev_rx_queue_restart+0xc5/0x240
  net_devmem_bind_dmabuf_to_queue+0xf8/0x200
  netdev_nl_bind_rx_doit+0x3a7/0x450
  genl_family_rcv_msg_doit+0xd9/0x130
  genl_rcv_msg+0x184/0x2b0
  ? __pfx_netdev_nl_bind_rx_doit+0x10/0x10
  ? __pfx_genl_rcv_msg+0x10/0x10
  netlink_rcv_skb+0x54/0x100
  genl_rcv+0x24/0x40
...

Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---

v3:
 - No changes.

v2:
 - Add Review tags from Somnath and Jakub.

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Mina Almasry March 9, 2025, 9:44 p.m. UTC | #1
On Sun, Mar 9, 2025 at 6:43 AM Taehee Yoo <ap420073@gmail.com> wrote:
>
> The bnxt_queue_mem_alloc() is called to allocate new queue memory when
> a queue is restarted.
> It internally accesses rx buffer descriptor corresponding to the index.
> The rx buffer descriptor is allocated and set when the interface is up
> and it's freed when the interface is down.
> So, if queue is restarted if interface is down, kernel panic occurs.
>
> Splat looks like:
>  BUG: unable to handle page fault for address: 000000000000b240
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 0 P4D 0
>  Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
>  CPU: 3 UID: 0 PID: 1563 Comm: ncdevmem2 Not tainted 6.14.0-rc2+ #9 844ddba6e7c459cafd0bf4db9a3198e
>  Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
>  RIP: 0010:bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en]
>  Code: 41 54 4d 89 c4 4d 69 c0 c0 05 00 00 55 48 89 f5 53 48 89 fb 4c 8d b5 40 05 00 00 48 83 ec 15
>  RSP: 0018:ffff9dcc83fef9e8 EFLAGS: 00010202
>  RAX: ffffffffc0457720 RBX: ffff934ed8d40000 RCX: 0000000000000000
>  RDX: 000000000000001f RSI: ffff934ea508f800 RDI: ffff934ea508f808
>  RBP: ffff934ea508f800 R08: 000000000000b240 R09: ffff934e84f4b000
>  R10: ffff9dcc83fefa30 R11: ffff934e84f4b000 R12: 000000000000001f
>  R13: ffff934ed8d40ac0 R14: ffff934ea508fd40 R15: ffff934e84f4b000
>  FS:  00007fa73888c740(0000) GS:ffff93559f780000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 000000000000b240 CR3: 0000000145a2e000 CR4: 00000000007506f0
>  PKRU: 55555554
>  Call Trace:
>   <TASK>
>   ? __die+0x20/0x70
>   ? page_fault_oops+0x15a/0x460
>   ? exc_page_fault+0x6e/0x180
>   ? asm_exc_page_fault+0x22/0x30
>   ? __pfx_bnxt_queue_mem_alloc+0x10/0x10 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
>   ? bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
>   netdev_rx_queue_restart+0xc5/0x240
>   net_devmem_bind_dmabuf_to_queue+0xf8/0x200
>   netdev_nl_bind_rx_doit+0x3a7/0x450
>   genl_family_rcv_msg_doit+0xd9/0x130
>   genl_rcv_msg+0x184/0x2b0
>   ? __pfx_netdev_nl_bind_rx_doit+0x10/0x10
>   ? __pfx_genl_rcv_msg+0x10/0x10
>   netlink_rcv_skb+0x54/0x100
>   genl_rcv+0x24/0x40
> ...
>
> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> Reviewed-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Mina Almasry <almasrymina@google.com>

Although I wonder if you wanna return -ENETDOWN from other queue API
ops, if your driver doesn't handle them.
Taehee Yoo March 10, 2025, 2:10 a.m. UTC | #2
On Mon, Mar 10, 2025 at 6:44 AM Mina Almasry <almasrymina@google.com> wrote:
>

Hi Mina,
Thanks a lot for the review!

> On Sun, Mar 9, 2025 at 6:43 AM Taehee Yoo <ap420073@gmail.com> wrote:
> >
> > The bnxt_queue_mem_alloc() is called to allocate new queue memory when
> > a queue is restarted.
> > It internally accesses rx buffer descriptor corresponding to the index.
> > The rx buffer descriptor is allocated and set when the interface is up
> > and it's freed when the interface is down.
> > So, if queue is restarted if interface is down, kernel panic occurs.
> >
> > Splat looks like:
> >  BUG: unable to handle page fault for address: 000000000000b240
> >  #PF: supervisor read access in kernel mode
> >  #PF: error_code(0x0000) - not-present page
> >  PGD 0 P4D 0
> >  Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> >  CPU: 3 UID: 0 PID: 1563 Comm: ncdevmem2 Not tainted 6.14.0-rc2+ #9 844ddba6e7c459cafd0bf4db9a3198e
> >  Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
> >  RIP: 0010:bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en]
> >  Code: 41 54 4d 89 c4 4d 69 c0 c0 05 00 00 55 48 89 f5 53 48 89 fb 4c 8d b5 40 05 00 00 48 83 ec 15
> >  RSP: 0018:ffff9dcc83fef9e8 EFLAGS: 00010202
> >  RAX: ffffffffc0457720 RBX: ffff934ed8d40000 RCX: 0000000000000000
> >  RDX: 000000000000001f RSI: ffff934ea508f800 RDI: ffff934ea508f808
> >  RBP: ffff934ea508f800 R08: 000000000000b240 R09: ffff934e84f4b000
> >  R10: ffff9dcc83fefa30 R11: ffff934e84f4b000 R12: 000000000000001f
> >  R13: ffff934ed8d40ac0 R14: ffff934ea508fd40 R15: ffff934e84f4b000
> >  FS:  00007fa73888c740(0000) GS:ffff93559f780000(0000) knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >  CR2: 000000000000b240 CR3: 0000000145a2e000 CR4: 00000000007506f0
> >  PKRU: 55555554
> >  Call Trace:
> >   <TASK>
> >   ? __die+0x20/0x70
> >   ? page_fault_oops+0x15a/0x460
> >   ? exc_page_fault+0x6e/0x180
> >   ? asm_exc_page_fault+0x22/0x30
> >   ? __pfx_bnxt_queue_mem_alloc+0x10/0x10 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
> >   ? bnxt_queue_mem_alloc+0x3f/0x4e0 [bnxt_en 7f85e76f4d724ba07471d7e39d9e773aea6597b7]
> >   netdev_rx_queue_restart+0xc5/0x240
> >   net_devmem_bind_dmabuf_to_queue+0xf8/0x200
> >   netdev_nl_bind_rx_doit+0x3a7/0x450
> >   genl_family_rcv_msg_doit+0xd9/0x130
> >   genl_rcv_msg+0x184/0x2b0
> >   ? __pfx_netdev_nl_bind_rx_doit+0x10/0x10
> >   ? __pfx_genl_rcv_msg+0x10/0x10
> >   netlink_rcv_skb+0x54/0x100
> >   genl_rcv+0x24/0x40
> > ...
> >
> > Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
> > Reviewed-by: Jakub Kicinski <kuba@kernel.org>
>
> Reviewed-by: Mina Almasry <almasrymina@google.com>
>
> Although I wonder if you wanna return -ENETDOWN from other queue API
> ops, if your driver doesn't handle them.

Okay, I will add -ENETDOWN return to the other queue API in the next
version.

Thanks a lot!
Taehee Yoo
Jakub Kicinski March 10, 2025, 8:30 p.m. UTC | #3
On Mon, 10 Mar 2025 11:10:42 +0900 Taehee Yoo wrote:
> > Although I wonder if you wanna return -ENETDOWN from other queue API
> > ops, if your driver doesn't handle them.  
> 
> Okay, I will add -ENETDOWN return to the other queue API in the next
> version.

I prefer the current version. If queues can't be allocated while down,
how can free be called..
diff mbox series

Patch

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 6b5fe4ee7a99..acb9500ef930 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -15447,6 +15447,9 @@  static int bnxt_queue_mem_alloc(struct net_device *dev, void *qmem, int idx)
 	struct bnxt_ring_struct *ring;
 	int rc;
 
+	if (!bp->rx_ring)
+		return -ENETDOWN;
+
 	rxr = &bp->rx_ring[idx];
 	clone = qmem;
 	memcpy(clone, rxr, sizeof(*rxr));