Message ID | 20241211071127.38452-1-liuhangbin@gmail.com |
---|---|
Series | bond: fix xfrm offload feature during init |
On Wed, 11 Dec 2024 07:11:25 +0000 Hangbin Liu wrote:
> The first patch fixes the xfrm offload feature during setup active-backup
> mode. The second patch add a ipsec offload testing.

Looks like the test is too good, is there a fix pending somewhere for
the BUG below? We can't merge the test before that:

https://netdev-3.bots.linux.dev/vmksft-bonding-dbg/results/900082/11-bond-ipsec-offload-sh/stderr

[ 859.672652][ C3] bond_xfrm_update_stats: eth0 doesn't support xdo_dev_state_update_stats
[ 860.467189][ T8677] bond0: (slave eth0): link status definitely down, disabling slave
[ 860.467664][ T8677] bond0: (slave eth1): making interface the new active one
[ 860.831042][ T9677] bond_xfrm_update_stats: eth1 doesn't support xdo_dev_state_update_stats
[ 862.195271][ T9683] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
[ 862.195880][ T9683] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9683, name: ip
[ 862.196189][ T9683] preempt_count: 201, expected: 0
[ 862.196396][ T9683] RCU nest depth: 0, expected: 0
[ 862.196591][ T9683] 2 locks held by ip/9683:
[ 862.196818][ T9683] #0: ffff88800a829558 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{4:4}, at: xfrm_netlink_rcv+0x65/0x90 [xfrm_user]
[ 862.197264][ T9683] #1: ffff88800f460548 (&x->lock){+.-.}-{3:3}, at: xfrm_state_flush+0x1b3/0x3a0
[ 862.197629][ T9683] CPU: 3 UID: 0 PID: 9683 Comm: ip Not tainted 6.13.0-rc1-virtme #1
[ 862.197967][ T9683] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 862.198204][ T9683] Call Trace:
[ 862.198352][ T9683] <TASK>
[ 862.198458][ T9683] dump_stack_lvl+0xb0/0xd0
[ 862.198659][ T9683] __might_resched+0x2f8/0x530
[ 862.198852][ T9683] ? kfree+0x2d/0x330
[ 862.199005][ T9683] __mutex_lock+0xd9/0xbc0
[ 862.199202][ T9683] ? ref_tracker_free+0x35e/0x910
[ 862.199401][ T9683] ? bond_ipsec_del_sa+0x2c1/0x790
[ 862.199937][ T9683] ? find_held_lock+0x2c/0x110
[ 862.200133][ T9683] ? __pfx___mutex_lock+0x10/0x10
[ 862.200329][ T9683] ? bond_ipsec_del_sa+0x280/0x790
[ 862.200519][ T9683] ? xfrm_dev_state_delete+0x97/0x170
[ 862.200711][ T9683] ? __xfrm_state_delete+0x681/0x8e0
[ 862.200907][ T9683] ? xfrm_user_rcv_msg+0x4f8/0x920 [xfrm_user]
[ 862.201151][ T9683] ? netlink_rcv_skb+0x130/0x360
[ 862.201347][ T9683] ? xfrm_netlink_rcv+0x74/0x90 [xfrm_user]
[ 862.201587][ T9683] ? netlink_unicast+0x44b/0x710
[ 862.201780][ T9683] ? netlink_sendmsg+0x723/0xbe0
[ 862.201973][ T9683] ? ____sys_sendmsg+0x7ac/0xa10
[ 862.202164][ T9683] ? ___sys_sendmsg+0xee/0x170
[ 862.202355][ T9683] ? __sys_sendmsg+0x109/0x1a0
[ 862.202546][ T9683] ? do_syscall_64+0xc1/0x1d0
[ 862.202738][ T9683] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 862.202986][ T9683] ? __pfx_nsim_ipsec_del_sa+0x10/0x10 [netdevsim]
[ 862.203251][ T9683] ? bond_ipsec_del_sa+0x2c1/0x790
[ 862.203457][ T9683] bond_ipsec_del_sa+0x2c1/0x790
[ 862.203648][ T9683] ? __pfx_lock_acquire.part.0+0x10/0x10
[ 862.203845][ T9683] ? __pfx_bond_ipsec_del_sa+0x10/0x10
[ 862.204034][ T9683] ? do_raw_spin_lock+0x131/0x270
[ 862.204225][ T9683] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 862.204468][ T9683] xfrm_dev_state_delete+0x97/0x170
[ 862.204665][ T9683] __xfrm_state_delete+0x681/0x8e0
[ 862.204858][ T9683] xfrm_state_flush+0x1bb/0x3a0
[ 862.205057][ T9683] xfrm_flush_sa+0xf0/0x270 [xfrm_user]
[ 862.205290][ T9683] ? __pfx_xfrm_flush_sa+0x10/0x10 [xfrm_user]
[ 862.205537][ T9683] ? __nla_validate_parse+0x48/0x3d0
[ 862.205744][ T9683] xfrm_user_rcv_msg+0x4f8/0x920 [xfrm_user]
[ 862.205985][ T9683] ? __pfx___lock_release+0x10/0x10
[ 862.206174][ T9683] ? __pfx_xfrm_user_rcv_msg+0x10/0x10 [xfrm_user]
[ 862.206412][ T9683] ? __pfx_validate_chain+0x10/0x10
[ 862.206614][ T9683] ? hlock_class+0x4e/0x130
[ 862.206807][ T9683] ? mark_lock+0x38/0x3e0
[ 862.206986][ T9683] ? __mutex_trylock_common+0xfa/0x260
[ 862.207181][ T9683] ? __pfx___mutex_trylock_common+0x10/0x10
[ 862.207425][ T9683] netlink_rcv_skb+0x130/0x360

On Thu, Dec 12, 2024 at 06:27:34AM -0800, Jakub Kicinski wrote:
> On Wed, 11 Dec 2024 07:11:25 +0000 Hangbin Liu wrote:
> > The first patch fixes the xfrm offload feature during setup active-backup
> > mode. The second patch add a ipsec offload testing.
>
> Looks like the test is too good, is there a fix pending somewhere for
> the BUG below? We can't merge the test before that:

This should be a regression of 2aeeef906d5a ("bonding: change ipsec_lock from
spin lock to mutex"). As in xfrm_state_delete we called spin_lock_bh(&x->lock)
for the xfrm state delete.

But I'm not sure if it's proper to release the spin lock in bond code.
This seems too specific.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 7daeab67e7b5..69563bc958ca 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -592,6 +592,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
 	real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
 out:
 	netdev_put(real_dev, &tracker);
+	spin_unlock_bh(&xs->lock);
 	mutex_lock(&bond->ipsec_lock);
 	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
 		if (ipsec->xs == xs) {
@@ -601,6 +602,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
 		}
 	}
 	mutex_unlock(&bond->ipsec_lock);
+	spin_lock_bh(&xs->lock);
 }

What do you think?

Thanks
Hangbin
>
> https://netdev-3.bots.linux.dev/vmksft-bonding-dbg/results/900082/11-bond-ipsec-offload-sh/stderr
>
> [ 859.672652][ C3] bond_xfrm_update_stats: eth0 doesn't support xdo_dev_state_update_stats
> [ 860.467189][ T8677] bond0: (slave eth0): link status definitely down, disabling slave
> [ 860.467664][ T8677] bond0: (slave eth1): making interface the new active one
> [ 860.831042][ T9677] bond_xfrm_update_stats: eth1 doesn't support xdo_dev_state_update_stats
> [ 862.195271][ T9683] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
> [ 862.195880][ T9683] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9683, name: ip
> [ 862.196189][ T9683] preempt_count: 201, expected: 0
> [ 862.196396][ T9683] RCU nest depth: 0, expected: 0
> [ 862.196591][ T9683] 2 locks held by ip/9683:
> [ 862.196818][ T9683] #0: ffff88800a829558 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{4:4}, at: xfrm_netlink_rcv+0x65/0x90 [xfrm_user]
> [ 862.197264][ T9683] #1: ffff88800f460548 (&x->lock){+.-.}-{3:3}, at: xfrm_state_flush+0x1b3/0x3a0
> [ 862.197629][ T9683] CPU: 3 UID: 0 PID: 9683 Comm: ip Not tainted 6.13.0-rc1-virtme #1
> [ 862.197967][ T9683] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 862.198204][ T9683] Call Trace:
> [ 862.198352][ T9683] <TASK>
> [ 862.198458][ T9683] dump_stack_lvl+0xb0/0xd0
> [ 862.198659][ T9683] __might_resched+0x2f8/0x530
> [ 862.198852][ T9683] ? kfree+0x2d/0x330
> [ 862.199005][ T9683] __mutex_lock+0xd9/0xbc0
> [ 862.199202][ T9683] ? ref_tracker_free+0x35e/0x910
> [ 862.199401][ T9683] ? bond_ipsec_del_sa+0x2c1/0x790
> [ 862.199937][ T9683] ? find_held_lock+0x2c/0x110
> [ 862.200133][ T9683] ? __pfx___mutex_lock+0x10/0x10
> [ 862.200329][ T9683] ? bond_ipsec_del_sa+0x280/0x790
> [ 862.200519][ T9683] ? xfrm_dev_state_delete+0x97/0x170
> [ 862.200711][ T9683] ? __xfrm_state_delete+0x681/0x8e0
> [ 862.200907][ T9683] ? xfrm_user_rcv_msg+0x4f8/0x920 [xfrm_user]
> [ 862.201151][ T9683] ? netlink_rcv_skb+0x130/0x360
> [ 862.201347][ T9683] ? xfrm_netlink_rcv+0x74/0x90 [xfrm_user]
> [ 862.201587][ T9683] ? netlink_unicast+0x44b/0x710
> [ 862.201780][ T9683] ? netlink_sendmsg+0x723/0xbe0
> [ 862.201973][ T9683] ? ____sys_sendmsg+0x7ac/0xa10
> [ 862.202164][ T9683] ? ___sys_sendmsg+0xee/0x170
> [ 862.202355][ T9683] ? __sys_sendmsg+0x109/0x1a0
> [ 862.202546][ T9683] ? do_syscall_64+0xc1/0x1d0
> [ 862.202738][ T9683] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
> [ 862.202986][ T9683] ? __pfx_nsim_ipsec_del_sa+0x10/0x10 [netdevsim]
> [ 862.203251][ T9683] ? bond_ipsec_del_sa+0x2c1/0x790
> [ 862.203457][ T9683] bond_ipsec_del_sa+0x2c1/0x790
> [ 862.203648][ T9683] ? __pfx_lock_acquire.part.0+0x10/0x10
> [ 862.203845][ T9683] ? __pfx_bond_ipsec_del_sa+0x10/0x10
> [ 862.204034][ T9683] ? do_raw_spin_lock+0x131/0x270
> [ 862.204225][ T9683] ? __pfx_do_raw_spin_lock+0x10/0x10
> [ 862.204468][ T9683] xfrm_dev_state_delete+0x97/0x170
> [ 862.204665][ T9683] __xfrm_state_delete+0x681/0x8e0
> [ 862.204858][ T9683] xfrm_state_flush+0x1bb/0x3a0
> [ 862.205057][ T9683] xfrm_flush_sa+0xf0/0x270 [xfrm_user]
> [ 862.205290][ T9683] ? __pfx_xfrm_flush_sa+0x10/0x10 [xfrm_user]
> [ 862.205537][ T9683] ? __nla_validate_parse+0x48/0x3d0
> [ 862.205744][ T9683] xfrm_user_rcv_msg+0x4f8/0x920 [xfrm_user]
> [ 862.205985][ T9683] ? __pfx___lock_release+0x10/0x10
> [ 862.206174][ T9683] ? __pfx_xfrm_user_rcv_msg+0x10/0x10 [xfrm_user]
> [ 862.206412][ T9683] ? __pfx_validate_chain+0x10/0x10
> [ 862.206614][ T9683] ? hlock_class+0x4e/0x130
> [ 862.206807][ T9683] ? mark_lock+0x38/0x3e0
> [ 862.206986][ T9683] ? __mutex_trylock_common+0xfa/0x260
> [ 862.207181][ T9683] ? __pfx___mutex_trylock_common+0x10/0x10
> [ 862.207425][ T9683] netlink_rcv_skb+0x130/0x360

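The diagnosis in this message (a mutex taken while the xfrm state spin lock is held) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct ctx` and every `model_*` function are hypothetical names invented for illustration, not kernel APIs, and the counter merely stands in for the `preempt_count` value reported in the trace.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct ctx {
    int atomic_depth;   /* stands in for preempt_count */
    int violations;     /* "sleeping function called from invalid context" hits */
};

/* Taking a spin lock enters atomic context in this model. */
static void model_spin_lock(struct ctx *c)   { c->atomic_depth++; }
static void model_spin_unlock(struct ctx *c) { c->atomic_depth--; }

/* A sleeping lock is only legal when atomic_depth is zero. */
static void model_mutex_lock(struct ctx *c)
{
    if (c->atomic_depth > 0)
        c->violations++;
}

static void model_mutex_unlock(struct ctx *c) { (void)c; }

/* Shape of the reported path: state lock held, then the callback takes a mutex. */
static int model_flush_then_del_sa(void)
{
    struct ctx c = { 0, 0 };

    model_spin_lock(&c);    /* like spin_lock_bh(&x->lock) in the flush path */
    model_mutex_lock(&c);   /* like mutex_lock(&bond->ipsec_lock) in bond_ipsec_del_sa */
    model_mutex_unlock(&c);
    model_spin_unlock(&c);
    return c.violations;
}

/* Same mutex section, but entered with no spin lock held. */
static int model_del_sa_without_state_lock(void)
{
    struct ctx c = { 0, 0 };

    model_mutex_lock(&c);
    model_mutex_unlock(&c);
    return c.violations;
}
```

In this model `model_flush_then_del_sa()` records one violation and `model_del_sa_without_state_lock()` records none, which mirrors why the splat only appears when `bond_ipsec_del_sa` is reached with `&x->lock` held.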
On Fri, 13 Dec 2024 07:18:08 +0000 Hangbin Liu wrote:
> On Thu, Dec 12, 2024 at 06:27:34AM -0800, Jakub Kicinski wrote:
> > On Wed, 11 Dec 2024 07:11:25 +0000 Hangbin Liu wrote:
> > > The first patch fixes the xfrm offload feature during setup active-backup
> > > mode. The second patch add a ipsec offload testing.
> >
> > Looks like the test is too good, is there a fix pending somewhere for
> > the BUG below? We can't merge the test before that:
>
> This should be a regression of 2aeeef906d5a ("bonding: change ipsec_lock from
> spin lock to mutex"). As in xfrm_state_delete we called spin_lock_bh(&x->lock)
> for the xfrm state delete.
>
> But I'm not sure if it's proper to release the spin lock in bond code.
> This seems too specific.
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 7daeab67e7b5..69563bc958ca 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -592,6 +592,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
>  	real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
>  out:
>  	netdev_put(real_dev, &tracker);
> +	spin_unlock_bh(&xs->lock);
>  	mutex_lock(&bond->ipsec_lock);
>  	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
>  		if (ipsec->xs == xs) {
> @@ -601,6 +602,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
>  		}
>  	}
>  	mutex_unlock(&bond->ipsec_lock);
> +	spin_lock_bh(&xs->lock);
>  }
>
>
> What do you think?

Re-locking doesn't look great, glancing at the code I don't see any
obvious better workarounds. Easiest fix would be to don't let the
drivers sleep in the callbacks and then we can go back to a spin lock.
Maybe nvidia people have better ideas, I'm not familiar with this
offload.

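The message above does not say why re-locking "doesn't look great". One general hazard with dropping and retaking a caller's lock inside a callback is that the caller can no longer assume the protected state is unchanged across the call. The single-threaded sketch below models that window; `struct state`, `other_path` and the `callback_*` helpers are hypothetical names for illustration only, and whether this specific hazard applies to the xfrm delete path is an assumption, not something the thread confirms.

```c
#include <assert.h>

/* Hypothetical model; the names below are illustrative, not kernel APIs. */
struct state {
    int locked;      /* 1 while the modelled state lock is held */
    int generation;  /* bumped whenever another path modifies the state */
};

static void state_lock(struct state *s)   { assert(!s->locked); s->locked = 1; }
static void state_unlock(struct state *s) { assert(s->locked);  s->locked = 0; }

/* Stands in for another CPU that wins the lock during the unlocked window. */
static void other_path(struct state *s)
{
    state_lock(s);
    s->generation++;
    state_unlock(s);
}

/* Callback that drops and retakes the caller's lock, as in the proposed diff. */
static void callback_relock(struct state *s)
{
    state_unlock(s);
    other_path(s);   /* anything may run while the lock is dropped */
    state_lock(s);
}

/* Callback that leaves the caller's lock alone. */
static void callback_no_relock(struct state *s) { (void)s; }

/* Returns 1 if the state changed behind the caller's back during the callback. */
static int caller_sees_change(void (*cb)(struct state *))
{
    struct state s = { 0, 0 };
    int before, after;

    state_lock(&s);
    before = s.generation;
    cb(&s);
    after = s.generation;
    state_unlock(&s);
    return before != after;
}
```

Calling `caller_sees_change(callback_relock)` reports a change while `caller_sees_change(callback_no_relock)` does not, even though the caller held the lock on both sides of the call.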
On Fri, Dec 13, 2024 at 07:31:27PM -0800, Jakub Kicinski wrote:
> On Fri, 13 Dec 2024 07:18:08 +0000 Hangbin Liu wrote:
> > On Thu, Dec 12, 2024 at 06:27:34AM -0800, Jakub Kicinski wrote:
> > > On Wed, 11 Dec 2024 07:11:25 +0000 Hangbin Liu wrote:
> > > > The first patch fixes the xfrm offload feature during setup active-backup
> > > > mode. The second patch add a ipsec offload testing.
> > >
> > > Looks like the test is too good, is there a fix pending somewhere for
> > > the BUG below? We can't merge the test before that:
> >
> > This should be a regression of 2aeeef906d5a ("bonding: change ipsec_lock from
> > spin lock to mutex"). As in xfrm_state_delete we called spin_lock_bh(&x->lock)
> > for the xfrm state delete.
> >
> > But I'm not sure if it's proper to release the spin lock in bond code.
> > This seems too specific.
> >
> > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> > index 7daeab67e7b5..69563bc958ca 100644
> > --- a/drivers/net/bonding/bond_main.c
> > +++ b/drivers/net/bonding/bond_main.c
> > @@ -592,6 +592,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
> >  	real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
> >  out:
> >  	netdev_put(real_dev, &tracker);
> > +	spin_unlock_bh(&xs->lock);
> >  	mutex_lock(&bond->ipsec_lock);
> >  	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
> >  		if (ipsec->xs == xs) {
> > @@ -601,6 +602,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
> >  		}
> >  	}
> >  	mutex_unlock(&bond->ipsec_lock);
> > +	spin_lock_bh(&xs->lock);
> >  }
> >
> >
> > What do you think?
>
> Re-locking doesn't look great, glancing at the code I don't see any
> obvious better workarounds. Easiest fix would be to don't let the
> drivers sleep in the callbacks and then we can go back to a spin lock.
> Maybe nvidia people have better ideas, I'm not familiar with this
> offload.

I don't know how to disable bonding sleeping since we use mutex_lock now.
Hi Jianbo, do you have any idea?

Thanks
Hangbin

On 1/2/2025 10:44 AM, Hangbin Liu wrote:
> On Fri, Dec 13, 2024 at 07:31:27PM -0800, Jakub Kicinski wrote:
>> On Fri, 13 Dec 2024 07:18:08 +0000 Hangbin Liu wrote:
>>> On Thu, Dec 12, 2024 at 06:27:34AM -0800, Jakub Kicinski wrote:
>>>> On Wed, 11 Dec 2024 07:11:25 +0000 Hangbin Liu wrote:
>>>>> The first patch fixes the xfrm offload feature during setup active-backup
>>>>> mode. The second patch add a ipsec offload testing.
>>>>
>>>> Looks like the test is too good, is there a fix pending somewhere for
>>>> the BUG below? We can't merge the test before that:
>>>
>>> This should be a regression of 2aeeef906d5a ("bonding: change ipsec_lock from
>>> spin lock to mutex"). As in xfrm_state_delete we called spin_lock_bh(&x->lock)
>>> for the xfrm state delete.
>>>
>>> But I'm not sure if it's proper to release the spin lock in bond code.
>>> This seems too specific.
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index 7daeab67e7b5..69563bc958ca 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -592,6 +592,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
>>>  	real_dev->xfrmdev_ops->xdo_dev_state_delete(xs);
>>>  out:
>>>  	netdev_put(real_dev, &tracker);
>>> +	spin_unlock_bh(&xs->lock);
>>>  	mutex_lock(&bond->ipsec_lock);
>>>  	list_for_each_entry(ipsec, &bond->ipsec_list, list) {
>>>  		if (ipsec->xs == xs) {
>>> @@ -601,6 +602,7 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
>>>  		}
>>>  	}
>>>  	mutex_unlock(&bond->ipsec_lock);
>>> +	spin_lock_bh(&xs->lock);
>>>  }
>>>
>>>
>>> What do you think?
>>
>> Re-locking doesn't look great, glancing at the code I don't see any
>> obvious better workarounds. Easiest fix would be to don't let the
>> drivers sleep in the callbacks and then we can go back to a spin lock.
>> Maybe nvidia people have better ideas, I'm not familiar with this
>> offload.
>
> I don't know how to disable bonding sleeping since we use mutex_lock now.
> Hi Jianbo, do you have any idea?
>

I think we should allow drivers to sleep in the callbacks. So, maybe it's
better to move driver's xdo_dev_state_delete out of state's spin lock.

Thanks!
Jianbo

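The suggestion above (call the driver's delete callback after the state spin lock is released rather than inside it) can be sketched as a user-space model. This is not the eventual kernel change: `struct sa`, `sleeping_driver_delete` and the two `delete_*` functions are hypothetical names for illustration, and the counter only stands in for atomic context.

```c
#include <assert.h>

/* Hypothetical model of one security association; not a kernel structure. */
struct sa {
    int atomic_depth;     /* >0 while the modelled state spin lock is held */
    int dead;             /* set under the lock when the state is deleted */
    int bad_sleeps;       /* sleeping callbacks invoked in atomic context */
    int offload_deleted;  /* set once the driver callback has run */
};

/* Stands in for a driver callback that may sleep (for example, takes a mutex). */
static void sleeping_driver_delete(struct sa *x)
{
    if (x->atomic_depth > 0)
        x->bad_sleeps++;
    x->offload_deleted = 1;
}

/* Shape seen in the trace: the driver callback runs inside the locked section. */
static void delete_inside_lock(struct sa *x)
{
    x->atomic_depth++;          /* lock */
    x->dead = 1;
    sleeping_driver_delete(x);
    x->atomic_depth--;          /* unlock */
}

/* Suggested shape: mark the state dead under the lock, call the driver after. */
static void delete_outside_lock(struct sa *x)
{
    x->atomic_depth++;          /* lock */
    x->dead = 1;
    x->atomic_depth--;          /* unlock */
    sleeping_driver_delete(x);  /* free to sleep here */
}
```

Both variants end with the state marked dead and the offload removed; only the second does so without invoking a sleeping callback in atomic context. A real change would also have to keep the state alive until the deferred callback finishes, which this model does not cover.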
On Thu, Jan 02, 2025 at 11:33:34AM +0800, Jianbo Liu wrote:
> > > Re-locking doesn't look great, glancing at the code I don't see any
> > > obvious better workarounds. Easiest fix would be to don't let the
> > > drivers sleep in the callbacks and then we can go back to a spin lock.
> > > Maybe nvidia people have better ideas, I'm not familiar with this
> > > offload.
> >
> > I don't know how to disable bonding sleeping since we use mutex_lock now.
> > Hi Jianbo, do you have any idea?
> >
>
> I think we should allow drivers to sleep in the callbacks. So, maybe it's
> better to move driver's xdo_dev_state_delete out of state's spin lock.

Thanks for the suggestion, let me have a try first.

Hangbin