Message ID | 20250409144133.2833606-1-cratiu@nvidia.com |
---|---|
Headers | show |
Series | xfrm & bonding: Correct use of xso.real_dev | expand |
On 4/9/25 17:41, Cosmin Ratiu wrote: > This patch series was motivated by fixing a few bugs in the bonding > driver related to xfrm state migration on device failover. > > struct xfrm_dev_offload has two net_device pointers: dev and real_dev. > The first one is the device the xfrm_state is offloaded on and the > second one is used by the bonding driver to manage the underlying device > xfrm_states are actually offloaded on. When bonding isn't used, the two > pointers are the same. > > This causes confusion in drivers: Which device pointer should they use? > If they want to support bonding, they need to only use real_dev and > never look at dev. > > Furthermore, real_dev is used without proper locking from multiple code > paths and changing it is dangerous. See commit [1] for example. > > This patch series clears things out by removing all uses of real_dev > from outside the bonding driver. > Then, the bonding driver is refactored to fix a couple of long standing > races and the original bug which motivated this patch series. > > [1] commit f8cde9805981 ("bonding: fix xfrm real_dev null pointer > dereference") > > v1 -> v2: > Added missing kdoc for various functions. > Made bond_ipsec_del_sa() use xso.real_dev instead of curr_active_slave. > > Cosmin Ratiu (6): > Cleaning up unnecessary uses of xso.real_dev: > net/mlx5: Avoid using xso.real_dev unnecessarily > xfrm: Use xdo.dev instead of xdo.real_dev > xfrm: Remove unneeded device check from validate_xmit_xfrm > Refactoring device operations to get an explicit device pointer: > xfrm: Add explicit dev to .xdo_dev_state_{add,delete,free} > Fixing a bonding xfrm state migration bug: > bonding: Mark active offloaded xfrm_states > Fixing long standing races in bonding: > bonding: Fix multiple long standing offload races > > Documentation/networking/xfrm_device.rst | 10 +- > drivers/net/bonding/bond_main.c | 113 +++++++++--------- > .../net/ethernet/chelsio/cxgb4/cxgb4_main.c | 20 ++-- > .../inline_crypto/ch_ipsec/chcr_ipsec.c | 18 ++- > .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c | 41 ++++--- > drivers/net/ethernet/intel/ixgbevf/ipsec.c | 21 ++-- > .../marvell/octeontx2/nic/cn10k_ipsec.c | 18 +-- > .../mellanox/mlx5/core/en_accel/ipsec.c | 28 ++--- > .../mellanox/mlx5/core/en_accel/ipsec.h | 1 + > .../net/ethernet/netronome/nfp/crypto/ipsec.c | 11 +- > drivers/net/netdevsim/ipsec.c | 15 ++- > include/linux/netdevice.h | 10 +- > include/net/xfrm.h | 8 ++ > net/xfrm/xfrm_device.c | 13 +- > net/xfrm/xfrm_state.c | 16 +-- > 15 files changed, 182 insertions(+), 161 deletions(-) > Thank you for following up on this. It's definitely not getting cleaner with a bond ptr in xfrm protected by two locks in different drivers, but it should do AFAICT. In case there is another version I'd add a big comment of the locking expectations for real_dev, and maybe in the future it can fully move to the bonding as well. For the set: Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Cheers, Nik
On Thu, 2025-04-10 at 14:10 +0300, Nikolay Aleksandrov wrote: > > Thank you for following up on this. It's definitely not getting > cleaner with a bond > ptr in xfrm protected by two locks in different drivers, but it > should do AFAICT. > In case there is another version I'd add a big comment of the locking > expectations > for real_dev, and maybe in the future it can fully move to the > bonding as well. > > For the set: > Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Thanks for the review, I'll send v3 with the comment added. I would have liked to push this a bit further with making real_dev private, but I didn't find a way and that can be done separately. Cosmin.