mbox series

[pull,request,net,0/9] mlx5 fixes 2020-11-03

Message ID 20201103191830.60151-1-saeedm@nvidia.com
Headers show
Series mlx5 fixes 2020-11-03 | expand

Message

Saeed Mahameed Nov. 3, 2020, 7:18 p.m. UTC
Hi Jakub,

This series introduces some fixes to mlx5 driver.
For more information please see tag log below.

Please pull and let me know if there is any problem.

For -stable v5.1
 ('net/mlx5: Fix deletion of duplicate rules')

For -stable v5.4
 ('net/mlx5e: Fix modify header actions memory leak')

For -stable v5.8
 ('net/mlx5e: Protect encap route dev from concurrent release')

For -stable v5.9
 ('net/mlx5e: Fix VXLAN synchronization after function reload')
 ('net/mlx5e: Fix refcount leak on kTLS RX resync')
 ('net/mlx5e: Use spin_lock_bh for async_icosq_lock')
 ('net/mlx5e: Fix incorrect access of RCU-protected xdp_prog')
 ('net/mlx5: E-switch, Avoid extack error log for disabled vport')

Thanks,
Saeed.

---
The following changes since commit 9621618130bf7e83635367c13b9a6ee53935bb37:

  sfp: Fix error handing in sfp_probe() (2020-11-02 17:19:59 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5-fixes-2020-11-03

for you to fetch changes up to 9166a02a7b1c21de2155c91e3b69c805e2448267:

  net/mlx5e: Fix incorrect access of RCU-protected xdp_prog (2020-11-03 11:11:53 -0800)

----------------------------------------------------------------
mlx5-fixes-2020-11-03

----------------------------------------------------------------
Aya Levin (1):
      net/mlx5e: Fix VXLAN synchronization after function reload

Jianbo Liu (1):
      net/mlx5e: E-Switch, Offload all chain 0 priorities when modify header and forward action is not supported

Maor Dickman (1):
      net/mlx5e: Fix modify header actions memory leak

Maor Gottlieb (1):
      net/mlx5: Fix deletion of duplicate rules

Maxim Mikityanskiy (3):
      net/mlx5e: Use spin_lock_bh for async_icosq_lock
      net/mlx5e: Fix refcount leak on kTLS RX resync
      net/mlx5e: Fix incorrect access of RCU-protected xdp_prog

Parav Pandit (1):
      net/mlx5: E-switch, Avoid extack error log for disabled vport

Vlad Buslov (1):
      net/mlx5e: Protect encap route dev from concurrent release

 .../net/ethernet/mellanox/mlx5/core/en/rep/tc.c    |  6 +-
 .../net/ethernet/mellanox/mlx5/core/en/tc_tun.c    | 72 ++++++++++++++--------
 .../net/ethernet/mellanox/mlx5/core/en/xsk/setup.c |  4 +-
 .../net/ethernet/mellanox/mlx5/core/en/xsk/tx.c    |  4 +-
 .../ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c | 27 ++++----
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.h   |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    |  8 +--
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  |  2 -
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c  |  7 ++-
 .../ethernet/mellanox/mlx5/core/lib/fs_chains.c    |  3 -
 .../net/ethernet/mellanox/mlx5/core/lib/vxlan.c    | 23 +++++--
 .../net/ethernet/mellanox/mlx5/core/lib/vxlan.h    |  2 +
 14 files changed, 98 insertions(+), 65 deletions(-)

Comments

Jakub Kicinski Nov. 4, 2020, 10:03 p.m. UTC | #1
On Tue, 3 Nov 2020 11:18:22 -0800 Saeed Mahameed wrote:
> From: Maor Dickman <maord@nvidia.com>
> 
> Modify header actions are allocated during parse tc actions and only
> freed during the flow creation, however, on error flow the allocated
> memory is wrongly unfreed.
> 
> Fix this by calling dealloc_mod_hdr_actions in __mlx5e_add_fdb_flow
> and mlx5e_add_nic_flow error flow.
> 
> Fixes: d7e75a325cb2 ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) action"')
> Fixes: 2f4fe4cab073 ("net/mlx5e: Add offloading of NIC TC pedit (header re-write) actions")
> Signed-off-by: Maor Dickman <maord@nvidia.com>
> Reviewed-by: Paul Blakey <paulb@nvidia.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

Sorry, I didn't look at the warnings for you patches yesterday.

Fixes tag: Fixes: d7e75a325cb2 ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) action"')
Has these problem(s):
	- Subject has leading but no trailing quotes
	- Subject does not match target commit subject
	  Just use
		git log -1 --format='Fixes: %h ("%s")'
Jakub Kicinski Nov. 4, 2020, 10:59 p.m. UTC | #2
On Tue, 3 Nov 2020 11:18:25 -0800 Saeed Mahameed wrote:
> From: Maxim Mikityanskiy <maximmi@mellanox.com>
> 
> On resync, the driver calls inet_lookup_established
> (__inet6_lookup_established) that increases sk_refcnt of the socket. To
> decrease it, the driver set skb->destructor to sock_edemux. However, it
> didn't work well, because the TCP stack also sets this destructor for
> early demux, and the refcount gets decreased only once

Why is the stack doing early_demux if there is already a socket
assigned? Or is it not early_demux but something else?
Can you point us at the code?

IPv4:
	if (net->ipv4.sysctl_ip_early_demux &&
	    !skb_dst(skb) &&
	    !skb->sk &&                              <============
	    !ip_is_fragment(iph)) {
		const struct net_protocol *ipprot;
		int protocol = iph->protocol;

		ipprot = rcu_dereference(inet_protos[protocol]);
		if (ipprot && (edemux = READ_ONCE(ipprot->early_demux))) {
			err = INDIRECT_CALL_2(edemux, tcp_v4_early_demux,
					      udp_v4_early_demux, skb);
			if (unlikely(err))
				goto drop_error;
			/* must reload iph, skb->head might have changed */
			iph = ip_hdr(skb);
		}
	}

IPv6:
	if (net->ipv4.sysctl_ip_early_demux && !skb_dst(skb) && skb->sk == NULL) {
                                                                ~~~~~~~~~~~~~~~
		const struct inet6_protocol *ipprot;

		ipprot = rcu_dereference(inet6_protos[ipv6_hdr(skb)->nexthdr]);
		if (ipprot && (edemux = READ_ONCE(ipprot->early_demux)))
			INDIRECT_CALL_2(edemux, tcp_v6_early_demux,
					udp_v6_early_demux, skb);
	}
Maxim Mikityanskiy Nov. 5, 2020, 10:15 a.m. UTC | #3
On 2020-11-05 00:59, Jakub Kicinski wrote:
> On Tue, 3 Nov 2020 11:18:25 -0800 Saeed Mahameed wrote:

>> From: Maxim Mikityanskiy <maximmi@mellanox.com>

>>

>> On resync, the driver calls inet_lookup_established

>> (__inet6_lookup_established) that increases sk_refcnt of the socket. To

>> decrease it, the driver set skb->destructor to sock_edemux. However, it

>> didn't work well, because the TCP stack also sets this destructor for

>> early demux, and the refcount gets decreased only once

> 

> Why is the stack doing early_demux if there is already a socket

> assigned? Or is it not early_demux but something else?

> Can you point us at the code?


I thought this was the code that was in conflict with setting 
skb->destructor in the driver:

void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
{
         skb_orphan(skb);
         skb->sk = sk;
#ifdef CONFIG_INET
         if (unlikely(!sk_fullsock(sk))) {
                 skb->destructor = sock_edemux;
                 sock_hold(sk);
                 return;
         }
#endif

However, after taking another look, it seems that the root cause is 
somewhere else. This piece of code actually calls skb_orphan before 
reassigning the destructor.

I'll debug it further to try to find where the destructor is replaced or 
just not called.

> IPv4:

> 	if (net->ipv4.sysctl_ip_early_demux &&

> 	    !skb_dst(skb) &&

> 	    !skb->sk &&                              <============

> 	    !ip_is_fragment(iph)) {

> 		const struct net_protocol *ipprot;

> 		int protocol = iph->protocol;

> 

> 		ipprot = rcu_dereference(inet_protos[protocol]);

> 		if (ipprot && (edemux = READ_ONCE(ipprot->early_demux))) {

> 			err = INDIRECT_CALL_2(edemux, tcp_v4_early_demux,

> 					      udp_v4_early_demux, skb);

> 			if (unlikely(err))

> 				goto drop_error;

> 			/* must reload iph, skb->head might have changed */

> 			iph = ip_hdr(skb);

> 		}

> 	}

> 

> IPv6:

> 	if (net->ipv4.sysctl_ip_early_demux && !skb_dst(skb) && skb->sk == NULL) {

>                                                                  ~~~~~~~~~~~~~~~

> 		const struct inet6_protocol *ipprot;

> 

> 		ipprot = rcu_dereference(inet6_protos[ipv6_hdr(skb)->nexthdr]);

> 		if (ipprot && (edemux = READ_ONCE(ipprot->early_demux)))

> 			INDIRECT_CALL_2(edemux, tcp_v6_early_demux,

> 					udp_v6_early_demux, skb);

> 	}

>