diff mbox series

[v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup

Message ID 20210903064013.9842-1-zhoufeng.zf@bytedance.com
State Superseded
Headers show
Series [v2] ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup | expand

Commit Message

Feng Zhou Sept. 3, 2021, 6:40 a.m. UTC
From: Feng Zhou <zhoufeng.zf@bytedance.com>

The ixgbe driver currently triggers a NULL pointer dereference on some
machines (online CPUs < 63). This is because the maximum value of
num_xdp_queues is nr_cpu_ids; the code is in "ixgbe_set_rss_queues".

Here is how the problem reproduces:
On a machine with fewer than 63 online CPUs, the user sets num_queues to
63 through ethtool. In "ixgbe_set_channels":

    adapter->ring_feature[RING_F_FDIR].limit = count;

so the limit becomes 63.
When the user loads an XDP program, "ixgbe_set_rss_queues" sets the
queue counts:

    adapter->num_rx_queues = rss_i;
    adapter->num_tx_queues = rss_i;
    adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);

and rss_i's value comes from:

    f = &adapter->ring_feature[RING_F_FDIR];
    rss_i = f->indices = f->limit;

So "num_rx_queues" > "num_xdp_queues", and when "ixgbe_xdp_setup" runs:

    for (i = 0; i < adapter->num_rx_queues; i++)
        if (adapter->xdp_ring[i]->xsk_umem)

the NULL xdp_ring entries lead to a panic.
Call trace:
[exception RIP: ixgbe_xdp+368]
RIP: ffffffffc02a76a0  RSP: ffff9fe16202f8d0  RFLAGS: 00010297
RAX: 0000000000000000  RBX: 0000000000000020  RCX: 0000000000000000
RDX: 0000000000000000  RSI: 000000000000001c  RDI: ffffffffa94ead90
RBP: ffff92f8f24c0c18   R8: 0000000000000000   R9: 0000000000000000
R10: ffff9fe16202f830  R11: 0000000000000000  R12: ffff92f8f24c0000
R13: ffff9fe16202fc01  R14: 000000000000000a  R15: ffffffffc02a7530
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
 8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
 9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c

Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for AF_XDP")
Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
---
Updates since v1:
- Fix the "ixgbe_max_channels" callback so that it does not allow setting
  the number of queues higher than num_online_cpus().
  More details can be seen here:
  https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20210817075407.11961-1-zhoufeng.zf@bytedance.com/
  Thanks to Maciej Fijalkowski for the advice.

 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 8 ++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

Comments

Paul Menzel Sept. 6, 2021, 6:37 a.m. UTC | #1
Dear Feng,


Am 03.09.21 um 08:40 schrieb Feng zhou:

(If you care, in your email client, your last name does not start with a 
capital letter.)

> From: Feng Zhou <zhoufeng.zf@bytedance.com>
>
> The ixgbe driver currently generates a NULL pointer dereference with
> some machine (online cpus < 63). This is due to the fact that the
> maximum value of num_xdp_queues is nr_cpu_ids. Code is in
> "ixgbe_set_rss_queues"".
>
> Here's how the problem repeats itself:
> Some machine (online cpus < 63), And user set num_queues to 63 through
> ethtool. Code is in the "ixgbe_set_channels",
> adapter->ring_feature[RING_F_FDIR].limit = count;

For better legibility, you might want to indent code (blocks) by four
spaces and add blank lines around it (also below).

> It becames 63.

becomes

> When user use xdp, "ixgbe_set_rss_queues" will set queues num.
> adapter->num_rx_queues = rss_i;
> adapter->num_tx_queues = rss_i;
> adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
> And rss_i's value is from
> f = &adapter->ring_feature[RING_F_FDIR];
> rss_i = f->indices = f->limit;
> So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
> for (i = 0; i < adapter->num_rx_queues; i++)
> 	if (adapter->xdp_ring[i]->xsk_umem)
> lead to panic.

lead*s*?

> Call trace:
> [...]

Please describe the fix in the commit message.

> [...]
Feng Zhou Sept. 6, 2021, 7:49 a.m. UTC | #2
On 2021/9/6 at 2:37 PM, Paul Menzel wrote:
> [...]

Thanks for your advice. I will modify the commit message in v3.
Jesper Dangaard Brouer Sept. 6, 2021, 11:31 a.m. UTC | #3
Hi Feng and Jason,

Please notice that you are both developing patches that change the ixgbe 
driver in related areas.

Jason's patch:
  Subject: [PATCH v7] ixgbe: let the xdpdrv work with more than 64 cpus
 
https://lore.kernel.org/all/20210901101206.50274-1-kerneljasonxing@gmail.com/

We might need both as this patch looks like a fix to a panic, and 
Jason's patch allows XDP on ixgbe to work on machines with more than 64 
CPUs.

-Jesper

On 06/09/2021 09.49, Feng Zhou wrote:
> [...]
Jason Xing Sept. 8, 2021, 2:36 a.m. UTC | #4
On Mon, Sep 6, 2021 at 7:32 PM Jesper Dangaard Brouer
<jbrouer@redhat.com> wrote:
> Hi Feng and Jason,
>
> Please notice that you are both developing patches that change the ixgbe
> driver in related areas.

Thanks for noticing. The two patches do different things, though both
relate to XDP on ixgbe.
Jason

> [...]

Patch

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 4ceaca0f6ce3..21321d164708 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -3204,7 +3204,7 @@  static unsigned int ixgbe_max_channels(struct ixgbe_adapter *adapter)
 		max_combined = ixgbe_max_rss_indices(adapter);
 	}
 
-	return max_combined;
+	return min_t(int, max_combined, num_online_cpus());
 }
 
 static void ixgbe_get_channels(struct net_device *dev,
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 14aea40da50f..5db496cc5070 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -10112,6 +10112,7 @@  static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
 	struct ixgbe_adapter *adapter = netdev_priv(dev);
 	struct bpf_prog *old_prog;
 	bool need_reset;
+	int num_queues;
 
 	if (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED)
 		return -EINVAL;
@@ -10161,11 +10162,14 @@  static int ixgbe_xdp_setup(struct net_device *dev, struct bpf_prog *prog)
 	/* Kick start the NAPI context if there is an AF_XDP socket open
 	 * on that queue id. This so that receiving will start.
 	 */
-	if (need_reset && prog)
-		for (i = 0; i < adapter->num_rx_queues; i++)
+	if (need_reset && prog) {
+		num_queues = min_t(int, adapter->num_rx_queues,
+			adapter->num_xdp_queues);
+		for (i = 0; i < num_queues; i++)
 			if (adapter->xdp_ring[i]->xsk_pool)
 				(void)ixgbe_xsk_wakeup(adapter->netdev, i,
 						       XDP_WAKEUP_RX);
+	}
 
 	return 0;
 }