iavf: do not override the adapter state in the watchdog task

Message ID 20210305123856.14302-1-sassmann@kpanic.de
State New
Series iavf: do not override the adapter state in the watchdog task

Commit Message

Stefan Assmann March 5, 2021, 12:38 p.m. UTC
The iavf watchdog task overrides adapter->state to __IAVF_RESETTING
when it detects a pending reset. It then schedules iavf_reset_task(),
which takes care of the reset.

The reset task is capable of handling the reset without the watchdog
changing adapter->state first. In fact, we lose the state information
when the watchdog task prematurely changes the adapter state. This may
lead to a crash if iavf_remove() gets called before the reset task
runs.
In that case (if we were in state __IAVF_RUNNING previously),
iavf_remove() triggers iavf_close(), which fails to close the device
because of the incorrect state information.

This may result in a crash due to pending interrupts:
kernel BUG at drivers/pci/msi.c:357!
[...]
Call Trace:
 [<ffffffffbddf24dd>] pci_disable_msix+0x3d/0x50
 [<ffffffffc08d2a63>] iavf_reset_interrupt_capability+0x23/0x40 [iavf]
 [<ffffffffc08d312a>] iavf_remove+0x10a/0x350 [iavf]
 [<ffffffffbddd3359>] pci_device_remove+0x39/0xc0
 [<ffffffffbdeb492f>] __device_release_driver+0x7f/0xf0
 [<ffffffffbdeb49c3>] device_release_driver+0x23/0x30
 [<ffffffffbddcabb4>] pci_stop_bus_device+0x84/0xa0
 [<ffffffffbddcacc2>] pci_stop_and_remove_bus_device+0x12/0x20
 [<ffffffffbddf361f>] pci_iov_remove_virtfn+0xaf/0x160
 [<ffffffffbddf3bcc>] sriov_disable+0x3c/0xf0
 [<ffffffffbddf3ca3>] pci_disable_sriov+0x23/0x30
 [<ffffffffc0667365>] i40e_free_vfs+0x265/0x2d0 [i40e]
 [<ffffffffc0667624>] i40e_pci_sriov_configure+0x144/0x1f0 [i40e]
 [<ffffffffbddd5307>] sriov_numvfs_store+0x177/0x1d0
Code: 00 00 e8 3c 25 e3 ff 49 c7 86 88 08 00 00 00 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 8b 7b 28 e8 0d 44
RIP  [<ffffffffbbbf1068>] free_msi_irqs+0x188/0x190

The solution is to leave adapter->state untouched in
iavf_watchdog_task() and let the reset task handle the state transition.
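
The ordering problem described above can be illustrated with a small
user-space model. This is only a sketch, not driver code: the model_*
names, the three-value state enum and the irqs_requested flag are all
hypothetical simplifications, and the early return in model_close() is
an assumption standing in for "iavf_close() fails to close the device
because of the incorrect state information".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified states; not the driver's real enum. */
enum model_state { MODEL_RESETTING, MODEL_DOWN, MODEL_RUNNING };

struct model_adapter {
	enum model_state state;
	bool reset_pending;	/* stands in for IAVF_FLAG_RESET_PENDING */
	bool irqs_requested;	/* traffic interrupts still have handlers */
};

/*
 * Watchdog noticing a hardware reset. override_state selects the
 * behaviour before (true) or after (false) this patch.
 */
static void model_watchdog(struct model_adapter *a, bool override_state)
{
	if (override_state)
		a->state = MODEL_RESETTING;
	a->reset_pending = true;
}

/* Assumed close path: tears down only if the state says "running". */
static void model_close(struct model_adapter *a)
{
	if (a->state != MODEL_RUNNING)
		return;
	a->irqs_requested = false;
	a->state = MODEL_DOWN;
}

/*
 * Remove path: close first, then disable MSI-X. Returns true if
 * interrupts are still requested at that point, i.e. the "pending
 * interrupts" situation from the commit message.
 */
static bool model_remove_hits_bug(struct model_adapter *a)
{
	model_close(a);
	return a->irqs_requested;
}

/* Watchdog runs, then remove runs before the reset task does. */
static bool model_scenario(bool override_state)
{
	struct model_adapter a = {
		.state = MODEL_RUNNING,
		.reset_pending = false,
		.irqs_requested = true,
	};

	model_watchdog(&a, override_state);
	return model_remove_hits_bug(&a);
}

/* Usage example: compare the two watchdog behaviours. */
static void model_selfcheck(void)
{
	assert(model_scenario(true));	/* state overridden: close skipped */
	assert(!model_scenario(false));	/* state preserved: close succeeds */
}
```

Calling model_selfcheck() from a test main() exercises both cases. In
the model, the only difference between them is whether the watchdog
writes the state, which mirrors the one-line deletion in this patch.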

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Jankowski, Konrad0 July 6, 2021, 8:06 a.m. UTC | #1
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Stefan Assmann
> Sent: Friday, March 5, 2021 13:39
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; sassmann@kpanic.de
> Subject: [Intel-wired-lan] [PATCH] iavf: do not override the adapter state in
> the watchdog task

Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>

Patch

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 0a867d64d467..d9e3a70abb47 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1954,7 +1954,6 @@  static void iavf_watchdog_task(struct work_struct *work)
 		/* check for hw reset */
 	reg_val = rd32(hw, IAVF_VF_ARQLEN1) & IAVF_VF_ARQLEN1_ARQENABLE_MASK;
 	if (!reg_val) {
-		adapter->state = __IAVF_RESETTING;
 		adapter->flags |= IAVF_FLAG_RESET_PENDING;
 		adapter->aq_required = 0;
 		adapter->current_op = VIRTCHNL_OP_UNKNOWN;