diff mbox series

[5.13] mwifiex: bring down link before deleting interface

Message ID 20210515024227.2159311-1-briannorris@chromium.org
State Accepted
Commit 1f9482aa8d412b4ba06ce6ab8e333fb8ca29a06e
Headers show
Series [5.13] mwifiex: bring down link before deleting interface | expand

Commit Message

Brian Norris May 15, 2021, 2:42 a.m. UTC
We can deadlock when rmmod'ing the driver or going through firmware
reset, because the cfg80211_unregister_wdev() has to bring down the link
for us, ... which then grab the same wiphy lock.

nl80211_del_interface() already handles a very similar case, with a nice
description:

        /*
         * We hold RTNL, so this is safe, without RTNL opencount cannot
         * reach 0, and thus the rdev cannot be deleted.
         *
         * We need to do it for the dev_close(), since that will call
         * the netdev notifiers, and we need to acquire the mutex there
         * but don't know if we get there from here or from some other
         * place (e.g. "ip link set ... down").
         */
        mutex_unlock(&rdev->wiphy.mtx);
...

Do similarly for mwifiex teardown, by ensuring we bring the link down
first.

Sample deadlock trace:

[  247.103516] INFO: task rmmod:2119 blocked for more than 123 seconds.
[  247.110630]       Not tainted 5.12.4 #5
[  247.115796] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  247.124557] task:rmmod           state:D stack:    0 pid: 2119 ppid:  2114 flags:0x00400208
[  247.133905] Call trace:
[  247.136644]  __switch_to+0x130/0x170
[  247.140643]  __schedule+0x714/0xa0c
[  247.144548]  schedule_preempt_disabled+0x88/0xf4
[  247.149714]  __mutex_lock_common+0x43c/0x750
[  247.154496]  mutex_lock_nested+0x5c/0x68
[  247.158884]  cfg80211_netdev_notifier_call+0x280/0x4e0 [cfg80211]
[  247.165769]  raw_notifier_call_chain+0x4c/0x78
[  247.170742]  call_netdevice_notifiers_info+0x68/0xa4
[  247.176305]  __dev_close_many+0x7c/0x138
[  247.180693]  dev_close_many+0x7c/0x10c
[  247.184893]  unregister_netdevice_many+0xfc/0x654
[  247.190158]  unregister_netdevice_queue+0xb4/0xe0
[  247.195424]  _cfg80211_unregister_wdev+0xa4/0x204 [cfg80211]
[  247.201816]  cfg80211_unregister_wdev+0x20/0x2c [cfg80211]
[  247.208016]  mwifiex_del_virtual_intf+0xc8/0x188 [mwifiex]
[  247.214174]  mwifiex_uninit_sw+0x158/0x1b0 [mwifiex]
[  247.219747]  mwifiex_remove_card+0x38/0xa0 [mwifiex]
[  247.225316]  mwifiex_pcie_remove+0xd0/0xe0 [mwifiex_pcie]
[  247.231451]  pci_device_remove+0x50/0xe0
[  247.235849]  device_release_driver_internal+0x110/0x1b0
[  247.241701]  driver_detach+0x5c/0x9c
[  247.245704]  bus_remove_driver+0x84/0xb8
[  247.250095]  driver_unregister+0x3c/0x60
[  247.254486]  pci_unregister_driver+0x2c/0x90
[  247.259267]  cleanup_module+0x18/0xcdc [mwifiex_pcie]

Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-wireless/98392296-40ee-6300-369c-32e16cff3725@gmail.com/
Link: https://lore.kernel.org/linux-wireless/ab4d00ce52f32bd8e45ad0448a44737e@bewaar.me/
Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
Reported-by: dave@bewaar.me
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Brian Norris <briannorris@chromium.org>
---
 drivers/net/wireless/marvell/mwifiex/main.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

--

Comments

Maximilian Luz May 15, 2021, 11:36 a.m. UTC | #1
On 5/15/21 4:42 AM, Brian Norris wrote:
> We can deadlock when rmmod'ing the driver or going through firmware
> reset, because the cfg80211_unregister_wdev() has to bring down the link
> for us, ... which then grab the same wiphy lock.
> 
> nl80211_del_interface() already handles a very similar case, with a nice
> description:
> 
>          /*
>           * We hold RTNL, so this is safe, without RTNL opencount cannot
>           * reach 0, and thus the rdev cannot be deleted.
>           *
>           * We need to do it for the dev_close(), since that will call
>           * the netdev notifiers, and we need to acquire the mutex there
>           * but don't know if we get there from here or from some other
>           * place (e.g. "ip link set ... down").
>           */
>          mutex_unlock(&rdev->wiphy.mtx);
> ...
> 
> Do similarly for mwifiex teardown, by ensuring we bring the link down
> first.
> 
> Sample deadlock trace:
> 
> [  247.103516] INFO: task rmmod:2119 blocked for more than 123 seconds.
> [  247.110630]       Not tainted 5.12.4 #5
> [  247.115796] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  247.124557] task:rmmod           state:D stack:    0 pid: 2119 ppid:  2114 flags:0x00400208
> [  247.133905] Call trace:
> [  247.136644]  __switch_to+0x130/0x170
> [  247.140643]  __schedule+0x714/0xa0c
> [  247.144548]  schedule_preempt_disabled+0x88/0xf4
> [  247.149714]  __mutex_lock_common+0x43c/0x750
> [  247.154496]  mutex_lock_nested+0x5c/0x68
> [  247.158884]  cfg80211_netdev_notifier_call+0x280/0x4e0 [cfg80211]
> [  247.165769]  raw_notifier_call_chain+0x4c/0x78
> [  247.170742]  call_netdevice_notifiers_info+0x68/0xa4
> [  247.176305]  __dev_close_many+0x7c/0x138
> [  247.180693]  dev_close_many+0x7c/0x10c
> [  247.184893]  unregister_netdevice_many+0xfc/0x654
> [  247.190158]  unregister_netdevice_queue+0xb4/0xe0
> [  247.195424]  _cfg80211_unregister_wdev+0xa4/0x204 [cfg80211]
> [  247.201816]  cfg80211_unregister_wdev+0x20/0x2c [cfg80211]
> [  247.208016]  mwifiex_del_virtual_intf+0xc8/0x188 [mwifiex]
> [  247.214174]  mwifiex_uninit_sw+0x158/0x1b0 [mwifiex]
> [  247.219747]  mwifiex_remove_card+0x38/0xa0 [mwifiex]
> [  247.225316]  mwifiex_pcie_remove+0xd0/0xe0 [mwifiex_pcie]
> [  247.231451]  pci_device_remove+0x50/0xe0
> [  247.235849]  device_release_driver_internal+0x110/0x1b0
> [  247.241701]  driver_detach+0x5c/0x9c
> [  247.245704]  bus_remove_driver+0x84/0xb8
> [  247.250095]  driver_unregister+0x3c/0x60
> [  247.254486]  pci_unregister_driver+0x2c/0x90
> [  247.259267]  cleanup_module+0x18/0xcdc [mwifiex_pcie]
> 
> Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver")
> Cc: stable@vger.kernel.org
> Link: https://lore.kernel.org/linux-wireless/98392296-40ee-6300-369c-32e16cff3725@gmail.com/
> Link: https://lore.kernel.org/linux-wireless/ab4d00ce52f32bd8e45ad0448a44737e@bewaar.me/
> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> Reported-by: dave@bewaar.me
> Cc: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Brian Norris <briannorris@chromium.org>

Thanks!

Tested-by: Maximilian Luz <luzmaximilian@gmail.com>

> ---
>   drivers/net/wireless/marvell/mwifiex/main.c | 13 ++++++++++---
>   1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
> index 529dfd8b7ae8..17399d4aa129 100644
> --- a/drivers/net/wireless/marvell/mwifiex/main.c
> +++ b/drivers/net/wireless/marvell/mwifiex/main.c
> @@ -1445,11 +1445,18 @@ static void mwifiex_uninit_sw(struct mwifiex_adapter *adapter)
>   		if (!priv)
>   			continue;
>   		rtnl_lock();
> -		wiphy_lock(adapter->wiphy);
>   		if (priv->netdev &&
> -		    priv->wdev.iftype != NL80211_IFTYPE_UNSPECIFIED)
> +		    priv->wdev.iftype != NL80211_IFTYPE_UNSPECIFIED) {
> +			/*
> +			 * Close the netdev now, because if we do it later, the
> +			 * netdev notifiers will need to acquire the wiphy lock
> +			 * again --> deadlock.
> +			 */
> +			dev_close(priv->wdev.netdev);
> +			wiphy_lock(adapter->wiphy);
>   			mwifiex_del_virtual_intf(adapter->wiphy, &priv->wdev);
> -		wiphy_unlock(adapter->wiphy);
> +			wiphy_unlock(adapter->wiphy);
> +		}
>   		rtnl_unlock();
>   	}
>   
>
Dave Olsthoorn May 15, 2021, 3:10 p.m. UTC | #2
On 2021-05-15 04:42, Brian Norris wrote:
> We can deadlock when rmmod'ing the driver or going through firmware
> reset, because the cfg80211_unregister_wdev() has to bring down the 
> link
> for us, ... which then grab the same wiphy lock.
> 
> nl80211_del_interface() already handles a very similar case, with a 
> nice
> description:
> 
>         /*
>          * We hold RTNL, so this is safe, without RTNL opencount cannot
>          * reach 0, and thus the rdev cannot be deleted.
>          *
>          * We need to do it for the dev_close(), since that will call
>          * the netdev notifiers, and we need to acquire the mutex there
>          * but don't know if we get there from here or from some other
>          * place (e.g. "ip link set ... down").
>          */
>         mutex_unlock(&rdev->wiphy.mtx);
> ...
> 
> Do similarly for mwifiex teardown, by ensuring we bring the link down
> first.
> 
> Sample deadlock trace:
> 
> [  247.103516] INFO: task rmmod:2119 blocked for more than 123 seconds.
> [  247.110630]       Not tainted 5.12.4 #5
> [  247.115796] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [  247.124557] task:rmmod           state:D stack:    0 pid: 2119
> ppid:  2114 flags:0x00400208
> [  247.133905] Call trace:
> [  247.136644]  __switch_to+0x130/0x170
> [  247.140643]  __schedule+0x714/0xa0c
> [  247.144548]  schedule_preempt_disabled+0x88/0xf4
> [  247.149714]  __mutex_lock_common+0x43c/0x750
> [  247.154496]  mutex_lock_nested+0x5c/0x68
> [  247.158884]  cfg80211_netdev_notifier_call+0x280/0x4e0 [cfg80211]
> [  247.165769]  raw_notifier_call_chain+0x4c/0x78
> [  247.170742]  call_netdevice_notifiers_info+0x68/0xa4
> [  247.176305]  __dev_close_many+0x7c/0x138
> [  247.180693]  dev_close_many+0x7c/0x10c
> [  247.184893]  unregister_netdevice_many+0xfc/0x654
> [  247.190158]  unregister_netdevice_queue+0xb4/0xe0
> [  247.195424]  _cfg80211_unregister_wdev+0xa4/0x204 [cfg80211]
> [  247.201816]  cfg80211_unregister_wdev+0x20/0x2c [cfg80211]
> [  247.208016]  mwifiex_del_virtual_intf+0xc8/0x188 [mwifiex]
> [  247.214174]  mwifiex_uninit_sw+0x158/0x1b0 [mwifiex]
> [  247.219747]  mwifiex_remove_card+0x38/0xa0 [mwifiex]
> [  247.225316]  mwifiex_pcie_remove+0xd0/0xe0 [mwifiex_pcie]
> [  247.231451]  pci_device_remove+0x50/0xe0
> [  247.235849]  device_release_driver_internal+0x110/0x1b0
> [  247.241701]  driver_detach+0x5c/0x9c
> [  247.245704]  bus_remove_driver+0x84/0xb8
> [  247.250095]  driver_unregister+0x3c/0x60
> [  247.254486]  pci_unregister_driver+0x2c/0x90
> [  247.259267]  cleanup_module+0x18/0xcdc [mwifiex_pcie]
> 
> Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the 
> driver")
> Cc: stable@vger.kernel.org
> Link:
> https://lore.kernel.org/linux-wireless/98392296-40ee-6300-369c-32e16cff3725@gmail.com/
> Link:
> https://lore.kernel.org/linux-wireless/ab4d00ce52f32bd8e45ad0448a44737e@bewaar.me/
> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> Reported-by: Dave Olsthoorn <dave@bewaar.me>

Thanks!

The firmware still seems to crash quicker than previously, but that's a 
unrelated problem.

Tested-by: Dave Olsthoorn <dave@bewaar.me>

> Cc: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> ---
>  drivers/net/wireless/marvell/mwifiex/main.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/wireless/marvell/mwifiex/main.c
> b/drivers/net/wireless/marvell/mwifiex/main.c
> index 529dfd8b7ae8..17399d4aa129 100644
> --- a/drivers/net/wireless/marvell/mwifiex/main.c
> +++ b/drivers/net/wireless/marvell/mwifiex/main.c
> @@ -1445,11 +1445,18 @@ static void mwifiex_uninit_sw(struct
> mwifiex_adapter *adapter)
>  		if (!priv)
>  			continue;
>  		rtnl_lock();
> -		wiphy_lock(adapter->wiphy);
>  		if (priv->netdev &&
> -		    priv->wdev.iftype != NL80211_IFTYPE_UNSPECIFIED)
> +		    priv->wdev.iftype != NL80211_IFTYPE_UNSPECIFIED) {
> +			/*
> +			 * Close the netdev now, because if we do it later, the
> +			 * netdev notifiers will need to acquire the wiphy lock
> +			 * again --> deadlock.
> +			 */
> +			dev_close(priv->wdev.netdev);
> +			wiphy_lock(adapter->wiphy);
>  			mwifiex_del_virtual_intf(adapter->wiphy, &priv->wdev);
> -		wiphy_unlock(adapter->wiphy);
> +			wiphy_unlock(adapter->wiphy);
> +		}
>  		rtnl_unlock();
>  	}
Brian Norris June 11, 2021, 12:18 a.m. UTC | #3
On Fri, May 14, 2021 at 7:45 PM Brian Norris <briannorris@chromium.org> wrote:
>
> We can deadlock when rmmod'ing the driver or going through firmware
> reset, because the cfg80211_unregister_wdev() has to bring down the link
> for us, ... which then grab the same wiphy lock.
...
> Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver")
> Cc: stable@vger.kernel.org
> Link: https://lore.kernel.org/linux-wireless/98392296-40ee-6300-369c-32e16cff3725@gmail.com/
> Link: https://lore.kernel.org/linux-wireless/ab4d00ce52f32bd8e45ad0448a44737e@bewaar.me/
> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
> Reported-by: dave@bewaar.me
> Cc: Johannes Berg <johannes@sipsolutions.net>
> Signed-off-by: Brian Norris <briannorris@chromium.org>

Ping - is this going to get merged? It's a 5.12 regression, and we
have multiple people complaining about it (and they tested the fix
too!).

Thanks,
Brian
Kalle Valo June 11, 2021, 9:53 a.m. UTC | #4
Brian Norris <briannorris@chromium.org> writes:

> On Fri, May 14, 2021 at 7:45 PM Brian Norris <briannorris@chromium.org> wrote:

>>

>> We can deadlock when rmmod'ing the driver or going through firmware

>> reset, because the cfg80211_unregister_wdev() has to bring down the link

>> for us, ... which then grab the same wiphy lock.

> ...

>> Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver")

>> Cc: stable@vger.kernel.org

>> Link:

>> https://lore.kernel.org/linux-wireless/98392296-40ee-6300-369c-32e16cff3725@gmail.com/

>> Link:

>> https://lore.kernel.org/linux-wireless/ab4d00ce52f32bd8e45ad0448a44737e@bewaar.me/

>> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>

>> Reported-by: dave@bewaar.me

>> Cc: Johannes Berg <johannes@sipsolutions.net>

>> Signed-off-by: Brian Norris <briannorris@chromium.org>

>

> Ping - is this going to get merged? It's a 5.12 regression, and we

> have multiple people complaining about it (and they tested the fix

> too!).


Thanks for the ping, this got piled up under all the -next patches and I
missed it. I'll look at it now.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
Kalle Valo June 11, 2021, 10:03 a.m. UTC | #5
Brian Norris <briannorris@chromium.org> wrote:

> We can deadlock when rmmod'ing the driver or going through firmware

> reset, because the cfg80211_unregister_wdev() has to bring down the link

> for us, ... which then grab the same wiphy lock.

> 

> nl80211_del_interface() already handles a very similar case, with a nice

> description:

> 

>         /*

>          * We hold RTNL, so this is safe, without RTNL opencount cannot

>          * reach 0, and thus the rdev cannot be deleted.

>          *

>          * We need to do it for the dev_close(), since that will call

>          * the netdev notifiers, and we need to acquire the mutex there

>          * but don't know if we get there from here or from some other

>          * place (e.g. "ip link set ... down").

>          */

>         mutex_unlock(&rdev->wiphy.mtx);

> ...

> 

> Do similarly for mwifiex teardown, by ensuring we bring the link down

> first.

> 

> Sample deadlock trace:

> 

> [  247.103516] INFO: task rmmod:2119 blocked for more than 123 seconds.

> [  247.110630]       Not tainted 5.12.4 #5

> [  247.115796] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

> [  247.124557] task:rmmod           state:D stack:    0 pid: 2119 ppid:  2114 flags:0x00400208

> [  247.133905] Call trace:

> [  247.136644]  __switch_to+0x130/0x170

> [  247.140643]  __schedule+0x714/0xa0c

> [  247.144548]  schedule_preempt_disabled+0x88/0xf4

> [  247.149714]  __mutex_lock_common+0x43c/0x750

> [  247.154496]  mutex_lock_nested+0x5c/0x68

> [  247.158884]  cfg80211_netdev_notifier_call+0x280/0x4e0 [cfg80211]

> [  247.165769]  raw_notifier_call_chain+0x4c/0x78

> [  247.170742]  call_netdevice_notifiers_info+0x68/0xa4

> [  247.176305]  __dev_close_many+0x7c/0x138

> [  247.180693]  dev_close_many+0x7c/0x10c

> [  247.184893]  unregister_netdevice_many+0xfc/0x654

> [  247.190158]  unregister_netdevice_queue+0xb4/0xe0

> [  247.195424]  _cfg80211_unregister_wdev+0xa4/0x204 [cfg80211]

> [  247.201816]  cfg80211_unregister_wdev+0x20/0x2c [cfg80211]

> [  247.208016]  mwifiex_del_virtual_intf+0xc8/0x188 [mwifiex]

> [  247.214174]  mwifiex_uninit_sw+0x158/0x1b0 [mwifiex]

> [  247.219747]  mwifiex_remove_card+0x38/0xa0 [mwifiex]

> [  247.225316]  mwifiex_pcie_remove+0xd0/0xe0 [mwifiex_pcie]

> [  247.231451]  pci_device_remove+0x50/0xe0

> [  247.235849]  device_release_driver_internal+0x110/0x1b0

> [  247.241701]  driver_detach+0x5c/0x9c

> [  247.245704]  bus_remove_driver+0x84/0xb8

> [  247.250095]  driver_unregister+0x3c/0x60

> [  247.254486]  pci_unregister_driver+0x2c/0x90

> [  247.259267]  cleanup_module+0x18/0xcdc [mwifiex_pcie]

> 

> Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver")

> Cc: stable@vger.kernel.org

> Link: https://lore.kernel.org/linux-wireless/98392296-40ee-6300-369c-32e16cff3725@gmail.com/

> Link: https://lore.kernel.org/linux-wireless/ab4d00ce52f32bd8e45ad0448a44737e@bewaar.me/

> Reported-by: Maximilian Luz <luzmaximilian@gmail.com>

> Reported-by: dave@bewaar.me

> Cc: Johannes Berg <johannes@sipsolutions.net>

> Signed-off-by: Brian Norris <briannorris@chromium.org>

> Tested-by: Maximilian Luz <luzmaximilian@gmail.com>

> Tested-by: Dave Olsthoorn <dave@bewaar.me>


Patch applied to wireless-drivers.git, thanks.

1f9482aa8d41 mwifiex: bring down link before deleting interface

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/20210515024227.2159311-1-briannorris@chromium.org/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
diff mbox series

Patch

diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index 529dfd8b7ae8..17399d4aa129 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -1445,11 +1445,18 @@  static void mwifiex_uninit_sw(struct mwifiex_adapter *adapter)
 		if (!priv)
 			continue;
 		rtnl_lock();
-		wiphy_lock(adapter->wiphy);
 		if (priv->netdev &&
-		    priv->wdev.iftype != NL80211_IFTYPE_UNSPECIFIED)
+		    priv->wdev.iftype != NL80211_IFTYPE_UNSPECIFIED) {
+			/*
+			 * Close the netdev now, because if we do it later, the
+			 * netdev notifiers will need to acquire the wiphy lock
+			 * again --> deadlock.
+			 */
+			dev_close(priv->wdev.netdev);
+			wiphy_lock(adapter->wiphy);
 			mwifiex_del_virtual_intf(adapter->wiphy, &priv->wdev);
-		wiphy_unlock(adapter->wiphy);
+			wiphy_unlock(adapter->wiphy);
+		}
 		rtnl_unlock();
 	}