diff mbox series

[net-next,4/6] ionic: add queue lock around open and stop

Message ID 20210827185512.50206-5-snelson@pensando.io
State New
Headers show
Series ionic: queue mgmt updates | expand

Commit Message

Shannon Nelson Aug. 27, 2021, 6:55 p.m. UTC
Add the queue configuration lock to ionic_open() and
ionic_stop() so that they don't collide with other in parallel
queue configuration actions such as MTU changes as can be
demonstrated with a tight loop of ifup/change-mtu/ifdown.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
---
 drivers/net/ethernet/pensando/ionic/ionic_lif.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski Aug. 28, 2021, 12:39 a.m. UTC | #1
On Fri, 27 Aug 2021 11:55:10 -0700 Shannon Nelson wrote:
> Add the queue configuration lock to ionic_open() and

> ionic_stop() so that they don't collide with other in parallel

> queue configuration actions such as MTU changes as can be

> demonstrated with a tight loop of ifup/change-mtu/ifdown.


Say more? how are up/down/change mtu not under rtnl_lock?
Shannon Nelson Aug. 28, 2021, 5:12 a.m. UTC | #2
On 8/27/21 5:39 PM, Jakub Kicinski wrote:
> On Fri, 27 Aug 2021 11:55:10 -0700 Shannon Nelson wrote:

>> Add the queue configuration lock to ionic_open() and

>> ionic_stop() so that they don't collide with other in parallel

>> queue configuration actions such as MTU changes as can be

>> demonstrated with a tight loop of ifup/change-mtu/ifdown.

> Say more? how are up/down/change mtu not under rtnl_lock?

Sorry, that commit message didn't get updated as it should have. The MTU 
change played with the timing of actions just right, but wasn't the 
culprit.  The actual issue was that the watchdog and the ionic_stop 
collided: ionic_stop had started taking the queues down but without 
grabbing the mutex, and the watchdog timer went off and ran the 
link_check which grabbed the mutex and tried to bring them back up 
again.  This didn't break anything in the driver, but confused the NIC 
firmware and left the interface non-operational. This was cleared with 
another ifdown/ifup cycle.

I can repost with a better commit description.

sln
diff mbox series

Patch

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index d69c80c3eaa2..1d31b9385849 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -2233,9 +2233,11 @@  static int ionic_open(struct net_device *netdev)
 	if (test_and_clear_bit(IONIC_LIF_F_BROKEN, lif->state))
 		netdev_info(netdev, "clearing broken state\n");
 
+	mutex_lock(&lif->queue_lock);
+
 	err = ionic_txrx_alloc(lif);
 	if (err)
-		return err;
+		goto err_unlock;
 
 	err = ionic_txrx_init(lif);
 	if (err)
@@ -2256,12 +2258,15 @@  static int ionic_open(struct net_device *netdev)
 			goto err_txrx_deinit;
 	}
 
+	mutex_unlock(&lif->queue_lock);
 	return 0;
 
 err_txrx_deinit:
 	ionic_txrx_deinit(lif);
 err_txrx_free:
 	ionic_txrx_free(lif);
+err_unlock:
+	mutex_unlock(&lif->queue_lock);
 	return err;
 }
 
@@ -2281,9 +2286,11 @@  static int ionic_stop(struct net_device *netdev)
 	if (test_bit(IONIC_LIF_F_FW_RESET, lif->state))
 		return 0;
 
+	mutex_lock(&lif->queue_lock);
 	ionic_stop_queues(lif);
 	ionic_txrx_deinit(lif);
 	ionic_txrx_free(lif);
+	mutex_unlock(&lif->queue_lock);
 
 	return 0;
 }