diff mbox series

[net-next,2/3] net: ethernet: mtk_eth_soc: missing mutex

Message ID 20210418211145.21914-3-pablo@netfilter.org
State New
Headers show
Series mtk_ppe_offload fixes | expand

Commit Message

Pablo Neira Ayuso April 18, 2021, 9:11 p.m. UTC
Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and
stats to different workqueues") splits the workqueue per event type. Add
a mutex to serialize updates.

Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")
Reported-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 .../net/ethernet/mediatek/mtk_ppe_offload.c   | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

Comments

Jakub Kicinski April 19, 2021, 9:16 p.m. UTC | #1
On Sun, 18 Apr 2021 23:11:44 +0200 Pablo Neira Ayuso wrote:
> Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and

> stats to different workqueues") splits the workqueue per event type. Add

> a mutex to serialize updates.

> 

> Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")

> Reported-by: Frank Wunderlich <frank-w@public-files.de>

> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


This driver doesn't set unlocked_driver_cb, why is it expected to take
any locks? I thought the contract is that caller should hold rtnl.
Pablo Neira Ayuso April 19, 2021, 9:40 p.m. UTC | #2
On Mon, Apr 19, 2021 at 02:16:01PM -0700, Jakub Kicinski wrote:
> On Sun, 18 Apr 2021 23:11:44 +0200 Pablo Neira Ayuso wrote:

> > Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and

> > stats to different workqueues") splits the workqueue per event type. Add

> > a mutex to serialize updates.

> > 

> > Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")

> > Reported-by: Frank Wunderlich <frank-w@public-files.de>

> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

> 

> This driver doesn't set unlocked_driver_cb, why is it expected to take

> any locks? I thought the contract is that caller should hold rtnl.


No rtnl lock is held from the netfilter side, see:

42f1c2712090 ("netfilter: nftables: comment indirect serialization of
commit_mutex with rtnl_mutex")
Jakub Kicinski April 19, 2021, 9:43 p.m. UTC | #3
On Mon, 19 Apr 2021 23:40:19 +0200 Pablo Neira Ayuso wrote:
> On Mon, Apr 19, 2021 at 02:16:01PM -0700, Jakub Kicinski wrote:

> > On Sun, 18 Apr 2021 23:11:44 +0200 Pablo Neira Ayuso wrote:  

> > > Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and

> > > stats to different workqueues") splits the workqueue per event type. Add

> > > a mutex to serialize updates.

> > > 

> > > Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")

> > > Reported-by: Frank Wunderlich <frank-w@public-files.de>

> > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>  

> > 

> > This driver doesn't set unlocked_driver_cb, why is it expected to take

> > any locks? I thought the contract is that caller should hold rtnl.  

> 

> No rtnl lock is held from the netfilter side, see:

> 

> 42f1c2712090 ("netfilter: nftables: comment indirect serialization of

> commit_mutex with rtnl_mutex")


All the tc-centric drivers but mlx5 depend on rtnl_lock, is there
something preventing them from binding to netfilter blocks?
Pablo Neira Ayuso April 19, 2021, 9:54 p.m. UTC | #4
On Mon, Apr 19, 2021 at 02:43:41PM -0700, Jakub Kicinski wrote:
> On Mon, 19 Apr 2021 23:40:19 +0200 Pablo Neira Ayuso wrote:

> > On Mon, Apr 19, 2021 at 02:16:01PM -0700, Jakub Kicinski wrote:

> > > On Sun, 18 Apr 2021 23:11:44 +0200 Pablo Neira Ayuso wrote:  

> > > > Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and

> > > > stats to different workqueues") splits the workqueue per event type. Add

> > > > a mutex to serialize updates.

> > > > 

> > > > Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support")

> > > > Reported-by: Frank Wunderlich <frank-w@public-files.de>

> > > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>  

> > > 

> > > This driver doesn't set unlocked_driver_cb, why is it expected to take

> > > any locks? I thought the contract is that caller should hold rtnl.  

> > 

> > No rtnl lock is held from the netfilter side, see:

> > 

> > 42f1c2712090 ("netfilter: nftables: comment indirect serialization of

> > commit_mutex with rtnl_mutex")

> 

> All the tc-centric drivers but mlx5 depend on rtnl_lock, is there

> something preventing them from binding to netfilter blocks?


Only mlx5 and this driver support for TC_SETUP_FT.

This fix is targetting at the TC_SETUP_FT flow block type.
Jakub Kicinski April 19, 2021, 9:56 p.m. UTC | #5
On Mon, 19 Apr 2021 23:54:32 +0200 Pablo Neira Ayuso wrote:
> On Mon, Apr 19, 2021 at 02:43:41PM -0700, Jakub Kicinski wrote:

> > On Mon, 19 Apr 2021 23:40:19 +0200 Pablo Neira Ayuso wrote:  

> > > No rtnl lock is held from the netfilter side, see:

> > > 

> > > 42f1c2712090 ("netfilter: nftables: comment indirect serialization of

> > > commit_mutex with rtnl_mutex")  

> > 

> > All the tc-centric drivers but mlx5 depend on rtnl_lock, is there

> > something preventing them from binding to netfilter blocks?  

> 

> Only mlx5 and this driver support for TC_SETUP_FT.

> 

> This fix is targetting at the TC_SETUP_FT flow block type.


Ah, there is a separate block type. Thanks.
Frank Wunderlich April 20, 2021, 11:51 a.m. UTC | #6
Am 18. April 2021 23:11:44 MESZ schrieb Pablo Neira Ayuso <pablo@netfilter.org>:
>Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy

>and

>stats to different workqueues") splits the workqueue per event type.

>Add

>a mutex to serialize updates.

>

>Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading

>support")

>Reported-by: Frank Wunderlich <frank-w@public-files.de>

>Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>


Hi Pablo,

As far we tested it, the mutex does not avoid the hang. It looks a bit better,but at the end it was fixed by this Patch

https://patchwork.kernel.org/project/linux-mediatek/patch/20210417072905.207032-1-dqfext@gmail.com/

Alex did some tests without the lock here and it still looks stable. So it looks like it is not needed
regards Frank
Pablo Neira Ayuso April 20, 2021, 2:58 p.m. UTC | #7
On Tue, Apr 20, 2021 at 01:51:07PM +0200, Frank Wunderlich wrote:
> Am 18. April 2021 23:11:44 MESZ schrieb Pablo Neira Ayuso <pablo@netfilter.org>:

> >Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy

> >and

> >stats to different workqueues") splits the workqueue per event type.

> >Add

> >a mutex to serialize updates.

> >

> >Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading

> >support")

> >Reported-by: Frank Wunderlich <frank-w@public-files.de>

> >Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

> 

> Hi Pablo,

> 

> As far we tested it, the mutex does not avoid the hang. It looks a bit better,but at the end it was fixed by this Patch

> 

> https://patchwork.kernel.org/project/linux-mediatek/patch/20210417072905.207032-1-dqfext@gmail.com/

> 

> Alex did some tests without the lock here and it still looks stable.

> So it looks like it is not needed


It might be hard to trigger the race, but it's needed. There are
several workqueues racing to add and delete entries from the driver
flowtable representation which has no locks.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c
index 4975106fbc42..2759c875c791 100644
--- a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c
+++ b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c
@@ -391,6 +391,8 @@  mtk_flow_offload_stats(struct mtk_eth *eth, struct flow_cls_offload *f)
 	return 0;
 }
 
+static DEFINE_MUTEX(mtk_flow_offload_mutex);
+
 static int
 mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv)
 {
@@ -398,6 +400,7 @@  mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_pri
 	struct net_device *dev = cb_priv;
 	struct mtk_mac *mac = netdev_priv(dev);
 	struct mtk_eth *eth = mac->hw;
+	int err;
 
 	if (!tc_can_offload(dev))
 		return -EOPNOTSUPP;
@@ -405,18 +408,24 @@  mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_pri
 	if (type != TC_SETUP_CLSFLOWER)
 		return -EOPNOTSUPP;
 
+	mutex_lock(&mtk_flow_offload_mutex);
 	switch (cls->command) {
 	case FLOW_CLS_REPLACE:
-		return mtk_flow_offload_replace(eth, cls);
+		err = mtk_flow_offload_replace(eth, cls);
+		break;
 	case FLOW_CLS_DESTROY:
-		return mtk_flow_offload_destroy(eth, cls);
+		err = mtk_flow_offload_destroy(eth, cls);
+		break;
 	case FLOW_CLS_STATS:
-		return mtk_flow_offload_stats(eth, cls);
+		err = mtk_flow_offload_stats(eth, cls);
+		break;
 	default:
-		return -EOPNOTSUPP;
+		err = -EOPNOTSUPP;
+		break;
 	}
+	mutex_unlock(&mtk_flow_offload_mutex);
 
-	return 0;
+	return err;
 }
 
 static int