Message ID | 20210418211145.21914-3-pablo@netfilter.org |
---|---|
State | New |
Headers | show |
Series | mtk_ppe_offload fixes | expand |
On Sun, 18 Apr 2021 23:11:44 +0200 Pablo Neira Ayuso wrote: > Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and > stats to different workqueues") splits the workqueue per event type. Add > a mutex to serialize updates. > > Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support") > Reported-by: Frank Wunderlich <frank-w@public-files.de> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> This driver doesn't set unlocked_driver_cb, why is it expected to take any locks? I thought the contract is that caller should hold rtnl.
On Mon, Apr 19, 2021 at 02:16:01PM -0700, Jakub Kicinski wrote: > On Sun, 18 Apr 2021 23:11:44 +0200 Pablo Neira Ayuso wrote: > > Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and > > stats to different workqueues") splits the workqueue per event type. Add > > a mutex to serialize updates. > > > > Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support") > > Reported-by: Frank Wunderlich <frank-w@public-files.de> > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > This driver doesn't set unlocked_driver_cb, why is it expected to take > any locks? I thought the contract is that caller should hold rtnl. No rtnl lock is held from the netfilter side, see: 42f1c2712090 ("netfilter: nftables: comment indirect serialization of commit_mutex with rtnl_mutex")
On Mon, 19 Apr 2021 23:40:19 +0200 Pablo Neira Ayuso wrote: > On Mon, Apr 19, 2021 at 02:16:01PM -0700, Jakub Kicinski wrote: > > On Sun, 18 Apr 2021 23:11:44 +0200 Pablo Neira Ayuso wrote: > > > Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and > > > stats to different workqueues") splits the workqueue per event type. Add > > > a mutex to serialize updates. > > > > > > Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support") > > > Reported-by: Frank Wunderlich <frank-w@public-files.de> > > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > > > This driver doesn't set unlocked_driver_cb, why is it expected to take > > any locks? I thought the contract is that caller should hold rtnl. > > No rtnl lock is held from the netfilter side, see: > > 42f1c2712090 ("netfilter: nftables: comment indirect serialization of > commit_mutex with rtnl_mutex") All the tc-centric drivers but mlx5 depend on rtnl_lock, is there something preventing them from binding to netfilter blocks?
On Mon, Apr 19, 2021 at 02:43:41PM -0700, Jakub Kicinski wrote: > On Mon, 19 Apr 2021 23:40:19 +0200 Pablo Neira Ayuso wrote: > > On Mon, Apr 19, 2021 at 02:16:01PM -0700, Jakub Kicinski wrote: > > > On Sun, 18 Apr 2021 23:11:44 +0200 Pablo Neira Ayuso wrote: > > > > Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and > > > > stats to different workqueues") splits the workqueue per event type. Add > > > > a mutex to serialize updates. > > > > > > > > Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support") > > > > Reported-by: Frank Wunderlich <frank-w@public-files.de> > > > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > > > > > This driver doesn't set unlocked_driver_cb, why is it expected to take > > > any locks? I thought the contract is that caller should hold rtnl. > > > > No rtnl lock is held from the netfilter side, see: > > > > 42f1c2712090 ("netfilter: nftables: comment indirect serialization of > > commit_mutex with rtnl_mutex") > > All the tc-centric drivers but mlx5 depend on rtnl_lock, is there > something preventing them from binding to netfilter blocks? Only mlx5 and this driver support for TC_SETUP_FT. This fix is targetting at the TC_SETUP_FT flow block type.
On Mon, 19 Apr 2021 23:54:32 +0200 Pablo Neira Ayuso wrote: > On Mon, Apr 19, 2021 at 02:43:41PM -0700, Jakub Kicinski wrote: > > On Mon, 19 Apr 2021 23:40:19 +0200 Pablo Neira Ayuso wrote: > > > No rtnl lock is held from the netfilter side, see: > > > > > > 42f1c2712090 ("netfilter: nftables: comment indirect serialization of > > > commit_mutex with rtnl_mutex") > > > > All the tc-centric drivers but mlx5 depend on rtnl_lock, is there > > something preventing them from binding to netfilter blocks? > > Only mlx5 and this driver support for TC_SETUP_FT. > > This fix is targetting at the TC_SETUP_FT flow block type. Ah, there is a separate block type. Thanks.
Am 18. April 2021 23:11:44 MESZ schrieb Pablo Neira Ayuso <pablo@netfilter.org>: >Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy >and >stats to different workqueues") splits the workqueue per event type. >Add >a mutex to serialize updates. > >Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading >support") >Reported-by: Frank Wunderlich <frank-w@public-files.de> >Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Hi Pablo, As far we tested it, the mutex does not avoid the hang. It looks a bit better,but at the end it was fixed by this Patch https://patchwork.kernel.org/project/linux-mediatek/patch/20210417072905.207032-1-dqfext@gmail.com/ Alex did some tests without the lock here and it still looks stable. So it looks like it is not needed regards Frank
On Tue, Apr 20, 2021 at 01:51:07PM +0200, Frank Wunderlich wrote: > Am 18. April 2021 23:11:44 MESZ schrieb Pablo Neira Ayuso <pablo@netfilter.org>: > >Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy > >and > >stats to different workqueues") splits the workqueue per event type. > >Add > >a mutex to serialize updates. > > > >Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading > >support") > >Reported-by: Frank Wunderlich <frank-w@public-files.de> > >Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > Hi Pablo, > > As far we tested it, the mutex does not avoid the hang. It looks a bit better,but at the end it was fixed by this Patch > > https://patchwork.kernel.org/project/linux-mediatek/patch/20210417072905.207032-1-dqfext@gmail.com/ > > Alex did some tests without the lock here and it still looks stable. > So it looks like it is not needed It might be hard to trigger the race, but it's needed. There are several workqueues racing to add and delete entries from the driver flowtable representation which has no locks.
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c index 4975106fbc42..2759c875c791 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c @@ -391,6 +391,8 @@ mtk_flow_offload_stats(struct mtk_eth *eth, struct flow_cls_offload *f) return 0; } +static DEFINE_MUTEX(mtk_flow_offload_mutex); + static int mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv) { @@ -398,6 +400,7 @@ mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_pri struct net_device *dev = cb_priv; struct mtk_mac *mac = netdev_priv(dev); struct mtk_eth *eth = mac->hw; + int err; if (!tc_can_offload(dev)) return -EOPNOTSUPP; @@ -405,18 +408,24 @@ mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_pri if (type != TC_SETUP_CLSFLOWER) return -EOPNOTSUPP; + mutex_lock(&mtk_flow_offload_mutex); switch (cls->command) { case FLOW_CLS_REPLACE: - return mtk_flow_offload_replace(eth, cls); + err = mtk_flow_offload_replace(eth, cls); + break; case FLOW_CLS_DESTROY: - return mtk_flow_offload_destroy(eth, cls); + err = mtk_flow_offload_destroy(eth, cls); + break; case FLOW_CLS_STATS: - return mtk_flow_offload_stats(eth, cls); + err = mtk_flow_offload_stats(eth, cls); + break; default: - return -EOPNOTSUPP; + err = -EOPNOTSUPP; + break; } + mutex_unlock(&mtk_flow_offload_mutex); - return 0; + return err; } static int
Patch 2ed37183abb7 ("netfilter: flowtable: separate replace, destroy and stats to different workqueues") splits the workqueue per event type. Add a mutex to serialize updates. Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support") Reported-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- .../net/ethernet/mediatek/mtk_ppe_offload.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)