diff mbox series

[net-next] net: sched: remove redundant 'rtnl_held' argument

Message ID 20201127151205.23492-1-vladbu@nvidia.com
State New
Headers show
Series [net-next] net: sched: remove redundant 'rtnl_held' argument | expand

Commit Message

Vlad Buslov Nov. 27, 2020, 3:12 p.m. UTC
Functions tfilter_notify_chain() and tcf_get_next_proto() are always called
with rtnl lock held in current implementation. Moreover, attempting to call
them without rtnl lock would cause a warning down the call chain in
function __tcf_get_next_proto() that requires the lock to be held by
callers. Remove the 'rtnl_held' argument in order to simplify the code and
make rtnl lock requirement explicit.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
---
 include/net/pkt_cls.h |  2 +-
 net/sched/cls_api.c   | 18 ++++++++----------
 net/sched/sch_api.c   |  4 ++--
 3 files changed, 11 insertions(+), 13 deletions(-)

Comments

Jakub Kicinski Dec. 1, 2020, 2:52 a.m. UTC | #1
On Fri, 27 Nov 2020 17:12:05 +0200 Vlad Buslov wrote:
> @@ -2262,7 +2260,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,

>  

>  	if (prio == 0) {

>  		tfilter_notify_chain(net, skb, block, q, parent, n,

> -				     chain, RTM_DELTFILTER, rtnl_held);

> +				     chain, RTM_DELTFILTER);

>  		tcf_chain_flush(chain, rtnl_held);

>  		err = 0;

>  		goto errout;


Hum. This looks off.
Vlad Buslov Dec. 1, 2020, 7:55 a.m. UTC | #2
On Tue 01 Dec 2020 at 04:52, Jakub Kicinski <kuba@kernel.org> wrote:
> On Fri, 27 Nov 2020 17:12:05 +0200 Vlad Buslov wrote:

>> @@ -2262,7 +2260,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,

>>  

>>  	if (prio == 0) {

>>  		tfilter_notify_chain(net, skb, block, q, parent, n,

>> -				     chain, RTM_DELTFILTER, rtnl_held);

>> +				     chain, RTM_DELTFILTER);

>>  		tcf_chain_flush(chain, rtnl_held);

>>  		err = 0;

>>  		goto errout;

>

> Hum. This looks off.


Hi Jakub,

Prio==0 means user requests to flush whole chain. In such case rtnl lock
is obtained earlier in tc_del_tfilter():

	/* Take rtnl mutex if flushing whole chain, block is shared (no qdisc
	 * found), qdisc is not unlocked, classifier type is not specified,
	 * classifier is not unlocked.
	 */
	if (!prio ||
	    (q && !(q->ops->cl_ops->flags & QDISC_CLASS_OPS_DOIT_UNLOCKED)) ||
	    !tcf_proto_is_unlocked(name)) {
		rtnl_held = true;
		rtnl_lock();
	}
Jakub Kicinski Dec. 1, 2020, 5:03 p.m. UTC | #3
On Tue, 1 Dec 2020 09:55:37 +0200 Vlad Buslov wrote:
> On Tue 01 Dec 2020 at 04:52, Jakub Kicinski <kuba@kernel.org> wrote:

> > On Fri, 27 Nov 2020 17:12:05 +0200 Vlad Buslov wrote:  

> >> @@ -2262,7 +2260,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,

> >>  

> >>  	if (prio == 0) {

> >>  		tfilter_notify_chain(net, skb, block, q, parent, n,

> >> -				     chain, RTM_DELTFILTER, rtnl_held);

> >> +				     chain, RTM_DELTFILTER);

> >>  		tcf_chain_flush(chain, rtnl_held);

> >>  		err = 0;

> >>  		goto errout;  

> >

> > Hum. This looks off.  

> 

> Hi Jakub,

> 

> Prio==0 means user requests to flush whole chain. In such case rtnl lock

> is obtained earlier in tc_del_tfilter():

> 

> 	/* Take rtnl mutex if flushing whole chain, block is shared (no qdisc

> 	 * found), qdisc is not unlocked, classifier type is not specified,

> 	 * classifier is not unlocked.

> 	 */

> 	if (!prio ||

> 	    (q && !(q->ops->cl_ops->flags & QDISC_CLASS_OPS_DOIT_UNLOCKED)) ||

> 	    !tcf_proto_is_unlocked(name)) {

> 		rtnl_held = true;

> 		rtnl_lock();

> 	}

> 


Makes sense, although seems a little fragile. Why not put a true in
there, in that case?

Do you have a larger plan here? The motivation seems a little unclear
if I'm completely honest. Are you dropping the rtnl_held from all callers 
of __tcf_get_next_proto() just to save the extra argument / typing?
That's nice but there's also value in the API being consistent.
Vlad Buslov Dec. 1, 2020, 6:39 p.m. UTC | #4
On Tue 01 Dec 2020 at 19:03, Jakub Kicinski <kuba@kernel.org> wrote:
> On Tue, 1 Dec 2020 09:55:37 +0200 Vlad Buslov wrote:

>> On Tue 01 Dec 2020 at 04:52, Jakub Kicinski <kuba@kernel.org> wrote:

>> > On Fri, 27 Nov 2020 17:12:05 +0200 Vlad Buslov wrote:  

>> >> @@ -2262,7 +2260,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,

>> >>  

>> >>  	if (prio == 0) {

>> >>  		tfilter_notify_chain(net, skb, block, q, parent, n,

>> >> -				     chain, RTM_DELTFILTER, rtnl_held);

>> >> +				     chain, RTM_DELTFILTER);

>> >>  		tcf_chain_flush(chain, rtnl_held);

>> >>  		err = 0;

>> >>  		goto errout;  

>> >

>> > Hum. This looks off.  

>> 

>> Hi Jakub,

>> 

>> Prio==0 means user requests to flush whole chain. In such case rtnl lock

>> is obtained earlier in tc_del_tfilter():

>> 

>> 	/* Take rtnl mutex if flushing whole chain, block is shared (no qdisc

>> 	 * found), qdisc is not unlocked, classifier type is not specified,

>> 	 * classifier is not unlocked.

>> 	 */

>> 	if (!prio ||

>> 	    (q && !(q->ops->cl_ops->flags & QDISC_CLASS_OPS_DOIT_UNLOCKED)) ||

>> 	    !tcf_proto_is_unlocked(name)) {

>> 		rtnl_held = true;

>> 		rtnl_lock();

>> 	}

>> 

>

> Makes sense, although seems a little fragile. Why not put a true in

> there, in that case?


Because, as I described in commit message, the function will trigger an
assertion if called without rtnl lock, so passing rtnl_held==false
argument makes no sense and is confusing for the reader.

>

> Do you have a larger plan here? The motivation seems a little unclear

> if I'm completely honest. Are you dropping the rtnl_held from all callers 

> of __tcf_get_next_proto() just to save the extra argument / typing?


The plan is to have 'rtnl_held' arg for functions that can be called
without rtnl lock and not have such argument for functions that require
caller to hold rtnl :)

To elaborate further regarding motivation for this patch: some time ago
I received an email asking why I have rtnl_held arg in function that has
ASSERT_RTNL() in one of its dependencies. I re-read the code and
determined that it was a leftover from earlier version and is not needed
in code that was eventually upstreamed. Removing the argument was an
easy decision since Jiri hates those and repeatedly asked me to minimize
usage of such function arguments, so I didn't expect it to be
controversial.

> That's nice but there's also value in the API being consistent.


Cls_api has multiple functions that don't have 'rtnl_held' argument.
Only functions that can work without rtnl lock have it. Why do you
suggest it is inconsistent to remove it here?
Jakub Kicinski Dec. 1, 2020, 7:24 p.m. UTC | #5
On Tue, 1 Dec 2020 20:39:16 +0200 Vlad Buslov wrote:
> On Tue 01 Dec 2020 at 19:03, Jakub Kicinski <kuba@kernel.org> wrote:

> > On Tue, 1 Dec 2020 09:55:37 +0200 Vlad Buslov wrote:  

> >> On Tue 01 Dec 2020 at 04:52, Jakub Kicinski <kuba@kernel.org> wrote:  

> >> > On Fri, 27 Nov 2020 17:12:05 +0200 Vlad Buslov wrote:    

> >> >> @@ -2262,7 +2260,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,

> >> >>  

> >> >>  	if (prio == 0) {

> >> >>  		tfilter_notify_chain(net, skb, block, q, parent, n,

> >> >> -				     chain, RTM_DELTFILTER, rtnl_held);

> >> >> +				     chain, RTM_DELTFILTER);

> >> >>  		tcf_chain_flush(chain, rtnl_held);

> >> >>  		err = 0;

> >> >>  		goto errout;    

> >> >

> >> > Hum. This looks off.    

> >> 

> >> Hi Jakub,

> >> 

> >> Prio==0 means user requests to flush whole chain. In such case rtnl lock

> >> is obtained earlier in tc_del_tfilter():

> >> 

> >> 	/* Take rtnl mutex if flushing whole chain, block is shared (no qdisc

> >> 	 * found), qdisc is not unlocked, classifier type is not specified,

> >> 	 * classifier is not unlocked.

> >> 	 */

> >> 	if (!prio ||

> >> 	    (q && !(q->ops->cl_ops->flags & QDISC_CLASS_OPS_DOIT_UNLOCKED)) ||

> >> 	    !tcf_proto_is_unlocked(name)) {

> >> 		rtnl_held = true;

> >> 		rtnl_lock();

> >> 	}

> >>   

> >

> > Makes sense, although seems a little fragile. Why not put a true in

> > there, in that case?  

> 

> Because, as I described in commit message, the function will trigger an

> assertion if called without rtnl lock, so passing rtnl_held==false

> argument makes no sense and is confusing for the reader.


The assumption being that tcf_ functions without the arg must hold the
lock?

> > Do you have a larger plan here? The motivation seems a little unclear

> > if I'm completely honest. Are you dropping the rtnl_held from all callers 

> > of __tcf_get_next_proto() just to save the extra argument / typing?  

> 

> The plan is to have 'rtnl_held' arg for functions that can be called

> without rtnl lock and not have such argument for functions that require

> caller to hold rtnl :)

> 

> To elaborate further regarding motivation for this patch: some time ago

> I received an email asking why I have rtnl_held arg in function that has

> ASSERT_RTNL() in one of its dependencies. I re-read the code and

> determined that it was a leftover from earlier version and is not needed

> in code that was eventually upstreamed. Removing the argument was an

> easy decision since Jiri hates those and repeatedly asked me to minimize

> usage of such function arguments, so I didn't expect it to be

> controversial.

> 

> > That's nice but there's also value in the API being consistent.  

> 

> Cls_api has multiple functions that don't have 'rtnl_held' argument.

> Only functions that can work without rtnl lock have it. Why do you

> suggest it is inconsistent to remove it here?


I see. I was just trying to figure out if you have a plan for larger
restructuring to improve the situation. I also dislike to arguments
being passed around in a seemingly random fashion. Removing or adding
them to a single function does not move the needle much, IMO.

But since the patch is correct I'll apply it now, thanks!
Vlad Buslov Dec. 2, 2020, 8:32 a.m. UTC | #6
On Tue 01 Dec 2020 at 21:24, Jakub Kicinski <kuba@kernel.org> wrote:
> On Tue, 1 Dec 2020 20:39:16 +0200 Vlad Buslov wrote:

>> On Tue 01 Dec 2020 at 19:03, Jakub Kicinski <kuba@kernel.org> wrote:

>> > On Tue, 1 Dec 2020 09:55:37 +0200 Vlad Buslov wrote:  

>> >> On Tue 01 Dec 2020 at 04:52, Jakub Kicinski <kuba@kernel.org> wrote:  

>> >> > On Fri, 27 Nov 2020 17:12:05 +0200 Vlad Buslov wrote:    

>> >> >> @@ -2262,7 +2260,7 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,

>> >> >>  

>> >> >>  	if (prio == 0) {

>> >> >>  		tfilter_notify_chain(net, skb, block, q, parent, n,

>> >> >> -				     chain, RTM_DELTFILTER, rtnl_held);

>> >> >> +				     chain, RTM_DELTFILTER);

>> >> >>  		tcf_chain_flush(chain, rtnl_held);

>> >> >>  		err = 0;

>> >> >>  		goto errout;    

>> >> >

>> >> > Hum. This looks off.    

>> >> 

>> >> Hi Jakub,

>> >> 

>> >> Prio==0 means user requests to flush whole chain. In such case rtnl lock

>> >> is obtained earlier in tc_del_tfilter():

>> >> 

>> >> 	/* Take rtnl mutex if flushing whole chain, block is shared (no qdisc

>> >> 	 * found), qdisc is not unlocked, classifier type is not specified,

>> >> 	 * classifier is not unlocked.

>> >> 	 */

>> >> 	if (!prio ||

>> >> 	    (q && !(q->ops->cl_ops->flags & QDISC_CLASS_OPS_DOIT_UNLOCKED)) ||

>> >> 	    !tcf_proto_is_unlocked(name)) {

>> >> 		rtnl_held = true;

>> >> 		rtnl_lock();

>> >> 	}

>> >>   

>> >

>> > Makes sense, although seems a little fragile. Why not put a true in

>> > there, in that case?  

>> 

>> Because, as I described in commit message, the function will trigger an

>> assertion if called without rtnl lock, so passing rtnl_held==false

>> argument makes no sense and is confusing for the reader.

>

> The assumption being that tcf_ functions without the arg must hold the

> lock?


Yes.

>

>> > Do you have a larger plan here? The motivation seems a little unclear

>> > if I'm completely honest. Are you dropping the rtnl_held from all callers 

>> > of __tcf_get_next_proto() just to save the extra argument / typing?  

>> 

>> The plan is to have 'rtnl_held' arg for functions that can be called

>> without rtnl lock and not have such argument for functions that require

>> caller to hold rtnl :)

>> 

>> To elaborate further regarding motivation for this patch: some time ago

>> I received an email asking why I have rtnl_held arg in function that has

>> ASSERT_RTNL() in one of its dependencies. I re-read the code and

>> determined that it was a leftover from earlier version and is not needed

>> in code that was eventually upstreamed. Removing the argument was an

>> easy decision since Jiri hates those and repeatedly asked me to minimize

>> usage of such function arguments, so I didn't expect it to be

>> controversial.

>> 

>> > That's nice but there's also value in the API being consistent.  

>> 

>> Cls_api has multiple functions that don't have 'rtnl_held' argument.

>> Only functions that can work without rtnl lock have it. Why do you

>> suggest it is inconsistent to remove it here?

>

> I see. I was just trying to figure out if you have a plan for larger

> restructuring to improve the situation. I also dislike to arguments

> being passed around in a seemingly random fashion. Removing or adding

> them to a single function does not move the needle much, IMO.


No, this is not part of larger effort. I would like to stop passing
'rtnl_held' everywhere, but for that I need other drivers that implement
TC offload to stop requiring rtnl lock, which would allow removing
rtnl_held from tcf_proto_ops callbacks.

>

> But since the patch is correct I'll apply it now, thanks!


Thank you!
diff mbox series

Patch

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 133f9ad4d4f9..0f2a9c44171c 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -48,7 +48,7 @@  void tcf_chain_put_by_act(struct tcf_chain *chain);
 struct tcf_chain *tcf_get_next_chain(struct tcf_block *block,
 				     struct tcf_chain *chain);
 struct tcf_proto *tcf_get_next_proto(struct tcf_chain *chain,
-				     struct tcf_proto *tp, bool rtnl_held);
+				     struct tcf_proto *tp);
 void tcf_block_netif_keep_dst(struct tcf_block *block);
 int tcf_block_get(struct tcf_block **p_block,
 		  struct tcf_proto __rcu **p_filter_chain, struct Qdisc *q,
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index ff3e943febaa..37b77bd30974 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -991,13 +991,12 @@  __tcf_get_next_proto(struct tcf_chain *chain, struct tcf_proto *tp)
  */
 
 struct tcf_proto *
-tcf_get_next_proto(struct tcf_chain *chain, struct tcf_proto *tp,
-		   bool rtnl_held)
+tcf_get_next_proto(struct tcf_chain *chain, struct tcf_proto *tp)
 {
 	struct tcf_proto *tp_next = __tcf_get_next_proto(chain, tp);
 
 	if (tp)
-		tcf_proto_put(tp, rtnl_held, NULL);
+		tcf_proto_put(tp, true, NULL);
 
 	return tp_next;
 }
@@ -1924,15 +1923,14 @@  static int tfilter_del_notify(struct net *net, struct sk_buff *oskb,
 static void tfilter_notify_chain(struct net *net, struct sk_buff *oskb,
 				 struct tcf_block *block, struct Qdisc *q,
 				 u32 parent, struct nlmsghdr *n,
-				 struct tcf_chain *chain, int event,
-				 bool rtnl_held)
+				 struct tcf_chain *chain, int event)
 {
 	struct tcf_proto *tp;
 
-	for (tp = tcf_get_next_proto(chain, NULL, rtnl_held);
-	     tp; tp = tcf_get_next_proto(chain, tp, rtnl_held))
+	for (tp = tcf_get_next_proto(chain, NULL);
+	     tp; tp = tcf_get_next_proto(chain, tp))
 		tfilter_notify(net, oskb, n, tp, block,
-			       q, parent, NULL, event, false, rtnl_held);
+			       q, parent, NULL, event, false, true);
 }
 
 static void tfilter_put(struct tcf_proto *tp, void *fh)
@@ -2262,7 +2260,7 @@  static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
 
 	if (prio == 0) {
 		tfilter_notify_chain(net, skb, block, q, parent, n,
-				     chain, RTM_DELTFILTER, rtnl_held);
+				     chain, RTM_DELTFILTER);
 		tcf_chain_flush(chain, rtnl_held);
 		err = 0;
 		goto errout;
@@ -2895,7 +2893,7 @@  static int tc_ctl_chain(struct sk_buff *skb, struct nlmsghdr *n,
 		break;
 	case RTM_DELCHAIN:
 		tfilter_notify_chain(net, skb, block, q, parent, n,
-				     chain, RTM_DELTFILTER, true);
+				     chain, RTM_DELTFILTER);
 		/* Flush the chain first as the user requested chain removal. */
 		tcf_chain_flush(chain, true);
 		/* In case the chain was successfully deleted, put a reference
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 1a2d2471b078..51cb553e4317 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1943,8 +1943,8 @@  static int tc_bind_class_walker(struct Qdisc *q, unsigned long cl,
 	     chain = tcf_get_next_chain(block, chain)) {
 		struct tcf_proto *tp;
 
-		for (tp = tcf_get_next_proto(chain, NULL, true);
-		     tp; tp = tcf_get_next_proto(chain, tp, true)) {
+		for (tp = tcf_get_next_proto(chain, NULL);
+		     tp; tp = tcf_get_next_proto(chain, tp)) {
 			struct tcf_bind_args arg = {};
 
 			arg.w.fn = tcf_node_bind;