
[net-next] net: openvswitch: fix to make sure flow_lookup() is not preempted

Message ID 160259304349.181017.7492443293310262978.stgit@ebuild
State New
Series [net-next] net: openvswitch: fix to make sure flow_lookup() is not preempted

Commit Message

Eelco Chaudron Oct. 13, 2020, 12:44 p.m. UTC
The flow_lookup() function uses per-CPU variables, which must be accessed
with preemption disabled. This is fine in the general NAPI use case, where
local BH is disabled, but the function is also called from the netlink
context, which is preemptible. The patch below makes sure that preemption
is disabled in the netlink path as well.

Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on usage")
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
 net/openvswitch/flow_table.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
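
For context, the per-CPU access the commit message refers to happens inside
flow_lookup() itself: the pre-fix code resolves this CPU's slot of
ma->masks_usage_cntr with this_cpu_ptr() and bumps it inside a u64_stats
write section (see the snippet quoted further down the thread). Below is a
minimal, stand-alone sketch of that pattern; the demo_* names are made up
for illustration and are not the real OVS structures.

/* Illustration only, not OVS code.  The demo_* names are invented; the
 * shape (per-CPU counters plus a u64_stats_sync) mirrors what flow_lookup()
 * does with ma->masks_usage_cntr and ma->syncp.
 */
#include <linux/types.h>
#include <linux/percpu.h>
#include <linux/u64_stats_sync.h>

struct demo_mask_array {
	struct u64_stats_sync syncp;	/* shared by every CPU (pre-v2 layout) */
	u64 __percpu *usage_cntr;	/* one counter slot per CPU */
};

static void demo_count_hit(struct demo_mask_array *ma)
{
	u64 *cnt;

	/* this_cpu_ptr() resolves to the slot of the CPU we run on *now*.
	 * If the caller is preemptible (as the netlink path is), the task
	 * can be migrated right after this line and keep writing the old
	 * CPU's slot, racing with whatever runs there -- hence the
	 * preempt_disable() in the patch below.
	 */
	cnt = this_cpu_ptr(ma->usage_cntr);

	u64_stats_update_begin(&ma->syncp);
	(*cnt)++;
	u64_stats_update_end(&ma->syncp);
}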

Comments

Sebastian Andrzej Siewior Oct. 13, 2020, 12:53 p.m. UTC | #1
On 2020-10-13 14:44:19 [+0200], Eelco Chaudron wrote:
> The flow_lookup() function uses per CPU variables, which must not be
> preempted. However, this is fine in the general napi use case where
> the local BH is disabled. But, it's also called in the netlink
> context, which is preemptible. The below patch makes sure that even
> in the netlink path, preemption is disabled.
> 
> Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on usage")
> Reported-by: Juri Lelli <jlelli@redhat.com>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
>  net/openvswitch/flow_table.c |   10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
> index 87c286ad660e..16289386632b 100644
> --- a/net/openvswitch/flow_table.c
> +++ b/net/openvswitch/flow_table.c
> @@ -850,9 +850,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl,
>  	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
>  	u32 __always_unused n_mask_hit;
>  	u32 __always_unused n_cache_hit;
> +	struct sw_flow *flow;
>  	u32 index = 0;
>  
> -	return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
> +	/* This function gets called through the netlink interface and therefore
> +	 * is preemptible. However, the flow_lookup() function needs to be called
> +	 * with preemption disabled due to CPU specific variables.
> +	 */

Once again. u64_stats_update_begin(). What protects you against
concurrent access?

> +	preempt_disable();
> +	flow = flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
> +	preempt_enable();
> +	return flow;
>  }
>  
>  struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,
> 

Sebastian
Eelco Chaudron Oct. 14, 2020, 10:44 a.m. UTC | #2
On 13 Oct 2020, at 14:53, Sebastian Andrzej Siewior wrote:

> On 2020-10-13 14:44:19 [+0200], Eelco Chaudron wrote:
>> The flow_lookup() function uses per CPU variables, which must not be
>> preempted. However, this is fine in the general napi use case where
>> the local BH is disabled. But, it's also called in the netlink
>> context, which is preemptible. The below patch makes sure that even
>> in the netlink path, preemption is disabled.
>>
>> Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on usage")
>> Reported-by: Juri Lelli <jlelli@redhat.com>
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>> ---
>>  net/openvswitch/flow_table.c |   10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
>> index 87c286ad660e..16289386632b 100644
>> --- a/net/openvswitch/flow_table.c
>> +++ b/net/openvswitch/flow_table.c
>> @@ -850,9 +850,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl,
>>  	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
>>  	u32 __always_unused n_mask_hit;
>>  	u32 __always_unused n_cache_hit;
>> +	struct sw_flow *flow;
>>  	u32 index = 0;
>>
>> -	return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
>> +	/* This function gets called through the netlink interface and therefore
>> +	 * is preemptible. However, the flow_lookup() function needs to be called
>> +	 * with preemption disabled due to CPU specific variables.
>> +	 */
>
> Once again. u64_stats_update_begin(). What protects you against
> concurrent access?

Thanks, Sebastian, for repeating this. I had gone over the seqcount code
and thought it would be fine for my use case. However, based on this
comment I went over it again and found the logic I was constantly
missing :)

My idea is to send a v2 patch that, in addition to the preempt_disable(),
also makes the seqcount per CPU. I noticed other parts of the networking
stack doing it the same way. The change would look something like this:

@@ -731,7 +732,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
                                    u32 *n_cache_hit,
                                    u32 *index)
  {
-       u64 *usage_counters = this_cpu_ptr(ma->masks_usage_cntr);
+       struct mask_array_stats *stats = this_cpu_ptr(ma->masks_usage_stats);
        struct sw_flow *flow;
        struct sw_flow_mask *mask;
        int i;
@@ -741,9 +742,9 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl,
                if (mask) {
                        flow = masked_flow_lookup(ti, key, mask, n_mask_hit);
                        if (flow) {
-                               u64_stats_update_begin(&ma->syncp);
-                               usage_counters[*index]++;
-                               u64_stats_update_end(&ma->syncp);
+                               u64_stats_update_begin(&stats->syncp);
+                               stats->usage_cntr[*index]++;
+                               u64_stats_update_end(&stats->syncp);
                                (*n_cache_hit)++;
                                return flow;
                        }
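
To make the per-CPU idea concrete, here is a sketch of what the stats
container and its read side could look like. The struct and field names
(mask_array_stats, masks_usage_stats, usage_cntr) follow the snippet above,
but the exact layout and the reader helper are assumptions, not taken from
a posted patch:

/* Sketch only, as it might sit in net/openvswitch/flow_table.c.  Assumes
 * struct mask_array grows a "struct mask_array_stats __percpu
 * *masks_usage_stats" member, as the snippet above implies.
 */
struct mask_array_stats {
	struct u64_stats_sync syncp;	/* written only by the owning CPU */
	u64 usage_cntr[];		/* one counter per mask */
};

/* Read side: sum one mask's counter over all CPUs.  The retry loop only
 * matters on 32-bit, where a reader can catch a writer mid-update.
 */
static u64 masks_usage_read(struct mask_array *ma, int index)
{
	u64 total = 0;
	int cpu;

	for_each_possible_cpu(cpu) {
		struct mask_array_stats *stats;
		unsigned int start;
		u64 val;

		stats = per_cpu_ptr(ma->masks_usage_stats, cpu);
		do {
			start = u64_stats_fetch_begin_irq(&stats->syncp);
			val = stats->usage_cntr[index];
		} while (u64_stats_fetch_retry_irq(&stats->syncp, start));

		total += val;
	}

	return total;
}

With one syncp per CPU, each CPU is the sole writer of its own seqcount, so
the remaining requirement is only that this writer is never interleaved with
another writer on the same CPU, which is what the preempt_disable() versus
local_bh_disable() discussion below is about.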

Let me know your thoughts.


Thanks,

Eelco
Sebastian Andrzej Siewior Oct. 15, 2020, 7:55 a.m. UTC | #3
On 2020-10-14 12:44:23 [+0200], Eelco Chaudron wrote:
> Let me know your thoughts.

Better. If your seqcount is per-CPU then you get away without explicit
writer locking, provided you rely on global per-CPU locking. You can't do
preempt_disable() because this section can be interrupted by softirq. You
need something stronger :)

Side note: Adding a fixes tag and net-next looks like "stable material
starting next merge window".

> Thanks,
> 
> Eelco

Sebastian
Eelco Chaudron Oct. 15, 2020, 8:11 a.m. UTC | #4
On 15 Oct 2020, at 9:55, Sebastian Andrzej Siewior wrote:

> On 2020-10-14 12:44:23 [+0200], Eelco Chaudron wrote:
>> Let me know your thoughts.
>
> Better. If your seqcount is per-CPU then you get away without explicit
> writer locking, provided you rely on global per-CPU locking. You can't do
> preempt_disable() because this section can be interrupted by softirq. You
> need something stronger :)

Thanks for your reply! Yes, I have already replaced it with
local_bh_disable() in my v2, as I noticed the hard-IRQ/softirq part earlier.
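
In other words, the hunk in ovs_flow_tbl_lookup() presumably ends up like
the following (inferred from this exchange, not copied from the v2 posting):

	/* local_bh_disable() both pins us to this CPU and keeps the NAPI
	 * softirq, the other writer of this CPU's stats, from interleaving
	 * with the u64_stats write section.  preempt_disable() alone only
	 * gives the first guarantee.
	 */
	local_bh_disable();
	flow = flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
	local_bh_enable();
	return flow;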

> Side note: Adding a fixes tag and net-next looks like "stable material
> starting next merge window".

I have the patch on net-next, but will send it out against net (I will do
the conversion later today and send it out).

Thanks,

Eelco
Sebastian Andrzej Siewior Oct. 15, 2020, 8:15 a.m. UTC | #5
On 2020-10-15 10:11:31 [+0200], Eelco Chaudron wrote:
> Thanks for your reply! Yes I had it replaced with local_bh_disable() in my
> v2, as I noticed the hard IRQ to softirq part earlier.

Okay. Resend the complete thing once you're ready and I'll take a look.

> Thanks,
> 
> Eelco

Sebastian

Patch

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 87c286ad660e..16289386632b 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -850,9 +850,17 @@  struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl,
 	struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array);
 	u32 __always_unused n_mask_hit;
 	u32 __always_unused n_cache_hit;
+	struct sw_flow *flow;
 	u32 index = 0;
 
-	return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
+	/* This function gets called through the netlink interface and therefore
+	 * is preemptible. However, the flow_lookup() function needs to be called
+	 * with preemption disabled due to CPU specific variables.
+	 */
+	preempt_disable();
+	flow = flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index);
+	preempt_enable();
+	return flow;
 }
 
 struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,