[37/50] netfilter: nf_tables: atomic dump and reset for stateful objects

Message ID 20161209102432.GA986@salvia
State New
Headers show

Commit Message

Pablo Neira Ayuso Dec. 9, 2016, 10:24 a.m.
Hi Paul,

On Thu, Dec 08, 2016 at 07:40:14PM -0500, Paul Gortmaker wrote:
> On Wed, Dec 7, 2016 at 4:52 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:

> > This patch adds a new NFT_MSG_GETOBJ_RESET command perform an atomic

> > dump-and-reset of the stateful object. This also comes with add support

> > for atomic dump and reset for counter and quota objects.

> 

> This triggered a new build failure in linux-next on parisc-32, which a

> hands-off bisect

> run lists as resulting from this:

> 

> ERROR: "__cmpxchg_u64" [net/netfilter/nft_counter.ko] undefined!

> make[2]: *** [__modpost] Error 1

> make[1]: *** [modules] Error 2

> make: *** [sub-make] Error 2

> 43da04a593d8b2626f1cf4b56efe9402f6b53652 is the first bad commit

> commit 43da04a593d8b2626f1cf4b56efe9402f6b53652

> Author: Pablo Neira Ayuso <pablo@netfilter.org>

> Date:   Mon Nov 28 00:05:44 2016 +0100

> 

>     netfilter: nf_tables: atomic dump and reset for stateful objects

> 

>     This patch adds a new NFT_MSG_GETOBJ_RESET command perform an atomic

>     dump-and-reset of the stateful object. This also comes with add support

>     for atomic dump and reset for counter and quota objects.

> 

>     Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

> 

> :040000 040000 6cd4554f69247e5c837db52342f26888beda1623

> 5908aca93c89e7922336546c3753bfcf2aceefba M      include

> :040000 040000 f25d5831eb30972436bd198c5bb237a0cb0b4856

> 4ee5751c8de02bb5a8dcaadb2a2df7986d90f8e9 M      net

> bisect run success

> 

> Guessing this is more an issue with parisc than it is with netfilter, but I

> figured I'd mention it anyway.


I'm planning to submit this patch to parisc, I'm attaching it to this
email.

Comments

Eric Dumazet Dec. 9, 2016, 2:24 p.m. | #1
On Fri, 2016-12-09 at 11:24 +0100, Pablo Neira Ayuso wrote:
> Hi Paul,


Hi Pablo

Given that bytes/packets counters are modified without cmpxchg64()  :

static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv,
                                       struct nft_regs *regs,
                                       const struct nft_pktinfo *pkt)
{
        struct nft_counter_percpu *this_cpu;

        local_bh_disable();
        this_cpu = this_cpu_ptr(priv->counter);
        u64_stats_update_begin(&this_cpu->syncp);
        this_cpu->counter.bytes += pkt->skb->len;
        this_cpu->counter.packets++;
        u64_stats_update_end(&this_cpu->syncp);
        local_bh_enable();
}

It means that the cmpxchg64() used to clear the stats is not good enough.

It does not help to make sure stats are properly cleared.

On 64 bit, the ->syncp is not there, so the nft_counter_reset() might
not see that a bytes or packets counter was modified by another cpu.


CPU 1                              CPU 2

LOAD PTR->BYTES into REG_A         old = *counter;
REG_A += skb->len;
                                   cmpxchg64(counter, old, 0);
PTR->BYTES = REG_A

It looks that you want a seqcount, even on 64bit arches,
so that CPU 2 can restart its loop, and more importantly you need
to not accumulate the values you read, because they might be old/invalid.

Another way would be to not use cmpxchg64() at all.
Way to expensive in fast path !

The percpu value would never be modified by an other cpu than the owner.

You need a per cpu seqcount, no need to add a syncp per nft percpu counter.


static DEFINE_PERCPU(seqcount_t, nft_pcpu_seq);

static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv,
                                       struct nft_regs *regs,
                                       const struct nft_pktinfo *pkt)
{
        struct nft_counter_percpu *this_cpu;
	seqcount_t *myseq;

        local_bh_disable();
        this_cpu = this_cpu_ptr(priv->counter);
	myseq = this_cpu_ptr(&nft_pcpu_seq);

	write_seqcount_begin(myseq);

        this_cpu->counter.bytes += pkt->skb->len;
        this_cpu->counter.packets++;

	write_seqcount_end(myseq);
	
        local_bh_enable();
}

Thanks !
Pablo Neira Ayuso Dec. 10, 2016, 12:21 p.m. | #2
On Fri, Dec 09, 2016 at 07:22:06AM -0800, Eric Dumazet wrote:
> On Fri, 2016-12-09 at 06:24 -0800, Eric Dumazet wrote:

> 

> > It looks that you want a seqcount, even on 64bit arches,

> > so that CPU 2 can restart its loop, and more importantly you need

> > to not accumulate the values you read, because they might be old/invalid.

> 

> Untested patch to give general idea. I can polish it a bit later today.


I'm preparing a patch now, so you can review it.

Eric, thanks a lot for reviewing and proposing a working approach!

Patch hide | download patch | download mbox

From c9d320ac0be2a32a7b2bfad398be549865088ecf Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Thu, 8 Dec 2016 22:55:33 +0100
Subject: [PATCH] parisc: export symbol __cmpxchg_u64()

kbuild test robot reports:

>> ERROR: "__cmpxchg_u64" [net/netfilter/nft_counter.ko] undefined!

Commit 43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for
stateful objects") introduces the first client of cmpxchg64() from
modules.

Patch 54b668009076 ("parisc: Add native high-resolution sched_clock()
implementation") removed __cmpxchg_u64() dependency on CONFIG_64BIT.
So, let's fix this problem by exporting this symbol unconditionally.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 arch/parisc/kernel/parisc_ksyms.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/parisc_ksyms.c b/arch/parisc/kernel/parisc_ksyms.c
index 3cad8aadc69e..cfa704548cf3 100644
--- a/arch/parisc/kernel/parisc_ksyms.c
+++ b/arch/parisc/kernel/parisc_ksyms.c
@@ -40,8 +40,8 @@  EXPORT_SYMBOL(__atomic_hash);
 #endif
 #ifdef CONFIG_64BIT
 EXPORT_SYMBOL(__xchg64);
-EXPORT_SYMBOL(__cmpxchg_u64);
 #endif
+EXPORT_SYMBOL(__cmpxchg_u64);
 
 #include <asm/uaccess.h>
 EXPORT_SYMBOL(lclear_user);
-- 
2.1.4