Message ID | 20210205001727.2125-2-pablo@netfilter.org |
---|---|
State | New |
Headers | show |
Series | Netfilter fixes for net | expand |
Hello: This series was applied to netdev/net.git (refs/heads/master): On Fri, 5 Feb 2021 01:17:24 +0100 you wrote: > From: Jozsef Kadlecsik <kadlec@mail.kfki.hu> > > When both --reap and --update flag are specified, there's a code > path at which the entry to be updated is reaped beforehand, > which then leads to kernel crash. Reap only entries which won't be > updated. > > [...] Here is the summary with links: - [net,1/4] netfilter: xt_recent: Fix attempt to update deleted entry https://git.kernel.org/netdev/net/c/b1bdde33b723 - [net,2/4] selftests: netfilter: fix current year https://git.kernel.org/netdev/net/c/a3005b0f83f2 - [net,3/4] netfilter: nftables: fix possible UAF over chains from packet path in netns https://git.kernel.org/netdev/net/c/767d1216bff8 - [net,4/4] netfilter: flowtable: fix tcp and udp header checksum update https://git.kernel.org/netdev/net/c/8d6bca156e47 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
thank you for adressing that issue - maybe GRO can be enabled and wasn't involved at all "Reap only entries which won't be updated" sounds for me like the could be some optimization: i mean when you first update and then check what can be reaped the recently updated entry would not match to begin with Am 05.02.21 um 01:17 schrieb Pablo Neira Ayuso: > From: Jozsef Kadlecsik <kadlec@mail.kfki.hu> > > When both --reap and --update flag are specified, there's a code > path at which the entry to be updated is reaped beforehand, > which then leads to kernel crash. Reap only entries which won't be > updated. > > Fixes kernel bugzilla #207773. > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=207773 > Reported-by: Reindl Harald <h.reindl@thelounge.net> > Fixes: 0079c5aee348 ("netfilter: xt_recent: add an entry reaper") > Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > --- > net/netfilter/xt_recent.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c > index 606411869698..0446307516cd 100644 > --- a/net/netfilter/xt_recent.c > +++ b/net/netfilter/xt_recent.c > @@ -152,7 +152,8 @@ static void recent_entry_remove(struct recent_table *t, struct recent_entry *e) > /* > * Drop entries with timestamps older then 'time'. > */ > -static void recent_entry_reap(struct recent_table *t, unsigned long time) > +static void recent_entry_reap(struct recent_table *t, unsigned long time, > + struct recent_entry *working, bool update) > { > struct recent_entry *e; > > @@ -161,6 +162,12 @@ static void recent_entry_reap(struct recent_table *t, unsigned long time) > */ > e = list_entry(t->lru_list.next, struct recent_entry, lru_list); > > + /* > + * Do not reap the entry which are going to be updated. > + */ > + if (e == working && update) > + return; > + > /* > * The last time stamp is the most recent. > */ > @@ -303,7 +310,8 @@ recent_mt(const struct sk_buff *skb, struct xt_action_param *par) > > /* info->seconds must be non-zero */ > if (info->check_set & XT_RECENT_REAP) > - recent_entry_reap(t, time); > + recent_entry_reap(t, time, e, > + info->check_set & XT_RECENT_UPDATE && ret); > } > > if (info->check_set & XT_RECENT_SET ||
Hi Harald, On Fri, 5 Feb 2021, Reindl Harald wrote: > "Reap only entries which won't be updated" sounds for me like the could > be some optimization: i mean when you first update and then check what > can be reaped the recently updated entry would not match to begin with When the entry is new and the given recent table is full we cannot update (add) it, unless old entries are deleted (reaped) first. So it'd require more additional checkings to be introduced to reverse the order of the two operations. Best regards, Jozsef > Am 05.02.21 um 01:17 schrieb Pablo Neira Ayuso: > > From: Jozsef Kadlecsik <kadlec@mail.kfki.hu> > > > > When both --reap and --update flag are specified, there's a code > > path at which the entry to be updated is reaped beforehand, > > which then leads to kernel crash. Reap only entries which won't be > > updated. > > > > Fixes kernel bugzilla #207773. > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=207773 > > Reported-by: Reindl Harald <h.reindl@thelounge.net> > > Fixes: 0079c5aee348 ("netfilter: xt_recent: add an entry reaper") > > Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > > --- > > net/netfilter/xt_recent.c | 12 ++++++++++-- > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c > > index 606411869698..0446307516cd 100644 > > --- a/net/netfilter/xt_recent.c > > +++ b/net/netfilter/xt_recent.c > > @@ -152,7 +152,8 @@ static void recent_entry_remove(struct recent_table *t, > > struct recent_entry *e) > > /* > > * Drop entries with timestamps older then 'time'. > > */ > > -static void recent_entry_reap(struct recent_table *t, unsigned long time) > > +static void recent_entry_reap(struct recent_table *t, unsigned long time, > > + struct recent_entry *working, bool update) > > { > > struct recent_entry *e; > > @@ -161,6 +162,12 @@ static void recent_entry_reap(struct recent_table *t, > > unsigned long time) > > */ > > e = list_entry(t->lru_list.next, struct recent_entry, lru_list); > > + /* > > + * Do not reap the entry which are going to be updated. > > + */ > > + if (e == working && update) > > + return; > > + > > /* > > * The last time stamp is the most recent. > > */ > > @@ -303,7 +310,8 @@ recent_mt(const struct sk_buff *skb, struct > > xt_action_param *par) > > /* info->seconds must be non-zero */ > > if (info->check_set & XT_RECENT_REAP) > > - recent_entry_reap(t, time); > > + recent_entry_reap(t, time, e, > > + info->check_set & XT_RECENT_UPDATE && ret); > > } > > if (info->check_set & XT_RECENT_SET || > - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary
Am 05.02.21 um 14:54 schrieb Jozsef Kadlecsik: > Hi Harald, > > On Fri, 5 Feb 2021, Reindl Harald wrote: > >> "Reap only entries which won't be updated" sounds for me like the could >> be some optimization: i mean when you first update and then check what >> can be reaped the recently updated entry would not match to begin with > > When the entry is new and the given recent table is full we cannot update > (add) it, unless old entries are deleted (reaped) first. So it'd require > more additional checkings to be introduced to reverse the order of the two > operations. well, the most important thing is that the firewall-vm stops to kernel-panic, built that beast in autumn 2018 and until april 2019 i went trough hell with random crashes all the time (connlimit regression, driver issues, vmware issues and that one where i removed --reap on the most called one with some other changes when it crashed 5 or 10 times a day and then 3 days not at all so never figured out what was the gamechanger on the other hand if you can't reap old entries because everything is fresh (real DDOS) you can't update / add it anyways what makes me thinking about the ones without --reap - how is it handeled in that case, i mean there must be some LRU logic present anyways given that --reap is not enabled by default (otherwise that bug would not have hitted me so long randomly) my first xt_recent-rule on top don't have --reap by intention because it's the DDOS stuff with total connections to any machine per two seconds, my guess what that --reap don't come for free and the roudnabout 200 MB RAM overhead is OK, for the other 12 not hitting that much the VM would consume 1.5 GB RAM after a few days instead 240 MB - but they where obviosuly the trigger for random crashes how does that one work after "it's full" to track recent attackers instead just consume memory and no longer work properly?
Am 05.02.21 um 15:42 schrieb Reindl Harald: > > > Am 05.02.21 um 14:54 schrieb Jozsef Kadlecsik: >> Hi Harald, >> >> On Fri, 5 Feb 2021, Reindl Harald wrote: >> >>> "Reap only entries which won't be updated" sounds for me like the could >>> be some optimization: i mean when you first update and then check what >>> can be reaped the recently updated entry would not match to begin with >> >> When the entry is new and the given recent table is full we cannot update >> (add) it, unless old entries are deleted (reaped) first. So it'd require >> more additional checkings to be introduced to reverse the order of the >> two >> operations. > well, the most important thing is that the firewall-vm stops to > kernel-panic why is that still not part of 5.10.14 given how old that issue is :-( https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.14
On Fri, 5 Feb 2021, Reindl Harald wrote: > what makes me thinking about the ones without --reap - how is it > handeled in that case, i mean there must be some LRU logic present > anyways given that --reap is not enabled by default (otherwise that bug > would not have hitted me so long randomly) Yes, checking the code I was wrong: when the recent table is full, the oldest entry is automatically removed to make space for the new one. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary
On Sun, 7 Feb 2021, Reindl Harald wrote: > > well, the most important thing is that the firewall-vm stops to > > kernel-panic > > why is that still not part of 5.10.14 given how old that issue is :-( > > https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.14 Probably we missed the window when patches were accepted for the new release. That's all. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary
Am 07.02.21 um 20:38 schrieb Jozsef Kadlecsik: > On Sun, 7 Feb 2021, Reindl Harald wrote: > >>> well, the most important thing is that the firewall-vm stops to >>> kernel-panic >> >> why is that still not part of 5.10.14 given how old that issue is :-( >> >> https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.14 > > Probably we missed the window when patches were accepted for the new > release. That's all probably something is broken in the whole process given that 5.10.15 still don't contain the fix while i am tired of a new "stable release" every few days and 5.10.x like every LTS release in the past few years has a peak of it https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.15
Am 10.02.21 um 11:34 schrieb Reindl Harald: > > > Am 07.02.21 um 20:38 schrieb Jozsef Kadlecsik: >> On Sun, 7 Feb 2021, Reindl Harald wrote: >> >>>> well, the most important thing is that the firewall-vm stops to >>>> kernel-panic >>> >>> why is that still not part of 5.10.14 given how old that issue is :-( >>> >>> https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.14 >> >> Probably we missed the window when patches were accepted for the new >> release. That's all > > probably something is broken in the whole process given that 5.10.15 > still don't contain the fix while i am tired of a new "stable release" > every few days and 5.10.x like every LTS release in the past few years > has a peak of it > > https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.15 and another useless crash of something which has a ready patch from before 5.10.14 [165940.842226] kernel BUG at lib/list_debug.c:45! [165940.874769] invalid opcode: 0000 [#1] SMP NOPTI [165940.876680] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.15-100.fc32.x86_64 #1 [165940.880198] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018 [165940.885314] RIP: 0010:__list_del_entry_valid.cold+0xf/0x47 [165940.886202] Code: fe ff 0f 0b 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 60 88 40 b2 e8 cf 45 fe ff 0f 0b 48 89 fe 48 c7 c7 f0 88 40 b2 e8 be 45 fe ff <0f> 0b 48 c7 c7 a0 89 40 b2 e8 b0 45 fe ff 0f 0b 48 89 f2 48 89 fe [165940.889107] RSP: 0018:ffffaf0480003928 EFLAGS: 00010282 [165940.889943] RAX: 000000000000004e RBX: ffff9fa911148000 RCX: 0000000000000000 [165940.891066] RDX: ffff9fa99d4269e0 RSI: ffff9fa99d418a80 RDI: 0000000000000300 [165940.892190] RBP: ffffaf04800039a0 R08: 0000000000000000 R09: ffffaf0480003760 [165940.893313] R10: ffffaf0480003758 R11: ffffffffb2b44748 R12: ffff9fa9046000f8 [165940.894441] R13: ffff9fa911148010 R14: ffff9fa903329400 R15: ffff9fa904600000 [165940.895573] FS: 0000000000000000(0000) GS:ffff9fa99d400000(0000) knlGS:0000000000000000 [165940.896856] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [165940.897789] CR2: 00007fb9442e5000 CR3: 00000000030a0006 CR4: 00000000003706f0 [165940.898954] Call Trace: [165940.899400] <IRQ> [165940.899757] recent_mt+0x1b5/0x39b [xt_recent] [165940.900492] ? set_match_v4+0x92/0xb0 [xt_set] [165940.901236] nft_match_large_eval+0x34/0x60 [nft_compat] [165940.902104] nft_do_chain+0x141/0x4e0 [nf_tables] [165940.902869] ? fib_validate_source+0x47/0xf0 [165940.903564] ? ip_route_input_slow+0x722/0xaa0 [165940.904282] nft_do_chain_ipv4+0x56/0x60 [nf_tables] [165940.905086] nf_hook_slow+0x3f/0xb0 [165940.905658] ip_forward+0x441/0x480 [165940.906230] ? ip4_key_hashfn+0xb0/0xb0 [165940.906856] __netif_receive_skb_one_core+0x67/0x70 [165940.907639] netif_receive_skb+0x35/0x110 [165940.908295] br_handle_frame_finish+0x17a/0x450 [bridge] [165940.909143] ? ip_finish_output2+0x19b/0x560 [165940.909842] ? br_handle_frame_finish+0x450/0x450 [bridge] [165940.910718] br_handle_frame+0x292/0x350 [bridge] [165940.911483] ? ip_sublist_rcv_finish+0x57/0x70 [165940.912199] ? ___slab_alloc+0x127/0x5b0 [165940.912835] __netif_receive_skb_core+0x196/0xf70 [165940.913590] ? ip_list_rcv+0x125/0x140 [165940.914201] __netif_receive_skb_list_core+0x12f/0x2b0 [165940.915024] netif_receive_skb_list_internal+0x1bc/0x2e0 [165940.915873] ? vmxnet3_rq_rx_complete+0x8bd/0xde0 [vmxnet3] [165940.916769] napi_complete_done+0x6f/0x190 [165940.917439] vmxnet3_poll_rx_only+0x7b/0xa0 [vmxnet3] [165940.918249] net_rx_action+0x135/0x3b0 [165940.918863] __do_softirq+0xca/0x288 [165940.919451] asm_call_irq_on_stack+0xf/0x20 [165940.920146] </IRQ> [165940.920508] do_softirq_own_stack+0x37/0x40 [165940.921187] irq_exit_rcu+0xc2/0x100 [165940.921772] common_interrupt+0x74/0x130 [165940.922410] asm_common_interrupt+0x1e/0x40
Am 13.02.21 um 17:09 schrieb Reindl Harald: > > > Am 10.02.21 um 11:34 schrieb Reindl Harald: >> >> >> Am 07.02.21 um 20:38 schrieb Jozsef Kadlecsik: >>> On Sun, 7 Feb 2021, Reindl Harald wrote: >>> >>>>> well, the most important thing is that the firewall-vm stops to >>>>> kernel-panic >>>> >>>> why is that still not part of 5.10.14 given how old that issue is :-( >>>> >>>> https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.14 >>> >>> Probably we missed the window when patches were accepted for the new >>> release. That's all >> >> probably something is broken in the whole process given that 5.10.15 >> still don't contain the fix while i am tired of a new "stable release" >> every few days and 5.10.x like every LTS release in the past few years >> has a peak of it >> >> https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.15 https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.16 again no "netfilter" or "xt_recent" what is the point of new kernel releases every second day without fixing months old issues where a pacth exists? > and another useless crash of something which has a ready patch from > before 5.10.14 > > [165940.842226] kernel BUG at lib/list_debug.c:45! > [165940.874769] invalid opcode: 0000 [#1] SMP NOPTI > [165940.876680] CPU: 0 PID: 0 Comm: swapper/0 Not tainted > 5.10.15-100.fc32.x86_64 #1 > [165940.880198] Hardware name: VMware, Inc. VMware Virtual > Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018 > [165940.885314] RIP: 0010:__list_del_entry_valid.cold+0xf/0x47 > [165940.886202] Code: fe ff 0f 0b 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 60 > 88 40 b2 e8 cf 45 fe ff 0f 0b 48 89 fe 48 c7 c7 f0 88 40 b2 e8 be 45 fe > ff <0f> 0b 48 c7 c7 a0 89 40 b2 e8 b0 45 fe ff 0f 0b 48 89 f2 48 89 fe > [165940.889107] RSP: 0018:ffffaf0480003928 EFLAGS: 00010282 > [165940.889943] RAX: 000000000000004e RBX: ffff9fa911148000 RCX: > 0000000000000000 > [165940.891066] RDX: ffff9fa99d4269e0 RSI: ffff9fa99d418a80 RDI: > 0000000000000300 > [165940.892190] RBP: ffffaf04800039a0 R08: 0000000000000000 R09: > ffffaf0480003760 > [165940.893313] R10: ffffaf0480003758 R11: ffffffffb2b44748 R12: > ffff9fa9046000f8 > [165940.894441] R13: ffff9fa911148010 R14: ffff9fa903329400 R15: > ffff9fa904600000 > [165940.895573] FS: 0000000000000000(0000) GS:ffff9fa99d400000(0000) > knlGS:0000000000000000 > [165940.896856] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [165940.897789] CR2: 00007fb9442e5000 CR3: 00000000030a0006 CR4: > 00000000003706f0 > [165940.898954] Call Trace: > [165940.899400] <IRQ> > [165940.899757] recent_mt+0x1b5/0x39b [xt_recent] > [165940.900492] ? set_match_v4+0x92/0xb0 [xt_set] > [165940.901236] nft_match_large_eval+0x34/0x60 [nft_compat] > [165940.902104] nft_do_chain+0x141/0x4e0 [nf_tables] > [165940.902869] ? fib_validate_source+0x47/0xf0 > [165940.903564] ? ip_route_input_slow+0x722/0xaa0 > [165940.904282] nft_do_chain_ipv4+0x56/0x60 [nf_tables] > [165940.905086] nf_hook_slow+0x3f/0xb0 > [165940.905658] ip_forward+0x441/0x480 > [165940.906230] ? ip4_key_hashfn+0xb0/0xb0 > [165940.906856] __netif_receive_skb_one_core+0x67/0x70 > [165940.907639] netif_receive_skb+0x35/0x110 > [165940.908295] br_handle_frame_finish+0x17a/0x450 [bridge] > [165940.909143] ? ip_finish_output2+0x19b/0x560 > [165940.909842] ? br_handle_frame_finish+0x450/0x450 [bridge] > [165940.910718] br_handle_frame+0x292/0x350 [bridge] > [165940.911483] ? ip_sublist_rcv_finish+0x57/0x70 > [165940.912199] ? ___slab_alloc+0x127/0x5b0 > [165940.912835] __netif_receive_skb_core+0x196/0xf70 > [165940.913590] ? ip_list_rcv+0x125/0x140 > [165940.914201] __netif_receive_skb_list_core+0x12f/0x2b0 > [165940.915024] netif_receive_skb_list_internal+0x1bc/0x2e0 > [165940.915873] ? vmxnet3_rq_rx_complete+0x8bd/0xde0 [vmxnet3] > [165940.916769] napi_complete_done+0x6f/0x190 > [165940.917439] vmxnet3_poll_rx_only+0x7b/0xa0 [vmxnet3] > [165940.918249] net_rx_action+0x135/0x3b0 > [165940.918863] __do_softirq+0xca/0x288 > [165940.919451] asm_call_irq_on_stack+0xf/0x20 > [165940.920146] </IRQ> > [165940.920508] do_softirq_own_stack+0x37/0x40 > [165940.921187] irq_exit_rcu+0xc2/0x100 > [165940.921772] common_interrupt+0x74/0x130 > [165940.922410] asm_common_interrupt+0x1e/0x40
On Wed, 10 Feb 2021, Reindl Harald wrote: > > > why is that still not part of 5.10.14 given how old that issue is > > > > > > https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.14 > > > > Probably we missed the window when patches were accepted for the new > > release. That's all > > probably something is broken in the whole process given that 5.10.15 still > don't contain the fix while i am tired of a new "stable release" every few > days and 5.10.x like every LTS release in the past few years has a peak of it > > https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.10.15 The process is a multi-step one: netfilter patches are sent for reviewing/accepting/rejecting to the net maintaners, and after that they send the patches to the kernel source maintainers. A single patch is rarely picked out to handle differently. The patch has entered the queue for the stable trees. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt Address : Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary
diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c index 606411869698..0446307516cd 100644 --- a/net/netfilter/xt_recent.c +++ b/net/netfilter/xt_recent.c @@ -152,7 +152,8 @@ static void recent_entry_remove(struct recent_table *t, struct recent_entry *e) /* * Drop entries with timestamps older then 'time'. */ -static void recent_entry_reap(struct recent_table *t, unsigned long time) +static void recent_entry_reap(struct recent_table *t, unsigned long time, + struct recent_entry *working, bool update) { struct recent_entry *e; @@ -161,6 +162,12 @@ static void recent_entry_reap(struct recent_table *t, unsigned long time) */ e = list_entry(t->lru_list.next, struct recent_entry, lru_list); + /* + * Do not reap the entry which are going to be updated. + */ + if (e == working && update) + return; + /* * The last time stamp is the most recent. */ @@ -303,7 +310,8 @@ recent_mt(const struct sk_buff *skb, struct xt_action_param *par) /* info->seconds must be non-zero */ if (info->check_set & XT_RECENT_REAP) - recent_entry_reap(t, time); + recent_entry_reap(t, time, e, + info->check_set & XT_RECENT_UPDATE && ret); } if (info->check_set & XT_RECENT_SET ||