Message ID | 20210529110746.6796-1-w@1wt.eu
---|---
State | New
Series | [net-next] ipv6: use prandom_u32() for ID generation
From: Willy Tarreau
> Sent: 29 May 2021 12:08
>
> This is a complement to commit aa6dd211e4b1 ("inet: use bigger hash
> table for IP ID generation"), but focusing on some specific aspects
> of IPv6.
>
> [...]
>
> The risk of at least one collision here is about 1/80 million among
> 10 IDs, 1/850k among 100 IDs, and still only 1/8.5k among 1000 IDs,
> which remains very low compared to IPv4 where all IDs are reused
> every 4 to 80ms on a 10 Gbps flow depending on packet sizes.

The problem is that, on average, 1 in 2^32 packets will use the same
id as the previous one. If a fragment of such a pair gets lost, horrid
things are likely to happen.

Note that this is different from an ID being reused after a count of
packets or after a time delay. So you still need something to ensure
IDs aren't reused immediately.

	David
On Mon, May 31, 2021 at 10:41:18AM +0000, David Laight wrote:
> The problem is that, on average, 1 in 2^32 packets will use
> the same id as the previous one.
> If a fragment of such a pair gets lost horrid things are
> likely to happen.
> Note that this is different from an ID being reused after a
> count of packets or after a time delay.

I'm well aware of this, as this is something we discussed already for
IPv4 and which I objected to for the same reason (except that it's
1/2^16 there).

With that said, the differences with IPv4 are significant here, because
you won't fragment below 1280 bytes per packet, which means the issue
could happen every 5 terabytes of fragmented losses (or reorders). I'd
say that in the worst case you're using load-balanced links with some
funny LB algorithm that ensures that every second fragment is sent on
the same link as the previous packet's first fragment. This is the case
where you could provoke a failure every 5 TB. But then you're still
subject to UDP's 16-bit checksum, so in practice you're seeing a
failure every 320 PB. Finally it's the same probability as getting both
TCP csum + Ethernet CRC correct on a failure, except that here it
applies only to large fragments while with TCP/eth it applies to any
packet.

> So you still need something to ensure IDs aren't reused immediately.

That's what I initially did for IPv4, but Amit could exploit this
specific property. For example it makes it easier to count flows behind
NAT when there is a guaranteed distance :-/ We even tried with a
smooth, non-linear distribution, but that made no difference, it
remained observable.

Another idea we had in mind was to keep small increments for local
networks and use full randoms only over routers (since fragments are
rare and terribly unreliable on the net), but that would involve quite
significant changes for very little benefit compared to the current
option in the end.

Regards,
Willy
On Sat, May 29, 2021 at 1:08 PM Willy Tarreau <w@1wt.eu> wrote:
>
> This is a complement to commit aa6dd211e4b1 ("inet: use bigger hash
> table for IP ID generation"), but focusing on some specific aspects
> of IPv6.
>
> [...]
>
> Reported-by: Amit Klein <aksecurity@gmail.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Willy Tarreau <w@1wt.eu>

Reviewed-by: Eric Dumazet <edumazet@google.com>
Hello:

This patch was applied to netdev/net-next.git (refs/heads/master):

On Sat, 29 May 2021 13:07:46 +0200 you wrote:
> This is a complement to commit aa6dd211e4b1 ("inet: use bigger hash
> table for IP ID generation"), but focusing on some specific aspects
> of IPv6.
>
> [...]

Here is the summary with links:
  - [net-next] ipv6: use prandom_u32() for ID generation
    https://git.kernel.org/netdev/net-next/c/62f20e068ccc

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
index af36acc1a644..2880dc7d9a49 100644
--- a/net/ipv6/output_core.c
+++ b/net/ipv6/output_core.c
@@ -15,29 +15,11 @@ static u32 __ipv6_select_ident(struct net *net,
 				const struct in6_addr *dst,
 				const struct in6_addr *src)
 {
-	const struct {
-		struct in6_addr dst;
-		struct in6_addr src;
-	} __aligned(SIPHASH_ALIGNMENT) combined = {
-		.dst = *dst,
-		.src = *src,
-	};
-	u32 hash, id;
-
-	/* Note the following code is not safe, but this is okay. */
-	if (unlikely(siphash_key_is_zero(&net->ipv4.ip_id_key)))
-		get_random_bytes(&net->ipv4.ip_id_key,
-				 sizeof(net->ipv4.ip_id_key));
-
-	hash = siphash(&combined, sizeof(combined), &net->ipv4.ip_id_key);
-
-	/* Treat id of 0 as unset and if we get 0 back from ip_idents_reserve,
-	 * set the hight order instead thus minimizing possible future
-	 * collisions.
-	 */
-	id = ip_idents_reserve(hash, 1);
-	if (unlikely(!id))
-		id = 1 << 31;
+	u32 id;
+
+	do {
+		id = prandom_u32();
+	} while (!id);

 	return id;
 }
This is a complement to commit aa6dd211e4b1 ("inet: use bigger hash
table for IP ID generation"), but focusing on some specific aspects
of IPv6.

Contrary to IPv4, IPv6 only uses packet IDs with fragments, and with a
minimum MTU of 1280, it's much less easy to force a remote peer to
produce many fragments to explore its ID sequence. In addition packet
IDs are 32-bit in IPv6, which further complicates their analysis. On
the other hand, it is often easier to choose among plenty of possible
source addresses and partially work around the bigger hash table the
commit above permits, which leaves IPv6 partially exposed to some
possibilities of remote analysis at the risk of weakening some
protocols like DNS if some IDs can be predicted with a good enough
probability.

Given the wide range of permitted IDs, the risk of collision is
extremely low so there's no need to rely on the positive increment
algorithm that is shared with the IPv4 code via ip_idents_reserve().
We have a fast PRNG, so let's simply call prandom_u32() and be done
with it.

Performance measurements at 10 Gbps couldn't show any difference with
the previous code, even when using a single core, because due to the
large fragments, we're limited to only ~930 kpps at 10 Gbps and the
cost of the random generation is completely offset by other operations
and by the network transfer time. In addition, this change removes the
need to update a shared entry in the idents table so it may even end
up being slightly faster on large scale systems where this matters.

The risk of at least one collision here is about 1/80 million among
10 IDs, 1/850k among 100 IDs, and still only 1/8.5k among 1000 IDs,
which remains very low compared to IPv4 where all IDs are reused
every 4 to 80ms on a 10 Gbps flow depending on packet sizes.
Reported-by: Amit Klein <aksecurity@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 net/ipv6/output_core.c | 28 +++++-----------------------
 1 file changed, 5 insertions(+), 23 deletions(-)