
net/sock: don't drop udp packets if udp_mem[2] not reached

Message ID 20200907144435.43165-1-dust.li@linux.alibaba.com
State New

Commit Message

dust.li Sept. 7, 2020, 2:44 p.m. UTC
We encountered udp packet drops under pretty low pressure
with net.ipv4.udp_mem[0] set to a small value (4096).

After some tracing and debugging, we found that for the udp
protocol, __sk_mem_raise_allocated() will possibly drop
packets if:
  udp_mem[0] < udp_prot.memory_allocated < udp_mem[2]

That's because __sk_mem_raise_allocated() doesn't handle
the above condition for protocols like udp which don't have
memory_pressure callbacks (sk_has_memory_pressure() is false).
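
To make the fall-through easier to see, here is a stand-alone,
heavily simplified model of that pre-patch decision (this is not
the kernel code itself; the real function also grants a small
per-socket minimum-buffer exemption, and protocols with
memory_pressure get the extra fair-share check quoted below):
```
/* Toy user-space model of the pre-patch __sk_mem_raise_allocated()
 * decision; all values are in pages, like udp_mem[].
 */
#include <stdbool.h>
#include <stdio.h>

/* true: allocation accepted, false: packet dropped */
static bool raise_allocated(long allocated, const long mem[3],
			    bool has_memory_pressure)
{
	if (allocated <= mem[0])	/* under mem[0]: always accept */
		return true;
	if (allocated > mem[2])		/* over hard limit: drop */
		return false;
	if (has_memory_pressure)	/* TCP-style extra checks live here */
		return true;		/* (simplified) */
	/* no memory_pressure callbacks (udp): fall through to the
	 * suppress path and drop, although we are still below mem[2]
	 */
	return false;
}

int main(void)
{
	const long udp_mem[3] = { 4096, 8192, 16384 };	/* toy limits, in pages */

	printf("%d\n", raise_allocated(2048, udp_mem, false)); /* 1: under udp_mem[0] */
	printf("%d\n", raise_allocated(6000, udp_mem, false)); /* 0: dropped, yet < udp_mem[2] */
	printf("%d\n", raise_allocated(6000, udp_mem, true));  /* 1: TCP-style path passes */
	return 0;
}
```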

We can reproduce this with the following conditions:
1. udp_mem[0] is relatively small,
2. net.core.rmem_default/max > udp_mem[0] * 4K,
3. the udp server receives slowly, causing udp_prot->memory_allocated
   to exceed udp_mem[0] while still staying under udp_mem[2].

I wrote a test script to reproduce this:
https://github.com/dust-li/kernel-test/blob/master/exceed_udp_mem_min_drop/exceed_udp_mem_min_drop.sh
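
To put numbers on condition 2: udp_mem[] is accounted in pages while
rmem_default/max are in bytes, so udp_mem[0] = 4096 with 4K pages is a
soft limit of only 16MB, and any receive buffer larger than that lets
a single slow receiver fill its queue past udp_mem[0] on its own.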

Obviously, we should not drop packets when udp_prot.memory_allocated
only just exceeds udp_mem[0] but is still under the hard limit.

For protocols with memory_pressure callbacks (like TCP), this is
not a problem, because there is an extra check:
```
  if (sk_has_memory_pressure(sk)) {
  	u64 alloc;

  	if (!sk_under_memory_pressure(sk))
  		return 1;
  	alloc = sk_sockets_allocated_read_positive(sk);
  	if (sk_prot_mem_limits(sk, 2) > alloc *
  			sk_mem_pages(sk->sk_wmem_queued +
  				atomic_read(&sk->sk_rmem_alloc) +
  				sk->sk_forward_alloc))
  		return 1;
  }
```
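
That is, above tcp_mem[0] such a protocol still accepts the allocation
while it is not under memory pressure, and even under pressure it does
so as long as this socket's usage stays below an equal share of the
hard limit (limits[2] divided by the number of allocated sockets).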

But UDP doesn't have this check, so add an extra check here
to make sure UDP packets are not dropped until the hard limit
is reached.

Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
---
 net/core/sock.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

dust.li Sept. 8, 2020, 3:15 a.m. UTC | #1
On Mon, Sep 07, 2020 at 07:18:48PM +0200, Paolo Abeni wrote:
>Hi,
>
>On Mon, 2020-09-07 at 22:44 +0800, Dust Li wrote:
>> We encountered udp packet drops under pretty low pressure
>> with net.ipv4.udp_mem[0] set to a small value (4096).
>> 
>> After some tracing and debugging, we found that for the udp
>> protocol, __sk_mem_raise_allocated() will possibly drop
>> packets if:
>>   udp_mem[0] < udp_prot.memory_allocated < udp_mem[2]
>> 
>> That's because __sk_mem_raise_allocated() doesn't handle
>> the above condition for protocols like udp which don't have
>> memory_pressure callbacks (sk_has_memory_pressure() is false).
>> 
>> We can reproduce this with the following conditions:
>> 1. udp_mem[0] is relatively small,
>> 2. net.core.rmem_default/max > udp_mem[0] * 4K
>
>This looks like something that could/should be addressed at
>configuration level ?!?
Thanks a lot for the review !

Sorry, maybe I haven't made it clear enough.

The real problem is scalability with the number of sockets.
Since udp_mem is shared by all UDP sockets, as the number of udp
sockets grows, sooner or later udp_prot.memory_allocated will
exceed udp_mem[0], and __sk_mem_raise_allocated() will start
dropping packets here. But the total udp memory allocated
may still be far under udp_mem[1] or udp_mem[2].

>
>udp_mem[0] should accommodate comfortably at least a socket.

Yeah, I agree udp_mem[0] should be large enough for at least a
socket.

Here I used 4096 just for simplicity, to reproduce what we hit before.

I changed my test program a bit:
 - with 16 server sockets
 - with 1 client sending 3000 messages (size: 4096 bytes) to each
   of those 8 server sockets
 - set net.core.rmem_default/max to (2*4096*4096)
 - and keep udp_mem unset, which by default on my 4GB VM is
   'net.ipv4.udp_mem = 91944        122592  183888'

https://github.com/dust-li/kernel-test/blob/master/exceed_udp_mem_min_drop/multi_sockets_test.sh


Actually, with more udp sockets, I can always push memory_allocated
past udp_mem[0] and trigger packet drops well before udp_mem[1]
or udp_mem[2] are reached.
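
As a rough sanity check on those numbers (the truesize factor below is
only an estimate): 16 sockets * 3000 messages * 4096 bytes is about
180MB of payload, roughly 46K pages, and with a typical ~2x skb
truesize overhead for datagrams of this size the accounted memory can
cross udp_mem[0] (91944 pages) while staying well below udp_mem[2]
(183888 pages), which is exactly the window where the drops happen.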

>
>Cheers,
>
>Paolo

Patch

diff --git a/net/core/sock.c b/net/core/sock.c
index 6c5c6b18eff4..fed8211d8dbe 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2648,6 +2648,12 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
 				 atomic_read(&sk->sk_rmem_alloc) +
 				 sk->sk_forward_alloc))
 			return 1;
+	} else {
+		/* for prots without memory_pressure callbacks, we should not
+		 * drop until hard limit reached
+		 */
+		if (allocated <= sk_prot_mem_limits(sk, 2))
+			return 1;
 	}
 
 suppress_allocation: