Message ID | 20210913223850.660578-1-kuba@kernel.org |
---|---|
State | New |
Headers | show |
Series | [RFC,net] net: stream: don't purge sk_error_queue without holding its lock | expand |
On 9/14/21 7:18 AM, Jakub Kicinski wrote: > On Mon, 13 Sep 2021 22:14:00 -0700 Eric Dumazet wrote: >> On 9/13/21 3:38 PM, Jakub Kicinski wrote: >>> sk_stream_kill_queues() can be called when there are still >>> outstanding skbs to transmit. Those skbs may try to queue >>> notifications to the error queue (e.g. timestamps). >>> If sk_stream_kill_queues() purges the queue without taking >>> its lock the queue may get corrupted. >>> >>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") >>> Signed-off-by: Jakub Kicinski <kuba@kernel.org> >>> --- >>> Sending as an RFC for review, compile-tested only. >>> >>> Seems far more likely that I'm missing something than that >>> this has been broken forever and nobody noticed :S >>> --- >>> net/core/stream.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/net/core/stream.c b/net/core/stream.c >>> index 4f1d4aa5fb38..7c585088f394 100644 >>> --- a/net/core/stream.c >>> +++ b/net/core/stream.c >>> @@ -196,7 +196,7 @@ void sk_stream_kill_queues(struct sock *sk) >>> __skb_queue_purge(&sk->sk_receive_queue); >>> >>> /* Next, the error queue. */ >>> - __skb_queue_purge(&sk->sk_error_queue); >>> + skb_queue_purge(&sk->sk_error_queue); >>> >>> /* Next, the write queue. */ >>> WARN_ON(!skb_queue_empty(&sk->sk_write_queue)); >> >> This should not be needed. >> >> By definition, sk_stream_kill_queues() is only called when there is no >> more references on the sockets. >> >> So all outstanding packets must have been orphaned or freed. > > I don't see the wait anywhere, would you mind spelling it out? > My (likely flawed) understanding is that inet_sock_destruct() gets > called when refs are gone (via sk->sk_destruct). > > But tcp_disconnect() + tcp_close() seem to happily call > inet_csk_destroy_sock() -> sk_stream_kill_queues() with outstanding > sk_wmem_alloc refs. tcp_disconnect() should probably leave the error queue as is. For some reason I thought your report was about inet_sock_destruct() tcp_disconnect() has always been full of bugs, it is surprising real applications (not fuzzers) are still trying to use it. > >> Anyway, Linux-2.6.12-rc2 had no timestamps yet. > > I see, thanks, if some form of the patch stands perhaps: > > Fixes: cb9eff097831 ("net: new user space API for time stamping of incoming and outgoing packets") > Except that this patch wont prevent a packet being added to sk_error_queue right after skb_queue_purge(&sk->sk_error_queue). If you think there is a bug, it must be fixed in another way. IMO, preventing err packets from a prior session being queued after a tcp_disconnect() is rather hard. We should not even try (packets could be stuck for hours in a qdisc)
On Tue, 14 Sep 2021 09:32:09 -0700 Eric Dumazet wrote: > On 9/14/21 7:18 AM, Jakub Kicinski wrote: > > On Mon, 13 Sep 2021 22:14:00 -0700 Eric Dumazet wrote: > >> This should not be needed. > >> > >> By definition, sk_stream_kill_queues() is only called when there is no > >> more references on the sockets. > >> > >> So all outstanding packets must have been orphaned or freed. > > > > I don't see the wait anywhere, would you mind spelling it out? > > My (likely flawed) understanding is that inet_sock_destruct() gets > > called when refs are gone (via sk->sk_destruct). > > > > But tcp_disconnect() + tcp_close() seem to happily call > > inet_csk_destroy_sock() -> sk_stream_kill_queues() with outstanding > > sk_wmem_alloc refs. > > tcp_disconnect() should probably leave the error queue as is. > > For some reason I thought your report was about inet_sock_destruct() > > tcp_disconnect() has always been full of bugs, it is surprising real applications > (not fuzzers) are still trying to use it. I think I hit it because app sets SOCK_LINGER && !sk->sk_lingertime. I don't think the app disconnects explicitly, but "same difference". > >> Anyway, Linux-2.6.12-rc2 had no timestamps yet. > > > > I see, thanks, if some form of the patch stands perhaps: > > > > Fixes: cb9eff097831 ("net: new user space API for time stamping of incoming and outgoing packets") > > > > Except that this patch wont prevent a packet being added to sk_error_queue > right after skb_queue_purge(&sk->sk_error_queue). Right, but then inet_sock_destruct() also purges the err queue, again. I was afraid of regressions but we could just remove the purging from sk_stream_kill_queues(), and target net-next? > If you think there is a bug, it must be fixed in another way. > > IMO, preventing err packets from a prior session being queued after a tcp_disconnect() > is rather hard. We should not even try (packets could be stuck for hours in a qdisc) Indeed, we could rearrange the SOCK_DEAD check in sock_queue_err_skb() to skip queuing and put it under the err queue lock (provided we make sk_stream_kill_queues() take that lock as well). But seems like an overkill. I'd lean towards the existing patch or removing the purge from sk_stream_kill_queues(). LMK what you prefer, this is not urgent.
On Tue, 14 Sep 2021 10:55:49 -0700 Eric Dumazet wrote: > On 9/14/21 9:56 AM, Jakub Kicinski wrote: > > Right, but then inet_sock_destruct() also purges the err queue, again. > > I was afraid of regressions but we could just remove the purging > > from sk_stream_kill_queues(), and target net-next? > > > > Yes, this would be the safest thing. Alright, let me do more testing and post a removal to net-next. > >> If you think there is a bug, it must be fixed in another way. > >> > >> IMO, preventing err packets from a prior session being queued after a tcp_disconnect() > >> is rather hard. We should not even try (packets could be stuck for hours in a qdisc) > > > > Indeed, we could rearrange the SOCK_DEAD check in sock_queue_err_skb() > > to skip queuing and put it under the err queue lock (provided we make > > sk_stream_kill_queues() take that lock as well). But seems like an > > overkill. I'd lean towards the existing patch or removing the purge from > > sk_stream_kill_queues(). LMK what you prefer, this is not urgent. > > The issue would really about the tcp_disconnect() case, > followed by a reuse of the socket to establish another session. > > In order to prevent polluting sk_error_queue with notifications > triggered by old packets (from prior flow), this would require > to record the socket cookie in skb, or something like that :/ Ah, I see what you meant! I'll make a note of the disconnect case in the commit message. I just care that we don't corrupt the queue on close.
diff --git a/net/core/stream.c b/net/core/stream.c index 4f1d4aa5fb38..7c585088f394 100644 --- a/net/core/stream.c +++ b/net/core/stream.c @@ -196,7 +196,7 @@ void sk_stream_kill_queues(struct sock *sk) __skb_queue_purge(&sk->sk_receive_queue); /* Next, the error queue. */ - __skb_queue_purge(&sk->sk_error_queue); + skb_queue_purge(&sk->sk_error_queue); /* Next, the write queue. */ WARN_ON(!skb_queue_empty(&sk->sk_write_queue));
sk_stream_kill_queues() can be called when there are still outstanding skbs to transmit. Those skbs may try to queue notifications to the error queue (e.g. timestamps). If sk_stream_kill_queues() purges the queue without taking its lock the queue may get corrupted. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jakub Kicinski <kuba@kernel.org> --- Sending as an RFC for review, compile-tested only. Seems far more likely that I'm missing something than that this has been broken forever and nobody noticed :S --- net/core/stream.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)