
[net] tcp: Fix potential use-after-free due to double kfree().

Message ID 20210118055920.82516-1-kuniyu@amazon.co.jp
State New

Commit Message

Kuniyuki Iwashima Jan. 18, 2021, 5:59 a.m. UTC
On receiving an ACK with a valid SYN cookie, cookie_v4_check() allocates
struct request_sock and may then allocate inet_rsk(req)->ireq_opt. After
that, tcp_v4_syn_recv_sock() allocates struct sock and copies the ireq_opt
pointer to inet_sk(sk)->inet_opt. Normally, tcp_v4_syn_recv_sock() inserts
the full socket into ehash and sets ireq_opt to NULL. Otherwise,
tcp_v4_syn_recv_sock() has to reset inet_opt to NULL and free the full
socket.

Commit 01770a1661657 ("tcp: fix race condition when creating child
sockets from syncookies") added a new path in which more than one core
can create a full socket for the same SYN cookie. Currently, the core that
loses the race frees the full socket without resetting inet_opt, so both
sock_put() and reqsk_put() call kfree() on the same memory:

  sock_put
    sk_free
      __sk_free
        sk_destruct
          __sk_destruct
            sk->sk_destruct/inet_sock_destruct
              kfree(rcu_dereference_protected(inet->inet_opt, 1));

  reqsk_put
    reqsk_free
      __reqsk_free
        req->rsk_ops->destructor/tcp_v4_reqsk_destructor
          kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1));

If kmalloc() is called between the two kfree() calls, this can lead to a
use-after-free, so this patch fixes it by setting inet_opt to NULL before
sock_put().

As a side note, this kind of issue does not happen for IPv6. This is
because tcp_v6_syn_recv_sock() clones both ipv6_opt and pktopts, which
correspond to ireq_opt in IPv4.

Fixes: 01770a166165 ("tcp: fix race condition when creating child sockets from syncookies")
CC: Ricardo Dias <rdias@singlestore.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
---
 net/ipv4/tcp_ipv4.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
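
Note for readers: the ownership hand-off described in the commit message can
be modelled in userspace. The sketch below is only an illustration under
stated assumptions. The names model_req, model_sock, model_free_opt() and
model_create_child() are hypothetical and do not exist in the kernel; the two
model_free_opt() calls merely stand in for sock_put() and reqsk_put().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct model_req  { char *opt; };  /* plays the role of ireq_opt */
struct model_sock { char *opt; };  /* plays the role of inet_opt */

/* Free an option buffer if one is still attached, and count the call. */
static void model_free_opt(char **opt, int *frees)
{
	if (*opt) {
		free(*opt);
		*opt = NULL;
		(*frees)++;
	}
}

/*
 * Model one pass through the child-creation path.
 * inserted != 0: the child was hashed, so it takes over the buffer.
 * inserted == 0: the child is dropped, so the request keeps the buffer.
 * Returns how many times the buffer was freed, or -1 if malloc() failed.
 */
static int model_create_child(int inserted)
{
	struct model_req req = { .opt = malloc(16) };
	struct model_sock child;
	int frees = 0;

	if (!req.opt)
		return -1;

	child.opt = req.opt;	/* pointer copy: two references, one buffer */

	if (inserted)
		req.opt = NULL;		/* ownership moves to the child */
	else
		child.opt = NULL;	/* the reset this patch applies on every failure path */

	/* Exactly one side may still hold the pointer when destructors run. */
	assert((req.opt == NULL) != (child.opt == NULL));

	model_free_opt(&child.opt, &frees);	/* stands in for sock_put() */
	model_free_opt(&req.opt, &frees);	/* stands in for reqsk_put() */
	return frees;
}
```

Calling model_create_child(1) and model_create_child(0) should each report a
single free. Dropping the "child.opt = NULL" line would leave both structures
pointing at the same buffer, which is the double kfree() pattern this patch
removes.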

Comments

Jakub Kicinski Jan. 20, 2021, 1:17 a.m. UTC | #1
On Mon, 18 Jan 2021 14:59:20 +0900 Kuniyuki Iwashima wrote:
> On receiving an ACK with a valid SYN cookie, cookie_v4_check() allocates
> struct request_sock and may then allocate inet_rsk(req)->ireq_opt. After
> that, tcp_v4_syn_recv_sock() allocates struct sock and copies the ireq_opt
> pointer to inet_sk(sk)->inet_opt. Normally, tcp_v4_syn_recv_sock() inserts
> the full socket into ehash and sets ireq_opt to NULL. Otherwise,
> tcp_v4_syn_recv_sock() has to reset inet_opt to NULL and free the full
> socket.
>
> Commit 01770a1661657 ("tcp: fix race condition when creating child
> sockets from syncookies") added a new path in which more than one core
> can create a full socket for the same SYN cookie. Currently, the core that
> loses the race frees the full socket without resetting inet_opt, so both
> sock_put() and reqsk_put() call kfree() on the same memory:
>
>   sock_put
>     sk_free
>       __sk_free
>         sk_destruct
>           __sk_destruct
>             sk->sk_destruct/inet_sock_destruct
>               kfree(rcu_dereference_protected(inet->inet_opt, 1));
>
>   reqsk_put
>     reqsk_free
>       __reqsk_free
>         req->rsk_ops->destructor/tcp_v4_reqsk_destructor
>           kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1));
>
> If kmalloc() is called between the two kfree() calls, this can lead to a
> use-after-free, so this patch fixes it by setting inet_opt to NULL before
> sock_put().
>
> As a side note, this kind of issue does not happen for IPv6. This is
> because tcp_v6_syn_recv_sock() clones both ipv6_opt and pktopts, which
> correspond to ireq_opt in IPv4.
>
> Fixes: 01770a166165 ("tcp: fix race condition when creating child sockets from syncookies")
> CC: Ricardo Dias <rdias@singlestore.com>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>

Ricardo, Eric, any reason this was written this way?

> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 58207c7769d0..87eb614dab27 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1595,6 +1595,8 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
>  		tcp_move_syn(newtp, req);
>  		ireq->ireq_opt = NULL;
>  	} else {
> +		newinet->inet_opt = NULL;
> +
>  		if (!req_unhash && found_dup_sk) {
>  			/* This code path should only be executed in the
>  			 * syncookie case only
> @@ -1602,8 +1604,6 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
>  			bh_unlock_sock(newsk);
>  			sock_put(newsk);
>  			newsk = NULL;
> -		} else {
> -			newinet->inet_opt = NULL;
>  		}
>  	}
>  	return newsk;
Eric Dumazet Jan. 20, 2021, 1:07 p.m. UTC | #2
On Wed, Jan 20, 2021 at 2:17 AM Jakub Kicinski <kuba@kernel.org> wrote:
> On Mon, 18 Jan 2021 14:59:20 +0900 Kuniyuki Iwashima wrote:
> > On receiving an ACK with a valid SYN cookie, cookie_v4_check() allocates
> > struct request_sock and may then allocate inet_rsk(req)->ireq_opt. After
> > that, tcp_v4_syn_recv_sock() allocates struct sock and copies the ireq_opt
> > pointer to inet_sk(sk)->inet_opt. Normally, tcp_v4_syn_recv_sock() inserts
> > the full socket into ehash and sets ireq_opt to NULL. Otherwise,
> > tcp_v4_syn_recv_sock() has to reset inet_opt to NULL and free the full
> > socket.
> >
> > Commit 01770a1661657 ("tcp: fix race condition when creating child
> > sockets from syncookies") added a new path in which more than one core
> > can create a full socket for the same SYN cookie. Currently, the core that
> > loses the race frees the full socket without resetting inet_opt, so both
> > sock_put() and reqsk_put() call kfree() on the same memory:
> >
> >   sock_put
> >     sk_free
> >       __sk_free
> >         sk_destruct
> >           __sk_destruct
> >             sk->sk_destruct/inet_sock_destruct
> >               kfree(rcu_dereference_protected(inet->inet_opt, 1));
> >
> >   reqsk_put
> >     reqsk_free
> >       __reqsk_free
> >         req->rsk_ops->destructor/tcp_v4_reqsk_destructor
> >           kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1));
> >
> > If kmalloc() is called between the two kfree() calls, this can lead to a
> > use-after-free, so this patch fixes it by setting inet_opt to NULL before
> > sock_put().
> >
> > As a side note, this kind of issue does not happen for IPv6. This is
> > because tcp_v6_syn_recv_sock() clones both ipv6_opt and pktopts, which
> > correspond to ireq_opt in IPv4.
> >
> > Fixes: 01770a166165 ("tcp: fix race condition when creating child sockets from syncookies")
> > CC: Ricardo Dias <rdias@singlestore.com>
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
> > Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
>
> Ricardo, Eric, any reason this was written this way?

Well, I guess that was a plain bug.

IPv4 options are not used often I think.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Jakub Kicinski Jan. 20, 2021, 4:57 p.m. UTC | #3
On Wed, 20 Jan 2021 14:07:35 +0100 Eric Dumazet wrote:
> On Wed, Jan 20, 2021 at 2:17 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > On Mon, 18 Jan 2021 14:59:20 +0900 Kuniyuki Iwashima wrote:
> > > On receiving an ACK with a valid SYN cookie, cookie_v4_check() allocates
> > > struct request_sock and may then allocate inet_rsk(req)->ireq_opt. After
> > > that, tcp_v4_syn_recv_sock() allocates struct sock and copies the ireq_opt
> > > pointer to inet_sk(sk)->inet_opt. Normally, tcp_v4_syn_recv_sock() inserts
> > > the full socket into ehash and sets ireq_opt to NULL. Otherwise,
> > > tcp_v4_syn_recv_sock() has to reset inet_opt to NULL and free the full
> > > socket.
> > >
> > > Commit 01770a1661657 ("tcp: fix race condition when creating child
> > > sockets from syncookies") added a new path in which more than one core
> > > can create a full socket for the same SYN cookie. Currently, the core that
> > > loses the race frees the full socket without resetting inet_opt, so both
> > > sock_put() and reqsk_put() call kfree() on the same memory:
> > >
> > >   sock_put
> > >     sk_free
> > >       __sk_free
> > >         sk_destruct
> > >           __sk_destruct
> > >             sk->sk_destruct/inet_sock_destruct
> > >               kfree(rcu_dereference_protected(inet->inet_opt, 1));
> > >
> > >   reqsk_put
> > >     reqsk_free
> > >       __reqsk_free
> > >         req->rsk_ops->destructor/tcp_v4_reqsk_destructor
> > >           kfree(rcu_dereference_protected(inet_rsk(req)->ireq_opt, 1));
> > >
> > > If kmalloc() is called between the two kfree() calls, this can lead to a
> > > use-after-free, so this patch fixes it by setting inet_opt to NULL before
> > > sock_put().
> > >
> > > As a side note, this kind of issue does not happen for IPv6. This is
> > > because tcp_v6_syn_recv_sock() clones both ipv6_opt and pktopts, which
> > > correspond to ireq_opt in IPv4.
> > >
> > > Fixes: 01770a166165 ("tcp: fix race condition when creating child sockets from syncookies")
> > > CC: Ricardo Dias <rdias@singlestore.com>
> > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
> > > Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com>
> >
> > Ricardo, Eric, any reason this was written this way?
>
> Well, I guess that was a plain bug.
>
> IPv4 options are not used often I think.

I see.

> Reviewed-by: Eric Dumazet <edumazet@google.com>

Applied, thank you!

Patch

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 58207c7769d0..87eb614dab27 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1595,6 +1595,8 @@  struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
 		tcp_move_syn(newtp, req);
 		ireq->ireq_opt = NULL;
 	} else {
+		newinet->inet_opt = NULL;
+
 		if (!req_unhash && found_dup_sk) {
 			/* This code path should only be executed in the
 			 * syncookie case only
@@ -1602,8 +1604,6 @@  struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
 			bh_unlock_sock(newsk);
 			sock_put(newsk);
 			newsk = NULL;
-		} else {
-			newinet->inet_opt = NULL;
 		}
 	}
 	return newsk;
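
Note for readers: the commit message's side note says the IPv6 path avoids
this class of bug because it clones the options instead of sharing one
pointer. A minimal userspace sketch of that strategy follows. The names
model_opt, model_opt_clone() and model_clone_is_independent() are
hypothetical and are not kernel APIs; the sketch only shows why two owners
of two separate buffers can each free their own copy safely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option blob; not a kernel structure. */
struct model_opt {
	size_t len;
	unsigned char data[8];
};

/* Return a newly allocated copy of src, or NULL if src is NULL or malloc() fails. */
static struct model_opt *model_opt_clone(const struct model_opt *src)
{
	struct model_opt *dst;

	if (!src)
		return NULL;

	dst = malloc(sizeof(*dst));
	if (dst)
		memcpy(dst, src, sizeof(*dst));
	return dst;
}

/*
 * Give the request and the socket their own buffers, then free both.
 * Returns 1 if the clone had the same contents as the original,
 * 0 if it did not, and -1 if an allocation failed.
 */
static int model_clone_is_independent(void)
{
	struct model_opt *req_opt = calloc(1, sizeof(*req_opt));
	struct model_opt *sk_opt;
	int same;

	if (!req_opt)
		return -1;

	req_opt->len = 4;
	memcpy(req_opt->data, "\x01\x02\x03\x04", 4);

	sk_opt = model_opt_clone(req_opt);
	if (!sk_opt) {
		free(req_opt);
		return -1;
	}

	/* A clone must never alias its source. */
	assert(sk_opt != req_opt);
	same = memcmp(sk_opt, req_opt, sizeof(*sk_opt)) == 0;

	/* Each owner frees its own buffer; no pointer needs to be reset first. */
	free(sk_opt);
	free(req_opt);
	return same;
}
```

The trade-off is an extra allocation and copy per child socket in exchange
for not having to track which side currently owns a shared buffer.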