neighbour: Prevent a dead entry from updating gc_list

Message ID 20210127165453.GA20514@chinagar-linux.qualcomm.com
State New

Commit Message

Chinmay Agarwal Jan. 27, 2021, 4:54 p.m. UTC
The following race condition was detected:
<CPU A, t0> - neigh_flush_dev() is executing and calls
neigh_mark_dead(n), marking the neighbour entry 'n' as dead.

<CPU B, t1> - Executing: __netif_receive_skb() ->
__netif_receive_skb_core() -> arp_rcv() -> arp_process(). arp_process()
calls __neigh_lookup(), which takes a reference on neighbour entry 'n'.

<CPU A, t2> - Moves further along neigh_flush_dev() and calls
neigh_cleanup_and_release(n), but since the reference count was
increased at t1, 'n' cannot be destroyed.

<CPU B, t3> - Moves further along arp_process() and calls
neigh_update() -> __neigh_update() -> neigh_update_gc_list(), which adds
the neighbour entry back to gc_list (neigh_mark_dead() had removed it
from gc_list at t0).

<CPU B, t4> - arp_process() finally calls neigh_release(n), destroying
the neighbour entry.

This leaves 'n' on gc_list even though the actual neighbour structure
has been freed.
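
The t0..t4 sequence can be replayed deterministically in user space. The
sketch below is only a model, not kernel code: struct entry, its fields,
and every helper name are hypothetical stand-ins, the two CPUs are
collapsed into one ordered call sequence, and "freed" is a flag rather
than a real free so the final state can be inspected safely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct neighbour. */
struct entry {
    int  refcnt;      /* references currently held */
    bool dead;        /* set once the entry is marked dead */
    bool on_gc_list;  /* models membership in gc_list */
    bool freed;       /* models destruction; memory is kept for inspection */
};

/* t0: models neigh_mark_dead(): mark dead and unlink from the list. */
static void mark_dead(struct entry *e)
{
    e->dead = true;
    e->on_gc_list = false;
}

/* t1: models the lookup taking a reference. */
static void hold(struct entry *e)
{
    e->refcnt++;
}

/* t2 and t4: drop one reference; the entry is destroyed at zero. */
static void release(struct entry *e)
{
    assert(e->refcnt > 0);
    if (--e->refcnt == 0)
        e->freed = true;
}

/* t3: models the pre-patch list update, which ignores e->dead. */
static void update_gc_list_unguarded(struct entry *e)
{
    e->on_gc_list = true;
}

/* Replays t0..t4; returns true if a freed entry is still on the list. */
static bool replay_race(void)
{
    /* One reference is held on behalf of the table before the flush. */
    struct entry n = { .refcnt = 1, .on_gc_list = true };

    mark_dead(&n);                 /* CPU A, t0 */
    hold(&n);                      /* CPU B, t1: refcnt 1 -> 2 */
    release(&n);                   /* CPU A, t2: refcnt 2 -> 1, not freed */
    update_gc_list_unguarded(&n);  /* CPU B, t3: back on the list */
    release(&n);                   /* CPU B, t4: refcnt 1 -> 0, freed */

    return n.freed && n.on_gc_list;
}
```

In this model replay_race() returns true, which corresponds to the
dangling gc_list entry described above.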

This can be prevented by never allowing a dead entry to update
gc_list, which is what this patch does.
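
A minimal sketch of the guard follows. It assumes that the exit path of
the update touches the list only when the "permanent" bit differs
between the old and the requested state; that exit path is not part of
the hunk in this patch, so treat it as an assumption. ST_PERMANENT,
struct entry, and update() are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

#define ST_PERMANENT 0x80  /* hypothetical stand-in for NUD_PERMANENT */

/* Hypothetical user-space stand-in for struct neighbour. */
struct entry {
    unsigned char state;
    bool dead;
    bool on_gc_list;
};

/*
 * Models an update whose exit path re-evaluates list membership only
 * when the permanent bit flips (assumed behaviour, see lead-in).
 * Returns 0 on success and -1 if the entry is dead.
 */
static int update(struct entry *e, unsigned char new_state)
{
    unsigned char old = e->state;
    int err = -1;

    if (e->dead) {
        /* Mirrors "new = old": the XOR below becomes zero. */
        new_state = old;
        goto out;
    }

    e->state = new_state;
    err = 0;
out:
    /* Non-permanent entries belong on the list, permanent ones do not. */
    if ((new_state ^ old) & ST_PERMANENT)
        e->on_gc_list = !(new_state & ST_PERMANENT);
    return err;
}
```

With the dead check placed first and the requested state reset to the
old one, a dead entry reaches the exit path with no state change, so
the list is left alone. A live entry going from permanent to
non-permanent is still added to the list as before.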

Fixes: 9c29a2f55ec0 ("neighbor: Fix locking order for gc_list changes")
Signed-off-by: Chinmay Agarwal <chinagar@codeaurora.org>
---
 net/core/neighbour.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--

Comments

Cong Wang Jan. 28, 2021, 2:34 a.m. UTC | #1
On Wed, Jan 27, 2021 at 8:55 AM Chinmay Agarwal <chinagar@codeaurora.org> wrote:
>
> The following race condition was detected:
> <CPU A, t0> - neigh_flush_dev() is executing and calls
> neigh_mark_dead(n), marking the neighbour entry 'n' as dead.
>
> <CPU B, t1> - Executing: __netif_receive_skb() ->
> __netif_receive_skb_core() -> arp_rcv() -> arp_process(). arp_process()
> calls __neigh_lookup(), which takes a reference on neighbour entry 'n'.
>
> <CPU A, t2> - Moves further along neigh_flush_dev() and calls
> neigh_cleanup_and_release(n), but since the reference count was
> increased at t1, 'n' cannot be destroyed.
>
> <CPU B, t3> - Moves further along arp_process() and calls
> neigh_update() -> __neigh_update() -> neigh_update_gc_list(), which adds
> the neighbour entry back to gc_list (neigh_mark_dead() had removed it
> from gc_list at t0).
>
> <CPU B, t4> - arp_process() finally calls neigh_release(n), destroying
> the neighbour entry.
>
> This leaves 'n' on gc_list even though the actual neighbour structure
> has been freed.
>
> This can be prevented by never allowing a dead entry to update
> gc_list, which is what this patch does.
>
> Fixes: 9c29a2f55ec0 ("neighbor: Fix locking order for gc_list changes")
> Signed-off-by: Chinmay Agarwal <chinagar@codeaurora.org>


Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>


Thanks.
David Ahern Jan. 30, 2021, 4:05 p.m. UTC | #2
On 1/27/21 9:54 AM, Chinmay Agarwal wrote:
> The following race condition was detected:
> <CPU A, t0> - neigh_flush_dev() is executing and calls
> neigh_mark_dead(n), marking the neighbour entry 'n' as dead.
>
> <CPU B, t1> - Executing: __netif_receive_skb() ->
> __netif_receive_skb_core() -> arp_rcv() -> arp_process(). arp_process()
> calls __neigh_lookup(), which takes a reference on neighbour entry 'n'.
>
> <CPU A, t2> - Moves further along neigh_flush_dev() and calls
> neigh_cleanup_and_release(n), but since the reference count was
> increased at t1, 'n' cannot be destroyed.
>
> <CPU B, t3> - Moves further along arp_process() and calls
> neigh_update() -> __neigh_update() -> neigh_update_gc_list(), which adds
> the neighbour entry back to gc_list (neigh_mark_dead() had removed it
> from gc_list at t0).
>
> <CPU B, t4> - arp_process() finally calls neigh_release(n), destroying
> the neighbour entry.
>
> This leaves 'n' on gc_list even though the actual neighbour structure
> has been freed.
>
> This can be prevented by never allowing a dead entry to update
> gc_list, which is what this patch does.
>
> Fixes: 9c29a2f55ec0 ("neighbor: Fix locking order for gc_list changes")

Always Cc the author(s) of the commits named in the Fixes tag.

> Signed-off-by: Chinmay Agarwal <chinagar@codeaurora.org>
> ---
>  net/core/neighbour.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>


Reviewed-by: David Ahern <dsahern@kernel.org>
Jakub Kicinski Feb. 2, 2021, 1:09 a.m. UTC | #3
On Wed, 27 Jan 2021 22:24:54 +0530 Chinmay Agarwal wrote:
> The following race condition was detected:
> <CPU A, t0> - neigh_flush_dev() is executing and calls
> neigh_mark_dead(n), marking the neighbour entry 'n' as dead.
>
> <CPU B, t1> - Executing: __netif_receive_skb() ->
> __netif_receive_skb_core() -> arp_rcv() -> arp_process(). arp_process()
> calls __neigh_lookup(), which takes a reference on neighbour entry 'n'.
>
> <CPU A, t2> - Moves further along neigh_flush_dev() and calls
> neigh_cleanup_and_release(n), but since the reference count was
> increased at t1, 'n' cannot be destroyed.
>
> <CPU B, t3> - Moves further along arp_process() and calls
> neigh_update() -> __neigh_update() -> neigh_update_gc_list(), which adds
> the neighbour entry back to gc_list (neigh_mark_dead() had removed it
> from gc_list at t0).
>
> <CPU B, t4> - arp_process() finally calls neigh_release(n), destroying
> the neighbour entry.
>
> This leaves 'n' on gc_list even though the actual neighbour structure
> has been freed.
>
> This can be prevented by never allowing a dead entry to update
> gc_list, which is what this patch does.
>
> Fixes: 9c29a2f55ec0 ("neighbor: Fix locking order for gc_list changes")
> Signed-off-by: Chinmay Agarwal <chinagar@codeaurora.org>


Applied, thanks!

Patch

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index ff07358..e2982b3 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1244,13 +1244,14 @@  static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 	old    = neigh->nud_state;
 	err    = -EPERM;
 
-	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
-	    (old & (NUD_NOARP | NUD_PERMANENT)))
-		goto out;
 	if (neigh->dead) {
 		NL_SET_ERR_MSG(extack, "Neighbor entry is now dead");
+		new = old;
 		goto out;
 	}
+	if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
+	    (old & (NUD_NOARP | NUD_PERMANENT)))
+		goto out;
 
 	ext_learn_change = neigh_update_ext_learned(neigh, flags, &notify);