[IPV6,1/1] ipv6: allocate enough headroom in ip6_finish_output2()

Message ID 3cb5a2e5-4e4c-728a-252d-4757b6c9612d@virtuozzo.com
State New

Commit Message

Vasily Averin July 7, 2021, 2:04 p.m. UTC
When the TEE target mirrors traffic to another interface, the sk_buff may
not have enough headroom to be processed correctly.
ip_finish_output2() detects this situation for IPv4 and allocates a
new skb with enough headroom. However, IPv6 lacks this logic in
ip6_finish_output2(), which leads to skb_under_panic:

 skbuff: skb_under_panic: text:ffffffffc0866ad4 len:96 put:24
 head:ffff97be85e31800 data:ffff97be85e317f8 tail:0x58 end:0xc0 dev:gre0
 ------------[ cut here ]------------
 kernel BUG at net/core/skbuff.c:110!
 invalid opcode: 0000 [#1] SMP PTI
 CPU: 2 PID: 393 Comm: kworker/2:2 Tainted: G           OE     5.13.0 #13
 Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.4 04/01/2014
 Workqueue: ipv6_addrconf addrconf_dad_work
 RIP: 0010:skb_panic+0x48/0x4a
 Call Trace:
  skb_push.cold.111+0x10/0x10
  ipgre_header+0x24/0xf0 [ip_gre]
  neigh_connected_output+0xae/0xf0
  ip6_finish_output2+0x1a8/0x5a0
  ip6_output+0x5c/0x110
  nf_dup_ipv6+0x158/0x1000 [nf_dup_ipv6]
  tee_tg6+0x2e/0x40 [xt_TEE]
  ip6t_do_table+0x294/0x470 [ip6_tables]
  nf_hook_slow+0x44/0xc0
  nf_hook.constprop.34+0x72/0xe0
  ndisc_send_skb+0x20d/0x2e0
  ndisc_send_ns+0xd1/0x210
  addrconf_dad_work+0x3c8/0x540
  process_one_work+0x1d1/0x370
  worker_thread+0x30/0x390
  kthread+0x116/0x130
  ret_from_fork+0x22/0x30

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
---
 net/ipv6/ip6_output.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
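
The numbers in the skb_under_panic line can be checked by hand. skb_push()
moves data down before it tests for underflow, so the report shows the state
after the failed push: data sits 8 bytes below head. Adding the 24 pushed
bytes back gives a headroom of 16 bytes before the call, 8 bytes short of what
ipgre_header() asked for. The sketch below is a userspace model of that
arithmetic only; struct toy_skb and the toy_* functions are hypothetical names
used for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Toy userspace model of a few sk_buff fields; NOT the kernel struct. */
struct toy_skb {
	uint64_t head;	/* start of the allocated buffer */
	uint64_t data;	/* start of the packet data */
	uint32_t len;	/* bytes of packet data */
};

/* Headroom is the distance from head to data; negative means underflow. */
static int64_t toy_headroom(const struct toy_skb *skb)
{
	return (int64_t)(skb->data - skb->head);
}

/* Undo a push of n bytes to recover the state just before the failing call. */
static struct toy_skb toy_before_push(struct toy_skb after, uint32_t n)
{
	assert(after.len >= n);
	after.data += n;
	after.len -= n;
	return after;
}

/* Values copied from the skb_under_panic line in the report above. */
static struct toy_skb toy_reported(void)
{
	struct toy_skb skb = {
		.head = 0xffff97be85e31800ULL,
		.data = 0xffff97be85e317f8ULL,
		.len = 96,
	};
	return skb;
}
```

Calling toy_before_push(toy_reported(), 24) yields the pre-push state, and
toy_headroom() on that result gives the headroom the mirrored skb actually had
when it reached the gre0 device.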

Comments

David Ahern July 7, 2021, 2:45 p.m. UTC | #1
On 7/7/21 8:04 AM, Vasily Averin wrote:
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index ff4f9eb..e5af740 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -61,9 +61,24 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
>  	struct dst_entry *dst = skb_dst(skb);
>  	struct net_device *dev = dst->dev;
>  	const struct in6_addr *nexthop;
> +	unsigned int hh_len = LL_RESERVED_SPACE(dev);
>  	struct neighbour *neigh;
>  	int ret;
>  
> +	/* Be paranoid, rather than too clever. */
> +	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
> +		struct sk_buff *skb2;
> +
> +		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));

why not use hh_len here?


> +		if (!skb2) {
> +			kfree_skb(skb);
> +			return -ENOMEM;
> +		}
> +		if (skb->sk)
> +			skb_set_owner_w(skb2, skb->sk);
> +		consume_skb(skb);
> +		skb = skb2;
> +	}
>  	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
>  		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
>  
>
Jakub Kicinski July 7, 2021, 4:42 p.m. UTC | #2
On Wed, 7 Jul 2021 08:45:13 -0600 David Ahern wrote:
> On 7/7/21 8:04 AM, Vasily Averin wrote:
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index ff4f9eb..e5af740 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -61,9 +61,24 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
> >  	struct dst_entry *dst = skb_dst(skb);
> >  	struct net_device *dev = dst->dev;
> >  	const struct in6_addr *nexthop;
> > +	unsigned int hh_len = LL_RESERVED_SPACE(dev);
> >  	struct neighbour *neigh;
> >  	int ret;
> >  
> > +	/* Be paranoid, rather than too clever. */
> > +	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
> > +		struct sk_buff *skb2;
> > +
> > +		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));  
> 
> why not use hh_len here?

Is there a reason for the new skb? Why not pskb_expand_head()?

> > +		if (!skb2) {
> > +			kfree_skb(skb);
> > +			return -ENOMEM;
> > +		}
> > +		if (skb->sk)
> > +			skb_set_owner_w(skb2, skb->sk);
> > +		consume_skb(skb);
> > +		skb = skb2;
> > +	}
> >  	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
> >  		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
Vasily Averin July 7, 2021, 5:53 p.m. UTC | #3
On 7/7/21 8:41 PM, Eric Dumazet wrote:
> On 7/7/21 6:42 PM, Jakub Kicinski wrote:
>> On Wed, 7 Jul 2021 08:45:13 -0600 David Ahern wrote:
>>> On 7/7/21 8:04 AM, Vasily Averin wrote:
>>>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>>>> index ff4f9eb..e5af740 100644
>>>> --- a/net/ipv6/ip6_output.c
>>>> +++ b/net/ipv6/ip6_output.c
>>>> @@ -61,9 +61,24 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
>>>>  	struct dst_entry *dst = skb_dst(skb);
>>>>  	struct net_device *dev = dst->dev;
>>>>  	const struct in6_addr *nexthop;
>>>> +	unsigned int hh_len = LL_RESERVED_SPACE(dev);
>>>>  	struct neighbour *neigh;
>>>>  	int ret;
>>>>  
>>>> +	/* Be paranoid, rather than too clever. */
>>>> +	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
>>>> +		struct sk_buff *skb2;
>>>> +
>>>> +		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));  
>>>
>>> why not use hh_len here?
>>
>> Is there a reason for the new skb? Why not pskb_expand_head()?
> 
> pskb_expand_head() might crash, if skb is shared.
> 
> We possibly can add a helper, factorizing all this,
> and eventually use pskb_expand_head() if safe.

Thank you for the feedback, I'll do it in the second version.
	Vasily Averin
Eric Dumazet July 7, 2021, 6:50 p.m. UTC | #4
On 7/7/21 8:30 PM, Jakub Kicinski wrote:
> On Wed, 7 Jul 2021 19:41:44 +0200 Eric Dumazet wrote:
>> On 7/7/21 6:42 PM, Jakub Kicinski wrote:
>>> On Wed, 7 Jul 2021 08:45:13 -0600 David Ahern wrote:  
>>>> why not use hh_len here?  
>>>
>>> Is there a reason for the new skb? Why not pskb_expand_head()?  
>>
>>
>> pskb_expand_head() might crash, if skb is shared.
>>
>> We possibly can add a helper, factorizing all this,
>> and eventually use pskb_expand_head() if safe.
> 
> Is there a strategically placed skb_share_check() somewhere further
> down? Otherwise there seems to be a lot of questionable skb_cow*()
> calls, also __skb_linearize() and skb_pad() are risky, no?
> Or is it that shared skbs are uncommon and syzbot doesn't hit them?
> 

Some of us try hard to remove skb_get() occurrences,
but they tend to re-appear fast :/

Refs: commit a516993f0ac1694673412eb2d16a091eafa77d2a
("net: fix wrong skb_get() usage / crash in IGMP/MLD parsing code")
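
The helper discussed in the thread (expand in place when that is safe,
otherwise fall back to a fresh copy) can be illustrated with a userspace toy
model. This is only a sketch under stated assumptions, not the code that was
later submitted: struct toy_buf, toy_alloc(), toy_put() and
toy_ensure_headroom() are hypothetical names, malloc() stands in for the
kernel allocators, and a plain integer stands in for the skb reference count.
The point it shows is the branch on sharing: a buffer with more than one user
must not be modified in place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy userspace buffer; loosely mirrors sk_buff, but is not kernel code. */
struct toy_buf {
	unsigned char *head;	/* start of the allocation */
	size_t headroom;	/* offset of the payload from head */
	size_t len;		/* payload bytes at head + headroom */
	int users;		/* reference count; > 1 means shared */
};

/* Allocate a buffer holding a copy of payload behind `headroom` spare bytes. */
static struct toy_buf *toy_alloc(size_t headroom, const void *payload, size_t len)
{
	struct toy_buf *b = malloc(sizeof(*b));
	size_t size = headroom + len;

	if (!b)
		return NULL;
	b->head = malloc(size ? size : 1);
	if (!b->head) {
		free(b);
		return NULL;
	}
	memcpy(b->head + headroom, payload, len);
	b->headroom = headroom;
	b->len = len;
	b->users = 1;
	return b;
}

/* Drop one reference; free the buffer when the last one goes away. */
static void toy_put(struct toy_buf *b)
{
	assert(b && b->users > 0);
	if (--b->users == 0) {
		free(b->head);
		free(b);
	}
}

/*
 * Return a buffer with at least `need` bytes of headroom.
 * Consumes the caller's reference on `b`; returns NULL on allocation failure.
 */
static struct toy_buf *toy_ensure_headroom(struct toy_buf *b, size_t need)
{
	if (b->headroom >= need)
		return b;

	if (b->users == 1) {
		/* Sole owner: safe to replace the storage in place. */
		unsigned char *nhead = malloc(need + b->len);

		if (!nhead) {
			toy_put(b);
			return NULL;
		}
		memcpy(nhead + need, b->head + b->headroom, b->len);
		free(b->head);
		b->head = nhead;
		b->headroom = need;
		return b;
	}

	/* Shared: leave the original untouched and hand back a private copy. */
	struct toy_buf *copy = toy_alloc(need, b->head + b->headroom, b->len);

	toy_put(b);
	return copy;
}
```

In this model an unshared buffer comes back as the same object with larger
headroom, while a shared one comes back as a new object and the other holder
keeps seeing the original layout.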
Patch

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ff4f9eb..e5af740 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -61,9 +61,24 @@  static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *dev = dst->dev;
 	const struct in6_addr *nexthop;
+	unsigned int hh_len = LL_RESERVED_SPACE(dev);
 	struct neighbour *neigh;
 	int ret;
 
+	/* Be paranoid, rather than too clever. */
+	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
+		struct sk_buff *skb2;
+
+		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));
+		if (!skb2) {
+			kfree_skb(skb);
+			return -ENOMEM;
+		}
+		if (skb->sk)
+			skb_set_owner_w(skb2, skb->sk);
+		consume_skb(skb);
+		skb = skb2;
+	}
 	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
 		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));