diff mbox series

[net] ip6_tunnel: set inner ipproto before ip6_tnl_encap.

Message ID 20201016111156.26927-1-ovov@yandex-team.ru
State Superseded
Headers show
Series [net] ip6_tunnel: set inner ipproto before ip6_tnl_encap. | expand

Commit Message

Alexander Ovechkin Oct. 16, 2020, 11:11 a.m. UTC
ip6_tnl_encap assigns to proto transport protocol which
encapsulates inner packet, but we must pass to set_inner_ipproto
protocol of that inner packet.

Calling set_inner_ipproto after ip6_tnl_encap might break gso.
For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto 
would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to
incorrect calling sequence of gso functions:
ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment
instead of:
ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment

Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
---
 net/ipv6/ip6_tunnel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Willem de Bruijn Oct. 16, 2020, 5:55 p.m. UTC | #1
On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>

> ip6_tnl_encap assigns to proto transport protocol which

> encapsulates inner packet, but we must pass to set_inner_ipproto

> protocol of that inner packet.

>

> Calling set_inner_ipproto after ip6_tnl_encap might break gso.

> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto

> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to

> incorrect calling sequence of gso functions:

> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment

> instead of:

> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment

>

> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>


Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved
the call from ip6_tnl_encap's caller to inside ip6_tnl_encap.

It makes sense that that likely broke this behavior for UDP (L4) tunnels.

But it was moved on purpose to avoid setting the inner protocol to
IPPROTO_MPLS. That needs to use skb->inner_protocol to further
segment.

I suspect we need to set this before or after conditionally to avoid
breaking that use case.
Vadim Fedorenko Oct. 17, 2020, 12:59 a.m. UTC | #2
On 16.10.2020 18:55, Willem de Bruijn wrote:
> On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:

>> ip6_tnl_encap assigns to proto transport protocol which

>> encapsulates inner packet, but we must pass to set_inner_ipproto

>> protocol of that inner packet.

>>

>> Calling set_inner_ipproto after ip6_tnl_encap might break gso.

>> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto

>> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to

>> incorrect calling sequence of gso functions:

>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment

>> instead of:

>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment

>>

>> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>

> Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved

> the call from ip6_tnl_encap's caller to inside ip6_tnl_encap.

>

> It makes sense that that likely broke this behavior for UDP (L4) tunnels.

>

> But it was moved on purpose to avoid setting the inner protocol to

> IPPROTO_MPLS. That needs to use skb->inner_protocol to further

> segment.

>

> I suspect we need to set this before or after conditionally to avoid

> breaking that use case.

I hope it could be fixed with something like this:

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index a0217e5..87368b0 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1121,6 +1121,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev, __u8 dsfield,
         bool use_cache = false;
         u8 hop_limit;
         int err = -1;
+       __u8 pproto = proto;

         if (t->parms.collect_md) {
                 hop_limit = skb_tunnel_info(skb)->key.ttl;
@@ -1280,7 +1281,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device 
*dev, __u8 dsfield,
                 ipv6_push_frag_opts(skb, &opt.ops, &proto);
         }

-       skb_set_inner_ipproto(skb, proto);
+       skb_set_inner_ipproto(skb, pproto == IPPROTO_MPLS ? proto : pproto);

         skb_push(skb, sizeof(struct ipv6hdr));
         skb_reset_network_header(skb);
Alexander Ovechkin Oct. 27, 2020, 9:44 p.m. UTC | #3
> But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment.
And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit.
Can you please give example when this patch breaks MPLS segmentation?

> On 16 Oct 2020, at 20:55, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> 
> On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>> 
>> ip6_tnl_encap assigns to proto transport protocol which
>> encapsulates inner packet, but we must pass to set_inner_ipproto
>> protocol of that inner packet.
>> 
>> Calling set_inner_ipproto after ip6_tnl_encap might break gso.
>> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto
>> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to
>> incorrect calling sequence of gso functions:
>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment
>> instead of:
>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment
>> 
>> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
> 
> Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved
> the call from ip6_tnl_encap's caller to inside ip6_tnl_encap.
> 
> It makes sense that that likely broke this behavior for UDP (L4) tunnels.
> 
> But it was moved on purpose to avoid setting the inner protocol to
> IPPROTO_MPLS. That needs to use skb->inner_protocol to further
> segment.
> 
> I suspect we need to set this before or after conditionally to avoid
> breaking that use case.
Alexander Ovechkin Oct. 27, 2020, 9:45 p.m. UTC | #4
But ip6_tnl_encap assigns to proto number of outer protocol (i.e. =
protocol that encapsulates our original packet). Setting inner_ipproto =
to this value makes no sense.=20

For example in case of ipv6 inside MPLS inside fou6 encapsulation we =
have following packet structure:
+--------------+ <---+
|    ipv6      |     |
+--------------+     +- fou6-encap
|     udp      |     |
+--------------+ <---+
|     mpls     | <---   mpls-enacp
+--------------+ <---+
|  inner-ipv6  |     |
+--------------+     +- original packet
|     ...      |     |
+--------------+ <---+
After ip6_tnl_encap proto will be equal to IPPROTO_UDP, that is =
obviously is not inner ipproto.

Actually if pproto =3D=3D IPPROTO_MPLS than we have two layers of =
encapsulation and it is meaningless to set inner ipproto, cause =
currently there is no support for segmentation of packets with two =
layers of encapsulation.


> On 17 Oct 2020, at 03:59, Vadim Fedorenko <vfedorenko@novek.ru> wrote:
> 
> On 16.10.2020 18:55, Willem de Bruijn wrote:
>> On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>>> ip6_tnl_encap assigns to proto transport protocol which
>>> encapsulates inner packet, but we must pass to set_inner_ipproto
>>> protocol of that inner packet.
>>> 
>>> Calling set_inner_ipproto after ip6_tnl_encap might break gso.
>>> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto
>>> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to
>>> incorrect calling sequence of gso functions:
>>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment
>>> instead of:
>>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment
>>> 
>>> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
>> Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved
>> the call from ip6_tnl_encap's caller to inside ip6_tnl_encap.
>> 
>> It makes sense that that likely broke this behavior for UDP (L4) tunnels.
>> 
>> But it was moved on purpose to avoid setting the inner protocol to
>> IPPROTO_MPLS. That needs to use skb->inner_protocol to further
>> segment.
>> 
>> I suspect we need to set this before or after conditionally to avoid
>> breaking that use case.
> I hope it could be fixed with something like this:
> 
> diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
> index a0217e5..87368b0 100644
> --- a/net/ipv6/ip6_tunnel.c
> +++ b/net/ipv6/ip6_tunnel.c
> @@ -1121,6 +1121,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
>         bool use_cache = false;
>         u8 hop_limit;
>         int err = -1;
> +       __u8 pproto = proto;
> 
>         if (t->parms.collect_md) {
>                 hop_limit = skb_tunnel_info(skb)->key.ttl;
> @@ -1280,7 +1281,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
>                 ipv6_push_frag_opts(skb, &opt.ops, &proto);
>         }
> 
> -       skb_set_inner_ipproto(skb, proto);
> +       skb_set_inner_ipproto(skb, pproto == IPPROTO_MPLS ? proto : pproto);
> 
>         skb_push(skb, sizeof(struct ipv6hdr));
>         skb_reset_network_header(skb);
>
Willem de Bruijn Oct. 28, 2020, 1:53 a.m. UTC | #5
On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>
> > But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment.
> And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit.
> Can you please give example when this patch breaks MPLS segmentation?

mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After
setting skb->protocol based on skb->inner_protocol.

perhaps mpls encap gso and udp tunnel gso simply cannot be enabled
together, because both use skb->inner_(ipproto|protocol) to demultiplex:

  18    163  net/ipv4/udp_offload.c <<skb_udp_tunnel_segment>>
             protocol = skb->inner_protocol;
  19     35  net/mpls/mpls_gso.c <<mpls_gso_segment>>
             skb->protocol = skb->inner_protocol;

   3    168  net/ipv4/udp_offload.c <<skb_udp_tunnel_segment>>
             ops = rcu_dereference(offloads[skb->inner_ipproto]);

Please don't top post, btw.


> > On 16 Oct 2020, at 20:55, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> >
> > On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
> >>
> >> ip6_tnl_encap assigns to proto transport protocol which
> >> encapsulates inner packet, but we must pass to set_inner_ipproto
> >> protocol of that inner packet.
> >>
> >> Calling set_inner_ipproto after ip6_tnl_encap might break gso.
> >> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto
> >> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to
> >> incorrect calling sequence of gso functions:
> >> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment
> >> instead of:
> >> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment
> >>
> >> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
> >
> > Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved
> > the call from ip6_tnl_encap's caller to inside ip6_tnl_encap.
> >
> > It makes sense that that likely broke this behavior for UDP (L4) tunnels.
> >
> > But it was moved on purpose to avoid setting the inner protocol to
> > IPPROTO_MPLS. That needs to use skb->inner_protocol to further
> > segment.
> >
> > I suspect we need to set this before or after conditionally to avoid
> > breaking that use case.
>
Alexander Ovechkin Oct. 29, 2020, 4:23 a.m. UTC | #6
On 28 Oct 2020, at 01:53 UTC Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
> >
> > > But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment.
> > And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit.
> > Can you please give example when this patch breaks MPLS segmentation?
> 
> mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After
> setting skb->protocol based on skb->inner_protocol.

Yeah, but mpls_gso_segment is called before ip6_tnl_xmit (because tun devices don't have NETIF_F_GSO_SOFTWARE in their mpls_features), so it does not matter to what value ip6_tnl_xmit sets skb->inner_ipproto.
And even if gso would been called after both mpls_xmit and ip6_tnl_xmit it would fail as you have written.
Willem de Bruijn Oct. 29, 2020, 2:40 p.m. UTC | #7
On Thu, Oct 29, 2020 at 3:46 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>
> On 28 Oct 2020, at 01:53 UTC Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
> > >
> > > > But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment.
> > > And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit.
> > > Can you please give example when this patch breaks MPLS segmentation?
> >
> > mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After
> > setting skb->protocol based on skb->inner_protocol.
>
> Yeah, but mpls_gso_segment is called before ip6_tnl_xmit (because tun devices don't have NETIF_F_GSO_SOFTWARE in their mpls_features), so it does not matter to what value ip6_tnl_xmit sets skb->inner_ipproto.
> And even if gso would been called after both mpls_xmit and ip6_tnl_xmit it would fail as you have written.

Good point. Okay, if no mpls gso packets can make it here, then it
should not matter.

Vadim, are we missing another reason for this move?

Else, no other concerns from me. Please do add a Fixes tag.
Vadim Fedorenko Oct. 29, 2020, 3:50 p.m. UTC | #8
On 29.10.2020 14:40, Willem de Bruijn wrote:
> On Thu, Oct 29, 2020 at 3:46 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>> On 28 Oct 2020, at 01:53 UTC Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
>>> On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>>>>> But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment.
>>>> And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit.
>>>> Can you please give example when this patch breaks MPLS segmentation?
>>> mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After
>>> setting skb->protocol based on skb->inner_protocol.
>> Yeah, but mpls_gso_segment is called before ip6_tnl_xmit (because tun devices don't have NETIF_F_GSO_SOFTWARE in their mpls_features), so it does not matter to what value ip6_tnl_xmit sets skb->inner_ipproto.
>> And even if gso would been called after both mpls_xmit and ip6_tnl_xmit it would fail as you have written.
> Good point. Okay, if no mpls gso packets can make it here, then it
> should not matter.
>
> Vadim, are we missing another reason for this move?
>
> Else, no other concerns from me. Please do add a Fixes tag.
I need a bit of time to repeat all the tests I've done earlier. Will be back 
soon with the results.
Vadim Fedorenko Oct. 30, 2020, 11:01 a.m. UTC | #9
On 29.10.2020 14:40, Willem de Bruijn wrote:
> On Thu, Oct 29, 2020 at 3:46 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>> On 28 Oct 2020, at 01:53 UTC Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
>>> On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote:
>>>>> But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment.
>>>> And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit.
>>>> Can you please give example when this patch breaks MPLS segmentation?
>>> mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After
>>> setting skb->protocol based on skb->inner_protocol.
>> Yeah, but mpls_gso_segment is called before ip6_tnl_xmit (because tun devices don't have NETIF_F_GSO_SOFTWARE in their mpls_features), so it does not matter to what value ip6_tnl_xmit sets skb->inner_ipproto.
>> And even if gso would been called after both mpls_xmit and ip6_tnl_xmit it would fail as you have written.
> Good point. Okay, if no mpls gso packets can make it here, then it
> should not matter.
>
> Vadim, are we missing another reason for this move?
>
> Else, no other concerns from me. Please do add a Fixes tag.
Could not reproduce the bug. Could you please provide a test scenario?

Anyway, all my scenarious with MPLS-in-IPv6 and MPLS-in-GUE are working so I'm 
ok with moving
Alexander Ovechkin Oct. 30, 2020, 12:54 p.m. UTC | #10
On 30 Oct 2020, at 14:01, Vadim Fedorenko <vfedorenko@novek.ru> wrote:
> Could not reproduce the bug. Could you please provide a test scenario?


It can be reproduced if your net device doesn’t support udp tunnel segmentation (i.e its features do not have SKB_GSO_UDP_TUNNEL).
If you try to send packet larger than the MTU fou6-only tunnel (without any other encap) it will be dropped, because of invalid skb->inner_ipproto (that will be equal to IPPROTO_UDP — outer protocol, instead of IPPROTO_IPV6).
skb->inner_ipproto is used here:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/ipv4/udp_offload.c?id=07e0887302450a62f51dba72df6afb5fabb23d1c#n168
Vadim Fedorenko Oct. 31, 2020, 1:57 a.m. UTC | #11
On 30.10.2020 12:54, Alexander Ovechkin wrote:
> On 30 Oct 2020, at 14:01, Vadim Fedorenko <vfedorenko@novek.ru> wrote:

>> Could not reproduce the bug. Could you please provide a test scenario?

> It can be reproduced if your net device doesn’t support udp tunnel segmentation (i.e its features do not have SKB_GSO_UDP_TUNNEL).

> If you try to send packet larger than the MTU fou6-only tunnel (without any other encap) it will be dropped, because of invalid skb->inner_ipproto (that will be equal to IPPROTO_UDP — outer protocol, instead of IPPROTO_IPV6).

> skb->inner_ipproto is used here:

> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/ipv4/udp_offload.c?id=07e0887302450a62f51dba72df6afb5fabb23d1c#n168

Ok, all my tests show that MPLS encap is working after this moving, so I have no 
concerns too.
diff mbox series

Patch

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index a0217e5bf3bc..648db3fe508f 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1271,6 +1271,8 @@  int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 	if (max_headroom > dev->needed_headroom)
 		dev->needed_headroom = max_headroom;
 
+	skb_set_inner_ipproto(skb, proto);
+
 	err = ip6_tnl_encap(skb, t, &proto, fl6);
 	if (err)
 		return err;
@@ -1280,8 +1282,6 @@  int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 		ipv6_push_frag_opts(skb, &opt.ops, &proto);
 	}
 
-	skb_set_inner_ipproto(skb, proto);
-
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
 	ipv6h = ipv6_hdr(skb);