diff mbox series

[net] ip6_tunnel: fix GRE6 segmentation

Message ID 20210622015254.1967716-1-kuba@kernel.org
State New
Headers show
Series [net] ip6_tunnel: fix GRE6 segmentation | expand

Commit Message

Jakub Kicinski June 22, 2021, 1:52 a.m. UTC
Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support")
moved assiging inner_ipproto down from ipxip6_tnl_xmit() to
its callee ip6_tnl_xmit(). The latter is also used by GRE.

Since commit 38720352412a ("gre: Use inner_proto to obtain inner
header protocol") GRE had been depending on skb->inner_protocol
during segmentation. It sets it in gre_build_header() and reads
it in gre_gso_segment(). Changes to ip6_tnl_xmit() overwrite
the protocol, resulting in GSO skbs getting dropped.

Note that inner_protocol is a union with inner_ipproto,
GRE uses the former while the change switched it to the latter
(always setting it to just IPPROTO_GRE).

Restore the original location of skb_set_inner_ipproto(),
it is unclear why it was moved in the first place.

Fixes: 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/ipv6/ip6_tunnel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Vadim Fedorenko June 22, 2021, 1:58 a.m. UTC | #1
On 22.06.2021 02:52, Jakub Kicinski wrote:
> Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support")
> moved assiging inner_ipproto down from ipxip6_tnl_xmit() to
> its callee ip6_tnl_xmit(). The latter is also used by GRE.
> 
> Since commit 38720352412a ("gre: Use inner_proto to obtain inner
> header protocol") GRE had been depending on skb->inner_protocol
> during segmentation. It sets it in gre_build_header() and reads
> it in gre_gso_segment(). Changes to ip6_tnl_xmit() overwrite
> the protocol, resulting in GSO skbs getting dropped.
> 
> Note that inner_protocol is a union with inner_ipproto,
> GRE uses the former while the change switched it to the latter
> (always setting it to just IPPROTO_GRE).
> 
> Restore the original location of skb_set_inner_ipproto(),
> it is unclear why it was moved in the first place.
> 
> Fixes: 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
Tested-by: Vadim Fedorenko <vfedorenko@novek.ru>
patchwork-bot+netdevbpf@kernel.org June 22, 2021, 5:40 p.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Mon, 21 Jun 2021 18:52:54 -0700 you wrote:
> Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support")
> moved assiging inner_ipproto down from ipxip6_tnl_xmit() to
> its callee ip6_tnl_xmit(). The latter is also used by GRE.
> 
> Since commit 38720352412a ("gre: Use inner_proto to obtain inner
> header protocol") GRE had been depending on skb->inner_protocol
> during segmentation. It sets it in gre_build_header() and reads
> it in gre_gso_segment(). Changes to ip6_tnl_xmit() overwrite
> the protocol, resulting in GSO skbs getting dropped.
> 
> [...]

Here is the summary with links:
  - [net] ip6_tunnel: fix GRE6 segmentation
    https://git.kernel.org/netdev/net/c/a6e3f2985a80

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
David Ahern June 23, 2021, 3:47 a.m. UTC | #3
On 6/22/21 4:24 PM, Jakub Kicinski wrote:
>>>   

>>

>> would be good to capture the GRE use case that found the bug and the

>> MPLS version as test cases under tools/testing/selftests/net. Both

>> should be doable using namespaces.

> 

> I believe Vadim is working on MPLS side, how does this look for GRE?

> 


I like the template you followed. :-)

The test case looks good to me, thanks for doing it.
Jakub Kicinski June 23, 2021, 4:14 p.m. UTC | #4
On Tue, 22 Jun 2021 21:47:45 -0600 David Ahern wrote:
> On 6/22/21 4:24 PM, Jakub Kicinski wrote:

> >> would be good to capture the GRE use case that found the bug and the

> >> MPLS version as test cases under tools/testing/selftests/net. Both

> >> should be doable using namespaces.  

> > 

> > I believe Vadim is working on MPLS side, how does this look for GRE?

> 

> I like the template you followed. :-)


:)

> The test case looks good to me, thanks for doing it.


Noob question, why do we need that 2 sec wait with IPv6 sometimes?
I've seen it randomly in my local testing as well I wasn't sure if 
it's a bug or expected.

I make a v6 tunnel on top of a VLAN and for 2 secs after creation 
I get the wrong route in ip -6 r g.
David Ahern June 23, 2021, 6:28 p.m. UTC | #5
On 6/23/21 10:14 AM, Jakub Kicinski wrote:
> On Tue, 22 Jun 2021 21:47:45 -0600 David Ahern wrote:

>> On 6/22/21 4:24 PM, Jakub Kicinski wrote:

>>>> would be good to capture the GRE use case that found the bug and the

>>>> MPLS version as test cases under tools/testing/selftests/net. Both

>>>> should be doable using namespaces.  

>>>

>>> I believe Vadim is working on MPLS side, how does this look for GRE?

>>

>> I like the template you followed. :-)

> 

> :)

> 

>> The test case looks good to me, thanks for doing it.

> 

> Noob question, why do we need that 2 sec wait with IPv6 sometimes?

> I've seen it randomly in my local testing as well I wasn't sure if 

> it's a bug or expected.


It is to let IPv6 DAD to complete otherwise the address will not be
selected as a source address. That typically results in test failures.
There are sysctl settings that can prevent the race and hence the need
for the sleep.

> 

> I make a v6 tunnel on top of a VLAN and for 2 secs after creation 

> I get the wrong route in ip -6 r g.

>
Guillaume Nault June 24, 2021, 1:39 p.m. UTC | #6
On Wed, Jun 23, 2021 at 12:28:05PM -0600, David Ahern wrote:
> On 6/23/21 10:14 AM, Jakub Kicinski wrote:

> > Noob question, why do we need that 2 sec wait with IPv6 sometimes?

> > I've seen it randomly in my local testing as well I wasn't sure if 

> > it's a bug or expected.

> 

> It is to let IPv6 DAD to complete otherwise the address will not be

> selected as a source address. That typically results in test failures.

> There are sysctl settings that can prevent the race and hence the need

> for the sleep.


But Jakub's script uses "nodad" in the "ip address add ..." commands.
Isn't that supposed to disable DAD entirely for the new address?
Why would it need an additional "sleep 2"?
David Ahern June 24, 2021, 2:36 p.m. UTC | #7
On 6/24/21 7:39 AM, Guillaume Nault wrote:
> On Wed, Jun 23, 2021 at 12:28:05PM -0600, David Ahern wrote:

>> On 6/23/21 10:14 AM, Jakub Kicinski wrote:

>>> Noob question, why do we need that 2 sec wait with IPv6 sometimes?

>>> I've seen it randomly in my local testing as well I wasn't sure if 

>>> it's a bug or expected.

>>

>> It is to let IPv6 DAD to complete otherwise the address will not be

>> selected as a source address. That typically results in test failures.

>> There are sysctl settings that can prevent the race and hence the need

>> for the sleep.

> 

> But Jakub's script uses "nodad" in the "ip address add ..." commands.

> Isn't that supposed to disable DAD entirely for the new address?

> Why would it need an additional "sleep 2"?

> 


it should yes. I think the selftests have acquired a blend of nodad,
sysctl and sleep. I'm sure it could be cleaned up and made consistent.
Jakub Kicinski June 24, 2021, 4:33 p.m. UTC | #8
On Thu, 24 Jun 2021 08:36:12 -0600 David Ahern wrote:
> >> It is to let IPv6 DAD to complete otherwise the address will not be

> >> selected as a source address. That typically results in test failures.

> >> There are sysctl settings that can prevent the race and hence the need

> >> for the sleep.  

> > 

> > But Jakub's script uses "nodad" in the "ip address add ..." commands.

> > Isn't that supposed to disable DAD entirely for the new address?

> > Why would it need an additional "sleep 2"?

> >   

> 

> it should yes. I think the selftests have acquired a blend of nodad,

> sysctl and sleep. I'm sure it could be cleaned up and made consistent.


I was guessing that the DAD is happening on the link local address,
that's why, no? I'm using link local in the underlay.
diff mbox series

Patch

diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 288bafded998..28ca70af014a 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1239,8 +1239,6 @@  int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 	if (max_headroom > dev->needed_headroom)
 		dev->needed_headroom = max_headroom;
 
-	skb_set_inner_ipproto(skb, proto);
-
 	err = ip6_tnl_encap(skb, t, &proto, fl6);
 	if (err)
 		return err;
@@ -1377,6 +1375,8 @@  ipxip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev,
 	if (iptunnel_handle_offloads(skb, SKB_GSO_IPXIP6))
 		return -1;
 
+	skb_set_inner_ipproto(skb, protocol);
+
 	err = ip6_tnl_xmit(skb, dev, dsfield, &fl6, encap_limit, &mtu,
 			   protocol);
 	if (err != 0) {