
net: always use icmp{,v6}_ndo_send from ndo_start_xmit

Message ID 20210225234631.2547776-1-Jason@zx2c4.com
State Superseded
Series net: always use icmp{,v6}_ndo_send from ndo_start_xmit

Commit Message

Jason A. Donenfeld Feb. 25, 2021, 11:46 p.m. UTC
There were a few remaining tunnel drivers that didn't receive the prior
conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to
memory corruption (see ee576c47db60 ("net: icmp: pass zeroed opts from
icmp{,v6}_ndo_send before sending") for details), there's even more
imperative to have these all converted. So this commit goes through the
remaining cases that I could find and does a boring translation to the
ndo variety.

Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

---
 net/ipv4/ip_tunnel.c  |  5 ++---
 net/ipv4/ip_vti.c     |  6 +++---
 net/ipv6/ip6_gre.c    | 16 ++++++++--------
 net/ipv6/ip6_tunnel.c | 10 +++++-----
 net/ipv6/ip6_vti.c    |  6 +++---
 net/ipv6/sit.c        |  2 +-
 6 files changed, 22 insertions(+), 23 deletions(-)

-- 
2.30.1

Comments

Willem de Bruijn Feb. 26, 2021, 9:25 p.m. UTC | #1
On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> There were a few remaining tunnel drivers that didn't receive the prior
> conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to
> memory corruption (see ee576c47db60 ("net: icmp: pass zeroed opts from
> icmp{,v6}_ndo_send before sending") for details), there's even more
> imperative to have these all converted. So this commit goes through the
> remaining cases that I could find and does a boring translation to the
> ndo variety.
>
> Cc: Willem de Bruijn <willemb@google.com>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Using a stack variable over skb->cb[] is definitely the right fix for
all of these. Thanks Jason.
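
The cb[] aliasing in question, roughly (paraphrased from
include/net/ip.h and include/linux/icmp.h around the commits referenced
above; the exact code may differ):

/* skb->cb[] is 48 bytes of per-layer scratch space; IPCB() just casts it. */
#define IPCB(skb) ((struct inet_skb_parm *)((skb)->cb))

static inline void icmp_send(struct sk_buff *skb_in, int type, int code,
                             __be32 info)
{
        /* Assumes cb[] still holds IP-layer data, including parsed
         * options.  On a tunnel device's xmit path it holds something
         * else entirely, so __icmp_send() ends up interpreting
         * unrelated bytes as IP options.
         */
        __icmp_send(skb_in, type, code, info, &IPCB(skb_in)->opt);
}

icmp_ndo_send() instead passes a zeroed struct ip_options from the
stack -- the "stack variable" referred to above.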

Only part that I don't fully know is the conntrack conversion. That is
a behavioral change. What is the context behind that? I assume it's
fine. In that if needed, that is the case for all devices, nothing
specific about the couple that call icmp(v6)_ndo_send already.
Jason A. Donenfeld Feb. 26, 2021, 10:22 p.m. UTC | #2
On Fri, Feb 26, 2021 at 10:25 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > There were a few remaining tunnel drivers that didn't receive the prior
> > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to
> > memory corruption (see ee576c47db60 ("net: icmp: pass zeroed opts from
> > icmp{,v6}_ndo_send before sending") for details), there's even more
> > imperative to have these all converted. So this commit goes through the
> > remaining cases that I could find and does a boring translation to the
> > ndo variety.
> >
> > Cc: Willem de Bruijn <willemb@google.com>
> > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
>
> Using a stack variable over skb->cb[] is definitely the right fix for
> all of these. Thanks Jason.
>
> Only part that I don't fully know is the conntrack conversion. That is
> a behavioral change. What is the context behind that? I assume it's
> fine. In that if needed, that is the case for all devices, nothing
> specific about the couple that call icmp(v6)_ndo_send already.


That's actually a sensible change anyway. icmp_send does something
bogus if the packet has already passed through netfilter, which is why
the ndo variant was adopted. So it's good and correct for these to
change in that way.

Jason
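
To make the conntrack part concrete, a condensed paraphrase of
icmp_ndo_send() from net/ipv4/icmp.c as of the commits referenced above
(the real function also clones shared skbs and checks that the IPv4
header is present and writable before touching it; icmpv6_ndo_send() is
the analogous IPv6 helper):

void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info)
{
        struct ip_options opts = { 0 }; /* stack variable, not skb->cb[] */
        enum ip_conntrack_info ctinfo;
        struct nf_conn *ct;
        __be32 orig_ip;

        ct = nf_ct_get(skb_in, &ctinfo);
        if (!ct || !(ct->status & IPS_SRC_NAT)) {
                /* No SNAT: behave like icmp_send(), minus the cb[] misuse. */
                __icmp_send(skb_in, type, code, info, &opts);
                return;
        }

        /* SNAT already rewrote the source address.  Temporarily restore
         * the pre-NAT address so the ICMP error is routed back to, and
         * rate-limited against, the real sender.
         */
        orig_ip = ip_hdr(skb_in)->saddr;
        ip_hdr(skb_in)->saddr = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.ip;
        __icmp_send(skb_in, type, code, info, &opts);
        ip_hdr(skb_in)->saddr = orig_ip;
}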
Willem de Bruijn Feb. 26, 2021, 11:28 p.m. UTC | #3
On Fri, Feb 26, 2021 at 5:39 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Fri, Feb 26, 2021 at 10:25 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > >
> > > There were a few remaining tunnel drivers that didn't receive the prior
> > > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to
> > > memory corruption (see ee576c47db60 ("net: icmp: pass zeroed opts from
> > > icmp{,v6}_ndo_send before sending") for details), there's even more
> > > imperative to have these all converted. So this commit goes through the
> > > remaining cases that I could find and does a boring translation to the
> > > ndo variety.
> > >
> > > Cc: Willem de Bruijn <willemb@google.com>
> > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> >
> > Using a stack variable over skb->cb[] is definitely the right fix for
> > all of these. Thanks Jason.
> >
> > Only part that I don't fully know is the conntrack conversion. That is
> > a behavioral change. What is the context behind that? I assume it's
> > fine. In that if needed, that is the case for all devices, nothing
> > specific about the couple that call icmp(v6)_ndo_send already.
>
> That's actually a sensible change anyway. icmp_send does something
> bogus if the packet has already passed through netfilter, which is why
> the ndo variant was adopted. So it's good and correct for these to
> change in that way.
>
> Jason


Something bogus, how? Does this apply to all uses of conntrack?
Specifically NAT? Not trying to be obtuse, but I really find it hard
to evaluate that part.

Please cc: the maintainers for patches that are meant to be merged, btw.
Jakub Kicinski Feb. 26, 2021, 11:54 p.m. UTC | #4
On Fri, 26 Feb 2021 18:28:56 -0500 Willem de Bruijn wrote:
> Please cc: the maintainers for patches that are meant to be merged, btw.


I was about to say. Please repost.
Jason A. Donenfeld Feb. 27, 2021, 12:33 a.m. UTC | #5
On Sat, Feb 27, 2021 at 12:29 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Fri, Feb 26, 2021 at 5:39 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > On Fri, Feb 26, 2021 at 10:25 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> > >
> > > On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > > >
> > > > There were a few remaining tunnel drivers that didn't receive the prior
> > > > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to
> > > > memory corruption (see ee576c47db60 ("net: icmp: pass zeroed opts from
> > > > icmp{,v6}_ndo_send before sending") for details), there's even more
> > > > imperative to have these all converted. So this commit goes through the
> > > > remaining cases that I could find and does a boring translation to the
> > > > ndo variety.
> > > >
> > > > Cc: Willem de Bruijn <willemb@google.com>
> > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > >
> > > Using a stack variable over skb->cb[] is definitely the right fix for
> > > all of these. Thanks Jason.
> > >
> > > Only part that I don't fully know is the conntrack conversion. That is
> > > a behavioral change. What is the context behind that? I assume it's
> > > fine. In that if needed, that is the case for all devices, nothing
> > > specific about the couple that call icmp(v6)_ndo_send already.
> >
> > That's actually a sensible change anyway. icmp_send does something
> > bogus if the packet has already passed through netfilter, which is why
> > the ndo variant was adopted. So it's good and correct for these to
> > change in that way.
> >
> > Jason
>
> Something bogus, how? Does this apply to all uses of conntrack?
> Specifically NAT? Not trying to be obtuse, but I really find it hard
> to evaluate that part.


By the time packets hit ndo_start_xmit, the src address has changed,
and icmp can't deliver to the actual source, and its rate limiting
works against the wrong source. All of this was explained, justified,
and discussed on the original icmp_ndo_send patchset, which included
the function and converted drivers to use it. However, a few spots
were missed, which this patchset cleans up. Here's the merge with
details:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=803381f9f117493d6204d82445a530c834040fe6

The network devices that this patch here adjusts are no different than
the four I originally fixed up in that series -- xfrmi, gtp, sunvnet,
wireguard.
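
For the IPv6 call sites in the patch, icmpv6_ndo_send() follows the
same pattern -- a rough paraphrase of net/ipv6/ip6_icmp.c as of
ee576c47db60, again omitting the shared-skb and writability handling,
so details may differ:

void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
{
        struct inet6_skb_parm parm = { 0 }; /* zeroed, on the stack */
        enum ip_conntrack_info ctinfo;
        struct in6_addr orig_ip;
        struct nf_conn *ct;

        ct = nf_ct_get(skb_in, &ctinfo);
        if (!ct || !(ct->status & IPS_SRC_NAT)) {
                __icmpv6_send(skb_in, type, code, info, &parm);
                return;
        }

        /* Same idea as the IPv4 helper: undo SNAT for the duration of
         * the ICMPv6 error so it reaches, and is rate-limited against,
         * the real source.
         */
        orig_ip = ipv6_hdr(skb_in)->saddr;
        ipv6_hdr(skb_in)->saddr = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.in6;
        __icmpv6_send(skb_in, type, code, info, &parm);
        ipv6_hdr(skb_in)->saddr = orig_ip;
}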

> Please cc: the maintainers for patches that are meant to be merged, btw.


Whoops. I'll do so and repost.
Willem de Bruijn Feb. 27, 2021, 2:10 a.m. UTC | #6
On Fri, Feb 26, 2021 at 7:42 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Sat, Feb 27, 2021 at 12:29 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
> >
> > On Fri, Feb 26, 2021 at 5:39 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > >
> > > On Fri, Feb 26, 2021 at 10:25 PM Willem de Bruijn
> > > <willemdebruijn.kernel@gmail.com> wrote:
> > > >
> > > > On Thu, Feb 25, 2021 at 6:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > > > >
> > > > > There were a few remaining tunnel drivers that didn't receive the prior
> > > > > conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to
> > > > > memory corruption (see ee576c47db60 ("net: icmp: pass zeroed opts from
> > > > > icmp{,v6}_ndo_send before sending") for details), there's even more
> > > > > imperative to have these all converted. So this commit goes through the
> > > > > remaining cases that I could find and does a boring translation to the
> > > > > ndo variety.
> > > > >
> > > > > Cc: Willem de Bruijn <willemb@google.com>
> > > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > >
> > > > Using a stack variable over skb->cb[] is definitely the right fix for
> > > > all of these. Thanks Jason.
> > > >
> > > > Only part that I don't fully know is the conntrack conversion. That is
> > > > a behavioral change. What is the context behind that? I assume it's
> > > > fine. In that if needed, that is the case for all devices, nothing
> > > > specific about the couple that call icmp(v6)_ndo_send already.
> > >
> > > That's actually a sensible change anyway. icmp_send does something
> > > bogus if the packet has already passed through netfilter, which is why
> > > the ndo variant was adopted. So it's good and correct for these to
> > > change in that way.
> > >
> > > Jason
> >
> > Something bogus, how? Does this apply to all uses of conntrack?
> > Specifically NAT? Not trying to be obtuse, but I really find it hard
> > to evaluate that part.
>
> By the time packets hit ndo_start_xmit, the src address has changed,
> and icmp can't deliver to the actual source, and its rate limiting
> works against the wrong source. All of this was explained, justified,
> and discussed on the original icmp_ndo_send patchset, which included
> the function and converted drivers to use it. However, a few spots
> were missed, which this patchset cleans up. Here's the merge with
> details:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=803381f9f117493d6204d82445a530c834040fe6


Thanks for the link. I used git blame to take a look at the patch that
introduced the code before asking for more context, but this explanation
was in the merge commit, so I missed it.

> The network devices that this patch here adjusts are no different than
> the four I originally fixed up in that series -- xfrmi, gtp, sunvnet,
> wireguard.


Agreed.


> > Please cc: the maintainers for patches that are meant to be merged, btw.
>
> Whoops. I'll do so and repost.

Patch

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 76a420c76f16..f6cc26de5ed3 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -502,8 +502,7 @@  static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
 		if (!skb_is_gso(skb) &&
 		    (inner_iph->frag_off & htons(IP_DF)) &&
 		    mtu < pkt_size) {
-			memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
-			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
+			icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
 			return -E2BIG;
 		}
 	}
@@ -527,7 +526,7 @@  static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
 
 		if (!skb_is_gso(skb) && mtu >= IPV6_MIN_MTU &&
 					mtu < pkt_size) {
-			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+			icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 			return -E2BIG;
 		}
 	}
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index abc171e79d3e..eb207089ece0 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -238,13 +238,13 @@  static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 	if (skb->len > mtu) {
 		skb_dst_update_pmtu_no_confirm(skb, mtu);
 		if (skb->protocol == htons(ETH_P_IP)) {
-			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
-				  htonl(mtu));
+			icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+				      htonl(mtu));
 		} else {
 			if (mtu < IPV6_MIN_MTU)
 				mtu = IPV6_MIN_MTU;
 
-			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+			icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 		}
 
 		dst_release(dst);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index c3bc89b6b1a1..1baf43aacb2e 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -678,8 +678,8 @@  static int prepare_ip6gre_xmit_ipv6(struct sk_buff *skb,
 
 		tel = (struct ipv6_tlv_tnl_enc_lim *)&skb_network_header(skb)[offset];
 		if (tel->encap_limit == 0) {
-			icmpv6_send(skb, ICMPV6_PARAMPROB,
-				    ICMPV6_HDR_FIELD, offset + 2);
+			icmpv6_ndo_send(skb, ICMPV6_PARAMPROB,
+					ICMPV6_HDR_FIELD, offset + 2);
 			return -1;
 		}
 		*encap_limit = tel->encap_limit - 1;
@@ -805,8 +805,8 @@  static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, struct net_device *dev)
 	if (err != 0) {
 		/* XXX: send ICMP error even if DF is not set. */
 		if (err == -EMSGSIZE)
-			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
-				  htonl(mtu));
+			icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+				      htonl(mtu));
 		return -1;
 	}
 
@@ -837,7 +837,7 @@  static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, struct net_device *dev)
 			  &mtu, skb->protocol);
 	if (err != 0) {
 		if (err == -EMSGSIZE)
-			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+			icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 		return -1;
 	}
 
@@ -1063,10 +1063,10 @@  static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
 		/* XXX: send ICMP error even if DF is not set. */
 		if (err == -EMSGSIZE) {
 			if (skb->protocol == htons(ETH_P_IP))
-				icmp_send(skb, ICMP_DEST_UNREACH,
-					  ICMP_FRAG_NEEDED, htonl(mtu));
+				icmp_ndo_send(skb, ICMP_DEST_UNREACH,
+					      ICMP_FRAG_NEEDED, htonl(mtu));
 			else
-				icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+				icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 		}
 
 		goto tx_err;
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index a7950baa05e5..3fa0eca5a06f 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -1332,8 +1332,8 @@  ipxip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev,
 
 				tel = (void *)&skb_network_header(skb)[offset];
 				if (tel->encap_limit == 0) {
-					icmpv6_send(skb, ICMPV6_PARAMPROB,
-						ICMPV6_HDR_FIELD, offset + 2);
+					icmpv6_ndo_send(skb, ICMPV6_PARAMPROB,
+							ICMPV6_HDR_FIELD, offset + 2);
 					return -1;
 				}
 				encap_limit = tel->encap_limit - 1;
@@ -1385,11 +1385,11 @@  ipxip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev,
 		if (err == -EMSGSIZE)
 			switch (protocol) {
 			case IPPROTO_IPIP:
-				icmp_send(skb, ICMP_DEST_UNREACH,
-					  ICMP_FRAG_NEEDED, htonl(mtu));
+				icmp_ndo_send(skb, ICMP_DEST_UNREACH,
+					      ICMP_FRAG_NEEDED, htonl(mtu));
 				break;
 			case IPPROTO_IPV6:
-				icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+				icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 				break;
 			default:
 				break;
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index 0225fd694192..f10e7a72ea62 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -521,10 +521,10 @@  vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 			if (mtu < IPV6_MIN_MTU)
 				mtu = IPV6_MIN_MTU;
 
-			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+			icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 		} else {
-			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
-				  htonl(mtu));
+			icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+				      htonl(mtu));
 		}
 
 		err = -EMSGSIZE;
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 93636867aee2..63ccd9f2dccc 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -987,7 +987,7 @@  static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 			skb_dst_update_pmtu_no_confirm(skb, mtu);
 
 		if (skb->len > mtu && !skb_is_gso(skb)) {
-			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+			icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 			ip_rt_put(rt);
 			goto tx_error;
 		}