Message ID | 20201017125011.2655391-1-vincent@bernat.ch |
---|---|
State | New |
Headers | show |
Series | [net-next,v1] net: evaluate net.conf.ipvX.all.ignore_routes_with_linkdown | expand |
On Sat, 17 Oct 2020 14:50:11 +0200 Vincent Bernat wrote: > Introduced in 0eeb075fad73, the "ignore_routes_with_linkdown" sysctl > ignores a route whose interface is down. It is provided as a > per-interface sysctl. However, while a "all" variant is exposed, it > was a noop since it was never evaluated. We use the usual "or" logic > for this kind of sysctls. > Without this patch, the two last lines would fail on H1 (the one using > the "all" sysctl). With the patch, everything succeeds as expected. > > Also document the sysctl in `ip-sysctl.rst`. > > Fixes: 0eeb075fad73 ("net: ipv4 sysctl option to ignore routes when nexthop link is down") > Signed-off-by: Vincent Bernat <vincent@bernat.ch> I'm not hearing any objections, but I have two questions: - do you intend to merge it for 5.10 or 5.11? Because it has a fixes tag, yet it's marked for net-next. If we put it in 5.10 it may get pulled into stable immediately, knowing how things work lately. - we have other sysctls that use IN_DEV_CONF_GET(), e.g. "proxy_arp_pvlan" should those also be converted?
On 10/19/20 6:53 PM, Jakub Kicinski wrote: > On Sat, 17 Oct 2020 14:50:11 +0200 Vincent Bernat wrote: >> Introduced in 0eeb075fad73, the "ignore_routes_with_linkdown" sysctl >> ignores a route whose interface is down. It is provided as a >> per-interface sysctl. However, while a "all" variant is exposed, it >> was a noop since it was never evaluated. We use the usual "or" logic >> for this kind of sysctls. > >> Without this patch, the two last lines would fail on H1 (the one using >> the "all" sysctl). With the patch, everything succeeds as expected. >> >> Also document the sysctl in `ip-sysctl.rst`. >> >> Fixes: 0eeb075fad73 ("net: ipv4 sysctl option to ignore routes when nexthop link is down") >> Signed-off-by: Vincent Bernat <vincent@bernat.ch> > > I'm not hearing any objections, but I have two questions: > - do you intend to merge it for 5.10 or 5.11? Because it has a fixes > tag, yet it's marked for net-next. If we put it in 5.10 it may get > pulled into stable immediately, knowing how things work lately. > - we have other sysctls that use IN_DEV_CONF_GET(), > e.g. "proxy_arp_pvlan" should those also be converted? > The inconsistency with 'all' has been a major pain. In this case, I think it makes sense. Blindly changing all of them I suspect will lead to trouble. It is something reviewers should keep an eye on as sysctl settings get added.
[ fix Andy's address ] On 10/17/20 6:50 AM, Vincent Bernat wrote: > Introduced in 0eeb075fad73, the "ignore_routes_with_linkdown" sysctl > ignores a route whose interface is down. It is provided as a > per-interface sysctl. However, while a "all" variant is exposed, it > was a noop since it was never evaluated. We use the usual "or" logic > for this kind of sysctls. > > Tested with: > > ip link add type veth # veth0 + veth1 > ip link add type veth # veth1 + veth2 > ip link set up dev veth0 > ip link set up dev veth1 # link-status paired with veth0 > ip link set up dev veth2 > ip link set up dev veth3 # link-status paired with veth2 > > # First available path > ip -4 addr add 203.0.113.${uts#H}/24 dev veth0 > ip -6 addr add 2001:db8:1::${uts#H}/64 dev veth0 > > # Second available path > ip -4 addr add 192.0.2.${uts#H}/24 dev veth2 > ip -6 addr add 2001:db8:2::${uts#H}/64 dev veth2 > > # More specific route through first path > ip -4 route add 198.51.100.0/25 via 203.0.113.254 # via veth0 > ip -6 route add 2001:db8:3::/56 via 2001:db8:1::ff # via veth0 > > # Less specific route through second path > ip -4 route add 198.51.100.0/24 via 192.0.2.254 # via veth2 > ip -6 route add 2001:db8:3::/48 via 2001:db8:2::ff # via veth2 > > # H1: enable on "all" > # H2: enable on "veth0" > for v in ipv4 ipv6; do > case $uts in > H1) > sysctl -qw net.${v}.conf.all.ignore_routes_with_linkdown=1 > ;; > H2) > sysctl -qw net.${v}.conf.veth0.ignore_routes_with_linkdown=1 > ;; > esac > done > > set -xe > # When veth0 is up, best route is through veth0 > ip -o route get 198.51.100.1 | grep -Fw veth0 > ip -o route get 2001:db8:3::1 | grep -Fw veth0 > > # When veth0 is down, best route should be through veth2 on H1/H2, > # but on veth0 on H2 > ip link set down dev veth1 # down veth0 > ip route show > [ $uts != H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth0 > [ $uts != H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth0 > [ $uts = H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth2 > [ $uts = H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth2 > > Without this patch, the two last lines would fail on H1 (the one using > the "all" sysctl). With the patch, everything succeeds as expected. > > Also document the sysctl in `ip-sysctl.rst`. > > Fixes: 0eeb075fad73 ("net: ipv4 sysctl option to ignore routes when nexthop link is down") > Signed-off-by: Vincent Bernat <vincent@bernat.ch> > --- > Documentation/networking/ip-sysctl.rst | 3 +++ > include/linux/inetdevice.h | 2 +- > 2 files changed, 4 insertions(+), 1 deletion(-) > > diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst > index 837d51f9e1fa..fb6e4658fd4f 100644 > --- a/Documentation/networking/ip-sysctl.rst > +++ b/Documentation/networking/ip-sysctl.rst > @@ -1552,6 +1552,9 @@ igmpv3_unsolicited_report_interval - INTEGER > > Default: 1000 (1 seconds) > > +ignore_routes_with_linkdown - BOOLEAN > + Ignore routes whose link is down when performing a FIB lookup. > + > promote_secondaries - BOOLEAN > When a primary IP address is removed from this interface > promote a corresponding secondary IP address instead of > diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h > index 3515ca64e638..3bbcddd22df8 100644 > --- a/include/linux/inetdevice.h > +++ b/include/linux/inetdevice.h > @@ -126,7 +126,7 @@ static inline void ipv4_devconf_setall(struct in_device *in_dev) > IN_DEV_ORCONF((in_dev), ACCEPT_REDIRECTS))) > > #define IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN(in_dev) \ > - IN_DEV_CONF_GET((in_dev), IGNORE_ROUTES_WITH_LINKDOWN) > + IN_DEV_ORCONF((in_dev), IGNORE_ROUTES_WITH_LINKDOWN) > > #define IN_DEV_ARPFILTER(in_dev) IN_DEV_ORCONF((in_dev), ARPFILTER) > #define IN_DEV_ARP_ACCEPT(in_dev) IN_DEV_ORCONF((in_dev), ARP_ACCEPT) >
On Mon, 19 Oct 2020 20:56:36 -0600 David Ahern wrote: > On 10/19/20 6:53 PM, Jakub Kicinski wrote: > > On Sat, 17 Oct 2020 14:50:11 +0200 Vincent Bernat wrote: > >> Introduced in 0eeb075fad73, the "ignore_routes_with_linkdown" sysctl > >> ignores a route whose interface is down. It is provided as a > >> per-interface sysctl. However, while a "all" variant is exposed, it > >> was a noop since it was never evaluated. We use the usual "or" logic > >> for this kind of sysctls. > > > >> Without this patch, the two last lines would fail on H1 (the one using > >> the "all" sysctl). With the patch, everything succeeds as expected. > >> > >> Also document the sysctl in `ip-sysctl.rst`. > >> > >> Fixes: 0eeb075fad73 ("net: ipv4 sysctl option to ignore routes when nexthop link is down") > >> Signed-off-by: Vincent Bernat <vincent@bernat.ch> > > > > I'm not hearing any objections, but I have two questions: > > - do you intend to merge it for 5.10 or 5.11? Because it has a fixes > > tag, yet it's marked for net-next. If we put it in 5.10 it may get > > pulled into stable immediately, knowing how things work lately. > > - we have other sysctls that use IN_DEV_CONF_GET(), > > e.g. "proxy_arp_pvlan" should those also be converted? > > The inconsistency with 'all' has been a major pain. In this case, I > think it makes sense. Blindly changing all of them I suspect will lead > to trouble. It is something reviewers should keep an eye on as sysctl > settings get added. Just saying.. if Vincent had the time to clean them all up _carefully_, it'd be less likely we'll see another one added through copy & paste :)
❦ 19 octobre 2020 17:53 -07, Jakub Kicinski: > I'm not hearing any objections, but I have two questions: > - do you intend to merge it for 5.10 or 5.11? Because it has a fixes > tag, yet it's marked for net-next. If we put it in 5.10 it may get > pulled into stable immediately, knowing how things work lately. I have never been super-familiar with net/net-next. I am always using net-next and let people sort it out. > - we have other sysctls that use IN_DEV_CONF_GET(), e.g. > "proxy_arp_pvlan" should those also be converted? I can check them and do that.
On Tue, 20 Oct 2020 08:20:43 +0200 Vincent Bernat wrote: > ❦ 19 octobre 2020 17:53 -07, Jakub Kicinski: > > I'm not hearing any objections, but I have two questions: > > - do you intend to merge it for 5.10 or 5.11? Because it has a fixes > > tag, yet it's marked for net-next. If we put it in 5.10 it may get > > pulled into stable immediately, knowing how things work lately. > > I have never been super-familiar with net/net-next. I am always using > net-next and let people sort it out. Since there is some breakage potential I'd prefer to merge this change into net-next and give it more testing. But net-next is currently closed (don't pay attention to the graphic at vger). If you wouldn't mind - please repost next week. You can split out and repost the documentation change separately if you want, that's a fix that can go into net right away. > > - we have other sysctls that use IN_DEV_CONF_GET(), e.g. > > "proxy_arp_pvlan" should those also be converted? > > I can check them and do that.
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index 837d51f9e1fa..fb6e4658fd4f 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -1552,6 +1552,9 @@ igmpv3_unsolicited_report_interval - INTEGER Default: 1000 (1 seconds) +ignore_routes_with_linkdown - BOOLEAN + Ignore routes whose link is down when performing a FIB lookup. + promote_secondaries - BOOLEAN When a primary IP address is removed from this interface promote a corresponding secondary IP address instead of diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h index 3515ca64e638..3bbcddd22df8 100644 --- a/include/linux/inetdevice.h +++ b/include/linux/inetdevice.h @@ -126,7 +126,7 @@ static inline void ipv4_devconf_setall(struct in_device *in_dev) IN_DEV_ORCONF((in_dev), ACCEPT_REDIRECTS))) #define IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN(in_dev) \ - IN_DEV_CONF_GET((in_dev), IGNORE_ROUTES_WITH_LINKDOWN) + IN_DEV_ORCONF((in_dev), IGNORE_ROUTES_WITH_LINKDOWN) #define IN_DEV_ARPFILTER(in_dev) IN_DEV_ORCONF((in_dev), ARPFILTER) #define IN_DEV_ARP_ACCEPT(in_dev) IN_DEV_ORCONF((in_dev), ARP_ACCEPT)
Introduced in 0eeb075fad73, the "ignore_routes_with_linkdown" sysctl ignores a route whose interface is down. It is provided as a per-interface sysctl. However, while a "all" variant is exposed, it was a noop since it was never evaluated. We use the usual "or" logic for this kind of sysctls. Tested with: ip link add type veth # veth0 + veth1 ip link add type veth # veth1 + veth2 ip link set up dev veth0 ip link set up dev veth1 # link-status paired with veth0 ip link set up dev veth2 ip link set up dev veth3 # link-status paired with veth2 # First available path ip -4 addr add 203.0.113.${uts#H}/24 dev veth0 ip -6 addr add 2001:db8:1::${uts#H}/64 dev veth0 # Second available path ip -4 addr add 192.0.2.${uts#H}/24 dev veth2 ip -6 addr add 2001:db8:2::${uts#H}/64 dev veth2 # More specific route through first path ip -4 route add 198.51.100.0/25 via 203.0.113.254 # via veth0 ip -6 route add 2001:db8:3::/56 via 2001:db8:1::ff # via veth0 # Less specific route through second path ip -4 route add 198.51.100.0/24 via 192.0.2.254 # via veth2 ip -6 route add 2001:db8:3::/48 via 2001:db8:2::ff # via veth2 # H1: enable on "all" # H2: enable on "veth0" for v in ipv4 ipv6; do case $uts in H1) sysctl -qw net.${v}.conf.all.ignore_routes_with_linkdown=1 ;; H2) sysctl -qw net.${v}.conf.veth0.ignore_routes_with_linkdown=1 ;; esac done set -xe # When veth0 is up, best route is through veth0 ip -o route get 198.51.100.1 | grep -Fw veth0 ip -o route get 2001:db8:3::1 | grep -Fw veth0 # When veth0 is down, best route should be through veth2 on H1/H2, # but on veth0 on H2 ip link set down dev veth1 # down veth0 ip route show [ $uts != H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth0 [ $uts != H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth0 [ $uts = H3 ] || ip -o route get 198.51.100.1 | grep -Fw veth2 [ $uts = H3 ] || ip -o route get 2001:db8:3::1 | grep -Fw veth2 Without this patch, the two last lines would fail on H1 (the one using the "all" sysctl). With the patch, everything succeeds as expected. Also document the sysctl in `ip-sysctl.rst`. Fixes: 0eeb075fad73 ("net: ipv4 sysctl option to ignore routes when nexthop link is down") Signed-off-by: Vincent Bernat <vincent@bernat.ch> --- Documentation/networking/ip-sysctl.rst | 3 +++ include/linux/inetdevice.h | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-)