Message ID | 20210826002540.11306-1-rdna@fb.com |
---|---|
State | New |
Headers | show |
Series | [net] rtnetlink: Return correct error on changing device netns | expand |
Hello: This patch was applied to netdev/net.git (refs/heads/master): On Wed, 25 Aug 2021 17:25:40 -0700 you wrote: > Currently when device is moved between network namespaces using > RTM_NEWLINK message type and one of netns attributes (FLA_NET_NS_PID, > IFLA_NET_NS_FD, IFLA_TARGET_NETNSID) but w/o specifying IFLA_IFNAME, and > target namespace already has device with same name, userspace will get > EINVAL what is confusing and makes debugging harder. > > Fix it so that userspace gets more appropriate EEXIST instead what makes > debugging much easier. > > [...] Here is the summary with links: - [net] rtnetlink: Return correct error on changing device netns https://git.kernel.org/netdev/net/c/96a6b93b6988 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
Andrey Ignatov <rdna@fb.com> writes: > Currently when device is moved between network namespaces using > RTM_NEWLINK message type and one of netns attributes (FLA_NET_NS_PID, > IFLA_NET_NS_FD, IFLA_TARGET_NETNSID) but w/o specifying IFLA_IFNAME, and > target namespace already has device with same name, userspace will get > EINVAL what is confusing and makes debugging harder. > > Fix it so that userspace gets more appropriate EEXIST instead what makes > debugging much easier. > > Before: > > # ./ifname.sh > + ip netns add ns0 > + ip netns exec ns0 ip link add l0 type dummy > + ip netns exec ns0 ip link show l0 > 8: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 > link/ether 66:90:b5:d5:78:69 brd ff:ff:ff:ff:ff:ff > + ip link add l0 type dummy > + ip link show l0 > 10: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 > link/ether 6e:c6:1f:15:20:8d brd ff:ff:ff:ff:ff:ff > + ip link set l0 netns ns0 > RTNETLINK answers: Invalid argument > > After: > > # ./ifname.sh > + ip netns add ns0 > + ip netns exec ns0 ip link add l0 type dummy > + ip netns exec ns0 ip link show l0 > 8: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 > link/ether 1e:4a:72:e3:e3:8f brd ff:ff:ff:ff:ff:ff > + ip link add l0 type dummy > + ip link show l0 > 10: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 > link/ether f2:fc:fe:2b:7d:a6 brd ff:ff:ff:ff:ff:ff > + ip link set l0 netns ns0 > RTNETLINK answers: File exists > > The problem is that do_setlink() passes its `char *ifname` argument, > that it gets from a caller, to __dev_change_net_namespace() as is (as > `const char *pat`), but semantics of ifname and pat can be different. > > For example, __rtnl_newlink() does this: > > net/core/rtnetlink.c > 3270 char ifname[IFNAMSIZ]; > ... > 3286 if (tb[IFLA_IFNAME]) > 3287 nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ); > 3288 else > 3289 ifname[0] = '\0'; > ... > 3364 if (dev) { > ... > 3394 return do_setlink(skb, dev, ifm, extack, tb, ifname, status); > 3395 } > > , i.e. do_setlink() gets ifname pointer that is always valid no matter > if user specified IFLA_IFNAME or not and then do_setlink() passes this > ifname pointer as is to __dev_change_net_namespace() as pat argument. > > But the pat (pattern) in __dev_change_net_namespace() is used as: > > net/core/dev.c > 11198 err = -EEXIST; > 11199 if (__dev_get_by_name(net, dev->name)) { > 11200 /* We get here if we can't use the current device name */ > 11201 if (!pat) > 11202 goto out; > 11203 err = dev_get_valid_name(net, dev, pat); > 11204 if (err < 0) > 11205 goto out; > 11206 } > > As the result the `goto out` path on line 11202 is neven taken and > instead of returning EEXIST defined on line 11198, > __dev_change_net_namespace() returns an error from dev_get_valid_name() > and this, in turn, will be EINVAL for ifname[0] = '\0' set earlier. > > Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.") > Signed-off-by: Andrey Ignatov <rdna@fb.com> The analysis and the fix looks good to me. The code calling do_setlink is inconsistent. One caller of do_setlink passes a NULL to indicate not name has been specified. Other callers pass a string of zero bytes to indicate no name has been specified. I wonder if we might want to fix the callers to uniformly pass NULL, instead of a string of length zero. There is a slight chance this will trigger a regression somewhere because we are changing the error code but this change looks easy enough to revert in the unlikely event this breaks existing userspace. Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> > --- > net/core/rtnetlink.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c > index f6af3e74fc44..662eb1c37f47 100644 > --- a/net/core/rtnetlink.c > +++ b/net/core/rtnetlink.c > @@ -2608,6 +2608,7 @@ static int do_setlink(const struct sk_buff *skb, > return err; > > if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD] || tb[IFLA_TARGET_NETNSID]) { > + const char *pat = ifname && ifname[0] ? ifname : NULL; > struct net *net; > int new_ifindex; > > @@ -2623,7 +2624,7 @@ static int do_setlink(const struct sk_buff *skb, > else > new_ifindex = 0; > > - err = __dev_change_net_namespace(dev, net, ifname, new_ifindex); > + err = __dev_change_net_namespace(dev, net, pat, new_ifindex); > put_net(net); > if (err) > goto errout;
On Thu, 26 Aug 2021 11:15:22 -0500 ebiederm@xmission.com (Eric W. Biederman) wrote: > The analysis and the fix looks good to me. > > The code calling do_setlink is inconsistent. One caller of do_setlink > passes a NULL to indicate not name has been specified. Other callers > pass a string of zero bytes to indicate no name has been specified. > > I wonder if we might want to fix the callers to uniformly pass NULL, > instead of a string of length zero. > > There is a slight chance this will trigger a regression somewhere > because we are changing the error code but this change looks easy enough > to revert in the unlikely event this breaks existing userspace. > > Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> This patch causes a new warning from Coverity: ________________________________________________________________________________________________________ *** CID 1490867: Null pointer dereferences (FORWARD_NULL) /net/core/rtnetlink.c: 2701 in do_setlink() 2695 2696 /* 2697 * Interface selected by interface index but interface 2698 * name provided implies that a name change has been 2699 * requested. 2700 */ >>> CID 1490867: Null pointer dereferences (FORWARD_NULL) >>> Dereferencing null pointer "ifname". 2701 if (ifm->ifi_index > 0 && ifname[0]) { 2702 err = dev_change_name(dev, ifname); 2703 if (err < 0) 2704 goto errout; 2705 status |= DO_SETLINK_MODIFIED; 2706 Originally, the code was not accepting ifname == NULL and would crash. Somewhere along the way some new callers seem to have gotten confused. What code is call do_setlink() with NULL as ifname, that should be fixed.
On Mon, 30 Aug 2021 07:59:48 -0700 Stephen Hemminger wrote: > On Thu, 26 Aug 2021 11:15:22 -0500 > ebiederm@xmission.com (Eric W. Biederman) wrote: > > > The analysis and the fix looks good to me. > > > > The code calling do_setlink is inconsistent. One caller of do_setlink > > passes a NULL to indicate not name has been specified. Other callers > > pass a string of zero bytes to indicate no name has been specified. > > > > I wonder if we might want to fix the callers to uniformly pass NULL, > > instead of a string of length zero. > > > > There is a slight chance this will trigger a regression somewhere > > because we are changing the error code but this change looks easy enough > > to revert in the unlikely event this breaks existing userspace. > > > > Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> > > This patch causes a new warning from Coverity: > ________________________________________________________________________________________________________ > *** CID 1490867: Null pointer dereferences (FORWARD_NULL) > /net/core/rtnetlink.c: 2701 in do_setlink() > 2695 > 2696 /* > 2697 * Interface selected by interface index but interface > 2698 * name provided implies that a name change has been > 2699 * requested. > 2700 */ > [...] > 2701 if (ifm->ifi_index > 0 && ifname[0]) { > 2702 err = dev_change_name(dev, ifname); > 2703 if (err < 0) > 2704 goto errout; > 2705 status |= DO_SETLINK_MODIFIED; > 2706 > > Originally, the code was not accepting ifname == NULL and would > crash. Somewhere along the way some new callers seem to have gotten > confused. > > What code is call do_setlink() with NULL as ifname, that should be fixed. It's a false positive. There's only one caller with ifname=NULL: static int rtnl_group_changelink(const struct sk_buff *skb, ... err = do_setlink(skb, dev, ifm, extack, tb, NULL, 0); if (err < 0) return err; Which has one caller, under this condition: if (ifm->ifi_index == 0 && tb[IFLA_GROUP]) return rtnl_group_changelink(skb, net, ... condition which excludes evaluating the check in question: if (ifm->ifi_index > 0 && ifname[0]) { err = dev_change_name(dev, ifname); Proving ifm->ifi_index has to be 0 for ifname to be NULL.
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index f6af3e74fc44..662eb1c37f47 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -2608,6 +2608,7 @@ static int do_setlink(const struct sk_buff *skb, return err; if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD] || tb[IFLA_TARGET_NETNSID]) { + const char *pat = ifname && ifname[0] ? ifname : NULL; struct net *net; int new_ifindex; @@ -2623,7 +2624,7 @@ static int do_setlink(const struct sk_buff *skb, else new_ifindex = 0; - err = __dev_change_net_namespace(dev, net, ifname, new_ifindex); + err = __dev_change_net_namespace(dev, net, pat, new_ifindex); put_net(net); if (err) goto errout;
Currently when device is moved between network namespaces using RTM_NEWLINK message type and one of netns attributes (FLA_NET_NS_PID, IFLA_NET_NS_FD, IFLA_TARGET_NETNSID) but w/o specifying IFLA_IFNAME, and target namespace already has device with same name, userspace will get EINVAL what is confusing and makes debugging harder. Fix it so that userspace gets more appropriate EEXIST instead what makes debugging much easier. Before: # ./ifname.sh + ip netns add ns0 + ip netns exec ns0 ip link add l0 type dummy + ip netns exec ns0 ip link show l0 8: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 66:90:b5:d5:78:69 brd ff:ff:ff:ff:ff:ff + ip link add l0 type dummy + ip link show l0 10: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 6e:c6:1f:15:20:8d brd ff:ff:ff:ff:ff:ff + ip link set l0 netns ns0 RTNETLINK answers: Invalid argument After: # ./ifname.sh + ip netns add ns0 + ip netns exec ns0 ip link add l0 type dummy + ip netns exec ns0 ip link show l0 8: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 1e:4a:72:e3:e3:8f brd ff:ff:ff:ff:ff:ff + ip link add l0 type dummy + ip link show l0 10: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether f2:fc:fe:2b:7d:a6 brd ff:ff:ff:ff:ff:ff + ip link set l0 netns ns0 RTNETLINK answers: File exists The problem is that do_setlink() passes its `char *ifname` argument, that it gets from a caller, to __dev_change_net_namespace() as is (as `const char *pat`), but semantics of ifname and pat can be different. For example, __rtnl_newlink() does this: net/core/rtnetlink.c 3270 char ifname[IFNAMSIZ]; ... 3286 if (tb[IFLA_IFNAME]) 3287 nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ); 3288 else 3289 ifname[0] = '\0'; ... 3364 if (dev) { ... 3394 return do_setlink(skb, dev, ifm, extack, tb, ifname, status); 3395 } , i.e. do_setlink() gets ifname pointer that is always valid no matter if user specified IFLA_IFNAME or not and then do_setlink() passes this ifname pointer as is to __dev_change_net_namespace() as pat argument. But the pat (pattern) in __dev_change_net_namespace() is used as: net/core/dev.c 11198 err = -EEXIST; 11199 if (__dev_get_by_name(net, dev->name)) { 11200 /* We get here if we can't use the current device name */ 11201 if (!pat) 11202 goto out; 11203 err = dev_get_valid_name(net, dev, pat); 11204 if (err < 0) 11205 goto out; 11206 } As the result the `goto out` path on line 11202 is neven taken and instead of returning EEXIST defined on line 11198, __dev_change_net_namespace() returns an error from dev_get_valid_name() and this, in turn, will be EINVAL for ifname[0] = '\0' set earlier. Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.") Signed-off-by: Andrey Ignatov <rdna@fb.com> --- net/core/rtnetlink.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)