Message ID | 20210801231730.7493-1-vladimir.oltean@nxp.com |
---|---|
State | New |
Headers | show |
Series | [net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry | expand |
On 02/08/2021 02:17, Vladimir Oltean wrote: > Currently it is possible to add broken extern_learn FDB entries to the > bridge in two ways: > > 1. Entries pointing towards the bridge device that are not local/permanent: > > ip link add br0 type bridge > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static > > 2. Entries pointing towards the bridge device or towards a port that > are marked as local/permanent, however the bridge does not process the > 'permanent' bit in any way, therefore they are recorded as though they > aren't permanent: > > ip link add br0 type bridge > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent > > Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the > same as entries towards the bridge"), these incorrect FDB entries can > even trigger NULL pointer dereferences inside the kernel. > > This is because that commit made the assumption that all FDB entries > that are not local/permanent have a valid destination port. For context, > local / permanent FDB entries either have fdb->dst == NULL, and these > point towards the bridge device and are therefore local and not to be > used for forwarding, or have fdb->dst == a net_bridge_port structure > (but are to be treated in the same way, i.e. not for forwarding). > > That assumption _is_ correct as long as things are working correctly in > the bridge driver, i.e. we cannot logically have fdb->dst == NULL under > any circumstance for FDB entries that are not local. However, the > extern_learn code path where FDB entries are managed by a user space > controller show that it is possible for the bridge kernel driver to > misinterpret the NUD flags of an entry transmitted by user space, and > end up having fdb->dst == NULL while not being a local entry. This is > invalid and should be rejected. > > Before, the two commands listed above both crashed the kernel in this > check from br_switchdev_fdb_notify: > Not before 52e4bec15546 though, the check used to be: struct net_device *dev = dst ? dst->dev : br->dev; which wouldn't crash. So the fixes tag below is incorrect, you could add a weird extern learn entry, but it wouldn't crash the kernel. > struct net_device *dev = info.is_local ? br->dev : dst->dev; > > info.is_local == false, dst == NULL. > > After this patch, the invalid entry added by the first command is > rejected: > > ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0 > Error: bridge: FDB entry towards bridge must be permanent. > > and the valid entry added by the second command is properly treated as a > local address and does not crash br_switchdev_fdb_notify anymore: > > ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0 > > Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space") Please change fixes tag to 52e4bec15546. > Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> > --- [snip] Thanks, Nik
On Mon, Aug 02, 2021 at 10:42:41AM +0300, Nikolay Aleksandrov wrote: > On 02/08/2021 02:17, Vladimir Oltean wrote: > > Currently it is possible to add broken extern_learn FDB entries to the > > bridge in two ways: > > > > 1. Entries pointing towards the bridge device that are not local/permanent: > > > > ip link add br0 type bridge > > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static > > > > 2. Entries pointing towards the bridge device or towards a port that > > are marked as local/permanent, however the bridge does not process the > > 'permanent' bit in any way, therefore they are recorded as though they > > aren't permanent: > > > > ip link add br0 type bridge > > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent > > > > Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the > > same as entries towards the bridge"), these incorrect FDB entries can > > even trigger NULL pointer dereferences inside the kernel. > > > > This is because that commit made the assumption that all FDB entries > > that are not local/permanent have a valid destination port. For context, > > local / permanent FDB entries either have fdb->dst == NULL, and these > > point towards the bridge device and are therefore local and not to be > > used for forwarding, or have fdb->dst == a net_bridge_port structure > > (but are to be treated in the same way, i.e. not for forwarding). > > > > That assumption _is_ correct as long as things are working correctly in > > the bridge driver, i.e. we cannot logically have fdb->dst == NULL under > > any circumstance for FDB entries that are not local. However, the > > extern_learn code path where FDB entries are managed by a user space > > controller show that it is possible for the bridge kernel driver to > > misinterpret the NUD flags of an entry transmitted by user space, and > > end up having fdb->dst == NULL while not being a local entry. This is > > invalid and should be rejected. > > > > Before, the two commands listed above both crashed the kernel in this > > check from br_switchdev_fdb_notify: > > > > Not before 52e4bec15546 though, the check used to be: > struct net_device *dev = dst ? dst->dev : br->dev; "Before", as in "before this patch, on net-next/linux-next". > which wouldn't crash. So the fixes tag below is incorrect, you could > add a weird extern learn entry, but it wouldn't crash the kernel. :) Is our only criterion whether a patch is buggy or not that it causes a NULL pointer dereference inside the kernel? I thought I'd mention the interaction with the net-next work for the sake of being thorough, and because this is how the syzcaller caught it by coincidence, but "kernel does not treat an FDB entry with the 'permanent' flag as permanent" is enough of a reason to submit this as a bug fix for the commit that I pointed to. Granted, I don't have any use case with extern_learn, so probably your user space programs simply don't add permanent FDB entries, but as this is the kernel UAPI, it should nevertheless do whatever the user space is allowed to say. For a permanent FDB entry, that behavior is to stop forwarding for that MAC DA, and that behavior obviously was not taking place even before any change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.
On 02/08/2021 12:20, Vladimir Oltean wrote: > On Mon, Aug 02, 2021 at 10:42:41AM +0300, Nikolay Aleksandrov wrote: >> On 02/08/2021 02:17, Vladimir Oltean wrote: >>> Currently it is possible to add broken extern_learn FDB entries to the >>> bridge in two ways: >>> >>> 1. Entries pointing towards the bridge device that are not local/permanent: >>> >>> ip link add br0 type bridge >>> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static >>> >>> 2. Entries pointing towards the bridge device or towards a port that >>> are marked as local/permanent, however the bridge does not process the >>> 'permanent' bit in any way, therefore they are recorded as though they >>> aren't permanent: >>> >>> ip link add br0 type bridge >>> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent >>> >>> Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the >>> same as entries towards the bridge"), these incorrect FDB entries can >>> even trigger NULL pointer dereferences inside the kernel. >>> >>> This is because that commit made the assumption that all FDB entries >>> that are not local/permanent have a valid destination port. For context, >>> local / permanent FDB entries either have fdb->dst == NULL, and these >>> point towards the bridge device and are therefore local and not to be >>> used for forwarding, or have fdb->dst == a net_bridge_port structure >>> (but are to be treated in the same way, i.e. not for forwarding). >>> >>> That assumption _is_ correct as long as things are working correctly in >>> the bridge driver, i.e. we cannot logically have fdb->dst == NULL under >>> any circumstance for FDB entries that are not local. However, the >>> extern_learn code path where FDB entries are managed by a user space >>> controller show that it is possible for the bridge kernel driver to >>> misinterpret the NUD flags of an entry transmitted by user space, and >>> end up having fdb->dst == NULL while not being a local entry. This is >>> invalid and should be rejected. >>> >>> Before, the two commands listed above both crashed the kernel in this >>> check from br_switchdev_fdb_notify: >>> >> >> Not before 52e4bec15546 though, the check used to be: >> struct net_device *dev = dst ? dst->dev : br->dev; > > "Before", as in "before this patch, on net-next/linux-next". > We still need that check, more below. >> which wouldn't crash. So the fixes tag below is incorrect, you could >> add a weird extern learn entry, but it wouldn't crash the kernel. > > :) > > Is our only criterion whether a patch is buggy or not that it causes a > NULL pointer dereference inside the kernel? > > I thought I'd mention the interaction with the net-next work for the > sake of being thorough, and because this is how the syzcaller caught it > by coincidence, but "kernel does not treat an FDB entry with the > 'permanent' flag as permanent" is enough of a reason to submit this as a Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set. The actual bug is that it points to the bridge device, e.g. null dst without the flag. > bug fix for the commit that I pointed to. Granted, I don't have any use > case with extern_learn, so probably your user space programs simply > don't add permanent FDB entries, but as this is the kernel UAPI, it > should nevertheless do whatever the user space is allowed to say. For a > permanent FDB entry, that behavior is to stop forwarding for that MAC > DA, and that behavior obviously was not taking place even before any > change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n. > Actually I believe there is still a bug in 52e4bec15546 even with this fix. The flag can change after the dst has been read in br_switchdev_fdb_notify() so in theory you could still do a null pointer dereference. fdb_notify() can be called from a few places without locking. The code shouldn't dereference the dst based on the flag. I'm okay with this change due to the null dst without permanent flag fix, but it doesn't fully fix the null pointer dereference.
On Mon, Aug 02, 2021 at 12:42:17PM +0300, Nikolay Aleksandrov wrote: > >>> Before, the two commands listed above both crashed the kernel in this > >>> check from br_switchdev_fdb_notify: > >>> > >> > >> Not before 52e4bec15546 though, the check used to be: > >> struct net_device *dev = dst ? dst->dev : br->dev; > > > > "Before", as in "before this patch, on net-next/linux-next". > > > > We still need that check, more below. > > >> which wouldn't crash. So the fixes tag below is incorrect, you could > >> add a weird extern learn entry, but it wouldn't crash the kernel. > > > > :) > > > > Is our only criterion whether a patch is buggy or not that it causes a > > NULL pointer dereference inside the kernel? > > > > I thought I'd mention the interaction with the net-next work for the > > sake of being thorough, and because this is how the syzcaller caught it > > by coincidence, but "kernel does not treat an FDB entry with the > > 'permanent' flag as permanent" is enough of a reason to submit this as a > > Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set. And that is the bug I am addressing here, no? > The actual bug is that it points to the bridge device, e.g. null dst without the flag. > > > bug fix for the commit that I pointed to. Granted, I don't have any use > > case with extern_learn, so probably your user space programs simply > > don't add permanent FDB entries, but as this is the kernel UAPI, it > > should nevertheless do whatever the user space is allowed to say. For a > > permanent FDB entry, that behavior is to stop forwarding for that MAC > > DA, and that behavior obviously was not taking place even before any > > change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n. > > > > Actually I believe there is still a bug in 52e4bec15546 even with this fix. > The flag can change after the dst has been read in br_switchdev_fdb_notify() > so in theory you could still do a null pointer dereference. fdb_notify() > can be called from a few places without locking. The code shouldn't dereference > the dst based on the flag. Are you thinking of a specific code path that triggers a race between (a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and (b) a reader side catching that fdb exactly in between the above 2 statements, through fdb_notify or otherwise (br_fdb_replay)? Because I don't see any. Plus, I am a bit nervous about protecting against theoretical/unproven races in a way that masks real bugs, as we would be doing if I add an extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL. > > I'm okay with this change due to the null dst without permanent flag fix, but > it doesn't fully fix the null pointer dereference. So is there any change that I should make to this patch?
On 02/08/2021 13:52, Vladimir Oltean wrote: > On Mon, Aug 02, 2021 at 12:42:17PM +0300, Nikolay Aleksandrov wrote: >>>>> Before, the two commands listed above both crashed the kernel in this >>>>> check from br_switchdev_fdb_notify: >>>>> >>>> >>>> Not before 52e4bec15546 though, the check used to be: >>>> struct net_device *dev = dst ? dst->dev : br->dev; >>> >>> "Before", as in "before this patch, on net-next/linux-next". >>> >> >> We still need that check, more below. >> >>>> which wouldn't crash. So the fixes tag below is incorrect, you could >>>> add a weird extern learn entry, but it wouldn't crash the kernel. >>> >>> :) >>> >>> Is our only criterion whether a patch is buggy or not that it causes a >>> NULL pointer dereference inside the kernel? >>> >>> I thought I'd mention the interaction with the net-next work for the >>> sake of being thorough, and because this is how the syzcaller caught it >>> by coincidence, but "kernel does not treat an FDB entry with the >>> 'permanent' flag as permanent" is enough of a reason to submit this as a >> >> Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set. > > And that is the bug I am addressing here, no? > Obviously I pointed to the statement above, I don't get the question here. >> The actual bug is that it points to the bridge device, e.g. null dst without the flag. >> >>> bug fix for the commit that I pointed to. Granted, I don't have any use >>> case with extern_learn, so probably your user space programs simply >>> don't add permanent FDB entries, but as this is the kernel UAPI, it >>> should nevertheless do whatever the user space is allowed to say. For a >>> permanent FDB entry, that behavior is to stop forwarding for that MAC >>> DA, and that behavior obviously was not taking place even before any >>> change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n. >>> >> >> Actually I believe there is still a bug in 52e4bec15546 even with this fix. >> The flag can change after the dst has been read in br_switchdev_fdb_notify() >> so in theory you could still do a null pointer dereference. fdb_notify() >> can be called from a few places without locking. The code shouldn't dereference >> the dst based on the flag. > > Are you thinking of a specific code path that triggers a race between > (a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then > set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and Visible order is not guaranteed, there are no barriers neither at writer nor reader sides, especially when used without locking. So we cannot make any assumptions about the order visibility of these writes. > (b) a reader side catching that fdb exactly in between the above 2 > statements, through fdb_notify or otherwise (br_fdb_replay)? > > Because I don't see any. > > Plus, I am a bit nervous about protecting against theoretical/unproven > races in a way that masks real bugs, as we would be doing if I add an > extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the > case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL. > The bits are _not_ visible atomically with the setting of ->dst. It is obvious you must not dereference anything based on them, they are only indications when used outside of locked regions and code must be able to deal with inconsistencies as that is implied by the way they're used. It is a clear and obvious bug dereferencing based on a bit that can change in parallel without any memory ordering guarantees. You are not "masking" anything, but fixing what is currently buggy use of fdb bits. As I already said - this doesn't fix the null deref bug completely, in fact it fixes a different inconsistency, before at worst you'd get blackholed traffic for such entries now you get a null pointer dereference. >> >> I'm okay with this change due to the null dst without permanent flag fix, but >> it doesn't fully fix the null pointer dereference. > > So is there any change that I should make to this patch? > No, this patch is fine as-is.
On Mon, Aug 02, 2021 at 02:02:36PM +0300, Nikolay Aleksandrov wrote: > >> Actually I believe there is still a bug in 52e4bec15546 even with this fix. > >> The flag can change after the dst has been read in br_switchdev_fdb_notify() > >> so in theory you could still do a null pointer dereference. fdb_notify() > >> can be called from a few places without locking. The code shouldn't dereference > >> the dst based on the flag. > > > > Are you thinking of a specific code path that triggers a race between > > (a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then > > set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and > > Visible order is not guaranteed, there are no barriers neither at writer nor reader > sides, especially when used without locking. So we cannot make any assumptions > about the order visibility of these writes. > > > (b) a reader side catching that fdb exactly in between the above 2 > > statements, through fdb_notify or otherwise (br_fdb_replay)? > > > > Because I don't see any. > > > > Plus, I am a bit nervous about protecting against theoretical/unproven > > races in a way that masks real bugs, as we would be doing if I add an > > extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the > > case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL. > > > > The bits are _not_ visible atomically with the setting of ->dst. It is obvious > you must not dereference anything based on them, they are only indications when used > outside of locked regions and code must be able to deal with inconsistencies as that > is implied by the way they're used. It is a clear and obvious bug dereferencing based > on a bit that can change in parallel without any memory ordering guarantees. Ok, I will send a separate patch for that. > You are not "masking" anything, but fixing what is currently buggy use of fdb bits. I am "masking" in the sense that the bug I am fixing here was not obvious to me until it triggered a NPD. That would stop happening with the patch I'm about to send, but maybe there are still bridge UAPI functions that do not validate the 'permanent' flag from FDB entries. > As I already said - this doesn't fix the null deref bug completely, in fact it fixes a different > inconsistency, before at worst you'd get blackholed traffic for such entries now > you get a null pointer dereference.
On 02/08/2021 02:17, Vladimir Oltean wrote: > Currently it is possible to add broken extern_learn FDB entries to the > bridge in two ways: > > 1. Entries pointing towards the bridge device that are not local/permanent: > > ip link add br0 type bridge > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static > > 2. Entries pointing towards the bridge device or towards a port that > are marked as local/permanent, however the bridge does not process the > 'permanent' bit in any way, therefore they are recorded as though they > aren't permanent: > > ip link add br0 type bridge > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent > > Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the > same as entries towards the bridge"), these incorrect FDB entries can > even trigger NULL pointer dereferences inside the kernel. > > This is because that commit made the assumption that all FDB entries > that are not local/permanent have a valid destination port. For context, > local / permanent FDB entries either have fdb->dst == NULL, and these > point towards the bridge device and are therefore local and not to be > used for forwarding, or have fdb->dst == a net_bridge_port structure > (but are to be treated in the same way, i.e. not for forwarding). > > That assumption _is_ correct as long as things are working correctly in > the bridge driver, i.e. we cannot logically have fdb->dst == NULL under > any circumstance for FDB entries that are not local. However, the > extern_learn code path where FDB entries are managed by a user space > controller show that it is possible for the bridge kernel driver to > misinterpret the NUD flags of an entry transmitted by user space, and > end up having fdb->dst == NULL while not being a local entry. This is > invalid and should be rejected. > > Before, the two commands listed above both crashed the kernel in this > check from br_switchdev_fdb_notify: > > struct net_device *dev = info.is_local ? br->dev : dst->dev; > > info.is_local == false, dst == NULL. > > After this patch, the invalid entry added by the first command is > rejected: > > ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0 > Error: bridge: FDB entry towards bridge must be permanent. > > and the valid entry added by the second command is properly treated as a > local address and does not crash br_switchdev_fdb_notify anymore: > > ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0 > > Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space") > Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> > --- Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Hello: This patch was applied to netdev/net.git (refs/heads/master): On Mon, 2 Aug 2021 02:17:30 +0300 you wrote: > Currently it is possible to add broken extern_learn FDB entries to the > bridge in two ways: > > 1. Entries pointing towards the bridge device that are not local/permanent: > > ip link add br0 type bridge > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static > > [...] Here is the summary with links: - [net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry https://git.kernel.org/netdev/net/c/0541a6293298 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html
On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote: > diff --git a/net/bridge/br.c b/net/bridge/br.c > index ef743f94254d..bbab9984f24e 100644 > --- a/net/bridge/br.c > +++ b/net/bridge/br.c > @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused, > case SWITCHDEV_FDB_ADD_TO_BRIDGE: > fdb_info = ptr; > err = br_fdb_external_learn_add(br, p, fdb_info->addr, > - fdb_info->vid, false); > + fdb_info->vid, > + fdb_info->is_local, false); When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge: switchdev: include local flag in FDB notifications") it was not initialized in all the call sites that emit 'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage. > if (err) { > err = notifier_from_errno(err); > break; [...] > @@ -1281,6 +1292,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, > > if (swdev_notify) > flags |= BIT(BR_FDB_ADDED_BY_USER); > + > + if (is_local) > + flags |= BIT(BR_FDB_LOCAL); I have at least once selftest where I forgot the 'static' keyword: bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1 This patch breaks the test when run against both the kernel and hardware data paths. I don't mind patching these tests, but we might get more reports in the future. Nik, what do you think? > + > fdb = fdb_create(br, p, addr, vid, flags); > if (!fdb) { > err = -ENOMEM; > @@ -1307,6 +1322,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, > if (swdev_notify) > set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); > > + if (is_local) > + set_bit(BR_FDB_LOCAL, &fdb->flags); > + > if (modified) > fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify); > } > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h > index 2b48b204205e..aa64d8d63ca3 100644 > --- a/net/bridge/br_private.h > +++ b/net/bridge/br_private.h > @@ -711,7 +711,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev, > int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); > void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); > int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, > - const unsigned char *addr, u16 vid, > + const unsigned char *addr, u16 vid, bool is_local, > bool swdev_notify); > int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, > const unsigned char *addr, u16 vid, > -- > 2.25.1 >
On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote: > On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote: > > diff --git a/net/bridge/br.c b/net/bridge/br.c > > index ef743f94254d..bbab9984f24e 100644 > > --- a/net/bridge/br.c > > +++ b/net/bridge/br.c > > @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused, > > case SWITCHDEV_FDB_ADD_TO_BRIDGE: > > fdb_info = ptr; > > err = br_fdb_external_learn_add(br, p, fdb_info->addr, > > - fdb_info->vid, false); > > + fdb_info->vid, > > + fdb_info->is_local, false); > > When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge: > switchdev: include local flag in FDB notifications") it was not > initialized in all the call sites that emit > 'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage. Thanks for the report, I'll send a patch which adds a: memset(&info, 0, sizeof(info)); to all SWITCHDEV_FDB_*_TO_BRIDGE call sites.
On 09/08/2021 15:16, Ido Schimmel wrote: > On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote: >> diff --git a/net/bridge/br.c b/net/bridge/br.c >> index ef743f94254d..bbab9984f24e 100644 >> --- a/net/bridge/br.c >> +++ b/net/bridge/br.c >> @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused, >> case SWITCHDEV_FDB_ADD_TO_BRIDGE: >> fdb_info = ptr; >> err = br_fdb_external_learn_add(br, p, fdb_info->addr, >> - fdb_info->vid, false); >> + fdb_info->vid, >> + fdb_info->is_local, false); > > When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge: > switchdev: include local flag in FDB notifications") it was not > initialized in all the call sites that emit > 'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage. > nice catch >> if (err) { >> err = notifier_from_errno(err); >> break; > > [...] > >> @@ -1281,6 +1292,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, >> >> if (swdev_notify) >> flags |= BIT(BR_FDB_ADDED_BY_USER); >> + >> + if (is_local) >> + flags |= BIT(BR_FDB_LOCAL); > > I have at least once selftest where I forgot the 'static' keyword: > > bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1 > > This patch breaks the test when run against both the kernel and hardware > data paths. I don't mind patching these tests, but we might get more > reports in the future. > > Nik, what do you think? > Ahh, that's unfortunate. The patch's assumption is correct that we must not have fdb->dst == NULL and the dst to be non-local (i.e. without BR_FDB_LOCAL). Since all solutions break user-space in a different way and since this patch also already broke it by the check for !p && !NUD_PERMANENT in __br_fdb_add() which was allowed before that, I think the best course of action is to ignore NUD_PERMANENT in __br_fdb_add() for extern_learn case and always set BR_FDB_LOCAL in br_fdb_external_learn_add() when !p. That would allow all prior calls to work and would remove the dst==NULL without BR_FDB_LOCAL issue. Honestly, I doubt anyone is using extern_learn with bridge device entries, but we cannot assume anything since this is already a part of the uAPI and we must allow it. Basically we silently fix the BR_FDB_LOCAL problem so old user syntax and code can continue working. It is a hack, but I don't see another solution which doesn't break user-space in some way. Handling NUD_PERMANENT only with !p is equivalent, unfortunately we shouldn't keep the error since that can break someone who was adding such entries without NUD_PERMANENT flag, but we can force it in kernel, that should make such scripts succeed. Traffic used to be blackholed for such entries and now it will be received locally, that will be the only difference. TBH, I want to keep that error so middle ground would be to handle NUD_PERMANENT only when used with !p and keep it. :) WDYT ? Solution which forces BR_FDB_LOCAL for !p calls (completely untested): diff --git a/net/bridge/br.c b/net/bridge/br.c index c8ae823aa8e7..d3a32c6813e0 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -166,8 +166,7 @@ static int br_switchdev_event(struct notifier_block *unused, case SWITCHDEV_FDB_ADD_TO_BRIDGE: fdb_info = ptr; err = br_fdb_external_learn_add(br, p, fdb_info->addr, - fdb_info->vid, - fdb_info->is_local, false); + fdb_info->vid, false); if (err) { err = notifier_from_errno(err); break; diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index b8e22057f680..4e3b1b66f132 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1255,15 +1255,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, rcu_read_unlock(); local_bh_enable(); } else if (ndm->ndm_flags & NTF_EXT_LEARNED) { - if (!p && !(ndm->ndm_state & NUD_PERMANENT)) { - NL_SET_ERR_MSG_MOD(extack, - "FDB entry towards bridge must be permanent"); - return -EINVAL; - } - - err = br_fdb_external_learn_add(br, p, addr, vid, - ndm->ndm_state & NUD_PERMANENT, - true); + err = br_fdb_external_learn_add(br, p, addr, vid, true); } else { spin_lock_bh(&br->hash_lock); err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb); @@ -1491,7 +1483,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p) } int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, - const unsigned char *addr, u16 vid, bool is_local, + const unsigned char *addr, u16 vid, bool swdev_notify) { struct net_bridge_fdb_entry *fdb; @@ -1509,7 +1501,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (swdev_notify) flags |= BIT(BR_FDB_ADDED_BY_USER); - if (is_local) + if (!p) flags |= BIT(BR_FDB_LOCAL); fdb = fdb_create(br, p, addr, vid, flags); @@ -1538,7 +1530,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (swdev_notify) set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); - if (is_local) + if (!p) set_bit(BR_FDB_LOCAL, &fdb->flags); if (modified) diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 86969d1bd036..907e5742b392 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -778,7 +778,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev, int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, - const unsigned char *addr, u16 vid, bool is_local, + const unsigned char *addr, u16 vid, bool swdev_notify); int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid,
On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote: > I have at least once selftest where I forgot the 'static' keyword: > > bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1 > > This patch breaks the test when run against both the kernel and hardware > data paths. I don't mind patching these tests, but we might get more > reports in the future. Is it the 'static' keyword that you forgot, or 'dynamic'? The tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest looks to me like it's testing the behavior of an FDB entry which should roam, and which without the extern_learn flag would be ageable.
On Mon, Aug 09, 2021 at 06:33:30PM +0300, Nikolay Aleksandrov wrote: > TBH, I want to keep that error so middle ground would be to handle NUD_PERMANENT only > when used with !p and keep it. :) WDYT ? Yes, works for me > > Solution which forces BR_FDB_LOCAL for !p calls (completely untested): Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> > diff --git a/net/bridge/br.c b/net/bridge/br.c > index c8ae823aa8e7..d3a32c6813e0 100644 > --- a/net/bridge/br.c > +++ b/net/bridge/br.c > @@ -166,8 +166,7 @@ static int br_switchdev_event(struct notifier_block *unused, > case SWITCHDEV_FDB_ADD_TO_BRIDGE: > fdb_info = ptr; > err = br_fdb_external_learn_add(br, p, fdb_info->addr, > - fdb_info->vid, > - fdb_info->is_local, false); > + fdb_info->vid, false); > if (err) { > err = notifier_from_errno(err); > break; > diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c > index b8e22057f680..4e3b1b66f132 100644 > --- a/net/bridge/br_fdb.c > +++ b/net/bridge/br_fdb.c > @@ -1255,15 +1255,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, > rcu_read_unlock(); > local_bh_enable(); > } else if (ndm->ndm_flags & NTF_EXT_LEARNED) { > - if (!p && !(ndm->ndm_state & NUD_PERMANENT)) { > - NL_SET_ERR_MSG_MOD(extack, > - "FDB entry towards bridge must be permanent"); > - return -EINVAL; > - } > - > - err = br_fdb_external_learn_add(br, p, addr, vid, > - ndm->ndm_state & NUD_PERMANENT, > - true); > + err = br_fdb_external_learn_add(br, p, addr, vid, true); > } else { > spin_lock_bh(&br->hash_lock); > err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb); > @@ -1491,7 +1483,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p) > } > > int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, > - const unsigned char *addr, u16 vid, bool is_local, > + const unsigned char *addr, u16 vid, > bool swdev_notify) > { > struct net_bridge_fdb_entry *fdb; > @@ -1509,7 +1501,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, > if (swdev_notify) > flags |= BIT(BR_FDB_ADDED_BY_USER); > > - if (is_local) > + if (!p) > flags |= BIT(BR_FDB_LOCAL); > > fdb = fdb_create(br, p, addr, vid, flags); > @@ -1538,7 +1530,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, > if (swdev_notify) > set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); > > - if (is_local) > + if (!p) > set_bit(BR_FDB_LOCAL, &fdb->flags); > > if (modified) > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h > index 86969d1bd036..907e5742b392 100644 > --- a/net/bridge/br_private.h > +++ b/net/bridge/br_private.h > @@ -778,7 +778,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev, > int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); > void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); > int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, > - const unsigned char *addr, u16 vid, bool is_local, > + const unsigned char *addr, u16 vid, > bool swdev_notify); > int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, > const unsigned char *addr, u16 vid, > >
On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote: > On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote: > > I have at least once selftest where I forgot the 'static' keyword: > > > > bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1 > > > > This patch breaks the test when run against both the kernel and hardware > > data paths. I don't mind patching these tests, but we might get more > > reports in the future. > > Is it the 'static' keyword that you forgot, or 'dynamic'? The > tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest > looks to me like it's testing the behavior of an FDB entry which should > roam, and which without the extern_learn flag would be ageable. static - no aging, no roaming dynamic - aging, roaming extern_learn - no aging, roaming So these combinations do not make any sense and the kernel will ignore static/dynamic when extern_learn is specified. It's needed to work around iproute2 behavior of "assume permanent"
On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote: > On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote: > > On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote: > > > I have at least once selftest where I forgot the 'static' keyword: > > > > > > bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1 > > > > > > This patch breaks the test when run against both the kernel and hardware > > > data paths. I don't mind patching these tests, but we might get more > > > reports in the future. > > > > Is it the 'static' keyword that you forgot, or 'dynamic'? The > > tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest > > looks to me like it's testing the behavior of an FDB entry which should > > roam, and which without the extern_learn flag would be ageable. > > static - no aging, no roaming > dynamic - aging, roaming > extern_learn - no aging, roaming > > So these combinations do not make any sense and the kernel will ignore > static/dynamic when extern_learn is specified. It's needed to work > around iproute2 behavior of "assume permanent" Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn' belongs to the same class of bridge neighbor attributes as 'static'/'dynamic', and that it is invalid to have the full degree of freedom. If it isn't, shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state? If it's too late to validate, shouldn't we at least document somewhere that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED? It is user API after all, easter eggs like this aren't very enjoyable.
On 10/08/2021 13:09, Vladimir Oltean wrote: > On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote: >> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote: >>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote: >>>> I have at least once selftest where I forgot the 'static' keyword: >>>> >>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1 >>>> >>>> This patch breaks the test when run against both the kernel and hardware >>>> data paths. I don't mind patching these tests, but we might get more >>>> reports in the future. >>> >>> Is it the 'static' keyword that you forgot, or 'dynamic'? The >>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest >>> looks to me like it's testing the behavior of an FDB entry which should >>> roam, and which without the extern_learn flag would be ageable. >> >> static - no aging, no roaming >> dynamic - aging, roaming >> extern_learn - no aging, roaming >> >> So these combinations do not make any sense and the kernel will ignore >> static/dynamic when extern_learn is specified. It's needed to work >> around iproute2 behavior of "assume permanent" > > Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP > are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn' > belongs to the same class of bridge neighbor attributes as 'static'/'dynamic', > and that it is invalid to have the full degree of freedom. If it isn't, > shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state? > If it's too late to validate, shouldn't we at least document somewhere > that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED? > It is user API after all, easter eggs like this aren't very enjoyable. > It's too late unfortunately, ignoring other flags in that case has been the standard behaviour for a long time (it has never made sense to specify flags for extern_learn entries). I'll send a separate patch that adds a comment to document it or if you have another thing in mind feel free to send a patch. Thanks, Nik
On Tue, Aug 10, 2021 at 01:15:32PM +0300, Nikolay Aleksandrov wrote: > On 10/08/2021 13:09, Vladimir Oltean wrote: > > On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote: > >> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote: > >>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote: > >>>> I have at least once selftest where I forgot the 'static' keyword: > >>>> > >>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1 > >>>> > >>>> This patch breaks the test when run against both the kernel and hardware > >>>> data paths. I don't mind patching these tests, but we might get more > >>>> reports in the future. > >>> > >>> Is it the 'static' keyword that you forgot, or 'dynamic'? The > >>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest > >>> looks to me like it's testing the behavior of an FDB entry which should > >>> roam, and which without the extern_learn flag would be ageable. > >> > >> static - no aging, no roaming > >> dynamic - aging, roaming > >> extern_learn - no aging, roaming > >> > >> So these combinations do not make any sense and the kernel will ignore > >> static/dynamic when extern_learn is specified. It's needed to work > >> around iproute2 behavior of "assume permanent" > > > > Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP > > are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn' > > belongs to the same class of bridge neighbor attributes as 'static'/'dynamic', > > and that it is invalid to have the full degree of freedom. If it isn't, > > shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state? > > If it's too late to validate, shouldn't we at least document somewhere > > that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED? > > It is user API after all, easter eggs like this aren't very enjoyable. > > > > It's too late unfortunately, ignoring other flags in that case has been the standard > behaviour for a long time (it has never made sense to specify flags for extern_learn > entries). I'll send a separate patch that adds a comment to document it or if you have > another thing in mind feel free to send a patch. No, I don't have anything else in mind, but since the topic is the same as the "net: bridge: fix flags interpretation for extern learn fdb entries" patch you already sent, you could as well just send a v2 for that and add an extra phrase in a comment somewhere near a NTF_EXT_LEARNED uapi definition, or perhaps extend this comment right here: /* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change and make no address resolution or NUD. NUD_PERMANENT also cannot be deleted by garbage collectors. */
On 10/08/2021 13:38, Vladimir Oltean wrote: > On Tue, Aug 10, 2021 at 01:15:32PM +0300, Nikolay Aleksandrov wrote: >> On 10/08/2021 13:09, Vladimir Oltean wrote: >>> On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote: >>>> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote: >>>>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote: >>>>>> I have at least once selftest where I forgot the 'static' keyword: >>>>>> >>>>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1 >>>>>> >>>>>> This patch breaks the test when run against both the kernel and hardware >>>>>> data paths. I don't mind patching these tests, but we might get more >>>>>> reports in the future. >>>>> >>>>> Is it the 'static' keyword that you forgot, or 'dynamic'? The >>>>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest >>>>> looks to me like it's testing the behavior of an FDB entry which should >>>>> roam, and which without the extern_learn flag would be ageable. >>>> >>>> static - no aging, no roaming >>>> dynamic - aging, roaming >>>> extern_learn - no aging, roaming >>>> >>>> So these combinations do not make any sense and the kernel will ignore >>>> static/dynamic when extern_learn is specified. It's needed to work >>>> around iproute2 behavior of "assume permanent" >>> >>> Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP >>> are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn' >>> belongs to the same class of bridge neighbor attributes as 'static'/'dynamic', >>> and that it is invalid to have the full degree of freedom. If it isn't, >>> shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state? >>> If it's too late to validate, shouldn't we at least document somewhere >>> that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED? >>> It is user API after all, easter eggs like this aren't very enjoyable. >>> >> >> It's too late unfortunately, ignoring other flags in that case has been the standard >> behaviour for a long time (it has never made sense to specify flags for extern_learn >> entries). I'll send a separate patch that adds a comment to document it or if you have >> another thing in mind feel free to send a patch. > > No, I don't have anything else in mind, but since the topic is the same > as the "net: bridge: fix flags interpretation for extern learn fdb entries" > patch you already sent, you could as well just send a v2 for that and > add an extra phrase in a comment somewhere near a NTF_EXT_LEARNED uapi > definition, or perhaps extend this comment right here: > > /* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change > and make no address resolution or NUD. > NUD_PERMANENT also cannot be deleted by garbage collectors. > */ > sure, I was going to send it for net-next, but I might as well do it in -net.
diff --git a/net/bridge/br.c b/net/bridge/br.c index ef743f94254d..bbab9984f24e 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused, case SWITCHDEV_FDB_ADD_TO_BRIDGE: fdb_info = ptr; err = br_fdb_external_learn_add(br, p, fdb_info->addr, - fdb_info->vid, false); + fdb_info->vid, + fdb_info->is_local, false); if (err) { err = notifier_from_errno(err); break; diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index a16191dcaed1..835cec1e5a03 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1019,7 +1019,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, - u16 nlh_flags, u16 vid, struct nlattr *nfea_tb[]) + u16 nlh_flags, u16 vid, struct nlattr *nfea_tb[], + struct netlink_ext_ack *extack) { int err = 0; @@ -1038,7 +1039,15 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, rcu_read_unlock(); local_bh_enable(); } else if (ndm->ndm_flags & NTF_EXT_LEARNED) { - err = br_fdb_external_learn_add(br, p, addr, vid, true); + if (!p && !(ndm->ndm_state & NUD_PERMANENT)) { + NL_SET_ERR_MSG_MOD(extack, + "FDB entry towards bridge must be permanent"); + return -EINVAL; + } + + err = br_fdb_external_learn_add(br, p, addr, vid, + ndm->ndm_state & NUD_PERMANENT, + true); } else { spin_lock_bh(&br->hash_lock); err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb); @@ -1110,9 +1119,11 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], } /* VID was specified, so use it. */ - err = __br_fdb_add(ndm, br, p, addr, nlh_flags, vid, nfea_tb); + err = __br_fdb_add(ndm, br, p, addr, nlh_flags, vid, nfea_tb, + extack); } else { - err = __br_fdb_add(ndm, br, p, addr, nlh_flags, 0, nfea_tb); + err = __br_fdb_add(ndm, br, p, addr, nlh_flags, 0, nfea_tb, + extack); if (err || !vg || !vg->num_vlans) goto out; @@ -1124,7 +1135,7 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], if (!br_vlan_should_use(v)) continue; err = __br_fdb_add(ndm, br, p, addr, nlh_flags, v->vid, - nfea_tb); + nfea_tb, extack); if (err) goto out; } @@ -1264,7 +1275,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p) } int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, - const unsigned char *addr, u16 vid, + const unsigned char *addr, u16 vid, bool is_local, bool swdev_notify) { struct net_bridge_fdb_entry *fdb; @@ -1281,6 +1292,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (swdev_notify) flags |= BIT(BR_FDB_ADDED_BY_USER); + + if (is_local) + flags |= BIT(BR_FDB_LOCAL); + fdb = fdb_create(br, p, addr, vid, flags); if (!fdb) { err = -ENOMEM; @@ -1307,6 +1322,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (swdev_notify) set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); + if (is_local) + set_bit(BR_FDB_LOCAL, &fdb->flags); + if (modified) fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify); } diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 2b48b204205e..aa64d8d63ca3 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -711,7 +711,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev, int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, - const unsigned char *addr, u16 vid, + const unsigned char *addr, u16 vid, bool is_local, bool swdev_notify); int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid,
Currently it is possible to add broken extern_learn FDB entries to the bridge in two ways: 1. Entries pointing towards the bridge device that are not local/permanent: ip link add br0 type bridge bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static 2. Entries pointing towards the bridge device or towards a port that are marked as local/permanent, however the bridge does not process the 'permanent' bit in any way, therefore they are recorded as though they aren't permanent: ip link add br0 type bridge bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the same as entries towards the bridge"), these incorrect FDB entries can even trigger NULL pointer dereferences inside the kernel. This is because that commit made the assumption that all FDB entries that are not local/permanent have a valid destination port. For context, local / permanent FDB entries either have fdb->dst == NULL, and these point towards the bridge device and are therefore local and not to be used for forwarding, or have fdb->dst == a net_bridge_port structure (but are to be treated in the same way, i.e. not for forwarding). That assumption _is_ correct as long as things are working correctly in the bridge driver, i.e. we cannot logically have fdb->dst == NULL under any circumstance for FDB entries that are not local. However, the extern_learn code path where FDB entries are managed by a user space controller show that it is possible for the bridge kernel driver to misinterpret the NUD flags of an entry transmitted by user space, and end up having fdb->dst == NULL while not being a local entry. This is invalid and should be rejected. Before, the two commands listed above both crashed the kernel in this check from br_switchdev_fdb_notify: struct net_device *dev = info.is_local ? br->dev : dst->dev; info.is_local == false, dst == NULL. After this patch, the invalid entry added by the first command is rejected: ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0 Error: bridge: FDB entry towards bridge must be permanent. and the valid entry added by the second command is properly treated as a local address and does not crash br_switchdev_fdb_notify anymore: ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0 Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space") Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> --- net/bridge/br.c | 3 ++- net/bridge/br_fdb.c | 30 ++++++++++++++++++++++++------ net/bridge/br_private.h | 2 +- 3 files changed, 27 insertions(+), 8 deletions(-)