diff mbox series

[net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry

Message ID 20210801231730.7493-1-vladimir.oltean@nxp.com
State New
Headers show
Series [net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry | expand

Commit Message

Vladimir Oltean Aug. 1, 2021, 11:17 p.m. UTC
Currently it is possible to add broken extern_learn FDB entries to the
bridge in two ways:

1. Entries pointing towards the bridge device that are not local/permanent:

ip link add br0 type bridge
bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static

2. Entries pointing towards the bridge device or towards a port that
are marked as local/permanent, however the bridge does not process the
'permanent' bit in any way, therefore they are recorded as though they
aren't permanent:

ip link add br0 type bridge
bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent

Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the
same as entries towards the bridge"), these incorrect FDB entries can
even trigger NULL pointer dereferences inside the kernel.

This is because that commit made the assumption that all FDB entries
that are not local/permanent have a valid destination port. For context,
local / permanent FDB entries either have fdb->dst == NULL, and these
point towards the bridge device and are therefore local and not to be
used for forwarding, or have fdb->dst == a net_bridge_port structure
(but are to be treated in the same way, i.e. not for forwarding).

That assumption _is_ correct as long as things are working correctly in
the bridge driver, i.e. we cannot logically have fdb->dst == NULL under
any circumstance for FDB entries that are not local. However, the
extern_learn code path where FDB entries are managed by a user space
controller show that it is possible for the bridge kernel driver to
misinterpret the NUD flags of an entry transmitted by user space, and
end up having fdb->dst == NULL while not being a local entry. This is
invalid and should be rejected.

Before, the two commands listed above both crashed the kernel in this
check from br_switchdev_fdb_notify:

	struct net_device *dev = info.is_local ? br->dev : dst->dev;

info.is_local == false, dst == NULL.

After this patch, the invalid entry added by the first command is
rejected:

ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0
Error: bridge: FDB entry towards bridge must be permanent.

and the valid entry added by the second command is properly treated as a
local address and does not crash br_switchdev_fdb_notify anymore:

ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0

Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")
Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br.c         |  3 ++-
 net/bridge/br_fdb.c     | 30 ++++++++++++++++++++++++------
 net/bridge/br_private.h |  2 +-
 3 files changed, 27 insertions(+), 8 deletions(-)

Comments

Nikolay Aleksandrov Aug. 2, 2021, 7:42 a.m. UTC | #1
On 02/08/2021 02:17, Vladimir Oltean wrote:
> Currently it is possible to add broken extern_learn FDB entries to the

> bridge in two ways:

> 

> 1. Entries pointing towards the bridge device that are not local/permanent:

> 

> ip link add br0 type bridge

> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static

> 

> 2. Entries pointing towards the bridge device or towards a port that

> are marked as local/permanent, however the bridge does not process the

> 'permanent' bit in any way, therefore they are recorded as though they

> aren't permanent:

> 

> ip link add br0 type bridge

> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent

> 

> Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the

> same as entries towards the bridge"), these incorrect FDB entries can

> even trigger NULL pointer dereferences inside the kernel.

> 

> This is because that commit made the assumption that all FDB entries

> that are not local/permanent have a valid destination port. For context,

> local / permanent FDB entries either have fdb->dst == NULL, and these

> point towards the bridge device and are therefore local and not to be

> used for forwarding, or have fdb->dst == a net_bridge_port structure

> (but are to be treated in the same way, i.e. not for forwarding).

> 

> That assumption _is_ correct as long as things are working correctly in

> the bridge driver, i.e. we cannot logically have fdb->dst == NULL under

> any circumstance for FDB entries that are not local. However, the

> extern_learn code path where FDB entries are managed by a user space

> controller show that it is possible for the bridge kernel driver to

> misinterpret the NUD flags of an entry transmitted by user space, and

> end up having fdb->dst == NULL while not being a local entry. This is

> invalid and should be rejected.

> 

> Before, the two commands listed above both crashed the kernel in this

> check from br_switchdev_fdb_notify:

> 


Not before 52e4bec15546 though, the check used to be:
struct net_device *dev = dst ? dst->dev : br->dev;

which wouldn't crash. So the fixes tag below is incorrect, you could
add a weird extern learn entry, but it wouldn't crash the kernel.

> 	struct net_device *dev = info.is_local ? br->dev : dst->dev;

> 

> info.is_local == false, dst == NULL.

> 

> After this patch, the invalid entry added by the first command is

> rejected:

> 

> ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0

> Error: bridge: FDB entry towards bridge must be permanent.

> 

> and the valid entry added by the second command is properly treated as a

> local address and does not crash br_switchdev_fdb_notify anymore:

> 

> ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0

> 

> Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")


Please change fixes tag to 52e4bec15546.

> Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com

> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

> ---

[snip]

Thanks,
 Nik
Vladimir Oltean Aug. 2, 2021, 9:20 a.m. UTC | #2
On Mon, Aug 02, 2021 at 10:42:41AM +0300, Nikolay Aleksandrov wrote:
> On 02/08/2021 02:17, Vladimir Oltean wrote:

> > Currently it is possible to add broken extern_learn FDB entries to the

> > bridge in two ways:

> >

> > 1. Entries pointing towards the bridge device that are not local/permanent:

> >

> > ip link add br0 type bridge

> > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static

> >

> > 2. Entries pointing towards the bridge device or towards a port that

> > are marked as local/permanent, however the bridge does not process the

> > 'permanent' bit in any way, therefore they are recorded as though they

> > aren't permanent:

> >

> > ip link add br0 type bridge

> > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent

> >

> > Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the

> > same as entries towards the bridge"), these incorrect FDB entries can

> > even trigger NULL pointer dereferences inside the kernel.

> >

> > This is because that commit made the assumption that all FDB entries

> > that are not local/permanent have a valid destination port. For context,

> > local / permanent FDB entries either have fdb->dst == NULL, and these

> > point towards the bridge device and are therefore local and not to be

> > used for forwarding, or have fdb->dst == a net_bridge_port structure

> > (but are to be treated in the same way, i.e. not for forwarding).

> >

> > That assumption _is_ correct as long as things are working correctly in

> > the bridge driver, i.e. we cannot logically have fdb->dst == NULL under

> > any circumstance for FDB entries that are not local. However, the

> > extern_learn code path where FDB entries are managed by a user space

> > controller show that it is possible for the bridge kernel driver to

> > misinterpret the NUD flags of an entry transmitted by user space, and

> > end up having fdb->dst == NULL while not being a local entry. This is

> > invalid and should be rejected.

> >

> > Before, the two commands listed above both crashed the kernel in this

> > check from br_switchdev_fdb_notify:

> >

>

> Not before 52e4bec15546 though, the check used to be:

> struct net_device *dev = dst ? dst->dev : br->dev;


"Before", as in "before this patch, on net-next/linux-next".

> which wouldn't crash. So the fixes tag below is incorrect, you could

> add a weird extern learn entry, but it wouldn't crash the kernel.


:)

Is our only criterion whether a patch is buggy or not that it causes a
NULL pointer dereference inside the kernel?

I thought I'd mention the interaction with the net-next work for the
sake of being thorough, and because this is how the syzcaller caught it
by coincidence, but "kernel does not treat an FDB entry with the
'permanent' flag as permanent" is enough of a reason to submit this as a
bug fix for the commit that I pointed to. Granted, I don't have any use
case with extern_learn, so probably your user space programs simply
don't add permanent FDB entries, but as this is the kernel UAPI, it
should nevertheless do whatever the user space is allowed to say. For a
permanent FDB entry, that behavior is to stop forwarding for that MAC
DA, and that behavior obviously was not taking place even before any
change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.
Nikolay Aleksandrov Aug. 2, 2021, 9:42 a.m. UTC | #3
On 02/08/2021 12:20, Vladimir Oltean wrote:
> On Mon, Aug 02, 2021 at 10:42:41AM +0300, Nikolay Aleksandrov wrote:

>> On 02/08/2021 02:17, Vladimir Oltean wrote:

>>> Currently it is possible to add broken extern_learn FDB entries to the

>>> bridge in two ways:

>>>

>>> 1. Entries pointing towards the bridge device that are not local/permanent:

>>>

>>> ip link add br0 type bridge

>>> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static

>>>

>>> 2. Entries pointing towards the bridge device or towards a port that

>>> are marked as local/permanent, however the bridge does not process the

>>> 'permanent' bit in any way, therefore they are recorded as though they

>>> aren't permanent:

>>>

>>> ip link add br0 type bridge

>>> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent

>>>

>>> Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the

>>> same as entries towards the bridge"), these incorrect FDB entries can

>>> even trigger NULL pointer dereferences inside the kernel.

>>>

>>> This is because that commit made the assumption that all FDB entries

>>> that are not local/permanent have a valid destination port. For context,

>>> local / permanent FDB entries either have fdb->dst == NULL, and these

>>> point towards the bridge device and are therefore local and not to be

>>> used for forwarding, or have fdb->dst == a net_bridge_port structure

>>> (but are to be treated in the same way, i.e. not for forwarding).

>>>

>>> That assumption _is_ correct as long as things are working correctly in

>>> the bridge driver, i.e. we cannot logically have fdb->dst == NULL under

>>> any circumstance for FDB entries that are not local. However, the

>>> extern_learn code path where FDB entries are managed by a user space

>>> controller show that it is possible for the bridge kernel driver to

>>> misinterpret the NUD flags of an entry transmitted by user space, and

>>> end up having fdb->dst == NULL while not being a local entry. This is

>>> invalid and should be rejected.

>>>

>>> Before, the two commands listed above both crashed the kernel in this

>>> check from br_switchdev_fdb_notify:

>>>

>>

>> Not before 52e4bec15546 though, the check used to be:

>> struct net_device *dev = dst ? dst->dev : br->dev;

> 

> "Before", as in "before this patch, on net-next/linux-next".

> 


We still need that check, more below.

>> which wouldn't crash. So the fixes tag below is incorrect, you could

>> add a weird extern learn entry, but it wouldn't crash the kernel.

> 

> :)

> 

> Is our only criterion whether a patch is buggy or not that it causes a

> NULL pointer dereference inside the kernel?

> 

> I thought I'd mention the interaction with the net-next work for the

> sake of being thorough, and because this is how the syzcaller caught it

> by coincidence, but "kernel does not treat an FDB entry with the

> 'permanent' flag as permanent" is enough of a reason to submit this as a


Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set.
The actual bug is that it points to the bridge device, e.g. null dst without the flag.

> bug fix for the commit that I pointed to. Granted, I don't have any use

> case with extern_learn, so probably your user space programs simply

> don't add permanent FDB entries, but as this is the kernel UAPI, it

> should nevertheless do whatever the user space is allowed to say. For a

> permanent FDB entry, that behavior is to stop forwarding for that MAC

> DA, and that behavior obviously was not taking place even before any

> change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.

> 


Actually I believe there is still a bug in 52e4bec15546 even with this fix.
The flag can change after the dst has been read in br_switchdev_fdb_notify()
so in theory you could still do a null pointer dereference. fdb_notify()
can be called from a few places without locking. The code shouldn't dereference
the dst based on the flag.

I'm okay with this change due to the null dst without permanent flag fix, but
it doesn't fully fix the null pointer dereference.
Vladimir Oltean Aug. 2, 2021, 10:52 a.m. UTC | #4
On Mon, Aug 02, 2021 at 12:42:17PM +0300, Nikolay Aleksandrov wrote:
> >>> Before, the two commands listed above both crashed the kernel in this

> >>> check from br_switchdev_fdb_notify:

> >>>

> >>

> >> Not before 52e4bec15546 though, the check used to be:

> >> struct net_device *dev = dst ? dst->dev : br->dev;

> >

> > "Before", as in "before this patch, on net-next/linux-next".

> >

>

> We still need that check, more below.

>

> >> which wouldn't crash. So the fixes tag below is incorrect, you could

> >> add a weird extern learn entry, but it wouldn't crash the kernel.

> >

> > :)

> >

> > Is our only criterion whether a patch is buggy or not that it causes a

> > NULL pointer dereference inside the kernel?

> >

> > I thought I'd mention the interaction with the net-next work for the

> > sake of being thorough, and because this is how the syzcaller caught it

> > by coincidence, but "kernel does not treat an FDB entry with the

> > 'permanent' flag as permanent" is enough of a reason to submit this as a

>

> Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set.


And that is the bug I am addressing here, no?

> The actual bug is that it points to the bridge device, e.g. null dst without the flag.

>

> > bug fix for the commit that I pointed to. Granted, I don't have any use

> > case with extern_learn, so probably your user space programs simply

> > don't add permanent FDB entries, but as this is the kernel UAPI, it

> > should nevertheless do whatever the user space is allowed to say. For a

> > permanent FDB entry, that behavior is to stop forwarding for that MAC

> > DA, and that behavior obviously was not taking place even before any

> > change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.

> >

>

> Actually I believe there is still a bug in 52e4bec15546 even with this fix.

> The flag can change after the dst has been read in br_switchdev_fdb_notify()

> so in theory you could still do a null pointer dereference. fdb_notify()

> can be called from a few places without locking. The code shouldn't dereference

> the dst based on the flag.


Are you thinking of a specific code path that triggers a race between
(a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then
    set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and
(b) a reader side catching that fdb exactly in between the above 2
    statements, through fdb_notify or otherwise (br_fdb_replay)?

Because I don't see any.

Plus, I am a bit nervous about protecting against theoretical/unproven
races in a way that masks real bugs, as we would be doing if I add an
extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the
case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL.

>

> I'm okay with this change due to the null dst without permanent flag fix, but

> it doesn't fully fix the null pointer dereference.


So is there any change that I should make to this patch?
Nikolay Aleksandrov Aug. 2, 2021, 11:02 a.m. UTC | #5
On 02/08/2021 13:52, Vladimir Oltean wrote:
> On Mon, Aug 02, 2021 at 12:42:17PM +0300, Nikolay Aleksandrov wrote:

>>>>> Before, the two commands listed above both crashed the kernel in this

>>>>> check from br_switchdev_fdb_notify:

>>>>>

>>>>

>>>> Not before 52e4bec15546 though, the check used to be:

>>>> struct net_device *dev = dst ? dst->dev : br->dev;

>>>

>>> "Before", as in "before this patch, on net-next/linux-next".

>>>

>>

>> We still need that check, more below.

>>

>>>> which wouldn't crash. So the fixes tag below is incorrect, you could

>>>> add a weird extern learn entry, but it wouldn't crash the kernel.

>>>

>>> :)

>>>

>>> Is our only criterion whether a patch is buggy or not that it causes a

>>> NULL pointer dereference inside the kernel?

>>>

>>> I thought I'd mention the interaction with the net-next work for the

>>> sake of being thorough, and because this is how the syzcaller caught it

>>> by coincidence, but "kernel does not treat an FDB entry with the

>>> 'permanent' flag as permanent" is enough of a reason to submit this as a

>>

>> Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set.

> 

> And that is the bug I am addressing here, no?

> 


Obviously I pointed to the statement above, I don't get the question here.

>> The actual bug is that it points to the bridge device, e.g. null dst without the flag.

>>

>>> bug fix for the commit that I pointed to. Granted, I don't have any use

>>> case with extern_learn, so probably your user space programs simply

>>> don't add permanent FDB entries, but as this is the kernel UAPI, it

>>> should nevertheless do whatever the user space is allowed to say. For a

>>> permanent FDB entry, that behavior is to stop forwarding for that MAC

>>> DA, and that behavior obviously was not taking place even before any

>>> change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.

>>>

>>

>> Actually I believe there is still a bug in 52e4bec15546 even with this fix.

>> The flag can change after the dst has been read in br_switchdev_fdb_notify()

>> so in theory you could still do a null pointer dereference. fdb_notify()

>> can be called from a few places without locking. The code shouldn't dereference

>> the dst based on the flag.

> 

> Are you thinking of a specific code path that triggers a race between

> (a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then

>     set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and


Visible order is not guaranteed, there are no barriers neither at writer nor reader
sides, especially when used without locking. So we cannot make any assumptions
about the order visibility of these writes.

> (b) a reader side catching that fdb exactly in between the above 2

>     statements, through fdb_notify or otherwise (br_fdb_replay)?

> 

> Because I don't see any.

> 

> Plus, I am a bit nervous about protecting against theoretical/unproven

> races in a way that masks real bugs, as we would be doing if I add an

> extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the

> case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL.

> 


The bits are _not_ visible atomically with the setting of ->dst. It is obvious
you must not dereference anything based on them, they are only indications when used
outside of locked regions and code must be able to deal with inconsistencies as that
is implied by the way they're used. It is a clear and obvious bug dereferencing based
on a bit that can change in parallel without any memory ordering guarantees.

You are not "masking" anything, but fixing what is currently buggy use of fdb bits.
As I already said - this doesn't fix the null deref bug completely, in fact it fixes a different
inconsistency, before at worst you'd get blackholed traffic for such entries now
you get a null pointer dereference.

>>

>> I'm okay with this change due to the null dst without permanent flag fix, but

>> it doesn't fully fix the null pointer dereference.

> 

> So is there any change that I should make to this patch?

> 


No, this patch is fine as-is.
Vladimir Oltean Aug. 2, 2021, 11:20 a.m. UTC | #6
On Mon, Aug 02, 2021 at 02:02:36PM +0300, Nikolay Aleksandrov wrote:
> >> Actually I believe there is still a bug in 52e4bec15546 even with this fix.

> >> The flag can change after the dst has been read in br_switchdev_fdb_notify()

> >> so in theory you could still do a null pointer dereference. fdb_notify()

> >> can be called from a few places without locking. The code shouldn't dereference

> >> the dst based on the flag.

> >

> > Are you thinking of a specific code path that triggers a race between

> > (a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then

> >     set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and

>

> Visible order is not guaranteed, there are no barriers neither at writer nor reader

> sides, especially when used without locking. So we cannot make any assumptions

> about the order visibility of these writes.

>

> > (b) a reader side catching that fdb exactly in between the above 2

> >     statements, through fdb_notify or otherwise (br_fdb_replay)?

> >

> > Because I don't see any.

> >

> > Plus, I am a bit nervous about protecting against theoretical/unproven

> > races in a way that masks real bugs, as we would be doing if I add an

> > extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the

> > case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL.

> >

>

> The bits are _not_ visible atomically with the setting of ->dst. It is obvious

> you must not dereference anything based on them, they are only indications when used

> outside of locked regions and code must be able to deal with inconsistencies as that

> is implied by the way they're used. It is a clear and obvious bug dereferencing based

> on a bit that can change in parallel without any memory ordering guarantees.


Ok, I will send a separate patch for that.

> You are not "masking" anything, but fixing what is currently buggy use of fdb bits.


I am "masking" in the sense that the bug I am fixing here was not
obvious to me until it triggered a NPD. That would stop happening with
the patch I'm about to send, but maybe there are still bridge UAPI
functions that do not validate the 'permanent' flag from FDB entries.

> As I already said - this doesn't fix the null deref bug completely, in fact it fixes a different

> inconsistency, before at worst you'd get blackholed traffic for such entries now

> you get a null pointer dereference.
Nikolay Aleksandrov Aug. 2, 2021, 11:25 a.m. UTC | #7
On 02/08/2021 02:17, Vladimir Oltean wrote:
> Currently it is possible to add broken extern_learn FDB entries to the

> bridge in two ways:

> 

> 1. Entries pointing towards the bridge device that are not local/permanent:

> 

> ip link add br0 type bridge

> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static

> 

> 2. Entries pointing towards the bridge device or towards a port that

> are marked as local/permanent, however the bridge does not process the

> 'permanent' bit in any way, therefore they are recorded as though they

> aren't permanent:

> 

> ip link add br0 type bridge

> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent

> 

> Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the

> same as entries towards the bridge"), these incorrect FDB entries can

> even trigger NULL pointer dereferences inside the kernel.

> 

> This is because that commit made the assumption that all FDB entries

> that are not local/permanent have a valid destination port. For context,

> local / permanent FDB entries either have fdb->dst == NULL, and these

> point towards the bridge device and are therefore local and not to be

> used for forwarding, or have fdb->dst == a net_bridge_port structure

> (but are to be treated in the same way, i.e. not for forwarding).

> 

> That assumption _is_ correct as long as things are working correctly in

> the bridge driver, i.e. we cannot logically have fdb->dst == NULL under

> any circumstance for FDB entries that are not local. However, the

> extern_learn code path where FDB entries are managed by a user space

> controller show that it is possible for the bridge kernel driver to

> misinterpret the NUD flags of an entry transmitted by user space, and

> end up having fdb->dst == NULL while not being a local entry. This is

> invalid and should be rejected.

> 

> Before, the two commands listed above both crashed the kernel in this

> check from br_switchdev_fdb_notify:

> 

> 	struct net_device *dev = info.is_local ? br->dev : dst->dev;

> 

> info.is_local == false, dst == NULL.

> 

> After this patch, the invalid entry added by the first command is

> rejected:

> 

> ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0

> Error: bridge: FDB entry towards bridge must be permanent.

> 

> and the valid entry added by the second command is properly treated as a

> local address and does not crash br_switchdev_fdb_notify anymore:

> 

> ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0

> 

> Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")

> Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com

> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

> ---


Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
patchwork-bot+netdevbpf@kernel.org Aug. 2, 2021, 10:10 p.m. UTC | #8
Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Mon,  2 Aug 2021 02:17:30 +0300 you wrote:
> Currently it is possible to add broken extern_learn FDB entries to the

> bridge in two ways:

> 

> 1. Entries pointing towards the bridge device that are not local/permanent:

> 

> ip link add br0 type bridge

> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static

> 

> [...]


Here is the summary with links:
  - [net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
    https://git.kernel.org/netdev/net/c/0541a6293298

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Ido Schimmel Aug. 9, 2021, 12:16 p.m. UTC | #9
On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote:
> diff --git a/net/bridge/br.c b/net/bridge/br.c

> index ef743f94254d..bbab9984f24e 100644

> --- a/net/bridge/br.c

> +++ b/net/bridge/br.c

> @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,

>  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:

>  		fdb_info = ptr;

>  		err = br_fdb_external_learn_add(br, p, fdb_info->addr,

> -						fdb_info->vid, false);

> +						fdb_info->vid,

> +						fdb_info->is_local, false);


When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge:
switchdev: include local flag in FDB notifications") it was not
initialized in all the call sites that emit
'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage.

>  		if (err) {

>  			err = notifier_from_errno(err);

>  			break;


[...]

> @@ -1281,6 +1292,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,

>  

>  		if (swdev_notify)

>  			flags |= BIT(BR_FDB_ADDED_BY_USER);

> +

> +		if (is_local)

> +			flags |= BIT(BR_FDB_LOCAL);


I have at least once selftest where I forgot the 'static' keyword:

bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

This patch breaks the test when run against both the kernel and hardware
data paths. I don't mind patching these tests, but we might get more
reports in the future.

Nik, what do you think?

> +

>  		fdb = fdb_create(br, p, addr, vid, flags);

>  		if (!fdb) {

>  			err = -ENOMEM;

> @@ -1307,6 +1322,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,

>  		if (swdev_notify)

>  			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);

>  

> +		if (is_local)

> +			set_bit(BR_FDB_LOCAL, &fdb->flags);

> +

>  		if (modified)

>  			fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);

>  	}

> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h

> index 2b48b204205e..aa64d8d63ca3 100644

> --- a/net/bridge/br_private.h

> +++ b/net/bridge/br_private.h

> @@ -711,7 +711,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,

>  int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);

>  void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);

>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,

> -			      const unsigned char *addr, u16 vid,

> +			      const unsigned char *addr, u16 vid, bool is_local,

>  			      bool swdev_notify);

>  int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,

>  			      const unsigned char *addr, u16 vid,

> -- 

> 2.25.1

>
Vladimir Oltean Aug. 9, 2021, 12:32 p.m. UTC | #10
On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
> On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote:

> > diff --git a/net/bridge/br.c b/net/bridge/br.c

> > index ef743f94254d..bbab9984f24e 100644

> > --- a/net/bridge/br.c

> > +++ b/net/bridge/br.c

> > @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,

> >  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:

> >  		fdb_info = ptr;

> >  		err = br_fdb_external_learn_add(br, p, fdb_info->addr,

> > -						fdb_info->vid, false);

> > +						fdb_info->vid,

> > +						fdb_info->is_local, false);

> 

> When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge:

> switchdev: include local flag in FDB notifications") it was not

> initialized in all the call sites that emit

> 'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage.


Thanks for the report, I'll send a patch which adds a:

	memset(&info, 0, sizeof(info));

to all SWITCHDEV_FDB_*_TO_BRIDGE call sites.
Nikolay Aleksandrov Aug. 9, 2021, 3:33 p.m. UTC | #11
On 09/08/2021 15:16, Ido Schimmel wrote:
> On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote:

>> diff --git a/net/bridge/br.c b/net/bridge/br.c

>> index ef743f94254d..bbab9984f24e 100644

>> --- a/net/bridge/br.c

>> +++ b/net/bridge/br.c

>> @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,

>>  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:

>>  		fdb_info = ptr;

>>  		err = br_fdb_external_learn_add(br, p, fdb_info->addr,

>> -						fdb_info->vid, false);

>> +						fdb_info->vid,

>> +						fdb_info->is_local, false);

> 

> When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge:

> switchdev: include local flag in FDB notifications") it was not

> initialized in all the call sites that emit

> 'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage.

> 


nice catch

>>  		if (err) {

>>  			err = notifier_from_errno(err);

>>  			break;

> 

> [...]

> 

>> @@ -1281,6 +1292,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,

>>  

>>  		if (swdev_notify)

>>  			flags |= BIT(BR_FDB_ADDED_BY_USER);

>> +

>> +		if (is_local)

>> +			flags |= BIT(BR_FDB_LOCAL);

> 

> I have at least once selftest where I forgot the 'static' keyword:

> 

> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

> 

> This patch breaks the test when run against both the kernel and hardware

> data paths. I don't mind patching these tests, but we might get more

> reports in the future.

> 

> Nik, what do you think?

> 


Ahh, that's unfortunate. The patch's assumption is correct that we must not have fdb->dst == NULL
and the dst to be non-local (i.e. without BR_FDB_LOCAL). Since all solutions break user-space in
a different way and since this patch also already broke it by the check for !p && !NUD_PERMANENT
in __br_fdb_add() which was allowed before that, I think the best course of action is to ignore
NUD_PERMANENT in __br_fdb_add() for extern_learn case and always set BR_FDB_LOCAL in br_fdb_external_learn_add()
when !p. That would allow all prior calls to work and would remove the dst==NULL without BR_FDB_LOCAL
issue. Honestly, I doubt anyone is using extern_learn with bridge device entries, but we cannot assume
anything since this is already a part of the uAPI and we must allow it. Basically we silently
fix the BR_FDB_LOCAL problem so old user syntax and code can continue working.
It is a hack, but I don't see another solution which doesn't break user-space in some way.
Handling NUD_PERMANENT only with !p is equivalent, unfortunately we shouldn't keep the error
since that can break someone who was adding such entries without NUD_PERMANENT flag, but
we can force it in kernel, that should make such scripts succeed. Traffic used to be blackholed
for such entries and now it will be received locally, that will be the only difference.

TBH, I want to keep that error so middle ground would be to handle NUD_PERMANENT only
when used with !p and keep it. :) WDYT ?

Solution which forces BR_FDB_LOCAL for !p calls (completely untested):
diff --git a/net/bridge/br.c b/net/bridge/br.c
index c8ae823aa8e7..d3a32c6813e0 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,8 +166,7 @@ static int br_switchdev_event(struct notifier_block *unused,
        case SWITCHDEV_FDB_ADD_TO_BRIDGE:
                fdb_info = ptr;
                err = br_fdb_external_learn_add(br, p, fdb_info->addr,
-                                               fdb_info->vid,
-                                               fdb_info->is_local, false);
+                                               fdb_info->vid, false);
                if (err) {
                        err = notifier_from_errno(err);
                        break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index b8e22057f680..4e3b1b66f132 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1255,15 +1255,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
                rcu_read_unlock();
                local_bh_enable();
        } else if (ndm->ndm_flags & NTF_EXT_LEARNED) {
-               if (!p && !(ndm->ndm_state & NUD_PERMANENT)) {
-                       NL_SET_ERR_MSG_MOD(extack,
-                                          "FDB entry towards bridge must be permanent");
-                       return -EINVAL;
-               }
-
-               err = br_fdb_external_learn_add(br, p, addr, vid,
-                                               ndm->ndm_state & NUD_PERMANENT,
-                                               true);
+               err = br_fdb_external_learn_add(br, p, addr, vid, true);
        } else {
                spin_lock_bh(&br->hash_lock);
                err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
@@ -1491,7 +1483,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 }
 
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-                             const unsigned char *addr, u16 vid, bool is_local,
+                             const unsigned char *addr, u16 vid,
                              bool swdev_notify)
 {
        struct net_bridge_fdb_entry *fdb;
@@ -1509,7 +1501,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
                if (swdev_notify)
                        flags |= BIT(BR_FDB_ADDED_BY_USER);
 
-               if (is_local)
+               if (!p)
                        flags |= BIT(BR_FDB_LOCAL);
 
                fdb = fdb_create(br, p, addr, vid, flags);
@@ -1538,7 +1530,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
                if (swdev_notify)
                        set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 
-               if (is_local)
+               if (!p)
                        set_bit(BR_FDB_LOCAL, &fdb->flags);
 
                if (modified)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 86969d1bd036..907e5742b392 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -778,7 +778,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,
 int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-                             const unsigned char *addr, u16 vid, bool is_local,
+                             const unsigned char *addr, u16 vid,
                              bool swdev_notify);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
                              const unsigned char *addr, u16 vid,
Vladimir Oltean Aug. 9, 2021, 4:05 p.m. UTC | #12
On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
> I have at least once selftest where I forgot the 'static' keyword:

>

> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

>

> This patch breaks the test when run against both the kernel and hardware

> data paths. I don't mind patching these tests, but we might get more

> reports in the future.


Is it the 'static' keyword that you forgot, or 'dynamic'? The
tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest
looks to me like it's testing the behavior of an FDB entry which should
roam, and which without the extern_learn flag would be ageable.
Ido Schimmel Aug. 10, 2021, 6:40 a.m. UTC | #13
On Mon, Aug 09, 2021 at 06:33:30PM +0300, Nikolay Aleksandrov wrote:
> TBH, I want to keep that error so middle ground would be to handle NUD_PERMANENT only

> when used with !p and keep it. :) WDYT ?


Yes, works for me

> 

> Solution which forces BR_FDB_LOCAL for !p calls (completely untested):


Reviewed-by: Ido Schimmel <idosch@nvidia.com>

Tested-by: Ido Schimmel <idosch@nvidia.com>


> diff --git a/net/bridge/br.c b/net/bridge/br.c

> index c8ae823aa8e7..d3a32c6813e0 100644

> --- a/net/bridge/br.c

> +++ b/net/bridge/br.c

> @@ -166,8 +166,7 @@ static int br_switchdev_event(struct notifier_block *unused,

>         case SWITCHDEV_FDB_ADD_TO_BRIDGE:

>                 fdb_info = ptr;

>                 err = br_fdb_external_learn_add(br, p, fdb_info->addr,

> -                                               fdb_info->vid,

> -                                               fdb_info->is_local, false);

> +                                               fdb_info->vid, false);

>                 if (err) {

>                         err = notifier_from_errno(err);

>                         break;

> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c

> index b8e22057f680..4e3b1b66f132 100644

> --- a/net/bridge/br_fdb.c

> +++ b/net/bridge/br_fdb.c

> @@ -1255,15 +1255,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,

>                 rcu_read_unlock();

>                 local_bh_enable();

>         } else if (ndm->ndm_flags & NTF_EXT_LEARNED) {

> -               if (!p && !(ndm->ndm_state & NUD_PERMANENT)) {

> -                       NL_SET_ERR_MSG_MOD(extack,

> -                                          "FDB entry towards bridge must be permanent");

> -                       return -EINVAL;

> -               }

> -

> -               err = br_fdb_external_learn_add(br, p, addr, vid,

> -                                               ndm->ndm_state & NUD_PERMANENT,

> -                                               true);

> +               err = br_fdb_external_learn_add(br, p, addr, vid, true);

>         } else {

>                 spin_lock_bh(&br->hash_lock);

>                 err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);

> @@ -1491,7 +1483,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)

>  }

>  

>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,

> -                             const unsigned char *addr, u16 vid, bool is_local,

> +                             const unsigned char *addr, u16 vid,

>                               bool swdev_notify)

>  {

>         struct net_bridge_fdb_entry *fdb;

> @@ -1509,7 +1501,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,

>                 if (swdev_notify)

>                         flags |= BIT(BR_FDB_ADDED_BY_USER);

>  

> -               if (is_local)

> +               if (!p)

>                         flags |= BIT(BR_FDB_LOCAL);

>  

>                 fdb = fdb_create(br, p, addr, vid, flags);

> @@ -1538,7 +1530,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,

>                 if (swdev_notify)

>                         set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);

>  

> -               if (is_local)

> +               if (!p)

>                         set_bit(BR_FDB_LOCAL, &fdb->flags);

>  

>                 if (modified)

> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h

> index 86969d1bd036..907e5742b392 100644

> --- a/net/bridge/br_private.h

> +++ b/net/bridge/br_private.h

> @@ -778,7 +778,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,

>  int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);

>  void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);

>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,

> -                             const unsigned char *addr, u16 vid, bool is_local,

> +                             const unsigned char *addr, u16 vid,

>                               bool swdev_notify);

>  int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,

>                               const unsigned char *addr, u16 vid,

> 

>
Ido Schimmel Aug. 10, 2021, 6:46 a.m. UTC | #14
On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:
> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:

> > I have at least once selftest where I forgot the 'static' keyword:

> >

> > bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

> >

> > This patch breaks the test when run against both the kernel and hardware

> > data paths. I don't mind patching these tests, but we might get more

> > reports in the future.

> 

> Is it the 'static' keyword that you forgot, or 'dynamic'? The

> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest

> looks to me like it's testing the behavior of an FDB entry which should

> roam, and which without the extern_learn flag would be ageable.


static - no aging, no roaming
dynamic - aging, roaming
extern_learn - no aging, roaming

So these combinations do not make any sense and the kernel will ignore
static/dynamic when extern_learn is specified. It's needed to work
around iproute2 behavior of "assume permanent"
Vladimir Oltean Aug. 10, 2021, 10:09 a.m. UTC | #15
On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote:
> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:

> > On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:

> > > I have at least once selftest where I forgot the 'static' keyword:

> > >

> > > bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

> > >

> > > This patch breaks the test when run against both the kernel and hardware

> > > data paths. I don't mind patching these tests, but we might get more

> > > reports in the future.

> > 

> > Is it the 'static' keyword that you forgot, or 'dynamic'? The

> > tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest

> > looks to me like it's testing the behavior of an FDB entry which should

> > roam, and which without the extern_learn flag would be ageable.

> 

> static - no aging, no roaming

> dynamic - aging, roaming

> extern_learn - no aging, roaming

> 

> So these combinations do not make any sense and the kernel will ignore

> static/dynamic when extern_learn is specified. It's needed to work

> around iproute2 behavior of "assume permanent"


Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP
are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn'
belongs to the same class of bridge neighbor attributes as 'static'/'dynamic',
and that it is invalid to have the full degree of freedom. If it isn't,
shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state?
If it's too late to validate, shouldn't we at least document somewhere
that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED?
It is user API after all, easter eggs like this aren't very enjoyable.
Nikolay Aleksandrov Aug. 10, 2021, 10:15 a.m. UTC | #16
On 10/08/2021 13:09, Vladimir Oltean wrote:
> On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote:

>> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:

>>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:

>>>> I have at least once selftest where I forgot the 'static' keyword:

>>>>

>>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

>>>>

>>>> This patch breaks the test when run against both the kernel and hardware

>>>> data paths. I don't mind patching these tests, but we might get more

>>>> reports in the future.

>>>

>>> Is it the 'static' keyword that you forgot, or 'dynamic'? The

>>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest

>>> looks to me like it's testing the behavior of an FDB entry which should

>>> roam, and which without the extern_learn flag would be ageable.

>>

>> static - no aging, no roaming

>> dynamic - aging, roaming

>> extern_learn - no aging, roaming

>>

>> So these combinations do not make any sense and the kernel will ignore

>> static/dynamic when extern_learn is specified. It's needed to work

>> around iproute2 behavior of "assume permanent"

> 

> Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP

> are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn'

> belongs to the same class of bridge neighbor attributes as 'static'/'dynamic',

> and that it is invalid to have the full degree of freedom. If it isn't,

> shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state?

> If it's too late to validate, shouldn't we at least document somewhere

> that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED?

> It is user API after all, easter eggs like this aren't very enjoyable.

> 


It's too late unfortunately, ignoring other flags in that case has been the standard
behaviour for a long time (it has never made sense to specify flags for extern_learn
entries). I'll send a separate patch that adds a comment to document it or if you have
another thing in mind feel free to send a patch.

Thanks,
 Nik
Vladimir Oltean Aug. 10, 2021, 10:38 a.m. UTC | #17
On Tue, Aug 10, 2021 at 01:15:32PM +0300, Nikolay Aleksandrov wrote:
> On 10/08/2021 13:09, Vladimir Oltean wrote:

> > On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote:

> >> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:

> >>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:

> >>>> I have at least once selftest where I forgot the 'static' keyword:

> >>>>

> >>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

> >>>>

> >>>> This patch breaks the test when run against both the kernel and hardware

> >>>> data paths. I don't mind patching these tests, but we might get more

> >>>> reports in the future.

> >>>

> >>> Is it the 'static' keyword that you forgot, or 'dynamic'? The

> >>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest

> >>> looks to me like it's testing the behavior of an FDB entry which should

> >>> roam, and which without the extern_learn flag would be ageable.

> >>

> >> static - no aging, no roaming

> >> dynamic - aging, roaming

> >> extern_learn - no aging, roaming

> >>

> >> So these combinations do not make any sense and the kernel will ignore

> >> static/dynamic when extern_learn is specified. It's needed to work

> >> around iproute2 behavior of "assume permanent"

> > 

> > Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP

> > are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn'

> > belongs to the same class of bridge neighbor attributes as 'static'/'dynamic',

> > and that it is invalid to have the full degree of freedom. If it isn't,

> > shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state?

> > If it's too late to validate, shouldn't we at least document somewhere

> > that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED?

> > It is user API after all, easter eggs like this aren't very enjoyable.

> > 

> 

> It's too late unfortunately, ignoring other flags in that case has been the standard

> behaviour for a long time (it has never made sense to specify flags for extern_learn

> entries). I'll send a separate patch that adds a comment to document it or if you have

> another thing in mind feel free to send a patch.


No, I don't have anything else in mind, but since the topic is the same
as the "net: bridge: fix flags interpretation for extern learn fdb entries"
patch you already sent, you could as well just send a v2 for that and
add an extra phrase in a comment somewhere near a NTF_EXT_LEARNED uapi
definition, or perhaps extend this comment right here:

/* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change
   and make no address resolution or NUD.
   NUD_PERMANENT also cannot be deleted by garbage collectors.
 */
Nikolay Aleksandrov Aug. 10, 2021, 10:43 a.m. UTC | #18
On 10/08/2021 13:38, Vladimir Oltean wrote:
> On Tue, Aug 10, 2021 at 01:15:32PM +0300, Nikolay Aleksandrov wrote:

>> On 10/08/2021 13:09, Vladimir Oltean wrote:

>>> On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote:

>>>> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:

>>>>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:

>>>>>> I have at least once selftest where I forgot the 'static' keyword:

>>>>>>

>>>>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

>>>>>>

>>>>>> This patch breaks the test when run against both the kernel and hardware

>>>>>> data paths. I don't mind patching these tests, but we might get more

>>>>>> reports in the future.

>>>>>

>>>>> Is it the 'static' keyword that you forgot, or 'dynamic'? The

>>>>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest

>>>>> looks to me like it's testing the behavior of an FDB entry which should

>>>>> roam, and which without the extern_learn flag would be ageable.

>>>>

>>>> static - no aging, no roaming

>>>> dynamic - aging, roaming

>>>> extern_learn - no aging, roaming

>>>>

>>>> So these combinations do not make any sense and the kernel will ignore

>>>> static/dynamic when extern_learn is specified. It's needed to work

>>>> around iproute2 behavior of "assume permanent"

>>>

>>> Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP

>>> are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn'

>>> belongs to the same class of bridge neighbor attributes as 'static'/'dynamic',

>>> and that it is invalid to have the full degree of freedom. If it isn't,

>>> shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state?

>>> If it's too late to validate, shouldn't we at least document somewhere

>>> that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED?

>>> It is user API after all, easter eggs like this aren't very enjoyable.

>>>

>>

>> It's too late unfortunately, ignoring other flags in that case has been the standard

>> behaviour for a long time (it has never made sense to specify flags for extern_learn

>> entries). I'll send a separate patch that adds a comment to document it or if you have

>> another thing in mind feel free to send a patch.

> 

> No, I don't have anything else in mind, but since the topic is the same

> as the "net: bridge: fix flags interpretation for extern learn fdb entries"

> patch you already sent, you could as well just send a v2 for that and

> add an extra phrase in a comment somewhere near a NTF_EXT_LEARNED uapi

> definition, or perhaps extend this comment right here:

> 

> /* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change

>    and make no address resolution or NUD.

>    NUD_PERMANENT also cannot be deleted by garbage collectors.

>  */

> 


sure, I was going to send it for net-next, but I might as well do it in -net.
diff mbox series

Patch

diff --git a/net/bridge/br.c b/net/bridge/br.c
index ef743f94254d..bbab9984f24e 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,7 +166,8 @@  static int br_switchdev_event(struct notifier_block *unused,
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 		fdb_info = ptr;
 		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
-						fdb_info->vid, false);
+						fdb_info->vid,
+						fdb_info->is_local, false);
 		if (err) {
 			err = notifier_from_errno(err);
 			break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index a16191dcaed1..835cec1e5a03 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1019,7 +1019,8 @@  static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 
 static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 			struct net_bridge_port *p, const unsigned char *addr,
-			u16 nlh_flags, u16 vid, struct nlattr *nfea_tb[])
+			u16 nlh_flags, u16 vid, struct nlattr *nfea_tb[],
+			struct netlink_ext_ack *extack)
 {
 	int err = 0;
 
@@ -1038,7 +1039,15 @@  static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 		rcu_read_unlock();
 		local_bh_enable();
 	} else if (ndm->ndm_flags & NTF_EXT_LEARNED) {
-		err = br_fdb_external_learn_add(br, p, addr, vid, true);
+		if (!p && !(ndm->ndm_state & NUD_PERMANENT)) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "FDB entry towards bridge must be permanent");
+			return -EINVAL;
+		}
+
+		err = br_fdb_external_learn_add(br, p, addr, vid,
+						ndm->ndm_state & NUD_PERMANENT,
+						true);
 	} else {
 		spin_lock_bh(&br->hash_lock);
 		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
@@ -1110,9 +1119,11 @@  int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 		}
 
 		/* VID was specified, so use it. */
-		err = __br_fdb_add(ndm, br, p, addr, nlh_flags, vid, nfea_tb);
+		err = __br_fdb_add(ndm, br, p, addr, nlh_flags, vid, nfea_tb,
+				   extack);
 	} else {
-		err = __br_fdb_add(ndm, br, p, addr, nlh_flags, 0, nfea_tb);
+		err = __br_fdb_add(ndm, br, p, addr, nlh_flags, 0, nfea_tb,
+				   extack);
 		if (err || !vg || !vg->num_vlans)
 			goto out;
 
@@ -1124,7 +1135,7 @@  int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 			if (!br_vlan_should_use(v))
 				continue;
 			err = __br_fdb_add(ndm, br, p, addr, nlh_flags, v->vid,
-					   nfea_tb);
+					   nfea_tb, extack);
 			if (err)
 				goto out;
 		}
@@ -1264,7 +1275,7 @@  void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 }
 
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-			      const unsigned char *addr, u16 vid,
+			      const unsigned char *addr, u16 vid, bool is_local,
 			      bool swdev_notify)
 {
 	struct net_bridge_fdb_entry *fdb;
@@ -1281,6 +1292,10 @@  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 
 		if (swdev_notify)
 			flags |= BIT(BR_FDB_ADDED_BY_USER);
+
+		if (is_local)
+			flags |= BIT(BR_FDB_LOCAL);
+
 		fdb = fdb_create(br, p, addr, vid, flags);
 		if (!fdb) {
 			err = -ENOMEM;
@@ -1307,6 +1322,9 @@  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		if (swdev_notify)
 			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 
+		if (is_local)
+			set_bit(BR_FDB_LOCAL, &fdb->flags);
+
 		if (modified)
 			fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
 	}
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 2b48b204205e..aa64d8d63ca3 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -711,7 +711,7 @@  int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,
 int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-			      const unsigned char *addr, u16 vid,
+			      const unsigned char *addr, u16 vid, bool is_local,
 			      bool swdev_notify);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,