diff mbox series

[net-next] net: bridge: pop vlan from skb if filtering is disabled but it's a pvid

Message ID 20200911231619.2876486-1-olteanv@gmail.com
State New
Headers show
Series [net-next] net: bridge: pop vlan from skb if filtering is disabled but it's a pvid | expand

Commit Message

Vladimir Oltean Sept. 11, 2020, 11:16 p.m. UTC
Currently the bridge untags VLANs from its VLAN group in
__allowed_ingress() only when VLAN filtering is enabled.

When installing a pvid in egress-tagged mode, DSA switches have a
problem:

ip link add dev br0 type bridge vlan_filtering 0
ip link set swp0 master br0
bridge vlan del dev swp0 vid 1
bridge vlan add dev swp0 vid 1 pvid

When adding a VLAN on a DSA switch interface, DSA configures the VLAN
membership of the CPU port using the same flags as swp0 (in this case
"pvid and not untagged"), in an attempt to copy the frame as-is from
ingress to the CPU.

However, in this case, the packet may arrive untagged on ingress, it
will be pvid-tagged by the ingress port, and will be sent as
egress-tagged towards the CPU. Otherwise stated, the CPU will see a VLAN
tag where there was none to speak of on ingress.

When vlan_filtering is 1, this is not a problem, as stated in the first
paragraph, because __allowed_ingress() will pop it. But currently, when
vlan_filtering is 0 and we have such a VLAN configuration, we need an
8021q upper (br0.1) to be able to ping over that VLAN.

Make the 2 cases (vlan_filtering 0 and 1) behave the same way by popping
the pvid, if the skb happens to be tagged with it, when vlan_filtering
is 0.

There was an attempt to resolve this issue locally within the DSA
receive data path, but even though we can determine that we are under a
bridge with vlan_filtering=0, there are still some challenges:
- we cannot be certain that the skb will end up in the software bridge's
  data path, and for that reason, we may be popping the VLAN for
  nothing. Example: there might exist an 8021q upper with the same VLAN,
  or this interface might be a DSA master for another switch. In that
  case, the VLAN should definitely not be popped even if it is equal to
  the default_pvid of the bridge, because it will be consumed about the
  DSA layer below.
- the bridge API only offers a race-free API for determining the pvid of
  a port, br_vlan_get_pvid(), under RTNL.

And in fact this might not even be a situation unique to DSA. Any driver
that receives untagged frames as pvid-tagged is now able to communicate
without needing an 8021q upper for the pvid.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
---
 net/bridge/br_vlan.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Nikolay Aleksandrov Sept. 12, 2020, 7:37 a.m. UTC | #1
On Sat, 2020-09-12 at 10:23 +0300, Vladimir Oltean wrote:
> On Sat, Sep 12, 2020 at 06:56:12AM +0000, Nikolay Aleksandrov wrote:

> > Could you point me to a thread where these problems were discussed and why

> > they couldn't be resolved within DSA in detail ?

> 

> See my discussion with Florian in this thread:

> http://patchwork.ozlabs.org/project/netdev/patch/20200907182910.1285496-5-olteanv@gmail.com/

> There's a bunch of unrelated stuff going on there, hope you'll manage.

> 


Thanks!
I'm traveling and will be back on Sun evening, will go through the thread then.

> > > - the bridge API only offers a race-free API for determining the pvid of

> > >   a port, br_vlan_get_pvid(), under RTNL.

> > > 

> > 

> > The API can be easily extended.

> > 

> 

> If you can help, cool.

> 

> > > And in fact this might not even be a situation unique to DSA. Any driver

> > > that receives untagged frames as pvid-tagged is now able to communicate

> > > without needing an 8021q upper for the pvid.

> > > 

> > 

> > I would prefer we don't add hardware/driver-specific fixes in the bridge, when

> > vlan filtering is disabled there should be no vlan manipulation/filtering done

> > by the bridge. This could potentially break users who have added 8021q devices

> > as bridge ports. At the very least this needs to be hidden behind a new option,

> > but I would like to find a way to actually push it back to DSA. But again adding

> > hardware/driver-specific options should be avoided.

> > 

> > Can you use tc to pop the vlan on ingress ? I mean the cases above are visible

> > to the user, so they might decide to add the ingress vlan rule.

> > 

> > Thanks,

> >  Nik

> 

> I can, but I think that all in all it's a bit strange for the bridge to

> not untag pvid-tagged frames.

> 

> Thanks!

> -Vladimir


If vlan filtering is disabled the bridge shouldn't do any vlan processing,
that's the expected behaviour. If tc is a viable option then I'd explore that
further and avoid adding more code.
Vladimir Oltean Sept. 12, 2020, 7:50 a.m. UTC | #2
On Sat, Sep 12, 2020 at 07:37:42AM +0000, Nikolay Aleksandrov wrote:
> On Sat, 2020-09-12 at 10:23 +0300, Vladimir Oltean wrote:
> > On Sat, Sep 12, 2020 at 06:56:12AM +0000, Nikolay Aleksandrov wrote:
> > > Could you point me to a thread where these problems were discussed and why
> > > they couldn't be resolved within DSA in detail ?
> >
> > See my discussion with Florian in this thread:
> > http://patchwork.ozlabs.org/project/netdev/patch/20200907182910.1285496-5-olteanv@gmail.com/
> > There's a bunch of unrelated stuff going on there, hope you'll manage.
> >
>
> Thanks!
> I'm traveling and will be back on Sun evening, will go through the thread then.
>

Ok, take your time.
For some reason patchwork seems to have trimmed the discussion thread.
See on lore here:
https://lore.kernel.org/netdev/20200907182910.1285496-5-olteanv@gmail.com/T/#t

Thanks,
-Vladimir
Florian Fainelli Sept. 12, 2020, 7:38 p.m. UTC | #3
On 9/11/2020 11:56 PM, Nikolay Aleksandrov wrote:
> On Sat, 2020-09-12 at 02:16 +0300, Vladimir Oltean wrote:
>> Currently the bridge untags VLANs from its VLAN group in
>> __allowed_ingress() only when VLAN filtering is enabled.
>>
>> When installing a pvid in egress-tagged mode, DSA switches have a
>> problem:
>>
>> ip link add dev br0 type bridge vlan_filtering 0
>> ip link set swp0 master br0
>> bridge vlan del dev swp0 vid 1
>> bridge vlan add dev swp0 vid 1 pvid
>>
>> When adding a VLAN on a DSA switch interface, DSA configures the VLAN
>> membership of the CPU port using the same flags as swp0 (in this case
>> "pvid and not untagged"), in an attempt to copy the frame as-is from
>> ingress to the CPU.
>>
>> However, in this case, the packet may arrive untagged on ingress, it
>> will be pvid-tagged by the ingress port, and will be sent as
>> egress-tagged towards the CPU. Otherwise stated, the CPU will see a VLAN
>> tag where there was none to speak of on ingress.
>>
>> When vlan_filtering is 1, this is not a problem, as stated in the first
>> paragraph, because __allowed_ingress() will pop it. But currently, when
>> vlan_filtering is 0 and we have such a VLAN configuration, we need an
>> 8021q upper (br0.1) to be able to ping over that VLAN.
>>
>> Make the 2 cases (vlan_filtering 0 and 1) behave the same way by popping
>> the pvid, if the skb happens to be tagged with it, when vlan_filtering
>> is 0.
>>
>> There was an attempt to resolve this issue locally within the DSA
>> receive data path, but even though we can determine that we are under a
>> bridge with vlan_filtering=0, there are still some challenges:
>> - we cannot be certain that the skb will end up in the software bridge's
>>    data path, and for that reason, we may be popping the VLAN for
>>    nothing. Example: there might exist an 8021q upper with the same VLAN,
>>    or this interface might be a DSA master for another switch. In that
>>    case, the VLAN should definitely not be popped even if it is equal to
>>    the default_pvid of the bridge, because it will be consumed about the
>>    DSA layer below.
> 
> Could you point me to a thread where these problems were discussed and why
> they couldn't be resolved within DSA in detail ?
> 
>> - the bridge API only offers a race-free API for determining the pvid of
>>    a port, br_vlan_get_pvid(), under RTNL.
>>
> 
> The API can be easily extended.
> 
>> And in fact this might not even be a situation unique to DSA. Any driver
>> that receives untagged frames as pvid-tagged is now able to communicate
>> without needing an 8021q upper for the pvid.
>>
> 
> I would prefer we don't add hardware/driver-specific fixes in the bridge, when
> vlan filtering is disabled there should be no vlan manipulation/filtering done
> by the bridge. This could potentially break users who have added 8021q devices
> as bridge ports. At the very least this needs to be hidden behind a new option,
> but I would like to find a way to actually push it back to DSA. But again adding
> hardware/driver-specific options should be avoided.
> 
> Can you use tc to pop the vlan on ingress ? I mean the cases above are visible
> to the user, so they might decide to add the ingress vlan rule.

We had discussed various options with Vladimir in the threads he points 
out but this one is by far the cleanest and basically aligns the data 
path when the bridge is configured with vlan_filtering=0 or 1.

Some Ethernet switches supported via DSA either insist on or the driver 
has been written in such a way that the default_pvid of the bridge which 
is egress untagged for the user-facing ports is configured as egress 
tagged for the CPU port. That CPU port is ultimately responsible for 
bringing the Ethernet frames into the Linux host and the bridge data 
path which is how we got VLAN tagged frames into the bridge, even if a 
vlan_filtering=0 bridge is not supposed to.

We can solve this today by having a 802.1Q upper on top of the bridge 
default to pop/push the default_pvid VLAN tag, but this is not obvious, 
represents a divergence from a pure software bridge, and leads to 
support questions. What is also utterly confusing for instance is that a 
raw socket like one opened by a DHCP client will successfully allow br0 
to obtain an IP address, but the data path still is not functional, as 
you would need to use br0.1 to have a working data path taking care of 
the VLAN tag.

Vladimir had offered a DSA centric solution to this problem, which was 
not that bad after all, but this one is also my favorite.

Let us know when you are caught up on the thread and we can see how to 
solve that the best way.
Nikolay Aleksandrov Sept. 14, 2020, 7:51 a.m. UTC | #4
On Sat, 2020-09-12 at 12:38 -0700, Florian Fainelli wrote:
> 

> On 9/11/2020 11:56 PM, Nikolay Aleksandrov wrote:

> > On Sat, 2020-09-12 at 02:16 +0300, Vladimir Oltean wrote:

> > > Currently the bridge untags VLANs from its VLAN group in

> > > __allowed_ingress() only when VLAN filtering is enabled.

> > > 

> > > When installing a pvid in egress-tagged mode, DSA switches have a

> > > problem:

> > > 

> > > ip link add dev br0 type bridge vlan_filtering 0

> > > ip link set swp0 master br0

> > > bridge vlan del dev swp0 vid 1

> > > bridge vlan add dev swp0 vid 1 pvid

> > > 

> > > When adding a VLAN on a DSA switch interface, DSA configures the VLAN

> > > membership of the CPU port using the same flags as swp0 (in this case

> > > "pvid and not untagged"), in an attempt to copy the frame as-is from

> > > ingress to the CPU.

> > > 

> > > However, in this case, the packet may arrive untagged on ingress, it

> > > will be pvid-tagged by the ingress port, and will be sent as

> > > egress-tagged towards the CPU. Otherwise stated, the CPU will see a VLAN

> > > tag where there was none to speak of on ingress.

> > > 

> > > When vlan_filtering is 1, this is not a problem, as stated in the first

> > > paragraph, because __allowed_ingress() will pop it. But currently, when

> > > vlan_filtering is 0 and we have such a VLAN configuration, we need an

> > > 8021q upper (br0.1) to be able to ping over that VLAN.

> > > 

> > > Make the 2 cases (vlan_filtering 0 and 1) behave the same way by popping

> > > the pvid, if the skb happens to be tagged with it, when vlan_filtering

> > > is 0.

> > > 

> > > There was an attempt to resolve this issue locally within the DSA

> > > receive data path, but even though we can determine that we are under a

> > > bridge with vlan_filtering=0, there are still some challenges:

> > > - we cannot be certain that the skb will end up in the software bridge's

> > >    data path, and for that reason, we may be popping the VLAN for

> > >    nothing. Example: there might exist an 8021q upper with the same VLAN,

> > >    or this interface might be a DSA master for another switch. In that

> > >    case, the VLAN should definitely not be popped even if it is equal to

> > >    the default_pvid of the bridge, because it will be consumed about the

> > >    DSA layer below.

> > 

> > Could you point me to a thread where these problems were discussed and why

> > they couldn't be resolved within DSA in detail ?

> > 

> > > - the bridge API only offers a race-free API for determining the pvid of

> > >    a port, br_vlan_get_pvid(), under RTNL.

> > > 

> > 

> > The API can be easily extended.

> > 

> > > And in fact this might not even be a situation unique to DSA. Any driver

> > > that receives untagged frames as pvid-tagged is now able to communicate

> > > without needing an 8021q upper for the pvid.

> > > 

> > 

> > I would prefer we don't add hardware/driver-specific fixes in the bridge, when

> > vlan filtering is disabled there should be no vlan manipulation/filtering done

> > by the bridge. This could potentially break users who have added 8021q devices

> > as bridge ports. At the very least this needs to be hidden behind a new option,

> > but I would like to find a way to actually push it back to DSA. But again adding

> > hardware/driver-specific options should be avoided.

> > 

> > Can you use tc to pop the vlan on ingress ? I mean the cases above are visible

> > to the user, so they might decide to add the ingress vlan rule.

> 

> We had discussed various options with Vladimir in the threads he points 

> out but this one is by far the cleanest and basically aligns the data 

> path when the bridge is configured with vlan_filtering=0 or 1.

> 

> Some Ethernet switches supported via DSA either insist on or the driver 

> has been written in such a way that the default_pvid of the bridge which 

> is egress untagged for the user-facing ports is configured as egress 

> tagged for the CPU port. That CPU port is ultimately responsible for 

> bringing the Ethernet frames into the Linux host and the bridge data 

> path which is how we got VLAN tagged frames into the bridge, even if a 

> vlan_filtering=0 bridge is not supposed to.

> 


I read the thread and see your problem, but I still think that it must not be
fixed in the bridge device, more below.

> We can solve this today by having a 802.1Q upper on top of the bridge 

> default to pop/push the default_pvid VLAN tag, but this is not obvious, 

> represents a divergence from a pure software bridge, and leads to 

> support questions. What is also utterly confusing for instance is that a 

> raw socket like one opened by a DHCP client will successfully allow br0 

> to obtain an IP address, but the data path still is not functional, as 

> you would need to use br0.1 to have a working data path taking care of 

> the VLAN tag.

> 


With this patch br0.1 won't work anymore for anyone having vlan_filtering=0.

> Vladimir had offered a DSA centric solution to this problem, which was 

> not that bad after all, but this one is also my favorite.

> 


I saw it, and it looks good. I saw the one of the main issues is not having an
RCU get pvid helper. I can provide you with that to simplify the patch.

> Let us know when you are caught up on the thread and we can see how to 

> solve that the best way.


I'll start with why this patch is a non-starter, I'm guessing most of us already
have guessed, but just to have them:
 - the fix is DSA-specific, it must not reside in the bridge
 - vlan_filtering=0 means absolutely no vlan processing, that is what everyone
   expects and breaking that expectation would break various use cases

Less important, but still:
 - it is in the fast path for everyone
 - it can already be fixed by a tc action/8021q device

We can go into details but that would be a waste of time, instead I think we
should focus on Vladimir's proposed DSA change.

Vladimir, I think with the right pvid helper the patch would reduce to
dsa_untag_bridge_pvid() on the Rx path only. One thing that I'm curious about
is shouldn't dsa_untag_bridge_pvid() check if the bridge pvid is == to the skb
tag and the port's pvid?
Since we can have the port's pvid different from the bridge's. That's for the
case of vlan_filtering=1 and the port having that vlan, but not as pvid.

Thanks,
 Nik
Florian Fainelli Sept. 14, 2020, 8:11 p.m. UTC | #5
On 9/14/2020 12:51 AM, Nikolay Aleksandrov wrote:
> On Sat, 2020-09-12 at 12:38 -0700, Florian Fainelli wrote:
>>
>> On 9/11/2020 11:56 PM, Nikolay Aleksandrov wrote:
>>> On Sat, 2020-09-12 at 02:16 +0300, Vladimir Oltean wrote:
>>>> Currently the bridge untags VLANs from its VLAN group in
>>>> __allowed_ingress() only when VLAN filtering is enabled.
>>>>
>>>> When installing a pvid in egress-tagged mode, DSA switches have a
>>>> problem:
>>>>
>>>> ip link add dev br0 type bridge vlan_filtering 0
>>>> ip link set swp0 master br0
>>>> bridge vlan del dev swp0 vid 1
>>>> bridge vlan add dev swp0 vid 1 pvid
>>>>
>>>> When adding a VLAN on a DSA switch interface, DSA configures the VLAN
>>>> membership of the CPU port using the same flags as swp0 (in this case
>>>> "pvid and not untagged"), in an attempt to copy the frame as-is from
>>>> ingress to the CPU.
>>>>
>>>> However, in this case, the packet may arrive untagged on ingress, it
>>>> will be pvid-tagged by the ingress port, and will be sent as
>>>> egress-tagged towards the CPU. Otherwise stated, the CPU will see a VLAN
>>>> tag where there was none to speak of on ingress.
>>>>
>>>> When vlan_filtering is 1, this is not a problem, as stated in the first
>>>> paragraph, because __allowed_ingress() will pop it. But currently, when
>>>> vlan_filtering is 0 and we have such a VLAN configuration, we need an
>>>> 8021q upper (br0.1) to be able to ping over that VLAN.
>>>>
>>>> Make the 2 cases (vlan_filtering 0 and 1) behave the same way by popping
>>>> the pvid, if the skb happens to be tagged with it, when vlan_filtering
>>>> is 0.
>>>>
>>>> There was an attempt to resolve this issue locally within the DSA
>>>> receive data path, but even though we can determine that we are under a
>>>> bridge with vlan_filtering=0, there are still some challenges:
>>>> - we cannot be certain that the skb will end up in the software bridge's
>>>>     data path, and for that reason, we may be popping the VLAN for
>>>>     nothing. Example: there might exist an 8021q upper with the same VLAN,
>>>>     or this interface might be a DSA master for another switch. In that
>>>>     case, the VLAN should definitely not be popped even if it is equal to
>>>>     the default_pvid of the bridge, because it will be consumed about the
>>>>     DSA layer below.
>>>
>>> Could you point me to a thread where these problems were discussed and why
>>> they couldn't be resolved within DSA in detail ?
>>>
>>>> - the bridge API only offers a race-free API for determining the pvid of
>>>>     a port, br_vlan_get_pvid(), under RTNL.
>>>>
>>>
>>> The API can be easily extended.
>>>
>>>> And in fact this might not even be a situation unique to DSA. Any driver
>>>> that receives untagged frames as pvid-tagged is now able to communicate
>>>> without needing an 8021q upper for the pvid.
>>>>
>>>
>>> I would prefer we don't add hardware/driver-specific fixes in the bridge, when
>>> vlan filtering is disabled there should be no vlan manipulation/filtering done
>>> by the bridge. This could potentially break users who have added 8021q devices
>>> as bridge ports. At the very least this needs to be hidden behind a new option,
>>> but I would like to find a way to actually push it back to DSA. But again adding
>>> hardware/driver-specific options should be avoided.
>>>
>>> Can you use tc to pop the vlan on ingress ? I mean the cases above are visible
>>> to the user, so they might decide to add the ingress vlan rule.
>>
>> We had discussed various options with Vladimir in the threads he points
>> out but this one is by far the cleanest and basically aligns the data
>> path when the bridge is configured with vlan_filtering=0 or 1.
>>
>> Some Ethernet switches supported via DSA either insist on or the driver
>> has been written in such a way that the default_pvid of the bridge which
>> is egress untagged for the user-facing ports is configured as egress
>> tagged for the CPU port. That CPU port is ultimately responsible for
>> bringing the Ethernet frames into the Linux host and the bridge data
>> path which is how we got VLAN tagged frames into the bridge, even if a
>> vlan_filtering=0 bridge is not supposed to.
>>
> 
> I read the thread and see your problem, but I still think that it must not be
> fixed in the bridge device, more below.
> 
>> We can solve this today by having a 802.1Q upper on top of the bridge
>> default to pop/push the default_pvid VLAN tag, but this is not obvious,
>> represents a divergence from a pure software bridge, and leads to
>> support questions. What is also utterly confusing for instance is that a
>> raw socket like one opened by a DHCP client will successfully allow br0
>> to obtain an IP address, but the data path still is not functional, as
>> you would need to use br0.1 to have a working data path taking care of
>> the VLAN tag.
>>
> 
> With this patch br0.1 won't work anymore for anyone having vlan_filtering=0.
> 
>> Vladimir had offered a DSA centric solution to this problem, which was
>> not that bad after all, but this one is also my favorite.
>>
> 
> I saw it, and it looks good. I saw the one of the main issues is not having an
> RCU get pvid helper. I can provide you with that to simplify the patch.
> 
>> Let us know when you are caught up on the thread and we can see how to
>> solve that the best way.
> 
> I'll start with why this patch is a non-starter, I'm guessing most of us already
> have guessed, but just to have them:
>   - the fix is DSA-specific, it must not reside in the bridge

Not necessarily any switch that forces egress tagging of a given PVID 
would fall in that category.

>   - vlan_filtering=0 means absolutely no vlan processing, that is what everyone
>     expects and breaking that expectation would break various use cases

OK

> 
> Less important, but still:
>   - it is in the fast path for everyone
>   - it can already be fixed by a tc action/8021q device

Sure, but the point is that it should be fixed in a way that is 
transparent to the user, as much as possible.

> 
> We can go into details but that would be a waste of time, instead I think we
> should focus on Vladimir's proposed DSA change.
> 
> Vladimir, I think with the right pvid helper the patch would reduce to
> dsa_untag_bridge_pvid() on the Rx path only. One thing that I'm curious about
> is shouldn't dsa_untag_bridge_pvid() check if the bridge pvid is == to the skb
> tag and the port's pvid?
> Since we can have the port's pvid different from the bridge's. That's for the
> case of vlan_filtering=1 and the port having that vlan, but not as pvid.
> 
> Thanks,
>   Nik
>
Florian Fainelli Sept. 21, 2020, 7:44 p.m. UTC | #6
On 9/14/20 1:11 PM, Florian Fainelli wrote:
>> Less important, but still:
>>   - it is in the fast path for everyone
>>   - it can already be fixed by a tc action/8021q device
> 
> Sure, but the point is that it should be fixed in a way that is
> transparent to the user, as much as possible.
> 
>>
>> We can go into details but that would be a waste of time, instead I
>> think we
>> should focus on Vladimir's proposed DSA change.
>>
>> Vladimir, I think with the right pvid helper the patch would reduce to
>> dsa_untag_bridge_pvid() on the Rx path only. One thing that I'm
>> curious about
>> is shouldn't dsa_untag_bridge_pvid() check if the bridge pvid is == to
>> the skb
>> tag and the port's pvid?
>> Since we can have the port's pvid different from the bridge's. That's
>> for the
>> case of vlan_filtering=1 and the port having that vlan, but not as pvid.

Vladimir, let me know if you have a patch for DSA and I can give it a
try quickly. Thanks!
Vladimir Oltean Sept. 21, 2020, 7:56 p.m. UTC | #7
On Mon, Sep 21, 2020 at 12:44:43PM -0700, Florian Fainelli wrote:
> Vladimir, let me know if you have a patch for DSA and I can give it a
> try quickly. Thanks!

Let me clean it up a little and send it, I need to export a wrapper over
br_get_pvid() for external use, called under rcu_read_lock().
Vladimir Oltean Sept. 21, 2020, 10:12 p.m. UTC | #8
On Mon, Sep 21, 2020 at 10:56:07PM +0300, Vladimir Oltean wrote:
> On Mon, Sep 21, 2020 at 12:44:43PM -0700, Florian Fainelli wrote:

> > Vladimir, let me know if you have a patch for DSA and I can give it a

> > try quickly. Thanks!

> 

> Let me clean it up a little and send it, I need to export a wrapper over

> br_get_pvid() for external use, called under rcu_read_lock().


Here it is as an attached patch, sorry that it can't be simpler than
that. Please call the function from where you need it, and then submit
it yourself to net-next.

Also, you'll notice a lockdep warning when you test it. That's what the
br_vlan_get_pvid_rcu() fix I've just sent is for. Make sure you also
take that.

Thanks,
-Vladimir
From 664bad74cb8e598280e20940092a190026459c5a Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>

Date: Mon, 21 Sep 2020 23:06:44 +0300
Subject: [PATCH] net: dsa: untag the bridge pvid from rx skbs

Currently the bridge untags VLANs present in its VLAN groups in
__allowed_ingress() only when VLAN filtering is enabled.

But when a skb is seen on the RX path as tagged with the bridge's pvid,
and that bridge has vlan_filtering=0, and there isn't any 8021q upper
with that VLAN either, then we have a problem. The bridge will not untag
it (since it is supposed to remain VLAN-unaware), and pvid-tagged
communication will be broken.

There are 2 situations where we can end up like that:

1. When installing a pvid in egress-tagged mode, like this:

ip link add dev br0 type bridge vlan_filtering 0
ip link set swp0 master br0
bridge vlan del dev swp0 vid 1
bridge vlan add dev swp0 vid 1 pvid

This happens because DSA configures the VLAN membership of the CPU port
using the same flags as swp0 (in this case "pvid and not untagged"), in
an attempt to copy the frame as-is from ingress to the CPU.

However, in this case, the packet may arrive untagged on ingress, it
will be pvid-tagged by the ingress port, and will be sent as
egress-tagged towards the CPU. Otherwise stated, the CPU will see a VLAN
tag where there was none to speak of on ingress.

When vlan_filtering is 1, this is not a problem, as stated in the first
paragraph, because __allowed_ingress() will pop it. But currently, when
vlan_filtering is 0 and we have such a VLAN configuration, we need an
8021q upper (br0.1) to be able to ping over that VLAN, which is not
symmetrical with the vlan_filtering=1 case, and therefore, confusing for
users.

Basically what DSA attempts to do is simply an approximation: try to
copy the skb with (or without) the same VLAN all the way up to the CPU.
But DSA drivers treat CPU port VLAN membership in various ways (which is
a good segue into situation 2). And some of those drivers simply tell
the CPU port to copy the frame unmodified, which is the golden standard
when it comes to VLAN processing (therefore, any driver which can
configure the hardware to do that, should do that, and discard the VLAN
flags requested by DSA on the CPU port).

2. Some DSA drivers always configure the CPU port as egress-tagged, in
an attempt to recover the classified VLAN from the skb. These drivers
cannot work at all with untagged traffic when bridged in
vlan_filtering=0 mode. And they can't go for the easy "just keep the
pvid as egress-untagged towards the CPU" route, because each front port
can have its own pvid, and that might require conflicting VLAN
membership settings on the CPU port (swp1 is pvid for VID 1 and
egress-tagged for VID 2; swp2 is egress-taggeed for VID 1 and pvid for
VID 2; with this simplistic approach, the CPU port, which is really a
separate hardware entity and has its own VLAN membership settings, would
end up being egress-untagged in both VID 1 and VID 2, therefore losing
the VLAN tags of ingress traffic).

So the only thing we can do is to create a helper function for resolving
the problematic case (that is, a function which untags the bridge pvid
when that is in vlan_filtering=0 mode), which taggers in need should
call. It isn't called from the generic DSA receive path because there
are drivers that fall neither in the first nor second category.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

---
 net/dsa/dsa_priv.h | 66 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 2da656d984ef..0348dbab4131 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -7,6 +7,7 @@
 #ifndef __DSA_PRIV_H
 #define __DSA_PRIV_H
 
+#include <linux/if_bridge.h>
 #include <linux/phy.h>
 #include <linux/netdevice.h>
 #include <linux/netpoll.h>
@@ -194,6 +195,71 @@ dsa_slave_to_master(const struct net_device *dev)
 	return dp->cpu_dp->master;
 }
 
+/* If under a bridge with vlan_filtering=0, make sure to send pvid-tagged
+ * frames as untagged, since the bridge will not untag them.
+ */
+static inline struct sk_buff *dsa_untag_bridge_pvid(struct sk_buff *skb)
+{
+	struct dsa_port *dp = dsa_slave_to_port(skb->dev);
+	struct vlan_ethhdr *hdr = vlan_eth_hdr(skb);
+	struct net_device *br = dp->bridge_dev;
+	struct net_device *dev = skb->dev;
+	struct net_device *upper_dev;
+	struct list_head *iter;
+	u16 vid, pvid, proto;
+	int err;
+
+	if (!br || br_vlan_enabled(br))
+		return skb;
+
+	err = br_vlan_get_proto(br, &proto);
+	if (err)
+		return skb;
+
+	/* Move VLAN tag from data to hwaccel */
+	if (!skb_vlan_tag_present(skb) && hdr->h_vlan_proto == htons(proto)) {
+		skb = skb_vlan_untag(skb);
+		if (!skb)
+			return NULL;
+	}
+
+	if (!skb_vlan_tag_present(skb))
+		return skb;
+
+	vid = skb_vlan_tag_get_id(skb);
+
+	/* We already run under an RCU read-side critical section since
+	 * we are called from netif_receive_skb_list_internal().
+	 */
+	err = br_vlan_get_pvid_rcu(dev, &pvid);
+	if (err)
+		return skb;
+
+	if (vid != pvid)
+		return skb;
+
+	/* The sad part about attempting to untag from DSA is that we
+	 * don't know, unless we check, if the skb will end up in
+	 * the bridge's data path - br_allowed_ingress() - or not.
+	 * For example, there might be an 8021q upper for the
+	 * default_pvid of the bridge, which will steal VLAN-tagged traffic
+	 * from the bridge's data path. This is a configuration that DSA
+	 * supports because vlan_filtering is 0. In that case, we should
+	 * definitely keep the tag, to make sure it keeps working.
+	 */
+	netdev_for_each_upper_dev_rcu(dev, upper_dev, iter) {
+		if (!is_vlan_dev(upper_dev))
+			continue;
+
+		if (vid == vlan_dev_vlan_id(upper_dev))
+			return skb;
+	}
+
+	__vlan_hwaccel_clear_tag(skb);
+
+	return skb;
+}
+
 /* switch.c */
 int dsa_switch_register_notifier(struct dsa_switch *ds);
 void dsa_switch_unregister_notifier(struct dsa_switch *ds);
-- 
2.25.1
diff mbox series

Patch

diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index d2b8737f9fc0..ecfdb9cd3183 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -580,7 +580,23 @@  bool br_allowed_ingress(const struct net_bridge *br,
 	 * permitted.
 	 */
 	if (!br_opt_get(br, BROPT_VLAN_ENABLED)) {
+		u16 v;
+
 		BR_INPUT_SKB_CB(skb)->vlan_filtered = false;
+
+		/* See comment in __allowed_ingress about how skb can end up
+		 * here not having a hwaccel tag
+		 */
+		if (unlikely(!skb_vlan_tag_present(skb) &&
+			     skb->protocol == br->vlan_proto)) {
+			skb = skb_vlan_untag(skb);
+			if (unlikely(!skb))
+				return false;
+		}
+
+		if (!br_vlan_get_tag(skb, &v) && v == br_get_pvid(vg))
+			__vlan_hwaccel_clear_tag(skb);
+
 		return true;
 	}