diff mbox series

[RFC,v2,net-next,4/4] net/cpsw_switchdev: add switchdev mode of operation on cpsw driver

Message ID 1528974690-31600-5-git-send-email-ilias.apalodimas@linaro.org
State New
Headers show
Series Add switchdev on TI-CPSW | expand

Commit Message

Ilias Apalodimas June 14, 2018, 11:11 a.m. UTC
This patch enables switchdev funtionality on the driver based on a
.config option(CONFIG_TI_CPSW_SWITCHDEV). CPSW driver used a DTS option
called dual_emac to enable switch or dual emac mode. The new config option
will override this configuration.

It creates 2 ports, eth0 and eth1(that can be renamed to sw0p1 and sw0p2
via udev rules).
sw0p1 and sw0p2 are the netdev interfaces connected to PHY devices.
This hardware also has a CPU port which is configured invidividually in
the case of VLANs.

On device init all netdevices (including the CPU port) will operate on
VLAN 0. sw0p1 and sw0p2 will operate as normal netdev interfaces.
Once they are added in a bridge the default bridge vlan will not be added
to the CPU port. In order to get an ip address on br0 you'll need to add
the CPU port on that vlan by issuing:
bridge vlan add dev br0 vid <vid> pvid untagged self

Multicast traffic:
setting IFF_MULTICAST on and off will affect registered multicast on that
port(if enabled port will be added on registered multicast traffic mask).
This muct occur before adding VLANs on the interfaces. If you change the
flag after the VLAN configuration you need to re-issue the VLAN config
commands.

MDBs/FDBs:
If the CPU port is member of the appropriate VLANs then switchdev API
will add FDB/MDB entries uppon detection. If the CPU port is not a member
the user can manually specify the entries.

ALE_P0_UNI_FLOOD will be enabled when the first interface joins the bridge
and will be disabled once the last interface leaves the bridge

Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>

---
 drivers/net/ethernet/ti/Kconfig          |   9 +
 drivers/net/ethernet/ti/Makefile         |   1 +
 drivers/net/ethernet/ti/cpsw.c           | 306 +++++++++++++++++++++-
 drivers/net/ethernet/ti/cpsw_priv.h      |   2 +
 drivers/net/ethernet/ti/cpsw_switchdev.c | 418 +++++++++++++++++++++++++++++++
 drivers/net/ethernet/ti/cpsw_switchdev.h |   4 +
 6 files changed, 731 insertions(+), 9 deletions(-)
 create mode 100644 drivers/net/ethernet/ti/cpsw_switchdev.c
 create mode 100644 drivers/net/ethernet/ti/cpsw_switchdev.h

-- 
2.7.4

Comments

Jiri Pirko June 14, 2018, 11:23 a.m. UTC | #1
Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:
>This patch enables switchdev funtionality on the driver based on a

>.config option(CONFIG_TI_CPSW_SWITCHDEV). CPSW driver used a DTS option

>called dual_emac to enable switch or dual emac mode. The new config option

>will override this configuration.

>

>It creates 2 ports, eth0 and eth1(that can be renamed to sw0p1 and sw0p2

>via udev rules).


Actually the current udev should rename the netdevs by default,
according the phys_port_name. Could you check?
Jiri Pirko June 14, 2018, 11:30 a.m. UTC | #2
Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

[...]

>@@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

> 	if (of_property_read_bool(node, "dual_emac"))

> 		data->switch_mode = CPSW_DUAL_EMAC;

> 

>+	/* switchdev overrides DTS */

>+	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

>+		data->switch_mode = CPSW_SWITCHDEV;


So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is
enabled. That does not sound right. I think that user should tell what
mode does he want regardless what the kernel config is.
Ilias Apalodimas June 14, 2018, 11:32 a.m. UTC | #3
On Thu, Jun 14, 2018 at 01:23:49PM +0200, Jiri Pirko wrote:
> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

> >This patch enables switchdev funtionality on the driver based on a

> >.config option(CONFIG_TI_CPSW_SWITCHDEV). CPSW driver used a DTS option

> >called dual_emac to enable switch or dual emac mode. The new config option

> >will override this configuration.

> >

> >It creates 2 ports, eth0 and eth1(that can be renamed to sw0p1 and sw0p2

> >via udev rules).

> 

> Actually the current udev should rename the netdevs by default,

> according the phys_port_name. Could you check?

I will if i manage to compile udev properly for this board
Ilias Apalodimas June 14, 2018, 11:34 a.m. UTC | #4
On Thu, Jun 14, 2018 at 01:30:28PM +0200, Jiri Pirko wrote:
> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

> 

> [...]

> 

> >@@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

> > 	if (of_property_read_bool(node, "dual_emac"))

> > 		data->switch_mode = CPSW_DUAL_EMAC;

> > 

> >+	/* switchdev overrides DTS */

> >+	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

> >+		data->switch_mode = CPSW_SWITCHDEV;

> 

> So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is

> enabled. That does not sound right. I think that user should tell what

> mode does he want regardless what the kernel config is.

We discussed this during the V1 of the RFC. Yes it doesn't seem good, but the
device currently configures the modes using DTS (which is not correct). I choose
the .config due to that. I can't think of anything better, but i am open to
suggestions
Jiri Pirko June 14, 2018, 11:39 a.m. UTC | #5
Thu, Jun 14, 2018 at 01:34:04PM CEST, ilias.apalodimas@linaro.org wrote:
>On Thu, Jun 14, 2018 at 01:30:28PM +0200, Jiri Pirko wrote:

>> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

>> 

>> [...]

>> 

>> >@@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

>> > 	if (of_property_read_bool(node, "dual_emac"))

>> > 		data->switch_mode = CPSW_DUAL_EMAC;

>> > 

>> >+	/* switchdev overrides DTS */

>> >+	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

>> >+		data->switch_mode = CPSW_SWITCHDEV;

>> 

>> So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is

>> enabled. That does not sound right. I think that user should tell what

>> mode does he want regardless what the kernel config is.

>We discussed this during the V1 of the RFC. Yes it doesn't seem good, but the

>device currently configures the modes using DTS (which is not correct). I choose

>the .config due to that. I can't think of anything better, but i am open to

>suggestions


Agreed that DTS does fit as well. I think that this might be a job for
devlink parameters (patchset is going to be sent upstream next week).
You do have 1 bus address for the whole device (both ports), right?
Ilias Apalodimas June 14, 2018, 11:43 a.m. UTC | #6
On Thu, Jun 14, 2018 at 01:39:58PM +0200, Jiri Pirko wrote:
> Thu, Jun 14, 2018 at 01:34:04PM CEST, ilias.apalodimas@linaro.org wrote:

> >On Thu, Jun 14, 2018 at 01:30:28PM +0200, Jiri Pirko wrote:

> >> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

> >> 

> >> [...]

> >> 

> >> >@@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

> >> > 	if (of_property_read_bool(node, "dual_emac"))

> >> > 		data->switch_mode = CPSW_DUAL_EMAC;

> >> > 

> >> >+	/* switchdev overrides DTS */

> >> >+	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

> >> >+		data->switch_mode = CPSW_SWITCHDEV;

> >> 

> >> So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is

> >> enabled. That does not sound right. I think that user should tell what

> >> mode does he want regardless what the kernel config is.

> >We discussed this during the V1 of the RFC. Yes it doesn't seem good, but the

> >device currently configures the modes using DTS (which is not correct). I choose

> >the .config due to that. I can't think of anything better, but i am open to

> >suggestions

> 

> Agreed that DTS does fit as well. I think that this might be a job for

> devlink parameters (patchset is going to be sent upstream next week).

> You do have 1 bus address for the whole device (both ports), right?

> 

Yes devlink sounds reasonable. I thyink there's only one bus for it, but then
again i am far from an expert on the hardware interrnals. Grygorii can correct
me if i am wrong.

Thanks for taking time to read this,
Ilias
Andrew Lunn June 18, 2018, 4:16 p.m. UTC | #7
> @@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

>  	if (of_property_read_bool(node, "dual_emac"))

>  		data->switch_mode = CPSW_DUAL_EMAC;

>  

> +	/* switchdev overrides DTS */

> +	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

> +		data->switch_mode = CPSW_SWITCHDEV;

> +


I know you discussed this a bit with Jiri, but i still think if
'dual_mac" is found, you should do dual mac. The DT clearly requests
dual mac, and doing anything else is going to cause confusion.

The device tree binding is ambiguous what should happen when dual-mac
is missing. So i would only enable swithdev mode in that case.

But ideally, it should be a new driver with a new binding.

    Andrew
Ilias Apalodimas June 18, 2018, 8:19 p.m. UTC | #8
On Mon, Jun 18, 2018 at 06:16:27PM +0200, Andrew Lunn wrote:
> > @@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

> >  	if (of_property_read_bool(node, "dual_emac"))

> >  		data->switch_mode = CPSW_DUAL_EMAC;

> >  

> > +	/* switchdev overrides DTS */

> > +	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

> > +		data->switch_mode = CPSW_SWITCHDEV;

> > +

> 

> I know you discussed this a bit with Jiri, but i still think if

> 'dual_mac" is found, you should do dual mac. The DT clearly requests

> dual mac, and doing anything else is going to cause confusion.

> 

> The device tree binding is ambiguous what should happen when dual-mac

> is missing. So i would only enable swithdev mode in that case.

At the moment if no 'dual_emac;' is found on DTS the driver operates in "switch
mode". It only registers 1 ethernet interface with no ability (unless you patch
the kernel) to configure the switch. If we use DTS instead of a .config option
we should add parsing for something like 'switchdev;' in the DTS.
Jiri proposed using devlink, which makes sense, but i am not sure it's
applicable on this patchset. This will change the driver completely and will
totally break backwards compatibility.

Ideally i'd prefer something like:
1. Add a DTS option and continue the current behavior. I agree with you that
this will cause less confusion (in fact i prefer it for the current state of the
driver compared to the .config).
or
2. Keep the .config option which is better suited over DTS but might cause some
confusion.
> 

> But ideally, it should be a new driver with a new binding.

TI is better suited to comment on this. The work proposed here is mostly to
accomodate future TSN related configuration for switches. This patchset has
been tested against Ivan's patchset for CBS and is working as expected 
configuration wise.
(http://lkml.iu.edu/hypermail/linux/kernel/1806.1/05302.html)

Thanks
Ilias
Grygorii Strashko June 18, 2018, 11:19 p.m. UTC | #9
On 06/14/2018 06:43 AM, Ilias Apalodimas wrote:
> On Thu, Jun 14, 2018 at 01:39:58PM +0200, Jiri Pirko wrote:

>> Thu, Jun 14, 2018 at 01:34:04PM CEST, ilias.apalodimas@linaro.org wrote:

>>> On Thu, Jun 14, 2018 at 01:30:28PM +0200, Jiri Pirko wrote:

>>>> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

>>>>

>>>> [...]

>>>>

>>>>> @@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

>>>>> 	if (of_property_read_bool(node, "dual_emac"))

>>>>> 		data->switch_mode = CPSW_DUAL_EMAC;

>>>>>

>>>>> +	/* switchdev overrides DTS */

>>>>> +	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

>>>>> +		data->switch_mode = CPSW_SWITCHDEV;

>>>>

>>>> So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is

>>>> enabled. That does not sound right. I think that user should tell what

>>>> mode does he want regardless what the kernel config is.

>>> We discussed this during the V1 of the RFC. Yes it doesn't seem good, but the

>>> device currently configures the modes using DTS (which is not correct). I choose

>>> the .config due to that. I can't think of anything better, but i am open to

>>> suggestions

>>

>> Agreed that DTS does fit as well. I think that this might be a job for

>> devlink parameters (patchset is going to be sent upstream next week).

>> You do have 1 bus address for the whole device (both ports), right?

>>

> Yes devlink sounds reasonable. I thyink there's only one bus for it, but then

> again i am far from an expert on the hardware interrnals. Grygorii can correct

> me if i am wrong.


Devlink and NFS boot are not compatible as per my understanding, so .. 

Again, current driver, as is, supports NFS boot in all modes
(how good is the cur driver is question which out of scope of this discussion).

And we discussed this in RFC v1 - driver mode selection will be changed 
if we will proceed and it will be new driver for proper switch support.

Not sure about about Devlink (need to study it and we never got any requests from end
users for this as I know), and I'd like to note (again) that this is embedded 
(industrial/automotive etc), so everything preferred to be simple, fast and
preferably configured statically (in most of the cases end users what boot time 
configuration).
 
-- 
regards,
-grygorii
Grygorii Strashko June 18, 2018, 11:20 p.m. UTC | #10
On 06/18/2018 03:19 PM, Ilias Apalodimas wrote:
> On Mon, Jun 18, 2018 at 06:16:27PM +0200, Andrew Lunn wrote:

>>> @@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

>>>   	if (of_property_read_bool(node, "dual_emac"))

>>>   		data->switch_mode = CPSW_DUAL_EMAC;

>>>   

>>> +	/* switchdev overrides DTS */

>>> +	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

>>> +		data->switch_mode = CPSW_SWITCHDEV;

>>> +

>>

>> I know you discussed this a bit with Jiri, but i still think if

>> 'dual_mac" is found, you should do dual mac. The DT clearly requests

>> dual mac, and doing anything else is going to cause confusion.

>>

>> The device tree binding is ambiguous what should happen when dual-mac

>> is missing. So i would only enable swithdev mode in that case.

> At the moment if no 'dual_emac;' is found on DTS the driver operates in "switch

> mode". It only registers 1 ethernet interface with no ability (unless you patch

> the kernel) to configure the switch. If we use DTS instead of a .config option

> we should add parsing for something like 'switchdev;' in the DTS.

> Jiri proposed using devlink, which makes sense, but i am not sure it's

> applicable on this patchset. This will change the driver completely and will

> totally break backwards compatibility.

> 

> Ideally i'd prefer something like:

> 1. Add a DTS option and continue the current behavior. I agree with you that

> this will cause less confusion (in fact i prefer it for the current state of the

> driver compared to the .config).

> or

> 2. Keep the .config option which is better suited over DTS but might cause some

> confusion.

>>

>> But ideally, it should be a new driver with a new binding.

> TI is better suited to comment on this. The work proposed here is mostly to

> accomodate future TSN related configuration for switches. This patchset has

> been tested against Ivan's patchset for CBS and is working as expected

> configuration wise.

> (http://lkml.iu.edu/hypermail/linux/kernel/1806.1/05302.html)


We'd try the new driver in the future

-- 
regards,
-grygorii
Jiri Pirko June 20, 2018, 7:08 a.m. UTC | #11
Tue, Jun 19, 2018 at 01:19:00AM CEST, grygorii.strashko@ti.com wrote:
>

>

>On 06/14/2018 06:43 AM, Ilias Apalodimas wrote:

>> On Thu, Jun 14, 2018 at 01:39:58PM +0200, Jiri Pirko wrote:

>>> Thu, Jun 14, 2018 at 01:34:04PM CEST, ilias.apalodimas@linaro.org wrote:

>>>> On Thu, Jun 14, 2018 at 01:30:28PM +0200, Jiri Pirko wrote:

>>>>> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

>>>>>

>>>>> [...]

>>>>>

>>>>>> @@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

>>>>>> 	if (of_property_read_bool(node, "dual_emac"))

>>>>>> 		data->switch_mode = CPSW_DUAL_EMAC;

>>>>>>

>>>>>> +	/* switchdev overrides DTS */

>>>>>> +	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

>>>>>> +		data->switch_mode = CPSW_SWITCHDEV;

>>>>>

>>>>> So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is

>>>>> enabled. That does not sound right. I think that user should tell what

>>>>> mode does he want regardless what the kernel config is.

>>>> We discussed this during the V1 of the RFC. Yes it doesn't seem good, but the

>>>> device currently configures the modes using DTS (which is not correct). I choose

>>>> the .config due to that. I can't think of anything better, but i am open to

>>>> suggestions

>>>

>>> Agreed that DTS does fit as well. I think that this might be a job for

>>> devlink parameters (patchset is going to be sent upstream next week).

>>> You do have 1 bus address for the whole device (both ports), right?

>>>

>> Yes devlink sounds reasonable. I thyink there's only one bus for it, but then

>> again i am far from an expert on the hardware interrnals. Grygorii can correct

>> me if i am wrong.

>

>Devlink and NFS boot are not compatible as per my understanding, so .. 


? Why do you say so?


>

>Again, current driver, as is, supports NFS boot in all modes

>(how good is the cur driver is question which out of scope of this discussion).

>

>And we discussed this in RFC v1 - driver mode selection will be changed 

>if we will proceed and it will be new driver for proper switch support.

>

>Not sure about about Devlink (need to study it and we never got any requests from end

>users for this as I know), and I'd like to note (again) that this is embedded 

>(industrial/automotive etc), so everything preferred to be simple, fast and

>preferably configured statically (in most of the cases end users what boot time 

>configuration).


You need to study it indeed.

> 

>-- 

>regards,

>-grygorii
Ivan Vecera June 20, 2018, 12:53 p.m. UTC | #12
On 20.6.2018 09:08, Jiri Pirko wrote:
> Tue, Jun 19, 2018 at 01:19:00AM CEST, grygorii.strashko@ti.com wrote:

>>

>>

>> On 06/14/2018 06:43 AM, Ilias Apalodimas wrote:

>>> On Thu, Jun 14, 2018 at 01:39:58PM +0200, Jiri Pirko wrote:

>>>> Thu, Jun 14, 2018 at 01:34:04PM CEST, ilias.apalodimas@linaro.org wrote:

>>>>> On Thu, Jun 14, 2018 at 01:30:28PM +0200, Jiri Pirko wrote:

>>>>>> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

>>>>>>

>>>>>> [...]

>>>>>>

>>>>>>> @@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

>>>>>>> 	if (of_property_read_bool(node, "dual_emac"))

>>>>>>> 		data->switch_mode = CPSW_DUAL_EMAC;

>>>>>>>

>>>>>>> +	/* switchdev overrides DTS */

>>>>>>> +	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

>>>>>>> +		data->switch_mode = CPSW_SWITCHDEV;

>>>>>>

>>>>>> So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is

>>>>>> enabled. That does not sound right. I think that user should tell what

>>>>>> mode does he want regardless what the kernel config is.

>>>>> We discussed this during the V1 of the RFC. Yes it doesn't seem good, but the

>>>>> device currently configures the modes using DTS (which is not correct). I choose

>>>>> the .config due to that. I can't think of anything better, but i am open to

>>>>> suggestions

>>>>

>>>> Agreed that DTS does fit as well. I think that this might be a job for

>>>> devlink parameters (patchset is going to be sent upstream next week).

>>>> You do have 1 bus address for the whole device (both ports), right?

>>>>

>>> Yes devlink sounds reasonable. I thyink there's only one bus for it, but then

>>> again i am far from an expert on the hardware interrnals. Grygorii can correct

>>> me if i am wrong.

>>

>> Devlink and NFS boot are not compatible as per my understanding, so .. 

> 

> ? Why do you say so?


Why aren't they compatible?? You can have devlink binary in initramfs and
configure the ASIC and ports via devlink batch file - prior mounting NFS root.

>>

>> Again, current driver, as is, supports NFS boot in all modes

>> (how good is the cur driver is question which out of scope of this discussion).

>>

>> And we discussed this in RFC v1 - driver mode selection will be changed 

>> if we will proceed and it will be new driver for proper switch support.

>>

>> Not sure about about Devlink (need to study it and we never got any requests from end

>> users for this as I know), and I'd like to note (again) that this is embedded 

>> (industrial/automotive etc), so everything preferred to be simple, fast and

>> preferably configured statically (in most of the cases end users what boot time 

>> configuration).

> 

> You need to study it indeed.

> 

>>

>> -- 

>> regards,

>> -grygorii
Ivan Vecera June 20, 2018, 12:56 p.m. UTC | #13
On 18.6.2018 22:19, Ilias Apalodimas wrote:
> Jiri proposed using devlink, which makes sense, but i am not sure it's

> applicable on this patchset. This will change the driver completely and will

> totally break backwards compatibility.


Another good reason for a new driver.

I.
Ilias Apalodimas June 20, 2018, 12:59 p.m. UTC | #14
On Wed, Jun 20, 2018 at 02:53:47PM +0200, Ivan Vecera wrote:
> On 20.6.2018 09:08, Jiri Pirko wrote:

> > Tue, Jun 19, 2018 at 01:19:00AM CEST, grygorii.strashko@ti.com wrote:

> >>

> >>

> >> On 06/14/2018 06:43 AM, Ilias Apalodimas wrote:

> >>> On Thu, Jun 14, 2018 at 01:39:58PM +0200, Jiri Pirko wrote:

> >>>> Thu, Jun 14, 2018 at 01:34:04PM CEST, ilias.apalodimas@linaro.org wrote:

> >>>>> On Thu, Jun 14, 2018 at 01:30:28PM +0200, Jiri Pirko wrote:

> >>>>>> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

> >>>>>>

> >>>>>> [...]

> >>>>>>

> >>>>>>> @@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

> >>>>>>> 	if (of_property_read_bool(node, "dual_emac"))

> >>>>>>> 		data->switch_mode = CPSW_DUAL_EMAC;

> >>>>>>>

> >>>>>>> +	/* switchdev overrides DTS */

> >>>>>>> +	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

> >>>>>>> +		data->switch_mode = CPSW_SWITCHDEV;

> >>>>>>

> >>>>>> So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is

> >>>>>> enabled. That does not sound right. I think that user should tell what

> >>>>>> mode does he want regardless what the kernel config is.

> >>>>> We discussed this during the V1 of the RFC. Yes it doesn't seem good, but the

> >>>>> device currently configures the modes using DTS (which is not correct). I choose

> >>>>> the .config due to that. I can't think of anything better, but i am open to

> >>>>> suggestions

> >>>>

> >>>> Agreed that DTS does fit as well. I think that this might be a job for

> >>>> devlink parameters (patchset is going to be sent upstream next week).

> >>>> You do have 1 bus address for the whole device (both ports), right?

> >>>>

> >>> Yes devlink sounds reasonable. I thyink there's only one bus for it, but then

> >>> again i am far from an expert on the hardware interrnals. Grygorii can correct

> >>> me if i am wrong.

> >>

> >> Devlink and NFS boot are not compatible as per my understanding, so .. 

> > 

> > ? Why do you say so?

> 

> Why aren't they compatible?? You can have devlink binary in initramfs and

> configure the ASIC and ports via devlink batch file - prior mounting NFS root.

This is doable but, are we taking into consideration device reconfiguration 
(which was the reason for creating the minimal filesystem) or are we just 
talking about device booting/initial config?
> 

> >>

> >> Again, current driver, as is, supports NFS boot in all modes

> >> (how good is the cur driver is question which out of scope of this discussion).

> >>

> >> And we discussed this in RFC v1 - driver mode selection will be changed 

> >> if we will proceed and it will be new driver for proper switch support.

> >>

> >> Not sure about about Devlink (need to study it and we never got any requests from end

> >> users for this as I know), and I'd like to note (again) that this is embedded 

> >> (industrial/automotive etc), so everything preferred to be simple, fast and

> >> preferably configured statically (in most of the cases end users what boot time 

> >> configuration).

> > 

> > You need to study it indeed.

> > 

> >>

> >> -- 

> >> regards,

> >> -grygorii

> 


Regards
Ilias
Ivan Vecera June 20, 2018, 1:54 p.m. UTC | #15
On 20.6.2018 14:59, Ilias Apalodimas wrote:
> On Wed, Jun 20, 2018 at 02:53:47PM +0200, Ivan Vecera wrote:

>> On 20.6.2018 09:08, Jiri Pirko wrote:

>>> Tue, Jun 19, 2018 at 01:19:00AM CEST, grygorii.strashko@ti.com wrote:

>>>>

>>>>

>>>> On 06/14/2018 06:43 AM, Ilias Apalodimas wrote:

>>>>> On Thu, Jun 14, 2018 at 01:39:58PM +0200, Jiri Pirko wrote:

>>>>>> Thu, Jun 14, 2018 at 01:34:04PM CEST, ilias.apalodimas@linaro.org wrote:

>>>>>>> On Thu, Jun 14, 2018 at 01:30:28PM +0200, Jiri Pirko wrote:

>>>>>>>> Thu, Jun 14, 2018 at 01:11:30PM CEST, ilias.apalodimas@linaro.org wrote:

>>>>>>>>

>>>>>>>> [...]

>>>>>>>>

>>>>>>>>> @@ -2711,6 +2789,10 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,

>>>>>>>>> 	if (of_property_read_bool(node, "dual_emac"))

>>>>>>>>> 		data->switch_mode = CPSW_DUAL_EMAC;

>>>>>>>>>

>>>>>>>>> +	/* switchdev overrides DTS */

>>>>>>>>> +	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))

>>>>>>>>> +		data->switch_mode = CPSW_SWITCHDEV;

>>>>>>>>

>>>>>>>> So you force CPSW_SWITCHDEV mode if the CONFIG_TI_CPSW_SWITCHDEV is

>>>>>>>> enabled. That does not sound right. I think that user should tell what

>>>>>>>> mode does he want regardless what the kernel config is.

>>>>>>> We discussed this during the V1 of the RFC. Yes it doesn't seem good, but the

>>>>>>> device currently configures the modes using DTS (which is not correct). I choose

>>>>>>> the .config due to that. I can't think of anything better, but i am open to

>>>>>>> suggestions

>>>>>>

>>>>>> Agreed that DTS does fit as well. I think that this might be a job for

>>>>>> devlink parameters (patchset is going to be sent upstream next week).

>>>>>> You do have 1 bus address for the whole device (both ports), right?

>>>>>>

>>>>> Yes devlink sounds reasonable. I thyink there's only one bus for it, but then

>>>>> again i am far from an expert on the hardware interrnals. Grygorii can correct

>>>>> me if i am wrong.

>>>>

>>>> Devlink and NFS boot are not compatible as per my understanding, so .. 

>>>

>>> ? Why do you say so?

>>

>> Why aren't they compatible?? You can have devlink binary in initramfs and

>> configure the ASIC and ports via devlink batch file - prior mounting NFS root.

> This is doable but, are we taking into consideration device reconfiguration 

> (which was the reason for creating the minimal filesystem) or are we just 

> talking about device booting/initial config?

Devlink calls should be placed in setup.sh

I.
Ilias Apalodimas June 20, 2018, 5:51 p.m. UTC | #16
Hello Ivan,
On Wed, Jun 20, 2018 at 02:56:48PM +0200, Ivan Vecera wrote:
> On 18.6.2018 22:19, Ilias Apalodimas wrote:

> > Jiri proposed using devlink, which makes sense, but i am not sure it's

> > applicable on this patchset. This will change the driver completely and will

> > totally break backwards compatibility.

> 

> Another good reason for a new driver.

> 

> I.

This is actually conflicting at least to my understanding. Jiri proposed using 
devlink was used as an alternative method to enable a new mode instead of 
adding it on a .config option. A new driver wouldn't have a need for that right?

Thanks
Ilias
Andrew Lunn June 20, 2018, 5:57 p.m. UTC | #17
> > Another good reason for a new driver.

> > 

> > I.

> This is actually conflicting at least to my understanding. Jiri proposed using 

> devlink was used as an alternative method to enable a new mode instead of 

> adding it on a .config option. A new driver wouldn't have a need for that right?


A new driver would only do switchdev. So correct, there would not be
any configuration need. It would also give you a chance to have a new
device tree binding, if that helps at all.

    Andrew
Florian Fainelli June 20, 2018, 5:58 p.m. UTC | #18
On 06/20/2018 10:51 AM, Ilias Apalodimas wrote:
> Hello Ivan,

> On Wed, Jun 20, 2018 at 02:56:48PM +0200, Ivan Vecera wrote:

>> On 18.6.2018 22:19, Ilias Apalodimas wrote:

>>> Jiri proposed using devlink, which makes sense, but i am not sure it's

>>> applicable on this patchset. This will change the driver completely and will

>>> totally break backwards compatibility.

>>

>> Another good reason for a new driver.

>>

>> I.

> This is actually conflicting at least to my understanding. Jiri proposed using 

> devlink was used as an alternative method to enable a new mode instead of 

> adding it on a .config option. A new driver wouldn't have a need for that right?


Correct, with a new driver would likely behave correctly upon being
probed such that you could have your switch ports act as normal network
devices from which you could run IP-config and do NFS boot.
-- 
Florian
Ilias Apalodimas June 20, 2018, 6:03 p.m. UTC | #19
Hi Florian,

On Wed, Jun 20, 2018 at 10:58:26AM -0700, Florian Fainelli wrote:
> On 06/20/2018 10:51 AM, Ilias Apalodimas wrote:

> > Hello Ivan,

> > On Wed, Jun 20, 2018 at 02:56:48PM +0200, Ivan Vecera wrote:

> >> On 18.6.2018 22:19, Ilias Apalodimas wrote:

> >>> Jiri proposed using devlink, which makes sense, but i am not sure it's

> >>> applicable on this patchset. This will change the driver completely and will

> >>> totally break backwards compatibility.

> >>

> >> Another good reason for a new driver.

> >>

> >> I.

> > This is actually conflicting at least to my understanding. Jiri proposed using 

> > devlink was used as an alternative method to enable a new mode instead of 

> > adding it on a .config option. A new driver wouldn't have a need for that right?

> 

> Correct, with a new driver would likely behave correctly upon being

> probed such that you could have your switch ports act as normal network

> devices from which you could run IP-config and do NFS boot.

The current driver also does NFS properly and the 2 ethernet ports act as normal
network interfaces. The NFS section in the cover letter
is to cover the cases were users running on NFS need to change the running
switch configuration(starting from adding the 2 interfaces on a bridge). 
Since iproute2 is located on the NFS filesystem the moment
network connectivity is lost, you loose the ability to perform further
configuration and in certian configuration scenarios render the device
unusable.
> -- 

> Florian


Thanks
Ilias
Ivan Vecera June 21, 2018, 12:19 p.m. UTC | #20
On 20.6.2018 20:03, Ilias Apalodimas wrote:
> Hi Florian,

> 

> On Wed, Jun 20, 2018 at 10:58:26AM -0700, Florian Fainelli wrote:

>> On 06/20/2018 10:51 AM, Ilias Apalodimas wrote:

>>> Hello Ivan,

>>> On Wed, Jun 20, 2018 at 02:56:48PM +0200, Ivan Vecera wrote:

>>>> On 18.6.2018 22:19, Ilias Apalodimas wrote:

>>>>> Jiri proposed using devlink, which makes sense, but i am not sure it's

>>>>> applicable on this patchset. This will change the driver completely and will

>>>>> totally break backwards compatibility.

>>>>

>>>> Another good reason for a new driver.

>>>>

>>>> I.

>>> This is actually conflicting at least to my understanding. Jiri proposed using 

>>> devlink was used as an alternative method to enable a new mode instead of 

>>> adding it on a .config option. A new driver wouldn't have a need for that right?

>>

>> Correct, with a new driver would likely behave correctly upon being

>> probed such that you could have your switch ports act as normal network

>> devices from which you could run IP-config and do NFS boot.

> The current driver also does NFS properly and the 2 ethernet ports act as normal

> network interfaces. The NFS section in the cover letter

> is to cover the cases were users running on NFS need to change the running

> switch configuration(starting from adding the 2 interfaces on a bridge). 

> Since iproute2 is located on the NFS filesystem the moment

> network connectivity is lost, you loose the ability to perform further

> configuration and in certian configuration scenarios render the device

> unusable.


Yes, with a new driver you can drop NFS-boot hack you mentioned in cover letter.
All configuration is done during driver probe and thus prior NFS mount. Only
thing you loose with a new driver is backward compatibility but the question is:
DO you really need it?

Ivan
Ilias Apalodimas June 21, 2018, 12:45 p.m. UTC | #21
On Thu, Jun 21, 2018 at 02:19:55PM +0200, Ivan Vecera wrote:
> On 20.6.2018 20:03, Ilias Apalodimas wrote:

> > Hi Florian,

> > 

> > On Wed, Jun 20, 2018 at 10:58:26AM -0700, Florian Fainelli wrote:

> >> On 06/20/2018 10:51 AM, Ilias Apalodimas wrote:

> >>> Hello Ivan,

> >>> On Wed, Jun 20, 2018 at 02:56:48PM +0200, Ivan Vecera wrote:

> >>>> On 18.6.2018 22:19, Ilias Apalodimas wrote:

> >>>>> Jiri proposed using devlink, which makes sense, but i am not sure it's

> >>>>> applicable on this patchset. This will change the driver completely and will

> >>>>> totally break backwards compatibility.

> >>>>

> >>>> Another good reason for a new driver.

> >>>>

> >>>> I.

> >>> This is actually conflicting at least to my understanding. Jiri proposed using 

> >>> devlink was used as an alternative method to enable a new mode instead of 

> >>> adding it on a .config option. A new driver wouldn't have a need for that right?

> >>

> >> Correct, with a new driver would likely behave correctly upon being

> >> probed such that you could have your switch ports act as normal network

> >> devices from which you could run IP-config and do NFS boot.

> > The current driver also does NFS properly and the 2 ethernet ports act as normal

> > network interfaces. The NFS section in the cover letter

> > is to cover the cases were users running on NFS need to change the running

> > switch configuration(starting from adding the 2 interfaces on a bridge). 

> > Since iproute2 is located on the NFS filesystem the moment

> > network connectivity is lost, you loose the ability to perform further

> > configuration and in certian configuration scenarios render the device

> > unusable.

> 

> Yes, with a new driver you can drop NFS-boot hack you mentioned in cover letter.

> All configuration is done during driver probe and thus prior NFS mount. Only

> thing you loose with a new driver is backward compatibility but the question is:

> DO you really need it?

Ok i think there's a bit of confusion here. I'll try to clarify it. 
There is no NFS hack for the current driver or ever was. Whether you use
.config/DTS/devlink/module param method for configuration this is strictly for
configuration. The driver (via .config currently) correctly
registers and initializes everything it needs to work. NFS boots fine without
using anything from that script.
The whole script is not there to boot up the device.  The script is there to 
help any potential testing that has to be done *via* NFS and the user has to
reconfigure the switch for his testcases.
Since you need to add the 2 interfaces in a bridge and start the switch
configuration, the moment you do that you loose your network access, thus all
the commands you need for configuration. This is why you need to chroot to do
that. I don't see how writing a new driver will change that.

The driver is currently widely used and that's the reason we tried to avoid
rewriting it. The current driver uses a DTS option to distinguish between two
existing modes. This patch just adds a third one. So to my understanding we
have the following options:
1. The driver already uses DTS to configure the hardware mode. Although this is
techincally wrong, we can add a third mode on DTS called 'switchdev;', get rid
of the .config option and keep the configuration method common (although not
optimal).
2. Keep the .config option which overrides the 2 existing modes. 
3. Introduce a devlink option. If this is applied for all 3 modes, it will break
backwards compatibility, so it's not an option. If it's introduced for
configuring 'switchdev' mode only, we fall into the same pitfall as option 2),
we have something that overrides our current config, slightly better though
since it's not a compile time option.
4. rewrite the driver 

If it was a brand new driver, i'd definitely pick 4. Since it's a pre-existing
driver though i can't rule out the rest of the options. 

Regards
Ilias
Arnd Bergmann June 21, 2018, 3:31 p.m. UTC | #22
On Thu, Jun 21, 2018 at 2:45 PM, Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
> On Thu, Jun 21, 2018 at 02:19:55PM +0200, Ivan Vecera wrote:


> The driver is currently widely used and that's the reason we tried to avoid

> rewriting it. The current driver uses a DTS option to distinguish between two

> existing modes. This patch just adds a third one. So to my understanding we

> have the following options:

> 1. The driver already uses DTS to configure the hardware mode. Although this is

> techincally wrong, we can add a third mode on DTS called 'switchdev;', get rid

> of the .config option and keep the configuration method common (although not

> optimal).

> 2. Keep the .config option which overrides the 2 existing modes.

> 3. Introduce a devlink option. If this is applied for all 3 modes, it will break

> backwards compatibility, so it's not an option. If it's introduced for

> configuring 'switchdev' mode only, we fall into the same pitfall as option 2),

> we have something that overrides our current config, slightly better though

> since it's not a compile time option.

> 4. rewrite the driver


As I understand it, the switchdev support can also be added without
becoming incompatible with the existing behavior, this is how I would
suggest it gets added in a way that keeps the existing DT binding and
user view while adding switchdev:

* In non-"dual-emac" mode, show only one network device that is
  configured as a transparent switch as today. Any users that today
  add TI's third-party ioctl interface with a non-upstreamable patch
  can keep using this mode and try to forward-port that patch.
* In "dual-emac" mode (as selected through DT), the hardware is
   configured to be viewed as two separate network devices as before,
   regardless of kernel configuration. Users can add the two device
   to a bridge device as before, and enabling switchdev support in
   the kernel configuration (based on your patch series) would change
   nothing else than using hardware support in the background to
   reconfigure the HW VLAN settings.

This does not require using devlink, adding a third mode, or changing
the DT binding or the user-visible behavior when switchdev is enabled,
but should get all the features you want.

> If it was a brand new driver, i'd definitely pick 4. Since it's a pre-existing

> driver though i can't rule out the rest of the options.


I think the suggestion was to have a new driver with a new binding
so that the DT could choose between the two drivers, one with
somewhat obscure behavior and the other with proper behavior.

However, from what I can tell, the only requirement to get a somewhat
reasonable behavior is that you enable "dual-emac" mode in DT
to get switchdev support. It would be trivial to add a new compatible
value that only allows that mode along with supporting switchdev,
but I don't think that's necessary here.

Writing a new driver might also be a good idea (depending on the
quality of the existing one, I haven't looked in detail), but again
I would see no reason for the new driver to be incompatible with
the existing binding, so a gradual cleanup seems like a better
approach.

       Arnd
Ilias Apalodimas June 22, 2018, 7:45 a.m. UTC | #23
On Thu, Jun 21, 2018 at 05:31:31PM +0200, Arnd Bergmann wrote:
> On Thu, Jun 21, 2018 at 2:45 PM, Ilias Apalodimas

> <ilias.apalodimas@linaro.org> wrote:

> > On Thu, Jun 21, 2018 at 02:19:55PM +0200, Ivan Vecera wrote:

> 

> > The driver is currently widely used and that's the reason we tried to avoid

> > rewriting it. The current driver uses a DTS option to distinguish between two

> > existing modes. This patch just adds a third one. So to my understanding we

> > have the following options:

> > 1. The driver already uses DTS to configure the hardware mode. Although this is

> > techincally wrong, we can add a third mode on DTS called 'switchdev;', get rid

> > of the .config option and keep the configuration method common (although not

> > optimal).

> > 2. Keep the .config option which overrides the 2 existing modes.

> > 3. Introduce a devlink option. If this is applied for all 3 modes, it will break

> > backwards compatibility, so it's not an option. If it's introduced for

> > configuring 'switchdev' mode only, we fall into the same pitfall as option 2),

> > we have something that overrides our current config, slightly better though

> > since it's not a compile time option.

> > 4. rewrite the driver

> 

> As I understand it, the switchdev support can also be added without

> becoming incompatible with the existing behavior, this is how I would

> suggest it gets added in a way that keeps the existing DT binding and

> user view while adding switchdev:

> 

> * In non-"dual-emac" mode, show only one network device that is

>   configured as a transparent switch as today. Any users that today

>   add TI's third-party ioctl interface with a non-upstreamable patch

>   can keep using this mode and try to forward-port that patch.

Correct
> * In "dual-emac" mode (as selected through DT), the hardware is

>    configured to be viewed as two separate network devices as before,

>    regardless of kernel configuration. Users can add the two device

>    to a bridge device as before, and enabling switchdev support in

>    the kernel configuration (based on your patch series) would change

>    nothing else than using hardware support in the background to

>    reconfigure the HW VLAN settings.

> 

> This does not require using devlink, adding a third mode, or changing

> the DT binding or the user-visible behavior when switchdev is enabled,

> but should get all the features you want.

> 

Correct again. This is doable and the changes on the current patchset are
somewhat trivial (detecting a bridge and making the configuration changes
on the fly).
> > If it was a brand new driver, i'd definitely pick 4. Since it's a pre-existing

> > driver though i can't rule out the rest of the options.

> 

> I think the suggestion was to have a new driver with a new binding

> so that the DT could choose between the two drivers, one with

> somewhat obscure behavior and the other with proper behavior.

> 

> However, from what I can tell, the only requirement to get a somewhat

> reasonable behavior is that you enable "dual-emac" mode in DT

> to get switchdev support. It would be trivial to add a new compatible

> value that only allows that mode along with supporting switchdev,

> but I don't think that's necessary here.

> 

> Writing a new driver might also be a good idea (depending on the

> quality of the existing one, I haven't looked in detail), but again

> I would see no reason for the new driver to be incompatible with

> the existing binding, so a gradual cleanup seems like a better

> approach.

Agree
> 

>        Arnd


If people like this idea, i can send a V3 with these changes.

Thanks
Ilias
Grygorii Strashko June 27, 2018, 7:18 p.m. UTC | #24
On 06/22/2018 02:45 AM, Ilias Apalodimas wrote:
> On Thu, Jun 21, 2018 at 05:31:31PM +0200, Arnd Bergmann wrote:

>> On Thu, Jun 21, 2018 at 2:45 PM, Ilias Apalodimas

>> <ilias.apalodimas@linaro.org> wrote:

>>> On Thu, Jun 21, 2018 at 02:19:55PM +0200, Ivan Vecera wrote:

>>

>>> The driver is currently widely used and that's the reason we tried to avoid

>>> rewriting it. The current driver uses a DTS option to distinguish between two

>>> existing modes. This patch just adds a third one. So to my understanding we

>>> have the following options:

>>> 1. The driver already uses DTS to configure the hardware mode. Although this is

>>> techincally wrong, we can add a third mode on DTS called 'switchdev;', get rid

>>> of the .config option and keep the configuration method common (although not

>>> optimal).

>>> 2. Keep the .config option which overrides the 2 existing modes.

>>> 3. Introduce a devlink option. If this is applied for all 3 modes, it will break

>>> backwards compatibility, so it's not an option. If it's introduced for

>>> configuring 'switchdev' mode only, we fall into the same pitfall as option 2),

>>> we have something that overrides our current config, slightly better though

>>> since it's not a compile time option.

>>> 4. rewrite the driver

>>

>> As I understand it, the switchdev support can also be added without

>> becoming incompatible with the existing behavior, this is how I would

>> suggest it gets added in a way that keeps the existing DT binding and

>> user view while adding switchdev:

>>

>> * In non-"dual-emac" mode, show only one network device that is

>>    configured as a transparent switch as today. Any users that today

>>    add TI's third-party ioctl interface with a non-upstreamable patch

>>    can keep using this mode and try to forward-port that patch.

> Correct

>> * In "dual-emac" mode (as selected through DT), the hardware is

>>     configured to be viewed as two separate network devices as before,

>>     regardless of kernel configuration. Users can add the two device

>>     to a bridge device as before, and enabling switchdev support in

>>     the kernel configuration (based on your patch series) would change

>>     nothing else than using hardware support in the background to

>>     reconfigure the HW VLAN settings.

>>

>> This does not require using devlink, adding a third mode, or changing

>> the DT binding or the user-visible behavior when switchdev is enabled,

>> but should get all the features you want.

>>

> Correct again. This is doable and the changes on the current patchset are

> somewhat trivial (detecting a bridge and making the configuration changes

> on the fly).

>>> If it was a brand new driver, i'd definitely pick 4. Since it's a pre-existing

>>> driver though i can't rule out the rest of the options.

>>

>> I think the suggestion was to have a new driver with a new binding

>> so that the DT could choose between the two drivers, one with

>> somewhat obscure behavior and the other with proper behavior.

>>

>> However, from what I can tell, the only requirement to get a somewhat

>> reasonable behavior is that you enable "dual-emac" mode in DT

>> to get switchdev support. It would be trivial to add a new compatible

>> value that only allows that mode along with supporting switchdev,

>> but I don't think that's necessary here.

>>

>> Writing a new driver might also be a good idea (depending on the

>> quality of the existing one, I haven't looked in detail), but again

>> I would see no reason for the new driver to be incompatible with

>> the existing binding, so a gradual cleanup seems like a better

>> approach.

> Agree

>>

> 

> If people like this idea, i can send a V3 with these changes.


Nop. I do not think this is good idea, because "dual_mac" mode has very strict
meaning and requirements. In "dual_mac" mode both port should be teated and work
as *separate network devices" (like two, not connected PCI eth cards) - the fact that
it's implemented on top of hw, which can do packet switching doesn't matter here and just a
technical solution.
Main requirements:
1) No packet forwarding is allowed inside hw under any circumstances, only Linux
   Host SW can consume or forward packets
2) One interface should not block another inside CPSW hw which implies special FIFOs/HW
 configuration
As per, above switchdev functionality doesn't make too much sense in "dual_mac" mode and
introducing dual meaning for this mode is not a good choice either.

Again, as discussed, option 4 is considered as preferred.

-- 
regards,
-grygorii
Arnd Bergmann June 27, 2018, 8:40 p.m. UTC | #25
On Wed, Jun 27, 2018 at 9:18 PM, Grygorii Strashko
<grygorii.strashko@ti.com> wrote:
> On 06/22/2018 02:45 AM, Ilias Apalodimas wrote:

>> On Thu, Jun 21, 2018 at 05:31:31PM +0200, Arnd Bergmann wrote:

>>> On Thu, Jun 21, 2018 at 2:45 PM, Ilias Apalodimas

>>> <ilias.apalodimas@linaro.org> wrote:

>>>> On Thu, Jun 21, 2018 at 02:19:55PM +0200, Ivan Vecera wrote:

>>>

>>

>> If people like this idea, i can send a V3 with these changes.

>

> Nop. I do not think this is good idea, because "dual_mac" mode has very strict

> meaning and requirements. In "dual_mac" mode both port should be teated and work

> as *separate network devices" (like two, not connected PCI eth cards) - the fact that

> it's implemented on top of hw, which can do packet switching doesn't matter here and just a

> technical solution.

> Main requirements:

> 1) No packet forwarding is allowed inside hw under any circumstances, only Linux

>    Host SW can consume or forward packets

> 2) One interface should not block another inside CPSW hw which implies special FIFOs/HW

>  configuration


Could you explain the reasoning behind those requirements? I honestly don't
see what difference it makes, given that a new driver with switchdev support
would look exactly like the dual_emac mode as long as you don't add the
two interfaces into a bridge, and the user-visible behavior is already required
to be the same.

> As per, above switchdev functionality doesn't make too much sense in "dual_mac" mode and

> introducing dual meaning for this mode is not a good choice either.

>

> Again, as discussed, option 4 is considered as preferred.


Do you mean creating an incompatible binding that could be implemented by
the same driver, or duplicating the driver with one copy of the old binding
and one copy for the new binding?

      Arnd
Grygorii Strashko June 27, 2018, 11:03 p.m. UTC | #26
On 06/27/2018 03:40 PM, Arnd Bergmann wrote:
> On Wed, Jun 27, 2018 at 9:18 PM, Grygorii Strashko

> <grygorii.strashko@ti.com> wrote:

>> On 06/22/2018 02:45 AM, Ilias Apalodimas wrote:

>>> On Thu, Jun 21, 2018 at 05:31:31PM +0200, Arnd Bergmann wrote:

>>>> On Thu, Jun 21, 2018 at 2:45 PM, Ilias Apalodimas

>>>> <ilias.apalodimas@linaro.org> wrote:

>>>>> On Thu, Jun 21, 2018 at 02:19:55PM +0200, Ivan Vecera wrote:

>>>>

>>>

>>> If people like this idea, i can send a V3 with these changes.

>>

>> Nop. I do not think this is good idea, because "dual_mac" mode has very strict

>> meaning and requirements. In "dual_mac" mode both port should be teated and work

>> as *separate network devices" (like two, not connected PCI eth cards) - the fact that

>> it's implemented on top of hw, which can do packet switching doesn't matter here and just a

>> technical solution.

>> Main requirements:

>> 1) No packet forwarding is allowed inside hw under any circumstances, only Linux

>>     Host SW can consume or forward packets

>> 2) One interface should not block another inside CPSW hw which implies special FIFOs/HW

>>   configuration

> 

> Could you explain the reasoning behind those requirements? I honestly don't

> see what difference it makes, given that a new driver with switchdev support

> would look exactly like the dual_emac mode as long as you don't add the

> two interfaces into a bridge, and the user-visible behavior is already required

> to be the same.


Am not aware of all details - it's custom filtering/routing/firewalling applications.
(Like Industrial Ethernet (EtherCAT) to Ethernet converter on one port and
 another port is for control/monitoring purposes)
And yes, it looks similar. But, as I mentioned, dual_mac mode required CPSW to be
configured differently and reconfiguration during attaching to the bridge
is very (very) problematic - first, FIFOs/HW configuration not expected to be done on the fly,
second vlans 1/2 reserved for this mode while bridge uses vid 1 by default.
In dual_mac mode port just switched to promiscuous mode when attached to the bridge.
Using kernel configuration option will break multi-platform support as
all CPSW instances will start behaving as switch.  

> 

>> As per, above switchdev functionality doesn't make too much sense in "dual_mac" mode and

>> introducing dual meaning for this mode is not a good choice either.

>>

>> Again, as discussed, option 4 is considered as preferred.

> 

> Do you mean creating an incompatible binding that could be implemented by

> the same driver, or duplicating the driver with one copy of the old binding

> and one copy for the new binding?


The idea is to keep dual_mac and one port mode (60% of use cases) as is
while create new driver for two port switch mode by refactoring existing driver and 
re-using generic parts as max as possible. Also, update bindings as there are
a lot of ancient and obsolete DT definitions which still supported for compatibility
reasons. And, yes, possibility to mix dual_mac and switchdev in one driver is
considered, but postponed as it hard to implement now, and as the main target is
to drop custom ioctl and switch to standard Linux interfaces for switch.
One step at time.

-- 
regards,
-grygorii
Arnd Bergmann June 28, 2018, 7:53 a.m. UTC | #27
On Thu, Jun 28, 2018 at 1:03 AM, Grygorii Strashko
<grygorii.strashko@ti.com> wrote:
>

>

> On 06/27/2018 03:40 PM, Arnd Bergmann wrote:

>> On Wed, Jun 27, 2018 at 9:18 PM, Grygorii Strashko

>> <grygorii.strashko@ti.com> wrote:

>>> On 06/22/2018 02:45 AM, Ilias Apalodimas wrote:

>>>> On Thu, Jun 21, 2018 at 05:31:31PM +0200, Arnd Bergmann wrote:

>>>>> On Thu, Jun 21, 2018 at 2:45 PM, Ilias Apalodimas

>>>>> <ilias.apalodimas@linaro.org> wrote:

>>>>>> On Thu, Jun 21, 2018 at 02:19:55PM +0200, Ivan Vecera wrote:

>>>>>

>>>>

>>>> If people like this idea, i can send a V3 with these changes.

>>>

>>> Nop. I do not think this is good idea, because "dual_mac" mode has very strict

>>> meaning and requirements. In "dual_mac" mode both port should be teated and work

>>> as *separate network devices" (like two, not connected PCI eth cards) - the fact that

>>> it's implemented on top of hw, which can do packet switching doesn't matter here and just a

>>> technical solution.

>>> Main requirements:

>>> 1) No packet forwarding is allowed inside hw under any circumstances, only Linux

>>>     Host SW can consume or forward packets

>>> 2) One interface should not block another inside CPSW hw which implies special FIFOs/HW

>>>   configuration

>>

>> Could you explain the reasoning behind those requirements? I honestly don't

>> see what difference it makes, given that a new driver with switchdev support

>> would look exactly like the dual_emac mode as long as you don't add the

>> two interfaces into a bridge, and the user-visible behavior is already required

>> to be the same.

>

> Am not aware of all details - it's custom filtering/routing/firewalling applications.

> (Like Industrial Ethernet (EtherCAT) to Ethernet converter on one port and

>  another port is for control/monitoring purposes)

> And yes, it looks similar. But, as I mentioned, dual_mac mode required CPSW to be

> configured differently and reconfiguration during attaching to the bridge

> is very (very) problematic - first, FIFOs/HW configuration not expected to be done on the fly,

> second vlans 1/2 reserved for this mode while bridge uses vid 1 by default.

> In dual_mac mode port just switched to promiscuous mode when attached to the bridge.

> Using kernel configuration option will break multi-platform support as

> all CPSW instances will start behaving as switch.


I was referring to dynamically reconfiguring the device during switch
attachment (which you say is hard in the current driver, and I can believe that,
but it does seem like a problem that can be solved with a proper design),
and the kernel configuration must have no impact on the default behavior
in that case.

This would still meet the requirements for the use case you mention,
as one would definitely never want to bridge between an EtherCAT
interface and a management interface.

>>> As per, above switchdev functionality doesn't make too much sense in "dual_mac" mode and

>>> introducing dual meaning for this mode is not a good choice either.

>>>

>>> Again, as discussed, option 4 is considered as preferred.

>>

>> Do you mean creating an incompatible binding that could be implemented by

>> the same driver, or duplicating the driver with one copy of the old binding

>> and one copy for the new binding?

>

> The idea is to keep dual_mac and one port mode (60% of use cases) as is

> while create new driver for two port switch mode by refactoring existing driver and

> re-using generic parts as max as possible. Also, update bindings as there are

> a lot of ancient and obsolete DT definitions which still supported for compatibility

> reasons. And, yes, possibility to mix dual_mac and switchdev in one driver is

> considered, but postponed as it hard to implement now, and as the main target is

> to drop custom ioctl and switch to standard Linux interfaces for switch.

> One step at time.


But wouldn't that new driver have the exact same problem with reconfiguring
the device between the boot-time configuration that behaves like
dual_emac and the switch configuration once the switchdev gets attached?

      Arnd
diff mbox series

Patch

diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index 9263d63..a299d86 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -73,6 +73,15 @@  config TI_CPSW
 	  To compile this driver as a module, choose M here: the module
 	  will be called cpsw.
 
+config TI_CPSW_SWITCHDEV
+	bool "TI CPSW switchdev support"
+	depends on TI_CPSW
+	depends on NET_SWITCHDEV
+	help
+	  Enable switchdev support on TI's CPSW Ethernet Switch.
+
+	  This will allow you to configure the switch using standard tools.
+
 config TI_CPTS
 	bool "TI Common Platform Time Sync (CPTS) Support"
 	depends on TI_CPSW || TI_KEYSTONE_NETCP || COMPILE_TEST
diff --git a/drivers/net/ethernet/ti/Makefile b/drivers/net/ethernet/ti/Makefile
index 0be551d..d6eb2a2 100644
--- a/drivers/net/ethernet/ti/Makefile
+++ b/drivers/net/ethernet/ti/Makefile
@@ -15,6 +15,7 @@  obj-$(CONFIG_TI_CPSW_PHY_SEL) += cpsw-phy-sel.o
 obj-$(CONFIG_TI_CPSW_ALE) += cpsw_ale.o
 obj-$(CONFIG_TI_CPTS_MOD) += cpts.o
 obj-$(CONFIG_TI_CPSW) += ti_cpsw.o
+obj-$(CONFIG_TI_CPSW_SWITCHDEV) += cpsw_switchdev.o
 ti_cpsw-y := cpsw.o
 
 obj-$(CONFIG_TI_KEYSTONE_NETCP) += keystone_netcp.o
diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index e5765cc..b501908 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -18,12 +18,10 @@ 
 #include <linux/clk.h>
 #include <linux/timer.h>
 #include <linux/module.h>
-#include <linux/platform_device.h>
 #include <linux/irqreturn.h>
 #include <linux/interrupt.h>
 #include <linux/if_ether.h>
 #include <linux/etherdevice.h>
-#include <linux/netdevice.h>
 #include <linux/net_tstamp.h>
 #include <linux/phy.h>
 #include <linux/workqueue.h>
@@ -43,6 +41,7 @@ 
 #include "cpsw.h"
 #include "cpsw_ale.h"
 #include "cpsw_priv.h"
+#include "cpsw_switchdev.h"
 #include "cpts.h"
 #include "davinci_cpdma.h"
 
@@ -361,6 +360,13 @@  struct cpsw_hw_stats {
 	u32	rxdmaoverruns;
 };
 
+struct cpsw_switchdev_event_work {
+	struct work_struct work;
+	struct switchdev_notifier_fdb_info fdb_info;
+	struct cpsw_priv *priv;
+	unsigned long event;
+};
+
 #define CPSW_STAT(m)		CPSW_STATS,				\
 				sizeof(((struct cpsw_hw_stats *)0)->m), \
 				offsetof(struct cpsw_hw_stats, m)
@@ -488,14 +494,32 @@  static int cpsw_is_switch(u8 switch_mode)
 	return switch_mode == CPSW_TI_SWITCH;
 }
 
+static int cpsw_is_switchdev(u8 switch_mode)
+{
+	return switch_mode == CPSW_SWITCHDEV;
+}
+
 static int cpsw_slave_index(struct cpsw_priv *priv)
 {
 	struct cpsw_common *cpsw = priv->cpsw;
 
+#if IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV)
+	if (priv->emac_port == HOST_PORT_NUM)
+		return -1;
+#endif
+
 	return cpsw->data.switch_mode ? priv->emac_port - 1 :
 		cpsw->data.active_slave;
 }
 
+static void cpsw_switchdev_port_enable(struct net_device *ndev)
+{
+#if IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV)
+	cpsw_port_switchdev_init(ndev);
+	ndev->features |= NETIF_F_NETNS_LOCAL;
+#endif
+}
+
 static void cpsw_set_promiscious(struct net_device *ndev, bool enable)
 {
 	struct cpsw_common *cpsw = ndev_to_cpsw(ndev);
@@ -521,6 +545,7 @@  static void cpsw_set_promiscious(struct net_device *ndev, bool enable)
 		if (enable) {
 			/* Enable Bypass */
 			cpsw_ale_control_set(ale, 0, ALE_BYPASS, 1);
+			cpsw_ale_set_allmulti(ale, IFF_ALLMULTI);
 
 			dev_dbg(&ndev->dev, "promiscuity enabled\n");
 		} else {
@@ -554,6 +579,7 @@  static void cpsw_set_promiscious(struct net_device *ndev, bool enable)
 
 			/* Flood All Unicast Packets to Host port */
 			cpsw_ale_control_set(ale, 0, ALE_P0_UNI_FLOOD, 1);
+			cpsw_ale_set_allmulti(ale, IFF_ALLMULTI);
 			dev_dbg(&ndev->dev, "promiscuity enabled\n");
 		} else {
 			/* Don't Flood All Unicast Packets to Host port */
@@ -568,6 +594,19 @@  static void cpsw_set_promiscious(struct net_device *ndev, bool enable)
 			}
 			dev_dbg(&ndev->dev, "promiscuity disabled\n");
 		}
+	} else if (cpsw_is_switchdev(cpsw->data.switch_mode)) {
+		/* When interfaces are placed into a bridge they'll switch to
+		 * promiscuous mode. In switchdev case ALE_P0_UNI_FLOOD is
+		 * changed whether any switch port participates in the bridge
+		 * or not
+		 */
+		struct cpsw_priv *priv = netdev_priv(ndev);
+		int slave_idx = cpsw_slave_index(priv);
+		int slave_num;
+
+		slave_num = cpsw_get_slave_port(slave_idx);
+		cpsw_ale_control_set(ale, slave_num, ALE_PORT_NOLEARN, 0);
+		cpsw_ale_control_set(ale, slave_num, ALE_PORT_NO_SA_UPDATE, 0);
 	}
 }
 
@@ -586,7 +625,6 @@  static void cpsw_ndo_set_rx_mode(struct net_device *ndev)
 	if (ndev->flags & IFF_PROMISC) {
 		/* Enable promiscuous mode */
 		cpsw_set_promiscious(ndev, true);
-		cpsw_ale_set_allmulti(cpsw->ale, IFF_ALLMULTI);
 		return;
 	} else {
 		/* Disable promiscuous mode */
@@ -721,6 +759,10 @@  static void cpsw_rx_handler(void *token, int len, int status)
 		return;
 	}
 
+#if IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV)
+	if (cpsw_is_switchdev(cpsw->data.switch_mode))
+		skb->offload_fwd_mark = 1;
+#endif
 	new_skb = netdev_alloc_skb_ip_align(ndev, cpsw->rx_packet_max);
 	if (new_skb) {
 		skb_copy_queue_mapping(new_skb, skb);
@@ -1427,10 +1469,13 @@  static void cpsw_init_host_port(struct cpsw_priv *priv)
 			     ALE_PORT_STATE, ALE_PORT_STATE_FORWARD);
 
 	if (!cpsw_is_dual_mac(cpsw->data.switch_mode)) {
-		cpsw_ale_add_ucast(cpsw->ale, priv->mac_addr, HOST_PORT_NUM,
-				   0, 0);
+		char stpa[] = {0x01, 0x80, 0xc2, 0x0, 0x0, 0x0};
+
 		cpsw_ale_add_mcast(cpsw->ale, priv->ndev->broadcast,
 				   ALE_PORT_HOST, 0, 0, ALE_MCAST_FWD_2);
+		cpsw_ale_add_mcast(cpsw->ale, stpa,
+				   ALE_PORT_HOST, ALE_SUPER, 0,
+				   ALE_MCAST_BLOCK_LEARN_FWD);
 	}
 }
 
@@ -1529,11 +1574,14 @@  static int cpsw_ndo_open(struct net_device *ndev)
 	for_each_slave(priv, cpsw_slave_open, priv);
 
 	/* Add default VLAN */
-	if (!cpsw_is_dual_mac(cpsw->data.switch_mode))
+	if (!cpsw_is_dual_mac(cpsw->data.switch_mode)) {
 		cpsw_add_default_vlan(priv);
-	else
+		cpsw_ale_add_ucast(cpsw->ale, priv->mac_addr, HOST_PORT_NUM, 0,
+				   0);
+	} else {
 		cpsw_ale_add_vlan(cpsw->ale, cpsw->data.default_vlan,
 				  ALE_ALL_PORTS, ALE_ALL_PORTS, 0, 0);
+	}
 
 	/* initialize shared resources for every ndev */
 	if (!cpsw->usage_count) {
@@ -1852,6 +1900,9 @@  static int cpsw_ndo_ioctl(struct net_device *dev, struct ifreq *req, int cmd)
 	if (!netif_running(dev))
 		return -EINVAL;
 
+	if (slave_no < 0)
+		return -EOPNOTSUPP;
+
 	switch (cmd) {
 	case SIOCSHWTSTAMP:
 		return cpsw_hwtstamp_set(dev, req);
@@ -1941,7 +1992,7 @@  static inline int cpsw_add_vlan_ale_entry(struct cpsw_priv *priv,
 	u32 port_mask;
 	struct cpsw_common *cpsw = priv->cpsw;
 
-	if (cpsw_is_dual_mac(cpsw->data.switch_mode)) {
+	if (!cpsw_is_switch(cpsw->data.switch_mode)) {
 		port_mask = (1 << priv->emac_port) | ALE_PORT_HOST;
 
 		if (priv->ndev->flags & IFF_ALLMULTI)
@@ -1989,6 +2040,10 @@  static int cpsw_ndo_vlan_rx_add_vid(struct net_device *ndev,
 	if (vid == cpsw->data.default_vlan)
 		return 0;
 
+	if (cpsw_is_switchdev(cpsw->data.switch_mode) &&
+	    (netif_is_bridge_port(ndev)))
+		return -EOPNOTSUPP;
+
 	ret = pm_runtime_get_sync(cpsw->dev);
 	if (ret < 0) {
 		pm_runtime_put_noidle(cpsw->dev);
@@ -2025,6 +2080,10 @@  static int cpsw_ndo_vlan_rx_kill_vid(struct net_device *ndev,
 	if (vid == cpsw->data.default_vlan)
 		return 0;
 
+	if (cpsw_is_switchdev(cpsw->data.switch_mode) &&
+	    (netif_is_bridge_port(ndev)))
+		return -EOPNOTSUPP;
+
 	ret = pm_runtime_get_sync(cpsw->dev);
 	if (ret < 0) {
 		pm_runtime_put_noidle(cpsw->dev);
@@ -2056,6 +2115,24 @@  static int cpsw_ndo_vlan_rx_kill_vid(struct net_device *ndev,
 	return ret;
 }
 
+static int cpsw_ndo_get_phys_port_name(struct net_device *ndev, char *name,
+				       size_t len)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int err;
+
+	if (!cpsw_is_switchdev(cpsw->data.switch_mode))
+		return -EOPNOTSUPP;
+
+	err = snprintf(name, len, "p%d", priv->emac_port);
+
+	if (err >= len)
+		return -EINVAL;
+
+	return 0;
+}
+
 static int cpsw_ndo_set_tx_maxrate(struct net_device *ndev, int queue, u32 rate)
 {
 	struct cpsw_priv *priv = netdev_priv(ndev);
@@ -2122,6 +2199,7 @@  static const struct net_device_ops cpsw_netdev_ops = {
 #endif
 	.ndo_vlan_rx_add_vid	= cpsw_ndo_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= cpsw_ndo_vlan_rx_kill_vid,
+	.ndo_get_phys_port_name = cpsw_ndo_get_phys_port_name,
 };
 
 static int cpsw_get_regs_len(struct net_device *ndev)
@@ -2711,6 +2789,10 @@  static int cpsw_probe_dt(struct cpsw_platform_data *data,
 	if (of_property_read_bool(node, "dual_emac"))
 		data->switch_mode = CPSW_DUAL_EMAC;
 
+	/* switchdev overrides DTS */
+	if (IS_ENABLED(CONFIG_TI_CPSW_SWITCHDEV))
+		data->switch_mode = CPSW_SWITCHDEV;
+
 	/*
 	 * Populate all the child nodes here...
 	 */
@@ -2874,6 +2956,9 @@  static int cpsw_probe_dual_emac(struct cpsw_priv *priv)
 	cpsw->slaves[1].ndev = ndev;
 	ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
 
+	if (cpsw_is_switchdev(cpsw->data.switch_mode))
+		cpsw_switchdev_port_enable(ndev);
+
 	ndev->netdev_ops = &cpsw_netdev_ops;
 	ndev->ethtool_ops = &cpsw_ethtool_ops;
 
@@ -2903,6 +2988,196 @@  static const struct soc_device_attribute cpsw_soc_devices[] = {
 	{ /* sentinel */ }
 };
 
+static bool cpsw_port_dev_check(const struct net_device *dev)
+{
+	return dev->netdev_ops == &cpsw_netdev_ops;
+}
+
+static void cpsw_fdb_offload_notify(struct net_device *ndev,
+				    struct switchdev_notifier_fdb_info *rcv)
+{
+	struct switchdev_notifier_fdb_info info;
+
+	info.addr = rcv->addr;
+	info.vid = rcv->vid;
+	call_switchdev_notifiers(SWITCHDEV_FDB_OFFLOADED,
+				 ndev, &info.info);
+}
+
+static void cpsw_switchdev_event_work(struct work_struct *work)
+{
+	struct cpsw_switchdev_event_work *switchdev_work =
+		container_of(work, struct cpsw_switchdev_event_work, work);
+	struct cpsw_priv *priv = switchdev_work->priv;
+	struct switchdev_notifier_fdb_info *fdb;
+	struct cpsw_common *cpsw = priv->cpsw;
+	int port = priv->emac_port;
+
+	rtnl_lock();
+	switch (switchdev_work->event) {
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+		fdb = &switchdev_work->fdb_info;
+		if (memcmp(priv->mac_addr, (u8 *)fdb->addr, ETH_ALEN) == 0)
+			port = HOST_PORT_NUM;
+		cpsw_ale_add_ucast(cpsw->ale, (u8 *)fdb->addr, port, ALE_VLAN,
+				   fdb->vid);
+		cpsw_fdb_offload_notify(priv->ndev, fdb);
+		break;
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		fdb = &switchdev_work->fdb_info;
+		if (memcmp(priv->mac_addr, (u8 *)fdb->addr, ETH_ALEN) == 0)
+			port = HOST_PORT_NUM;
+		cpsw_ale_del_ucast(cpsw->ale, (u8 *)fdb->addr, port, ALE_VLAN,
+				   fdb->vid);
+		break;
+	default:
+		break;
+	}
+	rtnl_unlock();
+
+	kfree(switchdev_work->fdb_info.addr);
+	kfree(switchdev_work);
+	dev_put(priv->ndev);
+}
+
+/* called under rcu_read_lock() */
+static int cpsw_switchdev_event(struct notifier_block *unused,
+				unsigned long event, void *ptr)
+{
+	struct net_device *ndev = switchdev_notifier_info_to_dev(ptr);
+	struct switchdev_notifier_fdb_info *fdb_info = ptr;
+	struct cpsw_switchdev_event_work *switchdev_work;
+	struct cpsw_priv *priv = netdev_priv(ndev);
+
+	if (!cpsw_port_dev_check(ndev))
+		return NOTIFY_DONE;
+
+	switchdev_work = kzalloc(sizeof(*switchdev_work), GFP_ATOMIC);
+	if (WARN_ON(!switchdev_work))
+		return NOTIFY_BAD;
+
+	INIT_WORK(&switchdev_work->work, cpsw_switchdev_event_work);
+	switchdev_work->priv = priv;
+	switchdev_work->event = event;
+
+	switch (event) {
+	case SWITCHDEV_FDB_ADD_TO_DEVICE:
+	case SWITCHDEV_FDB_DEL_TO_DEVICE:
+		memcpy(&switchdev_work->fdb_info, ptr,
+		       sizeof(switchdev_work->fdb_info));
+		switchdev_work->fdb_info.addr = kzalloc(ETH_ALEN, GFP_ATOMIC);
+		ether_addr_copy((u8 *)switchdev_work->fdb_info.addr,
+				fdb_info->addr);
+		dev_hold(ndev);
+		break;
+	default:
+		kfree(switchdev_work);
+		return NOTIFY_DONE;
+	}
+
+	queue_work(system_long_wq, &switchdev_work->work);
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block cpsw_switchdev_notifier = {
+	.notifier_call = cpsw_switchdev_event,
+};
+
+static void cpsw_netdevice_port_link(struct net_device *ndev)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+
+	if (!cpsw->br_members) {
+		cpsw_ale_control_set(cpsw->ale, HOST_PORT_NUM, ALE_P0_UNI_FLOOD,
+				     1);
+		dev_dbg(&ndev->dev, "Set P0_UNI_FLOOD\n");
+	}
+	cpsw->br_members++;
+}
+
+static void cpsw_netdevice_port_unlink(struct net_device *ndev)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+
+	cpsw->br_members--;
+	if (!cpsw->br_members) {
+		cpsw_ale_control_set(cpsw->ale, HOST_PORT_NUM, ALE_P0_UNI_FLOOD,
+				     0);
+		dev_dbg(&ndev->dev, "unset P0_UNI_FLOOD\n");
+	}
+}
+
+/* netdev notifier */
+static int cpsw_netdevice_event(struct notifier_block *unused,
+				unsigned long event, void *ptr)
+{
+	struct net_device *ndev = netdev_notifier_info_to_dev(ptr);
+	struct netdev_notifier_changeupper_info *info;
+
+	switch (event) {
+	case NETDEV_CHANGEUPPER:
+		info = ptr;
+		if (!info->master)
+			goto out;
+		if (info->linking)
+			cpsw_netdevice_port_link(ndev);
+		else
+			cpsw_netdevice_port_unlink(ndev);
+		break;
+	default:
+		return NOTIFY_DONE;
+	}
+
+out:
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block cpsw_netdevice_nb __read_mostly = {
+	.notifier_call = cpsw_netdevice_event,
+};
+
+static int cpsw_register_notifiers(struct cpsw_priv *priv)
+{
+	int ret;
+
+	ret = register_netdevice_notifier(&cpsw_netdevice_nb);
+	if (ret) {
+		cpsw_err(priv, probe, "can't register netdevice notifier\n");
+		return ret;
+	}
+
+	ret = register_switchdev_notifier(&cpsw_switchdev_notifier);
+	if (ret) {
+		cpsw_err(priv, probe, "can't register switchdev notifier\n");
+		goto unreg_netdevice;
+	}
+
+	return ret;
+
+unreg_netdevice:
+	ret = unregister_netdevice_notifier(&cpsw_netdevice_nb);
+
+	return ret;
+}
+
+static int cpsw_unregister_notifiers(struct cpsw_priv *priv)
+{
+	int ret;
+
+	ret = unregister_switchdev_notifier(&cpsw_switchdev_notifier);
+	if (ret)
+		dev_err(priv->dev, "can't unregister switchdev notifier\n");
+
+	ret += unregister_netdevice_notifier(&cpsw_netdevice_nb);
+	if (ret)
+		dev_err(priv->dev, "can't unregister netdevice notifier\n");
+
+	return ret;
+}
+
 static int cpsw_probe(struct platform_device *pdev)
 {
 	struct clk			*clk;
@@ -3135,6 +3410,9 @@  static int cpsw_probe(struct platform_device *pdev)
 		goto clean_dma_ret;
 	}
 
+	if (cpsw_is_switchdev(cpsw->data.switch_mode))
+		cpsw_switchdev_port_enable(ndev);
+
 	ndev->features |= NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_HW_VLAN_CTAG_RX;
 
 	ndev->netdev_ops = &cpsw_netdev_ops;
@@ -3202,6 +3480,12 @@  static int cpsw_probe(struct platform_device *pdev)
 		goto clean_dma_ret;
 	}
 
+	if (cpsw_is_switchdev(cpsw->data.switch_mode)) {
+		ret = cpsw_register_notifiers(priv);
+		if (ret)
+			goto clean_dma_ret;
+	}
+
 	cpsw_notice(priv, probe,
 		    "initialized device (regs %pa, irq %d, pool size %d)\n",
 		    &ss_res->start, ndev->irq, dma_params.descs_pool_size);
@@ -3227,7 +3511,8 @@  static int cpsw_probe(struct platform_device *pdev)
 static int cpsw_remove(struct platform_device *pdev)
 {
 	struct net_device *ndev = platform_get_drvdata(pdev);
-	struct cpsw_common *cpsw = ndev_to_cpsw(ndev);
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
 	int ret;
 
 	ret = pm_runtime_get_sync(&pdev->dev);
@@ -3236,6 +3521,9 @@  static int cpsw_remove(struct platform_device *pdev)
 		return ret;
 	}
 
+	if (cpsw_is_switchdev(cpsw->data.switch_mode))
+		ret = cpsw_unregister_notifiers(priv);
+
 	if (!cpsw_is_switch(cpsw->data.switch_mode))
 		unregister_netdev(cpsw->slaves[1].ndev);
 	unregister_netdev(ndev);
diff --git a/drivers/net/ethernet/ti/cpsw_priv.h b/drivers/net/ethernet/ti/cpsw_priv.h
index 86a2709..4380b1c 100644
--- a/drivers/net/ethernet/ti/cpsw_priv.h
+++ b/drivers/net/ethernet/ti/cpsw_priv.h
@@ -33,6 +33,7 @@ 
 enum {
 	CPSW_TI_SWITCH,
 	CPSW_DUAL_EMAC,
+	CPSW_SWITCHDEV,
 };
 
 struct cpsw_slave_data {
@@ -98,6 +99,7 @@  struct cpsw_common {
 	int				rx_ch_num, tx_ch_num;
 	int				speed;
 	int				usage_count;
+	u8				br_members;
 };
 
 struct cpsw_priv {
diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.c b/drivers/net/ethernet/ti/cpsw_switchdev.c
new file mode 100644
index 0000000..528e99e
--- /dev/null
+++ b/drivers/net/ethernet/ti/cpsw_switchdev.c
@@ -0,0 +1,418 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/* Texas Instruments switchdev Driver
+ *
+ * Copyright (C) 2018 Texas Instruments
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation version 2.
+ *
+ * This program is distributed "as is" WITHOUT ANY WARRANTY of any
+ * kind, whether express or implied; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/if_bridge.h>
+#include <net/switchdev.h>
+#include "cpsw.h"
+#include "cpsw_priv.h"
+#include "cpsw_ale.h"
+
+static u32 cpsw_switchdev_get_ver(struct net_device *ndev)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+
+	return cpsw->version;
+}
+
+static int cpsw_port_stp_state_set(struct cpsw_priv *priv,
+				   struct switchdev_trans *trans, u8 state)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	u8 cpsw_state;
+	int ret = 0;
+
+	if (switchdev_trans_ph_prepare(trans))
+		return 0;
+
+	switch (state) {
+	case BR_STATE_FORWARDING:
+		cpsw_state = ALE_PORT_STATE_FORWARD;
+		break;
+	case BR_STATE_LEARNING:
+		cpsw_state = ALE_PORT_STATE_LEARN;
+		break;
+	case BR_STATE_DISABLED:
+		cpsw_state = ALE_PORT_STATE_DISABLE;
+		break;
+	case BR_STATE_LISTENING:
+	case BR_STATE_BLOCKING:
+		cpsw_state = ALE_PORT_STATE_BLOCK;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	ret = cpsw_ale_control_set(cpsw->ale, priv->emac_port,
+				   ALE_PORT_STATE, cpsw_state);
+	dev_dbg(priv->dev, "ale state: %u\n", cpsw_state);
+
+	return ret;
+}
+
+static int cpsw_port_attr_br_flags_set(struct cpsw_priv *priv,
+				       struct switchdev_trans *trans,
+				       struct net_device *orig_dev,
+				       unsigned long brport_flags)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	bool unreg_mcast_add = false;
+
+	if (switchdev_trans_ph_prepare(trans))
+		return 0;
+
+	if (brport_flags & BR_MCAST_FLOOD)
+		unreg_mcast_add = true;
+	cpsw_ale_set_unreg_mcast(cpsw->ale, BIT(priv->emac_port),
+				 unreg_mcast_add);
+
+	return 0;
+}
+
+static int cpsw_port_attr_set(struct net_device *ndev,
+			      const struct switchdev_attr *attr,
+			      struct switchdev_trans *trans)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	u8 state;
+	int ret;
+
+	dev_dbg(priv->dev, "attr: id %u dev: %s port: %u\n", attr->id,
+		priv->ndev->name, priv->emac_port);
+
+	switch (attr->id) {
+	case SWITCHDEV_ATTR_ID_PORT_STP_STATE:
+		ret = cpsw_port_stp_state_set(priv, trans, attr->u.stp_state);
+		dev_dbg(priv->dev, "stp state: %u\n", state);
+		break;
+	case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS:
+		ret = cpsw_port_attr_br_flags_set(priv, trans, attr->orig_dev,
+						  attr->u.brport_flags);
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		break;
+	}
+
+	return ret;
+}
+
+static int cpsw_port_attr_get(struct net_device *dev,
+			      struct switchdev_attr *attr)
+{
+	u32 cpsw_ver;
+	int err = 0;
+
+	switch (attr->id) {
+	case SWITCHDEV_ATTR_ID_PORT_PARENT_ID:
+		cpsw_ver = cpsw_switchdev_get_ver(dev);
+		attr->u.ppid.id_len = sizeof(cpsw_ver);
+		memcpy(&attr->u.ppid.id, &cpsw_ver, attr->u.ppid.id_len);
+		break;
+	case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT:
+		attr->u.brport_flags_support = BR_MCAST_FLOOD;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return err;
+}
+
+static u16 cpsw_get_pvid(struct cpsw_priv *priv)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	u32 __iomem *port_vlan_reg;
+	u32 pvid;
+
+	if (priv->emac_port) {
+		int reg = CPSW2_PORT_VLAN;
+
+		if (cpsw->version == CPSW_VERSION_1)
+			reg = CPSW1_PORT_VLAN;
+		pvid = slave_read(cpsw->slaves + (priv->emac_port - 1), reg);
+	} else {
+		port_vlan_reg = &cpsw->host_port_regs->port_vlan;
+		pvid = readl(port_vlan_reg);
+	}
+
+	pvid = pvid & 0xfff;
+
+	return pvid;
+}
+
+static void cpsw_set_pvid(struct cpsw_priv *priv, u16 vid, bool cfi, u32 cos)
+{
+	struct cpsw_common *cpsw = priv->cpsw;
+	void __iomem *port_vlan_reg;
+	u32 pvid;
+
+	pvid = vid;
+	pvid |= cfi ? BIT(12) : 0;
+	pvid |= (cos & 0x7) << 13;
+
+	if (priv->emac_port) {
+		int reg = CPSW2_PORT_VLAN;
+
+		if (cpsw->version == CPSW_VERSION_1)
+			reg = CPSW1_PORT_VLAN;
+		/* no barrier */
+		slave_write(cpsw->slaves + (priv->emac_port - 1), pvid, reg);
+	} else {
+		/* CPU port */
+		port_vlan_reg = &cpsw->host_port_regs->port_vlan;
+		writel(pvid, port_vlan_reg);
+	}
+}
+
+static int cpsw_port_vlan_add(struct cpsw_priv *priv, bool untag, bool pvid,
+			      u16 vid, struct net_device *orig_dev)
+{
+	bool cpu_port = netif_is_bridge_master(orig_dev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int unreg_mcast_mask = 0;
+	int reg_mcast_mask = 0;
+	int untag_mask = 0;
+	int port_mask;
+	int ret = 0;
+	u32 flags;
+
+	if (cpu_port) {
+		port_mask = BIT(HOST_PORT_NUM);
+		flags = orig_dev->flags;
+		unreg_mcast_mask = port_mask;
+	} else {
+		port_mask = BIT(priv->emac_port);
+		flags = priv->ndev->flags;
+	}
+
+	if (flags & IFF_MULTICAST)
+		reg_mcast_mask = port_mask;
+
+	if (untag)
+		untag_mask = port_mask;
+
+	ret = cpsw_ale_vlan_add_modify(cpsw->ale, vid, port_mask, untag_mask,
+				       reg_mcast_mask, unreg_mcast_mask);
+	if (ret) {
+		dev_err(priv->dev, "Unable to add vlan\n");
+		return ret;
+	}
+
+	if (!pvid)
+		return ret;
+
+	cpsw_set_pvid(priv, vid, 0, 0);
+
+	dev_dbg(priv->dev, "VID add: %u dev: %s port: %u\n", vid,
+		priv->ndev->name, priv->emac_port);
+
+	return ret;
+}
+
+static int cpsw_port_vlan_del(struct cpsw_priv *priv, u16 vid,
+			      struct net_device *orig_dev)
+{
+	bool cpu_port = netif_is_bridge_master(orig_dev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int port_mask;
+	int ret = 0;
+
+	if (cpu_port)
+		port_mask = BIT(HOST_PORT_NUM);
+	else
+		port_mask = BIT(priv->emac_port);
+
+	ret = cpsw_ale_vlan_del_modify(cpsw->ale, vid, port_mask);
+	if (ret != 0)
+		return ret;
+
+	/* We don't care for the return value here, error is returned only if
+	 * the unicast entry is not present
+	 */
+	cpsw_ale_del_ucast(cpsw->ale, priv->mac_addr,
+			   HOST_PORT_NUM, ALE_VLAN, vid);
+
+	if (vid == cpsw_get_pvid(priv))
+		cpsw_set_pvid(priv, 0, 0, 0);
+
+	/* We don't care for the return value here, error is returned only if
+	 * the multicast entry is not present
+	 */
+	cpsw_ale_del_mcast(cpsw->ale, priv->ndev->broadcast,
+			   0, ALE_VLAN, vid);
+
+	dev_dbg(priv->dev, "VID del: %u dev: %s port: %u\n", vid,
+		priv->ndev->name, priv->emac_port);
+
+	return ret;
+}
+
+static int cpsw_port_vlans_add(struct cpsw_priv *priv,
+			       const struct switchdev_obj_port_vlan *vlan,
+			       struct switchdev_trans *trans)
+{
+	bool untag = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
+	struct net_device *orig_dev = vlan->obj.orig_dev;
+	bool cpu_port = netif_is_bridge_master(orig_dev);
+	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
+	u16 vid;
+
+	if (cpu_port && !(vlan->flags & BRIDGE_VLAN_INFO_BRENTRY))
+		return 0;
+
+	if (switchdev_trans_ph_prepare(trans))
+		return 0;
+
+	for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
+		int err;
+
+		err = cpsw_port_vlan_add(priv, untag, pvid, vid, orig_dev);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int cpsw_port_vlans_del(struct cpsw_priv *priv,
+			       const struct switchdev_obj_port_vlan *vlan)
+
+{
+	struct net_device *orig_dev = vlan->obj.orig_dev;
+	u16 vid;
+
+	for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
+		int err;
+
+		err = cpsw_port_vlan_del(priv, vid, orig_dev);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int cpsw_port_mdb_add(struct cpsw_priv *priv,
+			     struct switchdev_obj_port_mdb *mdb,
+			     struct switchdev_trans *trans)
+
+{
+	struct net_device *orig_dev = mdb->obj.orig_dev;
+	bool cpu_port = netif_is_bridge_master(orig_dev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int port_mask;
+	int err;
+
+	if (switchdev_trans_ph_prepare(trans))
+		return 0;
+
+	if (cpu_port)
+		port_mask = BIT(HOST_PORT_NUM);
+	else
+		port_mask = BIT(priv->emac_port);
+
+	err = cpsw_ale_mcast_add_modify(cpsw->ale, mdb->addr, port_mask,
+					ALE_VLAN, mdb->vid, 0);
+
+	dev_dbg(priv->dev, "MDB add: %pM dev: %s vid %u port: %u\n", mdb->addr,
+		priv->ndev->name, mdb->vid, priv->emac_port);
+
+	return err;
+}
+
+static int cpsw_port_mdb_del(struct cpsw_priv *priv,
+			     struct switchdev_obj_port_mdb *mdb)
+
+{
+	struct net_device *orig_dev = mdb->obj.orig_dev;
+	bool cpu_port = netif_is_bridge_master(orig_dev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int del_mask;
+	int err;
+
+	if (cpu_port)
+		del_mask = BIT(HOST_PORT_NUM);
+	else
+		del_mask = BIT(priv->emac_port);
+	err = cpsw_ale_mcast_del_modify(cpsw->ale, mdb->addr, del_mask,
+					ALE_VLAN, mdb->vid);
+	dev_dbg(priv->dev, "MDB del: %pM dev: %s vid %u port: %u\n", mdb->addr,
+		priv->ndev->name, mdb->vid, priv->emac_port);
+
+	return err;
+}
+
+static int cpsw_port_obj_add(struct net_device *ndev,
+			     const struct switchdev_obj *obj,
+			     struct switchdev_trans *trans)
+{
+	struct switchdev_obj_port_vlan *vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
+	struct switchdev_obj_port_mdb *mdb = SWITCHDEV_OBJ_PORT_MDB(obj);
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	int err = 0;
+
+	switch (obj->id) {
+	case SWITCHDEV_OBJ_ID_PORT_VLAN:
+		err = cpsw_port_vlans_add(priv, vlan, trans);
+		break;
+	case SWITCHDEV_OBJ_ID_PORT_MDB:
+	case SWITCHDEV_OBJ_ID_HOST_MDB:
+		err = cpsw_port_mdb_add(priv, mdb, trans);
+		break;
+	default:
+		err = -EOPNOTSUPP;
+		break;
+	}
+
+	return err;
+}
+
+static int cpsw_port_obj_del(struct net_device *ndev,
+			     const struct switchdev_obj *obj)
+{
+	struct switchdev_obj_port_vlan *vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
+	struct switchdev_obj_port_mdb *mdb = SWITCHDEV_OBJ_PORT_MDB(obj);
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	int err = 0;
+
+	switch (obj->id) {
+	case SWITCHDEV_OBJ_ID_PORT_VLAN:
+		err = cpsw_port_vlans_del(priv, vlan);
+		break;
+	case SWITCHDEV_OBJ_ID_PORT_MDB:
+	case SWITCHDEV_OBJ_ID_HOST_MDB:
+		err = cpsw_port_mdb_del(priv, mdb);
+		break;
+	default:
+		err = -EOPNOTSUPP;
+		break;
+	}
+
+	return err;
+}
+
+static const struct switchdev_ops cpsw_port_switchdev_ops = {
+	.switchdev_port_attr_set	= cpsw_port_attr_set,
+	.switchdev_port_attr_get	= cpsw_port_attr_get,
+	.switchdev_port_obj_add		= cpsw_port_obj_add,
+	.switchdev_port_obj_del		= cpsw_port_obj_del,
+};
+
+void cpsw_port_switchdev_init(struct net_device *ndev)
+{
+	ndev->switchdev_ops = &cpsw_port_switchdev_ops;
+}
diff --git a/drivers/net/ethernet/ti/cpsw_switchdev.h b/drivers/net/ethernet/ti/cpsw_switchdev.h
new file mode 100644
index 0000000..4940462
--- /dev/null
+++ b/drivers/net/ethernet/ti/cpsw_switchdev.h
@@ -0,0 +1,4 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <net/switchdev.h>
+
+void cpsw_port_switchdev_init(struct net_device *ndev);