diff mbox series

[net-next,1/6] ethtool: Extend link modes settings uAPI with lanes

Message ID 20201010154119.3537085-2-idosch@idosch.org
State New
Headers show
Series Support setting lanes via ethtool | expand

Commit Message

Ido Schimmel Oct. 10, 2020, 3:41 p.m. UTC
From: Danielle Ratson <danieller@nvidia.com>

Currently, when auto negotiation is on, the user can advertise all the
linkmodes which correspond to a specific speed, but does not have a
similar selector for the number of lanes. This is significant when a
specific speed can be achieved using different number of lanes.  For
example, 2x50 or 4x25.

Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct
ethtool_link_settings' with lanes field in order to implement a new
lanes-selector that will enable the user to advertise a specific number
of lanes as well.

When auto negotiation is off, lanes parameter can be forced only if the
driver supports it. Add a capability bit in 'struct ethtool_ops' that
allows ethtool know if the driver can handle the lanes parameter when
auto negotiation is off, so if it does not, an error message will be
returned when trying to set lanes.

Example:

$ ethtool -s swp1 lanes 4
$ ethtool swp1
  Settings for swp1:
	Supported ports: [ FIBRE ]
        Supported link modes:   1000baseKX/Full
                                10000baseKR/Full
                                40000baseCR4/Full
				40000baseSR4/Full
				40000baseLR4/Full
                                25000baseCR/Full
                                25000baseSR/Full
				50000baseCR2/Full
                                100000baseSR4/Full
				100000baseCR4/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  40000baseCR4/Full
				40000baseSR4/Full
				40000baseLR4/Full
                                100000baseSR4/Full
				100000baseCR4/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: on
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Link detected: no

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
 Documentation/networking/ethtool-netlink.rst |  11 +-
 include/linux/ethtool.h                      |   4 +
 include/uapi/linux/ethtool.h                 |   8 +
 include/uapi/linux/ethtool_netlink.h         |   1 +
 net/ethtool/linkmodes.c                      | 227 +++++++++++--------
 net/ethtool/netlink.h                        |   2 +-
 6 files changed, 158 insertions(+), 95 deletions(-)

Comments

Jakub Kicinski Oct. 11, 2020, 10:37 p.m. UTC | #1
On Sat, 10 Oct 2020 18:41:14 +0300 Ido Schimmel wrote:
> From: Danielle Ratson <danieller@nvidia.com>
> 
> Currently, when auto negotiation is on, the user can advertise all the
> linkmodes which correspond to a specific speed, but does not have a
> similar selector for the number of lanes. This is significant when a
> specific speed can be achieved using different number of lanes.  For
> example, 2x50 or 4x25.
> 
> Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct
> ethtool_link_settings' with lanes field in order to implement a new
> lanes-selector that will enable the user to advertise a specific number
> of lanes as well.
> 
> When auto negotiation is off, lanes parameter can be forced only if the
> driver supports it. Add a capability bit in 'struct ethtool_ops' that
> allows ethtool know if the driver can handle the lanes parameter when
> auto negotiation is off, so if it does not, an error message will be
> returned when trying to set lanes.

What's the use for this in practical terms? Isn't the lane count
basically implied by the module that gets plugged in?

> +/* Lanes, 1, 2, 4 or 8. */
> +#define ETHTOOL_LANES_1			1
> +#define ETHTOOL_LANES_2			2
> +#define ETHTOOL_LANES_4			4
> +#define ETHTOOL_LANES_8			8

Not an extremely useful set of defines, not sure Michal would agree.

> +#define ETHTOOL_LANES_UNKNOWN		0

>  struct link_mode_info {
>  	int				speed;
> +	int				lanes;

why signed?

>  	u8				duplex;
>  };

> @@ -274,16 +277,17 @@ const struct nla_policy ethnl_linkmodes_set_policy[] = {
>  	[ETHTOOL_A_LINKMODES_SPEED]		= { .type = NLA_U32 },
>  	[ETHTOOL_A_LINKMODES_DUPLEX]		= { .type = NLA_U8 },
>  	[ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG]	= { .type = NLA_U8 },
> +	[ETHTOOL_A_LINKMODES_LANES]		= { .type = NLA_U32 },

NLA_POLICY_VALIDATE_FN(), not sure why the types for this
validation_type are limited.. Johannes, just an abundance of caution?

> +	} else if (!lsettings->autoneg) {
> +		/* If autoneg is off and lanes parameter is not passed from user,
> +		 * set the lanes parameter to UNKNOWN.
> +		 */
> +		ksettings->lanes = ETHTOOL_LANES_UNKNOWN;

you assume UNKNOWN is zero by doing !lanes in auto_linkmodes -
that's inconsistent.

> +	}
> +
>  	ret = ethnl_update_bitset(ksettings->link_modes.advertising,
>  				  __ETHTOOL_LINK_MODE_MASK_NBITS,
>  				  tb[ETHTOOL_A_LINKMODES_OURS], link_mode_names,
Danielle Ratson Oct. 12, 2020, 3:33 p.m. UTC | #2
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Monday, October 12, 2020 1:38 AM
> To: Ido Schimmel <idosch@idosch.org>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko
> <jiri@nvidia.com>; Danielle Ratson <danieller@nvidia.com>;
> andrew@lunn.ch; f.fainelli@gmail.com; mkubecek@suse.cz; mlxsw
> <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>;
> johannes@sipsolutions.net
> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI
> with lanes
> 
> On Sat, 10 Oct 2020 18:41:14 +0300 Ido Schimmel wrote:
> > From: Danielle Ratson <danieller@nvidia.com>
> >
> > Currently, when auto negotiation is on, the user can advertise all the
> > linkmodes which correspond to a specific speed, but does not have a
> > similar selector for the number of lanes. This is significant when a
> > specific speed can be achieved using different number of lanes.  For
> > example, 2x50 or 4x25.
> >
> > Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct
> > ethtool_link_settings' with lanes field in order to implement a new
> > lanes-selector that will enable the user to advertise a specific
> > number of lanes as well.
> >
> > When auto negotiation is off, lanes parameter can be forced only if
> > the driver supports it. Add a capability bit in 'struct ethtool_ops'
> > that allows ethtool know if the driver can handle the lanes parameter
> > when auto negotiation is off, so if it does not, an error message will
> > be returned when trying to set lanes.
> 
> What's the use for this in practical terms? Isn't the lane count basically
> implied by the module that gets plugged in?

The use is to enable the user to decide how to achieve a certain speed. 
For example, if he wants to get 100G and the port has 4 lanes, the speed can be achieved it using both 2 lanes of 50G and 4 lanes of 25G, as a port with 4 lanes width can work in 2 lanes mode with double speed each.
So, by specifying "lanes 2" he will achieve 100G using 2 lanes of 50G.
 
> 
> > +/* Lanes, 1, 2, 4 or 8. */
> > +#define ETHTOOL_LANES_1			1
> > +#define ETHTOOL_LANES_2			2
> > +#define ETHTOOL_LANES_4			4
> > +#define ETHTOOL_LANES_8			8
> 
> Not an extremely useful set of defines, not sure Michal would agree.
> 
> > +#define ETHTOOL_LANES_UNKNOWN		0
> 
> >  struct link_mode_info {
> >  	int				speed;
> > +	int				lanes;
> 
> why signed?

I have aligned it to the speed.

> 
> >  	u8				duplex;
> >  };
> 
> > @@ -274,16 +277,17 @@ const struct nla_policy
> ethnl_linkmodes_set_policy[] = {
> >  	[ETHTOOL_A_LINKMODES_SPEED]		= { .type = NLA_U32 },
> >  	[ETHTOOL_A_LINKMODES_DUPLEX]		= { .type = NLA_U8 },
> >  	[ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG]	= { .type =
> NLA_U8 },
> > +	[ETHTOOL_A_LINKMODES_LANES]		= { .type = NLA_U32 },
> 
> NLA_POLICY_VALIDATE_FN(), not sure why the types for this
> validation_type are limited.. Johannes, just an abundance of caution?
> 
> > +	} else if (!lsettings->autoneg) {
> > +		/* If autoneg is off and lanes parameter is not passed from
> user,
> > +		 * set the lanes parameter to UNKNOWN.
> > +		 */
> > +		ksettings->lanes = ETHTOOL_LANES_UNKNOWN;
> 
> you assume UNKNOWN is zero by doing !lanes in auto_linkmodes - that's
> inconsistent.

I didn't do !lanes... maybe you mean !req_lanes which is different parameter that specifies if the lanes parameter was passed from user space.

> 
> > +	}
> > +
> >  	ret = ethnl_update_bitset(ksettings->link_modes.advertising,
> >  				  __ETHTOOL_LINK_MODE_MASK_NBITS,
> >  				  tb[ETHTOOL_A_LINKMODES_OURS],
> link_mode_names,
Jakub Kicinski Oct. 12, 2020, 3:58 p.m. UTC | #3
On Mon, 12 Oct 2020 15:33:45 +0000 Danielle Ratson wrote:
> > What's the use for this in practical terms? Isn't the lane count basically
> > implied by the module that gets plugged in?  
> 
> The use is to enable the user to decide how to achieve a certain speed. 
> For example, if he wants to get 100G and the port has 4 lanes, the
> speed can be achieved it using both 2 lanes of 50G and 4 lanes of
> 25G, as a port with 4 lanes width can work in 2 lanes mode with
> double speed each. So, by specifying "lanes 2" he will achieve 100G
> using 2 lanes of 50G.

Can you give a concrete example of serdes capabilities of the port,
what SFP gets plugged in and what configuration user wants to select?
Michal Kubecek Oct. 12, 2020, 4:40 p.m. UTC | #4
On Mon, Oct 12, 2020 at 03:33:45PM +0000, Danielle Ratson wrote:
> > 
> > > +/* Lanes, 1, 2, 4 or 8. */
> > > +#define ETHTOOL_LANES_1			1
> > > +#define ETHTOOL_LANES_2			2
> > > +#define ETHTOOL_LANES_4			4
> > > +#define ETHTOOL_LANES_8			8
> > 
> > Not an extremely useful set of defines, not sure Michal would agree.

I don't see much use for them either. Such defines or enums make sense
when the numbers represent some values which have symbolic names but
these values only represent number of lanes so a number is IMHO
sufficient.


> > > +#define ETHTOOL_LANES_UNKNOWN		0
> > 
> > >  struct link_mode_info {
> > >  	int				speed;
> > > +	int				lanes;
> > 
> > why signed?
> 
> I have aligned it to the speed.

For speed, signed type was used mostly because SPEED_UNKNOWN is defined
as -1 (even if it's cast to u32 in most places), I don't think there is
a reason to use a signed type.

Michal
Michal Kubecek Oct. 12, 2020, 5:03 p.m. UTC | #5
On Sat, Oct 10, 2020 at 06:41:14PM +0300, Ido Schimmel wrote:
> From: Danielle Ratson <danieller@nvidia.com>
> 
> Currently, when auto negotiation is on, the user can advertise all the
> linkmodes which correspond to a specific speed, but does not have a
> similar selector for the number of lanes. This is significant when a
> specific speed can be achieved using different number of lanes.  For
> example, 2x50 or 4x25.
> 
> Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct
> ethtool_link_settings' with lanes field in order to implement a new
> lanes-selector that will enable the user to advertise a specific number
> of lanes as well.
> 
> When auto negotiation is off, lanes parameter can be forced only if the
> driver supports it. Add a capability bit in 'struct ethtool_ops' that
> allows ethtool know if the driver can handle the lanes parameter when
> auto negotiation is off, so if it does not, an error message will be
> returned when trying to set lanes.
> 
> Example:
> 
> $ ethtool -s swp1 lanes 4
> $ ethtool swp1
>   Settings for swp1:
> 	Supported ports: [ FIBRE ]
>         Supported link modes:   1000baseKX/Full
>                                 10000baseKR/Full
>                                 40000baseCR4/Full
> 				40000baseSR4/Full
> 				40000baseLR4/Full
>                                 25000baseCR/Full
>                                 25000baseSR/Full
> 				50000baseCR2/Full
>                                 100000baseSR4/Full
> 				100000baseCR4/Full
>         Supported pause frame use: Symmetric Receive-only
>         Supports auto-negotiation: Yes
>         Supported FEC modes: Not reported
>         Advertised link modes:  40000baseCR4/Full
> 				40000baseSR4/Full
> 				40000baseLR4/Full
>                                 100000baseSR4/Full
> 				100000baseCR4/Full
>         Advertised pause frame use: No
>         Advertised auto-negotiation: Yes
>         Advertised FEC modes: Not reported
>         Speed: Unknown!
>         Duplex: Unknown! (255)
>         Auto-negotiation: on
>         Port: Direct Attach Copper
>         PHYAD: 0
>         Transceiver: internal
>         Link detected: no
> 
> Signed-off-by: Danielle Ratson <danieller@nvidia.com>
> Reviewed-by: Jiri Pirko <jiri@nvidia.com>
> Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> ---
[...]
>  static const struct link_mode_info link_mode_params[] = {
> -	__DEFINE_LINK_MODE_PARAMS(10, T, Half),
> -	__DEFINE_LINK_MODE_PARAMS(10, T, Full),
> -	__DEFINE_LINK_MODE_PARAMS(100, T, Half),
> -	__DEFINE_LINK_MODE_PARAMS(100, T, Full),
> -	__DEFINE_LINK_MODE_PARAMS(1000, T, Half),
> -	__DEFINE_LINK_MODE_PARAMS(1000, T, Full),
> +	__DEFINE_LINK_MODE_PARAMS(10, T, 1, Half),
> +	__DEFINE_LINK_MODE_PARAMS(10, T, 1, Full),
> +	__DEFINE_LINK_MODE_PARAMS(100, T, 1, Half),
> +	__DEFINE_LINK_MODE_PARAMS(100, T, 1, Full),
> +	__DEFINE_LINK_MODE_PARAMS(1000, T, 1, Half),
> +	__DEFINE_LINK_MODE_PARAMS(1000, T, 1, Full),

Technically, 4 may be more appropriate for 1000base-T, 2500base-T,
5000base-T and 10000base-T but it's probably just a formality. While
there is 1000base-T1, I'm not sure if we can expect a device which would
support e.g. both 1000base-T and 1000base-T1 (or some other colliding
combination of modes).

Michal
Johannes Berg Oct. 12, 2020, 7:10 p.m. UTC | #6
Hi,

Sorry, somehow didn't see this until now.

> > +/* Lanes, 1, 2, 4 or 8. */
> > +#define ETHTOOL_LANES_1			1
> > +#define ETHTOOL_LANES_2			2
> > +#define ETHTOOL_LANES_4			4
> > +#define ETHTOOL_LANES_8			8
> 
> Not an extremely useful set of defines, not sure Michal would agree.
> 
> > +#define ETHTOOL_LANES_UNKNOWN		0
> >  struct link_mode_info {
> >  	int				speed;
> > +	int				lanes;
> 
> why signed?
> 
> >  	u8				duplex;
> >  };
> > @@ -274,16 +277,17 @@ const struct nla_policy ethnl_linkmodes_set_policy[] = {
> >  	[ETHTOOL_A_LINKMODES_SPEED]		= { .type = NLA_U32 },
> >  	[ETHTOOL_A_LINKMODES_DUPLEX]		= { .type = NLA_U8 },
> >  	[ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG]	= { .type = NLA_U8 },
> > +	[ETHTOOL_A_LINKMODES_LANES]		= { .type = NLA_U32 },
> 
> NLA_POLICY_VALIDATE_FN(), not sure why the types for this
> validation_type are limited.. Johannes, just an abundance of caution?

So let me see if I got this right - you're saying you'd like to use
NLA_POLICY_VALIDATE_FN() for an NLA_U32, to validate against the _LANES
being 1, 2, 4 or 8?

First of all, you _can_, no? I mean, it's limited by

#define NLA_ENSURE_NO_VALIDATION_PTR(tp)                \
        (__NLA_ENSURE(tp != NLA_BITFIELD32 &&           \
                      tp != NLA_REJECT &&               \
                      tp != NLA_NESTED &&               \
                      tp != NLA_NESTED_ARRAY) + tp)

and the reason is sort of encoded in that - the types listed here
already use the pointer *regardless of the validation_type*, so you
can't have a pointer to the function in the same union.

But not sure I understood :-)

johannes
Jakub Kicinski Oct. 12, 2020, 8:08 p.m. UTC | #7
On Mon, 12 Oct 2020 21:10:53 +0200 Johannes Berg wrote:
> Hi,
> 
> Sorry, somehow didn't see this until now.
> 
> > > +/* Lanes, 1, 2, 4 or 8. */
> > > +#define ETHTOOL_LANES_1			1
> > > +#define ETHTOOL_LANES_2			2
> > > +#define ETHTOOL_LANES_4			4
> > > +#define ETHTOOL_LANES_8			8  
> > 
> > Not an extremely useful set of defines, not sure Michal would agree.
> >   
> > > +#define ETHTOOL_LANES_UNKNOWN		0
> > >  struct link_mode_info {
> > >  	int				speed;
> > > +	int				lanes;  
> > 
> > why signed?
> >   
> > >  	u8				duplex;
> > >  };
> > > @@ -274,16 +277,17 @@ const struct nla_policy ethnl_linkmodes_set_policy[] = {
> > >  	[ETHTOOL_A_LINKMODES_SPEED]		= { .type = NLA_U32 },
> > >  	[ETHTOOL_A_LINKMODES_DUPLEX]		= { .type = NLA_U8 },
> > >  	[ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG]	= { .type = NLA_U8 },
> > > +	[ETHTOOL_A_LINKMODES_LANES]		= { .type = NLA_U32 },  
> > 
> > NLA_POLICY_VALIDATE_FN(), not sure why the types for this
> > validation_type are limited.. Johannes, just an abundance of caution?  
> 
> So let me see if I got this right - you're saying you'd like to use
> NLA_POLICY_VALIDATE_FN() for an NLA_U32, to validate against the _LANES
> being 1, 2, 4 or 8?
> 
> First of all, you _can_, no? I mean, it's limited by
> 
> #define NLA_ENSURE_NO_VALIDATION_PTR(tp)                \
>         (__NLA_ENSURE(tp != NLA_BITFIELD32 &&           \
>                       tp != NLA_REJECT &&               \
>                       tp != NLA_NESTED &&               \
>                       tp != NLA_NESTED_ARRAY) + tp)
> 
> and the reason is sort of encoded in that - the types listed here
> already use the pointer *regardless of the validation_type*, so you
> can't have a pointer to the function in the same union.
> 
> But not sure I understood :-)

Yes, you're right. Sorry for the noise, one day I'll learn to read..
Danielle Ratson Oct. 13, 2020, 2:29 p.m. UTC | #8
> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Monday, October 12, 2020 6:58 PM
> To: Danielle Ratson <danieller@nvidia.com>
> Cc: Ido Schimmel <idosch@idosch.org>; netdev@vger.kernel.org;
> davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; andrew@lunn.ch;
> f.fainelli@gmail.com; mkubecek@suse.cz; mlxsw <mlxsw@nvidia.com>; Ido
> Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net
> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI
> with lanes
> 
> On Mon, 12 Oct 2020 15:33:45 +0000 Danielle Ratson wrote:
> > > What's the use for this in practical terms? Isn't the lane count
> > > basically implied by the module that gets plugged in?
> >
> > The use is to enable the user to decide how to achieve a certain speed.
> > For example, if he wants to get 100G and the port has 4 lanes, the
> > speed can be achieved it using both 2 lanes of 50G and 4 lanes of 25G,
> > as a port with 4 lanes width can work in 2 lanes mode with double
> > speed each. So, by specifying "lanes 2" he will achieve 100G using 2
> > lanes of 50G.
> 
> Can you give a concrete example of serdes capabilities of the port, what SFP
> gets plugged in and what configuration user wants to select?

Example:
- swp1 is a 200G port with 4 lanes.
- QSFP28 is plugged in.
- The user wants to select configuration of 100G speed using 2 lanes, 50G each.

$ ethtool swp1
Settings for swp1:
        Supported ports: [ FIBRE         Backplane ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                50000baseKR/Full
                                50000baseSR/Full
                                50000baseCR/Full
                                50000baseLR_ER_FR/Full
                                50000baseDR/Full
                                100000baseKR2/Full
                                100000baseSR2/Full
                                100000baseCR2/Full
                                100000baseLR2_ER2_FR2/Full
                                100000baseDR2/Full
                                200000baseKR4/Full
                                200000baseSR4/Full
                                200000baseLR4_ER4_FR4/Full
                                200000baseDR4/Full
                                200000baseCR4/Full
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                50000baseKR/Full
                                50000baseSR/Full
                                50000baseCR/Full
                                50000baseLR_ER_FR/Full
                                50000baseDR/Full
                                100000baseKR2/Full
                                100000baseSR2/Full
                                100000baseCR2/Full
                                100000baseLR2_ER2_FR2/Full
                                100000baseDR2/Full
                                200000baseKR4/Full
                                200000baseSR4/Full
                                200000baseLR4_ER4_FR4/Full
                                200000baseDR4/Full
                                200000baseCR4/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Lanes: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: on
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Link detected: no

$ ethtool -s swp1 speed 100000 lanes 2
$ ethtool swp1
Settings for swp1:
        Supported ports: [ FIBRE         Backplane ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                50000baseKR/Full
                                50000baseSR/Full
                                50000baseCR/Full
                                50000baseLR_ER_FR/Full
                                50000baseDR/Full
                                100000baseKR2/Full
                                100000baseSR2/Full
                                100000baseCR2/Full
                                100000baseLR2_ER2_FR2/Full
                                100000baseDR2/Full
                                200000baseKR4/Full
                                200000baseSR4/Full
                                200000baseLR4_ER4_FR4/Full
                                200000baseDR4/Full
                                200000baseCR4/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  100000baseKR2/Full
                                100000baseSR2/Full
                                100000baseCR2/Full
                                100000baseLR2_ER2_FR2/Full
                                100000baseDR2/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
Jakub Kicinski Oct. 13, 2020, 3:43 p.m. UTC | #9
On Tue, 13 Oct 2020 14:29:29 +0000 Danielle Ratson wrote:
> > On Mon, 12 Oct 2020 15:33:45 +0000 Danielle Ratson wrote:  
> > > > What's the use for this in practical terms? Isn't the lane count
> > > > basically implied by the module that gets plugged in?  
> > >
> > > The use is to enable the user to decide how to achieve a certain speed.
> > > For example, if he wants to get 100G and the port has 4 lanes, the
> > > speed can be achieved it using both 2 lanes of 50G and 4 lanes of 25G,
> > > as a port with 4 lanes width can work in 2 lanes mode with double
> > > speed each. So, by specifying "lanes 2" he will achieve 100G using 2
> > > lanes of 50G.  
> > 
> > Can you give a concrete example of serdes capabilities of the port, what SFP
> > gets plugged in and what configuration user wants to select?  
> 
> Example:
> - swp1 is a 200G port with 4 lanes.
> - QSFP28 is plugged in.
> - The user wants to select configuration of 100G speed using 2 lanes, 50G each.

I mean.. sounds quite contrived to me.

But I'm not opposed if others are fine with the change.
Andrew Lunn Oct. 16, 2020, 10:15 p.m. UTC | #10
> Example:
> - swp1 is a 200G port with 4 lanes.
> - QSFP28 is plugged in.
> - The user wants to select configuration of 100G speed using 2 lanes, 50G each.
> 
> $ ethtool swp1
> Settings for swp1:
>         Supported ports: [ FIBRE         Backplane ]
>         Supported link modes:   1000baseT/Full
>                                 10000baseT/Full
>                                 1000baseKX/Full
>                                 10000baseKR/Full
>                                 10000baseR_FEC
>                                 40000baseKR4/Full
>                                 40000baseCR4/Full
>                                 40000baseSR4/Full
>                                 40000baseLR4/Full
>                                 25000baseCR/Full
>                                 25000baseKR/Full
>                                 25000baseSR/Full
>                                 50000baseCR2/Full
>                                 50000baseKR2/Full
>                                 100000baseKR4/Full
>                                 100000baseSR4/Full
>                                 100000baseCR4/Full
>                                 100000baseLR4_ER4/Full
>                                 50000baseSR2/Full
>                                 10000baseCR/Full
>                                 10000baseSR/Full
>                                 10000baseLR/Full
>                                 10000baseER/Full
>                                 50000baseKR/Full
>                                 50000baseSR/Full
>                                 50000baseCR/Full
>                                 50000baseLR_ER_FR/Full
>                                 50000baseDR/Full

>                                 100000baseKR2/Full
>                                 100000baseSR2/Full
>                                 100000baseCR2/Full
>                                 100000baseLR2_ER2_FR2/Full
>                                 100000baseDR2/Full

I'm not sure i fully understand all these different link modes, but i
thought these 5 are all 100G using 2 lanes? So why cannot the user
simply do

ethtool -s swp1 advertise 100000baseKR2/Full

and the driver can figure out it needs to use two lanes at 50G?

    Andrew
Danielle Ratson Oct. 19, 2020, 7:19 a.m. UTC | #11
> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Saturday, October 17, 2020 1:16 AM
> To: Danielle Ratson <danieller@nvidia.com>
> Cc: Jakub Kicinski <kuba@kernel.org>; Ido Schimmel <idosch@idosch.org>;
> netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko
> <jiri@nvidia.com>; f.fainelli@gmail.com; mkubecek@suse.cz; mlxsw
> <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>;
> johannes@sipsolutions.net
> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI
> with lanes
> 
> > Example:
> > - swp1 is a 200G port with 4 lanes.
> > - QSFP28 is plugged in.
> > - The user wants to select configuration of 100G speed using 2 lanes, 50G
> each.
> >
> > $ ethtool swp1
> > Settings for swp1:
> >         Supported ports: [ FIBRE         Backplane ]
> >         Supported link modes:   1000baseT/Full
> >                                 10000baseT/Full
> >                                 1000baseKX/Full
> >                                 10000baseKR/Full
> >                                 10000baseR_FEC
> >                                 40000baseKR4/Full
> >                                 40000baseCR4/Full
> >                                 40000baseSR4/Full
> >                                 40000baseLR4/Full
> >                                 25000baseCR/Full
> >                                 25000baseKR/Full
> >                                 25000baseSR/Full
> >                                 50000baseCR2/Full
> >                                 50000baseKR2/Full
> >                                 100000baseKR4/Full
> >                                 100000baseSR4/Full
> >                                 100000baseCR4/Full
> >                                 100000baseLR4_ER4/Full
> >                                 50000baseSR2/Full
> >                                 10000baseCR/Full
> >                                 10000baseSR/Full
> >                                 10000baseLR/Full
> >                                 10000baseER/Full
> >                                 50000baseKR/Full
> >                                 50000baseSR/Full
> >                                 50000baseCR/Full
> >                                 50000baseLR_ER_FR/Full
> >                                 50000baseDR/Full
> 
> >                                 100000baseKR2/Full
> >                                 100000baseSR2/Full
> >                                 100000baseCR2/Full
> >                                 100000baseLR2_ER2_FR2/Full
> >                                 100000baseDR2/Full
> 
> I'm not sure i fully understand all these different link modes, but i thought
> these 5 are all 100G using 2 lanes? So why cannot the user simply do
> 
> ethtool -s swp1 advertise 100000baseKR2/Full
> 
> and the driver can figure out it needs to use two lanes at 50G?
> 
>     Andrew

Hi Andrew,

Thanks for the feedback.

I guess you mean " ethtool -s swp1 advertise 100000baseKR2/Full on".

First, the idea might work but only for auto negotiation mode, whereas the lanes parameter is a solution for both.

Second, the command as you have suggested it, wouldn't change anything in the current situation as I checked. We can disable all the others and leave only the one we want but the command doesn't filter the other link modes but it just turns the mentioned link modes up if they aren't. However, the lanes parameter is a selector, which make it much more user friendly in my opinion.

Also, we can't turn only one of them up. But you have to set for example:

$ ethtool -s swp1 advertise 100000baseKR2/Full on 100000baseSR2/Full on 100000baseCR2/Full on 100000baseLR2_ER2_FR2/Full on 100000baseDR2/Full on

Am I missing something?
Michal Kubecek Oct. 19, 2020, 11:04 a.m. UTC | #12
On Mon, Oct 19, 2020 at 07:19:34AM +0000, Danielle Ratson wrote:
> > -----Original Message-----
> > From: Andrew Lunn <andrew@lunn.ch>
> > Sent: Saturday, October 17, 2020 1:16 AM
> > 
> > I'm not sure i fully understand all these different link modes, but
> > i thought these 5 are all 100G using 2 lanes? So why cannot the user
> > simply do
> > 
> > ethtool -s swp1 advertise 100000baseKR2/Full
> > 
> > and the driver can figure out it needs to use two lanes at 50G?
> > 
> >     Andrew
> 
> Hi Andrew,
> 
> Thanks for the feedback.
> 
> I guess you mean " ethtool -s swp1 advertise 100000baseKR2/Full on".
> 
> First, the idea might work but only for auto negotiation mode, whereas
> the lanes parameter is a solution for both.
> 
> Second, the command as you have suggested it, wouldn't change anything
> in the current situation as I checked. We can disable all the others
> and leave only the one we want but the command doesn't filter the
> other link modes but it just turns the mentioned link modes up if they
> aren't. However, the lanes parameter is a selector, which make it much
> more user friendly in my opinion.

It would be quite easy to extend the ethtool command line parser to
allow also

  ethtool -s <dev> advertise <mode> ...

in addition to already supported

  ethtool -s <dev> advertise <mask>
  ethtool -s <dev> advertise <mask>/<mask>
  ethtool -s { <mode> { on | off } } ...

Parser converting simple list of values into a maskless bitset is
already there so it would be only matter of checking if there are at
least two arguments and second is "on" or "off" and using corresponding
parser. I think it would be useful independently of this series.

> Also, we can't turn only one of them up. But you have to set for
> example:
> 
> $ ethtool -s swp1 advertise 100000baseKR2/Full on 100000baseSR2/Full on 100000baseCR2/Full on 100000baseLR2_ER2_FR2/Full on 100000baseDR2/Full on
> 
> Am I missing something?

IIUC Jakub's concern is rather about real life need for such selectors,
i.e. how realistic is "I want a(ny) 100Gb/s mode with two lanes" as an
actual user need; if it wouldn't be mostly (or only) used as a quick way
to distinguish between two supported 100Gb/s modes.

IMHO if we go this way, we should consider going all the way, i.e. allow
also selecting by the remaining part of the mode ("media type", e.g.
"LR", not sure what the official name is) and, more important, get full
information about link mode in use from driver (we only get speed and
duplex at the moment). But that would require changes in the
get_linksettings() interface and drivers.

Michal
Jiri Pirko Oct. 19, 2020, 12:24 p.m. UTC | #13
Sat, Oct 17, 2020 at 12:15:53AM CEST, andrew@lunn.ch wrote:
>> Example:
>> - swp1 is a 200G port with 4 lanes.
>> - QSFP28 is plugged in.
>> - The user wants to select configuration of 100G speed using 2 lanes, 50G each.
>> 
>> $ ethtool swp1
>> Settings for swp1:
>>         Supported ports: [ FIBRE         Backplane ]
>>         Supported link modes:   1000baseT/Full
>>                                 10000baseT/Full
>>                                 1000baseKX/Full
>>                                 10000baseKR/Full
>>                                 10000baseR_FEC
>>                                 40000baseKR4/Full
>>                                 40000baseCR4/Full
>>                                 40000baseSR4/Full
>>                                 40000baseLR4/Full
>>                                 25000baseCR/Full
>>                                 25000baseKR/Full
>>                                 25000baseSR/Full
>>                                 50000baseCR2/Full
>>                                 50000baseKR2/Full
>>                                 100000baseKR4/Full
>>                                 100000baseSR4/Full
>>                                 100000baseCR4/Full
>>                                 100000baseLR4_ER4/Full
>>                                 50000baseSR2/Full
>>                                 10000baseCR/Full
>>                                 10000baseSR/Full
>>                                 10000baseLR/Full
>>                                 10000baseER/Full
>>                                 50000baseKR/Full
>>                                 50000baseSR/Full
>>                                 50000baseCR/Full
>>                                 50000baseLR_ER_FR/Full
>>                                 50000baseDR/Full
>
>>                                 100000baseKR2/Full
>>                                 100000baseSR2/Full
>>                                 100000baseCR2/Full
>>                                 100000baseLR2_ER2_FR2/Full
>>                                 100000baseDR2/Full
>
>I'm not sure i fully understand all these different link modes, but i
>thought these 5 are all 100G using 2 lanes? So why cannot the user
>simply do
>
>ethtool -s swp1 advertise 100000baseKR2/Full
>
>and the driver can figure out it needs to use two lanes at 50G?

100000baseKR2 is 2 lanes. No need to figure anything out. What do you
mean by that?

>
>    Andrew
Jiri Pirko Oct. 19, 2020, 12:26 p.m. UTC | #14
Mon, Oct 19, 2020 at 01:04:22PM CEST, mkubecek@suse.cz wrote:
>On Mon, Oct 19, 2020 at 07:19:34AM +0000, Danielle Ratson wrote:
>> > -----Original Message-----
>> > From: Andrew Lunn <andrew@lunn.ch>
>> > Sent: Saturday, October 17, 2020 1:16 AM
>> > 
>> > I'm not sure i fully understand all these different link modes, but
>> > i thought these 5 are all 100G using 2 lanes? So why cannot the user
>> > simply do
>> > 
>> > ethtool -s swp1 advertise 100000baseKR2/Full
>> > 
>> > and the driver can figure out it needs to use two lanes at 50G?
>> > 
>> >     Andrew
>> 
>> Hi Andrew,
>> 
>> Thanks for the feedback.
>> 
>> I guess you mean " ethtool -s swp1 advertise 100000baseKR2/Full on".
>> 
>> First, the idea might work but only for auto negotiation mode, whereas
>> the lanes parameter is a solution for both.
>> 
>> Second, the command as you have suggested it, wouldn't change anything
>> in the current situation as I checked. We can disable all the others
>> and leave only the one we want but the command doesn't filter the
>> other link modes but it just turns the mentioned link modes up if they
>> aren't. However, the lanes parameter is a selector, which make it much
>> more user friendly in my opinion.
>
>It would be quite easy to extend the ethtool command line parser to
>allow also
>
>  ethtool -s <dev> advertise <mode> ...
>
>in addition to already supported
>
>  ethtool -s <dev> advertise <mask>
>  ethtool -s <dev> advertise <mask>/<mask>
>  ethtool -s { <mode> { on | off } } ...
>
>Parser converting simple list of values into a maskless bitset is
>already there so it would be only matter of checking if there are at
>least two arguments and second is "on" or "off" and using corresponding
>parser. I think it would be useful independently of this series.

Understood. So basically you will pass some selectors on cmdline and the
uapi would stay intact.
How do you imagine the specific lane number selection should look like
on the cmdline?



>
>> Also, we can't turn only one of them up. But you have to set for
>> example:
>> 
>> $ ethtool -s swp1 advertise 100000baseKR2/Full on 100000baseSR2/Full on 100000baseCR2/Full on 100000baseLR2_ER2_FR2/Full on 100000baseDR2/Full on
>> 
>> Am I missing something?
>
>IIUC Jakub's concern is rather about real life need for such selectors,
>i.e. how realistic is "I want a(ny) 100Gb/s mode with two lanes" as an
>actual user need; if it wouldn't be mostly (or only) used as a quick way
>to distinguish between two supported 100Gb/s modes.
>
>IMHO if we go this way, we should consider going all the way, i.e. allow
>also selecting by the remaining part of the mode ("media type", e.g.
>"LR", not sure what the official name is) and, more important, get full
>information about link mode in use from driver (we only get speed and
>duplex at the moment). But that would require changes in the
>get_linksettings() interface and drivers.
>
>Michal
Andrew Lunn Oct. 19, 2020, 12:38 p.m. UTC | #15
> >>                                 100000baseKR2/Full
> >>                                 100000baseSR2/Full
> >>                                 100000baseCR2/Full
> >>                                 100000baseLR2_ER2_FR2/Full
> >>                                 100000baseDR2/Full
> >
> >I'm not sure i fully understand all these different link modes, but i
> >thought these 5 are all 100G using 2 lanes? So why cannot the user
> >simply do
> >
> >ethtool -s swp1 advertise 100000baseKR2/Full
> >
> >and the driver can figure out it needs to use two lanes at 50G?
> 
> 100000baseKR2 is 2 lanes. No need to figure anything out. What do you
> mean by that?

I'm just thinking, rather than add a new UAPI for the number of lanes,
why not use one we already have, if the number of lanes is implied by
the link mode.

    Andrew
Michal Kubecek Oct. 19, 2020, 1:24 p.m. UTC | #16
On Mon, Oct 19, 2020 at 02:26:43PM +0200, Jiri Pirko wrote:
> Mon, Oct 19, 2020 at 01:04:22PM CEST, mkubecek@suse.cz wrote:
> >
> >It would be quite easy to extend the ethtool command line parser to
> >allow also
> >
> >  ethtool -s <dev> advertise <mode> ...
> >
> >in addition to already supported
> >
> >  ethtool -s <dev> advertise <mask>
> >  ethtool -s <dev> advertise <mask>/<mask>
> >  ethtool -s { <mode> { on | off } } ...

This should have been

  ethtool -s <dev> advertise { <mode> { on | off } } ...

> >Parser converting simple list of values into a maskless bitset is
> >already there so it would be only matter of checking if there are at
> >least two arguments and second is "on" or "off" and using corresponding
> >parser. I think it would be useful independently of this series.
> 
> Understood. So basically you will pass some selectors on cmdline and the
> uapi would stay intact.
> How do you imagine the specific lane number selection should look like
> on the cmdline?

As I said, I meant the extension suggested in my mail as independent of
what this series is about. For lanes count selector, I find proposed

  ethtool -s <dev> ... lanes <lanes_num> ...

the most natural.
Danielle Ratson Oct. 20, 2020, 7:39 a.m. UTC | #17
> -----Original Message-----
> From: Michal Kubecek <mkubecek@suse.cz>
> Sent: Monday, October 19, 2020 4:25 PM
> To: Jiri Pirko <jiri@resnulli.us>
> Cc: Danielle Ratson <danieller@nvidia.com>; Andrew Lunn
> <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel
> <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri
> Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw <mlxsw@nvidia.com>;
> Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net
> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI
> with lanes
> 
> On Mon, Oct 19, 2020 at 02:26:43PM +0200, Jiri Pirko wrote:
> > Mon, Oct 19, 2020 at 01:04:22PM CEST, mkubecek@suse.cz wrote:
> > >
> > >It would be quite easy to extend the ethtool command line parser to
> > >allow also
> > >
> > >  ethtool -s <dev> advertise <mode> ...
> > >
> > >in addition to already supported
> > >
> > >  ethtool -s <dev> advertise <mask>
> > >  ethtool -s <dev> advertise <mask>/<mask>  ethtool -s { <mode> { on
> > > | off } } ...
> 
> This should have been
> 
>   ethtool -s <dev> advertise { <mode> { on | off } } ...
> 
> > >Parser converting simple list of values into a maskless bitset is
> > >already there so it would be only matter of checking if there are at
> > >least two arguments and second is "on" or "off" and using
> > >corresponding parser. I think it would be useful independently of this
> series.
> >
> > Understood. So basically you will pass some selectors on cmdline and
> > the uapi would stay intact.
> > How do you imagine the specific lane number selection should look like
> > on the cmdline?
> 
> As I said, I meant the extension suggested in my mail as independent of what
> this series is about. For lanes count selector, I find proposed
> 
>   ethtool -s <dev> ... lanes <lanes_num> ...
> 
> the most natural.
> 
> From purely syntactic/semantic point of view, there are three types of
> requests:
> 
>   (1) enable specific set of modes, disable the rest
>   (2) enable/disable specific modes, leave the rest as they are
>   (3) enable modes matching a condition (and disable the rest)
> 
> What I proposed was to allow the use symbolic names instead of masks
> (which are getting more and more awful with each new mode) also for (1),
> like they can already be used for (2).
> 
> The lanes selector is an extension of (3) which I would prefer not to mix with
> (1) or (2) within one command line, i.e. either "advertise" or "speed / duplex
> / lanes".
> 
> IIUC Jakub's and Andrew's comments were not so much about the syntax
> and semantic (which is quite clear) but rather about the question if the
> requests like "advertise exactly the modes with (100Gb/s speed and) two
> lanes" would really address a real life need and wouldn't be more often used
> as shortcuts for "advertise 100000baseKR2/Full". (On the other hand, I
> suspect existing speed and duplex selectors are often used the same way.)
> 
> Michal

So, do you want to change the current approach somehow or we are good to go with this one, keeping in mind the future extension you have suggested? 

Thanks.
Michal Kubecek Oct. 21, 2020, 7:08 a.m. UTC | #18
On Tue, Oct 20, 2020 at 07:39:13AM +0000, Danielle Ratson wrote:
> > -----Original Message-----

> > From: Michal Kubecek <mkubecek@suse.cz>

> > Sent: Monday, October 19, 2020 4:25 PM

> > 

> > As I said, I meant the extension suggested in my mail as independent of what

> > this series is about. For lanes count selector, I find proposed

> > 

> >   ethtool -s <dev> ... lanes <lanes_num> ...

> > 

> > the most natural.

> > 

> > From purely syntactic/semantic point of view, there are three types of

> > requests:

> > 

> >   (1) enable specific set of modes, disable the rest

> >   (2) enable/disable specific modes, leave the rest as they are

> >   (3) enable modes matching a condition (and disable the rest)

> > 

> > What I proposed was to allow the use symbolic names instead of masks

> > (which are getting more and more awful with each new mode) also for (1),

> > like they can already be used for (2).

> > 

> > The lanes selector is an extension of (3) which I would prefer not to mix with

> > (1) or (2) within one command line, i.e. either "advertise" or "speed / duplex

> > / lanes".

> > 

> > IIUC Jakub's and Andrew's comments were not so much about the syntax

> > and semantic (which is quite clear) but rather about the question if the

> > requests like "advertise exactly the modes with (100Gb/s speed and) two

> > lanes" would really address a real life need and wouldn't be more often used

> > as shortcuts for "advertise 100000baseKR2/Full". (On the other hand, I

> > suspect existing speed and duplex selectors are often used the same way.)

> > 

> > Michal

> 

> So, do you want to change the current approach somehow or we are good

> to go with this one, keeping in mind the future extension you have

> suggested? 


As far as I'm concerned, it makes sense as it is. The only thing I'm not
happy about is ETHTOOL_A_LINKMODES_LANES being a "write only" attribute
(unlike _SPEED and _DUPLEX) but being able to query this information
would require extensive changes far beyond the scope of this series.

Michal
Danielle Ratson Oct. 21, 2020, 7:20 a.m. UTC | #19
> -----Original Message-----

> From: Michal Kubecek <mkubecek@suse.cz>

> Sent: Wednesday, October 21, 2020 10:08 AM

> To: Danielle Ratson <danieller@nvidia.com>

> Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub

> Kicinski <kuba@kernel.org>; Ido Schimmel <idosch@idosch.org>;

> netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko

> <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw <mlxsw@nvidia.com>; Ido

> Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net

> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI

> with lanes

> 

> On Tue, Oct 20, 2020 at 07:39:13AM +0000, Danielle Ratson wrote:

> > > -----Original Message-----

> > > From: Michal Kubecek <mkubecek@suse.cz>

> > > Sent: Monday, October 19, 2020 4:25 PM

> > >

> > > As I said, I meant the extension suggested in my mail as independent

> > > of what this series is about. For lanes count selector, I find

> > > proposed

> > >

> > >   ethtool -s <dev> ... lanes <lanes_num> ...

> > >

> > > the most natural.

> > >

> > > From purely syntactic/semantic point of view, there are three types

> > > of

> > > requests:

> > >

> > >   (1) enable specific set of modes, disable the rest

> > >   (2) enable/disable specific modes, leave the rest as they are

> > >   (3) enable modes matching a condition (and disable the rest)

> > >

> > > What I proposed was to allow the use symbolic names instead of masks

> > > (which are getting more and more awful with each new mode) also for

> > > (1), like they can already be used for (2).

> > >

> > > The lanes selector is an extension of (3) which I would prefer not

> > > to mix with

> > > (1) or (2) within one command line, i.e. either "advertise" or

> > > "speed / duplex / lanes".

> > >

> > > IIUC Jakub's and Andrew's comments were not so much about the syntax

> > > and semantic (which is quite clear) but rather about the question if

> > > the requests like "advertise exactly the modes with (100Gb/s speed

> > > and) two lanes" would really address a real life need and wouldn't

> > > be more often used as shortcuts for "advertise 100000baseKR2/Full".

> > > (On the other hand, I suspect existing speed and duplex selectors

> > > are often used the same way.)

> > >

> > > Michal

> >

> > So, do you want to change the current approach somehow or we are good

> > to go with this one, keeping in mind the future extension you have

> > suggested?

> 

> As far as I'm concerned, it makes sense as it is. The only thing I'm not happy

> about is ETHTOOL_A_LINKMODES_LANES being a "write only" attribute

> (unlike _SPEED and _DUPLEX) but being able to query this information would

> require extensive changes far beyond the scope of this series.

> 

> Michal


If I understood you correctly, patch #5 does that query, isn't it? "mlxsw: ethtool: Expose the number of lanes in use"
Michal Kubecek Oct. 21, 2020, 8:47 a.m. UTC | #20
On Wed, Oct 21, 2020 at 07:20:22AM +0000, Danielle Ratson wrote:
> > -----Original Message-----

> > From: Michal Kubecek <mkubecek@suse.cz>

> > Sent: Wednesday, October 21, 2020 10:08 AM

> > 

> > On Tue, Oct 20, 2020 at 07:39:13AM +0000, Danielle Ratson wrote:

> > > > -----Original Message-----

> > > > From: Michal Kubecek <mkubecek@suse.cz>

> > > > Sent: Monday, October 19, 2020 4:25 PM

> > > >

> > > > As I said, I meant the extension suggested in my mail as independent

> > > > of what this series is about. For lanes count selector, I find

> > > > proposed

> > > >

> > > >   ethtool -s <dev> ... lanes <lanes_num> ...

> > > >

> > > > the most natural.

> > > >

> > > > From purely syntactic/semantic point of view, there are three types

> > > > of

> > > > requests:

> > > >

> > > >   (1) enable specific set of modes, disable the rest

> > > >   (2) enable/disable specific modes, leave the rest as they are

> > > >   (3) enable modes matching a condition (and disable the rest)

> > > >

> > > > What I proposed was to allow the use symbolic names instead of masks

> > > > (which are getting more and more awful with each new mode) also for

> > > > (1), like they can already be used for (2).

> > > >

> > > > The lanes selector is an extension of (3) which I would prefer not

> > > > to mix with

> > > > (1) or (2) within one command line, i.e. either "advertise" or

> > > > "speed / duplex / lanes".

> > > >

> > > > IIUC Jakub's and Andrew's comments were not so much about the syntax

> > > > and semantic (which is quite clear) but rather about the question if

> > > > the requests like "advertise exactly the modes with (100Gb/s speed

> > > > and) two lanes" would really address a real life need and wouldn't

> > > > be more often used as shortcuts for "advertise 100000baseKR2/Full".

> > > > (On the other hand, I suspect existing speed and duplex selectors

> > > > are often used the same way.)

> > > >

> > > > Michal

> > >

> > > So, do you want to change the current approach somehow or we are good

> > > to go with this one, keeping in mind the future extension you have

> > > suggested?

> > 

> > As far as I'm concerned, it makes sense as it is. The only thing I'm not happy

> > about is ETHTOOL_A_LINKMODES_LANES being a "write only" attribute

> > (unlike _SPEED and _DUPLEX) but being able to query this information would

> > require extensive changes far beyond the scope of this series.

> 

> If I understood you correctly, patch #5 does that query, isn't it?

> "mlxsw: ethtool: Expose the number of lanes in use"


Ah, right, it does. But as you extend struct ethtool_link_ksettings and
drivers will need to be updated to provide this information, wouldn't it
be more useful to let the driver provide link mode in use instead (and
derive number of lanes from it)?

Michal
Danielle Ratson Oct. 22, 2020, 6:15 a.m. UTC | #21
> -----Original Message-----
> From: Michal Kubecek <mkubecek@suse.cz>
> Sent: Wednesday, October 21, 2020 11:48 AM
> To: Danielle Ratson <danieller@nvidia.com>
> Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub
> Kicinski <kuba@kernel.org>; Ido Schimmel <idosch@idosch.org>;
> netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko
> <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw <mlxsw@nvidia.com>; Ido
> Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net
> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI
> with lanes
> 
> On Wed, Oct 21, 2020 at 07:20:22AM +0000, Danielle Ratson wrote:
> > > -----Original Message-----
> > > From: Michal Kubecek <mkubecek@suse.cz>
> > > Sent: Wednesday, October 21, 2020 10:08 AM
> > >
> > > On Tue, Oct 20, 2020 at 07:39:13AM +0000, Danielle Ratson wrote:
> > > > > -----Original Message-----
> > > > > From: Michal Kubecek <mkubecek@suse.cz>
> > > > > Sent: Monday, October 19, 2020 4:25 PM
> > > > >
> > > > > As I said, I meant the extension suggested in my mail as
> > > > > independent of what this series is about. For lanes count
> > > > > selector, I find proposed
> > > > >
> > > > >   ethtool -s <dev> ... lanes <lanes_num> ...
> > > > >
> > > > > the most natural.
> > > > >
> > > > > From purely syntactic/semantic point of view, there are three
> > > > > types of
> > > > > requests:
> > > > >
> > > > >   (1) enable specific set of modes, disable the rest
> > > > >   (2) enable/disable specific modes, leave the rest as they are
> > > > >   (3) enable modes matching a condition (and disable the rest)
> > > > >
> > > > > What I proposed was to allow the use symbolic names instead of
> > > > > masks (which are getting more and more awful with each new mode)
> > > > > also for (1), like they can already be used for (2).
> > > > >
> > > > > The lanes selector is an extension of (3) which I would prefer
> > > > > not to mix with
> > > > > (1) or (2) within one command line, i.e. either "advertise" or
> > > > > "speed / duplex / lanes".
> > > > >
> > > > > IIUC Jakub's and Andrew's comments were not so much about the
> > > > > syntax and semantic (which is quite clear) but rather about the
> > > > > question if the requests like "advertise exactly the modes with
> > > > > (100Gb/s speed
> > > > > and) two lanes" would really address a real life need and
> > > > > wouldn't be more often used as shortcuts for "advertise
> 100000baseKR2/Full".
> > > > > (On the other hand, I suspect existing speed and duplex
> > > > > selectors are often used the same way.)
> > > > >
> > > > > Michal
> > > >
> > > > So, do you want to change the current approach somehow or we are
> > > > good to go with this one, keeping in mind the future extension you
> > > > have suggested?
> > >
> > > As far as I'm concerned, it makes sense as it is. The only thing I'm
> > > not happy about is ETHTOOL_A_LINKMODES_LANES being a "write only"
> > > attribute (unlike _SPEED and _DUPLEX) but being able to query this
> > > information would require extensive changes far beyond the scope of
> this series.
> >
> > If I understood you correctly, patch #5 does that query, isn't it?
> > "mlxsw: ethtool: Expose the number of lanes in use"
> 
> Ah, right, it does. But as you extend struct ethtool_link_ksettings and drivers
> will need to be updated to provide this information, wouldn't it be more
> useful to let the driver provide link mode in use instead (and derive number
> of lanes from it)?
> 
> Michal

This is the way it is done with the speed parameter, so I have aligned it to it. Why the lanes should be done differently comparing to the speed?
Michal Kubecek Oct. 22, 2020, 4:27 p.m. UTC | #22
On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote:
> > -----Original Message-----

> > From: Michal Kubecek <mkubecek@suse.cz>

> > Sent: Wednesday, October 21, 2020 11:48 AM

> > 

> > Ah, right, it does. But as you extend struct ethtool_link_ksettings

> > and drivers will need to be updated to provide this information,

> > wouldn't it be more useful to let the driver provide link mode in

> > use instead (and derive number of lanes from it)?

> 

> This is the way it is done with the speed parameter, so I have aligned

> it to it. Why the lanes should be done differently comparing to the

> speed?


Speed and duplex have worked this way since ages and the interface was
probably introduced back in times when combination of speed and duplex
was sufficient to identify the link mode. This is no longer the case and
even adding number of lanes wouldn't make the combination unique. So if
we are going to extend the interface now and update drivers to provide
extra information, I believe it would be more useful to provide full
information.

Michal
Edwin Peer Nov. 19, 2020, 8:38 p.m. UTC | #23
On Sat, Oct 10, 2020 at 3:54 PM Ido Schimmel <idosch@idosch.org> wrote:

> Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct

> ethtool_link_settings' with lanes field in order to implement a new

> lanes-selector that will enable the user to advertise a specific number

> of lanes as well.


Why can't this be implied by port break-out configuration? For higher
speed signalling modes like PAM4, what's the difference between a
port with unused lanes vs the same port split into multiple logical
ports? In essence, the driver could then always choose the slowest
signalling mode that utilizes all the available lanes.

Regards,
Edwin Peer
Jiri Pirko Nov. 23, 2020, 9:40 a.m. UTC | #24
Thu, Nov 19, 2020 at 09:38:34PM CET, edwin.peer@broadcom.com wrote:
>On Sat, Oct 10, 2020 at 3:54 PM Ido Schimmel <idosch@idosch.org> wrote:

>

>> Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct

>> ethtool_link_settings' with lanes field in order to implement a new

>> lanes-selector that will enable the user to advertise a specific number

>> of lanes as well.

>

>Why can't this be implied by port break-out configuration? For higher

>speed signalling modes like PAM4, what's the difference between a

>port with unused lanes vs the same port split into multiple logical

>ports? In essence, the driver could then always choose the slowest


There is a crucial difference. Split port is configured alwasy by user.
Each split port has a devlink instace, netdevice associated with it.
It is one level above the lanes.


>signalling mode that utilizes all the available lanes.

>

>Regards,

>Edwin Peer
Danielle Ratson Nov. 23, 2020, 9:47 a.m. UTC | #25
> -----Original Message-----

> From: Michal Kubecek <mkubecek@suse.cz>

> Sent: Thursday, October 22, 2020 7:28 PM

> To: Danielle Ratson <danieller@nvidia.com>

> Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel

> <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw

> <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net

> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes

> 

> On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote:

> > > -----Original Message-----

> > > From: Michal Kubecek <mkubecek@suse.cz>

> > > Sent: Wednesday, October 21, 2020 11:48 AM

> > >

> > > Ah, right, it does. But as you extend struct ethtool_link_ksettings

> > > and drivers will need to be updated to provide this information,

> > > wouldn't it be more useful to let the driver provide link mode in

> > > use instead (and derive number of lanes from it)?

> >

> > This is the way it is done with the speed parameter, so I have aligned

> > it to it. Why the lanes should be done differently comparing to the

> > speed?

> 

> Speed and duplex have worked this way since ages and the interface was probably introduced back in times when combination of

> speed and duplex was sufficient to identify the link mode. This is no longer the case and even adding number of lanes wouldn't make

> the combination unique. So if we are going to extend the interface now and update drivers to provide extra information, I believe it

> would be more useful to provide full information.

> 

> Michal


Hi Michal,

What do you think of passing the link modes you have suggested as a bitmask, similar to "supported", that contains only one positive bit?
Something like that:

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index afae2beacbc3..dd946c88daa3 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -127,6 +127,7 @@ struct ethtool_link_ksettings {
                __ETHTOOL_DECLARE_LINK_MODE_MASK(supported);
                __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);
                __ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising);
+               __ETHTOOL_DECLARE_LINK_MODE_MASK(chosen);
        } link_modes;
        u32     lanes;
 };

Do you have perhaps a better suggestion?

And the speed and duplex parameters should be removed from being passed like as well, right?

Thanks,
Danielle
Michal Kubecek Nov. 24, 2020, 10:12 p.m. UTC | #26
On Mon, Nov 23, 2020 at 09:47:53AM +0000, Danielle Ratson wrote:
> 

> 

> > -----Original Message-----

> > From: Michal Kubecek <mkubecek@suse.cz>

> > Sent: Thursday, October 22, 2020 7:28 PM

> > To: Danielle Ratson <danieller@nvidia.com>

> > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel

> > <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw

> > <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net

> > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes

> > 

> > On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote:

> > > > -----Original Message-----

> > > > From: Michal Kubecek <mkubecek@suse.cz>

> > > > Sent: Wednesday, October 21, 2020 11:48 AM

> > > >

> > > > Ah, right, it does. But as you extend struct ethtool_link_ksettings

> > > > and drivers will need to be updated to provide this information,

> > > > wouldn't it be more useful to let the driver provide link mode in

> > > > use instead (and derive number of lanes from it)?

> > >

> > > This is the way it is done with the speed parameter, so I have aligned

> > > it to it. Why the lanes should be done differently comparing to the

> > > speed?

> > 

> > Speed and duplex have worked this way since ages and the interface was probably introduced back in times when combination of

> > speed and duplex was sufficient to identify the link mode. This is no longer the case and even adding number of lanes wouldn't make

> > the combination unique. So if we are going to extend the interface now and update drivers to provide extra information, I believe it

> > would be more useful to provide full information.

> > 

> > Michal

> 

> Hi Michal,

> 

> What do you think of passing the link modes you have suggested as a

> bitmask, similar to "supported", that contains only one positive bit?

> Something like that:

> 

> diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h

> index afae2beacbc3..dd946c88daa3 100644

> --- a/include/linux/ethtool.h

> +++ b/include/linux/ethtool.h

> @@ -127,6 +127,7 @@ struct ethtool_link_ksettings {

>                 __ETHTOOL_DECLARE_LINK_MODE_MASK(supported);

>                 __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);

>                 __ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising);

> +               __ETHTOOL_DECLARE_LINK_MODE_MASK(chosen);

>         } link_modes;

>         u32     lanes;

>  };

> 

> Do you have perhaps a better suggestion?


Not sure if it's better but as we know there is only one mode, we could
simply pass the index. We would still need to reserve a special value
for none/unknown but getting an index would make lookup easier.

> And the speed and duplex parameters should be removed from being

> passed like as well, right?


We cannot remove them from struct ethtool_link_settings and the ioctl
and netlink messages as those are part of UAPI and we have to preserve
backward compatibility. But drivers which provide link mode would not
need to fill speed and duplex in their ->get_link_ksettings() as the
common code could do that for them.

Michal
Danielle Ratson Nov. 25, 2020, 10:35 a.m. UTC | #27
> -----Original Message-----

> From: Michal Kubecek <mkubecek@suse.cz>

> Sent: Wednesday, November 25, 2020 12:12 AM

> To: Danielle Ratson <danieller@nvidia.com>

> Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel

> <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw

> <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net

> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes

> 

> On Mon, Nov 23, 2020 at 09:47:53AM +0000, Danielle Ratson wrote:

> >

> >

> > > -----Original Message-----

> > > From: Michal Kubecek <mkubecek@suse.cz>

> > > Sent: Thursday, October 22, 2020 7:28 PM

> > > To: Danielle Ratson <danieller@nvidia.com>

> > > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>;

> > > Jakub Kicinski <kuba@kernel.org>; Ido Schimmel <idosch@idosch.org>;

> > > netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko

> > > <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw <mlxsw@nvidia.com>;

> > > Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net

> > > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes

> > > settings uAPI with lanes

> > >

> > > On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote:

> > > > > -----Original Message-----

> > > > > From: Michal Kubecek <mkubecek@suse.cz>

> > > > > Sent: Wednesday, October 21, 2020 11:48 AM

> > > > >

> > > > > Ah, right, it does. But as you extend struct

> > > > > ethtool_link_ksettings and drivers will need to be updated to

> > > > > provide this information, wouldn't it be more useful to let the

> > > > > driver provide link mode in use instead (and derive number of lanes from it)?

> > > >

> > > > This is the way it is done with the speed parameter, so I have

> > > > aligned it to it. Why the lanes should be done differently

> > > > comparing to the speed?

> > >

> > > Speed and duplex have worked this way since ages and the interface

> > > was probably introduced back in times when combination of speed and

> > > duplex was sufficient to identify the link mode. This is no longer

> > > the case and even adding number of lanes wouldn't make the combination unique. So if we are going to extend the interface now

> and update drivers to provide extra information, I believe it would be more useful to provide full information.

> > >

> > > Michal

> >

> > Hi Michal,

> >

> > What do you think of passing the link modes you have suggested as a

> > bitmask, similar to "supported", that contains only one positive bit?

> > Something like that:


Hi Michal,

Thanks for your response.

Actually what I said is not very accurate. 
In ethtool, for speed 100G and 4 lanes for example, there are few link modes that fits:
ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT
ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT
ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT
ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT

The difference is the media. And in the driver we shrink into one bit.
But maybe that makes passing a bitmask more sense, or am I missing something?

> >

> > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index

> > afae2beacbc3..dd946c88daa3 100644

> > --- a/include/linux/ethtool.h

> > +++ b/include/linux/ethtool.h

> > @@ -127,6 +127,7 @@ struct ethtool_link_ksettings {

> >                 __ETHTOOL_DECLARE_LINK_MODE_MASK(supported);

> >                 __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);

> >                 __ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising);

> > +               __ETHTOOL_DECLARE_LINK_MODE_MASK(chosen);

> >         } link_modes;

> >         u32     lanes;

> >  };

> >

> > Do you have perhaps a better suggestion?

> 

> Not sure if it's better but as we know there is only one mode, we could simply pass the index. We would still need to reserve a special

> value for none/unknown but getting an index would make lookup easier.

> 

> > And the speed and duplex parameters should be removed from being

> > passed like as well, right?

> 

> We cannot remove them from struct ethtool_link_settings and the ioctl and netlink messages as those are part of UAPI and we have

> to preserve backward compatibility. But drivers which provide link mode would not need to fill speed and duplex in their -

> >get_link_ksettings() as the common code could do that for them.


Yes of course I didn't mean to remove the parameters from the struct, just to not prepare them for passing to ethtool when getting the link settings.

Thanks.

> 

> Michal
Michal Kubecek Nov. 26, 2020, 9:07 p.m. UTC | #28
On Wed, Nov 25, 2020 at 10:35:35AM +0000, Danielle Ratson wrote:
> > > What do you think of passing the link modes you have suggested as a

> > > bitmask, similar to "supported", that contains only one positive bit?

> > > Something like that:

> 

> Hi Michal,

> 

> Thanks for your response.

> 

> Actually what I said is not very accurate. 

> In ethtool, for speed 100G and 4 lanes for example, there are few link modes that fits:

> ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT

> ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT

> ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT

> ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT

> 

> The difference is the media. And in the driver we shrink into one bit.

> But maybe that makes passing a bitmask more sense, or am I missing something?


But as far as I understand, at any moment, only one of these will be
actually in use so that's what the driver should report. Or is the
problem that the driver cannot identify the media in use? (To be
precise: by "cannot identify" I mean "it's not possible for the driver
to find out", not "current code does not distinguish".)

Michal
Edwin Peer Nov. 30, 2020, 5:01 p.m. UTC | #29
On Mon, Nov 23, 2020 at 1:40 AM Jiri Pirko <jiri@resnulli.us> wrote:

> >Why can't this be implied by port break-out configuration? For higher

> >speed signalling modes like PAM4, what's the difference between a

> >port with unused lanes vs the same port split into multiple logical

> >ports? In essence, the driver could then always choose the slowest

>

> There is a crucial difference. Split port is configured alwasy by user.

> Each split port has a devlink instace, netdevice associated with it.

> It is one level above the lanes.


Right, but the one still implies the other. Splitting the port implies fewer
lanes available.

I understand the concern if the device cannot provide sufficient MAC
resources to provide for the additional ports, but leaving a net device
unused (with the option to utilize an additional, now spare, port) still
seems better to me than leaving lanes unused and always wasted.

Otherwise, the earlier suggestion of fully specifying the forced link
mode (although I don't think Andrew articulated it quite that way)
instead of a forced speed and separate lane mode makes most
sense.

Regards,
Edwin Peer
Jiri Pirko Nov. 30, 2020, 5:14 p.m. UTC | #30
Mon, Nov 30, 2020 at 06:01:43PM CET, edwin.peer@broadcom.com wrote:
>On Mon, Nov 23, 2020 at 1:40 AM Jiri Pirko <jiri@resnulli.us> wrote:

>

>> >Why can't this be implied by port break-out configuration? For higher

>> >speed signalling modes like PAM4, what's the difference between a

>> >port with unused lanes vs the same port split into multiple logical

>> >ports? In essence, the driver could then always choose the slowest

>>

>> There is a crucial difference. Split port is configured alwasy by user.

>> Each split port has a devlink instace, netdevice associated with it.

>> It is one level above the lanes.

>

>Right, but the one still implies the other. Splitting the port implies fewer

>lanes available.

>

>I understand the concern if the device cannot provide sufficient MAC

>resources to provide for the additional ports, but leaving a net device

>unused (with the option to utilize an additional, now spare, port) still

>seems better to me than leaving lanes unused and always wasted.


I don't follow what exactly are you implying. Could you elaborate a bit
more?

>

>Otherwise, the earlier suggestion of fully specifying the forced link

>mode (although I don't think Andrew articulated it quite that way)

>instead of a forced speed and separate lane mode makes most

>sense.

>

>Regards,

>Edwin Peer
Edwin Peer Nov. 30, 2020, 6 p.m. UTC | #31
On Mon, Nov 30, 2020 at 9:14 AM Jiri Pirko <jiri@resnulli.us> wrote:

> >> There is a crucial difference. Split port is configured alwasy by user.

> >> Each split port has a devlink instace, netdevice associated with it.

> >> It is one level above the lanes.

> >

> >Right, but the one still implies the other. Splitting the port implies fewer

> >lanes available.

> >

> >I understand the concern if the device cannot provide sufficient MAC

> >resources to provide for the additional ports, but leaving a net device

> >unused (with the option to utilize an additional, now spare, port) still

> >seems better to me than leaving lanes unused and always wasted.

>

> I don't follow what exactly are you implying. Could you elaborate a bit

> more?


Perhaps an example...

Consider a physical QSFP connector comprising 4 lanes. Today, if the
speed is forced, we would achieve 100G speeds using all 4 lanes with
NRZ encoding. If we configure the port for PAM4 encoding at the same
speed, then we only require 2 of the available 4 lanes. The remaining 2
lanes are wasted. If we only require 2 of the 4 lanes, why not split the
port and request the same speed of one of the now split out ports? Now,
this same speed is only achievable using PAM4 encoding (it is implied)
and we have a spare, potentially usable, assuming an appropriate break-
out cable, port instead of the 2 unused lanes.

So concretely, I'm suggesting that if we want to force PAM4 at the lower
speeds, split the port and then we don't need an ethtool interface change
at all to achieve the same goal. Having a spare (potentially usable) port
is better than spare (unusable) lanes.

Now, if the port can't be split for some reason (perhaps there aren't
sufficient device MAC resources, stats contexts, whatever), then that's
a different story. But, even so, perhaps the port lane topology more
appropriately belongs as part of a device configuration interface in
devlink and the number of lanes available to a port should be a
property of the port instead of a link mode knob?

Regards,
Edwin Peer
Jiri Pirko Dec. 1, 2020, 11:22 a.m. UTC | #32
Mon, Nov 30, 2020 at 07:00:16PM CET, edwin.peer@broadcom.com wrote:
>On Mon, Nov 30, 2020 at 9:14 AM Jiri Pirko <jiri@resnulli.us> wrote:

>

>> >> There is a crucial difference. Split port is configured alwasy by user.

>> >> Each split port has a devlink instace, netdevice associated with it.

>> >> It is one level above the lanes.

>> >

>> >Right, but the one still implies the other. Splitting the port implies fewer

>> >lanes available.

>> >

>> >I understand the concern if the device cannot provide sufficient MAC

>> >resources to provide for the additional ports, but leaving a net device

>> >unused (with the option to utilize an additional, now spare, port) still

>> >seems better to me than leaving lanes unused and always wasted.

>>

>> I don't follow what exactly are you implying. Could you elaborate a bit

>> more?

>

>Perhaps an example...

>

>Consider a physical QSFP connector comprising 4 lanes. Today, if the

>speed is forced, we would achieve 100G speeds using all 4 lanes with

>NRZ encoding. If we configure the port for PAM4 encoding at the same

>speed, then we only require 2 of the available 4 lanes. The remaining 2

>lanes are wasted. If we only require 2 of the 4 lanes, why not split the

>port and request the same speed of one of the now split out ports? Now,

>this same speed is only achievable using PAM4 encoding (it is implied)

>and we have a spare, potentially usable, assuming an appropriate break-

>out cable, port instead of the 2 unused lanes.


I don't see how this dynamic split port could work in real life to be
honest. The split is something admin needs to configure and can rely
that netdevice exists all the time and not comes and goes under
different circumstances. Multiple obvious reasons why.

One way or another, I don't see the relation between this and this
patchset.


>

>So concretely, I'm suggesting that if we want to force PAM4 at the lower

>speeds, split the port and then we don't need an ethtool interface change

>at all to achieve the same goal. Having a spare (potentially usable) port

>is better than spare (unusable) lanes.


The admin has to decide, define.


>

>Now, if the port can't be split for some reason (perhaps there aren't

>sufficient device MAC resources, stats contexts, whatever), then that's

>a different story. But, even so, perhaps the port lane topology more

>appropriately belongs as part of a device configuration interface in

>devlink and the number of lanes available to a port should be a

>property of the port instead of a link mode knob?

>

>Regards,

>Edwin Peer
Danielle Ratson Dec. 1, 2020, 5:22 p.m. UTC | #33
> -----Original Message-----

> From: Michal Kubecek <mkubecek@suse.cz>

> Sent: Thursday, November 26, 2020 11:08 PM

> To: Danielle Ratson <danieller@nvidia.com>

> Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel

> <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw

> <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net

> Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes

> 

> On Wed, Nov 25, 2020 at 10:35:35AM +0000, Danielle Ratson wrote:

> > > > What do you think of passing the link modes you have suggested as

> > > > a bitmask, similar to "supported", that contains only one positive bit?

> > > > Something like that:

> >

> > Hi Michal,

> >

> > Thanks for your response.

> >

> > Actually what I said is not very accurate.

> > In ethtool, for speed 100G and 4 lanes for example, there are few link modes that fits:

> > ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT

> > ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT

> > ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT

> > ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT

> >

> > The difference is the media. And in the driver we shrink into one bit.

> > But maybe that makes passing a bitmask more sense, or am I missing something?

> 

> But as far as I understand, at any moment, only one of these will be actually in use so that's what the driver should report. Or is the

> problem that the driver cannot identify the media in use? (To be

> precise: by "cannot identify" I mean "it's not possible for the driver to find out", not "current code does not distinguish".)

> 

> Michal


After more investigation, those are my conclusions:
We have two types of supported asics in the driver- one of them is able to distinguish between the medias and the other one doesn't.
So in the first type I can send one bit as you requested from the driver to ethtool but in the other one I can't.

The suggestions I have are:
1. Add a bit that for unknown media for each link (something like ETHTOOL_LINK_MODE_100000unknown_Full_BIT). I am not sure it is even possible or makes sense.
2. Pass the link mode as bitmap.

Do you see any other option?

Thanks.
Edwin Peer Dec. 2, 2020, 12:32 a.m. UTC | #34
On Tue, Dec 1, 2020 at 3:22 AM Jiri Pirko <jiri@resnulli.us> wrote:

> >Consider a physical QSFP connector comprising 4 lanes. Today, if the

> >speed is forced, we would achieve 100G speeds using all 4 lanes with

> >NRZ encoding. If we configure the port for PAM4 encoding at the same

> >speed, then we only require 2 of the available 4 lanes. The remaining 2

> >lanes are wasted. If we only require 2 of the 4 lanes, why not split the

> >port and request the same speed of one of the now split out ports? Now,

> >this same speed is only achievable using PAM4 encoding (it is implied)

> >and we have a spare, potentially usable, assuming an appropriate break-

> >out cable, port instead of the 2 unused lanes.

>

> I don't see how this dynamic split port could work in real life to be

> honest. The split is something admin needs to configure and can rely

> that netdevice exists all the time and not comes and goes under

> different circumstances. Multiple obvious reasons why.


I'm not suggesting the port split be dynamic at all. I'm suggesting that if
the admin wants or needs to force PAM4 on a port that would otherwise
be able to achieve the given speed using more lanes with NRZ, then the
admin should split the port, so that it has fewer lanes, in order to make
that intent clear (or otherwise configure the port to have fewer lanes
attached, if you really don't want to or can't create the additional split
port).

Using this approach, the existing ethtool forced speed interface is
sufficient to configure all possible lane encodings, because the
encoding that the driver must select is now implicit (note, we don't
need to care about media type here). That is, the driver can always
select the encoding that maximizes utilization of the lanes available
to the port (as defined by the admin).

> >So concretely, I'm suggesting that if we want to force PAM4 at the lower

> >speeds, split the port and then we don't need an ethtool interface change

> >at all to achieve the same goal. Having a spare (potentially usable) port

> >is better than spare (unusable) lanes.

>

> The admin has to decide, define.


I'm not sure I understand. The admin would indeed decide. This paragraph
merely served to motivate why a rational admin should prefer to have a
spare port rather than unused lanes he can't use, because they would be
attached to a port using an encoding that doesn't need them. If he wasn't
planning on using the additional port, he loses nothing. Otherwise, he gains
something he would not otherwise have had (it's win-win). From the
perspective of the original port, two unused lanes is no different than two
lanes allocated to another logical port.

Regards,
Edwin Peer
Edwin Peer Dec. 2, 2020, 12:52 a.m. UTC | #35
On Wed, Nov 25, 2020 at 10:35:35AM +0000, Danielle Ratson wrote:

> > > In ethtool, for speed 100G and 4 lanes for example, there are few link modes that fits:

> > > ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT

> > > ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT

> > > ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT

> > > ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT

> > >


> The suggestions I have are:

> 1. Add a bit that for unknown media for each link (something like

> ETHTOOL_LINK_MODE_100000unknown_Full_BIT). I am not sure it is even

> possible or makes sense.


The number of lanes would still need to be specified. You would need at least:

ETHTOOL_LINK_MODE_100000xR2_FULL

and

ETHTOOL_LINK_MODE_100000xR4_FULL

to distinguish between PAM4 and NRZ at 100G respectively. And, there's still
the cost of maintaining a different enum to ethtool_link_mode_bit_indices.

> 2. Pass the link mode as bitmap.


The user only wants to specify link speed and encoding, not media. The
bitmap has the same problem to solve. Or, should user space set the bits
for all possible media types that satisfy the desired speed and encoding?
Eeek. Alternatively, the driver would need to accept any bit that implies the
desired speed and encoding, ignoring what media the bit specifies (to do
so would require maintaining a map of equivalences, yuck).

> Do you see any other option?


As stated in another sub-thread, I think the encoding can be implied by the
speed if the number of lanes is a property of the port configuration. In which
case the existing ethtool interface is sufficient.

Regards,
Edwin Peer
Edwin Peer Dec. 2, 2020, 1:17 a.m. UTC | #36
On Tue, Dec 1, 2020 at 9:23 AM Danielle Ratson <danieller@nvidia.com> wrote:

> The suggestions I have are:

> 1. Add a bit that for unknown media for each link (something like ETHTOOL_LINK_MODE_100000unknown_Full_BIT). I am not sure it is even possible or makes sense.

> 2. Pass the link mode as bitmap.

>

> Do you see any other option?


Other options include:

1) Let the user specify link encoding directly, as we do for FEC. This would
imply the number of lanes used at a given speed rather than the other way
around. Then you only need a handful of indices in the enum.

2) Instead of an enum, encode the semantics into the link mode bit
representation. N bits define the speed. Other bits define the encoding,
number of lanes, media, etc. This would solve the problem of decoding the
intent from a single user selected link mode index, but would require a v2
UABI. The user could have the same strings, but not much more. And then,
it's probably better to give each dimension their own field in a struct and be
done, lest we run out of bits. Let the user space tool map the link mode
string onto the appropriate struct values.

Regards,
Edwin Peer
Jiri Pirko Dec. 2, 2020, 10:09 a.m. UTC | #37
Wed, Dec 02, 2020 at 01:32:46AM CET, edwin.peer@broadcom.com wrote:
>On Tue, Dec 1, 2020 at 3:22 AM Jiri Pirko <jiri@resnulli.us> wrote:

>

>> >Consider a physical QSFP connector comprising 4 lanes. Today, if the

>> >speed is forced, we would achieve 100G speeds using all 4 lanes with

>> >NRZ encoding. If we configure the port for PAM4 encoding at the same

>> >speed, then we only require 2 of the available 4 lanes. The remaining 2

>> >lanes are wasted. If we only require 2 of the 4 lanes, why not split the

>> >port and request the same speed of one of the now split out ports? Now,

>> >this same speed is only achievable using PAM4 encoding (it is implied)

>> >and we have a spare, potentially usable, assuming an appropriate break-

>> >out cable, port instead of the 2 unused lanes.

>>

>> I don't see how this dynamic split port could work in real life to be

>> honest. The split is something admin needs to configure and can rely

>> that netdevice exists all the time and not comes and goes under

>> different circumstances. Multiple obvious reasons why.

>

>I'm not suggesting the port split be dynamic at all. I'm suggesting that if

>the admin wants or needs to force PAM4 on a port that would otherwise

>be able to achieve the given speed using more lanes with NRZ, then the

>admin should split the port, so that it has fewer lanes, in order to make

>that intent clear (or otherwise configure the port to have fewer lanes

>attached, if you really don't want to or can't create the additional split

>port).


Okay, I see your point now. The thing is, the port split/unsplit causes
a great distubance. Meaning, the netdevs all of the port
disappear/reappear. Now consider following example:

You have a router you have configured routes on many netdevs
On one of the netdevs (has routes on it), you for any reason
need to force lane number.
In your suggestion, the netdev disappears along with the routes, the
routing is then broken. I don't see how this could be acceptable.

We are talking here about netdev configuration, we have a tool for that,
that is ethtool. What you suggest is to take it to different level,
I don't believe it is correct/doable.


>

>Using this approach, the existing ethtool forced speed interface is

>sufficient to configure all possible lane encodings, because the

>encoding that the driver must select is now implicit (note, we don't

>need to care about media type here). That is, the driver can always

>select the encoding that maximizes utilization of the lanes available

>to the port (as defined by the admin).

>

>> >So concretely, I'm suggesting that if we want to force PAM4 at the lower

>> >speeds, split the port and then we don't need an ethtool interface change

>> >at all to achieve the same goal. Having a spare (potentially usable) port

>> >is better than spare (unusable) lanes.

>>

>> The admin has to decide, define.

>

>I'm not sure I understand. The admin would indeed decide. This paragraph

>merely served to motivate why a rational admin should prefer to have a

>spare port rather than unused lanes he can't use, because they would be

>attached to a port using an encoding that doesn't need them. If he wasn't

>planning on using the additional port, he loses nothing. Otherwise, he gains

>something he would not otherwise have had (it's win-win). From the

>perspective of the original port, two unused lanes is no different than two

>lanes allocated to another logical port.

>

>Regards,

>Edwin Peer
Edwin Peer Dec. 2, 2020, 5:53 p.m. UTC | #38
On Wed, Dec 2, 2020 at 2:09 AM Jiri Pirko <jiri@resnulli.us> wrote:

> >I'm not suggesting the port split be dynamic at all. I'm suggesting that if

> >the admin wants or needs to force PAM4 on a port that would otherwise

> >be able to achieve the given speed using more lanes with NRZ, then the

> >admin should split the port, so that it has fewer lanes, in order to make

> >that intent clear (or otherwise configure the port to have fewer lanes

> >attached, if you really don't want to or can't create the additional split

> >port).

>

> Okay, I see your point now. The thing is, the port split/unsplit causes

> a great distubance. Meaning, the netdevs all of the port

> disappear/reappear. Now consider following example:


I guess that's a detail of the present implementation and/or device
capabilities (I'm not particularly familiar), but presumably it is at least
possible, in principle, to modify a device's port config without taking
down the netdev. That said, after spending more time thinking about
this, I'm coming around to the idea of being able to explicitly set
encoding (not lanes) via ethtool. And, in this context, by encoding,
I technically mean line code signalling, not symbol encoding.

Although it does for today's link modes, the number of lanes does
not generally imply signalling mode. In future we may have PAM8
signalling and then the present proposal of deriving the signalling
mode for a given speed from the lane count falls down. We should
be specifying signalling mode via ethtool instead of lanes.

Alternatively, we need to be specifying the fully defined link mode.
But, doing so is fraught with other issues, including how to represent
it in the interface and the fact that the user doesn't necessarily want
to specify physical media in these cases, something that is implied
by the full link mode definition.

Regards,
Edwin Peer
diff mbox series

Patch

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 30b98245979f..05073482db05 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -431,16 +431,17 @@  Request contents:
   ``ETHTOOL_A_LINKMODES_SPEED``               u32     link speed (Mb/s)
   ``ETHTOOL_A_LINKMODES_DUPLEX``              u8      duplex mode
   ``ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG``    u8      Master/slave port mode
+  ``ETHTOOL_A_LINKMODES_LANES``               u32     lanes
   ==========================================  ======  ==========================
 
 ``ETHTOOL_A_LINKMODES_OURS`` bit set allows setting advertised link modes. If
 autonegotiation is on (either set now or kept from before), advertised modes
 are not changed (no ``ETHTOOL_A_LINKMODES_OURS`` attribute) and at least one
-of speed and duplex is specified, kernel adjusts advertised modes to all
-supported modes matching speed, duplex or both (whatever is specified). This
-autoselection is done on ethtool side with ioctl interface, netlink interface
-is supposed to allow requesting changes without knowing what exactly kernel
-supports.
+of speed, duplex and lanes is specified, kernel adjusts advertised modes to all
+supported modes matching speed, duplex, lanes or all (whatever is specified).
+This autoselection is done on ethtool side with ioctl interface, netlink
+interface is supposed to allow requesting changes without knowing what exactly
+kernel supports.
 
 
 LINKSTATE_GET
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 6408b446051f..4945ce487f01 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -128,6 +128,7 @@  struct ethtool_link_ksettings {
 		__ETHTOOL_DECLARE_LINK_MODE_MASK(advertising);
 		__ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising);
 	} link_modes;
+	u32	lanes;
 };
 
 /**
@@ -241,6 +242,8 @@  bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32,
 	 ETHTOOL_COALESCE_PKT_RATE_LOW | ETHTOOL_COALESCE_PKT_RATE_HIGH | \
 	 ETHTOOL_COALESCE_RATE_SAMPLE_INTERVAL)
 
+#define ETHTOOL_CAP_LINK_LANES_SUPPORTED BIT(0)
+
 #define ETHTOOL_STAT_NOT_SET	(~0ULL)
 
 /**
@@ -419,6 +422,7 @@  struct ethtool_pause_stats {
  * of the generic netdev features interface.
  */
 struct ethtool_ops {
+	u32     capabilities;
 	u32	supported_coalesce_params;
 	void	(*get_drvinfo)(struct net_device *, struct ethtool_drvinfo *);
 	int	(*get_regs_len)(struct net_device *);
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 9ca87bc73c44..2125b93ecee0 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -1738,6 +1738,14 @@  static inline int ethtool_validate_speed(__u32 speed)
 	return speed <= INT_MAX || speed == (__u32)SPEED_UNKNOWN;
 }
 
+/* Lanes, 1, 2, 4 or 8. */
+#define ETHTOOL_LANES_1			1
+#define ETHTOOL_LANES_2			2
+#define ETHTOOL_LANES_4			4
+#define ETHTOOL_LANES_8			8
+
+#define ETHTOOL_LANES_UNKNOWN		0
+
 /* Duplex, half or full. */
 #define DUPLEX_HALF		0x00
 #define DUPLEX_FULL		0x01
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index e2bf36e6964b..a286635ac9b8 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -227,6 +227,7 @@  enum {
 	ETHTOOL_A_LINKMODES_DUPLEX,		/* u8 */
 	ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG,	/* u8 */
 	ETHTOOL_A_LINKMODES_MASTER_SLAVE_STATE,	/* u8 */
+	ETHTOOL_A_LINKMODES_LANES,		/* u32 */
 
 	/* add new constants above here */
 	__ETHTOOL_A_LINKMODES_CNT,
diff --git a/net/ethtool/linkmodes.c b/net/ethtool/linkmodes.c
index c5bcb9abc8b9..a1f02896ed8b 100644
--- a/net/ethtool/linkmodes.c
+++ b/net/ethtool/linkmodes.c
@@ -152,12 +152,14 @@  const struct ethnl_request_ops ethnl_linkmodes_request_ops = {
 
 struct link_mode_info {
 	int				speed;
+	int				lanes;
 	u8				duplex;
 };
 
-#define __DEFINE_LINK_MODE_PARAMS(_speed, _type, _duplex) \
+#define __DEFINE_LINK_MODE_PARAMS(_speed, _type, _lanes, _duplex) \
 	[ETHTOOL_LINK_MODE(_speed, _type, _duplex)] = { \
 		.speed	= SPEED_ ## _speed, \
+		.lanes	= ETHTOOL_LANES_ ## _lanes, \
 		.duplex	= __DUPLEX_ ## _duplex \
 	}
 #define __DUPLEX_Half DUPLEX_HALF
@@ -165,105 +167,106 @@  struct link_mode_info {
 #define __DEFINE_SPECIAL_MODE_PARAMS(_mode) \
 	[ETHTOOL_LINK_MODE_ ## _mode ## _BIT] = { \
 		.speed	= SPEED_UNKNOWN, \
+		.lanes	= ETHTOOL_LANES_UNKNOWN, \
 		.duplex	= DUPLEX_UNKNOWN, \
 	}
 
 static const struct link_mode_info link_mode_params[] = {
-	__DEFINE_LINK_MODE_PARAMS(10, T, Half),
-	__DEFINE_LINK_MODE_PARAMS(10, T, Full),
-	__DEFINE_LINK_MODE_PARAMS(100, T, Half),
-	__DEFINE_LINK_MODE_PARAMS(100, T, Full),
-	__DEFINE_LINK_MODE_PARAMS(1000, T, Half),
-	__DEFINE_LINK_MODE_PARAMS(1000, T, Full),
+	__DEFINE_LINK_MODE_PARAMS(10, T, 1, Half),
+	__DEFINE_LINK_MODE_PARAMS(10, T, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(100, T, 1, Half),
+	__DEFINE_LINK_MODE_PARAMS(100, T, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(1000, T, 1, Half),
+	__DEFINE_LINK_MODE_PARAMS(1000, T, 1, Full),
 	__DEFINE_SPECIAL_MODE_PARAMS(Autoneg),
 	__DEFINE_SPECIAL_MODE_PARAMS(TP),
 	__DEFINE_SPECIAL_MODE_PARAMS(AUI),
 	__DEFINE_SPECIAL_MODE_PARAMS(MII),
 	__DEFINE_SPECIAL_MODE_PARAMS(FIBRE),
 	__DEFINE_SPECIAL_MODE_PARAMS(BNC),
-	__DEFINE_LINK_MODE_PARAMS(10000, T, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, T, 1, Full),
 	__DEFINE_SPECIAL_MODE_PARAMS(Pause),
 	__DEFINE_SPECIAL_MODE_PARAMS(Asym_Pause),
-	__DEFINE_LINK_MODE_PARAMS(2500, X, Full),
+	__DEFINE_LINK_MODE_PARAMS(2500, X, 1, Full),
 	__DEFINE_SPECIAL_MODE_PARAMS(Backplane),
-	__DEFINE_LINK_MODE_PARAMS(1000, KX, Full),
-	__DEFINE_LINK_MODE_PARAMS(10000, KX4, Full),
-	__DEFINE_LINK_MODE_PARAMS(10000, KR, Full),
+	__DEFINE_LINK_MODE_PARAMS(1000, KX, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, KX4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, KR, 1, Full),
 	[ETHTOOL_LINK_MODE_10000baseR_FEC_BIT] = {
 		.speed	= SPEED_10000,
 		.duplex = DUPLEX_FULL,
 	},
-	__DEFINE_LINK_MODE_PARAMS(20000, MLD2, Full),
-	__DEFINE_LINK_MODE_PARAMS(20000, KR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(40000, KR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(40000, CR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(40000, SR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(40000, LR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(56000, KR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(56000, CR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(56000, SR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(56000, LR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(25000, CR, Full),
-	__DEFINE_LINK_MODE_PARAMS(25000, KR, Full),
-	__DEFINE_LINK_MODE_PARAMS(25000, SR, Full),
-	__DEFINE_LINK_MODE_PARAMS(50000, CR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(50000, KR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, KR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, SR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, CR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, LR4_ER4, Full),
-	__DEFINE_LINK_MODE_PARAMS(50000, SR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(1000, X, Full),
-	__DEFINE_LINK_MODE_PARAMS(10000, CR, Full),
-	__DEFINE_LINK_MODE_PARAMS(10000, SR, Full),
-	__DEFINE_LINK_MODE_PARAMS(10000, LR, Full),
-	__DEFINE_LINK_MODE_PARAMS(10000, LRM, Full),
-	__DEFINE_LINK_MODE_PARAMS(10000, ER, Full),
-	__DEFINE_LINK_MODE_PARAMS(2500, T, Full),
-	__DEFINE_LINK_MODE_PARAMS(5000, T, Full),
+	__DEFINE_LINK_MODE_PARAMS(20000, MLD2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(20000, KR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(40000, KR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(40000, CR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(40000, SR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(40000, LR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(56000, KR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(56000, CR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(56000, SR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(56000, LR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(25000, CR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(25000, KR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(25000, SR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, CR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, KR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, KR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, SR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, CR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, LR4_ER4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, SR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(1000, X, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, CR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, SR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, LR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, LRM, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, ER, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(2500, T, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(5000, T, 1, Full),
 	__DEFINE_SPECIAL_MODE_PARAMS(FEC_NONE),
 	__DEFINE_SPECIAL_MODE_PARAMS(FEC_RS),
 	__DEFINE_SPECIAL_MODE_PARAMS(FEC_BASER),
-	__DEFINE_LINK_MODE_PARAMS(50000, KR, Full),
-	__DEFINE_LINK_MODE_PARAMS(50000, SR, Full),
-	__DEFINE_LINK_MODE_PARAMS(50000, CR, Full),
-	__DEFINE_LINK_MODE_PARAMS(50000, LR_ER_FR, Full),
-	__DEFINE_LINK_MODE_PARAMS(50000, DR, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, KR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, SR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, CR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, LR2_ER2_FR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, DR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, KR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, SR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, LR4_ER4_FR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, DR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, CR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(100, T1, Full),
-	__DEFINE_LINK_MODE_PARAMS(1000, T1, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, KR8, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, SR8, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, LR8_ER8_FR8, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, DR8, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, CR8, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, KR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, SR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, CR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, LR_ER_FR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, DR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, KR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, SR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, CR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, LR2_ER2_FR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, DR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, KR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, SR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, LR4_ER4_FR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, DR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, CR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100, T1, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(1000, T1, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, KR8, 8, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, SR8, 8, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, LR8_ER8_FR8, 8, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, DR8, 8, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, CR8, 8, Full),
 	__DEFINE_SPECIAL_MODE_PARAMS(FEC_LLRS),
-	__DEFINE_LINK_MODE_PARAMS(100000, KR, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, SR, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, LR_ER_FR, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, DR, Full),
-	__DEFINE_LINK_MODE_PARAMS(100000, CR, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, KR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, SR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, LR2_ER2_FR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, DR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(200000, CR2, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, KR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, SR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, LR4_ER4_FR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, DR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(400000, CR4, Full),
-	__DEFINE_LINK_MODE_PARAMS(100, FX, Half),
-	__DEFINE_LINK_MODE_PARAMS(100, FX, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, KR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, SR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, LR_ER_FR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, DR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, CR, 1, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, KR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, SR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, LR2_ER2_FR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, DR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, CR2, 2, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, KR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, SR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, LR4_ER4_FR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, DR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, CR4, 4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100, FX, 1, Half),
+	__DEFINE_LINK_MODE_PARAMS(100, FX, 1, Full),
 };
 
 const struct nla_policy ethnl_linkmodes_set_policy[] = {
@@ -274,16 +277,17 @@  const struct nla_policy ethnl_linkmodes_set_policy[] = {
 	[ETHTOOL_A_LINKMODES_SPEED]		= { .type = NLA_U32 },
 	[ETHTOOL_A_LINKMODES_DUPLEX]		= { .type = NLA_U8 },
 	[ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG]	= { .type = NLA_U8 },
+	[ETHTOOL_A_LINKMODES_LANES]		= { .type = NLA_U32 },
 };
 
-/* Set advertised link modes to all supported modes matching requested speed
- * and duplex values. Called when autonegotiation is on, speed or duplex is
- * requested but no link mode change. This is done in userspace with ioctl()
- * interface, move it into kernel for netlink.
+/* Set advertised link modes to all supported modes matching requested speed,
+ * lanes and duplex values. Called when autonegotiation is on, speed, lanes or
+ * duplex is requested but no link mode change. This is done in userspace with
+ * ioctl() interface, move it into kernel for netlink.
  * Returns true if advertised modes bitmap was modified.
  */
 static bool ethnl_auto_linkmodes(struct ethtool_link_ksettings *ksettings,
-				 bool req_speed, bool req_duplex)
+				 bool req_speed, bool req_lanes, bool req_duplex)
 {
 	unsigned long *advertising = ksettings->link_modes.advertising;
 	unsigned long *supported = ksettings->link_modes.supported;
@@ -302,6 +306,7 @@  static bool ethnl_auto_linkmodes(struct ethtool_link_ksettings *ksettings,
 			continue;
 		if (test_bit(i, supported) &&
 		    (!req_speed || info->speed == ksettings->base.speed) &&
+		    (!req_lanes || info->lanes == ksettings->lanes) &&
 		    (!req_duplex || info->duplex == ksettings->base.duplex))
 			set_bit(i, advertising);
 		else
@@ -325,12 +330,25 @@  static bool ethnl_validate_master_slave_cfg(u8 cfg)
 	return false;
 }
 
+static bool ethnl_validate_lanes_cfg(u32 cfg)
+{
+	switch (cfg) {
+	case ETHTOOL_LANES_1:
+	case ETHTOOL_LANES_2:
+	case ETHTOOL_LANES_4:
+	case ETHTOOL_LANES_8:
+		return true;
+	}
+
+	return false;
+}
+
 static int ethnl_update_linkmodes(struct genl_info *info, struct nlattr **tb,
 				  struct ethtool_link_ksettings *ksettings,
-				  bool *mod)
+				  bool *mod, const struct net_device *dev)
 {
 	struct ethtool_link_settings *lsettings = &ksettings->base;
-	bool req_speed, req_duplex;
+	bool req_speed, req_lanes, req_duplex;
 	const struct nlattr *master_slave_cfg;
 	int ret;
 
@@ -353,10 +371,39 @@  static int ethnl_update_linkmodes(struct genl_info *info, struct nlattr **tb,
 
 	*mod = false;
 	req_speed = tb[ETHTOOL_A_LINKMODES_SPEED];
+	req_lanes = tb[ETHTOOL_A_LINKMODES_LANES];
 	req_duplex = tb[ETHTOOL_A_LINKMODES_DUPLEX];
 
 	ethnl_update_u8(&lsettings->autoneg, tb[ETHTOOL_A_LINKMODES_AUTONEG],
 			mod);
+
+	if (req_lanes) {
+		u32 lanes_cfg = nla_get_u32(tb[ETHTOOL_A_LINKMODES_LANES]);
+
+		if (!ethnl_validate_lanes_cfg(lanes_cfg)) {
+			NL_SET_ERR_MSG_ATTR(info->extack,
+					    tb[ETHTOOL_A_LINKMODES_LANES],
+					    "lanes value is invalid");
+			return -EINVAL;
+		}
+
+		/* If autoneg is off and lanes parameter is not supported by the
+		 * driver, return an error.
+		 */
+		if (!lsettings->autoneg &&
+		    !(dev->ethtool_ops->capabilities & ETHTOOL_CAP_LINK_LANES_SUPPORTED)) {
+			NL_SET_ERR_MSG_ATTR(info->extack,
+					    tb[ETHTOOL_A_LINKMODES_LANES],
+					    "lanes configuration not supported by device");
+			return -EOPNOTSUPP;
+		}
+	} else if (!lsettings->autoneg) {
+		/* If autoneg is off and lanes parameter is not passed from user,
+		 * set the lanes parameter to UNKNOWN.
+		 */
+		ksettings->lanes = ETHTOOL_LANES_UNKNOWN;
+	}
+
 	ret = ethnl_update_bitset(ksettings->link_modes.advertising,
 				  __ETHTOOL_LINK_MODE_MASK_NBITS,
 				  tb[ETHTOOL_A_LINKMODES_OURS], link_mode_names,
@@ -365,13 +412,15 @@  static int ethnl_update_linkmodes(struct genl_info *info, struct nlattr **tb,
 		return ret;
 	ethnl_update_u32(&lsettings->speed, tb[ETHTOOL_A_LINKMODES_SPEED],
 			 mod);
+	ethnl_update_u32(&ksettings->lanes, tb[ETHTOOL_A_LINKMODES_LANES],
+			 mod);
 	ethnl_update_u8(&lsettings->duplex, tb[ETHTOOL_A_LINKMODES_DUPLEX],
 			mod);
 	ethnl_update_u8(&lsettings->master_slave_cfg, master_slave_cfg, mod);
 
 	if (!tb[ETHTOOL_A_LINKMODES_OURS] && lsettings->autoneg &&
-	    (req_speed || req_duplex) &&
-	    ethnl_auto_linkmodes(ksettings, req_speed, req_duplex))
+	    (req_speed || req_lanes || req_duplex) &&
+	    ethnl_auto_linkmodes(ksettings, req_speed, req_lanes, req_duplex))
 		*mod = true;
 
 	return 0;
@@ -409,7 +458,7 @@  int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info)
 		goto out_ops;
 	}
 
-	ret = ethnl_update_linkmodes(info, tb, &ksettings, &mod);
+	ret = ethnl_update_linkmodes(info, tb, &ksettings, &mod, dev);
 	if (ret < 0)
 		goto out_ops;
 
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index d8efec516d86..6eabd58d81bf 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -351,7 +351,7 @@  extern const struct nla_policy ethnl_strset_get_policy[ETHTOOL_A_STRSET_COUNTS_O
 extern const struct nla_policy ethnl_linkinfo_get_policy[ETHTOOL_A_LINKINFO_HEADER + 1];
 extern const struct nla_policy ethnl_linkinfo_set_policy[ETHTOOL_A_LINKINFO_TP_MDIX_CTRL + 1];
 extern const struct nla_policy ethnl_linkmodes_get_policy[ETHTOOL_A_LINKMODES_HEADER + 1];
-extern const struct nla_policy ethnl_linkmodes_set_policy[ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG + 1];
+extern const struct nla_policy ethnl_linkmodes_set_policy[ETHTOOL_A_LINKMODES_LANES + 1];
 extern const struct nla_policy ethnl_linkstate_get_policy[ETHTOOL_A_LINKSTATE_HEADER + 1];
 extern const struct nla_policy ethnl_debug_get_policy[ETHTOOL_A_DEBUG_HEADER + 1];
 extern const struct nla_policy ethnl_debug_set_policy[ETHTOOL_A_DEBUG_MSGMASK + 1];