Message ID | 20201010154119.3537085-2-idosch@idosch.org |
---|---|
State | New |
Headers | show |
Series | Support setting lanes via ethtool | expand |
On Sat, 10 Oct 2020 18:41:14 +0300 Ido Schimmel wrote: > From: Danielle Ratson <danieller@nvidia.com> > > Currently, when auto negotiation is on, the user can advertise all the > linkmodes which correspond to a specific speed, but does not have a > similar selector for the number of lanes. This is significant when a > specific speed can be achieved using different number of lanes. For > example, 2x50 or 4x25. > > Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct > ethtool_link_settings' with lanes field in order to implement a new > lanes-selector that will enable the user to advertise a specific number > of lanes as well. > > When auto negotiation is off, lanes parameter can be forced only if the > driver supports it. Add a capability bit in 'struct ethtool_ops' that > allows ethtool know if the driver can handle the lanes parameter when > auto negotiation is off, so if it does not, an error message will be > returned when trying to set lanes. What's the use for this in practical terms? Isn't the lane count basically implied by the module that gets plugged in? > +/* Lanes, 1, 2, 4 or 8. */ > +#define ETHTOOL_LANES_1 1 > +#define ETHTOOL_LANES_2 2 > +#define ETHTOOL_LANES_4 4 > +#define ETHTOOL_LANES_8 8 Not an extremely useful set of defines, not sure Michal would agree. > +#define ETHTOOL_LANES_UNKNOWN 0 > struct link_mode_info { > int speed; > + int lanes; why signed? > u8 duplex; > }; > @@ -274,16 +277,17 @@ const struct nla_policy ethnl_linkmodes_set_policy[] = { > [ETHTOOL_A_LINKMODES_SPEED] = { .type = NLA_U32 }, > [ETHTOOL_A_LINKMODES_DUPLEX] = { .type = NLA_U8 }, > [ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG] = { .type = NLA_U8 }, > + [ETHTOOL_A_LINKMODES_LANES] = { .type = NLA_U32 }, NLA_POLICY_VALIDATE_FN(), not sure why the types for this validation_type are limited.. Johannes, just an abundance of caution? > + } else if (!lsettings->autoneg) { > + /* If autoneg is off and lanes parameter is not passed from user, > + * set the lanes parameter to UNKNOWN. > + */ > + ksettings->lanes = ETHTOOL_LANES_UNKNOWN; you assume UNKNOWN is zero by doing !lanes in auto_linkmodes - that's inconsistent. > + } > + > ret = ethnl_update_bitset(ksettings->link_modes.advertising, > __ETHTOOL_LINK_MODE_MASK_NBITS, > tb[ETHTOOL_A_LINKMODES_OURS], link_mode_names,
> -----Original Message----- > From: Jakub Kicinski <kuba@kernel.org> > Sent: Monday, October 12, 2020 1:38 AM > To: Ido Schimmel <idosch@idosch.org> > Cc: netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko > <jiri@nvidia.com>; Danielle Ratson <danieller@nvidia.com>; > andrew@lunn.ch; f.fainelli@gmail.com; mkubecek@suse.cz; mlxsw > <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; > johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI > with lanes > > On Sat, 10 Oct 2020 18:41:14 +0300 Ido Schimmel wrote: > > From: Danielle Ratson <danieller@nvidia.com> > > > > Currently, when auto negotiation is on, the user can advertise all the > > linkmodes which correspond to a specific speed, but does not have a > > similar selector for the number of lanes. This is significant when a > > specific speed can be achieved using different number of lanes. For > > example, 2x50 or 4x25. > > > > Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct > > ethtool_link_settings' with lanes field in order to implement a new > > lanes-selector that will enable the user to advertise a specific > > number of lanes as well. > > > > When auto negotiation is off, lanes parameter can be forced only if > > the driver supports it. Add a capability bit in 'struct ethtool_ops' > > that allows ethtool know if the driver can handle the lanes parameter > > when auto negotiation is off, so if it does not, an error message will > > be returned when trying to set lanes. > > What's the use for this in practical terms? Isn't the lane count basically > implied by the module that gets plugged in? The use is to enable the user to decide how to achieve a certain speed. For example, if he wants to get 100G and the port has 4 lanes, the speed can be achieved it using both 2 lanes of 50G and 4 lanes of 25G, as a port with 4 lanes width can work in 2 lanes mode with double speed each. So, by specifying "lanes 2" he will achieve 100G using 2 lanes of 50G. > > > +/* Lanes, 1, 2, 4 or 8. */ > > +#define ETHTOOL_LANES_1 1 > > +#define ETHTOOL_LANES_2 2 > > +#define ETHTOOL_LANES_4 4 > > +#define ETHTOOL_LANES_8 8 > > Not an extremely useful set of defines, not sure Michal would agree. > > > +#define ETHTOOL_LANES_UNKNOWN 0 > > > struct link_mode_info { > > int speed; > > + int lanes; > > why signed? I have aligned it to the speed. > > > u8 duplex; > > }; > > > @@ -274,16 +277,17 @@ const struct nla_policy > ethnl_linkmodes_set_policy[] = { > > [ETHTOOL_A_LINKMODES_SPEED] = { .type = NLA_U32 }, > > [ETHTOOL_A_LINKMODES_DUPLEX] = { .type = NLA_U8 }, > > [ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG] = { .type = > NLA_U8 }, > > + [ETHTOOL_A_LINKMODES_LANES] = { .type = NLA_U32 }, > > NLA_POLICY_VALIDATE_FN(), not sure why the types for this > validation_type are limited.. Johannes, just an abundance of caution? > > > + } else if (!lsettings->autoneg) { > > + /* If autoneg is off and lanes parameter is not passed from > user, > > + * set the lanes parameter to UNKNOWN. > > + */ > > + ksettings->lanes = ETHTOOL_LANES_UNKNOWN; > > you assume UNKNOWN is zero by doing !lanes in auto_linkmodes - that's > inconsistent. I didn't do !lanes... maybe you mean !req_lanes which is different parameter that specifies if the lanes parameter was passed from user space. > > > + } > > + > > ret = ethnl_update_bitset(ksettings->link_modes.advertising, > > __ETHTOOL_LINK_MODE_MASK_NBITS, > > tb[ETHTOOL_A_LINKMODES_OURS], > link_mode_names,
On Mon, 12 Oct 2020 15:33:45 +0000 Danielle Ratson wrote: > > What's the use for this in practical terms? Isn't the lane count basically > > implied by the module that gets plugged in? > > The use is to enable the user to decide how to achieve a certain speed. > For example, if he wants to get 100G and the port has 4 lanes, the > speed can be achieved it using both 2 lanes of 50G and 4 lanes of > 25G, as a port with 4 lanes width can work in 2 lanes mode with > double speed each. So, by specifying "lanes 2" he will achieve 100G > using 2 lanes of 50G. Can you give a concrete example of serdes capabilities of the port, what SFP gets plugged in and what configuration user wants to select?
On Mon, Oct 12, 2020 at 03:33:45PM +0000, Danielle Ratson wrote: > > > > > +/* Lanes, 1, 2, 4 or 8. */ > > > +#define ETHTOOL_LANES_1 1 > > > +#define ETHTOOL_LANES_2 2 > > > +#define ETHTOOL_LANES_4 4 > > > +#define ETHTOOL_LANES_8 8 > > > > Not an extremely useful set of defines, not sure Michal would agree. I don't see much use for them either. Such defines or enums make sense when the numbers represent some values which have symbolic names but these values only represent number of lanes so a number is IMHO sufficient. > > > +#define ETHTOOL_LANES_UNKNOWN 0 > > > > > struct link_mode_info { > > > int speed; > > > + int lanes; > > > > why signed? > > I have aligned it to the speed. For speed, signed type was used mostly because SPEED_UNKNOWN is defined as -1 (even if it's cast to u32 in most places), I don't think there is a reason to use a signed type. Michal
On Sat, Oct 10, 2020 at 06:41:14PM +0300, Ido Schimmel wrote: > From: Danielle Ratson <danieller@nvidia.com> > > Currently, when auto negotiation is on, the user can advertise all the > linkmodes which correspond to a specific speed, but does not have a > similar selector for the number of lanes. This is significant when a > specific speed can be achieved using different number of lanes. For > example, 2x50 or 4x25. > > Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct > ethtool_link_settings' with lanes field in order to implement a new > lanes-selector that will enable the user to advertise a specific number > of lanes as well. > > When auto negotiation is off, lanes parameter can be forced only if the > driver supports it. Add a capability bit in 'struct ethtool_ops' that > allows ethtool know if the driver can handle the lanes parameter when > auto negotiation is off, so if it does not, an error message will be > returned when trying to set lanes. > > Example: > > $ ethtool -s swp1 lanes 4 > $ ethtool swp1 > Settings for swp1: > Supported ports: [ FIBRE ] > Supported link modes: 1000baseKX/Full > 10000baseKR/Full > 40000baseCR4/Full > 40000baseSR4/Full > 40000baseLR4/Full > 25000baseCR/Full > 25000baseSR/Full > 50000baseCR2/Full > 100000baseSR4/Full > 100000baseCR4/Full > Supported pause frame use: Symmetric Receive-only > Supports auto-negotiation: Yes > Supported FEC modes: Not reported > Advertised link modes: 40000baseCR4/Full > 40000baseSR4/Full > 40000baseLR4/Full > 100000baseSR4/Full > 100000baseCR4/Full > Advertised pause frame use: No > Advertised auto-negotiation: Yes > Advertised FEC modes: Not reported > Speed: Unknown! > Duplex: Unknown! (255) > Auto-negotiation: on > Port: Direct Attach Copper > PHYAD: 0 > Transceiver: internal > Link detected: no > > Signed-off-by: Danielle Ratson <danieller@nvidia.com> > Reviewed-by: Jiri Pirko <jiri@nvidia.com> > Signed-off-by: Ido Schimmel <idosch@nvidia.com> > --- [...] > static const struct link_mode_info link_mode_params[] = { > - __DEFINE_LINK_MODE_PARAMS(10, T, Half), > - __DEFINE_LINK_MODE_PARAMS(10, T, Full), > - __DEFINE_LINK_MODE_PARAMS(100, T, Half), > - __DEFINE_LINK_MODE_PARAMS(100, T, Full), > - __DEFINE_LINK_MODE_PARAMS(1000, T, Half), > - __DEFINE_LINK_MODE_PARAMS(1000, T, Full), > + __DEFINE_LINK_MODE_PARAMS(10, T, 1, Half), > + __DEFINE_LINK_MODE_PARAMS(10, T, 1, Full), > + __DEFINE_LINK_MODE_PARAMS(100, T, 1, Half), > + __DEFINE_LINK_MODE_PARAMS(100, T, 1, Full), > + __DEFINE_LINK_MODE_PARAMS(1000, T, 1, Half), > + __DEFINE_LINK_MODE_PARAMS(1000, T, 1, Full), Technically, 4 may be more appropriate for 1000base-T, 2500base-T, 5000base-T and 10000base-T but it's probably just a formality. While there is 1000base-T1, I'm not sure if we can expect a device which would support e.g. both 1000base-T and 1000base-T1 (or some other colliding combination of modes). Michal
Hi, Sorry, somehow didn't see this until now. > > +/* Lanes, 1, 2, 4 or 8. */ > > +#define ETHTOOL_LANES_1 1 > > +#define ETHTOOL_LANES_2 2 > > +#define ETHTOOL_LANES_4 4 > > +#define ETHTOOL_LANES_8 8 > > Not an extremely useful set of defines, not sure Michal would agree. > > > +#define ETHTOOL_LANES_UNKNOWN 0 > > struct link_mode_info { > > int speed; > > + int lanes; > > why signed? > > > u8 duplex; > > }; > > @@ -274,16 +277,17 @@ const struct nla_policy ethnl_linkmodes_set_policy[] = { > > [ETHTOOL_A_LINKMODES_SPEED] = { .type = NLA_U32 }, > > [ETHTOOL_A_LINKMODES_DUPLEX] = { .type = NLA_U8 }, > > [ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG] = { .type = NLA_U8 }, > > + [ETHTOOL_A_LINKMODES_LANES] = { .type = NLA_U32 }, > > NLA_POLICY_VALIDATE_FN(), not sure why the types for this > validation_type are limited.. Johannes, just an abundance of caution? So let me see if I got this right - you're saying you'd like to use NLA_POLICY_VALIDATE_FN() for an NLA_U32, to validate against the _LANES being 1, 2, 4 or 8? First of all, you _can_, no? I mean, it's limited by #define NLA_ENSURE_NO_VALIDATION_PTR(tp) \ (__NLA_ENSURE(tp != NLA_BITFIELD32 && \ tp != NLA_REJECT && \ tp != NLA_NESTED && \ tp != NLA_NESTED_ARRAY) + tp) and the reason is sort of encoded in that - the types listed here already use the pointer *regardless of the validation_type*, so you can't have a pointer to the function in the same union. But not sure I understood :-) johannes
On Mon, 12 Oct 2020 21:10:53 +0200 Johannes Berg wrote: > Hi, > > Sorry, somehow didn't see this until now. > > > > +/* Lanes, 1, 2, 4 or 8. */ > > > +#define ETHTOOL_LANES_1 1 > > > +#define ETHTOOL_LANES_2 2 > > > +#define ETHTOOL_LANES_4 4 > > > +#define ETHTOOL_LANES_8 8 > > > > Not an extremely useful set of defines, not sure Michal would agree. > > > > > +#define ETHTOOL_LANES_UNKNOWN 0 > > > struct link_mode_info { > > > int speed; > > > + int lanes; > > > > why signed? > > > > > u8 duplex; > > > }; > > > @@ -274,16 +277,17 @@ const struct nla_policy ethnl_linkmodes_set_policy[] = { > > > [ETHTOOL_A_LINKMODES_SPEED] = { .type = NLA_U32 }, > > > [ETHTOOL_A_LINKMODES_DUPLEX] = { .type = NLA_U8 }, > > > [ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG] = { .type = NLA_U8 }, > > > + [ETHTOOL_A_LINKMODES_LANES] = { .type = NLA_U32 }, > > > > NLA_POLICY_VALIDATE_FN(), not sure why the types for this > > validation_type are limited.. Johannes, just an abundance of caution? > > So let me see if I got this right - you're saying you'd like to use > NLA_POLICY_VALIDATE_FN() for an NLA_U32, to validate against the _LANES > being 1, 2, 4 or 8? > > First of all, you _can_, no? I mean, it's limited by > > #define NLA_ENSURE_NO_VALIDATION_PTR(tp) \ > (__NLA_ENSURE(tp != NLA_BITFIELD32 && \ > tp != NLA_REJECT && \ > tp != NLA_NESTED && \ > tp != NLA_NESTED_ARRAY) + tp) > > and the reason is sort of encoded in that - the types listed here > already use the pointer *regardless of the validation_type*, so you > can't have a pointer to the function in the same union. > > But not sure I understood :-) Yes, you're right. Sorry for the noise, one day I'll learn to read..
> -----Original Message----- > From: Jakub Kicinski <kuba@kernel.org> > Sent: Monday, October 12, 2020 6:58 PM > To: Danielle Ratson <danieller@nvidia.com> > Cc: Ido Schimmel <idosch@idosch.org>; netdev@vger.kernel.org; > davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; andrew@lunn.ch; > f.fainelli@gmail.com; mkubecek@suse.cz; mlxsw <mlxsw@nvidia.com>; Ido > Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI > with lanes > > On Mon, 12 Oct 2020 15:33:45 +0000 Danielle Ratson wrote: > > > What's the use for this in practical terms? Isn't the lane count > > > basically implied by the module that gets plugged in? > > > > The use is to enable the user to decide how to achieve a certain speed. > > For example, if he wants to get 100G and the port has 4 lanes, the > > speed can be achieved it using both 2 lanes of 50G and 4 lanes of 25G, > > as a port with 4 lanes width can work in 2 lanes mode with double > > speed each. So, by specifying "lanes 2" he will achieve 100G using 2 > > lanes of 50G. > > Can you give a concrete example of serdes capabilities of the port, what SFP > gets plugged in and what configuration user wants to select? Example: - swp1 is a 200G port with 4 lanes. - QSFP28 is plugged in. - The user wants to select configuration of 100G speed using 2 lanes, 50G each. $ ethtool swp1 Settings for swp1: Supported ports: [ FIBRE Backplane ] Supported link modes: 1000baseT/Full 10000baseT/Full 1000baseKX/Full 10000baseKR/Full 10000baseR_FEC 40000baseKR4/Full 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 25000baseCR/Full 25000baseKR/Full 25000baseSR/Full 50000baseCR2/Full 50000baseKR2/Full 100000baseKR4/Full 100000baseSR4/Full 100000baseCR4/Full 100000baseLR4_ER4/Full 50000baseSR2/Full 10000baseCR/Full 10000baseSR/Full 10000baseLR/Full 10000baseER/Full 50000baseKR/Full 50000baseSR/Full 50000baseCR/Full 50000baseLR_ER_FR/Full 50000baseDR/Full 100000baseKR2/Full 100000baseSR2/Full 100000baseCR2/Full 100000baseLR2_ER2_FR2/Full 100000baseDR2/Full 200000baseKR4/Full 200000baseSR4/Full 200000baseLR4_ER4_FR4/Full 200000baseDR4/Full 200000baseCR4/Full Advertised link modes: 1000baseT/Full 10000baseT/Full 1000baseKX/Full 10000baseKR/Full 10000baseR_FEC 40000baseKR4/Full 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 25000baseCR/Full 25000baseKR/Full 25000baseSR/Full 50000baseCR2/Full 50000baseKR2/Full 100000baseKR4/Full 100000baseSR4/Full 100000baseCR4/Full 100000baseLR4_ER4/Full 50000baseSR2/Full 10000baseCR/Full 10000baseSR/Full 10000baseLR/Full 10000baseER/Full 50000baseKR/Full 50000baseSR/Full 50000baseCR/Full 50000baseLR_ER_FR/Full 50000baseDR/Full 100000baseKR2/Full 100000baseSR2/Full 100000baseCR2/Full 100000baseLR2_ER2_FR2/Full 100000baseDR2/Full 200000baseKR4/Full 200000baseSR4/Full 200000baseLR4_ER4_FR4/Full 200000baseDR4/Full 200000baseCR4/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: Not reported Speed: Unknown! Lanes: Unknown! Duplex: Unknown! (255) Auto-negotiation: on Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Link detected: no $ ethtool -s swp1 speed 100000 lanes 2 $ ethtool swp1 Settings for swp1: Supported ports: [ FIBRE Backplane ] Supported link modes: 1000baseT/Full 10000baseT/Full 1000baseKX/Full 10000baseKR/Full 10000baseR_FEC 40000baseKR4/Full 40000baseCR4/Full 40000baseSR4/Full 40000baseLR4/Full 25000baseCR/Full 25000baseKR/Full 25000baseSR/Full 50000baseCR2/Full 50000baseKR2/Full 100000baseKR4/Full 100000baseSR4/Full 100000baseCR4/Full 100000baseLR4_ER4/Full 50000baseSR2/Full 10000baseCR/Full 10000baseSR/Full 10000baseLR/Full 10000baseER/Full 50000baseKR/Full 50000baseSR/Full 50000baseCR/Full 50000baseLR_ER_FR/Full 50000baseDR/Full 100000baseKR2/Full 100000baseSR2/Full 100000baseCR2/Full 100000baseLR2_ER2_FR2/Full 100000baseDR2/Full 200000baseKR4/Full 200000baseSR4/Full 200000baseLR4_ER4_FR4/Full 200000baseDR4/Full 200000baseCR4/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: Yes Supported FEC modes: Not reported Advertised link modes: 100000baseKR2/Full 100000baseSR2/Full 100000baseCR2/Full 100000baseLR2_ER2_FR2/Full 100000baseDR2/Full Advertised pause frame use: No Advertised auto-negotiation: Yes Advertised FEC modes: Not reported
On Tue, 13 Oct 2020 14:29:29 +0000 Danielle Ratson wrote: > > On Mon, 12 Oct 2020 15:33:45 +0000 Danielle Ratson wrote: > > > > What's the use for this in practical terms? Isn't the lane count > > > > basically implied by the module that gets plugged in? > > > > > > The use is to enable the user to decide how to achieve a certain speed. > > > For example, if he wants to get 100G and the port has 4 lanes, the > > > speed can be achieved it using both 2 lanes of 50G and 4 lanes of 25G, > > > as a port with 4 lanes width can work in 2 lanes mode with double > > > speed each. So, by specifying "lanes 2" he will achieve 100G using 2 > > > lanes of 50G. > > > > Can you give a concrete example of serdes capabilities of the port, what SFP > > gets plugged in and what configuration user wants to select? > > Example: > - swp1 is a 200G port with 4 lanes. > - QSFP28 is plugged in. > - The user wants to select configuration of 100G speed using 2 lanes, 50G each. I mean.. sounds quite contrived to me. But I'm not opposed if others are fine with the change.
> Example: > - swp1 is a 200G port with 4 lanes. > - QSFP28 is plugged in. > - The user wants to select configuration of 100G speed using 2 lanes, 50G each. > > $ ethtool swp1 > Settings for swp1: > Supported ports: [ FIBRE Backplane ] > Supported link modes: 1000baseT/Full > 10000baseT/Full > 1000baseKX/Full > 10000baseKR/Full > 10000baseR_FEC > 40000baseKR4/Full > 40000baseCR4/Full > 40000baseSR4/Full > 40000baseLR4/Full > 25000baseCR/Full > 25000baseKR/Full > 25000baseSR/Full > 50000baseCR2/Full > 50000baseKR2/Full > 100000baseKR4/Full > 100000baseSR4/Full > 100000baseCR4/Full > 100000baseLR4_ER4/Full > 50000baseSR2/Full > 10000baseCR/Full > 10000baseSR/Full > 10000baseLR/Full > 10000baseER/Full > 50000baseKR/Full > 50000baseSR/Full > 50000baseCR/Full > 50000baseLR_ER_FR/Full > 50000baseDR/Full > 100000baseKR2/Full > 100000baseSR2/Full > 100000baseCR2/Full > 100000baseLR2_ER2_FR2/Full > 100000baseDR2/Full I'm not sure i fully understand all these different link modes, but i thought these 5 are all 100G using 2 lanes? So why cannot the user simply do ethtool -s swp1 advertise 100000baseKR2/Full and the driver can figure out it needs to use two lanes at 50G? Andrew
> -----Original Message----- > From: Andrew Lunn <andrew@lunn.ch> > Sent: Saturday, October 17, 2020 1:16 AM > To: Danielle Ratson <danieller@nvidia.com> > Cc: Jakub Kicinski <kuba@kernel.org>; Ido Schimmel <idosch@idosch.org>; > netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko > <jiri@nvidia.com>; f.fainelli@gmail.com; mkubecek@suse.cz; mlxsw > <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; > johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI > with lanes > > > Example: > > - swp1 is a 200G port with 4 lanes. > > - QSFP28 is plugged in. > > - The user wants to select configuration of 100G speed using 2 lanes, 50G > each. > > > > $ ethtool swp1 > > Settings for swp1: > > Supported ports: [ FIBRE Backplane ] > > Supported link modes: 1000baseT/Full > > 10000baseT/Full > > 1000baseKX/Full > > 10000baseKR/Full > > 10000baseR_FEC > > 40000baseKR4/Full > > 40000baseCR4/Full > > 40000baseSR4/Full > > 40000baseLR4/Full > > 25000baseCR/Full > > 25000baseKR/Full > > 25000baseSR/Full > > 50000baseCR2/Full > > 50000baseKR2/Full > > 100000baseKR4/Full > > 100000baseSR4/Full > > 100000baseCR4/Full > > 100000baseLR4_ER4/Full > > 50000baseSR2/Full > > 10000baseCR/Full > > 10000baseSR/Full > > 10000baseLR/Full > > 10000baseER/Full > > 50000baseKR/Full > > 50000baseSR/Full > > 50000baseCR/Full > > 50000baseLR_ER_FR/Full > > 50000baseDR/Full > > > 100000baseKR2/Full > > 100000baseSR2/Full > > 100000baseCR2/Full > > 100000baseLR2_ER2_FR2/Full > > 100000baseDR2/Full > > I'm not sure i fully understand all these different link modes, but i thought > these 5 are all 100G using 2 lanes? So why cannot the user simply do > > ethtool -s swp1 advertise 100000baseKR2/Full > > and the driver can figure out it needs to use two lanes at 50G? > > Andrew Hi Andrew, Thanks for the feedback. I guess you mean " ethtool -s swp1 advertise 100000baseKR2/Full on". First, the idea might work but only for auto negotiation mode, whereas the lanes parameter is a solution for both. Second, the command as you have suggested it, wouldn't change anything in the current situation as I checked. We can disable all the others and leave only the one we want but the command doesn't filter the other link modes but it just turns the mentioned link modes up if they aren't. However, the lanes parameter is a selector, which make it much more user friendly in my opinion. Also, we can't turn only one of them up. But you have to set for example: $ ethtool -s swp1 advertise 100000baseKR2/Full on 100000baseSR2/Full on 100000baseCR2/Full on 100000baseLR2_ER2_FR2/Full on 100000baseDR2/Full on Am I missing something?
On Mon, Oct 19, 2020 at 07:19:34AM +0000, Danielle Ratson wrote: > > -----Original Message----- > > From: Andrew Lunn <andrew@lunn.ch> > > Sent: Saturday, October 17, 2020 1:16 AM > > > > I'm not sure i fully understand all these different link modes, but > > i thought these 5 are all 100G using 2 lanes? So why cannot the user > > simply do > > > > ethtool -s swp1 advertise 100000baseKR2/Full > > > > and the driver can figure out it needs to use two lanes at 50G? > > > > Andrew > > Hi Andrew, > > Thanks for the feedback. > > I guess you mean " ethtool -s swp1 advertise 100000baseKR2/Full on". > > First, the idea might work but only for auto negotiation mode, whereas > the lanes parameter is a solution for both. > > Second, the command as you have suggested it, wouldn't change anything > in the current situation as I checked. We can disable all the others > and leave only the one we want but the command doesn't filter the > other link modes but it just turns the mentioned link modes up if they > aren't. However, the lanes parameter is a selector, which make it much > more user friendly in my opinion. It would be quite easy to extend the ethtool command line parser to allow also ethtool -s <dev> advertise <mode> ... in addition to already supported ethtool -s <dev> advertise <mask> ethtool -s <dev> advertise <mask>/<mask> ethtool -s { <mode> { on | off } } ... Parser converting simple list of values into a maskless bitset is already there so it would be only matter of checking if there are at least two arguments and second is "on" or "off" and using corresponding parser. I think it would be useful independently of this series. > Also, we can't turn only one of them up. But you have to set for > example: > > $ ethtool -s swp1 advertise 100000baseKR2/Full on 100000baseSR2/Full on 100000baseCR2/Full on 100000baseLR2_ER2_FR2/Full on 100000baseDR2/Full on > > Am I missing something? IIUC Jakub's concern is rather about real life need for such selectors, i.e. how realistic is "I want a(ny) 100Gb/s mode with two lanes" as an actual user need; if it wouldn't be mostly (or only) used as a quick way to distinguish between two supported 100Gb/s modes. IMHO if we go this way, we should consider going all the way, i.e. allow also selecting by the remaining part of the mode ("media type", e.g. "LR", not sure what the official name is) and, more important, get full information about link mode in use from driver (we only get speed and duplex at the moment). But that would require changes in the get_linksettings() interface and drivers. Michal
Sat, Oct 17, 2020 at 12:15:53AM CEST, andrew@lunn.ch wrote: >> Example: >> - swp1 is a 200G port with 4 lanes. >> - QSFP28 is plugged in. >> - The user wants to select configuration of 100G speed using 2 lanes, 50G each. >> >> $ ethtool swp1 >> Settings for swp1: >> Supported ports: [ FIBRE Backplane ] >> Supported link modes: 1000baseT/Full >> 10000baseT/Full >> 1000baseKX/Full >> 10000baseKR/Full >> 10000baseR_FEC >> 40000baseKR4/Full >> 40000baseCR4/Full >> 40000baseSR4/Full >> 40000baseLR4/Full >> 25000baseCR/Full >> 25000baseKR/Full >> 25000baseSR/Full >> 50000baseCR2/Full >> 50000baseKR2/Full >> 100000baseKR4/Full >> 100000baseSR4/Full >> 100000baseCR4/Full >> 100000baseLR4_ER4/Full >> 50000baseSR2/Full >> 10000baseCR/Full >> 10000baseSR/Full >> 10000baseLR/Full >> 10000baseER/Full >> 50000baseKR/Full >> 50000baseSR/Full >> 50000baseCR/Full >> 50000baseLR_ER_FR/Full >> 50000baseDR/Full > >> 100000baseKR2/Full >> 100000baseSR2/Full >> 100000baseCR2/Full >> 100000baseLR2_ER2_FR2/Full >> 100000baseDR2/Full > >I'm not sure i fully understand all these different link modes, but i >thought these 5 are all 100G using 2 lanes? So why cannot the user >simply do > >ethtool -s swp1 advertise 100000baseKR2/Full > >and the driver can figure out it needs to use two lanes at 50G? 100000baseKR2 is 2 lanes. No need to figure anything out. What do you mean by that? > > Andrew
Mon, Oct 19, 2020 at 01:04:22PM CEST, mkubecek@suse.cz wrote: >On Mon, Oct 19, 2020 at 07:19:34AM +0000, Danielle Ratson wrote: >> > -----Original Message----- >> > From: Andrew Lunn <andrew@lunn.ch> >> > Sent: Saturday, October 17, 2020 1:16 AM >> > >> > I'm not sure i fully understand all these different link modes, but >> > i thought these 5 are all 100G using 2 lanes? So why cannot the user >> > simply do >> > >> > ethtool -s swp1 advertise 100000baseKR2/Full >> > >> > and the driver can figure out it needs to use two lanes at 50G? >> > >> > Andrew >> >> Hi Andrew, >> >> Thanks for the feedback. >> >> I guess you mean " ethtool -s swp1 advertise 100000baseKR2/Full on". >> >> First, the idea might work but only for auto negotiation mode, whereas >> the lanes parameter is a solution for both. >> >> Second, the command as you have suggested it, wouldn't change anything >> in the current situation as I checked. We can disable all the others >> and leave only the one we want but the command doesn't filter the >> other link modes but it just turns the mentioned link modes up if they >> aren't. However, the lanes parameter is a selector, which make it much >> more user friendly in my opinion. > >It would be quite easy to extend the ethtool command line parser to >allow also > > ethtool -s <dev> advertise <mode> ... > >in addition to already supported > > ethtool -s <dev> advertise <mask> > ethtool -s <dev> advertise <mask>/<mask> > ethtool -s { <mode> { on | off } } ... > >Parser converting simple list of values into a maskless bitset is >already there so it would be only matter of checking if there are at >least two arguments and second is "on" or "off" and using corresponding >parser. I think it would be useful independently of this series. Understood. So basically you will pass some selectors on cmdline and the uapi would stay intact. How do you imagine the specific lane number selection should look like on the cmdline? > >> Also, we can't turn only one of them up. But you have to set for >> example: >> >> $ ethtool -s swp1 advertise 100000baseKR2/Full on 100000baseSR2/Full on 100000baseCR2/Full on 100000baseLR2_ER2_FR2/Full on 100000baseDR2/Full on >> >> Am I missing something? > >IIUC Jakub's concern is rather about real life need for such selectors, >i.e. how realistic is "I want a(ny) 100Gb/s mode with two lanes" as an >actual user need; if it wouldn't be mostly (or only) used as a quick way >to distinguish between two supported 100Gb/s modes. > >IMHO if we go this way, we should consider going all the way, i.e. allow >also selecting by the remaining part of the mode ("media type", e.g. >"LR", not sure what the official name is) and, more important, get full >information about link mode in use from driver (we only get speed and >duplex at the moment). But that would require changes in the >get_linksettings() interface and drivers. > >Michal
> >> 100000baseKR2/Full > >> 100000baseSR2/Full > >> 100000baseCR2/Full > >> 100000baseLR2_ER2_FR2/Full > >> 100000baseDR2/Full > > > >I'm not sure i fully understand all these different link modes, but i > >thought these 5 are all 100G using 2 lanes? So why cannot the user > >simply do > > > >ethtool -s swp1 advertise 100000baseKR2/Full > > > >and the driver can figure out it needs to use two lanes at 50G? > > 100000baseKR2 is 2 lanes. No need to figure anything out. What do you > mean by that? I'm just thinking, rather than add a new UAPI for the number of lanes, why not use one we already have, if the number of lanes is implied by the link mode. Andrew
On Mon, Oct 19, 2020 at 02:26:43PM +0200, Jiri Pirko wrote: > Mon, Oct 19, 2020 at 01:04:22PM CEST, mkubecek@suse.cz wrote: > > > >It would be quite easy to extend the ethtool command line parser to > >allow also > > > > ethtool -s <dev> advertise <mode> ... > > > >in addition to already supported > > > > ethtool -s <dev> advertise <mask> > > ethtool -s <dev> advertise <mask>/<mask> > > ethtool -s { <mode> { on | off } } ... This should have been ethtool -s <dev> advertise { <mode> { on | off } } ... > >Parser converting simple list of values into a maskless bitset is > >already there so it would be only matter of checking if there are at > >least two arguments and second is "on" or "off" and using corresponding > >parser. I think it would be useful independently of this series. > > Understood. So basically you will pass some selectors on cmdline and the > uapi would stay intact. > How do you imagine the specific lane number selection should look like > on the cmdline? As I said, I meant the extension suggested in my mail as independent of what this series is about. For lanes count selector, I find proposed ethtool -s <dev> ... lanes <lanes_num> ... the most natural.
> -----Original Message----- > From: Michal Kubecek <mkubecek@suse.cz> > Sent: Monday, October 19, 2020 4:25 PM > To: Jiri Pirko <jiri@resnulli.us> > Cc: Danielle Ratson <danieller@nvidia.com>; Andrew Lunn > <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel > <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri > Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw <mlxsw@nvidia.com>; > Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI > with lanes > > On Mon, Oct 19, 2020 at 02:26:43PM +0200, Jiri Pirko wrote: > > Mon, Oct 19, 2020 at 01:04:22PM CEST, mkubecek@suse.cz wrote: > > > > > >It would be quite easy to extend the ethtool command line parser to > > >allow also > > > > > > ethtool -s <dev> advertise <mode> ... > > > > > >in addition to already supported > > > > > > ethtool -s <dev> advertise <mask> > > > ethtool -s <dev> advertise <mask>/<mask> ethtool -s { <mode> { on > > > | off } } ... > > This should have been > > ethtool -s <dev> advertise { <mode> { on | off } } ... > > > >Parser converting simple list of values into a maskless bitset is > > >already there so it would be only matter of checking if there are at > > >least two arguments and second is "on" or "off" and using > > >corresponding parser. I think it would be useful independently of this > series. > > > > Understood. So basically you will pass some selectors on cmdline and > > the uapi would stay intact. > > How do you imagine the specific lane number selection should look like > > on the cmdline? > > As I said, I meant the extension suggested in my mail as independent of what > this series is about. For lanes count selector, I find proposed > > ethtool -s <dev> ... lanes <lanes_num> ... > > the most natural. > > From purely syntactic/semantic point of view, there are three types of > requests: > > (1) enable specific set of modes, disable the rest > (2) enable/disable specific modes, leave the rest as they are > (3) enable modes matching a condition (and disable the rest) > > What I proposed was to allow the use symbolic names instead of masks > (which are getting more and more awful with each new mode) also for (1), > like they can already be used for (2). > > The lanes selector is an extension of (3) which I would prefer not to mix with > (1) or (2) within one command line, i.e. either "advertise" or "speed / duplex > / lanes". > > IIUC Jakub's and Andrew's comments were not so much about the syntax > and semantic (which is quite clear) but rather about the question if the > requests like "advertise exactly the modes with (100Gb/s speed and) two > lanes" would really address a real life need and wouldn't be more often used > as shortcuts for "advertise 100000baseKR2/Full". (On the other hand, I > suspect existing speed and duplex selectors are often used the same way.) > > Michal So, do you want to change the current approach somehow or we are good to go with this one, keeping in mind the future extension you have suggested? Thanks.
On Tue, Oct 20, 2020 at 07:39:13AM +0000, Danielle Ratson wrote: > > -----Original Message----- > > From: Michal Kubecek <mkubecek@suse.cz> > > Sent: Monday, October 19, 2020 4:25 PM > > > > As I said, I meant the extension suggested in my mail as independent of what > > this series is about. For lanes count selector, I find proposed > > > > ethtool -s <dev> ... lanes <lanes_num> ... > > > > the most natural. > > > > From purely syntactic/semantic point of view, there are three types of > > requests: > > > > (1) enable specific set of modes, disable the rest > > (2) enable/disable specific modes, leave the rest as they are > > (3) enable modes matching a condition (and disable the rest) > > > > What I proposed was to allow the use symbolic names instead of masks > > (which are getting more and more awful with each new mode) also for (1), > > like they can already be used for (2). > > > > The lanes selector is an extension of (3) which I would prefer not to mix with > > (1) or (2) within one command line, i.e. either "advertise" or "speed / duplex > > / lanes". > > > > IIUC Jakub's and Andrew's comments were not so much about the syntax > > and semantic (which is quite clear) but rather about the question if the > > requests like "advertise exactly the modes with (100Gb/s speed and) two > > lanes" would really address a real life need and wouldn't be more often used > > as shortcuts for "advertise 100000baseKR2/Full". (On the other hand, I > > suspect existing speed and duplex selectors are often used the same way.) > > > > Michal > > So, do you want to change the current approach somehow or we are good > to go with this one, keeping in mind the future extension you have > suggested? As far as I'm concerned, it makes sense as it is. The only thing I'm not happy about is ETHTOOL_A_LINKMODES_LANES being a "write only" attribute (unlike _SPEED and _DUPLEX) but being able to query this information would require extensive changes far beyond the scope of this series. Michal
> -----Original Message----- > From: Michal Kubecek <mkubecek@suse.cz> > Sent: Wednesday, October 21, 2020 10:08 AM > To: Danielle Ratson <danieller@nvidia.com> > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub > Kicinski <kuba@kernel.org>; Ido Schimmel <idosch@idosch.org>; > netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko > <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw <mlxsw@nvidia.com>; Ido > Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI > with lanes > > On Tue, Oct 20, 2020 at 07:39:13AM +0000, Danielle Ratson wrote: > > > -----Original Message----- > > > From: Michal Kubecek <mkubecek@suse.cz> > > > Sent: Monday, October 19, 2020 4:25 PM > > > > > > As I said, I meant the extension suggested in my mail as independent > > > of what this series is about. For lanes count selector, I find > > > proposed > > > > > > ethtool -s <dev> ... lanes <lanes_num> ... > > > > > > the most natural. > > > > > > From purely syntactic/semantic point of view, there are three types > > > of > > > requests: > > > > > > (1) enable specific set of modes, disable the rest > > > (2) enable/disable specific modes, leave the rest as they are > > > (3) enable modes matching a condition (and disable the rest) > > > > > > What I proposed was to allow the use symbolic names instead of masks > > > (which are getting more and more awful with each new mode) also for > > > (1), like they can already be used for (2). > > > > > > The lanes selector is an extension of (3) which I would prefer not > > > to mix with > > > (1) or (2) within one command line, i.e. either "advertise" or > > > "speed / duplex / lanes". > > > > > > IIUC Jakub's and Andrew's comments were not so much about the syntax > > > and semantic (which is quite clear) but rather about the question if > > > the requests like "advertise exactly the modes with (100Gb/s speed > > > and) two lanes" would really address a real life need and wouldn't > > > be more often used as shortcuts for "advertise 100000baseKR2/Full". > > > (On the other hand, I suspect existing speed and duplex selectors > > > are often used the same way.) > > > > > > Michal > > > > So, do you want to change the current approach somehow or we are good > > to go with this one, keeping in mind the future extension you have > > suggested? > > As far as I'm concerned, it makes sense as it is. The only thing I'm not happy > about is ETHTOOL_A_LINKMODES_LANES being a "write only" attribute > (unlike _SPEED and _DUPLEX) but being able to query this information would > require extensive changes far beyond the scope of this series. > > Michal If I understood you correctly, patch #5 does that query, isn't it? "mlxsw: ethtool: Expose the number of lanes in use"
On Wed, Oct 21, 2020 at 07:20:22AM +0000, Danielle Ratson wrote: > > -----Original Message----- > > From: Michal Kubecek <mkubecek@suse.cz> > > Sent: Wednesday, October 21, 2020 10:08 AM > > > > On Tue, Oct 20, 2020 at 07:39:13AM +0000, Danielle Ratson wrote: > > > > -----Original Message----- > > > > From: Michal Kubecek <mkubecek@suse.cz> > > > > Sent: Monday, October 19, 2020 4:25 PM > > > > > > > > As I said, I meant the extension suggested in my mail as independent > > > > of what this series is about. For lanes count selector, I find > > > > proposed > > > > > > > > ethtool -s <dev> ... lanes <lanes_num> ... > > > > > > > > the most natural. > > > > > > > > From purely syntactic/semantic point of view, there are three types > > > > of > > > > requests: > > > > > > > > (1) enable specific set of modes, disable the rest > > > > (2) enable/disable specific modes, leave the rest as they are > > > > (3) enable modes matching a condition (and disable the rest) > > > > > > > > What I proposed was to allow the use symbolic names instead of masks > > > > (which are getting more and more awful with each new mode) also for > > > > (1), like they can already be used for (2). > > > > > > > > The lanes selector is an extension of (3) which I would prefer not > > > > to mix with > > > > (1) or (2) within one command line, i.e. either "advertise" or > > > > "speed / duplex / lanes". > > > > > > > > IIUC Jakub's and Andrew's comments were not so much about the syntax > > > > and semantic (which is quite clear) but rather about the question if > > > > the requests like "advertise exactly the modes with (100Gb/s speed > > > > and) two lanes" would really address a real life need and wouldn't > > > > be more often used as shortcuts for "advertise 100000baseKR2/Full". > > > > (On the other hand, I suspect existing speed and duplex selectors > > > > are often used the same way.) > > > > > > > > Michal > > > > > > So, do you want to change the current approach somehow or we are good > > > to go with this one, keeping in mind the future extension you have > > > suggested? > > > > As far as I'm concerned, it makes sense as it is. The only thing I'm not happy > > about is ETHTOOL_A_LINKMODES_LANES being a "write only" attribute > > (unlike _SPEED and _DUPLEX) but being able to query this information would > > require extensive changes far beyond the scope of this series. > > If I understood you correctly, patch #5 does that query, isn't it? > "mlxsw: ethtool: Expose the number of lanes in use" Ah, right, it does. But as you extend struct ethtool_link_ksettings and drivers will need to be updated to provide this information, wouldn't it be more useful to let the driver provide link mode in use instead (and derive number of lanes from it)? Michal
> -----Original Message----- > From: Michal Kubecek <mkubecek@suse.cz> > Sent: Wednesday, October 21, 2020 11:48 AM > To: Danielle Ratson <danieller@nvidia.com> > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub > Kicinski <kuba@kernel.org>; Ido Schimmel <idosch@idosch.org>; > netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko > <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw <mlxsw@nvidia.com>; Ido > Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI > with lanes > > On Wed, Oct 21, 2020 at 07:20:22AM +0000, Danielle Ratson wrote: > > > -----Original Message----- > > > From: Michal Kubecek <mkubecek@suse.cz> > > > Sent: Wednesday, October 21, 2020 10:08 AM > > > > > > On Tue, Oct 20, 2020 at 07:39:13AM +0000, Danielle Ratson wrote: > > > > > -----Original Message----- > > > > > From: Michal Kubecek <mkubecek@suse.cz> > > > > > Sent: Monday, October 19, 2020 4:25 PM > > > > > > > > > > As I said, I meant the extension suggested in my mail as > > > > > independent of what this series is about. For lanes count > > > > > selector, I find proposed > > > > > > > > > > ethtool -s <dev> ... lanes <lanes_num> ... > > > > > > > > > > the most natural. > > > > > > > > > > From purely syntactic/semantic point of view, there are three > > > > > types of > > > > > requests: > > > > > > > > > > (1) enable specific set of modes, disable the rest > > > > > (2) enable/disable specific modes, leave the rest as they are > > > > > (3) enable modes matching a condition (and disable the rest) > > > > > > > > > > What I proposed was to allow the use symbolic names instead of > > > > > masks (which are getting more and more awful with each new mode) > > > > > also for (1), like they can already be used for (2). > > > > > > > > > > The lanes selector is an extension of (3) which I would prefer > > > > > not to mix with > > > > > (1) or (2) within one command line, i.e. either "advertise" or > > > > > "speed / duplex / lanes". > > > > > > > > > > IIUC Jakub's and Andrew's comments were not so much about the > > > > > syntax and semantic (which is quite clear) but rather about the > > > > > question if the requests like "advertise exactly the modes with > > > > > (100Gb/s speed > > > > > and) two lanes" would really address a real life need and > > > > > wouldn't be more often used as shortcuts for "advertise > 100000baseKR2/Full". > > > > > (On the other hand, I suspect existing speed and duplex > > > > > selectors are often used the same way.) > > > > > > > > > > Michal > > > > > > > > So, do you want to change the current approach somehow or we are > > > > good to go with this one, keeping in mind the future extension you > > > > have suggested? > > > > > > As far as I'm concerned, it makes sense as it is. The only thing I'm > > > not happy about is ETHTOOL_A_LINKMODES_LANES being a "write only" > > > attribute (unlike _SPEED and _DUPLEX) but being able to query this > > > information would require extensive changes far beyond the scope of > this series. > > > > If I understood you correctly, patch #5 does that query, isn't it? > > "mlxsw: ethtool: Expose the number of lanes in use" > > Ah, right, it does. But as you extend struct ethtool_link_ksettings and drivers > will need to be updated to provide this information, wouldn't it be more > useful to let the driver provide link mode in use instead (and derive number > of lanes from it)? > > Michal This is the way it is done with the speed parameter, so I have aligned it to it. Why the lanes should be done differently comparing to the speed?
On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote: > > -----Original Message----- > > From: Michal Kubecek <mkubecek@suse.cz> > > Sent: Wednesday, October 21, 2020 11:48 AM > > > > Ah, right, it does. But as you extend struct ethtool_link_ksettings > > and drivers will need to be updated to provide this information, > > wouldn't it be more useful to let the driver provide link mode in > > use instead (and derive number of lanes from it)? > > This is the way it is done with the speed parameter, so I have aligned > it to it. Why the lanes should be done differently comparing to the > speed? Speed and duplex have worked this way since ages and the interface was probably introduced back in times when combination of speed and duplex was sufficient to identify the link mode. This is no longer the case and even adding number of lanes wouldn't make the combination unique. So if we are going to extend the interface now and update drivers to provide extra information, I believe it would be more useful to provide full information. Michal
On Sat, Oct 10, 2020 at 3:54 PM Ido Schimmel <idosch@idosch.org> wrote: > Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct > ethtool_link_settings' with lanes field in order to implement a new > lanes-selector that will enable the user to advertise a specific number > of lanes as well. Why can't this be implied by port break-out configuration? For higher speed signalling modes like PAM4, what's the difference between a port with unused lanes vs the same port split into multiple logical ports? In essence, the driver could then always choose the slowest signalling mode that utilizes all the available lanes. Regards, Edwin Peer
Thu, Nov 19, 2020 at 09:38:34PM CET, edwin.peer@broadcom.com wrote: >On Sat, Oct 10, 2020 at 3:54 PM Ido Schimmel <idosch@idosch.org> wrote: > >> Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct >> ethtool_link_settings' with lanes field in order to implement a new >> lanes-selector that will enable the user to advertise a specific number >> of lanes as well. > >Why can't this be implied by port break-out configuration? For higher >speed signalling modes like PAM4, what's the difference between a >port with unused lanes vs the same port split into multiple logical >ports? In essence, the driver could then always choose the slowest There is a crucial difference. Split port is configured alwasy by user. Each split port has a devlink instace, netdevice associated with it. It is one level above the lanes. >signalling mode that utilizes all the available lanes. > >Regards, >Edwin Peer
> -----Original Message----- > From: Michal Kubecek <mkubecek@suse.cz> > Sent: Thursday, October 22, 2020 7:28 PM > To: Danielle Ratson <danieller@nvidia.com> > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel > <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw > <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes > > On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote: > > > -----Original Message----- > > > From: Michal Kubecek <mkubecek@suse.cz> > > > Sent: Wednesday, October 21, 2020 11:48 AM > > > > > > Ah, right, it does. But as you extend struct ethtool_link_ksettings > > > and drivers will need to be updated to provide this information, > > > wouldn't it be more useful to let the driver provide link mode in > > > use instead (and derive number of lanes from it)? > > > > This is the way it is done with the speed parameter, so I have aligned > > it to it. Why the lanes should be done differently comparing to the > > speed? > > Speed and duplex have worked this way since ages and the interface was probably introduced back in times when combination of > speed and duplex was sufficient to identify the link mode. This is no longer the case and even adding number of lanes wouldn't make > the combination unique. So if we are going to extend the interface now and update drivers to provide extra information, I believe it > would be more useful to provide full information. > > Michal Hi Michal, What do you think of passing the link modes you have suggested as a bitmask, similar to "supported", that contains only one positive bit? Something like that: diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index afae2beacbc3..dd946c88daa3 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -127,6 +127,7 @@ struct ethtool_link_ksettings { __ETHTOOL_DECLARE_LINK_MODE_MASK(supported); __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising); __ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising); + __ETHTOOL_DECLARE_LINK_MODE_MASK(chosen); } link_modes; u32 lanes; }; Do you have perhaps a better suggestion? And the speed and duplex parameters should be removed from being passed like as well, right? Thanks, Danielle
On Mon, Nov 23, 2020 at 09:47:53AM +0000, Danielle Ratson wrote: > > > > -----Original Message----- > > From: Michal Kubecek <mkubecek@suse.cz> > > Sent: Thursday, October 22, 2020 7:28 PM > > To: Danielle Ratson <danieller@nvidia.com> > > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel > > <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw > > <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes > > > > On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote: > > > > -----Original Message----- > > > > From: Michal Kubecek <mkubecek@suse.cz> > > > > Sent: Wednesday, October 21, 2020 11:48 AM > > > > > > > > Ah, right, it does. But as you extend struct ethtool_link_ksettings > > > > and drivers will need to be updated to provide this information, > > > > wouldn't it be more useful to let the driver provide link mode in > > > > use instead (and derive number of lanes from it)? > > > > > > This is the way it is done with the speed parameter, so I have aligned > > > it to it. Why the lanes should be done differently comparing to the > > > speed? > > > > Speed and duplex have worked this way since ages and the interface was probably introduced back in times when combination of > > speed and duplex was sufficient to identify the link mode. This is no longer the case and even adding number of lanes wouldn't make > > the combination unique. So if we are going to extend the interface now and update drivers to provide extra information, I believe it > > would be more useful to provide full information. > > > > Michal > > Hi Michal, > > What do you think of passing the link modes you have suggested as a > bitmask, similar to "supported", that contains only one positive bit? > Something like that: > > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h > index afae2beacbc3..dd946c88daa3 100644 > --- a/include/linux/ethtool.h > +++ b/include/linux/ethtool.h > @@ -127,6 +127,7 @@ struct ethtool_link_ksettings { > __ETHTOOL_DECLARE_LINK_MODE_MASK(supported); > __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising); > __ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising); > + __ETHTOOL_DECLARE_LINK_MODE_MASK(chosen); > } link_modes; > u32 lanes; > }; > > Do you have perhaps a better suggestion? Not sure if it's better but as we know there is only one mode, we could simply pass the index. We would still need to reserve a special value for none/unknown but getting an index would make lookup easier. > And the speed and duplex parameters should be removed from being > passed like as well, right? We cannot remove them from struct ethtool_link_settings and the ioctl and netlink messages as those are part of UAPI and we have to preserve backward compatibility. But drivers which provide link mode would not need to fill speed and duplex in their ->get_link_ksettings() as the common code could do that for them. Michal
> -----Original Message----- > From: Michal Kubecek <mkubecek@suse.cz> > Sent: Wednesday, November 25, 2020 12:12 AM > To: Danielle Ratson <danieller@nvidia.com> > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel > <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw > <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes > > On Mon, Nov 23, 2020 at 09:47:53AM +0000, Danielle Ratson wrote: > > > > > > > -----Original Message----- > > > From: Michal Kubecek <mkubecek@suse.cz> > > > Sent: Thursday, October 22, 2020 7:28 PM > > > To: Danielle Ratson <danieller@nvidia.com> > > > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; > > > Jakub Kicinski <kuba@kernel.org>; Ido Schimmel <idosch@idosch.org>; > > > netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko > > > <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw <mlxsw@nvidia.com>; > > > Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > > > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes > > > settings uAPI with lanes > > > > > > On Thu, Oct 22, 2020 at 06:15:48AM +0000, Danielle Ratson wrote: > > > > > -----Original Message----- > > > > > From: Michal Kubecek <mkubecek@suse.cz> > > > > > Sent: Wednesday, October 21, 2020 11:48 AM > > > > > > > > > > Ah, right, it does. But as you extend struct > > > > > ethtool_link_ksettings and drivers will need to be updated to > > > > > provide this information, wouldn't it be more useful to let the > > > > > driver provide link mode in use instead (and derive number of lanes from it)? > > > > > > > > This is the way it is done with the speed parameter, so I have > > > > aligned it to it. Why the lanes should be done differently > > > > comparing to the speed? > > > > > > Speed and duplex have worked this way since ages and the interface > > > was probably introduced back in times when combination of speed and > > > duplex was sufficient to identify the link mode. This is no longer > > > the case and even adding number of lanes wouldn't make the combination unique. So if we are going to extend the interface now > and update drivers to provide extra information, I believe it would be more useful to provide full information. > > > > > > Michal > > > > Hi Michal, > > > > What do you think of passing the link modes you have suggested as a > > bitmask, similar to "supported", that contains only one positive bit? > > Something like that: Hi Michal, Thanks for your response. Actually what I said is not very accurate. In ethtool, for speed 100G and 4 lanes for example, there are few link modes that fits: ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT The difference is the media. And in the driver we shrink into one bit. But maybe that makes passing a bitmask more sense, or am I missing something? > > > > diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index > > afae2beacbc3..dd946c88daa3 100644 > > --- a/include/linux/ethtool.h > > +++ b/include/linux/ethtool.h > > @@ -127,6 +127,7 @@ struct ethtool_link_ksettings { > > __ETHTOOL_DECLARE_LINK_MODE_MASK(supported); > > __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising); > > __ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising); > > + __ETHTOOL_DECLARE_LINK_MODE_MASK(chosen); > > } link_modes; > > u32 lanes; > > }; > > > > Do you have perhaps a better suggestion? > > Not sure if it's better but as we know there is only one mode, we could simply pass the index. We would still need to reserve a special > value for none/unknown but getting an index would make lookup easier. > > > And the speed and duplex parameters should be removed from being > > passed like as well, right? > > We cannot remove them from struct ethtool_link_settings and the ioctl and netlink messages as those are part of UAPI and we have > to preserve backward compatibility. But drivers which provide link mode would not need to fill speed and duplex in their - > >get_link_ksettings() as the common code could do that for them. Yes of course I didn't mean to remove the parameters from the struct, just to not prepare them for passing to ethtool when getting the link settings. Thanks. > > Michal
On Wed, Nov 25, 2020 at 10:35:35AM +0000, Danielle Ratson wrote: > > > What do you think of passing the link modes you have suggested as a > > > bitmask, similar to "supported", that contains only one positive bit? > > > Something like that: > > Hi Michal, > > Thanks for your response. > > Actually what I said is not very accurate. > In ethtool, for speed 100G and 4 lanes for example, there are few link modes that fits: > ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT > ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT > ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT > ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT > > The difference is the media. And in the driver we shrink into one bit. > But maybe that makes passing a bitmask more sense, or am I missing something? But as far as I understand, at any moment, only one of these will be actually in use so that's what the driver should report. Or is the problem that the driver cannot identify the media in use? (To be precise: by "cannot identify" I mean "it's not possible for the driver to find out", not "current code does not distinguish".) Michal
On Mon, Nov 23, 2020 at 1:40 AM Jiri Pirko <jiri@resnulli.us> wrote: > >Why can't this be implied by port break-out configuration? For higher > >speed signalling modes like PAM4, what's the difference between a > >port with unused lanes vs the same port split into multiple logical > >ports? In essence, the driver could then always choose the slowest > > There is a crucial difference. Split port is configured alwasy by user. > Each split port has a devlink instace, netdevice associated with it. > It is one level above the lanes. Right, but the one still implies the other. Splitting the port implies fewer lanes available. I understand the concern if the device cannot provide sufficient MAC resources to provide for the additional ports, but leaving a net device unused (with the option to utilize an additional, now spare, port) still seems better to me than leaving lanes unused and always wasted. Otherwise, the earlier suggestion of fully specifying the forced link mode (although I don't think Andrew articulated it quite that way) instead of a forced speed and separate lane mode makes most sense. Regards, Edwin Peer
Mon, Nov 30, 2020 at 06:01:43PM CET, edwin.peer@broadcom.com wrote: >On Mon, Nov 23, 2020 at 1:40 AM Jiri Pirko <jiri@resnulli.us> wrote: > >> >Why can't this be implied by port break-out configuration? For higher >> >speed signalling modes like PAM4, what's the difference between a >> >port with unused lanes vs the same port split into multiple logical >> >ports? In essence, the driver could then always choose the slowest >> >> There is a crucial difference. Split port is configured alwasy by user. >> Each split port has a devlink instace, netdevice associated with it. >> It is one level above the lanes. > >Right, but the one still implies the other. Splitting the port implies fewer >lanes available. > >I understand the concern if the device cannot provide sufficient MAC >resources to provide for the additional ports, but leaving a net device >unused (with the option to utilize an additional, now spare, port) still >seems better to me than leaving lanes unused and always wasted. I don't follow what exactly are you implying. Could you elaborate a bit more? > >Otherwise, the earlier suggestion of fully specifying the forced link >mode (although I don't think Andrew articulated it quite that way) >instead of a forced speed and separate lane mode makes most >sense. > >Regards, >Edwin Peer
On Mon, Nov 30, 2020 at 9:14 AM Jiri Pirko <jiri@resnulli.us> wrote: > >> There is a crucial difference. Split port is configured alwasy by user. > >> Each split port has a devlink instace, netdevice associated with it. > >> It is one level above the lanes. > > > >Right, but the one still implies the other. Splitting the port implies fewer > >lanes available. > > > >I understand the concern if the device cannot provide sufficient MAC > >resources to provide for the additional ports, but leaving a net device > >unused (with the option to utilize an additional, now spare, port) still > >seems better to me than leaving lanes unused and always wasted. > > I don't follow what exactly are you implying. Could you elaborate a bit > more? Perhaps an example... Consider a physical QSFP connector comprising 4 lanes. Today, if the speed is forced, we would achieve 100G speeds using all 4 lanes with NRZ encoding. If we configure the port for PAM4 encoding at the same speed, then we only require 2 of the available 4 lanes. The remaining 2 lanes are wasted. If we only require 2 of the 4 lanes, why not split the port and request the same speed of one of the now split out ports? Now, this same speed is only achievable using PAM4 encoding (it is implied) and we have a spare, potentially usable, assuming an appropriate break- out cable, port instead of the 2 unused lanes. So concretely, I'm suggesting that if we want to force PAM4 at the lower speeds, split the port and then we don't need an ethtool interface change at all to achieve the same goal. Having a spare (potentially usable) port is better than spare (unusable) lanes. Now, if the port can't be split for some reason (perhaps there aren't sufficient device MAC resources, stats contexts, whatever), then that's a different story. But, even so, perhaps the port lane topology more appropriately belongs as part of a device configuration interface in devlink and the number of lanes available to a port should be a property of the port instead of a link mode knob? Regards, Edwin Peer
Mon, Nov 30, 2020 at 07:00:16PM CET, edwin.peer@broadcom.com wrote: >On Mon, Nov 30, 2020 at 9:14 AM Jiri Pirko <jiri@resnulli.us> wrote: > >> >> There is a crucial difference. Split port is configured alwasy by user. >> >> Each split port has a devlink instace, netdevice associated with it. >> >> It is one level above the lanes. >> > >> >Right, but the one still implies the other. Splitting the port implies fewer >> >lanes available. >> > >> >I understand the concern if the device cannot provide sufficient MAC >> >resources to provide for the additional ports, but leaving a net device >> >unused (with the option to utilize an additional, now spare, port) still >> >seems better to me than leaving lanes unused and always wasted. >> >> I don't follow what exactly are you implying. Could you elaborate a bit >> more? > >Perhaps an example... > >Consider a physical QSFP connector comprising 4 lanes. Today, if the >speed is forced, we would achieve 100G speeds using all 4 lanes with >NRZ encoding. If we configure the port for PAM4 encoding at the same >speed, then we only require 2 of the available 4 lanes. The remaining 2 >lanes are wasted. If we only require 2 of the 4 lanes, why not split the >port and request the same speed of one of the now split out ports? Now, >this same speed is only achievable using PAM4 encoding (it is implied) >and we have a spare, potentially usable, assuming an appropriate break- >out cable, port instead of the 2 unused lanes. I don't see how this dynamic split port could work in real life to be honest. The split is something admin needs to configure and can rely that netdevice exists all the time and not comes and goes under different circumstances. Multiple obvious reasons why. One way or another, I don't see the relation between this and this patchset. > >So concretely, I'm suggesting that if we want to force PAM4 at the lower >speeds, split the port and then we don't need an ethtool interface change >at all to achieve the same goal. Having a spare (potentially usable) port >is better than spare (unusable) lanes. The admin has to decide, define. > >Now, if the port can't be split for some reason (perhaps there aren't >sufficient device MAC resources, stats contexts, whatever), then that's >a different story. But, even so, perhaps the port lane topology more >appropriately belongs as part of a device configuration interface in >devlink and the number of lanes available to a port should be a >property of the port instead of a link mode knob? > >Regards, >Edwin Peer
> -----Original Message----- > From: Michal Kubecek <mkubecek@suse.cz> > Sent: Thursday, November 26, 2020 11:08 PM > To: Danielle Ratson <danieller@nvidia.com> > Cc: Jiri Pirko <jiri@resnulli.us>; Andrew Lunn <andrew@lunn.ch>; Jakub Kicinski <kuba@kernel.org>; Ido Schimmel > <idosch@idosch.org>; netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@nvidia.com>; f.fainelli@gmail.com; mlxsw > <mlxsw@nvidia.com>; Ido Schimmel <idosch@nvidia.com>; johannes@sipsolutions.net > Subject: Re: [PATCH net-next 1/6] ethtool: Extend link modes settings uAPI with lanes > > On Wed, Nov 25, 2020 at 10:35:35AM +0000, Danielle Ratson wrote: > > > > What do you think of passing the link modes you have suggested as > > > > a bitmask, similar to "supported", that contains only one positive bit? > > > > Something like that: > > > > Hi Michal, > > > > Thanks for your response. > > > > Actually what I said is not very accurate. > > In ethtool, for speed 100G and 4 lanes for example, there are few link modes that fits: > > ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT > > ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT > > ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT > > ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT > > > > The difference is the media. And in the driver we shrink into one bit. > > But maybe that makes passing a bitmask more sense, or am I missing something? > > But as far as I understand, at any moment, only one of these will be actually in use so that's what the driver should report. Or is the > problem that the driver cannot identify the media in use? (To be > precise: by "cannot identify" I mean "it's not possible for the driver to find out", not "current code does not distinguish".) > > Michal After more investigation, those are my conclusions: We have two types of supported asics in the driver- one of them is able to distinguish between the medias and the other one doesn't. So in the first type I can send one bit as you requested from the driver to ethtool but in the other one I can't. The suggestions I have are: 1. Add a bit that for unknown media for each link (something like ETHTOOL_LINK_MODE_100000unknown_Full_BIT). I am not sure it is even possible or makes sense. 2. Pass the link mode as bitmap. Do you see any other option? Thanks.
On Tue, Dec 1, 2020 at 3:22 AM Jiri Pirko <jiri@resnulli.us> wrote: > >Consider a physical QSFP connector comprising 4 lanes. Today, if the > >speed is forced, we would achieve 100G speeds using all 4 lanes with > >NRZ encoding. If we configure the port for PAM4 encoding at the same > >speed, then we only require 2 of the available 4 lanes. The remaining 2 > >lanes are wasted. If we only require 2 of the 4 lanes, why not split the > >port and request the same speed of one of the now split out ports? Now, > >this same speed is only achievable using PAM4 encoding (it is implied) > >and we have a spare, potentially usable, assuming an appropriate break- > >out cable, port instead of the 2 unused lanes. > > I don't see how this dynamic split port could work in real life to be > honest. The split is something admin needs to configure and can rely > that netdevice exists all the time and not comes and goes under > different circumstances. Multiple obvious reasons why. I'm not suggesting the port split be dynamic at all. I'm suggesting that if the admin wants or needs to force PAM4 on a port that would otherwise be able to achieve the given speed using more lanes with NRZ, then the admin should split the port, so that it has fewer lanes, in order to make that intent clear (or otherwise configure the port to have fewer lanes attached, if you really don't want to or can't create the additional split port). Using this approach, the existing ethtool forced speed interface is sufficient to configure all possible lane encodings, because the encoding that the driver must select is now implicit (note, we don't need to care about media type here). That is, the driver can always select the encoding that maximizes utilization of the lanes available to the port (as defined by the admin). > >So concretely, I'm suggesting that if we want to force PAM4 at the lower > >speeds, split the port and then we don't need an ethtool interface change > >at all to achieve the same goal. Having a spare (potentially usable) port > >is better than spare (unusable) lanes. > > The admin has to decide, define. I'm not sure I understand. The admin would indeed decide. This paragraph merely served to motivate why a rational admin should prefer to have a spare port rather than unused lanes he can't use, because they would be attached to a port using an encoding that doesn't need them. If he wasn't planning on using the additional port, he loses nothing. Otherwise, he gains something he would not otherwise have had (it's win-win). From the perspective of the original port, two unused lanes is no different than two lanes allocated to another logical port. Regards, Edwin Peer
On Wed, Nov 25, 2020 at 10:35:35AM +0000, Danielle Ratson wrote: > > > In ethtool, for speed 100G and 4 lanes for example, there are few link modes that fits: > > > ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT > > > ETHTOOL_LINK_MODE_100000baseSR4_Full_BIT > > > ETHTOOL_LINK_MODE_100000baseCR4_Full_BIT > > > ETHTOOL_LINK_MODE_100000baseLR4_ER4_Full_BIT > > > > The suggestions I have are: > 1. Add a bit that for unknown media for each link (something like > ETHTOOL_LINK_MODE_100000unknown_Full_BIT). I am not sure it is even > possible or makes sense. The number of lanes would still need to be specified. You would need at least: ETHTOOL_LINK_MODE_100000xR2_FULL and ETHTOOL_LINK_MODE_100000xR4_FULL to distinguish between PAM4 and NRZ at 100G respectively. And, there's still the cost of maintaining a different enum to ethtool_link_mode_bit_indices. > 2. Pass the link mode as bitmap. The user only wants to specify link speed and encoding, not media. The bitmap has the same problem to solve. Or, should user space set the bits for all possible media types that satisfy the desired speed and encoding? Eeek. Alternatively, the driver would need to accept any bit that implies the desired speed and encoding, ignoring what media the bit specifies (to do so would require maintaining a map of equivalences, yuck). > Do you see any other option? As stated in another sub-thread, I think the encoding can be implied by the speed if the number of lanes is a property of the port configuration. In which case the existing ethtool interface is sufficient. Regards, Edwin Peer
On Tue, Dec 1, 2020 at 9:23 AM Danielle Ratson <danieller@nvidia.com> wrote: > The suggestions I have are: > 1. Add a bit that for unknown media for each link (something like ETHTOOL_LINK_MODE_100000unknown_Full_BIT). I am not sure it is even possible or makes sense. > 2. Pass the link mode as bitmap. > > Do you see any other option? Other options include: 1) Let the user specify link encoding directly, as we do for FEC. This would imply the number of lanes used at a given speed rather than the other way around. Then you only need a handful of indices in the enum. 2) Instead of an enum, encode the semantics into the link mode bit representation. N bits define the speed. Other bits define the encoding, number of lanes, media, etc. This would solve the problem of decoding the intent from a single user selected link mode index, but would require a v2 UABI. The user could have the same strings, but not much more. And then, it's probably better to give each dimension their own field in a struct and be done, lest we run out of bits. Let the user space tool map the link mode string onto the appropriate struct values. Regards, Edwin Peer
Wed, Dec 02, 2020 at 01:32:46AM CET, edwin.peer@broadcom.com wrote: >On Tue, Dec 1, 2020 at 3:22 AM Jiri Pirko <jiri@resnulli.us> wrote: > >> >Consider a physical QSFP connector comprising 4 lanes. Today, if the >> >speed is forced, we would achieve 100G speeds using all 4 lanes with >> >NRZ encoding. If we configure the port for PAM4 encoding at the same >> >speed, then we only require 2 of the available 4 lanes. The remaining 2 >> >lanes are wasted. If we only require 2 of the 4 lanes, why not split the >> >port and request the same speed of one of the now split out ports? Now, >> >this same speed is only achievable using PAM4 encoding (it is implied) >> >and we have a spare, potentially usable, assuming an appropriate break- >> >out cable, port instead of the 2 unused lanes. >> >> I don't see how this dynamic split port could work in real life to be >> honest. The split is something admin needs to configure and can rely >> that netdevice exists all the time and not comes and goes under >> different circumstances. Multiple obvious reasons why. > >I'm not suggesting the port split be dynamic at all. I'm suggesting that if >the admin wants or needs to force PAM4 on a port that would otherwise >be able to achieve the given speed using more lanes with NRZ, then the >admin should split the port, so that it has fewer lanes, in order to make >that intent clear (or otherwise configure the port to have fewer lanes >attached, if you really don't want to or can't create the additional split >port). Okay, I see your point now. The thing is, the port split/unsplit causes a great distubance. Meaning, the netdevs all of the port disappear/reappear. Now consider following example: You have a router you have configured routes on many netdevs On one of the netdevs (has routes on it), you for any reason need to force lane number. In your suggestion, the netdev disappears along with the routes, the routing is then broken. I don't see how this could be acceptable. We are talking here about netdev configuration, we have a tool for that, that is ethtool. What you suggest is to take it to different level, I don't believe it is correct/doable. > >Using this approach, the existing ethtool forced speed interface is >sufficient to configure all possible lane encodings, because the >encoding that the driver must select is now implicit (note, we don't >need to care about media type here). That is, the driver can always >select the encoding that maximizes utilization of the lanes available >to the port (as defined by the admin). > >> >So concretely, I'm suggesting that if we want to force PAM4 at the lower >> >speeds, split the port and then we don't need an ethtool interface change >> >at all to achieve the same goal. Having a spare (potentially usable) port >> >is better than spare (unusable) lanes. >> >> The admin has to decide, define. > >I'm not sure I understand. The admin would indeed decide. This paragraph >merely served to motivate why a rational admin should prefer to have a >spare port rather than unused lanes he can't use, because they would be >attached to a port using an encoding that doesn't need them. If he wasn't >planning on using the additional port, he loses nothing. Otherwise, he gains >something he would not otherwise have had (it's win-win). From the >perspective of the original port, two unused lanes is no different than two >lanes allocated to another logical port. > >Regards, >Edwin Peer
On Wed, Dec 2, 2020 at 2:09 AM Jiri Pirko <jiri@resnulli.us> wrote: > >I'm not suggesting the port split be dynamic at all. I'm suggesting that if > >the admin wants or needs to force PAM4 on a port that would otherwise > >be able to achieve the given speed using more lanes with NRZ, then the > >admin should split the port, so that it has fewer lanes, in order to make > >that intent clear (or otherwise configure the port to have fewer lanes > >attached, if you really don't want to or can't create the additional split > >port). > > Okay, I see your point now. The thing is, the port split/unsplit causes > a great distubance. Meaning, the netdevs all of the port > disappear/reappear. Now consider following example: I guess that's a detail of the present implementation and/or device capabilities (I'm not particularly familiar), but presumably it is at least possible, in principle, to modify a device's port config without taking down the netdev. That said, after spending more time thinking about this, I'm coming around to the idea of being able to explicitly set encoding (not lanes) via ethtool. And, in this context, by encoding, I technically mean line code signalling, not symbol encoding. Although it does for today's link modes, the number of lanes does not generally imply signalling mode. In future we may have PAM8 signalling and then the present proposal of deriving the signalling mode for a given speed from the lane count falls down. We should be specifying signalling mode via ethtool instead of lanes. Alternatively, we need to be specifying the fully defined link mode. But, doing so is fraught with other issues, including how to represent it in the interface and the fact that the user doesn't necessarily want to specify physical media in these cases, something that is implied by the full link mode definition. Regards, Edwin Peer
diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index 30b98245979f..05073482db05 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -431,16 +431,17 @@ Request contents: ``ETHTOOL_A_LINKMODES_SPEED`` u32 link speed (Mb/s) ``ETHTOOL_A_LINKMODES_DUPLEX`` u8 duplex mode ``ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG`` u8 Master/slave port mode + ``ETHTOOL_A_LINKMODES_LANES`` u32 lanes ========================================== ====== ========================== ``ETHTOOL_A_LINKMODES_OURS`` bit set allows setting advertised link modes. If autonegotiation is on (either set now or kept from before), advertised modes are not changed (no ``ETHTOOL_A_LINKMODES_OURS`` attribute) and at least one -of speed and duplex is specified, kernel adjusts advertised modes to all -supported modes matching speed, duplex or both (whatever is specified). This -autoselection is done on ethtool side with ioctl interface, netlink interface -is supposed to allow requesting changes without knowing what exactly kernel -supports. +of speed, duplex and lanes is specified, kernel adjusts advertised modes to all +supported modes matching speed, duplex, lanes or all (whatever is specified). +This autoselection is done on ethtool side with ioctl interface, netlink +interface is supposed to allow requesting changes without knowing what exactly +kernel supports. LINKSTATE_GET diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 6408b446051f..4945ce487f01 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -128,6 +128,7 @@ struct ethtool_link_ksettings { __ETHTOOL_DECLARE_LINK_MODE_MASK(advertising); __ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising); } link_modes; + u32 lanes; }; /** @@ -241,6 +242,8 @@ bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32, ETHTOOL_COALESCE_PKT_RATE_LOW | ETHTOOL_COALESCE_PKT_RATE_HIGH | \ ETHTOOL_COALESCE_RATE_SAMPLE_INTERVAL) +#define ETHTOOL_CAP_LINK_LANES_SUPPORTED BIT(0) + #define ETHTOOL_STAT_NOT_SET (~0ULL) /** @@ -419,6 +422,7 @@ struct ethtool_pause_stats { * of the generic netdev features interface. */ struct ethtool_ops { + u32 capabilities; u32 supported_coalesce_params; void (*get_drvinfo)(struct net_device *, struct ethtool_drvinfo *); int (*get_regs_len)(struct net_device *); diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index 9ca87bc73c44..2125b93ecee0 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -1738,6 +1738,14 @@ static inline int ethtool_validate_speed(__u32 speed) return speed <= INT_MAX || speed == (__u32)SPEED_UNKNOWN; } +/* Lanes, 1, 2, 4 or 8. */ +#define ETHTOOL_LANES_1 1 +#define ETHTOOL_LANES_2 2 +#define ETHTOOL_LANES_4 4 +#define ETHTOOL_LANES_8 8 + +#define ETHTOOL_LANES_UNKNOWN 0 + /* Duplex, half or full. */ #define DUPLEX_HALF 0x00 #define DUPLEX_FULL 0x01 diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h index e2bf36e6964b..a286635ac9b8 100644 --- a/include/uapi/linux/ethtool_netlink.h +++ b/include/uapi/linux/ethtool_netlink.h @@ -227,6 +227,7 @@ enum { ETHTOOL_A_LINKMODES_DUPLEX, /* u8 */ ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG, /* u8 */ ETHTOOL_A_LINKMODES_MASTER_SLAVE_STATE, /* u8 */ + ETHTOOL_A_LINKMODES_LANES, /* u32 */ /* add new constants above here */ __ETHTOOL_A_LINKMODES_CNT, diff --git a/net/ethtool/linkmodes.c b/net/ethtool/linkmodes.c index c5bcb9abc8b9..a1f02896ed8b 100644 --- a/net/ethtool/linkmodes.c +++ b/net/ethtool/linkmodes.c @@ -152,12 +152,14 @@ const struct ethnl_request_ops ethnl_linkmodes_request_ops = { struct link_mode_info { int speed; + int lanes; u8 duplex; }; -#define __DEFINE_LINK_MODE_PARAMS(_speed, _type, _duplex) \ +#define __DEFINE_LINK_MODE_PARAMS(_speed, _type, _lanes, _duplex) \ [ETHTOOL_LINK_MODE(_speed, _type, _duplex)] = { \ .speed = SPEED_ ## _speed, \ + .lanes = ETHTOOL_LANES_ ## _lanes, \ .duplex = __DUPLEX_ ## _duplex \ } #define __DUPLEX_Half DUPLEX_HALF @@ -165,105 +167,106 @@ struct link_mode_info { #define __DEFINE_SPECIAL_MODE_PARAMS(_mode) \ [ETHTOOL_LINK_MODE_ ## _mode ## _BIT] = { \ .speed = SPEED_UNKNOWN, \ + .lanes = ETHTOOL_LANES_UNKNOWN, \ .duplex = DUPLEX_UNKNOWN, \ } static const struct link_mode_info link_mode_params[] = { - __DEFINE_LINK_MODE_PARAMS(10, T, Half), - __DEFINE_LINK_MODE_PARAMS(10, T, Full), - __DEFINE_LINK_MODE_PARAMS(100, T, Half), - __DEFINE_LINK_MODE_PARAMS(100, T, Full), - __DEFINE_LINK_MODE_PARAMS(1000, T, Half), - __DEFINE_LINK_MODE_PARAMS(1000, T, Full), + __DEFINE_LINK_MODE_PARAMS(10, T, 1, Half), + __DEFINE_LINK_MODE_PARAMS(10, T, 1, Full), + __DEFINE_LINK_MODE_PARAMS(100, T, 1, Half), + __DEFINE_LINK_MODE_PARAMS(100, T, 1, Full), + __DEFINE_LINK_MODE_PARAMS(1000, T, 1, Half), + __DEFINE_LINK_MODE_PARAMS(1000, T, 1, Full), __DEFINE_SPECIAL_MODE_PARAMS(Autoneg), __DEFINE_SPECIAL_MODE_PARAMS(TP), __DEFINE_SPECIAL_MODE_PARAMS(AUI), __DEFINE_SPECIAL_MODE_PARAMS(MII), __DEFINE_SPECIAL_MODE_PARAMS(FIBRE), __DEFINE_SPECIAL_MODE_PARAMS(BNC), - __DEFINE_LINK_MODE_PARAMS(10000, T, Full), + __DEFINE_LINK_MODE_PARAMS(10000, T, 1, Full), __DEFINE_SPECIAL_MODE_PARAMS(Pause), __DEFINE_SPECIAL_MODE_PARAMS(Asym_Pause), - __DEFINE_LINK_MODE_PARAMS(2500, X, Full), + __DEFINE_LINK_MODE_PARAMS(2500, X, 1, Full), __DEFINE_SPECIAL_MODE_PARAMS(Backplane), - __DEFINE_LINK_MODE_PARAMS(1000, KX, Full), - __DEFINE_LINK_MODE_PARAMS(10000, KX4, Full), - __DEFINE_LINK_MODE_PARAMS(10000, KR, Full), + __DEFINE_LINK_MODE_PARAMS(1000, KX, 1, Full), + __DEFINE_LINK_MODE_PARAMS(10000, KX4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(10000, KR, 1, Full), [ETHTOOL_LINK_MODE_10000baseR_FEC_BIT] = { .speed = SPEED_10000, .duplex = DUPLEX_FULL, }, - __DEFINE_LINK_MODE_PARAMS(20000, MLD2, Full), - __DEFINE_LINK_MODE_PARAMS(20000, KR2, Full), - __DEFINE_LINK_MODE_PARAMS(40000, KR4, Full), - __DEFINE_LINK_MODE_PARAMS(40000, CR4, Full), - __DEFINE_LINK_MODE_PARAMS(40000, SR4, Full), - __DEFINE_LINK_MODE_PARAMS(40000, LR4, Full), - __DEFINE_LINK_MODE_PARAMS(56000, KR4, Full), - __DEFINE_LINK_MODE_PARAMS(56000, CR4, Full), - __DEFINE_LINK_MODE_PARAMS(56000, SR4, Full), - __DEFINE_LINK_MODE_PARAMS(56000, LR4, Full), - __DEFINE_LINK_MODE_PARAMS(25000, CR, Full), - __DEFINE_LINK_MODE_PARAMS(25000, KR, Full), - __DEFINE_LINK_MODE_PARAMS(25000, SR, Full), - __DEFINE_LINK_MODE_PARAMS(50000, CR2, Full), - __DEFINE_LINK_MODE_PARAMS(50000, KR2, Full), - __DEFINE_LINK_MODE_PARAMS(100000, KR4, Full), - __DEFINE_LINK_MODE_PARAMS(100000, SR4, Full), - __DEFINE_LINK_MODE_PARAMS(100000, CR4, Full), - __DEFINE_LINK_MODE_PARAMS(100000, LR4_ER4, Full), - __DEFINE_LINK_MODE_PARAMS(50000, SR2, Full), - __DEFINE_LINK_MODE_PARAMS(1000, X, Full), - __DEFINE_LINK_MODE_PARAMS(10000, CR, Full), - __DEFINE_LINK_MODE_PARAMS(10000, SR, Full), - __DEFINE_LINK_MODE_PARAMS(10000, LR, Full), - __DEFINE_LINK_MODE_PARAMS(10000, LRM, Full), - __DEFINE_LINK_MODE_PARAMS(10000, ER, Full), - __DEFINE_LINK_MODE_PARAMS(2500, T, Full), - __DEFINE_LINK_MODE_PARAMS(5000, T, Full), + __DEFINE_LINK_MODE_PARAMS(20000, MLD2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(20000, KR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(40000, KR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(40000, CR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(40000, SR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(40000, LR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(56000, KR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(56000, CR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(56000, SR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(56000, LR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(25000, CR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(25000, KR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(25000, SR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(50000, CR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(50000, KR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(100000, KR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(100000, SR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(100000, CR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(100000, LR4_ER4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(50000, SR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(1000, X, 1, Full), + __DEFINE_LINK_MODE_PARAMS(10000, CR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(10000, SR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(10000, LR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(10000, LRM, 1, Full), + __DEFINE_LINK_MODE_PARAMS(10000, ER, 1, Full), + __DEFINE_LINK_MODE_PARAMS(2500, T, 1, Full), + __DEFINE_LINK_MODE_PARAMS(5000, T, 1, Full), __DEFINE_SPECIAL_MODE_PARAMS(FEC_NONE), __DEFINE_SPECIAL_MODE_PARAMS(FEC_RS), __DEFINE_SPECIAL_MODE_PARAMS(FEC_BASER), - __DEFINE_LINK_MODE_PARAMS(50000, KR, Full), - __DEFINE_LINK_MODE_PARAMS(50000, SR, Full), - __DEFINE_LINK_MODE_PARAMS(50000, CR, Full), - __DEFINE_LINK_MODE_PARAMS(50000, LR_ER_FR, Full), - __DEFINE_LINK_MODE_PARAMS(50000, DR, Full), - __DEFINE_LINK_MODE_PARAMS(100000, KR2, Full), - __DEFINE_LINK_MODE_PARAMS(100000, SR2, Full), - __DEFINE_LINK_MODE_PARAMS(100000, CR2, Full), - __DEFINE_LINK_MODE_PARAMS(100000, LR2_ER2_FR2, Full), - __DEFINE_LINK_MODE_PARAMS(100000, DR2, Full), - __DEFINE_LINK_MODE_PARAMS(200000, KR4, Full), - __DEFINE_LINK_MODE_PARAMS(200000, SR4, Full), - __DEFINE_LINK_MODE_PARAMS(200000, LR4_ER4_FR4, Full), - __DEFINE_LINK_MODE_PARAMS(200000, DR4, Full), - __DEFINE_LINK_MODE_PARAMS(200000, CR4, Full), - __DEFINE_LINK_MODE_PARAMS(100, T1, Full), - __DEFINE_LINK_MODE_PARAMS(1000, T1, Full), - __DEFINE_LINK_MODE_PARAMS(400000, KR8, Full), - __DEFINE_LINK_MODE_PARAMS(400000, SR8, Full), - __DEFINE_LINK_MODE_PARAMS(400000, LR8_ER8_FR8, Full), - __DEFINE_LINK_MODE_PARAMS(400000, DR8, Full), - __DEFINE_LINK_MODE_PARAMS(400000, CR8, Full), + __DEFINE_LINK_MODE_PARAMS(50000, KR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(50000, SR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(50000, CR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(50000, LR_ER_FR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(50000, DR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(100000, KR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(100000, SR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(100000, CR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(100000, LR2_ER2_FR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(100000, DR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(200000, KR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(200000, SR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(200000, LR4_ER4_FR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(200000, DR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(200000, CR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(100, T1, 1, Full), + __DEFINE_LINK_MODE_PARAMS(1000, T1, 1, Full), + __DEFINE_LINK_MODE_PARAMS(400000, KR8, 8, Full), + __DEFINE_LINK_MODE_PARAMS(400000, SR8, 8, Full), + __DEFINE_LINK_MODE_PARAMS(400000, LR8_ER8_FR8, 8, Full), + __DEFINE_LINK_MODE_PARAMS(400000, DR8, 8, Full), + __DEFINE_LINK_MODE_PARAMS(400000, CR8, 8, Full), __DEFINE_SPECIAL_MODE_PARAMS(FEC_LLRS), - __DEFINE_LINK_MODE_PARAMS(100000, KR, Full), - __DEFINE_LINK_MODE_PARAMS(100000, SR, Full), - __DEFINE_LINK_MODE_PARAMS(100000, LR_ER_FR, Full), - __DEFINE_LINK_MODE_PARAMS(100000, DR, Full), - __DEFINE_LINK_MODE_PARAMS(100000, CR, Full), - __DEFINE_LINK_MODE_PARAMS(200000, KR2, Full), - __DEFINE_LINK_MODE_PARAMS(200000, SR2, Full), - __DEFINE_LINK_MODE_PARAMS(200000, LR2_ER2_FR2, Full), - __DEFINE_LINK_MODE_PARAMS(200000, DR2, Full), - __DEFINE_LINK_MODE_PARAMS(200000, CR2, Full), - __DEFINE_LINK_MODE_PARAMS(400000, KR4, Full), - __DEFINE_LINK_MODE_PARAMS(400000, SR4, Full), - __DEFINE_LINK_MODE_PARAMS(400000, LR4_ER4_FR4, Full), - __DEFINE_LINK_MODE_PARAMS(400000, DR4, Full), - __DEFINE_LINK_MODE_PARAMS(400000, CR4, Full), - __DEFINE_LINK_MODE_PARAMS(100, FX, Half), - __DEFINE_LINK_MODE_PARAMS(100, FX, Full), + __DEFINE_LINK_MODE_PARAMS(100000, KR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(100000, SR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(100000, LR_ER_FR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(100000, DR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(100000, CR, 1, Full), + __DEFINE_LINK_MODE_PARAMS(200000, KR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(200000, SR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(200000, LR2_ER2_FR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(200000, DR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(200000, CR2, 2, Full), + __DEFINE_LINK_MODE_PARAMS(400000, KR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(400000, SR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(400000, LR4_ER4_FR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(400000, DR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(400000, CR4, 4, Full), + __DEFINE_LINK_MODE_PARAMS(100, FX, 1, Half), + __DEFINE_LINK_MODE_PARAMS(100, FX, 1, Full), }; const struct nla_policy ethnl_linkmodes_set_policy[] = { @@ -274,16 +277,17 @@ const struct nla_policy ethnl_linkmodes_set_policy[] = { [ETHTOOL_A_LINKMODES_SPEED] = { .type = NLA_U32 }, [ETHTOOL_A_LINKMODES_DUPLEX] = { .type = NLA_U8 }, [ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG] = { .type = NLA_U8 }, + [ETHTOOL_A_LINKMODES_LANES] = { .type = NLA_U32 }, }; -/* Set advertised link modes to all supported modes matching requested speed - * and duplex values. Called when autonegotiation is on, speed or duplex is - * requested but no link mode change. This is done in userspace with ioctl() - * interface, move it into kernel for netlink. +/* Set advertised link modes to all supported modes matching requested speed, + * lanes and duplex values. Called when autonegotiation is on, speed, lanes or + * duplex is requested but no link mode change. This is done in userspace with + * ioctl() interface, move it into kernel for netlink. * Returns true if advertised modes bitmap was modified. */ static bool ethnl_auto_linkmodes(struct ethtool_link_ksettings *ksettings, - bool req_speed, bool req_duplex) + bool req_speed, bool req_lanes, bool req_duplex) { unsigned long *advertising = ksettings->link_modes.advertising; unsigned long *supported = ksettings->link_modes.supported; @@ -302,6 +306,7 @@ static bool ethnl_auto_linkmodes(struct ethtool_link_ksettings *ksettings, continue; if (test_bit(i, supported) && (!req_speed || info->speed == ksettings->base.speed) && + (!req_lanes || info->lanes == ksettings->lanes) && (!req_duplex || info->duplex == ksettings->base.duplex)) set_bit(i, advertising); else @@ -325,12 +330,25 @@ static bool ethnl_validate_master_slave_cfg(u8 cfg) return false; } +static bool ethnl_validate_lanes_cfg(u32 cfg) +{ + switch (cfg) { + case ETHTOOL_LANES_1: + case ETHTOOL_LANES_2: + case ETHTOOL_LANES_4: + case ETHTOOL_LANES_8: + return true; + } + + return false; +} + static int ethnl_update_linkmodes(struct genl_info *info, struct nlattr **tb, struct ethtool_link_ksettings *ksettings, - bool *mod) + bool *mod, const struct net_device *dev) { struct ethtool_link_settings *lsettings = &ksettings->base; - bool req_speed, req_duplex; + bool req_speed, req_lanes, req_duplex; const struct nlattr *master_slave_cfg; int ret; @@ -353,10 +371,39 @@ static int ethnl_update_linkmodes(struct genl_info *info, struct nlattr **tb, *mod = false; req_speed = tb[ETHTOOL_A_LINKMODES_SPEED]; + req_lanes = tb[ETHTOOL_A_LINKMODES_LANES]; req_duplex = tb[ETHTOOL_A_LINKMODES_DUPLEX]; ethnl_update_u8(&lsettings->autoneg, tb[ETHTOOL_A_LINKMODES_AUTONEG], mod); + + if (req_lanes) { + u32 lanes_cfg = nla_get_u32(tb[ETHTOOL_A_LINKMODES_LANES]); + + if (!ethnl_validate_lanes_cfg(lanes_cfg)) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_LINKMODES_LANES], + "lanes value is invalid"); + return -EINVAL; + } + + /* If autoneg is off and lanes parameter is not supported by the + * driver, return an error. + */ + if (!lsettings->autoneg && + !(dev->ethtool_ops->capabilities & ETHTOOL_CAP_LINK_LANES_SUPPORTED)) { + NL_SET_ERR_MSG_ATTR(info->extack, + tb[ETHTOOL_A_LINKMODES_LANES], + "lanes configuration not supported by device"); + return -EOPNOTSUPP; + } + } else if (!lsettings->autoneg) { + /* If autoneg is off and lanes parameter is not passed from user, + * set the lanes parameter to UNKNOWN. + */ + ksettings->lanes = ETHTOOL_LANES_UNKNOWN; + } + ret = ethnl_update_bitset(ksettings->link_modes.advertising, __ETHTOOL_LINK_MODE_MASK_NBITS, tb[ETHTOOL_A_LINKMODES_OURS], link_mode_names, @@ -365,13 +412,15 @@ static int ethnl_update_linkmodes(struct genl_info *info, struct nlattr **tb, return ret; ethnl_update_u32(&lsettings->speed, tb[ETHTOOL_A_LINKMODES_SPEED], mod); + ethnl_update_u32(&ksettings->lanes, tb[ETHTOOL_A_LINKMODES_LANES], + mod); ethnl_update_u8(&lsettings->duplex, tb[ETHTOOL_A_LINKMODES_DUPLEX], mod); ethnl_update_u8(&lsettings->master_slave_cfg, master_slave_cfg, mod); if (!tb[ETHTOOL_A_LINKMODES_OURS] && lsettings->autoneg && - (req_speed || req_duplex) && - ethnl_auto_linkmodes(ksettings, req_speed, req_duplex)) + (req_speed || req_lanes || req_duplex) && + ethnl_auto_linkmodes(ksettings, req_speed, req_lanes, req_duplex)) *mod = true; return 0; @@ -409,7 +458,7 @@ int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info) goto out_ops; } - ret = ethnl_update_linkmodes(info, tb, &ksettings, &mod); + ret = ethnl_update_linkmodes(info, tb, &ksettings, &mod, dev); if (ret < 0) goto out_ops; diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h index d8efec516d86..6eabd58d81bf 100644 --- a/net/ethtool/netlink.h +++ b/net/ethtool/netlink.h @@ -351,7 +351,7 @@ extern const struct nla_policy ethnl_strset_get_policy[ETHTOOL_A_STRSET_COUNTS_O extern const struct nla_policy ethnl_linkinfo_get_policy[ETHTOOL_A_LINKINFO_HEADER + 1]; extern const struct nla_policy ethnl_linkinfo_set_policy[ETHTOOL_A_LINKINFO_TP_MDIX_CTRL + 1]; extern const struct nla_policy ethnl_linkmodes_get_policy[ETHTOOL_A_LINKMODES_HEADER + 1]; -extern const struct nla_policy ethnl_linkmodes_set_policy[ETHTOOL_A_LINKMODES_MASTER_SLAVE_CFG + 1]; +extern const struct nla_policy ethnl_linkmodes_set_policy[ETHTOOL_A_LINKMODES_LANES + 1]; extern const struct nla_policy ethnl_linkstate_get_policy[ETHTOOL_A_LINKSTATE_HEADER + 1]; extern const struct nla_policy ethnl_debug_get_policy[ETHTOOL_A_DEBUG_HEADER + 1]; extern const struct nla_policy ethnl_debug_set_policy[ETHTOOL_A_DEBUG_MSGMASK + 1];