[net-next,v2,00/14] Add mlx5 subfunction support

Message ID 20201209072934.1272819-1-saeed@kernel.org

Message

Saeed Mahameed Dec. 9, 2020, 7:29 a.m. UTC
From: Parav Pandit <parav@nvidia.com>

Hi Dave, Jakub, Jason,

This series from Parav was the theme of this mlx5 release cycle. We have
been waiting anxiously for the auxbus infrastructure to make it into the
kernel, and now that the auxbus is in and all the stars are aligned, I can
finally submit this V2 of the devlink and mlx5 subfunction support.

Subfunctions came to solve the scaling issue of virtualization
and switchdev environments, where SRIOV failed to deliver and users ran
out of VFs very quickly, as SRIOV demands a huge amount of physical
resources in both the servers and the NIC.

Subfunctions provide the same functionality as SRIOV but in a very
lightweight manner. Please see the thorough and detailed
documentation from Parav below, in the commit messages and the
Networking documentation patches at the end of this series.

Sending V2 as a continuation of V1, which was sent last month [0].
Parav has provided a full changelog in the commit message of each patch.
[0] https://lore.kernel.org/linux-rdma/20201112192424.2742-1-parav@nvidia.com/

Parav Pandit Says:
=================

This patchset introduces support for mlx5 subfunction (SF).

A subfunction is a lightweight function that has a parent PCI function on
which it is deployed. An mlx5 subfunction has its own function capabilities
and its own resources. This means a subfunction has its own dedicated
queues (txq, rxq, cq, eq). These queues are neither shared with nor stolen
from the parent PCI function.

When a subfunction is RDMA capable, it has its own QP1, GID table and RDMA
resources, neither shared with nor stolen from the parent PCI function.

A subfunction has a dedicated window in PCI BAR space that is not shared
with the other subfunctions or the parent PCI function. This ensures that all
class devices of the subfunction access only the assigned PCI BAR space.

A subfunction supports eswitch representation, through which it supports tc
offloads. The user must configure the eswitch to send/receive packets from/to
the subfunction port.

Subfunctions share PCI level resources such as PCI MSI-X IRQs with
other subfunctions and/or with their parent PCI function.

Patch summary:
--------------
Patches 1 to 4 prepare devlink
Patches 5 to 7 add mlx5 SF device support
Patches 8 to 11 add mlx5 SF devlink port support
Patches 12 to 14 add documentation

Patch-1 prepares code to handle multiple port function attributes
Patch-2 introduces devlink pcisf port flavour similar to pcipf and pcivf
Patch-3 adds port add and delete driver callbacks
Patch-4 adds port function state get and set callbacks
Patch-5 adds mlx5 vhca event notifier support to distribute subfunction
        state change notification
Patch-6 adds SF auxiliary device
Patch-7 adds SF auxiliary driver
Patch-8 prepares eswitch to handle SF vport
Patch-9 adds eswitch helpers to add/remove SF vport
Patch-10 implements devlink port add/del callbacks
Patch-11 implements devlink port function get/set callbacks
Patch-12 adds devlink port documentation
Patch-13 adds subfunction documentation
Patch-14 adds mlx5 subfunction documentation

Subfunction support is discussed in detail in RFC [1] and [2].
RFC [1] and extension [2] describe the requirements, design and proposed
plumbing using devlink, auxiliary bus and sysfs for systemd/udev
support. The functionality of this patchset is best explained using the
real examples further below.

Overview:
---------
A subfunction can be created and deleted by a user using the devlink port
add/delete interface.

A subfunction can be configured using devlink port function attributes
before it is activated.

When a subfunction is activated, it results in an auxiliary device on
the host PCI device where it is deployed. A driver binds to the
auxiliary device and creates the supported class devices.

example subfunction usage sequence:
-----------------------------------
Change device to switchdev mode:
$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev

Add a devlink port of subfunction flavour:
$ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88

Configure the MAC address of the port function:
$ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88

Now activate the function:
$ devlink port function set ens2f0npf0sf88 state active

Now use the auxiliary device and class devices:
$ devlink dev show
pci/0000:06:00.0
auxiliary/mlx5_core.sf.4

$ ip link show
127: ens2f0np0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 24:8a:07:b3:d1:12 brd ff:ff:ff:ff:ff:ff
    altname enp6s0f0np0
129: p0sf88: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:88:88 brd ff:ff:ff:ff:ff:ff

$ rdma dev show
43: rdmap6s0f0: node_type ca fw 16.29.0550 node_guid 248a:0703:00b3:d112 sys_image_guid 248a:0703:00b3:d112
44: mlx5_0: node_type ca fw 16.29.0550 node_guid 0000:00ff:fe00:8888 sys_image_guid 248a:0703:00b3:d112

After use, inactivate the function:
$ devlink port function set ens2f0npf0sf88 state inactive

Now delete the subfunction port:
$ devlink port del ens2f0npf0sf88
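
The sequence above lends itself to scripting. What follows is a minimal
dry-run sketch that only prints the devlink commands in order and executes
nothing. The helper name sf_lifecycle and its argument order are made up for
illustration, and the representor netdev name is passed in by the caller
rather than derived, since it depends on the naming policy of the system.

```shell
#!/bin/sh
# Dry-run sketch: print the devlink commands for one subfunction lifecycle.
# sf_lifecycle is a hypothetical helper, not part of devlink or mlx5.
# Usage: sf_lifecycle <pci-dev> <pfnum> <sfnum> <mac> <representor-netdev>
sf_lifecycle() {
    dev=$1 pfnum=$2 sfnum=$3 mac=$4 rep=$5

    # Switch the eswitch to switchdev mode, as in the example above.
    echo "devlink dev eswitch set $dev mode switchdev"
    # Add the SF port, then configure it while it is still inactive.
    echo "devlink port add $dev flavour pcisf pfnum $pfnum sfnum $sfnum"
    echo "devlink port function set $rep hw_addr $mac"
    # Activation is the step that results in the auxiliary device.
    echo "devlink port function set $rep state active"
    # Teardown: inactivate first, then delete the port.
    echo "devlink port function set $rep state inactive"
    echo "devlink port del $rep"
}
```

Calling sf_lifecycle pci/0000:06:00.0 0 88 00:00:00:00:88:88 ens2f0npf0sf88
prints the six commands used in this example, one per line, so they can be
reviewed before being run by hand.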

[1] https://lore.kernel.org/netdev/20200519092258.GF4655@nanopsycho/
[2] https://marc.info/?l=linux-netdev&m=158555928517777&w=2

=================
---
Changelog:
v1->v2:
 - added documentation for subfunction and its mlx5 implementation
 - add MLX5_SF config option documentation
 - rebased
 - dropped devlink global lock improvement patch as mlx5 doesn't support
   reload while SFs are allocated
 - dropped devlink reload lock patch as mlx5 doesn't support reload
   when SFs are allocated
 - using updated vhca event from device to add/remove auxiliary device
 - split sf devlink port allocation and sf hardware context allocation

Parav Pandit (13):
  devlink: Prepare code to fill multiple port function attributes
  devlink: Introduce PCI SF port flavour and port attribute
  devlink: Support add and delete devlink port
  devlink: Support get and set state of port function
  net/mlx5: Introduce vhca state event notifier
  net/mlx5: SF, Add auxiliary device support
  net/mlx5: SF, Add auxiliary device driver
  net/mlx5: E-switch, Add eswitch helpers for SF vport
  net/mlx5: SF, Add port add delete functionality
  net/mlx5: SF, Port function state change support
  devlink: Add devlink port documentation
  devlink: Extend devlink port documentation for subfunctions
  net/mlx5: Add devlink subfunction port documentation

Vu Pham (1):
  net/mlx5: E-switch, Prepare eswitch to handle SF vport

 Documentation/driver-api/auxiliary_bus.rst    |   2 +
 .../device_drivers/ethernet/mellanox/mlx5.rst | 209 +++++++
 .../networking/devlink/devlink-port.rst       | 199 +++++++
 Documentation/networking/devlink/index.rst    |   1 +
 .../net/ethernet/mellanox/mlx5/core/Kconfig   |  19 +
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   9 +
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |   8 +
 .../net/ethernet/mellanox/mlx5/core/devlink.c |  19 +
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  |   5 +-
 .../mellanox/mlx5/core/esw/acl/egress_ofld.c  |   2 +-
 .../mellanox/mlx5/core/esw/devlink_port.c     |  41 ++
 .../net/ethernet/mellanox/mlx5/core/eswitch.c |  48 +-
 .../net/ethernet/mellanox/mlx5/core/eswitch.h |  78 +++
 .../mellanox/mlx5/core/eswitch_offloads.c     |  47 +-
 .../net/ethernet/mellanox/mlx5/core/events.c  |   7 +
 .../net/ethernet/mellanox/mlx5/core/main.c    |  60 +-
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |  12 +
 .../net/ethernet/mellanox/mlx5/core/pci_irq.c |  20 +
 .../net/ethernet/mellanox/mlx5/core/sf/cmd.c  |  48 ++
 .../ethernet/mellanox/mlx5/core/sf/dev/dev.c  | 271 +++++++++
 .../ethernet/mellanox/mlx5/core/sf/dev/dev.h  |  55 ++
 .../mellanox/mlx5/core/sf/dev/driver.c        | 101 ++++
 .../ethernet/mellanox/mlx5/core/sf/devlink.c  | 552 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/sf/hw_table.c | 235 ++++++++
 .../mlx5/core/sf/mlx5_ifc_vhca_event.h        |  82 +++
 .../net/ethernet/mellanox/mlx5/core/sf/priv.h |  21 +
 .../net/ethernet/mellanox/mlx5/core/sf/sf.h   |  92 +++
 .../mellanox/mlx5/core/sf/vhca_event.c        | 189 ++++++
 .../mellanox/mlx5/core/sf/vhca_event.h        |  57 ++
 .../net/ethernet/mellanox/mlx5/core/vport.c   |   3 +-
 include/linux/mlx5/driver.h                   |  16 +-
 include/net/devlink.h                         |  79 +++
 include/uapi/linux/devlink.h                  |  26 +
 net/core/devlink.c                            | 266 ++++++++-
 34 files changed, 2832 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/networking/devlink/devlink-port.rst
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/mlx5_ifc_vhca_event.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/priv.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/sf.h
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.h

Comments

Samudrala, Sridhar Dec. 11, 2020, 4:11 a.m. UTC | #1
On 12/8/2020 11:29 PM, saeed@kernel.org wrote:
> From: Parav Pandit <parav@nvidia.com>

> Subfunctions provide the same functionality as SRIOV but in a very
> lightweight manner, please see the thorough and detailed
> documentation from Parav below, in the commit messages and the
> Networking documentation patches at the end of this series.

What is the mechanism for assigning these subfunctions to VMs?
Or is this targeted only at container use cases at this time?

> Change device to switchdev mode:
> $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
>
> Add a devlink port of subfunction flavour:
> $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88

Is there any requirement that subfunctions can be created only when the
eswitch mode is set to switchdev?
I think we should not restrict this functionality to switchdev mode.

After this step, I guess an auxiliary device is created on the auxiliary
bus, along with a devlink port.
Does "devlink port show" show this port, and can we list the auxiliary
device?

> Configure mac address of the port function:
> $ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88

What is ens2f0npf0sf88? Is this the port representor netdev? I think we
should allow setting this by passing the devlink port.

What about other attributes like number of queues, interrupt vectors and
port capabilities etc? Can we add other attributes via this interface?

> Now activate the function:
> $ devlink port function set ens2f0npf0sf88 state active

Is the subfunction netdev created after this step?
I thought there was a step to bind the auxiliary device to the driver.
How does the probe routine for the auxiliary device get invoked?

Parav Pandit Dec. 11, 2020, 8:33 a.m. UTC | #2
> From: Samudrala, Sridhar <sridhar.samudrala@intel.com>
> Sent: Friday, December 11, 2020 9:42 AM
>
> On 12/8/2020 11:29 PM, saeed@kernel.org wrote:
> > From: Parav Pandit <parav@nvidia.com>
> >
> > Subfunctions provide the same functionality as SRIOV but in a very
> > lightweight manner, please see the thorough and detailed documentation
> > from Parav below, in the commit messages and the Networking
> > documentation patches at the end of this series.
>
> What is the mechanism for assigning these subfunctions to VMs?
> Or is this targeted only at container use cases at this time?

Currently a subfunction cannot be assigned to a VM as is.
Some vfio_pci-style software may be developed in the future to map a subfunction auxiliary device to a VM.

> > Add a devlink port of subfunction flavour:
> > $ devlink port add pci/0000:06:00.0 flavour pcisf pfnum 0 sfnum 88
> Is there any requirement that subfunctions can be created only when
> eswitch mode is set to switchdev?
> I think we should not restrict this functionality to switchdev mode.

It is not restricted. We discussed this before at [3].

> After this step, I guess an auxiliary device is created on the auxiliary bus,
> along with a devlink port.
> Does "devlink port show" show this port, and can we list the auxiliary device?

Yes and yes.
The command below shows the auxiliary device:
$ devlink dev show auxiliary/mlx5_core.sf.4/
The auxiliary device is also listed in detail, at the point where it is created, in patch 7 at [4].
More below.

> > Configure mac address of the port function:
> > $ devlink port function set ens2f0npf0sf88 hw_addr 00:00:00:00:88:88
> What is ens2f0npf0sf88? Is this the port representor netdev?

Yes, it is the representor netdev associated with the devlink port.

> I think we should allow setting this by passing the devlink port.

Absolutely, and it is allowed. Every devlink port is identified by a unique port index.
So
$ devlink port show pci/0000:06:00.0/<devlink_port_index>
will show it.

This is captured in the detailed example in the commit log of patch 7, which adds it, at [4].
It is also present in the mlx5.rst documentation in patch 14 at [5].

I just used the representor netdev in the example as it was intuitive to view the world from the eswitch side.
But yes, using the port index instead of the netdev is already supported natively.
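
To make the two addressing forms concrete: a devlink port can be named
either by its netdev or by the devlink device handle followed by the port
index. A small sketch of building the second form follows. The helper name
port_handle is hypothetical, and the index used in the usage note is a
made-up placeholder, not a value taken from this series.

```shell
#!/bin/sh
# Hypothetical helper: build the "<bus>/<device>/<port_index>" handle that
# devlink accepts in place of a netdev name.
# Usage: port_handle <devlink-dev> <port-index>
port_handle() {
    dev=$1 index=$2
    # Reject anything that is not a plain decimal index.
    case $index in
        ''|*[!0-9]*) echo "port_handle: bad index '$index'" >&2; return 1 ;;
    esac
    printf '%s/%s\n' "$dev" "$index"
}
```

With a placeholder index of 32768, port_handle pci/0000:06:00.0 32768 yields
a handle that can be passed to "devlink port show" or "devlink port function
set" wherever the representor netdev name was used above. The real index has
to be read from "devlink port show" on the system.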

> What about other attributes like number of queues, interrupt vectors and
> port capabilities etc? Can we add other attributes via this interface?

We believe that the capabilities of the function should be controlled using the port function set command.
At the moment only the MAC address can be configured.
The number of queues is a resource, so devlink resource is a more suitable interface.

> > Now activate the function:
> > $ devlink port function set ens2f0npf0sf88 state active
> Is the subfunction netdev created after this step?

Yes.

> I thought there was a step to bind the auxiliary device to the driver.

Yes. The user can always bind/unbind the auxiliary driver to/from the auxiliary device.
Currently the auxiliary bus does not have an option to disable autoprobe (per device).
This is something to be extended in the future so that the user can select how a subfunction device is to be used in the host system.

> How does the probe routine for the auxiliary device get invoked?

When the subfunction auxiliary device is placed on the auxiliary bus, the driver core invokes the registered driver's probe routine.
Please refer to patch 7 at [4]. It is similar to how a PCI device is probed.
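
As a rough illustration of the matching step: the auxiliary bus compares a
driver's id table entries against the device name with its trailing ".<id>"
stripped, so a device named mlx5_core.sf.4 is matched on "mlx5_core.sf". The
sketch below mimics that string handling in shell and prints the usual
driver-core sysfs unbind path. Both helper names are hypothetical, nothing is
written to sysfs, and the driver name is supplied by the caller because this
thread does not state it.

```shell
#!/bin/sh
# Illustration only; aux_match_name and aux_unbind_path are hypothetical
# helpers and are not part of the kernel or of any tool.

# Strip the trailing ".<id>" from an auxiliary device name. The remainder
# is the string compared against a driver's id table entries.
aux_match_name() {
    printf '%s\n' "${1%.*}"
}

# Print (without touching it) the driver-core sysfs "unbind" file for a
# driver registered on the auxiliary bus.
aux_unbind_path() {
    printf '/sys/bus/auxiliary/drivers/%s/unbind\n' "$1"
}
```

For example, aux_match_name mlx5_core.sf.4 prints mlx5_core.sf. A manual
unbind would then be a write of the device name into the path printed by
aux_unbind_path, using whichever driver name is listed under
/sys/bus/auxiliary/drivers on the running system.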

[3] https://lore.kernel.org/netdev/BY5PR12MB43225AA5A5E42E76C03F645BDC3F0@BY5PR12MB4322.namprd12.prod.outlook.com/
[4] https://lore.kernel.org/netdev/20201209072934.1272819-4-saeed@kernel.org/
[5] https://lore.kernel.org/netdev/20201209072934.1272819-15-saeed@kernel.org/