[net-next,00/16] Add devlink reload action and limit options

Message ID 1601560759-11030-1-git-send-email-moshe@mellanox.com

Message

Moshe Shemesh Oct. 1, 2020, 1:59 p.m. UTC
Introduce new options on the devlink reload API that let the user select
the reload action required and the limits that must be honored while
performing it. Complete support for reload actions in mlx5.
The following reload actions are supported:
  driver_reinit: driver entities re-initialization, applying devlink-param
                 and devlink-resource values.
  fw_activate: firmware activate.

The uAPI is backward compatible: if the reload action option is omitted
from the reload command, the driver_reinit action is used.
Note that when required to do firmware activation some drivers may need
to reload the driver. On the other hand some drivers may need to reset
the firmware to reinitialize the driver entities. Therefore, the devlink
reload command returns the actions which were actually performed.
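
The default action and the "actions performed" return value can be
sketched roughly as below. This is only an illustration: the sketch_*
names are hypothetical and are not the uAPI or driver names from this
series, and the driver modeled here is assumed to be one that must also
reinitialize the driver in order to activate firmware.

```c
#include <stdint.h>

/* Hypothetical action identifiers, for illustration only. */
enum sketch_action {
	SKETCH_ACTION_UNSPEC,		/* action option omitted by the user */
	SKETCH_ACTION_DRIVER_REINIT,
	SKETCH_ACTION_FW_ACTIVATE,
};

#define SKETCH_BIT(a) (1u << (a))

/* An omitted action falls back to driver_reinit (backward compatibility). */
static enum sketch_action sketch_resolve_action(enum sketch_action requested)
{
	return requested == SKETCH_ACTION_UNSPEC ?
	       SKETCH_ACTION_DRIVER_REINIT : requested;
}

/*
 * Return a bitmask of the actions actually performed. The assumed driver
 * has to reinitialize itself whenever it activates firmware, so the
 * result may contain more than the single action that was requested.
 */
static uint32_t sketch_reload(enum sketch_action requested)
{
	enum sketch_action action = sketch_resolve_action(requested);
	uint32_t performed = SKETCH_BIT(action);

	if (action == SKETCH_ACTION_FW_ACTIVATE)
		performed |= SKETCH_BIT(SKETCH_ACTION_DRIVER_REINIT);

	return performed;
}
```

Returning a bitmask rather than a single value is what lets the caller
see side-effect actions it did not ask for.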

By default, reload actions are not limited, and the driver implementation
may include reset or downtime as needed to perform the actions.
However, if a reload limit is selected, the driver should perform the
action only if it can do so within the constraints of that limit.
Reload limit added:
  no_reset: No reset allowed, no downtime allowed, no link flap and no
            loss of configuration.
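
A minimal sketch of how a driver might honor the limit, assuming a
hypothetical device that can activate firmware without a reset only when
it is capable of live patching. The struct, field and function names are
invented for illustration and do not come from the mlx5 patches.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical limit identifiers, for illustration only. */
enum sketch_limit {
	SKETCH_LIMIT_UNSPEC,	/* no limit: reset or downtime is acceptable */
	SKETCH_LIMIT_NO_RESET,
};

/* Hypothetical device state. */
struct sketch_dev {
	bool fw_live_patch_capable;	/* assumed capability flag */
};

/*
 * Perform fw_activate only if the selected limit can be kept.
 * Returns 0 on success, or -EOPNOTSUPP when no_reset was requested but
 * the device would have to reset to activate the firmware.
 */
static int sketch_fw_activate(const struct sketch_dev *dev,
			      enum sketch_limit limit)
{
	if (limit == SKETCH_LIMIT_NO_RESET && !dev->fw_live_patch_capable)
		return -EOPNOTSUPP;

	/* The actual activation would happen here. */
	return 0;
}
```

The point of the sketch is that the driver refuses the action instead of
silently exceeding the limit.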

Each driver which supports devlink reload command should expose the
reload actions and limits supported.

Add reload stats to hold the history per reload action per limit.
For example, the stats show how many times fw_activate has been performed
on this device since the driver module was loaded, and whether each
firmware activation was done with or without reset.
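
One way to picture "per reload action per limit" history is a small
two-dimensional counter table, sketched below. The layout and all names
are assumptions made for illustration; the series may store the stats
differently.

```c
#include <stdint.h>

/* Hypothetical indices; index 0 of each dimension means "unspecified". */
enum {
	STATS_ACTION_DRIVER_REINIT = 1,
	STATS_ACTION_FW_ACTIVATE = 2,
	STATS_ACTION_COUNT = 3,
};

enum {
	STATS_LIMIT_UNSPEC = 0,		/* performed without any limit */
	STATS_LIMIT_NO_RESET = 1,
	STATS_LIMIT_COUNT = 2,
};

struct sketch_reload_stats {
	uint32_t count[STATS_ACTION_COUNT][STATS_LIMIT_COUNT];
};

/*
 * Bump one counter for every action set in the performed bitmask,
 * filed under the limit that was in effect for this reload.
 */
static void sketch_stats_update(struct sketch_reload_stats *stats,
				uint32_t actions_performed, int limit)
{
	int action;

	for (action = 1; action < STATS_ACTION_COUNT; action++) {
		if (actions_performed & (1u << action))
			stats->count[action][limit]++;
	}
}
```

With this layout, count[STATS_ACTION_FW_ACTIVATE][STATS_LIMIT_NO_RESET]
would answer how many firmware activations were done without reset.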

Change log: This version was preceded by five RFC versions. The changes
made in response to comments, mainly from Jakub and Jiri on the RFC API
patches, are listed in each patch.

Patch 1 changes the devlink_reload_supported() param type to enable using
        it before allocating devlink.
Patches 2-3 add the new API reload action and reload limit options to
            devlink reload.
Patches 4-5 add reload stats and remote reload stats. These stats are
            exposed through devlink dev get.
Patches 6-11 add mlx5 support for the devlink reload action fw_activate
             and handle the firmware reset events.
Patches 12-13 add the devlink enable_remote_dev_reset parameter and use it
              in mlx5.
Patches 14-15 add mlx5 support for the devlink reload limit no_reset with
              the fw_activate reload action.
Patch 16 adds the documentation file devlink-reload.rst.

Moshe Shemesh (16):
  devlink: Change devlink_reload_supported() param type
  devlink: Add reload action option to devlink reload command
  devlink: Add devlink reload limit option
  devlink: Add reload stats
  devlink: Add remote reload stats
  net/mlx5: Add functions to set/query MFRL register
  net/mlx5: Set cap for pci sync for fw update event
  net/mlx5: Handle sync reset request event
  net/mlx5: Handle sync reset now event
  net/mlx5: Handle sync reset abort event
  net/mlx5: Add support for devlink reload action fw activate
  devlink: Add enable_remote_dev_reset generic parameter
  net/mlx5: Add devlink param enable_remote_dev_reset support
  net/mlx5: Add support for fw live patch event
  net/mlx5: Add support for devlink reload limit no reset
  devlink: Add Documentation/networking/devlink/devlink-reload.rst

 .../networking/devlink/devlink-params.rst     |   6 +
 .../networking/devlink/devlink-reload.rst     |  81 +++
 Documentation/networking/devlink/index.rst    |   1 +
 drivers/net/ethernet/mellanox/mlx4/main.c     |   9 +-
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 114 ++++-
 .../mellanox/mlx5/core/diag/fw_tracer.c       |  52 ++
 .../mellanox/mlx5/core/diag/fw_tracer.h       |   1 +
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 463 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/fw_reset.h    |  21 +
 .../net/ethernet/mellanox/mlx5/core/health.c  |  35 +-
 .../net/ethernet/mellanox/mlx5/core/main.c    |  16 +
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
 drivers/net/ethernet/mellanox/mlxsw/core.c    |  13 +-
 drivers/net/netdevsim/dev.c                   |   8 +-
 include/linux/mlx5/device.h                   |   1 +
 include/linux/mlx5/driver.h                   |   2 +
 include/net/devlink.h                         |  21 +-
 include/uapi/linux/devlink.h                  |  42 ++
 net/core/devlink.c                            | 318 +++++++++++-
 20 files changed, 1164 insertions(+), 44 deletions(-)
 create mode 100644 Documentation/networking/devlink/devlink-reload.rst
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h

Comments

Jakub Kicinski Oct. 1, 2020, 9:48 p.m. UTC | #1
On Thu,  1 Oct 2020 16:59:08 +0300 Moshe Shemesh wrote:
> Add remote reload stats to hold the history of actions performed due
> devlink reload commands initiated by remote host. For example, in case
> firmware activation with reset finished successfully but was initiated
> by remote host.
> 
> The function devlink_remote_reload_actions_performed() is exported to
> enable drivers update on remote reload actions performed as it was not
> initiated by their own devlink instance.
> 
> Expose devlink remote reload stats to the user through devlink dev get
> command.

Reviewed-by: Jakub Kicinski <kuba@kernel.org>

>  		for (i = 0; i <= DEVLINK_RELOAD_ACTION_MAX; i++) {
> -			if (!devlink_reload_action_is_supported(devlink, i) ||
> +			if ((!is_remote && !devlink_reload_action_is_supported(devlink, i)) 

I see the point of these checks now, I guess it would have been cleaner
if they were added in this patch, but no big deal.