mbox series

[0/4] crypto: qat - add heartbeat feature

Message ID 20230615090437.436796-1-damian.muszynski@intel.com
Headers show
Series crypto: qat - add heartbeat feature | expand

Message

Damian Muszynski June 15, 2023, 9:04 a.m. UTC
This set introduces support for the QAT heartbeat feature. It allows
detection whenever device firmware or acceleration unit will hang.
We're adding this feature to allow our clients having a tool with
they could verify if all of the Quick Assist hardware resources are
healthy and operational.

QAT device firmware periodically writes counters to a specified physical
memory location. A pair of counters per thread is incremented at
the start and end of the main processing loop within the firmware.
Checking for Heartbeat consists of checking the validity of the pair
of counter values for each thread. Stagnant counters indicate
a firmware hang.

The first patch removes historical and never used HB definitions.
Patch no. 2 is implementing the hardware clock frequency measuring
interface.
The third introduces the main heartbeat implementation with the debugfs
interface.
The last patch implements an algorithm that allows the code to detect
which version of heartbeat API is used at the currently loaded firmware.

Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Damian Muszynski (4):
  crypto: qat - drop obsolete heartbeat interface
  crypto: qat - add measure clock frequency
  crypto: qat - add heartbeat feature
  crypto: qat - add heartbeat counters check

 Documentation/ABI/testing/debugfs-driver-qat  |  51 +++
 .../intel/qat/qat_4xxx/adf_4xxx_hw_data.c     |  11 +
 .../intel/qat/qat_4xxx/adf_4xxx_hw_data.h     |   4 +
 drivers/crypto/intel/qat/qat_4xxx/adf_drv.c   |   3 +
 .../intel/qat/qat_c3xxx/adf_c3xxx_hw_data.c   |  28 ++
 .../intel/qat/qat_c3xxx/adf_c3xxx_hw_data.h   |   7 +
 .../intel/qat/qat_c62x/adf_c62x_hw_data.c     |  28 ++
 .../intel/qat/qat_c62x/adf_c62x_hw_data.h     |   7 +
 drivers/crypto/intel/qat/qat_common/Makefile  |   3 +
 .../intel/qat/qat_common/adf_accel_devices.h  |  10 +
 .../crypto/intel/qat/qat_common/adf_admin.c   |  31 ++
 .../intel/qat/qat_common/adf_cfg_strings.h    |   2 +
 .../crypto/intel/qat/qat_common/adf_clock.c   | 127 +++++++
 .../crypto/intel/qat/qat_common/adf_clock.h   |  14 +
 .../intel/qat/qat_common/adf_common_drv.h     |   2 +
 .../crypto/intel/qat/qat_common/adf_dbgfs.c   |   9 +-
 .../intel/qat/qat_common/adf_gen2_config.c    |   7 +
 .../intel/qat/qat_common/adf_gen2_hw_data.h   |   3 +
 .../intel/qat/qat_common/adf_gen4_hw_data.h   |   3 +
 .../intel/qat/qat_common/adf_heartbeat.c      | 327 ++++++++++++++++++
 .../intel/qat/qat_common/adf_heartbeat.h      |  79 +++++
 .../qat/qat_common/adf_heartbeat_dbgfs.c      | 194 +++++++++++
 .../qat/qat_common/adf_heartbeat_dbgfs.h      |  12 +
 .../crypto/intel/qat/qat_common/adf_init.c    |  15 +
 .../qat/qat_common/icp_qat_fw_init_admin.h    |  20 +-
 .../qat/qat_dh895xcc/adf_dh895xcc_hw_data.c   |  13 +
 .../qat/qat_dh895xcc/adf_dh895xcc_hw_data.h   |   5 +
 27 files changed, 998 insertions(+), 17 deletions(-)
 create mode 100644 drivers/crypto/intel/qat/qat_common/adf_clock.c
 create mode 100644 drivers/crypto/intel/qat/qat_common/adf_clock.h
 create mode 100644 drivers/crypto/intel/qat/qat_common/adf_heartbeat.c
 create mode 100644 drivers/crypto/intel/qat/qat_common/adf_heartbeat.h
 create mode 100644 drivers/crypto/intel/qat/qat_common/adf_heartbeat_dbgfs.c
 create mode 100644 drivers/crypto/intel/qat/qat_common/adf_heartbeat_dbgfs.h


base-commit: 926f061ec16ed195331e950755fd74c897aefef3

Comments

Giovanni Cabiddu June 19, 2023, 2:46 p.m. UTC | #1
Hi Herbert,

On Thu, Jun 15, 2023 at 11:04:33AM +0200, Damian Muszynski wrote:
> This set introduces support for the QAT heartbeat feature. It allows
> detection whenever device firmware or acceleration unit will hang.
> We're adding this feature to allow our clients having a tool with
> they could verify if all of the Quick Assist hardware resources are
> healthy and operational.
> 
> QAT device firmware periodically writes counters to a specified physical
> memory location. A pair of counters per thread is incremented at
> the start and end of the main processing loop within the firmware.
> Checking for Heartbeat consists of checking the validity of the pair
> of counter values for each thread. Stagnant counters indicate
> a firmware hang.
> 
> The first patch removes historical and never used HB definitions.
> Patch no. 2 is implementing the hardware clock frequency measuring
> interface.
> The third introduces the main heartbeat implementation with the debugfs
> interface.
> The last patch implements an algorithm that allows the code to detect
> which version of heartbeat API is used at the currently loaded firmware.
> 
> Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
This set introduces a build failure with clang.
Ignore it. We are going to send a new version.

>> ld.lld: error: undefined symbol: __aeabi_ldivmod
>>> referenced by adf_clock.c:29
>>> (drivers/crypto/intel/qat/qat_common/adf_clock.c:29)
>>>               drivers/crypto/intel/qat/qat_common/adf_clock.o:(adf_clock_get_current_time)
>>>               in archive vmlinux.a
>>> referenced by adf_clock.c:24
>> (drivers/crypto/intel/qat/qat_common/adf_clock.c:24)

Regards,