From patchwork Thu Aug 20 18:51:54 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Igor Russkikh
X-Patchwork-Id: 262187
From: Igor Russkikh
CC: "David S . Miller", Jakub Kicinski, Ariel Elior, Michal Kalderon,
 Igor Russkikh
Subject: [PATCH v6 net-next 00/10] qed: introduce devlink health support
Date: Thu, 20 Aug 2020 21:51:54 +0300
Message-ID: <20200820185204.652-1-irusskikh@marvell.com>
X-Mailer: git-send-email 2.17.1
X-Mailing-List: netdev@vger.kernel.org

This is a follow-up implementation to the series

https://patchwork.ozlabs.org/project/netdev/cover/20200514095727.1361-1-irusskikh@marvell.com/

This is an implementation of devlink health infrastructure.

With this we are now able to report HW errors to devlink. Depending on
user configuration, devlink then takes its own actions: it captures and
stores a dump at the moment of failure and requests the driver to recover
the device.

So far we do not differentiate between global device failures and
failures specific to one PCI function. This means that some errors
specific to one physical function will affect the entire device. This is
not yet fully designed and verified; we will follow up in the future.

The solution was verified with artificially generated HW errors. Existing
tools for dump analysis can be used.

v6: patch 4: changing serial to board.serial and fw to fw.app
v5: improved patch 4 description
v4:
 - commit message and other fixes after Jiri's comments
 - removed one patch (will send to net)
v3: fix uninit var usage in patch 11
v2: fix #include issue from kbuild test robot.

Igor Russkikh (10):
  qed: move out devlink logic into a new file
  qed/qede: make devlink survive recovery
  qed: fix kconfig help entries
  qed: implement devlink info request
  qed: health reporter init deinit seq
  qed: use devlink logic to report errors
  qed*: make use of devlink recovery infrastructure
  qed: implement devlink dump
  qed: align adjacent indent
  qede: make driver reliable on unload after failures

 drivers/net/ethernet/qlogic/Kconfig           |   5 +-
 drivers/net/ethernet/qlogic/qed/Makefile      |   1 +
 drivers/net/ethernet/qlogic/qed/qed.h         |   3 +-
 drivers/net/ethernet/qlogic/qed/qed_dev.c     |   9 +
 drivers/net/ethernet/qlogic/qed/qed_devlink.c | 259 ++++++++++++++++++
 drivers/net/ethernet/qlogic/qed/qed_devlink.h |  20 ++
 drivers/net/ethernet/qlogic/qed/qed_main.c    | 116 +-------
 drivers/net/ethernet/qlogic/qede/qede.h       |   2 +
 drivers/net/ethernet/qlogic/qede/qede_main.c  |  38 ++-
 include/linux/qed/qed_if.h                    |  23 +-
 10 files changed, 347 insertions(+), 129 deletions(-)
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_devlink.c
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_devlink.h