From patchwork Mon Jun 7 11:18:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "huangguangbin \(A\)" X-Patchwork-Id: 455711 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D338C4743F for ; Mon, 7 Jun 2021 11:21:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 07BA66101A for ; Mon, 7 Jun 2021 11:21:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231480AbhFGLXO (ORCPT ); Mon, 7 Jun 2021 07:23:14 -0400 Received: from szxga08-in.huawei.com ([45.249.212.255]:4337 "EHLO szxga08-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230194AbhFGLXM (ORCPT ); Mon, 7 Jun 2021 07:23:12 -0400 Received: from dggemv711-chm.china.huawei.com (unknown [172.30.72.54]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4Fz9mZ3LwMz1BKn8; Mon, 7 Jun 2021 19:16:30 +0800 (CST) Received: from dggemi759-chm.china.huawei.com (10.1.198.145) by dggemv711-chm.china.huawei.com (10.1.198.66) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Mon, 7 Jun 2021 19:21:19 +0800 Received: from localhost.localdomain (10.67.165.24) by dggemi759-chm.china.huawei.com (10.1.198.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Mon, 7 Jun 2021 19:21:19 +0800 From: Guangbin Huang To: , CC: , , , , Subject: [PATCH net-next 1/3] net: hns3: add a separate error handling task Date: Mon, 7 Jun 2021 19:18:10 +0800 Message-ID: <1623064692-24205-2-git-send-email-huangguangbin2@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1623064692-24205-1-git-send-email-huangguangbin2@huawei.com> References: <1623064692-24205-1-git-send-email-huangguangbin2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggemi759-chm.china.huawei.com (10.1.198.145) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Jiaran Zhang Error handling and recovery logic are intertwined. Error handling (i.e. error identification, clearing error sources and initiation of recovery) is done in context of reset task. If certain hardware errors get delivered during driver init time, which can cause driver init/loading to fail. Introduce a separate error handling task to ensure below: 1. Reset logic remains independent of the error handling logic. 2. Add the hclge_errhand_task_schedule to schedule error recovery tasks, This will ensure that common misellaneous MSI-X interrupt are re-enabled quickly. Signed-off-by: Jiaran Zhang Signed-off-by: Salil Mehta Signed-off-by: Yufeng Mo Signed-off-by: Guangbin Huang --- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c | 4 +-- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 38 ++++++++++++++++++++++ .../ethernet/hisilicon/hns3/hns3pf/hclge_main.h | 1 + 3 files changed, 41 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c index 8223d699cd94..f125aa425872 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c @@ -1940,8 +1940,8 @@ int hclge_handle_hw_msix_error(struct hclge_dev *hdev, if (!test_bit(HCLGE_STATE_SERVICE_INITED, &hdev->state)) { dev_err(dev, - "Can't handle - MSIx error reported during dev init\n"); - return 0; + "failed to handle msix error during dev init\n"); + return -EAGAIN; } return hclge_handle_all_hw_msix_error(hdev, reset_requests); diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 6ecc106af334..8a431e124adb 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -2843,6 +2843,14 @@ static void hclge_reset_task_schedule(struct hclge_dev *hdev) hclge_wq, &hdev->service_task, 0); } +static void hclge_errhand_task_schedule(struct hclge_dev *hdev) +{ + if (!test_bit(HCLGE_STATE_REMOVING, &hdev->state) && + !test_and_set_bit(HCLGE_STATE_ERR_SERVICE_SCHED, &hdev->state)) + mod_delayed_work_on(cpumask_first(&hdev->affinity_mask), + hclge_wq, &hdev->service_task, 0); +} + void hclge_task_schedule(struct hclge_dev *hdev, unsigned long delay_time) { if (!test_bit(HCLGE_STATE_REMOVING, &hdev->state) && @@ -4264,6 +4272,36 @@ static void hclge_reset_subtask(struct hclge_dev *hdev) hdev->reset_type = HNAE3_NONE_RESET; } +static void hclge_misc_err_recovery(struct hclge_dev *hdev) +{ + struct hnae3_ae_dev *ae_dev = pci_get_drvdata(hdev->pdev); + struct device *dev = &hdev->pdev->dev; + u32 msix_sts_reg; + + msix_sts_reg = hclge_read_dev(&hdev->hw, HCLGE_MISC_VECTOR_INT_STS); + + if (msix_sts_reg & HCLGE_VECTOR0_REG_MSIX_MASK) { + if (hclge_handle_hw_msix_error(hdev, + &hdev->default_reset_request)) + dev_info(dev, "received msix interrupt 0x%x\n", + msix_sts_reg); + + if (hdev->default_reset_request) + if (ae_dev->ops->reset_event) + ae_dev->ops->reset_event(hdev->pdev, NULL); + } + + hclge_enable_vector(&hdev->misc_vector, true); +} + +static void hclge_errhand_service_task(struct hclge_dev *hdev) +{ + if (!test_and_clear_bit(HCLGE_STATE_ERR_SERVICE_SCHED, &hdev->state)) + return; + + hclge_misc_err_recovery(hdev); +} + static void hclge_reset_service_task(struct hclge_dev *hdev) { if (!test_and_clear_bit(HCLGE_STATE_RST_SERVICE_SCHED, &hdev->state)) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h index 7595f841aaac..9b8abb5d7a8e 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h @@ -221,6 +221,7 @@ enum HCLGE_DEV_STATE { HCLGE_STATE_RST_HANDLING, HCLGE_STATE_MBX_SERVICE_SCHED, HCLGE_STATE_MBX_HANDLING, + HCLGE_STATE_ERR_SERVICE_SCHED, HCLGE_STATE_STATISTICS_UPDATING, HCLGE_STATE_CMD_DISABLE, HCLGE_STATE_LINK_UPDATING, From patchwork Mon Jun 7 11:18:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "huangguangbin \(A\)" X-Patchwork-Id: 455710 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EAEBEC47094 for ; Mon, 7 Jun 2021 11:21:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D33DD61078 for ; Mon, 7 Jun 2021 11:21:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231524AbhFGLXS (ORCPT ); Mon, 7 Jun 2021 07:23:18 -0400 Received: from szxga08-in.huawei.com ([45.249.212.255]:4339 "EHLO szxga08-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231459AbhFGLXO (ORCPT ); Mon, 7 Jun 2021 07:23:14 -0400 Received: from dggemv704-chm.china.huawei.com (unknown [172.30.72.53]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4Fz9mb2t9Wz1BKmg; Mon, 7 Jun 2021 19:16:31 +0800 (CST) Received: from dggemi759-chm.china.huawei.com (10.1.198.145) by dggemv704-chm.china.huawei.com (10.3.19.47) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Mon, 7 Jun 2021 19:21:19 +0800 Received: from localhost.localdomain (10.67.165.24) by dggemi759-chm.china.huawei.com (10.1.198.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Mon, 7 Jun 2021 19:21:19 +0800 From: Guangbin Huang To: , CC: , , , , Subject: [PATCH net-next 2/3] net: hns3: add scheduling logic for error handling task Date: Mon, 7 Jun 2021 19:18:11 +0800 Message-ID: <1623064692-24205-3-git-send-email-huangguangbin2@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1623064692-24205-1-git-send-email-huangguangbin2@huawei.com> References: <1623064692-24205-1-git-send-email-huangguangbin2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggemi759-chm.china.huawei.com (10.1.198.145) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Jiaran Zhang Error handling & recovery is done in context of reset task which gets scheduled from misc interrupt handler in existing code. But since error handling has been moved to new task, it should get scheduled instead of the reset task from the interrupt handler. Signed-off-by: Jiaran Zhang Signed-off-by: Salil Mehta Signed-off-by: Yufeng Mo Signed-off-by: Guangbin Huang --- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 20 ++++++-------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 8a431e124adb..4b1aa5c45852 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -3402,18 +3402,8 @@ static irqreturn_t hclge_misc_irq_handle(int irq, void *data) /* vector 0 interrupt is shared with reset and mailbox source events.*/ switch (event_cause) { case HCLGE_VECTOR0_EVENT_ERR: - /* we do not know what type of reset is required now. This could - * only be decided after we fetch the type of errors which - * caused this event. Therefore, we will do below for now: - * 1. Assert HNAE3_UNKNOWN_RESET type of reset. This means we - * have defered type of reset to be used. - * 2. Schedule the reset service task. - * 3. When service task receives HNAE3_UNKNOWN_RESET type it - * will fetch the correct type of reset. This would be done - * by first decoding the types of errors. - */ - set_bit(HNAE3_UNKNOWN_RESET, &hdev->reset_request); - fallthrough; + hclge_errhand_task_schedule(hdev); + break; case HCLGE_VECTOR0_EVENT_RST: hclge_reset_task_schedule(hdev); break; @@ -4385,14 +4375,16 @@ static void hclge_service_task(struct work_struct *work) struct hclge_dev *hdev = container_of(work, struct hclge_dev, service_task.work); + hclge_errhand_service_task(hdev); hclge_reset_service_task(hdev); hclge_mailbox_service_task(hdev); hclge_periodic_service_task(hdev); - /* Handle reset and mbx again in case periodical task delays the - * handling by calling hclge_task_schedule() in + /* Handle error recovery, reset and mbx again in case periodical task + * delays the handling by calling hclge_task_schedule() in * hclge_periodic_service_task(). */ + hclge_errhand_service_task(hdev); hclge_reset_service_task(hdev); hclge_mailbox_service_task(hdev); } From patchwork Mon Jun 7 11:18:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "huangguangbin \(A\)" X-Patchwork-Id: 456494 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 60A9EC47082 for ; Mon, 7 Jun 2021 11:21:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4E4BB61078 for ; Mon, 7 Jun 2021 11:21:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231499AbhFGLXQ (ORCPT ); Mon, 7 Jun 2021 07:23:16 -0400 Received: from szxga08-in.huawei.com ([45.249.212.255]:4338 "EHLO szxga08-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231184AbhFGLXN (ORCPT ); Mon, 7 Jun 2021 07:23:13 -0400 Received: from dggemv704-chm.china.huawei.com (unknown [172.30.72.53]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4Fz9mb36hFz1BKml; Mon, 7 Jun 2021 19:16:31 +0800 (CST) Received: from dggemi759-chm.china.huawei.com (10.1.198.145) by dggemv704-chm.china.huawei.com (10.3.19.47) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Mon, 7 Jun 2021 19:21:20 +0800 Received: from localhost.localdomain (10.67.165.24) by dggemi759-chm.china.huawei.com (10.1.198.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Mon, 7 Jun 2021 19:21:19 +0800 From: Guangbin Huang To: , CC: , , , , Subject: [PATCH net-next 3/3] net: hns3: remove now redundant logic related to HNAE3_UNKNOWN_RESET Date: Mon, 7 Jun 2021 19:18:12 +0800 Message-ID: <1623064692-24205-4-git-send-email-huangguangbin2@huawei.com> X-Mailer: git-send-email 2.8.1 In-Reply-To: <1623064692-24205-1-git-send-email-huangguangbin2@huawei.com> References: <1623064692-24205-1-git-send-email-huangguangbin2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggemi759-chm.china.huawei.com (10.1.198.145) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Yufeng Mo Earlier patches have decoupled the MSI-X conveyed error handling and recovery logic. This earlier concept code is no longer required. Signed-off-by: Yufeng Mo Signed-off-by: Salil Mehta Signed-off-by: Jiaran Zhang Signed-off-by: Guangbin Huang --- drivers/net/ethernet/hisilicon/hns3/hnae3.h | 1 - .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 22 ---------------------- 2 files changed, 23 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h index 89b2b7fa7b8b..dc9b5bc3431b 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h @@ -243,7 +243,6 @@ enum hnae3_reset_type { HNAE3_FUNC_RESET, HNAE3_GLOBAL_RESET, HNAE3_IMP_RESET, - HNAE3_UNKNOWN_RESET, HNAE3_NONE_RESET, HNAE3_MAX_RESET, }; diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 4b1aa5c45852..45102681bd2a 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -3792,28 +3792,6 @@ static enum hnae3_reset_type hclge_get_reset_level(struct hnae3_ae_dev *ae_dev, enum hnae3_reset_type rst_level = HNAE3_NONE_RESET; struct hclge_dev *hdev = ae_dev->priv; - /* first, resolve any unknown reset type to the known type(s) */ - if (test_bit(HNAE3_UNKNOWN_RESET, addr)) { - u32 msix_sts_reg = hclge_read_dev(&hdev->hw, - HCLGE_MISC_VECTOR_INT_STS); - /* we will intentionally ignore any errors from this function - * as we will end up in *some* reset request in any case - */ - if (hclge_handle_hw_msix_error(hdev, addr)) - dev_info(&hdev->pdev->dev, "received msix interrupt 0x%x\n", - msix_sts_reg); - - clear_bit(HNAE3_UNKNOWN_RESET, addr); - /* We defered the clearing of the error event which caused - * interrupt since it was not posssible to do that in - * interrupt context (and this is the reason we introduced - * new UNKNOWN reset type). Now, the errors have been - * handled and cleared in hardware we can safely enable - * interrupts. This is an exception to the norm. - */ - hclge_enable_vector(&hdev->misc_vector, true); - } - /* return the highest priority reset level amongst all */ if (test_bit(HNAE3_IMP_RESET, addr)) { rst_level = HNAE3_IMP_RESET;