From patchwork Mon Apr 10 13:21:56 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 97108 Delivered-To: patch@linaro.org Received: by 10.140.89.233 with SMTP id v96csp1347187qgd; Mon, 10 Apr 2017 05:52:50 -0700 (PDT) X-Received: by 10.99.53.195 with SMTP id c186mr55410039pga.182.1491828770637; Mon, 10 Apr 2017 05:52:50 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i10si13590582pfj.245.2017.04.10.05.52.50; Mon, 10 Apr 2017 05:52:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753479AbdDJMw0 (ORCPT + 24 others); Mon, 10 Apr 2017 08:52:26 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:5427 "EHLO dggrg03-dlp.huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753394AbdDJMwZ (ORCPT ); Mon, 10 Apr 2017 08:52:25 -0400 Received: from 172.30.72.57 (EHLO DGGEML401-HUB.china.huawei.com) ([172.30.72.57]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id ALL62812; Mon, 10 Apr 2017 20:52:15 +0800 (CST) Received: from localhost.localdomain (10.67.212.75) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Mon, 10 Apr 2017 20:52:03 +0800 From: John Garry To: , CC: , , , , , Xiaofei Tan , John Garry Subject: [PATCH 1/6] scsi: hisi_sas: workaround STP link SoC bug Date: Mon, 10 Apr 2017 21:21:56 +0800 Message-ID: <1491830521-21437-2-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1491830521-21437-1-git-send-email-john.garry@huawei.com> References: <1491830521-21437-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.212.75] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58EB8000.000A, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 8f303f1b0bbcfad61fcefa853da07955 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Xiaofei Tan After resetting the controller, the process of scanning SATA disks attached to an expander may fail occasionally. The issue is that the controller can't close the STP link created by target if the max link time is 0. To workaround this issue, we reject STP link after resetting the controller, and change the corresponding PHY to accept STP link only after receiving data. We do this check in cq interrupt handler. In order not to reduce efficiency, we use an variable to control whether we should check and change PHY to accept STP link. The function phys_reject_stp_links_v2_hw() should be called after resetting the controller. The solution of another SoC bug "SATA IO timeout", that also uses the same register to control STP link, is not effective before the PHY accepts STP link. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas.h | 1 + drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 60 +++++++++++++++++++++++++++++++++- 2 files changed, 60 insertions(+), 1 deletion(-) -- 1.9.1 Reviewed-by: Johannes Thumshirn diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h index 6aa0b62..72347b6 100644 --- a/drivers/scsi/hisi_sas/hisi_sas.h +++ b/drivers/scsi/hisi_sas/hisi_sas.h @@ -203,6 +203,7 @@ struct hisi_hba { int slot_index_count; unsigned long *slot_index_tags; + unsigned long reject_stp_links_msk; /* SCSI/SAS glue */ struct sas_ha_struct sha; diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index 66c2de8..c550cc4 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -218,6 +218,9 @@ #define RX_IDAF_DWORD6 (PORT_BASE + 0xdc) #define RXOP_CHECK_CFG_H (PORT_BASE + 0xfc) #define CON_CONTROL (PORT_BASE + 0x118) +#define CON_CONTROL_CFG_OPEN_ACC_STP_OFF 0 +#define CON_CONTROL_CFG_OPEN_ACC_STP_MSK \ + (0x01 << CON_CONTROL_CFG_OPEN_ACC_STP_OFF) #define DONE_RECEIVED_TIME (PORT_BASE + 0x11c) #define CHL_INT0 (PORT_BASE + 0x1b4) #define CHL_INT0_HOTPLUG_TOUT_OFF 0 @@ -240,6 +243,9 @@ #define CHL_INT1_MSK (PORT_BASE + 0x1c4) #define CHL_INT2_MSK (PORT_BASE + 0x1c8) #define CHL_INT_COAL_EN (PORT_BASE + 0x1d0) +#define DMA_TX_DFX1 (PORT_BASE + 0x204) +#define DMA_TX_DFX1_IPTT_OFF 0 +#define DMA_TX_DFX1_IPTT_MSK (0xffff << DMA_TX_DFX1_IPTT_OFF) #define PHY_CTRL_RDY_MSK (PORT_BASE + 0x2b0) #define PHYCTRL_NOT_RDY_MSK (PORT_BASE + 0x2b4) #define PHYCTRL_DWS_RESET_MSK (PORT_BASE + 0x2b8) @@ -593,7 +599,8 @@ static u32 hisi_sas_phy_read32(struct hisi_hba *hisi_hba, slot_index_alloc_quirk_v2_hw(struct hisi_hba *hisi_hba, int *slot_idx, struct domain_device *device) { - unsigned int index = 0; + /* STP link chip bug workaround:index start from 1 */ + unsigned int index = 1; void *bitmap = hisi_hba->slot_index_tags; int sata_dev = dev_is_sata(device); @@ -875,6 +882,46 @@ static int reset_hw_v2_hw(struct hisi_hba *hisi_hba) return 0; } +/* This function needs to be called after resetting SAS controller. */ +static void phys_reject_stp_links_v2_hw(struct hisi_hba *hisi_hba) +{ + u32 cfg; + int phy_no; + + hisi_hba->reject_stp_links_msk = (1 << hisi_hba->n_phy) - 1; + for (phy_no = 0; phy_no < hisi_hba->n_phy; phy_no++) { + cfg = hisi_sas_phy_read32(hisi_hba, phy_no, CON_CONTROL); + if (!(cfg & CON_CONTROL_CFG_OPEN_ACC_STP_MSK)) + continue; + + cfg &= ~CON_CONTROL_CFG_OPEN_ACC_STP_MSK; + hisi_sas_phy_write32(hisi_hba, phy_no, CON_CONTROL, cfg); + } +} + +static void phys_try_accept_stp_links_v2_hw(struct hisi_hba *hisi_hba) +{ + int phy_no; + u32 dma_tx_dfx1; + + for (phy_no = 0; phy_no < hisi_hba->n_phy; phy_no++) { + if (!(hisi_hba->reject_stp_links_msk & BIT(phy_no))) + continue; + + dma_tx_dfx1 = hisi_sas_phy_read32(hisi_hba, phy_no, + DMA_TX_DFX1); + if (dma_tx_dfx1 & DMA_TX_DFX1_IPTT_MSK) { + u32 cfg = hisi_sas_phy_read32(hisi_hba, + phy_no, CON_CONTROL); + + cfg |= CON_CONTROL_CFG_OPEN_ACC_STP_MSK; + hisi_sas_phy_write32(hisi_hba, phy_no, + CON_CONTROL, cfg); + clear_bit(phy_no, &hisi_hba->reject_stp_links_msk); + } + } +} + static void init_reg_v2_hw(struct hisi_hba *hisi_hba) { struct device *dev = &hisi_hba->pdev->dev; @@ -1012,6 +1059,9 @@ static void link_timeout_enable_link(unsigned long data) int i, reg_val; for (i = 0; i < hisi_hba->n_phy; i++) { + if (hisi_hba->reject_stp_links_msk & BIT(i)) + continue; + reg_val = hisi_sas_phy_read32(hisi_hba, i, CON_CONTROL); if (!(reg_val & BIT(0))) { hisi_sas_phy_write32(hisi_hba, i, @@ -1031,6 +1081,9 @@ static void link_timeout_disable_link(unsigned long data) reg_val = hisi_sas_read32(hisi_hba, PHY_STATE); for (i = 0; i < hisi_hba->n_phy && reg_val; i++) { + if (hisi_hba->reject_stp_links_msk & BIT(i)) + continue; + if (reg_val & BIT(i)) { hisi_sas_phy_write32(hisi_hba, i, CON_CONTROL, 0x6); @@ -2867,6 +2920,9 @@ static void cq_tasklet_v2_hw(unsigned long val) u32 rd_point = cq->rd_point, wr_point, dev_id; int queue = cq->id; + if (unlikely(hisi_hba->reject_stp_links_msk)) + phys_try_accept_stp_links_v2_hw(hisi_hba); + complete_queue = hisi_hba->complete_hdr[queue]; spin_lock(&hisi_hba->lock); @@ -3214,6 +3270,8 @@ static int soft_reset_v2_hw(struct hisi_hba *hisi_hba) if (rc) return rc; + phys_reject_stp_links_v2_hw(hisi_hba); + /* Re-enable the PHYs */ for (phy_no = 0; phy_no < hisi_hba->n_phy; phy_no++) { struct hisi_sas_phy *phy = &hisi_hba->phy[phy_no]; From patchwork Mon Apr 10 13:21:57 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 97110 Delivered-To: patch@linaro.org Received: by 10.140.89.233 with SMTP id v96csp1347192qgd; Mon, 10 Apr 2017 05:52:51 -0700 (PDT) X-Received: by 10.84.236.8 with SMTP id q8mr18532391plk.104.1491828771350; Mon, 10 Apr 2017 05:52:51 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i10si13590582pfj.245.2017.04.10.05.52.51; Mon, 10 Apr 2017 05:52:51 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753537AbdDJMwg (ORCPT + 24 others); Mon, 10 Apr 2017 08:52:36 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:5428 "EHLO dggrg03-dlp.huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753394AbdDJMw2 (ORCPT ); Mon, 10 Apr 2017 08:52:28 -0400 Received: from 172.30.72.57 (EHLO DGGEML401-HUB.china.huawei.com) ([172.30.72.57]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id ALL62809; Mon, 10 Apr 2017 20:52:14 +0800 (CST) Received: from localhost.localdomain (10.67.212.75) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Mon, 10 Apr 2017 20:52:03 +0800 From: John Garry To: , CC: , , , , , Xiaofei Tan , John Garry Subject: [PATCH 2/6] scsi: hisi_sas: workaround a SoC SATA IO processing bug Date: Mon, 10 Apr 2017 21:21:57 +0800 Message-ID: <1491830521-21437-3-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1491830521-21437-1-git-send-email-john.garry@huawei.com> References: <1491830521-21437-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.212.75] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58EB7FFF.009C, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 908d51206fcd394cf64b7e0cdbbb63b8 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Xiaofei Tan This patch provides a workaround a SoC bug where SATA IPTTs for different devices may conflict. The workaround solution requests the following: 1. SATA device id must be even and not equal to SAS IPTT. 2. SATA device can not share the same IPTT with other SAS or SATA device. Besides we shall consider IPTT value 0 is reserved for another SoC bug (STP device open link at firstly after SAS controller reset). To sum up, the solution is: Each SATA device uses independent and continuous 32 even IPTT from 64 to 4094, then v2 hw can only support 63 SATA devices. All SAS device(SSP/SMP devices) share odd IPTT value from 1 to 4095. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas.h | 2 + drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 83 ++++++++++++++++++++++++++++------ 2 files changed, 72 insertions(+), 13 deletions(-) -- 1.9.1 Reviewed-by: Johannes Thumshirn diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h index 72347b6..c80ca83 100644 --- a/drivers/scsi/hisi_sas/hisi_sas.h +++ b/drivers/scsi/hisi_sas/hisi_sas.h @@ -115,6 +115,7 @@ struct hisi_sas_device { atomic64_t running_req; struct list_head list; u8 dev_status; + int sata_idx; }; struct hisi_sas_slot { @@ -238,6 +239,7 @@ struct hisi_hba { struct hisi_sas_slot *slot_info; unsigned long flags; const struct hisi_sas_hw *hw; /* Low level hw interface */ + unsigned long sata_dev_bitmap[BITS_TO_LONGS(HISI_SAS_MAX_DEVICES)]; struct work_struct rst_work; }; diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index c550cc4..fc8d8296 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -537,6 +537,7 @@ enum { }; #define HISI_SAS_COMMAND_ENTRIES_V2_HW 4096 +#define HISI_MAX_SATA_SUPPORT_V2_HW (HISI_SAS_COMMAND_ENTRIES_V2_HW/64 - 1) #define DIR_NO_DATA 0 #define DIR_TO_INI 1 @@ -597,39 +598,86 @@ static u32 hisi_sas_phy_read32(struct hisi_hba *hisi_hba, /* This function needs to be protected from pre-emption. */ static int slot_index_alloc_quirk_v2_hw(struct hisi_hba *hisi_hba, int *slot_idx, - struct domain_device *device) + struct domain_device *device) { - /* STP link chip bug workaround:index start from 1 */ - unsigned int index = 1; - void *bitmap = hisi_hba->slot_index_tags; int sata_dev = dev_is_sata(device); + void *bitmap = hisi_hba->slot_index_tags; + struct hisi_sas_device *sas_dev = device->lldd_dev; + int sata_idx = sas_dev->sata_idx; + int start, end; + + if (!sata_dev) { + /* + * STP link SoC bug workaround: index starts from 1. + * additionally, we can only allocate odd IPTT(1~4095) + * for SAS/SMP device. + */ + start = 1; + end = hisi_hba->slot_index_count; + } else { + if (sata_idx >= HISI_MAX_SATA_SUPPORT_V2_HW) + return -EINVAL; + + /* + * For SATA device: allocate even IPTT in this interval + * [64*(sata_idx+1), 64*(sata_idx+2)], then each SATA device + * own 32 IPTTs. IPTT 0 shall not be used duing to STP link + * SoC bug workaround. So we ignore the first 32 even IPTTs. + */ + start = 64 * (sata_idx + 1); + end = 64 * (sata_idx + 2); + } while (1) { - index = find_next_zero_bit(bitmap, hisi_hba->slot_index_count, - index); - if (index >= hisi_hba->slot_index_count) + start = find_next_zero_bit(bitmap, + hisi_hba->slot_index_count, start); + if (start >= end) return -SAS_QUEUE_FULL; /* - * SAS IPTT bit0 should be 1 - */ - if (sata_dev || (index & 1)) + * SAS IPTT bit0 should be 1, and SATA IPTT bit0 should be 0. + */ + if (sata_dev ^ (start & 1)) break; - index++; + start++; } - set_bit(index, bitmap); - *slot_idx = index; + set_bit(start, bitmap); + *slot_idx = start; return 0; } +static bool sata_index_alloc_v2_hw(struct hisi_hba *hisi_hba, int *idx) +{ + unsigned int index; + struct device *dev = &hisi_hba->pdev->dev; + void *bitmap = hisi_hba->sata_dev_bitmap; + + index = find_first_zero_bit(bitmap, HISI_MAX_SATA_SUPPORT_V2_HW); + if (index >= HISI_MAX_SATA_SUPPORT_V2_HW) { + dev_warn(dev, "alloc sata index failed, index=%d\n", index); + return false; + } + + set_bit(index, bitmap); + *idx = index; + return true; +} + + static struct hisi_sas_device *alloc_dev_quirk_v2_hw(struct domain_device *device) { struct hisi_hba *hisi_hba = device->port->ha->lldd_ha; struct hisi_sas_device *sas_dev = NULL; int i, sata_dev = dev_is_sata(device); + int sata_idx = -1; spin_lock(&hisi_hba->lock); + + if (sata_dev) + if (!sata_index_alloc_v2_hw(hisi_hba, &sata_idx)) + goto out; + for (i = 0; i < HISI_SAS_MAX_DEVICES; i++) { /* * SATA device id bit0 should be 0 @@ -643,10 +691,13 @@ hisi_sas_device *alloc_dev_quirk_v2_hw(struct domain_device *device) sas_dev->dev_type = device->dev_type; sas_dev->hisi_hba = hisi_hba; sas_dev->sas_device = device; + sas_dev->sata_idx = sata_idx; INIT_LIST_HEAD(&hisi_hba->devices[i].list); break; } } + +out: spin_unlock(&hisi_hba->lock); return sas_dev; @@ -753,6 +804,10 @@ static void free_device_v2_hw(struct hisi_hba *hisi_hba, u32 reg_val = hisi_sas_read32(hisi_hba, ENT_INT_SRC3); int i; + /* SoC bug workaround */ + if (dev_is_sata(sas_dev->sas_device)) + clear_bit(sas_dev->sata_idx, hisi_hba->sata_dev_bitmap); + /* clear the itct interrupt state */ if (ENT_INT_SRC3_ITC_INT_MSK & reg_val) hisi_sas_write32(hisi_hba, ENT_INT_SRC3, @@ -3197,6 +3252,8 @@ static int hisi_sas_v2_init(struct hisi_hba *hisi_hba) { int rc; + memset(hisi_hba->sata_dev_bitmap, 0, sizeof(hisi_hba->sata_dev_bitmap)); + rc = hw_init_v2_hw(hisi_hba); if (rc) return rc; From patchwork Mon Apr 10 13:21:58 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 97112 Delivered-To: patch@linaro.org Received: by 10.140.89.233 with SMTP id v96csp1347469qgd; Mon, 10 Apr 2017 05:53:44 -0700 (PDT) X-Received: by 10.98.150.65 with SMTP id c62mr16059510pfe.96.1491828824220; Mon, 10 Apr 2017 05:53:44 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p10si13518944pge.361.2017.04.10.05.53.43; Mon, 10 Apr 2017 05:53:44 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753612AbdDJMxU (ORCPT + 24 others); Mon, 10 Apr 2017 08:53:20 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:5423 "EHLO dggrg03-dlp.huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752760AbdDJMwX (ORCPT ); Mon, 10 Apr 2017 08:52:23 -0400 Received: from 172.30.72.57 (EHLO DGGEML401-HUB.china.huawei.com) ([172.30.72.57]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id ALL62810; Mon, 10 Apr 2017 20:52:14 +0800 (CST) Received: from localhost.localdomain (10.67.212.75) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Mon, 10 Apr 2017 20:52:04 +0800 From: John Garry To: , CC: , , , , , Xiaofei Tan , John Garry Subject: [PATCH 3/6] scsi: hisi_sas: workaround SoC about abort timeout bug Date: Mon, 10 Apr 2017 21:21:58 +0800 Message-ID: <1491830521-21437-4-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1491830521-21437-1-git-send-email-john.garry@huawei.com> References: <1491830521-21437-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.212.75] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020203.58EB7FFF.0190, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 96d0ce92ffe02832b4ca9c6b82e2d900 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Xiaofei Tan This patch adds a workaround solution for a SoC bug which may cause SoC logic fatal error when disabling a PHY. Then we find internal abort IO timeout may occur, and the controller IO breakpoint may be corrupted. We work around this SoC bug by optimizing the flow of disabling a PHY. Signed-off-by: Xiaofei Tan Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 127 ++++++++++++++++++++++++++++++++- 1 file changed, 126 insertions(+), 1 deletion(-) -- 1.9.1 diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index fc8d8296..a1eb88d 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -207,6 +207,8 @@ #define TXID_AUTO (PORT_BASE + 0xb8) #define TXID_AUTO_CT3_OFF 1 #define TXID_AUTO_CT3_MSK (0x1 << TXID_AUTO_CT3_OFF) +#define TXID_AUTO_CTB_OFF 11 +#define TXID_AUTO_CTB_MSK (0x1 << TXID_AUTO_CTB_OFF) #define TX_HARDRST_OFF 2 #define TX_HARDRST_MSK (0x1 << TX_HARDRST_OFF) #define RX_IDAF_DWORD0 (PORT_BASE + 0xc4) @@ -243,9 +245,17 @@ #define CHL_INT1_MSK (PORT_BASE + 0x1c4) #define CHL_INT2_MSK (PORT_BASE + 0x1c8) #define CHL_INT_COAL_EN (PORT_BASE + 0x1d0) +#define DMA_TX_DFX0 (PORT_BASE + 0x200) #define DMA_TX_DFX1 (PORT_BASE + 0x204) #define DMA_TX_DFX1_IPTT_OFF 0 #define DMA_TX_DFX1_IPTT_MSK (0xffff << DMA_TX_DFX1_IPTT_OFF) +#define DMA_TX_FIFO_DFX0 (PORT_BASE + 0x240) +#define PORT_DFX0 (PORT_BASE + 0x258) +#define LINK_DFX2 (PORT_BASE + 0X264) +#define LINK_DFX2_RCVR_HOLD_STS_OFF 9 +#define LINK_DFX2_RCVR_HOLD_STS_MSK (0x1 << LINK_DFX2_RCVR_HOLD_STS_OFF) +#define LINK_DFX2_SEND_HOLD_STS_OFF 10 +#define LINK_DFX2_SEND_HOLD_STS_MSK (0x1 << LINK_DFX2_SEND_HOLD_STS_OFF) #define PHY_CTRL_RDY_MSK (PORT_BASE + 0x2b0) #define PHYCTRL_NOT_RDY_MSK (PORT_BASE + 0x2b4) #define PHYCTRL_DWS_RESET_MSK (PORT_BASE + 0x2b8) @@ -1194,12 +1204,127 @@ static bool is_sata_phy_v2_hw(struct hisi_hba *hisi_hba, int phy_no) return false; } +static bool tx_fifo_is_empty_v2_hw(struct hisi_hba *hisi_hba, int phy_no) +{ + u32 dfx_val; + + dfx_val = hisi_sas_phy_read32(hisi_hba, phy_no, DMA_TX_DFX1); + + if (dfx_val & BIT(16)) + return false; + + return true; +} + +static bool axi_bus_is_idle_v2_hw(struct hisi_hba *hisi_hba, int phy_no) +{ + int i, max_loop = 1000; + struct device *dev = &hisi_hba->pdev->dev; + u32 status, axi_status, dfx_val, dfx_tx_val; + + for (i = 0; i < max_loop; i++) { + status = hisi_sas_read32_relaxed(hisi_hba, + AXI_MASTER_CFG_BASE + AM_CURR_TRANS_RETURN); + + axi_status = hisi_sas_read32(hisi_hba, AXI_CFG); + dfx_val = hisi_sas_phy_read32(hisi_hba, phy_no, DMA_TX_DFX1); + dfx_tx_val = hisi_sas_phy_read32(hisi_hba, + phy_no, DMA_TX_FIFO_DFX0); + + if ((status == 0x3) && (axi_status == 0x0) && + (dfx_val & BIT(20)) && (dfx_tx_val & BIT(10))) + return true; + udelay(10); + } + dev_err(dev, "bus is not idle phy%d, axi150:0x%x axi100:0x%x port204:0x%x port240:0x%x\n", + phy_no, status, axi_status, + dfx_val, dfx_tx_val); + return false; +} + +static bool wait_io_done_v2_hw(struct hisi_hba *hisi_hba, int phy_no) +{ + int i, max_loop = 1000; + struct device *dev = &hisi_hba->pdev->dev; + u32 status, tx_dfx0; + + for (i = 0; i < max_loop; i++) { + status = hisi_sas_phy_read32(hisi_hba, phy_no, LINK_DFX2); + status = (status & 0x3fc0) >> 6; + + if (status != 0x1) + return true; + + tx_dfx0 = hisi_sas_phy_read32(hisi_hba, phy_no, DMA_TX_DFX0); + if ((tx_dfx0 & 0x1ff) == 0x2) + return true; + udelay(10); + } + dev_err(dev, "IO not done phy%d, port264:0x%x port200:0x%x\n", + phy_no, status, tx_dfx0); + return false; +} + +static bool allowed_disable_phy_v2_hw(struct hisi_hba *hisi_hba, int phy_no) +{ + if (tx_fifo_is_empty_v2_hw(hisi_hba, phy_no)) + return true; + + if (!axi_bus_is_idle_v2_hw(hisi_hba, phy_no)) + return false; + + if (!wait_io_done_v2_hw(hisi_hba, phy_no)) + return false; + + return true; +} + + static void disable_phy_v2_hw(struct hisi_hba *hisi_hba, int phy_no) { - u32 cfg = hisi_sas_phy_read32(hisi_hba, phy_no, PHY_CFG); + u32 cfg, axi_val, dfx0_val, txid_auto; + struct device *dev = &hisi_hba->pdev->dev; + + /* Close axi bus. */ + axi_val = hisi_sas_read32(hisi_hba, AXI_MASTER_CFG_BASE + + AM_CTRL_GLOBAL); + axi_val |= 0x1; + hisi_sas_write32(hisi_hba, AXI_MASTER_CFG_BASE + + AM_CTRL_GLOBAL, axi_val); + + if (is_sata_phy_v2_hw(hisi_hba, phy_no)) { + if (allowed_disable_phy_v2_hw(hisi_hba, phy_no)) + goto do_disable; + + /* Reset host controller. */ + queue_work(hisi_hba->wq, &hisi_hba->rst_work); + return; + } + + dfx0_val = hisi_sas_phy_read32(hisi_hba, phy_no, PORT_DFX0); + dfx0_val = (dfx0_val & 0x1fc0) >> 6; + if (dfx0_val != 0x4) + goto do_disable; + + if (!tx_fifo_is_empty_v2_hw(hisi_hba, phy_no)) { + dev_warn(dev, "phy%d, wait tx fifo need send break\n", + phy_no); + txid_auto = hisi_sas_phy_read32(hisi_hba, phy_no, + TXID_AUTO); + txid_auto |= TXID_AUTO_CTB_MSK; + hisi_sas_phy_write32(hisi_hba, phy_no, TXID_AUTO, + txid_auto); + } +do_disable: + cfg = hisi_sas_phy_read32(hisi_hba, phy_no, PHY_CFG); cfg &= ~PHY_CFG_ENA_MSK; hisi_sas_phy_write32(hisi_hba, phy_no, PHY_CFG, cfg); + + /* Open axi bus. */ + axi_val &= ~0x1; + hisi_sas_write32(hisi_hba, AXI_MASTER_CFG_BASE + + AM_CTRL_GLOBAL, axi_val); } static void start_phy_v2_hw(struct hisi_hba *hisi_hba, int phy_no) From patchwork Mon Apr 10 13:21:59 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 97109 Delivered-To: patch@linaro.org Received: by 10.140.89.233 with SMTP id v96csp1347191qgd; Mon, 10 Apr 2017 05:52:51 -0700 (PDT) X-Received: by 10.99.143.88 with SMTP id r24mr56839855pgn.177.1491828771005; Mon, 10 Apr 2017 05:52:51 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i10si13590582pfj.245.2017.04.10.05.52.50; Mon, 10 Apr 2017 05:52:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753515AbdDJMw3 (ORCPT + 24 others); Mon, 10 Apr 2017 08:52:29 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:5426 "EHLO dggrg03-dlp.huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752750AbdDJMw1 (ORCPT ); Mon, 10 Apr 2017 08:52:27 -0400 Received: from 172.30.72.57 (EHLO DGGEML401-HUB.china.huawei.com) ([172.30.72.57]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id ALL62805; Mon, 10 Apr 2017 20:52:13 +0800 (CST) Received: from localhost.localdomain (10.67.212.75) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Mon, 10 Apr 2017 20:52:04 +0800 From: John Garry To: , CC: , , , , , John Garry , Xiaofei Tan Subject: [PATCH 4/6] scsi: hisi_sas: add v2 hw internal abort timeout workaround Date: Mon, 10 Apr 2017 21:21:59 +0800 Message-ID: <1491830521-21437-5-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1491830521-21437-1-git-send-email-john.garry@huawei.com> References: <1491830521-21437-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.212.75] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58EB7FFF.00B8, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 4fe771d04f4847dea92eec009ee6fd35 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patch is a workaround for a SoC bug where an internal abort command may timeout. In v2 hw, the channel should become idle in order to finish abort process. If the target side has been sending HOLD, host side channel failed to complete the frame to send, and can not enter the idle state. Then internal abort command will timeout. As this issue is only in v2 hw, we deal with it in the hw layer. Our workaround solution is: If abort is not finished within a certain period of time, we will check HOLD status. If HOLD has been sending, we will send break command. Signed-off-by: John Garry Signed-off-by: Xiaofei Tan --- drivers/scsi/hisi_sas/hisi_sas.h | 1 + drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +- drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 62 +++++++++++++++++++++++++++++----- 3 files changed, 55 insertions(+), 10 deletions(-) -- 1.9.1 diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h index c80ca83..4e28f32 100644 --- a/drivers/scsi/hisi_sas/hisi_sas.h +++ b/drivers/scsi/hisi_sas/hisi_sas.h @@ -138,6 +138,7 @@ struct hisi_sas_slot { struct hisi_sas_sge_page *sge_page; dma_addr_t sge_page_dma; struct work_struct abort_slot; + struct timer_list internal_abort_timer; }; struct hisi_sas_tmf_task { diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index 9890dfd..a5c6d06 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -1224,7 +1224,7 @@ static int hisi_sas_query_task(struct sas_task *task) task->task_done = hisi_sas_task_done; task->slow_task->timer.data = (unsigned long)task; task->slow_task->timer.function = hisi_sas_tmf_timedout; - task->slow_task->timer.expires = jiffies + 20*HZ; + task->slow_task->timer.expires = jiffies + msecs_to_jiffies(110); add_timer(&task->slow_task->timer); /* Lock as we are alloc'ing a slot, which cannot be interrupted */ diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index a1eb88d..cc5675b 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -194,9 +194,9 @@ #define SL_CONTROL_NOTIFY_EN_MSK (0x1 << SL_CONTROL_NOTIFY_EN_OFF) #define SL_CONTROL_CTA_OFF 17 #define SL_CONTROL_CTA_MSK (0x1 << SL_CONTROL_CTA_OFF) -#define RX_PRIMS_STATUS (PORT_BASE + 0x98) -#define RX_BCAST_CHG_OFF 1 -#define RX_BCAST_CHG_MSK (0x1 << RX_BCAST_CHG_OFF) +#define RX_PRIMS_STATUS (PORT_BASE + 0x98) +#define RX_BCAST_CHG_OFF 1 +#define RX_BCAST_CHG_MSK (0x1 << RX_BCAST_CHG_OFF) #define TX_ID_DWORD0 (PORT_BASE + 0x9c) #define TX_ID_DWORD1 (PORT_BASE + 0xa0) #define TX_ID_DWORD2 (PORT_BASE + 0xa4) @@ -209,8 +209,8 @@ #define TXID_AUTO_CT3_MSK (0x1 << TXID_AUTO_CT3_OFF) #define TXID_AUTO_CTB_OFF 11 #define TXID_AUTO_CTB_MSK (0x1 << TXID_AUTO_CTB_OFF) -#define TX_HARDRST_OFF 2 -#define TX_HARDRST_MSK (0x1 << TX_HARDRST_OFF) +#define TX_HARDRST_OFF 2 +#define TX_HARDRST_MSK (0x1 << TX_HARDRST_OFF) #define RX_IDAF_DWORD0 (PORT_BASE + 0xc4) #define RX_IDAF_DWORD1 (PORT_BASE + 0xc8) #define RX_IDAF_DWORD2 (PORT_BASE + 0xcc) @@ -245,13 +245,13 @@ #define CHL_INT1_MSK (PORT_BASE + 0x1c4) #define CHL_INT2_MSK (PORT_BASE + 0x1c8) #define CHL_INT_COAL_EN (PORT_BASE + 0x1d0) -#define DMA_TX_DFX0 (PORT_BASE + 0x200) -#define DMA_TX_DFX1 (PORT_BASE + 0x204) +#define DMA_TX_DFX0 (PORT_BASE + 0x200) +#define DMA_TX_DFX1 (PORT_BASE + 0x204) #define DMA_TX_DFX1_IPTT_OFF 0 #define DMA_TX_DFX1_IPTT_MSK (0xffff << DMA_TX_DFX1_IPTT_OFF) #define DMA_TX_FIFO_DFX0 (PORT_BASE + 0x240) -#define PORT_DFX0 (PORT_BASE + 0x258) -#define LINK_DFX2 (PORT_BASE + 0X264) +#define PORT_DFX0 (PORT_BASE + 0x258) +#define LINK_DFX2 (PORT_BASE + 0X264) #define LINK_DFX2_RCVR_HOLD_STS_OFF 9 #define LINK_DFX2_RCVR_HOLD_STS_MSK (0x1 << LINK_DFX2_RCVR_HOLD_STS_OFF) #define LINK_DFX2_SEND_HOLD_STS_OFF 10 @@ -2260,15 +2260,18 @@ static void slot_err_v2_hw(struct hisi_hba *hisi_hba, case STAT_IO_COMPLETE: /* internal abort command complete */ ts->stat = TMF_RESP_FUNC_SUCC; + del_timer(&slot->internal_abort_timer); goto out; case STAT_IO_NO_DEVICE: ts->stat = TMF_RESP_FUNC_COMPLETE; + del_timer(&slot->internal_abort_timer); goto out; case STAT_IO_NOT_VALID: /* abort single io, controller don't find * the io need to abort */ ts->stat = TMF_RESP_FUNC_FAILED; + del_timer(&slot->internal_abort_timer); goto out; default: break; @@ -2502,6 +2505,40 @@ static int prep_ata_v2_hw(struct hisi_hba *hisi_hba, return 0; } +static void hisi_sas_internal_abort_quirk_timeout(unsigned long data) +{ + struct hisi_sas_slot *slot = (struct hisi_sas_slot *)data; + struct hisi_sas_port *port = slot->port; + struct asd_sas_port *asd_sas_port; + struct asd_sas_phy *sas_phy; + + if (!port) + return; + + asd_sas_port = &port->sas_port; + + /* Kick the hardware - send break command */ + list_for_each_entry(sas_phy, &asd_sas_port->phy_list, port_phy_el) { + struct hisi_sas_phy *phy = sas_phy->lldd_phy; + struct hisi_hba *hisi_hba = phy->hisi_hba; + int phy_no = sas_phy->id; + u32 link_dfx2; + + link_dfx2 = hisi_sas_phy_read32(hisi_hba, phy_no, LINK_DFX2); + if ((link_dfx2 == LINK_DFX2_RCVR_HOLD_STS_MSK) || + (link_dfx2 & LINK_DFX2_SEND_HOLD_STS_MSK)) { + u32 txid_auto; + + txid_auto = hisi_sas_phy_read32(hisi_hba, phy_no, + TXID_AUTO); + txid_auto |= TXID_AUTO_CTB_MSK; + hisi_sas_phy_write32(hisi_hba, phy_no, TXID_AUTO, + txid_auto); + return; + } + } +} + static int prep_abort_v2_hw(struct hisi_hba *hisi_hba, struct hisi_sas_slot *slot, int device_id, int abort_flag, int tag_to_abort) @@ -2510,6 +2547,13 @@ static int prep_abort_v2_hw(struct hisi_hba *hisi_hba, struct domain_device *dev = task->dev; struct hisi_sas_cmd_hdr *hdr = slot->cmd_hdr; struct hisi_sas_port *port = slot->port; + struct timer_list *timer = &slot->internal_abort_timer; + + /* setup the quirk timer */ + setup_timer(timer, hisi_sas_internal_abort_quirk_timeout, + (unsigned long)slot); + /* Set the timeout to 10ms less than internal abort timeout */ + mod_timer(timer, jiffies + msecs_to_jiffies(100)); /* dw0 */ hdr->dw0 = cpu_to_le32((5 << CMD_HDR_CMD_OFF) | /*abort*/ From patchwork Mon Apr 10 13:22:00 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 97111 Delivered-To: patch@linaro.org Received: by 10.140.89.233 with SMTP id v96csp1347338qgd; Mon, 10 Apr 2017 05:53:18 -0700 (PDT) X-Received: by 10.98.12.72 with SMTP id u69mr8795236pfi.47.1491828798433; Mon, 10 Apr 2017 05:53:18 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l4si13595901pfg.76.2017.04.10.05.53.18; Mon, 10 Apr 2017 05:53:18 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753103AbdDJMwZ (ORCPT + 24 others); Mon, 10 Apr 2017 08:52:25 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:5422 "EHLO dggrg03-dlp.huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752750AbdDJMwW (ORCPT ); Mon, 10 Apr 2017 08:52:22 -0400 Received: from 172.30.72.57 (EHLO DGGEML401-HUB.china.huawei.com) ([172.30.72.57]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id ALL62813; Mon, 10 Apr 2017 20:52:15 +0800 (CST) Received: from localhost.localdomain (10.67.212.75) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Mon, 10 Apr 2017 20:52:05 +0800 From: John Garry To: , CC: , , , , , John Garry Subject: [PATCH 5/6] scsi: hisi_sas: fix NULL deference when TMF timeouts Date: Mon, 10 Apr 2017 21:22:00 +0800 Message-ID: <1491830521-21437-6-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1491830521-21437-1-git-send-email-john.garry@huawei.com> References: <1491830521-21437-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.212.75] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020206.58EB8001.005E, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: 80b783d92e70782b24424872204e4a03 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org If a TMF timeouts (maybe due to unlikely scenario of an expander being unplugged when TMF for remote device is active), when we eventually try to free the slot, we crash as we dereference the slot's task, which has already been released. As a fix, add checks in the slot release code for a NULL task. Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_main.c | 60 +++++++++++++++++++---------------- 1 file changed, 33 insertions(+), 27 deletions(-) -- 1.9.1 diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c index a5c6d06..7e6e882 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_main.c +++ b/drivers/scsi/hisi_sas/hisi_sas_main.c @@ -77,17 +77,22 @@ static void hisi_sas_slot_index_init(struct hisi_hba *hisi_hba) void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task, struct hisi_sas_slot *slot) { - struct device *dev = &hisi_hba->pdev->dev; - struct domain_device *device = task->dev; - struct hisi_sas_device *sas_dev = device->lldd_dev; - if (!slot->task) - return; + if (task) { + struct device *dev = &hisi_hba->pdev->dev; + struct domain_device *device = task->dev; + struct hisi_sas_device *sas_dev = device->lldd_dev; - if (!sas_protocol_ata(task->task_proto)) - if (slot->n_elem) - dma_unmap_sg(dev, task->scatter, slot->n_elem, - task->data_dir); + if (!sas_protocol_ata(task->task_proto)) + if (slot->n_elem) + dma_unmap_sg(dev, task->scatter, slot->n_elem, + task->data_dir); + + task->lldd_task = NULL; + + if (sas_dev) + atomic64_dec(&sas_dev->running_req); + } if (slot->command_table) dma_pool_free(hisi_hba->command_table_pool, @@ -102,12 +107,10 @@ void hisi_sas_slot_task_free(struct hisi_hba *hisi_hba, struct sas_task *task, slot->sge_page_dma); list_del_init(&slot->entry); - task->lldd_task = NULL; slot->task = NULL; slot->port = NULL; hisi_sas_slot_index_free(hisi_hba, slot->idx); - if (sas_dev) - atomic64_dec(&sas_dev->running_req); + /* slot memory is fully zeroed when it is reused */ } EXPORT_SYMBOL_GPL(hisi_sas_slot_task_free); @@ -569,25 +572,23 @@ static void hisi_sas_port_notify_formed(struct asd_sas_phy *sas_phy) spin_unlock_irqrestore(&hisi_hba->lock, flags); } -static void hisi_sas_do_release_task(struct hisi_hba *hisi_hba, - struct sas_task *task, +static void hisi_sas_do_release_task(struct hisi_hba *hisi_hba, struct sas_task *task, struct hisi_sas_slot *slot) { - struct task_status_struct *ts; - unsigned long flags; - - if (!task) - return; + if (task) { + unsigned long flags; + struct task_status_struct *ts; - ts = &task->task_status; + ts = &task->task_status; - ts->resp = SAS_TASK_COMPLETE; - ts->stat = SAS_ABORTED_TASK; - spin_lock_irqsave(&task->task_state_lock, flags); - task->task_state_flags &= - ~(SAS_TASK_STATE_PENDING | SAS_TASK_AT_INITIATOR); - task->task_state_flags |= SAS_TASK_STATE_DONE; - spin_unlock_irqrestore(&task->task_state_lock, flags); + ts->resp = SAS_TASK_COMPLETE; + ts->stat = SAS_ABORTED_TASK; + spin_lock_irqsave(&task->task_state_lock, flags); + task->task_state_flags &= + ~(SAS_TASK_STATE_PENDING | SAS_TASK_AT_INITIATOR); + task->task_state_flags |= SAS_TASK_STATE_DONE; + spin_unlock_irqrestore(&task->task_state_lock, flags); + } hisi_sas_slot_task_free(hisi_hba, task, slot); } @@ -742,7 +743,12 @@ static int hisi_sas_exec_internal_tmf_task(struct domain_device *device, /* Even TMF timed out, return direct. */ if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) { if (!(task->task_state_flags & SAS_TASK_STATE_DONE)) { + struct hisi_sas_slot *slot = task->lldd_task; + dev_err(dev, "abort tmf: TMF task timeout\n"); + if (slot) + slot->task = NULL; + goto ex_err; } } From patchwork Mon Apr 10 13:22:01 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Garry X-Patchwork-Id: 97114 Delivered-To: patch@linaro.org Received: by 10.140.89.233 with SMTP id v96csp1347669qgd; Mon, 10 Apr 2017 05:54:16 -0700 (PDT) X-Received: by 10.99.163.91 with SMTP id v27mr55812061pgn.171.1491828856067; Mon, 10 Apr 2017 05:54:16 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b10si13523284pgf.419.2017.04.10.05.54.15; Mon, 10 Apr 2017 05:54:16 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753560AbdDJMxR (ORCPT + 24 others); Mon, 10 Apr 2017 08:53:17 -0400 Received: from szxga03-in.huawei.com ([45.249.212.189]:5425 "EHLO dggrg03-dlp.huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752856AbdDJMwX (ORCPT ); Mon, 10 Apr 2017 08:52:23 -0400 Received: from 172.30.72.57 (EHLO DGGEML401-HUB.china.huawei.com) ([172.30.72.57]) by dggrg03-dlp.huawei.com (MOS 4.4.6-GA FastPath queued) with ESMTP id ALL62811; Mon, 10 Apr 2017 20:52:15 +0800 (CST) Received: from localhost.localdomain (10.67.212.75) by DGGEML401-HUB.china.huawei.com (10.3.17.32) with Microsoft SMTP Server id 14.3.301.0; Mon, 10 Apr 2017 20:52:05 +0800 From: John Garry To: , CC: , , , , , Xiang Chen , John Garry Subject: [PATCH 6/6] scsi: hisi_sas: controller reset for multi-bits ECC and AXI fatal errors Date: Mon, 10 Apr 2017 21:22:01 +0800 Message-ID: <1491830521-21437-7-git-send-email-john.garry@huawei.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1491830521-21437-1-git-send-email-john.garry@huawei.com> References: <1491830521-21437-1-git-send-email-john.garry@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.67.212.75] X-CFilter-Loop: Reflected X-Mirapoint-Virus-RAPID-Raw: score=unknown(0), refid=str=0001.0A020205.58EB7FFF.01AF, ss=1, re=0.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01, dmn=2013-03-21 17:37:32 X-Mirapoint-Loop-Id: deaf5e0e2ea0cb8b8e8af6d4d1a7f05d Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Xiang Chen For 1 bit ECC errors, those errors can be recovered by hw. But for multi-bits ECC and AXI errors, there are something wrong with whole module or system, so try reset the controller to recover those errors instead of calling panic(). Signed-off-by: Xiang Chen Signed-off-by: John Garry --- drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 93 ++++++++++++++++++++-------------- 1 file changed, 56 insertions(+), 37 deletions(-) -- 1.9.1 diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c index cc5675b..89d0959 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c @@ -2918,94 +2918,105 @@ static void multi_bit_ecc_error_process_v2_hw(struct hisi_hba *hisi_hba, if (irq_value & BIT(SAS_ECC_INTR_DQE_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_DQE_ECC_ADDR); - panic("%s: hgc_dqe_accbad_intr (0x%x) found: \ + dev_warn(dev, "hgc_dqe_accbad_intr (0x%x) found: \ Ram address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_DQE_ECC_MB_ADDR_MSK) >> HGC_DQE_ECC_MB_ADDR_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_IOST_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_IOST_ECC_ADDR); - panic("%s: hgc_iost_accbad_intr (0x%x) found: \ + dev_warn(dev, "hgc_iost_accbad_intr (0x%x) found: \ Ram address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_IOST_ECC_MB_ADDR_MSK) >> HGC_IOST_ECC_MB_ADDR_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_ITCT_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_ITCT_ECC_ADDR); - panic("%s: hgc_itct_accbad_intr (0x%x) found: \ + dev_warn(dev,"hgc_itct_accbad_intr (0x%x) found: \ Ram address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_ITCT_ECC_MB_ADDR_MSK) >> HGC_ITCT_ECC_MB_ADDR_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_IOSTLIST_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_LM_DFX_STATUS2); - panic("%s: hgc_iostl_accbad_intr (0x%x) found: \ + dev_warn(dev, "hgc_iostl_accbad_intr (0x%x) found: \ memory address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_LM_DFX_STATUS2_IOSTLIST_MSK) >> HGC_LM_DFX_STATUS2_IOSTLIST_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_ITCTLIST_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_LM_DFX_STATUS2); - panic("%s: hgc_itctl_accbad_intr (0x%x) found: \ + dev_warn(dev, "hgc_itctl_accbad_intr (0x%x) found: \ memory address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_LM_DFX_STATUS2_ITCTLIST_MSK) >> HGC_LM_DFX_STATUS2_ITCTLIST_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_CQE_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_CQE_ECC_ADDR); - panic("%s: hgc_cqe_accbad_intr (0x%x) found: \ + dev_warn(dev, "hgc_cqe_accbad_intr (0x%x) found: \ Ram address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_CQE_ECC_MB_ADDR_MSK) >> HGC_CQE_ECC_MB_ADDR_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_NCQ_MEM0_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_RXM_DFX_STATUS14); - panic("%s: rxm_mem0_accbad_intr (0x%x) found: \ + dev_warn(dev, "rxm_mem0_accbad_intr (0x%x) found: \ memory address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_RXM_DFX_STATUS14_MEM0_MSK) >> HGC_RXM_DFX_STATUS14_MEM0_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_NCQ_MEM1_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_RXM_DFX_STATUS14); - panic("%s: rxm_mem1_accbad_intr (0x%x) found: \ + dev_warn(dev, "rxm_mem1_accbad_intr (0x%x) found: \ memory address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_RXM_DFX_STATUS14_MEM1_MSK) >> HGC_RXM_DFX_STATUS14_MEM1_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_NCQ_MEM2_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_RXM_DFX_STATUS14); - panic("%s: rxm_mem2_accbad_intr (0x%x) found: \ + dev_warn(dev, "rxm_mem2_accbad_intr (0x%x) found: \ memory address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_RXM_DFX_STATUS14_MEM2_MSK) >> HGC_RXM_DFX_STATUS14_MEM2_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(SAS_ECC_INTR_NCQ_MEM3_ECC_MB_OFF)) { reg_val = hisi_sas_read32(hisi_hba, HGC_RXM_DFX_STATUS15); - panic("%s: rxm_mem3_accbad_intr (0x%x) found: \ + dev_warn(dev, "rxm_mem3_accbad_intr (0x%x) found: \ memory address is 0x%08X\n", - dev_name(dev), irq_value, + irq_value, (reg_val & HGC_RXM_DFX_STATUS15_MEM3_MSK) >> HGC_RXM_DFX_STATUS15_MEM3_OFF); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } + return; } static irqreturn_t fatal_ecc_int_v2_hw(int irq_no, void *p) @@ -3063,23 +3074,27 @@ static irqreturn_t fatal_axi_int_v2_hw(int irq_no, void *p) if (irq_value & BIT(ENT_INT_SRC3_WP_DEPTH_OFF)) { hisi_sas_write32(hisi_hba, ENT_INT_SRC3, 1 << ENT_INT_SRC3_WP_DEPTH_OFF); - panic("%s: write pointer and depth error (0x%x) \ + dev_warn(dev, "write pointer and depth error (0x%x) \ found!\n", - dev_name(dev), irq_value); + irq_value); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(ENT_INT_SRC3_IPTT_SLOT_NOMATCH_OFF)) { hisi_sas_write32(hisi_hba, ENT_INT_SRC3, 1 << ENT_INT_SRC3_IPTT_SLOT_NOMATCH_OFF); - panic("%s: iptt no match slot error (0x%x) found!\n", - dev_name(dev), irq_value); + dev_warn(dev, "iptt no match slot error (0x%x) found!\n", + irq_value); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } - if (irq_value & BIT(ENT_INT_SRC3_RP_DEPTH_OFF)) - panic("%s: read pointer and depth error (0x%x) \ + if (irq_value & BIT(ENT_INT_SRC3_RP_DEPTH_OFF)) { + dev_warn(dev, "read pointer and depth error (0x%x) \ found!\n", - dev_name(dev), irq_value); + irq_value); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); + } if (irq_value & BIT(ENT_INT_SRC3_AXI_OFF)) { int i; @@ -3090,10 +3105,11 @@ static irqreturn_t fatal_axi_int_v2_hw(int irq_no, void *p) HGC_AXI_FIFO_ERR_INFO); for (i = 0; i < AXI_ERR_NR; i++) { - if (err_value & BIT(i)) - panic("%s: %s (0x%x) found!\n", - dev_name(dev), + if (err_value & BIT(i)) { + dev_warn(dev, "%s (0x%x) found!\n", axi_err_info[i], irq_value); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); + } } } @@ -3106,10 +3122,11 @@ static irqreturn_t fatal_axi_int_v2_hw(int irq_no, void *p) HGC_AXI_FIFO_ERR_INFO); for (i = 0; i < FIFO_ERR_NR; i++) { - if (err_value & BIT(AXI_ERR_NR + i)) - panic("%s: %s (0x%x) found!\n", - dev_name(dev), + if (err_value & BIT(AXI_ERR_NR + i)) { + dev_warn(dev, "%s (0x%x) found!\n", fifo_err_info[i], irq_value); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); + } } } @@ -3117,15 +3134,17 @@ static irqreturn_t fatal_axi_int_v2_hw(int irq_no, void *p) if (irq_value & BIT(ENT_INT_SRC3_LM_OFF)) { hisi_sas_write32(hisi_hba, ENT_INT_SRC3, 1 << ENT_INT_SRC3_LM_OFF); - panic("%s: LM add/fetch list error (0x%x) found!\n", - dev_name(dev), irq_value); + dev_warn(dev, "LM add/fetch list error (0x%x) found!\n", + irq_value); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } if (irq_value & BIT(ENT_INT_SRC3_ABT_OFF)) { hisi_sas_write32(hisi_hba, ENT_INT_SRC3, 1 << ENT_INT_SRC3_ABT_OFF); - panic("%s: SAS_HGC_ABT fetch LM list error (0x%x) found!\n", - dev_name(dev), irq_value); + dev_warn(dev, "SAS_HGC_ABT fetch LM list error (0x%x) found!\n", + irq_value); + queue_work(hisi_hba->wq, &hisi_hba->rst_work); } }