From patchwork Fri Sep 1 09:41:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wenchao Hao
X-Patchwork-Id: 719728
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 49F85CA0FE1
 for ; Fri, 1 Sep 2023 09:42:03 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1348888AbjIAJmB (ORCPT ); Fri, 1 Sep 2023 05:42:01 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56260 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S237703AbjIAJmA (ORCPT ); Fri, 1 Sep 2023 05:42:00 -0400
Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7C1D71705;
 Fri, 1 Sep 2023 02:41:53 -0700 (PDT)
Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.53])
 by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4RcY0p3TL0z1L8r1;
 Fri, 1 Sep 2023 17:40:10 +0800 (CST)
Received: from build.huawei.com (10.175.101.6) by
 kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.2507.31; Fri, 1 Sep 2023 17:41:50 +0800
From: Wenchao Hao
To: "James E . J . Bottomley", "Martin K . Petersen"
CC: Hannes Reinecke, Wenchao Hao
Subject: [RFC PATCH v2 01/19] scsi: scsi_error: Define framework for LUN/target based error handle
Date: Fri, 1 Sep 2023 17:41:09 +0800
Message-ID: <20230901094127.2010873-2-haowenchao2@huawei.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com>
References: <20230901094127.2010873-1-haowenchao2@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.175.101.6]
X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To
 kwepemm600012.china.huawei.com (7.193.23.74)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-scsi@vger.kernel.org

The existing SCSI error handling logic is host based: once a SCSI
command on any LUN of a host is classified as failed, the SCSI mid-level
sets the whole host to the recovery state, and no I/O can be submitted
to any LUN of that host until recovery has finished. The recovery
process might take a long time, which is unreasonable when one host has
many LUNs.

This change introduces a way for a driver to implement its own error
handling logic, with either a SCSI LUN or a SCSI target as the minimum
unit.

struct scsi_device_eh is defined for LUN based error handling. A pointer
to it, "eh", is added to struct scsi_device and is NULL by default. LLDs
can initialize sdev->eh in hostt->slave_alloc to implement LUN based
error handling. If this member is not NULL, the SCSI mid-level branches
to the driver's error handler rather than the old one, which blocks the
whole host's I/O.

struct scsi_target_eh is defined for target based error handling. A
pointer to it, "eh", is added to struct scsi_target and is NULL by
default. LLDs can initialize starget->eh in hostt->target_alloc to
implement target based error handling. If this member is not NULL, the
SCSI mid-level branches to the driver's error handler rather than the
old one, which blocks the whole host's I/O.
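
For readers who want to try the dispatch idea outside the kernel, here
is a minimal userspace sketch of "hand the failed command to the
narrowest registered handler, else fall back". It is not the patch's
code: every model_* name is hypothetical, the structures are stand-ins
for scsi_device/scsi_target, and the "already in recovery" checks done
by the real scsi_eh_scmd_add() are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; every model_* name is hypothetical. */
enum model_eh_level { MODEL_EH_LUN, MODEL_EH_TARGET, MODEL_EH_HOST };

struct model_cmnd;

struct model_eh {
	/* NULL means "no handler registered at this level" */
	void (*add_cmnd)(struct model_cmnd *cmd);
};

struct model_target {
	struct model_eh *eh;		/* mirrors starget->eh, NULL by default */
};

struct model_device {
	struct model_target *starget;
	struct model_eh *eh;		/* mirrors sdev->eh, NULL by default */
};

struct model_cmnd {
	struct model_device *sdev;
	int queued;			/* how many times the command was queued */
};

/* Trivial driver callback: just count the hand-off. */
static void model_add_cmnd(struct model_cmnd *cmd)
{
	cmd->queued++;
}

/*
 * Hand a failed command to the narrowest registered handler and report
 * which level took it. This model returns as soon as one level accepts
 * the command.
 */
static enum model_eh_level model_eh_scmd_add(struct model_cmnd *cmd)
{
	struct model_device *sdev;
	struct model_target *starget;

	assert(cmd && cmd->sdev && cmd->sdev->starget);
	sdev = cmd->sdev;
	starget = sdev->starget;

	if (sdev->eh && sdev->eh->add_cmnd) {
		sdev->eh->add_cmnd(cmd);
		return MODEL_EH_LUN;
	}
	if (starget->eh && starget->eh->add_cmnd) {
		starget->eh->add_cmnd(cmd);
		return MODEL_EH_TARGET;
	}
	cmd->queued++;			/* stands in for the host-wide handler */
	return MODEL_EH_HOST;
}
```

With both eh pointers left NULL the model reports MODEL_EH_HOST, which
corresponds to the unchanged host-wide behaviour for drivers that do not
opt in.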
Signed-off-by: Wenchao Hao
---
 drivers/scsi/scsi_error.c  | 57 +++++++++++++++++++++++++++++++-
 drivers/scsi/scsi_lib.c    | 12 +++++++
 drivers/scsi/scsi_priv.h   | 18 ++++++++++
 include/scsi/scsi_device.h | 67 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c67cdcdc3ba8..1d1d97b94613 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -290,11 +290,48 @@ static void scsi_eh_inc_host_failed(struct rcu_head *head)
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
 
+#define SCSI_EH_NO_HANDLER 1
+
+static int __scsi_eh_scmd_add_sdev(struct scsi_cmnd *scmd)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct scsi_device_eh *eh = sdev->eh;
+
+	if (!eh || !eh->add_cmnd)
+		return SCSI_EH_NO_HANDLER;
+
+	scsi_eh_reset(scmd);
+	eh->add_cmnd(scmd);
+
+	if (eh->wakeup)
+		eh->wakeup(sdev);
+
+	return 0;
+}
+
+static int __scsi_eh_scmd_add_starget(struct scsi_cmnd *scmd)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct scsi_target *starget = scsi_target(sdev);
+	struct scsi_target_eh *eh = starget->eh;
+
+	if (!eh || !eh->add_cmnd)
+		return SCSI_EH_NO_HANDLER;
+
+	scsi_eh_reset(scmd);
+	eh->add_cmnd(scmd);
+
+	if (eh->wakeup)
+		eh->wakeup(starget);
+
+	return 0;
+}
+
 /**
  * scsi_eh_scmd_add - add scsi cmd to error handling.
  * @scmd: scmd to run eh on.
  */
-void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
+static void __scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 {
 	struct Scsi_Host *shost = scmd->device->host;
 	unsigned long flags;
@@ -320,6 +357,24 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 	call_rcu_hurry(&scmd->rcu, scsi_eh_inc_host_failed);
 }
 
+void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct scsi_target *starget = scsi_target(sdev);
+	struct Scsi_Host *shost = sdev->host;
+
+	if (unlikely(scsi_host_in_recovery(shost)))
+		__scsi_eh_scmd_add(scmd);
+
+	if (unlikely(scsi_target_in_recovery(starget)))
+		if (__scsi_eh_scmd_add_starget(scmd))
+			__scsi_eh_scmd_add(scmd);
+
+	if (__scsi_eh_scmd_add_sdev(scmd))
+		if (__scsi_eh_scmd_add_starget(scmd))
+			__scsi_eh_scmd_add(scmd);
+}
+
 /**
  * scsi_timeout - Timeout function for normal scsi commands.
  * @req: request that is timing out.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ad9afae49544..db0a42fe49c0 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -298,6 +298,12 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
 
 	sbitmap_put(&sdev->budget_map, cmd->budget_token);
 	cmd->budget_token = -1;
+
+	if (sdev->eh && sdev->eh->wakeup)
+		sdev->eh->wakeup(sdev);
+
+	if (starget->eh && starget->eh->wakeup)
+		starget->eh->wakeup(starget);
 }
 
 static void scsi_kick_queue(struct request_queue *q)
@@ -1253,6 +1259,9 @@ static inline int scsi_dev_queue_ready(struct request_queue *q,
 {
 	int token;
 
+	if (scsi_device_in_recovery(sdev))
+		return -1;
+
 	token = sbitmap_get(&sdev->budget_map);
 	if (atomic_read(&sdev->device_blocked)) {
 		if (token < 0)
@@ -1288,6 +1297,9 @@ static inline int scsi_target_queue_ready(struct Scsi_Host *shost,
 	struct scsi_target *starget = scsi_target(sdev);
 	unsigned int busy;
 
+	if (scsi_target_in_recovery(starget))
+		return 0;
+
 	if (starget->single_lun) {
 		spin_lock_irq(shost->host_lock);
 		if (starget->starget_sdev_user &&
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index f42388ecb024..f7605c2a1ed1 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -196,6 +196,24 @@ static inline void scsi_dh_add_device(struct scsi_device *sdev) { }
 static inline void scsi_dh_release_device(struct scsi_device *sdev) { }
 #endif
 
+static inline int scsi_device_in_recovery(struct scsi_device *sdev)
+{
+	struct scsi_device_eh *eh = sdev->eh;
+
+	if (eh && eh->is_busy)
+		return eh->is_busy(sdev);
+	return 0;
+}
+
+static inline int scsi_target_in_recovery(struct scsi_target *starget)
+{
+	struct scsi_target_eh *eh = starget->eh;
+
+	if (eh && eh->is_busy)
+		return eh->is_busy(starget);
+	return 0;
+}
+
 struct bsg_device *scsi_bsg_register_queue(struct scsi_device *sdev);
 
 extern int scsi_device_max_queue_depth(struct scsi_device *sdev);
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 75b2235b99e2..08ed9a03015d 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -104,6 +104,71 @@ enum scsi_vpd_parameters {
 	SCSI_VPD_HEADER_SIZE = 4,
 };
 
+struct scsi_device;
+struct scsi_target;
+
+struct scsi_device_eh {
+	/*
+	 * add scsi command to error handler so it would be handled by
+	 * driver's error handling strategy
+	 */
+	void (*add_cmnd)(struct scsi_cmnd *scmd);
+
+	/*
+	 * to judge if the device is busy handling errors, called before
+	 * dispatching a scsi cmnd
+	 *
+	 * return 0 if it's ready to accept scsi cmnd
+	 * return non-zero if it's in error handling, commands would not be dispatched
+	 */
+	int (*is_busy)(struct scsi_device *sdev);
+
+	/*
+	 * wake up the device's error handler
+	 *
+	 * usually the error handler strategy would not run at once when
+	 * error command is added. This function would be called when any
+	 * scsi cmnd is finished or when scsi cmnd is added.
+	 */
+	int (*wakeup)(struct scsi_device *sdev);
+
+	/*
+	 * data entity for device specific error handler
+	 */
+	unsigned long driver_data[];
+};
+
+struct scsi_target_eh {
+	/*
+	 * add scsi command to error handler so it would be handled by
+	 * driver's error handling strategy
+	 */
+	void (*add_cmnd)(struct scsi_cmnd *scmd);
+
+	/*
+	 * to judge if the target is busy handling errors, called before
+	 * dispatching a scsi cmnd
+	 *
+	 * return 0 if it's ready to accept scsi cmnd
+	 * return non-zero if it's in error handling, commands would not be dispatched
+	 */
+	int (*is_busy)(struct scsi_target *starget);
+
+	/*
+	 * wake up the target's error handler
+	 *
+	 * usually the error handler strategy would not run at once when
+	 * error command is added. This function would be called when any
+	 * scsi cmnd is finished or when scsi cmnd is added.
+	 */
+	int (*wakeup)(struct scsi_target *starget);
+
+	/*
+	 * data entity for target specific error handler
+	 */
+	unsigned long driver_data[];
+};
+
 struct scsi_device {
 	struct Scsi_Host *host;
 	struct request_queue *request_queue;
@@ -258,6 +323,7 @@ struct scsi_device {
 	struct mutex state_mutex;
 	enum scsi_device_state sdev_state;
 	struct task_struct *quiesced_by;
+	struct scsi_device_eh *eh;
 	unsigned long sdev_data[];
 } __attribute__((aligned(sizeof(unsigned long))));
 
@@ -344,6 +410,7 @@ struct scsi_target {
 	char scsi_level;
 	enum scsi_target_state state;
 	void *hostdata; /* available to low-level driver */
+	struct scsi_target_eh *eh;
 	unsigned long starget_data[]; /* for the transport */
 	/* starget_data must be the last element!!!! */
 } __attribute__((aligned(sizeof(unsigned long))));

From patchwork Fri Sep 1 09:41:10 2023
X-Patchwork-Submitter: Wenchao Hao
X-Patchwork-Id: 719727
From: Wenchao Hao
To: "James E . J . Bottomley", "Martin K . Petersen"
CC: Hannes Reinecke, Wenchao Hao
Subject: [RFC PATCH v2 02/19] scsi: scsi_error: Move complete variable eh_action from shost to sdevice
Date: Fri, 1 Sep 2023 17:41:10 +0800
Message-ID: <20230901094127.2010873-3-haowenchao2@huawei.com>
In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com>
References: <20230901094127.2010873-1-haowenchao2@huawei.com>

eh_action is used to wait for the completion of a SCSI command that is
sent from the error handler. Now that the error handler might be based
on scsi_device, move eh_action to scsi_device.

This is preparation for a general LUN/target based error handling
strategy.

Signed-off-by: Wenchao Hao
---
 drivers/scsi/scsi_error.c  | 6 +++---
 include/scsi/scsi_device.h | 2 ++
 include/scsi/scsi_host.h   | 2 --
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 1d1d97b94613..879fdd7c165b 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -914,7 +914,7 @@ void scsi_eh_done(struct scsi_cmnd *scmd)
 	SCSI_LOG_ERROR_RECOVERY(3, scmd_printk(KERN_INFO, scmd,
 			"%s result: %x\n", __func__, scmd->result));
 
-	eh_action = scmd->device->host->eh_action;
+	eh_action = scmd->device->eh_action;
 	if (eh_action)
 		complete(eh_action);
 }
@@ -1203,7 +1203,7 @@ static enum scsi_disposition scsi_send_eh_cmnd(struct scsi_cmnd *scmd,
 
 retry:
 	scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
-	shost->eh_action = &done;
+	sdev->eh_action = &done;
 
 	scsi_log_send(scmd);
 	scmd->submitter = SUBMITTED_BY_SCSI_ERROR_HANDLER;
@@ -1246,7 +1246,7 @@ static enum scsi_disposition scsi_send_eh_cmnd(struct scsi_cmnd *scmd,
 		rtn = SUCCESS;
 	}
 
-	shost->eh_action = NULL;
+	sdev->eh_action = NULL;
 
 	scsi_log_completion(scmd, rtn);
 
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 08ed9a03015d..df3f1b8d1390 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -324,6 +324,8 @@ struct scsi_device {
 	enum scsi_device_state sdev_state;
 	struct task_struct *quiesced_by;
 	struct scsi_device_eh *eh;
+	struct completion *eh_action; /* Wait for specific actions */
+				      /* on the device. */
 	unsigned long sdev_data[];
 } __attribute__((aligned(sizeof(unsigned long))));
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 70b7475dcf56..def0d99e9b36 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -554,8 +554,6 @@ struct Scsi_Host {
 	struct list_head eh_abort_list;
 	struct list_head eh_cmd_q;
 	struct task_struct * ehandler; /* Error recovery thread. */
-	struct completion * eh_action; /* Wait for specific actions on the
-					  host. */
 	wait_queue_head_t host_wait;
 	const struct scsi_host_template *hostt;
 	struct scsi_transport_template *transportt;

From patchwork Fri Sep 1 09:41:11 2023
X-Patchwork-Submitter: Wenchao Hao
X-Patchwork-Id: 719860
From: Wenchao Hao
To: "James E . J . Bottomley", "Martin K . Petersen"
CC: Hannes Reinecke, Wenchao Hao
Subject: [RFC PATCH v2 03/19] scsi: scsi_error: Check if to do reset in scsi_try_xxx_reset
Date: Fri, 1 Sep 2023 17:41:11 +0800
Message-ID: <20230901094127.2010873-4-haowenchao2@huawei.com>
In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com>
References: <20230901094127.2010873-1-haowenchao2@huawei.com>

This is preparation for a general LUN/target based error handling
strategy. The strategy reuses some of the existing error handler APIs,
but some steps of these functions should not be performed. For example,
we should not perform a target reset if we have only stopped I/O on a
single LUN.

This change adds checks to scsi_try_xxx_reset() so that a reset is
performed only if its condition is satisfied: a host or bus reset
requires the host to be in recovery, and a target reset requires the
target or the host to be in recovery. Otherwise FAILED is returned
without issuing the reset.
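
The gating can be pictured with a small userspace sketch. It assumes
nothing beyond the conditions visible in the diff; the model_* types and
functions are hypothetical stand-ins, and the unconditional
model_do_reset() plays the role of the __scsi_try_*_reset() variants
that the ioctl path keeps calling directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel types. */
enum model_disposition { MODEL_SUCCESS, MODEL_FAILED };

struct model_scope {
	bool host_in_recovery;
	bool target_in_recovery;
	int resets_issued;	/* counts resets that actually ran */
};

/* Unconditional reset, as an explicit user request would use it. */
static enum model_disposition model_do_reset(struct model_scope *s)
{
	assert(s != NULL);
	s->resets_issued++;
	return MODEL_SUCCESS;
}

/* Target reset: allowed when the target or the whole host is recovering. */
static enum model_disposition model_try_target_reset(struct model_scope *s)
{
	if (!(s->target_in_recovery || s->host_in_recovery))
		return MODEL_FAILED;
	return model_do_reset(s);
}

/* Bus and host resets: allowed only when the whole host is recovering. */
static enum model_disposition model_try_host_reset(struct model_scope *s)
{
	if (!s->host_in_recovery)
		return MODEL_FAILED;
	return model_do_reset(s);
}
```

A LUN-only recovery (both flags false) therefore never escalates to a
target or host reset through the gated entry points.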
Signed-off-by: Wenchao Hao
---
 drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 879fdd7c165b..48ed035d44ce 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -923,7 +923,7 @@ void scsi_eh_done(struct scsi_cmnd *scmd)
  * scsi_try_host_reset - ask host adapter to reset itself
  * @scmd: SCSI cmd to send host reset.
  */
-static enum scsi_disposition scsi_try_host_reset(struct scsi_cmnd *scmd)
+static enum scsi_disposition __scsi_try_host_reset(struct scsi_cmnd *scmd)
 {
 	unsigned long flags;
 	enum scsi_disposition rtn;
@@ -949,11 +949,19 @@ static enum scsi_disposition scsi_try_host_reset(struct scsi_cmnd *scmd)
 	return rtn;
 }
 
+static enum scsi_disposition scsi_try_host_reset(struct scsi_cmnd *scmd)
+{
+	if (!scsi_host_in_recovery(scmd->device->host))
+		return FAILED;
+
+	return __scsi_try_host_reset(scmd);
+}
+
 /**
  * scsi_try_bus_reset - ask host to perform a bus reset
  * @scmd: SCSI cmd to send bus reset.
  */
-static enum scsi_disposition scsi_try_bus_reset(struct scsi_cmnd *scmd)
+static enum scsi_disposition __scsi_try_bus_reset(struct scsi_cmnd *scmd)
 {
 	unsigned long flags;
 	enum scsi_disposition rtn;
@@ -979,6 +987,14 @@ static enum scsi_disposition scsi_try_bus_reset(struct scsi_cmnd *scmd)
 	return rtn;
 }
 
+static enum scsi_disposition scsi_try_bus_reset(struct scsi_cmnd *scmd)
+{
+	if (!scsi_host_in_recovery(scmd->device->host))
+		return FAILED;
+
+	return __scsi_try_bus_reset(scmd);
+}
+
 static void __scsi_report_device_reset(struct scsi_device *sdev, void *data)
 {
 	sdev->was_reset = 1;
@@ -995,7 +1011,7 @@ static void __scsi_report_device_reset(struct scsi_device *sdev, void *data)
  * timer on it, and set the host back to a consistent state prior to
  * returning.
  */
-static enum scsi_disposition scsi_try_target_reset(struct scsi_cmnd *scmd)
+static enum scsi_disposition __scsi_try_target_reset(struct scsi_cmnd *scmd)
 {
 	unsigned long flags;
 	enum scsi_disposition rtn;
@@ -1016,6 +1032,15 @@ static enum scsi_disposition scsi_try_target_reset(struct scsi_cmnd *scmd)
 	return rtn;
 }
 
+static enum scsi_disposition scsi_try_target_reset(struct scsi_cmnd *scmd)
+{
+	if (!(scsi_target_in_recovery(scsi_target(scmd->device)) ||
+	      scsi_host_in_recovery(scmd->device->host)))
+		return FAILED;
+
+	return __scsi_try_target_reset(scmd);
+}
+
 /**
  * scsi_try_bus_device_reset - Ask host to perform a BDR on a dev
  * @scmd: SCSI cmd used to send BDR
@@ -2534,17 +2559,17 @@ scsi_ioctl_reset(struct scsi_device *dev, int __user *arg)
 			break;
 		fallthrough;
 	case SG_SCSI_RESET_TARGET:
-		rtn = scsi_try_target_reset(scmd);
+		rtn = __scsi_try_target_reset(scmd);
 		if (rtn == SUCCESS || (val & SG_SCSI_RESET_NO_ESCALATE))
 			break;
 		fallthrough;
 	case SG_SCSI_RESET_BUS:
-		rtn = scsi_try_bus_reset(scmd);
+		rtn = __scsi_try_bus_reset(scmd);
 		if (rtn == SUCCESS || (val & SG_SCSI_RESET_NO_ESCALATE))
 			break;
 		fallthrough;
 	case SG_SCSI_RESET_HOST:
-		rtn = scsi_try_host_reset(scmd);
+		rtn = __scsi_try_host_reset(scmd);
 		if (rtn == SUCCESS)
 			break;
 		fallthrough;

From patchwork Fri Sep 1 09:41:12 2023
X-Patchwork-Submitter: Wenchao Hao
X-Patchwork-Id: 719726
From: Wenchao Hao
To: "James E . J . Bottomley", "Martin K . Petersen"
CC: Hannes Reinecke, Wenchao Hao
Subject: [RFC PATCH v2 04/19] scsi: scsi_error: Add helper scsi_eh_sdev_stu to do START_UNIT
Date: Fri, 1 Sep 2023 17:41:12 +0800
Message-ID: <20230901094127.2010873-5-haowenchao2@huawei.com>
In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com>
References: <20230901094127.2010873-1-haowenchao2@huawei.com>

Add helper function scsi_eh_sdev_stu() to perform START_UNIT and then
check whether some error commands can be finished.

This is preparation for a general LUN/target based error handling
strategy and does not change the original logic.
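
The helper's return convention (non-zero once the work queue is empty,
so the caller can stop walking devices) may be easier to see in a toy
model. The sketch below is an illustration only: an array stands in for
the work_q/done_q lists, and model_cmnd and model_finish_dev_cmds() are
hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: an array slot stands in for a list entry. */
struct model_cmnd {
	int dev_id;	/* which device the command belongs to */
	int done;	/* 0: still on the work queue, 1: moved to done queue */
};

/*
 * After a successful recovery step on dev_id, finish every pending
 * command of that device, then report whether the work queue is empty
 * (non-zero means the caller can stop iterating over devices).
 */
static int model_finish_dev_cmds(struct model_cmnd *work, size_t n, int dev_id)
{
	size_t i, pending = 0;

	assert(work != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!work[i].done && work[i].dev_id == dev_id)
			work[i].done = 1;
		if (!work[i].done)
			pending++;
	}
	return pending == 0;
}
```

Commands that belong to other devices stay pending, which is why the
caller keeps iterating until the helper reports an empty queue.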
Signed-off-by: Wenchao Hao
---
 drivers/scsi/scsi_error.c | 50 +++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 48ed035d44ce..64eb616261ec 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1564,6 +1564,31 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
 	return 1;
 }
 
+static int scsi_eh_sdev_stu(struct scsi_cmnd *scmd,
+			    struct list_head *work_q,
+			    struct list_head *done_q)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct scsi_cmnd *next;
+
+	SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev,
+			"%s: Sending START_UNIT\n", current->comm));
+
+	if (scsi_eh_try_stu(scmd)) {
+		SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev,
+				"%s: START_UNIT failed\n", current->comm));
+		return 0;
+	}
+
+	if (!scsi_device_online(sdev) || !scsi_eh_tur(scmd))
+		list_for_each_entry_safe(scmd, next, work_q, eh_entry)
+			if (scmd->device == sdev &&
+			    scsi_eh_action(scmd, SUCCESS) == SUCCESS)
+				scsi_eh_finish_cmd(scmd, done_q);
+
+	return list_empty(work_q);
+}
+
 /**
  * scsi_eh_stu - send START_UNIT if needed
  * @shost: &scsi host being recovered.
@@ -1578,7 +1603,7 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 			      struct list_head *work_q,
 			      struct list_head *done_q)
 {
-	struct scsi_cmnd *scmd, *stu_scmd, *next;
+	struct scsi_cmnd *scmd, *stu_scmd;
 	struct scsi_device *sdev;
 
 	shost_for_each_device(sdev, shost) {
@@ -1601,26 +1626,9 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 		if (!stu_scmd)
 			continue;
 
-		SCSI_LOG_ERROR_RECOVERY(3,
-			sdev_printk(KERN_INFO, sdev,
-				    "%s: Sending START_UNIT\n",
-				    current->comm));
-
-		if (!scsi_eh_try_stu(stu_scmd)) {
-			if (!scsi_device_online(sdev) ||
-			    !scsi_eh_tur(stu_scmd)) {
-				list_for_each_entry_safe(scmd, next,
-							 work_q, eh_entry) {
-					if (scmd->device == sdev &&
-					    scsi_eh_action(scmd, SUCCESS) == SUCCESS)
-						scsi_eh_finish_cmd(scmd, done_q);
-				}
-			}
-		} else {
-			SCSI_LOG_ERROR_RECOVERY(3,
-				sdev_printk(KERN_INFO, sdev,
-					    "%s: START_UNIT failed\n",
-					    current->comm));
+		if (scsi_eh_sdev_stu(stu_scmd, work_q, done_q)) {
+			scsi_device_put(sdev);
+			break;
 		}
 	}
 

From patchwork Fri Sep 1 09:41:13 2023
X-Patchwork-Submitter: Wenchao Hao
X-Patchwork-Id: 719859
From: Wenchao Hao
To: "James E . J . Bottomley", "Martin K . Petersen"
CC: Hannes Reinecke, Wenchao Hao
Subject: [RFC PATCH v2 05/19] scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset
Date: Fri, 1 Sep 2023 17:41:13 +0800
Message-ID: <20230901094127.2010873-6-haowenchao2@huawei.com>
In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com>
References: <20230901094127.2010873-1-haowenchao2@huawei.com>

Add helper function scsi_eh_sdev_reset() to perform a LUN reset and then
check whether some error commands can be finished.

This is preparation for a general LUN/target based error handling
strategy and does not change the original logic.
Signed-off-by: Wenchao Hao
---
 drivers/scsi/scsi_error.c | 54 +++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 64eb616261ec..16888540b663 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1635,6 +1635,34 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 	return list_empty(work_q);
 }
 
+static int scsi_eh_sdev_reset(struct scsi_cmnd *scmd,
+			      struct list_head *work_q,
+			      struct list_head *done_q)
+{
+	struct scsi_cmnd *next;
+	struct scsi_device *sdev = scmd->device;
+	enum scsi_disposition rtn;
+
+	SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev,
+			"%s: Sending BDR\n", current->comm));
+
+	rtn = scsi_try_bus_device_reset(scmd);
+	if (rtn != SUCCESS && rtn != FAST_IO_FAIL) {
+		SCSI_LOG_ERROR_RECOVERY(3,
+			sdev_printk(KERN_INFO, sdev,
+				    "%s: BDR failed\n", current->comm));
+		return 0;
+	}
+
+	if (!scsi_device_online(sdev) || rtn == FAST_IO_FAIL ||
+	    !scsi_eh_tur(scmd))
+		list_for_each_entry_safe(scmd, next, work_q, eh_entry)
+			if (scmd->device == sdev &&
+			    scsi_eh_action(scmd, rtn) != FAILED)
+				scsi_eh_finish_cmd(scmd, done_q);
+
+	return list_empty(work_q);
+}
 
 /**
  * scsi_eh_bus_device_reset - send bdr if needed
@@ -1652,9 +1680,8 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 				    struct list_head *work_q,
 				    struct list_head *done_q)
 {
-	struct scsi_cmnd *scmd, *bdr_scmd, *next;
+	struct scsi_cmnd *scmd, *bdr_scmd;
 	struct scsi_device *sdev;
-	enum scsi_disposition rtn;
 
 	shost_for_each_device(sdev, shost) {
 		if (scsi_host_eh_past_deadline(shost)) {
@@ -1675,26 +1702,9 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 		if (!bdr_scmd)
 			continue;
 
-		SCSI_LOG_ERROR_RECOVERY(3,
-			sdev_printk(KERN_INFO, sdev,
-				    "%s: Sending BDR\n", current->comm));
-		rtn = scsi_try_bus_device_reset(bdr_scmd);
-		if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
-			if (!scsi_device_online(sdev) ||
-			    rtn == FAST_IO_FAIL ||
-			    !scsi_eh_tur(bdr_scmd)) {
-				list_for_each_entry_safe(scmd, next,
-							 work_q, eh_entry) {
-					if (scmd->device == sdev &&
-					    scsi_eh_action(scmd, rtn) != FAILED)
-						scsi_eh_finish_cmd(scmd,
-								   done_q);
-				}
-			}
-		} else {
-			SCSI_LOG_ERROR_RECOVERY(3,
-				sdev_printk(KERN_INFO, sdev,
-					    "%s: BDR failed\n", current->comm));
+		if (scsi_eh_sdev_reset(bdr_scmd, work_q, done_q)) {
+			scsi_device_put(sdev);
+			break;
 		}
 	}
 

From patchwork Fri Sep 1 09:41:14 2023
X-Patchwork-Submitter: Wenchao Hao
X-Patchwork-Id: 719725
From: Wenchao Hao
To: "James E . J . Bottomley", "Martin K . Petersen"
CC: Hannes Reinecke, Wenchao Hao
Subject: [RFC PATCH v2 06/19] scsi: scsi_error: Add flags to mark error handle steps has done
Date: Fri, 1 Sep 2023 17:41:14 +0800
Message-ID: <20230901094127.2010873-7-haowenchao2@huawei.com>
In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com>
References: <20230901094127.2010873-1-haowenchao2@huawei.com>

LUN based error handling mainly performs three steps to recover
commands: check sense, start unit, and reset LUN. It might fall back to
target or host based error handling, which would perform these steps
too. Target based error handling resets the target, and it can also
fall back to host based error handling.

Add flags that mark these steps as done to avoid repeating them. The
flags should be cleared when the LUN/target based error handler is woken
up or when target/host based error handling finishes, and set when
falling back to target/host based error handling.

scsi_eh_get_sense(), scsi_eh_stu(), scsi_eh_bus_device_reset() and
scsi_eh_target_reset() check these flags before taking any action.
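
The intended lifecycle of the flags can be sketched in userspace as
follows. This is an assumption-laden illustration: the flag names
get_sense_done, stu_done and reset_done come from the diff, but their
declared type is not shown here, so the one-bit fields below are a
guess, and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the per-LUN "step already done" flags. */
struct model_lun_eh {
	unsigned int get_sense_done:1;	/* field widths are an assumption */
	unsigned int stu_done:1;
	unsigned int reset_done:1;
};

/* The LUN handler ran all three steps and is escalating: mark them. */
static void model_mark_fallback(struct model_lun_eh *eh)
{
	assert(eh != NULL);
	eh->get_sense_done = 1;
	eh->stu_done = 1;
	eh->reset_done = 1;
}

/* Number of per-LUN steps a wider handler would still have to run. */
static int model_steps_remaining(const struct model_lun_eh *eh)
{
	return !eh->get_sense_done + !eh->stu_done + !eh->reset_done;
}

/* Wider recovery finished (or the LUN handler woke up): start fresh. */
static void model_clear_done(struct model_lun_eh *eh)
{
	eh->get_sense_done = 0;
	eh->stu_done = 0;
	eh->reset_done = 0;
}
```

After model_mark_fallback() the wider handler has nothing left to repeat
for that LUN; model_clear_done() corresponds to what
shost_clear_eh_done() does per device at the end of host recovery.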
Signed-off-by: Wenchao Hao --- drivers/scsi/scsi_error.c | 55 ++++++++++++++++++++++++++++++++++++++ include/scsi/scsi_device.h | 28 +++++++++++++++++++ 2 files changed, 83 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 16888540b663..055c04470f5c 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -57,10 +57,50 @@ #define BUS_RESET_SETTLE_TIME (10) #define HOST_RESET_SETTLE_TIME (10) +#define sdev_flags_done(flag) \ +static inline int sdev_##flag(struct scsi_device *sdev) \ +{ \ + struct scsi_device_eh *eh = sdev->eh; \ + if (!eh) \ + return 0; \ + return eh->flag; \ +} + static int scsi_eh_try_stu(struct scsi_cmnd *scmd); static enum scsi_disposition scsi_try_to_abort_cmd(const struct scsi_host_template *, struct scsi_cmnd *); +sdev_flags_done(get_sense_done); +sdev_flags_done(stu_done); +sdev_flags_done(reset_done); + +static inline int starget_reset_done(struct scsi_target *starget) +{ + struct scsi_target_eh *eh = starget->eh; + + if (!eh) + return 0; + return eh->reset_done; +} + +static inline void shost_clear_eh_done(struct Scsi_Host *shost) +{ + struct scsi_device *sdev; + struct scsi_target *starget; + + list_for_each_entry(starget, &shost->__targets, siblings) + if (starget->eh) + starget->eh->reset_done = 0; + + shost_for_each_device(sdev, shost) { + if (!sdev->eh) + continue; + sdev->eh->get_sense_done = 0; + sdev->eh->stu_done = 0; + sdev->eh->reset_done = 0; + } +} + void scsi_eh_wakeup(struct Scsi_Host *shost) { lockdep_assert_held(shost->host_lock); @@ -1402,6 +1442,9 @@ int scsi_eh_get_sense(struct list_head *work_q, current->comm)); break; } + if (sdev_get_sense_done(scmd->device) || + starget_reset_done(scsi_target(scmd->device))) + continue; if (!scsi_status_is_check_condition(scmd->result)) /* * don't request sense if there's no check condition @@ -1615,6 +1658,9 @@ static int scsi_eh_stu(struct Scsi_Host *shost, scsi_device_put(sdev); break; } + if (sdev_stu_done(sdev) || + 
starget_reset_done(scsi_target(sdev))) + continue; stu_scmd = NULL; list_for_each_entry(scmd, work_q, eh_entry) if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) && @@ -1698,6 +1744,9 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost, bdr_scmd = scmd; break; } + if (sdev_reset_done(sdev) || + starget_reset_done(scsi_target(sdev))) + continue; if (!bdr_scmd) continue; @@ -1746,6 +1795,11 @@ static int scsi_eh_target_reset(struct Scsi_Host *shost, } scmd = list_entry(tmp_list.next, struct scsi_cmnd, eh_entry); + if (starget_reset_done(scsi_target(scmd->device))) { + /* push back on work queue for further processing */ + list_move(&scmd->eh_entry, work_q); + continue; + } id = scmd_id(scmd); SCSI_LOG_ERROR_RECOVERY(3, @@ -2359,6 +2413,7 @@ static void scsi_unjam_host(struct Scsi_Host *shost) if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q)) scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q); + shost_clear_eh_done(shost); spin_lock_irqsave(shost->host_lock, flags); if (shost->eh_deadline != -1) shost->last_reset = 0; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index df3f1b8d1390..b03a4f21c7df 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -108,6 +108,24 @@ struct scsi_device; struct scsi_target; struct scsi_device_eh { + /* + * LUN rebased error handle would mainly do three + * steps to recovery commands which are + * check sense + * start unit + * reset lun + * While we would fallback to target or host based error handle + * which would do these steps too. Add flags to mark thes steps + * are done to avoid repeating these steps. + * + * The flags should be cleared when LUN based error handler is + * wakedup or when target/host based error handle finished, + * set when fallback to target or host based error handle. 
+ */ + unsigned get_sense_done:1; + unsigned stu_done:1; + unsigned reset_done:1; + /* * add scsi command to error handler so it would be handuled by * driver's error handle strategy @@ -139,6 +157,16 @@ struct scsi_device_eh { }; struct scsi_target_eh { + /* + * flag to mark target reset is done to avoid repeating + * these steps when fallback to host based error handle + * + * The flag should be cleared when target based error handler + * is * wakedup or when host based error handle finished, + * set when fallback to host based error handle. + */ + unsigned reset_done:1; + /* * add scsi command to error handler so it would be handuled by * driver's error handle strategy From patchwork Fri Sep 1 09:41:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719858 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DE839CA0FE1 for ; Fri, 1 Sep 2023 09:42:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348961AbjIAJmX (ORCPT ); Fri, 1 Sep 2023 05:42:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55870 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345337AbjIAJmH (ORCPT ); Fri, 1 Sep 2023 05:42:07 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0F111170C; Fri, 1 Sep 2023 02:41:58 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.54]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4RcXyK6zF9zhZHX; Fri, 1 Sep 2023 17:38:01 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:41:55 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 07/19] scsi: scsi_error: Add helper to handle scsi device's error command list Date: Fri, 1 Sep 2023 17:41:15 +0800 Message-ID: <20230901094127.2010873-8-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org

Add the helper scsi_sdev_eh() to handle a SCSI device's error command list. It performs the recovery steps that can be done with only the LUN's I/O blocked: check sense, start unit, and reset LUN. Each step returns early if it recovers every command on the list.
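The step ordering in scsi_sdev_eh() follows a simple escalation pattern: run the cheapest step first and stop as soon as the work list is empty. A minimal user-space sketch of that pattern follows. All `model_*` names are hypothetical, and the number of commands each step recovers is made up purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the work list: only the pending count matters. */
struct model_queue {
	int pending;	/* commands still waiting for recovery */
	int steps_run;	/* how many recovery steps were attempted */
};

typedef int (*model_step_fn)(struct model_queue *q);

/* Recover up to n commands; return 1 when nothing is left pending. */
static int model_recover(struct model_queue *q, int n)
{
	assert(q != NULL);
	q->steps_run++;
	q->pending = q->pending > n ? q->pending - n : 0;
	return q->pending == 0;
}

/* Illustrative steps; the recovery counts are arbitrary assumptions. */
static int model_check_sense(struct model_queue *q) { return model_recover(q, 1); }
static int model_start_unit(struct model_queue *q)  { return model_recover(q, 2); }
static int model_reset_lun(struct model_queue *q)   { return model_recover(q, 4); }

/* Same control flow as scsi_sdev_eh(): return 1 once everything is recovered. */
int model_sdev_eh(struct model_queue *q)
{
	static const model_step_fn steps[] = {
		model_check_sense,
		model_start_unit,
		model_reset_lun,
	};
	size_t i;

	for (i = 0; i < sizeof(steps) / sizeof(steps[0]); i++)
		if (steps[i](q))
			return 1;
	return 0;	/* commands remain: caller decides on fallback */
}
```

A return value of 0 leaves the decision to the caller, which is where the fallback policy of the later patches in this series comes in.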
Signed-off-by: Wenchao Hao --- drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++++++++ include/scsi/scsi_eh.h | 2 ++ 2 files changed, 39 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 055c04470f5c..f24f081fc637 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -2510,6 +2510,43 @@ int scsi_error_handler(void *data) return 0; } +/* + * Single LUN error handle + * + * @work_q: list of scsi commands need to recovery + * @done_q: list of scsi commands handled + * + * return: return 1 if all commands in work_q is recoveryed, else 0 is returned + */ +int scsi_sdev_eh(struct scsi_device *sdev, + struct list_head *work_q, + struct list_head *done_q) +{ + int ret = 0; + struct scsi_cmnd *scmd; + + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh: checking sense\n", current->comm)); + ret = scsi_eh_get_sense(work_q, done_q); + if (ret) + return ret; + + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh: start unit\n", current->comm)); + scmd = list_first_entry(work_q, struct scsi_cmnd, eh_entry); + ret = scsi_eh_sdev_stu(scmd, work_q, done_q); + if (ret) + return ret; + + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh reset LUN\n", current->comm)); + scmd = list_first_entry(work_q, struct scsi_cmnd, eh_entry); + ret = scsi_eh_sdev_reset(scmd, work_q, done_q); + + return ret; +} +EXPORT_SYMBOL_GPL(scsi_sdev_eh); + /* * Function: scsi_report_bus_reset() * diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h index 1ae08e81339f..5ce791063baf 100644 --- a/include/scsi/scsi_eh.h +++ b/include/scsi/scsi_eh.h @@ -18,6 +18,8 @@ extern int scsi_block_when_processing_errors(struct scsi_device *); extern bool scsi_command_normalize_sense(const struct scsi_cmnd *cmd, struct scsi_sense_hdr *sshdr); extern enum scsi_disposition scsi_check_sense(struct scsi_cmnd *); +extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq, + struct 
list_head *doneq); static inline bool scsi_sense_is_deferred(const struct scsi_sense_hdr *sshdr) { From patchwork Fri Sep 1 09:41:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0FCCCA0FE6 for ; Fri, 1 Sep 2023 09:42:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348972AbjIAJm2 (ORCPT ); Fri, 1 Sep 2023 05:42:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55834 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348927AbjIAJmM (ORCPT ); Fri, 1 Sep 2023 05:42:12 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 185B210D5; Fri, 1 Sep 2023 02:41:59 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4RcXyf1dbTzNmJ3; Fri, 1 Sep 2023 17:38:18 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:41:55 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . 
Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 08/19] scsi: scsi_error: Add a general LUN based error handler Date: Fri, 1 Sep 2023 17:41:16 +0800 Message-ID: <20230901094127.2010873-9-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org

Add a general LUN based error handler that drivers can use directly. The handler implements a scsi_device_eh. When handling error commands, it calls the helper scsi_sdev_eh(), added earlier in this series, to try to recover them.

What happens when scsi_sdev_eh() cannot recover all error commands depends on the fallback flag, which is initialized when the scsi_device is allocated. If fallback is set, the handler falls back to further error recovery, such as the old host based error handling. Otherwise it marks the SCSI device offline and flushes all error commands.

To use this error handler, a driver should call scsi_device_setup_eh() in its slave_alloc() to set up the LUN based error handler, and call scsi_device_clear_eh() in its slave_destroy() to clear it.
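The decision logic of this handler can be modelled in a few lines of user-space C. This is a hedged sketch, not the kernel implementation: the `model_*` names and the outcome enum are hypothetical. The wake-up condition mirrors the check in sdev_eh_wakeup() in this patch, where the handler only runs once every busy command on the LUN has failed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes of one run of the LUN based handler. */
enum model_outcome {
	MODEL_RECOVERED,	/* scsi_sdev_eh() recovered every command */
	MODEL_ESCALATED,	/* handed over to target/host based handling */
	MODEL_OFFLINED,		/* device set offline, commands flushed */
};

/* Hypothetical stand-in for the fallback bit in struct scsi_lun_eh. */
struct model_lun_eh {
	unsigned int fallback:1;
};

/* Mirrors the branch structure described in the commit message. */
enum model_outcome model_finish(const struct model_lun_eh *luneh,
				int all_recovered)
{
	assert(luneh != NULL);
	if (all_recovered)
		return MODEL_RECOVERED;
	return luneh->fallback ? MODEL_ESCALATED : MODEL_OFFLINED;
}

/*
 * Wake the handler only when there is at least one failed command and
 * every outstanding command on the LUN has failed.
 */
int model_should_wake(unsigned int nr_busy, unsigned int nr_error)
{
	return nr_error != 0 && nr_busy == nr_error;
}
```

Choosing fallback = 0 trades recovery depth for isolation: a LUN that cannot be recovered locally goes offline without ever blocking I/O to other LUNs on the host.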
Signed-off-by: Wenchao Hao --- drivers/scsi/scsi_error.c | 170 ++++++++++++++++++++++++++++++++++++++ include/scsi/scsi_eh.h | 2 + 2 files changed, 172 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index f24f081fc637..b17bf1dea799 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -2759,3 +2759,173 @@ bool scsi_get_sense_info_fld(const u8 *sense_buffer, int sb_len, } } EXPORT_SYMBOL(scsi_get_sense_info_fld); + +struct scsi_lun_eh { + spinlock_t eh_lock; + unsigned int eh_num; + struct list_head eh_cmd_q; + struct scsi_device *sdev; + struct work_struct eh_handle_work; + unsigned int fallback:1; /* If fallback to further */ + /* recovery on failure */ +}; + +/* + * error handle strategy based on LUN, following steps + * is applied to recovery error commands in list: + * check sense data + * send start unit + * reset lun + * if there are still error commands, it would fallback to + * target based or host based error handle for further recovery. 
+ */ +static void sdev_eh_work(struct work_struct *work) +{ + unsigned long flags; + struct scsi_lun_eh *luneh = + container_of(work, struct scsi_lun_eh, eh_handle_work); + struct scsi_device *sdev = luneh->sdev; + struct scsi_device_eh *eh = sdev->eh; + struct Scsi_Host *shost = sdev->host; + struct scsi_cmnd *scmd, *next; + LIST_HEAD(eh_work_q); + LIST_HEAD(eh_done_q); + + spin_lock_irqsave(&luneh->eh_lock, flags); + list_splice_init(&luneh->eh_cmd_q, &eh_work_q); + spin_unlock_irqrestore(&luneh->eh_lock, flags); + + if (scsi_sdev_eh(sdev, &eh_work_q, &eh_done_q)) + goto out_flush_done; + + if (!luneh->fallback) { + list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry) + scsi_eh_finish_cmd(scmd, &eh_done_q); + + sdev_printk(KERN_INFO, sdev, "%s:luneh: Device offlined - " + "not ready after error recovery\n", current->comm); + + mutex_lock(&sdev->state_mutex); + scsi_device_set_state(sdev, SDEV_OFFLINE); + mutex_unlock(&sdev->state_mutex); + + goto out_flush_done; + } + + /* + * fallback to target or host based error handle + */ + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh fallback to further recovery\n", current->comm)); + list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry) { + list_del_init(&scmd->eh_entry); + + if (scsi_host_in_recovery(shost) || + __scsi_eh_scmd_add_starget(scmd)) + __scsi_eh_scmd_add(scmd); + } + + eh->get_sense_done = 1; + eh->stu_done = 1; + eh->reset_done = 1; + +out_flush_done: + scsi_eh_flush_done_q(&eh_done_q); + spin_lock_irqsave(&luneh->eh_lock, flags); + luneh->eh_num = 0; + spin_unlock_irqrestore(&luneh->eh_lock, flags); +} +static void sdev_eh_add_cmnd(struct scsi_cmnd *scmd) +{ + unsigned long flags; + struct scsi_lun_eh *luneh; + struct scsi_device *sdev = scmd->device; + + luneh = (struct scsi_lun_eh *)sdev->eh->driver_data; + + spin_lock_irqsave(&luneh->eh_lock, flags); + list_add_tail(&scmd->eh_entry, &luneh->eh_cmd_q); + luneh->eh_num++; + spin_unlock_irqrestore(&luneh->eh_lock, flags); +} 
+static int sdev_eh_is_busy(struct scsi_device *sdev) +{ + int ret = 0; + unsigned long flags; + struct scsi_lun_eh *luneh; + + if (!sdev->eh) + return 0; + + luneh = (struct scsi_lun_eh *)sdev->eh->driver_data; + + spin_lock_irqsave(&luneh->eh_lock, flags); + ret = luneh->eh_num; + spin_unlock_irqrestore(&luneh->eh_lock, flags); + + return ret; +} +static int sdev_eh_wakeup(struct scsi_device *sdev) +{ + unsigned long flags; + unsigned int nr_error; + unsigned int nr_busy; + struct scsi_lun_eh *luneh; + + luneh = (struct scsi_lun_eh *)sdev->eh->driver_data; + + spin_lock_irqsave(&luneh->eh_lock, flags); + nr_error = luneh->eh_num; + spin_unlock_irqrestore(&luneh->eh_lock, flags); + + nr_busy = scsi_device_busy(sdev); + + if (!nr_error || nr_busy != nr_error) { + SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_INFO, sdev, + "%s:luneh: do not wake up, busy/error: %d/%d\n", + current->comm, nr_busy, nr_error)); + return 0; + } + + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh: waking up, busy/error: %d/%d\n", + current->comm, nr_busy, nr_error)); + + return schedule_work(&luneh->eh_handle_work); +} + +int scsi_device_setup_eh(struct scsi_device *sdev, int fallback) +{ + struct scsi_device_eh *eh; + struct scsi_lun_eh *luneh; + + eh = kzalloc(sizeof(struct scsi_device_eh) + sizeof(struct scsi_lun_eh), + GFP_KERNEL); + if (!eh) { + sdev_printk(KERN_ERR, sdev, "failed to setup error handle\n"); + return -ENOMEM; + } + luneh = (struct scsi_lun_eh *)eh->driver_data; + + eh->add_cmnd = sdev_eh_add_cmnd; + eh->is_busy = sdev_eh_is_busy; + eh->wakeup = sdev_eh_wakeup; + + luneh->fallback = fallback; + luneh->sdev = sdev; + spin_lock_init(&luneh->eh_lock); + INIT_LIST_HEAD(&luneh->eh_cmd_q); + INIT_WORK(&luneh->eh_handle_work, sdev_eh_work); + + sdev->eh = eh; + + return 0; +} +EXPORT_SYMBOL_GPL(scsi_device_setup_eh); + +void scsi_device_clear_eh(struct scsi_device *sdev) +{ + kfree(sdev->eh); + sdev->eh = NULL; +} +EXPORT_SYMBOL_GPL(scsi_device_clear_eh); 
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h index 5ce791063baf..89b471aa484f 100644 --- a/include/scsi/scsi_eh.h +++ b/include/scsi/scsi_eh.h @@ -20,6 +20,8 @@ extern bool scsi_command_normalize_sense(const struct scsi_cmnd *cmd, extern enum scsi_disposition scsi_check_sense(struct scsi_cmnd *); extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq, struct list_head *doneq); +extern int scsi_device_setup_eh(struct scsi_device *sdev, int fallback); +extern void scsi_device_clear_eh(struct scsi_device *sdev); static inline bool scsi_sense_is_deferred(const struct scsi_sense_hdr *sshdr) { From patchwork Fri Sep 1 09:41:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719724 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1EDC5CA0FE8 for ; Fri, 1 Sep 2023 09:42:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348967AbjIAJm1 (ORCPT ); Fri, 1 Sep 2023 05:42:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56292 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348905AbjIAJmM (ORCPT ); Fri, 1 Sep 2023 05:42:12 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A2E531717; Fri, 1 Sep 2023 02:41:59 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4RcY0v3ZpBzrSFC; Fri, 1 Sep 2023 17:40:15 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 
15.1.2507.31; Fri, 1 Sep 2023 17:41:56 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 09/19] scsi: core: increase/decrease target_busy without check can_queue Date: Fri, 1 Sep 2023 17:41:17 +0800 Message-ID: <20230901094127.2010873-10-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org

Increase and decrease target_busy unconditionally, regardless of whether can_queue is set. This prepares for a general target based error handling strategy, which reads target_busy to decide whether to wake up the actual error handler.

Signed-off-by: Wenchao Hao --- drivers/scsi/scsi_lib.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index db0a42fe49c0..4a7fb48aa60f 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -293,8 +293,7 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd) scsi_dec_host_busy(shost, cmd); - if (starget->can_queue > 0) - atomic_dec(&starget->target_busy); + atomic_dec(&starget->target_busy); sbitmap_put(&sdev->budget_map, cmd->budget_token); cmd->budget_token = -1; @@ -1311,10 +1310,10 @@ static inline int scsi_target_queue_ready(struct Scsi_Host *shost, spin_unlock_irq(shost->host_lock); } + busy = atomic_inc_return(&starget->target_busy) - 1; if (starget->can_queue <= 0) return 1; - busy = atomic_inc_return(&starget->target_busy) - 1; if (atomic_read(&starget->target_blocked) > 0) { if (busy) goto starved; @@ -1339,8 +1338,7 @@ static inline int scsi_target_queue_ready(struct Scsi_Host *shost, list_move_tail(&sdev->starved_entry, &shost->starved_list);
spin_unlock_irq(shost->host_lock); out_dec: - if (starget->can_queue > 0) - atomic_dec(&starget->target_busy); + atomic_dec(&starget->target_busy); return 0; } @@ -1784,8 +1782,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx, out_dec_host_busy: scsi_dec_host_busy(shost, cmd); out_dec_target_busy: - if (scsi_target(sdev)->can_queue > 0) - atomic_dec(&scsi_target(sdev)->target_busy); + atomic_dec(&scsi_target(sdev)->target_busy); out_put_budget: scsi_mq_put_budget(q, cmd->budget_token); cmd->budget_token = -1; From patchwork Fri Sep 1 09:41:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719723 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5401ACA0FE1 for ; Fri, 1 Sep 2023 09:42:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235597AbjIAJm2 (ORCPT ); Fri, 1 Sep 2023 05:42:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43622 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348939AbjIAJmP (ORCPT ); Fri, 1 Sep 2023 05:42:15 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 86A0F1726; Fri, 1 Sep 2023 02:42:00 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4RcY003ytWzVkR3; Fri, 1 Sep 2023 17:39:28 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:41:57 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . 
Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 10/19] scsi: scsi_error: Add helper to handle scsi target's error command list Date: Fri, 1 Sep 2023 17:41:18 +0800 Message-ID: <20230901094127.2010873-11-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org

Add the helper scsi_starget_eh() to handle a SCSI target's error command list. It performs the recovery steps that can be done with only the target's I/O blocked: check sense, start unit, reset LUN, and reset target.

Signed-off-by: Wenchao Hao --- drivers/scsi/scsi_error.c | 129 ++++++++++++++++++++++++++++++++++++++ include/scsi/scsi_eh.h | 2 + 2 files changed, 131 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index b17bf1dea799..50cd8104175d 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -2547,6 +2547,135 @@ int scsi_sdev_eh(struct scsi_device *sdev, } EXPORT_SYMBOL_GPL(scsi_sdev_eh); +static int starget_eh_stu(struct scsi_target *starget, + struct list_head *work_q, + struct list_head *done_q) +{ + struct scsi_device *sdev; + struct scsi_cmnd *scmd, *stu_scmd; + + list_for_each_entry(sdev, &starget->devices, same_target_siblings) { + if (sdev_stu_done(sdev)) + continue; + + stu_scmd = NULL; + list_for_each_entry(scmd, work_q, eh_entry) + if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) && + scsi_check_sense(scmd) == FAILED) { + stu_scmd = scmd; + break; + } + if (!stu_scmd) + continue; + + if (scsi_eh_sdev_stu(stu_scmd, work_q, done_q)) + return 1; + } + + return 0; +} + +static int starget_eh_reset_lun(struct scsi_target *starget, + struct list_head
*work_q, + struct list_head *done_q) +{ + struct scsi_device *sdev; + struct scsi_cmnd *scmd, *bdr_scmd; + + list_for_each_entry(sdev, &starget->devices, same_target_siblings) { + if (sdev_reset_done(sdev)) + continue; + + bdr_scmd = NULL; + list_for_each_entry(scmd, work_q, eh_entry) + if (scmd->device) { + bdr_scmd = scmd; + break; + } + if (!bdr_scmd) + continue; + + if (scsi_eh_sdev_reset(bdr_scmd, work_q, done_q)) + return 1; + } + + return 0; +} + +static int starget_eh_reset_target(struct scsi_target *starget, + struct list_head *work_q, + struct list_head *done_q) +{ + enum scsi_disposition rtn; + struct scsi_cmnd *scmd, *next; + LIST_HEAD(check_list); + + scmd = list_first_entry(work_q, struct scsi_cmnd, eh_entry); + + SCSI_LOG_ERROR_RECOVERY(3, starget_printk(KERN_INFO, starget, + "%s: Sending target reset\n", current->comm)); + + rtn = scsi_try_target_reset(scmd); + if (rtn != SUCCESS && rtn != FAST_IO_FAIL) { + SCSI_LOG_ERROR_RECOVERY(3, starget_printk(KERN_INFO, starget, + "%s: Target reset failed\n", + current->comm)); + return 0; + } + + SCSI_LOG_ERROR_RECOVERY(3, starget_printk(KERN_INFO, starget, + "%s: Target reset success\n", current->comm)); + + list_for_each_entry_safe(scmd, next, work_q, eh_entry) { + if (rtn == SUCCESS) + list_move_tail(&scmd->eh_entry, &check_list); + else if (rtn == FAST_IO_FAIL) + scsi_eh_finish_cmd(scmd, done_q); + } + + return scsi_eh_test_devices(&check_list, work_q, done_q, 0); +} + +/* + * Target based error handle + * + * @work_q: list of scsi commands need to recovery + * @done_q: list of scsi commands handled + * + * return: return 1 if all commands in work_q is recoveryed, else 0 is returned + */ +int scsi_starget_eh(struct scsi_target *starget, + struct list_head *work_q, + struct list_head *done_q) +{ + int ret = 0; + + SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget, + "%s:targeteh: checking sense\n", current->comm)); + ret = scsi_eh_get_sense(work_q, done_q); + if (ret) + return ret; + + 
SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget, + "%s:targeteh: start unit\n", current->comm)); + ret = starget_eh_stu(starget, work_q, done_q); + if (ret) + return ret; + + SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget, + "%s:targeteh reset LUN\n", current->comm)); + ret = starget_eh_reset_lun(starget, work_q, done_q); + if (ret) + return ret; + + SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget, + "%s:targeteh reset target\n", current->comm)); + ret = starget_eh_reset_target(starget, work_q, done_q); + + return ret; +} +EXPORT_SYMBOL_GPL(scsi_starget_eh); + /* * Function: scsi_report_bus_reset() * diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h index 89b471aa484f..80e2f130e884 100644 --- a/include/scsi/scsi_eh.h +++ b/include/scsi/scsi_eh.h @@ -20,6 +20,8 @@ extern bool scsi_command_normalize_sense(const struct scsi_cmnd *cmd, extern enum scsi_disposition scsi_check_sense(struct scsi_cmnd *); extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq, struct list_head *doneq); +extern int scsi_starget_eh(struct scsi_target *starget, + struct list_head *workq, struct list_head *doneq); extern int scsi_device_setup_eh(struct scsi_device *sdev, int fallback); extern void scsi_device_clear_eh(struct scsi_device *sdev); From patchwork Fri Sep 1 09:41:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719722 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 254C2CA0FE8 for ; Fri, 1 Sep 2023 09:42:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349019AbjIAJmp (ORCPT ); Fri, 1 Sep 2023 05:42:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58728 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348922AbjIAJmW (ORCPT ); Fri, 1 Sep 2023 05:42:22 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 583561732; Fri, 1 Sep 2023 02:42:01 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4RcY013mYyzVkSC; Fri, 1 Sep 2023 17:39:29 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:41:57 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 11/19] scsi: scsi_error: Add a general target based error handler Date: Fri, 1 Sep 2023 17:41:19 +0800 Message-ID: <20230901094127.2010873-12-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org

Add a general target based error handler that drivers can use directly. The handler implements a scsi_target_eh. When handling error commands, it calls the helper scsi_starget_eh(), added earlier in this series, to try to recover them.

What happens when scsi_starget_eh() cannot recover all error commands depends on the fallback flag, which is initialized when the scsi_target is allocated.
If fallback is set, it would fallback to further error recover strategy like old host based error handle; else it would mark this scsi devices of this target offline and flush all error commands. To using this error handler, drivers should call scsi_target_setup_eh() in its target_alloc() to setup it's target based error handler; call scsi_device_clear_eh() in its target_destroy() to clear this target based error handler. Signed-off-by: Wenchao Hao --- drivers/scsi/scsi_error.c | 161 ++++++++++++++++++++++++++++++++++++++ include/scsi/scsi_eh.h | 2 + 2 files changed, 163 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 50cd8104175d..1338742e55b9 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -101,6 +101,19 @@ static inline void shost_clear_eh_done(struct Scsi_Host *shost) } } +static inline void starget_clear_eh_done(struct scsi_target *starget) +{ + struct scsi_device *sdev; + + list_for_each_entry(sdev, &starget->devices, same_target_siblings) { + if (!sdev->eh) + continue; + sdev->eh->get_sense_done = 0; + sdev->eh->stu_done = 0; + sdev->eh->reset_done = 0; + } +} + void scsi_eh_wakeup(struct Scsi_Host *shost) { lockdep_assert_held(shost->host_lock); @@ -3058,3 +3071,151 @@ void scsi_device_clear_eh(struct scsi_device *sdev) sdev->eh = NULL; } EXPORT_SYMBOL_GPL(scsi_device_clear_eh); + +struct starget_eh { + spinlock_t eh_lock; + unsigned int eh_num; + struct list_head eh_cmd_q; + struct scsi_target *starget; + struct work_struct eh_handle_work; + unsigned int fallback:1; +}; + +static void starget_eh_work(struct work_struct *work) +{ + struct scsi_cmnd *scmd, *next; + unsigned long flags; + LIST_HEAD(eh_work_q); + LIST_HEAD(eh_done_q); + struct starget_eh *stargeteh = + container_of(work, struct starget_eh, eh_handle_work); + struct scsi_target *starget = stargeteh->starget; + struct scsi_target_eh *eh = starget->eh; + + spin_lock_irqsave(&stargeteh->eh_lock, flags); + 
list_splice_init(&stargeteh->eh_cmd_q, &eh_work_q); + spin_unlock_irqrestore(&stargeteh->eh_lock, flags); + + if (scsi_starget_eh(starget, &eh_work_q, &eh_done_q)) + goto out_clear_flag; + + if (!stargeteh->fallback) { + scsi_eh_offline_sdevs(&eh_work_q, &eh_done_q); + goto out_clear_flag; + } + + /* + * fallback to host based error handle + */ + SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget, + "%s:targeteh fallback to further recovery\n", current->comm)); + eh->reset_done = 1; + list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry) { + list_del_init(&scmd->eh_entry); + __scsi_eh_scmd_add(scmd); + } + goto out_flush_done; + +out_clear_flag: + starget_clear_eh_done(starget); + +out_flush_done: + scsi_eh_flush_done_q(&eh_done_q); + spin_lock_irqsave(&stargeteh->eh_lock, flags); + stargeteh->eh_num = 0; + spin_unlock_irqrestore(&stargeteh->eh_lock, flags); +} + +static void starget_eh_add_cmnd(struct scsi_cmnd *scmd) +{ + unsigned long flags; + struct scsi_target *starget = scmd->device->sdev_target; + struct starget_eh *eh; + + eh = (struct starget_eh *)starget->eh->driver_data; + + spin_lock_irqsave(&eh->eh_lock, flags); + list_add_tail(&scmd->eh_entry, &eh->eh_cmd_q); + eh->eh_num++; + spin_unlock_irqrestore(&eh->eh_lock, flags); +} + +static int starget_eh_is_busy(struct scsi_target *starget) +{ + int ret = 0; + unsigned long flags; + struct starget_eh *eh; + + eh = (struct starget_eh *)starget->eh->driver_data; + + spin_lock_irqsave(&eh->eh_lock, flags); + ret = eh->eh_num; + spin_unlock_irqrestore(&eh->eh_lock, flags); + + return ret; +} + +static int starget_eh_wakeup(struct scsi_target *starget) +{ + unsigned long flags; + unsigned int nr_error; + unsigned int nr_busy; + struct starget_eh *eh; + + eh = (struct starget_eh *)starget->eh->driver_data; + + spin_lock_irqsave(&eh->eh_lock, flags); + nr_error = eh->eh_num; + spin_unlock_irqrestore(&eh->eh_lock, flags); + + nr_busy = atomic_read(&starget->target_busy); + + if (!nr_error || nr_busy 
!= nr_error) { + SCSI_LOG_ERROR_RECOVERY(5, starget_printk(KERN_INFO, starget, + "%s:targeteh: do not wake up, busy/error is %d/%d\n", + current->comm, nr_busy, nr_error)); + return 0; + } + + SCSI_LOG_ERROR_RECOVERY(2, starget_printk(KERN_INFO, starget, + "%s:targeteh: waking up, busy/error is %d/%d\n", + current->comm, nr_busy, nr_error)); + + return schedule_work(&eh->eh_handle_work); +} + +int scsi_target_setup_eh(struct scsi_target *starget, int fallback) +{ + struct scsi_target_eh *eh; + struct starget_eh *stargeteh; + + eh = kzalloc(sizeof(struct scsi_target_eh) + sizeof(struct starget_eh), + GFP_KERNEL); + if (!eh) { + starget_printk(KERN_ERR, starget, "failed to setup eh\n"); + return -ENOMEM; + } + stargeteh = (struct starget_eh *)eh->driver_data; + + eh->add_cmnd = starget_eh_add_cmnd; + eh->is_busy = starget_eh_is_busy; + eh->wakeup = starget_eh_wakeup; + stargeteh->starget = starget; + stargeteh->fallback = fallback; + + spin_lock_init(&stargeteh->eh_lock); + INIT_LIST_HEAD(&stargeteh->eh_cmd_q); + INIT_WORK(&stargeteh->eh_handle_work, starget_eh_work); + + starget->eh = eh; + + return 0; +} +EXPORT_SYMBOL_GPL(scsi_target_setup_eh); + +void scsi_target_clear_eh(struct scsi_target *starget) +{ + kfree(starget->eh); + starget->eh = NULL; +} +EXPORT_SYMBOL_GPL(scsi_target_clear_eh); diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h index 80e2f130e884..011f63030589 100644 --- a/include/scsi/scsi_eh.h +++ b/include/scsi/scsi_eh.h @@ -24,6 +24,8 @@ extern int scsi_starget_eh(struct scsi_target *starget, struct list_head *workq, struct list_head *doneq); extern int scsi_device_setup_eh(struct scsi_device *sdev, int fallback); extern void scsi_device_clear_eh(struct scsi_device *sdev); +extern int scsi_target_setup_eh(struct scsi_target *starget, int fallback); +extern void scsi_target_clear_eh(struct scsi_target *starget); static inline bool scsi_sense_is_deferred(const struct scsi_sense_hdr *sshdr) { From patchwork Fri Sep 1 09:41:20 2023
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719855 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA367CA0FE9 for ; Fri, 1 Sep 2023 09:42:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348976AbjIAJmq (ORCPT ); Fri, 1 Sep 2023 05:42:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55878 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348950AbjIAJmW (ORCPT ); Fri, 1 Sep 2023 05:42:22 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 120131735; Fri, 1 Sep 2023 02:42:02 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.56]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4RcY021ljdzVkSD; Fri, 1 Sep 2023 17:39:30 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:41:58 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . 
Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 12/19] scsi: scsi_debug: Add param to control LUN based error handler Date: Fri, 1 Sep 2023 17:41:20 +0800 Message-ID: <20230901094127.2010873-13-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add a new module param lun_eh to control whether the LUN based error handler is enabled, and a param lun_eh_fallback to control whether to fall back to further recovery when LUN based recovery cannot recover all error commands. These are used to test the LUN based error handler. Signed-off-by: Wenchao Hao --- drivers/scsi/scsi_debug.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index f30399a75ec0..af3d43c9db6f 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -841,6 +841,8 @@ static bool have_dif_prot; static bool write_since_sync; static bool sdebug_statistics = DEF_STATISTICS; static bool sdebug_wp; +static bool sdebug_lun_eh; +static bool sdebug_lun_eh_fallback; /* Following enum: 0: no zbc, def; 1: host aware; 2: host managed */ static enum blk_zoned_model sdeb_zbc_model = BLK_ZONED_NONE; static char *sdeb_zbc_model_s; @@ -5437,6 +5439,9 @@ static int scsi_debug_slave_alloc(struct scsi_device *sdp) pr_info("slave_alloc <%u %u %u %llu>\n", sdp->host->host_no, sdp->channel, sdp->id, sdp->lun); + if (sdebug_lun_eh) + return scsi_device_setup_eh(sdp, sdebug_lun_eh_fallback); + return 0; } @@ -5491,6 +5496,9 @@ static void scsi_debug_slave_destroy(struct scsi_device *sdp) /* make this slot available for re-use */ devip->used = false; sdp->hostdata = NULL; +
+ if (sdebug_lun_eh) + scsi_device_clear_eh(sdp); } /* Returns true if we require the queued memory to be freed by the caller. */ @@ -6167,6 +6175,8 @@ module_param_named(zone_cap_mb, sdeb_zbc_zone_cap_mb, int, S_IRUGO); module_param_named(zone_max_open, sdeb_zbc_max_open, int, S_IRUGO); module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO); module_param_named(zone_size_mb, sdeb_zbc_zone_size_mb, int, S_IRUGO); +module_param_named(lun_eh, sdebug_lun_eh, bool, S_IRUGO); +module_param_named(lun_eh_fallback, sdebug_lun_eh_fallback, bool, S_IRUGO); MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert"); MODULE_DESCRIPTION("SCSI debug adapter driver"); @@ -6239,6 +6249,8 @@ MODULE_PARM_DESC(zone_cap_mb, "Zone capacity in MiB (def=zone size)"); MODULE_PARM_DESC(zone_max_open, "Maximum number of open zones; [0] for no limit (def=auto)"); MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones (def=1)"); MODULE_PARM_DESC(zone_size_mb, "Zone size in MiB (def=auto)"); +MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)"); +MODULE_PARM_DESC(lun_eh_fallback, "Fallback to further recovery if LUN recovery failed (def=0)"); #define SDEBUG_INFO_LEN 256 static char sdebug_info[SDEBUG_INFO_LEN]; From patchwork Fri Sep 1 09:41:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719856 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68DA7CA0FE6 for ; Fri, 1 Sep 2023 09:42:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348924AbjIAJmo (ORCPT ); Fri, 1 Sep 2023 05:42:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58750 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348956AbjIAJmX (ORCPT 
); Fri, 1 Sep 2023 05:42:23 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9C9010E7; Fri, 1 Sep 2023 02:42:02 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4RcXyk0fP1zNmXX; Fri, 1 Sep 2023 17:38:22 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:41:59 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 13/19] scsi: scsi_debug: Add param to control target based error handle Date: Fri, 1 Sep 2023 17:41:21 +0800 Message-ID: <20230901094127.2010873-14-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add a new module param target_eh to control whether the target based error handler is enabled, and a param target_eh_fallback to control whether to fall back to further recovery when target based recovery cannot recover all error commands. These are used to test the target based error handler.
Signed-off-by: Wenchao Hao --- drivers/scsi/scsi_debug.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index af3d43c9db6f..9a8aa48fb8a4 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -843,6 +843,8 @@ static bool sdebug_statistics = DEF_STATISTICS; static bool sdebug_wp; static bool sdebug_lun_eh; static bool sdebug_lun_eh_fallback; +static bool sdebug_target_eh; +static bool sdebug_target_eh_fallback; /* Following enum: 0: no zbc, def; 1: host aware; 2: host managed */ static enum blk_zoned_model sdeb_zbc_model = BLK_ZONED_NONE; static char *sdeb_zbc_model_s; @@ -1137,6 +1139,9 @@ static int sdebug_target_alloc(struct scsi_target *starget) starget->hostdata = targetip; + if (sdebug_target_eh) + return scsi_target_setup_eh(starget, sdebug_target_eh_fallback); + return 0; } @@ -1152,6 +1157,9 @@ static void sdebug_target_destroy(struct scsi_target *starget) { struct sdebug_target_info *targetip; + if (sdebug_target_eh) + scsi_target_clear_eh(starget); + targetip = (struct sdebug_target_info *)starget->hostdata; if (targetip) { starget->hostdata = NULL; @@ -6177,6 +6185,8 @@ module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO); module_param_named(zone_size_mb, sdeb_zbc_zone_size_mb, int, S_IRUGO); module_param_named(lun_eh, sdebug_lun_eh, bool, S_IRUGO); module_param_named(lun_eh_fallback, sdebug_lun_eh_fallback, bool, S_IRUGO); +module_param_named(target_eh, sdebug_target_eh, bool, S_IRUGO); +module_param_named(target_eh_fallback, sdebug_target_eh_fallback, bool, S_IRUGO); MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert"); MODULE_DESCRIPTION("SCSI debug adapter driver"); @@ -6251,6 +6261,8 @@ MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones (def=1)"); MODULE_PARM_DESC(zone_size_mb, "Zone size in MiB (def=auto)"); MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)"); MODULE_PARM_DESC(lun_eh_fallback, "Fallback to further recovery if LUN 
recovery failed (def=0)"); +MODULE_PARM_DESC(target_eh, "target based error handle (def=0)"); +MODULE_PARM_DESC(target_eh_fallback, "Fallback to further recovery if target recovery failed (def=0)"); #define SDEBUG_INFO_LEN 256 static char sdebug_info[SDEBUG_INFO_LEN]; From patchwork Fri Sep 1 09:41:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29A9ECA0FE6 for ; Fri, 1 Sep 2023 09:42:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348993AbjIAJnA (ORCPT ); Fri, 1 Sep 2023 05:43:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39322 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348994AbjIAJmo (ORCPT ); Fri, 1 Sep 2023 05:42:44 -0400 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A8BCE19BA; Fri, 1 Sep 2023 02:42:09 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.53]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4RcY0B0VHtzVkS3; Fri, 1 Sep 2023 17:39:38 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:42:00 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . 
Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 14/19] scsi: mpt3sas: Add param to control LUN based error handle Date: Fri, 1 Sep 2023 17:41:22 +0800 Message-ID: <20230901094127.2010873-15-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add a new module param lun_eh to control whether the LUN based error handler is enabled. Since mpt3sas defines the eh_host_reset and eh_target_reset callbacks, make it fall back to further recovery when LUN based recovery cannot recover all error commands. Signed-off-by: Wenchao Hao --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index c3c1f466fe01..7a48e89c3e5d 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -174,6 +174,10 @@ module_param(host_tagset_enable, int, 0444); MODULE_PARM_DESC(host_tagset_enable, "Shared host tagset enable/disable Default: enable(1)"); +static bool lun_eh; +module_param(lun_eh, bool, 0444); +MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)"); + /* raid transport support */ static struct raid_template *mpt3sas_raid_template; static struct raid_template *mpt2sas_raid_template; @@ -2044,6 +2048,13 @@ scsih_slave_alloc(struct scsi_device *sdev) struct _sas_device *sas_device; struct _pcie_device *pcie_device; unsigned long flags; + int ret = 0; + + if (lun_eh) { + ret = scsi_device_setup_eh(sdev, 1); + if (ret) + return ret; + } sas_device_priv_data = kzalloc(sizeof(*sas_device_priv_data), GFP_KERNEL); @@ -2122,6 +2133,9 @@
scsih_slave_destroy(struct scsi_device *sdev) struct _pcie_device *pcie_device; unsigned long flags; + if (lun_eh) + scsi_device_clear_eh(sdev); + if (!sdev->hostdata) return; From patchwork Fri Sep 1 09:41:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719854 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 911B6CA0FE8 for ; Fri, 1 Sep 2023 09:43:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349051AbjIAJnB (ORCPT ); Fri, 1 Sep 2023 05:43:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39362 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349000AbjIAJmo (ORCPT ); Fri, 1 Sep 2023 05:42:44 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB72A10FB; Fri, 1 Sep 2023 02:42:10 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4RcY163FHTzrSKP; Fri, 1 Sep 2023 17:40:26 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:42:07 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . 
Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 15/19] scsi: mpt3sas: Add param to control target based error handle Date: Fri, 1 Sep 2023 17:41:23 +0800 Message-ID: <20230901094127.2010873-16-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add a new module param target_eh to control whether the target based error handler is enabled. Since mpt3sas defines the eh_host_reset callback, make it fall back to further recovery when target based recovery cannot recover all error commands. Signed-off-by: Wenchao Hao --- drivers/scsi/mpt3sas/mpt3sas_scsih.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index 7a48e89c3e5d..6170d8a772d4 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -178,6 +178,10 @@ static bool lun_eh; module_param(lun_eh, bool, 0444); MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)"); +static bool target_eh; +module_param(target_eh, bool, 0444); +MODULE_PARM_DESC(target_eh, "target based error handle (def=0)"); + /* raid transport support */ static struct raid_template *mpt3sas_raid_template; static struct raid_template *mpt2sas_raid_template; @@ -1879,6 +1883,13 @@ scsih_target_alloc(struct scsi_target *starget) struct _pcie_device *pcie_device; unsigned long flags; struct sas_rphy *rphy; + int ret = 0; + + if (target_eh) { + ret = scsi_target_setup_eh(starget, 1); + if (ret) + return ret; + } sas_target_priv_data = kzalloc(sizeof(*sas_target_priv_data), GFP_KERNEL); @@ -1969,6 +1980,9 @@
scsih_target_destroy(struct scsi_target *starget) struct _pcie_device *pcie_device; unsigned long flags; + if (target_eh) + scsi_target_clear_eh(starget); + sas_target_priv_data = starget->hostdata; if (!sas_target_priv_data) return; From patchwork Fri Sep 1 09:41:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719720 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7CB71CA0FE1 for ; Fri, 1 Sep 2023 09:43:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348890AbjIAJnL (ORCPT ); Fri, 1 Sep 2023 05:43:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39400 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349010AbjIAJmo (ORCPT ); Fri, 1 Sep 2023 05:42:44 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F38061BC6; Fri, 1 Sep 2023 02:42:14 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.53]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4RcY171GZlzrS0g; Fri, 1 Sep 2023 17:40:27 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:42:08 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . 
Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 16/19] scsi: smartpqi: Add param to control LUN based error handle Date: Fri, 1 Sep 2023 17:41:24 +0800 Message-ID: <20230901094127.2010873-17-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add a new param lun_eh to control whether the LUN based error handler is enabled. Since smartpqi does not define any further reset callbacks, there is nothing to fall back to, so set up the LUN error handler with fallback set to 0. Signed-off-by: Wenchao Hao --- drivers/scsi/smartpqi/smartpqi_init.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 6aaaa7ebca37..107156d85d85 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -187,6 +187,10 @@ module_param_named(ctrl_ready_timeout, MODULE_PARM_DESC(ctrl_ready_timeout, "Timeout in seconds for driver to wait for controller ready."); +static bool pqi_lun_eh; +module_param_named(lun_eh, pqi_lun_eh, bool, 0444); +MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)"); + static char *raid_levels[] = { "RAID-0", "RAID-4", @@ -6356,6 +6360,13 @@ static int pqi_slave_alloc(struct scsi_device *sdev) struct pqi_ctrl_info *ctrl_info; struct scsi_target *starget; struct sas_rphy *rphy; + int ret = 0; + + if (pqi_lun_eh) { + ret = scsi_device_setup_eh(sdev, 0); + if (ret) + return ret; + } ctrl_info = shost_to_hba(sdev->host); @@ -6439,6 +6450,9 @@ static void pqi_slave_destroy(struct scsi_device *sdev) ctrl_info =
shost_to_hba(sdev->host); + if (pqi_lun_eh) + scsi_device_clear_eh(sdev); + mutex_acquired = mutex_trylock(&ctrl_info->scan_mutex); if (!mutex_acquired) return; From patchwork Fri Sep 1 09:41:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3CC47CA0FE9 for ; Fri, 1 Sep 2023 09:43:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349002AbjIAJnL (ORCPT ); Fri, 1 Sep 2023 05:43:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43622 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349020AbjIAJmp (ORCPT ); Fri, 1 Sep 2023 05:42:45 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 09E831BD6; Fri, 1 Sep 2023 02:42:19 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4RcXyb5wpNzhZJb; Fri, 1 Sep 2023 17:38:15 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:42:08 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . 
Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 17/19] scsi: megaraid_sas: Add param to control target based error handle Date: Fri, 1 Sep 2023 17:41:25 +0800 Message-ID: <20230901094127.2010873-18-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add a new param target_eh to control whether the target based error handler is enabled. Since megaraid_sas does not define the eh_device_reset callback, only the target based error handler is enabled. megaraid_sas does define eh_host_reset, so make it fall back to further recovery when target based recovery cannot recover all error commands. Signed-off-by: Wenchao Hao --- drivers/scsi/megaraid/megaraid_sas_base.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index 050eed8e2684..cc00cd5b213d 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -45,6 +45,7 @@ #include #include #include +#include #include "megaraid_sas_fusion.h" #include "megaraid_sas.h" @@ -127,6 +128,10 @@ int host_tagset_enable = 1; module_param(host_tagset_enable, int, 0444); MODULE_PARM_DESC(host_tagset_enable, "Shared host tagset enable/disable Default: enable(1)"); +static bool target_eh; +module_param(target_eh, bool, 0444); +MODULE_PARM_DESC(target_eh, "target based error handle (def=0)"); + MODULE_LICENSE("GPL"); MODULE_VERSION(MEGASAS_VERSION); MODULE_AUTHOR("megaraidlinux.pdl@broadcom.com"); @@ -2174,6 +2179,19 @@ static void megasas_slave_destroy(struct scsi_device *sdev) sdev->hostdata = NULL; }
+static int megasas_target_alloc(struct scsi_target *starget) +{ + if (target_eh) + return scsi_target_setup_eh(starget, 1); + return 0; +} + +static void megasas_target_destroy(struct scsi_target *starget) +{ + if (target_eh) + scsi_target_clear_eh(starget); +} + /* * megasas_complete_outstanding_ioctls - Complete outstanding ioctls after a * kill adapter @@ -3525,6 +3543,8 @@ static const struct scsi_host_template megasas_template = { .change_queue_depth = scsi_change_queue_depth, .max_segment_size = 0xffffffff, .cmd_size = sizeof(struct megasas_cmd_priv), + .target_alloc = megasas_target_alloc, + .target_destroy = megasas_target_destroy, }; /** From patchwork Fri Sep 1 09:41:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E2FECA0FEA for ; Fri, 1 Sep 2023 09:43:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349007AbjIAJnM (ORCPT ); Fri, 1 Sep 2023 05:43:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39518 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348950AbjIAJmv (ORCPT ); Fri, 1 Sep 2023 05:42:51 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D3611708; Fri, 1 Sep 2023 02:42:23 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.53]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4RcY184S2czrSMV; Fri, 1 Sep 2023 17:40:28 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:42:09 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 18/19] scsi: virtio_scsi: Add param to control LUN based error handle Date: Fri, 1 Sep 2023 17:41:26 +0800 Message-ID: <20230901094127.2010873-19-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add a new param lun_eh to control whether the LUN based error handler is enabled. Since virtio_scsi does not define any further reset callbacks, there is nothing to fall back to, so set up the LUN error handler with fallback set to 0.
Signed-off-by: Wenchao Hao --- drivers/scsi/virtio_scsi.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c index bd5633667d01..7bf4a34cdd20 100644 --- a/drivers/scsi/virtio_scsi.c +++ b/drivers/scsi/virtio_scsi.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include @@ -37,6 +38,10 @@ #define VIRTIO_SCSI_EVENT_LEN 8 #define VIRTIO_SCSI_VQ_BASE 2 +static bool lun_eh; +module_param(lun_eh, bool, 0444); +MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)"); + /* Command queue element */ struct virtio_scsi_cmd { struct scsi_cmnd *sc; @@ -679,9 +684,18 @@ static int virtscsi_device_alloc(struct scsi_device *sdevice) */ sdevice->sdev_bflags = BLIST_TRY_VPD_PAGES; + if (lun_eh) + return scsi_device_setup_eh(sdevice, 0); + return 0; } +static void virtscsi_device_destroy(struct scsi_device *sdevice) +{ + if (lun_eh) + scsi_device_clear_eh(sdevice); +} + /** * virtscsi_change_queue_depth() - Change a virtscsi target's queue depth @@ -757,7 +771,7 @@ static const struct scsi_host_template virtscsi_host_template = { .eh_device_reset_handler = virtscsi_device_reset, .eh_timed_out = virtscsi_eh_timed_out, .slave_alloc = virtscsi_device_alloc, - + .slave_destroy = virtscsi_device_destroy, .dma_boundary = UINT_MAX, .map_queues = virtscsi_map_queues, .track_queue_depth = 1, From patchwork Fri Sep 1 09:41:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wenchao Hao X-Patchwork-Id: 719852 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58EE0CA0FE1 for ; Fri, 1 Sep 2023 09:43:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232071AbjIAJn2 (ORCPT ); Fri, 1 Sep 2023 05:43:28
-0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1349043AbjIAJm7 (ORCPT ); Fri, 1 Sep 2023 05:42:59 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE8141BE3; Fri, 1 Sep 2023 02:42:30 -0700 (PDT) Received: from kwepemm600012.china.huawei.com (unknown [172.30.72.55]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4RcXyd20CjzhZJn; Fri, 1 Sep 2023 17:38:17 +0800 (CST) Received: from build.huawei.com (10.175.101.6) by kwepemm600012.china.huawei.com (7.193.23.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.31; Fri, 1 Sep 2023 17:42:10 +0800 From: Wenchao Hao To: "James E . J . Bottomley" , "Martin K . Petersen" , CC: Hannes Reinecke , , , , Wenchao Hao Subject: [RFC PATCH v2 19/19] scsi: iscsi_tcp: Add param to control LUN based error handle Date: Fri, 1 Sep 2023 17:41:27 +0800 Message-ID: <20230901094127.2010873-20-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20230901094127.2010873-1-haowenchao2@huawei.com> References: <20230901094127.2010873-1-haowenchao2@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.101.6] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600012.china.huawei.com (7.193.23.74) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add new param lun_eh to control if enable LUN based error handler, since iscsi_tcp defined callback eh_target_reset, so make it fallback to further recover when LUN based recovery can not recover all error commands. 
Signed-off-by: Wenchao Hao
---
 drivers/scsi/iscsi_tcp.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 9ab8555180a3..83474dc0ecd5 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -35,6 +35,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -63,6 +64,10 @@ module_param_named(debug_iscsi_tcp, iscsi_sw_tcp_dbg, int,
 MODULE_PARM_DESC(debug_iscsi_tcp, "Turn on debugging for iscsi_tcp module "
 		 "Set to 1 to turn on, and zero to turn off. Default is off.");
 
+static bool iscsi_sw_tcp_lun_eh;
+module_param_named(lun_eh, iscsi_sw_tcp_lun_eh, bool, 0444);
+MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)");
+
 #define ISCSI_SW_TCP_DBG(_conn, dbg_fmt, arg...)		\
 	do {								\
 		if (iscsi_sw_tcp_dbg)					\
@@ -1065,6 +1070,19 @@ static int iscsi_sw_tcp_slave_configure(struct scsi_device *sdev)
 	return 0;
 }
 
+static int iscsi_sw_tcp_slave_alloc(struct scsi_device *sdev)
+{
+	if (iscsi_sw_tcp_lun_eh)
+		return scsi_device_setup_eh(sdev, 1);
+	return 0;
+}
+
+static void iscsi_sw_tcp_slave_destroy(struct scsi_device *sdev)
+{
+	if (iscsi_sw_tcp_lun_eh)
+		scsi_device_clear_eh(sdev);
+}
+
 static const struct scsi_host_template iscsi_sw_tcp_sht = {
 	.module = THIS_MODULE,
 	.name = "iSCSI Initiator over TCP/IP",
@@ -1080,6 +1098,8 @@ static const struct scsi_host_template iscsi_sw_tcp_sht = {
 	.eh_target_reset_handler = iscsi_eh_recover_target,
 	.dma_boundary = PAGE_SIZE - 1,
 	.slave_configure = iscsi_sw_tcp_slave_configure,
+	.slave_alloc = iscsi_sw_tcp_slave_alloc,
+	.slave_destroy = iscsi_sw_tcp_slave_destroy,
 	.proc_name = "iscsi_tcp",
 	.this_id = -1,
 	.track_queue_depth = 1,
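
For readers who want to trace the control flow outside the kernel, the
alloc/destroy pairing used by both drivers can be modelled in plain
userspace C. This is only a rough sketch under stated assumptions: every
mock_* name and both struct layouts are hypothetical stand-ins, not the
kernel definitions, and reading the second argument of
scsi_device_setup_eh() as a "fall back to wider recovery" flag is inferred
from the commit message rather than confirmed by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-LUN error handler state. */
struct mock_device_eh {
	bool fallback;	/* assumed meaning: escalate if LUN recovery fails */
};

/* Hypothetical stand-in for struct scsi_device. */
struct mock_scsi_device {
	struct mock_device_eh *eh;	/* NULL selects host-wide handling */
};

/* Models the read-only (0444) module parameter; off by default. */
static bool lun_eh;

/* Models scsi_device_setup_eh(): attach a handler, return 0 on success. */
static int mock_setup_eh(struct mock_scsi_device *sdev, bool fallback)
{
	assert(sdev->eh == NULL);	/* alloc runs once per device */
	sdev->eh = calloc(1, sizeof(*sdev->eh));
	if (!sdev->eh)
		return -1;
	sdev->eh->fallback = fallback;
	return 0;
}

/* Models scsi_device_clear_eh(): detach and free the handler. */
static void mock_clear_eh(struct mock_scsi_device *sdev)
{
	free(sdev->eh);
	sdev->eh = NULL;
}

/* Mirrors the slave_alloc callbacks: opt in only when lun_eh is set. */
static int mock_slave_alloc(struct mock_scsi_device *sdev)
{
	if (lun_eh)
		return mock_setup_eh(sdev, true);
	return 0;
}

/* Mirrors the slave_destroy callbacks: undo whatever alloc attached. */
static void mock_slave_destroy(struct mock_scsi_device *sdev)
{
	if (lun_eh)
		mock_clear_eh(sdev);
}
```

With lun_eh left at its default the device keeps eh == NULL, which in the
series means the existing host-wide error handler stays in charge; with it
set, the handler lives exactly as long as the device.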