Message ID | 1605070685-20945-3-git-send-email-muneendra.kumar@broadcom.com |
---|---|
State | Superseded |
Series | scsi: Support to handle Intermittent errors |
On 11/11/20 5:58 AM, Muneendra wrote:
> Added a new optional routine eh_should_retry_cmd
> in scsi_host_template that allows the transport to decide if a
> cmd is retryable.Return true if the transport is in a state the
> cmd should be retried on.
>
> Added a new interface scsi_eh_should_retry_cmd which checks and
> calls the new routine eh_should_retry_cmd.
>
> Made changes in scmd_eh_abort_handler and scsi_eh_flush_done_q which
> calls the scsi_eh_should_retry_cmd to check whether the
> command needs to be retried.
>
> The above changes were done based on a patch by Mike Christie.
>
> Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
>
> ---
> v7:
> Added New routine in scsi_host_template to decide if a cmd is
> retryable instead of checking the same using SCMD_NORETRIES_ABORT
> bit as the cmd retry part can be checked by validating the port state.
>
> Moved the DID_TRANSPORT_MARGINAL changes to previous patch
> for reordering
>
> v6:
> Rearranged the patch by merging second hunk of the patch2 in v5
> to this patch
>
> v5:
> added the DID_TRANSPORT_MARGINAL case to
> scsi_decide_disposition
>
> v4:
> Modified the comments in the code appropriately
>
> v3:
> Merged first part of the previous patch(v2 patch3) with
> this patch.
>
> v2:
> set the hostbyte as DID_TRANSPORT_MARGINAL instead of
> DID_TRANSPORT_FAILFAST.
> ---
>  drivers/scsi/scsi_error.c | 17 +++++++++++++++--
>  include/scsi/scsi_host.h  |  6 ++++++
>  2 files changed, 21 insertions(+), 2 deletions(-)
>
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
--
Dr. Hannes Reinecke, Kernel Storage Architect
hare@suse.de, +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

On Wed, 2020-11-11 at 10:28 +0530, Muneendra wrote:
> Added a new optional routine eh_should_retry_cmd
> in scsi_host_template that allows the transport to decide if a
> cmd is retryable.Return true if the transport is in a state the
> cmd should be retried on.
>
> [...]

Reviewed-by: Ewan D. Milne <emilne@redhat.com>

> On Nov 10, 2020, at 10:58 PM, Muneendra <muneendra.kumar@broadcom.com> wrote:
>
> Added a new optional routine eh_should_retry_cmd
> in scsi_host_template that allows the transport to decide if a
> cmd is retryable.Return true if the transport is in a state the
> cmd should be retried on.
>
> [...]
> --
> 2.26.2
>

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani
Oracle Linux Engineering

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 28056ee498b3..1cdfa5a8ca09 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -124,6 +124,17 @@ static bool scsi_cmd_retry_allowed(struct scsi_cmnd *cmd)
 	return ++cmd->retries <= cmd->allowed;
 }
 
+static bool scsi_eh_should_retry_cmd(struct scsi_cmnd *cmd)
+{
+	struct scsi_device *sdev = cmd->device;
+	struct Scsi_Host *host = sdev->host;
+
+	if (host->hostt->eh_should_retry_cmd)
+		return host->hostt->eh_should_retry_cmd(cmd);
+
+	return true;
+}
+
 /**
  * scmd_eh_abort_handler - Handle command aborts
  * @work: command to be aborted.
@@ -159,7 +170,8 @@ scmd_eh_abort_handler(struct work_struct *work)
 						    "eh timeout, not retrying "
 						    "aborted command\n"));
 		} else if (!scsi_noretry_cmd(scmd) &&
-			   scsi_cmd_retry_allowed(scmd)) {
+			   scsi_cmd_retry_allowed(scmd) &&
+			   scsi_eh_should_retry_cmd(scmd)) {
 			SCSI_LOG_ERROR_RECOVERY(3,
 				scmd_printk(KERN_WARNING, scmd,
 					    "retry aborted command\n"));
@@ -2111,7 +2123,8 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
 		list_del_init(&scmd->eh_entry);
 		if (scsi_device_online(scmd->device) &&
-		    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd)) {
+		    !scsi_noretry_cmd(scmd) && scsi_cmd_retry_allowed(scmd) &&
+		    scsi_eh_should_retry_cmd(scmd)) {
 			SCSI_LOG_ERROR_RECOVERY(3,
 				scmd_printk(KERN_INFO, scmd,
 					     "%s: flush retry cmd\n",
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 701f178b20ae..e30fd963b97d 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -314,6 +314,12 @@ struct scsi_host_template {
 	 * Status: OPTIONAL
 	 */
 	enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
+	/*
+	 * Optional routine that allows the transport to decide if a cmd
+	 * is retryable. Return true if the transport is in a state the
+	 * cmd should be retried on.
+	 */
+	bool (*eh_should_retry_cmd)(struct scsi_cmnd *scmd);
 
 	/* This is an optional routine that allows transport to initiate
 	 * LLD adapter or firmware reset using sysfs attribute.

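For readers who want to experiment with the control flow outside the kernel, the retry decision in the hunks above can be modelled in plain userspace C. This is only a sketch: every `model_*` name is hypothetical and stands in for the kernel structures, and the `scsi_noretry_cmd()` and `scsi_device_online()` checks are left out. It shows two things visible in the diff: a missing hook defaults to "retry", and because the retry-count check sits before the hook in the `&&` chain, the counter is incremented even when the transport then declines the retry.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_cmd;

/* Hypothetical stand-in for struct scsi_host_template. */
struct model_host_template {
	/* Optional hook; NULL means the transport has no opinion. */
	bool (*eh_should_retry_cmd)(struct model_cmd *cmd);
};

/* Hypothetical stand-in for struct scsi_cmnd. */
struct model_cmd {
	const struct model_host_template *hostt;
	int retries;	/* retries attempted so far */
	int allowed;	/* maximum retries permitted */
};

/* Mirrors scsi_cmd_retry_allowed(): counts the attempt, then checks it. */
static bool model_cmd_retry_allowed(struct model_cmd *cmd)
{
	return ++cmd->retries <= cmd->allowed;
}

/* Mirrors scsi_eh_should_retry_cmd(): call the hook if present, else retry. */
static bool model_eh_should_retry_cmd(struct model_cmd *cmd)
{
	if (cmd->hostt->eh_should_retry_cmd)
		return cmd->hostt->eh_should_retry_cmd(cmd);

	return true;
}

/* The combined condition, in the same order as in the patched callers. */
static bool model_should_retry(struct model_cmd *cmd)
{
	return model_cmd_retry_allowed(cmd) && model_eh_should_retry_cmd(cmd);
}

/* Example hook for a transport that always vetoes the retry. */
static bool model_never_retry(struct model_cmd *cmd)
{
	(void)cmd;
	return false;
}
```

A host template that leaves the hook unset behaves exactly as before the patch, which is why existing drivers need no changes.
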
Added a new optional routine eh_should_retry_cmd
in scsi_host_template that allows the transport to decide if a
cmd is retryable. Return true if the transport is in a state the
cmd should be retried on.

Added a new interface scsi_eh_should_retry_cmd which checks for and
calls the new routine eh_should_retry_cmd.

Made changes in scmd_eh_abort_handler and scsi_eh_flush_done_q, which
now call scsi_eh_should_retry_cmd to check whether the
command needs to be retried.

The above changes were done based on a patch by Mike Christie.

Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>

---
v7:
Added a new routine in scsi_host_template to decide if a cmd is
retryable instead of checking the same using the SCMD_NORETRIES_ABORT
bit, as the cmd retry part can be checked by validating the port state.

Moved the DID_TRANSPORT_MARGINAL changes to the previous patch
for reordering.

v6:
Rearranged the patch by merging the second hunk of patch2 in v5
into this patch.

v5:
Added the DID_TRANSPORT_MARGINAL case to
scsi_decide_disposition.

v4:
Modified the comments in the code appropriately.

v3:
Merged the first part of the previous patch (v2 patch3) with
this patch.

v2:
Set the hostbyte as DID_TRANSPORT_MARGINAL instead of
DID_TRANSPORT_FAILFAST.
---
 drivers/scsi/scsi_error.c | 17 +++++++++++++++--
 include/scsi/scsi_host.h  |  6 ++++++
 2 files changed, 21 insertions(+), 2 deletions(-)
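
The v7 note says the retry decision "can be checked by validating the port state". This patch does not include a transport-side implementation, so the following is a hedged userspace sketch of what such a callback might look like. The `model_port_state` enum, the `model_rport` and `model_scmd` types, and the function name are all assumptions made for illustration, not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical port states; a real transport defines its own set. */
enum model_port_state {
	MODEL_PORT_ONLINE,
	MODEL_PORT_MARGINAL,
};

/* Hypothetical remote-port object owned by the transport. */
struct model_rport {
	enum model_port_state state;
};

/* Hypothetical command carrying a pointer to its remote port. */
struct model_scmd {
	struct model_rport *rport;
};

/*
 * Candidate eh_should_retry_cmd implementation: decline the retry while
 * the port is marginal, so the command is failed back to the upper
 * layers instead of being reissued on a path with intermittent errors.
 */
static bool model_transport_should_retry(struct model_scmd *scmd)
{
	return scmd->rport->state != MODEL_PORT_MARGINAL;
}
```

Keeping the decision in a callback, rather than in a per-command flag such as SCMD_NORETRIES_ABORT, lets the transport answer from its current port state at the moment error handling asks.
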