diff mbox series

[3/4] scsi: Internally retry scsi_execute commands

Message ID 20220810034155.20744-4-michael.christie@oracle.com
State New
Headers show
Series scsi: passthrough fixes/improvements | expand

Commit Message

Mike Christie Aug. 10, 2022, 3:41 a.m. UTC
In several places like LU setup and pr_ops we will hit the noretry code
path because we do not retry any passthrough commands that hit device
errors even though scsi-ml thinks the command is retryable.

This has us only fast fail commands that hit device errors that have been
submitted through the block layer directly and not via scsi_execute. This
allows SG IO and other users to continue to get fast failures and all
device errors returned to them, and scsi_execute users will get their
retries they had requested.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_error.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Mike Christie Aug. 10, 2022, 5:06 p.m. UTC | #1
On 8/10/22 5:46 AM, Martin Wilck wrote:
> Hi Mike,
> 
> On Tue, 2022-08-09 at 22:41 -0500, Mike Christie wrote:
>> In several places like LU setup and pr_ops we will hit the noretry
>> code
>> path because we do not retry any passthrough commands that hit device
>> errors even though scsi-ml thinks the command is retryable.
>>
>> This has us only fast fail commands that hit device errors that have
>> been
>> submitted through the block layer directly and not via scsi_execute.
>> This
>> allows SG IO and other users to continue to get fast failures and all
>> device errors returned to them, and scsi_execute users will get their
>> retries they had requested.
>>
>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> 
> Thanks a lot. I like the general approach, but I am concerned that by
> treating every command sent by scsi_execute() or scsi_execute_req() as
> a retryable command, we may break some callers, or at least modify
> their semantics in unexpected ways. For example, spi_execute(),
> scsi_probe_lun(), scsi_report_lun_scan() currently retry only on UA.
> With this change, these commands will be retried in additional cases,
> without the caller noticing (see 949bf797595fc ("[SCSI] fix command
> retries in spi_transport class"). It isn't obvious to me that this is
> correct in all affected cases. 
Let's make sure we are on the same page, because I might agree with you.
But for possible solutions we need to agree what this patch actually changes.

We currently have 3 places we get retries from:

1. scsi_decide_disposition - For passthrough commands the patch only changes
the behavior when scsi_decide_disposition gets NEED_RETRY, retries < allowed,
and REQ_FAILFAST_DEV is not set.

It's really specific and not as general as I think you thought it was.

2. scsi_io_completion - Passthrough commands are never retried here.

3. scsi_execute users driving retries.

For your examples above:
 
- The scan/probe functions ask for 3 retries and so with patch1 we will now
get 3 x 3 retries for errors that hit #1.

So I agree this is really wrong for DID_TIME_OUT.

- There is no behavior change for spi because it uses REQ_FAILFAST_DEV.

> 
> Note that the retry logic of the mid level may change depending on the
> installed device handler. For example, ALUA devices will endlessly
> retry UA with ASC=29, which some callers explicitly look for.

There is no behavior change with my patch for ASC=29 case, because it
uses ADD_TO_MLQUEUE and we don't run scsi_noretry_cmd for that error.

It could change how 0x04/0x0a is handled because it uses NEEDS_RETRY.
However, scsi_dh_alua uses REQ_FAILFAST_DEV so we do not retry in
scsi_noretry_cmd like before.


> 
> I believe we need to review all callers that have their own retry loop
> and /or error handling logic. This includes sd_spinup_disk(),
> sd_sync_cache(), scsi_test_unit_ready(). SCSI device handlers treat
> some sense keys in special ways, or may retry commands on another
> member of a port group (see alua_rtpg()).

There is no change in behavior for the alua one but agree with the general
comment.

> 
> DID_TIME_OUT is a general concern - no current caller of scsi_execute()
> expects timed-out commands to be retried, and doing so has the
> potential of slowing down operations a lot. I am aware that my recent
> patch changed exactly this for scsi_probe_lun(), but doing the same
> thing for every scsi_execute() invocation sounds at least somewhat
> dangerous.

Agree this patch is wrong:
- With patch1 that fixes scsi_cmnd->allowed we can end up with N and M
DID_TIME_OUT retries and that can get crazy. So agree with that.

- For the general idea of do we always want to retry DID_TIME_OUT, I can
I also see your point.

- After reading your mail and thinking about patch 4, I was thinking that
this is wrong for patch 4 as well. For the pr_ops case we want the opposite
of what you were mentioning in here. I actually want scsi-ml to retry
all UAs for me or allow me to retry all UAs and not just handle the specific
PGR related ones.

I'm not sure where to go from here:

1. Just have the callers drive extra retries like we do now.

I guess I've come around and are ok with this.

With patch1, scsi_cmnd->allowed and scsi_execute works for me, so my
previous comment about scsi_probe_lun needing to retry that case does
not apply since scsi-ml will do it for me.

I do think we need to retry your case in other places though.

2. Instead of trying to make it general for all scsi_execute_users, we can
add SCMD bits for specific cases like DID_TIME_OUT or a SCMD bit that tells
scsi_noretry_cmd to not always fail passthrough commands just because they
are passthrough. It would work the opposite of the FASTFAIL bits where instead
of failing fast, we retry.

I think because the cases scsi_noretry_cmd is used for are really specific cases
(scsi_decide_disposition sees NEEDS_RETRY, retries < allowed, and REQ_FAILFAST_DEV
is not set) that might not be very useful. I'm not sure we want to add a bunch of
cases specific to scsi_execute callers to scsi_check_sense? I don't think we do.
Martin Wilck Aug. 11, 2022, 9:56 a.m. UTC | #2
On Wed, 2022-08-10 at 12:06 -0500, Mike Christie wrote:
> On 8/10/22 5:46 AM, Martin Wilck wrote:
> > Hi Mike,
> > 
> > On Tue, 2022-08-09 at 22:41 -0500, Mike Christie wrote:
> > > In several places like LU setup and pr_ops we will hit the
> > > noretry
> > > code
> > > path because we do not retry any passthrough commands that hit
> > > device
> > > errors even though scsi-ml thinks the command is retryable.
> > > 
> > > This has us only fast fail commands that hit device errors that
> > > have
> > > been
> > > submitted through the block layer directly and not via
> > > scsi_execute.
> > > This
> > > allows SG IO and other users to continue to get fast failures and
> > > all
> > > device errors returned to them, and scsi_execute users will get
> > > their
> > > retries they had requested.
> > > 
> > > Signed-off-by: Mike Christie <michael.christie@oracle.com>
> > 
> > Thanks a lot. I like the general approach, but I am concerned that
> > by
> > treating every command sent by scsi_execute() or scsi_execute_req()
> > as
> > a retryable command, we may break some callers, or at least modify
> > their semantics in unexpected ways. For example, spi_execute(),
> > scsi_probe_lun(), scsi_report_lun_scan() currently retry only on
> > UA.
> > With this change, these commands will be retried in additional
> > cases,
> > without the caller noticing (see 949bf797595fc ("[SCSI] fix command
> > retries in spi_transport class"). It isn't obvious to me that this
> > is
> > correct in all affected cases. 
> Let's make sure we are on the same page, because I might agree with
> you.
> But for possible solutions we need to agree what this patch actually
> changes.
> 
> We currently have 3 places we get retries from:
> 
> 1. scsi_decide_disposition - For passthrough commands the patch only
> changes
> the behavior when scsi_decide_disposition gets NEED_RETRY, retries <
> allowed,
> and REQ_FAILFAST_DEV is not set.
> 
> It's really specific and not as general as I think you thought it
> was.

I was aware of it, but I confused matters for some cases :-/

> 2. scsi_io_completion - Passthrough commands are never retried here.
> 
> 3. scsi_execute users driving retries.

There's also the error handler retrying commands e.g. after a
successful abort. This is the DID_TIME_OUT case, actually -
scsi_decide_disposition() never calls scsi_noretry_cmd() for
DID_TIME_OUT.

> For your examples above:
>  
> - The scan/probe functions ask for 3 retries and so with patch1 we
> will now
> get 3 x 3 retries for errors that hit #1.
> 
> So I agree this is really wrong for DID_TIME_OUT.
> 
> - There is no behavior change for spi because it uses
> REQ_FAILFAST_DEV.

Overlooked that, thanks.

> > 
> > Note that the retry logic of the mid level may change depending on
> > the
> > installed device handler. For example, ALUA devices will endlessly
> > retry UA with ASC=29, which some callers explicitly look for.
> 
> There is no behavior change with my patch for ASC=29 case, because it
> uses ADD_TO_MLQUEUE and we don't run scsi_noretry_cmd for that error.

Right, thanks for pointing that out. I saw it but didn't realize what
it meant.

> It could change how 0x04/0x0a is handled because it uses NEEDS_RETRY.
> However, scsi_dh_alua uses REQ_FAILFAST_DEV so we do not retry in
> scsi_noretry_cmd like before.

Not quite following you here - alua_check_sense() is called for any
command, not just those submitted from the ALUA code.
> 

> > I believe we need to review all callers that have their own retry
> > loop
> > and /or error handling logic. This includes sd_spinup_disk(),
> > sd_sync_cache(), scsi_test_unit_ready(). SCSI device handlers treat
> > some sense keys in special ways, or may retry commands on another
> > member of a port group (see alua_rtpg()).
> 
> There is no change in behavior for the alua one but agree with the
> general
> comment.
> 
> > 
> > DID_TIME_OUT is a general concern - no current caller of
> > scsi_execute()
> > expects timed-out commands to be retried, and doing so has the
> > potential of slowing down operations a lot. I am aware that my
> > recent
> > patch changed exactly this for scsi_probe_lun(), but doing the same
> > thing for every scsi_execute() invocation sounds at least somewhat
> > dangerous.
> 
> Agree this patch is wrong:
> - With patch1 that fixes scsi_cmnd->allowed we can end up with N and
> M
> DID_TIME_OUT retries and that can get crazy. So agree with that.
> 
> - For the general idea of do we always want to retry DID_TIME_OUT, I
> can
> I also see your point.
> 
> - After reading your mail and thinking about patch 4, I was thinking
> that
> this is wrong for patch 4 as well. For the pr_ops case we want the
> opposite
> of what you were mentioning in here. I actually want scsi-ml to retry
> all UAs for me or allow me to retry all UAs and not just handle the
> specific
> PGR related ones.
> 
> I'm not sure where to go from here:
> 
> 1. Just have the callers drive extra retries like we do now.
> 
> I guess I've come around and are ok with this.

Me too, but I'm not sure about Christoph.

> 
> With patch1, scsi_cmnd->allowed and scsi_execute works for me, so my
> previous comment about scsi_probe_lun needing to retry that case does
> not apply since scsi-ml will do it for me.
> 
> I do think we need to retry your case in other places though.
> 
> 2. Instead of trying to make it general for all scsi_execute_users,
> we can
> add SCMD bits for specific cases like DID_TIME_OUT or a SCMD bit that
> tells
> scsi_noretry_cmd to not always fail passthrough commands just because
> they
> are passthrough. It would work the opposite of the FASTFAIL bits
> where instead
> of failing fast, we retry.
> 
> I think because the cases scsi_noretry_cmd is used for are really
> specific cases
> (scsi_decide_disposition sees NEEDS_RETRY, retries < allowed, and
> REQ_FAILFAST_DEV
> is not set) that might not be very useful. 

I don't think it's _that_ speficic. (retries < allowed) is the default
case, at least for the first failure. REQ_FAILFAST_DEV has very few
users except for the device handlers, and NEEDS_RETRY is a rather
frequently used disposition.

> I'm not sure we want to add a bunch of
> cases specific to scsi_execute callers to scsi_check_sense? I don't
> think we do.

If we don't want to fall back to 1. above, I think we can go a long way
with two additional flags for scsi_execute() callers:

 a) indicate whether DID_TIME_OUT should be retried by the error
handler (default: no),
 b) indicate whether UA (and NOT READY?) should be retried on the ML
(default: current behavior)

The rationale for b) is that (IIRC from yesterday's code inspection all
callers that check sense codes only define special cases for UA, or
some subcases of it.

Maybe we need also

 c) indicate whether the (don't) retry logic for SG_IO should be used.

> To be clear, I mean for this approach to be generally useful we would
> have to
> add the cases scsi_execute callers are retrying in scsi_check_sense.
> We could
> then remove the scsi_execute caller driven retries and only have
> scsi_decide_disposition
> retry.
> 
> It sounded really nice at first, but to do that I would have a bunch
> of cases
> that might be specific to pr_ops, alua or scanning, etc. Then I also
> had cases where
> user1 does not want to retry but user2 did, so I have to add more
> SCMD bits. So, in
> the end it will get messier than you initially think.
> 

I agree. I thought about passing a lookup table of sense keys (not) to
be retried to the ML, but it feels over-engineered to me. 

Thanks
Martin
Christoph Hellwig Aug. 11, 2022, 12:27 p.m. UTC | #3
On Wed, Aug 10, 2022 at 12:06:41PM -0500, Mike Christie wrote:
> 2. Instead of trying to make it general for all scsi_execute_users, we can
> add SCMD bits for specific cases like DID_TIME_OUT or a SCMD bit that tells
> scsi_noretry_cmd to not always fail passthrough commands just because they
> are passthrough. It would work the opposite of the FASTFAIL bits where instead
> of failing fast, we retry.

Yes, I think this is closer to what I'd like to see.   Although I wonder
if we should turn it around and require the FAILFAST bits to opt out of
automatic retries even for passthrough, even if that turns into a major
audit.
Christoph Hellwig Aug. 11, 2022, 12:28 p.m. UTC | #4
On Wed, Aug 10, 2022 at 12:38:08PM -0500, Mike Christie wrote:
> It sounded really nice at first, but to do that I would have a bunch of
> cases that might be specific to pr_ops, alua or scanning, etc. Then I
> also had cases where user1 does not want to retry but user2 did, so I
> have to add more SCMD bits. So, in the end it will get messier than
> you initially think.

Do you have an overview of the different cases you've spot so far?
Mike Christie Aug. 11, 2022, 4:15 p.m. UTC | #5
On 8/11/22 4:56 AM, Martin Wilck wrote:
>> It could change how 0x04/0x0a is handled because it uses NEEDS_RETRY.
>> However, scsi_dh_alua uses REQ_FAILFAST_DEV so we do not retry in
>> scsi_noretry_cmd like before.
> 
> Not quite following you here - alua_check_sense() is called for any
> command, not just those submitted from the ALUA code.

Ah, I thought you had mentioned alua above because of your alua_rtpg
example. Above I was saying that there was no behavior change for your
alua_rtpg example because it uses REQ_FAILFAST_DEV.


>> 2. Instead of trying to make it general for all scsi_execute_users,
>> we can
>> add SCMD bits for specific cases like DID_TIME_OUT or a SCMD bit that
>> tells
>> scsi_noretry_cmd to not always fail passthrough commands just because
>> they
>> are passthrough. It would work the opposite of the FASTFAIL bits
>> where instead
>> of failing fast, we retry.
>>
>> I think because the cases scsi_noretry_cmd is used for are really
>> specific cases
>> (scsi_decide_disposition sees NEEDS_RETRY, retries < allowed, and
>> REQ_FAILFAST_DEV
>> is not set) that might not be very useful. 
> 
> I don't think it's _that_ speficic. (retries < allowed) is the default
> case, at least for the first failure. REQ_FAILFAST_DEV has very few
> users except for the device handlers, and NEEDS_RETRY is a rather
> frequently used disposition.
I'm saying it's really specific because we only hit this code
path that is causing issues when scsi_check_sense returns NEEDS_RETRY.
There's 5 in there and one in scsi_dh_alua. 4 of them are UAs.

Compared to all the sense errors that we check for in the
scsi_execute callers and including all the times they do a retry for
all errors the 5 cases in scsi_check_sense seemed really specific.

Let me send a patch for this type of design because in the other mail
Christoph was asking for more details. I originally started going that
route so it won't be too much trouble to do a RFC so we can get an
idea of what it will look like.
Mike Christie Aug. 11, 2022, 4:19 p.m. UTC | #6
On 8/11/22 7:28 AM, Christoph Hellwig wrote:
> On Wed, Aug 10, 2022 at 12:38:08PM -0500, Mike Christie wrote:
>> It sounded really nice at first, but to do that I would have a bunch of
>> cases that might be specific to pr_ops, alua or scanning, etc. Then I
>> also had cases where user1 does not want to retry but user2 did, so I
>> have to add more SCMD bits. So, in the end it will get messier than
>> you initially think.
> 
> Do you have an overview of the different cases you've spot so far?

I'll finish up what I was working on for this type of design and I'll
post it as a rough RFC, so it should be easier to discuss in code format.

I'm going to break out the first patch, handle your comment on it, and
send that separately since it's a regression fix and not related to this
discussion.
Martin Wilck Aug. 11, 2022, 5:02 p.m. UTC | #7
On Thu, 2022-08-11 at 11:15 -0500, Mike Christie wrote:
> > 
> > I don't think it's _that_ speficic. (retries < allowed) is the
> > default
> > case, at least for the first failure. REQ_FAILFAST_DEV has very few
> > users except for the device handlers, and NEEDS_RETRY is a rather
> > frequently used disposition.
> I'm saying it's really specific because we only hit this code
> path that is causing issues when scsi_check_sense returns
> NEEDS_RETRY.

What about the other cases in scsi_decide_disposition() that jump to
maybe_retry?


> There's 5 in there and one in scsi_dh_alua. 4 of them are UAs.
> 
> Compared to all the sense errors that we check for in the
> scsi_execute callers and including all the times they do a retry for
> all errors the 5 cases in scsi_check_sense seemed really specific.
> 
> Let me send a patch for this type of design because in the other mail
> Christoph was asking for more details. I originally started going
> that
> route so it won't be too much trouble to do a RFC so we can get an
> idea of what it will look like.

Looking forward to it.

Martin
Mike Christie Aug. 11, 2022, 6:22 p.m. UTC | #8
On 8/11/22 12:02 PM, Martin Wilck wrote:
> On Thu, 2022-08-11 at 11:15 -0500, Mike Christie wrote:
>>>
>>> I don't think it's _that_ speficic. (retries < allowed) is the
>>> default
>>> case, at least for the first failure. REQ_FAILFAST_DEV has very few
>>> users except for the device handlers, and NEEDS_RETRY is a rather
>>> frequently used disposition.
>> I'm saying it's really specific because we only hit this code
>> path that is causing issues when scsi_check_sense returns
>> NEEDS_RETRY.
> 
> What about the other cases in scsi_decide_disposition() that jump to
> maybe_retry?

Ok one exception to what I wrote. DID_TIMEOUT hits the
blk_rq_is_passthrough test of course :)

The rest hit that check condition check and are retried like normal IO:

scsi_noretry_cmd()
.....

	switch (host_byte(scmd->result)) {
...

        if (!scsi_status_is_check_condition(scmd->result))
                return false;

We get to that blk_rq_is_passthrough for DID_TIME_OUT because it
has a goto to bypass the above test. The other host errors we return
false above and retry.
diff mbox series

Patch

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ac4471e33a9c..573d926220c4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1806,7 +1806,8 @@  bool scsi_noretry_cmd(struct scsi_cmnd *scmd)
 	 * assume caller has checked sense and determined
 	 * the check condition was retryable.
 	 */
-	if (req->cmd_flags & REQ_FAILFAST_DEV || blk_rq_is_passthrough(req))
+	if (req->cmd_flags & REQ_FAILFAST_DEV ||
+	    scmd->submitter == SUBMITTED_BY_BLOCK_PT)
 		return true;
 
 	return false;