diff mbox series

scsi: do not retry FAILFAST commands on DID_TRANSPORT_DISRUPTED

Message ID 20210126130212.47998-1-hare@suse.de
State New
Headers show
Series scsi: do not retry FAILFAST commands on DID_TRANSPORT_DISRUPTED | expand

Commit Message

Hannes Reinecke Jan. 26, 2021, 1:02 p.m. UTC
When a command is return with DID_TRANSPORT_DISRUPTED we should
be looking at the REQ_FAILFAST_TRANSPORT flag and do not retry
the command if set.
Otherwise multipath will be requeuing a command on the failed
path and not fail it over to one of the working paths.

Cc: Martin Wilck <martin.wilck@suse.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/scsi_error.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Mike Christie Jan. 28, 2021, 12:51 a.m. UTC | #1
On 1/26/21 7:02 AM, Hannes Reinecke wrote:
> When a command is return with DID_TRANSPORT_DISRUPTED we should

> be looking at the REQ_FAILFAST_TRANSPORT flag and do not retry

> the command if set.

> Otherwise multipath will be requeuing a command on the failed

> path and not fail it over to one of the working paths.

> 

> Cc: Martin Wilck <martin.wilck@suse.com>

> Signed-off-by: Hannes Reinecke <hare@suse.com>

> ---

>   drivers/scsi/scsi_error.c | 1 +

>   1 file changed, 1 insertion(+)

> 

> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c

> index a52665eaf288..005118385b70 100644

> --- a/drivers/scsi/scsi_error.c

> +++ b/drivers/scsi/scsi_error.c

> @@ -1753,6 +1753,7 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)

>   	case DID_TIME_OUT:

>   		goto check_type;

>   	case DID_BUS_BUSY:

> +	case DID_TRANSPORT_DISRUPTED:

>   		return (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT);

>   	case DID_PARITY:

>   		return (scmd->request->cmd_flags & REQ_FAILFAST_DEV);


We don't fast fail for that error code to avoid churn for transient 
transport problems. The FC and iscsi drivers block the rport/session, 
return that code and then it's up the fast_io_fail/replacement timeout.
Hannes Reinecke Jan. 28, 2021, 6:57 a.m. UTC | #2
On 1/28/21 1:51 AM, michael.christie@oracle.com wrote:
> On 1/26/21 7:02 AM, Hannes Reinecke wrote:

>> When a command is return with DID_TRANSPORT_DISRUPTED we should

>> be looking at the REQ_FAILFAST_TRANSPORT flag and do not retry

>> the command if set.

>> Otherwise multipath will be requeuing a command on the failed

>> path and not fail it over to one of the working paths.

>>

>> Cc: Martin Wilck <martin.wilck@suse.com>

>> Signed-off-by: Hannes Reinecke <hare@suse.com>

>> ---

>>   drivers/scsi/scsi_error.c | 1 +

>>   1 file changed, 1 insertion(+)

>>

>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c

>> index a52665eaf288..005118385b70 100644

>> --- a/drivers/scsi/scsi_error.c

>> +++ b/drivers/scsi/scsi_error.c

>> @@ -1753,6 +1753,7 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)

>>       case DID_TIME_OUT:

>>           goto check_type;

>>       case DID_BUS_BUSY:

>> +    case DID_TRANSPORT_DISRUPTED:

>>           return (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT);

>>       case DID_PARITY:

>>           return (scmd->request->cmd_flags & REQ_FAILFAST_DEV);

> 

> We don't fast fail for that error code to avoid churn for transient 

> transport problems. The FC and iscsi drivers block the rport/session, 

> return that code and then it's up the fast_io_fail/replacement timeout.

> 

_But_ if prevents that command to be failed over to another path, so 
essentially we're blocking execution until fast_io_fail tmo.

For no good reason as we have other paths available.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
Ewan Milne Jan. 28, 2021, 1:40 p.m. UTC | #3
On Thu, 2021-01-28 at 07:57 +0100, Hannes Reinecke wrote:
> On 1/28/21 1:51 AM, michael.christie@oracle.com wrote:

> > On 1/26/21 7:02 AM, Hannes Reinecke wrote:

> > > When a command is return with DID_TRANSPORT_DISRUPTED we should

> > > be looking at the REQ_FAILFAST_TRANSPORT flag and do not retry

> > > the command if set.

> > > Otherwise multipath will be requeuing a command on the failed

> > > path and not fail it over to one of the working paths.

> > > 

> > > Cc: Martin Wilck <martin.wilck@suse.com>

> > > Signed-off-by: Hannes Reinecke <hare@suse.com>

> > > ---

> > >   drivers/scsi/scsi_error.c | 1 +

> > >   1 file changed, 1 insertion(+)

> > > 

> > > diff --git a/drivers/scsi/scsi_error.c

> > > b/drivers/scsi/scsi_error.c

> > > index a52665eaf288..005118385b70 100644

> > > --- a/drivers/scsi/scsi_error.c

> > > +++ b/drivers/scsi/scsi_error.c

> > > @@ -1753,6 +1753,7 @@ int scsi_noretry_cmd(struct scsi_cmnd

> > > *scmd)

> > >       case DID_TIME_OUT:

> > >           goto check_type;

> > >       case DID_BUS_BUSY:

> > > +    case DID_TRANSPORT_DISRUPTED:

> > >           return (scmd->request->cmd_flags &

> > > REQ_FAILFAST_TRANSPORT);

> > >       case DID_PARITY:

> > >           return (scmd->request->cmd_flags & REQ_FAILFAST_DEV);

> > 

> > We don't fast fail for that error code to avoid churn for

> > transient 

> > transport problems. The FC and iscsi drivers block the

> > rport/session, 

> > return that code and then it's up the fast_io_fail/replacement

> > timeout.

> > 

> 

> _But_ if prevents that command to be failed over to another path, so 

> essentially we're blocking execution until fast_io_fail tmo.

> 

> For no good reason as we have other paths available.


And if we don't?  Not everybody sets queue_if_no_path, right?

-Ewan
 
> 

> Cheers,

> 

> Hannes
Mike Christie Jan. 28, 2021, 6:01 p.m. UTC | #4
On 1/28/21 12:57 AM, Hannes Reinecke wrote:
> On 1/28/21 1:51 AM, michael.christie@oracle.com wrote:

>> On 1/26/21 7:02 AM, Hannes Reinecke wrote:

>>> When a command is return with DID_TRANSPORT_DISRUPTED we should

>>> be looking at the REQ_FAILFAST_TRANSPORT flag and do not retry

>>> the command if set.

>>> Otherwise multipath will be requeuing a command on the failed

>>> path and not fail it over to one of the working paths.

>>>

>>> Cc: Martin Wilck <martin.wilck@suse.com>

>>> Signed-off-by: Hannes Reinecke <hare@suse.com>

>>> ---

>>>   drivers/scsi/scsi_error.c | 1 +

>>>   1 file changed, 1 insertion(+)

>>>

>>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c

>>> index a52665eaf288..005118385b70 100644

>>> --- a/drivers/scsi/scsi_error.c

>>> +++ b/drivers/scsi/scsi_error.c

>>> @@ -1753,6 +1753,7 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)

>>>       case DID_TIME_OUT:

>>>           goto check_type;

>>>       case DID_BUS_BUSY:

>>> +    case DID_TRANSPORT_DISRUPTED:

>>>           return (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT);

>>>       case DID_PARITY:

>>>           return (scmd->request->cmd_flags & REQ_FAILFAST_DEV);

>>

>> We don't fast fail for that error code to avoid churn for transient transport problems. The FC and iscsi drivers block the rport/session, return that code and then it's up the fast_io_fail/replacement timeout.

>>

> _But_ if prevents that command to be failed over to another path, so essentially we're blocking execution until fast_io_fail tmo.

> 

> For no good reason as we have other paths available.


It is for a good reason. Causing a failover to another path can lead
to other resources being shifted and putting the system out of
balance. For active/active it's most likely fine, but for active
passive targets and it might be better to wait a second or two to see
if we can recover the optimal/preferred path.

For iscsi if the user has set fast_io_fail tmo > 0 then they want the
delayed behavior. If the user wants it failed immediately they set
fast_io_tmo (iscsi replacement timeout) to zero. Note for fc you would
need to add code to do the same since zero means off.
 
The other issue is that your patch only solves part of the problem.
It would prevent the IO that went through the above code path and new IO
from waiting for fast_io_fail. It does not handle the other IO that would
be caught between dm-multipath and the driver when the rport/session block
completes/starts. In the end the app is probably going to get stuck waiting
for fast io fail tmo to fire.

I think for both reasons, you want to just make it so the user can set
fast io fail tmo in a way that some value means fail immediately or do
something completely new.
Mike Christie Jan. 28, 2021, 6:21 p.m. UTC | #5
On 1/28/21 12:01 PM, Mike Christie wrote:
> On 1/28/21 12:57 AM, Hannes Reinecke wrote:

>> On 1/28/21 1:51 AM, michael.christie@oracle.com wrote:

>>> On 1/26/21 7:02 AM, Hannes Reinecke wrote:

>>>> When a command is return with DID_TRANSPORT_DISRUPTED we should

>>>> be looking at the REQ_FAILFAST_TRANSPORT flag and do not retry

>>>> the command if set.

>>>> Otherwise multipath will be requeuing a command on the failed

>>>> path and not fail it over to one of the working paths.

>>>>

>>>> Cc: Martin Wilck <martin.wilck@suse.com>

>>>> Signed-off-by: Hannes Reinecke <hare@suse.com>

>>>> ---

>>>>   drivers/scsi/scsi_error.c | 1 +

>>>>   1 file changed, 1 insertion(+)

>>>>

>>>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c

>>>> index a52665eaf288..005118385b70 100644

>>>> --- a/drivers/scsi/scsi_error.c

>>>> +++ b/drivers/scsi/scsi_error.c

>>>> @@ -1753,6 +1753,7 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd)

>>>>       case DID_TIME_OUT:

>>>>           goto check_type;

>>>>       case DID_BUS_BUSY:

>>>> +    case DID_TRANSPORT_DISRUPTED:

>>>>           return (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT);

>>>>       case DID_PARITY:

>>>>           return (scmd->request->cmd_flags & REQ_FAILFAST_DEV);

>>>

>>> We don't fast fail for that error code to avoid churn for transient transport problems. The FC and iscsi drivers block the rport/session, return that code and then it's up the fast_io_fail/replacement timeout.

>>>

>> _But_ if prevents that command to be failed over to another path, so essentially we're blocking execution until fast_io_fail tmo.

>>

>> For no good reason as we have other paths available.

> 

> It is for a good reason. Causing a failover to another path can lead

> to other resources being shifted and putting the system out of

> balance. For active/active it's most likely fine, but for active

> passive targets and it might be better to wait a second or two to see

> if we can recover the optimal/preferred path.

> 

> For iscsi if the user has set fast_io_fail tmo > 0 then they want the

> delayed behavior. If the user wants it failed immediately they set

> fast_io_tmo (iscsi replacement timeout) to zero. Note for fc you would

> need to add code to do the same since zero means off.

>  

> The other issue is that your patch only solves part of the problem.

> It would prevent the IO that went through the above code path and new IO

> from waiting for fast_io_fail. It does not handle the other IO that would

> be caught between dm-multipath and the driver when the rport/session block

> completes/starts. In the end the app is probably going to get stuck waiting

> for fast io fail tmo to fire.

> 

> I think for both reasons, you want to just make it so the user can set

> fast io fail tmo in a way that some value means fail immediately or do

> something completely new.


For that last part, I think in the short term we could just fix up 
csi_transport_fc to work like iscsi, but for the longer term to handle
cases like you have N>1 paths in a multipath group and are doing
active/active across them but then have another group where you have to do
active/passive over, then we could do something new.

I think we have what we have now, because back when it was made we couldn't
tell dm-multipath what the error was, so we had to handle it in both layers
like this.

Now, we could fail immediately, not block the queues so IO would not get stuck
in there, and then dm-multipath and multipathd could do something smart with
the error since they know the reason it was failed, and the path and path group
layout, and have more info on the device like if it is active passive. For compat
and to support what both of us need, we could have a timeout/setting for paths
within a path group and one for switching path groups. We could also do different
behaviors of the error was DID_TRANSPORT_DISRUPTED vs DID_TRANSPORT_FAILFAST
since for the latter the transport timers have fired.
diff mbox series

Patch

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index a52665eaf288..005118385b70 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1753,6 +1753,7 @@  int scsi_noretry_cmd(struct scsi_cmnd *scmd)
 	case DID_TIME_OUT:
 		goto check_type;
 	case DID_BUS_BUSY:
+	case DID_TRANSPORT_DISRUPTED:
 		return (scmd->request->cmd_flags & REQ_FAILFAST_TRANSPORT);
 	case DID_PARITY:
 		return (scmd->request->cmd_flags & REQ_FAILFAST_DEV);