[3/3] fnic: check for started requests in fnic_wq_copy_cleanup_handler()

Message ID 20210429122517.39659-4-hare@suse.de
State New
Headers show
Series
  • fnic: use scsi_host_busy_iter() to traverse commands
Related show

Commit Message

Hannes Reinecke April 29, 2021, 12:25 p.m.
fnic_wq_copy_cleanup_handler() is using scsi_host_find_tag() to
map id to a scsi command. However, as per discussion on the mailinglist
scsi_host_find_tag() might return a non-started request, so we need
to check the returned command with blk_mq_request_started() to avoid
the function tripping over a non-initialized command.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/fnic/fnic_scsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Martin Wilck May 4, 2021, 7:49 a.m. | #1
On Thu, 2021-04-29 at 19:28 +0200, Hannes Reinecke wrote:
> On 4/29/21 4:34 PM, Ming Lei wrote:

> > On Thu, Apr 29, 2021 at 02:25:17PM +0200, Hannes Reinecke wrote:

> > > fnic_wq_copy_cleanup_handler() is using scsi_host_find_tag() to

> > > map id to a scsi command. However, as per discussion on the

> > > mailinglist

> > > scsi_host_find_tag() might return a non-started request, so we

> > > need

> > > to check the returned command with blk_mq_request_started() to

> > > avoid

> > > the function tripping over a non-initialized command.

> > > 

> > > Signed-off-by: Hannes Reinecke <hare@suse.de>

> > > ---

> > >   drivers/scsi/fnic/fnic_scsi.c | 2 +-

> > >   1 file changed, 1 insertion(+), 1 deletion(-)

> > > 

> > > diff --git a/drivers/scsi/fnic/fnic_scsi.c

> > > b/drivers/scsi/fnic/fnic_scsi.c

> > > index 762cc8bd2653..b9fd3d87416b 100644

> > > --- a/drivers/scsi/fnic/fnic_scsi.c

> > > +++ b/drivers/scsi/fnic/fnic_scsi.c

> > > @@ -1466,7 +1466,7 @@ void fnic_wq_copy_cleanup_handler(struct

> > > vnic_wq_copy *wq,

> > >                 return;

> > >   

> > >         sc = scsi_host_find_tag(fnic->lport->host, id);

> > > -       if (!sc)

> > > +       if (!sc || !blk_mq_request_started(sc->request))

> > >                 return;

> > 

> > scsi_host_find_tag() has covered blk_mq_request_started check

> > already, so

> > this patch isn't necessary.

> > 

> Right. So drop it, then.


While you are at this, could you please re-consider e73a5e8e8003
("scsi: core: Only return started requests from scsi_host_find_tag()")
in general?

I have come to think that commit is incorrect. It was created as an
attempt to fix the observed fnic crashes, but it was ineffective. The
crashes were eventually fixed by patch 2/3 of this series. 

IMO scsi_host_find_tag() should return a request if there is one,
regardless whether or not it's started, and leave the decision to
ignore pending requests to the caller, like it used to be until v5.8.

Regards
Martin
Ming Lei May 4, 2021, 8:06 a.m. | #2
On Tue, May 04, 2021 at 09:49:12AM +0200, Martin Wilck wrote:
> On Thu, 2021-04-29 at 19:28 +0200, Hannes Reinecke wrote:

> > On 4/29/21 4:34 PM, Ming Lei wrote:

> > > On Thu, Apr 29, 2021 at 02:25:17PM +0200, Hannes Reinecke wrote:

> > > > fnic_wq_copy_cleanup_handler() is using scsi_host_find_tag() to

> > > > map id to a scsi command. However, as per discussion on the

> > > > mailinglist

> > > > scsi_host_find_tag() might return a non-started request, so we

> > > > need

> > > > to check the returned command with blk_mq_request_started() to

> > > > avoid

> > > > the function tripping over a non-initialized command.

> > > > 

> > > > Signed-off-by: Hannes Reinecke <hare@suse.de>

> > > > ---

> > > >   drivers/scsi/fnic/fnic_scsi.c | 2 +-

> > > >   1 file changed, 1 insertion(+), 1 deletion(-)

> > > > 

> > > > diff --git a/drivers/scsi/fnic/fnic_scsi.c

> > > > b/drivers/scsi/fnic/fnic_scsi.c

> > > > index 762cc8bd2653..b9fd3d87416b 100644

> > > > --- a/drivers/scsi/fnic/fnic_scsi.c

> > > > +++ b/drivers/scsi/fnic/fnic_scsi.c

> > > > @@ -1466,7 +1466,7 @@ void fnic_wq_copy_cleanup_handler(struct

> > > > vnic_wq_copy *wq,

> > > >                 return;

> > > >   

> > > >         sc = scsi_host_find_tag(fnic->lport->host, id);

> > > > -       if (!sc)

> > > > +       if (!sc || !blk_mq_request_started(sc->request))

> > > >                 return;

> > > 

> > > scsi_host_find_tag() has covered blk_mq_request_started check

> > > already, so

> > > this patch isn't necessary.

> > > 

> > Right. So drop it, then.

> 

> While you are at this, could you please re-consider e73a5e8e8003

> ("scsi: core: Only return started requests from scsi_host_find_tag()")

> in general?

> 

> I have come to think that commit is incorrect. It was created as an

> attempt to fix the observed fnic crashes, but it was ineffective. The

> crashes were eventually fixed by patch 2/3 of this series. 

> 

> IMO scsi_host_find_tag() should return a request if there is one,

> regardless whether or not it's started, and leave the decision to

> ignore pending requests to the caller, like it used to be until v5.8.


Can you share the cases in which SCSI needs to deal with non in-flight
requests via scsi_host_find_tag()? which is supposed to be used for retrieving
request via one active tag in scsi io completion path.


Thanks,
Ming
Martin Wilck May 4, 2021, 8:57 a.m. | #3
On Tue, 2021-05-04 at 16:06 +0800, Ming Lei wrote:
> On Tue, May 04, 2021 at 09:49:12AM +0200, Martin Wilck wrote:

> > On Thu, 2021-04-29 at 19:28 +0200, Hannes Reinecke wrote:

> > > On 4/29/21 4:34 PM, Ming Lei wrote:

> > > > On Thu, Apr 29, 2021 at 02:25:17PM +0200, Hannes Reinecke

> > > > wrote:

> > > > > fnic_wq_copy_cleanup_handler() is using scsi_host_find_tag()

> > > > > to

> > > > > map id to a scsi command. However, as per discussion on the

> > > > > mailinglist

> > > > > scsi_host_find_tag() might return a non-started request, so

> > > > > we

> > > > > need

> > > > > to check the returned command with blk_mq_request_started()

> > > > > to

> > > > > avoid

> > > > > the function tripping over a non-initialized command.

> > > > > 

> > > > > Signed-off-by: Hannes Reinecke <hare@suse.de>

> > > > > ---

> > > > >   drivers/scsi/fnic/fnic_scsi.c | 2 +-

> > > > >   1 file changed, 1 insertion(+), 1 deletion(-)

> > > > > 

> > > > > diff --git a/drivers/scsi/fnic/fnic_scsi.c

> > > > > b/drivers/scsi/fnic/fnic_scsi.c

> > > > > index 762cc8bd2653..b9fd3d87416b 100644

> > > > > --- a/drivers/scsi/fnic/fnic_scsi.c

> > > > > +++ b/drivers/scsi/fnic/fnic_scsi.c

> > > > > @@ -1466,7 +1466,7 @@ void

> > > > > fnic_wq_copy_cleanup_handler(struct

> > > > > vnic_wq_copy *wq,

> > > > >                 return;

> > > > >   

> > > > >         sc = scsi_host_find_tag(fnic->lport->host, id);

> > > > > -       if (!sc)

> > > > > +       if (!sc || !blk_mq_request_started(sc->request))

> > > > >                 return;

> > > > 

> > > > scsi_host_find_tag() has covered blk_mq_request_started check

> > > > already, so

> > > > this patch isn't necessary.

> > > > 

> > > Right. So drop it, then.

> > 

> > While you are at this, could you please re-consider e73a5e8e8003

> > ("scsi: core: Only return started requests from

> > scsi_host_find_tag()")

> > in general?

> > 

> > I have come to think that commit is incorrect. It was created as an

> > attempt to fix the observed fnic crashes, but it was ineffective.

> > The

> > crashes were eventually fixed by patch 2/3 of this series. 

> > 

> > IMO scsi_host_find_tag() should return a request if there is one,

> > regardless whether or not it's started, and leave the decision to

> > ignore pending requests to the caller, like it used to be until

> > v5.8.

> 

> Can you share the cases in which SCSI needs to deal with non in-

> flight

> requests via scsi_host_find_tag()? which is supposed to be used for

> retrieving

> request via one active tag in scsi io completion path.


No, I can't. I haven't reviewed every caller. I thought about the
function's documentation, which simply says "find the tagged command by
host". Maybe I got this wrong.

You're right that returning non-in-flight requests makes no sense for
drivers calling this function in the request completion code path.

The situation was a bit different until recently, when drivers used
scsi_host_find_tag() also for iterating over commands. The commit
message of e73a5e8e8003 stated that it avoids 'random errors' in this
case; I am not sure if that's actually the case. I can at least imagine
that a driver would want to abort or otherwise invalidate requests even
if they're not in flight yet (e.g. when shutting down a controller).
As the drivers have now been fixed to use blk_mq_tagset_busy_iter(),
there's no need to discuss this further.

Thanks,
Martin
Bart Van Assche May 4, 2021, 5:03 p.m. | #4
On 5/4/21 1:57 AM, Martin Wilck wrote:
> No, I can't. I haven't reviewed every caller. I thought about the

> function's documentation, which simply says "find the tagged command by

> host". Maybe I got this wrong.

> 

> You're right that returning non-in-flight requests makes no sense for

> drivers calling this function in the request completion code path.

> 

> The situation was a bit different until recently, when drivers used

> scsi_host_find_tag() also for iterating over commands. The commit

> message of e73a5e8e8003 stated that it avoids 'random errors' in this

> case; I am not sure if that's actually the case. I can at least imagine

> that a driver would want to abort or otherwise invalidate requests even

> if they're not in flight yet (e.g. when shutting down a controller).

> As the drivers have now been fixed to use blk_mq_tagset_busy_iter(),

> there's no need to discuss this further.


Unless if there is a strong argument I'd like to keep commit
e73a5e8e8003 ("scsi: core: Only return started requests from
scsi_host_find_tag()"). I think that commit implements the behavior that
is needed in the completion path of SCSI LLDs.

Thanks,

Bart.

Patch

diff --git a/drivers/scsi/fnic/fnic_scsi.c b/drivers/scsi/fnic/fnic_scsi.c
index 762cc8bd2653..b9fd3d87416b 100644
--- a/drivers/scsi/fnic/fnic_scsi.c
+++ b/drivers/scsi/fnic/fnic_scsi.c
@@ -1466,7 +1466,7 @@  void fnic_wq_copy_cleanup_handler(struct vnic_wq_copy *wq,
 		return;
 
 	sc = scsi_host_find_tag(fnic->lport->host, id);
-	if (!sc)
+	if (!sc || !blk_mq_request_started(sc->request))
 		return;
 
 	io_lock = fnic_io_lock_hash(fnic, sc);