Message ID | 20230124143025.3464-1-mwilck@suse.com |
---|---|
State | Superseded |
Headers | show |
Series | [v2] scsi: add non-sleeping variant of scsi_device_put() and use it in alua | expand |
On 1/24/23 06:30, mwilck@suse.com wrote: > +/** > + * scsi_device_put_nosleep - release a reference to a scsi_device > + * @sdev: device to release a reference on. > + * > + * Description: Release a reference to the scsi_device and decrements the use > + * count of the underlying LLDD module. This function may only be called from > + * a call context where it is certain that the reference dropped is not the > + * last one. > + */ The above comment does not cover the call from scsi_device_put(). That could be addressed by adding the following text at the end of the above comment: " or from a context in which it is allowed to sleep". However, if that text would be added the function name ("scsi_device_put_nosleep()") would become confusing. How about open-coding scsi_device_put_nosleep() in scsi_device_put() to prevent this confusion? Thanks, Bart.
On 1/24/23 07:26, Martin Wilck wrote:
> Or simply mentioning that the call from scsi_device_put() is legal?
That's fine with me.
Thanks,
Bart.
On Tue, Jan 24, 2023 at 03:30:25PM +0100, mwilck@suse.com wrote: > From: Martin Wilck <mwilck@suse.com> > > Since the might_sleep() annotation was added in scsi_device_put() and > alua_rtpg_queue(), we have seen repeated reports of "BUG: sleeping function > called from invalid context" [1], [2]. alua_rtpg_queue() is always called > from contexts where the caller must hold at least one reference to the scsi > device in question. This means that the reference taken by > alua_rtpg_queue() itself can't be the last one, and thus can be dropped > without entering the code path in which scsi_device_put() might actually > sleep. If there is always guaranteed to be another reference, why does this code even grab one? The pattern of dropping a reference that can't be the last is pretty nonsensical.
On Tue, 2023-01-24 at 22:49 -0800, Christoph Hellwig wrote: > On Tue, Jan 24, 2023 at 03:30:25PM +0100, mwilck@suse.com wrote: > > From: Martin Wilck <mwilck@suse.com> > > > > Since the might_sleep() annotation was added in scsi_device_put() > > and > > alua_rtpg_queue(), we have seen repeated reports of "BUG: sleeping > > function > > called from invalid context" [1], [2]. alua_rtpg_queue() is always > > called > > from contexts where the caller must hold at least one reference to > > the scsi > > device in question. This means that the reference taken by > > alua_rtpg_queue() itself can't be the last one, and thus can be > > dropped > > without entering the code path in which scsi_device_put() might > > actually > > sleep. > > If there is always guaranteed to be another reference, why does this > code even grab one? The pattern of dropping a reference that can't > be > the last is pretty nonsensical. > It's because the sdev is passed to the work queue to execute the RTPG. To my understanding, the rationale is that the caller's ref may be given up before the workqueue starts running, thus an additional ref is needed to make sure the sdev isn't freed before the workqueue accesses it. But if queue_delayed_work() fails (e.g. because the item is already queued) this additional ref must be given up. Regards, Martin
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index 29a2865b8e2e..8f3aac9e6a50 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -1032,7 +1032,7 @@ static bool alua_rtpg_queue(struct alua_port_group *pg, kref_put(&pg->kref, release_port_group); } if (sdev) - scsi_device_put(sdev); + scsi_device_put_nosleep(sdev); return true; } diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 1426b9b03612..22b82e57dee3 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -576,6 +576,24 @@ int scsi_device_get(struct scsi_device *sdev) } EXPORT_SYMBOL(scsi_device_get); +/** + * scsi_device_put_nosleep - release a reference to a scsi_device + * @sdev: device to release a reference on. + * + * Description: Release a reference to the scsi_device and decrements the use + * count of the underlying LLDD module. This function may only be called from + * a call context where it is certain that the reference dropped is not the + * last one. + */ +void scsi_device_put_nosleep(struct scsi_device *sdev) +{ + struct module *mod = sdev->host->hostt->module; + + put_device(&sdev->sdev_gendev); + module_put(mod); +} +EXPORT_SYMBOL(scsi_device_put_nosleep); + /** * scsi_device_put - release a reference to a scsi_device * @sdev: device to release a reference on. @@ -586,12 +604,9 @@ EXPORT_SYMBOL(scsi_device_get); */ void scsi_device_put(struct scsi_device *sdev) { - struct module *mod = sdev->host->hostt->module; - might_sleep(); - put_device(&sdev->sdev_gendev); - module_put(mod); + scsi_device_put_nosleep(sdev); } EXPORT_SYMBOL(scsi_device_put); diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index 3642b8e3928b..f1ad48c9c601 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -365,6 +365,7 @@ void scsi_attach_vpd(struct scsi_device *sdev); extern struct scsi_device *scsi_device_from_queue(struct request_queue *q); extern int __must_check scsi_device_get(struct scsi_device *); +void scsi_device_put_nosleep(struct scsi_device *sdev); extern void scsi_device_put(struct scsi_device *); extern struct scsi_device *scsi_device_lookup(struct Scsi_Host *, uint, uint, u64);