Message ID | 20250403211937.2225615-5-bvanassche@acm.org |
---|---|
State | New |
Headers | show |
Series | Optimize the hot path in the UFS driver | expand |
On 03/04/2025 22:17, Bart Van Assche wrote: > From: Hannes Reinecke<hare@suse.de> > > Quite some drivers are using management commands internally. These commands > typically use the same tag pool as regular SCSI commands. Tags for these > management commands are set aside before allocating the block-mq tag bitmap > for regular SCSI commands. The block layer already supports this via the > reserved tag mechanism. Add a new field 'nr_reserved_cmds' to the SCSI host > template to instruct the block layer to set aside a tag space for these > management commands by using reserved tags. Exclude reserved commands from > .can_queue because .can_queue is visible in sysfs. > > Signed-off-by: Hannes Reinecke<hare@suse.de> > [ bvanassche: modified patch description ] > Cc: John Garry<john.g.garry@oracle.com> > Signed-off-by: Bart Van Assche<bvanassche@acm.org> How is this related to the rest of the series? To allocate a reserved-tag request we need to use BLK_MQ_REQ_RESERVED, right? I don't see that used...
On 4/4/25 3:49 AM, John Garry wrote: > On 03/04/2025 22:17, Bart Van Assche wrote: >> From: Hannes Reinecke<hare@suse.de> >> >> Quite some drivers are using management commands internally. These >> commands >> typically use the same tag pool as regular SCSI commands. Tags for these >> management commands are set aside before allocating the block-mq tag >> bitmap >> for regular SCSI commands. The block layer already supports this via the >> reserved tag mechanism. Add a new field 'nr_reserved_cmds' to the SCSI >> host >> template to instruct the block layer to set aside a tag space for these >> management commands by using reserved tags. Exclude reserved commands >> from >> .can_queue because .can_queue is visible in sysfs. >> >> Signed-off-by: Hannes Reinecke<hare@suse.de> >> [ bvanassche: modified patch description ] >> Cc: John Garry<john.g.garry@oracle.com> >> Signed-off-by: Bart Van Assche<bvanassche@acm.org> > > How is this related to the rest of the series? > > To allocate a reserved-tag request we need to use BLK_MQ_REQ_RESERVED, > right? I don't see that used... Hi John, Calling blk_mq_alloc_request(..., BLK_MQ_REQ_RESERVED) is not the only way to allocated a reserved tag. Another possibility is to let the driver manage reserved tags. The UFS driver only needs a single reserved tag and already serializes use of that tag. See also the following change in patch 21/24: - hba->reserved_slot = hba->nutrs - 1; + hba->reserved_slot = 0; Use of hba->reserved_slot is serialized by calling ufshcd_dev_man_lock() and ufshcd_dev_man_unlock(). More code is serialized by these calls than only the region in which hba->reserved_slot is used so I don't think that the UFS driver code would become shorter by using block layer functions for allocating / freeing reserved tags. Thanks, Bart.
On 04/04/2025 17:34, Bart Van Assche wrote: >>> Quite some drivers are using management commands internally. These >>> commands >>> typically use the same tag pool as regular SCSI commands. Tags for these >>> management commands are set aside before allocating the block-mq tag >>> bitmap >>> for regular SCSI commands. The block layer already supports this via the >>> reserved tag mechanism. Add a new field 'nr_reserved_cmds' to the >>> SCSI host >>> template to instruct the block layer to set aside a tag space for these >>> management commands by using reserved tags. Exclude reserved commands >>> from >>> .can_queue because .can_queue is visible in sysfs. >>> >>> Signed-off-by: Hannes Reinecke<hare@suse.de> >>> [ bvanassche: modified patch description ] >>> Cc: John Garry<john.g.garry@oracle.com> >>> Signed-off-by: Bart Van Assche<bvanassche@acm.org> >> >> How is this related to the rest of the series? >> >> To allocate a reserved-tag request we need to use BLK_MQ_REQ_RESERVED, >> right? I don't see that used... > > Hi John, > > Calling blk_mq_alloc_request(..., BLK_MQ_REQ_RESERVED) is not the only > way to allocated a reserved tag. Another possibility is to let the > driver manage reserved tags. The UFS driver only needs a single reserved > tag and already serializes use of that tag. See also the following > change in patch 21/24: > > - hba->reserved_slot = hba->nutrs - 1; > + hba->reserved_slot = 0; > > Use of hba->reserved_slot is serialized by calling ufshcd_dev_man_lock() > and ufshcd_dev_man_unlock(). More code is serialized by these calls than > only the region in which hba->reserved_slot is used so I don't think > that the UFS driver code would become shorter by using block layer > functions for allocating / freeing reserved tags. Now I see this in 23/24: +/* + * Convert a block layer tag into a SCSI command pointer. This function is + * called once per I/O completion path and is also called from error paths. + */ +static inline struct scsi_cmnd *ufshcd_tag_to_cmd(struct ufs_hba *hba, u32 tag) +{ + struct blk_mq_tags *tags = hba->host->tag_set.tags[0]; + struct request *rq; + + /* + * Use .static_rqs[] for reserved commands because blk_mq_get_tag() + * is not called for reserved commands by the UFS driver. + */ + rq = tag < UFSHCD_NUM_RESERVED ? tags->static_rqs[tag] : + blk_mq_tag_to_rq(tags, tag); + + if (WARN_ON_ONCE(!rq)) + return NULL; + + return blk_mq_rq_to_pdu(rq); +} + Do you really think that it is ok that anything outside the block layer should be referencing tags->static_rqs[] directly? Even using blk_mq_alloc_request() would seem better than that.
On 4/4/25 10:34 AM, John Garry wrote: > Now I see this in 23/24: > > +/* > + * Convert a block layer tag into a SCSI command pointer. This function is > + * called once per I/O completion path and is also called from error > paths. > + */ > +static inline struct scsi_cmnd *ufshcd_tag_to_cmd(struct ufs_hba *hba, > u32 tag) > +{ > + struct blk_mq_tags *tags = hba->host->tag_set.tags[0]; > + struct request *rq; > + > + /* > + * Use .static_rqs[] for reserved commands because blk_mq_get_tag() > + * is not called for reserved commands by the UFS driver. > + */ > + rq = tag < UFSHCD_NUM_RESERVED ? tags->static_rqs[tag] : > + blk_mq_tag_to_rq(tags, tag); > + > + if (WARN_ON_ONCE(!rq)) > + return NULL; > + > + return blk_mq_rq_to_pdu(rq); > +} > + > > Do you really think that it is ok that anything outside the block layer > should be referencing tags->static_rqs[] directly? > > Even using blk_mq_alloc_request() would seem better than that. Hi John, There is a challenge: a request queue must exist before blk_mq_alloc_request() is called because the first argument of that function is a request queue pointer. Creating a request queue by calling __scsi_add_device() is not appropriate because __scsi_add_device() submits SCSI commands to a SCSI device and submitting SCSI commands to a UFS device must only happen after the initial device management commands have been sent. The only solution I see is to call blk_mq_init_queue() directly from the UFS driver and to pass that request queue to blk_mq_alloc_request() when allocating device management commands. Bart.
On 4/4/25 10:34 AM, John Garry wrote: > Now I see this in 23/24: > > +/* > + * Convert a block layer tag into a SCSI command pointer. This function is > + * called once per I/O completion path and is also called from error > paths. > + */ > +static inline struct scsi_cmnd *ufshcd_tag_to_cmd(struct ufs_hba *hba, > u32 tag) > +{ > + struct blk_mq_tags *tags = hba->host->tag_set.tags[0]; > + struct request *rq; > + > + /* > + * Use .static_rqs[] for reserved commands because blk_mq_get_tag() > + * is not called for reserved commands by the UFS driver. > + */ > + rq = tag < UFSHCD_NUM_RESERVED ? tags->static_rqs[tag] : > + blk_mq_tag_to_rq(tags, tag); > + > + if (WARN_ON_ONCE(!rq)) > + return NULL; > + > + return blk_mq_rq_to_pdu(rq); > +} > + > > Do you really think that it is ok that anything outside the block layer > should be referencing tags->static_rqs[] directly? > > Even using blk_mq_alloc_request() would seem better than that. Hi John, Using blk_mq_tag_to_rq() to translate a tag of a reserved request into a request pointer only works after the block layer has set tags->rqs[tag_of_reserved_request] first. That pointer is set by blk_mq_get_driver_tag(). blk_mq_get_driver_tag() is called by the block layer code that submits a request to a block driver. Hence, to ensure that blk_mq_get_driver_tag() is called for reserved requests the reserved requests would have to pass through scsi_queue_rq(). For reserved requests sdev == NULL as explained in a previous email. There are many statements in the SCSI command submission and completion path in which it is assumed that sdev != NULL. I don't think that the SCSI maintainer (Martin) would agree with adding "if (sdev)" statements in many places in the SCSI core. Letting UFS reserved requests being processed by another function than scsi_queue_rq() doesn't seem feasible to me either. Although it is easy to create an additional request queue for reserved requests, that request queue shares its tag set with the SCSI host and hence also the request submission function. From scsi_host.h: struct Scsi_Host { struct blk_mq_tag_set tag_set; [ ... ] }; From blk-mq.h: struct blk_mq_tag_set { const struct blk_mq_ops *ops; [ ... ] }; Unless someone comes up with an elegant proposal, I will keep the approach where ufshcd_tag_to_cmd() handles reserved tags and regular tags differently. It should be possible to do this without referencing tags->static_rqs[] directly from the UFS driver. Bart.
On 4/15/25 19:21, Bart Van Assche wrote: > > On 4/4/25 10:34 AM, John Garry wrote: >> Now I see this in 23/24: >> >> +/* >> + * Convert a block layer tag into a SCSI command pointer. This >> function is >> + * called once per I/O completion path and is also called from error >> paths. >> + */ >> +static inline struct scsi_cmnd *ufshcd_tag_to_cmd(struct ufs_hba >> *hba, u32 tag) >> +{ >> + struct blk_mq_tags *tags = hba->host->tag_set.tags[0]; >> + struct request *rq; >> + >> + /* >> + * Use .static_rqs[] for reserved commands because blk_mq_get_tag() >> + * is not called for reserved commands by the UFS driver. >> + */ >> + rq = tag < UFSHCD_NUM_RESERVED ? tags->static_rqs[tag] : >> + blk_mq_tag_to_rq(tags, tag); >> + >> + if (WARN_ON_ONCE(!rq)) >> + return NULL; >> + >> + return blk_mq_rq_to_pdu(rq); >> +} >> + >> >> Do you really think that it is ok that anything outside the block >> layer should be referencing tags->static_rqs[] directly? >> >> Even using blk_mq_alloc_request() would seem better than that. > > Hi John, > > Using blk_mq_tag_to_rq() to translate a tag of a reserved > request into a request pointer only works after the block layer has > set tags->rqs[tag_of_reserved_request] first. That pointer is set by > blk_mq_get_driver_tag(). blk_mq_get_driver_tag() is called by the > block layer code that submits a request to a block driver. Hence, to > ensure that blk_mq_get_driver_tag() is called for reserved requests > the reserved requests would have to pass through scsi_queue_rq(). > For reserved requests sdev == NULL as explained in a previous email. > There are many statements in the SCSI command submission and completion > path in which it is assumed that sdev != NULL. I don't think that the > SCSI maintainer (Martin) would agree with adding "if (sdev)" statements > in many places in the SCSI core. > > Letting UFS reserved requests being processed by another function than > scsi_queue_rq() doesn't seem feasible to me either. Although it is easy > to create an additional request queue for reserved requests, that > request queue shares its tag set with the SCSI host and hence also the > request submission function. From scsi_host.h: > As rightly noted, we need an sdev to use the reserved command handling trick. There are two choices: -> use the existing sdev for per-LUN TMFs. If the driver only uses ABORT TASK/ABORT TASK_SET/RESET LUN we already know which sdev to use, so that is easy. -> allocate a 'fake' scsi device for the host. Not pretty, but gives us a nice combined interface to deal with reserved commands. Bart, from my reading the UFS driver only requires a TMF for command abort and device reset, in both cases we should have a scsi device available. Can't you use that? Cheers, Hannes
On 16/04/2025 08:39, Hannes Reinecke wrote: >> JFYI, I was working on another solution (different to Hannes') which >> allows reserved requests pass through scsi_queue_rq(), but I stopped >> that work when I changed employer. >> > Oh, really? > Do you have the patchset still around? > I'd be very much interested in that... Yeah, if you remember, I was working on that for libsas/libata, but started to get bogged down in how to handle reserved commands for libsas/libata: https://lore.kernel.org/linux-scsi/1666693096-180008-1-git-send-email-john.garry@huawei.com/T/#m35ad9a27320077f55c2be5549f036628fe79dcf7 It would be a lot simpler to support for HBAs which plug directly into the SCSI midlayer - like aacraid. > >>> Although it is easy >>> to create an additional request queue for reserved requests, that >>> request queue shares its tag set with the SCSI host and hence also the >>> request submission function. From scsi_host.h: >>> >>> struct Scsi_Host { >>> struct blk_mq_tag_set tag_set; >>> [ ... ] >>> }; >>> >>> From blk-mq.h: >>> >>> struct blk_mq_tag_set { >>> const struct blk_mq_ops *ops; >>> [ ... ] >>> }; >>> >>> Unless someone comes up with an elegant proposal, I will keep the >>> approach where ufshcd_tag_to_cmd() handles reserved tags and regular >>> tags differently. >>> >>> It should be possible to do this without referencing tags->static_rqs[] >>> directly from the UFS driver. >> >> I'm not sure how that will look, but my preference is to fully >> implement reserved tags support which can be used by all SCSI LLDs. >> > Sure, that was my goal, too. > But then I got stuck as some drivers (most notably aacraid) use internal > commands to detect the hardware and allocate SCSI devices, so with my > approach we need a 'fake' SCSI device for the host to handle that. > If you have other ideas I'd be very interested in thems. My solution still involved supporting a fake sdev also for times like you describe, e.g. shost needs to send reserved commands. Thanks, John
diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c index e021f1106bea..c2c6c96781e3 100644 --- a/drivers/scsi/hosts.c +++ b/drivers/scsi/hosts.c @@ -495,6 +495,9 @@ struct Scsi_Host *scsi_host_alloc(const struct scsi_host_template *sht, int priv if (sht->virt_boundary_mask) shost->virt_boundary_mask = sht->virt_boundary_mask; + if (sht->nr_reserved_cmds) + shost->nr_reserved_cmds = sht->nr_reserved_cmds; + device_initialize(&shost->shost_gendev); dev_set_name(&shost->shost_gendev, "host%d", shost->host_no); shost->shost_gendev.bus = &scsi_bus_type; diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 1da26f300287..94dafa5ceaaa 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2083,7 +2083,9 @@ int scsi_mq_setup_tags(struct Scsi_Host *shost) tag_set->ops = &scsi_mq_ops_no_commit; tag_set->nr_hw_queues = shost->nr_hw_queues ? : 1; tag_set->nr_maps = shost->nr_maps ? : 1; - tag_set->queue_depth = shost->can_queue; + tag_set->queue_depth = + shost->can_queue + shost->nr_reserved_cmds; + tag_set->reserved_tags = shost->nr_reserved_cmds; tag_set->cmd_size = cmd_size; tag_set->numa_node = dev_to_node(shost->dma_dev); if (shost->hostt->tag_alloc_policy_rr) diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index 26bc23419cfd..2c0f5ec1046e 100644 --- a/include/scsi/scsi_host.h +++ b/include/scsi/scsi_host.h @@ -375,10 +375,19 @@ struct scsi_host_template { /* * This determines if we will use a non-interrupt driven * or an interrupt driven scheme. It is set to the maximum number - * of simultaneous commands a single hw queue in HBA will accept. + * of simultaneous commands a single hw queue in HBA will accept + * excluding internal commands. */ int can_queue; + /* + * This determines how many commands the HBA will set aside + * for internal commands. This number will be added to + * @can_queue to calcumate the maximum number of simultaneous + * commands sent to the host. + */ + int nr_reserved_cmds; + /* * In many instances, especially where disconnect / reconnect are * supported, our host also has an ID on the SCSI bus. If this is @@ -611,6 +620,11 @@ struct Scsi_Host { unsigned short max_cmd_len; int this_id; + + /* + * Number of commands this host can handle at the same time. + * This excludes reserved commands as specified by nr_reserved_cmds. + */ int can_queue; short cmd_per_lun; short unsigned int sg_tablesize; @@ -631,6 +645,12 @@ struct Scsi_Host { */ unsigned nr_hw_queues; unsigned nr_maps; + + /* + * Number of reserved commands to allocate, if any. + */ + unsigned nr_reserved_cmds; + unsigned active_mode:2; /*