[RFC,1/4] scsi: Allow drivers to set BLK_MQ_F_BLOCKING

Message ID 20220308003957.123312-2-michael.christie@oracle.com
State New
Series scsi/iscsi: Send iscsi data from kblockd

Commit Message

Mike Christie March 8, 2022, 12:39 a.m. UTC
The software iscsi driver's queuecommand can block, and taking the extra
hop from kblockd to its workqueue results in a performance hit. Allowing
it to set BLK_MQ_F_BLOCKING and transmit from that context directly
results in a 20-30% improvement in IOPS for workloads like:

fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k --ioengine=libaio
--iodepth=128  --numjobs=1

and for all write workloads, when using the none scheduler with the app
and iscsi bound to the same CPUs. Throughput tests show similar gains.

This patch adds a new scsi_host_template field so drivers can tell scsi-ml
that their queuecommand can block, which lets scsi-ml set up the tag set
with the BLK_MQ_F_BLOCKING flag.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
---
 drivers/scsi/scsi_lib.c  | 6 ++++--
 include/scsi/scsi_host.h | 4 ++++
 2 files changed, 8 insertions(+), 2 deletions(-)
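
For readers who want to see the opt-in in isolation, here is a minimal
userspace sketch of the flag selection this patch adds to
scsi_mq_setup_tags(). It is a model under stated assumptions, not kernel
code: the model_* types, the MODEL_F_* bit values and model_demo() are
hypothetical stand-ins invented for illustration. Only queuecommand_blocks,
host_tagset and the two BLK_MQ_F_* flag names come from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits; the real BLK_MQ_F_* values live in the kernel headers. */
#define MODEL_F_TAG_HCTX_SHARED (1u << 0)
#define MODEL_F_BLOCKING        (1u << 1)

/* Hypothetical cut-down scsi_host_template. */
struct model_host_template {
	const char *name;
	bool queuecommand_blocks;	/* field added by this patch */
};

/* Hypothetical cut-down Scsi_Host. */
struct model_host {
	const struct model_host_template *hostt;
	bool host_tagset;
	unsigned int tag_set_flags;
};

/* Mirrors the flag selection in scsi_mq_setup_tags() with this patch applied. */
unsigned int model_setup_tag_flags(struct model_host *shost)
{
	unsigned int flags = 0;

	if (shost->host_tagset)
		flags |= MODEL_F_TAG_HCTX_SHARED;
	if (shost->hostt->queuecommand_blocks)
		flags |= MODEL_F_BLOCKING;

	shost->tag_set_flags = flags;
	return flags;
}

/* True when the WARN_ON_ONCE() in scsi_host_block() would fire. */
bool model_host_block_would_warn(const struct model_host *shost)
{
	return (shost->tag_set_flags & MODEL_F_BLOCKING) != 0;
}

/* Exercise one blocking and one non-blocking template. */
void model_demo(void)
{
	static const struct model_host_template blocking_drv = {
		.name = "blocking-driver",
		.queuecommand_blocks = true,
	};
	static const struct model_host_template plain_drv = {
		.name = "non-blocking-driver",
	};
	struct model_host a = { .hostt = &blocking_drv };
	struct model_host b = { .hostt = &plain_drv, .host_tagset = true };

	assert(model_setup_tag_flags(&a) == MODEL_F_BLOCKING);
	assert(model_host_block_would_warn(&a));
	assert(model_setup_tag_flags(&b) == MODEL_F_TAG_HCTX_SHARED);
	assert(!model_host_block_would_warn(&b));
}
```

In the real tree a driver would opt in by setting .queuecommand_blocks in
its scsi_host_template. The model also shows why a host that sets the flag
should not go through scsi_host_block(), whose WARN_ON_ONCE() checks for
BLK_MQ_F_BLOCKING.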

Comments

Mike Christie March 9, 2022, 1:17 a.m. UTC | #1
On 3/8/22 6:53 PM, Ming Lei wrote:
> On Mon, Mar 07, 2022 at 06:39:54PM -0600, Mike Christie wrote:
>> The software iscsi driver's queuecommand can block and taking the extra
>> hop from kblockd to its workqueue results in a performance hit. Allowing
>> it to set BLK_MQ_F_BLOCKING and transmit from that context directly
>> results in a 20-30% improvement in IOPs for workloads like:
>>
>> fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k --ioengine=libaio
>> --iodepth=128  --numjobs=1
>>
>> and for all write workloads.
> 
> This single patch shouldn't make any difference for iscsi, so please
> make it as last one if performance improvement data is provided
> in commit log.

Ok.

> 
> Also is there performance effect for other workloads? such as multiple
> jobs? iscsi is SQ hardware, so if driver is blocked in ->queuecommand()
> via BLK_MQ_F_BLOCKING, other contexts can't submit IO to scsi ML any more.

If you mean multiple jobs running on the same connection/session then
they are all serialized now. A connection can only do 1 cmd at a time.
There's a big mutex around it in the network layer, so multiple jobs
just suck no matter what.

If you mean multiple jobs from different connections/sessions, then the
iscsi code with this patchset blocks only because the network layer
takes a mutex for a short time. We configure it to not block for things
like socket space and memory allocations, we normally do zero-copy IO,
etc., so it's quick.

We also can do up to the workqueue's max_active limit worth of calls, so
other things can normally send IO. We haven't found a need to increase
it yet.

Patch

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..a5dbeb9994ae 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1989,6 +1989,8 @@  int scsi_mq_setup_tags(struct Scsi_Host *shost)
 	tag_set->driver_data = shost;
 	if (shost->host_tagset)
 		tag_set->flags |= BLK_MQ_F_TAG_HCTX_SHARED;
+	if (shost->hostt->queuecommand_blocks)
+		tag_set->flags |= BLK_MQ_F_BLOCKING;
 
 	return blk_mq_alloc_tag_set(tag_set);
 }
@@ -2952,8 +2954,8 @@  scsi_host_block(struct Scsi_Host *shost)
 	}
 
 	/*
-	 * SCSI never enables blk-mq's BLK_MQ_F_BLOCKING flag so
-	 * calling synchronize_rcu() once is enough.
+	 * Drivers that use this helper must not enable blk-mq's
+	 * BLK_MQ_F_BLOCKING flag, so calling synchronize_rcu() once is enough.
 	 */
 	WARN_ON_ONCE(shost->tag_set.flags & BLK_MQ_F_BLOCKING);
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 72e1a347baa6..0d106dc9309d 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -75,6 +75,10 @@  struct scsi_host_template {
 	 */
 	int (* queuecommand)(struct Scsi_Host *, struct scsi_cmnd *);
 
+	/*
+	 * Set to true if the queuecommand function can block.
+	 */
+	bool queuecommand_blocks;
 	/*
 	 * The commit_rqs function is used to trigger a hardware
 	 * doorbell after some requests have been queued with