[V8,00/13] blk-mq/scsi: tracking device queue depth via sbitmap

Message ID 20210207092029.1558550-1-ming.lei@redhat.com

Ming Lei Feb. 7, 2021, 9:20 a.m. UTC
Hi,

SCSI uses one global atomic variable to track queue depth for each
LUN/request queue. This doesn't scale well when there are lots of CPU
cores and the disk is very fast. Broadcom guys have complained that their
high-end HBA can't reach top performance because .device_busy is
updated in the IO path.

Replace the atomic variable sdev->device_busy with sbitmap for
tracking scsi device queue depth.
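
To illustrate the idea only, here is a minimal userspace sketch of tracking
queue depth with a bitmap spread over several cachelines instead of one
shared counter. It is not the kernel sbitmap API: every name below
(sketch_sbitmap, sketch_get, sketch_put, sketch_weight) and the sizes are
hypothetical, and C11 atomics stand in for the kernel's bit operations.

```c
#include <assert.h>
#include <stdatomic.h>

#define SKETCH_NR_WORDS      4  /* hypothetical: number of bitmap words */
#define SKETCH_BITS_PER_WORD 8  /* hypothetical: slots per word */
#define SKETCH_DEPTH (SKETCH_NR_WORDS * SKETCH_BITS_PER_WORD)

/* One word per 64-byte cacheline, so CPUs working on different words
 * do not bounce the same line the way a single atomic counter does. */
struct sketch_word {
	_Alignas(64) atomic_ulong bits;
};

struct sketch_sbitmap {
	struct sketch_word map[SKETCH_NR_WORDS];
};

static void sketch_init(struct sketch_sbitmap *sb)
{
	for (unsigned int w = 0; w < SKETCH_NR_WORDS; w++)
		atomic_init(&sb->map[w].bits, 0UL);
}

/* Claim one free slot, starting the search at the word containing 'hint'.
 * Returns the slot index (the "token"), or -1 if the queue is full. */
static int sketch_get(struct sketch_sbitmap *sb, unsigned int hint)
{
	unsigned int start = (hint / SKETCH_BITS_PER_WORD) % SKETCH_NR_WORDS;

	for (unsigned int i = 0; i < SKETCH_NR_WORDS; i++) {
		unsigned int w = (start + i) % SKETCH_NR_WORDS;
		unsigned long old = atomic_load(&sb->map[w].bits);

		for (unsigned int b = 0; b < SKETCH_BITS_PER_WORD; b++) {
			unsigned long mask = 1UL << b;

			if (old & mask)
				continue;
			/* fetch_or returns the previous value; we own the
			 * slot only if the bit was clear before our update. */
			old = atomic_fetch_or(&sb->map[w].bits, mask);
			if (!(old & mask))
				return (int)(w * SKETCH_BITS_PER_WORD + b);
		}
	}
	return -1;
}

/* Release a slot previously returned by sketch_get(). */
static void sketch_put(struct sketch_sbitmap *sb, int token)
{
	assert(token >= 0 && token < SKETCH_DEPTH);
	unsigned long mask = 1UL << (token % SKETCH_BITS_PER_WORD);

	atomic_fetch_and(&sb->map[token / SKETCH_BITS_PER_WORD].bits, ~mask);
}

/* Number of busy slots; replaces reading the old counter value. */
static unsigned int sketch_weight(struct sketch_sbitmap *sb)
{
	unsigned int busy = 0;

	for (unsigned int w = 0; w < SKETCH_NR_WORDS; w++) {
		unsigned long v = atomic_load(&sb->map[w].bits);

		for (; v; v &= v - 1)
			busy++;
	}
	return busy;
}
```

In this sketch the number of in-flight commands is no longer stored
anywhere; it is derived by counting set bits, which is only needed on slow
paths.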

Tests on scsi_debug show this improves IOPS by more than 20%. Meanwhile,
the IOPS difference is just ~1% compared with bypassing .device_busy
on scsi_debug via patches[1].

The first 6 patches move the percpu allocation hint into sbitmap, since
the improvement from using a percpu allocation hint in sbitmap is
observable. They also export helpers for SCSI.
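
As a rough illustration of what a per-CPU allocation hint buys, the
single-threaded sketch below keeps one hint per CPU inside the bitmap
structure itself, so each CPU resumes its search where it last left off
instead of always scanning from bit 0. The names (hint_map, hint_map_get,
hint_map_put), the sizes, and the hint update policy are assumptions made
for this example and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define HINT_DEPTH   16  /* hypothetical bitmap depth */
#define HINT_NR_CPUS 4   /* hypothetical CPU count */

struct hint_map {
	bool used[HINT_DEPTH];            /* stand-in for the bitmap words */
	unsigned int hint[HINT_NR_CPUS];  /* per-CPU starting point */
};

static void hint_map_init(struct hint_map *m)
{
	memset(m->used, 0, sizeof(m->used));
	/* Spread the initial hints so CPUs start in different regions. */
	for (unsigned int cpu = 0; cpu < HINT_NR_CPUS; cpu++)
		m->hint[cpu] = cpu * (HINT_DEPTH / HINT_NR_CPUS);
}

/* Allocate a bit for 'cpu', scanning from that CPU's hint and wrapping. */
static int hint_map_get(struct hint_map *m, unsigned int cpu)
{
	assert(cpu < HINT_NR_CPUS);
	unsigned int start = m->hint[cpu];

	for (unsigned int i = 0; i < HINT_DEPTH; i++) {
		unsigned int bit = (start + i) % HINT_DEPTH;

		if (!m->used[bit]) {
			m->used[bit] = true;
			/* Next search on this CPU starts after this bit. */
			m->hint[cpu] = (bit + 1) % HINT_DEPTH;
			return (int)bit;
		}
	}
	return -1;
}

/* Free a bit and remember it as the next candidate for 'cpu'. */
static void hint_map_put(struct hint_map *m, unsigned int cpu, int bit)
{
	assert(cpu < HINT_NR_CPUS && bit >= 0 && bit < HINT_DEPTH);
	assert(m->used[bit]);
	m->used[bit] = false;
	m->hint[cpu] = (unsigned int)bit;
}
```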

Patches 7 and 8 prepare for the conversion by returning a budget token
from the .get_budget callback and passing the budget token to the driver
via 'struct blk_mq_queue_data' in .queue_rq().
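
The token flow described above can be sketched in plain C as follows. This
is a simplified model, not the blk-mq code: the types (sketch_queue,
sketch_queue_data, sketch_ops) and functions are hypothetical, and only the
shape of the calls is meant to match: get_budget returns a token (negative
when no budget is available), the token travels to the driver in the
per-request data, and put_budget takes the token back.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_BUDGET 2  /* hypothetical device queue depth */

struct sketch_queue {
	bool slot_busy[SKETCH_MAX_BUDGET];
};

/* Stand-in for the per-request data handed to the driver. */
struct sketch_queue_data {
	int request_id;
	int budget_token;
};

struct sketch_ops {
	int  (*get_budget)(struct sketch_queue *q);
	void (*put_budget)(struct sketch_queue *q, int token);
	bool (*queue_rq)(struct sketch_queue *q,
			 const struct sketch_queue_data *bd);
};

static int sketch_get_budget(struct sketch_queue *q)
{
	for (int i = 0; i < SKETCH_MAX_BUDGET; i++) {
		if (!q->slot_busy[i]) {
			q->slot_busy[i] = true;
			return i;  /* the token identifies the slot */
		}
	}
	return -1;  /* no budget available */
}

static void sketch_put_budget(struct sketch_queue *q, int token)
{
	assert(token >= 0 && token < SKETCH_MAX_BUDGET && q->slot_busy[token]);
	q->slot_busy[token] = false;
}

static bool sketch_queue_rq(struct sketch_queue *q,
			    const struct sketch_queue_data *bd)
{
	/* The driver can see which slot backs this request. */
	return q->slot_busy[bd->budget_token];
}

static const struct sketch_ops sketch_default_ops = {
	.get_budget = sketch_get_budget,
	.put_budget = sketch_put_budget,
	.queue_rq   = sketch_queue_rq,
};

/* Core dispatch: take budget, hand the token to the driver, and give the
 * budget back if the driver refuses the request. */
static bool sketch_dispatch(struct sketch_queue *q,
			    const struct sketch_ops *ops, int request_id,
			    struct sketch_queue_data *bd)
{
	int token = ops->get_budget(q);

	if (token < 0)
		return false;
	bd->request_id = request_id;
	bd->budget_token = token;
	if (!ops->queue_rq(q, bd)) {
		ops->put_budget(q, token);
		return false;
	}
	return true;
}
```

On completion, the owner of the request would call put_budget with the
stored token, which is what lets the release path clear exactly the slot
that was claimed.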

The remaining patches (9 to 13) change SCSI to track device queue
depth via sbitmap.

The patchset has been tested by Broadcom, and an obvious performance boost
can be observed on megaraid_sas.

V8:
	- fix handling for device_blocked, only patch 13 is changed
	- rebase on latest for-5.12/block, only patch 10 is changed

V7:
	- fix build failure on drivers/vhost/scsi.c, only patches 2 & 4 are
	  changed

V6:
	- rebase on for-5.12/block

V5:
	- add comment on sbitmap_weight()
	- add patch 'megaraid_sas: v2 replace sdev_busy with local counter'
	  for fixing build failure

V4:
	- limit max sdev->queue_depth as max(1024, shost->can_queue)
	- simplify code for moving per-cpu allocation hint into sbitmap

V3:
	- rebase on both for-5.10/block and 5.10/scsi-queue.

V2:
	- fix one build failure


Kashyap Desai (1):
  megaraid_sas: v2 replace sdev_busy with local counter

Ming Lei (12):
  sbitmap: remove sbitmap_clear_bit_unlock
  sbitmap: maintain allocation round_robin in sbitmap
  sbitmap: add helpers for updating allocation hint
  sbitmap: move allocation hint into sbitmap
  sbitmap: export sbitmap_weight
  sbitmap: add helper of sbitmap_calculate_shift
  blk-mq: add callbacks for storing & retrieving budget token
  blk-mq: return budget token from .get_budget callback
  scsi: put hot fields of scsi_host_template into one cacheline
  scsi: add scsi_device_busy() to read sdev->device_busy
  scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024)
  scsi: replace sdev->device_busy with sbitmap

 block/blk-mq-sched.c                        |  17 +-
 block/blk-mq.c                              |  38 ++--
 block/blk-mq.h                              |  25 ++-
 block/kyber-iosched.c                       |   3 +-
 drivers/message/fusion/mptsas.c             |   2 +-
 drivers/scsi/megaraid/megaraid_sas.h        |   2 +
 drivers/scsi/megaraid/megaraid_sas_fusion.c |  47 ++++-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c        |   2 +-
 drivers/scsi/scsi.c                         |  13 ++
 drivers/scsi/scsi_lib.c                     |  71 ++++---
 drivers/scsi/scsi_priv.h                    |   3 +
 drivers/scsi/scsi_scan.c                    |  23 ++-
 drivers/scsi/scsi_sysfs.c                   |   4 +-
 drivers/scsi/sg.c                           |   2 +-
 drivers/vhost/scsi.c                        |   4 +-
 include/linux/blk-mq.h                      |  13 +-
 include/linux/sbitmap.h                     |  85 +++++---
 include/scsi/scsi_cmnd.h                    |   2 +
 include/scsi/scsi_device.h                  |   8 +-
 include/scsi/scsi_host.h                    |  72 ++++---
 lib/sbitmap.c                               | 210 +++++++++++---------
 21 files changed, 432 insertions(+), 214 deletions(-)

Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>

Comments

Martin K. Petersen March 4, 2021, 4:15 a.m. UTC | #1
On Sun, 7 Feb 2021 17:20:16 +0800, Ming Lei wrote:

> scsi uses one global atomic variable to track queue depth for each
> LUN/request queue. This way can't scale well when there is lots of CPU
> cores and the disk is very fast. Broadcom guys has complained that their
> high end HBA can't reach top performance because .device_busy is
> operated in IO path.
>
> Replace the atomic variable sdev->device_busy with sbitmap for
> tracking scsi device queue depth.
>
> [...]


Applied to 5.13/scsi-queue, thanks!

[01/13] sbitmap: remove sbitmap_clear_bit_unlock
        https://git.kernel.org/mkp/scsi/c/46d2a5813454
[02/13] sbitmap: maintain allocation round_robin in sbitmap
        https://git.kernel.org/mkp/scsi/c/ed9eb92974bc
[03/13] sbitmap: add helpers for updating allocation hint
        https://git.kernel.org/mkp/scsi/c/a523156a9303
[04/13] sbitmap: move allocation hint into sbitmap
        https://git.kernel.org/mkp/scsi/c/30d4ee6f3a9d
[05/13] sbitmap: export sbitmap_weight
        https://git.kernel.org/mkp/scsi/c/d9ba7618bec3
[06/13] sbitmap: add helper of sbitmap_calculate_shift
        https://git.kernel.org/mkp/scsi/c/5d747419d20e
[07/13] blk-mq: add callbacks for storing & retrieving budget token
        https://git.kernel.org/mkp/scsi/c/9dda23635dbe
[08/13] blk-mq: return budget token from .get_budget callback
        https://git.kernel.org/mkp/scsi/c/cd4ef15a289a
[09/13] scsi: put hot fields of scsi_host_template into one cacheline
        https://git.kernel.org/mkp/scsi/c/a8474e7b28a0
[10/13] megaraid_sas: v2 replace sdev_busy with local counter
        https://git.kernel.org/mkp/scsi/c/d7afc2ed1447
[11/13] scsi: add scsi_device_busy() to read sdev->device_busy
        https://git.kernel.org/mkp/scsi/c/c300d1182331
[12/13] scsi: make sure sdev->queue_depth is <= max(shost->can_queue, 1024)
        https://git.kernel.org/mkp/scsi/c/b0a4b45dc841
[13/13] scsi: replace sdev->device_busy with sbitmap
        https://git.kernel.org/mkp/scsi/c/62b38e49fcf7

-- 
Martin K. Petersen	Oracle Linux Engineering