[v10,00/18] Improve performance for zoned UFS devices

Message ID 20230818193546.2014874-1-bvanassche@acm.org

Message

Bart Van Assche Aug. 18, 2023, 7:34 p.m. UTC
Hi Jens,

This patch series improves small write IOPS by a factor of four (+300%) for
zoned UFS devices on my test setup with a UFSHCI 3.0 controller. Please
consider this patch series for the next merge window.

Thank you,

Bart.

Changes compared to v9:
 - Introduced an additional scsi_driver callback: .eh_needs_prepare_resubmit().
 - Renamed the scsi_debug kernel module parameter 'no_zone_write_lock' into
   'preserves_write_order'.
 - Fixed an out-of-bounds access in the scsi_call_prepare_resubmit() unit
   test.
 - Wrapped ufshcd_auto_hibern8_update() calls in UFS host drivers with
   WARN_ON_ONCE() so that a kernel stack trace appears in case an error code
   is returned.
 - Elaborated a comment in the UFSHCI driver.

Changes compared to v8:
 - Fixed handling of 'driver_preserves_write_order' and 'use_zone_write_lock'
   in blk_stack_limits().
 - Added a comment in disk_set_zoned().
 - Modified blk_req_needs_zone_write_lock() such that it returns false if
   q->limits.use_zone_write_lock is false.
 - Modified disk_clear_zone_settings() such that it clears
   q->limits.use_zone_write_lock.
 - Left out one change from the mq-deadline patch that became superfluous due to
   the blk_req_needs_zone_write_lock() change.
 - Modified scsi_call_prepare_resubmit() such that it only calls list_sort() if
   zoned writes have to be resubmitted for which zone write locking is disabled.
 - Added an additional unit test for scsi_call_prepare_resubmit().
 - Modified the sorting code in the sd driver such that only those SCSI commands
   are sorted for which write locking is disabled.
 - Modified sd_zbc.c such that ELEVATOR_F_ZBD_SEQ_WRITE is only set if the
   write order is not preserved.
 - Included three patches for UFS host drivers that rework code that wrote
   directly to the auto-hibernation controller register.
 - Modified the UFS driver such that enabling auto-hibernation is not allowed
   if a zoned logical unit is present and if the controller operates in legacy
   mode.
 - Also in the UFS driver, simplified ufshcd_auto_hibern8_update().
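
The auto-hibernation rule from the UFS items above can be summarized with a
minimal user-space sketch. The struct and function names (hba_model,
ahit_enable_allowed) are hypothetical and are not the identifiers used in
drivers/ufs/core/ufshcd.c. The I/O scheduler condition is taken from the
title and description of the patch "scsi: ufs: Forbid auto-hibernation
without I/O scheduler".

```c
#include <stdbool.h>

/* Hypothetical model of the state the rule depends on. */
struct hba_model {
	bool legacy_mode;      /* single doorbell mode, i.e. not MCQ */
	bool has_zoned_lu;     /* at least one zoned logical unit is present */
	bool has_io_scheduler; /* an I/O scheduler serializes zoned writes */
};

/*
 * Returns true if enabling auto-hibernation is allowed in this model.
 * Auto-hibernation breaks write ordering in legacy mode, so it is refused
 * when a zoned logical unit exists and no I/O scheduler serializes writes.
 */
bool ahit_enable_allowed(const struct hba_model *s)
{
	if (s->legacy_mode && s->has_zoned_lu && !s->has_io_scheduler)
		return false;
	return true;
}
```

In this model, attaching an I/O scheduler or switching to MCQ mode lifts the
restriction; the real driver may check these conditions differently.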

Changes compared to v7:
 - Split the queue_limits member variable `use_zone_write_lock' into two member
   variables: `use_zone_write_lock' (set by disk_set_zoned()) and
   `driver_preserves_write_order' (set by the block driver or SCSI LLD). This
   should clear up the confusion about the purpose of this variable.
 - Moved the code for sorting SCSI commands by LBA from the SCSI error handler
   into the SCSI disk (sd) driver as requested by Christoph.
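
As a rough illustration of the split, the following user-space sketch models
the two member variables. Only the field names use_zone_write_lock and
driver_preserves_write_order come from this series. The struct name
limits_model, the helper names, and the exact rule used to derive
use_zone_write_lock are assumptions made for illustration and may differ from
the code in block/blk-settings.c and block/blk-zoned.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct queue_limits. */
struct limits_model {
	bool zoned;
	bool driver_preserves_write_order; /* set by the block driver or SCSI LLD */
	bool use_zone_write_lock;          /* derived by the block layer */
};

/*
 * Assumed rule: zone write locking is only needed for a zoned device whose
 * driver does not preserve the order of write requests.
 */
void model_set_zoned(struct limits_model *lim, bool zoned)
{
	lim->zoned = zoned;
	lim->use_zone_write_lock = zoned && !lim->driver_preserves_write_order;
}

/*
 * Follows the v8 changelog entry: report that no zone write lock is needed
 * whenever use_zone_write_lock is false.
 */
bool model_req_needs_zone_write_lock(const struct limits_model *lim,
				     bool is_zoned_write)
{
	if (!lim->use_zone_write_lock)
		return false;
	return is_zoned_write;
}
```

Keeping the driver's capability and the block layer's decision in separate
fields means a driver only states a fact about itself and never overrides
the locking policy directly.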
   
Changes compared to v6:
 - Removed QUEUE_FLAG_NO_ZONE_WRITE_LOCK and instead introduced a flag in
   the request queue limits data structure.

Changes compared to v5:
 - Renamed scsi_cmp_lba() into scsi_cmp_sector().
 - Improved several source code comments.
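
For readers unfamiliar with the resubmission step, here is a minimal
user-space sketch of ordering commands by start sector before they are
resubmitted. The kernel code sorts a linked list with list_sort(); this
sketch swaps in qsort() over an array, and cmd_model, cmp_sector() and
sort_for_resubmit() are hypothetical names, not the series' own
scsi_cmp_sector().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a SCSI command that carries a zoned write. */
struct cmd_model {
	uint64_t sector; /* start sector of the write */
	int tag;         /* identifies the command in this example */
};

/* Orders commands by ascending start sector without integer overflow. */
int cmp_sector(const void *a, const void *b)
{
	const struct cmd_model *ca = a;
	const struct cmd_model *cb = b;

	return (ca->sector > cb->sector) - (ca->sector < cb->sector);
}

/* Sorts the commands so that lower sectors are resubmitted first. */
void sort_for_resubmit(struct cmd_model *cmds, size_t n)
{
	qsort(cmds, n, sizeof(*cmds), cmp_sector);
}
```

Note that qsort() is not guaranteed to be stable, unlike list_sort(), so
commands with equal start sectors may change relative order in this sketch.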

Changes compared to v4:
 - Dropped the patch that introduces the REQ_NO_ZONE_WRITE_LOCK flag.
 - Dropped the null_blk patch and added two scsi_debug patches instead.
 - Dropped the f2fs patch.
 - Split the patch for the UFS driver into two patches.
 - Modified several patch descriptions and source code comments.
 - Renamed dd_use_write_locking() into dd_use_zone_write_locking().
 - Moved the list_sort() call from scsi_unjam_host() into scsi_eh_flush_done_q()
   such that sorting happens just before reinserting.
 - Removed the scsi_cmd_retry_allowed() call from scsi_check_sense() to make
   sure that the retry counter is adjusted once per retry instead of twice.

Changes compared to v3:
 - Restored the patch that introduces QUEUE_FLAG_NO_ZONE_WRITE_LOCK. That patch
   had accidentally been left out from v2.
 - In patch "block: Introduce the flag REQ_NO_ZONE_WRITE_LOCK", improved the
   patch description and added the function blk_no_zone_write_lock().
 - In patch "block/mq-deadline: Only use zone locking if necessary", moved the
   blk_queue_is_zoned() call into dd_use_write_locking().
 - In patch "fs/f2fs: Disable zone write locking", set REQ_NO_ZONE_WRITE_LOCK
   from inside __bio_alloc() instead of in f2fs_submit_write_bio().

Changes compared to v2:
 - Renamed the request queue flag for disabling zone write locking.
 - Introduced a new request flag for disabling zone write locking.
 - Modified the mq-deadline scheduler such that zone write locking is only
   disabled if both flags are set.
 - Added an F2FS patch that sets the request flag for disabling zone write
   locking.
 - Only disable zone write locking in the UFS driver if auto-hibernation is
   disabled.

Changes compared to v1:
 - Left out the patches that are already upstream.
 - Switched the approach in patch "scsi: Retry unaligned zoned writes" from
   retrying immediately to sending unaligned write commands to the SCSI error
   handler.

Bart Van Assche (18):
  block: Introduce more member variables related to zone write locking
  block: Only use write locking if necessary
  block/mq-deadline: Only use zone locking if necessary
  scsi: core: Introduce a mechanism for reordering requests in the error
    handler
  scsi: core: Add unit tests for scsi_call_prepare_resubmit()
  scsi: sd: Sort commands by LBA before resubmitting
  scsi: core: Retry unaligned zoned writes
  scsi: sd_zbc: Only require an I/O scheduler if needed
  scsi: scsi_debug: Add the preserves_write_order module parameter
  scsi: scsi_debug: Support injecting unaligned write errors
  scsi: ufs: Change the return type of ufshcd_auto_hibern8_update()
  scsi: ufs: hisi: Rework the code that disables auto-hibernation
  scsi: ufs: mediatek: Rework the code for disabling auto-hibernation
  scsi: ufs: sprd: Rework the code for disabling auto-hibernation
  scsi: ufs: Rename ufshcd_auto_hibern8_enable() and make it static
  scsi: ufs: Simplify ufshcd_auto_hibern8_update()
  scsi: ufs: Forbid auto-hibernation without I/O scheduler
  scsi: ufs: Inform the block layer about write ordering

 block/blk-settings.c            |  15 +++
 block/blk-zoned.c               |  10 +-
 block/mq-deadline.c             |  11 +-
 drivers/scsi/Kconfig            |   2 +
 drivers/scsi/Kconfig.kunit      |   4 +
 drivers/scsi/Makefile           |   2 +
 drivers/scsi/Makefile.kunit     |   1 +
 drivers/scsi/scsi_debug.c       |  21 +++-
 drivers/scsi/scsi_error.c       |  81 +++++++++++++
 drivers/scsi/scsi_error_test.c  | 207 ++++++++++++++++++++++++++++++++
 drivers/scsi/scsi_lib.c         |   1 +
 drivers/scsi/scsi_priv.h        |   1 +
 drivers/scsi/sd.c               |  51 ++++++++
 drivers/scsi/sd_zbc.c           |   4 +-
 drivers/ufs/core/ufs-sysfs.c    |   2 +-
 drivers/ufs/core/ufshcd-priv.h  |   1 -
 drivers/ufs/core/ufshcd.c       | 114 ++++++++++++++----
 drivers/ufs/host/ufs-hisi.c     |   5 +-
 drivers/ufs/host/ufs-mediatek.c |   2 +-
 drivers/ufs/host/ufs-sprd.c     |  11 +-
 include/linux/blkdev.h          |  10 ++
 include/scsi/scsi.h             |   1 +
 include/scsi/scsi_driver.h      |   2 +
 include/ufs/ufshcd.h            |   3 +-
 24 files changed, 512 insertions(+), 50 deletions(-)
 create mode 100644 drivers/scsi/Kconfig.kunit
 create mode 100644 drivers/scsi/Makefile.kunit
 create mode 100644 drivers/scsi/scsi_error_test.c

Comments

Bao D. Nguyen Aug. 18, 2023, 9:11 p.m. UTC | #1
On 8/18/2023 12:34 PM, Bart Van Assche wrote:
> Call ufshcd_auto_hibern8_update() instead of writing directly into the
> auto-hibernation control register. This patch is part of an effort to
> move all auto-hibernation register changes into the UFSHCI driver core.
> 

Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Bao D. Nguyen Aug. 18, 2023, 9:14 p.m. UTC | #2
On 8/18/2023 12:34 PM, Bart Van Assche wrote:
> Call ufshcd_auto_hibern8_update() instead of writing directly into the
> auto-hibernation control register. This patch is part of an effort to
> move all auto-hibernation register changes into the UFSHCI driver core.
> 

Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Bao D. Nguyen Aug. 18, 2023, 9:16 p.m. UTC | #3
On 8/18/2023 12:34 PM, Bart Van Assche wrote:
> Calls to ufshcd_auto_hibern8_update() are already serialized: this
> function is either called if user space software is not running
> (preparing to suspend) or from a single sysfs store callback function.
> Kernfs serializes sysfs .store() callbacks.
> 
> No functionality is changed. This patch makes the next patch in this
> series easier to read.
> 

Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Bao D. Nguyen Aug. 18, 2023, 9:17 p.m. UTC | #4
On 8/18/2023 12:34 PM, Bart Van Assche wrote:
> UFSHCI 3.0 controllers do not preserve the write order if auto-hibernation
> is enabled. If the write order is not preserved, an I/O scheduler is
> required to serialize zoned writes. Hence do not allow auto-hibernation
> to be enabled without I/O scheduler if a zoned logical unit is present
> and if the controller is operating in legacy mode. This patch has been
> tested with the following shell script:
> 

Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Bao D. Nguyen Aug. 18, 2023, 9:19 p.m. UTC | #5
On 8/18/2023 12:34 PM, Bart Van Assche wrote:
> From the UFSHCI 4.0 specification, about the legacy (single queue) mode:
> "The host controller always process transfer requests in-order according
> to the order submitted to the list. In case of multiple commands with
> single doorbell register ringing (batch mode), The dispatch order for
> these transfer requests by host controller will base on their index in
> the List. A transfer request with lower index value will be executed
> before a transfer request with higher index value."
> 
> From the UFSHCI 4.0 specification, about the MCQ mode:
> "Command Submission
> 1. Host SW writes an Entry to SQ
> 2. Host SW updates SQ doorbell tail pointer
> 
> Command Processing
> 3. After fetching the Entry, Host Controller updates SQ doorbell head
>     pointer
> 4. Host controller sends COMMAND UPIU to UFS device"
> 
> In other words, for both legacy and MCQ mode, UFS controllers are
> required to forward commands to the UFS device in the order these
> commands have been received from the host.
> 
> Notes:
> - For legacy mode this is only correct if the host submits one
>    command at a time. The UFS driver does this.
> - Also in legacy mode, the command order is not preserved if
>    auto-hibernation is enabled in the UFS controller. Hence, enable
>    zone write locking if auto-hibernation is enabled.
> 
> This patch improves performance as follows on my test setup:
> - With the mq-deadline scheduler: 2.5x more IOPS for small writes.
> - When not using an I/O scheduler compared to using mq-deadline with
>    zone locking: 4x more IOPS for small writes.
> 

Hi Bart,
Thank you for double checking and the detailed explanation.

Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Can Guo Aug. 21, 2023, 7:15 a.m. UTC | #6
On 8/19/2023 3:34 AM, Bart Van Assche wrote:
> From the UFSHCI 4.0 specification, about the legacy (single queue) mode:
> "The host controller always process transfer requests in-order according
> to the order submitted to the list. In case of multiple commands with
> single doorbell register ringing (batch mode), The dispatch order for
> these transfer requests by host controller will base on their index in
> the List. A transfer request with lower index value will be executed
> before a transfer request with higher index value."
>
> From the UFSHCI 4.0 specification, about the MCQ mode:
> "Command Submission
> 1. Host SW writes an Entry to SQ
> 2. Host SW updates SQ doorbell tail pointer
>
> Command Processing
> 3. After fetching the Entry, Host Controller updates SQ doorbell head
>     pointer
> 4. Host controller sends COMMAND UPIU to UFS device"
>
> In other words, for both legacy and MCQ mode, UFS controllers are
> required to forward commands to the UFS device in the order these
> commands have been received from the host.
>
> Notes:
> - For legacy mode this is only correct if the host submits one
>    command at a time. The UFS driver does this.
> - Also in legacy mode, the command order is not preserved if
>    auto-hibernation is enabled in the UFS controller. Hence, enable
>    zone write locking if auto-hibernation is enabled.
>
> This patch improves performance as follows on my test setup:
> - With the mq-deadline scheduler: 2.5x more IOPS for small writes.
> - When not using an I/O scheduler compared to using mq-deadline with
>    zone locking: 4x more IOPS for small writes.
>
> Cc: Martin K. Petersen <martin.petersen@oracle.com>
> Cc: Can Guo <quic_cang@quicinc.com>
> Cc: Avri Altman <avri.altman@wdc.com>
> Cc: Bao D. Nguyen <quic_nguyenb@quicinc.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>   drivers/ufs/core/ufshcd.c | 22 ++++++++++++++++++++--
>   1 file changed, 20 insertions(+), 2 deletions(-)
Reviewed-by: Can Guo <quic_cang@quicinc.com>