[v2,0/9] Support zoned devices with gap zones

Message ID 20220421183023.3462291-1-bvanassche@acm.org

Message

Bart Van Assche April 21, 2022, 6:30 p.m. UTC
Hi Martin,

ZBC-2 improves support for zones whose size is not a power of two by allowing
host-managed devices to report gap zones. This patch series adds support for
zoned devices on which data zones and gap zones alternate, provided that the
distance between zone start LBAs is a power of two.

Please consider this patch series for kernel v5.19.

Thanks,

Bart.

Changes compared to v1:
- Made this patch series compatible with the zone querying code in BTRFS.
- Addressed Damien's off-list review comments.
- Added patch "Return early in sd_zbc_check_zoned_characteristics()" to this
  series.

Bart Van Assche (9):
  scsi: sd_zbc: Improve source code documentation
  scsi: sd_zbc: Verify that the zone size is a power of two
  scsi: sd_zbc: Use logical blocks as unit when querying zones
  scsi: sd_zbc: Introduce struct zoned_disk_info
  scsi: sd_zbc: Return early in sd_zbc_check_zoned_characteristics()
  scsi: sd_zbc: Hide gap zones
  scsi_debug: Fix a typo
  scsi_debug: Rename zone type constants
  scsi_debug: Add gap zone support

 drivers/scsi/scsi_debug.c | 149 ++++++++++++++++++------
 drivers/scsi/sd.h         |  32 ++++--
 drivers/scsi/sd_zbc.c     | 236 +++++++++++++++++++++++++++++---------
 include/scsi/scsi_proto.h |   9 +-
 4 files changed, 331 insertions(+), 95 deletions(-)
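
The layout described in the cover letter (data zones alternating with gap zones, zone start LBAs a power of two apart) can be illustrated with a small user-space sketch. This is not code from the series: struct zone_geometry, its field names and the helper functions are hypothetical, and the sketch assumes that every data zone starts on a multiple of the start-LBA distance and is followed by a gap that fills the remainder of that slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical geometry; the names are illustrative, not from the series. */
struct zone_geometry {
	uint64_t zone_start_gran;	/* distance between zone start LBAs */
	uint64_t data_blocks;		/* logical blocks per data zone */
};

/* True if @n is a nonzero power of two. */
static bool is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Index of the data zone slot that contains @lba. */
static uint64_t zone_index(const struct zone_geometry *g, uint64_t lba)
{
	assert(is_pow2(g->zone_start_gran));
	return lba / g->zone_start_gran;
}

/*
 * True if @lba lies in the gap that follows a data zone. The mask only
 * yields the offset within a slot because the distance between zone
 * start LBAs is a power of two.
 */
static bool lba_in_gap(const struct zone_geometry *g, uint64_t lba)
{
	assert(is_pow2(g->zone_start_gran));
	return (lba & (g->zone_start_gran - 1)) >= g->data_blocks;
}
```

With a start-LBA distance of 1024 blocks and 768-block data zones, for example, LBA 767 is the last block of data zone 0, LBAs 768 through 1023 fall into the gap, and LBA 1024 starts data zone 1.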

Comments

Himanshu Madhani April 21, 2022, 8:13 p.m. UTC | #1
> On Apr 21, 2022, at 11:30 AM, Bart Van Assche <bvanassche@acm.org> wrote:
> 
> Deriving the meaning of the nr_zones, rev_nr_zones, zone_blocks and
> rev_zone_blocks member variables requires careful analysis of the source
> code. Make the meaning of these member variables easier to understand by
> introducing struct zoned_disk_info.
> 
> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/scsi/sd.h     | 22 +++++++++++++++----
> drivers/scsi/sd_zbc.c | 49 ++++++++++++++++++++-----------------------
> 2 files changed, 41 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
> index 4849cbe771a7..47434f905b0a 100644
> --- a/drivers/scsi/sd.h
> +++ b/drivers/scsi/sd.h
> @@ -67,6 +67,20 @@ enum {
> 	SD_ZERO_WS10_UNMAP,	/* Use WRITE SAME(10) with UNMAP */
> };
> 
> +/**
> + * struct zoned_disk_info - Specific properties of a ZBC SCSI device.
> + * @nr_zones: number of zones.
> + * @zone_blocks: number of logical blocks per zone.
> + *
> + * This data structure holds the ZBC SCSI device properties that are retrieved
> + * twice: a first time before the gendisk capacity is known and a second time
> + * after the gendisk capacity is known.
> + */
> +struct zoned_disk_info {
> +	u32		nr_zones;
> +	u32		zone_blocks;
> +};
> +
> struct scsi_disk {
> 	struct scsi_device *device;
> 
> @@ -78,10 +92,10 @@ struct scsi_disk {
> 	struct gendisk	*disk;
> 	struct opal_dev *opal_dev;
> #ifdef CONFIG_BLK_DEV_ZONED
> -	u32		nr_zones;
> -	u32		rev_nr_zones;
> -	u32		zone_blocks;
> -	u32		rev_zone_blocks;
> +	/* Updated during revalidation before the gendisk capacity is known. */
> +	struct zoned_disk_info	early_zone_info;
> +	/* Updated during revalidation after the gendisk capacity is known. */
> +	struct zoned_disk_info	zone_info;
> 	u32		zones_optimal_open;
> 	u32		zones_optimal_nonseq;
> 	u32		zones_max_open;
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> index e76bcbfd0d1c..ac557a5a65c8 100644
> --- a/drivers/scsi/sd_zbc.c
> +++ b/drivers/scsi/sd_zbc.c
> @@ -181,7 +181,7 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
> 	 * sure that the allocated buffer can always be mapped by limiting the
> 	 * number of pages allocated to the HBA max segments limit.
> 	 */
> -	nr_zones = min(nr_zones, sdkp->nr_zones);
> +	nr_zones = min(nr_zones, sdkp->zone_info.nr_zones);
> 	bufsize = roundup((nr_zones + 1) * 64, SECTOR_SIZE);
> 	bufsize = min_t(size_t, bufsize,
> 			queue_max_hw_sectors(q) << SECTOR_SHIFT);
> @@ -206,7 +206,7 @@ static void *sd_zbc_alloc_report_buffer(struct scsi_disk *sdkp,
>  */
> static inline sector_t sd_zbc_zone_sectors(struct scsi_disk *sdkp)
> {
> -	return logical_to_sectors(sdkp->device, sdkp->zone_blocks);
> +	return logical_to_sectors(sdkp->device, sdkp->zone_info.zone_blocks);
> }
> 
> /**
> @@ -262,7 +262,7 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
> 			zone_idx++;
> 		}
> 
> -		lba += sdkp->zone_blocks * i;
> +		lba += sdkp->zone_info.zone_blocks * i;
> 	}
> 
> 	ret = zone_idx;
> @@ -320,14 +320,14 @@ static void sd_zbc_update_wp_offset_workfn(struct work_struct *work)
> 	sdkp = container_of(work, struct scsi_disk, zone_wp_offset_work);
> 
> 	spin_lock_irqsave(&sdkp->zones_wp_offset_lock, flags);
> -	for (zno = 0; zno < sdkp->nr_zones; zno++) {
> +	for (zno = 0; zno < sdkp->zone_info.nr_zones; zno++) {
> 		if (sdkp->zones_wp_offset[zno] != SD_ZBC_UPDATING_WP_OFST)
> 			continue;
> 
> 		spin_unlock_irqrestore(&sdkp->zones_wp_offset_lock, flags);
> 		ret = sd_zbc_do_report_zones(sdkp, sdkp->zone_wp_update_buf,
> 					     SD_BUF_SIZE,
> -					     zno * sdkp->zone_blocks, true);
> +					     zno * sdkp->zone_info.zone_blocks, true);
> 		spin_lock_irqsave(&sdkp->zones_wp_offset_lock, flags);
> 		if (!ret)
> 			sd_zbc_parse_report(sdkp, sdkp->zone_wp_update_buf + 64,
> @@ -394,7 +394,7 @@ blk_status_t sd_zbc_prepare_zone_append(struct scsi_cmnd *cmd, sector_t *lba,
> 		break;
> 	default:
> 		wp_offset = sectors_to_logical(sdkp->device, wp_offset);
> -		if (wp_offset + nr_blocks > sdkp->zone_blocks) {
> +		if (wp_offset + nr_blocks > sdkp->zone_info.zone_blocks) {
> 			ret = BLK_STS_IOERR;
> 			break;
> 		}
> @@ -523,7 +523,7 @@ static unsigned int sd_zbc_zone_wp_update(struct scsi_cmnd *cmd,
> 		break;
> 	case REQ_OP_ZONE_RESET_ALL:
> 		memset(sdkp->zones_wp_offset, 0,
> -		       sdkp->nr_zones * sizeof(unsigned int));
> +		       sdkp->zone_info.nr_zones * sizeof(unsigned int));
> 		break;
> 	default:
> 		break;
> @@ -680,16 +680,16 @@ static void sd_zbc_print_zones(struct scsi_disk *sdkp)
> 	if (!sd_is_zoned(sdkp) || !sdkp->capacity)
> 		return;
> 
> -	if (sdkp->capacity & (sdkp->zone_blocks - 1))
> +	if (sdkp->capacity & (sdkp->zone_info.zone_blocks - 1))
> 		sd_printk(KERN_NOTICE, sdkp,
> 			  "%u zones of %u logical blocks + 1 runt zone\n",
> -			  sdkp->nr_zones - 1,
> -			  sdkp->zone_blocks);
> +			  sdkp->zone_info.nr_zones - 1,
> +			  sdkp->zone_info.zone_blocks);
> 	else
> 		sd_printk(KERN_NOTICE, sdkp,
> 			  "%u zones of %u logical blocks\n",
> -			  sdkp->nr_zones,
> -			  sdkp->zone_blocks);
> +			  sdkp->zone_info.nr_zones,
> +			  sdkp->zone_info.zone_blocks);
> }
> 
> static int sd_zbc_init_disk(struct scsi_disk *sdkp)
> @@ -716,10 +716,8 @@ static void sd_zbc_clear_zone_info(struct scsi_disk *sdkp)
> 	kfree(sdkp->zone_wp_update_buf);
> 	sdkp->zone_wp_update_buf = NULL;
> 
> -	sdkp->nr_zones = 0;
> -	sdkp->rev_nr_zones = 0;
> -	sdkp->zone_blocks = 0;
> -	sdkp->rev_zone_blocks = 0;
> +	sdkp->early_zone_info = (struct zoned_disk_info){ };
> +	sdkp->zone_info = (struct zoned_disk_info){ };
> 
> 	mutex_unlock(&sdkp->rev_mutex);
> }
> @@ -746,8 +744,8 @@ int sd_zbc_revalidate_zones(struct scsi_disk *sdkp)
> {
> 	struct gendisk *disk = sdkp->disk;
> 	struct request_queue *q = disk->queue;
> -	u32 zone_blocks = sdkp->rev_zone_blocks;
> -	unsigned int nr_zones = sdkp->rev_nr_zones;
> +	u32 zone_blocks = sdkp->early_zone_info.zone_blocks;
> +	unsigned int nr_zones = sdkp->early_zone_info.nr_zones;
> 	u32 max_append;
> 	int ret = 0;
> 	unsigned int flags;
> @@ -778,14 +776,14 @@ int sd_zbc_revalidate_zones(struct scsi_disk *sdkp)
> 	 */
> 	mutex_lock(&sdkp->rev_mutex);
> 
> -	if (sdkp->zone_blocks == zone_blocks &&
> -	    sdkp->nr_zones == nr_zones &&
> +	if (sdkp->zone_info.zone_blocks == zone_blocks &&
> +	    sdkp->zone_info.nr_zones == nr_zones &&
> 	    disk->queue->nr_zones == nr_zones)
> 		goto unlock;
> 
> 	flags = memalloc_noio_save();
> -	sdkp->zone_blocks = zone_blocks;
> -	sdkp->nr_zones = nr_zones;
> +	sdkp->zone_info.zone_blocks = zone_blocks;
> +	sdkp->zone_info.nr_zones = nr_zones;
> 	sdkp->rev_wp_offset = kvcalloc(nr_zones, sizeof(u32), GFP_KERNEL);
> 	if (!sdkp->rev_wp_offset) {
> 		ret = -ENOMEM;
> @@ -800,8 +798,7 @@ int sd_zbc_revalidate_zones(struct scsi_disk *sdkp)
> 	sdkp->rev_wp_offset = NULL;
> 
> 	if (ret) {
> -		sdkp->zone_blocks = 0;
> -		sdkp->nr_zones = 0;
> +		sdkp->zone_info = (struct zoned_disk_info){ };
> 		sdkp->capacity = 0;
> 		goto unlock;
> 	}
> @@ -887,8 +884,8 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, u8 buf[SD_BUF_SIZE])
> 	if (blk_queue_zoned_model(q) == BLK_ZONED_HM)
> 		blk_queue_zone_write_granularity(q, sdkp->physical_block_size);
> 
> -	sdkp->rev_nr_zones = nr_zones;
> -	sdkp->rev_zone_blocks = zone_blocks;
> +	sdkp->early_zone_info.nr_zones = nr_zones;
> +	sdkp->early_zone_info.zone_blocks = zone_blocks;
> 
> 	return 0;
> 

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani	Oracle Linux Engineering
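
The two-step use of struct zoned_disk_info in the patch above can be modelled in user space as follows. This is a simplified sketch, not the kernel code: struct disk_model and the three helper functions are hypothetical stand-ins, the fixed-width types replace the kernel's u32, and the real sd_zbc_revalidate_zones() also compares the request queue's zone count, allocates write pointer offsets and takes rev_mutex.

```c
#include <stdbool.h>
#include <stdint.h>

struct zoned_disk_info {
	uint32_t nr_zones;	/* number of zones */
	uint32_t zone_blocks;	/* number of logical blocks per zone */
};

/* Hypothetical stand-in for the zone-related part of struct scsi_disk. */
struct disk_model {
	/* Updated before the capacity is known. */
	struct zoned_disk_info early_zone_info;
	/* Updated after the capacity is known. */
	struct zoned_disk_info zone_info;
};

/* Step 1: remember what the device reported. */
static void record_early(struct disk_model *d, uint32_t nr_zones,
			 uint32_t zone_blocks)
{
	d->early_zone_info.nr_zones = nr_zones;
	d->early_zone_info.zone_blocks = zone_blocks;
}

/* Step 2: commit the early values; returns true if anything changed. */
static bool commit_zone_info(struct disk_model *d)
{
	if (d->zone_info.nr_zones == d->early_zone_info.nr_zones &&
	    d->zone_info.zone_blocks == d->early_zone_info.zone_blocks)
		return false;
	d->zone_info = d->early_zone_info;
	return true;
}

/* Reset both copies with a compound literal, as the patch does. */
static void clear_zone_info(struct disk_model *d)
{
	/* { 0 } is the portable spelling of the patch's empty braces. */
	d->early_zone_info = (struct zoned_disk_info){ 0 };
	d->zone_info = (struct zoned_disk_info){ 0 };
}
```

Grouping the two fields lets a single assignment replace the paired field updates, which is what allows the patch to shrink sd_zbc_clear_zone_info() from four assignments to two.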

Himanshu Madhani April 21, 2022, 8:16 p.m. UTC | #2
> On Apr 21, 2022, at 11:30 AM, Bart Van Assche <bvanassche@acm.org> wrote:
> 
> Change a single occurrence of "nad" into "and".
> 
> Cc: Douglas Gilbert <dgilbert@interlog.com>
> Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/scsi/scsi_debug.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
> index c607755cce00..7cfae8206a4b 100644
> --- a/drivers/scsi/scsi_debug.c
> +++ b/drivers/scsi/scsi_debug.c
> @@ -4408,7 +4408,7 @@ static int resp_verify(struct scsi_cmnd *scp, struct sdebug_dev_info *devip)
> 
> #define RZONES_DESC_HD 64
> 
> -/* Report zones depending on start LBA nad reporting options */
> +/* Report zones depending on start LBA and reporting options */
> static int resp_report_zones(struct scsi_cmnd *scp,
> 			     struct sdebug_dev_info *devip)
> {

Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>

--
Himanshu Madhani	Oracle Linux Engineering

Douglas Gilbert April 26, 2022, 1:56 a.m. UTC | #3
On 2022-04-21 14:30, Bart Van Assche wrote:
> Hi Martin,
> 
> In ZBC-2 support has been improved for zones with a size that is not a power
> of two by allowing host-managed devices to report gap zones. This patch adds
> support for zoned devices for which data zones and gap zones alternate if the
> distance between zone start LBAs is a power of two.
> 
> Please consider this patch series for kernel v5.19.

whole series:
Acked-by: Douglas Gilbert <dgilbert@interlog.com>

> Changes compared to v1:
> - Made this patch series compatible with the zone querying code in BTRFS.
> - Addressed Damien's off-list review comments.
> - Added patch "Return early in sd_zbc_check_zoned_characteristics()" to this
>    series.
> 
> Bart Van Assche (9):
>    scsi: sd_zbc: Improve source code documentation
>    scsi: sd_zbc: Verify that the zone size is a power of two
>    scsi: sd_zbc: Use logical blocks as unit when querying zones
>    scsi: sd_zbc: Introduce struct zoned_disk_info
>    scsi: sd_zbc: Return early in sd_zbc_check_zoned_characteristics()
>    scsi: sd_zbc: Hide gap zones
>    scsi_debug: Fix a typo
>    scsi_debug: Rename zone type constants
>    scsi_debug: Add gap zone support
> 
>   drivers/scsi/scsi_debug.c | 149 ++++++++++++++++++------
>   drivers/scsi/sd.h         |  32 ++++--
>   drivers/scsi/sd_zbc.c     | 236 +++++++++++++++++++++++++++++---------
>   include/scsi/scsi_proto.h |   9 +-
>   4 files changed, 331 insertions(+), 95 deletions(-)
>

Martin K. Petersen April 26, 2022, 2:56 a.m. UTC | #4
Bart,

> In ZBC-2 support has been improved for zones with a size that is not a
> power of two by allowing host-managed devices to report gap
> zones. This patch adds support for zoned devices for which data zones
> and gap zones alternate if the distance between zone start LBAs is a
> power of two.

Applied to 5.19/scsi-staging, thanks!

Martin K. Petersen May 3, 2022, 12:51 a.m. UTC | #5
On Thu, 21 Apr 2022 11:30:14 -0700, Bart Van Assche wrote:

> In ZBC-2 support has been improved for zones with a size that is not a power
> of two by allowing host-managed devices to report gap zones. This patch adds
> support for zoned devices for which data zones and gap zones alternate if the
> distance between zone start LBAs is a power of two.
> 
> Please consider this patch series for kernel v5.19.
> 
> [...]

Applied to 5.19/scsi-queue, thanks!

[1/9] scsi: sd_zbc: Improve source code documentation
      https://git.kernel.org/mkp/scsi/c/aa96bfb4caff
[2/9] scsi: sd_zbc: Verify that the zone size is a power of two
      https://git.kernel.org/mkp/scsi/c/9a93b9c9d38a
[3/9] scsi: sd_zbc: Use logical blocks as unit when querying zones
      https://git.kernel.org/mkp/scsi/c/43af5da09efb
[4/9] scsi: sd_zbc: Introduce struct zoned_disk_info
      https://git.kernel.org/mkp/scsi/c/628617be8968
[5/9] scsi: sd_zbc: Return early in sd_zbc_check_zoned_characteristics()
      https://git.kernel.org/mkp/scsi/c/60caf3758103
[6/9] scsi: sd_zbc: Hide gap zones
      https://git.kernel.org/mkp/scsi/c/c976e588b34e
[7/9] scsi_debug: Fix a typo
      https://git.kernel.org/mkp/scsi/c/897284e8a048
[8/9] scsi_debug: Rename zone type constants
      https://git.kernel.org/mkp/scsi/c/35dbe2b9a7b0
[9/9] scsi_debug: Add gap zone support
      https://git.kernel.org/mkp/scsi/c/4a5fc1c6d752