diff mbox series

[3/3] scsi: sd_zbc: Fix handling of RC BASIS

Message ID 20220711230051.15372-4-bvanassche@acm.org
State New
Headers show
Series Improve READ CAPACITY reporting and handling | expand

Commit Message

Bart Van Assche July 11, 2022, 11 p.m. UTC
Using the RETURNED LOGICAL BLOCK ADDRESS field + 1 as the capacity (largest
addressable LBA) if RC BASIS = 0 is wrong if there are sequential write
required zones. Hence only use the RC BASIS = 0 capacity if there are no
sequential write required zones.

Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Fixes: d2e428e49eec ("scsi: sd_zbc: Reduce boot device scan and revalidate time")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/sd_zbc.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

Comments

Damien Le Moal July 11, 2022, 11:11 p.m. UTC | #1
On 7/12/22 08:00, Bart Van Assche wrote:
> Using the RETURNED LOGICAL BLOCK ADDRESS field + 1 as the capacity (largest
> addressable LBA) if RC BASIS = 0 is wrong if there are sequential write
> required zones. Hence only use the RC BASIS = 0 capacity if there are no
> sequential write required zones.

This does not make sense to me: RC BASIS == 0 is defined like this:

The RETURNED LOGICAL BLOCK ADDRESS field indicates the highest LBA of a
contiguous range of zones that are not sequential write required zones
starting with the first zone.

Do you have these cases:
1) host-managed disks:
SWR zones are *mandatory* so there is at least one. Thus read capacity
will return either 0 if there are no conventional zones (they are
optional) or the total capacity of the set of contiguous conventional
zones starting at lba 0. In either case, read capacity does not give you
the actual total capacity and you have to look at the report zones reply
max lba field.
2) host-aware disks:
There are no SWR zones, there cannot be any. You can only have
conventional zones (optionally) and sequential write preferred zones. So
"the highest LBA of a contiguous range of zones that are not sequential
write required zones starting with the first zone" necessarily is always
the total capacity. RC BASIS = 0 is non-sensical for host-aware drives.

So I do not understand your change.

Note that anyway, there are no drives out there that use RC BASIS = 0. I
had to hack a drive FW to implement it to test this code...

> 
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Hannes Reinecke <hare@suse.com>
> Fixes: d2e428e49eec ("scsi: sd_zbc: Reduce boot device scan and revalidate time")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/sd_zbc.c | 26 ++++++++++++++++++++++----
>  1 file changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
> index 6acc4f406eb8..41ff1c0fd04b 100644
> --- a/drivers/scsi/sd_zbc.c
> +++ b/drivers/scsi/sd_zbc.c
> @@ -699,7 +699,7 @@ static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp,
>   *
>   * Get the device zone size and check that the device capacity as reported
>   * by READ CAPACITY matches the max_lba value (plus one) of the report zones
> - * command reply for devices with RC_BASIS == 0.
> + * command reply.
>   *
>   * Returns 0 upon success or an error code upon failure.
>   */
> @@ -709,6 +709,7 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf,
>  	u64 zone_blocks;
>  	sector_t max_lba;
>  	unsigned char *rec;
> +	u8 same;
>  	int ret;
>  
>  	/* Do a report zone to get max_lba and the size of the first zone */
> @@ -716,9 +717,26 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf,
>  	if (ret)
>  		return ret;
>  
> -	if (sdkp->rc_basis == 0) {
> -		/* The max_lba field is the capacity of this device */
> -		max_lba = get_unaligned_be64(&buf[8]);
> +	/*
> +	 * From ZBC-1: "If the ZONE LIST LENGTH field is zero then the SAME
> +	 * field is invalid and should be ignored by the application client."
> +	 */
> +	if (get_unaligned_be32(&buf[0]) == 0) {
> +		sd_printk(KERN_INFO, sdkp, "No zones have been reported\n");
> +		return -EIO;
> +	}
> +
> +	same = buf[4] & 0xf;
> +	max_lba = get_unaligned_be64(&buf[8]);
> +	/*
> +	 * The max_lba field is the largest addressable LBA of the disk if:
> +	 * - Either RC BASIS == 1.
> +	 * - Or RC BASIS == 0, there is at least one zone in the response
> +	 *   (max_lba != 0) and all zones have the same type (same == 1 ||
> +	 *   same == 2).
> +	 */
> +	if ((sdkp->rc_basis == 0 && max_lba && (same == 1 || same == 2)) ||
> +	    sdkp->rc_basis == 1) {
>  		if (sdkp->capacity != max_lba + 1) {
>  			if (sdkp->first_scan)
>  				sd_printk(KERN_WARNING, sdkp,
Damien Le Moal July 11, 2022, 11:28 p.m. UTC | #2
On 7/12/22 08:11, Damien Le Moal wrote:
> On 7/12/22 08:00, Bart Van Assche wrote:
>> Using the RETURNED LOGICAL BLOCK ADDRESS field + 1 as the capacity (largest
>> addressable LBA) if RC BASIS = 0 is wrong if there are sequential write
>> required zones. Hence only use the RC BASIS = 0 capacity if there are no
>> sequential write required zones.
> 
> This does not make sense to me: RC BASIS == 0 is defined like this:
> 
> The RETURNED LOGICAL BLOCK ADDRESS field indicates the highest LBA of a
> contiguous range of zones that are not sequential write required zones
> starting with the first zone.
> 
> Do you have these cases:
> 1) host-managed disks:
> SWR zones are *mandatory* so there is at least one. Thus read capacity
> will return either 0 if there are no conventional zones (they are
> optional) or the total capacity of the set of contiguous conventional
> zones starting at lba 0. In either case, read capacity does not give you
> the actual total capacity and you have to look at the report zones reply
> max lba field.
> 2) host-aware disks:
> There are no SWR zones, there cannot be any. You can only have
> conventional zones (optionally) and sequential write preferred zones. So
> "the highest LBA of a contiguous range of zones that are not sequential
> write required zones starting with the first zone" necessarily is always
> the total capacity. RC BASIS = 0 is non-sensical for host-aware drives.

What I meant to say here for the host-aware case is that RC BASIS is
irrelevant. Both RC BASIS = 0 and = 1 end up with read capacity giving the
actual total capacity. ANd for good reasons: the host-aware model is
designed to be backward compatible with regular disks, which do not have
RC BASIS, so RC BASIS = 0, always.

> 
> So I do not understand your change.
> 
> Note that anyway, there are no drives out there that use RC BASIS = 0. I
> had to hack a drive FW to implement it to test this code...
> 
>>
>> Cc: Damien Le Moal <damien.lemoal@wdc.com>
>> Cc: Hannes Reinecke <hare@suse.com>
>> Fixes: d2e428e49eec ("scsi: sd_zbc: Reduce boot device scan and revalidate time")
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> ---
>>  drivers/scsi/sd_zbc.c | 26 ++++++++++++++++++++++----
>>  1 file changed, 22 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
>> index 6acc4f406eb8..41ff1c0fd04b 100644
>> --- a/drivers/scsi/sd_zbc.c
>> +++ b/drivers/scsi/sd_zbc.c
>> @@ -699,7 +699,7 @@ static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp,
>>   *
>>   * Get the device zone size and check that the device capacity as reported
>>   * by READ CAPACITY matches the max_lba value (plus one) of the report zones
>> - * command reply for devices with RC_BASIS == 0.
>> + * command reply.
>>   *
>>   * Returns 0 upon success or an error code upon failure.
>>   */
>> @@ -709,6 +709,7 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf,
>>  	u64 zone_blocks;
>>  	sector_t max_lba;
>>  	unsigned char *rec;
>> +	u8 same;
>>  	int ret;
>>  
>>  	/* Do a report zone to get max_lba and the size of the first zone */
>> @@ -716,9 +717,26 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf,
>>  	if (ret)
>>  		return ret;
>>  
>> -	if (sdkp->rc_basis == 0) {
>> -		/* The max_lba field is the capacity of this device */
>> -		max_lba = get_unaligned_be64(&buf[8]);
>> +	/*
>> +	 * From ZBC-1: "If the ZONE LIST LENGTH field is zero then the SAME
>> +	 * field is invalid and should be ignored by the application client."
>> +	 */
>> +	if (get_unaligned_be32(&buf[0]) == 0) {
>> +		sd_printk(KERN_INFO, sdkp, "No zones have been reported\n");
>> +		return -EIO;
>> +	}
>> +
>> +	same = buf[4] & 0xf;
>> +	max_lba = get_unaligned_be64(&buf[8]);
>> +	/*
>> +	 * The max_lba field is the largest addressable LBA of the disk if:
>> +	 * - Either RC BASIS == 1.
>> +	 * - Or RC BASIS == 0, there is at least one zone in the response
>> +	 *   (max_lba != 0) and all zones have the same type (same == 1 ||
>> +	 *   same == 2).
>> +	 */
>> +	if ((sdkp->rc_basis == 0 && max_lba && (same == 1 || same == 2)) ||
>> +	    sdkp->rc_basis == 1) {
>>  		if (sdkp->capacity != max_lba + 1) {
>>  			if (sdkp->first_scan)
>>  				sd_printk(KERN_WARNING, sdkp,
> 
>
Bart Van Assche July 12, 2022, 12:39 a.m. UTC | #3
On 7/11/22 16:11, Damien Le Moal wrote:
> Do you have these cases:
> 1) host-managed disks:
> SWR zones are *mandatory* so there is at least one. Thus read capacity
> will return either 0 if there are no conventional zones (they are
> optional) or the total capacity of the set of contiguous conventional
> zones starting at lba 0. In either case, read capacity does not give you
> the actual total capacity and you have to look at the report zones reply
> max lba field.

Does the scsi_debug driver allow to create a host-managed disk with one 
or more conventional zones and one or more sequential write required zones?

> Note that anyway, there are no drives out there that use RC BASIS = 0. I
> had to hack a drive FW to implement it to test this code...

A JEDEC member is telling me that I should use RC BASIS = 0 for 
host-managed zoned storage devices that only have sequential write 
required zones. That sounds weird to me so I decided to take a look at 
how the sd_zbc.c code handles RC BASIS.

Bart.
Bart Van Assche July 12, 2022, 5:14 p.m. UTC | #4
On 7/11/22 16:28, Damien Le Moal wrote:
> On 7/12/22 08:11, Damien Le Moal wrote:
>> On 7/12/22 08:00, Bart Van Assche wrote:
>>> Using the RETURNED LOGICAL BLOCK ADDRESS field + 1 as the capacity (largest
>>> addressable LBA) if RC BASIS = 0 is wrong if there are sequential write
>>> required zones. Hence only use the RC BASIS = 0 capacity if there are no
>>> sequential write required zones.
>>
>> This does not make sense to me: RC BASIS == 0 is defined like this:
>>
>> The RETURNED LOGICAL BLOCK ADDRESS field indicates the highest LBA of a
>> contiguous range of zones that are not sequential write required zones
>> starting with the first zone.
>>
>> Do you have these cases:
>> 1) host-managed disks:
>> SWR zones are *mandatory* so there is at least one. Thus read capacity
>> will return either 0 if there are no conventional zones (they are
>> optional) or the total capacity of the set of contiguous conventional
>> zones starting at lba 0. In either case, read capacity does not give you
>> the actual total capacity and you have to look at the report zones reply
>> max lba field.
>> 2) host-aware disks:
>> There are no SWR zones, there cannot be any. You can only have
>> conventional zones (optionally) and sequential write preferred zones. So
>> "the highest LBA of a contiguous range of zones that are not sequential
>> write required zones starting with the first zone" necessarily is always
>> the total capacity. RC BASIS = 0 is non-sensical for host-aware drives.
> 
> What I meant to say here for the host-aware case is that RC BASIS is
> irrelevant. Both RC BASIS = 0 and = 1 end up with read capacity giving the
> actual total capacity. ANd for good reasons: the host-aware model is
> designed to be backward compatible with regular disks, which do not have
> RC BASIS, so RC BASIS = 0, always.
Hi Damien,

I agree that for host-aware block devices (conventional + sequential write
preferred zones) RC BASIS is irrelevant.

What I'm concerned about is host-managed block devices (conventional + sequential
write required zones) that report an incorrect RC BASIS value (0 instead of 1).
Shouldn't sd_zbc_check_capacity() skip the capacity reported by host-managed block
devices that report RC BASIS = 0?

Thanks,

Bart.
diff mbox series

Patch

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 6acc4f406eb8..41ff1c0fd04b 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -699,7 +699,7 @@  static int sd_zbc_check_zoned_characteristics(struct scsi_disk *sdkp,
  *
  * Get the device zone size and check that the device capacity as reported
  * by READ CAPACITY matches the max_lba value (plus one) of the report zones
- * command reply for devices with RC_BASIS == 0.
+ * command reply.
  *
  * Returns 0 upon success or an error code upon failure.
  */
@@ -709,6 +709,7 @@  static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf,
 	u64 zone_blocks;
 	sector_t max_lba;
 	unsigned char *rec;
+	u8 same;
 	int ret;
 
 	/* Do a report zone to get max_lba and the size of the first zone */
@@ -716,9 +717,26 @@  static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf,
 	if (ret)
 		return ret;
 
-	if (sdkp->rc_basis == 0) {
-		/* The max_lba field is the capacity of this device */
-		max_lba = get_unaligned_be64(&buf[8]);
+	/*
+	 * From ZBC-1: "If the ZONE LIST LENGTH field is zero then the SAME
+	 * field is invalid and should be ignored by the application client."
+	 */
+	if (get_unaligned_be32(&buf[0]) == 0) {
+		sd_printk(KERN_INFO, sdkp, "No zones have been reported\n");
+		return -EIO;
+	}
+
+	same = buf[4] & 0xf;
+	max_lba = get_unaligned_be64(&buf[8]);
+	/*
+	 * The max_lba field is the largest addressable LBA of the disk if:
+	 * - Either RC BASIS == 1.
+	 * - Or RC BASIS == 0, there is at least one zone in the response
+	 *   (max_lba != 0) and all zones have the same type (same == 1 ||
+	 *   same == 2).
+	 */
+	if ((sdkp->rc_basis == 0 && max_lba && (same == 1 || same == 2)) ||
+	    sdkp->rc_basis == 1) {
 		if (sdkp->capacity != max_lba + 1) {
 			if (sdkp->first_scan)
 				sd_printk(KERN_WARNING, sdkp,