diff mbox series

[RFC] scsi: sd_zbc: prevent report zones racing rev_wp_ofst updates

Message ID 20200804142541.17126-1-johannes.thumshirn@wdc.com
State New
Headers show
Series [RFC] scsi: sd_zbc: prevent report zones racing rev_wp_ofst updates | expand

Commit Message

Johannes Thumshirn Aug. 4, 2020, 2:25 p.m. UTC
When revalidating a zoned SCSI disk, we need to do a REPORT ZONES, which
also updates the write-pointer offset cache.

As we don't want a normal REPORT ZONES to constantly update the
write-pointer offset cache, we swap the cache contents from revalidate
with the live version.

But if a second REPORT ZONES is triggered while '->rev_wp_offset' is
already allocated, sd_zbc_parse_report() can't distinguish the two
different REPORT ZONES (from revalidation context or from a
file-system/ioctl).

                 CPU0                             CPU1

sd_zbc_revalidate_zones()
`-> mutex_lock(&sdkp->rev_mutex);
`-> sdkp->rev_wp_offset = kvcalloc();
`->blk_revalidate_disk_zones();
   `-> disk->fops->report_zones();
       `-> sd_zbc_report_zones();
           `-> sd_zbc_parse_report();
	       `-> if (sdkp->rev_wp_offset)
                   `-> sdkp->rev_wp_offset[idx] =

                                           blkdev_report_zones()
                                           `-> disk->fops->report_zones();
                                               `-> sd_zbc_report_zones();
                                                   `-> sd_zbc_parse_report();
                                        	       `-> if (sdkp->rev_wp_offset)
                                                           `-> sdkp->rev_wp_offset[idx] =

   `-> update_driver_data();
      `-> sd_zbc_revalidate_zones_cb();
          `-> swap(sdkp->zones_wp_offset, sdkp->rev_wp_offset);
`-> kvfree(sdkp->rev_wp_offset);
`-> sdkp->rev_wp_offset = NULL;
`-> mutex_unlock(&sdkp->rev_mutex);

As two concurrent revalidates are excluded via the '->rev_mutex', try to
grab the '->rev_mutex' in sd_zbc_report_zones(). If we cannot lock the
'->rev_mutex' because it's already held, we know we're called in a disk
revalidate context, if we can grab the mutex we need to unlock it again
after sd_zbc_parse_report() as we're not called in a revalidate context.

This way we can ensure a partial REPORT ZONES doesn't set invalid
write-pointer offsets in the revalidate write-pointer offset cache when a
partial REPORT ZONES is running concurrently with a full REPORT ZONES from
disk revalidation.

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
 drivers/scsi/sd_zbc.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
diff mbox series

Patch

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index 6f7eba66687e..d19456220c09 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -198,6 +198,7 @@  int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
 	unsigned char *buf;
 	size_t offset, buflen = 0;
 	int zone_idx = 0;
+	bool unlock = false;
 	int ret;
 
 	if (!sd_is_zoned(sdkp))
@@ -223,6 +224,14 @@  int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
 		if (!nr)
 			break;
 
+		/*
+		 * Check if we're called by revalidate or by a normal report
+		 * zones. Mutually exclude report zones due to revalidation and
+		 * normal report zones, so we're not accidentally overriding the
+		 * write-pointer offset cache.
+		 */
+		if (mutex_trylock(&sdkp->rev_mutex))
+			unlock = true;
 		for (i = 0; i < nr && zone_idx < nr_zones; i++) {
 			offset += 64;
 			ret = sd_zbc_parse_report(sdkp, buf + offset, zone_idx,
@@ -231,6 +240,8 @@  int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
 				goto out;
 			zone_idx++;
 		}
+		if (unlock)
+			mutex_unlock(&sdkp->rev_mutex);
 
 		sector += sd_zbc_zone_sectors(sdkp) * i;
 	}