From patchwork Thu Sep 9 02:35:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Damien Le Moal X-Patchwork-Id: 508643 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A712C433FE for ; Thu, 9 Sep 2021 02:35:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 32F6A6101A for ; Thu, 9 Sep 2021 02:35:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236191AbhIICg6 (ORCPT ); Wed, 8 Sep 2021 22:36:58 -0400 Received: from esa6.hgst.iphmx.com ([216.71.154.45]:42516 "EHLO esa6.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231898AbhIICg5 (ORCPT ); Wed, 8 Sep 2021 22:36:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1631154949; x=1662690949; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=43i0QSckSxHEDAvLL9GdwzJ8xOxZWauuFIMSS/Taejo=; b=XnVKPkvrC1Yjgymll3fEZaSfYqMfZNjGDQbIzWee8oAc5o64S+NqqVPQ MesXSScMv7UiuJhAGh6HdP3A1xJW8tXRpi3QAbhOgxUSJ39oFp0TLTzB9 p3Ztj9QqhDqgXUgyeky8ZSrTmIBlXvr2zR6uGqAGseRYtTPcMmw/wLkXS Pzg9Cpe9FmXr+c/REG4sNExHXadY5wCw58m2voXHywG+ArZcOLN39fO60 m4fAdnbJUaJ93JDmJtNWgH4sW9iTz8nhzcbnscn39nfrlwLl7muf9A3vr /YP95AZbacEZ67HW55tvdf5sK2clmJS0UFP6ALjmG/hGowdPmasV+OlMZ A==; X-IronPort-AV: E=Sophos;i="5.85,279,1624291200"; d="scan'208";a="180062206" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 09 Sep 2021 10:35:49 +0800 IronPort-SDR: rXsNCYpL+jFW4hfUL6cDDfIXR9go6YaJZ2CR3YebM907cxghQT0iymt7JlwksJsROqUx0iDajJ 5ZsQpJVEF+X4HAbHnrzbxr6W5oizHRiKorkDYNKDeKdm4SVtYjL9LdVofWvB5me3Kf2INwA7gr TgLkUtx84Js8hdOE7htcBI49qwuEn3kA9oImCoABiU7KbytrLhKc0kzmztU0ToCF7ZWgsUewTO z6758535n3nm+dIgEUJCNsyde6F/WcoVfPQzANzkLUEpM7jceodISQv1i/7TOLV6f7W5HXOjRa fUKQVW4ebucTLqfFxEX9PBcH Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2021 19:12:23 -0700 IronPort-SDR: 70AfsP/q25Z69FOLalEkWnZEgO2jwmGO8T5kw9iKEN7HD+leem0wx8EqcMMB8cSZooQ94XQfSx VaiblP1LBXRPZjXCbDqtZKkDjzbstMPnv1AgfSq3FG2fLoD48FvbZVC5LbHm2cLEnr278NVChz ZGnOwwGX0HHKaExb9VSe0jgQFp+GTTGrmAh/18+LUiRrGGPf8Le5F6al6nZRzZRMsttCXebbXh rvdKwnF/b3YjYTyDWR5x0Kj1ZkXXPgW3Wrub2nwObe7Pw3oD0miwtUBOJDo+sYTDGtu48Vvg/A UQQ= WDCIronportException: Internal Received: from washi.fujisawa.hgst.com ([10.149.53.254]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Sep 2021 19:35:47 -0700 From: Damien Le Moal To: Jens Axboe , linux-block@vger.kernel.org, linux-ide@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Subject: [PATCH v8 1/5] block: Add independent access ranges support Date: Thu, 9 Sep 2021 11:35:41 +0900 Message-Id: <20210909023545.1101672-2-damien.lemoal@wdc.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210909023545.1101672-1-damien.lemoal@wdc.com> References: <20210909023545.1101672-1-damien.lemoal@wdc.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org The Concurrent Positioning Ranges VPD page (for SCSI) and data log page (for ATA) contain parameters describing the set of contiguous LBAs that can be served independently by a single LUN multi-actuator hard-disk. Similarly, a logically defined block device composed of multiple disks can in some cases execute requests directed at different sector ranges in parallel. A dm-linear device aggregating 2 block devices together is an example. This patch implements support for exposing a block device independent access ranges to the user through sysfs to allow optimizing device accesses to increase performance. To describe the set of independent sector ranges of a device (actuators of a multi-actuator HDDs or table entries of a dm-linear device), The type struct blk_independent_access_ranges is introduced. This structure describes the sector ranges using an array of struct blk_independent_access_range structures. This range structure defines the start sector and number of sectors of the access range. The ranges in the array cannot overlap and must contain all sectors within the device capacity. The function disk_set_independent_access_ranges() allows a device driver to signal to the block layer that a device has multiple independent access ranges. In this case, a struct blk_independent_access_ranges is attached to the device request queue by the function disk_set_independent_access_ranges(). The function disk_alloc_independent_access_ranges() is provided for drivers to allocate this structure. struct blk_independent_access_ranges contains kobjects (struct kobject) to expose to the user through sysfs the set of independent access ranges supported by a device. When the device is initialized, sysfs registration of the ranges information is done from blk_register_queue() using the block layer internal function disk_register_independent_access_ranges(). If a driver calls disk_set_independent_access_ranges() for a registered queue, e.g. when a device is revalidated, disk_set_independent_access_ranges() will execute disk_register_independent_access_ranges() to update the sysfs attribute files. The sysfs file structure created starts from the independent_access_ranges sub-directory and contains the start sector and number of sectors of each range, with the information for each range grouped in numbered sub-directories. E.g. for a dual actuator HDD, the user sees: $ tree /sys/block/sdk/queue/independent_access_ranges/ /sys/block/sdk/queue/independent_access_ranges/ |-- 0 | |-- nr_sectors | `-- sector `-- 1 |-- nr_sectors `-- sector For a regular device with a single access range, the independent_access_ranges sysfs directory does not exist. Device revalidation may lead to changes to this structure and to the attribute values. When manipulated, the queue sysfs_lock and sysfs_dir_lock mutexes are held for atomicity, similarly to how the blk-mq and elevator sysfs queue sub-directories are protected. The code related to the management of independent access ranges is added in the new file block/blk-ia-ranges.c. Signed-off-by: Damien Le Moal --- block/Makefile | 2 +- block/blk-ia-ranges.c | 348 +++++++++++++++++++++++++++++++++++++++++ block/blk-sysfs.c | 26 ++- block/blk.h | 4 + include/linux/blkdev.h | 39 +++++ 5 files changed, 410 insertions(+), 9 deletions(-) create mode 100644 block/blk-ia-ranges.c diff --git a/block/Makefile b/block/Makefile index c03b92d8a4bc..36ccea55619f 100644 --- a/block/Makefile +++ b/block/Makefile @@ -9,7 +9,7 @@ obj-$(CONFIG_BLOCK) := bdev.o fops.o bio.o elevator.o blk-core.o blk-sysfs.o \ blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o \ blk-mq-sysfs.o blk-mq-cpumap.o blk-mq-sched.o ioctl.o \ genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o \ - disk-events.o + disk-events.o blk-ia-ranges.o obj-$(CONFIG_BOUNCE) += bounce.o obj-$(CONFIG_BLK_SCSI_REQUEST) += scsi_ioctl.o diff --git a/block/blk-ia-ranges.c b/block/blk-ia-ranges.c new file mode 100644 index 000000000000..c246c425d0d7 --- /dev/null +++ b/block/blk-ia-ranges.c @@ -0,0 +1,348 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Block device concurrent positioning ranges. + * + * Copyright (C) 2021 Western Digital Corporation or its Affiliates. + */ +#include +#include +#include +#include + +#include "blk.h" + +static ssize_t +blk_ia_range_sector_show(struct blk_independent_access_range *iar, + char *buf) +{ + return sprintf(buf, "%llu\n", iar->sector); +} + +static ssize_t +blk_ia_range_nr_sectors_show(struct blk_independent_access_range *iar, + char *buf) +{ + return sprintf(buf, "%llu\n", iar->nr_sectors); +} + +struct blk_ia_range_sysfs_entry { + struct attribute attr; + ssize_t (*show)(struct blk_independent_access_range *iar, char *buf); +}; + +static struct blk_ia_range_sysfs_entry blk_ia_range_sector_entry = { + .attr = { .name = "sector", .mode = 0444 }, + .show = blk_ia_range_sector_show, +}; + +static struct blk_ia_range_sysfs_entry blk_ia_range_nr_sectors_entry = { + .attr = { .name = "nr_sectors", .mode = 0444 }, + .show = blk_ia_range_nr_sectors_show, +}; + +static struct attribute *blk_ia_range_attrs[] = { + &blk_ia_range_sector_entry.attr, + &blk_ia_range_nr_sectors_entry.attr, + NULL, +}; +ATTRIBUTE_GROUPS(blk_ia_range); + +static ssize_t blk_ia_range_sysfs_show(struct kobject *kobj, + struct attribute *attr, char *buf) +{ + struct blk_ia_range_sysfs_entry *entry = + container_of(attr, struct blk_ia_range_sysfs_entry, attr); + struct blk_independent_access_range *iar = + container_of(kobj, struct blk_independent_access_range, kobj); + ssize_t ret; + + mutex_lock(&iar->queue->sysfs_lock); + ret = entry->show(iar, buf); + mutex_unlock(&iar->queue->sysfs_lock); + + return ret; +} + +static const struct sysfs_ops blk_ia_range_sysfs_ops = { + .show = blk_ia_range_sysfs_show, +}; + +/* + * Independent access range entries are not freed individually, but alltogether + * with struct blk_independent_access_ranges and its array of ranges. Since + * kobject_add() takes a reference on the parent kobject contained in + * struct blk_independent_access_ranges, the array of independent access range + * entries cannot be freed until kobject_del() is called for all entries. + * So we do not need to do anything here, but still need this no-op release + * operation to avoid complaints from the kobject code. + */ +static void blk_ia_range_sysfs_nop_release(struct kobject *kobj) +{ +} + +static struct kobj_type blk_ia_range_ktype = { + .sysfs_ops = &blk_ia_range_sysfs_ops, + .default_groups = blk_ia_range_groups, + .release = blk_ia_range_sysfs_nop_release, +}; + +/* + * This will be executed only after all independent access range entries are + * removed with kobject_del(), at which point, it is safe to free everything, + * including the array of ranges. + */ +static void blk_ia_ranges_sysfs_release(struct kobject *kobj) +{ + struct blk_independent_access_ranges *iars = + container_of(kobj, struct blk_independent_access_ranges, kobj); + + kfree(iars); +} + +static struct kobj_type blk_ia_ranges_ktype = { + .release = blk_ia_ranges_sysfs_release, +}; + +/** + * disk_register_ia_ranges - register with sysfs a set of independent + * access ranges + * @disk: Target disk + * @new_iars: New set of independent access ranges + * + * Register with sysfs a set of independent access ranges for @disk. + * If @new_iars is not NULL, this set of ranges is registered and the old set + * specified by q->ia_ranges is unregistered. Otherwise, q->ia_ranges is + * registered if it is not already. + */ +int disk_register_independent_access_ranges(struct gendisk *disk, + struct blk_independent_access_ranges *new_iars) +{ + struct request_queue *q = disk->queue; + struct blk_independent_access_ranges *iars; + int i, ret; + + lockdep_assert_held(&q->sysfs_dir_lock); + lockdep_assert_held(&q->sysfs_lock); + + /* If a new range set is specified, unregister the old one */ + if (new_iars) { + if (q->ia_ranges) + disk_unregister_independent_access_ranges(disk); + q->ia_ranges = new_iars; + } + + iars = q->ia_ranges; + if (!iars) + return 0; + + /* + * At this point, iars is the new set of sector access ranges that needs + * to be registered with sysfs. + */ + WARN_ON(iars->sysfs_registered); + ret = kobject_init_and_add(&iars->kobj, &blk_ia_ranges_ktype, + &q->kobj, "%s", "independent_access_ranges"); + if (ret) { + q->ia_ranges = NULL; + kfree(iars); + return ret; + } + + for (i = 0; i < iars->nr_ia_ranges; i++) { + iars->ia_range[i].queue = q; + ret = kobject_init_and_add(&iars->ia_range[i].kobj, + &blk_ia_range_ktype, &iars->kobj, + "%d", i); + if (ret) { + while (--i >= 0) + kobject_del(&iars->ia_range[i].kobj); + kobject_del(&iars->kobj); + kobject_put(&iars->kobj); + return ret; + } + } + + iars->sysfs_registered = true; + + return 0; +} + +void disk_unregister_independent_access_ranges(struct gendisk *disk) +{ + struct request_queue *q = disk->queue; + struct blk_independent_access_ranges *iars = q->ia_ranges; + int i; + + lockdep_assert_held(&q->sysfs_dir_lock); + lockdep_assert_held(&q->sysfs_lock); + + if (!iars) + return; + + if (iars->sysfs_registered) { + for (i = 0; i < iars->nr_ia_ranges; i++) + kobject_del(&iars->ia_range[i].kobj); + kobject_del(&iars->kobj); + kobject_put(&iars->kobj); + } else { + kfree(iars); + } + + q->ia_ranges = NULL; +} + +static struct blk_independent_access_range * +disk_find_ia_range(struct blk_independent_access_ranges *iars, + sector_t sector) +{ + struct blk_independent_access_range *iar; + int i; + + for (i = 0; i < iars->nr_ia_ranges; i++) { + iar = &iars->ia_range[i]; + if (sector >= iar->sector && + sector < iar->sector + iar->nr_sectors) + return iar; + } + + return NULL; +} + +static bool disk_check_ia_ranges(struct gendisk *disk, + struct blk_independent_access_ranges *iars) +{ + struct blk_independent_access_range *iar, *tmp; + sector_t capacity = get_capacity(disk); + sector_t sector = 0; + int i; + + /* + * While sorting the ranges in increasing LBA order, check that the + * ranges do not overlap, that there are no sector holes and that all + * sectors belong to one range. + */ + for (i = 0; i < iars->nr_ia_ranges; i++) { + tmp = disk_find_ia_range(iars, sector); + if (!tmp || tmp->sector != sector) { + pr_warn("Invalid non-contiguous independent access ranges\n"); + return false; + } + + iar = &iars->ia_range[i]; + if (tmp != iar) { + swap(iar->sector, tmp->sector); + swap(iar->nr_sectors, tmp->nr_sectors); + } + + sector += iar->nr_sectors; + } + + if (sector != capacity) { + pr_warn("Independent access ranges do not match disk capacity\n"); + return false; + } + + return true; +} + +static bool disk_ia_ranges_changed(struct gendisk *disk, + struct blk_independent_access_ranges *new) +{ + struct blk_independent_access_ranges *old = disk->queue->ia_ranges; + int i; + + if (!old) + return true; + + if (old->nr_ia_ranges != new->nr_ia_ranges) + return true; + + for (i = 0; i < old->nr_ia_ranges; i++) { + if (new->ia_range[i].sector != old->ia_range[i].sector || + new->ia_range[i].nr_sectors != old->ia_range[i].nr_sectors) + return true; + } + + return false; +} + +/** + * disk_alloc_independent_access_ranges - Allocate an independent access ranges + * data structure + * @disk: target disk + * @nr_ia_ranges: Number of independent access ranges + * + * Allocate a struct blk_independent_access_ranges structure with @nr_ia_ranges + * access range descriptors. + */ +struct blk_independent_access_ranges * +disk_alloc_independent_access_ranges(struct gendisk *disk, int nr_ia_ranges) +{ + struct blk_independent_access_ranges *iars; + + iars = kzalloc_node(struct_size(iars, ia_range, nr_ia_ranges), + GFP_KERNEL, disk->queue->node); + if (iars) + iars->nr_ia_ranges = nr_ia_ranges; + return iars; +} +EXPORT_SYMBOL_GPL(disk_alloc_independent_access_ranges); + +/** + * disk_set_independent_access_ranges - Set a disk independent access ranges + * @disk: target disk + * @iars: independent access ranges structure + * + * Set the independent access ranges information of the request queue + * of @disk to @iars. If @iars is NULL and the independent access ranges + * structure already set is cleared. If there are no differences between + * @iars and the independent access ranges structure already set, @iars + * is freed. + */ +void disk_set_independent_access_ranges(struct gendisk *disk, + struct blk_independent_access_ranges *iars) +{ + struct request_queue *q = disk->queue; + + if (WARN_ON_ONCE(iars && !iars->nr_ia_ranges)) { + kfree(iars); + iars = NULL; + } + + mutex_lock(&q->sysfs_dir_lock); + mutex_lock(&q->sysfs_lock); + + if (iars) { + if (!disk_check_ia_ranges(disk, iars)) { + kfree(iars); + iars = NULL; + goto reg; + } + + if (!disk_ia_ranges_changed(disk, iars)) { + kfree(iars); + goto unlock; + } + } + + /* + * This may be called for a registered queue. E.g. during a device + * revalidation. If that is the case, we need to unregister the old + * set of independent access ranges and register the new set. If the + * queue is not registered, registration of the device request queue + * will register the independent access ranges, so only swap in the + * new set and free the old one. + */ +reg: + if (blk_queue_registered(q)) { + disk_register_independent_access_ranges(disk, iars); + } else { + swap(q->ia_ranges, iars); + kfree(iars); + } + +unlock: + mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); +} +EXPORT_SYMBOL_GPL(disk_set_independent_access_ranges); diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 614d9d47de36..1f932cd6c037 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -887,16 +887,15 @@ int blk_register_queue(struct gendisk *disk) } mutex_lock(&q->sysfs_lock); + + ret = disk_register_independent_access_ranges(disk, NULL); + if (ret) + goto put_dev; + if (q->elevator) { ret = elv_register_queue(q, false); - if (ret) { - mutex_unlock(&q->sysfs_lock); - mutex_unlock(&q->sysfs_dir_lock); - kobject_del(&q->kobj); - blk_trace_remove_sysfs(dev); - kobject_put(&dev->kobj); - return ret; - } + if (ret) + goto put_dev; } blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q); @@ -927,6 +926,16 @@ int blk_register_queue(struct gendisk *disk) percpu_ref_switch_to_percpu(&q->q_usage_counter); } + return ret; + +put_dev: + disk_unregister_independent_access_ranges(disk); + mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); + kobject_del(&q->kobj); + blk_trace_remove_sysfs(dev); + kobject_put(&dev->kobj); + return ret; } @@ -972,6 +981,7 @@ void blk_unregister_queue(struct gendisk *disk) mutex_lock(&q->sysfs_lock); if (q->elevator) elv_unregister_queue(q); + disk_unregister_independent_access_ranges(disk); mutex_unlock(&q->sysfs_lock); mutex_unlock(&q->sysfs_dir_lock); diff --git a/block/blk.h b/block/blk.h index 7d2a0ba7ed21..fa19ab9cd6b7 100644 --- a/block/blk.h +++ b/block/blk.h @@ -375,4 +375,8 @@ static inline void bio_clear_hipri(struct bio *bio) extern const struct address_space_operations def_blk_aops; +int disk_register_independent_access_ranges(struct gendisk *disk, + struct blk_independent_access_ranges *new_iars); +void disk_unregister_independent_access_ranges(struct gendisk *disk); + #endif /* BLK_INTERNAL_H */ diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index c9cb12483e12..2c754f01a7ae 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -377,6 +377,34 @@ static inline int blkdev_zone_mgmt_ioctl(struct block_device *bdev, #endif /* CONFIG_BLK_DEV_ZONED */ +/* + * Independent access ranges: struct blk_independent_access_range describes + * a range of contiguous sectors that can be accessed using device command + * execution resources that are independent from the resources used for + * other access ranges. This is typically found with single-LUN multi-actuator + * HDDs where each access range is served by a different set of heads. + * The set of independent ranges supported by the device is defined using + * struct blk_independent_access_ranges. The independent ranges must not overlap + * and must include all sectors within the disk capacity (no sector holes + * allowed). + * For a device with multiple ranges, requests targeting sectors in different + * ranges can be executed in parallel. A request can straddle an access range + * boundary. + */ +struct blk_independent_access_range { + struct kobject kobj; + struct request_queue *queue; + sector_t sector; + sector_t nr_sectors; +}; + +struct blk_independent_access_ranges { + struct kobject kobj; + bool sysfs_registered; + unsigned int nr_ia_ranges; + struct blk_independent_access_range ia_range[]; +}; + struct request_queue { struct request *last_merge; struct elevator_queue *elevator; @@ -569,6 +597,12 @@ struct request_queue { #define BLK_MAX_WRITE_HINTS 5 u64 write_hints[BLK_MAX_WRITE_HINTS]; + + /* + * Independent sector access ranges. This is always NULL for + * devices that do not have multiple independent access ranges. + */ + struct blk_independent_access_ranges *ia_ranges; }; /* Keep blk_queue_flag_name[] in sync with the definitions below */ @@ -1164,6 +1198,11 @@ extern void blk_queue_required_elevator_features(struct request_queue *q, extern bool blk_queue_can_use_dma_map_merging(struct request_queue *q, struct device *dev); +struct blk_independent_access_ranges * +disk_alloc_independent_access_ranges(struct gendisk *disk, int nr_ia_ranges); +void disk_set_independent_access_ranges(struct gendisk *disk, + struct blk_independent_access_ranges *iars); + /* * Number of physical segments as sent to the device. * From patchwork Thu Sep 9 02:35:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Damien Le Moal X-Patchwork-Id: 508642 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A79EC433F5 for ; Thu, 9 Sep 2021 02:35:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 210CB6113A for ; Thu, 9 Sep 2021 02:35:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245131AbhIIChC (ORCPT ); Wed, 8 Sep 2021 22:37:02 -0400 Received: from esa6.hgst.iphmx.com ([216.71.154.45]:42524 "EHLO esa6.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236257AbhIICg7 (ORCPT ); Wed, 8 Sep 2021 22:36:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1631154951; x=1662690951; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=JvrFu+U1V8PAYrjm6bWrGsnC1QXXCBxJtUJePkupkKw=; b=VKpfVNledxIwxV1uMulXPZ3b0NMGZKLRgg8kcqNAu7omtoxZAZHEPJD2 JkMVl6uVj1yvSHukGxtwc+wE7teRe9RSjDPRarl/AjYfWg+FmT5cfzbvL d0C6qM2D4Y7hvukuHIlBXn77opzwGXrf9a7q/pIomtUX2zOPfJtVJ34YK rUpzDhHQbppGpLGJy4TdGRH93aa1IWe0tScEBUyzGH/o6zio9jtfq0h1t 8bbTawMTBh8vSEOBf+l3Z+MumFI6NSmJ+sN/UyZZuhfPNwDI6PQuzhFOZ IhK5aZQxRAWnioqptIyj2VmYKav4o9IldQ7nR3WiXvwrtmAZAESLtgkIe g==; X-IronPort-AV: E=Sophos;i="5.85,279,1624291200"; d="scan'208";a="180062214" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 09 Sep 2021 10:35:51 +0800 IronPort-SDR: h9cRrC65eLFv8dT2umI7wINe/DTRIqIQyI2r2yA1lC8hIogX5CoQz3JxN2cSbbsGlhncgHkpGb ylwjC7sm/SpWTR5g5pIZDnGNfzB21skpW/8awij4F4pvdNNnHRbnrBWuowA2o0lJuWL3XgohFR iAZK0Odxa/uj5YKuFjvwKSdFDS+z7JJu9NJIqcVkK/zHDi6iIvuWkUgQxtAetzj3EQs5oxBnJ6 AitFFYsnphZS1R3KpWwX1hCYmc8ytPQSSHlZKX4te+EG7Y2cmB6jaHK18THUrPV3BsZPMyJywK Xr+avlB7cbqwlTeEorV8cQqk Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2021 19:12:26 -0700 IronPort-SDR: PakRXDKIdNI7Hzy8aMEgTTC3tu7XaNwQGGY52eXmyzMC4rUI7otea3WL9O/trd7vB/2RBnzo4s jehsuxzQY+x1B48F3Vp7XnePpQkKS2Sx538l6/AfGtQnVCcduQlX5ACtO3I/zAicmcXtg73h/F 0EFmtkG9hQs8KhmaV+0qDF4aZFudAoe0fJNHFqqWkue7vxxevIkeVzjpOmwY/A5Ta952pq5Phu Ic+prmHIlWNOMfKEa5o5HWh6mozptyCjWO0qCqKib3keMur1gC45sCe1+UFjED0aWqmBnYcYRZ xNA= WDCIronportException: Internal Received: from washi.fujisawa.hgst.com ([10.149.53.254]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Sep 2021 19:35:50 -0700 From: Damien Le Moal To: Jens Axboe , linux-block@vger.kernel.org, linux-ide@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Subject: [PATCH v8 3/5] libata: support concurrent positioning ranges log Date: Thu, 9 Sep 2021 11:35:43 +0900 Message-Id: <20210909023545.1101672-4-damien.lemoal@wdc.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210909023545.1101672-1-damien.lemoal@wdc.com> References: <20210909023545.1101672-1-damien.lemoal@wdc.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Add support to discover if an ATA device supports the Concurrent Positioning Ranges data log (address 0x47), indicating that the device is capable of seeking to multiple different locations in parallel using multiple actuators serving different LBA ranges. Also add support to translate the concurrent positioning ranges log into its equivalent Concurrent Positioning Ranges VPD page B9h in libata-scsi.c. The format of the Concurrent Positioning Ranges Log is defined in ACS-5 r9. Signed-off-by: Damien Le Moal Reviewed-by: Hannes Reinecke Reviewed-by: Christoph Hellwig --- drivers/ata/libata-core.c | 57 +++++++++++++++++++++++++++++++++++++-- drivers/ata/libata-scsi.c | 48 ++++++++++++++++++++++++++------- include/linux/ata.h | 1 + include/linux/libata.h | 15 +++++++++++ 4 files changed, 110 insertions(+), 11 deletions(-) diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index b8459c54f739..b083e5dae6f3 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -2433,18 +2433,70 @@ static void ata_dev_config_devslp(struct ata_device *dev) } } +static void ata_dev_config_cpr(struct ata_device *dev) +{ + unsigned int err_mask; + size_t buf_len; + int i, nr_cpr = 0; + struct ata_cpr_log *cpr_log = NULL; + u8 *desc, *buf = NULL; + + if (!ata_identify_page_supported(dev, + ATA_LOG_CONCURRENT_POSITIONING_RANGES)) + goto out; + + /* + * Read IDENTIFY DEVICE data log, page 0x47 + * (concurrent positioning ranges). We can have at most 255 32B range + * descriptors plus a 64B header. + */ + buf_len = (64 + 255 * 32 + 511) & ~511; + buf = kzalloc(buf_len, GFP_KERNEL); + if (!buf) + goto out; + + err_mask = ata_read_log_page(dev, ATA_LOG_IDENTIFY_DEVICE, + ATA_LOG_CONCURRENT_POSITIONING_RANGES, + buf, buf_len >> 9); + if (err_mask) + goto out; + + nr_cpr = buf[0]; + if (!nr_cpr) + goto out; + + cpr_log = kzalloc(struct_size(cpr_log, cpr, nr_cpr), GFP_KERNEL); + if (!cpr_log) + goto out; + + cpr_log->nr_cpr = nr_cpr; + desc = &buf[64]; + for (i = 0; i < nr_cpr; i++, desc += 32) { + cpr_log->cpr[i].num = desc[0]; + cpr_log->cpr[i].num_storage_elements = desc[1]; + cpr_log->cpr[i].start_lba = get_unaligned_le64(&desc[8]); + cpr_log->cpr[i].num_lbas = get_unaligned_le64(&desc[16]); + } + +out: + swap(dev->cpr_log, cpr_log); + kfree(cpr_log); + kfree(buf); +} + static void ata_dev_print_features(struct ata_device *dev) { if (!(dev->flags & ATA_DFLAG_FEATURES_MASK)) return; ata_dev_info(dev, - "Features:%s%s%s%s%s\n", + "Features:%s%s%s%s%s%s\n", dev->flags & ATA_DFLAG_TRUSTED ? " Trust" : "", dev->flags & ATA_DFLAG_DA ? " Dev-Attention" : "", dev->flags & ATA_DFLAG_DEVSLP ? " Dev-Sleep" : "", dev->flags & ATA_DFLAG_NCQ_SEND_RECV ? " NCQ-sndrcv" : "", - dev->flags & ATA_DFLAG_NCQ_PRIO ? " NCQ-prio" : ""); + dev->flags & ATA_DFLAG_NCQ_PRIO ? " NCQ-prio" : "", + dev->cpr_log ? " CPR" : ""); } /** @@ -2608,6 +2660,7 @@ int ata_dev_configure(struct ata_device *dev) ata_dev_config_sense_reporting(dev); ata_dev_config_zac(dev); ata_dev_config_trusted(dev); + ata_dev_config_cpr(dev); dev->cdb_len = 32; if (ata_msg_drv(ap) && print_info) diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 0b7b4624e4df..e1e97e62f716 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -1895,7 +1895,7 @@ static unsigned int ata_scsiop_inq_std(struct ata_scsi_args *args, u8 *rbuf) */ static unsigned int ata_scsiop_inq_00(struct ata_scsi_args *args, u8 *rbuf) { - int num_pages; + int i, num_pages = 0; static const u8 pages[] = { 0x00, /* page 0x00, this page */ 0x80, /* page 0x80, unit serial no page */ @@ -1905,13 +1905,17 @@ static unsigned int ata_scsiop_inq_00(struct ata_scsi_args *args, u8 *rbuf) 0xb1, /* page 0xb1, block device characteristics page */ 0xb2, /* page 0xb2, thin provisioning page */ 0xb6, /* page 0xb6, zoned block device characteristics */ + 0xb9, /* page 0xb9, concurrent positioning ranges */ }; - num_pages = sizeof(pages); - if (!(args->dev->flags & ATA_DFLAG_ZAC)) - num_pages--; + for (i = 0; i < sizeof(pages); i++) { + if (pages[i] == 0xb6 && + !(args->dev->flags & ATA_DFLAG_ZAC)) + continue; + rbuf[num_pages + 4] = pages[i]; + num_pages++; + } rbuf[3] = num_pages; /* number of supported VPD pages */ - memcpy(rbuf + 4, pages, num_pages); return 0; } @@ -2121,6 +2125,26 @@ static unsigned int ata_scsiop_inq_b6(struct ata_scsi_args *args, u8 *rbuf) return 0; } +static unsigned int ata_scsiop_inq_b9(struct ata_scsi_args *args, u8 *rbuf) +{ + struct ata_cpr_log *cpr_log = args->dev->cpr_log; + u8 *desc = &rbuf[64]; + int i; + + /* SCSI Concurrent Positioning Ranges VPD page: SBC-5 rev 1 or later */ + rbuf[1] = 0xb9; + put_unaligned_be16(64 + (int)cpr_log->nr_cpr * 32 - 4, &rbuf[3]); + + for (i = 0; i < cpr_log->nr_cpr; i++, desc += 32) { + desc[0] = cpr_log->cpr[i].num; + desc[1] = cpr_log->cpr[i].num_storage_elements; + put_unaligned_be64(cpr_log->cpr[i].start_lba, &desc[8]); + put_unaligned_be64(cpr_log->cpr[i].num_lbas, &desc[16]); + } + + return 0; +} + /** * modecpy - Prepare response for MODE SENSE * @dest: output buffer @@ -4120,11 +4144,17 @@ void ata_scsi_simulate(struct ata_device *dev, struct scsi_cmnd *cmd) ata_scsi_rbuf_fill(&args, ata_scsiop_inq_b2); break; case 0xb6: - if (dev->flags & ATA_DFLAG_ZAC) { + if (dev->flags & ATA_DFLAG_ZAC) ata_scsi_rbuf_fill(&args, ata_scsiop_inq_b6); - break; - } - fallthrough; + else + ata_scsi_set_invalid_field(dev, cmd, 2, 0xff); + break; + case 0xb9: + if (dev->cpr_log) + ata_scsi_rbuf_fill(&args, ata_scsiop_inq_b9); + else + ata_scsi_set_invalid_field(dev, cmd, 2, 0xff); + break; default: ata_scsi_set_invalid_field(dev, cmd, 2, 0xff); break; diff --git a/include/linux/ata.h b/include/linux/ata.h index 1b44f40c7700..199e47e97d64 100644 --- a/include/linux/ata.h +++ b/include/linux/ata.h @@ -329,6 +329,7 @@ enum { ATA_LOG_SECURITY = 0x06, ATA_LOG_SATA_SETTINGS = 0x08, ATA_LOG_ZONED_INFORMATION = 0x09, + ATA_LOG_CONCURRENT_POSITIONING_RANGES = 0x47, /* Identify device SATA settings log:*/ ATA_LOG_DEVSLP_OFFSET = 0x30, diff --git a/include/linux/libata.h b/include/linux/libata.h index 860e63f5667b..e3ab85964f94 100644 --- a/include/linux/libata.h +++ b/include/linux/libata.h @@ -675,6 +675,18 @@ struct ata_ering { struct ata_ering_entry ring[ATA_ERING_SIZE]; }; +struct ata_cpr { + u8 num; + u8 num_storage_elements; + u64 start_lba; + u64 num_lbas; +}; + +struct ata_cpr_log { + u8 nr_cpr; + struct ata_cpr cpr[]; +}; + struct ata_device { struct ata_link *link; unsigned int devno; /* 0 or 1 */ @@ -734,6 +746,9 @@ struct ata_device { u32 zac_zones_optimal_nonseq; u32 zac_zones_max_open; + /* Concurrent positioning ranges */ + struct ata_cpr_log *cpr_log; + /* error history */ int spdn_cnt; /* ering is CLEAR_END, read comment above CLEAR_END */ From patchwork Thu Sep 9 02:35:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Damien Le Moal X-Patchwork-Id: 508641 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12C75C433F5 for ; Thu, 9 Sep 2021 02:36:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E912E6113C for ; Thu, 9 Sep 2021 02:36:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241443AbhIIChK (ORCPT ); Wed, 8 Sep 2021 22:37:10 -0400 Received: from esa6.hgst.iphmx.com ([216.71.154.45]:42527 "EHLO esa6.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238723AbhIIChB (ORCPT ); Wed, 8 Sep 2021 22:37:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1631154952; x=1662690952; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=GzrETZRidFfNItPSNL+wFqWCgQ95fWYFF2GYt16RvEc=; b=AFZF0zPA1lD9CaRHFh75HUIQadKyxw4215iTVLIm3hMB47rK3TwGK72R /LqsvaIcqYd7AaKpX/jH6PUVIg2l1WgI4Y22bScJBfFy5pjUsAXUGIBOJ m7sxM4dLuhV3hyE0tVuffe1Kvuhl16nMx0uNWP3TP9+lWXjUg87k5N+yA wYbvfQQGsy5G+hddt1vww2UvLY4FI6Scnv97zncI3O+5Ta0gqNaRKzRIB hPUrplVxPcFFV0kviKXtlv3I9br0MDOOxnMJkUT4C11D/3C/9u6zICJHp PTo/MPzFKNIMmWws45mlk/kdUVYzisBWDOItKlJgmWJUA/wAvMV51kpzW Q==; X-IronPort-AV: E=Sophos;i="5.85,279,1624291200"; d="scan'208";a="180062216" Received: from h199-255-45-14.hgst.com (HELO uls-op-cesaep01.wdc.com) ([199.255.45.14]) by ob1.hgst.iphmx.com with ESMTP; 09 Sep 2021 10:35:52 +0800 IronPort-SDR: 1XKXK4zGeBJLzOfuSCd/thL1jHxgx8w6s7RIpmc77QJ5H0gpKrf7/V0kQ9rRDfEc0m3d4bNgwJ scsLev/LglG/ZoppqUMzHU7ZbnJPl/s56csd48VAMQXoYBjO0S7BoO0fjuXb+HISVg9wPMIdsf dauQqZ8ZIr2MNNAhPmiGN7/s79pCpD31+SBb3Nk/N5n8ko0IbIp8m0V6jZBV+XKA5IHWNslcrI CZRfPPkiUEZqcXIc8/h83QEqJFbo4iFkmcFMAvDLvHiFbpPvw6TiqjD50ozXk58rFqz5GJAHSq K9bfUvGXpabzOFzl+sndCp2W Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep01.wdc.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Sep 2021 19:12:27 -0700 IronPort-SDR: tN1KWA0S2mGa4HfJ8XCgr9akIGIt/PP/mv7Z76fEtA0hg2rE7wgLCSzHJDUQca0t98m4WgB4OP t5BI1QGmMuorXtZuP5pCTdooeBB66XjmO1vqHvs0U9gQ+fe+9YxU5zQR3BYDAwm6vMAsfU9TK6 qW6KgSmWzA/+YNetX7A9EO0tTOrlqoEuZ4w2SSl0NyCnG1mSfN08hGY3G9LJNQgLJGS4/YEkCe eH0AgTpLsoyiAcj1efSSWk4P7OIokr29rsguZFcNwXAkkD9MMMsSQIRqCUHeVG46Bf7clKAJec CfY= WDCIronportException: Internal Received: from washi.fujisawa.hgst.com ([10.149.53.254]) by uls-op-cesaip02.wdc.com with ESMTP; 08 Sep 2021 19:35:51 -0700 From: Damien Le Moal To: Jens Axboe , linux-block@vger.kernel.org, linux-ide@vger.kernel.org, "Martin K . Petersen" , linux-scsi@vger.kernel.org Subject: [PATCH v8 4/5] doc: document sysfs queue/independent_access_ranges attributes Date: Thu, 9 Sep 2021 11:35:44 +0900 Message-Id: <20210909023545.1101672-5-damien.lemoal@wdc.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210909023545.1101672-1-damien.lemoal@wdc.com> References: <20210909023545.1101672-1-damien.lemoal@wdc.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Update the file Documentation/block/queue-sysfs.rst to add a description of a device queue sysfs entries related to independent access ranges (e.g. concurrent positioning ranges for multi-actuator hard-disks). Signed-off-by: Damien Le Moal Reviewed-by: Hannes Reinecke --- Documentation/block/queue-sysfs.rst | 31 +++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/Documentation/block/queue-sysfs.rst b/Documentation/block/queue-sysfs.rst index 4dc7f0d499a8..b6e8983d8eda 100644 --- a/Documentation/block/queue-sysfs.rst +++ b/Documentation/block/queue-sysfs.rst @@ -286,4 +286,35 @@ sequential zones of zoned block devices (devices with a zoned attributed that reports "host-managed" or "host-aware"). This value is always 0 for regular block devices. +independent_access_ranges (RO) +------------------------------ + +The presence of this sub-directory of the /sys/block/xxx/queue/ directory +indicates that the device is capable of executing requests targeting +different sector ranges in parallel. For instance, single LUN multi-actuator +hard-disks will have an independent_access_ranges directory if the device +correctly advertizes the sector ranges of its actuators. + +The independent_access_ranges directory contains one directory per access +range, with each range described using the sector (RO) attribute file to +indicate the first sector of the range and the nr_sectors (RO) attribute file +to indicate the total number of sectors in the range starting from the first +sector of the range. For example, a dual-actuator hard-disk will have the +following independent_access_ranges entries.:: + + $ tree /sys/block//queue/independent_access_ranges/ + /sys/block//queue/independent_access_ranges/ + |-- 0 + | |-- nr_sectors + | `-- sector + `-- 1 + |-- nr_sectors + `-- sector + +The sector and nr_sectors attributes use 512B sector unit, regardless of +the actual block size of the device. Independent access ranges do not +overlap and include all sectors within the device capacity. The access +ranges are numbered in increasing order of the range start sector, +that is, the sector attribute of range 0 always has the value 0. + Jens Axboe , February 2009