From patchwork Wed May 29 05:04:04 2024
X-Patchwork-Submitter: Christoph Hellwig
X-Patchwork-Id: 799908
From: Christoph Hellwig
To: Jens Axboe, "Martin K. Petersen"
Petersen" Cc: Richard Weinberger , Anton Ivanov , Johannes Berg , Josef Bacik , Ilya Dryomov , Dongsheng Yang , =?utf-8?q?Roger_Pau_Monn?= =?utf-8?q?=C3=A9?= , linux-um@lists.infradead.org, linux-block@vger.kernel.org, nbd@other.debian.org, ceph-devel@vger.kernel.org, xen-devel@lists.xenproject.org, linux-scsi@vger.kernel.org Subject: [PATCH 02/12] block: take io_opt and io_min into account for max_sectors Date: Wed, 29 May 2024 07:04:04 +0200 Message-ID: <20240529050507.1392041-3-hch@lst.de> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240529050507.1392041-1-hch@lst.de> References: <20240529050507.1392041-1-hch@lst.de> Precedence: bulk X-Mailing-List: ceph-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org. See http://www.infradead.org/rpr.html The soft max_sectors limit is normally capped by the hardware limits and an arbitrary upper limit enforced by the kernel, but can be modified by the user. A few drivers want to increase this limit (nbd, rbd) or adjust it up or down based on hardware capabilities (sd). Change blk_validate_limits to default max_sectors to the optimal I/O size, or upgrade it to the preferred minimal I/O size if that is larger than the kernel default if no optimal I/O size is provided based on the logic in the SD driver. This keeps the existing kernel default for drivers that do not provide an io_opt or very big io_min value, but picks a much more useful default for those who provide these hints, and allows to remove the hacks to set the user max_sectors limit in nbd, rbd and sd. Note that rd picks a different value for the optimal I/O size vs the user max_sectors value, so this is a bit of a behavior change that could use careful review from people familiar with rbd. 
Signed-off-by: Christoph Hellwig
Reviewed-by: Damien Le Moal
Reviewed-by: Bart Van Assche
---
 block/blk-settings.c |  7 +++++++
 drivers/block/nbd.c  |  2 +-
 drivers/block/rbd.c  |  1 -
 drivers/scsi/sd.c    | 29 +++++------------------------
 4 files changed, 13 insertions(+), 26 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index effeb9a639bb45..a49abdb3554834 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -153,6 +153,13 @@ static int blk_validate_limits(struct queue_limits *lim)
 		if (lim->max_user_sectors < PAGE_SIZE / SECTOR_SIZE)
 			return -EINVAL;
 		lim->max_sectors = min(max_hw_sectors, lim->max_user_sectors);
+	} else if (lim->io_opt) {
+		lim->max_sectors =
+			min(max_hw_sectors, lim->io_opt >> SECTOR_SHIFT);
+	} else if (lim->io_min &&
+		   lim->io_min > (BLK_DEF_MAX_SECTORS_CAP << SECTOR_SHIFT)) {
+		lim->max_sectors =
+			min(max_hw_sectors, lim->io_min >> SECTOR_SHIFT);
 	} else {
 		lim->max_sectors = min(max_hw_sectors, BLK_DEF_MAX_SECTORS_CAP);
 	}
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 22a79a62cc4eab..ad887d614d5b3f 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1808,7 +1808,7 @@ static struct nbd_device *nbd_dev_add(int index, unsigned int refs)
 {
 	struct queue_limits lim = {
 		.max_hw_sectors = 65536,
-		.max_user_sectors = 256,
+		.io_opt = 256 << SECTOR_SHIFT,
 		.max_segments = USHRT_MAX,
 		.max_segment_size = UINT_MAX,
 	};
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 26ff5cd2bf0abc..05096172f334a3 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -4954,7 +4954,6 @@ static int rbd_init_disk(struct rbd_device *rbd_dev)
 		rbd_dev->layout.object_size * rbd_dev->layout.stripe_count;
 	struct queue_limits lim = {
 		.max_hw_sectors = objset_bytes >> SECTOR_SHIFT,
-		.max_user_sectors = objset_bytes >> SECTOR_SHIFT,
 		.io_min = rbd_dev->opts->alloc_size,
 		.io_opt = rbd_dev->opts->alloc_size,
 		.max_segments = USHRT_MAX,
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index f6c822c9cbd2d3..3dff9150ce11e2 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3593,7 +3593,7 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	struct request_queue *q = sdkp->disk->queue;
 	sector_t old_capacity = sdkp->capacity;
 	unsigned char *buffer;
-	unsigned int dev_max, rw_max;
+	unsigned int dev_max;
 
 	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
 				      "sd_revalidate_disk\n"));
@@ -3675,34 +3675,15 @@ static int sd_revalidate_disk(struct gendisk *disk)
 	else
 		blk_queue_io_min(sdkp->disk->queue, 0);
 
-	if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
-		q->limits.io_opt = logical_to_bytes(sdp, sdkp->opt_xfer_blocks);
-		rw_max = logical_to_sectors(sdp, sdkp->opt_xfer_blocks);
-	} else {
-		q->limits.io_opt = 0;
-		rw_max = min_not_zero(logical_to_sectors(sdp, dev_max),
-				      (sector_t)BLK_DEF_MAX_SECTORS_CAP);
-	}
-
 	/*
 	 * Limit default to SCSI host optimal sector limit if set. There may be
 	 * an impact on performance for when the size of a request exceeds this
 	 * host limit.
 	 */
-	rw_max = min_not_zero(rw_max, sdp->host->opt_sectors);
-
-	/* Do not exceed controller limit */
-	rw_max = min(rw_max, queue_max_hw_sectors(q));
-
-	/*
-	 * Only update max_sectors if previously unset or if the current value
-	 * exceeds the capabilities of the hardware.
-	 */
-	if (sdkp->first_scan ||
-	    q->limits.max_sectors > q->limits.max_dev_sectors ||
-	    q->limits.max_sectors > q->limits.max_hw_sectors) {
-		q->limits.max_sectors = rw_max;
-		q->limits.max_user_sectors = rw_max;
+	q->limits.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT;
+	if (sd_validate_opt_xfer_size(sdkp, dev_max)) {
+		q->limits.io_opt = min_not_zero(q->limits.io_opt,
+				logical_to_bytes(sdp, sdkp->opt_xfer_blocks));
 	}
 
 	sdkp->first_scan = 0;
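For reference, a small standalone sketch of how the reworked sd logic
combines the two optimal-I/O hints, i.e. the min_not_zero semantics
used above (a userspace approximation with illustrative values, not
the driver code):

#include <stdio.h>

#define SECTOR_SHIFT	9

/* Smaller of the two values, ignoring zero (mirrors the kernel macro). */
static unsigned int min_not_zero(unsigned int a, unsigned int b)
{
	if (!a)
		return b;
	if (!b)
		return a;
	return a < b ? a : b;
}

int main(void)
{
	/* host->opt_sectors hint, converted to bytes as in the patch */
	unsigned int host_opt = 1024 << SECTOR_SHIFT;
	/* device OPTIMAL TRANSFER LENGTH in bytes (illustrative value) */
	unsigned int dev_opt = 512 << SECTOR_SHIFT;

	/* io_opt becomes the smaller non-zero hint: 262144 bytes here */
	printf("io_opt = %u bytes\n", min_not_zero(host_opt, dev_opt));
	return 0;
}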