diff mbox series

[v3,07/15] block: Limit atomic write IO size according to atomic_write_max_sectors

Message ID 20240124113841.31824-8-john.g.garry@oracle.com
State New
Headers show
Series block atomic writes | expand

Commit Message

John Garry Jan. 24, 2024, 11:38 a.m. UTC
Currently an IO size is limited to the request_queue limits max_sectors.
Limit the size for an atomic write to queue limit atomic_write_max_sectors
value.

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 block/blk-merge.c | 11 ++++++++++-
 block/blk.h       |  3 +++
 2 files changed, 13 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig Feb. 13, 2024, 6:26 a.m. UTC | #1
On Wed, Jan 24, 2024 at 11:38:33AM +0000, John Garry wrote:
> Currently an IO size is limited to the request_queue limits max_sectors.
> Limit the size for an atomic write to queue limit atomic_write_max_sectors
> value.

Same here.  Please have one patch that actually adds useful atomic write
support to the block layer.  That doesn't include fs stuff like
IOCB_ATOMIC or the block file operation support, but to have a reviewable
chunk I'd really like to see the full block-layer support for the limits,
enforcing them, the merge prevention in a single commit with an extensive
commit log explaining the semantics.  That allows a useful review without
looking at the full tree, and also will help with people reading history
in the future.
John Garry Feb. 13, 2024, 8:15 a.m. UTC | #2
On 13/02/2024 06:26, Christoph Hellwig wrote:
> On Wed, Jan 24, 2024 at 11:38:33AM +0000, John Garry wrote:
>> Currently an IO size is limited to the request_queue limits max_sectors.
>> Limit the size for an atomic write to queue limit atomic_write_max_sectors
>> value.
> 
> Same here.  Please have one patch that actually adds useful atomic write
> support to the block layer.  That doesn't include fs stuff like
> IOCB_ATOMIC or the block file operation support, but to have a reviewable
> chunk I'd really like to see the full block-layer support for the limits,
> enforcing them, the merge prevention in a single commit with an extensive
> commit log explaining the semantics.  That allows a useful review without
> looking at the full tree, and also will help with people reading history
> in the future.

Fine, so essentially merge 1, 2, 5, 7, 8, 9

BTW, I was also going to add this function which ensures that partitions 
are properly aligned:

bool bdev_can_atomic_write(struct block_device *bdev)
{
	struct request_queue *bd_queue = bdev->bd_queue;
	struct queue_limits *limits = &bd_queue->limits;

	if(!limits->atomic_write_unit_min_sectors)
		return false;

	if (bdev_is_partition(bdev)) {
		unsigned int granularity = max(limits->atomic_write_unit_min_sectors,
limits->atomic_write_hw_boundary_sectors);
		if (bdev->bd_start_sect % granularity)
			return false;
	}
	return true;
}

I'm note sure if that would be better in the fops.c patch (or not added)

Thanks,
John
Christoph Hellwig Feb. 14, 2024, 7:26 a.m. UTC | #3
On Tue, Feb 13, 2024 at 08:15:08AM +0000, John Garry wrote:
> I'm note sure if that would be better in the fops.c patch (or not added)

We'll need the partition check.  If you want to get fancy you could
also add the atomic boundary offset thing there as a partitions would
make devices with that "feature" useful again, although I'd prefer to
only deal with that if the need actually arises.

The right place is in the core infrastructure, the bdev patch is just
a user of the block infrastructure.  bdev really are just another
file system and a consumer of the block layer APIs.
John Garry Feb. 14, 2024, 9:24 a.m. UTC | #4
On 14/02/2024 07:26, Christoph Hellwig wrote:
> On Tue, Feb 13, 2024 at 08:15:08AM +0000, John Garry wrote:
>> I'm note sure if that would be better in the fops.c patch (or not added)
> 
> We'll need the partition check.  If you want to get fancy you could
> also add the atomic boundary offset thing there as a partitions would
> make devices with that "feature" useful again, although I'd prefer to
> only deal with that if the need actually arises.

Yeah, that is my general philosophy about possible weird HW.

> 
> The right place is in the core infrastructure, the bdev patch is just
> a user of the block infrastructure.  bdev really are just another
> file system and a consumer of the block layer APIs.

ok, I'll try to find a good place for it.

Thanks,
John
diff mbox series

Patch

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 74e9e775f13d..6306a2c82354 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -167,7 +167,16 @@  static inline unsigned get_max_io_size(struct bio *bio,
 {
 	unsigned pbs = lim->physical_block_size >> SECTOR_SHIFT;
 	unsigned lbs = lim->logical_block_size >> SECTOR_SHIFT;
-	unsigned max_sectors = lim->max_sectors, start, end;
+	unsigned max_sectors, start, end;
+
+	/*
+	 * We ignore lim->max_sectors for atomic writes simply because
+	 * it may less than the bio size, which we cannot tolerate.
+	 */
+	if (bio->bi_opf & REQ_ATOMIC)
+		max_sectors = lim->atomic_write_max_sectors;
+	else
+		max_sectors = lim->max_sectors;
 
 	if (lim->chunk_sectors) {
 		max_sectors = min(max_sectors,
diff --git a/block/blk.h b/block/blk.h
index 050696131329..6ba8333fcf26 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -178,6 +178,9 @@  static inline unsigned int blk_queue_get_max_sectors(struct request *rq)
 	if (unlikely(op == REQ_OP_WRITE_ZEROES))
 		return q->limits.max_write_zeroes_sectors;
 
+	if (rq->cmd_flags & REQ_ATOMIC)
+		return q->limits.atomic_write_max_sectors;
+
 	return q->limits.max_sectors;
 }