Message ID | 20240903150748.2179966-2-john.g.garry@oracle.com |
---|---|
State | New |
Headers | show |
Series | RAID0 atomic write support | expand |
On Tue, Sep 03, 2024 at 03:07:45PM +0000, John Garry wrote: > For bdev file operations, a write will be truncated when trying to write > past the end of the device. This could not be tolerated for an atomic > write. > > Ensure that the size of the bdev matches max atomic write unit so that this > truncation would never occur. But we'd still support atomic writes for all but the last sectors of the device? Isn't this really an application problem? If not supporting atomic writes at all for unaligned devices is the right thing to do, we'll need to clearly document this somewhere. Any maybe also add a pr_once to log a message?
On 12/09/2024 14:15, Christoph Hellwig wrote: > On Tue, Sep 03, 2024 at 03:07:45PM +0000, John Garry wrote: >> For bdev file operations, a write will be truncated when trying to write >> past the end of the device. This could not be tolerated for an atomic >> write. >> >> Ensure that the size of the bdev matches max atomic write unit so that this >> truncation would never occur. > > But we'd still support atomic writes for all but the last sectors of > the device? We should do be able to, but with this patch we cannot. However, a misaligned partition would be very much unexpected. > Isn't this really an application problem? Sure, if the application tried to do an atomic write to the end of the device and it was truncated. > > If not supporting atomic writes at all for unaligned devices is the right > thing to do, we'll need to clearly document this somewhere. Any maybe > also add a pr_once to log a message? I could also just reject any truncation on the atomic write in fops. Maybe that is better. And at some stage looking at making parted, fdisk, and other partitioning tools atomic write aware would be good, so that the user knows about these restrictions. Thanks, John
On Thu, Sep 12, 2024 at 03:58:00PM +0100, John Garry wrote: > On 12/09/2024 14:15, Christoph Hellwig wrote: >> On Tue, Sep 03, 2024 at 03:07:45PM +0000, John Garry wrote: >>> For bdev file operations, a write will be truncated when trying to write >>> past the end of the device. This could not be tolerated for an atomic >>> write. >>> >>> Ensure that the size of the bdev matches max atomic write unit so that this >>> truncation would never occur. >> >> But we'd still support atomic writes for all but the last sectors of >> the device? > > We should do be able to, but with this patch we cannot. However, a > misaligned partition would be very much unexpected. Yes, misaligned partitions is very unexpected, but with large and potentially unlimited atomic boundaries I would not expect the size to always be aligned. But then again at least in NVMe atomic writes don't need to match the max size anyway, so I'm not entirely sure what the problem actually is. > I could also just reject any truncation on the atomic write in fops. Maybe > that is better. It probably is.
On 12/09/2024 16:07, Christoph Hellwig wrote: >> We should do be able to, but with this patch we cannot. However, a >> misaligned partition would be very much unexpected. > Yes, misaligned partitions is very unexpected, but with large and > potentially unlimited atomic boundaries I would not expect the size > to always be aligned. But then again at least in NVMe atomic writes > don't need to match the max size anyway, so I'm not entirely sure > what the problem actually is. Actually it's not an alignment issue, but a size issue. Consider a 3.5MB partition and atomic write max is 1MB. If we tried to atomic write 1MB at offset 3MB, then it would be truncated to a 0.5MB write. So maybe it is an application bug. > >> I could also just reject any truncation on the atomic write in fops. Maybe >> that is better.
On 9/12/24 17:22, John Garry wrote: > On 12/09/2024 16:07, Christoph Hellwig wrote: >>> We should do be able to, but with this patch we cannot. However, a >>> misaligned partition would be very much unexpected. >> Yes, misaligned partitions is very unexpected, but with large and >> potentially unlimited atomic boundaries I would not expect the size >> to always be aligned. But then again at least in NVMe atomic writes >> don't need to match the max size anyway, so I'm not entirely sure >> what the problem actually is. > > Actually it's not an alignment issue, but a size issue. > > Consider a 3.5MB partition and atomic write max is 1MB. If we tried to > atomic write 1MB at offset 3MB, then it would be truncated to a 0.5MB > write. > > So maybe it is an application bug. > Hmm. Why don't we reject such an I/O? We cannot guarantee an atomic write, so I think we should be perfectly fine to return an error to userspace. Cheers, Hannes
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index b7664d593486..af8434e391fa 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1662,6 +1662,9 @@ static inline bool bdev_can_atomic_write(struct block_device *bdev) if (!limits->atomic_write_unit_min) return false; + if (!IS_ALIGNED(bdev_nr_bytes(bdev), limits->atomic_write_unit_max)) + return false; + if (bdev_is_partition(bdev)) { sector_t bd_start_sect = bdev->bd_start_sect; unsigned int alignment =
For bdev file operations, a write will be truncated when trying to write past the end of the device. This could not be tolerated for an atomic write. Ensure that the size of the bdev matches max atomic write unit so that this truncation would never occur. Fixes: 9da3d1e912f3 ("block: Add core atomic write support") Signed-off-by: John Garry <john.g.garry@oracle.com> --- include/linux/blkdev.h | 3 +++ 1 file changed, 3 insertions(+)