Message ID | 20240611051929.513387-11-hch@lst.de |
---|---|
State | New |
Headers | show |
Series | [01/26] sd: fix sd_is_zoned | expand |
On Thu, Jun 13, 2024 at 04:05:08PM +0200, Christoph Hellwig wrote: > On Wed, Jun 12, 2024 at 05:56:15PM +0200, Roger Pau Monné wrote: > > Right. AFAICT advertising "feature-barrier" and/or > > "feature-flush-cache" could be done based on whether blkback > > understand those commands, not on whether the underlying storage > > supports the equivalent of them. > > > > Worst case we can print a warning message once about the underlying > > storage failing to complete flush/barrier requests, and that data > > integrity might not be guaranteed going forward, and not propagate the > > error to the upper layer? > > > > What would be the consequence of propagating a flush error to the > > upper layers? > > If you propage the error to the upper layer you will generate an > I/O error there, which usually leads to a file system shutdown. > > > Given the description of the feature in the blkif header, I'm afraid > > we cannot guarantee that seeing the feature exposed implies barrier or > > flush support, since the request could fail at any time (or even from > > the start of the disk attachment) and it would still sadly be a correct > > implementation given the description of the options. > > Well, then we could do something like the patch below, which keeps > the existing behavior, but insolates the block layer from it and > removes the only user of blk_queue_write_cache from interrupt > context: LGTM, I'm not sure there's much else we can do. > --- > From e6e82c769ab209a77302994c3829cf6ff7a595b8 Mon Sep 17 00:00:00 2001 > From: Christoph Hellwig <hch@lst.de> > Date: Thu, 30 May 2024 08:58:52 +0200 > Subject: xen-blkfront: don't disable cache flushes when they fail > > blkfront always had a robust negotiation protocol for detecting a write > cache. Stop simply disabling cache flushes in the block layer as the > flags handling is moving to the atomic queue limits API that needs > user context to freeze the queue for that. Instead handle the case > of the feature flags cleared inside of blkfront. This removes old > debug code to check for such a mismatch which was previously impossible > to hit, including the check for passthrough requests that blkfront > never used to start with. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > drivers/block/xen-blkfront.c | 44 +++++++++++++++++++----------------- > 1 file changed, 23 insertions(+), 21 deletions(-) > > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c > index 9b4ec3e4908cce..e2c92d5095ff17 100644 > --- a/drivers/block/xen-blkfront.c > +++ b/drivers/block/xen-blkfront.c > @@ -788,6 +788,14 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri > * A barrier request a superset of FUA, so we can > * implement it the same way. (It's also a FLUSH+FUA, > * since it is guaranteed ordered WRT previous writes.) > + * > + * Note that can end up here with a FUA write and the > + * flags cleared. This happens when the flag was > + * run-time disabled and raced with I/O submission in > + * the block layer. We submit it as a normal write Since blkfront no longer signals that FUA is no longer available for the device, getting a request with FUA is not actually a race I think? > + * here. A pure flush should never end up here with > + * the flags cleared as they are completed earlier for > + * the !feature_flush case. > */ > if (info->feature_flush && info->feature_fua) > ring_req->operation = > @@ -795,8 +803,6 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri > else if (info->feature_flush) > ring_req->operation = > BLKIF_OP_FLUSH_DISKCACHE; > - else > - ring_req->operation = 0; > } > ring_req->u.rw.nr_segments = num_grant; > if (unlikely(require_extra_req)) { > @@ -887,16 +893,6 @@ static inline void flush_requests(struct blkfront_ring_info *rinfo) > notify_remote_via_irq(rinfo->irq); > } > > -static inline bool blkif_request_flush_invalid(struct request *req, > - struct blkfront_info *info) > -{ > - return (blk_rq_is_passthrough(req) || > - ((req_op(req) == REQ_OP_FLUSH) && > - !info->feature_flush) || > - ((req->cmd_flags & REQ_FUA) && > - !info->feature_fua)); > -} > - > static blk_status_t blkif_queue_rq(struct blk_mq_hw_ctx *hctx, > const struct blk_mq_queue_data *qd) > { > @@ -908,23 +904,30 @@ static blk_status_t blkif_queue_rq(struct blk_mq_hw_ctx *hctx, > rinfo = get_rinfo(info, qid); > blk_mq_start_request(qd->rq); > spin_lock_irqsave(&rinfo->ring_lock, flags); > - if (RING_FULL(&rinfo->ring)) > - goto out_busy; > > - if (blkif_request_flush_invalid(qd->rq, rinfo->dev_info)) > - goto out_err; > + /* > + * Check if the backend actually supports flushes. > + * > + * While the block layer won't send us flushes if we don't claim to > + * support them, the Xen protocol allows the backend to revoke support > + * at any time. That is of course a really bad idea and dangerous, but > + * has been allowed for 10+ years. In that case we simply clear the > + * flags, and directly return here for an empty flush and ignore the > + * FUA flag later on. > + */ > + if (unlikely(req_op(qd->rq) == REQ_OP_FLUSH && !info->feature_flush)) > + goto out; Don't you need to complete the request here? Thanks, Roger.
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 9b4ec3e4908cce..9794ac2d3299d1 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -982,18 +982,6 @@ static const char *flush_info(struct blkfront_info *info) return "barrier or flush: disabled;"; } -static void xlvbd_flush(struct blkfront_info *info) -{ - blk_queue_write_cache(info->rq, info->feature_flush ? true : false, - info->feature_fua ? true : false); - pr_info("blkfront: %s: %s %s %s %s %s %s %s\n", - info->gd->disk_name, flush_info(info), - "persistent grants:", info->feature_persistent ? - "enabled;" : "disabled;", "indirect descriptors:", - info->max_indirect_segments ? "enabled;" : "disabled;", - "bounce buffer:", info->bounce ? "enabled" : "disabled;"); -} - static int xen_translate_vdev(int vdevice, int *minor, unsigned int *offset) { int major; @@ -1162,7 +1150,15 @@ static int xlvbd_alloc_gendisk(blkif_sector_t capacity, info->sector_size = sector_size; info->physical_sector_size = physical_sector_size; - xlvbd_flush(info); + blk_queue_write_cache(info->rq, info->feature_flush ? true : false, + info->feature_fua ? true : false); + + pr_info("blkfront: %s: %s %s %s %s %s %s %s\n", + info->gd->disk_name, flush_info(info), + "persistent grants:", info->feature_persistent ? + "enabled;" : "disabled;", "indirect descriptors:", + info->max_indirect_segments ? "enabled;" : "disabled;", + "bounce buffer:", info->bounce ? "enabled" : "disabled;"); if (info->vdisk_info & VDISK_READONLY) set_disk_ro(gd, 1); @@ -1622,13 +1618,6 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) info->gd->disk_name, op_name(bret.operation)); blkif_req(req)->error = BLK_STS_NOTSUPP; } - if (unlikely(blkif_req(req)->error)) { - if (blkif_req(req)->error == BLK_STS_NOTSUPP) - blkif_req(req)->error = BLK_STS_OK; - info->feature_fua = 0; - info->feature_flush = 0; - xlvbd_flush(info); - } fallthrough; case BLKIF_OP_READ: case BLKIF_OP_WRITE:
blkfront always had a robust negotiation protocol for detecting a write cache. Stop simply disabling cache flushes when they fail as that is a grave error. Signed-off-by: Christoph Hellwig <hch@lst.de> --- drivers/block/xen-blkfront.c | 29 +++++++++-------------------- 1 file changed, 9 insertions(+), 20 deletions(-)