Message ID: 20190118115219.63576-1-paolo.valente@linaro.org
Series: reverting two commits causing freezes
On 1/18/19 4:52 AM, Paolo Valente wrote:
> Hi Jens,
> a user reported a warning, followed by freezes, in case he increases
> nr_requests to more than 64 [1]. After reproducing the issues, I
> reverted the commit f0635b8a416e ("bfq: calculate shallow depths at
> init time"), plus the related commit bd7d4ef6a4c9 ("bfq-iosched:
> remove unused variable"). The problem went away.

For reverts, please put the justification into the actual revert
commit. With this series, if applied as-is, we'd have two patches
in the tree that just say "revert X" without any hint as to why
that was done.

> Maybe the assumption in commit f0635b8a416e ("bfq: calculate shallow
> depths at init time") does not hold true?

It apparently doesn't! But let's try and figure this out instead of
blindly reverting it. OK, I think I see it. For the sched_tags case,
when we grow the requests, we allocate a new set. Hence any old cache
would be stale at that point.

How about something like this? It still keeps the code of having to
update this out of the hot IO path, and only calls it when we
actually change the depths.

Totally untested...
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index cd307767a134..b09589915667 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5342,7 +5342,7 @@ static unsigned int bfq_update_depths(struct bfq_data *bfqd,
 	return min_shallow;
 }
 
-static int bfq_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int index)
+static void bfq_depth_updated(struct blk_mq_hw_ctx *hctx)
 {
 	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
 	struct blk_mq_tags *tags = hctx->sched_tags;
@@ -5350,6 +5350,11 @@ static int bfq_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int index)
 
 	min_shallow = bfq_update_depths(bfqd, &tags->bitmap_tags);
 	sbitmap_queue_min_shallow_depth(&tags->bitmap_tags, min_shallow);
+}
+
+static int bfq_init_hctx(struct blk_mq_hw_ctx *hctx, unsigned int index)
+{
+	bfq_depth_updated(hctx);
 	return 0;
 }
 
@@ -5772,6 +5777,7 @@ static struct elevator_type iosched_bfq_mq = {
 	.requests_merged	= bfq_requests_merged,
 	.request_merged		= bfq_request_merged,
 	.has_work		= bfq_has_work,
+	.depth_updated		= bfq_depth_updated,
 	.init_hctx		= bfq_init_hctx,
 	.init_sched		= bfq_init_queue,
 	.exit_sched		= bfq_exit_queue,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3ba37b9e15e9..a047b297ade5 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3101,6 +3101,8 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		}
 		if (ret)
 			break;
+		if (q->elevator && q->elevator->type->ops.depth_updated)
+			q->elevator->type->ops.depth_updated(hctx);
 	}
 
 	if (!ret)
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 2e9e2763bf47..6e8bc53740f0 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -31,6 +31,7 @@ struct elevator_mq_ops {
 	void (*exit_sched)(struct elevator_queue *);
 	int (*init_hctx)(struct blk_mq_hw_ctx *, unsigned int);
 	void (*exit_hctx)(struct blk_mq_hw_ctx *, unsigned int);
+	void (*depth_updated)(struct blk_mq_hw_ctx *);
 	bool (*allow_merge)(struct request_queue *, struct request *, struct bio *);
 	bool (*bio_merge)(struct blk_mq_hw_ctx *, struct bio *);

-- 
Jens Axboe
On 1/18/19 6:35 AM, Jens Axboe wrote:
> On 1/18/19 4:52 AM, Paolo Valente wrote:
>> Hi Jens,
>> a user reported a warning, followed by freezes, in case he increases
>> nr_requests to more than 64 [1]. After reproducing the issues, I
>> reverted the commit f0635b8a416e ("bfq: calculate shallow depths at
>> init time"), plus the related commit bd7d4ef6a4c9 ("bfq-iosched:
>> remove unused variable"). The problem went away.
>
> For reverts, please put the justification into the actual revert
> commit. With this series, if applied as-is, we'd have two patches
> in the tree that just say "revert X" without any hint as to why
> that was done.
>
>> Maybe the assumption in commit f0635b8a416e ("bfq: calculate shallow
>> depths at init time") does not hold true?
>
> It apparently doesn't! But let's try and figure this out instead of
> blindly reverting it. OK, I think I see it. For the sched_tags case,
> when we grow the requests, we allocate a new set. Hence any old cache
> would be stale at that point.
>
> How about something like this? It still keeps the code of having to
> update this out of the hot IO path, and only calls it when we
> actually change the depths.
>
> Totally untested...

Now tested, and it seems to work for me. Note that I haven't tried to
reproduce the issue, I just verified that the patch functionally does
what it should - when depths are updated, the hook is invoked and
updates the internal BFQ depth map.

-- 
Jens Axboe
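For reference, the trigger path the user hit can be exercised entirely from sysfs; a hedged sketch follows (the device name `sda` is an assumption — substitute your own block device; requires root and a BFQ-enabled kernel):

```shell
# Hypothetical repro sketch; "sda" is a placeholder device name.
dev=sda

# Switch the queue to BFQ so its shallow-depth cache is in play.
echo bfq > /sys/block/"$dev"/queue/scheduler
cat /sys/block/"$dev"/queue/scheduler   # bfq should appear in brackets

# Growing nr_requests past the default reallocates sched_tags, which
# is what left BFQ's init-time depth calculation stale.
echo 256 > /sys/block/"$dev"/queue/nr_requests
cat /sys/block/"$dev"/queue/nr_requests
```

On an unpatched kernel this sequence is what preceded the reported warning and freezes; with the `depth_updated` hook the new depths are recomputed at the `echo 256` step.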
> On 18 Jan 2019, at 14:35, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 1/18/19 4:52 AM, Paolo Valente wrote:
>> Hi Jens,
>> a user reported a warning, followed by freezes, in case he increases
>> nr_requests to more than 64 [1]. After reproducing the issues, I
>> reverted the commit f0635b8a416e ("bfq: calculate shallow depths at
>> init time"), plus the related commit bd7d4ef6a4c9 ("bfq-iosched:
>> remove unused variable"). The problem went away.
>
> For reverts, please put the justification into the actual revert
> commit. With this series, if applied as-is, we'd have two patches
> in the tree that just say "revert X" without any hint as to why
> that was done.

I forgot to say explicitly that these patches were meant only to give
you and anybody else something concrete to test and check.

With me you're as safe as houses, in terms of amount of comments in
final patches :)

>> Maybe the assumption in commit f0635b8a416e ("bfq: calculate shallow
>> depths at init time") does not hold true?
>
> It apparently doesn't! But let's try and figure this out instead of
> blindly reverting it.

Totally agree.

> OK, I think I see it. For the sched_tags case, when we grow the
> requests, we allocate a new set. Hence any old cache would be stale
> at that point.

ok

> How about something like this? It still keeps the code of having to
> update this out of the hot IO path, and only calls it when we
> actually change the depths.

Looks rather clean and efficient.

> Totally untested...

It seems to work here too.

Thanks,
Paolo
On 1/18/19 10:24 AM, Paolo Valente wrote:
>> On 18 Jan 2019, at 14:35, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 1/18/19 4:52 AM, Paolo Valente wrote:
>>> Hi Jens,
>>> a user reported a warning, followed by freezes, in case he increases
>>> nr_requests to more than 64 [1]. After reproducing the issues, I
>>> reverted the commit f0635b8a416e ("bfq: calculate shallow depths at
>>> init time"), plus the related commit bd7d4ef6a4c9 ("bfq-iosched:
>>> remove unused variable"). The problem went away.
>>
>> For reverts, please put the justification into the actual revert
>> commit. With this series, if applied as-is, we'd have two patches
>> in the tree that just say "revert X" without any hint as to why
>> that was done.
>
> I forgot to say explicitly that these patches were meant only to give
> you and anybody else something concrete to test and check.
>
> With me you're as safe as houses, in terms of amount of comments in
> final patches :)

It's almost an example of the classic case of "if you want a real
solution to a problem, post a knowingly bad and half assed solution".
That always gets people out of the woodwork :-)

>>> Maybe the assumption in commit f0635b8a416e ("bfq: calculate shallow
>>> depths at init time") does not hold true?
>>
>> It apparently doesn't! But let's try and figure this out instead of
>> blindly reverting it.
>
> Totally agree.
>
>> OK, I think I see it. For the sched_tags case, when we grow the
>> requests, we allocate a new set. Hence any old cache would be stale
>> at that point.
>
> ok
>
>> How about something like this? It still keeps the code of having to
>> update this out of the hot IO path, and only calls it when we
>> actually change the depths.
>
> Looks rather clean and efficient.
>
>> Totally untested...
>
> It seems to work here too.

OK good, I've posted it "officially" now.

-- 
Jens Axboe