diff mbox series

[01/12] blk-mq: do not include passthrough requests in I/O accounting

Message ID 20220222141450.591193-2-hch@lst.de
State Superseded
Headers show
Series [01/12] blk-mq: do not include passthrough requests in I/O accounting | expand

Commit Message

Christoph Hellwig Feb. 22, 2022, 2:14 p.m. UTC
I/O accounting buckets I/O into the read/write/discard categories into
which passthrough I/O does not fit at all.  It also accounts to the
block_device, which may not even exist for passthrough I/O.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/blk-mq.c | 6 +-----
 block/blk.h    | 2 +-
 2 files changed, 2 insertions(+), 6 deletions(-)

Comments

Ming Lei Feb. 23, 2022, 2:08 a.m. UTC | #1
On Tue, Feb 22, 2022 at 03:14:39PM +0100, Christoph Hellwig wrote:
> I/O accounting buckets I/O into the read/write/discard categories into
> which passthrough I/O does not fit at all.  It also accounts to the
> block_device, which may not even exist for passthrough I/O.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/blk-mq.c | 6 +-----
>  block/blk.h    | 2 +-
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index a05ce77250316..ee80853473d1e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -883,11 +883,7 @@ static inline void blk_account_io_done(struct request *req, u64 now)
>  
>  static void __blk_account_io_start(struct request *rq)
>  {
> -	/* passthrough requests can hold bios that do not have ->bi_bdev set */
> -	if (rq->bio && rq->bio->bi_bdev)
> -		rq->part = rq->bio->bi_bdev;
> -	else if (rq->q->disk)
> -		rq->part = rq->q->disk->part0;
> +	rq->part = rq->bio->bi_bdev;
>  
>  	part_stat_lock();
>  	update_io_ticks(rq->part, jiffies, false);
> diff --git a/block/blk.h b/block/blk.h
> index ebaa59ca46ca6..6f21859c7f0ff 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -325,7 +325,7 @@ int blk_dev_init(void);
>   */
>  static inline bool blk_do_io_stat(struct request *rq)
>  {
> -	return (rq->rq_flags & RQF_IO_STAT) && rq->q->disk;
> +	return (rq->rq_flags & RQF_IO_STAT) && !blk_rq_is_passthrough(rq);

I guess this way may cause regression for workloads with lots of userspace IO
from user viewpoint?


Thanks,
Ming
Christoph Hellwig Feb. 23, 2022, 6:42 a.m. UTC | #2
On Wed, Feb 23, 2022 at 10:08:20AM +0800, Ming Lei wrote:
> > -	return (rq->rq_flags & RQF_IO_STAT) && rq->q->disk;
> > +	return (rq->rq_flags & RQF_IO_STAT) && !blk_rq_is_passthrough(rq);
> 
> I guess this way may cause regression for workloads with lots of userspace IO
> from user viewpoint?

I'd say it fixes it as the accounting right now is completely bogus.
Ming Lei Feb. 23, 2022, 7:02 a.m. UTC | #3
On Wed, Feb 23, 2022 at 07:42:26AM +0100, Christoph Hellwig wrote:
> On Wed, Feb 23, 2022 at 10:08:20AM +0800, Ming Lei wrote:
> > > -	return (rq->rq_flags & RQF_IO_STAT) && rq->q->disk;
> > > +	return (rq->rq_flags & RQF_IO_STAT) && !blk_rq_is_passthrough(rq);
> > 
> > I guess this way may cause regression for workloads with lots of userspace IO
> > from user viewpoint?
> 
> I'd say it fixes it as the accounting right now is completely bogus.

There are small amount of in-kernel passthrough requests(admin, or driver
private) which shouldn't be accounted, but passthrough RW IO requests from
userspace can be lots of, and user may rely on diskstat to account them.


Thanks,
Ming
Christoph Hellwig Feb. 23, 2022, 7:36 a.m. UTC | #4
On Wed, Feb 23, 2022 at 03:02:08PM +0800, Ming Lei wrote:
> There are small amount of in-kernel passthrough requests(admin, or driver
> private) which shouldn't be accounted, but passthrough RW IO requests from
> userspace can be lots of, and user may rely on diskstat to account them.

/dev/sg won't be accounted either.  But most importantly they are
accounted wrongly: the accounting buckets into read/write/discard.  Any
most pass through commands are everything but.

Also the way how this accounting works is completely broken.
Passthrough requests are sent through a request_queue, and it does
not make sense to account them to a block_device which sits way about
that.
diff mbox series

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index a05ce77250316..ee80853473d1e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -883,11 +883,7 @@  static inline void blk_account_io_done(struct request *req, u64 now)
 
 static void __blk_account_io_start(struct request *rq)
 {
-	/* passthrough requests can hold bios that do not have ->bi_bdev set */
-	if (rq->bio && rq->bio->bi_bdev)
-		rq->part = rq->bio->bi_bdev;
-	else if (rq->q->disk)
-		rq->part = rq->q->disk->part0;
+	rq->part = rq->bio->bi_bdev;
 
 	part_stat_lock();
 	update_io_ticks(rq->part, jiffies, false);
diff --git a/block/blk.h b/block/blk.h
index ebaa59ca46ca6..6f21859c7f0ff 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -325,7 +325,7 @@  int blk_dev_init(void);
  */
 static inline bool blk_do_io_stat(struct request *rq)
 {
-	return (rq->rq_flags & RQF_IO_STAT) && rq->q->disk;
+	return (rq->rq_flags & RQF_IO_STAT) && !blk_rq_is_passthrough(rq);
 }
 
 void update_io_ticks(struct block_device *part, unsigned long now, bool end);