kmemleak report after 9082e87bfbf8 ("block: remove struct bio_batch")

Message ID 20160606161245.GC29910@e104818-lin.cambridge.arm.com

Commit Message

Catalin Marinas June 6, 2016, 4:12 p.m. UTC
On Mon, Jun 06, 2016 at 04:13:34PM +0200, Christoph Hellwig wrote:
> I've got a few reports of this over the weekend, but it still doesn't
> make much sense to me.
> 
> Could it be that kmemleak can't deal with the bio_batch logic?  I've
> tried to look at the various bio and biovec number entries in
> /proc/slabinfo, and while they keep changing a bit during the
> system runtime there doesn't seem to be a persistent increase
> even after lots of mkfs calls.

I think the reported leaks settle after about 10-20min (2-3 kmemleak
periodic scans), so checking /proc/slabinfo may not be sufficient if
the leak is not growing. The leaks also do not seem to go away: had
the objects been freed, kmemleak would no longer report them (it drops
an object from its report once the object is kfree'd, even if it had
been reported before).

What kmemleak reports is objects to which it cannot find a pointer
(pointing anywhere inside the object; e.g. list_heads are handled).
False positives do occur occasionally, but usually only where the sole
pointer to an object is stored in memory kmemleak does not track, such
as pages obtained from alloc_pages().
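
As an illustration (a hypothetical sketch, not code from this report),
the classic false-positive pattern looks roughly like this:

/*
 * Hypothetical example of a kmemleak false positive: the only
 * reference to a kmalloc()ed object is stored inside a page from
 * alloc_pages(), which kmemleak neither tracks nor scans, so the
 * object looks unreferenced even though a live pointer exists.
 */
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/slab.h>

static void kmemleak_false_positive(void)
{
	struct page *page = alloc_pages(GFP_KERNEL, 0);
	void **slot;

	if (!page)
		return;

	slot = page_address(page);
	*slot = kmalloc(128, GFP_KERNEL);	/* tracked by kmemleak */
	/*
	 * The pointer stored above lives in untracked memory, so the
	 * next scan would report the kmalloc() object as leaked.
	 */
}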

It seems that these leak reports always come in pairs. The first one:

unreferenced object 0xffff880262cbe900 (size 256):
  comm "NetworkManager", pid 516, jiffies 4294895670 (age 2479.340s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 c0 f3 ab 8a 00 88 ff ff  ................
    02 20 00 20 00 00 00 00 11 00 00 00 00 00 00 00  . . ............
  backtrace:
    [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250
    [<ffffffff81175b42>] mempool_alloc+0x72/0x190
    [<ffffffff812dc9f6>] bio_alloc_bioset+0xb6/0x240
    [<ffffffff812eca7f>] next_bio+0x1f/0x50
    [<ffffffff812ecf1a>] blkdev_issue_zeroout+0xea/0x1d0
    [<ffffffffc025b610>] ext4_issue_zeroout+0x40/0x50 [ext4]
    [<ffffffffc028c58d>] ext4_ext_map_blocks+0x144d/0x1bb0 [ext4]
    [<ffffffff811839f4>] release_pages+0x254/0x310
    [<ffffffff8118437a>] __pagevec_release+0x2a/0x40
    [<ffffffffc025b297>] mpage_prepare_extent_to_map+0x227/0x2c0 [ext4]
    [<ffffffffc025b793>] ext4_map_blocks+0x173/0x5d0 [ext4]
    [<ffffffffc025f1d0>] ext4_writepages+0x700/0xd40 [ext4]
    [<ffffffff8121312e>] legitimize_mnt+0xe/0x50
    [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250
    [<ffffffff81173fd5>] __filemap_fdatawrite_range+0xc5/0x100
    [<ffffffff81174113>] filemap_write_and_wait_range+0x33/0x70

is the first mempool_alloc() in bio_alloc_bioset() for struct bio and
front_pad.
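
For reference, this allocation (paraphrased from block/bio.c of this
era, so approximate) carves both areas out of a single mempool object:

	/*
	 * Paraphrased from bio_alloc_bioset(): one mempool object
	 * holds the bioset's front_pad followed by the struct bio
	 * itself; this is the object in the first report.
	 */
	p = mempool_alloc(bs->bio_pool, gfp_mask);
	if (unlikely(!p))
		goto err_free;

	bio = p + front_pad;
	bio_init(bio);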

The second report:

unreferenced object 0xffff880036488600 (size 256):
  comm "NetworkManager", pid 516, jiffies 4294895670 (age 2479.348s)
  hex dump (first 32 bytes):
    80 39 08 00 00 ea ff ff 00 10 00 00 00 00 00 00  .9..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250
    [<ffffffff812dc8b7>] bvec_alloc+0x57/0xe0
    [<ffffffff812dcaaf>] bio_alloc_bioset+0x16f/0x240
    [<ffffffff812eca7f>] next_bio+0x1f/0x50
    [<ffffffff812ecf1a>] blkdev_issue_zeroout+0xea/0x1d0
    [<ffffffffc025b610>] ext4_issue_zeroout+0x40/0x50 [ext4]
    [<ffffffffc028c58d>] ext4_ext_map_blocks+0x144d/0x1bb0 [ext4]
    [<ffffffff811839f4>] release_pages+0x254/0x310
    [<ffffffff8118437a>] __pagevec_release+0x2a/0x40
    [<ffffffffc025b297>] mpage_prepare_extent_to_map+0x227/0x2c0 [ext4]
    [<ffffffffc025b793>] ext4_map_blocks+0x173/0x5d0 [ext4]
    [<ffffffffc025f1d0>] ext4_writepages+0x700/0xd40 [ext4]
    [<ffffffff8121312e>] legitimize_mnt+0xe/0x50
    [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250
    [<ffffffff81173fd5>] __filemap_fdatawrite_range+0xc5/0x100
    [<ffffffff81174113>] filemap_write_and_wait_range+0x33/0x70

is for the struct bio_vec allocation in bvec_alloc() (the one going via
kmem_cache_alloc).
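
Again paraphrasing (approximate), the backtrace goes through the
non-mempool path of bvec_alloc(), where the bio_vec array for common
iovec counts comes straight from a fixed-size biovec slab:

	/*
	 * Paraphrased from bvec_alloc(): the array is allocated
	 * directly from one of the bvec_slabs caches via
	 * kmem_cache_alloc(); the mempool is only a fallback.
	 */
	struct biovec_slab *bvs = bvec_slabs + *idx;

	bvl = kmem_cache_alloc(bvs->slab, __gfp_mask);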

IIUC, the bio objects above, allocated via next_bio() ->
bio_alloc_bioset(), are returned to __blkdev_issue_zeroout(), which
eventually submits them: via the following next_bio() call for all but
the last one, and directly (through submit_bio_wait()) for the last.
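
Paraphrasing the flow at this commit (approximate, reconstructed from
the hunks quoted later in this thread):

	/*
	 * Paraphrased from __blkdev_issue_zeroout(): every bio except
	 * the last is chained and submitted inside next_bio(); only
	 * the final bio is submitted synchronously here.
	 */
	while (nr_sects != 0) {
		bio = next_bio(bio, WRITE,
			       min(nr_sects, (sector_t)BIO_MAX_PAGES),
			       gfp_mask);
		bio->bi_iter.bi_sector = sector;
		bio->bi_bdev = bdev;
		/* ... bio_add_page(bio, ZERO_PAGE(0), ...) until full ... */
	}

	if (bio)
		return submit_bio_wait(WRITE, bio);
	return 0;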

Regarding bio chaining, I can't figure out what the first bio
allocated in __blkdev_issue_zeroout() is chained to, since bio == NULL
initially. Subsequent next_bio() allocations are linked to the
previous ones via bio_chain(), but somehow kmemleak loses track of the
first one, hence the subsequent bios are reported as leaks. That's
unless the chaining should be the other way around, as in the patch at
the end of this message.

Also confusing is that chaining is done via bio->bi_private, yet this
field is overridden in other places like submit_bio_wait().
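
To make the bi_private overlap concrete, both helpers (paraphrased
from block/bio.c of this era, approximate) keep their bookkeeping in
the same field:

void bio_chain(struct bio *bio, struct bio *parent)
{
	BUG_ON(bio->bi_private || bio->bi_end_io);

	bio->bi_private = parent;	/* chain link kept in bi_private */
	bio->bi_end_io = bio_chain_endio;
	bio_inc_remaining(parent);
}

int submit_bio_wait(int rw, struct bio *bio)
{
	struct submit_bio_ret ret;

	rw |= REQ_SYNC;
	init_completion(&ret.event);
	bio->bi_private = &ret;		/* overwrites bi_private */
	bio->bi_end_io = submit_bio_wait_endio;
	submit_bio(rw, bio);
	wait_for_completion_io(&ret.event);

	return ret.error;
}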

However, since I don't fully understand this code, this chaining may
not even be essential to struct bio freeing (and I may be
investigating the wrong path).

-- 
Catalin

Comments

Catalin Marinas June 6, 2016, 5:35 p.m. UTC | #1
On Mon, Jun 06, 2016 at 07:27:18PM +0200, Christoph Hellwig wrote:
> On Mon, Jun 06, 2016 at 12:09:49PM -0500, Shaun Tancheff wrote:
> > I'm pretty sure it is missing a bio_put() after submit_bio_wait().
> > 
> > Please excuse the hack-y patch but I think you need to do something
> > like this ...
> > (Note tabs eaten by gmail).
> 
> Yeah, that makes sense - oddly enough submit_bio_wait doesn't do a
> bio_put. Still not sure why I don't see the leaks after repeated
> mkfs.xfs runs, though.

You can force more kmemleak scans via:

  echo scan > /sys/kernel/debug/kmemleak

In my case, the leaks were reported for ext4 and appeared during boot,
no need for mkfs. But kmemleak favours false negatives over false
positives (otherwise it would be pretty unusable), so you don't always
hit them.

-- 
Catalin
Catalin Marinas June 7, 2016, 9:39 a.m. UTC | #2
On Mon, Jun 06, 2016 at 12:09:49PM -0500, Shaun Tancheff wrote:
> I'm pretty sure it is missing a bio_put() after submit_bio_wait().
> 
> Please excuse the hack-y patch but I think you need to do something
> like this ...
> (Note tabs eaten by gmail).
> 
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 23d7f30..9e29dc3 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -113,6 +113,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>                 ret = submit_bio_wait(type, bio);
>                 if (ret == -EOPNOTSUPP)
>                         ret = 0;
> +               bio_put(bio);
>         }
>         blk_finish_plug(&plug);
> 
> @@ -165,8 +166,10 @@ int blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
>                 }
>         }
> 
> -       if (bio)
> +       if (bio) {
>                 ret = submit_bio_wait(REQ_WRITE | REQ_WRITE_SAME, bio);
> +               bio_put(bio);
> +       }
>         return ret != -EOPNOTSUPP ? ret : 0;
>  }
>  EXPORT_SYMBOL(blkdev_issue_write_same);
> @@ -206,8 +209,11 @@ static int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>                 }
>         }
> 
> -       if (bio)
> -               return submit_bio_wait(WRITE, bio);
> +       if (bio) {
> +               ret = submit_bio_wait(WRITE, bio);
> +               bio_put(bio);
> +               return ret;
> +       }
>         return 0;
>  }

This patch appears to fix the memory leak on my machine.
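
For what it's worth, the caller-side rule the patch enforces can be
sketched as follows (hypothetical helper, purely to illustrate the
reference pairing with the era's two-argument submit_bio_wait()):

/*
 * Hypothetical helper: bio_alloc() hands the caller a reference,
 * and submit_bio_wait() does not drop it, so a synchronous
 * submitter must pair the allocation with bio_put().
 */
static int sync_zero_one_page(struct block_device *bdev, sector_t sector)
{
	struct bio *bio = bio_alloc(GFP_KERNEL, 1);
	int ret;

	if (!bio)
		return -ENOMEM;

	bio->bi_bdev = bdev;
	bio->bi_iter.bi_sector = sector;
	bio_add_page(bio, ZERO_PAGE(0), PAGE_SIZE, 0);

	ret = submit_bio_wait(WRITE, bio);
	bio_put(bio);	/* drop the reference bio_alloc() gave us */
	return ret;
}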

Tested-by: Catalin Marinas <catalin.marinas@arm.com>

Patch

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 23d7f301a196..3bf78b7b74cc 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -15,7 +15,7 @@ static struct bio *next_bio(struct bio *bio, int rw, unsigned int nr_pages,
 	struct bio *new = bio_alloc(gfp, nr_pages);
 
 	if (bio) {
-		bio_chain(bio, new);
+		bio_chain(new, bio);
 		submit_bio(rw, bio);
 	}