From patchwork Wed Nov 30 09:51:04 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Kara
X-Patchwork-Id: 84989
Delivered-To: patch@linaro.org
Received: by 10.140.20.101 with SMTP id 92csp154649qgi;
        Wed, 30 Nov 2016 01:51:37 -0800 (PST)
X-Received: by 10.99.47.7 with SMTP id v7mr58846707pgv.39.1480499496966;
        Wed, 30 Nov 2016 01:51:36 -0800 (PST)
Return-Path:
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
        by mx.google.com with ESMTP id m17si57826026pgh.314.2016.11.30.01.51.36;
        Wed, 30 Nov 2016 01:51:36 -0800 (PST)
Received-SPF: pass (google.com: best guess record for domain of
 stable-owner@vger.kernel.org designates 209.132.180.67 as permitted
 sender) client-ip=209.132.180.67;
Authentication-Results: mx.google.com;
       spf=pass (google.com: best guess record for domain of
 stable-owner@vger.kernel.org designates 209.132.180.67 as permitted
 sender) smtp.mailfrom=stable-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933787AbcK3JvM (ORCPT + 3 others);
        Wed, 30 Nov 2016 04:51:12 -0500
Received: from mx2.suse.de ([195.135.220.15]:37271 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1754970AbcK3JvI (ORCPT );
        Wed, 30 Nov 2016 04:51:08 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254])
	by mx2.suse.de (Postfix) with ESMTP id 483CBAC42;
	Wed, 30 Nov 2016 09:51:05 +0000 (UTC)
Received: by quack2.suse.cz (Postfix, from userid 1000)
	id 71C491E10F1; Wed, 30 Nov 2016 10:51:04 +0100 (CET)
Date: Wed, 30 Nov 2016 10:51:04 +0100
From: Jan Kara
To: Wei Fang
Cc: Jan Kara , akpm@linux-foundation.org, hannes@cmpxchg.org,
        hch@infradead.org, linux-mm@kvack.org, stable@vger.kernel.org,
        Jens Axboe , Tejun Heo
Subject: Re: [PATCH] mm: Fix a NULL dereference crash while accessing
 bdev->bd_disk
Message-ID: <20161130095104.GB20030@quack2.suse.cz>
References: <1480125982-8497-1-git-send-email-fangwei1@huawei.com>
 <20161128100718.GD2590@quack2.suse.cz>
 <583CE0C7.1040406@huawei.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <583CE0C7.1040406@huawei.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: stable-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: stable@vger.kernel.org

On Tue 29-11-16 09:58:31, Wei Fang wrote:
> Hi, Jan,
>
> On 2016/11/28 18:07, Jan Kara wrote:
> > Good catch but I don't like sprinkling checks like this into the writeback
> > code and furthermore we don't want to call into writeback code when block
> > device is in the process of being destroyed which is what would happen with
> > your patch. That is a bug waiting to happen...
>
> Agreed. Need another way to fix this problem. I looked through the
> writeback cgroup code in __filemap_fdatawrite_range(), found if we
> turn on CONFIG_CGROUP_WRITEBACK, a new crash will happen.

OK, can you test with the attached patch please? Thanks!

								Honza
-- 
Jan Kara
SUSE Labs, CR

>From ef10e4f52d2d05982fbeba09e48a4253b5fd1119 Mon Sep 17 00:00:00 2001
From: Rabin Vincent
Date: Thu, 10 Mar 2016 13:26:03 +0100
Subject: [PATCH] block: protect iterate_bdevs() against concurrent close

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev->bd_disk is NULL in
bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ #400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[] [] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS: 00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
 ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
 0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
 ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
 [] __filemap_fdatawrite_range+0x85/0x90
 [] filemap_fdatawrite+0x1f/0x30
 [] fdatawrite_one_bdev+0x16/0x20
 [] iterate_bdevs+0xf2/0x130
 [] sys_sync+0x63/0x90
 [] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 <48> 8b 80 08 05 00 00 5d
 RIP [] blk_get_backing_dev_info+0x10/0x20
 RSP
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_bdevs():

 while :; do head -c1 /dev/nullb0; done > /dev/null & while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Cc: stable@vger.kernel.org
Fixes: 5c0d6b60a0ba46d45020547eacf7199171920935
Reported-by: Wei Fang
Signed-off-by: Rabin Vincent
Signed-off-by: Jan Kara
---
 fs/block_dev.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 05b553368bb4..899fa8ccc347 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1950,6 +1950,7 @@ void iterate_bdevs(void (*func)(struct block_device *, void *), void *arg)
 	spin_lock(&blockdev_superblock->s_inode_list_lock);
 	list_for_each_entry(inode, &blockdev_superblock->s_inodes, i_sb_list) {
 		struct address_space *mapping = inode->i_mapping;
+		struct block_device *bdev;
 
 		spin_lock(&inode->i_lock);
 		if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW) ||
@@ -1970,8 +1971,12 @@ void iterate_bdevs(void (*func)(struct block_device *, void *), void *arg)
 		 */
 		iput(old_inode);
 		old_inode = inode;
+		bdev = I_BDEV(inode);
 
-		func(I_BDEV(inode), arg);
+		mutex_lock(&bdev->bd_mutex);
+		if (bdev->bd_openers)
+			func(bdev, arg);
+		mutex_unlock(&bdev->bd_mutex);
 
 		spin_lock(&blockdev_superblock->s_inode_list_lock);
 	}
-- 
2.6.6
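
For anyone who wants to experiment with the locking pattern outside the
kernel, here is a rough userspace sketch of what the patch does: take the
per-device mutex, skip devices that have no openers, and keep the mutex
held while the callback runs. This is not kernel code. struct fake_bdev,
fake_bdev_init(), fake_bdev_close(), count_disk() and iterate_fake_bdevs()
are all hypothetical names made up for illustration, a pthread mutex stands
in for bd_mutex, and the sketch assumes that close clears the disk pointer
under that same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct block_device. */
struct fake_bdev {
	pthread_mutex_t mutex;	/* plays the role of bd_mutex */
	int openers;		/* plays the role of bd_openers */
	int *disk;		/* plays the role of bd_disk; NULL once closed */
};

/* Set up a device with a single opener and a valid disk pointer. */
void fake_bdev_init(struct fake_bdev *bdev, int *disk)
{
	pthread_mutex_init(&bdev->mutex, NULL);
	bdev->openers = 1;
	bdev->disk = disk;
}

/* Assumption: the last close drops the disk pointer under the mutex. */
void fake_bdev_close(struct fake_bdev *bdev)
{
	pthread_mutex_lock(&bdev->mutex);
	if (bdev->openers > 0 && --bdev->openers == 0)
		bdev->disk = NULL;
	pthread_mutex_unlock(&bdev->mutex);
}

/* Example callback: dereferences disk, so it must never see NULL. */
void count_disk(struct fake_bdev *bdev, void *arg)
{
	*(int *)arg += *bdev->disk;
}

/*
 * Mirrors the patched loop body: lock, check the opener count, call
 * func() only for open devices, and unlock after func() returns.
 */
void iterate_fake_bdevs(struct fake_bdev *devs, size_t n,
			void (*func)(struct fake_bdev *, void *),
			void *arg)
{
	size_t i;

	assert(func != NULL);
	for (i = 0; i < n; i++) {
		struct fake_bdev *bdev = &devs[i];

		pthread_mutex_lock(&bdev->mutex);
		if (bdev->openers)
			func(bdev, arg);
		pthread_mutex_unlock(&bdev->mutex);
	}
}
```

Because the callback runs with the mutex held, a concurrent
fake_bdev_close() has to wait until the callback is done, and a device that
was already closed is skipped rather than dereferenced. Checking the disk
pointer without the lock would leave the same window the oops shows.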