From patchwork Thu Oct 22 19:01:02 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Felipe Balbi X-Patchwork-Id: 55448 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-wi0-f197.google.com (mail-wi0-f197.google.com [209.85.212.197]) by patches.linaro.org (Postfix) with ESMTPS id 0B0D920581 for ; Thu, 22 Oct 2015 19:02:04 +0000 (UTC) Received: by wicfg8 with SMTP id fg8sf693680wic.0 for ; Thu, 22 Oct 2015 12:02:03 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:from:to:cc:subject:user-agent:date :message-id:mime-version:content-type:sender:precedence:list-id :x-original-sender:x-original-authentication-results:mailing-list :list-post:list-help:list-archive:list-unsubscribe; bh=VMx9bCAXBiziQhcUP8a24QN8OOG8eRmi4tOF4APAhIM=; b=R5uGDcIv/1MEqCdoGKNsnEovughy9CkY6fgEu5Byf97PjDsbWP1pEFBhWzPmQKu8Eq lrrbBg1BJkP4fxRz0kzeq45qzoxlpEWpuU9qc5xJvyGYyLPspoieSpS0+vEAjRxAEByZ qL2sH1liZVm8sXpFe5Ut5pbPL3Q/7JlWTOsbBUGK+Fe8/fd8v9E7I5ndwNxZs5y6an5l toQLb2nXrbckCHeRtfAbDLMGOXK56ImzrZ92NsQUtH8YFpAo1EYpDOfDLREHS9dujvB0 AJUeRSmH+hhQmlF8UL364DhhF4Zz9MQUXmgAKwqVBprvOVlfoUSC0/io7SJcCs5DpO2s VqFg== X-Gm-Message-State: ALoCoQlbC5bZ8xFdhPYmWcYfFV9ICMTk2JauhS5AsCrtq2BHNrUTPLbjPfPAvnTCxnftgKz7D5n3 X-Received: by 10.180.109.129 with SMTP id hs1mr12887wib.1.1445540523226; Thu, 22 Oct 2015 12:02:03 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.25.89.20 with SMTP id n20ls100241lfb.54.gmail; Thu, 22 Oct 2015 12:02:03 -0700 (PDT) X-Received: by 10.25.210.206 with SMTP id j197mr6116626lfg.86.1445540523064; Thu, 22 Oct 2015 12:02:03 -0700 (PDT) Received: from mail-lf0-f45.google.com (mail-lf0-f45.google.com. [209.85.215.45]) by mx.google.com with ESMTPS id zl2si10468475lbb.120.2015.10.22.12.02.02 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 22 Oct 2015 12:02:02 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.45 as permitted sender) client-ip=209.85.215.45; Received: by lfaz124 with SMTP id z124so59290160lfa.1 for ; Thu, 22 Oct 2015 12:02:02 -0700 (PDT) X-Received: by 10.25.86.213 with SMTP id k204mr6072160lfb.36.1445540522815; Thu, 22 Oct 2015 12:02:02 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.59.35 with SMTP id w3csp773814lbq; Thu, 22 Oct 2015 12:02:01 -0700 (PDT) X-Received: by 10.66.142.38 with SMTP id rt6mr12662146pab.121.1445540521050; Thu, 22 Oct 2015 12:02:01 -0700 (PDT) Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id xm2si23093270pbb.66.2015.10.22.12.02.00; Thu, 22 Oct 2015 12:02:01 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-omap-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752348AbbJVTCA (ORCPT + 3 others); Thu, 22 Oct 2015 15:02:00 -0400 Received: from arroyo.ext.ti.com ([192.94.94.40]:58405 "EHLO arroyo.ext.ti.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751902AbbJVTB6 (ORCPT ); Thu, 22 Oct 2015 15:01:58 -0400 Received: from dflxv15.itg.ti.com ([128.247.5.124]) by arroyo.ext.ti.com (8.13.7/8.13.7) with ESMTP id t9MJ16xU023186; Thu, 22 Oct 2015 14:01:06 -0500 Received: from DLEE70.ent.ti.com (dlemailx.itg.ti.com [157.170.170.113]) by dflxv15.itg.ti.com (8.14.3/8.13.8) with ESMTP id t9MJ16Dn001916; Thu, 22 Oct 2015 14:01:06 -0500 Received: from dflp32.itg.ti.com (10.64.6.15) by DLEE70.ent.ti.com (157.170.170.113) with Microsoft SMTP Server id 14.3.224.2; Thu, 22 Oct 2015 14:01:06 -0500 Received: from localhost (ileax41-snat.itg.ti.com [10.172.224.153]) by dflp32.itg.ti.com (8.14.3/8.13.8) with ESMTP id t9MJ15F4024439; Thu, 22 Oct 2015 14:01:05 -0500 From: Felipe Balbi To: Brian Norris , David Woodhouse CC: , , Subject: Regression caused by 073db4a51ee43ccb827f54a4261c0583b028d5ab User-Agent: Notmuch/0.20.2 (http://notmuchmail.org) Emacs/24.5.1 (x86_64-pc-linux-gnu) Date: Thu, 22 Oct 2015 14:01:02 -0500 Message-ID: <87y4euenip.fsf@saruman.tx.rr.com> MIME-Version: 1.0 Sender: linux-omap-owner@vger.kernel.org Precedence: list List-ID: X-Mailing-List: linux-omap@vger.kernel.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: balbi@ti.com X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.45 as permitted sender) smtp.mailfrom=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 List-Post: , List-Help: , List-Archive: List-Unsubscribe: , Hi, I just noticed that commit 073db4a51ee4 (mtd: fix: avoid race condition when accessing mtd->usecount) caused a regression at least when removing m25p80. Wonder if you guys would know of a quick fix, other than reverting $commit in HEAD (yes, that makes the problem go away, but regresses on what $commit tried to fix, of course). More info about the regression follows, together with bisection log: # modprobe -r m25p80 [ 53.419251] [ 53.420838] ====================================================== [ 53.427300] [ INFO: possible circular locking dependency detected ] [ 53.433865] 4.3.0-rc6 #96 Not tainted [ 53.437686] ------------------------------------------------------- [ 53.444220] modprobe/372 is trying to acquire lock: [ 53.449320] (&new->lock){+.+...}, at: [] del_mtd_blktrans_dev+0x80/0xdc [ 53.457271] [ 53.457271] but task is already holding lock: [ 53.463372] (mtd_table_mutex){+.+.+.}, at: [] del_mtd_device+0x18/0x100 [ 53.471321] [ 53.471321] which lock already depends on the new lock. [ 53.471321] [ 53.479856] [ 53.479856] the existing dependency chain (in reverse order) is: [ 53.487660] -> #1 (mtd_table_mutex){+.+.+.}: [ 53.492331] [] blktrans_open+0x34/0x1a4 [ 53.497879] [] __blkdev_get+0xc4/0x3b0 [ 53.503364] [] blkdev_get+0x108/0x320 [ 53.508743] [] do_dentry_open+0x218/0x314 [ 53.514496] [] path_openat+0x4c0/0xf9c [ 53.519959] [] do_filp_open+0x5c/0xc0 [ 53.525336] [] do_sys_open+0xfc/0x1cc [ 53.530716] [] ret_fast_syscall+0x0/0x1c [ 53.536375] -> #0 (&new->lock){+.+...}: [ 53.540587] [] mutex_lock_nested+0x38/0x3cc [ 53.546504] [] del_mtd_blktrans_dev+0x80/0xdc [ 53.552606] [] blktrans_notify_remove+0x7c/0x84 [ 53.558891] [] del_mtd_device+0x74/0x100 [ 53.564544] [] del_mtd_partitions+0x80/0xc8 [ 53.570451] [] mtd_device_unregister+0x24/0x48 [ 53.576637] [] spi_drv_remove+0x1c/0x34 [ 53.582207] [] __device_release_driver+0x88/0x114 [ 53.588663] [] device_release_driver+0x20/0x2c [ 53.594843] [] bus_remove_device+0xd8/0x108 [ 53.600748] [] device_del+0x10c/0x210 [ 53.606127] [] device_unregister+0xc/0x20 [ 53.611849] [] __unregister+0x10/0x20 [ 53.617211] [] device_for_each_child+0x50/0x7c [ 53.623387] [] spi_unregister_master+0x58/0x8c [ 53.629578] [] release_nodes+0x15c/0x1c8 [ 53.635223] [] __device_release_driver+0x90/0x114 [ 53.641689] [] driver_detach+0xb4/0xb8 [ 53.647147] [] bus_remove_driver+0x4c/0xa0 [ 53.652970] [] SyS_delete_module+0x11c/0x1e4 [ 53.658976] [] ret_fast_syscall+0x0/0x1c [ 53.664621] [ 53.664621] other info that might help us debug this: [ 53.664621] [ 53.672979] Possible unsafe locking scenario: [ 53.672979] [ 53.679169] CPU0 CPU1 [ 53.683900] ---- ---- [ 53.688633] lock(mtd_table_mutex); [ 53.692383] lock(&new->lock); [ 53.698306] lock(mtd_table_mutex); [ 53.704658] lock(&new->lock); [ 53.707946] [ 53.707946] *** DEADLOCK *** [ 53.707946] [ 53.714123] 5 locks held by modprobe/372: [ 53.718305] #0: (&dev->mutex){......}, at: [] driver_detach+0x44/0xb8 [ 53.726147] #1: (&dev->mutex){......}, at: [] driver_detach+0x50/0xb8 [ 53.733985] #2: (&dev->mutex){......}, at: [] device_release_driver+0x18/0x2c [ 53.742541] #3: (mtd_partitions_mutex){+.+.+.}, at: [] del_mtd_partitions+0x1c/0xc8 [ 53.751656] #4: (mtd_table_mutex){+.+.+.}, at: [] del_mtd_device+0x18/0x100 [ 53.760048] [ 53.760048] stack backtrace: [ 53.764591] CPU: 0 PID: 372 Comm: modprobe Not tainted 4.3.0-rc6 #96 [ 53.771217] Hardware name: Generic AM43 (Flattened Device Tree) [ 53.777419] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [ 53.785511] [] (show_stack) from [] (dump_stack+0x84/0x9c) [ 53.793063] [] (dump_stack) from [] (print_circular_bug+0x1c8/0x30c) [ 53.801500] [] (print_circular_bug) from [] (__lock_acquire+0x1a48/0x1cd8) [ 53.810480] [] (__lock_acquire) from [] (lock_acquire+0xac/0x12c) [ 53.818649] [] (lock_acquire) from [] (mutex_lock_nested+0x38/0x3cc) [ 53.827103] [] (mutex_lock_nested) from [] (del_mtd_blktrans_dev+0x80/0xdc) [ 53.836199] [] (del_mtd_blktrans_dev) from [] (blktrans_notify_remove+0x7c/0x84) [ 53.845735] [] (blktrans_notify_remove) from [] (del_mtd_device+0x74/0x100) [ 53.854833] [] (del_mtd_device) from [] (del_mtd_partitions+0x80/0xc8) [ 53.863469] [] (del_mtd_partitions) from [] (mtd_device_unregister+0x24/0x48) [ 53.872733] [] (mtd_device_unregister) from [] (spi_drv_remove+0x1c/0x34) [ 53.881633] [] (spi_drv_remove) from [] (__device_release_driver+0x88/0x114) [ 53.890788] [] (__device_release_driver) from [] (device_release_driver+0x20/0x2c) [ 53.900483] [] (device_release_driver) from [] (bus_remove_device+0xd8/0x108) [ 53.909735] [] (bus_remove_device) from [] (device_del+0x10c/0x210) [ 53.918088] [] (device_del) from [] (device_unregister+0xc/0x20) [ 53.926160] [] (device_unregister) from [] (__unregister+0x10/0x20) [ 53.934526] [] (__unregister) from [] (device_for_each_child+0x50/0x7c) [ 53.943261] [] (device_for_each_child) from [] (spi_unregister_master+0x58/0x8c) [ 53.952805] [] (spi_unregister_master) from [] (release_nodes+0x15c/0x1c8) [ 53.961809] [] (release_nodes) from [] (__device_release_driver+0x90/0x114) [ 53.970883] [] (__device_release_driver) from [] (driver_detach+0xb4/0xb8) [ 53.979864] [] (driver_detach) from [] (bus_remove_driver+0x4c/0xa0) [ 53.988303] [] (bus_remove_driver) from [] (SyS_delete_module+0x11c/0x1e4) [ 53.997285] [] (SyS_delete_module) from [] (ret_fast_syscall+0x0/0x1c) Bisection log: git bisect start # good: [b953c0d234bc72e8489d3bf51a276c5c4ec85345] Linux 4.1 git bisect good b953c0d234bc72e8489d3bf51a276c5c4ec85345 # bad: [d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754] Linux 4.2-rc1 git bisect bad d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754 # bad: [4570a37169d4b44d316f40b2ccc681dc93fedc7b] Merge tag 'sound-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound git bisect bad 4570a37169d4b44d316f40b2ccc681dc93fedc7b # bad: [4e241557fc1cb560bd9e77ca1b4a9352732a5427] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm git bisect bad 4e241557fc1cb560bd9e77ca1b4a9352732a5427 # good: [44d21c3f3a2ef2f58b18bda64c52c99e723f3f4a] Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 git bisect good 44d21c3f3a2ef2f58b18bda64c52c99e723f3f4a # good: [acd53127c4adbd34570b221e7ea1f7fc94aea923] Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi git bisect good acd53127c4adbd34570b221e7ea1f7fc94aea923 # bad: [54245ed870c8cf9ff87fdf78955ffbc93b261e9f] Merge tag 'for-linus-20150623' of git://git.infradead.org/linux-mtd git bisect bad 54245ed870c8cf9ff87fdf78955ffbc93b261e9f # good: [5a602e157a9d91d5ce98d07c404097edba8ec9f3] Merge tag 'spi-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi git bisect good 5a602e157a9d91d5ce98d07c404097edba8ec9f3 # good: [1abf25a25b86dcfe28d243a5af71bd1c9d6de1ef] mfd: lpc_ich: Assign subdevice ids automatically git bisect good 1abf25a25b86dcfe28d243a5af71bd1c9d6de1ef # bad: [45c2ebd702a468d5037cf16aa4f8ea8d67776f6a] mtd: docg3: Don't leak docg3->bbt in error path git bisect bad 45c2ebd702a468d5037cf16aa4f8ea8d67776f6a # bad: [f628ece6636c2f0354a52566cafdea6d2f963b3d] mtd: brcmnand: add BCM63138 support git bisect bad f628ece6636c2f0354a52566cafdea6d2f963b3d # good: [b94665322b786a806a0169752ff2f35f3f467b99] mtd: samsung: Constify platform_device_id git bisect good b94665322b786a806a0169752ff2f35f3f467b99 # bad: [073db4a51ee43ccb827f54a4261c0583b028d5ab] mtd: fix: avoid race condition when accessing mtd->usecount git bisect bad 073db4a51ee43ccb827f54a4261c0583b028d5ab # good: [b79c332fb283c101abb5d8570dea2d29f3871802] mtd: spi-nor: add support for the ISSI SI25CD512 SPI flash git bisect good b79c332fb283c101abb5d8570dea2d29f3871802 # good: [7aaea7605c0e19fa7b38d7ac5dcd818942fd17a7] jffs2: fix unbalanced locking git bisect good 7aaea7605c0e19fa7b38d7ac5dcd818942fd17a7 # good: [5844feeaa4154d1c46d3462c7a4653d22356d8b4] mtd: nand: add common DT init code git bisect good 5844feeaa4154d1c46d3462c7a4653d22356d8b4 # first bad commit: [073db4a51ee43ccb827f54a4261c0583b028d5ab] mtd: fix: avoid race condition when accessing mtd->usecount commit 073db4a51ee43ccb827f54a4261c0583b028d5ab Author: Brian Norris Date: Thu May 7 17:55:16 2015 -0700 mtd: fix: avoid race condition when accessing mtd->usecount On A MIPS 32-cores machine a BUG_ON was triggered because some acesses to mtd->usecount were done without taking mtd_table_mutex. kernel: Call Trace: kernel: [] __put_mtd_device+0x20/0x50 kernel: [] blktrans_release+0x8c/0xd8 kernel: [] __blkdev_put+0x1a8/0x200 kernel: [] blkdev_close+0x1c/0x30 kernel: [] __fput+0xac/0x250 kernel: [] task_work_run+0xd8/0x120 kernel: [] work_notifysig+0x10/0x18 kernel: kernel: Code: 2442ffff ac8202d8 000217fe <00020336> dc820128 10400003 00000000 0040f809 00000000 kernel: ---[ end trace 080fbb4579b47a73 ]--- Fixed by taking the mutex in blktrans_open and blktrans_release. Note that this locking is already suggested in include/linux/mtd/blktrans.h: struct mtd_blktrans_ops { ... /* Called with mtd_table_mutex held; no race with add/remove */ int (*open)(struct mtd_blktrans_dev *dev); void (*release)(struct mtd_blktrans_dev *dev); ... }; But we weren't following it. Originally reported by (and patched by) Zhang and Giuseppe, independently. Improved and rewritten. Cc: stable@vger.kernel.org Reported-by: Zhang Xingcai Reported-by: Giuseppe Cantavenera Tested-by: Giuseppe Cantavenera Acked-by: Alexander Sverdlin Signed-off-by: Brian Norris Tested-by: Felipe Balbi diff --git a/drivers/mtd/mtd_blkdevs.c b/drivers/mtd/mtd_blkdevs.c index 2b0c52870999..df7c6c70757a 100644 --- a/drivers/mtd/mtd_blkdevs.c +++ b/drivers/mtd/mtd_blkdevs.c @@ -197,6 +197,7 @@ static int blktrans_open(struct block_device *bdev, fmode_t mode) return -ERESTARTSYS; /* FIXME: busy loop! -arnd*/ mutex_lock(&dev->lock); + mutex_lock(&mtd_table_mutex); if (dev->open) goto unlock; @@ -220,6 +221,7 @@ static int blktrans_open(struct block_device *bdev, fmode_t mode) unlock: dev->open++; + mutex_unlock(&mtd_table_mutex); mutex_unlock(&dev->lock); blktrans_dev_put(dev); return ret; @@ -230,6 +232,7 @@ error_release: error_put: module_put(dev->tr->owner); kref_put(&dev->ref, blktrans_dev_release); + mutex_unlock(&mtd_table_mutex); mutex_unlock(&dev->lock); blktrans_dev_put(dev); return ret; @@ -243,6 +246,7 @@ static void blktrans_release(struct gendisk *disk, fmode_t mode) return; mutex_lock(&dev->lock); + mutex_lock(&mtd_table_mutex); if (--dev->open) goto unlock; @@ -256,6 +260,7 @@ static void blktrans_release(struct gendisk *disk, fmode_t mode) __put_mtd_device(dev->mtd); } unlock: + mutex_unlock(&mtd_table_mutex); mutex_unlock(&dev->lock); blktrans_dev_put(dev); }