
[v2,0/4] Fix device_lock deadlock on two probe() paths

Message ID 0-v2-d2762acaf50a+16d-iommu_group_locking2_jgg@nvidia.com


Jason Gunthorpe Aug. 9, 2023, 2:43 p.m. UTC
I missed two paths where __iommu_probe_device() can be called while
already holding the device_lock() for the device that is to be probed.

This causes a deadlock because __iommu_probe_device() will attempt to
re-acquire the lock.

Organize things so that these two paths can re-use the existing already
held device_lock by marking the call chains based on if the lock is held
or not.

Also, the order of operations in iommu_init_device() was not correct for
generic_single_device_group().

v2:
 - New patch to correct the order of setting iommu_dev during
   iommu_init_device()
 - Spelling fixes
 - Simply block probing the iommu device itself instead of trying to do it
v1: https://lore.kernel.org/r/0-v1-8612b9ef48da+333-iommu_group_locking2_jgg@nvidia.com

Jason Gunthorpe (4):
  iommu: Provide iommu_probe_device_locked()
  iommu: Pass in the iommu_device to probe for in bus_iommu_probe()
  iommu: Do not attempt to re-lock the iommu device when probing
  iommu: dev->iommu->iommu_dev must be set before ops->device_group()

 drivers/acpi/scan.c        |  5 +--
 drivers/iommu/iommu.c      | 65 +++++++++++++++++++++++++++-----------
 drivers/iommu/of_iommu.c   |  2 +-
 drivers/iommu/omap-iommu.c | 12 +++++--
 include/linux/iommu.h      |  6 +++-
 5 files changed, 65 insertions(+), 25 deletions(-)


base-commit: 8d3740021d5d461e1ec4c17fc5625b9b4237c2f8

Comments

Jason Gunthorpe Aug. 9, 2023, 3:55 p.m. UTC | #1
On Wed, Aug 09, 2023 at 05:49:44PM +0200, Joerg Roedel wrote:
> On Wed, Aug 09, 2023 at 11:43:46AM -0300, Jason Gunthorpe wrote:
> > Jason Gunthorpe (4):
> >   iommu: Provide iommu_probe_device_locked()
> >   iommu: Pass in the iommu_device to probe for in bus_iommu_probe()
> >   iommu: Do not attempt to re-lock the iommu device when probing
> >   iommu: dev->iommu->iommu_dev must be set before ops->device_group()
> 
> Applied, thanks for fixing this quickly.

Yes, of course, thanks!

Jason
Tian, Kevin Aug. 10, 2023, 2:37 a.m. UTC | #2
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, August 9, 2023 10:44 PM
> 
> When bus_iommu_probe() runs it can attempt to probe the iommu itself, for
> instance if the iommu is located on a platform_bus. This will cause the
> device_lock() to deadlock on itself as the device_driver probe() callback
> for the device calling iommu_device_register() already holds the
> device_lock():
> 
>  probe_iommu_group+0x18/0x38
>  bus_for_each_dev+0xe4/0x168
>  bus_iommu_probe+0x8c/0x240
>  iommu_device_register+0x120/0x1b0
>  mtk_iommu_probe+0x494/0x7a0
>  platform_probe+0x94/0x100
>  really_probe+0x1e4/0x3e8
>  __driver_probe_device+0xc0/0x1a0
>  driver_probe_device+0x110/0x1f0
>  __device_attach_driver+0xf0/0x1b0
>  bus_for_each_drv+0xf0/0x170
>  __device_attach+0x120/0x240
>  device_initial_probe+0x1c/0x30
>  bus_probe_device+0xdc/0xe8
>  deferred_probe_work_func+0xf0/0x140
>  process_one_work+0x3b0/0x910
>  worker_thread+0x33c/0x610
>  kthread+0x1dc/0x1f0
>  ret_from_fork+0x10/0x20
> 
> Keep track of the iommu itself and do not attempt to relock the device
> while doing the probe_iommu_group scan.
> 
> To accommodate omap's use of unregistered struct iommu_devices, add a
> new 'hwdev' member to keep track of the hwdev in all cases. Normally
> this would be dev->parent, but since omap doesn't allocate that struct
> it won't exist for it.
> 
> Fixes: a16b5d1681ab ("iommu: Complete the locking for dev->iommu_group")
> Reported-by: Chen-Yu Tsai <wenst@chromium.org>
> Closes: https://lore.kernel.org/linux-iommu/CAGXv+5E-f9AteAYkmXYzVDZFSA_royc7-bS5LcrzzuHDnXccwA@mail.gmail.com
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Tested-by: Chen-Yu Tsai <wenst@chromium.org>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tian, Kevin Aug. 10, 2023, 2:37 a.m. UTC | #3
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, August 9, 2023 10:44 PM
> 
> As generic_single_device_group() requires it. Otherwise it crashes:
> 
>   generic_single_device_group+0x24/0x88
>   __iommu_probe_device+0xe8/0x444
>   iommu_probe_device+0x1c/0x54
>   of_iommu_configure+0x10c/0x200
>   of_dma_configure_id+0x1e0/0x3b4
>   platform_dma_configure+0x30/0x78
>   really_probe+0x70/0x2b4
>   __driver_probe_device+0x78/0x12c
>   driver_probe_device+0x3c/0x160
>   __driver_attach+0x9c/0x1ac
>   bus_for_each_dev+0x74/0xd4
>   driver_attach+0x24/0x30
>   bus_add_driver+0xe4/0x1e8
>   driver_register+0x60/0x128
>   __platform_driver_register+0x28/0x34
>   hantro_driver_init+0x20/0x1000 [hantro_vpu]
>   do_one_initcall+0x74/0x2f0
>   do_init_module+0x58/0x1ec
>   load_module+0x1a20/0x1c64
>   init_module_from_file+0x84/0xc0
>   idempotent_init_module+0x180/0x250
>   __arm64_sys_finit_module+0x64/0xa0
>   invoke_syscall+0x48/0x114
>   el0_svc_common.constprop.0+0xec/0x10c
>   do_el0_svc+0x38/0xa4
>   el0_svc+0x40/0xac
>   el0t_64_sync_handler+0xc0/0xc4
>   el0t_64_sync+0x190/0x194
> 
> Fixes: 5dd59857af60 ("iommu: Add generic_single_device_group()")
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Closes: https://lore.kernel.org/all/e5e75222-13a9-ee01-0c25-8311b622fe0d@samsung.com/
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jeffrey Hugo Aug. 10, 2023, 4:15 p.m. UTC | #4
On 8/9/2023 8:43 AM, Jason Gunthorpe wrote:
> I missed two paths where __iommu_probe_device() can be called while
> already holding the device_lock() for the device that is to be probed.
> 
> This causes a deadlock because __iommu_probe_device() will attempt to
> re-acquire the lock.
> 
> Organize things so that these two paths can re-use the existing already
> held device_lock by marking the call chains based on if the lock is held
> or not.
> 
> Also the order of iommu_init_device() was not correct for
> generic_single_device_group()
> 
> v2:
>   - New patch to correct the order of setting iommu_dev during
>     iommu_init_device()
>   - Spelling fixes
>   - Simply block probing the iommu device itself instead of trying to do it
> v1: https://lore.kernel.org/r/0-v1-8612b9ef48da+333-iommu_group_locking2_jgg@nvidia.com
> 
> Jason Gunthorpe (4):
>    iommu: Provide iommu_probe_device_locked()
>    iommu: Pass in the iommu_device to probe for in bus_iommu_probe()
>    iommu: Do not attempt to re-lock the iommu device when probing
>    iommu: dev->iommu->iommu_dev must be set before ops->device_group()
> 
>   drivers/acpi/scan.c        |  5 +--
>   drivers/iommu/iommu.c      | 65 +++++++++++++++++++++++++++-----------
>   drivers/iommu/of_iommu.c   |  2 +-
>   drivers/iommu/omap-iommu.c | 12 +++++--
>   include/linux/iommu.h      |  6 +++-
>   5 files changed, 65 insertions(+), 25 deletions(-)
> 
> 
> base-commit: 8d3740021d5d461e1ec4c17fc5625b9b4237c2f8

I found that -next broke boot on the Lenovo Miix 630 laptop (Qualcomm 
MSM8998).  Bisected to "iommu: Complete the locking for dev->iommu_group".

With this series applied, the boot hang shortly after iommu probe goes
away.

Tested-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Marek Szyprowski Aug. 17, 2023, 8:31 a.m. UTC | #5
Hi Jason,

On 09.08.2023 16:43, Jason Gunthorpe wrote:
> I missed two paths where __iommu_probe_device() can be called while
> already holding the device_lock() for the device that is to be probed.
>
> This causes a deadlock because __iommu_probe_device() will attempt to
> re-acquire the lock.
>
> Organize things so that these two paths can re-use the existing already
> held device_lock by marking the call chains based on if the lock is held
> or not.
>
> Also the order of iommu_init_device() was not correct for
> generic_single_device_group()

I've just noticed that there is still one more issue left to fix after 
applying this patchset. On Qualcomm's RB5 board I get the following 
warning (captured on vanilla next-20230817):

------------[ cut here ]------------
WARNING: CPU: 6 PID: 274 at include/linux/device.h:1012 iommu_probe_device_locked+0xd4/0xe4
Modules linked in: apr pdr_interface venus_dec venus_enc 
videobuf2_dma_contig rpmsg_ctrl fastrpc qrtr_smd rpmsg_char 
lontium_lt9611uxc msm mcp251xfd qcom_camss can_dev videobuf2_dma_sg 
ocmem imx412 gpu_sched venus_core snd_soc_sm8250 phy_qcom_qmp_combo 
videobuf2_memops v4l2_fwnode leds_qcom_lpg v4l2_mem2mem drm_dp_aux_bus 
v4l2_async snd_soc_qcom_sdw videobuf2_v4l2 ax88179_178a onboard_usb_hub 
videodev rtc_pm8xxx qcom_spmi_adc5 qcom_pon qcom_spmi_adc_tm5 
led_class_multicolor qcom_spmi_temp_alarm qcom_vadc_common crct10dif_ce 
i2c_qcom_geni snd_soc_qcom_common qrtr drm_display_helper 
videobuf2_common camcc_sm8250 mc typec i2c_qcom_cci spi_geni_qcom 
qcom_stats llcc_qcom phy_qcom_qmp_usb qcom_rng icc_bwmon qcom_q6v5_pas 
qcrypto phy_qcom_snps_femto_v2 qcom_pil_info sha256_generic 
soundwire_qcom coresight_stm coresight_tmc stm_core coresight_funnel 
coresight_replicator libsha256 qcom_q6v5 pinctrl_sm8250_lpass_lpi 
soundwire_bus snd_soc_lpass_va_macro snd_soc_lpass_macro_common 
pinctrl_lpass_lpi coresight authenc
  lpass_gfm_sm8250 qcom_sysmon slimbus snd_soc_lpass_wsa_macro libdes 
qcom_common qcom_glink_smem socinfo mdt_loader qmi_helpers 
phy_qcom_qmp_pcie icc_osm_l3 qcom_wdt display_connector ip_tables 
x_tables ipv6
CPU: 6 PID: 274 Comm: kworker/u16:8 Not tainted 6.5.0-rc6-next-20230817 #13867
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : iommu_probe_device_locked+0xd4/0xe4
lr : iommu_probe_device_locked+0x78/0xe4
...
Call trace:
  iommu_probe_device_locked+0xd4/0xe4
  of_iommu_configure+0x10c/0x200
  of_dma_configure_id+0x104/0x3b8
  a6xx_gmu_init+0x4c/0xccc [msm]
  a6xx_gpu_init+0x3ac/0x770 [msm]
  adreno_bind+0x174/0x2ac [msm]
  component_bind_all+0x118/0x24c
  msm_drm_bind+0x1e8/0x6c4 [msm]
  try_to_bring_up_aggregate_device+0x168/0x1d4
  __component_add+0xa8/0x170
  component_add+0x14/0x20
  dsi_dev_attach+0x20/0x2c [msm]
  dsi_host_attach+0x9c/0x144 [msm]
  devm_mipi_dsi_attach+0x34/0xb4
  lt9611uxc_attach_dsi.isra.0+0x84/0xfc [lontium_lt9611uxc]
  lt9611uxc_probe+0x5c8/0x68c [lontium_lt9611uxc]
  i2c_device_probe+0x14c/0x290
  really_probe+0x148/0x2b4
  __driver_probe_device+0x78/0x12c
  driver_probe_device+0x3c/0x160
  __device_attach_driver+0xb8/0x138
  bus_for_each_drv+0x84/0xe0
  __device_attach+0xa8/0x1b0
  device_initial_probe+0x14/0x20
  bus_probe_device+0xb0/0xb4
  deferred_probe_work_func+0x8c/0xc8
  process_one_work+0x1ec/0x53c
  worker_thread+0x298/0x408
  kthread+0x124/0x128
  ret_from_fork+0x10/0x20
irq event stamp: 207712
hardirqs last  enabled at (207711): [<ffffcfe16b46b160>] _raw_spin_unlock_irqrestore+0x74/0x78
hardirqs last disabled at (207712): [<ffffcfe16b458630>] el1_dbg+0x24/0x8c
softirqs last  enabled at (206480): [<ffffcfe16a290a10>] __do_softirq+0x438/0x4ec
softirqs last disabled at (206473): [<ffffcfe16a296980>] ____do_softirq+0x10/0x1c
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------

Let me know if you need more information or some tests on the hardware.



Best regards
Jason Gunthorpe Aug. 18, 2023, 4:06 p.m. UTC | #6
On Fri, Aug 18, 2023 at 05:56:20PM +0200, Joerg Roedel wrote:
> On Thu, Aug 17, 2023 at 03:33:16PM -0300, Jason Gunthorpe wrote:
> > Basically.. Yikes!
> 
> Hmm, that is a difficult situation. Even if the problem is a misuse of
> the APIs we can not just blindly break other drivers by our core
> changes.

They are not broken, they just throw a lockdep warning and keep going
as before. This is what triggers:

static inline void device_lock_assert(struct device *dev)
{
	lockdep_assert_held(&dev->mutex);
}

So non-debug builds won't even see anything.

> We need to resolve this situation pretty soon, otherwise I need to
> remove the locking rework patches from the IOMMU tree until the
> callers are fixed.
> 
> Is there a way to keep the broken drivers working for now?

Historically we've tolerated lockdep warnings as a way to motivate
people who care to fix their stuff properly, e.g. the Intel VT-d driver
had a lockdep warning at kernel boot for many releases before it was
fixed.

The series doesn't make things any functionally worse for these places
misusing the API, but it now does throw a warning in some cases.

IMHO I'd rather keep the warning than suppress it by adding
device_lock() calls. Do you agree?

Jason
Jason Gunthorpe Aug. 18, 2023, 6:15 p.m. UTC | #7
On Fri, Aug 18, 2023 at 02:00:21PM -0400, Eric Farman wrote:

> Well, I'm trying to chase down an apparent deadlock in the mdev cases
> that is introduced with the commit [1] blamed by these fixes. Seems
> that when iommu_group_{add|remove}_device gets called out of vfio (for
> example, vfio-ap or -ccw), the device lock is already held so
> attempting to get it again isn't going to go well.

Oh, yes. Thankfully due to all the recent cleanup there is now only
one caller of iommu_group_add_device() - VFIO and only on the
vfio_register_emulated_iommu_dev() path.

All those callers are under mdev probe callbacks so they all must have
the device lock held. iommu_group_remove_device is the same. Simple
fix to just assert the device lock is held.

> I'm puzzled why lockdep isn't shouting over this for me, so I added a
> lockdep_assert_not_held() in those paths to get a proper callchain:

The driver core mostly disables lockdep on the device_lock() :(

Does this work for you?

Thanks,
Jason

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 18162049bd2294..1f58bfacb47951 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1166,12 +1166,11 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 {
 	struct group_device *gdev;
 
-	device_lock(dev);
+	device_lock_assert(dev);
+
 	gdev = iommu_group_alloc_device(group, dev);
-	if (IS_ERR(gdev)) {
-		device_unlock(dev);
+	if (IS_ERR(gdev))
 		return PTR_ERR(gdev);
-	}
 
 	iommu_group_ref_get(group);
 	dev->iommu_group = group;
@@ -1179,7 +1178,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
 	mutex_lock(&group->mutex);
 	list_add_tail(&gdev->list, &group->devices);
 	mutex_unlock(&group->mutex);
-	device_unlock(dev);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(iommu_group_add_device);
@@ -1195,14 +1193,13 @@ void iommu_group_remove_device(struct device *dev)
 {
 	struct iommu_group *group;
 
-	device_lock(dev);
+	device_lock_assert(dev);
+
 	group = dev->iommu_group;
 	if (group) {
 		dev_info(dev, "Removing from iommu group %d\n", group->id);
 		__iommu_group_remove_device(dev);
 	}
-	device_unlock(dev);
-
 }
 EXPORT_SYMBOL_GPL(iommu_group_remove_device);
Eric Farman Aug. 18, 2023, 6:32 p.m. UTC | #8
On Fri, 2023-08-18 at 15:15 -0300, Jason Gunthorpe wrote:
> On Fri, Aug 18, 2023 at 02:00:21PM -0400, Eric Farman wrote:
> 
> > Well, I'm trying to chase down an apparent deadlock in the mdev
> > cases
> > that is introduced with the commit [1] blamed by these fixes. Seems
> > that when iommu_group_{add|remove}_device gets called out of vfio
> > (for
> > example, vfio-ap or -ccw), the device lock is already held so
> > attempting to get it again isn't going to go well.
> 
> Oh, yes. Thankfully due to all the recent cleanup there is now only
> one caller of iommu_group_add_device() - VFIO and only on the
> vfio_register_emulated_iommu_dev() path.
> 
> All those callers are under mdev probe callbacks so they all must
> have
> the device lock held. iommu_group_remove_device is the same. Simple
> fix to just assert the device lock is held.

Agreed.

> 
> > I'm puzzled why lockdep isn't shouting over this for me, so I added
> > a
> > lockdep_assert_not_held() in those paths to get a proper callchain:
> 
> The driver core mostly disables lockdep on the device_lock() :(

Ah, TIL. Thanks!

> 
> Does this work for you?

Yup, for both -ap and -ccw devices that I have handy. Thanks for the
quick turnaround!

Thanks,
Eric

> 
> Thanks,
> Jason
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 18162049bd2294..1f58bfacb47951 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -1166,12 +1166,11 @@ int iommu_group_add_device(struct iommu_group
> *group, struct device *dev)
>  {
>         struct group_device *gdev;
>  
> -       device_lock(dev);
> +       device_lock_assert(dev);
> +
>         gdev = iommu_group_alloc_device(group, dev);
> -       if (IS_ERR(gdev)) {
> -               device_unlock(dev);
> +       if (IS_ERR(gdev))
>                 return PTR_ERR(gdev);
> -       }
>  
>         iommu_group_ref_get(group);
>         dev->iommu_group = group;
> @@ -1179,7 +1178,6 @@ int iommu_group_add_device(struct iommu_group
> *group, struct device *dev)
>         mutex_lock(&group->mutex);
>         list_add_tail(&gdev->list, &group->devices);
>         mutex_unlock(&group->mutex);
> -       device_unlock(dev);
>         return 0;
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_add_device);
> @@ -1195,14 +1193,13 @@ void iommu_group_remove_device(struct device
> *dev)
>  {
>         struct iommu_group *group;
>  
> -       device_lock(dev);
> +       device_lock_assert(dev);
> +
>         group = dev->iommu_group;
>         if (group) {
>                 dev_info(dev, "Removing from iommu group %d\n",
> group->id);
>                 __iommu_group_remove_device(dev);
>         }
> -       device_unlock(dev);
> -
>  }
>  EXPORT_SYMBOL_GPL(iommu_group_remove_device);
>
Robin Murphy Aug. 18, 2023, 6:50 p.m. UTC | #9
On 2023-08-18 19:24, Joerg Roedel wrote:
> On Fri, Aug 18, 2023 at 01:06:43PM -0300, Jason Gunthorpe wrote:
>> On Fri, Aug 18, 2023 at 05:56:20PM +0200, Joerg Roedel wrote:
>>> On Thu, Aug 17, 2023 at 03:33:16PM -0300, Jason Gunthorpe wrote:
>>>> Basically.. Yikes!
>>>
>>> Hmm, that is a difficult situation. Even if the problem is a misuse of
>>> the APIs we can not just blindly break other drivers by our core
>>> changes.
>>
>> They are not broken, they just throw a lockdep warning and keep going
>> as before. This is what triggers:
>>
>> static inline void device_lock_assert(struct device *dev)
>> {
>> 	lockdep_assert_held(&dev->mutex);
>> }
>>
>> So non-debug builds won't even see anything.
> 
> But this still means that a function is called without holding the
> proper lock.
> 
>> Historically we've tolerated lockdep warnings as a way to motivate
>> people who care to fix their stuff properly. eg the Intel VT-D had a
>> lockdep warning at kernel boot for many releases before it was fixed.
> 
> There is a difference between knowingly introducing new lockdep warnings
> into upstream and letting warnings discovered upstream rot for a while.

Furthermore I'd say it's one thing to have deadlocks or warnings slip in 
as part of some functional change, but quite another when the change is 
solely reworking the locking in an attempt to make it better. This is 
clearly not better.

Thanks,
Robin.
Jason Gunthorpe Aug. 18, 2023, 7:18 p.m. UTC | #10
On Fri, Aug 18, 2023 at 08:24:01PM +0200, Joerg Roedel wrote:
> On Fri, Aug 18, 2023 at 01:06:43PM -0300, Jason Gunthorpe wrote:
> > On Fri, Aug 18, 2023 at 05:56:20PM +0200, Joerg Roedel wrote:
> > > On Thu, Aug 17, 2023 at 03:33:16PM -0300, Jason Gunthorpe wrote:
> > > > Basically.. Yikes!
> > > 
> > > Hmm, that is a difficult situation. Even if the problem is a misuse of
> > > the APIs we can not just blindly break other drivers by our core
> > > changes.
> > 
> > They are not broken, they just throw a lockdep warning and keep going
> > as before. This is what triggers:
> > 
> > static inline void device_lock_assert(struct device *dev)
> > {
> > 	lockdep_assert_held(&dev->mutex);
> > }
> > 
> > So non-debug builds won't even see anything.
> 
> But this still means that a function is called without holding the
> proper lock.

It has always been like that. of_dma_configure_id() always required the
device_lock to be held!

eg as one example:

 of_iommu_configure
  of_iommu_configure_device
   of_iommu_configure_dev
    of_iommu_xlate
     iommu_fwspec_init
       dev_iommu_get

It is subtle, but the device_lock is what protects the store to
dev->iommu inside dev_iommu_get(). In v6.5-rc1 many callers held the
device lock here, and after this series only these broken drivers
don't.

The driver assumes it has exclusive use of the platform device it
steals. Not just for this call, but in general it does other stuff
that rests on this assumption. Since it is exclusive it doesn't
actually need any locking - this is why it works reliably as is today.

Again, there is no practical bug here, the driver works fine. On
non-debug kernels there is no warning or functional issue. Debug
kernels rightly highlight that the API is being used wrong.

It is wrong use of the APIs because someone could go and use sysfs to
attach, say, VFIO to the stolen platform_device and cause all kinds of
kernel problems.

> I can't send anything with known problems upstream.

In my view this is not creating a new problem. It is exposing existing
problems with a debugging message only on debug kernels.

However, if you view the debug message as a problem then I suggest we
simply comment out with a note the device_lock_assert() from the iommu
code. This would be sad because the vast majority of systems don't use
these badly designed drivers.

This puts it back to the v6.5-rc1 behavior where of_dma_configure_id()
won't make any prints if it is called with wrong locking.

Please let me know.

Jason
Jason Gunthorpe Aug. 18, 2023, 7:19 p.m. UTC | #11
On Fri, Aug 18, 2023 at 07:50:18PM +0100, Robin Murphy wrote:

> > There is a difference between knowingly introducing new lockdep warnings
> > into upstream and letting warnings discovered upstream rot for a while.
> 
> Furthermore I'd say it's one thing to have deadlocks or warnings slip in as
> part of some functional change, but quite another when the change is solely
> reworking the locking in an attempt to make it better. This is clearly not
> better.

Exactly what isn't better? We now print a warning when pre-existing
locking rules are violated. That is the only issue here.

Jason