diff mbox series

[3/3] iommu: Do not attempt to re-lock the iommu device when probing

Message ID 3-v1-8612b9ef48da+333-iommu_group_locking2_jgg@nvidia.com
State New
Headers show
Series None | expand

Commit Message

Jason Gunthorpe Aug. 8, 2023, 5:27 p.m. UTC
When bus_iommu_probe() runs it can attempt to probe the iommu itself, for
instance if the iommu is located on a platform_bus. This will cause the
device_lock() to deadlock on itself as the device_driver probe() callback
for the device calling iommu_device_register() already holds the
device_lock():

 probe_iommu_group+0x18/0x38
 bus_for_each_dev+0xe4/0x168
 bus_iommu_probe+0x8c/0x240
 iommu_device_register+0x120/0x1b0
 mtk_iommu_probe+0x494/0x7a0
 platform_probe+0x94/0x100
 really_probe+0x1e4/0x3e8
 __driver_probe_device+0xc0/0x1a0
 driver_probe_device+0x110/0x1f0
 __device_attach_driver+0xf0/0x1b0
 bus_for_each_drv+0xf0/0x170
 __device_attach+0x120/0x240
 device_initial_probe+0x1c/0x30
 bus_probe_device+0xdc/0xe8
 deferred_probe_work_func+0xf0/0x140
 process_one_work+0x3b0/0x910
 worker_thread+0x33c/0x610
 kthread+0x1dc/0x1f0
 ret_from_fork+0x10/0x20

Keep track of the iommu itself and do not attempt to relock the device
while doing the probe_iommu_group scan.

To accommodate omap's use of unregistered struct iommu_device's add a new
'hwdev' member to keep track of the hwdev in all cases. Normally this
would be dev->parent, but since omap doesn't allocate that struct it won't
exist for it.

Fixes: a2dde836050f ("iommu: Complete the locking for dev->iommu_group")
Reported-by: Chen-Yu Tsai <wenst@chromium.org>
Closes: https://lore.kernel.org/linux-iommu/CAGXv+5E-f9AteAYkmXYzVDZFSA_royc7-bS5LcrzzuHDnXccwA@mail.gmail.com
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/iommu/iommu.c      | 12 ++++++++++--
 drivers/iommu/omap-iommu.c |  1 +
 include/linux/iommu.h      |  2 ++
 3 files changed, 13 insertions(+), 2 deletions(-)

Comments

Tian, Kevin Aug. 9, 2023, 4:01 a.m. UTC | #1
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Wednesday, August 9, 2023 1:27 AM
> 
> @@ -1800,11 +1801,18 @@ struct probe_iommu_args {
>  static int probe_iommu_group(struct device *dev, void *data)
>  {
>  	struct probe_iommu_args *args = data;
> +	bool need_lock;
>  	int ret;
> 
> -	device_lock(dev);
> +	/* Probing the iommu itself is always done under the device_lock */
> +	need_lock = !args->iommu || args->iommu->hwdev != dev;
> +

is !args->iommu a valid case?

btw probably a dumb question. Why do we continue to probe the
iommu device itself instead of skipping it? The group is a concept
for devices which require DMA protection from iommu instead of
for the iommu device itself...
Jason Gunthorpe Aug. 9, 2023, 12:15 p.m. UTC | #2
On Wed, Aug 09, 2023 at 04:01:42AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Wednesday, August 9, 2023 1:27 AM
> > 
> > @@ -1800,11 +1801,18 @@ struct probe_iommu_args {
> >  static int probe_iommu_group(struct device *dev, void *data)
> >  {
> >  	struct probe_iommu_args *args = data;
> > +	bool need_lock;
> >  	int ret;
> > 
> > -	device_lock(dev);
> > +	/* Probing the iommu itself is always done under the device_lock */
> > +	need_lock = !args->iommu || args->iommu->hwdev != dev;
> > +
> 
> is !args->iommu a valid case?

Hmm, not any more, it used to happen in an earlier version
 
> btw probably a dumb question. Why do we continue to probe the
> iommu device itself instead of skipping it? The group is a concept
> for devices which require DMA protection from iommu instead of
> for the iommu device itself...

Yeah, that is how I originally did it, but since the locking appeared
here I thought it would be safer to just continue to invoke probe as
we have always done. I don't know for sure that there isn't some
driver that relies on this for some reason.

eg it might change the group layouts or something.

Thanks,
Jason
Marek Szyprowski Aug. 9, 2023, 1:38 p.m. UTC | #3
On 09.08.2023 14:15, Jason Gunthorpe wrote:
> On Wed, Aug 09, 2023 at 04:01:42AM +0000, Tian, Kevin wrote:
>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>> Sent: Wednesday, August 9, 2023 1:27 AM
>>>
>>> @@ -1800,11 +1801,18 @@ struct probe_iommu_args {
>>>   static int probe_iommu_group(struct device *dev, void *data)
>>>   {
>>>   	struct probe_iommu_args *args = data;
>>> +	bool need_lock;
>>>   	int ret;
>>>
>>> -	device_lock(dev);
>>> +	/* Probing the iommu itself is always done under the device_lock */
>>> +	need_lock = !args->iommu || args->iommu->hwdev != dev;
>>> +
>> is !args->iommu a valid case?
> Hmm, not any more, it used to happen in an earlier version
>   
>> btw probably a dumb question. Why do we continue to probe the
>> iommu device itself instead of skipping it? The group is a concept
>> for devices which require DMA protection from iommu instead of
>> for the iommu device itself...
> Yeah, that is how I originally did it, but since the locking appeared
> here I thought it would be safer to just continue to invoke probe as
> we have always done. I don't know for sure that there isn't some
> driver that relies on this for some reason.
>
> eg it might change the group layouts or something.

Well, I don't think that there is any driver doing such things, but the 
only way to check is simply change the behavior an wait for complaints.


Best regards
Jason Gunthorpe Aug. 9, 2023, 2:33 p.m. UTC | #4
On Wed, Aug 09, 2023 at 03:38:30PM +0200, Marek Szyprowski wrote:
> On 09.08.2023 14:15, Jason Gunthorpe wrote:
> > On Wed, Aug 09, 2023 at 04:01:42AM +0000, Tian, Kevin wrote:
> >>> From: Jason Gunthorpe <jgg@nvidia.com>
> >>> Sent: Wednesday, August 9, 2023 1:27 AM
> >>>
> >>> @@ -1800,11 +1801,18 @@ struct probe_iommu_args {
> >>>   static int probe_iommu_group(struct device *dev, void *data)
> >>>   {
> >>>   	struct probe_iommu_args *args = data;
> >>> +	bool need_lock;
> >>>   	int ret;
> >>>
> >>> -	device_lock(dev);
> >>> +	/* Probing the iommu itself is always done under the device_lock */
> >>> +	need_lock = !args->iommu || args->iommu->hwdev != dev;
> >>> +
> >> is !args->iommu a valid case?
> > Hmm, not any more, it used to happen in an earlier version
> >   
> >> btw probably a dumb question. Why do we continue to probe the
> >> iommu device itself instead of skipping it? The group is a concept
> >> for devices which require DMA protection from iommu instead of
> >> for the iommu device itself...
> > Yeah, that is how I originally did it, but since the locking appeared
> > here I thought it would be safer to just continue to invoke probe as
> > we have always done. I don't know for sure that there isn't some
> > driver that relies on this for some reason.
> >
> > eg it might change the group layouts or something.
> 
> Well, I don't think that there is any driver doing such things, but the 
> only way to check is simply change the behavior an wait for complaints.

My other concern is that we don't fully block iommu devices from being
self probed. They can still be probed during rescan and during 2nd
calls to bus_iommu_probe().

But OK, it makes more logical sense to just block it so lets try that.

Jason
diff mbox series

Patch

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 19fdb1a220240f..8842f4975ec4a8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -264,6 +264,7 @@  int iommu_device_register(struct iommu_device *iommu,
 		return -EBUSY;
 
 	iommu->ops = ops;
+	iommu->hwdev = hwdev;
 	if (hwdev)
 		iommu->fwnode = dev_fwnode(hwdev);
 
@@ -1800,11 +1801,18 @@  struct probe_iommu_args {
 static int probe_iommu_group(struct device *dev, void *data)
 {
 	struct probe_iommu_args *args = data;
+	bool need_lock;
 	int ret;
 
-	device_lock(dev);
+	/* Probing the iommu itself is always done under the device_lock */
+	need_lock = !args->iommu || args->iommu->hwdev != dev;
+
+	if (need_lock)
+		device_lock(dev);
 	ret = __iommu_probe_device(dev, args->group_list);
-	device_unlock(dev);
+	if (need_lock)
+		device_unlock(dev);
+
 	if (ret == -ENODEV)
 		ret = 0;
 
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 1e4a90ec64322b..20fcc8ebab6ae3 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1241,6 +1241,7 @@  static int omap_iommu_probe(struct platform_device *pdev)
 		 * an iommu without registering it, so re-run probe again to try
 		 * to match any devices that are waiting for this iommu.
 		 */
+		obj->iommu.hwdev = &pdev->dev;
 		bus_iommu_probe(&platform_bus_type, &obj->iommu);
 	}
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index cc47e4086d69ec..96782bfb384462 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -361,6 +361,7 @@  struct iommu_domain_ops {
  * @list: Used by the iommu-core to keep a list of registered iommus
  * @ops: iommu-ops for talking to this iommu
  * @dev: struct device for sysfs handling
+ * @hwdev: The device HW that controls the iommu
  * @singleton_group: Used internally for drivers that have only one group
  * @max_pasids: number of supported PASIDs
  */
@@ -369,6 +370,7 @@  struct iommu_device {
 	const struct iommu_ops *ops;
 	struct fwnode_handle *fwnode;
 	struct device *dev;
+	struct device *hwdev;
 	struct iommu_group *singleton_group;
 	u32 max_pasids;
 };