diff mbox

[RFC,v3,6/7] arm: call iommu_init before of_platform_populate

Message ID 1410539695-29128-7-git-send-email-will.deacon@arm.com
State Superseded
Headers show

Commit Message

Will Deacon Sept. 12, 2014, 4:34 p.m. UTC
We need to ensure that the IOMMUs in the system have a chance to perform
some basic initialisation before we start adding masters to them.

This patch adds a call to of_iommu_init before of_platform_populate.

Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/kernel/setup.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Thierry Reding Sept. 22, 2014, 9:36 a.m. UTC | #1
On Fri, Sep 12, 2014 at 05:34:54PM +0100, Will Deacon wrote:
> We need to ensure that the IOMMUs in the system have a chance to perform
> some basic initialisation before we start adding masters to them.
> 
> This patch adds a call to of_iommu_init before of_platform_populate.

Why can't you call it from of_platform_populate() directly? That way it
would be usable for all architectures rather than just ARM. Eventually
we're going to need the same thing for 64-bit ARM (and possibly others
as well).

I also don't like how this uses what is essentially initcall ordering
when solutions have been proposed in the past to properly integrate this
with deferred probe and allow IOMMU drivers to be proper drivers. We
keep adding these special cases for devices that need to be probed early
and I think it's turning into a mess.

Thierry
Arnd Bergmann Sept. 22, 2014, 11:08 a.m. UTC | #2
On Monday 22 September 2014 11:36:15 Thierry Reding wrote:
> On Fri, Sep 12, 2014 at 05:34:54PM +0100, Will Deacon wrote:
> > We need to ensure that the IOMMUs in the system have a chance to perform
> > some basic initialisation before we start adding masters to them.
> > 
> > This patch adds a call to of_iommu_init before of_platform_populate.
> 
> Why can't you call it from of_platform_populate() directly? That way it
> would be usable for all architectures rather than just ARM. Eventually
> we're going to need the same thing for 64-bit ARM (and possibly others
> as well).

IIRC, of_platform_populate can be called multiple times, even recursively
be drivers that populate their own child devices.

	Arnd
Thierry Reding Sept. 22, 2014, 11:40 a.m. UTC | #3
On Mon, Sep 22, 2014 at 01:08:27PM +0200, Arnd Bergmann wrote:
> On Monday 22 September 2014 11:36:15 Thierry Reding wrote:
> > On Fri, Sep 12, 2014 at 05:34:54PM +0100, Will Deacon wrote:
> > > We need to ensure that the IOMMUs in the system have a chance to perform
> > > some basic initialisation before we start adding masters to them.
> > > 
> > > This patch adds a call to of_iommu_init before of_platform_populate.
> > 
> > Why can't you call it from of_platform_populate() directly? That way it
> > would be usable for all architectures rather than just ARM. Eventually
> > we're going to need the same thing for 64-bit ARM (and possibly others
> > as well).
> 
> IIRC, of_platform_populate can be called multiple times, even recursively
> be drivers that populate their own child devices.

Indeed. Perhaps it could be conditionally called when root == NULL. But
perhaps that's not safe either. Anyway, I still think we shouldn't be
making this some randomly placed early initcall anyway.

Thierry
Arnd Bergmann Sept. 22, 2014, 4:03 p.m. UTC | #4
On Monday 22 September 2014 13:40:40 Thierry Reding wrote:
> On Mon, Sep 22, 2014 at 01:08:27PM +0200, Arnd Bergmann wrote:
> > On Monday 22 September 2014 11:36:15 Thierry Reding wrote:
> > > On Fri, Sep 12, 2014 at 05:34:54PM +0100, Will Deacon wrote:
> > > > We need to ensure that the IOMMUs in the system have a chance to perform
> > > > some basic initialisation before we start adding masters to them.
> > > > 
> > > > This patch adds a call to of_iommu_init before of_platform_populate.
> > > 
> > > Why can't you call it from of_platform_populate() directly? That way it
> > > would be usable for all architectures rather than just ARM. Eventually
> > > we're going to need the same thing for 64-bit ARM (and possibly others
> > > as well).
> > 
> > IIRC, of_platform_populate can be called multiple times, even recursively
> > be drivers that populate their own child devices.
> 
> Indeed. Perhaps it could be conditionally called when root == NULL. But
> perhaps that's not safe either. Anyway, I still think we shouldn't be
> making this some randomly placed early initcall anyway.

I disagree. IOMMUs are special in the same sense that irqchips, clocks and
timers are. This is not just a random call, it is being explicit about a
base functionality that is needed for all devices attached to it.

While we pretend that these are just device drivers in some sense, I think
it's perfectly reasonable to have an explicit entry point for each subsystem
here.

I see two problems with using deferred probing here: 

- we don't actually need to defer the probing but the binding to the driver
  when no dma ops are set, but it seems silly to even create the device
  before we can find out which ops it should use.
  The reason is that a driver does not actively ask for an IOMMU as it would
  for other subsystems (gpio, led, dmaengine, ...).

- As long as the dma_ops are not set, we can't actually call probe() for
  any device other than iommus, and even that would require adding special
  magic in the platform_device_probe(), so we'd just defer every device
  until we get to the iommu. This likely causes a lot of overhead at probe
  time.

	Arnd
Thierry Reding Sept. 23, 2014, 7:02 a.m. UTC | #5
On Mon, Sep 22, 2014 at 06:03:47PM +0200, Arnd Bergmann wrote:
> On Monday 22 September 2014 13:40:40 Thierry Reding wrote:
> > On Mon, Sep 22, 2014 at 01:08:27PM +0200, Arnd Bergmann wrote:
> > > On Monday 22 September 2014 11:36:15 Thierry Reding wrote:
> > > > On Fri, Sep 12, 2014 at 05:34:54PM +0100, Will Deacon wrote:
> > > > > We need to ensure that the IOMMUs in the system have a chance to perform
> > > > > some basic initialisation before we start adding masters to them.
> > > > > 
> > > > > This patch adds a call to of_iommu_init before of_platform_populate.
> > > > 
> > > > Why can't you call it from of_platform_populate() directly? That way it
> > > > would be usable for all architectures rather than just ARM. Eventually
> > > > we're going to need the same thing for 64-bit ARM (and possibly others
> > > > as well).
> > > 
> > > IIRC, of_platform_populate can be called multiple times, even recursively
> > > be drivers that populate their own child devices.
> > 
> > Indeed. Perhaps it could be conditionally called when root == NULL. But
> > perhaps that's not safe either. Anyway, I still think we shouldn't be
> > making this some randomly placed early initcall anyway.
> 
> I disagree. IOMMUs are special in the same sense that irqchips, clocks and
> timers are. This is not just a random call, it is being explicit about a
> base functionality that is needed for all devices attached to it.

Why do you say IOMMUs are special? I can understand how clocks and
timers are somewhat special because they are required at early boot. But
IOMMUs are not special in that way. I don't think there are any IOMMU
users that early in the boot process. The only reason why IOMMU is
special is because of the way it's currently hooked into the driver
model.

> While we pretend that these are just device drivers in some sense, I think
> it's perfectly reasonable to have an explicit entry point for each subsystem
> here.

They aren't really device drivers at all. They don't bind to devices but
to device tree nodes. And that comes at a high cost. Writing drivers for
these subsystems is weird, because all of a sudden you can no longer use
all the goodies that we've become used to (dev_log(), devres, ...) over
the years.

> I see two problems with using deferred probing here: 
> 
> - we don't actually need to defer the probing but the binding to the driver
>   when no dma ops are set, but it seems silly to even create the device
>   before we can find out which ops it should use.

What does device creation have to do with anything? Surely a device
won't need IOMMU services before the device is bound to a driver.

>   The reason is that a driver does not actively ask for an IOMMU as it would
>   for other subsystems (gpio, led, dmaengine, ...).

Actually it does. At least in some cases. If you want to use the IOMMU
API directly you'd call something like iommu_present() on the bus type
and then allocate an IOMMU domain and attach to it. Unfortunately the
API seems to have been designed under the assumption that IOMMU will
have been registered before any users, so the API doesn't propagate any
meaningful errors.

Currently the specifics of how that works is all mostly hidden within
the IOMMU driver which will decide what IOMMU to attach to. That's
problematic for cases where a device has multiple master interfaces,
since it won't be able to decide which one to attach to.

Also, even if in other cases the drivers don't actively ask for an IOMMU
that doesn't mean that they couldn't be made to. For drivers that use
the DMA/IOMMU integration layer this is probably not practicable, but
there is no reason the core couldn't do it.

> - As long as the dma_ops are not set, we can't actually call probe() for
>   any device other than iommus, and even that would require adding special
>   magic in the platform_device_probe(), so we'd just defer every device
>   until we get to the iommu. This likely causes a lot of overhead at probe
>   time.

Why? The patches that I (and Hiroshi before me) posted a while ago did
exactly that and it worked just fine. The only objection to that was
that Greg (and you I think) didn't want to have that code in the core
and therefore -EPROBE_DEFER can't properly be propagated. There's no
special magic required beyond that. The IOMMU becomes a resource or
service provider, just like any other driver.

As for the overhead I think that's negligible. Hopefully the IOMMU would
be standalone enough not to defer probing itself, so at worst all IOMMU
users would be deferred once. Many of them will already defer because of
other resources such as GPIOs, clocks or regulators anyway. But that's a
problem for which a solution could be implemented. Dependency-based
probe ordering was discussed not so long ago. With that in place the
IOMMU would be probed before any of its users.

Thierry
Arnd Bergmann Sept. 23, 2014, 7:44 a.m. UTC | #6
On Tuesday 23 September 2014 09:02:39 Thierry Reding wrote:

> > I see two problems with using deferred probing here: 
> > 
> > - we don't actually need to defer the probing but the binding to the driver
> >   when no dma ops are set, but it seems silly to even create the device
> >   before we can find out which ops it should use.
> 
> What does device creation have to do with anything? Surely a device
> won't need IOMMU services before the device is bound to a driver.

The problem is that the driver will start using the IOMMU as soon
as it calls dma_map_*, but that happens at runtime, not necessarily
during the probe function.

So we can get into the weird situation that probe() returns success,
but then you can't use the device yet because you don't know whether
it is supposed to use an IOMMU or not.

> >   The reason is that a driver does not actively ask for an IOMMU as it would
> >   for other subsystems (gpio, led, dmaengine, ...).
> 
> Actually it does. At least in some cases. If you want to use the IOMMU
> API directly you'd call something like iommu_present() on the bus type
> and then allocate an IOMMU domain and attach to it. Unfortunately the
> API seems to have been designed under the assumption that IOMMU will
> have been registered before any users, so the API doesn't propagate any
> meaningful errors.

That's just a special case that does not even work as it should yet,
please don't confuse the matter more by talking about drivers that
use the IOMMU API explicitly, this series has very little to do with
that.

> Also, even if in other cases the drivers don't actively ask for an IOMMU
> that doesn't mean that they couldn't be made to. For drivers that use
> the DMA/IOMMU integration layer this is probably not practicable, but
> there is no reason the core couldn't do it.

We intentionally have an abstraction that is meant to let you write drivers
without knowing about iommu, swiotlb or coherency, these are all hidden
behind the dma_map_ops implementation.


	Arnd
Thierry Reding Sept. 23, 2014, 8:59 a.m. UTC | #7
On Tue, Sep 23, 2014 at 09:44:25AM +0200, Arnd Bergmann wrote:
> On Tuesday 23 September 2014 09:02:39 Thierry Reding wrote:
> 
> > > I see two problems with using deferred probing here: 
> > > 
> > > - we don't actually need to defer the probing but the binding to the driver
> > >   when no dma ops are set, but it seems silly to even create the device
> > >   before we can find out which ops it should use.
> > 
> > What does device creation have to do with anything? Surely a device
> > won't need IOMMU services before the device is bound to a driver.
> 
> The problem is that the driver will start using the IOMMU as soon
> as it calls dma_map_*, but that happens at runtime, not necessarily
> during the probe function.

But it won't call dma_map_*() before probe, right? If this is done in
the core we can defer probing before the driver's probe is called, so
there shouldn't be an issue.

> So we can get into the weird situation that probe() returns success,
> but then you can't use the device yet because you don't know whether
> it is supposed to use an IOMMU or not.

Of course any such solution would only work under the assumption that
the driver (or core) knows what it's doing and doesn't call dma_map_*()
until IOMMU support has properly been set up.

For DMA/IOMMU integration users I'd expect the core to set everything up
before proceeding to calling the driver's .probe() function. For others
they will need a way to either explicitly check for the IOMMU master
interface or have the core do that for them (similar to how it would do
that for DMA/IOMMU integration users).

> > >   The reason is that a driver does not actively ask for an IOMMU as it would
> > >   for other subsystems (gpio, led, dmaengine, ...).
> > 
> > Actually it does. At least in some cases. If you want to use the IOMMU
> > API directly you'd call something like iommu_present() on the bus type
> > and then allocate an IOMMU domain and attach to it. Unfortunately the
> > API seems to have been designed under the assumption that IOMMU will
> > have been registered before any users, so the API doesn't propagate any
> > meaningful errors.
> 
> That's just a special case that does not even work as it should yet,

Of course it works. I count at least 3 drivers in mainline (and 1 local)
that do exactly this.

> please don't confuse the matter more by talking about drivers that
> use the IOMMU API explicitly, this series has very little to do with
> that.

It has everything to do with it. If we go and implement everything in
terms of DMA/IOMMU then we leave out all other users. Depending on who
you talk to the direct IOMMU users are the actually important ones.

> > Also, even if in other cases the drivers don't actively ask for an IOMMU
> > that doesn't mean that they couldn't be made to. For drivers that use
> > the DMA/IOMMU integration layer this is probably not practicable, but
> > there is no reason the core couldn't do it.
> 
> We intentionally have an abstraction that is meant to let you write drivers
> without knowing about iommu, swiotlb or coherency, these are all hidden
> behind the dma_map_ops implementation.

Perhaps I'm missing something then. How can you associate two devices
with the same domain with dma_map_ops?

Thierry
Laurent Pinchart Oct. 14, 2014, 1:07 p.m. UTC | #8
Hi Arnd,

On Tuesday 23 September 2014 09:44:25 Arnd Bergmann wrote:
> On Tuesday 23 September 2014 09:02:39 Thierry Reding wrote:
> > > I see two problems with using deferred probing here:
> > > 
> > > - we don't actually need to defer the probing but the binding to the
> > >   driver when no dma ops are set, but it seems silly to even create the
> > >   device before we can find out which ops it should use.
> > 
> > What does device creation have to do with anything? Surely a device
> > won't need IOMMU services before the device is bound to a driver.
> 
> The problem is that the driver will start using the IOMMU as soon
> as it calls dma_map_*, but that happens at runtime, not necessarily
> during the probe function.
> 
> So we can get into the weird situation that probe() returns success,
> but then you can't use the device yet because you don't know whether
> it is supposed to use an IOMMU or not.

If we want IOMMU devices to be supported by common device drivers we need to 
defer probing of the master devices, there's no doubt about that. Earlier 
approaches that hooked up into the device core code were rejected, but it 
should be possible to use bus notifiers to achieve the same result (with the 
drawback of having to register one notifier per bus). The 
BUS_NOTIFY_BIND_DRIVER notifier can then just return -EPROBE_DEFER when a 
iommus property is available and points to an IOMMU not registered yet. I'm 
not saying we have to do this, but I believe that at least from a technical 
point of view it could be done.

> > >   The reason is that a driver does not actively ask for an IOMMU as it
> > >   would for other subsystems (gpio, led, dmaengine, ...).
> > 
> > Actually it does. At least in some cases. If you want to use the IOMMU
> > API directly you'd call something like iommu_present() on the bus type
> > and then allocate an IOMMU domain and attach to it. Unfortunately the
> > API seems to have been designed under the assumption that IOMMU will
> > have been registered before any users, so the API doesn't propagate any
> > meaningful errors.
> 
> That's just a special case that does not even work as it should yet,
> please don't confuse the matter more by talking about drivers that
> use the IOMMU API explicitly, this series has very little to do with
> that.
> 
> > Also, even if in other cases the drivers don't actively ask for an IOMMU
> > that doesn't mean that they couldn't be made to. For drivers that use
> > the DMA/IOMMU integration layer this is probably not practicable, but
> > there is no reason the core couldn't do it.
> 
> We intentionally have an abstraction that is meant to let you write drivers
> without knowing about iommu, swiotlb or coherency, these are all hidden
> behind the dma_map_ops implementation.
Arnd Bergmann Oct. 14, 2014, 1:20 p.m. UTC | #9
On Tuesday 14 October 2014 16:07:38 Laurent Pinchart wrote:
> On Tuesday 23 September 2014 09:44:25 Arnd Bergmann wrote:
> > On Tuesday 23 September 2014 09:02:39 Thierry Reding wrote:
> > > > I see two problems with using deferred probing here:
> > > > 
> > > > - we don't actually need to defer the probing but the binding to the
> > > >   driver when no dma ops are set, but it seems silly to even create the
> > > >   device before we can find out which ops it should use.
> > > 
> > > What does device creation have to do with anything? Surely a device
> > > won't need IOMMU services before the device is bound to a driver.
> > 
> > The problem is that the driver will start using the IOMMU as soon
> > as it calls dma_map_*, but that happens at runtime, not necessarily
> > during the probe function.
> > 
> > So we can get into the weird situation that probe() returns success,
> > but then you can't use the device yet because you don't know whether
> > it is supposed to use an IOMMU or not.
> 
> If we want IOMMU devices to be supported by common device drivers we need to 
> defer probing of the master devices, there's no doubt about that. Earlier 
> approaches that hooked up into the device core code were rejected, but it 
> should be possible to use bus notifiers to achieve the same result (with the 
> drawback of having to register one notifier per bus). The 
> BUS_NOTIFY_BIND_DRIVER notifier can then just return -EPROBE_DEFER when a 
> iommus property is available and points to an IOMMU not registered yet. I'm 
> not saying we have to do this, but I believe that at least from a technical 
> point of view it could be done.

I think that fundamentally speaking, relying on notifiers for something like
this is very problematic, both in terms of maintainability and reliability.
We should really try to get the notifiers out of the iommu handling, not put
more of them in.

	Arnd
Thierry Reding Oct. 14, 2014, 1:37 p.m. UTC | #10
On Tue, Oct 14, 2014 at 03:20:46PM +0200, Arnd Bergmann wrote:
> On Tuesday 14 October 2014 16:07:38 Laurent Pinchart wrote:
> > On Tuesday 23 September 2014 09:44:25 Arnd Bergmann wrote:
> > > On Tuesday 23 September 2014 09:02:39 Thierry Reding wrote:
> > > > > I see two problems with using deferred probing here:
> > > > > 
> > > > > - we don't actually need to defer the probing but the binding to the
> > > > >   driver when no dma ops are set, but it seems silly to even create the
> > > > >   device before we can find out which ops it should use.
> > > > 
> > > > What does device creation have to do with anything? Surely a device
> > > > won't need IOMMU services before the device is bound to a driver.
> > > 
> > > The problem is that the driver will start using the IOMMU as soon
> > > as it calls dma_map_*, but that happens at runtime, not necessarily
> > > during the probe function.
> > > 
> > > So we can get into the weird situation that probe() returns success,
> > > but then you can't use the device yet because you don't know whether
> > > it is supposed to use an IOMMU or not.
> > 
> > If we want IOMMU devices to be supported by common device drivers we need to 
> > defer probing of the master devices, there's no doubt about that. Earlier 
> > approaches that hooked up into the device core code were rejected, but it 
> > should be possible to use bus notifiers to achieve the same result (with the 
> > drawback of having to register one notifier per bus). The 
> > BUS_NOTIFY_BIND_DRIVER notifier can then just return -EPROBE_DEFER when a 
> > iommus property is available and points to an IOMMU not registered yet. I'm 
> > not saying we have to do this, but I believe that at least from a technical 
> > point of view it could be done.
> 
> I think that fundamentally speaking, relying on notifiers for something like
> this is very problematic, both in terms of maintainability and reliability.
> We should really try to get the notifiers out of the iommu handling, not put
> more of them in.

Agreed. Also last time I checked the driver core simply ignored the
return value from notifiers, therefore this wouldn't work without
changing the core either.

Still, I agree with Laurent that we really should be relying on probe
deferral for probe ordering. And while it's true that earlier attempts
to put this into the core were rejected, I think there's still value in
proposing it again. The alternative proposed here is similarly close to
the core and needs to duplicated for every architecture. That itself is
to me a strong indication that this really does belong in the core.

I think initially this was proposed to become part of really_probe() and
I still think that's where it belongs. There's precedent for it with the
pinctrl_bind_pins() call, though it seems like Greg regrets allowing
that into the core. Perhaps if really_probe() is "too core", then
platform_drv_probe() would be a better candidate?

Thierry
Laurent Pinchart Oct. 14, 2014, 3:01 p.m. UTC | #11
Hi Thierry,

On Tuesday 14 October 2014 15:37:59 Thierry Reding wrote:
> On Tue, Oct 14, 2014 at 03:20:46PM +0200, Arnd Bergmann wrote:
> > On Tuesday 14 October 2014 16:07:38 Laurent Pinchart wrote:
> >> On Tuesday 23 September 2014 09:44:25 Arnd Bergmann wrote:
> >>> On Tuesday 23 September 2014 09:02:39 Thierry Reding wrote:
> >>>>> I see two problems with using deferred probing here:
> >>>>> 
> >>>>> - we don't actually need to defer the probing but the binding to
> >>>>>   the driver when no dma ops are set, but it seems silly to even
> >>>>>   create the device before we can find out which ops it should use.
> >>>> 
> >>>> What does device creation have to do with anything? Surely a device
> >>>> won't need IOMMU services before the device is bound to a driver.
> >>> 
> >>> The problem is that the driver will start using the IOMMU as soon
> >>> as it calls dma_map_*, but that happens at runtime, not necessarily
> >>> during the probe function.
> >>> 
> >>> So we can get into the weird situation that probe() returns success,
> >>> but then you can't use the device yet because you don't know whether
> >>> it is supposed to use an IOMMU or not.
> >> 
> >> If we want IOMMU devices to be supported by common device drivers we
> >> need to defer probing of the master devices, there's no doubt about
> >> that. Earlier approaches that hooked up into the device core code were
> >> rejected, but it should be possible to use bus notifiers to achieve the
> >> same result (with the drawback of having to register one notifier per
> >> bus). The BUS_NOTIFY_BIND_DRIVER notifier can then just return -
> >> EPROBE_DEFER when a iommus property is available and points to an IOMMU
> >> not registered yet. I'm not saying we have to do this, but I believe that
> >> at least from a technical point of view it could be done.
> > 
> > I think that fundamentally speaking, relying on notifiers for something
> > like this is very problematic, both in terms of maintainability and
> > reliability. We should really try to get the notifiers out of the iommu
> > handling, not put more of them in.
> 
> Agreed. Also last time I checked the driver core simply ignored the
> return value from notifiers, therefore this wouldn't work without
> changing the core either.
> 
> Still, I agree with Laurent that we really should be relying on probe
> deferral for probe ordering.

*If* we decide to support IOMMUs with real devices ;-)

I don't have a strong opinion on the subject. I initially thought it would be 
a shame not to be able to use devres, until realizing that binding to a DT 
node instead of a device meant that no unbind can ever occur. Loosing dev_* 
support is also an annoyance though.

> And while it's true that earlier attempts to put this into the core were
> rejected, I think there's still value in proposing it again. The alternative
> proposed here is similarly close to the core and needs to duplicated for
> every architecture. That itself is to me a strong indication that this
> really does belong in the core.
> 
> I think initially this was proposed to become part of really_probe() and
> I still think that's where it belongs. There's precedent for it with the
> pinctrl_bind_pins() call, though it seems like Greg regrets allowing
> that into the core. Perhaps if really_probe() is "too core", then
> platform_drv_probe() would be a better candidate?

I've CC'ed Greg to this e-mail as he will likely have the last word on this 
topic.
Thierry Reding Oct. 14, 2014, 3:05 p.m. UTC | #12
On Tue, Oct 14, 2014 at 06:01:58PM +0300, Laurent Pinchart wrote:
> Hi Thierry,
> 
> On Tuesday 14 October 2014 15:37:59 Thierry Reding wrote:
> > On Tue, Oct 14, 2014 at 03:20:46PM +0200, Arnd Bergmann wrote:
> > > On Tuesday 14 October 2014 16:07:38 Laurent Pinchart wrote:
> > >> On Tuesday 23 September 2014 09:44:25 Arnd Bergmann wrote:
> > >>> On Tuesday 23 September 2014 09:02:39 Thierry Reding wrote:
> > >>>>> I see two problems with using deferred probing here:
> > >>>>> 
> > >>>>> - we don't actually need to defer the probing but the binding to
> > >>>>>   the driver when no dma ops are set, but it seems silly to even
> > >>>>>   create the device before we can find out which ops it should use.
> > >>>> 
> > >>>> What does device creation have to do with anything? Surely a device
> > >>>> won't need IOMMU services before the device is bound to a driver.
> > >>> 
> > >>> The problem is that the driver will start using the IOMMU as soon
> > >>> as it calls dma_map_*, but that happens at runtime, not necessarily
> > >>> during the probe function.
> > >>> 
> > >>> So we can get into the weird situation that probe() returns success,
> > >>> but then you can't use the device yet because you don't know whether
> > >>> it is supposed to use an IOMMU or not.
> > >> 
> > >> If we want IOMMU devices to be supported by common device drivers we
> > >> need to defer probing of the master devices, there's no doubt about
> > >> that. Earlier approaches that hooked up into the device core code were
> > >> rejected, but it should be possible to use bus notifiers to achieve the
> > >> same result (with the drawback of having to register one notifier per
> > >> bus). The BUS_NOTIFY_BIND_DRIVER notifier can then just return -
> > >> EPROBE_DEFER when a iommus property is available and points to an IOMMU
> > >> not registered yet. I'm not saying we have to do this, but I believe that
> > >> at least from a technical point of view it could be done.
> > > 
> > > I think that fundamentally speaking, relying on notifiers for something
> > > like this is very problematic, both in terms of maintainability and
> > > reliability. We should really try to get the notifiers out of the iommu
> > > handling, not put more of them in.
> > 
> > Agreed. Also last time I checked the driver core simply ignored the
> > return value from notifiers, therefore this wouldn't work without
> > changing the core either.
> > 
> > Still, I agree with Laurent that we really should be relying on probe
> > deferral for probe ordering.
> 
> *If* we decide to support IOMMUs with real devices ;-)
> 
> I don't have a strong opinion on the subject. I initially thought it would be 
> a shame not to be able to use devres, until realizing that binding to a DT 
> node instead of a device meant that no unbind can ever occur. Loosing dev_* 
> support is also an annoyance though.

Binding to a DT node then also means that you can't build the driver as
a module. And in particular for multiplatform kernels this is going to
be a problem eventually.

Thierry
Laurent Pinchart Oct. 14, 2014, 3:10 p.m. UTC | #13
Hi Thierry,

On Tuesday 14 October 2014 17:05:29 Thierry Reding wrote:
> On Tue, Oct 14, 2014 at 06:01:58PM +0300, Laurent Pinchart wrote:
> > On Tuesday 14 October 2014 15:37:59 Thierry Reding wrote:
> >> On Tue, Oct 14, 2014 at 03:20:46PM +0200, Arnd Bergmann wrote:
> >>> On Tuesday 14 October 2014 16:07:38 Laurent Pinchart wrote:
> >>>> On Tuesday 23 September 2014 09:44:25 Arnd Bergmann wrote:
> >>>>> On Tuesday 23 September 2014 09:02:39 Thierry Reding wrote:
> >>>>>>> I see two problems with using deferred probing here:
> >>>>>>> 
> >>>>>>> - we don't actually need to defer the probing but the binding to
> >>>>>>>   the driver when no dma ops are set, but it seems silly to even
> >>>>>>>   create the device before we can find out which ops it should
> >>>>>>>   use.
> >>>>>> 
> >>>>>> What does device creation have to do with anything? Surely a device
> >>>>>> won't need IOMMU services before the device is bound to a driver.
> >>>>> 
> >>>>> The problem is that the driver will start using the IOMMU as soon
> >>>>> as it calls dma_map_*, but that happens at runtime, not necessarily
> >>>>> during the probe function.
> >>>>> 
> >>>>> So we can get into the weird situation that probe() returns success,
> >>>>> but then you can't use the device yet because you don't know whether
> >>>>> it is supposed to use an IOMMU or not.
> >>>> 
> >>>> If we want IOMMU devices to be supported by common device drivers we
> >>>> need to defer probing of the master devices, there's no doubt about
> >>>> that. Earlier approaches that hooked up into the device core code
> >>>> were rejected, but it should be possible to use bus notifiers to
> >>>> achieve the same result (with the drawback of having to register one
> >>>> notifier per bus). The BUS_NOTIFY_BIND_DRIVER notifier can then just
> >>>> return -EPROBE_DEFER when a iommus property is available and points to
> >>>> an IOMMU not registered yet. I'm not saying we have to do this, but I
> >>>> believe that at least from a technical point of view it could be done.
> >>> 
> >>> I think that fundamentally speaking, relying on notifiers for
> >>> something like this is very problematic, both in terms of
> >>> maintainability and reliability. We should really try to get the
> >>> notifiers out of the iommu handling, not put more of them in.
> >> 
> >> Agreed. Also last time I checked the driver core simply ignored the
> >> return value from notifiers, therefore this wouldn't work without
> >> changing the core either.
> >> 
> >> Still, I agree with Laurent that we really should be relying on probe
> >> deferral for probe ordering.
> > 
> > *If* we decide to support IOMMUs with real devices ;-)
> > 
> > I don't have a strong opinion on the subject. I initially thought it would
> > be a shame not to be able to use devres, until realizing that binding to
> > a DT node instead of a device meant that no unbind can ever occur.
> > Loosing dev_* support is also an annoyance though.
> 
> Binding to a DT node then also means that you can't build the driver as
> a module. And in particular for multiplatform kernels this is going to
> be a problem eventually.

It will, but other dependencies will make multiplatform kernels unpractical 
with or without IOMMUs anyway. Many platforms need clocks, timers, pin 
control, gpio, i2c and pmic support to boot to userspace. Agreed, that's not a 
valid reason to make the problem worse just for the sake of it.
diff mbox

Patch

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 84db893dedc2..495e29697a67 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -18,6 +18,7 @@ 
 #include <linux/bootmem.h>
 #include <linux/seq_file.h>
 #include <linux/screen_info.h>
+#include <linux/of_iommu.h>
 #include <linux/of_platform.h>
 #include <linux/init.h>
 #include <linux/kexec.h>
@@ -800,6 +801,7 @@  static int __init customize_machine(void)
 	 * machine from the device tree, if no callback is provided,
 	 * otherwise we would always need an init_machine callback.
 	 */
+	of_iommu_init();
 	if (machine_desc->init_machine)
 		machine_desc->init_machine();
 #ifdef CONFIG_OF