[RFC] drm/bridge: panel: Add module_get/but calls to attached panel driver

Message ID 1519070782-20834-1-git-send-email-jsarha@ti.com
State New
Headers show
Series
  • [RFC] drm/bridge: panel: Add module_get/but calls to attached panel driver
Related show

Commit Message

Jyri Sarha Feb. 19, 2018, 8:06 p.m.
Currently there is no way for a master drm driver to protect against an
attached panel driver from being unloaded while it is in use. The
least we can do is to indicate the usage by incrementing the module
reference count.

Signed-off-by: Jyri Sarha <jsarha@ti.com>
cc: eric@anholt.net
cc: laurent.pinchart@ideasonboard.com
---
I do not see any module_get/put code in drm core. Is there is a reason
for that?

There is two more alternative places for adding the module_get/put
code. One is puting it directly to drm_panel_attach() and
drm_panel_detach(). However, if the same module implements both the
master drm driver and the panel (like tilcdc does with its
tilcdc_panel.c), then attaching the panel will lock the module in for
no good reason. Still, this solution should work with drm bridges as I
do not see any reason why anybody would implement bridge drivers in
the same module with the master drm driver.

The other place to put the code would in the master drm driver. But
for handling the situation with bridges would need the device pointer
in struct drm_bridge.

 drivers/gpu/drm/bridge/panel.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Daniel Vetter Feb. 19, 2018, 10:59 p.m. | #1
On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
> Currently there is no way for a master drm driver to protect against an
> attached panel driver from being unloaded while it is in use. The
> least we can do is to indicate the usage by incrementing the module
> reference count.
> 
> Signed-off-by: Jyri Sarha <jsarha@ti.com>
> cc: eric@anholt.net
> cc: laurent.pinchart@ideasonboard.com
> ---
> I do not see any module_get/put code in drm core. Is there is a reason
> for that?
> 
> There is two more alternative places for adding the module_get/put
> code. One is puting it directly to drm_panel_attach() and
> drm_panel_detach(). However, if the same module implements both the
> master drm driver and the panel (like tilcdc does with its
> tilcdc_panel.c), then attaching the panel will lock the module in for
> no good reason. Still, this solution should work with drm bridges as I
> do not see any reason why anybody would implement bridge drivers in
> the same module with the master drm driver.
> 
> The other place to put the code would in the master drm driver. But
> for handling the situation with bridges would need the device pointer
> in struct drm_bridge.

I think this looks like a reasonable place to do this. Looking at the code
we seem to have a similar issue with the bridge driver itself. I think
we need to wire through the module owner stuff and add a try_modeul_get to
of_drm_find_bridge (and any other helper used to find bridge instances).

Care to look into that part of the problem, too?
-Daniel
> 
>  drivers/gpu/drm/bridge/panel.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
> index 6d99d4a..0a10be6 100644
> --- a/drivers/gpu/drm/bridge/panel.c
> +++ b/drivers/gpu/drm/bridge/panel.c
> @@ -161,6 +161,10 @@ struct drm_bridge *drm_panel_bridge_add(struct drm_panel *panel,
>  	if (!panel)
>  		return ERR_PTR(-EINVAL);
>  
> +	if (WARN_ON(!panel->dev->driver) ||
> +	    !try_module_get(panel->dev->driver->owner))
> +		return ERR_PTR(-ENODEV);
> +
>  	panel_bridge = devm_kzalloc(panel->dev, sizeof(*panel_bridge),
>  				    GFP_KERNEL);
>  	if (!panel_bridge)
> @@ -199,6 +203,9 @@ void drm_panel_bridge_remove(struct drm_bridge *bridge)
>  	panel_bridge = drm_bridge_to_panel_bridge(bridge);
>  
>  	drm_bridge_remove(bridge);
> +
> +	module_put(panel_bridge->panel->dev->driver->owner);
> +
>  	devm_kfree(panel_bridge->panel->dev, bridge);
>  }
>  EXPORT_SYMBOL(drm_panel_bridge_remove);
> -- 
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
>
Thierry Reding Feb. 20, 2018, 10:34 a.m. | #2
On Mon, Feb 19, 2018 at 11:59:23PM +0100, Daniel Vetter wrote:
> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:

> > Currently there is no way for a master drm driver to protect against an

> > attached panel driver from being unloaded while it is in use. The

> > least we can do is to indicate the usage by incrementing the module

> > reference count.

> > 

> > Signed-off-by: Jyri Sarha <jsarha@ti.com>

> > cc: eric@anholt.net

> > cc: laurent.pinchart@ideasonboard.com

> > ---

> > I do not see any module_get/put code in drm core. Is there is a reason

> > for that?

> > 

> > There is two more alternative places for adding the module_get/put

> > code. One is puting it directly to drm_panel_attach() and

> > drm_panel_detach(). However, if the same module implements both the

> > master drm driver and the panel (like tilcdc does with its

> > tilcdc_panel.c), then attaching the panel will lock the module in for

> > no good reason. Still, this solution should work with drm bridges as I

> > do not see any reason why anybody would implement bridge drivers in

> > the same module with the master drm driver.

> > 

> > The other place to put the code would in the master drm driver. But

> > for handling the situation with bridges would need the device pointer

> > in struct drm_bridge.

> 

> I think this looks like a reasonable place to do this. Looking at the code

> we seem to have a similar issue with the bridge driver itself. I think

> we need to wire through the module owner stuff and add a try_modeul_get to

> of_drm_find_bridge (and any other helper used to find bridge instances).


I disagree. module_get() is only going to protect you from unloading a
module that's in use, but there are other ways to unbind a driver from
the device and cause subsequent mayhem.

struct device_link was "recently" introduced to fix that issue.

Thierry
Jyri Sarha Feb. 20, 2018, 11:28 a.m. | #3
On 20/02/18 12:34, Thierry Reding wrote:
> On Mon, Feb 19, 2018 at 11:59:23PM +0100, Daniel Vetter wrote:
>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
>>> Currently there is no way for a master drm driver to protect against an
>>> attached panel driver from being unloaded while it is in use. The
>>> least we can do is to indicate the usage by incrementing the module
>>> reference count.
>>>
>>> Signed-off-by: Jyri Sarha <jsarha@ti.com>
>>> cc: eric@anholt.net
>>> cc: laurent.pinchart@ideasonboard.com
>>> ---
>>> I do not see any module_get/put code in drm core. Is there is a reason
>>> for that?
>>>
>>> There is two more alternative places for adding the module_get/put
>>> code. One is puting it directly to drm_panel_attach() and
>>> drm_panel_detach(). However, if the same module implements both the
>>> master drm driver and the panel (like tilcdc does with its
>>> tilcdc_panel.c), then attaching the panel will lock the module in for
>>> no good reason. Still, this solution should work with drm bridges as I
>>> do not see any reason why anybody would implement bridge drivers in
>>> the same module with the master drm driver.
>>>
>>> The other place to put the code would in the master drm driver. But
>>> for handling the situation with bridges would need the device pointer
>>> in struct drm_bridge.
>>
>> I think this looks like a reasonable place to do this. Looking at the code
>> we seem to have a similar issue with the bridge driver itself. I think
>> we need to wire through the module owner stuff and add a try_modeul_get to
>> of_drm_find_bridge (and any other helper used to find bridge instances).
> 
> I disagree. module_get() is only going to protect you from unloading a
> module that's in use, but there are other ways to unbind a driver from
> the device and cause subsequent mayhem.
> 

Yes. This is not a full fix for the issue. But AFAIK at the moment there
is no infrastructure to tear down the whole drm device grace fully when
a bridge driver or panel is removed. So this should be considered as a
partial remedy protecting user from accidentally crashing the drm driver
by unloading the wrong module first. That is why I wrote "least we can do".

> struct device_link was "recently" introduced to fix that issue.
> 

Unfortunately I do not have time to do full fix for the issue, but in
any case I think this patch is a step to the right direction. The module
reference count should anyway be increased at least to know that the
driver code remains in memory, even if the device is not there any more.

Then again the display panels or bridges are hardly ever hot pluggable
devices, so they do not normally vanish just like that, unless someone
is being nasty and forcibly unbinding them. But yes, I know that is a
bad excuse for broken behaviour.

Best regards,
Jyri
Thierry Reding Feb. 20, 2018, 12:03 p.m. | #4
On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
> On 20/02/18 12:34, Thierry Reding wrote:

> > On Mon, Feb 19, 2018 at 11:59:23PM +0100, Daniel Vetter wrote:

> >> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:

> >>> Currently there is no way for a master drm driver to protect against an

> >>> attached panel driver from being unloaded while it is in use. The

> >>> least we can do is to indicate the usage by incrementing the module

> >>> reference count.

> >>>

> >>> Signed-off-by: Jyri Sarha <jsarha@ti.com>

> >>> cc: eric@anholt.net

> >>> cc: laurent.pinchart@ideasonboard.com

> >>> ---

> >>> I do not see any module_get/put code in drm core. Is there is a reason

> >>> for that?

> >>>

> >>> There is two more alternative places for adding the module_get/put

> >>> code. One is puting it directly to drm_panel_attach() and

> >>> drm_panel_detach(). However, if the same module implements both the

> >>> master drm driver and the panel (like tilcdc does with its

> >>> tilcdc_panel.c), then attaching the panel will lock the module in for

> >>> no good reason. Still, this solution should work with drm bridges as I

> >>> do not see any reason why anybody would implement bridge drivers in

> >>> the same module with the master drm driver.

> >>>

> >>> The other place to put the code would in the master drm driver. But

> >>> for handling the situation with bridges would need the device pointer

> >>> in struct drm_bridge.

> >>

> >> I think this looks like a reasonable place to do this. Looking at the code

> >> we seem to have a similar issue with the bridge driver itself. I think

> >> we need to wire through the module owner stuff and add a try_modeul_get to

> >> of_drm_find_bridge (and any other helper used to find bridge instances).

> > 

> > I disagree. module_get() is only going to protect you from unloading a

> > module that's in use, but there are other ways to unbind a driver from

> > the device and cause subsequent mayhem.

> > 

> 

> Yes. This is not a full fix for the issue. But AFAIK at the moment there

> is no infrastructure to tear down the whole drm device grace fully when

> a bridge driver or panel is removed. So this should be considered as a

> partial remedy protecting user from accidentally crashing the drm driver

> by unloading the wrong module first. That is why I wrote "least we can do".

> 

> > struct device_link was "recently" introduced to fix that issue.

> > 

> 

> Unfortunately I do not have time to do full fix for the issue, but in

> any case I think this patch is a step to the right direction. The module

> reference count should anyway be increased at least to know that the

> driver code remains in memory, even if the device is not there any more.

> 

> Then again the display panels or bridges are hardly ever hot pluggable

> devices, so they do not normally vanish just like that, unless someone

> is being nasty and forcibly unbinding them. But yes, I know that is a

> bad excuse for broken behaviour.


I think device links are actually a lot easier to use and give you a
full solution for this. Essentially the only thing you need to do is add
a call to device_link_add() in the correct place.

That said, it seems to me like drm_panel_bridge_add() is not a good
place to add this, because the panel/bridge does not follow a typical
consumer/supplier model. It is rather a helper construct with a well-
defined lifetime.

What you do want to protect is the panel (the "parent" of the panel/
bridge) from going away unexpectedly. I think the best way to achieve
that is to make the link in drm_panel_attach() with something like
this:

	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
	struct device *consumer = connector->dev->dev;

	link = device_link_add(consumer, panel->dev, flags);
	if (!link) {
		dev_err(panel->dev, "failed to link panel %s to %s\n",
			dev_name(panel->dev), dev_name(consumer));
		return -EINVAL;
	}

Ideally I think we'd link the panel to the connector rather than the
top-level DRM device, but we don't have a struct device * for that, so
this is probably as good as it gets for now.

You might get away without the DL_FLAG_PM_RUNTIME because DRM should
handle that properly already.

Thierry
Jyri Sarha Feb. 20, 2018, 3:04 p.m. | #5
On 20/02/18 14:03, Thierry Reding wrote:
> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
>> On 20/02/18 12:34, Thierry Reding wrote:
>>> On Mon, Feb 19, 2018 at 11:59:23PM +0100, Daniel Vetter wrote:
>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
>>>>> Currently there is no way for a master drm driver to protect against an
>>>>> attached panel driver from being unloaded while it is in use. The
>>>>> least we can do is to indicate the usage by incrementing the module
>>>>> reference count.
>>>>>
>>>>> Signed-off-by: Jyri Sarha <jsarha@ti.com>
>>>>> cc: eric@anholt.net
>>>>> cc: laurent.pinchart@ideasonboard.com
>>>>> ---
>>>>> I do not see any module_get/put code in drm core. Is there is a reason
>>>>> for that?
>>>>>
>>>>> There is two more alternative places for adding the module_get/put
>>>>> code. One is puting it directly to drm_panel_attach() and
>>>>> drm_panel_detach(). However, if the same module implements both the
>>>>> master drm driver and the panel (like tilcdc does with its
>>>>> tilcdc_panel.c), then attaching the panel will lock the module in for
>>>>> no good reason. Still, this solution should work with drm bridges as I
>>>>> do not see any reason why anybody would implement bridge drivers in
>>>>> the same module with the master drm driver.
>>>>>
>>>>> The other place to put the code would in the master drm driver. But
>>>>> for handling the situation with bridges would need the device pointer
>>>>> in struct drm_bridge.
>>>>
>>>> I think this looks like a reasonable place to do this. Looking at the code
>>>> we seem to have a similar issue with the bridge driver itself. I think
>>>> we need to wire through the module owner stuff and add a try_modeul_get to
>>>> of_drm_find_bridge (and any other helper used to find bridge instances).
>>>
>>> I disagree. module_get() is only going to protect you from unloading a
>>> module that's in use, but there are other ways to unbind a driver from
>>> the device and cause subsequent mayhem.
>>>
>>
>> Yes. This is not a full fix for the issue. But AFAIK at the moment there
>> is no infrastructure to tear down the whole drm device grace fully when
>> a bridge driver or panel is removed. So this should be considered as a
>> partial remedy protecting user from accidentally crashing the drm driver
>> by unloading the wrong module first. That is why I wrote "least we can do".
>>
>>> struct device_link was "recently" introduced to fix that issue.
>>>
>>
>> Unfortunately I do not have time to do full fix for the issue, but in
>> any case I think this patch is a step to the right direction. The module
>> reference count should anyway be increased at least to know that the
>> driver code remains in memory, even if the device is not there any more.
>>
>> Then again the display panels or bridges are hardly ever hot pluggable
>> devices, so they do not normally vanish just like that, unless someone
>> is being nasty and forcibly unbinding them. But yes, I know that is a
>> bad excuse for broken behaviour.
> 
> I think device links are actually a lot easier to use and give you a
> full solution for this. Essentially the only thing you need to do is add
> a call to device_link_add() in the correct place.
> 
> That said, it seems to me like drm_panel_bridge_add() is not a good
> place to add this, because the panel/bridge does not follow a typical
> consumer/supplier model. It is rather a helper construct with a well-
> defined lifetime.
> 
> What you do want to protect is the panel (the "parent" of the panel/
> bridge) from going away unexpectedly. I think the best way to achieve
> that is to make the link in drm_panel_attach() with something like
> this:
> 
> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
> 	struct device *consumer = connector->dev->dev;
> 
> 	link = device_link_add(consumer, panel->dev, flags);
> 	if (!link) {
> 		dev_err(panel->dev, "failed to link panel %s to %s\n",
> 			dev_name(panel->dev), dev_name(consumer));
> 		return -EINVAL;
> 	}
> 
> Ideally I think we'd link the panel to the connector rather than the
> top-level DRM device, but we don't have a struct device * for that, so
> this is probably as good as it gets for now.
> 

Thanks for the hint. Adding the link indeed unbinds the master drm
device when the panel is unloaded without any complaints.

The annoying thing is that the master drm device does not probe again
and bind when the supplier device becomes available again. However,
forcing a manual bind brings the whole device back without a problem.

Is there any trivial way to fix this?

Best regards,
Jyri

> You might get away without the DL_FLAG_PM_RUNTIME because DRM should
> handle that properly already.
Lukas Wunner Feb. 21, 2018, 7:18 a.m. | #6
[+cc Rafael]

On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
> On 20/02/18 14:03, Thierry Reding wrote:
> > On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
> >> On 20/02/18 12:34, Thierry Reding wrote:
> >>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
> >>>>> Currently there is no way for a master drm driver to protect against an
> >>>>> attached panel driver from being unloaded while it is in use. The
> >>>>> least we can do is to indicate the usage by incrementing the module
> >>>>> reference count.
[...]
> >>> I disagree. module_get() is only going to protect you from unloading a
> >>> module that's in use, but there are other ways to unbind a driver from
> >>> the device and cause subsequent mayhem.
> >>>
> >>> struct device_link was "recently" introduced to fix that issue.
> >>
> >> Unfortunately I do not have time to do full fix for the issue, but in
> >> any case I think this patch is a step to the right direction. The module
> >> reference count should anyway be increased at least to know that the
> >> driver code remains in memory, even if the device is not there any more.
> > 
> > I think device links are actually a lot easier to use and give you a
> > full solution for this. Essentially the only thing you need to do is add
> > a call to device_link_add() in the correct place.
> > 
> > That said, it seems to me like drm_panel_bridge_add() is not a good
> > place to add this, because the panel/bridge does not follow a typical
> > consumer/supplier model. It is rather a helper construct with a well-
> > defined lifetime.
> > 
> > What you do want to protect is the panel (the "parent" of the panel/
> > bridge) from going away unexpectedly. I think the best way to achieve
> > that is to make the link in drm_panel_attach() with something like
> > this:
> > 
> > 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
> > 	struct device *consumer = connector->dev->dev;
> > 
> > 	link = device_link_add(consumer, panel->dev, flags);
> > 	if (!link) {
> > 		dev_err(panel->dev, "failed to link panel %s to %s\n",
> > 			dev_name(panel->dev), dev_name(consumer));
> > 		return -EINVAL;
> > 	}
> > 
> > Ideally I think we'd link the panel to the connector rather than the
> > top-level DRM device, but we don't have a struct device * for that, so
> > this is probably as good as it gets for now.
> 
> Thanks for the hint. Adding the link indeed unbinds the master drm
> device when the panel is unloaded without any complaints.
> 
> The annoying thing is that the master drm device does not probe again
> and bind when the supplier device becomes available again. However,
> forcing a manual bind brings the whole device back without a problem.
> 
> Is there any trivial way to fix this?

How about the below, would this work for you?

Or is the device link added in the consumer's driver, hence is gone when
the consumer is unbound?  If so, it should be added somewhere else,
or it shouldn't be removed on consumer unbind.

@Thierry:  Thanks for shifting the discussion in the right direction,
it's good to see device links getting more adoption, instead of
proliferating yet more kludges.  However I don't understand why you say
we don't have a struct device for the connector, drm_sysfs_connector_add()
does create one.

-- >8 --
Subject: [PATCH] driver core: Ensure consumers are probed when supplier is
 bound

When a device link supplier is unbound (either manually or because one
of its own suppliers was unbound), its consumers are unbound as well.

However when that supplier is subsequently rebound, its consumers should
likewise be given a chance to rebind.  Achieve that by putting them on
the deferred probe list in device_links_driver_bound().  The sole caller
of that function, driver_bound(), triggers deferred probing afterwards.

Reported-by: Jyri Sarha <jsarha@ti.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/base/base.h | 1 +
 drivers/base/core.c | 4 +++-
 drivers/base/dd.c   | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index d800de6..39370eb 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -114,6 +114,7 @@ extern void device_release_driver_internal(struct device *dev,
 
 extern void driver_detach(struct device_driver *drv);
 extern int driver_probe_device(struct device_driver *drv, struct device *dev);
+extern void driver_deferred_probe_add(struct device *dev);
 extern void driver_deferred_probe_del(struct device *dev);
 static inline int driver_match_device(struct device_driver *drv,
 				      struct device *dev)
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 920af22..2869d21 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -402,7 +402,8 @@ int device_links_check_suppliers(struct device *dev)
  * @dev: Device to update the links for.
  *
  * The probe has been successful, so update links from this device to any
- * consumers by changing their status to "available".
+ * consumers by changing their status to "available".  Mark the consumers
+ * for deferred probing in case the supplier was unbound and is now rebound.
  *
  * Also change the status of @dev's links to suppliers to "active".
  *
@@ -420,6 +421,7 @@ void device_links_driver_bound(struct device *dev)
 
 		WARN_ON(link->status != DL_STATE_DORMANT);
 		WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
+		driver_deferred_probe_add(link->consumer);
 	}
 
 	list_for_each_entry(link, &dev->links.suppliers, c_node) {
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 2c964f5..ac5a21a 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -141,7 +141,7 @@ static void deferred_probe_work_func(struct work_struct *work)
 }
 static DECLARE_WORK(deferred_probe_work, deferred_probe_work_func);
 
-static void driver_deferred_probe_add(struct device *dev)
+void driver_deferred_probe_add(struct device *dev)
 {
 	mutex_lock(&deferred_probe_mutex);
 	if (list_empty(&dev->p->deferred_probe)) {
Jyri Sarha Feb. 21, 2018, 11:30 a.m. | #7
On 21/02/18 09:18, Lukas Wunner wrote:
> [+cc Rafael]
> 
> On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
>> On 20/02/18 14:03, Thierry Reding wrote:
>>> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
>>>> On 20/02/18 12:34, Thierry Reding wrote:
>>>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
>>>>>>> Currently there is no way for a master drm driver to protect against an
>>>>>>> attached panel driver from being unloaded while it is in use. The
>>>>>>> least we can do is to indicate the usage by incrementing the module
>>>>>>> reference count.
> [...]
>>>>> I disagree. module_get() is only going to protect you from unloading a
>>>>> module that's in use, but there are other ways to unbind a driver from
>>>>> the device and cause subsequent mayhem.
>>>>>
>>>>> struct device_link was "recently" introduced to fix that issue.
>>>>
>>>> Unfortunately I do not have time to do full fix for the issue, but in
>>>> any case I think this patch is a step to the right direction. The module
>>>> reference count should anyway be increased at least to know that the
>>>> driver code remains in memory, even if the device is not there any more.
>>>
>>> I think device links are actually a lot easier to use and give you a
>>> full solution for this. Essentially the only thing you need to do is add
>>> a call to device_link_add() in the correct place.
>>>
>>> That said, it seems to me like drm_panel_bridge_add() is not a good
>>> place to add this, because the panel/bridge does not follow a typical
>>> consumer/supplier model. It is rather a helper construct with a well-
>>> defined lifetime.
>>>
>>> What you do want to protect is the panel (the "parent" of the panel/
>>> bridge) from going away unexpectedly. I think the best way to achieve
>>> that is to make the link in drm_panel_attach() with something like
>>> this:
>>>
>>> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
>>> 	struct device *consumer = connector->dev->dev;
>>>
>>> 	link = device_link_add(consumer, panel->dev, flags);
>>> 	if (!link) {
>>> 		dev_err(panel->dev, "failed to link panel %s to %s\n",
>>> 			dev_name(panel->dev), dev_name(consumer));
>>> 		return -EINVAL;
>>> 	}
>>>
>>> Ideally I think we'd link the panel to the connector rather than the
>>> top-level DRM device, but we don't have a struct device * for that, so
>>> this is probably as good as it gets for now.
>>
>> Thanks for the hint. Adding the link indeed unbinds the master drm
>> device when the panel is unloaded without any complaints.
>>
>> The annoying thing is that the master drm device does not probe again
>> and bind when the supplier device becomes available again. However,
>> forcing a manual bind brings the whole device back without a problem.
>>
>> Is there any trivial way to fix this?
> 
> How about the below, would this work for you?
> 
> Or is the device link added in the consumer's driver, hence is gone when
> the consumer is unbound?  If so, it should be added somewhere else,
> or it shouldn't be removed on consumer unbind.
> 
For some reason the patch bellow does not work. No even if I do not
remove the link ever, after it has been made in the consumers probe.

However, the device link works the way I want it to if I move the
driver_deferred_probe_add() to device_links_unbind_consumers() right
before the device_release_driver_internal(). This appears to works even
if I have device_link_del() in the remove call of the consumer driver,
but I have not yet checked how racy this solution is.

If I put the driver_deferred_probe_add() right after the
device_release_driver_internal() everything works if I do not try to
delete the link in the consumer driver remove. However, leaving the link
there forever after a successful probe does not feel right to me, even
if in practice it usually would work fine.

I wonder if either solution could evolve to an acceptable solution for
all device_link users.

Best regards,
Jyri

> @Thierry:  Thanks for shifting the discussion in the right direction,
> it's good to see device links getting more adoption, instead of
> proliferating yet more kludges.  However I don't understand why you say
> we don't have a struct device for the connector, drm_sysfs_connector_add()
> does create one.
> 
> -- >8 --
> Subject: [PATCH] driver core: Ensure consumers are probed when supplier is
>  bound
> 
> When a device link supplier is unbound (either manually or because one
> of its own suppliers was unbound), its consumers are unbound as well.
> 
> However when that supplier is subsequently rebound, its consumers should
> likewise be given a chance to rebind.  Achieve that by putting them on
> the deferred probe list in device_links_driver_bound().  The sole caller
> of that function, driver_bound(), triggers deferred probing afterwards.
> 
> Reported-by: Jyri Sarha <jsarha@ti.com>
> Cc: Thierry Reding <thierry.reding@gmail.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---
>  drivers/base/base.h | 1 +
>  drivers/base/core.c | 4 +++-
>  drivers/base/dd.c   | 2 +-
>  3 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index d800de6..39370eb 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -114,6 +114,7 @@ extern void device_release_driver_internal(struct device *dev,
>  
>  extern void driver_detach(struct device_driver *drv);
>  extern int driver_probe_device(struct device_driver *drv, struct device *dev);
> +extern void driver_deferred_probe_add(struct device *dev);
>  extern void driver_deferred_probe_del(struct device *dev);
>  static inline int driver_match_device(struct device_driver *drv,
>  				      struct device *dev)
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 920af22..2869d21 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -402,7 +402,8 @@ int device_links_check_suppliers(struct device *dev)
>   * @dev: Device to update the links for.
>   *
>   * The probe has been successful, so update links from this device to any
> - * consumers by changing their status to "available".
> + * consumers by changing their status to "available".  Mark the consumers
> + * for deferred probing in case the supplier was unbound and is now rebound.
>   *
>   * Also change the status of @dev's links to suppliers to "active".
>   *
> @@ -420,6 +421,7 @@ void device_links_driver_bound(struct device *dev)
>  
>  		WARN_ON(link->status != DL_STATE_DORMANT);
>  		WRITE_ONCE(link->status, DL_STATE_AVAILABLE);
> +		driver_deferred_probe_add(link->consumer);
>  	}
>  
>  	list_for_each_entry(link, &dev->links.suppliers, c_node) {
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 2c964f5..ac5a21a 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -141,7 +141,7 @@ static void deferred_probe_work_func(struct work_struct *work)
>  }
>  static DECLARE_WORK(deferred_probe_work, deferred_probe_work_func);
>  
> -static void driver_deferred_probe_add(struct device *dev)
> +void driver_deferred_probe_add(struct device *dev)
>  {
>  	mutex_lock(&deferred_probe_mutex);
>  	if (list_empty(&dev->p->deferred_probe)) {
>
Lukas Wunner Feb. 21, 2018, 1:15 p.m. | #8
On Wed, Feb 21, 2018 at 01:30:23PM +0200, Jyri Sarha wrote:
> On 21/02/18 09:18, Lukas Wunner wrote:
> > On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:
> >> On 20/02/18 14:03, Thierry Reding wrote:
> >>> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:
> >>>> On 20/02/18 12:34, Thierry Reding wrote:
> >>>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:
> >>>>>>> Currently there is no way for a master drm driver to protect against an
> >>>>>>> attached panel driver from being unloaded while it is in use. The
> >>>>>>> least we can do is to indicate the usage by incrementing the module
> >>>>>>> reference count.
> > [...]
> >>>>> I disagree. module_get() is only going to protect you from unloading a
> >>>>> module that's in use, but there are other ways to unbind a driver from
> >>>>> the device and cause subsequent mayhem.
> >>>>>
> >>>>> struct device_link was "recently" introduced to fix that issue.
> >>>>
> >>>> Unfortunately I do not have time to do full fix for the issue, but in
> >>>> any case I think this patch is a step to the right direction. The module
> >>>> reference count should anyway be increased at least to know that the
> >>>> driver code remains in memory, even if the device is not there any more.
> >>>
> >>> I think device links are actually a lot easier to use and give you a
> >>> full solution for this. Essentially the only thing you need to do is add
> >>> a call to device_link_add() in the correct place.
> >>>
> >>> That said, it seems to me like drm_panel_bridge_add() is not a good
> >>> place to add this, because the panel/bridge does not follow a typical
> >>> consumer/supplier model. It is rather a helper construct with a well-
> >>> defined lifetime.
> >>>
> >>> What you do want to protect is the panel (the "parent" of the panel/
> >>> bridge) from going away unexpectedly. I think the best way to achieve
> >>> that is to make the link in drm_panel_attach() with something like
> >>> this:
> >>>
> >>> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;
> >>> 	struct device *consumer = connector->dev->dev;
> >>>
> >>> 	link = device_link_add(consumer, panel->dev, flags);
> >>> 	if (!link) {
> >>> 		dev_err(panel->dev, "failed to link panel %s to %s\n",
> >>> 			dev_name(panel->dev), dev_name(consumer));
> >>> 		return -EINVAL;
> >>> 	}
> >>>
> >>> Ideally I think we'd link the panel to the connector rather than the
> >>> top-level DRM device, but we don't have a struct device * for that, so
> >>> this is probably as good as it gets for now.
> >>
> >> Thanks for the hint. Adding the link indeed unbinds the master drm
> >> device when the panel is unloaded without any complaints.
> >>
> >> The annoying thing is that the master drm device does not probe again
> >> and bind when the supplier device becomes available again. However,
> >> forcing a manual bind brings the whole device back without a problem.
> >>
> >> Is there any trivial way to fix this?
> > 
> > How about the below, would this work for you?
> > 
> > Or is the device link added in the consumer's driver, hence is gone when
> > the consumer is unbound?  If so, it should be added somewhere else,
> > or it shouldn't be removed on consumer unbind.
> > 
> For some reason the patch bellow does not work. No even if I do not
> remove the link ever, after it has been made in the consumers probe.

Yes.  The patch was only meant to work if the device link is *not*
added by the consumer, i.e. if it is added by the supplier, in a PCI
quirk, initcall or whatever.


> However, the device link works the way I want it to if I move the
> driver_deferred_probe_add() to device_links_unbind_consumers() right
> before the device_release_driver_internal(). This appears to works even
> if I have device_link_del() in the remove call of the consumer driver,
> but I have not yet checked how racy this solution is.
> 
> If I put the driver_deferred_probe_add() right after the
> device_release_driver_internal() everything works if I do not try to
> delete the link in the consumer driver remove. However, leaving the link
> there forever after a successful probe does not feel right to me, even
> if in practice it usually would work fine.

The first of the two above-mentioned solutions might also be
acceptable.

However a general problem of adding the link in the consumer is that
the consumer has to be probed at least once.  I'm not sure if that's
always guaranteed.

Leaving the device link around is fine if it represents a dependency
that exists permanently in the system.  But that begs the question if
the consumer is at all the right place to add it.  Perhaps the link
should rather be added when traversing the device tree and instantiating
the devices, and before ever probing them.  E.g. if a panel DT node
references a backlight DT node, add a device link.

Thanks,

Lukas
Thierry Reding Feb. 21, 2018, 2:26 p.m. | #9
On Wed, Feb 21, 2018 at 08:18:42AM +0100, Lukas Wunner wrote:
> [+cc Rafael]

> 

> On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:

> > On 20/02/18 14:03, Thierry Reding wrote:

> > > On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:

> > >> On 20/02/18 12:34, Thierry Reding wrote:

> > >>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:

> > >>>>> Currently there is no way for a master drm driver to protect against an

> > >>>>> attached panel driver from being unloaded while it is in use. The

> > >>>>> least we can do is to indicate the usage by incrementing the module

> > >>>>> reference count.

> [...]

> > >>> I disagree. module_get() is only going to protect you from unloading a

> > >>> module that's in use, but there are other ways to unbind a driver from

> > >>> the device and cause subsequent mayhem.

> > >>>

> > >>> struct device_link was "recently" introduced to fix that issue.

> > >>

> > >> Unfortunately I do not have time to do full fix for the issue, but in

> > >> any case I think this patch is a step to the right direction. The module

> > >> reference count should anyway be increased at least to know that the

> > >> driver code remains in memory, even if the device is not there any more.

> > > 

> > > I think device links are actually a lot easier to use and give you a

> > > full solution for this. Essentially the only thing you need to do is add

> > > a call to device_link_add() in the correct place.

> > > 

> > > That said, it seems to me like drm_panel_bridge_add() is not a good

> > > place to add this, because the panel/bridge does not follow a typical

> > > consumer/supplier model. It is rather a helper construct with a well-

> > > defined lifetime.

> > > 

> > > What you do want to protect is the panel (the "parent" of the panel/

> > > bridge) from going away unexpectedly. I think the best way to achieve

> > > that is to make the link in drm_panel_attach() with something like

> > > this:

> > > 

> > > 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;

> > > 	struct device *consumer = connector->dev->dev;

> > > 

> > > 	link = device_link_add(consumer, panel->dev, flags);

> > > 	if (!link) {

> > > 		dev_err(panel->dev, "failed to link panel %s to %s\n",

> > > 			dev_name(panel->dev), dev_name(consumer));

> > > 		return -EINVAL;

> > > 	}

> > > 

> > > Ideally I think we'd link the panel to the connector rather than the

> > > top-level DRM device, but we don't have a struct device * for that, so

> > > this is probably as good as it gets for now.

> > 

> > Thanks for the hint. Adding the link indeed unbinds the master drm

> > device when the panel is unloaded without any complaints.

> > 

> > The annoying thing is that the master drm device does not probe again

> > and bind when the supplier device becomes available again. However,

> > forcing a manual bind brings the whole device back without a problem.

> > 

> > Is there any trivial way to fix this?

> 

> How about the below, would this work for you?

> 

> Or is the device link added in the consumer's driver, hence is gone when

> the consumer is unbound?  If so, it should be added somewhere else,

> or it shouldn't be removed on consumer unbind.

> 

> @Thierry:  Thanks for shifting the discussion in the right direction,

> it's good to see device links getting more adoption, instead of

> proliferating yet more kludges.  However I don't understand why you say

> we don't have a struct device for the connector, drm_sysfs_connector_add()

> does create one.


The problem is that the device created by drm_sysfs_connector_add() is
created purely to represent the device in sysfs. That means it won't
have a driver bound to it, so if you add the device link to that, then
the driver core will try to unbind the sysfs device which isn't bound
in the first place.

I think what we'd need is for drivers to be able to set that device to
whatever their parent is. For example, if there's an IP block on an SoC
that provides the connector, then the corresponding pdev->dev should be
that pointer. That way the device link would work as expected and tear
down the driver for that IP block (hopefully together with the rest of
the DRM device).

However, doing this via the DRM device's ->dev should also work because
a) not all connectors have a parent device and b) the dependency really
is from the DRM device to the panel.

Thierry
Thierry Reding Feb. 21, 2018, 2:30 p.m. | #10
On Wed, Feb 21, 2018 at 02:15:13PM +0100, Lukas Wunner wrote:
> On Wed, Feb 21, 2018 at 01:30:23PM +0200, Jyri Sarha wrote:

> > On 21/02/18 09:18, Lukas Wunner wrote:

> > > On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:

> > >> On 20/02/18 14:03, Thierry Reding wrote:

> > >>> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:

> > >>>> On 20/02/18 12:34, Thierry Reding wrote:

> > >>>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:

> > >>>>>>> Currently there is no way for a master drm driver to protect against an

> > >>>>>>> attached panel driver from being unloaded while it is in use. The

> > >>>>>>> least we can do is to indicate the usage by incrementing the module

> > >>>>>>> reference count.

> > > [...]

> > >>>>> I disagree. module_get() is only going to protect you from unloading a

> > >>>>> module that's in use, but there are other ways to unbind a driver from

> > >>>>> the device and cause subsequent mayhem.

> > >>>>>

> > >>>>> struct device_link was "recently" introduced to fix that issue.

> > >>>>

> > >>>> Unfortunately I do not have time to do full fix for the issue, but in

> > >>>> any case I think this patch is a step to the right direction. The module

> > >>>> reference count should anyway be increased at least to know that the

> > >>>> driver code remains in memory, even if the device is not there any more.

> > >>>

> > >>> I think device links are actually a lot easier to use and give you a

> > >>> full solution for this. Essentially the only thing you need to do is add

> > >>> a call to device_link_add() in the correct place.

> > >>>

> > >>> That said, it seems to me like drm_panel_bridge_add() is not a good

> > >>> place to add this, because the panel/bridge does not follow a typical

> > >>> consumer/supplier model. It is rather a helper construct with a well-

> > >>> defined lifetime.

> > >>>

> > >>> What you do want to protect is the panel (the "parent" of the panel/

> > >>> bridge) from going away unexpectedly. I think the best way to achieve

> > >>> that is to make the link in drm_panel_attach() with something like

> > >>> this:

> > >>>

> > >>> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;

> > >>> 	struct device *consumer = connector->dev->dev;

> > >>>

> > >>> 	link = device_link_add(consumer, panel->dev, flags);

> > >>> 	if (!link) {

> > >>> 		dev_err(panel->dev, "failed to link panel %s to %s\n",

> > >>> 			dev_name(panel->dev), dev_name(consumer));

> > >>> 		return -EINVAL;

> > >>> 	}

> > >>>

> > >>> Ideally I think we'd link the panel to the connector rather than the

> > >>> top-level DRM device, but we don't have a struct device * for that, so

> > >>> this is probably as good as it gets for now.

> > >>

> > >> Thanks for the hint. Adding the link indeed unbinds the master drm

> > >> device when the panel is unloaded without any complaints.

> > >>

> > >> The annoying thing is that the master drm device does not probe again

> > >> and bind when the supplier device becomes available again. However,

> > >> forcing a manual bind brings the whole device back without a problem.

> > >>

> > >> Is there any trivial way to fix this?

> > > 

> > > How about the below, would this work for you?

> > > 

> > > Or is the device link added in the consumer's driver, hence is gone when

> > > the consumer is unbound?  If so, it should be added somewhere else,

> > > or it shouldn't be removed on consumer unbind.

> > > 

> > For some reason the patch bellow does not work. No even if I do not

> > remove the link ever, after it has been made in the consumers probe.

> 

> Yes.  The patch was only meant to work if the device link is *not*

> added by the consumer, i.e. if it is added by the supplier, in a PCI

> quirk, initcall or whatever.

> 

> 

> > However, the device link works the way I want it to if I move the

> > driver_deferred_probe_add() to device_links_unbind_consumers() right

> > before the device_release_driver_internal(). This appears to works even

> > if I have device_link_del() in the remove call of the consumer driver,

> > but I have not yet checked how racy this solution is.

> > 

> > If I put the driver_deferred_probe_add() right after the

> > device_release_driver_internal() everything works if I do not try to

> > delete the link in the consumer driver remove. However, leaving the link

> > there forever after a successful probe does not feel right to me, even

> > if in practice it usually would work fine.

> 

> The first of the two above-mentioned solutions might also be

> acceptable.

> 

> However a general problem of adding the link in the consumer is that

> the consumer has to be probed at least once.  I'm not sure if that's

> always guaranteed.

> 

> Leaving the device link around is fine if it represents a dependency

> that exists permanently in the system.  But that begs the question if

> the consumer is at all the right place to add it.  Perhaps the link

> should rather be added when traversing the device tree and instantiating

> the devices, and before ever probing them.  E.g. if a panel DT node

> references a backlight DT node, add a device link.


Good luck with that. This has been discussed numerous times in the past
and never solved. I don't recall any good keywords, and a quick search
through my mailbox doesn't show the threads I'm looking for. But this
was discussed in the context of automatically making deferred probe work
for these kind of dependencies.

The bottom line was, if I remember correctly, that we can't generically
because there are many shapes these dependencies can take. Any generic
DT code wouldn't know about them and only the binding implementers (i.e.
the drivers, the consumers) actually know what to look for and what the
type of dependency really is (some may want to defer probe, others may
succeed probe but have a mechanism to "hotplug" the resource later on).

Thierry
Thierry Reding Feb. 21, 2018, 2:40 p.m. | #11
On Wed, Feb 21, 2018 at 03:30:41PM +0100, Thierry Reding wrote:
> On Wed, Feb 21, 2018 at 02:15:13PM +0100, Lukas Wunner wrote:

> > On Wed, Feb 21, 2018 at 01:30:23PM +0200, Jyri Sarha wrote:

> > > On 21/02/18 09:18, Lukas Wunner wrote:

> > > > On Tue, Feb 20, 2018 at 05:04:00PM +0200, Jyri Sarha wrote:

> > > >> On 20/02/18 14:03, Thierry Reding wrote:

> > > >>> On Tue, Feb 20, 2018 at 01:28:48PM +0200, Jyri Sarha wrote:

> > > >>>> On 20/02/18 12:34, Thierry Reding wrote:

> > > >>>>>> On Mon, Feb 19, 2018 at 10:06:22PM +0200, Jyri Sarha wrote:

> > > >>>>>>> Currently there is no way for a master drm driver to protect against an

> > > >>>>>>> attached panel driver from being unloaded while it is in use. The

> > > >>>>>>> least we can do is to indicate the usage by incrementing the module

> > > >>>>>>> reference count.

> > > > [...]

> > > >>>>> I disagree. module_get() is only going to protect you from unloading a

> > > >>>>> module that's in use, but there are other ways to unbind a driver from

> > > >>>>> the device and cause subsequent mayhem.

> > > >>>>>

> > > >>>>> struct device_link was "recently" introduced to fix that issue.

> > > >>>>

> > > >>>> Unfortunately I do not have time to do full fix for the issue, but in

> > > >>>> any case I think this patch is a step to the right direction. The module

> > > >>>> reference count should anyway be increased at least to know that the

> > > >>>> driver code remains in memory, even if the device is not there any more.

> > > >>>

> > > >>> I think device links are actually a lot easier to use and give you a

> > > >>> full solution for this. Essentially the only thing you need to do is add

> > > >>> a call to device_link_add() in the correct place.

> > > >>>

> > > >>> That said, it seems to me like drm_panel_bridge_add() is not a good

> > > >>> place to add this, because the panel/bridge does not follow a typical

> > > >>> consumer/supplier model. It is rather a helper construct with a well-

> > > >>> defined lifetime.

> > > >>>

> > > >>> What you do want to protect is the panel (the "parent" of the panel/

> > > >>> bridge) from going away unexpectedly. I think the best way to achieve

> > > >>> that is to make the link in drm_panel_attach() with something like

> > > >>> this:

> > > >>>

> > > >>> 	u32 flags = DL_FLAG_PM_RUNTIME | DL_FLAG_AUTOREMOVE;

> > > >>> 	struct device *consumer = connector->dev->dev;

> > > >>>

> > > >>> 	link = device_link_add(consumer, panel->dev, flags);

> > > >>> 	if (!link) {

> > > >>> 		dev_err(panel->dev, "failed to link panel %s to %s\n",

> > > >>> 			dev_name(panel->dev), dev_name(consumer));

> > > >>> 		return -EINVAL;

> > > >>> 	}

> > > >>>

> > > >>> Ideally I think we'd link the panel to the connector rather than the

> > > >>> top-level DRM device, but we don't have a struct device * for that, so

> > > >>> this is probably as good as it gets for now.

> > > >>

> > > >> Thanks for the hint. Adding the link indeed unbinds the master drm

> > > >> device when the panel is unloaded without any complaints.

> > > >>

> > > >> The annoying thing is that the master drm device does not probe again

> > > >> and bind when the supplier device becomes available again. However,

> > > >> forcing a manual bind brings the whole device back without a problem.

> > > >>

> > > >> Is there any trivial way to fix this?

> > > > 

> > > > How about the below, would this work for you?

> > > > 

> > > > Or is the device link added in the consumer's driver, hence is gone when

> > > > the consumer is unbound?  If so, it should be added somewhere else,

> > > > or it shouldn't be removed on consumer unbind.

> > > > 

> > > For some reason the patch bellow does not work. No even if I do not

> > > remove the link ever, after it has been made in the consumers probe.

> > 

> > Yes.  The patch was only meant to work if the device link is *not*

> > added by the consumer, i.e. if it is added by the supplier, in a PCI

> > quirk, initcall or whatever.

> > 

> > 

> > > However, the device link works the way I want it to if I move the

> > > driver_deferred_probe_add() to device_links_unbind_consumers() right

> > > before the device_release_driver_internal(). This appears to works even

> > > if I have device_link_del() in the remove call of the consumer driver,

> > > but I have not yet checked how racy this solution is.

> > > 

> > > If I put the driver_deferred_probe_add() right after the

> > > device_release_driver_internal() everything works if I do not try to

> > > delete the link in the consumer driver remove. However, leaving the link

> > > there forever after a successful probe does not feel right to me, even

> > > if in practice it usually would work fine.

> > 

> > The first of the two above-mentioned solutions might also be

> > acceptable.

> > 

> > However a general problem of adding the link in the consumer is that

> > the consumer has to be probed at least once.  I'm not sure if that's

> > always guaranteed.

> > 

> > Leaving the device link around is fine if it represents a dependency

> > that exists permanently in the system.  But that begs the question if

> > the consumer is at all the right place to add it.  Perhaps the link

> > should rather be added when traversing the device tree and instantiating

> > the devices, and before ever probing them.  E.g. if a panel DT node

> > references a backlight DT node, add a device link.

> 

> Good luck with that. This has been discussed numerous times in the past

> and never solved. I don't recall any good keywords, and a quick search

> through my mailbox doesn't show the threads I'm looking for. But this

> was discussed in the context of automatically making deferred probe work

> for these kind of dependencies.

> 

> The bottom line was, if I remember correctly, that we can't generically

> because there are many shapes these dependencies can take. Any generic

> DT code wouldn't know about them and only the binding implementers (i.e.

> the drivers, the consumers) actually know what to look for and what the

> type of dependency really is (some may want to defer probe, others may

> succeed probe but have a mechanism to "hotplug" the resource later on).


I found this:

	https://lwn.net/Articles/652668/

which has a couple of pointers to other threads that should serve as a
good starting point if you want to read up on what's been tried in the
past.

Note that the context here is primarily probe ordering, but the issues
underneath are the same.

Thierry

Patch

diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
index 6d99d4a..0a10be6 100644
--- a/drivers/gpu/drm/bridge/panel.c
+++ b/drivers/gpu/drm/bridge/panel.c
@@ -161,6 +161,10 @@  struct drm_bridge *drm_panel_bridge_add(struct drm_panel *panel,
 	if (!panel)
 		return ERR_PTR(-EINVAL);
 
+	if (WARN_ON(!panel->dev->driver) ||
+	    !try_module_get(panel->dev->driver->owner))
+		return ERR_PTR(-ENODEV);
+
 	panel_bridge = devm_kzalloc(panel->dev, sizeof(*panel_bridge),
 				    GFP_KERNEL);
 	if (!panel_bridge)
@@ -199,6 +203,9 @@  void drm_panel_bridge_remove(struct drm_bridge *bridge)
 	panel_bridge = drm_bridge_to_panel_bridge(bridge);
 
 	drm_bridge_remove(bridge);
+
+	module_put(panel_bridge->panel->dev->driver->owner);
+
 	devm_kfree(panel_bridge->panel->dev, bridge);
 }
 EXPORT_SYMBOL(drm_panel_bridge_remove);