[v2,2/2] of: dynamic: flush devlinks workqueue before destroying the changeset

Message ID: 20240205-fix-device-links-overlays-v2-2-5344f8c79d57@analog.com
State: New
Series: fix DT overlays when device links are released

Commit Message

Nuno Sa via B4 Relay Feb. 5, 2024, 12:09 p.m. UTC
From: Nuno Sa <nuno.sa@analog.com>

Device links will drop their supplier + consumer refcounts
asynchronously. That means that the refcount of the of_node attached to
these devices will also be dropped asynchronously and so we cannot
guarantee the DT overlay assumption that the of_node refcount must be 1 in
__of_changeset_entry_destroy().

Given the above, call the new fwnode_links_flush_queue() helper to flush
the devlink workqueue so we can be sure that all links are dropped before
doing the proper checks.

Signed-off-by: Nuno Sa <nuno.sa@analog.com>
---
 drivers/of/dynamic.c | 8 ++++++++
 1 file changed, 8 insertions(+)
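
The ordering problem described in the commit message can be modelled outside the kernel. The sketch below is only an illustration in plain C, not kernel code: every name in it (toy_node, toy_queue, toy_queue_put, toy_flush, toy_node_looks_leaked) is hypothetical, and a simple array of pending releases stands in for the devlink workqueue. It shows that a "refcount > 1" check made before the pending releases have run reports a leak that is no longer there once the queue has been flushed.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PENDING 8

/* Toy stand-in for a device tree node with a reference count. */
struct toy_node {
	int refcount;
};

/*
 * Toy deferred-work queue (hypothetical): each entry is a node that
 * still has one reference waiting to be dropped.
 */
struct toy_queue {
	struct toy_node *pending[TOY_MAX_PENDING];
	size_t count;
};

/* Defer the release of one reference; returns false if the queue is full. */
static bool toy_queue_put(struct toy_queue *q, struct toy_node *node)
{
	if (q->count == TOY_MAX_PENDING)
		return false;
	q->pending[q->count++] = node;
	return true;
}

/* Run every pending release now, which is what a flush guarantees. */
static void toy_flush(struct toy_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->pending[i]->refcount--;
	q->count = 0;
}

/* Same shape as the check at destroy time: more than one reference left. */
static bool toy_node_looks_leaked(const struct toy_node *node)
{
	return node->refcount > 1;
}
```

With a node that starts at a refcount of 2 (one reference owned by the changeset, one still held by a link) and one release queued, toy_node_looks_leaked() is true until toy_flush() is called and false afterwards. The patch applies the same ordering in __of_changeset_entry_destroy() by flushing before the kref_read() check.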

Comments

Andy Shevchenko Feb. 5, 2024, 12:36 p.m. UTC | #1
On Mon, Feb 05, 2024 at 01:09:33PM +0100, Nuno Sa via B4 Relay wrote:
> From: Nuno Sa <nuno.sa@analog.com>
> 
> Device links will drop their supplier + consumer refcounts
> asynchronously. That means that the refcount of the of_node attached to
> these devices will also be dropped asynchronously and so we cannot
> guarantee the DT overlay assumption that the of_node refcount must be 1 in
> __of_changeset_entry_destroy().
> 
> Given the above, call the new fwnode_links_flush_queue() helper to flush
> the devlink workqueue so we can be sure that all links are dropped before
> doing the proper checks.

Have you seen my comments against v1?

> +++ b/drivers/of/dynamic.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/string.h>
>  #include <linux/proc_fs.h>
> +#include <linux/fwnode.h>

Try to squeeze this to make it ordered (with given context it may go before
linux/s* ones, but maybe you may find a better spot).

...

> +	/*
> +	 * device links drop their device references (and hence their of_node

Device links...

> +	 * references) asynchronously on a dedicated workqueue. Hence we need
> +	 * to flush it to make sure everything is done before doing the below
> +	 * checks.
> +	 */
Nuno Sá Feb. 12, 2024, 12:10 p.m. UTC | #2
On Mon, 2024-02-05 at 13:09 +0100, Nuno Sa wrote:
> Device links will drop their supplier + consumer refcounts
> asynchronously. That means that the refcount of the of_node attached to
> these devices will also be dropped asynchronously and so we cannot
> guarantee the DT overlay assumption that the of_node refcount must be 1 in
> __of_changeset_entry_destroy().
> 
> Given the above, call the new fwnode_links_flush_queue() helper to flush
> the devlink workqueue so we can be sure that all links are dropped before
> doing the proper checks.
> 
> Signed-off-by: Nuno Sa <nuno.sa@analog.com>
> ---
>  drivers/of/dynamic.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
> index 3bf27052832f..b7153c72c9c9 100644
> --- a/drivers/of/dynamic.c
> +++ b/drivers/of/dynamic.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/string.h>
>  #include <linux/proc_fs.h>
> +#include <linux/fwnode.h>
>  
>  #include "of_private.h"
>  
> @@ -518,6 +519,13 @@ EXPORT_SYMBOL(of_changeset_create_node);
>  
>  static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
>  {
> +	/*
> +	 * device links drop their device references (and hence their of_node
> +	 * references) asynchronously on a dedicated workqueue. Hence we need
> +	 * to flush it to make sure everything is done before doing the below
> +	 * checks.
> +	 */
> +	fwnode_links_flush_queue();
>  	if (ce->action == OF_RECONFIG_ATTACH_NODE &&
>  	    of_node_check_flag(ce->np, OF_OVERLAY)) {
>  		if (kref_read(&ce->np->kobj.kref) > 1) {
> 

Hi Rob and Frank,

Any way you could take a look at this and see if you're OK with the change in the
overlay code?

On the devlink side, we already got the OK from Rafael.

Thanks!
- Nuno Sá
Nuno Sá Feb. 13, 2024, 3:01 p.m. UTC | #3
On Tue, 2024-02-13 at 08:51 -0600, Rob Herring wrote:
> On Mon, Feb 12, 2024 at 01:10:27PM +0100, Nuno Sá wrote:
> > Hi Rob and Frank,
> > 
> > Any way you could take a look at this and see if you're ok with the change
> > in the overlay code?
> > 
> > On the devlink side, we already got the ok from Rafael.
> 
> Didn't Saravana say he was going to look at this? As of yesterday, he's 
> also a DT maintainer so deferring to him.
> 

Yeah, I did ask him but I guess he never had the time for it... Saravana,
could you please give some feedback on this? I think the most sensitive part is
on the devlink side but I assume this is not going to be merged without an ack
from a DT maintainer...

- Nuno Sá
Saravana Kannan Feb. 21, 2024, 12:39 a.m. UTC | #4
On Wed, Feb 14, 2024 at 4:48 AM Nuno Sá <noname.nuno@gmail.com> wrote:
>
> On Tue, 2024-02-13 at 19:44 -0800, Saravana Kannan wrote:
> > Sorry for the delay Nuno. I'll get to this. I promise. This week is a bit busy.
> >
>
> No worries. Just making sure it's not forgotten :)

Hi Nuno,

Thanks for nudging me about this issue.

I replied to a similar patch series that Herve sent out last year.
Chose to reply to that because it had fewer issues to fix and Herve
sent it out a while ago.
https://lore.kernel.org/all/20231130174126.688486-1-herve.codina@bootlin.com/

Can you please chime in there?

Thanks,
Saravana
Nuno Sá Feb. 21, 2024, 6:58 a.m. UTC | #5
On Tue, 2024-02-20 at 16:39 -0800, Saravana Kannan wrote:
> Hi Nuno,
> 
> Thanks for nudging me about this issue.
> 

Hi Saravana,

> I replied to a similar patch series that Herve sent out last year.
> Chose to reply to that because it had fewer issues to fix and Herve
> sent it out a while ago.

I think it's fixing the same issues but as he sent first, fair enough :)

> https://lore.kernel.org/all/20231130174126.688486-1-herve.codina@bootlin.com/
> 
> Can you please chime in there?
> 

Already did... Please look at my first patch. It already has an ack from Rafael and I
think it's fairly close to what you want (it might need some naming improvements
though).

- Nuno Sá
Nuno Sá Feb. 21, 2024, 7:13 a.m. UTC | #6
On Tue, 2024-02-20 at 16:39 -0800, Saravana Kannan wrote:
> Hi Nuno,
> 
> Thanks for nudging me about this issue.
> 
> I replied to a similar patch series that Herve sent out last year.
> Chose to reply to that because it had fewer issues to fix and Herve
> sent it out a while ago.

Ehehe, FWIW, I believe I did send mine out first:

https://lore.kernel.org/lkml/20231127-fix-device-links-overlays-v1-1-d7438f56d025@analog.com/

It just got no attention and it took some time until I got some feedback (I also
pushed for it with resends). If you follow the links in the cover letter, you'll see
I first spotted the issue and started this effort in May last year.

That said, I'm more than fine with whatever series is taken. I just care about the
problem being solved :)

- Nuno Sá

Patch

diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
index 3bf27052832f..b7153c72c9c9 100644
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -14,6 +14,7 @@ 
 #include <linux/slab.h>
 #include <linux/string.h>
 #include <linux/proc_fs.h>
+#include <linux/fwnode.h>
 
 #include "of_private.h"
 
@@ -518,6 +519,13 @@  EXPORT_SYMBOL(of_changeset_create_node);
 
 static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
 {
+	/*
+	 * device links drop their device references (and hence their of_node
+	 * references) asynchronously on a dedicated workqueue. Hence we need
+	 * to flush it to make sure everything is done before doing the below
+	 * checks.
+	 */
+	fwnode_links_flush_queue();
 	if (ce->action == OF_RECONFIG_ATTACH_NODE &&
 	    of_node_check_flag(ce->np, OF_OVERLAY)) {
 		if (kref_read(&ce->np->kobj.kref) > 1) {