Message ID | 20240205-fix-device-links-overlays-v2-2-5344f8c79d57@analog.com |
---|---|
State | New |
Series | fix DT overlays when device links are released |
On Mon, Feb 05, 2024 at 01:09:33PM +0100, Nuno Sa via B4 Relay wrote:
> From: Nuno Sa <nuno.sa@analog.com>
>
> Device links will drop their supplier + consumer refcounts
> asynchronously. That means that the refcount of the of_node attached to
> these devices will also be dropped asynchronously and so we cannot
> guarantee the DT overlay assumption that the of_node refcount must be 1 in
> __of_changeset_entry_destroy().
>
> Given the above, call the new fwnode_links_flush_queue() helper to flush
> the devlink workqueue so we can be sure that all links are dropped before
> doing the proper checks.

Have you seen my comments against v1?

> +++ b/drivers/of/dynamic.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/string.h>
>  #include <linux/proc_fs.h>
> +#include <linux/fwnode.h>

Try to squeeze this to make it ordered (with given context it may go before
linux/s* ones, but maybe you may find a better spot).

...

> +	/*
> +	 * device links drop their device references (and hence their of_node

Device links...

> +	 * references) asynchronously on a dedicated workqueue. Hence we need
> +	 * to flush it to make sure everything is done before doing the below
> +	 * checks.
> +	 */
On Mon, 2024-02-05 at 13:09 +0100, Nuno Sa wrote:
> Device links will drop their supplier + consumer refcounts
> asynchronously. That means that the refcount of the of_node attached to
> these devices will also be dropped asynchronously and so we cannot
> guarantee the DT overlay assumption that the of_node refcount must be 1 in
> __of_changeset_entry_destroy().
>
> Given the above, call the new fwnode_links_flush_queue() helper to flush
> the devlink workqueue so we can be sure that all links are dropped before
> doing the proper checks.
>
> Signed-off-by: Nuno Sa <nuno.sa@analog.com>
> ---
>  drivers/of/dynamic.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
> index 3bf27052832f..b7153c72c9c9 100644
> --- a/drivers/of/dynamic.c
> +++ b/drivers/of/dynamic.c
> @@ -14,6 +14,7 @@
>  #include <linux/slab.h>
>  #include <linux/string.h>
>  #include <linux/proc_fs.h>
> +#include <linux/fwnode.h>
>
>  #include "of_private.h"
>
> @@ -518,6 +519,13 @@ EXPORT_SYMBOL(of_changeset_create_node);
>
>  static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
>  {
> +	/*
> +	 * device links drop their device references (and hence their of_node
> +	 * references) asynchronously on a dedicated workqueue. Hence we need
> +	 * to flush it to make sure everything is done before doing the below
> +	 * checks.
> +	 */
> +	fwnode_links_flush_queue();
>  	if (ce->action == OF_RECONFIG_ATTACH_NODE &&
>  	    of_node_check_flag(ce->np, OF_OVERLAY)) {
>  		if (kref_read(&ce->np->kobj.kref) > 1) {
>

Hi Rob and Frank,

Any way you could take a look at this and see if you're ok with the change in
the overlay code?

On the devlink side, we already got the ok from Rafael.

Thanks!
- Nuno Sá
On Tue, 2024-02-13 at 08:51 -0600, Rob Herring wrote:
> On Mon, Feb 12, 2024 at 01:10:27PM +0100, Nuno Sá wrote:
> > On Mon, 2024-02-05 at 13:09 +0100, Nuno Sa wrote:
> > > Device links will drop their supplier + consumer refcounts
> > > asynchronously. That means that the refcount of the of_node attached to
> > > these devices will also be dropped asynchronously and so we cannot
> > > guarantee the DT overlay assumption that the of_node refcount must be 1 in
> > > __of_changeset_entry_destroy().
> > >
> > > Given the above, call the new fwnode_links_flush_queue() helper to flush
> > > the devlink workqueue so we can be sure that all links are dropped before
> > > doing the proper checks.
> > >
> > > Signed-off-by: Nuno Sa <nuno.sa@analog.com>
> > > ---
> > >  drivers/of/dynamic.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
> > > index 3bf27052832f..b7153c72c9c9 100644
> > > --- a/drivers/of/dynamic.c
> > > +++ b/drivers/of/dynamic.c
> > > @@ -14,6 +14,7 @@
> > >  #include <linux/slab.h>
> > >  #include <linux/string.h>
> > >  #include <linux/proc_fs.h>
> > > +#include <linux/fwnode.h>
> > >
> > >  #include "of_private.h"
> > >
> > > @@ -518,6 +519,13 @@ EXPORT_SYMBOL(of_changeset_create_node);
> > >
> > >  static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
> > >  {
> > > +	/*
> > > +	 * device links drop their device references (and hence their of_node
> > > +	 * references) asynchronously on a dedicated workqueue. Hence we need
> > > +	 * to flush it to make sure everything is done before doing the below
> > > +	 * checks.
> > > +	 */
> > > +	fwnode_links_flush_queue();
> > >  	if (ce->action == OF_RECONFIG_ATTACH_NODE &&
> > >  	    of_node_check_flag(ce->np, OF_OVERLAY)) {
> > >  		if (kref_read(&ce->np->kobj.kref) > 1) {
> > >
> >
> > Hi Rob and Frank,
> >
> > Any way you could take a look at this and see if you're ok with the change
> > in the overlay code?
> >
> > On the devlink side, we already got the ok from Rafael.
>
> Didn't Saravana say he was going to look at this? As of yesterday, he's
> also a DT maintainer so deferring to him.
>

Yeah, I did ask him but I guess he never had the time for it... Saravana,
could you please give some feedback on this? I think the most sensitive part is
on the devlink side but I assume this is not going to be merged without an ack
from a DT maintainer...

- Nuno Sá
On Wed, Feb 14, 2024 at 4:48 AM Nuno Sá <noname.nuno@gmail.com> wrote:
>
> On Tue, 2024-02-13 at 19:44 -0800, Saravana Kannan wrote:
> > On Tue, Feb 13, 2024 at 6:57 AM Nuno Sá <noname.nuno@gmail.com> wrote:
> > >
> > > On Tue, 2024-02-13 at 08:51 -0600, Rob Herring wrote:
> > > > On Mon, Feb 12, 2024 at 01:10:27PM +0100, Nuno Sá wrote:
> > > > > On Mon, 2024-02-05 at 13:09 +0100, Nuno Sa wrote:
> > > > > > Device links will drop their supplier + consumer refcounts
> > > > > > asynchronously. That means that the refcount of the of_node attached to
> > > > > > these devices will also be dropped asynchronously and so we cannot
> > > > > > guarantee the DT overlay assumption that the of_node refcount must be 1 in
> > > > > > __of_changeset_entry_destroy().
> > > > > >
> > > > > > Given the above, call the new fwnode_links_flush_queue() helper to flush
> > > > > > the devlink workqueue so we can be sure that all links are dropped before
> > > > > > doing the proper checks.
> > > > > >
> > > > > > Signed-off-by: Nuno Sa <nuno.sa@analog.com>
> > > > > > ---
> > > > > >  drivers/of/dynamic.c | 8 ++++++++
> > > > > >  1 file changed, 8 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
> > > > > > index 3bf27052832f..b7153c72c9c9 100644
> > > > > > --- a/drivers/of/dynamic.c
> > > > > > +++ b/drivers/of/dynamic.c
> > > > > > @@ -14,6 +14,7 @@
> > > > > >  #include <linux/slab.h>
> > > > > >  #include <linux/string.h>
> > > > > >  #include <linux/proc_fs.h>
> > > > > > +#include <linux/fwnode.h>
> > > > > >
> > > > > >  #include "of_private.h"
> > > > > >
> > > > > > @@ -518,6 +519,13 @@ EXPORT_SYMBOL(of_changeset_create_node);
> > > > > >
> > > > > >  static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
> > > > > >  {
> > > > > > +	/*
> > > > > > +	 * device links drop their device references (and hence their of_node
> > > > > > +	 * references) asynchronously on a dedicated workqueue. Hence we need
> > > > > > +	 * to flush it to make sure everything is done before doing the below
> > > > > > +	 * checks.
> > > > > > +	 */
> > > > > > +	fwnode_links_flush_queue();
> > > > > >  	if (ce->action == OF_RECONFIG_ATTACH_NODE &&
> > > > > >  	    of_node_check_flag(ce->np, OF_OVERLAY)) {
> > > > > >  		if (kref_read(&ce->np->kobj.kref) > 1) {
> > > > > >
> > > > >
> > > > > Hi Rob and Frank,
> > > > >
> > > > > Any way you could take a look at this and see if you're ok with the
> > > > > change in the overlay code?
> > > > >
> > > > > On the devlink side, we already got the ok from Rafael.
> > > >
> > > > Didn't Saravana say he was going to look at this? As of yesterday, he's
> > > > also a DT maintainer so deferring to him.
> > > >
> > >
> > > Yeah, I did ask him but I guess he never had the time for it... Saravana,
> > > could you please give some feedback on this? I think the most sensitive
> > > part is on the devlink side but I assume this is not going to be merged
> > > without an ack from a DT maintainer...
> >
> > Sorry for the delay Nuno. I'll get to this. I promise. This week is a bit
> > busy.
> >
>
> No worries. Just making sure it's not forgotten :)

Hi Nuno,

Thanks for nudging me about this issue.

I replied to a similar patch series that Herve sent out last year.
Chose to reply to that because it had fewer issues to fix and Herve
sent it out a while ago.

https://lore.kernel.org/all/20231130174126.688486-1-herve.codina@bootlin.com/

Can you please chime in there?

Thanks,
Saravana
On Tue, 2024-02-20 at 16:39 -0800, Saravana Kannan wrote:
> On Wed, Feb 14, 2024 at 4:48 AM Nuno Sá <noname.nuno@gmail.com> wrote:
> >
> > On Tue, 2024-02-13 at 19:44 -0800, Saravana Kannan wrote:
> > > On Tue, Feb 13, 2024 at 6:57 AM Nuno Sá <noname.nuno@gmail.com> wrote:
> > > >
> > > > On Tue, 2024-02-13 at 08:51 -0600, Rob Herring wrote:
> > > > > On Mon, Feb 12, 2024 at 01:10:27PM +0100, Nuno Sá wrote:
> > > > > > On Mon, 2024-02-05 at 13:09 +0100, Nuno Sa wrote:
> > > > > > > Device links will drop their supplier + consumer refcounts
> > > > > > > asynchronously. That means that the refcount of the of_node attached to
> > > > > > > these devices will also be dropped asynchronously and so we cannot
> > > > > > > guarantee the DT overlay assumption that the of_node refcount must be 1 in
> > > > > > > __of_changeset_entry_destroy().
> > > > > > >
> > > > > > > Given the above, call the new fwnode_links_flush_queue() helper to flush
> > > > > > > the devlink workqueue so we can be sure that all links are dropped before
> > > > > > > doing the proper checks.
> > > > > > >
> > > > > > > Signed-off-by: Nuno Sa <nuno.sa@analog.com>
> > > > > > > ---
> > > > > > >  drivers/of/dynamic.c | 8 ++++++++
> > > > > > >  1 file changed, 8 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
> > > > > > > index 3bf27052832f..b7153c72c9c9 100644
> > > > > > > --- a/drivers/of/dynamic.c
> > > > > > > +++ b/drivers/of/dynamic.c
> > > > > > > @@ -14,6 +14,7 @@
> > > > > > >  #include <linux/slab.h>
> > > > > > >  #include <linux/string.h>
> > > > > > >  #include <linux/proc_fs.h>
> > > > > > > +#include <linux/fwnode.h>
> > > > > > >
> > > > > > >  #include "of_private.h"
> > > > > > >
> > > > > > > @@ -518,6 +519,13 @@ EXPORT_SYMBOL(of_changeset_create_node);
> > > > > > >
> > > > > > >  static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
> > > > > > >  {
> > > > > > > +	/*
> > > > > > > +	 * device links drop their device references (and hence their of_node
> > > > > > > +	 * references) asynchronously on a dedicated workqueue. Hence we need
> > > > > > > +	 * to flush it to make sure everything is done before doing the below
> > > > > > > +	 * checks.
> > > > > > > +	 */
> > > > > > > +	fwnode_links_flush_queue();
> > > > > > >  	if (ce->action == OF_RECONFIG_ATTACH_NODE &&
> > > > > > >  	    of_node_check_flag(ce->np, OF_OVERLAY)) {
> > > > > > >  		if (kref_read(&ce->np->kobj.kref) > 1) {
> > > > > > >
> > > > > >
> > > > > > Hi Rob and Frank,
> > > > > >
> > > > > > Any way you could take a look at this and see if you're ok with the
> > > > > > change in the overlay code?
> > > > > >
> > > > > > On the devlink side, we already got the ok from Rafael.
> > > > >
> > > > > Didn't Saravana say he was going to look at this? As of yesterday, he's
> > > > > also a DT maintainer so deferring to him.
> > > > >
> > > >
> > > > Yeah, I did ask him but I guess he never had the time for it... Saravana,
> > > > could you please give some feedback on this? I think the most sensitive
> > > > part is on the devlink side but I assume this is not going to be merged
> > > > without an ack from a DT maintainer...
> > >
> > > Sorry for the delay Nuno. I'll get to this. I promise. This week is a bit
> > > busy.
> > >
> >
> > No worries. Just making sure it's not forgotten :)
>
> Hi Nuno,
>
> Thanks for nudging me about this issue.
>

Hi Saravana,

> I replied to a similar patch series that Herve sent out last year.
> Chose to reply to that because it had fewer issues to fix and Herve
> sent it out a while ago.

I think it's fixing the same issues but as he sent first, fair enough :)

> https://lore.kernel.org/all/20231130174126.688486-1-herve.codina@bootlin.com/
>
> Can you please chime in there?
>

Already did... Please look at my first patch. It already has an ack from Rafael
and I think it's fairly close to what you want (it might need some naming
improvements though).

- Nuno Sá
On Tue, 2024-02-20 at 16:39 -0800, Saravana Kannan wrote:
> On Wed, Feb 14, 2024 at 4:48 AM Nuno Sá <noname.nuno@gmail.com> wrote:
> >
> > On Tue, 2024-02-13 at 19:44 -0800, Saravana Kannan wrote:
> > > On Tue, Feb 13, 2024 at 6:57 AM Nuno Sá <noname.nuno@gmail.com> wrote:
> > > >
> > > > On Tue, 2024-02-13 at 08:51 -0600, Rob Herring wrote:
> > > > > On Mon, Feb 12, 2024 at 01:10:27PM +0100, Nuno Sá wrote:
> > > > > > On Mon, 2024-02-05 at 13:09 +0100, Nuno Sa wrote:
> > > > > > > Device links will drop their supplier + consumer refcounts
> > > > > > > asynchronously. That means that the refcount of the of_node attached to
> > > > > > > these devices will also be dropped asynchronously and so we cannot
> > > > > > > guarantee the DT overlay assumption that the of_node refcount must be 1 in
> > > > > > > __of_changeset_entry_destroy().
> > > > > > >
> > > > > > > Given the above, call the new fwnode_links_flush_queue() helper to flush
> > > > > > > the devlink workqueue so we can be sure that all links are dropped before
> > > > > > > doing the proper checks.
> > > > > > >
> > > > > > > Signed-off-by: Nuno Sa <nuno.sa@analog.com>
> > > > > > > ---
> > > > > > >  drivers/of/dynamic.c | 8 ++++++++
> > > > > > >  1 file changed, 8 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
> > > > > > > index 3bf27052832f..b7153c72c9c9 100644
> > > > > > > --- a/drivers/of/dynamic.c
> > > > > > > +++ b/drivers/of/dynamic.c
> > > > > > > @@ -14,6 +14,7 @@
> > > > > > >  #include <linux/slab.h>
> > > > > > >  #include <linux/string.h>
> > > > > > >  #include <linux/proc_fs.h>
> > > > > > > +#include <linux/fwnode.h>
> > > > > > >
> > > > > > >  #include "of_private.h"
> > > > > > >
> > > > > > > @@ -518,6 +519,13 @@ EXPORT_SYMBOL(of_changeset_create_node);
> > > > > > >
> > > > > > >  static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
> > > > > > >  {
> > > > > > > +	/*
> > > > > > > +	 * device links drop their device references (and hence their of_node
> > > > > > > +	 * references) asynchronously on a dedicated workqueue. Hence we need
> > > > > > > +	 * to flush it to make sure everything is done before doing the below
> > > > > > > +	 * checks.
> > > > > > > +	 */
> > > > > > > +	fwnode_links_flush_queue();
> > > > > > >  	if (ce->action == OF_RECONFIG_ATTACH_NODE &&
> > > > > > >  	    of_node_check_flag(ce->np, OF_OVERLAY)) {
> > > > > > >  		if (kref_read(&ce->np->kobj.kref) > 1) {
> > > > > > >
> > > > > >
> > > > > > Hi Rob and Frank,
> > > > > >
> > > > > > Any way you could take a look at this and see if you're ok with the
> > > > > > change in the overlay code?
> > > > > >
> > > > > > On the devlink side, we already got the ok from Rafael.
> > > > >
> > > > > Didn't Saravana say he was going to look at this? As of yesterday, he's
> > > > > also a DT maintainer so deferring to him.
> > > > >
> > > >
> > > > Yeah, I did ask him but I guess he never had the time for it... Saravana,
> > > > could you please give some feedback on this? I think the most sensitive
> > > > part is on the devlink side but I assume this is not going to be merged
> > > > without an ack from a DT maintainer...
> > >
> > > Sorry for the delay Nuno. I'll get to this. I promise. This week is a bit
> > > busy.
> > >
> >
> > No worries. Just making sure it's not forgotten :)
>
> Hi Nuno,
>
> Thanks for nudging me about this issue.
>
> I replied to a similar patch series that Herve sent out last year.
> Chose to reply to that because it had fewer issues to fix and Herve
> sent it out a while ago.

Ehehe, FWIW, I did send it out before, I believe:

https://lore.kernel.org/lkml/20231127-fix-device-links-overlays-v1-1-d7438f56d025@analog.com/

I just got no attention and it took some time until I got some feedback (I also
pushed for it with resends). If you follow the links in the cover, you'll see I
first started the effort (and spotted the issue) in May last year.

That said, I'm more than fine with whatever series is taken. I just care about
the problem being solved :)

- Nuno Sá
diff --git a/drivers/of/dynamic.c b/drivers/of/dynamic.c
index 3bf27052832f..b7153c72c9c9 100644
--- a/drivers/of/dynamic.c
+++ b/drivers/of/dynamic.c
@@ -14,6 +14,7 @@
 #include <linux/slab.h>
 #include <linux/string.h>
 #include <linux/proc_fs.h>
+#include <linux/fwnode.h>
 
 #include "of_private.h"
 
@@ -518,6 +519,13 @@ EXPORT_SYMBOL(of_changeset_create_node);
 
 static void __of_changeset_entry_destroy(struct of_changeset_entry *ce)
 {
+	/*
+	 * device links drop their device references (and hence their of_node
+	 * references) asynchronously on a dedicated workqueue. Hence we need
+	 * to flush it to make sure everything is done before doing the below
+	 * checks.
+	 */
+	fwnode_links_flush_queue();
 	if (ce->action == OF_RECONFIG_ATTACH_NODE &&
 	    of_node_check_flag(ce->np, OF_OVERLAY)) {
 		if (kref_read(&ce->np->kobj.kref) > 1) {
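
The ordering problem described in the commit message can be illustrated outside the kernel. What follows is a minimal, single-threaded user-space sketch, not the kernel implementation: `fake_node`, `fake_queue`, `queue_put()`, `flush_queue()`, `node_leaked()` and `destroy_reports_leak()` are hypothetical names invented for this illustration, and the "workqueue" is assumed to be nothing more than an array of deferred reference drops. It only aims to show why a `> 1` refcount check can fire while a deferred put is still pending, and why flushing first avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: only the refcount matters. */
struct fake_node {
	int refcount;
};

/* Hypothetical, single-threaded stand-in for the devlink release workqueue. */
#define MAX_PENDING 8

struct fake_queue {
	struct fake_node *pending[MAX_PENDING];
	size_t count;
};

/* Defer dropping one reference instead of dropping it immediately. */
static bool queue_put(struct fake_queue *q, struct fake_node *np)
{
	if (q->count >= MAX_PENDING)
		return false;
	q->pending[q->count++] = np;
	return true;
}

/* Run every deferred put; plays the role of the flush helper in the patch. */
static void flush_queue(struct fake_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->pending[i]->refcount--;
	q->count = 0;
}

/* Mirrors the shape of the "kref_read(...) > 1" check in the destroy path. */
static bool node_leaked(const struct fake_node *np)
{
	return np->refcount > 1;
}

/* Returns whether the leak check fires, with or without flushing first. */
static bool destroy_reports_leak(bool flush_first)
{
	/* One expected reference plus one still held through a device link. */
	struct fake_node np = { .refcount = 2 };
	struct fake_queue q = { .count = 0 };

	/* The link release is queued but not executed; the queue is empty, so this cannot fail. */
	(void)queue_put(&q, &np);

	if (flush_first)
		flush_queue(&q);

	return node_leaked(&np);
}
```

Under these assumptions, `destroy_reports_leak(false)` reports a leak because the queued put has not run and the count is still 2, while `destroy_reports_leak(true)` does not, because the flush brings the count down to 1 before the check.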