Message ID | 20210915170940.617415-3-saravanak@google.com |
---|---|
State | Superseded |
Headers | show |
Series | fw_devlink bug fixes | expand |
On Wed, Sep 15, 2021 at 7:09 PM Saravana Kannan <saravanak@google.com> wrote: > > If a parent device is also a supplier to a child device, fw_devlink=on by > design delays the probe() of the child device until the probe() of the > parent finishes successfully. > > However, some drivers of such parent devices (where parent is also a > supplier) expect the child device to finish probing successfully as soon as > they are added using device_add() and before the probe() of the parent > device has completed successfully. One example of such a case is discussed > in the link mentioned below. > > Add a flag to make fw_devlink=on not enforce these supplier-consumer > relationships, so these drivers can continue working. > > Link: https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@mail.gmail.com/ > Fixes: ea718c699055 ("Revert "Revert "driver core: Set fw_devlink=on by default""") > Signed-off-by: Saravana Kannan <saravanak@google.com> > --- > drivers/base/core.c | 19 +++++++++++++++++++ > include/linux/fwnode.h | 11 ++++++++--- > 2 files changed, 27 insertions(+), 3 deletions(-) > > diff --git a/drivers/base/core.c b/drivers/base/core.c > index 316df6027093..21d4cb5d3767 100644 > --- a/drivers/base/core.c > +++ b/drivers/base/core.c > @@ -1722,6 +1722,25 @@ static int fw_devlink_create_devlink(struct device *con, > struct device *sup_dev; > int ret = 0; > > + /* > + * In some cases, a device P might also be a supplier to its child node > + * C. However, this would defer the probe of C until the probe of P > + * completes successfully. This is perfectly fine in the device driver > + * model. device_add() doesn't guarantee probe completion of the device > + * by the time it returns. That's right. However, I don't quite see a point in device links where the supplier is a direct ancestor of the consumer. From the PM perspective they are simply redundant and I'm not sure about any other use cases where they aren't. So what's the reason to add them in the first place? > + * > + * However, there are a few drivers that assume C will finish probing > + * as soon as it's added and before P finishes probing. So, we provide > + * a flag to let fw_devlink know not to delay the probe of C until the > + * probe of P completes successfully. Well, who's expected to set that flag and when? This needs to be mentioned here IMO. > + * > + * When such a flag is set, we can't create device links where P is the > + * supplier of C as that would delay the probe of C. > + */ > + if (sup_handle->flags & FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD && > + fwnode_is_ancestor_of(sup_handle, con->fwnode)) > + return -EINVAL; > + > sup_dev = get_dev_from_fwnode(sup_handle); > if (sup_dev) { > /* > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h > index 59828516ebaf..9f4ad719bfe3 100644 > --- a/include/linux/fwnode.h > +++ b/include/linux/fwnode.h > @@ -22,10 +22,15 @@ struct device; > * LINKS_ADDED: The fwnode has already be parsed to add fwnode links. > * NOT_DEVICE: The fwnode will never be populated as a struct device. > * INITIALIZED: The hardware corresponding to fwnode has been initialized. > + * NEEDS_CHILD_BOUND_ON_ADD: For this fwnode/device to probe successfully, its > + * driver needs its child devices to be bound with > + * their respective drivers as soon as they are > + * added. The fact that this requires so much comment text here is a clear band-aid indication to me. > */ > -#define FWNODE_FLAG_LINKS_ADDED BIT(0) > -#define FWNODE_FLAG_NOT_DEVICE BIT(1) > -#define FWNODE_FLAG_INITIALIZED BIT(2) > +#define FWNODE_FLAG_LINKS_ADDED BIT(0) > +#define FWNODE_FLAG_NOT_DEVICE BIT(1) > +#define FWNODE_FLAG_INITIALIZED BIT(2) > +#define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3) > > struct fwnode_handle { > struct fwnode_handle *secondary; > -- > 2.33.0.309.g3052b89438-goog >
> > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h > > index 59828516ebaf..9f4ad719bfe3 100644 > > --- a/include/linux/fwnode.h > > +++ b/include/linux/fwnode.h > > @@ -22,10 +22,15 @@ struct device; > > * LINKS_ADDED: The fwnode has already be parsed to add fwnode links. > > * NOT_DEVICE: The fwnode will never be populated as a struct device. > > * INITIALIZED: The hardware corresponding to fwnode has been initialized. > > + * NEEDS_CHILD_BOUND_ON_ADD: For this fwnode/device to probe successfully, its > > + * driver needs its child devices to be bound with > > + * their respective drivers as soon as they are > > + * added. > > The fact that this requires so much comment text here is a clear > band-aid indication to me. This whole patchset is a band aid, but it is for stable, to fix things which are currently broken. So we need to answer the question, is a bad aid good enough for stable, with the assumption a real fix will come along later? Andrew
On Sun, Sep 19, 2021 at 03:16:34AM +0200, Andrew Lunn wrote: > > > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h > > > index 59828516ebaf..9f4ad719bfe3 100644 > > > --- a/include/linux/fwnode.h > > > +++ b/include/linux/fwnode.h > > > @@ -22,10 +22,15 @@ struct device; > > > * LINKS_ADDED: The fwnode has already be parsed to add fwnode links. > > > * NOT_DEVICE: The fwnode will never be populated as a struct device. > > > * INITIALIZED: The hardware corresponding to fwnode has been initialized. > > > + * NEEDS_CHILD_BOUND_ON_ADD: For this fwnode/device to probe successfully, its > > > + * driver needs its child devices to be bound with > > > + * their respective drivers as soon as they are > > > + * added. > > > > The fact that this requires so much comment text here is a clear > > band-aid indication to me. > > This whole patchset is a band aid, but it is for stable, to fix things > which are currently broken. So we need to answer the question, is a > bad aid good enough for stable, with the assumption a real fix will > come along later? Fix it properly first and worry about stable later. greg k-h
On Sun, Sep 19, 2021 at 03:16:34AM +0200, Andrew Lunn wrote: > > > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h > > > index 59828516ebaf..9f4ad719bfe3 100644 > > > --- a/include/linux/fwnode.h > > > +++ b/include/linux/fwnode.h > > > @@ -22,10 +22,15 @@ struct device; > > > * LINKS_ADDED: The fwnode has already be parsed to add fwnode links. > > > * NOT_DEVICE: The fwnode will never be populated as a struct device. > > > * INITIALIZED: The hardware corresponding to fwnode has been initialized. > > > + * NEEDS_CHILD_BOUND_ON_ADD: For this fwnode/device to probe successfully, its > > > + * driver needs its child devices to be bound with > > > + * their respective drivers as soon as they are > > > + * added. > > > > The fact that this requires so much comment text here is a clear > > band-aid indication to me. > > This whole patchset is a band aid, but it is for stable, to fix things > which are currently broken. So we need to answer the question, is a > bad aid good enough for stable, with the assumption a real fix will > come along later? Fix it properly first, don't worry about stable. But what is wrong with this as-is? What needs to be done that is not happening here that you feels still needs to be addressed? thanks, greg k-h
On Tue, Sep 21, 2021 at 6:15 PM Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > On Sun, Sep 19, 2021 at 03:16:34AM +0200, Andrew Lunn wrote: > > > > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h > > > > index 59828516ebaf..9f4ad719bfe3 100644 > > > > --- a/include/linux/fwnode.h > > > > +++ b/include/linux/fwnode.h > > > > @@ -22,10 +22,15 @@ struct device; > > > > * LINKS_ADDED: The fwnode has already be parsed to add fwnode links. > > > > * NOT_DEVICE: The fwnode will never be populated as a struct device. > > > > * INITIALIZED: The hardware corresponding to fwnode has been initialized. > > > > + * NEEDS_CHILD_BOUND_ON_ADD: For this fwnode/device to probe successfully, its > > > > + * driver needs its child devices to be bound with > > > > + * their respective drivers as soon as they are > > > > + * added. > > > > > > The fact that this requires so much comment text here is a clear > > > band-aid indication to me. > > > > This whole patchset is a band aid, but it is for stable, to fix things > > which are currently broken. So we need to answer the question, is a > > bad aid good enough for stable, with the assumption a real fix will > > come along later? > > Fix it properly first, don't worry about stable. > > But what is wrong with this as-is? What needs to be done that is not > happening here that you feels still needs to be addressed? The existing code attempts to "enforce" device links where the supplier is a direct ancestor of the consumer (e.g. its parent), which is questionable by itself (why do that?) and that's the source of the underlying issue (self-inflicted circular dependencies that cause devices to wait for a deferred probe forever) which this patchest attempts to avoid by adding an extra flag to the driver core and expecting specific drivers to mark their devices as "special". And that's "until we have a real fix".
> The existing code attempts to "enforce" device links where the > supplier is a direct ancestor of the consumer (e.g. its parent), which > is questionable by itself (why do that?) In this case, we have an Ethernet switch as the parent device. It registers an interrupt controller, to the interrupt subsystem. It also registers an MDIO controller to the MDIO subsystem. The MDIO subsystem finds the Ethernet PHYs on the MDIO bus, and registers the PHYs to the PHY subsystem. Device tree is then used to glue all the parts together. The PHY has an interrupt output which is connected to the interrupt controller, and a standard DT property is used to connect the two. The MACs in the switch are connected to the PHYs, and standard DT properties are used to connect them together. So we have a loop. But the driver model does not have a problem with this, at least not until fw_devlink came along. As soon as a resource is registered with a subsystem, it can be used. Where as fw_devlink seems to assume a resource cannot be used until the driver providing it completes probe. Now, we could ignore all these subsystems, re-invent the wheels inside the switch driver, and then not have suppliers and consumers at all, it is all internal. But that seems like a bad idea, more wheels, more bugs. So for me, the real fix is that fw_devlink learns that resources are available as soon as they are registered, not when the provider device completes probe. Andrew
On Tue, Sep 21, 2021 at 7:56 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > The existing code attempts to "enforce" device links where the > > supplier is a direct ancestor of the consumer (e.g. its parent), which > > is questionable by itself (why do that?) > > In this case, we have an Ethernet switch as the parent device. It > registers an interrupt controller, to the interrupt subsystem. It also > registers an MDIO controller to the MDIO subsystem. The MDIO subsystem > finds the Ethernet PHYs on the MDIO bus, and registers the PHYs to the > PHY subsystem. > > Device tree is then used to glue all the parts together. The PHY has > an interrupt output which is connected to the interrupt controller, > and a standard DT property is used to connect the two. The MACs in the > switch are connected to the PHYs, and standard DT properties are used > to connect them together. So we have a loop. But the driver model does > not have a problem with this, at least not until fw_devlink came > along. As soon as a resource is registered with a subsystem, it can be > used. Where as fw_devlink seems to assume a resource cannot be used > until the driver providing it completes probe. It works at a device level, so it doesn't know about resources. The only information it has is of the "this device may depend on that other device" type and it uses that information to figure out a usable probe ordering for drivers. > Now, we could ignore all these subsystems, re-invent the wheels inside > the switch driver, and then not have suppliers and consumers at all, > it is all internal. But that seems like a bad idea, more wheels, more > bugs. > > So for me, the real fix is that fw_devlink learns that resources are > available as soon as they are registered, not when the provider device > completes probe. Because it doesn't know about individual resources, it cannot really do that. Also if the probe has already started, it may still return -EPROBE_DEFER at any time in theory, so as a rule the dependency is actually known to be satisfied when the probe has successfully completed. However, making children wait for their parents to complete probing is generally artificial, especially in the cases when the children are registered by the parent's driver. So waiting should be an exception in these cases, not a rule.
> It works at a device level, so it doesn't know about resources. The > only information it has is of the "this device may depend on that > other device" type and it uses that information to figure out a usable > probe ordering for drivers. And that simplification is the problem. A phandle does not point to a device, it points to a resource of a device. It should really be doing what the driver would do, follow the phandle to the resource and see if it exists yet. If it does not exist then yes it can defer the probe. If the resource does exist, allow the driver to probe. > Also if the probe has already started, it may still return > -EPROBE_DEFER at any time in theory Sure it can, and does. And any driver which is not broken will unregister its resources on the error path. And that causes users of the resources to release them. It all nicely unravels, and then tries again later. This all works, it is what these drivers do. > However, making children wait for their parents to complete probing is > generally artificial, especially in the cases when the children are > registered by the parent's driver. So waiting should be an exception > in these cases, not a rule. So are you suggesting that fw_devlink core needs to change, recognise the dependency on a parent, and allow the probe? Works for me. Gets us back to before fw_devlink. Andrew
Sorry I've been busy with LPC and some other stuff and could respond earlier. On Tue, Sep 21, 2021 at 12:50 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > It works at a device level, so it doesn't know about resources. The > > only information it has is of the "this device may depend on that > > other device" type and it uses that information to figure out a usable > > probe ordering for drivers. > > And that simplification is the problem. A phandle does not point to a > device, it points to a resource of a device. It should really be doing > what the driver would do, follow the phandle to the resource and see > if it exists yet. If it does not exist then yes it can defer the > probe. If the resource does exist, allow the driver to probe. > > > Also if the probe has already started, it may still return > > -EPROBE_DEFER at any time in theory > > Sure it can, and does. And any driver which is not broken will > unregister its resources on the error path. And that causes users of > the resources to release them. It all nicely unravels, and then tries > again later. This all works, it is what these drivers do. One of the points of fw_devlink=on is to avoid the pointless deferred probes that'd happen in this situation. So saying "let this happen" when fw_devlink=on kinda beats the point of it. See further below. > > > However, making children wait for their parents to complete probing is > > generally artificial, especially in the cases when the children are > > registered by the parent's driver. So waiting should be an exception > > in these cases, not a rule. Rafael, There are cases where the children try to probe too quickly (before the parent has had time to set up all the resources it's setting up) and the child defers the probe. Even Andrew had an example of that with some ethernet driver where the deferred probe is attempted multiple times wasting time and then it eventually succeeds. Considering there's no guarantee that a device_add() will result in the device being bound immediately, why shouldn't we make the child device wait until the parent has completely probed and we know all the resources from the parent are guaranteed to be available? Why can't we treat drivers that assume a device will get bound as soon as it's added as the exception (because we don't guarantee that anyway)? Also, this assumption that the child will be bound successfully upon addition forces the parent/child drivers to play initcall chicken -- the child's driver has to be registered before the parent's driver. We should be getting away from those by fixing the parent driver that's making these assumptions (I'll be glad to help with that). We need to be moving towards reducing pointless deferred probes and initcall ordering requirements instead of saying "this bad assumption used to work, so allow me to continue doing that". -Saravana > So are you suggesting that fw_devlink core needs to change, recognise > the dependency on a parent, and allow the probe? Works for me. Gets us > back to before fw_devlink.
> There are cases where the children try to probe too quickly (before > the parent has had time to set up all the resources it's setting up) > and the child defers the probe. Even Andrew had an example of that > with some ethernet driver where the deferred probe is attempted > multiple times wasting time and then it eventually succeeds. And i prefer an occasional EPROBE_DEFER over a broken Ethernet switch, which is the current state. I'm happy to see optimisations, but not at the expense of breaking working stuff. > Also, this assumption that the child will be bound successfully upon > addition forces the parent/child drivers to play initcall chicken We have never had any initcall chicken problems. The switch drivers all are standard mdio_module_driver, module_platform_driver, module_i2c_driver, module_pci_driver. Nothing special here. Things load in whatever order they load, and it all works out, maybe with an EPROBE_DEFER cycle. Which is good, we get our error paths tested, and sometimes find bugs that way. Andrew
On Tue, Sep 21, 2021 at 2:03 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > There are cases where the children try to probe too quickly (before > > the parent has had time to set up all the resources it's setting up) > > and the child defers the probe. Even Andrew had an example of that > > with some ethernet driver where the deferred probe is attempted > > multiple times wasting time and then it eventually succeeds. > > And i prefer an occasional EPROBE_DEFER over a broken Ethernet switch, > which is the current state. I'm happy to see optimisations, but not at > the expense of breaking working stuff. Right, but in that case, the long term solution should be to make changes so we don't expect the child to be bound as soon as it's added. Not disable the optimization. Agree? > > > Also, this assumption that the child will be bound successfully upon > > addition forces the parent/child drivers to play initcall chicken > > We have never had any initcall chicken problems. The switch drivers > all are standard mdio_module_driver, module_platform_driver, > module_i2c_driver, module_pci_driver. Nothing special here. Things > load in whatever order they load, and it all works out, maybe with an > EPROBE_DEFER cycle. Which is good, we get our error paths tested, and > sometimes find bugs that way. My comment was a general comment about parent drives that expect the child drivers to be bound on addition -- not specific to DSA. But even in the DSA case, not playing initcall chicken means if you change the order of driver registration, things should still work. However, as it stands, if you register the switch driver before the PHY drivers are registered, you are going to force bind the PHYs to the generic driver. -Saravana
On Tue, Sep 21, 2021 at 02:54:47PM -0700, Saravana Kannan wrote: > On Tue, Sep 21, 2021 at 2:03 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > > > There are cases where the children try to probe too quickly (before > > > the parent has had time to set up all the resources it's setting up) > > > and the child defers the probe. Even Andrew had an example of that > > > with some ethernet driver where the deferred probe is attempted > > > multiple times wasting time and then it eventually succeeds. > > > > And i prefer an occasional EPROBE_DEFER over a broken Ethernet switch, > > which is the current state. I'm happy to see optimisations, but not at > > the expense of breaking working stuff. > > Right, but in that case, the long term solution should be to make > changes so we don't expect the child to be bound as soon as it's > added. Not disable the optimization. Agree? Maybe. Lets see how you fix what is currently broken. At the moment, i don't care too much about the long term solution. The current quick fix for stable does not seem to be making any progress. So we need the real fix now, to unbreak what is currently broken, then we can think about the long term. Andrew
On Tue, Sep 21, 2021 at 3:04 PM Andrew Lunn <andrew@lunn.ch> wrote: > > On Tue, Sep 21, 2021 at 02:54:47PM -0700, Saravana Kannan wrote: > > On Tue, Sep 21, 2021 at 2:03 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > > > > > There are cases where the children try to probe too quickly (before > > > > the parent has had time to set up all the resources it's setting up) > > > > and the child defers the probe. Even Andrew had an example of that > > > > with some ethernet driver where the deferred probe is attempted > > > > multiple times wasting time and then it eventually succeeds. > > > > > > And i prefer an occasional EPROBE_DEFER over a broken Ethernet switch, > > > which is the current state. I'm happy to see optimisations, but not at > > > the expense of breaking working stuff. > > > > Right, but in that case, the long term solution should be to make > > changes so we don't expect the child to be bound as soon as it's > > added. Not disable the optimization. Agree? > > Maybe. Lets see how you fix what is currently broken. At the moment, i > don't care too much about the long term solution. The current quick > fix for stable does not seem to be making any progress. So we need the > real fix now, to unbreak what is currently broken, then we can think > about the long term. Wait, what's the difference between a real fix vs a long term fix? To me those are the same. I agree that having DSA be broken till we have the final fix isn't good. Especially because DSA is complicated and the over eager gen PHY driver makes it even harder to fix it quickly. Merging this patch gives an exception to DSA, while we figure out a good fix. It also makes sure more drivers don't get merged with the same assumptions (because fw_devlink=on won't give them the exception). Greg/Rafael, can we merge this please while we figure out a fix for DSA/PHY. It's a non-trivial problem to solve because PHYs need some kind of driver "match rating". -Saravana
> Wait, what's the difference between a real fix vs a long term fix? To > me those are the same. Maybe the long term fix is you follow the phandle to the actual resources, see it is present, and allow the probe? That brings you in line with how things actually work with devices probing against resources. I don't know how much work that is, since there is no uniform API to follow a phandle to a resource. I think each phandle type has its own helper. For an interrupt phandle you need to use of_irq_get(), for a gpio phandle maybe of_get_named_gpio_flags(), for a reset phandle __of_reset_control_get(), etc. Because this does not sounds too simple, maybe you can find something simpler which is a real fix for now, good enough that it will get merged, and then you can implement this phandle following for the long term fix? Andrew
On Tue, Sep 21, 2021 at 5:45 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > Wait, what's the difference between a real fix vs a long term fix? To > > me those are the same. > > Maybe the long term fix is you follow the phandle to the actual > resources, see it is present, and allow the probe? That brings you in > line with how things actually work with devices probing against > resources. > > I don't know how much work that is, since there is no uniform API to > follow a phandle to a resource. I think each phandle type has its own > helper. For an interrupt phandle you need to use of_irq_get(), for a > gpio phandle maybe of_get_named_gpio_flags(), for a reset phandle > __of_reset_control_get(), etc. That goes back to Rafael's reply (and I agree): "Also if the probe has already started, it may still return -EPROBE_DEFER at any time in theory, so as a rule the dependency is actually known to be satisfied when the probe has successfully completed." So waiting for the probe to finish is the right behavior/intentional for fw_devlink. > Because this does not sounds too simple, maybe you can find something > simpler which is a real fix for now, good enough that it will get > merged, and then you can implement this phandle following for the long > term fix? The simpler fix is really just this patch. I'm hoping Greg/Rafael see my point about doing the exception this way prevents things from getting worse will we address existing cases that need the flag. The long/proper fix is to the DSA framework. I have some ideas that I think will work but I've had time to get to (but on the top of my upstream work list). We can judge that after I send out the patches :) -Saravana
> That goes back to Rafael's reply (and I agree): > > "Also if the probe has already started, it may still return > -EPROBE_DEFER at any time in theory, so as a rule the dependency is > actually known to be satisfied when the probe has successfully > completed." > > So waiting for the probe to finish is the right behavior/intentional > for fw_devlink. But differs to how things actually work in the driver model. The driver model does not care if a driver has finished probing, you can use a resource as soon as it is registered. Hence this whole problem/discussion. Andrew
On Tue, Sep 21, 2021 at 10:07 PM Saravana Kannan <saravanak@google.com> wrote: > > Sorry I've been busy with LPC and some other stuff and could respond earlier. > > On Tue, Sep 21, 2021 at 12:50 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > > > It works at a device level, so it doesn't know about resources. The > > > only information it has is of the "this device may depend on that > > > other device" type and it uses that information to figure out a usable > > > probe ordering for drivers. > > > > And that simplification is the problem. A phandle does not point to a > > device, it points to a resource of a device. It should really be doing > > what the driver would do, follow the phandle to the resource and see > > if it exists yet. If it does not exist then yes it can defer the > > probe. If the resource does exist, allow the driver to probe. > > > > > Also if the probe has already started, it may still return > > > -EPROBE_DEFER at any time in theory > > > > Sure it can, and does. And any driver which is not broken will > > unregister its resources on the error path. And that causes users of > > the resources to release them. It all nicely unravels, and then tries > > again later. This all works, it is what these drivers do. > > One of the points of fw_devlink=on is to avoid the pointless deferred > probes that'd happen in this situation. So saying "let this happen" > when fw_devlink=on kinda beats the point of it. See further below. Well, you need to define "pointless deferred probes" in the first place. fw_devlink adds deferred probes by itself, so why are those not pointless whereas the others are? > > > > > However, making children wait for their parents to complete probing is > > > generally artificial, especially in the cases when the children are > > > registered by the parent's driver. So waiting should be an exception > > > in these cases, not a rule. > > Rafael, > > There are cases where the children try to probe too quickly (before > the parent has had time to set up all the resources it's setting up) > and the child defers the probe. Even Andrew had an example of that > with some ethernet driver where the deferred probe is attempted > multiple times wasting time and then it eventually succeeds. You seem to be arguing that it may be possible to replace multiple probe attempts that each are deferred with one probe deferral which then is beneficial from the performance perspective. Yes, there are cases like that, but when this is used as a general rule, it introduces a problem if it does a deferred probe when there is no need for a probe deferral at all (like in the specific problem case at hand). Also if the probing of the child is deferred just once, adding an extra dependency on the parent to it doesn't really help. > Considering there's no guarantee that a device_add() will result in > the device being bound immediately, why shouldn't we make the child > device wait until the parent has completely probed and we know all the > resources from the parent are guaranteed to be available? Why can't we > treat drivers that assume a device will get bound as soon as it's > added as the exception (because we don't guarantee that anyway)? Because this adds artificial constraints that otherwise aren't there in some cases to the picture and asking drivers to mark themselves as "please don't add these artificial constraints for me" is not particularly straightforward. Moreover, it does that retroactively causing things that are entirely correct and previously worked just fine to now have to paint themselves red to continue working as before. The fact that immediate probe completion cannot be guaranteed in general doesn't mean that it cannot be assumed in certain situations. For example, a parent driver registering a child may know what the child driver is and so it may know that the child will either probe successfully right away or the probing of it will fail and your extra constraint breaks that assumption. You can't really know how many of such cases there are and trying to cover them with a new flag is a retroactive whack-a-mole game. > Also, this assumption that the child will be bound successfully upon > addition forces the parent/child drivers to play initcall chicken -- > the child's driver has to be registered before the parent's driver. That's true, but why is this a general problem? For example, they both may be registered by the same function in the right order. What's wrong with that? > We should be getting away from those by fixing the parent driver that's > making these assumptions (I'll be glad to help with that). We need to > be moving towards reducing pointless deferred probes and initcall > ordering requirements instead of saying "this bad assumption used to > work, so allow me to continue doing that". It is not always a bad assumption. It may be code designed this way.
diff --git a/drivers/base/core.c b/drivers/base/core.c index 316df6027093..21d4cb5d3767 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -1722,6 +1722,25 @@ static int fw_devlink_create_devlink(struct device *con, struct device *sup_dev; int ret = 0; + /* + * In some cases, a device P might also be a supplier to its child node + * C. However, this would defer the probe of C until the probe of P + * completes successfully. This is perfectly fine in the device driver + * model. device_add() doesn't guarantee probe completion of the device + * by the time it returns. + * + * However, there are a few drivers that assume C will finish probing + * as soon as it's added and before P finishes probing. So, we provide + * a flag to let fw_devlink know not to delay the probe of C until the + * probe of P completes successfully. + * + * When such a flag is set, we can't create device links where P is the + * supplier of C as that would delay the probe of C. + */ + if (sup_handle->flags & FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD && + fwnode_is_ancestor_of(sup_handle, con->fwnode)) + return -EINVAL; + sup_dev = get_dev_from_fwnode(sup_handle); if (sup_dev) { /* diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h index 59828516ebaf..9f4ad719bfe3 100644 --- a/include/linux/fwnode.h +++ b/include/linux/fwnode.h @@ -22,10 +22,15 @@ struct device; * LINKS_ADDED: The fwnode has already be parsed to add fwnode links. * NOT_DEVICE: The fwnode will never be populated as a struct device. * INITIALIZED: The hardware corresponding to fwnode has been initialized. + * NEEDS_CHILD_BOUND_ON_ADD: For this fwnode/device to probe successfully, its + * driver needs its child devices to be bound with + * their respective drivers as soon as they are + * added. */ -#define FWNODE_FLAG_LINKS_ADDED BIT(0) -#define FWNODE_FLAG_NOT_DEVICE BIT(1) -#define FWNODE_FLAG_INITIALIZED BIT(2) +#define FWNODE_FLAG_LINKS_ADDED BIT(0) +#define FWNODE_FLAG_NOT_DEVICE BIT(1) +#define FWNODE_FLAG_INITIALIZED BIT(2) +#define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3) struct fwnode_handle { struct fwnode_handle *secondary;
If a parent device is also a supplier to a child device, fw_devlink=on by design delays the probe() of the child device until the probe() of the parent finishes successfully. However, some drivers of such parent devices (where parent is also a supplier) expect the child device to finish probing successfully as soon as they are added using device_add() and before the probe() of the parent device has completed successfully. One example of such a case is discussed in the link mentioned below. Add a flag to make fw_devlink=on not enforce these supplier-consumer relationships, so these drivers can continue working. Link: https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@mail.gmail.com/ Fixes: ea718c699055 ("Revert "Revert "driver core: Set fw_devlink=on by default""") Signed-off-by: Saravana Kannan <saravanak@google.com> --- drivers/base/core.c | 19 +++++++++++++++++++ include/linux/fwnode.h | 11 ++++++++--- 2 files changed, 27 insertions(+), 3 deletions(-)