diff mbox series

regulator: Defer init completion for a while after late_initcall

Message ID 20190904124250.25844-1-broonie@kernel.org
State Accepted
Commit 55576cf1853798e86f620766e23b604c9224c19c
Headers show
Series regulator: Defer init completion for a while after late_initcall | expand

Commit Message

Mark Brown Sept. 4, 2019, 12:42 p.m. UTC
The kernel has no way of knowing when we have finished instantiating
drivers, between deferred probe and systems that build key drivers as
modules we might be doing this long after userspace has booted. This has
always been a bit of an issue with regulator_init_complete since it can
power off hardware that's not had it's driver loaded which can result in
user visible effects, the main case is powering off displays. Practically
speaking it's not been an issue in real systems since most systems that
use the regulator API are embedded and build in key drivers anyway but
with Arm laptops coming on the market it's becoming more of an issue so
let's do something about it.

In the absence of any better idea just defer the powering off for 30s
after late_initcall(), this is obviously a hack but it should mask the
issue for now and it's no more arbitrary than late_initcall() itself.
Ideally we'd have some heuristics to detect if we're on an affected
system and tune or skip the delay appropriately, and there may be some
need for a command line option to be added.

Signed-off-by: Mark Brown <broonie@kernel.org>

Tested-by: Lee Jones <lee.jones@linaro.org>

---
 drivers/regulator/core.c | 42 +++++++++++++++++++++++++++++-----------
 1 file changed, 31 insertions(+), 11 deletions(-)

-- 
2.20.1

Comments

Mark Brown Nov. 18, 2019, 12:46 p.m. UTC | #1
On Sat, Nov 16, 2019 at 01:52:33PM +0100, Torsten Duwe wrote:
> On Wed, 4 Sep 2019 13:42:50 +0100 Mark Brown <broonie@kernel.org> wrote:


> > In the absence of any better idea just defer the powering off for 30s

> > after late_initcall(), this is obviously a hack but it should mask the

> > issue for now and it's no more arbitrary than late_initcall() itself.

> > Ideally we'd have some heuristics to detect if we're on an affected

> > system and tune or skip the delay appropriately, and there may be some

> > need for a command line option to be added.


> Am I the only one having problems with this change? I get


I've had no reports of any problems.

> [   11.917136] anx6345 0-0038: 0-0038 supply dvdd12-supply not found, using dummy regulator

> [   11.917174] axp20x-rsb sunxi-rsb-3a3: AXP20x variant AXP803 found


> Despite being loaded as a very early module, PMIC init ^^^ only starts now.


I'm very surprised that anything to do with resolving incomplete
constraints would be affected by this change.  The only thing we do in
the defered bit of init is power off unused regulators which has no
bearing on registration at all.  The only thing that might have a
bearing on this is marking the sytem as having full constraints but
that's still directly in the initcall, not deferred.

> But much later on


> [   38.248573] dcdc4: disabling

> [   38.268493] vcc-pd: disabling

> [   38.288446] vdd-edp: disabling


> screen goes dark and stays dark. Use count of the regulators is 0. I guess

> this is because the driver code had been returned the dummy instead?


This is not new behaviour, all this change did was delay this.  We've
been powering off unused regulators for a bit over a decade.

> It's a mobile device so in principle there is nothing wrong with powering

> down unused circuitry, and always-on is not an option.

> Am I correct to perceive this solution as not 100% mature yet? The anx6345


Like I say this is not in any way new and pretty stable.

> driver in particular needs to do a little "voltage dance" with specific

> timing on the real regulators should the device come up really unpowered,

> so IMHO it's probably neccessary to return EPROBE_DEFER at least in this

> particular case and prepare the driver for it? Or what would be the real

> solution in this case?


We power off regulators which aren't enabled by any driver and where we
have permission from the constraints to change the state.  If the
regulator can't be powered off then it should be flagged as always-on in
constraints, if a driver needs it the driver should be enabling the
regulator.

I don't folow what you're saying about probe deferral here at all,
sorry.
Torsten Duwe Nov. 30, 2019, 3:27 p.m. UTC | #2
On Mon, Nov 18, 2019 at 08:29:49PM +0000, Mark Brown wrote:
> On Mon, Nov 18, 2019 at 08:40:12PM +0100, Torsten Duwe wrote:

> 

> > kernel: anx6345 0-0038: 0-0038 supply dvdd12-supply not found, using dummy regulator

> > kernel: anx6345 0-0038: 0-0038 supply dvdd25-supply not found, using dummy regulator

> 

> > DT has:

> >   dvdd25-supply = <&reg_dldo2>;

> >   dvdd12-supply = <&reg_dldo3>;


Note these 4 lines...

> > It's only that the regulator driver module has not fully loaded at that point.

> 

> We substitute in the dummy regulator in regulator_get() if

> regulator_dev_lookup() returns -ENODEV and a few other conditions are

> satisfied.  When lookup up via DT regulator_dev_lookup() will use

> of_find_regulator_by_node() to look up the regulator, if that lookup

> fails it returns -EPROBE_DEFER.  Until we get to of_find_regulator_by_node()

> we're just looking to see if nodes exist, not to see if anything is

> registered.  What mechanism do you see causing issues?  If there's

> something going wrong here it's in that area.


First of all: thanks a lot! This has put me onto the right track.

> As far as I can tell whatever is going on with your system it's only

> ever been working through luck.


Yes indeed. It turned out the regulators were still on from U-Boot
and that code never worked.

>   Without any specific references to

> what's going on in the system it's hard to tell what might be happening,


Well, actually the 4 lines above give a good hint :) of_get_regulator()
will look for "dvdd25-supply-supply". I'm fairly relieved that even you
didn't spot this right away. The fix just went to dri-devel, you're Cc'ed.
Unfortunately the documentation for this is buried in the git commit log.

For the record: I'm still convinced that the original change can uncover
bugs unexpectedly, and is not suited for -stable.

Thanks for the help, and sorry for the non-standard nomenclature.

	Torsten
Mark Brown Dec. 1, 2019, 11:49 p.m. UTC | #3
On Sat, Nov 30, 2019 at 04:27:00PM +0100, Torsten Duwe wrote:

> Well, actually the 4 lines above give a good hint :) of_get_regulator()

> will look for "dvdd25-supply-supply". I'm fairly relieved that even you

> didn't spot this right away. The fix just went to dri-devel, you're Cc'ed.

> Unfortunately the documentation for this is buried in the git commit log.


Glad you got to the bottom of this!  Like I said in reply to your
fix (which I saw first) this is covered in the DT binding
documentation for the regultaor API.

> For the record: I'm still convinced that the original change can uncover

> bugs unexpectedly, and is not suited for -stable.


Yeah, it's definitely at the upper limit of what I consider safe
for stable but it seems to fit the risk profile of what stable
wants to include these days and there was demand for it from
people working on DT based laptops and desktops.
diff mbox series

Patch

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 4a27a46ec6e7..340db986b67f 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -5644,7 +5644,7 @@  static int __init regulator_init(void)
 /* init early to allow our consumers to complete system booting */
 core_initcall(regulator_init);
 
-static int __init regulator_late_cleanup(struct device *dev, void *data)
+static int regulator_late_cleanup(struct device *dev, void *data)
 {
 	struct regulator_dev *rdev = dev_to_rdev(dev);
 	const struct regulator_ops *ops = rdev->desc->ops;
@@ -5693,17 +5693,8 @@  static int __init regulator_late_cleanup(struct device *dev, void *data)
 	return 0;
 }
 
-static int __init regulator_init_complete(void)
+static void regulator_init_complete_work_function(struct work_struct *work)
 {
-	/*
-	 * Since DT doesn't provide an idiomatic mechanism for
-	 * enabling full constraints and since it's much more natural
-	 * with DT to provide them just assume that a DT enabled
-	 * system has full constraints.
-	 */
-	if (of_have_populated_dt())
-		has_full_constraints = true;
-
 	/*
 	 * Regulators may had failed to resolve their input supplies
 	 * when were registered, either because the input supply was
@@ -5721,6 +5712,35 @@  static int __init regulator_init_complete(void)
 	 */
 	class_for_each_device(&regulator_class, NULL, NULL,
 			      regulator_late_cleanup);
+}
+
+static DECLARE_DELAYED_WORK(regulator_init_complete_work,
+			    regulator_init_complete_work_function);
+
+static int __init regulator_init_complete(void)
+{
+	/*
+	 * Since DT doesn't provide an idiomatic mechanism for
+	 * enabling full constraints and since it's much more natural
+	 * with DT to provide them just assume that a DT enabled
+	 * system has full constraints.
+	 */
+	if (of_have_populated_dt())
+		has_full_constraints = true;
+
+	/*
+	 * We punt completion for an arbitrary amount of time since
+	 * systems like distros will load many drivers from userspace
+	 * so consumers might not always be ready yet, this is
+	 * particularly an issue with laptops where this might bounce
+	 * the display off then on.  Ideally we'd get a notification
+	 * from userspace when this happens but we don't so just wait
+	 * a bit and hope we waited long enough.  It'd be better if
+	 * we'd only do this on systems that need it, and a kernel
+	 * command line option might be useful.
+	 */
+	schedule_delayed_work(&regulator_init_complete_work,
+			      msecs_to_jiffies(30000));
 
 	return 0;
 }