mbox series

[v3,0/8] PM / ACPI / i2c: Deploy runtime PM centric path for system sleep

Message ID 1504018610-10822-1-git-send-email-ulf.hansson@linaro.org
Headers show
Series PM / ACPI / i2c: Deploy runtime PM centric path for system sleep | expand

Message

Ulf Hansson Aug. 29, 2017, 2:56 p.m. UTC
The i2c designware platform driver, drivers/i2c/busses/i2c-designware-platdrv.c,
isn't well optimized for system sleep.

What makes this driver particularly interesting is because it's a cross-SoC
driver, which sometimes means there is an ACPI PM domain attached to the i2c
device and sometimes not. The driver is being used on both x86 and ARM.

In principle, to optimize the system sleep support in i2c driver, this series
enables the proven runtime PM centric path for the i2c driver. However, to do
that the ACPI PM domain also have to collaborate and understand this behaviour.
From earlier versions, Rafael has also pointed out that also the PM core needs
to be involved.

Therefore a number of changes, patch 1 to patch 6, makes the needed changes to
the PM core and the ACPI PM domain. In patch7 and patch 8, the i2c driver gets
optimized and is converted to the runtime PM centric path for system sleep.

It shall be noted, the behaviour of the ACPI PM domain should remain intact,
still taking benefit of using the direct_complete path during system sleep,
except for those drivers that uses the runtime PM centric path.

This series has been tested on an ARM64 Hikey board, which isn't having the
i2c device attached to the ACPI PM domain. This means that the ACPI changes
needs to be tested on some relevant Intel SoCs and it's greatly appreciated
is someone could help out with this, so is of course review comments.

Some news in v3:
	- The fix for the i2c driver [1], is now present in Linus' tree from tag
	v4.13-rc7 - and so does Rafael's tree.
	- To simplify for testers, I have published a branch [3] based upon
	Rafael's pm tree and linux-next branch.
	- Rephrased some part of the coverletter to clarify the intent of this
	series.
	- Addressed review comments from v2.

Some news in v2:
	- The v1 contained a fix for the i2c driver, this has been sent
	separately [1] and picked up for fixes by Wolfram for v4.13-rcs. However
	the fix has not yet reached Linus' tree. The changes on i2c driver
	are based upon that change.
	- To simplify for testers, I have published a branch [2] based upon
	Rafael's pm tree and linux-next branch, which also includes the above
	patch.
	- Rephrased the coverletter to clarify the intent of this series.
	- Addressed review comments from v1.

[1]
http://patchwork.ozlabs.org/patch/799803/

[2]
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git acpi_pm_i2c_rpm_path_v2

[3]
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git acpi_pm_i2c_rpm_path_v3

Kind regards
Ulf Hansson


Ulf Hansson (8):
  PM / Sleep: Make the runtime PM centric path known to the PM core
  PM / ACPI: Restore acpi_subsys_complete()
  PM / Sleep: Remove pm_complete_with_resume_check()
  PM / ACPI: Split code validating need for runtime resume in
    ->prepare()
  PM / ACPI: Split acpi_lpss_suspend_late|resume_early()
  PM / ACPI: Enable the runtime PM centric approach for system sleep
  i2c: designware: Don't resume device in the ->complete() callback
  i2c: designware: Deploy the runtime PM centric path for system sleep

 drivers/acpi/acpi_lpss.c                    |  79 ++++++++++++++------
 drivers/acpi/device_pm.c                    | 111 ++++++++++++++++++++++------
 drivers/base/power/generic_ops.c            |  23 ------
 drivers/base/power/main.c                   |  49 ++++++++++--
 drivers/base/power/runtime.c                |   1 +
 drivers/i2c/busses/i2c-designware-platdrv.c |  34 ++-------
 include/linux/pm.h                          |   8 +-
 7 files changed, 204 insertions(+), 101 deletions(-)

-- 
2.7.4

Comments

Rafael J. Wysocki Sept. 2, 2017, 2:48 p.m. UTC | #1
On Thursday, August 31, 2017 11:06:05 AM CEST Ulf Hansson wrote:
> [...]

> 

> >> >> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c

> >> >> index ea1732e..865737a 100644

> >> >> --- a/drivers/base/power/main.c

> >> >> +++ b/drivers/base/power/main.c

> >> >> @@ -1549,14 +1549,16 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)

> >> >>               if (parent) {

> >> >>                       spin_lock_irq(&parent->power.lock);

> >> >>

> >> >> -                     dev->parent->power.direct_complete = false;

> >> >> +                     if (!dev->power.is_rpm_sleep)

> >> >> +                             dev->parent->power.direct_complete = false;

> >> >

> >> > This doesn't look good to me at all.

> >> >

> >> > It potentially breaks the fundamental assumption of the direct_complete

> >> > mechanism that no devices with that flag set will ever be runtime resumed

> >> > during system suspend.

> >> >

> >> > Which can happen with this change if a device with power.is_rpm_sleep is

> >> > resumed causing the parent to be resumed too.

> >>

> >> Doesn't the exact same problem you describe, already exist prior my change!?

> >

> > No, it doesn't.

> >

> >> That is, if the current device is runtime suspended and the

> >> direct_complete flag is set for it, then __device_suspend() has

> >> already disabled runtime PM for the device, when we reach this point.

> >> That means, a following attempts to runtime resume the device will

> >> fail and thus also to runtime resume its parent.

> >

> > That's because any devices with direct_complete set *cannot* be runtime

> > resumed after __device_suspend().  That's intentional and how the thing

> > is designed.  It cannot work differently, sorry.

> >

> >> So to me, this is a different problem, related how to work out the

> >> correct order of how to suspend devices.

> >

> > The ordering matters here, but it is not the problem.

> >

> > Setting direct_complete means that PM callbacks for this device won't be

> > invoked going forward.  However, if the state of the device was to change

> > (like as a result of runtime PM resume), it would be necessary to run those

> > callbacks after all, but after __device_suspend() it may be too late for

> > that, because the ->suspend callback may have been skipped already.

> >

> > The only way to address that is to prevent runtime PM resume of the device

> > from happening at the point the PM core decides to use direct_complete for it,

> > but that requires runtime PM to be disabled for it.  Moreover, since the

> > device is now suspended with runtime PM disabled and its children might rely

> > on it if they were active, the children must be suspended with runtime PM

> > disabled at this point too.  That's why direct_complete can only be set

> > for a device if it is set for the entire hierarchy below it.

> 

> Thanks, this was a very nice description of the direct_complete path!

> 

> >

> > Summary: If you allow a device to be runtime resumed during system susped,

> > direct_complete cannot be set for it and it cannot be set for its parent and

> > so on.

> 

> Yes, this is what it seems to boils done to!

> 

> Perhaps the ACPI PM domain is special in this case, because in its

> ->prepare() callback it asks the ACPI FW about whether the

> direct_complete path is possible for a device. In other words, the

> ACPI FW seems to be capable of providing information about if a device

> may become runtime resumed during system suspend. But, really, isn't

> that just a best guess? No?


I wouldn't call it a guess, but it may go too far.

The part that the ACPI PM domain can figure out is whether or not the power
state of the device needs to be updated due to differences between runtime
PM and system suspend configuration requirements.  If it doesn't need to be
updated, acpi_subsys_prepare() returns 1 which may me overoptimistic.

For many devices that works, because the only possible reason why they may
need to be accessed during system suspend is to update their power state.

It may not work for some devices, though, because there may be other reasons
for accessing them during system suspend and that's where some opt-out way
is needed.

> On those ARM SoCs I am working on, non-ACPI, the information about

> whether a device may become runtime resumed during system suspend,

> most often can't be find out in a deterministic way. That's because

> this information depends on how a device is being used by other

> devices, which changes dynamically and from one system suspend

> sequence to another. In cases like the i2c-designware-plat driver used

> in Hikey (ARM64 board), it can't ever know before hand, if someone is

> going to request an i2c transfer during system suspend or not.


Well, I2C is somewhat special in that respect, because dependencies between
devices in there are not tracked AFAICS. 

> To really deal with these devices properly, we have to suspend them in

> the correct order.


Right, but in order to achieve that the right ordering actually needs
to be known.  If it is not known, there is no way to get that right in
general. :-)

The problem basically is that during system suspend you need to disable
the controller at one point, sooner or later, and keep it disabled from
that point on.   Of course, that should happen when it is guaranteed that
there won't be any new transactions going forward, but you cannot actually
guaranee that without knowing that all of its "consumers" have already
been suspended.

> My point is, doesn't the ordering problem exists also for a device,

> which the PM core runs the direct_complete path for, even if the ACPI

> PM domain is attached to the device? Just because runtime PM is

> disabled for a device, doesn't mean the ordering issue disappears,

> right?


The ordering issue is there if there are dependencies between devices the
PM core doesn't know about.  Otherwise, the rule that direct_complete can
only be used for a given device if it also is used for the entire hierarchy
below it (and for all of the device's "consumers" and everything that depends
on them too for that matter) makes it go away (it serves as a guarantee that
all of the device's "consumers" have already been suspended).

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Sept. 2, 2017, 3:38 p.m. UTC | #2
On Friday, September 1, 2017 10:27:05 AM CEST Ulf Hansson wrote:
> On 29 August 2017 at 17:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:

> > On Tuesday, August 29, 2017 4:56:48 PM CEST Ulf Hansson wrote:

> >> This change enables the ACPI PM domain to cope with drivers that deploys

> >> the runtime PM centric path for system sleep.

> >

> > [cut]

> >

> >> @@ -1052,11 +1066,20 @@ EXPORT_SYMBOL_GPL(acpi_subsys_complete);

> >>   * @dev: Device to handle.

> >>   *

> >>   * Follow PCI and resume devices suspended at run time before running their

> >> - * system suspend callbacks.

> >> + * system suspend callbacks. However, try to avoid it in case the runtime PM

> >> + * centric path is used for the device and then trust the driver to do the

> >> + * right thing.

> >>   */

> >>  int acpi_subsys_suspend(struct device *dev)

> >>  {

> >> -     pm_runtime_resume(dev);

> >> +     struct acpi_device *adev = ACPI_COMPANION(dev);

> >> +

> >> +     if (!adev)

> >> +             return 0;

> >> +

> >> +     if (!dev_pm_is_rpm_sleep(dev) || acpi_dev_needs_resume(dev, adev))

> >> +             pm_runtime_resume(dev);

> >> +

> >>       return pm_generic_suspend(dev);

> >>  }

> >>  EXPORT_SYMBOL_GPL(acpi_subsys_suspend);

> >

> > Well, I tried to avoid calling acpi_dev_needs_resume() for multiple times

> > and that's why I added the update_state thing.

> >

> > Moreover, the is_rpm_sleep flag here has to mean not only that

> > direct_complete should not be used with the device, but also that its driver

> > is fine with not resuming it.

> 

> Let me try to explain this better. I realize the changelog is

> misleading around this particular section! Huh, apologize for that!

> 

> First, patch1 makes the PM core treat the is_rpm_sleep flag as the

> direct_complete isn't allowed for the device.

> 

> For that reason, when the is_rpm_sleep is set, there is no point

> calling acpi_dev_needs_resume() from acpi_subsys_prepare(), but

> instead that can be deferred to acpi_subsys_suspend() - because it

> doesn't matter if acpi_subsys_prepare() returns 0 or 1, in either case

> the acpi_subsys_suspend() will be called. That's really what goes on

> here.

> 

> The end result is the same. If the acpi_dev_needs_resume() thinks that

> the device needs to be runtime resumed, pm_runtime_resume() is called

> for the device in acpi_subsys_suspend().

> 

> So, this has nothing to do with whether the driver "is fine with not

> resuming it" thing.


No, sorry.

If is_rpm_sleep was not set, the ACPI PM domain would resume the device in
acpi_subsys_suspend() regardless of the acpi_dev_needs_resume() return value.
That's what's there in the patch.  So clearly, setting is_rpm_sleep means
"this device does not need to be resumed in acpi_subsys_suspend() unless
acpi_dev_needs_resume() returns true".  Which clearly means that the driver
*is* fine with not resuming it, because if is_rpm_sleep is set, the device
in fact may not be resumed and then the driver will need to cope with that.

And note that this meaning of is_rpm_sleep is different from what it is
expected to mean to the core.

> >

> > IMO it is not a good idea to use one flag for these two different things at the

> > same time at all.

> 

> Yeah, I guess my upper comment addresses your immediate concern here?


No, they don't.

> However, there is one other thing the is_rpm_flag means. That is that

> the driver has informed the ACPI PM domain, to trust the driver to

> deal with system sleep, via re-using the runtime PM callbacks.

> So the flag does still have two meanings, but that we can change - of course.


I guess that you are referring to the use of dev_pm_is_rpm_sleep() in
acpi_subsys_suspend_late()?  That's the third thing this flag means ...

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Wunner Sept. 4, 2017, 5:46 a.m. UTC | #3
On Mon, Sep 04, 2017 at 02:17:21AM +0200, Rafael J. Wysocki wrote:
> OTOH I'm starting to think that direct_complete is only theoretically

> useful and may not be actually set very often in practice, whereas it

> adds significant complexity to the code, so I'm not sure about it any

> more.


That makes me come out of the woodwork as a direct_complete fan:

Runtime resuming a discrete GPU on a modern dual GPU laptop takes about
1.5 sec, runtime resuming Thunderbolt controllers more than 2 sec.
A discrete GPU easily consumes 10W, a Thunderbolt controller 2W.

So not having direct_complete would noticeably delay system suspend and
resume as well as reduce battery life.

Lukas
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Sept. 4, 2017, 10:04 a.m. UTC | #4
On Monday, September 4, 2017 7:46:37 AM CEST Lukas Wunner wrote:
> On Mon, Sep 04, 2017 at 02:17:21AM +0200, Rafael J. Wysocki wrote:

> > OTOH I'm starting to think that direct_complete is only theoretically

> > useful and may not be actually set very often in practice, whereas it

> > adds significant complexity to the code, so I'm not sure about it any

> > more.

> 

> That makes me come out of the woodwork as a direct_complete fan:

> 

> Runtime resuming a discrete GPU on a modern dual GPU laptop takes about

> 1.5 sec, runtime resuming Thunderbolt controllers more than 2 sec.

> A discrete GPU easily consumes 10W, a Thunderbolt controller 2W.

> 

> So not having direct_complete would noticeably delay system suspend and

> resume as well as reduce battery life.


Well, that's a good reason for having it. :-)

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Sept. 4, 2017, 12:55 p.m. UTC | #5
[...]

>> > So can you please remind me why the _force_ wrappers are needed?

>>

>> See below.

>>

>> >

>> > In particular, why can't drivers arrange their callbacks the way I did that

>> > in https://patchwork.kernel.org/patch/9928583/ ?

>>

>> I was preparing a reply to that patch, but let me summarize that here instead.

>>

>> Let me be clear, the patch is an improvement of the behavior of the

>> driver and it addresses the issues you point out in the change log.

>> Re-using the runtime PM callbacks for system sleep, is nice as it

>> avoids open coding, which is of curse also one of the reason of using

>> pm_runtime_force_suspend|resume().

>>

>> Still there are a couple of things I am worried about in this patch.

>> *)

>> To be able to re-use the same callbacks for system sleep and runtime

>> PM, some boilerplate code is added to the driver, as to cope with the

>> different conditions inside the callbacks. That pattern would become

>> repeated to many drivers dealing with similar issues.

>

> I'm not worried about that as long as there are good examples and

> documented best practices.

>

> There aren't any right now, which is a problem, but that certainly is

> fixable.

>

>> **)

>> The ->resume_early() callback powers on the device, in case it was

>> runtime resumed when the ->suspend_late() callback was invoked. That

>> is in many cases completely unnecessary, causing us to waste power and

>> increase system resume time, for absolutely no reason. However, I

>> understand the patch didn't try to address this, but to really fix

>> this, there has to be an even closer collaboration between runtime PM

>> and the system sleep callbacks.

>

> I don't quite agree and here's why.

>

> If a device was not runtime-suspended right before system suspend, then quite

> likely it was in use then.  Therefore it is quite likely to be resumed

> immediately after system resume anyway.


Unfortunate, to always make that assumption, leads to a non-optimized
behavior of system sleep. I think we can do better than that!

Let me give you a concrete example, where the above assumption would
lead to an non-optimized behavior.

To put an MMC card into low power state during system suspend
(covering eMMC, SD, SDIO) the mmc core needs to send a couple of
commands over the MMC interface to the card, as to conform with the
(e)MMC/SD/SDIO spec. To do this, the mmc driver for the mmc controller
must runtime resume its device, as to be able to send the commands
over the interface.

Now, when the system resumes, there is absolutely no reason to runtime
resume the device for the MMC controller, just because it was runtime
resumed during system suspend. Instead that is better to be postponed
to when the MMC card is really needed and thus via runtime PM instead.

This scenario shouldn't be specific to only MMC controllers/cards, but
should apply to any external devices/controllers that needs some
special treatment to be put into low power state during system
suspend. Particularly also when those external devices may be left in
that low power state until those are really needed. A couple of cases
I know of pops up in my head, WiFi chips, persistent storage devices,
etc. There should be plenty.

Another common case, is when a subsystem core layer flushes a request
queue during system suspend, which may cause a controller device to be
runtime resumed. Making the assumption that, because flushing the
queue was done during system suspend, we must also power up the
controller during system resume, again would lead to a non-optimized
behavior.

>

> Now, if that's just one device, it probably doesn't matter, but if there are

> more devices like that, they will be resumed after system suspend when they

> are accessed and quite likely they will be accessed one-by-one rather than in

> parallel with each other, so the latencies related to that will add up.  In

> that case it is better to resume them upfront during system resume as they will

> be resumed in parallel with each other then.  And that also is *way* simpler.

>

> This means that the benefit from avoiding to resume devices during system

> resume is not quite so obvious and the whole point above is highly

> questionable.


I hope my reasoning above explains why I think it shouldn't be
considered as questionable.

If you like, I can also provide some real data/logs - showing you
what's happening.

>

>>

>> So, to remind you why the pm_runtime_force_suspend|resume() helpers is

>> preferred, that's because both of the above two things becomes taken

>> care of.

>

> And that is why there is this stuff about parents and usage counters, right?


Correct. Perhaps this commit tells you a little more.

commit 1d9174fbc55ec99ccbfcafa3de2528ef78a849aa
Author: Ulf Hansson <ulf.hansson@linaro.org>
Date:   Thu Oct 13 16:58:54 2016 +0200

    PM / Runtime: Defer resuming of the device in pm_runtime_force_resume()

[...]

>> >

>> > PM core => bus type / PM domain ->suspend_late => driver ->suspend_late

>> >

>> > is far more straightforward than

>> >

>> > PM core => bus type / PM domain ->suspend_late => driver ->suspend_late =>

>> >         bus type / PM domain ->runtime_suspend => driver ->runtime_suspend

>> >

>> > with the bus type / PM domain having to figure out somehow at the

>> > ->suspend_late time whether or not its ->runtume_suspend is going to be invoked

>> > in the middle of it.

>> >

>> > Apart from this just being aesthetically disgusting to me, which admittedly is

>> > a matter of personal opinion, it makes debugging new driver code harder (if it

>> > happens to not work) and reviewing it almost impossible, because now you need

>> > to take all of the tangling between callbacks into accont and sometimes not

>> > just for one bus type / PM domain.

>>

>> I am wondering that perhaps you may be overlooking some of the

>> internals of runtime PM. Or maybe not? :-)

>>

>> I mean, the hole thing is build upon that anyone can call runtime PM

>> functions to runtime resume/suspend a device.

>

> Well, right in general, except that _force_suspend/resume() invoke

> *callbacks* and *not* runtime PM functions.


I am considering pm_runtime_force_suspend|resume() being a part of the
runtime PM API, except that those may be called only during system
sleep.

Comparing a call to pm_runtime_resume(); this may trigger rpm_resume()
to invoke the callbacks. To me, the difference is that the conditions
looked at in rpm_resume(), when runtime PM is enabled, becomes
different for system sleep when runtime PM is disabled - and that is
taken care of in pm_runtime_force_suspend|resume().

>

>> Doing that, makes the

>> hierarchy of the runtime PM callbacks being walked and invoked, of

>> course properly managed by the runtime PM core.

>>

>> My point is that, the runtime PM core still controls this behavior,

>> even when the pm_runtime_force_suspend|resume() helpers are being

>> invoked. The only difference is that it allows runtime PM for the

>> device to be disabled, and still correctly invoked the callbacks. That

>> is what it is all about.

>

> So why is it even useful to call ->runtime_suspend from a middle layer

> in pm_runtime_force_suspend(), for example?


Perhaps I don't understand the question correctly.

Anyway, the answer I think of, is probably because of the same reason
to why the runtime PM core invokes it, when it runs rpm_suspend() for
a device. My point is, we want the similar behavior.

[...]

>> >

>> > Also, when I looked at _force_suspend/resume() again, I got concerned.

>> > There is stuff in there that shouldn't be necessary in a driver's

>> > ->late_suspend/->early_resume and some things in there just made me

>> > scratch my head.

>>

>> Yes, there are some complexity in there, I will be happy to answer any

>> specific question about it.

>

> OK

>

> Of course they require runtime PM to be enabled by drivers using them as

> their callbacks, but I suppose that you realize that.

>

> Why to disabe/renable runtime PM in there in the first place?  That should

> have been done by the core when these functions are intended to be called.


The reason is because we didn't want to re-strict them to be used only
in ->suspend_late() and ->resume_early(), but also for ->suspend() and
->resume(), which is when runtime PM still is enabled.

>

> Second, why to use RPM_GET_CALLBACK in there?


To follow the same rules/hierarchy, as being done in rpm_suspend|resume().

>

> Next, how is the parent actually runtime-resumed by pm_runtime_force_resume()

> which the comment in pm_runtime_force_suspend() talks about?


I think the relevant use case here is when a parent and a child, both
have subsystems/drivers using pm_runtime_force_suspend|resume(). If
that isn't the case, we expect that the parent is always resumed
during system resume. It's a bit fragile approach, so we perhaps we
should deal with it, even if the hole thing is used as opt-in.

Anyway, let's focus on the case which I think is most relevant to your question:

A couple of conditions to start with.
*) The PM core system suspends a child prior a parent, which leads to
pm_runtime_force_suspend() being called for the child first.
**) The PM core system resumes a parents before a child, thus
pm_runtime_force_resume() is called for the parent first.

In case a child don't need to be resumed when
pm_runtime_force_resume() is called for it, likely doesn't its parent.
However, to control that, in system suspend the
pm_runtime_force_suspend() increases the usage counter for the parent,
as to indicate if it needs to be resumed when
pm_runtime_force_resume() is called for it.

Finally, when the child becomes resumed in pm_runtime_force_resume(),
pm_runtime_set_active() is called for it. This verifies that the
parent also has been resumed properly.

[...]

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ulf Hansson Sept. 4, 2017, 1:21 p.m. UTC | #6
On 2 September 2017 at 17:38, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Friday, September 1, 2017 10:27:05 AM CEST Ulf Hansson wrote:

>> On 29 August 2017 at 17:27, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:

>> > On Tuesday, August 29, 2017 4:56:48 PM CEST Ulf Hansson wrote:

>> >> This change enables the ACPI PM domain to cope with drivers that deploys

>> >> the runtime PM centric path for system sleep.

>> >

>> > [cut]

>> >

>> >> @@ -1052,11 +1066,20 @@ EXPORT_SYMBOL_GPL(acpi_subsys_complete);

>> >>   * @dev: Device to handle.

>> >>   *

>> >>   * Follow PCI and resume devices suspended at run time before running their

>> >> - * system suspend callbacks.

>> >> + * system suspend callbacks. However, try to avoid it in case the runtime PM

>> >> + * centric path is used for the device and then trust the driver to do the

>> >> + * right thing.

>> >>   */

>> >>  int acpi_subsys_suspend(struct device *dev)

>> >>  {

>> >> -     pm_runtime_resume(dev);

>> >> +     struct acpi_device *adev = ACPI_COMPANION(dev);

>> >> +

>> >> +     if (!adev)

>> >> +             return 0;

>> >> +

>> >> +     if (!dev_pm_is_rpm_sleep(dev) || acpi_dev_needs_resume(dev, adev))

>> >> +             pm_runtime_resume(dev);

>> >> +

>> >>       return pm_generic_suspend(dev);

>> >>  }

>> >>  EXPORT_SYMBOL_GPL(acpi_subsys_suspend);

>> >

>> > Well, I tried to avoid calling acpi_dev_needs_resume() for multiple times

>> > and that's why I added the update_state thing.

>> >

>> > Moreover, the is_rpm_sleep flag here has to mean not only that

>> > direct_complete should not be used with the device, but also that its driver

>> > is fine with not resuming it.

>>

>> Let me try to explain this better. I realize the changelog is

>> misleading around this particular section! Huh, apologize for that!

>>

>> First, patch1 makes the PM core treat the is_rpm_sleep flag as the

>> direct_complete isn't allowed for the device.

>>

>> For that reason, when the is_rpm_sleep is set, there is no point

>> calling acpi_dev_needs_resume() from acpi_subsys_prepare(), but

>> instead that can be deferred to acpi_subsys_suspend() - because it

>> doesn't matter if acpi_subsys_prepare() returns 0 or 1, in either case

>> the acpi_subsys_suspend() will be called. That's really what goes on

>> here.

>>

>> The end result is the same. If the acpi_dev_needs_resume() thinks that

>> the device needs to be runtime resumed, pm_runtime_resume() is called

>> for the device in acpi_subsys_suspend().

>>

>> So, this has nothing to do with whether the driver "is fine with not

>> resuming it" thing.

>

> No, sorry.

>

> If is_rpm_sleep was not set, the ACPI PM domain would resume the device in

> acpi_subsys_suspend() regardless of the acpi_dev_needs_resume() return value.


Yes, I believe I forgot about one scenario, when the direct_complete
path has been abandoned by the PM core, because a child device was
suspend before and it couldn't run the direct_complete path for it?

Just to be sure, that's the case you also had in mind?

> That's what's there in the patch.  So clearly, setting is_rpm_sleep means

> "this device does not need to be resumed in acpi_subsys_suspend() unless

> acpi_dev_needs_resume() returns true".  Which clearly means that the driver

> *is* fine with not resuming it, because if is_rpm_sleep is set, the device

> in fact may not be resumed and then the driver will need to cope with that.


Yes, I understand your concern, because we may break the default
behavior of the ACPI PM domain.

So, *if* there will be a next version, I will make sure to be better
safe than sorry, and add one flag per use case.

>

> And note that this meaning of is_rpm_sleep is different from what it is

> expected to mean to the core.

>

>> >

>> > IMO it is not a good idea to use one flag for these two different things at the

>> > same time at all.

>>

>> Yeah, I guess my upper comment addresses your immediate concern here?

>

> No, they don't.

>

>> However, there is one other thing the is_rpm_flag means. That is that

>> the driver has informed the ACPI PM domain, to trust the driver to

>> deal with system sleep, via re-using the runtime PM callbacks.

>> So the flag does still have two meanings, but that we can change - of course.

>

> I guess that you are referring to the use of dev_pm_is_rpm_sleep() in

> acpi_subsys_suspend_late()?  That's the third thing this flag means ...


Yes.

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Sept. 6, 2017, 9:39 p.m. UTC | #7
On Wednesday, September 6, 2017 3:59:16 PM CEST Ulf Hansson wrote:
> On 6 September 2017 at 12:46, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:

> > On Wednesday, September 6, 2017 2:52:59 AM CEST Rafael J. Wysocki wrote:

> >> On Monday, September 4, 2017 2:55:37 PM CEST Ulf Hansson wrote:

> >> > [...]

> >

> > I guess I can wrap it up, because all of the points seem to have been stated

> > and repeating them would not be useful.

> >

> > My summary of the discussion is as follows.

> >

> > It only is valid to use pm_runtime_force_suspend/resume() as *driver*

> > callbacks for system suspend/resume if both the driver itself and all of

> > the middle layers it has to work with carry out the same sequence of

> > operations in order to suspend the device both in runtime PM and for

> > system sleep (and analogously for resuming).  [The middle layers need

> > to meet additional conditions, but that's less relevant.]

> >

> > Unfortunately, for the ACPI PM domain and the PCI bus type the situation is

> > different, because they generally need to do different things to suspend

> > devices for system sleep than they do for runtime PM (which mostly is

> > related to the handling of ACPI-defined sleep states and device/system

> > wakeup, but not limited to that).  This clearly means that drivers needing

> > to work with the ACPI PM domain and PCI drivers cannot use

> > pm_runtime_force_suspend/resume() as their PM callbacks for system

> > suspend/resume (quite fundamentally).

> >

> > [Note that for i2c-designware-platdrv the situation is even more complicated,

> > because on some platforms it has to work with the ACPI PM domain (or the

> > ACPI LPSS driver), on some platforms its parent is a PCI device and on

> > some other platforms there's none of them.]

> 

> That is also why it makes it really interesting. I am guessing we will

> be seeing more of these cases sooner or later.

> 

> To make it even more complex, I can guess we can expect cases when

> genpd is mixed with the ACPI PM domain.

> 

> >

> > However, for drivers that need to work with the ACPI PM domain and

> > PCI drivers the differences in the device handling between runtime PM and

> > system suspend/resume are *very* often (even though not always) covered

> > entirely by the middle layer code.  Then, the driver itself actually

> > always carries out the same sequence of operations in order to suspend

> > the device (or to resume it, analogously).  The driver then can re-use

> > its runtime PM callbacks for system suspend/resume (but at the driver

> > level only) and it would be good to make that easy (or easier) for these

> > drivers somehow.

> 

> This is a very nice summary so far, thanks for putting it together.


No problem.

I actually have an idea on how to move forward, but let me start a new thread
for discussing that.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html