[1/3] PM: domains: Drop the performance state vote for a device at detach

Message ID 20210902101634.827187-2-ulf.hansson@linaro.org
State New
Series PM: domains: Improvements for performance states in genpd

Commit Message

Ulf Hansson Sept. 2, 2021, 10:16 a.m. UTC
When a device is detached from its genpd, genpd loses track of the device,
including its performance state vote that may have been requested for it.

Rather than relying on the consumer driver to drop the performance state
vote for its device, let's do it internally in genpd when the device is
getting detached. In this way, we make sure that the aggregation of the
votes in genpd becomes correct.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

---
 drivers/base/power/domain.c | 9 ++-------
 include/linux/pm_domain.h   | 1 -
 2 files changed, 2 insertions(+), 8 deletions(-)

-- 
2.25.1
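
The diff body is not reproduced on this page. As a rough illustration of the idea in the commit message, here is a minimal user-space sketch, assuming the domain's performance state is simply the highest vote among attached devices. All `toy_*` names are hypothetical and are not the kernel's structures or functions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEVS 8

/* Hypothetical stand-in for a device that can be attached to a PM domain. */
struct toy_dev {
    unsigned int vote;  /* requested performance state, 0 = no vote */
    int attached;
};

/* Hypothetical stand-in for a genpd: it only knows about attached devices. */
struct toy_genpd {
    struct toy_dev *devs[TOY_MAX_DEVS];
    unsigned int performance_state;  /* aggregated (highest) vote */
};

/* Recompute the domain state as the highest vote among attached devices. */
static void toy_aggregate(struct toy_genpd *pd)
{
    unsigned int state = 0;
    size_t i;

    for (i = 0; i < TOY_MAX_DEVS; i++)
        if (pd->devs[i] && pd->devs[i]->vote > state)
            state = pd->devs[i]->vote;
    pd->performance_state = state;
}

static int toy_attach(struct toy_genpd *pd, struct toy_dev *dev)
{
    size_t i;

    for (i = 0; i < TOY_MAX_DEVS; i++) {
        if (!pd->devs[i]) {
            pd->devs[i] = dev;
            dev->attached = 1;
            toy_aggregate(pd);
            return 0;
        }
    }
    return -1;  /* no free slot */
}

static void toy_set_vote(struct toy_genpd *pd, struct toy_dev *dev,
                         unsigned int state)
{
    assert(dev->attached);  /* only attached devices may vote */
    dev->vote = state;
    toy_aggregate(pd);
}

/* Detach drops the device's vote first, so the aggregate stays correct. */
static void toy_detach(struct toy_genpd *pd, struct toy_dev *dev)
{
    size_t i;

    toy_set_vote(pd, dev, 0);
    for (i = 0; i < TOY_MAX_DEVS; i++)
        if (pd->devs[i] == dev)
            pd->devs[i] = NULL;
    dev->attached = 0;
}
```

In this model the consumer driver never has to remember to clear its vote: the detach path does it, and the remaining devices' votes decide the new domain state.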

Comments

Dmitry Osipenko Sept. 3, 2021, 6:01 a.m. UTC | #1
02.09.2021 13:16, Ulf Hansson wrote:
> When a device is detached from its genpd, genpd loses track of the device,
> including its performance state vote that may have been requested for it.
>
> Rather than relying on the consumer driver to drop the performance state
> vote for its device, let's do it internally in genpd when the device is
> getting detached. In this way, we make sure that the aggregation of the
> votes in genpd becomes correct.

This is a dangerous behaviour in a case where performance state
represents voltage. If hardware is kept active on detachment, say it's
always-on, then it may be a disaster to drop the voltage for the active
hardware.

It's safe to drop the performance state only if you assume that there is
firmware behind the kernel which has its own layer of performance management
and it will prevent the disaster by saying 'nope, I'm not doing this'.

The performance state should be persistent for a device and it should be
controlled in conjunction with runtime PM. If a platform wants to drop the
performance state to zero on detachment, then this behaviour should be
specific to that platform.
Ulf Hansson Sept. 3, 2021, 8:22 a.m. UTC | #2
On Fri, 3 Sept 2021 at 08:01, Dmitry Osipenko <digetx@gmail.com> wrote:
> This is a dangerous behaviour in a case where performance state
> represents voltage. If hardware is kept active on detachment, say it's
> always-on, then it may be a disaster to drop the voltage for the active
> hardware.
>
> It's safe to drop performance state only if you assume that there is a
> firmware behind kernel which has its own layer of performance management
> and it will prevent the disaster by saying 'nope, I'm not doing this'.
>
> The performance state should be persistent for a device and it should be
> controlled in a conjunction with runtime PM. If platform wants to drop
> performance state to zero on detachment, then this behaviour should be
> specific to that platform.

I understand your concern, but at this point, genpd can't help to fix this.

Genpd has no information about the device, unless it's attached to it.
For now, and for this kind of always-on HW, we simply need to make sure the
device stays attached, in one way or the other.

Kind regards
Uffe
Dmitry Osipenko Sept. 3, 2021, 9:58 a.m. UTC | #3
03.09.2021 11:22, Ulf Hansson wrote:
> I understand your concern, but at this point, genpd can't help to fix this.
>
> Genpd has no information about the device, unless it's attached to it.
> For now and for these always on HWs, we simply need to make sure the
> device stays attached, in one way or the other.

This indeed requires redesigning GENPD to couple it more tightly with a
device, but this is not a real problem for any of the current API users
AFAIK. Ideally the state should be persistent to make the API more universal.

Since today we assume that the device should be suspended at the time of
the detachment (if the default OPP state isn't used), it may be better
to add a noisy warning message if pstate != 0, keeping the state untouched
if it's not zero.
Ulf Hansson Sept. 3, 2021, 2:03 p.m. UTC | #4
On Fri, 3 Sept 2021 at 11:58, Dmitry Osipenko <digetx@gmail.com> wrote:
> This indeed requires to redesign GENPD to make it more coupled with a
> device, but this is not a real problem for any of the current API users
> AFAIK. Ideally the state should be persistent to make API more universal.

Right. In fact this has been discussed in the past. In principle, the
idea was to attach to genpd at device registration, rather than at
driver probe.

Although, this is not very easy to implement - and it seems like the
churn needed has not really been worth it. At least so far.

> Since for today we assume that device should be suspended at the time of
> the detachment (if the default OPP state isn't used), it may be better
> to add a noisy warning message if pstate!=0, keeping the state untouched
> if it's not zero.

That would just be very silly in my opinion.

When the device is detached (suspended or not), it may cause its PM
domain to be powered off - and there is really nothing we can do about
that from the genpd point of view.

As stated, the only current short-term solution is to avoid detaching
the device. Anything else would just be papering over the issue.

Kind regards
Uffe
Dmitry Osipenko Sept. 5, 2021, 8:26 a.m. UTC | #5
03.09.2021 17:03, Ulf Hansson wrote:
> That would just be very silly in my opinion.
>
> When the device is detached (suspended or not), it may cause it's PM
> domain to be powered off - and there is really nothing we can do about
> that from the genpd point of view.
>
> As stated, the only current short term solution is to avoid detaching
> the device. Anything else, would just be papering of the issue.

What about re-evaluating the performance state of the domain after
detachment instead of setting the state to zero? This way the PD driver may
take action on detachment if the performance state isn't zero, before the
hardware crashes; for example, it may emit a warning.
Ulf Hansson Sept. 6, 2021, 10:24 a.m. UTC | #6
On Sun, 5 Sept 2021 at 10:26, Dmitry Osipenko <digetx@gmail.com> wrote:
> What about to re-evaluate the performance state of the domain after
> detachment instead of setting the state to zero?

I am not suggesting to set the performance state of the genpd to zero,
but to drop a potential vote for a performance state for the *device*
that is about to be detached.

Calling genpd_set_performance_state(dev, 0), during detach will have
the same effect as triggering a re-evaluation of the performance state
for the genpd, but after the detach.
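
The equivalence claimed above can be sketched in a small user-space model, assuming the aggregate is the highest vote and votes are unsigned, so a zero vote can never raise it. The `toy_*` helpers are hypothetical and only illustrate the argument; they are not genpd code.

```c
#include <assert.h>
#include <stddef.h>

/* Highest vote in votes[0..n), ignoring index 'skip' (pass n to skip none). */
static unsigned int toy_max_vote(const unsigned int *votes, size_t n,
                                 size_t skip)
{
    unsigned int state = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (i != skip && votes[i] > state)
            state = votes[i];
    return state;
}

/* Variant 1: zero the leaving device's vote, then aggregate over everyone. */
static unsigned int toy_drop_vote_then_aggregate(unsigned int *votes, size_t n,
                                                 size_t leaving)
{
    assert(leaving < n);
    votes[leaving] = 0;
    return toy_max_vote(votes, n, n);
}

/* Variant 2: take the device out first, then aggregate over the rest. */
static unsigned int toy_detach_then_aggregate(const unsigned int *votes,
                                              size_t n, size_t leaving)
{
    assert(leaving < n);
    return toy_max_vote(votes, n, leaving);
}
```

Both variants end up with the same domain state; the difference discussed in this thread is only about whether that resulting state is safe for hardware that stays active.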

> This way PD driver may
> take an action on detachment if performance isn't zero, before hardware
> is crashed, for example it may emit a warning.

Not sure I got that. Exactly when do you want to emit a warning and
for what reason?

Do you want to add a check somewhere to see if
'gpd_data->performance_state' is non zero - and then print a warning?

Kind regards
Uffe
Dmitry Osipenko Sept. 6, 2021, 2:11 p.m. UTC | #7
06.09.2021 13:24, Ulf Hansson wrote:
> I am not suggesting to set the performance state of the genpd to zero,
> but to drop a potential vote for a performance state for the *device*
> that is about to be detached.

By removing the vote of the *device*, you will drop the performance
state of the genpd. If the device is active and it's wrong to drop its
state, then you may cause damage.

> Calling genpd_set_performance_state(dev, 0), during detach will have
> the same effect as triggering a re-evaluation of the performance state
> for the genpd, but after the detach.

Yes

>> This way PD driver may
>> take an action on detachment if performance isn't zero, before hardware
>> is crashed, for example it may emit a warning.
>
> Not sure I got that. Exactly when do you want to emit a warning and
> for what reason?
>
> Do you want to add a check somewhere to see if
> 'gpd_data->performance_state' is non zero - and then print a warning?

I want to check the 'gpd_data->performance_state' from the detachment
callback and emit the warning + lock further performance changes in the
PD driver since it's an error condition.
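
A minimal user-space sketch of that proposal, assuming a hypothetical PD driver that sees the device's vote in its detach callback. The `toy_pd*` types and functions are made up for illustration and do not correspond to the real genpd callbacks or to any existing driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical per-device data, loosely modelled on the thread's 'gpd_data'. */
struct toy_pd_data {
    unsigned int performance_state;  /* the device's current vote */
};

/* Hypothetical PD driver state. */
struct toy_pd {
    unsigned int performance_state;  /* state currently programmed */
    int locked;                      /* set once the error condition is seen */
};

/* Called by the toy framework right before it forgets about the device. */
static void toy_pd_detach_dev(struct toy_pd *pd, const struct toy_pd_data *data)
{
    assert(pd && data);
    if (data->performance_state == 0)
        return;  /* vote already dropped, nothing to complain about */

    fprintf(stderr, "toy_pd: device detached with pstate %u, locking domain\n",
            data->performance_state);
    pd->locked = 1;
}

/* Reject every performance state change once the domain is locked. */
static int toy_pd_set_performance_state(struct toy_pd *pd, unsigned int state)
{
    assert(pd);
    if (pd->locked)
        return -EBUSY;
    pd->performance_state = state;
    return 0;
}
```

Note that the lock also rejects requests to raise the state, which is one reason such a lock may be considered fragile when other devices share the domain.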
Ulf Hansson Sept. 6, 2021, 5:34 p.m. UTC | #8
On Mon, 6 Sept 2021 at 16:11, Dmitry Osipenko <digetx@gmail.com> wrote:
> I want to check the 'gpd_data->performance_state' from the detachment
> callback and emit the warning + lock further performance changes in the
> PD driver since it's a error condition.

Alright, so if I understand correctly, you intend to do the check for
the "error condition" of the device in the genpd->detach_dev()
callback?

What exactly do you intend to do beyond this point, if you detect the
"error condition"? Locking further changes of the performance state
seems fragile too, especially if some other device/driver requires the
performance state to be raised. It sounds like you simply need to call
BUG_ON() then?

Also note that a very similar problem exists *before* the device gets
attached in the first place. More precisely, nothing prevents the
performance state from being set to a non-compatible value for an
always-on HW/device that hasn't been attached yet. So maybe you need
to set the maximum performance state at genpd initialization, then
use the ->sync_state() callback to verify that all consumers have been
attached to the genpd provider, before allowing the state to be
changed/lowered?
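
The suggestion in the last paragraph could look roughly like the following user-space sketch, assuming the provider knows how many consumers to expect. The `toy_provider*` names and the explicit consumer counter are hypothetical simplifications; in the kernel, the driver core decides when ->sync_state() is invoked.

```c
#include <assert.h>

/* Hypothetical provider that holds the maximum state until it is synced. */
struct toy_provider {
    unsigned int max_state;           /* safe state used during early boot */
    unsigned int requested;           /* aggregated consumer request */
    unsigned int expected_consumers;
    unsigned int attached_consumers;
    int synced;                       /* set once all consumers are attached */
};

static void toy_provider_init(struct toy_provider *p, unsigned int max_state,
                              unsigned int expected)
{
    p->max_state = max_state;
    p->requested = 0;
    p->expected_consumers = expected;
    p->attached_consumers = 0;
    p->synced = (expected == 0);
}

static void toy_provider_attach(struct toy_provider *p)
{
    assert(p->attached_consumers < p->expected_consumers);
    p->attached_consumers++;
}

/* Stand-in for a sync_state style notification from the framework. */
static void toy_provider_sync_state(struct toy_provider *p)
{
    if (p->attached_consumers == p->expected_consumers)
        p->synced = 1;
}

/* State to program: the maximum until synced, the request afterwards. */
static unsigned int toy_provider_effective_state(const struct toy_provider *p)
{
    return p->synced ? p->requested : p->max_state;
}
```

Until the sync point, a low request from an early consumer cannot pull the state down under hardware whose driver has not attached yet.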

Kind regards
Uffe
Dmitry Osipenko Sept. 6, 2021, 7:33 p.m. UTC | #9
06.09.2021 20:34, Ulf Hansson wrote:
> Alright, so if I understand correctly, you intend to do the check for
> the "error condition" of the device in the genpd->detach_dev()
> callback?

Yes

> What exactly do you intend to do beyond this point, if you detect the
> "error condition"? Locking further changes of the performance state
> seems fragile too, especially if some other device/driver requires the
> performance state to be raised. It sounds like you simply need to call
> BUG_ON() then?

I can lock it to a high performance state.

> Also note that a very similar problem exists, *before* the device gets
> attached in the first place. More precisely, nothing prevents the
> performance state from being set to a non-compatible value for an
> always-on HW/device that hasn't been attached yet. So maybe you need
> to set the maximum performance state at genpd initializations, then
> use the ->sync_state() callback to very that all consumers have been
> attached to the genpd provider, before allowing the state to be
> changed/lowered?

That is already done by the PD driver.

https://elixir.bootlin.com/linux/latest/source/drivers/soc/tegra/pmc.c#L3790
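
The sanity check discussed above (inspect the device's vote in the detach callback, warn, then pin the domain at a high state) can be sketched as a small userspace model. This is only an illustration under stated assumptions: `toy_pd`, `toy_pd_detach_dev()`, `toy_pd_set_performance_state()` and `TOY_LOCK_PSTATE` are hypothetical names, not kernel APIs, and the `dev_vote` argument merely stands in for 'gpd_data->performance_state'.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical "safe" state the domain is pinned to after an unsafe detach. */
#define TOY_LOCK_PSTATE 4U

/* Toy stand-in for a PD driver's private state; not a kernel structure. */
struct toy_pd {
	unsigned int performance_state;	/* state currently applied */
	int locked;			/* set once an unsafe detach was seen */
};

/* Model of the provider's state setter: requests are ignored while locked. */
static int toy_pd_set_performance_state(struct toy_pd *pd, unsigned int state)
{
	assert(state <= TOY_LOCK_PSTATE);
	if (pd->locked)
		return 0;	/* keep the pinned high state */
	pd->performance_state = state;
	return 0;
}

/*
 * Model of a detach callback. dev_vote mirrors the device's
 * performance state vote at the time of detachment.
 */
static void toy_pd_detach_dev(struct toy_pd *pd, const char *name,
			      unsigned int dev_vote)
{
	if (dev_vote == 0)
		return;		/* vote already dropped, nothing to do */

	fprintf(stderr,
		"%s: detached with performance state %u, locking domain at %u\n",
		name, dev_vote, TOY_LOCK_PSTATE);
	pd->performance_state = TOY_LOCK_PSTATE;
	pd->locked = 1;
}
```

In this model a detach with a zero vote leaves the domain untouched, while a detach with a non-zero vote raises the domain to `TOY_LOCK_PSTATE` and makes later requests no-ops. Whether ignoring later requests is acceptable for other consumers is exactly the fragility raised in the quoted reply.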
Ulf Hansson Sept. 7, 2021, 10:16 a.m. UTC | #10
On Mon, 6 Sept 2021 at 21:33, Dmitry Osipenko <digetx@gmail.com> wrote:
>

> 06.09.2021 20:34, Ulf Hansson wrote:

> > On Mon, 6 Sept 2021 at 16:11, Dmitry Osipenko <digetx@gmail.com> wrote:

> >>

> >> 06.09.2021 13:24, Ulf Hansson wrote:

> >>> On Sun, 5 Sept 2021 at 10:26, Dmitry Osipenko <digetx@gmail.com> wrote:

> >>>>

> >>>> 03.09.2021 17:03, Ulf Hansson wrote:

> >>>>> On Fri, 3 Sept 2021 at 11:58, Dmitry Osipenko <digetx@gmail.com> wrote:

> >>>>>>

> >>>>>> 03.09.2021 11:22, Ulf Hansson wrote:

> >>>>>>> On Fri, 3 Sept 2021 at 08:01, Dmitry Osipenko <digetx@gmail.com> wrote:

> >>>>>>>>

> >>>>>>>> 02.09.2021 13:16, Ulf Hansson wrote:

> >>>>>>>>> When a device is detached from its genpd, genpd loses track of the device,

> >>>>>>>>> including its performance state vote that may have been requested for it.

> >>>>>>>>>

> >>>>>>>>> Rather than relying on the consumer driver to drop the performance state

> >>>>>>>>> vote for its device, let's do it internally in genpd when the device is

> >>>>>>>>> getting detached. In this way, we make sure that the aggregation of the

> >>>>>>>>> votes in genpd becomes correct.

> >>>>>>>>

> >>>>>>>> This is a dangerous behaviour in a case where performance state

> >>>>>>>> represents voltage. If hardware is kept active on detachment, say it's

> >>>>>>>> always-on, then it may be a disaster to drop the voltage for the active

> >>>>>>>> hardware.

> >>>>>>>>

> >>>>>>>> It's safe to drop performance state only if you assume that there is a

> >>>>>>>> firmware behind kernel which has its own layer of performance management

> >>>>>>>> and it will prevent the disaster by saying 'nope, I'm not doing this'.

> >>>>>>>>

> >>>>>>>> The performance state should be persistent for a device and it should be

> >>>>>>>> controlled in a conjunction with runtime PM. If platform wants to drop

> >>>>>>>> performance state to zero on detachment, then this behaviour should be

> >>>>>>>> specific to that platform.

> >>>>>>>

> >>>>>>> I understand your concern, but at this point, genpd can't help to fix this.

> >>>>>>>

> >>>>>>> Genpd has no information about the device, unless it's attached to it.

> >>>>>>> For now and for these always on HWs, we simply need to make sure the

> >>>>>>> device stays attached, in one way or the other.

> >>>>>>

> >>>>>> This indeed requires redesigning GENPD to make it more coupled with a

> >>>>>> device, but this is not a real problem for any of the current API users

> >>>>>> AFAIK. Ideally the state should be persistent to make API more universal.

> >>>>>

> >>>>> Right. In fact this has been discussed in the past. In principle, the

> >>>>> idea was to attach to genpd at device registration, rather than at

> >>>>> driver probe.

> >>>>>

> >>>>> Although, this is not very easy to implement - and it seems like the

> >>>>> churns to do, have not been really worth it. At least so far.

> >>>>>

> >>>>>>

> >>>>>> Since for today we assume that device should be suspended at the time of

> >>>>>> the detachment (if the default OPP state isn't used), it may be better

> >>>>>> to add a noisy warning message if pstate!=0, keeping the state untouched

> >>>>>> if it's not zero.

> >>>>>

> >>>>> That would just be very silly in my opinion.

> >>>>>

> >>>>> When the device is detached (suspended or not), it may cause its PM

> >>>>> domain to be powered off - and there is really nothing we can do about

> >>>>> that from the genpd point of view.

> >>>>>

> >>>>> As stated, the only current short term solution is to avoid detaching

> >>>>> the device. Anything else would just be papering over the issue.

> >>>>

> >>>> What about re-evaluating the performance state of the domain after

> >>>> detachment instead of setting the state to zero?

> >>>

> >>> I am not suggesting to set the performance state of the genpd to zero,

> >>> but to drop a potential vote for a performance state for the *device*

> >>> that is about to be detached.

> >>

> >> By removing the vote of the *device*, you will drop the performance

> >> state of the genpd. If device is active and it's wrong to drop its

> >> state, then you may cause the damage.

> >>

> >>> Calling genpd_set_performance_state(dev, 0), during detach will have

> >>> the same effect as triggering a re-evaluation of the performance state

> >>> for the genpd, but after the detach.

> >>

> >> Yes

> >>

> >>>> This way PD driver may

> >>>> take an action on detachment if performance isn't zero, before hardware

> >>>> is crashed, for example it may emit a warning.

> >>>

> >>> Not sure I got that. Exactly when do you want to emit a warning and

> >>> for what reason?

> >>>

> >>> Do you want to add a check somewhere to see if

> >>> 'gpd_data->performance_state' is non zero - and then print a warning?

> >>

> >> I want to check the 'gpd_data->performance_state' from the detachment

> >> callback and emit the warning + lock further performance changes in the

> >> PD driver since it's an error condition.

> >

> > Alright, so if I understand correctly, you intend to do the check for

> > the "error condition" of the device in the genpd->detach_dev()

> > callback?

>

> Yes


Okay.

>

> > What exactly do you intend to do beyond this point, if you detect the

> > "error condition"? Locking further changes of the performance state

> > seems fragile too, especially if some other device/driver requires the

> > performance state to be raised. It sounds like you simply need to call

> > BUG_ON() then?

>

> I can lock it to a high performance state.


Alright.

>

> > Also note that a very similar problem exists, *before* the device gets

> > attached in the first place. More precisely, nothing prevents the

> > performance state from being set to a non-compatible value for an

> > always-on HW/device that hasn't been attached yet. So maybe you need

> > to set the maximum performance state at genpd initializations, then

> > use the ->sync_state() callback to verify that all consumers have been

> > attached to the genpd provider, before allowing the state to be

> > changed/lowered?

>

> That is already done by the PD driver.

>

> https://elixir.bootlin.com/linux/latest/source/drivers/soc/tegra/pmc.c#L3790


Yes, I already knew that, but forgot it. :-) Thanks for the pointer.
Let me rethink the approach.

In a way, it kind of sounds like this is a generic problem - so
perhaps we should think of adding a ->withdraw_sync_state() callback
that can be assigned by provider drivers, to get informed when a
consumer driver is getting unbound.

Kind regards
Uffe
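
To make the two ideas in this message concrete, here is a minimal userspace model: the provider holds the maximum state until a sync_state event reports that all consumers are attached, and a withdraw event puts the gate back when a consumer unbinds. Everything here is an assumption for illustration. `gate_provider`, `gate_sync_state()` and `GATE_MAX_PSTATE` are made-up names, and `gate_withdraw_sync_state()` models the ->withdraw_sync_state() callback that is only being proposed in this thread and does not exist in the kernel.

```c
#include <assert.h>

/* Hypothetical highest performance state of the domain. */
#define GATE_MAX_PSTATE 4U

/* Toy provider state; not a kernel structure. */
struct gate_provider {
	unsigned int requested;	/* aggregated request from consumers */
	unsigned int applied;	/* what the hardware would be given */
	int synced;		/* non-zero once all consumers are attached */
};

/* Until synced, hold the maximum state regardless of requests. */
static void gate_apply(struct gate_provider *p)
{
	p->applied = p->synced ? p->requested : GATE_MAX_PSTATE;
}

/* Provider initialization: start unsynced, hence at the maximum state. */
static void gate_init(struct gate_provider *p)
{
	p->requested = 0;
	p->synced = 0;
	gate_apply(p);
}

/* Record a new aggregated request; it only takes effect once synced. */
static void gate_request(struct gate_provider *p, unsigned int state)
{
	assert(state <= GATE_MAX_PSTATE);
	p->requested = state;
	gate_apply(p);
}

/* Model of ->sync_state(): all consumers are attached, lowering is allowed. */
static void gate_sync_state(struct gate_provider *p)
{
	p->synced = 1;
	gate_apply(p);
}

/* HYPOTHETICAL: models the proposed ->withdraw_sync_state() notification. */
static void gate_withdraw_sync_state(struct gate_provider *p)
{
	p->synced = 0;
	gate_apply(p);
}
```

In this model a request of 1 made before sync leaves the applied state at `GATE_MAX_PSTATE`, drops it to 1 after `gate_sync_state()`, and a later `gate_withdraw_sync_state()` raises it back to the maximum.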
Dmitry Osipenko Sept. 9, 2021, 1:48 p.m. UTC | #11
07.09.2021 13:16, Ulf Hansson wrote:
...
>>> Also note that a very similar problem exists, *before* the device gets

>>> attached in the first place. More precisely, nothing prevents the

>>> performance state from being set to a non-compatible value for an

>>> always-on HW/device that hasn't been attached yet. So maybe you need

>>> to set the maximum performance state at genpd initializations, then

>>> use the ->sync_state() callback to verify that all consumers have been

>>> attached to the genpd provider, before allowing the state to be

>>> changed/lowered?

>>

>> That is already done by the PD driver.

>>

>> https://elixir.bootlin.com/linux/latest/source/drivers/soc/tegra/pmc.c#L3790

> 

> Yes, I already knew that, but forgot it. :-) Thanks for the pointer.

> Let me rethink the approach.

> 

> In a way, it kind of sounds like this is a generic problem - so

> perhaps we should think of adding a ->withdraw_sync_state() callback

> that can be assigned by provider drivers, to get informed when a

> consumer driver is getting unbound.


Not sure, doesn't feel to me that this is necessary for today. A bit too
cumbersome for a simple sanity-check, IMO.
Ulf Hansson Sept. 9, 2021, 2:45 p.m. UTC | #12
On Thu, 9 Sept 2021 at 15:48, Dmitry Osipenko <digetx@gmail.com> wrote:
>

> 07.09.2021 13:16, Ulf Hansson wrote:

> ...

> >>> Also note that a very similar problem exists, *before* the device gets

> >>> attached in the first place. More precisely, nothing prevents the

> >>> performance state from being set to a non-compatible value for an

> >>> always-on HW/device that hasn't been attached yet. So maybe you need

> >>> to set the maximum performance state at genpd initializations, then

> >>> use the ->sync_state() callback to verify that all consumers have been

> >>> attached to the genpd provider, before allowing the state to be

> >>> changed/lowered?

> >>

> >> That is already done by the PD driver.

> >>

> >> https://elixir.bootlin.com/linux/latest/source/drivers/soc/tegra/pmc.c#L3790

> >

> > Yes, I already knew that, but forgot it. :-) Thanks for the pointer.

> > Let me rethink the approach.

> >

> > In a way, it kind of sounds like this is a generic problem - so

> > perhaps we should think of adding a ->withdraw_sync_state() callback

> > that can be assigned by provider drivers, to get informed when a

> > consumer driver is getting unbound.

>

> Not sure, doesn't feel to me that this is necessary for today. A bit too

> cumbersome for a simple sanity-check, IMO.


Maybe, but I can bring it up with the fw_devlinks people to see what they think.

In any case, we should not move forward with $subject patch as is.

Let me think about it.

Kind regards
Uffe

Patch

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 5db704f02e71..278e040f607f 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1665,6 +1665,8 @@  static int genpd_remove_device(struct generic_pm_domain *genpd,
 		goto out;
 	}
 
+	genpd_set_performance_state(dev, 0);
+
 	genpd->device_count--;
 	genpd->max_off_time_changed = true;
 
@@ -2604,12 +2606,6 @@  static void genpd_dev_pm_detach(struct device *dev, bool power_off)
 
 	dev_dbg(dev, "removing from PM domain %s\n", pd->name);
 
-	/* Drop the default performance state */
-	if (dev_gpd_data(dev)->default_pstate) {
-		dev_pm_genpd_set_performance_state(dev, 0);
-		dev_gpd_data(dev)->default_pstate = 0;
-	}
-
 	for (i = 1; i < GENPD_RETRY_MAX_MS; i <<= 1) {
 		ret = genpd_remove_device(pd, dev);
 		if (ret != -EAGAIN)
@@ -2702,7 +2698,6 @@  static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
 		ret = dev_pm_genpd_set_performance_state(dev, pstate);
 		if (ret)
 			goto err;
-		dev_gpd_data(dev)->default_pstate = pstate;
 	}
 	return 1;
 
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 67017c9390c8..21a0577305ef 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -198,7 +198,6 @@  struct generic_pm_domain_data {
 	struct notifier_block *power_nb;
 	int cpu;
 	unsigned int performance_state;
-	unsigned int default_pstate;
 	unsigned int rpm_pstate;
 	ktime_t	next_wakeup;
 	void *data;
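
The effect debated in the thread, that dropping one device's vote at detach re-evaluates the aggregated state of the domain, can be illustrated with a small userspace model. It is a sketch under assumptions, not the genpd implementation: the `toy_*` names are hypothetical, and the aggregate is assumed to be simply the highest vote among attached devices (subdomains are ignored).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEVS 8

/* Toy stand-ins for the genpd structures; not the kernel's definitions. */
struct toy_dev {
	unsigned int performance_state;	/* this device's vote */
	int attached;
};

struct toy_genpd {
	struct toy_dev *devs[TOY_MAX_DEVS];
	size_t device_count;
	unsigned int performance_state;	/* aggregated state of the domain */
};

/* Recompute the domain state as the highest vote among attached devices. */
static void toy_reevaluate(struct toy_genpd *pd)
{
	unsigned int state = 0;

	for (size_t i = 0; i < pd->device_count; i++)
		if (pd->devs[i]->performance_state > state)
			state = pd->devs[i]->performance_state;
	pd->performance_state = state;
}

/* Attach a device with an initial vote; returns -1 if the table is full. */
static int toy_attach(struct toy_genpd *pd, struct toy_dev *dev,
		      unsigned int vote)
{
	if (pd->device_count == TOY_MAX_DEVS)
		return -1;
	dev->performance_state = vote;
	dev->attached = 1;
	pd->devs[pd->device_count++] = dev;
	toy_reevaluate(pd);
	return 0;
}

/* Change an attached device's vote and re-aggregate. */
static void toy_set_performance_state(struct toy_genpd *pd,
				      struct toy_dev *dev, unsigned int vote)
{
	assert(dev->attached);
	dev->performance_state = vote;
	toy_reevaluate(pd);
}

/* Detach in the order the patch uses: drop the vote, then forget the device. */
static void toy_detach(struct toy_genpd *pd, struct toy_dev *dev)
{
	toy_set_performance_state(pd, dev, 0);
	for (size_t i = 0; i < pd->device_count; i++) {
		if (pd->devs[i] == dev) {
			pd->devs[i] = pd->devs[--pd->device_count];
			break;
		}
	}
	dev->attached = 0;
}
```

With two devices voting 3 and 1, detaching the first lowers the modelled domain state from 3 to 1. That is the drop the review objects to when the detached hardware is still active.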