diff mbox series

[1/2] OPP: Use _set_opp_level() for single genpd case

Message ID f709e9e273004be43efe3a2854a7e7b51a777f99.1697710527.git.viresh.kumar@linaro.org
State Superseded
Headers show
Series [1/2] OPP: Use _set_opp_level() for single genpd case | expand

Commit Message

Viresh Kumar Oct. 19, 2023, 10:22 a.m. UTC
There are two genpd (as required-opp) cases that we need to handle,
devices with a single genpd and ones with multiple genpds.

The multiple genpds case is clear, where the OPP core calls
dev_pm_domain_attach_by_name() for them and uses the virtual devices
returned by this helper to call dev_pm_domain_set_performance_state()
later to change the performance state.

The single genpd case however requires special handling as we need to
use the same `dev` structure (instead of a virtual one provided by genpd
core) for setting the performance state via
dev_pm_domain_set_performance_state().

As we move towards more generic code to take care of the required OPPs,
where we will recursively call dev_pm_opp_set_opp() for all the required
OPPs, the above special case becomes a problem.

Eventually we want to handle all performance state changes via
_set_opp_level(), so lets move the single genpd case to that right away.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/opp/core.c |  6 ++++--
 drivers/opp/of.c   | 25 ++++++++++++++++++++++---
 2 files changed, 26 insertions(+), 5 deletions(-)

Comments

Viresh Kumar Oct. 25, 2023, 6:54 a.m. UTC | #1
On 19-10-23, 13:16, Ulf Hansson wrote:
> On Thu, 19 Oct 2023 at 12:22, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > +static int _link_required_opps(struct dev_pm_opp *opp, struct opp_table *opp_table,
> >                                struct opp_table *required_table, int index)
> >  {
> >         struct device_node *np;
> > @@ -314,6 +314,25 @@ static int _link_required_opps(struct dev_pm_opp *opp,
> >                 return -ENODEV;
> >         }
> >
> > +       /*
> > +        * There are two genpd (as required-opp) cases that we need to handle,
> > +        * devices with a single genpd and ones with multiple genpds.
> > +        *
> > +        * The single genpd case requires special handling as we need to use the
> > +        * same `dev` structure (instead of a virtual one provided by genpd
> > +        * core) for setting the performance state. Lets treat this as a case
> > +        * where the OPP's level is directly available without required genpd
> > +        * link in the DT.
> > +        *
> > +        * Just update the `level` with the right value, which
> > +        * dev_pm_opp_set_opp() will take care of in the normal path itself.
> > +        */
> > +       if (required_table->is_genpd && opp_table->required_opp_count == 1 &&
> > +           !opp_table->genpd_virt_devs) {
> > +               if (!WARN_ON(opp->level))
> 
> Hmm. Doesn't this introduce an unnecessary limitation?
> 
> An opp node that has a required-opps phande, may have "opp-hz",
> "opp-microvolt", etc. Why would we not allow the "opp-level" to be
> used too?

Coming back to this, why would we ever want a device to have "opp-level" and
"required-opp" (set to genpd's table) ? That would mean we will call:

dev_pm_domain_set_performance_state() twice to set different level values.

And so it should be safe to force that if required-opp table is set to a genpd,
then opp-level shouldn't be set. Maybe we should fail in that case, which isn't
happening currently.
Ulf Hansson Oct. 25, 2023, 10:40 a.m. UTC | #2
On Wed, 25 Oct 2023 at 08:55, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 19-10-23, 13:16, Ulf Hansson wrote:
> > On Thu, 19 Oct 2023 at 12:22, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > +static int _link_required_opps(struct dev_pm_opp *opp, struct opp_table *opp_table,
> > >                                struct opp_table *required_table, int index)
> > >  {
> > >         struct device_node *np;
> > > @@ -314,6 +314,25 @@ static int _link_required_opps(struct dev_pm_opp *opp,
> > >                 return -ENODEV;
> > >         }
> > >
> > > +       /*
> > > +        * There are two genpd (as required-opp) cases that we need to handle,
> > > +        * devices with a single genpd and ones with multiple genpds.
> > > +        *
> > > +        * The single genpd case requires special handling as we need to use the
> > > +        * same `dev` structure (instead of a virtual one provided by genpd
> > > +        * core) for setting the performance state. Lets treat this as a case
> > > +        * where the OPP's level is directly available without required genpd
> > > +        * link in the DT.
> > > +        *
> > > +        * Just update the `level` with the right value, which
> > > +        * dev_pm_opp_set_opp() will take care of in the normal path itself.
> > > +        */
> > > +       if (required_table->is_genpd && opp_table->required_opp_count == 1 &&
> > > +           !opp_table->genpd_virt_devs) {
> > > +               if (!WARN_ON(opp->level))
> >
> > Hmm. Doesn't this introduce an unnecessary limitation?
> >
> > An opp node that has a required-opps phande, may have "opp-hz",
> > "opp-microvolt", etc. Why would we not allow the "opp-level" to be
> > used too?
>
> Coming back to this, why would we ever want a device to have "opp-level" and
> "required-opp" (set to genpd's table) ? That would mean we will call:
>
> dev_pm_domain_set_performance_state() twice to set different level values.

Yes - and that would be weird, especially since the PM domain (genpd)
is already managing the aggregation and propagation to parent domains.

I guess I got a bit confused by the commit message for patch2/2, where
it sounded like you were striving towards introducing recursive calls
to set OPPs. Having a closer look, I realize that isn't the case,
which I think makes sense.

>
> And so it should be safe to force that if required-opp table is set to a genpd,
> then opp-level shouldn't be set. Maybe we should fail in that case, which isn't
> happening currently.

Yes, it seems better to fail earlier during the OF parsing of the
required-opps or when adding an OPP dynamically. In that way, the
WARN_ON above could be removed.

That said, sorry for the noise and either way, feel free to add (for
$subject patch):

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe
Viresh Kumar Oct. 25, 2023, 10:48 a.m. UTC | #3
On 25-10-23, 12:40, Ulf Hansson wrote:
> On Wed, 25 Oct 2023 at 08:55, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 19-10-23, 13:16, Ulf Hansson wrote:
> > > On Thu, 19 Oct 2023 at 12:22, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > +       if (required_table->is_genpd && opp_table->required_opp_count == 1 &&
> > > > +           !opp_table->genpd_virt_devs) {
> > > > +               if (!WARN_ON(opp->level))
> > >
> > > Hmm. Doesn't this introduce an unnecessary limitation?
> > >
> > > An opp node that has a required-opps phande, may have "opp-hz",
> > > "opp-microvolt", etc. Why would we not allow the "opp-level" to be
> > > used too?
> >
> > Coming back to this, why would we ever want a device to have "opp-level" and
> > "required-opp" (set to genpd's table) ? That would mean we will call:
> >
> > dev_pm_domain_set_performance_state() twice to set different level values.
> 
> Yes - and that would be weird, especially since the PM domain (genpd)
> is already managing the aggregation and propagation to parent domains.
> 
> I guess I got a bit confused by the commit message for patch2/2, where
> it sounded like you were striving towards introducing recursive calls
> to set OPPs. Having a closer look, I realize that isn't the case,
> which I think makes sense.
> 
> >
> > And so it should be safe to force that if required-opp table is set to a genpd,
> > then opp-level shouldn't be set. Maybe we should fail in that case, which isn't
> > happening currently.
> 
> Yes, it seems better to fail earlier during the OF parsing of the
> required-opps or when adding an OPP dynamically. In that way, the
> WARN_ON above could be removed.

For now I will leave the WARN as is, will reconsider if it makes more
sense to fail and return early. And send a separate patch for that.

> That said, sorry for the noise and either way, feel free to add (for
> $subject patch):
> 
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Thanks.
Stephan Gerhold Oct. 25, 2023, 1:47 p.m. UTC | #4
On Wed, Oct 25, 2023 at 12:40:26PM +0200, Ulf Hansson wrote:
> On Wed, 25 Oct 2023 at 08:55, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 19-10-23, 13:16, Ulf Hansson wrote:
> > > On Thu, 19 Oct 2023 at 12:22, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > +static int _link_required_opps(struct dev_pm_opp *opp, struct opp_table *opp_table,
> > > >                                struct opp_table *required_table, int index)
> > > >  {
> > > >         struct device_node *np;
> > > > @@ -314,6 +314,25 @@ static int _link_required_opps(struct dev_pm_opp *opp,
> > > >                 return -ENODEV;
> > > >         }
> > > >
> > > > +       /*
> > > > +        * There are two genpd (as required-opp) cases that we need to handle,
> > > > +        * devices with a single genpd and ones with multiple genpds.
> > > > +        *
> > > > +        * The single genpd case requires special handling as we need to use the
> > > > +        * same `dev` structure (instead of a virtual one provided by genpd
> > > > +        * core) for setting the performance state. Lets treat this as a case
> > > > +        * where the OPP's level is directly available without required genpd
> > > > +        * link in the DT.
> > > > +        *
> > > > +        * Just update the `level` with the right value, which
> > > > +        * dev_pm_opp_set_opp() will take care of in the normal path itself.
> > > > +        */
> > > > +       if (required_table->is_genpd && opp_table->required_opp_count == 1 &&
> > > > +           !opp_table->genpd_virt_devs) {
> > > > +               if (!WARN_ON(opp->level))
> > >
> > > Hmm. Doesn't this introduce an unnecessary limitation?
> > >
> > > An opp node that has a required-opps phande, may have "opp-hz",
> > > "opp-microvolt", etc. Why would we not allow the "opp-level" to be
> > > used too?
> >
> > Coming back to this, why would we ever want a device to have "opp-level" and
> > "required-opp" (set to genpd's table) ? That would mean we will call:
> >
> > dev_pm_domain_set_performance_state() twice to set different level values.
> 
> Yes - and that would be weird, especially since the PM domain (genpd)
> is already managing the aggregation and propagation to parent domains.
> 

FWIW I'm hitting this WARNing when trying to set up the parent domain
setup for CPR->RPMPD(MX) on MSM8916 that I discussed with Uffe recently
[1]. I know, me and all my weird OPP setups. :'D

Basically, I have cpufreq voting for performance states of the CPR genpd
(via required-opps). CPR is supposed to have <&rpmpd MSM8916_VDDMX_AO>
as parent genpd and translates to the parent performance state using the
"required-opps" in the *CPR* OPP table:

	cpr: power-controller@b018000 {
		compatible = "qcom,msm8916-cpr", "qcom,cpr";
		reg = <0x0b018000 0x1000>;
		/* ... */
		#power-domain-cells = <0>;
		operating-points-v2 = <&cpr_opp_table>;
		/* Supposed to be parent domain, not consumer */
		power-domains = <&rpmpd MSM8916_VDDMX_AO>;

		cpr_opp_table: opp-table {
			compatible = "operating-points-v2-qcom-level";

			cpr_opp1: opp1 {
				opp-level = <1>;
				qcom,opp-fuse-level = <1>;
				required-opps = <&rpmpd_opp_svs_soc>;
			};
			cpr_opp2: opp2 {
				opp-level = <2>;
				qcom,opp-fuse-level = <2>;
				required-opps = <&rpmpd_opp_nom>;
			};
			cpr_opp3: opp3 {
				opp-level = <3>;
				qcom,opp-fuse-level = <3>;
				required-opps = <&rpmpd_opp_super_turbo>;
			};
		};
	};

There are two problems with this:

 1. (Unrelated to $subject patch)
    Since there is only a single entry in "power-domains", the genpd
    core code automatically attaches the CPR platform device as consumer
    of the VDDMX_AO power domain. I don't want this, I want it to become
    child of the VDDMX_AO genpd.

    I added some hacky code to workaround this. One option that works is
    to add a second dummy entry to "power-domains", which will prevent
    the genpd core from attaching the power domain:

    	power-domains = <&rpmpd MSM8916_VDDMX_AO>, <0>;

    The other option is detaching the power domain again in probe(),
    after setting it up as parent domain:

	struct of_phandle_args parent, child;

	child.np = dev->of_node;
	child.args_count = 0;

	of_parse_phandle_with_args(dev->of_node, "power-domains",
 			           "#power-domain-cells", 0, &parent));
	of_genpd_add_subdomain(&parent, &child);

	/* Detach power domain since it's managed via the subdomain */
	dev_pm_domain_detach(dev, false);

    Is there a cleaner way to handle this? Mainly a question for Uffe.

 2. The OPP WARNing triggers with both variants because it just checks
    if "required-opps" has a single entry. I guess we need extra checks
    to exclude the "parent genpd" case compared to the "OPP" case.

	[    1.116244] WARNING: CPU: 2 PID: 36 at drivers/opp/of.c:331 _link_required_opps+0x180/0x1cc
	[    1.125897] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
	[    1.146887] pc : _link_required_opps+0x180/0x1cc
	[    1.146902] lr : _link_required_opps+0xdc/0x1cc
	[    1.276408] Call trace:
	[    1.283519]  _link_required_opps+0x180/0x1cc
	[    1.285779]  _of_add_table_indexed+0x61c/0xd40
	[    1.290292]  dev_pm_opp_of_add_table+0x10/0x18
	[    1.294546]  of_genpd_add_provider_simple+0x80/0x160
	[    1.298974]  cpr_probe+0x6a0/0x97c
	[    1.304092]  platform_probe+0x64/0xbc

It does seem to work correctly, with and without this patch. So I guess
another option might be to simply silence this WARN_ON(). :')

Thanks,
Stephan

[1]: https://lore.kernel.org/linux-pm/CAPDyKFoH5EOvRRKy-Bgp_B9B3rf=PUKK5N45s5PNgfBi55PaOQ@mail.gmail.com/
Viresh Kumar Oct. 25, 2023, 3:24 p.m. UTC | #5
On 25-10-23, 15:47, Stephan Gerhold wrote:
> FWIW I'm hitting this WARNing when trying to set up the parent domain
> setup for CPR->RPMPD(MX) on MSM8916 that I discussed with Uffe recently
> [1]. I know, me and all my weird OPP setups. :'D
> 
> Basically, I have cpufreq voting for performance states of the CPR genpd
> (via required-opps). CPR is supposed to have <&rpmpd MSM8916_VDDMX_AO>
> as parent genpd and translates to the parent performance state using the
> "required-opps" in the *CPR* OPP table:
> 
> 	cpr: power-controller@b018000 {
> 		compatible = "qcom,msm8916-cpr", "qcom,cpr";
> 		reg = <0x0b018000 0x1000>;
> 		/* ... */
> 		#power-domain-cells = <0>;
> 		operating-points-v2 = <&cpr_opp_table>;
> 		/* Supposed to be parent domain, not consumer */
> 		power-domains = <&rpmpd MSM8916_VDDMX_AO>;
> 
> 		cpr_opp_table: opp-table {
> 			compatible = "operating-points-v2-qcom-level";
> 
> 			cpr_opp1: opp1 {
> 				opp-level = <1>;
> 				qcom,opp-fuse-level = <1>;
> 				required-opps = <&rpmpd_opp_svs_soc>;
> 			};
> 			cpr_opp2: opp2 {
> 				opp-level = <2>;
> 				qcom,opp-fuse-level = <2>;
> 				required-opps = <&rpmpd_opp_nom>;
> 			};
> 			cpr_opp3: opp3 {
> 				opp-level = <3>;
> 				qcom,opp-fuse-level = <3>;
> 				required-opps = <&rpmpd_opp_super_turbo>;
> 			};
> 		};
> 	};

I have forgotten a bit about this usecase. How exactly does the
configurations work currently for this ? I mean genpd core must be
setting the vote finally for only one of them or something else ?
Stephan Gerhold Oct. 25, 2023, 4:16 p.m. UTC | #6
On Wed, Oct 25, 2023 at 08:54:31PM +0530, Viresh Kumar wrote:
> On 25-10-23, 15:47, Stephan Gerhold wrote:
> > FWIW I'm hitting this WARNing when trying to set up the parent domain
> > setup for CPR->RPMPD(MX) on MSM8916 that I discussed with Uffe recently
> > [1]. I know, me and all my weird OPP setups. :'D
> > 
> > Basically, I have cpufreq voting for performance states of the CPR genpd
> > (via required-opps). CPR is supposed to have <&rpmpd MSM8916_VDDMX_AO>
> > as parent genpd and translates to the parent performance state using the
> > "required-opps" in the *CPR* OPP table:
> > 
> > 	cpr: power-controller@b018000 {
> > 		compatible = "qcom,msm8916-cpr", "qcom,cpr";
> > 		reg = <0x0b018000 0x1000>;
> > 		/* ... */
> > 		#power-domain-cells = <0>;
> > 		operating-points-v2 = <&cpr_opp_table>;
> > 		/* Supposed to be parent domain, not consumer */
> > 		power-domains = <&rpmpd MSM8916_VDDMX_AO>;
> > 
> > 		cpr_opp_table: opp-table {
> > 			compatible = "operating-points-v2-qcom-level";
> > 
> > 			cpr_opp1: opp1 {
> > 				opp-level = <1>;
> > 				qcom,opp-fuse-level = <1>;
> > 				required-opps = <&rpmpd_opp_svs_soc>;
> > 			};
> > 			cpr_opp2: opp2 {
> > 				opp-level = <2>;
> > 				qcom,opp-fuse-level = <2>;
> > 				required-opps = <&rpmpd_opp_nom>;
> > 			};
> > 			cpr_opp3: opp3 {
> > 				opp-level = <3>;
> > 				qcom,opp-fuse-level = <3>;
> > 				required-opps = <&rpmpd_opp_super_turbo>;
> > 			};
> > 		};
> > 	};
> 
> I have forgotten a bit about this usecase. How exactly does the
> configurations work currently for this ? I mean genpd core must be
> setting the vote finally for only one of them or something else ?
> 

I'm not sure if I understand your question correctly. Basically, setting
<&rpmpd MSM8916_VDDMX_AO> as "parent genpd" of <&cpr> is supposed to
describe that there is a direct relationship between the performance
states of CPR and VDDMX. When changing the CPR performance state, VDDMX
should also be adjusted accordingly.

This is implemented in the genpd core in _genpd_set_performance_state().
It loops over the parent genpds, and re-evaluates the performance states
of each of them. Translation happens using genpd_xlate_performance_state()
which is just a direct call to dev_pm_opp_xlate_performance_state().
This will look up the required-opps from the OPP table above. However,
the genpd core calls ->set_performance_state() on the parent genpd
directly, so dev_pm_opp_set_opp() isn't involved in this case.

Overall the call sequence for a CPUfreq switch will look something like:

 - cpu0: dev_pm_opp_set_rate(998.4 MHz)
  - cpu0: _set_required_opps(opp-998400000)
   - genpd:1:cpu0: dev_pm_opp_set_opp(&cpr_opp3)
    - genpd:1:cpu0: _set_opp_level(&cpr_opp3)
     - cpr: _genpd_set_performance_state(3)

      # genpd: translate & adjust parent performance states
      - cpr: genpd_xlate_performance_state(parent=VDDMX_AO)
             => &rpmpd_opp_super_turbo = 6
       - VDDMX_AO: _genpd_set_performance_state(6)
        - rpmpd: ->set_performance_state(VDDMX_AO, 6)

      # genpd: change actual performance state
      - cpr: ->set_performance_state(cpr, 3)

Before the discussion with Uffe I did not describe this relationship
between CPR<->VDDMX as parent-child, I just had them as two separate
power domains in the CPU OPP table. That worked fine too but Uffe
suggested the parent-child representation might be better.
   
Does that help or were you looking for something else? :D

Thanks,
Stephan
Ulf Hansson Oct. 26, 2023, 9:53 a.m. UTC | #7
On Wed, 25 Oct 2023 at 15:49, Stephan Gerhold <stephan@gerhold.net> wrote:
>
> On Wed, Oct 25, 2023 at 12:40:26PM +0200, Ulf Hansson wrote:
> > On Wed, 25 Oct 2023 at 08:55, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > On 19-10-23, 13:16, Ulf Hansson wrote:
> > > > On Thu, 19 Oct 2023 at 12:22, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > > > +static int _link_required_opps(struct dev_pm_opp *opp, struct opp_table *opp_table,
> > > > >                                struct opp_table *required_table, int index)
> > > > >  {
> > > > >         struct device_node *np;
> > > > > @@ -314,6 +314,25 @@ static int _link_required_opps(struct dev_pm_opp *opp,
> > > > >                 return -ENODEV;
> > > > >         }
> > > > >
> > > > > +       /*
> > > > > +        * There are two genpd (as required-opp) cases that we need to handle,
> > > > > +        * devices with a single genpd and ones with multiple genpds.
> > > > > +        *
> > > > > +        * The single genpd case requires special handling as we need to use the
> > > > > +        * same `dev` structure (instead of a virtual one provided by genpd
> > > > > +        * core) for setting the performance state. Lets treat this as a case
> > > > > +        * where the OPP's level is directly available without required genpd
> > > > > +        * link in the DT.
> > > > > +        *
> > > > > +        * Just update the `level` with the right value, which
> > > > > +        * dev_pm_opp_set_opp() will take care of in the normal path itself.
> > > > > +        */
> > > > > +       if (required_table->is_genpd && opp_table->required_opp_count == 1 &&
> > > > > +           !opp_table->genpd_virt_devs) {
> > > > > +               if (!WARN_ON(opp->level))
> > > >
> > > > Hmm. Doesn't this introduce an unnecessary limitation?
> > > >
> > > > An opp node that has a required-opps phande, may have "opp-hz",
> > > > "opp-microvolt", etc. Why would we not allow the "opp-level" to be
> > > > used too?
> > >
> > > Coming back to this, why would we ever want a device to have "opp-level" and
> > > "required-opp" (set to genpd's table) ? That would mean we will call:
> > >
> > > dev_pm_domain_set_performance_state() twice to set different level values.
> >
> > Yes - and that would be weird, especially since the PM domain (genpd)
> > is already managing the aggregation and propagation to parent domains.
> >
>
> FWIW I'm hitting this WARNing when trying to set up the parent domain
> setup for CPR->RPMPD(MX) on MSM8916 that I discussed with Uffe recently
> [1]. I know, me and all my weird OPP setups. :'D
>
> Basically, I have cpufreq voting for performance states of the CPR genpd
> (via required-opps). CPR is supposed to have <&rpmpd MSM8916_VDDMX_AO>
> as parent genpd and translates to the parent performance state using the
> "required-opps" in the *CPR* OPP table:
>
>         cpr: power-controller@b018000 {
>                 compatible = "qcom,msm8916-cpr", "qcom,cpr";
>                 reg = <0x0b018000 0x1000>;
>                 /* ... */
>                 #power-domain-cells = <0>;
>                 operating-points-v2 = <&cpr_opp_table>;
>                 /* Supposed to be parent domain, not consumer */
>                 power-domains = <&rpmpd MSM8916_VDDMX_AO>;
>
>                 cpr_opp_table: opp-table {
>                         compatible = "operating-points-v2-qcom-level";
>
>                         cpr_opp1: opp1 {
>                                 opp-level = <1>;
>                                 qcom,opp-fuse-level = <1>;
>                                 required-opps = <&rpmpd_opp_svs_soc>;
>                         };
>                         cpr_opp2: opp2 {
>                                 opp-level = <2>;
>                                 qcom,opp-fuse-level = <2>;
>                                 required-opps = <&rpmpd_opp_nom>;
>                         };
>                         cpr_opp3: opp3 {
>                                 opp-level = <3>;
>                                 qcom,opp-fuse-level = <3>;
>                                 required-opps = <&rpmpd_opp_super_turbo>;
>                         };
>                 };
>         };
>
> There are two problems with this:
>
>  1. (Unrelated to $subject patch)
>     Since there is only a single entry in "power-domains", the genpd
>     core code automatically attaches the CPR platform device as consumer
>     of the VDDMX_AO power domain. I don't want this, I want it to become
>     child of the VDDMX_AO genpd.
>
>     I added some hacky code to workaround this. One option that works is
>     to add a second dummy entry to "power-domains", which will prevent
>     the genpd core from attaching the power domain:
>
>         power-domains = <&rpmpd MSM8916_VDDMX_AO>, <0>;

Hmm, looks a bit hackish to me.

>
>     The other option is detaching the power domain again in probe(),
>     after setting it up as parent domain:

Yes, if needed.

>
>         struct of_phandle_args parent, child;
>
>         child.np = dev->of_node;
>         child.args_count = 0;
>
>         of_parse_phandle_with_args(dev->of_node, "power-domains",
>                                    "#power-domain-cells", 0, &parent));
>         of_genpd_add_subdomain(&parent, &child);
>
>         /* Detach power domain since it's managed via the subdomain */
>         dev_pm_domain_detach(dev, false);
>
>     Is there a cleaner way to handle this? Mainly a question for Uffe.

At the moment, I don't think so. In fact, we have situations when the
attachment is really useful.

For example, during ->probe(), one can do a pm_runtime_get_sync() to
power-on the "parent" domain. This may be needed to synchronize the
power-states between the child/parent-domains, before calling
of_genpd_add_subdomain().

>
>  2. The OPP WARNing triggers with both variants because it just checks
>     if "required-opps" has a single entry. I guess we need extra checks
>     to exclude the "parent genpd" case compared to the "OPP" case.
>
>         [    1.116244] WARNING: CPU: 2 PID: 36 at drivers/opp/of.c:331 _link_required_opps+0x180/0x1cc
>         [    1.125897] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
>         [    1.146887] pc : _link_required_opps+0x180/0x1cc
>         [    1.146902] lr : _link_required_opps+0xdc/0x1cc
>         [    1.276408] Call trace:
>         [    1.283519]  _link_required_opps+0x180/0x1cc
>         [    1.285779]  _of_add_table_indexed+0x61c/0xd40
>         [    1.290292]  dev_pm_opp_of_add_table+0x10/0x18
>         [    1.294546]  of_genpd_add_provider_simple+0x80/0x160
>         [    1.298974]  cpr_probe+0x6a0/0x97c
>         [    1.304092]  platform_probe+0x64/0xbc
>
> It does seem to work correctly, with and without this patch. So I guess
> another option might be to simply silence this WARN_ON(). :')

Oh, thanks for pointing this out! This case haven't crossed my mind yet!

Allow me to think a bit more about it. I will get back to you again
with a suggestion soon, unless Viresh comes back first. :-)

[...]

Kind regards
Uffe
Viresh Kumar Oct. 30, 2023, 10:29 a.m. UTC | #8
On 26-10-23, 11:53, Ulf Hansson wrote:
> On Wed, 25 Oct 2023 at 15:49, Stephan Gerhold <stephan@gerhold.net> wrote:
> >  2. The OPP WARNing triggers with both variants because it just checks
> >     if "required-opps" has a single entry. I guess we need extra checks
> >     to exclude the "parent genpd" case compared to the "OPP" case.
> >
> >         [    1.116244] WARNING: CPU: 2 PID: 36 at drivers/opp/of.c:331 _link_required_opps+0x180/0x1cc
> >         [    1.125897] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> >         [    1.146887] pc : _link_required_opps+0x180/0x1cc
> >         [    1.146902] lr : _link_required_opps+0xdc/0x1cc
> >         [    1.276408] Call trace:
> >         [    1.283519]  _link_required_opps+0x180/0x1cc
> >         [    1.285779]  _of_add_table_indexed+0x61c/0xd40
> >         [    1.290292]  dev_pm_opp_of_add_table+0x10/0x18
> >         [    1.294546]  of_genpd_add_provider_simple+0x80/0x160
> >         [    1.298974]  cpr_probe+0x6a0/0x97c
> >         [    1.304092]  platform_probe+0x64/0xbc
> >
> > It does seem to work correctly, with and without this patch. So I guess
> > another option might be to simply silence this WARN_ON(). :')
> 
> Oh, thanks for pointing this out! This case haven't crossed my mind yet!
> 
> Allow me to think a bit more about it. I will get back to you again
> with a suggestion soon, unless Viresh comes back first. :-)

I have resent the series now.

Stephan, please give it a try again. Thanks.

Regarding this case where a genpd's table points to a parent genpd's table via
the required-opps, it is a bit tricky to solve and the only way around that I
could think of is that someone needs to call dev_pm_opp_set_config() with the
right device pointer, with that we won't hit the warning anymore and things will
work as expected.

In this case the OPP core needs to call dev_pm_domain_set_performance_state()
for device and then its genpd. We need the right device pointers :(

Ulf, also another important thing here is that maybe we would want the genpd
core to not propagate the voting anymore to the parent genpd's ? The
dev_pm_opp_set_opp() call is better placed at handling all things and not just
the performance state, like clk, regulator, bandwidth and so the recursion
should happen at OPP level only. For now my series shouldn't break anything,
just that we will try to set performance state twice for the parent genpd, the
second call should silently return as the target state should be equal to
current state.
Ulf Hansson Nov. 10, 2023, 1:50 p.m. UTC | #9
On Mon, 6 Nov 2023 at 08:08, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 03-11-23, 12:58, Ulf Hansson wrote:
> > Are you saying that the OPP library should be capable of managing the
> > parent-clock-rates too, when there is a new rate being requested for a
> > clock that belongs to an OPP? To me, that sounds like replicating
> > framework specific knowledge into the OPP library, no? Why do we want
> > this?
>
> I am surely not touching clocks or any other framework :)
>
> > Unless I totally misunderstood your suggestion, I think it would be
> > better if the OPP library remained simple and didn't run recursive
> > calls, but instead relied on each framework to manage the aggregation
> > and propagation to parents.
>
> I see your point and agree with it.

Okay!

>
> Here is the problem and I am not very sure what's the way forward for this then:
>
> - Devices can have other devices (like caches) or genpds mentioned via
>   required-opps.
>
> - Same is true for genpds, they can also have required-opps, which may or may not
>   be other genpds.
>
> - When OPP core is asked to set a device's OPP, it isn't only about performance
>   level, but clk, level, regulator, bw, etc. And so a full call to
>   dev_pm_opp_set_opp() is required.
>
> - The OPP core is going to run the helper recursively only for required-opps and
>   hence it won't affect clock or regulators.

What if a required-opps has a clock or regulator? Doesn't that mean
that clocks/regulators could be called too?

>
> - But it currently affects genpds as they are mentioned in required-opps.
>
> - Skipping the recursive call to a parent genpd will require a special hack,
>   maybe we should add it, I am just discussing it if we should or if there is
>   another way around this.

Right, I see.

If this is only for required-opps and devices being hooked up to a PM
domain (genpd), my suggestion would be to keep avoiding doing the
propagation to required-opps-parents. For the similar reasons to why
we don't do it for clock/regulators, the propagation and aggregation,
seems to me, to belong better in genpd.

Did that make sense?

Kind regards
Uffe
Viresh Kumar Nov. 15, 2023, 5:32 a.m. UTC | #10
On 10-11-23, 14:50, Ulf Hansson wrote:
> If this is only for required-opps and devices being hooked up to a PM
> domain (genpd), my suggestion would be to keep avoiding doing the
> propagation to required-opps-parents. For the similar reasons to why
> we don't do it for clock/regulators, the propagation and aggregation,
> seems to me, to belong better in genpd.
> 
> Did that make sense?

Hmm, it does. Let me see what I can do on this..
Viresh Kumar Nov. 16, 2023, 10:44 a.m. UTC | #11
On 15-11-23, 11:02, Viresh Kumar wrote:
> On 10-11-23, 14:50, Ulf Hansson wrote:
> > If this is only for required-opps and devices being hooked up to a PM
> > domain (genpd), my suggestion would be to keep avoiding doing the
> > propagation to required-opps-parents. For the similar reasons to why
> > we don't do it for clock/regulators, the propagation and aggregation,
> > seems to me, to belong better in genpd.
> > 
> > Did that make sense?
> 
> Hmm, it does. Let me see what I can do on this..

Hi Ulf,

I have sent V3 with a new commit at the end to take care of this.
diff mbox series

Patch

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 84f345c69ea5..aab8c8e79146 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1074,10 +1074,12 @@  static int _opp_set_required_opps_generic(struct device *dev,
 static int _opp_set_required_opps_genpd(struct device *dev,
 	struct opp_table *opp_table, struct dev_pm_opp *opp, bool scaling_down)
 {
-	struct device **genpd_virt_devs =
-		opp_table->genpd_virt_devs ? opp_table->genpd_virt_devs : &dev;
+	struct device **genpd_virt_devs = opp_table->genpd_virt_devs;
 	int index, target, delta, ret;
 
+	if (!genpd_virt_devs)
+		return 0;
+
 	/* Scaling up? Set required OPPs in normal order, else reverse */
 	if (!scaling_down) {
 		index = 0;
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 81fa27599d58..e056f31a48b5 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -296,7 +296,7 @@  void _of_clear_opp(struct opp_table *opp_table, struct dev_pm_opp *opp)
 	of_node_put(opp->np);
 }
 
-static int _link_required_opps(struct dev_pm_opp *opp,
+static int _link_required_opps(struct dev_pm_opp *opp, struct opp_table *opp_table,
 			       struct opp_table *required_table, int index)
 {
 	struct device_node *np;
@@ -314,6 +314,25 @@  static int _link_required_opps(struct dev_pm_opp *opp,
 		return -ENODEV;
 	}
 
+	/*
+	 * There are two genpd (as required-opp) cases that we need to handle,
+	 * devices with a single genpd and ones with multiple genpds.
+	 *
+	 * The single genpd case requires special handling as we need to use the
+	 * same `dev` structure (instead of a virtual one provided by genpd
+	 * core) for setting the performance state. Lets treat this as a case
+	 * where the OPP's level is directly available without required genpd
+	 * link in the DT.
+	 *
+	 * Just update the `level` with the right value, which
+	 * dev_pm_opp_set_opp() will take care of in the normal path itself.
+	 */
+	if (required_table->is_genpd && opp_table->required_opp_count == 1 &&
+	    !opp_table->genpd_virt_devs) {
+		if (!WARN_ON(opp->level))
+			opp->level = opp->required_opps[0]->level;
+	}
+
 	return 0;
 }
 
@@ -338,7 +357,7 @@  static int _of_opp_alloc_required_opps(struct opp_table *opp_table,
 		if (IS_ERR_OR_NULL(required_table))
 			continue;
 
-		ret = _link_required_opps(opp, required_table, i);
+		ret = _link_required_opps(opp, opp_table, required_table, i);
 		if (ret)
 			goto free_required_opps;
 	}
@@ -359,7 +378,7 @@  static int lazy_link_required_opps(struct opp_table *opp_table,
 	int ret;
 
 	list_for_each_entry(opp, &opp_table->opp_list, node) {
-		ret = _link_required_opps(opp, new_table, index);
+		ret = _link_required_opps(opp, opp_table, new_table, index);
 		if (ret)
 			return ret;
 	}