[RFC,V7,2/2] OPP: Allow "opp-hz" and "opp-microvolt" to contain magic values

Message ID 23ba51eaa6b52117458165dccc00a95cf8e86e1d.1509453284.git.viresh.kumar@linaro.org
State New
Series
  • Untitled series #5607

Commit Message

Viresh Kumar Oct. 31, 2017, 12:47 p.m.
On some platforms the exact frequency or voltage may be hidden from the
OS by the firmware. Allow such configurations to pass magic values in
the "opp-hz" or the "opp-microvolt" properties, which should be
interpreted in a platform dependent way.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

---
 Documentation/devicetree/bindings/opp/opp.txt | 6 ++++++
 1 file changed, 6 insertions(+)

-- 
2.15.0.rc1.236.g92ea95045093

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Rob Herring Oct. 31, 2017, 4:02 p.m. | #1
On Tue, Oct 31, 2017 at 7:47 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On some platforms the exact frequency or voltage may be hidden from the
> OS by the firmware. Allow such configurations to pass magic values in
> the "opp-hz" or the "opp-microvolt" properties, which should be
> interpreted in a platform dependent way.


Why not a new property for magic values? opp-magic? Don't we want to
know when we have magic values?

Wouldn't magic values in opp-hz get propagated to user space? I can
see the complaints now. "My 4GHz processor is running at 6Hz!" Just
like people complain when BogoMIPS is not high enough.

Rob
Rob Herring Nov. 1, 2017, 8:39 p.m. | #2
On Tue, Oct 31, 2017 at 9:17 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 31 October 2017 at 16:02, Rob Herring <robh+dt@kernel.org> wrote:
>> Why not a new property for magic values? opp-magic? Don't we want to
>> know when we have magic values?
>
> I have kept a separate property since beginning (domain-performance-state)
> and moved to using these magic values in the existing field because of the
> suggestion Kevin gave earlier.
>
> https://marc.info/?l=linux-kernel&m=149306082218001&w=2
>
> I am not sure what to do now :)


Okay, I guess reusing the properties is fine.

>> Wouldn't magic values in opp-hz get propagated to user space?
>
> The OPP core puts them in debugfs just to know how the OPPs are
> set. Otherwise, I am not sure that the power domain core/drivers would
> be exposing that to user space.


I was thinking it would be exposed through the cpufreq interface, but I
guess this is not for CPUs.

Rob
Viresh Kumar Nov. 2, 2017, 4:49 a.m. | #3
On 01-11-17, 15:39, Rob Herring wrote:
> On Tue, Oct 31, 2017 at 9:17 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 31 October 2017 at 16:02, Rob Herring <robh+dt@kernel.org> wrote:
> >> Why not a new property for magic values? opp-magic? Don't we want to
> >> know when we have magic values?
> >
> > I have kept a separate property since beginning (domain-performance-state)
> > and moved to using these magic values in the existing field because of the
> > suggestion Kevin gave earlier.
> >
> > https://marc.info/?l=linux-kernel&m=149306082218001&w=2
> >
> > I am not sure what to do now :)
>
> Okay, I guess reusing the properties is fine.


Okay, great.

> >> Wouldn't magic values in opp-hz get propagated to user space?
> >
> > The OPP core puts them in debugfs just to know how the OPPs are
> > set. Otherwise, I am not sure that the power domain core/drivers would
> > be exposing that to user space.
>
> I was thinking it would be exposed through the cpufreq interface, but I
> guess this is not for CPUs.


Oh no. That's not the target here for sure.

-- 
viresh
Viresh Kumar Nov. 2, 2017, 9 a.m. | #4
On 02-11-17, 00:15, Stephen Boyd wrote:
> Sorry I'm not following. We're going to need to have platform
> specific code that understands platform specific bindings that
> aren't shoved into the generic OPP bindings.


At least I am not targeting any platform specific binding right now.
The way I see this working is:

- We will reuse earlier bindings and allow opp-hz and opp-microvolt to
  contain special values (this patch).
- Platform specific DT entries will put corner numbers in opp-hz (or
  opp-microvolt) fields.
- Some platform specific driver (in OPP or genpd) will be used to
  convert an OPP into a performance state (corner) value. That driver
  can simply read opp-hz (or opp-microvolt) and return its value.
- The OPP core will request a performance state (code is already
  merged for that).

And so there is no platform specific binding here. Do you want to do
this differently ?
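
A minimal sketch of the third step in the list above, written as plain
userspace C rather than kernel code. The names sketch_opp,
opp_to_pstate_fn, opp_hz_to_pstate and core_get_pstate are hypothetical
and are not the OPP or genpd API. The only assumption is that the
platform hook returns the raw opp-hz value unchanged as the corner.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a parsed OPP node; not the kernel's OPP type. */
struct sketch_opp {
	uint64_t hz;        /* raw "opp-hz" value: a corner on such platforms */
	uint32_t microvolt; /* raw "opp-microvolt" value, 0 if absent */
};

/* Platform specific hook: translate an OPP into a performance state. */
typedef unsigned int (*opp_to_pstate_fn)(const struct sketch_opp *opp);

/* Platform driver that stores the corner directly in opp-hz. */
static unsigned int opp_hz_to_pstate(const struct sketch_opp *opp)
{
	return (unsigned int)opp->hz;
}

/* Generic side: ask the platform hook, fall back to state 0 without one. */
static unsigned int core_get_pstate(const struct sketch_opp *opp,
				    opp_to_pstate_fn translate)
{
	if (opp == NULL || translate == NULL)
		return 0;
	return translate(opp);
}
```

The generic side never interprets the number itself; a platform that
keeps the corner in opp-microvolt instead would only swap the hook.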

-- 
viresh
Ulf Hansson Nov. 28, 2017, 4:38 p.m. | #5
On 2 November 2017 at 10:00, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 02-11-17, 00:15, Stephen Boyd wrote:
>> Sorry I'm not following. We're going to need to have platform
>> specific code that understands platform specific bindings that
>> aren't shoved into the generic OPP bindings.
>
> At least I am not targeting any platform specific binding right now.
> The way I see this working is:
>
> - We will reuse earlier bindings and allow opp-hz and opp-microvolt to
>   contain special values (this patch).
> - Platform specific DT entries will put corner numbers in opp-hz (or
>   opp-microvolt) fields.
> - Some platform specific driver (in OPP or genpd) will be used to
>   convert an OPP into a performance state (corner) value. That driver
>   can simply read opp-hz (or opp-microvolt) and return its value.


Since the "operating-points-v2" phandle(s) belong in the power-domain
controller device node, which anyway is being parsed by the genpd SoC
specific driver, I assume it makes sense to start the initialization
from there. Unless there is something that prevents that, of course.

Then whatever library/helper functions we need to parse and create
the OPP tables can be provided to the OPP framework and the OPP OF
library.

> - The OPP core will request a performance state (code is already
>   merged for that).
>
> And so there is no platform specific binding here. Do you want to do
> this differently ?


This makes sense to me!

Also, the SoC (QCOM) specific genpd driver is free to use the
terminology "corner values", when it translates opp-hz|microvolt into
such values.

Kind regards
Uffe
Stephen Boyd Nov. 30, 2017, 12:50 a.m. | #6
On 11/28, Ulf Hansson wrote:
> On 2 November 2017 at 10:00, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 02-11-17, 00:15, Stephen Boyd wrote:
> >> Sorry I'm not following. We're going to need to have platform
> >> specific code that understands platform specific bindings that
> >> aren't shoved into the generic OPP bindings.
> >
> > At least I am not targeting any platform specific binding right now.
> > The way I see this working is:
> >
> > - We will reuse earlier bindings and allow opp-hz and opp-microvolt to
> >   contain special values (this patch).
> > - Platform specific DT entries will put corner numbers in opp-hz (or
> >   opp-microvolt) fields.
> > - Some platform specific driver (in OPP or genpd) will be used to
> >   convert an OPP into a performance state (corner) value. That driver
> >   can simply read opp-hz (or opp-microvolt) and return its value.
>
> Since the "operating-points-v2" phandle(s) belong in the power-domain
> controller device node, which anyway is being parsed by the genpd SoC
> specific driver, I assume it makes sense to start the initialization
> from there. Unless there is something that prevents that, of course.
>
> Then whatever library/helper functions we need to parse and create
> the OPP tables can be provided to the OPP framework and the OPP OF
> library.
>
> > - The OPP core will request a performance state (code is already
> >   merged for that).
> >
> > And so there is no platform specific binding here. Do you want to do
> > this differently ?
>
> This makes sense to me!
>
> Also, the SoC (QCOM) specific genpd driver is free to use the
> terminology "corner values", when it translates opp-hz|microvolt into
> such values.
>


Sorry it still makes zero sense to me. It seems that we're trying
to make the OPP table parsing generic just for the sake of code
brevity. Is this the goal? From a DT writer perspective it seems
confusing to say that opp-microvolt is sometimes a microvolt and
sometimes not a microvolt. Why can't the SoC specific genpd
driver parse something like "qcom,corner" instead out of the
node?

BTW, I don't believe I have a use-case where I want to express
power domain OPP tables. I have many devices that all have
different frequencies that are all tied into the same power
domain. This binding makes it look like we can only have one
frequency per domain which won't work.

I want to express that a device with a range of frequencies (or
really multiple ranges of frequencies) is inside certain physical
power domains and the frequency of the clks dictates the minimum
voltage requirement of those power domains.

For the most complicated case, imagine something like our eMMC
controller that has two clks (clk1,clk2) that it changes the rate
of independently and those two clks rely on two different
regulators (vreg1, vreg2) that supply voltage domains in the SoC
which the eMMC controller happens to be part of (pd1, pd2). And
the device is also part of another power domain that we use to
turn everything off (pd3).

 +-------+                                 +-------+
 | vreg1 |                                 | vreg2 |
 +---+---+                                 +----+--+
     |                                          |
     |     +------------+-------------+         |
     |     | +-------+  |  +--------+ |         |
     +-----> | clk1  |  |  |  clk2  | <---------+
           | +---+---+  |  +----+---+ |
           |     |      |       |     |
      +----------v--------------v-----------+
      |    |            |             |     |
      |    |            |             |     |
      |    |   pd1      |     pd2     |     |
      |    |            |             |     |
      |    +------------+-------------+     |
      |                                     |
      |  pd3                          eMMC  |
      +-------------------------------------+


From a DT perspective, I see this as one emmc node:

    pd: power-domain-controller {
        #power-domain-cells = <1>;
    };

    cc: clock-controller {
        #clock-cells = <1>;
    };

    emmc {
        power-domains = <&pd 1>, <&pd 2>, <&pd 3>;
        clocks = <&cc 1>, <&cc 2>;
    };

And then we really don't want to have to express every single
possible frequency that clk1 and clk2 can be just to express the
voltage/corner constraints. It could be that we have some table
like so:

       clk1 Hz  |  pd1 Corner
    ------------+-----------
         40000  |   0
        960000  |   0
      19200000  |   1
      25000000  |   1
      50000000  |   2
      74000000  |   3

       clk2 Hz  |  pd2 Corner
    ------------+-----------
       19200000 |   0
      150000000 |   1
      340000000 |   2


BUT we also have another device that uses pd1 and has its own
clk3 with different frequency constraints:

       clk3 Hz   |  pd1 Corner
    -------------+-----------
      250000000  |   0
      360000000  |   1
      730000000  |   2

So when clk3 is at 730000000, we don't care what frequency clk1
is running at, pd1 needs to be at least at corner 2. If it's at
corner 3 because clk1 is at 74000000 then that's fine.
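
A minimal sketch of that aggregation rule, assuming each consumer of a
shared domain simply votes for the corner its current clock rate needs
and the domain runs at the highest vote. The function name
aggregate_corner is hypothetical; this is plain C, not genpd code.

```c
#include <stddef.h>

/*
 * Return the corner a shared power domain must run at: the highest
 * vote among its consumers. An empty vote list yields corner 0.
 */
static unsigned int aggregate_corner(const unsigned int *votes, size_t n)
{
	unsigned int max = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (votes[i] > max)
			max = votes[i];
	}
	return max;
}
```

With clk1 at 74000000 (vote 3) and clk3 at 730000000 (vote 2), the
votes are {3, 2} and pd1 ends up at corner 3, which also satisfies the
clk3 requirement.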

I imagine DT would look like our "fmax tables" that are currently
out of tree. Something like:

	clk1_fmax_table {
		fmax0 {
			reg = /bits/ 64 <960000>;
			qcom,corner = <0>;
		};
		fmax1 {
			reg = /bits/ 64 <25000000>;
			qcom,corner = <1>;
		};
		fmax2 {
			reg = /bits/ 64 <50000000>;
			qcom,corner = <2>;
		};
		fmax3 {
			reg = /bits/ 64 <74000000>;
			qcom,corner = <3>;
		};
	};

Which is similar to OPP table, but we only list the maximum
frequency that requires a particular corner. We're back to the
same problem we have with OPPs of figuring out how to relate a
table to a certain clk and power domain though. At least for
qcom, we could do that with some sort of complicated list
property:

    emmc {
        performance-states = <&fmax_table &cc 1 &pd 1>;
    };

or something like that which would parse a table, a clock, and a
number of power domains.
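
A minimal sketch of how such an fmax table could be consulted, assuming
the entries are sorted by ascending frequency and that a rate maps to
the first entry whose fmax is at or above it. The struct and function
names are hypothetical, and the values are taken from the clk1 Hz table
earlier in this mail.

```c
#include <stddef.h>
#include <stdint.h>

/* One row: the highest rate that a given corner can support. */
struct fmax_entry {
	uint64_t fmax_hz;
	unsigned int corner;
};

/* clk1 rows, sorted by ascending fmax_hz. */
static const struct fmax_entry clk1_fmax[] = {
	{   960000, 0 },
	{ 25000000, 1 },
	{ 50000000, 2 },
	{ 74000000, 3 },
};

/*
 * Find the lowest corner that supports rate_hz.
 * Returns 0 and stores the corner, or -1 if the rate exceeds every fmax.
 */
static int fmax_lookup(const struct fmax_entry *tbl, size_t n,
		       uint64_t rate_hz, unsigned int *corner)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (rate_hz <= tbl[i].fmax_hz) {
			*corner = tbl[i].corner;
			return 0;
		}
	}
	return -1;
}
```

Intermediate rates such as 19200000 need no row of their own; they fall
under the 25000000 entry and resolve to corner 1.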

Reminds me, we can have one clk frequency map to multiple power
domains too. We have this case for our CPUs and PLLs where we
have individual power control on two domains for one frequency.
So when the PLL frequency changes, we need to turn on and set the
voltage or corner of two regulators.

So I don't really know how this is all going to work. I'd really
appreciate seeing the full picture though. Reviewing this a bit
at a time makes me lose sight of the bigger picture.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
Viresh Kumar Nov. 30, 2017, 6:59 a.m. | #7
On 29-11-17, 16:50, Stephen Boyd wrote:
> Sorry it still makes zero sense to me. It seems that we're trying
> to make the OPP table parsing generic just for the sake of code
> brevity.


Not just the code but bindings as well to make sure we don't add a new
property (similar to earlier ones) for every platform that wants to
use performance states.

> Is this the goal? From a DT writer perspective it seems
> confusing to say that opp-microvolt is sometimes a microvolt and
> sometimes not a microvolt.


Well it would still represent the voltage but not in microvolt units
as the platform guys decided to hide those values from the kernel and
handle them directly in firmware.

> Why can't the SoC specific genpd
> driver parse something like "qcom,corner" instead out of the
> node?


Sure we can, but that means that a new property will be required for
the next platform.

I did it this way as Kevin (and Rob) suggested NOT to add another
property but use the earlier ones as we aren't passing anything new
here, just that the units of the property are different. For another
SoC, we may want to hide both freq and voltage values from the kernel
and pass firmware dependent values. Should we add two new properties
for that SoC then ?

> BTW, I don't believe I have a use-case where I want to express
> power domain OPP tables.


I do remember that you once said [1] that you may want to pass the
real voltage values as well via DT. And so I thought that you can pass
performance-state (corner) in opp-hz and real voltage values in
opp-microvolt.

> I have many devices that all have
> different frequencies that are all tied into the same power
> domain. This binding makes it look like we can only have one
> frequency per domain which won't work.


No, that isn't the case. Looks like we have some confusion here. Let
me try with a simple example:

        foo: foo-power-domain@09000000 {
                compatible = "foo,genpd";
                #power-domain-cells = <0>;
                operating-points-v2 = <&domain_opp_table>;
        };

        cpu0: cpu@0 {
                compatible = "arm,cortex-a53", "arm,armv8";
                ...
                operating-points-v2 = <&cpu_opp_table>;
                power-domains = <&foo>;
        };


        domain_opp_table: domain_opp_table {
                compatible = "operating-points-v2";

                domain_opp_1: opp00 {
                        opp-hz = /bits/ 64 <1>; /* These are corners AKA perf states */
                };
                domain_opp_2: opp01 {
                        opp-hz = /bits/ 64 <2>;
                };
                domain_opp_3: opp02 {
                        opp-hz = /bits/ 64 <3>;
                };
        };

        cpu_opp_table: cpu_opp_table {
                compatible = "operating-points-v2";
                opp-shared;

                opp00 {
                        opp-hz = /bits/ 64 <208000000>;
                        clock-latency-ns = <500000>;
                        power-domain-opp = <&domain_opp_1>;
                };
                opp01 {
                        opp-hz = /bits/ 64 <432000000>;
                        clock-latency-ns = <500000>;
                        power-domain-opp = <&domain_opp_2>;
                };
                opp02 {
                        opp-hz = /bits/ 64 <729000000>;
                        clock-latency-ns = <500000>;
                        power-domain-opp = <&domain_opp_2>;
                };
                opp03 {
                        opp-hz = /bits/ 64 <960000000>;
                        clock-latency-ns = <500000>;
                        power-domain-opp = <&domain_opp_3>;
                };
        };

The device frequencies are still managed by the device's own OPP
table. It is just that each device OPP can also carry a requirement on
an OPP of another device, which is the power domain in this case.
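
A minimal sketch of that indirection in plain C, assuming the
"power-domain-opp" phandle is modelled as a pointer from a device OPP
to a domain OPP whose opp-hz holds the corner. The struct and function
names are hypothetical; the values mirror the cpu_opp_table example
above.

```c
#include <stddef.h>
#include <stdint.h>

/* Domain OPP: "opp-hz" carries the corner, not a frequency. */
struct domain_opp {
	uint64_t hz;
};

/* Device OPP: a real frequency plus the domain OPP it requires. */
struct device_opp {
	uint64_t hz;
	const struct domain_opp *required; /* models "power-domain-opp" */
};

static const struct domain_opp domain_opps[] = { { 1 }, { 2 }, { 3 } };

static const struct device_opp cpu_opps[] = {
	{ 208000000, &domain_opps[0] },
	{ 432000000, &domain_opps[1] },
	{ 729000000, &domain_opps[1] },
	{ 960000000, &domain_opps[2] },
};

/*
 * Corner the domain needs for an exact device rate.
 * Returns 0 when the rate is not listed in the device table.
 */
static unsigned int required_corner(const struct device_opp *tbl, size_t n,
				    uint64_t rate_hz)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].hz == rate_hz && tbl[i].required != NULL)
			return (unsigned int)tbl[i].required->hz;
	}
	return 0;
}
```

Two device OPPs (432000000 and 729000000) share domain corner 2, which
is the case a single domain-wide frequency could not describe.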

> I want to express that a device with a range of frequencies (or
> really multiple ranges of frequencies) is inside certain physical
> power domains and the frequency of the clks dictates the minimum
> voltage requirement of those power domains.
>
> For the most complicated case, imagine something like our eMMC
> controller that has two clks (clk1,clk2) that it changes the rate
> of independently and those two clks rely on two different
> regulators (vreg1, vreg2) that supply voltage domains in the SoC
> which the eMMC controller happens to be part of (pd1, pd2). And
> the device is also part of another power domain that we use to
> turn everything off (pd3).
>
>  +-------+                                 +-------+
>  | vreg1 |                                 | vreg2 |
>  +---+---+                                 +----+--+
>      |                                          |
>      |     +------------+-------------+         |
>      |     | +-------+  |  +--------+ |         |
>      +-----> | clk1  |  |  |  clk2  | <---------+
>            | +---+---+  |  +----+---+ |
>            |     |      |       |     |
>       +----------v--------------v-----------+
>       |    |            |             |     |
>       |    |            |             |     |
>       |    |   pd1      |     pd2     |     |
>       |    |            |             |     |
>       |    +------------+-------------+     |
>       |                                     |
>       |  pd3                          eMMC  |
>       +-------------------------------------+
>
>
> From a DT perspective, I see this as one emmc node:
>
>     pd: power-domain-controller {
>         #power-domain-cells = <1>;
>     };
>
>     cc: clock-controller {
>         #clock-cells = <1>;
>     };
>
>     emmc {
>         power-domains = <&pd 1>, <&pd 2>, <&pd 3>;
>         clocks = <&cc 1>, <&cc 2>;
>     };
>
> And then we really don't want to have to express every single
> possible frequency that clk1 and clk2 can be just to express the
> voltage/corner constraints. It could be that we have some table
> like so:
>
>        clk1 Hz  |  pd1 Corner
>     ------------+-----------
>          40000  |   0
>         960000  |   0
>       19200000  |   1
>       25000000  |   1
>       50000000  |   2
>       74000000  |   3
>
>        clk2 Hz  |  pd2 Corner
>     ------------+-----------
>        19200000 |   0
>       150000000 |   1
>       340000000 |   2
>
>
> BUT we also have another device that uses pd1 and has its own
> clk3 with different frequency constraints:
>
>        clk3 Hz   |  pd1 Corner
>     -------------+-----------
>       250000000  |   0
>       360000000  |   1
>       730000000  |   2
>
> So when clk3 is at 730000000, we don't care what frequency clk1
> is running at, pd1 needs to be at least at corner 2. If it's at
> corner 3 because clk1 is at 74000000 then that's fine.
>
> I imagine DT would look like our "fmax tables" that are currently
> out of tree. Something like:
>
> 	clk1_fmax_table {
> 		fmax0 {
> 			reg = /bits/ 64 <960000>;
> 			qcom,corner = <0>;
> 		};
> 		fmax1 {
> 			reg = /bits/ 64 <25000000>;
> 			qcom,corner = <1>;
> 		};
> 		fmax2 {
> 			reg = /bits/ 64 <50000000>;
> 			qcom,corner = <2>;
> 		};
> 		fmax3 {
> 			reg = /bits/ 64 <74000000>;
> 			qcom,corner = <3>;
> 		};
> 	};
>
> Which is similar to OPP table, but we only list the maximum
> frequency that requires a particular corner. We're back to the
> same problem we have with OPPs of figuring out how to relate a
> table to a certain clk and power domain though. At least for
> qcom, we could do that with some sort of complicated list
> property:
>
>     emmc {
>         performance-states = <&fmax_table &cc 1 &pd 1>;
>     };
>
> or something like that which would parse a table, a clock, and a
> number of power domains.


Here is how this can be represented with the current proposal.

        pd: power-domain-controller {
            #power-domain-cells = <1>;
            operating-points-v2 = <&pd1_opp_table>, <&pd2_opp_table>, <&pd3_opp_table>;
        };
   
        /*
         * The below OPP nodes can contain other properties like
         * microvolt and microamps, etc.
         *
         * Also if the below 3 tables are exactly same, then the same
         * table can be used for all the three power domains provided
         * by the above controller.
         */
        pd1_opp_table: pd1_opp_table {
                compatible = "operating-points-v2";

                pd1_opp_1: opp00 {
                        opp-hz = /bits/ 64 <1>; /* These are corners AKA perf states */
                };
                pd1_opp_2: opp01 {
                        opp-hz = /bits/ 64 <2>;
                };
                pd1_opp_3: opp02 {
                        opp-hz = /bits/ 64 <3>;
                };
        };

        pd2_opp_table: pd2_opp_table {
                compatible = "operating-points-v2";

                pd2_opp_1: opp00 {
                        opp-hz = /bits/ 64 <1>; /* These are corners AKA perf states */
                };
                pd2_opp_2: opp01 {
                        opp-hz = /bits/ 64 <2>;
                };
                pd2_opp_3: opp02 {
                        opp-hz = /bits/ 64 <3>;
                };
        };

        pd3_opp_table: pd3_opp_table {
                compatible = "operating-points-v2";

                pd3_opp_1: opp00 {
                        opp-hz = /bits/ 64 <1>; /* These are corners AKA perf states */
                };
                pd3_opp_2: opp01 {
                        opp-hz = /bits/ 64 <2>;
                };
                pd3_opp_3: opp02 {
                        opp-hz = /bits/ 64 <3>;
                };
        };

        cc: clock-controller {
            #clock-cells = <1>;
        };
    
        emmc {
            power-domains = <&pd 1>, <&pd 2>, <&pd 3>;
            clocks = <&cc 1>, <&cc 2>;
            /*
             * We don't allow multiple OPP tables for devices
             * currently, but I think we need to use it for multiple
             * clk case, like emmc in your example.
             */
            operating-points-v2 = <&cc1_opp_table>, <&cc2_opp_table>;
        };

        cc1_opp_table: cc1_opp_table {
                compatible = "operating-points-v2";
                opp-shared;

                opp00 {
                        opp-hz = /bits/ 64 <40000>;
                        power-domain-opp = <&pd1_opp_1>;
                };
                opp01 {
                        opp-hz = /bits/ 64 <960000>;
                        power-domain-opp = <&pd1_opp_1>;
                };
                opp02 {
                        opp-hz = /bits/ 64 <19200000>;
                        power-domain-opp = <&pd1_opp_2>;
                };
                opp03 {
                        opp-hz = /bits/ 64 <25000000>;
                        power-domain-opp = <&pd1_opp_2>;
                };
                opp04 {
                        opp-hz = /bits/ 64 <50000000>;
                        power-domain-opp = <&pd1_opp_3>;
                };
                opp05 {
                        opp-hz = /bits/ 64 <74000000>;
                        power-domain-opp = <&pd1_opp_3>;
                };
        };

        cc2_opp_table: cc2_opp_table {
                compatible = "operating-points-v2";
                opp-shared;

                opp00 {
                        opp-hz = /bits/ 64 <19200000>;
                        power-domain-opp = <&pd2_opp_1>;
                };
                opp01 {
                        opp-hz = /bits/ 64 <150000000>;
                        power-domain-opp = <&pd2_opp_2>;
                };
                opp02 {
                        opp-hz = /bits/ 64 <340000000>;
                        power-domain-opp = <&pd2_opp_3>;
                };
        };

Other devices too can have OPP tables containing phandles to the
pd1/2/3 OPPs.

Wouldn't this work well ?

> Reminds me, we can have one clk frequency map to multiple power
> domains too. We have this case for our CPUs and PLLs where we
> have individual power control on two domains for one frequency.
> So when the PLL frequency changes, we need to turn on and set the
> voltage or corner of two regulators.


Sure, in that case the above "power-domain-opp" can contain a list.
This is the multiple-power-domain case we have discussed multiple
times earlier.
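
A minimal sketch of the list case in plain C, assuming one device OPP
carries one required corner per power domain and each domain is only
ever raised, never lowered, by a request. The names domain_req,
pll_opp and apply_requirements are hypothetical, as are the domain
indices and corner values used with them.

```c
#include <stddef.h>

#define MAX_REQS 4

/* One requirement: a domain index and the minimum corner it needs. */
struct domain_req {
	unsigned int domain;
	unsigned int corner;
};

/* A device OPP whose "power-domain-opp" would hold a list of phandles. */
struct pll_opp {
	unsigned long long hz;
	struct domain_req reqs[MAX_REQS];
	size_t nreqs;
};

/* Raise every listed domain to at least the corner this OPP requires. */
static void apply_requirements(const struct pll_opp *opp,
			       unsigned int *domain_state, size_t ndomains)
{
	size_t i;

	for (i = 0; i < opp->nreqs && i < MAX_REQS; i++) {
		const struct domain_req *r = &opp->reqs[i];

		if (r->domain < ndomains && domain_state[r->domain] < r->corner)
			domain_state[r->domain] = r->corner;
	}
}
```

A PLL OPP listing two domains therefore touches both in one pass, and a
domain that another consumer already holds higher is left alone.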

> So I don't really know how this is all going to work.


I am still hopeful :)

> I'd really
> appreciate seeing the full picture though. Reviewing this a bit
> at a time makes me lose sight of the bigger picture.


I was about to publish code based on these bindings today, but then I
received last minute comments from you and Rob and that work is going
to wait a bit more :)

-- 
viresh

[1] https://marc.info/?l=linux-kernel&m=147995301320486&w=2
Rob Herring Dec. 26, 2017, 8:23 p.m. | #8
On Thu, Nov 30, 2017 at 12:59 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 29-11-17, 16:50, Stephen Boyd wrote:
>> Sorry it still makes zero sense to me. It seems that we're trying
>> to make the OPP table parsing generic just for the sake of code
>> brevity.
>
> Not just the code but bindings as well to make sure we don't add a new
> property (similar to earlier ones) for every platform that wants to
> use performance states.
>
>> Is this the goal? From a DT writer perspective it seems
>> confusing to say that opp-microvolt is sometimes a microvolt and
>> sometimes not a microvolt.
>
> Well it would still represent the voltage but not in microvolt units
> as the platform guys decided to hide those values from the kernel and
> handle them directly in firmware.
>
>> Why can't the SoC specific genpd
>> driver parse something like "qcom,corner" instead out of the
>> node?
>
> Sure we can, but that means that a new property will be required for
> the next platform.
>
> I did it this way as Kevin (and Rob) suggested NOT to add another
> property but use the earlier ones as we aren't passing anything new
> here, just that the units of the property are different. For another
> SoC, we may want to hide both freq and voltage values from the kernel
> and pass firmware dependent values. Should we add two new properties
> for that SoC then ?
>
>> BTW, I don't believe I have a use-case where I want to express
>> power domain OPP tables.
>
> I do remember that you once said [1] that you may want to pass the
> real voltage values as well via DT. And so I thought that you can pass
> performance-state (corner) in opp-hz and real voltage values in
> opp-microvolt.
>
>> I have many devices that all have
>> different frequencies that are all tied into the same power
>> domain. This binding makes it look like we can only have one
>> frequency per domain which won't work.
>
> No, that isn't the case. Looks like we have some confusion here. Let
> me try with a simple example:
>
>         foo: foo-power-domain@09000000 {
>                 compatible = "foo,genpd";
>                 #power-domain-cells = <0>;
>                 operating-points-v2 = <&domain_opp_table>;
>         };
>
>         cpu0: cpu@0 {
>                 compatible = "arm,cortex-a53", "arm,armv8";
>                 ...
>                 operating-points-v2 = <&cpu_opp_table>;
>                 power-domains = <&foo>;
>         };
>
>
>         domain_opp_table: domain_opp_table {
>                 compatible = "operating-points-v2";
>
>                 domain_opp_1: opp00 {
>                         opp-hz = /bits/ 64 <1>; /* These are corners AKA perf states */
>                 };
>                 domain_opp_2: opp01 {
>                         opp-hz = /bits/ 64 <2>;
>                 };
>                 domain_opp_3: opp02 {
>                         opp-hz = /bits/ 64 <3>;
>                 };
>         };
>
>         cpu_opp_table: cpu_opp_table {
>                 compatible = "operating-points-v2";
>                 opp-shared;
>
>                 opp00 {
>                         opp-hz = /bits/ 64 <208000000>;
>                         clock-latency-ns = <500000>;
>                         power-domain-opp = <&domain_opp_1>;


What is this? opp00 here is not a device. One OPP should not point to
another. "power-domain-opp" is only supposed to appear in devices
alongside power-domains properties.

Rob
Viresh Kumar Dec. 27, 2017, 4:45 a.m. | #9
On 26-12-17, 14:23, Rob Herring wrote:
> >         cpu_opp_table: cpu_opp_table {
> >                 compatible = "operating-points-v2";
> >                 opp-shared;
> >
> >                 opp00 {
> >                         opp-hz = /bits/ 64 <208000000>;
> >                         clock-latency-ns = <500000>;
> >                         power-domain-opp = <&domain_opp_1>;
>
> What is this? opp00 here is not a device. One OPP should not point to
> another. "power-domain-opp" is only supposed to appear in devices
> alongside power-domains properties.


There are two types of devices:

A.) Devices with fixed performance state requirements. They will have
the new "required-opp" property in the device node itself, as you said.

B.) Devices which can do DVFS (CPU, MMC, LCD, etc). Those may need a
different performance state of the domain for each of their OPPs, so
we can't always have this property in the device node.

Does this make sense ?

-- 
viresh
Rob Herring Dec. 27, 2017, 9:36 p.m. | #10
On Tue, Dec 26, 2017 at 10:45 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 26-12-17, 14:23, Rob Herring wrote:
>> >         cpu_opp_table: cpu_opp_table {
>> >                 compatible = "operating-points-v2";
>> >                 opp-shared;
>> >
>> >                 opp00 {
>> >                         opp-hz = /bits/ 64 <208000000>;
>> >                         clock-latency-ns = <500000>;
>> >                         power-domain-opp = <&domain_opp_1>;
>>
>> What is this? opp00 here is not a device. One OPP should not point to
>> another. "power-domain-opp" is only supposed to appear in devices
>> alongside power-domains properties.
>
> There are two type of devices:
>
> A.) With fixed performance state requirements and they will have the
> new "required-opp" property in the device node itself as you said.
>
> B.) Devices which can do DVFS (CPU, MMC, LCD, etc) and those may need
> a different performance state of the domain for their individual OPPs
> and so we can't have this property in the device all the time.
>
> Does this make sense ?


No. From the definition of power-domain-opp:

"+- power-domain-opp: This contains phandle to one of the OPP nodes of the master
+  power domain. This specifies the minimum required OPP of the master domain for
+  the functioning of the device in this OPP (where this property is present).
+  This property can only be set for a device if the device node contains the
+  "power-domains" property. Also, either all or none of the OPP nodes in an OPP
+  table should have it set."

In the above example, you are violating the next to last sentence.

Though, I'm now confused by what the last sentence means.

Rob
Viresh Kumar Dec. 28, 2017, 4:32 a.m. | #11
On 27-12-17, 15:36, Rob Herring wrote:
> On Tue, Dec 26, 2017 at 10:45 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 26-12-17, 14:23, Rob Herring wrote:
> >> >         cpu_opp_table: cpu_opp_table {
> >> >                 compatible = "operating-points-v2";
> >> >                 opp-shared;
> >> >
> >> >                 opp00 {
> >> >                         opp-hz = /bits/ 64 <208000000>;
> >> >                         clock-latency-ns = <500000>;
> >> >                         power-domain-opp = <&domain_opp_1>;
> >>
> >> What is this? opp00 here is not a device. One OPP should not point to
> >> another. "power-domain-opp" is only supposed to appear in devices
> >> alongside power-domains properties.
> >
> > There are two type of devices:
> >
> > A.) With fixed performance state requirements and they will have the
> > new "required-opp" property in the device node itself as you said.
> >
> > B.) Devices which can do DVFS (CPU, MMC, LCD, etc) and those may need
> > a different performance state of the domain for their individual OPPs
> > and so we can't have this property in the device all the time.
> >
> > Does this make sense ?
>
> No. From the definition for power-domain-opp
>
> "+- power-domain-opp: This contains phandle to one of the OPP nodes of the master
> +  power domain. This specifies the minimum required OPP of the master domain for
> +  the functioning of the device in this OPP (where this property is present).


The per-OPP case is mentioned here.

> +  This property can only be set for a device if the device node contains the
> +  "power-domains" property.


This was trying to say something else, though it wasn't clear, hence your
concerns.

I wanted to say that the device node or its OPP nodes can have the
"power-domain-opp" property only if the device node has a "power-domains"
property, i.e. you need to have a power domain first, and only then the
power-domain-opp property.
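
As a rough sketch of the two placements (the "foo" device, the "domain"
label and the 432 MHz OPP are made up for illustration; domain_opp_1 and
domain_opp_2 are the domain OPP nodes from the example quoted above, and
unrelated properties are omitted):

        /* Fixed requirement: the property sits in the device node. */
        foo {
                power-domains = <&domain>;
                power-domain-opp = <&domain_opp_1>;
        };

        /* DVFS device: the property sits in each of its OPP nodes. */
        cpu@0 {
                power-domains = <&domain>;
                operating-points-v2 = <&cpu_opp_table>;
        };

        cpu_opp_table: cpu_opp_table {
                compatible = "operating-points-v2";
                opp-shared;

                opp00 {
                        opp-hz = /bits/ 64 <208000000>;
                        power-domain-opp = <&domain_opp_1>;
                };

                opp01 {
                        /* hypothetical second OPP */
                        opp-hz = /bits/ 64 <432000000>;
                        power-domain-opp = <&domain_opp_2>;
                };
        };

In both cases the consumer device node carries "power-domains"; only the
location of "power-domain-opp" differs.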

> Also, either all or none of the OPP nodes in an OPP
> +  table should have it set."
>
> In the above example, you are violating the next to last sentence.
>
> Though, I'm now confused by what the last sentence means.


Yeah, let's leave it as is, since V8 has changed this significantly and you
have already Acked it :)

-- 
viresh

Patch

diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index 203e09fe7698..9c5056fb120f 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -166,6 +166,12 @@  properties.
   "power-domains" property. Also, either all or none of the OPP nodes in an OPP
   table should have it set.
 
+
+On some platforms the exact frequency or voltage may be hidden from the OS by
+the firmware and the "opp-hz" or the "opp-microvolt" properties may contain
+magic values that represent the frequency or voltage in a firmware dependent
+way, for example an index of an array in the firmware.
+
 Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together.
 
 / {