Message ID: cover.1543219386.git.viresh.kumar@linaro.org
Series: PM / Domains: Allow performance state propagation
On 11/26/2018 1:39 PM, Viresh Kumar wrote:
> Hi,
>
> This series adds performance state propagation support in genpd core.
> The propagation happens from the sub-domains to their masters. More
> details can be found in the individual commit logs.
>
> This is tested on hikey960 by faking power domains in such a way that
> the CPU devices have two power domains and both of them have the same
> master domain. The CPU device, as well as its power domains have
> "required-opps" property set and the performance requirement from the
> CPU eventually configures all the domains (2 sub-domains and 1 master).

I validated this using the rpmh powerdomain driver [1], where I had to
model a relationship across the cx and mx powerdomains, so that mx is
always >= cx. Seems to work as expected. I will respin the rpmh
powerdomain patches soon (though it's awaiting Rob's review/ack for the
corner bindings).

Tested-by: Rajendra Nayak <rnayak@codeaurora.org>

[1] https://patchwork.ozlabs.org/cover/935289/

> Based on opp/linux-next branch (which is 4.20-rc1 +
> multiple-power-domain-support-in-opp-core + some OPP fixes).
>
> v1->V2:
> - First patch (1/5) is new and an improvement to earlier stuff.
> - Move genpd_status_on() check to _genpd_reeval_performance_state() from
>   _genpd_set_performance_state().
> - Improve dev_pm_opp_xlate_performance_state() to handle 1:1 pstate
>   mapping between genpd and its master and also to fix a problem while
>   finding the dst_table.
> - Handle pstate=0 case properly.
>
> --
> viresh
>
> Viresh Kumar (5):
>   OPP: Improve _find_table_of_opp_np()
>   OPP: Add dev_pm_opp_xlate_performance_state() helper
>   PM / Domains: Save OPP table pointer in genpd
>   PM / Domains: Factorize dev_pm_genpd_set_performance_state()
>   PM / Domains: Propagate performance state updates
>
>  drivers/base/power/domain.c | 211 +++++++++++++++++++++++++++---------
>  drivers/opp/core.c          |  59 ++++++++++
>  drivers/opp/of.c            |  14 ++-
>  include/linux/pm_domain.h   |   6 +
>  include/linux/pm_opp.h      |   7 ++
>  5 files changed, 244 insertions(+), 53 deletions(-)
On Mon, 26 Nov 2018 at 09:10, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Introduce a new helper dev_pm_opp_xlate_performance_state() which will
> be used to translate from pstate of a device to another one.

I don't get this, could you please elaborate?

>
> Initially this will be used by genpd to find pstate of a master domain
> using its sub-domain's pstate.

I assume pstate is the performance state of a genpd that you refer to,
no? Please try to be a bit more descriptive in your changelogs, to
avoid confusions.

>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/opp/core.c     | 59 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pm_opp.h |  7 +++++
>  2 files changed, 66 insertions(+)
>
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 0eaa954b3f6c..99b71f60b6d8 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -1707,6 +1707,65 @@ void dev_pm_opp_put_genpd_virt_dev(struct opp_table *opp_table,
>  		dev_err(virt_dev, "Failed to find required device entry\n");
>  }
>
> +/**
> + * dev_pm_opp_xlate_performance_state() - Find required OPP's pstate for src_table.
> + * @src_table: OPP table which has dst_table as one of its required OPP table.
> + * @dst_table: Required OPP table of the src_table.
> + * @pstate: Current performance state of the src_table.
> + *
> + * This Returns pstate of the OPP (present in @dst_table) pointed out by the
> + * "required-opps" property of the OPP (present in @src_table) which has
> + * performance state set to @pstate.
> + *
> + * Return: Positive performance state on success, otherwise 0 on errors.
> + */
> +unsigned int dev_pm_opp_xlate_performance_state(struct opp_table *src_table,
> +						struct opp_table *dst_table,
> +						unsigned int pstate)
> +{
> +	struct dev_pm_opp *opp;
> +	unsigned int dest_pstate = 0;
> +	int i;
> +
> +	/*
> +	 * Normally the src_table will have the "required_opps" property set to
> +	 * point to one of the OPPs in the dst_table, but in some cases the
> +	 * genpd and its master have one to one mapping of performance states
> +	 * and so none of them have the "required-opps" property set. Return the
> +	 * pstate of the src_table as it is in such cases.

This comment is good, but I also think some of this important
information is missing in the changelog.

> +	 */
> +	if (!src_table->required_opp_count)
> +		return pstate;
> +
> +	for (i = 0; i < src_table->required_opp_count; i++) {
> +		if (src_table->required_opp_tables[i]->np == dst_table->np)
> +			break;
> +	}
> +
> +	if (unlikely(i == src_table->required_opp_count)) {
> +		pr_err("%s: Couldn't find matching OPP table (%p: %p)\n",
> +		       __func__, src_table, dst_table);
> +		return 0;
> +	}
> +
> +	mutex_lock(&src_table->lock);
> +
> +	list_for_each_entry(opp, &src_table->opp_list, node) {
> +		if (opp->pstate == pstate) {
> +			dest_pstate = opp->required_opps[i]->pstate;
> +			goto unlock;
> +		}
> +	}
> +
> +	pr_err("%s: Couldn't find matching OPP (%p: %p)\n", __func__, src_table,
> +	       dst_table);
> +
> +unlock:
> +	mutex_unlock(&src_table->lock);
> +
> +	return dest_pstate;
> +}
> +
>  /**
>   * dev_pm_opp_add() - Add an OPP table from a table definitions
>   * @dev: device for which we do this operation
> diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> index 2b2c3fd985ab..5a64a81a1789 100644
> --- a/include/linux/pm_opp.h
> +++ b/include/linux/pm_opp.h
> @@ -128,6 +128,7 @@ struct opp_table *dev_pm_opp_register_set_opp_helper(struct device *dev, int (*s
>  void dev_pm_opp_unregister_set_opp_helper(struct opp_table *opp_table);
>  struct opp_table *dev_pm_opp_set_genpd_virt_dev(struct device *dev, struct device *virt_dev, int index);
>  void dev_pm_opp_put_genpd_virt_dev(struct opp_table *opp_table, struct device *virt_dev);
> +unsigned int dev_pm_opp_xlate_performance_state(struct opp_table *src_table, struct opp_table *dst_table, unsigned int pstate);
>  int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq);
>  int dev_pm_opp_set_sharing_cpus(struct device *cpu_dev, const struct cpumask *cpumask);
>  int dev_pm_opp_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask);
> @@ -280,6 +281,12 @@ static inline struct opp_table *dev_pm_opp_set_genpd_virt_dev(struct device *dev
>  }
>
>  static inline void dev_pm_opp_put_genpd_virt_dev(struct opp_table *opp_table, struct device *virt_dev) {}
> +
> +static inline unsigned int dev_pm_opp_xlate_performance_state(struct opp_table *src_table, struct opp_table *dst_table, unsigned int pstate)
> +{
> +	return 0;
> +}
> +
>  static inline int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>  {
>  	return -ENOTSUPP;
>  }
> --
> 2.19.1.568.g152ad8e3369a
>

Besides the comments, I think this makes sense to me. Although, let me
review the rest in the series before making a final call.

Kind regards
Uffe
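For readers following along outside the kernel tree, the two behaviours of the helper (lookup through required-opps, and 1:1 passthrough when no required-opps links exist) can be modeled with a small userspace sketch. The structures and names below (`model_opp`, `model_table`, `model_xlate_pstate`) are hypothetical simplifications, not the kernel API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of an OPP entry: its own pstate in the
 * sub-domain's table, plus the pstate of the "required" OPP it points to
 * in the master's table. */
struct model_opp {
	unsigned int pstate;          /* pstate in the src (sub-domain) table */
	unsigned int required_pstate; /* pstate of the linked master OPP */
};

struct model_table {
	const struct model_opp *opps;
	size_t count; /* 0 models a table without "required-opps" links */
};

/* Mirrors the documented behaviour: pass the pstate through unchanged in
 * the 1:1 mapping case, otherwise look up the linked master pstate.
 * Returns 0 when no OPP matches, like the kernel helper does on error. */
static unsigned int model_xlate_pstate(const struct model_table *src,
				       unsigned int pstate)
{
	size_t i;

	if (!src->count)
		return pstate; /* 1:1 mapping between genpd and master */

	for (i = 0; i < src->count; i++) {
		if (src->opps[i].pstate == pstate)
			return src->opps[i].required_pstate;
	}

	return 0; /* error: no matching OPP in src table */
}
```

The kernel version additionally has to find which of several required-opp tables corresponds to `dst_table` and take `src_table->lock`; the sketch keeps only the translation logic itself.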
On Mon, 26 Nov 2018 at 09:10, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Separate out _genpd_set_performance_state() and
> _genpd_reeval_performance_state() from
> dev_pm_genpd_set_performance_state() to handle performance state update
> related stuff. This will be used by a later commit.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Acked-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

> ---
>  drivers/base/power/domain.c | 105 +++++++++++++++++++++---------------
>  1 file changed, 62 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 0d928359b880..d6b389a9f678 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -239,6 +239,63 @@ static void genpd_update_accounting(struct generic_pm_domain *genpd)
>  static inline void genpd_update_accounting(struct generic_pm_domain *genpd) {}
>  #endif
>
> +static int _genpd_set_performance_state(struct generic_pm_domain *genpd,
> +					unsigned int state)
> +{
> +	int ret;
> +
> +	ret = genpd->set_performance_state(genpd, state);
> +	if (ret)
> +		return ret;
> +
> +	genpd->performance_state = state;
> +	return 0;
> +}
> +
> +static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
> +					   unsigned int state)
> +{
> +	struct generic_pm_domain_data *pd_data;
> +	struct pm_domain_data *pdd;
> +
> +	/* New requested state is same as Max requested state */
> +	if (state == genpd->performance_state)
> +		return 0;
> +
> +	/* New requested state is higher than Max requested state */
> +	if (state > genpd->performance_state)
> +		goto update_state;
> +
> +	/* Traverse all devices within the domain */
> +	list_for_each_entry(pdd, &genpd->dev_list, list_node) {
> +		pd_data = to_gpd_data(pdd);
> +
> +		if (pd_data->performance_state > state)
> +			state = pd_data->performance_state;
> +	}
> +
> +	if (state == genpd->performance_state)
> +		return 0;
> +
> +	/*
> +	 * We aren't propagating performance state changes of a subdomain to its
> +	 * masters as we don't have hardware that needs it. Over that, the
> +	 * performance states of subdomain and its masters may not have
> +	 * one-to-one mapping and would require additional information. We can
> +	 * get back to this once we have hardware that needs it. For that
> +	 * reason, we don't have to consider performance state of the subdomains
> +	 * of genpd here.
> +	 */
> +
> +update_state:
> +	if (!genpd_status_on(genpd)) {
> +		genpd->performance_state = state;
> +		return 0;
> +	}
> +
> +	return _genpd_set_performance_state(genpd, state);
> +}
> +
>  /**
>   * dev_pm_genpd_set_performance_state- Set performance state of device's power
>   *					domain.
> @@ -257,10 +314,9 @@ static inline void genpd_update_accounting(struct generic_pm_domain *genpd) {}
>  int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state)
>  {
>  	struct generic_pm_domain *genpd;
> -	struct generic_pm_domain_data *gpd_data, *pd_data;
> -	struct pm_domain_data *pdd;
> +	struct generic_pm_domain_data *gpd_data;
>  	unsigned int prev;
> -	int ret = 0;
> +	int ret;
>
>  	genpd = dev_to_genpd(dev);
>  	if (IS_ERR(genpd))
> @@ -281,47 +337,10 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state)
>  	prev = gpd_data->performance_state;
>  	gpd_data->performance_state = state;
>
> -	/* New requested state is same as Max requested state */
> -	if (state == genpd->performance_state)
> -		goto unlock;
> -
> -	/* New requested state is higher than Max requested state */
> -	if (state > genpd->performance_state)
> -		goto update_state;
> -
> -	/* Traverse all devices within the domain */
> -	list_for_each_entry(pdd, &genpd->dev_list, list_node) {
> -		pd_data = to_gpd_data(pdd);
> -
> -		if (pd_data->performance_state > state)
> -			state = pd_data->performance_state;
> -	}
> -
> -	if (state == genpd->performance_state)
> -		goto unlock;
> -
> -	/*
> -	 * We aren't propagating performance state changes of a subdomain to its
> -	 * masters as we don't have hardware that needs it. Over that, the
> -	 * performance states of subdomain and its masters may not have
> -	 * one-to-one mapping and would require additional information. We can
> -	 * get back to this once we have hardware that needs it. For that
> -	 * reason, we don't have to consider performance state of the subdomains
> -	 * of genpd here.
> -	 */
> -
> -update_state:
> -	if (genpd_status_on(genpd)) {
> -		ret = genpd->set_performance_state(genpd, state);
> -		if (ret) {
> -			gpd_data->performance_state = prev;
> -			goto unlock;
> -		}
> -	}
> -
> -	genpd->performance_state = state;
> +	ret = _genpd_reeval_performance_state(genpd, state);
> +	if (ret)
> +		gpd_data->performance_state = prev;
>
> -unlock:
>  	genpd_unlock(genpd);
>
>  	return ret;
> --
> 2.19.1.568.g152ad8e3369a
>
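The aggregation policy that this patch factors out is simple to state: the domain must run at the maximum state requested by any of its attached devices, and a full re-scan is only needed when a requester lowers its state. Below is a hypothetical userspace model of that policy (function and parameter names are invented for illustration; subdomain links are ignored, as in this pre-propagation version of the code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of _genpd_reeval_performance_state()'s aggregation:
 * dev_states[] holds the per-device requests, already updated with the
 * new request new_state; cur_state is the domain's current aggregate.
 * Returns the state the domain should now run at. */
static unsigned int model_reeval_state(const unsigned int *dev_states,
				       size_t count, unsigned int new_state,
				       unsigned int cur_state)
{
	size_t i;

	/* A request at or above the current aggregate wins immediately,
	 * no scan needed (the "same as Max" / "higher than Max" fast
	 * paths in the kernel code). */
	if (new_state >= cur_state)
		return new_state;

	/* Otherwise the device that previously demanded cur_state may be
	 * the one that just lowered its request, so every device must be
	 * re-checked to find the new maximum. */
	for (i = 0; i < count; i++) {
		if (dev_states[i] > new_state)
			new_state = dev_states[i];
	}

	return new_state;
}
```

The kernel version additionally skips the hardware call when the resulting state is unchanged, and defers it entirely while the domain is powered off by caching the value in `genpd->performance_state`.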
On 30-11-18, 09:53, Ulf Hansson wrote:
> > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > index 8e554e6a82a2..0d928359b880 100644
> > --- a/drivers/base/power/domain.c
> > +++ b/drivers/base/power/domain.c
> > @@ -1907,12 +1907,21 @@ int of_genpd_add_provider_simple(struct device_node *np,
> >  				ret);
> >  			goto unlock;
> >  		}
> > +
> > +		/*
> > +		 * Save table for faster processing while setting performance
> > +		 * state.
> > +		 */
> > +		genpd->opp_table = dev_pm_opp_get_opp_table(&genpd->dev);
> > +		WARN_ON(!genpd->opp_table);
>
> The WARN_ON seems lazy. Why not implement a proper error path?

We just added the table before this call and so we should get the table
no matter what. Initially I wanted to add a BUG_ON() but then I relaxed
it a bit :)

There is no point in an error path here, as it will never run.

--
viresh
On Mon, 26 Nov 2018 at 09:10, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> This commit updates genpd core to start propagating performance state
> updates to master domains that have their set_performance_state()
> callback set.
>
> A genpd handles two type of performance states now. The first one is the
> performance state requirement put on the genpd by the devices and
> sub-domains under it, which is already represented by
> genpd->performance_state.

This isn't correct.

Currently genpd->performance_state reflects the aggregation of the
state requirements from each device that is attached to it. Not from
subdomains.

> The second type, introduced in this commit, is
> the performance state requirement(s) put by the genpd on its master
> domain(s). There is a separate value required for each master that the
> genpd has and so a new field is added to the struct gpd_link
> (link->performance_state), which represents the link between a genpd and
> its master. The struct gpd_link also got another field
> prev_performance_state, which is used by genpd core as a temporary
> variable during transitions.
>
> We need to propagate setting performance state while powering-on a genpd
> as well, as we ignore performance state requirements from sub-domains
> which are powered-off. For this reason _genpd_power_on() also received
> the additional parameter, depth, which is used for hierarchical locking
> within genpd.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/base/power/domain.c | 113 ++++++++++++++++++++++++++++++------
>  include/linux/pm_domain.h   |   4 ++
>  2 files changed, 98 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index d6b389a9f678..206e263abe90 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -239,24 +239,88 @@ static void genpd_update_accounting(struct generic_pm_domain *genpd)
>  static inline void genpd_update_accounting(struct generic_pm_domain *genpd) {}
>  #endif
>
> +static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
> +					   unsigned int state, int depth);
> +

I don't like forward declarations like these, as in most cases it
means that the code can be re-worked and improved.

However, because of the recursiveness code in genpd, I understand that
it may be needed for some cases. Anyway, please have look and see if
you can rid of it.

>  static int _genpd_set_performance_state(struct generic_pm_domain *genpd,
> -					unsigned int state)
> +					unsigned int state, int depth)
>  {
> +	struct generic_pm_domain *master;
> +	struct gpd_link *link;
> +	unsigned int mstate;

please rename to master_state to clarify its use.

>  	int ret;
>
> +	/* Propagate to masters of genpd */
> +	list_for_each_entry(link, &genpd->slave_links, slave_node) {
> +		master = link->master;
> +
> +		if (!master->set_performance_state)
> +			continue;
> +
> +		if (unlikely(!state)) {
> +			mstate = 0;
> +		} else {
> +			/* Find master's performance state */
> +			mstate = dev_pm_opp_xlate_performance_state(genpd->opp_table,
> +					master->opp_table, state);
> +			if (unlikely(!mstate)) {
> +				ret = -EINVAL;
> +				goto err;
> +			}
> +		}
> +
> +		genpd_lock_nested(master, depth + 1);
> +
> +		link->prev_performance_state = link->performance_state;
> +		link->performance_state = mstate;
> +		ret = _genpd_reeval_performance_state(master, mstate, depth + 1);
> +		if (ret)
> +			link->performance_state = link->prev_performance_state;
> +
> +		genpd_unlock(master);
> +
> +		if (ret)
> +			goto err;
> +	}
> +
>  	ret = genpd->set_performance_state(genpd, state);
>  	if (ret)
> -		return ret;
> +		goto err;
>
>  	genpd->performance_state = state;
>  	return 0;
> +
> +err:
> +	/* Encountered an error, lets rollback */
> +	list_for_each_entry_continue_reverse(link, &genpd->slave_links,
> +					     slave_node) {
> +		master = link->master;
> +
> +		if (!master->set_performance_state)
> +			continue;
> +
> +		genpd_lock_nested(master, depth + 1);
> +
> +		mstate = link->prev_performance_state;
> +		link->performance_state = mstate;
> +
> +		if (_genpd_reeval_performance_state(master, mstate, depth + 1)) {
> +			pr_err("%s: Failed to roll back to %d performance state\n",
> +			       master->name, mstate);
> +		}
> +
> +		genpd_unlock(master);
> +	}
> +
> +	return ret;
>  }
>
>  static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
> -					   unsigned int state)
> +					   unsigned int state, int depth)
>  {
>  	struct generic_pm_domain_data *pd_data;
>  	struct pm_domain_data *pdd;
> +	struct gpd_link *link;
>
>  	/* New requested state is same as Max requested state */
>  	if (state == genpd->performance_state)
> @@ -274,18 +338,27 @@ static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
>  		state = pd_data->performance_state;
>  	}
>
> -	if (state == genpd->performance_state)
> -		return 0;
> -
>  	/*
> -	 * We aren't propagating performance state changes of a subdomain to its
> -	 * masters as we don't have hardware that needs it. Over that, the
> -	 * performance states of subdomain and its masters may not have
> -	 * one-to-one mapping and would require additional information. We can
> -	 * get back to this once we have hardware that needs it. For that
> -	 * reason, we don't have to consider performance state of the subdomains
> -	 * of genpd here.
> +	 * Traverse all powered-on subdomains within the domain. This can be
> +	 * done without any additional locking as the link->performance_state
> +	 * field is protected by the master genpd->lock, which is already taken.
> +	 *
> +	 * Also note that link->performance_state (subdomain's performance state
> +	 * requirement to master domain) is different from
> +	 * link->slave->performance_state (current performance state requirement
> +	 * of the devices/sub-domains of the subdomain) and so can have a
> +	 * different value.
>  	 */
> +	list_for_each_entry(link, &genpd->master_links, master_node) {
> +		if (!genpd_status_on(link->slave))
> +			continue;
> +
> +		if (link->performance_state > state)
> +			state = link->performance_state;
> +	}
> +
> +	if (state == genpd->performance_state)
> +		return 0;
>
>  update_state:
>  	if (!genpd_status_on(genpd)) {
> @@ -293,7 +366,7 @@ static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
>  		return 0;
>  	}
>
> -	return _genpd_set_performance_state(genpd, state);
> +	return _genpd_set_performance_state(genpd, state, depth);
>  }
>
>  /**
> @@ -337,7 +410,7 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state)
>  	prev = gpd_data->performance_state;
>  	gpd_data->performance_state = state;
>
> -	ret = _genpd_reeval_performance_state(genpd, state);
> +	ret = _genpd_reeval_performance_state(genpd, state, 0);
>  	if (ret)
>  		gpd_data->performance_state = prev;
>
> @@ -347,7 +420,8 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state)
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_genpd_set_performance_state);
>
> -static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
> +static int
> +_genpd_power_on(struct generic_pm_domain *genpd, bool timed, int depth)

Cosmetics, but please move the new parameter below instead of moving
"static int" above.

>  {
>  	unsigned int state_idx = genpd->state_idx;
>  	ktime_t time_start;
> @@ -368,7 +442,8 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
>  	elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
>
>  	if (unlikely(genpd->set_performance_state)) {
> -		ret = genpd->set_performance_state(genpd, genpd->performance_state);
> +		ret = _genpd_set_performance_state(genpd,
> +					genpd->performance_state, depth);

This is going to be a problem for genpd_sync_power_on() as in some
cases we can't allow to use locks.

Moreover I am wondering how this plays with genpd_power_on(), as when
it calls _genpd_power_on() the first time, that is done by holding the
top-level master's lock - and all locks in that hierarchy. In other
words we are always powering on the top most master first, then step
by step we move down in the hierarchy to power on the genpds.

>  		if (ret) {
>  			pr_warn("%s: Failed to set performance state %d (%d)\n",
>  				genpd->name, genpd->performance_state, ret);
> @@ -558,7 +633,7 @@ static int genpd_power_on(struct generic_pm_domain *genpd, unsigned int depth)
>  		}
>  	}
>
> -	ret = _genpd_power_on(genpd, true);
> +	ret = _genpd_power_on(genpd, true, depth);
>  	if (ret)
>  		goto err;
>
> @@ -963,7 +1038,7 @@ static void genpd_sync_power_on(struct generic_pm_domain *genpd, bool use_lock,
>  		genpd_unlock(link->master);
>  	}
>
> -	_genpd_power_on(genpd, false);
> +	_genpd_power_on(genpd, false, depth);
>
>  	genpd->status = GPD_STATE_ACTIVE;
>  }
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index 9ad101362aef..dd364abb649a 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -136,6 +136,10 @@ struct gpd_link {
>  	struct list_head master_node;
>  	struct generic_pm_domain *slave;
>  	struct list_head slave_node;
> +
> +	/* Sub-domain's per-master domain performance state */
> +	unsigned int performance_state;
> +	unsigned int prev_performance_state;
>  };
>
>  struct gpd_timing_data {
> --
> 2.19.1.568.g152ad8e3369a
>

Kind regards
Uffe
On 30-11-18, 10:44, Ulf Hansson wrote:
> On Mon, 26 Nov 2018 at 09:10, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > This commit updates genpd core to start propagating performance state
> > updates to master domains that have their set_performance_state()
> > callback set.
> >
> > A genpd handles two type of performance states now. The first one is the
> > performance state requirement put on the genpd by the devices and
> > sub-domains under it, which is already represented by
> > genpd->performance_state.
>
> This isn't correct.
>
> Currently genpd->performance_state reflects the aggregation of the
> state requirements from each device that is attached to it. Not from
> subdomains.

Okay, will improve the changelog.

> > The second type, introduced in this commit, is
> > the performance state requirement(s) put by the genpd on its master
> > domain(s). There is a separate value required for each master that the
> > genpd has and so a new field is added to the struct gpd_link
> > (link->performance_state), which represents the link between a genpd and
> > its master. The struct gpd_link also got another field
> > prev_performance_state, which is used by genpd core as a temporary
> > variable during transitions.
> >
> > We need to propagate setting performance state while powering-on a genpd
> > as well, as we ignore performance state requirements from sub-domains
> > which are powered-off. For this reason _genpd_power_on() also received
> > the additional parameter, depth, which is used for hierarchical locking
> > within genpd.
> >
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > ---
> >  drivers/base/power/domain.c | 113 ++++++++++++++++++++++++++++++------
> >  include/linux/pm_domain.h   |   4 ++
> >  2 files changed, 98 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > index d6b389a9f678..206e263abe90 100644
> > --- a/drivers/base/power/domain.c
> > +++ b/drivers/base/power/domain.c
> > @@ -239,24 +239,88 @@ static void genpd_update_accounting(struct generic_pm_domain *genpd)
> >  static inline void genpd_update_accounting(struct generic_pm_domain *genpd) {}
> >  #endif
> >
> > +static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
> > +					   unsigned int state, int depth);
> > +
>
> I don't like forward declarations like these, as in most cases it
> means that the code can be re-worked and improved.
>
> However, because of the recursiveness code in genpd, I understand that
> it may be needed for some cases. Anyway, please have look and see if
> you can rid of it.

Will see if I can do it any better, though I really doubt it :)

> >  static int _genpd_set_performance_state(struct generic_pm_domain *genpd,
> > -					unsigned int state)
> > +					unsigned int state, int depth)
>
> > -static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
> > +static int
> > +_genpd_power_on(struct generic_pm_domain *genpd, bool timed, int depth)
>
> Cosmetics, but please move the new parameter below instead of moving
> "static int" above.

> >  {
> >  	unsigned int state_idx = genpd->state_idx;
> >  	ktime_t time_start;
> > @@ -368,7 +442,8 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
> >  	elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
> >
> >  	if (unlikely(genpd->set_performance_state)) {
> > -		ret = genpd->set_performance_state(genpd, genpd->performance_state);
> > +		ret = _genpd_set_performance_state(genpd,
> > +					genpd->performance_state, depth);
>
> This is going to be a problem for genpd_sync_power_on() as in some
> cases we can't allow to use locks.

So we pass the use_lock parameter to this routine as well?

> Moreover I am wondering how this plays with genpd_power_on(), as when
> it calls _genpd_power_on() the first time, that is done by holding the
> top-level master's lock - and all locks in that hierarchy. In other
> words we are always powering on the top most master first, then step
> by step we move down in the hierarchy to power on the genpds.

Sure, but the ordering of locks is always subdomain first and then
master. Considering the case of Qcom, we have two domains: Cx
(sub-domain) and Mx (master).

On the first genpd_power_on(Cx) call, we will first call
genpd_power_on(Mx), which will just power it on as none of its masters
have perf-state support. We then call _genpd_power_on(Cx), which will
also not do anything with Mx as its own (Cx's) pstate would be 0 at
that time. But even if it had a valid value, it would propagate just
fine with all proper locking in place.

I will wait for your reply on this particular patch before sending out
V3. Thanks for the reviews Ulf.

--
viresh
On Fri, 30 Nov 2018 at 10:59, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 30-11-18, 10:44, Ulf Hansson wrote:
> > On Mon, 26 Nov 2018 at 09:10, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > >
> > > This commit updates genpd core to start propagating performance state
> > > updates to master domains that have their set_performance_state()
> > > callback set.
> > >
> > > A genpd handles two type of performance states now. The first one is the
> > > performance state requirement put on the genpd by the devices and
> > > sub-domains under it, which is already represented by
> > > genpd->performance_state.
> >
> > This isn't correct.
> >
> > Currently genpd->performance_state reflects the aggregation of the
> > state requirements from each device that is attached to it. Not from
> > subdomains.
>
> Okay, will improve the changelog.
>
> > > The second type, introduced in this commit, is
> > > the performance state requirement(s) put by the genpd on its master
> > > domain(s). There is a separate value required for each master that the
> > > genpd has and so a new field is added to the struct gpd_link
> > > (link->performance_state), which represents the link between a genpd and
> > > its master. The struct gpd_link also got another field
> > > prev_performance_state, which is used by genpd core as a temporary
> > > variable during transitions.
> > >
> > > We need to propagate setting performance state while powering-on a genpd
> > > as well, as we ignore performance state requirements from sub-domains
> > > which are powered-off. For this reason _genpd_power_on() also received
> > > the additional parameter, depth, which is used for hierarchical locking
> > > within genpd.
> > >
> > > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> > > ---
> > >  drivers/base/power/domain.c | 113 ++++++++++++++++++++++++++++++------
> > >  include/linux/pm_domain.h   |   4 ++
> > >  2 files changed, 98 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> > > index d6b389a9f678..206e263abe90 100644
> > > --- a/drivers/base/power/domain.c
> > > +++ b/drivers/base/power/domain.c
> > > @@ -239,24 +239,88 @@ static void genpd_update_accounting(struct generic_pm_domain *genpd)
> > >  static inline void genpd_update_accounting(struct generic_pm_domain *genpd) {}
> > >  #endif
> > >
> > > +static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
> > > +					   unsigned int state, int depth);
> > > +
> >
> > I don't like forward declarations like these, as in most cases it
> > means that the code can be re-worked and improved.
> >
> > However, because of the recursiveness code in genpd, I understand that
> > it may be needed for some cases. Anyway, please have look and see if
> > you can rid of it.
>
> Will see if I can do it any better, though I really doubt it :)
>
> > > static int _genpd_set_performance_state(struct generic_pm_domain *genpd,
> > > -					unsigned int state)
> > > +					unsigned int state, int depth)
> >
> > > -static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
> > > +static int
> > > +_genpd_power_on(struct generic_pm_domain *genpd, bool timed, int depth)
> >
> > Cosmetics, but please move the new parameter below instead of moving
> > "static int" above.
>
> > > {
> > > 	unsigned int state_idx = genpd->state_idx;
> > > 	ktime_t time_start;
> > > @@ -368,7 +442,8 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
> > > 	elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
> > >
> > > 	if (unlikely(genpd->set_performance_state)) {
> > > -		ret = genpd->set_performance_state(genpd, genpd->performance_state);
> > > +		ret = _genpd_set_performance_state(genpd,
> > > +					genpd->performance_state, depth);
> >
> > This is going to be a problem for genpd_sync_power_on() as in some
> > cases we can't allow to use locks.
>
> So we pass the use_lock parameter to this routine as well ?

Maybe.

>
> > Moreover I am wondering how this plays with genpd_power_on(), as when
> > it calls _genpd_power_on() the first time, that is done by holding the
> > top-level master's lock - and all locks in that hierarchy. In other
> > words we are always powering on the top most master first, then step
> > by step we move down in the hierarchy to power on the genpds.
>
> Sure, but the ordering of locks is always subdomain first and then master.
> Considering the case of Qcom, we have two domains Cx (sub-domain) and Mx (master).
>
> On first genpd_power_on(Cx) call, we will first call genpd_power_on(Mx) which
> will just power it on as none of its master will have perf-state support. We
> then call _genpd_power_on(Cx) which will also not do anything with Mx as its own
> (Cx's) pstate would be 0 at that time. But even if it had a valid value, it will
> propagate just fine with all proper locking in place.

Can you explain that, it's not super easy to follow the flow.

So what will happen if Cx has a value that needs to be propagated? What
locks will be taken, and in what order?

Following, what if we had a Bx domain, being the subdomain of Cx, and it
too had a value that needs to be propagated. It sounds like we will do
the propagation one time per level. Is that really necessary, couldn't
we just do it once, after the power on sequence have been completed?

Kind regards
Uffe
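The per-level walk being discussed can be sketched with a deliberately simplified model. The code below is hypothetical (single master per domain, no locking, and the OPP translation step modelled as identity), so it only illustrates the shape of the recursion: each domain resolves the maximum of its own device requests and the requirement its subdomain has put on the link, programs that, and then pushes the result up one level:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-master model of a genpd; sub_link_state stands in
 * for link->performance_state, and resolved records what the domain's
 * set_performance_state() callback would have received. */
struct model_domain {
	const char *name;
	struct model_domain *master; /* NULL for the top-level domain */
	unsigned int dev_state;      /* aggregate of device requests */
	unsigned int sub_link_state; /* requirement put on us by a subdomain */
	unsigned int resolved;
};

/* Resolve this domain's state, then propagate it up, one level per
 * recursive call (the "once per level" behaviour under discussion). */
static void model_propagate(struct model_domain *d)
{
	unsigned int state = d->dev_state;

	if (d->sub_link_state > state)
		state = d->sub_link_state;

	d->resolved = state;

	if (d->master) {
		/* Identity stands in for the xlate step between tables. */
		d->master->sub_link_state = state;
		model_propagate(d->master);
	}
}
```

With a three-level chain Bx -> Cx -> Mx (Bx being the hypothetical subdomain from the question), a request on Bx walks the whole chain, and a later request on Cx only re-walks from Cx upward; the lock order in the real code follows the same direction, subdomain before master.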
On 30-11-18, 09:45, Ulf Hansson wrote:
> On Mon, 26 Nov 2018 at 09:10, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > Introduce a new helper dev_pm_opp_xlate_performance_state() which will
> > be used to translate from pstate of a device to another one.
>
> I don't get this, could you please elaborate?
>
> >
> > Initially this will be used by genpd to find pstate of a master domain
> > using its sub-domain's pstate.
>
> I assume pstate is the performance state of a genpd that you refer to,
> no? Please try to be a bit more descriptive in your changelogs, to
> avoid confusions.

Here is the new changelog:

    OPP: Add dev_pm_opp_xlate_performance_state() helper

    dev_pm_genpd_set_performance_state() needs to handle performance
    state propagation going forward. Currently this routine only gets
    the required performance state of the device's genpd as an
    argument, but it doesn't know how to translate that to master
    genpd(s) of the device's genpd.

    Introduce a new helper dev_pm_opp_xlate_performance_state() which
    will be used to translate from performance state of a device (or
    genpd sub-domain) to another device (or master genpd).

    Normally the src_table (of genpd sub-domain) will have the
    "required_opps" property set to point to one of the OPPs in the
    dst_table (of master genpd), but in some cases the genpd and its
    master have one to one mapping of performance states and so none of
    them have the "required-opps" property set. Return the performance
    state of the src_table as it is in such cases.

    Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Hope this looks fine?

--
viresh
On 30-11-18, 10:44, Ulf Hansson wrote:
> On Mon, 26 Nov 2018 at 09:10, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > +static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
> > +                                           unsigned int state, int depth);
> > +
>
> I don't like forward declarations like these, as in most cases it
> means that the code can be re-worked and improved.
>
> However, because of the recursive code in genpd, I understand that
> it may be needed for some cases. Anyway, please have a look and see if
> you can get rid of it.

I tried, but I couldn't figure out a way to avoid this thing because of
the recursion. Even if I break some of the parts out to another routine,
the dependency stays as is.

Below is the new diff after making all the changes you suggested.

--
viresh

-------------------------8<-------------------------

Subject: [PATCH] PM / Domains: Propagate performance state updates

This commit updates genpd core to start propagating performance state
updates to master domains that have their set_performance_state()
callback set. Currently a genpd only handles the performance state
requirements from the devices under its control. This commit extends
that to also handle the performance state requirement(s) put on the
master genpd by its sub-domains.

There is a separate value required for each master that the genpd has,
and so a new field is added to struct gpd_link
(link->performance_state), which represents the link between a genpd
and its master. The struct gpd_link also gets another field,
prev_performance_state, which is used by genpd core as a temporary
variable during transitions.

We need to propagate the performance state while powering on a genpd
as well, as we ignore performance state requirements from sub-domains
which are powered off. For this reason _genpd_power_on() also receives
an additional parameter, depth, which is used for hierarchical locking
within genpd.

We sometimes can't take the genpd lock when _genpd_power_on() is called
from genpd_sync_power_on(), and to take care of that several routines
got an additional parameter, use_lock. This parameter is always set to
true if the call isn't initiated from genpd_sync_power_on(), else it
receives the use_lock parameter of genpd_sync_power_on() as is.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/domain.c | 122 ++++++++++++++++++++++++++++++------
 include/linux/pm_domain.h   |   4 ++
 2 files changed, 107 insertions(+), 19 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index d6b389a9f678..165a04646657 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -239,24 +239,97 @@ static void genpd_update_accounting(struct generic_pm_domain *genpd)
 static inline void genpd_update_accounting(struct generic_pm_domain *genpd) {}
 #endif

+static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
+                                           unsigned int state, int depth,
+                                           bool use_lock);
+
 static int _genpd_set_performance_state(struct generic_pm_domain *genpd,
-                                        unsigned int state)
+                                        unsigned int state, int depth,
+                                        bool use_lock)
 {
+        struct generic_pm_domain *master;
+        struct gpd_link *link;
+        unsigned int master_state;
        int ret;

+        /* Propagate to masters of genpd */
+        list_for_each_entry(link, &genpd->slave_links, slave_node) {
+                master = link->master;
+
+                if (!master->set_performance_state)
+                        continue;
+
+                if (unlikely(!state)) {
+                        master_state = 0;
+                } else {
+                        /* Find master's performance state */
+                        master_state = dev_pm_opp_xlate_performance_state(genpd->opp_table,
+                                                master->opp_table, state);
+                        if (unlikely(!master_state)) {
+                                ret = -EINVAL;
+                                goto err;
+                        }
+                }
+
+                if (use_lock)
+                        genpd_lock_nested(master, depth + 1);
+
+                link->prev_performance_state = link->performance_state;
+                link->performance_state = master_state;
+                ret = _genpd_reeval_performance_state(master, master_state,
+                                                      depth + 1, use_lock);
+                if (ret)
+                        link->performance_state = link->prev_performance_state;
+
+                if (use_lock)
+                        genpd_unlock(master);
+
+                if (ret)
+                        goto err;
+        }
+
        ret = genpd->set_performance_state(genpd, state);
        if (ret)
-                return ret;
+                goto err;

        genpd->performance_state = state;
        return 0;
+
+err:
+        /* Encountered an error, lets rollback */
+        list_for_each_entry_continue_reverse(link, &genpd->slave_links,
+                                             slave_node) {
+                master = link->master;
+
+                if (!master->set_performance_state)
+                        continue;
+
+                if (use_lock)
+                        genpd_lock_nested(master, depth + 1);
+
+                master_state = link->prev_performance_state;
+                link->performance_state = master_state;
+
+                if (_genpd_reeval_performance_state(master, master_state,
+                                                    depth + 1, use_lock)) {
+                        pr_err("%s: Failed to roll back to %d performance state\n",
+                               master->name, master_state);
+                }
+
+                if (use_lock)
+                        genpd_unlock(master);
+        }
+
+        return ret;
 }

 static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
-                                           unsigned int state)
+                                           unsigned int state, int depth,
+                                           bool use_lock)
 {
        struct generic_pm_domain_data *pd_data;
        struct pm_domain_data *pdd;
+        struct gpd_link *link;

        /* New requested state is same as Max requested state */
        if (state == genpd->performance_state)
@@ -274,18 +347,27 @@ static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
                state = pd_data->performance_state;
        }

-        if (state == genpd->performance_state)
-                return 0;
-
        /*
-         * We aren't propagating performance state changes of a subdomain to its
-         * masters as we don't have hardware that needs it. Over that, the
-         * performance states of subdomain and its masters may not have
-         * one-to-one mapping and would require additional information. We can
-         * get back to this once we have hardware that needs it. For that
-         * reason, we don't have to consider performance state of the subdomains
-         * of genpd here.
+         * Traverse all powered-on subdomains within the domain. This can be
+         * done without any additional locking as the link->performance_state
+         * field is protected by the master genpd->lock, which is already taken.
+         *
+         * Also note that link->performance_state (subdomain's performance state
+         * requirement to master domain) is different from
+         * link->slave->performance_state (current performance state requirement
+         * of the devices/sub-domains of the subdomain) and so can have a
+         * different value.
         */
+        list_for_each_entry(link, &genpd->master_links, master_node) {
+                if (!genpd_status_on(link->slave))
+                        continue;
+
+                if (link->performance_state > state)
+                        state = link->performance_state;
+        }
+
+        if (state == genpd->performance_state)
+                return 0;

 update_state:
        if (!genpd_status_on(genpd)) {
@@ -293,7 +375,7 @@ static int _genpd_reeval_performance_state(struct generic_pm_domain *genpd,
                return 0;
        }

-        return _genpd_set_performance_state(genpd, state);
+        return _genpd_set_performance_state(genpd, state, depth, use_lock);
 }

 /**
@@ -337,7 +419,7 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state)
        prev = gpd_data->performance_state;
        gpd_data->performance_state = state;

-        ret = _genpd_reeval_performance_state(genpd, state);
+        ret = _genpd_reeval_performance_state(genpd, state, 0, true);
        if (ret)
                gpd_data->performance_state = prev;

@@ -347,7 +429,8 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state)
 }
 EXPORT_SYMBOL_GPL(dev_pm_genpd_set_performance_state);

-static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
+static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed,
+                           int depth, bool use_lock)
 {
        unsigned int state_idx = genpd->state_idx;
        ktime_t time_start;
@@ -368,7 +451,8 @@ static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
        elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));

        if (unlikely(genpd->set_performance_state)) {
-                ret = genpd->set_performance_state(genpd, genpd->performance_state);
+                ret = _genpd_set_performance_state(genpd,
+                                        genpd->performance_state, depth, use_lock);
                if (ret) {
                        pr_warn("%s: Failed to set performance state %d (%d)\n",
                                genpd->name, genpd->performance_state, ret);
@@ -558,7 +642,7 @@ static int genpd_power_on(struct generic_pm_domain *genpd, unsigned int depth)
                }
        }

-        ret = _genpd_power_on(genpd, true);
+        ret = _genpd_power_on(genpd, true, depth, true);
        if (ret)
                goto err;

@@ -963,7 +1047,7 @@ static void genpd_sync_power_on(struct generic_pm_domain *genpd, bool use_lock,
                genpd_unlock(link->master);
        }

-        _genpd_power_on(genpd, false);
+        _genpd_power_on(genpd, false, depth, use_lock);

        genpd->status = GPD_STATE_ACTIVE;
 }
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 9ad101362aef..dd364abb649a 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -136,6 +136,10 @@ struct gpd_link {
        struct list_head master_node;
        struct generic_pm_domain *slave;
        struct list_head slave_node;
+
+        /* Sub-domain's per-master domain performance state */
+        unsigned int performance_state;
+        unsigned int prev_performance_state;
 };

 struct gpd_timing_data {
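The aggregation step the patch adds to _genpd_reeval_performance_state() — the new state is the maximum over the device votes and the link votes of powered-on subdomains only — can be modeled in plain userspace C. This is a sketch of the aggregation rule, not the kernel code; the `sim_*` types and fixed-size arrays are assumptions made for brevity:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_LINKS 4

/* Userspace model of a genpd: votes from attached devices plus
 * per-master link votes from its subdomains (the master_links side). */
struct sim_genpd {
    bool status_on;                          /* genpd_status_on() analogue */
    unsigned int device_states[MAX_LINKS];   /* votes from attached devices */
    size_t num_devices;
    struct sim_genpd *subdomains[MAX_LINKS]; /* link->slave analogue */
    unsigned int link_states[MAX_LINKS];     /* link->performance_state */
    size_t num_subdomains;
};

/* Model of the aggregation in _genpd_reeval_performance_state():
 * take the max of the incoming request, all device votes, and the
 * link votes of subdomains that are currently powered on. Votes of
 * powered-off subdomains are ignored, exactly as in the patch. */
unsigned int sim_aggregate_state(const struct sim_genpd *genpd,
                                 unsigned int state)
{
    for (size_t i = 0; i < genpd->num_devices; i++)
        if (genpd->device_states[i] > state)
            state = genpd->device_states[i];

    for (size_t i = 0; i < genpd->num_subdomains; i++) {
        if (!genpd->subdomains[i]->status_on)
            continue; /* powered-off subdomains don't contribute */
        if (genpd->link_states[i] > state)
            state = genpd->link_states[i];
    }
    return state;
}
```

Note that, as the patch comment explains, the per-link vote (`link_states[i]` here) is distinct from the subdomain's own aggregated state, since the mapping between a subdomain's and its master's performance states need not be 1:1.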
+ Stephen, Mike, Graham

On Fri, 30 Nov 2018 at 12:06, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 30-11-18, 11:18, Ulf Hansson wrote:
> > On Fri, 30 Nov 2018 at 10:59, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > Sure, but the ordering of locks is always subdomain first and then master.
> > > Considering the case of Qcom, we have two domains, Cx (sub-domain) and
> > > Mx (master).
> > >
> > > On the first genpd_power_on(Cx) call, we will first call genpd_power_on(Mx),
> > > which will just power it on as none of its masters will have perf-state
> > > support. We then call _genpd_power_on(Cx), which will also not do anything
> > > with Mx as its own (Cx's) pstate would be 0 at that time. But even if it
> > > had a valid value, it would propagate just fine with all proper locking
> > > in place.
> >
> > Can you explain that? It's not super easy to follow the flow.
>
> Sorry, I somehow assumed you would know it already :)
>
> > So what will happen if Cx has a value that needs to be propagated?
> > What locks will be taken, and in what order?
> >
> > Following, what if we had a Bx domain, being the subdomain of Cx, and
> > it too had a value that needs to be propagated.
>
> Lets take the worst example: we have Bx (sub-domain of Cx), Cx (sub-domain
> of Dx) and Dx (master). Normal power-on/off will always have the values 0,
> so lets consider the resume sequence where all the domains will have a
> valid pstate value. And please forgive me for any bugs I have introduced
> in the following super-complex sequence :)
>
> genpd_runtime_resume(dev) //domain Bx
>   -> genpd_lock(Bx)
>   -> genpd_power_on(Bx)
>
>     -> genpd_lock(Cx)
>     -> genpd_power_on(Cx)
>
>       -> genpd_lock(Dx)
>       -> genpd_power_on(Dx)
>
>         -> _genpd_power_on(Dx)
>           -> _genpd_set_performance_state(Dx, Dxstate) {
>             //Doesn't have any masters
>             -> genpd->set_performance_state(Dx, Dxstate);
>           }
>
>       -> genpd_unlock(Dx)
>
>     -> _genpd_power_on(Cx)
>       -> _genpd_set_performance_state(Cx, Cxstate) {
>         //have one master, Dx
>         -> genpd_lock(Dx)
>         -> _genpd_set_performance_state(Dx, Dxstate) {
>           //Doesn't have any masters
>           -> genpd->set_performance_state(Dx, Dxstate);
>         }
>         -> genpd_unlock(Dx)
>
>         // Change Cx state
>         -> genpd->set_performance_state(Cx, Cxstate);
>       }
>
>     -> genpd_unlock(Cx)
>
>   -> _genpd_power_on(Bx)
>     -> _genpd_set_performance_state(Bx, Bxstate) {
>       //have one master, Cx
>       -> genpd_lock(Cx)
>       -> _genpd_set_performance_state(Cx, Cxstate) {
>         //have one master, Dx
>         -> genpd_lock(Dx)
>         -> _genpd_set_performance_state(Dx, Dxstate) {
>           //Doesn't have any masters
>           -> genpd->set_performance_state(Dx, Dxstate);
>         }
>         -> genpd_unlock(Dx)
>
>         // Change Cx state
>         -> genpd->set_performance_state(Cx, Cxstate);
>       }
>       -> genpd_unlock(Cx)
>
>       -> genpd->set_performance_state(Bx, Bxstate);
>     }
>
>   -> genpd_unlock(Bx)

Thanks for clarifying. This confirms my worries about the locking overhead.

> > It sounds like we will do the propagation one time per level. Is that
> > really necessary, couldn't we just do it once, after the power on
> > sequence has been completed?
>
> It will be a BIG hack somewhere, isn't it? How will we know when the time
> has come to shoot the final sequence of set_performance_state()? And where
> will we do it? genpd_runtime_resume()?
> And then we will have more problems, for example Rajendra earlier compared
> this stuff to the clk framework, where it is possible to do clk_set_rate()
> first and only then call clk_enable(), and the same should be possible
> with genpd as well, i.e. set the performance state first and only then
> enable the device/domain. And so we need this right within
> genpd_power_on().

There is one big difference when comparing with clocks, which makes
this more difficult.

That is, in dev_pm_genpd_set_performance_state(), we are *not* calling
the ->set_performance_state() callback of the genpd unless the genpd is
already powered on. Instead, for that case, we are only aggregating the
performance state votes, deferring the invocation of
->set_performance_state() until the genpd becomes powered on. In some
way this makes sense, but for clk_set_rate(), the clock's rate can be
changed no matter if the clock has been prepared/enabled or not.

I recall we discussed this behavior of genpd while introducing the
performance states support to it. Reaching this point, introducing the
master-domain propagation of performance state votes, we may need to
re-consider the behavior, as there is evidently an overhead that grows
along with the hierarchy.

As a matter of fact, what I think this boils down to is to consider
whether we want to temporarily drop the performance state vote for a
device from genpd's ->runtime_suspend() callback, and thus also restore
the vote from genpd's ->runtime_resume() callback. That's because this
is directly related to whether genpd should care about whether it's
powered on or off when calling ->set_performance_state(). We have had
discussions on LKML already around this topic. It seems like we need to
pick them up to reach a consensus before we can move forward with this.

> I know things are repetitive here, but that's the right way of doing it
> IMHO. What do you say ?

At this point, honestly I don't know yet.
I have looped in Stephen, Mike and Graham, let's see if they have some
thoughts on the topic.

Kind regards
Uffe
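Ulf's suggestion — temporarily dropping a device's performance state vote from genpd's ->runtime_suspend() callback and restoring it from ->runtime_resume() — can be captured in a tiny model. This is a hypothetical userspace sketch of the proposed semantics, not an existing kernel API:

```c
#include <stdbool.h>

/* Model of the proposal: a device's vote is parked while the device is
 * runtime suspended, so it contributes nothing to aggregation, and the
 * original request is re-applied on resume. */
struct sim_dev_vote {
    unsigned int requested; /* what the consumer asked for */
    bool suspended;         /* runtime PM status of the device */
};

/* The vote that the genpd aggregation would actually see. */
unsigned int sim_effective_vote(const struct sim_dev_vote *v)
{
    return v->suspended ? 0 : v->requested;
}

/* genpd ->runtime_suspend() analogue: drop (park) the vote */
void sim_runtime_suspend(struct sim_dev_vote *v)
{
    v->suspended = true;
}

/* genpd ->runtime_resume() analogue: restore the vote */
void sim_runtime_resume(struct sim_dev_vote *v)
{
    v->suspended = false;
}
```

The point of the model is that ->set_performance_state() would no longer need to know whether the genpd is powered on: suspended consumers simply stop voting.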
Quoting Ulf Hansson (2018-12-03 05:38:35)
> + Stephen, Mike, Graham
>
> On Fri, 30 Nov 2018 at 12:06, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> >
> > On 30-11-18, 11:18, Ulf Hansson wrote:
>
> There is one big difference when comparing with clocks, which makes
> this more difficult.
>
> That is, in dev_pm_genpd_set_performance_state(), we are *not* calling
> the ->set_performance_state() callback of the genpd unless the genpd
> is already powered on. Instead, for that case, we are only aggregating
> the performance state votes, deferring the invocation of
> ->set_performance_state() until the genpd becomes powered on. In some
> way this makes sense, but for clk_set_rate(), the clock's rate can be
> changed no matter if the clock has been prepared/enabled or not.
>
> I recall we discussed this behavior of genpd while introducing the
> performance states support to it. Reaching this point, introducing the
> master-domain propagation of performance state votes, we may need to
> re-consider the behavior, as there is evidently an overhead that grows
> along with the hierarchy.
>
> As a matter of fact, what I think this boils down to is to consider
> whether we want to temporarily drop the performance state vote for a
> device from genpd's ->runtime_suspend() callback, and thus also restore
> the vote from genpd's ->runtime_resume() callback. That's because this
> is directly related to whether genpd should care about whether it's
> powered on or off when calling ->set_performance_state(). We have had
> discussions on LKML already around this topic. It seems like we need
> to pick them up to reach a consensus before we can move forward with
> this.

What are we trying to gain consensus on exactly?

I've been thinking that we would want there to be a performance state
'target' for the 'devices are runtime suspended' state that we apply
when the genpd is powered down from runtime PM suspend of all devices
in the domain.
This would then be overwritten with whatever the aggregated performance
state is when the genpd is powered back on due to some device runtime
resuming.

Don't we already sort of have support for this right now in genpd with
commit fc5cbf0c94b6 ("PM / Domains: Support for multiple states")? So I
would think that we either let that state_idx member be 0 or some
implementation defined 'off' state index of the genpd that can be
achieved. I was also thinking that the genpd_power_state was more than
just a bunch of idle states of a device, so that we could combine
on/off/idle/active (performance states?) into one genpd_power_state
space that is affected by idle and on/off runtime PM status.

In the Qualcomm use-case, the state of the clk, either on or off, is
factored into the decision to consider the voltage constraint that the
clk has on the CX domain. So when the clk is disabled, the voltage
constraint on CX can be removed/ignored in the aggregation equation
because the hardware isn't clocking. This needs to work properly so
that devices can disable their clks and save power.

I hope that we can move clk on/off/rate control to be implemented with
clk domains based upon genpds, so that we can generically reason about
genpds being on/off and at some performance state (i.e. a frequency)
instead of making something clk specific in the genpd code. If we can
do that, then this can be stated as: a slave genpd (the clk domain)
that's linked to a master genpd (the voltage/corner domain) can only
affect the performance state requirement of the master genpd when the
slave genpd is on. When the slave genpd is off, the performance state
requirement is dropped and the master domain settles to a lower
requirement based on the other domains linked to it that are on. The
clk domain would turn on and off when the set of devices it is attached
to are all suspended, and that would "do the right thing" by bubbling
up the requirements to the master domain this way.