Message ID | e6969d79643c87af24895d508e5c6f7462b1c758.1550748118.git.viresh.kumar@linaro.org |
---|---|
State | New |
Hi Viresh

On 02/21/19 16:59, Viresh Kumar wrote:

[...]

> @@ -2239,6 +2314,8 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  				struct cpufreq_policy *new_policy)
>  {
>  	struct cpufreq_governor *old_gov;
> +	struct device *cpu_dev = get_cpu_device(policy->cpu);
> +	unsigned long min, max;
>  	int ret;
>
>  	pr_debug("setting new policy for CPU %u: %u - %u kHz\n",
> @@ -2253,11 +2330,23 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  	if (new_policy->min > new_policy->max)
>  		return -EINVAL;
>
> +	min = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MIN_FREQUENCY);
> +	max = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MAX_FREQUENCY);
> +
> +	if (min > new_policy->min)
> +		new_policy->min = min;
> +	if (max < new_policy->max)
> +		new_policy->max = max;
> +

Assuming for example min and max range from 1-10, and thermal throttles max to
5 using pm_qos to deal with temporary thermal pressure. But shortly after
a driver thinks that max shouldn't be greater than 7 for one reason or another.

What will happen after thermal pressure removes its constraint? Will we still
remember the driver's request and apply it so max is set to 7 instead of 10?

Thanks

--
Qais Yousef

On 22-02-19, 11:44, Qais Yousef wrote:
> Hi Viresh
>
> On 02/21/19 16:59, Viresh Kumar wrote:
>
> [...]
>
> > @@ -2239,6 +2314,8 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
> >  				struct cpufreq_policy *new_policy)
> >  {
> >  	struct cpufreq_governor *old_gov;
> > +	struct device *cpu_dev = get_cpu_device(policy->cpu);
> > +	unsigned long min, max;
> >  	int ret;
> >
> >  	pr_debug("setting new policy for CPU %u: %u - %u kHz\n",
> > @@ -2253,11 +2330,23 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
> >  	if (new_policy->min > new_policy->max)
> >  		return -EINVAL;
> >
> > +	min = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MIN_FREQUENCY);
> > +	max = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MAX_FREQUENCY);
> > +
> > +	if (min > new_policy->min)
> > +		new_policy->min = min;
> > +	if (max < new_policy->max)
> > +		new_policy->max = max;
> > +
>
> Assuming for example min and max range from 1-10, and thermal throttles max to
> 5 using pm_qos to deal with temporary thermal pressure. But shortly after
> a driver thinks that max shouldn't be greater than 7 for one reason or another.
>
> What will happen after thermal pressure removes its constraint? Will we still
> remember the driver's request and apply it so max is set to 7 instead of 10?

Once everything comes via PM QoS, it will remember all the presently available
requests and choose a target min/max frequency based on that.

But even with this patchset, with half of the work done with PM QoS and half
done with cpufreq notifiers, it should still work that way.

--
viresh

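To make the answer above concrete, here is a minimal userspace sketch of the aggregation behaviour. It is not the kernel implementation: it assumes that the effective max frequency is simply the lowest of all currently active max requests, and the names `max_freq_constraints`, `add_max_request`, `remove_max_request` and `effective_max` are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <limits.h>

#define MAX_REQUESTS 8

/* Hypothetical stand-in for one device's list of max-frequency requests. */
struct max_freq_constraints {
	unsigned int values[MAX_REQUESTS];
	int active[MAX_REQUESTS];
};

/* Record a request; returns its slot so it can be dropped later, or -1 if full. */
static int add_max_request(struct max_freq_constraints *c, unsigned int value)
{
	for (int i = 0; i < MAX_REQUESTS; i++) {
		if (!c->active[i]) {
			c->active[i] = 1;
			c->values[i] = value;
			return i;
		}
	}
	return -1;
}

/* Drop one request; every other request stays recorded. */
static void remove_max_request(struct max_freq_constraints *c, int slot)
{
	assert(slot >= 0 && slot < MAX_REQUESTS);
	c->active[slot] = 0;
}

/* Assumed aggregation rule: lowest active request, or "no limit" if none. */
static unsigned int effective_max(const struct max_freq_constraints *c)
{
	unsigned int result = UINT_MAX;

	for (int i = 0; i < MAX_REQUESTS; i++)
		if (c->active[i] && c->values[i] < result)
			result = c->values[i];
	return result;
}
```

With the numbers from the question: thermal adds a request of 5 and a driver adds 7, so `effective_max()` yields 5. Once the thermal request is removed it yields 7 rather than 10, because the driver's request is still recorded.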
On 02/25/19 10:01, Viresh Kumar wrote:
> On 22-02-19, 11:44, Qais Yousef wrote:
> > Hi Viresh
> >
> > On 02/21/19 16:59, Viresh Kumar wrote:
> >
> > [...]
> >
> > > @@ -2239,6 +2314,8 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
> > >  				struct cpufreq_policy *new_policy)
> > >  {
> > >  	struct cpufreq_governor *old_gov;
> > > +	struct device *cpu_dev = get_cpu_device(policy->cpu);
> > > +	unsigned long min, max;
> > >  	int ret;
> > >
> > >  	pr_debug("setting new policy for CPU %u: %u - %u kHz\n",
> > > @@ -2253,11 +2330,23 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
> > >  	if (new_policy->min > new_policy->max)
> > >  		return -EINVAL;
> > >
> > > +	min = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MIN_FREQUENCY);
> > > +	max = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MAX_FREQUENCY);
> > > +
> > > +	if (min > new_policy->min)
> > > +		new_policy->min = min;
> > > +	if (max < new_policy->max)
> > > +		new_policy->max = max;
> > > +
> >
> > Assuming for example min and max range from 1-10, and thermal throttles max to
> > 5 using pm_qos to deal with temporary thermal pressure. But shortly after
> > a driver thinks that max shouldn't be greater than 7 for one reason or another.
> >
> > What will happen after thermal pressure removes its constraint? Will we still
> > remember the driver's request and apply it so max is set to 7 instead of 10?
>
> Once everything comes via PM QoS, it will remember all the presently available
> requests and choose a target min/max frequency based on that.

OK, I can see the logic in kernel/power/qos.c now. Sorry I missed it and it was
easier to ask :-)

>
> But even with this patchset, with half of the work done with PM QoS and half
> done with cpufreq notifiers, it should still work that way.

And this is why we need to check here if the PM QoS value doesn't conflict with
the current min/max, right? Until the current notifier code is removed they
could trip over each other.

It would be nice to add a comment here about PM QoS managing and remembering
values and that we need to be careful that both mechanisms don't trip over
each other until this transient period is over.

I have a nit too. It would be nice to explicitly state this is
CPU_{MIN,MAX}_FREQUENCY. I can see someone else adding {MIN,MAX}_FREQUENCY for
something else (memory maybe?)

I only looked at the previous series briefly, but this one looks more
compact and easier to follow, so +1 for that.

--
Qais Yousef

On 25-02-19, 08:58, Qais Yousef wrote:
> On 02/25/19 10:01, Viresh Kumar wrote:
> > > > +	min = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MIN_FREQUENCY);
> > > > +	max = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MAX_FREQUENCY);
> > > > +
> > > > +	if (min > new_policy->min)
> > > > +		new_policy->min = min;
> > > > +	if (max < new_policy->max)
> > > > +		new_policy->max = max;

> And this is why we need to check here if the PM QoS value doesn't conflict with
> the current min/max, right? Until the current notifier code is removed they
> could trip over each other.

No. The above if/else block is already removed as part of patch 5/5. It was
required because of conflict between userspace specific min/max and qos min/max,
which are migrated to use qos by patch 5/5.

The cpufreq notifier mechanism already lets users play with min/max and that is
already safe from conflicts.

> It would be nice to add a comment here about PM QoS managing and remembering
> values

I am not sure if that would add any value. Some documentation update may be
useful for people looking for details though, that I shall do after all the
changes get in and things become a bit stable.

> and that we need to be careful that both mechanisms don't trip over
> each other until this transient period is over.

The second mechanism will die very very soon once this is merged, migrating them
shouldn't be a big challenge AFAICT. I didn't attempt that because I didn't
want to waste time updating things in case this version also doesn't make
sense to others.

> I have a nit too. It would be nice to explicitly state this is
> CPU_{MIN,MAX}_FREQUENCY. I can see someone else adding {MIN,MAX}_FREQUENCY for
> something else (memory maybe?)

This is not CPU specific, but any device. The same interface shall be used by
devfreq as well, who wanted to use freq-constraints initially.

> I only looked at the previous series briefly, but this one looks more
> compact and easier to follow, so +1 for that.

Thanks for looking into this Qais.

--
viresh

On 02/25/19 14:39, Viresh Kumar wrote:
> On 25-02-19, 08:58, Qais Yousef wrote:
> > On 02/25/19 10:01, Viresh Kumar wrote:
> > > > > +	min = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MIN_FREQUENCY);
> > > > > +	max = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MAX_FREQUENCY);
> > > > > +
> > > > > +	if (min > new_policy->min)
> > > > > +		new_policy->min = min;
> > > > > +	if (max < new_policy->max)
> > > > > +		new_policy->max = max;
>
> > And this is why we need to check here if the PM QoS value doesn't conflict with
> > the current min/max, right? Until the current notifier code is removed they
> > could trip over each other.
>
> No. The above if/else block is already removed as part of patch 5/5. It was
> required because of conflict between userspace specific min/max and qos min/max,
> which are migrated to use qos by patch 5/5.
>
> The cpufreq notifier mechanism already lets users play with min/max and that is
> already safe from conflicts.
>
> > It would be nice to add a comment here about PM QoS managing and remembering
> > values
>
> I am not sure if that would add any value. Some documentation update may be
> useful for people looking for details though, that I shall do after all the
> changes get in and things become a bit stable.
>

Up to you. But not everyone is familiar with the code and a one line comment
that points to where aggregation is happening would be helpful for someone
scanning this code IMHO.

> > and that we need to be careful that both mechanisms don't trip over
> > each other until this transient period is over.
>
> The second mechanism will die very very soon once this is merged, migrating them
> shouldn't be a big challenge AFAICT. I didn't attempt that because I didn't
> want to waste time updating things in case this version also doesn't make
> sense to others.
>
> > I have a nit too. It would be nice to explicitly state this is
> > CPU_{MIN,MAX}_FREQUENCY. I can see someone else adding {MIN,MAX}_FREQUENCY for
> > something else (memory maybe?)
>
> This is not CPU specific, but any device. The same interface shall be used by
> devfreq as well, who wanted to use freq-constraints initially.
>

I don't get that to be honest. I probably have to read more.

Is what you're saying that when applying a MIN_FREQUENCY constraint the same
value will be applied to both cpufreq and devfreq? Isn't this too coarse?

> > I only looked at the previous series briefly, but this one looks more
> > compact and easier to follow, so +1 for that.
>
> Thanks for looking into this Qais.
>
> --
> viresh

Thanks

--
Qais Yousef

On 02/26/19 08:00, Viresh Kumar wrote:
> On 25-02-19, 12:14, Qais Yousef wrote:
> > On 02/25/19 14:39, Viresh Kumar wrote:
> > > On 25-02-19, 08:58, Qais Yousef wrote:
> > > > On 02/25/19 10:01, Viresh Kumar wrote:
> > > > > > > +	min = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MIN_FREQUENCY);
> > > > > > > +	max = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MAX_FREQUENCY);
> > > > > > > +
> > > > > > > +	if (min > new_policy->min)
> > > > > > > +		new_policy->min = min;
> > > > > > > +	if (max < new_policy->max)
> > > > > > > +		new_policy->max = max;
> > >
> > > > And this is why we need to check here if the PM QoS value doesn't conflict with
> > > > the current min/max, right? Until the current notifier code is removed they
> > > > could trip over each other.
> > >
> > > No. The above if/else block is already removed as part of patch 5/5. It was
> > > required because of conflict between userspace specific min/max and qos min/max,
> > > which are migrated to use qos by patch 5/5.
> > >
> > > The cpufreq notifier mechanism already lets users play with min/max and that is
> > > already safe from conflicts.
> > >
> > > > It would be nice to add a comment here about PM QoS managing and remembering
> > > > values
> > >
> > > I am not sure if that would add any value. Some documentation update may be
> > > useful for people looking for details though, that I shall do after all the
> > > changes get in and things become a bit stable.
> > >
> >
> > Up to you. But not everyone is familiar with the code and a one line comment
> > that points to where aggregation is happening would be helpful for someone
> > scanning this code IMHO.
>
> Okay, will add something then.
>
> > > > and that we need to be careful that both mechanisms don't trip over
> > > > each other until this transient period is over.
> > >
> > > The second mechanism will die very very soon once this is merged, migrating them
> > > shouldn't be a big challenge AFAICT. I didn't attempt that because I didn't
> > > want to waste time updating things in case this version also doesn't make
> > > sense to others.
> > >
> > > > I have a nit too. It would be nice to explicitly state this is
> > > > CPU_{MIN,MAX}_FREQUENCY. I can see someone else adding {MIN,MAX}_FREQUENCY for
> > > > something else (memory maybe?)
> > >
> > > This is not CPU specific, but any device. The same interface shall be used by
> > > devfreq as well, who wanted to use freq-constraints initially.
> > >
> >
> > I don't get that to be honest. I probably have to read more.
> >
> > Is what you're saying that when applying a MIN_FREQUENCY constraint the same
> > value will be applied to both cpufreq and devfreq? Isn't this too coarse?
>
> Oh no. A constraint with QoS is added like this:
>
> dev_pm_qos_add_request(dev, req, DEV_PM_QOS_MIN_FREQUENCY, min);
>
> Now dev here can be any device struct, CPU's or GPU's or anything else. All the
> MIN freq requests are stored/processed per device and for a CPU in cpufreq all
> we will see is MIN requests for the CPUs. And so the macro is required to be a
> bit generic and shouldn't have CPU word within it.
>
> Hope I was able to clarify your doubt a bit. Thanks.

Ah, I see. Yes, it all makes sense now. Thanks!

--
Qais Yousef

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 0e626b00053b..d615bf35ac00 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -26,6 +26,7 @@
 #include <linux/kernel_stat.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/pm_qos.h>
 #include <linux/slab.h>
 #include <linux/suspend.h>
 #include <linux/syscore_ops.h>
@@ -1082,11 +1083,77 @@ static void handle_update(struct work_struct *work)
 	cpufreq_update_policy(cpu);
 }
 
+static void cpufreq_update_freq_work(struct work_struct *work)
+{
+	struct cpufreq_policy *policy =
+		container_of(work, struct cpufreq_policy, req_work);
+	struct cpufreq_policy new_policy = *policy;
+
+	/* We should read constraint values from QoS layer */
+	new_policy.min = 0;
+	new_policy.max = UINT_MAX;
+
+	down_write(&policy->rwsem);
+
+	if (!policy_is_inactive(policy))
+		cpufreq_set_policy(policy, &new_policy);
+
+	up_write(&policy->rwsem);
+}
+
+static int cpufreq_update_freq(struct cpufreq_policy *policy)
+{
+	schedule_work(&policy->req_work);
+	return 0;
+}
+
+static int cpufreq_notifier_min(struct notifier_block *nb, unsigned long freq,
+				void *data)
+{
+	struct cpufreq_policy *policy = container_of(nb, struct cpufreq_policy, nb_min);
+
+	return cpufreq_update_freq(policy);
+}
+
+static int cpufreq_notifier_max(struct notifier_block *nb, unsigned long freq,
+				void *data)
+{
+	struct cpufreq_policy *policy = container_of(nb, struct cpufreq_policy, nb_max);
+
+	return cpufreq_update_freq(policy);
+}
+
+static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy)
+{
+	struct kobject *kobj;
+	struct completion *cmp;
+
+	down_write(&policy->rwsem);
+	cpufreq_stats_free_table(policy);
+	kobj = &policy->kobj;
+	cmp = &policy->kobj_unregister;
+	up_write(&policy->rwsem);
+	kobject_put(kobj);
+
+	/*
+	 * We need to make sure that the underlying kobj is
+	 * actually not referenced anymore by anybody before we
+	 * proceed with unloading.
+	 */
+	pr_debug("waiting for dropping of refcount\n");
+	wait_for_completion(cmp);
+	pr_debug("wait complete\n");
+}
+
 static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)
 {
 	struct cpufreq_policy *policy;
+	struct device *dev = get_cpu_device(cpu);
 	int ret;
 
+	if (!dev)
+		return NULL;
+
 	policy = kzalloc(sizeof(*policy), GFP_KERNEL);
 	if (!policy)
 		return NULL;
@@ -1103,20 +1170,45 @@ static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)
 	ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq,
 				   cpufreq_global_kobject, "policy%u", cpu);
 	if (ret) {
-		pr_err("%s: failed to init policy->kobj: %d\n", __func__, ret);
+		dev_err(dev, "%s: failed to init policy->kobj: %d\n", __func__, ret);
 		goto err_free_real_cpus;
 	}
 
+	policy->nb_min.notifier_call = cpufreq_notifier_min;
+	policy->nb_max.notifier_call = cpufreq_notifier_max;
+
+	ret = dev_pm_qos_add_notifier(dev, &policy->nb_min,
+				      DEV_PM_QOS_MIN_FREQUENCY);
+	if (ret) {
+		dev_err(dev, "Failed to register MIN QoS notifier: %d (%*pbl)\n",
+			ret, cpumask_pr_args(policy->cpus));
+		goto err_kobj_remove;
+	}
+
+	ret = dev_pm_qos_add_notifier(dev, &policy->nb_max,
+				      DEV_PM_QOS_MAX_FREQUENCY);
+	if (ret) {
+		dev_err(dev, "Failed to register MAX QoS notifier: %d (%*pbl)\n",
+			ret, cpumask_pr_args(policy->cpus));
+		goto err_min_qos_notifier;
+	}
+
 	INIT_LIST_HEAD(&policy->policy_list);
 	init_rwsem(&policy->rwsem);
 	spin_lock_init(&policy->transition_lock);
 	init_waitqueue_head(&policy->transition_wait);
 	init_completion(&policy->kobj_unregister);
 	INIT_WORK(&policy->update, handle_update);
+	INIT_WORK(&policy->req_work, cpufreq_update_freq_work);
 
 	policy->cpu = cpu;
 	return policy;
 
+err_min_qos_notifier:
+	dev_pm_qos_remove_notifier(dev, &policy->nb_min,
+				   DEV_PM_QOS_MIN_FREQUENCY);
+err_kobj_remove:
+	cpufreq_policy_put_kobj(policy);
 err_free_real_cpus:
 	free_cpumask_var(policy->real_cpus);
 err_free_rcpumask:
@@ -1129,30 +1221,9 @@ static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)
 	return NULL;
 }
 
-static void cpufreq_policy_put_kobj(struct cpufreq_policy *policy)
-{
-	struct kobject *kobj;
-	struct completion *cmp;
-
-	down_write(&policy->rwsem);
-	cpufreq_stats_free_table(policy);
-	kobj = &policy->kobj;
-	cmp = &policy->kobj_unregister;
-	up_write(&policy->rwsem);
-	kobject_put(kobj);
-
-	/*
-	 * We need to make sure that the underlying kobj is
-	 * actually not referenced anymore by anybody before we
-	 * proceed with unloading.
-	 */
-	pr_debug("waiting for dropping of refcount\n");
-	wait_for_completion(cmp);
-	pr_debug("wait complete\n");
-}
-
 static void cpufreq_policy_free(struct cpufreq_policy *policy)
 {
+	struct device *dev = get_cpu_device(policy->cpu);
 	unsigned long flags;
 	int cpu;
 
@@ -1164,6 +1235,10 @@ static void cpufreq_policy_free(struct cpufreq_policy *policy)
 		per_cpu(cpufreq_cpu_data, cpu) = NULL;
 	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
+	dev_pm_qos_remove_notifier(dev, &policy->nb_max,
+				   DEV_PM_QOS_MAX_FREQUENCY);
+	dev_pm_qos_remove_notifier(dev, &policy->nb_min,
+				   DEV_PM_QOS_MIN_FREQUENCY);
 	cpufreq_policy_put_kobj(policy);
 	free_cpumask_var(policy->real_cpus);
 	free_cpumask_var(policy->related_cpus);
@@ -2239,6 +2314,8 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 				struct cpufreq_policy *new_policy)
 {
 	struct cpufreq_governor *old_gov;
+	struct device *cpu_dev = get_cpu_device(policy->cpu);
+	unsigned long min, max;
 	int ret;
 
 	pr_debug("setting new policy for CPU %u: %u - %u kHz\n",
@@ -2253,11 +2330,23 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
 	if (new_policy->min > new_policy->max)
 		return -EINVAL;
 
+	min = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MIN_FREQUENCY);
+	max = dev_pm_qos_read_value(cpu_dev, DEV_PM_QOS_MAX_FREQUENCY);
+
+	if (min > new_policy->min)
+		new_policy->min = min;
+	if (max < new_policy->max)
+		new_policy->max = max;
+
 	/* verify the cpu speed can be set within this limit */
 	ret = cpufreq_driver->verify(new_policy);
 	if (ret)
 		return ret;
 
+	/*
+	 * The notifier-chain shall be removed once all the users of
+	 * CPUFREQ_ADJUST are moved to use the QoS framework.
+	 */
 	/* adjust if necessary - all reasons */
 	blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
 			CPUFREQ_ADJUST, new_policy);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index b160e98076e3..f8f48d8a9b52 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -90,6 +90,7 @@ struct cpufreq_policy {
 
 	struct work_struct	update; /* if update_policy() needs to be
 					 * called, but you're in IRQ context */
+	struct work_struct	req_work;
 
 	struct cpufreq_user_policy user_policy;
 	struct cpufreq_frequency_table	*freq_table;
@@ -154,6 +155,9 @@ struct cpufreq_policy {
 
 	/* Pointer to the cooling device if used for thermal mitigation */
 	struct thermal_cooling_device *cdev;
+
+	struct notifier_block nb_min;
+	struct notifier_block nb_max;
 };
 
 /* Only for ACPI */

This registers the notifiers for min/max frequency constraints with the
PM QoS framework. The constraints are also taken into consideration in
cpufreq_set_policy().

This also relocates cpufreq_policy_put_kobj() as it is required to be
called from cpufreq_policy_alloc() now.

No constraints are added yet, though.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq.c | 135 +++++++++++++++++++++++++++++++-------
 include/linux/cpufreq.h   |   4 ++
 2 files changed, 116 insertions(+), 23 deletions(-)

--
2.21.0.rc0.269.g1a574e7a288b
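
As a reading aid, the way the constraints are taken into consideration in cpufreq_set_policy() can be modelled outside the kernel roughly as follows. This is a sketch, not the patch itself: `freq_limits` and `apply_qos_limits` are hypothetical names, the QoS values are passed in as plain arguments instead of being read with dev_pm_qos_read_value(), and a return value of -1 stands in for -EINVAL.

```c
#include <assert.h>

/* Hypothetical stand-in for the min/max pair carried by a policy. */
struct freq_limits {
	unsigned long min;
	unsigned long max;
};

/*
 * Narrow the requested limits by the aggregated QoS values: the QoS min
 * can only raise the requested min, the QoS max can only lower the
 * requested max. An inverted request is rejected before any narrowing.
 */
static int apply_qos_limits(struct freq_limits *req,
			    unsigned long qos_min, unsigned long qos_max)
{
	assert(req);

	if (req->min > req->max)
		return -1;

	if (qos_min > req->min)
		req->min = qos_min;
	if (qos_max < req->max)
		req->max = qos_max;

	return 0;
}
```

A wide-open request such as the one built in cpufreq_update_freq_work() (min 0, max at the largest value) therefore ends up exactly at the QoS limits, while a request that is already narrower than the QoS limits is left unchanged.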