[3/6] cpufreq: governor: Drop min_sampling_rate

Message ID 713af1a417a9a77f0c41976b25874687ac235e8e.1498733506.git.viresh.kumar@linaro.org
State Superseded
Headers show
Series
  • cpufreq: transition-latency cleanups
Related show

Commit Message

Viresh Kumar June 29, 2017, 10:59 a.m.
The cpufreq core and governors aren't supposed to set a limit on how
fast we want to try changing the frequency. This is currently done for
the legacy governors with help of min_sampling_rate.

At worst, we may end up setting the sampling rate to a value lower than
the rate at which frequency can be changed and then one of the CPUs in
the policy will be only changing frequency for ever.

But that is something for the user to decide and there is no need to
have special handling for such cases in the core. Leave it for the user
to figure out.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

---
 Documentation/admin-guide/pm/cpufreq.rst |  8 --------
 drivers/cpufreq/cpufreq_conservative.c   |  6 ------
 drivers/cpufreq/cpufreq_governor.c       | 10 ++--------
 drivers/cpufreq/cpufreq_governor.h       |  1 -
 drivers/cpufreq/cpufreq_ondemand.c       | 12 ------------
 include/linux/cpufreq.h                  |  2 --
 6 files changed, 2 insertions(+), 37 deletions(-)

-- 
2.13.0.71.gd7076ec9c9cb

Comments

Viresh Kumar June 30, 2017, 3:34 a.m. | #1
On 29-06-17, 20:01, Dominik Brodowski wrote:
> On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:

> > The cpufreq core and governors aren't supposed to set a limit on how

> > fast we want to try changing the frequency. This is currently done for

> > the legacy governors with help of min_sampling_rate.

> > 

> > At worst, we may end up setting the sampling rate to a value lower than

> > the rate at which frequency can be changed and then one of the CPUs in

> > the policy will be only changing frequency for ever.

> 

> Is it safe to issue requests to change the CPU frequency so frequently,


Well, I assumed so. I am not sure the hardware would break though.
Overheating ?

> even

> on historic hardware such as speedstep-{ich,smi,centrino}? In the past,

> these checks more or less disallowed the running of dynamic frequency

> scaling at least on speedstep-smi[*],


We must by doing dynamic freq scaling even without this patch. I don't
see why you say the above then.

All we do here is that we get rid of the limit on how soon we can
change the freq again.

> but maybe on a few other platforms as

> well. That's why I am curious on whether this may break systems potentially

> on a hardware level if the hardware was not designed to do dynamic frequency

> scaling (and not just frequency switches on battery/AC).


Honestly I am not sure if any hardware can break or not, just because
of this commit.

-- 
viresh
Dominik Brodowski June 30, 2017, 4:53 a.m. | #2
On Fri, Jun 30, 2017 at 09:04:25AM +0530, Viresh Kumar wrote:
> On 29-06-17, 20:01, Dominik Brodowski wrote:

> > On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:

> > > The cpufreq core and governors aren't supposed to set a limit on how

> > > fast we want to try changing the frequency. This is currently done for

> > > the legacy governors with help of min_sampling_rate.

> > > 

> > > At worst, we may end up setting the sampling rate to a value lower than

> > > the rate at which frequency can be changed and then one of the CPUs in

> > > the policy will be only changing frequency for ever.

> > 

> > Is it safe to issue requests to change the CPU frequency so frequently,

> 

> Well, I assumed so. I am not sure the hardware would break though.

> Overheating ?

> 

> > even

> > on historic hardware such as speedstep-{ich,smi,centrino}? In the past,

> > these checks more or less disallowed the running of dynamic frequency

> > scaling at least on speedstep-smi[*],

> 

> We must by doing dynamic freq scaling even without this patch. I don't

> see why you say the above then.

> 

> All we do here is that we get rid of the limit on how soon we can

> change the freq again.


Well, as I understand it, first generation "speedstep" was designed more or
less to switch frequencies only when AC power was lost or restored.

The Linux implementation merely said: "no on-the-fly changes", but switch
frequencies whenever a user explicitly requested such a change (presumably
only every once in an unspecified while).

This same reasoning may be present in other drivers using CPUFREQ_ETERNAL.

> > but maybe on a few other platforms as

> > well. That's why I am curious on whether this may break systems potentially

> > on a hardware level if the hardware was not designed to do dynamic frequency

> > scaling (and not just frequency switches on battery/AC).

> 

> Honestly I am not sure if any hardware can break or not, just because

> of this commit.


I am not *sure* either, I am just worried of the consequences of doing
things out-of-spec...

Best
	Dominik
Viresh Kumar June 30, 2017, 5:40 a.m. | #3
On 30-06-17, 06:53, Dominik Brodowski wrote:
> On Fri, Jun 30, 2017 at 09:04:25AM +0530, Viresh Kumar wrote:

> > On 29-06-17, 20:01, Dominik Brodowski wrote:

> > > On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:

> > > > The cpufreq core and governors aren't supposed to set a limit on how

> > > > fast we want to try changing the frequency. This is currently done for

> > > > the legacy governors with help of min_sampling_rate.

> > > > 

> > > > At worst, we may end up setting the sampling rate to a value lower than

> > > > the rate at which frequency can be changed and then one of the CPUs in

> > > > the policy will be only changing frequency for ever.

> > > 

> > > Is it safe to issue requests to change the CPU frequency so frequently,

> > 

> > Well, I assumed so. I am not sure the hardware would break though.

> > Overheating ?

> > 

> > > even

> > > on historic hardware such as speedstep-{ich,smi,centrino}? In the past,


speedstep-smi is the only one which sets transition_latency to
CPUFREQ_ETERNAL and the others are putting some meaningful values. So
yes, they should be doing DVFS dynamically.

> > > these checks more or less disallowed the running of dynamic frequency

> > > scaling at least on speedstep-smi[*],

> > 

> > We must by doing dynamic freq scaling even without this patch. I don't

> > see why you say the above then.

> > 

> > All we do here is that we get rid of the limit on how soon we can

> > change the freq again.

> 

> Well, as I understand it, first generation "speedstep" was designed more or

> less to switch frequencies only when AC power was lost or restored.

> 

> The Linux implementation merely said: "no on-the-fly changes", but switch

> frequencies whenever a user explicitly requested such a change (presumably

> only every once in an unspecified while).

> 

> This same reasoning may be present in other drivers using CPUFREQ_ETERNAL.


Thanks for the explanation here and I am convinced that this series
has at least done one thing wrong. And that is removal of
max_transition_latency from governors and allowing ondemand to run on
such platforms (which may end up breaking them).

So I will actually modify that patch and set max_transition_latency to
CPUFREQ_ETERNAL for ondemand/conservative instead of 10ms. Also we
should do the same for schedutil as well, so that will also use the
max_transition_latency field.

But I hope, this patch will still be fine. Right ?

> I am not *sure* either, I am just worried of the consequences of doing

> things out-of-spec...


Thanks for your inputs Dominik.

-- 
viresh

Patch

diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index 09aa2e949787..6adbe1ed58b9 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -471,14 +471,6 @@  it is allowed to use (the ``scaling_max_freq`` policy limit).
 
 	# echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate
 
-
-``min_sampling_rate``
-	The minimum value of ``sampling_rate``.
-
-	Equal to 10000 (10 ms) if :c:macro:`CONFIG_NO_HZ_COMMON` and
-	:c:data:`tick_nohz_active` are both set or to 20 times the value of
-	:c:data:`jiffies` in microseconds otherwise.
-
 ``up_threshold``
 	If the estimated CPU load is above this value (in percent), the governor
 	will set the frequency to the maximum value allowed for the policy.
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 88220ff3e1c2..f20f20a77d4d 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -246,7 +246,6 @@  gov_show_one_common(sampling_rate);
 gov_show_one_common(sampling_down_factor);
 gov_show_one_common(up_threshold);
 gov_show_one_common(ignore_nice_load);
-gov_show_one_common(min_sampling_rate);
 gov_show_one(cs, down_threshold);
 gov_show_one(cs, freq_step);
 
@@ -254,12 +253,10 @@  gov_attr_rw(sampling_rate);
 gov_attr_rw(sampling_down_factor);
 gov_attr_rw(up_threshold);
 gov_attr_rw(ignore_nice_load);
-gov_attr_ro(min_sampling_rate);
 gov_attr_rw(down_threshold);
 gov_attr_rw(freq_step);
 
 static struct attribute *cs_attributes[] = {
-	&min_sampling_rate.attr,
 	&sampling_rate.attr,
 	&sampling_down_factor.attr,
 	&up_threshold.attr,
@@ -297,10 +294,7 @@  static int cs_init(struct dbs_data *dbs_data)
 	dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
 	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
 	dbs_data->ignore_nice_load = 0;
-
 	dbs_data->tuners = tuners;
-	dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
-		jiffies_to_usecs(10);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 47e24b5384b3..858081f9c3d7 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -47,14 +47,11 @@  ssize_t store_sampling_rate(struct gov_attr_set *attr_set, const char *buf,
 {
 	struct dbs_data *dbs_data = to_dbs_data(attr_set);
 	struct policy_dbs_info *policy_dbs;
-	unsigned int rate;
 	int ret;
-	ret = sscanf(buf, "%u", &rate);
+	ret = sscanf(buf, "%u", &dbs_data->sampling_rate);
 	if (ret != 1)
 		return -EINVAL;
 
-	dbs_data->sampling_rate = max(rate, dbs_data->min_sampling_rate);
-
 	/*
 	 * We are operating under dbs_data->mutex and so the list and its
 	 * entries can't be freed concurrently.
@@ -437,10 +434,7 @@  int cpufreq_dbs_governor_init(struct cpufreq_policy *policy)
 		latency = 1;
 
 	/* Bring kernel and HW constraints together */
-	dbs_data->min_sampling_rate = max(dbs_data->min_sampling_rate,
-					  MIN_LATENCY_MULTIPLIER * latency);
-	dbs_data->sampling_rate = max(dbs_data->min_sampling_rate,
-				      LATENCY_MULTIPLIER * latency);
+	dbs_data->sampling_rate = LATENCY_MULTIPLIER * latency;
 
 	if (!have_governor_per_policy())
 		gov->gdbs_data = dbs_data;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 7cbb07512e4c..06d9f90ede93 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -41,7 +41,6 @@  enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
 struct dbs_data {
 	struct gov_attr_set attr_set;
 	void *tuners;
-	unsigned int min_sampling_rate;
 	unsigned int ignore_nice_load;
 	unsigned int sampling_rate;
 	unsigned int sampling_down_factor;
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 3937acf7e026..6b423eebfd5d 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -319,7 +319,6 @@  gov_show_one_common(sampling_rate);
 gov_show_one_common(up_threshold);
 gov_show_one_common(sampling_down_factor);
 gov_show_one_common(ignore_nice_load);
-gov_show_one_common(min_sampling_rate);
 gov_show_one_common(io_is_busy);
 gov_show_one(od, powersave_bias);
 
@@ -329,10 +328,8 @@  gov_attr_rw(up_threshold);
 gov_attr_rw(sampling_down_factor);
 gov_attr_rw(ignore_nice_load);
 gov_attr_rw(powersave_bias);
-gov_attr_ro(min_sampling_rate);
 
 static struct attribute *od_attributes[] = {
-	&min_sampling_rate.attr,
 	&sampling_rate.attr,
 	&up_threshold.attr,
 	&sampling_down_factor.attr,
@@ -373,17 +370,8 @@  static int od_init(struct dbs_data *dbs_data)
 	if (idle_time != -1ULL) {
 		/* Idle micro accounting is supported. Use finer thresholds */
 		dbs_data->up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
-		/*
-		 * In nohz/micro accounting case we set the minimum frequency
-		 * not depending on HZ, but fixed (very low).
-		*/
-		dbs_data->min_sampling_rate = MICRO_FREQUENCY_MIN_SAMPLE_RATE;
 	} else {
 		dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
-
-		/* For correct statistics, we need 10 ticks for each measure */
-		dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
-			jiffies_to_usecs(10);
 	}
 
 	dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 54dfa1bdf138..083e7f3c23dd 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -488,9 +488,7 @@  static inline unsigned long cpufreq_scale(unsigned long old, u_int div,
  * ondemand governor will work on any processor with transition latency <= 10ms,
  * using appropriate sampling rate.
  */
-#define MIN_SAMPLING_RATE_RATIO		(2)
 #define LATENCY_MULTIPLIER		(1000)
-#define MIN_LATENCY_MULTIPLIER		(20)
 
 struct cpufreq_governor {
 	char	name[CPUFREQ_NAME_LEN];