[v6,3/7] Add infrastructure to store and update instantaneous thermal pressure

Message ID 1576123908-12105-4-git-send-email-thara.gopinath@linaro.org
State New
Series
  • Introduce Thermal Pressure

Commit Message

Thara Gopinath Dec. 12, 2019, 4:11 a.m.
Add architecture specific APIs to update and track thermal pressure on a
per cpu basis. A per cpu variable thermal_pressure is introduced to keep
track of instantaneous per cpu thermal pressure. Thermal pressure is the
delta between maximum capacity and capped capacity due to a thermal event.

topology_get_thermal_pressure can be hooked into the scheduler-specified
arch_scale_thermal_capacity to retrieve the instantaneous thermal pressure
of a cpu.

arch_set_thermal_pressure can be used to update the thermal pressure by
providing a capped maximum capacity.
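
To make the definition above concrete, here is a minimal user-space sketch
of the arithmetic, not the kernel code itself. The helper name
thermal_pressure_delta is hypothetical, and the clamp to zero is an added
assumption: the patch as posted subtracts directly, so a capped capacity
larger than the maximum would wrap around in unsigned arithmetic.

```c
#include <assert.h> /* used by the checks in the usage note */

/*
 * Hypothetical helper, for illustration only.
 * Thermal pressure = maximum capacity - capped capacity.
 * Assumption: clamp to 0 when the cap is not below the maximum,
 * which the posted patch does not do.
 */
static unsigned long thermal_pressure_delta(unsigned long max_capacity,
                                            unsigned long capped_capacity)
{
    if (capped_capacity >= max_capacity)
        return 0;

    return max_capacity - capped_capacity;
}
```

With a full-scale capacity of 1024 and a thermal cap of 768, the helper
returns a pressure of 256; an uncapped cpu reports 0.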

Considering topology_get_thermal_pressure reads thermal_pressure and
arch_set_thermal_pressure writes into thermal_pressure, one can argue for
some sort of locking mechanism to avoid a stale value. But considering
topology_get_thermal_pressure can be called from a system critical
path like the scheduler tick function, a locking mechanism is not ideal.
This means that the thermal_pressure value used to calculate the average
thermal pressure for a cpu can be stale for up to 1 tick period.
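
The lock-free trade-off described above can be sketched in user space with
C11 relaxed atomics standing in for the kernel's WRITE_ONCE() store. This
is an analogy under stated assumptions, not the patch's implementation:
NR_CPUS, the array and both function names are hypothetical, and the
getter in the posted patch is a plain per_cpu() read rather than an atomic
load.

```c
#include <assert.h> /* used by the checks in the usage note */
#include <stdatomic.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* User-space stand-in for the per-cpu thermal_pressure variable. */
static _Atomic unsigned long thermal_pressure[NR_CPUS];

/* Writer side: a single untorn store, taken without any lock. */
static void set_thermal_pressure(int cpu, unsigned long delta)
{
    atomic_store_explicit(&thermal_pressure[cpu], delta,
                          memory_order_relaxed);
}

/*
 * Reader side: never blocks, so it is safe on a hot path. It may
 * return the value from just before a concurrent update, which is
 * the staleness the commit message accepts.
 */
static unsigned long get_thermal_pressure(int cpu)
{
    return atomic_load_explicit(&thermal_pressure[cpu],
                                memory_order_relaxed);
}
```

A reader sees either the old or the new value, never a mixture, and the
next read picks up the update. Nothing stronger is guaranteed, which is
why the value can lag by one sampling period.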

Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>

---

v3->v4:
        - Dropped per cpu max_capacity_info struct and instead added a per
          cpu delta_capacity variable to store the delta between maximum
          capacity and capped capacity. The delta is now calculated when
          thermal pressure is updated and not every tick.
        - Dropped populate_max_capacity_info API as only per cpu delta
          capacity is stored.
        - Renamed update_periodic_maxcap to
          trigger_thermal_pressure_average and update_maxcap_capacity to
          update_thermal_pressure.
v4->v5:
	- As per Peter's review comments folded thermal.c into fair.c.
	- As per Ionela's review comments revamped update_thermal_pressure
	  to take maximum available capacity as input instead of maximum
	  capped frequency ratio.
v5->v6:
	- As per review comments moved all the infrastructure to track
	  and retrieve instantaneous thermal pressure out of scheduler
	  to topology files.

 arch/arm/include/asm/topology.h   |  3 +++
 arch/arm64/include/asm/topology.h |  3 +++
 drivers/base/arch_topology.c      | 13 +++++++++++++
 include/linux/arch_topology.h     | 11 +++++++++++
 4 files changed, 30 insertions(+)

-- 
2.1.4

Comments

Peter Zijlstra Dec. 16, 2019, 2:37 p.m. | #1
On Wed, Dec 11, 2019 at 11:11:44PM -0500, Thara Gopinath wrote:
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index 8a0fae9..90b18c3 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -16,6 +16,9 @@
>  /* Enable topology flag updates */
>  #define arch_update_cpu_topology topology_update_cpu_topology
>  
> +/* Replace task scheduler's defalut thermal pressure retrieve API */
> +#define arch_scale_thermal_capacity topology_get_thermal_pressure


Witness the previously observed confusion.

>  #else
>  
>  static inline void init_cpu_topology(void) { }
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index a4d945d..ccb277b 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -25,6 +25,9 @@ int pcibus_to_node(struct pci_bus *bus);
>  /* Enable topology flag updates */
>  #define arch_update_cpu_topology topology_update_cpu_topology
>  
> +/* Replace task scheduler's defalut thermal pressure retrieve API */
> +#define arch_scale_thermal_capacity topology_get_thermal_pressure
> +
>  #include <asm-generic/topology.h>
>  
>  #endif /* _ASM_ARM_TOPOLOGY_H */
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 1eb81f11..3a91379 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -42,6 +42,19 @@ void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
>  	per_cpu(cpu_scale, cpu) = capacity;
>  }
>  
> +DEFINE_PER_CPU(unsigned long, thermal_pressure);
> +
> +void arch_set_thermal_pressure(struct cpumask *cpus,
> +			       unsigned long capped_capacity)
> +{
> +	int cpu;
> +	unsigned long delta = arch_scale_cpu_capacity(cpumask_first(cpus)) -
> +					capped_capacity;
> +
> +	for_each_cpu(cpu, cpus)
> +		WRITE_ONCE(per_cpu(thermal_pressure, cpu), delta);
> +}


Not a capacity.

>  static ssize_t cpu_capacity_show(struct device *dev,
>  				 struct device_attribute *attr,
>  				 char *buf)
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 3015ecb..7a04364 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -33,6 +33,17 @@ unsigned long topology_get_freq_scale(int cpu)
>  	return per_cpu(freq_scale, cpu);
>  }
>  
> +DECLARE_PER_CPU(unsigned long, thermal_pressure);
> +
> +static inline
> +unsigned long topology_get_thermal_pressure(int cpu)
> +{
> +	return per_cpu(thermal_pressure, cpu);
> +}
Dietmar Eggemann Dec. 17, 2019, 12:54 p.m. | #2
On 12/12/2019 05:11, Thara Gopinath wrote:

Shouldn't the subject contain something like 'arm,arm64,drivers:' ?

> Add architecture specific APIs to update and track thermal pressure on a
> per cpu basis. A per cpu variable thermal_pressure is introduced to keep
> track of instantaneous per cpu thermal pressure. Thermal pressure is the
> delta between maximum capacity and capped capacity due to a thermal event.
> capacity and capped capacity due to a thermal event.
>
> topology_get_thermal_pressure can be hooked into the scheduler specified
> arch_scale_thermal_capacity to retrieve instantaneius thermal pressure of


s/instantaneius/instantaneous

> a cpu.
>
> arch_set_thermal_presure can be used to update the thermal pressure by


s/presure/pressure

> providing a capped maximum capacity.
>
> Considering topology_get_thermal_pressure reads thermal_pressure and
> arch_set_thermal_pressure writes into thermal_pressure, one can argue for
> some sort of locking mechanism to avoid a stale value.  But considering
> topology_get_thermal_pressure_average can be called from a system critical


s/topology_get_thermal_pressure_average/topology_get_thermal_pressure ?

> path like scheduler tick function, a locking mechanism is not ideal. This
> means that it is possible the thermal_pressure value used to calculate
> average thermal pressure for a cpu can be stale for upto 1 tick period.
>
> Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
> ---
>
> v3->v4:
>         - Dropped per cpu max_capacity_info struct and instead added a per
>           delta_capacity variable to store the delta between maximum
>           capacity and capped capacity. The delta is now calculated when
>           thermal pressure is updated and not every tick.
>         - Dropped populate_max_capacity_info api as only per cpu delta
>           capacity is stored.
>         - Renamed update_periodic_maxcap to
>           trigger_thermal_pressure_average and update_maxcap_capacity to
>           update_thermal_pressure.
> v4->v5:
> 	- As per Peter's review comments folded thermal.c into fair.c.
> 	- As per Ionela's review comments revamped update_thermal_pressure
> 	  to take maximum available capacity as input instead of maximum
> 	  capped frequency ration.
> v5->v6:
> 	- As per review comments moved all the infrastructure to track
> 	  and retrieve instantaneous thermal pressure out of scheduler
> 	  to topology files.
>
>  arch/arm/include/asm/topology.h   |  3 +++
>  arch/arm64/include/asm/topology.h |  3 +++
>  drivers/base/arch_topology.c      | 13 +++++++++++++
>  include/linux/arch_topology.h     | 11 +++++++++++
>  4 files changed, 30 insertions(+)
>
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index 8a0fae9..90b18c3 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -16,6 +16,9 @@
>  /* Enable topology flag updates */
>  #define arch_update_cpu_topology topology_update_cpu_topology
>  
> +/* Replace task scheduler's defalut thermal pressure retrieve API */
> +#define arch_scale_thermal_capacity topology_get_thermal_pressure
> +
>  #else
>  
>  static inline void init_cpu_topology(void) { }
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index a4d945d..ccb277b 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -25,6 +25,9 @@ int pcibus_to_node(struct pci_bus *bus);
>  /* Enable topology flag updates */
>  #define arch_update_cpu_topology topology_update_cpu_topology
>  
> +/* Replace task scheduler's defalut thermal pressure retrieve API */
> +#define arch_scale_thermal_capacity topology_get_thermal_pressure
> +
>  #include <asm-generic/topology.h>
>  
>  #endif /* _ASM_ARM_TOPOLOGY_H */
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 1eb81f11..3a91379 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -42,6 +42,19 @@ void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
>  	per_cpu(cpu_scale, cpu) = capacity;
>  }
>  
> +DEFINE_PER_CPU(unsigned long, thermal_pressure);
> +
> +void arch_set_thermal_pressure(struct cpumask *cpus,
> +			       unsigned long capped_capacity)
> +{
> +	int cpu;
> +	unsigned long delta = arch_scale_cpu_capacity(cpumask_first(cpus)) -
> +					capped_capacity;


 unsigned long delta;
 int cpu;

 delta = arch_scale_cpu_capacity(cpumask_first(cpus)) - capped_capacity;

Easier to read plus order local variable declarations from longest to
shortest line.

https://lore.kernel.org/r/20181107171149.165693799@linutronix.de

> +
> +	for_each_cpu(cpu, cpus)
> +		WRITE_ONCE(per_cpu(thermal_pressure, cpu), delta);
> +}
> +
>  static ssize_t cpu_capacity_show(struct device *dev,
>  				 struct device_attribute *attr,
>  				 char *buf)
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 3015ecb..7a04364 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -33,6 +33,17 @@ unsigned long topology_get_freq_scale(int cpu)
>  	return per_cpu(freq_scale, cpu);
>  }
>  
> +DECLARE_PER_CPU(unsigned long, thermal_pressure);
> +
> +static inline
> +unsigned long topology_get_thermal_pressure(int cpu)


Save one line:

static inline unsigned long topology_get_thermal_pressure(int cpu)

> +{
> +	return per_cpu(thermal_pressure, cpu);
> +}
> +
> +void arch_set_thermal_pressure(struct cpumask *cpus,
> +			       unsigned long capped_capacity);
> +
>  struct cpu_topology {
>  	int thread_id;
>  	int core_id;
>
Ionela Voinescu Dec. 23, 2019, 5:50 p.m. | #3
Hi Thara,

On Wednesday 11 Dec 2019 at 23:11:44 (-0500), Thara Gopinath wrote:
> Add architecture specific APIs to update and track thermal pressure on a
> per cpu basis. A per cpu variable thermal_pressure is introduced to keep
> track of instantaneous per cpu thermal pressure. Thermal pressure is the
> delta between maximum capacity and capped capacity due to a thermal event.
> capacity and capped capacity due to a thermal event.

  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This line seems to be a duplicate (initially I thought I was seeing
double :) ).

> topology_get_thermal_pressure can be hooked into the scheduler specified
> arch_scale_thermal_capacity to retrieve instantaneius thermal pressure of
> a cpu.
>
> arch_set_thermal_presure can be used to update the thermal pressure by
> providing a capped maximum capacity.
>
> Considering topology_get_thermal_pressure reads thermal_pressure and
> arch_set_thermal_pressure writes into thermal_pressure, one can argue for
> some sort of locking mechanism to avoid a stale value.  But considering
> topology_get_thermal_pressure_average can be called from a system critical
> path like scheduler tick function, a locking mechanism is not ideal. This
> means that it is possible the thermal_pressure value used to calculate
> average thermal pressure for a cpu can be stale for upto 1 tick period.
>
> Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>

[...]
> diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
> index 8a0fae9..90b18c3 100644
> --- a/arch/arm/include/asm/topology.h
> +++ b/arch/arm/include/asm/topology.h
> @@ -16,6 +16,9 @@
>  /* Enable topology flag updates */
>  #define arch_update_cpu_topology topology_update_cpu_topology
>  
> +/* Replace task scheduler's defalut thermal pressure retrieve API */


s/defalut/default

> +#define arch_scale_thermal_capacity topology_get_thermal_pressure
> +


I also think this deserves a better name. I would drop the 'scale'
part as well, since it is not used as a scale factor like freq_scale
or cpu_scale, but as a reduction in capacity (thermal capacity
pressure) due to a thermal event.

It might be too much but what do you think about:
arch_thermal_capacity_pressure?

>  #else
>  
>  static inline void init_cpu_topology(void) { }
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index a4d945d..ccb277b 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -25,6 +25,9 @@ int pcibus_to_node(struct pci_bus *bus);
>  /* Enable topology flag updates */
>  #define arch_update_cpu_topology topology_update_cpu_topology
>  
> +/* Replace task scheduler's defalut thermal pressure retrieve API */


s/defalut/default

Regards,
Ionela.

Patch

diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h
index 8a0fae9..90b18c3 100644
--- a/arch/arm/include/asm/topology.h
+++ b/arch/arm/include/asm/topology.h
@@ -16,6 +16,9 @@ 
 /* Enable topology flag updates */
 #define arch_update_cpu_topology topology_update_cpu_topology
 
+/* Replace task scheduler's default thermal pressure retrieve API */
+#define arch_scale_thermal_capacity topology_get_thermal_pressure
+
 #else
 
 static inline void init_cpu_topology(void) { }
diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index a4d945d..ccb277b 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -25,6 +25,9 @@  int pcibus_to_node(struct pci_bus *bus);
 /* Enable topology flag updates */
 #define arch_update_cpu_topology topology_update_cpu_topology
 
+/* Replace task scheduler's default thermal pressure retrieve API */
+#define arch_scale_thermal_capacity topology_get_thermal_pressure
+
 #include <asm-generic/topology.h>
 
 #endif /* _ASM_ARM_TOPOLOGY_H */
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1eb81f11..3a91379 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -42,6 +42,19 @@  void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
 	per_cpu(cpu_scale, cpu) = capacity;
 }
 
+DEFINE_PER_CPU(unsigned long, thermal_pressure);
+
+void arch_set_thermal_pressure(struct cpumask *cpus,
+			       unsigned long capped_capacity)
+{
+	int cpu;
+	unsigned long delta = arch_scale_cpu_capacity(cpumask_first(cpus)) -
+					capped_capacity;
+
+	for_each_cpu(cpu, cpus)
+		WRITE_ONCE(per_cpu(thermal_pressure, cpu), delta);
+}
+
 static ssize_t cpu_capacity_show(struct device *dev,
 				 struct device_attribute *attr,
 				 char *buf)
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 3015ecb..7a04364 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -33,6 +33,17 @@  unsigned long topology_get_freq_scale(int cpu)
 	return per_cpu(freq_scale, cpu);
 }
 
+DECLARE_PER_CPU(unsigned long, thermal_pressure);
+
+static inline
+unsigned long topology_get_thermal_pressure(int cpu)
+{
+	return per_cpu(thermal_pressure, cpu);
+}
+
+void arch_set_thermal_pressure(struct cpumask *cpus,
+			       unsigned long capped_capacity);
+
 struct cpu_topology {
 	int thread_id;
 	int core_id;
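
Since arch_set_thermal_pressure takes a capped capacity rather than a
frequency, a caller that only knows a capped frequency has to convert it
first. The following is a user-space sketch of one way to do that,
assuming capacity scales linearly with frequency. The helper
capped_capacity_from_freq and its parameters are hypothetical and are not
part of this patch.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only.
 * Assumption: capacity scales linearly with frequency, so
 *   capped_capacity = max_capacity * capped_freq / max_freq.
 * The multiply is done in 64 bits so it cannot overflow on systems
 * where unsigned long is 32 bits wide.
 */
static unsigned long capped_capacity_from_freq(unsigned long max_capacity,
                                               unsigned long capped_freq,
                                               unsigned long max_freq)
{
    assert(max_freq != 0);

    /* A cap at or above the maximum frequency leaves capacity intact. */
    if (capped_freq >= max_freq)
        return max_capacity;

    return (unsigned long)((unsigned long long)max_capacity *
                           capped_freq / max_freq);
}
```

For a cpu with capacity 1024 whose frequency is capped from 2000000 kHz to
1500000 kHz, this gives a capped capacity of 768, which corresponds to a
thermal pressure of 1024 - 768 = 256.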