diff mbox series

[RESEND,v2,3/4] PM: EM: Add em_dev_update_chip_binning()

Message ID 20240322110850.77086-4-lukasz.luba@arm.com
State Superseded
Headers show
Series Update Energy Model after chip binning adjusted voltages | expand

Commit Message

Lukasz Luba March 22, 2024, 11:08 a.m. UTC
Add a function which allows to modify easily the EM after the new voltage
information is available. The device drivers for the chip can adjust
the voltage values after setup. The voltage for the same frequency in OPP
can be different due to chip binning. The voltage impacts the power usage
and the EM power values can be updated to reflect that.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 include/linux/energy_model.h |  5 ++++
 kernel/power/energy_model.c  | 51 ++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

Comments

Dietmar Eggemann March 26, 2024, 10:09 a.m. UTC | #1
On 22/03/2024 12:08, Lukasz Luba wrote:

> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
> index 6960dd7393b2d..f7f7ae34ec552 100644
> --- a/kernel/power/energy_model.c
> +++ b/kernel/power/energy_model.c
> @@ -808,3 +808,54 @@ static void em_update_workfn(struct work_struct *work)
>  {
>  	em_check_capacity_update();
>  }
> +
> +/**
> + * em_dev_update_chip_binning() - Update Energy Model with new values after

s/with new values// ... IMHO this should be obvious ?

> + *			the new voltage information is present in the OPPs.
> + * @dev		: Device for which the Energy Model has to be updated.
> + *
> + * This function allows to update easily the EM with new values available in
> + * the OPP framework and DT. It can be used after the chip has been properly
> + * verified by device drivers and the voltages adjusted for the 'chip binning'.
> + * It uses the "dynamic-power-coefficient" DT property to calculate the power
> + * values for EM. For power calculation it uses the new adjusted voltage
> + * values known for OPPs, which might be changed after boot.

The last two sentences describe what dev_pm_opp_calc_power() is doing.
Maybe this can be made clearer here?

> + */
> +int em_dev_update_chip_binning(struct device *dev)

This is the old dev_pm_opp_of_update_em() right?

> +{
> +	struct em_perf_table __rcu *em_table;
> +	struct em_perf_domain *pd;
> +	int i, ret;
> +
> +	if (IS_ERR_OR_NULL(dev))
> +		return -EINVAL;

When do you use if '(IS_ERR_OR_NULL(dev))' and when 'if(!dev)' for EM
interface functions?

> +	pd = em_pd_get(dev);
> +	if (!pd) {
> +		dev_warn(dev, "Couldn't find Energy Model\n");
> +		return -EINVAL;
> +	}
> +
> +	em_table = em_table_dup(pd);
> +	if (!em_table) {
> +		dev_warn(dev, "EM: allocation failed\n");
> +		return -ENOMEM;
> +	}
> +
> +	/* Update power values which might change due to new voltage in OPPs */
> +	for (i = 0; i < pd->nr_perf_states; i++) {
> +		unsigned long freq = em_table->state[i].frequency;
> +		unsigned long power;
> +
> +		ret = dev_pm_opp_calc_power(dev, &power, &freq);
> +		if (ret) {
> +			em_table_free(em_table);
> +			return ret;
> +		}
> +
> +		em_table->state[i].power = power;
> +	}
> +
> +	return em_recalc_and_update(dev, pd, em_table);
> +}
> +EXPORT_SYMBOL_GPL(em_dev_update_chip_binning);

In the previous version of 'chip-binning' you were using the new EM
interface em_dev_compute_costs() (1) which is now replaced by
em_recalc_and_update() -> em_compute_costs().

https://lkml.kernel.org/r/20231220110339.1065505-2-lukasz.luba@arm.com

Which leaves (1) still unused.

That was why my concern back then that we shouldn't introduce EM
interfaces without a user:

https://lkml.kernel.org/r/8fc499cf-fca1-4465-bff7-a93dfd36f3c8@arm.com

What happens now with em_dev_compute_costs()?
Lukasz Luba March 26, 2024, 8:32 p.m. UTC | #2
On 3/26/24 10:09, Dietmar Eggemann wrote:
> On 22/03/2024 12:08, Lukasz Luba wrote:
> 
>> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
>> index 6960dd7393b2d..f7f7ae34ec552 100644
>> --- a/kernel/power/energy_model.c
>> +++ b/kernel/power/energy_model.c
>> @@ -808,3 +808,54 @@ static void em_update_workfn(struct work_struct *work)
>>   {
>>   	em_check_capacity_update();
>>   }
>> +
>> +/**
>> + * em_dev_update_chip_binning() - Update Energy Model with new values after
> 
> s/with new values// ... IMHO this should be obvious ?

Make sense

> 
>> + *			the new voltage information is present in the OPPs.
>> + * @dev		: Device for which the Energy Model has to be updated.
>> + *
>> + * This function allows to update easily the EM with new values available in
>> + * the OPP framework and DT. It can be used after the chip has been properly
>> + * verified by device drivers and the voltages adjusted for the 'chip binning'.
>> + * It uses the "dynamic-power-coefficient" DT property to calculate the power
>> + * values for EM. For power calculation it uses the new adjusted voltage
>> + * values known for OPPs, which might be changed after boot.
> 
> The last two sentences describe what dev_pm_opp_calc_power() is doing.
> Maybe this can be made clearer here?

Or I can just remove this, since it's too detailed description.

> 
>> + */
>> +int em_dev_update_chip_binning(struct device *dev)
> 
> This is the old dev_pm_opp_of_update_em() right?

Yes, it is similar.

> 
>> +{
>> +	struct em_perf_table __rcu *em_table;
>> +	struct em_perf_domain *pd;
>> +	int i, ret;
>> +
>> +	if (IS_ERR_OR_NULL(dev))
>> +		return -EINVAL;
> 
> When do you use if '(IS_ERR_OR_NULL(dev))' and when 'if(!dev)' for EM
> interface functions?

Sometimes IS_ERR_OR_NULL is used, especially for API function other
that register function.

> 
>> +	pd = em_pd_get(dev);
>> +	if (!pd) {
>> +		dev_warn(dev, "Couldn't find Energy Model\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	em_table = em_table_dup(pd);
>> +	if (!em_table) {
>> +		dev_warn(dev, "EM: allocation failed\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	/* Update power values which might change due to new voltage in OPPs */
>> +	for (i = 0; i < pd->nr_perf_states; i++) {
>> +		unsigned long freq = em_table->state[i].frequency;
>> +		unsigned long power;
>> +
>> +		ret = dev_pm_opp_calc_power(dev, &power, &freq);
>> +		if (ret) {
>> +			em_table_free(em_table);
>> +			return ret;
>> +		}
>> +
>> +		em_table->state[i].power = power;
>> +	}
>> +
>> +	return em_recalc_and_update(dev, pd, em_table);
>> +}
>> +EXPORT_SYMBOL_GPL(em_dev_update_chip_binning);
> 
> In the previous version of 'chip-binning' you were using the new EM
> interface em_dev_compute_costs() (1) which is now replaced by
> em_recalc_and_update() -> em_compute_costs().
> 
> https://lkml.kernel.org/r/20231220110339.1065505-2-lukasz.luba@arm.com
> 
> Which leaves (1) still unused.
> 
> That was why my concern back then that we shouldn't introduce EM
> interfaces without a user:
> 
> https://lkml.kernel.org/r/8fc499cf-fca1-4465-bff7-a93dfd36f3c8@arm.com
> 
> What happens now with em_dev_compute_costs()?
> 

For now it's not used, but modules which will create new EMs
with custom power values will use it. When such a module have
e.g. 5 EMs for one PD and only switches on one of them, then
this em_dev_compute_costs() will be used at setup for those
5 EMs. Later it won't be used.
I don't wanted to combine the registration of new EM with
the compute cost, because that will create overhead in the
switching to new EM code path. Now we have only ~3us, which
was the main goal.

When our scmi-cpufreq get the support for EM update this
compute cost will be used there.
Dietmar Eggemann March 27, 2024, 12:55 p.m. UTC | #3
On 26/03/2024 21:32, Lukasz Luba wrote:
> 
> 
> On 3/26/24 10:09, Dietmar Eggemann wrote:
>> On 22/03/2024 12:08, Lukasz Luba wrote:

[...]

>>> +    return em_recalc_and_update(dev, pd, em_table);
>>> +}
>>> +EXPORT_SYMBOL_GPL(em_dev_update_chip_binning);
>>
>> In the previous version of 'chip-binning' you were using the new EM
>> interface em_dev_compute_costs() (1) which is now replaced by
>> em_recalc_and_update() -> em_compute_costs().
>>
>> https://lkml.kernel.org/r/20231220110339.1065505-2-lukasz.luba@arm.com
>>
>> Which leaves (1) still unused.
>>
>> That was why my concern back then that we shouldn't introduce EM
>> interfaces without a user:
>>
>> https://lkml.kernel.org/r/8fc499cf-fca1-4465-bff7-a93dfd36f3c8@arm.com
>>
>> What happens now with em_dev_compute_costs()?
>>
> 
> For now it's not used, but modules which will create new EMs
> with custom power values will use it. When such a module have
> e.g. 5 EMs for one PD and only switches on one of them, then
> this em_dev_compute_costs() will be used at setup for those
> 5 EMs. Later it won't be used.
> I don't wanted to combine the registration of new EM with
> the compute cost, because that will create overhead in the
> switching to new EM code path. Now we have only ~3us, which
> was the main goal.
> 
> When our scmi-cpufreq get the support for EM update this
> compute cost will be used there.

OK, I see. I checked the reloadable EM test module and
em_dev_compute_costs() is used there like you described it.
Dietmar Eggemann March 28, 2024, 7:21 a.m. UTC | #4
On 27/03/2024 13:55, Dietmar Eggemann wrote:
> On 26/03/2024 21:32, Lukasz Luba wrote:
>>
>>
>> On 3/26/24 10:09, Dietmar Eggemann wrote:
>>> On 22/03/2024 12:08, Lukasz Luba wrote:
> 
> [...]
> 
>>>> +    return em_recalc_and_update(dev, pd, em_table);
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(em_dev_update_chip_binning);
>>>
>>> In the previous version of 'chip-binning' you were using the new EM
>>> interface em_dev_compute_costs() (1) which is now replaced by
>>> em_recalc_and_update() -> em_compute_costs().
>>>
>>> https://lkml.kernel.org/r/20231220110339.1065505-2-lukasz.luba@arm.com
>>>
>>> Which leaves (1) still unused.
>>>
>>> That was why my concern back then that we shouldn't introduce EM
>>> interfaces without a user:
>>>
>>> https://lkml.kernel.org/r/8fc499cf-fca1-4465-bff7-a93dfd36f3c8@arm.com
>>>
>>> What happens now with em_dev_compute_costs()?
>>>
>>
>> For now it's not used, but modules which will create new EMs
>> with custom power values will use it. When such a module have
>> e.g. 5 EMs for one PD and only switches on one of them, then
>> this em_dev_compute_costs() will be used at setup for those
>> 5 EMs. Later it won't be used.
>> I don't wanted to combine the registration of new EM with
>> the compute cost, because that will create overhead in the
>> switching to new EM code path. Now we have only ~3us, which
>> was the main goal.
>>
>> When our scmi-cpufreq get the support for EM update this
>> compute cost will be used there.
> 
> OK, I see. I checked the reloadable EM test module and
> em_dev_compute_costs() is used there like you described it.

I had a second look and IMHO the layout is like this:

Internal (1) and external (2,3) 'reloadable EM' use cases:
(EM alloc and free not depicted)

1. Late CPUs booting (CPU capacity adjustment)

 e3f1164fc9ee  PM: EM: Support late CPUs booting and capacity adjustment

 schedule_delayed_work(&em_update_work, ...)

  em_update_workfn()
   em_check_capacity_update()
    em_adjust_new_capacity()
     em_recalc_and_update()       <-- (i)
      em_compute_costs()          <-- (ii)
      em_dev_update_perf_domain() <-- external

2. Reload EM from driver

 22ea02848c07  PM: EM: Add em_dev_compute_costs()
 977230d5d503  PM: EM: Introduce em_dev_update_perf_domain() for EM
               updates

 em_dev_compute_costs()           <-- external
  em_compute_costs()              <-- (ii)
 em_dev_update_perf_domain()      <-- external

3. Chip binning

 this patchset  PM: EM: Add em_dev_update_chip_binning()

 em_dev_update_chip_binning()     <-- external
  em_recalc_and_update()          <-- (i)
   em_compute_costs()             <-- (ii)
   em_dev_update_perf_domain()    <-- external
Lukasz Luba March 28, 2024, 7:30 a.m. UTC | #5
On 3/28/24 07:21, Dietmar Eggemann wrote:
> On 27/03/2024 13:55, Dietmar Eggemann wrote:
>> On 26/03/2024 21:32, Lukasz Luba wrote:
>>>
>>>
>>> On 3/26/24 10:09, Dietmar Eggemann wrote:
>>>> On 22/03/2024 12:08, Lukasz Luba wrote:
>>
>> [...]
>>
>>>>> +    return em_recalc_and_update(dev, pd, em_table);
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(em_dev_update_chip_binning);
>>>>
>>>> In the previous version of 'chip-binning' you were using the new EM
>>>> interface em_dev_compute_costs() (1) which is now replaced by
>>>> em_recalc_and_update() -> em_compute_costs().
>>>>
>>>> https://lkml.kernel.org/r/20231220110339.1065505-2-lukasz.luba@arm.com
>>>>
>>>> Which leaves (1) still unused.
>>>>
>>>> That was why my concern back then that we shouldn't introduce EM
>>>> interfaces without a user:
>>>>
>>>> https://lkml.kernel.org/r/8fc499cf-fca1-4465-bff7-a93dfd36f3c8@arm.com
>>>>
>>>> What happens now with em_dev_compute_costs()?
>>>>
>>>
>>> For now it's not used, but modules which will create new EMs
>>> with custom power values will use it. When such a module have
>>> e.g. 5 EMs for one PD and only switches on one of them, then
>>> this em_dev_compute_costs() will be used at setup for those
>>> 5 EMs. Later it won't be used.
>>> I don't wanted to combine the registration of new EM with
>>> the compute cost, because that will create overhead in the
>>> switching to new EM code path. Now we have only ~3us, which
>>> was the main goal.
>>>
>>> When our scmi-cpufreq get the support for EM update this
>>> compute cost will be used there.
>>
>> OK, I see. I checked the reloadable EM test module and
>> em_dev_compute_costs() is used there like you described it.
> 
> I had a second look and IMHO the layout is like this:
> 
> Internal (1) and external (2,3) 'reloadable EM' use cases:
> (EM alloc and free not depicted)
> 
> 1. Late CPUs booting (CPU capacity adjustment)
> 
>   e3f1164fc9ee  PM: EM: Support late CPUs booting and capacity adjustment
> 
>   schedule_delayed_work(&em_update_work, ...)
> 
>    em_update_workfn()
>     em_check_capacity_update()
>      em_adjust_new_capacity()
>       em_recalc_and_update()       <-- (i)
>        em_compute_costs()          <-- (ii)
>        em_dev_update_perf_domain() <-- external
> 
> 2. Reload EM from driver
> 
>   22ea02848c07  PM: EM: Add em_dev_compute_costs()
>   977230d5d503  PM: EM: Introduce em_dev_update_perf_domain() for EM
>                 updates
> 
>   em_dev_compute_costs()           <-- external
>    em_compute_costs()              <-- (ii)
>   em_dev_update_perf_domain()      <-- external
> 
> 3. Chip binning
> 
>   this patchset  PM: EM: Add em_dev_update_chip_binning()
> 
>   em_dev_update_chip_binning()     <-- external
>    em_recalc_and_update()          <-- (i)
>     em_compute_costs()             <-- (ii)
>     em_dev_update_perf_domain()    <-- external
> 
> 
> 
> 

Yes, that's correct.
diff mbox series

Patch

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index 770755df852f1..d30d67c2f07cf 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -172,6 +172,7 @@  struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd);
 void em_table_free(struct em_perf_table __rcu *table);
 int em_dev_compute_costs(struct device *dev, struct em_perf_state *table,
 			 int nr_states);
+int em_dev_update_chip_binning(struct device *dev);
 
 /**
  * em_pd_get_efficient_state() - Get an efficient performance state from the EM
@@ -387,6 +388,10 @@  int em_dev_compute_costs(struct device *dev, struct em_perf_state *table,
 {
 	return -EINVAL;
 }
+static inline int em_dev_update_chip_binning(struct device *dev)
+{
+	return -EINVAL;
+}
 #endif
 
 #endif
diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index 6960dd7393b2d..f7f7ae34ec552 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -808,3 +808,54 @@  static void em_update_workfn(struct work_struct *work)
 {
 	em_check_capacity_update();
 }
+
+/**
+ * em_dev_update_chip_binning() - Update Energy Model with new values after
+ *			the new voltage information is present in the OPPs.
+ * @dev		: Device for which the Energy Model has to be updated.
+ *
+ * This function allows to update easily the EM with new values available in
+ * the OPP framework and DT. It can be used after the chip has been properly
+ * verified by device drivers and the voltages adjusted for the 'chip binning'.
+ * It uses the "dynamic-power-coefficient" DT property to calculate the power
+ * values for EM. For power calculation it uses the new adjusted voltage
+ * values known for OPPs, which might be changed after boot.
+ */
+int em_dev_update_chip_binning(struct device *dev)
+{
+	struct em_perf_table __rcu *em_table;
+	struct em_perf_domain *pd;
+	int i, ret;
+
+	if (IS_ERR_OR_NULL(dev))
+		return -EINVAL;
+
+	pd = em_pd_get(dev);
+	if (!pd) {
+		dev_warn(dev, "Couldn't find Energy Model\n");
+		return -EINVAL;
+	}
+
+	em_table = em_table_dup(pd);
+	if (!em_table) {
+		dev_warn(dev, "EM: allocation failed\n");
+		return -ENOMEM;
+	}
+
+	/* Update power values which might change due to new voltage in OPPs */
+	for (i = 0; i < pd->nr_perf_states; i++) {
+		unsigned long freq = em_table->state[i].frequency;
+		unsigned long power;
+
+		ret = dev_pm_opp_calc_power(dev, &power, &freq);
+		if (ret) {
+			em_table_free(em_table);
+			return ret;
+		}
+
+		em_table->state[i].power = power;
+	}
+
+	return em_recalc_and_update(dev, pd, em_table);
+}
+EXPORT_SYMBOL_GPL(em_dev_update_chip_binning);