diff mbox series

[01/21] arch_topology: Make register_cpu_capacity_sysctl() tolerant to late CPUs

Message ID E1r5R2g-00CsyV-Ss@rmk-PC.armlinux.org.uk
State New
Headers show
Series [01/21] arch_topology: Make register_cpu_capacity_sysctl() tolerant to late CPUs | expand

Commit Message

Russell King (Oracle) Nov. 21, 2023, 1:43 p.m. UTC
From: James Morse <james.morse@arm.com>

register_cpu_capacity_sysctl() adds a property to sysfs that describes
the CPUs capacity. This is done from a subsys_initcall() that assumes
all possible CPUs are registered.

With CPU hotplug, possible CPUs aren't registered until they become
present, (or for arm64 enabled). This leads to messages during boot:
| register_cpu_capacity_sysctl: too early to get CPU1 device!
and once these CPUs are added to the system, the file is missing.

Move this to a cpuhp callback, so that the file is created once
CPUs are brought online. This covers CPUs that are added late by
mechanisms like hotplug.
One observable difference is the file is now missing for offline CPUs.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
If the offline CPUs thing is a problem for the tools that consume
this value, we'd need to move cpu_capacity to be part of cpu.c's
common_cpu_attr_groups. However, attempts to discuss this just end
up in a black hole, so this is a non-starter. Thus, if this needs
to be done, it can be done as a separate patch.
---
 drivers/base/arch_topology.c | 38 ++++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 12 deletions(-)

Comments

Jonathan Cameron Nov. 30, 2023, 4:46 p.m. UTC | #1
On Tue, 21 Nov 2023 13:43:54 +0000
Russell King <rmk+kernel@armlinux.org.uk> wrote:

> From: James Morse <james.morse@arm.com>
> 
> register_cpu_capacity_sysctl() adds a property to sysfs that describes
> the CPUs capacity. This is done from a subsys_initcall() that assumes
> all possible CPUs are registered.
> 
> With CPU hotplug, possible CPUs aren't registered until they become
> present, (or for arm64 enabled). This leads to messages during boot:
> | register_cpu_capacity_sysctl: too early to get CPU1 device!
> and once these CPUs are added to the system, the file is missing.
> 
> Move this to a cpuhp callback, so that the file is created once
> CPUs are brought online. This covers CPUs that are added late by
> mechanisms like hotplug.
> One observable difference is the file is now missing for offline CPUs.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> If the offline CPUs thing is a problem for the tools that consume
> this value, we'd need to move cpu_capacity to be part of cpu.c's
> common_cpu_attr_groups. However, attempts to discuss this just end
> up in a black hole, so this is a non-starter. Thus, if this needs
> to be done, it can be done as a separate patch.
> ---
>  drivers/base/arch_topology.c | 38 ++++++++++++++++++++++++------------
>  1 file changed, 26 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index b741b5ba82bd..9ccb7daee78e 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -220,20 +220,34 @@ static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
>  
>  static DEVICE_ATTR_RO(cpu_capacity);
>  
> -static int register_cpu_capacity_sysctl(void)
> +static int cpu_capacity_sysctl_add(unsigned int cpu)
>  {
> -	int i;
> -	struct device *cpu;
> +	struct device *cpu_dev = get_cpu_device(cpu);
>  
> -	for_each_possible_cpu(i) {
> -		cpu = get_cpu_device(i);
> -		if (!cpu) {
> -			pr_err("%s: too early to get CPU%d device!\n",
> -			       __func__, i);
> -			continue;
> -		}
> -		device_create_file(cpu, &dev_attr_cpu_capacity);
> -	}
> +	if (!cpu_dev)
> +		return -ENOENT;
> +
> +	device_create_file(cpu_dev, &dev_attr_cpu_capacity);
> +
> +	return 0;
> +}
> +
> +static int cpu_capacity_sysctl_remove(unsigned int cpu)
> +{
> +	struct device *cpu_dev = get_cpu_device(cpu);
> +
> +	if (!cpu_dev)
> +		return -ENOENT;
> +
> +	device_remove_file(cpu_dev, &dev_attr_cpu_capacity);
> +
> +	return 0;
> +}
> +
> +static int register_cpu_capacity_sysctl(void)
> +{
> +	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "topology/cpu-capacity",
> +			  cpu_capacity_sysctl_add, cpu_capacity_sysctl_remove);
>  
>  	return 0;
>  }
Thomas Gleixner Dec. 1, 2023, 10:45 a.m. UTC | #2
On Tue, Nov 21 2023 at 13:43, Russell King wrote:
> ---
> If the offline CPUs thing is a problem for the tools that consume
> this value, we'd need to move cpu_capacity to be part of cpu.c's
> common_cpu_attr_groups. However, attempts to discuss this just end
> up in a black hole, so this is a non-starter. Thus, if this needs
> to be done, it can be done as a separate patch.

Offline CPUs have 0 capacity by definition....
diff mbox series

Patch

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index b741b5ba82bd..9ccb7daee78e 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -220,20 +220,34 @@  static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
 
 static DEVICE_ATTR_RO(cpu_capacity);
 
-static int register_cpu_capacity_sysctl(void)
+static int cpu_capacity_sysctl_add(unsigned int cpu)
 {
-	int i;
-	struct device *cpu;
+	struct device *cpu_dev = get_cpu_device(cpu);
 
-	for_each_possible_cpu(i) {
-		cpu = get_cpu_device(i);
-		if (!cpu) {
-			pr_err("%s: too early to get CPU%d device!\n",
-			       __func__, i);
-			continue;
-		}
-		device_create_file(cpu, &dev_attr_cpu_capacity);
-	}
+	if (!cpu_dev)
+		return -ENOENT;
+
+	device_create_file(cpu_dev, &dev_attr_cpu_capacity);
+
+	return 0;
+}
+
+static int cpu_capacity_sysctl_remove(unsigned int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+
+	if (!cpu_dev)
+		return -ENOENT;
+
+	device_remove_file(cpu_dev, &dev_attr_cpu_capacity);
+
+	return 0;
+}
+
+static int register_cpu_capacity_sysctl(void)
+{
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "topology/cpu-capacity",
+			  cpu_capacity_sysctl_add, cpu_capacity_sysctl_remove);
 
 	return 0;
 }