cpu-topology: warn if NUMA configurations conflict with lower layer

Message ID 1577088979-8545-1-git-send-email-prime.zeng@hisilicon.com
State New
Series
  • cpu-topology: warn if NUMA configurations conflict with lower layer

Commit Message

z00214469 Dec. 23, 2019, 8:16 a.m.
As we know, from the sched domain's perspective, the DIE layer should be
larger than or at least equal to the MC layer, and in some cases the MC
layer is defined by arch-specific hardware (MPIDR, for example), but NUMA
can be defined by users, with the following system configuration:
*************************************
NUMA:            0-2,  3-7
core_siblings:   0-3,  4-7
*************************************
Per the current code, for core 3, its MC cpu map falls back to 3-7 (its
core_siblings is 0-3 while its NUMA node map is 3-7).
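
For reference, the selection logic in cpu_coregroup_mask() that produces
this fallback looks roughly like the simplified paraphrase below (not the
verbatim kernel source; the real code appears in the diff context further
down):

const struct cpumask *cpu_coregroup_mask(int cpu)
{
	/* Start from the NUMA node mask that contains this CPU. */
	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

	/*
	 * Narrow to the package siblings only when they are fully
	 * contained in the node.  For cpu 3 above, core_sibling (0-3)
	 * is not a subset of its node (3-7), so the node mask 3-7 is
	 * kept and becomes the MC-level mask.
	 */
	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask))
		core_mask = &cpu_topology[cpu].core_sibling;

	return core_mask;
}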

For sched MC, when we build the sched groups:
step1. core 3's sched groups chain is built like this: 3->4->5->6->7->3
step2. core 4's sched groups chain is built like this: 4->5->6->7->4
so after step2, core 3's sched groups for the MC level are overlapped; more
importantly, it will fall into a dead loop in do { ... } while (sg != sd->groups)
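
To see why that walk never terminates: after step2 the ring is 4->5->6->7->4,
while core 3's domain still starts at group 3, which points into that ring but
is no longer pointed back to.  A minimal, self-contained toy illustration of
such a walk (not the kernel's actual structures) is:

#include <stdio.h>

/* Toy stand-in for a sched group ring; not the kernel's struct. */
struct toy_group {
	int cpu;
	struct toy_group *next;
};

/* Mimic the kernel's "walk until we are back at the start" pattern,
 * i.e. do { ... } while (sg != sd->groups), with a safety cap so this
 * demo terminates. */
static void walk_groups(struct toy_group *start)
{
	struct toy_group *sg = start;
	int visits = 0;

	do {
		printf("visiting group of cpu %d\n", sg->cpu);
		sg = sg->next;
	} while (sg != start && ++visits < 20);

	if (sg != start)
		printf("gave up after %d visits: never got back to the start group\n",
		       visits);
}

int main(void)
{
	struct toy_group g3 = { 3, NULL }, g4 = { 4, NULL }, g5 = { 5, NULL },
			 g6 = { 6, NULL }, g7 = { 7, NULL };

	/* step1: core 3's ring: 3->4->5->6->7->3 */
	g3.next = &g4; g4.next = &g5; g5.next = &g6; g6.next = &g7; g7.next = &g3;

	/* step2: core 4's ring relinks 7->next: 4->5->6->7->4 */
	g7.next = &g4;

	/* Walking from g3 now cycles 4,5,6,7 forever and never returns to 3. */
	walk_groups(&g3);
	return 0;
}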

Obviously, the NUMA node with CPUs 3-7 conflicts with the MC-level cpu
map, but unfortunately there is currently no way to even detect such cases.

This patch prints a warning message to help detect the above cases.

Signed-off-by: Zeng Tao <prime.zeng@hisilicon.com>

---
 drivers/base/arch_topology.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.8.1

Comments

Sudeep Holla Dec. 31, 2019, 4:40 p.m. | #1
On Mon, Dec 23, 2019 at 04:16:19PM +0800, z00214469 wrote:
> As we know, from the sched domain's perspective, the DIE layer should be
> larger than or at least equal to the MC layer, and in some cases the MC
> layer is defined by arch-specific hardware (MPIDR, for example), but NUMA
> can be defined by users,

Who are the users you are referring to above?

> with the following system configuration:


Do you mean ACPI tables, DT, or some other firmware tables?

> *************************************
> NUMA:            0-2,  3-7


Is the above simply wrong with respect to the hardware, and does it
actually match core_siblings?

> core_siblings:   0-3,  4-7
> *************************************

> Per the current code, for core 3, its MC cpu map falls back to 3-7 (its
> core_siblings is 0-3 while its NUMA node map is 3-7).
>
> For sched MC, when we build the sched groups:
> step1. core 3's sched groups chain is built like this: 3->4->5->6->7->3
> step2. core 4's sched groups chain is built like this: 4->5->6->7->4
> so after step2, core 3's sched groups for the MC level are overlapped; more
> importantly, it will fall into a dead loop in do { ... } while (sg != sd->groups)
>
> Obviously, the NUMA node with CPUs 3-7 conflicts with the MC-level cpu
> map, but unfortunately there is currently no way to even detect such cases.
>

Again, are CPUs 3-7 actually in one NUMA node, or is it 4-7?

> This patch prints a warning message to help detect the above cases.
>
> Signed-off-by: Zeng Tao <prime.zeng@hisilicon.com>
> ---
>  drivers/base/arch_topology.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 1eb81f11..5fe44b3 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -439,10 +439,18 @@ const struct cpumask *cpu_coregroup_mask(int cpu)
>  	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
>  		/* not numa in package, lets use the package siblings */
>  		core_mask = &cpu_topology[cpu].core_sibling;
> -	}
> +	} else
> +		pr_warn_once("Warning: suspicious broken topology: cpu:[%d]'s core_sibling:[%*pbl] not a subset of numa node:[%*pbl]\n",
> +			cpu, cpumask_pr_args(&cpu_topology[cpu].core_sibling),
> +			cpumask_pr_args(core_mask));
> +


Won't this print a warning on all systems that don't have NUMA within a
package? What are you trying to achieve here?

>  	if (cpu_topology[cpu].llc_id != -1) {
>  		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
>  			core_mask = &cpu_topology[cpu].llc_sibling;
> +		else
> +			pr_warn_once("Warning: suspicious broken topology: cpu:[%d]'s llc_sibling:[%*pbl] not a subset of numa node:[%*pbl]\n",
> +				cpu, cpumask_pr_args(&cpu_topology[cpu].llc_sibling),
> +				cpumask_pr_args(core_mask));
>  	}
>


This will trigger the warning on all systems that lack cacheinfo topology.
I don't understand the intent of this patch at all. Can you explain
all the steps you followed and the issue you face?

--
Regards,
Sudeep

Patch

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1eb81f11..5fe44b3 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -439,10 +439,18 @@  const struct cpumask *cpu_coregroup_mask(int cpu)
 	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
 		/* not numa in package, lets use the package siblings */
 		core_mask = &cpu_topology[cpu].core_sibling;
-	}
+	} else
+		pr_warn_once("Warning: suspicious broken topology: cpu:[%d]'s core_sibling:[%*pbl] not a subset of numa node:[%*pbl]\n",
+			cpu, cpumask_pr_args(&cpu_topology[cpu].core_sibling),
+			cpumask_pr_args(core_mask));
+
 	if (cpu_topology[cpu].llc_id != -1) {
 		if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
 			core_mask = &cpu_topology[cpu].llc_sibling;
+		else
+			pr_warn_once("Warning: suspicious broken topology: cpu:[%d]'s llc_sibling:[%*pbl] not a subset of numa node:[%*pbl]\n",
+				cpu, cpumask_pr_args(&cpu_topology[cpu].llc_sibling),
+				cpumask_pr_args(core_mask));
 	}
 
 	return core_mask;
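
For the example topology in the commit message (core 3: core_sibling 0-3,
NUMA node 3-7), the first new check would fire for cpu 3; given the format
string above (with %*pbl rendering each cpumask as a range list), the logged
line should look roughly like:

  Warning: suspicious broken topology: cpu:[3]'s core_sibling:[0-3] not a subset of numa node:[3-7]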