
[1/1] mm: only display online cpus of the numa node

Message ID 1497962608-12756-1-git-send-email-thunder.leizhen@huawei.com
State New
Series [1/1] mm: only display online cpus of the numa node

Commit Message

Zhen Lei June 20, 2017, 12:43 p.m. UTC
When I executed numactl -H (which reads /sys/devices/system/node/nodeX/cpumap
and displays cpumask_of_node for each node), I got different results on
x86 and arm64. For each NUMA node, the former displayed only the online CPUs,
while the latter displayed all possible CPUs. Unfortunately, neither the Linux
documentation nor the numactl manual describes this clearly.

I sent a mail asking for help, and Michal Hocko <mhocko@kernel.org> replied
that he preferred printing online CPUs, because it does not really make much
sense to bind anything to offline nodes.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

---
 drivers/base/node.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--
2.5.0

Comments

Michal Hocko Aug. 24, 2017, 8:32 a.m. UTC | #1
It seems this has slipped through cracks. Let's CC arm64 guys

On Tue 20-06-17 20:43:28, Zhen Lei wrote:
> When I executed numactl -H (which reads /sys/devices/system/node/nodeX/cpumap
> and displays cpumask_of_node for each node), I got different results on
> x86 and arm64. For each NUMA node, the former displayed only the online CPUs,
> while the latter displayed all possible CPUs. Unfortunately, neither the Linux
> documentation nor the numactl manual describes this clearly.
> 
> I sent a mail asking for help, and Michal Hocko <mhocko@kernel.org> replied
> that he preferred printing online CPUs, because it does not really make much
> sense to bind anything to offline nodes.


Yes, printing offline CPUs is just confusing, and more so when the
behavior is not consistent across architectures. I believe the x86
behavior is the more appropriate one, because it is more logical to dump
the NUMA topology and use it directly for affinity setting than to add one
additional step of checking the CPU state to achieve the same.

It is true that the online/offline state might change at any time, so the
above might be tricky on its own, but we should at least make the
behavior consistent.

> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

Acked-by: Michal Hocko <mhocko@suse.com>


> ---
>  drivers/base/node.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 5548f96..d5e7ce7 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -28,12 +28,14 @@ static struct bus_type node_subsys = {
>  static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
>  {
>  	struct node *node_dev = to_node(dev);
> -	const struct cpumask *mask = cpumask_of_node(node_dev->dev.id);
> +	struct cpumask mask;
> +
> +	cpumask_and(&mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
> 
>  	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
>  	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
> 
> -	return cpumap_print_to_pagebuf(list, buf, mask);
> +	return cpumap_print_to_pagebuf(list, buf, &mask);
>  }
> 
>  static inline ssize_t node_read_cpumask(struct device *dev,
> --
> 2.5.0


-- 
Michal Hocko
SUSE Labs
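
[Editor's note: to make the affinity-setting point above concrete, here is a minimal userspace sketch using libnuma (the library numactl is built on), which derives a node's CPU mask from the nodeX/cpumap sysfs interface discussed in this thread. Binding to node 0, the error handling, and linking with -lnuma are illustrative assumptions, not part of the patch.]

#include <numa.h>        /* link with -lnuma */
#include <stdio.h>

int main(void)
{
	struct bitmask *cpus;

	if (numa_available() < 0)
		return 1;	/* kernel or library has no NUMA support */

	cpus = numa_allocate_cpumask();

	/* libnuma fills this mask from /sys/devices/system/node/node0/cpumap;
	 * with the patch applied it would contain only online CPUs of node 0. */
	if (numa_node_to_cpus(0, cpus) == 0) {
		/* Bind the calling thread to those CPUs (pid 0 = current) */
		if (numa_sched_setaffinity(0, cpus) < 0)
			perror("numa_sched_setaffinity");
	}

	numa_free_cpumask(cpus);
	return 0;
}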
Will Deacon Aug. 25, 2017, 5:34 p.m. UTC | #2
On Thu, Aug 24, 2017 at 10:32:26AM +0200, Michal Hocko wrote:
> It seems this has slipped through cracks. Let's CC arm64 guys
> 
> On Tue 20-06-17 20:43:28, Zhen Lei wrote:
> > When I executed numactl -H (which reads /sys/devices/system/node/nodeX/cpumap
> > and displays cpumask_of_node for each node), I got different results on
> > x86 and arm64. For each NUMA node, the former displayed only the online CPUs,
> > while the latter displayed all possible CPUs. Unfortunately, neither the Linux
> > documentation nor the numactl manual describes this clearly.
> > 
> > I sent a mail asking for help, and Michal Hocko <mhocko@kernel.org> replied
> > that he preferred printing online CPUs, because it does not really make much
> > sense to bind anything to offline nodes.
> 
> Yes, printing offline CPUs is just confusing, and more so when the
> behavior is not consistent across architectures. I believe the x86
> behavior is the more appropriate one, because it is more logical to dump
> the NUMA topology and use it directly for affinity setting than to add one
> additional step of checking the CPU state to achieve the same.
> 
> It is true that the online/offline state might change at any time, so the
> above might be tricky on its own, but we should at least make the
> behavior consistent.
> 
> > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>


The concept looks fine to me, but shouldn't we use cpumask_var_t and
alloc/free_cpumask_var?

Will

> >  drivers/base/node.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/base/node.c b/drivers/base/node.c
> > index 5548f96..d5e7ce7 100644
> > --- a/drivers/base/node.c
> > +++ b/drivers/base/node.c
> > @@ -28,12 +28,14 @@ static struct bus_type node_subsys = {
> >  static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
> >  {
> >  	struct node *node_dev = to_node(dev);
> > -	const struct cpumask *mask = cpumask_of_node(node_dev->dev.id);
> > +	struct cpumask mask;
> > +
> > +	cpumask_and(&mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
> > 
> >  	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
> >  	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));
> > 
> > -	return cpumap_print_to_pagebuf(list, buf, mask);
> > +	return cpumap_print_to_pagebuf(list, buf, &mask);
> >  }
> > 
> >  static inline ssize_t node_read_cpumask(struct device *dev,
> > --
> > 2.5.0
> 
> -- 
> Michal Hocko
> SUSE Labs
Michal Hocko Aug. 28, 2017, 1:13 p.m. UTC | #3
On Fri 25-08-17 18:34:33, Will Deacon wrote:
> On Thu, Aug 24, 2017 at 10:32:26AM +0200, Michal Hocko wrote:
> > It seems this has slipped through cracks. Let's CC arm64 guys
> > 
> > On Tue 20-06-17 20:43:28, Zhen Lei wrote:
> > > When I executed numactl -H (which reads /sys/devices/system/node/nodeX/cpumap
> > > and displays cpumask_of_node for each node), I got different results on
> > > x86 and arm64. For each NUMA node, the former displayed only the online CPUs,
> > > while the latter displayed all possible CPUs. Unfortunately, neither the Linux
> > > documentation nor the numactl manual describes this clearly.
> > > 
> > > I sent a mail asking for help, and Michal Hocko <mhocko@kernel.org> replied
> > > that he preferred printing online CPUs, because it does not really make much
> > > sense to bind anything to offline nodes.
> > 
> > Yes, printing offline CPUs is just confusing, and more so when the
> > behavior is not consistent across architectures. I believe the x86
> > behavior is the more appropriate one, because it is more logical to dump
> > the NUMA topology and use it directly for affinity setting than to add one
> > additional step of checking the CPU state to achieve the same.
> > 
> > It is true that the online/offline state might change at any time, so the
> > above might be tricky on its own, but we should at least make the
> > behavior consistent.
> > 
> > > Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> > 
> > Acked-by: Michal Hocko <mhocko@suse.com>
> 
> The concept looks fine to me, but shouldn't we use cpumask_var_t and
> alloc/free_cpumask_var?


This would be safer, but both callers of node_read_cpumap run on a shallow
stack, so I am not sure stack usage is a limiting factor here.

Zhen Lei, would you care to update that part please?

-- 
Michal Hocko
SUSE Labs
Zhen Lei Sept. 29, 2017, 6:46 a.m. UTC | #4
On 2017/8/28 21:13, Michal Hocko wrote:
> On Fri 25-08-17 18:34:33, Will Deacon wrote:
>> On Thu, Aug 24, 2017 at 10:32:26AM +0200, Michal Hocko wrote:
>>> It seems this has slipped through cracks. Let's CC arm64 guys
>>>
>>> On Tue 20-06-17 20:43:28, Zhen Lei wrote:
>>>> When I executed numactl -H (which reads /sys/devices/system/node/nodeX/cpumap
>>>> and displays cpumask_of_node for each node), I got different results on
>>>> x86 and arm64. For each NUMA node, the former displayed only the online CPUs,
>>>> while the latter displayed all possible CPUs. Unfortunately, neither the Linux
>>>> documentation nor the numactl manual describes this clearly.
>>>>
>>>> I sent a mail asking for help, and Michal Hocko <mhocko@kernel.org> replied
>>>> that he preferred printing online CPUs, because it does not really make much
>>>> sense to bind anything to offline nodes.
>>>
>>> Yes, printing offline CPUs is just confusing, and more so when the
>>> behavior is not consistent across architectures. I believe the x86
>>> behavior is the more appropriate one, because it is more logical to dump
>>> the NUMA topology and use it directly for affinity setting than to add one
>>> additional step of checking the CPU state to achieve the same.
>>>
>>> It is true that the online/offline state might change at any time, so the
>>> above might be tricky on its own, but we should at least make the
>>> behavior consistent.
>>>
>>>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>>>
>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>
>> The concept looks fine to me, but shouldn't we use cpumask_var_t and
>> alloc/free_cpumask_var?
> 
> This would be safer, but both callers of node_read_cpumap run on a shallow
> stack, so I am not sure stack usage is a limiting factor here.
> 
> Zhen Lei, would you care to update that part please?

Sure, I will send v2 immediately.

I'm so sorry that I missed this email until someone told me.

-- 
Thanks!
Best Regards
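
[Editor's note: for illustration, here is a minimal sketch of how node_read_cpumap might look with the suggested cpumask_var_t and alloc/free_cpumask_var applied. This is only a sketch of the suggestion, not the actual v2 patch; the GFP_KERNEL flag and the zero-length return on allocation failure are assumptions.]

static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
{
	ssize_t n;
	cpumask_var_t mask;
	struct node *node_dev = to_node(dev);

	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));

	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
		return 0;

	/* Restrict the node's cpumask to CPUs that are currently online */
	cpumask_and(mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);
	n = cpumap_print_to_pagebuf(list, buf, mask);
	free_cpumask_var(mask);

	return n;
}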

Patch

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 5548f96..d5e7ce7 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -28,12 +28,14 @@  static struct bus_type node_subsys = {
 static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
 {
 	struct node *node_dev = to_node(dev);
-	const struct cpumask *mask = cpumask_of_node(node_dev->dev.id);
+	struct cpumask mask;
+
+	cpumask_and(&mask, cpumask_of_node(node_dev->dev.id), cpu_online_mask);

 	/* 2008/04/07: buf currently PAGE_SIZE, need 9 chars per 32 bits. */
 	BUILD_BUG_ON((NR_CPUS/32 * 9) > (PAGE_SIZE-1));

-	return cpumap_print_to_pagebuf(list, buf, mask);
+	return cpumap_print_to_pagebuf(list, buf, &mask);
 }

 static inline ssize_t node_read_cpumask(struct device *dev,