
[RFC,1/2] slub: Avoid trying to allocate memory on offline nodes

Message ID 20180801200418.1325826-2-jeremy.linton@arm.com
State New
Series: harden alloc_pages against bogus nid

Commit Message

Jeremy Linton Aug. 1, 2018, 8:04 p.m. UTC
If a user calls the *alloc_node() functions with an invalid node,
it's possible to crash in __alloc_pages_nodemask() because NODE_DATA()
returns a bad pointer, which propagates into the node zonelist in
prepare_alloc_pages(). Avoid that by not trying to allocate
new slabs against offline nodes.

(example backtrace)

  __alloc_pages_nodemask+0x128/0xf48
  allocate_slab+0x94/0x528
  new_slab+0x68/0xc8
  ___slab_alloc+0x44c/0x520
  __slab_alloc+0x50/0x68
  kmem_cache_alloc_node_trace+0xe0/0x230
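
The failure mode can be sketched in userspace (this is not kernel code; the struct, table, and helper below are simplified stand-ins for the kernel's per-node data). NODE_DATA() is essentially an unchecked table lookup, so a nid that is out of range or has no backing entry hands later code a bad pointer; the kind of validation this series argues for rejects such a nid up front:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 4
#define NUMA_NO_NODE (-1)

/* simplified stand-in for the kernel's pg_data_t */
struct pglist_data { int node_id; };

/* per-node table; offline or absent nodes have a NULL entry */
static struct pglist_data node0 = { 0 };
static struct pglist_data *node_data[MAX_NUMNODES] = { &node0, NULL, NULL, NULL };

/* like the kernel's NODE_DATA(): an unchecked lookup, so a bogus
 * nid yields NULL or out-of-bounds garbage that crashes later */
#define NODE_DATA(nid) (node_data[(nid)])

/* reject any nid that would make NODE_DATA() unusable */
static int nid_is_usable(int nid)
{
	return nid >= 0 && nid < MAX_NUMNODES && NODE_DATA(nid) != NULL;
}
```

The short-circuit ordering matters: the range check must come before the table lookup, or the validation itself indexes out of bounds.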

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

---
 mm/slub.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.14.3

Comments

Michal Hocko Aug. 2, 2018, 9:15 a.m. UTC | #1
On Wed 01-08-18 15:04:17, Jeremy Linton wrote:
[...]
> @@ -2519,6 +2519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>  		if (unlikely(!node_match(page, searchnode))) {
>  			stat(s, ALLOC_NODE_MISMATCH);
>  			deactivate_slab(s, page, c->freelist, c);
> +			if (!node_online(searchnode))
> +				node = NUMA_NO_NODE;
>  			goto new_slab;


This is inherently racy. A NUMA node can go offline at any point after
you check it here. Making it race-free would involve some sort of
locking, and I am not really convinced this is a good idea.

>  		}
>  	}
> -- 
> 2.14.3
> 


-- 
Michal Hocko
SUSE Labs
Christoph Lameter (Ampere) Aug. 2, 2018, 2:23 p.m. UTC | #2
On Wed, 1 Aug 2018, Jeremy Linton wrote:

> diff --git a/mm/slub.c b/mm/slub.c
> index 51258eff4178..e03719bac1e2 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -2519,6 +2519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>  		if (unlikely(!node_match(page, searchnode))) {
>  			stat(s, ALLOC_NODE_MISMATCH);
>  			deactivate_slab(s, page, c->freelist, c);
> +			if (!node_online(searchnode))
> +				node = NUMA_NO_NODE;
>  			goto new_slab;
>  		}
>  	}
>


Would it not be better to implement this check in the page allocator?
There is also the issue of how to fallback to the nearest node.

NUMA_NO_NODE should fallback to the current memory allocation policy but
it seems by inserting it here you would end up just with the default node
for the processor.
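
The distinction Christoph draws can be sketched in userspace (all names here are hypothetical stand-ins, not kernel interfaces): NUMA_NO_NODE defers the choice of node to the caller's current memory policy, whereas a concrete nid pins the allocation regardless of policy:

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)

/* hypothetical stand-in for "current task's memory policy node",
 * e.g. an MPOL_PREFERRED policy; not a real kernel variable */
static int policy_preferred_node = 2;

/* resolve which node an allocation will actually target */
static int resolve_node(int requested)
{
	if (requested != NUMA_NO_NODE)
		return requested;          /* caller pinned a node explicitly */
	return policy_preferred_node;      /* NUMA_NO_NODE defers to policy */
}
```

The concern above is that rewriting the nid at this spot collapses the second case onto the processor's default node rather than honoring the policy.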
Jeremy Linton Aug. 3, 2018, 3:12 a.m. UTC | #3
Hi,

On 08/02/2018 09:23 AM, Christopher Lameter wrote:
> On Wed, 1 Aug 2018, Jeremy Linton wrote:
> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 51258eff4178..e03719bac1e2 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2519,6 +2519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>>   		if (unlikely(!node_match(page, searchnode))) {
>>   			stat(s, ALLOC_NODE_MISMATCH);
>>   			deactivate_slab(s, page, c->freelist, c);
>> +			if (!node_online(searchnode))
>> +				node = NUMA_NO_NODE;
>>   			goto new_slab;
>>   		}
>>   	}
>>
> 
> Would it not be better to implement this check in the page allocator?
> There is also the issue of how to fallback to the nearest node.


Possibly? Falling back to the nearest node, though, should be handled if
memory-less node support is enabled, which in the problematic case it isn't.

> 
> NUMA_NO_NODE should fallback to the current memory allocation policy but
> it seems by inserting it here you would end up just with the default node
> for the processor.


I picked this spot (compared to 2/2) because a number of paths are 
funneling through here, and in this case it shouldn't be a very hot path.
Jeremy Linton Aug. 3, 2018, 3:21 a.m. UTC | #4
Hi,

On 08/02/2018 04:15 AM, Michal Hocko wrote:
> On Wed 01-08-18 15:04:17, Jeremy Linton wrote:
> [...]
>> @@ -2519,6 +2519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
>>   		if (unlikely(!node_match(page, searchnode))) {
>>   			stat(s, ALLOC_NODE_MISMATCH);
>>   			deactivate_slab(s, page, c->freelist, c);
>> +			if (!node_online(searchnode))
>> +				node = NUMA_NO_NODE;
>>   			goto new_slab;
> 
> This is inherently racy. Numa node can get offline at any point after
> you check it here. Making it race free would involve some sort of
> locking and I am not really convinced this is a good idea.


I spent some time looking/thinking about this, and I'm pretty sure it's 
not creating any new problems. But OTOH, I think the node_online() check 
is probably a bit misleading, as what we really want to assure is that 
node < MAX_NUMNODES and that there is going to be a valid entry in 
NODE_DATA(), so we don't dereference NULL.
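
That stricter check can be sketched in userspace (the table and helper below are hypothetical stand-ins, not the kernel's actual interfaces): instead of node_online(), validate the range and the NODE_DATA() entry, and fall back to NUMA_NO_NODE as the patch does when either fails:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 4
#define NUMA_NO_NODE (-1)

/* hypothetical NODE_DATA() table: only node 0 has a pg_data_t here */
static int pgdat0;
static int *node_data[MAX_NUMNODES] = { &pgdat0, NULL, NULL, NULL };

/* validate nid range and NODE_DATA() entry rather than online state;
 * anything unusable degrades to NUMA_NO_NODE, as in the patch */
static int sanitize_searchnode(int node)
{
	if (node == NUMA_NO_NODE)
		return node;
	if (node < 0 || node >= MAX_NUMNODES || node_data[node] == NULL)
		return NUMA_NO_NODE;
	return node;
}
```

Unlike node_online(), nothing here can race with hotplug into a NULL dereference: a node going offline after the check still leaves a valid (if suboptimal) NODE_DATA() entry to allocate against.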


> 
>>   		}
>>   	}
>> -- 
>> 2.14.3
>>
>
Michal Hocko Aug. 3, 2018, 6:20 a.m. UTC | #5
On Thu 02-08-18 22:21:53, Jeremy Linton wrote:
> Hi,
> 
> On 08/02/2018 04:15 AM, Michal Hocko wrote:
> > On Wed 01-08-18 15:04:17, Jeremy Linton wrote:
> > [...]
> > > @@ -2519,6 +2519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> > >   		if (unlikely(!node_match(page, searchnode))) {
> > >   			stat(s, ALLOC_NODE_MISMATCH);
> > >   			deactivate_slab(s, page, c->freelist, c);
> > > +			if (!node_online(searchnode))
> > > +				node = NUMA_NO_NODE;
> > >   			goto new_slab;
> > 
> > This is inherently racy. Numa node can get offline at any point after
> > you check it here. Making it race free would involve some sort of
> > locking and I am not really convinced this is a good idea.
> 
> I spent some time looking/thinking about this, and i'm pretty sure its not
> creating any new problems. But OTOH, I think the node_online() check is
> probably a bit misleading as what we really want to assure is that
> node<MAX_NUMNODES and that there is going to be a valid entry in NODE_DATA()
> so we don't deference null.


Exactly. And we do rely on users of the allocator not passing
bogus parameters. This is not a function to be used with untrusted or
unsanitized inputs.

-- 
Michal Hocko
SUSE Labs

Patch

diff --git a/mm/slub.c b/mm/slub.c
index 51258eff4178..e03719bac1e2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2519,6 +2519,8 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 		if (unlikely(!node_match(page, searchnode))) {
 			stat(s, ALLOC_NODE_MISMATCH);
 			deactivate_slab(s, page, c->freelist, c);
+			if (!node_online(searchnode))
+				node = NUMA_NO_NODE;
 			goto new_slab;
 		}
 	}