[v4,2/2] arm64/bpf: don't allocate BPF JIT programs in module memory

Message ID 20181123221804.440-3-ard.biesheuvel@linaro.org
State New
Series
  • [v4,1/2] bpf: add __weak hook for allocating executable memory

Commit Message

Ard Biesheuvel Nov. 23, 2018, 10:18 p.m.
The arm64 module region is a 128 MB region that is kept close to
the core kernel, in order to ensure that relative branches are
always in range. So using the same region for programs that do
not have this restriction is wasteful, and preferably avoided.

Now that the core BPF JIT code permits the alloc/free routines to
be overridden, implement them by vmalloc()/vfree() calls from a
dedicated 128 MB region set aside for BPF programs. This ensures
that BPF programs are still in branching range of each other, which
is something the JIT currently depends upon (and is not guaranteed
when using module_alloc() on KASLR kernels like we do currently).
It also ensures that placement of BPF programs does not correlate
with the placement of the core kernel or modules, making it less
likely that leaking the former will reveal the latter.

This also solves an issue under KASAN, where shadow memory is
needlessly allocated for all BPF programs (which don't require KASAN
shadow pages, since they are not KASAN instrumented).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

---
 arch/arm64/include/asm/memory.h |  5 ++++-
 arch/arm64/net/bpf_jit_comp.c   | 13 +++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

-- 
2.19.1
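The commit builds on the __weak hook introduced in patch 1/2: the core BPF code provides a default allocator, and an architecture can override it simply by supplying a strong definition of the same symbol. A minimal userspace sketch of that linkage mechanism (`jit_alloc_backend()` is a hypothetical stand-in for `bpf_jit_alloc_exec()`, not the kernel code):

```c
/* Core-code default for the allocator hook, marked weak so that an
 * architecture can provide a strong definition in another object file
 * and the linker will prefer it.  This mirrors the __weak
 * bpf_jit_alloc_exec() hook from patch 1/2: the real default calls
 * module_alloc(), and the arm64 override in this patch calls
 * __vmalloc_node_range() on a dedicated region instead. */
__attribute__((weak)) const char *jit_alloc_backend(void)
{
	return "module region";		/* generic default */
}

/* If another object file in the link defined a strong
 * jit_alloc_backend(), that definition would win; with no override
 * present, the weak default above is the one that runs. */
```
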

Comments

Will Deacon Nov. 30, 2018, 6:26 p.m. | #1
On Fri, Nov 23, 2018 at 11:18:04PM +0100, Ard Biesheuvel wrote:
> The arm64 module region is a 128 MB region that is kept close to
> the core kernel, in order to ensure that relative branches are
> always in range. So using the same region for programs that do
> not have this restriction is wasteful, and preferably avoided.
>
> Now that the core BPF JIT code permits the alloc/free routines to
> be overridden, implement them by vmalloc()/vfree() calls from a
> dedicated 128 MB region set aside for BPF programs. This ensures
> that BPF programs are still in branching range of each other, which
> is something the JIT currently depends upon (and is not guaranteed
> when using module_alloc() on KASLR kernels like we do currently).
> It also ensures that placement of BPF programs does not correlate
> with the placement of the core kernel or modules, making it less
> likely that leaking the former will reveal the latter.
>
> This also solves an issue under KASAN, where shadow memory is
> needlessly allocated for all BPF programs (which don't require KASAN
> shadow pages since they are not KASAN instrumented)
>
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/include/asm/memory.h |  5 ++++-
>  arch/arm64/net/bpf_jit_comp.c   | 13 +++++++++++++
>  2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index b96442960aea..ee20fc63899c 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -62,8 +62,11 @@
>  #define PAGE_OFFSET		(UL(0xffffffffffffffff) - \
>  	(UL(1) << (VA_BITS - 1)) + 1)
>  #define KIMAGE_VADDR		(MODULES_END)
> +#define BPF_JIT_REGION_START	(VA_START + KASAN_SHADOW_SIZE)
> +#define BPF_JIT_REGION_SIZE	(SZ_128M)
> +#define BPF_JIT_REGION_END	(BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
>  #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
> -#define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
> +#define MODULES_VADDR		(BPF_JIT_REGION_END)
>  #define MODULES_VSIZE		(SZ_128M)
>  #define VMEMMAP_START		(PAGE_OFFSET - VMEMMAP_SIZE)
>  #define PCI_IO_END		(VMEMMAP_START - SZ_2M)
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index a6fdaea07c63..76c2ab40c02d 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -940,3 +940,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
>  					   tmp : orig_prog);
>  	return prog;
>  }
> +
> +void *bpf_jit_alloc_exec(unsigned long size)
> +{
> +	return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
> +				    BPF_JIT_REGION_END, GFP_KERNEL,
> +				    PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> +				    __builtin_return_address(0));

I guess we'll want VM_IMMEDIATE_UNMAP here if Rich gets that merged. In the
meantime, I wonder if it's worth zeroing the region in bpf_jit_free_exec()?
(although we'd need the size information...).

Will
Ard Biesheuvel Nov. 30, 2018, 7:20 p.m. | #2
On Fri, 30 Nov 2018 at 19:26, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Nov 23, 2018 at 11:18:04PM +0100, Ard Biesheuvel wrote:
> [...]
> > +void *bpf_jit_alloc_exec(unsigned long size)
> > +{
> > +     return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
> > +                                 BPF_JIT_REGION_END, GFP_KERNEL,
> > +                                 PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
> > +                                 __builtin_return_address(0));
>
> I guess we'll want VM_IMMEDIATE_UNMAP here if Rich gets that merged.

I think akpm already queued up that patch.

> In the
> meantime, I wonder if it's worth zeroing the region in bpf_jit_free_exec()?
> (although we'd need the size information...).

Not sure. What exactly would that achieve?
Will Deacon Dec. 3, 2018, 12:49 p.m. | #3
On Fri, Nov 30, 2018 at 08:20:06PM +0100, Ard Biesheuvel wrote:
> On Fri, 30 Nov 2018 at 19:26, Will Deacon <will.deacon@arm.com> wrote:
> > [...]
> > I guess we'll want VM_IMMEDIATE_UNMAP here if Rich gets that merged.
>
> I think akpm already queued up that patch.
>
> > In the
> > meantime, I wonder if it's worth zeroing the region in bpf_jit_free_exec()?
> > (although we'd need the size information...).
>
> Not sure. What exactly would that achieve?

I think the zero encoding is guaranteed to be undefined, so it would limit
the usefulness of any stale, executable TLB entries. However, we'd also need
cache maintenance to make that stuff visible to the I side, so it's probably
not worth it, especially if akpm has queued the stuff from Rich.

Maybe just add an:

/* FIXME: Remove this when VM_IMMEDIATE_UNMAP is supported */
#ifndef VM_IMMEDIATE_UNMAP
#define VM_IMMEDIATE_UNMAP	0
#endif

so we remember to come back and sort this out? Up to you.

Will
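The fallback-define idiom Will suggests lets call sites pass the flag unconditionally whether or not it has landed in the headers, since OR-ing in 0 is a no-op. A minimal userspace sketch (`jit_vm_flags()` is a hypothetical call site; the real flag would come from the vmalloc headers):

```c
/* Compatibility fallback of the kind suggested above: if
 * VM_IMMEDIATE_UNMAP has not landed yet, define it to 0 so callers
 * need no #ifdefs of their own. */

/* FIXME: remove this fallback once VM_IMMEDIATE_UNMAP is supported */
#ifndef VM_IMMEDIATE_UNMAP
#define VM_IMMEDIATE_UNMAP	0
#endif

static unsigned long jit_vm_flags(void)
{
	/* OR-ing in a zero flag is a no-op, so this compiles and
	 * behaves identically before and after the flag exists */
	return 0UL | VM_IMMEDIATE_UNMAP;
}
```
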
Ard Biesheuvel Dec. 4, 2018, 3:45 p.m. | #4
On Mon, 3 Dec 2018 at 13:49, Will Deacon <will.deacon@arm.com> wrote:
> [...]
> I think the zero encoding is guaranteed to be undefined, so it would limit
> the usefulness of any stale, executable TLB entries. However, we'd also need
> cache maintenance to make that stuff visible to the I side, so it's probably
> not worth it, especially if akpm has queued the stuff from Rich.
>
> Maybe just add an:
>
> /* FIXME: Remove this when VM_IMMEDIATE_UNMAP is supported */
> #ifndef VM_IMMEDIATE_UNMAP
> #define VM_IMMEDIATE_UNMAP      0
> #endif
>
> so we remember to come back and sort this out? Up to you.

I'll just make a note to send out that patch once the definition lands via -akpm.
Daniel Borkmann Dec. 5, 2018, 12:24 p.m. | #5
Hi Will,

On 12/04/2018 04:45 PM, Ard Biesheuvel wrote:
> On Mon, 3 Dec 2018 at 13:49, Will Deacon <will.deacon@arm.com> wrote:
>> [...]
>> so we remember to come back and sort this out? Up to you.
>
> I'll just make a note to send out that patch once the definition lands via -akpm

Could I get an ACK from you for this patch, then I'd take the series into bpf-next.

Thanks,
Daniel
Will Deacon Dec. 5, 2018, 1:24 p.m. | #6
On Wed, Dec 05, 2018 at 01:24:17PM +0100, Daniel Borkmann wrote:
> On 12/04/2018 04:45 PM, Ard Biesheuvel wrote:
> > [...]
> > I'll just make a note to send out that patch once the definition lands via -akpm
>
> Could I get an ACK from you for this patch, then I'd take the series into bpf-next.

Gah, thanks for the ping: I thought I acked this initially, but turns out I
didn't.

Acked-by: Will Deacon <will.deacon@arm.com>


Will
Daniel Borkmann Dec. 5, 2018, 3:40 p.m. | #7
On 12/05/2018 02:24 PM, Will Deacon wrote:
> On Wed, Dec 05, 2018 at 01:24:17PM +0100, Daniel Borkmann wrote:
>> [...]
>> Could I get an ACK from you for this patch, then I'd take the series into bpf-next.
>
> Gah, thanks for the ping: I thought I acked this initially, but turns out I
> didn't.
>
> Acked-by: Will Deacon <will.deacon@arm.com>

Applied, thanks everyone!

Patch

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index b96442960aea..ee20fc63899c 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -62,8 +62,11 @@ 
 #define PAGE_OFFSET		(UL(0xffffffffffffffff) - \
 	(UL(1) << (VA_BITS - 1)) + 1)
 #define KIMAGE_VADDR		(MODULES_END)
+#define BPF_JIT_REGION_START	(VA_START + KASAN_SHADOW_SIZE)
+#define BPF_JIT_REGION_SIZE	(SZ_128M)
+#define BPF_JIT_REGION_END	(BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
 #define MODULES_END		(MODULES_VADDR + MODULES_VSIZE)
-#define MODULES_VADDR		(VA_START + KASAN_SHADOW_SIZE)
+#define MODULES_VADDR		(BPF_JIT_REGION_END)
 #define MODULES_VSIZE		(SZ_128M)
 #define VMEMMAP_START		(PAGE_OFFSET - VMEMMAP_SIZE)
 #define PCI_IO_END		(VMEMMAP_START - SZ_2M)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index a6fdaea07c63..76c2ab40c02d 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -940,3 +940,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 					   tmp : orig_prog);
 	return prog;
 }
+
+void *bpf_jit_alloc_exec(unsigned long size)
+{
+	return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
+				    BPF_JIT_REGION_END, GFP_KERNEL,
+				    PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE,
+				    __builtin_return_address(0));
+}
+
+void bpf_jit_free_exec(void *addr)
+{
+	return vfree(addr);
+}
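For reference, the branch-range invariant the commit message describes — any two programs inside one 128 MB window can always reach each other with a direct branch — can be sketched as a plain C check (hypothetical userspace code; the only facts carried over are SZ_128M and the reach of arm64 immediate branches):

```c
#include <stdint.h>

#define SZ_128M		(128UL * 1024 * 1024)

/* arm64 B/BL instructions take a signed 26-bit word offset, giving a
 * reach of -128 MB to +128 MB - 4 from the branch instruction.  Two
 * addresses inside a single 128 MB region are therefore always within
 * direct-branch range of each other, which is the invariant the BPF
 * JIT relies on and which the dedicated BPF_JIT_REGION preserves. */
static int in_branch_range(uint64_t from, uint64_t to)
{
	int64_t off = (int64_t)(to - from);

	return off >= -(int64_t)SZ_128M && off < (int64_t)SZ_128M;
}
```

With `module_alloc()` falling back to the whole vmalloc space on KASLR kernels, two JITed programs could land further apart than this check allows, which is exactly the situation the dedicated region avoids.
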