
[RFC,1/6] arm64: vmemmap: use virtual projection of linear region

Message ID 1456330893-19228-2-git-send-email-ard.biesheuvel@linaro.org
State Superseded

Commit Message

Ard Biesheuvel Feb. 24, 2016, 4:21 p.m. UTC
Commit dd006da21646 ("arm64: mm: increase VA range of identity map") made
some changes to the memory mapping code to allow physical memory to reside
at an offset that exceeds the size of the virtual address space.

However, since the size of the vmemmap area is proportional to the size of
the VA space while it is populated relative to the physical address space,
we may end up with the struct page array being mapped outside of the
vmemmap region. For instance, on my Seattle A0 box, I can see the following
output in the dmesg log.

   vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000   (     8 GB maximum)
             0xffffffbfc0000000 - 0xffffffbfd0000000   (   256 MB actual)

We can fix this by deciding that the vmemmap region is not a projection of
the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
linear region. This way, we are guaranteed that the vmemmap region is of
sufficient size, and we can also reduce its size by half.
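
(For illustration only, assuming 4 KB pages, 39-bit VAs and a 64-byte
struct page, which appears to be the configuration behind the 8 GB figure
above:

   old: VMEMMAP_SIZE = (1 << (39 - 12)) * 64     bytes = 8 GB
   new: VMEMMAP_SIZE = (1 << (39 - 12 - 1)) * 64 bytes = 4 GB

while the 256 MB actually used would correspond to roughly 16 GB of
populated RAM.)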

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

---
 arch/arm64/include/asm/pgtable.h | 7 ++++---
 arch/arm64/mm/init.c             | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

-- 
2.5.0



Comments

Ard Biesheuvel Feb. 25, 2016, 7:02 a.m. UTC | #1
On 24 February 2016 at 17:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> [...]
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index a440f5a85d08..8e6baea0ff61 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -34,18 +34,19 @@
>  /*
>   * VMALLOC and SPARSEMEM_VMEMMAP ranges.
>   *
> - * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
> + * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
>   *     (rounded up to PUD_SIZE).
>   * VMALLOC_START: beginning of the kernel vmalloc space
>   * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
>   *     fixed mappings and modules
>   */
> -#define VMEMMAP_SIZE           ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
> +#define VMEMMAP_SIZE           ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
>
>  #define VMALLOC_START          (MODULES_END)
>  #define VMALLOC_END            (PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
>
> -#define vmemmap                        ((struct page *)(VMALLOC_END + SZ_64K))
> +#define VMEMMAP_START          (VMALLOC_END + SZ_64K)
> +#define vmemmap                        ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
>

Note that with the linear region randomization which is now in -next,
this division needs to be signed (since memstart_addr can wrap).

So I should either update the definition of memstart_addr to s64 in
this patch, or cast to (s64) in the expression above.


>  #define FIRST_USER_ADDRESS     0UL
>
> [...]


Will Deacon Feb. 26, 2016, 3:15 p.m. UTC | #2
On Thu, Feb 25, 2016 at 08:02:00AM +0100, Ard Biesheuvel wrote:
> On 24 February 2016 at 17:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > [...]
> > -#define vmemmap                        ((struct page *)(VMALLOC_END + SZ_64K))
> > +#define VMEMMAP_START          (VMALLOC_END + SZ_64K)
> > +#define vmemmap                        ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
> >
>
> Note that with the linear region randomization which is now in -next,
> this division needs to be signed (since memstart_addr can wrap).
>
> So I should either update the definition of memstart_addr to s64 in
> this patch, or cast to (s64) in the expression above.

Can you avoid the division altogether by doing something like:

(struct page *)(VMEMMAP_START - (PHYS_PFN(memstart_addr) * sizeof(struct page)))

or have I misunderstood how this works?

Will

Ard Biesheuvel Feb. 26, 2016, 3:39 p.m. UTC | #3
On 26 February 2016 at 16:15, Will Deacon <will.deacon@arm.com> wrote:
> On Thu, Feb 25, 2016 at 08:02:00AM +0100, Ard Biesheuvel wrote:
>> On 24 February 2016 at 17:21, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>> > [...]
>> > -#define vmemmap                        ((struct page *)(VMALLOC_END + SZ_64K))
>> > +#define VMEMMAP_START          (VMALLOC_END + SZ_64K)
>> > +#define vmemmap                        ((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
>> >
>>
>> Note that with the linear region randomization which is now in -next,
>> this division needs to be signed (since memstart_addr can wrap).
>>
>> So I should either update the definition of memstart_addr to s64 in
>> this patch, or cast to (s64) in the expression above.
>
> Can you avoid the division altogether by doing something like:
>
> (struct page *)(VMEMMAP_START - (PHYS_PFN(memstart_addr) * sizeof(struct page)))
>
> or have I misunderstood how this works?

It needs to be a signed shift, since the RHS of the subtraction must
remain negative if memstart_addr is 'negative'.

This works as well:
(struct page *)VMEMMAP_START - ((s64)memstart_addr >> PAGE_SHIFT)

It may be appropriate to change the definition of memstart_addr to
s64, to reflect that, under randomization of the linear region, the
start of physical memory may be 'below zero', so that the actual
populated RAM region is high up in the linear region.
That way, we can lose the cast here.
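
To make the sign issue concrete, here is a minimal userspace-only sketch of
the two shifts (the VMEMMAP_START value, the randomized memstart_addr and
the 64-byte struct page size are made-up placeholders, not the real kernel
layout; the "* 64" stands in for the struct page pointer arithmetic):

#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT      12
#define VMEMMAP_START   0xffffffbfc0000000ULL   /* hypothetical */

int main(void)
{
        /* hypothetical randomized memstart_addr, i.e. -1 GB as an s64 */
        uint64_t memstart_addr = 0xffffffffc0000000ULL;

        /* logical (unsigned) shift: huge positive offset, bogus result */
        uint64_t bad  = VMEMMAP_START - (memstart_addr >> PAGE_SHIFT) * 64;

        /* arithmetic (signed) shift: the offset stays negative, so the
           array base lands just above VMEMMAP_START, as intended */
        uint64_t good = VMEMMAP_START - ((int64_t)memstart_addr >> PAGE_SHIFT) * 64;

        printf("unsigned: %#llx\nsigned:   %#llx\n",
               (unsigned long long)bad, (unsigned long long)good);
        return 0;
}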

Will Deacon Feb. 26, 2016, 4:24 p.m. UTC | #4
On Fri, Feb 26, 2016 at 04:39:55PM +0100, Ard Biesheuvel wrote:
> On 26 February 2016 at 16:15, Will Deacon <will.deacon@arm.com> wrote:
> > On Thu, Feb 25, 2016 at 08:02:00AM +0100, Ard Biesheuvel wrote:
> >> [...]
> >> Note that with the linear region randomization which is now in -next,
> >> this division needs to be signed (since memstart_addr can wrap).
> >>
> >> So I should either update the definition of memstart_addr to s64 in
> >> this patch, or cast to (s64) in the expression above.
> >
> > Can you avoid the division altogether by doing something like:
> >
> > (struct page *)(VMEMMAP_START - (PHYS_PFN(memstart_addr) * sizeof(struct page)))
> >
> > or have I misunderstood how this works?
>
> It needs to be a signed shift, since the RHS of the subtraction must
> remain negative if memstart_addr is 'negative'.
>
> This works as well:
> (struct page *)VMEMMAP_START - ((s64)memstart_addr >> PAGE_SHIFT)

Ah yeah, even better.

> It may be appropriate to change the definition of memstart_addr to
> s64, to reflect that, under randomization of the linear region, the
> start of physical memory may be 'below zero', so that the actual
> populated RAM region is high up in the linear region.
> That way, we can lose the cast here.

That sounds like a good idea.

Will

Ard Biesheuvel Feb. 26, 2016, 4:26 p.m. UTC | #5
On 26 February 2016 at 17:24, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Feb 26, 2016 at 04:39:55PM +0100, Ard Biesheuvel wrote:
>> [...]
>> It may be appropriate to change the definition of memstart_addr to
>> s64, to reflect that, under randomization of the linear region, the
>> start of physical memory may be 'below zero', so that the actual
>> populated RAM region is high up in the linear region.
>> That way, we can lose the cast here.
>
> That sounds like a good idea.

OK, I will respin the first patch. As far as the remaining patches are
concerned, I wonder if you have any suggestions as to how to measure
the performance impact of making virt_to_page() disregard PHYS_OFFSET
(as I did in 6/6) before respinning/resending them.
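
One hypothetical way to get a rough number (a throwaway test module
sketched below, not anything from this series) would be to time a tight
loop of virt_to_page() conversions on a linear-map buffer, once before and
once after applying 6/6; the accumulator only exists to stop the compiler
from optimising the loop away:

#include <linux/module.h>
#include <linux/ktime.h>
#include <linux/mm.h>
#include <linux/slab.h>

static int __init v2p_bench_init(void)
{
        void *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
        unsigned long acc = 0;
        ktime_t t0, t1;
        int i;

        if (!buf)
                return -ENOMEM;

        t0 = ktime_get();
        for (i = 0; i < 1000000; i++)
                acc += (unsigned long)virt_to_page(buf + (i & (PAGE_SIZE - 1)));
        t1 = ktime_get();

        pr_info("virt_to_page x1M: %lld ns (acc %lx)\n",
                ktime_to_ns(ktime_sub(t1, t0)), acc);
        kfree(buf);
        return 0;
}

static void __exit v2p_bench_exit(void)
{
}

module_init(v2p_bench_init);
module_exit(v2p_bench_exit);
MODULE_LICENSE("GPL");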


Patch

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a440f5a85d08..8e6baea0ff61 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -34,18 +34,19 @@ 
 /*
  * VMALLOC and SPARSEMEM_VMEMMAP ranges.
  *
- * VMEMAP_SIZE: allows the whole VA space to be covered by a struct page array
+ * VMEMAP_SIZE: allows the whole linear region to be covered by a struct page array
  *	(rounded up to PUD_SIZE).
  * VMALLOC_START: beginning of the kernel vmalloc space
  * VMALLOC_END: extends to the available space below vmmemmap, PCI I/O space,
  *	fixed mappings and modules
  */
-#define VMEMMAP_SIZE		ALIGN((1UL << (VA_BITS - PAGE_SHIFT)) * sizeof(struct page), PUD_SIZE)
+#define VMEMMAP_SIZE		ALIGN((1UL << (VA_BITS - PAGE_SHIFT - 1)) * sizeof(struct page), PUD_SIZE)
 
 #define VMALLOC_START		(MODULES_END)
 #define VMALLOC_END		(PAGE_OFFSET - PUD_SIZE - VMEMMAP_SIZE - SZ_64K)
 
-#define vmemmap			((struct page *)(VMALLOC_END + SZ_64K))
+#define VMEMMAP_START		(VMALLOC_END + SZ_64K)
+#define vmemmap			((struct page *)(VMEMMAP_START - memstart_addr / sizeof(struct page)))
 
 #define FIRST_USER_ADDRESS	0UL
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index c0ea54bd9995..88046b94fa87 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -363,8 +363,8 @@  void __init mem_init(void)
 		  MLK_ROUNDUP(_text, _etext),
 		  MLK_ROUNDUP(_sdata, _edata),
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-		  MLG((unsigned long)vmemmap,
-		      (unsigned long)vmemmap + VMEMMAP_SIZE),
+		  MLG(VMEMMAP_START,
+		      VMEMMAP_START + VMEMMAP_SIZE),
 		  MLM((unsigned long)virt_to_page(PAGE_OFFSET),
 		      (unsigned long)virt_to_page(high_memory)),
 #endif