diff mbox

[Xen-devel,2/5] xen: arm: Handle 4K aligned hypervisor load address.

Message ID 1405355950-6461-2-git-send-email-ian.campbell@citrix.com
State New

Commit Message

Ian Campbell July 14, 2014, 4:39 p.m. UTC
Currently the boot page tables map Xen at XEN_VIRT_START using a 2MB section
mapping. This means that the bootloader must load Xen at a 2MB aligned address.
Unfortunately this is not the case with UEFI on the Juno platform where Xen
fails to boot. Furthermore the Linux boot protocol (which Xen claims to adhere
to) does not have this restriction, therefore this is our bug and not the
bootloader's.

Fix this by adding third level pagetables to the boot time pagetables, allowing
us to map a Xen which is aligned only to a 4K boundary. This only affects the
boot time page tables since Xen will later relocate itself to a 2MB aligned
address. Strictly speaking the non-boot processors could make use of this and
use a section mapping, but it is simpler if all processors follow the same boot
path.

Strictly speaking the Linux boot protocol doesn't even require 4K alignment
(and apparently Linux can cope with this), but so far all bootloaders appear to
provide it, so support for this is left for another day.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/arch/arm/arm32/head.S |   54 +++++++++++++++++++++++++++++++++------------
 xen/arch/arm/arm64/head.S |   50 +++++++++++++++++++++++++++++------------
 xen/arch/arm/mm.c         |    8 +++++--
 3 files changed, 82 insertions(+), 30 deletions(-)

Comments

Julien Grall July 14, 2014, 10:33 p.m. UTC | #1
Hi Ian,

On 14/07/14 17:39, Ian Campbell wrote:
> Currently the boot page tables map Xen at XEN_VIRT_START using a 2MB section
> mapping. This means that the bootloader must load Xen at a 2MB aligned address.
> Unfortunately this is not the case with UEFI on the Juno platform where Xen
> fails to boot. Furthermore the Linux boot protocol (which Xen claims to adhere
> to) does not have this restriction, therefore this is our bug and not the
> bootloader's.
>
> Fix this by adding third level pagetables to the boot time pagetables, allowing
> us to map a Xen which is aligned only to a 4K boundary. This only affects the
> boot time page tables since Xen will later relocate itself to a 2MB aligned
> address. Strictly speaking the non-boot processors could make use of this and
> use a section mapping, but it is simpler if all processors follow the same boot
> path.

OOI, did you consider the alternative of copying Xen to a 2MB-aligned 
address before setting up the early page tables?

It would avoid using a third level page table during boot, but would 
introduce an extra copy of Xen (only for the boot processor).

FYI, it's what Linux does.
Ian Campbell July 15, 2014, 9:13 a.m. UTC | #2
On Mon, 2014-07-14 at 23:33 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 14/07/14 17:39, Ian Campbell wrote:
> > Currently the boot page tables map Xen at XEN_VIRT_START using a 2MB section
> > mapping. This means that the bootloader must load Xen at a 2MB aligned address.
> > Unfortunately this is not the case with UEFI on the Juno platform where Xen
> > fails to boot. Furthermore the Linux boot protocol (which Xen claims to adhere
> > to) does not have this restriction, therefore this is our bug and not the
> > bootloader's.
> >
> > Fix this by adding third level pagetables to the boot time pagetables, allowing
> > us to map a Xen which is aligned only to a 4K boundary. This only affects the
> > boot time page tables since Xen will later relocate itself to a 2MB aligned
> > address. Strictly speaking the non-boot processors could make use of this and
> > use a section mapping, but it is simpler if all processors follow the same boot
> > path.
> 
> OOI, did you think about the solution to copy Xen to a 2MB aligned 
> address before setup the early page table?

I did but I was worried about clobbering something useful, like a boot
module.

> It would avoid to use third level page table during boot

This isn't really that expensive, and it's marked __init so it goes away
after boot.

>  but will 
> introduce an extra copy of Xen (only for the boot processor).
> 
> FYI, it's what Linux does.

Interesting, how does it choose the address? Just by rounding down the
loading address?

Ian.
Julien Grall July 15, 2014, 11:03 a.m. UTC | #3
On 15/07/14 10:13, Ian Campbell wrote:
> On Mon, 2014-07-14 at 23:33 +0100, Julien Grall wrote:
>> Hi Ian,
>>
>> On 14/07/14 17:39, Ian Campbell wrote:
>>> Currently the boot page tables map Xen at XEN_VIRT_START using a 2MB section
>>> mapping. This means that the bootloader must load Xen at a 2MB aligned address.
>>> Unfortunately this is not the case with UEFI on the Juno platform where Xen
>>> fails to boot. Furthermore the Linux boot protocol (which Xen claims to adhere
>>> to) does not have this restriction, therefore this is our bug and not the
>>> bootloader's.
>>>
>>> Fix this by adding third level pagetables to the boot time pagetables, allowing
>>> us to map a Xen which is aligned only to a 4K boundary. This only affects the
>>> boot time page tables since Xen will later relocate itself to a 2MB aligned
>>> address. Strictly speaking the non-boot processors could make use of this and
>>> use a section mapping, but it is simpler if all processors follow the same boot
>>> path.
>>
>> OOI, did you think about the solution to copy Xen to a 2MB aligned
>> address before setup the early page table?
>
> I did but I was worried about clobbering something useful, like a boot
> module.
>
>> It would avoid to use third level page table during boot
>
> This isn't really that expensive, and it's marked __init so it goes away
> after boot.
>
>>   but will
>> introduce an extra copy of Xen (only for the boot processor).
>>
>> FYI, it's what Linux does.
>
> Interesting, how does it choose the address? Just by rounding down the
> loading address?

For ARM32, yes.

For ARM64, I can't find anything about relocating the Image. AFAIU, they 
use a field in the header to specify at which offset from the RAM base 
address the kernel must be loaded.

In Xen, this field is set to 0, which means loading at any address. So I 
suspect that for ARM64 we could change this offset and avoid modifying 
the page table code.

Regards,
Julien Grall July 15, 2014, 11:07 a.m. UTC | #4
On 15/07/14 12:03, Julien Grall wrote:
>
>
> On 15/07/14 10:13, Ian Campbell wrote:
>> On Mon, 2014-07-14 at 23:33 +0100, Julien Grall wrote:
>>> Hi Ian,
>>>
>>> On 14/07/14 17:39, Ian Campbell wrote:
>>>> Currently the boot page tables map Xen at XEN_VIRT_START using a 2MB
>>>> section
>>>> mapping. This means that the bootloader must load Xen at a 2MB
>>>> aligned address.
>>>> Unfortunately this is not the case with UEFI on the Juno platform
>>>> where Xen
>>>> fails to boot. Furthermore the Linux boot protocol (which Xen claims
>>>> to adhere
>>>> to) does not have this restriction, therefore this is our bug and
>>>> not the
>>>> bootloader's.
>>>>
>>>> Fix this by adding third level pagetables to the boot time
>>>> pagetables, allowing
>>>> us to map a Xen which is aligned only to a 4K boundary. This only
>>>> affects the
>>>> boot time page tables since Xen will later relocate itself to a 2MB
>>>> aligned
>>>> address. Strictly speaking the non-boot processors could make use of
>>>> this and
>>>> use a section mapping, but it is simpler if all processors follow
>>>> the same boot
>>>> path.
>>>
>>> OOI, did you think about the solution to copy Xen to a 2MB aligned
>>> address before setup the early page table?
>>
>> I did but I was worried about clobbering something useful, like a boot
>> module.
>>
>>> It would avoid to use third level page table during boot
>>
>> This isn't really that expensive, and it's marked __init so it goes away
>> after boot.
>>
>>>   but will
>>> introduce an extra copy of Xen (only for the boot processor).
>>>
>>> FYI, it's what Linux does.
>>
>> Interesting, how does it choose the address? Just by rounding down the
>> loading address?
>
> For ARM32 yes.
>
> For ARM64, I don't find anything about relocating the Image. AFAIU, they
> use a field in the header to specify at which offset we need to load the
> kernel from the RAM base address.
>
> On Xen, this field is set to 0, which means loading at any address. So I
> suspect for ARM64 we can change this offset and avoid to modify page
> table code.

Hrm... I misread the documentation for this part.

"The image must be placed at the specified offset (currently 0x80000)
from the start of the system RAM and called there. The start of the
system RAM must be aligned to 2MB."
Ian Campbell July 15, 2014, 11:10 a.m. UTC | #5
On Tue, 2014-07-15 at 12:07 +0100, Julien Grall wrote:
> 
> On 15/07/14 12:03, Julien Grall wrote:
> >
> >
> > On 15/07/14 10:13, Ian Campbell wrote:
> >> On Mon, 2014-07-14 at 23:33 +0100, Julien Grall wrote:
> >>> Hi Ian,
> >>>
> >>> On 14/07/14 17:39, Ian Campbell wrote:
> >>>> Currently the boot page tables map Xen at XEN_VIRT_START using a 2MB
> >>>> section
> >>>> mapping. This means that the bootloader must load Xen at a 2MB
> >>>> aligned address.
> >>>> Unfortunately this is not the case with UEFI on the Juno platform
> >>>> where Xen
> >>>> fails to boot. Furthermore the Linux boot protocol (which Xen claims
> >>>> to adhere
> >>>> to) does not have this restriction, therefore this is our bug and
> >>>> not the
> >>>> bootloader's.
> >>>>
> >>>> Fix this by adding third level pagetables to the boot time
> >>>> pagetables, allowing
> >>>> us to map a Xen which is aligned only to a 4K boundary. This only
> >>>> affects the
> >>>> boot time page tables since Xen will later relocate itself to a 2MB
> >>>> aligned
> >>>> address. Strictly speaking the non-boot processors could make use of
> >>>> this and
> >>>> use a section mapping, but it is simpler if all processors follow
> >>>> the same boot
> >>>> path.
> >>>
> >>> OOI, did you think about the solution to copy Xen to a 2MB aligned
> >>> address before setup the early page table?
> >>
> >> I did but I was worried about clobbering something useful, like a boot
> >> module.
> >>
> >>> It would avoid to use third level page table during boot
> >>
> >> This isn't really that expensive, and it's marked __init so it goes away
> >> after boot.
> >>
> >>>   but will
> >>> introduce an extra copy of Xen (only for the boot processor).
> >>>
> >>> FYI, it's what Linux does.
> >>
> >> Interesting, how does it choose the address? Just by rounding down the
> >> loading address?
> >
> > For ARM32 yes.
> >
> > For ARM64, I don't find anything about relocating the Image. AFAIU, they
> > use a field in the header to specify at which offset we need to load the
> > kernel from the RAM base address.
> >
> > On Xen, this field is set to 0, which means loading at any address. So I
> > suspect for ARM64 we can change this offset and avoid to modify page
> > table code.
> 
> Hrm... I misread the documentation for this part.
> 
> "The image must be placed at the specified offset (currently 0x80000)
> from the start of the system RAM and called there. The start of the
> system RAM must be aligned to 2MB."
> 

Regardless of this, being more flexible in what load addresses we accept
is relatively easy, has no impact after boot, and the code is already
written.

Ian.
Julien Grall July 15, 2014, 12:03 p.m. UTC | #6
On 15/07/14 12:10, Ian Campbell wrote:
> On Tue, 2014-07-15 at 12:07 +0100, Julien Grall wrote:
>>
>> On 15/07/14 12:03, Julien Grall wrote:
>>>
>>>
>>> On 15/07/14 10:13, Ian Campbell wrote:
>>>> On Mon, 2014-07-14 at 23:33 +0100, Julien Grall wrote:
>>>>> Hi Ian,
>>>>>
>>>>> On 14/07/14 17:39, Ian Campbell wrote:
>>>>>> Currently the boot page tables map Xen at XEN_VIRT_START using a 2MB
>>>>>> section
>>>>>> mapping. This means that the bootloader must load Xen at a 2MB
>>>>>> aligned address.
>>>>>> Unfortunately this is not the case with UEFI on the Juno platform
>>>>>> where Xen
>>>>>> fails to boot. Furthermore the Linux boot protocol (which Xen claims
>>>>>> to adhere
>>>>>> to) does not have this restriction, therefore this is our bug and
>>>>>> not the
>>>>>> bootloader's.
>>>>>>
>>>>>> Fix this by adding third level pagetables to the boot time
>>>>>> pagetables, allowing
>>>>>> us to map a Xen which is aligned only to a 4K boundary. This only
>>>>>> affects the
>>>>>> boot time page tables since Xen will later relocate itself to a 2MB
>>>>>> aligned
>>>>>> address. Strictly speaking the non-boot processors could make use of
>>>>>> this and
>>>>>> use a section mapping, but it is simpler if all processors follow
>>>>>> the same boot
>>>>>> path.
>>>>>
>>>>> OOI, did you think about the solution to copy Xen to a 2MB aligned
>>>>> address before setup the early page table?
>>>>
>>>> I did but I was worried about clobbering something useful, like a boot
>>>> module.
>>>>
>>>>> It would avoid to use third level page table during boot
>>>>
>>>> This isn't really that expensive, and it's marked __init so it goes away
>>>> after boot.
>>>>
>>>>>    but will
>>>>> introduce an extra copy of Xen (only for the boot processor).
>>>>>
>>>>> FYI, it's what Linux does.
>>>>
>>>> Interesting, how does it choose the address? Just by rounding down the
>>>> loading address?
>>>
>>> For ARM32 yes.
>>>
>>> For ARM64, I don't find anything about relocating the Image. AFAIU, they
>>> use a field in the header to specify at which offset we need to load the
>>> kernel from the RAM base address.
>>>
>>> On Xen, this field is set to 0, which means loading at any address. So I
>>> suspect for ARM64 we can change this offset and avoid to modify page
>>> table code.
>>
>> Hrm... I misread the documentation for this part.
>>
>> "The image must be placed at the specified offset (currently 0x80000)
>> from the start of the system RAM and called there. The start of the
>> system RAM must be aligned to 2MB."
>>
>
> Regardless of this being more flexible in what load addresses we accept
> is relatively easy, has no impact after boot and the code is already
> written.

I've stopped counting the number of patches I completely reworked after 
the first version...

I think this adds complexity to the assembly code and a restriction (see 
your panic) on where the bootloader may load Xen in memory. Granted, the 
restriction was already there, just hidden by the fact that we were 
using a 2MB mapping.

This could be replaced by:
    ARM32: adding a couple of assembly lines to relocate Xen down to a 
2MB-aligned address.
    ARM64: using the text offset in the Image header, unless a released 
bootloader turns out to be unable to cope with it correctly.

Regards,
Ian Campbell July 15, 2014, 3:18 p.m. UTC | #7
On Tue, 2014-07-15 at 13:03 +0100, Julien Grall wrote:
> > Regardless of this being more flexible in what load addresses we accept
> > is relatively easy, has no impact after boot and the code is already
> > written.
> 
> I stopped to count the number of patch I completely reworked after the 
> first version...

Fortunately no one is asking you to rework this patch and I'm perfectly
happy to do so myself.

> I think this is adding complexity in the assembly code and restriction 
> (see your panic) where the bootloader load Xen in the memory.

It removes restrictions and adds none (except that right now it
incorrectly rejects being loaded at exactly 2MB; that's a simple fix
though).

I don't see it as being much more complex; it's essentially the same
pattern as levels 1 and 2 further up the page table tree.

>  Even 
> though, the restriction where already there but hidden by the fact we 
> are using 2MB mapping.

You have a strange definition of "adding restrictions" then.

> This could be replaced by:
>     ARM32: adding a couple of assembly lines to relocate down to a 2MB 
> address.
>     ARM64: using the offset in the Image, unless if we released 
> bootloader is not able to correctly cope with it.

I'm not 100% certain that either of those are completely viable.

For 32-bit I don't think we know what we will overwrite by copying
ourselves down. Perhaps it would be ok, but why risk it.

For 64-bit at least the Juno firmware appears to load us at 0x80080000
irrespective of the value put in the text offset field. That's certainly
a bug in the firmware, but I can't see any reason not to make ourselves
more flexible here.

I've no idea what control the UEFI stub is going to have over load
address, but I'm pretty sure it will be preferable to avoid having to
relocate on that code path too.

Ian.
Julien Grall July 16, 2014, 3:18 p.m. UTC | #8
On 15/07/14 16:18, Ian Campbell wrote:
> For 64-bit at least the Juno firmware appears to load us at 0x80080000
> irrespective of the value put in the text offset field. That's certainly
> a bug in the firmware, but I can't see any reason not to make ourselves
> more flexible here.

D'oh, do you plan to file a bug against the firmware?
Julien Grall July 16, 2014, 3:41 p.m. UTC | #9
Hi Ian,

On 14/07/14 17:39, Ian Campbell wrote:
> ---
>   xen/arch/arm/arm32/head.S |   54 +++++++++++++++++++++++++++++++++------------
>   xen/arch/arm/arm64/head.S |   50 +++++++++++++++++++++++++++++------------
>   xen/arch/arm/mm.c         |    8 +++++--
>   3 files changed, 82 insertions(+), 30 deletions(-)
>
> diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
> index 1319a13..3a72195 100644
> --- a/xen/arch/arm/arm32/head.S
> +++ b/xen/arch/arm/arm32/head.S
> @@ -26,6 +26,7 @@
>
>   #define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
>   #define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
> +#define PT_MEM_L3 0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
>   #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
>   #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
>
> @@ -279,25 +280,50 @@ cpu_init_done:
>           ldr   r4, =boot_second
>           add   r4, r4, r10            /* r4 := paddr (boot_second) */
>
> -        lsr   r2, r9, #SECOND_SHIFT  /* Base address for 2MB mapping */
> -        lsl   r2, r2, #SECOND_SHIFT
> +        ldr   r1, =boot_third
> +        add   r1, r1, r10            /* r1 := paddr (boot_third) */
> +        mov   r3, #0x0
> +
> +        /* ... map boot_third in boot_second[1] */
> +        orr   r2, r1, #PT_UPPER(PT)  /* r2:r3 := table map of boot_third */
> +        orr   r2, r2, #PT_LOWER(PT)  /* (+ rights for linear PT) */
> +        strd  r2, r3, [r4, #8]       /* Map it in slot 1 */
> +
> +        /* ... map of paddr(start) in boot_second */
> +        lsrs  r1, r9, #SECOND_SHIFT  /* Offset of base paddr in boot_second */
> +        mov   r2, #0x0ff             /* r2 := LPAE entries mask */
> +        orr   r2, r2, #0x100
> +        and   r1, r1, r2
> +        cmp   r1, #1
> +        bne   2f                     /* It's not in slot 1, map it */
> +
> +        /* Identity map clashes with boot_third, which we cannot handle yet */
> +        PRINT("Unable to build boot page tables - virt and phys addresses clash.\r\n")
> +        b     fail

AFAIU, this can happen if the kernel is loaded around 2MB in memory, 
right?

Also, what prevents Xen from being split across two third-level page 
tables?

> +
> +2:
> +        lsl   r2, r1, #SECOND_SHIFT  /* Base address for 2MB mapping */
>           orr   r2, r2, #PT_UPPER(MEM) /* r2:r3 := section map */
>           orr   r2, r2, #PT_LOWER(MEM)
> +        lsl   r1, r1, #3             /* r1 := Slot offset */
> +        strd  r2, r3, [r4, r1]       /* Mapping of paddr(start) */
>
> -        /* ... map of vaddr(start) in boot_second */
> -        ldr   r1, =start
> -        lsr   r1, #(SECOND_SHIFT - 3)   /* Slot for vaddr(start) */
> -        strd  r2, r3, [r4, r1]       /* Map vaddr(start) */
> +        /* Setup boot_third: */
> +1:      ldr   r4, =boot_third
> +        add   r4, r4, r10            /* r4 := paddr (boot_third) */
>
> -        /* ... map of paddr(start) in boot_second */
> -        lsrs  r1, r9, #30            /* Base paddr */
> -        bne   1f                     /* If paddr(start) is not in slot 0
> -                                      * then the mapping was done in
> -                                      * boot_pgtable above */
> +        lsr   r2, r9, #THIRD_SHIFT  /* Base address for 4K mapping */
> +        lsl   r2, r2, #THIRD_SHIFT
> +        orr   r2, r2, #PT_UPPER(MEM_L3) /* r2:r3 := map */
> +        orr   r2, r2, #PT_LOWER(MEM_L3)
>
> -        mov   r1, r9, lsr #(SECOND_SHIFT - 3)   /* Slot for paddr(start) */
> -        strd  r2, r3, [r4, r1]       /* Map Xen there */
> -1:
> +        /* ... map of vaddr(start) in boot_third */
> +        mov   r1, #0
> +1:      strd  r2, r3, [r4, r1]       /* Map vaddr(start) */
> +        add   r2, r2, #4096          /* Next page */

I would use PAGE_SIZE or THIRD_SIZE here.

> +        add   r1, r1, #8             /* Next slot */
> +        cmp   r1, #(512*8)

Any reason to not use LPAE_ENTRIES here?


Regards,
Ian Campbell July 16, 2014, 4:53 p.m. UTC | #10
On Wed, 2014-07-16 at 16:41 +0100, Julien Grall wrote:
> > +        /* Identity map clashes with boot_third, which we cannot handle yet */
> > +        PRINT("Unable to build boot page tables - virt and phys addresses clash.\r\n")
> > +        b     fail
> 
> AFAIU, this can happen if the kernel is loaded around 2MB in the memory, 
> right?

Yes, from 2MB up to (but not including) 4MB.

It is an error (I think) that this patch panics if Xen is loaded at
exactly 2MB, since in that case the virtual and identity-physical
mappings are the same.

> Also what does prevent Xen to be shared between 2 third page table?

This is the virtual mapping, which always starts at exactly 2MB, so that
can only happen if Xen is larger than 2MB, which we assume is not the
case both here and in various bits of the C code start of day
relocating/setup etc.

> I would use PAGE_SIZE or THIRD_SIZE here.

Ack

> > +        add   r1, r1, #8             /* Next slot */
> > +        cmp   r1, #(512*8)
> 
> Any reason to not use LPAE_ENTRIES here?

I meant to come back and fix this and forgot.

Ian.
Ian Campbell July 16, 2014, 4:54 p.m. UTC | #11
On Wed, 2014-07-16 at 16:18 +0100, Julien Grall wrote:
> 
> On 15/07/14 16:18, Ian Campbell wrote:
> > For 64-bit at least the Juno firmware appears to load us at 0x80080000
> > irrespective of the value put in the text offset field. That's certainly
> > a bug in the firmware, but I can't see any reason not to make ourselves
> > more flexible here.
> 
> D'oh, do you plan to fill a bug against the firmware?

I probably should. Need to figure out where though.

I also need to check whether Mark Rutland's recent Linux-side changes to
the arm64 Image already trigger this, and therefore whether it is
already known and reported.

He was randomising the text offset, I suspect because he had already
found and reported this issue...

Ian.
Julien Grall July 16, 2014, 5:49 p.m. UTC | #12
On 16/07/14 17:53, Ian Campbell wrote:
> On Wed, 2014-07-16 at 16:41 +0100, Julien Grall wrote:
>>> +        /* Identity map clashes with boot_third, which we cannot handle yet */
>>> +        PRINT("Unable to build boot page tables - virt and phys addresses clash.\r\n")
>>> +        b     fail
>>
>> AFAIU, this can happen if the kernel is loaded around 2MB in the memory,
>> right?
>
> Yes from 2MB up to (but not including) 4MB.
>
> It is an error (I think) that this patch bugs if Xen is loaded at
> exactly 2MB, since then the virtual and identity-physical mappings are
> the same.
>
>> Also what does prevent Xen to be shared between 2 third page table?
>
> This is the virtual mapping, which always starts at exactly 2MB, so that
> can only happen if Xen is larger than 2MB, which we assume is not the
> case both here and in various bits of the C code start of day
> relocating/setup etc.

Sorry, I was thinking that boot_third was used for the 1:1 mapping.

It looks like you are using a 2MB mapping for the identity mapping:

+        /* ... map of paddr(start) in boot_second */
+        lsrs  r1, r9, #SECOND_SHIFT  /* Offset of base paddr in boot_second */
+        mov   r2, #0x0ff             /* r2 := LPAE entries mask */
+        orr   r2, r2, #0x100
+        and   r1, r1, r2
+        cmp   r1, #1
+        bne   2f                     /* It's not in slot 1, map it */

r9 contains the physical address of start, but the binary could cross 
a 2MB boundary (because, for instance, the start address is at 
0xXX2FXXXXX). So the assembly code that enables paging may not be 
in the same slot.

I think this is very unlikely, but if it happens it will be hard to debug.
Maybe you can add a sanity check, or add a label just before paging is 
enabled and use it to compute the slot.

BTW I think you can use lsr instead of lsrs to get the offset.

Regards,
Ian Campbell July 17, 2014, 9:38 a.m. UTC | #13
On Wed, 2014-07-16 at 18:49 +0100, Julien Grall wrote:
> On 16/07/14 17:53, Ian Campbell wrote:
> > On Wed, 2014-07-16 at 16:41 +0100, Julien Grall wrote:
> >>> +        /* Identity map clashes with boot_third, which we cannot handle yet */
> >>> +        PRINT("Unable to build boot page tables - virt and phys addresses clash.\r\n")
> >>> +        b     fail
> >>
> >> AFAIU, this can happen if the kernel is loaded around 2MB in the memory,
> >> right?
> >
> > Yes from 2MB up to (but not including) 4MB.
> >
> > It is an error (I think) that this patch bugs if Xen is loaded at
> > exactly 2MB, since then the virtual and identity-physical mappings are
> > the same.
> >
> >> Also what does prevent Xen to be shared between 2 third page table?
> >
> > This is the virtual mapping, which always starts at exactly 2MB, so that
> > can only happen if Xen is larger than 2MB, which we assume is not the
> > case both here and in various bits of the C code start of day
> > relocating/setup etc.
> 
> Sorry I was thinking that boot_third is used for the 1:1 mapping.
> 
> It looks like you are using a 2MB mapping for the identity mapping:
> 
> +        /* ... map of paddr(start) in boot_second */
> +        lsrs  r1, r9, #SECOND_SHIFT  /* Offset of base paddr in boot_second */
> +        mov   r2, #0x0ff             /* r2 := LPAE entries mask */
> +        orr   r2, r2, #0x100
> +        and   r1, r1, r2
> +        cmp   r1, #1
> +        bne   2f                     /* It's not in slot 1, map it */
> 
> r9 contains the physical address of start, but the binary could cross 
> the 2MB boundary (because, for instance, the start address is at 
> 0xXX2FXXXXX). So the assembly code to enable the pagination may not be 
> on the same slot.

This is indeed a theoretical possibility which I hadn't considered.

What saves us in practice is that the code in head.S from _start to
paging is <4K and therefore given a 4K aligned load address cannot cross
a 4K boundary or a 2MB boundary, etc.

The easiest fix is probably a BUILD_BUG_ON of some sort I think.

> I think this very unlikely, but if it happens it will be hard to debug.
> Maybe you can add a sanity check or add a label before the pagination is 
> enabled and use it in the slot.
> 
> BTW I think you can use lsr instead of lsrs to get the offset.

True.

Ian.

Patch

diff --git a/xen/arch/arm/arm32/head.S b/xen/arch/arm/arm32/head.S
index 1319a13..3a72195 100644
--- a/xen/arch/arm/arm32/head.S
+++ b/xen/arch/arm/arm32/head.S
@@ -26,6 +26,7 @@ 
 
 #define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
 #define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
+#define PT_MEM_L3 0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
 #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
 #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
 
@@ -279,25 +280,50 @@  cpu_init_done:
         ldr   r4, =boot_second
         add   r4, r4, r10            /* r4 := paddr (boot_second) */
 
-        lsr   r2, r9, #SECOND_SHIFT  /* Base address for 2MB mapping */
-        lsl   r2, r2, #SECOND_SHIFT
+        ldr   r1, =boot_third
+        add   r1, r1, r10            /* r1 := paddr (boot_third) */
+        mov   r3, #0x0
+
+        /* ... map boot_third in boot_second[1] */
+        orr   r2, r1, #PT_UPPER(PT)  /* r2:r3 := table map of boot_third */
+        orr   r2, r2, #PT_LOWER(PT)  /* (+ rights for linear PT) */
+        strd  r2, r3, [r4, #8]       /* Map it in slot 1 */
+
+        /* ... map of paddr(start) in boot_second */
+        lsrs  r1, r9, #SECOND_SHIFT  /* Offset of base paddr in boot_second */
+        mov   r2, #0x0ff             /* r2 := LPAE entries mask */
+        orr   r2, r2, #0x100
+        and   r1, r1, r2
+        cmp   r1, #1
+        bne   2f                     /* It's not in slot 1, map it */
+
+        /* Identity map clashes with boot_third, which we cannot handle yet */
+        PRINT("Unable to build boot page tables - virt and phys addresses clash.\r\n")
+        b     fail
+
+2:
+        lsl   r2, r1, #SECOND_SHIFT  /* Base address for 2MB mapping */
         orr   r2, r2, #PT_UPPER(MEM) /* r2:r3 := section map */
         orr   r2, r2, #PT_LOWER(MEM)
+        lsl   r1, r1, #3             /* r1 := Slot offset */
+        strd  r2, r3, [r4, r1]       /* Mapping of paddr(start) */
 
-        /* ... map of vaddr(start) in boot_second */
-        ldr   r1, =start
-        lsr   r1, #(SECOND_SHIFT - 3)   /* Slot for vaddr(start) */
-        strd  r2, r3, [r4, r1]       /* Map vaddr(start) */
+        /* Setup boot_third: */
+1:      ldr   r4, =boot_third
+        add   r4, r4, r10            /* r4 := paddr (boot_third) */
 
-        /* ... map of paddr(start) in boot_second */
-        lsrs  r1, r9, #30            /* Base paddr */
-        bne   1f                     /* If paddr(start) is not in slot 0
-                                      * then the mapping was done in
-                                      * boot_pgtable above */
+        lsr   r2, r9, #THIRD_SHIFT  /* Base address for 4K mapping */
+        lsl   r2, r2, #THIRD_SHIFT
+        orr   r2, r2, #PT_UPPER(MEM_L3) /* r2:r3 := map */
+        orr   r2, r2, #PT_LOWER(MEM_L3)
 
-        mov   r1, r9, lsr #(SECOND_SHIFT - 3)   /* Slot for paddr(start) */
-        strd  r2, r3, [r4, r1]       /* Map Xen there */
-1:
+        /* ... map of vaddr(start) in boot_third */
+        mov   r1, #0
+1:      strd  r2, r3, [r4, r1]       /* Map vaddr(start) */
+        add   r2, r2, #4096          /* Next page */
+        add   r1, r1, #8             /* Next slot */
+        cmp   r1, #(512*8)
+        blo   1b
 
         /* Defer fixmap and dtb mapping until after paging enabled, to
          * avoid them clashing with the 1:1 mapping. */
diff --git a/xen/arch/arm/arm64/head.S b/xen/arch/arm/arm64/head.S
index 883640c..3f46f43 100644
--- a/xen/arch/arm/arm64/head.S
+++ b/xen/arch/arm/arm64/head.S
@@ -27,6 +27,7 @@ 
 
 #define PT_PT     0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
 #define PT_MEM    0xf7d /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=0 P=1 */
+#define PT_MEM_L3 0xf7f /* nG=1 AF=1 SH=11 AP=01 NS=1 ATTR=111 T=1 P=1 */
 #define PT_DEV    0xe71 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=0 P=1 */
 #define PT_DEV_L3 0xe73 /* nG=1 AF=1 SH=10 AP=01 NS=1 ATTR=100 T=1 P=1 */
 
@@ -303,25 +304,46 @@  skip_bss:
         ldr   x4, =boot_second       /* Next level into boot_second */
         add   x4, x4, x20            /* x4 := paddr(boot_second) */
 
-        lsr   x2, x19, #SECOND_SHIFT /* Base address for 2MB mapping */
-        lsl   x2, x2, #SECOND_SHIFT
+        /* ... map boot_third in boot_second[1] */
+        ldr   x1, =boot_third
+        add   x1, x1, x20            /* x1 := paddr(boot_third) */
+        mov   x3, #PT_PT             /* x2 := table map of boot_third */
+        orr   x2, x1, x3             /*       + rights for linear PT */
+        str   x2, [x4, #8]           /* Map it in slot 1 */
+
+        /* ... map of paddr(start) in boot_second */
+        lsr   x2, x19, #SECOND_SHIFT /* x2 := Offset of base paddr in boot_second */
+        and   x1, x2, 0x1ff          /* x1 := Slot to use */
+        cmp   x1, #1
+        b.ne  2f                     /* It's not in slot 1, map it */
+
+        /* Identity map clashes with boot_third, which we cannot handle yet */
+        PRINT("Unable to build boot page tables - virt and phys addresses clash.\r\n")
+        b     fail
+
+2:
+        lsl   x2, x19, #SECOND_SHIFT /* Base address for 2MB mapping */
         mov   x3, #PT_MEM            /* x2 := Section map */
         orr   x2, x2, x3
+        lsl   x1, x1, #3             /* x1 := Slot offset */
+        str   x2, [x4, x1]           /* Create mapping of paddr(start)*/
 
-        /* ... map of vaddr(start) in boot_second */
-        ldr   x1, =start
-        lsr   x1, x1, #(SECOND_SHIFT - 3)   /* Slot for vaddr(start) */
-        str   x2, [x4, x1]           /* Map vaddr(start) */
+1:      /* Setup boot_third: */
+        ldr   x4, =boot_third
+        add   x4, x4, x20            /* x4 := paddr (boot_third) */
 
-        /* ... map of paddr(start) in boot_second */
-        lsr   x1, x19, #FIRST_SHIFT  /* Base paddr */
-        cbnz  x1, 1f                 /* If paddr(start) is not in slot 0
-                                      * then the mapping was done in
-                                      * boot_pgtable or boot_first above */
+        lsr   x2, x19, #THIRD_SHIFT  /* Base address for 4K mapping */
+        lsl   x2, x2, #THIRD_SHIFT
+        mov   x3, #PT_MEM_L3         /* x2 := Section map */
+        orr   x2, x2, x3
 
-        lsr   x1, x19, #(SECOND_SHIFT - 3)  /* Slot for paddr(start) */
-        str   x2, [x4, x1]           /* Map Xen there */
-1:
+        /* ... map of vaddr(start) in boot_third */
+        mov   x1, xzr
+1:      str   x2, [x4, x1]           /* Map vaddr(start) */
+        add   x2, x2, #4096          /* Next page */
+        add   x1, x1, #8             /* Next slot */
+        cmp   x1, #(512*8)           /* 512 entries per page */
+        b.lt  1b
 
         /* Defer fixmap and dtb mapping until after paging enabled, to
          * avoid them clashing with the 1:1 mapping. */
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 03a0533..fdc7c98 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -47,8 +47,9 @@  struct domain *dom_xen, *dom_io, *dom_cow;
  * to the CPUs own pagetables.
  *
  * These pagetables have a very simple structure. They include:
- *  - a 2MB mapping of xen at XEN_VIRT_START, boot_first and
- *    boot_second are used to populate the trie down to that mapping.
+ *  - 2MB worth of 4K mappings of xen at XEN_VIRT_START, boot_first and
+ *    boot_second are used to populate the tables down to boot_third
+ *    which contains the actual mapping.
  *  - a 1:1 mapping of xen at its current physical address. This uses a
  *    section mapping at whichever of boot_{pgtable,first,second}
  *    covers that physical address.
@@ -69,6 +70,7 @@  lpae_t boot_pgtable[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 lpae_t boot_first[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
 #endif
 lpae_t boot_second[LPAE_ENTRIES]  __attribute__((__aligned__(4096)));
+lpae_t boot_third[LPAE_ENTRIES]  __attribute__((__aligned__(4096)));
 
 /* Main runtime page tables */
 
@@ -492,6 +494,8 @@  void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
 #endif
     memset(boot_second, 0x0, PAGE_SIZE);
     clean_and_invalidate_xen_dcache(boot_second);
+    memset(boot_third, 0x0, PAGE_SIZE);
+    clean_and_invalidate_xen_dcache(boot_third);
 
     /* Break up the Xen mapping into 4k pages and protect them separately. */
     for ( i = 0; i < LPAE_ENTRIES; i++ )