diff mbox

xen/arm: Rework the way to compute dom0 DTB base address

Message ID 1369662637-21449-1-git-send-email-julien.grall@linaro.org
State Superseded, archived
Headers show

Commit Message

Julien Grall May 27, 2013, 1:50 p.m. UTC
If the DTB is loading right the after the kernel, on some setup, Linux will
overwrite the DTB during the decompression step.

To be sure the DTB won't be overwritten by the decrompression stage, load
the DTB near the end of the first memory bank and below 4Gib (if memory range is
greater).

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/domain_build.c |   19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

Comments

Andrew Cooper May 27, 2013, 1:58 p.m. UTC | #1
On 27/05/13 14:50, Julien Grall wrote:
> If the DTB is loading right the after the kernel, on some setup, Linux will
> overwrite the DTB during the decompression step.
>
> To be sure the DTB won't be overwritten by the decrompression stage, load
decompression
> the DTB near the end of the first memory bank and below 4Gib (if memory range is
> greater).

Surely the correct solution is to make Linux aware that a DTB is present
so it wont overwrite it?

Sticking it at the end of memory in the hope that it wont be overwritten
is still going to fail in a somewhat memory-limited situation.

~Andrew

>
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/arch/arm/domain_build.c |   19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index f857e40..ca086a3 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -519,6 +519,7 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>      void *fdt;
>      int new_size;
>      int ret;
> +    paddr_t end;
>  
>      kinfo->unassigned_mem = dom0_mem;
>  
> @@ -543,11 +544,6 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>      if ( ret < 0 )
>          goto err;
>  
> -    /*
> -     * Put the device tree at the beginning of the first bank.  It
> -     * must be below 4 GiB.
> -     */
> -    kinfo->dtb_paddr = kinfo->mem.bank[0].start + 0x100;
>      if ( kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt) > (1ull << 32) )
>      {
>          printk("Not enough memory below 4 GiB for the device tree.");
> @@ -555,6 +551,16 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>          goto err;
>      }
>  
> +    /*
> +     * DTB must be load below 4GiB and far enough to linux (Linux use
> +     * the space after it to decompress)
> +     * Load the DTB at the end of the first bank or below 4Gib
> +     */
> +    end = kinfo->mem.bank[0].start + kinfo->mem.bank[0].size;
> +    kinfo->dtb_paddr = (MIN(1ull << 32, end) - fdt_totalsize(kinfo->fdt));
> +    /* Linux requires the address to be a multiple of 4 */
> +    kinfo->dtb_paddr &= ~3;
> +
>      return 0;
>  
>    err:
> @@ -566,6 +572,8 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>  static void dtb_load(struct kernel_info *kinfo)
>  {
>      void * __user dtb_virt = (void * __user)(register_t)kinfo->dtb_paddr;
> +    printk("Loading dom0 DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
> +           kinfo->dtb_paddr, kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
>  
>      raw_copy_to_guest(dtb_virt, kinfo->fdt, fdt_totalsize(kinfo->fdt));
>      xfree(kinfo->fdt);
> @@ -604,7 +612,6 @@ int construct_dom0(struct domain *d)
>      /* The following loads use the domain's p2m */
>      p2m_load_VTTBR(d);
>  
> -    kinfo.dtb_paddr = kinfo.zimage.load_addr + kinfo.zimage.len;
>      kernel_load(&kinfo);
>      dtb_load(&kinfo);
>
Julien Grall May 27, 2013, 3:44 p.m. UTC | #2
On 05/27/2013 02:58 PM, Andrew Cooper wrote:

> On 27/05/13 14:50, Julien Grall wrote:
>> If the DTB is loading right the after the kernel, on some setup, Linux will
>> overwrite the DTB during the decompression step.
>>
>> To be sure the DTB won't be overwritten by the decrompression stage, load
> decompression
>> the DTB near the end of the first memory bank and below 4Gib (if memory range is
>> greater).
> 
> Surely the correct solution is to make Linux aware that a DTB is present
> so it wont overwrite it? 


Most of the setup I used let the user choose the DTB base address or
automatically generate it (randomly?).

For instance, QEMU assumes the DTB will be load from 128Mo, if there is
more than 256Mo, or the amount of memory divide by 2.
It's possible to assume the same things because Linux is built with
CONFIG_AUTO_ZRELADDR, which is enabled by the multiple platform support.
This option will assume that the zImage will be place in the first 128
MB from start of memory.

It's also possible to enable CONFIG_ARM_APPENDED_DTB which checks during
decompression stage if there is an appended DTB. But it's for the old
bootloader (ie which doesn't set the DTB address in r2) and won't allow
to boot Linux either on bare metal or on XEN.

Your solution is the best, but, if I didn't miss something, it means
some rework on the Linux decompression stage. I think, this is also on
issue with all the bootloaders.

> Sticking it at the end of memory in the hope that it wont be overwritten
> is still going to fail in a somewhat memory-limited situation.

Quick question, on x86 does Linux take care of the initrd during
decompression stage? Is there any assumption on the memory layout?
Ian Campbell May 28, 2013, 9:03 a.m. UTC | #3
On Mon, 2013-05-27 at 14:50 +0100, Julien Grall wrote:
> If the DTB is loading right the after the kernel, on some setup, Linux will
> overwrite the DTB during the decompression step.
> 
> To be sure the DTB won't be overwritten by the decrompression stage, load

"decompression"

> the DTB near the end of the first memory bank and below 4Gib (if memory range is
> greater).

I've been considering something like this too. I have a feeling we might
end up just pushing things around in memory continually breaking one
platform in favour of another. However this new choice location seems
likely to be more compatible than what we have now...

> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/arch/arm/domain_build.c |   19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index f857e40..ca086a3 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -519,6 +519,7 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>      void *fdt;
>      int new_size;
>      int ret;
> +    paddr_t end;
>  
>      kinfo->unassigned_mem = dom0_mem;
>  
> @@ -543,11 +544,6 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>      if ( ret < 0 )
>          goto err;
>  
> -    /*
> -     * Put the device tree at the beginning of the first bank.  It
> -     * must be below 4 GiB.
> -     */
> -    kinfo->dtb_paddr = kinfo->mem.bank[0].start + 0x100;
>      if ( kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt) > (1ull << 32) )

Isn't this check now misplaced?

>      {
>          printk("Not enough memory below 4 GiB for the device tree.");
> @@ -555,6 +551,16 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>          goto err;
>      }
>  
> +    /*
> +     * DTB must be load below 4GiB and far enough to linux (Linux use
> +     * the space after it to decompress)
> +     * Load the DTB at the end of the first bank or below 4Gib
> +     */
> +    end = kinfo->mem.bank[0].start + kinfo->mem.bank[0].size;
> +    kinfo->dtb_paddr = (MIN(1ull << 32, end) - fdt_totalsize(kinfo->fdt));
> +    /* Linux requires the address to be a multiple of 4 */
> +    kinfo->dtb_paddr &= ~3;

What do you think of aligning to e.g. a 2MB boundary?

"multiple of 4" would be more usually expressed in this context as
"aligned to 4 bytes".

> +
>      return 0;
>  
>    err:
> @@ -566,6 +572,8 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>  static void dtb_load(struct kernel_info *kinfo)
>  {
>      void * __user dtb_virt = (void * __user)(register_t)kinfo->dtb_paddr;
> +    printk("Loading dom0 DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
> +           kinfo->dtb_paddr, kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
>  
>      raw_copy_to_guest(dtb_virt, kinfo->fdt, fdt_totalsize(kinfo->fdt));
>      xfree(kinfo->fdt);
> @@ -604,7 +612,6 @@ int construct_dom0(struct domain *d)
>      /* The following loads use the domain's p2m */
>      p2m_load_VTTBR(d);
>  
> -    kinfo.dtb_paddr = kinfo.zimage.load_addr + kinfo.zimage.len;
>      kernel_load(&kinfo);
>      dtb_load(&kinfo);
>
Ian Campbell May 28, 2013, 9:05 a.m. UTC | #4
On Mon, 2013-05-27 at 16:44 +0100, Julien Grall wrote:
> It's also possible to enable CONFIG_ARM_APPENDED_DTB which checks during
> decompression stage if there is an appended DTB. But it's for the old
> bootloader (ie which doesn't set the DTB address in r2) and won't allow
> to boot Linux either on bare metal or on XEN.

We don't want to rely on APPENDED_DTB -- it is considered a hack...

> Your solution is the best, but, if I didn't miss something, it means
> some rework on the Linux decompression stage. I think, this is also on
> issue with all the bootloaders.

Yes. For better or worse the defacto standard way of doing this on ARM
platforms today is for the bootloader (or the bootscript) to contain
some magic addresses which "works". Pushing those into Xen isn't ideal
but its the best option right now.

Ian.
Julien Grall May 28, 2013, 11:11 a.m. UTC | #5
On 05/28/2013 10:03 AM, Ian Campbell wrote:

> On Mon, 2013-05-27 at 14:50 +0100, Julien Grall wrote:
>> If the DTB is loading right the after the kernel, on some setup, Linux will
>> overwrite the DTB during the decompression step.
>>
>> To be sure the DTB won't be overwritten by the decrompression stage, load
> 
> "decompression"
> 
>> the DTB near the end of the first memory bank and below 4Gib (if memory range is
>> greater).
> 
> I've been considering something like this too. I have a feeling we might
> end up just pushing things around in memory continually breaking one
> platform in favour of another. However this new choice location seems
> likely to be more compatible than what we have now...
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/arch/arm/domain_build.c |   19 +++++++++++++------
>>  1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index f857e40..ca086a3 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -519,6 +519,7 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>>      void *fdt;
>>      int new_size;
>>      int ret;
>> +    paddr_t end;
>>  
>>      kinfo->unassigned_mem = dom0_mem;
>>  
>> @@ -543,11 +544,6 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>>      if ( ret < 0 )
>>          goto err;
>>  
>> -    /*
>> -     * Put the device tree at the beginning of the first bank.  It
>> -     * must be below 4 GiB.
>> -     */
>> -    kinfo->dtb_paddr = kinfo->mem.bank[0].start + 0x100;
>>      if ( kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt) > (1ull << 32) )
> 
> Isn't this check now misplaced?


Right. I think we can remove kinfo->dtb_paddr and just check the DTB
size is less than 4GiB.

> 
>>      {
>>          printk("Not enough memory below 4 GiB for the device tree.");
>> @@ -555,6 +551,16 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>>          goto err;
>>      }
>>  
>> +    /*
>> +     * DTB must be load below 4GiB and far enough to linux (Linux use
>> +     * the space after it to decompress)
>> +     * Load the DTB at the end of the first bank or below 4Gib
>> +     */
>> +    end = kinfo->mem.bank[0].start + kinfo->mem.bank[0].size;
>> +    kinfo->dtb_paddr = (MIN(1ull << 32, end) - fdt_totalsize(kinfo->fdt));
>> +    /* Linux requires the address to be a multiple of 4 */
>> +    kinfo->dtb_paddr &= ~3;
> 
> What do you think of aligning to e.g. a 2MB boundary?

Sounds good for, in fact, I wanted to avoid a big wasted space after the
DTB. I will send a new version of this patch with all the modifications.

> "multiple of 4" would be more usually expressed in this context as
> "aligned to 4 bytes".
> 
>> +
>>      return 0;
>>  
>>    err:
>> @@ -566,6 +572,8 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
>>  static void dtb_load(struct kernel_info *kinfo)
>>  {
>>      void * __user dtb_virt = (void * __user)(register_t)kinfo->dtb_paddr;
>> +    printk("Loading dom0 DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
>> +           kinfo->dtb_paddr, kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
>>  
>>      raw_copy_to_guest(dtb_virt, kinfo->fdt, fdt_totalsize(kinfo->fdt));
>>      xfree(kinfo->fdt);
>> @@ -604,7 +612,6 @@ int construct_dom0(struct domain *d)
>>      /* The following loads use the domain's p2m */
>>      p2m_load_VTTBR(d);
>>  
>> -    kinfo.dtb_paddr = kinfo.zimage.load_addr + kinfo.zimage.len;
>>      kernel_load(&kinfo);
>>      dtb_load(&kinfo);
>>  
> 
>
Ian Campbell May 28, 2013, 11:19 a.m. UTC | #6
> >>      if ( ret < 0 )
> >>          goto err;
> >>  
> >> -    /*
> >> -     * Put the device tree at the beginning of the first bank.  It
> >> -     * must be below 4 GiB.
> >> -     */
> >> -    kinfo->dtb_paddr = kinfo->mem.bank[0].start + 0x100;
> >>      if ( kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt) > (1ull << 32) )
> > 
> > Isn't this check now misplaced?
> 
> 
> Right. I think we can remove kinfo->dtb_paddr and just check the DTB
> size is less than 4GiB.

Is it easy to check for overlap with the kernel and provide a useful
message in that case?

> > What do you think of aligning to e.g. a 2MB boundary?
> 
> Sounds good for, in fact, I wanted to avoid a big wasted space after the
> DTB. I will send a new version of this patch with all the modifications.

Thanks,
Ian.
Julien Grall May 28, 2013, 11:24 a.m. UTC | #7
On 05/28/2013 12:19 PM, Ian Campbell wrote:

>>>>      if ( ret < 0 )
>>>>          goto err;
>>>>  
>>>> -    /*
>>>> -     * Put the device tree at the beginning of the first bank.  It
>>>> -     * must be below 4 GiB.
>>>> -     */
>>>> -    kinfo->dtb_paddr = kinfo->mem.bank[0].start + 0x100;
>>>>      if ( kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt) > (1ull << 32) )
>>>
>>> Isn't this check now misplaced?
>>
>>
>> Right. I think we can remove kinfo->dtb_paddr and just check the DTB
>> size is less than 4GiB.
> 
> Is it easy to check for overlap with the kernel and provide a useful
> message in that case?


It's possible :). I will add it in the next version.
diff mbox

Patch

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index f857e40..ca086a3 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -519,6 +519,7 @@  static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
     void *fdt;
     int new_size;
     int ret;
+    paddr_t end;
 
     kinfo->unassigned_mem = dom0_mem;
 
@@ -543,11 +544,6 @@  static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
     if ( ret < 0 )
         goto err;
 
-    /*
-     * Put the device tree at the beginning of the first bank.  It
-     * must be below 4 GiB.
-     */
-    kinfo->dtb_paddr = kinfo->mem.bank[0].start + 0x100;
     if ( kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt) > (1ull << 32) )
     {
         printk("Not enough memory below 4 GiB for the device tree.");
@@ -555,6 +551,16 @@  static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
         goto err;
     }
 
+    /*
+     * DTB must be load below 4GiB and far enough to linux (Linux use
+     * the space after it to decompress)
+     * Load the DTB at the end of the first bank or below 4Gib
+     */
+    end = kinfo->mem.bank[0].start + kinfo->mem.bank[0].size;
+    kinfo->dtb_paddr = (MIN(1ull << 32, end) - fdt_totalsize(kinfo->fdt));
+    /* Linux requires the address to be a multiple of 4 */
+    kinfo->dtb_paddr &= ~3;
+
     return 0;
 
   err:
@@ -566,6 +572,8 @@  static int prepare_dtb(struct domain *d, struct kernel_info *kinfo)
 static void dtb_load(struct kernel_info *kinfo)
 {
     void * __user dtb_virt = (void * __user)(register_t)kinfo->dtb_paddr;
+    printk("Loading dom0 DTB to 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
+           kinfo->dtb_paddr, kinfo->dtb_paddr + fdt_totalsize(kinfo->fdt));
 
     raw_copy_to_guest(dtb_virt, kinfo->fdt, fdt_totalsize(kinfo->fdt));
     xfree(kinfo->fdt);
@@ -604,7 +612,6 @@  int construct_dom0(struct domain *d)
     /* The following loads use the domain's p2m */
     p2m_load_VTTBR(d);
 
-    kinfo.dtb_paddr = kinfo.zimage.load_addr + kinfo.zimage.len;
     kernel_load(&kinfo);
     dtb_load(&kinfo);