diff mbox

arm: Handle device tree memory regions larger than 4GB

Message ID 1341506918-22077-1-git-send-email-peter.maydell@linaro.org
State Superseded
Headers show

Commit Message

Peter Maydell July 5, 2012, 4:48 p.m. UTC
Device tree memory regions may have sizes larger than 4GB.
Instead of silently truncating a 64 bit size when we pass it
to arm_add_memory(), split large regions into 2GB chunks.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
With this patch, I can take a device tree which has been tweaked
so its #address-cells and #size-cells are both 2, and boot it on
a QEMU vexpress-a15 model with >4GB of RAM, and have the kernel
actually detect the right amount of RAM. [the qemu bit needs some
patches I haven't posted yet.]

Since I'm not really a kernel dev I thought I'd post this to
linaro-dev first in the hope of a bit of friendly local review
before venturing onto lkml :-)
(Apologies to those on cc who got this twice because I mistyped
the linaro-dev email address.)

 arch/arm/kernel/devtree.c |    5 +++++
 1 file changed, 5 insertions(+)

Comments

Rob Herring July 5, 2012, 11:27 p.m. UTC | #1
On 07/05/2012 11:48 AM, Peter Maydell wrote:
> Device tree memory regions may have sizes larger than 4GB.
> Instead of silently truncating a 64 bit size when we pass it
> to arm_add_memory(), split large regions into 2GB chunks.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> With this patch, I can take a device tree which has been tweaked
> so its #address-cells and #size-cells are both 2, and boot it on
> a QEMU vexpress-a15 model with >4GB of RAM, and have the kernel
> actually detect the right amount of RAM. [the qemu bit needs some
> patches I haven't posted yet.]
> 
> Since I'm not really a kernel dev I thought I'd post this to
> linaro-dev first in the hope of a bit of friendly local review
> before venturing onto lkml :-)
> (Apologies to those on cc who got this twice because I mistyped
> the linaro-dev email address.)

We'll still find you. ;)

>  arch/arm/kernel/devtree.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/kernel/devtree.c b/arch/arm/kernel/devtree.c
> index bee7f9d..79a6e66 100644
> --- a/arch/arm/kernel/devtree.c
> +++ b/arch/arm/kernel/devtree.c
> @@ -26,6 +26,11 @@
>  
>  void __init early_init_dt_add_memory_arch(u64 base, u64 size)
>  {
> +	while (size > 0x80000000) {
> +		arm_add_memory(base, 0x80000000);
> +		base += 0x80000000;
> +		size -= 0x80000000;
> +	}
>  	arm_add_memory(base, size);

I would just change arm_add_memory to use phys_addr_t for the size
param. This ultimately calls memblock functions which use phys_addr_t
for sizes.

One thing I noticed is ATAGs will be broken for LPAE. That's probably
fine because if you can fix your bootloader for new ATAGs, then you can
support DT.

Rob

>  }
>  
>
Peter Maydell July 6, 2012, 7:43 a.m. UTC | #2
On 6 July 2012 00:27, Rob Herring <robherring2@gmail.com> wrote:
> On 07/05/2012 11:48 AM, Peter Maydell wrote:
>>  void __init early_init_dt_add_memory_arch(u64 base, u64 size)
>>  {
>> +     while (size > 0x80000000) {
>> +             arm_add_memory(base, 0x80000000);
>> +             base += 0x80000000;
>> +             size -= 0x80000000;
>> +     }
>>       arm_add_memory(base, size);
>
> I would just change arm_add_memory to use phys_addr_t for the size
> param. This ultimately calls memblock functions which use phys_addr_t
> for sizes.

I was wondering if somebody would suggest that. It's a bigger patch
(since it requires changing struct membank and potentially any function
which uses that) but I guess it is the long term right thing. I'll
have a go at that next week...

> One thing I noticed is ATAGs will be broken for LPAE. That's probably
> fine because if you can fix your bootloader for new ATAGs, then you can
> support DT.

Well, it works fine with LPAE assuming you didn't actually need to
pass in 4GB of RAM. (I think you could for instance specify 3.5GB
RAM starting at 2GB using ATAGS.) But yes, I'm assuming ATAGs boot
is 'legacy' now and we can insist on DT for any new platform that
can manage 4GB of RAM in the first place.

-- PMM
Peter Maydell July 6, 2012, 6:07 p.m. UTC | #3
On 6 July 2012 00:27, Rob Herring <robherring2@gmail.com> wrote:
> I would just change arm_add_memory to use phys_addr_t for the size
> param. This ultimately calls memblock functions which use phys_addr_t
> for sizes.

So I have a patch that does this which basically works. However
there is a bit I'm not sure about. arm_add_memory() does this:
   bank->size = size & PAGE_MASK;

in an attempt to clear the bottom bits of the size. However,
since PAGE_MASK is defined as:
 #define PAGE_SIZE               (_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK               (~(PAGE_SIZE-1))

PAGE_MASK is a 32 bit unsigned constant and so this & will
clear the top 32 bits of bank->size.

I'm really not sure what the best way to fix this is; suggestions?

thanks
-- PMM
Dave Martin July 6, 2012, 7:26 p.m. UTC | #4
On Fri, Jul 06, 2012 at 07:07:35PM +0100, Peter Maydell wrote:
> On 6 July 2012 00:27, Rob Herring <robherring2@gmail.com> wrote:
> > I would just change arm_add_memory to use phys_addr_t for the size
> > param. This ultimately calls memblock functions which use phys_addr_t
> > for sizes.
> 
> So I have a patch that does this which basically works. However
> there is a bit I'm not sure about. arm_add_memory() does this:
>    bank->size = size & PAGE_MASK;
> 
> in an attempt to clear the bottom bits of the size. However,
> since PAGE_MASK is defined as:
>  #define PAGE_SIZE               (_AC(1,UL) << PAGE_SHIFT)
>  #define PAGE_MASK               (~(PAGE_SIZE-1))
> 
> PAGE_MASK is a 32 bit unsigned constant and so this & will
> clear the top 32 bits of bank->size.
> 
> I'm really not sure what the best way to fix this is; suggestions?

Maybe something like

	~(phys_addr_t)~PAGE_MASK

or

	~(phys_addr_t)(PAGE_SIZE - 1)

would work.

---Dave
Peter Maydell July 7, 2012, 10:36 a.m. UTC | #5
On 6 July 2012 20:26, Dave Martin <dave.martin@linaro.org> wrote:
> On Fri, Jul 06, 2012 at 07:07:35PM +0100, Peter Maydell wrote:
>> On 6 July 2012 00:27, Rob Herring <robherring2@gmail.com> wrote:
>> > I would just change arm_add_memory to use phys_addr_t for the size
>> > param. This ultimately calls memblock functions which use phys_addr_t
>> > for sizes.
>>
>> So I have a patch that does this which basically works. However
>> there is a bit I'm not sure about. arm_add_memory() does this:
>>    bank->size = size & PAGE_MASK;
>>
>> in an attempt to clear the bottom bits of the size. However,
>> since PAGE_MASK is defined as:
>>  #define PAGE_SIZE               (_AC(1,UL) << PAGE_SHIFT)
>>  #define PAGE_MASK               (~(PAGE_SIZE-1))
>>
>> PAGE_MASK is a 32 bit unsigned constant and so this & will
>> clear the top 32 bits of bank->size.
>>
>> I'm really not sure what the best way to fix this is; suggestions?
>
> Maybe something like
>
>         ~(phys_addr_t)~PAGE_MASK
>
> or
>
>         ~(phys_addr_t)(PAGE_SIZE - 1)
>
> would work.

Mmm. It feels a bit unsatisfactory that an "obviously correct"
line of code like "size &= ~PAGE_MASK" could actually be subtly
wrong (seems like the kind of thing that could easily slip
through code review), so I was wondering if maybe redefining
PAGE_MASK so it didn't do the wrong thing on 64 bit types
would be better. (That's definitely way out of my depth though.)

-- PMM
Rob Herring July 7, 2012, 3:57 p.m. UTC | #6
On 07/07/2012 05:36 AM, Peter Maydell wrote:
> On 6 July 2012 20:26, Dave Martin <dave.martin@linaro.org> wrote:
>> On Fri, Jul 06, 2012 at 07:07:35PM +0100, Peter Maydell wrote:
>>> On 6 July 2012 00:27, Rob Herring <robherring2@gmail.com> wrote:
>>>> I would just change arm_add_memory to use phys_addr_t for the size
>>>> param. This ultimately calls memblock functions which use phys_addr_t
>>>> for sizes.
>>>
>>> So I have a patch that does this which basically works. However
>>> there is a bit I'm not sure about. arm_add_memory() does this:
>>>    bank->size = size & PAGE_MASK;
>>>
>>> in an attempt to clear the bottom bits of the size. However,
>>> since PAGE_MASK is defined as:
>>>  #define PAGE_SIZE               (_AC(1,UL) << PAGE_SHIFT)
>>>  #define PAGE_MASK               (~(PAGE_SIZE-1))
>>>
>>> PAGE_MASK is a 32 bit unsigned constant and so this & will
>>> clear the top 32 bits of bank->size.
>>>
>>> I'm really not sure what the best way to fix this is; suggestions?
>>
>> Maybe something like
>>
>>         ~(phys_addr_t)~PAGE_MASK
>>
>> or
>>
>>         ~(phys_addr_t)(PAGE_SIZE - 1)
>>
>> would work.
> 
> Mmm. It feels a bit unsatisfactory that an "obviously correct"
> line of code like "size &= ~PAGE_MASK" could actually be subtly
> wrong (seems like the kind of thing that could easily slip
> through code review), so I was wondering if maybe redefining
> PAGE_MASK so it didn't do the wrong thing on 64 bit types
> would be better. (That's definitely way out of my depth though.)
> 

While there is a mixture of usage of PAGE_MASK in the kernel, I think
correct usage is with virt addresses. For phys addresses, there is
PHYS_MASK which is sized correctly for LPAE.

You really should post this on the lakml list.

Rob

> -- PMM
>
Dave Martin July 9, 2012, 10:46 a.m. UTC | #7
On Sat, Jul 07, 2012 at 10:57:40AM -0500, Rob Herring wrote:
> On 07/07/2012 05:36 AM, Peter Maydell wrote:
> > On 6 July 2012 20:26, Dave Martin <dave.martin@linaro.org> wrote:
> >> On Fri, Jul 06, 2012 at 07:07:35PM +0100, Peter Maydell wrote:
> >>> On 6 July 2012 00:27, Rob Herring <robherring2@gmail.com> wrote:
> >>>> I would just change arm_add_memory to use phys_addr_t for the size
> >>>> param. This ultimately calls memblock functions which use phys_addr_t
> >>>> for sizes.
> >>>
> >>> So I have a patch that does this which basically works. However
> >>> there is a bit I'm not sure about. arm_add_memory() does this:
> >>>    bank->size = size & PAGE_MASK;
> >>>
> >>> in an attempt to clear the bottom bits of the size. However,
> >>> since PAGE_MASK is defined as:
> >>>  #define PAGE_SIZE               (_AC(1,UL) << PAGE_SHIFT)
> >>>  #define PAGE_MASK               (~(PAGE_SIZE-1))
> >>>
> >>> PAGE_MASK is a 32 bit unsigned constant and so this & will
> >>> clear the top 32 bits of bank->size.
> >>>
> >>> I'm really not sure what the best way to fix this is; suggestions?
> >>
> >> Maybe something like
> >>
> >>         ~(phys_addr_t)~PAGE_MASK
> >>
> >> or
> >>
> >>         ~(phys_addr_t)(PAGE_SIZE - 1)
> >>
> >> would work.
> > 
> > Mmm. It feels a bit unsatisfactory that an "obviously correct"
> > line of code like "size &= ~PAGE_MASK" could actually be subtly
> > wrong (seems like the kind of thing that could easily slip
> > through code review), so I was wondering if maybe redefining
> > PAGE_MASK so it didn't do the wrong thing on 64 bit types
> > would be better. (That's definitely way out of my depth though.)

That's C for you.  What we really mean by "... & PAGE_MASK" is a
bit-clear operation, but C doesn't have that.

> While there is a mixture of usage of PAGE_MASK in the kernel, I think
> correct usage is with virt addresses. For phys addresses, there is
> PHYS_MASK which is sized correctly for LPAE.
> 
> You really should post this on the lakml list.

The bits masked off by PAGE_MASK are identical between physical and
virtual views of the same address, so we could argue that PAGE_MASK
is appropriate for both, if a suitable definition is possible.

Thinking about it, using a 64-bit definition of PAGE_MASK should
not break most uses.  sizeof(expression involving PAGE_MASK) might
unexpectedly by 8 where 4 is expected, but I suspect that such
usage is rare or nonexstent.

GCC seems to generate sensibly optimal code for simple operations
like (unsigned long) & PAGE_MASK, equivalent to what would be
generated with a 32-bit PAGE_MASK.

That would need discussion on alkml to see whether other people
agree.

---Dave
diff mbox

Patch

diff --git a/arch/arm/kernel/devtree.c b/arch/arm/kernel/devtree.c
index bee7f9d..79a6e66 100644
--- a/arch/arm/kernel/devtree.c
+++ b/arch/arm/kernel/devtree.c
@@ -26,6 +26,11 @@ 
 
 void __init early_init_dt_add_memory_arch(u64 base, u64 size)
 {
+	while (size > 0x80000000) {
+		arm_add_memory(base, 0x80000000);
+		base += 0x80000000;
+		size -= 0x80000000;
+	}
 	arm_add_memory(base, size);
 }