[Xen-devel] Unable to use EFI firmware in Xen ARM guest after 41f8901

Message ID CAKv+Gu-cPMdpcKvY0cR=vJOsE5B7CcqNMB=7xVHKo41FkLSVVw@mail.gmail.com
State New
Headers show

Commit Message

Ard Biesheuvel Oct. 2, 2015, 12:18 p.m.
On 1 October 2015 at 18:32, Julien Grall <julien.grall@gmail.com> wrote:
>
> On 1 Oct 2015 17:07, "Ard Biesheuvel" <ard.biesheuvel@linaro.org> wrote:
>>
>> On 1 October 2015 at 17:58, Julien Grall <julien.grall@citrix.com> wrote:
>> > Hi,
>> >
>> > We tried today to use the UEFI binary provided by Linaro for Xen [1] and
>> > noticed the guest doesn't boot anymore.
>> >
>>
>> Thanks for reporting. My LAVA job appears to have been offline for a
>> while because the Xen mustang dom0 kernel has disappeared.
>
> FWIW, everything to support X-gene and Xen should be upstreamed.
> So you could use a normal Linux for Dom0.
>
> FIY, Im planning to add a test in our CI loop to check the UEFI firmware.
>

Great! Would be good to have some independent verification, especially
since the (lack of) CI-to-LAVA integration still requires me to go
look at the artifacts manually.

>>
>> > My bisector fingered the commit 41f890164bc201f69841309c6b55e24c64121960
>> > "ArmPkg/Mmu: Fix literal number left shift bug".
>> >
>>
>> Do you mean reverting just this patch fixes it for you?
>
> Yes, building the UEFI with the commit just before allows me to reach the
> UEFI console.
>
>> > I've tried to use the DEBUG firmware but didn't get any output at all.
>> >
>> > The guest has been created with 512MB in one memory bank
>> > (0x40000000-0x60000000). The binary are loaded at:
>> >    UEFI firmware: 0x40080000
>> >    DTB: 0x48000000
>> >
>> > Does anyone have any insight what could go wrong?
>> >
>>
>> I will investigate. The change itself seems obviously correct (but
>> don't they always :-))
>

OK, so the change was obviously correct, but the rest of the code is not.

Instead of erroneously mapping large naturally aligned regions down to
2 MB blocks, the MMU code now notices that the single cacheable 1:1
memory region that I define for Xen domU [which spans the entire (I)PA
space] could potentially be mapped using level 0 block entries of 512
GB each. Only, the architecture does not actually support that.

Could you please try this patch and see if it fixes things?

--------8<----------
we need to go to the page below.
   // The PageLevel was calculated on the Base Address alignment but
did not take in account the alignment
--------8<----------

As an aside, it is probably not necessary to make the 1:1 region as
large as I am doing, especially since it now turns out that we need to
map it in 1 GB blocks and use 4 levels. Is there any reasonable upper
bound to the domU PA space other than what is communicated in the ID
registers?

Comments

Ian Campbell Oct. 2, 2015, 12:43 p.m. | #1
On Fri, 2015-10-02 at 14:18 +0200, Ard Biesheuvel wrote:
> Is there any reasonable upper bound to the domU PA space
>  other than what is communicated in the ID registers?

You mean the PASize bits/register? In which case that is it as far as the
guest should be is concerned, yes.

Ian.
Ard Biesheuvel Oct. 2, 2015, 12:48 p.m. | #2
On 2 October 2015 at 14:43, Ian Campbell <ian.campbell@citrix.com> wrote:
> On Fri, 2015-10-02 at 14:18 +0200, Ard Biesheuvel wrote:
>> Is there any reasonable upper bound to the domU PA space
>>  other than what is communicated in the ID registers?
>
> You mean the PASize bits/register? In which case that is it as far as the
> guest should be is concerned, yes.
>

OK.

As discussed on IRC (#xenarm), the rationale of this approach is that
Xen's stage2 attributes will ultimately override device mappings, so
1:1 mapping the whole address space cacheable is actually a reasonable
thing to do. And in fact, 4 KB of translation tables for each 512 GB
of address space is probably not such a big deal after all. So I am
inclined to leave things are they are if the proposed patch works.
Ian Campbell Oct. 2, 2015, 1:07 p.m. | #3
On Fri, 2015-10-02 at 14:48 +0200, Ard Biesheuvel wrote:
> On 2 October 2015 at 14:43, Ian Campbell <ian.campbell@citrix.com> wrote:
> > On Fri, 2015-10-02 at 14:18 +0200, Ard Biesheuvel wrote:
> > > Is there any reasonable upper bound to the domU PA space
> > >  other than what is communicated in the ID registers?
> > 
> > You mean the PASize bits/register? In which case that is it as far as
> > the
> > guest should be is concerned, yes.
> > 
> 
> OK.
> 
> As discussed on IRC (#xenarm), the rationale of this approach is that
> Xen's stage2 attributes will ultimately override device mappings, so
> 1:1 mapping the whole address space cacheable is actually a reasonable
> thing to do. And in fact, 4 KB of translation tables for each 512 GB
> of address space is probably not such a big deal after all.

Well, I'm not going to commit to that always being the case, since its not
part of the guest ABI, but given that OVMF would normally be supplied as
part of the host rather than the guest if we ever break/change that
assumption we do at least have the opportunity to get ovmf fixed/updated
around the same time.

IOW I suppose this is tolerable.

>  So I am
> inclined to leave things are they are if the proposed patch works.

Sure.

Ian.
Stefano Stabellini Oct. 2, 2015, 1:07 p.m. | #4
On Fri, 2 Oct 2015, Ard Biesheuvel wrote:
> diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c
> b/ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c
> index d98769b06b75..c84c872dbe60 100644
> --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c
> +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c
> @@ -303,8 +303,9 @@ GetBlockEntryListFromAddress (
>      }
>    }
> 
> -  // Identify the Page Level the RegionStart must belongs to
> -  PageLevel = 3 - ((BaseAddressAlignment - 12) / 9);
> +  // Identify the Page Level the RegionStart must belong to. Note
> that PageLevel
> +  // should be at least 1 since block translations are not supported at level 0
> +  PageLevel = MAX (3 - ((BaseAddressAlignment - 12) / 9), 1);
> 
>    // If the required size is smaller than the current block size then
> we need to go to the page below.
>    // The PageLevel was calculated on the Base Address alignment but
> did not take in account the alignment

It works.

Tested-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Patch hide | download patch | download mbox

diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c
b/ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c
index d98769b06b75..c84c872dbe60 100644
--- a/ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c
+++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Mmu.c
@@ -303,8 +303,9 @@  GetBlockEntryListFromAddress (
     }
   }

-  // Identify the Page Level the RegionStart must belongs to
-  PageLevel = 3 - ((BaseAddressAlignment - 12) / 9);
+  // Identify the Page Level the RegionStart must belong to. Note
that PageLevel
+  // should be at least 1 since block translations are not supported at level 0
+  PageLevel = MAX (3 - ((BaseAddressAlignment - 12) / 9), 1);

   // If the required size is smaller than the current block size then