[Xen-devel,v2,15/15] xen/arm: arm64: Document Cortex-A57 erratum 834220

Message ID 1464013052-32587-16-git-send-email-julien.grall@arm.com
State New
Headers show

Commit Message

Julien Grall May 23, 2016, 2:17 p.m.
The ARM erratum applies to certain revisions of Cortex-A57. The
processor may report a Stage 2 translation fault as the result of
Stage 1 fault for load crossing a page boundary when there is a
permission fault or device memory fault at stage 1 and a translation
fault at Stage 2.

So Xen needs to check that Stage 1 translation does not generate a fault
before handling the Stage 2 fault. If it is a Stage 1 translation fault,
return to the guest to let the processor injecting the correct fault.

Only document it as this is already the behavior of the fault handlers.
Note that some optimization could be done to avoid unecessary translation
fault. This is because HPFAR_EL2 is valid for more use case. For the moment,
the code is left unmodified.

Signed-off-by: Julien Grall <julien.grall@arm.com>
---
 docs/misc/arm/silicon-errata.txt |  1 +
 xen/arch/arm/traps.c             | 30 ++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

Comments

Julien Grall May 30, 2016, 4:15 p.m. | #1
Hi Stefano,

On 30/05/2016 16:11, Stefano Stabellini wrote:
> On Mon, 23 May 2016, Julien Grall wrote:
>> The ARM erratum applies to certain revisions of Cortex-A57. The
>> processor may report a Stage 2 translation fault as the result of
>> Stage 1 fault for load crossing a page boundary when there is a
>> permission fault or device memory fault at stage 1 and a translation
>> fault at Stage 2.
>>
>> So Xen needs to check that Stage 1 translation does not generate a fault
>> before handling the Stage 2 fault. If it is a Stage 1 translation fault,
>> return to the guest to let the processor injecting the correct fault.
>>
>> Only document it as this is already the behavior of the fault handlers.
>> Note that some optimization could be done to avoid unecessary translation
>> fault. This is because HPFAR_EL2 is valid for more use case. For the moment,
>> the code is left unmodified.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> ---
>>  docs/misc/arm/silicon-errata.txt |  1 +
>>  xen/arch/arm/traps.c             | 30 ++++++++++++++++++++++++++++++
>>  2 files changed, 31 insertions(+)
>>
>> diff --git a/docs/misc/arm/silicon-errata.txt b/docs/misc/arm/silicon-errata.txt
>> index ab2e5bc..1ac365d 100644
>> --- a/docs/misc/arm/silicon-errata.txt
>> +++ b/docs/misc/arm/silicon-errata.txt
>> @@ -47,3 +47,4 @@ stable hypervisors.
>>  | ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
>>  | ARM            | Cortex-A57      | #852523         | N/A                     |
>>  | ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
>> +| ARM            | Cortex-A57      | #834220         | N/A                     |
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 3acdba0..bbd5309 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -2396,6 +2396,21 @@ static void do_trap_instr_abort_guest(struct cpu_user_regs *regs,
>>              .kind = hsr.iabt.s1ptw ? npfec_kind_in_gpt : npfec_kind_with_gla
>>          };
>>
>> +        /*
>> +         * Erratum #834220: The processor may report a Stage 2
>> +         * translation fault as the result of Stage 1 fault for load
>> +         * crossing a page boundary when there is a permission fault or
>> +         * device memory alignment fault at Stage 1 and a translation
>> +         * fault at Stage 2.
>> +         *
>> +         * So Xen needs to check that the Stage 1 translation does not
>> +         * generate a fault before handling stage 2 fault. If it is a Stage
>> +         * 1 translation fault, return to the guest to let the processor
>> +         * injecting the correct fault.
>> +         *
>> +         * XXX: This can be optimized to avoid some unecessary
>> +         * translation.
>> +         */
>>          if ( hsr.iabt.s1ptw )
>>              gpa = get_faulting_ipa();
>>          else
>
> Please write the comment only once, maybe in do_trap_hypervisor before
> the calls to do_trap_instr_abort_guest and do_trap_data_abort_guest.

I wrote this comment at these two specific places because developers and 
reviewers can spot directly that the code has been written in a specific 
way to avoid an erratum.

If we happen to fix one place and not the other, the comment would still 
be there. do_trap_hypervisor would be the wrong place for this comment 
because it is "far away" in the code and will be less likely read.

I can shrink down the message. What about:

"Erratum #834220: Xen needs to check that the Stage 1 translation does 
not generate a fault before handling Stage 2 fault. If it is a stage 1 
translation fault, return to the guest to let the project injecting the 
correct fault.

XXX: This can be optimized to avoid some unnecessary translation."

Regards,
Julien Grall May 30, 2016, 4:33 p.m. | #2
On 30/05/2016 17:19, Stefano Stabellini wrote:
>> "Erratum #834220: Xen needs to check that the Stage 1 translation does not
>> generate a fault before handling Stage 2 fault. If it is a stage 1 translation
>> fault, return to the guest to let the project injecting the correct fault.
>>
>> XXX: This can be optimized to avoid some unnecessary translation."
>
> What about adding a lengthy and detailed description of the erratum
> elsewhere and just having a one liner at the call sites, such as:

I don't see any problem to have "lengthy" comment twice. It could 
actually be 3 lines because the last one is a TODO.

I thought about merging some bits of do_trap_instr_abort_guest and 
do_trap_data_abort_guest, but at first glance it is not that simple.

> /* Erratum #834220: check Stage1 translation does not generate faults first! */
>
> so that developers can easily grep for #834220 through the code to have
> the full explanation?

Where would you put the full explanation? do_trap_hypervisor does not 
make sense because it does not deal with the erratum. When we will come 
back in few months time, we will wonder why the comment is there.

Regards,
Julien Grall May 31, 2016, 10:39 a.m. | #3
Hi Stefano,

On 31/05/16 10:34, Stefano Stabellini wrote:
> On Mon, 30 May 2016, Julien Grall wrote:
>> On 30/05/2016 17:19, Stefano Stabellini wrote:
>>>> "Erratum #834220: Xen needs to check that the Stage 1 translation does not
>>>> generate a fault before handling Stage 2 fault. If it is a stage 1
>>>> translation
>>>> fault, return to the guest to let the project injecting the correct fault.
>>>>
>>>> XXX: This can be optimized to avoid some unnecessary translation."
>>>
>>> What about adding a lengthy and detailed description of the erratum
>>> elsewhere and just having a one liner at the call sites, such as:
>>
>> I don't see any problem to have "lengthy" comment twice. It could actually be
>> 3 lines because the last one is a TODO.
>
> In my experience comments tend to be modified and when it happens, not
> all the instances get always updated. That's why I would prefer to have
> only one good explanation somewhere and then references to it.
>
> That said, they are just comments, even if they don't get properly
> updated the code won't break because of that (not immediately at least).
> So I am not going to be headstrong on this.
>
>
>> I thought about merging some bits of do_trap_instr_abort_guest and
>> do_trap_data_abort_guest, but at first glance it is not that simple.
>>
>>> /* Erratum #834220: check Stage1 translation does not generate faults first!
>>> */
>>>
>>> so that developers can easily grep for #834220 through the code to have
>>> the full explanation?
>>
>> Where would you put the full explanation? do_trap_hypervisor does not make
>> sense because it does not deal with the erratum. When we will come back in few
>> months time, we will wonder why the comment is there.
>
> docs/misc/arm/silicon-errata.txt?
> do_trap_instr_abort_guest or do_trap_data_abort_guest but not both (only
> a reference in the other?

silicon-errata.txt could be a good place.

Cheers,

Patch

diff --git a/docs/misc/arm/silicon-errata.txt b/docs/misc/arm/silicon-errata.txt
index ab2e5bc..1ac365d 100644
--- a/docs/misc/arm/silicon-errata.txt
+++ b/docs/misc/arm/silicon-errata.txt
@@ -47,3 +47,4 @@  stable hypervisors.
 | ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
 | ARM            | Cortex-A57      | #852523         | N/A                     |
 | ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
+| ARM            | Cortex-A57      | #834220         | N/A                     |
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 3acdba0..bbd5309 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2396,6 +2396,21 @@  static void do_trap_instr_abort_guest(struct cpu_user_regs *regs,
             .kind = hsr.iabt.s1ptw ? npfec_kind_in_gpt : npfec_kind_with_gla
         };
 
+        /*
+         * Erratum #834220: The processor may report a Stage 2
+         * translation fault as the result of Stage 1 fault for load
+         * crossing a page boundary when there is a permission fault or
+         * device memory alignment fault at Stage 1 and a translation
+         * fault at Stage 2.
+         *
+         * So Xen needs to check that the Stage 1 translation does not
+         * generate a fault before handling stage 2 fault. If it is a Stage
+         * 1 translation fault, return to the guest to let the processor
+         * injecting the correct fault.
+         *
+         * XXX: This can be optimized to avoid some unecessary
+         * translation.
+         */
         if ( hsr.iabt.s1ptw )
             gpa = get_faulting_ipa();
         else
@@ -2445,6 +2460,21 @@  static void do_trap_data_abort_guest(struct cpu_user_regs *regs,
     info.gva = READ_SYSREG64(FAR_EL2);
 #endif
 
+    /*
+     * Erratum #834220: The processor may report a Stage 2
+     * translation fault as the result of Stage 1 fault for load
+     * crossing a page boundary when there is a permission fault or
+     * device memory alignment fault at Stage 1 and a translation
+     * fault at Stage 2.
+     *
+     * So Xen needs to check that the Stage 1 translation does not
+     * generate a fault before handling stage 2 fault. If it is a Stage
+     * 1 translation fault, return to the guest to let the processor
+     * injecting the correct fault.
+     *
+     * XXX: This can be optimized to avoid some unecessary
+     * translation.
+     */
     if ( dabt.s1ptw )
         info.gpa = get_faulting_ipa();
     else