diff mbox series

[Xen-devel,for-4.12,v2,06/17] xen/arm: p2m: Introduce a function to resolve translation fault

Message ID 20181204202651.8836-7-julien.grall@arm.com
State New
Headers show
Series xen/arm: Implement Set/Way operations | expand

Commit Message

Julien Grall Dec. 4, 2018, 8:26 p.m. UTC
Currently a Stage-2 translation fault could happen:
    1) MMIO emulation
    2) Another pCPU was modifying the P2M using Break-Before-Make
    3) Guest Physical address is not mapped

A follow-up patch will re-purpose the valid bit in an entry to generate
translation fault. This would be used to do an action on each entry to
track pages used for a given period.

When receiving the translation fault, we would need to walk the pages
table to find the faulting entry and then toggle valid bit. We can't use
p2m_lookup() for this purpose as it only tells us the mapping exists.

So this patch adds a new function to walk the page-tables and updates
the entry. This function will also handle 2) as it also requires walking
the page-table.

The function is able to cope with both table and block entry having the
validate bit unset. This gives flexibility to the function clearing the
valid bits. To keep the algorithm simple, the fault will be propating
one-level down. This will be repeated until a block entry has been
reached.

At the moment, there are no action done when reaching a block/page entry
but setting the valid bit to 1.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    Changes in v2:
        - Typoes
        - Add more comment
        - Skip clearing valid bit if it was already done
        - Move the prototype in p2m.h
        - Expand commit message
---
 xen/arch/arm/p2m.c        | 142 ++++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/traps.c      |  10 ++--
 xen/include/asm-arm/p2m.h |   2 +
 3 files changed, 148 insertions(+), 6 deletions(-)

Comments

Stefano Stabellini Dec. 6, 2018, 10:33 p.m. UTC | #1
On Tue, 4 Dec 2018, Julien Grall wrote:
> Currently a Stage-2 translation fault could happen:
>     1) MMIO emulation
>     2) Another pCPU was modifying the P2M using Break-Before-Make
>     3) Guest Physical address is not mapped
> 
> A follow-up patch will re-purpose the valid bit in an entry to generate
> translation fault. This would be used to do an action on each entry to
> track pages used for a given period.
> 
> When receiving the translation fault, we would need to walk the pages
> table to find the faulting entry and then toggle valid bit. We can't use
> p2m_lookup() for this purpose as it only tells us the mapping exists.
> 
> So this patch adds a new function to walk the page-tables and updates
> the entry. This function will also handle 2) as it also requires walking
> the page-table.
> 
> The function is able to cope with both table and block entry having the
> validate bit unset. This gives flexibility to the function clearing the
> valid bits. To keep the algorithm simple, the fault will be propating
> one-level down. This will be repeated until a block entry has been
> reached.
> 
> At the moment, there are no action done when reaching a block/page entry
> but setting the valid bit to 1.

Thanks, this explanation is much better

Acked-by: Stefano Stabellini <sstabellini@kernel.org>


> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> ---
>     Changes in v2:
>         - Typoes
>         - Add more comment
>         - Skip clearing valid bit if it was already done
>         - Move the prototype in p2m.h
>         - Expand commit message
> ---
>  xen/arch/arm/p2m.c        | 142 ++++++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/traps.c      |  10 ++--
>  xen/include/asm-arm/p2m.h |   2 +
>  3 files changed, 148 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 39680eeb6e..2706db3e67 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1035,6 +1035,148 @@ int p2m_set_entry(struct p2m_domain *p2m,
>      return rc;
>  }
>  
> +/* Invalidate all entries in the table. The p2m should be write locked. */
> +static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
> +{
> +    lpae_t *table;
> +    unsigned int i;
> +
> +    ASSERT(p2m_is_write_locked(p2m));
> +
> +    table = map_domain_page(mfn);
> +
> +    for ( i = 0; i < LPAE_ENTRIES; i++ )
> +    {
> +        lpae_t pte = table[i];
> +
> +        /*
> +         * Writing an entry can be expensive because it may involve
> +         * cleaning the cache. So avoid updating the entry if the valid
> +         * bit is already cleared.
> +         */
> +        if ( !pte.p2m.valid )
> +            continue;
> +
> +        pte.p2m.valid = 0;
> +
> +        p2m_write_pte(&table[i], pte, p2m->clean_pte);
> +    }
> +
> +    unmap_domain_page(table);
> +
> +    p2m->need_flush = true;
> +}
> +
> +/*
> + * Resolve any translation fault due to change in the p2m. This
> + * includes break-before-make and valid bit cleared.
> + */
> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +    unsigned int level = 0;
> +    bool resolved = false;
> +    lpae_t entry, *table;
> +    paddr_t addr = gfn_to_gaddr(gfn);
> +
> +    /* Convenience aliases */
> +    const unsigned int offsets[4] = {
> +        zeroeth_table_offset(addr),
> +        first_table_offset(addr),
> +        second_table_offset(addr),
> +        third_table_offset(addr)
> +    };
> +
> +    p2m_write_lock(p2m);
> +
> +    /* This gfn is higher than the highest the p2m map currently holds */
> +    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
> +        goto out;
> +
> +    table = p2m_get_root_pointer(p2m, gfn);
> +    /*
> +     * The table should always be non-NULL because the gfn is below
> +     * p2m->max_mapped_gfn and the root table pages are always present.
> +     */
> +    BUG_ON(table == NULL);
> +
> +    /*
> +     * Go down the page-tables until an entry has the valid bit unset or
> +     * a block/page entry has been hit.
> +     */
> +    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
> +    {
> +        int rc;
> +
> +        entry = table[offsets[level]];
> +
> +        if ( level == 3 )
> +            break;
> +
> +        /* Stop as soon as we hit an entry with the valid bit unset. */
> +        if ( !lpae_is_valid(entry) )
> +            break;
> +
> +        rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
> +        if ( rc == GUEST_TABLE_MAP_FAILED )
> +            goto out_unmap;
> +        else if ( rc != GUEST_TABLE_NORMAL_PAGE )
> +            break;
> +    }
> +
> +    /*
> +     * If the valid bit of the entry is set, it means someone was playing with
> +     * the Stage-2 page table. Nothing to do and mark the fault as resolved.
> +     */
> +    if ( lpae_is_valid(entry) )
> +    {
> +        resolved = true;
> +        goto out_unmap;
> +    }
> +
> +    /*
> +     * The valid bit is unset. If the entry is still not valid then the fault
> +     * cannot be resolved, exit and report it.
> +     */
> +    if ( !p2m_is_valid(entry) )
> +        goto out_unmap;
> +
> +    /*
> +     * Now we have an entry with valid bit unset, but still valid from
> +     * the P2M point of view.
> +     *
> +     * If an entry is pointing to a table, each entry of the table will
> +     * have there valid bit cleared. This allows a function to clear the
> +     * full p2m with just a couple of write. The valid bit will then be
> +     * propagated on the fault.
> +     * If an entry is pointing to a block/page, no work to do for now.
> +     */
> +    if ( lpae_is_table(entry, level) )
> +        p2m_invalidate_table(p2m, lpae_get_mfn(entry));
> +
> +    /*
> +     * Now that the work on the entry is done, set the valid bit to prevent
> +     * another fault on that entry.
> +     */
> +    resolved = true;
> +    entry.p2m.valid = 1;
> +
> +    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
> +
> +    /*
> +     * No need to flush the TLBs as the modified entry had the valid bit
> +     * unset.
> +     */
> +
> +out_unmap:
> +    unmap_domain_page(table);
> +
> +out:
> +    p2m_write_unlock(p2m);
> +
> +    return resolved;
> +}
> +
>  static inline int p2m_insert_mapping(struct domain *d,
>                                       gfn_t start_gfn,
>                                       unsigned long nr,
> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
> index 94fe1a6da7..b00d0b8e1e 100644
> --- a/xen/arch/arm/traps.c
> +++ b/xen/arch/arm/traps.c
> @@ -1893,7 +1893,6 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>      vaddr_t gva;
>      paddr_t gpa;
>      uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
> -    mfn_t mfn;
>      bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
>  
>      /*
> @@ -1972,12 +1971,11 @@ static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
>          }
>  
>          /*
> -         * The PT walk may have failed because someone was playing
> -         * with the Stage-2 page table. Walk the Stage-2 PT to check
> -         * if the entry exists. If it's the case, return to the guest
> +         * First check if the translation fault can be resolved by the
> +         * P2M subsystem. If that's the case nothing else to do.
>           */
> -        mfn = gfn_to_mfn(current->domain, gaddr_to_gfn(gpa));
> -        if ( !mfn_eq(mfn, INVALID_MFN) )
> +        if ( p2m_resolve_translation_fault(current->domain,
> +                                           gaddr_to_gfn(gpa)) )
>              return;
>  
>          if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 4fe78d39a5..13f7a27c38 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -226,6 +226,8 @@ int p2m_set_entry(struct p2m_domain *p2m,
>                    p2m_type_t t,
>                    p2m_access_t a);
>  
> +bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
> +
>  /* Clean & invalidate caches corresponding to a region of guest address space */
>  int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr);
>  
> -- 
> 2.11.0
>
diff mbox series

Patch

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 39680eeb6e..2706db3e67 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1035,6 +1035,148 @@  int p2m_set_entry(struct p2m_domain *p2m,
     return rc;
 }
 
+/* Invalidate all entries in the table. The p2m should be write locked. */
+static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
+{
+    lpae_t *table;
+    unsigned int i;
+
+    ASSERT(p2m_is_write_locked(p2m));
+
+    table = map_domain_page(mfn);
+
+    for ( i = 0; i < LPAE_ENTRIES; i++ )
+    {
+        lpae_t pte = table[i];
+
+        /*
+         * Writing an entry can be expensive because it may involve
+         * cleaning the cache. So avoid updating the entry if the valid
+         * bit is already cleared.
+         */
+        if ( !pte.p2m.valid )
+            continue;
+
+        pte.p2m.valid = 0;
+
+        p2m_write_pte(&table[i], pte, p2m->clean_pte);
+    }
+
+    unmap_domain_page(table);
+
+    p2m->need_flush = true;
+}
+
+/*
+ * Resolve any translation fault due to change in the p2m. This
+ * includes break-before-make and valid bit cleared.
+ */
+bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    unsigned int level = 0;
+    bool resolved = false;
+    lpae_t entry, *table;
+    paddr_t addr = gfn_to_gaddr(gfn);
+
+    /* Convenience aliases */
+    const unsigned int offsets[4] = {
+        zeroeth_table_offset(addr),
+        first_table_offset(addr),
+        second_table_offset(addr),
+        third_table_offset(addr)
+    };
+
+    p2m_write_lock(p2m);
+
+    /* This gfn is higher than the highest the p2m map currently holds */
+    if ( gfn_x(gfn) > gfn_x(p2m->max_mapped_gfn) )
+        goto out;
+
+    table = p2m_get_root_pointer(p2m, gfn);
+    /*
+     * The table should always be non-NULL because the gfn is below
+     * p2m->max_mapped_gfn and the root table pages are always present.
+     */
+    BUG_ON(table == NULL);
+
+    /*
+     * Go down the page-tables until an entry has the valid bit unset or
+     * a block/page entry has been hit.
+     */
+    for ( level = P2M_ROOT_LEVEL; level <= 3; level++ )
+    {
+        int rc;
+
+        entry = table[offsets[level]];
+
+        if ( level == 3 )
+            break;
+
+        /* Stop as soon as we hit an entry with the valid bit unset. */
+        if ( !lpae_is_valid(entry) )
+            break;
+
+        rc = p2m_next_level(p2m, true, level, &table, offsets[level]);
+        if ( rc == GUEST_TABLE_MAP_FAILED )
+            goto out_unmap;
+        else if ( rc != GUEST_TABLE_NORMAL_PAGE )
+            break;
+    }
+
+    /*
+     * If the valid bit of the entry is set, it means someone was playing with
+     * the Stage-2 page table. Nothing to do and mark the fault as resolved.
+     */
+    if ( lpae_is_valid(entry) )
+    {
+        resolved = true;
+        goto out_unmap;
+    }
+
+    /*
+     * The valid bit is unset. If the entry is still not valid then the fault
+     * cannot be resolved, exit and report it.
+     */
+    if ( !p2m_is_valid(entry) )
+        goto out_unmap;
+
+    /*
+     * Now we have an entry with valid bit unset, but still valid from
+     * the P2M point of view.
+     *
+     * If an entry is pointing to a table, each entry of the table will
+     * have there valid bit cleared. This allows a function to clear the
+     * full p2m with just a couple of write. The valid bit will then be
+     * propagated on the fault.
+     * If an entry is pointing to a block/page, no work to do for now.
+     */
+    if ( lpae_is_table(entry, level) )
+        p2m_invalidate_table(p2m, lpae_get_mfn(entry));
+
+    /*
+     * Now that the work on the entry is done, set the valid bit to prevent
+     * another fault on that entry.
+     */
+    resolved = true;
+    entry.p2m.valid = 1;
+
+    p2m_write_pte(table + offsets[level], entry, p2m->clean_pte);
+
+    /*
+     * No need to flush the TLBs as the modified entry had the valid bit
+     * unset.
+     */
+
+out_unmap:
+    unmap_domain_page(table);
+
+out:
+    p2m_write_unlock(p2m);
+
+    return resolved;
+}
+
 static inline int p2m_insert_mapping(struct domain *d,
                                      gfn_t start_gfn,
                                      unsigned long nr,
diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 94fe1a6da7..b00d0b8e1e 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1893,7 +1893,6 @@  static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
     vaddr_t gva;
     paddr_t gpa;
     uint8_t fsc = xabt.fsc & ~FSC_LL_MASK;
-    mfn_t mfn;
     bool is_data = (hsr.ec == HSR_EC_DATA_ABORT_LOWER_EL);
 
     /*
@@ -1972,12 +1971,11 @@  static void do_trap_stage2_abort_guest(struct cpu_user_regs *regs,
         }
 
         /*
-         * The PT walk may have failed because someone was playing
-         * with the Stage-2 page table. Walk the Stage-2 PT to check
-         * if the entry exists. If it's the case, return to the guest
+         * First check if the translation fault can be resolved by the
+         * P2M subsystem. If that's the case nothing else to do.
          */
-        mfn = gfn_to_mfn(current->domain, gaddr_to_gfn(gpa));
-        if ( !mfn_eq(mfn, INVALID_MFN) )
+        if ( p2m_resolve_translation_fault(current->domain,
+                                           gaddr_to_gfn(gpa)) )
             return;
 
         if ( is_data && try_map_mmio(gaddr_to_gfn(gpa)) )
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 4fe78d39a5..13f7a27c38 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -226,6 +226,8 @@  int p2m_set_entry(struct p2m_domain *p2m,
                   p2m_type_t t,
                   p2m_access_t a);
 
+bool p2m_resolve_translation_fault(struct domain *d, gfn_t gfn);
+
 /* Clean & invalidate caches corresponding to a region of guest address space */
 int p2m_cache_flush(struct domain *d, gfn_t start, unsigned long nr);