[v3] xen/arm: p2m: Correctly flush TLB in create_p2m_entries

Message ID 1389706615-9578-1-git-send-email-julien.grall@linaro.org
State Accepted
Commit c938f5414b2b7ffa5b6f65daa9f4522db360987b
Headers show

Commit Message

Julien Grall Jan. 14, 2014, 1:36 p.m.
The p2m is shared between VCPUs for each domain. Currently Xen only flush
TLB on the local PCPU. This could result to mismatch between the mapping in the
p2m and TLBs.

Flush TLB entries used by this domain on every PCPU. The flush can also be
moved out of the loop because:
    - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called
    - INSERT: if valid = 1 that would means with have replaced a
    page that already belongs to the domain. A VCPU can write on the wrong page.
    This can happen for dom0 with the 1:1 mapping because the mapping is not
    removed from the p2m.
    - REMOVE: except for grant-table (replace_grant_host_mapping), each
    call to guest_physmap_remove_page are protected by the callers via a
        get_page -> .... -> guest_physmap_remove_page -> ... -> put_page. So
    the page can't be allocated for another domain until the last put_page.
    - RELINQUISH : the domain is not running anymore so we don't care...

Also avoid leaking a foreign page if the function is INSERTed a new mapping
on top of foreign mapping.

Signed-off-by: Julien Grall <julien.grall@linaro.org>

---
    Changes in v3:
        - Add an ASSERT in ALLOCATE
        - Fix typo in commit message
        - Move put_page above the switch to avoid leaking foreign page
        when a page is replaced.
    Changes in v2:
        - Switch to the domain for only flush its TLBs entries
        - Move the flush out of the loop

This is a possible bug fix (found by reading the code) for Xen 4.4, I moved the
flush out of the loop which should be safe (see why in the commit message).
Without this patch, the guest can have stale TLB entries when the VCPU is moved
to another PCPU.

Except grant-table (I can't find {get,put}_page for grant-table code???),
all the callers are protected by a get_page before removing the page. So if the
another VCPU is trying to access to this page before the flush, it will just
read/write the wrong page.

The downside of this patch is Xen flushes less TLBs. Instead of flushing all TLBs
on the current PCPU, Xen flushes TLBs for a specific VMID on every CPUs. This
should be safe because create_p2m_entries only deal with a specific domain.

I don't think I forget case in this function. Let me know if it's the case.
---
 xen/arch/arm/p2m.c |   56 +++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 38 insertions(+), 18 deletions(-)

Comments

Ian Campbell Jan. 14, 2014, 2:44 p.m. | #1
On Tue, 2014-01-14 at 13:36 +0000, Julien Grall wrote:
> The p2m is shared between VCPUs for each domain. Currently Xen only flush
> TLB on the local PCPU. This could result to mismatch between the mapping in the
> p2m and TLBs.
> 
> Flush TLB entries used by this domain on every PCPU. The flush can also be
> moved out of the loop because:
>     - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called
>     - INSERT: if valid = 1 that would means with have replaced a
>     page that already belongs to the domain. A VCPU can write on the wrong page.
>     This can happen for dom0 with the 1:1 mapping because the mapping is not
>     removed from the p2m.
>     - REMOVE: except for grant-table (replace_grant_host_mapping), each
>     call to guest_physmap_remove_page are protected by the callers via a
>         get_page -> .... -> guest_physmap_remove_page -> ... -> put_page. So
>     the page can't be allocated for another domain until the last put_page.
>     - RELINQUISH : the domain is not running anymore so we don't care...
> 
> Also avoid leaking a foreign page if the function is INSERTed a new mapping
> on top of foreign mapping.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Release hat: There are two major issues here, one is not broadcasting
the TLB flush, which is a potential security issue (another VCPU can
keep accessing a page after it is freed). The other is a potential DoS
by leaking a reference on a foreign page, which would stop that domain
from ever being destroyed.

Either of these two issues would be enough to justify taking this change
for 4.4.

We are cutting rc2 at the moment, I will apply after that is out the
way.
Ian Campbell Jan. 15, 2014, 2:13 p.m. | #2
On Tue, 2014-01-14 at 14:44 +0000, Ian Campbell wrote:
> On Tue, 2014-01-14 at 13:36 +0000, Julien Grall wrote:
> > The p2m is shared between VCPUs for each domain. Currently Xen only flush
> > TLB on the local PCPU. This could result to mismatch between the mapping in the
> > p2m and TLBs.
> > 
> > Flush TLB entries used by this domain on every PCPU. The flush can also be
> > moved out of the loop because:
> >     - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called
> >     - INSERT: if valid = 1 that would means with have replaced a
> >     page that already belongs to the domain. A VCPU can write on the wrong page.
> >     This can happen for dom0 with the 1:1 mapping because the mapping is not
> >     removed from the p2m.
> >     - REMOVE: except for grant-table (replace_grant_host_mapping), each
> >     call to guest_physmap_remove_page are protected by the callers via a
> >         get_page -> .... -> guest_physmap_remove_page -> ... -> put_page. So
> >     the page can't be allocated for another domain until the last put_page.
> >     - RELINQUISH : the domain is not running anymore so we don't care...
> > 
> > Also avoid leaking a foreign page if the function is INSERTed a new mapping
> > on top of foreign mapping.
> > 
> > Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Release hat: There are two major issues here, one is not broadcasting
> the TLB flush, which is a potential security issue (another VCPU can
> keep accessing a page after it is freed). The other is a potential DoS
> by leaking a reference on a foreign page, which would stop that domain
> from ever being destroyed.
> 
> Either of these two issues would be enough to justify taking this change
> for 4.4.
> 
> We are cutting rc2 at the moment, I will apply after that is out the
> way.

Done, on top of "xen/arm: correct flush_tlb_mask behaviour".

Patch

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 11f4714..85ca330 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -238,7 +238,7 @@  static int create_p2m_entries(struct domain *d,
                      int mattr,
                      p2m_type_t t)
 {
-    int rc, flush;
+    int rc;
     struct p2m_domain *p2m = &d->arch.p2m;
     lpae_t *first = NULL, *second = NULL, *third = NULL;
     paddr_t addr;
@@ -246,10 +246,15 @@  static int create_p2m_entries(struct domain *d,
                   cur_first_offset = ~0,
                   cur_second_offset = ~0;
     unsigned long count = 0;
+    unsigned int flush = 0;
     bool_t populate = (op == INSERT || op == ALLOCATE);
+    lpae_t pte;
 
     spin_lock(&p2m->lock);
 
+    if ( d != current->domain )
+        p2m_load_VTTBR(d);
+
     addr = start_gpaddr;
     while ( addr < end_gpaddr )
     {
@@ -316,15 +321,31 @@  static int create_p2m_entries(struct domain *d,
             cur_second_offset = second_table_offset(addr);
         }
 
-        flush = third[third_table_offset(addr)].p2m.valid;
+        pte = third[third_table_offset(addr)];
+
+        flush |= pte.p2m.valid;
+
+        /* TODO: Handle other p2m type
+         *
+         * It's safe to do the put_page here because page_alloc will
+         * flush the TLBs if the page is reallocated before the end of
+         * this loop.
+         */
+        if ( pte.p2m.valid && p2m_is_foreign(pte.p2m.type) )
+        {
+            unsigned long mfn = pte.p2m.base;
+
+            ASSERT(mfn_valid(mfn));
+            put_page(mfn_to_page(mfn));
+        }
 
         /* Allocate a new RAM page and attach */
         switch (op) {
             case ALLOCATE:
                 {
                     struct page_info *page;
-                    lpae_t pte;
 
+                    ASSERT(!pte.p2m.valid);
                     rc = -ENOMEM;
                     page = alloc_domheap_page(d, 0);
                     if ( page == NULL ) {
@@ -339,8 +360,7 @@  static int create_p2m_entries(struct domain *d,
                 break;
             case INSERT:
                 {
-                    lpae_t pte = mfn_to_p2m_entry(maddr >> PAGE_SHIFT,
-                                                  mattr, t);
+                    pte = mfn_to_p2m_entry(maddr >> PAGE_SHIFT, mattr, t);
                     write_pte(&third[third_table_offset(addr)], pte);
                     maddr += PAGE_SIZE;
                 }
@@ -348,9 +368,6 @@  static int create_p2m_entries(struct domain *d,
             case RELINQUISH:
             case REMOVE:
                 {
-                    lpae_t pte = third[third_table_offset(addr)];
-                    unsigned long mfn = pte.p2m.base;
-
                     if ( !pte.p2m.valid )
                     {
                         count++;
@@ -359,13 +376,6 @@  static int create_p2m_entries(struct domain *d,
 
                     count += 0x10;
 
-                    /* TODO: Handle other p2m type */
-                    if ( p2m_is_foreign(pte.p2m.type) )
-                    {
-                        ASSERT(mfn_valid(mfn));
-                        put_page(mfn_to_page(mfn));
-                    }
-
                     memset(&pte, 0x00, sizeof(pte));
                     write_pte(&third[third_table_offset(addr)], pte);
                     count++;
@@ -373,9 +383,6 @@  static int create_p2m_entries(struct domain *d,
                 break;
         }
 
-        if ( flush )
-            flush_tlb_all_local();
-
         /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
         if ( op == RELINQUISH && count >= 0x2000 )
         {
@@ -392,6 +399,16 @@  static int create_p2m_entries(struct domain *d,
         addr += PAGE_SIZE;
     }
 
+    if ( flush )
+    {
+        /* At the beginning of the function, Xen is updating VTTBR
+         * with the domain where the mappings are created. In this
+         * case it's only necessary to flush TLBs on every CPUs with
+         * the current VMID (our domain).
+         */
+        flush_tlb();
+    }
+
     if ( op == ALLOCATE || op == INSERT )
     {
         unsigned long sgfn = paddr_to_pfn(start_gpaddr);
@@ -409,6 +426,9 @@  out:
     if (second) unmap_domain_page(second);
     if (first) unmap_domain_page(first);
 
+    if ( d != current->domain )
+        p2m_load_VTTBR(current->domain);
+
     spin_unlock(&p2m->lock);
 
     return rc;