From patchwork Tue Jun 10 09:57:58 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ian Campbell <ian.campbell@citrix.com>
X-Patchwork-Id: 31618
Return-Path: <patchwork-forward+bncBC4Y5F6PT4PBBIFO3OOAKGQE7VZ7NNQ@linaro.org>
X-Original-To: linaro@patches.linaro.org
Delivered-To: linaro@patches.linaro.org
Received: from mail-ie0-f200.google.com (mail-ie0-f200.google.com
 [209.85.223.200])
 by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id A65402054B
 for <linaro@patches.linaro.org>; Tue, 10 Jun 2014 10:00:00 +0000 (UTC)
Received: by mail-ie0-f200.google.com with SMTP id tr6sf11980722ieb.11
 for <linaro@patches.linaro.org>; Tue, 10 Jun 2014 03:00:00 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:delivered-to:from:to:date:message-id:in-reply-to
 :references:mime-version:cc:subject:precedence:list-id
 :list-unsubscribe:list-post:list-help:list-subscribe:sender
 :errors-to:x-original-sender:x-original-authentication-results
 :mailing-list:list-archive:content-type:content-transfer-encoding;
 bh=anadrn8YvYLPITkPk2zVocrjC94Ph28xGQd3N79p+ks=;
 b=U8eHMaPVOKqmM4/BRAThx/zSDPAMsMYrMyFNcOuYb1vb97NQBwM88WeFw1Rij4HcgJ
 09rY9HPkXFME48V0xne0l9GMKby46VdZryEJ4qXTBBZDuYBwuhQ7saubMcpSWruXhb4D
 jfUrT3mXDOeuj+rKm3CaCqBVRXzEE4m/NyPje4Ycs/seb4D5iy9LVIxcsJg9WZfsWrcZ
 Z44PmUXDguoGtHNGnCDY/xiZ7OkK2A15WrKD+puDkRe2PXpcUZQWcP4KWCW2uy3M605J
 dKJPaMd2vTfvYiwjwugs58tcrk3iBOVwN9wfeiegit+NI1hEf6HeGIz7VHWHvk2GMle1
 xMHg==
X-Gm-Message-State: ALoCoQn2OKMK5wVb6wWM5iDsE2nX/D8HuCyYhnIw+VJtieFNyDk/Ci0xcVkHtFFRIGwKmfjGQ/kN
X-Received: by 10.182.119.194 with SMTP id kw2mr5495752obb.27.1402394400266; 
 Tue, 10 Jun 2014 03:00:00 -0700 (PDT)
X-BeenThere: patchwork-forward@linaro.org
Received: by 10.140.25.171 with SMTP id 40ls2032091qgt.85.gmail; Tue, 10 Jun
 2014 03:00:00 -0700 (PDT)
X-Received: by 10.52.164.237 with SMTP id yt13mr26384845vdb.18.1402394400081; 
 Tue, 10 Jun 2014 03:00:00 -0700 (PDT)
Received: from mail-ve0-f178.google.com (mail-ve0-f178.google.com
 [209.85.128.178]) by mx.google.com with ESMTPS id
 gu7si12883400vdc.22.2014.06.10.03.00.00
 for <patchwork-forward@linaro.org>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Tue, 10 Jun 2014 03:00:00 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 209.85.128.178 as permitted sender) client-ip=209.85.128.178; 
Received: by mail-ve0-f178.google.com with SMTP id sa20so7997316veb.9
 for <patchwork-forward@linaro.org>;
 Tue, 10 Jun 2014 03:00:00 -0700 (PDT)
X-Received: by 10.58.118.228 with SMTP id kp4mr214044veb.59.1402394399981;
 Tue, 10 Jun 2014 02:59:59 -0700 (PDT)
X-Forwarded-To: patchwork-forward@linaro.org
X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org
Delivered-To: patch@linaro.org
Received: by 10.221.54.6 with SMTP id vs6csp212651vcb;
 Tue, 10 Jun 2014 02:59:59 -0700 (PDT)
X-Received: by 10.224.121.72 with SMTP id g8mr39859301qar.79.1402394398444; 
 Tue, 10 Jun 2014 02:59:58 -0700 (PDT)
Received: from lists.xen.org (lists.xen.org. [50.57.142.19])
 by mx.google.com with ESMTPS id
 l66si26147818qgf.78.2014.06.10.02.59.58 for <multiple recipients>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Tue, 10 Jun 2014 02:59:58 -0700 (PDT)
Received-SPF: none (google.com: xen-devel-bounces@lists.xen.org does not
 designate permitted sender hosts) client-ip=50.57.142.19; 
Received: from localhost ([127.0.0.1] helo=lists.xen.org)
 by lists.xen.org with esmtp (Exim 4.72)
 (envelope-from <xen-devel-bounces@lists.xen.org>)
 id 1WuIof-0007gr-Ht; Tue, 10 Jun 2014 09:58:05 +0000
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
 by lists.xen.org with esmtp (Exim 4.72)
 (envelope-from <Ian.Campbell@citrix.com>) id 1WuIoe-0007gU-EK
 for xen-devel@lists.xen.org; Tue, 10 Jun 2014 09:58:04 +0000
Received: from [85.158.137.68:6056] by server-11.bemta-3.messagelabs.com id
 F1/64-19438-BA6D6935; Tue, 10 Jun 2014 09:58:03 +0000
X-Env-Sender: Ian.Campbell@citrix.com
X-Msg-Ref: server-7.tower-31.messagelabs.com!1402394280!9104109!1
X-Originating-IP: [66.165.176.63]
X-SpamReason: No, hits=0.0 required=7.0 tests=sa_preprocessor: 
 VHJ1c3RlZCBJUDogNjYuMTY1LjE3Ni42MyA9PiAzMDYwNDg=\n
X-StarScan-Received: 
X-StarScan-Version: 6.11.3; banners=-,-,-
X-VirusChecked: Checked
Received: (qmail 30117 invoked from network); 10 Jun 2014 09:58:02 -0000
Received: from smtp02.citrix.com (HELO SMTP02.CITRIX.COM) (66.165.176.63)
 by server-7.tower-31.messagelabs.com with RC4-SHA encrypted SMTP;
 10 Jun 2014 09:58:02 -0000
X-IronPort-AV: E=Sophos; i="4.98,1008,1392163200"; d="scan'208"; a="141507548"
Received: from accessns.citrite.net (HELO FTLPEX01CL01.citrite.net)
 ([10.9.154.239])
 by FTLPIPO02.CITRIX.COM with ESMTP; 10 Jun 2014 09:58:00 +0000
Received: from norwich.cam.xci-test.com (10.80.248.129) by
 smtprelay.citrix.com (10.13.107.78) with Microsoft SMTP Server id
 14.3.181.6; Tue, 10 Jun 2014 05:57:59 -0400
Received: from marilith-n13-p0.uk.xensource.com ([10.80.229.115]
 helo=marilith-n13.uk.xensource.com.)	by norwich.cam.xci-test.com with
 esmtp (Exim 4.72)	(envelope-from <ian.campbell@citrix.com>)	id
 1WuIoZ-000207-I6; Tue, 10 Jun 2014 09:57:59 +0000
From: Ian Campbell <ian.campbell@citrix.com>
To: <xen-devel@lists.xen.org>
Date: Tue, 10 Jun 2014 10:57:58 +0100
Message-ID: <1402394278-9850-6-git-send-email-ian.campbell@citrix.com>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1402394127.29980.52.camel@kazak.uk.xensource.com>
References: <1402394127.29980.52.camel@kazak.uk.xensource.com>
MIME-Version: 1.0
X-DLP: MIA2
Cc: julien.grall@linaro.org, tim@xen.org,
 Ian Campbell <ian.campbell@citrix.com>, stefano.stabellini@eu.citrix.com
Subject: [Xen-devel] [PATCH 6/6] xen: arm: use superpages in p2m when pages
 are suitably aligned
X-BeenThere: xen-devel@lists.xen.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: <patchwork-forward.linaro.org>
List-Unsubscribe: <http://groups.google.com/a/linaro.org/group/patchwork-forward/subscribe>, 
 <mailto:googlegroups-manage+836684582541+unsubscribe@googlegroups.com>
List-Post: <http://groups.google.com/a/linaro.org/group/patchwork-forward/post>, 
 <mailto:patchwork-forward@linaro.org>
List-Help: <http://support.google.com/a/linaro.org/bin/topic.py?topic=25838>, 
 <mailto:patchwork-forward+help@linaro.org>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
 <mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
X-Removed-Original-Auth: Dkim didn't pass.
X-Original-Sender: ian.campbell@citrix.com
X-Original-Authentication-Results: mx.google.com; spf=pass (google.com:
 domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 209.85.128.178 as permitted sender)
 smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org
Mailing-list: list patchwork-forward@linaro.org;
 contact patchwork-forward+owners@linaro.org
X-Google-Group-Id: 836684582541
List-Archive: <http://groups.google.com/a/linaro.org/group/patchwork-forward/>

This creates superpage (1G and 2M) mappings in the p2m for both domU and dom0
when the guest and machine addresses are suitably aligned. This relies on the
domain builder to allocate suitably sized regions; this is trivially true for
1:1 mapped dom0s and was arranged for guests in a previous patch. The memory of
a non-1:1 dom0 is not currently deliberately aligned.
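
As a rough illustration of "suitably aligned" (a standalone sketch, not the
code in this patch; superpage_fits and the SKETCH_* constants are hypothetical
names, a 4K-granule LPAE layout is assumed, and the end address is treated as
exclusive here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes for a 4K granule: 2M at level 2, 1G at level 1. */
#define SKETCH_SZ_2M (UINT64_C(1) << 21)
#define SKETCH_SZ_1G (UINT64_C(1) << 30)

/*
 * Hypothetical helper: true if a superpage of 'size' bytes can map
 * gpaddr -> maddr, i.e. at least 'size' bytes remain in [gpaddr, end)
 * and both addresses are aligned to 'size'.
 */
static bool superpage_fits(uint64_t gpaddr, uint64_t end,
                           uint64_t maddr, uint64_t size)
{
    const uint64_t mask = size - 1;

    /* A range smaller than the superpage cannot hold one. */
    if ( end <= gpaddr || end - gpaddr < size )
        return false;

    /* Both the guest and the machine address must be aligned. */
    return !(gpaddr & mask) && !(maddr & mask);
}
```

With these assumptions, a 1G guest range backed by machine memory that is only
2M aligned would be rejected for a 1G mapping but accepted for 2M mappings.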

Since ARM pagetables are (mostly) consistent at each level, this is implemented
by refactoring the handling of a single level of pagetable into a common
function. The two inconsistencies are that there are no superpage mappings at
level zero and that level three entry mappings must set the table bit.
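
The per-level regularity and the two exceptions can be sketched as follows.
This is only an illustration under the assumption of a 4K granule (9 bits per
level, 12-bit page offset); the sketch_* names are hypothetical and do not
appear in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4K-granule LPAE geometry. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_LPAE_SHIFT 9

/* Address bits covered by one entry at 'level' (0..3): 39, 30, 21, 12. */
static unsigned int sketch_level_shift(unsigned int level)
{
    return SKETCH_PAGE_SHIFT + (3 - level) * SKETCH_LPAE_SHIFT;
}

/* Exception 1: a level 0 entry can only point to a table. */
static bool sketch_level_allows_mapping(unsigned int level)
{
    return level >= 1 && level <= 3;
}

/*
 * Exception 2: a level 3 mapping sets the table bit, whereas a 1G or
 * 2M mapping at level 1 or 2 clears it.
 */
static unsigned int sketch_mapping_table_bit(unsigned int level)
{
    return level == 3 ? 1 : 0;
}
```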

When inserting new mappings the code shatters superpage mappings as necessary,
but currently makes no attempt to coalesce anything again. In particular when
inserting a mapping which could be a superpage over an existing table mapping
we do not attempt to free that lower page table and instead descend into it.
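
To make the shattering step concrete, here is a minimal standalone sketch of
how one superpage turns into 512 next-level mappings. sketch_shatter is a
hypothetical helper that works on plain frame numbers rather than real
page-table entries, again assuming 512 entries per table:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LPAE_SHIFT   9
#define SKETCH_LPAE_ENTRIES (1u << SKETCH_LPAE_SHIFT)

/*
 * Fill child_pfn[] with the base frame numbers of the 512 next-level
 * mappings that replace one superpage starting at base_pfn. 'level'
 * is the level being shattered: 1 (1G -> 2M) or 2 (2M -> 4K).
 */
static void sketch_shatter(uint64_t base_pfn, unsigned int level,
                           uint64_t child_pfn[SKETCH_LPAE_ENTRIES])
{
    /* Each child spans 512 frames when splitting 1G, 1 frame for 2M. */
    const unsigned int child_order = (2 - level) * SKETCH_LPAE_SHIFT;
    unsigned int i;

    for ( i = 0; i < SKETCH_LPAE_ENTRIES; i++ )
        child_pfn[i] = base_pfn + ((uint64_t)i << child_order);
}
```

The sketch only covers the split direction, matching the text above: nothing
here merges 512 contiguous entries back into a superpage.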

Some p2m statistics are added to keep track of the number of each level of
mapping and the number of times we've had to shatter an existing mapping.
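
A small sketch of the accounting, for illustration only: the struct mirrors
the shape of the stats added to struct p2m_domain, but the sketch_* names are
hypothetical and the 512 entries per table are an assumption:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_LPAE_ENTRIES 512

/* Per-level counters, indexed by p2m tree level (0..3). */
struct sketch_p2m_stats {
    uint32_t mappings[4];
    uint32_t shattered[4];
};

/* Account for a new mapping at 'level' (1 = 1G, 2 = 2M, 3 = 4K). */
static void sketch_account_map(struct sketch_p2m_stats *s,
                               unsigned int level)
{
    s->mappings[level]++;
}

/* Shattering one mapping at 'level' (1 or 2) yields 512 at level + 1. */
static void sketch_account_shatter(struct sketch_p2m_stats *s,
                                   unsigned int level)
{
    s->shattered[level]++;
    s->mappings[level]--;
    s->mappings[level + 1] += SKETCH_LPAE_ENTRIES;
}
```

Level 0 never holds mappings, so its counters are expected to stay at zero.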

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 xen/arch/arm/domain.c     |    1 +
 xen/arch/arm/p2m.c        |  497 +++++++++++++++++++++++++++++++++------------
 xen/include/asm-arm/p2m.h |    9 +
 3 files changed, 373 insertions(+), 134 deletions(-)
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index e494112..45b7cbd 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -742,6 +742,7 @@ int domain_relinquish_resources(struct domain *d)
 
 void arch_dump_domain_info(struct domain *d)
 {
+    p2m_dump_info(d);
 }
 
 
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 233df72..1c2c724 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1,6 +1,7 @@
 #include <xen/config.h>
 #include <xen/sched.h>
 #include <xen/lib.h>
+#include <xen/stdbool.h>
 #include <xen/errno.h>
 #include <xen/domain_page.h>
 #include <xen/bitops.h>
@@ -19,6 +20,22 @@
 #define p2m_table(pte) (p2m_valid(pte) && (pte).p2m.table)
 #define p2m_entry(pte) (p2m_valid(pte) && !(pte).p2m.table)
 
+void p2m_dump_info(struct domain *d)
+{
+    struct p2m_domain *p2m = &d->arch.p2m;
+
+    spin_lock(&p2m->lock);
+    printk("p2m mappings for domain %d (vmid %d):\n",
+           d->domain_id, p2m->vmid);
+    BUG_ON(p2m->stats.mappings[0] || p2m->stats.shattered[0]);
+    printk("  1G mappings: %d (shattered %d)\n",
+           p2m->stats.mappings[1], p2m->stats.shattered[1]);
+    printk("  2M mappings: %d (shattered %d)\n",
+           p2m->stats.mappings[2], p2m->stats.shattered[2]);
+    printk("  4K mappings: %d\n", p2m->stats.mappings[3]);
+    spin_unlock(&p2m->lock);
+}
+
 void dump_p2m_lookup(struct domain *d, paddr_t addr)
 {
     struct p2m_domain *p2m = &d->arch.p2m;
@@ -274,15 +291,26 @@ static inline void p2m_write_pte(lpae_t *p, lpae_t pte, bool_t flush_cache)
         clean_xen_dcache(*p);
 }
 
-/* Allocate a new page table page and hook it in via the given entry */
-static int p2m_create_table(struct domain *d, lpae_t *entry, bool_t flush_cache)
+/*
+ * Allocate a new page table page and hook it in via the given entry.
+ * apply_one_level relies on this returning 0 on success
+ * and -ve on failure.
+ *
+ * If the existing entry is present then it must be a mapping and not a
+ * table and it will be shattered into the next level down.
+ *
+ * level_shift is the number of bits at the level we want to create.
+ */
+static int p2m_create_table(struct domain *d, lpae_t *entry,
+                            int level_shift, bool_t flush_cache)
 {
     struct p2m_domain *p2m = &d->arch.p2m;
     struct page_info *page;
-    void *p;
+    lpae_t *p;
     lpae_t pte;
+    int splitting = entry->p2m.valid;
 
-    BUG_ON(entry->p2m.valid);
+    BUG_ON(entry->p2m.table);
 
     page = alloc_domheap_page(NULL, 0);
     if ( page == NULL )
@@ -291,9 +319,39 @@ static int p2m_create_table(struct domain *d, lpae_t *entry, bool_t flush_cache)
     page_list_add(page, &p2m->pages);
 
     p = __map_domain_page(page);
-    clear_page(p);
+    if ( splitting )
+    {
+        p2m_type_t t = entry->p2m.type;
+        unsigned long base_pfn = entry->p2m.base;
+        int i;
+
+        /*
+         * We are either splitting a first level 1G page into 512 second level
+         * 2M pages, or a second level 2M page into 512 third level 4K pages.
+         */
+        for ( i = 0; i < LPAE_ENTRIES; i++ )
+        {
+            pte = mfn_to_p2m_entry(base_pfn + (i<<(level_shift-LPAE_SHIFT)),
+                                   MATTR_MEM, t);
+
+            /*
+             * First and second level super pages set p2m.table = 0, but
+             * third level entries set table = 1.
+             */
+            if ( level_shift - LPAE_SHIFT )
+                pte.p2m.table = 0;
+
+            write_pte(&p[i], pte);
+        }
+    }
+    else
+    {
+        clear_page(p);
+    }
+
     if ( flush_cache )
         clean_xen_dcache_va_range(p, PAGE_SIZE);
+
     unmap_domain_page(p);
 
     pte = mfn_to_p2m_entry(page_to_mfn(page), MATTR_MEM, p2m_invalid);
@@ -311,8 +369,14 @@ enum p2m_operation {
     CACHEFLUSH,
 };
 
-static void p2m_put_page(const lpae_t pte)
+/* Put any references on the single 4K page referenced by pte.  TODO:
+ * Handle superpages. For now we only take special references for leaf
+ * pages (specifically foreign ones, which can't be super mapped today).
+ */
+static void p2m_put_l3_page(const lpae_t pte)
 {
+    ASSERT(p2m_valid(pte));
+
     /* TODO: Handle other p2m types
      *
      * It's safe to do the put_page here because page_alloc will
@@ -328,6 +392,248 @@ static void p2m_put_page(const lpae_t pte)
     }
 }
 
+/*
+ * Returns true if start_gpaddr..end_gpaddr contains at least one
+ * suitably aligned level_size mapping of maddr.
+ *
+ * So long as the range is large enough the end_gpaddr need not be
+ * aligned (callers should create one superpage mapping based on this
+ * result and then call this again on the new range, eventually the
+ * slop at the end will cause this function to return false).
+ */
+static bool_t is_mapping_aligned(const paddr_t start_gpaddr,
+                                 const paddr_t end_gpaddr,
+                                 const paddr_t maddr,
+                                 const paddr_t level_size)
+{
+    const paddr_t level_mask = level_size - 1;
+
+    /* No hardware superpages at level 0 */
+    if ( level_size == ZEROETH_SIZE )
+        return false;
+
+    /*
+     * A range smaller than the size of a superpage at this level
+     * cannot be superpage aligned.
+     */
+    if ( ( end_gpaddr - start_gpaddr ) < level_size - 1 )
+        return false;
+
+    /* Both the gpaddr and maddr must be aligned */
+    if ( start_gpaddr & level_mask )
+        return false;
+    if ( maddr & level_mask )
+        return false;
+    return true;
+}
+
+#define P2M_ONE_DESCEND        0
+#define P2M_ONE_PROGRESS_NOP   0x1
+#define P2M_ONE_PROGRESS       0x10
+
+/*
+ * 0   == (P2M_ONE_DESCEND) continue to descend the tree
+ * +ve == (P2M_ONE_PROGRESS_*) handled at this level, continue, flush,
+ *        entry, addr and maddr updated.  Return value is an
+ *        indication of the amount of work done (for preemption).
+ * -ve == (-Exxx) error.
+ */
+static int apply_one_level(struct domain *d,
+                           lpae_t *entry,
+                           unsigned int level,
+                           bool_t flush_cache,
+                           enum p2m_operation op,
+                           paddr_t start_gpaddr,
+                           paddr_t end_gpaddr,
+                           paddr_t *addr,
+                           paddr_t *maddr,
+                           bool_t *flush,
+                           int mattr,
+                           p2m_type_t t)
+{
+    /* Helpers to lookup the properties of each level */
+    const paddr_t level_sizes[] =
+        { ZEROETH_SIZE, FIRST_SIZE, SECOND_SIZE, THIRD_SIZE };
+    const paddr_t level_masks[] =
+        { ZEROETH_MASK, FIRST_MASK, SECOND_MASK, THIRD_MASK };
+    const paddr_t level_shifts[] =
+        { ZEROETH_SHIFT, FIRST_SHIFT, SECOND_SHIFT, THIRD_SHIFT };
+    const paddr_t level_size = level_sizes[level];
+    const paddr_t level_mask = level_masks[level];
+    const paddr_t level_shift = level_shifts[level];
+
+    struct p2m_domain *p2m = &d->arch.p2m;
+    lpae_t pte;
+    const lpae_t orig_pte = *entry;
+    int rc;
+
+    BUG_ON(level > 3);
+
+    switch ( op )
+    {
+    case ALLOCATE:
+        ASSERT(level < 3 || !p2m_valid(orig_pte));
+        ASSERT(*maddr == 0);
+
+        if ( p2m_valid(orig_pte) )
+            return P2M_ONE_DESCEND;
+
+        if ( is_mapping_aligned(*addr, end_gpaddr, 0, level_size) )
+        {
+            struct page_info *page;
+
+            page = alloc_domheap_pages(d, level_shift - PAGE_SHIFT, 0);
+            if ( page )
+            {
+                pte = mfn_to_p2m_entry(page_to_mfn(page), mattr, t);
+                if ( level != 3 )
+                    pte.p2m.table = 0;
+                p2m_write_pte(entry, pte, flush_cache);
+                p2m->stats.mappings[level]++;
+                return P2M_ONE_PROGRESS;
+            }
+            else if ( level == 3 )
+                return -ENOMEM;
+        }
+
+        BUG_ON(level == 3); /* L3 is always superpage aligned */
+
+        /*
+         * If we get here then we failed to allocate a sufficiently
+         * large contiguous region for this level (which can't be
+         * L3). Create a page table and continue to descend so we try
+         * smaller allocations.
+         */
+        rc = p2m_create_table(d, entry, 0, flush_cache);
+        if ( rc < 0 )
+            return rc;
+
+        return P2M_ONE_DESCEND;
+
+    case INSERT:
+        if ( is_mapping_aligned(*addr, end_gpaddr, *maddr, level_size) &&
+           /* We do not handle replacing an existing table with a superpage */
+             (level == 3 || !p2m_table(orig_pte)) )
+        {
+            /* New mapping is superpage aligned, make it */
+            pte = mfn_to_p2m_entry(*maddr >> PAGE_SHIFT, mattr, t);
+            if ( level < 3 )
+                pte.p2m.table = 0; /* Superpage entry */
+
+            p2m_write_pte(entry, pte, flush_cache);
+
+            *flush |= p2m_valid(orig_pte);
+
+            *addr += level_size;
+            *maddr += level_size;
+
+            if ( p2m_valid(orig_pte) )
+            {
+                /*
+                 * We can't currently get here for an existing table
+                 * mapping, since we don't handle replacing an
+                 * existing table with a superpage. If we did we would
+                 * need to handle freeing (and accounting) for the bit
+                 * of the p2m tree which we would be about to lop off.
+                 */
+                BUG_ON(level < 3 && p2m_table(orig_pte));
+                if ( level == 3 )
+                    p2m_put_l3_page(orig_pte);
+            }
+            else /* New mapping */
+                p2m->stats.mappings[level]++;
+
+            return P2M_ONE_PROGRESS;
+        }
+        else
+        {
+            /* New mapping is not superpage aligned, create a new table entry */
+            BUG_ON(level == 3); /* L3 is always superpage aligned */
+
+            /* Not present -> create table entry and descend */
+            if ( !p2m_valid(orig_pte) )
+            {
+                rc = p2m_create_table(d, entry, 0, flush_cache);
+                if ( rc < 0 )
+                    return rc;
+                return P2M_ONE_DESCEND;
+            }
+
+            /* Existing superpage mapping -> shatter and descend */
+            if ( p2m_entry(orig_pte) )
+            {
+                *flush = true;
+                rc = p2m_create_table(d, entry,
+                                      level_shift - PAGE_SHIFT, flush_cache);
+                if ( rc < 0 )
+                    return rc;
+
+                p2m->stats.shattered[level]++;
+                p2m->stats.mappings[level]--;
+                p2m->stats.mappings[level+1] += LPAE_ENTRIES;
+            } /* else: an existing table mapping -> descend */
+
+            BUG_ON(!entry->p2m.table);
+
+            return P2M_ONE_DESCEND;
+        }
+
+        break;
+
+    case RELINQUISH:
+    case REMOVE:
+        if ( !p2m_valid(orig_pte) )
+        {
+            /* Progress up to next boundary */
+            *addr = (*addr + level_size) & level_mask;
+            return P2M_ONE_PROGRESS_NOP;
+        }
+
+        if ( level < 3 && p2m_table(orig_pte) )
+            return P2M_ONE_DESCEND;
+
+        *flush = true;
+
+        memset(&pte, 0x00, sizeof(pte));
+        p2m_write_pte(entry, pte, flush_cache);
+
+        *addr += level_size;
+
+        p2m->stats.mappings[level]--;
+
+        if ( level == 3 )
+            p2m_put_l3_page(orig_pte);
+
+        /*
+         * This is still a single pte write, no matter the level, so no need to
+         * scale.
+         */
+        return P2M_ONE_PROGRESS;
+
+    case CACHEFLUSH:
+        if ( !p2m_valid(orig_pte) )
+        {
+            *addr = (*addr + level_size) & level_mask;
+            return P2M_ONE_PROGRESS_NOP;
+        }
+
+        /*
+         * We could flush up to the next boundary, but would need to be
+         * careful about preemption, so just do one page now and loop.
+         */
+        *addr += PAGE_SIZE;
+        if ( p2m_is_ram(orig_pte.p2m.type) )
+        {
+            flush_page_to_ram(orig_pte.p2m.base + third_table_offset(*addr));
+            return P2M_ONE_PROGRESS;
+        }
+        else
+            return P2M_ONE_PROGRESS_NOP;
+    }
+
+    BUG(); /* Should never get here */
+}
+
 static int apply_p2m_changes(struct domain *d,
                      enum p2m_operation op,
                      paddr_t start_gpaddr,
@@ -344,9 +650,7 @@ static int apply_p2m_changes(struct domain *d,
                   cur_first_offset = ~0,
                   cur_second_offset = ~0;
     unsigned long count = 0;
-    unsigned int flush = 0;
-    bool_t populate = (op == INSERT || op == ALLOCATE);
-    lpae_t pte;
+    bool_t flush = false;
     bool_t flush_pt;
 
     /* Some IOMMU don't support coherent PT walk. When the p2m is
@@ -360,6 +664,25 @@ static int apply_p2m_changes(struct domain *d,
     addr = start_gpaddr;
     while ( addr < end_gpaddr )
     {
+        /*
+         * Arbitrarily, preempt every 512 operations or 8192 nops.
+         * 512*P2M_ONE_PROGRESS == 8192*P2M_ONE_PROGRESS_NOP == 0x2000
+         *
+         * count is initialised to 0 above, so we are guaranteed to
+         * always make at least one pass.
+         */
+
+        if ( op == RELINQUISH && count >= 0x2000 )
+        {
+            if ( hypercall_preempt_check() )
+            {
+                p2m->lowest_mapped_gfn = addr >> PAGE_SHIFT;
+                rc = -ERESTART;
+                goto out;
+            }
+            count = 0;
+        }
+
         if ( cur_first_page != p2m_first_level_index(addr) )
         {
             if ( first ) unmap_domain_page(first);
@@ -372,22 +695,18 @@ static int apply_p2m_changes(struct domain *d,
             cur_first_page = p2m_first_level_index(addr);
         }
 
-        if ( !p2m_valid(first[first_table_offset(addr)]) )
-        {
-            if ( !populate )
-            {
-                addr = (addr + FIRST_SIZE) & FIRST_MASK;
-                continue;
-            }
+        /* We only use a 3 level p2m at the moment, so there is no
+         * level 0. Current hardware doesn't support superpage mappings
+         * at level 0 anyway. */
 
-            rc = p2m_create_table(d, &first[first_table_offset(addr)],
-                                  flush_pt);
-            if ( rc < 0 )
-            {
-                printk("p2m_populate_ram: L1 failed\n");
-                goto out;
-            }
-        }
+        rc = apply_one_level(d, &first[first_table_offset(addr)],
+                             1, flush_pt, op,
+                             start_gpaddr, end_gpaddr,
+                             &addr, &maddr, &flush,
+                             mattr, t);
+        if ( rc < 0 ) goto out;
+        count += rc;
+        if ( rc != P2M_ONE_DESCEND ) continue;
 
         BUG_ON(!p2m_valid(first[first_table_offset(addr)]));
 
@@ -399,23 +718,16 @@ static int apply_p2m_changes(struct domain *d,
         }
         /* else: second already valid */
 
-        if ( !p2m_valid(second[second_table_offset(addr)]) )
-        {
-            if ( !populate )
-            {
-                addr = (addr + SECOND_SIZE) & SECOND_MASK;
-                continue;
-            }
+        rc = apply_one_level(d,&second[second_table_offset(addr)],
+                             2, flush_pt, op,
+                             start_gpaddr, end_gpaddr,
+                             &addr, &maddr, &flush,
+                             mattr, t);
+        if ( rc < 0 ) goto out;
+        count += rc;
+        if ( rc != P2M_ONE_DESCEND ) continue;
 
-            rc = p2m_create_table(d, &second[second_table_offset(addr)],
-                                  flush_pt);
-            if ( rc < 0 ) {
-                printk("p2m_populate_ram: L2 failed\n");
-                goto out;
-            }
-        }
-
-        BUG_ON(!second[second_table_offset(addr)].p2m.valid);
+        BUG_ON(!p2m_valid(second[second_table_offset(addr)]));
 
         if ( cur_second_offset != second_table_offset(addr) )
         {
@@ -425,98 +737,15 @@ static int apply_p2m_changes(struct domain *d,
             cur_second_offset = second_table_offset(addr);
         }
 
-        pte = third[third_table_offset(addr)];
-
-        flush |= pte.p2m.valid;
-
-        /* TODO: Handle other p2m type
-         *
-         * It's safe to do the put_page here because page_alloc will
-         * flush the TLBs if the page is reallocated before the end of
-         * this loop.
-         */
-        if ( pte.p2m.valid && p2m_is_foreign(pte.p2m.type) )
-        {
-            unsigned long mfn = pte.p2m.base;
-
-            ASSERT(mfn_valid(mfn));
-            put_page(mfn_to_page(mfn));
-        }
-
-        switch (op) {
-            case ALLOCATE:
-                {
-                    /* Allocate a new RAM page and attach */
-                    struct page_info *page;
-
-                    ASSERT(!pte.p2m.valid);
-                    rc = -ENOMEM;
-                    page = alloc_domheap_page(d, 0);
-                    if ( page == NULL ) {
-                        printk("p2m_populate_ram: failed to allocate page\n");
-                        goto out;
-                    }
-
-                    pte = mfn_to_p2m_entry(page_to_mfn(page), mattr, t);
-
-                    p2m_write_pte(&third[third_table_offset(addr)],
-                                  pte, flush_pt);
-                }
-                break;
-            case INSERT:
-                {
-                    if ( pte.p2m.valid )
-                        p2m_put_page(pte);
-                    pte = mfn_to_p2m_entry(maddr >> PAGE_SHIFT, mattr, t);
-                    p2m_write_pte(&third[third_table_offset(addr)],
-                                  pte, flush_pt);
-                    maddr += PAGE_SIZE;
-                }
-                break;
-            case RELINQUISH:
-            case REMOVE:
-                {
-                    if ( !pte.p2m.valid )
-                    {
-                        count++;
-                        break;
-                    }
-
-                    p2m_put_page(pte);
-
-                    count += 0x10;
-
-                    memset(&pte, 0x00, sizeof(pte));
-                    p2m_write_pte(&third[third_table_offset(addr)],
-                                  pte, flush_pt);
-                    count++;
-                }
-                break;
-
-            case CACHEFLUSH:
-                {
-                    if ( !pte.p2m.valid || !p2m_is_ram(pte.p2m.type) )
-                        break;
-
-                    flush_page_to_ram(pte.p2m.base);
-                }
-                break;
-        }
-
-        /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
-        if ( op == RELINQUISH && count >= 0x2000 )
-        {
-            if ( hypercall_preempt_check() )
-            {
-                p2m->lowest_mapped_gfn = addr >> PAGE_SHIFT;
-                rc = -ERESTART;
-                goto out;
-            }
-            count = 0;
-        }
-
-        /* Got the next page */
-        addr += PAGE_SIZE;
+        rc = apply_one_level(d, &third[third_table_offset(addr)],
+                             3, flush_pt, op,
+                             start_gpaddr, end_gpaddr,
+                             &addr, &maddr, &flush,
+                             mattr, t);
+        if ( rc < 0 ) goto out;
+        /* L3 had better have done something! We cannot descend any further */
+        BUG_ON(rc == P2M_ONE_DESCEND);
+        count += rc;
     }
 
     if ( flush )
@@ -574,7 +803,7 @@ int guest_physmap_add_entry(struct domain *d,
 {
     return apply_p2m_changes(d, INSERT,
                              pfn_to_paddr(gpfn),
-                             pfn_to_paddr(gpfn + (1 << page_order)),
+                             pfn_to_paddr(gpfn + (1 << page_order)) - 1,
                              pfn_to_paddr(mfn), MATTR_MEM, t);
 }
 
@@ -584,7 +813,7 @@ void guest_physmap_remove_page(struct domain *d,
 {
     apply_p2m_changes(d, REMOVE,
                       pfn_to_paddr(gpfn),
-                      pfn_to_paddr(gpfn + (1<<page_order)),
+                      pfn_to_paddr(gpfn + (1<<page_order)) - 1,
                       pfn_to_paddr(mfn), MATTR_MEM, p2m_invalid);
 }
 
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 911d32d..0cfbb09 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -29,6 +29,12 @@ struct p2m_domain {
      * resume the search. Apart from during teardown this can only
      * decrease. */
     unsigned long lowest_mapped_gfn;
+
+    struct {
+        uint32_t mappings[4]; /* Number of mappings at each p2m tree level */
+        uint32_t shattered[4]; /* Number of times we have shattered a mapping
+                                * at each p2m tree level. */
+    } stats;
 };
 
 /* List of possible type for each page in the p2m entry.
@@ -79,6 +85,9 @@ int p2m_alloc_table(struct domain *d);
 void p2m_save_state(struct vcpu *p);
 void p2m_restore_state(struct vcpu *n);
 
+/* Print per-level p2m mapping statistics for a domain. */
+void p2m_dump_info(struct domain *d);
+
 /* Look up the MFN corresponding to a domain's PFN. */
 paddr_t p2m_lookup(struct domain *d, paddr_t gpfn, p2m_type_t *t);