
[Xen-devel,for-4.7,2/2] xen/arm: p2m: Release the p2m lock before undoing the mappings

Message ID 1463407735-7409-3-git-send-email-julien.grall@arm.com
State Superseded

Commit Message

Julien Grall May 16, 2016, 2:08 p.m. UTC
Since commit 4b25423a "arch/arm: unmap partially-mapped memory regions",
Xen has been undoing the P2M mappings when an error occurred during
insertion or memory allocation.

This is done by calling apply_p2m_changes recursively; however, the
second call is made with the p2m lock already held, which results in a
deadlock on the current processor.

Fix it by moving the recursive call after the lock has been released.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    I think we could unlock the p2m lock before freeing the temporary
    mapping. However, I played it safe as this is a bug fix for Xen 4.7
    that needs to be backported up to Xen 4.5.
---
 xen/arch/arm/p2m.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
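
For readers unfamiliar with the failure mode, below is a minimal
standalone sketch of the deadlock this patch fixes. It is not the Xen
code: the spinlock, the operation values and the function body are
simplified stand-ins. The only point it illustrates is that re-entering
the function while the (non-recursive) p2m lock is still held spins
forever on the current processor.

    #include <stdatomic.h>
    #include <stdio.h>

    /* Simplified non-recursive spinlock, standing in for Xen's spinlock_t. */
    typedef struct { atomic_flag held; } spinlock_t;

    static void spin_lock(spinlock_t *l)
    {
        while ( atomic_flag_test_and_set(&l->held) )
            ;                  /* spins forever if this CPU already holds it */
    }

    static void spin_unlock(spinlock_t *l)
    {
        atomic_flag_clear(&l->held);
    }

    enum p2m_op { INSERT, REMOVE };

    static spinlock_t p2m_lock = { ATOMIC_FLAG_INIT };

    /* Hypothetical outline of the pre-patch control flow. */
    static int apply_p2m_changes(enum p2m_op op)
    {
        int rc = (op == INSERT) ? -1 : 0;  /* pretend the insertion failed */

        spin_lock(&p2m_lock);

        /* ... walk and modify the page tables, failing part way through ... */

        if ( rc < 0 && op == INSERT )
            apply_p2m_changes(REMOVE);     /* BUG: re-enters spin_lock() above
                                              with p2m_lock still held */

        spin_unlock(&p2m_lock);
        return rc;
    }

    int main(void)
    {
        puts("calling apply_p2m_changes(INSERT); the undo call will hang");
        apply_p2m_changes(INSERT);
        puts("never reached");
        return 0;
    }

The patch below avoids this by unmapping the temporary mappings and
releasing p2m->lock before making the undo call.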

Comments

Julien Grall May 17, 2016, 11:35 a.m. UTC | #1
Hi Stefano and Wei,

On 17/05/16 12:24, Stefano Stabellini wrote:
> I think you are right. Especially with backports in mind, it would be
> better to introduce an __apply_p2m_changes function which assumes that
> the p2m lock has already been taken by the caller. Then you can base the
> implementation of apply_p2m_changes on it.
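
[ For illustration only: a rough standalone sketch of one way the split
suggested above could look. The types, names and signatures below are
simplified stand-ins rather than the actual Xen code; the point is just
that the undo could then be done via the lock-assumed helper, while the
lock is still held, so the whole change stays under one acquisition. ]

    /* Simplified stand-ins for the real Xen types; illustrative only. */
    typedef struct { int lock; } p2m_t;
    enum p2m_operation { INSERT, ALLOCATE, REMOVE };

    static void spin_lock(int *l)   { *l = 1; }  /* placeholder locking */
    static void spin_unlock(int *l) { *l = 0; }

    /* Does the real work; assumes p2m->lock is already held by the caller. */
    static int __apply_p2m_changes(p2m_t *p2m, enum p2m_operation op)
    {
        /* ... walk and modify the page tables ... */
        return (op == INSERT) ? -1 : 0;          /* pretend an INSERT fails */
    }

    /* Entry point: takes the lock and undoes a partial mapping while the
     * lock is still held. */
    static int apply_p2m_changes(p2m_t *p2m, enum p2m_operation op)
    {
        int rc;

        spin_lock(&p2m->lock);

        rc = __apply_p2m_changes(p2m, op);
        if ( rc < 0 && (op == INSERT || op == ALLOCATE) )
            __apply_p2m_changes(p2m, REMOVE);

        spin_unlock(&p2m->lock);

        return rc;
    }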

> On Tue, 17 May 2016, Wei Chen wrote:
>> Hi Julien,
>>
>> I have some concern about this patch, because we release the spinlock
>> before removing the mapped memory. If somebody acquires the spinlock
>> before we remove the mapped memory, this mapped memory region can be
>> accessed by the guest.
>>
>> apply_p2m_changes is no longer atomic. Is it a security risk?

Accesses to the page tables have never been atomic: as soon as an entry
is written in the page tables, the guest vCPUs or a prefetcher could
read it.

The spinlock is only here to protect the page tables against concurrent 
modifications. Releasing the lock is not an issue as Xen does not 
promise any ordering for the p2m changes.

Regards,
Julien Grall May 17, 2016, 12:48 p.m. UTC | #2
Hi Stefano,

On 17/05/16 13:27, Stefano Stabellini wrote:
> On Tue, 17 May 2016, Julien Grall wrote:
>> On 17/05/16 12:24, Stefano Stabellini wrote:
>>> I think you are right. Especially with backports in mind, it would be
>>> better to introduce an __apply_p2m_changes function which assumes that
>>> the p2m lock has already been taken by the caller. Then you can base the
>>> implementation of apply_p2m_changes on it.
>>
>>> On Tue, 17 May 2016, Wei Chen wrote:
>>>> Hi Julien,
>>>>
>>>> I have some concern about this patch, because we release the spinlock
>>>> before removing the mapped memory. If somebody acquires the spinlock
>>>> before we remove the mapped memory, this mapped memory region can be
>>>> accessed by the guest.
>>>>
>>>> apply_p2m_changes is no longer atomic. Is it a security risk?
>>
>> Accesses to the page tables have never been atomic: as soon as an entry is
>> written in the page tables, the guest vCPUs or a prefetcher could read it.
>>
>> The spinlock is only here to protect the page tables against concurrent
>> modifications. Releasing the lock is not an issue as Xen does not promise any
>> ordering for the p2m changes.
>
> I understand that. However, I am wondering whether it might be possible
> for the guest to run commands which cause concurrent p2m change requests
> on purpose, inserting something else between the first phase and the
> second phase of apply_p2m_changes, causing problems for the hypervisor.

Removing and inserting entries are 2 distinct steps.

> Or maybe not even on purpose, but causing problem to itself nonetheless.

Each vCPU can only trigger one command at a time, so concurrent p2m
changes would involve 2 vCPUs.

Even if vCPU A sends the command before vCPU B, nothing prevents Xen
from serving B before A.

The only way a guest could harm itself would be to have the 2 requests
modify the same regions in the page tables. However, as explained above,
this behavior is undefined regardless of the implementation of
apply_p2m_changes.

> Honestly, it is true that it doesn't look like Xen could run into
> trouble. But still, this is a change in behaviour compared to the
> current code (which I know doesn't actually work) and I wanted to flag
> it.

This code has always been buggy, and I suspect the intention was to make
the recursive call without the lock held.

There is no reason to hold the lock longer than necessary. Releasing the
lock allows other p2m changes to be executed rather than spinning while
the long operation (INSERTION + REMOVAL) is carried out.

Regards,
Julien Grall May 18, 2016, 10:31 a.m. UTC | #3
Hi Stefano,

On 18/05/2016 11:10, Stefano Stabellini wrote:
> All right, so the use case that should have worked before (but didn't
> because the implementation was broken) and wouldn't work anymore with
> this patch is the following:
>
> - vcpu1 asks region1 to be mapped at gpaddrA
>   the mapping fails at least partially
> - vcpu2 asks region2 to be mapped at gpaddrA
>   the mapping succeeds
>
> This doesn't work anymore because the second mapping could be done in
> between the two halves of the first mapping, resulting in a partially
> mapped region2.

If we leave aside the buggy code, this use case has never worked and
will never work. Xen might handle the request from vCPU2 before vCPU1.

It is not because of the implementation, but because the pCPU running 
vCPU1 may be slower or receive an interrupt...

So the use case you described is already unpredictable. The guest has to
use an internal lock if it wants the request from vCPU2 to be handled
after the one from vCPU1. And in that case, it is not possible for
vCPU2 to interfere in the middle of the vCPU1 transaction.

Regards,
Julien Grall May 18, 2016, 10:45 a.m. UTC | #4
On 18/05/2016 11:10, Stefano Stabellini wrote:
> I realize that this is an unimportant case and not worth supporting. I,
> for one, would prefer not to have to think about implementation halves
> of apply_p2m_changes going forward so I would prefer a different patch.
> That said, I still retract my comment and leave it up to you. If you
> would like to change this patch, I'll be happy to review v2, otherwise,
> if you prefer to keep it as is, let me know and I'll commit this
> version.

I forgot to reply to this part. I agree that the resulting code might be
confusing. However, today the lock is held for a very long time (TLBs
are flushed in the middle, memory is managed, ...), which may starve
other vCPUs on a big platform.

This time would be doubled in case of failure when inserting a new mapping.

FWIW, I am planning to rework the page table code for the next release
to simplify it and to handle break-before-make (recommended by the
ARM ARM) more easily.

Regards,

Patch

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 68c67b0..838d004 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1184,6 +1184,14 @@  out:
     while ( (pg = page_list_remove_head(&free_pages)) )
         free_domheap_page(pg);
 
+    for ( level = P2M_ROOT_LEVEL; level < 4; level ++ )
+    {
+        if ( mappings[level] )
+            unmap_domain_page(mappings[level]);
+    }
+
+    spin_unlock(&p2m->lock);
+
     if ( rc < 0 && ( op == INSERT || op == ALLOCATE ) &&
          addr != start_gpaddr )
     {
@@ -1196,14 +1204,6 @@  out:
                           mattr, 0, p2m_invalid, d->arch.p2m.default_access);
     }
 
-    for ( level = P2M_ROOT_LEVEL; level < 4; level ++ )
-    {
-        if ( mappings[level] )
-            unmap_domain_page(mappings[level]);
-    }
-
-    spin_unlock(&p2m->lock);
-
     return rc;
 }