[Xen-devel] xen/arm: p2m: Prevent deadlock when using memaccess

Message ID 20180228152547.12700-1-julien.grall@arm.com
State Accepted
Commit 0012ae8afb4a6e76f2847119f2c6850fbf41d9b7
Headers show
Series
  • [Xen-devel] xen/arm: p2m: Prevent deadlock when using memaccess
Related show

Commit Message

Julien Grall Feb. 28, 2018, 3:25 p.m.
Commit 7d623b358a4 "arm/mem_access: Add long-descriptor based gpt"
assumed the read-write lock can be taken recursively. However, this
assumption is wrong and will lead to deadlock when the lock is
contended.

To avoid the nested lock, rework the locking in get_page_from_gva and
p2m_mem_access_check_and_get_page. The latter will now be called without
the p2m lock. The new locking in p2m_mem_accces_check_and_get_page will
not cover the translation of the VA to an IPA.

This is fine because we can't promise that the stage-1 page-table have
changed behind our back (they are under guest control). Modification in
the stage-2 page-table can now happen, but I can't issue any potential
issue here except with the break-before-make sequence used when updating
page-table. gva_to_ipa may fail if the sequence is executed at the same
on another CPU. In that case we would fallback in the software lookup
path.

Signed-off-by: Julien Grall <julien.grall@arm.com>

---
    This patch should be backported to Xen 4.10. There are other
    potential optimization that I am working on. Although, I don't think
    they are backport material.
---
 xen/arch/arm/mem_access.c | 8 ++++++--
 xen/arch/arm/p2m.c        | 4 ++--
 xen/include/asm-arm/p2m.h | 4 ----
 3 files changed, 8 insertions(+), 8 deletions(-)

Comments

Stefano Stabellini March 2, 2018, 11:42 p.m. | #1
On Wed, 28 Feb 2018, Julien Grall wrote:
> Commit 7d623b358a4 "arm/mem_access: Add long-descriptor based gpt"
> assumed the read-write lock can be taken recursively. However, this
> assumption is wrong and will lead to deadlock when the lock is
> contended.
> 
> To avoid the nested lock, rework the locking in get_page_from_gva and
> p2m_mem_access_check_and_get_page. The latter will now be called without
> the p2m lock. The new locking in p2m_mem_accces_check_and_get_page will
> not cover the translation of the VA to an IPA.
> 
> This is fine because we can't promise that the stage-1 page-table have
> changed behind our back (they are under guest control). Modification in
> the stage-2 page-table can now happen, but I can't issue any potential
> issue here except with the break-before-make sequence used when updating
> page-table. gva_to_ipa may fail if the sequence is executed at the same
> on another CPU. In that case we would fallback in the software lookup
> path.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>


Hi Julien,

At first glance the patch looks OK, but could you please add more
information on the recursive lock sequence that this patch is trying to
solve? Specifically, how Xen can get into a situation where it tries to
acquire the p2m lock twice recursively? I realize the comment at the
bottom says "access_guest_memory_by_ipa in guest_walk_(sd|ld)" but I
would appreciate more details. Thanks!


> ---
>     This patch should be backported to Xen 4.10. There are other
>     potential optimization that I am working on. Although, I don't think
>     they are backport material.
> ---
>  xen/arch/arm/mem_access.c | 8 ++++++--
>  xen/arch/arm/p2m.c        | 4 ++--
>  xen/include/asm-arm/p2m.h | 4 ----
>  3 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
> index 0f2cbb81d3..11c2b03b7b 100644
> --- a/xen/arch/arm/mem_access.c
> +++ b/xen/arch/arm/mem_access.c
> @@ -126,7 +126,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>           * is not mapped.
>           */
>          if ( guest_walk_tables(v, gva, &ipa, &perms) < 0 )
> -            goto err;
> +            return NULL;
>  
>          /*
>           * Check permissions that are assumed by the caller. For instance in
> @@ -139,11 +139,13 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>           * test for execute permissions this check can be left out.
>           */
>          if ( (flag & GV2M_WRITE) && !(perms & GV2M_WRITE) )
> -            goto err;
> +            return NULL;
>      }
>  
>      gfn = gaddr_to_gfn(ipa);
>  
> +    p2m_read_lock(p2m);
> +
>      /*
>       * We do this first as this is faster in the default case when no
>       * permission is set on the page.
> @@ -216,6 +218,8 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>          page = NULL;
>  
>  err:
> +    p2m_read_unlock(p2m);
> +
>      return page;
>  }
>  
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 65e8b9c6ea..5de82aafe1 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1449,11 +1449,11 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>      }
>  
>  err:
> +    p2m_read_unlock(p2m);
> +
>      if ( !page && p2m->mem_access_enabled )
>          page = p2m_mem_access_check_and_get_page(va, flags, v);
>  
> -    p2m_read_unlock(p2m);
> -
>      return page;
>  }
>  
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index a0abc84ed8..45ef2cd58b 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -23,10 +23,6 @@ extern void memory_type_changed(struct domain *);
>  struct p2m_domain {
>      /*
>       * Lock that protects updates to the p2m.
> -     *
> -     * Please note that we use this lock in a nested way by calling
> -     * access_guest_memory_by_ipa in guest_walk_(sd|ld). This must be
> -     * considered in the future implementation.
>       */
>      rwlock_t lock;
>  
> -- 
> 2.11.0
>
Sergej Proskurin March 6, 2018, 11:06 a.m. | #2
Hi Julien,


On 02/28/2018 04:25 PM, Julien Grall wrote:
> Commit 7d623b358a4 "arm/mem_access: Add long-descriptor based gpt"
> assumed the read-write lock can be taken recursively. However, this
> assumption is wrong and will lead to deadlock when the lock is
> contended.
>
> To avoid the nested lock, rework the locking in get_page_from_gva and
> p2m_mem_access_check_and_get_page. The latter will now be called without
> the p2m lock. The new locking in p2m_mem_accces_check_and_get_page will
> not cover the translation of the VA to an IPA.
>
> This is fine because we can't promise that the stage-1 page-table have
> changed behind our back (they are under guest control). Modification in
> the stage-2 page-table can now happen, but I can't issue any potential
> issue here except with the break-before-make sequence used when updating
> page-table. gva_to_ipa may fail if the sequence is executed at the same
> on another CPU. In that case we would fallback in the software lookup
> path.
>
> Signed-off-by: Julien Grall <julien.grall@arm.com>

The patch looks good to me. However, I did not understand the following
sentence in the commit message: "... but I can't issue any potential
issue here except with the break-before-make sequence used when updating
page-table". What issue did you mean? Thank you.

Reviewed-by: Sergej Proskurin <proskurin@sec.in.tum.de>

Thanks,
~Sergej

> ---
>     This patch should be backported to Xen 4.10. There are other
>     potential optimization that I am working on. Although, I don't think
>     they are backport material.
> ---
>  xen/arch/arm/mem_access.c | 8 ++++++--
>  xen/arch/arm/p2m.c        | 4 ++--
>  xen/include/asm-arm/p2m.h | 4 ----
>  3 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
> index 0f2cbb81d3..11c2b03b7b 100644
> --- a/xen/arch/arm/mem_access.c
> +++ b/xen/arch/arm/mem_access.c
> @@ -126,7 +126,7 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>           * is not mapped.
>           */
>          if ( guest_walk_tables(v, gva, &ipa, &perms) < 0 )
> -            goto err;
> +            return NULL;
>  
>          /*
>           * Check permissions that are assumed by the caller. For instance in
> @@ -139,11 +139,13 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>           * test for execute permissions this check can be left out.
>           */
>          if ( (flag & GV2M_WRITE) && !(perms & GV2M_WRITE) )
> -            goto err;
> +            return NULL;
>      }
>  
>      gfn = gaddr_to_gfn(ipa);
>  
> +    p2m_read_lock(p2m);
> +
>      /*
>       * We do this first as this is faster in the default case when no
>       * permission is set on the page.
> @@ -216,6 +218,8 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>          page = NULL;
>  
>  err:
> +    p2m_read_unlock(p2m);
> +
>      return page;
>  }
>  
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 65e8b9c6ea..5de82aafe1 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1449,11 +1449,11 @@ struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
>      }
>  
>  err:
> +    p2m_read_unlock(p2m);
> +
>      if ( !page && p2m->mem_access_enabled )
>          page = p2m_mem_access_check_and_get_page(va, flags, v);
>  
> -    p2m_read_unlock(p2m);
> -
>      return page;
>  }
>  
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index a0abc84ed8..45ef2cd58b 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -23,10 +23,6 @@ extern void memory_type_changed(struct domain *);
>  struct p2m_domain {
>      /*
>       * Lock that protects updates to the p2m.
> -     *
> -     * Please note that we use this lock in a nested way by calling
> -     * access_guest_memory_by_ipa in guest_walk_(sd|ld). This must be
> -     * considered in the future implementation.
>       */
>      rwlock_t lock;
>
Julien Grall March 6, 2018, 11:17 a.m. | #3
Hi Stefano,

On 02/03/18 23:42, Stefano Stabellini wrote:
> On Wed, 28 Feb 2018, Julien Grall wrote:
>> Commit 7d623b358a4 "arm/mem_access: Add long-descriptor based gpt"
>> assumed the read-write lock can be taken recursively. However, this
>> assumption is wrong and will lead to deadlock when the lock is
>> contended.
>>
>> To avoid the nested lock, rework the locking in get_page_from_gva and
>> p2m_mem_access_check_and_get_page. The latter will now be called without
>> the p2m lock. The new locking in p2m_mem_accces_check_and_get_page will
>> not cover the translation of the VA to an IPA.
>>
>> This is fine because we can't promise that the stage-1 page-table have
>> changed behind our back (they are under guest control). Modification in
>> the stage-2 page-table can now happen, but I can't issue any potential
>> issue here except with the break-before-make sequence used when updating
>> page-table. gva_to_ipa may fail if the sequence is executed at the same
>> on another CPU. In that case we would fallback in the software lookup
>> path.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> 
> Hi Julien,
> 
> At first glance the patch looks OK, but could you please add more
> information on the recursive lock sequence that this patch is trying to
> solve? Specifically, how Xen can get into a situation where it tries to
> acquire the p2m lock twice recursively? I realize the comment at the
> bottom says "access_guest_memory_by_ipa in guest_walk_(sd|ld)" but I
> would appreciate more details. Thanks!

Good point. I will resend it with an updated commit message.

Cheers,
Julien Grall March 6, 2018, 11:37 a.m. | #4
On 06/03/18 11:06, Sergej Proskurin wrote:
> Hi Julien,

Hi Sergej,

> 
> On 02/28/2018 04:25 PM, Julien Grall wrote:
>> Commit 7d623b358a4 "arm/mem_access: Add long-descriptor based gpt"
>> assumed the read-write lock can be taken recursively. However, this
>> assumption is wrong and will lead to deadlock when the lock is
>> contended.
>>
>> To avoid the nested lock, rework the locking in get_page_from_gva and
>> p2m_mem_access_check_and_get_page. The latter will now be called without
>> the p2m lock. The new locking in p2m_mem_accces_check_and_get_page will
>> not cover the translation of the VA to an IPA.
>>
>> This is fine because we can't promise that the stage-1 page-table have
>> changed behind our back (they are under guest control). Modification in
>> the stage-2 page-table can now happen, but I can't issue any potential
>> issue here except with the break-before-make sequence used when updating
>> page-table. gva_to_ipa may fail if the sequence is executed at the same
>> on another CPU. In that case we would fallback in the software lookup
>> path.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
> 
> The patch looks good to me. However, I did not understand the following
> sentence in the commit message: "... but I can't issue any potential
> issue here except with the break-before-make sequence used when updating
> page-table". What issue did you mean? Thank you.

To avoid breaking the coherency when updating the page-table the Arm Arm 
recommends to follow a break-before-make sequence (see G4-4916 in ARM 
DDI 0487C.a). It a 4 steps approach:

1. Replace the old entry with an invalid entry
2. Invalidate the cached old entries with a broadcasting TLB 
invalidation instruction
3. Wait for the completion of the TLB instruction with a dsb followed by 
an isb
4. Write the new entry

The P2M code is using this sequence, so an address translation walk 
using hardware instruction may fail if another CPU happen to update (e.g 
breaking a superpage) the P2M entry at the same time/ This is because 
there are a small window where the entry does not exist.

If you are interested in more details, you can look at the talk I gave a 
couple of years ago (see [1]).

> 
> Reviewed-by: Sergej Proskurin <proskurin@sec.in.tum.de>

Thank you.

Cheers,

[1] https://schd.ws/hosted_files/xensummit2016/49/slides.pdf
Sergej Proskurin March 6, 2018, 11:51 a.m. | #5
Hi Julien,


On 03/06/2018 12:37 PM, Julien Grall wrote:
> On 06/03/18 11:06, Sergej Proskurin wrote:
>> Hi Julien,
>
> Hi Sergej,
>
>>
>> On 02/28/2018 04:25 PM, Julien Grall wrote:
>>> Commit 7d623b358a4 "arm/mem_access: Add long-descriptor based gpt"
>>> assumed the read-write lock can be taken recursively. However, this
>>> assumption is wrong and will lead to deadlock when the lock is
>>> contended.
>>>
>>> To avoid the nested lock, rework the locking in get_page_from_gva and
>>> p2m_mem_access_check_and_get_page. The latter will now be called
>>> without
>>> the p2m lock. The new locking in p2m_mem_accces_check_and_get_page will
>>> not cover the translation of the VA to an IPA.
>>>
>>> This is fine because we can't promise that the stage-1 page-table have
>>> changed behind our back (they are under guest control). Modification in
>>> the stage-2 page-table can now happen, but I can't issue any potential
>>> issue here except with the break-before-make sequence used when
>>> updating
>>> page-table. gva_to_ipa may fail if the sequence is executed at the same
>>> on another CPU. In that case we would fallback in the software lookup
>>> path.
>>>
>>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>>
>> The patch looks good to me. However, I did not understand the following
>> sentence in the commit message: "... but I can't issue any potential
>> issue here except with the break-before-make sequence used when updating
>> page-table". What issue did you mean? Thank you.
>
> To avoid breaking the coherency when updating the page-table the Arm
> Arm recommends to follow a break-before-make sequence (see G4-4916 in
> ARM DDI 0487C.a). It a 4 steps approach:
>
> 1. Replace the old entry with an invalid entry
> 2. Invalidate the cached old entries with a broadcasting TLB
> invalidation instruction
> 3. Wait for the completion of the TLB instruction with a dsb followed
> by an isb
> 4. Write the new entry
>
> The P2M code is using this sequence, so an address translation walk
> using hardware instruction may fail if another CPU happen to update
> (e.g breaking a superpage) the P2M entry at the same time/ This is
> because there are a small window where the entry does not exist.

Makes sense. Thank you.

Cheers,
~Sergej

>
> If you are interested in more details, you can look at the talk I gave
> a couple of years ago (see [1]).
>
>>
>> Reviewed-by: Sergej Proskurin <proskurin@sec.in.tum.de>
>
> Thank you.
>
> Cheers,
>
> [1] https://schd.ws/hosted_files/xensummit2016/49/slides.pdf
>

Patch

diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index 0f2cbb81d3..11c2b03b7b 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -126,7 +126,7 @@  p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
          * is not mapped.
          */
         if ( guest_walk_tables(v, gva, &ipa, &perms) < 0 )
-            goto err;
+            return NULL;
 
         /*
          * Check permissions that are assumed by the caller. For instance in
@@ -139,11 +139,13 @@  p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
          * test for execute permissions this check can be left out.
          */
         if ( (flag & GV2M_WRITE) && !(perms & GV2M_WRITE) )
-            goto err;
+            return NULL;
     }
 
     gfn = gaddr_to_gfn(ipa);
 
+    p2m_read_lock(p2m);
+
     /*
      * We do this first as this is faster in the default case when no
      * permission is set on the page.
@@ -216,6 +218,8 @@  p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
         page = NULL;
 
 err:
+    p2m_read_unlock(p2m);
+
     return page;
 }
 
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 65e8b9c6ea..5de82aafe1 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1449,11 +1449,11 @@  struct page_info *get_page_from_gva(struct vcpu *v, vaddr_t va,
     }
 
 err:
+    p2m_read_unlock(p2m);
+
     if ( !page && p2m->mem_access_enabled )
         page = p2m_mem_access_check_and_get_page(va, flags, v);
 
-    p2m_read_unlock(p2m);
-
     return page;
 }
 
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index a0abc84ed8..45ef2cd58b 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -23,10 +23,6 @@  extern void memory_type_changed(struct domain *);
 struct p2m_domain {
     /*
      * Lock that protects updates to the p2m.
-     *
-     * Please note that we use this lock in a nested way by calling
-     * access_guest_memory_by_ipa in guest_walk_(sd|ld). This must be
-     * considered in the future implementation.
      */
     rwlock_t lock;