
xen/arm: p2m: Correctly flush TLB in create_p2m_entries

Message ID 1389200759-22177-1-git-send-email-julien.grall@linaro.org
State Superseded, archived

Commit Message

Julien Grall Jan. 8, 2014, 5:05 p.m. UTC
The p2m is shared between the VCPUs of each domain. Currently Xen only flushes
the TLB on the local PCPU. This could result in a mismatch between the mappings
in the p2m and the TLBs.

Flush TLBs used by this domain on every PCPU.

Signed-off-by: Julien Grall <julien.grall@linaro.org>

---

This is a possible bug fix (found by reading the code) for Xen 4.4. I have
added a small optimisation to avoid flushing all TLBs when a VCPU of this
domain is running on the current cpu.

The downside of this patch is that the function can be a little slower because
Xen is flushing more TLBs.
---
 xen/arch/arm/p2m.c                   |    7 ++++++-
 xen/include/asm-arm/arm32/flushtlb.h |    6 +++---
 xen/include/asm-arm/arm64/flushtlb.h |    6 +++---
 3 files changed, 12 insertions(+), 7 deletions(-)

Comments

Stefano Stabellini Jan. 8, 2014, 5:13 p.m. UTC | #1
On Wed, 8 Jan 2014, Julien Grall wrote:
> The p2m is shared between the VCPUs of each domain. Currently Xen only flushes
> the TLB on the local PCPU. This could result in a mismatch between the mappings
> in the p2m and the TLBs.
> 
> Flush TLBs used by this domain on every PCPU.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>

The fix makes sense to me.

> ---
> 
> This is a possible bug fix (found by reading the code) for Xen 4.4. I have
> added a small optimisation to avoid flushing all TLBs when a VCPU of this
> domain is running on the current cpu.
> 
> The downside of this patch is that the function can be a little slower because
> Xen is flushing more TLBs.

Yes, I wonder how much slower it is going to be, considering that the flush
is executed for every iteration of the loop.


>  xen/arch/arm/p2m.c                   |    7 ++++++-
>  xen/include/asm-arm/arm32/flushtlb.h |    6 +++---
>  xen/include/asm-arm/arm64/flushtlb.h |    6 +++---
>  3 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 11f4714..9ab0378 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -374,7 +374,12 @@ static int create_p2m_entries(struct domain *d,
>          }
>  
>          if ( flush )
> -            flush_tlb_all_local();
> +        {
> +            if ( current->domain == d )
> +                flush_tlb();
> +            else
> +                flush_tlb_all();
> +        }
>  
>          /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
>          if ( op == RELINQUISH && count >= 0x2000 )
> diff --git a/xen/include/asm-arm/arm32/flushtlb.h b/xen/include/asm-arm/arm32/flushtlb.h
> index ab166f3..6ff6f75 100644
> --- a/xen/include/asm-arm/arm32/flushtlb.h
> +++ b/xen/include/asm-arm/arm32/flushtlb.h
> @@ -23,12 +23,12 @@ static inline void flush_tlb(void)
>      isb();
>  }
>  
> -/* Flush local TLBs, all VMIDs, non-hypervisor mode */
> -static inline void flush_tlb_all_local(void)
> +/* Flush inner shareable TLBs, all VMIDs, non-hypervisor mode */
> +static inline void flush_tlb_all(void)
>  {
>      dsb();
>  
> -    WRITE_CP32((uint32_t) 0, TLBIALLNSNH);
> +    WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS);
>  
>      dsb();
>      isb();
> diff --git a/xen/include/asm-arm/arm64/flushtlb.h b/xen/include/asm-arm/arm64/flushtlb.h
> index 9ce79a8..687eda1 100644
> --- a/xen/include/asm-arm/arm64/flushtlb.h
> +++ b/xen/include/asm-arm/arm64/flushtlb.h
> @@ -23,12 +23,12 @@ static inline void flush_tlb(void)
>          : : : "memory");
>  }
>  
> -/* Flush local TLBs, all VMIDs, non-hypervisor mode */
> -static inline void flush_tlb_all_local(void)
> +/* Flush inner shareable TLBs, all VMIDs, non-hypervisor mode */
> +static inline void flush_tlb_all(void)
>  {
>      asm volatile(
>          "dsb sy;"
> -        "tlbi alle1;"
> +        "tlbi alle1is;"
>          "dsb sy;"
>          "isb;"
>          : : : "memory");
> -- 
> 1.7.10.4
>
Ian Campbell Jan. 9, 2014, 10:59 a.m. UTC | #2
On Wed, 2014-01-08 at 17:13 +0000, Stefano Stabellini wrote:
> On Wed, 8 Jan 2014, Julien Grall wrote:
> > The p2m is shared between the VCPUs of each domain. Currently Xen only flushes
> > the TLB on the local PCPU. This could result in a mismatch between the mappings
> > in the p2m and the TLBs.
> > 
> > Flush TLBs used by this domain on every PCPU.
> > 
> > Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> The fix makes sense to me.

Me too. Has anyone had a grep for similar issues?

(the reason for getting this wrong is that for cache flushes the "is"
suffix restricts the flush to the IS domain, whereas with tlb flushes
the "is" suffix broadcasts instead of keeping it local, which is a bit
confusing ;-))
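
To make that concrete on arm32 (an illustration only; the encodings are the
TLBIALLNSNH/TLBIALLNSNHIS names from xen/include/asm-arm/cpregs.h):

    /* Both flush non-secure, non-hypervisor TLB entries for all VMIDs;
     * only their reach differs. */
    WRITE_CP32((uint32_t) 0, TLBIALLNSNH);   /* local PCPU only */
    WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS); /* broadcast to every PCPU in
                                                the Inner Shareable domain */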

> 
> > ---
> > 
> > This is a possible bug fix (found by reading the code) for Xen 4.4. I have
> > added a small optimisation to avoid flushing all TLBs when a VCPU of this
> > domain is running on the current cpu.
> > 
> > The downside of this patch is that the function can be a little slower because
> > Xen is flushing more TLBs.
> 
> Yes, I wonder how much slower it is going to be, considering that the flush
> is executed for every iteration of the loop.

It might be better to set the current VMID to the target domain for the
duration of this function; we'd still need the broadcast, but at least we
wouldn't be killing unrelated VMIDs.
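
A minimal sketch of that idea (p2m_load_VTTBR() is the static helper in
xen/arch/arm/p2m.c, so reusing it here, and the wrapper name, are my
assumptions; this is not the committed fix):

    static void flush_domain_tlb(struct domain *d)
    {
        struct domain *curr_d = current->domain;

        if ( curr_d != d )
            p2m_load_VTTBR(d);      /* switch VTTBR, i.e. the active VMID */

        flush_tlb();                /* inner-shareable flush, current VMID only */

        if ( curr_d != d )
            p2m_load_VTTBR(curr_d); /* restore the running domain's VMID */
    }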

Pulling the flush out of the loop would require great care WRT accesses
from other VCPUs, e.g. you'd have to put the pages on a list (page->list
might be available?) and issue the put_page() after the flush, otherwise
it might get recycled into another domain while the first domain still
has TLB entries for it.
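
The shape could be something like this (a sketch that assumes page->list
really is free for this use, per the parenthetical above; not tested):

    struct page_info *pg;
    PAGE_LIST_HEAD(deferred);

    /* In the REMOVE loop, defer the reference drop instead of put_page(): */
    page_list_add_tail(pg, &deferred);

    /* After the loop, flush once, and only then free: no PCPU can still
     * hold a stale TLB entry for a page that gets recycled. */
    flush_tlb();
    while ( (pg = page_list_remove_head(&deferred)) != NULL )
        put_page(pg);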

Or is there always something outside this function which holds another
ref count such that the page definitely won't be freed by the put_page
here?

Actually, does the existing code not have this issue already? The put_page
is before the flush. If this bug does exist now then I'd be inclined to
consider this a bug fix for 4.4, rather than a potential optimisation
for 4.5.

While looking at this function I'm now wondering what happens to the
existing page on ALLOCATE or INSERT, is it leaked?

Ian.
Julien Grall Jan. 9, 2014, 1:14 p.m. UTC | #3
On 01/09/2014 10:59 AM, Ian Campbell wrote:
> On Wed, 2014-01-08 at 17:13 +0000, Stefano Stabellini wrote:
>> On Wed, 8 Jan 2014, Julien Grall wrote:
>>> The p2m is shared between the VCPUs of each domain. Currently Xen only flushes
>>> the TLB on the local PCPU. This could result in a mismatch between the mappings
>>> in the p2m and the TLBs.
>>>
>>> Flush TLBs used by this domain on every PCPU.
>>>
>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>>
>> The fix makes sense to me.
>
> Me too. Has anyone had a grep for similar issues?

I think flush_xen_data_tlb and flush_xen_text_tlb should also be
inner-shareable.

The former one is used by flush_tlb_mask. But ... this function seems
badly implemented: it's weird to use flush_xen_data_tlb because we are
mainly using flush_tlb_mask in common/grant_table.c. Any ideas?


> (the reason for getting this wrong is that for cache flushes the "is"
> suffix restricts the flush to the IS domain, whereas with tlb flushes
> the "is" suffix broadcasts instead of keeping it local, which is a bit
> confusing ;-))
>
>>
>>> ---
>>>
>>> This is a possible bug fix (found by reading the code) for Xen 4.4. I have
>>> added a small optimisation to avoid flushing all TLBs when a VCPU of this
>>> domain is running on the current cpu.
>>>
>>> The downside of this patch is that the function can be a little slower because
>>> Xen is flushing more TLBs.
>>
>> Yes, I wonder how much slower it is going to be, considering that the flush
>> is executed for every iteration of the loop.
>
> It might be better to set the current VMID to the target domain for the
> duration of this function; we'd still need the broadcast, but at least we
> wouldn't be killing unrelated VMIDs.

I can modify the patch to handle that.

>
> Pulling the flush out of the loop would require great care WRT accesses
> from other VCPUs, e.g. you'd have to put the pages on a list (page->list
> might be available?) and issue the put_page() after the flush, otherwise
> it might get recycled into another domain while the first domain still
> has TLB entries for it.
>
> Or is there always something outside this function which holds another
> ref count such that the page definitely won't be freed by the put_page
> here?
 >
> Actually, does the existing code not have this issue already? The put_page
> is before the flush. If this bug does exist now then I'd be inclined to
> consider this a bug fix for 4.4, rather than a potential optimisation
> for 4.5.

For now we don't take a reference when we map/unmap a page. Most of the
time create_p2m_entries is called by common code which takes care of
holding a reference when this function is called. So we should be safe.

I would prefer to wait for 4.5 for this optimisation (moving the flush
outside the loop).

> While looking at this function I'm now wondering what happens to the
> existing page on ALLOCATE or INSERT, is it leaked?

For ALLOCATE, the page is on the domain page list, so it will be
freed during relinquish.

For INSERT, except for foreign mappings, we don't take a refcount for the
mapping. So the only issue could be with foreign mappings.
Ian Campbell Jan. 9, 2014, 2:05 p.m. UTC | #4
On Thu, 2014-01-09 at 13:14 +0000, Julien Grall wrote:
> 
> On 01/09/2014 10:59 AM, Ian Campbell wrote:
> > On Wed, 2014-01-08 at 17:13 +0000, Stefano Stabellini wrote:
> >> On Wed, 8 Jan 2014, Julien Grall wrote:
> >>> The p2m is shared between the VCPUs of each domain. Currently Xen only flushes
> >>> the TLB on the local PCPU. This could result in a mismatch between the mappings
> >>> in the p2m and the TLBs.
> >>>
> >>> Flush TLBs used by this domain on every PCPU.
> >>>
> >>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> >>
> >> The fix makes sense to me.
> >
> > Me too. Has anyone had a grep for similar issues?
> 
> I think flush_xen_data_tlb and flush_xen_text_tlb should also be
> inner-shareable.

I think text_tlb is ok; it's only used at start of day. The usages in
setup_pagetables and mmu_init_secondary_cpus are both explicitly local, I
think. Perhaps it should be renamed _local.

The case in set_pte_flags_on_range via free_init_memory I'm less sure
about, but I don't think stale TLB entries are actually an issue here,
since there will certainly be one when the page becomes used again. But
maybe it would be safest to make it global.

> The former one is used by flush_tlb_mask.

Yes, the comment there is just wrong. I think this was my doing based on
the confusion I mentioned before.

We need to be careful not to change the (un)map_domain_page mappings, since
those are not shared between processors; I don't think this change would do
that.

>  But ... this function seems 
> badly implemented: it's weird to use flush_xen_data_tlb because we are
> mainly using flush_tlb_mask in common/grant_table.c. Any ideas?

Do you mean that this should be flushing the guest TLBs and not Xen's?
That does seem right... We actually need to be flushing for all VMIDs
too I think -- for the alloc_heap_pages case.

Ian.

> 
> 
> > (the reason for getting this wrong is that for cache flushes the "is"
> > suffix restricts the flush to the IS domain, whereas with tlb flushes
> > the "is" suffix broadcasts instead of keeping it local, which is a bit
> > confusing ;-))
> >
> >>
> >>> ---
> >>>
> >>> This is a possible bug fix (found by reading the code) for Xen 4.4. I have
> >>> added a small optimisation to avoid flushing all TLBs when a VCPU of this
> >>> domain is running on the current cpu.
> >>>
> >>> The downside of this patch is that the function can be a little slower because
> >>> Xen is flushing more TLBs.
> >>
> >> Yes, I wonder how much slower it is going to be, considering that the flush
> >> is executed for every iteration of the loop.
> >
> > It might be better to set the current VMID to the target domain for the
> > duration of this function; we'd still need the broadcast, but at least we
> > wouldn't be killing unrelated VMIDs.
> 
> I can modify the patch to handle that.
> 
> >
> > Pulling the flush out of the loop would require great care WRT accesses
> > from other VCPUs, e.g. you'd have to put the pages on a list (page->list
> > might be available?) and issue the put_page() after the flush, otherwise
> > it might get recycled into another domain while the first domain still
> > has TLB entries for it.
> >
> > Or is there always something outside this function which holds another
> > ref count such that the page definitely won't be freed by the put_page
> > here?
>  >
> > Actually, does the existing code not have this issue already? The put_page
> > is before the flush. If this bug does exist now then I'd be inclined to
> > consider this a bug fix for 4.4, rather than a potential optimisation
> > for 4.5.
> 
> For now we don't take a reference when we map/unmap a page. Most of the
> time create_p2m_entries is called by common code which takes care of
> holding a reference when this function is called. So we should be safe.
> 
> I would prefer to wait for 4.5 for this optimisation (moving the flush
> outside the loop).
> 
> > While looking at this function I'm now wondering what happens to the
> > existing page on ALLOCATE or INSERT, is it leaked?
> 
> For ALLOCATE, the page is on the domain page list, so it will be
> freed during relinquish.
> 
> For INSERT, except for foreign mappings, we don't take a refcount for the
> mapping. So the only issue could be with foreign mappings.
>
Julien Grall Jan. 9, 2014, 3:08 p.m. UTC | #5
On 01/09/2014 02:05 PM, Ian Campbell wrote:
> On Thu, 2014-01-09 at 13:14 +0000, Julien Grall wrote:
>>
>> On 01/09/2014 10:59 AM, Ian Campbell wrote:
>>> On Wed, 2014-01-08 at 17:13 +0000, Stefano Stabellini wrote:
>>>> On Wed, 8 Jan 2014, Julien Grall wrote:
>>>>> The p2m is shared between the VCPUs of each domain. Currently Xen only flushes
>>>>> the TLB on the local PCPU. This could result in a mismatch between the mappings
>>>>> in the p2m and the TLBs.
>>>>>
>>>>> Flush TLBs used by this domain on every PCPU.
>>>>>
>>>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>>>>
>>>> The fix makes sense to me.
>>>
>>> Me too. Has anyone had a grep for similar issues?
>>
>> I think flush_xen_data_tlb and flush_xen_text_tlb should also be
>> inner-shareable.
> 
> I think text_tlb is ok; it's only used at start of day. The usages in
> setup_pagetables and mmu_init_secondary_cpus are both explicitly local, I
> think. Perhaps it should be renamed _local.

After looking at the code, both functions are only used to flush on the
current CPU. Suffixing flush_xen_data_tlb, flush_xen_text_tlb and
flush_xen_data_tlb_range_va with _local sounds like a good idea. Is it 4.4
material?

> The case in set_pte_flags_on_range via free_init_memory I'm less sure
> about, but I don't think stale TLB entries are actually an issue here,
> since there will certainly be one when the page becomes used again. But
> maybe it would be safest to make it global.

These translations will only be used in hypervisor mode. It should be
safe ...

We could flush the TLB on every CPU. That would mean we have to create
both flush_xen_text_tlb and flush_xen_text_tlb_local.
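
For instance, the arm32 data-TLB pair could look like this (a sketch in the
style of flush_tlb_all above, assuming the TLBIALLH/TLBIALLHIS encodings from
cpregs.h; the text variant would additionally need its I-cache and branch
predictor maintenance):

    /* Flush Xen's own (hypervisor) TLB entries on the local PCPU only. */
    static inline void flush_xen_data_tlb_local(void)
    {
        dsb();
        WRITE_CP32((uint32_t) 0, TLBIALLH);
        dsb();
        isb();
    }

    /* As above, but broadcast to every PCPU in the Inner Shareable domain. */
    static inline void flush_xen_data_tlb(void)
    {
        dsb();
        WRITE_CP32((uint32_t) 0, TLBIALLHIS);
        dsb();
        isb();
    }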

>> The former one is used by flush_tlb_mask.
> 
> Yes, the comment there is just wrong. I think this was my doing based on
> the confusion I mentioned before.
> 
> We need to be careful not to change the (un)map_domain_page mappings, since
> those are not shared between processors; I don't think this change would do
> that.

Right, this function doesn't need to be changed. We need to modify the
behaviour of flush_tlb_mask.

>>  But ... this function seems 
>> badly implemented: it's weird to use flush_xen_data_tlb because we are
>> mainly using flush_tlb_mask in common/grant_table.c. Any ideas?
> 
> Do you mean that this should be flushing the guest TLBs and not Xen's?
> That does seem right... We actually need to be flushing for all VMIDs
> too I think -- for the alloc_heap_pages case.

After looking at the code, flush_tlb_mask is called in 2 specific places
for ARM:
   - alloc_heap_pages: the flush is only called if the newly allocated
page was used by a domain before. So we only need to flush the TLB
non-secure non-hyp inner-shareable.
For Xen 4.5, this flush can be removed for ARM; Xen already flushes the
TLB in create_p2m_entries.
   - common/grant_table.c: every call to flush_tlb_mask is made with
the current domain. A simple inner-shareable flush of the TLB by VMID
should be enough.

For Xen 4.4, I suggest making flush_tlb_mask flush the TLB non-secure
non-hyp inner-shareable.

We would need to rework this after 4.4 as an optimisation.
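
A sketch of that stop-gap (flush_tlb_all() being the inner-shareable,
all-VMID flush this patch introduces; the cpumask can be ignored because
the inner-shareable operation already reaches every PCPU):

    void flush_tlb_mask(const cpumask_t *mask)
    {
        /* No IPI needed: the broadcast reaches the whole IS domain.
         * TLBIALLNSNHIS on arm32, "tlbi alle1is" on arm64. */
        flush_tlb_all();
    }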

Sincerely yours,

Patch

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 11f4714..9ab0378 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -374,7 +374,12 @@  static int create_p2m_entries(struct domain *d,
         }
 
         if ( flush )
-            flush_tlb_all_local();
+        {
+            if ( current->domain == d )
+                flush_tlb();
+            else
+                flush_tlb_all();
+        }
 
         /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
         if ( op == RELINQUISH && count >= 0x2000 )
diff --git a/xen/include/asm-arm/arm32/flushtlb.h b/xen/include/asm-arm/arm32/flushtlb.h
index ab166f3..6ff6f75 100644
--- a/xen/include/asm-arm/arm32/flushtlb.h
+++ b/xen/include/asm-arm/arm32/flushtlb.h
@@ -23,12 +23,12 @@  static inline void flush_tlb(void)
     isb();
 }
 
-/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-static inline void flush_tlb_all_local(void)
+/* Flush inner shareable TLBs, all VMIDs, non-hypervisor mode */
+static inline void flush_tlb_all(void)
 {
     dsb();
 
-    WRITE_CP32((uint32_t) 0, TLBIALLNSNH);
+    WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS);
 
     dsb();
     isb();
diff --git a/xen/include/asm-arm/arm64/flushtlb.h b/xen/include/asm-arm/arm64/flushtlb.h
index 9ce79a8..687eda1 100644
--- a/xen/include/asm-arm/arm64/flushtlb.h
+++ b/xen/include/asm-arm/arm64/flushtlb.h
@@ -23,12 +23,12 @@  static inline void flush_tlb(void)
         : : : "memory");
 }
 
-/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-static inline void flush_tlb_all_local(void)
+/* Flush inner shareable TLBs, all VMIDs, non-hypervisor mode */
+static inline void flush_tlb_all(void)
 {
     asm volatile(
         "dsb sy;"
-        "tlbi alle1;"
+        "tlbi alle1is;"
         "dsb sy;"
         "isb;"
         : : : "memory");