Message ID: 1389200759-22177-1-git-send-email-julien.grall@linaro.org
State: Superseded, archived
On Wed, 8 Jan 2014, Julien Grall wrote:
> The p2m is shared between the VCPUs of each domain. Currently Xen only
> flushes the TLB on the local PCPU. This could result in a mismatch
> between the mappings in the p2m and the TLBs.
>
> Flush the TLBs used by this domain on every PCPU.
>
> Signed-off-by: Julien Grall <julien.grall@linaro.org>

The fix makes sense to me.

> ---
>
> This is a possible bug fix (found by reading the code) for Xen 4.4. I have
> added a small optimisation to avoid flushing all TLBs when a VCPU of this
> domain is running on the current cpu.
>
> The downside of this patch is that the function can be a little slower
> because Xen is flushing more TLBs.

Yes, I wonder how much slower it is going to be, considering that the flush
is executed for every iteration of the loop.

>  xen/arch/arm/p2m.c                   | 7 ++++++-
>  xen/include/asm-arm/arm32/flushtlb.h | 6 +++---
>  xen/include/asm-arm/arm64/flushtlb.h | 6 +++---
>  3 files changed, 12 insertions(+), 7 deletions(-)
>
> [...]
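The control flow the patch adds to create_p2m_entries() can be modelled in
plain C. This is only an illustrative sketch with invented stand-in types
and stub flush functions to record which flush was chosen; it is not Xen
code:

```c
#include <assert.h>
#include <stdbool.h>

/* Invented stand-in for Xen's struct domain; compared by identity. */
struct domain { int id; };

static bool flushed_current_vmid;
static bool flushed_all_vmids;

/* Stand-in for flush_tlb(): flush only the currently active VMID. */
static void flush_tlb(void) { flushed_current_vmid = true; }

/* Stand-in for flush_tlb_all(): flush all VMIDs, broadcast inner
 * shareable so every PCPU sees it. */
static void flush_tlb_all(void) { flushed_all_vmids = true; }

/* Mirrors the patched hunk: when a VCPU of the target domain is running
 * on this PCPU, its VMID is the current one, so a per-VMID flush
 * suffices; otherwise fall back to flushing all VMIDs on all PCPUs. */
void flush_for_domain(const struct domain *current_domain,
                      const struct domain *d)
{
    if (current_domain == d)
        flush_tlb();
    else
        flush_tlb_all();
}
```

The cheap path is taken exactly when the update targets the domain that is
already running here, which is the small optimisation the cover note
mentions.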
On Wed, 2014-01-08 at 17:13 +0000, Stefano Stabellini wrote:
> On Wed, 8 Jan 2014, Julien Grall wrote:
> > The p2m is shared between the VCPUs of each domain. Currently Xen only
> > flushes the TLB on the local PCPU. This could result in a mismatch
> > between the mappings in the p2m and the TLBs.
> >
> > Flush the TLBs used by this domain on every PCPU.
> >
> > Signed-off-by: Julien Grall <julien.grall@linaro.org>
>
> The fix makes sense to me.

Me too. Has anyone had a grep for similar issues?

(The reason for getting this wrong is that for cache flushes the "is"
suffix restricts the flush to the Inner Shareable domain, whereas for TLB
flushes the "is" suffix broadcasts instead of keeping the flush local,
which is a bit confusing ;-))

> > ---
> >
> > This is a possible bug fix (found by reading the code) for Xen 4.4. I
> > have added a small optimisation to avoid flushing all TLBs when a VCPU
> > of this domain is running on the current cpu.
> >
> > The downside of this patch is that the function can be a little slower
> > because Xen is flushing more TLBs.
>
> Yes, I wonder how much slower it is going to be, considering that the flush
> is executed for every iteration of the loop.

It might be better to set the current VMID to the target domain for the
duration of this function; we'd still need the broadcast, but at least we
wouldn't be killing unrelated VMIDs.

Pulling the flush out of the loop would require great care WRT accesses
from other VCPUs. E.g. you'd have to put the pages on a list (page->list
might be available?) and issue the put_page() after the flush, otherwise a
page might get recycled into another domain while the first domain still
has TLB entries for it.

Or is there always something outside this function which holds another
refcount, such that the page definitely won't be freed by the put_page()
here?

Actually, does the existing code not have this issue already? The
put_page() is before the flush. If this bug does exist now then I'd be
inclined to consider this a bug fix for 4.4, rather than a potential
optimisation for 4.5.

While looking at this function I'm now wondering what happens to the
existing page on ALLOCATE or INSERT: is it leaked?

Ian.
On 01/09/2014 10:59 AM, Ian Campbell wrote:
> On Wed, 2014-01-08 at 17:13 +0000, Stefano Stabellini wrote:
>> [...]
>>
>> The fix makes sense to me.
>
> Me too. Has anyone had a grep for similar issues?

I think flush_xen_data_tlb and flush_xen_text_tlb should also be inner
shareable. The former is used by flush_tlb_mask. But... this function
seems badly implemented; it's weird to use flush_xen_data_tlb there
because we are mainly using flush_tlb_mask in common/grant-table.c. Any
ideas?

>> Yes, I wonder how much slower it is going to be, considering that the
>> flush is executed for every iteration of the loop.
>
> It might be better to set the current VMID to the target domain for the
> duration of this function; we'd still need the broadcast, but at least we
> wouldn't be killing unrelated VMIDs.

I can modify the patch to handle that.

> Pulling the flush out of the loop would require great care WRT accesses
> from other VCPUs. E.g. you'd have to put the pages on a list (page->list
> might be available?) and issue the put_page() after the flush, otherwise
> a page might get recycled into another domain while the first domain
> still has TLB entries for it.
>
> Or is there always something outside this function which holds another
> refcount, such that the page definitely won't be freed by the put_page()
> here?
>
> Actually, does the existing code not have this issue already? The
> put_page() is before the flush. If this bug does exist now then I'd be
> inclined to consider this a bug fix for 4.4, rather than a potential
> optimisation for 4.5.

For now we don't take a reference when we map/unmap a mapping. Most of the
time create_p2m_entries is called by common code which takes care of
holding a reference when this function is called, so we should be safe.

I would prefer to wait for 4.5 for this optimisation (moving the flush
outside the loop).

> While looking at this function I'm now wondering what happens to the
> existing page on ALLOCATE or INSERT: is it leaked?

For an ALLOCATE page, it's on the domain page list, so the page will be
freed during relinquish.

For INSERT, except for foreign mappings we don't refcount the mapping, so
the only issue could be with foreign mappings.
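The deferred-put_page ordering Ian describes (queue the pages, flush once
after the loop, only then drop the references) could look roughly like the
sketch below. All names and types here are invented for illustration;
this is a simplified single-threaded model, not Xen's real page handling:

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-in for Xen's struct page_info. */
struct page_info {
    struct page_info *next;
    int refcount;
};

static int tlb_flushes;

/* Stub: a broadcast flush of the guest TLBs (counted for the model). */
static void flush_tlb_all(void) { tlb_flushes++; }

/* Stub: drop a reference; at zero the page could be reallocated. */
static void put_page(struct page_info *pg) { pg->refcount--; }

/* Clear n mappings, but defer every put_page() until after a single TLB
 * flush, so a page cannot be recycled into another domain while stale
 * TLB entries for it may still exist on some PCPU. */
void remove_mappings(struct page_info **pages, size_t n)
{
    struct page_info *deferred = NULL;

    for (size_t i = 0; i < n; i++) {
        /* ... clear the p2m entry for pages[i] here ... */
        pages[i]->next = deferred;   /* queue instead of put_page() */
        deferred = pages[i];
    }

    flush_tlb_all();                 /* one flush, after the loop */

    while (deferred) {               /* now safe to drop references */
        struct page_info *pg = deferred;
        deferred = pg->next;
        put_page(pg);
    }
}
```

This is the 4.5-era optimisation shape: one flush per batch instead of one
per iteration, at the cost of tracking the pages on a list.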
On Thu, 2014-01-09 at 13:14 +0000, Julien Grall wrote:
> On 01/09/2014 10:59 AM, Ian Campbell wrote:
> > Me too. Has anyone had a grep for similar issues?
>
> I think flush_xen_data_tlb and flush_xen_text_tlb should also be inner
> shareable.

I think text_tlb is ok; it's only used at start of day. The usages in
setup_pagetables and mmu_init_secondary_cpus are both explicitly local, I
think. Perhaps it should be renamed _local.

The case in set_pte_flags_on_range via free_init_memory I'm less sure
about, but I don't think stale TLB entries are actually an issue there,
since there will certainly be one when the page becomes used again. But
maybe it would be safest to make it global.

> The former one is used by flush_tlb_mask.

Yes, the comment there is just wrong. I think this was my doing, based on
the confusion I mentioned before.

We need to be careful not to change (un)map_domain_page, since those
mappings are not shared between processors; I don't think this change
would do that.

> But... this function seems badly implemented; it's weird to use
> flush_xen_data_tlb because we are mainly using flush_tlb_mask in
> common/grant-table.c. Any ideas?

Do you mean that this should be flushing the guest TLBs and not Xen's?
That does seem right... We actually need to be flushing for all VMIDs too,
I think, for the alloc_heap_pages case.

Ian.
On 01/09/2014 02:05 PM, Ian Campbell wrote:
> On Thu, 2014-01-09 at 13:14 +0000, Julien Grall wrote:
>> I think flush_xen_data_tlb and flush_xen_text_tlb should also be inner
>> shareable.
>
> I think text_tlb is ok; it's only used at start of day. The usages in
> setup_pagetables and mmu_init_secondary_cpus are both explicitly local, I
> think. Perhaps it should be renamed _local.

After looking at the code, both functions are only used to flush the TLB
for the current cpu. Suffixing flush_xen_data_tlb, flush_xen_text_tlb and
flush_xen_data_tlb_range_va with _local sounds like a good idea. Is it 4.4
material?

> The case in set_pte_flags_on_range via free_init_memory I'm less sure
> about, but I don't think stale TLB entries are actually an issue there,
> since there will certainly be one when the page becomes used again. But
> maybe it would be safest to make it global.

These translations will only be used in hypervisor mode, so it should be
safe... We can flush the TLB on every CPU; that would mean we have to
create both flush_xen_text_tlb and flush_xen_text_tlb_local.

>> The former one is used by flush_tlb_mask.
>
> Yes, the comment there is just wrong. I think this was my doing, based on
> the confusion I mentioned before.
>
> We need to be careful not to change (un)map_domain_page, since those
> mappings are not shared between processors; I don't think this change
> would do that.

Right, this function doesn't need to be changed. We need to modify the
behaviour of flush_tlb_mask.

>> But... this function seems badly implemented; it's weird to use
>> flush_xen_data_tlb because we are mainly using flush_tlb_mask in
>> common/grant-table.c. Any ideas?
>
> Do you mean that this should be flushing the guest TLBs and not Xen's?
> That does seem right... We actually need to be flushing for all VMIDs
> too, I think, for the alloc_heap_pages case.

After looking at the code, flush_tlb_mask is called in two specific places
for ARM:

 - alloc_heap_pages: the flush is only called if the newly allocated page
   was used by a domain before, so we only need to flush non-secure
   non-hyp TLBs inner shareable. For Xen 4.5, this flush could be removed
   for ARM, as Xen already flushes the TLB in create_p2m_entries.

 - common/grant_table.c: every call to flush_tlb_mask is made with the
   current domain, so a simple inner-shareable TLB flush by VMID should be
   enough.

For Xen 4.4, I suggest making flush_tlb_mask flush the TLB non-secure
non-hyp inner shareable. We can rework this after 4.4 for optimisation.

Sincerely yours,
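The call-site analysis above can be encoded as a tiny model: the
conservative 4.4 behaviour (always flush all VMIDs inner shareable) must
cover the minimal scope each caller actually needs. The enum and helper
names below are invented purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Guest TLB flush scopes discussed in the thread, narrowest first. */
enum tlb_scope {
    SCOPE_CURRENT_VMID_IS, /* one guest's entries, broadcast to all PCPUs */
    SCOPE_ALL_VMID_IS,     /* all guests' entries, broadcast to all PCPUs */
};

/* Proposed 4.4 behaviour: flush_tlb_mask always takes the widest
 * guest scope (non-secure non-hyp, all VMIDs, inner shareable). */
enum tlb_scope flush_tlb_mask_scope(void)
{
    return SCOPE_ALL_VMID_IS;
}

/* Minimal scope each ARM call site needs, per the analysis above. */
enum tlb_scope needed_by_alloc_heap_pages(void) { return SCOPE_ALL_VMID_IS; }
enum tlb_scope needed_by_grant_table(void)      { return SCOPE_CURRENT_VMID_IS; }

/* In this two-level model a wider scope always covers a narrower one. */
bool covers(enum tlb_scope have, enum tlb_scope need)
{
    return have >= need;
}
```

The grant-table case is over-flushed by the conservative choice, which is
exactly the optimisation deferred to 4.5.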
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 11f4714..9ab0378 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -374,7 +374,12 @@ static int create_p2m_entries(struct domain *d,
     }
 
     if ( flush )
-        flush_tlb_all_local();
+    {
+        if ( current->domain == d )
+            flush_tlb();
+        else
+            flush_tlb_all();
+    }
 
     /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
     if ( op == RELINQUISH && count >= 0x2000 )
diff --git a/xen/include/asm-arm/arm32/flushtlb.h b/xen/include/asm-arm/arm32/flushtlb.h
index ab166f3..6ff6f75 100644
--- a/xen/include/asm-arm/arm32/flushtlb.h
+++ b/xen/include/asm-arm/arm32/flushtlb.h
@@ -23,12 +23,12 @@ static inline void flush_tlb(void)
     isb();
 }
 
-/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-static inline void flush_tlb_all_local(void)
+/* Flush inner shareable TLBs, all VMIDs, non-hypervisor mode */
+static inline void flush_tlb_all(void)
 {
     dsb();
 
-    WRITE_CP32((uint32_t) 0, TLBIALLNSNH);
+    WRITE_CP32((uint32_t) 0, TLBIALLNSNHIS);
 
     dsb();
     isb();
diff --git a/xen/include/asm-arm/arm64/flushtlb.h b/xen/include/asm-arm/arm64/flushtlb.h
index 9ce79a8..687eda1 100644
--- a/xen/include/asm-arm/arm64/flushtlb.h
+++ b/xen/include/asm-arm/arm64/flushtlb.h
@@ -23,12 +23,12 @@ static inline void flush_tlb(void)
         : : : "memory");
 }
 
-/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-static inline void flush_tlb_all_local(void)
+/* Flush inner shareable TLBs, all VMIDs, non-hypervisor mode */
+static inline void flush_tlb_all(void)
 {
     asm volatile(
         "dsb sy;"
-        "tlbi alle1;"
+        "tlbi alle1is;"
         "dsb sy;"
         "isb;"
         : : : "memory");
The p2m is shared between the VCPUs of each domain. Currently Xen only
flushes the TLB on the local PCPU. This could result in a mismatch between
the mappings in the p2m and the TLBs.

Flush the TLBs used by this domain on every PCPU.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
This is a possible bug fix (found by reading the code) for Xen 4.4. I have
added a small optimisation to avoid flushing all TLBs when a VCPU of this
domain is running on the current cpu.

The downside of this patch is that the function can be a little slower
because Xen is flushing more TLBs.
---
 xen/arch/arm/p2m.c                   | 7 ++++++-
 xen/include/asm-arm/arm32/flushtlb.h | 6 +++---
 xen/include/asm-arm/arm64/flushtlb.h | 6 +++---
 3 files changed, 12 insertions(+), 7 deletions(-)