[v2] xen/arm: p2m: Correctly flush TLB in create_p2m_entries

Message ID 1389285240-7116-1-git-send-email-julien.grall@linaro.org
State Superseded

Commit Message

Julien Grall Jan. 9, 2014, 4:34 p.m.
The p2m is shared between the VCPUs of each domain. Currently Xen only flushes
the TLB on the local PCPU. This can result in a mismatch between the mappings
in the p2m and the TLBs.

Flush the TLB entries used by this domain on every PCPU. The flush can also be
moved out of the loop because:
    - ALLOCATE: only called for dom0 RAM allocation, so the flush is never
    triggered.
    - INSERT: if valid == 1, it means we have replaced a page that already
    belongs to the domain, and a VCPU could write to the wrong page. This can
    happen for dom0 with the 1:1 mapping because the mapping is not removed
    from the p2m.
    - REMOVE: except for grant tables (replace_grant_host_mapping), each
    call to guest_physmap_remove_page is protected by its callers via a
    get_page -> .... -> guest_physmap_remove_page -> ... -> put_page. So
    the page can't be allocated to another domain until the last put_page.
    - RELINQUISH: the domain is not running anymore, so we don't care...

Signed-off-by: Julien Grall <julien.grall@linaro.org>

---
    Changes in v2:
        - Switch to the domain for only flush its TLBs entries
        - Move the flush out of the loop

This is a possible bug fix (found by reading the code) for Xen 4.4. I moved the
flush out of the loop, which should be safe (see why in the commit message).
Without this patch, the guest can have stale TLB entries when a VCPU is moved
to another PCPU.

Except for the grant table (I can't find the {get,put}_page calls in the
grant-table code???), all the callers are protected by a get_page before
removing the page. So if another VCPU tries to access this page before the
flush, it will just read/write the wrong page.

The downside of this patch is that Xen flushes fewer TLB entries. Instead of
flushing all TLB entries on the current PCPU, Xen flushes the TLB entries for a
specific VMID on every CPU. This should be safe because create_p2m_entries only
deals with a single domain.

I don't think I have forgotten any case in this function. Let me know if I have.
---
 xen/arch/arm/p2m.c |   24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

Comments

Ian Campbell Jan. 13, 2014, 5:10 p.m. | #1
Hrm, our TLB flush discipline is horribly confused isn't it...

On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
> The p2m is shared between VCPUs for each domain. Currently Xen only flush
> TLB on the local PCPU. This could result to mismatch between the mapping in the
> p2m and TLBs.
> 
> Flush TLB entries used by this domain on every PCPU. The flush can also be
> moved out of the loop because:
>     - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called

OK.

An ASSERT(!third[third_table_offset(addr)].p2m.valid) might be
worthwhile if that is the case.

(I'm not sure why ALLOCATE can't be replaced by allocation followed by
an INSERT, it's seems very special case)

>     - INSERT: if valid = 1 that would means with have replaced a
>     page that already belongs to the domain. A VCPU can write on the wrong page.
>     This can append for dom0 with the 1:1 mapping because the mapping is not
>     removed from the p2m.

"append"? Do you mean "happen"?

In the non-dom0 1:1 case eventually the page will be freed, I guess by a
subsequent put_page elsewhere -- do they all contain the correct
flushing? Or do we just leak?

What happens if the page being replaced is foreign? Do we leak a
reference to another domain's page? (This is probably a separate issue
though, I suspect the put_page needs pulling out of the switch(op)
statement).

>     - REMOVE: except for grant-table (replace_grant_host_mapping),

What about this case? What makes it safe?

>  each
>     call to guest_physmap_remove_page are protected by the callers via a
>         get_page -> .... -> guest_physmap_remove_page -> ... -> put_page. So
>     the page can't be allocated for another domain until the last put_page.

I have confirmed this is the case for guest_remove_page, memory_exchange
and XENMEM_remove_from_physmap.

There is one case I saw where this isn't true which is gnttab_transfer,
AFAICT that will fail because steal_page unconditionally returns an
error on arm. There is also a flush_tlb_mask there, FWIW.

>     - RELINQUISH : the domain is not running anymore so we don't care...

At some point this page will be reused, as will the VMID, where/how is
it ensured that a flush will happen before that point? 

So I think the main outstanding question is what makes
replace_grant_host_mapping safe.

The other big issue is the leak of a foreign mapping on INSERT, but I
think that is separate.

> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> ---
>     Changes in v2:
>         - Switch to the domain for only flush its TLBs entries
>         - Move the flush out of the loop
> 
> This is a possible bug fix (found by reading the code) for Xen 4.4, I moved the
> flush out of the loop which should be safe (see why in the commit message).
> Without this patch, the guest can have stale TLB entries when the VCPU is moved
> to another PCPU.
> 
> Except grant-table (I can't find {get,put}_page for grant-table code???),
> all the callers are protected by a get_page before removing the page. So if the
> another VCPU is trying to access to this page before the flush, it will just
> read/write the wrong page.
> 
> The downside of this patch is Xen flushes less TLBs. Instead of flushing all TLBs
> on the current PCPU, Xen flushes TLBs for a specific VMID on every CPUs. This
> should be safe because create_p2m_entries only deal with a specific domain.
> 
> I don't think I forget case in this function. Let me know if it's the case.
> ---
>  xen/arch/arm/p2m.c |   24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 11f4714..ad6f76e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -238,7 +238,7 @@ static int create_p2m_entries(struct domain *d,
>                       int mattr,
>                       p2m_type_t t)
>  {
> -    int rc, flush;
> +    int rc;
>      struct p2m_domain *p2m = &d->arch.p2m;
>      lpae_t *first = NULL, *second = NULL, *third = NULL;
>      paddr_t addr;
> @@ -246,10 +246,14 @@ static int create_p2m_entries(struct domain *d,
>                    cur_first_offset = ~0,
>                    cur_second_offset = ~0;
>      unsigned long count = 0;
> +    unsigned int flush = 0;
>      bool_t populate = (op == INSERT || op == ALLOCATE);
>  
>      spin_lock(&p2m->lock);
>  
> +    if ( d != current->domain )
> +        p2m_load_VTTBR(d);
> +
>      addr = start_gpaddr;
>      while ( addr < end_gpaddr )
>      {
> @@ -316,7 +320,7 @@ static int create_p2m_entries(struct domain *d,
>              cur_second_offset = second_table_offset(addr);
>          }
>  
> -        flush = third[third_table_offset(addr)].p2m.valid;
> +        flush |= third[third_table_offset(addr)].p2m.valid;
>  
>          /* Allocate a new RAM page and attach */
>          switch (op) {
> @@ -373,9 +377,6 @@ static int create_p2m_entries(struct domain *d,
>                  break;
>          }
>  
> -        if ( flush )
> -            flush_tlb_all_local();
> -
>          /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
>          if ( op == RELINQUISH && count >= 0x2000 )
>          {
> @@ -392,6 +393,16 @@ static int create_p2m_entries(struct domain *d,
>          addr += PAGE_SIZE;
>      }
>  
> +    if ( flush )
> +    {
> +        /* At the beginning of the function, Xen is updating VTTBR
> +         * with the domain where the mappings are created. In this
> +         * case it's only necessary to flush TLBs on every CPUs with
> +         * the current VMID (our domain).
> +         */
> +        flush_tlb();
> +    }
> +
>      if ( op == ALLOCATE || op == INSERT )
>      {
>          unsigned long sgfn = paddr_to_pfn(start_gpaddr);
> @@ -409,6 +420,9 @@ out:
>      if (second) unmap_domain_page(second);
>      if (first) unmap_domain_page(first);
>  
> +    if ( d != current->domain )
> +        p2m_load_VTTBR(current->domain);
> +
>      spin_unlock(&p2m->lock);
>  
>      return rc;
Julien Grall Jan. 13, 2014, 5:37 p.m. | #2
On 01/13/2014 05:10 PM, Ian Campbell wrote:
> Hrm, our TLB flush discipline is horribly confused isn't it...
> 
> On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
>> The p2m is shared between VCPUs for each domain. Currently Xen only flush
>> TLB on the local PCPU. This could result to mismatch between the mapping in the
>> p2m and TLBs.
>>
>> Flush TLB entries used by this domain on every PCPU. The flush can also be
>> moved out of the loop because:
>>     - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called
> 
> OK.
> 
> An ASSERT(!third[third_table_offset(addr)].p2m.valid) might be
> worthwhile if that is the case.

Will add it.

> 
> (I'm not sure why ALLOCATE can't be replaced by allocation followed by
> an INSERT, it's seems very special case)
> 
>>     - INSERT: if valid = 1 that would means with have replaced a
>>     page that already belongs to the domain. A VCPU can write on the wrong page.
>>     This can append for dom0 with the 1:1 mapping because the mapping is not
>>     removed from the p2m.
> 
> "append"? Do you mean "happen"?

I meant "happen".

> 
> In the non-dom0 1:1 case eventually the page will be freed, I guess by a
> subsequent put_page elsewhere -- do they all contain the correct
> flushing? Or do we just leak?

As for foreign mappings, the INSERT path should be hardened. We don't
have any protection against the guest replacing a currently valid mapping.

> What happens if the page being replaced is foreign? Do we leak a
> reference to another domain's page? (This is probably a separate issue
> though, I suspect the put_page needs pulling out of the switch(op)
> statement).

Right, we leak a reference to another domain's page.

> 
>>     - REMOVE: except for grant-table (replace_grant_host_mapping),
> 
> What about this case? What makes it safe?

As mentioned a bit later, I can't say whether it's safe or not. I was unable
to find a proof in the code.

>>  each
>>     call to guest_physmap_remove_page are protected by the callers via a
>>         get_page -> .... -> guest_physmap_remove_page -> ... -> put_page. So
>>     the page can't be allocated for another domain until the last put_page.
> 
> I have confirmed this is the case for guest_remove_page, memory_exchange
> and XENMEM_remove_from_physmap.
> 
> There is one case I saw where this isn't true which is gnttab_transfer,
> AFAICT that will fail because steal_page unconditionally returns an
> error on arm. There is also a flush_tlb_mask there, FWIW.

hmmm... I forgot this one... why don't we have a {get,put}_page in this
function?

> 
>>     - RELINQUISH : the domain is not running anymore so we don't care...
> 
> At some point this page will be reused, as will the VMID, where/how is
> it ensured that a flush will happen before that point? 

When the VMID is reused, Xen will flush all the TLB entries associated with
this VMID (see p2m_alloc_table).
Ian Campbell Jan. 13, 2014, 5:57 p.m. | #3
On Mon, 2014-01-13 at 17:37 +0000, Julien Grall wrote:
> On 01/13/2014 05:10 PM, Ian Campbell wrote:
> > Hrm, our TLB flush discipline is horribly confused isn't it...
> > 
> > On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
> >> The p2m is shared between VCPUs for each domain. Currently Xen only flush
> >> TLB on the local PCPU. This could result to mismatch between the mapping in the
> >> p2m and TLBs.
> >>
> >> Flush TLB entries used by this domain on every PCPU. The flush can also be
> >> moved out of the loop because:
> >>     - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called
> > 
> > OK.
> > 
> > An ASSERT(!third[third_table_offset(addr)].p2m.valid) might be
> > worthwhile if that is the case.
> 
> Will add it.

Thanks.

> > (I'm not sure why ALLOCATE can't be replaced by allocation followed by
> > an INSERT, it's seems very special case)
> > 
> >>     - INSERT: if valid = 1 that would means with have replaced a
> >>     page that already belongs to the domain. A VCPU can write on the wrong page.
> >>     This can append for dom0 with the 1:1 mapping because the mapping is not
> >>     removed from the p2m.
> > 
> > "append"? Do you mean "happen"?
> 
> I meant "happen".
> 
> > 
> > In the non-dom0 1:1 case eventually the page will be freed, I guess by a
> > subsequent put_page elsewhere -- do they all contain the correct
> > flushing? Or do we just leak?
> 
> As for foreign mapping the INSERT function should be hardened. We don't

Did you mean "handled"?

> have a protection against the guest is replacing a current valid mapping.
> 
> > What happens if the page being replaced is foreign? Do we leak a
> > reference to another domain's page? (This is probably a separate issue
> > though, I suspect the put_page needs pulling out of the switch(op)
> > statement).
> 
> Right we leak a reference to another domain.
> 
> > 
> >>     - REMOVE: except for grant-table (replace_grant_host_mapping),
> > 
> > What about this case? What makes it safe?
> 
> As specified a bit later, I can't say if it's safe or not. I was unabled
> to find a proof in the code.

Sorry, I seem to have forgotten to read the blurb after the cut. I'll
have to have a think about that aspect tomorrow.

> >>  each
> >>     call to guest_physmap_remove_page are protected by the callers via a
> >>         get_page -> .... -> guest_physmap_remove_page -> ... -> put_page. So
> >>     the page can't be allocated for another domain until the last put_page.
> > 
> > I have confirmed this is the case for guest_remove_page, memory_exchange
> > and XENMEM_remove_from_physmap.
> > 
> > There is one case I saw where this isn't true which is gnttab_transfer,
> > AFAICT that will fail because steal_page unconditionally returns an
> > error on arm. There is also a flush_tlb_mask there, FWIW.
> 
> hmmm... I forgot this one... why don't we have a {get,put}_page in this
> function?

On x86 they seem to be in steal_page, which ARM doesn't implement.

> 
> > 
> >>     - RELINQUISH : the domain is not running anymore so we don't care...
> > 
> > At some point this page will be reused, as will the VMID, where/how is
> > it ensured that a flush will happen before that point? 
> 
> When the VMID is reused, Xen will flush everything TLBs associated to
> this VMID (see p2m_alloc_table);

So it does, I missed that when I looked.

Thanks,
Ian.
Julien Grall Jan. 14, 2014, 12:41 p.m. | #4
On 01/13/2014 05:57 PM, Ian Campbell wrote:
> On Mon, 2014-01-13 at 17:37 +0000, Julien Grall wrote:
>> On 01/13/2014 05:10 PM, Ian Campbell wrote:
>>> Hrm, our TLB flush discipline is horribly confused isn't it...
>>>
>>> On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
>>>> The p2m is shared between VCPUs for each domain. Currently Xen only flush
>>>> TLB on the local PCPU. This could result to mismatch between the mapping in the
>>>> p2m and TLBs.
>>>>
>>>> Flush TLB entries used by this domain on every PCPU. The flush can also be
>>>> moved out of the loop because:
>>>>     - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called
>>>
>>> OK.
>>>
>>> An ASSERT(!third[third_table_offset(addr)].p2m.valid) might be
>>> worthwhile if that is the case.
>>
>> Will add it.
> 
> Thanks.
> 
>>> (I'm not sure why ALLOCATE can't be replaced by allocation followed by
>>> an INSERT, it's seems very special case)
>>>
>>>>     - INSERT: if valid = 1 that would means with have replaced a
>>>>     page that already belongs to the domain. A VCPU can write on the wrong page.
>>>>     This can append for dom0 with the 1:1 mapping because the mapping is not
>>>>     removed from the p2m.
>>>
>>> "append"? Do you mean "happen"?
>>
>> I meant "happen".
>>
>>>
>>> In the non-dom0 1:1 case eventually the page will be freed, I guess by a
>>> subsequent put_page elsewhere -- do they all contain the correct
>>> flushing? Or do we just leak?
>>
>> As for foreign mapping the INSERT function should be hardened. We don't
> 
> Did you mean "handled"?

I meant both :). Actually we don't have any check in this function, as for the
REMOVE case.

I don't think it's possible to do it for 4.4; we take a reference on the
mapping every time a new entry is added to the p2m.
Ian Campbell Jan. 14, 2014, 12:54 p.m. | #5
On Tue, 2014-01-14 at 12:41 +0000, Julien Grall wrote:
> On 01/13/2014 05:57 PM, Ian Campbell wrote:
> > On Mon, 2014-01-13 at 17:37 +0000, Julien Grall wrote:
> >> On 01/13/2014 05:10 PM, Ian Campbell wrote:
> >>> Hrm, our TLB flush discipline is horribly confused isn't it...
> >>>
> >>> On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
> >>>> The p2m is shared between VCPUs for each domain. Currently Xen only flush
> >>>> TLB on the local PCPU. This could result to mismatch between the mapping in the
> >>>> p2m and TLBs.
> >>>>
> >>>> Flush TLB entries used by this domain on every PCPU. The flush can also be
> >>>> moved out of the loop because:
> >>>>     - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called
> >>>
> >>> OK.
> >>>
> >>> An ASSERT(!third[third_table_offset(addr)].p2m.valid) might be
> >>> worthwhile if that is the case.
> >>
> >> Will add it.
> > 
> > Thanks.
> > 
> >>> (I'm not sure why ALLOCATE can't be replaced by allocation followed by
> >>> an INSERT, it's seems very special case)
> >>>
> >>>>     - INSERT: if valid = 1 that would means with have replaced a
> >>>>     page that already belongs to the domain. A VCPU can write on the wrong page.
> >>>>     This can append for dom0 with the 1:1 mapping because the mapping is not
> >>>>     removed from the p2m.
> >>>
> >>> "append"? Do you mean "happen"?
> >>
> >> I meant "happen".
> >>
> >>>
> >>> In the non-dom0 1:1 case eventually the page will be freed, I guess by a
> >>> subsequent put_page elsewhere -- do they all contain the correct
> >>> flushing? Or do we just leak?
> >>
> >> As for foreign mapping the INSERT function should be hardened. We don't
> > 
> > Did you mean "handled"?
> 
> I meant both :). Actually we don't have any check in this function as
> for REMOVE case.
> 
> I don't think it's possible to do it for 4.4, we take a reference on the
> mapping every time a new entrie is added in the p2m.

Can we pull the:

                /* TODO: Handle other p2m type */
                if ( p2m_is_foreign(pte.p2m.type) )
                {
                    ASSERT(mfn_valid(mfn));
                    put_page(mfn_to_page(mfn));
                }

out to above the switch and have it be:

                pte = third[third_table_offset(addr)];
                flush |= pte.p2m.valid;

                /* TODO: Handle other p2m type */
                if ( pte.p2m.valid && p2m_is_foreign(pte.p2m.type) )
                {
                    ASSERT(mfn_valid(mfn));
                    put_page(mfn_to_page(mfn));
                }

I think that would be acceptable for 4.4, unless there is some
complication I'm not foreseeing?

Ian.
Julien Grall Jan. 14, 2014, 1:20 p.m. | #6
On 01/14/2014 12:54 PM, Ian Campbell wrote:
> On Tue, 2014-01-14 at 12:41 +0000, Julien Grall wrote:
>> On 01/13/2014 05:57 PM, Ian Campbell wrote:
>>> On Mon, 2014-01-13 at 17:37 +0000, Julien Grall wrote:
>>>> On 01/13/2014 05:10 PM, Ian Campbell wrote:
>>>>> Hrm, our TLB flush discipline is horribly confused isn't it...
>>>>>
>>>>> On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
>>>>>> The p2m is shared between VCPUs for each domain. Currently Xen only flush
>>>>>> TLB on the local PCPU. This could result to mismatch between the mapping in the
>>>>>> p2m and TLBs.
>>>>>>
>>>>>> Flush TLB entries used by this domain on every PCPU. The flush can also be
>>>>>> moved out of the loop because:
>>>>>>     - ALLOCATE: only called for dom0 RAM allocation, so the flush is never called
>>>>>
>>>>> OK.
>>>>>
>>>>> An ASSERT(!third[third_table_offset(addr)].p2m.valid) might be
>>>>> worthwhile if that is the case.
>>>>
>>>> Will add it.
>>>
>>> Thanks.
>>>
>>>>> (I'm not sure why ALLOCATE can't be replaced by allocation followed by
>>>>> an INSERT, it's seems very special case)
>>>>>
>>>>>>     - INSERT: if valid = 1 that would means with have replaced a
>>>>>>     page that already belongs to the domain. A VCPU can write on the wrong page.
>>>>>>     This can append for dom0 with the 1:1 mapping because the mapping is not
>>>>>>     removed from the p2m.
>>>>>
>>>>> "append"? Do you mean "happen"?
>>>>
>>>> I meant "happen".
>>>>
>>>>>
>>>>> In the non-dom0 1:1 case eventually the page will be freed, I guess by a
>>>>> subsequent put_page elsewhere -- do they all contain the correct
>>>>> flushing? Or do we just leak?
>>>>
>>>> As for foreign mapping the INSERT function should be hardened. We don't
>>>
>>> Did you mean "handled"?
>>
>> I meant both :). Actually we don't have any check in this function as
>> for REMOVE case.
>>
>> I don't think it's possible to do it for 4.4, we take a reference on the
>> mapping every time a new entrie is added in the p2m.
> 
> Can we pull the:
> 
>                 /* TODO: Handle other p2m type */
>                 if ( p2m_is_foreign(pte.p2m.type) )
>                 {
>                     ASSERT(mfn_valid(mfn));
>                     put_page(mfn_to_page(mfn));
>                 }
> 
> out to above the switch and have it be:
> 
>                 pte = third[third_table_offset(addr)];
>                 flush |= pte.p2m.valid;
> 
>                 /* TODO: Handle other p2m type */
>                 if ( pte.p2m.valid && p2m_is_foreign(pte.p2m.type) )
>                 {
>                     ASSERT(mfn_valid(mfn));
>                     put_page(mfn_to_page(mfn));
>                 }
> 
> I think that would be acceptable for 4.4, unless there is some
> complication I'm not foreseeing?

As discussed IRL, this is the only {get,put}_page reference to the foreign
page for this domain. So if the foreign domain has released all its references
to this page, the put_page will free the page.

In the tiny timeslice before the flush (at the end of the loop), the page
could be reallocated to another domain.
But hopefully we are safe (assuming my patch "xen/arm: correct
flush_tlb_mask behaviour" is also in the tree), because the page allocator
will flush the TLB.
Ian Campbell Jan. 14, 2014, 3:28 p.m. | #7
On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
> Except grant-table (I can't find {get,put}_page for grant-table code???),

I think they are in __gnttab_map_grant_ref, within __get_paged_frame or
through page_get_owner_and_reference.

and on unmap it is in __gnttab_unmap_common_complete.

It's a bit of a complex maze though so I'm not entirely sure, perhaps
Tim, Keir or Jan can confirm that a grant mapping always takes a
reference on the mapped page (it seems like PV x86 ought to be relying
on this for safety anyhow).

I think the flush in alloc_heap_pages would also serve as a backstop,
wouldn't it?

> all the callers are protected by a get_page before removing the page. So if the
> another VCPU is trying to access to this page before the flush, it will just
> read/write the wrong page.
> 
> The downside of this patch is Xen flushes less TLBs. Instead of flushing all TLBs
> on the current PCPU, Xen flushes TLBs for a specific VMID on every CPUs. This
> should be safe because create_p2m_entries only deal with a specific domain.
> 
> I don't think I forget case in this function. Let me know if it's the case.
> ---
>  xen/arch/arm/p2m.c |   24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 11f4714..ad6f76e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -238,7 +238,7 @@ static int create_p2m_entries(struct domain *d,
>                       int mattr,
>                       p2m_type_t t)
>  {
> -    int rc, flush;
> +    int rc;
>      struct p2m_domain *p2m = &d->arch.p2m;
>      lpae_t *first = NULL, *second = NULL, *third = NULL;
>      paddr_t addr;
> @@ -246,10 +246,14 @@ static int create_p2m_entries(struct domain *d,
>                    cur_first_offset = ~0,
>                    cur_second_offset = ~0;
>      unsigned long count = 0;
> +    unsigned int flush = 0;
>      bool_t populate = (op == INSERT || op == ALLOCATE);
>  
>      spin_lock(&p2m->lock);
>  
> +    if ( d != current->domain )
> +        p2m_load_VTTBR(d);
> +
>      addr = start_gpaddr;
>      while ( addr < end_gpaddr )
>      {
> @@ -316,7 +320,7 @@ static int create_p2m_entries(struct domain *d,
>              cur_second_offset = second_table_offset(addr);
>          }
>  
> -        flush = third[third_table_offset(addr)].p2m.valid;
> +        flush |= third[third_table_offset(addr)].p2m.valid;
>  
>          /* Allocate a new RAM page and attach */
>          switch (op) {
> @@ -373,9 +377,6 @@ static int create_p2m_entries(struct domain *d,
>                  break;
>          }
>  
> -        if ( flush )
> -            flush_tlb_all_local();
> -
>          /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
>          if ( op == RELINQUISH && count >= 0x2000 )
>          {
> @@ -392,6 +393,16 @@ static int create_p2m_entries(struct domain *d,
>          addr += PAGE_SIZE;
>      }
>  
> +    if ( flush )
> +    {
> +        /* At the beginning of the function, Xen is updating VTTBR
> +         * with the domain where the mappings are created. In this
> +         * case it's only necessary to flush TLBs on every CPUs with
> +         * the current VMID (our domain).
> +         */
> +        flush_tlb();
> +    }
> +
>      if ( op == ALLOCATE || op == INSERT )
>      {
>          unsigned long sgfn = paddr_to_pfn(start_gpaddr);
> @@ -409,6 +420,9 @@ out:
>      if (second) unmap_domain_page(second);
>      if (first) unmap_domain_page(first);
>  
> +    if ( d != current->domain )
> +        p2m_load_VTTBR(current->domain);
> +
>      spin_unlock(&p2m->lock);
>  
>      return rc;
Ian Campbell Jan. 16, 2014, 10:46 a.m. | #8
On Thu, 2014-01-16 at 11:10 +0100, Tim Deegan wrote:
> At 15:28 +0000 on 14 Jan (1389709716), Ian Campbell wrote:
> > On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
> > > Except grant-table (I can't find {get,put}_page for grant-table code???),
> > 
> > I think they are in __gnttab_map_grant_ref, within __get_paged_frame or
> > through page_get_owner_and_reference.
> > 
> > and on unmap it is in__gnttab_unmap_common_complete.
> > 
> > It's a bit of a complex maze though so I'm not entirely sure, perhaps
> > Tim, Keir or Jan can confirm that a grant mapping always takes a
> > reference on the mapped page (it seems like PV x86 ought to be relying
> > on this for safety anyhow).
> 
> Not claiming to understand it completely, but I agree with your analysis.

Thanks.

> > I think the flush in alloc_heap_pages would also serve as a backstop,
> > wouldn't it?
> 
> Not entirely -- if the grant mapping didn't take a ref, then the page
> could be freed and reassigned with the grant mapping still in place --
> the TLB flush doesn't help if the PTE is still there. :)

True, but I think we agree that grant mappings do (and must) take refs.
Phew!

Likewise on ARM we reference count foreign mappings so this is ok wrt
this sort of thing too.

I actually forgot I'd asked this question and was waiting for feedback
-- so the patch is already in, good thing it is all fine!

Ian.

Patch

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 11f4714..ad6f76e 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -238,7 +238,7 @@  static int create_p2m_entries(struct domain *d,
                      int mattr,
                      p2m_type_t t)
 {
-    int rc, flush;
+    int rc;
     struct p2m_domain *p2m = &d->arch.p2m;
     lpae_t *first = NULL, *second = NULL, *third = NULL;
     paddr_t addr;
@@ -246,10 +246,14 @@  static int create_p2m_entries(struct domain *d,
                   cur_first_offset = ~0,
                   cur_second_offset = ~0;
     unsigned long count = 0;
+    unsigned int flush = 0;
     bool_t populate = (op == INSERT || op == ALLOCATE);
 
     spin_lock(&p2m->lock);
 
+    if ( d != current->domain )
+        p2m_load_VTTBR(d);
+
     addr = start_gpaddr;
     while ( addr < end_gpaddr )
     {
@@ -316,7 +320,7 @@  static int create_p2m_entries(struct domain *d,
             cur_second_offset = second_table_offset(addr);
         }
 
-        flush = third[third_table_offset(addr)].p2m.valid;
+        flush |= third[third_table_offset(addr)].p2m.valid;
 
         /* Allocate a new RAM page and attach */
         switch (op) {
@@ -373,9 +377,6 @@  static int create_p2m_entries(struct domain *d,
                 break;
         }
 
-        if ( flush )
-            flush_tlb_all_local();
-
         /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
         if ( op == RELINQUISH && count >= 0x2000 )
         {
@@ -392,6 +393,16 @@  static int create_p2m_entries(struct domain *d,
         addr += PAGE_SIZE;
     }
 
+    if ( flush )
+    {
+        /* At the beginning of the function, Xen is updating VTTBR
+         * with the domain where the mappings are created. In this
+         * case it's only necessary to flush TLBs on every CPUs with
+         * the current VMID (our domain).
+         */
+        flush_tlb();
+    }
+
     if ( op == ALLOCATE || op == INSERT )
     {
         unsigned long sgfn = paddr_to_pfn(start_gpaddr);
@@ -409,6 +420,9 @@  out:
     if (second) unmap_domain_page(second);
     if (first) unmap_domain_page(first);
 
+    if ( d != current->domain )
+        p2m_load_VTTBR(current->domain);
+
     spin_unlock(&p2m->lock);
 
     return rc;