diff mbox

[Xen-devel,v3,16/24] xen/passthrough: Introduce iommu_construct

Message ID 1421159133-31526-17-git-send-email-julien.grall@linaro.org
State Superseded, archived
Headers show

Commit Message

Julien Grall Jan. 13, 2015, 2:25 p.m. UTC
This new function will correctly initialize the IOMMU page table for the
current domain.

Also use it in iommu_assign_dt_device even though the current IOMMU
implementation on ARM shares P2M with the processor.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Jan Beulich <jbeulich@suse.com>

---
    Changes in v3:
        - The ASSERT in iommu_construct was redundant with the if ()
        - Remove d->need_iommu = 1 in assign_device has it's already
        done by iommu_construct.
        - Simplify the code in the caller of iommu_construct

    Changes in v2:
        - Add missing Signed-off-by
        - Rename iommu_buildup to iommu_construct
---
 xen/drivers/passthrough/arm/iommu.c   |  6 ++++++
 xen/drivers/passthrough/device_tree.c |  4 ++++
 xen/drivers/passthrough/iommu.c       | 19 +++++++++++++++++++
 xen/drivers/passthrough/pci.c         | 15 ++++-----------
 xen/include/xen/iommu.h               |  2 ++
 5 files changed, 35 insertions(+), 11 deletions(-)

Comments

Julien Grall Jan. 20, 2015, 2:28 p.m. UTC | #1
Hi Jan,

On 19/01/15 17:02, Jan Beulich wrote:
>>>> On 13.01.15 at 15:25, <julien.grall@linaro.org> wrote:
>> --- a/xen/drivers/passthrough/device_tree.c
>> +++ b/xen/drivers/passthrough/device_tree.c
>> @@ -41,6 +41,10 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
>>      if ( !list_empty(&dev->domain_list) )
>>          goto fail;
>>  
>> +    rc = iommu_construct(d);
>> +    if ( rc )
>> +        goto fail;
> 
> Considering that the only (current) caller of this it domain_build.c I'm
> afraid you're going to get into trouble if you get back -ERESTART
> here. Note that on x86 Dom0 setup works via iommu_hwdom_init(),
> which deals with the preemption needs (at that point in time) by
> calling process_pending_softirqs() every once in a while.

iommu_hwdom_init is also called for ARM (it's part of the common domain
creation code). So, I don't see any issue here as we match the same
behavior as PCI.

FWIW, on the previous version you asked to check the need_iommu(d) in
iommu_construct. For DOM0 it will return 0 and therefore never return
-ERESTART.

Regards,
Julien Grall Jan. 20, 2015, 5:11 p.m. UTC | #2
On 20/01/15 16:40, Jan Beulich wrote:
>>>> On 20.01.15 at 15:28, <julien.grall@linaro.org> wrote:
>> On 19/01/15 17:02, Jan Beulich wrote:
>>>>>> On 13.01.15 at 15:25, <julien.grall@linaro.org> wrote:
>>>> --- a/xen/drivers/passthrough/device_tree.c
>>>> +++ b/xen/drivers/passthrough/device_tree.c
>>>> @@ -41,6 +41,10 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
>>>>      if ( !list_empty(&dev->domain_list) )
>>>>          goto fail;
>>>>  
>>>> +    rc = iommu_construct(d);
>>>> +    if ( rc )
>>>> +        goto fail;
>>>
>>> Considering that the only (current) caller of this it domain_build.c I'm
>>> afraid you're going to get into trouble if you get back -ERESTART
>>> here. Note that on x86 Dom0 setup works via iommu_hwdom_init(),
>>> which deals with the preemption needs (at that point in time) by
>>> calling process_pending_softirqs() every once in a while.
>>
>> iommu_hwdom_init is also called for ARM (it's part of the common domain
>> creation code). So, I don't see any issue here as we match the same
>> behavior as PCI.
>>
>> FWIW, on the previous version you asked to check the need_iommu(d) in
>> iommu_construct. For DOM0 it will return 0 and therefore never return
>> -ERESTART.
> 
> Quoting the function:
> 
> +int iommu_construct(struct domain *d)
> +{
> +    int rc = 0;
> +
> +    if ( need_iommu(d) > 0 )
> +        return 0;
> +
> +    if ( !iommu_use_hap_pt(d) )
> +    {
> +        rc = arch_iommu_populate_page_table(d);
> +        if ( rc )
> +            return rc;
> +    }
> +
> +    d->need_iommu = 1;
> +
> +    return rc;
> +}

> If need_iommu() returns 0 for Dom0, then the early return won't get
> used. Hence I don't follow your comment above. And if what you say
> there was correct, then I don't understand why you add the call
> quoted at the very top in the first place (again taking into consideration
> that - afaict - the only [current] caller is in domain_build.c).

I don't understand what is the issue in the device tree use case. As I
said, assign_device in the pci do exactly the same things.

While this function is currently only used for DOM0, this will be used
in a later patch for guest non-PCI passthrough.

Regards,
Julien Grall Jan. 21, 2015, 10:37 a.m. UTC | #3
On 21/01/2015 10:23, Jan Beulich wrote:
>>>> On 20.01.15 at 18:11, <julien.grall@linaro.org> wrote:
>> On 20/01/15 16:40, Jan Beulich wrote:
>>>>>> On 20.01.15 at 15:28, <julien.grall@linaro.org> wrote:
>>>> On 19/01/15 17:02, Jan Beulich wrote:
>>>>>>>> On 13.01.15 at 15:25, <julien.grall@linaro.org> wrote:
>>>>>> --- a/xen/drivers/passthrough/device_tree.c
>>>>>> +++ b/xen/drivers/passthrough/device_tree.c
>>>>>> @@ -41,6 +41,10 @@ int iommu_assign_dt_device(struct domain *d, struct
>> dt_device_node *dev)
>>>>>>       if ( !list_empty(&dev->domain_list) )
>>>>>>           goto fail;
>>>>>>
>>>>>> +    rc = iommu_construct(d);
>>>>>> +    if ( rc )
>>>>>> +        goto fail;
>>>>>
>>>>> Considering that the only (current) caller of this it domain_build.c I'm
>>>>> afraid you're going to get into trouble if you get back -ERESTART
>>>>> here. Note that on x86 Dom0 setup works via iommu_hwdom_init(),
>>>>> which deals with the preemption needs (at that point in time) by
>>>>> calling process_pending_softirqs() every once in a while.
>>>>
>>>> iommu_hwdom_init is also called for ARM (it's part of the common domain
>>>> creation code). So, I don't see any issue here as we match the same
>>>> behavior as PCI.
>>>>
>>>> FWIW, on the previous version you asked to check the need_iommu(d) in
>>>> iommu_construct. For DOM0 it will return 0 and therefore never return
>>>> -ERESTART.
>>>
>>> Quoting the function:
>>>
>>> +int iommu_construct(struct domain *d)
>>> +{
>>> +    int rc = 0;
>>> +
>>> +    if ( need_iommu(d) > 0 )
>>> +        return 0;
>>> +
>>> +    if ( !iommu_use_hap_pt(d) )
>>> +    {
>>> +        rc = arch_iommu_populate_page_table(d);
>>> +        if ( rc )
>>> +            return rc;
>>> +    }
>>> +
>>> +    d->need_iommu = 1;
>>> +
>>> +    return rc;
>>> +}
>>
>>> If need_iommu() returns 0 for Dom0, then the early return won't get
>>> used. Hence I don't follow your comment above. And if what you say
>>> there was correct, then I don't understand why you add the call
>>> quoted at the very top in the first place (again taking into consideration
>>> that - afaict - the only [current] caller is in domain_build.c).
>>
>> I don't understand what is the issue in the device tree use case. As I
>> said, assign_device in the pci do exactly the same things.
>
> Sure, but it's not being called for Dom0, but only out of the domctl
> handler.

Hmmm right. It's the main difference between non-PCI and PCI passthrough.

>
>> While this function is currently only used for DOM0, this will be used
>> in a later patch for guest non-PCI passthrough.
>
> Okay, but you shouldn't break (or alter in [seemingly] benign ways) the
> Dom0 case imo.

As iommu_hwdom_init is initialized correctly the IOMMU for DOM0, 
iommu_construct is a no-op.

Would an if ( need_iommu(d) ) will be more clear? Maybe we an assert 
(!is_hardware_domain(d)).

Regards,
Julien Grall Jan. 21, 2015, 12:13 p.m. UTC | #4
Hi Jan,

On 21/01/15 10:48, Jan Beulich wrote:
>>>> On 21.01.15 at 11:37, <julien.grall@linaro.org> wrote:
>> On 21/01/2015 10:23, Jan Beulich wrote:
>>>>>> On 20.01.15 at 18:11, <julien.grall@linaro.org> wrote:
>>>> While this function is currently only used for DOM0, this will be used
>>>> in a later patch for guest non-PCI passthrough.
>>>
>>> Okay, but you shouldn't break (or alter in [seemingly] benign ways) the
>>> Dom0 case imo.
>>
>> As iommu_hwdom_init is initialized correctly the IOMMU for DOM0, 
>> iommu_construct is a no-op.
>>
>> Would an if ( need_iommu(d) ) will be more clear? Maybe we an assert 
>> (!is_hardware_domain(d)).
> 
> Just think this through properly: iommu_hwdom_init() may leave
> Dom0's ->need_iommu at 0 or 1 (depending on iommu_dom0_strict).
> And iommu_construct() specifically is a nop only when ->need_iommu
> is positive (x86's arch_iommu_populate_page_table() sets it to a
> negative value to indicate "being set up", and I wonder how ARM
> gets away without doing so).

iommu_dom0_strict is always set to 1 when IOMMU is used on ARM (see
check_hwdom_reqs).

Futhermore, we always share the page table with the processor, so we
never need to populate the page table.

Regards,
Julien Grall Jan. 21, 2015, 2:22 p.m. UTC | #5
On 21/01/15 14:13, Jan Beulich wrote:
>>>> On 21.01.15 at 13:13, <julien.grall@linaro.org> wrote:
>> Hi Jan,
>>
>> On 21/01/15 10:48, Jan Beulich wrote:
>>>>>> On 21.01.15 at 11:37, <julien.grall@linaro.org> wrote:
>>>> On 21/01/2015 10:23, Jan Beulich wrote:
>>>>>>>> On 20.01.15 at 18:11, <julien.grall@linaro.org> wrote:
>>>>>> While this function is currently only used for DOM0, this will be used
>>>>>> in a later patch for guest non-PCI passthrough.
>>>>>
>>>>> Okay, but you shouldn't break (or alter in [seemingly] benign ways) the
>>>>> Dom0 case imo.
>>>>
>>>> As iommu_hwdom_init is initialized correctly the IOMMU for DOM0, 
>>>> iommu_construct is a no-op.
>>>>
>>>> Would an if ( need_iommu(d) ) will be more clear? Maybe we an assert 
>>>> (!is_hardware_domain(d)).
>>>
>>> Just think this through properly: iommu_hwdom_init() may leave
>>> Dom0's ->need_iommu at 0 or 1 (depending on iommu_dom0_strict).
>>> And iommu_construct() specifically is a nop only when ->need_iommu
>>> is positive (x86's arch_iommu_populate_page_table() sets it to a
>>> negative value to indicate "being set up", and I wonder how ARM
>>> gets away without doing so).
>>
>> iommu_dom0_strict is always set to 1 when IOMMU is used on ARM (see
>> check_hwdom_reqs).
>>
>> Futhermore, we always share the page table with the processor, so we
>> never need to populate the page table.
> 
> That's all far from obvious looking at the patch at hand. And you
> can then only hope that these two "always" will always remain to
> be that way.

Hence the suggested ASSERT on a previous mail. I could also add a
comment in the code explaining the ASSERT().

Regards,
Julien Grall Feb. 20, 2015, 5:45 p.m. UTC | #6
On 20/02/15 16:58, Ian Campbell wrote:
> On Tue, 2015-01-13 at 14:25 +0000, Julien Grall wrote:
>> This new function will correctly initialize the IOMMU page table for the
>> current domain.
>>
>> Also use it in iommu_assign_dt_device even though the current IOMMU
>> implementation on ARM shares P2M with the processor.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> Cc: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Although I'm not actually sure what it is going to be used for.

The caller are in this patch too :) (see pci.c and device_tree.c changes).

> Also, needs Jan to ack I think.

I have some comments from Jan which I need to address in the next series.

Regards,
Julien Grall Feb. 23, 2015, 4:04 p.m. UTC | #7
Hi Ian,

On 23/02/15 15:31, Ian Campbell wrote:
> On Fri, 2015-02-20 at 17:45 +0000, Julien Grall wrote:
>> On 20/02/15 16:58, Ian Campbell wrote:
>>> On Tue, 2015-01-13 at 14:25 +0000, Julien Grall wrote:
>>>> This new function will correctly initialize the IOMMU page table for the
>>>> current domain.
>>>>
>>>> Also use it in iommu_assign_dt_device even though the current IOMMU
>>>> implementation on ARM shares P2M with the processor.
>>>>
>>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>>>> Cc: Jan Beulich <jbeulich@suse.com>
>>>
>>> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>>>
>>> Although I'm not actually sure what it is going to be used for.
>>
>> The caller are in this patch too :) (see pci.c and device_tree.c changes).
> 
> Right, I meant "what non-trivial functionality is going to land here"
> i..e what the grandparent callers etc will eventually expect from it.

Oh ok. The function is used to the prepare the IOMMU structure for the
domain. Currently, the steps are:
	- build the page table on platforms which don't share the P2M
	- set need_iommu to 1.

Within this series, this is the only patch who touch this function.

Regards,
diff mbox

Patch

diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 3e9303a..5870aef 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -68,3 +68,9 @@  void arch_iommu_domain_destroy(struct domain *d)
 {
     iommu_dt_domain_destroy(d);
 }
+
+int arch_iommu_populate_page_table(struct domain *d)
+{
+    /* The IOMMU shares the p2m with the CPU */
+    return -ENOSYS;
+}
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 377d41d..88e496e 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -41,6 +41,10 @@  int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
     if ( !list_empty(&dev->domain_list) )
         goto fail;
 
+    rc = iommu_construct(d);
+    if ( rc )
+        goto fail;
+
     rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev));
 
     if ( rc )
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index cc12735..8915244 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -187,6 +187,25 @@  void iommu_teardown(struct domain *d)
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
+int iommu_construct(struct domain *d)
+{
+    int rc = 0;
+
+    if ( need_iommu(d) > 0 )
+        return 0;
+
+    if ( !iommu_use_hap_pt(d) )
+    {
+        rc = arch_iommu_populate_page_table(d);
+        if ( rc )
+            return rc;
+    }
+
+    d->need_iommu = 1;
+
+    return rc;
+}
+
 void iommu_domain_destroy(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 43ce5dc..9a47a37 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1355,18 +1355,11 @@  static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
     if ( !spin_trylock(&pcidevs_lock) )
         return -ERESTART;
 
-    if ( need_iommu(d) <= 0 )
+    rc = iommu_construct(d);
+    if ( rc )
     {
-        if ( !iommu_use_hap_pt(d) )
-        {
-            rc = arch_iommu_populate_page_table(d);
-            if ( rc )
-            {
-                spin_unlock(&pcidevs_lock);
-                return rc;
-            }
-        }
-        d->need_iommu = 1;
+        spin_unlock(&pcidevs_lock);
+        return rc;
     }
 
     pdev = pci_get_pdev_by_domain(hardware_domain, seg, bus, devfn);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index d0f99ef..c146ee4 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -65,6 +65,8 @@  int arch_iommu_domain_init(struct domain *d);
 int arch_iommu_populate_page_table(struct domain *d);
 void arch_iommu_check_autotranslated_hwdom(struct domain *d);
 
+int iommu_construct(struct domain *d);
+
 /* Function used internally, use iommu_domain_destroy */
 void iommu_teardown(struct domain *d);