diff mbox

[Xen-devel,v3,18/24] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody

Message ID 1421159133-31526-19-git-send-email-julien.grall@linaro.org
State Superseded, archived
Headers show

Commit Message

Julien Grall Jan. 13, 2015, 2:25 p.m. UTC
Currently, when the device is deassigned from a domain, we directly reassign
to DOM0.

As the device may not have been correctly reset, this may lead to corruption or
expose some part of DOM0 memory. Also, we may have no way to reset some
platform devices.

If Xen reassigns the device to "nobody", it may receive some global/context
fault because the transaction has failed (indeed the context has been
marked invalid). Unfortunately there is no simple way to quiesce a buggy
hardware. I think we could live with that for a first version of platform
device passthrough.

DOM0 will have to issue an hypercall to assign the device to itself if it
wants to use it.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Jan Beulich <jbeulich@suse.com>

---
    Changes in v3:
        - Use the coding style of the new SMMU drivers

    Changes in v2:
        - Fix typoes in the commit message
        - Update commit message
---
 xen/drivers/passthrough/arm/smmu.c    | 8 +++++++-
 xen/drivers/passthrough/device_tree.c | 9 +++------
 2 files changed, 10 insertions(+), 7 deletions(-)

Comments

Julien Grall Jan. 20, 2015, 2:31 p.m. UTC | #1
Hi Jan,

On 20/01/15 09:21, Jan Beulich wrote:
>>>> On 13.01.15 at 15:25, <julien.grall@linaro.org> wrote:
>> Currently, when the device is deassigned from a domain, we directly reassign
>> to DOM0.
>>
>> As the device may not have been correctly reset, this may lead to corruption 
>> or
>> expose some part of DOM0 memory. Also, we may have no way to reset some
>> platform devices.
>>
>> If Xen reassigns the device to "nobody", it may receive some global/context
>> fault because the transaction has failed (indeed the context has been
>> marked invalid). Unfortunately there is no simple way to quiesce a buggy
>> hardware. I think we could live with that for a first version of platform
>> device passthrough.
>>
>> DOM0 will have to issue an hypercall to assign the device to itself if it
>> wants to use it.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> Cc: Jan Beulich <jbeulich@suse.com>
> 
> Looks like this is because of ...
> 
>>  xen/drivers/passthrough/device_tree.c | 9 +++------
> 
> ... this. Which I think is a mistake, as I wasn't involved in the addition
> of that file. I'd appreciate an update to ./MAINTAINERS adding this
> file to the ARM section.

Ok. I will send a patch to add xen/drivers/passthrough/device_tree.c in
the "ARM section".

Regards,
Julien Grall Feb. 23, 2015, 10:10 a.m. UTC | #2
Hi Jan,

On 23/02/2015 09:40, Jan Beulich wrote:
>>>> On 20.02.15 at 18:04, <ian.campbell@citrix.com> wrote:
>> On Tue, 2015-01-13 at 14:25 +0000, Julien Grall wrote:
>>> Currently, when the device is deassigned from a domain, we directly reassign
>>> to DOM0.
>>>
>>> As the device may not have been correctly reset, this may lead to corruption
>> or
>>> expose some part of DOM0 memory. Also, we may have no way to reset some
>>> platform devices.
>>>
>>> If Xen reassigns the device to "nobody", it may receive some global/context
>>> fault because the transaction has failed (indeed the context has been
>>> marked invalid). Unfortunately there is no simple way to quiesce a buggy
>>> hardware. I think we could live with that for a first version of platform
>>> device passthrough.
>>>
>>> DOM0 will have to issue an hypercall to assign the device to itself if it
>>> wants to use it.
>>
>> Does this behaviour differ from x86? If so then it is worth calling that
>> out explicitly (even if not, good to know I think!)
> Indeed this is different from x86, and I think such behavior should
> be consistent across architectures. If Dom0 isn't handed back the
> device, who's going to do the supposedly prerequisite reset? On
> x86/PCI that's pciback's job, and it would be illegitimate for the
> driver to do _anything_ with the device if it wasn't owned by Dom0.

I think we already had this discussion on a previous version...

Right now, there is no possibility to reset a platform device in the 
kernel. There is some discussion about it but nothing more.

This series doesn't intend to have a generic solution for platform 
device pass-through. It's target embedded people who will always 
passthrough the same device to the same guest.

As DOM0 *should not* use the device (no reset driver, and no possibility 
to unbind), the safest way is to reassign the device to nobody.

So if the device is not quiescent it may mess up DOM0.

Regards,
Julien Grall Feb. 23, 2015, 10:10 a.m. UTC | #3
On 20/02/2015 17:04, Ian Campbell wrote:
> On Tue, 2015-01-13 at 14:25 +0000, Julien Grall wrote:
>> Currently, when the device is deassigned from a domain, we directly reassign
>> to DOM0.
>>
>> As the device may not have been correctly reset, this may lead to corruption or
>> expose some part of DOM0 memory. Also, we may have no way to reset some
>> platform devices.
>>
>> If Xen reassigns the device to "nobody", it may receive some global/context
>> fault because the transaction has failed (indeed the context has been
>> marked invalid). Unfortunately there is no simple way to quiesce a buggy
>> hardware. I think we could live with that for a first version of platform
>> device passthrough.
>>
>> DOM0 will have to issue an hypercall to assign the device to itself if it
>> wants to use it.
>
> Does this behaviour differ from x86? If so then it is worth calling that
> out explicitly (even if not, good to know I think!)

What do you mean by "calling that out explicitly"?

Regards,
Julien Grall Feb. 23, 2015, 4:37 p.m. UTC | #4
On 23/02/15 15:39, Ian Campbell wrote:
> On Mon, 2015-02-23 at 10:20 +0000, Jan Beulich wrote:
>>>>> On 23.02.15 at 11:10, <julien.grall@linaro.org> wrote:
>>> Hi Jan,
>>>
>>> On 23/02/2015 09:40, Jan Beulich wrote:
>>>>>>> On 20.02.15 at 18:04, <ian.campbell@citrix.com> wrote:
>>>>> On Tue, 2015-01-13 at 14:25 +0000, Julien Grall wrote:
>>>>>> Currently, when the device is deassigned from a domain, we directly reassign
>>>>>> to DOM0.
>>>>>>
>>>>>> As the device may not have been correctly reset, this may lead to corruption
>>>>> or
>>>>>> expose some part of DOM0 memory. Also, we may have no way to reset some
>>>>>> platform devices.
>>>>>>
>>>>>> If Xen reassigns the device to "nobody", it may receive some global/context
>>>>>> fault because the transaction has failed (indeed the context has been
>>>>>> marked invalid). Unfortunately there is no simple way to quiesce a buggy
>>>>>> hardware. I think we could live with that for a first version of platform
>>>>>> device passthrough.
>>>>>>
>>>>>> DOM0 will have to issue an hypercall to assign the device to itself if it
>>>>>> wants to use it.
>>>>>
>>>>> Does this behaviour differ from x86? If so then it is worth calling that
>>>>> out explicitly (even if not, good to know I think!)
>>>> Indeed this is different from x86, and I think such behavior should
>>>> be consistent across architectures. If Dom0 isn't handed back the
>>>> device, who's going to do the supposedly prerequisite reset? On
>>>> x86/PCI that's pciback's job, and it would be illegitimate for the
>>>> driver to do _anything_ with the device if it wasn't owned by Dom0.
>>>
>>> I think we already had this discussion on a previous version...
>>>
>>> Right now, there is no possibility to reset a platform device in the 
>>> kernel. There is some discussion about it but nothing more.
> 
> "there is no possibility to reset a platform device" isn't quite true,
> there is certainly a theoretical possibility (i.e. it is obviously the
> case that a dom0 could be written to do so for at least some devices).

> So what happens if such a dom0 arises in the future? I suppose the
> intention is that the user would having deassigned from domU and
> determined that there kernel has the necessary feature would do some
> sort of xl command to assign to dom0?

The list of things to do for resetting a device would be:
	- Deassign device from DOMU (though it's done during destruction)
	- Assign the device to DOM0 and map MMIO
	- Reset the device

But that not my concern right now. As we agreed, this feature (i.e reset
a device) is not in the scope of this series.

So assigning the device to DOM0 is not safe at all.

>>> This series doesn't intend to have a generic solution for platform 
>>> device pass-through. It's target embedded people who will always 
>>> passthrough the same device to the same guest.
>>>
>>> As DOM0 *should not* use the device (no reset driver, and no possibility 
>>> to unbind), the safest way is to reassign the device to nobody.
>>>
>>> So if the device is not quiescent it may mess up DOM0.
>>
>> And I think spelling this out in the description is what Ian is
>> asking for.
> 
> Yes.
> 
> I think it probably ought to be mentioned in the docs surrounding the
> hypercalls in question, for both PCI, DT and any other device type
> whether or not there is some implicit rebinding or not.

Ok.
Julien Grall Feb. 23, 2015, 4:54 p.m. UTC | #5
On 23/02/15 16:47, Jan Beulich wrote:
>>>> On 23.02.15 at 17:37, <julien.grall@linaro.org> wrote:
>> On 23/02/15 15:39, Ian Campbell wrote:
>>> On Mon, 2015-02-23 at 10:20 +0000, Jan Beulich wrote:
>>>>>>> On 23.02.15 at 11:10, <julien.grall@linaro.org> wrote:
>>>>> Right now, there is no possibility to reset a platform device in the 
>>>>> kernel. There is some discussion about it but nothing more.
>>>
>>> "there is no possibility to reset a platform device" isn't quite true,
>>> there is certainly a theoretical possibility (i.e. it is obviously the
>>> case that a dom0 could be written to do so for at least some devices).
>>
>>> So what happens if such a dom0 arises in the future? I suppose the
>>> intention is that the user would having deassigned from domU and
>>> determined that there kernel has the necessary feature would do some
>>> sort of xl command to assign to dom0?
>>
>> The list of things to do for resetting a device would be:
>> 	- Deassign device from DOMU (though it's done during destruction)
>> 	- Assign the device to DOM0 and map MMIO
>> 	- Reset the device
>>
>> But that not my concern right now. As we agreed, this feature (i.e reset
>> a device) is not in the scope of this series.
>>
>> So assigning the device to DOM0 is not safe at all.
> 
> Kind of confusing - above you state "assign the device to DOM0
> and map MMIO" as a prereq step for resetting the device, yet
> then you state assigning it back to Dom0 is not safe. Which of the
> two do you mean after all?

Ian was asking how would it be possible to reset the device and I gave
him a list of things to do.

Reassigning the device to DOM0 is not safe if there is no reset driver
in the DOM0 kernel. Unlike PCI, there is no generic way to reset a
platform device.

As for now, there is no reset driver at all in DOM0. Not reassign the
device is the better thing to do.

Regards,
Julien Grall March 10, 2015, 3:29 p.m. UTC | #6
Hi,

On 23/02/15 15:34, Ian Campbell wrote:
> On Mon, 2015-02-23 at 10:10 +0000, Julien Grall wrote:
>>
>> On 20/02/2015 17:04, Ian Campbell wrote:
>>> On Tue, 2015-01-13 at 14:25 +0000, Julien Grall wrote:
>>>> Currently, when the device is deassigned from a domain, we directly reassign
>>>> to DOM0.
>>>>
>>>> As the device may not have been correctly reset, this may lead to corruption or
>>>> expose some part of DOM0 memory. Also, we may have no way to reset some
>>>> platform devices.
>>>>
>>>> If Xen reassigns the device to "nobody", it may receive some global/context
>>>> fault because the transaction has failed (indeed the context has been
>>>> marked invalid). Unfortunately there is no simple way to quiesce a buggy
>>>> hardware. I think we could live with that for a first version of platform
>>>> device passthrough.
>>>>
>>>> DOM0 will have to issue an hypercall to assign the device to itself if it
>>>> wants to use it.
>>>
>>> Does this behaviour differ from x86?
> 
> I realise now that x86 is a red-herring, what I really meant was differ
> from other types of device (specifically PCI ones).
> 
>>  If so then it is worth calling that
>>> out explicitly (even if not, good to know I think!)
>>
>> What do you mean by "calling that out explicitly"?
> 
> Stating in the commit log or a suitably placed comment (at least under
> xen/include/public hopefully) that deassignment of dt devices behaves
> differently to deassignment of other types of devices.

I tried to search any documentation explaining the behavior of those
DOMCTL for PCI (mostly the deassign one) but I didn't find any.

By any chance, do you know if there is one? If so where?

Regards,
Julien Grall March 19, 2015, 3:07 p.m. UTC | #7
On 11/03/15 12:35, Ian Campbell wrote:
> On Tue, 2015-03-10 at 15:29 +0000, Julien Grall wrote:
>> Hi,
>>
>> On 23/02/15 15:34, Ian Campbell wrote:
>>> On Mon, 2015-02-23 at 10:10 +0000, Julien Grall wrote:
>>>>
>>>> On 20/02/2015 17:04, Ian Campbell wrote:
>>>>> On Tue, 2015-01-13 at 14:25 +0000, Julien Grall wrote:
>>>>>> Currently, when the device is deassigned from a domain, we directly reassign
>>>>>> to DOM0.
>>>>>>
>>>>>> As the device may not have been correctly reset, this may lead to corruption or
>>>>>> expose some part of DOM0 memory. Also, we may have no way to reset some
>>>>>> platform devices.
>>>>>>
>>>>>> If Xen reassigns the device to "nobody", it may receive some global/context
>>>>>> fault because the transaction has failed (indeed the context has been
>>>>>> marked invalid). Unfortunately there is no simple way to quiesce a buggy
>>>>>> hardware. I think we could live with that for a first version of platform
>>>>>> device passthrough.
>>>>>>
>>>>>> DOM0 will have to issue an hypercall to assign the device to itself if it
>>>>>> wants to use it.
>>>>>
>>>>> Does this behaviour differ from x86?
>>>
>>> I realise now that x86 is a red-herring, what I really meant was differ
>>> from other types of device (specifically PCI ones).
>>>
>>>>  If so then it is worth calling that
>>>>> out explicitly (even if not, good to know I think!)
>>>>
>>>> What do you mean by "calling that out explicitly"?
>>>
>>> Stating in the commit log or a suitably placed comment (at least under
>>> xen/include/public hopefully) that deassignment of dt devices behaves
>>> differently to deassignment of other types of devices.
>>
>> I tried to search any documentation explaining the behavior of those
>> DOMCTL for PCI (mostly the deassign one) but I didn't find any.
>>
>> By any chance, do you know if there is one? If so where?
> 
> If you didn't find it in the obvious places under xen/include/public
> then it probably doesn't exist.

Ok. I will add a comment in the patch extending XEN_DOMCTL_assign_device.

Regards,
diff mbox

Patch

diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 3cf1773..45a2db8 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2774,7 +2774,7 @@  static int arm_smmu_reassign_dev(struct domain *s, struct domain *t,
 	int ret = 0;
 
 	/* Don't allow remapping on other domain than hwdom */
-	if (t != hardware_domain)
+	if (t && t != hardware_domain)
 		return -EPERM;
 
 	if (t == s)
@@ -2784,6 +2784,12 @@  static int arm_smmu_reassign_dev(struct domain *s, struct domain *t,
 	if (ret)
 		return ret;
 
+	if (t) {
+		ret = arm_smmu_assign_dev(t, devfn, dev);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index e7eb34f..d9b486e 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -72,15 +72,12 @@  int iommu_deassign_dt_device(struct domain *d, struct dt_device_node *dev)
 
     spin_lock(&dtdevs_lock);
 
-    rc = hd->platform_ops->reassign_device(d, hardware_domain,
-                                           0, dt_to_dev(dev));
+    rc = hd->platform_ops->reassign_device(d, NULL, 0, dt_to_dev(dev));
     if ( rc )
         goto fail;
 
-    list_del(&dev->domain_list);
-
-    dt_device_set_used_by(dev, hardware_domain->domain_id);
-    list_add(&dev->domain_list, &domain_hvm_iommu(hardware_domain)->dt_devices);
+    list_del_init(&dev->domain_list);
+    dt_device_set_used_by(dev, DOMID_IO);
 
 fail:
     spin_unlock(&dtdevs_lock);