[v2,2/2] thunderbolt: Make iommu_dma_protection more accurate

Message ID 0dd14883930c9f55ace22162e23765a37d91a057.1647624084.git.robin.murphy@arm.com
State New
Series thunderbolt: Make iommu_dma_protection more accurate

Commit Message

Robin Murphy March 18, 2022, 5:42 p.m. UTC
Between me trying to get rid of iommu_present() and Mario wanting to
support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
that the iommu_dma_protection attribute is being far too optimistic.
Even if an IOMMU might be present for some PCI segment in the system,
that doesn't necessarily mean it provides translation for the device(s)
we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
is tell us that memory was protected before the kernel was loaded, and
prevent the user from disabling the intel-iommu driver entirely. While
that lets us assume kernel integrity, what matters for actual runtime
DMA protection is whether we trust individual devices, based on the
"external facing" property that we expect firmware to describe for
Thunderbolt ports.

It's proven challenging to determine the appropriate ports accurately
given the variety of possible topologies, so while still not getting a
perfect answer, by putting enough faith in firmware we can at least get
a good bit closer. If we can see that any device near a Thunderbolt NHI
has all the requisites for Kernel DMA Protection, chances are that it
*is* a relevant port, but moreover that implies that firmware is playing
the game overall, so we'll use that to assume that all Thunderbolt ports
should be correctly marked and thus will end up fully protected.

CC: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

v2: Give up trying to look for specific devices, just look for evidence
    that firmware cares at all.

 drivers/thunderbolt/domain.c | 12 +++--------
 drivers/thunderbolt/nhi.c    | 41 ++++++++++++++++++++++++++++++++++++
 include/linux/thunderbolt.h  |  2 ++
 3 files changed, 46 insertions(+), 9 deletions(-)

Comments

Robin Murphy March 21, 2022, 11:11 a.m. UTC | #1
On 2022-03-21 10:58, mika.westerberg@linux.intel.com wrote:
> Hi Mario,
> 
> On Fri, Mar 18, 2022 at 10:29:59PM +0000, Limonciello, Mario wrote:
>> [Public]
>>
>>> Between me trying to get rid of iommu_present() and Mario wanting to
>>> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has
>>> shown
>>> that the iommu_dma_protection attribute is being far too optimistic.
>>> Even if an IOMMU might be present for some PCI segment in the system,
>>> that doesn't necessarily mean it provides translation for the device(s)
>>> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
>>> is tell us that memory was protected before the kernel was loaded, and
>>> prevent the user from disabling the intel-iommu driver entirely. While
>>> that lets us assume kernel integrity, what matters for actual runtime
>>> DMA protection is whether we trust individual devices, based on the
>>> "external facing" property that we expect firmware to describe for
>>> Thunderbolt ports.
>>>
>>> It's proven challenging to determine the appropriate ports accurately
>>> given the variety of possible topologies, so while still not getting a
>>> perfect answer, by putting enough faith in firmware we can at least get
>>> a good bit closer. If we can see that any device near a Thunderbolt NHI
>>> has all the requisites for Kernel DMA Protection, chances are that it
>>> *is* a relevant port, but moreover that implies that firmware is playing
>>> the game overall, so we'll use that to assume that all Thunderbolt ports
>>> should be correctly marked and thus will end up fully protected.
>>>
>>
>> This approach looks generally good to me.  I do worry a little bit about older
>> systems that didn't set ExternalFacingPort in the FW but were previously setting
>> iommu_dma_protection, but I think that those could be treated on a quirk
>> basis to set PCI IDs for those root ports as external facing if/when they come
>> up.
> 
> There are no such systems out there AFAICT.

And even if there are, as above they've never actually been fully 
protected and still won't be, so it's arguably a good thing for them to 
stop thinking so.

>> I'll send up a follow up patch that adds the AMD ACPI table check.
>> If it looks good can roll it into your series for v3, or if this series goes
>> as is for v2 it can come on its own.
>>
>>> CC: Mario Limonciello <mario.limonciello@amd.com>
>>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>>> ---
>>>
>>> v2: Give up trying to look for specific devices, just look for evidence
>>>      that firmware cares at all.
>>
>> I still do think you could know exactly which devices to use if you're in
>> SW CM mode, but I guess the consensus is to not bifurcate the way this
>> can be checked.
> 
> Indeed.
> 
> The patch looks good to me now. I will give it a try on a couple of
> systems later today or tomorrow and let you guys know how it went. I
> don't expect any problems but let's see.
> 
> Thanks a lot Robin for working on this :)

Heh, let's just hope the other half-dozen or so subsystems I need to 
touch for this IOMMU cleanup aren't all quite as involved as this turned 
out to be :)

Cheers,
Robin.

Christoph Hellwig March 22, 2022, 9:16 a.m. UTC | #2
On Fri, Mar 18, 2022 at 05:42:58PM +0000, Robin Murphy wrote:
> Between me trying to get rid of iommu_present() and Mario wanting to
> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
> that the iommu_dma_protection attribute is being far too optimistic.
> Even if an IOMMU might be present for some PCI segment in the system,
> that doesn't necessarily mean it provides translation for the device(s)
> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
> is tell us that memory was protected before the kernel was loaded, and
> prevent the user from disabling the intel-iommu driver entirely. While
> that lets us assume kernel integrity, what matters for actual runtime
> DMA protection is whether we trust individual devices, based on the
> "external facing" property that we expect firmware to describe for
> Thunderbolt ports.
> 
> It's proven challenging to determine the appropriate ports accurately
> given the variety of possible topologies, so while still not getting a
> perfect answer, by putting enough faith in firmware we can at least get
> a good bit closer. If we can see that any device near a Thunderbolt NHI
> has all the requisites for Kernel DMA Protection, chances are that it
> *is* a relevant port, but moreover that implies that firmware is playing
> the game overall, so we'll use that to assume that all Thunderbolt ports
> should be correctly marked and thus will end up fully protected.
> 
> CC: Mario Limonciello <mario.limonciello@amd.com>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

Looks sensible to me:

Acked-by: Christoph Hellwig <hch@lst.de>

Robin Murphy March 22, 2022, 2:40 p.m. UTC | #3
On 2022-03-22 11:41, Mika Westerberg wrote:
> Hi Robin,
> 
> I tried this now on two Intel systems. One with integrated Thunderbolt
> and one with discrete. There was a small issue, see below but once fixed
> it worked as expected :)
> 
> On Fri, Mar 18, 2022 at 05:42:58PM +0000, Robin Murphy wrote:
>> Between me trying to get rid of iommu_present() and Mario wanting to
>> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
>> that the iommu_dma_protection attribute is being far too optimistic.
>> Even if an IOMMU might be present for some PCI segment in the system,
>> that doesn't necessarily mean it provides translation for the device(s)
>> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
>> is tell us that memory was protected before the kernel was loaded, and
>> prevent the user from disabling the intel-iommu driver entirely. While
>> that lets us assume kernel integrity, what matters for actual runtime
>> DMA protection is whether we trust individual devices, based on the
>> "external facing" property that we expect firmware to describe for
>> Thunderbolt ports.
>>
>> It's proven challenging to determine the appropriate ports accurately
>> given the variety of possible topologies, so while still not getting a
>> perfect answer, by putting enough faith in firmware we can at least get
>> a good bit closer. If we can see that any device near a Thunderbolt NHI
>> has all the requisites for Kernel DMA Protection, chances are that it
>> *is* a relevant port, but moreover that implies that firmware is playing
>> the game overall, so we'll use that to assume that all Thunderbolt ports
>> should be correctly marked and thus will end up fully protected.
>>
>> CC: Mario Limonciello <mario.limonciello@amd.com>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>
>> v2: Give up trying to look for specific devices, just look for evidence
>>      that firmware cares at all.
>>
>>   drivers/thunderbolt/domain.c | 12 +++--------
>>   drivers/thunderbolt/nhi.c    | 41 ++++++++++++++++++++++++++++++++++++
>>   include/linux/thunderbolt.h  |  2 ++
>>   3 files changed, 46 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
>> index 7018d959f775..2889a214dadc 100644
>> --- a/drivers/thunderbolt/domain.c
>> +++ b/drivers/thunderbolt/domain.c
>> @@ -7,9 +7,7 @@
>>    */
>>   
>>   #include <linux/device.h>
>> -#include <linux/dmar.h>
>>   #include <linux/idr.h>
>> -#include <linux/iommu.h>
>>   #include <linux/module.h>
>>   #include <linux/pm_runtime.h>
>>   #include <linux/slab.h>
>> @@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct device *dev,
>>   					 struct device_attribute *attr,
>>   					 char *buf)
>>   {
>> -	/*
>> -	 * Kernel DMA protection is a feature where Thunderbolt security is
>> -	 * handled natively using IOMMU. It is enabled when IOMMU is
>> -	 * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
>> -	 */
>> -	return sprintf(buf, "%d\n",
>> -		       iommu_present(&pci_bus_type) && dmar_platform_optin());
>> +	struct tb *tb = container_of(dev, struct tb, dev);
>> +
>> +	return sysfs_emit(buf, "%d\n", tb->nhi->iommu_dma_protection);
>>   }
>>   static DEVICE_ATTR_RO(iommu_dma_protection);
>>   
>> diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
>> index c73da0532be4..9e396e283792 100644
>> --- a/drivers/thunderbolt/nhi.c
>> +++ b/drivers/thunderbolt/nhi.c
>> @@ -14,6 +14,7 @@
>>   #include <linux/errno.h>
>>   #include <linux/pci.h>
>>   #include <linux/interrupt.h>
>> +#include <linux/iommu.h>
>>   #include <linux/module.h>
>>   #include <linux/delay.h>
>>   #include <linux/property.h>
>> @@ -1102,6 +1103,45 @@ static void nhi_check_quirks(struct tb_nhi *nhi)
>>   		nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
>>   }
>>   
>> +static int nhi_check_iommu_pdev(struct pci_dev *pdev, void *data)
>> +{
>> +	if (!pdev->untrusted ||
>> +	    !dev_iommu_capable(&pdev->dev, IOMMU_CAP_PRE_BOOT_PROTECTION))
> 
> This one needs to take the pdev->external_facing into account too
> because most of the time there are no existing tunnels when the driver
> is loaded so we only see the PCIe root/downstream port. I think this is
> enough actually:
> 
> 	if (!pdev->external_facing ||
> 	    !dev_iommu_capable(&pdev->dev, IOMMU_CAP_PRE_BOOT_PROTECTION))

Ah yes, my bad, for some reason I got the misapprehension into my head 
that untrusted was propagated to the port as well, not just the devices 
behind it. I'll fix this and tweak the comment below to match.

>> +		return 0;
>> +	*(bool *)data = true;
>> +	return 1; /* Stop walking */
>> +}
>> +
>> +static void nhi_check_iommu(struct tb_nhi *nhi)
>> +{
>> +	struct pci_bus *bus = nhi->pdev->bus;
>> +	bool port_ok = false;
>> +
>> +	/*
>> +	 * Ideally what we'd do here is grab every PCI device that
>> +	 * represents a tunnelling adapter for this NHI and check their
>> +	 * status directly, but unfortunately USB4 seems to make it
>> +	 * obnoxiously difficult to reliably make any correlation.
>> +	 *
>> +	 * So for now we'll have to bodge it... Hoping that the system
>> +	 * is at least sane enough that an adapter is in the same PCI
>> +	 * segment as its NHI, if we can find *something* on that segment
>> +	 * which meets the requirements for Kernel DMA Protection, we'll
>> +	 * take that to imply that firmware is aware and has (hopefully)
>> +	 * done the right thing in general. We need to know that the PCI
>> +	 * layer has seen the ExternalFacingPort property and propagated
>> +	 * it to the "untrusted" flag that the IOMMU layer will then
>> +	 * enforce, but also that the IOMMU driver itself can be trusted
>> +	 * not to have been subverted by a pre-boot DMA attack.
>> +	 */
>> +	while (bus->parent)
>> +		bus = bus->parent;
>> +
>> +	pci_walk_bus(bus, nhi_check_iommu_pdev, &port_ok);
>> +
>> +	nhi->iommu_dma_protection = port_ok;
> 
> I would put here a log debug, something like this:
> 
> dev_dbg(&nhi->pdev->dev, "IOMMU DMA protection is %sabled\n",
> 	port_ok ? "en" : "dis");

Ack. I'll wait and send a v3 once the merge window's over, and can roll 
Mario's AMD IOMMU patch into that too.

Thanks,
Robin.
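
The distinction raised in review (the port itself carries external_facing, while untrusted is only set on the devices behind it) can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct toy_dev, toy_walk() and the two match functions are hypothetical stand-ins for struct pci_dev, pci_walk_bus() and nhi_check_iommu_pdev(), and the pre_boot_protection field stands in for dev_iommu_capable(dev, IOMMU_CAP_PRE_BOOT_PROTECTION).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct pci_dev fields that matter here. */
struct toy_dev {
	bool external_facing;     /* firmware marked this port as external facing */
	bool untrusted;           /* device sits behind an external-facing port */
	bool pre_boot_protection; /* models IOMMU_CAP_PRE_BOOT_PROTECTION */
	const struct toy_dev *children;
	size_t nr_children;
};

typedef bool (*toy_match_fn)(const struct toy_dev *dev);

/* Depth-first walk that stops as soon as one device satisfies match(). */
bool toy_walk(const struct toy_dev *devs, size_t n, toy_match_fn match)
{
	assert(match != NULL);

	for (size_t i = 0; i < n; i++) {
		if (match(&devs[i]))
			return true;
		if (toy_walk(devs[i].children, devs[i].nr_children, match))
			return true;
	}
	return false;
}

/* The check as posted in v2: only matches devices *behind* a port. */
bool match_untrusted(const struct toy_dev *dev)
{
	return dev->untrusted && dev->pre_boot_protection;
}

/* The check suggested in review: matches the port itself. */
bool match_external_facing(const struct toy_dev *dev)
{
	return dev->external_facing && dev->pre_boot_protection;
}
```

With a lone external-facing port and nothing tunnelled behind it, toy_walk() finds no match using match_untrusted but does find one using match_external_facing, which corresponds to the situation Mika describes where no tunnels exist yet when the driver loads.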
Patch

diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
index 7018d959f775..2889a214dadc 100644
--- a/drivers/thunderbolt/domain.c
+++ b/drivers/thunderbolt/domain.c
@@ -7,9 +7,7 @@ 
  */
 
 #include <linux/device.h>
-#include <linux/dmar.h>
 #include <linux/idr.h>
-#include <linux/iommu.h>
 #include <linux/module.h>
 #include <linux/pm_runtime.h>
 #include <linux/slab.h>
@@ -257,13 +255,9 @@  static ssize_t iommu_dma_protection_show(struct device *dev,
 					 struct device_attribute *attr,
 					 char *buf)
 {
-	/*
-	 * Kernel DMA protection is a feature where Thunderbolt security is
-	 * handled natively using IOMMU. It is enabled when IOMMU is
-	 * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
-	 */
-	return sprintf(buf, "%d\n",
-		       iommu_present(&pci_bus_type) && dmar_platform_optin());
+	struct tb *tb = container_of(dev, struct tb, dev);
+
+	return sysfs_emit(buf, "%d\n", tb->nhi->iommu_dma_protection);
 }
 static DEVICE_ATTR_RO(iommu_dma_protection);
 
diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index c73da0532be4..9e396e283792 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -14,6 +14,7 @@ 
 #include <linux/errno.h>
 #include <linux/pci.h>
 #include <linux/interrupt.h>
+#include <linux/iommu.h>
 #include <linux/module.h>
 #include <linux/delay.h>
 #include <linux/property.h>
@@ -1102,6 +1103,45 @@  static void nhi_check_quirks(struct tb_nhi *nhi)
 		nhi->quirks |= QUIRK_AUTO_CLEAR_INT;
 }
 
+static int nhi_check_iommu_pdev(struct pci_dev *pdev, void *data)
+{
+	if (!pdev->untrusted ||
+	    !dev_iommu_capable(&pdev->dev, IOMMU_CAP_PRE_BOOT_PROTECTION))
+		return 0;
+	*(bool *)data = true;
+	return 1; /* Stop walking */
+}
+
+static void nhi_check_iommu(struct tb_nhi *nhi)
+{
+	struct pci_bus *bus = nhi->pdev->bus;
+	bool port_ok = false;
+
+	/*
+	 * Ideally what we'd do here is grab every PCI device that
+	 * represents a tunnelling adapter for this NHI and check their
+	 * status directly, but unfortunately USB4 seems to make it
+	 * obnoxiously difficult to reliably make any correlation.
+	 *
+	 * So for now we'll have to bodge it... Hoping that the system
+	 * is at least sane enough that an adapter is in the same PCI
+	 * segment as its NHI, if we can find *something* on that segment
+	 * which meets the requirements for Kernel DMA Protection, we'll
+	 * take that to imply that firmware is aware and has (hopefully)
+	 * done the right thing in general. We need to know that the PCI
+	 * layer has seen the ExternalFacingPort property and propagated
+	 * it to the "untrusted" flag that the IOMMU layer will then
+	 * enforce, but also that the IOMMU driver itself can be trusted
+	 * not to have been subverted by a pre-boot DMA attack.
+	 */
+	while (bus->parent)
+		bus = bus->parent;
+
+	pci_walk_bus(bus, nhi_check_iommu_pdev, &port_ok);
+
+	nhi->iommu_dma_protection = port_ok;
+}
+
 static int nhi_init_msi(struct tb_nhi *nhi)
 {
 	struct pci_dev *pdev = nhi->pdev;
@@ -1219,6 +1259,7 @@  static int nhi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return -ENOMEM;
 
 	nhi_check_quirks(nhi);
+	nhi_check_iommu(nhi);
 
 	res = nhi_init_msi(nhi);
 	if (res) {
diff --git a/include/linux/thunderbolt.h b/include/linux/thunderbolt.h
index 124e13cb1469..7a8ad984e651 100644
--- a/include/linux/thunderbolt.h
+++ b/include/linux/thunderbolt.h
@@ -465,6 +465,7 @@  static inline struct tb_xdomain *tb_service_parent(struct tb_service *svc)
  * @msix_ida: Used to allocate MSI-X vectors for rings
  * @going_away: The host controller device is about to disappear so when
  *		this flag is set, avoid touching the hardware anymore.
+ * @iommu_dma_protection: An IOMMU will isolate external-facing ports.
  * @interrupt_work: Work scheduled to handle ring interrupt when no
  *		    MSI-X is used.
  * @hop_count: Number of rings (end point hops) supported by NHI.
@@ -479,6 +480,7 @@  struct tb_nhi {
 	struct tb_ring **rx_rings;
 	struct ida msix_ida;
 	bool going_away;
+	bool iommu_dma_protection;
 	struct work_struct interrupt_work;
 	u32 hop_count;
 	unsigned long quirks;
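
For reference, the attribute touched by this patch is exposed as a "0" or "1" text file in sysfs, so a userspace consumer might read it along the following lines. This is a minimal sketch under assumptions: parse_bool_attr() and read_bool_attr() are hypothetical helper names, and the exact path (for example /sys/bus/thunderbolt/devices/domain0/iommu_dma_protection) depends on which domain the system actually has.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Parse a sysfs-style boolean ("0\n" or "1\n") from an open stream.
 * Returns 0 on success, -1 if the content is not 0 or 1.
 */
int parse_bool_attr(FILE *f, bool *out)
{
	int v;

	assert(f != NULL && out != NULL);

	if (fscanf(f, "%d", &v) != 1 || (v != 0 && v != 1))
		return -1;

	*out = (v == 1);
	return 0;
}

/*
 * Open the attribute at path and parse it.
 * Returns 0 on success, -1 if the file cannot be opened or parsed.
 */
int read_bool_attr(const char *path, bool *out)
{
	FILE *f = fopen(path, "r");
	int ret;

	if (!f)
		return -1;

	ret = parse_bool_attr(f, out);
	fclose(f);
	return ret;
}
```

A caller would pass the domain's iommu_dma_protection path to read_bool_attr() and treat a -1 return as "attribute unavailable" rather than as "unprotected".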