mbox series

[v7,0/5] iommu/smmu-v3: Workaround for hisilicon 161010801 erratum(reserve HW MSI)

Message ID 20170914125756.14836-1-shameerali.kolothum.thodi@huawei.com
Headers show
Series iommu/smmu-v3: Workaround for hisilicon 161010801 erratum(reserve HW MSI) | expand

Message

Shameerali Kolothum Thodi Sept. 14, 2017, 12:57 p.m. UTC
On certain HiSilicon platforms (hip06/hip07) the GIC ITS and PCIe RC
deviates from the standard implementation and this breaks PCIe MSI
functionality when SMMU is enabled.

The HiSilicon erratum 161010801 describes this limitation of certain
HiSilicon platforms to support the SMMU mappings for MSI transactions.
On these platforms GICv3 ITS translator is presented with the deviceID
by extending the MSI payload data to 64 bits to include the deviceID.
Hence, the PCIe controller on this platforms has to differentiate the MSI
payload against other DMA payload and has to modify the MSI payload.
This basically makes it difficult for this platforms to have a SMMU
translation for MSI.

This patch implements an ACPI and DT based quirk to reserve the hw msi
regions in the smmu-v3 driver which means these address regions will
not be translated and will be excluded from iova allocations.

To implement this quirk, the following changes are incorporated:
1. Added a generic helper function to IORT code to retrieve the
   associated ITS base address from a device IORT node.
2. Added a generic helper function to of iommu code to retrieve the
   associated msi controller base address from for a PCI RC
   msi-mapping and also platform device msi-parent.
3. Added quirk to SMMUv3 to retrieve the HW ITS address and replace
   the default SW MSI reserve address based on the IORT SMMU model
   or DT bindings.

Changelog:

v6 --> v7
Addressed request from Will to add DT support for the erratum:
 - added bt binding
 - add of_iommu_msi_get_resv_regions()
New arm64 silicon errata entry
Rename iort_iommu_{its->msi}_get_resv_regions

v5 --> v6
Addressed comments from Robin and Lorenzo:
-No change to patch#1 .
-Reverted v5 patch#2 as this might break the platforms where this quirk
  is not applicable. Provided a generic function in iommu code and added
  back the quirk implementation in SMMU v3 driver(patch#3)
 
v4 --> v5
Addressed comments from Robin and Lorenzo:
-Added a comment to make it clear that, for now, only straightforward 
  HW topologies are handled while reserving ITS regions(patch #1).

v3 --> v4
Rebased on 4.13-rc1.
Addressed comments from Robin, Will and Lorenzo:
-As suggested by Robin, moved the ITS msi reservation into 
  iommu_dma_get_resv_regions().
-Added its_count != resv region failure case(patch #1).

v2 --> v3
Addressed comments from Lorenzo and Robin:
-Removed dev_is_pci() check in smmuV3 driver.
-Don't treat device not having an ITS mapping as an error in
  iort helper function.

v1 --> v2
-patch 2/2: Invoke iort helper fn based on fwnode type(acpi).

RFCv2 -->PATCH
-Incorporated Lorenzo's review comments.

RFC v1 --> RFC v2
Based on Robin's review comments,
-Removed  the generic erratum framework.
-Using IORT/MADT tables to retrieve the ITS base addr instead  of vendor specific CSRT table.

John Garry (2):
  Doc: iommu/arm-smmu-v3: Add workaround for HiSilicon erratum 161010801
  iommu/of: Add msi address regions reservation helper

Shameer Kolothum (3):
  ACPI/IORT: Add msi address regions reservation helper
  iommu/dma: Add a helper function to reserve HW MSI address regions for
    IOMMU drivers
  iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801

 Documentation/arm64/silicon-errata.txt             |   1 +
 .../devicetree/bindings/iommu/arm,smmu-v3.txt      |   3 +
 drivers/acpi/arm64/iort.c                          |  96 ++++++++++++++++-
 drivers/iommu/arm-smmu-v3.c                        |  28 ++++-
 drivers/iommu/dma-iommu.c                          |  19 ++++
 drivers/iommu/of_iommu.c                           | 117 +++++++++++++++++++++
 drivers/irqchip/irq-gic-v3-its.c                   |   3 +-
 include/linux/acpi_iort.h                          |   7 +-
 include/linux/dma-iommu.h                          |   7 ++
 include/linux/of_iommu.h                           |  10 ++
 10 files changed, 281 insertions(+), 10 deletions(-)

-- 
1.9.1


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Lorenzo Pieralisi Sept. 22, 2017, 2:27 p.m. UTC | #1
John, Shameer,

On Thu, Sep 14, 2017 at 01:57:54PM +0100, Shameer Kolothum wrote:
> From: John Garry <john.garry@huawei.com>

> 

> On some platforms msi-controller address regions have to be excluded

> from normal IOVA allocation in that they are detected and decoded in

> a HW specific way by system components and so they cannot be considered

> normal IOVA address space.

> 

> Add a helper function that retrieves msi address regions through device

> tree msi mapping, so that these regions will not be translated by IOMMU

> and will be excluded from IOVA allocations.

> 

> Signed-off-by: John Garry <john.garry@huawei.com>

> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

> ---

>  drivers/iommu/of_iommu.c | 117 +++++++++++++++++++++++++++++++++++++++++++++++

>  include/linux/of_iommu.h |  10 ++++

>  2 files changed, 127 insertions(+)

> 

> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c

> index 8cb6082..f2d1a76 100644

> --- a/drivers/iommu/of_iommu.c

> +++ b/drivers/iommu/of_iommu.c

> @@ -21,6 +21,7 @@

>  #include <linux/iommu.h>

>  #include <linux/limits.h>

>  #include <linux/of.h>

> +#include <linux/of_address.h>

>  #include <linux/of_iommu.h>

>  #include <linux/of_pci.h>

>  #include <linux/slab.h>

> @@ -246,6 +247,122 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,

>  	return ops;

>  }

>  

> +/**

> + * of_iommu_msi_get_resv_regions - Reserved region driver helper

> + * @dev: Device from iommu_get_resv_regions()

> + * @list: Reserved region list from iommu_get_resv_regions()

> + *

> + * Returns: Number of reserved regions on success (0 if no associated

> + *          msi parent), appropriate error value otherwise.

> + */

> +int of_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)

> +{

> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;

> +	struct iommu_resv_region *region;

> +	struct device_node *np;

> +	struct resource res;

> +	int i, resv = 0, mappings = 0;

> +

> +	if (dev_is_pci(dev)) {

> +		struct device *dma_dev, *bridge;

> +		struct of_phandle_args iommu_spec;

> +		struct pci_dev *pdev = to_pci_dev(dev);

> +		int err, count;

> +		u32 rid, map_mask;

> +		const __be32 *msi_map;

> +

> +		bridge = pci_get_host_bridge_device(pdev);

> +		dma_dev = bridge->parent;

> +		pci_put_host_bridge_device(bridge);

> +

> +		if (!dma_dev->of_node)

> +			return -ENODEV;

> +

> +		iommu_spec.args_count = 1;

> +		np = iommu_spec.np = dma_dev->of_node;

> +		pci_for_each_dma_alias(pdev, __get_pci_rid, &iommu_spec);

> +

> +		rid = iommu_spec.args[0];

> +		if (!of_property_read_u32(np, "msi-map-mask", &map_mask))

> +			rid &= map_mask;

> +

> +		msi_map = of_get_property(np, "msi-map", NULL);

> +		if (!msi_map)

> +			return -ENODEV;

> +

> +		mappings = of_count_phandle_with_args(np, "msi-map", NULL) / 4;

> +

> +		for (i = 0, count = mappings; i < count; i++, msi_map += 4) {

> +			struct device_node *msi_node;

> +			u32 rid_base, rid_len, phandle;

> +

> +			rid_base = be32_to_cpup(msi_map + 0);

> +			phandle = be32_to_cpup(msi_map + 1);

> +			rid_len = be32_to_cpup(msi_map + 3);

> +

> +			/* check rid is within range */

> +			if (rid < rid_base || rid >= rid_base + rid_len) {

> +				mappings--;

> +				continue;

> +			}

> +

> +			msi_node = of_find_node_by_phandle(phandle);

> +			if (!msi_node)

> +				return -ENODEV;


This is basically of_pci_map_rid(), I wonder whether there is not
a way to consolidate some code here - duplicating certainly does not
help. To make MSI reservations generic this is probably the only way
to do it but it would be nice to reuse some OF MSI code.

With the current kernel API there is a way but it is a bit whacky.

Just loop over "msi-controller" nodes and try to map the device to
them through of_pci_map_rid, if mapping succeeds reserve region for
the target node.

Not a big fan of what I am proposing but it certainly helps reuse
some existing code that makes no sense to duplicate.

> +			err = of_address_to_resource(msi_node, 0, &res);

> +			of_node_put(msi_node);

> +			if (err)

> +				return err;

> +

> +			region = iommu_alloc_resv_region(res.start,

> +							 resource_size(&res),

> +							 prot, IOMMU_RESV_MSI);

> +			if (region) {

> +				list_add_tail(&region->list, head);

> +				resv++;

> +			}

> +		}

> +	} else if (dev->of_node) {

> +		struct device_node *msi_np;

> +		int index = 0;

> +		int tuples;

> +

> +		np = dev->of_node;

> +

> +		tuples = of_count_phandle_with_args(np, "msi-parent", NULL);

> +

> +		while (index < tuples) {


Would not be easier to have an of_parse_phandle_with_args() loop here ?

Lorenzo

> +			int msi_cells = 0;

> +			int err;

> +

> +			msi_np = of_parse_phandle(np, "msi-parent", index);

> +			if (!msi_np)

> +				return -ENODEV;

> +

> +			of_property_read_u32(msi_np, "#msi-cells", &msi_cells);

> +

> +			err = of_address_to_resource(msi_np, 0, &res);

> +			of_node_put(msi_np);

> +			if (err)

> +				return err;

> +

> +			mappings++;

> +

> +			region = iommu_alloc_resv_region(res.start,

> +							 resource_size(&res),

> +							 prot, IOMMU_RESV_MSI);

> +			if (region) {

> +				list_add_tail(&region->list, head);

> +				resv++;

> +			}

> +			index += 1 + msi_cells;

> +		}

> +	}

> +

> +	return (resv == mappings) ? resv : -ENODEV;

> +}

> +

>  static int __init of_iommu_init(void)

>  {

>  	struct device_node *np;

> diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h

> index 13394ac..9267772 100644

> --- a/include/linux/of_iommu.h

> +++ b/include/linux/of_iommu.h

> @@ -14,6 +14,9 @@ extern int of_get_dma_window(struct device_node *dn, const char *prefix,

>  extern const struct iommu_ops *of_iommu_configure(struct device *dev,

>  					struct device_node *master_np);

>  

> +extern int of_iommu_msi_get_resv_regions(struct device *dev,

> +					struct list_head *head);

> +

>  #else

>  

>  static inline int of_get_dma_window(struct device_node *dn, const char *prefix,

> @@ -29,6 +32,13 @@ static inline const struct iommu_ops *of_iommu_configure(struct device *dev,

>  	return NULL;

>  }

>  

> +static int of_iommu_msi_get_resv_regions(struct device *dev,

> +					struct list_head *head)

> +{

> +	return -ENODEV;

> +}

> +

> +

>  #endif	/* CONFIG_OF_IOMMU */

>  

>  extern struct of_device_id __iommu_of_table;

> -- 

> 1.9.1

> 

> 

> --

> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in

> the body of a message to majordomo@vger.kernel.org

> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html