[v4,00/26] iommu: Shared Virtual Addressing and SMMUv3 support

Message ID 20200224182401.353359-1-jean-philippe@linaro.org

Message

Jean-Philippe Brucker Feb. 24, 2020, 6:23 p.m. UTC
Shared Virtual Addressing (SVA) allows sharing process page tables with
devices using the IOMMU. Add a generic implementation of the IOMMU SVA
API, and add support in the Arm SMMUv3 driver.

Previous versions of this patchset were sent over a year ago [1][2] but
we've made a lot of progress since then:

* ATS support for SMMUv3 was merged in v5.2.
* The bind() and fault reporting APIs have been merged in v5.3.
* IOASID was added in v5.5.
* SMMUv3 PASID support was added in v5.6, with some patches pending for v5.7.

* The first user of the bind() API will be merged in v5.7 [3]. The zip
  accelerator is also the first piece of hardware that I've been able to
  use for testing (previous versions were developed with software models)
  and I now have tools for evaluating SVA performance. Unfortunately I
  still don't have hardware that supports ATS and PRI; the zip accelerator
  uses stall.

These are the remaining changes for SVA support in SMMUv3. Since v3 [1]
I've fixed countless bugs and - I think - addressed everyone's comments.
Thanks to the recent MMU notifier rework, iommu-sva.c is a lot more
straightforward. I'm still unhappy with the complicated locking in the
SMMUv3 driver resulting from patch 12 (Seize private ASID), but I
haven't found anything better.

Please find all SVA patches on branches sva/current and sva/zip-devel at
https://jpbrucker.net/git/linux

[1] https://lore.kernel.org/linux-iommu/20180920170046.20154-1-jean-philippe.brucker@arm.com/
[2] https://lore.kernel.org/linux-iommu/20180511190641.23008-1-jean-philippe.brucker@arm.com/
[3] https://lore.kernel.org/linux-iommu/1581407665-13504-1-git-send-email-zhangfei.gao@linaro.org/

Jean-Philippe Brucker (26):
  mm/mmu_notifiers: pass private data down to alloc_notifier()
  iommu/sva: Manage process address spaces
  iommu: Add a page fault handler
  iommu/sva: Search mm by PASID
  iommu/iopf: Handle mm faults
  iommu/sva: Register page fault handler
  arm64: mm: Pin down ASIDs for sharing mm with devices
  iommu/io-pgtable-arm: Move some definitions to a header
  iommu/arm-smmu-v3: Manage ASIDs with xarray
  arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
  iommu/arm-smmu-v3: Share process page tables
  iommu/arm-smmu-v3: Seize private ASID
  iommu/arm-smmu-v3: Add support for VHE
  iommu/arm-smmu-v3: Enable broadcast TLB maintenance
  iommu/arm-smmu-v3: Add SVA feature checking
  iommu/arm-smmu-v3: Add dev_to_master() helper
  iommu/arm-smmu-v3: Implement mm operations
  iommu/arm-smmu-v3: Hook up ATC invalidation to mm ops
  iommu/arm-smmu-v3: Add support for Hardware Translation Table Update
  iommu/arm-smmu-v3: Maintain a SID->device structure
  iommu/arm-smmu-v3: Ratelimit event dump
  dt-bindings: document stall property for IOMMU masters
  iommu/arm-smmu-v3: Add stall support for platform devices
  PCI/ATS: Add PRI stubs
  PCI/ATS: Export symbols of PRI functions
  iommu/arm-smmu-v3: Add support for PRI

 .../devicetree/bindings/iommu/iommu.txt       |   18 +
 arch/arm64/include/asm/mmu.h                  |    1 +
 arch/arm64/include/asm/mmu_context.h          |   11 +-
 arch/arm64/kernel/cpufeature.c                |    1 +
 arch/arm64/mm/context.c                       |  103 +-
 drivers/iommu/Kconfig                         |   13 +
 drivers/iommu/Makefile                        |    2 +
 drivers/iommu/arm-smmu-v3.c                   | 1354 +++++++++++++++--
 drivers/iommu/io-pgfault.c                    |  533 +++++++
 drivers/iommu/io-pgtable-arm.c                |   27 +-
 drivers/iommu/io-pgtable-arm.h                |   30 +
 drivers/iommu/iommu-sva.c                     |  596 ++++++++
 drivers/iommu/iommu-sva.h                     |   64 +
 drivers/iommu/iommu.c                         |    1 +
 drivers/iommu/of_iommu.c                      |    5 +-
 drivers/misc/sgi-gru/grutlbpurge.c            |    4 +-
 drivers/pci/ats.c                             |    4 +
 include/linux/iommu.h                         |   73 +
 include/linux/mmu_notifier.h                  |   10 +-
 include/linux/pci-ats.h                       |    8 +
 mm/mmu_notifier.c                             |    6 +-
 21 files changed, 2699 insertions(+), 165 deletions(-)
 create mode 100644 drivers/iommu/io-pgfault.c
 create mode 100644 drivers/iommu/io-pgtable-arm.h
 create mode 100644 drivers/iommu/iommu-sva.c
 create mode 100644 drivers/iommu/iommu-sva.h

Comments

Aaro Koskinen May 28, 2021, 8:09 a.m. UTC | #1
Hi,

On Mon, Feb 24, 2020 at 07:23:56PM +0100, Jean-Philippe Brucker wrote:
> When a device or driver misbehaves, it is possible to receive events
> much faster than we can print them out. Ratelimit the printing of
> events.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>

Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>


> During the SVA tests when the device driver didn't properly stop DMA
> before unbinding, the event queue thread would almost lock-up the server
> with a flood of event 0xa. This patch helped recover from the error.

I was just debugging a similar case, and this patch was required to
prevent the system from locking up.

Could you please resend this patch independently from the other patches
in the series, as it seems it's a worthwhile fix and still relevant for
current kernels? Thanks,

A.

> ---
>  drivers/iommu/arm-smmu-v3.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 28f8583cd47b..6a5987cce03f 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -2243,17 +2243,20 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
>  	struct arm_smmu_device *smmu = dev;
>  	struct arm_smmu_queue *q = &smmu->evtq.q;
>  	struct arm_smmu_ll_queue *llq = &q->llq;
> +	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> +				      DEFAULT_RATELIMIT_BURST);
>  	u64 evt[EVTQ_ENT_DWORDS];
>  
>  	do {
>  		while (!queue_remove_raw(q, evt)) {
>  			u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
>  
> -			dev_info(smmu->dev, "event 0x%02x received:\n", id);
> -			for (i = 0; i < ARRAY_SIZE(evt); ++i)
> -				dev_info(smmu->dev, "\t0x%016llx\n",
> -					 (unsigned long long)evt[i]);
> -
> +			if (__ratelimit(&rs)) {
> +				dev_info(smmu->dev, "event 0x%02x received:\n", id);
> +				for (i = 0; i < ARRAY_SIZE(evt); ++i)
> +					dev_info(smmu->dev, "\t0x%016llx\n",
> +						 (unsigned long long)evt[i]);
> +			}
>  		}
>  
>  		/*

Jean-Philippe Brucker May 28, 2021, 4:25 p.m. UTC | #2
Hi Aaro,

On Fri, May 28, 2021 at 11:09:58AM +0300, Aaro Koskinen wrote:
> Hi,
> 
> On Mon, Feb 24, 2020 at 07:23:56PM +0100, Jean-Philippe Brucker wrote:
> > When a device or driver misbehaves, it is possible to receive events
> > much faster than we can print them out. Ratelimit the printing of
> > events.
> > 
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> 
> Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
> 
> > During the SVA tests when the device driver didn't properly stop DMA
> > before unbinding, the event queue thread would almost lock-up the server
> > with a flood of event 0xa. This patch helped recover from the error.
> 
> I was just debugging a similar case, and this patch was required to
> prevent the system from locking up.
> 
> Could you please resend this patch independently from the other patches
> in the series, as it seems it's a worthwhile fix and still relevant for
> current kernels? Thanks,

OK, I'll resend it.

Thanks,
Jean

> 
> A.
> 
> > ---
> >  drivers/iommu/arm-smmu-v3.c | 13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> > index 28f8583cd47b..6a5987cce03f 100644
> > --- a/drivers/iommu/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm-smmu-v3.c
> > @@ -2243,17 +2243,20 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
> >  	struct arm_smmu_device *smmu = dev;
> >  	struct arm_smmu_queue *q = &smmu->evtq.q;
> >  	struct arm_smmu_ll_queue *llq = &q->llq;
> > +	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
> > +				      DEFAULT_RATELIMIT_BURST);
> >  	u64 evt[EVTQ_ENT_DWORDS];
> >  
> >  	do {
> >  		while (!queue_remove_raw(q, evt)) {
> >  			u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
> >  
> > -			dev_info(smmu->dev, "event 0x%02x received:\n", id);
> > -			for (i = 0; i < ARRAY_SIZE(evt); ++i)
> > -				dev_info(smmu->dev, "\t0x%016llx\n",
> > -					 (unsigned long long)evt[i]);
> > -
> > +			if (__ratelimit(&rs)) {
> > +				dev_info(smmu->dev, "event 0x%02x received:\n", id);
> > +				for (i = 0; i < ARRAY_SIZE(evt); ++i)
> > +					dev_info(smmu->dev, "\t0x%016llx\n",
> > +						 (unsigned long long)evt[i]);
> > +			}
> >  		}
> >  
> >  		/*
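
For readers who want to see why gating the event dump behind a ratelimit
check bounds the printing cost, here is a minimal user-space sketch of
interval/burst ratelimiting. It is only an illustration of the idea, not
the kernel's __ratelimit() implementation: struct ratelimit,
ratelimit_allow() and simulate_flood() are hypothetical names, and time
is an abstract tick counter rather than jiffies.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space analogue of a ratelimit state: at most
 * 'burst' messages are allowed per window of 'interval' ticks.
 */
struct ratelimit {
	long interval;	/* window length, in abstract ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages allowed so far in this window */
	int missed;	/* messages suppressed so far in this window */
	bool started;	/* false until the first call opens a window */
};

/* Return true if the caller may print at tick 'now', false otherwise. */
static bool ratelimit_allow(struct ratelimit *rs, long now)
{
	/* Open a new window on first use or once the old one has expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/*
 * Simulate a misbehaving device raising 'events_per_tick' events on each
 * of 'ticks' ticks, and count how many events would actually be printed.
 */
static int simulate_flood(long ticks, int events_per_tick,
			  long interval, int burst)
{
	struct ratelimit rs = { .interval = interval, .burst = burst };
	int printed = 0;

	for (long t = 0; t < ticks; t++)
		for (int e = 0; e < events_per_tick; e++)
			if (ratelimit_allow(&rs, t))
				printed++;

	return printed;
}
```

With these assumptions, simulate_flood(10, 100, 5, 10) lets through 20 of
the 1000 events: two 5-tick windows of 10 messages each. Every event is
still consumed from the queue; only the printing is skipped, which is the
same shape as the change in the patch above.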