Message ID | 1620656249-68890-1-git-send-email-john.garry@huawei.com |
---|---|
Headers | show |
Series | dma mapping/iommu: Allow IOMMU IOVA rcache range to be configured | expand |
On 10/05/2021 15:17, John Garry wrote: Hi Robin, guys, A friendly reminder on this one... Thanks > For streaming DMA mappings involving an IOMMU and whose IOVA len regularly > exceeds the IOVA rcache upper limit (meaning that they are not cached), > performance can be reduced. > > This is much more pronounced from commit 4e89dce72521 ("iommu/iova: Retry > from last rb tree node if iova search fails"), as discussed at [0]. > > IOVAs which cannot be cached are highly involved in the IOVA aging issue, > as discussed at [1]. > > This series allows the IOVA rcache range be configured, so that we may > cache all IOVAs per domain, thus improving performance. > > A new IOMMU group sysfs file is added - max_opt_dma_size - which is used > indirectly to configure the IOVA rcache range: > /sys/kernel/iommu_groups/X/max_opt_dma_size > > This file is updated same as how the IOMMU group default domain type is > updated, i.e. must unbind the only device in the group first. However, the > IOMMU default domain is reallocated in the device driver reprobe, and not > immediately. > > In addition, we keep (from v1 series) the DMA mapping API to allow DMA max > optimised size be set from a LLDD. How it works is a lot different. When > the LLDD calls this during probe, once the value is successfully recorded, we > return -EDEFER_PROBE. In the reprobe, the IOMM group default domain is > reallocated, and the new IOVA domain rcache upper limit is set according > to that DMA max optimised size. As such, we don't operate on a live IOMMU > domain. > > Note that the DMA mapping API frontend is not strictly required, but saves > the LLDD calling IOMMU APIs directly, that being not preferred. > > Some figures for storage scenario: > v5.13-rc1 baseline: 1200K IOPS > With series: 1800K IOPS > > All above are for IOMMU strict mode. Non-strict mode gives ~1800K IOPS in > all scenarios. > > Patch breakdown: > 1-11: Add support for setting DMA max optimised size via sysfs > 12-15: Add support for setting DMA max optimised size from LLDD > > [0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@huawei.com/ > [1] https://lore.kernel.org/linux-iommu/1607538189-237944-1-git-send-email-john.garry@huawei.com/ > > Differences to v1: > - Many > - Change method to not operate on a 'live' IOMMU domain: > - rather, force device driver to be re-probed once > dma_max_opt_size is set, and reconfig a new IOMMU group then > - Add iommu sysfs max_dma_opt_size file, and allow updating same as how > group type is changed > > John Garry (15): > iommu: Reactor iommu_group_store_type() > iova: Allow rcache range upper limit to be flexible > iommu: Allow max opt DMA len be set for a group via sysfs > iommu: Add iommu_group_get_max_opt_dma_size() > iova: Add iova_domain_len_is_cached() > iommu: Allow iommu_change_dev_def_domain() realloc default domain for > same type > iommu: Add iommu_realloc_dev_group() > dma-iommu: Add iommu_reconfig_dev_group_dma() > iova: Add init_iova_domain_ext() > dma-iommu: Use init_iova_domain_ext() for IOVA domain init > dma-iommu: Reconfig group domain > iommu: Add iommu_set_dev_dma_opt_size() > dma-mapping: Add dma_set_max_opt_size() > dma-iommu: Add iommu_dma_set_opt_size() > scsi: hisi_sas: Set max optimal DMA size for v3 hw > > drivers/iommu/dma-iommu.c | 51 +++++- > drivers/iommu/iommu.c | 231 +++++++++++++++++++------ > drivers/iommu/iova.c | 61 +++++-- > drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 5 + > include/linux/dma-iommu.h | 4 + > include/linux/dma-map-ops.h | 1 + > include/linux/dma-mapping.h | 8 + > include/linux/iommu.h | 19 ++ > include/linux/iova.h | 21 ++- > kernel/dma/mapping.c | 11 ++ > 10 files changed, 344 insertions(+), 68 deletions(-) >