[0/7] iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts

Message ID 20190418172611.21561-1-julien.grall@arm.com

Message

Julien Grall April 18, 2019, 5:26 p.m. UTC
Hi all,

On RT, the function iommu_dma_map_msi_msg expects to be called from preemptible
context. However, this is not always the case, resulting in a splat with
CONFIG_DEBUG_ATOMIC_SLEEP:

[   48.875777] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974
[   48.875779] in_atomic(): 1, irqs_disabled(): 128, pid: 2103, name: ip
[   48.875782] INFO: lockdep is turned off.
[   48.875784] irq event stamp: 10684
[   48.875786] hardirqs last  enabled at (10683): [<ffff0000110c8d70>] _raw_spin_unlock_irqrestore+0x88/0x90
[   48.875791] hardirqs last disabled at (10684): [<ffff0000110c8b2c>] _raw_spin_lock_irqsave+0x24/0x68
[   48.875796] softirqs last  enabled at (0): [<ffff0000100ec590>] copy_process.isra.1.part.2+0x8d8/0x1970
[   48.875801] softirqs last disabled at (0): [<0000000000000000>]           (null)
[   48.875805] Preemption disabled at:
[   48.875805] [<ffff000010189ae8>] __setup_irq+0xd8/0x6c0
[   48.875811] CPU: 2 PID: 2103 Comm: ip Not tainted 5.0.3-rt1-00007-g42ede9a0fed6 #45
[   48.875815] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jan 23 2017
[   48.875817] Call trace:
[   48.875818]  dump_backtrace+0x0/0x140
[   48.875821]  show_stack+0x14/0x20
[   48.875823]  dump_stack+0xa0/0xd4
[   48.875827]  ___might_sleep+0x16c/0x1f8
[   48.875831]  rt_spin_lock+0x5c/0x70
[   48.875835]  iommu_dma_map_msi_msg+0x5c/0x1d8
[   48.875839]  gicv2m_compose_msi_msg+0x3c/0x48
[   48.875843]  irq_chip_compose_msi_msg+0x40/0x58
[   48.875846]  msi_domain_activate+0x38/0x98
[   48.875849]  __irq_domain_activate_irq+0x58/0xa0
[   48.875852]  irq_domain_activate_irq+0x34/0x58
[   48.875855]  irq_activate+0x28/0x30
[   48.875858]  __setup_irq+0x2b0/0x6c0
[   48.875861]  request_threaded_irq+0xdc/0x188
[   48.875865]  sky2_setup_irq+0x44/0xf8
[   48.875868]  sky2_open+0x1a4/0x240
[   48.875871]  __dev_open+0xd8/0x188
[   48.875874]  __dev_change_flags+0x164/0x1f0
[   48.875877]  dev_change_flags+0x20/0x60
[   48.875879]  do_setlink+0x2a0/0xd30
[   48.875882]  __rtnl_newlink+0x5b4/0x6d8
[   48.875885]  rtnl_newlink+0x50/0x78
[   48.875888]  rtnetlink_rcv_msg+0x178/0x640
[   48.875891]  netlink_rcv_skb+0x58/0x118
[   48.875893]  rtnetlink_rcv+0x14/0x20
[   48.875896]  netlink_unicast+0x188/0x200
[   48.875898]  netlink_sendmsg+0x248/0x3d8
[   48.875900]  sock_sendmsg+0x18/0x40
[   48.875904]  ___sys_sendmsg+0x294/0x2d0
[   48.875908]  __sys_sendmsg+0x68/0xb8
[   48.875911]  __arm64_sys_sendmsg+0x20/0x28
[   48.875914]  el0_svc_common+0x90/0x118
[   48.875918]  el0_svc_handler+0x2c/0x80
[   48.875922]  el0_svc+0x8/0xc

This series is a first attempt to rework how MSIs are mapped and composed
when an IOMMU is present.

I was able to test the changes on GICv2m and GICv3 ITS. I don't have
hardware for the other interrupt controllers.

Cheers,

Julien Grall (7):
  genirq/msi: Add a new field in msi_desc to store an IOMMU cookie
  iommu/dma-iommu: Split iommu_dma_map_msi_msg in two parts
  irqchip/gicv2m: Don't map the MSI page in gicv2m_compose_msi_msg
  irqchip/gic-v3-its: Don't map the MSI page in its_irq_compose_msi_msg
  irqchip/ls-scfg-msi: Don't map the MSI page in ls_scfg_msi_compose_msg
  irqchip/gic-v3-mbi: Don't map the MSI page in mbi_compose_m{b, s}i_msg
  iommu/dma-iommu: Remove iommu_dma_map_msi_msg()

 drivers/iommu/dma-iommu.c         | 45 ++++++++++++++++++++-------------------
 drivers/irqchip/irq-gic-v2m.c     |  8 ++++++-
 drivers/irqchip/irq-gic-v3-its.c  |  5 ++++-
 drivers/irqchip/irq-gic-v3-mbi.c  | 15 +++++++++++--
 drivers/irqchip/irq-ls-scfg-msi.c |  7 +++++-
 include/linux/dma-iommu.h         | 20 +++++++++++++++--
 include/linux/msi.h               |  3 +++
 7 files changed, 74 insertions(+), 29 deletions(-)

-- 
2.11.0
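
As a rough illustration of the split described in the patch text quoted in comment #2, the user-space mock below caches a lookup result in the descriptor during a preparation step, so that the later compose step needs no lock. This is only a sketch: the types (`mock_msi_desc`, `mock_msi_msg`, `mock_msi_page`), the `behind_iommu` flag, and the addresses are hypothetical stand-ins, not the kernel definitions, and the real compose step may treat the low address bits differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; these are not the kernel structures. */
struct mock_msi_page {
	uint64_t phys;	/* physical address of the MSI doorbell (arbitrary) */
	uint64_t iova;	/* address the device would use through the IOMMU */
};

struct mock_msi_desc {
	const struct mock_msi_page *iommu_cookie;	/* cached by prepare */
};

struct mock_msi_msg {
	uint32_t address_hi;
	uint32_t address_lo;
};

/* One pre-mapped page stands in for the IOMMU page-table lookup. */
static const struct mock_msi_page mock_page = {
	.phys = 0x1000ULL,
	.iova = 0x8000000ULL,
};

/*
 * Step 1: in the kernel this part takes a lock that can sleep on RT, so
 * it belongs in preemptible context. The mock only caches the result.
 */
static int mock_prepare_msi(struct mock_msi_desc *desc, uint64_t msi_addr,
			    int behind_iommu)
{
	if (!behind_iommu) {
		desc->iommu_cookie = NULL;
		return 0;
	}

	desc->iommu_cookie = (msi_addr == mock_page.phys) ? &mock_page : NULL;
	if (!desc->iommu_cookie)
		return -1;	/* the patch returns -ENOMEM at this point */
	return 0;
}

/* Step 2: lock-free; rewrites the address only if a cookie was cached. */
static void mock_compose_msi_msg(const struct mock_msi_desc *desc,
				 struct mock_msi_msg *msg)
{
	if (!desc->iommu_cookie)
		return;

	msg->address_hi = (uint32_t)(desc->iommu_cookie->iova >> 32);
	msg->address_lo = (uint32_t)desc->iommu_cookie->iova;
}
```

A caller would run `mock_prepare_msi()` once when the interrupt is allocated and `mock_compose_msi_msg()` each time the message is written. A device that is not behind an IOMMU keeps its original doorbell address because no cookie is stored.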

Comments

Christoph Hellwig April 23, 2019, 7:08 a.m. UTC | #1
On Thu, Apr 18, 2019 at 06:26:06PM +0100, Julien Grall wrote:
> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
>  {
> +	struct device *dev = msi_desc_to_dev(desc);
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>  	struct iommu_dma_cookie *cookie;
>  	unsigned long flags;
>  
> +	if (!domain || !domain->iova_cookie) {
> +		desc->iommu_cookie = NULL;
> +		return 0;
> +	}
>  
>  	cookie = domain->iova_cookie;
>  
> @@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>  	 * of an MSI from within an IPI handler.
>  	 */
>  	spin_lock_irqsave(&cookie->msi_lock, flags);
> +	desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
>  	spin_unlock_irqrestore(&cookie->msi_lock, flags);
>  
> +	return (desc->iommu_cookie) ? 0 : -ENOMEM;

No need for the braces.  Also I personally find a:

	if (!desc->iommu_cookie)
		return -ENOMEM;
	return 0;

much more readable, but that might just be personal preference.

Marc Zyngier April 23, 2019, 10:54 a.m. UTC | #2
On 18/04/2019 18:26, Julien Grall wrote:
> On RT, the function iommu_dma_map_msi_msg may be called from
> non-preemptible context. This will lead to a splat with
> CONFIG_DEBUG_ATOMIC_SLEEP as the function is using spin_lock
> (they can sleep on RT).
> 
> The function iommu_dma_map_msi_msg is used to map the MSI page in the
> IOMMU PT and update the MSI message with the IOVA.
> 
> Only the part to lookup for the MSI page requires to be called in
> preemptible context. As the MSI page cannot change over the lifecycle
> of the MSI interrupt, the lookup can be cached and re-used later on.
> 
> This patch split the function iommu_dma_map_msi_msg in two new
> functions:
>     - iommu_dma_prepare_msi: This function will prepare the mapping in
>     the IOMMU and store the cookie in the structure msi_desc. This
>     function should be called in preemptible context.
>     - iommu_dma_compose_msi_msg: This function will update the MSI
>     message with the IOVA when the device is behind an IOMMU.
> 
> Signed-off-by: Julien Grall <julien.grall@arm.com>
> ---
>  drivers/iommu/dma-iommu.c | 43 ++++++++++++++++++++++++++++++++-----------
>  include/linux/dma-iommu.h | 21 +++++++++++++++++++++
>  2 files changed, 53 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 77aabe637a60..f5c1f1685095 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -888,17 +888,17 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
>  	return NULL;
>  }
>  
> -void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)

I quite like the idea of moving from having an irq to having an msi_desc
passed to the IOMMU layer...

>  {
> -	struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
> +	struct device *dev = msi_desc_to_dev(desc);
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>  	struct iommu_dma_cookie *cookie;
> -	struct iommu_dma_msi_page *msi_page;
> -	phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
>  	unsigned long flags;
>  
> -	if (!domain || !domain->iova_cookie)
> -		return;
> +	if (!domain || !domain->iova_cookie) {
> +		desc->iommu_cookie = NULL;
> +		return 0;
> +	}
>  
>  	cookie = domain->iova_cookie;
>  
> @@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>  	 * of an MSI from within an IPI handler.
>  	 */
>  	spin_lock_irqsave(&cookie->msi_lock, flags);
> -	msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
> +	desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
>  	spin_unlock_irqrestore(&cookie->msi_lock, flags);
>  
> -	if (WARN_ON(!msi_page)) {
> +	return (desc->iommu_cookie) ? 0 : -ENOMEM;
> +}
> +
> +void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)

... but I'd like it even better if it was uniform. Can you please move
the irq_get_msi_desc() to the callers of iommu_dma_compose_msi_msg(),
and make both functions take a msi_desc?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

Julien Grall April 23, 2019, 10:55 a.m. UTC | #3
Hi,

On 4/23/19 8:08 AM, Christoph Hellwig wrote:
> On Thu, Apr 18, 2019 at 06:26:06PM +0100, Julien Grall wrote:
>> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
>>   {
>> +	struct device *dev = msi_desc_to_dev(desc);
>>   	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>>   	struct iommu_dma_cookie *cookie;
>>   	unsigned long flags;
>>   
>> +	if (!domain || !domain->iova_cookie) {
>> +		desc->iommu_cookie = NULL;
>> +		return 0;
>> +	}
>>   
>>   	cookie = domain->iova_cookie;
>>   
>> @@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>>   	 * of an MSI from within an IPI handler.
>>   	 */
>>   	spin_lock_irqsave(&cookie->msi_lock, flags);
>> +	desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
>>   	spin_unlock_irqrestore(&cookie->msi_lock, flags);
>>   
>> +	return (desc->iommu_cookie) ? 0 : -ENOMEM;
> 
> No need for the braces.  Also I personally find a:
> 
> 	if (!desc->iommu_cookie)
> 		return -ENOMEM;
> 	return 0;
> 
> much more readable, but that might just be personal preference.

I am happy either way. I will use your suggestion in the next version.

Cheers,

-- 
Julien Grall

Julien Grall April 29, 2019, 1:14 p.m. UTC | #4
Hi Marc,

On 23/04/2019 11:54, Marc Zyngier wrote:
> On 18/04/2019 18:26, Julien Grall wrote:
>> On RT, the function iommu_dma_map_msi_msg may be called from
>> non-preemptible context. This will lead to a splat with
>> CONFIG_DEBUG_ATOMIC_SLEEP as the function is using spin_lock
>> (they can sleep on RT).
>>
>> The function iommu_dma_map_msi_msg is used to map the MSI page in the
>> IOMMU PT and update the MSI message with the IOVA.
>>
>> Only the part to lookup for the MSI page requires to be called in
>> preemptible context. As the MSI page cannot change over the lifecycle
>> of the MSI interrupt, the lookup can be cached and re-used later on.
>>
>> This patch split the function iommu_dma_map_msi_msg in two new
>> functions:
>>      - iommu_dma_prepare_msi: This function will prepare the mapping in
>>      the IOMMU and store the cookie in the structure msi_desc. This
>>      function should be called in preemptible context.
>>      - iommu_dma_compose_msi_msg: This function will update the MSI
>>      message with the IOVA when the device is behind an IOMMU.
>>
>> Signed-off-by: Julien Grall <julien.grall@arm.com>
>> ---
>>   drivers/iommu/dma-iommu.c | 43 ++++++++++++++++++++++++++++++++-----------
>>   include/linux/dma-iommu.h | 21 +++++++++++++++++++++
>>   2 files changed, 53 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 77aabe637a60..f5c1f1685095 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -888,17 +888,17 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
>>   	return NULL;
>>   }
>>   
>> -void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>> +int iommu_dma_prepare_msi(struct msi_desc *desc, phys_addr_t msi_addr)
> 
> I quite like the idea of moving from having an irq to having an msi_desc
> passed to the IOMMU layer...
> 
>>   {
>> -	struct device *dev = msi_desc_to_dev(irq_get_msi_desc(irq));
>> +	struct device *dev = msi_desc_to_dev(desc);
>>   	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>>   	struct iommu_dma_cookie *cookie;
>> -	struct iommu_dma_msi_page *msi_page;
>> -	phys_addr_t msi_addr = (u64)msg->address_hi << 32 | msg->address_lo;
>>   	unsigned long flags;
>>   
>> -	if (!domain || !domain->iova_cookie)
>> -		return;
>> +	if (!domain || !domain->iova_cookie) {
>> +		desc->iommu_cookie = NULL;
>> +		return 0;
>> +	}
>>   
>>   	cookie = domain->iova_cookie;
>>   
>> @@ -908,10 +908,33 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
>>   	 * of an MSI from within an IPI handler.
>>   	 */
>>   	spin_lock_irqsave(&cookie->msi_lock, flags);
>> -	msi_page = iommu_dma_get_msi_page(dev, msi_addr, domain);
>> +	desc->iommu_cookie = iommu_dma_get_msi_page(dev, msi_addr, domain);
>>   	spin_unlock_irqrestore(&cookie->msi_lock, flags);
>>   
>> -	if (WARN_ON(!msi_page)) {
>> +	return (desc->iommu_cookie) ? 0 : -ENOMEM;
>> +}
>> +
>> +void iommu_dma_compose_msi_msg(int irq, struct msi_msg *msg)
> 
> ... but I'd like it even better if it was uniform. Can you please move
> the irq_get_msi_desc() to the callers of iommu_dma_compose_msi_msg(),
> and make both functions take a msi_desc?

Makes sense. I will modify iommu_dma_compose_msi_msg to take an msi_desc as a parameter.

Cheers,

-- 
Julien Grall
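
The interface change agreed on in comments #2 and #4 (the caller resolves the descriptor, and the compose helper takes it directly) might look roughly like the user-space sketch below. It assumes a hypothetical fixed-size descriptor table standing in for `irq_get_msi_desc()`. Every `mock_` name, the `has_cookie` flag, and the addresses are illustrative rather than taken from the kernel or from a later version of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the kernel's msi_desc or msi_msg. */
struct mock_msi_desc {
	int has_cookie;		/* nonzero once a mapping has been cached */
	uint64_t iova;		/* cached IOVA of the MSI doorbell */
};

struct mock_msi_msg {
	uint32_t address_hi;
	uint32_t address_lo;
};

#define MOCK_NR_IRQS 4
#define MOCK_DOORBELL_PHYS 0x1000u	/* arbitrary doorbell address */

static struct mock_msi_desc mock_descs[MOCK_NR_IRQS];

/* Plays the role of irq_get_msi_desc(): maps an irq number to a descriptor. */
static struct mock_msi_desc *mock_irq_get_msi_desc(unsigned int irq)
{
	return irq < MOCK_NR_IRQS ? &mock_descs[irq] : NULL;
}

/* The helper takes a descriptor, matching the prepare-side signature. */
static void mock_compose_msi_msg(const struct mock_msi_desc *desc,
				 struct mock_msi_msg *msg)
{
	if (!desc->has_cookie)
		return;

	msg->address_hi = (uint32_t)(desc->iova >> 32);
	msg->address_lo = (uint32_t)desc->iova;
}

/* The irqchip-side caller now performs the irq -> descriptor lookup. */
static void mock_chip_compose_msi_msg(unsigned int irq,
				      struct mock_msi_msg *msg)
{
	struct mock_msi_desc *desc = mock_irq_get_msi_desc(irq);

	msg->address_hi = 0;
	msg->address_lo = MOCK_DOORBELL_PHYS;

	if (desc)
		mock_compose_msi_msg(desc, msg);
}
```

With the lookup in the caller, both helpers depend only on the descriptor and no longer need the irq number.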