[1/1] iommu/arm-smmu-v3: replace writel with writel_relaxed in queue_inc_prod

Message ID 1497956694-11784-1-git-send-email-thunder.leizhen@huawei.com
State New

Commit Message

Leizhen (ThunderTown) June 20, 2017, 11:04 a.m. UTC
This function is protected by a spinlock, and the latter implicitly
performs a memory barrier, so we can safely use writel_relaxed. In fact,
the dmb operation lengthens the time the lock is held, which indirectly
increases lock contention under stress.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>

---
 drivers/iommu/arm-smmu-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.5.0

Comments

Robin Murphy June 20, 2017, 11:35 a.m. UTC | #1
On 20/06/17 12:04, Zhen Lei wrote:
> This function is protected by a spinlock, and the latter implicitly
> performs a memory barrier, so we can safely use writel_relaxed. In fact,
> the dmb operation lengthens the time the lock is held, which indirectly
> increases lock contention under stress.


If you remove the DSB between writing the commands (to Normal memory)
and writing the pointer (to Device memory), how can you guarantee that
the complete command is visible to the SMMU and it isn't going to try to
consume stale memory contents? The spinlock is irrelevant since it's
taken *before* the command is written.

Robin.

> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 380969a..d2fbee3 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -728,7 +728,7 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
>  	u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;
>
>  	q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
> -	writel(q->prod, q->prod_reg);
> +	writel_relaxed(q->prod, q->prod_reg);
>  }
>
>  /*
> --
> 2.5.0
>
>
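
Editorial sketch of the ordering Robin describes: the command must be
fully written before the producer index is published, otherwise the
consumer may read stale contents. This is a userspace analogy only, not
the driver's code. A C11 release store stands in for the write barrier
that writel() issues before the register write on arm64, and an acquire
load stands in for the SMMU reading the queue. All names (sketch_queue,
sketch_queue_push, sketch_queue_pop, SKETCH_QUEUE_ENTRIES) are
hypothetical, and the queue is assumed to have one producer and one
consumer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_QUEUE_ENTRIES 8u /* hypothetical size, must be a power of two */

static_assert((SKETCH_QUEUE_ENTRIES & (SKETCH_QUEUE_ENTRIES - 1)) == 0,
	      "queue size must be a power of two");

struct sketch_queue {
	uint64_t ents[SKETCH_QUEUE_ENTRIES]; /* stands in for command memory */
	_Atomic uint32_t prod;               /* stands in for the PROD register */
	_Atomic uint32_t cons;               /* stands in for the CONS register */
};

/* Producer: write the command first, then publish the new index. */
static bool sketch_queue_push(struct sketch_queue *q, uint64_t cmd)
{
	uint32_t prod = atomic_load_explicit(&q->prod, memory_order_relaxed);
	uint32_t cons = atomic_load_explicit(&q->cons, memory_order_acquire);

	if (prod - cons == SKETCH_QUEUE_ENTRIES)
		return false; /* queue full */

	q->ents[prod % SKETCH_QUEUE_ENTRIES] = cmd;

	/*
	 * Release ordering keeps the entry write above from being
	 * reordered after the index update. With memory_order_relaxed
	 * here (the analogue of writel_relaxed), the consumer could
	 * observe the new index before the entry itself.
	 */
	atomic_store_explicit(&q->prod, prod + 1, memory_order_release);
	return true;
}

/* Consumer: read the index first, then the entry it covers. */
static bool sketch_queue_pop(struct sketch_queue *q, uint64_t *cmd)
{
	uint32_t cons = atomic_load_explicit(&q->cons, memory_order_relaxed);
	uint32_t prod = atomic_load_explicit(&q->prod, memory_order_acquire);

	if (cons == prod)
		return false; /* queue empty */

	*cmd = q->ents[cons % SKETCH_QUEUE_ENTRIES];
	atomic_store_explicit(&q->cons, cons + 1, memory_order_release);
	return true;
}
```

Note that taking a lock before the entry write would not help in this
sketch either: the ordering that matters is between the entry write and
the index store, both of which happen inside the critical section.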
Leizhen (ThunderTown) June 21, 2017, 1:28 a.m. UTC | #2
On 2017/6/20 19:35, Robin Murphy wrote:
> On 20/06/17 12:04, Zhen Lei wrote:
>> This function is protected by a spinlock, and the latter implicitly
>> performs a memory barrier, so we can safely use writel_relaxed. In fact,
>> the dmb operation lengthens the time the lock is held, which indirectly
>> increases lock contention under stress.
>
> If you remove the DSB between writing the commands (to Normal memory)
> and writing the pointer (to Device memory), how can you guarantee that
> the complete command is visible to the SMMU and it isn't going to try to
> consume stale memory contents? The spinlock is irrelevant since it's
> taken *before* the command is written.

OK, I see, thanks. Let me see if there are any other methods. I also think
that this may be something the hardware should handle.

>
> Robin.
>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 380969a..d2fbee3 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -728,7 +728,7 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
>>  	u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;
>>
>>  	q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
>> -	writel(q->prod, q->prod_reg);
>> +	writel_relaxed(q->prod, q->prod_reg);
>>  }
>>
>>  /*
>> --
>> 2.5.0
>>
>
> .
>


-- 
Thanks!
BestRegards
Will Deacon June 21, 2017, 9:08 a.m. UTC | #3
On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote:
> On 2017/6/20 19:35, Robin Murphy wrote:
> > On 20/06/17 12:04, Zhen Lei wrote:
> >> This function is protected by a spinlock, and the latter implicitly
> >> performs a memory barrier, so we can safely use writel_relaxed. In fact,
> >> the dmb operation lengthens the time the lock is held, which indirectly
> >> increases lock contention under stress.
> >
> > If you remove the DSB between writing the commands (to Normal memory)
> > and writing the pointer (to Device memory), how can you guarantee that
> > the complete command is visible to the SMMU and it isn't going to try to
> > consume stale memory contents? The spinlock is irrelevant since it's
> > taken *before* the command is written.
> OK, I see, thanks. Let me see if there are any other methods. I also think
> that this may be something the hardware should handle.


FWIW, I did use the _relaxed variants wherever I could when I wrote the
driver. There might, of course, be bugs, but it's not like the normal case
for drivers where the author didn't consider the _relaxed accessors
initially.

Will
Leizhen (ThunderTown) June 26, 2017, 1:29 p.m. UTC | #4
On 2017/6/21 17:08, Will Deacon wrote:
> On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote:
>> On 2017/6/20 19:35, Robin Murphy wrote:
>>> On 20/06/17 12:04, Zhen Lei wrote:
>>>> This function is protected by a spinlock, and the latter implicitly
>>>> performs a memory barrier, so we can safely use writel_relaxed. In fact,
>>>> the dmb operation lengthens the time the lock is held, which indirectly
>>>> increases lock contention under stress.
>>>
>>> If you remove the DSB between writing the commands (to Normal memory)
>>> and writing the pointer (to Device memory), how can you guarantee that
>>> the complete command is visible to the SMMU and it isn't going to try to
>>> consume stale memory contents? The spinlock is irrelevant since it's
>>> taken *before* the command is written.
>> OK, I see, thanks. Let me see if there are any other methods. I also think
>> that this may be something the hardware should handle.
>
> FWIW, I did use the _relaxed variants wherever I could when I wrote the
> driver. There might, of course, be bugs, but it's not like the normal case
> for drivers where the author didn't consider the _relaxed accessors
> initially.

Good news: I have a new idea and will post v2 later.

>
> Will
>
> .
>


-- 
Thanks!
BestRegards
Leizhen (ThunderTown) June 26, 2017, 1:41 p.m. UTC | #5
On 2017/6/26 21:29, Leizhen (ThunderTown) wrote:
>
>
> On 2017/6/21 17:08, Will Deacon wrote:
>> On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote:
>>> On 2017/6/20 19:35, Robin Murphy wrote:
>>>> On 20/06/17 12:04, Zhen Lei wrote:
>>>>> This function is protected by a spinlock, and the latter implicitly
>>>>> performs a memory barrier, so we can safely use writel_relaxed. In fact,
>>>>> the dmb operation lengthens the time the lock is held, which indirectly
>>>>> increases lock contention under stress.
>>>>
>>>> If you remove the DSB between writing the commands (to Normal memory)
>>>> and writing the pointer (to Device memory), how can you guarantee that
>>>> the complete command is visible to the SMMU and it isn't going to try to
>>>> consume stale memory contents? The spinlock is irrelevant since it's
>>>> taken *before* the command is written.
>>> OK, I see, thanks. Let me see if there are any other methods. I also think
>>> that this may be something the hardware should handle.
>>
>> FWIW, I did use the _relaxed variants wherever I could when I wrote the
>> driver. There might, of course, be bugs, but it's not like the normal case
>> for drivers where the author didn't consider the _relaxed accessors
>> initially.
> Good news: I have a new idea and will post v2 later.

[PATCH 0/5] arm-smmu: performance optimization
[PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

I just sent.

>
>>
>> Will
>>
>> .
>>
>


-- 
Thanks!
BestRegards

Patch

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 380969a..d2fbee3 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -728,7 +728,7 @@  static void queue_inc_prod(struct arm_smmu_queue *q)
 	u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;

 	q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
-	writel(q->prod, q->prod_reg);
+	writel_relaxed(q->prod, q->prod_reg);
 }

 /*