diff mbox series

[v1,1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold()

Message ID 1604384682-15837-2-git-send-email-cang@codeaurora.org
State New
Headers show
Series [v1,1/2] scsi: ufs: Fix unbalanced scsi_block_reqs_cnt caused by ufshcd_hold() | expand

Commit Message

Can Guo Nov. 3, 2020, 6:24 a.m. UTC
The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be
decreased back in ufshcd_ungate_work() in a paired way. However, if
specific ufshcd_hold/release sequences are met, it is possible that
scsi_block_reqs_cnt is increased twice but only one ungate work is
queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() and
ufshcd_ungate_work() in a paired way, increase it only if queue_work()
returns true.

Signed-off-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Hongwu Su <hongwus@codeaurora.org>
---
 drivers/scsi/ufs/ufshcd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Stanley Chu Nov. 3, 2020, 7:07 a.m. UTC | #1
Hi Can,

On Mon, 2020-11-02 at 22:24 -0800, Can Guo wrote:
> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be

> decreased back in ufshcd_ungate_work() in a paired way. However, if

> specific ufshcd_hold/release sequences are met, it is possible that

> scsi_block_reqs_cnt is increased twice but only one ungate work is

> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() and


Just curious that how could this be possible? Would you have some failed
examples?

> ufshcd_ungate_work() in a paired way, increase it only if queue_work()

> returns true.

> 

> Signed-off-by: Can Guo <cang@codeaurora.org>

> Reviewed-by: Hongwu Su <hongwus@codeaurora.org>

> ---

>  drivers/scsi/ufs/ufshcd.c | 6 +++---

>  1 file changed, 3 insertions(+), 3 deletions(-)

> 

> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> index 847f355..efa7d86 100644

> --- a/drivers/scsi/ufs/ufshcd.c

> +++ b/drivers/scsi/ufs/ufshcd.c

> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)

>  		 */

>  		/* fallthrough */

>  	case CLKS_OFF:

> -		ufshcd_scsi_block_requests(hba);

>  		hba->clk_gating.state = REQ_CLKS_ON;

>  		trace_ufshcd_clk_gating(dev_name(hba->dev),

>  					hba->clk_gating.state);

> -		queue_work(hba->clk_gating.clk_gating_workq,

> -			   &hba->clk_gating.ungate_work);

> +		if (queue_work(hba->clk_gating.clk_gating_workq,

> +			       &hba->clk_gating.ungate_work))

> +			ufshcd_scsi_block_requests(hba);

>  		/*

>  		 * fall through to check if we should wait for this

>  		 * work to be done or not.


Thanks,
Stanley Chu
Can Guo Nov. 3, 2020, 10:01 a.m. UTC | #2
On 2020-11-03 15:07, Stanley Chu wrote:
> Hi Can,

> 

> On Mon, 2020-11-02 at 22:24 -0800, Can Guo wrote:

>> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be

>> decreased back in ufshcd_ungate_work() in a paired way. However, if

>> specific ufshcd_hold/release sequences are met, it is possible that

>> scsi_block_reqs_cnt is increased twice but only one ungate work is

>> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() 

>> and

> 

> Just curious that how could this be possible? Would you have some 

> failed

> examples?

> 


[1] One gate_work() is in the workqueue, not yet executed, now clk state 
== REQ_CLKS_OFF.
[2] ufshcd_queuecommand() calls ufshcd_hold(async == ture) -> 
active_req++ -> scsi_block_reqs_cnt++ -> REQ_CLKS_ON -> queue ungate 
work -> active_req-- -> return -EAGAIN.
[3] Now gate_work() starts to run, but since the clk state is 
REQ_CLKS_ON, gate_work() just sets clk state to CLKS_ON and bail.
[3] Someone calls ufshcd_hold(async == false) -> do something -> 
ufshcd_release() -> clk state is changed to REQ_CLKS_OFF. Note that, 
till now, ungate_work() is still in the work queue, not executed yet.
[4] Now, if someone calls ufshcd_hold(), we will hit the issue.

Above sequence is a very common clk gate/ungate sequence. The issue
is because ungate_work is queued but cannot be executed in time. In my
case, I see the ungate_work is somehow delayed for about 150ms. This
change has been tested by customers on multiple platforms. And you
can tell from the code that it won't break anything. :)

Thanks,

Can Guo.

>> ufshcd_ungate_work() in a paired way, increase it only if queue_work()

>> returns true.

>> 

>> Signed-off-by: Can Guo <cang@codeaurora.org>

>> Reviewed-by: Hongwu Su <hongwus@codeaurora.org>

>> ---

>>  drivers/scsi/ufs/ufshcd.c | 6 +++---

>>  1 file changed, 3 insertions(+), 3 deletions(-)

>> 

>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

>> index 847f355..efa7d86 100644

>> --- a/drivers/scsi/ufs/ufshcd.c

>> +++ b/drivers/scsi/ufs/ufshcd.c

>> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool 

>> async)

>>  		 */

>>  		/* fallthrough */

>>  	case CLKS_OFF:

>> -		ufshcd_scsi_block_requests(hba);

>>  		hba->clk_gating.state = REQ_CLKS_ON;

>>  		trace_ufshcd_clk_gating(dev_name(hba->dev),

>>  					hba->clk_gating.state);

>> -		queue_work(hba->clk_gating.clk_gating_workq,

>> -			   &hba->clk_gating.ungate_work);

>> +		if (queue_work(hba->clk_gating.clk_gating_workq,

>> +			       &hba->clk_gating.ungate_work))

>> +			ufshcd_scsi_block_requests(hba);

>>  		/*

>>  		 * fall through to check if we should wait for this

>>  		 * work to be done or not.

> 

> Thanks,

> Stanley Chu
Stanley Chu Nov. 3, 2020, 2:03 p.m. UTC | #3
Hi Can,

On Tue, 2020-11-03 at 18:01 +0800, Can Guo wrote:
> On 2020-11-03 15:07, Stanley Chu wrote:

> > Hi Can,

> > 

> > On Mon, 2020-11-02 at 22:24 -0800, Can Guo wrote:

> >> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be

> >> decreased back in ufshcd_ungate_work() in a paired way. However, if

> >> specific ufshcd_hold/release sequences are met, it is possible that

> >> scsi_block_reqs_cnt is increased twice but only one ungate work is

> >> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() 

> >> and

> > 

> > Just curious that how could this be possible? Would you have some 

> > failed

> > examples?

> > 

> 

> [1] One gate_work() is in the workqueue, not yet executed, now clk state 

> == REQ_CLKS_OFF.

> [2] ufshcd_queuecommand() calls ufshcd_hold(async == ture) -> 

> active_req++ -> scsi_block_reqs_cnt++ -> REQ_CLKS_ON -> queue ungate 

> work -> active_req-- -> return -EAGAIN.

> [3] Now gate_work() starts to run, but since the clk state is 

> REQ_CLKS_ON, gate_work() just sets clk state to CLKS_ON and bail.

> [3] Someone calls ufshcd_hold(async == false) -> do something -> 

> ufshcd_release() -> clk state is changed to REQ_CLKS_OFF. Note that, 

> till now, ungate_work() is still in the work queue, not executed yet.

> [4] Now, if someone calls ufshcd_hold(), we will hit the issue.

> 

> Above sequence is a very common clk gate/ungate sequence. The issue

> is because ungate_work is queued but cannot be executed in time. In my

> case, I see the ungate_work is somehow delayed for about 150ms. This

> change has been tested by customers on multiple platforms. And you

> can tell from the code that it won't break anything. :)


Thanks so much for the details. Looks good to me.

Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>


> 

> Thanks,

> 

> Can Guo.

> 

> >> ufshcd_ungate_work() in a paired way, increase it only if queue_work()

> >> returns true.

> >> 

> >> Signed-off-by: Can Guo <cang@codeaurora.org>

> >> Reviewed-by: Hongwu Su <hongwus@codeaurora.org>

> >> ---

> >>  drivers/scsi/ufs/ufshcd.c | 6 +++---

> >>  1 file changed, 3 insertions(+), 3 deletions(-)

> >> 

> >> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> >> index 847f355..efa7d86 100644

> >> --- a/drivers/scsi/ufs/ufshcd.c

> >> +++ b/drivers/scsi/ufs/ufshcd.c

> >> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool 

> >> async)

> >>  		 */

> >>  		/* fallthrough */

> >>  	case CLKS_OFF:

> >> -		ufshcd_scsi_block_requests(hba);

> >>  		hba->clk_gating.state = REQ_CLKS_ON;

> >>  		trace_ufshcd_clk_gating(dev_name(hba->dev),

> >>  					hba->clk_gating.state);

> >> -		queue_work(hba->clk_gating.clk_gating_workq,

> >> -			   &hba->clk_gating.ungate_work);

> >> +		if (queue_work(hba->clk_gating.clk_gating_workq,

> >> +			       &hba->clk_gating.ungate_work))

> >> +			ufshcd_scsi_block_requests(hba);

> >>  		/*

> >>  		 * fall through to check if we should wait for this

> >>  		 * work to be done or not.

> > 

> > Thanks,

> > Stanley Chu
Bean Huo Nov. 3, 2020, 3:45 p.m. UTC | #4
> 

> Signed-off-by: Can Guo <cang@codeaurora.org>

> Reviewed-by: Hongwu Su <hongwus@codeaurora.org>


Reviewed-by: Bean Huo <beanhuo@micron.com>
Asutosh Das (asd) Nov. 11, 2020, 5:33 p.m. UTC | #5
On 11/2/2020 10:24 PM, Can Guo wrote:
> The scsi_block_reqs_cnt increased in ufshcd_hold() is supposed to be

> decreased back in ufshcd_ungate_work() in a paired way. However, if

> specific ufshcd_hold/release sequences are met, it is possible that

> scsi_block_reqs_cnt is increased twice but only one ungate work is

> queued. To make sure scsi_block_reqs_cnt is handled by ufshcd_hold() and

> ufshcd_ungate_work() in a paired way, increase it only if queue_work()

> returns true.

> 

> Signed-off-by: Can Guo <cang@codeaurora.org>

> Reviewed-by: Hongwu Su <hongwus@codeaurora.org>

> ---


Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>


>   drivers/scsi/ufs/ufshcd.c | 6 +++---

>   1 file changed, 3 insertions(+), 3 deletions(-)

> 

> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> index 847f355..efa7d86 100644

> --- a/drivers/scsi/ufs/ufshcd.c

> +++ b/drivers/scsi/ufs/ufshcd.c

> @@ -1634,12 +1634,12 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)

>   		 */

>   		/* fallthrough */

>   	case CLKS_OFF:

> -		ufshcd_scsi_block_requests(hba);

>   		hba->clk_gating.state = REQ_CLKS_ON;

>   		trace_ufshcd_clk_gating(dev_name(hba->dev),

>   					hba->clk_gating.state);

> -		queue_work(hba->clk_gating.clk_gating_workq,

> -			   &hba->clk_gating.ungate_work);

> +		if (queue_work(hba->clk_gating.clk_gating_workq,

> +			       &hba->clk_gating.ungate_work))

> +			ufshcd_scsi_block_requests(hba);

>   		/*

>   		 * fall through to check if we should wait for this

>   		 * work to be done or not.

> 



-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project
diff mbox series

Patch

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 847f355..efa7d86 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1634,12 +1634,12 @@  int ufshcd_hold(struct ufs_hba *hba, bool async)
 		 */
 		/* fallthrough */
 	case CLKS_OFF:
-		ufshcd_scsi_block_requests(hba);
 		hba->clk_gating.state = REQ_CLKS_ON;
 		trace_ufshcd_clk_gating(dev_name(hba->dev),
 					hba->clk_gating.state);
-		queue_work(hba->clk_gating.clk_gating_workq,
-			   &hba->clk_gating.ungate_work);
+		if (queue_work(hba->clk_gating.clk_gating_workq,
+			       &hba->clk_gating.ungate_work))
+			ufshcd_scsi_block_requests(hba);
 		/*
 		 * fall through to check if we should wait for this
 		 * work to be done or not.