diff mbox series

[V2] scsi: ufs: Fix ufshcd_request_sense_async() for Samsung KLUFG8RHDA-B2D1

Message ID 20210823050117.11608-1-adrian.hunter@intel.com
State New
Headers show
Series [V2] scsi: ufs: Fix ufshcd_request_sense_async() for Samsung KLUFG8RHDA-B2D1 | expand

Commit Message

Adrian Hunter Aug. 23, 2021, 5:01 a.m. UTC
From: Adrian Hunter <ajhmls@gmail.com>

Samsung KLUFG8RHDA-B2D1 does not clear the unit attention condition if the
length is zero. So go back to requesting all the sense data, as it was
before patch "scsi: ufs: Request sense data asynchronously". That is
simpler than creating and maintaining a quirk for affected devices.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---


Changes in V2

	Alter comment to note the need for non-zero sense size.


 drivers/scsi/ufs/ufshcd.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

Comments

Bart Van Assche Aug. 24, 2021, 2:27 a.m. UTC | #1
On 8/22/21 10:01 PM, Adrian Hunter wrote:
> Samsung KLUFG8RHDA-B2D1 does not clear the unit attention condition if the

> length is zero. So go back to requesting all the sense data, as it was

> before patch "scsi: ufs: Request sense data asynchronously". That is

> simpler than creating and maintaining a quirk for affected devices.


Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Martin K. Petersen Aug. 24, 2021, 3:34 a.m. UTC | #2
Adrian,

> Samsung KLUFG8RHDA-B2D1 does not clear the unit attention condition if

> the length is zero. So go back to requesting all the sense data, as it

> was before patch "scsi: ufs: Request sense data asynchronously". That

> is simpler than creating and maintaining a quirk for affected devices.


Applied to 5.15/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering
Adrian Hunter Aug. 24, 2021, 11:34 a.m. UTC | #3
On 24/08/21 6:34 am, Martin K. Petersen wrote:
> 

> Adrian,

> 

>> Samsung KLUFG8RHDA-B2D1 does not clear the unit attention condition if

>> the length is zero. So go back to requesting all the sense data, as it

>> was before patch "scsi: ufs: Request sense data asynchronously". That

>> is simpler than creating and maintaining a quirk for affected devices.

> 

> Applied to 5.15/scsi-staging, thanks!

> 


Can you please drop V2 of this patch?  I will send a V3 with the buffer leak fixed.
Martin K. Petersen Aug. 25, 2021, 2:59 a.m. UTC | #4
Adrian,

> Samsung KLUFG8RHDA-B2D1 does not clear the unit attention condition if

> the length is zero. So go back to requesting all the sense data, as it

> was before patch "scsi: ufs: Request sense data asynchronously". That

> is simpler than creating and maintaining a quirk for affected devices.


Applied to 5.15/scsi-staging, thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering
Adrian Hunter Sept. 2, 2021, 10:18 a.m. UTC | #5
Hi

UFS devices can remain runtime suspended at system suspend time,
if the conditions are right.  Add support for that, first fixing
the impediments.


Adrian Hunter (3):
      scsi: ufs: Fix error handler clear ua deadlock
      scsi: ufs: Fix runtime PM dependencies getting broken
      scsi: ufs: Let devices remain runtime suspended during system suspend

 drivers/scsi/scsi_pm.c     | 16 +++++++---
 drivers/scsi/ufs/ufshcd.c  | 79 +++++++++++++++++++++++++++++++---------------
 drivers/scsi/ufs/ufshcd.h  | 11 ++++++-
 include/scsi/scsi_device.h |  1 +
 4 files changed, 75 insertions(+), 32 deletions(-)


Regards
Adrian
Asutosh Das (asd) Sept. 2, 2021, 4:53 p.m. UTC | #6
On 9/2/2021 3:18 AM, Adrian Hunter wrote:
> There is no guarantee to be able to enter the queue if requests are

> blocked. That is because freezing the queue will block entry to the

> queue, but freezing also waits for outstanding requests which can make

> no progress while the queue is blocked.

> 

> That situation can happen when the error handler issues requests to

> clear unit attention condition. The deadlock is very unlikely, so the

> error handler can be expected to clear ua at some point anyway, so the

> simple solution is not to wait to enter the queue.

> 

> Additionally, note that the RPMB queue might be not be entered because

> it is runtime suspended, but in that case ua will be cleared at RPMB

> runtime resume.

> 

> Fixes: aa53f580e67b49 ("scsi: ufs: Minor adjustments to error handling")

> Cc: stable@vger.kernel.org

> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

> ---


Looks good to me.
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>


>   drivers/scsi/ufs/ufshcd.c | 33 +++++++++++++++++++--------------

>   1 file changed, 19 insertions(+), 14 deletions(-)

> 

> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> index 67889d74761c..52fb059efa77 100644

> --- a/drivers/scsi/ufs/ufshcd.c

> +++ b/drivers/scsi/ufs/ufshcd.c

> @@ -224,7 +224,7 @@ static int ufshcd_reset_and_restore(struct ufs_hba *hba);

>   static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd);

>   static int ufshcd_clear_tm_cmd(struct ufs_hba *hba, int tag);

>   static void ufshcd_hba_exit(struct ufs_hba *hba);

> -static int ufshcd_clear_ua_wluns(struct ufs_hba *hba);

> +static int ufshcd_clear_ua_wluns(struct ufs_hba *hba, bool nowait);

>   static int ufshcd_probe_hba(struct ufs_hba *hba, bool async);

>   static int ufshcd_setup_clocks(struct ufs_hba *hba, bool on);

>   static int ufshcd_uic_hibern8_enter(struct ufs_hba *hba);

> @@ -4110,7 +4110,7 @@ int ufshcd_link_recovery(struct ufs_hba *hba)

>   		dev_err(hba->dev, "%s: link recovery failed, err %d",

>   			__func__, ret);

>   	else

> -		ufshcd_clear_ua_wluns(hba);

> +		ufshcd_clear_ua_wluns(hba, false);

>   

>   	return ret;

>   }

> @@ -5974,7 +5974,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)

>   	ufshcd_release(hba);

>   	if (ufshcd_is_clkscaling_supported(hba))

>   		ufshcd_clk_scaling_suspend(hba, false);

> -	ufshcd_clear_ua_wluns(hba);

> +	ufshcd_clear_ua_wluns(hba, true);

>   	ufshcd_rpm_put(hba);

>   }

>   

> @@ -7907,7 +7907,7 @@ static int ufshcd_add_lus(struct ufs_hba *hba)

>   	if (ret)

>   		goto out;

>   

> -	ufshcd_clear_ua_wluns(hba);

> +	ufshcd_clear_ua_wluns(hba, false);

>   

>   	/* Initialize devfreq after UFS device is detected */

>   	if (ufshcd_is_clkscaling_supported(hba)) {

> @@ -7943,7 +7943,8 @@ static void ufshcd_request_sense_done(struct request *rq, blk_status_t error)

>   }

>   

>   static int

> -ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)

> +ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev,

> +			   bool nowait)

>   {

>   	/*

>   	 * Some UFS devices clear unit attention condition only if the sense

> @@ -7951,6 +7952,7 @@ ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)

>   	 */

>   	static const u8 cmd[6] = {REQUEST_SENSE, 0, 0, 0, UFS_SENSE_SIZE, 0};

>   	struct scsi_request *rq;

> +	blk_mq_req_flags_t flags;

>   	struct request *req;

>   	char *buffer;

>   	int ret;

> @@ -7959,8 +7961,8 @@ ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)

>   	if (!buffer)

>   		return -ENOMEM;

>   

> -	req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN,

> -			      /*flags=*/BLK_MQ_REQ_PM);

> +	flags = BLK_MQ_REQ_PM | (nowait ? BLK_MQ_REQ_NOWAIT : 0);

> +	req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN, flags);

>   	if (IS_ERR(req)) {

>   		ret = PTR_ERR(req);

>   		goto out_free;

> @@ -7990,7 +7992,7 @@ ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)

>   	return ret;

>   }

>   

> -static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun)

> +static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun, bool nowait)

>   {

>   	struct scsi_device *sdp;

>   	unsigned long flags;

> @@ -8016,7 +8018,10 @@ static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun)

>   	if (ret)

>   		goto out_err;

>   

> -	ret = ufshcd_request_sense_async(hba, sdp);

> +	ret = ufshcd_request_sense_async(hba, sdp, nowait);

> +	if (nowait && ret && wlun == UFS_UPIU_RPMB_WLUN &&

> +	    pm_runtime_suspended(&sdp->sdev_gendev))

> +		ret = 0; /* RPMB runtime resume will clear UAC */

>   	scsi_device_put(sdp);

>   out_err:

>   	if (ret)

> @@ -8025,16 +8030,16 @@ static int ufshcd_clear_ua_wlun(struct ufs_hba *hba, u8 wlun)

>   	return ret;

>   }

>   

> -static int ufshcd_clear_ua_wluns(struct ufs_hba *hba)

> +static int ufshcd_clear_ua_wluns(struct ufs_hba *hba, bool nowait)

>   {

>   	int ret = 0;

>   

>   	if (!hba->wlun_dev_clr_ua)

>   		goto out;

>   

> -	ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_UFS_DEVICE_WLUN);

> +	ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_UFS_DEVICE_WLUN, nowait);

>   	if (!ret)

> -		ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN);

> +		ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN, nowait);

>   	if (!ret)

>   		hba->wlun_dev_clr_ua = false;

>   out:

> @@ -8656,7 +8661,7 @@ static int ufshcd_set_dev_pwr_mode(struct ufs_hba *hba,

>   	 */

>   	hba->host->eh_noresume = 1;

>   	if (hba->wlun_dev_clr_ua)

> -		ufshcd_clear_ua_wlun(hba, UFS_UPIU_UFS_DEVICE_WLUN);

> +		ufshcd_clear_ua_wlun(hba, UFS_UPIU_UFS_DEVICE_WLUN, false);

>   

>   	cmd[4] = pwr_mode << 4;

>   

> @@ -9825,7 +9830,7 @@ static inline int ufshcd_clear_rpmb_uac(struct ufs_hba *hba)

>   

>   	if (!hba->wlun_rpmb_clr_ua)

>   		return 0;

> -	ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN);

> +	ret = ufshcd_clear_ua_wlun(hba, UFS_UPIU_RPMB_WLUN, false);

>   	if (!ret)

>   		hba->wlun_rpmb_clr_ua = 0;

>   	return ret;

> 



-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project
Asutosh Das (asd) Sept. 2, 2021, 5:07 p.m. UTC | #7
On 9/2/2021 3:18 AM, Adrian Hunter wrote:
> If the UFS Device WLUN is runtime suspended and is in the same power

> mode, link state and b_rpm_dev_flush_capable (BKOP or WB buffer flush etc)

> state, then it can remain runtime suspended instead of being runtime

> resumed and then system suspended.

> 

> The following patches have cleared the way for that to happen:

>    scsi: ufs: Fix runtime PM dependencies getting broken

>    scsi: ufs: Fix error handler clear ua deadlock

> 

> So amend the logic accordingly.

> 

> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

> ---

>   drivers/scsi/ufs/ufshcd.c | 45 ++++++++++++++++++++++++++++-----------

>   drivers/scsi/ufs/ufshcd.h | 11 +++++++++-

>   2 files changed, 43 insertions(+), 13 deletions(-)

> 

Hi Adrian,
Thanks for the change.

> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

> index 57ed4b93b949..8e799e47e095 100644

> --- a/drivers/scsi/ufs/ufshcd.c

> +++ b/drivers/scsi/ufs/ufshcd.c

> @@ -9722,13 +9722,29 @@ void ufshcd_resume_complete(struct device *dev)

>   		ufshcd_rpm_put(hba);

>   		hba->complete_put = false;

>   	}

> -	if (hba->rpmb_complete_put) {

> -		ufshcd_rpmb_rpm_put(hba);

> -		hba->rpmb_complete_put = false;

> -	}

>   }

>   EXPORT_SYMBOL_GPL(ufshcd_resume_complete);

>   

> +static bool ufshcd_rpm_ok_for_spm(struct ufs_hba *hba)

> +{

> +	struct device *dev = &hba->sdev_ufs_device->sdev_gendev;

> +	enum ufs_dev_pwr_mode dev_pwr_mode;

> +	enum uic_link_state link_state;

> +	unsigned long flags;

> +	bool res;

> +

In the current ufshcd_suspend(), there's a ufshcd_vops_suspend().
That's invoked for different pm_ops independent of the rpm_lvl and spm_lvl.
I'm not sure if any vendor driver does different things for diff pm_op.
Perhaps something to check.

> +	spin_lock_irqsave(&dev->power.lock, flags);

> +	dev_pwr_mode = ufs_get_pm_lvl_to_dev_pwr_mode(hba->spm_lvl);

> +	link_state = ufs_get_pm_lvl_to_link_pwr_state(hba->spm_lvl);

> +	res = pm_runtime_suspended(dev) &&

> +	      hba->curr_dev_pwr_mode == dev_pwr_mode &&

> +	      hba->uic_link_state == link_state &&

> +	      !hba->dev_info.b_rpm_dev_flush_capable;

> +	spin_unlock_irqrestore(&dev->power.lock, flags);

> +

> +	return res;

> +}

> +

>   int ufshcd_suspend_prepare(struct device *dev)

>   {

>   	struct ufs_hba *hba = dev_get_drvdata(dev);

> @@ -9741,17 +9757,22 @@ int ufshcd_suspend_prepare(struct device *dev)

>   	 * Refer ufshcd_resume_complete()

>   	 */

>   	if (hba->sdev_ufs_device) {

> -		ret = ufshcd_rpm_get_sync(hba);

> -		if (ret < 0 && ret != -EACCES) {

> -			ufshcd_rpm_put(hba);

> -			return ret;

> +		/* Prevent runtime suspend */

> +		ufshcd_rpm_get_noresume(hba);

> +		/*

> +		 * Check if already runtime suspended in same state as system

> +		 * suspend would be.

> +		 */

> +		if (!ufshcd_rpm_ok_for_spm(hba)) {

> +			/* RPM state is not ok for SPM, so runtime resume */

> +			ret = ufshcd_rpm_resume(hba);

> +			if (ret < 0 && ret != -EACCES) {

> +				ufshcd_rpm_put(hba);

> +				return ret;

> +			}

>   		}

>   		hba->complete_put = true;

>   	}

> -	if (hba->sdev_rpmb) {

> -		ufshcd_rpmb_rpm_get_sync(hba);

> -		hba->rpmb_complete_put = true;

> -	}

>   	return 0;

>   }

>   EXPORT_SYMBOL_GPL(ufshcd_suspend_prepare);

> diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h

> index 4723f27a55d1..149803d60ecb 100644

> --- a/drivers/scsi/ufs/ufshcd.h

> +++ b/drivers/scsi/ufs/ufshcd.h

> @@ -915,7 +915,6 @@ struct ufs_hba {

>   #endif

>   	u32 luns_avail;

>   	bool complete_put;

> -	bool rpmb_complete_put;

>   };

>   

>   /* Returns true if clocks can be gated. Otherwise false */

> @@ -1383,6 +1382,16 @@ static inline int ufshcd_rpm_put_sync(struct ufs_hba *hba)

>   	return pm_runtime_put_sync(&hba->sdev_ufs_device->sdev_gendev);

>   }

>   

> +static inline void ufshcd_rpm_get_noresume(struct ufs_hba *hba)

> +{

> +	pm_runtime_get_noresume(&hba->sdev_ufs_device->sdev_gendev);

> +}

> +

> +static inline int ufshcd_rpm_resume(struct ufs_hba *hba)

> +{

> +	return pm_runtime_resume(&hba->sdev_ufs_device->sdev_gendev);

> +}

> +

>   static inline int ufshcd_rpm_put(struct ufs_hba *hba)

>   {

>   	return pm_runtime_put(&hba->sdev_ufs_device->sdev_gendev);

> 



-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project
Adrian Hunter Sept. 3, 2021, 9:55 a.m. UTC | #8
On 2/09/21 8:07 pm, Asutosh Das (asd) wrote:
> On 9/2/2021 3:18 AM, Adrian Hunter wrote:

>> If the UFS Device WLUN is runtime suspended and is in the same power

>> mode, link state and b_rpm_dev_flush_capable (BKOP or WB buffer flush etc)

>> state, then it can remain runtime suspended instead of being runtime

>> resumed and then system suspended.

>>

>> The following patches have cleared the way for that to happen:

>>    scsi: ufs: Fix runtime PM dependencies getting broken

>>    scsi: ufs: Fix error handler clear ua deadlock

>>

>> So amend the logic accordingly.

>>

>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

>> ---

>>   drivers/scsi/ufs/ufshcd.c | 45 ++++++++++++++++++++++++++++-----------

>>   drivers/scsi/ufs/ufshcd.h | 11 +++++++++-

>>   2 files changed, 43 insertions(+), 13 deletions(-)

>>

> Hi Adrian,

> Thanks for the change.

> 

>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c

>> index 57ed4b93b949..8e799e47e095 100644

>> --- a/drivers/scsi/ufs/ufshcd.c

>> +++ b/drivers/scsi/ufs/ufshcd.c

>> @@ -9722,13 +9722,29 @@ void ufshcd_resume_complete(struct device *dev)

>>           ufshcd_rpm_put(hba);

>>           hba->complete_put = false;

>>       }

>> -    if (hba->rpmb_complete_put) {

>> -        ufshcd_rpmb_rpm_put(hba);

>> -        hba->rpmb_complete_put = false;

>> -    }

>>   }

>>   EXPORT_SYMBOL_GPL(ufshcd_resume_complete);

>>   +static bool ufshcd_rpm_ok_for_spm(struct ufs_hba *hba)

>> +{

>> +    struct device *dev = &hba->sdev_ufs_device->sdev_gendev;

>> +    enum ufs_dev_pwr_mode dev_pwr_mode;

>> +    enum uic_link_state link_state;

>> +    unsigned long flags;

>> +    bool res;

>> +

> In the current ufshcd_suspend(), there's a ufshcd_vops_suspend().

> That's invoked for different pm_ops independent of the rpm_lvl and spm_lvl.

> I'm not sure if any vendor driver does different things for diff pm_op.

> Perhaps something to check.

> 


Good point.  The logic was that way before "scsi: ufs: core: Enable power
management for wlu" which was first in v5.14, so drivers should expect the
behaviour of this patch.  Checking shows only ufs-hisi does something
different but it sets different rpm_lvl and spm_lvl so wouldn't see any
change.  However I am sending a V2 patch that makes it explicit.
diff mbox series

Patch

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index a3b419848f0a..9d6207f7cf6d 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -7937,7 +7937,8 @@  static int ufshcd_add_lus(struct ufs_hba *hba)
 static void ufshcd_request_sense_done(struct request *rq, blk_status_t error)
 {
 	if (error != BLK_STS_OK)
-		pr_err("%s: REQUEST SENSE failed (%d)", __func__, error);
+		pr_err("%s: REQUEST SENSE failed (%d)\n", __func__, error);
+	kfree(rq->end_io_data);
 	blk_put_request(rq);
 }
 
@@ -7945,23 +7946,39 @@  static int
 ufshcd_request_sense_async(struct ufs_hba *hba, struct scsi_device *sdev)
 {
 	/*
-	 * From SPC-6: the REQUEST SENSE command with any allocation length
-	 * clears the sense data.
+	 * Some UFS devices clear unit attention condition only if the sense
+	 * size used (UFS_SENSE_SIZE in this case) is non-zero.
 	 */
-	static const u8 cmd[6] = {REQUEST_SENSE, 0, 0, 0, 0, 0};
+	static const u8 cmd[6] = {REQUEST_SENSE, 0, 0, 0, UFS_SENSE_SIZE, 0};
 	struct scsi_request *rq;
 	struct request *req;
+	char *buffer;
+	int ret;
+
+	buffer = kzalloc(UFS_SENSE_SIZE, GFP_KERNEL);
+	if (!buffer)
+		return -ENOMEM;
 
-	req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN, /*flags=*/0);
+	req = blk_get_request(sdev->request_queue, REQ_OP_DRV_IN,
+			      /*flags=*/BLK_MQ_REQ_PM);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
+	ret = blk_rq_map_kern(sdev->request_queue, req,
+			      buffer, UFS_SENSE_SIZE, GFP_NOIO);
+	if (ret) {
+		blk_put_request(req);
+		kfree(buffer);
+		return ret;
+	}
+
 	rq = scsi_req(req);
 	rq->cmd_len = ARRAY_SIZE(cmd);
 	memcpy(rq->cmd, cmd, rq->cmd_len);
 	rq->retries = 3;
 	req->timeout = 1 * HZ;
 	req->rq_flags |= RQF_PM | RQF_QUIET;
+	req->end_io_data = buffer;
 
 	blk_execute_rq_nowait(/*bd_disk=*/NULL, req, /*at_head=*/true,
 			      ufshcd_request_sense_done);