Message ID | 20240715063831.29792-1-peter.wang@mediatek.com |
---|---|
State | New |
Series | [v2] ufs: core: fix deadlock when rtc update |
On Mon, 2024-07-15 at 14:38 +0800, peter.wang@mediatek.com wrote:
> -	ufshcd_rpm_get_sync(hba);
> +	/* Skip update RTC if RPM state is not RPM_ACTIVE */
> +	if (ufshcd_rpm_get_if_active(hba) <= 0)
> +		return;

I understand the intention of this 'return', but my understanding is that
you assume __ufshcd_wl_resume() will schedule the RTC update work. However,
the RPM status is not RPM_ACTIVE until __ufshcd_wl_resume() completes and
__update_runtime_status(dev, RPM_ACTIVE) is called. If the RTC update work
runs before __update_runtime_status(dev, RPM_ACTIVE), you return here and
no RTC work will be scheduled. Do you think this is possible?

On Mon, 2024-07-15 at 14:38 +0800, peter.wang@mediatek.com wrote:
> Fixes: 6bf999e0eb41 ("scsi: ufs: core: Add UFS RTC support")
> Cc: <stable@vger.kernel.org> 6.9.x
>
> Signed-off-by: Peter Wang <peter.wang@mediatek.com>

Please ignore my previous two emails. I see that in this version you only
skip the update, not the scheduling.

Reviewed-by: Bean Huo <beanhuo@micron.com>

On 7/14/24 11:38 PM, peter.wang@mediatek.com wrote:
> There is a deadlock when runtime suspend waits for the flush of RTC work,
> and the RTC work calls ufshcd_rpm_get_sync to wait for runtime resume.

The above description is too brief - a description of how the fix works
is missing. Please include a more detailed description in future patches.

Anyway:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

On Mon, 2024-07-15 at 10:17 -0700, Bart Van Assche wrote:
> On 7/14/24 11:38 PM, peter.wang@mediatek.com wrote:
> > There is a deadlock when runtime suspend waits for the flush of RTC work,
> > and the RTC work calls ufshcd_rpm_get_sync to wait for runtime resume.
>
> The above description is too brief - a description of how the fix works
> is missing. Please include a more detailed description in future patches.
>
> Anyway:
>
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Hi Bart,

Will improve in future patches.
Thanks for the review.

Peter

> There is a deadlock when runtime suspend waits for the flush of RTC
> work, and the RTC work calls ufshcd_rpm_get_sync to wait for runtime
> resume.

Applied to 6.11/scsi-staging, thanks!

On Mon, 15 Jul 2024 14:38:31 +0800, peter.wang@mediatek.com wrote:
> There is a deadlock when runtime suspend waits for the flush of RTC work,
> and the RTC work calls ufshcd_rpm_get_sync to wait for runtime resume.
>
> Here is deadlock backtrace
> kworker/0:1 D 4892.876354 10 10971 4859 0x4208060 0x8 10 0 120 670730152367
> ptr f0ffff80c2e40000 0 1 0x00000001 0x000000ff 0x000000ff 0x000000ff
> <ffffffee5e71ddb0> __switch_to+0x1a8/0x2d4
> <ffffffee5e71e604> __schedule+0x684/0xa98
> <ffffffee5e71ea60> schedule+0x48/0xc8
> <ffffffee5e725f78> schedule_timeout+0x48/0x170
> <ffffffee5e71fb74> do_wait_for_common+0x108/0x1b0
> <ffffffee5e71efe0> wait_for_completion+0x44/0x60
> <ffffffee5d6de968> __flush_work+0x39c/0x424
> <ffffffee5d6decc0> __cancel_work_sync+0xd8/0x208
> <ffffffee5d6dee2c> cancel_delayed_work_sync+0x14/0x28
> <ffffffee5e2551b8> __ufshcd_wl_suspend+0x19c/0x480
> <ffffffee5e255fb8> ufshcd_wl_runtime_suspend+0x3c/0x1d4
> <ffffffee5dffd80c> scsi_runtime_suspend+0x78/0xc8
> <ffffffee5df93580> __rpm_callback+0x94/0x3e0
> <ffffffee5df90b0c> rpm_suspend+0x2d4/0x65c
> <ffffffee5df91448> __pm_runtime_suspend+0x80/0x114
> <ffffffee5dffd95c> scsi_runtime_idle+0x38/0x6c
> <ffffffee5df912f4> rpm_idle+0x264/0x338
> <ffffffee5df90f14> __pm_runtime_idle+0x80/0x110
> <ffffffee5e24ce44> ufshcd_rtc_work+0x128/0x1e4
> <ffffffee5d6e3a40> process_one_work+0x26c/0x650
> <ffffffee5d6e65c8> worker_thread+0x260/0x3d8
> <ffffffee5d6edec8> kthread+0x110/0x134
> <ffffffee5d616b18> ret_from_fork+0x10/0x20
>
> [...]

Applied to 6.11/scsi-queue, thanks!

[1/1] ufs: core: fix deadlock when rtc update
      https://git.kernel.org/mkp/scsi/c/3911af778f20

diff --git a/drivers/ufs/core/ufshcd-priv.h b/drivers/ufs/core/ufshcd-priv.h
index f42d99ce5bf1..81d6f0cfb148 100644
--- a/drivers/ufs/core/ufshcd-priv.h
+++ b/drivers/ufs/core/ufshcd-priv.h
@@ -329,6 +329,11 @@ static inline int ufshcd_rpm_get_sync(struct ufs_hba *hba)
 	return pm_runtime_get_sync(&hba->ufs_device_wlun->sdev_gendev);
 }
 
+static inline int ufshcd_rpm_get_if_active(struct ufs_hba *hba)
+{
+	return pm_runtime_get_if_active(&hba->ufs_device_wlun->sdev_gendev);
+}
+
 static inline int ufshcd_rpm_put_sync(struct ufs_hba *hba)
 {
 	return pm_runtime_put_sync(&hba->ufs_device_wlun->sdev_gendev);
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 46433ecf0c4d..2e5adfa0f757 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -8171,7 +8171,10 @@ static void ufshcd_update_rtc(struct ufs_hba *hba)
 	 */
 	val = ts64.tv_sec - hba->dev_info.rtc_time_baseline;
 
-	ufshcd_rpm_get_sync(hba);
+	/* Skip update RTC if RPM state is not RPM_ACTIVE */
+	if (ufshcd_rpm_get_if_active(hba) <= 0)
+		return;
+
 	err = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_WRITE_ATTR, QUERY_ATTR_IDN_SECONDS_PASSED,
 				0, 0, &val);
 	ufshcd_rpm_put_sync(hba);
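
For readers following the thread, the behaviour discussed above can be sketched in plain userspace C. This is only an illustrative model, not kernel code: `struct model_dev`, `model_get_if_active()`, `model_put()`, `model_update_rtc()` and `model_rtc_work()` are hypothetical names invented for this sketch. The sketch assumes the documented contract of `pm_runtime_get_if_active()` (negative error when runtime PM is disabled, 0 when the device is not RPM_ACTIVE, 1 with the usage count incremented otherwise), and it simplifies the rescheduling step to be unconditional, which the real work function may not do.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's runtime PM states. */
enum rpm_status { RPM_ACTIVE, RPM_RESUMING, RPM_SUSPENDED, RPM_SUSPENDING };

/* Hypothetical, heavily reduced model of a device's runtime PM bookkeeping. */
struct model_dev {
	enum rpm_status status;
	bool pm_disabled;
	int usage_count;
	int rtc_updates;	/* simulated RTC attribute writes */
	int reschedules;	/* times the work re-queued itself */
};

/*
 * Assumed to mirror pm_runtime_get_if_active(): never waits for a resume,
 * and takes a usage reference only if the device is already active.
 */
int model_get_if_active(struct model_dev *dev)
{
	if (dev->pm_disabled)
		return -EINVAL;
	if (dev->status != RPM_ACTIVE)
		return 0;
	dev->usage_count++;
	return 1;
}

/* Drops the reference taken by a successful model_get_if_active(). */
void model_put(struct model_dev *dev)
{
	assert(dev->usage_count > 0);
	dev->usage_count--;
}

/* Same shape as the patched ufshcd_update_rtc(): skip instead of blocking. */
void model_update_rtc(struct model_dev *dev)
{
	if (model_get_if_active(dev) <= 0)
		return;		/* not RPM_ACTIVE: skip this update */
	dev->rtc_updates++;	/* stands in for the attribute write */
	model_put(dev);
}

/*
 * Simplified work function: the update may be skipped, but the work still
 * re-queues itself (unconditional here, an assumption of this sketch).
 */
void model_rtc_work(struct model_dev *dev)
{
	model_update_rtc(dev);
	dev->reschedules++;
}
```

Calling `model_rtc_work()` on a device whose status is `RPM_SUSPENDING` returns immediately without touching the usage count. That is the property that breaks the cycle in the backtrace: the suspend path can wait for the work to finish because the work no longer waits for a resume. The skipped update is picked up on a later run once the device is active again.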