Message ID | 1615408918-7242-1-git-send-email-jhugo@codeaurora.org |
---|---|
State | New |
Series | [v2,RESEND] bus: mhi: core: Wait for ready state after reset |
On Wed, Mar 10, 2021 at 01:41:58PM -0700, Jeffrey Hugo wrote:
> After the device has signaled the end of reset by clearing the reset bit,
> it will automatically reinit MHI and the internal device structures.  Once
> that is done, the device will signal it has entered the ready state.
>
> Signaling the ready state involves sending an interrupt (MSI) to the host
> which might cause IOMMU faults if it occurs at the wrong time.
>
> If the controller is being powered down, and possibly removed, then the
> reset flow would only wait for the end of reset.  At which point, the host
> and device would start a race.  The host may complete its reset work, and
> remove the interrupt handler, which would cause the interrupt to be
> disabled in the IOMMU.  If that occurs before the device signals the ready
> state, then the IOMMU will fault since it blocked an interrupt.  While
> harmless, the fault would appear like a serious issue has occurred so let's
> silence it by making sure the device hits the ready state before the host
> completes its reset processing.
>
> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
> ---
>  drivers/bus/mhi/core/pm.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
> index adb0e80..414da4f 100644
> --- a/drivers/bus/mhi/core/pm.c
> +++ b/drivers/bus/mhi/core/pm.c
> @@ -467,7 +467,7 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>
>  	/* Trigger MHI RESET so that the device will not access host memory */
>  	if (!MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state)) {
> -		u32 in_reset = -1;
> +		u32 in_reset = -1, ready = 0;
>  		unsigned long timeout = msecs_to_jiffies(mhi_cntrl->timeout_ms);
>
>  		dev_dbg(dev, "Triggering MHI Reset in device\n");
> @@ -490,6 +490,21 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>  		 * hence re-program it
>  		 */
>  		mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
> +
> +		if (!MHI_IN_PBL(mhi_get_exec_env(mhi_cntrl))) {
> +			/* wait for ready to be set */
> +			ret = wait_event_timeout(mhi_cntrl->state_event,
> +						 mhi_read_reg_field(mhi_cntrl,
> +							mhi_cntrl->regs,
> +							MHISTATUS,
> +							MHISTATUS_READY_MASK,
> +							MHISTATUS_READY_SHIFT,
> +							&ready)
> +						 || ready, timeout);
> +			if (!ret || !ready)
> +				dev_warn(dev,
> +					"Device failed to enter READY state\n");

Wouldn't dev_err be more appropriate here provided that we might get an IOMMU
fault anytime soon?

Thanks,
Mani

> +		}
>  	}
>
>  	dev_dbg(dev,
> --
> Qualcomm Technologies, Inc. is a member of the
> Code Aurora Forum, a Linux Foundation Collaborative Project.
>
On 3/16/2021 12:14 AM, Manivannan Sadhasivam wrote:
> On Wed, Mar 10, 2021 at 01:41:58PM -0700, Jeffrey Hugo wrote:
>> After the device has signaled the end of reset by clearing the reset bit,
>> it will automatically reinit MHI and the internal device structures.  Once
>> that is done, the device will signal it has entered the ready state.
>>
>> Signaling the ready state involves sending an interrupt (MSI) to the host
>> which might cause IOMMU faults if it occurs at the wrong time.
>>
>> If the controller is being powered down, and possibly removed, then the
>> reset flow would only wait for the end of reset.  At which point, the host
>> and device would start a race.  The host may complete its reset work, and
>> remove the interrupt handler, which would cause the interrupt to be
>> disabled in the IOMMU.  If that occurs before the device signals the ready
>> state, then the IOMMU will fault since it blocked an interrupt.  While
>> harmless, the fault would appear like a serious issue has occurred so let's
>> silence it by making sure the device hits the ready state before the host
>> completes its reset processing.
>>
>> Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
>> ---
>>  drivers/bus/mhi/core/pm.c | 17 ++++++++++++++++-
>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
>> index adb0e80..414da4f 100644
>> --- a/drivers/bus/mhi/core/pm.c
>> +++ b/drivers/bus/mhi/core/pm.c
>> @@ -467,7 +467,7 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>>
>>  	/* Trigger MHI RESET so that the device will not access host memory */
>>  	if (!MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state)) {
>> -		u32 in_reset = -1;
>> +		u32 in_reset = -1, ready = 0;
>>  		unsigned long timeout = msecs_to_jiffies(mhi_cntrl->timeout_ms);
>>
>>  		dev_dbg(dev, "Triggering MHI Reset in device\n");
>> @@ -490,6 +490,21 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
>>  		 * hence re-program it
>>  		 */
>>  		mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
>> +
>> +		if (!MHI_IN_PBL(mhi_get_exec_env(mhi_cntrl))) {
>> +			/* wait for ready to be set */
>> +			ret = wait_event_timeout(mhi_cntrl->state_event,
>> +						 mhi_read_reg_field(mhi_cntrl,
>> +							mhi_cntrl->regs,
>> +							MHISTATUS,
>> +							MHISTATUS_READY_MASK,
>> +							MHISTATUS_READY_SHIFT,
>> +							&ready)
>> +						 || ready, timeout);
>> +			if (!ret || !ready)
>> +				dev_warn(dev,
>> +					"Device failed to enter READY state\n");
>
> Wouldn't dev_err be more appropriate here provided that we might get an IOMMU
> fault anytime soon?

I suppose.  It didn't feel like a "true" error because nothing has actually
failed, the chance of the IOMMU fault is low, and I couldn't enumerate
what the expected action would be for the system user to take if they saw
this as an error.

I don't have a particularly strong opinion one way or the other.  I
figured warn was the more conservative option here.

Will change.

-- 
Jeffrey Hugo
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
index adb0e80..414da4f 100644
--- a/drivers/bus/mhi/core/pm.c
+++ b/drivers/bus/mhi/core/pm.c
@@ -467,7 +467,7 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
 
 	/* Trigger MHI RESET so that the device will not access host memory */
 	if (!MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state)) {
-		u32 in_reset = -1;
+		u32 in_reset = -1, ready = 0;
 		unsigned long timeout = msecs_to_jiffies(mhi_cntrl->timeout_ms);
 
 		dev_dbg(dev, "Triggering MHI Reset in device\n");
@@ -490,6 +490,21 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
 		 * hence re-program it
 		 */
 		mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
+
+		if (!MHI_IN_PBL(mhi_get_exec_env(mhi_cntrl))) {
+			/* wait for ready to be set */
+			ret = wait_event_timeout(mhi_cntrl->state_event,
+						 mhi_read_reg_field(mhi_cntrl,
+							mhi_cntrl->regs,
+							MHISTATUS,
+							MHISTATUS_READY_MASK,
+							MHISTATUS_READY_SHIFT,
+							&ready)
+						 || ready, timeout);
+			if (!ret || !ready)
+				dev_warn(dev,
+					"Device failed to enter READY state\n");
+		}
 	}
 
 	dev_dbg(dev,
After the device has signaled the end of reset by clearing the reset bit,
it will automatically reinit MHI and the internal device structures.  Once
that is done, the device will signal it has entered the ready state.

Signaling the ready state involves sending an interrupt (MSI) to the host
which might cause IOMMU faults if it occurs at the wrong time.

If the controller is being powered down, and possibly removed, then the
reset flow would only wait for the end of reset.  At which point, the host
and device would start a race.  The host may complete its reset work, and
remove the interrupt handler, which would cause the interrupt to be
disabled in the IOMMU.  If that occurs before the device signals the ready
state, then the IOMMU will fault since it blocked an interrupt.  While
harmless, the fault would appear like a serious issue has occurred so let's
silence it by making sure the device hits the ready state before the host
completes its reset processing.

Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
---
 drivers/bus/mhi/core/pm.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)
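
For readers following the thread, the wait described above can be illustrated
outside the kernel.  What follows is only a userspace sketch of the idea
(poll a status field until the READY bit shows up, or give up after a bound),
not the driver code.  Everything in it is hypothetical: struct fake_dev,
fake_read_status(), read_field(), wait_for_ready(), and the mask/shift values
are made up for illustration and are not the real MHI definitions.  In the
patch itself the bounded wait is done by wait_event_timeout(), which returns 0
when the timeout elapses with the condition still false, and the polled
condition also becomes true if the register read itself fails, which appears
to be why the patch checks both !ret and !ready.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical field layout, for this sketch only (not the MHI values). */
#define FAKE_STATUS_READY_MASK  0x1u
#define FAKE_STATUS_READY_SHIFT 0

/* Simulated device: READY is reported only after some number of reads. */
struct fake_dev {
	uint32_t status;                 /* simulated status register */
	unsigned int reads_before_ready; /* reads that still return "not ready" */
};

/* Extract a field from a register value using a mask and a shift. */
static uint32_t read_field(uint32_t reg, uint32_t mask, unsigned int shift)
{
	return (reg & mask) >> shift;
}

/* Simulated register read; the device "finishes reinit" after enough reads. */
static uint32_t fake_read_status(struct fake_dev *dev)
{
	if (dev->reads_before_ready > 0)
		dev->reads_before_ready--;
	else
		dev->status |= FAKE_STATUS_READY_MASK;

	return dev->status;
}

/*
 * Poll for READY, bounded by max_reads (standing in for the timeout).
 * Returns true if READY was observed, false if the bound was hit first.
 * Only after a true result would the host go on to remove its
 * interrupt handler.
 */
static bool wait_for_ready(struct fake_dev *dev, unsigned int max_reads)
{
	for (unsigned int i = 0; i < max_reads; i++) {
		uint32_t ready = read_field(fake_read_status(dev),
					    FAKE_STATUS_READY_MASK,
					    FAKE_STATUS_READY_SHIFT);
		if (ready)
			return true;
	}

	return false;
}
```

A caller would run wait_for_ready() after seeing the reset bit clear and
before tearing down the interrupt handler; a false return corresponds to the
"Device failed to enter READY state" message in the patch.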