Message ID | 20240821182923.145631-3-bvanassche@acm.org |
---|---|
State | New |
Headers | show |
Series | Fix the UFS driver hibernation code | expand |
On Wed, 2024-08-21 at 11:29 -0700, Bart Van Assche wrote: > drivers/ufs/core/ufshcd.c | 22 ++++++---------------- > include/ufs/ufshcd.h | 7 ++++--- > 2 files changed, 10 insertions(+), 19 deletions(-) > > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index d0ae6e50becc..e12f30b8a83c 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -2585,6 +2585,7 @@ int ufshcd_send_uic_cmd(struct ufs_hba *hba, > struct uic_command *uic_cmd) > ufshcd_hold(hba); > mutex_lock(&hba->uic_cmd_mutex); > ufshcd_add_delay_before_dme_cmd(hba); > + WARN_ON(hba->uic_async_done); > > ret = __ufshcd_send_uic_cmd(hba, uic_cmd, true); > if (!ret) > @@ -4255,7 +4256,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba > *hba, struct uic_command *cmd) > unsigned long flags; > u8 status; > int ret; > - bool reenable_intr = false; > > mutex_lock(&hba->uic_cmd_mutex); > ufshcd_add_delay_before_dme_cmd(hba); > @@ -4266,15 +4266,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba > *hba, struct uic_command *cmd) > goto out_unlock; > } > hba->uic_async_done = &uic_async_done; > - if (ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & > UIC_COMMAND_COMPL) { > - ufshcd_disable_intr(hba, UIC_COMMAND_COMPL); > - /* > - * Make sure UIC command completion interrupt is > disabled before > - * issuing UIC command. > - */ > - ufshcd_readl(hba, REG_INTERRUPT_ENABLE); > - reenable_intr = true; > - } > spin_unlock_irqrestore(hba->host->host_lock, flags); > ret = __ufshcd_send_uic_cmd(hba, cmd, false); > if (ret) { > @@ -4318,8 +4309,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba > *hba, struct uic_command *cmd) > spin_lock_irqsave(hba->host->host_lock, flags); > hba->active_uic_cmd = NULL; > hba->uic_async_done = NULL; > - if (reenable_intr) > - ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); > if (ret) { > ufshcd_set_link_broken(hba); > ufshcd_schedule_eh_work(hba); > @@ -5472,11 +5461,12 @@ static irqreturn_t > ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status) > hba->errors |= (UFSHCD_UIC_HIBERN8_MASK & > intr_status); > > if (intr_status & UIC_COMMAND_COMPL && cmd) { > - cmd->argument2 |= ufshcd_get_uic_cmd_result(hba); > - cmd->argument3 = ufshcd_get_dme_attr_val(hba); Bart, My only concern is, removing disabling UIC completion IRQ, and keeping is.uccs 1, then we don't read its status in case of ufshcd_uic_pwr_ctrl path, whether this will affect the next UIC access result. Other things look good to me. Kind regards, Bean
On 8/21/24 2:27 PM, Bean Huo wrote: > My only concern is, removing disabling UIC completion IRQ, and keeping > is.uccs 1, then we don't read its status in case of ufshcd_uic_pwr_ctrl > path, whether this will affect the next UIC access result. Hmm ... I think I need more context information. If the UIC completion interrupt is left enabled then ufshcd_intr() will execute the code "intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS)". This statement reads all bits from the interrupt status register including the UCCS bit, isn't it? Thanks, Bart.
On 8/21/2024 11:29 AM, Bart Van Assche wrote: > Accessing a host controller register after the host controller has > entered the hibernation state may cause the host controller to exit the > hibernation state. Hence rework the hibernation entry code such that it > does not modify the interrupt enabled status. Bart, I am not clear on the offending condition, particularly the term "hibernation" used in this context. In the function ufshcd_uic_pwr_ctrl() where you are making the change, the host controller is fully active at this point, right? Please help me clarify the issue. Thanks, Bao
On 8/21/24 4:26 PM, Bao D. Nguyen wrote: > On 8/21/2024 11:29 AM, Bart Van Assche wrote: >> Accessing a host controller register after the host controller has >> entered the hibernation state may cause the host controller to exit the >> hibernation state. Hence rework the hibernation entry code such that it >> does not modify the interrupt enabled status. Bart, > > I am not clear on the offending condition, particularly the term > "hibernation" used in this context. In the function > ufshcd_uic_pwr_ctrl() where you are making the change, the host > controller is fully active at this point, right? > Please help me clarify the issue. Hi Bao, Isn't "hibernation" terminology that comes from the M-PHY standard? See also the DME_HIBERNATE_ENTER and DME_HIBERNATE_EXIT constants in the UFSHCI driver source code. Please let me know if you need more information. Bart.
On 8/21/2024 5:14 PM, Bart Van Assche wrote: > On 8/21/24 4:26 PM, Bao D. Nguyen wrote: >> On 8/21/2024 11:29 AM, Bart Van Assche wrote: >>> Accessing a host controller register after the host controller has >>> entered the hibernation state may cause the host controller to exit the >>> hibernation state. Hence rework the hibernation entry code such that it >>> does not modify the interrupt enabled status. Bart, > > >> I am not clear on the offending condition, particularly the term >> "hibernation" used in this context. In the function >> ufshcd_uic_pwr_ctrl() where you are making the change, the host >> controller is fully active at this point, right? >> Please help me clarify the issue. > > Hi Bao, > > Isn't "hibernation" terminology that comes from the M-PHY standard? > See also the DME_HIBERNATE_ENTER and DME_HIBERNATE_EXIT constants in > the UFSHCI driver source code. Please let me know if you need more > information. I see. Thanks Bart. If I understand correctly, the link is hibernated because we had a successful ufshcd_uic_hibern8_enter() earlier. Then the ufshcd_uic_pwr_ctrl() is invoked probably from a power mode change command? (a callstack would be helpful in this case). Anyway, accessing the UFSHCI host controller registers space somehow affected the M-PHY link state? If my understanding is correct, it seems like a hardware issue that we are trying to work around? Thanks, Bao
On Wed, 2024-08-21 at 11:29 -0700, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > Accessing a host controller register after the host controller has > entered the hibernation state may cause the host controller to exit > the > hibernation state. Hence rework the hibernation entry code such that > it > does not modify the interrupt enabled status. This patch relies on > the > following: > * If an UIC command is submitted that should be completed by the UIC > command completion interrupt, hba->uic_async_done == NULL. > * If an UIC command is submitted that should be completed by the > power > mode change interrupt or by a hibernation state change interrupt, > hba->uic_async_done != NULL. > > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > Hi Bart, If link enter hibernate, change power mode should wakeup link to active before power mode change. So, even manual hibernate exit by clock ungating or auto hiberante exit by hardware, write hci interrupt register exit hiberante should not a problem? Thanks. Peter
On Wed, Aug 21, 2024 at 05:14:52PM -0700, Bart Van Assche wrote: > On 8/21/24 4:26 PM, Bao D. Nguyen wrote: > > On 8/21/2024 11:29 AM, Bart Van Assche wrote: > > > Accessing a host controller register after the host controller has > > > entered the hibernation state may cause the host controller to exit the > > > hibernation state. Hence rework the hibernation entry code such that it > > > does not modify the interrupt enabled status. Bart, > > > > I am not clear on the offending condition, particularly the term > > "hibernation" used in this context. In the function > > ufshcd_uic_pwr_ctrl() where you are making the change, the host > > controller is fully active at this point, right? > > Please help me clarify the issue. > > Hi Bao, > > Isn't "hibernation" terminology that comes from the M-PHY standard? Yeah, it creates confusion between OS hibernation and M-PHY hibernation. Maybe saying 'link hibernation' would avoid this. - Mani
On Wed, 2024-08-21 at 14:39 -0700, Bart Van Assche wrote: > On 8/21/24 2:27 PM, Bean Huo wrote: > > My only concern is, removing disabling UIC completion IRQ, and > > keeping > > is.uccs 1, then we don't read its status in case of > > ufshcd_uic_pwr_ctrl > > path, whether this will affect the next UIC access result. > > Hmm ... I think I need more context information. If the UIC > completion > interrupt is left enabled then ufshcd_intr() will execute the code > "intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS)". This > statement > reads all bits from the interrupt status register including the UCCS > bit, isn't it? > > Thanks, > > Bart. One UIC power ctrl command will generate two Compl interrupts, one is command complete (UIC_COMMAND_COMPL) and the other is power switch complete (UFSHCD_UIC_PWR_MASK). Is that right? I checked the current code and we don't read the UFSHCI registers except when we read intr status before ufshcd_uic_cmd_compl() and re-enable UIC compl interrupt. Do you mean re-enabling UIC complete interrupt will cause the problem? Kind regards, Bean
On 8/22/24 7:17 AM, Bean Huo wrote:
> Do you mean re-enabling UIC complete interrupt will cause the problem?
That's correct. ufshcd_uic_hibern8_enter() calls ufshcd_uic_pwr_ctrl()
indirectly. For the test setup that is on my desk, the code in
ufshcd_uic_pwr_ctrl() that re-enables the UIC completion interrupt
causes the UFS host controller to exit hibernation.
Thanks,
Bart.
On 8/21/24 6:05 PM, Bao D. Nguyen wrote: > If I understand correctly, the link is hibernated because we had a > successful ufshcd_uic_hibern8_enter() earlier. Then the > ufshcd_uic_pwr_ctrl() is invoked probably from a power mode change > command? (a callstack would be helpful in this case). Hi Bao, ufshcd_uic_hibern8_enter() calls ufshcd_uic_pwr_ctrl() directly. The former function causes the latter to send the UIC_CMD_DME_HIBER_ENTER command. Although UIC_CMD_DME_HIBER_ENTER causes the link to enter the hibernation state, the code in ufshcd_uic_pwr_ctrl() for re-enabling interrupts causes the UFS host controller that I'm testing to exit the hibernation state. > Anyway, accessing the UFSHCI host controller registers space somehow > affected the M-PHY link state? If my understanding is correct, it seems > like a hardware issue that we are trying to work around? Yes, this is a workaround particular behavior of a particular UFS controller. Thanks, Bart.
On 8/22/2024 11:13 AM, Bart Van Assche wrote: > On 8/21/24 6:05 PM, Bao D. Nguyen wrote: >> If I understand correctly, the link is hibernated because we had a >> successful ufshcd_uic_hibern8_enter() earlier. Then the >> ufshcd_uic_pwr_ctrl() is invoked probably from a power mode change >> command? (a callstack would be helpful in this case). > > Hi Bao, > > ufshcd_uic_hibern8_enter() calls ufshcd_uic_pwr_ctrl() directly. The > former function causes the latter to send the UIC_CMD_DME_HIBER_ENTER > command. Although UIC_CMD_DME_HIBER_ENTER causes the link to enter the > hibernation state, the code in ufshcd_uic_pwr_ctrl() for re-enabling > interrupts causes the UFS host controller that I'm testing to exit the > hibernation state. > >> Anyway, accessing the UFSHCI host controller registers space somehow >> affected the M-PHY link state? If my understanding is correct, it >> seems like a hardware issue that we are trying to work around? > > Yes, this is a workaround particular behavior of a particular UFS > controller. Thank you for the clarification, Bart. I am just curious about providing workaround for a hardware issue in the ufs core driver. Sometimes I notice the community is not accepting such a change and push the change to be implemented in the vendor/platform drivers. Now about your workaround, I have the same concern as Bean. For a ufshcd_uic_pwr_ctrl() command, i.e PMC, hibern8_enter/exit() commands, you will get 2 ufshcd_uic_cmd_compl() interrupts or 1 uic completion interrupt with 2 status bits set at once. a. UIC_COMMAND_COMPL is set b. and one of these bits UIC_POWER_MODE || UIC_HIBERNATE_EXIT || UIC_HIBERNATE_ENTER is set, depending on the completed uic command. In your proposed change to ufshcd_uic_cmd_compl(), you are signalling both complete(&cmd->done) and complete(hba->uic_async_done) for a single ufshcd_uic_pwr_ctrl() command which can cause issues. Thanks, Bao > > Thanks, > > Bart. >
On 8/22/24 1:54 PM, Bao D. Nguyen wrote: > I am just curious about providing workaround for a hardware issue in the > ufs core driver. Sometimes I notice the community is not accepting such > a change and push the change to be implemented in the vendor/platform > drivers. There are two reasons why I propose to implement this workaround as a change for the UFS driver core: - I am not aware of any way to implement the behavior change in a vendor driver without modifying the UFS driver core. - The workaround results in a simplification of the UFS driver core code. More lines are removed from the UFS driver than added. Although it would be possible to select whether the old or the new behavior is selected by introducing yet another host controller quirk, I prefer not to do that because it would make the UFSHCI driver even more complex. > In your proposed change to ufshcd_uic_cmd_compl(), you are signalling > both complete(&cmd->done) and complete(hba->uic_async_done) for a single > ufshcd_uic_pwr_ctrl() command which can cause issues. Please take another look at patch 2/2. With or without this patch applied, only hba->uic_async_done is signaled when a power mode UIC command completes. Thanks, Bart.
On 8/21/2024 11:29 AM, Bart Van Assche wrote: > Accessing a host controller register after the host controller has > entered the hibernation state may cause the host controller to exit the > hibernation state. Hence rework the hibernation entry code such that it > does not modify the interrupt enabled status. This patch relies on the > following: > * If an UIC command is submitted that should be completed by the UIC > command completion interrupt, hba->uic_async_done == NULL. > * If an UIC command is submitted that should be completed by the power > mode change interrupt or by a hibernation state change interrupt, > hba->uic_async_done != NULL. > > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/ufs/core/ufshcd.c | 22 ++++++---------------- > include/ufs/ufshcd.h | 7 ++++--- > 2 files changed, 10 insertions(+), 19 deletions(-) > > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index d0ae6e50becc..e12f30b8a83c 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -2585,6 +2585,7 @@ int ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) > ufshcd_hold(hba); > mutex_lock(&hba->uic_cmd_mutex); > ufshcd_add_delay_before_dme_cmd(hba); > + WARN_ON(hba->uic_async_done); > > ret = __ufshcd_send_uic_cmd(hba, uic_cmd, true); > if (!ret) > @@ -4255,7 +4256,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) > unsigned long flags; > u8 status; > int ret; > - bool reenable_intr = false; > > mutex_lock(&hba->uic_cmd_mutex); > ufshcd_add_delay_before_dme_cmd(hba); > @@ -4266,15 +4266,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) > goto out_unlock; > } > hba->uic_async_done = &uic_async_done; > - if (ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { > - ufshcd_disable_intr(hba, UIC_COMMAND_COMPL); > - /* > - * Make sure UIC command completion interrupt is disabled before > - * issuing UIC command. > - */ > - ufshcd_readl(hba, REG_INTERRUPT_ENABLE); > - reenable_intr = true; > - } > spin_unlock_irqrestore(hba->host->host_lock, flags); > ret = __ufshcd_send_uic_cmd(hba, cmd, false); > if (ret) { > @@ -4318,8 +4309,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) > spin_lock_irqsave(hba->host->host_lock, flags); > hba->active_uic_cmd = NULL; > hba->uic_async_done = NULL; > - if (reenable_intr) > - ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); > if (ret) { > ufshcd_set_link_broken(hba); > ufshcd_schedule_eh_work(hba); > @@ -5472,11 +5461,12 @@ static irqreturn_t ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status) > hba->errors |= (UFSHCD_UIC_HIBERN8_MASK & intr_status); > > if (intr_status & UIC_COMMAND_COMPL && cmd) { Let's consider this corner case. I am not sure if it's a valid concern, but it is a corner case that the extra interrupt (new behavior introduced by this patch) may bring. Let's say you are sending a ufshcd_uic_pwr_ctrl() command. You will get 2 uic completion interrupts: [1] ufshcd_uic_cmd_compl() is called for the first interrupt which happens to be UFSHCD_UIC_PWR_MASK only. At the end of the ufshcd_uic_pwr_ctrl(), you would set the hba->active_uic_cmd to NULL. [2]The second uic completion interrupt for UIC_COMMAND_COMP is delayed. This interrupt is newly introduced by this patch. Now let's say you have a new UIC command coming via ufshcd_send_uic_cmd(). The ufshcd_dispatch_uic_cmd() will update your hba->active_uic_cmd to the new uic_cmd. The delayed interrupt mentioned in [2] arrives late. The ufshcd_uic_cmd_compl() is called. In this scenario, this check here may return true "if (intr_status & UIC_COMMAND_COMPL && cmd) {..}" Now, the ufshcd_uic_cmd_compl() would proceed to complete the UIC_COMMAND_COMPL interrupt for the new command intended for the previous "old" uic command. Thanks, Bao > - cmd->argument2 |= ufshcd_get_uic_cmd_result(hba); > - cmd->argument3 = ufshcd_get_dme_attr_val(hba); > - if (!hba->uic_async_done) > + if (!hba->uic_async_done) { > + cmd->argument2 |= ufshcd_get_uic_cmd_result(hba); > + cmd->argument3 = ufshcd_get_dme_attr_val(hba); > cmd->cmd_active = 0; > - complete(&cmd->done); > + complete(&cmd->done); > + } > retval = IRQ_HANDLED; > } > > diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h > index a43b14276bc3..0577013a4611 100644 > --- a/include/ufs/ufshcd.h > +++ b/include/ufs/ufshcd.h > @@ -868,9 +868,10 @@ enum ufshcd_mcq_opr { > * @tmf_tag_set: TMF tag set. > * @tmf_queue: Used to allocate TMF tags. > * @tmf_rqs: array with pointers to TMF requests while these are in progress. > - * @active_uic_cmd: handle of active UIC command > - * @uic_cmd_mutex: mutex for UIC command > - * @uic_async_done: completion used during UIC processing > + * @active_uic_cmd: active UIC command pointer. > + * @uic_cmd_mutex: mutex used to serialize UIC command processing. > + * @uic_async_done: completion used to wait for power mode or hibernation state > + * changes. > * @ufshcd_state: UFSHCD state > * @eh_flags: Error handling flags > * @intr_mask: Interrupt Mask Bits >
On 8/22/24 4:34 PM, Bao D. Nguyen wrote: > Let's say you are sending a ufshcd_uic_pwr_ctrl() command. You will get > 2 uic completion interrupts: > [1] ufshcd_uic_cmd_compl() is called for the first interrupt which > happens to be UFSHCD_UIC_PWR_MASK only. At the end of the > ufshcd_uic_pwr_ctrl(), you would set the hba->active_uic_cmd to NULL. That's not correct. ufshcd_uic_pwr_ctrl() only clears hba->active_uic_cmd after the power mode change interrupt has been processed. > [2]The second uic completion interrupt for UIC_COMMAND_COMP is delayed. > This interrupt is newly introduced by this patch. If UIC_COMMAND_COMPL is delivered after UFSHCD_UIC_PWR_MASK then the UIC_COMMAND_COMPL interrupt will be ignored because hba->active_uic_cmd is cleared by ufshcd_uic_pwr_ctrl() after it has processed the power mode change interrupt. > Now let's say you have a new UIC command coming via > ufshcd_send_uic_cmd(). The ufshcd_dispatch_uic_cmd() will update your > hba->active_uic_cmd to the new uic_cmd. UIC command processing is serialized by hba->uic_cmd_mutex. Hence only one UIC command is processed at any given time. Does this address your concerns? Thanks, Bart.
On Thu, 2024-08-22 at 19:06 -0700, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 8/22/24 4:34 PM, Bao D. Nguyen wrote: > > Let's say you are sending a ufshcd_uic_pwr_ctrl() command. You will > get > > 2 uic completion interrupts: > > [1] ufshcd_uic_cmd_compl() is called for the first interrupt which > > happens to be UFSHCD_UIC_PWR_MASK only. At the end of the > > ufshcd_uic_pwr_ctrl(), you would set the hba->active_uic_cmd to > NULL. > > That's not correct. ufshcd_uic_pwr_ctrl() only clears > hba->active_uic_cmd after the power mode change interrupt has been > processed. > > > [2]The second uic completion interrupt for UIC_COMMAND_COMP is > delayed. > > This interrupt is newly introduced by this patch. > > If UIC_COMMAND_COMPL is delivered after UFSHCD_UIC_PWR_MASK then the > UIC_COMMAND_COMPL interrupt will be ignored because hba- > >active_uic_cmd > is cleared by ufshcd_uic_pwr_ctrl() after it has processed the > power mode change interrupt. > > > Now let's say you have a new UIC command coming via > > ufshcd_send_uic_cmd(). The ufshcd_dispatch_uic_cmd() will update > your > > hba->active_uic_cmd to the new uic_cmd. > > UIC command processing is serialized by hba->uic_cmd_mutex. Hence > only > one UIC command is processed at any given time. > > Does this address your concerns? > > Thanks, > > Bart. Hi Bart, You assume that the following steps are executed in sequence. (Theread A) send ufshcd_uic_pwr_ctrl (ISR) UIC_POWER_MODE A clear hab->active_uic_cmd (ISR) UIC_COMMAND_COMPL A (step A) do nothing (Thread B) ufshcd_send_uic_cmd set hab->active_uic_cmd (step B) (ISR) UIC_COMMAND_COMPL B complte thread B's cmd But actually step A ISR may come after step B and cause error. (Theread A) send ufshcd_uic_pwr_ctrl (ISR) UIC_POWER_MODE A clear hab->active_uic_cmd (Thread B) ufshcd_send_uic_cmd set hab->active_uic_cmd (step B) (ISR) UIC_COMMAND_COMPL A (step A) complete Thread A cmd Thanks. Peter
On 8/22/2024 7:06 PM, Bart Van Assche wrote: > On 8/22/24 4:34 PM, Bao D. Nguyen wrote: >> Let's say you are sending a ufshcd_uic_pwr_ctrl() command. You will >> get 2 uic completion interrupts: >> [1] ufshcd_uic_cmd_compl() is called for the first interrupt which >> happens to be UFSHCD_UIC_PWR_MASK only. At the end of the >> ufshcd_uic_pwr_ctrl(), you would set the hba->active_uic_cmd to NULL. > > That's not correct. ufshcd_uic_pwr_ctrl() only clears > hba->active_uic_cmd after the power mode change interrupt has been > processed. > Agree. >> [2]The second uic completion interrupt for UIC_COMMAND_COMP is delayed. >> This interrupt is newly introduced by this patch. > > If UIC_COMMAND_COMPL is delivered after UFSHCD_UIC_PWR_MASK then the > UIC_COMMAND_COMPL interrupt will be ignored because hba->active_uic_cmd > is cleared by ufshcd_uic_pwr_ctrl() after it has processed the > power mode change interrupt. Agree. > >> Now let's say you have a new UIC command coming via >> ufshcd_send_uic_cmd(). The ufshcd_dispatch_uic_cmd() will update your >> hba->active_uic_cmd to the new uic_cmd. > > UIC command processing is serialized by hba->uic_cmd_mutex. Hence only > one UIC command is processed at any given time. > > Does this address your concerns? Not really. I agree that the uic commands are serialized by the hba->uic_cmd_mutex. However, the UIC_COMMAND_COMPL extra interrupt belonging to the previous PMC/hibern8_enter/exit() command can come late and causes the ufshcd_uic_cmd_compl() to complete the current uic command incorrectly. Thanks, Bao > > Thanks, > > Bart.
On Thu, 2024-08-22 at 10:51 -0700, Bart Van Assche wrote: > On 8/22/24 7:17 AM, Bean Huo wrote: > > Do you mean re-enabling UIC complete interrupt will cause the > > problem? > > That's correct. ufshcd_uic_hibern8_enter() calls > ufshcd_uic_pwr_ctrl() > indirectly. For the test setup that is on my desk, the code in > ufshcd_uic_pwr_ctrl() that re-enables the UIC completion interrupt > causes the UFS host controller to exit hibernation. > > Thanks, > > Bart. Do you think this is only true in your case or for a specific UFS controller vendor? and this doesnnot mean that all UFS controller vendors have this problem? Maybe MTK has confirmed this. Kind regards, Bean
On 8/23/2024 6:54 PM, Bean Huo wrote: > On Thu, 2024-08-22 at 10:51 -0700, Bart Van Assche wrote: >> On 8/22/24 7:17 AM, Bean Huo wrote: >>> Do you mean re-enabling UIC complete interrupt will cause the >>> problem? >> That's correct. ufshcd_uic_hibern8_enter() calls >> ufshcd_uic_pwr_ctrl() >> indirectly. For the test setup that is on my desk, the code in >> ufshcd_uic_pwr_ctrl() that re-enables the UIC completion interrupt >> causes the UFS host controller to exit hibernation. That is a weird UFS host controller behavior. At least Qcom UFS host controller does not behave like this, because accessing the UFSHCI IRQ register does not require/involve link communication with the UFS device. Thanks, Can Guo. >> >> Thanks, >> >> Bart. > > Do you think this is only true in your case or for a specific UFS > controller vendor? and this doesnnot mean that all UFS controller > vendors have this problem? Maybe MTK has confirmed this. > > Kind regards, > Bean
On Thu, Aug 22, 2024 at 02:08:24PM -0700, Bart Van Assche wrote: > On 8/22/24 1:54 PM, Bao D. Nguyen wrote: > > I am just curious about providing workaround for a hardware issue in the > > ufs core driver. Sometimes I notice the community is not accepting such > > a change and push the change to be implemented in the vendor/platform > > drivers. > > There are two reasons why I propose to implement this workaround as a > change for the UFS driver core: > - I am not aware of any way to implement the behavior change in a vendor > driver without modifying the UFS driver core. Unfortunately you never mentioned which UFS controller this behavior applies to. > - The workaround results in a simplification of the UFS driver core > code. More lines are removed from the UFS driver than added. > This doesn't justify the modification of the UFS code driver for an errantic behavior of a UFS controller. > Although it would be possible to select whether the old or the new > behavior is selected by introducing yet another host controller quirk, I > prefer not to do that because it would make the UFSHCI driver even more > complex. > I strongly believe that using the quirk is the way forward to address this issue. Because this is not a documented behavior to be handled in the core driver and also defeats the purpose of having the quirks in first place. - Mani
On 8/23/24 5:01 AM, Manivannan Sadhasivam wrote: > Unfortunately you never mentioned which UFS controller this behavior applies to. Does your employer allow you to publish detailed information about unreleased SoCs? >> - The workaround results in a simplification of the UFS driver core >> code. More lines are removed from the UFS driver than added. > > This doesn't justify the modification of the UFS code driver for an errantic > behavior of a UFS controller. Patch 2/2 deletes more code than it adds and hence helps to make the UFS driver core easier to maintain. Adding yet another quirk would make the UFS driver core more complicated and hence harder to maintain. >> Although it would be possible to select whether the old or the new >> behavior is selected by introducing yet another host controller quirk, I >> prefer not to do that because it would make the UFSHCI driver even more >> complex. > > I strongly believe that using the quirk is the way forward to address this > issue. Because this is not a documented behavior to be handled in the core > driver and also defeats the purpose of having the quirks in first place. Adding a quirk is not an option in this case because adding a new quirk without code that uses the quirk is not allowed. It may take another year before the code that uses that new quirk is sent upstream because the team that sends Pixel SoC drivers upstream only does this after devices with that SoC are publicly available. Thanks, Bart.
On Fri, Aug 23, 2024 at 07:23:57AM -0700, Bart Van Assche wrote: > On 8/23/24 5:01 AM, Manivannan Sadhasivam wrote: > > Unfortunately you never mentioned which UFS controller this behavior applies to. > > Does your employer allow you to publish detailed information about > unreleased SoCs? > Certainly not! But in that case I will explicitly mention that the controller is used in an upcoming SoC or something like that. Otherwise, how can the community know whether you are sending the patch for an existing controller or a secret one? > > > - The workaround results in a simplification of the UFS driver core > > > code. More lines are removed from the UFS driver than added. > > > > This doesn't justify the modification of the UFS code driver for an errantic > > behavior of a UFS controller. > > Patch 2/2 deletes more code than it adds and hence helps to make the UFS > driver core easier to maintain. Adding yet another quirk would make the > UFS driver core more complicated and hence harder to maintain. > IMO nobody can make the UFS driver more complicated. It is complicated at its best :) But you are just applying the plaster in the core code for a quirky behavior of an unreleased SoC. IMO that is not something we would want. Imagine if other vendor also decides to do the same with the argument of removing more lines of code etc... we will end up with a situation where the core code doing something out of the spec for a buggy controller without any mentioning of the quirky behavior. So that will make the code more complex to understand. > > > Although it would be possible to select whether the old or the new > > > behavior is selected by introducing yet another host controller quirk, I > > > prefer not to do that because it would make the UFSHCI driver even more > > > complex. > > > > I strongly believe that using the quirk is the way forward to address this > > issue. Because this is not a documented behavior to be handled in the core > > driver and also defeats the purpose of having the quirks in first place. > > Adding a quirk is not an option in this case because adding a new quirk > without code that uses the quirk is not allowed. It may take another > year before the code that uses that new quirk is sent upstream because > the team that sends Pixel SoC drivers upstream only does this after > devices with that SoC are publicly available. > Then why can't you send the change at that time? We do the same for Qcom SoCs all the time and I'm pretty sure that the Pixel SoCs are no different. At that time, the SoC may get a new revision and we may end up not needing the quirk at all. I'm not saying that it will happen, but that is a common practice among the semiconductor companies. - Mani
On 8/23/24 7:58 AM, Manivannan Sadhasivam wrote:
> Then why can't you send the change at that time?
We use the Android GKI kernel and any patches must be sent upstream
first before these can be considered for inclusion in the GKI kernel.
Thanks,
Bart.
On 8/23/24 3:54 AM, Bean Huo wrote: > On Thu, 2024-08-22 at 10:51 -0700, Bart Van Assche wrote: >> That's correct. ufshcd_uic_hibern8_enter() calls >> ufshcd_uic_pwr_ctrl() >> indirectly. For the test setup that is on my desk, the code in >> ufshcd_uic_pwr_ctrl() that re-enables the UIC completion interrupt >> causes the UFS host controller to exit hibernation. > > Do you think this is only true in your case or for a specific UFS > controller vendor? and this doesnnot mean that all UFS controller > vendors have this problem? Maybe MTK has confirmed this. Hi Bean, I'm only aware of one UFS controller type that shows this behavior. Thanks, Bart.
On Fri, Aug 23, 2024 at 09:07:18AM -0700, Bart Van Assche wrote: > On 8/23/24 7:58 AM, Manivannan Sadhasivam wrote: > > Then why can't you send the change at that time? > > We use the Android GKI kernel and any patches must be sent upstream > first before these can be considered for inclusion in the GKI kernel. > But that's the same requirement for other SoC vendors as well. Anyway, these don't justify the fact that the core code should be modified to workaround a controller defect. Please use quirks as like other vendors. - Mani
On 8/23/24 9:48 AM, Manivannan Sadhasivam wrote: > On Fri, Aug 23, 2024 at 09:07:18AM -0700, Bart Van Assche wrote: >> On 8/23/24 7:58 AM, Manivannan Sadhasivam wrote: >>> Then why can't you send the change at that time? >> >> We use the Android GKI kernel and any patches must be sent upstream >> first before these can be considered for inclusion in the GKI kernel. > > But that's the same requirement for other SoC vendors as well. Anyway, these > don't justify the fact that the core code should be modified to workaround a > controller defect. Please use quirks as like other vendors. Let me repeat what I mentioned earlier: * Introducing a new quirk without introducing a user for that quirk is not acceptable because that would involve introducing code that is dead code from the point of view of the upstream kernel. * The UFS driver core is already complicated. If we don't need a new quirk we shouldn't introduce a new quirk. Bart.
On 8/22/24 8:43 PM, Bao D. Nguyen wrote: > However, the UIC_COMMAND_COMPL extra interrupt > belonging to the previous PMC/hibern8_enter/exit() command can come late > and causes the ufshcd_uic_cmd_compl() to complete the current uic > command incorrectly. Hi Bao, If the UIC command completion interrupt could come late we would already have observed unhandled interrupt errors in device logs, isn't it? Anyway, isn't this something that is easy to fix with something like the (untested) patch below? Thanks, Bart. diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index d0ae6e50becc..e3a487ea83f9 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -2543,13 +2543,11 @@ ufshcd_wait_for_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) * __ufshcd_send_uic_cmd - Send UIC commands and retrieve the result * @hba: per adapter instance * @uic_cmd: UIC command - * @completion: initialize the completion only if this is set to true * * Return: 0 only if success. */ static int -__ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd, - bool completion) +__ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) { lockdep_assert_held(&hba->uic_cmd_mutex); @@ -2559,8 +2557,7 @@ __ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd, return -EIO; } - if (completion) - init_completion(&uic_cmd->done); + init_completion(&uic_cmd->done); uic_cmd->cmd_active = 1; ufshcd_dispatch_uic_cmd(hba, uic_cmd); @@ -2586,7 +2583,7 @@ int ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) mutex_lock(&hba->uic_cmd_mutex); ufshcd_add_delay_before_dme_cmd(hba); - ret = __ufshcd_send_uic_cmd(hba, uic_cmd, true); + ret = __ufshcd_send_uic_cmd(hba, uic_cmd); if (!ret) ret = ufshcd_wait_for_uic_cmd(hba, uic_cmd); @@ -4255,7 +4252,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) unsigned long flags; u8 status; int ret; - bool reenable_intr = false; mutex_lock(&hba->uic_cmd_mutex); ufshcd_add_delay_before_dme_cmd(hba); @@ -4266,17 +4262,8 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) goto out_unlock; } hba->uic_async_done = &uic_async_done; - if (ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { - ufshcd_disable_intr(hba, UIC_COMMAND_COMPL); - /* - * Make sure UIC command completion interrupt is disabled before - * issuing UIC command. - */ - ufshcd_readl(hba, REG_INTERRUPT_ENABLE); - reenable_intr = true; - } spin_unlock_irqrestore(hba->host->host_lock, flags); - ret = __ufshcd_send_uic_cmd(hba, cmd, false); + ret = __ufshcd_send_uic_cmd(hba, cmd); if (ret) { dev_err(hba->dev, "pwr ctrl cmd 0x%x with mode 0x%x uic error %d\n", @@ -4300,6 +4287,13 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) goto out; } + ret = wait_for_completion_timeout(&cmd->done, + msecs_to_jiffies(uic_cmd_timeout)); + WARN_ON_ONCE(ret < 0); + if (ret == 0) + dev_err(hba->dev, "UIC command %#x timed out\n", cmd->command); + ret = 0; + check_upmcrs: status = ufshcd_get_upmcrs(hba); if (status != PWR_LOCAL) { @@ -4318,8 +4312,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) spin_lock_irqsave(hba->host->host_lock, flags); hba->active_uic_cmd = NULL; hba->uic_async_done = NULL; - if (reenable_intr) - ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); if (ret) { ufshcd_set_link_broken(hba); ufshcd_schedule_eh_work(hba);
On 8/22/24 7:57 PM, Peter Wang (王信友) wrote: > You assume that the following steps are executed in sequence. > (Theread A) send ufshcd_uic_pwr_ctrl > (ISR) UIC_POWER_MODE A > clear hab->active_uic_cmd > (ISR) UIC_COMMAND_COMPL A (step A) > do nothing > (Thread B) ufshcd_send_uic_cmd set hab->active_uic_cmd (step B) > (ISR) UIC_COMMAND_COMPL B > complte thread B's cmd > > But actually step A ISR may come after step B and cause error. > > (Theread A) send ufshcd_uic_pwr_ctrl > (ISR) UIC_POWER_MODE A > clear hab->active_uic_cmd > (Thread B) ufshcd_send_uic_cmd set hab->active_uic_cmd (step B) > (ISR) UIC_COMMAND_COMPL A (step A) > complete Thread A cmd Hi Peter, Can you please take a look at the fix I proposed in my reply to Bao? Thanks, Bart.
On Fri, Aug 23, 2024 at 11:05:12AM -0700, Bart Van Assche wrote: > On 8/23/24 9:48 AM, Manivannan Sadhasivam wrote: > > On Fri, Aug 23, 2024 at 09:07:18AM -0700, Bart Van Assche wrote: > > > On 8/23/24 7:58 AM, Manivannan Sadhasivam wrote: > > > > Then why can't you send the change at that time? > > > > > > We use the Android GKI kernel and any patches must be sent upstream > > > first before these can be considered for inclusion in the GKI kernel. > > > > But that's the same requirement for other SoC vendors as well. Anyway, these > > don't justify the fact that the core code should be modified to workaround a > > controller defect. Please use quirks as like other vendors. > > Let me repeat what I mentioned earlier: > * Introducing a new quirk without introducing a user for that quirk is > not acceptable because that would involve introducing code that is > dead code from the point of view of the upstream kernel. As I pointed out earlier, you should just submit the quirk change when you are submitting your driver. But you said that for GKI requirement you are doing the change in core driver. But again, that is applicable for other vendors as well. What if other vendors start adding the workaround in the core driver citing GKI requirement (provided it also removes some code as you justified)? Will it be acceptable? NO. > * The UFS driver core is already complicated. If we don't need a new > quirk we shouldn't introduce a new quirk. > Sorry, the quirk is a quirk. All the existing quirks can be worked around in the core driver in some way. But we have the quirk mechanisms to specifically not to do that to avoid polluting the core code which has to follow the spec. Moreover, this workaround you are adding is not at all common for other controllers. So this definitely doesn't justify modifying the core code. IMO adding more code alone will not make a driver complicated, but changing the logic will. - Mani
On 8/23/24 7:29 PM, Manivannan Sadhasivam wrote: > What if other vendors start adding the workaround in the core driver citing GKI > requirement (provided it also removes some code as you justified)? Will it be > acceptable? NO. It's not up to you to define new rules for upstream kernel development. Anyone is allowed to publish patches that rework kernel code, whether or not the purpose of such a patch is to work around a SoC bug. Additionally, it has already happened that one of your colleagues submitted a workaround for a SoC bug to the UFS core driver. From the description of commit 0f52fcb99ea2 ("scsi: ufs: Try to save power mode change and UIC cmd completion timeout"): "This is to deal with the scenario in which completion has been raised but the one waiting for the completion cannot be awaken in time due to kernel scheduling problem." That description makes zero sense to me. My conclusion from commit 0f52fcb99ea2 is that it is a workaround for a bug in a UFS host controller, namely that a particular UFS host controller not always generates a UIC completion interrupt when it should. Bart.
On Fri, Aug 23, 2024 at 07:48:50PM -0700, Bart Van Assche wrote: > On 8/23/24 7:29 PM, Manivannan Sadhasivam wrote: > > What if other vendors start adding the workaround in the core driver citing GKI > > requirement (provided it also removes some code as you justified)? Will it be > > acceptable? NO. > > It's not up to you to define new rules for upstream kernel development. I'm not framing new rules, but just pointing out the common practice. > Anyone is allowed to publish patches that rework kernel code, whether > or not the purpose of such a patch is to work around a SoC bug. > Yes, at the same time if that code deviates from the norm, then anyone can complain. We are all working towards making the code better. > Additionally, it has already happened that one of your colleagues > submitted a workaround for a SoC bug to the UFS core driver. > From the description of commit 0f52fcb99ea2 ("scsi: ufs: Try to save > power mode change and UIC cmd completion timeout"): "This is to deal > with the scenario in which completion has been raised but the one > waiting for the completion cannot be awaken in time due to kernel > scheduling problem." That description makes zero sense to me. My > conclusion from commit 0f52fcb99ea2 is that it is a workaround for a > bug in a UFS host controller, namely that a particular UFS host > controller not always generates a UIC completion interrupt when it > should. > 0f52fcb99ea2 was submitted in 2020 before I started contributing to UFS driver seriously. But the description of that commit never mentioned any issue with the controller. It vaguely mentions 'kernel scheduling problem' which I don't know how to interpret. If I were looking into the code at that time, I would've definitely asked for clarity during the review phase. But there is no need to take it as an example. I can only assert the fact that working around the controller defect in core code when we already have quirks for the same purpose defeats the purpose of quirks. And it will encourage other people to start changing the core code in the future thus bypassing the quirks. But I'm not a maintainer of this part of the code. So I cannot definitely stop you from getting this patch merged. I'll leave it up to Martin to decide. - Mani
On Fri, 2024-08-23 at 13:27 -0700, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 8/22/24 7:57 PM, Peter Wang (王信友) wrote: > > You assume that the following steps are executed in sequence. > > (Theread A) send ufshcd_uic_pwr_ctrl > > (ISR) UIC_POWER_MODE A > > clear hab->active_uic_cmd > > (ISR) UIC_COMMAND_COMPL A (step A) > > do nothing > > (Thread B) ufshcd_send_uic_cmd set hab->active_uic_cmd (step B) > > (ISR) UIC_COMMAND_COMPL B > > complte thread B's cmd > > > > But actually step A ISR may come after step B and cause error. > > > > (Theread A) send ufshcd_uic_pwr_ctrl > > (ISR) UIC_POWER_MODE A > > clear hab->active_uic_cmd > > (Thread B) ufshcd_send_uic_cmd set hab->active_uic_cmd (step B) > > (ISR) UIC_COMMAND_COMPL A (step A) > > complete Thread A cmd > > Hi Peter, > > Can you please take a look at the fix I proposed in my reply to Bao? > > Thanks, > > Bart. > Hi Bart, In this case, I suggest use a vendor hooks ufshcd_vops_hibern8_notify. When hibernate enter pre-change, disable UIC_COMMAND_COMPL interrupt to prevent enable UIC_COMMAND_COMPL after hibernate enter. When hibernate exit post-change, enable UIC_COMMAND_COMPL interrupt. If it works, it won't modify the native kernel code, nor will it require adding a quirk. It would simply use a vendor hook as a workaround, without violating GKI, right? Thanks Peter
On 1/1/1970 8:00 AM, wrote: > On Fri, Aug 23, 2024 at 07:48:50PM -0700, Bart Van Assche wrote: >> On 8/23/24 7:29 PM, Manivannan Sadhasivam wrote: >>> What if other vendors start adding the workaround in the core driver citing GKI >>> requirement (provided it also removes some code as you justified)? Will it be >>> acceptable? NO. >> It's not up to you to define new rules for upstream kernel development. > I'm not framing new rules, but just pointing out the common practice. > >> Anyone is allowed to publish patches that rework kernel code, whether >> or not the purpose of such a patch is to work around a SoC bug. >> > Yes, at the same time if that code deviates from the norm, then anyone can > complain. We are all working towards making the code better. > >> Additionally, it has already happened that one of your colleagues >> submitted a workaround for a SoC bug to the UFS core driver. >> From the description of commit 0f52fcb99ea2 ("scsi: ufs: Try to save >> power mode change and UIC cmd completion timeout"): "This is to deal >> with the scenario in which completion has been raised but the one >> waiting for the completion cannot be awaken in time due to kernel >> scheduling problem." That description makes zero sense to me. My >> conclusion from commit 0f52fcb99ea2 is that it is a workaround for a >> bug in a UFS host controller, namely that a particular UFS host >> controller not always generates a UIC completion interrupt when it >> should. >> > 0f52fcb99ea2 was submitted in 2020 before I started contributing to UFS driver > seriously. But the description of that commit never mentioned any issue with the > controller. It vaguely mentions 'kernel scheduling problem' which I don't know > how to interpret. If I were looking into the code at that time, I would've > definitely asked for clarity during the review phase. 0f52fcb99ea2 is my commit, apologize for the confusion due to poor commit msg. What we were trying to fix was not a SoC BUG. More background for this change: from our customer side, we used to hit corner cases where the UIC command is sent, UFS host controller generates the UIC command completion interrupt fine, then UIC completion IRQ handler fires and calls the complete(), however the completion timeout error still happens. In this case, UFS, UFS host and UFS driver are the victims. And whatever could cause this scheduling problem should be fixed properly by the right PoC, but we thought making UFS driver robust in this spot would be good for all of the users who may face the similar issue, hence the change. Thanks, Can Guo. > > But there is no need to take it as an example. I can only assert the fact that > working around the controller defect in core code when we already have quirks > for the same purpose defeats the purpose of quirks. And it will encourage other > people to start changing the core code in the future thus bypassing the quirks. > > But I'm not a maintainer of this part of the code. So I cannot definitely stop > you from getting this patch merged. I'll leave it up to Martin to decide. > > - Mani >
On 8/25/24 11:16 PM, Peter Wang (王信友) wrote: > In this case, I suggest use a vendor hooks ufshcd_vops_hibern8_notify. > When hibernate enter pre-change, disable UIC_COMMAND_COMPL interrupt > to prevent enable UIC_COMMAND_COMPL after hibernate enter. > When hibernate exit post-change, enable UIC_COMMAND_COMPL interrupt. > > If it works, it won't modify the native kernel code, nor will it > require adding a quirk. It would simply use a vendor hook as a > workaround, without violating GKI, right? Hmm ... does this mean introducing a new vendor hook without introducing any host driver code that implements that hook? I don't think this is allowed. How about introducing a quirk that selects the current behavior if set (disable UIC completion interrupt around hibernation) and not touching the UIC completion interrupt if that quirk is not set? Thanks, Bart.
On Mon, 2024-08-26 at 11:08 -0700, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 8/25/24 11:16 PM, Peter Wang (王信友) wrote: > > In this case, I suggest use a vendor hooks > ufshcd_vops_hibern8_notify. > > When hibernate enter pre-change, disable UIC_COMMAND_COMPL > interrupt > > to prevent enable UIC_COMMAND_COMPL after hibernate enter. > > When hibernate exit post-change, enable UIC_COMMAND_COMPL > interrupt. > > > > If it works, it won't modify the native kernel code, nor will it > > require adding a quirk. It would simply use a vendor hook as a > > workaround, without violating GKI, right? > > Hmm ... does this mean introducing a new vendor hook without > introducing > any host driver code that implements that hook? I don't think this is > allowed. > Hi Bart, It is not a new vendor hook, ufshcd_vops_hibern8_notify is exist in current kernel. Thanks. Peter > How about introducing a quirk that selects the current behavior if > set > (disable UIC completion interrupt around hibernation) and not > touching > the UIC completion interrupt if that quirk is not set? > > Thanks, > > Bart.
On 8/26/24 6:39 PM, Peter Wang (王信友) wrote: > It is not a new vendor hook, ufshcd_vops_hibern8_notify is exist > in current kernel. Hi Peter, Is something like the untested patch below perhaps what you have in mind? Thanks, Bart. diff --git a/drivers/ufs/core/ufshcd-priv.h b/drivers/ufs/core/ufshcd-priv.h index ce36154ce963..69ee49a75c04 100644 --- a/drivers/ufs/core/ufshcd-priv.h +++ b/drivers/ufs/core/ufshcd-priv.h @@ -176,12 +176,14 @@ static inline void ufshcd_vops_setup_task_mgmt(struct ufs_hba *hba, return hba->vops->setup_task_mgmt(hba, tag, tm_function); } -static inline void ufshcd_vops_hibern8_notify(struct ufs_hba *hba, - enum uic_cmd_dme cmd, - enum ufs_notify_change_status status) +static inline void +ufshcd_vops_hibern8_notify(struct ufs_hba *hba, enum uic_cmd_dme cmd, + enum ufs_notify_change_status status, + bool *disable_uic_compl_intr) { if (hba->vops && hba->vops->hibern8_notify) - return hba->vops->hibern8_notify(hba, cmd, status); + return hba->vops->hibern8_notify(hba, cmd, status, + disable_uic_compl_intr); } static inline int ufshcd_vops_apply_dev_quirks(struct ufs_hba *hba) diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index e13b9ac145f6..614b24f2eb7f 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -2541,9 +2541,8 @@ ufshcd_wait_for_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) * * Return: 0 only if success. */ -static int -__ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd, - bool completion) +static int __ufshcd_send_uic_cmd(struct ufs_hba *hba, + struct uic_command *uic_cmd) { lockdep_assert_held(&hba->uic_cmd_mutex); @@ -2553,8 +2552,7 @@ __ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd, return -EIO; } - if (completion) - init_completion(&uic_cmd->done); + init_completion(&uic_cmd->done); uic_cmd->cmd_active = 1; ufshcd_dispatch_uic_cmd(hba, uic_cmd); @@ -2580,7 +2578,7 @@ int ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) mutex_lock(&hba->uic_cmd_mutex); ufshcd_add_delay_before_dme_cmd(hba); - ret = __ufshcd_send_uic_cmd(hba, uic_cmd, true); + ret = __ufshcd_send_uic_cmd(hba, uic_cmd); if (!ret) ret = ufshcd_wait_for_uic_cmd(hba, uic_cmd); @@ -4243,7 +4241,8 @@ EXPORT_SYMBOL_GPL(ufshcd_dme_get_attr); * * Return: 0 on success, non-zero value on failure. */ -static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) +static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd, + bool disable_uic_compl_intr) { DECLARE_COMPLETION_ONSTACK(uic_async_done); unsigned long flags; @@ -4260,7 +4259,8 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) goto out_unlock; } hba->uic_async_done = &uic_async_done; - if (ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { + if (disable_uic_compl_intr && + ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { ufshcd_disable_intr(hba, UIC_COMMAND_COMPL); /* * Make sure UIC command completion interrupt is disabled before @@ -4270,7 +4270,7 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) reenable_intr = true; } spin_unlock_irqrestore(hba->host->host_lock, flags); - ret = __ufshcd_send_uic_cmd(hba, cmd, false); + ret = __ufshcd_send_uic_cmd(hba, cmd); if (ret) { dev_err(hba->dev, "pwr ctrl cmd 0x%x with mode 0x%x uic error %d\n", @@ -4294,6 +4294,16 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) goto out; } + if (!disable_uic_compl_intr && + ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { + ret = wait_for_completion_timeout( + &cmd->done, msecs_to_jiffies(uic_cmd_timeout)); + if (ret == 0) + dev_err(hba->dev, "pwr ctrl cmd %#x timed out\n", + cmd->command); + ret = 0; + } + check_upmcrs: status = ufshcd_get_upmcrs(hba); if (status != PWR_LOCAL) { @@ -4353,7 +4363,7 @@ int ufshcd_uic_change_pwr_mode(struct ufs_hba *hba, u8 mode) } ufshcd_hold(hba); - ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd); + ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd, true); ufshcd_release(hba); out: @@ -4396,11 +4406,13 @@ int ufshcd_uic_hibern8_enter(struct ufs_hba *hba) .command = UIC_CMD_DME_HIBER_ENTER, }; ktime_t start = ktime_get(); + bool disable_uic_compl_intr = true; int ret; - ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_ENTER, PRE_CHANGE); + ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_ENTER, PRE_CHANGE, + &disable_uic_compl_intr); - ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd); + ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd, disable_uic_compl_intr); trace_ufshcd_profile_hibern8(dev_name(hba->dev), "enter", ktime_to_us(ktime_sub(ktime_get(), start)), ret); @@ -4409,7 +4421,7 @@ int ufshcd_uic_hibern8_enter(struct ufs_hba *hba) __func__, ret); else ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_ENTER, - POST_CHANGE); + POST_CHANGE, NULL); return ret; } @@ -4423,9 +4435,10 @@ int ufshcd_uic_hibern8_exit(struct ufs_hba *hba) int ret; ktime_t start = ktime_get(); - ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_EXIT, PRE_CHANGE); + ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_EXIT, PRE_CHANGE, + NULL); - ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd); + ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd, true); trace_ufshcd_profile_hibern8(dev_name(hba->dev), "exit", ktime_to_us(ktime_sub(ktime_get(), start)), ret); @@ -4434,7 +4447,7 @@ int ufshcd_uic_hibern8_exit(struct ufs_hba *hba) __func__, ret); } else { ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_EXIT, - POST_CHANGE); + POST_CHANGE, NULL); hba->ufs_stats.last_hibern8_exit_tstamp = local_clock(); hba->ufs_stats.hibern8_exit_cnt++; } diff --git a/drivers/ufs/host/cdns-pltfrm.c b/drivers/ufs/host/cdns-pltfrm.c index 66811d8d1929..24c05e5c455d 100644 --- a/drivers/ufs/host/cdns-pltfrm.c +++ b/drivers/ufs/host/cdns-pltfrm.c @@ -164,7 +164,8 @@ static int cdns_ufs_hce_enable_notify(struct ufs_hba *hba, * @status: notify stage (pre, post change) */ static void cdns_ufs_hibern8_notify(struct ufs_hba *hba, enum uic_cmd_dme cmd, - enum ufs_notify_change_status status) + enum ufs_notify_change_status status, + bool *disable_uic_compl_intr) { if (status == PRE_CHANGE && cmd == UIC_CMD_DME_HIBER_ENTER) cdns_ufs_get_l4_attr(hba); diff --git a/drivers/ufs/host/ufs-exynos.c b/drivers/ufs/host/ufs-exynos.c index 16ad3528d80b..e991d9e5e2e4 100644 --- a/drivers/ufs/host/ufs-exynos.c +++ b/drivers/ufs/host/ufs-exynos.c @@ -1624,8 +1624,9 @@ static int exynos_ufs_pwr_change_notify(struct ufs_hba *hba, } static void exynos_ufs_hibern8_notify(struct ufs_hba *hba, - enum uic_cmd_dme enter, - enum ufs_notify_change_status notify) + enum uic_cmd_dme enter, + enum ufs_notify_change_status notify, + bool *disable_uic_compl_intr) { switch ((u8)notify) { case PRE_CHANGE: diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h index a43b14276bc3..59b901d67400 100644 --- a/include/ufs/ufshcd.h +++ b/include/ufs/ufshcd.h @@ -355,8 +355,9 @@ struct ufs_hba_variant_ops { void (*setup_xfer_req)(struct ufs_hba *hba, int tag, bool is_scsi_cmd); void (*setup_task_mgmt)(struct ufs_hba *, int, u8); - void (*hibern8_notify)(struct ufs_hba *, enum uic_cmd_dme, - enum ufs_notify_change_status); + void (*hibern8_notify)(struct ufs_hba *, enum uic_cmd_dme, + enum ufs_notify_change_status, + bool *disable_uic_compl_intr); int (*apply_dev_quirks)(struct ufs_hba *hba); void (*fixup_dev_quirks)(struct ufs_hba *hba); int (*suspend)(struct ufs_hba *, enum ufs_pm_op,
On 8/23/2024 1:25 PM, Bart Van Assche wrote: > If the UIC command completion interrupt could come late we would > already have observed unhandled interrupt errors in device logs, isn't > it? > > Anyway, isn't this something that is easy to fix with something like the > (untested) patch below? > > > @@ -2559,8 +2557,7 @@ __ufshcd_send_uic_cmd(struct ufs_hba *hba, struct > uic_command *uic_cmd, > return -EIO; > } > > - if (completion) > - init_completion(&uic_cmd->done); > + init_completion(&uic_cmd->done); This change may not work, Bart. In this path ufshcd_uic_pwr_ctrl()->__ufshcd_send_uic_cmd(), the proposal always has a init_completion(&uic_cmd->done). However, there is no ufshcd_wait_for_uic_cmd() in this path. The isr will call ufshcd_uic_cmd_compl()->complete(&hba->active_uic_cmd->done); to signal the completion, but you are missing the code that waits for the &hba->active_uic_cmd->done signal, so it may cause stability issues. I am going to review another of your proposal in the mail with Peter and get back. Thanks, Bao
On 8/27/24 5:17 PM, Bao D. Nguyen wrote: > On 8/23/2024 1:25 PM, Bart Van Assche wrote: >> - if (completion) >> - init_completion(&uic_cmd->done); >> + init_completion(&uic_cmd->done); > > This change may not work, Bart. Why not? It is allowed to call init_completion() multiple times before wait_for_completion() is called. It is even allowed to call init_completion() without ever calling wait_for_completion(). All init_completion() does is to perform a few assignments: static inline void init_completion(struct completion *x) { x->done = 0; init_swait_queue_head(&x->wait); } It is not allowed to call init_completion() concurrently with wait_for_completion() because that would trigger a race condition. But I don't think that my patch introduces such a race. Thanks, Bart.
On 8/27/2024 8:42 AM, Bart Van Assche wrote: > On 8/26/24 6:39 PM, Peter Wang (王信友) wrote: >> It is not a new vendor hook, ufshcd_vops_hibern8_notify is exist >> in current kernel. > Hi Peter, > > Is something like the untested patch below perhaps what you have in > mind? Hi Bart, IMHO even though the patch may work, it makes the existing code more complicated with the disable_uic_compl_intr flag being sent to the platform driver that has nothing to do with it. It seems the proposal makes the code harder to read as well. A new person without any history of this issue trying to understand the purpose of this flag would probably need to spend some extra time digging the history. Thanks, Bao
On Tue, 2024-08-27 at 11:42 -0400, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 8/26/24 6:39 PM, Peter Wang (王信友) wrote: > > It is not a new vendor hook, ufshcd_vops_hibern8_notify is exist > > in current kernel. > Hi Peter, > > Is something like the untested patch below perhaps what you have in > mind? > > Thanks, > > Bart. > Hi Bart, No, I means you can reference ufs-sprd.c driver. which may have the same issue? /* * Disable UIC COMPL INTR to prevent access to UFSHCI after * checking HCS.UPMCRS */ ufs_sprd_ctrl_uic_compl(hba, false); Then after enter hibernte, you can prevent access to UFSHCI. After exit hibernate, enable uic complete interrupt again for workaround. Thanks. Peter > > diff --git a/drivers/ufs/core/ufshcd-priv.h > b/drivers/ufs/core/ufshcd-priv.h > index ce36154ce963..69ee49a75c04 100644 > --- a/drivers/ufs/core/ufshcd-priv.h > +++ b/drivers/ufs/core/ufshcd-priv.h > @@ -176,12 +176,14 @@ static inline void > ufshcd_vops_setup_task_mgmt(struct ufs_hba *hba, > return hba->vops->setup_task_mgmt(hba, tag, tm_function); > } > > -static inline void ufshcd_vops_hibern8_notify(struct ufs_hba *hba, > -enum uic_cmd_dme cmd, > -enum ufs_notify_change_status status) > +static inline void > +ufshcd_vops_hibern8_notify(struct ufs_hba *hba, enum uic_cmd_dme > cmd, > + enum ufs_notify_change_status status, > + bool *disable_uic_compl_intr) > { > if (hba->vops && hba->vops->hibern8_notify) > -return hba->vops->hibern8_notify(hba, cmd, status); > +return hba->vops->hibern8_notify(hba, cmd, status, > + disable_uic_compl_intr); > } > > static inline int ufshcd_vops_apply_dev_quirks(struct ufs_hba *hba) > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index e13b9ac145f6..614b24f2eb7f 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -2541,9 +2541,8 @@ ufshcd_wait_for_uic_cmd(struct ufs_hba *hba, > struct uic_command *uic_cmd) > * > * Return: 0 only if success. > */ > -static int > -__ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command > *uic_cmd, > - bool completion) > +static int __ufshcd_send_uic_cmd(struct ufs_hba *hba, > + struct uic_command *uic_cmd) > { > lockdep_assert_held(&hba->uic_cmd_mutex); > > @@ -2553,8 +2552,7 @@ __ufshcd_send_uic_cmd(struct ufs_hba *hba, > struct > uic_command *uic_cmd, > return -EIO; > } > > -if (completion) > -init_completion(&uic_cmd->done); > +init_completion(&uic_cmd->done); > > uic_cmd->cmd_active = 1; > ufshcd_dispatch_uic_cmd(hba, uic_cmd); > @@ -2580,7 +2578,7 @@ int ufshcd_send_uic_cmd(struct ufs_hba *hba, > struct uic_command *uic_cmd) > mutex_lock(&hba->uic_cmd_mutex); > ufshcd_add_delay_before_dme_cmd(hba); > > -ret = __ufshcd_send_uic_cmd(hba, uic_cmd, true); > +ret = __ufshcd_send_uic_cmd(hba, uic_cmd); > if (!ret) > ret = ufshcd_wait_for_uic_cmd(hba, uic_cmd); > > @@ -4243,7 +4241,8 @@ EXPORT_SYMBOL_GPL(ufshcd_dme_get_attr); > * > * Return: 0 on success, non-zero value on failure. > */ > -static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct > uic_command > *cmd) > +static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct > uic_command > *cmd, > + bool disable_uic_compl_intr) > { > DECLARE_COMPLETION_ONSTACK(uic_async_done); > unsigned long flags; > @@ -4260,7 +4259,8 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba > *hba, struct uic_command *cmd) > goto out_unlock; > } > hba->uic_async_done = &uic_async_done; > -if (ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { > +if (disable_uic_compl_intr && > + ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { > ufshcd_disable_intr(hba, UIC_COMMAND_COMPL); > /* > * Make sure UIC command completion interrupt is disabled before > @@ -4270,7 +4270,7 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba > *hba, struct uic_command *cmd) > reenable_intr = true; > } > spin_unlock_irqrestore(hba->host->host_lock, flags); > -ret = __ufshcd_send_uic_cmd(hba, cmd, false); > +ret = __ufshcd_send_uic_cmd(hba, cmd); > if (ret) { > dev_err(hba->dev, > "pwr ctrl cmd 0x%x with mode 0x%x uic error %d\n", > @@ -4294,6 +4294,16 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba > *hba, struct uic_command *cmd) > goto out; > } > > +if (!disable_uic_compl_intr && > + ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { > +ret = wait_for_completion_timeout( > +&cmd->done, msecs_to_jiffies(uic_cmd_timeout)); > +if (ret == 0) > +dev_err(hba->dev, "pwr ctrl cmd %#x timed out\n", > +cmd->command); > +ret = 0; > +} > + > check_upmcrs: > status = ufshcd_get_upmcrs(hba); > if (status != PWR_LOCAL) { > @@ -4353,7 +4363,7 @@ int ufshcd_uic_change_pwr_mode(struct ufs_hba > *hba, u8 mode) > } > > ufshcd_hold(hba); > -ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd); > +ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd, true); > ufshcd_release(hba); > > out: > @@ -4396,11 +4406,13 @@ int ufshcd_uic_hibern8_enter(struct ufs_hba > *hba) > .command = UIC_CMD_DME_HIBER_ENTER, > }; > ktime_t start = ktime_get(); > +bool disable_uic_compl_intr = true; > int ret; > > -ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_ENTER, > PRE_CHANGE); > +ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_ENTER, PRE_CHANGE, > + &disable_uic_compl_intr); > > -ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd); > +ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd, disable_uic_compl_intr); > trace_ufshcd_profile_hibern8(dev_name(hba->dev), "enter", > ktime_to_us(ktime_sub(ktime_get(), start)), ret); > > @@ -4409,7 +4421,7 @@ int ufshcd_uic_hibern8_enter(struct ufs_hba > *hba) > __func__, ret); > else > ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_ENTER, > -POST_CHANGE); > + POST_CHANGE, NULL); > > return ret; > } > @@ -4423,9 +4435,10 @@ int ufshcd_uic_hibern8_exit(struct ufs_hba > *hba) > int ret; > ktime_t start = ktime_get(); > > -ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_EXIT, PRE_CHANGE); > +ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_EXIT, PRE_CHANGE, > + NULL); > > -ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd); > +ret = ufshcd_uic_pwr_ctrl(hba, &uic_cmd, true); > trace_ufshcd_profile_hibern8(dev_name(hba->dev), "exit", > ktime_to_us(ktime_sub(ktime_get(), start)), ret); > > @@ -4434,7 +4447,7 @@ int ufshcd_uic_hibern8_exit(struct ufs_hba > *hba) > __func__, ret); > } else { > ufshcd_vops_hibern8_notify(hba, UIC_CMD_DME_HIBER_EXIT, > -POST_CHANGE); > + POST_CHANGE, NULL); > hba->ufs_stats.last_hibern8_exit_tstamp = local_clock(); > hba->ufs_stats.hibern8_exit_cnt++; > } > diff --git a/drivers/ufs/host/cdns-pltfrm.c b/drivers/ufs/host/cdns- > pltfrm.c > index 66811d8d1929..24c05e5c455d 100644 > --- a/drivers/ufs/host/cdns-pltfrm.c > +++ b/drivers/ufs/host/cdns-pltfrm.c > @@ -164,7 +164,8 @@ static int cdns_ufs_hce_enable_notify(struct > ufs_hba > *hba, > * @status: notify stage (pre, post change) > */ > static void cdns_ufs_hibern8_notify(struct ufs_hba *hba, enum > uic_cmd_dme cmd, > - enum ufs_notify_change_status status) > + enum ufs_notify_change_status status, > + bool *disable_uic_compl_intr) > { > if (status == PRE_CHANGE && cmd == UIC_CMD_DME_HIBER_ENTER) > cdns_ufs_get_l4_attr(hba); > diff --git a/drivers/ufs/host/ufs-exynos.c b/drivers/ufs/host/ufs- > exynos.c > index 16ad3528d80b..e991d9e5e2e4 100644 > --- a/drivers/ufs/host/ufs-exynos.c > +++ b/drivers/ufs/host/ufs-exynos.c > @@ -1624,8 +1624,9 @@ static int exynos_ufs_pwr_change_notify(struct > ufs_hba *hba, > } > > static void exynos_ufs_hibern8_notify(struct ufs_hba *hba, > - enum uic_cmd_dme enter, > - enum ufs_notify_change_status notify) > + enum uic_cmd_dme enter, > + enum ufs_notify_change_status notify, > + bool *disable_uic_compl_intr) > { > switch ((u8)notify) { > case PRE_CHANGE: > diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h > index a43b14276bc3..59b901d67400 100644 > --- a/include/ufs/ufshcd.h > +++ b/include/ufs/ufshcd.h > @@ -355,8 +355,9 @@ struct ufs_hba_variant_ops { > void(*setup_xfer_req)(struct ufs_hba *hba, int tag, > bool is_scsi_cmd); > void(*setup_task_mgmt)(struct ufs_hba *, int, u8); > -void (*hibern8_notify)(struct ufs_hba *, enum uic_cmd_dme, > -enum ufs_notify_change_status); > +void(*hibern8_notify)(struct ufs_hba *, enum uic_cmd_dme, > + enum ufs_notify_change_status, > + bool *disable_uic_compl_intr); > int(*apply_dev_quirks)(struct ufs_hba *hba); > void(*fixup_dev_quirks)(struct ufs_hba *hba); > int (*suspend)(struct ufs_hba *, enum ufs_pm_op, >
On 8/28/24 2:17 AM, Peter Wang (王信友) wrote: > No, I means you can reference ufs-sprd.c driver. which may have the > same issue? > > /* > * Disable UIC COMPL INTR to prevent access to > UFSHCI after > * checking HCS.UPMCRS > */ > ufs_sprd_ctrl_uic_compl(hba, false); > > Then after enter hibernte, you can prevent access to UFSHCI. > After exit hibernate, enable uic complete interrupt again for > workaround. Hi Peter, My opinion about this is as follows: * Host drivers should not disable or enable the UIC completion interrupt. Only the UFS controller core driver should do this. * The behavior I'm observing is that modifying the REG_INTERRUPT_ENABLE register is sufficient to cause the UniPro link to exit the hibernation state. Avoiding this cannot be achieved in a clean way without modifying the UFS controller core driver. Thanks, Bart.
On Wed, 2024-08-28 at 07:18 -0400, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 8/28/24 2:17 AM, Peter Wang (王信友) wrote: > > No, I means you can reference ufs-sprd.c driver. which may have the > > same issue? > > > > /* > > * Disable UIC COMPL INTR to prevent access to > > UFSHCI after > > * checking HCS.UPMCRS > > */ > > ufs_sprd_ctrl_uic_compl(hba, false); > > > > Then after enter hibernte, you can prevent access to UFSHCI. > > After exit hibernate, enable uic complete interrupt again for > > workaround. > Hi Peter, > > My opinion about this is as follows: > * Host drivers should not disable or enable the UIC completion > interrupt. Only the UFS controller core driver should do this. > * The behavior I'm observing is that modifying the > REG_INTERRUPT_ENABLE > register is sufficient to cause the UniPro link to exit the > hibernation state. Avoiding this cannot be achieved in a clean way > without modifying the UFS controller core driver. > > Thanks, > > Bart. Hi Bart, I am confusing, 1. If only UFS controller core driver should do this, What about registers that are not REG_INTERRUPT_ENABLE? Since ufshcd_writel has been exported, shouldn't the host driver have the authority to control all Host REGs? 2. Set REG_INTERRUPT_ENABLE only when hibernate exit? If cause the UniPro link to exit, then it should still be correct, just exiting hibernate early? Thanks. Peter
On 8/28/24 9:46 AM, Peter Wang (王信友) wrote: > 1. If only UFS controller core driver should do this, > What about registers that are not REG_INTERRUPT_ENABLE? > Since ufshcd_writel has been exported, shouldn't the host > driver have the authority to control all Host REGs? It is not because host drivers can change any host controller register that host drivers should take the freedom to modify all host controller registers. Modifying host controller registers that are vendor specific from a host driver seems totally fine to me. I think that standardized host controller registers should only be modified from the UFS driver core. Otherwise the UFS core driver cannot be understood nor verified without deep understanding of all the host drivers. > 2. Set REG_INTERRUPT_ENABLE only when hibernate exit? > If cause the UniPro link to exit, then it should still be correct, > just exiting hibernate early? This approach has two disadvantages: - It requires that even more state information is tracked in struct ufs_hba. - This approach is probably incompatible with the power management core. I think that there is code in the power management core for disabling interrupts during suspend and reenabling interrupts during resume. Enabling an interrupt that is already enabled is not allowed. In general, disabling / reenabling interrupts is something that should be avoided because it is not compatible with multithreading. The reason why it works in the UFS driver for UIC commands is because these are serialized. Thanks, Bart.
On Wed, 2024-08-28 at 10:10 -0400, Bart Van Assche wrote: > > External email : Please do not click links or open attachments until > you have verified the sender or the content. > On 8/28/24 9:46 AM, Peter Wang (王信友) wrote: > > 1. If only UFS controller core driver should do this, > > What about registers that are not REG_INTERRUPT_ENABLE? > > Since ufshcd_writel has been exported, shouldn't the host > > driver have the authority to control all Host REGs? > > It is not because host drivers can change any host controller > register > that host drivers should take the freedom to modify all host > controller > registers. Modifying host controller registers that are vendor > specific from a host driver seems totally fine to me. I think that > standardized host controller registers should only be modified from > the > UFS driver core. Otherwise the UFS core driver cannot be understood > nor > verified without deep understanding of all the host drivers. > Hi Bart, But what we are discussing is a 'workaround', which might allow modifications to standardized host controller registers, right? > > 2. Set REG_INTERRUPT_ENABLE only when hibernate exit? > > If cause the UniPro link to exit, then it should still be > correct, > > just exiting hibernate early? > > This approach has two disadvantages: > - It requires that even more state information is tracked in struct > ufs_hba. It could utilize a host private structure, like ufs_mtk_host. > - This approach is probably incompatible with the power management > core. > I think that there is code in the power management core for > disabling > interrupts during suspend and reenabling interrupts during resume. > Enabling an interrupt that is already enabled is not allowed. > Power management enable or disable interrupt should through device driver hook, such as suspend/resume callback function? I am not sure because MediaTek's power management does not directly control interrupts. > In general, disabling / reenabling interrupts is something that > should > be avoided because it is not compatible with multithreading. The > reason > why it works in the UFS driver for UIC commands is because these are > serialized. Yes, but entering and exiting hibernate are protected by clock gating or the runtime/system PM framework. There shouldn't be issues with multiple threads entering or exiting hibernate? Thanks. Peter > > Thanks, > > Bart.
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index d0ae6e50becc..e12f30b8a83c 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -2585,6 +2585,7 @@ int ufshcd_send_uic_cmd(struct ufs_hba *hba, struct uic_command *uic_cmd) ufshcd_hold(hba); mutex_lock(&hba->uic_cmd_mutex); ufshcd_add_delay_before_dme_cmd(hba); + WARN_ON(hba->uic_async_done); ret = __ufshcd_send_uic_cmd(hba, uic_cmd, true); if (!ret) @@ -4255,7 +4256,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) unsigned long flags; u8 status; int ret; - bool reenable_intr = false; mutex_lock(&hba->uic_cmd_mutex); ufshcd_add_delay_before_dme_cmd(hba); @@ -4266,15 +4266,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) goto out_unlock; } hba->uic_async_done = &uic_async_done; - if (ufshcd_readl(hba, REG_INTERRUPT_ENABLE) & UIC_COMMAND_COMPL) { - ufshcd_disable_intr(hba, UIC_COMMAND_COMPL); - /* - * Make sure UIC command completion interrupt is disabled before - * issuing UIC command. - */ - ufshcd_readl(hba, REG_INTERRUPT_ENABLE); - reenable_intr = true; - } spin_unlock_irqrestore(hba->host->host_lock, flags); ret = __ufshcd_send_uic_cmd(hba, cmd, false); if (ret) { @@ -4318,8 +4309,6 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) spin_lock_irqsave(hba->host->host_lock, flags); hba->active_uic_cmd = NULL; hba->uic_async_done = NULL; - if (reenable_intr) - ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); if (ret) { ufshcd_set_link_broken(hba); ufshcd_schedule_eh_work(hba); @@ -5472,11 +5461,12 @@ static irqreturn_t ufshcd_uic_cmd_compl(struct ufs_hba *hba, u32 intr_status) hba->errors |= (UFSHCD_UIC_HIBERN8_MASK & intr_status); if (intr_status & UIC_COMMAND_COMPL && cmd) { - cmd->argument2 |= ufshcd_get_uic_cmd_result(hba); - cmd->argument3 = ufshcd_get_dme_attr_val(hba); - if (!hba->uic_async_done) + if (!hba->uic_async_done) { + cmd->argument2 |= ufshcd_get_uic_cmd_result(hba); + cmd->argument3 = ufshcd_get_dme_attr_val(hba); cmd->cmd_active = 0; - complete(&cmd->done); + complete(&cmd->done); + } retval = IRQ_HANDLED; } diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h index a43b14276bc3..0577013a4611 100644 --- a/include/ufs/ufshcd.h +++ b/include/ufs/ufshcd.h @@ -868,9 +868,10 @@ enum ufshcd_mcq_opr { * @tmf_tag_set: TMF tag set. * @tmf_queue: Used to allocate TMF tags. * @tmf_rqs: array with pointers to TMF requests while these are in progress. - * @active_uic_cmd: handle of active UIC command - * @uic_cmd_mutex: mutex for UIC command - * @uic_async_done: completion used during UIC processing + * @active_uic_cmd: active UIC command pointer. + * @uic_cmd_mutex: mutex used to serialize UIC command processing. + * @uic_async_done: completion used to wait for power mode or hibernation state + * changes. * @ufshcd_state: UFSHCD state * @eh_flags: Error handling flags * @intr_mask: Interrupt Mask Bits
Accessing a host controller register after the host controller has entered the hibernation state may cause the host controller to exit the hibernation state. Hence rework the hibernation entry code such that it does not modify the interrupt enabled status. This patch relies on the following: * If an UIC command is submitted that should be completed by the UIC command completion interrupt, hba->uic_async_done == NULL. * If an UIC command is submitted that should be completed by the power mode change interrupt or by a hibernation state change interrupt, hba->uic_async_done != NULL. Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/ufs/core/ufshcd.c | 22 ++++++---------------- include/ufs/ufshcd.h | 7 ++++--- 2 files changed, 10 insertions(+), 19 deletions(-)