diff mbox series

[V2,6/6] mmc: cqhci: Fix task clearing in CQE error recovery

Message ID 20231103084720.6886-7-adrian.hunter@intel.com
State New
Headers show
Series mmc: block: Fixes for CQE error recovery recovery | expand

Commit Message

Adrian Hunter Nov. 3, 2023, 8:47 a.m. UTC
If a task completion notification (TCN) is received when there is no
outstanding task, the cqhci driver issues a "spurious TCN" warning. This
was observed to happen right after CQE error recovery.

When an error interrupt is received the driver runs recovery logic.
It halts the controller, clears all pending tasks, and then re-enables
it. On some platforms, like Intel Jasper Lake, a stale task completion
event was observed, regardless of the CQHCI_CLEAR_ALL_TASKS bit being set.

This results in either:
a) Spurious TC completion event for an empty slot.
b) Corrupted data being passed up the stack, as a result of premature
   completion for a newly added task.

Rather than add a quirk for affected controllers, ensure tasks are cleared
by toggling CQHCI_ENABLE, which would happen anyway if
cqhci_clear_all_tasks() timed out. This is simpler and should be safe and
effective for all controllers.

Fixes: a4080225f51d ("mmc: cqhci: support for command queue enabled host")
Cc: stable@vger.kernel.org
Reported-by: Kornel Dulęba <korneld@chromium.org>
Tested-by: Kornel Dulęba <korneld@chromium.org>
Co-developed-by: Kornel Dulęba <korneld@chromium.org>
Signed-off-by: Kornel Dulęba <korneld@chromium.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/mmc/host/cqhci-core.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

Comments

Avri Altman Nov. 9, 2023, 9:23 a.m. UTC | #1
> If a task completion notification (TCN) is received when there is no
> outstanding task, the cqhci driver issues a "spurious TCN" warning. This was
> observed to happen right after CQE error recovery.
> 
> When an error interrupt is received the driver runs recovery logic.
> It halts the controller, clears all pending tasks, and then re-enables it. On
> some platforms, like Intel Jasper Lake, a stale task completion event was
> observed, regardless of the CQHCI_CLEAR_ALL_TASKS bit being set.
> 
> This results in either:
> a) Spurious TC completion event for an empty slot.
> b) Corrupted data being passed up the stack, as a result of premature
>    completion for a newly added task.
> 
> Rather than add a quirk for affected controllers, ensure tasks are cleared by
> toggling CQHCI_ENABLE, which would happen anyway if
> cqhci_clear_all_tasks() timed out. This is simpler and should be safe and
> effective for all controllers.
> 
> Fixes: a4080225f51d ("mmc: cqhci: support for command queue enabled
> host")
> Cc: stable@vger.kernel.org
> Reported-by: Kornel Dulęba <korneld@chromium.org>
> Tested-by: Kornel Dulęba <korneld@chromium.org>
> Co-developed-by: Kornel Dulęba <korneld@chromium.org>
> Signed-off-by: Kornel Dulęba <korneld@chromium.org>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
diff mbox series

Patch

diff --git a/drivers/mmc/host/cqhci-core.c b/drivers/mmc/host/cqhci-core.c
index 948799a0980c..41e94cd14109 100644
--- a/drivers/mmc/host/cqhci-core.c
+++ b/drivers/mmc/host/cqhci-core.c
@@ -1075,28 +1075,28 @@  static void cqhci_recovery_finish(struct mmc_host *mmc)
 
 	ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
 
-	if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
-		ok = false;
-
 	/*
 	 * The specification contradicts itself, by saying that tasks cannot be
 	 * cleared if CQHCI does not halt, but if CQHCI does not halt, it should
 	 * be disabled/re-enabled, but not to disable before clearing tasks.
 	 * Have a go anyway.
 	 */
-	if (!ok) {
-		pr_debug("%s: cqhci: disable / re-enable\n", mmc_hostname(mmc));
-		cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
-		cqcfg &= ~CQHCI_ENABLE;
-		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
-		cqcfg |= CQHCI_ENABLE;
-		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
-		/* Be sure that there are no tasks */
-		ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
-		if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
-			ok = false;
-		WARN_ON(!ok);
-	}
+	if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+		ok = false;
+
+	/* Disable to make sure tasks really are cleared */
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+	cqcfg &= ~CQHCI_ENABLE;
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+	cqcfg |= CQHCI_ENABLE;
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+
+	if (!ok)
+		cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT);
 
 	cqhci_recover_mrqs(cq_host);