[1/2] mmc: tmio: Further fixup runtime PM management at remove

Message ID 20200519152434.6867-1-ulf.hansson@linaro.org
State New

Commit Message

Ulf Hansson May 19, 2020, 3:24 p.m. UTC
Before calling tmio_mmc_host_probe(), the caller is required to enable
clocks for its device, so that it is accessible when reading/writing
registers during probe.

Therefore, the responsibility to disable these clocks, in the error path of
->probe() and during ->remove(), is better managed outside
tmio_mmc_host_remove(). As a matter of fact, callers of
tmio_mmc_host_remove() already expect this to be the behaviour.

However, there's a problem with tmio_mmc_host_remove() when the Kconfig
option, CONFIG_PM, is set. More precisely, tmio_mmc_host_remove() may then
disable the clock via runtime PM, which leads to clock enable/disable
imbalance problems, when the caller of tmio_mmc_host_remove() also tries to
disable the same clocks.

To solve the problem, let's make sure tmio_mmc_host_remove() leaves the
device with clocks enabled, but also make sure to disable the IRQs, as we
normally do at ->runtime_suspend().

Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

---
 drivers/mmc/host/tmio_mmc_core.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

-- 
2.20.1

Comments

Ulf Hansson May 20, 2020, 11:35 a.m. UTC | #1
On Tue, 19 May 2020 at 17:24, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> [...]

Applied for next, with a stable tag added.

Kind regards
Uffe


Ulf Hansson May 20, 2020, 3:18 p.m. UTC | #2
On Wed, 20 May 2020 at 16:30, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Ulf,
>
> On Tue, May 19, 2020 at 5:24 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > [...]
>
> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
>
> (on R-Car Gen2, various Gen3, SH-Mobile AG5, R-Mobile A1, R-Mobile APE6,
>  RZ/A1, and RZ/A2)
>
> Gr{oetje,eeting}s,
>
>                         Geert

Thanks, patch amended!

Kind regards
Uffe
Geert Uytterhoeven May 20, 2020, 7:29 p.m. UTC | #3
Hi Wolfram,

On Wed, May 20, 2020 at 8:19 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Wed, May 20, 2020 at 5:49 PM Wolfram Sang
> <wsa+renesas@sang-engineering.com> wrote:
> > On Wed, May 20, 2020 at 04:30:33PM +0200, Geert Uytterhoeven wrote:
> > > On Tue, May 19, 2020 at 5:24 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > > > [...]
> > >
> > > Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > >
> > > (on R-Car Gen2, various Gen3, SH-Mobile AG5, R-Mobile A1, R-Mobile APE6,
> > >  RZ/A1, and RZ/A2)
> >
> > Thanks, Geert! If it is not too much to ask, could you try re-applying
> > commit 7a7dab237027 ("mmc: tmio: remove workaround for NON_REMOVABLE")
> > on top of all these patches and see if your NFS is still stalled?
> >
> > Sidenote: we still need to tackle the problem when SCC hangs because it
> > has no clock. However, I am still interested in whether all the PM updates
> > have an impact on the behaviour you observed here[1].
> >
> > [1] https://patchwork.kernel.org/patch/11149285/
>
> I reverted "[PATCH] WIP: clk: renesas: rcar-gen3: enable SDnH clk for HS
> modes" (which I still had applied in my local tree), and reapplied "mmc:
> tmio: remove workaround for NON_REMOVABLE", but I cannot reproduce the
> issue, with or without the top 3 commits on mmc/next:
> ff5a1a63febb0761 mmc: tmio: Further fixup runtime PM management at remove
> 774c44ceff3c5b3f mmc: tmio: Make sure the PM domain is 'started' while probing
> 4863bb62a87786ec mmc: renesas_sdhi: remove manual clk handling
>
> Let's see if I can bisect where it was fixed...

Commit 9b0d6855e756b60d ("mmc: renesas_sdhi: enforce manual correction
for Gen3") fixed it.  However, there must be other later changes that
have impact, as reverting 9b0d6855e756b60d and reapplying 7a7dab237027
on both mmc/next~3 and mmc/next gives a working system.

Let's call it a day, no more bisecting today...

Gr{oetje,eeting}s,

                        Geert
Wolfram Sang May 20, 2020, 8:32 p.m. UTC | #4
> Commit 9b0d6855e756b60d ("mmc: renesas_sdhi: enforce manual correction
> for Gen3") fixed it.  However, there must be other later changes that
> have impact, as reverting 9b0d6855e756b60d and reapplying 7a7dab237027
> on both mmc/next~3 and mmc/next gives a working system.
> 
> Let's call it a day, no more bisecting today...

Thanks for the work, Geert!

This non-deterministic outcome already convinces me that we should
really first try to reproduce and fix the stalled SCC case before we
remove the workaround again.
Patch

diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index f31afd1c2671..ba301fb7656b 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -1231,12 +1231,14 @@  void tmio_mmc_host_remove(struct tmio_mmc_host *host)
 	cancel_work_sync(&host->done);
 	cancel_delayed_work_sync(&host->delayed_reset_work);
 	tmio_mmc_release_dma(host);
+	tmio_mmc_disable_mmc_irqs(host, TMIO_MASK_ALL);
 
-	pm_runtime_dont_use_autosuspend(&pdev->dev);
 	if (host->native_hotplug)
 		pm_runtime_put_noidle(&pdev->dev);
-	pm_runtime_put_sync(&pdev->dev);
+
 	pm_runtime_disable(&pdev->dev);
+	pm_runtime_dont_use_autosuspend(&pdev->dev);
+	pm_runtime_put_noidle(&pdev->dev);
 }
 EXPORT_SYMBOL_GPL(tmio_mmc_host_remove);