Message ID | 20211208022210.1300773-1-dmitry.baryshkov@linaro.org |
---|---|
Series | clk: qcom: fix disp_cc_mdss_mdp_clk_src issues on sdm845 |
Quoting Dmitry Baryshkov (2021-12-07 18:22:09)
> Some of RCG2 clocks can become stuck during the boot process, when
> device drivers are enabling and disabling the RCG2's parent clocks.
> To prevernt such outcome of driver probe sequences, add API to park

s/prevernt/prevent/

> clocks to the safe clock source (typically TCXO).
>
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>

I'd prefer this approach vs. adding a new clk flag. The clk framework
doesn't handle handoff properly today so we shouldn't try to bandage
that in the core.

> diff --git a/drivers/clk/qcom/clk-rcg2.c b/drivers/clk/qcom/clk-rcg2.c
> index e1b1b426fae4..230b04a7427c 100644
> --- a/drivers/clk/qcom/clk-rcg2.c
> +++ b/drivers/clk/qcom/clk-rcg2.c
> @@ -1036,6 +1036,40 @@ static void clk_rcg2_shared_disable(struct clk_hw *hw)
>  	regmap_write(rcg->clkr.regmap, rcg->cmd_rcgr + CFG_REG, cfg);
>  }
>
> +int clk_rcg2_park_safely(struct regmap *regmap, u32 offset, unsigned int safe_src)

Please add kernel doc as it's an exported symbol.

> +{
> +	unsigned int val, ret, count;
> +
> +	ret = regmap_read(regmap, offset + CFG_REG, &val);
> +	if (ret)
> +		return ret;
> +
> +	/* assume safe source is 0 */

Are we assuming safe source is 0 here? It looks like we pass it in now?

> +	if ((val & CFG_SRC_SEL_MASK) == (safe_src << CFG_SRC_SEL_SHIFT))
> +		return 0;
> +
> +	regmap_write(regmap, offset + CFG_REG, safe_src << CFG_SRC_SEL_SHIFT);
> +
> +	ret = regmap_update_bits(regmap, offset + CMD_REG,
> +				 CMD_UPDATE, CMD_UPDATE);
> +	if (ret)
> +		return ret;
> +
> +	/* Wait for update to take effect */
> +	for (count = 500; count > 0; count--) {
> +		ret = regmap_read(regmap, offset + CMD_REG, &val);
> +		if (ret)
> +			return ret;
> +		if (!(val & CMD_UPDATE))
> +			return 0;
> +		udelay(1);
> +	}
> +
> +	WARN(1, "the rcg didn't update its configuration.");

Add a newline?

> +	return -EBUSY;
> +}
> +EXPORT_SYMBOL_GPL(clk_rcg2_park_safely);
> +
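
Editorial sketch (not part of the thread): the park sequence in the quoted patch can be exercised outside the kernel against a fake register block. Everything below is a userspace model under stated assumptions: `struct fake_rcg`, `fake_read()`, `fake_write()` and `park_safely()` are hypothetical names, the register offsets and masks are assumed values, and the real helper goes through regmap and calls `udelay(1)` between polls. The comment block shows roughly what a kernel-doc header for such a helper could look like.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register layout for this model only. */
#define CMD_REG			0x0
#define CMD_UPDATE		(1u << 0)
#define CFG_REG			0x4
#define CFG_SRC_SEL_SHIFT	8
#define CFG_SRC_SEL_MASK	(0x7u << CFG_SRC_SEL_SHIFT)

/* Hypothetical stand-in for the regmap: two registers plus a latency knob. */
struct fake_rcg {
	uint32_t cmd;
	uint32_t cfg;
	int polls_until_done;	/* CMD reads before CMD_UPDATE clears; < 0 means never */
};

static uint32_t fake_read(struct fake_rcg *rcg, uint32_t reg)
{
	if (reg == CFG_REG)
		return rcg->cfg;
	/* Model the hardware clearing CMD_UPDATE once the switch completes. */
	if ((rcg->cmd & CMD_UPDATE) && rcg->polls_until_done >= 0) {
		if (rcg->polls_until_done == 0)
			rcg->cmd &= ~CMD_UPDATE;
		else
			rcg->polls_until_done--;
	}
	return rcg->cmd;
}

static void fake_write(struct fake_rcg *rcg, uint32_t reg, uint32_t val)
{
	if (reg == CFG_REG)
		rcg->cfg = val;
	else
		rcg->cmd = val;
}

/**
 * park_safely() - switch a (fake) RCG to its safe source
 * @rcg: register model to operate on
 * @safe_src: source selector of the safe parent (typically TCXO)
 *
 * Return: 0 if the RCG already uses @safe_src or switched to it,
 * -EBUSY if CMD_UPDATE did not clear within the polling budget.
 */
static int park_safely(struct fake_rcg *rcg, unsigned int safe_src)
{
	int count;

	assert(rcg != NULL);

	/* Nothing to do if the safe source is already selected. */
	if ((fake_read(rcg, CFG_REG) & CFG_SRC_SEL_MASK) ==
	    (safe_src << CFG_SRC_SEL_SHIFT))
		return 0;

	/* Select the safe source, then latch the new configuration. */
	fake_write(rcg, CFG_REG, safe_src << CFG_SRC_SEL_SHIFT);
	fake_write(rcg, CMD_REG, fake_read(rcg, CMD_REG) | CMD_UPDATE);

	/* Wait for the update to take effect (the patch delays 1us per poll). */
	for (count = 500; count > 0; count--) {
		if (!(fake_read(rcg, CMD_REG) & CMD_UPDATE))
			return 0;
	}

	return -EBUSY;
}
```

In this model a caller would set `polls_until_done` to a small non-negative value to see the successful path, or to a negative value to see the `-EBUSY` timeout that the patch reports with `WARN()`.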
On Thu 09 Dec 00:37 PST 2021, Stephen Boyd wrote:

> Quoting Dmitry Baryshkov (2021-12-07 18:22:09)
> > Some of RCG2 clocks can become stuck during the boot process, when
> > device drivers are enabling and disabling the RCG2's parent clocks.
> > To prevernt such outcome of driver probe sequences, add API to park
>
> s/prevernt/prevent/
>
> > clocks to the safe clock source (typically TCXO).
> >
> > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>
> I'd prefer this approach vs. adding a new clk flag. The clk framework
> doesn't handle handoff properly today so we shouldn't try to bandage
> that in the core.
>

I'm not against putting this responsibility in the drivers, but I don't
think we can blindly park all the RCGs that may or may not be used.

Note that we should do this for all RCGs that downstream are marked as
enable_safe_config (upstream should be using clk_rcg2_shared_ops)
and disabling some of those probe time won't be appreciated by the
hardware.


If you don't like the flag passed to clk_disable_unused (which is like a
very reasonable objection to have), we need to make progress towards a
proper solution that replaces clk_disable_unused().

> > diff --git a/drivers/clk/qcom/clk-rcg2.c b/drivers/clk/qcom/clk-rcg2.c
> > index e1b1b426fae4..230b04a7427c 100644
> > --- a/drivers/clk/qcom/clk-rcg2.c
> > +++ b/drivers/clk/qcom/clk-rcg2.c
> > @@ -1036,6 +1036,40 @@ static void clk_rcg2_shared_disable(struct clk_hw *hw)
> >  	regmap_write(rcg->clkr.regmap, rcg->cmd_rcgr + CFG_REG, cfg);
> >  }
> >
> > +int clk_rcg2_park_safely(struct regmap *regmap, u32 offset, unsigned int safe_src)

This seems to just duplicate clk_rcg2_shared_disable()?

Regards,
Bjorn

>
> Please add kernel doc as it's an exported symbol.
>
> > +{
> > +	unsigned int val, ret, count;
> > +
> > +	ret = regmap_read(regmap, offset + CFG_REG, &val);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* assume safe source is 0 */
>
> Are we assuming safe source is 0 here? It looks like we pass it in now?
>
> > +	if ((val & CFG_SRC_SEL_MASK) == (safe_src << CFG_SRC_SEL_SHIFT))
> > +		return 0;
> > +
> > +	regmap_write(regmap, offset + CFG_REG, safe_src << CFG_SRC_SEL_SHIFT);
> > +
> > +	ret = regmap_update_bits(regmap, offset + CMD_REG,
> > +				 CMD_UPDATE, CMD_UPDATE);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Wait for update to take effect */
> > +	for (count = 500; count > 0; count--) {
> > +		ret = regmap_read(regmap, offset + CMD_REG, &val);
> > +		if (ret)
> > +			return ret;
> > +		if (!(val & CMD_UPDATE))
> > +			return 0;
> > +		udelay(1);
> > +	}
> > +
> > +	WARN(1, "the rcg didn't update its configuration.");
>
> Add a newline?
>
> > +	return -EBUSY;
> > +}
> > +EXPORT_SYMBOL_GPL(clk_rcg2_park_safely);
> > +
On 09/12/2021 21:36, Bjorn Andersson wrote:
> On Thu 09 Dec 00:37 PST 2021, Stephen Boyd wrote:
>
>> Quoting Dmitry Baryshkov (2021-12-07 18:22:09)
>>> Some of RCG2 clocks can become stuck during the boot process, when
>>> device drivers are enabling and disabling the RCG2's parent clocks.
>>> To prevernt such outcome of driver probe sequences, add API to park
>>
>> s/prevernt/prevent/
>>
>>> clocks to the safe clock source (typically TCXO).
>>>
>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>
>> I'd prefer this approach vs. adding a new clk flag. The clk framework
>> doesn't handle handoff properly today so we shouldn't try to bandage
>> that in the core.
>>
>
> I'm not against putting this responsibility in the drivers, but I don't
> think we can blindly park all the RCGs that may or may not be used.
>
> Note that we should do this for all RCGs that downstream are marked as
> enable_safe_config (upstream should be using clk_rcg2_shared_ops)
> and disabling some of those probe time won't be appreciated by the
> hardware.

Only for the hardware as crazy, as displays. And maybe gmu_clk_src. I don't
think we expect venus or camcc to be really clocking when kernel boots.

>
>
> If you don't like the flag passed to clk_disable_unused (which is like a
> very reasonable objection to have), we need to make progress towards a
> proper solution that replaces clk_disable_unused().

The issue is that at the time of clk_disable_unused() it can be too late,
for example because msm being built-in into the kernel has already tried to
play with PLLs/GDSCs and thus made RCG stuck. This is what I was observing
on RB3 if the msm driver is built in and the splash screen is enabled.

>
>>> diff --git a/drivers/clk/qcom/clk-rcg2.c b/drivers/clk/qcom/clk-rcg2.c
>>> index e1b1b426fae4..230b04a7427c 100644
>>> --- a/drivers/clk/qcom/clk-rcg2.c
>>> +++ b/drivers/clk/qcom/clk-rcg2.c
>>> @@ -1036,6 +1036,40 @@ static void clk_rcg2_shared_disable(struct clk_hw *hw)
>>>  	regmap_write(rcg->clkr.regmap, rcg->cmd_rcgr + CFG_REG, cfg);
>>>  }
>>>
>>> +int clk_rcg2_park_safely(struct regmap *regmap, u32 offset, unsigned int safe_src)
>
> This seems to just duplicate clk_rcg2_shared_disable()?

A light version of it. It does not do force_on/_off. And also it can not
rely on clkr->regmap or clock name being set. Initially I used
clk_rcg2_shared_disable + several patches to stop it from crashing if it is
used on the non-registered clock. Then I just decided to write special
helper.

>
> Regards,
> Bjorn
>
>>
>> Please add kernel doc as it's an exported symbol.

Ack

>>
>>> +{
>>> +	unsigned int val, ret, count;
>>> +
>>> +	ret = regmap_read(regmap, offset + CFG_REG, &val);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	/* assume safe source is 0 */
>>
>> Are we assuming safe source is 0 here? It looks like we pass it in now?

Leftover, will remove if/when posting v2.

>>
>>> +	if ((val & CFG_SRC_SEL_MASK) == (safe_src << CFG_SRC_SEL_SHIFT))
>>> +		return 0;
>>> +
>>> +	regmap_write(regmap, offset + CFG_REG, safe_src << CFG_SRC_SEL_SHIFT);
>>> +
>>> +	ret = regmap_update_bits(regmap, offset + CMD_REG,
>>> +				 CMD_UPDATE, CMD_UPDATE);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	/* Wait for update to take effect */
>>> +	for (count = 500; count > 0; count--) {
>>> +		ret = regmap_read(regmap, offset + CMD_REG, &val);
>>> +		if (ret)
>>> +			return ret;
>>> +		if (!(val & CMD_UPDATE))
>>> +			return 0;
>>> +		udelay(1);
>>> +	}
>>> +
>>> +	WARN(1, "the rcg didn't update its configuration.");
>>
>> Add a newline?

Ack.

>>
>>> +	return -EBUSY;
>>> +}
>>> +EXPORT_SYMBOL_GPL(clk_rcg2_park_safely);
>>> +
On Wed 15 Dec 13:14 PST 2021, Dmitry Baryshkov wrote:

> On 09/12/2021 21:36, Bjorn Andersson wrote:
> > On Thu 09 Dec 00:37 PST 2021, Stephen Boyd wrote:
> >
> > > Quoting Dmitry Baryshkov (2021-12-07 18:22:09)
> > > > Some of RCG2 clocks can become stuck during the boot process, when
> > > > device drivers are enabling and disabling the RCG2's parent clocks.
> > > > To prevernt such outcome of driver probe sequences, add API to park
> > >
> > > s/prevernt/prevent/
> > >
> > > > clocks to the safe clock source (typically TCXO).
> > > >
> > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > >
> > > I'd prefer this approach vs. adding a new clk flag. The clk framework
> > > doesn't handle handoff properly today so we shouldn't try to bandage
> > > that in the core.
> > >
> >
> > I'm not against putting this responsibility in the drivers, but I don't
> > think we can blindly park all the RCGs that may or may not be used.
> >
> > Note that we should do this for all RCGs that downstream are marked as
> > enable_safe_config (upstream should be using clk_rcg2_shared_ops)
> > and disabling some of those probe time won't be appreciated by the
> > hardware.
>
> Only for the hardware as crazy, as displays. And maybe gmu_clk_src. I don't
> think we expect venus or camcc to be really clocking when kernel boots.
>

SM8350 GCC has 44 clocks marked as such downstream.

> >
> >
> > If you don't like the flag passed to clk_disable_unused (which is like a
> > very reasonable objection to have), we need to make progress towards a
> > proper solution that replaces clk_disable_unused().
>
> The issue is that at the time of clk_disable_unused() it can be too late,
> for example because msm being built-in into the kernel has already tried to
> play with PLLs/GDSCs and thus made RCG stuck.

Makes sense, so this logic will have to consider both the hardware state
(or make assumptions thereof) and the clock votes in the kernel.

> This is what I was observing
> on RB3 if the msm driver is built in and the splash screen is enabled.
>

Which clock was this?

We should be able to assume that the bootloader will hand us a clock tree
that's functionally configured, so reparenting etc of the RCGs should not
cause issues because the old parent will be ticking and we will explicitly
start the new parent.

One case that might not be handled though is the externally sourced
clocks, where if you reconfigure the DSI phy without first parking the RCG
you might lock up the RCG.

So I think that whenever we mess with those clocks we need to make sure
that the downstream RCGs are not ticking off them.

> >
> > > > diff --git a/drivers/clk/qcom/clk-rcg2.c b/drivers/clk/qcom/clk-rcg2.c
> > > > index e1b1b426fae4..230b04a7427c 100644
> > > > --- a/drivers/clk/qcom/clk-rcg2.c
> > > > +++ b/drivers/clk/qcom/clk-rcg2.c
> > > > @@ -1036,6 +1036,40 @@ static void clk_rcg2_shared_disable(struct clk_hw *hw)
> > > >  	regmap_write(rcg->clkr.regmap, rcg->cmd_rcgr + CFG_REG, cfg);
> > > >  }
> > > >
> > > > +int clk_rcg2_park_safely(struct regmap *regmap, u32 offset, unsigned int safe_src)
> >
> > This seems to just duplicate clk_rcg2_shared_disable()?
>
> A light version of it. It does not do force_on/_off. And also it can not
> rely on clkr->regmap or clock name being set. Initially I used
> clk_rcg2_shared_disable + several patches to stop it from crashing if it is
> used on the non-registered clock. Then I just decided to write special
> helper.
>

Okay, makes sense then. But I don't think we want to shoot down clocks at
clock probe time.

Regards,
Bjorn

> >
> > Regards,
> > Bjorn
> >
> > >
> > > Please add kernel doc as it's an exported symbol.
>
> Ack
>
> > >
> > > > +{
> > > > +	unsigned int val, ret, count;
> > > > +
> > > > +	ret = regmap_read(regmap, offset + CFG_REG, &val);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	/* assume safe source is 0 */
> > >
> > > Are we assuming safe source is 0 here? It looks like we pass it in now?
>
> Leftover, will remove if/when posting v2.
>
> > >
> > > > +	if ((val & CFG_SRC_SEL_MASK) == (safe_src << CFG_SRC_SEL_SHIFT))
> > > > +		return 0;
> > > > +
> > > > +	regmap_write(regmap, offset + CFG_REG, safe_src << CFG_SRC_SEL_SHIFT);
> > > > +
> > > > +	ret = regmap_update_bits(regmap, offset + CMD_REG,
> > > > +				 CMD_UPDATE, CMD_UPDATE);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	/* Wait for update to take effect */
> > > > +	for (count = 500; count > 0; count--) {
> > > > +		ret = regmap_read(regmap, offset + CMD_REG, &val);
> > > > +		if (ret)
> > > > +			return ret;
> > > > +		if (!(val & CMD_UPDATE))
> > > > +			return 0;
> > > > +		udelay(1);
> > > > +	}
> > > > +
> > > > +	WARN(1, "the rcg didn't update its configuration.");
> > >
> > > Add a newline?
>
> Ack.
>
> > >
> > > > +	return -EBUSY;
> > > > +}
> > > > +EXPORT_SYMBOL_GPL(clk_rcg2_park_safely);
> > > > +
>
> --
> With best wishes
> Dmitry
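
Editorial sketch (not part of the thread): Bjorn's point about externally sourced clocks is an ordering constraint, and a toy model may make it easier to see. The code below is not driver code. All names (`toy_tree`, `toy_rcg_set_src()`, and so on) are hypothetical, and the rule that a source switch only completes while both the old and the new source are ticking is an assumption of the model, taken from the remark above that reparenting works when "the old parent will be ticking and we will explicitly start the new parent".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clock sources for the toy model. */
enum toy_src { TOY_SRC_TCXO, TOY_SRC_DSI_PHY, TOY_SRC_COUNT };

struct toy_tree {
	bool running[TOY_SRC_COUNT];	/* is each source ticking? */
	enum toy_src rcg_src;		/* source the RCG currently selects */
	bool rcg_stuck;			/* set once a switch could not complete */
};

/* Model assumption: a switch needs both old and new source to be ticking. */
static bool toy_rcg_set_src(struct toy_tree *t, enum toy_src new_src)
{
	assert(t != NULL);

	if (t->rcg_stuck || !t->running[t->rcg_src] || !t->running[new_src]) {
		t->rcg_stuck = true;
		return false;
	}
	t->rcg_src = new_src;
	return true;
}

/* Model assumption: the PHY output stops while it is being reconfigured. */
static void toy_phy_reconfigure(struct toy_tree *t, bool in_progress)
{
	t->running[TOY_SRC_DSI_PHY] = !in_progress;
}

/* Park on TCXO first, touch the PHY, then switch back. */
static bool toy_reconfigure_parked(struct toy_tree *t)
{
	if (!toy_rcg_set_src(t, TOY_SRC_TCXO))
		return false;
	toy_phy_reconfigure(t, true);
	toy_phy_reconfigure(t, false);
	return toy_rcg_set_src(t, TOY_SRC_DSI_PHY);
}

/* Touch the PHY while the RCG still ticks off it, then try to move away. */
static bool toy_reconfigure_unparked(struct toy_tree *t)
{
	bool ok;

	toy_phy_reconfigure(t, true);
	ok = toy_rcg_set_src(t, TOY_SRC_TCXO);	/* old source has stopped */
	toy_phy_reconfigure(t, false);
	return ok;
}
```

Starting from a tree where both sources run and the RCG selects the PHY, the parked order completes in this model, while the unparked order leaves `rcg_stuck` set. Real hardware behaviour depends on the specific RCG and PHY; the model only illustrates why the thread wants the RCG moved off a parent before that parent is disturbed.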