diff mbox series

mmc: renesas_sdhi: Add margins for main clock and hs clock

Message ID 20220913161506.1817432-1-biju.das.jz@bp.renesas.com
State New
Headers show
Series mmc: renesas_sdhi: Add margins for main clock and hs clock | expand

Commit Message

Biju Das Sept. 13, 2022, 4:15 p.m. UTC
The SDHI high speed clock is 4 times that of the main clock. Currently,
there is no margin added for setting the rate associated with these
clocks. On RZ/G2L platforms, the lack of these margins leads to the
selection of a clock source with a lower clock rate compared to a higher
one.

RZ/G2L example case:
SD0 hs clock is 533333333 Hz and SD0 main clock is 133333333 Hz.
When we set the rate for the main clock 133333333, the request rate for
the parent becomes 533333332 (133333333 * 4) and the SD0 hs clock is
set as 400000000 Hz.

This patch adds a margin of (1/1024) higher hs clock and main clock rate.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
This patch is tested on RZ/G2{L,UL} platforms. It will be good to test
this patch on RCar Gen3/Gen4 platforms as I don't have the hardware.

Logs:-

Before the change:
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=4266666656
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=2133333328
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=1066666664
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=533333332
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=400000000
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=400000000

After the patch:
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=4270833320
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=2135416660
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=1067708330
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=533854165
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=533333333
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=533333333
 ####rzg2l_cpg_sd_clk_mux_set_parent####### index=0
 ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=533854164
---
 drivers/mmc/host/renesas_sdhi_core.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Biju Das Sept. 13, 2022, 5:31 p.m. UTC | #1
Hi Geert,

Thanks for the feedback.

> Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main clock and hs
> clock
> 
> Hi Biju,
> 
> On Tue, Sep 13, 2022 at 5:15 PM Biju Das <biju.das.jz@bp.renesas.com>
> wrote:
> > The SDHI high speed clock is 4 times that of the main clock.
> > Currently, there is no margin added for setting the rate associated
> > with these clocks. On RZ/G2L platforms, the lack of these margins
> > leads to the selection of a clock source with a lower clock rate
> > compared to a higher one.
> >
> > RZ/G2L example case:
> > SD0 hs clock is 533333333 Hz and SD0 main clock is 133333333 Hz.
> > When we set the rate for the main clock 133333333, the request rate
> > for the parent becomes 533333332 (133333333 * 4) and the SD0 hs clock
> > is set as 400000000 Hz.
> >
> > This patch adds a margin of (1/1024) higher hs clock and main clock
> rate.
> >
> > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> 
> Thanks for your patch!
> 
> > --- a/drivers/mmc/host/renesas_sdhi_core.c
> > +++ b/drivers/mmc/host/renesas_sdhi_core.c
> > @@ -147,6 +147,7 @@ static unsigned int renesas_sdhi_clk_update(struct
> tmio_mmc_host *host,
> >         }
> >
> >         new_clock = wanted_clock << clkh_shift;
> > +       new_clock += new_clock >> 10;
> 
> This changes the requested clock rate. Is that really a good idea?
> 
> Isn't it sufficient to change the test "if (freq > (new_clock << i))"
> slightly?

We hardly enter this test, after it request for proper wanted_clk.

freq is clk_round_rate(ref_clk, new_clock << i);  and
it compares 400MHz vs 533.332 MHz.

Basically wanted_clock= 133.33333 MHz and is scaled to power of 2
and then each iteration it scale down to power of 2 compare with
round rate value.

[    4.020781] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=4266666656
[    4.028013] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=2133333328
[    4.035330] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=1066666664
[    4.042639] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req->rate=533333332
   
> 
> >
> >         /*
> >          * We want the bus clock to be as close as possible to, but no
> > @@ -172,6 +173,7 @@ static unsigned int renesas_sdhi_clk_update(struct
> > tmio_mmc_host *host,
> >
> >         clk_set_rate(ref_clk, best_freq);
> >
> > +       best_freq += best_freq >> 10;
> 
> Can you please elaborate why this is needed?
> It looks counter-intuitive to me.

When we try to set 133.333 MHz clk, the determine rate
calculates req->rate for sd clk as 533.332 and 
since available clock source for sd0 clks are 266.6666, 400 and 
533.333 MHz, so it select the clock source as 400MHz.

Cheers,
Biju

> 
> >         if (priv->clkh)
> >                 clk_set_rate(priv->clk, best_freq >> clkh_shift);
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-
> m68k.org
> 
> In personal conversations with technical people, I call myself a hacker.
> But when I'm talking to journalists I just say "programmer" or something
> like that.
>                                 -- Linus Torvalds
Biju Das Sept. 14, 2022, 5:44 a.m. UTC | #2
Hi Geert,

> Subject: RE: [PATCH] mmc: renesas_sdhi: Add margins for main clock and hs
> clock
> 
> Hi Geert,
> 
> Thanks for the feedback.
> 
> > Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main clock and
> > hs clock
> >
> > Hi Biju,
> >
> > On Tue, Sep 13, 2022 at 5:15 PM Biju Das <biju.das.jz@bp.renesas.com>
> > wrote:
> > > The SDHI high speed clock is 4 times that of the main clock.
> > > Currently, there is no margin added for setting the rate associated
> > > with these clocks. On RZ/G2L platforms, the lack of these margins
> > > leads to the selection of a clock source with a lower clock rate
> > > compared to a higher one.
> > >
> > > RZ/G2L example case:
> > > SD0 hs clock is 533333333 Hz and SD0 main clock is 133333333 Hz.
> > > When we set the rate for the main clock 133333333, the request rate
> > > for the parent becomes 533333332 (133333333 * 4) and the SD0 hs
> > > clock is set as 400000000 Hz.
> > >
> > > This patch adds a margin of (1/1024) higher hs clock and main clock
> > rate.
> > >
> > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> >
> > Thanks for your patch!
> >
> > > --- a/drivers/mmc/host/renesas_sdhi_core.c
> > > +++ b/drivers/mmc/host/renesas_sdhi_core.c
> > > @@ -147,6 +147,7 @@ static unsigned int
> > > renesas_sdhi_clk_update(struct
> > tmio_mmc_host *host,
> > >         }
> > >
> > >         new_clock = wanted_clock << clkh_shift;
> > > +       new_clock += new_clock >> 10;
> >
> > This changes the requested clock rate. Is that really a good idea?
> >
> > Isn't it sufficient to change the test "if (freq > (new_clock << i))"
> > slightly?
> 
> We hardly enter this test, after it request for proper wanted_clk.
> 
> freq is clk_round_rate(ref_clk, new_clock << i);  and it compares 400MHz
> vs 533.332 MHz.
> 
> Basically wanted_clock= 133.33333 MHz and is scaled to power of 2 and
> then each iteration it scale down to power of 2 compare with round rate
> value.
> 
> [    4.020781] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> >rate=4266666656
> [    4.028013] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> >rate=2133333328
> [    4.035330] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> >rate=1066666664
> [    4.042639] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> >rate=533333332
> 
> >
> > >
> > >         /*
> > >          * We want the bus clock to be as close as possible to, but
> > > no @@ -172,6 +173,7 @@ static unsigned int
> > > renesas_sdhi_clk_update(struct tmio_mmc_host *host,
> > >
> > >         clk_set_rate(ref_clk, best_freq);
> > >
> > > +       best_freq += best_freq >> 10;
> >
> > Can you please elaborate why this is needed?
> > It looks counter-intuitive to me.
> 
> When we try to set 133.333 MHz clk, the determine rate calculates req-
> >rate for sd clk as 533.332 and since available clock source for sd0 clks
> are 266.6666, 400 and
> 533.333 MHz, so it select the clock source as 400MHz.

Just to add here the main issue is coming from math calculation.

Consider any value A

B = A / 4
C = B * 4

Ideally, we expect A = C, But in the below example
It is not the case (it is A != C).

A = 533333333 (1600/3 MHz)
B = 533333333/4 = 133333333
C = 133333333 * 4 = 533333332

Since A != C we are ending up in this situation.

Cheers,
Biju

> 
> >
> > >         if (priv->clkh)
> > >                 clk_set_rate(priv->clk, best_freq >> clkh_shift);
> >
> > Gr{oetje,eeting}s,
> >
> >                         Geert
> >
> > --
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 --
> > geert@linux- m68k.org
> >
> > In personal conversations with technical people, I call myself a
> hacker.
> > But when I'm talking to journalists I just say "programmer" or
> > something like that.
> >                                 -- Linus Torvalds
Geert Uytterhoeven Sept. 14, 2022, 3:44 p.m. UTC | #3
Hi Biju,

On Wed, Sep 14, 2022 at 6:44 AM Biju Das <biju.das.jz@bp.renesas.com> wrote:
> > Subject: RE: [PATCH] mmc: renesas_sdhi: Add margins for main clock and hs
> > clock
> > > Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main clock and
> > > hs clock
> > > On Tue, Sep 13, 2022 at 5:15 PM Biju Das <biju.das.jz@bp.renesas.com>
> > > wrote:
> > > > The SDHI high speed clock is 4 times that of the main clock.
> > > > Currently, there is no margin added for setting the rate associated
> > > > with these clocks. On RZ/G2L platforms, the lack of these margins
> > > > leads to the selection of a clock source with a lower clock rate
> > > > compared to a higher one.
> > > >
> > > > RZ/G2L example case:
> > > > SD0 hs clock is 533333333 Hz and SD0 main clock is 133333333 Hz.
> > > > When we set the rate for the main clock 133333333, the request rate
> > > > for the parent becomes 533333332 (133333333 * 4) and the SD0 hs
> > > > clock is set as 400000000 Hz.
> > > >
> > > > This patch adds a margin of (1/1024) higher hs clock and main clock
> > > rate.
> > > >
> > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> > >
> > > Thanks for your patch!
> > >
> > > > --- a/drivers/mmc/host/renesas_sdhi_core.c
> > > > +++ b/drivers/mmc/host/renesas_sdhi_core.c
> > > > @@ -147,6 +147,7 @@ static unsigned int
> > > > renesas_sdhi_clk_update(struct
> > > tmio_mmc_host *host,
> > > >         }
> > > >
> > > >         new_clock = wanted_clock << clkh_shift;
> > > > +       new_clock += new_clock >> 10;
> > >
> > > This changes the requested clock rate. Is that really a good idea?
> > >
> > > Isn't it sufficient to change the test "if (freq > (new_clock << i))"
> > > slightly?
> >
> > We hardly enter this test, after it request for proper wanted_clk.
> >
> > freq is clk_round_rate(ref_clk, new_clock << i);  and it compares 400MHz
> > vs 533.332 MHz.
> >
> > Basically wanted_clock= 133.33333 MHz and is scaled to power of 2 and
> > then each iteration it scale down to power of 2 compare with round rate
> > value.
> >
> > [    4.020781] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> > >rate=4266666656
> > [    4.028013] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> > >rate=2133333328
> > [    4.035330] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> > >rate=1066666664
> > [    4.042639] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> > >rate=533333332
> >
> > >
> > > >
> > > >         /*
> > > >          * We want the bus clock to be as close as possible to, but
> > > > no @@ -172,6 +173,7 @@ static unsigned int
> > > > renesas_sdhi_clk_update(struct tmio_mmc_host *host,
> > > >
> > > >         clk_set_rate(ref_clk, best_freq);
> > > >
> > > > +       best_freq += best_freq >> 10;
> > >
> > > Can you please elaborate why this is needed?
> > > It looks counter-intuitive to me.
> >
> > When we try to set 133.333 MHz clk, the determine rate calculates req-
> > >rate for sd clk as 533.332 and since available clock source for sd0 clks
> > are 266.6666, 400 and
> > 533.333 MHz, so it select the clock source as 400MHz.
>
> Just to add here the main issue is coming from math calculation.
>
> Consider any value A
>
> B = A / 4
> C = B * 4
>
> Ideally, we expect A = C, But in the below example
> It is not the case (it is A != C).
>
> A = 533333333 (1600/3 MHz)
> B = 533333333/4 = 133333333
> C = 133333333 * 4 = 533333332
>
> Since A != C we are ending up in this situation.

I am fully aware of that ;-)

However, clk_round_rate() is supposed to return the closest
matching rate.  Hence when passed 533333332, it should return 533333333.
AFAIU, this is then rejected by "if (freq > (new_clock << i))", due to
being slightly too large, causing 400000000 to be picked instead.

Am I missing something?
I'm currently not in the position to test/verify what is actually happening.

I do see clk_mux_determine_rate_flags() has a comment:

        /* find the parent that can provide the fastest rate <= rate */

So perhaps the issue is that it does not return the closest rate, but a
slower rate?

Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Biju Das Sept. 16, 2022, 11:47 a.m. UTC | #4
Hi Geert,

Thanks for the feedback.

> Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main clock and
> hs clock
> 
> Hi Biju,
> 
> On Wed, Sep 14, 2022 at 6:44 AM Biju Das <biju.das.jz@bp.renesas.com>
> wrote:
> > > Subject: RE: [PATCH] mmc: renesas_sdhi: Add margins for main clock
> > > and hs clock
> > > > Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main
> clock
> > > > and hs clock On Tue, Sep 13, 2022 at 5:15 PM Biju Das
> > > > <biju.das.jz@bp.renesas.com>
> > > > wrote:
> > > > > The SDHI high speed clock is 4 times that of the main clock.
> > > > > Currently, there is no margin added for setting the rate
> > > > > associated with these clocks. On RZ/G2L platforms, the lack of
> > > > > these margins leads to the selection of a clock source with a
> > > > > lower clock rate compared to a higher one.
> > > > >
> > > > > RZ/G2L example case:
> > > > > SD0 hs clock is 533333333 Hz and SD0 main clock is 133333333
> Hz.
> > > > > When we set the rate for the main clock 133333333, the request
> > > > > rate for the parent becomes 533333332 (133333333 * 4) and the
> > > > > SD0 hs clock is set as 400000000 Hz.
> > > > >
> > > > > This patch adds a margin of (1/1024) higher hs clock and main
> > > > > clock
> > > > rate.
> > > > >
> > > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> > > >
> > > > Thanks for your patch!
> > > >
> > > > > --- a/drivers/mmc/host/renesas_sdhi_core.c
> > > > > +++ b/drivers/mmc/host/renesas_sdhi_core.c
> > > > > @@ -147,6 +147,7 @@ static unsigned int
> > > > > renesas_sdhi_clk_update(struct
> > > > tmio_mmc_host *host,
> > > > >         }
> > > > >
> > > > >         new_clock = wanted_clock << clkh_shift;
> > > > > +       new_clock += new_clock >> 10;
> > > >
> > > > This changes the requested clock rate. Is that really a good
> idea?
> > > >
> > > > Isn't it sufficient to change the test "if (freq > (new_clock <<
> i))"
> > > > slightly?
> > >
> > > We hardly enter this test, after it request for proper wanted_clk.
> > >
> > > freq is clk_round_rate(ref_clk, new_clock << i);  and it compares
> > > 400MHz vs 533.332 MHz.
> > >
> > > Basically wanted_clock= 133.33333 MHz and is scaled to power of 2
> > > and then each iteration it scale down to power of 2 compare with
> > > round rate value.
> > >
> > > [    4.020781] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> > > >rate=4266666656
> > > [    4.028013] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> > > >rate=2133333328
> > > [    4.035330] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> > > >rate=1066666664
> > > [    4.042639] ####rzg2l_cpg_sd_clk_mux_determine_rate####### req-
> > > >rate=533333332
> > >
> > > >
> > > > >
> > > > >         /*
> > > > >          * We want the bus clock to be as close as possible
> to,
> > > > > but no @@ -172,6 +173,7 @@ static unsigned int
> > > > > renesas_sdhi_clk_update(struct tmio_mmc_host *host,
> > > > >
> > > > >         clk_set_rate(ref_clk, best_freq);
> > > > >
> > > > > +       best_freq += best_freq >> 10;
> > > >
> > > > Can you please elaborate why this is needed?
> > > > It looks counter-intuitive to me.
> > >
> > > When we try to set 133.333 MHz clk, the determine rate calculates
> > > req-
> > > >rate for sd clk as 533.332 and since available clock source for
> sd0
> > > >clks
> > > are 266.6666, 400 and
> > > 533.333 MHz, so it select the clock source as 400MHz.
> >
> > Just to add here the main issue is coming from math calculation.
> >
> > Consider any value A
> >
> > B = A / 4
> > C = B * 4
> >
> > Ideally, we expect A = C, But in the below example It is not the
> case
> > (it is A != C).
> >
> > A = 533333333 (1600/3 MHz)
> > B = 533333333/4 = 133333333
> > C = 133333333 * 4 = 533333332
> >
> > Since A != C we are ending up in this situation.
> 
> I am fully aware of that ;-)
> 
> However, clk_round_rate() is supposed to return the closest matching
> rate.  Hence when passed 533333332, it should return 533333333.

No, it is returning 400000000.as ref_clk->rate=400000000, and new clk rate 533333332

[    4.524188] ###renesas_sdhi_clk_update ++ ### 400000000/533333333/533333332
[    4.531217] ###renesas_sdhi_clk_update -- ### 400000000/400000000/533333332

+               pr_err("###%s ++ ### %llu/%llu/%llu", __func__, clk_get_rate(ref_clk),freq, new_clock << i);
                freq = clk_round_rate(ref_clk, new_clock << i);
+               pr_err("###%s -- ### %llu/%llu/%llu", __func__, clk_get_rate(ref_clk),freq, new_clock << i);

> AFAIU, this is then rejected by "if (freq > (new_clock << i))", due to
> being slightly too large, causing 400000000 to be picked instead.
> 
> Am I missing something?

With the above, it skips the if statement

> I'm currently not in the position to test/verify what is actually
> happening.

OK Will check.

> 
> I do see clk_mux_determine_rate_flags() has a comment:
> 
>         /* find the parent that can provide the fastest rate <= rate
> */
> 
> So perhaps the issue is that it does not return the closest rate, but
> a slower rate?

Yes, Looks it returns a slower rate.

Cheers,
Biju
Biju Das Sept. 16, 2022, 4:04 p.m. UTC | #5
Hi Geert,

> Subject: RE: [PATCH] mmc: renesas_sdhi: Add margins for main clock and
> hs clock
> 
> Hi Geert,
> 
> Thanks for the feedback.
> 
> > Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main clock
> and
> > hs clock
> >
> > Hi Biju,
> >
> > On Wed, Sep 14, 2022 at 6:44 AM Biju Das
> <biju.das.jz@bp.renesas.com>
> > wrote:
> > > > Subject: RE: [PATCH] mmc: renesas_sdhi: Add margins for main
> clock
> > > > and hs clock
> > > > > Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main
> > clock
> > > > > and hs clock On Tue, Sep 13, 2022 at 5:15 PM Biju Das
> > > > > <biju.das.jz@bp.renesas.com>
> > > > > wrote:
> > > > > > The SDHI high speed clock is 4 times that of the main clock.
> > > > > > Currently, there is no margin added for setting the rate
> > > > > > associated with these clocks. On RZ/G2L platforms, the lack
> of
> > > > > > these margins leads to the selection of a clock source with
> a
> > > > > > lower clock rate compared to a higher one.
> > > > > >
> > > > > > RZ/G2L example case:
> > > > > > SD0 hs clock is 533333333 Hz and SD0 main clock is 133333333
> > Hz.
> > > > > > When we set the rate for the main clock 133333333, the
> request
> > > > > > rate for the parent becomes 533333332 (133333333 * 4) and
> the
> > > > > > SD0 hs clock is set as 400000000 Hz.
> > > > > >
> > > > > > This patch adds a margin of (1/1024) higher hs clock and
> main
> > > > > > clock
> > > > > rate.
> > > > > >
> > > > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> > > > >
> > > > > Thanks for your patch!
> > > > >
> > > > > > --- a/drivers/mmc/host/renesas_sdhi_core.c
> > > > > > +++ b/drivers/mmc/host/renesas_sdhi_core.c
> > > > > > @@ -147,6 +147,7 @@ static unsigned int
> > > > > > renesas_sdhi_clk_update(struct
> > > > > tmio_mmc_host *host,
> > > > > >         }
> > > > > >
> > > > > >         new_clock = wanted_clock << clkh_shift;
> > > > > > +       new_clock += new_clock >> 10;
> > > > >
> > > > > This changes the requested clock rate. Is that really a good
> > idea?
> > > > >
> > > > > Isn't it sufficient to change the test "if (freq > (new_clock
> <<
> > i))"
> > > > > slightly?
> > > >
> > > > We hardly enter this test, after it request for proper
> wanted_clk.
> > > >
> > > > freq is clk_round_rate(ref_clk, new_clock << i);  and it
> compares
> > > > 400MHz vs 533.332 MHz.
> > > >
> > > > Basically wanted_clock= 133.33333 MHz and is scaled to power of
> 2
> > > > and then each iteration it scale down to power of 2 compare with
> > > > round rate value.
> > > >
> > > > [    4.020781] ####rzg2l_cpg_sd_clk_mux_determine_rate#######
> req-
> > > > >rate=4266666656
> > > > [    4.028013] ####rzg2l_cpg_sd_clk_mux_determine_rate#######
> req-
> > > > >rate=2133333328
> > > > [    4.035330] ####rzg2l_cpg_sd_clk_mux_determine_rate#######
> req-
> > > > >rate=1066666664
> > > > [    4.042639] ####rzg2l_cpg_sd_clk_mux_determine_rate#######
> req-
> > > > >rate=533333332
> > > >
> > > > >
> > > > > >
> > > > > >         /*
> > > > > >          * We want the bus clock to be as close as possible
> > to,
> > > > > > but no @@ -172,6 +173,7 @@ static unsigned int
> > > > > > renesas_sdhi_clk_update(struct tmio_mmc_host *host,
> > > > > >
> > > > > >         clk_set_rate(ref_clk, best_freq);
> > > > > >
> > > > > > +       best_freq += best_freq >> 10;
> > > > >
> > > > > Can you please elaborate why this is needed?
> > > > > It looks counter-intuitive to me.
> > > >
> > > > When we try to set 133.333 MHz clk, the determine rate
> calculates
> > > > req-
> > > > >rate for sd clk as 533.332 and since available clock source for
> > sd0
> > > > >clks
> > > > are 266.6666, 400 and
> > > > 533.333 MHz, so it select the clock source as 400MHz.
> > >
> > > Just to add here the main issue is coming from math calculation.
> > >
> > > Consider any value A
> > >
> > > B = A / 4
> > > C = B * 4
> > >
> > > Ideally, we expect A = C, But in the below example It is not the
> > case
> > > (it is A != C).
> > >
> > > A = 533333333 (1600/3 MHz)
> > > B = 533333333/4 = 133333333
> > > C = 133333333 * 4 = 533333332
> > >
> > > Since A != C we are ending up in this situation.
> >
> > I am fully aware of that ;-)
> >
> > However, clk_round_rate() is supposed to return the closest matching
> > rate.  Hence when passed 533333332, it should return 533333333.
> 
> No, it is returning 400000000.as ref_clk->rate=400000000, and new clk
> rate 533333332
> 
> [    4.524188] ###renesas_sdhi_clk_update ++ ###
> 400000000/533333333/533333332
> [    4.531217] ###renesas_sdhi_clk_update -- ###
> 400000000/400000000/533333332
> 
> +               pr_err("###%s ++ ### %llu/%llu/%llu", __func__,
> + clk_get_rate(ref_clk),freq, new_clock << i);
>                 freq = clk_round_rate(ref_clk, new_clock << i);
> +               pr_err("###%s -- ### %llu/%llu/%llu", __func__,
> + clk_get_rate(ref_clk),freq, new_clock << i);
> 
> > AFAIU, this is then rejected by "if (freq > (new_clock << i))", due
> to
> > being slightly too large, causing 400000000 to be picked instead.
> >
> > Am I missing something?
> 
> With the above, it skips the if statement
> 
> > I'm currently not in the position to test/verify what is actually
> > happening.
> 
> OK Will check.

I have patched SDHI driver to allow margin, so that it won't set a lower clock
and added round clock operation in Clock driver for rounding operation.
with that I get proper rates.

Is this solution acceptable or not?

[1]
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 6edbf5c161ab..bb7a736d46a2 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -128,6 +128,7 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
        struct clk *ref_clk = priv->clk;
        unsigned int freq, diff, best_freq = 0, diff_min = ~0;
        unsigned int new_clock, clkh_shift = 0;
+       unsigned int new_clock_margin;
        int i;
 
        /*
@@ -155,8 +156,13 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
         * possible, but no greater than, new_clock << i.
         */
        for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) {
-               freq = clk_round_rate(ref_clk, new_clock << i);
-               if (freq > (new_clock << i)) {
+               if (freq <= (new_clock << i))
+                       freq = clk_round_rate(ref_clk, new_clock << i);
+
+               new_clock_margin = (new_clock << i);
+               new_clock_margin += new_clock_margin >> 10;
+
+               if (freq > new_clock_margin) {
                        /* Too fast; look for a slightly slower option */
                        freq = clk_round_rate(ref_clk, (new_clock << i) / 4 * 3);
                        if (freq > (new_clock << i))

[2]
+static inline bool is_better_rate(unsigned long req, unsigned long best,
+                                 unsigned long new)
+{
+       return (req <= new && new < best) || (best < req && best < new);
+}
+
 static int rzg2l_cpg_sd_clk_mux_determine_rate(struct clk_hw *hw,
                                               struct clk_rate_request *req)
 {
-       return clk_mux_determine_rate_flags(hw, req, 0);
+       unsigned int i;
+       unsigned long actual_rate, best_rate = 0;
+       unsigned long req_rate = req->rate;
+
+       for (i = 0; i < clk_hw_get_num_parents(hw); i++) {
+               struct clk_hw *parent = clk_hw_get_parent_by_index(hw, i);
+               unsigned long parent_rate = clk_hw_get_rate(parent);
+
+               actual_rate = clk_hw_round_rate(parent, req_rate);
+
+               if (is_better_rate(req_rate, best_rate, actual_rate)) {
+                       best_rate = actual_rate;
+                       req->rate = best_rate;
+                       req->best_parent_rate = parent_rate;
+                       req->best_parent_hw = parent;
+               }
+       }
+
+       if (!best_rate)
+               return -EINVAL;
+
+       return 0;
 }
 
 static int rzg2l_cpg_sd_clk_mux_set_parent(struct clk_hw *hw, u8 index)

Cheers,
Biju
Biju Das Sept. 16, 2022, 6:14 p.m. UTC | #6
Hi Geert,

> Subject: RE: [PATCH] mmc: renesas_sdhi: Add margins for main clock and
> hs clock
> 
> Hi Geert,
> 
> > Subject: RE: [PATCH] mmc: renesas_sdhi: Add margins for main clock
> and
> > hs clock
> >
> > Hi Geert,
> >
> > Thanks for the feedback.
> >
> > > Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main clock
> > and
> > > hs clock
> > >
> > > Hi Biju,
> > >
> > > On Wed, Sep 14, 2022 at 6:44 AM Biju Das
> > <biju.das.jz@bp.renesas.com>
> > > wrote:
> > > > > Subject: RE: [PATCH] mmc: renesas_sdhi: Add margins for main
> > clock
> > > > > and hs clock
> > > > > > Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main
> > > clock
> > > > > > and hs clock On Tue, Sep 13, 2022 at 5:15 PM Biju Das
> > > > > > <biju.das.jz@bp.renesas.com>
> > > > > > wrote:
> > > > > > > The SDHI high speed clock is 4 times that of the main
> clock.
> > > > > > > Currently, there is no margin added for setting the rate
> > > > > > > associated with these clocks. On RZ/G2L platforms, the
> lack
> > of
> > > > > > > these margins leads to the selection of a clock source
> with
> > a
> > > > > > > lower clock rate compared to a higher one.
> > > > > > >
> > > > > > > RZ/G2L example case:
> > > > > > > SD0 hs clock is 533333333 Hz and SD0 main clock is
> 133333333
> > > Hz.
> > > > > > > When we set the rate for the main clock 133333333, the
> > request
> > > > > > > rate for the parent becomes 533333332 (133333333 * 4) and
> > the
> > > > > > > SD0 hs clock is set as 400000000 Hz.
> > > > > > >
> > > > > > > This patch adds a margin of (1/1024) higher hs clock and
> > main
> > > > > > > clock
> > > > > > rate.
> > > > > > >
> > > > > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
> > > > > >
> > > > > > Thanks for your patch!
> > > > > >
> > > > > > > --- a/drivers/mmc/host/renesas_sdhi_core.c
> > > > > > > +++ b/drivers/mmc/host/renesas_sdhi_core.c
> > > > > > > @@ -147,6 +147,7 @@ static unsigned int
> > > > > > > renesas_sdhi_clk_update(struct
> > > > > > tmio_mmc_host *host,
> > > > > > >         }
> > > > > > >
> > > > > > >         new_clock = wanted_clock << clkh_shift;
> > > > > > > +       new_clock += new_clock >> 10;
> > > > > >
> > > > > > This changes the requested clock rate. Is that really a good
> > > idea?
> > > > > >
> > > > > > Isn't it sufficient to change the test "if (freq >
> (new_clock
> > <<
> > > i))"
> > > > > > slightly?
> > > > >
> > > > > We hardly enter this test, after it request for proper
> > wanted_clk.
> > > > >
> > > > > freq is clk_round_rate(ref_clk, new_clock << i);  and it
> > compares
> > > > > 400MHz vs 533.332 MHz.
> > > > >
> > > > > Basically wanted_clock= 133.33333 MHz and is scaled to power
> of
> > 2
> > > > > and then each iteration it scale down to power of 2 compare
> with
> > > > > round rate value.
> > > > >
> > > > > [    4.020781] ####rzg2l_cpg_sd_clk_mux_determine_rate#######
> > req-
> > > > > >rate=4266666656
> > > > > [    4.028013] ####rzg2l_cpg_sd_clk_mux_determine_rate#######
> > req-
> > > > > >rate=2133333328
> > > > > [    4.035330] ####rzg2l_cpg_sd_clk_mux_determine_rate#######
> > req-
> > > > > >rate=1066666664
> > > > > [    4.042639] ####rzg2l_cpg_sd_clk_mux_determine_rate#######
> > req-
> > > > > >rate=533333332
> > > > >
> > > > > >
> > > > > > >
> > > > > > >         /*
> > > > > > >          * We want the bus clock to be as close as
> possible
> > > to,
> > > > > > > but no @@ -172,6 +173,7 @@ static unsigned int
> > > > > > > renesas_sdhi_clk_update(struct tmio_mmc_host *host,
> > > > > > >
> > > > > > >         clk_set_rate(ref_clk, best_freq);
> > > > > > >
> > > > > > > +       best_freq += best_freq >> 10;
> > > > > >
> > > > > > Can you please elaborate why this is needed?
> > > > > > It looks counter-intuitive to me.
> > > > >
> > > > > When we try to set 133.333 MHz clk, the determine rate
> > calculates
> > > > > req-
> > > > > >rate for sd clk as 533.332 and since available clock source
> for
> > > sd0
> > > > > >clks
> > > > > are 266.6666, 400 and
> > > > > 533.333 MHz, so it select the clock source as 400MHz.
> > > >
> > > > Just to add here the main issue is coming from math calculation.
> > > >
> > > > Consider any value A
> > > >
> > > > B = A / 4
> > > > C = B * 4
> > > >
> > > > Ideally, we expect A = C, But in the below example It is not the
> > > case
> > > > (it is A != C).
> > > >
> > > > A = 533333333 (1600/3 MHz)
> > > > B = 533333333/4 = 133333333
> > > > C = 133333333 * 4 = 533333332
> > > >
> > > > Since A != C we are ending up in this situation.
> > >
> > > I am fully aware of that ;-)
> > >
> > > However, clk_round_rate() is supposed to return the closest
> matching
> > > rate.  Hence when passed 533333332, it should return 533333333.
> >
> > No, it is returning 400000000.as ref_clk->rate=400000000, and new
> clk
> > rate 533333332
> >
> > [    4.524188] ###renesas_sdhi_clk_update ++ ###
> > 400000000/533333333/533333332
> > [    4.531217] ###renesas_sdhi_clk_update -- ###
> > 400000000/400000000/533333332
> >
> > +               pr_err("###%s ++ ### %llu/%llu/%llu", __func__,
> > + clk_get_rate(ref_clk),freq, new_clock << i);
> >                 freq = clk_round_rate(ref_clk, new_clock << i);
> > +               pr_err("###%s -- ### %llu/%llu/%llu", __func__,
> > + clk_get_rate(ref_clk),freq, new_clock << i);
> >
> > > AFAIU, this is then rejected by "if (freq > (new_clock << i))",
> due
> > to
> > > being slightly too large, causing 400000000 to be picked instead.
> > >
> > > Am I missing something?
> >
> > With the above, it skips the if statement
> >
> > > I'm currently not in the position to test/verify what is actually
> > > happening.
> >
> > OK Will check.
> 
> I have patched SDHI driver to allow margin, so that it won't set a
> lower clock and added round clock operation in Clock driver for
> rounding operation.
> with that I get proper rates.
> 
> Is this solution acceptable or not?
> 
> [1]
> diff --git a/drivers/mmc/host/renesas_sdhi_core.c
> b/drivers/mmc/host/renesas_sdhi_core.c
> index 6edbf5c161ab..bb7a736d46a2 100644
> --- a/drivers/mmc/host/renesas_sdhi_core.c
> +++ b/drivers/mmc/host/renesas_sdhi_core.c
> @@ -128,6 +128,7 @@ static unsigned int renesas_sdhi_clk_update(struct
> tmio_mmc_host *host,
>         struct clk *ref_clk = priv->clk;
>         unsigned int freq, diff, best_freq = 0, diff_min = ~0;
>         unsigned int new_clock, clkh_shift = 0;
> +       unsigned int new_clock_margin;
>         int i;
> 
>         /*
> @@ -155,8 +156,13 @@ static unsigned int
> renesas_sdhi_clk_update(struct tmio_mmc_host *host,
>          * possible, but no greater than, new_clock << i.
>          */
>         for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) {
> -               freq = clk_round_rate(ref_clk, new_clock << i);
> -               if (freq > (new_clock << i)) {
> +               if (freq <= (new_clock << i))
> +                       freq = clk_round_rate(ref_clk, new_clock <<
> i);
> +
> +               new_clock_margin = (new_clock << i);
> +               new_clock_margin += new_clock_margin >> 10;
> +
> +               if (freq > new_clock_margin) {
>                         /* Too fast; look for a slightly slower option
> */
>                         freq = clk_round_rate(ref_clk, (new_clock <<
> i) / 4 * 3);
>                         if (freq > (new_clock << i))

Looks like for SDHI, the below clock margin change should be sufficient.
As the round operation in RZ/G2L clk driver changes[2] rounds to 
nearest clk(ie,533.332MHZ to 533.333MHz)

diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 6edbf5c161ab..539521f84e2f 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -128,6 +128,7 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
        struct clk *ref_clk = priv->clk;
        unsigned int freq, diff, best_freq = 0, diff_min = ~0;
        unsigned int new_clock, clkh_shift = 0;
+       unsigned int new_clock_margin;
        int i;
 
        /*
@@ -156,7 +157,9 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
         */
        for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) {
                freq = clk_round_rate(ref_clk, new_clock << i);
-               if (freq > (new_clock << i)) {
+               new_clock_margin = (new_clock << i) + ((new_clock << i) >> 10);
+
+               if (freq > new_clock_margin) {

Cheers,
Biju
Wolfram Sang Sept. 16, 2022, 6:40 p.m. UTC | #7
Hi Biju,

> diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
> index 6edbf5c161ab..539521f84e2f 100644
> --- a/drivers/mmc/host/renesas_sdhi_core.c
> +++ b/drivers/mmc/host/renesas_sdhi_core.c
> @@ -128,6 +128,7 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
>         struct clk *ref_clk = priv->clk;
>         unsigned int freq, diff, best_freq = 0, diff_min = ~0;
>         unsigned int new_clock, clkh_shift = 0;
> +       unsigned int new_clock_margin;
>         int i;
>  
>         /*
> @@ -156,7 +157,9 @@ static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
>          */
>         for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) {
>                 freq = clk_round_rate(ref_clk, new_clock << i);
> -               if (freq > (new_clock << i)) {
> +               new_clock_margin = (new_clock << i) + ((new_clock << i) >> 10);
> +
> +               if (freq > new_clock_margin) {

new_clock_margin needs a comment explaining why we need it and set it to
that particular value. Otherwise looks good to me.

Thanks for doing it,

   Wolfram
Biju Das Sept. 19, 2022, 8:33 a.m. UTC | #8
Hi Wolfram,

Thanks for the feedback.

> Subject: Re: [PATCH] mmc: renesas_sdhi: Add margins for main clock and
> hs clock
> 
> Hi Biju,
> 
> > diff --git a/drivers/mmc/host/renesas_sdhi_core.c
> > b/drivers/mmc/host/renesas_sdhi_core.c
> > index 6edbf5c161ab..539521f84e2f 100644
> > --- a/drivers/mmc/host/renesas_sdhi_core.c
> > +++ b/drivers/mmc/host/renesas_sdhi_core.c
> > @@ -128,6 +128,7 @@ static unsigned int
> renesas_sdhi_clk_update(struct tmio_mmc_host *host,
> >         struct clk *ref_clk = priv->clk;
> >         unsigned int freq, diff, best_freq = 0, diff_min = ~0;
> >         unsigned int new_clock, clkh_shift = 0;
> > +       unsigned int new_clock_margin;
> >         int i;
> >
> >         /*
> > @@ -156,7 +157,9 @@ static unsigned int
> renesas_sdhi_clk_update(struct tmio_mmc_host *host,
> >          */
> >         for (i = min(9, ilog2(UINT_MAX / new_clock)); i >= 0; i--) {
> >                 freq = clk_round_rate(ref_clk, new_clock << i);
> > -               if (freq > (new_clock << i)) {
> > +               new_clock_margin = (new_clock << i) + ((new_clock <<
> > + i) >> 10);
> > +
> > +               if (freq > new_clock_margin) {
> 
> new_clock_margin needs a comment explaining why we need it and set it
> to that particular value. Otherwise looks good to me.

Looks similar margin needs to be added to renesas_sdhi_set_clock()as well
Otherwise, CTL_SD_CARD_CLK_CTL is set to 0 and performance get impacted.

I will send v2 with these changes. Please provide feedback.

Cheers,
Biju
diff mbox series

Patch

diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index 6edbf5c161ab..e41fbfc6ab7d 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -147,6 +147,7 @@  static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
 	}
 
 	new_clock = wanted_clock << clkh_shift;
+	new_clock += new_clock >> 10;
 
 	/*
 	 * We want the bus clock to be as close as possible to, but no
@@ -172,6 +173,7 @@  static unsigned int renesas_sdhi_clk_update(struct tmio_mmc_host *host,
 
 	clk_set_rate(ref_clk, best_freq);
 
+	best_freq += best_freq >> 10;
 	if (priv->clkh)
 		clk_set_rate(priv->clk, best_freq >> clkh_shift);