Message ID | 20220509210741.12020-3-angelogioacchino.delregno@collabora.com |
---|---|
State | Accepted |
Commit | 327e93cf9a59b0d04eb3a31a7fdbf0f11cf13ecb |
Series | MediaTek SoC ARM/ARM64 System Timer |
From: Yassine Oudjana <yassine.oudjana@gmail.com>

On Mon, 9 May 2022 23:07:40 +0200, AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> wrote:
> Some MediaTek platforms with a buggy TrustZone ATF firmware will not
> initialize the AArch64 System Timer correctly: in these cases, the
> System Timer address is correctly programmed, as well as the CNTFRQ_EL0
> register (reading 13MHz, as it should be), but the assigned hardware
> timers are never started before (or after) booting Linux.
>
> In this condition, any call to function get_cycles() will be returning
> zero, as CNTVCT_EL0 will always read zero.

I spent a lot of time trying to figure out why the arch timer didn't
work on MT6737T and never got any results. Turns out this is why...

I ended up using the GPT (@ 0x10004000) as a system timer and it
worked fine.

With this patch the arch timer started to work finally. Thanks for
the fix! See below for one comment on this patch.

> One common critical symptom of that is trying to use the udelay()
> function (calling __delay()), which executes the following loop:
>
>             start = get_cycles();
>             while ((get_cycles() - start) < cycles)
>                     cpu_relax();
>
> which, when CNTVCT_EL0 always reads zero, translates to:
>
>             while((0 - 0) < 0)  ==> while(0 < 0)
>
> ... generating an infinite loop, even though zero is never less
> than zero, but always equal to it (this has to be researched,
> but it's out of the scope of this commit).
>
> To fix this issue on the affected MediaTek platforms, the solution
> is to simply start the timers that are designed to be System Timer(s).
> These timers, downstream, are called "CPUXGPT" and there is one
> timer per CPU core; luckily, it is not necessary to set a start bit
> on each CPUX General Purpose Timer, but it's conveniently enough to:
>  - Set the clock divider (input = 26MHz, divider = 2, output = 13MHz);
>  - Set the ENABLE bit on a global register (starts all CPUX timers).
>
> The only small hurdle with this setup is that it's all done through
> the MCUSYS wrapper, where it is needed, for each read or write, to
> select a register address (by writing it to an index register) and
> then to perform any R/W on a "CON" register.
>
> For example, writing "0x1" to the CPUXGPT register offset 0x4:
>  - Write 0x4 to mcusys INDEX register
>  - Write 0x1 to mcusys CON register
>
> Reading from CPUXGPT register offset 0x4:
>  - Write 0x4 to mcusys INDEX register
>  - Read mcusys CON register.
>
> Finally, starting this timer makes platforms affected by this issue
> to work correctly.
>
> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> ---
>  drivers/clocksource/timer-mediatek.c | 119 +++++++++++++++++++++++++++
>  1 file changed, 119 insertions(+)
>
> diff --git a/drivers/clocksource/timer-mediatek.c b/drivers/clocksource/timer-mediatek.c
> index 7bcb4a3f26fb..a3e90047f9ac 100644
> --- a/drivers/clocksource/timer-mediatek.c
> +++ b/drivers/clocksource/timer-mediatek.c
> @@ -22,6 +22,19 @@
>
>  #define TIMER_SYNC_TICKS        (3)
>
> +/* cpux mcusys wrapper */
> +#define CPUX_CON_REG		0x0
> +#define CPUX_IDX_REG		0x4
> +
> +/* cpux */
> +#define CPUX_IDX_GLOBAL_CTRL	0x0
> + #define CPUX_ENABLE		BIT(0)
> + #define CPUX_CLK_DIV_MASK	GENMASK(10, 8)
> + #define CPUX_CLK_DIV1		BIT(8)
> + #define CPUX_CLK_DIV2		BIT(9)
> + #define CPUX_CLK_DIV4		BIT(10)
> +#define CPUX_IDX_GLOBAL_IRQ	0x30
> +
>  /* gpt */
>  #define GPT_IRQ_EN_REG          0x00
>  #define GPT_IRQ_ENABLE(val)     BIT((val) - 1)
> @@ -72,6 +85,57 @@
>
>  static void __iomem *gpt_sched_reg __read_mostly;
>
> +static u32 mtk_cpux_readl(u32 reg_idx, struct timer_of *to)
> +{
> +	writel(reg_idx, timer_of_base(to) + CPUX_IDX_REG);
> +	return readl(timer_of_base(to) + CPUX_CON_REG);
> +}
> +
> +static void mtk_cpux_writel(u32 val, u32 reg_idx, struct timer_of *to)
> +{
> +	writel(reg_idx, timer_of_base(to) + CPUX_IDX_REG);
> +	writel(val, timer_of_base(to) + CPUX_CON_REG);
> +}
> +
> +static void mtk_cpux_disable_irq(struct timer_of *to)
> +{
> +	const unsigned long *irq_mask = cpumask_bits(cpu_possible_mask);
> +	u32 val;
> +
> +	val = mtk_cpux_readl(CPUX_IDX_GLOBAL_IRQ, to);
> +	val &= ~(*irq_mask);
> +	mtk_cpux_writel(val, CPUX_IDX_GLOBAL_IRQ, to);
> +}
> +
> +static void mtk_cpux_enable_irq(struct timer_of *to)
> +{
> +	const unsigned long *irq_mask = cpumask_bits(cpu_possible_mask);
> +	u32 val;
> +
> +	val = mtk_cpux_readl(CPUX_IDX_GLOBAL_IRQ, to);
> +	val |= *irq_mask;
> +	mtk_cpux_writel(val, CPUX_IDX_GLOBAL_IRQ, to);
> +}
> +
> +static int mtk_cpux_clkevt_shutdown(struct clock_event_device *clkevt)
> +{
> +	/* Clear any irq */
> +	mtk_cpux_disable_irq(to_timer_of(clkevt));
> +
> +	/*
> +	 * Disabling CPUXGPT timer will crash the platform, especially
> +	 * if Trusted Firmware is using it (usually, for sleep states),
> +	 * so we only mask the IRQ and call it a day.
> +	 */
> +	return 0;
> +}
> +
> +static int mtk_cpux_clkevt_resume(struct clock_event_device *clkevt)
> +{
> +	mtk_cpux_enable_irq(to_timer_of(clkevt));
> +	return 0;
> +}
> +
>  static void mtk_syst_ack_irq(struct timer_of *to)
>  {
>  	/* Clear and disable interrupt */
> @@ -281,6 +345,60 @@ static struct timer_of to = {
>  	},
>  };
>
> +static int __init mtk_cpux_init(struct device_node *node)
> +{
> +	static struct timer_of to_cpux;
> +	u32 freq, val;
> +	int ret;
> +
> +	/*
> +	 * There are per-cpu interrupts for the CPUX General Purpose Timer
> +	 * but since this timer feeds the AArch64 System Timer we can rely
> +	 * on the CPU timer PPIs as well, so we don't declare TIMER_OF_IRQ.
> +	 */
> +	to_cpux.flags = TIMER_OF_BASE | TIMER_OF_CLOCK;
> +	to_cpux.clkevt.name = "mtk-cpuxgpt";
> +	to_cpux.clkevt.rating = 10;
> +	to_cpux.clkevt.cpumask = cpu_possible_mask;
> +	to_cpux.clkevt.set_state_shutdown = mtk_cpux_clkevt_shutdown;
> +	to_cpux.clkevt.tick_resume = mtk_cpux_clkevt_resume;
> +
> +	/* If this fails, bad things are about to happen... */
> +	ret = timer_of_init(node, &to_cpux);
> +	if (ret) {
> +		WARN(1, "Cannot start CPUX timers.\n");
> +		return ret;
> +	}
> +
> +	/*
> +	 * Check if we're given a clock with the right frequency for this
> +	 * timer, otherwise warn but keep going with the setup anyway, as
> +	 * that makes it possible to still boot the kernel, even though
> +	 * it may not work correctly (random lockups, etc).
> +	 * The reason behind this is that having an early UART may not be
> +	 * possible for everyone and this gives a chance to retrieve kmsg
> +	 * for eventual debugging even on consumer devices.
> +	 */
> +	freq = timer_of_rate(&to_cpux);
> +	if (freq > 13000000)

Input clock is 26MHz and is then divided by 2 in CPUXGPT, so shouldn't
this be 26000000 instead? I get a warning here with 26MHz system clock
supplied:

	clocks {
		...
		clk26m: clk26m {
			compatible = "fixed-clock";
			clock-frequency = <26000000>;
			#clock-cells = <0>;
		};
		...
	};
	...
	soc {
		...
		cpuxgpt: timer@10200670 {
			compatible = "mediatek,mt6795-systimer";
			reg = <0 0x10200670 0 0x8>;
			clocks = <&clk26m>;
		};
		...
	};

> +		WARN(1, "Requested unsupported timer frequency %u\n", freq);
> +
> +	/* Clock input is 26MHz, set DIV2 to achieve 13MHz clock */
> +	val = mtk_cpux_readl(CPUX_IDX_GLOBAL_CTRL, &to_cpux);
> +	val &= ~CPUX_CLK_DIV_MASK;
> +	val |= CPUX_CLK_DIV2;
> +	mtk_cpux_writel(val, CPUX_IDX_GLOBAL_CTRL, &to_cpux);
> +
> +	/* Enable all CPUXGPT timers */
> +	val = mtk_cpux_readl(CPUX_IDX_GLOBAL_CTRL, &to_cpux);
> +	mtk_cpux_writel(val | CPUX_ENABLE, CPUX_IDX_GLOBAL_CTRL, &to_cpux);
> +
> +	clockevents_config_and_register(&to_cpux.clkevt, timer_of_rate(&to_cpux),
> +					TIMER_SYNC_TICKS, 0xffffffff);
> +
> +	return 0;
> +}
> +
>  static int __init mtk_syst_init(struct device_node *node)
>  {
>  	int ret;
> @@ -339,3 +457,4 @@ static int __init mtk_gpt_init(struct device_node *node)
>  }
>  TIMER_OF_DECLARE(mtk_mt6577, "mediatek,mt6577-timer", mtk_gpt_init);
>  TIMER_OF_DECLARE(mtk_mt6765, "mediatek,mt6765-timer", mtk_syst_init);
> +TIMER_OF_DECLARE(mtk_mt6795, "mediatek,mt6795-systimer", mtk_cpux_init);
> --
> 2.35.1
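The review comment on the `freq > 13000000` check comes down to simple divider arithmetic. The following is only a sketch of that arithmetic, not the driver's code: the helper names (`cpux_output_hz`, `patch_check_warns`, `input_rate_yields_13mhz`) are hypothetical, and the 26MHz input, divide-by-2 and 13MHz output figures are taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Figures from the commit message: 26MHz input, DIV2, 13MHz output. */
#define CPUX_DIVIDER	2u
#define CPUX_TIMER_HZ	13000000u

/* Hypothetical helper: rate seen by the timers after the divider. */
static uint32_t cpux_output_hz(uint32_t input_hz, uint32_t divider)
{
	return divider ? input_hz / divider : 0;
}

/* The check as written in the patch: warns for anything above 13MHz. */
static bool patch_check_warns(uint32_t freq)
{
	return freq > 13000000u;
}

/*
 * Hypothetical alternative in the spirit of the review comment: treat
 * the DT clock as the divider *input* and accept it when the divided
 * rate is the expected 13MHz.
 */
static bool input_rate_yields_13mhz(uint32_t input_hz)
{
	return cpux_output_hz(input_hz, CPUX_DIVIDER) == CPUX_TIMER_HZ;
}
```

With a 26MHz fixed clock in the device tree, `patch_check_warns(26000000)` is true, which matches the warning reported above, while `input_rate_yields_13mhz(26000000)` is also true because 26MHz / 2 = 13MHz.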
On Mon, May 16 2022 at 10:31:12 +0200, AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> wrote:
> On 13/05/22 22:14, Yassine Oudjana wrote:
>> From: Yassine Oudjana <yassine.oudjana@gmail.com>
>>
>> On Mon, 9 May 2022 23:07:40 +0200, AngeloGioacchino Del Regno
>> <angelogioacchino.delregno@collabora.com> wrote:
>>> Some MediaTek platforms with a buggy TrustZone ATF firmware will not
>>> initialize the AArch64 System Timer correctly: in these cases, the
>>> System Timer address is correctly programmed, as well as the CNTFRQ_EL0
>>> register (reading 13MHz, as it should be), but the assigned hardware
>>> timers are never started before (or after) booting Linux.
>>>
>>> In this condition, any call to function get_cycles() will be returning
>>> zero, as CNTVCT_EL0 will always read zero.
>>
>> I spent a lot of time trying to figure out why the arch timer didn't
>> work on MT6737T and never got any results. Turns out this is why...
>>
>> I ended up using the GPT (@ 0x10004000) as a system timer and it
>> worked fine.
>>
>> With this patch the arch timer started to work finally. Thanks for
>> the fix! See below for one comment on this patch.
>>
>
> Hello Yassine,
>
> yes this is a common quirk that's present on all (or almost all?) older
> MediaTek platforms - as I explained, due to TZ doing only partial init
> for these timers.
>
> I'm happy to read that this is working out as expected: I saw you pushing
> some patches for older MTK SoCs, so I started researching about what the
> community was blocked on with the upstreaming of these, and learnt about
> such major blocker.
>
> There's more, though: you also need to initialize the CPU MTCMOS at early
> boot in order for SMP to work on (some?) old platforms, or at least this
> is true for MT6795.

Oh I'm actually having trouble with SMP. I tried to do a bunch of
things downstream does but the CPUs just kept refusing to come up.
This might be the reason.

> Since it looks like you're interested in giving love to old SoCs, I will
> anticipate to you that I *do* have a local implementation for a correct
> initialization of the MTCMOS for the non-boot cores... that needs to be
> cleaned up a bit before I push that upstream though.

Any chance I can give it an early test? Having only one working CPU
kind of sucks...

>
>>> One common critical symptom of that is trying to use the udelay()
>>> function (calling __delay()), which executes the following loop:
>>>
>>>             start = get_cycles();
>>>             while ((get_cycles() - start) < cycles)
>>>                     cpu_relax();
>>>
>>> which, when CNTVCT_EL0 always reads zero, translates to:
>>>
>>>             while((0 - 0) < 0)  ==> while(0 < 0)
>>>
>>> ... generating an infinite loop, even though zero is never less
>>> than zero, but always equal to it (this has to be researched,
>>> but it's out of the scope of this commit).
>>>
>>> To fix this issue on the affected MediaTek platforms, the solution
>>> is to simply start the timers that are designed to be System Timer(s).
>>> These timers, downstream, are called "CPUXGPT" and there is one
>>> timer per CPU core; luckily, it is not necessary to set a start bit
>>> on each CPUX General Purpose Timer, but it's conveniently enough to:
>>>  - Set the clock divider (input = 26MHz, divider = 2, output = 13MHz);
>>>  - Set the ENABLE bit on a global register (starts all CPUX timers).
>>>
>>> The only small hurdle with this setup is that it's all done through
>>> the MCUSYS wrapper, where it is needed, for each read or write, to
>>> select a register address (by writing it to an index register) and
>>> then to perform any R/W on a "CON" register.
>>>
>>> For example, writing "0x1" to the CPUXGPT register offset 0x4:
>>>  - Write 0x4 to mcusys INDEX register
>>>  - Write 0x1 to mcusys CON register
>>>
>>> Reading from CPUXGPT register offset 0x4:
>>>  - Write 0x4 to mcusys INDEX register
>>>  - Read mcusys CON register.
>>>
>>> Finally, starting this timer makes platforms affected by this issue
>>> to work correctly.
>>>
>>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>> ---
>>>  drivers/clocksource/timer-mediatek.c | 119 +++++++++++++++++++++++++++
>>>  1 file changed, 119 insertions(+)
>>>
>>> diff --git a/drivers/clocksource/timer-mediatek.c b/drivers/clocksource/timer-mediatek.c
>>> index 7bcb4a3f26fb..a3e90047f9ac 100644
>>> --- a/drivers/clocksource/timer-mediatek.c
>>> +++ b/drivers/clocksource/timer-mediatek.c
>
> ..snip..
>
>>> +
>>> +	/*
>>> +	 * Check if we're given a clock with the right frequency for this
>>> +	 * timer, otherwise warn but keep going with the setup anyway, as
>>> +	 * that makes it possible to still boot the kernel, even though
>>> +	 * it may not work correctly (random lockups, etc).
>>> +	 * The reason behind this is that having an early UART may not be
>>> +	 * possible for everyone and this gives a chance to retrieve kmsg
>>> +	 * for eventual debugging even on consumer devices.
>>> +	 */
>>> +	freq = timer_of_rate(&to_cpux);
>>> +	if (freq > 13000000)
>>
>> Input clock is 26MHz and is then divided by 2 in CPUXGPT, so shouldn't
>> this be 26000000 instead? I get a warning here with 26MHz system clock
>> supplied:
>>
>
> This may seem to be counter intuitive... I had two ways to implement this:
> 1. Design this driver to take "clk26m" as a clock input and make it so
>    that it reads the expected frequency from CNTFRQ_EL0, then setup the
>    dividers based on that reading; or
> 2. Take "clk13m" as input and refuse to take anything else.
>
> Keeping in mind that:
> 1. There's no way (that I know, at least) to set a different clock source for
>    the CPUXGPT timers, and
> 2. There's no platform (I've been researching on that) that uses a different
>    frequency for these timers...
>
> ...there will never be any platform that outputs a clock that's not 13MHz,
> hence I chose to follow path 2 and take the 13MHz "System Clock", which is
> something that is present downstream as well.
>
> In any case, now that you make me think about that, it may indeed be more
> logical to assign the 26MHz clock to this node... my intention was to force
> knowledge on this outputting 13MHz instead but, I realize, this may be the
> wrong way of doing that.

Maybe it is a better idea to use clock-frequency = <13000000> to show
the timer frequency?

>
>> 	clocks {
>> 		...
>> 		clk26m: clk26m {
>> 			compatible = "fixed-clock";
>> 			clock-frequency = <26000000>;
>> 			#clock-cells = <0>;
>> 		};
>> 		...
>> 	};
>> 	...
>> 	soc {
>> 		...
>> 		cpuxgpt: timer@10200670 {
>> 			compatible = "mediatek,mt6795-systimer";
>> 			reg = <0 0x10200670 0 0x8>;
>
> My congratulations on this timer node: you're a smart person!
> I was expecting people complaining about "this doesn't work" and having
> to explain that 0x10200000 is not the right iostart for this node, but
> I didn't have to.
>
> Hats off.

Thanks :) I actually figured that out because I had topckgen incorrectly
placed at 0x10200000 in my device tree, which made me check the
datasheet, downstream dts and the timer driver to realize that the
correct address for topckgen is 0x10210000, 0x10200000 is mcusys and the
timer driver has the XGPT registers defined starting from 0, meaning I
had to add their offset in dts.

Regards,
Yassine
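The address reasoning in the last reply can be checked with a few lines of arithmetic. This is a sketch only: the two base addresses and the 0x8 window size are the ones quoted in this thread, the 0x0/0x4 register offsets are the `CPUX_CON_REG`/`CPUX_IDX_REG` values from the patch, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Addresses quoted in the thread. */
#define MCUSYS_BASE		0x10200000u
#define CPUXGPT_NODE_BASE	0x10200670u	/* timer@10200670 */
#define CPUXGPT_NODE_SIZE	0x8u		/* reg = <0 0x10200670 0 0x8> */

/* Wrapper register offsets from the patch. */
#define CPUX_CON_REG		0x0u
#define CPUX_IDX_REG		0x4u

/* Offset of the CPUXGPT wrapper inside mcusys. */
static uint32_t cpuxgpt_offset_in_mcusys(void)
{
	return CPUXGPT_NODE_BASE - MCUSYS_BASE;
}

/* Physical address of a wrapper register, given the DT node base. */
static uint32_t cpux_reg_phys(uint32_t reg)
{
	return CPUXGPT_NODE_BASE + reg;
}

/* True if a 32-bit access at this offset fits in the node's window. */
static bool cpux_reg_in_window(uint32_t reg)
{
	return reg + 4u <= CPUXGPT_NODE_SIZE;
}
```

Because the driver defines its wrapper registers starting from 0, the node has to point at mcusys + 0x670; the 8-byte window then covers exactly the CON register at 0x10200670 and the INDEX register at 0x10200674.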
diff --git a/drivers/clocksource/timer-mediatek.c b/drivers/clocksource/timer-mediatek.c
index 7bcb4a3f26fb..a3e90047f9ac 100644
--- a/drivers/clocksource/timer-mediatek.c
+++ b/drivers/clocksource/timer-mediatek.c
@@ -22,6 +22,19 @@
 
 #define TIMER_SYNC_TICKS        (3)
 
+/* cpux mcusys wrapper */
+#define CPUX_CON_REG		0x0
+#define CPUX_IDX_REG		0x4
+
+/* cpux */
+#define CPUX_IDX_GLOBAL_CTRL	0x0
+ #define CPUX_ENABLE		BIT(0)
+ #define CPUX_CLK_DIV_MASK	GENMASK(10, 8)
+ #define CPUX_CLK_DIV1		BIT(8)
+ #define CPUX_CLK_DIV2		BIT(9)
+ #define CPUX_CLK_DIV4		BIT(10)
+#define CPUX_IDX_GLOBAL_IRQ	0x30
+
 /* gpt */
 #define GPT_IRQ_EN_REG          0x00
 #define GPT_IRQ_ENABLE(val)     BIT((val) - 1)
@@ -72,6 +85,57 @@
 
 static void __iomem *gpt_sched_reg __read_mostly;
 
+static u32 mtk_cpux_readl(u32 reg_idx, struct timer_of *to)
+{
+	writel(reg_idx, timer_of_base(to) + CPUX_IDX_REG);
+	return readl(timer_of_base(to) + CPUX_CON_REG);
+}
+
+static void mtk_cpux_writel(u32 val, u32 reg_idx, struct timer_of *to)
+{
+	writel(reg_idx, timer_of_base(to) + CPUX_IDX_REG);
+	writel(val, timer_of_base(to) + CPUX_CON_REG);
+}
+
+static void mtk_cpux_disable_irq(struct timer_of *to)
+{
+	const unsigned long *irq_mask = cpumask_bits(cpu_possible_mask);
+	u32 val;
+
+	val = mtk_cpux_readl(CPUX_IDX_GLOBAL_IRQ, to);
+	val &= ~(*irq_mask);
+	mtk_cpux_writel(val, CPUX_IDX_GLOBAL_IRQ, to);
+}
+
+static void mtk_cpux_enable_irq(struct timer_of *to)
+{
+	const unsigned long *irq_mask = cpumask_bits(cpu_possible_mask);
+	u32 val;
+
+	val = mtk_cpux_readl(CPUX_IDX_GLOBAL_IRQ, to);
+	val |= *irq_mask;
+	mtk_cpux_writel(val, CPUX_IDX_GLOBAL_IRQ, to);
+}
+
+static int mtk_cpux_clkevt_shutdown(struct clock_event_device *clkevt)
+{
+	/* Clear any irq */
+	mtk_cpux_disable_irq(to_timer_of(clkevt));
+
+	/*
+	 * Disabling CPUXGPT timer will crash the platform, especially
+	 * if Trusted Firmware is using it (usually, for sleep states),
+	 * so we only mask the IRQ and call it a day.
+	 */
+	return 0;
+}
+
+static int mtk_cpux_clkevt_resume(struct clock_event_device *clkevt)
+{
+	mtk_cpux_enable_irq(to_timer_of(clkevt));
+	return 0;
+}
+
 static void mtk_syst_ack_irq(struct timer_of *to)
 {
 	/* Clear and disable interrupt */
@@ -281,6 +345,60 @@ static struct timer_of to = {
 	},
 };
 
+static int __init mtk_cpux_init(struct device_node *node)
+{
+	static struct timer_of to_cpux;
+	u32 freq, val;
+	int ret;
+
+	/*
+	 * There are per-cpu interrupts for the CPUX General Purpose Timer
+	 * but since this timer feeds the AArch64 System Timer we can rely
+	 * on the CPU timer PPIs as well, so we don't declare TIMER_OF_IRQ.
+	 */
+	to_cpux.flags = TIMER_OF_BASE | TIMER_OF_CLOCK;
+	to_cpux.clkevt.name = "mtk-cpuxgpt";
+	to_cpux.clkevt.rating = 10;
+	to_cpux.clkevt.cpumask = cpu_possible_mask;
+	to_cpux.clkevt.set_state_shutdown = mtk_cpux_clkevt_shutdown;
+	to_cpux.clkevt.tick_resume = mtk_cpux_clkevt_resume;
+
+	/* If this fails, bad things are about to happen... */
+	ret = timer_of_init(node, &to_cpux);
+	if (ret) {
+		WARN(1, "Cannot start CPUX timers.\n");
+		return ret;
+	}
+
+	/*
+	 * Check if we're given a clock with the right frequency for this
+	 * timer, otherwise warn but keep going with the setup anyway, as
+	 * that makes it possible to still boot the kernel, even though
+	 * it may not work correctly (random lockups, etc).
+	 * The reason behind this is that having an early UART may not be
+	 * possible for everyone and this gives a chance to retrieve kmsg
+	 * for eventual debugging even on consumer devices.
+	 */
+	freq = timer_of_rate(&to_cpux);
+	if (freq > 13000000)
+		WARN(1, "Requested unsupported timer frequency %u\n", freq);
+
+	/* Clock input is 26MHz, set DIV2 to achieve 13MHz clock */
+	val = mtk_cpux_readl(CPUX_IDX_GLOBAL_CTRL, &to_cpux);
+	val &= ~CPUX_CLK_DIV_MASK;
+	val |= CPUX_CLK_DIV2;
+	mtk_cpux_writel(val, CPUX_IDX_GLOBAL_CTRL, &to_cpux);
+
+	/* Enable all CPUXGPT timers */
+	val = mtk_cpux_readl(CPUX_IDX_GLOBAL_CTRL, &to_cpux);
+	mtk_cpux_writel(val | CPUX_ENABLE, CPUX_IDX_GLOBAL_CTRL, &to_cpux);
+
+	clockevents_config_and_register(&to_cpux.clkevt, timer_of_rate(&to_cpux),
+					TIMER_SYNC_TICKS, 0xffffffff);
+
+	return 0;
+}
+
 static int __init mtk_syst_init(struct device_node *node)
 {
 	int ret;
@@ -339,3 +457,4 @@ static int __init mtk_gpt_init(struct device_node *node)
 }
 TIMER_OF_DECLARE(mtk_mt6577, "mediatek,mt6577-timer", mtk_gpt_init);
 TIMER_OF_DECLARE(mtk_mt6765, "mediatek,mt6765-timer", mtk_syst_init);
+TIMER_OF_DECLARE(mtk_mt6795, "mediatek,mt6795-systimer", mtk_cpux_init);
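The INDEX/CON access pattern used by mtk_cpux_readl() and mtk_cpux_writel() in the diff above can be tried out in user space against an in-memory model. The following is a sketch under stated assumptions, not the hardware's behaviour: `struct fake_mcusys` and the `fake_cpux_*` functions are hypothetical, the model assumes register indices are 4-byte-aligned byte offsets below 0x40, and the bit definitions are plain-C equivalents of the BIT()/GENMASK() values in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-C equivalents of the patch's register definitions. */
#define CPUX_IDX_GLOBAL_CTRL	0x0u
#define CPUX_ENABLE		(1u << 0)	/* BIT(0) */
#define CPUX_CLK_DIV_MASK	(0x7u << 8)	/* GENMASK(10, 8) */
#define CPUX_CLK_DIV1		(1u << 8)	/* BIT(8) */
#define CPUX_CLK_DIV2		(1u << 9)	/* BIT(9) */

#define FAKE_NREGS		16u		/* covers offsets 0x00..0x3c */

/* Hypothetical model: one INDEX latch plus the registers behind CON. */
struct fake_mcusys {
	uint32_t index;
	uint32_t regs[FAKE_NREGS];
};

static uint32_t fake_cpux_readl(struct fake_mcusys *m, uint32_t reg_idx)
{
	m->index = reg_idx;			/* step 1: write INDEX */
	assert(m->index / 4u < FAKE_NREGS);
	return m->regs[m->index / 4u];		/* step 2: read CON */
}

static void fake_cpux_writel(struct fake_mcusys *m, uint32_t val,
			     uint32_t reg_idx)
{
	m->index = reg_idx;			/* step 1: write INDEX */
	assert(m->index / 4u < FAKE_NREGS);
	m->regs[m->index / 4u] = val;		/* step 2: write CON */
}

/* Same sequence as mtk_cpux_init(): select DIV2, then set ENABLE. */
static void fake_cpux_start(struct fake_mcusys *m)
{
	uint32_t val;

	val = fake_cpux_readl(m, CPUX_IDX_GLOBAL_CTRL);
	val &= ~CPUX_CLK_DIV_MASK;
	val |= CPUX_CLK_DIV2;
	fake_cpux_writel(m, val, CPUX_IDX_GLOBAL_CTRL);

	val = fake_cpux_readl(m, CPUX_IDX_GLOBAL_CTRL);
	fake_cpux_writel(m, val | CPUX_ENABLE, CPUX_IDX_GLOBAL_CTRL);
}
```

Starting from a model whose control register holds DIV1, calling `fake_cpux_start()` leaves it holding `CPUX_CLK_DIV2 | CPUX_ENABLE`. Every access is two steps, so on real hardware the pair would presumably need to stay uninterrupted if anything else could touch the INDEX register in between.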
Some MediaTek platforms with a buggy TrustZone ATF firmware will not
initialize the AArch64 System Timer correctly: in these cases, the
System Timer address is correctly programmed, as well as the CNTFRQ_EL0
register (reading 13MHz, as it should be), but the assigned hardware
timers are never started before (or after) booting Linux.

In this condition, any call to function get_cycles() will be returning
zero, as CNTVCT_EL0 will always read zero.

One common critical symptom of that is trying to use the udelay()
function (calling __delay()), which executes the following loop:

            start = get_cycles();
            while ((get_cycles() - start) < cycles)
                    cpu_relax();

which, when CNTVCT_EL0 always reads zero, translates to:

            while((0 - 0) < 0)  ==> while(0 < 0)

... generating an infinite loop, even though zero is never less
than zero, but always equal to it (this has to be researched,
but it's out of the scope of this commit).

To fix this issue on the affected MediaTek platforms, the solution
is to simply start the timers that are designed to be System Timer(s).
These timers, downstream, are called "CPUXGPT" and there is one
timer per CPU core; luckily, it is not necessary to set a start bit
on each CPUX General Purpose Timer, but it's conveniently enough to:
 - Set the clock divider (input = 26MHz, divider = 2, output = 13MHz);
 - Set the ENABLE bit on a global register (starts all CPUX timers).

The only small hurdle with this setup is that it's all done through
the MCUSYS wrapper, where it is needed, for each read or write, to
select a register address (by writing it to an index register) and
then to perform any R/W on a "CON" register.

For example, writing "0x1" to the CPUXGPT register offset 0x4:
 - Write 0x4 to mcusys INDEX register
 - Write 0x1 to mcusys CON register

Reading from CPUXGPT register offset 0x4:
 - Write 0x4 to mcusys INDEX register
 - Read mcusys CON register.

Finally, starting this timer makes platforms affected by this issue
to work correctly.

Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
---
 drivers/clocksource/timer-mediatek.c | 119 +++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)
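A note on the `while(0 < 0)` step in the message above: in the loop as quoted, the right-hand side of the comparison is `cycles`, not zero. With a frozen counter the condition is therefore `0 < cycles`, which stays true for any non-zero delay, and that alone would account for the hang. The following user-space sketch illustrates this reading; `frozen_get_cycles`, `ticking_get_cycles` and `bounded_delay` are hypothetical names, and the spin cap exists only so that the model terminates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for get_cycles() when CNTVCT_EL0 is stuck at 0. */
static uint64_t frozen_get_cycles(void)
{
	return 0;
}

/* Hypothetical stand-in for a working counter: advances on every read. */
static uint64_t ticking_get_cycles(void)
{
	static uint64_t ticks;

	return ++ticks;
}

/*
 * Model of the quoted busy-wait, with an iteration cap added.
 * Returns true if the wait completed, false if the cap was reached,
 * i.e. the uncapped loop would still be spinning.
 */
static bool bounded_delay(uint64_t (*get_cycles)(void), uint64_t cycles,
			  uint64_t max_spins)
{
	uint64_t start = get_cycles();
	uint64_t spins = 0;

	while ((get_cycles() - start) < cycles) {
		if (++spins >= max_spins)
			return false;
	}
	return true;
}
```

In this model `bounded_delay(frozen_get_cycles, 13, 1000)` hits the cap, while the same call with `ticking_get_cycles` completes; only a request for zero cycles completes on the frozen counter.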