spi: meson-spicc: save pow2 datarate between messages

Message ID	20220809152019.461741-1-narmstrong@baylibre.com
State	New
Headers	show Return-Path: <linux-spi-owner@kernel.org> From: Neil Armstrong <narmstrong@baylibre.com> To: broonie@kernel.org Cc: Neil Armstrong <narmstrong@baylibre.com>, linux-spi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-amlogic@lists.infradead.org, linux-kernel@vger.kernel.org, Da Xue <da@libre.computer> Subject: [PATCH] spi: meson-spicc: save pow2 datarate between messages Date: Tue, 9 Aug 2022 17:20:19 +0200 Message-Id: <20220809152019.461741-1-narmstrong@baylibre.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	spi: meson-spicc: save pow2 datarate between messages \| expand spi: meson-spicc: save pow2 datarate between messages

Neil Armstrong Aug. 9, 2022, 3:20 p.m. UTC

At the end of a message, the HW gets a reset in meson_spicc_unprepare_transfer(),
this resets the SPICC_CONREG register and notably the value set by the
Common Clock Framework.

This saves the datarate dividor value between message to keep the last
set value by the Common Clock Framework.

This didn't appear before commit 3e0cf4d3fc29 ("spi: meson-spicc: add a linear clock divider support")
because we recalculated and wrote the rate for each xfer.

Fixes: 3e0cf4d3fc29 ("spi: meson-spicc: add a linear clock divider support")
Reported-by: Da Xue <da@libre.computer>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
---
 drivers/spi/spi-meson-spicc.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Mark Brown Aug. 9, 2022, 3:27 p.m. UTC | #1

On Tue, Aug 09, 2022 at 05:20:19PM +0200, Neil Armstrong wrote:
> At the end of a message, the HW gets a reset in meson_spicc_unprepare_transfer(),
> this resets the SPICC_CONREG register and notably the value set by the
> Common Clock Framework.

> This saves the datarate dividor value between message to keep the last
> set value by the Common Clock Framework.

When you say the value set by the clock framework does that mean that
the clock driver is adjusting hardware inside the SPI controller IP
block which is then getting reset by the SPI driver without the SPI
driver knowing about it?  That seems like a bad idea as you're finding
here.

> This didn't appear before commit 3e0cf4d3fc29 ("spi: meson-spicc: add a linear clock divider support")
> because we recalculated and wrote the rate for each xfer.

Note that the rate might change per transfer.

Neil Armstrong Aug. 10, 2022, 9:17 a.m. UTC | #2

On 09/08/2022 17:27, Mark Brown wrote:
> On Tue, Aug 09, 2022 at 05:20:19PM +0200, Neil Armstrong wrote:
>> At the end of a message, the HW gets a reset in meson_spicc_unprepare_transfer(),
>> this resets the SPICC_CONREG register and notably the value set by the
>> Common Clock Framework.
> 
>> This saves the datarate dividor value between message to keep the last
>> set value by the Common Clock Framework.
> 
> When you say the value set by the clock framework does that mean that
> the clock driver is adjusting hardware inside the SPI controller IP
> block which is then getting reset by the SPI driver without the SPI
> driver knowing about it?  That seems like a bad idea as you're finding
> here.

The SPI driver is explicitely triggering a reset at the end of each message
to get back to a clean HW state, but it does reset the content of the "legacy"
registers containing the power of 2 divider value, the new registers configuring
the new clock divider path (only on newer SoCs) doesn't get cleared.

> 
>> This didn't appear before commit 3e0cf4d3fc29 ("spi: meson-spicc: add a linear clock divider support")
>> because we recalculated and wrote the rate for each xfer.
> 
> Note that the rate might change per transfer.

It's taken in account, this case is when the rate doesn't change.

Mark Brown Aug. 10, 2022, 12:37 p.m. UTC | #3

On Wed, Aug 10, 2022 at 11:17:14AM +0200, Neil Armstrong wrote:
> On 09/08/2022 17:27, Mark Brown wrote:
> > On Tue, Aug 09, 2022 at 05:20:19PM +0200, Neil Armstrong wrote:

> > When you say the value set by the clock framework does that mean that
> > the clock driver is adjusting hardware inside the SPI controller IP
> > block which is then getting reset by the SPI driver without the SPI
> > driver knowing about it?  That seems like a bad idea as you're finding
> > here.

> The SPI driver is explicitely triggering a reset at the end of each message
> to get back to a clean HW state, but it does reset the content of the "legacy"
> registers containing the power of 2 divider value, the new registers configuring
> the new clock divider path (only on newer SoCs) doesn't get cleared.

Sure, but that doesn't really address the concern - is this something
that the clk driver programmed or is this the driver forgetting to
restore a register that it programmed itself?  The commit message sounds
like the former which is a much bigger problem.

Neil Armstrong Aug. 10, 2022, 2:01 p.m. UTC | #4

On 10/08/2022 14:37, Mark Brown wrote:
> On Wed, Aug 10, 2022 at 11:17:14AM +0200, Neil Armstrong wrote:
>> On 09/08/2022 17:27, Mark Brown wrote:
>>> On Tue, Aug 09, 2022 at 05:20:19PM +0200, Neil Armstrong wrote:
> 
>>> When you say the value set by the clock framework does that mean that
>>> the clock driver is adjusting hardware inside the SPI controller IP
>>> block which is then getting reset by the SPI driver without the SPI
>>> driver knowing about it?  That seems like a bad idea as you're finding
>>> here.
> 
>> The SPI driver is explicitely triggering a reset at the end of each message
>> to get back to a clean HW state, but it does reset the content of the "legacy"
>> registers containing the power of 2 divider value, the new registers configuring
>> the new clock divider path (only on newer SoCs) doesn't get cleared.
> 
> Sure, but that doesn't really address the concern - is this something
> that the clk driver programmed or is this the driver forgetting to
> restore a register that it programmed itself?  The commit message sounds
> like the former which is a much bigger problem.

It's what is programmed by the Clock Framework yes, it was designed as-is
so the Clock Framework takes the most accurate clock path but the reset case
wasn't taken in account.

Neil

Mark Brown Aug. 10, 2022, 2:31 p.m. UTC | #5

On Wed, Aug 10, 2022 at 04:01:33PM +0200, Neil Armstrong wrote:
> On 10/08/2022 14:37, Mark Brown wrote:

> > Sure, but that doesn't really address the concern - is this something
> > that the clk driver programmed or is this the driver forgetting to
> > restore a register that it programmed itself?  The commit message sounds
> > like the former which is a much bigger problem.

> It's what is programmed by the Clock Framework yes, it was designed as-is
> so the Clock Framework takes the most accurate clock path but the reset case
> wasn't taken in account.

This seems like a bad idea, we shouldn't have two different drivers
managing the same register without explicit and visible coordination
with each other, this is at best asking for trouble as you've found
here.  I've not looked in detail but I think if you want to use the
clock framework here then this driver should register a clock provider
for the clock hardware in the IP block.

How does this work with runtime PM, what happens if the clock driver
decides to change something while the device is powered down?

Neil Armstrong Aug. 10, 2022, 2:40 p.m. UTC | #6

On 10/08/2022 16:31, Mark Brown wrote:
> On Wed, Aug 10, 2022 at 04:01:33PM +0200, Neil Armstrong wrote:
>> On 10/08/2022 14:37, Mark Brown wrote:
> 
>>> Sure, but that doesn't really address the concern - is this something
>>> that the clk driver programmed or is this the driver forgetting to
>>> restore a register that it programmed itself?  The commit message sounds
>>> like the former which is a much bigger problem.
> 
>> It's what is programmed by the Clock Framework yes, it was designed as-is
>> so the Clock Framework takes the most accurate clock path but the reset case
>> wasn't taken in account.
> 
> This seems like a bad idea, we shouldn't have two different drivers
> managing the same register without explicit and visible coordination
> with each other, this is at best asking for trouble as you've found
> here.  I've not looked in detail but I think if you want to use the
> clock framework here then this driver should register a clock provider
> for the clock hardware in the IP block.

I totally understand, this wasn't explicit until I found the bug.

I don't think it's worth adding so much code for this since we already
had an open-coded function which perfectly worked before.

> 
> How does this work with runtime PM, what happens if the clock driver
> decides to change something while the device is powered down?

There's no runtime PM implemented, and yes it would be an issue.

I'm perfectly OK to remove the CCF driver for the legacy clock path
and return back to the old open coded calculation since it perfectly
worked and stop using the legacy clock path for new SoCs since it would
never be selected anyway...
... but GX SoCs are broken so it would need an intermediate fix until
I push the refactoring to cleanup all this.

Neil

Mark Brown Aug. 10, 2022, 2:49 p.m. UTC | #7

On Wed, Aug 10, 2022 at 04:40:04PM +0200, Neil Armstrong wrote:

> I don't think it's worth adding so much code for this since we already

I don't recall the code for clock providers being that hard?  They're
generally pretty small, some of the ASoC CODEC drivers did something
similar.

> had an open-coded function which perfectly worked before.

Except in the cases it didn't...

> I'm perfectly OK to remove the CCF driver for the legacy clock path
> and return back to the old open coded calculation since it perfectly
> worked and stop using the legacy clock path for new SoCs since it would
> never be selected anyway...

It does seem better to go the clock provider route TBH.

> ... but GX SoCs are broken so it would need an intermediate fix until
> I push the refactoring to cleanup all this.

I'm trying to figure out if this is actually fixing the problem or just
papering over one case where things happened to go badly.

Neil Armstrong Aug. 10, 2022, 3:51 p.m. UTC | #8

On 10/08/2022 16:49, Mark Brown wrote:
> On Wed, Aug 10, 2022 at 04:40:04PM +0200, Neil Armstrong wrote:
> 
>> I don't think it's worth adding so much code for this since we already
> 
> I don't recall the code for clock providers being that hard?  They're
> generally pretty small, some of the ASoC CODEC drivers did something
> similar.

Seems over-engineering to me, but I can explore this path if it's the best
route to follow.

> 
>> had an open-coded function which perfectly worked before.
> 
> Except in the cases it didn't...

It did but wasn't generic enough to take the new clock path introduced
in the new SoCs.

> 
>> I'm perfectly OK to remove the CCF driver for the legacy clock path
>> and return back to the old open coded calculation since it perfectly
>> worked and stop using the legacy clock path for new SoCs since it would
>> never be selected anyway...
> 
> It does seem better to go the clock provider route TBH.

I'm afraid this won't fix the problem since CCF won't set the clock again
if the rate is already ok in it's cache, we'd still need to save the divider
value and restore it after the reset as I did on this exact patch.

> 
>> ... but GX SoCs are broken so it would need an intermediate fix until
>> I push the refactoring to cleanup all this.
> 
> I'm trying to figure out if this is actually fixing the problem or just
> papering over one case where things happened to go badly.

It does, when clk_set_rate() is called, the datarate field would be the
same as after the previous call.

Neil

Mark Brown Aug. 10, 2022, 4:10 p.m. UTC | #9

On Wed, Aug 10, 2022 at 05:51:33PM +0200, Neil Armstrong wrote:
> On 10/08/2022 16:49, Mark Brown wrote:

> > I don't recall the code for clock providers being that hard?  They're
> > generally pretty small, some of the ASoC CODEC drivers did something
> > similar.

> Seems over-engineering to me, but I can explore this path if it's the best
> route to follow.

Please.

> > > had an open-coded function which perfectly worked before.

> > Except in the cases it didn't...

> It did but wasn't generic enough to take the new clock path introduced
> in the new SoCs.

Right, it's leaving landmines lying around - it's hard for me to be
confident that we couldn't also get another surprise due to a new test
case exercising the code differently somehow, never mind the fragility
of the code.

> > > I'm perfectly OK to remove the CCF driver for the legacy clock path
> > > and return back to the old open coded calculation since it perfectly
> > > worked and stop using the legacy clock path for new SoCs since it would
> > > never be selected anyway...

> > It does seem better to go the clock provider route TBH.

> I'm afraid this won't fix the problem since CCF won't set the clock again
> if the rate is already ok in it's cache, we'd still need to save the divider
> value and restore it after the reset as I did on this exact patch.

The goal here is to ensure that the clock framework's idea of what the
programmed configuration is and the SPI driver's idea of what that is
can't get out of sync - instead of saving the value by virtue of reading
it back out of the hardware at some specific point and hoping that
there's never any changes triggered by the clock framework between the
save and restore the driver is getting notified whenever there's a
change being made and can update the cache then.  That way we ensure
we can't miss any cases and things are robust.

spi: meson-spicc: save pow2 datarate between messages

Commit Message

Comments

Patch