i2c: mv64xxx: Fix random system lock caused by runtime PM

Message ID 20210408020000.21914-1-kabel@kernel.org
State Accepted
Commit 39930213e7779b9c4257499972b8afb8858f1a2d
Series i2c: mv64xxx: Fix random system lock caused by runtime PM

Commit Message

Marek Behún April 8, 2021, 2 a.m. UTC
I noticed a weird bug with this driver on Marvell CN9130 Customer
Reference Board.

Sometime after boot, the system locks with the following message:
 [104.071363] i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0

The system does not respond afterwards, only warns about RCU stalls.

This first appeared with commit e5c02cf54154 ("i2c: mv64xxx: Add runtime
PM support").

With further experimentation I discovered that adding a delay into
mv64xxx_i2c_hw_init() fixes this issue. This function is called before
every xfer, due to how runtime PM works in this driver. It seems that in
order to work correctly, a delay is needed after the bus is reset in
this function.

Since there already is a known erratum with this controller needing a
delay, I assume that this is just another place this needs to be
applied. Therefore I apply the delay only if errata_delay is true.

Signed-off-by: Marek Behún <kabel@kernel.org>
---
 drivers/i2c/busses/i2c-mv64xxx.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Marek Behún April 10, 2021, 4:47 p.m. UTC | #1
On Thu,  8 Apr 2021 04:00:00 +0200
Marek Behún <kabel@kernel.org> wrote:

> e5c02cf54154 ("i2c: mv64xxx: Add runtime
> PM support").

This commit should also contain a Fixes tag:

Fixes: e5c02cf54154 ("i2c: mv64xxx: Add runtime PM support")
Wolfram Sang April 13, 2021, 7:58 p.m. UTC | #2
On Thu, Apr 08, 2021 at 04:00:00AM +0200, Marek Behún wrote:
> I noticed a weird bug with this driver on Marvell CN9130 Customer
> Reference Board.
>
> Sometime after boot, the system locks with the following message:
>  [104.071363] i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
>
> The system does not respond afterwards, only warns about RCU stalls.
>
> This first appeared with commit e5c02cf54154 ("i2c: mv64xxx: Add runtime
> PM support").
>
> With further experimentation I discovered that adding a delay into
> mv64xxx_i2c_hw_init() fixes this issue. This function is called before
> every xfer, due to how runtime PM works in this driver. It seems that in
> order to work correctly, a delay is needed after the bus is reset in
> this function.
>
> Since there already is a known erratum with this controller needing a
> delay, I assume that this is just another place this needs to be
> applied. Therefore I apply the delay only if errata_delay is true.
>
> Signed-off-by: Marek Behún <kabel@kernel.org>

Gregory? Looks reasonable to me and if so, we should have this in 5.12
already. Comments from others are welcome, too, of course.

> ---
>  drivers/i2c/busses/i2c-mv64xxx.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/i2c/busses/i2c-mv64xxx.c b/drivers/i2c/busses/i2c-mv64xxx.c
> index c590d36b5fd1..5c8e94b6cdb5 100644
> --- a/drivers/i2c/busses/i2c-mv64xxx.c
> +++ b/drivers/i2c/busses/i2c-mv64xxx.c
> @@ -221,6 +221,10 @@ mv64xxx_i2c_hw_init(struct mv64xxx_i2c_data *drv_data)
>  	writel(0, drv_data->reg_base + drv_data->reg_offsets.ext_addr);
>  	writel(MV64XXX_I2C_REG_CONTROL_TWSIEN | MV64XXX_I2C_REG_CONTROL_STOP,
>  		drv_data->reg_base + drv_data->reg_offsets.control);
> +
> +	if (drv_data->errata_delay)
> +		udelay(5);
> +
>  	drv_data->state = MV64XXX_I2C_STATE_IDLE;
>  }
>
> -- 
> 2.26.2
>
Gregory CLEMENT April 14, 2021, 1:28 p.m. UTC | #3
> On Thu, Apr 08, 2021 at 04:00:00AM +0200, Marek Behún wrote:
>> I noticed a weird bug with this driver on Marvell CN9130 Customer
>> Reference Board.
>>
>> Sometime after boot, the system locks with the following message:
>>  [104.071363] i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
>>
>> The system does not respond afterwards, only warns about RCU stalls.
>>
>> This first appeared with commit e5c02cf54154 ("i2c: mv64xxx: Add runtime
>> PM support").
>>
>> With further experimentation I discovered that adding a delay into
>> mv64xxx_i2c_hw_init() fixes this issue. This function is called before
>> every xfer, due to how runtime PM works in this driver. It seems that in
>> order to work correctly, a delay is needed after the bus is reset in
>> this function.

Marek,

As you mentioned it was related to reset and the issue occurred with the
support of runtime PM. Did you try to add the delay only in the function
mv64xxx_i2c_runtime_resume(), just after the mv64xxx_i2c_hw_init() call ?

>>
>> Since there already is a known erratum with this controller needing a
>> delay, I assume that this is just another place this needs to be
>> applied. Therefore I apply the delay only if errata_delay is true.
>>
>> Signed-off-by: Marek Behún <kabel@kernel.org>
>
> Gregory? Looks reasonable to me and if so, we should have this in 5.12
> already. Comments from others are welcome, too, of course.

Hello Wolfram,

I don't have this specific platform. However, as you said, it looks
reasonable and it fixes an issue. And even though I had a pending
question, it is just an optimisation, so you can add my

Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>


Gregory


>
>> ---
>>  drivers/i2c/busses/i2c-mv64xxx.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/i2c/busses/i2c-mv64xxx.c b/drivers/i2c/busses/i2c-mv64xxx.c
>> index c590d36b5fd1..5c8e94b6cdb5 100644
>> --- a/drivers/i2c/busses/i2c-mv64xxx.c
>> +++ b/drivers/i2c/busses/i2c-mv64xxx.c
>> @@ -221,6 +221,10 @@ mv64xxx_i2c_hw_init(struct mv64xxx_i2c_data *drv_data)
>>  	writel(0, drv_data->reg_base + drv_data->reg_offsets.ext_addr);
>>  	writel(MV64XXX_I2C_REG_CONTROL_TWSIEN | MV64XXX_I2C_REG_CONTROL_STOP,
>>  		drv_data->reg_base + drv_data->reg_offsets.control);
>> +
>> +	if (drv_data->errata_delay)
>> +		udelay(5);
>> +
>>  	drv_data->state = MV64XXX_I2C_STATE_IDLE;
>>  }
>>
>> -- 
>> 2.26.2
>>

-- 
Gregory Clement, Bootlin
Embedded Linux and Kernel engineering
http://bootlin.com
Samuel Holland April 14, 2021, 1:42 p.m. UTC | #4
On 4/13/21 2:58 PM, Wolfram Sang wrote:
> On Thu, Apr 08, 2021 at 04:00:00AM +0200, Marek Behún wrote:
>> I noticed a weird bug with this driver on Marvell CN9130 Customer
>> Reference Board.
>>
>> Sometime after boot, the system locks with the following message:
>>  [104.071363] i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
>>
>> The system does not respond afterwards, only warns about RCU stalls.
>>
>> This first appeared with commit e5c02cf54154 ("i2c: mv64xxx: Add runtime
>> PM support").
>>
>> With further experimentation I discovered that adding a delay into
>> mv64xxx_i2c_hw_init() fixes this issue. This function is called before
>> every xfer, due to how runtime PM works in this driver. It seems that in
>> order to work correctly, a delay is needed after the bus is reset in
>> this function.
>>
>> Since there already is a known erratum with this controller needing a
>> delay, I assume that this is just another place this needs to be
>> applied. Therefore I apply the delay only if errata_delay is true.
>>
>> Signed-off-by: Marek Behún <kabel@kernel.org>
>
> Gregory? Looks reasonable to me and if so, we should have this in 5.12
> already. Comments from others are welcome, too, of course.

I also don't have an affected platform, so I did not notice this when adding
runtime PM. The change makes sense to me as well.

Reviewed-by: Samuel Holland <samuel@sholland.org>


Cheers,
Samuel

>> ---
>>  drivers/i2c/busses/i2c-mv64xxx.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/i2c/busses/i2c-mv64xxx.c b/drivers/i2c/busses/i2c-mv64xxx.c
>> index c590d36b5fd1..5c8e94b6cdb5 100644
>> --- a/drivers/i2c/busses/i2c-mv64xxx.c
>> +++ b/drivers/i2c/busses/i2c-mv64xxx.c
>> @@ -221,6 +221,10 @@ mv64xxx_i2c_hw_init(struct mv64xxx_i2c_data *drv_data)
>>  	writel(0, drv_data->reg_base + drv_data->reg_offsets.ext_addr);
>>  	writel(MV64XXX_I2C_REG_CONTROL_TWSIEN | MV64XXX_I2C_REG_CONTROL_STOP,
>>  		drv_data->reg_base + drv_data->reg_offsets.control);
>> +
>> +	if (drv_data->errata_delay)
>> +		udelay(5);
>> +
>>  	drv_data->state = MV64XXX_I2C_STATE_IDLE;
>>  }
>>
>> -- 
>> 2.26.2
>>
Marek Behún April 14, 2021, 2:29 p.m. UTC | #5
On Wed, 14 Apr 2021 15:28:18 +0200
Gregory CLEMENT <gregory.clement@bootlin.com> wrote:

> > On Thu, Apr 08, 2021 at 04:00:00AM +0200, Marek Behún wrote:
> >> I noticed a weird bug with this driver on Marvell CN9130 Customer
> >> Reference Board.
> >>
> >> Sometime after boot, the system locks with the following message:
> >>  [104.071363] i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
> >>
> >> The system does not respond afterwards, only warns about RCU stalls.
> >>
> >> This first appeared with commit e5c02cf54154 ("i2c: mv64xxx: Add runtime
> >> PM support").
> >>
> >> With further experimentation I discovered that adding a delay into
> >> mv64xxx_i2c_hw_init() fixes this issue. This function is called before
> >> every xfer, due to how runtime PM works in this driver. It seems that in
> >> order to work correctly, a delay is needed after the bus is reset in
> >> this function.
>
> Marek,
>
> As you mentioned it was related to reset and the issue occurred with the
> support of runtime PM. Did you try to add the delay only in the function
> mv64xxx_i2c_runtime_resume(), just after the mv64xxx_i2c_hw_init() call ?
>

I did indeed discover this when I added this delay into the resume
function. In fact I discovered this when I added printf()s into suspend
and resume when debugging. The problem disappeared with these printf()s
(UART is slow so printf() counted as the necessary delay it seems).

I then moved the delay into the hw_init() function because that is what
made sense to me, that the delay is necessary after the reset, not only
when resuming, but always. We just did not notice because a xfer was
never done immediately after reset before the PM patch. (But maybe I am
wrong, maybe it is not needed in the reset. It just makes the most
sense to me...)

Marek
Wolfram Sang April 15, 2021, 8:13 p.m. UTC | #6
On Thu, Apr 08, 2021 at 04:00:00AM +0200, Marek Behún wrote:
> I noticed a weird bug with this driver on Marvell CN9130 Customer
> Reference Board.
>
> Sometime after boot, the system locks with the following message:
>  [104.071363] i2c i2c-0: mv64xxx: I2C bus locked, block: 1, time_left: 0
>
> The system does not respond afterwards, only warns about RCU stalls.
>
> This first appeared with commit e5c02cf54154 ("i2c: mv64xxx: Add runtime
> PM support").
>
> With further experimentation I discovered that adding a delay into
> mv64xxx_i2c_hw_init() fixes this issue. This function is called before
> every xfer, due to how runtime PM works in this driver. It seems that in
> order to work correctly, a delay is needed after the bus is reset in
> this function.
>
> Since there already is a known erratum with this controller needing a
> delay, I assume that this is just another place this needs to be
> applied. Therefore I apply the delay only if errata_delay is true.
>
> Signed-off-by: Marek Behún <kabel@kernel.org>

Applied to for-current, thanks!
Patch

diff --git a/drivers/i2c/busses/i2c-mv64xxx.c b/drivers/i2c/busses/i2c-mv64xxx.c
index c590d36b5fd1..5c8e94b6cdb5 100644
--- a/drivers/i2c/busses/i2c-mv64xxx.c
+++ b/drivers/i2c/busses/i2c-mv64xxx.c
@@ -221,6 +221,10 @@  mv64xxx_i2c_hw_init(struct mv64xxx_i2c_data *drv_data)
 	writel(0, drv_data->reg_base + drv_data->reg_offsets.ext_addr);
 	writel(MV64XXX_I2C_REG_CONTROL_TWSIEN | MV64XXX_I2C_REG_CONTROL_STOP,
 		drv_data->reg_base + drv_data->reg_offsets.control);
+
+	if (drv_data->errata_delay)
+		udelay(5);
+
 	drv_data->state = MV64XXX_I2C_STATE_IDLE;
 }
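
For readers who want to see the effect of the hunk outside the kernel tree, the following is a minimal userspace sketch, not the driver code. It assumes a mock register struct in place of MMIO, a counting stand-in for udelay(), and hypothetical sketch_* names; the control bit values are illustrative only. It shows that placing the delay inside the init routine covers every caller, including a resume path that simply calls it, and that the delay is skipped when errata_delay is false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's control bits (illustrative values). */
#define SKETCH_CONTROL_TWSIEN 0x40u
#define SKETCH_CONTROL_STOP   0x10u

enum sketch_state { SKETCH_STATE_INVALID, SKETCH_STATE_IDLE };

/* Mock of the per-controller data; plain fields replace the MMIO registers. */
struct sketch_i2c_data {
	uint32_t control;         /* mock control register */
	uint32_t ext_addr;        /* mock extended address register */
	bool errata_delay;        /* mirrors the errata_delay flag from the patch */
	enum sketch_state state;
	unsigned int delayed_us;  /* microseconds the mock has "waited" so far */
};

/* Stand-in for udelay(): records the requested delay instead of busy-waiting. */
static void sketch_udelay(struct sketch_i2c_data *drv, unsigned int usecs)
{
	drv->delayed_us += usecs;
}

/* Mirrors the shape of the patched init routine: reset, optional delay, idle. */
static void sketch_hw_init(struct sketch_i2c_data *drv)
{
	assert(drv != NULL);

	drv->ext_addr = 0;
	drv->control = SKETCH_CONTROL_TWSIEN | SKETCH_CONTROL_STOP;

	/* Wait after the reset only on controllers flagged with the erratum. */
	if (drv->errata_delay)
		sketch_udelay(drv, 5);

	drv->state = SKETCH_STATE_IDLE;
}

/* A resume path that only re-initialises the hardware inherits the delay. */
static void sketch_runtime_resume(struct sketch_i2c_data *drv)
{
	sketch_hw_init(drv);
}
```

Calling sketch_runtime_resume() on a zero-initialised struct with errata_delay set leaves delayed_us at 5 and the state idle; with errata_delay clear, delayed_us stays at 0. Whether the real hardware needs the delay on every init or only on resume is left open in the thread above.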