i2c: iproc: reset bus after timeout if START_BUSY is stuck

Message ID 20230904090005.52622-1-jonas.gorski@bisdn.de
State New

Commit Message

Jonas Gorski Sept. 4, 2023, 9 a.m. UTC
If a transaction times out, the START_BUSY signal may have gotten stuck,
and subsequent transaction attempts will fail as the bus is still
considered busy.

To work around this, check if the START_BUSY bit is still asserted, and
reset the controller in case it is.

This is also done by the alternative, non-upstream iproc-smbus driver
implementation [1].

Works around situations like:

    bcm-iproc-2c 1803b000.i2c: transaction timed out
    bcm-iproc-2c 1803b000.i2c: bus is busy
    bcm-iproc-2c 1803b000.i2c: bus is busy
    bcm-iproc-2c 1803b000.i2c: bus is busy
    bcm-iproc-2c 1803b000.i2c: bus is busy
    bcm-iproc-2c 1803b000.i2c: bus is busy
    ...

where the bus never recovers after a timeout.

[1] https://github.com/opencomputeproject/onie/blob/master/patches/kernel/3.2.69/driver-iproc-smbus.patch

Fixes: e6e5dd3566e0 ("i2c: iproc: Add Broadcom iProc I2C Driver")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
---
The iproc-smbus driver does some additional checks/mitigations, but
since my understanding of I2C is only rudimentary, I didn't add them.
The reset alone was enough to fix the issue I was seeing.

I was a bit conflicted about the Fixes tag, but since this fixes/works
around observed misbehaviour, I decided to add one.

The issue was happening only in production, and only once per boot (so
far), but with 100% probability within a few hours.

 drivers/i2c/busses/i2c-bcm-iproc.c | 9 +++++++++
 1 file changed, 9 insertions(+)
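
To illustrate the failure mode and the workaround described above, here is a
minimal user-space sketch. It is not driver code: struct fake_i2c, fake_reset(),
fake_xfer() and count_busy_after_timeout() are hypothetical names, and the
position of the START_BUSY bit is an assumption made for this model only. The
model hangs one transfer so that START_BUSY stays set, then counts how many
later transfers fail with -EBUSY, with and without a reset in the timeout path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position, for this model only; not taken from the patch. */
#define START_BUSY (UINT32_C(1) << 31)

/* Hypothetical stand-in for the controller: a single command register. */
struct fake_i2c {
	uint32_t m_cmd;     /* models the M_CMD register */
	bool hw_hangs_once; /* the next transfer hangs and leaves START_BUSY set */
};

/* Models a controller reset: the command register returns to idle. */
void fake_reset(struct fake_i2c *dev)
{
	dev->m_cmd = 0;
}

/*
 * Models one transfer. Returns 0 on success, -EBUSY if START_BUSY is
 * already set when the transfer begins, -ETIMEDOUT if the hardware hangs.
 */
int fake_xfer(struct fake_i2c *dev, bool recover_on_timeout)
{
	assert(dev != NULL);

	if (dev->m_cmd & START_BUSY)
		return -EBUSY;            /* "bus is busy" */

	dev->m_cmd |= START_BUSY;         /* start the transaction */

	if (dev->hw_hangs_once) {
		dev->hw_hangs_once = false;
		/* Timeout path: the hardware never cleared START_BUSY. */
		if (recover_on_timeout && (dev->m_cmd & START_BUSY))
			fake_reset(dev);
		return -ETIMEDOUT;        /* "transaction timed out" */
	}

	dev->m_cmd &= ~START_BUSY;        /* hardware completes normally */
	return 0;
}

/* Runs one hanging transfer, then `tries` more, and counts -EBUSY results. */
int count_busy_after_timeout(bool recover_on_timeout, int tries)
{
	struct fake_i2c dev = { .m_cmd = 0, .hw_hangs_once = true };
	int busy = 0;

	(void)fake_xfer(&dev, recover_on_timeout); /* the timed-out transfer */
	for (int i = 0; i < tries; i++)
		if (fake_xfer(&dev, recover_on_timeout) == -EBUSY)
			busy++;
	return busy;
}
```

In this model, count_busy_after_timeout(false, 5) reports every follow-up
transfer as busy, which mirrors the log excerpt in the commit message, while
count_busy_after_timeout(true, 5) reports none.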

Comments

Ray Jui Sept. 5, 2023, 4:26 p.m. UTC | #1
Hi Jonas,

On 9/4/2023 2:00 AM, Jonas Gorski wrote:
> If a transaction times out, the START_BUSY signal may have gotten stuck,
> and subsequent transaction attempts will fail as the bus is still
> considered busy.
> 
> To work around this, check if the START_BUSY bit is still asserted, and
> reset the controller in case it is.
> 
> This is also done by the alternative, non-upstream iproc-smbus driver
> implementation [1].
> 
> Works around situations like:
> 
>     bcm-iproc-2c 1803b000.i2c: transaction timed out
>     bcm-iproc-2c 1803b000.i2c: bus is busy
>     bcm-iproc-2c 1803b000.i2c: bus is busy
>     bcm-iproc-2c 1803b000.i2c: bus is busy
>     bcm-iproc-2c 1803b000.i2c: bus is busy
>     bcm-iproc-2c 1803b000.i2c: bus is busy
>     ...
> 
> where the bus never recovers after a timeout.
> 
> [1] https://github.com/opencomputeproject/onie/blob/master/patches/kernel/3.2.69/driver-iproc-smbus.patch
> 
> Fixes: e6e5dd3566e0 ("i2c: iproc: Add Broadcom iProc I2C Driver")
> Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
> ---
> The iproc-smbus driver does some additional checks/mitigations, but
> since my understanding of I2C is only rudimentary, I didn't add them.
> The reset alone was enough to fix the issue I was seeing.
> 
> I was a bit conflicted about the Fixes tag, but since this fixes/works
> around observed misbehaviour, I decided to add one.
> 
> The issue was happening only in production, and only once per boot (so
> far), but with 100% probability within a few hours.
> 
>  drivers/i2c/busses/i2c-bcm-iproc.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/i2c/busses/i2c-bcm-iproc.c b/drivers/i2c/busses/i2c-bcm-iproc.c
> index 05c80680dff4..69f9c199fa3b 100644
> --- a/drivers/i2c/busses/i2c-bcm-iproc.c
> +++ b/drivers/i2c/busses/i2c-bcm-iproc.c
> @@ -796,6 +796,15 @@ static int bcm_iproc_i2c_xfer_wait(struct bcm_iproc_i2c_dev *iproc_i2c,
>  	if (!time_left && !iproc_i2c->xfer_is_done) {
>  		dev_err(iproc_i2c->device, "transaction timed out\n");
>  
> +		/* check if START_BUSY did not clear */

Can you please rephrase the comment to make it more clear?

For example, something like this:

/*
 * If START_BUSY is still not clear, it means the controller may have
 * been stuck. In this case, reset the controller to recover.
 */

> +		if (!!(iproc_i2c_rd_reg(iproc_i2c, M_CMD_OFFSET) &
> +		       BIT(M_CMD_START_BUSY_SHIFT))) {
> +			/* re-initialize i2c for recovery */
> +			bcm_iproc_i2c_enable_disable(iproc_i2c, false);
> +			bcm_iproc_i2c_init(iproc_i2c);
> +			bcm_iproc_i2c_enable_disable(iproc_i2c, true);
> +		}
> +
>  		/* flush both TX/RX FIFOs */
>  		val = BIT(M_FIFO_RX_FLUSH_SHIFT) | BIT(M_FIFO_TX_FLUSH_SHIFT);
>  		iproc_i2c_wr_reg(iproc_i2c, M_FIFO_CTRL_OFFSET, val);

Jonas Gorski Sept. 8, 2023, 2:03 p.m. UTC | #2
Hi,

On Wed, Sep 6, 2023 at 00:53, Andi Shyti <andi.shyti@kernel.org> wrote:
>
> Hi Jonas,
>
> On Mon, Sep 04, 2023 at 11:00:04AM +0200, Jonas Gorski wrote:
> > If a transaction times out, the START_BUSY signal may have gotten stuck,
> > and subsequent transaction attempts will fail as the bus is still
> > considered busy.
> >
> > To work around this, check if the START_BUSY bit is still asserted, and
> > reset the controller in case it is.
> >
> > This is also done by the alternative, non-upstream iproc-smbus driver
> > implementation [1].
> >
> > Works around situations like:
> >
> >     bcm-iproc-2c 1803b000.i2c: transaction timed out
> >     bcm-iproc-2c 1803b000.i2c: bus is busy
> >     bcm-iproc-2c 1803b000.i2c: bus is busy
> >     bcm-iproc-2c 1803b000.i2c: bus is busy
> >     bcm-iproc-2c 1803b000.i2c: bus is busy
> >     bcm-iproc-2c 1803b000.i2c: bus is busy
> >     ...
> >
> > where the bus never recovers after a timeout.
> >
> > [1] https://github.com/opencomputeproject/onie/blob/master/patches/kernel/3.2.69/driver-iproc-smbus.patch
> >
> > Fixes: e6e5dd3566e0 ("i2c: iproc: Add Broadcom iProc I2C Driver")
> > Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
>
> I think the right Fixes tag should be:
>
> Fixes: 3f98ad45e585 ("i2c: iproc: add polling support")

That was the last change to that part of the code, but the "issue" was
not introduced there. The code before that already did a timeout check
and a flush in that case, without the reset.

Obviously the fix wouldn't apply unchanged to a version without that
commit, but such a version would nevertheless be affected by the
issue. That's why I chose the commit introducing the timeout handling.

> Cc: Rayagonda Kokatanur <rayagonda.kokatanur@broadcom.com>
> Cc: <stable@vger.kernel.org> # v5.2+
>
> > ---
> > The iproc-smbus driver does some additional checks/mitigations, but
> > since my understanding of I2C is only rudimentary, I didn't add them.
> > The reset alone was enough to fix the issue I was seeing.
> >
> > I was a bit conflicted about the Fixes tag, but since this fixes/works
> > around observed misbehaviour, I decided to add one.
> >
> > The issue was happening only in production, and only once per boot (so
> > far), but with 100% probability within a few hours.
> >
> >  drivers/i2c/busses/i2c-bcm-iproc.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/drivers/i2c/busses/i2c-bcm-iproc.c b/drivers/i2c/busses/i2c-bcm-iproc.c
> > index 05c80680dff4..69f9c199fa3b 100644
> > --- a/drivers/i2c/busses/i2c-bcm-iproc.c
> > +++ b/drivers/i2c/busses/i2c-bcm-iproc.c
> > @@ -796,6 +796,15 @@ static int bcm_iproc_i2c_xfer_wait(struct bcm_iproc_i2c_dev *iproc_i2c,
> >       if (!time_left && !iproc_i2c->xfer_is_done) {
> >               dev_err(iproc_i2c->device, "transaction timed out\n");
> >
> > +             /* check if START_BUSY did not clear */
>
> as Ray asked, can you please expand this comment?

Will do, thanks for the reviews!

Best Regards,
Jonas

Patch

diff --git a/drivers/i2c/busses/i2c-bcm-iproc.c b/drivers/i2c/busses/i2c-bcm-iproc.c
index 05c80680dff4..69f9c199fa3b 100644
--- a/drivers/i2c/busses/i2c-bcm-iproc.c
+++ b/drivers/i2c/busses/i2c-bcm-iproc.c
@@ -796,6 +796,15 @@  static int bcm_iproc_i2c_xfer_wait(struct bcm_iproc_i2c_dev *iproc_i2c,
 	if (!time_left && !iproc_i2c->xfer_is_done) {
 		dev_err(iproc_i2c->device, "transaction timed out\n");
 
+		/* check if START_BUSY did not clear */
+		if (!!(iproc_i2c_rd_reg(iproc_i2c, M_CMD_OFFSET) &
+		       BIT(M_CMD_START_BUSY_SHIFT))) {
+			/* re-initialize i2c for recovery */
+			bcm_iproc_i2c_enable_disable(iproc_i2c, false);
+			bcm_iproc_i2c_init(iproc_i2c);
+			bcm_iproc_i2c_enable_disable(iproc_i2c, true);
+		}
+
 		/* flush both TX/RX FIFOs */
 		val = BIT(M_FIFO_RX_FLUSH_SHIFT) | BIT(M_FIFO_TX_FLUSH_SHIFT);
 		iproc_i2c_wr_reg(iproc_i2c, M_FIFO_CTRL_OFFSET, val);
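
The condition added by the hunk tests a single bit of the value read from
M_CMD. As a rough user-space sketch of that test in isolation: BIT() is
redefined locally as a stand-in for the kernel macro, the value of
M_CMD_START_BUSY_SHIFT is an assumption for illustration, and
start_busy_stuck() is a hypothetical helper that does not exist in the driver.
The comment uses wording along the lines the review asked for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n)                 (UINT32_C(1) << (n))
/* Assumed value, for illustration only. */
#define M_CMD_START_BUSY_SHIFT 31

/*
 * If START_BUSY is still not clear after a timeout, the controller may
 * be stuck and needs a reset to recover.
 *
 * Hypothetical helper: takes the raw M_CMD register value and reports
 * whether the START_BUSY bit is set.
 */
bool start_busy_stuck(uint32_t m_cmd)
{
	/*
	 * A nonzero masked value already counts as true in a C condition,
	 * so no "!!" normalization is required for the test itself.
	 */
	return (m_cmd & BIT(M_CMD_START_BUSY_SHIFT)) != 0;
}
```

Only the START_BUSY bit decides the result; any other bits set in the register
value are ignored by the mask.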