diff mbox series

[v7,2/5] i2c: core: run atomic i2c xfer when !preemptible

Message ID 20230327-tegra-pmic-reboot-v7-2-18699d5dcd76@skidata.com
State New
Headers show
Series mfd: tps6586x: register restart handler | expand

Commit Message

Benjamin Bara July 15, 2023, 7:53 a.m. UTC
From: Benjamin Bara <benjamin.bara@skidata.com>

Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
disabled. However, non-atomic i2c transfers require preemption (e.g. in
wait_for_completion() while waiting for the DMA).

panic() calls preempt_disable_notrace() before calling
emergency_restart(). Therefore, if an i2c device is used for the
restart, the xfer should be atomic. This avoids warnings like:

[   12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
[   12.676926] Voluntary context switch within RCU read-side critical section!
...
[   12.742376]  schedule_timeout from wait_for_completion_timeout+0x90/0x114
[   12.749179]  wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
...
[   12.994527]  atomic_notifier_call_chain from machine_restart+0x34/0x58
[   13.001050]  machine_restart from panic+0x2a8/0x32c

Use !preemptible() instead, which is basically the same check as
pre-v5.2.

Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
Cc: stable@vger.kernel.org # v5.2+
Suggested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Acked-by: Wolfram Sang <wsa@kernel.org>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com>
---
 drivers/i2c/i2c-core.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Chris Morgan Nov. 13, 2023, 1:12 a.m. UTC | #1
On Sat, Jul 15, 2023 at 09:53:24AM +0200, Benjamin Bara wrote:
> From: Benjamin Bara <benjamin.bara@skidata.com>
> 
> Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
> disabled. However, non-atomic i2c transfers require preemption (e.g. in
> wait_for_completion() while waiting for the DMA).
> 
> panic() calls preempt_disable_notrace() before calling
> emergency_restart(). Therefore, if an i2c device is used for the
> restart, the xfer should be atomic. This avoids warnings like:
> 
> [   12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
> [   12.676926] Voluntary context switch within RCU read-side critical section!
> ...
> [   12.742376]  schedule_timeout from wait_for_completion_timeout+0x90/0x114
> [   12.749179]  wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
> ...
> [   12.994527]  atomic_notifier_call_chain from machine_restart+0x34/0x58
> [   13.001050]  machine_restart from panic+0x2a8/0x32c
> 
> Use !preemptible() instead, which is basically the same check as
> pre-v5.2.
> 
> Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
> Cc: stable@vger.kernel.org # v5.2+
> Suggested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Acked-by: Wolfram Sang <wsa@kernel.org>
> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Tested-by: Nishanth Menon <nm@ti.com>
> Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com>

For kernel 6.7 I'm having an issue when I shutdown or reboot my
Rockchip RK3326 or Rockchip RK3566 based devices, and I've bisected
the issue down to this specific commit.

When I shutdown or restart the device, I receive messages in the kernel
log like the following:

[   37.121148] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
[   37.122178] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
[   37.123212] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
[   37.124226] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
[   37.125242] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
[   37.126133] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x1

The device will also occasionally freeze instead of rebooting or
shutting down. The i2c errors are consistent, but the freezing
behavior is not.

Thank you.

> ---
>  drivers/i2c/i2c-core.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/i2c/i2c-core.h b/drivers/i2c/i2c-core.h
> index 1247e6e6e975..05b8b8dfa9bd 100644
> --- a/drivers/i2c/i2c-core.h
> +++ b/drivers/i2c/i2c-core.h
> @@ -29,7 +29,7 @@ int i2c_dev_irq_from_resources(const struct resource *resources,
>   */
>  static inline bool i2c_in_atomic_xfer_mode(void)
>  {
> -	return system_state > SYSTEM_RUNNING && irqs_disabled();
> +	return system_state > SYSTEM_RUNNING && !preemptible();
>  }
>  
>  static inline int __i2c_lock_bus_helper(struct i2c_adapter *adap)
> 
> -- 
> 2.34.1
>
Dmitry Osipenko Nov. 13, 2023, 3:46 a.m. UTC | #2
On 11/13/23 04:12, Chris Morgan wrote:
> On Sat, Jul 15, 2023 at 09:53:24AM +0200, Benjamin Bara wrote:
>> From: Benjamin Bara <benjamin.bara@skidata.com>
>>
>> Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
>> disabled. However, non-atomic i2c transfers require preemption (e.g. in
>> wait_for_completion() while waiting for the DMA).
>>
>> panic() calls preempt_disable_notrace() before calling
>> emergency_restart(). Therefore, if an i2c device is used for the
>> restart, the xfer should be atomic. This avoids warnings like:
>>
>> [   12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
>> [   12.676926] Voluntary context switch within RCU read-side critical section!
>> ...
>> [   12.742376]  schedule_timeout from wait_for_completion_timeout+0x90/0x114
>> [   12.749179]  wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
>> ...
>> [   12.994527]  atomic_notifier_call_chain from machine_restart+0x34/0x58
>> [   13.001050]  machine_restart from panic+0x2a8/0x32c
>>
>> Use !preemptible() instead, which is basically the same check as
>> pre-v5.2.
>>
>> Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
>> Cc: stable@vger.kernel.org # v5.2+
>> Suggested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>> Acked-by: Wolfram Sang <wsa@kernel.org>
>> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>> Tested-by: Nishanth Menon <nm@ti.com>
>> Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com>
> 
> For kernel 6.7 I'm having an issue when I shutdown or reboot my
> Rockchip RK3326 or Rockchip RK3566 based devices, and I've bisected
> the issue down to this specific commit.
> 
> When I shutdown or restart the device, I receive messages in the kernel
> log like the following:
> 
> [   37.121148] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> [   37.122178] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> [   37.123212] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> [   37.124226] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> [   37.125242] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> [   37.126133] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x1
> 
> The device will also occasionally freeze instead of rebooting or
> shutting down. The i2c errors are consistent, but the freezing
> behavior is not.

I couldn't reproduce your issue with v6.7-rc1 and RK3399 that also uses rk3x-i2c. Though, the rk3x-i2c driver looks suspicious. Please try this patch:

diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
index a044ca0c35a1..aad00e9909cc 100644
--- a/drivers/i2c/busses/i2c-rk3x.c
+++ b/drivers/i2c/busses/i2c-rk3x.c
@@ -219,6 +219,8 @@ struct rk3x_i2c {
 	enum rk3x_i2c_state state;
 	unsigned int processed;
 	int error;
+
+	int irq;
 };
 
 static inline void i2c_writel(struct rk3x_i2c *i2c, u32 value,
@@ -1090,8 +1092,10 @@ static int rk3x_i2c_xfer_common(struct i2c_adapter *adap,
 		rk3x_i2c_start(i2c);
 
 		if (!polling) {
+			enable_irq(i2c->irq);
 			timeout = wait_event_timeout(i2c->wait, !i2c->busy,
 						     msecs_to_jiffies(WAIT_TIMEOUT));
+			disable_irq(i2c->irq);
 		} else {
 			timeout = rk3x_i2c_wait_xfer_poll(i2c);
 		}
@@ -1236,7 +1240,6 @@ static int rk3x_i2c_probe(struct platform_device *pdev)
 	int ret = 0;
 	int bus_nr;
 	u32 value;
-	int irq;
 	unsigned long clk_rate;
 
 	i2c = devm_kzalloc(&pdev->dev, sizeof(struct rk3x_i2c), GFP_KERNEL);
@@ -1299,11 +1302,14 @@ static int rk3x_i2c_probe(struct platform_device *pdev)
 	}
 
 	/* IRQ setup */
-	irq = platform_get_irq(pdev, 0);
-	if (irq < 0)
-		return irq;
+	i2c->irq = platform_get_irq(pdev, 0);
+	if (i2c->irq < 0)
+		return i2c->irq;
+
+	/* interrupt will be enabled during of transfer time */
+	irq_set_status_flags(i2c->irq, IRQ_NOAUTOEN);
 
-	ret = devm_request_irq(&pdev->dev, irq, rk3x_i2c_irq,
+	ret = devm_request_irq(&pdev->dev, i2c->irq, rk3x_i2c_irq,
 			       0, dev_name(&pdev->dev), i2c);
 	if (ret < 0) {
 		dev_err(&pdev->dev, "cannot request IRQ\n");
Chris Morgan Nov. 13, 2023, 2:54 p.m. UTC | #3
On Mon, Nov 13, 2023 at 06:46:30AM +0300, Dmitry Osipenko wrote:
> On 11/13/23 04:12, Chris Morgan wrote:
> > On Sat, Jul 15, 2023 at 09:53:24AM +0200, Benjamin Bara wrote:
> >> From: Benjamin Bara <benjamin.bara@skidata.com>
> >>
> >> Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is
> >> disabled. However, non-atomic i2c transfers require preemption (e.g. in
> >> wait_for_completion() while waiting for the DMA).
> >>
> >> panic() calls preempt_disable_notrace() before calling
> >> emergency_restart(). Therefore, if an i2c device is used for the
> >> restart, the xfer should be atomic. This avoids warnings like:
> >>
> >> [   12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0
> >> [   12.676926] Voluntary context switch within RCU read-side critical section!
> >> ...
> >> [   12.742376]  schedule_timeout from wait_for_completion_timeout+0x90/0x114
> >> [   12.749179]  wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70
> >> ...
> >> [   12.994527]  atomic_notifier_call_chain from machine_restart+0x34/0x58
> >> [   13.001050]  machine_restart from panic+0x2a8/0x32c
> >>
> >> Use !preemptible() instead, which is basically the same check as
> >> pre-v5.2.
> >>
> >> Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()")
> >> Cc: stable@vger.kernel.org # v5.2+
> >> Suggested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> >> Acked-by: Wolfram Sang <wsa@kernel.org>
> >> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> >> Tested-by: Nishanth Menon <nm@ti.com>
> >> Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com>
> > 
> > For kernel 6.7 I'm having an issue when I shutdown or reboot my
> > Rockchip RK3326 or Rockchip RK3566 based devices, and I've bisected
> > the issue down to this specific commit.
> > 
> > When I shutdown or restart the device, I receive messages in the kernel
> > log like the following:
> > 
> > [   37.121148] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> > [   37.122178] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> > [   37.123212] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> > [   37.124226] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> > [   37.125242] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x3
> > [   37.126133] rk3x-i2c fdd40000.i2c: irq in STATE_IDLE, ipd = 0x1
> > 
> > The device will also occasionally freeze instead of rebooting or
> > shutting down. The i2c errors are consistent, but the freezing
> > behavior is not.
> 
> I couldn't reproduce your issue with v6.7-rc1 and RK3399 that also uses rk3x-i2c. Though, the rk3x-i2c driver looks suspicious. Please try this patch:
> 
> diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
> index a044ca0c35a1..aad00e9909cc 100644
> --- a/drivers/i2c/busses/i2c-rk3x.c
> +++ b/drivers/i2c/busses/i2c-rk3x.c
> @@ -219,6 +219,8 @@ struct rk3x_i2c {
>  	enum rk3x_i2c_state state;
>  	unsigned int processed;
>  	int error;
> +
> +	int irq;
>  };
>  
>  static inline void i2c_writel(struct rk3x_i2c *i2c, u32 value,
> @@ -1090,8 +1092,10 @@ static int rk3x_i2c_xfer_common(struct i2c_adapter *adap,
>  		rk3x_i2c_start(i2c);
>  
>  		if (!polling) {
> +			enable_irq(i2c->irq);
>  			timeout = wait_event_timeout(i2c->wait, !i2c->busy,
>  						     msecs_to_jiffies(WAIT_TIMEOUT));
> +			disable_irq(i2c->irq);
>  		} else {
>  			timeout = rk3x_i2c_wait_xfer_poll(i2c);
>  		}
> @@ -1236,7 +1240,6 @@ static int rk3x_i2c_probe(struct platform_device *pdev)
>  	int ret = 0;
>  	int bus_nr;
>  	u32 value;
> -	int irq;
>  	unsigned long clk_rate;
>  
>  	i2c = devm_kzalloc(&pdev->dev, sizeof(struct rk3x_i2c), GFP_KERNEL);
> @@ -1299,11 +1302,14 @@ static int rk3x_i2c_probe(struct platform_device *pdev)
>  	}
>  
>  	/* IRQ setup */
> -	irq = platform_get_irq(pdev, 0);
> -	if (irq < 0)
> -		return irq;
> +	i2c->irq = platform_get_irq(pdev, 0);
> +	if (i2c->irq < 0)
> +		return i2c->irq;
> +
> +	/* interrupt will be enabled during of transfer time */
> +	irq_set_status_flags(i2c->irq, IRQ_NOAUTOEN);
>  
> -	ret = devm_request_irq(&pdev->dev, irq, rk3x_i2c_irq,
> +	ret = devm_request_irq(&pdev->dev, i2c->irq, rk3x_i2c_irq,
>  			       0, dev_name(&pdev->dev), i2c);
>  	if (ret < 0) {
>  		dev_err(&pdev->dev, "cannot request IRQ\n");
> 

I can confirm I no longer get any of the errors with this patch. Tested
on both an Anbernic RG353P (RK3566 with an RK817 PMIC) and an Odroid
Go Advance (RK3326 with an RK817 PMIC). The device appears to shut
down consistently again and I no longer see these messages in my dmesg
log when I shut down.

Thank you.

> 
> -- 
> Best regards,
> Dmitry
>
Chris Morgan Nov. 14, 2023, 6:43 p.m. UTC | #4
On Mon, Nov 13, 2023 at 04:48:26PM +0100, Benjamin Bara wrote:
> Hi!
> 
> Thanks for testing and the feedback!
> 
> On Mon, 13 Nov 2023 at 15:54, Chris Morgan <macroalpha82@gmail.com> wrote:
> > I can confirm I no longer get any of the errors with this patch. Tested
> > on both an Anbernic RG353P (RK3566 with an RK817 PMIC) and an Odroid
> > Go Advance (RK3326 with an RK817 PMIC). The device appears to shut
> > down consistently again and I no longer see these messages in my dmesg
> > log when I shut down.
> 
> Just to make sure: Are you compiling with CONFIG_PREEMPTION (and
> therefore CONFIG_PREEMPT_COUNT)?
> 
> If yes, could you please also test the following patch? Because I am not
> sure yet how polling can be false in a "polling required" situation,
> meaning .master_xfer() is called instead of .master_xfer_atomic() (while
> your test shows that irq_disabled() is true, which is basically done
> with !preemptible()). The patch should test the other way round: if the
> situation is found, force an atomic transfer instead.
> 
> Thank you!
> 
> diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
> index a044ca0c35a1..6e3e8433018f 100644
> --- a/drivers/i2c/busses/i2c-rk3x.c
> +++ b/drivers/i2c/busses/i2c-rk3x.c
> @@ -1131,6 +1131,10 @@ static int rk3x_i2c_xfer_common(struct i2c_adapter *adap,
>  static int rk3x_i2c_xfer(struct i2c_adapter *adap,
>                          struct i2c_msg *msgs, int num)
>  {
> +       if (irqs_disabled()) {
> +               WARN_ONCE(1, "Landed in non-atomic handler with disabled IRQs");
> +               return rk3x_i2c_xfer_common(adap, msgs, num, true);
> +       }
>         return rk3x_i2c_xfer_common(adap, msgs, num, false);
>  }
> 

I have CONFIG_PREEMPT_VOLUNTARY=y but CONFIG_PREEMPTION is not set.

Thank you.
Dmitry Osipenko Nov. 15, 2023, 6 a.m. UTC | #5
On 11/13/23 17:54, Chris Morgan wrote:
..
> I can confirm I no longer get any of the errors with this patch. Tested
> on both an Anbernic RG353P (RK3566 with an RK817 PMIC) and an Odroid
> Go Advance (RK3326 with an RK817 PMIC). The device appears to shut
> down consistently again and I no longer see these messages in my dmesg
> log when I shut down.

I'll prepare the proper patch, thanks.
diff mbox series

Patch

diff --git a/drivers/i2c/i2c-core.h b/drivers/i2c/i2c-core.h
index 1247e6e6e975..05b8b8dfa9bd 100644
--- a/drivers/i2c/i2c-core.h
+++ b/drivers/i2c/i2c-core.h
@@ -29,7 +29,7 @@  int i2c_dev_irq_from_resources(const struct resource *resources,
  */
 static inline bool i2c_in_atomic_xfer_mode(void)
 {
-	return system_state > SYSTEM_RUNNING && irqs_disabled();
+	return system_state > SYSTEM_RUNNING && !preemptible();
 }
 
 static inline int __i2c_lock_bus_helper(struct i2c_adapter *adap)