diff mbox series

gpio: pca953x: fix pca953x_irq_bus_sync_unlock race

Message ID 20240620042915.2173-1-ian.ray@gehealthcare.com
State New
Headers show
Series gpio: pca953x: fix pca953x_irq_bus_sync_unlock race | expand

Commit Message

Ian Ray June 20, 2024, 4:29 a.m. UTC
Ensure that `i2c_lock' is held when setting interrupt latch and mask in
pca953x_irq_bus_sync_unlock() in order to avoid races.

The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
lock is held before calling pca953x_write_regs().

The problem occurred when a request raced against irq_bus_sync_unlock()
approximately once per thousand reboots on an i.MX8MP based system.

 * Normal case

   0-0022: write register AI|3a {03,02,00,00,01} Input latch P0
   0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
   0-0022: write register AI|08 {ff,00,00,00,00} Output P3
   0-0022: write register AI|12 {fc,00,00,00,00} Config P3

 * Race case

   0-0022: write register AI|08 {ff,00,00,00,00} Output P3
   0-0022: write register AI|08 {03,02,00,00,01} *** Wrong register ***
   0-0022: write register AI|12 {fc,00,00,00,00} Config P3
   0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0

Signed-off-by: Ian Ray <ian.ray@gehealthcare.com>
---
 drivers/gpio/gpio-pca953x.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Bartosz Golaszewski June 21, 2024, 2:21 p.m. UTC | #1
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>


On Thu, 20 Jun 2024 07:29:15 +0300, Ian Ray wrote:
> Ensure that `i2c_lock' is held when setting interrupt latch and mask in
> pca953x_irq_bus_sync_unlock() in order to avoid races.
> 
> The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
> lock is held before calling pca953x_write_regs().
> 
> The problem occurred when a request raced against irq_bus_sync_unlock()
> approximately once per thousand reboots on an i.MX8MP based system.
> 
> [...]

Applied, thanks!

[1/1] gpio: pca953x: fix pca953x_irq_bus_sync_unlock race
      commit: bfc6444b57dc7186b6acc964705d7516cbaf3904

Best regards,
Ian Ray Sept. 27, 2024, 11:36 a.m. UTC | #2
On Fri, Sep 27, 2024 at 11:49:04AM +0200, Jean Delvare wrote:
> 
> Hello Ian,
> 
> On Thu, 2024-06-20 at 07:29 +0300, Ian Ray wrote:
> > Ensure that `i2c_lock' is held when setting interrupt latch and mask in
> > pca953x_irq_bus_sync_unlock() in order to avoid races.
> >
> > The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
> > lock is held before calling pca953x_write_regs().
> >
> > The problem occurred when a request raced against irq_bus_sync_unlock()
> > approximately once per thousand reboots on an i.MX8MP based system.
:
> > --- a/drivers/gpio/gpio-pca953x.c
> > +++ b/drivers/gpio/gpio-pca953x.c
> > @@ -758,6 +758,8 @@ static void pca953x_irq_bus_sync_unlock(struct irq_data *d)
> >         int level;
> >
> >         if (chip->driver_data & PCA_PCAL) {
> > +               guard(mutex)(&chip->i2c_lock);
> > +
> >                 /* Enable latch on interrupt-enabled inputs */
> >                 pca953x_write_regs(chip, PCAL953X_IN_LATCH, chip->irq_mask);
> >
> 
> I've been asked to backport this fix to SUSE kernels and I have a
> concern about it.
> 
> You take the i2c_lock mutex inside the (chip->driver_data & PCA_PCAL)
> conditional block, where pca953x_write_regs() is being called, and the
> commit description implies this is indeed the call you wanted to
> protect.
> 
> However, immediately after the conditional block, the common code path
> includes a call to pca953x_read_regs(). Looking at the rest of the
> driver code, I see that the i2c_lock mutex is *also* always held
> (except during device probe) when calling this function. Which isn't
> really surprising as I seem to understand the device uses a banked
> register addressing, and this typically affects both reading from and
> writing to registers.
> 
> So I suspect the i2c_lock mutex needs to be held for this call to
> pca953x_read_regs() as well (unless you are familiar with the register
> map and know for sure that the "direction" register is outside of the
> banked register range).

Hello Jean,

Direction is indeed banked (see, for example, PCA953x_BANK_CONFIG).

It certainly looks plausible that a race between
pca953x_gpio_direction_input or pca953x_gpio_direction_output and 
the register read in pca953x_irq_bus_sync_unlock may occur.

In practice, I think that this is unlikely to ever be observed because
(IMHO) GPIO direction is rarely changed after initialization.
(Disclaimer: this is true for the embedded systems I work with.)

Hope this clarifies things.

Thanks,
Ian


> 
> I'm not familiar with the gpio-pca953x driver at all so I may be
> missing something and maybe everything is actually fine, but I would
> appreciate if someone could take a look and give a second opinion.
> 
> Thanks,
> --
> Jean Delvare
> SUSE L3 Support
>
Bartosz Golaszewski Sept. 27, 2024, 11:40 a.m. UTC | #3
On Fri, Sep 27, 2024 at 1:36 PM Ian Ray <ian.ray@gehealthcare.com> wrote:
>
> On Fri, Sep 27, 2024 at 11:49:04AM +0200, Jean Delvare wrote:
> >
> > Hello Ian,
> >
> > On Thu, 2024-06-20 at 07:29 +0300, Ian Ray wrote:
> > > Ensure that `i2c_lock' is held when setting interrupt latch and mask in
> > > pca953x_irq_bus_sync_unlock() in order to avoid races.
> > >
> > > The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
> > > lock is held before calling pca953x_write_regs().
> > >
> > > The problem occurred when a request raced against irq_bus_sync_unlock()
> > > approximately once per thousand reboots on an i.MX8MP based system.
> :
> > > --- a/drivers/gpio/gpio-pca953x.c
> > > +++ b/drivers/gpio/gpio-pca953x.c
> > > @@ -758,6 +758,8 @@ static void pca953x_irq_bus_sync_unlock(struct irq_data *d)
> > >         int level;
> > >
> > >         if (chip->driver_data & PCA_PCAL) {
> > > +               guard(mutex)(&chip->i2c_lock);
> > > +
> > >                 /* Enable latch on interrupt-enabled inputs */
> > >                 pca953x_write_regs(chip, PCAL953X_IN_LATCH, chip->irq_mask);
> > >
> >
> > I've been asked to backport this fix to SUSE kernels and I have a
> > concern about it.
> >
> > You take the i2c_lock mutex inside the (chip->driver_data & PCA_PCAL)
> > conditional block, where pca953x_write_regs() is being called, and the
> > commit description implies this is indeed the call you wanted to
> > protect.
> >
> > However, immediately after the conditional block, the common code path
> > includes a call to pca953x_read_regs(). Looking at the rest of the
> > driver code, I see that the i2c_lock mutex is *also* always held
> > (except during device probe) when calling this function. Which isn't
> > really surprising as I seem to understand the device uses a banked
> > register addressing, and this typically affects both reading from and
> > writing to registers.
> >
> > So I suspect the i2c_lock mutex needs to be held for this call to
> > pca953x_read_regs() as well (unless you are familiar with the register
> > map and know for sure that the "direction" register is outside of the
> > banked register range).
>
> Hello Jean,
>
> Direction is indeed banked (see, for example, PCA953x_BANK_CONFIG).
>
> It certainly looks plausible that a race between
> pca953x_gpio_direction_input or pca953x_gpio_direction_output and
> the register read in pca953x_irq_bus_sync_unlock may occur.
>
> In practice, I think that this is unlikely to ever be observed because
> (IMHO) GPIO direction is rarely changed after initialization.
> (Disclaimer: this is true for the embedded systems I work with.)
>
> Hope this clarifies things.
>

I'd argue that this is the case for kernel users but you can never
tell what the user-space will do. I think this may be a valid concern
and worth addressing.

Bart
Jean Delvare Oct. 7, 2024, 9:16 p.m. UTC | #4
Hi Ray,

On Thu, 2024-06-20 at 07:29 +0300, Ian Ray wrote:
> Ensure that `i2c_lock' is held when setting interrupt latch and mask in
> pca953x_irq_bus_sync_unlock() in order to avoid races.
> 
> The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
> lock is held before calling pca953x_write_regs().
> 
> The problem occurred when a request raced against irq_bus_sync_unlock()
> approximately once per thousand reboots on an i.MX8MP based system.
> 
>  * Normal case
> 
>    0-0022: write register AI|3a {03,02,00,00,01} Input latch P0
>    0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
>    0-0022: write register AI|08 {ff,00,00,00,00} Output P3
>    0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> 
>  * Race case
> 
>    0-0022: write register AI|08 {ff,00,00,00,00} Output P3
>    0-0022: write register AI|08 {03,02,00,00,01} *** Wrong register ***
>    0-0022: write register AI|12 {fc,00,00,00,00} Config P3
>    0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> 

I have more questions on this. Where does the above log come from?
Specifically, at which layer (bus driver, regmap, gpio device drier)?
What do these values represent exactly? Which GPIO chip was used on
your system? Which i2c bus driver is being used on that system? What
are the "requests" you mention in the description above?

I'm asking because I do not understand how writing to the wrong
register can happen, even without holding i2c_lock in
pca953x_irq_bus_sync_unlock(). The i2c layer has a per-i2c_adapter lock
which is taken before any bus transfer, so it isn't possible that two
transfers collide at the bus level. So the lack of locking at the
device driver level could lead to data corruption (for example read-
modify-write cycles overlapping), but not to data being written to the
wrong register.

As a side note, I dug through the history of the gpio-pca953x driver
and found that i2c_lock was introduced before the driver was converted
to regmap by:

commit 6e20fb18054c179d7e64c0af43d855b9310a3394
Author: Roland Stigge
Date:   Thu Feb 10 15:01:23 2011 -0800

    drivers/gpio/pca953x.c: add a mutex to fix race condition

The fix added locking around read-modify-write cycles (which was indeed
needed) and also around simple register reads (which I don't think was
needed).

It turns out that regmap has its own protection around read-modify-
write cycles (see regmap_update_bits_base) so I think several uses of
i2c_lock should have been removed from the gpio-pca953x driver when it
was converted to regmap as they became redundant then. This driver-side
lock is still needed in a number of functions though, where the read-
modify-write is handled outside of regmap (for example in
pca953x_gpio_set_multiple).

Thanks,
Ian Ray Oct. 8, 2024, 6:02 a.m. UTC | #5
On Mon, Oct 07, 2024 at 11:16:51PM +0200, Jean Delvare wrote:
> Hi Ray,
> 
> On Thu, 2024-06-20 at 07:29 +0300, Ian Ray wrote:
> > Ensure that `i2c_lock' is held when setting interrupt latch and mask in
> > pca953x_irq_bus_sync_unlock() in order to avoid races.
> >
> > The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
> > lock is held before calling pca953x_write_regs().
> >
> > The problem occurred when a request raced against irq_bus_sync_unlock()
> > approximately once per thousand reboots on an i.MX8MP based system.
> >
> >  * Normal case
> >
> >    0-0022: write register AI|3a {03,02,00,00,01} Input latch P0
> >    0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> >    0-0022: write register AI|08 {ff,00,00,00,00} Output P3
> >    0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> >
> >  * Race case
> >
> >    0-0022: write register AI|08 {ff,00,00,00,00} Output P3
> >    0-0022: write register AI|08 {03,02,00,00,01} *** Wrong register ***
> >    0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> >    0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> >
> 
> I have more questions on this. Where does the above log come from?
> Specifically, at which layer (bus driver, regmap, gpio device drier)?

Additional debug, with manually added commentary (sorry for not being
clearer).  The debug was added to drivers/base/regmap/regmap-i2c.c while
investigating the issue.

> What do these values represent exactly? Which GPIO chip was used on
> your system? Which i2c bus driver is being used on that system? What
> are the "requests" you mention in the description above?

GPIO expander pi4ioe5v6534q at I2C address 0-0022.

# grep . {name,uevent}
name:30a20000.i2c
uevent:OF_NAME=i2c
uevent:OF_FULLNAME=/soc@0/bus@30800000/i2c@30a20000
uevent:OF_COMPATIBLE_0=fsl,imx8mp-i2c
uevent:OF_COMPATIBLE_1=fsl,imx21-i2c
uevent:OF_COMPATIBLE_N=2
uevent:OF_ALIAS_0=i2c0

> 
> I'm asking because I do not understand how writing to the wrong
> register can happen, even without holding i2c_lock in
> pca953x_irq_bus_sync_unlock(). The i2c layer has a per-i2c_adapter lock

Given that pca953x_irq_bus_sync_unlock is part of an interrupt handler,
IMHO this explains very well why locking is needed (but I did not dig
deeper than that).

> which is taken before any bus transfer, so it isn't possible that two
> transfers collide at the bus level. So the lack of locking at the
> device driver level could lead to data corruption (for example read-
> modify-write cycles overlapping), but not to data being written to the
> wrong register.

Based on the observed data, the hypothesis was that pca953x_write_regs
(called via pca953x_gpio_set_multiple) and pca953x_irq_bus_sync_unlock
can race.

The missing guard neatly explained and fixed the issue (disclaimer: on
my hardware for my scenario).

> 
> As a side note, I dug through the history of the gpio-pca953x driver
> and found that i2c_lock was introduced before the driver was converted
> to regmap by:
> 
> commit 6e20fb18054c179d7e64c0af43d855b9310a3394
> Author: Roland Stigge
> Date:   Thu Feb 10 15:01:23 2011 -0800
> 
>     drivers/gpio/pca953x.c: add a mutex to fix race condition
> 
> The fix added locking around read-modify-write cycles (which was indeed
> needed) and also around simple register reads (which I don't think was
> needed).
> 
> It turns out that regmap has its own protection around read-modify-
> write cycles (see regmap_update_bits_base) so I think several uses of
> i2c_lock should have been removed from the gpio-pca953x driver when it
> was converted to regmap as they became redundant then. This driver-side
> lock is still needed in a number of functions though, where the read-
> modify-write is handled outside of regmap (for example in
> pca953x_gpio_set_multiple).
> 

Blue skies,
Ian


> Thanks,
> --
> Jean Delvare
> SUSE L3 Support
Jean Delvare Oct. 18, 2024, 9:26 a.m. UTC | #6
Hi Ray,

Sorry for the delay.

On Tue, 2024-10-08 at 09:02 +0300, Ian Ray wrote:
> On Mon, Oct 07, 2024 at 11:16:51PM +0200, Jean Delvare wrote:
> > On Thu, 2024-06-20 at 07:29 +0300, Ian Ray wrote:
> > > Ensure that `i2c_lock' is held when setting interrupt latch and mask in
> > > pca953x_irq_bus_sync_unlock() in order to avoid races.
> > > 
> > > The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
> > > lock is held before calling pca953x_write_regs().
> > > 
> > > The problem occurred when a request raced against irq_bus_sync_unlock()
> > > approximately once per thousand reboots on an i.MX8MP based system.
> > > 
> > >  * Normal case
> > > 
> > >    0-0022: write register AI|3a {03,02,00,00,01} Input latch P0
> > >    0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> > >    0-0022: write register AI|08 {ff,00,00,00,00} Output P3
> > >    0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> > > 
> > >  * Race case
> > > 
> > >    0-0022: write register AI|08 {ff,00,00,00,00} Output P3
> > >    0-0022: write register AI|08 {03,02,00,00,01} *** Wrong register ***
> > >    0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> > >    0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> > > 
> > 
> > I have more questions on this. Where does the above log come from?
> > Specifically, at which layer (bus driver, regmap, gpio device drier)?
> 
> Additional debug, with manually added commentary (sorry for not being
> clearer).  The debug was added to drivers/base/regmap/regmap-i2c.c while
> investigating the issue.

FWIW, I think regmap includes a tracing facility which may have served
you. Specifically, I see calls to trace_regmap_hw_write_start() and
trace_regmap_hw_write_done() in _regmap_raw_write_impl(). But I must
confess I couldn't find where these functions are defined nor how to
enable tracing...

> > What do these values represent exactly? Which GPIO chip was used on
> > your system? Which i2c bus driver is being used on that system? What
> > are the "requests" you mention in the description above?
> 
> GPIO expander pi4ioe5v6534q at I2C address 0-0022.

This device model doesn't seem to be explicitly supported by driver
gpio-pca953x. I see it listed as compatible in
Documentation/devicetree/bindings/gpio/gpio-pca95xx.yaml but not in the
driver's pca953x_dt_ids. Out of curiosity, did you have to add it
manually? I admit I'm not familiar with these device tree node
declarations.

> # grep . {name,uevent}
> name:30a20000.i2c
> uevent:OF_NAME=i2c
> uevent:OF_FULLNAME=/soc@0/bus@30800000/i2c@30a20000
> uevent:OF_COMPATIBLE_0=fsl,imx8mp-i2c
> uevent:OF_COMPATIBLE_1=fsl,imx21-i2c
> uevent:OF_COMPATIBLE_N=2
> uevent:OF_ALIAS_0=i2c0

OK, so the underlying I2C master is capable of writing to multiple
registers at once. This helped me follow the code flow while trying to
figure out where the race was.

> > I'm asking because I do not understand how writing to the wrong
> > register can happen, even without holding i2c_lock in
> > pca953x_irq_bus_sync_unlock(). The i2c layer has a per-i2c_adapter lock
> 
> Given that pca953x_irq_bus_sync_unlock is part of an interrupt handler,
> IMHO this explains very well why locking is needed (but I did not dig
> deeper than that).

I took the time to dig deeper, my conclusions are below.

> > which is taken before any bus transfer, so it isn't possible that two
> > transfers collide at the bus level. So the lack of locking at the
> > device driver level could lead to data corruption (for example read-
> > modify-write cycles overlapping), but not to data being written to the
> > wrong register.
> 
> Based on the observed data, the hypothesis was that pca953x_write_regs
> (called via pca953x_gpio_set_multiple) and pca953x_irq_bus_sync_unlock
> can race.
> 
> The missing guard neatly explained and fixed the issue (disclaimer: on
> my hardware for my scenario).
> 
> > As a side note, I dug through the history of the gpio-pca953x driver
> > and found that i2c_lock was introduced before the driver was converted
> > to regmap by:
> > 
> > commit 6e20fb18054c179d7e64c0af43d855b9310a3394
> > Author: Roland Stigge
> > Date:   Thu Feb 10 15:01:23 2011 -0800
> > 
> >     drivers/gpio/pca953x.c: add a mutex to fix race condition
> > 
> > The fix added locking around read-modify-write cycles (which was indeed
> > needed) and also around simple register reads (which I don't think was
> > needed).
> > 
> > It turns out that regmap has its own protection around read-modify-
> > write cycles (see regmap_update_bits_base) so I think several uses of
> > i2c_lock should have been removed from the gpio-pca953x driver when it
> > was converted to regmap as they became redundant then.

I have to correct myself here. The regmap layer implements its own,
configurable and *optional* protection lock. It turns out that the
gpio-pca953x driver has it disabled:

static const struct regmap_config pca953x_i2c_regmap = {
	(...)
	.disable_locking = true,
	(...)
};

So it is expected and very needed that the gpio-pca953x driver
implements its own lock to protect against races whenever the hardware
is accessed.

> > This driver-side
> > lock is still needed in a number of functions though, where the read-
> > modify-write is handled outside of regmap (for example in
> > pca953x_gpio_set_multiple).

After reading the regmap code (which took me some time as I wasn't
familiar at all with it, I didn't know what I was looking for exactly
and I wanted to make sure I wasn't missing something along the way), I
think I understand what was racing exactly.

The gpio-pca953x driver uses regmap_bulk_write() which is implemented
by _regmap_raw_write_impl(). The register map uses an internal buffer
to prepare the actual hardware transfers:

struct regmap *__regmap_init(...) {
	(...)
	map->work_buf = kzalloc(map->format.buf_size, GFP_KERNEL);
	(...)
}

This work buffer has space for both the register address and the values
to be written to or read from the device:

	map->format.buf_size = DIV_ROUND_UP(config->reg_bits +
			config->val_bits + config->pad_bits, 8);

During a regmap raw write, the register address is written to the first
byte of the work buffer:

	map->format.format_reg(map->work_buf, reg, map->reg_shift);

where map->format.format_reg() is regmap_format_8() for the gpio-
pca953x driver:

static void regmap_format_8(void *buf, unsigned int val, unsigned int shift)
{
	u8 *b = buf;

	b[0] = val << shift;
}

If _regmap_raw_write_impl() is called concurrently without proper
locking then the contents of the work buffer may be overwritten by the
second caller before the first caller had a chance to use it. I think
this matches your debug log of the race case pretty well.

I checked the regmap implementation for other use cases of map-
>format.format_reg(map->work_buf, ...) and found it is being used in
_regmap_raw_read(), so I had to investigate further, because
pca953x_irq_bus_sync_unlock() also calls pca953x_read_regs(..., chip-
>regs->direction, ...) which in turn calls regmap_bulk_read().

For volatile registers, this function will call regmap_raw_read() which
reads the values from the hardware most of the time. However, for non-
volatile registers, _regmap_bulk_read() is being called instead, which
is implemented by _regmap_read() which reads from the regmap cache. As
it turns out that the direction registers are not volatile and are read
first as part of pca953x_irq_setup(), the values will always be
available from the cache when read from pca953x_irq_bus_sync_unlock(),
so no hardware access will happen and the internal work buffer won't be
used.

Therefore my conclusion is that your fix was needed, is correct and is
sufficient. My initial concern about the unprotected
pca953x_read_regs() call in pca953x_irq_bus_sync_unlock() was
incorrect. Sorry for the noise.
Ian Ray Oct. 21, 2024, 8:20 a.m. UTC | #7
On Fri, Oct 18, 2024 at 11:26:54AM +0200, Jean Delvare wrote:
> Hi Ray,
> 
> Sorry for the delay.
> 
> On Tue, 2024-10-08 at 09:02 +0300, Ian Ray wrote:
> > On Mon, Oct 07, 2024 at 11:16:51PM +0200, Jean Delvare wrote:
> > > On Thu, 2024-06-20 at 07:29 +0300, Ian Ray wrote:
> > > > Ensure that `i2c_lock' is held when setting interrupt latch and mask in
> > > > pca953x_irq_bus_sync_unlock() in order to avoid races.
> > > >
> > > > The other (non-probe) call site pca953x_gpio_set_multiple() ensures the
> > > > lock is held before calling pca953x_write_regs().
> > > >
> > > > The problem occurred when a request raced against irq_bus_sync_unlock()
> > > > approximately once per thousand reboots on an i.MX8MP based system.
> > > >
> > > >  * Normal case
> > > >
> > > >    0-0022: write register AI|3a {03,02,00,00,01} Input latch P0
> > > >    0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> > > >    0-0022: write register AI|08 {ff,00,00,00,00} Output P3
> > > >    0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> > > >
> > > >  * Race case
> > > >
> > > >    0-0022: write register AI|08 {ff,00,00,00,00} Output P3
> > > >    0-0022: write register AI|08 {03,02,00,00,01} *** Wrong register ***
> > > >    0-0022: write register AI|12 {fc,00,00,00,00} Config P3
> > > >    0-0022: write register AI|49 {fc,fd,ff,ff,fe} Interrupt mask P0
> > > >
> > >
> > > I have more questions on this. Where does the above log come from?
> > > Specifically, at which layer (bus driver, regmap, gpio device drier)?
> >
> > Additional debug, with manually added commentary (sorry for not being
> > clearer).  The debug was added to drivers/base/regmap/regmap-i2c.c while
> > investigating the issue.
> 
> FWIW, I think regmap includes a tracing facility which may have served
> you. Specifically, I see calls to trace_regmap_hw_write_start() and
> trace_regmap_hw_write_done() in _regmap_raw_write_impl(). But I must
> confess I couldn't find where these functions are defined nor how to
> enable tracing...

Interesting, thank you!

> 
> > > What do these values represent exactly? Which GPIO chip was used on
> > > your system? Which i2c bus driver is being used on that system? What
> > > are the "requests" you mention in the description above?
> >
> > GPIO expander pi4ioe5v6534q at I2C address 0-0022.
> 
> This device model doesn't seem to be explicitly supported by driver
> gpio-pca953x. I see it listed as compatible in
> Documentation/devicetree/bindings/gpio/gpio-pca95xx.yaml but not in the
> driver's pca953x_dt_ids. Out of curiosity, did you have to add it
> manually? I admit I'm not familiar with these device tree node
> declarations.
> 
> > # grep . {name,uevent}
> > name:30a20000.i2c
> > uevent:OF_NAME=i2c
> > uevent:OF_FULLNAME=/soc@0/bus@30800000/i2c@30a20000
> > uevent:OF_COMPATIBLE_0=fsl,imx8mp-i2c
> > uevent:OF_COMPATIBLE_1=fsl,imx21-i2c
> > uevent:OF_COMPATIBLE_N=2
> > uevent:OF_ALIAS_0=i2c0
> 
> OK, so the underlying I2C master is capable of writing to multiple
> registers at once. This helped me follow the code flow while trying to
> figure out where the race was.
> 
> > > I'm asking because I do not understand how writing to the wrong
> > > register can happen, even without holding i2c_lock in
> > > pca953x_irq_bus_sync_unlock(). The i2c layer has a per-i2c_adapter lock
> >
> > Given that pca953x_irq_bus_sync_unlock is part of an interrupt handler,
> > IMHO this explains very well why locking is needed (but I did not dig
> > deeper than that).
> 
> I took the time to dig deeper, my conclusions are below.
> 
> > > which is taken before any bus transfer, so it isn't possible that two
> > > transfers collide at the bus level. So the lack of locking at the
> > > device driver level could lead to data corruption (for example read-
> > > modify-write cycles overlapping), but not to data being written to the
> > > wrong register.
> >
> > Based on the observed data, the hypothesis was that pca953x_write_regs
> > (called via pca953x_gpio_set_multiple) and pca953x_irq_bus_sync_unlock
> > can race.
> >
> > The missing guard neatly explained and fixed the issue (disclaimer: on
> > my hardware for my scenario).
> >
> > > As a side note, I dug through the history of the gpio-pca953x driver
> > > and found that i2c_lock was introduced before the driver was converted
> > > to regmap by:
> > >
> > > commit 6e20fb18054c179d7e64c0af43d855b9310a3394
> > > Author: Roland Stigge
> > > Date:   Thu Feb 10 15:01:23 2011 -0800
> > >
> > >     drivers/gpio/pca953x.c: add a mutex to fix race condition
> > >
> > > The fix added locking around read-modify-write cycles (which was indeed
> > > needed) and also around simple register reads (which I don't think was
> > > needed).
> > >
> > > It turns out that regmap has its own protection around read-modify-
> > > write cycles (see regmap_update_bits_base) so I think several uses of
> > > i2c_lock should have been removed from the gpio-pca953x driver when it
> > > was converted to regmap as they became redundant then.
> 
> I have to correct myself here. The regmap layer implements its own,
> configurable and *optional* protection lock. It turns out that the
> gpio-pca953x driver has it disabled:
> 
> static const struct regmap_config pca953x_i2c_regmap = {
>         (...)
>         .disable_locking = true,
>         (...)
> };
> 
> So it is expected and very needed that the gpio-pca953x driver
> implements its own lock to protect against races whenever the hardware
> is accessed.
> 
> > > This driver-side
> > > lock is still needed in a number of functions though, where the read-
> > > modify-write is handled outside of regmap (for example in
> > > pca953x_gpio_set_multiple).
> 
> After reading the regmap code (which took me some time as I wasn't
> familiar at all with it, I didn't know what I was looking for exactly
> and I wanted to make sure I wasn't missing something along the way), I
> think I understand what was racing exactly.
> 
> The gpio-pca953x driver uses regmap_bulk_write() which is implemented
> by _regmap_raw_write_impl(). The register map uses an internal buffer
> to prepare the actual hardware transfers:
> 
> struct regmap *__regmap_init(...) {
>         (...)
>         map->work_buf = kzalloc(map->format.buf_size, GFP_KERNEL);
>         (...)
> }
> 
> This work buffer has space for both the register address and the values
> to be written to or read from the device:
> 
>         map->format.buf_size = DIV_ROUND_UP(config->reg_bits +
>                         config->val_bits + config->pad_bits, 8);
> 
> During a regmap raw write, the register address is written to the first
> byte of the work buffer:
> 
>         map->format.format_reg(map->work_buf, reg, map->reg_shift);
> 
> where map->format.format_reg() is regmap_format_8() for the gpio-
> pca953x driver:
> 
> static void regmap_format_8(void *buf, unsigned int val, unsigned int shift)
> {
>         u8 *b = buf;
> 
>         b[0] = val << shift;
> }
> 
> If _regmap_raw_write_impl() is called concurrently without proper
> locking then the contents of the work buffer may be overwritten by the
> second caller before the first caller had a chance to use it. I think
> this matches your debug log of the race case pretty well.
> 
> I checked the regmap implementation for other use cases of map-
> >format.format_reg(map->work_buf, ...) and found it is being used in
> _regmap_raw_read(), so I had to investigate further, because
> pca953x_irq_bus_sync_unlock() also calls pca953x_read_regs(..., chip-
> >regs->direction, ...) which in turn calls regmap_bulk_read().
> 
> For volatile registers, this function will call regmap_raw_read() which
> reads the values from the hardware most of the time. However, for non-
> volatile registers, _regmap_bulk_read() is being called instead, which
> is implemented by _regmap_read() which reads from the regmap cache. As
> it turns out that the direction registers are not volatile and are read
> first as part of pca953x_irq_setup(), the values will always be
> available from the cache when read from pca953x_irq_bus_sync_unlock(),
> so no hardware access will happen and the internal work buffer won't be
> used.
> 
> Therefore my conclusion is that your fix was needed, is correct and is
> sufficient. My initial concern about the unprotected
> pca953x_read_regs() call in pca953x_irq_bus_sync_unlock() was
> incorrect. Sorry for the noise.

No noise, this was a really interesting study, and a good learning
experience for me.  Thank you for this.

Blue skies,
Ian


> 
> --
> Jean Delvare
> SUSE L3 Support
diff mbox series

Patch

diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
index 77a2812f2974..732a6964748c 100644
--- a/drivers/gpio/gpio-pca953x.c
+++ b/drivers/gpio/gpio-pca953x.c
@@ -758,6 +758,8 @@  static void pca953x_irq_bus_sync_unlock(struct irq_data *d)
 	int level;
 
 	if (chip->driver_data & PCA_PCAL) {
+		guard(mutex)(&chip->i2c_lock);
+
 		/* Enable latch on interrupt-enabled inputs */
 		pca953x_write_regs(chip, PCAL953X_IN_LATCH, chip->irq_mask);