gpio irqchip initialization race

Message ID CACRpkdbfidnokJ2WyagdcEycDu3dXKVon2z4TKM7KhNg_Hyf9w@mail.gmail.com
State New
Headers show

Commit Message

Linus Walleij April 1, 2016, 8:43 a.m.
On Fri, Apr 1, 2016 at 10:28 AM, Alexander Stein
<alexander.stein@systec-electronic.com> wrote:

> I suspect this depends on the time when gpio-keys is being probed. It will not

> be probed (or actually deferred) before mcp23s08_probe_one calls

> gpiochip_add_data as the parent GPIo controller to buttons is not present yet.

> I guess those error messages raise when gpio-keys is probed after

> gpiochip_add_data but before gpiochip_irqchip_add or

> gpiochip_set_chained_irqchip (in mcp23s08_irq_setup).

>

> The comment to _gpiochip_irqchip_add states the gpiochip has to be registered

> when calling. So to me this opens a rather small window where childs to

> gpiochip can be probed (parent is registered) but registering an interrupt

> will fail as the parent IRQ setup has not finished yet.

> Any suggestions how to fix that?


If that is what is happening it is indeed a very annoying race condition.
Strange we haven't seen it before!

So the problem doesn't exist when just using the gpiolib API or
the irqchip side independently, as we have been careful to make sure
that at least in modern drivers, these two are orthogonal.

I think the problem is that gpio-keys is calling gpio_to_irq(), and at that
point between gpiochip_add_data() but before gpiochip_irqchip_add()
it gets a bogus IRQ when it should be getting -EPROBE_DEFER.

I think we need to make sure that for drivers using gpiolib_irqchip_add()
-EPROBE_DEFER is returned for gpio[d]_to_irq() for this interrim
period.

What happens if you just apply this oneliner?


Yours,
Linus Walleij
--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Linus Walleij April 1, 2016, 11:29 a.m. | #1
On Fri, Apr 1, 2016 at 10:56 AM, Alexander Stein
<alexander.stein@systec-electronic.com> wrote:
> [Me]

>> And in this case (if gpio_keys handle -EPROBE_DEFER gracefully)

>> all should be fine with my oneliner patch.

>>

>> I am more uncertain about the -EINVAL (-22) we might need some

>> more analysis there.

>

> I did 10 runs and got the following results:

>> 3x  gpio-keys user_sw: Unable to claim irq 0; error -22

>> 2x  gpio-keys user_sw: Unable to get irq number for GPIO 376, error -517

>> 5x  ok

>

> So, for one gpio-keys seems not to handle -EPROBE_DEFER gracefully


After reading the code the only problem seems to be that it prints that
error. A small patch to silence that print should fix it then, can you
confirm that in these cases the gpio-keys are retried later?

> and it seem

> still possible to get an invalid irq number.


I think this is because gpiolib initialize .to_irq() before setting up the
irqdomain.

I'm making a combined patch fixing both issues, send out in a few sec.

Yours,
Linus Walleij
--
To unsubscribe from this list: send the line "unsubscribe linux-gpio" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch hide | download patch | download mbox

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index b747c76fd2b1..838643d2d976 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -2004,7 +2004,7 @@  int gpiod_to_irq(const struct gpio_desc *desc)
        VALIDATE_DESC(desc);
        chip = desc->gdev->chip;
        offset = gpio_chip_hwgpio(desc);
-       return chip->to_irq ? chip->to_irq(chip, offset) : -ENXIO;
+       return chip->to_irq ? chip->to_irq(chip, offset) : -EPROBE_DEFER;
 }
 EXPORT_SYMBOL_GPL(gpiod_to_irq);