diff mbox series

[v2] gpio: Restrict usage of gc irq members before initialization

Message ID 20220315103813.84407-1-shreeya.patel@collabora.com
State New
Headers show
Series [v2] gpio: Restrict usage of gc irq members before initialization | expand

Commit Message

Shreeya Patel March 15, 2022, 10:38 a.m. UTC
gc irq members are exposed before they could be completely
initialized and this leads to race conditions.

One such issue was observed for the gc->irq.domain variable which
was accessed through the I2C interface in gpiochip_to_irq() before
it could be initialized by gpiochip_add_irqchip(). This resulted in
Kernel NULL pointer dereference.

To avoid such scenarios, restrict usage of gc irq members before
they are completely initialized.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
---

Changes in v2 :-
  - Make gc_irq_initialized flag a member of gpio_irq_chip structure.
  - Make use of barrier() to avoid reordering of flag initialization before
other gc irq members are initialized.


 drivers/gpio/gpiolib.c      | 11 ++++++++++-
 include/linux/gpio/driver.h |  9 +++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

Comments

Andy Shevchenko March 15, 2022, 11 a.m. UTC | #1
On Tue, Mar 15, 2022 at 12:38 PM Shreeya Patel
<shreeya.patel@collabora.com> wrote:

Thanks for the update, my comments below.

> gc irq members are exposed before they could be completely

gc --> GPIO chip


> initialized and this leads to race conditions.

Any example here. like ~3-4 lines of the Oops in question?

> One such issue was observed for the gc->irq.domain variable which
> was accessed through the I2C interface in gpiochip_to_irq() before
> it could be initialized by gpiochip_add_irqchip(). This resulted in
> Kernel NULL pointer dereference.
>
> To avoid such scenarios, restrict usage of gc irq members before

gc --> GPIO chip

> they are completely initialized.

...

> +       /*
> +        * Using barrier() here to prevent compiler from reordering
> +        * gc->irq.gc_irq_initialized before initialization of above
> +        * gc irq members.
> +        */
> +       barrier();
> +
> +       gc->irq.gc_irq_initialized = true;

There are too many duplications. Why not simply call it 'initialized'?

> -       if (gc->to_irq) {
> +       if (gc->to_irq && gc->irq.gc_irq_initialized) {

Why can't this check be added into gpiochip_to_irq() ?

    if (!gc->irq.initialized)
        return -ENXIO;

...

> +       bool gc_irq_initialized;

Can you move it closer to .init_hw so it will be weakly grouped by
logic similarities?
Also see above.
Shreeya Patel March 15, 2022, 12:32 p.m. UTC | #2
On 15/03/22 4:30 pm, Andy Shevchenko wrote:
> On Tue, Mar 15, 2022 at 12:38 PM Shreeya Patel
> <shreeya.patel@collabora.com> wrote:
>
> Thanks for the update, my comments below.
>
>> gc irq members are exposed before they could be completely
> gc --> GPIO chip
>
>
>> initialized and this leads to race conditions.
> Any example here. like ~3-4 lines of the Oops in question?
>
>> One such issue was observed for the gc->irq.domain variable which
>> was accessed through the I2C interface in gpiochip_to_irq() before
>> it could be initialized by gpiochip_add_irqchip(). This resulted in
>> Kernel NULL pointer dereference.
>>
>> To avoid such scenarios, restrict usage of gc irq members before
> gc --> GPIO chip
>
>> they are completely initialized.
> ...
>
>> +       /*
>> +        * Using barrier() here to prevent compiler from reordering
>> +        * gc->irq.gc_irq_initialized before initialization of above
>> +        * gc irq members.
>> +        */
>> +       barrier();
>> +
>> +       gc->irq.gc_irq_initialized = true;
> There are too many duplications. Why not simply call it 'initialized'?
>
>> -       if (gc->to_irq) {
>> +       if (gc->to_irq && gc->irq.gc_irq_initialized) {
> Why can't this check be added into gpiochip_to_irq() ?
>
>      if (!gc->irq.initialized)
>          return -ENXIO;
>
> ...


Because we don't want to return -ENXIO in case of race condition.

It should return -EPROBE_DEFER similar to how we are doing when gc->to_irq
is NULL.

So in this case when both gc->to_irq = NULL and gc->irq.initialized = FALSE,
we will be returning -EPROBE_DEFER. This will make sure that devices 
like touchscreen
do not become fatal due to returning -ENXIO.

>
>> +       bool gc_irq_initialized;
> Can you move it closer to .init_hw so it will be weakly grouped by
> logic similarities?
> Also see above.

Thanks for your comments, I'll make the necessary changes and send a v3.



Shreeya Patel
Andy Shevchenko March 15, 2022, 1:45 p.m. UTC | #3
On Tue, Mar 15, 2022 at 2:32 PM Shreeya Patel
<shreeya.patel@collabora.com> wrote:
> On 15/03/22 4:30 pm, Andy Shevchenko wrote:
> > On Tue, Mar 15, 2022 at 12:38 PM Shreeya Patel
> > <shreeya.patel@collabora.com> wrote:
> >
> > Thanks for the update, my comments below.
> >
> >> gc irq members are exposed before they could be completely
> > gc --> GPIO chip
> >
> >> initialized and this leads to race conditions.
> > Any example here. like ~3-4 lines of the Oops in question?
> >
> >> One such issue was observed for the gc->irq.domain variable which
> >> was accessed through the I2C interface in gpiochip_to_irq() before
> >> it could be initialized by gpiochip_add_irqchip(). This resulted in
> >> Kernel NULL pointer dereference.
> >>
> >> To avoid such scenarios, restrict usage of gc irq members before
> > gc --> GPIO chip
> >
> >> they are completely initialized.
> > ...
> >
> >> +       /*
> >> +        * Using barrier() here to prevent compiler from reordering
> >> +        * gc->irq.gc_irq_initialized before initialization of above
> >> +        * gc irq members.
> >> +        */
> >> +       barrier();
> >> +
> >> +       gc->irq.gc_irq_initialized = true;
> > There are too many duplications. Why not simply call it 'initialized'?
> >
> >> -       if (gc->to_irq) {
> >> +       if (gc->to_irq && gc->irq.gc_irq_initialized) {
> > Why can't this check be added into gpiochip_to_irq() ?
> >
> >      if (!gc->irq.initialized)
> >          return -ENXIO;
> >
> > ...
>
>
> Because we don't want to return -ENXIO in case of race condition.
>
> It should return -EPROBE_DEFER similar to how we are doing when gc->to_irq
> is NULL.

> So in this case when both gc->to_irq = NULL and gc->irq.initialized = FALSE,
> we will be returning -EPROBE_DEFER.

This is not true. The return code relies on an IRQ chip which may be
assigned (not NULL).

> This will make sure that devices
> like touchscreen
> do not become fatal due to returning -ENXIO.

So, then you need to move it to to_irq() and return there deferred
probe with a good comment in the code.

> >> +       bool gc_irq_initialized;
> > Can you move it closer to .init_hw so it will be weakly grouped by
> > logic similarities?
> > Also see above.
>
> Thanks for your comments, I'll make the necessary changes and send a v3.
kernel test robot March 15, 2022, 5:30 p.m. UTC | #4
Hi Shreeya,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linusw-gpio/for-next]
[also build test ERROR on v5.17-rc8 next-20220310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Shreeya-Patel/gpio-Restrict-usage-of-gc-irq-members-before-initialization/20220315-183950
base:   https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git for-next
config: arm-xcep_defconfig (https://download.01.org/0day-ci/archive/20220316/202203160100.PENzQsMs-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/9f566a088a6f5fcb8830b07020294835072d516c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Shreeya-Patel/gpio-Restrict-usage-of-gc-irq-members-before-initialization/20220315-183950
        git checkout 9f566a088a6f5fcb8830b07020294835072d516c
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=arm SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/gpio/gpiolib.c: In function 'gpiod_to_irq':
>> drivers/gpio/gpiolib.c:3068:29: error: 'struct gpio_chip' has no member named 'irq'
    3068 |         if (gc->to_irq && gc->irq.gc_irq_initialized) {
         |                             ^~


vim +3068 drivers/gpio/gpiolib.c

  3045	
  3046	/**
  3047	 * gpiod_to_irq() - return the IRQ corresponding to a GPIO
  3048	 * @desc: gpio whose IRQ will be returned (already requested)
  3049	 *
  3050	 * Return the IRQ corresponding to the passed GPIO, or an error code in case of
  3051	 * error.
  3052	 */
  3053	int gpiod_to_irq(const struct gpio_desc *desc)
  3054	{
  3055		struct gpio_chip *gc;
  3056		int offset;
  3057	
  3058		/*
  3059		 * Cannot VALIDATE_DESC() here as gpiod_to_irq() consumer semantics
  3060		 * requires this function to not return zero on an invalid descriptor
  3061		 * but rather a negative error number.
  3062		 */
  3063		if (!desc || IS_ERR(desc) || !desc->gdev || !desc->gdev->chip)
  3064			return -EINVAL;
  3065	
  3066		gc = desc->gdev->chip;
  3067		offset = gpio_chip_hwgpio(desc);
> 3068		if (gc->to_irq && gc->irq.gc_irq_initialized) {
  3069			int retirq = gc->to_irq(gc, offset);
  3070	
  3071			/* Zero means NO_IRQ */
  3072			if (!retirq)
  3073				return -ENXIO;
  3074	
  3075			return retirq;
  3076		}
  3077		return -ENXIO;
  3078	}
  3079	EXPORT_SYMBOL_GPL(gpiod_to_irq);
  3080	

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Shreeya Patel March 15, 2022, 6:42 p.m. UTC | #5
On 15/03/22 11:35 pm, Andy Shevchenko wrote:
> On Tue, Mar 15, 2022 at 7:38 PM kernel test robot <lkp@intel.com> wrote:
>> Hi Shreeya,
>>
>> Thank you for the patch! Yet something to improve:
>
>> All errors (new ones prefixed by >>):
>>
>>     drivers/gpio/gpiolib.c: In function 'gpiod_to_irq':
>>>> drivers/gpio/gpiolib.c:3068:29: error: 'struct gpio_chip' has no member named 'irq'
>>      3068 |         if (gc->to_irq && gc->irq.gc_irq_initialized) {
>>           |                             ^~
> Exactly, because this check should go under ifdeffery.


Makes sense to me now. gc->irq.initialized must be checked inside
gpiochip_to_irq() wrapped around an ifdef CONFIG_GPIOLIB_IRQCHIP

Thanks Andy, I'll send a v3 with this change.


Shreeya Patel
diff mbox series

Patch

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index defb7c464b87..3973146736a1 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1593,6 +1593,15 @@  static int gpiochip_add_irqchip(struct gpio_chip *gc,
 
 	acpi_gpiochip_request_interrupts(gc);
 
+	/*
+	 * Using barrier() here to prevent compiler from reordering
+	 * gc->irq.gc_irq_initialized before initialization of above
+	 * gc irq members.
+	 */
+	barrier();
+
+	gc->irq.gc_irq_initialized = true;
+
 	return 0;
 }
 
@@ -3138,7 +3147,7 @@  int gpiod_to_irq(const struct gpio_desc *desc)
 
 	gc = desc->gdev->chip;
 	offset = gpio_chip_hwgpio(desc);
-	if (gc->to_irq) {
+	if (gc->to_irq && gc->irq.gc_irq_initialized) {
 		int retirq = gc->to_irq(gc, offset);
 
 		/* Zero means NO_IRQ */
diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h
index b0728c8ad90c..e93de63feece 100644
--- a/include/linux/gpio/driver.h
+++ b/include/linux/gpio/driver.h
@@ -203,6 +203,15 @@  struct gpio_irq_chip {
 	 */
 	unsigned int *map;
 
+	/**
+	 * @gc_irq_initialized:
+	 *
+	 * Flag to track gc irq member's initialization.
+	 * This flag will make sure gc irq members are not used before
+	 * they are initialized.
+	 */
+	bool gc_irq_initialized;
+
 	/**
 	 * @threaded:
 	 *