mbox series

[GIT,PULL] gpio: fixes for v6.1-rc8

Message ID 20221202095455.83356-1-brgl@bgdev.pl
State New
Headers show
Series [GIT,PULL] gpio: fixes for v6.1-rc8 | expand

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git tags/gpio-fixes-for-v6.1-rc8

Message

Bartosz Golaszewski Dec. 2, 2022, 9:54 a.m. UTC
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Linus,

Here's the last round of fixes for the upcoming release. The two resource leak
fixes are self-explanatory. The two character device commits need some more
backstory:

I recently listened to Laurent Pinchart's talk from this year's LPC[1] where he
discussed an issue with many subsystems that export device nodes to user-space
where one can open the device file, unbind the underlying device from the
driver, then call any of the relevant system calls and observe the kernel crash
with a NULL-pointer derefence. 

I verified that the problem exists with the GPIO subsystem as well. The reason
for that is: when a GPIO chip is removed, we drop the chip's data from the
GPIO device and set the relevant pointer to NULL but don't check it in syscall
callbacks in the character device's code. That's fixed by the first patch.

However that fix alone leaves the character device vulnerable to a race
condition in which the GPIO chip is removed when a syscall is in progress.
To avoid that, we have a second fix that uses a lock to protect the device's
structure from being disabled before all syscall callbacks returned.

Laurent blamed the issue on devres but I believe the problem comes from the
fact that many driver developers are simply unaware that resources exported
to user-space need to survive the driver unbind and only be released when the
device's reference count goes down to 0. Devres' docs are pretty clear on that:
the resources get freed on driver unbind. Resources that should survive it,
must not be managed. This is however a topic for a separate discussion which
I intend to start soon.

Please pull,
Bartosz Golaszewski

[1] https://www.youtube.com/watch?v=kW8LHWlJPTU

The following changes since commit b7b275e60bcd5f89771e865a8239325f86d9927d:

  Linux 6.1-rc7 (2022-11-27 13:31:48 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git tags/gpio-fixes-for-v6.1-rc8

for you to fetch changes up to 98d8b93c617139aeaf745c1573c02d86830f25d1:

  gpiolib: protect the GPIO device against being dropped while in use by user-space (2022-12-01 19:17:47 +0100)

----------------------------------------------------------------
gpio fixes for v6.1-rc8

- fix a memory leak in gpiochip_setup_dev() in core gpiolib
- fix a reference leak in gpio-amd8111
- fix a problem with user-space being able to trigger a NULL-pointer
  dereference ovet the GPIO character device

----------------------------------------------------------------
Bartosz Golaszewski (2):
      gpiolib: cdev: fix NULL-pointer dereferences
      gpiolib: protect the GPIO device against being dropped while in use by user-space

Xiongfeng Wang (1):
      gpio: amd8111: Fix PCI device reference count leak

Zeng Heng (1):
      gpiolib: fix memory leak in gpiochip_setup_dev()

 drivers/gpio/gpio-amd8111.c |   4 +
 drivers/gpio/gpiolib-cdev.c | 193 +++++++++++++++++++++++++++++++++++++++-----
 drivers/gpio/gpiolib.c      |  46 +++++++----
 drivers/gpio/gpiolib.h      |   5 ++
 4 files changed, 210 insertions(+), 38 deletions(-)

Comments

Bartosz Golaszewski Dec. 2, 2022, 2:49 p.m. UTC | #1
On Fri, Dec 2, 2022 at 10:54 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> Linus,
>
> Here's the last round of fixes for the upcoming release. The two resource leak
> fixes are self-explanatory. The two character device commits need some more
> backstory:
>
> I recently listened to Laurent Pinchart's talk from this year's LPC[1] where he
> discussed an issue with many subsystems that export device nodes to user-space
> where one can open the device file, unbind the underlying device from the
> driver, then call any of the relevant system calls and observe the kernel crash
> with a NULL-pointer derefence.
>
> I verified that the problem exists with the GPIO subsystem as well. The reason
> for that is: when a GPIO chip is removed, we drop the chip's data from the
> GPIO device and set the relevant pointer to NULL but don't check it in syscall
> callbacks in the character device's code. That's fixed by the first patch.
>
> However that fix alone leaves the character device vulnerable to a race
> condition in which the GPIO chip is removed when a syscall is in progress.
> To avoid that, we have a second fix that uses a lock to protect the device's
> structure from being disabled before all syscall callbacks returned.
>
> Laurent blamed the issue on devres but I believe the problem comes from the
> fact that many driver developers are simply unaware that resources exported
> to user-space need to survive the driver unbind and only be released when the
> device's reference count goes down to 0. Devres' docs are pretty clear on that:
> the resources get freed on driver unbind. Resources that should survive it,
> must not be managed. This is however a topic for a separate discussion which
> I intend to start soon.
>
> Please pull,
> Bartosz Golaszewski
>
> [1] https://www.youtube.com/watch?v=kW8LHWlJPTU
>
> The following changes since commit b7b275e60bcd5f89771e865a8239325f86d9927d:
>
>   Linux 6.1-rc7 (2022-11-27 13:31:48 -0800)
>
> are available in the Git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git tags/gpio-fixes-for-v6.1-rc8
>
> for you to fetch changes up to 98d8b93c617139aeaf745c1573c02d86830f25d1:
>
>   gpiolib: protect the GPIO device against being dropped while in use by user-space (2022-12-01 19:17:47 +0100)
>
> ----------------------------------------------------------------
> gpio fixes for v6.1-rc8
>
> - fix a memory leak in gpiochip_setup_dev() in core gpiolib
> - fix a reference leak in gpio-amd8111
> - fix a problem with user-space being able to trigger a NULL-pointer
>   dereference ovet the GPIO character device
>
> ----------------------------------------------------------------
> Bartosz Golaszewski (2):
>       gpiolib: cdev: fix NULL-pointer dereferences
>       gpiolib: protect the GPIO device against being dropped while in use by user-space
>
> Xiongfeng Wang (1):
>       gpio: amd8111: Fix PCI device reference count leak
>
> Zeng Heng (1):
>       gpiolib: fix memory leak in gpiochip_setup_dev()
>
>  drivers/gpio/gpio-amd8111.c |   4 +
>  drivers/gpio/gpiolib-cdev.c | 193 +++++++++++++++++++++++++++++++++++++++-----
>  drivers/gpio/gpiolib.c      |  46 +++++++----
>  drivers/gpio/gpiolib.h      |   5 ++
>  4 files changed, 210 insertions(+), 38 deletions(-)

Linus,

Please don't pull this just yet, there's a build issue that was
reported to me with the last commit in that PR. I need to fix it and
redo the PR.

Bartosz