[RESEND,RFC,0/5] Fix some race conditions that exists between fbmem and sysfb

Message ID 20220406213919.600294-1-javierm@redhat.com

Message

Javier Martinez Canillas April 6, 2022, 9:39 p.m. UTC
[ resend since dri-devel wasn't Cc'ed on all patches, sorry for the noise ]

Hello,

The patches in this series are mostly changes suggested by Daniel Vetter
to fix some race conditions that exist between the fbdev core (fbmem)
and sysfb with regard to device registration and removal.

For example, it is currently possible for sysfb to register a platform
device after a real DRM driver was registered and requested to remove the
conflicting framebuffers.

A symptom of this issue was worked around by commit fb561bf9abde
("fbdev: Prevent probing generic drivers if a FB is already registered"),
but that's really a hack and should be reverted.

This series attempts to fix it more properly and reverts the mentioned
hack. This will also unblock a pending patch to not make num_registered_fb
visible to drivers anymore, since that's really internal to fbdev core.

Patch #1 is just a trivial preparatory change.

Patch #2 adds the sysfb_disable() and sysfb_try_unregister() helpers for
fbmem to use.
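
To make the intended semantics of the two helpers concrete, here is a
minimal single-threaded user-space model. It is only a sketch: the struct
and every *_model name are hypothetical, no locking is shown, and it is not
the code from patch #2. It assumes what the thread describes: once disabled,
sysfb registers nothing further, and the try-unregister helper only acts on
the device that sysfb itself registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a platform device; not the kernel type. */
struct pdev_model {
	bool registered;
};

static struct pdev_model *sysfb_pdev;	/* device registered by sysfb, if any */
static bool sysfb_disabled;		/* set once a real driver took over */

/* Models sysfb_init(): refuses to register once sysfb was disabled. */
static bool sysfb_init_model(struct pdev_model *pdev)
{
	assert(pdev != NULL);
	if (sysfb_disabled)
		return false;
	pdev->registered = true;
	sysfb_pdev = pdev;
	return true;
}

/* Models sysfb_disable(): blocks any later registration. */
static void sysfb_disable_model(void)
{
	sysfb_disabled = true;
}

/* Models sysfb_try_unregister(): acts only on the device sysfb owns. */
static bool sysfb_try_unregister_model(struct pdev_model *pdev)
{
	assert(pdev != NULL);
	if (sysfb_pdev != pdev)
		return false;
	pdev->registered = false;
	sysfb_pdev = NULL;
	return true;
}
```

In this model a late sysfb_init_model() call after sysfb_disable_model()
fails, which is the ordering problem the series wants to close.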

Patch #3 changes how the unregistering of conflicting framebuffers is
handled: rather than having a variable to determine whether a lock should
be taken, it just drops the lock before unregistering the platform device.
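
The drop-the-lock-and-restart idea can be illustrated with a small
user-space model. This is a sketch under stated assumptions, not the kernel
code: the lock is a toy non-recursive lock that trips an assertion where a
real mutex would self-deadlock, and fb_model, model_lock() and the other
*_model names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FB 8

/* Hypothetical stand-in for a registered framebuffer. */
struct fb_model {
	int id;
	bool conflicting;
};

static struct fb_model *registered[MAX_FB];
static bool lock_held;	/* toy non-recursive lock */

static void model_lock(void)
{
	assert(!lock_held);	/* taking it twice would be a deadlock */
	lock_held = true;
}

static void model_unlock(void)
{
	assert(lock_held);
	lock_held = false;
}

/* Takes the lock itself, as unregister_framebuffer() does. */
static void unregister_fb_model(struct fb_model *fb)
{
	model_lock();
	for (size_t i = 0; i < MAX_FB; i++) {
		if (registered[i] == fb)
			registered[i] = NULL;
	}
	model_unlock();
}

/* Returns the number of conflicting entries removed. */
static int remove_conflicting_model(void)
{
	int removed = 0;

	model_lock();
restart_removal:
	for (size_t i = 0; i < MAX_FB; i++) {
		struct fb_model *fb = registered[i];

		if (!fb || !fb->conflicting)
			continue;

		/* Drop the lock: the callee acquires it again. */
		model_unlock();
		unregister_fb_model(fb);
		model_lock();

		/* The table changed while unlocked, so rescan from the start. */
		removed++;
		goto restart_removal;
	}
	model_unlock();
	return removed;
}
```

The loop ends on the first full pass that finds no conflicting entry, so
the restart cannot spin as long as each unregistration really removes its
entry.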

Patch #4 fixes the mentioned race conditions, and finally patch #5 is the
revert patch that Daniel posted before but dropped from his set.

The patches were tested on a rpi4 using different video configurations:
(simpledrm -> vc4 both builtin, only vc4 builtin, only simpledrm builtin
and simpledrm builtin with vc4 built as a module).

I'm sending this as an RFC since there are many changes to the locking
scheme and that is always tricky to get right. Please let me know what you
think.

Best regards,
Javier


Daniel Vetter (1):
  Revert "fbdev: Prevent probing generic drivers if a FB is already
    registered"

Javier Martinez Canillas (4):
  firmware: sysfb: Make sysfb_create_simplefb() return a pdev pointer
  firmware: sysfb: Add helpers to unregister a pdev and disable
    registration
  fbdev: Restart conflicting fb removal loop when unregistering devices
  fbdev: Fix some race conditions between fbmem and sysfb

 drivers/firmware/sysfb.c          | 51 ++++++++++++++++++++++++++-----
 drivers/firmware/sysfb_simplefb.c | 24 +++++++++------
 drivers/video/fbdev/core/fbmem.c  | 38 ++++++++++++++++++-----
 drivers/video/fbdev/efifb.c       | 11 -------
 drivers/video/fbdev/simplefb.c    | 11 -------
 include/linux/fb.h                |  1 -
 include/linux/sysfb.h             | 29 +++++++++++++++---
 7 files changed, 112 insertions(+), 53 deletions(-)

Comments

Daniel Vetter April 7, 2022, 9:08 a.m. UTC | #1
On Wed, Apr 06, 2022 at 11:39:17PM +0200, Javier Martinez Canillas wrote:
> Drivers that want to remove registered conflicting framebuffers prior to
> registering their own framebuffer call remove_conflicting_framebuffers().
> 
> This function takes the registration_lock mutex to prevent races when
> drivers register framebuffer devices. But if a conflicting framebuffer
> device is found, the underlying platform device is unregistered, which
> causes the platform driver's .remove callback to be called, which in
> turn calls unregister_framebuffer(), which takes the same lock.
> 
> To prevent this, a struct fb_info.forced_out field was used as indication
> to unregister_framebuffer() whether the mutex has to be grabbed or not.
> 
> A cleaner solution is to drop the lock before platform_device_unregister()
> so unregister_framebuffer() can take it when called from the fbdev driver,
> and just grab the lock again after the device has been unregistered and
> restart the removal loop.
> 
> Since the framebuffer devices will already be removed, the loop would just
> finish when no more conflicting framebuffers are found.
> 
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>

It's always entertaining with these things since they can go boom in funny
ways, but we need to at least try :-) Recursive locks are just a bit too evil.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
> 
>  drivers/video/fbdev/core/fbmem.c | 21 ++++++++++++++-------
>  include/linux/fb.h               |  1 -
>  2 files changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> index b585339509b0..c1bfb8df9cba 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -1555,6 +1555,7 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
>  {
>  	int i;
>  
> +restart_removal:
>  	/* check all firmware fbs and kick off if the base addr overlaps */
>  	for_each_registered_fb(i) {
>  		struct apertures_struct *gen_aper;
> @@ -1582,8 +1583,18 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
>  			 * fix would add code to remove the device from the system.
>  			 */
>  			if (dev_is_platform(device)) {
> -				registered_fb[i]->forced_out = true;
> +				/*
> +				 * Drop the lock since the driver will call to the
> +				 * unregister_framebuffer() function that takes it.
> +				 */
> +				mutex_unlock(&registration_lock);
>  				platform_device_unregister(to_platform_device(device));
> +				mutex_lock(&registration_lock);
> +				/*
> +				 * Restart the removal now that the platform device
> +				 * has been unregistered and its associated fb gone.
> +				 */
> +				goto restart_removal;
>  			} else {
>  				pr_warn("fb%d: cannot remove device\n", i);
>  				do_unregister_framebuffer(registered_fb[i]);
> @@ -1917,13 +1928,9 @@ EXPORT_SYMBOL(register_framebuffer);
>  void
>  unregister_framebuffer(struct fb_info *fb_info)
>  {
> -	bool forced_out = fb_info->forced_out;
> -
> -	if (!forced_out)
> -		mutex_lock(&registration_lock);
> +	mutex_lock(&registration_lock);
>  	do_unregister_framebuffer(fb_info);
> -	if (!forced_out)
> -		mutex_unlock(&registration_lock);
> +	mutex_unlock(&registration_lock);
>  }
>  EXPORT_SYMBOL(unregister_framebuffer);
>  
> diff --git a/include/linux/fb.h b/include/linux/fb.h
> index 39baa9a70779..f1e0cd751b06 100644
> --- a/include/linux/fb.h
> +++ b/include/linux/fb.h
> @@ -503,7 +503,6 @@ struct fb_info {
>  	} *apertures;
>  
>  	bool skip_vt_switch; /* no VT switch on suspend/resume required */
> -	bool forced_out; /* set when being removed by another driver */
>  };
>  
>  static inline struct apertures_struct *alloc_apertures(unsigned int max_num) {
> -- 
> 2.35.1
>
Daniel Vetter April 7, 2022, 9:11 a.m. UTC | #2
On Wed, Apr 06, 2022 at 11:39:18PM +0200, Javier Martinez Canillas wrote:
> The platform devices registered by sysfb match a firmware-based fbdev
> or DRM driver, which is used to provide early graphics using framebuffers
> set up by the system firmware.
> 
> Real DRM drivers later are probed and remove all conflicting framebuffers,
> leading to these platform devices for generic drivers to be unregistered.
> 
> But the current solution has two issues that this patch fixes:
> 
> 1) It is a layering violation for the fbdev core to unregister a device
>    that was registered by sysfb.
> 
>    Instead, the sysfb_try_unregister() helper function can be called for
>    sysfb to attempt unregistering the device if it is the one registered.
> 
> 2) The sysfb_init() function could be called after a DRM driver is probed
>    and requested to unregister devices for drivers with a conflicting fb.
> 
>    To prevent this, disable any future sysfb platform device registration
>    by calling sysfb_disable(), if a driver requested to remove conflicting
>    framebuffers with remove_conflicting_framebuffers().
> 
> Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
> ---
> 
>  drivers/video/fbdev/core/fbmem.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c b/drivers/video/fbdev/core/fbmem.c
> index c1bfb8df9cba..acf641b05d11 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -19,6 +19,7 @@
>  #include <linux/kernel.h>
>  #include <linux/major.h>
>  #include <linux/slab.h>
> +#include <linux/sysfb.h>
>  #include <linux/mm.h>
>  #include <linux/mman.h>
>  #include <linux/vt.h>
> @@ -1588,7 +1589,10 @@ static void do_remove_conflicting_framebuffers(struct apertures_struct *a,
>  				 * unregister_framebuffer() function that takes it.
>  				 */
>  				mutex_unlock(&registration_lock);
> -				platform_device_unregister(to_platform_device(device));
> +				if (!sysfb_try_unregister(device)) {
> +					/* sysfb didn't register this device, unregister it */

Maybe explain in the commit message that this is still needed for drivers
which set up their platform_dev themselves, like vga16fb.

Also I'm not sure we want to have an assumption encoded in fbmem.c here
that the sysfb device is always a platform device. I think it would be
better to call sysfb_try_unregister on any device, and then fall back to
the forced removal on our own if it's a platform device.

Also maybe change the comment to /* FIXME: Not all platform fb drivers use sysfb yet */
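
A rough user-space sketch of the ordering suggested above: ask sysfb first
for any device, and only fall back to a forced removal when the device is a
platform device. Everything here is hypothetical (dev_model, bus_model,
try_sysfb_unregister(), kick_out_device()); it only models the control flow
and is not proposed kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum bus_model { BUS_PLATFORM, BUS_OTHER };

/* Hypothetical stand-in for a struct device behind a framebuffer. */
struct dev_model {
	enum bus_model bus;
	bool sysfb_owned;	/* true if sysfb registered this device */
	bool registered;
};

/* Stand-in for sysfb_try_unregister(): no bus-type assumption. */
static bool try_sysfb_unregister(struct dev_model *dev)
{
	assert(dev != NULL);
	if (!dev->sysfb_owned)
		return false;
	dev->registered = false;
	return true;
}

/* Returns true if the device was removed by either path. */
static bool kick_out_device(struct dev_model *dev)
{
	/* Ask sysfb first, whatever kind of device this is. */
	if (try_sysfb_unregister(dev))
		return true;

	/* Not all platform fb drivers use sysfb yet, so force removal. */
	if (dev->bus == BUS_PLATFORM) {
		dev->registered = false;
		return true;
	}

	return false;
}
```

A device that set up its own platform device (the vga16fb case) takes the
second branch, while a non-platform device that sysfb does not own is left
for the caller to handle some other way.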

> +					platform_device_unregister(to_platform_device(device));
> +				}
>  				mutex_lock(&registration_lock);
>  				/*
>  				 * Restart the removal now that the platform device
> @@ -1781,6 +1785,17 @@ int remove_conflicting_framebuffers(struct apertures_struct *a,
>  		do_free = true;
>  	}
>  
> +	/*
> +	 * If a driver asked to unregister a platform device registered by
> +	 * sysfb, then can be assumed that this is a driver for a display
> +	 * that is set up by the system firmware and has a generic driver.
> +	 *
> +	 * Drivers for devices that don't have a generic driver will never
> +	 * ask for this, so let's assume that a real driver for the display
> +	 * was already probed and prevent sysfb to register devices later.
> +	 */

Yeah it's disappointing, but no worse than the piles of hacks we have now.

With the bikesheds addressed above:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> +	sysfb_disable();
> +
>  	mutex_lock(&registration_lock);
>  	do_remove_conflicting_framebuffers(a, name, primary);
>  	mutex_unlock(&registration_lock);
> -- 
> 2.35.1
>
Javier Martinez Canillas April 7, 2022, 9:15 a.m. UTC | #3
On 4/7/22 11:11, Daniel Vetter wrote:
> On Wed, Apr 06, 2022 at 11:39:18PM +0200, Javier Martinez Canillas wrote:

[snip]

> 
> Yeah it's disappointing, but no worse than the piles of hacks we have now.
> 
> With the bikesheds addressed above:
>

I agree with all your comments and will address them in the next version.
 
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>

Thanks for reviewing these patches so quickly!