mbox series

[0/5] ACPI: scan: Fixes and cleanups related to dependencies list handling

Message ID 3140195.44csPzL39Z@kreacher
Headers show
Series ACPI: scan: Fixes and cleanups related to dependencies list handling | expand

Message

Rafael J. Wysocki June 16, 2021, 2:21 p.m. UTC
Hi,

The following patches address a few issues and do a few cleanups related to
the handling of the _DEP dependencies list.

Please refer to the patch changelogs for details.

Thanks!

Comments

Hans de Goede June 16, 2021, 2:36 p.m. UTC | #1
Hi,

On 6/16/21 4:21 PM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Make acpi_dev_get_first_consumer_dev_cb() a bit more straightforward
> and rewrite the comment in it.
> 
> No functional impact.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Thanks, patch looks good to me:

Reviewed-by: Hans de Goede <hdegoede@redhat.com>

Regards,

Hans


> ---
>  drivers/acpi/scan.c |   13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> Index: linux-pm/drivers/acpi/scan.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/scan.c
> +++ linux-pm/drivers/acpi/scan.c
> @@ -2107,13 +2107,12 @@ static int acpi_dev_get_first_consumer_d
>  	struct acpi_device *adev;
>  
>  	adev = acpi_bus_get_acpi_device(dep->consumer);
> -	if (!adev)
> -		/* If we don't find an adev then we want to continue parsing */
> -		return 0;
> -
> -	*(struct acpi_device **)data = adev;
> -
> -	return 1;
> +	if (adev) {
> +		*(struct acpi_device **)data = adev;
> +		return 1;
> +	}
> +	/* Continue parsing if the device object is not present. */
> +	return 0;
>  }
>  
>  static int acpi_scan_clear_dep(struct acpi_dep_data *dep, void *data)
> 
> 
>
Hans de Goede June 16, 2021, 2:48 p.m. UTC | #2
Hi,

On 6/16/21 4:23 PM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> In general, acpi_bus_attach() can only be run safely under
> acpi_scan_lock, but that lock cannot be acquired under
> acpi_dep_list_lock, so make acpi_scan_clear_dep() schedule deferred
> execution of acpi_bus_attach() under acpi_scan_lock instead of
> calling it directly.
> 
> This also fixes a possible race between acpi_scan_clear_dep() and
> device removal that might cause a device object that went away to
> be accessed, because acpi_scan_clear_dep() is changed to acquire
> a reference on the consumer device object.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/acpi/scan.c |   50 +++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 45 insertions(+), 5 deletions(-)
> 
> Index: linux-pm/drivers/acpi/scan.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/scan.c
> +++ linux-pm/drivers/acpi/scan.c
> @@ -2115,16 +2115,56 @@ static int acpi_dev_get_first_consumer_d
>  	return 0;
>  }
>  
> -static int acpi_scan_clear_dep(struct acpi_dep_data *dep, void *data)
> -{
> +struct acpi_scan_clear_dep_work {
> +	struct work_struct work;
>  	struct acpi_device *adev;
> +};
> +
> +static void acpi_scan_clear_dep_fn(struct work_struct *work)
> +{
> +	struct acpi_scan_clear_dep_work *cdw;
> +
> +	cdw = container_of(work, struct acpi_scan_clear_dep_work, work);
>  
> -	acpi_bus_get_device(dep->consumer, &adev);
> +	acpi_scan_lock_acquire();
> +	acpi_bus_attach(cdw->adev, true);
> +	acpi_scan_lock_release();
> +
> +	acpi_dev_put(cdw->adev);
> +	kfree(cdw);
> +}
> +
> +static bool acpi_scan_clear_dep_queue(struct acpi_device *adev)
> +{
> +	struct acpi_scan_clear_dep_work *cdw;
> +
> +	if (adev->dep_unmet)
> +		return false;
> +
> +	cdw = kmalloc(sizeof(*cdw), GFP_KERNEL);
> +	if (!cdw)
> +		return false;
> +
> +	cdw->adev = adev;
> +	INIT_WORK(&cdw->work, acpi_scan_clear_dep_fn);
> +	/*
> +	 * Since the work function may block on the lock until the entire
> +	 * initial enumeration of devices is complete, put it into the unbound
> +	 * workqueue.
> +	 */
> +	queue_work(system_unbound_wq, &cdw->work);

Hmm, I'm a bit worried about this. Even with the system_unbound_wq
some code may expect at least some progress being made with processing
works during the initial enumeration. OTOH this does run pretty early on.

Still I wonder if it would not be better to create + use our own workqueue
for this ?

I guess we can always do this if we run into issues later...

With that said / otherwise the patch looks good to me:

Reviewed-by: Hans de Goede <hdegoede@redhat.com>

Regards,

Hans




> +
> +	return true;
> +}
> +
> +static int acpi_scan_clear_dep(struct acpi_dep_data *dep, void *data)
> +{
> +	struct acpi_device *adev = acpi_bus_get_acpi_device(dep->consumer);
>  
>  	if (adev) {
>  		adev->dep_unmet--;
> -		if (!adev->dep_unmet)
> -			acpi_bus_attach(adev, true);
> +		if (!acpi_scan_clear_dep_queue(adev))
> +			acpi_dev_put(adev);
>  	}
>  
>  	list_del(&dep->node);
> 
> 
>
Hans de Goede June 16, 2021, 2:55 p.m. UTC | #3
Hi,

On 6/16/21 4:25 PM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If acpi_add_single_object() runs concurrently with respect to
> acpi_scan_clear_dep() which deletes a dependencies list entry where
> the device being added is the consumer, the device's dep_unmet
> counter may not be updated to reflect that change.
> 
> Namely, if the dependencies list entry is deleted right after
> calling acpi_scan_dep_init() and before calling acpi_device_add(),
> acpi_scan_clear_dep() will not find the device object corresponding
> to the consumer device ACPI handle and it will not update its
> dep_unmet counter to reflect the deletion of the list entry.
> Consequently, the dep_unmet counter of the device will never
> become zero going forward which may prevent it from being
> completely enumerated.
> 
> To address this problem, modify acpi_add_single_object() to run
> acpi_tie_acpi_dev(), to attach the ACPI device object created by it
> to the corresponding ACPI namespace node, under acpi_dep_list_lock
> along with acpi_scan_dep_init() whenever the latter is called.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/acpi/scan.c |   46 +++++++++++++++++++++++++++++++++-------------
>  1 file changed, 33 insertions(+), 13 deletions(-)
> 
> Index: linux-pm/drivers/acpi/scan.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/scan.c
> +++ linux-pm/drivers/acpi/scan.c
> @@ -657,16 +657,12 @@ static int acpi_tie_acpi_dev(struct acpi
>  	return 0;
>  }
>  
> -int acpi_device_add(struct acpi_device *device,
> -		    void (*release)(struct device *))
> +int __acpi_device_add(struct acpi_device *device,
> +		      void (*release)(struct device *))
>  {
>  	struct acpi_device_bus_id *acpi_device_bus_id;
>  	int result;
>  
> -	result = acpi_tie_acpi_dev(device);
> -	if (result)
> -		return result;
> -
>  	/*
>  	 * Linkage
>  	 * -------
> @@ -755,6 +751,17 @@ err_unlock:
>  	return result;
>  }
>  
> +int acpi_device_add(struct acpi_device *adev, void (*release)(struct device *))
> +{
> +	int ret;
> +
> +	ret = acpi_tie_acpi_dev(adev);
> +	if (ret)
> +		return ret;
> +
> +	return __acpi_device_add(adev, release);
> +}
> +
>  /* --------------------------------------------------------------------------
>                                   Device Enumeration
>     -------------------------------------------------------------------------- */
> @@ -1681,14 +1688,10 @@ static void acpi_scan_dep_init(struct ac
>  {
>  	struct acpi_dep_data *dep;
>  
> -	mutex_lock(&acpi_dep_list_lock);
> -
>  	list_for_each_entry(dep, &acpi_dep_list, node) {
>  		if (dep->consumer == adev->handle)
>  			adev->dep_unmet++;
>  	}
> -
> -	mutex_unlock(&acpi_dep_list_lock);
>  }
>  
>  void acpi_device_add_finalize(struct acpi_device *device)
> @@ -1707,6 +1710,7 @@ static int acpi_add_single_object(struct
>  				  acpi_handle handle, int type, bool dep_init)
>  {
>  	struct acpi_device *device;
> +	bool release_dep_lock = false;
>  	int result;
>  
>  	device = kzalloc(sizeof(struct acpi_device), GFP_KERNEL);
> @@ -1720,16 +1724,32 @@ static int acpi_add_single_object(struct
>  	 * this must be done before the get power-/wakeup_dev-flags calls.
>  	 */
>  	if (type == ACPI_BUS_TYPE_DEVICE || type == ACPI_BUS_TYPE_PROCESSOR) {
> -		if (dep_init)
> +		if (dep_init) {
> +			mutex_lock(&acpi_dep_list_lock);
> +			/*
> +			 * Hold the lock until the acpi_tie_acpi_dev() call
> +			 * below to prevent concurrent acpi_scan_clear_dep()
> +			 * from deleting a dependency list entry without
> +			 * updating dep_unmet for the device.
> +			 */
> +			release_dep_lock = true;
>  			acpi_scan_dep_init(device);
> -
> +		}
>  		acpi_scan_init_status(device);
>  	}
>  
>  	acpi_bus_get_power_flags(device);
>  	acpi_bus_get_wakeup_device_flags(device);
>  
> -	result = acpi_device_add(device, acpi_device_release);
> +	result = acpi_tie_acpi_dev(device);
> +
> +	if (release_dep_lock)
> +		mutex_unlock(&acpi_dep_list_lock);
> +
> +	if (result)

AFAICT you are missing a "acpi_device_release(&device->dev);"
call in the error-exit path here, causing a mem-leak.

Otherwise this looks good, with the above fixed this is:

Reviewed-by: Hans de Goede <hdegoede@redhat.com>

Regards,

Hans


> +		return result;
> +
> +	result = __acpi_device_add(device, acpi_device_release);
>  	if (result) {
>  		acpi_device_release(&device->dev);
>  		return result;
> 
> 
>
Rafael J. Wysocki June 16, 2021, 3:12 p.m. UTC | #4
On Wed, Jun 16, 2021 at 4:48 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 6/16/21 4:23 PM, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > In general, acpi_bus_attach() can only be run safely under
> > acpi_scan_lock, but that lock cannot be acquired under
> > acpi_dep_list_lock, so make acpi_scan_clear_dep() schedule deferred
> > execution of acpi_bus_attach() under acpi_scan_lock instead of
> > calling it directly.
> >
> > This also fixes a possible race between acpi_scan_clear_dep() and
> > device removal that might cause a device object that went away to
> > be accessed, because acpi_scan_clear_dep() is changed to acquire
> > a reference on the consumer device object.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/acpi/scan.c |   50 +++++++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 45 insertions(+), 5 deletions(-)
> >
> > Index: linux-pm/drivers/acpi/scan.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/scan.c
> > +++ linux-pm/drivers/acpi/scan.c
> > @@ -2115,16 +2115,56 @@ static int acpi_dev_get_first_consumer_d
> >       return 0;
> >  }
> >
> > -static int acpi_scan_clear_dep(struct acpi_dep_data *dep, void *data)
> > -{
> > +struct acpi_scan_clear_dep_work {
> > +     struct work_struct work;
> >       struct acpi_device *adev;
> > +};
> > +
> > +static void acpi_scan_clear_dep_fn(struct work_struct *work)
> > +{
> > +     struct acpi_scan_clear_dep_work *cdw;
> > +
> > +     cdw = container_of(work, struct acpi_scan_clear_dep_work, work);
> >
> > -     acpi_bus_get_device(dep->consumer, &adev);
> > +     acpi_scan_lock_acquire();
> > +     acpi_bus_attach(cdw->adev, true);
> > +     acpi_scan_lock_release();
> > +
> > +     acpi_dev_put(cdw->adev);
> > +     kfree(cdw);
> > +}
> > +
> > +static bool acpi_scan_clear_dep_queue(struct acpi_device *adev)
> > +{
> > +     struct acpi_scan_clear_dep_work *cdw;
> > +
> > +     if (adev->dep_unmet)
> > +             return false;
> > +
> > +     cdw = kmalloc(sizeof(*cdw), GFP_KERNEL);
> > +     if (!cdw)
> > +             return false;
> > +
> > +     cdw->adev = adev;
> > +     INIT_WORK(&cdw->work, acpi_scan_clear_dep_fn);
> > +     /*
> > +      * Since the work function may block on the lock until the entire
> > +      * initial enumeration of devices is complete, put it into the unbound
> > +      * workqueue.
> > +      */
> > +     queue_work(system_unbound_wq, &cdw->work);
>
> Hmm, I'm a bit worried about this. Even with the system_unbound_wq
> some code may expect at least some progress being made with processing
> works during the initial enumeration. OTOH this does run pretty early on.
>
> Still I wonder if it would not be better to create + use our own workqueue
> for this ?
>
> I guess we can always do this if we run into issues later...

Exactly my thought.

> With that said / otherwise the patch looks good to me:
>
> Reviewed-by: Hans de Goede <hdegoede@redhat.com>

Thanks!
Hans de Goede June 16, 2021, 6:04 p.m. UTC | #5
Hi,

On 6/16/21 8:00 PM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If acpi_add_single_object() runs concurrently with respect to
> acpi_scan_clear_dep() which deletes a dependencies list entry where
> the device being added is the consumer, the device's dep_unmet
> counter may not be updated to reflect that change.
> 
> Namely, if the dependencies list entry is deleted right after
> calling acpi_scan_dep_init() and before calling acpi_device_add(),
> acpi_scan_clear_dep() will not find the device object corresponding
> to the consumer device ACPI handle and it will not update its
> dep_unmet counter to reflect the deletion of the list entry.
> Consequently, the dep_unmet counter of the device will never
> become zero going forward which may prevent it from being
> completely enumerated.
> 
> To address this problem, modify acpi_add_single_object() to run
> acpi_tie_acpi_dev(), to attach the ACPI device object created by it
> to the corresponding ACPI namespace node, under acpi_dep_list_lock
> along with acpi_scan_dep_init() whenever the latter is called.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Reviewed-by: Hans de Goede <hdegoede@redhat.com>

FWIW this looks good to me now.

Regards,

Hans


> ---
> 
> -> v2:
>    * Fix memory leak spotted by Hans.
>    * Add the R-by tag from Hans.
> 
> ---
>  drivers/acpi/scan.c |   50 ++++++++++++++++++++++++++++++++------------------
>  1 file changed, 32 insertions(+), 18 deletions(-)
> 
> Index: linux-pm/drivers/acpi/scan.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/scan.c
> +++ linux-pm/drivers/acpi/scan.c
> @@ -657,16 +652,12 @@ static int acpi_tie_acpi_dev(struct acpi
>  	return 0;
>  }
>  
> -int acpi_device_add(struct acpi_device *device,
> -		    void (*release)(struct device *))
> +int __acpi_device_add(struct acpi_device *device,
> +		      void (*release)(struct device *))
>  {
>  	struct acpi_device_bus_id *acpi_device_bus_id;
>  	int result;
>  
> -	result = acpi_tie_acpi_dev(device);
> -	if (result)
> -		return result;
> -
>  	/*
>  	 * Linkage
>  	 * -------
> @@ -755,6 +746,17 @@ err_unlock:
>  	return result;
>  }
>  
> +int acpi_device_add(struct acpi_device *adev, void (*release)(struct device *))
> +{
> +	int ret;
> +
> +	ret = acpi_tie_acpi_dev(adev);
> +	if (ret)
> +		return ret;
> +
> +	return __acpi_device_add(adev, release);
> +}
> +
>  /* --------------------------------------------------------------------------
>                                   Device Enumeration
>     -------------------------------------------------------------------------- */
> @@ -1681,14 +1683,10 @@ static void acpi_scan_dep_init(struct ac
>  {
>  	struct acpi_dep_data *dep;
>  
> -	mutex_lock(&acpi_dep_list_lock);
> -
>  	list_for_each_entry(dep, &acpi_dep_list, node) {
>  		if (dep->consumer == adev->handle)
>  			adev->dep_unmet++;
>  	}
> -
> -	mutex_unlock(&acpi_dep_list_lock);
>  }
>  
>  void acpi_device_add_finalize(struct acpi_device *device)
> @@ -1707,6 +1705,7 @@ static int acpi_add_single_object(struct
>  				  acpi_handle handle, int type, bool dep_init)
>  {
>  	struct acpi_device *device;
> +	bool release_dep_lock = false;
>  	int result;
>  
>  	device = kzalloc(sizeof(struct acpi_device), GFP_KERNEL);
> @@ -1720,16 +1719,31 @@ static int acpi_add_single_object(struct
>  	 * this must be done before the get power-/wakeup_dev-flags calls.
>  	 */
>  	if (type == ACPI_BUS_TYPE_DEVICE || type == ACPI_BUS_TYPE_PROCESSOR) {
> -		if (dep_init)
> +		if (dep_init) {
> +			mutex_lock(&acpi_dep_list_lock);
> +			/*
> +			 * Hold the lock until the acpi_tie_acpi_dev() call
> +			 * below to prevent concurrent acpi_scan_clear_dep()
> +			 * from deleting a dependency list entry without
> +			 * updating dep_unmet for the device.
> +			 */
> +			release_dep_lock = true;
>  			acpi_scan_dep_init(device);
> -
> +		}
>  		acpi_scan_init_status(device);
>  	}
>  
>  	acpi_bus_get_power_flags(device);
>  	acpi_bus_get_wakeup_device_flags(device);
>  
> -	result = acpi_device_add(device, acpi_device_release);
> +	result = acpi_tie_acpi_dev(device);
> +
> +	if (release_dep_lock)
> +		mutex_unlock(&acpi_dep_list_lock);
> +
> +	if (!result)
> +		result = __acpi_device_add(device, acpi_device_release);
> +
>  	if (result) {
>  		acpi_device_release(&device->dev);
>  		return result;
> 
> 
>