[RFC,v4,01/15] ACPI: Only enumerate enabled (or functional) processor devices

Message ID E1rVDmP-0027YJ-EW@rmk-PC.armlinux.org.uk
State New

Commit Message

Russell King (Oracle) Jan. 31, 2024, 4:49 p.m. UTC
From: James Morse <james.morse@arm.com>

Today the ACPI enumeration code 'visits' all devices that are present.

This is a problem for arm64, where CPUs are always present, but not
always enabled. When a device-check occurs because the firmware-policy
has changed and a CPU is now enabled, the following error occurs:
| acpi ACPI0007:48: Enumeration failure

This is ultimately because acpi_dev_ready_for_enumeration() returns
true for a device that is not enabled. The ACPI Processor driver
will not register such CPUs as they are not 'decoding their resources'.

ACPI allows a device to be functional instead of maintaining the
present and enabled bit, but we can't simply check the enabled bit
for all devices since firmware can be buggy.

If ACPI indicates that the device is present and enabled, then all well
and good, we can enumerate it. However, if the device is present and not
enabled, then we also check whether the device is a processor device
to limit the impact of this new check to just processor devices.

This avoids enumerating present && functional processor devices that
are not enabled.

Signed-off-by: James Morse <james.morse@arm.com>
Co-developed-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
Changes since RFC v2:
 * Incorporate comment suggestion by Gavin Shan.
Changes since RFC v3:
 * Fixed "sert" typo.
Changes since RFC v3 (smaller series):
 * Restrict checking the enabled bit to processor devices, update
   commit comments.
 * Use Rafael's suggestion in
   https://lore.kernel.org/r/5760569.DvuYhMxLoT@kreacher
 * Updated with a fix - see:
   https://lore.kernel.org/all/Zbe8WQRASx6D6RaG@shell.armlinux.org.uk/
---
 drivers/acpi/acpi_processor.c | 11 +++++++++
 drivers/acpi/device_pm.c      |  2 +-
 drivers/acpi/device_sysfs.c   |  2 +-
 drivers/acpi/internal.h       |  4 ++-
 drivers/acpi/property.c       |  2 +-
 drivers/acpi/scan.c           | 46 +++++++++++++++++++++++++++--------
 6 files changed, 53 insertions(+), 14 deletions(-)
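
For readers following the thread, the status rule described in the commit message can be sketched outside the kernel as below. This is only an illustration under stated assumptions: `struct sta_sketch`, `sketch_old_rule()`, `sketch_new_rule()` and `sketch_selfcheck()` are hypothetical names, the `is_processor` field stands in for the patch's acpi_device_is_processor(), and the dependency (honor_deps/dep_unmet) check is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few struct acpi_device fields that matter. */
struct sta_sketch {
	bool present;      /* _STA bit 0 */
	bool enabled;      /* _STA bit 1 */
	bool functional;   /* _STA bit 3 */
	bool is_processor; /* stands in for acpi_device_is_processor() */
};

/* Old rule: mirrors acpi_device_is_present(), which the patch removes. */
bool sketch_old_rule(const struct sta_sketch *d)
{
	return d->present || d->functional;
}

/* New rule: the status part of the patched acpi_dev_ready_for_enumeration(). */
bool sketch_new_rule(const struct sta_sketch *d)
{
	/* Not present: only the functional bit can make it enumerable. */
	if (!d->present)
		return d->functional;

	/* Fast path: present and enabled is always fine. */
	if (d->enabled)
		return true;

	/* Present but not enabled: reject processors only. */
	return !d->is_processor;
}

/* Walks through the cases named in the commit message. */
void sketch_selfcheck(void)
{
	/* arm64-style CPU: present and functional, but not enabled. */
	struct sta_sketch cpu = { .present = true, .functional = true,
				  .is_processor = true };
	/* Non-processor device reporting the same status bits. */
	struct sta_sketch other = { .present = true, .functional = true };

	assert(sketch_old_rule(&cpu));   /* old code would visit it */
	assert(!sketch_new_rule(&cpu));  /* new code skips it */
	assert(sketch_new_rule(&other)); /* non-processors are unaffected */

	cpu.enabled = true;              /* firmware policy enables the CPU */
	assert(sketch_new_rule(&cpu));
}
```

Calling sketch_selfcheck() from a small test program's main() exercises all four cases. The point of the design is that only a present, not-enabled processor changes behaviour, which limits exposure to firmware that does not maintain the enabled bit.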

Comments

Rafael J. Wysocki Feb. 15, 2024, 8:10 p.m. UTC | #1
On Wed, Jan 31, 2024 at 5:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
>
> From: James Morse <james.morse@arm.com>
>
> Today the ACPI enumeration code 'visits' all devices that are present.
>
> This is a problem for arm64, where CPUs are always present, but not
> always enabled. When a device-check occurs because the firmware-policy
> has changed and a CPU is now enabled, the following error occurs:
> | acpi ACPI0007:48: Enumeration failure
>
> This is ultimately because acpi_dev_ready_for_enumeration() returns
> true for a device that is not enabled. The ACPI Processor driver
> will not register such CPUs as they are not 'decoding their resources'.
>
> ACPI allows a device to be functional instead of maintaining the
> present and enabled bit, but we can't simply check the enabled bit
> for all devices since firmware can be buggy.
>
> If ACPI indicates that the device is present and enabled, then all well
> and good, we can enumerate it. However, if the device is present and not
> enabled, then we also check whether the device is a processor device
> to limit the impact of this new check to just processor devices.
>
> This avoids enumerating present && functional processor devices that
> are not enabled.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Co-developed-by: Rafael J. Wysocki <rjw@rjwysocki.net>
> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> Changes since RFC v2:
>  * Incorporate comment suggestion by Gavin Shan.
> Changes since RFC v3:
>  * Fixed "sert" typo.
> Changes since RFC v3 (smaller series):
>  * Restrict checking the enabled bit to processor devices, update
>    commit comments.
>  * Use Rafael's suggestion in
>    https://lore.kernel.org/r/5760569.DvuYhMxLoT@kreacher
>  * Updated with a fix - see:
>    https://lore.kernel.org/all/Zbe8WQRASx6D6RaG@shell.armlinux.org.uk/
> ---
>  drivers/acpi/acpi_processor.c | 11 +++++++++
>  drivers/acpi/device_pm.c      |  2 +-
>  drivers/acpi/device_sysfs.c   |  2 +-
>  drivers/acpi/internal.h       |  4 ++-
>  drivers/acpi/property.c       |  2 +-
>  drivers/acpi/scan.c           | 46 +++++++++++++++++++++++++++--------
>  6 files changed, 53 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 4fe2ef54088c..cf7c1cca69dd 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -626,6 +626,17 @@ static struct acpi_scan_handler processor_handler = {
>         },
>  };
>
> +bool acpi_device_is_processor(const struct acpi_device *adev)
> +{
> +       if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
> +               return true;
> +
> +       if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
> +               return false;
> +
> +       return acpi_scan_check_handler(adev, &processor_handler);
> +}
> +
>  static int acpi_processor_container_attach(struct acpi_device *dev,
>                                            const struct acpi_device_id *id)
>  {
> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index 3b4d048c4941..e3c80f3b3b57 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
>                 return -EINVAL;
>
>         device->power.state = ACPI_STATE_UNKNOWN;
> -       if (!acpi_device_is_present(device)) {
> +       if (!acpi_dev_ready_for_enumeration(device)) {
>                 device->flags.initialized = false;
>                 return -ENXIO;
>         }
> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> index 23373faa35ec..a0256d2493a7 100644
> --- a/drivers/acpi/device_sysfs.c
> +++ b/drivers/acpi/device_sysfs.c
> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
>         struct acpi_hardware_id *id;
>
>         /* Avoid unnecessarily loading modules for non present devices. */
> -       if (!acpi_device_is_present(acpi_dev))
> +       if (!acpi_dev_ready_for_enumeration(acpi_dev))
>                 return 0;
>
>         /*
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 6588525c45ef..1bc8b6db60c5 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
>  int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
>                                        const char *hotplug_profile_name);
>  void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +                            struct acpi_scan_handler *handler);
>
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *acpi_debugfs_dir;
> @@ -121,7 +123,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
>  void acpi_device_remove_files(struct acpi_device *dev);
>  void acpi_device_add_finalize(struct acpi_device *device);
>  void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> -bool acpi_device_is_present(const struct acpi_device *adev);
>  bool acpi_device_is_battery(struct acpi_device *adev);
>  bool acpi_device_is_first_physical_node(struct acpi_device *adev,
>                                         const struct device *dev);
> @@ -133,6 +134,7 @@ int acpi_bus_register_early_device(int type);
>  const struct acpi_device *acpi_companion_match(const struct device *dev);
>  int __acpi_device_uevent_modalias(const struct acpi_device *adev,
>                                   struct kobj_uevent_env *env);
> +bool acpi_device_is_processor(const struct acpi_device *adev);
>
>  /* --------------------------------------------------------------------------
>                                    Power Resource
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index a6ead5204046..9f8d54038770 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -1486,7 +1486,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
>         if (!is_acpi_device_node(fwnode))
>                 return false;
>
> -       return acpi_device_is_present(to_acpi_device_node(fwnode));
> +       return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
>  }
>
>  static const void *
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index e6ed1ba91e5c..fd2e8b3a5749 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
>         int error;
>
>         acpi_bus_get_status(adev);
> -       if (acpi_device_is_present(adev)) {
> +       if (acpi_dev_ready_for_enumeration(adev)) {
>                 /*
>                  * This function is only called for device objects for which
>                  * matching scan handlers exist.  The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
>         int error;
>
>         acpi_bus_get_status(adev);
> -       if (!acpi_device_is_present(adev)) {
> +       if (!acpi_dev_ready_for_enumeration(adev)) {
>                 acpi_scan_device_not_enumerated(adev);
>                 return 0;
>         }
> @@ -1917,11 +1917,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
>         return true;
>  }
>
> -bool acpi_device_is_present(const struct acpi_device *adev)
> -{
> -       return adev->status.present || adev->status.functional;
> -}
> -
>  static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>                                        const char *idstr,
>                                        const struct acpi_device_id **matchid)
> @@ -1942,6 +1937,18 @@ static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>         return false;
>  }
>
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +                            struct acpi_scan_handler *handler)
> +{
> +       struct acpi_hardware_id *hwid;
> +
> +       list_for_each_entry(hwid, &adev->pnp.ids, list)
> +               if (acpi_scan_handler_matching(handler, hwid->id, NULL))
> +                       return true;
> +
> +       return false;
> +}
> +
>  static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
>                                         const struct acpi_device_id **matchid)
>  {
> @@ -2405,16 +2412,35 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
>   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
>   * @device: Pointer to the &struct acpi_device to check
>   *
> - * Check if the device is present and has no unmet dependencies.
> + * Check if the device is functional or enabled and has no unmet dependencies.
>   *
> - * Return true if the device is ready for enumeratino. Otherwise, return false.
> + * Return true if the device is ready for enumeration. Otherwise, return false.
>   */
>  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
>  {
>         if (device->flags.honor_deps && device->dep_unmet)
>                 return false;
>
> -       return acpi_device_is_present(device);
> +       /*
> +        * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> +        * (!present && functional) for certain types of devices that should be
> +        * enumerated. Note that the enabled bit should not be set unless the
> +        * present bit is set.
> +        *
> +        * However, limit this only to processor devices to reduce possible
> +        * regressions with firmware.
> +        */
> +       if (!device->status.present)
> +               return device->status.functional;
> +
> +       /*
> +        * Fast path - if enabled is set, avoid the more expensive test to
> +        * check whether this device is a processor.
> +        */
> +       if (device->status.enabled)
> +               return true;
> +
> +       return !acpi_device_is_processor(device);
>  }
>  EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>
> --

I can queue this up for 6.9 as it looks like the rest of the series
will still need some work.  What do you think?

Jonathan Cameron Feb. 19, 2024, 9:45 a.m. UTC | #2
On Thu, 15 Feb 2024 21:10:39 +0100
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> I can queue this up for 6.9 as it looks like the rest of the series
> will still need some work.  What do you think?

The sooner this goes in, the sooner we discover whether some of the BIOS bug
workarounds we have dropped from the series are in reality necessary
(i.e. get it into big board test farms).

So I'm definitely keen to see this go in for 6.9.

Hopefully we can make rapid progress on the rest of the series and
hammer out which of the remaining subtle differences between
the two flows are real vs code evolution issues.

Jonathan

Russell King (Oracle) Feb. 20, 2024, 11:30 a.m. UTC | #3
On Thu, Feb 15, 2024 at 09:10:39PM +0100, Rafael J. Wysocki wrote:
> I can queue this up for 6.9 as it looks like the rest of the series
> will still need some work.  What do you think?

That seems to be the only way we can make some progress with this
series. I've no idea how we progress from here because I can't answer
your questions on patch 2.

Rafael J. Wysocki Feb. 21, 2024, 1:01 p.m. UTC | #4
On Wed, Jan 31, 2024 at 5:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 4fe2ef54088c..cf7c1cca69dd 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -626,6 +626,17 @@ static struct acpi_scan_handler processor_handler = {
>         },
>  };
>
> +bool acpi_device_is_processor(const struct acpi_device *adev)
> +{
> +       if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
> +               return true;
> +
> +       if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
> +               return false;
> +
> +       return acpi_scan_check_handler(adev, &processor_handler);
> +}
> +
>  static int acpi_processor_container_attach(struct acpi_device *dev,
>                                            const struct acpi_device_id *id)
>  {
> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index 3b4d048c4941..e3c80f3b3b57 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
>                 return -EINVAL;
>
>         device->power.state = ACPI_STATE_UNKNOWN;
> -       if (!acpi_device_is_present(device)) {
> +       if (!acpi_dev_ready_for_enumeration(device)) {

Sorry for failing to catch this earlier, but this change affects
non-processor devices possibly adversely.

Namely, one of the differences between acpi_device_is_present() and
acpi_dev_ready_for_enumeration() is the (device->flags.honor_deps &&
device->dep_unmet) check, which exists only in the latter. It may cause
the power_manageable flag to be unset for devices with dependencies,
although they are in fact power-manageable.

The replacement here cannot be made.
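
To make the objection concrete, here is a minimal user-space sketch of the difference. It is not kernel code: `struct dep_sketch`, `dep_sketch_is_present()`, `dep_sketch_ready()`, `dep_sketch_init_power()` and `dep_sketch_selfcheck()` are hypothetical names, and the init function is a heavily simplified stand-in for acpi_bus_init_power() that keeps only the early bail-out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant struct acpi_device fields. */
struct dep_sketch {
	bool present;
	bool enabled;
	bool functional;
	bool is_processor;      /* stands in for acpi_device_is_processor() */
	bool honor_deps;        /* device->flags.honor_deps */
	unsigned int dep_unmet; /* device->dep_unmet */
};

/* Mirrors the old acpi_device_is_present(): no dependency check. */
bool dep_sketch_is_present(const struct dep_sketch *d)
{
	return d->present || d->functional;
}

/* Mirrors the patched acpi_dev_ready_for_enumeration(). */
bool dep_sketch_ready(const struct dep_sketch *d)
{
	if (d->honor_deps && d->dep_unmet)
		return false;
	if (!d->present)
		return d->functional;
	return d->enabled || !d->is_processor;
}

/* Simplified: only the early exit of acpi_bus_init_power() is modelled. */
int dep_sketch_init_power(const struct dep_sketch *d,
			  bool (*check)(const struct dep_sketch *))
{
	if (!check(d))
		return -ENXIO; /* power setup is skipped entirely */
	return 0;              /* rest of the power setup elided */
}

void dep_sketch_selfcheck(void)
{
	/* Present, enabled, non-processor device with one unmet dependency. */
	struct dep_sketch d = { .present = true, .enabled = true,
				.honor_deps = true, .dep_unmet = 1 };

	assert(dep_sketch_init_power(&d, dep_sketch_is_present) == 0);
	assert(dep_sketch_init_power(&d, dep_sketch_ready) == -ENXIO);
}
```

With the old check the device gets its power state initialized; with the replacement it fails early with -ENXIO purely because of the unmet dependency, even though its status bits say it is present and enabled.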

>                 device->flags.initialized = false;
>                 return -ENXIO;
>         }
> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> index 23373faa35ec..a0256d2493a7 100644
> --- a/drivers/acpi/device_sysfs.c
> +++ b/drivers/acpi/device_sysfs.c
> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
>         struct acpi_hardware_id *id;
>
>         /* Avoid unnecessarily loading modules for non present devices. */
> -       if (!acpi_device_is_present(acpi_dev))
> +       if (!acpi_dev_ready_for_enumeration(acpi_dev))

The replacement here is incorrect for a reason analogous to the one
above: it may cause modalias creation to be skipped for non-processor
devices with unmet dependencies, even though matching modules for them
should be loaded.

In fact, this replacement doesn't even have a functional effect on
processors, because there are no modules matching the processor device
ID AFAICS.

>                 return 0;
>
>         /*
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 6588525c45ef..1bc8b6db60c5 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
>  int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
>                                        const char *hotplug_profile_name);
>  void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +                            struct acpi_scan_handler *handler);
>
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *acpi_debugfs_dir;
> @@ -121,7 +123,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
>  void acpi_device_remove_files(struct acpi_device *dev);
>  void acpi_device_add_finalize(struct acpi_device *device);
>  void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> -bool acpi_device_is_present(const struct acpi_device *adev);
>  bool acpi_device_is_battery(struct acpi_device *adev);
>  bool acpi_device_is_first_physical_node(struct acpi_device *adev,
>                                         const struct device *dev);
> @@ -133,6 +134,7 @@ int acpi_bus_register_early_device(int type);
>  const struct acpi_device *acpi_companion_match(const struct device *dev);
>  int __acpi_device_uevent_modalias(const struct acpi_device *adev,
>                                   struct kobj_uevent_env *env);
> +bool acpi_device_is_processor(const struct acpi_device *adev);
>
>  /* --------------------------------------------------------------------------
>                                    Power Resource
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index a6ead5204046..9f8d54038770 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -1486,7 +1486,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
>         if (!is_acpi_device_node(fwnode))
>                 return false;
>
> -       return acpi_device_is_present(to_acpi_device_node(fwnode));
> +       return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));

This, again, may break non-processor devices with dependencies in
subtle ways and it doesn't have a functional effect on processors
AFAICS.

>  }
>
>  static const void *
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index e6ed1ba91e5c..fd2e8b3a5749 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
>         int error;
>
>         acpi_bus_get_status(adev);
> -       if (acpi_device_is_present(adev)) {
> +       if (acpi_dev_ready_for_enumeration(adev)) {

It looks to me like there are two purposes of this replacement.  One
is to handle the removal case which is analogous to the
acpi_scan_bus_check() case below.

The other purpose seems to be to avoid the dev_warn() message printed
when acpi_processor_add() does not return 1 for a processor device
that is not enabled.

However, this message arguably should not be printed at all so long as
acpi_bus_scan() succeeds, because hot-adding a device without a
matching scan handler is entirely valid.

I'll send a patch to fix this shortly.

In addition to that, it would suffice to make acpi_processor_add()
check the enabled bit and return 0 early when it is clear.  I'll send
a patch for this as well.
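
As a rough illustration of that early return (a sketch only, not the patch
mentioned above), assume the scan handler convention in which attach
returns 1 when the handler has bound to the device and 0 when it has done
nothing without reporting an error.  The struct, the sketch_* names and
the "registered" field are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct acpi_device. */
struct sketch_acpi_device {
	struct { bool present, enabled, functional; } status;
	bool registered;	/* hypothetical: set once the CPU is registered */
};

/*
 * Hypothetical stand-in for acpi_processor_add().
 * Assumed return convention: 1 = handler attached, 0 = nothing done.
 */
static int sketch_processor_add(struct sketch_acpi_device *adev)
{
	assert(adev != NULL);

	/* Enabled bit clear: bail out quietly instead of failing. */
	if (!adev->status.enabled)
		return 0;

	adev->registered = true;
	return 1;
}
```

With this shape, a present-but-disabled processor is simply left without a
handler, and the caller does not need to know about the enabled bit.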

>                 /*
>                  * This function is only called for device objects for which
>                  * matching scan handlers exist.  The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
>         int error;
>
>         acpi_bus_get_status(adev);
> -       if (!acpi_device_is_present(adev)) {
> +       if (!acpi_dev_ready_for_enumeration(adev)) {

Indeed, the enabled bit should be checked here, along with the present
and functional bits, but it would be better to move that check to
acpi_bus_trim_one() or even acpi_processor_remove(), so the
not-present-but-functional case is handled correctly.  And the
acpi_scan_device_check() case could then be handled analogously.
Another patch to be sent.

>                 acpi_scan_device_not_enumerated(adev);
>                 return 0;
>         }
> @@ -1917,11 +1917,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
>         return true;
>  }
>
> -bool acpi_device_is_present(const struct acpi_device *adev)
> -{
> -       return adev->status.present || adev->status.functional;
> -}
> -
>  static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>                                        const char *idstr,
>                                        const struct acpi_device_id **matchid)
> @@ -1942,6 +1937,18 @@ static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>         return false;
>  }
>
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +                            struct acpi_scan_handler *handler)
> +{
> +       struct acpi_hardware_id *hwid;
> +
> +       list_for_each_entry(hwid, &adev->pnp.ids, list)
> +               if (acpi_scan_handler_matching(handler, hwid->id, NULL))
> +                       return true;
> +
> +       return false;
> +}
> +
>  static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
>                                         const struct acpi_device_id **matchid)
>  {
> @@ -2405,16 +2412,35 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
>   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
>   * @device: Pointer to the &struct acpi_device to check
>   *
> - * Check if the device is present and has no unmet dependencies.
> + * Check if the device is functional or enabled and has no unmet dependencies.
>   *
> - * Return true if the device is ready for enumeratino. Otherwise, return false.
> + * Return true if the device is ready for enumeration. Otherwise, return false.
>   */
>  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
>  {
>         if (device->flags.honor_deps && device->dep_unmet)
>                 return false;
>
> -       return acpi_device_is_present(device);
> +       /*
> +        * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> +        * (!present && functional) for certain types of devices that should be
> +        * enumerated. Note that the enabled bit should not be set unless the
> +        * present bit is set.
> +        *
> +        * However, limit this only to processor devices to reduce possible
> +        * regressions with firmware.
> +        */
> +       if (!device->status.present)
> +               return device->status.functional;
> +
> +       /*
> +        * Fast path - if enabled is set, avoid the more expensive test to
> +        * check whether this device is a processor.
> +        */
> +       if (device->status.enabled)
> +               return true;
> +
> +       return !acpi_device_is_processor(device);
>  }
>  EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>
> --

Patch

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 4fe2ef54088c..cf7c1cca69dd 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -626,6 +626,17 @@  static struct acpi_scan_handler processor_handler = {
 	},
 };
 
+bool acpi_device_is_processor(const struct acpi_device *adev)
+{
+	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
+		return true;
+
+	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
+		return false;
+
+	return acpi_scan_check_handler(adev, &processor_handler);
+}
+
 static int acpi_processor_container_attach(struct acpi_device *dev,
 					   const struct acpi_device_id *id)
 {
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index 3b4d048c4941..e3c80f3b3b57 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -313,7 +313,7 @@  int acpi_bus_init_power(struct acpi_device *device)
 		return -EINVAL;
 
 	device->power.state = ACPI_STATE_UNKNOWN;
-	if (!acpi_device_is_present(device)) {
+	if (!acpi_dev_ready_for_enumeration(device)) {
 		device->flags.initialized = false;
 		return -ENXIO;
 	}
diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
index 23373faa35ec..a0256d2493a7 100644
--- a/drivers/acpi/device_sysfs.c
+++ b/drivers/acpi/device_sysfs.c
@@ -141,7 +141,7 @@  static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
 	struct acpi_hardware_id *id;
 
 	/* Avoid unnecessarily loading modules for non present devices. */
-	if (!acpi_device_is_present(acpi_dev))
+	if (!acpi_dev_ready_for_enumeration(acpi_dev))
 		return 0;
 
 	/*
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 6588525c45ef..1bc8b6db60c5 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -62,6 +62,8 @@  void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
 int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
 				       const char *hotplug_profile_name);
 void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler);
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *acpi_debugfs_dir;
@@ -121,7 +123,6 @@  int acpi_device_setup_files(struct acpi_device *dev);
 void acpi_device_remove_files(struct acpi_device *dev);
 void acpi_device_add_finalize(struct acpi_device *device);
 void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
-bool acpi_device_is_present(const struct acpi_device *adev);
 bool acpi_device_is_battery(struct acpi_device *adev);
 bool acpi_device_is_first_physical_node(struct acpi_device *adev,
 					const struct device *dev);
@@ -133,6 +134,7 @@  int acpi_bus_register_early_device(int type);
 const struct acpi_device *acpi_companion_match(const struct device *dev);
 int __acpi_device_uevent_modalias(const struct acpi_device *adev,
 				  struct kobj_uevent_env *env);
+bool acpi_device_is_processor(const struct acpi_device *adev);
 
 /* --------------------------------------------------------------------------
                                   Power Resource
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
index a6ead5204046..9f8d54038770 100644
--- a/drivers/acpi/property.c
+++ b/drivers/acpi/property.c
@@ -1486,7 +1486,7 @@  static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
 	if (!is_acpi_device_node(fwnode))
 		return false;
 
-	return acpi_device_is_present(to_acpi_device_node(fwnode));
+	return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
 }
 
 static const void *
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index e6ed1ba91e5c..fd2e8b3a5749 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -304,7 +304,7 @@  static int acpi_scan_device_check(struct acpi_device *adev)
 	int error;
 
 	acpi_bus_get_status(adev);
-	if (acpi_device_is_present(adev)) {
+	if (acpi_dev_ready_for_enumeration(adev)) {
 		/*
 		 * This function is only called for device objects for which
 		 * matching scan handlers exist.  The only situation in which
@@ -338,7 +338,7 @@  static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
 	int error;
 
 	acpi_bus_get_status(adev);
-	if (!acpi_device_is_present(adev)) {
+	if (!acpi_dev_ready_for_enumeration(adev)) {
 		acpi_scan_device_not_enumerated(adev);
 		return 0;
 	}
@@ -1917,11 +1917,6 @@  static bool acpi_device_should_be_hidden(acpi_handle handle)
 	return true;
 }
 
-bool acpi_device_is_present(const struct acpi_device *adev)
-{
-	return adev->status.present || adev->status.functional;
-}
-
 static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
 				       const char *idstr,
 				       const struct acpi_device_id **matchid)
@@ -1942,6 +1937,18 @@  static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
 	return false;
 }
 
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler)
+{
+	struct acpi_hardware_id *hwid;
+
+	list_for_each_entry(hwid, &adev->pnp.ids, list)
+		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
+			return true;
+
+	return false;
+}
+
 static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
 					const struct acpi_device_id **matchid)
 {
@@ -2405,16 +2412,35 @@  EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
  * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
  * @device: Pointer to the &struct acpi_device to check
  *
- * Check if the device is present and has no unmet dependencies.
+ * Check if the device is functional or enabled and has no unmet dependencies.
  *
- * Return true if the device is ready for enumeratino. Otherwise, return false.
+ * Return true if the device is ready for enumeration. Otherwise, return false.
  */
 bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
 {
 	if (device->flags.honor_deps && device->dep_unmet)
 		return false;
 
-	return acpi_device_is_present(device);
+	/*
+	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
+	 * (!present && functional) for certain types of devices that should be
+	 * enumerated. Note that the enabled bit should not be set unless the
+	 * present bit is set.
+	 *
+	 * However, limit this only to processor devices to reduce possible
+	 * regressions with firmware.
+	 */
+	if (!device->status.present)
+		return device->status.functional;
+
+	/*
+	 * Fast path - if enabled is set, avoid the more expensive test to
+	 * check whether this device is a processor.
+	 */
+	if (device->status.enabled)
+		return true;
+
+	return !acpi_device_is_processor(device);
 }
 EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
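
For reference, the decision order in the new acpi_dev_ready_for_enumeration()
can be modelled in a standalone form.  This is a sketch under assumptions:
struct model_dev and its flat fields are hypothetical, and the is_processor
field stands in for the acpi_device_is_processor() lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct acpi_device. */
struct model_dev {
	bool present;		/* _STA present bit */
	bool enabled;		/* _STA enabled bit */
	bool functional;	/* _STA functional bit */
	bool honor_deps;	/* stands in for flags.honor_deps */
	unsigned int dep_unmet;	/* number of unmet dependencies */
	bool is_processor;	/* stands in for acpi_device_is_processor() */
};

static bool model_ready_for_enumeration(const struct model_dev *dev)
{
	assert(dev != NULL);

	/* Unmet dependencies block enumeration regardless of _STA. */
	if (dev->honor_deps && dev->dep_unmet)
		return false;

	/* Not present: only the functional bit can qualify the device. */
	if (!dev->present)
		return dev->functional;

	/* Present and enabled: always ready. */
	if (dev->enabled)
		return true;

	/* Present but not enabled: only processors are held back. */
	return !dev->is_processor;
}
```

The only case that changes relative to the old behaviour is a present,
not-enabled processor with no unmet dependencies, which now yields false.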