[RFC,v3,00/21] ACPI/arm64: add support for virtual cpu hotplug

Message ID ZXmn46ptis59F0CO@shell.armlinux.org.uk

Message

Russell King (Oracle) Dec. 13, 2023, 12:47 p.m. UTC
Hi,

These are the remaining patches for ARM64 virtual cpu hotplug, which
follow on from the previous set of 21 patches that GregKH has
recently queued up, and "x86: intel_epb: Don't rely on link order",
which can be found at:

https://lore.kernel.org/r/E1r6SeD-00DCuK-M6@rmk-PC.armlinux.org.uk
https://lore.kernel.org/r/ZVyz/Ve5pPu8AWoA@shell.armlinux.org.uk

The entire series can be found at:

 git://git.armlinux.org.uk/~rmk/linux-arm.git aarch64/hotplug-vcpu/head

The original cover message from the entire series is below the
diffstat.

 Documentation/arch/arm64/cpu-hotplug.rst   |  79 ++++++++++++++++
 Documentation/arch/arm64/index.rst         |   1 +
 arch/arm64/include/asm/acpi.h              |  11 +++
 arch/arm64/kernel/acpi_numa.c              |  11 ---
 arch/arm64/kernel/psci.c                   |   2 +-
 arch/arm64/kernel/smp.c                    |   3 +-
 arch/loongarch/Kconfig                     |   2 +-
 arch/loongarch/configs/loongson3_defconfig |   2 +-
 arch/loongarch/kernel/acpi.c               |   4 +-
 arch/x86/Kconfig                           |   3 +-
 arch/x86/kernel/acpi/boot.c                |   4 +-
 drivers/acpi/Kconfig                       |  13 ++-
 drivers/acpi/acpi_processor.c              | 141 ++++++++++++++++++++++++++---
 drivers/acpi/bus.c                         |  16 ++++
 drivers/acpi/device_pm.c                   |   2 +-
 drivers/acpi/device_sysfs.c                |   2 +-
 drivers/acpi/internal.h                    |   1 -
 drivers/acpi/property.c                    |   2 +-
 drivers/acpi/scan.c                        | 140 ++++++++++++++++++----------
 drivers/base/cpu.c                         |  16 +++-
 drivers/irqchip/irq-gic-v3.c               |  32 ++++---
 include/acpi/acpi_bus.h                    |   1 +
 include/acpi/actbl2.h                      |   1 +
 include/linux/acpi.h                       |  10 +-
 include/linux/cpumask.h                    |  25 +++++
 kernel/cpu.c                               |   3 +
 26 files changed, 421 insertions(+), 106 deletions(-)

On Tue, Oct 24, 2023 at 04:15:28PM +0100, Russell King (Oracle) wrote:
> Hi,
> 
> I'm posting James' patch set updated with most of the review comments
> from his RFC v2 series back in September. Individual patches have a
> changelog attached at the bottom of the commit message. Those which
> I have finished updating have my S-o-b on them, those which still have
> outstanding review comments from RFC v2 do not. In some of these cases
> I've asked questions and am waiting for responses.
> 
> I'm posting this as RFC v3 because there are still some unaddressed
> comments and it's clearly not ready for merging. Even if it was ready
> to be merged, it is too late in this development cycle to be taking
> this change in, so there would be little point posting it non-RFC.
> Also James stated that he's waiting for confirmation from the
> Kubernetes/Kata folk - I have no idea what the status is there.
> 
> I will be sending each patch individually to a wider audience
> appropriate for that patch - apologies to those missing out on this
> cover message. I have added more mailing lists to the series with the
> exception of the acpica list in the hope of this cover message also
> reaching those folk.
> 
> The changes that aren't included are:
> 
> 1. Updates for my patch that was merged via Thomas (thanks!):
>    c4dd854f740c cpu-hotplug: Provide prototypes for arch CPU registration
>    rather than having this change spread through James' patches.
> 
> 2. New patch - simplification of PA-RISC's smp_prepare_boot_cpu()
> 
> 3. Moved "ACPI: Use the acpi_device_is_present() helper in more places"
>    and "ACPI: Rename acpi_scan_device_not_present() to be about
>    enumeration" to the beginning of the series - these two patches are
>    already queued up for merging into 6.7.
> 
> 4. Moved "arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into
>    a helper" to the beginning of the series, which has been submitted,
>    but as yet the fate of that posting isn't known.
> 
> The first four patches in this series are provided for completeness only.
> 
> There is an additional patch in James' git tree that isn't in the set
> of patches that James posted: "ACPI: processor: Only call
> arch_unregister_cpu() if HOTPLUG_CPU is selected" which looks to me to
> be a workaround for arch_unregister_cpu() being under the ifdef. I've
> commented on this on the RFC v2 posting making a suggestion, but as yet
> haven't had any response.
> 
> I've included almost all of James' original covering body below the
> diffstat.
> 
> The reason that I'm doing this is to help move this code forward so
> hopefully it can be merged - which is why I have been keen to dig out
> from James' patches anything that can be merged and submit it
> separately, since this is a feature for which some users have a
> definite need.
> 
> Please note that I haven't tested this beyond building for aarch64 at
> the present time.
> 
> The series can be found at:
> 
>  git://git.armlinux.org.uk/~rmk/linux-arm.git aarch64/hotplug-vcpu/v6.6-rc7
> 
>  Documentation/arch/arm64/cpu-hotplug.rst   |  79 +++++++++++++++
>  Documentation/arch/arm64/index.rst         |   1 +
>  arch/arm64/Kconfig                         |   1 +
>  arch/arm64/include/asm/acpi.h              |  11 +++
>  arch/arm64/include/asm/cpu.h               |   1 -
>  arch/arm64/kernel/acpi_numa.c              |  11 ---
>  arch/arm64/kernel/psci.c                   |   2 +-
>  arch/arm64/kernel/setup.c                  |  13 +--
>  arch/arm64/kernel/smp.c                    |   5 +-
>  arch/ia64/Kconfig                          |   3 +
>  arch/ia64/include/asm/acpi.h               |   2 +-
>  arch/ia64/include/asm/cpu.h                |   6 --
>  arch/ia64/kernel/acpi.c                    |   6 +-
>  arch/ia64/kernel/setup.c                   |   2 +-
>  arch/ia64/kernel/topology.c                |  35 +------
>  arch/loongarch/Kconfig                     |   2 +
>  arch/loongarch/configs/loongson3_defconfig |   2 +-
>  arch/loongarch/kernel/acpi.c               |   4 +-
>  arch/loongarch/kernel/topology.c           |  38 +-------
>  arch/parisc/kernel/smp.c                   |   8 +-
>  arch/riscv/Kconfig                         |   1 +
>  arch/riscv/kernel/setup.c                  |  19 +---
>  arch/x86/Kconfig                           |   3 +
>  arch/x86/include/asm/cpu.h                 |   4 -
>  arch/x86/kernel/acpi/boot.c                |   4 +-
>  arch/x86/kernel/cpu/intel_epb.c            |   2 +-
>  arch/x86/kernel/topology.c                 |  27 +-----
>  drivers/acpi/Kconfig                       |  14 ++-
>  drivers/acpi/acpi_processor.c              | 151 +++++++++++++++++++++++------
>  drivers/acpi/bus.c                         |  16 +++
>  drivers/acpi/device_pm.c                   |   2 +-
>  drivers/acpi/device_sysfs.c                |   2 +-
>  drivers/acpi/internal.h                    |   1 -
>  drivers/acpi/processor_core.c              |   2 +-
>  drivers/acpi/property.c                    |   2 +-
>  drivers/acpi/scan.c                        | 148 ++++++++++++++++++----------
>  drivers/base/arch_topology.c               |  38 +++++---
>  drivers/base/cpu.c                         |  44 +++++++--
>  drivers/base/init.c                        |   2 +-
>  drivers/base/node.c                        |   7 --
>  drivers/firmware/psci/psci.c               |   2 +
>  drivers/irqchip/irq-gic-v3.c               |  38 +++++---
>  include/acpi/acpi_bus.h                    |   1 +
>  include/acpi/actbl2.h                      |   1 +
>  include/linux/acpi.h                       |  13 ++-
>  include/linux/cpu.h                        |   4 +
>  include/linux/cpumask.h                    |  25 +++++
>  kernel/cpu.c                               |   3 +
>  48 files changed, 516 insertions(+), 292 deletions(-)
> 
> 
> On Wed, Sep 13, 2023 at 04:37:48PM +0000, James Morse wrote:
> > Hello!
> > 
> > Changes since RFC-v1:
> >  * riscv is new, ia64 is gone
> >  * The KVM support is different, and upstream - no need to patch the host.
> > 
> > ---
> > 
> > This series adds what looks like cpuhotplug support to arm64 for use in
> > virtual machines. It does this by moving the cpu_register() calls for
> > architectures that support ACPI out of the arch code by using
> > GENERIC_CPU_DEVICES, then into the ACPI processor driver.
> > 
> > The kubernetes folk really want to be able to add CPUs to an existing VM,
> > in exactly the same way they do on x86. The use-case is pre-booting guests
> > with one CPU, then adding the number that were actually needed when the
> > workload is provisioned.
> > 
> > Wait? Doesn't arm64 support cpuhotplug already!?
> > In the arm world, cpuhotplug gets used to mean removing the power from a CPU.
> > The CPU is offline, and remains present. For x86, and ACPI, cpuhotplug
> > has the additional step of physically removing the CPU, so that it isn't
> > present anymore.
> > 
> > Arm64 doesn't support this, and can't support it: CPUs are really a slice
> > of the SoC, and there is not enough information in the existing ACPI tables
> > to describe which bits of the slice also got removed. Without a reference
> > machine: adding this support to the spec is a wild goose chase.
> > 
> > Critically: everything described in the firmware tables must remain present.
> > 
> > For a virtual machine this is easy as all the other bits of 'virtual SoC'
> > are emulated, so they can (and do) remain present when a vCPU is 'removed'.
> > 
> > On a system that supports cpuhotplug the MADT has to describe every possible
> > CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
> > the guest is started.
> > With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
> > policy about which CPUs can be brought online.
> > 
> > This series adds support for virtual-cpuhotplug as exactly that: firmware
> > policy. This may even work on a physical machine too; for a guest the part of
> > > firmware is played by the VMM (typically Qemu).
> > 
> > PSCI support is modified to return 'DENIED' if the CPU can't be brought
> > online/enabled yet. The CPU object's _STA method's enabled bit is used to
> > indicate firmware's current disposition. If the CPU has its enabled bit clear,
> > it will not be registered with sysfs, and attempts to bring it online will
> > fail. The notifications that _STA has changed its value then work in the same
> > way as physical hotplug, and firmware can cause the CPU to be registered some
> > time later, allowing it to be brought online.
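
The policy in the paragraph above can be sketched as a small userspace
model. This is a hedged illustration, not code from the series: the _STA
bit positions and the PSCI DENIED value come from the ACPI and PSCI
specifications, while cpu_should_register() and model_cpu_on() are
hypothetical names invented for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* _STA result bits, as laid out in the ACPI specification. */
#define STA_PRESENT	(1u << 0)
#define STA_ENABLED	(1u << 1)
#define STA_FUNCTIONAL	(1u << 3)

/* PSCI return codes, as defined by the PSCI specification. */
#define PSCI_RET_SUCCESS	0
#define PSCI_RET_DENIED		(-3)

/*
 * Hypothetical helper: a CPU is registered (and so appears in sysfs)
 * only while firmware reports it as both present and enabled.
 */
static bool cpu_should_register(uint32_t sta)
{
	return (sta & STA_PRESENT) && (sta & STA_ENABLED);
}

/*
 * Hypothetical model of the firmware side of CPU_ON: a CPU whose
 * enabled bit is clear cannot be brought online yet, so the request
 * is refused with DENIED.
 */
static int model_cpu_on(uint32_t sta)
{
	return cpu_should_register(sta) ? PSCI_RET_SUCCESS : PSCI_RET_DENIED;
}
```

With this model a CPU reporting only the present bit (0x01) stays
unregistered and CPU_ON is denied; once firmware flips _STA to 0x0f the
same CPU is registered and can be brought online.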
> > 
> > This creates something that looks like cpuhotplug to user-space, as the sysfs
> > files appear and disappear, and the udev notifications look the same.
> > 
> > One notable difference is the CPU present mask, which is exposed via sysfs.
> > Because the CPUs remain present throughout, they can still be seen in that mask.
> > > This value does get used by web browsers to estimate the number of CPUs
> > as the CPU online mask is constantly changed on mobile phones.
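
For anyone wanting to see the difference from user-space, the masks are
exported as cpulist strings (for example "0-3,5") in
/sys/devices/system/cpu/present and /sys/devices/system/cpu/online. The
following is a minimal sketch of counting the CPUs in such a string;
cpulist_count() is a hypothetical helper written for this illustration,
and it takes the string as an argument rather than reading sysfs so that
it stays self-contained.

```c
#include <stdlib.h>

/*
 * Count the CPUs named by a sysfs cpulist string such as "0-3,5".
 * A trailing newline is tolerated. Returns -1 on malformed input.
 */
static int cpulist_count(const char *s)
{
	int count = 0;

	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s || lo < 0)
			return -1;

		/* A '-' after the first number introduces a range. */
		if (*end == '-') {
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
		}

		count += (int)(hi - lo + 1);
		s = end;

		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}

	return count;
}
```

On a guest booted with one vCPU enabled out of three described, the
expectation from the text above is that the count for "present" stays at
three throughout, while "online" follows whatever has been brought up.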
> > 
> > > Linux is tolerant of PSCI returning errors, as it's always been allowed to do
> > > that. To avoid confusing OSes that can't tolerate this, we needed an additional
> > bit in the MADT GICC flags. This series copies ACPI_MADT_ONLINE_CAPABLE, which
> > appears to be for this purpose, but calls it ACPI_MADT_GICC_CPU_CAPABLE as it
> > has a different bit position in the GICC.
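
A rough sketch of how such a flag widens the "usable CPU" test follows.
The Enabled flag is bit 0 of the MADT GICC flags; the position used here
for the new flag (bit 3) is an assumption based on the ACPI 6.5 GICC
flags layout and should be checked against the actbl2.h change in the
series. Both macro names and gicc_is_usable() are hypothetical names
for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* MADT GICC flags. Bit 0 is Enabled; bit 3 is an assumed position. */
#define GICC_FLAG_ENABLED		(1u << 0)
#define GICC_FLAG_ONLINE_CAPABLE	(1u << 3)

/*
 * Hypothetical helper: a GICC entry describes a CPU the OS may use
 * either now (Enabled) or at some later point (online capable).
 * Entries with neither flag set are ignored entirely.
 */
static bool gicc_is_usable(uint32_t flags)
{
	return (flags & (GICC_FLAG_ENABLED | GICC_FLAG_ONLINE_CAPABLE)) != 0;
}
```

An OS that only knows about the Enabled bit skips the online-capable
entries, so it never tries to bring up a CPU that firmware would refuse.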
> > 
> > This code is unconditionally enabled for all ACPI architectures.
> > If there are problems with firmware tables on some devices, the CPUs will
> > already be online by the time the acpi_processor_make_enabled() is called.
> > A mismatch here causes a firmware-bug message and kernel taint. This should
> > only affect people with broken firmware who also boot with maxcpus=1, and
> > bring CPUs online later.
> > 
> > I had a go at switching the remaining architectures over to GENERIC_CPU_DEVICES,
> > so that the Kconfig symbol can be removed, but I got stuck with powerpc
> > and s390.
> > 
> > > I've only build tested LoongArch and riscv. I've removed the ia64 specific
> > patches, but left the changes in other patches to make git-grep review of
> > renames easier.
> > 
> > If folk want to play along at home, you'll need a copy of Qemu that supports this.
> > https://github.com/salil-mehta/qemu.git salil/virt-cpuhp-armv8/rfc-v2-rc6
> > 
> > Replace your '-smp' argument with something like:
> > | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
> > 
> > > then feed the following to the Qemu monitor:
> > | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
> > | (qemu) device_del cpu1
> > 
> > 
> > Why is this still an RFC? I'm still looking for confirmation from the
> > kubernetes/kata folk that this works for them. Because of this I've culled
> > the CC list...
> > 
> > 
> > This series is based on v6.6-rc1, and can be retrieved from:
> > https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v2
> > 
> > 
> > Thanks,
> > 
> > James Morse (34):
> >   ACPI: Move ACPI_HOTPLUG_CPU to be disabled on arm64 and riscv
> >   drivers: base: Use present CPUs in GENERIC_CPU_DEVICES
> >   drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden
> >   drivers: base: Move cpu_dev_init() after node_dev_init()
> >   drivers: base: Print a warning instead of panic() when register_cpu()
> >     fails
> >   arm64: setup: Switch over to GENERIC_CPU_DEVICES using
> >     arch_register_cpu()
> >   x86: intel_epb: Don't rely on link order
> >   x86/topology: Switch over to GENERIC_CPU_DEVICES
> >   LoongArch: Switch over to GENERIC_CPU_DEVICES
> >   riscv: Switch over to GENERIC_CPU_DEVICES
> >   arch_topology: Make register_cpu_capacity_sysctl() tolerant to late
> >     CPUs
> >   ACPI: Use the acpi_device_is_present() helper in more places
> >   ACPI: Rename acpi_scan_device_not_present() to be about enumeration
> >   ACPI: Only enumerate enabled (or functional) devices
> >   ACPI: processor: Add support for processors described as container
> >     packages
> >   ACPI: processor: Register CPUs that are online, but not described in
> >     the DSDT
> >   ACPI: processor: Register all CPUs from acpi_processor_get_info()
> >   ACPI: Rename ACPI_HOTPLUG_CPU to include 'present'
> >   ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove()
> >   ACPI: Rename acpi_processor_hotadd_init and remove pre-processor
> >     guards
> >   ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
> >   ACPI: Check _STA present bit before making CPUs not present
> >   ACPI: Warn when the present bit changes but the feature is not enabled
> >   drivers: base: Implement weak arch_unregister_cpu()
> >   LoongArch: Use the __weak version of arch_unregister_cpu()
> >   arm64: acpi: Move get_cpu_for_acpi_id() to a header
> >   ACPICA: Add new MADT GICC flags fields [code first?]
> >   arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a
> >     helper
> >   irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
> >   irqchip/gic-v3: Add support for ACPI's disabled but 'online capable'
> >     CPUs
> >   ACPI: add support to register CPUs based on the _STA enabled bit
> >   arm64: document virtual CPU hotplug's expectations
> >   ACPI: Add _OSC bits to advertise OS support for toggling CPU
> >     present/enabled
> >   cpumask: Add enabled cpumask for present CPUs that can be brought
> >     online
> > 
> > Jean-Philippe Brucker (1):
> >   arm64: psci: Ignore DENIED CPUs
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
>

Comments

Rafael J. Wysocki Dec. 15, 2023, 7:47 p.m. UTC | #1
On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:
> On Fri, 15 Dec 2023 15:31:55 +0000
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> 
> > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:
> > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:  
> > > >
> > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > <linux@armlinux.org.uk> wrote:  
> > > > > I guess we need something like:
> > > > >
> > > > >         if (device->status.present)
> > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > >                        device->status.enabled;
> > > > >         else
> > > > >                 return device->status.functional;
> > > > >
> > > > > so we only check device->status.enabled for processor-type devices?  
> > > >
> > > > Yes, something like this.  
> > > 
> > > However, that is not sufficient, because there are
> > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > 
> > > I'm not sure about a clean way to do it ATM.  
> > 
> > Ok, how about:
> > 
> > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > {
> > 	struct acpi_hardware_id *hwid;
> > 
> > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > 		return true;
> > 
> > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > 		return false;
> > 
> > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > 			return true;
> > 
> > 	return false;
> > }
> > 
> > and then:
> > 
> > 	if (device->status.present)
> > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > 	else
> > 		return device->status.functional;
> > 
> > ?
> > 
> Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> job.  Nice and simple.

Well, except that it does checks that are done elsewhere slightly
differently, which from the maintenance POV is not nice.

Maybe something like the appended patch (untested).

---
 drivers/acpi/acpi_processor.c |   11 +++++++++++
 drivers/acpi/internal.h       |    3 +++
 drivers/acpi/scan.c           |   24 +++++++++++++++++++++++-
 3 files changed, 37 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/acpi/acpi_processor.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_processor.c
+++ linux-pm/drivers/acpi/acpi_processor.c
@@ -644,6 +644,17 @@ static struct acpi_scan_handler processo
 	},
 };
 
+bool acpi_device_is_processor(const struct acpi_device *adev)
+{
+	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
+		return true;
+
+	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
+		return false;
+
+	return acpi_scan_check_handler(adev, &processor_handler);
+}
+
 static int acpi_processor_container_attach(struct acpi_device *dev,
 					   const struct acpi_device_id *id)
 {
Index: linux-pm/drivers/acpi/internal.h
===================================================================
--- linux-pm.orig/drivers/acpi/internal.h
+++ linux-pm/drivers/acpi/internal.h
@@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(stru
 int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
 				       const char *hotplug_profile_name);
 void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler);
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *acpi_debugfs_dir;
@@ -133,6 +135,7 @@ int acpi_bus_register_early_device(int t
 const struct acpi_device *acpi_companion_match(const struct device *dev);
 int __acpi_device_uevent_modalias(const struct acpi_device *adev,
 				  struct kobj_uevent_env *env);
+bool acpi_device_is_processor(const struct acpi_device *adev);
 
 /* --------------------------------------------------------------------------
                                   Power Resource
Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -1938,6 +1938,19 @@ static bool acpi_scan_handler_matching(s
 	return false;
 }
 
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler)
+{
+	struct acpi_hardware_id *hwid;
+
+	list_for_each_entry(hwid, &adev->pnp.ids, list) {
+		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
+			return true;
+	}
+
+	return false;
+}
+
 static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
 					const struct acpi_device_id **matchid)
 {
@@ -2410,7 +2423,16 @@ bool acpi_dev_ready_for_enumeration(cons
 	if (device->flags.honor_deps && device->dep_unmet)
 		return false;
 
-	return acpi_device_is_present(device);
+	if (device->status.functional)
+		return true;
+
+	if (!device->status.present)
+		return false;
+
+	if (device->status.enabled)
+		return true; /* Fast path. */
+
+	return !acpi_device_is_processor(device);
 }
 EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
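
For comparison, the two variants discussed in this thread can be modelled
as pure functions of the four inputs they look at. This is a sketch under
the assumption that the snippets as quoted are the whole decision (the
dep_unmet check is left out); struct sta_model and the function names are
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* The four inputs both variants depend on. */
struct sta_model {
	bool present;
	bool enabled;
	bool functional;
	bool is_processor;
};

/* The earlier suggestion: test "enabled" only for processors. */
static bool ready_earlier(struct sta_model d)
{
	if (d.present)
		return !d.is_processor || d.enabled;

	return d.functional;
}

/* The logic from the patch above. */
static bool ready_patch(struct sta_model d)
{
	if (d.functional)
		return true;

	if (!d.present)
		return false;

	if (d.enabled)
		return true;

	return !d.is_processor;
}

/* Walk all 16 input combinations and count where the two disagree. */
static int count_mismatches(void)
{
	int n = 0;

	for (unsigned int bits = 0; bits < 16; bits++) {
		struct sta_model d = {
			.present	= bits & 1,
			.enabled	= bits & 2,
			.functional	= bits & 4,
			.is_processor	= bits & 8,
		};

		if (ready_earlier(d) != ready_patch(d))
			n++;
	}

	return n;
}
```

In this model the two agree on 15 of the 16 combinations. The one
difference is a processor that is present and functional but not enabled:
the earlier suggestion rejects it, while the patch accepts it because the
functional check comes first.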
Jonathan Cameron Jan. 2, 2024, 2:39 p.m. UTC | #2
On Fri, 15 Dec 2023 20:47:31 +0100
"Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:

> On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:
> > On Fri, 15 Dec 2023 15:31:55 +0000
> > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> >   
> > > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:  
> > > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:    
> > > > >
> > > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > > <linux@armlinux.org.uk> wrote:    
> > > > > > I guess we need something like:
> > > > > >
> > > > > >         if (device->status.present)
> > > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > > >                        device->status.enabled;
> > > > > >         else
> > > > > >                 return device->status.functional;
> > > > > >
> > > > > > so we only check device->status.enabled for processor-type devices?    
> > > > >
> > > > > Yes, something like this.    
> > > > 
> > > > However, that is not sufficient, because there are
> > > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > > 
> > > > I'm not sure about a clean way to do it ATM.    
> > > 
> > > Ok, how about:
> > > 
> > > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > > {
> > > 	struct acpi_hardware_id *hwid;
> > > 
> > > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > > 		return true;
> > > 
> > > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > > 		return false;
> > > 
> > > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > > 			return true;
> > > 
> > > 	return false;
> > > }
> > > 
> > > and then:
> > > 
> > > 	if (device->status.present)
> > > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > > 	else
> > > 		return device->status.functional;
> > > 
> > > ?
> > >   
> > Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> > job.  Nice and simple.  
> 
> Well, except that it does checks that are done elsewhere slightly
> differently, which from the maintenance POV is not nice.
> 
> Maybe something like the appended patch (untested).

Hi Rafael,

As far as I can see that's functionally equivalent, so looks good to me.
I'm not set up to test this today though, so will defer to Russell on whether
there is anything missing.

Thanks for putting this together.

Jonathan

Jonathan Cameron Jan. 11, 2024, 10:19 a.m. UTC | #3
On Tue, 2 Jan 2024 14:39:25 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Fri, 15 Dec 2023 20:47:31 +0100
> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> 
> > On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:  
> > > On Fri, 15 Dec 2023 15:31:55 +0000
> > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > >     
> > > > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:    
> > > > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:      
> > > > > >
> > > > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > > > <linux@armlinux.org.uk> wrote:      
> > > > > > > I guess we need something like:
> > > > > > >
> > > > > > >         if (device->status.present)
> > > > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > > > >                        device->status.enabled;
> > > > > > >         else
> > > > > > >                 return device->status.functional;
> > > > > > >
> > > > > > > so we only check device->status.enabled for processor-type devices?      
> > > > > >
> > > > > > Yes, something like this.      
> > > > > 
> > > > > However, that is not sufficient, because there are
> > > > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > > > 
> > > > > I'm not sure about a clean way to do it ATM.      
> > > > 
> > > > Ok, how about:
> > > > 
> > > > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > > > {
> > > > 	struct acpi_hardware_id *hwid;
> > > > 
> > > > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > > > 		return true;
> > > > 
> > > > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > > > 		return false;
> > > > 
> > > > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > > > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > > > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > > > 			return true;
> > > > 
> > > > 	return false;
> > > > }
> > > > 
> > > > and then:
> > > > 
> > > > 	if (device->status.present)
> > > > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > > > 	else
> > > > 		return device->status.functional;
> > > > 
> > > > ?
> > > >     
> > > Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> > > job.  Nice and simple.    
> > 
> > Well, except that it does checks that are done elsewhere slightly
> > differently, which from the maintenance POV is not nice.
> > 
> > Maybe something like the appended patch (untested).  
> 
> Hi Rafael,
> 
> As far as I can see that's functionally equivalent, so looks good to me.
> I'm not set up to test this today though, so will defer to Russell on whether
> there is anything missing
> 
> Thanks for putting this together.

This is rather embarrassing...

I spun this up on a QEMU instance with some prints and found that we do
need the !acpi_device_is_processor() restriction.
On my 'random' test setup it fails on one device. ACPI0017 - which I
happen to know rather well. It's the weird pseudo device that lets
a CXL aware OS know there is a CEDT table to probe.

Whilst I really don't like that hack (it is all about making software
distribution of out-of-tree modules easier rather than something
fundamental), I'm the CXL QEMU maintainer :(

Will fix that, but it shows there is at least one broken firmware out
there.

On the plus side, Rafael's code seems to work as expected and lets that
buggy firmware carry on working :) So let's pretend the bug in QEMU
is a deliberate test case!

Jonathan

p.s. My test setup blows up later for an unrelated reason with latest
kernel, so I'll be off debugging that for a while :(


> 
> Jonathan
> 
> > 
> > ---
> >  drivers/acpi/acpi_processor.c |   11 +++++++++++
> >  drivers/acpi/internal.h       |    3 +++
> >  drivers/acpi/scan.c           |   24 +++++++++++++++++++++++-
> >  3 files changed, 37 insertions(+), 1 deletion(-)
> > 
> > Index: linux-pm/drivers/acpi/acpi_processor.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/acpi_processor.c
> > +++ linux-pm/drivers/acpi/acpi_processor.c
> > @@ -644,6 +644,17 @@ static struct acpi_scan_handler processo
> >  	},
> >  };
> >  
> > +bool acpi_device_is_processor(const struct acpi_device *adev)
> > +{
> > +	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > +		return true;
> > +
> > +	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
> > +		return false;
> > +
> > +	return acpi_scan_check_handler(adev, &processor_handler);
> > +}
> > +
> >  static int acpi_processor_container_attach(struct acpi_device *dev,
> >  					   const struct acpi_device_id *id)
> >  {
> > Index: linux-pm/drivers/acpi/internal.h
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/internal.h
> > +++ linux-pm/drivers/acpi/internal.h
> > @@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(stru
> >  int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
> >  				       const char *hotplug_profile_name);
> >  void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
> > +bool acpi_scan_check_handler(const struct acpi_device *adev,
> > +			     struct acpi_scan_handler *handler);
> >  
> >  #ifdef CONFIG_DEBUG_FS
> >  extern struct dentry *acpi_debugfs_dir;
> > @@ -133,6 +135,7 @@ int acpi_bus_register_early_device(int t
> >  const struct acpi_device *acpi_companion_match(const struct device *dev);
> >  int __acpi_device_uevent_modalias(const struct acpi_device *adev,
> >  				  struct kobj_uevent_env *env);
> > +bool acpi_device_is_processor(const struct acpi_device *adev);
> >  
> >  /* --------------------------------------------------------------------------
> >                                    Power Resource
> > Index: linux-pm/drivers/acpi/scan.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/scan.c
> > +++ linux-pm/drivers/acpi/scan.c
> > @@ -1938,6 +1938,19 @@ static bool acpi_scan_handler_matching(s
> >  	return false;
> >  }
> >  
> > +bool acpi_scan_check_handler(const struct acpi_device *adev,
> > +			     struct acpi_scan_handler *handler)
> > +{
> > +	struct acpi_hardware_id *hwid;
> > +
> > +	list_for_each_entry(hwid, &adev->pnp.ids, list) {
> > +		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
> > +			return true;
> > +	}
> > +
> > +	return false;
> > +}
> > +
> >  static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
> >  					const struct acpi_device_id **matchid)
> >  {
> > @@ -2410,7 +2423,16 @@ bool acpi_dev_ready_for_enumeration(cons
> >  	if (device->flags.honor_deps && device->dep_unmet)
> >  		return false;
> >  
> > -	return acpi_device_is_present(device);
> > +	if (device->status.functional)
> > +		return true;
> > +
> > +	if (!device->status.present)
> > +		return false;
> > +
> > +	if (device->status.enabled)
> > +		return true; /* Fast path. */
> > +
> > +	return !acpi_device_is_processor(device);
> >  }
> >  EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
> >  
> > 
> > 
> >   
> 
> 
Russell King (Oracle) Jan. 11, 2024, 10:26 a.m. UTC | #4
On Thu, Jan 11, 2024 at 10:19:49AM +0000, Jonathan Cameron wrote:
> On Tue, 2 Jan 2024 14:39:25 +0000
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Fri, 15 Dec 2023 20:47:31 +0100
> > "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> > 
> > > On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:  
> > > > On Fri, 15 Dec 2023 15:31:55 +0000
> > > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > >     
> > > > > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:    
> > > > > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:      
> > > > > > >
> > > > > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > > > > <linux@armlinux.org.uk> wrote:      
> > > > > > > > I guess we need something like:
> > > > > > > >
> > > > > > > >         if (device->status.present)
> > > > > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > > > > >                        device->status.enabled;
> > > > > > > >         else
> > > > > > > >                 return device->status.functional;
> > > > > > > >
> > > > > > > > so we only check device->status.enabled for processor-type devices?      
> > > > > > >
> > > > > > > Yes, something like this.      
> > > > > > 
> > > > > > However, that is not sufficient, because there are
> > > > > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > > > > 
> > > > > > I'm not sure about a clean way to do it ATM.      
> > > > > 
> > > > > Ok, how about:
> > > > > 
> > > > > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > > > > {
> > > > > 	struct acpi_hardware_id *hwid;
> > > > > 
> > > > > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > > > > 		return true;
> > > > > 
> > > > > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > > > > 		return false;
> > > > > 
> > > > > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > > > > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > > > > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > > > > 			return true;
> > > > > 
> > > > > 	return false;
> > > > > }
> > > > > 
> > > > > and then:
> > > > > 
> > > > > 	if (device->status.present)
> > > > > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > > > > 	else
> > > > > 		return device->status.functional;
> > > > > 
> > > > > ?
> > > > >     
> > > > Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> > > > job.  Nice and simple.    
> > > 
> > > Well, except that it does checks that are done elsewhere slightly
> > > differently, which from the maintenance POV is not nice.
> > > 
> > > Maybe something like the appended patch (untested).  
> > 
> > Hi Rafael,
> > 
> > As far as I can see that's functionally equivalent, so looks good to me.
> > I'm not set up to test this today though, so will defer to Russell on whether
> > there is anything missing
> > 
> > Thanks for putting this together.
> 
> This is rather embarrassing...
> 
> I spun this up on a QEMU instance with some prints to find out whether
> we need the !acpi_device_is_processor() restriction.
> On my 'random' test setup it fails on one device. ACPI0017 - which I
> happen to know rather well. It's the weird pseudo device that lets
> a CXL-aware OS know there is a CEDT table to probe.
> 
> Whilst I really don't like that hack (it is all about making software
> distribution of out of tree modules easier rather than something
> fundamental), I'm the CXL QEMU maintainer :(
> 
> Will fix that, but it shows there is at least one broken firmware out
> there.
> 
> On the plus side, Rafael's code seems to work as expected and lets that
> buggy firmware carry on working :) So let's pretend the bug in QEMU
> is a deliberate test case!

Lol, thanks for a test case and showing that Rafael's approach is
indeed necessary.

Would your test qualify for a tested-by for this? For reference, this
is my current version below with Rafael's update:

8<====
From: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Subject: [PATCH] ACPI: Only enumerate enabled (or functional) processor
 devices

From: James Morse <james.morse@arm.com>

Today the ACPI enumeration code 'visits' all devices that are present.

This is a problem for arm64, where CPUs are always present, but not
always enabled. When a device-check occurs because the firmware-policy
has changed and a CPU is now enabled, the following error occurs:
| acpi ACPI0007:48: Enumeration failure

This is ultimately because acpi_dev_ready_for_enumeration() returns
true for a device that is not enabled. The ACPI Processor driver
will not register such CPUs as they are not 'decoding their resources'.

ACPI allows a device to be functional instead of maintaining the
present and enabled bit, but we can't simply check the enabled bit
for all devices since firmware can be buggy.

If ACPI indicates that the device is present and enabled, then all well
and good, we can enumerate it. However, if the device is present and not
enabled, then we also check whether the device is a processor device
to limit the impact of this new check to just processor devices.

This avoids enumerating present && functional processor devices that
are not enabled.

Signed-off-by: James Morse <james.morse@arm.com>
Co-developed-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
Changes since RFC v2:
 * Incorporate comment suggestion by Gavin Shan.
Changes since RFC v3:
 * Fixed "sert" typo.
Changes since RFC v3 (smaller series):
 * Restrict checking the enabled bit to processor devices, update
   commit comments.
 * Use Rafael's suggestion in
   https://lore.kernel.org/r/5760569.DvuYhMxLoT@kreacher
---
 drivers/acpi/acpi_processor.c | 11 ++++++++
 drivers/acpi/device_pm.c      |  2 +-
 drivers/acpi/device_sysfs.c   |  2 +-
 drivers/acpi/internal.h       |  4 ++-
 drivers/acpi/property.c       |  2 +-
 drivers/acpi/scan.c           | 49 ++++++++++++++++++++++++++++-------
 6 files changed, 56 insertions(+), 14 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 4fe2ef54088c..cf7c1cca69dd 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -626,6 +626,17 @@ static struct acpi_scan_handler processor_handler = {
 	},
 };
 
+bool acpi_device_is_processor(const struct acpi_device *adev)
+{
+	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
+		return true;
+
+	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
+		return false;
+
+	return acpi_scan_check_handler(adev, &processor_handler);
+}
+
 static int acpi_processor_container_attach(struct acpi_device *dev,
 					   const struct acpi_device_id *id)
 {
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index 3b4d048c4941..e3c80f3b3b57 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
 		return -EINVAL;
 
 	device->power.state = ACPI_STATE_UNKNOWN;
-	if (!acpi_device_is_present(device)) {
+	if (!acpi_dev_ready_for_enumeration(device)) {
 		device->flags.initialized = false;
 		return -ENXIO;
 	}
diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
index 23373faa35ec..a0256d2493a7 100644
--- a/drivers/acpi/device_sysfs.c
+++ b/drivers/acpi/device_sysfs.c
@@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
 	struct acpi_hardware_id *id;
 
 	/* Avoid unnecessarily loading modules for non present devices. */
-	if (!acpi_device_is_present(acpi_dev))
+	if (!acpi_dev_ready_for_enumeration(acpi_dev))
 		return 0;
 
 	/*
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 866c7c4ed233..9388d4c8674a 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
 int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
 				       const char *hotplug_profile_name);
 void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler);
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *acpi_debugfs_dir;
@@ -107,7 +109,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
 void acpi_device_remove_files(struct acpi_device *dev);
 void acpi_device_add_finalize(struct acpi_device *device);
 void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
-bool acpi_device_is_present(const struct acpi_device *adev);
 bool acpi_device_is_battery(struct acpi_device *adev);
 bool acpi_device_is_first_physical_node(struct acpi_device *adev,
 					const struct device *dev);
@@ -119,6 +120,7 @@ int acpi_bus_register_early_device(int type);
 const struct acpi_device *acpi_companion_match(const struct device *dev);
 int __acpi_device_uevent_modalias(const struct acpi_device *adev,
 				  struct kobj_uevent_env *env);
+bool acpi_device_is_processor(const struct acpi_device *adev);
 
 /* --------------------------------------------------------------------------
                                   Power Resource
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
index 6979a3f9f90a..14d6948fd88a 100644
--- a/drivers/acpi/property.c
+++ b/drivers/acpi/property.c
@@ -1420,7 +1420,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
 	if (!is_acpi_device_node(fwnode))
 		return false;
 
-	return acpi_device_is_present(to_acpi_device_node(fwnode));
+	return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
 }
 
 static const void *
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 02bb2cce423f..f94d1f744bcc 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
 	int error;
 
 	acpi_bus_get_status(adev);
-	if (acpi_device_is_present(adev)) {
+	if (acpi_dev_ready_for_enumeration(adev)) {
 		/*
 		 * This function is only called for device objects for which
 		 * matching scan handlers exist.  The only situation in which
@@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
 	int error;
 
 	acpi_bus_get_status(adev);
-	if (!acpi_device_is_present(adev)) {
+	if (!acpi_dev_ready_for_enumeration(adev)) {
 		acpi_scan_device_not_enumerated(adev);
 		return 0;
 	}
@@ -1913,11 +1913,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
 	return true;
 }
 
-bool acpi_device_is_present(const struct acpi_device *adev)
-{
-	return adev->status.present || adev->status.functional;
-}
-
 static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
 				       const char *idstr,
 				       const struct acpi_device_id **matchid)
@@ -1938,6 +1933,18 @@ static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
 	return false;
 }
 
+bool acpi_scan_check_handler(const struct acpi_device *adev,
+			     struct acpi_scan_handler *handler)
+{
+	struct acpi_hardware_id *hwid;
+
+	list_for_each_entry(hwid, &adev->pnp.ids, list)
+		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
+			return true;
+
+	return false;
+}
+
 static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
 					const struct acpi_device_id **matchid)
 {
@@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
  * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
  * @device: Pointer to the &struct acpi_device to check
  *
- * Check if the device is present and has no unmet dependencies.
+ * Check if the device is functional or enabled and has no unmet dependencies.
  *
- * Return true if the device is ready for enumeratino. Otherwise, return false.
+ * Return true if the device is ready for enumeration. Otherwise, return false.
  */
 bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
 {
 	if (device->flags.honor_deps && device->dep_unmet)
 		return false;
 
-	return acpi_device_is_present(device);
+	/*
+	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
+	 * (!present && functional) for certain types of devices that should be
+	 * enumerated. Note that the enabled bit should not be set unless the
+	 * present bit is set.
+	 *
+	 * However, limit this only to processor devices to reduce possible
+	 * regressions with firmware.
+	 */
+	if (device->status.functional)
+		return true;
+
+	if (!device->status.present)
+		return false;
+
+	/*
+	 * Fast path - if enabled is set, avoid the more expensive test to
+	 * check whether this device is a processor.
+	 */
+	if (device->status.enabled)
+		return true;
+
+	return !acpi_device_is_processor(device);
 }
 EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
Jonathan Cameron Jan. 12, 2024, 11:52 a.m. UTC | #5
On Thu, 11 Jan 2024 10:26:15 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Thu, Jan 11, 2024 at 10:19:49AM +0000, Jonathan Cameron wrote:
> > On Tue, 2 Jan 2024 14:39:25 +0000
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> >   
> > > On Fri, 15 Dec 2023 20:47:31 +0100
> > > "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
> > >   
> > > > On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:    
> > > > > On Fri, 15 Dec 2023 15:31:55 +0000
> > > > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > > >       
> > > > > > On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:      
> > > > > > > On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:        
> > > > > > > >
> > > > > > > > On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
> > > > > > > > <linux@armlinux.org.uk> wrote:        
> > > > > > > > > I guess we need something like:
> > > > > > > > >
> > > > > > > > >         if (device->status.present)
> > > > > > > > >                 return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
> > > > > > > > >                        device->status.enabled;
> > > > > > > > >         else
> > > > > > > > >                 return device->status.functional;
> > > > > > > > >
> > > > > > > > > so we only check device->status.enabled for processor-type devices?        
> > > > > > > >
> > > > > > > > Yes, something like this.        
> > > > > > > 
> > > > > > > However, that is not sufficient, because there are
> > > > > > > ACPI_BUS_TYPE_DEVICE devices representing processors.
> > > > > > > 
> > > > > > > I'm not sure about a clean way to do it ATM.        
> > > > > > 
> > > > > > Ok, how about:
> > > > > > 
> > > > > > static bool acpi_dev_is_processor(const struct acpi_device *device)
> > > > > > {
> > > > > > 	struct acpi_hardware_id *hwid;
> > > > > > 
> > > > > > 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
> > > > > > 		return true;
> > > > > > 
> > > > > > 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
> > > > > > 		return false;
> > > > > > 
> > > > > > 	list_for_each_entry(hwid, &device->pnp.ids, list)
> > > > > > 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
> > > > > > 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
> > > > > > 			return true;
> > > > > > 
> > > > > > 	return false;
> > > > > > }
> > > > > > 
> > > > > > and then:
> > > > > > 
> > > > > > 	if (device->status.present)
> > > > > > 		return !acpi_dev_is_processor(device) || device->status.enabled;
> > > > > > 	else
> > > > > > 		return device->status.functional;
> > > > > > 
> > > > > > ?
> > > > > >       
> > > > > Changing it to CPU only for now makes sense to me and I think this code snippet should do the
> > > > > job.  Nice and simple.      
> > > > 
> > > > Well, except that it does checks that are done elsewhere slightly
> > > > differently, which from the maintenance POV is not nice.
> > > > 
> > > > Maybe something like the appended patch (untested).    
> > > 
> > > Hi Rafael,
> > > 
> > > As far as I can see that's functionally equivalent, so looks good to me.
> > > I'm not set up to test this today though, so will defer to Russell on whether
> > > there is anything missing
> > > 
> > > Thanks for putting this together.  
> > 
> > This is rather embarrassing...
> > 
> > I spun this up on a QEMU instance with some prints to find out whether
> > we need the !acpi_device_is_processor() restriction.
> > On my 'random' test setup it fails on one device. ACPI0017 - which I
> > happen to know rather well. It's the weird pseudo device that lets
> > a CXL-aware OS know there is a CEDT table to probe.
> > 
> > Whilst I really don't like that hack (it is all about making software
> > distribution of out of tree modules easier rather than something
> > fundamental), I'm the CXL QEMU maintainer :(
> > 
> > Will fix that, but it shows there is at least one broken firmware out
> > there.
> > 
> > On the plus side, Rafael's code seems to work as expected and lets that
> > buggy firmware carry on working :) So let's pretend the bug in QEMU
> > is a deliberate test case!
> 
> Lol, thanks for a test case and showing that Rafael's approach is
> indeed necessary.
> 
> Would your test qualify for a tested-by for this? For reference, this
> is my current version below with Rafael's update:

Sure. This matches what I have.

Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


Gavin Shan Jan. 22, 2024, 7:31 a.m. UTC | #6
On 1/11/24 20:26, Russell King (Oracle) wrote:
> On Thu, Jan 11, 2024 at 10:19:49AM +0000, Jonathan Cameron wrote:
>> On Tue, 2 Jan 2024 14:39:25 +0000
>> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>>
>>> On Fri, 15 Dec 2023 20:47:31 +0100
>>> "Rafael J. Wysocki" <rjw@rjwysocki.net> wrote:
>>>
>>>> On Friday, December 15, 2023 5:15:39 PM CET Jonathan Cameron wrote:
>>>>> On Fri, 15 Dec 2023 15:31:55 +0000
>>>>> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
>>>>>      
>>>>>> On Thu, Dec 14, 2023 at 07:37:10PM +0100, Rafael J. Wysocki wrote:
>>>>>>> On Thu, Dec 14, 2023 at 7:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>>>>>>>
>>>>>>>> On Thu, Dec 14, 2023 at 7:10 PM Russell King (Oracle)
>>>>>>>> <linux@armlinux.org.uk> wrote:
>>>>>>>>> I guess we need something like:
>>>>>>>>>
>>>>>>>>>          if (device->status.present)
>>>>>>>>>                  return device->device_type != ACPI_BUS_TYPE_PROCESSOR ||
>>>>>>>>>                         device->status.enabled;
>>>>>>>>>          else
>>>>>>>>>                  return device->status.functional;
>>>>>>>>>
>>>>>>>>> so we only check device->status.enabled for processor-type devices?
>>>>>>>>
>>>>>>>> Yes, something like this.
>>>>>>>
>>>>>>> However, that is not sufficient, because there are
>>>>>>> ACPI_BUS_TYPE_DEVICE devices representing processors.
>>>>>>>
>>>>>>> I'm not sure about a clean way to do it ATM.
>>>>>>
>>>>>> Ok, how about:
>>>>>>
>>>>>> static bool acpi_dev_is_processor(const struct acpi_device *device)
>>>>>> {
>>>>>> 	struct acpi_hardware_id *hwid;
>>>>>>
>>>>>> 	if (device->device_type == ACPI_BUS_TYPE_PROCESSOR)
>>>>>> 		return true;
>>>>>>
>>>>>> 	if (device->device_type != ACPI_BUS_TYPE_DEVICE)
>>>>>> 		return false;
>>>>>>
>>>>>> 	list_for_each_entry(hwid, &device->pnp.ids, list)
>>>>>> 		if (!strcmp(ACPI_PROCESSOR_OBJECT_HID, hwid->id) ||
>>>>>> 		    !strcmp(ACPI_PROCESSOR_DEVICE_HID, hwid->id))
>>>>>> 			return true;
>>>>>>
>>>>>> 	return false;
>>>>>> }
>>>>>>
>>>>>> and then:
>>>>>>
>>>>>> 	if (device->status.present)
>>>>>> 		return !acpi_dev_is_processor(device) || device->status.enabled;
>>>>>> 	else
>>>>>> 		return device->status.functional;
>>>>>>
>>>>>> ?
>>>>>>      
>>>>> Changing it to CPU only for now makes sense to me and I think this code snippet should do the
>>>>> job.  Nice and simple.
>>>>
>>>> Well, except that it does checks that are done elsewhere slightly
>>>> differently, which from the maintenance POV is not nice.
>>>>
>>>> Maybe something like the appended patch (untested).
>>>
>>> Hi Rafael,
>>>
>>> As far as I can see that's functionally equivalent, so looks good to me.
>>> I'm not set up to test this today though, so will defer to Russell on whether
>>> there is anything missing.
>>>
>>> Thanks for putting this together.
>>
>> This is rather embarrassing...
>>
>> I spun this up on a QEMU instance with some prints to find out we need
>> the !acpi_device_is_processor() restriction.
>> On my 'random' test setup it fails on one device. ACPI0017 - which I
>> happen to know rather well. It's the weird pseudo device that lets
>> a CXL aware OS know there is a CEDT table to probe.
>>
>> Whilst I really don't like that hack (it is all about making software
>> distribution of out of tree modules easier rather than something
>> fundamental), I'm the CXL QEMU maintainer :(
>>
>> Will fix that, but it shows there is at least one broken firmware out
>> there.
>>
>> On the plus side, Rafael's code seems to work as expected and lets that
>> buggy firmware carry on working :) So let's pretend the bug in QEMU
>> is a deliberate test case!
> 
> Lol, thanks for a test case and showing that Rafael's approach is
> indeed necessary.
> 
> Would your test qualify for a tested-by for this? For reference, this
> is my current version below with Rafael's update:
> 
> 8<====
> From: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Subject: [PATCH] ACPI: Only enumerate enabled (or functional) processor
>   devices
> 
> From: James Morse <james.morse@arm.com>
> 
> Today the ACPI enumeration code 'visits' all devices that are present.
> 
> This is a problem for arm64, where CPUs are always present, but not
> always enabled. When a device-check occurs because the firmware-policy
> has changed and a CPU is now enabled, the following error occurs:
> | acpi ACPI0007:48: Enumeration failure
> 
> This is ultimately because acpi_dev_ready_for_enumeration() returns
> true for a device that is not enabled. The ACPI Processor driver
> will not register such CPUs as they are not 'decoding their resources'.
> 
> ACPI allows a device to be functional instead of maintaining the
> present and enabled bit, but we can't simply check the enabled bit
> for all devices since firmware can be buggy.
> 
> If ACPI indicates that the device is present and enabled, then all well
> and good, we can enumerate it. However, if the device is present and not
> enabled, then we also check whether the device is a processor device
> to limit the impact of this new check to just processor devices.
> 
> This avoids enumerating present && functional processor devices that
> are not enabled.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Co-developed-by: Rafael J. Wysocki <rjw@rjwysocki.net>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> Changes since RFC v2:
>   * Incorporate comment suggestion by Gavin Shan.
> Changes since RFC v3:
>   * Fixed "sert" typo.
> Changes since RFC v3 (smaller series):
>   * Restrict checking the enabled bit to processor devices, update
>     commit comments.
>   * Use Rafael's suggestion in
>     https://lore.kernel.org/r/5760569.DvuYhMxLoT@kreacher
> ---
>   drivers/acpi/acpi_processor.c | 11 ++++++++
>   drivers/acpi/device_pm.c      |  2 +-
>   drivers/acpi/device_sysfs.c   |  2 +-
>   drivers/acpi/internal.h       |  4 ++-
>   drivers/acpi/property.c       |  2 +-
>   drivers/acpi/scan.c           | 49 ++++++++++++++++++++++++++++-------
>   6 files changed, 56 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 4fe2ef54088c..cf7c1cca69dd 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -626,6 +626,17 @@ static struct acpi_scan_handler processor_handler = {
>   	},
>   };
>   
> +bool acpi_device_is_processor(const struct acpi_device *adev)
> +{
> +	if (adev->device_type == ACPI_BUS_TYPE_PROCESSOR)
> +		return true;
> +
> +	if (adev->device_type != ACPI_BUS_TYPE_DEVICE)
> +		return false;
> +
> +	return acpi_scan_check_handler(adev, &processor_handler);
> +}
> +
>   static int acpi_processor_container_attach(struct acpi_device *dev,
>   					   const struct acpi_device_id *id)
>   {
> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index 3b4d048c4941..e3c80f3b3b57 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
>   		return -EINVAL;
>   
>   	device->power.state = ACPI_STATE_UNKNOWN;
> -	if (!acpi_device_is_present(device)) {
> +	if (!acpi_dev_ready_for_enumeration(device)) {
>   		device->flags.initialized = false;
>   		return -ENXIO;
>   	}
> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> index 23373faa35ec..a0256d2493a7 100644
> --- a/drivers/acpi/device_sysfs.c
> +++ b/drivers/acpi/device_sysfs.c
> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
>   	struct acpi_hardware_id *id;
>   
>   	/* Avoid unnecessarily loading modules for non present devices. */
> -	if (!acpi_device_is_present(acpi_dev))
> +	if (!acpi_dev_ready_for_enumeration(acpi_dev))
>   		return 0;
>   
>   	/*
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 866c7c4ed233..9388d4c8674a 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -62,6 +62,8 @@ void acpi_sysfs_add_hotplug_profile(struct acpi_hotplug_profile *hotplug,
>   int acpi_scan_add_handler_with_hotplug(struct acpi_scan_handler *handler,
>   				       const char *hotplug_profile_name);
>   void acpi_scan_hotplug_enabled(struct acpi_hotplug_profile *hotplug, bool val);
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +			     struct acpi_scan_handler *handler);
>   
>   #ifdef CONFIG_DEBUG_FS
>   extern struct dentry *acpi_debugfs_dir;
> @@ -107,7 +109,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
>   void acpi_device_remove_files(struct acpi_device *dev);
>   void acpi_device_add_finalize(struct acpi_device *device);
>   void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> -bool acpi_device_is_present(const struct acpi_device *adev);
>   bool acpi_device_is_battery(struct acpi_device *adev);
>   bool acpi_device_is_first_physical_node(struct acpi_device *adev,
>   					const struct device *dev);
> @@ -119,6 +120,7 @@ int acpi_bus_register_early_device(int type);
>   const struct acpi_device *acpi_companion_match(const struct device *dev);
>   int __acpi_device_uevent_modalias(const struct acpi_device *adev,
>   				  struct kobj_uevent_env *env);
> +bool acpi_device_is_processor(const struct acpi_device *adev);
>   
>   /* --------------------------------------------------------------------------
>                                     Power Resource
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index 6979a3f9f90a..14d6948fd88a 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -1420,7 +1420,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
>   	if (!is_acpi_device_node(fwnode))
>   		return false;
>   
> -	return acpi_device_is_present(to_acpi_device_node(fwnode));
> +	return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
>   }
>   
>   static const void *
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 02bb2cce423f..f94d1f744bcc 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
>   	int error;
>   
>   	acpi_bus_get_status(adev);
> -	if (acpi_device_is_present(adev)) {
> +	if (acpi_dev_ready_for_enumeration(adev)) {
>   		/*
>   		 * This function is only called for device objects for which
>   		 * matching scan handlers exist.  The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
>   	int error;
>   
>   	acpi_bus_get_status(adev);
> -	if (!acpi_device_is_present(adev)) {
> +	if (!acpi_dev_ready_for_enumeration(adev)) {
>   		acpi_scan_device_not_enumerated(adev);
>   		return 0;
>   	}
> @@ -1913,11 +1913,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
>   	return true;
>   }
>   
> -bool acpi_device_is_present(const struct acpi_device *adev)
> -{
> -	return adev->status.present || adev->status.functional;
> -}
> -
>   static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>   				       const char *idstr,
>   				       const struct acpi_device_id **matchid)
> @@ -1938,6 +1933,18 @@ static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>   	return false;
>   }
>   
> +bool acpi_scan_check_handler(const struct acpi_device *adev,
> +			     struct acpi_scan_handler *handler)
> +{
> +	struct acpi_hardware_id *hwid;
> +
> +	list_for_each_entry(hwid, &adev->pnp.ids, list)
> +		if (acpi_scan_handler_matching(handler, hwid->id, NULL))
> +			return true;
> +
> +	return false;
> +}
> +
>   static struct acpi_scan_handler *acpi_scan_match_handler(const char *idstr,
>   					const struct acpi_device_id **matchid)
>   {
> @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
>    * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
>    * @device: Pointer to the &struct acpi_device to check
>    *
> - * Check if the device is present and has no unmet dependencies.
> + * Check if the device is functional or enabled and has no unmet dependencies.
>    *
> - * Return true if the device is ready for enumeratino. Otherwise, return false.
> + * Return true if the device is ready for enumeration. Otherwise, return false.
>    */
>   bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
>   {
>   	if (device->flags.honor_deps && device->dep_unmet)
>   		return false;
>   
> -	return acpi_device_is_present(device);
> +	/*
> +	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> +	 * (!present && functional) for certain types of devices that should be
> +	 * enumerated. Note that the enabled bit should not be set unless the
> +	 * present bit is set.
> +	 *
> +	 * However, limit this only to processor devices to reduce possible
> +	 * regressions with firmware.
> +	 */
> +	if (device->status.functional)
> +		return true;
> +
> +	if (!device->status.present)
> +		return false;
> +
> +	/*
> +	 * Fast path - if enabled is set, avoid the more expensive test to
> +	 * check whether this device is a processor.
> +	 */
> +	if (device->status.enabled)
> +		return true;
> +

It may be worth replacing 'if enabled is set' with 'if the enabled bit is set',
to be consistent with the terminology used in the above comments.

Apart from that, the patch itself looks good to me.

> +	return !acpi_device_is_processor(device);
>   }
>   EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>   

Thanks,
Gavin
Russell King (Oracle) Jan. 29, 2024, 2:55 p.m. UTC | #7
Hi Jonathan,

On Fri, Jan 12, 2024 at 11:52:05AM +0000, Jonathan Cameron wrote:
> On Thu, 11 Jan 2024 10:26:15 +0000
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> >   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> >   * @device: Pointer to the &struct acpi_device to check
> >   *
> > - * Check if the device is present and has no unmet dependencies.
> > + * Check if the device is functional or enabled and has no unmet dependencies.
> >   *
> > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > + * Return true if the device is ready for enumeration. Otherwise, return false.
> >   */
> >  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> >  {
> >  	if (device->flags.honor_deps && device->dep_unmet)
> >  		return false;
> >  
> > -	return acpi_device_is_present(device);
> > +	/*
> > +	 * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > +	 * (!present && functional) for certain types of devices that should be
> > +	 * enumerated. Note that the enabled bit should not be set unless the
> > +	 * present bit is set.
> > +	 *
> > +	 * However, limit this only to processor devices to reduce possible
> > +	 * regressions with firmware.
> > +	 */
> > +	if (device->status.functional)
> > +		return true;

I have a report from within Oracle that this causes testing failures
with QEMU using -smp cpus=2,maxcpus=4. I think it needs to be:

	if (!device->status.present)
		return device->status.functional;

	if (device->status.enabled)
		return true;

	return !acpi_device_is_processor(device);

So we can better understand the history here, let's list it as a
truth table. P=present, F=functional, E=enabled, Orig=how the code
is in mainline, James=James' original proposal, Rafael=the proposed
replacement but seems to be buggy, Rmk=the fixed version that passes
tests:

P F E	Orig	James	Rafael		Rmk
0 0 0	0	0	0		0
0 0 1	0	0	0		0
0 1 0	1	1	1		1
0 1 1	1	0	1		1
1 0 0	1	0	!processor	!processor
1 0 1	1	1	1		1
1 1 0	1	0	1		!processor
1 1 1	1	1	1		1

Any objections to this?
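
For anyone wanting to check the last two columns of the table by hand, here
is a minimal standalone sketch (userspace C, not kernel code). The struct
sta_bits type and the ready_rafael()/ready_rmk() names are hypothetical
stand-ins invented for illustration; the honor_deps/dep_unmet check is left
out, and "is_processor" stands in for acpi_device_is_processor().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three status bits discussed above. */
struct sta_bits {
	bool present;
	bool functional;
	bool enabled;
};

/* "Rafael" column: the logic from the quoted patch. */
static bool ready_rafael(struct sta_bits s, bool is_processor)
{
	if (s.functional)
		return true;

	if (!s.present)
		return false;

	/* Present: enabled devices are always enumerated. */
	if (s.enabled)
		return true;

	return !is_processor;
}

/* "Rmk" column: the proposed fix. */
static bool ready_rmk(struct sta_bits s, bool is_processor)
{
	/* Not present: only the functional bit matters. */
	if (!s.present)
		return s.functional;

	if (s.enabled)
		return true;

	/* Present but not enabled: skip processors only. */
	return !is_processor;
}
```

The two functions differ only for P=1 F=1 E=0 on a processor device, where
ready_rafael() returns true and ready_rmk() returns false.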
Russell King (Oracle) Jan. 29, 2024, 3:16 p.m. UTC | #8
On Mon, Jan 29, 2024 at 04:05:42PM +0100, Rafael J. Wysocki wrote:
> On Mon, Jan 29, 2024 at 3:55 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > Hi Jonathan,
> >
> > On Fri, Jan 12, 2024 at 11:52:05AM +0000, Jonathan Cameron wrote:
> > > On Thu, 11 Jan 2024 10:26:15 +0000
> > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > > @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> > > >   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> > > >   * @device: Pointer to the &struct acpi_device to check
> > > >   *
> > > > - * Check if the device is present and has no unmet dependencies.
> > > > + * Check if the device is functional or enabled and has no unmet dependencies.
> > > >   *
> > > > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > > > + * Return true if the device is ready for enumeration. Otherwise, return false.
> > > >   */
> > > >  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> > > >  {
> > > >     if (device->flags.honor_deps && device->dep_unmet)
> > > >             return false;
> > > >
> > > > -   return acpi_device_is_present(device);
> > > > +   /*
> > > > +    * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > > > +    * (!present && functional) for certain types of devices that should be
> > > > +    * enumerated. Note that the enabled bit should not be set unless the
> > > > +    * present bit is set.
> > > > +    *
> > > > +    * However, limit this only to processor devices to reduce possible
> > > > +    * regressions with firmware.
> > > > +    */
> > > > +   if (device->status.functional)
> > > > +           return true;
> >
> > I have a report from within Oracle that this causes testing failures
> > with QEMU using -smp cpus=2,maxcpus=4. I think it needs to be:
> >
> >         if (!device->status.present)
> >                 return device->status.functional;
> >
> >         if (device->status.enabled)
> >                 return true;
> >
> >         return !acpi_device_is_processor(device);
> 
> The above is fine by me.
> 
> > So we can better understand the history here, let's list it as a
> > truth table. P=present, F=functional, E=enabled, Orig=how the code
> > is in mainline, James=James' original proposal, Rafael=the proposed
> > replacement but seems to be buggy, Rmk=the fixed version that passes
> > tests:
> >
> > P F E   Orig    James   Rafael          Rmk
> > 0 0 0   0       0       0               0
> > 0 0 1   0       0       0               0
> > 0 1 0   1       1       1               1
> > 0 1 1   1       0       1               1
> > 1 0 0   1       0       !processor      !processor
> > 1 0 1   1       1       1               1
> > 1 1 0   1       0       1               !processor
> > 1 1 1   1       1       1               1
> >
> > Any objections to this?
> 
> So AFAIAC it can return false if not enabled, but present and
> functional.  [Side note: I'm wondering what "functional" means then,
> but whatever.]
Rafael J. Wysocki Jan. 29, 2024, 3:34 p.m. UTC | #9
On Mon, Jan 29, 2024 at 4:17 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Mon, Jan 29, 2024 at 04:05:42PM +0100, Rafael J. Wysocki wrote:
> > On Mon, Jan 29, 2024 at 3:55 PM Russell King (Oracle)
> > <linux@armlinux.org.uk> wrote:
> > >
> > > Hi Jonathan,
> > >
> > > On Fri, Jan 12, 2024 at 11:52:05AM +0000, Jonathan Cameron wrote:
> > > > On Thu, 11 Jan 2024 10:26:15 +0000
> > > > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > > > @@ -2381,16 +2388,38 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> > > > >   * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> > > > >   * @device: Pointer to the &struct acpi_device to check
> > > > >   *
> > > > > - * Check if the device is present and has no unmet dependencies.
> > > > > + * Check if the device is functional or enabled and has no unmet dependencies.
> > > > >   *
> > > > > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > > > > + * Return true if the device is ready for enumeration. Otherwise, return false.
> > > > >   */
> > > > >  bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> > > > >  {
> > > > >     if (device->flags.honor_deps && device->dep_unmet)
> > > > >             return false;
> > > > >
> > > > > -   return acpi_device_is_present(device);
> > > > > +   /*
> > > > > +    * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > > > > +    * (!present && functional) for certain types of devices that should be
> > > > > +    * enumerated. Note that the enabled bit should not be set unless the
> > > > > +    * present bit is set.
> > > > > +    *
> > > > > +    * However, limit this only to processor devices to reduce possible
> > > > > +    * regressions with firmware.
> > > > > +    */
> > > > > +   if (device->status.functional)
> > > > > +           return true;
> > >
> > > I have a report from within Oracle that this causes testing failures
> > > with QEMU using -smp cpus=2,maxcpus=4. I think it needs to be:
> > >
> > >         if (!device->status.present)
> > >                 return device->status.functional;
> > >
> > >         if (device->status.enabled)
> > >                 return true;
> > >
> > >         return !acpi_device_is_processor(device);
> >
> > The above is fine by me.
> >
> > > So we can better understand the history here, let's list it as a
> > > truth table. P=present, F=functional, E=enabled, Orig=how the code
> > > is in mainline, James=James' original proposal, Rafael=the proposed
> > > replacement but seems to be buggy, Rmk=the fixed version that passes
> > > tests:
> > >
> > > P F E   Orig    James   Rafael          Rmk
> > > 0 0 0   0       0       0               0
> > > 0 0 1   0       0       0               0
> > > 0 1 0   1       1       1               1
> > > 0 1 1   1       0       1               1
> > > 1 0 0   1       0       !processor      !processor
> > > 1 0 1   1       1       1               1
> > > 1 1 0   1       0       1               !processor
> > > 1 1 1   1       1       1               1
> > >
> > > Any objections to this?
> >
> > So AFAIAC it can return false if not enabled, but present and
> > functional.  [Side note: I'm wondering what "functional" means then,
> > but whatever.]
>
> From ACPI v6.5 (bit 3 is our "status.functional"):
>
>  _STA may return bit 0 clear (not present) with bit [3] set (device is
>  functional). This case is used to indicate a valid device for which no
>  device driver should be loaded (for example, a bridge device.) Children
>  of this device may be present and valid. OSPM should continue
>  enumeration below a device whose _STA returns this bit combination.
>
> So, acpi_dev_ready_for_enumeration() returning true for this case is
> correct, since we're supposed to enumerate it and child devices.
>
> It's probably also worth pointing out that in the above table, the two
> combinations with P=0 E=1 go against the spec, but are included for
> completeness.

The difference between the last two columns is the "present and
functional, but not enabled" combination AFAICS, for which my patch
just returned true, but the firmware disagrees with that.

It is kind of analogous to the "not present and functional" case
covered by the spec, which is why it is fine by me to return "false"
then (for processors), but the spec is not crystal clear about it.
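
To make the bit combinations concrete, a rough sketch of the agreed rule
applied to a raw _STA value follows. It is standalone C rather than kernel
code: the STA_* macro names and sta_ready() are hypothetical, and the bit
positions are assumed from the _STA definition (bit 0 present, bit 1
enabled, bit 3 functional).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions assumed from the ACPI _STA definition. */
#define STA_PRESENT	(1u << 0)
#define STA_ENABLED	(1u << 1)
#define STA_FUNCTIONAL	(1u << 3)

/* Apply the rule discussed above to a raw _STA return value. */
static bool sta_ready(uint32_t sta, bool is_processor)
{
	/* Not present: enumerate only if functional (e.g. a bridge device). */
	if (!(sta & STA_PRESENT))
		return (sta & STA_FUNCTIONAL) != 0;

	if (sta & STA_ENABLED)
		return true;

	/* Present but not enabled: refuse processors, allow everything else. */
	return !is_processor;
}
```

With these assumptions, sta_ready(0x08, ...) is true for any device (the
"not present and functional" case from the spec), while sta_ready(0x09, true)
is false: a processor that is present and functional but not enabled is not
enumerated.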