[v7,00/16] ACPI/arm64: add support for virtual cpu hotplug

Message ID 20240418135412.14730-1-Jonathan.Cameron@huawei.com

Message

Jonathan Cameron April 18, 2024, 1:53 p.m. UTC
Whilst it is a bit quick after v6, a couple of critical issues
were pointed out by Russell, Salil and Rafael + one build issue
had been missed, so it seems sensible to make sure those conducting
testing or further review have access to a fixed version.

v7:
  - Fix misplaced config guard that broke bisection.
  - Greatly simplify the condition on which we call
    acpi_processor_hotadd_init().
  - Improve teardown ordering.

Fundamental change v6+: At the level of common ACPI infrastructure, use
the existing hotplug path for arm64 even though what needs to be
done at the architecture specific level is quite different.

An explicit check in arch_register_cpu() for arm64 prevents
this code doing anything if Physical CPU Hotplug is signalled.

This should resolve any concerns about treating virtual CPU
hotplug as if it were physical and potential unwanted side effects
if physical CPU hotplug is added to the ARM architecture in the
future.

v6: Thanks to Rafael for extensive help with the approach + reviews.
Specific changes:
 - Do not differentiate wrt code flow between traditional CPU HP
   and the new ARM flow.  The conditions on performing hotplug actions
   do need to be adjusted though to incorporate the slightly different
   state transition
     Added PRESENT + !ENABLED -> PRESENT + ENABLED
     to existing !PRESENT + !ENABLED -> PRESENT + ENABLED
 - Enable ACPI_HOTPLUG_CPU on arm64 and drop the earlier patches that
   took various code out of the protection of that.  Now the paths
 - New patch to drop unnecessary _STA check in hotplug code. This
   code cannot be entered unless ENABLED + PRESENT are set.
 - New patch to unify the flow of already onlined (at time of driver
   load) and hotplugged CPUs in acpi/processor_driver.c.
   This change is necessary because we can't easily distinguish the
   2 cases of deferred vs hotplug calls of register_cpu() on arm64.
   It is also a nice simplification.
 - Use flags rather than a structure for the extra parameter to
   acpi_scan_check_and_detach() - Thanks to Shameer for offline feedback.
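
For readers following along, the adjusted hot-add condition from the v6 notes can be modelled in a few lines of standalone C. This is only an illustrative sketch: should_hotadd() is a hypothetical helper, not a function from the series, and the two _STA bit values are assumed to match the kernel's ACPI_STA_DEVICE_PRESENT and ACPI_STA_DEVICE_ENABLED definitions.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's _STA bit definitions. */
#define STA_PRESENT 0x01u
#define STA_ENABLED 0x02u

/*
 * Hypothetical helper: returns true when a _STA change should be treated
 * as a hot-add, i.e. the device ends up PRESENT + ENABLED and was not
 * already in that state. This covers both the traditional transition
 * (!PRESENT + !ENABLED -> PRESENT + ENABLED) and the added one
 * (PRESENT + !ENABLED -> PRESENT + ENABLED).
 */
static bool should_hotadd(unsigned int old_sta, unsigned int new_sta)
{
	const unsigned int both = STA_PRESENT | STA_ENABLED;

	return (new_sta & both) == both && (old_sta & both) != both;
}
```

Calling should_hotadd(STA_PRESENT, STA_PRESENT | STA_ENABLED) models the new arm64 vCPU case, while should_hotadd(0, STA_PRESENT | STA_ENABLED) models traditional physical hotplug; in this model both take the same path.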

Updated version of James' original introduction.

This series adds what looks like cpuhotplug support to arm64 for use in
virtual machines. It does this by moving the register_cpu() calls for
architectures that support ACPI into an arch specific call made from
the ACPI processor driver.
 
The kubernetes folk really want to be able to add CPUs to an existing VM,
in exactly the same way they do on x86. The use-case is pre-booting guests
with one CPU, then adding the number that were actually needed when the
workload is provisioned.

Wait? Doesn't arm64 support cpuhotplug already!?
In the arm world, cpuhotplug gets used to mean removing the power from a CPU.
The CPU is offline, and remains present. For x86, and ACPI, cpuhotplug
has the additional step of physically removing the CPU, so that it isn't
present anymore.
 
Arm64 doesn't support this, and can't support it: CPUs are really a slice
of the SoC, and there is not enough information in the existing ACPI tables
to describe which bits of the slice also got removed. Without a reference
machine: adding this support to the spec is a wild goose chase.
 
Critically: everything described in the firmware tables must remain present.
 
For a virtual machine this is easy as all the other bits of 'virtual SoC'
are emulated, so they can (and do) remain present when a vCPU is 'removed'.

On a system that supports cpuhotplug the MADT has to describe every possible
CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
the guest is started.
With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
policy about which CPUs can be brought online.
 
This series adds support for virtual-cpuhotplug as exactly that: firmware
policy. This may even work on a physical machine too; for a guest the part of
firmware is played by the VMM (typically Qemu).
 
PSCI support is modified to return 'DENIED' if the CPU can't be brought
online/enabled yet. The CPU object's _STA method's enabled bit is used to
indicate firmware's current disposition. If the CPU has its enabled bit clear,
it will not be registered with sysfs, and attempts to bring it online will
fail. The notifications that _STA has changed its value then work in the same
way as physical hotplug, and firmware can cause the CPU to be registered some
time later, allowing it to be brought online.
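
The firmware-policy behaviour described above can be sketched as a small standalone model. Everything here is hypothetical and for illustration only: struct vcpu_model and the model_*() functions are not part of the series, and the use of -EPERM as the Linux-side error for a PSCI 'DENIED' return is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's _STA bit definitions. */
#define STA_PRESENT 0x01u
#define STA_ENABLED 0x02u

/* Hypothetical model of one vCPU as seen by the guest. */
struct vcpu_model {
	unsigned int sta;	/* value firmware (the VMM) reports via _STA */
	bool registered;	/* true once the CPU is registered with sysfs */
};

/* Firmware policy: CPU_ON is refused until the enabled bit is set. */
static int model_psci_cpu_on(const struct vcpu_model *c)
{
	return (c->sta & STA_ENABLED) ? 0 : -EPERM;
}

/* A notification reports that _STA changed: (un)register accordingly. */
static void model_sta_notify(struct vcpu_model *c, unsigned int new_sta)
{
	c->sta = new_sta;
	c->registered = (new_sta & STA_PRESENT) && (new_sta & STA_ENABLED);
}

/* Bringing a CPU online fails unless it is registered and firmware agrees. */
static int model_cpu_up(const struct vcpu_model *c)
{
	if (!c->registered)
		return -ENODEV;
	return model_psci_cpu_on(c);
}
```

In this model a vCPU that boots with only the present bit set cannot be onlined; after model_sta_notify() reports PRESENT + ENABLED it is registered and model_cpu_up() succeeds. The CPU stays present throughout, which matches the point that only the enabled bit changes.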
 
This creates something that looks like cpuhotplug to user-space and the
kernel beyond arm64 architecture specific code, as the sysfs
files appear and disappear, and the udev notifications look the same.
 
One notable difference is the CPU present mask, which is exposed via sysfs.
Because the CPUs remain present throughout, they can still be seen in that mask.
This value does get used by web browsers to estimate the number of CPUs
as the CPU online mask is constantly changed on mobile phones.
 
Linux is tolerant of PSCI returning errors, as it's always been allowed to do
that. To avoid confusing OSes that can't tolerate this, we needed an additional
bit in the MADT GICC flags. This series copies ACPI_MADT_ONLINE_CAPABLE, which
appears to be for this purpose, but calls it ACPI_MADT_GICC_ONLINE_CAPABLE as it
has a different bit position in the GICC.
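
As a rough illustration of how the two GICC flag bits combine, here is a standalone sketch. The bit positions are assumptions (enabled at bit 0, online capable at bit 3) and should be checked against the ACPI specification; classify_gicc() and the enum are hypothetical names used only for this example. The irqchip patch in this thread refers to the real flag as ACPI_MADT_GICC_ONLINE_CAPABLE.

```c
#include <stdint.h>

/* Assumed bit positions in the MADT GICC flags field. */
#define GICC_ENABLED		(1u << 0)
#define GICC_ONLINE_CAPABLE	(1u << 3)

/* Hypothetical classification of a GICC entry at boot. */
enum gicc_state {
	GICC_CPU_UNUSABLE,	/* neither bit set: never usable */
	GICC_CPU_BOOT_ENABLED,	/* enabled at boot */
	GICC_CPU_HOTPLUGGABLE,	/* disabled now, may be enabled later */
};

static enum gicc_state classify_gicc(uint32_t flags)
{
	if (flags & GICC_ENABLED)
		return GICC_CPU_BOOT_ENABLED;
	if (flags & GICC_ONLINE_CAPABLE)
		return GICC_CPU_HOTPLUGGABLE;
	return GICC_CPU_UNUSABLE;
}
```

An OS that does not know about the online-capable bit simply sees such an entry as disabled, which is how the extra bit avoids confusing it.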
 
This code is unconditionally enabled for all ACPI architectures, though for
now only arm64 will have deferred the register_cpu() calls.

If folk want to play along at home, you'll need a copy of Qemu that supports this.
https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2

Replace your '-smp' argument with something like:
 | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
 
 then feed the following to the Qemu monitor:
 | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
 | (qemu) device_del cpu1

James Morse (7):
  ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
  arm64: acpi: Move get_cpu_for_acpi_id() to a header
  irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
  irqchip/gic-v3: Add support for ACPI's disabled but 'online capable'
    CPUs
  arm64: document virtual CPU hotplug's expectations
  cpumask: Add enabled cpumask for present CPUs that can be brought
    online

Jean-Philippe Brucker (1):
  arm64: psci: Ignore DENIED CPUs

Jonathan Cameron (8):
  ACPI: processor: Simplify initial onlining to use same path for cold
    and hotplug
  cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER
  ACPI: processor: Drop duplicated check on _STA (enabled + present)
  ACPI: processor: Move checks and availability of acpi_processor
    earlier
  ACPI: processor: Add acpi_get_processor_handle() helper
  ACPI: scan: switch to flags for acpi_scan_check_and_detach()
  arm64: arch_register_cpu() variant to check if an ACPI handle is now
    available.
  arm64: Kconfig: Enable hotplug CPU on arm64 if ACPI_PROCESSOR is
    enabled.

 .../ABI/testing/sysfs-devices-system-cpu      |   6 +
 Documentation/arch/arm64/cpu-hotplug.rst      |  79 ++++++++++++
 Documentation/arch/arm64/index.rst            |   1 +
 arch/arm64/Kconfig                            |   1 +
 arch/arm64/include/asm/acpi.h                 |  11 ++
 arch/arm64/kernel/acpi.c                      |  16 +++
 arch/arm64/kernel/acpi_numa.c                 |  11 --
 arch/arm64/kernel/psci.c                      |   2 +-
 arch/arm64/kernel/smp.c                       |  56 ++++++++-
 drivers/acpi/acpi_processor.c                 | 113 ++++++++++--------
 drivers/acpi/processor_driver.c               |  44 ++-----
 drivers/acpi/scan.c                           |  47 ++++++--
 drivers/base/cpu.c                            |  12 +-
 drivers/irqchip/irq-gic-v3.c                  |  32 +++--
 include/acpi/acpi_bus.h                       |   1 +
 include/acpi/processor.h                      |   2 +-
 include/linux/acpi.h                          |  10 +-
 include/linux/cpumask.h                       |  25 ++++
 kernel/cpu.c                                  |   3 +
 19 files changed, 357 insertions(+), 115 deletions(-)
 create mode 100644 Documentation/arch/arm64/cpu-hotplug.rst

Comments

Miguel Luis April 19, 2024, 3:39 p.m. UTC | #1
> On 18 Apr 2024, at 13:53, Jonathan Cameron <jonathan.cameron@huawei.com> wrote:
> 
> Whilst it is a bit quick after v6, a couple of critical issues
> were pointed out by Russell, Salil and Rafael + one build issue
> had been missed, so it seems sensible to make sure those conducting
> testing or further review have access to a fixed version.
> 
> v7:
>  - Fix misplaced config guard that broke bisection.
>  - Greatly simplify the condition on which we call
>    acpi_processor_hotadd_init().
>  - Improve teardown ordering.
> 

Hi Jonathan,

I've tested v7 on an arm64 machine running QEMU from
https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2, with KVM.

- boot
- hotplug up to 'maxcpus'
- hotunplug down to the number of boot cpus
- hotplug vcpus and migrate with vcpus offline
- hotplug vcpus and migrate with vcpus online
- hotplug vcpus then unplug vcpus then migrate
- successive live migrations

Feel free to add:
Tested-by: Miguel Luis <miguel.luis@oracle.com>

Thank you
Miguel

Jonathan Cameron April 22, 2024, 10:40 a.m. UTC | #2
On Thu, 18 Apr 2024 14:54:07 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> From: James Morse <james.morse@arm.com>
> 
> To support virtual CPU hotplug, ACPI has added an 'online capable' bit
> to the MADT GICC entries. This indicates a disabled CPU entry may not
> be possible to online via PSCI until firmware has set the enabled bit in
> _STA.
> 
> This means that a "usable" GIC is one that is marked as either enabled,
> or online capable. Therefore, change acpi_gicc_is_usable() to check both
> bits. However, we need to change the test in gic_acpi_match_gicc() back
> to testing just the enabled bit so the count of enabled redistributors is
> correct.
> 
> What about the redistributor in the GICC entry? ACPI doesn't want to say.
> Assume the worst: When a redistributor is described in the GICC entry,
> but the entry is marked as disabled at boot, assume the redistributor
> is inaccessible.
> 
> The GICv3 driver doesn't support late online of redistributors, so this
> means the corresponding CPU can't be brought online either. Clear the
> possible and present bits.
> 
> Systems that want CPU hotplug in a VM can ensure their redistributors
> are always-on, and describe them that way with a GICR entry in the MADT.
> 
> When mapping redistributors found via GICC entries, handle the case
> where the arch code believes the CPU is present and possible, but it
> does not have an accessible redistributor. Print a warning and clear
> the present and possible bits.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

+CC Marc,

Whilst this has been unchanged for a long time, I'm not 100% sure
we've specifically drawn your attention to it before now.

Jonathan

> 
> ---
> v7: No Change.
> ---
>  drivers/irqchip/irq-gic-v3.c | 21 +++++++++++++++++++--
>  include/linux/acpi.h         |  3 ++-
>  2 files changed, 21 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 10af15f93d4d..66132251c1bb 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2363,11 +2363,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
>  				(struct acpi_madt_generic_interrupt *)header;
>  	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
>  	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> +	int cpu = get_cpu_for_acpi_id(gicc->uid);
>  	void __iomem *redist_base;
>  
>  	if (!acpi_gicc_is_usable(gicc))
>  		return 0;
>  
> +	/*
> +	 * Capable but disabled CPUs can be brought online later. What about
> +	 * the redistributor? ACPI doesn't want to say!
> +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> +	 * Otherwise, prevent such CPUs from being brought online.
> +	 */
> +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> +		set_cpu_present(cpu, false);
> +		set_cpu_possible(cpu, false);
> +		return 0;
> +	}
> +
>  	redist_base = ioremap(gicc->gicr_base_address, size);
>  	if (!redist_base)
>  		return -ENOMEM;
> @@ -2413,9 +2427,12 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
>  
>  	/*
>  	 * If GICC is enabled and has valid gicr base address, then it means
> -	 * GICR base is presented via GICC
> +	 * GICR base is presented via GICC. The redistributor is only known to
> +	 * be accessible if the GICC is marked as enabled. If this bit is not
> +	 * set, we'd need to add the redistributor at runtime, which isn't
> +	 * supported.
>  	 */
> -	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address)
> +	if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
>  		acpi_data.enabled_rdists++;
>  
>  	return 0;
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 9844a3f9c4e5..fcfb7bb6789e 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -239,7 +239,8 @@ void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
>  
>  static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
>  {
> -	return gicc->flags & ACPI_MADT_ENABLED;
> +	return gicc->flags & (ACPI_MADT_ENABLED |
> +			      ACPI_MADT_GICC_ONLINE_CAPABLE);
>  }
>  
>  /* the following numa functions are architecture-dependent */
Jonathan Cameron April 22, 2024, 10:46 a.m. UTC | #3
On Thu, 18 Apr 2024 14:54:05 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> From: James Morse <james.morse@arm.com>
> 
> ACPI identifies CPUs by UID. get_cpu_for_acpi_id() maps the ACPI UID
> to the Linux CPU number.
> 
> The helper to retrieve this mapping is only available in arm64's NUMA
> code.
> 
> Move it to live next to get_acpi_id_for_cpu().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Another one where we'd been focused on the general ACPI aspects so long
the CC list didn't include relevant maintainers.

+CC Lorenzo, Hanjun and Sudeep.


> ---
> v7: No change
> ---
>  arch/arm64/include/asm/acpi.h | 11 +++++++++++
>  arch/arm64/kernel/acpi_numa.c | 11 -----------
>  2 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> index 6792a1f83f2a..bc9a6656fc0c 100644
> --- a/arch/arm64/include/asm/acpi.h
> +++ b/arch/arm64/include/asm/acpi.h
> @@ -119,6 +119,17 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
>  	return	acpi_cpu_get_madt_gicc(cpu)->uid;
>  }
>  
> +static inline int get_cpu_for_acpi_id(u32 uid)
> +{
> +	int cpu;
> +
> +	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
> +		if (uid == get_acpi_id_for_cpu(cpu))
> +			return cpu;
> +
> +	return -EINVAL;
> +}
> +
>  static inline void arch_fix_phys_package_id(int num, u32 slot) { }
>  void __init acpi_init_cpus(void);
>  int apei_claim_sea(struct pt_regs *regs);
> diff --git a/arch/arm64/kernel/acpi_numa.c b/arch/arm64/kernel/acpi_numa.c
> index e51535a5f939..0c036a9a3c33 100644
> --- a/arch/arm64/kernel/acpi_numa.c
> +++ b/arch/arm64/kernel/acpi_numa.c
> @@ -34,17 +34,6 @@ int __init acpi_numa_get_nid(unsigned int cpu)
>  	return acpi_early_node_map[cpu];
>  }
>  
> -static inline int get_cpu_for_acpi_id(u32 uid)
> -{
> -	int cpu;
> -
> -	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
> -		if (uid == get_acpi_id_for_cpu(cpu))
> -			return cpu;
> -
> -	return -EINVAL;
> -}
> -
>  static int __init acpi_parse_gicc_pxm(union acpi_subtable_headers *header,
>  				      const unsigned long end)
>  {
Rafael J. Wysocki April 22, 2024, 6:46 p.m. UTC | #4
On Thu, Apr 18, 2024 at 3:54 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> Separate code paths, combined with a flag set in acpi_processor.c to
> indicate a struct acpi_processor was for a hotplugged CPU ensured that
> per CPU data was only set up the first time that a CPU was initialized.
> This appears to be unnecessary as the paths can be combined by letting
> the online logic also handle any CPUs online at the time of driver load.
>
> Motivation for this change, beyond simplification, is that ARM64
> virtual CPU HP uses the same code paths for hotplug and cold path in
> acpi_processor.c so had no easy way to set the flag for hotplug only.
> Removing this necessity will enable ARM64 vCPU HP to reuse the existing
> code paths.
>
> Leave noisy pr_info() in place but update it to not state the CPU
> was hotplugged.
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

LGTM, so

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
> v7: No change.
> v6: New patch.
> RFT: I have very limited test resources for x86 and other
> architectures that may be affected by this change.
> ---
>  drivers/acpi/acpi_processor.c   |  1 -
>  drivers/acpi/processor_driver.c | 44 ++++++++++-----------------------
>  include/acpi/processor.h        |  2 +-
>  3 files changed, 14 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 7a0dd35d62c9..7fc924aeeed0 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -216,7 +216,6 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
>          * gets online for the first time.
>          */
>         pr_info("CPU%d has been hot-added\n", pr->id);
> -       pr->flags.need_hotplug_init = 1;
>
>  out:
>         cpus_write_unlock();
> diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
> index 67db60eda370..55782eac3ff1 100644
> --- a/drivers/acpi/processor_driver.c
> +++ b/drivers/acpi/processor_driver.c
> @@ -33,7 +33,6 @@ MODULE_AUTHOR("Paul Diefenbaugh");
>  MODULE_DESCRIPTION("ACPI Processor Driver");
>  MODULE_LICENSE("GPL");
>
> -static int acpi_processor_start(struct device *dev);
>  static int acpi_processor_stop(struct device *dev);
>
>  static const struct acpi_device_id processor_device_ids[] = {
> @@ -47,7 +46,6 @@ static struct device_driver acpi_processor_driver = {
>         .name = "processor",
>         .bus = &cpu_subsys,
>         .acpi_match_table = processor_device_ids,
> -       .probe = acpi_processor_start,
>         .remove = acpi_processor_stop,
>  };
>
> @@ -115,12 +113,10 @@ static int acpi_soft_cpu_online(unsigned int cpu)
>          * CPU got physically hotplugged and onlined for the first time:
>          * Initialize missing things.
>          */
> -       if (pr->flags.need_hotplug_init) {
> +       if (!pr->flags.previously_online) {
>                 int ret;
>
> -               pr_info("Will online and init hotplugged CPU: %d\n",
> -                       pr->id);
> -               pr->flags.need_hotplug_init = 0;
> +               pr_info("Will online and init CPU: %d\n", pr->id);
>                 ret = __acpi_processor_start(device);
>                 WARN(ret, "Failed to start CPU: %d\n", pr->id);
>         } else {
> @@ -167,9 +163,6 @@ static int __acpi_processor_start(struct acpi_device *device)
>         if (!pr)
>                 return -ENODEV;
>
> -       if (pr->flags.need_hotplug_init)
> -               return 0;
> -
>         result = acpi_cppc_processor_probe(pr);
>         if (result && !IS_ENABLED(CONFIG_ACPI_CPU_FREQ_PSS))
>                 dev_dbg(&device->dev, "CPPC data invalid or not present\n");
> @@ -185,32 +178,21 @@ static int __acpi_processor_start(struct acpi_device *device)
>
>         status = acpi_install_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
>                                              acpi_processor_notify, device);
> -       if (ACPI_SUCCESS(status))
> -               return 0;
> +       if (!ACPI_SUCCESS(status)) {
> +               result = -ENODEV;
> +               goto err_thermal_exit;
> +       }
> +       pr->flags.previously_online = 1;
>
> -       result = -ENODEV;
> -       acpi_processor_thermal_exit(pr, device);
> +       return 0;
>
> +err_thermal_exit:
> +       acpi_processor_thermal_exit(pr, device);
>  err_power_exit:
>         acpi_processor_power_exit(pr);
>         return result;
>  }
>
> -static int acpi_processor_start(struct device *dev)
> -{
> -       struct acpi_device *device = ACPI_COMPANION(dev);
> -       int ret;
> -
> -       if (!device)
> -               return -ENODEV;
> -
> -       /* Protect against concurrent CPU hotplug operations */
> -       cpu_hotplug_disable();
> -       ret = __acpi_processor_start(device);
> -       cpu_hotplug_enable();
> -       return ret;
> -}
> -
>  static int acpi_processor_stop(struct device *dev)
>  {
>         struct acpi_device *device = ACPI_COMPANION(dev);
> @@ -279,9 +261,9 @@ static int __init acpi_processor_driver_init(void)
>         if (result < 0)
>                 return result;
>
> -       result = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> -                                          "acpi/cpu-drv:online",
> -                                          acpi_soft_cpu_online, NULL);
> +       result = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> +                                  "acpi/cpu-drv:online",
> +                                  acpi_soft_cpu_online, NULL);
>         if (result < 0)
>                 goto err;
>         hp_online = result;
> diff --git a/include/acpi/processor.h b/include/acpi/processor.h
> index 3f34ebb27525..e6f6074eadbf 100644
> --- a/include/acpi/processor.h
> +++ b/include/acpi/processor.h
> @@ -217,7 +217,7 @@ struct acpi_processor_flags {
>         u8 has_lpi:1;
>         u8 power_setup_done:1;
>         u8 bm_rld_set:1;
> -       u8 need_hotplug_init:1;
> +       u8 previously_online:1;
>  };
>
>  struct acpi_processor {
> --
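
The unified flow in this patch can be summarised with a small standalone model. This is a sketch under assumptions, not the driver code: struct proc_model, its counters and the model_*() functions are hypothetical, and only the previously_online flag name comes from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU struct acpi_processor state. */
struct proc_model {
	bool previously_online;	/* set once full init has succeeded */
	int full_inits;		/* how many times full init ran */
	int reinits;		/* how many times the lighter re-init ran */
};

/* Models __acpi_processor_start(): mark the CPU only on success. */
static int model_start(struct proc_model *pr, bool fail)
{
	if (fail)
		return -ENODEV;
	pr->full_inits++;
	pr->previously_online = true;
	return 0;
}

/* Models the online callback, used for both cold-boot and hotplugged CPUs. */
static void model_cpu_online(struct proc_model *pr, bool fail_start)
{
	if (!pr->previously_online)
		model_start(pr, fail_start);
	else
		pr->reinits++;
}
```

Because the flag is only set after a successful start, a CPU whose first initialisation failed gets another full attempt the next time it comes online, and there is no need to know whether the CPU was present at driver load or hotplugged later.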
Rafael J. Wysocki April 22, 2024, 6:48 p.m. UTC | #5
On Thu, Apr 18, 2024 at 3:55 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> The ACPI bus scan will only result in acpi_processor_add() being called
> if _STA has already been checked and the result is that the
> processor is enabled and present.  Hence drop this additional check.
>
> Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

LGTM, so

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
> v7: No change
> v6: New patch to drop this unnecessary code. Now I think we only
>     need to explicitly read STA to print a warning in the ARM64
>     arch_unregister_cpu() path where we want to know if the
>     present bit has been unset as well.
> ---
>  drivers/acpi/acpi_processor.c | 6 ------
>  1 file changed, 6 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 7fc924aeeed0..ba0a6f0ac841 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -186,17 +186,11 @@ static void __init acpi_pcc_cpufreq_init(void) {}
>  #ifdef CONFIG_ACPI_HOTPLUG_CPU
>  static int acpi_processor_hotadd_init(struct acpi_processor *pr)
>  {
> -       unsigned long long sta;
> -       acpi_status status;
>         int ret;
>
>         if (invalid_phys_cpuid(pr->phys_id))
>                 return -ENODEV;
>
> -       status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> -       if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_PRESENT))
> -               return -ENODEV;
> -
>         cpu_maps_update_begin();
>         cpus_write_lock();
>
> --
Rafael J. Wysocki April 22, 2024, 6:59 p.m. UTC | #6
On Thu, Apr 18, 2024 at 3:56 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> If CONFIG_ACPI_PROCESSOR is enabled, provide a helper to retrieve the
> acpi_handle for a given CPU, allowing access to methods
> in the DSDT.
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> v7: No change
> v6: New patch
> ---
>  drivers/acpi/acpi_processor.c | 10 ++++++++++
>  include/linux/acpi.h          |  7 +++++++
>  2 files changed, 17 insertions(+)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index ac7ddb30f10e..127ae8dcb787 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -35,6 +35,16 @@ EXPORT_PER_CPU_SYMBOL(processors);
>  struct acpi_processor_errata errata __read_mostly;
>  EXPORT_SYMBOL_GPL(errata);
>
> +acpi_handle acpi_get_processor_handle(int cpu)
> +{
> +       acpi_handle handle = NULL;

The local var looks redundant.

> +       struct acpi_processor *pr = per_cpu(processors, cpu);
> +
> +       if (pr)
> +               handle = pr->handle;
> +
> +       return handle;

struct acpi_processor *pr;

pr = per_cpu(processors, cpu);
if (pr)
        return pr->handle;

return NULL;


> +}
>  static int acpi_processor_errata_piix4(struct pci_dev *dev)
>  {
>         u8 value1 = 0;
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 34829f2c517a..9844a3f9c4e5 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -309,6 +309,8 @@ int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
>  int acpi_unmap_cpu(int cpu);
>  #endif /* CONFIG_ACPI_HOTPLUG_CPU */
>
> +acpi_handle acpi_get_processor_handle(int cpu);
> +
>  #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
>  int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
>  #endif
> @@ -1077,6 +1079,11 @@ static inline bool acpi_sleep_state_supported(u8 sleep_state)
>         return false;
>  }
>
> +static inline acpi_handle acpi_get_processor_handle(int cpu)
> +{
> +       return NULL;
> +}
> +
>  #endif /* !CONFIG_ACPI */
>
>  extern void arch_post_acpi_subsys_init(void);
> --
Rafael J. Wysocki April 22, 2024, 7:05 p.m. UTC | #7
On Thu, Apr 18, 2024 at 3:57 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> Precursor patch adds the ability to pass a uintptr_t of flags into
> acpi_scan_check_and_detach() so that additional flags can be
> added to indicate whether to defer portions of the eject flow.
> The new flag follows in the next patch.
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

I have no specific heartburn related to this, so

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
> v7: No change
> v6: Based on internal feedback switch to less invasive change
>     to using flags rather than a struct.
> ---
>  drivers/acpi/scan.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index d1464324de95..1ec9677e6c2d 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -244,13 +244,16 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
>         return 0;
>  }
>
> -static int acpi_scan_check_and_detach(struct acpi_device *adev, void *check)
> +#define ACPI_SCAN_CHECK_FLAG_STATUS    BIT(0)
> +
> +static int acpi_scan_check_and_detach(struct acpi_device *adev, void *p)
>  {
>         struct acpi_scan_handler *handler = adev->handler;
> +       uintptr_t flags = (uintptr_t)p;
>
> -       acpi_dev_for_each_child_reverse(adev, acpi_scan_check_and_detach, check);
> +       acpi_dev_for_each_child_reverse(adev, acpi_scan_check_and_detach, p);
>
> -       if (check) {
> +       if (flags & ACPI_SCAN_CHECK_FLAG_STATUS) {
>                 acpi_bus_get_status(adev);
>                 /*
>                  * Skip devices that are still there and take the enabled
> @@ -288,7 +291,9 @@ static int acpi_scan_check_and_detach(struct acpi_device *adev, void *check)
>
>  static void acpi_scan_check_subtree(struct acpi_device *adev)
>  {
> -       acpi_scan_check_and_detach(adev, (void *)true);
> +       uintptr_t flags = ACPI_SCAN_CHECK_FLAG_STATUS;
> +
> +       acpi_scan_check_and_detach(adev, (void *)flags);
>  }
>
>  static int acpi_scan_hot_remove(struct acpi_device *device)
> @@ -2601,7 +2606,9 @@ EXPORT_SYMBOL(acpi_bus_scan);
>   */
>  void acpi_bus_trim(struct acpi_device *adev)
>  {
> -       acpi_scan_check_and_detach(adev, NULL);
> +       uintptr_t flags = 0;
> +
> +       acpi_scan_check_and_detach(adev, (void *)flags);
>  }
>  EXPORT_SYMBOL_GPL(acpi_bus_trim);
>
> --
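The flags-in-a-pointer technique this patch relies on can be sketched in isolation. The following is a minimal userspace illustration, not the kernel code: every `demo_*` and `DEMO_*` name is hypothetical, and the second flag only stands in for the one promised by the next patch. It shows a callback with a generic `void *` argument recovering a bitmask via `uintptr_t` and passing the same pointer unchanged to its children.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names, for illustration only. */
#define DEMO_FLAG_STATUS ((uintptr_t)1 << 0)
#define DEMO_FLAG_DEFER  ((uintptr_t)1 << 1)

/* Hypothetical single-child tree node standing in for a device. */
struct demo_node {
	struct demo_node *child;
	int visited_with_status;
};

/*
 * Callback with the generic (node, void *) shape. The pointer is never
 * dereferenced; it only carries a small bitmask of flags.
 * Returns how many nodes were processed with the status flag set.
 */
static int demo_check_and_detach(struct demo_node *node, void *p)
{
	uintptr_t flags = (uintptr_t)p;
	int count = 0;

	assert(node != NULL);

	/* Children first, passing the opaque pointer through unchanged. */
	if (node->child)
		count += demo_check_and_detach(node->child, p);

	if (flags & DEMO_FLAG_STATUS) {
		node->visited_with_status = 1;
		count++;
	}

	return count;
}
```

A caller builds the mask in a `uintptr_t` and casts it once, for example `demo_check_and_detach(&root, (void *)DEMO_FLAG_STATUS)`. Adding a further flag then needs no change to the callback signature, which is presumably why this is less invasive than passing a struct.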
Rafael J. Wysocki April 22, 2024, 7:16 p.m. UTC | #8
On Thu, Apr 18, 2024 at 9:50 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Apr 18, 2024 at 3:54 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > Whilst it is a bit quick after v6, a couple of critical issues
> > were pointed out by Russell, Salil and Rafael + one build issue
> > had been missed, so it seems sensible to make sure those conducting
> > testing or further review have access to a fixed version.
> >
> > v7:
> >   - Fix misplaced config guard that broke bisection.
> >   - Greatly simplify the condition on which we call
> >     acpi_processor_hotadd_init().
> >   - Improve teardown ordering.
>
> Thank you for the update!
>
> From a quick look, patches [01-08/16] appear to be good now, but I'll
> do a more detailed review in the following days.

Done now, I've sent comments on patches [4-5/16].

The other patches in the first half of the series LGTM.

I can't say much about the ARM64-specific part, and the last patch has
already been ACKed by Thomas.

Thanks!
Hanjun Guo April 23, 2024, 6:18 a.m. UTC | #9
On 2024/4/18 21:53, Jonathan Cameron wrote:
> Separate code paths, combined with a flag set in acpi_processor.c to
> indicate a struct acpi_processor was for a hotplugged CPU ensured that
> per CPU data was only set up the first time that a CPU was initialized.
> This appears to be unnecessary as the paths can be combined by letting
> the online logic also handle any CPUs online at the time of driver load.
> 
> Motivation for this change, beyond simplification, is that ARM64
> virtual CPU HP uses the same code paths for hotplug and cold path in
> acpi_processor.c so had no easy way to set the flag for hotplug only.
> Removing this necessity will enable ARM64 vCPU HP to reuse the existing
> code paths.
> 
> Leave noisy pr_info() in place but update it to not state the CPU
> was hotplugged.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> ---
> v7: No change.
> v6: New patch.
> RFT: I have very limited test resources for x86 and other
> architectures that may be affected by this change.
> ---
>   drivers/acpi/acpi_processor.c   |  1 -
>   drivers/acpi/processor_driver.c | 44 ++++++++++-----------------------
>   include/acpi/processor.h        |  2 +-
>   3 files changed, 14 insertions(+), 33 deletions(-)

Nice simplification,

Reviewed-by: Hanjun Guo <guohanjun@huawei.com>

Thanks
Hanjun
Hanjun Guo April 23, 2024, 6:49 a.m. UTC | #10
On 2024/4/18 21:53, Jonathan Cameron wrote:
> The ACPI bus scan will only result in acpi_processor_add() being called
> if _STA has already been checked and the result is that the
> processor is enabled and present.  Hence drop this additional check.
> 
> Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> ---
> v7: No change
> v6: New patch to drop this unnecessary code. Now I think we only
>      need to explicitly read STA to print a warning in the ARM64
>      arch_unregister_cpu() path where we want to know if the
>      present bit has been unset as well.
> ---
>   drivers/acpi/acpi_processor.c | 6 ------
>   1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 7fc924aeeed0..ba0a6f0ac841 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -186,17 +186,11 @@ static void __init acpi_pcc_cpufreq_init(void) {}
>   #ifdef CONFIG_ACPI_HOTPLUG_CPU
>   static int acpi_processor_hotadd_init(struct acpi_processor *pr)
>   {
> -	unsigned long long sta;
> -	acpi_status status;
>   	int ret;
>   
>   	if (invalid_phys_cpuid(pr->phys_id))
>   		return -ENODEV;
>   
> -	status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> -	if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_PRESENT))
> -		return -ENODEV;
> -
>   	cpu_maps_update_begin();
>   	cpus_write_lock();

Since the status bits were checked before acpi_processor_add() is
called, do we need to remove the if (!acpi_device_is_enabled(device))
check in acpi_processor_add() as well?

Thanks
Hanjun
Rafael J. Wysocki April 23, 2024, 9:31 a.m. UTC | #11
On Tue, Apr 23, 2024 at 8:49 AM Hanjun Guo <guohanjun@huawei.com> wrote:
>
> On 2024/4/18 21:53, Jonathan Cameron wrote:
> > The ACPI bus scan will only result in acpi_processor_add() being called
> > if _STA has already been checked and the result is that the
> > processor is enabled and present.  Hence drop this additional check.
> >
> > Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >
> > ---
> > v7: No change
> > v6: New patch to drop this unnecessary code. Now I think we only
> >      need to explicitly read STA to print a warning in the ARM64
> >      arch_unregister_cpu() path where we want to know if the
> >      present bit has been unset as well.
> > ---
> >   drivers/acpi/acpi_processor.c | 6 ------
> >   1 file changed, 6 deletions(-)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 7fc924aeeed0..ba0a6f0ac841 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -186,17 +186,11 @@ static void __init acpi_pcc_cpufreq_init(void) {}
> >   #ifdef CONFIG_ACPI_HOTPLUG_CPU
> >   static int acpi_processor_hotadd_init(struct acpi_processor *pr)
> >   {
> > -     unsigned long long sta;
> > -     acpi_status status;
> >       int ret;
> >
> >       if (invalid_phys_cpuid(pr->phys_id))
> >               return -ENODEV;
> >
> > -     status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> > -     if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_PRESENT))
> > -             return -ENODEV;
> > -
> >       cpu_maps_update_begin();
> >       cpus_write_lock();
>
> Since the status bits were checked before acpi_processor_add() is
> called, do we need to remove the if (!acpi_device_is_enabled(device))
> check in acpi_processor_add() as well?

No, because its caller only checks the present bit.  The function
itself checks the enabled bit.
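
The split of responsibilities described here can be modelled with a small standalone sketch. This is a simplified illustration under stated assumptions, not the actual scan code: the `demo_*` functions and return values are hypothetical, and only the _STA bit positions (bit 0 present, bit 1 enabled) come from the ACPI specification.

```c
#include <assert.h>
#include <stdint.h>

/* _STA bits as defined by ACPI: bit 0 = present, bit 1 = enabled. */
#define DEMO_STA_PRESENT 0x01u
#define DEMO_STA_ENABLED 0x02u

/* Stand-in for the processor handler: it rejects devices that are not enabled. */
static int demo_processor_add(uint32_t sta)
{
	if (!(sta & DEMO_STA_ENABLED))
		return -1;	/* hypothetical error value */

	return 0;
}

/* Stand-in for the scan-side caller: it filters on the present bit only. */
static int demo_bus_attach(uint32_t sta)
{
	if (!(sta & DEMO_STA_PRESENT))
		return 1;	/* skipped, handler never invoked */

	return demo_processor_add(sta);
}
```

With this model, a present but not enabled device still reaches the handler, so dropping the enabled check there would let it through.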
Hanjun Guo April 23, 2024, 11:13 a.m. UTC | #12
On 2024/4/23 17:31, Rafael J. Wysocki wrote:
> On Tue, Apr 23, 2024 at 8:49 AM Hanjun Guo <guohanjun@huawei.com> wrote:
>>
>> On 2024/4/18 21:53, Jonathan Cameron wrote:
>>> The ACPI bus scan will only result in acpi_processor_add() being called
>>> if _STA has already been checked and the result is that the
>>> processor is enabled and present.  Hence drop this additional check.
>>>
>>> Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>
>>> ---
>>> v7: No change
>>> v6: New patch to drop this unnecessary code. Now I think we only
>>>       need to explicitly read STA to print a warning in the ARM64
>>>       arch_unregister_cpu() path where we want to know if the
>>>       present bit has been unset as well.
>>> ---
>>>    drivers/acpi/acpi_processor.c | 6 ------
>>>    1 file changed, 6 deletions(-)
>>>
>>> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
>>> index 7fc924aeeed0..ba0a6f0ac841 100644
>>> --- a/drivers/acpi/acpi_processor.c
>>> +++ b/drivers/acpi/acpi_processor.c
>>> @@ -186,17 +186,11 @@ static void __init acpi_pcc_cpufreq_init(void) {}
>>>    #ifdef CONFIG_ACPI_HOTPLUG_CPU
>>>    static int acpi_processor_hotadd_init(struct acpi_processor *pr)
>>>    {
>>> -     unsigned long long sta;
>>> -     acpi_status status;
>>>        int ret;
>>>
>>>        if (invalid_phys_cpuid(pr->phys_id))
>>>                return -ENODEV;
>>>
>>> -     status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
>>> -     if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_PRESENT))
>>> -             return -ENODEV;
>>> -
>>>        cpu_maps_update_begin();
>>>        cpus_write_lock();
>>
>> Since the status bits were checked before acpi_processor_add() is
>> called, do we need to remove the if (!acpi_device_is_enabled(device))
>> check in acpi_processor_add() as well?
> 
> No, because its caller only checks the present bit.  The function
> itself checks the enabled bit.

Thanks for the pointer, I can see the detail in acpi_bus_attach()
now.

Reviewed-by: Hanjun Guo <guohanjun@huawei.com>

Thanks
Hanjun
Marc Zyngier April 23, 2024, 12:01 p.m. UTC | #13
On Mon, 22 Apr 2024 11:40:20 +0100,
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> On Thu, 18 Apr 2024 14:54:07 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > From: James Morse <james.morse@arm.com>
> > 
> > To support virtual CPU hotplug, ACPI has added an 'online capable' bit
> > to the MADT GICC entries. This indicates a disabled CPU entry may not
> > be possible to online via PSCI until firmware has set enabled bit in
> > _STA.
> > 
> > This means that a "usable" GIC is one that is marked as either enabled,
> > or online capable. Therefore, change acpi_gicc_is_usable() to check both
> > bits. However, we need to change the test in gic_acpi_match_gicc() back
> > to testing just the enabled bit so the count of enabled distributors is
> > correct.
> > 
> > What about the redistributor in the GICC entry? ACPI doesn't want to say.
> > Assume the worst: When a redistributor is described in the GICC entry,
> > but the entry is marked as disabled at boot, assume the redistributor
> > is inaccessible.
> > 
> > The GICv3 driver doesn't support late online of redistributors, so this
> > means the corresponding CPU can't be brought online either. Clear the
> > possible and present bits.
> > 
> > Systems that want CPU hotplug in a VM can ensure their redistributors
> > are always-on, and describe them that way with a GICR entry in the MADT.
> > 
> > When mapping redistributors found via GICC entries, handle the case
> > where the arch code believes the CPU is present and possible, but it
> > does not have an accessible redistributor. Print a warning and clear
> > the present and possible bits.
> > 
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> +CC Marc,
> 
> Whilst this has been unchanged for a long time, I'm not 100% sure
> we've specifically drawn your attention to it before now.
> 
> Jonathan
> 
> > 
> > ---
> > v7: No Change.
> > ---
> >  drivers/irqchip/irq-gic-v3.c | 21 +++++++++++++++++++--
> >  include/linux/acpi.h         |  3 ++-
> >  2 files changed, 21 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > index 10af15f93d4d..66132251c1bb 100644
> > --- a/drivers/irqchip/irq-gic-v3.c
> > +++ b/drivers/irqchip/irq-gic-v3.c
> > @@ -2363,11 +2363,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> >  				(struct acpi_madt_generic_interrupt *)header;
> >  	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
> >  	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> > +	int cpu = get_cpu_for_acpi_id(gicc->uid);
> >  	void __iomem *redist_base;
> >  
> >  	if (!acpi_gicc_is_usable(gicc))
> >  		return 0;
> >  
> > +	/*
> > +	 * Capable but disabled CPUs can be brought online later. What about
> > +	 * the redistributor? ACPI doesn't want to say!
> > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > +	 * Otherwise, prevent such CPUs from being brought online.
> > +	 */
> > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > +		set_cpu_present(cpu, false);
> > +		set_cpu_possible(cpu, false);
> > +		return 0;
> > +	}

It seems dangerous to clear those this late in the game, given how
disconnected from the architecture code this is. Are we sure that
nothing has sampled these cpumasks beforehand?

Thanks,

	M.
Hanjun Guo April 23, 2024, 12:02 p.m. UTC | #14
On 2024/4/18 21:54, Jonathan Cameron wrote:
> Precursor patch adds the ability to pass a uintptr_t of flags into
> acpi_scan_check_and_detach() so that additional flags can be
> added to indicate whether to defer portions of the eject flow.
> The new flag follows in the next patch.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> ---
> v7: No change
> v6: Based on internal feedback switch to less invasive change
>      to using flags rather than a struct.
> ---
>   drivers/acpi/scan.c | 17 ++++++++++++-----
>   1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index d1464324de95..1ec9677e6c2d 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -244,13 +244,16 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
>   	return 0;
>   }
>   
> -static int acpi_scan_check_and_detach(struct acpi_device *adev, void *check)
> +#define ACPI_SCAN_CHECK_FLAG_STATUS	BIT(0)
> +
> +static int acpi_scan_check_and_detach(struct acpi_device *adev, void *p)
>   {
>   	struct acpi_scan_handler *handler = adev->handler;
> +	uintptr_t flags = (uintptr_t)p;
>   
> -	acpi_dev_for_each_child_reverse(adev, acpi_scan_check_and_detach, check);
> +	acpi_dev_for_each_child_reverse(adev, acpi_scan_check_and_detach, p);
>   
> -	if (check) {
> +	if (flags & ACPI_SCAN_CHECK_FLAG_STATUS) {
>   		acpi_bus_get_status(adev);
>   		/*
>   		 * Skip devices that are still there and take the enabled
> @@ -288,7 +291,9 @@ static int acpi_scan_check_and_detach(struct acpi_device *adev, void *check)
>   
>   static void acpi_scan_check_subtree(struct acpi_device *adev)
>   {
> -	acpi_scan_check_and_detach(adev, (void *)true);
> +	uintptr_t flags = ACPI_SCAN_CHECK_FLAG_STATUS;
> +
> +	acpi_scan_check_and_detach(adev, (void *)flags);
>   }
>   
>   static int acpi_scan_hot_remove(struct acpi_device *device)
> @@ -2601,7 +2606,9 @@ EXPORT_SYMBOL(acpi_bus_scan);
>    */
>   void acpi_bus_trim(struct acpi_device *adev)
>   {
> -	acpi_scan_check_and_detach(adev, NULL);
> +	uintptr_t flags = 0;
> +
> +	acpi_scan_check_and_detach(adev, (void *)flags);
>   }
>   EXPORT_SYMBOL_GPL(acpi_bus_trim);

Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Hanjun Guo April 23, 2024, 12:10 p.m. UTC | #15
On 2024/4/18 21:54, Jonathan Cameron wrote:
> From: James Morse <james.morse@arm.com>
> 
> ACPI identifies CPUs by UID. get_cpu_for_acpi_id() maps the ACPI UID
> to the Linux CPU number.
> 
> The helper to retrieve this mapping is only available in arm64's NUMA
> code.
> 
> Move it to live next to get_acpi_id_for_cpu().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Looks good to me,

Acked-by: Hanjun Guo <guohanjun@huawei.com>

Thanks
Hanjun
Jonathan Cameron April 24, 2024, 12:54 p.m. UTC | #16
On Tue, 23 Apr 2024 13:01:21 +0100
Marc Zyngier <maz@kernel.org> wrote:

> On Mon, 22 Apr 2024 11:40:20 +0100,
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> > 
> > On Thu, 18 Apr 2024 14:54:07 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > From: James Morse <james.morse@arm.com>
> > > 
> > > To support virtual CPU hotplug, ACPI has added an 'online capable' bit
> > > to the MADT GICC entries. This indicates a disabled CPU entry may not
> > > be possible to online via PSCI until firmware has set enabled bit in
> > > _STA.
> > > 
> > > This means that a "usable" GIC is one that is marked as either enabled,
> > > or online capable. Therefore, change acpi_gicc_is_usable() to check both
> > > bits. However, we need to change the test in gic_acpi_match_gicc() back
> > > to testing just the enabled bit so the count of enabled distributors is
> > > correct.
> > > 
> > > What about the redistributor in the GICC entry? ACPI doesn't want to say.
> > > Assume the worst: When a redistributor is described in the GICC entry,
> > > but the entry is marked as disabled at boot, assume the redistributor
> > > is inaccessible.
> > > 
> > > The GICv3 driver doesn't support late online of redistributors, so this
> > > means the corresponding CPU can't be brought online either. Clear the
> > > possible and present bits.
> > > 
> > > Systems that want CPU hotplug in a VM can ensure their redistributors
> > > are always-on, and describe them that way with a GICR entry in the MADT.
> > > 
> > > When mapping redistributors found via GICC entries, handle the case
> > > where the arch code believes the CPU is present and possible, but it
> > > does not have an accessible redistributor. Print a warning and clear
> > > the present and possible bits.
> > > 
> > > Signed-off-by: James Morse <james.morse@arm.com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>  
> > 
> > +CC Marc,
> > 
> > Whilst this has been unchanged for a long time, I'm not 100% sure
> > we've specifically drawn your attention to it before now.
> > 
> > Jonathan
> >   
> > > 
> > > ---
> > > v7: No Change.
> > > ---
> > >  drivers/irqchip/irq-gic-v3.c | 21 +++++++++++++++++++--
> > >  include/linux/acpi.h         |  3 ++-
> > >  2 files changed, 21 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > index 10af15f93d4d..66132251c1bb 100644
> > > --- a/drivers/irqchip/irq-gic-v3.c
> > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > @@ -2363,11 +2363,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> > >  				(struct acpi_madt_generic_interrupt *)header;
> > >  	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
> > >  	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> > > +	int cpu = get_cpu_for_acpi_id(gicc->uid);
> > >  	void __iomem *redist_base;
> > >  
> > >  	if (!acpi_gicc_is_usable(gicc))
> > >  		return 0;
> > >  
> > > +	/*
> > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > +	 * the redistributor? ACPI doesn't want to say!
> > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > +	 */
> > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > +		set_cpu_present(cpu, false);
> > > +		set_cpu_possible(cpu, false);
> > > +		return 0;
> > > +	}  
> 
> It seems dangerous to clear those this late in the game, given how
> disconnected from the architecture code this is. Are we sure that
> nothing has sampled these cpumasks beforehand?

Hi Marc,

Any firmware that does this is being considered as buggy already
but given it is firmware and the spec doesn't say much about this,
there is always the possibility.

Not much happens between the point where these are set up and
the point where the GIC initializes and this code runs, but even if careful
review showed it was fine today, it would be fragile to future changes.

I'm not sure there is a huge disadvantage for such broken firmware in
clearing these masks from the point of view of what is used throughout
the rest of the kernel. Here I think we are just looking to prevent the CPU
being onlined later.

We could add a set_cpu_broken() with appropriate mask.
Given this is very arm64 specific I'm not sure Rafael will be keen on
us checking such a mask in the generic ACPI code, but we could check it in
arch_register_cpu() and just not register the cpu if it matches.
That will cover the vCPU hotplug case.

Does that sound sensible, or would you prefer something else?

Jonathan

> 
> Thanks,
> 
> 	M.
>
Marc Zyngier April 24, 2024, 3:33 p.m. UTC | #17
On Wed, 24 Apr 2024 13:54:38 +0100,
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> On Tue, 23 Apr 2024 13:01:21 +0100
> Marc Zyngier <maz@kernel.org> wrote:
> 
> > On Mon, 22 Apr 2024 11:40:20 +0100,
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> > > 
> > > On Thu, 18 Apr 2024 14:54:07 +0100
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

[...]

> > >   
> > > > +	/*
> > > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > > +	 * the redistributor? ACPI doesn't want to say!
> > > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > > +	 */
> > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > > +		set_cpu_present(cpu, false);
> > > > +		set_cpu_possible(cpu, false);
> > > > +		return 0;
> > > > +	}  
> > 
> > It seems dangerous to clear those this late in the game, given how
> > disconnected from the architecture code this is. Are we sure that
> > nothing has sampled these cpumasks beforehand?
> 
> Hi Marc,
> 
> Any firmware that does this is being considered as buggy already
> but given it is firmware and the spec doesn't say much about this,
> there is always the possibility.

There is no shortage of broken firmware out there, and I expect this
trend to progress.

> Not much happens between the point where these are set up and
> the point where the GIC initializes and this code runs, but even if careful
> review showed it was fine today, it would be fragile to future changes.
> 
> I'm not sure there is a huge disadvantage for such broken firmware in
> clearing these masks from the point of view of what is used throughout
> the rest of the kernel. Here I think we are just looking to prevent the CPU
> being onlined later.

I totally agree on the goal, I simply question the way you get to it.

> 
> We could add a set_cpu_broken() with appropriate mask.
> Given this is very arm64 specific I'm not sure Rafael will be keen on
> us checking such a mask in the generic ACPI code, but we could check it in
> arch_register_cpu() and just not register the cpu if it matches.
> That will cover the vCPU hotplug case.
>
> Does that sound sensible, or would you prefer something else?


Such a 'broken_rdists' mask is exactly what I have in mind, just
keeping it private to the GIC driver, and not exposing it anywhere else.
You can then fail the hotplug event early, and avoid changing the
global masks from within the GIC driver. At least, we don't mess with
the internals of the kernel, and the CPU is properly marked as dead
(that mechanism should already work).

I'd expect the handling side to look like this (will not compile, but
you'll get the idea):

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 6fb276504bcc..e8f02bfd0e21 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1009,6 +1009,9 @@ static int __gic_populate_rdist(struct redist_region *region, void __iomem *ptr)
 	u64 typer;
 	u32 aff;
 
+	if (cpumask_test_cpu(smp_processor_id(), &broken_rdists))
+		return 1;
+
 	/*
 	 * Convert affinity to a 32bit value that can be matched to
 	 * GICR_TYPER bits [63:32].
@@ -1260,14 +1263,15 @@ static int gic_dist_supports_lpis(void)
 		!gicv3_nolpi);
 }
 
-static void gic_cpu_init(void)
+static int gic_cpu_init(void)
 {
 	void __iomem *rbase;
-	int i;
+	int ret, i;
 
 	/* Register ourselves with the rest of the world */
-	if (gic_populate_rdist())
-		return;
+	ret = gic_populate_rdist();
+	if (ret)
+		return ret;
 
 	gic_enable_redist(true);
 
@@ -1286,6 +1290,8 @@ static void gic_cpu_init(void)
 
 	/* initialise system registers */
 	gic_cpu_sys_reg_init();
+
+	return 0;
 }
 
 #ifdef CONFIG_SMP
@@ -1295,7 +1301,11 @@ static void gic_cpu_init(void)
 
 static int gic_starting_cpu(unsigned int cpu)
 {
-	gic_cpu_init();
+	int ret;
+
+	ret = gic_cpu_init();
+	if (ret)
+		return ret;
 
 	if (gic_dist_supports_lpis())
 		its_cpu_init();
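
The same idea can be shown as a self-contained userspace sketch, assuming a plain 64-bit bitmap in place of a kernel cpumask. All `demo_*` names are hypothetical, and returning -ENODEV from the populate step is an assumption of this sketch rather than something the diff above specifies. The point is only that a driver-private mask makes the per-CPU bring-up step fail early, without touching the global present and possible masks.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_NR_CPUS 64

/* Driver-private record of CPUs whose redistributor is unusable. */
static uint64_t demo_broken_rdists;

/* Called at table-parse time instead of clearing global CPU masks. */
static void demo_mark_rdist_broken(unsigned int cpu)
{
	assert(cpu < DEMO_NR_CPUS);
	demo_broken_rdists |= (uint64_t)1 << cpu;
}

/* Fails for a CPU that was marked broken earlier. */
static int demo_populate_rdist(unsigned int cpu)
{
	assert(cpu < DEMO_NR_CPUS);
	if (demo_broken_rdists & ((uint64_t)1 << cpu))
		return -ENODEV;

	return 0;
}

/* Per-CPU bring-up step: propagates the failure so onlining is refused. */
static int demo_starting_cpu(unsigned int cpu)
{
	int ret = demo_populate_rdist(cpu);

	if (ret)
		return ret;

	/* The remaining per-CPU initialisation would go here. */
	return 0;
}
```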

But the question is: do you rely on these masks having been
"corrected" anywhere else?

Thanks,

	M.
Salil Mehta April 24, 2024, 4:35 p.m. UTC | #18
>  From: Marc Zyngier <maz@kernel.org>
>  Sent: Wednesday, April 24, 2024 4:33 PM
>  To: Jonathan Cameron <jonathan.cameron@huawei.com>
>  Cc: Thomas Gleixner <tglx@linutronix.de>; Peter Zijlstra
>  <peterz@infradead.org>; linux-pm@vger.kernel.org;
>  loongarch@lists.linux.dev; linux-acpi@vger.kernel.org; linux-
>  arch@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
>  kernel@lists.infradead.org; kvmarm@lists.linux.dev; x86@kernel.org;
>  Russell King <linux@armlinux.org.uk>; Rafael J . Wysocki
>  <rafael@kernel.org>; Miguel Luis <miguel.luis@oracle.com>; James Morse
>  <james.morse@arm.com>; Salil Mehta <salil.mehta@huawei.com>; Jean-
>  Philippe Brucker <jean-philippe@linaro.org>; Catalin Marinas
>  <catalin.marinas@arm.com>; Will Deacon <will@kernel.org>; Linuxarm
>  <linuxarm@huawei.com>; Ingo Molnar <mingo@redhat.com>; Borislav
>  Petkov <bp@alien8.de>; Dave Hansen <dave.hansen@linux.intel.com>;
>  justin.he@arm.com; jianyong.wu@arm.com
>  Subject: Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's
>  disabled but 'online capable' CPUs
>  
>  On Wed, 24 Apr 2024 13:54:38 +0100,
>  Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>  >
>  > On Tue, 23 Apr 2024 13:01:21 +0100
>  > Marc Zyngier <maz@kernel.org> wrote:
>  >
>  > > On Mon, 22 Apr 2024 11:40:20 +0100,
>  > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>  > > >
>  > > > On Thu, 18 Apr 2024 14:54:07 +0100 Jonathan Cameron
>  > > > <Jonathan.Cameron@huawei.com> wrote:
>  
>  [...]
>  
>  > > >
>  > > > > +	/*
>  > > > > +	 * Capable but disabled CPUs can be brought online later.  What about
>  > > > > +	 * the redistributor? ACPI doesn't want to say!
>  > > > > +	 * Virtual hotplug systems can use the MADT's "always-on"  GICR entries.
>  > > > > +	 * Otherwise, prevent such CPUs from being brought online.
>  > > > > +	 */
>  > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
>  > > > > +		pr_warn_once("CPU %u's redistributor is  inaccessible: this CPU can't be brought online\n", cpu);
>  > > > > +		set_cpu_present(cpu, false);
>  > > > > +		set_cpu_possible(cpu, false);

(a digression) shouldn't we be clearing the enabled mask as well?

                                          set_cpu_enabled(cpu, false);


Best regards
Salil
Jonathan Cameron April 24, 2024, 5:08 p.m. UTC | #19
On Wed, 24 Apr 2024 17:35:54 +0100
Salil Mehta <salil.mehta@huawei.com> wrote:

> >  From: Marc Zyngier <maz@kernel.org>
> >  Sent: Wednesday, April 24, 2024 4:33 PM
> >  To: Jonathan Cameron <jonathan.cameron@huawei.com>
> >  Cc: Thomas Gleixner <tglx@linutronix.de>; Peter Zijlstra
> >  <peterz@infradead.org>; linux-pm@vger.kernel.org;
> >  loongarch@lists.linux.dev; linux-acpi@vger.kernel.org; linux-
> >  arch@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> >  kernel@lists.infradead.org; kvmarm@lists.linux.dev; x86@kernel.org;
> >  Russell King <linux@armlinux.org.uk>; Rafael J . Wysocki
> >  <rafael@kernel.org>; Miguel Luis <miguel.luis@oracle.com>; James Morse
> >  <james.morse@arm.com>; Salil Mehta <salil.mehta@huawei.com>; Jean-
> >  Philippe Brucker <jean-philippe@linaro.org>; Catalin Marinas
> >  <catalin.marinas@arm.com>; Will Deacon <will@kernel.org>; Linuxarm
> >  <linuxarm@huawei.com>; Ingo Molnar <mingo@redhat.com>; Borislav
> >  Petkov <bp@alien8.de>; Dave Hansen <dave.hansen@linux.intel.com>;
> >  justin.he@arm.com; jianyong.wu@arm.com
> >  Subject: Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's
> >  disabled but 'online capable' CPUs
> >  
> >  On Wed, 24 Apr 2024 13:54:38 +0100,
> >  Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:  
> >  >
> >  > On Tue, 23 Apr 2024 13:01:21 +0100
> >  > Marc Zyngier <maz@kernel.org> wrote:
> >  >  
> >  > > On Mon, 22 Apr 2024 11:40:20 +0100,
> >  > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:  
> >  > > >
> >  > > > On Thu, 18 Apr 2024 14:54:07 +0100 Jonathan Cameron
> >  > > > <Jonathan.Cameron@huawei.com> wrote:  
> >  
> >  [...]
> >    
> >  > > >  
> >  > > > > +	/*
> >  > > > > +	 * Capable but disabled CPUs can be brought online later.  What about
> >  > > > > +	 * the redistributor? ACPI doesn't want to say!
> >  > > > > +	 * Virtual hotplug systems can use the MADT's "always-on"  GICR entries.
> >  > > > > +	 * Otherwise, prevent such CPUs from being brought online.
> >  > > > > +	 */
> >  > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> >  > > > > +		pr_warn_once("CPU %u's redistributor is  inaccessible: this CPU can't be brought online\n", cpu);
> >  > > > > +		set_cpu_present(cpu, false);
> >  > > > > +		set_cpu_possible(cpu, false);  
> 
> (a digression) shouldn't we be clearing the enabled mask as well?
> 
>                                           set_cpu_enabled(cpu, false);

FWIW I think that is not necessary. enabled is only set in register_cpu(), and the aim
here is never to call that for CPUs in this state.
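
A minimal userspace sketch of that reasoning, not kernel code. The toy_* names and the 64-bit masks are assumptions made for illustration; the only point modelled is that if the enabled bit is set solely by the registration path, a CPU that is never registered never needs it cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the present and enabled cpumasks (at most 64 CPUs). */
struct toy_masks {
	uint64_t present;
	uint64_t enabled;
};

/* Stand-in for register_cpu(): the only place this model sets 'enabled'. */
static void toy_register_cpu(struct toy_masks *m, unsigned int cpu)
{
	assert(cpu < 64);
	m->enabled |= UINT64_C(1) << cpu;
}

/* Hypothetical registration pass that skips CPUs not marked present. */
static void toy_register_present_cpus(struct toy_masks *m)
{
	for (unsigned int cpu = 0; cpu < 64; cpu++)
		if (m->present & (UINT64_C(1) << cpu))
			toy_register_cpu(m, cpu);
}

/* Query helper mirroring the shape of a cpu_enabled()-style test. */
static bool toy_cpu_enabled(const struct toy_masks *m, unsigned int cpu)
{
	assert(cpu < 64);
	return m->enabled & (UINT64_C(1) << cpu);
}
```

In this model, clearing a CPU's present bit before the registration pass is enough to leave its enabled bit clear, with no explicit clear needed.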

Anyhow, I got distracted by the firmware bug I found whilst trying to test this, but
I now have a test setup that hits this path (once deliberately broken), so I will
see what we can do that doesn't affect those masks.

Jonathan


> 
> 
> Best regards
> Salil
Jonathan Cameron April 25, 2024, 9:28 a.m. UTC | #20
On Wed, 24 Apr 2024 13:54:38 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Tue, 23 Apr 2024 13:01:21 +0100
> Marc Zyngier <maz@kernel.org> wrote:
> 
> > On Mon, 22 Apr 2024 11:40:20 +0100,
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:  
> > > 
> > > On Thu, 18 Apr 2024 14:54:07 +0100
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > >     
> > > > From: James Morse <james.morse@arm.com>
> > > > 
> > > > To support virtual CPU hotplug, ACPI has added an 'online capable' bit
> > > > to the MADT GICC entries. This indicates a disabled CPU entry may not
> > > > be possible to online via PSCI until firmware has set enabled bit in
> > > > _STA.
> > > > 
> > > > This means that a "usable" GIC is one that is marked as either enabled,
> > > > or online capable. Therefore, change acpi_gicc_is_usable() to check both
> > > > bits. However, we need to change the test in gic_acpi_match_gicc() back
> > > > to testing just the enabled bit so the count of enabled distributors is
> > > > correct.
> > > > 
> > > > What about the redistributor in the GICC entry? ACPI doesn't want to say.
> > > > Assume the worst: When a redistributor is described in the GICC entry,
> > > > but the entry is marked as disabled at boot, assume the redistributor
> > > > is inaccessible.
> > > > 
> > > > The GICv3 driver doesn't support late online of redistributors, so this
> > > > means the corresponding CPU can't be brought online either. Clear the
> > > > possible and present bits.
> > > > 
> > > > Systems that want CPU hotplug in a VM can ensure their redistributors
> > > > are always-on, and describe them that way with a GICR entry in the MADT.
> > > > 
> > > > When mapping redistributors found via GICC entries, handle the case
> > > > where the arch code believes the CPU is present and possible, but it
> > > > does not have an accessible redistributor. Print a warning and clear
> > > > the present and possible bits.
> > > > 
> > > > Signed-off-by: James Morse <james.morse@arm.com>
> > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>    
> > > 
> > > +CC Marc,
> > > 
> > > Whilst this has been unchanged for a long time, I'm not 100% sure
> > > we've specifically drawn your attention to it before now.
> > > 
> > > Jonathan
> > >     
> > > > 
> > > > ---
> > > > v7: No Change.
> > > > ---
> > > >  drivers/irqchip/irq-gic-v3.c | 21 +++++++++++++++++++--
> > > >  include/linux/acpi.h         |  3 ++-
> > > >  2 files changed, 21 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > > index 10af15f93d4d..66132251c1bb 100644
> > > > --- a/drivers/irqchip/irq-gic-v3.c
> > > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > > @@ -2363,11 +2363,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> > > >  				(struct acpi_madt_generic_interrupt *)header;
> > > >  	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
> > > >  	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> > > > +	int cpu = get_cpu_for_acpi_id(gicc->uid);
> > > >  	void __iomem *redist_base;
> > > >  
> > > >  	if (!acpi_gicc_is_usable(gicc))
> > > >  		return 0;
> > > >  
> > > > +	/*
> > > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > > +	 * the redistributor? ACPI doesn't want to say!
> > > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > > +	 */
> > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > > +		set_cpu_present(cpu, false);
> > > > +		set_cpu_possible(cpu, false);
> > > > +		return 0;
> > > > +	}    
> > 
> > It seems dangerous to clear those this late in the game, given how
> > disconnected from the architecture code this is. Are we sure that
> > nothing has sampled these cpumasks beforehand?  
> 
> Hi Marc,
> 
> Any firmware that does this is being considered as buggy already
> but given it is firmware and the spec doesn't say much about this,
> there is always the possibility.
> 
> Not much happens between the point where these are setup and
> the point where the GIC inits and this code runs, but even if careful
> review showed it was fine today, it will be fragile to future changes.
> 
> I'm not sure there is a huge disadvantage for such broken firmware in
> clearing these masks from the point of view of what is used throughout
> the rest of the kernel. Here I think we are just looking to prevent the CPU
> being onlined later.
> 
> We could add a set_cpu_broken() with appropriate mask.
> Given this is very arm64 specific I'm not sure Rafael will be keen on
> us checking such a mask in the generic ACPI code, but we could check it in
> arch_register_cpu() and just not register the cpu if it matches.
> That will cover the vCPU hotplug case.
> 
> Does that sound sensible, or would you prefer something else?

Hi Marc

Some experiments later (faking this on a physical board - I never liked
CPU 120 anyway!) and using a different mask brings its own minor pain.

When all the rest of the CPUs are brought up, cpuhp_bringup_mask() is called
on cpu_present_mask, so we need to do a dance in there to use a temporary
mask with broken CPUs removed.  I think it makes sense to cut those out
at the top of the cpuhp_bringup_mask() pile of actions rather than trying
to paper over each actual thing that is dying... (looks like an infinite loop
somewhere but I haven't tracked down where yet).
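
The temporary-mask dance described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: toy_cpumask_t, toy_mask_of(), toy_bringup_candidates() and toy_bringup() are hypothetical names, a 64-bit integer stands in for struct cpumask, and the loop is a simplified model of a serialized bringup rather than the real cpuhp_bringup_mask().

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_cpumask_t;

/* Single-bit mask for one CPU. */
static toy_cpumask_t toy_mask_of(unsigned int cpu)
{
	assert(cpu < 64);
	return (toy_cpumask_t)1 << cpu;
}

/* Same idea as an and-not of two masks: present CPUs minus broken ones. */
static toy_cpumask_t toy_bringup_candidates(toy_cpumask_t present,
					    toy_cpumask_t broken)
{
	return present & ~broken;
}

/*
 * Simplified serialized bringup: walk the candidate mask in CPU order,
 * mark each CPU online, and stop once max_cpus CPUs have been started.
 * Returns the number of CPUs started.
 */
static unsigned int toy_bringup(toy_cpumask_t candidates,
				unsigned int max_cpus,
				toy_cpumask_t *online)
{
	unsigned int cpu, started = 0;

	for (cpu = 0; cpu < 64 && started < max_cpus; cpu++) {
		if (candidates & toy_mask_of(cpu)) {
			*online |= toy_mask_of(cpu);
			started++;
		}
	}
	return started;
}
```

Filtering once, before the loop, means nothing further down the bringup path has to know about broken CPUs, which is the design choice argued for above.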

I'll spin a patch so you can see what it looks like, but my concern is
we are just moving the risk from early users of these masks to later cases
where code assumes cpu_present_mask definitely means they are present.
That is probably a small set of cases but not nice either.

Looks like one of those cases where we need to pick the lesser of two evils
which is probably still the cpu_broken_mask approach.

On the plus side, if we decide to go back to the original approach having seen
that, I already have the code :)

Jonathan



> 
> Jonathan
> 
> 
> 
> 
> 
> 
> 
> > 
> > Thanks,
> > 
> > 	M.
> >   
> 
> 
Jonathan Cameron April 25, 2024, 9:56 a.m. UTC | #21
On Thu, 25 Apr 2024 10:28:06 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Wed, 24 Apr 2024 13:54:38 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Tue, 23 Apr 2024 13:01:21 +0100
> > Marc Zyngier <maz@kernel.org> wrote:
> >   
> > > On Mon, 22 Apr 2024 11:40:20 +0100,
> > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:    
> > > > 
> > > > On Thu, 18 Apr 2024 14:54:07 +0100
> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > > >       
> > > > > From: James Morse <james.morse@arm.com>
> > > > > 
> > > > > To support virtual CPU hotplug, ACPI has added an 'online capable' bit
> > > > > to the MADT GICC entries. This indicates a disabled CPU entry may not
> > > > > be possible to online via PSCI until firmware has set enabled bit in
> > > > > _STA.
> > > > > 
> > > > > This means that a "usable" GIC is one that is marked as either enabled,
> > > > > or online capable. Therefore, change acpi_gicc_is_usable() to check both
> > > > > bits. However, we need to change the test in gic_acpi_match_gicc() back
> > > > > to testing just the enabled bit so the count of enabled distributors is
> > > > > correct.
> > > > > 
> > > > > What about the redistributor in the GICC entry? ACPI doesn't want to say.
> > > > > Assume the worst: When a redistributor is described in the GICC entry,
> > > > > but the entry is marked as disabled at boot, assume the redistributor
> > > > > is inaccessible.
> > > > > 
> > > > > The GICv3 driver doesn't support late online of redistributors, so this
> > > > > means the corresponding CPU can't be brought online either. Clear the
> > > > > possible and present bits.
> > > > > 
> > > > > Systems that want CPU hotplug in a VM can ensure their redistributors
> > > > > are always-on, and describe them that way with a GICR entry in the MADT.
> > > > > 
> > > > > When mapping redistributors found via GICC entries, handle the case
> > > > > where the arch code believes the CPU is present and possible, but it
> > > > > does not have an accessible redistributor. Print a warning and clear
> > > > > the present and possible bits.
> > > > > 
> > > > > Signed-off-by: James Morse <james.morse@arm.com>
> > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>      
> > > > 
> > > > +CC Marc,
> > > > 
> > > > Whilst this has been unchanged for a long time, I'm not 100% sure
> > > > we've specifically drawn your attention to it before now.
> > > > 
> > > > Jonathan
> > > >       
> > > > > 
> > > > > ---
> > > > > v7: No Change.
> > > > > ---
> > > > >  drivers/irqchip/irq-gic-v3.c | 21 +++++++++++++++++++--
> > > > >  include/linux/acpi.h         |  3 ++-
> > > > >  2 files changed, 21 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > > > index 10af15f93d4d..66132251c1bb 100644
> > > > > --- a/drivers/irqchip/irq-gic-v3.c
> > > > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > > > @@ -2363,11 +2363,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> > > > >  				(struct acpi_madt_generic_interrupt *)header;
> > > > >  	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
> > > > >  	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> > > > > +	int cpu = get_cpu_for_acpi_id(gicc->uid);
> > > > >  	void __iomem *redist_base;
> > > > >  
> > > > >  	if (!acpi_gicc_is_usable(gicc))
> > > > >  		return 0;
> > > > >  
> > > > > +	/*
> > > > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > > > +	 * the redistributor? ACPI doesn't want to say!
> > > > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > > > +	 */
> > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > > > +		set_cpu_present(cpu, false);
> > > > > +		set_cpu_possible(cpu, false);
> > > > > +		return 0;
> > > > > +	}      
> > > 
> > > It seems dangerous to clear those this late in the game, given how
> > > disconnected from the architecture code this is. Are we sure that
> > > nothing has sampled these cpumasks beforehand?    
> > 
> > Hi Marc,
> > 
> > Any firmware that does this is being considered as buggy already
> > but given it is firmware and the spec doesn't say much about this,
> > there is always the possibility.
> > 
> > Not much happens between the point where these are setup and
> > the point where the GIC inits and this code runs, but even if careful
> > review showed it was fine today, it will be fragile to future changes.
> > 
> > I'm not sure there is a huge disadvantage for such broken firmware in
> > clearing these masks from the point of view of what is used throughout
> > the rest of the kernel. Here I think we are just looking to prevent the CPU
> > being onlined later.
> > 
> > We could add a set_cpu_broken() with appropriate mask.
> > Given this is very arm64 specific I'm not sure Rafael will be keen on
> > us checking such a mask in the generic ACPI code, but we could check it in
> > arch_register_cpu() and just not register the cpu if it matches.
> > That will cover the vCPU hotplug case.
> > 
> > Does that sound sensible, or would you prefer something else?
> 
> Hi Marc
> 
> Some experiments later (faking this on a physical board - I never liked
> CPU 120 anyway!) and using a different mask brings its own minor pain.
> 
> When all the rest of the CPUs are brought up cpuhp_bringup_mask() is called
> on cpu_present_mask so we need to do a dance in there to use a temporary
> mask with broken cpus removed.  I think it makes sense to cut that out
> at the top of the cpuhp_bringup_mask() pile of actions rather than trying
> to paper over each actual thing that is dying... (looks like an infinite loop
> somewhere but I haven't tracked down where yet).
> 
> I'll spin a patch so you can see what it looks like, but my concern is
> we are just moving the risk from early users of these masks to later cases
> where code assumes cpu_present_mask definitely means they are present.
> That is probably a small set of cases but not nice either.
> 
> Looks like one of those cases where we need to pick the lesser of two evils
> which is probably still the cpu_broken_mask approach.
> 
> On plus side if we decide to go back to the original approach having seen
> that I already have the code :)
> 
> Jonathan
> 

Patch on top of this series.  If no one shouts before I have it ready, I'll
roll a v8 with the mask introduction as a new patch and the other changes pushed into
the appropriate patches.

From 361b76f36bfb4ff74fdceca7ebf14cfa43cae4a9 Mon Sep 17 00:00:00 2001
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Date: Wed, 24 Apr 2024 17:42:49 +0100
Subject: [PATCH] cpu: Add broken cpu mask to mark CPUs where inconsistent
 firmware means we can't start them.

On ARM64, it is not currently possible to use CPUs where the GICC entry
in ACPI specifies that it is online capable but not enabled. Only
always enabled entries are supported.

Previously if this condition was met, the present and possible cpu masks
were cleared for the relevant cpus.  However, those masks may already
have been used by other code so this is not known to be safe.

An alternative is to use an additional mask (broken) and check it in the
subset of places where these CPUs might be onlined, or where the
infrastructure indicating that onlining is possible is created.
Specifically in bringup_nonboot_cpus() and in arch_register_cpu().

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 arch/arm64/kernel/smp.c      |  3 +++
 drivers/irqchip/irq-gic-v3.c |  3 +--
 include/linux/cpumask.h      | 19 +++++++++++++++++++
 kernel/cpu.c                 |  8 +++++++-
 4 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ccb6ad347df9..39cd6a7c40d8 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -513,6 +513,9 @@ int arch_register_cpu(int cpu)
 	    IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
 		return -EPROBE_DEFER;
 
+	if (cpu_broken(cpu)) /* Inconsistent firmware - can't online */
+		return -ENODEV;
+
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 	/* For now block anything that looks like physical CPU Hotplug */
 	if (invalid_logical_cpuid(cpu) || !cpu_present(cpu)) {
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 66132251c1bb..a0063eb6484d 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2377,8 +2377,7 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
 	 */
 	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
 		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
-		set_cpu_present(cpu, false);
-		set_cpu_possible(cpu, false);
+		set_cpu_broken(cpu);
 		return 0;
 	}
 
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 4b202b94c97a..70a93ad8e590 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -96,6 +96,7 @@ static inline void set_nr_cpu_ids(unsigned int nr)
  *     cpu_enabled_mask  - has bit 'cpu' set iff cpu can be brought online
  *     cpu_online_mask  - has bit 'cpu' set iff cpu available to scheduler
  *     cpu_active_mask  - has bit 'cpu' set iff cpu available to migration
+ *     cpu_broken_mask  - has bit 'cpu' set iff the cpu should never be onlined
  *
  *  If !CONFIG_HOTPLUG_CPU, present == possible, and active == online.
  *
@@ -130,12 +131,14 @@ extern struct cpumask __cpu_enabled_mask;
 extern struct cpumask __cpu_present_mask;
 extern struct cpumask __cpu_active_mask;
 extern struct cpumask __cpu_dying_mask;
+extern struct cpumask __cpu_broken_mask;
 #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
 #define cpu_online_mask   ((const struct cpumask *)&__cpu_online_mask)
 #define cpu_enabled_mask   ((const struct cpumask *)&__cpu_enabled_mask)
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 #define cpu_dying_mask    ((const struct cpumask *)&__cpu_dying_mask)
+#define cpu_broken_mask   ((const struct cpumask *)&__cpu_broken_mask)
 
 extern atomic_t __num_online_cpus;
 
@@ -1073,6 +1076,12 @@ set_cpu_dying(unsigned int cpu, bool dying)
 		cpumask_clear_cpu(cpu, &__cpu_dying_mask);
 }
 
+static inline void
+set_cpu_broken(unsigned int cpu)
+{
+	cpumask_set_cpu(cpu, &__cpu_broken_mask);
+}
+
 /**
  * to_cpumask - convert a NR_CPUS bitmap to a struct cpumask *
  * @bitmap: the bitmap
@@ -1159,6 +1168,11 @@ static inline bool cpu_dying(unsigned int cpu)
 	return cpumask_test_cpu(cpu, cpu_dying_mask);
 }
 
+static inline bool cpu_broken(unsigned int cpu)
+{
+	return cpumask_test_cpu(cpu, cpu_broken_mask);
+}
+
 #else
 
 #define num_online_cpus()	1U
@@ -1197,6 +1211,11 @@ static inline bool cpu_dying(unsigned int cpu)
 	return false;
 }
 
+static inline bool cpu_broken(unsigned int cpu)
+{
+	return false;
+}
+
 #endif /* NR_CPUS > 1 */
 
 #define cpu_is_offline(cpu)	unlikely(!cpu_online(cpu))
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 537099bf5d02..f8b73a11869e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1907,12 +1907,15 @@ static inline bool cpuhp_bringup_cpus_parallel(unsigned int ncpus) { return fals
 
 void __init bringup_nonboot_cpus(unsigned int max_cpus)
 {
+	static struct cpumask tmp_mask __initdata;
+
 	/* Try parallel bringup optimization if enabled */
 	if (cpuhp_bringup_cpus_parallel(max_cpus))
 		return;
 
+	cpumask_andnot(&tmp_mask, cpu_present_mask, cpu_broken_mask);
 	/* Full per CPU serialized bringup */
-	cpuhp_bringup_mask(cpu_present_mask, max_cpus, CPUHP_ONLINE);
+	cpuhp_bringup_mask(&tmp_mask, max_cpus, CPUHP_ONLINE);
 }
 
 #ifdef CONFIG_PM_SLEEP_SMP
@@ -3129,6 +3132,9 @@ EXPORT_SYMBOL(__cpu_active_mask);
 struct cpumask __cpu_dying_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_dying_mask);
 
+struct cpumask __cpu_broken_mask __ro_after_init;
+EXPORT_SYMBOL(__cpu_broken_mask);
+
 atomic_t __num_online_cpus __read_mostly;
 EXPORT_SYMBOL(__num_online_cpus);
Jonathan Cameron April 25, 2024, 10:13 a.m. UTC | #22
On Thu, 25 Apr 2024 10:56:37 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 25 Apr 2024 10:28:06 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Wed, 24 Apr 2024 13:54:38 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >   
> > > On Tue, 23 Apr 2024 13:01:21 +0100
> > > Marc Zyngier <maz@kernel.org> wrote:
> > >     
> > > > On Mon, 22 Apr 2024 11:40:20 +0100,
> > > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:      
> > > > > 
> > > > > On Thu, 18 Apr 2024 14:54:07 +0100
> > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > > > >         
> > > > > > From: James Morse <james.morse@arm.com>
> > > > > > 
> > > > > > To support virtual CPU hotplug, ACPI has added an 'online capable' bit
> > > > > > to the MADT GICC entries. This indicates a disabled CPU entry may not
> > > > > > be possible to online via PSCI until firmware has set enabled bit in
> > > > > > _STA.
> > > > > > 
> > > > > > This means that a "usable" GIC is one that is marked as either enabled,
> > > > > > or online capable. Therefore, change acpi_gicc_is_usable() to check both
> > > > > > bits. However, we need to change the test in gic_acpi_match_gicc() back
> > > > > > to testing just the enabled bit so the count of enabled distributors is
> > > > > > correct.
> > > > > > 
> > > > > > What about the redistributor in the GICC entry? ACPI doesn't want to say.
> > > > > > Assume the worst: When a redistributor is described in the GICC entry,
> > > > > > but the entry is marked as disabled at boot, assume the redistributor
> > > > > > is inaccessible.
> > > > > > 
> > > > > > The GICv3 driver doesn't support late online of redistributors, so this
> > > > > > means the corresponding CPU can't be brought online either. Clear the
> > > > > > possible and present bits.
> > > > > > 
> > > > > > Systems that want CPU hotplug in a VM can ensure their redistributors
> > > > > > are always-on, and describe them that way with a GICR entry in the MADT.
> > > > > > 
> > > > > > When mapping redistributors found via GICC entries, handle the case
> > > > > > where the arch code believes the CPU is present and possible, but it
> > > > > > does not have an accessible redistributor. Print a warning and clear
> > > > > > the present and possible bits.
> > > > > > 
> > > > > > Signed-off-by: James Morse <james.morse@arm.com>
> > > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>        
> > > > > 
> > > > > +CC Marc,
> > > > > 
> > > > > Whilst this has been unchanged for a long time, I'm not 100% sure
> > > > > we've specifically drawn your attention to it before now.
> > > > > 
> > > > > Jonathan
> > > > >         
> > > > > > 
> > > > > > ---
> > > > > > v7: No Change.
> > > > > > ---
> > > > > >  drivers/irqchip/irq-gic-v3.c | 21 +++++++++++++++++++--
> > > > > >  include/linux/acpi.h         |  3 ++-
> > > > > >  2 files changed, 21 insertions(+), 3 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > > > > index 10af15f93d4d..66132251c1bb 100644
> > > > > > --- a/drivers/irqchip/irq-gic-v3.c
> > > > > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > > > > @@ -2363,11 +2363,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> > > > > >  				(struct acpi_madt_generic_interrupt *)header;
> > > > > >  	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
> > > > > >  	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> > > > > > +	int cpu = get_cpu_for_acpi_id(gicc->uid);
> > > > > >  	void __iomem *redist_base;
> > > > > >  
> > > > > >  	if (!acpi_gicc_is_usable(gicc))
> > > > > >  		return 0;
> > > > > >  
> > > > > > +	/*
> > > > > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > > > > +	 * the redistributor? ACPI doesn't want to say!
> > > > > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > > > > +	 */
> > > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > > > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > > > > +		set_cpu_present(cpu, false);
> > > > > > +		set_cpu_possible(cpu, false);
> > > > > > +		return 0;
> > > > > > +	}        
> > > > 
> > > > It seems dangerous to clear those this late in the game, given how
> > > > disconnected from the architecture code this is. Are we sure that
> > > > nothing has sampled these cpumasks beforehand?      
> > > 
> > > Hi Marc,
> > > 
> > > Any firmware that does this is being considered as buggy already
> > > but given it is firmware and the spec doesn't say much about this,
> > > there is always the possibility.
> > > 
> > > Not much happens between the point where these are setup and
> > > the point where the GIC inits and this code runs, but even if careful
> > > review showed it was fine today, it will be fragile to future changes.
> > > 
> > > I'm not sure there is a huge disadvantage for such broken firmware in
> > > clearing these masks from the point of view of what is used throughout
> > > the rest of the kernel. Here I think we are just looking to prevent the CPU
> > > being onlined later.
> > > 
> > > We could add a set_cpu_broken() with appropriate mask.
> > > Given this is very arm64 specific I'm not sure Rafael will be keen on
> > > us checking such a mask in the generic ACPI code, but we could check it in
> > > arch_register_cpu() and just not register the cpu if it matches.
> > > That will cover the vCPU hotplug case.
> > > 
> > > Does that sound sensible, or would you prefer something else?
> > 
> > Hi Marc
> > 
> > Some experiments later (faking this on a physical board - I never liked
> > CPU 120 anyway!) and using a different mask brings its own minor pain.
> > 
> > When all the rest of the CPUs are brought up cpuhp_bringup_mask() is called
> > on cpu_present_mask so we need to do a dance in there to use a temporary
> > mask with broken cpus removed.  I think it makes sense to cut that out
> > at the top of the cpuhp_bringup_mask() pile of actions rather than trying
> > to paper over each actual thing that is dying... (looks like an infinite loop
> > somewhere but I haven't tracked down where yet).
> > 
> > I'll spin a patch so you can see what it looks like, but my concern is
> > we are just moving the risk from early users of these masks to later cases
> > where code assumes cpu_present_mask definitely means they are present.
> > That is probably a small set of cases but not nice either.
> > 
> > Looks like one of those cases where we need to pick the lesser of two evils
> > which is probably still the cpu_broken_mask approach.
> > 
> > On plus side if we decide to go back to the original approach having seen
> > that I already have the code :)
> > 
> > Jonathan
> >   
> 
> Patch on top of this series.  If no one shouts before I have it ready I'll
> roll a v8 with the mask introduction as a new patch and the other changes pushed into
> appropriate patches.
> 
> From 361b76f36bfb4ff74fdceca7ebf14cfa43cae4a9 Mon Sep 17 00:00:00 2001
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Date: Wed, 24 Apr 2024 17:42:49 +0100
> Subject: [PATCH] cpu: Add broken cpu mask to mark CPUs where inconsistent
>  firmware means we can't start them.
> 
> On ARM64, it is not currently possible to use CPUs where the GICC entry
> in ACPI specifies that it is online capable but not enabled. Only
> always enabled entries are supported.
> 
> Previously if this condition was met, the present and possible cpu masks
> were cleared for the relevant cpus.  However, those masks may already
> have been used by other code so this is not known to be safe.
> 
> An alternative is to use an additional mask (broken) and check that
> in the subset of places where these CPUs might be onlined or the
> infrastructure to indicate this is possible created.
> Specifically in bringup_nonboot_cpus() and in arch_register_cpu().
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Obviously I'd missed Marc's reply on keeping this local to gicv3.
Will give that a go.

Sorry for the noise!

Jonathan

> ---
>  arch/arm64/kernel/smp.c      |  3 +++
>  drivers/irqchip/irq-gic-v3.c |  3 +--
>  include/linux/cpumask.h      | 19 +++++++++++++++++++
>  kernel/cpu.c                 |  8 +++++++-
>  4 files changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index ccb6ad347df9..39cd6a7c40d8 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -513,6 +513,9 @@ int arch_register_cpu(int cpu)
>  	    IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
>  		return -EPROBE_DEFER;
>  
> +	if (cpu_broken(cpu)) /* Inconsistent firmware - can't online */
> +		return -ENODEV;
> +
>  #ifdef CONFIG_ACPI_HOTPLUG_CPU
>  	/* For now block anything that looks like physical CPU Hotplug */
>  	if (invalid_logical_cpuid(cpu) || !cpu_present(cpu)) {
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 66132251c1bb..a0063eb6484d 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2377,8 +2377,7 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
>  	 */
>  	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
>  		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> -		set_cpu_present(cpu, false);
> -		set_cpu_possible(cpu, false);
> +		set_cpu_broken(cpu);
>  		return 0;
>  	}
>  
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index 4b202b94c97a..70a93ad8e590 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -96,6 +96,7 @@ static inline void set_nr_cpu_ids(unsigned int nr)
>   *     cpu_enabled_mask  - has bit 'cpu' set iff cpu can be brought online
>   *     cpu_online_mask  - has bit 'cpu' set iff cpu available to scheduler
>   *     cpu_active_mask  - has bit 'cpu' set iff cpu available to migration
> + *     cpu_broken_mask  - has bit 'cpu' set iff the cpu should never be onlined
>   *
>   *  If !CONFIG_HOTPLUG_CPU, present == possible, and active == online.
>   *
> @@ -130,12 +131,14 @@ extern struct cpumask __cpu_enabled_mask;
>  extern struct cpumask __cpu_present_mask;
>  extern struct cpumask __cpu_active_mask;
>  extern struct cpumask __cpu_dying_mask;
> +extern struct cpumask __cpu_broken_mask;
>  #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
>  #define cpu_online_mask   ((const struct cpumask *)&__cpu_online_mask)
>  #define cpu_enabled_mask   ((const struct cpumask *)&__cpu_enabled_mask)
>  #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
>  #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
>  #define cpu_dying_mask    ((const struct cpumask *)&__cpu_dying_mask)
> +#define cpu_broken_mask   ((const struct cpumask *)&__cpu_broken_mask)
>  
>  extern atomic_t __num_online_cpus;
>  
> @@ -1073,6 +1076,12 @@ set_cpu_dying(unsigned int cpu, bool dying)
>  		cpumask_clear_cpu(cpu, &__cpu_dying_mask);
>  }
>  
> +static inline void
> +set_cpu_broken(unsigned int cpu)
> +{
> +	cpumask_set_cpu(cpu, &__cpu_broken_mask);
> +}
> +
>  /**
>   * to_cpumask - convert a NR_CPUS bitmap to a struct cpumask *
>   * @bitmap: the bitmap
> @@ -1159,6 +1168,11 @@ static inline bool cpu_dying(unsigned int cpu)
>  	return cpumask_test_cpu(cpu, cpu_dying_mask);
>  }
>  
> +static inline bool cpu_broken(unsigned int cpu)
> +{
> +	return cpumask_test_cpu(cpu, cpu_broken_mask);
> +}
> +
>  #else
>  
>  #define num_online_cpus()	1U
> @@ -1197,6 +1211,11 @@ static inline bool cpu_dying(unsigned int cpu)
>  	return false;
>  }
>  
> +static inline bool cpu_broken(unsigned int cpu)
> +{
> +	return false;
> +}
> +
>  #endif /* NR_CPUS > 1 */
>  
>  #define cpu_is_offline(cpu)	unlikely(!cpu_online(cpu))
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 537099bf5d02..f8b73a11869e 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1907,12 +1907,15 @@ static inline bool cpuhp_bringup_cpus_parallel(unsigned int ncpus) { return fals
>  
>  void __init bringup_nonboot_cpus(unsigned int max_cpus)
>  {
> +	static struct cpumask tmp_mask __initdata;
> +
>  	/* Try parallel bringup optimization if enabled */
>  	if (cpuhp_bringup_cpus_parallel(max_cpus))
>  		return;
>  
> +	cpumask_andnot(&tmp_mask, cpu_present_mask, cpu_broken_mask);
>  	/* Full per CPU serialized bringup */
> -	cpuhp_bringup_mask(cpu_present_mask, max_cpus, CPUHP_ONLINE);
> +	cpuhp_bringup_mask(&tmp_mask, max_cpus, CPUHP_ONLINE);
>  }
>  
>  #ifdef CONFIG_PM_SLEEP_SMP
> @@ -3129,6 +3132,9 @@ EXPORT_SYMBOL(__cpu_active_mask);
>  struct cpumask __cpu_dying_mask __read_mostly;
>  EXPORT_SYMBOL(__cpu_dying_mask);
>  
> +struct cpumask __cpu_broken_mask __ro_after_init;
> +EXPORT_SYMBOL(__cpu_broken_mask);
> +
>  atomic_t __num_online_cpus __read_mostly;
>  EXPORT_SYMBOL(__num_online_cpus);
>
Jonathan Cameron April 25, 2024, 10:23 a.m. UTC | #23
On Wed, 24 Apr 2024 18:08:30 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Wed, 24 Apr 2024 17:35:54 +0100
> Salil Mehta <salil.mehta@huawei.com> wrote:
> 
> > >  From: Marc Zyngier <maz@kernel.org>
> > >  Sent: Wednesday, April 24, 2024 4:33 PM
> > >  To: Jonathan Cameron <jonathan.cameron@huawei.com>
> > >  Cc: Thomas Gleixner <tglx@linutronix.de>; Peter Zijlstra
> > >  <peterz@infradead.org>; linux-pm@vger.kernel.org;
> > >  loongarch@lists.linux.dev; linux-acpi@vger.kernel.org; linux-
> > >  arch@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> > >  kernel@lists.infradead.org; kvmarm@lists.linux.dev; x86@kernel.org;
> > >  Russell King <linux@armlinux.org.uk>; Rafael J . Wysocki
> > >  <rafael@kernel.org>; Miguel Luis <miguel.luis@oracle.com>; James Morse
> > >  <james.morse@arm.com>; Salil Mehta <salil.mehta@huawei.com>; Jean-
> > >  Philippe Brucker <jean-philippe@linaro.org>; Catalin Marinas
> > >  <catalin.marinas@arm.com>; Will Deacon <will@kernel.org>; Linuxarm
> > >  <linuxarm@huawei.com>; Ingo Molnar <mingo@redhat.com>; Borislav
> > >  Petkov <bp@alien8.de>; Dave Hansen <dave.hansen@linux.intel.com>;
> > >  justin.he@arm.com; jianyong.wu@arm.com
> > >  Subject: Re: [PATCH v7 11/16] irqchip/gic-v3: Add support for ACPI's
> > >  disabled but 'online capable' CPUs
> > >  
> > >  On Wed, 24 Apr 2024 13:54:38 +0100,
> > >  Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:    
> > >  >
> > >  > On Tue, 23 Apr 2024 13:01:21 +0100
> > >  > Marc Zyngier <maz@kernel.org> wrote:
> > >  >    
> > >  > > On Mon, 22 Apr 2024 11:40:20 +0100,
> > >  > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:    
> > >  > > >
> > >  > > > On Thu, 18 Apr 2024 14:54:07 +0100 Jonathan Cameron
> > >  > > > <Jonathan.Cameron@huawei.com> wrote:    
> > >  
> > >  [...]
> > >      
> > >  > > >    
> > >  > > > > +	/*
> > >  > > > > +	 * Capable but disabled CPUs can be brought online later.  What about
> > >  > > > > +	 * the redistributor? ACPI doesn't want to say!
> > >  > > > > +	 * Virtual hotplug systems can use the MADT's "always-on"  GICR entries.
> > >  > > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > >  > > > > +	 */
> > >  > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > >  > > > > +		pr_warn_once("CPU %u's redistributor is  inaccessible: this CPU can't be brought online\n", cpu);
> > >  > > > > +		set_cpu_present(cpu, false);
> > >  > > > > +		set_cpu_possible(cpu, false);    
> > 
> > (a digression) shouldn't we be clearing the enabled mask as well?
> > 
> >                                           set_cpu_enabled(cpu, false);  
> 
> FWIW I think not necessary. enabled is only set in register_cpu() and aim here is to
> never call that for CPUs in this state.
> 
> Anyhow, I got distracted by the firmware bug I found whilst trying to test this but
> now have a test setup that hits this path (once deliberately broken), so will
> see what we can do about that doesn't have affect those masks.

This may be relevant in the context of Marc's email.  Don't crop so much!
However, I think we probably don't care. This is a BIOS bug; if we misreport it such
that userspace thinks it can online something that won't work, it probably doesn't
matter.

Jonathan

> 
> Jonathan
> 
> 
> > 
> > 
> > Best regards
> > Salil  
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Jonathan Cameron April 25, 2024, 12:31 p.m. UTC | #24
On Wed, 24 Apr 2024 16:33:22 +0100
Marc Zyngier <maz@kernel.org> wrote:

> On Wed, 24 Apr 2024 13:54:38 +0100,
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > 
> > On Tue, 23 Apr 2024 13:01:21 +0100
> > Marc Zyngier <maz@kernel.org> wrote:
> >   
> > > On Mon, 22 Apr 2024 11:40:20 +0100,
> > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:  
> > > > 
> > > > On Thu, 18 Apr 2024 14:54:07 +0100
> > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:  
> 
> [...]
> 
> > > >     
> > > > > +	/*
> > > > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > > > +	 * the redistributor? ACPI doesn't want to say!
> > > > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > > > +	 */
> > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > > > +		set_cpu_present(cpu, false);
> > > > > +		set_cpu_possible(cpu, false);
> > > > > +		return 0;
> > > > > +	}    
> > > 
> > > It seems dangerous to clear those this late in the game, given how
> > > disconnected from the architecture code this is. Are we sure that
> > > nothing has sampled these cpumasks beforehand?  
> > 
> > Hi Marc,
> > 
> > Any firmware that does this is being considered as buggy already
> > but given it is firmware and the spec doesn't say much about this,
> > there is always the possibility.  
> 
> There is no shortage of broken firmware out there, and I expect this
> trend to progress.
> 
> > Not much happens between the point where these are setup and
> > the point where the the gic inits and this code runs, but even if careful
> > review showed it was fine today, it will be fragile to future changes.
> > 
> > I'm not sure there is a huge disadvantage for such broken firmware in
> > clearing these masks from the point of view of what is used throughout
> > the rest of the kernel. Here I think we are just looking to prevent the CPU
> > being onlined later.  
> 
> I totally agree on the goal, I simply question the way you get to it.
> 
> > 
> > We could add a set_cpu_broken() with appropriate mask.
> > Given this is very arm64 specific I'm not sure Rafael will be keen on
> > us checking such a mask in the generic ACPI code, but we could check it in
> > arch_register_cpu() and just not register the cpu if it matches.
> > That will cover the vCPU hotplug case.
> >
> > Does that sounds sensible, or would you prefer something else?  
> 
> 
> Such a 'broken_rdists' mask is exactly what I have in mind, just
> keeping it private to the GIC driver, and not expose it anywhere else.
> You can then fail the hotplug event early, and avoid changing the
> global masks from within the GIC driver. At least, we don't mess with
> the internals of the kernel, and the CPU is properly marked as dead
> (that mechanism should already work).
> 
> I'd expect the handling side to look like this (will not compile, but
> you'll get the idea):
Hi Marc,

In general this looks good - but...

I haven't gotten to the bottom of why yet (and it might be a side
effect of how I hacked the test by lying in minimal fashion and
just frigging the MADT read functions), but the hotplug flow is only getting
as far as calling __cpu_up() before it seems to enter an infinite loop.
That is, it never gets far enough to fail this test.

It is getting stuck in a PSCI cpu_on call.  I'm guessing something that
we didn't get to in the earlier gicv3 calls before bailing out is blocking that?
It looks like it gets to the SMCCC smc and is never seen again.

Any ideas on where to look?  The one advantage so far of the higher level
approach is we never tried the hotplug callbacks at all so avoided hitting
that call.  One (little bit horrible) solution that might avoid this would 
be to add another cpuhp state very early on and fail at that stage.
I'm not keen on doing that without a better explanation than I have so far!

Thanks,

J

 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 6fb276504bcc..e8f02bfd0e21 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1009,6 +1009,9 @@ static int __gic_populate_rdist(struct redist_region *region, void __iomem *ptr)
>  	u64 typer;
>  	u32 aff;
>  
> +	if (cpumask_test_cpu(smp_processor_id(), &broken_rdists))
> +		return 1;
> +
>  	/*
>  	 * Convert affinity to a 32bit value that can be matched to
>  	 * GICR_TYPER bits [63:32].
> @@ -1260,14 +1263,15 @@ static int gic_dist_supports_lpis(void)
>  		!gicv3_nolpi);
>  }
>  
> -static void gic_cpu_init(void)
> +static int gic_cpu_init(void)
>  {
>  	void __iomem *rbase;
> -	int i;
> +	int ret, i;
>  
>  	/* Register ourselves with the rest of the world */
> -	if (gic_populate_rdist())
> -		return;
> +	ret = gic_populate_rdist();
> +	if (ret)
> +		return ret;
>  
>  	gic_enable_redist(true);
>  
> @@ -1286,6 +1290,8 @@ static void gic_cpu_init(void)
>  
>  	/* initialise system registers */
>  	gic_cpu_sys_reg_init();
> +
> +	return 0;
>  }
>  
>  #ifdef CONFIG_SMP
> @@ -1295,7 +1301,11 @@ static void gic_cpu_init(void)
>  
>  static int gic_starting_cpu(unsigned int cpu)
>  {
> -	gic_cpu_init();
> +	int ret;
> +
> +	ret = gic_cpu_init();
> +	if (ret)
> +		return ret;
>  
>  	if (gic_dist_supports_lpis())
>  		its_cpu_init();
> 
> But the question is: do you rely on these masks having been
> "corrected" anywhere else?
> 
> Thanks,
> 
> 	M.
>
Jonathan Cameron April 25, 2024, 3 p.m. UTC | #25
On Thu, 25 Apr 2024 13:31:50 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Wed, 24 Apr 2024 16:33:22 +0100
> Marc Zyngier <maz@kernel.org> wrote:
> 
> > On Wed, 24 Apr 2024 13:54:38 +0100,
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:  
> > > 
> > > On Tue, 23 Apr 2024 13:01:21 +0100
> > > Marc Zyngier <maz@kernel.org> wrote:
> > >     
> > > > On Mon, 22 Apr 2024 11:40:20 +0100,
> > > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:    
> > > > > 
> > > > > On Thu, 18 Apr 2024 14:54:07 +0100
> > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:    
> > 
> > [...]
> >   
> > > > >       
> > > > > > +	/*
> > > > > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > > > > +	 * the redistributor? ACPI doesn't want to say!
> > > > > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > > > > +	 */
> > > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > > > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > > > > +		set_cpu_present(cpu, false);
> > > > > > +		set_cpu_possible(cpu, false);
> > > > > > +		return 0;
> > > > > > +	}      
> > > > 
> > > > It seems dangerous to clear those this late in the game, given how
> > > > disconnected from the architecture code this is. Are we sure that
> > > > nothing has sampled these cpumasks beforehand?    
> > > 
> > > Hi Marc,
> > > 
> > > Any firmware that does this is being considered as buggy already
> > > but given it is firmware and the spec doesn't say much about this,
> > > there is always the possibility.    
> > 
> > There is no shortage of broken firmware out there, and I expect this
> > trend to progress.
> >   
> > > Not much happens between the point where these are setup and
> > > the point where the the gic inits and this code runs, but even if careful
> > > review showed it was fine today, it will be fragile to future changes.
> > > 
> > > I'm not sure there is a huge disadvantage for such broken firmware in
> > > clearing these masks from the point of view of what is used throughout
> > > the rest of the kernel. Here I think we are just looking to prevent the CPU
> > > being onlined later.    
> > 
> > I totally agree on the goal, I simply question the way you get to it.
> >   
> > > 
> > > We could add a set_cpu_broken() with appropriate mask.
> > > Given this is very arm64 specific I'm not sure Rafael will be keen on
> > > us checking such a mask in the generic ACPI code, but we could check it in
> > > arch_register_cpu() and just not register the cpu if it matches.
> > > That will cover the vCPU hotplug case.
> > >
> > > Does that sounds sensible, or would you prefer something else?    
> > 
> > 
> > Such a 'broken_rdists' mask is exactly what I have in mind, just
> > keeping it private to the GIC driver, and not expose it anywhere else.
> > You can then fail the hotplug event early, and avoid changing the
> > global masks from within the GIC driver. At least, we don't mess with
> > the internals of the kernel, and the CPU is properly marked as dead
> > (that mechanism should already work).
> > 
> > I'd expect the handling side to look like this (will not compile, but
> > you'll get the idea):  
> Hi Marc,
> 
> In general this looks good - but...
> 
> I haven't gotten to the bottom of why yet (and it might be a side
> effect of how I hacked the test by lying in minimal fashion and
> just frigging the MADT read functions) but the hotplug flow is only getting
> as far as calling __cpu_up() before it seems to enter an infinite loop.
> That is it never gets far enough to fail this test.
> 
> Getting stuck in a psci cpu_on call.  I'm guessing something that
> we didn't get to in the earlier gicv3 calls before bailing out is blocking that?
> Looks like it gets to
> SMCCC smc
> and is never seen again.
> 
> Any ideas on where to look?  The one advantage so far of the higher level
> approach is we never tried the hotplug callbacks at all so avoided hitting
> that call.  One (little bit horrible) solution that might avoid this would 
> be to add another cpuhp state very early on and fail at that stage.
> I'm not keen on doing that without a better explanation than I have so far!

Whilst it still doesn't work, I suspect I'm losing the ability to print to the console
somewhere between that point and somewhat later, and the real problem is elsewhere.

Jonathan

> 
> Thanks,
> 
> J
> 
>  
> > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > index 6fb276504bcc..e8f02bfd0e21 100644
> > --- a/drivers/irqchip/irq-gic-v3.c
> > +++ b/drivers/irqchip/irq-gic-v3.c
> > @@ -1009,6 +1009,9 @@ static int __gic_populate_rdist(struct redist_region *region, void __iomem *ptr)
> >  	u64 typer;
> >  	u32 aff;
> >  
> > +	if (cpumask_test_cpu(smp_processor_id(), &broken_rdists))
> > +		return 1;
> > +
> >  	/*
> >  	 * Convert affinity to a 32bit value that can be matched to
> >  	 * GICR_TYPER bits [63:32].
> > @@ -1260,14 +1263,15 @@ static int gic_dist_supports_lpis(void)
> >  		!gicv3_nolpi);
> >  }
> >  
> > -static void gic_cpu_init(void)
> > +static int gic_cpu_init(void)
> >  {
> >  	void __iomem *rbase;
> > -	int i;
> > +	int ret, i;
> >  
> >  	/* Register ourselves with the rest of the world */
> > -	if (gic_populate_rdist())
> > -		return;
> > +	ret = gic_populate_rdist();
> > +	if (ret)
> > +		return ret;
> >  
> >  	gic_enable_redist(true);
> >  
> > @@ -1286,6 +1290,8 @@ static void gic_cpu_init(void)
> >  
> >  	/* initialise system registers */
> >  	gic_cpu_sys_reg_init();
> > +
> > +	return 0;
> >  }
> >  
> >  #ifdef CONFIG_SMP
> > @@ -1295,7 +1301,11 @@ static void gic_cpu_init(void)
> >  
> >  static int gic_starting_cpu(unsigned int cpu)
> >  {
> > -	gic_cpu_init();
> > +	int ret;
> > +
> > +	ret = gic_cpu_init();
> > +	if (ret)
> > +		return ret;
> >  
> >  	if (gic_dist_supports_lpis())
> >  		its_cpu_init();
> > 
> > But the question is: do you rely on these masks having been
> > "corrected" anywhere else?
> > 
> > Thanks,
> > 
> > 	M.
> >   
> 
> 
Jonathan Cameron April 25, 2024, 4:55 p.m. UTC | #26
On Thu, 25 Apr 2024 16:00:17 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Thu, 25 Apr 2024 13:31:50 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Wed, 24 Apr 2024 16:33:22 +0100
> > Marc Zyngier <maz@kernel.org> wrote:
> >   
> > > On Wed, 24 Apr 2024 13:54:38 +0100,
> > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:    
> > > > 
> > > > On Tue, 23 Apr 2024 13:01:21 +0100
> > > > Marc Zyngier <maz@kernel.org> wrote:
> > > >       
> > > > > On Mon, 22 Apr 2024 11:40:20 +0100,
> > > > > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:      
> > > > > > 
> > > > > > On Thu, 18 Apr 2024 14:54:07 +0100
> > > > > > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:      
> > > 
> > > [...]
> > >     
> > > > > >         
> > > > > > > +	/*
> > > > > > > +	 * Capable but disabled CPUs can be brought online later. What about
> > > > > > > +	 * the redistributor? ACPI doesn't want to say!
> > > > > > > +	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
> > > > > > > +	 * Otherwise, prevent such CPUs from being brought online.
> > > > > > > +	 */
> > > > > > > +	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
> > > > > > > +		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
> > > > > > > +		set_cpu_present(cpu, false);
> > > > > > > +		set_cpu_possible(cpu, false);
> > > > > > > +		return 0;
> > > > > > > +	}        
> > > > > 
> > > > > It seems dangerous to clear those this late in the game, given how
> > > > > disconnected from the architecture code this is. Are we sure that
> > > > > nothing has sampled these cpumasks beforehand?      
> > > > 
> > > > Hi Marc,
> > > > 
> > > > Any firmware that does this is being considered as buggy already
> > > > but given it is firmware and the spec doesn't say much about this,
> > > > there is always the possibility.      
> > > 
> > > There is no shortage of broken firmware out there, and I expect this
> > > trend to progress.
> > >     
> > > > Not much happens between the point where these are setup and
> > > > the point where the the gic inits and this code runs, but even if careful
> > > > review showed it was fine today, it will be fragile to future changes.
> > > > 
> > > > I'm not sure there is a huge disadvantage for such broken firmware in
> > > > clearing these masks from the point of view of what is used throughout
> > > > the rest of the kernel. Here I think we are just looking to prevent the CPU
> > > > being onlined later.      
> > > 
> > > I totally agree on the goal, I simply question the way you get to it.
> > >     
> > > > 
> > > > We could add a set_cpu_broken() with appropriate mask.
> > > > Given this is very arm64 specific I'm not sure Rafael will be keen on
> > > > us checking such a mask in the generic ACPI code, but we could check it in
> > > > arch_register_cpu() and just not register the cpu if it matches.
> > > > That will cover the vCPU hotplug case.
> > > >
> > > > Does that sounds sensible, or would you prefer something else?      
> > > 
> > > 
> > > Such a 'broken_rdists' mask is exactly what I have in mind, just
> > > keeping it private to the GIC driver, and not expose it anywhere else.
> > > You can then fail the hotplug event early, and avoid changing the
> > > global masks from within the GIC driver. At least, we don't mess with
> > > the internals of the kernel, and the CPU is properly marked as dead
> > > (that mechanism should already work).
> > > 
> > > I'd expect the handling side to look like this (will not compile, but
> > > you'll get the idea):    
> > Hi Marc,
> > 
> > In general this looks good - but...
> > 
> > I haven't gotten to the bottom of why yet (and it might be a side
> > effect of how I hacked the test by lying in minimal fashion and
> > just frigging the MADT read functions) but the hotplug flow is only getting
> > as far as calling __cpu_up() before it seems to enter an infinite loop.
> > That is it never gets far enough to fail this test.
> > 
> > Getting stuck in a psci cpu_on call.  I'm guessing something that
> > we didn't get to in the earlier gicv3 calls before bailing out is blocking that?
> > Looks like it gets to
> > SMCCC smc
> > and is never seen again.
> > 
> > Any ideas on where to look?  The one advantage so far of the higher level
> > approach is we never tried the hotplug callbacks at all so avoided hitting
> > that call.  One (little bit horrible) solution that might avoid this would 
> > be to add another cpuhp state very early on and fail at that stage.
> > I'm not keen on doing that without a better explanation than I have so far!  
> 
> Whilst it still doesn't work I suspect I'm loosing ability to print to the console
> between that point and somewhat later and real problem is elsewhere.

Hi again,

Found it, I think.  The cpuhp calls between cpu:bringup and ap:online that
are made from notify_cpu_starting() are clearly marked as nofail with a comment:
STARTING must not fail!

https://elixir.bootlin.com/linux/latest/source/kernel/cpu.c#L1642

Whilst I have no immediate idea why that comment is there, it is a pretty strong
argument against trying to have the CPUHP_AP_IRQ_GIC_STARTING callback fail
and expecting it to carry on working :(
There would have been a nice print message, but given I don't appear to have
a working console after that stage I never see it.

So the best I have yet come up with for this is the option of a new callback registered
in gic_smp_init()

cpuhp_setup_state_nocalls(CPUHP_BP_PREPARE_DYN,
			  "irqchip/arm/gicv3:checkrdist",
			  gic_broken_rdist, NULL);

with callback being simply 

static int gic_broken_rdist(unsigned int cpu)
{
	if (cpumask_test_cpu(cpu, &broken_rdists))
		return -EINVAL;

	return 0;
}

That gets called from cpuhp_up_callbacks() and is allowed to fail and roll back the steps.

Not particularly satisfying but keeps the logic confined to the gicv3 driver.
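
To make the intent concrete, here is a rough user-space model of the idea
(not kernel code, and not the final patch). It assumes a toy bring-up path
in which the prepare-stage steps may fail and are undone in reverse order,
and the CPU is only kicked once every prepare step has passed. The names
mark_rdist_broken(), model_cpu_up(), prepared[] and kicked[] are made up
for illustration; only gic_broken_rdist() and broken_rdists correspond to
the names used above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 8

/* Toy stand-in for the GIC driver's private broken_rdists mask. */
static unsigned long broken_rdists;

/* Hypothetical helper: what the MADT parsing path would do for a bad entry. */
static void mark_rdist_broken(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	broken_rdists |= 1UL << cpu;
}

/* Same logic as the proposed prepare-stage callback. */
static int gic_broken_rdist(unsigned int cpu)
{
	if (broken_rdists & (1UL << cpu))
		return -EINVAL;

	return 0;
}

struct step {
	const char *name;
	int (*startup)(unsigned int cpu);
	void (*teardown)(unsigned int cpu);	/* may be NULL */
};

static int prepared[NR_CPUS];	/* prepare-stage steps currently applied */
static bool kicked[NR_CPUS];	/* true once the CPU has been asked to start */

static int alloc_step(unsigned int cpu) { prepared[cpu]++; return 0; }
static void free_step(unsigned int cpu) { prepared[cpu]--; }

/* Prepare-stage steps run on the control CPU and are allowed to fail. */
static const struct step prepare_steps[] = {
	{ "alloc-a",          alloc_step,       free_step },
	{ "gicv3:checkrdist", gic_broken_rdist, NULL      },
	{ "alloc-b",          alloc_step,       free_step },
};

static int model_cpu_up(unsigned int cpu)
{
	size_t n = sizeof(prepare_steps) / sizeof(prepare_steps[0]);
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = prepare_steps[i].startup(cpu);
		if (ret) {
			/* Roll back the steps that already ran, newest first. */
			while (i-- > 0)
				if (prepare_steps[i].teardown)
					prepare_steps[i].teardown(cpu);
			return ret;
		}
	}

	/* Only now would the real flow reach __cpu_up() and the STARTING steps. */
	kicked[cpu] = true;
	return 0;
}
```

In this model a CPU with a broken redistributor is rejected before anything
resembling the PSCI call is attempted, and the earlier prepare work is undone,
which is the property I am after.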

What do you think?

Jonathan

> 
> Jonathan
> 
> > 
> > Thanks,
> > 
> > J
> > 
> >    
> > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > index 6fb276504bcc..e8f02bfd0e21 100644
> > > --- a/drivers/irqchip/irq-gic-v3.c
> > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > @@ -1009,6 +1009,9 @@ static int __gic_populate_rdist(struct redist_region *region, void __iomem *ptr)
> > >  	u64 typer;
> > >  	u32 aff;
> > >  
> > > +	if (cpumask_test_cpu(smp_processor_id(), &broken_rdists))
> > > +		return 1;
> > > +
> > >  	/*
> > >  	 * Convert affinity to a 32bit value that can be matched to
> > >  	 * GICR_TYPER bits [63:32].
> > > @@ -1260,14 +1263,15 @@ static int gic_dist_supports_lpis(void)
> > >  		!gicv3_nolpi);
> > >  }
> > >  
> > > -static void gic_cpu_init(void)
> > > +static int gic_cpu_init(void)
> > >  {
> > >  	void __iomem *rbase;
> > > -	int i;
> > > +	int ret, i;
> > >  
> > >  	/* Register ourselves with the rest of the world */
> > > -	if (gic_populate_rdist())
> > > -		return;
> > > +	ret = gic_populate_rdist();
> > > +	if (ret)
> > > +		return ret;
> > >  
> > >  	gic_enable_redist(true);
> > >  
> > > @@ -1286,6 +1290,8 @@ static void gic_cpu_init(void)
> > >  
> > >  	/* initialise system registers */
> > >  	gic_cpu_sys_reg_init();
> > > +
> > > +	return 0;
> > >  }
> > >  
> > >  #ifdef CONFIG_SMP
> > > @@ -1295,7 +1301,11 @@ static void gic_cpu_init(void)
> > >  
> > >  static int gic_starting_cpu(unsigned int cpu)
> > >  {
> > > -	gic_cpu_init();
> > > +	int ret;
> > > +
> > > +	ret = gic_cpu_init();
> > > +	if (ret)
> > > +		return ret;
> > >  
> > >  	if (gic_dist_supports_lpis())
> > >  		its_cpu_init();
> > > 
> > > But the question is: do you rely on these masks having been
> > > "corrected" anywhere else?
> > > 
> > > Thanks,
> > > 
> > > 	M.
> > >     
> > 
> > 
> 
> 
Marc Zyngier April 26, 2024, 12:41 p.m. UTC | #27
On Thu, 25 Apr 2024 17:55:27 +0100,
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> On Thu, 25 Apr 2024 16:00:17 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Thu, 25 Apr 2024 13:31:50 +0100
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> > 
> > > On Wed, 24 Apr 2024 16:33:22 +0100
> > > Marc Zyngier <maz@kernel.org> wrote:

[...]

> > >   
> > > > I'd expect the handling side to look like this (will not compile, but
> > > > you'll get the idea):    
> > > Hi Marc,
> > > 
> > > In general this looks good - but...
> > > 
> > > I haven't gotten to the bottom of why yet (and it might be a side
> > > effect of how I hacked the test by lying in minimal fashion and
> > > just frigging the MADT read functions) but the hotplug flow is only getting
> > > as far as calling __cpu_up() before it seems to enter an infinite loop.
> > > That is it never gets far enough to fail this test.
> > > 
> > > Getting stuck in a psci cpu_on call.  I'm guessing something that
> > > we didn't get to in the earlier gicv3 calls before bailing out is blocking that?
> > > Looks like it gets to
> > > SMCCC smc
> > > and is never seen again.
> > > 
> > > Any ideas on where to look?  The one advantage so far of the higher level
> > > approach is we never tried the hotplug callbacks at all so avoided hitting
> > > that call.  One (little bit horrible) solution that might avoid this would 
> > > be to add another cpuhp state very early on and fail at that stage.
> > > I'm not keen on doing that without a better explanation than I have so far!  
> > 
> > Whilst it still doesn't work I suspect I'm loosing ability to print to the console
> > between that point and somewhat later and real problem is
> > elsewhere.

Sorry, travelling at the moment, so only spotted this now.

> 
> Hi again,
> 
> Found it I think.  cpuhp calls between cpu:bringup and ap:online 
> arm made from notify_cpu_starting() are clearly marked as nofail with a comment.
> STARTING must not fail!
> 
> https://elixir.bootlin.com/linux/latest/source/kernel/cpu.c#L1642

Ah, now that rings a bell! ;-)

> 
> Whilst I have no immediate idea why that comment is there it is pretty strong
> argument against trying to have the CPUHP_AP_IRQ_GIC_STARTING callback fail
> and expecting it to carry on working :( 
> There would have been a nice print message, but given I don't appear to have
> a working console after that stage I never see it.
> 
> So the best I have yet come up with for this is the option of a new callback registered
> in gic_smp_init()
> 
> cpuhp_setup_state_nocalls(CPUHP_BP_PREPARE_DYN,
> 			  "irqchip/arm/gicv3:checkrdist",
> 			  gic_broken_rdist, NULL);
> 
> with callback being simply 
> 
> static int gic_broken_rdist(unsigned int cpu)
> {
> 	if (cpumask_test_cpu(cpu, &broken_rdists))
> 		return -EINVAL;
> 
> 	return 0;
> }
> 
> That gets called cpuhp_up_callbacks() and is allows to fail and roll back the steps.
> 
> Not particularly satisfying but keeps the logic confined to the gicv3 driver.
> 
> What do you think?

Good enough for me. Cc me on the resulting patch when you repost it so
that I can eyeball it, but this is IMO the right direction.

Thanks,

	M.