diff mbox series

[v5,03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()

Message ID 20240412143719.11398-4-Jonathan.Cameron@huawei.com
State New
Headers show
Series ACPI/arm64: add support for virtual cpu hotplug | expand

Commit Message

Jonathan Cameron April 12, 2024, 2:37 p.m. UTC
From: James Morse <james.morse@arm.com>

The arm64 specific arch_register_cpu() call may defer CPU registration
until the ACPI interpreter is available and the _STA method can
be evaluated.

If this occurs, then a second attempt is made in
acpi_processor_get_info(). Note that the arm64 specific call has
not yet been added so for now this will never be successfully
called.

Systems can still be booted with 'acpi=off', or not include an
ACPI description at all as in these cases arch_register_cpu()
will not have deferred registration when first called.

This moves the CPU register logic back to a subsys_initcall(),
while the memory nodes will have been registered earlier.
Note this is where the call was prior to the cleanup series so
there should be no side effects of moving it back again for this
specific case.

[PATCH 00/21] Initial cleanups for vCPU HP.
https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/

e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Joanthan Cameron <Jonathan.Cameron@huawei.com>
---
v5: Update commit message to make it clear this is moving the
    init back to where it was until very recently.

    No longer change the condition in the earlier registration point
    as that will be handled by the arm64 registration routine
    deferring until called again here.
---
 drivers/acpi/acpi_processor.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Salil Mehta April 15, 2024, 11:07 a.m. UTC | #1
Hello,

Engaging after a long time (I've been almost off grid due to very challenging personal circumstances).
Please find my response inline.

Thanks
Salil

>  From: Rafael J. Wysocki <rafael@kernel.org>
>  Sent: Friday, April 12, 2024 7:31 PM
>  To: Jonathan Cameron <jonathan.cameron@huawei.com>
>  Cc: linux-pm@vger.kernel.org; loongarch@lists.linux.dev; linux-
>  acpi@vger.kernel.org; linux-arch@vger.kernel.org; linux-
>  kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>  kvmarm@lists.linux.dev; x86@kernel.org; Russell King
>  <linux@armlinux.org.uk>; Rafael J . Wysocki <rafael@kernel.org>; Miguel
>  Luis <miguel.luis@oracle.com>; James Morse <james.morse@arm.com>;
>  Salil Mehta <salil.mehta@huawei.com>; Jean-Philippe Brucker <jean-
>  philippe@linaro.org>; Catalin Marinas <catalin.marinas@arm.com>; Will
>  Deacon <will@kernel.org>; Linuxarm <linuxarm@huawei.com>;
>  justin.he@arm.com; jianyong.wu@arm.com
>  Subject: Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from
>  acpi_processor_get_info()
>  
>  On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
>  <Jonathan.Cameron@huawei.com> wrote:
>  >
>  > From: James Morse <james.morse@arm.com>
>  >
>  > The arm64 specific arch_register_cpu() call may defer CPU registration
>  > until the ACPI interpreter is available and the _STA method can be
>  > evaluated.
>  >
>  > If this occurs, then a second attempt is made in
>  > acpi_processor_get_info(). Note that the arm64 specific call has not
>  > yet been added so for now this will never be successfully called.
>  >
>  > Systems can still be booted with 'acpi=off', or not include an ACPI
>  > description at all as in these cases arch_register_cpu() will not have
>  > deferred registration when first called.
>  >
>  > This moves the CPU register logic back to a subsys_initcall(), while
>  > the memory nodes will have been registered earlier.
>  > Note this is where the call was prior to the cleanup series so there
>  > should be no side effects of moving it back again for this specific
>  > case.
>  >
>  > [PATCH 00/21] Initial cleanups for vCPU HP.
>  >
>  https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
>  >
>  > e.g. 5b95f94c3b9f ("x86/topology: Switch over to
>  GENERIC_CPU_DEVICES")
>  >
>  > Signed-off-by: James Morse <james.morse@arm.com>
>  > Reviewed-by: Gavin Shan <gshan@redhat.com>
>  > Tested-by: Miguel Luis <miguel.luis@oracle.com>
>  > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
>  > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
>  > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>  > Signed-off-by: Joanthan Cameron <Jonathan.Cameron@huawei.com>
>  > ---
>  > v5: Update commit message to make it clear this is moving the
>  >     init back to where it was until very recently.
>  >
>  >     No longer change the condition in the earlier registration point
>  >     as that will be handled by the arm64 registration routine
>  >     deferring until called again here.
>  > ---
>  >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
>  >  1 file changed, 12 insertions(+)
>  >
>  > diff --git a/drivers/acpi/acpi_processor.c
>  > b/drivers/acpi/acpi_processor.c index 93e029403d05..c78398cdd060
>  > 100644
>  > --- a/drivers/acpi/acpi_processor.c
>  > +++ b/drivers/acpi/acpi_processor.c
>  > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct
>  > acpi_device *device)
>  >
>  >         c = &per_cpu(cpu_devices, pr->id);
>  >         ACPI_COMPANION_SET(&c->dev, device);
>  > +       /*
>  > +        * Register CPUs that are present. get_cpu_device() is used to skip
>  > +        * duplicate CPU descriptions from firmware.
>  > +        */
>  > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
>  > +           !get_cpu_device(pr->id)) {
>  > +               int ret = arch_register_cpu(pr->id);
>  > +
>  > +               if (ret)
>  > +                       return ret;
>  > +       }
>  > +
>  >         /*
>  >          *  Extra Processor objects may be enumerated on MP systems with
>  >          *  less than the max # of CPUs. They should be ignored _iff
>  > --
>  
>  I am still unsure why there need to be two paths calling
>  arch_register_cpu() in acpi_processor_get_info().


This is because all CPUs are expected to be 'present' during the boot time for ARM64 arch.
This is not true for x86 world i.e. the logical_cpuid could  be invalid (and present mask not
set) for the x86 arch during the boot time.  Faking the 'present' behavior at the virtualizer
level for ARM is like interfering with the architecture and then tweaking the kernel to fit
that unauthorized hack. This has a potential to break the existing and future version of the
ARM arch. (Between, I'm one of the initial offender of doing that but later corrected the
approach after many discussions and KVM Forum presentations along with ARM )

Therefore, in ARM we keep all the processor as present and just use _STA enabled bit to
decide the online'ing of the processor and this requires a separate handling.


>  
>  Just below the comment partially pulled into the patch context above, there
>  is this code:
>  
>  if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>           int ret = acpi_processor_hotadd_init(pr);
>  
>          if (ret)
>                  return ret;
>  }
>  
>  For the sake of the argument, fold acpi_processor_hotadd_init() into it and
>  drop the redundant _STA check from it:
>  
>  if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>          if (invalid_phys_cpuid(pr->phys_id))
>                  return -ENODEV;
>  
>          cpu_maps_update_begin();
>          cpus_write_lock();
>  
>         ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
>         if (ret) {
>                  cpus_write_unlock();
>                  cpu_maps_update_done();
>                  return ret;
>         }
>         ret = arch_register_cpu(pr->id);
>         if (ret) {
>                  acpi_unmap_cpu(pr->id);
>  
>                  cpus_write_unlock();
>                  cpu_maps_update_done();
>                  return ret;
>         }
>        pr_info("CPU%d has been hot-added\n", pr->id);
>        pr->flags.need_hotplug_init = 1;
>  
>        cpus_write_unlock();
>        cpu_maps_update_done();
>  }
>  
>  so I'm not sure why this cannot be combined with the new code.
>  
>  Say acpi_map_cpu) / acpi_unmap_cpu() are turned into arch calls.


We cannot because logical cpu-id can never be invalid and cpus can
never be in NOT present state on ARM arch.


>  What's the difference then?  


Above is the precise difference. Changing the behavior of 'presence' in
the ARM architecture after boot is not allowed. With the latest efforts, we
have added the concept of 'online-capable' bit which can help in defer
online'ing the CPUs but then this is not same as not being present at the
boot time. 


The locking, which should be fine if I'm not
>  mistaken and need_hotplug_init that needs to be set if this code runs after
>  the processor driver has loaded AFAICS.

AFAICS, Locking looks to be okay to me as well.

Best regards
Salil.
Rafael J. Wysocki April 15, 2024, 11:52 a.m. UTC | #2
On Mon, Apr 15, 2024 at 12:52 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Fri, 12 Apr 2024 20:30:40 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
>
> > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > From: James Morse <james.morse@arm.com>
> > >
> > > The arm64 specific arch_register_cpu() call may defer CPU registration
> > > until the ACPI interpreter is available and the _STA method can
> > > be evaluated.
> > >
> > > If this occurs, then a second attempt is made in
> > > acpi_processor_get_info(). Note that the arm64 specific call has
> > > not yet been added so for now this will never be successfully
> > > called.
> > >
> > > Systems can still be booted with 'acpi=off', or not include an
> > > ACPI description at all as in these cases arch_register_cpu()
> > > will not have deferred registration when first called.
> > >
> > > This moves the CPU register logic back to a subsys_initcall(),
> > > while the memory nodes will have been registered earlier.
> > > Note this is where the call was prior to the cleanup series so
> > > there should be no side effects of moving it back again for this
> > > specific case.
> > >
> > > [PATCH 00/21] Initial cleanups for vCPU HP.
> > > https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
> > >
> > > e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> > >
> > > Signed-off-by: James Morse <james.morse@arm.com>
> > > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Signed-off-by: Joanthan Cameron <Jonathan.Cameron@huawei.com>
> > > ---
> > > v5: Update commit message to make it clear this is moving the
> > >     init back to where it was until very recently.
> > >
> > >     No longer change the condition in the earlier registration point
> > >     as that will be handled by the arm64 registration routine
> > >     deferring until called again here.
> > > ---
> > >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > index 93e029403d05..c78398cdd060 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > >
> > >         c = &per_cpu(cpu_devices, pr->id);
> > >         ACPI_COMPANION_SET(&c->dev, device);
> > > +       /*
> > > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > > +        * duplicate CPU descriptions from firmware.
> > > +        */
> > > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > > +           !get_cpu_device(pr->id)) {
> > > +               int ret = arch_register_cpu(pr->id);
> > > +
> > > +               if (ret)
> > > +                       return ret;
> > > +       }
> > > +
> > >         /*
> > >          *  Extra Processor objects may be enumerated on MP systems with
> > >          *  less than the max # of CPUs. They should be ignored _iff
> > > --
> >
> > I am still unsure why there need to be two paths calling
> > arch_register_cpu() in acpi_processor_get_info().
>
> I replied further down the thread, but the key point was to maintain
> the strong distinction between 'what' was done in a real hotplug
> path vs one where onlining was all.  We can relax that but it goes
> contrary to the careful dance that was needed to get any agreement
> to the ARM architecture aspects of this.

This seems to go beyond technical territory.

As a general rule, we tend to combine code paths that look similar
instead of making them separate on purpose.  Especially with a little
to no explanation of the technical reason.

> >
> > Just below the comment partially pulled into the patch context above,
> > there is this code:
> >
> > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> >          int ret = acpi_processor_hotadd_init(pr);
> >
> >         if (ret)
> >                 return ret;
> > }
> >
> > For the sake of the argument, fold acpi_processor_hotadd_init() into
> > it and drop the redundant _STA check from it:
>
> If we combine these, the _STA check is necessary because we will call this
> path for delayed onlining of ARM64 CPUs (if the earlier registration code
> call or arch_register_cpu() returned -EPROBE defer). That's the only way
> we know that a given CPU is online capable but firmware is saying we can't
> bring it online yet (it may be be vHP later).

Ignoring the above as per the other message.

> >
> > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> >         if (invalid_phys_cpuid(pr->phys_id))
> >                 return -ENODEV;
> >
> >         cpu_maps_update_begin();
> >         cpus_write_lock();
> >
> >        ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
>
> I read that call as
>         acpi_map_cpu_for_physical_cpu_hotplug()
> but we could make it equivalent of.
>         acpi_map_cpu_for_whatever_cpu_hotplug()

Yes.

> (I'm not proposing those names though ;)

Sure.

> in which case it is fine to just stub it out on ARM64.
> >        if (ret) {
> >                 cpus_write_unlock();
> >                 cpu_maps_update_done();
> >                 return ret;
> >        }
> >        ret = arch_register_cpu(pr->id);
> >        if (ret) {
> >                 acpi_unmap_cpu(pr->id);
> >
> >                 cpus_write_unlock();
> >                 cpu_maps_update_done();
> >                 return ret;
> >        }
> >       pr_info("CPU%d has been hot-added\n", pr->id);
> >       pr->flags.need_hotplug_init = 1;
> This one needs more careful handling because we are calling this
> for non hotplug cases on arm64 in which case we end up setting this
> for initially online CPUs - thus if we offline and online them
> again via sysfs /sys/bus/cpu/device/cpuX/online it goes through the
> hotplug path and should not.
>
> So I need a way to detect if we are hotplugging the cpu or not.
> Is there a standard way to do this?

If you mean physical hot-add, I don't think so.

What exactly do you need to do differently in the two cases?

>  I haven't figured out how to use flags in drivers to communicate this state.
>
> >
> >       cpus_write_unlock();
> >       cpu_maps_update_done();
> > }
> >
> > so I'm not sure why this cannot be combined with the new code.
> >
> > Say acpi_map_cpu) / acpi_unmap_cpu() are turned into arch calls.
> > What's the difference then?  The locking, which should be fine if I'm
> > not mistaken and need_hotplug_init that needs to be set if this code
> > runs after the processor driver has loaded AFAICS.
>
> That's the bit that I'm currently finding a challenge. Is there a clean
> way to detect that?

If you mean the need_hotplug_init flag, I'm currently a bit struggling
to convince myself that it is really needed.
Rafael J. Wysocki April 15, 2024, 12:51 p.m. UTC | #3
On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta <salil.mehta@huawei.com> wrote:
>
> Hello,
>
> >  From: Thomas Gleixner <tglx@linutronix.de>
> >  Sent: Friday, April 12, 2024 9:55 PM
> >
> >  On Fri, Apr 12 2024 at 21:16, Russell King (Oracle) wrote:
> >  > On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
> >  >> Say acpi_map_cpu) / acpi_unmap_cpu() are turned into arch calls.
> >  >> What's the difference then?  The locking, which should be fine if I'm
> >  >> not mistaken and need_hotplug_init that needs to be set if this code
> >  >> runs after the processor driver has loaded AFAICS.
> >  >
> >  > It is over this that I walked away from progressing this code, because
> >  > I don't think it's quite as simple as you make it out to be.
> >  >
> >  > Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch
> >  implemented
> >  > functions, so Arm64 can easily provide stubs for these that do nothing.
> >  > That never caused me any concern.
> >  >
> >  > What does cause me great concern though are the finer details. For
> >  > example, above you seem to drop the evaluation of _STA for the
> >  > "make_present" case - I've no idea whether that is something that
> >  > should be deleted or not (if it is something that can be deleted, then
> >  > why not delete it now?)
> >  >
> >  > As for the cpu locking, I couldn't find anything in
> >  > arch_register_cpu() that depends on the cpu_maps_update stuff nor
> >  > needs the cpus_write_lock being taken - so I've no idea why the
> >  > "make_present" case takes these locks.
> >
> >  Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> >  boot must hold the appropriate write locks. Otherwise it would be possible
> >  to online a CPU which just got marked present, but the registration has not
> >  completed yet.
> >
> >  > Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
> >  > obvious that this is required - remember that with Arm64's "enabled"
> >  > toggling, the "processor" is a slice of the system and doesn't
> >  > actually go away - it's just "not enabled" for use.
> >  >
> >  > Again, as "processors" in Arm64 are slices of the system, they have to
> >  > be fully described in ACPI before the OS boots, and they will be
> >  > marked as being "present", which means they will be enumerated, and
> >  > the driver will be probed. Any processor that is not to be used will
> >  > not have its enabled bit set. It is my understanding that every
> >  > processor will result in the ACPI processor driver being bound to it
> >  > whether its enabled or not.
> >  >
> >  > The difference between real hotplug and Arm64 hotplug is that real
> >  > hotplug makes stuff not-present (and thus unenumerable). Arm64
> >  hotplug
> >  > makes stuff not-enabled which is still enumerable.
> >
> >  Define "real hotplug" :)
> >
> >  Real physical hotplug does not really exist. That's at least true for x86, where
> >  the physical hotplug support was chased for a while, but never ended up in
> >  production.
> >
> >  Though virtualization happily jumped on it to hot add/remove CPUs to/from
> >  a guest.
> >
> >  There are limitations to this and we learned it the hard way on X86. At the
> >  end we came up with the following restrictions:
> >
> >      1) All possible CPUs have to be advertised at boot time via firmware
> >         (ACPI/DT/whatever) independent of them being present at boot time
> >         or not.
> >
> >         That guarantees proper sizing and ensures that associations
> >         between hardware entities and software representations and the
> >         resulting topology are stable for the lifetime of a system.
> >
> >         It is really required to know the full topology of the system at
> >         boot time especially with hybrid CPUs where some of the cores
> >         have hyperthreading and the others do not.
> >
> >
> >      2) Hot add can only mark an already registered (possible) CPU
> >         present. Adding non-registered CPUs after boot is not possible.
> >
> >         The CPU must have been registered in #1 already to ensure that
> >         the system topology does not suddenly change in an incompatible
> >         way at run-time.
> >
> >  The same restriction would apply to real physical hotplug. I don't think that's
> >  any different for ARM64 or any other architecture.
>
>
> There is a difference:
>
> 1.   ARM arch does not allows for any processor to be NOT present. Hence, because of
> this restriction any of its related per-cpu components must be present and enumerated
> at the boot time as well (exposed by firmware and ACPI). This means all the enumerated
> processors will be marked as 'present' but they might exist in NOT enabled (_STA.enabled=0)
> state.
>
> There was one clear difference and please correct me if I'm wrong here,  for x86, the LAPIC
> associated with the x86 core can be brought online later even after boot?
>
> But for ARM Arch, processors and its corresponding per-cpu components like redistributors
> all need to be present and enumerated during the boot time. Redistributors are part of
> ALWAYS-ON power domain.

OK

So what exactly is the problem with this and what does
acpi_processor_add() have to do with it?

Do you want the per-CPU structures etc. to be created from the
acpi_processor_add() path?

This plain won't work because acpi_processor_add(), as defined in the
mainline kernel today (and the Jonathan's patches don't change that
AFAICS), returns an error for processor devices with the "enabled" bit
clear in _STA (it returns an error if the "present" bit is clear too,
but that's obvious), so it only gets to calling arch_register_cpu() if
*both* "present" and "enabled" _STA bits are set for the given
processor device.

That, BTW, is why I keep saying that from the ACPI CPU enumeration
code perspective, there is no difference between "present" and
"enabled".

> 2.  Agreed regarding the topology. Are you suggesting that we must call arch_register_cpu()
> during boot time for all the 'present' CPUs? Even if that's the case, we might still want to defer
> registration of the cpu device (register_cpu() API) with the Linux device model. Later is what
> we are doing to hide/unhide the CPUs from the user while STA.Enabled Bit is toggled due to
> CPU (un)plug action.

There are two ways to approach this IMV, and both seem to be valid in principle.

One would be to treat CPUs with the "enabled" bit clear as not present
and create all of the requisite data structures for them when they
become available (in analogy with the "real hot-add" case).

The alternative one is to create all of the requisite data structures
for the CPUs that you find during boot, but register CPU devices for
those having the "enabled" _STA bit set.

It looks like you have chosen the second approach, which is fine with
me (although personally, I would rather choose the first one), but
then your arch code needs to arrange for the requisite CPU data
structures etc. to be set up before acpi_processor_add() gets called
because, as per the above, the latter just rejects CPUs with the
"enabled" _STA bit clear.

Thanks!
Rafael J. Wysocki April 15, 2024, 4:38 p.m. UTC | #4
On Mon, Apr 15, 2024 at 5:31 PM Salil Mehta <salil.mehta@huawei.com> wrote:
>
> >  From: Rafael J. Wysocki <rafael@kernel.org>
> >  Sent: Monday, April 15, 2024 1:51 PM
> >
> >  On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta <salil.mehta@huawei.com>
> >  wrote:
> >  >

[cut]

> >  > >  Though virtualization happily jumped on it to hot add/remove CPUs
> >  > > to/from  a guest.
> >  > >
> >  > >  There are limitations to this and we learned it the hard way on
> >  > > X86. At the  end we came up with the following restrictions:
> >  > >
> >  > >      1) All possible CPUs have to be advertised at boot time via firmware
> >  > >         (ACPI/DT/whatever) independent of them being present at boot time
> >  > >         or not.
> >  > >
> >  > >         That guarantees proper sizing and ensures that associations
> >  > >         between hardware entities and software representations and the
> >  > >         resulting topology are stable for the lifetime of a system.
> >  > >
> >  > >         It is really required to know the full topology of the system at
> >  > >         boot time especially with hybrid CPUs where some of the cores
> >  > >         have hyperthreading and the others do not.
> >  > >
> >  > >
> >  > >      2) Hot add can only mark an already registered (possible) CPU
> >  > >         present. Adding non-registered CPUs after boot is not possible.
> >  > >
> >  > >         The CPU must have been registered in #1 already to ensure that
> >  > >         the system topology does not suddenly change in an incompatible
> >  > >         way at run-time.
> >  > >
> >  > >  The same restriction would apply to real physical hotplug. I don't
> >  > > think that's  any different for ARM64 or any other architecture.
> >  >
> >  >
> >  > There is a difference:
> >  >
> >  > 1.   ARM arch does not allows for any processor to be NOT present. Hence,  because of
> >  > this restriction any of its related per-cpu components must be present
> >  > and enumerated at the boot time as well (exposed by firmware and
> >  > ACPI). This means all the enumerated processors will be marked as
> >  > 'present' but they might exist in NOT enabled (_STA.enabled=0) state.
> >  >
> >  > There was one clear difference and please correct me if I'm wrong
> >  > here,  for x86, the LAPIC associated with the x86 core can be brought online later even after boot?
> >  >
> >  > But for ARM Arch, processors and its corresponding per-cpu components
> >  > like redistributors all need to be present and enumerated during the
> >  > boot time. Redistributors are part of ALWAYS-ON power domain.
> >
> >  OK
> >
> >  So what exactly is the problem with this and what does
> >  acpi_processor_add() have to do with it?
>
>
> For ARM Arch, during boot time, it should add processor as if no hotplug exists. But later,
> in context to the (fake) hotplug trigger from the virtualizer as a result of the CPU (un)plug
> action  it should just end up in registering the already present CPU with the Linux Driver Model.

So let me repeat this last time: acpi_processor_add() cannot do that,
because (as defined today) it rejects CPUs with the "enabled" bit
clear in _STA.

> >
> >  Do you want the per-CPU structures etc. to be created from the
> >  acpi_processor_add() path?
>
>
> I referred to the components related to ARM CPU Arch like redistributors etc.
> which will get initialized in context to Arch specific _init code not here. This
> i.e. acpi_processor_add() is arch agnostic code common to all architectures.
>
> [ A digression: You do have _weak functions which can be overridden to arch specific
>  handling like  arch_(un)map_cpu() etc. but we can't use those to defer initialize
>  the CPU related components - ARM Arch constraint!]

Not right now, but they can be added I suppose.

>
> >
> >  This plain won't work because acpi_processor_add(), as defined in the
> >  mainline kernel today (and the Jonathan's patches don't change that
> >  AFAICS), returns an error for processor devices with the "enabled" bit clear
> >  in _STA (it returns an error if the "present" bit is clear too, but that's
> >  obvious), so it only gets to calling arch_register_cpu() if
> >  *both* "present" and "enabled" _STA bits are set for the given processor
> >  device.
>
>
> If you are referring to the _STA check in the XX_hot_add_init() then in the current
> kernel code it only checks for the ACPI_STA_DEVICE_PRESENT flag and not
> the ACPI_STA_DEVICE_ENABLED flag(?).

No, I am not.  I'm referring to this code in 6.9-rc4:

static int acpi_processor_add(struct acpi_device *device,
                    const struct acpi_device_id *id)
{
    struct acpi_processor *pr;
    struct device *dev;
    int result = 0;

    if (!acpi_device_is_enabled(device))
        return -ENODEV;

    ...
}

where acpi_device_is_enabled() is defined as follows:

bool acpi_device_is_enabled(const struct acpi_device *adev)
{
    return adev->status.present && adev->status.enabled;
}

> The code being reviewed has changed
> exactly that behavior for 2 legs i.e. make-present and make-enabled legs.

I'm not sure what you mean here, but the code above means that
acpi_processor_add) does not distinguish between CPU with the
"present" bit clear (in which case the "enabled" bit must also be
clear as per the spec) and CPUs with the "present" bit set and the
"enabled" bit clear.  These two cases are handled in the same way.

> I'm open to further address your point clearly.

I hope that the above is clear enough.

> >
> >  That, BTW, is why I keep saying that from the ACPI CPU enumeration code
> >  perspective, there is no difference between "present" and "enabled".
>
>
> Agreed but there is still a subtle difference.  Enumeration happens once and
> for all the processors during the boot time. And as confirmed by yourself and
> Thomas as well that even in x86 arch all the processors will be discovered and
> their topology fixed during the boot time which is effectively the same behavior
> as in the ARM Arch. But ARM assumes those 'present' bits in the present masks
> to be set during the boot time which is not like x86(?).  Hence, 'present cpu' Bits
> will always be equal to 'possible cpu' Bits. This is a constraint put by the ARM
> maintainers and looks unshakable.

Yes, there are differences between architectures, but the ACPI code is
(or at least should be) architecture-agnostic (as you said somewhere
above).  So why does this matter for the ACPI code?

> >
> >  > 2.  Agreed regarding the topology. Are you suggesting that we must
> >  > call arch_register_cpu() during boot time for all the 'present' CPUs?
> >  > Even if that's the case, we might still want to defer registration of
> >  > the cpu device (register_cpu() API) with the Linux device model. Later
> >  > is what we are doing to hide/unhide the CPUs from the user while
> >  STA.Enabled Bit is toggled due to CPU (un)plug action.
> >
> >  There are two ways to approach this IMV, and both seem to be valid in
> >  principle.
> >
> >  One would be to treat CPUs with the "enabled" bit clear as not present and
> >  create all of the requisite data structures for them when they become
> >  available (in analogy with the "real hot-add" case).
>
>
> Right. This one is out-of-scope for ARM Arch as we cannot defer any Arch
> specific sizing and initializations after boot i.e. when processor_add() gets
> called again later as a trigger of CPU plug action at the Virtualizer.
>
>
> >
> >  The alternative one is to create all of the requisite data structures for the
> >  CPUs that you find during boot, but register CPU devices for those having
> >  the "enabled" _STA bit set.
>
>
> Correct. and we defer the registration for CPUs with online-capable Bit
> set in the ACPI MADT/GICC data structure. These CPUs basically form
> set of hot-pluggable CPUs on ARM.
>
>
> >
> >  It looks like you have chosen the second approach, which is fine with me
> >  (although personally, I would rather choose the first one), but then your
> >  arch code needs to arrange for the requisite CPU data structures etc. to be
> >  set up before acpi_processor_add() gets called because, as per the above,
> >  the latter just rejects CPUs with the "enabled" _STA bit clear.
>
> Yes, correct. First one is not possible - at least for now and to have that it will
> require lot of rework especially at GIC. But there are many other arch components
> (like timers, PMUs, etc.) whose behavior needs to be specified somewhere in the
> architecture as well. All these are closely coupled with the ARM CPU architecture.
> (it's beyond this discussion and lets leave that to ARM folks).
>
> This patch-set has a change to deal with ACPI _STA.Enabled Bit accordingly.

Well, I'm having a hard time with this.

As far as CPU enumeration goes, if the "enabled" bit is clear in _STA,
it does not happen at all.  Both on ARM and on x86.

Now tell me why there need to be two separate code paths calling
arch_register_cpu() in acpi_processor_add()?

I see no reason whatsoever.

Moreover, I see reasons why there needs to be only one such code path.

Please feel free to prove me wrong.

Thanks!
Jonathan Cameron April 16, 2024, 2 p.m. UTC | #5
On Fri, 12 Apr 2024 15:37:04 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> From: James Morse <james.morse@arm.com>
> 
> The arm64 specific arch_register_cpu() call may defer CPU registration
> until the ACPI interpreter is available and the _STA method can
> be evaluated.
> 
> If this occurs, then a second attempt is made in
> acpi_processor_get_info(). Note that the arm64 specific call has
> not yet been added so for now this will never be successfully
> called.
> 
> Systems can still be booted with 'acpi=off', or not include an
> ACPI description at all as in these cases arch_register_cpu()
> will not have deferred registration when first called.
> 
> This moves the CPU register logic back to a subsys_initcall(),
> while the memory nodes will have been registered earlier.
> Note this is where the call was prior to the cleanup series so
> there should be no side effects of moving it back again for this
> specific case.
> 
> [PATCH 00/21] Initial cleanups for vCPU HP.
> https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
> 
> e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Joanthan Cameron <Jonathan.Cameron@huawei.com>
> ---
> v5: Update commit message to make it clear this is moving the
>     init back to where it was until very recently.
> 
>     No longer change the condition in the earlier registration point
>     as that will be handled by the arm64 registration routine
>     deferring until called again here.
> ---
>  drivers/acpi/acpi_processor.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 93e029403d05..c78398cdd060 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
>  
>  	c = &per_cpu(cpu_devices, pr->id);
>  	ACPI_COMPANION_SET(&c->dev, device);
> +	/*
> +	 * Register CPUs that are present. get_cpu_device() is used to skip
> +	 * duplicate CPU descriptions from firmware.
> +	 */
> +	if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> +	    !get_cpu_device(pr->id)) {

Just a quick note to call out that this case of 'duplicate' firmware
description needs an updated comment.  Now we are not deferring
registration on x86 this is detecting that arch_register_cpu()
has already been successfully called and we should not do it again.

I've added rather more detailed comments enumerating of the paths we
can take to hit acpi_processor_hotadd_init() in the v6 series
(tests ongoing)

Jonathan


> +		int ret = arch_register_cpu(pr->id);
> +
> +		if (ret)
> +			return ret;
> +	}
> +
>  	/*
>  	 *  Extra Processor objects may be enumerated on MP systems with
>  	 *  less than the max # of CPUs. They should be ignored _iff
Salil Mehta April 17, 2024, 3:01 p.m. UTC | #6
HI Rafael,

>  From: Rafael J. Wysocki <rafael@kernel.org>
>  Sent: Monday, April 15, 2024 5:39 PM
>  
>  On Mon, Apr 15, 2024 at 5:31 PM Salil Mehta <salil.mehta@huawei.com>  wrote:
>  >
>  > >  From: Rafael J. Wysocki <rafael@kernel.org>
>  > >  Sent: Monday, April 15, 2024 1:51 PM
>  > >
>  > >  On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta
>  > > <salil.mehta@huawei.com>
>  > >  wrote:
>  > >  >
>  
>  [cut]
>  
>  > >  > >  Though virtualization happily jumped on it to hot add/remove
>  > > CPUs  > > to/from  a guest.
>  > >  > >
>  > >  > >  There are limitations to this and we learned it the hard way
>  > > on  > > X86. At the  end we came up with the following restrictions:
>  > >  > >
>  > >  > >      1) All possible CPUs have to be advertised at boot time via  firmware
>  > >  > >         (ACPI/DT/whatever) independent of them being present at boot time
>  > >  > >         or not.
>  > >  > >
>  > >  > >         That guarantees proper sizing and ensures that associations
>  > >  > >         between hardware entities and software representations and the
>  > >  > >         resulting topology are stable for the lifetime of a system.
>  > >  > >
>  > >  > >         It is really required to know the full topology of the system at
>  > >  > >         boot time especially with hybrid CPUs where some of the cores
>  > >  > >         have hyperthreading and the others do not.
>  > >  > >
>  > >  > >
>  > >  > >      2) Hot add can only mark an already registered (possible) CPU
>  > >  > >         present. Adding non-registered CPUs after boot is not possible.
>  > >  > >
>  > >  > >         The CPU must have been registered in #1 already to ensure that
>  > >  > >         the system topology does not suddenly change in an incompatible
>  > >  > >         way at run-time.
>  > >  > >
>  > >  > >  The same restriction would apply to real physical hotplug. I
>  > > don't  > > think that's  any different for ARM64 or any other architecture.
>  > >  >
>  > >  >
>  > >  > There is a difference:
>  > >  >
>  > >  > 1.   ARM arch does not allows for any processor to be NOT present. Hence,  because of
>  > >  > this restriction any of its related per-cpu components must be
>  > > present  > and enumerated at the boot time as well (exposed by
>  > > firmware and  > ACPI). This means all the enumerated processors will
>  > > be marked as  > 'present' but they might exist in NOT enabled (_STA.enabled=0) state.
>  > >  >
>  > >  > There was one clear difference and please correct me if I'm wrong
>  > > > here,  for x86, the LAPIC associated with the x86 core can be brought online later even after boot?
>  > >  >
>  > >  > But for ARM Arch, processors and its corresponding per-cpu
>  > > components  > like redistributors all need to be present and
>  > > enumerated during the  > boot time. Redistributors are part of ALWAYS-ON power domain.
>  > >
>  > >  OK
>  > >
>  > >  So what exactly is the problem with this and what does
>  > >  acpi_processor_add() have to do with it?
>  >
>  >
>  > For ARM Arch, during boot time, it should add processor as if no
>  > hotplug exists. But later, in context to the (fake) hotplug trigger
>  > from the virtualizer as a result of the CPU (un)plug action  it should just
>  end up in registering the already present CPU with the Linux Driver Model.
>  
>  So let me repeat this last time: acpi_processor_add() cannot do that,
>  because (as defined today) it rejects CPUs with the "enabled" bit clear in  _STA.


I understand that now because you have placed a check recently. sorry for stretching
it a bit but I wanted to clearly understand the reason for this behavior. Is it because,

1. It does not makes sense to add a disabled but present/functional processor or
    perhaps there are repercussions to support such a behavior?

Or

2. None of the existing processors need such a behavior?



>  > >  Do you want the per-CPU structures etc. to be created from the
>  > >  acpi_processor_add() path?
>  >
>  >
>  > I referred to the components related to ARM CPU Arch like redistributors etc.
>  > which will get initialized in context to Arch specific _init code not
>  > here. This i.e. acpi_processor_add() is arch agnostic code common to all architectures.
>  >
>  > [ A digression: You do have _weak functions which can be overridden to
>  > arch specific  handling like  arch_(un)map_cpu() etc. but we can't use
>  > those to defer initialize  the CPU related components - ARM Arch
>  > constraint!]
>  
>  Not right now, but they can be added I suppose.
>  
>  >
>  > >
>  > >  This plain won't work because acpi_processor_add(), as defined in
>  > > the  mainline kernel today (and the Jonathan's patches don't change
>  > > that  AFAICS), returns an error for processor devices with the
>  > > "enabled" bit clear  in _STA (it returns an error if the "present"
>  > > bit is clear too, but that's  obvious), so it only gets to calling
>  > > arch_register_cpu() if
>  > >  *both* "present" and "enabled" _STA bits are set for the given
>  > > processor  device.
>  >
>  >
>  > If you are referring to the _STA check in the XX_hot_add_init() then
>  > in the current kernel code it only checks for the
>  > ACPI_STA_DEVICE_PRESENT flag and not the ACPI_STA_DEVICE_ENABLED flag(?).
>  
>  No, I am not.  I'm referring to this code in 6.9-rc4:
>  
>  static int acpi_processor_add(struct acpi_device *device,
>                      const struct acpi_device_id *id) {
>      struct acpi_processor *pr;
>      struct device *dev;
>      int result = 0;
>  
>      if (!acpi_device_is_enabled(device))
>          return -ENODEV;


Ahh, sorry, I missed this check as this has been added recently. Yes, now your
logic of having common legs makes more sense.


>  
>      ...
>  }
>  
>  where acpi_device_is_enabled() is defined as follows:
>  
>  bool acpi_device_is_enabled(const struct acpi_device *adev) {
>      return adev->status.present && adev->status.enabled; }


Got it. 


[digression note:]
BTW, I'm wondering why we are checking adev->status.present
as having adev->status.enabled as set and adev->status.present
as unset would mean firmware has a BUG. If we really want to
check this then we should rather flag a warning on detection
of this condition? 


Either this:
 bool acpi_device_is_enabled(const struct acpi_device *adev) {
      
     if (!acpi_device_is_present(adev)) {
            if (adev->status.enabled)
                       pr_debug("Device [%s] status inconsistent: Enabled but not Present\n",
                                          device->pnp.bus_id);
            return false;
     }
      return  true;
}

Ideally this inconsistency should have been checked in acpi_bus_get_status()
and above function should have been just,

file: drivers/acpi/scan.c
bool acpi_device_is_enabled(const struct acpi_device *adev) {
      return !!adev->status.enabled; }


file: drivers/acpi/bus.c
int acpi_bus_get_status(struct acpi_device *device)
{
       [...]

	status = acpi_bus_get_status_handle(device->handle, &sta);
	if (ACPI_FAILURE(status))
		return -ENODEV;

	acpi_set_device_status(device, sta);

	if (device->status.functional && !device->status.present) {
		pr_debug("Device [%s] status [%08x]: functional but not present\n",
			 device->pnp.bus_id, (u32)sta);
	}

+	if (device->status.enabled && !device->status.present) {
+		pr_debug("BUG: Device [%s] status [%08x]: enabled but not present\n",
+			 device->pnp.bus_id, (u32)sta);
+                         /* any specific handling here? */
+	}

	pr_debug("Device [%s] status [%08x]\n", device->pnp.bus_id, (u32)sta);
	return 0;
}

>  
>  > The code being reviewed has changed
>  > exactly that behavior for 2 legs i.e. make-present and make-enabled legs.
>  
>  I'm not sure what you mean here, but the code above means that
>  acpi_processor_add) does not distinguish between CPU with the "present"
>  bit clear (in which case the "enabled" bit must also be clear as per the spec)
>  and CPUs with the "present" bit set and the "enabled" bit clear.  These two
>  cases are handled in the same way.
>  
>  > I'm open to further address your point clearly.
>  
>  I hope that the above is clear enough.


Yes, clear now. I missed the new check.

>  
>  > >
>  > >  That, BTW, is why I keep saying that from the ACPI CPU enumeration
>  > > code  perspective, there is no difference between "present" and
>  "enabled".
>  >
>  >
>  > Agreed but there is still a subtle difference.  Enumeration happens
>  > once and for all the processors during the boot time. And as confirmed
>  > by yourself and Thomas as well that even in x86 arch all the
>  > processors will be discovered and their topology fixed during the boot
>  > time which is effectively the same behavior as in the ARM Arch. But
>  > ARM assumes those 'present' bits in the present masks to be set during
>  > the boot time which is not like x86(?).  Hence, 'present cpu' Bits
>  > will always be equal to 'possible cpu' Bits. This is a constraint put by the
>  ARM maintainers and looks unshakable.
>  
>  Yes, there are differences between architectures, but the ACPI code is (or
>  at least should be) architecture-agnostic (as you said somewhere above).
>  So why does this matter for the ACPI code?


It should not. There were few bits like overriding of arch_register_cpu() which
was not allowed by ARM folks in 2020 when I floated the first prototype.


>  > >  > 2.  Agreed regarding the topology. Are you suggesting that we
>  > > must  > call arch_register_cpu() during boot time for all the 'present'  CPUs?
>  > >  > Even if that's the case, we might still want to defer
>  > > registration of  > the cpu device (register_cpu() API) with the
>  > > Linux device model. Later  > is what we are doing to hide/unhide the
>  > > CPUs from the user while  STA.Enabled Bit is toggled due to CPU  (un)plug action.
>  > >
>  > >  There are two ways to approach this IMV, and both seem to be valid
>  > > in  principle.
>  > >
>  > >  One would be to treat CPUs with the "enabled" bit clear as not
>  > > present and  create all of the requisite data structures for them
>  > > when they become  available (in analogy with the "real hot-add" case).
>  >
>  >
>  > Right. This one is out-of-scope for ARM Arch as we cannot defer any
>  > Arch specific sizing and initializations after boot i.e. when
>  > processor_add() gets called again later as a trigger of CPU plug action at the Virtualizer.
>  >
>  >
>  > >
>  > >  The alternative one is to create all of the requisite data
>  > > structures for the  CPUs that you find during boot, but register CPU
>  > > devices for those having  the "enabled" _STA bit set.
>  >
>  >
>  > Correct. and we defer the registration for CPUs with online-capable
>  > Bit set in the ACPI MADT/GICC data structure. These CPUs basically
>  > form set of hot-pluggable CPUs on ARM.
>  >
>  >
>  > >
>  > >  It looks like you have chosen the second approach, which is fine
>  > > with me  (although personally, I would rather choose the first one),
>  > > but then your  arch code needs to arrange for the requisite CPU data
>  > > structures etc. to be  set up before acpi_processor_add() gets
>  > > called because, as per the above,  the latter just rejects CPUs with the  "enabled" _STA bit clear.
>  >
>  > Yes, correct. First one is not possible - at least for now and to have
>  > that it will require lot of rework especially at GIC. But there are
>  > many other arch components (like timers, PMUs, etc.) whose behavior
>  > needs to be specified somewhere in the architecture as well. All these are closely coupled with the ARM CPU architecture.
>  > (it's beyond this discussion and lets leave that to ARM folks).
>  >
>  > This patch-set has a change to deal with ACPI _STA.Enabled Bit accordingly.
>  
>  Well, I'm having a hard time with this.
>  
>  As far as CPU enumeration goes, if the "enabled" bit is clear in _STA, it does
>  not happen at all.  Both on ARM and on x86.

sure, I can see that now.

>  
>  Now tell me why there need to be two separate code paths calling
>  arch_register_cpu() in acpi_processor_add()?


As mentioned above, in the first prototype I floated in the year 2020 any attempts
to override the __weak call of arch_register_cpu() for ARM64 was discouraged. 
Though, the reasons might have changed now as some code has been moved.

Once we are allowed to override the calls then there are many more possibilities
which open up to simplify the code further.


>  
>  I see no reason whatsoever.
>  
>  Moreover, I see reasons why there needs to be only one such code path.
>  
>  Please feel free to prove me wrong.
>  
>  Thanks!
Rafael J. Wysocki April 17, 2024, 4:19 p.m. UTC | #7
On Wed, Apr 17, 2024 at 5:01 PM Salil Mehta <salil.mehta@huawei.com> wrote:
>
> HI Rafael,
>
> >  From: Rafael J. Wysocki <rafael@kernel.org>
> >  Sent: Monday, April 15, 2024 5:39 PM
> >
> >  On Mon, Apr 15, 2024 at 5:31 PM Salil Mehta <salil.mehta@huawei.com>  wrote:
> >  >
> >  > >  From: Rafael J. Wysocki <rafael@kernel.org>
> >  > >  Sent: Monday, April 15, 2024 1:51 PM
> >  > >
> >  > >  On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta
> >  > > <salil.mehta@huawei.com>
> >  > >  wrote:
> >  > >  >
> >
> >  [cut]
> >
> >  > >  > >  Though virtualization happily jumped on it to hot add/remove
> >  > > CPUs  > > to/from  a guest.
> >  > >  > >
> >  > >  > >  There are limitations to this and we learned it the hard way
> >  > > on  > > X86. At the  end we came up with the following restrictions:
> >  > >  > >
> >  > >  > >      1) All possible CPUs have to be advertised at boot time via  firmware
> >  > >  > >         (ACPI/DT/whatever) independent of them being present at boot time
> >  > >  > >         or not.
> >  > >  > >
> >  > >  > >         That guarantees proper sizing and ensures that associations
> >  > >  > >         between hardware entities and software representations and the
> >  > >  > >         resulting topology are stable for the lifetime of a system.
> >  > >  > >
> >  > >  > >         It is really required to know the full topology of the system at
> >  > >  > >         boot time especially with hybrid CPUs where some of the cores
> >  > >  > >         have hyperthreading and the others do not.
> >  > >  > >
> >  > >  > >
> >  > >  > >      2) Hot add can only mark an already registered (possible) CPU
> >  > >  > >         present. Adding non-registered CPUs after boot is not possible.
> >  > >  > >
> >  > >  > >         The CPU must have been registered in #1 already to ensure that
> >  > >  > >         the system topology does not suddenly change in an incompatible
> >  > >  > >         way at run-time.
> >  > >  > >
> >  > >  > >  The same restriction would apply to real physical hotplug. I
> >  > > don't  > > think that's  any different for ARM64 or any other architecture.
> >  > >  >
> >  > >  >
> >  > >  > There is a difference:
> >  > >  >
> >  > >  > 1.   ARM arch does not allows for any processor to be NOT present. Hence,  because of
> >  > >  > this restriction any of its related per-cpu components must be
> >  > > present  > and enumerated at the boot time as well (exposed by
> >  > > firmware and  > ACPI). This means all the enumerated processors will
> >  > > be marked as  > 'present' but they might exist in NOT enabled (_STA.enabled=0) state.
> >  > >  >
> >  > >  > There was one clear difference and please correct me if I'm wrong
> >  > > > here,  for x86, the LAPIC associated with the x86 core can be brought online later even after boot?
> >  > >  >
> >  > >  > But for ARM Arch, processors and its corresponding per-cpu
> >  > > components  > like redistributors all need to be present and
> >  > > enumerated during the  > boot time. Redistributors are part of ALWAYS-ON power domain.
> >  > >
> >  > >  OK
> >  > >
> >  > >  So what exactly is the problem with this and what does
> >  > >  acpi_processor_add() have to do with it?
> >  >
> >  >
> >  > For ARM Arch, during boot time, it should add processor as if no
> >  > hotplug exists. But later, in context to the (fake) hotplug trigger
> >  > from the virtualizer as a result of the CPU (un)plug action  it should just
> >  end up in registering the already present CPU with the Linux Driver Model.
> >
> >  So let me repeat this last time: acpi_processor_add() cannot do that,
> >  because (as defined today) it rejects CPUs with the "enabled" bit clear in  _STA.
>
>
> I understand that now because you have placed a check recently. sorry for stretching
> it a bit but I wanted to clearly understand the reason for this behavior. Is it because,
>
> 1. It does not makes sense to add a disabled but present/functional processor or
>     perhaps there are repercussions to support such a behavior?

Yes because it is unusable.

> Or
>
> 2. None of the existing processors need such a behavior?
>
>
>
> >  > >  Do you want the per-CPU structures etc. to be created from the
> >  > >  acpi_processor_add() path?
> >  >
> >  >
> >  > I referred to the components related to ARM CPU Arch like redistributors etc.
> >  > which will get initialized in context to Arch specific _init code not
> >  > here. This i.e. acpi_processor_add() is arch agnostic code common to all architectures.
> >  >
> >  > [ A digression: You do have _weak functions which can be overridden to
> >  > arch specific  handling like  arch_(un)map_cpu() etc. but we can't use
> >  > those to defer initialize  the CPU related components - ARM Arch
> >  > constraint!]
> >
> >  Not right now, but they can be added I suppose.
> >
> >  >
> >  > >
> >  > >  This plain won't work because acpi_processor_add(), as defined in
> >  > > the  mainline kernel today (and the Jonathan's patches don't change
> >  > > that  AFAICS), returns an error for processor devices with the
> >  > > "enabled" bit clear  in _STA (it returns an error if the "present"
> >  > > bit is clear too, but that's  obvious), so it only gets to calling
> >  > > arch_register_cpu() if
> >  > >  *both* "present" and "enabled" _STA bits are set for the given
> >  > > processor  device.
> >  >
> >  >
> >  > If you are referring to the _STA check in the XX_hot_add_init() then
> >  > in the current kernel code it only checks for the
> >  > ACPI_STA_DEVICE_PRESENT flag and not the ACPI_STA_DEVICE_ENABLED flag(?).
> >
> >  No, I am not.  I'm referring to this code in 6.9-rc4:
> >
> >  static int acpi_processor_add(struct acpi_device *device,
> >                      const struct acpi_device_id *id) {
> >      struct acpi_processor *pr;
> >      struct device *dev;
> >      int result = 0;
> >
> >      if (!acpi_device_is_enabled(device))
> >          return -ENODEV;
>
>
> Ahh, sorry, I missed this check as this has been added recently. Yes, now your
> logic of having common legs makes more sense.
>
>
> >
> >      ...
> >  }
> >
> >  where acpi_device_is_enabled() is defined as follows:
> >
> >  bool acpi_device_is_enabled(const struct acpi_device *adev) {
> >      return adev->status.present && adev->status.enabled; }
>
>
> Got it.
>
>
> [digression note:]
> BTW, I'm wondering why we are checking adev->status.present
> as having adev->status.enabled as set and adev->status.present
> as unset would mean firmware has a BUG. If we really want to
> check this then we should rather flag a warning on detection
> of this condition?

Adding a warning would be fine with me.

> Either this:
>  bool acpi_device_is_enabled(const struct acpi_device *adev) {
>
>      if (!acpi_device_is_present(adev)) {
>             if (adev->status.enabled)
>                        pr_debug("Device [%s] status inconsistent: Enabled but not Present\n",
>                                           device->pnp.bus_id);
>             return false;
>      }
>       return  true;
> }
>
> Ideally this inconsistency should have been checked in acpi_bus_get_status()
> and above function should have been just,

Yes, it can be added there.  It can even clear 'enabled' if 'present' is clear.

> file: drivers/acpi/scan.c
> bool acpi_device_is_enabled(const struct acpi_device *adev) {
>       return !!adev->status.enabled; }

Sure.

> file: drivers/acpi/bus.c
> int acpi_bus_get_status(struct acpi_device *device)
> {
>        [...]
>
>         status = acpi_bus_get_status_handle(device->handle, &sta);
>         if (ACPI_FAILURE(status))
>                 return -ENODEV;
>
>         acpi_set_device_status(device, sta);
>
>         if (device->status.functional && !device->status.present) {
>                 pr_debug("Device [%s] status [%08x]: functional but not present\n",
>                          device->pnp.bus_id, (u32)sta);
>         }
>
> +       if (device->status.enabled && !device->status.present) {
> +               pr_debug("BUG: Device [%s] status [%08x]: enabled but not present\n",
> +                        device->pnp.bus_id, (u32)sta);
> +                         /* any specific handling here? */
> +       }
>
>         pr_debug("Device [%s] status [%08x]\n", device->pnp.bus_id, (u32)sta);
>         return 0;
> }
>
> >
> >  > The code being reviewed has changed
> >  > exactly that behavior for 2 legs i.e. make-present and make-enabled legs.
> >
> >  I'm not sure what you mean here, but the code above means that
> >  acpi_processor_add) does not distinguish between CPU with the "present"
> >  bit clear (in which case the "enabled" bit must also be clear as per the spec)
> >  and CPUs with the "present" bit set and the "enabled" bit clear.  These two
> >  cases are handled in the same way.
> >
> >  > I'm open to further address your point clearly.
> >
> >  I hope that the above is clear enough.
>
>
> Yes, clear now. I missed the new check.
>
> >
> >  > >
> >  > >  That, BTW, is why I keep saying that from the ACPI CPU enumeration
> >  > > code  perspective, there is no difference between "present" and
> >  "enabled".
> >  >
> >  >
> >  > Agreed but there is still a subtle difference.  Enumeration happens
> >  > once and for all the processors during the boot time. And as confirmed
> >  > by yourself and Thomas as well that even in x86 arch all the
> >  > processors will be discovered and their topology fixed during the boot
> >  > time which is effectively the same behavior as in the ARM Arch. But
> >  > ARM assumes those 'present' bits in the present masks to be set during
> >  > the boot time which is not like x86(?).  Hence, 'present cpu' Bits
> >  > will always be equal to 'possible cpu' Bits. This is a constraint put by the
> >  ARM maintainers and looks unshakable.
> >
> >  Yes, there are differences between architectures, but the ACPI code is (or
> >  at least should be) architecture-agnostic (as you said somewhere above).
> >  So why does this matter for the ACPI code?
>
>
> It should not. There were few bits like overriding of arch_register_cpu() which
> was not allowed by ARM folks in 2020 when I floated the first prototype.
>
>
> >  > >  > 2.  Agreed regarding the topology. Are you suggesting that we
> >  > > must  > call arch_register_cpu() during boot time for all the 'present'  CPUs?
> >  > >  > Even if that's the case, we might still want to defer
> >  > > registration of  > the cpu device (register_cpu() API) with the
> >  > > Linux device model. Later  > is what we are doing to hide/unhide the
> >  > > CPUs from the user while  STA.Enabled Bit is toggled due to CPU  (un)plug action.
> >  > >
> >  > >  There are two ways to approach this IMV, and both seem to be valid
> >  > > in  principle.
> >  > >
> >  > >  One would be to treat CPUs with the "enabled" bit clear as not
> >  > > present and  create all of the requisite data structures for them
> >  > > when they become  available (in analogy with the "real hot-add" case).
> >  >
> >  >
> >  > Right. This one is out-of-scope for ARM Arch as we cannot defer any
> >  > Arch specific sizing and initializations after boot i.e. when
> >  > processor_add() gets called again later as a trigger of CPU plug action at the Virtualizer.
> >  >
> >  >
> >  > >
> >  > >  The alternative one is to create all of the requisite data
> >  > > structures for the  CPUs that you find during boot, but register CPU
> >  > > devices for those having  the "enabled" _STA bit set.
> >  >
> >  >
> >  > Correct. and we defer the registration for CPUs with online-capable
> >  > Bit set in the ACPI MADT/GICC data structure. These CPUs basically
> >  > form set of hot-pluggable CPUs on ARM.
> >  >
> >  >
> >  > >
> >  > >  It looks like you have chosen the second approach, which is fine
> >  > > with me  (although personally, I would rather choose the first one),
> >  > > but then your  arch code needs to arrange for the requisite CPU data
> >  > > structures etc. to be  set up before acpi_processor_add() gets
> >  > > called because, as per the above,  the latter just rejects CPUs with the  "enabled" _STA bit clear.
> >  >
> >  > Yes, correct. First one is not possible - at least for now and to have
> >  > that it will require lot of rework especially at GIC. But there are
> >  > many other arch components (like timers, PMUs, etc.) whose behavior
> >  > needs to be specified somewhere in the architecture as well. All these are closely coupled with the ARM CPU architecture.
> >  > (it's beyond this discussion and lets leave that to ARM folks).
> >  >
> >  > This patch-set has a change to deal with ACPI _STA.Enabled Bit accordingly.
> >
> >  Well, I'm having a hard time with this.
> >
> >  As far as CPU enumeration goes, if the "enabled" bit is clear in _STA, it does
> >  not happen at all.  Both on ARM and on x86.
>
> sure, I can see that now.
>
> >
> >  Now tell me why there need to be two separate code paths calling
> >  arch_register_cpu() in acpi_processor_add()?
>
>
> As mentioned above, in the first prototype I floated in the year 2020 any attempts
> to override the __weak call of arch_register_cpu() for ARM64 was discouraged.
> Though, the reasons might have changed now as some code has been moved.
>
> Once we are allowed to override the calls then there are many more possibilities
> which open up to simplify the code further.

Well, IMV this should just be an arch function with no __weak
defaults, because the default would probably be unusable in practice
anyway.
diff mbox series

Patch

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 93e029403d05..c78398cdd060 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -317,6 +317,18 @@  static int acpi_processor_get_info(struct acpi_device *device)
 
 	c = &per_cpu(cpu_devices, pr->id);
 	ACPI_COMPANION_SET(&c->dev, device);
+	/*
+	 * Register CPUs that are present. get_cpu_device() is used to skip
+	 * duplicate CPU descriptions from firmware.
+	 */
+	if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
+	    !get_cpu_device(pr->id)) {
+		int ret = arch_register_cpu(pr->id);
+
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 *  Extra Processor objects may be enumerated on MP systems with
 	 *  less than the max # of CPUs. They should be ignored _iff