Message ID | 20250503191515.24041-4-ricardo.neri-calderon@linux.intel.com |
---|---|
State | New |
Headers | show |
Series | x86/hyperv/hv_vtl: Use a wakeup mailbox to boot secondary CPUs | expand |
On Sat, May 3, 2025 at 9:10 PM Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote: > > The bootstrap processor uses acpi_wakeup_cpu() to indicate to firmware that > it wants to boot a secondary CPU using a mailbox as described in the > Multiprocessor Wakeup Structure of the ACPI specification. > > The wakeup mailbox does not strictly require support from ACPI. Well, except that it is defined by the ACPI specification. > The platform firmware can implement a mailbox compatible in structure and > operation and enumerate it using other mechanisms such a DeviceTree graph. So is there a specification defining this mechanism? It is generally not sufficient to put the code and DT bindings unilaterally into the OS and expect the firmware to follow suit. > Move the code used to setup and use the mailbox out of the ACPI > directory to use it when support for ACPI is not available or needed. I think that the code implementing interfaces defined by the ACPI specification is not generic and so it should not be built when the kernel is configured without ACPI support. > No functional changes are intended. > > Originally-by: Yunhong Jiang <yunhong.jiang@linux.intel.com> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> > --- > Changes since v2: > - Only move to smpboot.c the portions of the code that configure and > use the mailbox. This also resolved the compile warnings about unused > functions that Michael Kelley reported. > - Edited the commit message for clarity. > > Changes since v1: > - None. > --- > arch/x86/kernel/acpi/madt_wakeup.c | 75 ---------------------------- > arch/x86/kernel/smpboot.c | 78 ++++++++++++++++++++++++++++++ > 2 files changed, 78 insertions(+), 75 deletions(-) > > diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c > index 6b9e41a24574..15627f63f9f5 100644 > --- a/arch/x86/kernel/acpi/madt_wakeup.c > +++ b/arch/x86/kernel/acpi/madt_wakeup.c > @@ -2,7 +2,6 @@ > #include <linux/acpi.h> > #include <linux/cpu.h> > #include <linux/delay.h> > -#include <linux/io.h> > #include <linux/kexec.h> > #include <linux/memblock.h> > #include <linux/pgtable.h> > @@ -15,12 +14,6 @@ > #include <asm/processor.h> > #include <asm/reboot.h> > > -/* Physical address of the Multiprocessor Wakeup Structure mailbox */ > -static u64 acpi_mp_wake_mailbox_paddr __ro_after_init; > - > -/* Virtual address of the Multiprocessor Wakeup Structure mailbox */ > -static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox; > - > static u64 acpi_mp_pgd __ro_after_init; > static u64 acpi_mp_reset_vector_paddr __ro_after_init; > > @@ -127,63 +120,6 @@ static int __init acpi_mp_setup_reset(u64 reset_vector) > return 0; > } > > -static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip) > -{ > - if (!acpi_mp_wake_mailbox_paddr) { > - pr_warn_once("No MADT mailbox: cannot bringup secondary CPUs. Booting with kexec?\n"); > - return -EOPNOTSUPP; > - } > - > - /* > - * Remap mailbox memory only for the first call to acpi_wakeup_cpu(). > - * > - * Wakeup of secondary CPUs is fully serialized in the core code. > - * No need to protect acpi_mp_wake_mailbox from concurrent accesses. > - */ > - if (!acpi_mp_wake_mailbox) { > - acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr, > - sizeof(*acpi_mp_wake_mailbox), > - MEMREMAP_WB); > - } > - > - /* > - * Mailbox memory is shared between the firmware and OS. Firmware will > - * listen on mailbox command address, and once it receives the wakeup > - * command, the CPU associated with the given apicid will be booted. > - * > - * The value of 'apic_id' and 'wakeup_vector' must be visible to the > - * firmware before the wakeup command is visible. smp_store_release() > - * ensures ordering and visibility. > - */ > - acpi_mp_wake_mailbox->apic_id = apicid; > - acpi_mp_wake_mailbox->wakeup_vector = start_ip; > - smp_store_release(&acpi_mp_wake_mailbox->command, > - ACPI_MP_WAKE_COMMAND_WAKEUP); > - > - /* > - * Wait for the CPU to wake up. > - * > - * The CPU being woken up is essentially in a spin loop waiting to be > - * woken up. It should not take long for it wake up and acknowledge by > - * zeroing out ->command. > - * > - * ACPI specification doesn't provide any guidance on how long kernel > - * has to wait for a wake up acknowledgment. It also doesn't provide > - * a way to cancel a wake up request if it takes too long. > - * > - * In TDX environment, the VMM has control over how long it takes to > - * wake up secondary. It can postpone scheduling secondary vCPU > - * indefinitely. Giving up on wake up request and reporting error opens > - * possible attack vector for VMM: it can wake up a secondary CPU when > - * kernel doesn't expect it. Wait until positive result of the wake up > - * request. > - */ > - while (READ_ONCE(acpi_mp_wake_mailbox->command)) > - cpu_relax(); > - > - return 0; > -} > - > static void acpi_mp_disable_offlining(struct acpi_madt_multiproc_wakeup *mp_wake) > { > cpu_hotplug_disable_offlining(); > @@ -246,14 +182,3 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header, > > return 0; > } > - > -void __init setup_mp_wakeup_mailbox(u64 mailbox_paddr) > -{ > - acpi_mp_wake_mailbox_paddr = mailbox_paddr; > - apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu); > -} > - > -struct acpi_madt_multiproc_wakeup_mailbox *get_mp_wakeup_mailbox(void) > -{ > - return acpi_mp_wake_mailbox; > -} > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > index d6cf1e23c2a3..6f39ebe4d192 100644 > --- a/arch/x86/kernel/smpboot.c > +++ b/arch/x86/kernel/smpboot.c > @@ -61,7 +61,9 @@ > #include <linux/cpuhotplug.h> > #include <linux/mc146818rtc.h> > #include <linux/acpi.h> > +#include <linux/io.h> > > +#include <asm/barrier.h> > #include <asm/acpi.h> > #include <asm/cacheinfo.h> > #include <asm/cpuid.h> > @@ -1354,3 +1356,79 @@ void native_play_dead(void) > } > > #endif > + > +#ifdef CONFIG_X86_64 > +/* Physical address of the Multiprocessor Wakeup Structure mailbox */ > +static u64 acpi_mp_wake_mailbox_paddr __ro_after_init; > + > +/* Virtual address of the Multiprocessor Wakeup Structure mailbox */ > +static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox; > + > +static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip) > +{ > + if (!acpi_mp_wake_mailbox_paddr) { > + pr_warn_once("No MADT mailbox: cannot bringup secondary CPUs. Booting with kexec?\n"); > + return -EOPNOTSUPP; > + } > + > + /* > + * Remap mailbox memory only for the first call to acpi_wakeup_cpu(). > + * > + * Wakeup of secondary CPUs is fully serialized in the core code. > + * No need to protect acpi_mp_wake_mailbox from concurrent accesses. > + */ > + if (!acpi_mp_wake_mailbox) { > + acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr, > + sizeof(*acpi_mp_wake_mailbox), > + MEMREMAP_WB); > + } > + > + /* > + * Mailbox memory is shared between the firmware and OS. Firmware will > + * listen on mailbox command address, and once it receives the wakeup > + * command, the CPU associated with the given apicid will be booted. > + * > + * The value of 'apic_id' and 'wakeup_vector' must be visible to the > + * firmware before the wakeup command is visible. smp_store_release() > + * ensures ordering and visibility. > + */ > + acpi_mp_wake_mailbox->apic_id = apicid; > + acpi_mp_wake_mailbox->wakeup_vector = start_ip; > + smp_store_release(&acpi_mp_wake_mailbox->command, > + ACPI_MP_WAKE_COMMAND_WAKEUP); > + > + /* > + * Wait for the CPU to wake up. > + * > + * The CPU being woken up is essentially in a spin loop waiting to be > + * woken up. It should not take long for it wake up and acknowledge by > + * zeroing out ->command. > + * > + * ACPI specification doesn't provide any guidance on how long kernel > + * has to wait for a wake up acknowledgment. It also doesn't provide > + * a way to cancel a wake up request if it takes too long. > + * > + * In TDX environment, the VMM has control over how long it takes to > + * wake up secondary. It can postpone scheduling secondary vCPU > + * indefinitely. Giving up on wake up request and reporting error opens > + * possible attack vector for VMM: it can wake up a secondary CPU when > + * kernel doesn't expect it. Wait until positive result of the wake up > + * request. > + */ > + while (READ_ONCE(acpi_mp_wake_mailbox->command)) > + cpu_relax(); > + > + return 0; > +} > + > +void __init setup_mp_wakeup_mailbox(u64 mailbox_paddr) > +{ > + acpi_mp_wake_mailbox_paddr = mailbox_paddr; > + apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu); > +} > + > +struct acpi_madt_multiproc_wakeup_mailbox *get_mp_wakeup_mailbox(void) > +{ > + return acpi_mp_wake_mailbox; > +} > +#endif > -- > 2.43.0 > >
On Mon, May 05, 2025 at 12:03:13PM +0200, Rafael J. Wysocki wrote: > On Sat, May 3, 2025 at 9:10 PM Ricardo Neri > <ricardo.neri-calderon@linux.intel.com> wrote: > > > > The bootstrap processor uses acpi_wakeup_cpu() to indicate to firmware that > > it wants to boot a secondary CPU using a mailbox as described in the > > Multiprocessor Wakeup Structure of the ACPI specification. > > > > The wakeup mailbox does not strictly require support from ACPI. > > Well, except that it is defined by the ACPI specification. That is true. > > > The platform firmware can implement a mailbox compatible in structure and > > operation and enumerate it using other mechanisms such a DeviceTree graph. > > So is there a specification defining this mechanism? > > It is generally not sufficient to put the code and DT bindings > unilaterally into the OS and expect the firmware to follow suit. > This mechanism is described in the section 4.3.5 of the Intel TDX Virtual Firmware Design Guide [1], but it refeers to the ACPI specification for the interface. > > Move the code used to setup and use the mailbox out of the ACPI > > directory to use it when support for ACPI is not available or needed. > > I think that the code implementing interfaces defined by the ACPI > specification is not generic and so it should not be built when the > kernel is configured without ACPI support. Support for ACPI would not be used on systems describing hardware using a DeviceTree graph. My goal is to have a common interface that both DT and ACPI can use. I think what is missing is that common interface. Would it be preferred to have a separate implementation that is functionally equivalent? Thanks and BR, Ricardo [1]. https://cdrdv2-public.intel.com/733585/tdx-virtual-firmware-design-guide-rev-004-20231206.pdf
On Tue, May 6, 2025 at 7:32 AM Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote: > > On Mon, May 05, 2025 at 12:03:13PM +0200, Rafael J. Wysocki wrote: > > On Sat, May 3, 2025 at 9:10 PM Ricardo Neri > > <ricardo.neri-calderon@linux.intel.com> wrote: > > > > > > The bootstrap processor uses acpi_wakeup_cpu() to indicate to firmware that > > > it wants to boot a secondary CPU using a mailbox as described in the > > > Multiprocessor Wakeup Structure of the ACPI specification. > > > > > > The wakeup mailbox does not strictly require support from ACPI. > > > > Well, except that it is defined by the ACPI specification. > > That is true. > > > > > > The platform firmware can implement a mailbox compatible in structure and > > > operation and enumerate it using other mechanisms such a DeviceTree graph. > > > > So is there a specification defining this mechanism? > > > > It is generally not sufficient to put the code and DT bindings > > unilaterally into the OS and expect the firmware to follow suit. > > > > This mechanism is described in the section 4.3.5 of the Intel TDX > Virtual Firmware Design Guide [1], but it refeers to the ACPI > specification for the interface. Yes, it does. > > > Move the code used to setup and use the mailbox out of the ACPI > > > directory to use it when support for ACPI is not available or needed. > > > > I think that the code implementing interfaces defined by the ACPI > > specification is not generic and so it should not be built when the > > kernel is configured without ACPI support. > > Support for ACPI would not be used on systems describing hardware using > a DeviceTree graph. My goal is to have a common interface that both > DT and ACPI can use. I think what is missing is that common interface. I get the general idea. :-) > Would it be preferred to have a separate implementation that is > functionally equivalent? Functionally equivalent on the OS side, that is. It would be better, but honestly I'm not sure if this is going to be sufficient to eliminate the dependency on the ACPI spec. It is just as you want the firmware to implement this tiny bit of the ACPI spec even though it is not implementing the rest of it. > [1]. https://cdrdv2-public.intel.com/733585/tdx-virtual-firmware-design-guide-rev-004-20231206.pdf
On Tue, May 06, 2025 at 07:26:01PM +0200, Rafael J. Wysocki wrote: > On Tue, May 6, 2025 at 7:32 AM Ricardo Neri > <ricardo.neri-calderon@linux.intel.com> wrote: > > > > On Mon, May 05, 2025 at 12:03:13PM +0200, Rafael J. Wysocki wrote: > > > On Sat, May 3, 2025 at 9:10 PM Ricardo Neri > > > <ricardo.neri-calderon@linux.intel.com> wrote: > > > > > > > > The bootstrap processor uses acpi_wakeup_cpu() to indicate to firmware that > > > > it wants to boot a secondary CPU using a mailbox as described in the > > > > Multiprocessor Wakeup Structure of the ACPI specification. > > > > > > > > The wakeup mailbox does not strictly require support from ACPI. > > > > > > Well, except that it is defined by the ACPI specification. > > > > That is true. > > > > > > > > > The platform firmware can implement a mailbox compatible in structure and > > > > operation and enumerate it using other mechanisms such a DeviceTree graph. > > > > > > So is there a specification defining this mechanism? > > > > > > It is generally not sufficient to put the code and DT bindings > > > unilaterally into the OS and expect the firmware to follow suit. > > > > > > > This mechanism is described in the section 4.3.5 of the Intel TDX > > Virtual Firmware Design Guide [1], but it refeers to the ACPI > > specification for the interface. > > Yes, it does. > > > > > Move the code used to setup and use the mailbox out of the ACPI > > > > directory to use it when support for ACPI is not available or needed. > > > > > > I think that the code implementing interfaces defined by the ACPI > > > specification is not generic and so it should not be built when the > > > kernel is configured without ACPI support. > > > > Support for ACPI would not be used on systems describing hardware using > > a DeviceTree graph. My goal is to have a common interface that both > > DT and ACPI can use. I think what is missing is that common interface. > > I get the general idea. :-) > > > Would it be preferred to have a separate implementation that is > > functionally equivalent? > > Functionally equivalent on the OS side, that is. > > It would be better, but honestly I'm not sure if this is going to be > sufficient to eliminate the dependency on the ACPI spec. It is just > as you want the firmware to implement this tiny bit of the ACPI spec > even though it is not implementing the rest of it. Yes, that is all I need: the address of the mailbox and firmware that implements the mailbox interface as described in the ACPI Multiprocessor Structure. Thanks and BR, Ricardo >
diff --git a/arch/x86/kernel/acpi/madt_wakeup.c b/arch/x86/kernel/acpi/madt_wakeup.c index 6b9e41a24574..15627f63f9f5 100644 --- a/arch/x86/kernel/acpi/madt_wakeup.c +++ b/arch/x86/kernel/acpi/madt_wakeup.c @@ -2,7 +2,6 @@ #include <linux/acpi.h> #include <linux/cpu.h> #include <linux/delay.h> -#include <linux/io.h> #include <linux/kexec.h> #include <linux/memblock.h> #include <linux/pgtable.h> @@ -15,12 +14,6 @@ #include <asm/processor.h> #include <asm/reboot.h> -/* Physical address of the Multiprocessor Wakeup Structure mailbox */ -static u64 acpi_mp_wake_mailbox_paddr __ro_after_init; - -/* Virtual address of the Multiprocessor Wakeup Structure mailbox */ -static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox; - static u64 acpi_mp_pgd __ro_after_init; static u64 acpi_mp_reset_vector_paddr __ro_after_init; @@ -127,63 +120,6 @@ static int __init acpi_mp_setup_reset(u64 reset_vector) return 0; } -static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip) -{ - if (!acpi_mp_wake_mailbox_paddr) { - pr_warn_once("No MADT mailbox: cannot bringup secondary CPUs. Booting with kexec?\n"); - return -EOPNOTSUPP; - } - - /* - * Remap mailbox memory only for the first call to acpi_wakeup_cpu(). - * - * Wakeup of secondary CPUs is fully serialized in the core code. - * No need to protect acpi_mp_wake_mailbox from concurrent accesses. - */ - if (!acpi_mp_wake_mailbox) { - acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr, - sizeof(*acpi_mp_wake_mailbox), - MEMREMAP_WB); - } - - /* - * Mailbox memory is shared between the firmware and OS. Firmware will - * listen on mailbox command address, and once it receives the wakeup - * command, the CPU associated with the given apicid will be booted. - * - * The value of 'apic_id' and 'wakeup_vector' must be visible to the - * firmware before the wakeup command is visible. smp_store_release() - * ensures ordering and visibility. - */ - acpi_mp_wake_mailbox->apic_id = apicid; - acpi_mp_wake_mailbox->wakeup_vector = start_ip; - smp_store_release(&acpi_mp_wake_mailbox->command, - ACPI_MP_WAKE_COMMAND_WAKEUP); - - /* - * Wait for the CPU to wake up. - * - * The CPU being woken up is essentially in a spin loop waiting to be - * woken up. It should not take long for it wake up and acknowledge by - * zeroing out ->command. - * - * ACPI specification doesn't provide any guidance on how long kernel - * has to wait for a wake up acknowledgment. It also doesn't provide - * a way to cancel a wake up request if it takes too long. - * - * In TDX environment, the VMM has control over how long it takes to - * wake up secondary. It can postpone scheduling secondary vCPU - * indefinitely. Giving up on wake up request and reporting error opens - * possible attack vector for VMM: it can wake up a secondary CPU when - * kernel doesn't expect it. Wait until positive result of the wake up - * request. - */ - while (READ_ONCE(acpi_mp_wake_mailbox->command)) - cpu_relax(); - - return 0; -} - static void acpi_mp_disable_offlining(struct acpi_madt_multiproc_wakeup *mp_wake) { cpu_hotplug_disable_offlining(); @@ -246,14 +182,3 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers *header, return 0; } - -void __init setup_mp_wakeup_mailbox(u64 mailbox_paddr) -{ - acpi_mp_wake_mailbox_paddr = mailbox_paddr; - apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu); -} - -struct acpi_madt_multiproc_wakeup_mailbox *get_mp_wakeup_mailbox(void) -{ - return acpi_mp_wake_mailbox; -} diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index d6cf1e23c2a3..6f39ebe4d192 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -61,7 +61,9 @@ #include <linux/cpuhotplug.h> #include <linux/mc146818rtc.h> #include <linux/acpi.h> +#include <linux/io.h> +#include <asm/barrier.h> #include <asm/acpi.h> #include <asm/cacheinfo.h> #include <asm/cpuid.h> @@ -1354,3 +1356,79 @@ void native_play_dead(void) } #endif + +#ifdef CONFIG_X86_64 +/* Physical address of the Multiprocessor Wakeup Structure mailbox */ +static u64 acpi_mp_wake_mailbox_paddr __ro_after_init; + +/* Virtual address of the Multiprocessor Wakeup Structure mailbox */ +static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox; + +static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip) +{ + if (!acpi_mp_wake_mailbox_paddr) { + pr_warn_once("No MADT mailbox: cannot bringup secondary CPUs. Booting with kexec?\n"); + return -EOPNOTSUPP; + } + + /* + * Remap mailbox memory only for the first call to acpi_wakeup_cpu(). + * + * Wakeup of secondary CPUs is fully serialized in the core code. + * No need to protect acpi_mp_wake_mailbox from concurrent accesses. + */ + if (!acpi_mp_wake_mailbox) { + acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr, + sizeof(*acpi_mp_wake_mailbox), + MEMREMAP_WB); + } + + /* + * Mailbox memory is shared between the firmware and OS. Firmware will + * listen on mailbox command address, and once it receives the wakeup + * command, the CPU associated with the given apicid will be booted. + * + * The value of 'apic_id' and 'wakeup_vector' must be visible to the + * firmware before the wakeup command is visible. smp_store_release() + * ensures ordering and visibility. + */ + acpi_mp_wake_mailbox->apic_id = apicid; + acpi_mp_wake_mailbox->wakeup_vector = start_ip; + smp_store_release(&acpi_mp_wake_mailbox->command, + ACPI_MP_WAKE_COMMAND_WAKEUP); + + /* + * Wait for the CPU to wake up. + * + * The CPU being woken up is essentially in a spin loop waiting to be + * woken up. It should not take long for it wake up and acknowledge by + * zeroing out ->command. + * + * ACPI specification doesn't provide any guidance on how long kernel + * has to wait for a wake up acknowledgment. It also doesn't provide + * a way to cancel a wake up request if it takes too long. + * + * In TDX environment, the VMM has control over how long it takes to + * wake up secondary. It can postpone scheduling secondary vCPU + * indefinitely. Giving up on wake up request and reporting error opens + * possible attack vector for VMM: it can wake up a secondary CPU when + * kernel doesn't expect it. Wait until positive result of the wake up + * request. + */ + while (READ_ONCE(acpi_mp_wake_mailbox->command)) + cpu_relax(); + + return 0; +} + +void __init setup_mp_wakeup_mailbox(u64 mailbox_paddr) +{ + acpi_mp_wake_mailbox_paddr = mailbox_paddr; + apic_update_callback(wakeup_secondary_cpu_64, acpi_wakeup_cpu); +} + +struct acpi_madt_multiproc_wakeup_mailbox *get_mp_wakeup_mailbox(void) +{ + return acpi_mp_wake_mailbox; +} +#endif
The bootstrap processor uses acpi_wakeup_cpu() to indicate to firmware that it wants to boot a secondary CPU using a mailbox as described in the Multiprocessor Wakeup Structure of the ACPI specification. The wakeup mailbox does not strictly require support from ACPI. The platform firmware can implement a mailbox compatible in structure and operation and enumerate it using other mechanisms such a DeviceTree graph. Move the code used to setup and use the mailbox out of the ACPI directory to use it when support for ACPI is not available or needed. No functional changes are intended. Originally-by: Yunhong Jiang <yunhong.jiang@linux.intel.com> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> --- Changes since v2: - Only move to smpboot.c the portions of the code that configure and use the mailbox. This also resolved the compile warnings about unused functions that Michael Kelley reported. - Edited the commit message for clarity. Changes since v1: - None. --- arch/x86/kernel/acpi/madt_wakeup.c | 75 ---------------------------- arch/x86/kernel/smpboot.c | 78 ++++++++++++++++++++++++++++++ 2 files changed, 78 insertions(+), 75 deletions(-)