diff mbox series

[v2] x86/acpi/boot: Do not register processors that cannot be onlined for x2apic

Message ID 20230105041059.39366-1-kvijayab@amd.com
State Accepted
Commit e2869bd7af608c343988429ceb1c2fe99644a01f
Headers show
Series [v2] x86/acpi/boot: Do not register processors that cannot be onlined for x2apic | expand

Commit Message

Kishon Vijay Abraham I Jan. 5, 2023, 4:10 a.m. UTC
Section 5.2.12.12 Processor Local x2APIC Structure in the ACPI v6.5
spec mandates that both "enabled" and "online capable" Local APIC Flags
should be used to determine if the processor is usable or not.

However, Linux doesn't use the "online capable" flag for x2APIC to
determine if the processor is usable. As a result, cpu_possible_mask has
incorrect value and results in more memory getting allocated for per_cpu
variables than it is going to be used.

Make sure Linux parses both "enabled" and "online capable" flags for
x2APIC to correctly determine if the processor is usable.

Fixes: aa06e20f1be6 ("x86/ACPI: Don't add CPUs that are not online capable")
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reported-by: Leo Duran <leo.duran@amd.com>
Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
---
Changes from v1:
1) Changed the ACPI spec version to 6.5 in the commit log
2) Changed the Fixes tag to point to commit aa06e20f1be6
3) Added "Reported-by: Leo Duran <leo.duran@amd.com>"
 arch/x86/kernel/acpi/boot.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Comments

Rafael J. Wysocki Jan. 5, 2023, 5:09 p.m. UTC | #1
On Thu, Jan 5, 2023 at 5:11 AM Kishon Vijay Abraham I <kvijayab@amd.com> wrote:
>
> Section 5.2.12.12 Processor Local x2APIC Structure in the ACPI v6.5
> spec mandates that both "enabled" and "online capable" Local APIC Flags
> should be used to determine if the processor is usable or not.
>
> However, Linux doesn't use the "online capable" flag for x2APIC to
> determine if the processor is usable. As a result, cpu_possible_mask has
> incorrect value and results in more memory getting allocated for per_cpu
> variables than it is going to be used.
>
> Make sure Linux parses both "enabled" and "online capable" flags for
> x2APIC to correctly determine if the processor is usable.
>
> Fixes: aa06e20f1be6 ("x86/ACPI: Don't add CPUs that are not online capable")
> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
> Reported-by: Leo Duran <leo.duran@amd.com>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
> Changes from v1:
> 1) Changed the ACPI spec version to 6.5 in the commit log
> 2) Changed the Fixes tag to point to commit aa06e20f1be6
> 3) Added "Reported-by: Leo Duran <leo.duran@amd.com>"
>  arch/x86/kernel/acpi/boot.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 907cc98b1938..518bda50068c 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -188,6 +188,17 @@ static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
>         return cpu;
>  }
>
> +static bool __init acpi_is_processor_usable(u32 lapic_flags)
> +{
> +       if (lapic_flags & ACPI_MADT_ENABLED)
> +               return true;
> +
> +       if (acpi_support_online_capable && (lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
> +               return true;
> +
> +       return false;
> +}
> +
>  static int __init
>  acpi_parse_x2apic(union acpi_subtable_headers *header, const unsigned long end)
>  {
> @@ -212,6 +223,10 @@ acpi_parse_x2apic(union acpi_subtable_headers *header, const unsigned long end)
>         if (apic_id == 0xffffffff)
>                 return 0;
>
> +       /* don't register processors that cannot be onlined */
> +       if (!acpi_is_processor_usable(processor->lapic_flags))
> +               return 0;
> +
>         /*
>          * We need to register disabled CPU as well to permit
>          * counting disabled CPUs. This allows us to size
> @@ -250,9 +265,7 @@ acpi_parse_lapic(union acpi_subtable_headers * header, const unsigned long end)
>                 return 0;
>
>         /* don't register processors that can not be onlined */
> -       if (acpi_support_online_capable &&
> -           !(processor->lapic_flags & ACPI_MADT_ENABLED) &&
> -           !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
> +       if (!acpi_is_processor_usable(processor->lapic_flags))
>                 return 0;
>
>         /*
> --
> 2.34.1
>
Borislav Petkov Jan. 5, 2023, 10:22 p.m. UTC | #2
On Thu, Jan 05, 2023 at 06:09:59PM +0100, Rafael J. Wysocki wrote:
> On Thu, Jan 5, 2023 at 5:11 AM Kishon Vijay Abraham I <kvijayab@amd.com> wrote:
> >
> > Section 5.2.12.12 Processor Local x2APIC Structure in the ACPI v6.5
> > spec mandates that both "enabled" and "online capable" Local APIC Flags
> > should be used to determine if the processor is usable or not.
> >
> > However, Linux doesn't use the "online capable" flag for x2APIC to
> > determine if the processor is usable. As a result, cpu_possible_mask has
> > incorrect value and results in more memory getting allocated for per_cpu
> > variables than it is going to be used.
> >
> > Make sure Linux parses both "enabled" and "online capable" flags for
> > x2APIC to correctly determine if the processor is usable.
> >
> > Fixes: aa06e20f1be6 ("x86/ACPI: Don't add CPUs that are not online capable")
> > Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
> > Reported-by: Leo Duran <leo.duran@amd.com>
> > Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> 
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Are you saying, I should take it through tip?
Zhang Rui Jan. 6, 2023, 12:57 a.m. UTC | #3
On Thu, 2023-01-05 at 18:09 +0100, Rafael J. Wysocki wrote:
> On Thu, Jan 5, 2023 at 5:11 AM Kishon Vijay Abraham I <
> kvijayab@amd.com> wrote:
> > Section 5.2.12.12 Processor Local x2APIC Structure in the ACPI v6.5
> > spec mandates that both "enabled" and "online capable" Local APIC
> > Flags
> > should be used to determine if the processor is usable or not.
> > 
> > However, Linux doesn't use the "online capable" flag for x2APIC to
> > determine if the processor is usable. As a result,
> > cpu_possible_mask has
> > incorrect value and results in more memory getting allocated for
> > per_cpu
> > variables than it is going to be used.
> > 
> > Make sure Linux parses both "enabled" and "online capable" flags
> > for
> > x2APIC to correctly determine if the processor is usable.
> > 
> > Fixes: aa06e20f1be6 ("x86/ACPI: Don't add CPUs that are not online
> > capable")
> > Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
> > Reported-by: Leo Duran <leo.duran@amd.com>
> > Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> 
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 

Reviewed-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui
> > ---
> > Changes from v1:
> > 1) Changed the ACPI spec version to 6.5 in the commit log
> > 2) Changed the Fixes tag to point to commit aa06e20f1be6
> > 3) Added "Reported-by: Leo Duran <leo.duran@amd.com>"
> >  arch/x86/kernel/acpi/boot.c | 19 ++++++++++++++++---
> >  1 file changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/acpi/boot.c
> > b/arch/x86/kernel/acpi/boot.c
> > index 907cc98b1938..518bda50068c 100644
> > --- a/arch/x86/kernel/acpi/boot.c
> > +++ b/arch/x86/kernel/acpi/boot.c
> > @@ -188,6 +188,17 @@ static int acpi_register_lapic(int id, u32
> > acpiid, u8 enabled)
> >         return cpu;
> >  }
> > 
> > +static bool __init acpi_is_processor_usable(u32 lapic_flags)
> > +{
> > +       if (lapic_flags & ACPI_MADT_ENABLED)
> > +               return true;
> > +
> > +       if (acpi_support_online_capable && (lapic_flags &
> > ACPI_MADT_ONLINE_CAPABLE))
> > +               return true;
> > +
> > +       return false;
> > +}
> > +
> >  static int __init
> >  acpi_parse_x2apic(union acpi_subtable_headers *header, const
> > unsigned long end)
> >  {
> > @@ -212,6 +223,10 @@ acpi_parse_x2apic(union acpi_subtable_headers
> > *header, const unsigned long end)
> >         if (apic_id == 0xffffffff)
> >                 return 0;
> > 
> > +       /* don't register processors that cannot be onlined */
> > +       if (!acpi_is_processor_usable(processor->lapic_flags))
> > +               return 0;
> > +
> >         /*
> >          * We need to register disabled CPU as well to permit
> >          * counting disabled CPUs. This allows us to size
> > @@ -250,9 +265,7 @@ acpi_parse_lapic(union acpi_subtable_headers *
> > header, const unsigned long end)
> >                 return 0;
> > 
> >         /* don't register processors that can not be onlined */
> > -       if (acpi_support_online_capable &&
> > -           !(processor->lapic_flags & ACPI_MADT_ENABLED) &&
> > -           !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
> > +       if (!acpi_is_processor_usable(processor->lapic_flags))
> >                 return 0;
> > 
> >         /*
> > --
> > 2.34.1
> >
Rafael J. Wysocki Jan. 10, 2023, 1:03 p.m. UTC | #4
On Thu, Jan 5, 2023 at 11:22 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Thu, Jan 05, 2023 at 06:09:59PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Jan 5, 2023 at 5:11 AM Kishon Vijay Abraham I <kvijayab@amd.com> wrote:
> > >
> > > Section 5.2.12.12 Processor Local x2APIC Structure in the ACPI v6.5
> > > spec mandates that both "enabled" and "online capable" Local APIC Flags
> > > should be used to determine if the processor is usable or not.
> > >
> > > However, Linux doesn't use the "online capable" flag for x2APIC to
> > > determine if the processor is usable. As a result, cpu_possible_mask has
> > > incorrect value and results in more memory getting allocated for per_cpu
> > > variables than it is going to be used.
> > >
> > > Make sure Linux parses both "enabled" and "online capable" flags for
> > > x2APIC to correctly determine if the processor is usable.
> > >
> > > Fixes: aa06e20f1be6 ("x86/ACPI: Don't add CPUs that are not online capable")
> > > Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
> > > Reported-by: Leo Duran <leo.duran@amd.com>
> > > Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>
> >
> > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Are you saying, I should take it through tip?

Works for me either way.

Just please let me know if I need to take care of it. :-)
Borislav Petkov Jan. 10, 2023, 6:23 p.m. UTC | #5
On Tue, Jan 10, 2023 at 02:03:48PM +0100, Rafael J. Wysocki wrote:
> Just please let me know if I need to take care of it. :-)

You don't. :-)

Thx.
Guy Durrieu April 2, 2023, 10:41 a.m. UTC | #6
Le 05/01/2023 à 05:10, Kishon Vijay Abraham I a écrit :

> Section 5.2.12.12 Processor Local x2APIC Structure in the ACPI v6.5
> spec mandates that both "enabled" and "online capable" Local APIC Flags
> should be used to determine if the processor is usable or not.
>
> However, Linux doesn't use the "online capable" flag for x2APIC to
> determine if the processor is usable. As a result, cpu_possible_mask has
> incorrect value and results in more memory getting allocated for per_cpu
> variables than it is going to be used.
>
> Make sure Linux parses both "enabled" and "online capable" flags for
> x2APIC to correctly determine if the processor is usable.
>
> Fixes: aa06e20f1be6 ("x86/ACPI: Don't add CPUs that are not online capable")
> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
> Reported-by: Leo Duran <leo.duran@amd.com>
> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com>

Hello everyone,

My system worked fine with kernel 6.1.15, but stopped booting after
upgrading to 6.1.20 and resulted in a kernel panic:

---
[ 0.117782] Kernel panic — not syncing: timer doesn’t work through Interrupt-remapped IO-APIC
[ 0.117848] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-7-and64 #1 Debian 6.1.20-1
[ 0.117913] Hardware name: Gigabyte Technology Co., Ltd. ABS50M-Gaming 3/AB350M-Gaming 3-CF, BIOS F50d 07/02/2020
[ 0.117982] Call Trace:
[ 0.118634] <TASK>
[ 0.118685] dump_stack_lvl+0x44/0x5c
[ 0.118143] panic+0x118/0x2ed
[ 0.118198] panic_if_irq_remap.cold+0x5/0x5
[ 0.118256] setup_I0_APIC+0x3db/0x64b
[ 0.118313] ? _raw_spin_unlock_irqrestore+0x23/0x40
[ 0.118372] ? clear_IO_APIC_pin+0x169/0x240
[ 0.118429] apic_intr_node_init+0x101/0x106
[ 0.118485] x86_late_time_init+0x20/0x34
[ 0.118542] start_kerne1+0x667/0x727
[ 0.118598] secondary_startup_64_no_verify+0xe5/0xeb
[ 0.118658] </TASK>
[ 0.118711] ---[ end Kernel panic - not syncing: timer doesn’t work through Interrupt-remapped IO-APIC J---
---
I tried an update of the BIOS up to F51h without any effect.

I sent a bug report to Debian Bug Tracking System. In reply Bjørn Mork 
identified ce7d894bed1a539a8d6cff42f6f78f9db0c9c26b from the linux-6.1.y 
branch as the likely culprit.

After building a 6.1.20 kernel with just that commit reverted, my system boots normally again.

I reported that all tohttps://bugs.debian.org/1033732  and were asked to report the issue 'upstream' (which is what that mail is).

Hope it will help!

Best regards.

-- Guy Durrieu
Borislav Petkov April 2, 2023, 10:57 a.m. UTC | #7
On April 2, 2023 12:41:46 PM GMT+02:00, Guy Durrieu <guy.durrieu@cegetel.net> wrote:
>My system worked fine with kernel 6.1.15, but stopped booting after
>upgrading to 6.1.20 and resulted in a kernel panic:

Does this fix it:

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/urgent

Thx.
Guy Durrieu April 2, 2023, 1:13 p.m. UTC | #8
Le 02/04/2023 à 12:57, Borislav Petkov a écrit :

> On April 2, 2023 12:41:46 PM GMT+02:00, Guy Durrieu <guy.durrieu@cegetel.net> wrote:
>> My system worked fine with kernel 6.1.15, but stopped booting after
>> upgrading to 6.1.20 and resulted in a kernel panic:
> Does this fix it:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/urgent
>
> Thx.
>
Yes it does.

Regards.

-- Guy Durrieu
Borislav Petkov April 2, 2023, 3:18 p.m. UTC | #9
On Sun, Apr 02, 2023 at 03:13:05PM +0200, Guy Durrieu wrote:
> Yes it does.

Thanks for testing.
diff mbox series

Patch

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 907cc98b1938..518bda50068c 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -188,6 +188,17 @@  static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
 	return cpu;
 }
 
+static bool __init acpi_is_processor_usable(u32 lapic_flags)
+{
+	if (lapic_flags & ACPI_MADT_ENABLED)
+		return true;
+
+	if (acpi_support_online_capable && (lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
+		return true;
+
+	return false;
+}
+
 static int __init
 acpi_parse_x2apic(union acpi_subtable_headers *header, const unsigned long end)
 {
@@ -212,6 +223,10 @@  acpi_parse_x2apic(union acpi_subtable_headers *header, const unsigned long end)
 	if (apic_id == 0xffffffff)
 		return 0;
 
+	/* don't register processors that cannot be onlined */
+	if (!acpi_is_processor_usable(processor->lapic_flags))
+		return 0;
+
 	/*
 	 * We need to register disabled CPU as well to permit
 	 * counting disabled CPUs. This allows us to size
@@ -250,9 +265,7 @@  acpi_parse_lapic(union acpi_subtable_headers * header, const unsigned long end)
 		return 0;
 
 	/* don't register processors that can not be onlined */
-	if (acpi_support_online_capable &&
-	    !(processor->lapic_flags & ACPI_MADT_ENABLED) &&
-	    !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
+	if (!acpi_is_processor_usable(processor->lapic_flags))
 		return 0;
 
 	/*