diff mbox series

acpi: zero-initialize acpi_object union structure

Message ID 20241017035940.4067922-1-payamm@google.com
State New
Headers show
Series acpi: zero-initialize acpi_object union structure | expand

Commit Message

Payam Moradshahi Oct. 17, 2024, 3:59 a.m. UTC
The way in which acpi_object union is being initialized varies based on
compiler type, version and flags used. Some will zero-initialize the
entire union structure and some will only initialize the first N-bytes
of the union structure. This could lead to uninitialized union members.

This bug was confirmed by observing non-zero value for object->processor
structure variables.

non-zero initialized members of acpi_object union structure causes
incorrect error reporting by the driver.

If a BIOS is using "Device" statement as opposed to "Processor"
statement, object variable may contain uninitialized members causing the
driver to report "Invalid PBLK length" incorrectly.

Using memset to zero-initialize the union structure fixes this issue and
also removes the dependency of this function on compiler versions and
flags being used.

Tested: Tested on ARM64 hardware that was printing this error and
confirmed the prints were gone.

Also confirmed this does not cause regression on ARM64 and X86
machines.

Signed-off-by: Payam Moradshahi <payamm@google.com>
---
 drivers/acpi/acpi_processor.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Rafael J. Wysocki Oct. 21, 2024, 11:31 a.m. UTC | #1
On Thu, Oct 17, 2024 at 5:59 AM Payam Moradshahi <payamm@google.com> wrote:
>
> The way in which acpi_object union is being initialized varies based on
> compiler type, version and flags used. Some will zero-initialize the
> entire union structure and some will only initialize the first N-bytes
> of the union structure.

Any details?

> This could lead to uninitialized union members.

So this is working around a compiler bug AFAICS.

If the compiler has this bug, is it guaranteed to compile the rest of
the kernel correctly?

> This bug was confirmed by observing non-zero value for object->processor
> structure variables.

Where has it been observed?  What compiler version(s)? etc.

> non-zero initialized members of acpi_object union structure causes
> incorrect error reporting by the driver.
>
> If a BIOS is using "Device" statement as opposed to "Processor"
> statement, object variable may contain uninitialized members causing the
> driver to report "Invalid PBLK length" incorrectly.
>
> Using memset to zero-initialize the union structure fixes this issue and
> also removes the dependency of this function on compiler versions and
> flags being used.
>
> Tested: Tested on ARM64 hardware that was printing this error and
> confirmed the prints were gone.
>
> Also confirmed this does not cause regression on ARM64 and X86
> machines.
>
> Signed-off-by: Payam Moradshahi <payamm@google.com>
> ---
>  drivers/acpi/acpi_processor.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 7cf6101cb4c73..6696ad4937d21 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -275,7 +275,7 @@ static inline int acpi_processor_hotadd_init(struct acpi_processor *pr,
>
>  static int acpi_processor_get_info(struct acpi_device *device)
>  {
> -       union acpi_object object = { 0 };
> +       union acpi_object object;
>         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
>         struct acpi_processor *pr = acpi_driver_data(device);
>         int device_declaration = 0;
> @@ -284,6 +284,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
>         unsigned long long value;
>         int ret;
>
> +       memset(&object, 0, sizeof(union acpi_object));
> +
>         acpi_processor_errata();
>
>         /*
> --
Payam Moradshahi Oct. 24, 2024, 9:58 p.m. UTC | #2
Hi Rafael,

Thank you for your response. Please see below for my response.

Payam

On Mon, Oct 21, 2024 at 01:31:39PM GMT, Rafael J. Wysocki wrote:
> On Thu, Oct 17, 2024 at 5:59 AM Payam Moradshahi <payamm@google.com>  
> wrote:
> >
> > The way in which acpi_object union is being initialized varies based on
> > compiler type, version and flags used. Some will zero-initialize the
> > entire union structure and some will only initialize the first N-bytes
> > of the union structure.

> Any details?
I dumped acpi_object after initialization and noticed either the
entire structure was zero-initialized or just the first 8 bytes
depending on compiler type and version.

gcc 13.2.0: bad
CLANG_CL=362121269: good
CLANG_CL=609443126: bad
CLANG_CL=684773960: good


> > This could lead to uninitialized union members.

> So this is working around a compiler bug AFAICS.

> If the compiler has this bug, is it guaranteed to compile the rest of
> the kernel correctly?
This is not a compiler bug. The way acpi_object union is being
initialized is not c-spec compliant.

Based on C Standard ISO/IEC 9899:202x (E):

Section 6.2.6.1 (7) says:

When a value is stored in a member of an object of union type, the bytes of  
the
object representation that do not correspond to that member but do  
correspond
to other members take unspecified values

Section 6.7.9 says:

If an object that has automatic storage duration is not initialized  
explicitly,
its value is indeterminate.

If an object that has static or thread storage duration is not initialized
explicitly, then:
  - if it is a union, the first named member is initialized (recursively)
    according to these rules, and any padding is initialized to zero bits;

The above guarantees zero only in the case of static or thread storage,
which is not the case here.


> > This bug was confirmed by observing non-zero value for object->processor
> > structure variables.

> Where has it been observed?  What compiler version(s)? etc.
See above for some examples

> > non-zero initialized members of acpi_object union structure causes
> > incorrect error reporting by the driver.
> >
> > If a BIOS is using "Device" statement as opposed to "Processor"
> > statement, object variable may contain uninitialized members causing the
> > driver to report "Invalid PBLK length" incorrectly.
> >
> > Using memset to zero-initialize the union structure fixes this issue and
> > also removes the dependency of this function on compiler versions and
> > flags being used.
> >
> > Tested: Tested on ARM64 hardware that was printing this error and
> > confirmed the prints were gone.
> >
> > Also confirmed this does not cause regression on ARM64 and X86
> > machines.
> >
> > Signed-off-by: Payam Moradshahi <payamm@google.com>
> > ---
> >  drivers/acpi/acpi_processor.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/acpi/acpi_processor.c  
> b/drivers/acpi/acpi_processor.c
> > index 7cf6101cb4c73..6696ad4937d21 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -275,7 +275,7 @@ static inline int acpi_processor_hotadd_init(struct  
> acpi_processor *pr,
> >
> >  static int acpi_processor_get_info(struct acpi_device *device)
> >  {
> > -       union acpi_object object = { 0 };
> > +       union acpi_object object;
> >         struct acpi_buffer buffer = { sizeof(union acpi_object),  
> &object };
> >         struct acpi_processor *pr = acpi_driver_data(device);
> >         int device_declaration = 0;
> > @@ -284,6 +284,8 @@ static int acpi_processor_get_info(struct  
> acpi_device *device)
> >         unsigned long long value;
> >         int ret;
> >
> > +       memset(&object, 0, sizeof(union acpi_object));
> > +
> >         acpi_processor_errata();
> >
> >         /*
> > --
Moritz Fischer Oct. 25, 2024, 5:29 p.m. UTC | #3
Hi Rafael,

On Thu, Oct 24, 2024 at 02:58:27PM GMT, Payam Moradshahi wrote:
> Hi Rafael,

> Thank you for your response. Please see below for my response.

> Payam

> On Mon, Oct 21, 2024 at 01:31:39PM GMT, Rafael J. Wysocki wrote:
> > On Thu, Oct 17, 2024 at 5:59 AM Payam Moradshahi <payamm@google.com>
> > wrote:
> > >
> > > The way in which acpi_object union is being initialized varies based  
> on
> > > compiler type, version and flags used. Some will zero-initialize the
> > > entire union structure and some will only initialize the first N-bytes
> > > of the union structure.

> > Any details?
> I dumped acpi_object after initialization and noticed either the
> entire structure was zero-initialized or just the first 8 bytes
> depending on compiler type and version.

> gcc 13.2.0: bad
> CLANG_CL=362121269: good
> CLANG_CL=609443126: bad
> CLANG_CL=684773960: good

For reference the latter ones are Google internal builds of clang, but
we've played around in godbolt with this in a variety of versions and
some work and some don't. Payam has the relevant section from the C
standard below.


> > > This could lead to uninitialized union members.

> > So this is working around a compiler bug AFAICS.

> > If the compiler has this bug, is it guaranteed to compile the rest of
> > the kernel correctly?
> This is not a compiler bug. The way acpi_object union is being
> initialized is not c-spec compliant.

I think in past versions we might've gotten lucky with this, recent
compilers might've tightened this up some or changed behavior.


> Based on C Standard ISO/IEC 9899:202x (E):

> Section 6.2.6.1 (7) says:

> When a value is stored in a member of an object of union type, the bytes  
> of
> the
> object representation that do not correspond to that member but do
> correspond
> to other members take unspecified values

> Section 6.7.9 says:

> If an object that has automatic storage duration is not initialized
> explicitly,
> its value is indeterminate.

> If an object that has static or thread storage duration is not initialized
> explicitly, then:
>   - if it is a union, the first named member is initialized (recursively)
>     according to these rules, and any padding is initialized to zero bits;

> The above guarantees zero only in the case of static or thread storage,
> which is not the case here.


> > > This bug was confirmed by observing non-zero value for  
> object->processor
> > > structure variables.

> > Where has it been observed?  What compiler version(s)? etc.
> See above for some examples

This manifests like this on Google Axion arm64 builds using Arm's
prebuilt GCC 13.2.0:

[    1.844508] acpi ACPI0007:08: Invalid PBLK length [271170112]
[    1.850253] acpi ACPI0007:09: Invalid PBLK length [271170112]
[    1.855992] acpi ACPI0007:0a: Invalid PBLK length [271170112]
[    1.861730] acpi ACPI0007:0b: Invalid PBLK length [271170112]
[    1.867470] acpi ACPI0007:0c: Invalid PBLK length [271170112]
[    1.873208] acpi ACPI0007:0d: Invalid PBLK length [271170112]
[    1.878947] acpi ACPI0007:0e: Invalid PBLK length [271170112]
[    1.884686] acpi ACPI0007:0f: Invalid PBLK length [271170112]
[    1.890424] acpi ACPI0007:10: Invalid PBLK length [271170112]
[    1.896161] acpi ACPI0007:11: Invalid PBLK length [271170112]
[    1.901898] acpi ACPI0007:12: Invalid PBLK length [271170112]
[    1.907636] acpi ACPI0007:13: Invalid PBLK length [271170112]
[    1.913374] acpi ACPI0007:14: Invalid PBLK length [271170112]
[    1.919113] acpi ACPI0007:15: Invalid PBLK length [271170112]
[    1.924851] acpi ACPI0007:16: Invalid PBLK length [271170112]


> > > non-zero initialized members of acpi_object union structure causes
> > > incorrect error reporting by the driver.
> > >
> > > If a BIOS is using "Device" statement as opposed to "Processor"
> > > statement, object variable may contain uninitialized members causing  
> the
> > > driver to report "Invalid PBLK length" incorrectly.
> > >
> > > Using memset to zero-initialize the union structure fixes this issue  
> and
> > > also removes the dependency of this function on compiler versions and
> > > flags being used.
> > >
> > > Tested: Tested on ARM64 hardware that was printing this error and
> > > confirmed the prints were gone.
> > >
> > > Also confirmed this does not cause regression on ARM64 and X86
> > > machines.
> > >
> > > Signed-off-by: Payam Moradshahi <payamm@google.com>
> > > ---
> > >  drivers/acpi/acpi_processor.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/acpi/acpi_processor.c
> > b/drivers/acpi/acpi_processor.c
> > > index 7cf6101cb4c73..6696ad4937d21 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -275,7 +275,7 @@ static inline int
> > acpi_processor_hotadd_init(struct acpi_processor *pr,
> > >
> > >  static int acpi_processor_get_info(struct acpi_device *device)
> > >  {
> > > -       union acpi_object object = { 0 };
> > > +       union acpi_object object;
> > >         struct acpi_buffer buffer = { sizeof(union acpi_object),
> > &object };
> > >         struct acpi_processor *pr = acpi_driver_data(device);
> > >         int device_declaration = 0;
> > > @@ -284,6 +284,8 @@ static int acpi_processor_get_info(struct
> > acpi_device *device)
> > >         unsigned long long value;
> > >         int ret;
> > >
> > > +       memset(&object, 0, sizeof(union acpi_object));
> > > +
> > >         acpi_processor_errata();
> > >
> > >         /*
> > > --

Payam, it might be good to add a log snippet and references to the relevant  
spec
sections to your v2 commit message.

Moreover, for v2 of the patch you probably want to add a
   Cc: stable@vger.kernel.org

to your patch to make sure it gets picked up once it gets picked up
for mainline. See [1] for more info on the stable process.

Cheers,
Moritz

[1] https://www.kernel.org/doc/Documentation/process/stable-kernel-rules.rst
diff mbox series

Patch

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 7cf6101cb4c73..6696ad4937d21 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -275,7 +275,7 @@  static inline int acpi_processor_hotadd_init(struct acpi_processor *pr,
 
 static int acpi_processor_get_info(struct acpi_device *device)
 {
-	union acpi_object object = { 0 };
+	union acpi_object object;
 	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
 	struct acpi_processor *pr = acpi_driver_data(device);
 	int device_declaration = 0;
@@ -284,6 +284,8 @@  static int acpi_processor_get_info(struct acpi_device *device)
 	unsigned long long value;
 	int ret;
 
+	memset(&object, 0, sizeof(union acpi_object));
+
 	acpi_processor_errata();
 
 	/*