[edk2,0/5] MdePkg BaseTools: GCC optimization for X64

Message ID CAKv+Gu-+4OR2xu0q+Jbbve6_YGjcJyxr2mdrMrGSoyz-3mMZwQ@mail.gmail.com
State New
Headers show

Commit Message

Ard Biesheuvel July 15, 2016, 8:40 a.m.
On 15 July 2016 at 08:33, Laszlo Ersek <lersek@redhat.com> wrote:
> On 07/15/16 08:07, Ard Biesheuvel wrote:

>> On 15 July 2016 at 01:27, Laszlo Ersek <lersek@redhat.com> wrote:

>

>>> First, the build tests. I built OVMF 84 times, with the following settings:

>>>

>>> * Dimension 1: whether your and Steven's patches were applied or not.

>>

>> I take it this means this series only?

>

> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I

> meant the five patches in this series -- one patch from Steven, four

> patches from you.

>

>>> * However, then I built my program -- find it attached -- with -b DEBUG

>>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the

>>> difference in output:

>>>

>>> FS2:\> DebugEnrollDefaultKeys.efi

>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 VendorKeys=0

>>> info: success

>>>

>>> versus

>>>

>>> FS2:\> OptEnrollDefaultKeys.efi

>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>> error: EnrollListOfX509Certs("db",

>>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter

>>>

>>> I don't know why this happens, but it definitely relates to varargs --

>>> the program uses them liberally.

>>>

>>

>> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,

>> it makes sense to report it to the GCC maintainers (assuming we can

>> create a test case). I noticed that this code iterates over the same

>> VA_LIST twice, I wonder how well that was tested ...

>

> You make a good point about using VA_LIST twice possibly tickling gcc

> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my

> code -- in order to step over some arguments --, which might not be all

> that usual as well.

>

> In any case, the way I use varargs seems to be standards conformant.

>

> With regard to reporting this to gcc developers: I won't try that. I'm

> discouraged by two facts:

>

> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since

> October 2011. You'll find some recent comments from Steven and David in

> it :)

>

> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --

> shortly after I ran into the -Os corruption issue with GCC48, I created

> and posted this small reproducer. Creating the reproducer wasn't

> trivial. In parallel I sent the same reproducer to one of my (indirect)

> colleagues at Red Hat, who had been a veteran in upstream GCC

> development and maintenance. I also asked him how/where I should report

> the bug. I got no answer from him.

>

> If you'd like to report a GCC bug, please go ahead, but I won't waste my

> time :)

>


It seems that it is your testcase that is broken:


With that change, everything works fine for me

Thanks,
Ard.
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Comments

Laszlo Ersek July 15, 2016, 8:48 a.m. | #1
On 07/15/16 10:40, Ard Biesheuvel wrote:
> On 15 July 2016 at 08:33, Laszlo Ersek <lersek@redhat.com> wrote:

>> On 07/15/16 08:07, Ard Biesheuvel wrote:

>>> On 15 July 2016 at 01:27, Laszlo Ersek <lersek@redhat.com> wrote:

>>

>>>> First, the build tests. I built OVMF 84 times, with the following settings:

>>>>

>>>> * Dimension 1: whether your and Steven's patches were applied or not.

>>>

>>> I take it this means this series only?

>>

>> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I

>> meant the five patches in this series -- one patch from Steven, four

>> patches from you.

>>

>>>> * However, then I built my program -- find it attached -- with -b DEBUG

>>>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the

>>>> difference in output:

>>>>

>>>> FS2:\> DebugEnrollDefaultKeys.efi

>>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 VendorKeys=0

>>>> info: success

>>>>

>>>> versus

>>>>

>>>> FS2:\> OptEnrollDefaultKeys.efi

>>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>>> error: EnrollListOfX509Certs("db",

>>>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter

>>>>

>>>> I don't know why this happens, but it definitely relates to varargs --

>>>> the program uses them liberally.

>>>>

>>>

>>> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,

>>> it makes sense to report it to the GCC maintainers (assuming we can

>>> create a test case). I noticed that this code iterates over the same

>>> VA_LIST twice, I wonder how well that was tested ...

>>

>> You make a good point about using VA_LIST twice possibly tickling gcc

>> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my

>> code -- in order to step over some arguments --, which might not be all

>> that usual as well.

>>

>> In any case, the way I use varargs seems to be standards conformant.

>>

>> With regard to reporting this to gcc developers: I won't try that. I'm

>> discouraged by two facts:

>>

>> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since

>> October 2011. You'll find some recent comments from Steven and David in

>> it :)

>>

>> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --

>> shortly after I ran into the -Os corruption issue with GCC48, I created

>> and posted this small reproducer. Creating the reproducer wasn't

>> trivial. In parallel I sent the same reproducer to one of my (indirect)

>> colleagues at Red Hat, who had been a veteran in upstream GCC

>> development and maintenance. I also asked him how/where I should report

>> the bug. I got no answer from him.

>>

>> If you'd like to report a GCC bug, please go ahead, but I won't waste my

>> time :)

>>

> 

> It seems that it is your testcase that is broken:

> 

> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c

> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c

> @@ -663,6 +663,8 @@ EnrollListOfX509Certs (

>    UINT8            *Data;

>    UINT8            *Position;

> 

> +  Status = EFI_SUCCESS;

> +

>    //

>    // compute total size first, for UINT32 range check, and allocation

>    //

> 

> With that change, everything works fine for me


I agree that proving my code wrong would be the best outcome here, but
can you please explain to me on what path we return from
EnrollListOfX509Certs() without setting Status?

Thanks
Laszlo

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Ard Biesheuvel July 15, 2016, 8:58 a.m. | #2
On 15 July 2016 at 10:48, Laszlo Ersek <lersek@redhat.com> wrote:
> On 07/15/16 10:40, Ard Biesheuvel wrote:

>> On 15 July 2016 at 08:33, Laszlo Ersek <lersek@redhat.com> wrote:

>>> On 07/15/16 08:07, Ard Biesheuvel wrote:

>>>> On 15 July 2016 at 01:27, Laszlo Ersek <lersek@redhat.com> wrote:

>>>

>>>>> First, the build tests. I built OVMF 84 times, with the following settings:

>>>>>

>>>>> * Dimension 1: whether your and Steven's patches were applied or not.

>>>>

>>>> I take it this means this series only?

>>>

>>> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I

>>> meant the five patches in this series -- one patch from Steven, four

>>> patches from you.

>>>

>>>>> * However, then I built my program -- find it attached -- with -b DEBUG

>>>>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the

>>>>> difference in output:

>>>>>

>>>>> FS2:\> DebugEnrollDefaultKeys.efi

>>>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>>>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 VendorKeys=0

>>>>> info: success

>>>>>

>>>>> versus

>>>>>

>>>>> FS2:\> OptEnrollDefaultKeys.efi

>>>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>>>> error: EnrollListOfX509Certs("db",

>>>>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter

>>>>>

>>>>> I don't know why this happens, but it definitely relates to varargs --

>>>>> the program uses them liberally.

>>>>>

>>>>

>>>> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,

>>>> it makes sense to report it to the GCC maintainers (assuming we can

>>>> create a test case). I noticed that this code iterates over the same

>>>> VA_LIST twice, I wonder how well that was tested ...

>>>

>>> You make a good point about using VA_LIST twice possibly tickling gcc

>>> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my

>>> code -- in order to step over some arguments --, which might not be all

>>> that usual as well.

>>>

>>> In any case, the way I use varargs seems to be standards conformant.

>>>

>>> With regard to reporting this to gcc developers: I won't try that. I'm

>>> discouraged by two facts:

>>>

>>> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since

>>> October 2011. You'll find some recent comments from Steven and David in

>>> it :)

>>>

>>> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --

>>> shortly after I ran into the -Os corruption issue with GCC48, I created

>>> and posted this small reproducer. Creating the reproducer wasn't

>>> trivial. In parallel I sent the same reproducer to one of my (indirect)

>>> colleagues at Red Hat, who had been a veteran in upstream GCC

>>> development and maintenance. I also asked him how/where I should report

>>> the bug. I got no answer from him.

>>>

>>> If you'd like to report a GCC bug, please go ahead, but I won't waste my

>>> time :)

>>>

>>

>> It seems that it is your testcase that is broken:

>>

>> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c

>> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c

>> @@ -663,6 +663,8 @@ EnrollListOfX509Certs (

>>    UINT8            *Data;

>>    UINT8            *Position;

>>

>> +  Status = EFI_SUCCESS;

>> +

>>    //

>>    // compute total size first, for UINT32 range check, and allocation

>>    //

>>

>> With that change, everything works fine for me

>

> I agree that proving my code wrong would be the best outcome here, but

> can you please explain to me on what path we return from

> EnrollListOfX509Certs() without setting Status?

>


Well, it seems that in the success case, we are supposed to end up at line 692

  if (EFI_ERROR (Status)) {
    goto Out;
  }

and expect the condition to be FALSE. If we have successfully tallied
the size of at least one cert, 'DataSize == sizeof *SingleHeader' will
be FALSE, and so Status will never have been assigned a value.
Apparently, the optimizer sees some kind of undefined behavior here
which it can exploit to bail unconditionally (or Status happens to be
0 under -O0)
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Laszlo Ersek July 15, 2016, 9:06 a.m. | #3
On 07/15/16 10:58, Ard Biesheuvel wrote:
> On 15 July 2016 at 10:48, Laszlo Ersek <lersek@redhat.com> wrote:

>> On 07/15/16 10:40, Ard Biesheuvel wrote:

>>> On 15 July 2016 at 08:33, Laszlo Ersek <lersek@redhat.com> wrote:

>>>> On 07/15/16 08:07, Ard Biesheuvel wrote:

>>>>> On 15 July 2016 at 01:27, Laszlo Ersek <lersek@redhat.com> wrote:

>>>>

>>>>>> First, the build tests. I built OVMF 84 times, with the following settings:

>>>>>>

>>>>>> * Dimension 1: whether your and Steven's patches were applied or not.

>>>>>

>>>>> I take it this means this series only?

>>>>

>>>> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I

>>>> meant the five patches in this series -- one patch from Steven, four

>>>> patches from you.

>>>>

>>>>>> * However, then I built my program -- find it attached -- with -b DEBUG

>>>>>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the

>>>>>> difference in output:

>>>>>>

>>>>>> FS2:\> DebugEnrollDefaultKeys.efi

>>>>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>>>>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 VendorKeys=0

>>>>>> info: success

>>>>>>

>>>>>> versus

>>>>>>

>>>>>> FS2:\> OptEnrollDefaultKeys.efi

>>>>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>>>>> error: EnrollListOfX509Certs("db",

>>>>>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter

>>>>>>

>>>>>> I don't know why this happens, but it definitely relates to varargs --

>>>>>> the program uses them liberally.

>>>>>>

>>>>>

>>>>> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,

>>>>> it makes sense to report it to the GCC maintainers (assuming we can

>>>>> create a test case). I noticed that this code iterates over the same

>>>>> VA_LIST twice, I wonder how well that was tested ...

>>>>

>>>> You make a good point about using VA_LIST twice possibly tickling gcc

>>>> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my

>>>> code -- in order to step over some arguments --, which might not be all

>>>> that usual as well.

>>>>

>>>> In any case, the way I use varargs seems to be standards conformant.

>>>>

>>>> With regard to reporting this to gcc developers: I won't try that. I'm

>>>> discouraged by two facts:

>>>>

>>>> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since

>>>> October 2011. You'll find some recent comments from Steven and David in

>>>> it :)

>>>>

>>>> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --

>>>> shortly after I ran into the -Os corruption issue with GCC48, I created

>>>> and posted this small reproducer. Creating the reproducer wasn't

>>>> trivial. In parallel I sent the same reproducer to one of my (indirect)

>>>> colleagues at Red Hat, who had been a veteran in upstream GCC

>>>> development and maintenance. I also asked him how/where I should report

>>>> the bug. I got no answer from him.

>>>>

>>>> If you'd like to report a GCC bug, please go ahead, but I won't waste my

>>>> time :)

>>>>

>>>

>>> It seems that it is your testcase that is broken:

>>>

>>> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c

>>> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c

>>> @@ -663,6 +663,8 @@ EnrollListOfX509Certs (

>>>    UINT8            *Data;

>>>    UINT8            *Position;

>>>

>>> +  Status = EFI_SUCCESS;

>>> +

>>>    //

>>>    // compute total size first, for UINT32 range check, and allocation

>>>    //

>>>

>>> With that change, everything works fine for me

>>

>> I agree that proving my code wrong would be the best outcome here, but

>> can you please explain to me on what path we return from

>> EnrollListOfX509Certs() without setting Status?

>>

> 

> Well, it seems that in the success case, we are supposed to end up at line 692

> 

>   if (EFI_ERROR (Status)) {

>     goto Out;

>   }

> 

> and expect the condition to be FALSE. If we have successfully tallied

> the size of at least one cert, 'DataSize == sizeof *SingleHeader' will

> be FALSE, and so Status will never have been assigned a value.

> Apparently, the optimizer sees some kind of undefined behavior here

> which it can exploit to bail unconditionally (or Status happens to be

> 0 under -O0)

> 


You are perfectly right. Thank you for catching this!

And, to our collective relief, this means that there should be no
problem with this series. :) I think I can add my Tested-by:

Tested-by: Laszlo Ersek <lersek@redhat.com>


Cheers!
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Ard Biesheuvel July 15, 2016, 9:25 a.m. | #4
On 15 July 2016 at 11:06, Laszlo Ersek <lersek@redhat.com> wrote:
> On 07/15/16 10:58, Ard Biesheuvel wrote:

>> On 15 July 2016 at 10:48, Laszlo Ersek <lersek@redhat.com> wrote:

>>> On 07/15/16 10:40, Ard Biesheuvel wrote:

>>>> On 15 July 2016 at 08:33, Laszlo Ersek <lersek@redhat.com> wrote:

>>>>> On 07/15/16 08:07, Ard Biesheuvel wrote:

>>>>>> On 15 July 2016 at 01:27, Laszlo Ersek <lersek@redhat.com> wrote:

>>>>>

>>>>>>> First, the build tests. I built OVMF 84 times, with the following settings:

>>>>>>>

>>>>>>> * Dimension 1: whether your and Steven's patches were applied or not.

>>>>>>

>>>>>> I take it this means this series only?

>>>>>

>>>>> Ah, yes, sorry about the ambiguity -- with "your and Steven's patches" I

>>>>> meant the five patches in this series -- one patch from Steven, four

>>>>> patches from you.

>>>>>

>>>>>>> * However, then I built my program -- find it attached -- with -b DEBUG

>>>>>>> and -b RELEASE too, and the -b RELEASE binary is broken. Here's the

>>>>>>> difference in output:

>>>>>>>

>>>>>>> FS2:\> DebugEnrollDefaultKeys.efi

>>>>>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>>>>>> info: SetupMode=0 SecureBoot=1 SecureBootEnable=1 CustomMode=0 VendorKeys=0

>>>>>>> info: success

>>>>>>>

>>>>>>> versus

>>>>>>>

>>>>>>> FS2:\> OptEnrollDefaultKeys.efi

>>>>>>> info: SetupMode=1 SecureBoot=0 SecureBootEnable=1 CustomMode=1 VendorKeys=0

>>>>>>> error: EnrollListOfX509Certs("db",

>>>>>>> D719B2CB-3D3A-4596-A3BC-DAD00E67656F): Invalid Parameter

>>>>>>>

>>>>>>> I don't know why this happens, but it definitely relates to varargs --

>>>>>>> the program uses them liberally.

>>>>>>>

>>>>>>

>>>>>> If this code ultimately uses __builtin_ms_va_lists with -O2 and fails,

>>>>>> it makes sense to report it to the GCC maintainers (assuming we can

>>>>>> create a test case). I noticed that this code iterates over the same

>>>>>> VA_LIST twice, I wonder how well that was tested ...

>>>>>

>>>>> You make a good point about using VA_LIST twice possibly tickling gcc

>>>>> the wrong way. Plus, there's one (VOID)VA_ARG() macro invocation in my

>>>>> code -- in order to step over some arguments --, which might not be all

>>>>> that usual as well.

>>>>>

>>>>> In any case, the way I use varargs seems to be standards conformant.

>>>>>

>>>>> With regard to reporting this to gcc developers: I won't try that. I'm

>>>>> discouraged by two facts:

>>>>>

>>>>> * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50818 -- open since

>>>>> October 2011. You'll find some recent comments from Steven and David in

>>>>> it :)

>>>>>

>>>>> * http://thread.gmane.org/gmane.comp.bios.tianocore.devel/10963 --

>>>>> shortly after I ran into the -Os corruption issue with GCC48, I created

>>>>> and posted this small reproducer. Creating the reproducer wasn't

>>>>> trivial. In parallel I sent the same reproducer to one of my (indirect)

>>>>> colleagues at Red Hat, who had been a veteran in upstream GCC

>>>>> development and maintenance. I also asked him how/where I should report

>>>>> the bug. I got no answer from him.

>>>>>

>>>>> If you'd like to report a GCC bug, please go ahead, but I won't waste my

>>>>> time :)

>>>>>

>>>>

>>>> It seems that it is your testcase that is broken:

>>>>

>>>> --- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c

>>>> +++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c

>>>> @@ -663,6 +663,8 @@ EnrollListOfX509Certs (

>>>>    UINT8            *Data;

>>>>    UINT8            *Position;

>>>>

>>>> +  Status = EFI_SUCCESS;

>>>> +

>>>>    //

>>>>    // compute total size first, for UINT32 range check, and allocation

>>>>    //

>>>>

>>>> With that change, everything works fine for me

>>>

>>> I agree that proving my code wrong would be the best outcome here, but

>>> can you please explain to me on what path we return from

>>> EnrollListOfX509Certs() without setting Status?

>>>

>>

>> Well, it seems that in the success case, we are supposed to end up at line 692

>>

>>   if (EFI_ERROR (Status)) {

>>     goto Out;

>>   }

>>

>> and expect the condition to be FALSE. If we have successfully tallied

>> the size of at least one cert, 'DataSize == sizeof *SingleHeader' will

>> be FALSE, and so Status will never have been assigned a value.

>> Apparently, the optimizer sees some kind of undefined behavior here

>> which it can exploit to bail unconditionally (or Status happens to be

>> 0 under -O0)

>>

>

> You are perfectly right. Thank you for catching this!

>

> And, to our collective relief, this means that there should be no

> problem with this series. :) I think I can add my Tested-by:

>

> Tested-by: Laszlo Ersek <lersek@redhat.com>

>


Thanks a lot!

I do have another question, though. Looking at your logs, I found the
following results for the sizes

GCC48 before:

SECFV [11%Full] 212992 total, 25392 used, 187600 free
FVMAIN_COMPACT [63%Full] 1753088 total, 1113976 used, 639112 free
DXEFV [56%Full] 10485760 total, 5970624 used, 4515136 free
PEIFV [21%Full] 917504 total, 200616 used, 716888 free

GCC48 after:

SECFV [7%Full] 212992 total, 15728 used, 197264 free
FVMAIN_COMPACT [65%Full] 1753088 total, 1145808 used, 607280 free
DXEFV [39%Full] 10485760 total, 4148384 used, 6337376 free
PEIFV [14%Full] 917504 total, 131208 used, 786296 free


GCC49 before:

SECFV [12%Full] 212992 total, 25616 used, 187376 free
FVMAIN_COMPACT [64%Full] 1753088 total, 1122760 used, 630328 free
DXEFV [58%Full] 10485760 total, 6106048 used, 4379712 free
PEIFV [22%Full] 917504 total, 205992 used, 711512 free

GCC49 after:

SECFV [7%Full] 212992 total, 15824 used, 197168 free
FVMAIN_COMPACT [66%Full] 1753088 total, 1166952 used, 586136 free
DXEFV [40%Full] 10485760 total, 4225728 used, 6260032 free
PEIFV [14%Full] 917504 total, 133672 used, 783832 free

So DXEFV is substantially smaller, but FVMAIN_COMPACT ends up slightly
bigger. Which one do we care about mostly? I would assume
FVMAIN_COMPACT, since that translates into flash footprint directly.

So I did a quick test, comparing O0 / O1 / O2 on GCC48, and it seems
the sweet spot is O1 (due to lack of support for Os in older GCCs)

-O0

SECFV [11%Full] 212992 total, 23536 used, 189456 free
FVMAIN_COMPACT [58%Full] 1753088 total, 1016816 used, 736272 free
DXEFV [47%Full] 10485760 total, 4978656 used, 5507104 free
PEIFV [14%Full] 917504 total, 133328 used, 784176 free

-O1

SECFV [6%Full] 212992 total, 14768 used, 198224 free
FVMAIN_COMPACT [58%Full] 1753088 total, 1017856 used, 735232 free
DXEFV [37%Full] 10485760 total, 3907648 used, 6578112 free
PEIFV [10%Full] 917504 total, 95696 used, 821808 free

-O2

SECFV [7%Full] 212992 total, 15728 used, 197264 free
FVMAIN_COMPACT [63%Full] 1753088 total, 1111144 used, 641944 free
DXEFV [38%Full] 10485760 total, 4029984 used, 6455776 free
PEIFV [10%Full] 917504 total, 100400 used, 817104 free


Anyway, thanks again for your time and effort in testing this, much appreciated.

-- 
Ard.
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Laszlo Ersek July 15, 2016, 9:51 a.m. | #5
On 07/15/16 11:25, Ard Biesheuvel wrote:
> On 15 July 2016 at 11:06, Laszlo Ersek <lersek@redhat.com> wrote:


>> And, to our collective relief, this means that there should be no

>> problem with this series. :) I think I can add my Tested-by:

>>

>> Tested-by: Laszlo Ersek <lersek@redhat.com>

>>

> 

> Thanks a lot!

> 

> I do have another question, though. Looking at your logs, I found the

> following results for the sizes

> 

> GCC48 before:

> 

> SECFV [11%Full] 212992 total, 25392 used, 187600 free

> FVMAIN_COMPACT [63%Full] 1753088 total, 1113976 used, 639112 free

> DXEFV [56%Full] 10485760 total, 5970624 used, 4515136 free

> PEIFV [21%Full] 917504 total, 200616 used, 716888 free

> 

> GCC48 after:

> 

> SECFV [7%Full] 212992 total, 15728 used, 197264 free

> FVMAIN_COMPACT [65%Full] 1753088 total, 1145808 used, 607280 free

> DXEFV [39%Full] 10485760 total, 4148384 used, 6337376 free

> PEIFV [14%Full] 917504 total, 131208 used, 786296 free

> 

> 

> GCC49 before:

> 

> SECFV [12%Full] 212992 total, 25616 used, 187376 free

> FVMAIN_COMPACT [64%Full] 1753088 total, 1122760 used, 630328 free

> DXEFV [58%Full] 10485760 total, 6106048 used, 4379712 free

> PEIFV [22%Full] 917504 total, 205992 used, 711512 free

> 

> GCC49 after:

> 

> SECFV [7%Full] 212992 total, 15824 used, 197168 free

> FVMAIN_COMPACT [66%Full] 1753088 total, 1166952 used, 586136 free

> DXEFV [40%Full] 10485760 total, 4225728 used, 6260032 free

> PEIFV [14%Full] 917504 total, 133672 used, 783832 free

> 

> So DXEFV is substantially smaller, but FVMAIN_COMPACT ends up slightly

> bigger. Which one do we care about mostly? I would assume

> FVMAIN_COMPACT, since that translates into flash footprint directly.


Correct.

DXEFV is less relevant, but not irrelevant. We've been increasing its
allotted size in the FDF file(s) over time, cramming more and more DXE
phase stuff into OVMF. Its size does have an impact on guest memory
availability, when the SMM driver stack is built in, and S3 is enabled
on the QEMU command line. (Under these circumstances DXEFV's location is
covered as AcpiNVS. Not optimal, but that's what we have.)

So, in practice, shrinking DXEFV is useful as well.

FVMAIN_COMPACT is what ultimately determines the flash chip's size.
We've never had to change that (after a historical bump from 1MB to 2MB,
speaking unified flash terms -- for OVMF_CODE.fd, the bump means
1MB-128KB --> 2MB-128KB). If we have to increase OVMF_CODE.fd, that will
be visible on the host filesystem, and might present compatibility
problems for existing VMs after their next boot. (I hope not, but I've
never actually tested this scenario.)

Mike mentioned elsewhere in this thread that he measured the utility of
new build flags on the change in compressed size.

I guess it just tells us that TianoCompress performs incredibly well :)

> So I did a quick test, comparing O0 / O1 / O2 on GCC48, and it seems

> the sweet spot is O1 (due to lack of support for Os in older GCCs)

> 

> -O0

> 

> SECFV [11%Full] 212992 total, 23536 used, 189456 free

> FVMAIN_COMPACT [58%Full] 1753088 total, 1016816 used, 736272 free

> DXEFV [47%Full] 10485760 total, 4978656 used, 5507104 free

> PEIFV [14%Full] 917504 total, 133328 used, 784176 free

> 

> -O1

> 

> SECFV [6%Full] 212992 total, 14768 used, 198224 free

> FVMAIN_COMPACT [58%Full] 1753088 total, 1017856 used, 735232 free

> DXEFV [37%Full] 10485760 total, 3907648 used, 6578112 free

> PEIFV [10%Full] 917504 total, 95696 used, 821808 free

> 

> -O2

> 

> SECFV [7%Full] 212992 total, 15728 used, 197264 free

> FVMAIN_COMPACT [63%Full] 1753088 total, 1111144 used, 641944 free

> DXEFV [38%Full] 10485760 total, 4029984 used, 6455776 free

> PEIFV [10%Full] 917504 total, 100400 used, 817104 free


Huh, very interesting. -O1 vs. -O2 produce almost identically sized
PEIFV and DXEFV (both optimization settings beating -O0 of course), but
-O2 does something that makes the compression deteriorate, even to the
point where FVMAIN_COMPACT ends up larger than with -O0!

More agressive unrolling / inlining with -O2 perhaps?

And, the effect of -O1 seems to be: practically unchanged FVMAIN_COMPACT
size, relative to -O0 (it's negligibly larger than with -O0 I guess),
and significantly smaller DXEFV and PEIFV usage.

-O1 seems useful. Assuming that the "negligible" size increase in
FVMAIN_COMPACT, relative to -O0, remains negligible in the long term. :)

> Anyway, thanks again for your time and effort in testing this, much appreciated.


No, thank you for doing this; I've always been intrigued by the
optimization possibilities for OVMF. What I'd be most interested in is
enabling optimization (saving space) for -b DEBUG -- I don't do source
level debugging, but I want the debug messages and the ASSERT()s. Saving
space with those compiled in would be superb -- I guess that's where the
savings would really shine.

Thanks!
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Ard Biesheuvel July 15, 2016, 10:15 a.m. | #6
On 15 July 2016 at 11:51, Laszlo Ersek <lersek@redhat.com> wrote:
> On 07/15/16 11:25, Ard Biesheuvel wrote:

>> On 15 July 2016 at 11:06, Laszlo Ersek <lersek@redhat.com> wrote:

>

>>> And, to our collective relief, this means that there should be no

>>> problem with this series. :) I think I can add my Tested-by:

>>>

>>> Tested-by: Laszlo Ersek <lersek@redhat.com>

>>>

>>

>> Thanks a lot!

>>

>> I do have another question, though. Looking at your logs, I found the

>> following results for the sizes

>>

>> GCC48 before:

>>

>> SECFV [11%Full] 212992 total, 25392 used, 187600 free

>> FVMAIN_COMPACT [63%Full] 1753088 total, 1113976 used, 639112 free

>> DXEFV [56%Full] 10485760 total, 5970624 used, 4515136 free

>> PEIFV [21%Full] 917504 total, 200616 used, 716888 free

>>

>> GCC48 after:

>>

>> SECFV [7%Full] 212992 total, 15728 used, 197264 free

>> FVMAIN_COMPACT [65%Full] 1753088 total, 1145808 used, 607280 free

>> DXEFV [39%Full] 10485760 total, 4148384 used, 6337376 free

>> PEIFV [14%Full] 917504 total, 131208 used, 786296 free

>>

>>

>> GCC49 before:

>>

>> SECFV [12%Full] 212992 total, 25616 used, 187376 free

>> FVMAIN_COMPACT [64%Full] 1753088 total, 1122760 used, 630328 free

>> DXEFV [58%Full] 10485760 total, 6106048 used, 4379712 free

>> PEIFV [22%Full] 917504 total, 205992 used, 711512 free

>>

>> GCC49 after:

>>

>> SECFV [7%Full] 212992 total, 15824 used, 197168 free

>> FVMAIN_COMPACT [66%Full] 1753088 total, 1166952 used, 586136 free

>> DXEFV [40%Full] 10485760 total, 4225728 used, 6260032 free

>> PEIFV [14%Full] 917504 total, 133672 used, 783832 free

>>

>> So DXEFV is substantially smaller, but FVMAIN_COMPACT ends up slightly

>> bigger. Which one do we care about mostly? I would assume

>> FVMAIN_COMPACT, since that translates into flash footprint directly.

>

> Correct.

>

> DXEFV is less relevant, but not irrelevant. We've been increasing its

> allotted size in the FDF file(s) over time, cramming more and more DXE

> phase stuff into OVMF. Its size does have an impact on guest memory

> availability, when the SMM driver stack is built in, and S3 is enabled

> on the QEMU command line. (Under these circumstances DXEFV's location is

> covered as AcpiNVS. Not optimal, but that's what we have.)

>

> So, in practice, shrinking DXEFV is useful as well.

>

> FVMAIN_COMPACT is what ultimately determines the flash chip's size.

> We've never had to change that (after a historical bump from 1MB to 2MB,

> speaking unified flash terms -- for OVMF_CODE.fd, the bump means

> 1MB-128KB --> 2MB-128KB). If we have to increase OVMF_CODE.fd, that will

> be visible on the host filesystem, and might present compatibility

> problems for existing VMs after their next boot. (I hope not, but I've

> never actually tested this scenario.)

>

> Mike mentioned elsewhere in this thread that he measured the utility of

> new build flags on the change in compressed size.

>

> I guess it just tells us that TianoCompress performs incredibly well :)

>

>> So I did a quick test, comparing O0 / O1 / O2 on GCC48, and it seems

>> the sweet spot is O1 (due to lack of support for Os in older GCCs)

>>

>> -O0

>>

>> SECFV [11%Full] 212992 total, 23536 used, 189456 free

>> FVMAIN_COMPACT [58%Full] 1753088 total, 1016816 used, 736272 free

>> DXEFV [47%Full] 10485760 total, 4978656 used, 5507104 free

>> PEIFV [14%Full] 917504 total, 133328 used, 784176 free

>>

>> -O1

>>

>> SECFV [6%Full] 212992 total, 14768 used, 198224 free

>> FVMAIN_COMPACT [58%Full] 1753088 total, 1017856 used, 735232 free

>> DXEFV [37%Full] 10485760 total, 3907648 used, 6578112 free

>> PEIFV [10%Full] 917504 total, 95696 used, 821808 free

>>

>> -O2

>>

>> SECFV [7%Full] 212992 total, 15728 used, 197264 free

>> FVMAIN_COMPACT [63%Full] 1753088 total, 1111144 used, 641944 free

>> DXEFV [38%Full] 10485760 total, 4029984 used, 6455776 free

>> PEIFV [10%Full] 917504 total, 100400 used, 817104 free

>

> Huh, very interesting. -O1 vs. -O2 produce almost identically sized

> PEIFV and DXEFV (both optimization settings beating -O0 of course), but

> -O2 does something that makes the compression deteriorate, even to the

> point where FVMAIN_COMPACT ends up larger than with -O0!

>

> More agressive unrolling / inlining with -O2 perhaps?

>

> And, the effect of -O1 seems to be: practically unchanged FVMAIN_COMPACT

> size, relative to -O0 (it's negligibly larger than with -O0 I guess),

> and significantly smaller DXEFV and PEIFV usage.

>

> -O1 seems useful. Assuming that the "negligible" size increase in

> FVMAIN_COMPACT, relative to -O0, remains negligible in the long term. :)

>

>> Anyway, thanks again for your time and effort in testing this, much appreciated.

>

> No, thank you for doing this; I've always been intrigued by the

> optimization possibilities for OVMF. What I'd be most interested in is

> enabling optimization (saving space) for -b DEBUG -- I don't do source

> level debugging, but I want the debug messages and the ASSERT()s. Saving

> space with those compiled in would be superb -- I guess that's where the

> savings would really shine.

>


Doing the same build for DEBUG gives me

-O0

SECFV [20%Full] 212992 total, 44464 used, 168528 free
FVMAIN_COMPACT [75%Full] 1753088 total, 1326592 used, 426496 free
DXEFV [67%Full] 10485760 total, 7077696 used, 3408064 free
PEIFV [28%Full] 917504 total, 257200 used, 660304 free

-O1

SECFV [16%Full] 212992 total, 35248 used, 177744 free
FVMAIN_COMPACT [76%Full] 1753088 total, 1339592 used, 413496 free
DXEFV [56%Full] 10485760 total, 5933120 used, 4552640 free
PEIFV [23%Full] 917504 total, 216240 used, 701264 free

-O2

SECFV [17%Full] 212992 total, 36688 used, 176304 free
FVMAIN_COMPACT [83%Full] 1753088 total, 1469904 used, 283184 free
DXEFV [58%Full] 10485760 total, 6157440 used, 4328320 free
PEIFV [24%Full] 917504 total, 227408 used, 690096 free

which paints roughly the same picture: much smaller binaries, but O2
is worse than O0/O1 after compression.

Another thing to take into account is that DXE code size translates
directly into RuntimeServicesCode memory footprint of
DXE_RUNTIME_DRIVER modules, although I am not sure if that has ever
been a source of concern.

-- 
Ard.
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Laszlo Ersek July 15, 2016, 10:37 a.m. | #7
On 07/15/16 12:15, Ard Biesheuvel wrote:
> On 15 July 2016 at 11:51, Laszlo Ersek <lersek@redhat.com> wrote:


>> What I'd be most interested in is

>> enabling optimization (saving space) for -b DEBUG -- I don't do source

>> level debugging, but I want the debug messages and the ASSERT()s. Saving

>> space with those compiled in would be superb -- I guess that's where the

>> savings would really shine.

>>

> 

> Doing the same build for DEBUG gives me

> 

> -O0

> 

> SECFV [20%Full] 212992 total, 44464 used, 168528 free

> FVMAIN_COMPACT [75%Full] 1753088 total, 1326592 used, 426496 free

> DXEFV [67%Full] 10485760 total, 7077696 used, 3408064 free

> PEIFV [28%Full] 917504 total, 257200 used, 660304 free

> 

> -O1

> 

> SECFV [16%Full] 212992 total, 35248 used, 177744 free

> FVMAIN_COMPACT [76%Full] 1753088 total, 1339592 used, 413496 free

> DXEFV [56%Full] 10485760 total, 5933120 used, 4552640 free

> PEIFV [23%Full] 917504 total, 216240 used, 701264 free

> 

> -O2

> 

> SECFV [17%Full] 212992 total, 36688 used, 176304 free

> FVMAIN_COMPACT [83%Full] 1753088 total, 1469904 used, 283184 free

> DXEFV [58%Full] 10485760 total, 6157440 used, 4328320 free

> PEIFV [24%Full] 917504 total, 227408 used, 690096 free

> 

> which paints roughly the same picture: much smaller binaries, but O2

> is worse than O0/O1 after compression.


Thank you for checking this -- in absolute terms, -O1 shaves about 1.1MB
off DXEFV. Not bad!

> Another thing to take into account is that DXE code size translates

> directly into RuntimeServicesCode memory footprint of

> DXE_RUNTIME_DRIVER modules,


Good point!

> although I am not sure if that has ever

> been a source of concern.


I've never thought of it -- my gut feeling is that the AcpiNVS stuff is
larger. (I could easily be biased though: the AcpiNVS allocations are
all manual and require thought, while RuntimeServicesCode allocs are all
automatic. I might be projecting "mental effort required" to "allocation
size" :))

Thanks
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Ard Biesheuvel July 15, 2016, 8:35 p.m. | #8
On 15 July 2016 at 12:37, Laszlo Ersek <lersek@redhat.com> wrote:
> On 07/15/16 12:15, Ard Biesheuvel wrote:

>> On 15 July 2016 at 11:51, Laszlo Ersek <lersek@redhat.com> wrote:

>

>>> What I'd be most interested in is

>>> enabling optimization (saving space) for -b DEBUG -- I don't do source

>>> level debugging, but I want the debug messages and the ASSERT()s. Saving

>>> space with those compiled in would be superb -- I guess that's where the

>>> savings would really shine.

>>>

>>

>> Doing the same build for DEBUG gives me

>>

>> -O0

>>

>> SECFV [20%Full] 212992 total, 44464 used, 168528 free

>> FVMAIN_COMPACT [75%Full] 1753088 total, 1326592 used, 426496 free

>> DXEFV [67%Full] 10485760 total, 7077696 used, 3408064 free

>> PEIFV [28%Full] 917504 total, 257200 used, 660304 free

>>

>> -O1

>>

>> SECFV [16%Full] 212992 total, 35248 used, 177744 free

>> FVMAIN_COMPACT [76%Full] 1753088 total, 1339592 used, 413496 free

>> DXEFV [56%Full] 10485760 total, 5933120 used, 4552640 free

>> PEIFV [23%Full] 917504 total, 216240 used, 701264 free

>>

>> -O2

>>

>> SECFV [17%Full] 212992 total, 36688 used, 176304 free

>> FVMAIN_COMPACT [83%Full] 1753088 total, 1469904 used, 283184 free

>> DXEFV [58%Full] 10485760 total, 6157440 used, 4328320 free

>> PEIFV [24%Full] 917504 total, 227408 used, 690096 free

>>

>> which paints roughly the same picture: much smaller binaries, but O2

>> is worse than O0/O1 after compression.

>

> Thank you for checking this -- in absolute terms, -O1 shaves about 1.1MB

> off DXEFV. Not bad!

>

>> Another thing to take into account is that DXE code size translates

>> directly into RuntimeServicesCode memory footprint of

>> DXE_RUNTIME_DRIVER modules,

>

> Good point!

>

>> although I am not sure if that has ever

>> been a source of concern.

>

> I've never thought of it -- my gut feeling is that the AcpiNVS stuff is

> larger. (I could easily be biased though: the AcpiNVS allocations are

> all manual and require thought, while RuntimeServicesCode allocs are all

> automatic. I might be projecting "mental effort required" to "allocation

> size" :))

>


Actually, it appears -Os does work, as long as you also pass the
-maccumulate-outgoing-args switch. I am not sure what I did wrong
before, but all the builds below boot fine.

I get the following results:

RELEASE_GCC44

SECFV [6%Full] 212992 total, 14320 used, 198672 free
FVMAIN_COMPACT [56%Full] 1753088 total, 997184 used, 755904 free
DXEFV [33%Full] 10485760 total, 3508336 used, 6977424 free
PEIFV [11%Full] 917504 total, 109896 used, 807608 free

RELEASE_GCC46

SECFV [6%Full] 212992 total, 13936 used, 199056 free
FVMAIN_COMPACT [56%Full] 1753088 total, 987632 used, 765456 free
DXEFV [33%Full] 10485760 total, 3468304 used, 7017456 free
PEIFV [11%Full] 917504 total, 107784 used, 809720 free

RELEASE_GCC47

SECFV [6%Full] 212992 total, 13808 used, 199184 free
FVMAIN_COMPACT [56%Full] 1753088 total, 988072 used, 765016 free
DXEFV [32%Full] 10485760 total, 3449936 used, 7035824 free
PEIFV [11%Full] 917504 total, 106984 used, 810520 free

RELEASE_GCC48

SECFV [6%Full] 212992 total, 13744 used, 199248 free
FVMAIN_COMPACT [56%Full] 1753088 total, 992800 used, 760288 free
DXEFV [32%Full] 10485760 total, 3453456 used, 7032304 free
PEIFV [11%Full] 917504 total, 107464 used, 810040 free

DEBUG_GCC44

SECFV [14%Full] 212992 total, 31632 used, 181360 free
FVMAIN_COMPACT [74%Full] 1753088 total, 1300904 used, 452184 free
DXEFV [50%Full] 10485760 total, 5272464 used, 5213296 free
PEIFV [25%Full] 917504 total, 231496 used, 686008 free

DEBUG_GCC46

SECFV [14%Full] 212992 total, 31728 used, 181264 free
FVMAIN_COMPACT [73%Full] 1753088 total, 1291672 used, 461416 free
DXEFV [50%Full] 10485760 total, 5263888 used, 5221872 free
PEIFV [25%Full] 917504 total, 231656 used, 685848 free

DEBUG_GCC47

SECFV [14%Full] 212992 total, 31440 used, 181552 free
FVMAIN_COMPACT [73%Full] 1753088 total, 1290584 used, 462504 free
DXEFV [49%Full] 10485760 total, 5234224 used, 5251536 free
PEIFV [25%Full] 917504 total, 230152 used, 687352 free

DEBUG_GCC48

SECFV [14%Full] 212992 total, 31280 used, 181712 free
FVMAIN_COMPACT [73%Full] 1753088 total, 1296208 used, 456880 free
DXEFV [49%Full] 10485760 total, 5229424 used, 5256336 free
PEIFV [25%Full] 917504 total, 229512 used, 687992 free

I am going to respin the series to use Os rather than O2.

Thanks,
Ard.
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Patch

--- a/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
+++ b/OvmfPkg/EnrollDefaultKeys/EnrollDefaultKeys.c
@@ -663,6 +663,8 @@  EnrollListOfX509Certs (
   UINT8            *Data;
   UINT8            *Position;

+  Status = EFI_SUCCESS;
+
   //
   // compute total size first, for UINT32 range check, and allocation
   //