[edk2,v2,4/7] BaseTools-Conf:Introduce GCC5 new toolchain for x86

Message ID CAKv+Gu8z5comhHR4quVSMVE7JRe7KH9KB8ND7QX1Y4yEdP1h=Q@mail.gmail.com
State New
Headers show

Commit Message

Ard Biesheuvel July 13, 2016, 9:54 a.m.
On 11 July 2016 at 19:05, Jordan Justen <jordan.l.justen@intel.com> wrote:
> On 2016-07-08 01:42:41, Shi, Steven wrote:
>> GCC5 enable GCC Link Time Optimization (LTO) and code size
>> optimization (–Os) for aggressive code size improvement.
>
> Can you fix this to be a dash? (-Os)
>
>> GCC5 X64 code is small code model + position independent
>> code (PIE).
>>
>> Test pass platforms: OVMF (OvmfPkgIa32.dsc, OvmfPkgX64.dsc) and
>> Quark (Quark.dsc).
>> Test compiler and linker version: GCC 5.3, GCC 5.4, GNU ld 2.26.
>>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Steven Shi <steven.shi@intel.com>
>> ---
>>  BaseTools/Conf/build_rule.template |  9 ++++
>>  BaseTools/Conf/tools_def.template  | 92 ++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 101 insertions(+)
>>  mode change 100644 => 100755 BaseTools/Conf/build_rule.template
>>  mode change 100644 => 100755 BaseTools/Conf/tools_def.template
>>
>> diff --git a/BaseTools/Conf/build_rule.template b/BaseTools/Conf/build_rule.template
>> old mode 100644
>> new mode 100755
>> index 91bcc18..25cf380
>> --- a/BaseTools/Conf/build_rule.template
>> +++ b/BaseTools/Conf/build_rule.template
>> @@ -295,6 +295,10 @@
>>          "$(DLINK)" -o ${dst} $(DLINK_FLAGS) --start-group $(DLINK_SPATH) @$(STATIC_LIBRARY_FILES_LIST) --end-group $(DLINK2_FLAGS)
>>          "$(OBJCOPY)" $(OBJCOPY_FLAGS) ${dst}
>>
>> +    <Command.GCC5>
>> +        "$(DLINK)" -o ${dst} $(DLINK_FLAGS) -Wl,--start-group,$(DLINK_SPATH),@$(STATIC_LIBRARY_FILES_LIST) -Wl,--end-group $(DLINK2_FLAGS)
>> +        "$(OBJCOPY)" $(OBJCOPY_FLAGS) ${dst}
>
> Can we convert the current GCC toolchains to use GCC as the linker?
> This would let us keep a single build rule family. It would take some
> extra effort to validate the toolchains.
>
> If that does not work, then rather than separate rule families for
> each new toolchain, can we just add a single new GCCLTO build rule
> family? Or, maybe GCCLD would be a better name.
>

Any GCC compatible compiler should be able to deal with being invoked
as the linker. The only problem is DLINK_FLAGS at the .inf  and .dsc
level, which would require the -Wl, prefix to be added to each linker
argument. In EDK2 itself, we don't have too many of those:

$ git grep DLINK_FL -- *.inf |grep GCC
ArmVirtPkg/PrePi/ArmVirtPrePiUniCoreRelocatable.inf:
GCC:*_*_*_DLINK_FLAGS = -pie -T $(MODULE_DIR)/Scripts/PrePi-PIE.lds
EmulatorPkg/Unix/Host/Host.inf:   GCC:*_*_IA32_DLINK_FLAGS == -o
$(BIN_DIR)/Host -m elf_i386 -dynamic-linker $(HOST_DLINK_PATHS)
-L/usr/lib/i386-linux-gnu -L/usr/X11R6/lib -lXext -lX11
EmulatorPkg/Unix/Host/Host.inf:   GCC:*_*_X64_DLINK_FLAGS == -o
$(BIN_DIR)/Host -m elf_x86_64 -dynamic-linker $(HOST_DLINK_PATHS)
-L/usr/lib/x86_64-linux-gnu -L/usr/X11R6/lib -lXext -lX11

$ git grep DLINK_FL -- *.dsc |grep GCC
CorebootPayloadPkg/CorebootPayloadPkgIa32X64.dsc:
GCC:DEBUG_*_*_DLINK_FLAGS      = -flto
OvmfPkg/OvmfPkgIa32.dsc:  GCC:*_*_*_DLINK_FLAGS = -z common-page-size=0x1000
OvmfPkg/OvmfPkgIa32X64.dsc:  GCC:*_*_*_DLINK_FLAGS = -z common-page-size=0x1000
OvmfPkg/OvmfPkgX64.dsc:  GCC:*_*_*_DLINK_FLAGS = -z common-page-size=0x1000

If we add GCCLD or GCCLTO as a build rule family, but not as a proper
family, this becomes problematic, since these overrides cannot be
reformulated in a way that allows them to work with both.


In general, I think this series is solving too many things at the same
time. We have

LTO vs non-LTO
PIC vs non-PIC
small model vs large model
PIE vs non-PIE
Os vs O0

all of which have some effect on code size, but there is no
quantification of which does what.

First of all, we have the introduction of Os on X64. I think not
having a -O setting was an oversight, and the fact that we appear to
use the default of -O0 for all X64 GCC builds should be treated as a
bug, not an improvement, since it was never the intention to build
with optimization disabled. We should have a separate patch for this,
that adds either -O2 or -Os to all GCC/X64 versions that can tolerate
it.

small model +pic vs large model could probably be applied to other GCC
versions as well? This is more debatable, of course, but this has not
a lot to do with GCC 5.3, I think? Note that I don't think we require
any new relocation types to be handled in GenFw if we build with
'hidden' visibility, i.e.,

"""
"""

PIE vs non-PIE

As Andrew mentions, this may be a requirement for CLANG/llvm, but
there is no reason to use it on GCC+LD if it does not require it,
since it makes GCC5 deviate from GCC4x in more ways than necessary.

LTO vs non-LTO

As mentioned above, this either be in the same family/build rule
family, or in a completely different family altogether (and I would
prefer the former).

For each of these changes, I would like to see a quantification of the
code size reduction, rather than enabling everything at the same time.
My suspicion is that the missing -O argument has the largest effect of
all, and we should backport that to other versions if we can IMO

Apologies for the meandering nature of this email, but I think these
topics deserve more attention than they are getting at the moment.


Regards,
Ard.

Patch

--- a/MdePkg/Include/X64/ProcessorBind.h
+++ b/MdePkg/Include/X64/ProcessorBind.h
@@ -27,6 +27,16 @@ 
 #pragma pack()
 #endif

+#if defined(__GNUC__) && defined(__pic__)
+//
+// Mark all symbol declarations and references as hidden, meaning they will not
+// be exported from a shared library, and thus will not be subject to symbol
+// preemption. This allows the compiler to refer to symbols directly using
+// relative references rather than via the GOT, which contains absolute symbol
+// addresses that are subject to runtime relocation.
+//
+#pragma GCC visibility push (hidden)
+#endif

 #if defined(__INTEL_COMPILER)
 //