From patchwork Wed Jul 13 09:54:25 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 71906
In-Reply-To: <146825671038.32137.7165601905988369757@jljusten-ivb>
References: <1467967364-11556-1-git-send-email-steven.shi@intel.com>
 <1467967364-11556-5-git-send-email-steven.shi@intel.com>
 <146825671038.32137.7165601905988369757@jljusten-ivb>
From: Ard Biesheuvel
Date: Wed, 13 Jul 2016 11:54:25 +0200
To: Jordan Justen
Subject: Re: [edk2] [PATCH v2 4/7] BaseTools-Conf:Introduce GCC5 new toolchain for x86
List-Id: EDK II Development
Cc: "Kinney, Michael D", edk2-devel-01, "afish@apple.com", "Gao, Liming"

On 11 July 2016 at 19:05, Jordan Justen wrote:
> On 2016-07-08 01:42:41, Shi, Steven wrote:
>> GCC5 enable GCC Link
>> Time Optimization (LTO) and code size
>> optimization (–Os) for aggressive code size improvement.
>
> Can you fix this to be a dash? (-Os)
>
>> GCC5 X64 code is small code model + position independent
>> code (PIE).
>>
>> Test pass platforms: OVMF (OvmfPkgIa32.dsc, OvmfPkgX64.dsc) and
>> Quark (Quark.dsc).
>> Test compiler and linker version: GCC 5.3, GCC 5.4, GNU ld 2.26.
>>
>> Contributed-under: TianoCore Contribution Agreement 1.0
>> Signed-off-by: Steven Shi
>> ---
>>  BaseTools/Conf/build_rule.template |  9 ++++
>>  BaseTools/Conf/tools_def.template  | 92 ++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 101 insertions(+)
>>  mode change 100644 => 100755 BaseTools/Conf/build_rule.template
>>  mode change 100644 => 100755 BaseTools/Conf/tools_def.template
>>
>> diff --git a/BaseTools/Conf/build_rule.template b/BaseTools/Conf/build_rule.template
>> old mode 100644
>> new mode 100755
>> index 91bcc18..25cf380
>> --- a/BaseTools/Conf/build_rule.template
>> +++ b/BaseTools/Conf/build_rule.template
>> @@ -295,6 +295,10 @@
>>  "$(DLINK)" -o ${dst} $(DLINK_FLAGS) --start-group $(DLINK_SPATH) @$(STATIC_LIBRARY_FILES_LIST) --end-group $(DLINK2_FLAGS)
>>  "$(OBJCOPY)" $(OBJCOPY_FLAGS) ${dst}
>>
>> +
>> + "$(DLINK)" -o ${dst} $(DLINK_FLAGS) -Wl,--start-group,$(DLINK_SPATH),@$(STATIC_LIBRARY_FILES_LIST) -Wl,--end-group $(DLINK2_FLAGS)
>> + "$(OBJCOPY)" $(OBJCOPY_FLAGS) ${dst}
>
> Can we convert the current GCC toolchains to use GCC as the linker?
> This would let us keep a single build rule family. It would take some
> extra effort to validate the toolchains.
>
> If that does not work, then rather than separate rule families for
> each new toolchain, can we just add a single new GCCLTO build rule
> family? Or, maybe GCCLD would be a better name.
>

Any GCC compatible compiler should be able to deal with being invoked
as the linker. The only problem is DLINK_FLAGS at the .inf and .dsc
level, which would require the -Wl, prefix to be added to each linker
argument.
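
As a rough illustration of that -Wl, rewriting (a minimal sketch, not
part of BaseTools; the helper name to_driver_flags is hypothetical), raw
ld arguments can be joined with commas behind a single -Wl, prefix,
which the gcc driver splits again before handing them to the linker:

```python
import shlex


def to_driver_flags(ld_flags: str) -> str:
    """Wrap raw ld arguments so they can be passed through the gcc driver.

    Hypothetical helper for illustration only; it blindly wraps every
    token and does not know which options gcc already understands.
    """
    tokens = shlex.split(ld_flags)
    if not tokens:
        return ""
    # gcc splits the text after "-Wl," at commas and passes each piece to
    # the linker as a separate argument, so "-z foo" becomes "-Wl,-z,foo".
    return "-Wl," + ",".join(tokens)


if __name__ == "__main__":
    print(to_driver_flags("-z common-page-size=0x1000"))
```

Arguments that themselves contain commas cannot be passed this way, and
options the driver already understands, such as -pie, do not need the
wrapper.
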
In EDK2 itself, we don't have too many of those overrides:

$ git grep DLINK_FL -- *.inf |grep GCC
ArmVirtPkg/PrePi/ArmVirtPrePiUniCoreRelocatable.inf: GCC:*_*_*_DLINK_FLAGS = -pie -T $(MODULE_DIR)/Scripts/PrePi-PIE.lds
EmulatorPkg/Unix/Host/Host.inf: GCC:*_*_IA32_DLINK_FLAGS == -o $(BIN_DIR)/Host -m elf_i386 -dynamic-linker $(HOST_DLINK_PATHS) -L/usr/lib/i386-linux-gnu -L/usr/X11R6/lib -lXext -lX11
EmulatorPkg/Unix/Host/Host.inf: GCC:*_*_X64_DLINK_FLAGS == -o $(BIN_DIR)/Host -m elf_x86_64 -dynamic-linker $(HOST_DLINK_PATHS) -L/usr/lib/x86_64-linux-gnu -L/usr/X11R6/lib -lXext -lX11

$ git grep DLINK_FL -- *.dsc |grep GCC
CorebootPayloadPkg/CorebootPayloadPkgIa32X64.dsc: GCC:DEBUG_*_*_DLINK_FLAGS = -flto
OvmfPkg/OvmfPkgIa32.dsc: GCC:*_*_*_DLINK_FLAGS = -z common-page-size=0x1000
OvmfPkg/OvmfPkgIa32X64.dsc: GCC:*_*_*_DLINK_FLAGS = -z common-page-size=0x1000
OvmfPkg/OvmfPkgX64.dsc: GCC:*_*_*_DLINK_FLAGS = -z common-page-size=0x1000

If we add GCCLD or GCCLTO as a build rule family, but not as a proper
family, this becomes problematic, since these overrides cannot be
reformulated in a way that allows them to work with both.

In general, I think this series is solving too many things at the same
time. We have

LTO vs non-LTO
PIC vs non-PIC
small model vs large model
PIE vs non-PIE
Os vs O0

all of which have some effect on code size, but there is no
quantification of which does what.

First of all, we have the introduction of Os on X64. I think not having
a -O setting was an oversight, and the fact that we appear to use the
default of -O0 for all X64 GCC builds should be treated as a bug, not
an improvement, since it was never the intention to build with
optimization disabled. We should have a separate patch for this, one
that adds either -O2 or -Os to all GCC/X64 versions that can tolerate
it.

small model + PIC vs large model
could probably be applied to other GCC versions as well? This is more
debatable, of course, but this does not have a lot to do with GCC 5.3,
I think?
Note that I don't think we require any new relocation types to be
handled in GenFw if we build with 'hidden' visibility, i.e.,

"""
"""

PIE vs non-PIE
As Andrew mentions, this may be a requirement for CLANG/llvm, but there
is no reason to use it on GCC+LD if it does not require it, since it
makes GCC5 deviate from GCC4x in more ways than necessary.

LTO vs non-LTO
As mentioned above, this should either be in the same family/build rule
family, or in a completely different family altogether (and I would
prefer the former).

For each of these changes, I would like to see a quantification of the
code size reduction, rather than enabling everything at the same time.
My suspicion is that the missing -O argument has the largest effect of
all, and we should backport that to other versions if we can, IMO.

Apologies for the meandering nature of this email, but I think these
topics deserve more attention than they are getting at the moment.

Regards,
Ard.

--- a/MdePkg/Include/X64/ProcessorBind.h
+++ b/MdePkg/Include/X64/ProcessorBind.h
@@ -27,6 +27,16 @@
 #pragma pack()
 #endif
 
+#if defined(__GNUC__) && defined(__pic__)
+//
+// Mark all symbol declarations and references as hidden, meaning they will not
+// be exported from a shared library, and thus will not be subject to symbol
+// preemption. This allows the compiler to refer to symbols directly using
+// relative references rather than via the GOT, which contains absolute symbol
+// addresses that are subject to runtime relocation.
+//
+#pragma GCC visibility push (hidden)
+#endif
 
 #if defined(__INTEL_COMPILER)
 //
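
One way to organize the per-change size measurements asked for in this
mail is to vary a single setting at a time against a fixed baseline. The
sketch below only enumerates the flag sets. The baseline and candidate
values are assumptions chosen for illustration, the names join_flags and
one_at_a_time are hypothetical, and PIE is left out because whether it
can be toggled independently is not established here. Building each
variant and recording the image sizes is not shown.

```python
# Assumed baseline: no optimization, no LTO, large code model.
BASELINE = {
    "opt": "-O0",
    "lto": "",
    "model": "-mcmodel=large",
}

# Assumed candidates: each entry replaces exactly one baseline setting.
CANDIDATES = {
    "opt": "-Os",
    "lto": "-flto",
    "model": "-mcmodel=small -fpic",
}


def join_flags(settings):
    """Join the non-empty flag strings into one command-line fragment."""
    return " ".join(value for value in settings.values() if value)


def one_at_a_time(baseline, candidates):
    """Yield (label, flags) pairs: the baseline first, then one variant
    per candidate with only that single setting changed."""
    yield "baseline", join_flags(baseline)
    for key, value in candidates.items():
        variant = {**baseline, key: value}
        yield key, join_flags(variant)


if __name__ == "__main__":
    for label, flags in one_at_a_time(BASELINE, CANDIDATES):
        print(f"{label:8} {flags}")
```

Comparing each variant's image size against the baseline attributes a
size change to one setting at a time, instead of to the combined GCC5
configuration.
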