From patchwork Fri Aug 12 08:09:27 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 73834 Delivered-To: patch@linaro.org Received: by 10.140.29.52 with SMTP id a49csp51763qga; Fri, 12 Aug 2016 01:09:30 -0700 (PDT) X-Received: by 10.66.33.65 with SMTP id p1mr25250915pai.30.1470989370367; Fri, 12 Aug 2016 01:09:30 -0700 (PDT) Return-Path: Received: from ml01.01.org (ml01.01.org. [2001:19d0:306:5::1]) by mx.google.com with ESMTPS id n66si7767200pfi.70.2016.08.12.01.09.29 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 12 Aug 2016 01:09:30 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of edk2-devel-bounces@lists.01.org designates 2001:19d0:306:5::1 as permitted sender) client-ip=2001:19d0:306:5::1; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@linaro.org; spf=pass (google.com: best guess record for domain of edk2-devel-bounces@lists.01.org designates 2001:19d0:306:5::1 as permitted sender) smtp.mailfrom=edk2-devel-bounces@lists.01.org; dmarc=fail (p=NONE dis=NONE) header.from=linaro.org Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 474B11A1E00; Fri, 12 Aug 2016 01:09:29 -0700 (PDT) X-Original-To: edk2-devel@lists.01.org Delivered-To: edk2-devel@lists.01.org Received: from mail-io0-x22a.google.com (mail-io0-x22a.google.com [IPv6:2607:f8b0:4001:c06::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 999941A1DFE for ; Fri, 12 Aug 2016 01:09:28 -0700 (PDT) Received: by mail-io0-x22a.google.com with SMTP id 38so18557165iol.0 for ; Fri, 12 Aug 2016 01:09:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=nPGBnrHlPzxmSEanS3OsxZG50AmPywsDD3FHmMdK6Ck=; b=KzUFujjW56atzlA1wrpW3tHbdYyuGI6io7xPJG0C/HvMztQaufruoupCY702STsYY0 Iox+5Bj+g2yKd6/RPVpTNoXvh0JlsjOvLbWlfMV68ikuqlfS3lqPQwJyXh7p7fK6IfuS FzkoPo8oerNv6FXFMmTcLG3AIgTNy2YdiQnfU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=nPGBnrHlPzxmSEanS3OsxZG50AmPywsDD3FHmMdK6Ck=; b=C+SDdMGjGXgIeswJMv8Tw5CnxcQkMRgMO1wD/UKIiZ9XigbrLys9bVgZvnMYyWzsle xLB+DQE+3OPldyIQugsI05eaUYcoxUwH2kvVAnaJPWIIyG5aywrcvm/ivabc27qToOQw oO3y2o8SseJ2UVig2zads705kauHuBwuPx/UoA1Ezg8q54mRKziigLhAdD2W9GRTPOO5 ZyJ468N1+46A5I0iAudg1+mzESkdPbP/0QJqrT2uU1YEKx0hLWMKPto8teDlgaVRLpga F+p7VS5pM5CplB9Zoq3c+d2mHAiEfDru7fFlVUI3ibGMOCu5t1NXWaw9oBRbOE6/Itda 4R7Q== X-Gm-Message-State: AEkooutaep5gg3GGos8+8XWjlY6zRs//dIg3Z6RGSqXgh2QVIk1adf9Md/Ra7Qw2OqLuCwV1ih1KvpJBeVTENuIJ X-Received: by 10.107.40.133 with SMTP id o127mr15947506ioo.183.1470989367575; Fri, 12 Aug 2016 01:09:27 -0700 (PDT) MIME-Version: 1.0 Received: by 10.36.204.195 with HTTP; Fri, 12 Aug 2016 01:09:27 -0700 (PDT) In-Reply-To: <93CB1B8B-8788-49FD-8FBE-9DF24F838FA6@apple.com> References: <1470939632-32198-1-git-send-email-ard.biesheuvel@linaro.org> <93CB1B8B-8788-49FD-8FBE-9DF24F838FA6@apple.com> From: Ard Biesheuvel Date: Fri, 12 Aug 2016 10:09:27 +0200 Message-ID: To: Andrew Fish Subject: Re: [edk2] [PATCH 1/3] ArmPkg/CompilerIntrinsicsLib: replace memcpy and memset with C code X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "edk2-devel@lists.01.org" , "leif.lindholm@linaro.org" , "liming.gao@intel.com" , Eugene Cohen Errors-To: edk2-devel-bounces@lists.01.org Sender: "edk2-devel" On 12 August 2016 at 01:04, Andrew Fish wrote: > >> On Aug 11, 2016, at 2:50 PM, Cohen, Eugene wrote: >> >>>> Why does memcpy performance matter? In addition to the overall >>> memcpy stuff scattered around C code we have an instance that is >>> particularly sensitive to memcpy performance. For DMA operations >>> when invoking double-buffering or access to portions of a buffer that >>> is common mapped (i.e. uncached on non-coherent DMA systems) the >>> impact of a non-optimized memcpy is enormous compared to the >>> optimized ones because the penalty is amplified by orders of >>> magnitude due to uncached memory access latency. >>>> >>> >>> That code would be using CopyMem(), no? This only serves the >>> compiler >>> generated calls, which are few since Tianocore does not allow >>> initialized locals. >> >> I see and agree that should minimize the impact. I guess I'll ask the naive question. Could the BaseMemoryLib and CompilerIntrinsicsLib share the same stuff? >> > > Eugene, > > I think if a CompilerIntrinsicsLib implementation consumes the BaseMemoryLib class (lists it in the INF) then I think it should just work. > Adding this """ """ (and something similar for memset()) will make the AArch64 platforms I usually test with build happily without the compilerintrinsicslib. Since no other changes are required, this means that BaseMemoryLib is already pulled into all modules one way or the other, and so it would seem like an improvement not to have both implementations, since they do exactly the same. For ARM, this is obviously different given the RT abi and its __aeabi_ prefixed entry points. I suppose the memcpy and memset intrinsincs are more a GCC thing than an ARM thing. _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel --- a/ArmPkg/Library/BaseMemoryLibStm/AArch64/CopyMem.c +++ b/ArmPkg/Library/BaseMemoryLibStm/AArch64/CopyMem.c @@ -144,3 +144,10 @@ InternalMemCopyMem ( } return DestinationBuffer; } + +// +// Make this implementation satisfy references to the intrinsic memcpy() that +// may be emitted by the compiler. +// +__attribute__((__weak__, __alias__("InternalMemCopyMem"))) +void *memcpy(void *dest, const void *src, __SIZE_TYPE__ n);