From patchwork Wed Dec 20 20:52:05 2017
X-Patchwork-Submitter: Arnd Bergmann
X-Patchwork-Id: 122498
From: Arnd Bergmann
To: Herbert Xu
Cc: James Morris, Arnd Bergmann, Richard Biener, Jakub Jelinek,
 "David S. Miller", Ard Biesheuvel, linux-crypto@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH] [RFT] crypto: aes-generic - turn off -ftree-pre and -ftree-sra
Date: Wed, 20 Dec 2017 21:52:05 +0100
Message-Id: <20171220205213.1025257-1-arnd@arndb.de>
X-Mailing-List: linux-crypto@vger.kernel.org

While testing other changes, I discovered that gcc-7.2.1 produces badly
optimized code for aes_encrypt/aes_decrypt.
This is especially true when CONFIG_UBSAN_SANITIZE_ALL is enabled, where it
leads to extremely large stack usage that in turn might cause kernel stack
overflows:

crypto/aes_generic.c: In function 'aes_encrypt':
crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger than 2048 bytes [-Wframe-larger-than=]
crypto/aes_generic.c: In function 'aes_decrypt':
crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger than 2048 bytes [-Wframe-larger-than=]

I verified that this problem exists on all architectures that are
supported by gcc-7.2, though arm64 in particular is less affected than
the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
stack usage but still produce worse code than earlier versions for this
file, apparently because of optimization passes that generally provide
a substantial improvement in object code quality but understandably fail
to find any shortcuts in the AES algorithm.

Turning off the tree-pre and tree-sra optimization steps seems to
reverse the effect, and could be used as a workaround in case we
don't get a good gcc fix.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Cc: Richard Biener
Cc: Jakub Jelinek
Signed-off-by: Arnd Bergmann
---
Jakub and Richard have done a more detailed analysis of this, and are
working on ways to improve the code again. In the meantime, I'm sending
this workaround to the Linux crypto maintainers to make them aware of
this issue and for testing.

What would be a good way to test the performance of the AES code with
the various combinations of compiler versions, as well as UBSAN and this
patch enabled or disabled?
---
 crypto/aes_generic.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

-- 
2.9.0

diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
index ca554d57d01e..35f973ba9878 100644
--- a/crypto/aes_generic.c
+++ b/crypto/aes_generic.c
@@ -1331,6 +1331,20 @@ EXPORT_SYMBOL_GPL(crypto_aes_set_key);
 	f_rl(bo, bi, 3, k);	\
 } while (0)
 
+#if __GNUC__ >= 7
+/*
+ * Newer compilers try to optimize integer arithmetic more aggressively,
+ * which generally improves code quality a lot, but in this specific case
+ * ends up hurting more than it helps, in some configurations drastically
+ * so. This turns off two optimization steps that have been shown to
+ * lead to rather badly optimized code with gcc-7.
+ *
+ * See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
+ */
+#pragma GCC optimize("-fno-tree-pre")
+#pragma GCC optimize("-fno-tree-sra")
+#endif
+
 static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 {
 	const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
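
For illustration only, and not part of the patch: "#pragma GCC optimize"
applies to every function defined after it in the file, so the two pragmas
above also cover whatever follows aes_encrypt. The sketch below shows one
way the same options could be limited to selected functions, assuming GCC's
push_options/pop_options pragmas. The names toy_table, toy_round, toy_other
and TOY_SCOPED_OPTIONS are hypothetical, and the table is a stand-in rather
than the kernel's AES tables.

```c
#include <stdint.h>

/* Hypothetical lookup table; a stand-in, not the kernel's AES tables. */
static const uint32_t toy_table[4] = {
	0x63636363u, 0x7c7c7c7cu, 0x77777777u, 0x7b7b7b7bu
};

/* Only real GCC 7 or newer; other compilers keep their default options. */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 7
#pragma GCC push_options
#pragma GCC optimize("-fno-tree-pre")
#pragma GCC optimize("-fno-tree-sra")
#define TOY_SCOPED_OPTIONS 1
#endif

/* Compiled without tree-pre and tree-sra when the guard above is active. */
uint32_t toy_round(uint32_t state, uint32_t key)
{
	uint32_t mixed = toy_table[state & 3] ^ toy_table[(state >> 8) & 3];

	return mixed ^ key;
}

#ifdef TOY_SCOPED_OPTIONS
/* Restore the options that were in effect before push_options. */
#pragma GCC pop_options
#endif

/* Compiled with the command-line optimization options again. */
uint32_t toy_other(uint32_t x)
{
	return x + 1u;
}
```

Whether scoping like this would be preferable for aes_generic.c is an open
question; the results are identical either way, and only the generated code
for the bracketed functions should differ.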