From patchwork Mon Jul 31 20:43:55 2017
X-Patchwork-Submitter: Arnd Bergmann
X-Patchwork-Id: 109043
From: Arnd Bergmann
To: Herbert Xu, "David S. Miller"
Miller" Cc: Arnd Bergmann , linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] crypto: serpent: improve __serpent_setkey with UBSAN Date: Mon, 31 Jul 2017 22:43:55 +0200 Message-Id: <20170731204407.1396336-1-arnd@arndb.de> X-Mailer: git-send-email 2.9.0 X-Provags-ID: V03:K0:DLbu7Ovj+O3hCVzxG7ifT2IflwiRCZYCUQHEWfBuu8hs1HBW+oU hufbKwA8VYaanMAkpDIqV5Y/vwbgBBH+2fs96X6kb7H7VOpOSqC4XasnM8SUviVL/R4ezXS Uw49Asp56SbEAc5VlyWHeZ/qFmnObBduu+szt0Xlyh/HF2AL2scgTSsStDi1zjTycBlKnEY mOC1rzFHiCCgQTW6S+/CA== X-UI-Out-Filterresults: notjunk:1; V01:K0:ydYBPgqwizg=:OwYzf/b2pfOVnP0o/xhTmN naMm8Zx6K3OXmwspcA3NSgKnKQBq/JBZ/VRDYx4nsi4rBm7fvPdlTz7BtoVwfAEQvPBxiOjiB Vqjfob4uB0vzIfboGwfsz1Fp7u/AfL8hslztEaXCLji0v/28ft/wxboTU7fUe5S8NqIegfhW9 LU1BNAmlYUa6Ch6/IYpTajnLSGbPb03bFMVlPCKNzpwEG3rpLD5EnilOrWsxqVdbvgVD28MNx C6FSnKbTB9ToQKe1r/daDLsgYCQkOElyLnDcy7gavT9Hre+h5V1znu7BSo3Zi4FBAF6leIZWD 7MKeCBnuG3vF6irI5iK1ue3/CIrbpKa0LI3VLXQi5kmX+TWfLU/teSbdMdsZXTkG4B5Vn5FgZ IEHunk67IWCQOFfnOxv+WAMy02YrA5HZf+pvzj+rpXuHofoiSSoj4a/Yc+zzrJnkMHHfIDlDY DE3kFS6Jovrt2hNOHVt/nLplO+ItxCFgaPQwfwmwroAA1v0++/kEyl1Kh1js0oZPbNcHwVpeh UDk/+1dMXbHk/OOBVoIHDy3XlGP9SLyfFPQkZjiLYcL8rDaPSHiwyP9WUkZNT3RCa5VB8ADXI WxLK5KUuYaG1EyJedqPga6doQr/SkSjYSyAWCdCYoNub+uEqH0C+RFG8vOK1wm/3clyE6Tg3j cOJ/tNO2xYOBb6ikVML5Bs/VW1eDnrHtcpc+8RsBoRO5RShBh+85gqqEaAM3Rehpvi2AziVTu Ul6YfKelijEaPhNYPbAPClH6bBBqLhcR1HV02Q== Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org When UBSAN is enabled, we get a very large stack frame for __serpent_setkey, when the register allocator ends up using more registers than it has, and has to spill temporary values to the stack. The code was originally optimized for in-order x86-32 CPU implementations using older compilers, but it now runs into a highly suboptimal case on all CPU architectures, as seen by this warning: crypto/serpent_generic.c: In function '__serpent_setkey': crypto/serpent_generic.c:436:1: error: the frame size of 2720 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] Disabling -fsanitize=alignment would avoid that warning, presumably the option turns off a optimization step that is required for getting the register allocation right, but there is no easy way to do that on gcc-7 (gcc-8 introduces a function attribute for this). I tried to figure out a way to modify the source code instead, and noticed that the two stages of the setkey() function (keyiter and sbox) each are fine by themselves, but not when combined into one function. Splitting out the entire sbox into a separate function also happens to work fine with all compilers I tried (arm, arm64 and x86). The setkey function uses a strange way to handle offsets into the key array, using both negative and positive index values, as well as adjusting the array pointer back and forth. I have checked that this actually makes no difference to modern compilers, but I left that untouched to make the patch easier to review and to keep the code closer to the reference implementation. 
Link: https://patchwork.kernel.org/patch/9189575/
Signed-off-by: Arnd Bergmann
---
 crypto/serpent_generic.c | 77 ++++++++++++++++++++++++++----------------------
 1 file changed, 41 insertions(+), 36 deletions(-)

-- 
2.9.0

diff --git a/crypto/serpent_generic.c b/crypto/serpent_generic.c
index 94970a794975..7c3382facc82 100644
--- a/crypto/serpent_generic.c
+++ b/crypto/serpent_generic.c
@@ -229,6 +229,46 @@
 	x4 ^= x2;					\
 	})
 
+static void __serpent_setkey_sbox(u32 r0, u32 r1, u32 r2, u32 r3, u32 r4, u32 *k)
+{
+	k += 100;
+	S3(r3, r4, r0, r1, r2); store_and_load_keys(r1, r2, r4, r3, 28, 24);
+	S4(r1, r2, r4, r3, r0); store_and_load_keys(r2, r4, r3, r0, 24, 20);
+	S5(r2, r4, r3, r0, r1); store_and_load_keys(r1, r2, r4, r0, 20, 16);
+	S6(r1, r2, r4, r0, r3); store_and_load_keys(r4, r3, r2, r0, 16, 12);
+	S7(r4, r3, r2, r0, r1); store_and_load_keys(r1, r2, r0, r4, 12, 8);
+	S0(r1, r2, r0, r4, r3); store_and_load_keys(r0, r2, r4, r1, 8, 4);
+	S1(r0, r2, r4, r1, r3); store_and_load_keys(r3, r4, r1, r0, 4, 0);
+	S2(r3, r4, r1, r0, r2); store_and_load_keys(r2, r4, r3, r0, 0, -4);
+	S3(r2, r4, r3, r0, r1); store_and_load_keys(r0, r1, r4, r2, -4, -8);
+	S4(r0, r1, r4, r2, r3); store_and_load_keys(r1, r4, r2, r3, -8, -12);
+	S5(r1, r4, r2, r3, r0); store_and_load_keys(r0, r1, r4, r3, -12, -16);
+	S6(r0, r1, r4, r3, r2); store_and_load_keys(r4, r2, r1, r3, -16, -20);
+	S7(r4, r2, r1, r3, r0); store_and_load_keys(r0, r1, r3, r4, -20, -24);
+	S0(r0, r1, r3, r4, r2); store_and_load_keys(r3, r1, r4, r0, -24, -28);
+	k -= 50;
+	S1(r3, r1, r4, r0, r2); store_and_load_keys(r2, r4, r0, r3, 22, 18);
+	S2(r2, r4, r0, r3, r1); store_and_load_keys(r1, r4, r2, r3, 18, 14);
+	S3(r1, r4, r2, r3, r0); store_and_load_keys(r3, r0, r4, r1, 14, 10);
+	S4(r3, r0, r4, r1, r2); store_and_load_keys(r0, r4, r1, r2, 10, 6);
+	S5(r0, r4, r1, r2, r3); store_and_load_keys(r3, r0, r4, r2, 6, 2);
+	S6(r3, r0, r4, r2, r1); store_and_load_keys(r4, r1, r0, r2, 2, -2);
+	S7(r4, r1, r0, r2, r3); store_and_load_keys(r3, r0, r2, r4, -2, -6);
+	S0(r3, r0, r2, r4, r1); store_and_load_keys(r2, r0, r4, r3, -6, -10);
+	S1(r2, r0, r4, r3, r1); store_and_load_keys(r1, r4, r3, r2, -10, -14);
+	S2(r1, r4, r3, r2, r0); store_and_load_keys(r0, r4, r1, r2, -14, -18);
+	S3(r0, r4, r1, r2, r3); store_and_load_keys(r2, r3, r4, r0, -18, -22);
+	k -= 50;
+	S4(r2, r3, r4, r0, r1); store_and_load_keys(r3, r4, r0, r1, 28, 24);
+	S5(r3, r4, r0, r1, r2); store_and_load_keys(r2, r3, r4, r1, 24, 20);
+	S6(r2, r3, r4, r1, r0); store_and_load_keys(r4, r0, r3, r1, 20, 16);
+	S7(r4, r0, r3, r1, r2); store_and_load_keys(r2, r3, r1, r4, 16, 12);
+	S0(r2, r3, r1, r4, r0); store_and_load_keys(r1, r3, r4, r2, 12, 8);
+	S1(r1, r3, r4, r2, r0); store_and_load_keys(r0, r4, r2, r1, 8, 4);
+	S2(r0, r4, r2, r1, r3); store_and_load_keys(r3, r4, r0, r1, 4, 0);
+	S3(r3, r4, r0, r1, r2); storekeys(r1, r2, r4, r3, 0);
+}
+
 int __serpent_setkey(struct serpent_ctx *ctx, const u8 *key,
 		     unsigned int keylen)
 {
@@ -395,42 +435,7 @@ int __serpent_setkey(struct serpent_ctx *ctx, const u8 *key,
 	keyiter(k[23], r1, r0, r3, 131, 31);
 
 	/* Apply S-boxes */
-
-	S3(r3, r4, r0, r1, r2); store_and_load_keys(r1, r2, r4, r3, 28, 24);
-	S4(r1, r2, r4, r3, r0); store_and_load_keys(r2, r4, r3, r0, 24, 20);
-	S5(r2, r4, r3, r0, r1); store_and_load_keys(r1, r2, r4, r0, 20, 16);
-	S6(r1, r2, r4, r0, r3); store_and_load_keys(r4, r3, r2, r0, 16, 12);
-	S7(r4, r3, r2, r0, r1); store_and_load_keys(r1, r2, r0, r4, 12, 8);
-	S0(r1, r2, r0, r4, r3); store_and_load_keys(r0, r2, r4, r1, 8, 4);
-	S1(r0, r2, r4, r1, r3); store_and_load_keys(r3, r4, r1, r0, 4, 0);
-	S2(r3, r4, r1, r0, r2); store_and_load_keys(r2, r4, r3, r0, 0, -4);
-	S3(r2, r4, r3, r0, r1); store_and_load_keys(r0, r1, r4, r2, -4, -8);
-	S4(r0, r1, r4, r2, r3); store_and_load_keys(r1, r4, r2, r3, -8, -12);
-	S5(r1, r4, r2, r3, r0); store_and_load_keys(r0, r1, r4, r3, -12, -16);
-	S6(r0, r1, r4, r3, r2); store_and_load_keys(r4, r2, r1, r3, -16, -20);
-	S7(r4, r2, r1, r3, r0); store_and_load_keys(r0, r1, r3, r4, -20, -24);
-	S0(r0, r1, r3, r4, r2); store_and_load_keys(r3, r1, r4, r0, -24, -28);
-	k -= 50;
-	S1(r3, r1, r4, r0, r2); store_and_load_keys(r2, r4, r0, r3, 22, 18);
-	S2(r2, r4, r0, r3, r1); store_and_load_keys(r1, r4, r2, r3, 18, 14);
-	S3(r1, r4, r2, r3, r0); store_and_load_keys(r3, r0, r4, r1, 14, 10);
-	S4(r3, r0, r4, r1, r2); store_and_load_keys(r0, r4, r1, r2, 10, 6);
-	S5(r0, r4, r1, r2, r3); store_and_load_keys(r3, r0, r4, r2, 6, 2);
-	S6(r3, r0, r4, r2, r1); store_and_load_keys(r4, r1, r0, r2, 2, -2);
-	S7(r4, r1, r0, r2, r3); store_and_load_keys(r3, r0, r2, r4, -2, -6);
-	S0(r3, r0, r2, r4, r1); store_and_load_keys(r2, r0, r4, r3, -6, -10);
-	S1(r2, r0, r4, r3, r1); store_and_load_keys(r1, r4, r3, r2, -10, -14);
-	S2(r1, r4, r3, r2, r0); store_and_load_keys(r0, r4, r1, r2, -14, -18);
-	S3(r0, r4, r1, r2, r3); store_and_load_keys(r2, r3, r4, r0, -18, -22);
-	k -= 50;
-	S4(r2, r3, r4, r0, r1); store_and_load_keys(r3, r4, r0, r1, 28, 24);
-	S5(r3, r4, r0, r1, r2); store_and_load_keys(r2, r3, r4, r1, 24, 20);
-	S6(r2, r3, r4, r1, r0); store_and_load_keys(r4, r0, r3, r1, 20, 16);
-	S7(r4, r0, r3, r1, r2); store_and_load_keys(r2, r3, r1, r4, 16, 12);
-	S0(r2, r3, r1, r4, r0); store_and_load_keys(r1, r3, r4, r2, 12, 8);
-	S1(r1, r3, r4, r2, r0); store_and_load_keys(r0, r4, r2, r1, 8, 4);
-	S2(r0, r4, r2, r1, r3); store_and_load_keys(r3, r4, r0, r1, 4, 0);
-	S3(r3, r4, r0, r1, r2); storekeys(r1, r2, r4, r3, 0);
+	__serpent_setkey_sbox(r0, r1, r2, r3, r4, ctx->expkey);
 
 	return 0;
 }
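For readers puzzled by the mixed positive and negative indices in
__serpent_setkey_sbox: this is plain C pointer arithmetic, carried over
unchanged from the original code. A minimal standalone sketch of the
equivalence (hypothetical buffer size and values, not part of the
patch):

	#include <assert.h>

	int main(void)
	{
		/* hypothetical stand-in for the expanded-key array */
		unsigned int expkey[132] = { 0 };
		unsigned int *k = expkey;

		k += 100;
		k[28] = 1;	/* same slot as expkey[128] */
		k[-4] = 2;	/* same slot as expkey[96]  */

		k -= 50;	/* now k == expkey + 50 */
		k[22] = 3;	/* same slot as expkey[72] */

		assert(expkey[128] == 1 && expkey[96] == 2 && expkey[72] == 3);
		return 0;
	}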