From patchwork Tue Jul 18 12:06:42 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 108126
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au,
 nico@linaro.org, ebiggers@google.com
Cc: Ard Biesheuvel
Subject: [PATCH v4 5/8] crypto: arm/aes - avoid expanded lookup tables in the
 final round
Date: Tue, 18 Jul 2017 13:06:42 +0100
Message-Id: <20170718120645.15880-6-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170718120645.15880-1-ard.biesheuvel@linaro.org>
References: <20170718120645.15880-1-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-crypto@vger.kernel.org

For the final round, avoid the expanded and padded lookup tables
exported by the generic AES driver. Instead, for encryption, we can
perform byte loads from the same table we used for the inner rounds,
which will still be hot in the caches. For decryption, use the inverse
AES Sbox exported by the generic AES driver, which is 4x smaller than
the inverse lookup table that driver exports.

This significantly reduces the Dcache footprint of our code, and does
not introduce any additional module dependencies, given that we already
rely on the core AES module for the shared key expansion routines.

Signed-off-by: Ard Biesheuvel
---
 arch/arm/crypto/aes-cipher-core.S | 51 ++++++++++----------
 1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/arch/arm/crypto/aes-cipher-core.S b/arch/arm/crypto/aes-cipher-core.S
index a727692cd9c1..5e9ddc576ec1 100644
--- a/arch/arm/crypto/aes-cipher-core.S
+++ b/arch/arm/crypto/aes-cipher-core.S
@@ -33,19 +33,19 @@
 	.endif
 	.endm
 
-	.macro		__load, out, in, idx
+	.macro		__load, out, in, idx, sz, op
 	.if		__LINUX_ARM_ARCH__ < 7 && \idx > 0
-	ldr		\out, [ttab, \in, lsr #(8 * \idx) - 2]
+	ldr\op		\out, [ttab, \in, lsr #(8 * \idx) - \sz]
 	.else
-	ldr		\out, [ttab, \in, lsl #2]
+	ldr\op		\out, [ttab, \in, lsl #\sz]
 	.endif
 	.endm
 
-	.macro		__hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
+	.macro		__hround, out0, out1, in0, in1, in2, in3, t3, t4, enc, sz, op
 	__select	\out0, \in0, 0
 	__select	t0, \in1, 1
-	__load		\out0, \out0, 0
-	__load		t0, t0, 1
+	__load		\out0, \out0, 0, \sz, \op
+	__load		t0, t0, 1, \sz, \op
 
 	.if		\enc
 	__select	\out1, \in1, 0
@@ -54,10 +54,10 @@
 	__select	\out1, \in3, 0
 	__select	t1, \in0, 1
 	.endif
-	__load		\out1, \out1, 0
+	__load		\out1, \out1, 0, \sz, \op
 	__select	t2, \in2, 2
-	__load		t1, t1, 1
-	__load		t2, t2, 2
+	__load		t1, t1, 1, \sz, \op
+	__load		t2, t2, 2, \sz, \op
 
 	eor		\out0, \out0, t0, ror #24
 
@@ -69,9 +69,9 @@
 	__select	\t3, \in1, 2
 	__select	\t4, \in2, 3
 	.endif
-	__load		\t3, \t3, 2
-	__load		t0, t0, 3
-	__load		\t4, \t4, 3
+	__load		\t3, \t3, 2, \sz, \op
+	__load		t0, t0, 3, \sz, \op
+	__load		\t4, \t4, 3, \sz, \op
 
 	eor		\out1, \out1, t1, ror #24
 	eor		\out0, \out0, t2, ror #16
@@ -83,14 +83,14 @@
 	eor		\out1, \out1, t2
 	.endm
 
-	.macro		fround, out0, out1, out2, out3, in0, in1, in2, in3
-	__hround	\out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
-	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
+	.macro		fround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+	__hround	\out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1, \sz, \op
+	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1, \sz, \op
 	.endm
 
-	.macro		iround, out0, out1, out2, out3, in0, in1, in2, in3
-	__hround	\out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
-	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
+	.macro		iround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+	__hround	\out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0, \sz, \op
+	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0, \sz, \op
 	.endm
 
 	.macro		__rev, out, in
@@ -115,7 +115,7 @@
 	.endif
 	.endm
 
-	.macro		do_crypt, round, ttab, ltab
+	.macro		do_crypt, round, ttab, ltab, bsz
 	push		{r3-r11, lr}
 
 	ldr		r4, [in]
@@ -147,9 +147,12 @@
 
 1:	subs		rounds, rounds, #4
 	\round		r8, r9, r10, r11, r4, r5, r6, r7
-	__adrl		ttab, \ltab, ls
+	bls		2f
 	\round		r4, r5, r6, r7, r8, r9, r10, r11
-	bhi		0b
+	b		0b
+
+2:	__adrl		ttab, \ltab
+	\round		r4, r5, r6, r7, r8, r9, r10, r11, \bsz, b
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
 	__rev		r4, r4
@@ -173,14 +176,12 @@
 
 	.align		6
 	aes_table_reduced	crypto_ft_tab
-	aes_table_reduced	crypto_fl_tab
 	aes_table_reduced	crypto_it_tab
-	aes_table_reduced	crypto_il_tab
 
 ENTRY(__aes_arm_encrypt)
-	do_crypt	fround, crypto_ft_tab, crypto_fl_tab
+	do_crypt	fround, crypto_ft_tab, crypto_ft_tab + 1, 2
 ENDPROC(__aes_arm_encrypt)
 
 ENTRY(__aes_arm_decrypt)
-	do_crypt	iround, crypto_it_tab, crypto_il_tab
+	do_crypt	iround, crypto_it_tab, crypto_aes_inv_sbox, 0
 ENDPROC(__aes_arm_decrypt)
-- 
2.9.3
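
Note on the final-round addressing: the encryption path passes
"crypto_ft_tab + 1" with a shift of 2 and a byte load, while the
decryption path passes crypto_aes_inv_sbox with a shift of 0. The sketch
below is a rough Python model of why a byte load at offset 1 with a
4-byte stride can stand in for a plain S-box lookup. It is not kernel
code. The helper names (gf_mul, aes_sbox, forward_table_bytes,
byte_load) are hypothetical, and the table layout is an assumption
modeled on the first forward table of the generic AES driver, where
entry 0 is taken to be 0xa56363c6, stored little-endian.

```python
import struct


def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def aes_sbox():
    """Build the AES S-box from its definition: inverse, then affine map."""
    sbox = [0] * 256
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse; 0 maps to 0
            inv = gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):     # XOR of the four left rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox[x] = s ^ 0x63
    return sbox


def forward_table_bytes(sbox):
    """Assumed layout: word = 3S<<24 | S<<16 | S<<8 | 2S, little-endian."""
    words = []
    for s in sbox:
        words.append((gf_mul(s, 3) << 24) | (s << 16) | (s << 8) | gf_mul(s, 2))
    return struct.pack("<256I", *words)


def byte_load(table, base_offset, index, sz):
    """Model of a byte load from [base + (index << sz)]."""
    return table[base_offset + (index << sz)]


sbox = aes_sbox()
ft_bytes = forward_table_bytes(sbox)
inv_sbox = bytes(sbox.index(y) for y in range(256))

# Encryption final round: base offset 1, shift 2, byte load.
enc_ok = all(byte_load(ft_bytes, 1, x, 2) == sbox[x] for x in range(256))
# Decryption final round: plain 256-byte inverse S-box, shift 0.
dec_ok = all(byte_load(inv_sbox, 0, sbox[x], 0) == x for x in range(256))
print(enc_ok, dec_ok)
```

Under the assumed layout, bytes 1 and 2 of every word both hold the
unscaled S-box value, so the offset-1 byte load returns the same value
whether the word is stored little-endian or big-endian.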