From patchwork Fri Jan 6 11:53:12 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 90151
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: linux@armlinux.org.uk, Ard Biesheuvel, herbert@gondor.apana.org.au, nico@linaro.org
Subject: [RFT PATCH] crypto: arm/aes - replace scalar AES cipher
Date: Fri, 6 Jan 2017 11:53:12 +0000
Message-Id: <1483703592-7745-1-git-send-email-ard.biesheuvel@linaro.org>

This replaces the scalar AES cipher that originates in the OpenSSL project
with a new implementation that is ~15% (*) faster (on modern cores), and
reuses the lookup tables and the key schedule generation routines from the
generic C implementation (which is usually compiled in anyway due to
networking and other subsystems depending on it).

Note that the bit sliced NEON code for AES still depends on the scalar
cipher that this patch replaces, so it is not removed entirely yet.

* On Cortex-A57, the performance improves from 17.0 to 14.9 cycles per byte
  for 128-bit keys.

Signed-off-by: Ard Biesheuvel
---
It makes sense to test this on a variety of cores before deciding whether
to merge it or not. Test results welcome. (insmod tcrypt.ko mode=200 sec=1)

 arch/arm/crypto/Kconfig           |  20 +--
 arch/arm/crypto/Makefile          |   4 +-
 arch/arm/crypto/aes-cipher-core.S | 169 ++++++++++++++++++++
 arch/arm/crypto/aes-cipher-glue.c |  69 ++++++++
 arch/arm/crypto/aes_glue.c        |  98 ------------
 5 files changed, 241 insertions(+), 119 deletions(-)

-- 
2.7.4

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 2f3339f015d3..f1de658c3c8f 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -62,33 +62,15 @@ config CRYPTO_SHA512_ARM
 	  using optimized ARM assembler and NEON, when available.
 
 config CRYPTO_AES_ARM
-	tristate "AES cipher algorithms (ARM-asm)"
-	depends on ARM
+	tristate "Scalar AES cipher for ARM"
 	select CRYPTO_ALGAPI
 	select CRYPTO_AES
 	help
 	  Use optimized AES assembler routines for ARM platforms.
 
-	  AES cipher algorithms (FIPS-197). AES uses the Rijndael
-	  algorithm.
-
-	  Rijndael appears to be consistently a very good performer in
-	  both hardware and software across a wide range of computing
-	  environments regardless of its use in feedback or non-feedback
-	  modes. Its key setup time is excellent, and its key agility is
-	  good. Rijndael's very low memory requirements make it very well
-	  suited for restricted-space environments, in which it also
-	  demonstrates excellent performance. Rijndael's operations are
-	  among the easiest to defend against power and timing attacks.
-
-	  The AES specifies three key sizes: 128, 192 and 256 bits
-
-	  See for more information.
-
 config CRYPTO_AES_ARM_BS
 	tristate "Bit sliced AES using NEON instructions"
 	depends on KERNEL_MODE_NEON
-	select CRYPTO_AES_ARM
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_SIMD
 	help
diff --git a/arch/arm/crypto/Makefile b/arch/arm/crypto/Makefile
index 8d74e55eacd4..8f5de2db701c 100644
--- a/arch/arm/crypto/Makefile
+++ b/arch/arm/crypto/Makefile
@@ -27,8 +27,8 @@ $(warning $(ce-obj-y) $(ce-obj-m))
 endif
 endif
 
-aes-arm-y	:= aes-armv4.o aes_glue.o
-aes-arm-bs-y	:= aesbs-core.o aesbs-glue.o
+aes-arm-y	:= aes-cipher-core.o aes-cipher-glue.o
+aes-arm-bs-y	:= aes-armv4.o aesbs-core.o aesbs-glue.o
 sha1-arm-y	:= sha1-armv4-large.o sha1_glue.o
 sha1-arm-neon-y	:= sha1-armv7-neon.o sha1_neon_glue.o
 sha256-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha256_neon_glue.o
diff --git a/arch/arm/crypto/aes-cipher-core.S b/arch/arm/crypto/aes-cipher-core.S
new file mode 100644
index 000000000000..8d4a15364d43
--- /dev/null
+++ b/arch/arm/crypto/aes-cipher-core.S
@@ -0,0 +1,169 @@
+/*
+ * Scalar AES core transform
+ *
+ * Copyright (C) 2017 Linaro Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include
+
+	.text
+	.align		5
+
+	rk		.req	r0
+	rounds		.req	r1
+	in		.req	r2
+	out		.req	r3
+	tt		.req	ip
+
+	t0		.req	lr
+	t1		.req	r2
+	t2		.req	r3
+
+	.macro		__select, out, in, idx
+	.if		__LINUX_ARM_ARCH__ < 7
+	and		\out, \in, #0xff << (8 * \idx)
+	.else
+	ubfx		\out, \in, #(8 * \idx), #8
+	.endif
+	.endm
+
+	.macro		__load, out, in, idx
+	.if		__LINUX_ARM_ARCH__ < 7 && \idx > 0
+	ldr		\out, [tt, \in, lsr #(8 * \idx) - 2]
+	.else
+	ldr		\out, [tt, \in, lsl #2]
+	.endif
+	.endm
+
+	.macro		__hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
+	__select	\out0, \in0, 0
+	__select	t0, \in1, 1
+	__load		\out0, \out0, 0
+	__load		t0, t0, 1
+
+	.if		\enc
+	__select	\out1, \in1, 0
+	__select	t1, \in2, 1
+	.else
+	__select	\out1, \in3, 0
+	__select	t1, \in0, 1
+	.endif
+	__load		\out1, \out1, 0
+	__select	t2, \in2, 2
+	__load		t1, t1, 1
+	__load		t2, t2, 2
+
+	eor		\out0, \out0, t0, ror #24
+
+	__select	t0, \in3, 3
+	.if		\enc
+	__select	\t3, \in3, 2
+	__select	\t4, \in0, 3
+	.else
+	__select	\t3, \in1, 2
+	__select	\t4, \in2, 3
+	.endif
+	__load		\t3, \t3, 2
+	__load		t0, t0, 3
+	__load		\t4, \t4, 3
+
+	eor		\out1, \out1, t1, ror #24
+	eor		\out0, \out0, t2, ror #16
+	ldm		rk!, {t1, t2}
+	eor		\out1, \out1, \t3, ror #16
+	eor		\out0, \out0, t0, ror #8
+	eor		\out1, \out1, \t4, ror #8
+	eor		\out0, \out0, t1
+	eor		\out1, \out1, t2
+	.endm
+
+	.macro		fround, out0, out1, out2, out3, in0, in1, in2, in3
+	__hround	\out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
+	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
+	.endm
+
+	.macro		iround, out0, out1, out2, out3, in0, in1, in2, in3
+	__hround	\out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
+	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
+	.endm
+
+	.macro		__rev, out, in
+	.if		__LINUX_ARM_ARCH__ < 7
+	lsl		t0, \in, #24
+	and		t1, \in, #0xff00
+	and		t2, \in, #0xff0000
+	orr		\out, t0, \in, lsr #24
+	orr		\out, \out, t1, lsl #8
+	orr		\out, \out, t2, lsr #8
+	.else
+	rev		\out, \in
+	.endif
+	.endm
+
+	.macro		do_crypt, round, ttab, ltab
+	push		{r3-r11, lr}
+
+	ldr		r4, [in]
+	ldr		r5, [in, #4]
+	ldr		r6, [in, #8]
+	ldr		r7, [in, #12]
+
+	ldm		rk!, {r8, r9, r10, r11}
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+	__rev		r4, r4
+	__rev		r5, r5
+	__rev		r6, r6
+	__rev		r7, r7
+#endif
+
+	eor		r4, r4, r8
+	eor		r5, r5, r9
+	eor		r6, r6, r10
+	eor		r7, r7, r11
+
+	ldr		tt, =\ttab
+
+	tst		rounds, #2
+	bne		1f
+
+0:	\round		r8, r9, r10, r11, r4, r5, r6, r7
+	\round		r4, r5, r6, r7, r8, r9, r10, r11
+
+1:	subs		rounds, rounds, #4
+	\round		r8, r9, r10, r11, r4, r5, r6, r7
+	ldrls		tt, =\ltab
+	\round		r4, r5, r6, r7, r8, r9, r10, r11
+	bhi		0b
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+	__rev		r4, r4
+	__rev		r5, r5
+	__rev		r6, r6
+	__rev		r7, r7
+#endif
+
+	ldr		out, [sp]
+
+	str		r4, [out]
+	str		r5, [out, #4]
+	str		r6, [out, #8]
+	str		r7, [out, #12]
+
+	pop		{r3-r11, pc}
+
+	.align		3
+	.ltorg
+	.endm
+
+ENTRY(__aes_arm_encrypt)
+	do_crypt	fround, crypto_ft_tab, crypto_fl_tab
+ENDPROC(__aes_arm_encrypt)
+
+ENTRY(__aes_arm_decrypt)
+	do_crypt	iround, crypto_it_tab, crypto_il_tab
+ENDPROC(__aes_arm_decrypt)
diff --git a/arch/arm/crypto/aes-cipher-glue.c b/arch/arm/crypto/aes-cipher-glue.c
new file mode 100644
index 000000000000..19545237112a
--- /dev/null
+++ b/arch/arm/crypto/aes-cipher-glue.c
@@ -0,0 +1,69 @@
+/*
+ * Scalar AES core transform
+ *
+ * Copyright (C) 2017 Linaro Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include
+#include
+#include
+
+asmlinkage void __aes_arm_encrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
+EXPORT_SYMBOL(__aes_arm_encrypt);
+
+asmlinkage void __aes_arm_decrypt(u32 *rk, int rounds, const u8 *in, u8 *out);
+EXPORT_SYMBOL(__aes_arm_decrypt);
+
+static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
+{
+	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+	int rounds = 6 + ctx->key_length / 4;
+
+	__aes_arm_encrypt(ctx->key_enc, rounds, in, out);
+}
+
+static void aes_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
+{
+	struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
+	int rounds = 6 + ctx->key_length / 4;
+
+	__aes_arm_decrypt(ctx->key_dec, rounds, in, out);
+}
+
+static struct crypto_alg aes_alg = {
+	.cra_name			= "aes",
+	.cra_driver_name		= "aes-arm",
+	.cra_priority			= 200,
+	.cra_flags			= CRYPTO_ALG_TYPE_CIPHER,
+	.cra_blocksize			= AES_BLOCK_SIZE,
+	.cra_ctxsize			= sizeof(struct crypto_aes_ctx),
+	.cra_module			= THIS_MODULE,
+
+	.cra_cipher.cia_min_keysize	= AES_MIN_KEY_SIZE,
+	.cra_cipher.cia_max_keysize	= AES_MAX_KEY_SIZE,
+	.cra_cipher.cia_setkey		= crypto_aes_set_key,
+	.cra_cipher.cia_encrypt		= aes_encrypt,
+	.cra_cipher.cia_decrypt		= aes_decrypt
+};
+
+static int __init aes_init(void)
+{
+	return crypto_register_alg(&aes_alg);
+}
+
+static void __exit aes_fini(void)
+{
+	crypto_unregister_alg(&aes_alg);
+}
+
+module_init(aes_init);
+module_exit(aes_fini);
+
+MODULE_DESCRIPTION("Scalar AES cipher for ARM");
+MODULE_AUTHOR("Ard Biesheuvel ");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("aes");
diff --git a/arch/arm/crypto/aes_glue.c b/arch/arm/crypto/aes_glue.c
deleted file mode 100644
index 0409b8f89782..000000000000
--- a/arch/arm/crypto/aes_glue.c
+++ /dev/null
@@ -1,98 +0,0 @@
-/*
- * Glue Code for the asm optimized version of the AES Cipher Algorithm
- */
-
-#include
-#include
-#include
-
-#include "aes_glue.h"
-
-EXPORT_SYMBOL(AES_encrypt);
-EXPORT_SYMBOL(AES_decrypt);
-EXPORT_SYMBOL(private_AES_set_encrypt_key);
-EXPORT_SYMBOL(private_AES_set_decrypt_key);
-
-static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct AES_CTX *ctx = crypto_tfm_ctx(tfm);
-	AES_encrypt(src, dst, &ctx->enc_key);
-}
-
-static void aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
-{
-	struct AES_CTX *ctx = crypto_tfm_ctx(tfm);
-	AES_decrypt(src, dst, &ctx->dec_key);
-}
-
-static int aes_set_key(struct crypto_tfm *tfm, const u8 *in_key,
-		unsigned int key_len)
-{
-	struct AES_CTX *ctx = crypto_tfm_ctx(tfm);
-
-	switch (key_len) {
-	case AES_KEYSIZE_128:
-		key_len = 128;
-		break;
-	case AES_KEYSIZE_192:
-		key_len = 192;
-		break;
-	case AES_KEYSIZE_256:
-		key_len = 256;
-		break;
-	default:
-		tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
-		return -EINVAL;
-	}
-
-	if (private_AES_set_encrypt_key(in_key, key_len, &ctx->enc_key) == -1) {
-		tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
-		return -EINVAL;
-	}
-	/* private_AES_set_decrypt_key expects an encryption key as input */
-	ctx->dec_key = ctx->enc_key;
-	if (private_AES_set_decrypt_key(in_key, key_len, &ctx->dec_key) == -1) {
-		tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
-		return -EINVAL;
-	}
-	return 0;
-}
-
-static struct crypto_alg aes_alg = {
-	.cra_name		= "aes",
-	.cra_driver_name	= "aes-asm",
-	.cra_priority		= 200,
-	.cra_flags		= CRYPTO_ALG_TYPE_CIPHER,
-	.cra_blocksize		= AES_BLOCK_SIZE,
-	.cra_ctxsize		= sizeof(struct AES_CTX),
-	.cra_module		= THIS_MODULE,
-	.cra_list		= LIST_HEAD_INIT(aes_alg.cra_list),
-	.cra_u			= {
-		.cipher	= {
-			.cia_min_keysize	= AES_MIN_KEY_SIZE,
-			.cia_max_keysize	= AES_MAX_KEY_SIZE,
-			.cia_setkey		= aes_set_key,
-			.cia_encrypt		= aes_encrypt,
-			.cia_decrypt		= aes_decrypt
-		}
-	}
-};
-
-static int __init aes_init(void)
-{
-	return crypto_register_alg(&aes_alg);
-}
-
-static void __exit aes_fini(void)
-{
-	crypto_unregister_alg(&aes_alg);
-}
-
-module_init(aes_init);
-module_exit(aes_fini);
-
-MODULE_DESCRIPTION("Rijndael (AES) Cipher Algorithm (ASM)");
-MODULE_LICENSE("GPL");
-MODULE_ALIAS_CRYPTO("aes");
-MODULE_ALIAS_CRYPTO("aes-asm");
-MODULE_AUTHOR("David McCullough ");
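
Reader note, not part of the patch: for anyone testing this on a new core, the
three small pieces of arithmetic the patch relies on can be modelled in plain
C roughly as below. This is only a sketch under stated assumptions. The helper
names aes_rounds(), rev32() and select_byte() are hypothetical and do not
exist in the kernel; they mirror the "6 + ctx->key_length / 4" expression in
aes-cipher-glue.c, the pre-ARMv7 branch of the __rev macro, and the ARMv7
ubfx branch of the __select macro respectively.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of AES rounds for a key length given in
 * bytes, as computed in the glue code (6 + key_length / 4).
 */
static int aes_rounds(unsigned int key_length)
{
	return 6 + key_length / 4;
}

/*
 * Hypothetical model of the pre-ARMv7 __rev fallback: reverse the byte
 * order of a 32-bit word using the same shift/mask/or sequence.
 */
static uint32_t rev32(uint32_t in)
{
	uint32_t t0 = in << 24;		/* lsl t0, in, #24 */
	uint32_t t1 = in & 0xff00;	/* and t1, in, #0xff00 */
	uint32_t t2 = in & 0xff0000;	/* and t2, in, #0xff0000 */
	uint32_t out;

	out = t0 | (in >> 24);		/* orr out, t0, in, lsr #24 */
	out |= t1 << 8;			/* orr out, out, t1, lsl #8 */
	out |= t2 >> 8;			/* orr out, out, t2, lsr #8 */
	return out;
}

/*
 * Hypothetical model of the ARMv7 __select path (ubfx): extract byte
 * number idx (0 = least significant) from a 32-bit word.
 */
static uint32_t select_byte(uint32_t in, unsigned int idx)
{
	return (in >> (8 * idx)) & 0xff;
}
```

With these definitions, key lengths of 16, 24 and 32 bytes give 10, 12 and 14
rounds, which is what do_crypt expects in its rounds argument.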