From patchwork Mon Jul 24 10:28:03 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108548
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 01/18] crypto/algapi - use separate dst and src operands for __crypto_xor()
Date: Mon, 24 Jul 2017 11:28:03 +0100
Message-Id: <20170724102820.16534-2-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

In preparation for introducing crypto_xor_cpy(), which will use
separate operands for input and output, modify the __crypto_xor()
implementation, which it will share with the existing crypto_xor().
(__crypto_xor() provides the actual functionality whenever the inline
fast path does not apply.)
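
To illustrate the resulting calling convention, here is a sketch (for
this commit message only, not part of the patch; the buffers are
hypothetical):

        u8 iv[16], ks[16], out[16];     /* hypothetical example buffers */

        crypto_xor(iv, ks, 16);         /* in-place: iv ^= ks, unchanged for callers */
        __crypto_xor(iv, iv, ks, 16);   /* what the out-of-line path now becomes */
        __crypto_xor(out, iv, ks, 16);  /* separate operands: out = iv ^ ks */
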
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 crypto/algapi.c         | 25 ++++++++++++--------
 include/crypto/algapi.h |  4 ++--
 2 files changed, 17 insertions(+), 12 deletions(-)

--
2.9.3

diff --git a/crypto/algapi.c b/crypto/algapi.c
index e4cc7615a139..aa699ff6c876 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -975,13 +975,15 @@ void crypto_inc(u8 *a, unsigned int size)
 }
 EXPORT_SYMBOL_GPL(crypto_inc);
 
-void __crypto_xor(u8 *dst, const u8 *src, unsigned int len)
+void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
 {
         int relalign = 0;
 
         if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
                 int size = sizeof(unsigned long);
-                int d = ((unsigned long)dst ^ (unsigned long)src) & (size - 1);
+                int d = (((unsigned long)dst ^ (unsigned long)src1) |
+                         ((unsigned long)dst ^ (unsigned long)src2)) &
+                        (size - 1);
 
                 relalign = d ? 1 << __ffs(d) : size;
 
@@ -992,34 +994,37 @@ void __crypto_xor(u8 *dst, const u8 *src, unsigned int len)
                  * process the remainder of the input using optimal strides.
                  */
                 while (((unsigned long)dst & (relalign - 1)) && len > 0) {
-                        *dst++ ^= *src++;
+                        *dst++ = *src1++ ^ *src2++;
                         len--;
                 }
         }
 
         while (IS_ENABLED(CONFIG_64BIT) && len >= 8 && !(relalign & 7)) {
-                *(u64 *)dst ^= *(u64 *)src;
+                *(u64 *)dst = *(u64 *)src1 ^ *(u64 *)src2;
                 dst += 8;
-                src += 8;
+                src1 += 8;
+                src2 += 8;
                 len -= 8;
         }
 
         while (len >= 4 && !(relalign & 3)) {
-                *(u32 *)dst ^= *(u32 *)src;
+                *(u32 *)dst = *(u32 *)src1 ^ *(u32 *)src2;
                 dst += 4;
-                src += 4;
+                src1 += 4;
+                src2 += 4;
                 len -= 4;
         }
 
         while (len >= 2 && !(relalign & 1)) {
-                *(u16 *)dst ^= *(u16 *)src;
+                *(u16 *)dst = *(u16 *)src1 ^ *(u16 *)src2;
                 dst += 2;
-                src += 2;
+                src1 += 2;
+                src2 += 2;
                 len -= 2;
         }
 
         while (len--)
-                *dst++ ^= *src++;
+                *dst++ = *src1++ ^ *src2++;
 }
 EXPORT_SYMBOL_GPL(__crypto_xor);

diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
index 436c4c2683c7..fd547f946bf8 100644
--- a/include/crypto/algapi.h
+++ b/include/crypto/algapi.h
@@ -192,7 +192,7 @@ static inline unsigned int crypto_queue_len(struct crypto_queue *queue)
 }
 
 void crypto_inc(u8 *a, unsigned int size);
-void __crypto_xor(u8 *dst, const u8 *src, unsigned int size);
+void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int size);
 
 static inline void crypto_xor(u8 *dst, const u8 *src, unsigned int size)
 {
@@ -207,7 +207,7 @@ static inline void crypto_xor(u8 *dst, const u8 *src, unsigned int size)
                         size -= sizeof(unsigned long);
                 }
         } else {
-                __crypto_xor(dst, src, size);
+                __crypto_xor(dst, dst, src, size);
         }
 }

From patchwork Mon Jul 24 10:28:04 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108549
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 02/18] crypto/algapi - make crypto_xor() take separate dst and src arguments
Date: Mon, 24 Jul 2017 11:28:04 +0100
Message-Id: <20170724102820.16534-3-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

There are quite a number of occurrences in the kernel of the pattern

  if (dst != src)
          memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
  crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);

or

  crypto_xor(keystream, src, nbytes);
  memcpy(dst, keystream, nbytes);

where crypto_xor() is preceded or followed by a memcpy() invocation
that is only there because crypto_xor() uses its output parameter as
one of the inputs. To avoid having to add new instances of this
pattern in the arm64 code, which will be refactored to implement
non-SIMD fallbacks, add an alternative implementation called
crypto_xor_cpy(), taking separate input and output arguments. This
removes the need for the separate memcpy().
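
As a sketch of the effect on such call sites (hypothetical locals,
mirroring the second pattern above):

        /* before: the copy exists only because crypto_xor() is in-place */
        crypto_xor(keystream, src, nbytes);     /* keystream ^= src */
        memcpy(dst, keystream, nbytes);         /* dst = keystream */

        /* after: a single call, and keystream is left unmodified */
        crypto_xor_cpy(dst, keystream, src, nbytes);    /* dst = keystream ^ src */
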
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/crypto/aes-ce-glue.c       |  4 +---
 arch/arm/crypto/aes-neonbs-glue.c   |  5 ++---
 arch/arm64/crypto/aes-glue.c        |  4 +---
 arch/arm64/crypto/aes-neonbs-glue.c |  5 ++---
 arch/sparc/crypto/aes_glue.c        |  3 +--
 arch/x86/crypto/aesni-intel_glue.c  |  4 ++--
 arch/x86/crypto/blowfish_glue.c     |  3 +--
 arch/x86/crypto/cast5_avx_glue.c    |  3 +--
 arch/x86/crypto/des3_ede_glue.c     |  3 +--
 crypto/ctr.c                        |  3 +--
 crypto/pcbc.c                       | 12 ++++--------
 drivers/crypto/vmx/aes_ctr.c        |  3 +--
 drivers/md/dm-crypt.c               | 11 +++++------
 include/crypto/algapi.h             | 19 +++++++++++++++++++
 14 files changed, 42 insertions(+), 40 deletions(-)

--
2.9.3

diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index 0f966a8ca1ce..d0a9cec73707 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -285,9 +285,7 @@ static int ctr_encrypt(struct skcipher_request *req)
                 ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc,
                                    num_rounds(ctx), blocks, walk.iv);
 
-                if (tdst != tsrc)
-                        memcpy(tdst, tsrc, nbytes);
-                crypto_xor(tdst, tail, nbytes);
+                crypto_xor_cpy(tdst, tsrc, tail, nbytes);
                 err = skcipher_walk_done(&walk, 0);
         }
         kernel_neon_end();

diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c
index c76377961444..18768f330449 100644
--- a/arch/arm/crypto/aes-neonbs-glue.c
+++ b/arch/arm/crypto/aes-neonbs-glue.c
@@ -221,9 +221,8 @@ static int ctr_encrypt(struct skcipher_request *req)
                 u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
                 u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
 
-                if (dst != src)
-                        memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
-                crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);
+                crypto_xor_cpy(dst, src, final,
+                               walk.total % AES_BLOCK_SIZE);
 
                 err = skcipher_walk_done(&walk, 0);
                 break;

diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index bcf596b0197e..0da30e3b0e4b 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -241,9 +241,7 @@ static int ctr_encrypt(struct skcipher_request *req)
                 aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, rounds,
                                 blocks, walk.iv, first);
 
-                if (tdst != tsrc)
-                        memcpy(tdst, tsrc, nbytes);
-                crypto_xor(tdst, tail, nbytes);
+                crypto_xor_cpy(tdst, tsrc, tail, nbytes);
                 err = skcipher_walk_done(&walk, 0);
         }
         kernel_neon_end();

diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index db2501d93550..9001aec16007 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -224,9 +224,8 @@ static int ctr_encrypt(struct skcipher_request *req)
                 u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
                 u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
 
-                if (dst != src)
-                        memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
-                crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);
+                crypto_xor_cpy(dst, src, final,
+                               walk.total % AES_BLOCK_SIZE);
 
                 err = skcipher_walk_done(&walk, 0);
                 break;

diff --git a/arch/sparc/crypto/aes_glue.c b/arch/sparc/crypto/aes_glue.c
index c90930de76ba..3cd4f6b198b6 100644
--- a/arch/sparc/crypto/aes_glue.c
+++ b/arch/sparc/crypto/aes_glue.c
@@ -344,8 +344,7 @@ static void ctr_crypt_final(struct crypto_sparc64_aes_ctx *ctx,
 
         ctx->ops->ecb_encrypt(&ctx->key[0], (const u64 *)ctrblk,
                               keystream, AES_BLOCK_SIZE);
-        crypto_xor((u8 *) keystream, src, nbytes);
-        memcpy(dst, keystream, nbytes);
+        crypto_xor_cpy(dst, (u8 *) keystream, src, nbytes);
         crypto_inc(ctrblk, AES_BLOCK_SIZE);
 }

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 4a55cdcdc008..5c15d6b57329 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -475,8 +475,8 @@ static void ctr_crypt_final(struct crypto_aes_ctx *ctx,
         unsigned int nbytes = walk->nbytes;
 
         aesni_enc(ctx, keystream, ctrblk);
-        crypto_xor(keystream, src, nbytes);
-        memcpy(dst, keystream, nbytes);
+        crypto_xor_cpy(dst, keystream, src, nbytes);
+
         crypto_inc(ctrblk, AES_BLOCK_SIZE);
 }

diff --git a/arch/x86/crypto/blowfish_glue.c b/arch/x86/crypto/blowfish_glue.c
index 17c05531dfd1..f9eca34301e2 100644
--- a/arch/x86/crypto/blowfish_glue.c
+++ b/arch/x86/crypto/blowfish_glue.c
@@ -271,8 +271,7 @@ static void ctr_crypt_final(struct bf_ctx *ctx, struct blkcipher_walk *walk)
         unsigned int nbytes = walk->nbytes;
 
         blowfish_enc_blk(ctx, keystream, ctrblk);
-        crypto_xor(keystream, src, nbytes);
-        memcpy(dst, keystream, nbytes);
+        crypto_xor_cpy(dst, keystream, src, nbytes);
 
         crypto_inc(ctrblk, BF_BLOCK_SIZE);

diff --git a/arch/x86/crypto/cast5_avx_glue.c b/arch/x86/crypto/cast5_avx_glue.c
index 8648158f3916..dbea6020ffe7 100644
--- a/arch/x86/crypto/cast5_avx_glue.c
+++ b/arch/x86/crypto/cast5_avx_glue.c
@@ -256,8 +256,7 @@ static void ctr_crypt_final(struct blkcipher_desc *desc,
         unsigned int nbytes = walk->nbytes;
 
         __cast5_encrypt(ctx, keystream, ctrblk);
-        crypto_xor(keystream, src, nbytes);
-        memcpy(dst, keystream, nbytes);
+        crypto_xor_cpy(dst, keystream, src, nbytes);
 
         crypto_inc(ctrblk, CAST5_BLOCK_SIZE);

diff --git a/arch/x86/crypto/des3_ede_glue.c b/arch/x86/crypto/des3_ede_glue.c
index d6fc59aaaadf..30c0a37f4882 100644
--- a/arch/x86/crypto/des3_ede_glue.c
+++ b/arch/x86/crypto/des3_ede_glue.c
@@ -277,8 +277,7 @@ static void ctr_crypt_final(struct des3_ede_x86_ctx *ctx,
         unsigned int nbytes = walk->nbytes;
 
         des3_ede_enc_blk(ctx, keystream, ctrblk);
-        crypto_xor(keystream, src, nbytes);
-        memcpy(dst, keystream, nbytes);
+        crypto_xor_cpy(dst, keystream, src, nbytes);
 
         crypto_inc(ctrblk, DES3_EDE_BLOCK_SIZE);

diff --git a/crypto/ctr.c b/crypto/ctr.c
index 477d9226ccaa..854d924f9d8e 100644
--- a/crypto/ctr.c
+++ b/crypto/ctr.c
@@ -65,8 +65,7 @@ static void crypto_ctr_crypt_final(struct blkcipher_walk *walk,
         unsigned int nbytes = walk->nbytes;
 
         crypto_cipher_encrypt_one(tfm, keystream, ctrblk);
-        crypto_xor(keystream, src, nbytes);
-        memcpy(dst, keystream, nbytes);
+        crypto_xor_cpy(dst, keystream, src, nbytes);
 
         crypto_inc(ctrblk, bsize);
 }

diff --git a/crypto/pcbc.c b/crypto/pcbc.c
index 29dd2b4a3b85..d9e45a958720 100644
--- a/crypto/pcbc.c
+++ b/crypto/pcbc.c
@@ -55,8 +55,7 @@ static int crypto_pcbc_encrypt_segment(struct skcipher_request *req,
         do {
                 crypto_xor(iv, src, bsize);
                 crypto_cipher_encrypt_one(tfm, dst, iv);
-                memcpy(iv, dst, bsize);
-                crypto_xor(iv, src, bsize);
+                crypto_xor_cpy(iv, dst, src, bsize);
 
                 src += bsize;
                 dst += bsize;
@@ -79,8 +78,7 @@ static int crypto_pcbc_encrypt_inplace(struct skcipher_request *req,
                 memcpy(tmpbuf, src, bsize);
                 crypto_xor(iv, src, bsize);
                 crypto_cipher_encrypt_one(tfm, src, iv);
-                memcpy(iv, tmpbuf, bsize);
-                crypto_xor(iv, src, bsize);
+                crypto_xor_cpy(iv, tmpbuf, src, bsize);
 
                 src += bsize;
         } while ((nbytes -= bsize) >= bsize);
@@ -127,8 +125,7 @@ static int crypto_pcbc_decrypt_segment(struct skcipher_request *req,
         do {
                 crypto_cipher_decrypt_one(tfm, dst, src);
                 crypto_xor(dst, iv, bsize);
-                memcpy(iv, src, bsize);
-                crypto_xor(iv, dst, bsize);
+                crypto_xor_cpy(iv, dst, src, bsize);
 
                 src += bsize;
                 dst += bsize;
@@ -153,8 +150,7 @@ static int crypto_pcbc_decrypt_inplace(struct skcipher_request *req,
                 memcpy(tmpbuf, src, bsize);
                 crypto_cipher_decrypt_one(tfm, src, src);
                 crypto_xor(src, iv, bsize);
-                memcpy(iv, tmpbuf, bsize);
-                crypto_xor(iv, src, bsize);
+                crypto_xor_cpy(iv, src, tmpbuf, bsize);
 
                 src += bsize;
         } while ((nbytes -= bsize) >= bsize);

diff --git a/drivers/crypto/vmx/aes_ctr.c b/drivers/crypto/vmx/aes_ctr.c
index 9c26d9e8dbea..15a23f7e2e24 100644
--- a/drivers/crypto/vmx/aes_ctr.c
+++ b/drivers/crypto/vmx/aes_ctr.c
@@ -104,8 +104,7 @@ static void p8_aes_ctr_final(struct p8_aes_ctr_ctx *ctx,
         pagefault_enable();
         preempt_enable();
 
-        crypto_xor(keystream, src, nbytes);
-        memcpy(dst, keystream, nbytes);
+        crypto_xor_cpy(dst, keystream, src, nbytes);
         crypto_inc(ctrblk, AES_BLOCK_SIZE);
 }

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index cdf6b1e12460..fa17e5452796 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -758,9 +758,8 @@ static int crypt_iv_tcw_whitening(struct crypt_config *cc,
         int i, r;
 
         /* xor whitening with sector number */
-        memcpy(buf, tcw->whitening, TCW_WHITENING_SIZE);
-        crypto_xor(buf, (u8 *)&sector, 8);
-        crypto_xor(&buf[8], (u8 *)&sector, 8);
+        crypto_xor_cpy(buf, tcw->whitening, (u8 *)&sector, 8);
+        crypto_xor_cpy(&buf[8], tcw->whitening + 8, (u8 *)&sector, 8);
 
         /* calculate crc32 for every 32bit part and xor it */
         desc->tfm = tcw->crc32_tfm;
@@ -805,10 +804,10 @@ static int crypt_iv_tcw_gen(struct crypt_config *cc, u8 *iv,
         }
 
         /* Calculate IV */
-        memcpy(iv, tcw->iv_seed, cc->iv_size);
-        crypto_xor(iv, (u8 *)&sector, 8);
+        crypto_xor_cpy(iv, tcw->iv_seed, (u8 *)&sector, 8);
         if (cc->iv_size > 8)
-                crypto_xor(&iv[8], (u8 *)&sector, cc->iv_size - 8);
+                crypto_xor_cpy(&iv[8], tcw->iv_seed + 8, (u8 *)&sector,
+                               cc->iv_size - 8);
 
         return r;
 }

diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
index fd547f946bf8..e3cebf640c00 100644
--- a/include/crypto/algapi.h
+++ b/include/crypto/algapi.h
@@ -211,6 +211,25 @@ static inline void crypto_xor(u8 *dst, const u8 *src, unsigned int size)
         }
 }
 
+static inline void crypto_xor_cpy(u8 *dst, const u8 *src1, const u8 *src2,
+                                  unsigned int size)
+{
+        if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
+            __builtin_constant_p(size) &&
+            (size % sizeof(unsigned long)) == 0) {
+                unsigned long *d = (unsigned long *)dst;
+                unsigned long *s1 = (unsigned long *)src1;
+                unsigned long *s2 = (unsigned long *)src2;
+
+                while (size > 0) {
+                        *d++ = *s1++ ^ *s2++;
+                        size -= sizeof(unsigned long);
+                }
+        } else {
+                __crypto_xor(dst, src1, src2, size);
+        }
+}
+
 int blkcipher_walk_done(struct blkcipher_desc *desc,
                         struct blkcipher_walk *walk, int err);
 int blkcipher_walk_virt(struct blkcipher_desc *desc,

From patchwork Mon Jul 24 10:28:05 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108550
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 03/18] crypto: arm64/ghash-ce - add non-SIMD scalar fallback
Date: Mon, 24 Jul 2017 11:28:05 +0100
Message-Id: <20170724102820.16534-4-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

The arm64 kernel will shortly disallow nested kernel mode NEON, so add
a fallback to scalar C code that can be invoked in that case.
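
In outline, the scalar path folds each 16-byte block into the running
digest using the generic GF(2^128) multiply. This is condensed from the
ghash_do_update() added below (handling of the optional 'head' block is
omitted here):

        be128 dst = { cpu_to_be64(dg[1]), cpu_to_be64(dg[0]) };

        do {
                crypto_xor((u8 *)&dst, src, GHASH_BLOCK_SIZE);  /* dst ^= block */
                gf128mul_lle(&dst, &key->k);                    /* dst = dst * H */
                src += GHASH_BLOCK_SIZE;
        } while (--blocks);

        dg[0] = be64_to_cpu(dst.b);
        dg[1] = be64_to_cpu(dst.a);
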
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/Kconfig         |  3 +-
 arch/arm64/crypto/ghash-ce-glue.c | 49 ++++++++++++++++----
 2 files changed, 43 insertions(+), 9 deletions(-)

--
2.9.3

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index d92293747d63..7d75a363e317 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -28,8 +28,9 @@ config CRYPTO_SHA2_ARM64_CE
 
 config CRYPTO_GHASH_ARM64_CE
         tristate "GHASH (for GCM chaining mode) using ARMv8 Crypto Extensions"
-        depends on ARM64 && KERNEL_MODE_NEON
+        depends on KERNEL_MODE_NEON
         select CRYPTO_HASH
+        select CRYPTO_GF128MUL
 
 config CRYPTO_CRCT10DIF_ARM64_CE
         tristate "CRCT10DIF digest algorithm using PMULL instructions"

diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index 833ec1e3f3e9..30221ef56e70 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -1,7 +1,7 @@
 /*
  * Accelerated GHASH implementation with ARMv8 PMULL instructions.
  *
- * Copyright (C) 2014 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ * Copyright (C) 2014 - 2017 Linaro Ltd. <ard.biesheuvel@linaro.org>
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License version 2 as published
@@ -9,7 +9,9 @@
  */
 
 #include <asm/neon.h>
+#include <asm/simd.h>
 #include <asm/unaligned.h>
+#include <crypto/gf128mul.h>
 #include <crypto/internal/hash.h>
 #include <linux/cpufeature.h>
 #include <linux/crypto.h>
@@ -25,6 +27,7 @@ MODULE_LICENSE("GPL v2");
 struct ghash_key {
         u64 a;
         u64 b;
+        be128 k;
 };
 
 struct ghash_desc_ctx {
@@ -44,6 +47,36 @@ static int ghash_init(struct shash_desc *desc)
         return 0;
 }
 
+static void ghash_do_update(int blocks, u64 dg[], const char *src,
+                            struct ghash_key *key, const char *head)
+{
+        if (likely(may_use_simd())) {
+                kernel_neon_begin();
+                pmull_ghash_update(blocks, dg, src, key, head);
+                kernel_neon_end();
+        } else {
+                be128 dst = { cpu_to_be64(dg[1]), cpu_to_be64(dg[0]) };
+
+                do {
+                        const u8 *in = src;
+
+                        if (head) {
+                                in = head;
+                                blocks++;
+                                head = NULL;
+                        } else {
+                                src += GHASH_BLOCK_SIZE;
+                        }
+
+                        crypto_xor((u8 *)&dst, in, GHASH_BLOCK_SIZE);
+                        gf128mul_lle(&dst, &key->k);
+                } while (--blocks);
+
+                dg[0] = be64_to_cpu(dst.b);
+                dg[1] = be64_to_cpu(dst.a);
+        }
+}
+
 static int ghash_update(struct shash_desc *desc, const u8 *src,
                         unsigned int len)
 {
@@ -67,10 +100,9 @@ static int ghash_update(struct shash_desc *desc, const u8 *src,
                         blocks = len / GHASH_BLOCK_SIZE;
                         len %= GHASH_BLOCK_SIZE;
 
-                        kernel_neon_begin_partial(8);
-                        pmull_ghash_update(blocks, ctx->digest, src, key,
-                                           partial ? ctx->buf : NULL);
-                        kernel_neon_end();
+                        ghash_do_update(blocks, ctx->digest, src, key,
+                                        partial ? ctx->buf : NULL);
 
                         src += blocks * GHASH_BLOCK_SIZE;
                         partial = 0;
                 }
@@ -89,9 +121,7 @@ static int ghash_final(struct shash_desc *desc, u8 *dst)
 
                 memset(ctx->buf + partial, 0, GHASH_BLOCK_SIZE - partial);
 
-                kernel_neon_begin_partial(8);
-                pmull_ghash_update(1, ctx->digest, ctx->buf, key, NULL);
-                kernel_neon_end();
+                ghash_do_update(1, ctx->digest, ctx->buf, key, NULL);
         }
         put_unaligned_be64(ctx->digest[1], dst);
         put_unaligned_be64(ctx->digest[0], dst + 8);
@@ -111,6 +141,9 @@ static int ghash_setkey(struct crypto_shash *tfm,
                 return -EINVAL;
         }
 
+        /* needed for the fallback */
+        memcpy(&key->k, inkey, GHASH_BLOCK_SIZE);
+
         /* perform multiplication by 'x' in GF(2^128) */
         b = get_unaligned_be64(inkey);
         a = get_unaligned_be64(inkey + 8);

From patchwork Mon Jul 24 10:28:06 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108551
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 04/18] crypto: arm64/crct10dif - add non-SIMD generic fallback
Date: Mon, 24 Jul 2017 11:28:06 +0100
Message-Id: <20170724102820.16534-5-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

The arm64 kernel will shortly disallow nested kernel mode NEON, so add
a fallback to scalar C code that can be invoked in that case.
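
In outline, the change replaces the unconditional NEON bracketing with
a may_use_simd() check; this is the shape used for the fallbacks
throughout this series (condensed from the diff below):

        if (may_use_simd()) {
                kernel_neon_begin();
                *crc = crc_t10dif_pmull(*crc, data, length);
                kernel_neon_end();
        } else {
                /* NEON unusable (e.g. nested kernel mode NEON): scalar code */
                *crc = crc_t10dif_generic(*crc, data, length);
        }
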
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/crct10dif-ce-glue.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

--
2.9.3

diff --git a/arch/arm64/crypto/crct10dif-ce-glue.c b/arch/arm64/crypto/crct10dif-ce-glue.c
index 60cb590c2590..96f0cae4a022 100644
--- a/arch/arm64/crypto/crct10dif-ce-glue.c
+++ b/arch/arm64/crypto/crct10dif-ce-glue.c
@@ -1,7 +1,7 @@
 /*
  * Accelerated CRC-T10DIF using arm64 NEON and Crypto Extensions instructions
  *
- * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ * Copyright (C) 2016 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -18,6 +18,7 @@
 #include <crypto/internal/hash.h>
 
 #include <asm/neon.h>
+#include <asm/simd.h>
 
 #define CRC_T10DIF_PMULL_CHUNK_SIZE     16U
 
@@ -48,9 +49,13 @@ static int crct10dif_update(struct shash_desc *desc, const u8 *data,
         }
 
         if (length > 0) {
-                kernel_neon_begin_partial(14);
-                *crc = crc_t10dif_pmull(*crc, data, length);
-                kernel_neon_end();
+                if (may_use_simd()) {
+                        kernel_neon_begin();
+                        *crc = crc_t10dif_pmull(*crc, data, length);
+                        kernel_neon_end();
+                } else {
+                        *crc = crc_t10dif_generic(*crc, data, length);
+                }
         }
 
         return 0;

From patchwork Mon Jul 24 10:28:07 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108552
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 05/18] crypto: arm64/crc32 - add non-SIMD scalar fallback
Date: Mon, 24 Jul 2017 11:28:07 +0100
Message-Id: <20170724102820.16534-6-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

The arm64 kernel will shortly disallow nested kernel mode NEON, so add
a fallback to scalar C code that can be invoked in that case.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/crc32-ce-glue.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--
2.9.3

diff --git a/arch/arm64/crypto/crc32-ce-glue.c b/arch/arm64/crypto/crc32-ce-glue.c
index eccb1ae90064..624f4137918c 100644
--- a/arch/arm64/crypto/crc32-ce-glue.c
+++ b/arch/arm64/crypto/crc32-ce-glue.c
@@ -1,7 +1,7 @@
 /*
  * Accelerated CRC32(C) using arm64 NEON and Crypto Extensions instructions
  *
- * Copyright (C) 2016 Linaro Ltd <ard.biesheuvel@linaro.org>
+ * Copyright (C) 2016 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -19,6 +19,7 @@
 #include <crypto/internal/hash.h>
 
 #include <asm/hwcap.h>
+#include <asm/simd.h>
 #include <asm/neon.h>
 
 #define PMULL_MIN_LEN           64L     /* minimum size of buffer
@@ -105,10 +106,10 @@ static int crc32_pmull_update(struct shash_desc *desc, const u8 *data,
                 length -= l;
         }
 
-        if (length >= PMULL_MIN_LEN) {
+        if (length >= PMULL_MIN_LEN && may_use_simd()) {
                 l = round_down(length, SCALE_F);
 
-                kernel_neon_begin_partial(10);
+                kernel_neon_begin();
                 *crc = crc32_pmull_le(data, l, *crc);
                 kernel_neon_end();
 
@@ -137,10 +138,10 @@ static int crc32c_pmull_update(struct shash_desc *desc, const u8 *data,
                 length -= l;
         }
 
-        if (length >= PMULL_MIN_LEN) {
+        if (length >= PMULL_MIN_LEN && may_use_simd()) {
                 l = round_down(length, SCALE_F);
 
-                kernel_neon_begin_partial(10);
+                kernel_neon_begin();
                 *crc = crc32c_pmull_le(data, l, *crc);
                 kernel_neon_end();

From patchwork Mon Jul 24 10:28:08 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108553
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 06/18] crypto: arm64/sha1-ce - add non-SIMD generic fallback
Date: Mon, 24 Jul 2017 11:28:08 +0100
Message-Id: <20170724102820.16534-7-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

The arm64 kernel will shortly disallow nested kernel mode NEON, so add
a fallback to scalar C code that can be invoked in that case.
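
In outline (condensed from the diff below), each entry point bails out
to the generic C implementation before touching the NEON, which is why
the Kconfig entry now selects CRYPTO_SHA1:

        if (!may_use_simd())
                return crypto_sha1_update(desc, data, len);
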
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/Kconfig        |  3 ++-
 arch/arm64/crypto/sha1-ce-glue.c | 18 ++++++++++++++----
 2 files changed, 16 insertions(+), 5 deletions(-)

--
2.9.3

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 7d75a363e317..5d5953545dad 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -18,8 +18,9 @@ config CRYPTO_SHA512_ARM64
 
 config CRYPTO_SHA1_ARM64_CE
         tristate "SHA-1 digest algorithm (ARMv8 Crypto Extensions)"
-        depends on ARM64 && KERNEL_MODE_NEON
+        depends on KERNEL_MODE_NEON
         select CRYPTO_HASH
+        select CRYPTO_SHA1
 
 config CRYPTO_SHA2_ARM64_CE
         tristate "SHA-224/SHA-256 digest algorithm (ARMv8 Crypto Extensions)"

diff --git a/arch/arm64/crypto/sha1-ce-glue.c b/arch/arm64/crypto/sha1-ce-glue.c
index ea319c055f5d..efbeb3e0dcfb 100644
--- a/arch/arm64/crypto/sha1-ce-glue.c
+++ b/arch/arm64/crypto/sha1-ce-glue.c
@@ -1,7 +1,7 @@
 /*
  * sha1-ce-glue.c - SHA-1 secure hash using ARMv8 Crypto Extensions
  *
- * Copyright (C) 2014 Linaro Ltd <ard.biesheuvel@linaro.org>
+ * Copyright (C) 2014 - 2017 Linaro Ltd <ard.biesheuvel@linaro.org>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -9,6 +9,7 @@
  */
 
 #include <asm/neon.h>
+#include <asm/simd.h>
 #include <asm/unaligned.h>
 #include <crypto/internal/hash.h>
 #include <crypto/sha.h>
@@ -37,8 +38,11 @@ static int sha1_ce_update(struct shash_desc *desc, const u8 *data,
 {
         struct sha1_ce_state *sctx = shash_desc_ctx(desc);
 
+        if (!may_use_simd())
+                return crypto_sha1_update(desc, data, len);
+
         sctx->finalize = 0;
-        kernel_neon_begin_partial(16);
+        kernel_neon_begin();
         sha1_base_do_update(desc, data, len,
                             (sha1_block_fn *)sha1_ce_transform);
         kernel_neon_end();
@@ -52,13 +56,16 @@ static int sha1_ce_finup(struct shash_desc *desc, const u8 *data,
         struct sha1_ce_state *sctx = shash_desc_ctx(desc);
         bool finalize = !sctx->sst.count && !(len % SHA1_BLOCK_SIZE);
 
+        if (!may_use_simd())
+                return crypto_sha1_finup(desc, data, len, out);
+
         /*
          * Allow the asm code to perform the finalization if there is no
          * partial data and the input is a round multiple of the block size.
          */
         sctx->finalize = finalize;
 
-        kernel_neon_begin_partial(16);
+        kernel_neon_begin();
         sha1_base_do_update(desc, data, len,
                             (sha1_block_fn *)sha1_ce_transform);
         if (!finalize)
@@ -71,8 +78,11 @@ static int sha1_ce_final(struct shash_desc *desc, u8 *out)
 {
         struct sha1_ce_state *sctx = shash_desc_ctx(desc);
 
+        if (!may_use_simd())
+                return crypto_sha1_finup(desc, NULL, 0, out);
+
         sctx->finalize = 0;
-        kernel_neon_begin_partial(16);
+        kernel_neon_begin();
         sha1_base_do_finalize(desc, (sha1_block_fn *)sha1_ce_transform);
         kernel_neon_end();
         return sha1_base_finish(desc, out);

From patchwork Mon Jul 24 10:28:09 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108554
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 07/18] crypto: arm64/sha2-ce - add non-SIMD scalar fallback
Date: Mon, 24 Jul 2017 11:28:09 +0100
Message-Id: <20170724102820.16534-8-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.9.3
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

The arm64 kernel will shortly disallow nested kernel mode NEON, so add
a fallback to scalar code that can be invoked in that case.
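
Unlike the SHA-1 case, the fallback here reuses the scalar assembly
block function from the plain arm64 SHA-256 module, hence the new
'select CRYPTO_SHA256_ARM64' and the EXPORT_SYMBOL in the diff below
(condensed):

        if (!may_use_simd())
                return sha256_base_do_update(desc, data, len,
                                (sha256_block_fn *)sha256_block_data_order);
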
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/Kconfig        |  3 +-
 arch/arm64/crypto/sha2-ce-glue.c | 30 +++++++++++++++++---
 arch/arm64/crypto/sha256-glue.c  |  1 +
 3 files changed, 29 insertions(+), 5 deletions(-)

-- 
2.9.3

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 5d5953545dad..8cd145f9c1ff 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -24,8 +24,9 @@ config CRYPTO_SHA1_ARM64_CE
 
 config CRYPTO_SHA2_ARM64_CE
 	tristate "SHA-224/SHA-256 digest algorithm (ARMv8 Crypto Extensions)"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_HASH
+	select CRYPTO_SHA256_ARM64
 
 config CRYPTO_GHASH_ARM64_CE
 	tristate "GHASH (for GCM chaining mode) using ARMv8 Crypto Extensions"

diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c
index 0ed9486f75dd..fd1ff2b13dfa 100644
--- a/arch/arm64/crypto/sha2-ce-glue.c
+++ b/arch/arm64/crypto/sha2-ce-glue.c
@@ -1,7 +1,7 @@
 /*
  * sha2-ce-glue.c - SHA-224/SHA-256 using ARMv8 Crypto Extensions
  *
- * Copyright (C) 2014 Linaro Ltd
+ * Copyright (C) 2014 - 2017 Linaro Ltd
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -9,6 +9,7 @@
  */
 
 #include <asm/neon.h>
+#include <asm/simd.h>
 #include <asm/unaligned.h>
 #include <crypto/internal/hash.h>
 #include <crypto/sha.h>
@@ -34,13 +35,19 @@ const u32 sha256_ce_offsetof_count = offsetof(struct sha256_ce_state,
 const u32 sha256_ce_offsetof_finalize = offsetof(struct sha256_ce_state,
 						 finalize);
 
+asmlinkage void sha256_block_data_order(u32 *digest, u8 const *src, int blocks);
+
 static int sha256_ce_update(struct shash_desc *desc, const u8 *data,
 			    unsigned int len)
 {
 	struct sha256_ce_state *sctx = shash_desc_ctx(desc);
 
+	if (!may_use_simd())
+		return sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+
 	sctx->finalize = 0;
-	kernel_neon_begin_partial(28);
+	kernel_neon_begin();
 	sha256_base_do_update(desc, data, len,
 			      (sha256_block_fn *)sha2_ce_transform);
 	kernel_neon_end();
@@ -54,13 +61,22 @@ static int sha256_ce_finup(struct shash_desc *desc, const u8 *data,
 	struct sha256_ce_state *sctx = shash_desc_ctx(desc);
 	bool finalize = !sctx->sst.count && !(len % SHA256_BLOCK_SIZE);
 
+	if (!may_use_simd()) {
+		if (len)
+			sha256_base_do_update(desc, data, len,
+				(sha256_block_fn *)sha256_block_data_order);
+		sha256_base_do_finalize(desc,
+				(sha256_block_fn *)sha256_block_data_order);
+		return sha256_base_finish(desc, out);
+	}
+
 	/*
	 * Allow the asm code to perform the finalization if there is no
	 * partial data and the input is a round multiple of the block size.
	 */
	sctx->finalize = finalize;

-	kernel_neon_begin_partial(28);
+	kernel_neon_begin();
 	sha256_base_do_update(desc, data, len,
 			      (sha256_block_fn *)sha2_ce_transform);
 	if (!finalize)
@@ -74,8 +90,14 @@ static int sha256_ce_final(struct shash_desc *desc, u8 *out)
 {
 	struct sha256_ce_state *sctx = shash_desc_ctx(desc);
 
+	if (!may_use_simd()) {
+		sha256_base_do_finalize(desc,
+				(sha256_block_fn *)sha256_block_data_order);
+		return sha256_base_finish(desc, out);
+	}
+
 	sctx->finalize = 0;
-	kernel_neon_begin_partial(28);
+	kernel_neon_begin();
 	sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
 	kernel_neon_end();
 	return sha256_base_finish(desc, out);

diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c
index a2226f841960..b064d925fe2a 100644
--- a/arch/arm64/crypto/sha256-glue.c
+++ b/arch/arm64/crypto/sha256-glue.c
@@ -29,6 +29,7 @@ MODULE_ALIAS_CRYPTO("sha256");
 
 asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
 					unsigned int num_blks);
+EXPORT_SYMBOL(sha256_block_data_order);
 
 asmlinkage void sha256_block_neon(u32 *digest, const void *data,
 				  unsigned int num_blks);
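The fallback leans on the shash base layer for all of the bookkeeping: the exported scalar sha256_block_data_order() only ever sees whole blocks, so the glue code can hand it straight to the sha256_base helpers. A hedged sketch of what a purely scalar update path looks like, using the function names from the patch inside a hypothetical wrapper:

    /* Sketch only: sha256_base_do_update() buffers partial blocks and
     * invokes the supplied block function once per full SHA-256 block,
     * so this wrapper is safe in any context, NEON or not. */
    static int sha256_scalar_update(struct shash_desc *desc, const u8 *data,
                                    unsigned int len)
    {
            return sha256_base_do_update(desc, data, len,
                            (sha256_block_fn *)sha256_block_data_order);
    }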
From patchwork Mon Jul 24 10:28:10 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 108555
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 08/18] crypto: arm64/aes-ce-cipher - match round key endianness with generic code
Date: Mon, 24 Jul 2017 11:28:10 +0100
Message-Id: <20170724102820.16534-9-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

In order to be able to reuse the generic AES code as a fallback for situations where the NEON may not be used, update the key handling to match the byte order of the generic code: it stores round keys as sequences of 32-bit quantities rather than streams of bytes, and so our code needs to be updated to reflect that.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-core.S | 30 ++++++++---------
 arch/arm64/crypto/aes-ce-cipher.c   | 35 +++++++++-----------
 arch/arm64/crypto/aes-ce.S          | 12 +++----
 3 files changed, 37 insertions(+), 40 deletions(-)

-- 
2.9.3

diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index 3363560c79b7..e3a375c4cb83 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -1,7 +1,7 @@
 /*
  * aesce-ccm-core.S - AES-CCM transform for ARMv8 with Crypto Extensions
  *
- * Copyright (C) 2013 - 2014 Linaro Ltd
+ * Copyright (C) 2013 - 2017 Linaro Ltd
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -32,7 +32,7 @@ ENTRY(ce_aes_ccm_auth_data)
 	beq	8f			/* out of input? */
 	cbnz	w8, 0b
 	eor	v0.16b, v0.16b, v1.16b
-1:	ld1	{v3.16b}, [x4]		/* load first round key */
+1:	ld1	{v3.4s}, [x4]		/* load first round key */
 	prfm	pldl1strm, [x1]
 	cmp	w5, #12			/* which key size? */
 	add	x6, x4, #16
@@ -42,17 +42,17 @@ ENTRY(ce_aes_ccm_auth_data)
 	mov	v5.16b, v3.16b
 	b	4f
 2:	mov	v4.16b, v3.16b
-	ld1	{v5.16b}, [x6], #16	/* load 2nd round key */
+	ld1	{v5.4s}, [x6], #16	/* load 2nd round key */
 3:	aese	v0.16b, v4.16b
 	aesmc	v0.16b, v0.16b
-4:	ld1	{v3.16b}, [x6], #16	/* load next round key */
+4:	ld1	{v3.4s}, [x6], #16	/* load next round key */
 	aese	v0.16b, v5.16b
 	aesmc	v0.16b, v0.16b
-5:	ld1	{v4.16b}, [x6], #16	/* load next round key */
+5:	ld1	{v4.4s}, [x6], #16	/* load next round key */
 	subs	w7, w7, #3
 	aese	v0.16b, v3.16b
 	aesmc	v0.16b, v0.16b
-	ld1	{v5.16b}, [x6], #16	/* load next round key */
+	ld1	{v5.4s}, [x6], #16	/* load next round key */
 	bpl	3b
 	aese	v0.16b, v4.16b
 	subs	w2, w2, #16		/* last data? */
@@ -90,7 +90,7 @@ ENDPROC(ce_aes_ccm_auth_data)
  *			 u32 rounds);
  */
 ENTRY(ce_aes_ccm_final)
-	ld1	{v3.16b}, [x2], #16	/* load first round key */
+	ld1	{v3.4s}, [x2], #16	/* load first round key */
 	ld1	{v0.16b}, [x0]		/* load mac */
 	cmp	w3, #12			/* which key size? */
 	sub	w3, w3, #2		/* modified # of rounds */
@@ -100,17 +100,17 @@ ENTRY(ce_aes_ccm_final)
 	mov	v5.16b, v3.16b
 	b	2f
 0:	mov	v4.16b, v3.16b
-1:	ld1	{v5.16b}, [x2], #16	/* load next round key */
+1:	ld1	{v5.4s}, [x2], #16	/* load next round key */
 	aese	v0.16b, v4.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v4.16b
 	aesmc	v1.16b, v1.16b
-2:	ld1	{v3.16b}, [x2], #16	/* load next round key */
+2:	ld1	{v3.4s}, [x2], #16	/* load next round key */
 	aese	v0.16b, v5.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v5.16b
 	aesmc	v1.16b, v1.16b
-3:	ld1	{v4.16b}, [x2], #16	/* load next round key */
+3:	ld1	{v4.4s}, [x2], #16	/* load next round key */
 	subs	w3, w3, #3
 	aese	v0.16b, v3.16b
 	aesmc	v0.16b, v0.16b
@@ -137,31 +137,31 @@
 CPU_LE(	rev	x8, x8	)		/* keep swabbed ctr in reg */
 	cmp	w4, #12			/* which key size? */
 	sub	w7, w4, #2		/* get modified # of rounds */
 	ins	v1.d[1], x9		/* no carry in lower ctr */
-	ld1	{v3.16b}, [x3]		/* load first round key */
+	ld1	{v3.4s}, [x3]		/* load first round key */
 	add	x10, x3, #16
 	bmi	1f
 	bne	4f
 	mov	v5.16b, v3.16b
 	b	3f
 1:	mov	v4.16b, v3.16b
-	ld1	{v5.16b}, [x10], #16	/* load 2nd round key */
+	ld1	{v5.4s}, [x10], #16	/* load 2nd round key */
 2:	/* inner loop: 3 rounds, 2x interleaved */
 	aese	v0.16b, v4.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v4.16b
 	aesmc	v1.16b, v1.16b
-3:	ld1	{v3.16b}, [x10], #16	/* load next round key */
+3:	ld1	{v3.4s}, [x10], #16	/* load next round key */
 	aese	v0.16b, v5.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v5.16b
 	aesmc	v1.16b, v1.16b
-4:	ld1	{v4.16b}, [x10], #16	/* load next round key */
+4:	ld1	{v4.4s}, [x10], #16	/* load next round key */
 	subs	w7, w7, #3
 	aese	v0.16b, v3.16b
 	aesmc	v0.16b, v0.16b
 	aese	v1.16b, v3.16b
 	aesmc	v1.16b, v1.16b
-	ld1	{v5.16b}, [x10], #16	/* load next round key */
+	ld1	{v5.4s}, [x10], #16	/* load next round key */
 	bpl	2b
 	aese	v0.16b, v4.16b
 	aese	v1.16b, v4.16b

diff --git a/arch/arm64/crypto/aes-ce-cipher.c b/arch/arm64/crypto/aes-ce-cipher.c
index 50d9fe11d0c8..a0a0e5e3a8b5 100644
--- a/arch/arm64/crypto/aes-ce-cipher.c
+++ b/arch/arm64/crypto/aes-ce-cipher.c
@@ -1,7 +1,7 @@
 /*
  * aes-ce-cipher.c - core AES cipher using ARMv8 Crypto Extensions
  *
- * Copyright (C) 2013 - 2014 Linaro Ltd
+ * Copyright (C) 2013 - 2017 Linaro Ltd
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -9,6 +9,7 @@
  */
 
 #include <asm/neon.h>
+#include <asm/unaligned.h>
 #include <crypto/aes.h>
 #include <linux/cpufeature.h>
 #include <linux/crypto.h>
@@ -47,24 +48,24 @@ static void aes_cipher_encrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
 
 	kernel_neon_begin_partial(4);
 
 	__asm__("	ld1	{v0.16b}, %[in]		;"
-		"	ld1	{v1.16b}, [%[key]], #16	;"
+		"	ld1	{v1.4s}, [%[key]], #16	;"
 		"	cmp	%w[rounds], #10		;"
 		"	bmi	0f			;"
 		"	bne	3f			;"
 		"	mov	v3.16b, v1.16b		;"
 		"	b	2f			;"
 		"0:	mov	v2.16b, v1.16b		;"
-		"	ld1	{v3.16b}, [%[key]], #16	;"
+		"	ld1	{v3.4s}, [%[key]], #16	;"
 		"1:	aese	v0.16b, v2.16b		;"
 		"	aesmc	v0.16b, v0.16b		;"
-		"2:	ld1	{v1.16b}, [%[key]], #16	;"
+		"2:	ld1	{v1.4s}, [%[key]], #16	;"
 		"	aese	v0.16b, v3.16b		;"
 		"	aesmc	v0.16b, v0.16b		;"
-		"3:	ld1	{v2.16b}, [%[key]], #16	;"
+		"3:	ld1	{v2.4s}, [%[key]], #16	;"
 		"	subs	%w[rounds], %w[rounds], #3 ;"
 		"	aese	v0.16b, v1.16b		;"
 		"	aesmc	v0.16b, v0.16b		;"
-		"	ld1	{v3.16b}, [%[key]], #16	;"
+		"	ld1	{v3.4s}, [%[key]], #16	;"
 		"	bpl	1b			;"
 		"	aese	v0.16b, v2.16b		;"
 		"	eor	v0.16b, v0.16b, v3.16b	;"
@@ -92,24 +93,24 @@ static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
 
 	kernel_neon_begin_partial(4);
 
 	__asm__("	ld1	{v0.16b}, %[in]		;"
-		"	ld1	{v1.16b}, [%[key]], #16	;"
+		"	ld1	{v1.4s}, [%[key]], #16	;"
 		"	cmp	%w[rounds], #10		;"
 		"	bmi	0f			;"
 		"	bne	3f			;"
 		"	mov	v3.16b, v1.16b		;"
 		"	b	2f			;"
 		"0:	mov	v2.16b, v1.16b		;"
-		"	ld1	{v3.16b}, [%[key]], #16	;"
+		"	ld1	{v3.4s}, [%[key]], #16	;"
 		"1:	aesd	v0.16b, v2.16b		;"
 		"	aesimc	v0.16b, v0.16b		;"
-		"2:	ld1	{v1.16b}, [%[key]], #16	;"
+		"2:	ld1	{v1.4s}, [%[key]], #16	;"
 		"	aesd	v0.16b, v3.16b		;"
 		"	aesimc	v0.16b, v0.16b		;"
-		"3:	ld1	{v2.16b}, [%[key]], #16	;"
+		"3:	ld1	{v2.4s}, [%[key]], #16	;"
 		"	subs	%w[rounds], %w[rounds], #3 ;"
 		"	aesd	v0.16b, v1.16b		;"
 		"	aesimc	v0.16b, v0.16b		;"
-		"	ld1	{v3.16b}, [%[key]], #16	;"
+		"	ld1	{v3.4s}, [%[key]], #16	;"
 		"	bpl	1b			;"
 		"	aesd	v0.16b, v2.16b		;"
 		"	eor	v0.16b, v0.16b, v3.16b	;"
@@ -165,20 +166,16 @@ int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 	    key_len != AES_KEYSIZE_256)
 		return -EINVAL;
 
-	memcpy(ctx->key_enc, in_key, key_len);
 	ctx->key_length = key_len;
+	for (i = 0; i < kwords; i++)
+		ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
 
 	kernel_neon_begin_partial(2);
 	for (i = 0; i < sizeof(rcon); i++) {
 		u32 *rki = ctx->key_enc + (i * kwords);
 		u32 *rko = rki + kwords;
 
-#ifndef CONFIG_CPU_BIG_ENDIAN
 		rko[0] = ror32(aes_sub(rki[kwords - 1]), 8) ^ rcon[i] ^ rki[0];
-#else
-		rko[0] = rol32(aes_sub(rki[kwords - 1]), 8) ^ (rcon[i] << 24) ^
-			 rki[0];
-#endif
 		rko[1] = rko[0] ^ rki[1];
 		rko[2] = rko[1] ^ rki[2];
 		rko[3] = rko[2] ^ rki[3];
@@ -210,9 +207,9 @@ int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 
 	key_dec[0] = key_enc[j];
 	for (i = 1, j--; j > 0; i++, j--)
-		__asm__("ld1	{v0.16b}, %[in]		;"
+		__asm__("ld1	{v0.4s}, %[in]		;"
 			"aesimc	v1.16b, v0.16b	;"
-			"st1	{v1.16b}, %[out]	;"
+			"st1	{v1.4s}, %[out]	;"
 
 		:	[out]	"=Q"(key_dec[i])
 		:	[in]	"Q"(key_enc[j])

diff --git a/arch/arm64/crypto/aes-ce.S b/arch/arm64/crypto/aes-ce.S
index b46093d567e5..50330f5c3adc 100644
--- a/arch/arm64/crypto/aes-ce.S
+++ b/arch/arm64/crypto/aes-ce.S
@@ -2,7 +2,7 @@
  * linux/arch/arm64/crypto/aes-ce.S - AES cipher for ARMv8 with
  * Crypto Extensions
  *
- * Copyright (C) 2013 Linaro Ltd
+ * Copyright (C) 2013 - 2017 Linaro Ltd
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -22,11 +22,11 @@
 	cmp	\rounds, #12
 	blo	2222f		/* 128 bits */
 	beq	1111f		/* 192 bits */
-	ld1	{v17.16b-v18.16b}, [\rk], #32
-1111:	ld1	{v19.16b-v20.16b}, [\rk], #32
-2222:	ld1	{v21.16b-v24.16b}, [\rk], #64
-	ld1	{v25.16b-v28.16b}, [\rk], #64
-	ld1	{v29.16b-v31.16b}, [\rk]
+	ld1	{v17.4s-v18.4s}, [\rk], #32
+1111:	ld1	{v19.4s-v20.4s}, [\rk], #32
+2222:	ld1	{v21.4s-v24.4s}, [\rk], #64
+	ld1	{v25.4s-v28.4s}, [\rk], #64
+	ld1	{v29.4s-v31.4s}, [\rk]
 	.endm

	/* prepare for encryption with key in rk[] */
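Holding the round keys as 32-bit words gives one in-memory layout that works on both big- and little-endian kernels: get_unaligned_le32() fixes the interpretation of the key bytes once, and the ld1 {v.4s} loads then perform any per-element byte swapping in hardware, which is what lets the #ifdef CONFIG_CPU_BIG_ENDIAN branch above disappear. A small sketch of the first round-key word for the key bytes 00 01 02 03 ...:

    #include <asm/unaligned.h>

    /* Stored as a stream of bytes the layout differed between
     * endiannesses; stored as an LE 32-bit quantity it is always
     * the same word value. */
    u32 rk0 = get_unaligned_le32(in_key);  /* 0x03020100 on any endianness */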
From patchwork Mon Jul 24 10:28:11 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 108556
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 09/18] crypto: arm64/aes-ce-cipher: add non-SIMD generic fallback
Date: Mon, 24 Jul 2017 11:28:11 +0100
Message-Id: <20170724102820.16534-10-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar code that can be invoked in that case.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/Kconfig         |  1 +
 arch/arm64/crypto/aes-ce-cipher.c | 20 +++++++++++++++++---
 2 files changed, 18 insertions(+), 3 deletions(-)

-- 
2.9.3

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 8cd145f9c1ff..2fd4bb6d0b5a 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -52,6 +52,7 @@ config CRYPTO_AES_ARM64_CE
 	tristate "AES core cipher using ARMv8 Crypto Extensions"
 	depends on ARM64 && KERNEL_MODE_NEON
 	select CRYPTO_ALGAPI
+	select CRYPTO_AES_ARM64
 
 config CRYPTO_AES_ARM64_CE_CCM
 	tristate "AES in CCM mode using ARMv8 Crypto Extensions"

diff --git a/arch/arm64/crypto/aes-ce-cipher.c b/arch/arm64/crypto/aes-ce-cipher.c
index a0a0e5e3a8b5..6a75cd75ed11 100644
--- a/arch/arm64/crypto/aes-ce-cipher.c
+++ b/arch/arm64/crypto/aes-ce-cipher.c
@@ -9,6 +9,7 @@
  */
 
 #include <asm/neon.h>
+#include <asm/simd.h>
 #include <asm/unaligned.h>
 #include <crypto/aes.h>
 #include <linux/cpufeature.h>
@@ -21,6 +22,9 @@
 MODULE_DESCRIPTION("Synchronous AES cipher using ARMv8 Crypto Extensions");
 MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
 MODULE_LICENSE("GPL v2");
 
+asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
+asmlinkage void __aes_arm64_decrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
+
 struct aes_block {
 	u8 b[AES_BLOCK_SIZE];
 };
@@ -45,7 +49,12 @@ static void aes_cipher_encrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
 	void *dummy0;
 	int dummy1;
 
-	kernel_neon_begin_partial(4);
+	if (!may_use_simd()) {
+		__aes_arm64_encrypt(ctx->key_enc, dst, src, num_rounds(ctx));
+		return;
+	}
+
+	kernel_neon_begin();
 
 	__asm__("	ld1	{v0.16b}, %[in]		;"
 		"	ld1	{v1.4s}, [%[key]], #16	;"
@@ -90,7 +99,12 @@ static void aes_cipher_decrypt(struct crypto_tfm *tfm, u8 dst[], u8 const src[])
 	void *dummy0;
 	int dummy1;
 
-	kernel_neon_begin_partial(4);
+	if (!may_use_simd()) {
+		__aes_arm64_decrypt(ctx->key_dec, dst, src, num_rounds(ctx));
+		return;
+	}
+
+	kernel_neon_begin();
 
 	__asm__("	ld1	{v0.16b}, %[in]		;"
 		"	ld1	{v1.4s}, [%[key]], #16	;"
@@ -170,7 +184,7 @@ int ce_aes_expandkey(struct crypto_aes_ctx *ctx, const u8 *in_key,
 	for (i = 0; i < kwords; i++)
 		ctx->key_enc[i] = get_unaligned_le32(in_key + i * sizeof(u32));
 
-	kernel_neon_begin_partial(2);
+	kernel_neon_begin();
 	for (i = 0; i < sizeof(rcon); i++) {
 		u32 *rki = ctx->key_enc + (i * kwords);
 		u32 *rko = rki + kwords;
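The rounds argument handed to the scalar helpers follows the usual AES schedule arithmetic: six plus the key length in 32-bit words, i.e. 10, 12 or 14 rounds for 128-, 192- and 256-bit keys. As a quick sanity check of that formula (num_rounds() in the driver is assumed to compute the same thing; rounds_for_keylen() is an illustrative name, not from the patch):

    /* 6 + key_length / 4:  AES-128 -> 6 + 16/4 = 10 rounds,
     *                      AES-192 -> 6 + 24/4 = 12 rounds,
     *                      AES-256 -> 6 + 32/4 = 14 rounds. */
    static int rounds_for_keylen(unsigned int key_len_bytes)
    {
            return 6 + key_len_bytes / 4;
    }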
From patchwork Mon Jul 24 10:28:12 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 108557
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 10/18] crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback
Date: Mon, 24 Jul 2017 11:28:12 +0100
Message-Id: <20170724102820.16534-11-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

The arm64 kernel will shortly disallow nested kernel mode NEON. So honour this in the ARMv8 Crypto Extensions implementation of CCM-AES, and fall back to a scalar implementation using the generic crypto helpers for AES, XOR and incrementing the CTR counter.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/Kconfig           |   1 +
 arch/arm64/crypto/aes-ce-ccm-glue.c | 174 ++++++++++++++++----
 2 files changed, 140 insertions(+), 35 deletions(-)

-- 
2.9.3

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 2fd4bb6d0b5a..ba637765c19a 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -59,6 +59,7 @@ config CRYPTO_AES_ARM64_CE_CCM
 	depends on ARM64 && KERNEL_MODE_NEON
 	select CRYPTO_ALGAPI
 	select CRYPTO_AES_ARM64_CE
+	select CRYPTO_AES_ARM64
 	select CRYPTO_AEAD
 
 config CRYPTO_AES_ARM64_CE_BLK

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index 6a7dbc7c83a6..a1254036f2b1 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -1,7 +1,7 @@
 /*
  * aes-ccm-glue.c - AES-CCM transform for ARMv8 with Crypto Extensions
  *
- * Copyright (C) 2013 - 2014 Linaro Ltd
+ * Copyright (C) 2013 - 2017 Linaro Ltd
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -9,6 +9,7 @@
  */
 
 #include <asm/neon.h>
+#include <asm/simd.h>
 #include <asm/unaligned.h>
 #include <crypto/aes.h>
 #include <crypto/scatterwalk.h>
@@ -44,6 +45,8 @@
 asmlinkage void ce_aes_ccm_final(u8 mac[], u8 const ctr[], u32 const rk[],
 				 u32 rounds);
 
+asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
+
 static int ccm_setkey(struct crypto_aead *tfm, const u8 *in_key,
 		      unsigned int key_len)
 {
@@ -103,7 +106,45 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen)
 	return 0;
 }
 
-static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
+static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
+			   u32 abytes, u32 *macp, bool use_neon)
+{
+	if (likely(use_neon)) {
+		ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc,
+				     num_rounds(key));
+	} else {
+		if (*macp > 0 && *macp < AES_BLOCK_SIZE) {
+			int added = min(abytes, AES_BLOCK_SIZE - *macp);
+
+			crypto_xor(&mac[*macp], in, added);
+
+			*macp += added;
+			in += added;
+			abytes -= added;
+		}
+
+		while (abytes > AES_BLOCK_SIZE) {
+			__aes_arm64_encrypt(key->key_enc, mac, mac,
+					    num_rounds(key));
+			crypto_xor(mac, in, AES_BLOCK_SIZE);
+
+			in += AES_BLOCK_SIZE;
+			abytes -= AES_BLOCK_SIZE;
+		}
+
+		if (abytes > 0) {
+			__aes_arm64_encrypt(key->key_enc, mac, mac,
+					    num_rounds(key));
+			crypto_xor(mac, in, abytes);
+			*macp = abytes;
+		} else {
+			*macp = 0;
+		}
+	}
+}
+
+static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[],
+				   bool use_neon)
 {
 	struct crypto_aead *aead = crypto_aead_reqtfm(req);
 	struct crypto_aes_ctx *ctx = crypto_aead_ctx(aead);
@@ -122,8 +163,7 @@
 		ltag.len = 6;
 	}
 
-	ce_aes_ccm_auth_data(mac, (u8 *)&ltag, ltag.len, &macp, ctx->key_enc,
-			     num_rounds(ctx));
+	ccm_update_mac(ctx, mac, (u8 *)&ltag, ltag.len, &macp, use_neon);
 	scatterwalk_start(&walk, req->src);
 
 	do {
@@ -135,8 +175,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 			n = scatterwalk_clamp(&walk, len);
 		}
 		p = scatterwalk_map(&walk);
-		ce_aes_ccm_auth_data(mac, p, n, &macp, ctx->key_enc,
-				     num_rounds(ctx));
+		ccm_update_mac(ctx, mac, p, n, &macp, use_neon);
 		len -= n;
 
 		scatterwalk_unmap(p);
@@ -145,6 +184,56 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 	} while (len);
 }
 
+static int ccm_crypt_fallback(struct skcipher_walk *walk, u8 mac[], u8 iv0[],
+			      struct crypto_aes_ctx *ctx, bool enc)
+{
+	u8 buf[AES_BLOCK_SIZE];
+	int err = 0;
+
+	while (walk->nbytes) {
+		int blocks = walk->nbytes / AES_BLOCK_SIZE;
+		u32 tail = walk->nbytes % AES_BLOCK_SIZE;
+		u8 *dst = walk->dst.virt.addr;
+		u8 *src = walk->src.virt.addr;
+		u32 nbytes = walk->nbytes;
+
+		if (nbytes == walk->total && tail > 0) {
+			blocks++;
+			tail = 0;
+		}
+
+		do {
+			u32 bsize = AES_BLOCK_SIZE;
+
+			if (nbytes < AES_BLOCK_SIZE)
+				bsize = nbytes;
+
+			crypto_inc(walk->iv, AES_BLOCK_SIZE);
+			__aes_arm64_encrypt(ctx->key_enc, buf, walk->iv,
+					    num_rounds(ctx));
+			__aes_arm64_encrypt(ctx->key_enc, mac, mac,
+					    num_rounds(ctx));
+			if (enc)
+				crypto_xor(mac, src, bsize);
+			crypto_xor_cpy(dst, src, buf, bsize);
+			if (!enc)
+				crypto_xor(mac, dst, bsize);
+			dst += bsize;
+			src += bsize;
+			nbytes -= bsize;
+		} while (--blocks);
+
+		err = skcipher_walk_done(walk, tail);
+	}
+
+	if (!err) {
+		__aes_arm64_encrypt(ctx->key_enc, buf, iv0, num_rounds(ctx));
+		__aes_arm64_encrypt(ctx->key_enc, mac, mac, num_rounds(ctx));
+		crypto_xor(mac, buf, AES_BLOCK_SIZE);
+	}
+	return err;
+}
+
 static int ccm_encrypt(struct aead_request *req)
 {
 	struct crypto_aead *aead = crypto_aead_reqtfm(req);
@@ -153,39 +242,46 @@ static int ccm_encrypt(struct aead_request *req)
 	u8 __aligned(8) mac[AES_BLOCK_SIZE];
 	u8 buf[AES_BLOCK_SIZE];
 	u32 len = req->cryptlen;
+	bool use_neon = may_use_simd();
 	int err;
 
 	err = ccm_init_mac(req, mac, len);
 	if (err)
 		return err;
 
-	kernel_neon_begin_partial(6);
+	if (likely(use_neon))
+		kernel_neon_begin();
 
 	if (req->assoclen)
-		ccm_calculate_auth_mac(req, mac);
+		ccm_calculate_auth_mac(req, mac, use_neon);
 
 	/* preserve the original iv for the final round */
 	memcpy(buf, req->iv, AES_BLOCK_SIZE);
 
 	err = skcipher_walk_aead_encrypt(&walk, req, true);
 
-	while (walk.nbytes) {
-		u32 tail = walk.nbytes % AES_BLOCK_SIZE;
-
-		if (walk.nbytes == walk.total)
-			tail = 0;
+	if (likely(use_neon)) {
+		while (walk.nbytes) {
+			u32 tail = walk.nbytes % AES_BLOCK_SIZE;
 
-		ce_aes_ccm_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
-				   walk.nbytes - tail, ctx->key_enc,
-				   num_rounds(ctx), mac, walk.iv);
+			if (walk.nbytes == walk.total)
+				tail = 0;
 
-		err = skcipher_walk_done(&walk, tail);
-	}
-	if (!err)
-		ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+			ce_aes_ccm_encrypt(walk.dst.virt.addr,
+					   walk.src.virt.addr,
+					   walk.nbytes - tail, ctx->key_enc,
+					   num_rounds(ctx), mac, walk.iv);
 
-	kernel_neon_end();
+			err = skcipher_walk_done(&walk, tail);
+		}
+		if (!err)
+			ce_aes_ccm_final(mac, buf, ctx->key_enc,
+					 num_rounds(ctx));
+
+		kernel_neon_end();
+	} else {
+		err = ccm_crypt_fallback(&walk, mac, buf, ctx, true);
+	}
 
 	if (err)
 		return err;
@@ -205,38 +301,46 @@ static int ccm_decrypt(struct aead_request *req)
 	u8 __aligned(8) mac[AES_BLOCK_SIZE];
 	u8 buf[AES_BLOCK_SIZE];
 	u32 len = req->cryptlen - authsize;
+	bool use_neon = may_use_simd();
 	int err;
 
 	err = ccm_init_mac(req, mac, len);
 	if (err)
 		return err;
 
-	kernel_neon_begin_partial(6);
+	if (likely(use_neon))
+		kernel_neon_begin();
 
 	if (req->assoclen)
-		ccm_calculate_auth_mac(req, mac);
+		ccm_calculate_auth_mac(req, mac, use_neon);
 
 	/* preserve the original iv for the final round */
 	memcpy(buf, req->iv, AES_BLOCK_SIZE);
 
 	err = skcipher_walk_aead_decrypt(&walk, req, true);
 
-	while (walk.nbytes) {
-		u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+	if (likely(use_neon)) {
+		while (walk.nbytes) {
+			u32 tail = walk.nbytes % AES_BLOCK_SIZE;
 
-		if (walk.nbytes == walk.total)
-			tail = 0;
+			if (walk.nbytes == walk.total)
+				tail = 0;
 
-		ce_aes_ccm_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
-				   walk.nbytes - tail, ctx->key_enc,
-				   num_rounds(ctx), mac, walk.iv);
+			ce_aes_ccm_decrypt(walk.dst.virt.addr,
+					   walk.src.virt.addr,
+					   walk.nbytes - tail, ctx->key_enc,
+					   num_rounds(ctx), mac, walk.iv);
 
-		err = skcipher_walk_done(&walk, tail);
-	}
-	if (!err)
-		ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+			err = skcipher_walk_done(&walk, tail);
+		}
+		if (!err)
+			ce_aes_ccm_final(mac, buf, ctx->key_enc,
+					 num_rounds(ctx));
 
-	kernel_neon_end();
+		kernel_neon_end();
+	} else {
+		err = ccm_crypt_fallback(&walk, mac, buf, ctx, false);
+	}
 
 	if (err)
 		return err;
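The fallback leans on crypto_xor_cpy(), which writes dst = src1 ^ src2 without requiring the output buffer to alias either input, so a single call produces the ciphertext from the plaintext and the encrypted counter block. As a sketch, one CTR round of the fallback path performs (variable names as in the diff, lifted out of their surrounding loop):

    /* keystream = AES_enc(ctr + 1); dst = src ^ keystream */
    crypto_inc(walk->iv, AES_BLOCK_SIZE);               /* big-endian +1 */
    __aes_arm64_encrypt(ctx->key_enc, buf, walk->iv,
                        num_rounds(ctx));               /* keystream blk */
    crypto_xor_cpy(dst, src, buf, bsize);               /* dst = src^buf */

The same loop iteration also folds the block into the CBC-MAC, before the XOR on encryption and after it on decryption, so authentication always covers the plaintext.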
From patchwork Mon Jul 24 10:28:13 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 108558
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 11/18] crypto: arm64/aes-blk - add a non-SIMD fallback for synchronous CTR
Date: Mon, 24 Jul 2017 11:28:13 +0100
Message-Id: <20170724102820.16534-12-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

To accommodate systems that may disallow use of the NEON in kernel mode in some circumstances, introduce a C fallback for synchronous AES in CTR mode, and use it if may_use_simd() returns false.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/Kconfig            |  6 ++-
 arch/arm64/crypto/aes-ctr-fallback.h | 53 ++++++++++++++++++
 arch/arm64/crypto/aes-glue.c         | 59 +++++++++++++++-----
 3 files changed, 101 insertions(+), 17 deletions(-)

-- 
2.9.3

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index ba637765c19a..a068dcbe2518 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -64,15 +64,17 @@ config CRYPTO_AES_ARM64_CE_CCM
 
 config CRYPTO_AES_ARM64_CE_BLK
 	tristate "AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_AES_ARM64_CE
+	select CRYPTO_AES_ARM64
 	select CRYPTO_SIMD
 
 config CRYPTO_AES_ARM64_NEON_BLK
 	tristate "AES in ECB/CBC/CTR/XTS modes using NEON instructions"
-	depends on ARM64 && KERNEL_MODE_NEON
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_BLKCIPHER
+	select CRYPTO_AES_ARM64
 	select CRYPTO_AES
 	select CRYPTO_SIMD

diff --git a/arch/arm64/crypto/aes-ctr-fallback.h b/arch/arm64/crypto/aes-ctr-fallback.h
new file mode 100644
index 000000000000..c9285717b6b5
--- /dev/null
+++ b/arch/arm64/crypto/aes-ctr-fallback.h
@@ -0,0 +1,53 @@
+/*
+ * Fallback for sync aes(ctr) in contexts where kernel mode NEON
+ * is not allowed
+ *
+ * Copyright (C) 2017 Linaro Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <crypto/aes.h>
+#include <crypto/internal/skcipher.h>
+
+asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds);
+
+static inline int aes_ctr_encrypt_fallback(struct crypto_aes_ctx *ctx,
+					   struct skcipher_request *req)
+{
+	struct skcipher_walk walk;
+	u8 buf[AES_BLOCK_SIZE];
+	int err;
+
+	err = skcipher_walk_virt(&walk, req, true);
+
+	while (walk.nbytes > 0) {
+		u8 *dst = walk.dst.virt.addr;
+		u8 *src = walk.src.virt.addr;
+		int nbytes = walk.nbytes;
+		int tail = 0;
+
+		if (nbytes < walk.total) {
+			nbytes = round_down(nbytes, AES_BLOCK_SIZE);
+			tail = walk.nbytes % AES_BLOCK_SIZE;
+		}
+
+		do {
+			int bsize = min(nbytes, AES_BLOCK_SIZE);
+
+			__aes_arm64_encrypt(ctx->key_enc, buf, walk.iv,
+					    6 + ctx->key_length / 4);
+			crypto_xor_cpy(dst, src, buf, bsize);
+			crypto_inc(walk.iv, AES_BLOCK_SIZE);
+
+			dst += AES_BLOCK_SIZE;
+			src += AES_BLOCK_SIZE;
+			nbytes -= AES_BLOCK_SIZE;
+		} while (nbytes > 0);
+
+		err = skcipher_walk_done(&walk, tail);
+	}
+	return err;
+}

diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 0da30e3b0e4b..998ba519a026 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -10,6 +10,7 @@
 
 #include <asm/neon.h>
 #include <asm/hwcap.h>
+#include <asm/simd.h>
 #include <crypto/aes.h>
 #include <crypto/internal/hash.h>
 #include <crypto/internal/simd.h>
@@ -19,6 +20,7 @@
 #include <linux/module.h>
 
 #include "aes-ce-setkey.h"
+#include "aes-ctr-fallback.h"
 
 #ifdef USE_V8_CRYPTO_EXTENSIONS
 #define MODE	"ce"
@@ -249,6 +251,17 @@ static int ctr_encrypt(struct skcipher_request *req)
 	return err;
 }
 
+static int ctr_encrypt_sync(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+	if (!may_use_simd())
+		return aes_ctr_encrypt_fallback(ctx, req);
+
+	return ctr_encrypt(req);
+}
+
 static int xts_encrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -355,8 +368,8 @@ static struct skcipher_alg aes_algs[] = { {
 	.ivsize		= AES_BLOCK_SIZE,
 	.chunksize	= AES_BLOCK_SIZE,
 	.setkey		= skcipher_aes_setkey,
-	.encrypt	= ctr_encrypt,
-	.decrypt	= ctr_encrypt,
+	.encrypt	= ctr_encrypt_sync,
+	.decrypt	= ctr_encrypt_sync,
 }, {
 	.base = {
 		.cra_name		= "__xts(aes)",
@@ -458,11 +471,35 @@ static int mac_init(struct shash_desc *desc)
 	return 0;
 }
 
+static void mac_do_update(struct crypto_aes_ctx *ctx, u8 const in[], int blocks,
+			  u8 dg[], int enc_before, int enc_after)
+{
+	int rounds = 6 + ctx->key_length / 4;
+
+	if (may_use_simd()) {
+		kernel_neon_begin();
+		aes_mac_update(in, ctx->key_enc, rounds, blocks, dg, enc_before,
+			       enc_after);
+		kernel_neon_end();
+	} else {
+		if (enc_before)
+			__aes_arm64_encrypt(ctx->key_enc, dg, dg, rounds);
+
+		while (blocks--) {
+			crypto_xor(dg, in, AES_BLOCK_SIZE);
+			in += AES_BLOCK_SIZE;
+
+			if (blocks || enc_after)
+				__aes_arm64_encrypt(ctx->key_enc, dg, dg,
+						    rounds);
+		}
+	}
+}
+
 static int mac_update(struct shash_desc *desc, const u8 *p, unsigned int len)
 {
 	struct mac_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
 	struct mac_desc_ctx *ctx = shash_desc_ctx(desc);
-	int rounds = 6 + tctx->key.key_length / 4;
 
 	while (len > 0) {
 		unsigned int l;
@@ -474,10 +511,8 @@ static int mac_update(struct shash_desc *desc, const u8 *p, unsigned int len)
 
 			len %= AES_BLOCK_SIZE;
 
-			kernel_neon_begin();
-			aes_mac_update(p, tctx->key.key_enc, rounds, blocks,
-				       ctx->dg, (ctx->len != 0), (len != 0));
-			kernel_neon_end();
+			mac_do_update(&tctx->key, p, blocks, ctx->dg,
+				      (ctx->len != 0), (len != 0));
 
 			p += blocks * AES_BLOCK_SIZE;
@@ -505,11 +540,8 @@ static int cbcmac_final(struct shash_desc *desc, u8 *out)
 {
 	struct mac_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
 	struct mac_desc_ctx *ctx = shash_desc_ctx(desc);
-	int rounds = 6 + tctx->key.key_length / 4;
 
-	kernel_neon_begin();
-	aes_mac_update(NULL, tctx->key.key_enc, rounds, 0, ctx->dg, 1, 0);
-	kernel_neon_end();
+	mac_do_update(&tctx->key, NULL, 0, ctx->dg, 1, 0);
 
 	memcpy(out, ctx->dg, AES_BLOCK_SIZE);
 
@@ -520,7 +552,6 @@ static int cmac_final(struct shash_desc *desc, u8 *out)
 {
 	struct mac_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm);
 	struct mac_desc_ctx *ctx = shash_desc_ctx(desc);
-	int rounds = 6 + tctx->key.key_length / 4;
 	u8 *consts = tctx->consts;
 
 	if (ctx->len != AES_BLOCK_SIZE) {
@@ -528,9 +559,7 @@ static int cmac_final(struct shash_desc *desc, u8 *out)
 		consts += AES_BLOCK_SIZE;
 	}
 
-	kernel_neon_begin();
-	aes_mac_update(consts, tctx->key.key_enc, rounds, 1, ctx->dg, 0, 1);
-	kernel_neon_end();
+	mac_do_update(&tctx->key, consts, 1, ctx->dg, 0, 1);
 
 	memcpy(out, ctx->dg, AES_BLOCK_SIZE);
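Only the synchronous ctr entry points are rewired here; the __-prefixed internal algorithms stay NEON-only because they are reached through the CRYPTO_SIMD wrapper. A caller that needs aes(ctr) from atomic context obtains the synchronous transform by masking out async implementations, which is what ultimately lands in ctr_encrypt_sync(). A hedged sketch of such an allocation (error handling omitted):

    /* Requesting a synchronous ctr(aes) transform: the CRYPTO_ALG_ASYNC
     * mask excludes async implementations, so the resulting tfm may be
     * invoked from contexts where the NEON is unavailable. */
    struct crypto_skcipher *tfm =
            crypto_alloc_skcipher("ctr(aes)", 0, CRYPTO_ALG_ASYNC);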
From patchwork Mon Jul 24 10:28:14 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 108559
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 12/18] crypto: arm64/chacha20 - take may_use_simd() into account
Date: Mon, 24 Jul 2017 11:28:14 +0100
Message-Id: <20170724102820.16534-13-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

To accommodate systems that disallow the use of kernel mode NEON in some circumstances, take the return value of may_use_simd() into account when deciding whether to invoke the C fallback routine.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/chacha20-neon-glue.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

-- 
2.9.3

diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c
index a7cd575ea223..cbdb75d15cd0 100644
--- a/arch/arm64/crypto/chacha20-neon-glue.c
+++ b/arch/arm64/crypto/chacha20-neon-glue.c
@@ -1,7 +1,7 @@
 /*
  * ChaCha20 256-bit cipher algorithm, RFC7539, arm64 NEON functions
  *
- * Copyright (C) 2016 Linaro, Ltd.
+ * Copyright (C) 2016 - 2017 Linaro, Ltd.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -26,6 +26,7 @@
 
 #include <asm/hwcap.h>
 #include <asm/neon.h>
+#include <asm/simd.h>
 
 asmlinkage void chacha20_block_xor_neon(u32 *state, u8 *dst, const u8 *src);
 asmlinkage void chacha20_4block_xor_neon(u32 *state, u8 *dst, const u8 *src);
@@ -64,7 +65,7 @@ static int chacha20_neon(struct skcipher_request *req)
 	u32 state[16];
 	int err;
 
-	if (req->cryptlen <= CHACHA20_BLOCK_SIZE)
+	if (!may_use_simd() || req->cryptlen <= CHACHA20_BLOCK_SIZE)
 		return crypto_chacha20_crypt(req);
 
 	err = skcipher_walk_virt(&walk, req, true);
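crypto_chacha20_crypt() is the generic scalar implementation, and it was already used for requests of at most one 64-byte ChaCha20 block, where the cost of saving and restoring the NEON state is assumed to outweigh the speedup; the patch simply folds the may_use_simd() test into the same early-out. Annotated, the dispatch reads:

    if (!may_use_simd() ||                      /* NEON off limits here  */
        req->cryptlen <= CHACHA20_BLOCK_SIZE)   /* too short to pay off  */
            return crypto_chacha20_crypt(req);  /* generic scalar path   */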
From patchwork Mon Jul 24 10:28:15 2017
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 13/18] crypto: arm64/aes-bs - implement non-SIMD fallback for AES-CTR
Date: Mon, 24 Jul 2017 11:28:15 +0100
Message-Id: <20170724102820.16534-14-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

Of the various chaining modes implemented by the bit-sliced AES driver,
only CTR is exposed as a synchronous cipher, and therefore requires a
fallback in order to remain usable once we update the kernel mode NEON
handling logic to disallow nested use. So wire up the existing CTR
fallback C code.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/Kconfig           |  1 +
 arch/arm64/crypto/aes-neonbs-glue.c | 48 ++++++++++++++++++--
 2 files changed, 44 insertions(+), 5 deletions(-)

-- 
2.9.3

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index a068dcbe2518..f9e264b83366 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -89,6 +89,7 @@ config CRYPTO_AES_ARM64_BS
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_AES_ARM64_NEON_BLK
+	select CRYPTO_AES_ARM64
 	select CRYPTO_SIMD

 endif

diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index 9001aec16007..c55d68ccb89f 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -1,7 +1,7 @@
 /*
  * Bit sliced AES using NEON instructions
  *
- * Copyright (C) 2016 Linaro Ltd
+ * Copyright (C) 2016 - 2017 Linaro Ltd
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
@@ -9,12 +9,15 @@
  */

 #include
+#include
 #include
 #include
 #include
 #include
 #include

+#include "aes-ctr-fallback.h"
+
 MODULE_AUTHOR("Ard Biesheuvel ");
 MODULE_LICENSE("GPL v2");
@@ -58,6 +61,11 @@ struct aesbs_cbc_ctx {
 	u32	enc[AES_MAX_KEYLENGTH_U32];
 };

+struct aesbs_ctr_ctx {
+	struct aesbs_ctx	key;	/* must be first member */
+	struct crypto_aes_ctx	fallback;
+};
+
 struct aesbs_xts_ctx {
 	struct aesbs_ctx	key;
 	u32	twkey[AES_MAX_KEYLENGTH_U32];
 };
@@ -196,6 +204,25 @@ static int cbc_decrypt(struct skcipher_request *req)
 	return err;
 }

+static int aesbs_ctr_setkey_sync(struct crypto_skcipher *tfm, const u8 *in_key,
+				 unsigned int key_len)
+{
+	struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
+	int err;
+
+	err = crypto_aes_expand_key(&ctx->fallback, in_key, key_len);
+	if (err)
+		return err;
+
+	ctx->key.rounds = 6 + key_len / 4;
+
+	kernel_neon_begin();
+	aesbs_convert_key(ctx->key.rk, ctx->fallback.key_enc, ctx->key.rounds);
+	kernel_neon_end();
+
+	return 0;
+}
+
 static int ctr_encrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -259,6 +286,17 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
 	return aesbs_setkey(tfm, in_key, key_len);
 }

+static int ctr_encrypt_sync(struct skcipher_request *req)
+{
+	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+	if (!may_use_simd())
+		return aes_ctr_encrypt_fallback(&ctx->fallback, req);
+
+	return ctr_encrypt(req);
+}
+
 static int __xts_crypt(struct skcipher_request *req,
 		       void (*fn)(u8 out[], u8 const in[], u8 const rk[],
 				  int rounds, int blocks, u8 iv[]))
@@ -355,7 +393,7 @@ static struct skcipher_alg aes_algs[] = { {
 	.base.cra_driver_name	= "ctr-aes-neonbs",
 	.base.cra_priority	= 250 - 1,
 	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct aesbs_ctx),
+	.base.cra_ctxsize	= sizeof(struct aesbs_ctr_ctx),
 	.base.cra_module	= THIS_MODULE,

 	.min_keysize		= AES_MIN_KEY_SIZE,
@@ -363,9 +401,9 @@ static struct skcipher_alg aes_algs[] = { {
 	.chunksize		= AES_BLOCK_SIZE,
 	.walksize		= 8 * AES_BLOCK_SIZE,
 	.ivsize			= AES_BLOCK_SIZE,
-	.setkey			= aesbs_setkey,
-	.encrypt		= ctr_encrypt,
-	.decrypt		= ctr_encrypt,
+	.setkey			= aesbs_ctr_setkey_sync,
+	.encrypt		= ctr_encrypt_sync,
+	.decrypt		= ctr_encrypt_sync,
 }, {
 	.base.cra_name		= "__xts(aes)",
 	.base.cra_driver_name	= "__xts-aes-neonbs",
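
The fallback itself needs nothing beyond the scalar AES core: CTR turns
the block cipher into a stream cipher, so encryption and decryption are
the same keystream XOR. A sketch of what such a fallback boils down to
(illustrative only; the real helper is the aes-ctr-fallback.h code this
patch wires up):

/* Illustration of a scalar CTR fallback, not the kernel's helper. */
static void ctr_fallback_sketch(struct crypto_aes_ctx *ctx, u8 *dst,
				const u8 *src, unsigned int len, u8 *ctrblk)
{
	u8 ks[AES_BLOCK_SIZE];

	while (len > 0) {
		unsigned int n = min_t(unsigned int, len, AES_BLOCK_SIZE);

		/* keystream block = AES-encrypted counter, scalar core */
		__aes_arm64_encrypt(ctx->key_enc, ks, ctrblk,
				    6 + ctx->key_length / 4);
		crypto_xor_cpy(dst, src, ks, n);	/* out = in ^ keystream */
		crypto_inc(ctrblk, AES_BLOCK_SIZE);	/* big-endian increment */

		dst += n;
		src += n;
		len -= n;
	}
}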
From patchwork Mon Jul 24 10:28:16 2017
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 14/18] crypto: arm64/gcm - implement native driver using v8 Crypto Extensions
Date: Mon, 24 Jul 2017 11:28:16 +0100
Message-Id: <20170724102820.16534-15-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

Currently, the AES-GCM implementation for arm64 systems that support the
ARMv8 Crypto Extensions is based on the generic GCM module, which combines
the AES-CTR implementation using AES instructions with the PMULL based
GHASH driver. This is suboptimal, given that the input data needs to be
loaded twice, once for the encryption and again for the MAC calculation.

On Cortex-A57 (r1p2) and other recent cores that implement micro-op fusing
for the AES instructions, AES executes at less than 1 cycle per byte, which
means that any cycles wasted on loading the data twice hurt even more. So
implement a new GCM driver that combines the AES and PMULL instructions at
the block level.
This improves performance on Cortex-A57 by ~37% (from 3.5 cpb to 2.6 cpb) Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/Kconfig | 4 +- arch/arm64/crypto/ghash-ce-core.S | 175 ++++++++ arch/arm64/crypto/ghash-ce-glue.c | 438 ++++++++++++++++++-- 3 files changed, 591 insertions(+), 26 deletions(-) -- 2.9.3 diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index f9e264b83366..7ca54a76f6b9 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -29,10 +29,12 @@ config CRYPTO_SHA2_ARM64_CE select CRYPTO_SHA256_ARM64 config CRYPTO_GHASH_ARM64_CE - tristate "GHASH (for GCM chaining mode) using ARMv8 Crypto Extensions" + tristate "GHASH/AES-GCM using ARMv8 Crypto Extensions" depends on KERNEL_MODE_NEON select CRYPTO_HASH select CRYPTO_GF128MUL + select CRYPTO_AES + select CRYPTO_AES_ARM64 config CRYPTO_CRCT10DIF_ARM64_CE tristate "CRCT10DIF digest algorithm using PMULL instructions" diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce-core.S index f0bb9f0b524f..cb22459eba85 100644 --- a/arch/arm64/crypto/ghash-ce-core.S +++ b/arch/arm64/crypto/ghash-ce-core.S @@ -77,3 +77,178 @@ CPU_LE( rev64 T1.16b, T1.16b ) st1 {XL.2d}, [x1] ret ENDPROC(pmull_ghash_update) + + KS .req v8 + CTR .req v9 + INP .req v10 + + .macro load_round_keys, rounds, rk + cmp \rounds, #12 + blo 2222f /* 128 bits */ + beq 1111f /* 192 bits */ + ld1 {v17.4s-v18.4s}, [\rk], #32 +1111: ld1 {v19.4s-v20.4s}, [\rk], #32 +2222: ld1 {v21.4s-v24.4s}, [\rk], #64 + ld1 {v25.4s-v28.4s}, [\rk], #64 + ld1 {v29.4s-v31.4s}, [\rk] + .endm + + .macro enc_round, state, key + aese \state\().16b, \key\().16b + aesmc \state\().16b, \state\().16b + .endm + + .macro enc_block, state, rounds + cmp \rounds, #12 + b.lo 2222f /* 128 bits */ + b.eq 1111f /* 192 bits */ + enc_round \state, v17 + enc_round \state, v18 +1111: enc_round \state, v19 + enc_round \state, v20 +2222: .irp key, v21, v22, v23, v24, v25, v26, v27, v28, v29 + enc_round \state, \key + .endr + aese \state\().16b, v30.16b + eor \state\().16b, \state\().16b, v31.16b + .endm + + .macro pmull_gcm_do_crypt, enc + ld1 {SHASH.2d}, [x4] + ld1 {XL.2d}, [x1] + ldr x8, [x5, #8] // load lower counter + + movi MASK.16b, #0xe1 + ext SHASH2.16b, SHASH.16b, SHASH.16b, #8 +CPU_LE( rev x8, x8 ) + shl MASK.2d, MASK.2d, #57 + eor SHASH2.16b, SHASH2.16b, SHASH.16b + + .if \enc == 1 + ld1 {KS.16b}, [x7] + .endif + +0: ld1 {CTR.8b}, [x5] // load upper counter + ld1 {INP.16b}, [x3], #16 + rev x9, x8 + add x8, x8, #1 + sub w0, w0, #1 + ins CTR.d[1], x9 // set lower counter + + .if \enc == 1 + eor INP.16b, INP.16b, KS.16b // encrypt input + st1 {INP.16b}, [x2], #16 + .endif + + rev64 T1.16b, INP.16b + + cmp w6, #12 + b.ge 2f // AES-192/256? 
+ +1: enc_round CTR, v21 + + ext T2.16b, XL.16b, XL.16b, #8 + ext IN1.16b, T1.16b, T1.16b, #8 + + enc_round CTR, v22 + + eor T1.16b, T1.16b, T2.16b + eor XL.16b, XL.16b, IN1.16b + + enc_round CTR, v23 + + pmull2 XH.1q, SHASH.2d, XL.2d // a1 * b1 + eor T1.16b, T1.16b, XL.16b + + enc_round CTR, v24 + + pmull XL.1q, SHASH.1d, XL.1d // a0 * b0 + pmull XM.1q, SHASH2.1d, T1.1d // (a1 + a0)(b1 + b0) + + enc_round CTR, v25 + + ext T1.16b, XL.16b, XH.16b, #8 + eor T2.16b, XL.16b, XH.16b + eor XM.16b, XM.16b, T1.16b + + enc_round CTR, v26 + + eor XM.16b, XM.16b, T2.16b + pmull T2.1q, XL.1d, MASK.1d + + enc_round CTR, v27 + + mov XH.d[0], XM.d[1] + mov XM.d[1], XL.d[0] + + enc_round CTR, v28 + + eor XL.16b, XM.16b, T2.16b + + enc_round CTR, v29 + + ext T2.16b, XL.16b, XL.16b, #8 + + aese CTR.16b, v30.16b + + pmull XL.1q, XL.1d, MASK.1d + eor T2.16b, T2.16b, XH.16b + + eor KS.16b, CTR.16b, v31.16b + + eor XL.16b, XL.16b, T2.16b + + .if \enc == 0 + eor INP.16b, INP.16b, KS.16b + st1 {INP.16b}, [x2], #16 + .endif + + cbnz w0, 0b + +CPU_LE( rev x8, x8 ) + st1 {XL.2d}, [x1] + str x8, [x5, #8] // store lower counter + + .if \enc == 1 + st1 {KS.16b}, [x7] + .endif + + ret + +2: b.eq 3f // AES-192? + enc_round CTR, v17 + enc_round CTR, v18 +3: enc_round CTR, v19 + enc_round CTR, v20 + b 1b + .endm + + /* + * void pmull_gcm_encrypt(int blocks, u64 dg[], u8 dst[], const u8 src[], + * struct ghash_key const *k, u8 ctr[], + * int rounds, u8 ks[]) + */ +ENTRY(pmull_gcm_encrypt) + pmull_gcm_do_crypt 1 +ENDPROC(pmull_gcm_encrypt) + + /* + * void pmull_gcm_decrypt(int blocks, u64 dg[], u8 dst[], const u8 src[], + * struct ghash_key const *k, u8 ctr[], + * int rounds) + */ +ENTRY(pmull_gcm_decrypt) + pmull_gcm_do_crypt 0 +ENDPROC(pmull_gcm_decrypt) + + /* + * void pmull_gcm_encrypt_block(u8 dst[], u8 src[], u8 rk[], int rounds) + */ +ENTRY(pmull_gcm_encrypt_block) + cbz x2, 0f + load_round_keys w3, x2 +0: ld1 {v0.16b}, [x1] + enc_block v0, w3 + st1 {v0.16b}, [x0] + ret +ENDPROC(pmull_gcm_encrypt_block) diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c index 30221ef56e70..ee6aaac05905 100644 --- a/arch/arm64/crypto/ghash-ce-glue.c +++ b/arch/arm64/crypto/ghash-ce-glue.c @@ -11,18 +11,25 @@ #include #include #include +#include +#include +#include #include +#include #include +#include +#include #include #include #include -MODULE_DESCRIPTION("GHASH secure hash using ARMv8 Crypto Extensions"); +MODULE_DESCRIPTION("GHASH and AES-GCM using ARMv8 Crypto Extensions"); MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL v2"); #define GHASH_BLOCK_SIZE 16 #define GHASH_DIGEST_SIZE 16 +#define GCM_IV_SIZE 12 struct ghash_key { u64 a; @@ -36,9 +43,27 @@ struct ghash_desc_ctx { u32 count; }; +struct gcm_aes_ctx { + struct crypto_aes_ctx aes_key; + struct ghash_key ghash_key; +}; + asmlinkage void pmull_ghash_update(int blocks, u64 dg[], const char *src, struct ghash_key const *k, const char *head); +asmlinkage void pmull_gcm_encrypt(int blocks, u64 dg[], u8 dst[], + const u8 src[], struct ghash_key const *k, + u8 ctr[], int rounds, u8 ks[]); + +asmlinkage void pmull_gcm_decrypt(int blocks, u64 dg[], u8 dst[], + const u8 src[], struct ghash_key const *k, + u8 ctr[], int rounds); + +asmlinkage void pmull_gcm_encrypt_block(u8 dst[], u8 const src[], + u32 const rk[], int rounds); + +asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds); + static int ghash_init(struct shash_desc *desc) { struct ghash_desc_ctx *ctx = shash_desc_ctx(desc); @@ -130,17 +155,11 @@ static int 
ghash_final(struct shash_desc *desc, u8 *dst) return 0; } -static int ghash_setkey(struct crypto_shash *tfm, - const u8 *inkey, unsigned int keylen) +static int __ghash_setkey(struct ghash_key *key, + const u8 *inkey, unsigned int keylen) { - struct ghash_key *key = crypto_shash_ctx(tfm); u64 a, b; - if (keylen != GHASH_BLOCK_SIZE) { - crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN); - return -EINVAL; - } - /* needed for the fallback */ memcpy(&key->k, inkey, GHASH_BLOCK_SIZE); @@ -157,32 +176,401 @@ static int ghash_setkey(struct crypto_shash *tfm, return 0; } +static int ghash_setkey(struct crypto_shash *tfm, + const u8 *inkey, unsigned int keylen) +{ + struct ghash_key *key = crypto_shash_ctx(tfm); + + if (keylen != GHASH_BLOCK_SIZE) { + crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN); + return -EINVAL; + } + + return __ghash_setkey(key, inkey, keylen); +} + static struct shash_alg ghash_alg = { - .digestsize = GHASH_DIGEST_SIZE, - .init = ghash_init, - .update = ghash_update, - .final = ghash_final, - .setkey = ghash_setkey, - .descsize = sizeof(struct ghash_desc_ctx), - .base = { - .cra_name = "ghash", - .cra_driver_name = "ghash-ce", - .cra_priority = 200, - .cra_flags = CRYPTO_ALG_TYPE_SHASH, - .cra_blocksize = GHASH_BLOCK_SIZE, - .cra_ctxsize = sizeof(struct ghash_key), - .cra_module = THIS_MODULE, - }, + .base.cra_name = "ghash", + .base.cra_driver_name = "ghash-ce", + .base.cra_priority = 200, + .base.cra_flags = CRYPTO_ALG_TYPE_SHASH, + .base.cra_blocksize = GHASH_BLOCK_SIZE, + .base.cra_ctxsize = sizeof(struct ghash_key), + .base.cra_module = THIS_MODULE, + + .digestsize = GHASH_DIGEST_SIZE, + .init = ghash_init, + .update = ghash_update, + .final = ghash_final, + .setkey = ghash_setkey, + .descsize = sizeof(struct ghash_desc_ctx), +}; + +static int num_rounds(struct crypto_aes_ctx *ctx) +{ + /* + * # of rounds specified by AES: + * 128 bit key 10 rounds + * 192 bit key 12 rounds + * 256 bit key 14 rounds + * => n byte key => 6 + (n/4) rounds + */ + return 6 + ctx->key_length / 4; +} + +static int gcm_setkey(struct crypto_aead *tfm, const u8 *inkey, + unsigned int keylen) +{ + struct gcm_aes_ctx *ctx = crypto_aead_ctx(tfm); + u8 key[GHASH_BLOCK_SIZE]; + int ret; + + ret = crypto_aes_expand_key(&ctx->aes_key, inkey, keylen); + if (ret) { + tfm->base.crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN; + return -EINVAL; + } + + __aes_arm64_encrypt(ctx->aes_key.key_enc, key, (u8[AES_BLOCK_SIZE]){}, + num_rounds(&ctx->aes_key)); + + return __ghash_setkey(&ctx->ghash_key, key, sizeof(key)); +} + +static int gcm_setauthsize(struct crypto_aead *tfm, unsigned int authsize) +{ + switch (authsize) { + case 4: + case 8: + case 12 ... 16: + break; + default: + return -EINVAL; + } + return 0; +} + +static void gcm_update_mac(u64 dg[], const u8 *src, int count, u8 buf[], + int *buf_count, struct gcm_aes_ctx *ctx) +{ + if (*buf_count > 0) { + int buf_added = min(count, GHASH_BLOCK_SIZE - *buf_count); + + memcpy(&buf[*buf_count], src, buf_added); + + *buf_count += buf_added; + src += buf_added; + count -= buf_added; + } + + if (count >= GHASH_BLOCK_SIZE || *buf_count == GHASH_BLOCK_SIZE) { + int blocks = count / GHASH_BLOCK_SIZE; + + ghash_do_update(blocks, dg, src, &ctx->ghash_key, + *buf_count ? 
buf : NULL); + + src += blocks * GHASH_BLOCK_SIZE; + count %= GHASH_BLOCK_SIZE; + *buf_count = 0; + } + + if (count > 0) { + memcpy(buf, src, count); + *buf_count = count; + } +} + +static void gcm_calculate_auth_mac(struct aead_request *req, u64 dg[]) +{ + struct crypto_aead *aead = crypto_aead_reqtfm(req); + struct gcm_aes_ctx *ctx = crypto_aead_ctx(aead); + u8 buf[GHASH_BLOCK_SIZE]; + struct scatter_walk walk; + u32 len = req->assoclen; + int buf_count = 0; + + scatterwalk_start(&walk, req->src); + + do { + u32 n = scatterwalk_clamp(&walk, len); + u8 *p; + + if (!n) { + scatterwalk_start(&walk, sg_next(walk.sg)); + n = scatterwalk_clamp(&walk, len); + } + p = scatterwalk_map(&walk); + + gcm_update_mac(dg, p, n, buf, &buf_count, ctx); + len -= n; + + scatterwalk_unmap(p); + scatterwalk_advance(&walk, n); + scatterwalk_done(&walk, 0, len); + } while (len); + + if (buf_count) { + memset(&buf[buf_count], 0, GHASH_BLOCK_SIZE - buf_count); + ghash_do_update(1, dg, buf, &ctx->ghash_key, NULL); + } +} + +static void gcm_final(struct aead_request *req, struct gcm_aes_ctx *ctx, + u64 dg[], u8 tag[], int cryptlen) +{ + u8 mac[AES_BLOCK_SIZE]; + u128 lengths; + + lengths.a = cpu_to_be64(req->assoclen * 8); + lengths.b = cpu_to_be64(cryptlen * 8); + + ghash_do_update(1, dg, (void *)&lengths, &ctx->ghash_key, NULL); + + put_unaligned_be64(dg[1], mac); + put_unaligned_be64(dg[0], mac + 8); + + crypto_xor(tag, mac, AES_BLOCK_SIZE); +} + +static int gcm_encrypt(struct aead_request *req) +{ + struct crypto_aead *aead = crypto_aead_reqtfm(req); + struct gcm_aes_ctx *ctx = crypto_aead_ctx(aead); + struct skcipher_walk walk; + u8 iv[AES_BLOCK_SIZE]; + u8 ks[AES_BLOCK_SIZE]; + u8 tag[AES_BLOCK_SIZE]; + u64 dg[2] = {}; + int err; + + if (req->assoclen) + gcm_calculate_auth_mac(req, dg); + + memcpy(iv, req->iv, GCM_IV_SIZE); + put_unaligned_be32(1, iv + GCM_IV_SIZE); + + if (likely(may_use_simd())) { + kernel_neon_begin(); + + pmull_gcm_encrypt_block(tag, iv, ctx->aes_key.key_enc, + num_rounds(&ctx->aes_key)); + put_unaligned_be32(2, iv + GCM_IV_SIZE); + pmull_gcm_encrypt_block(ks, iv, NULL, + num_rounds(&ctx->aes_key)); + put_unaligned_be32(3, iv + GCM_IV_SIZE); + + err = skcipher_walk_aead_encrypt(&walk, req, true); + + while (walk.nbytes >= AES_BLOCK_SIZE) { + int blocks = walk.nbytes / AES_BLOCK_SIZE; + + pmull_gcm_encrypt(blocks, dg, walk.dst.virt.addr, + walk.src.virt.addr, &ctx->ghash_key, + iv, num_rounds(&ctx->aes_key), ks); + + err = skcipher_walk_done(&walk, + walk.nbytes % AES_BLOCK_SIZE); + } + kernel_neon_end(); + } else { + __aes_arm64_encrypt(ctx->aes_key.key_enc, tag, iv, + num_rounds(&ctx->aes_key)); + put_unaligned_be32(2, iv + GCM_IV_SIZE); + + err = skcipher_walk_aead_encrypt(&walk, req, true); + + while (walk.nbytes >= AES_BLOCK_SIZE) { + int blocks = walk.nbytes / AES_BLOCK_SIZE; + u8 *dst = walk.dst.virt.addr; + u8 *src = walk.src.virt.addr; + + do { + __aes_arm64_encrypt(ctx->aes_key.key_enc, + ks, iv, + num_rounds(&ctx->aes_key)); + crypto_xor_cpy(dst, src, ks, AES_BLOCK_SIZE); + crypto_inc(iv, AES_BLOCK_SIZE); + + dst += AES_BLOCK_SIZE; + src += AES_BLOCK_SIZE; + } while (--blocks > 0); + + ghash_do_update(walk.nbytes / AES_BLOCK_SIZE, dg, + walk.dst.virt.addr, &ctx->ghash_key, + NULL); + + err = skcipher_walk_done(&walk, + walk.nbytes % AES_BLOCK_SIZE); + } + if (walk.nbytes) + __aes_arm64_encrypt(ctx->aes_key.key_enc, ks, iv, + num_rounds(&ctx->aes_key)); + } + + /* handle the tail */ + if (walk.nbytes) { + u8 buf[GHASH_BLOCK_SIZE]; + + crypto_xor_cpy(walk.dst.virt.addr, 
walk.src.virt.addr, ks, + walk.nbytes); + + memcpy(buf, walk.dst.virt.addr, walk.nbytes); + memset(buf + walk.nbytes, 0, GHASH_BLOCK_SIZE - walk.nbytes); + ghash_do_update(1, dg, buf, &ctx->ghash_key, NULL); + + err = skcipher_walk_done(&walk, 0); + } + + if (err) + return err; + + gcm_final(req, ctx, dg, tag, req->cryptlen); + + /* copy authtag to end of dst */ + scatterwalk_map_and_copy(tag, req->dst, req->assoclen + req->cryptlen, + crypto_aead_authsize(aead), 1); + + return 0; +} + +static int gcm_decrypt(struct aead_request *req) +{ + struct crypto_aead *aead = crypto_aead_reqtfm(req); + struct gcm_aes_ctx *ctx = crypto_aead_ctx(aead); + unsigned int authsize = crypto_aead_authsize(aead); + struct skcipher_walk walk; + u8 iv[AES_BLOCK_SIZE]; + u8 tag[AES_BLOCK_SIZE]; + u8 buf[GHASH_BLOCK_SIZE]; + u64 dg[2] = {}; + int err; + + if (req->assoclen) + gcm_calculate_auth_mac(req, dg); + + memcpy(iv, req->iv, GCM_IV_SIZE); + put_unaligned_be32(1, iv + GCM_IV_SIZE); + + if (likely(may_use_simd())) { + kernel_neon_begin(); + + pmull_gcm_encrypt_block(tag, iv, ctx->aes_key.key_enc, + num_rounds(&ctx->aes_key)); + put_unaligned_be32(2, iv + GCM_IV_SIZE); + + err = skcipher_walk_aead_decrypt(&walk, req, true); + + while (walk.nbytes >= AES_BLOCK_SIZE) { + int blocks = walk.nbytes / AES_BLOCK_SIZE; + + pmull_gcm_decrypt(blocks, dg, walk.dst.virt.addr, + walk.src.virt.addr, &ctx->ghash_key, + iv, num_rounds(&ctx->aes_key)); + + err = skcipher_walk_done(&walk, + walk.nbytes % AES_BLOCK_SIZE); + } + if (walk.nbytes) + pmull_gcm_encrypt_block(iv, iv, NULL, + num_rounds(&ctx->aes_key)); + + kernel_neon_end(); + } else { + __aes_arm64_encrypt(ctx->aes_key.key_enc, tag, iv, + num_rounds(&ctx->aes_key)); + put_unaligned_be32(2, iv + GCM_IV_SIZE); + + err = skcipher_walk_aead_decrypt(&walk, req, true); + + while (walk.nbytes >= AES_BLOCK_SIZE) { + int blocks = walk.nbytes / AES_BLOCK_SIZE; + u8 *dst = walk.dst.virt.addr; + u8 *src = walk.src.virt.addr; + + ghash_do_update(blocks, dg, walk.src.virt.addr, + &ctx->ghash_key, NULL); + + do { + __aes_arm64_encrypt(ctx->aes_key.key_enc, + buf, iv, + num_rounds(&ctx->aes_key)); + crypto_xor_cpy(dst, src, buf, AES_BLOCK_SIZE); + crypto_inc(iv, AES_BLOCK_SIZE); + + dst += AES_BLOCK_SIZE; + src += AES_BLOCK_SIZE; + } while (--blocks > 0); + + err = skcipher_walk_done(&walk, + walk.nbytes % AES_BLOCK_SIZE); + } + if (walk.nbytes) + __aes_arm64_encrypt(ctx->aes_key.key_enc, iv, iv, + num_rounds(&ctx->aes_key)); + } + + /* handle the tail */ + if (walk.nbytes) { + memcpy(buf, walk.src.virt.addr, walk.nbytes); + memset(buf + walk.nbytes, 0, GHASH_BLOCK_SIZE - walk.nbytes); + ghash_do_update(1, dg, buf, &ctx->ghash_key, NULL); + + crypto_xor_cpy(walk.dst.virt.addr, walk.src.virt.addr, iv, + walk.nbytes); + + err = skcipher_walk_done(&walk, 0); + } + + if (err) + return err; + + gcm_final(req, ctx, dg, tag, req->cryptlen - authsize); + + /* compare calculated auth tag with the stored one */ + scatterwalk_map_and_copy(buf, req->src, + req->assoclen + req->cryptlen - authsize, + authsize, 0); + + if (crypto_memneq(tag, buf, authsize)) + return -EBADMSG; + return 0; +} + +static struct aead_alg gcm_aes_alg = { + .ivsize = GCM_IV_SIZE, + .chunksize = AES_BLOCK_SIZE, + .maxauthsize = AES_BLOCK_SIZE, + .setkey = gcm_setkey, + .setauthsize = gcm_setauthsize, + .encrypt = gcm_encrypt, + .decrypt = gcm_decrypt, + + .base.cra_name = "gcm(aes)", + .base.cra_driver_name = "gcm-aes-ce", + .base.cra_priority = 300, + .base.cra_blocksize = 1, + .base.cra_ctxsize = sizeof(struct 
gcm_aes_ctx), + .base.cra_module = THIS_MODULE, }; static int __init ghash_ce_mod_init(void) { - return crypto_register_shash(&ghash_alg); + int ret; + + ret = crypto_register_aead(&gcm_aes_alg); + if (ret) + return ret; + + ret = crypto_register_shash(&ghash_alg); + if (ret) + crypto_unregister_aead(&gcm_aes_alg); + return ret; } static void __exit ghash_ce_mod_exit(void) { crypto_unregister_shash(&ghash_alg); + crypto_unregister_aead(&gcm_aes_alg); } module_cpu_feature_match(PMULL, ghash_ce_mod_init);
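
For reference, the product that both the PMULL code above and the generic
driver compute is multiplication in GF(2^128) modulo the GHASH polynomial
x^128 + x^7 + x^2 + x + 1, in GHASH's reflected bit order. A bit-serial C
model of that multiply (illustration only, per NIST SP 800-38D; note that
0xe1 below is the same reduction constant the assembler loads with
movi MASK.16b, #0xe1):

#include <stdint.h>
#include <string.h>

/* z = x * y in GF(2^128), GHASH bit order; reference model only. */
static void gf128_mul(uint8_t z[16], const uint8_t x[16], const uint8_t y[16])
{
	uint8_t v[16];
	int i, j;

	memset(z, 0, 16);
	memcpy(v, y, 16);

	for (i = 0; i < 128; i++) {
		/* if bit i of x (MSB of byte 0 first) is set, accumulate v */
		if (x[i / 8] & (0x80 >> (i % 8))) {
			for (j = 0; j < 16; j++)
				z[j] ^= v[j];
		}

		/* multiply v by x: shift right in GHASH bit order, reducing
		 * by x^128 + x^7 + x^2 + x + 1 when a bit falls off the end */
		j = v[15] & 1;
		for (i = i, j = j; 0; )
			;	/* (no-op; keeps j's role explicit) */
		{
			uint8_t carry = v[15] & 1;
			for (j = 15; j > 0; j--)
				v[j] = (v[j] >> 1) | (v[j - 1] << 7);
			v[0] >>= 1;
			if (carry)
				v[0] ^= 0xe1;	/* reduction constant */
		}
	}
}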
From patchwork Mon Jul 24 10:28:17 2017
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 15/18] crypto: arm/ghash - add NEON accelerated fallback for vmull.p64
Date: Mon, 24 Jul 2017 11:28:17 +0100
Message-Id: <20170724102820.16534-16-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

Implement a NEON fallback for systems that do support NEON but have no
support for the optional 64x64->128 polynomial multiplication instruction
that is part of the ARMv8 Crypto Extensions.
It is based on the paper "Fast Software Polynomial Multiplication on ARM Processors Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and Ricardo Dahab (https://hal.inria.fr/hal-01506572) On a 32-bit guest executing under KVM on a Cortex-A57, the new code is not only 4x faster than the generic table based GHASH driver, it is also time invariant. (Note that the existing vmull.p64 code is 16x faster on this core). Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/Kconfig | 5 +- arch/arm/crypto/ghash-ce-core.S | 234 ++++++++++++++++---- arch/arm/crypto/ghash-ce-glue.c | 24 +- 3 files changed, 215 insertions(+), 48 deletions(-) -- 2.9.3 diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig index b9adedcc5b2e..ec72752d5668 100644 --- a/arch/arm/crypto/Kconfig +++ b/arch/arm/crypto/Kconfig @@ -94,14 +94,15 @@ config CRYPTO_AES_ARM_CE ARMv8 Crypto Extensions config CRYPTO_GHASH_ARM_CE - tristate "PMULL-accelerated GHASH using ARMv8 Crypto Extensions" + tristate "PMULL-accelerated GHASH using NEON/ARMv8 Crypto Extensions" depends on KERNEL_MODE_NEON select CRYPTO_HASH select CRYPTO_CRYPTD help Use an implementation of GHASH (used by the GCM AEAD chaining mode) that uses the 64x64 to 128 bit polynomial multiplication (vmull.p64) - that is part of the ARMv8 Crypto Extensions + that is part of the ARMv8 Crypto Extensions, or a slower variant that + uses the vmull.p8 instruction that is part of the basic NEON ISA. config CRYPTO_CRCT10DIF_ARM_CE tristate "CRCT10DIF digest algorithm using PMULL instructions" diff --git a/arch/arm/crypto/ghash-ce-core.S b/arch/arm/crypto/ghash-ce-core.S index f6ab8bcc9efe..2f78c10b1881 100644 --- a/arch/arm/crypto/ghash-ce-core.S +++ b/arch/arm/crypto/ghash-ce-core.S @@ -1,7 +1,7 @@ /* - * Accelerated GHASH implementation with ARMv8 vmull.p64 instructions. + * Accelerated GHASH implementation with NEON/ARMv8 vmull.p8/64 instructions. * - * Copyright (C) 2015 Linaro Ltd. + * Copyright (C) 2015 - 2017 Linaro Ltd. 
* * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License version 2 as published @@ -12,40 +12,162 @@ #include SHASH .req q0 - SHASH2 .req q1 - T1 .req q2 - T2 .req q3 - MASK .req q4 - XL .req q5 - XM .req q6 - XH .req q7 - IN1 .req q7 + T1 .req q1 + XL .req q2 + XM .req q3 + XH .req q4 + IN1 .req q4 SHASH_L .req d0 SHASH_H .req d1 - SHASH2_L .req d2 - T1_L .req d4 - MASK_L .req d8 - XL_L .req d10 - XL_H .req d11 - XM_L .req d12 - XM_H .req d13 - XH_L .req d14 + T1_L .req d2 + T1_H .req d3 + XL_L .req d4 + XL_H .req d5 + XM_L .req d6 + XM_H .req d7 + XH_L .req d8 + + t0l .req d10 + t0h .req d11 + t1l .req d12 + t1h .req d13 + t2l .req d14 + t2h .req d15 + t3l .req d16 + t3h .req d17 + t4l .req d18 + t4h .req d19 + + t0q .req q5 + t1q .req q6 + t2q .req q7 + t3q .req q8 + t4q .req q9 + T2 .req q9 + + s1l .req d20 + s1h .req d21 + s2l .req d22 + s2h .req d23 + s3l .req d24 + s3h .req d25 + s4l .req d26 + s4h .req d27 + + MASK .req d28 + SHASH2_p8 .req d28 + + k16 .req d29 + k32 .req d30 + k48 .req d31 + SHASH2_p64 .req d31 .text .fpu crypto-neon-fp-armv8 + .macro __pmull_p64, rd, rn, rm, b1, b2, b3, b4 + vmull.p64 \rd, \rn, \rm + .endm + /* - * void pmull_ghash_update(int blocks, u64 dg[], const char *src, - * struct ghash_key const *k, const char *head) + * This implementation of 64x64 -> 128 bit polynomial multiplication + * using vmull.p8 instructions (8x8 -> 16) is taken from the paper + * "Fast Software Polynomial Multiplication on ARM Processors Using + * the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and + * Ricardo Dahab (https://hal.inria.fr/hal-01506572) + * + * It has been slightly tweaked for in-order performance, and to allow + * 'rq' to overlap with 'ad' or 'bd'. */ -ENTRY(pmull_ghash_update) - vld1.64 {SHASH}, [r3] + .macro __pmull_p8, rq, ad, bd, b1=t4l, b2=t3l, b3=t4l, b4=t3l + vext.8 t0l, \ad, \ad, #1 @ A1 + .ifc \b1, t4l + vext.8 t4l, \bd, \bd, #1 @ B1 + .endif + vmull.p8 t0q, t0l, \bd @ F = A1*B + vext.8 t1l, \ad, \ad, #2 @ A2 + vmull.p8 t4q, \ad, \b1 @ E = A*B1 + .ifc \b2, t3l + vext.8 t3l, \bd, \bd, #2 @ B2 + .endif + vmull.p8 t1q, t1l, \bd @ H = A2*B + vext.8 t2l, \ad, \ad, #3 @ A3 + vmull.p8 t3q, \ad, \b2 @ G = A*B2 + veor t0q, t0q, t4q @ L = E + F + .ifc \b3, t4l + vext.8 t4l, \bd, \bd, #3 @ B3 + .endif + vmull.p8 t2q, t2l, \bd @ J = A3*B + veor t0l, t0l, t0h @ t0 = (L) (P0 + P1) << 8 + veor t1q, t1q, t3q @ M = G + H + .ifc \b4, t3l + vext.8 t3l, \bd, \bd, #4 @ B4 + .endif + vmull.p8 t4q, \ad, \b3 @ I = A*B3 + veor t1l, t1l, t1h @ t1 = (M) (P2 + P3) << 16 + vmull.p8 t3q, \ad, \b4 @ K = A*B4 + vand t0h, t0h, k48 + vand t1h, t1h, k32 + veor t2q, t2q, t4q @ N = I + J + veor t0l, t0l, t0h + veor t1l, t1l, t1h + veor t2l, t2l, t2h @ t2 = (N) (P4 + P5) << 24 + vand t2h, t2h, k16 + veor t3l, t3l, t3h @ t3 = (K) (P6 + P7) << 32 + vmov.i64 t3h, #0 + vext.8 t0q, t0q, t0q, #15 + veor t2l, t2l, t2h + vext.8 t1q, t1q, t1q, #14 + vmull.p8 \rq, \ad, \bd @ D = A*B + vext.8 t2q, t2q, t2q, #13 + vext.8 t3q, t3q, t3q, #12 + veor t0q, t0q, t1q + veor t2q, t2q, t3q + veor \rq, \rq, t0q + veor \rq, \rq, t2q + .endm + + // + // PMULL (64x64->128) based reduction for CPUs that can do + // it in a single instruction. 
+ // + .macro __pmull_reduce_p64 + vmull.p64 T1, XL_L, MASK + + veor XH_L, XH_L, XM_H + vext.8 T1, T1, T1, #8 + veor XL_H, XL_H, XM_L + veor T1, T1, XL + + vmull.p64 XL, T1_H, MASK + .endm + + // + // Alternative reduction for CPUs that lack support for the + // 64x64->128 PMULL instruction + // + .macro __pmull_reduce_p8 + veor XL_H, XL_H, XM_L + veor XH_L, XH_L, XM_H + + vshl.i64 T1, XL, #57 + vshl.i64 T2, XL, #62 + veor T1, T1, T2 + vshl.i64 T2, XL, #63 + veor T1, T1, T2 + veor XL_H, XL_H, T1_L + veor XH_L, XH_L, T1_H + + vshr.u64 T1, XL, #1 + veor XH, XH, XL + veor XL, XL, T1 + vshr.u64 T1, T1, #6 + vshr.u64 XL, XL, #1 + .endm + + .macro ghash_update, pn vld1.64 {XL}, [r1] - vmov.i8 MASK, #0xe1 - vext.8 SHASH2, SHASH, SHASH, #8 - vshl.u64 MASK, MASK, #57 - veor SHASH2, SHASH2, SHASH /* do the head block first, if supplied */ ldr ip, [sp] @@ -62,33 +184,59 @@ ENTRY(pmull_ghash_update) #ifndef CONFIG_CPU_BIG_ENDIAN vrev64.8 T1, T1 #endif - vext.8 T2, XL, XL, #8 vext.8 IN1, T1, T1, #8 - veor T1, T1, T2 + veor T1_L, T1_L, XL_H veor XL, XL, IN1 - vmull.p64 XH, SHASH_H, XL_H @ a1 * b1 + __pmull_\pn XH, XL_H, SHASH_H, s1h, s2h, s3h, s4h @ a1 * b1 veor T1, T1, XL - vmull.p64 XL, SHASH_L, XL_L @ a0 * b0 - vmull.p64 XM, SHASH2_L, T1_L @ (a1 + a0)(b1 + b0) + __pmull_\pn XL, XL_L, SHASH_L, s1l, s2l, s3l, s4l @ a0 * b0 + __pmull_\pn XM, T1_L, SHASH2_\pn @ (a1+a0)(b1+b0) - vext.8 T1, XL, XH, #8 - veor T2, XL, XH + veor T1, XL, XH veor XM, XM, T1 - veor XM, XM, T2 - vmull.p64 T2, XL_L, MASK_L - vmov XH_L, XM_H - vmov XM_H, XL_L + __pmull_reduce_\pn - veor XL, XM, T2 - vext.8 T2, XL, XL, #8 - vmull.p64 XL, XL_L, MASK_L - veor T2, T2, XH - veor XL, XL, T2 + veor T1, T1, XH + veor XL, XL, T1 bne 0b vst1.64 {XL}, [r1] bx lr -ENDPROC(pmull_ghash_update) + .endm + + /* + * void pmull_ghash_update(int blocks, u64 dg[], const char *src, + * struct ghash_key const *k, const char *head) + */ +ENTRY(pmull_ghash_update_p64) + vld1.64 {SHASH}, [r3] + veor SHASH2_p64, SHASH_L, SHASH_H + + vmov.i8 MASK, #0xe1 + vshl.u64 MASK, MASK, #57 + + ghash_update p64 +ENDPROC(pmull_ghash_update_p64) + +ENTRY(pmull_ghash_update_p8) + vld1.64 {SHASH}, [r3] + veor SHASH2_p8, SHASH_L, SHASH_H + + vext.8 s1l, SHASH_L, SHASH_L, #1 + vext.8 s2l, SHASH_L, SHASH_L, #2 + vext.8 s3l, SHASH_L, SHASH_L, #3 + vext.8 s4l, SHASH_L, SHASH_L, #4 + vext.8 s1h, SHASH_H, SHASH_H, #1 + vext.8 s2h, SHASH_H, SHASH_H, #2 + vext.8 s3h, SHASH_H, SHASH_H, #3 + vext.8 s4h, SHASH_H, SHASH_H, #4 + + vmov.i64 k16, #0xffff + vmov.i64 k32, #0xffffffff + vmov.i64 k48, #0xffffffffffff + + ghash_update p8 +ENDPROC(pmull_ghash_update_p8) diff --git a/arch/arm/crypto/ghash-ce-glue.c b/arch/arm/crypto/ghash-ce-glue.c index 6bac8bea9f1e..d9bb52cae2ac 100644 --- a/arch/arm/crypto/ghash-ce-glue.c +++ b/arch/arm/crypto/ghash-ce-glue.c @@ -22,6 +22,7 @@ MODULE_DESCRIPTION("GHASH secure hash using ARMv8 Crypto Extensions"); MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL v2"); +MODULE_ALIAS_CRYPTO("ghash"); #define GHASH_BLOCK_SIZE 16 #define GHASH_DIGEST_SIZE 16 @@ -41,8 +42,17 @@ struct ghash_async_ctx { struct cryptd_ahash *cryptd_tfm; }; -asmlinkage void pmull_ghash_update(int blocks, u64 dg[], const char *src, - struct ghash_key const *k, const char *head); +asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src, + struct ghash_key const *k, + const char *head); + +asmlinkage void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src, + struct ghash_key const *k, + const char *head); + +static void (*pmull_ghash_update)(int blocks, 
u64 dg[], const char *src, + struct ghash_key const *k, + const char *head); static int ghash_init(struct shash_desc *desc) { @@ -312,6 +322,14 @@ static int __init ghash_ce_mod_init(void) { int err; + if (!(elf_hwcap & HWCAP_NEON)) + return -ENODEV; + + if (elf_hwcap2 & HWCAP2_PMULL) + pmull_ghash_update = pmull_ghash_update_p64; + else + pmull_ghash_update = pmull_ghash_update_p8; + err = crypto_register_shash(&ghash_alg); if (err) return err; @@ -332,5 +350,5 @@ static void __exit ghash_ce_mod_exit(void) { crypto_unregister_shash(&ghash_alg); } -module_cpu_feature_match(PMULL, ghash_ce_mod_init); +module_init(ghash_ce_mod_init); module_exit(ghash_ce_mod_exit);
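
The trick from the Camara/Gouvea/Lopez/Dahab paper is to synthesize the
missing 64x64->128 carryless multiply out of the 8x8->16 vmull.p8 products
that baseline NEON does provide. Stripped of the vectorization, the
construction is plain schoolbook multiplication over GF(2); a C model
under those assumptions (clmul8/clmul64 are illustrative names, not
kernel symbols):

#include <stdint.h>

/* 8x8 -> 16 bit carryless multiply: the primitive vmull.p8 provides. */
static uint16_t clmul8(uint8_t a, uint8_t b)
{
	uint16_t r = 0;
	int i;

	for (i = 0; i < 8; i++)
		if (b & (1 << i))
			r ^= (uint16_t)a << i;
	return r;
}

/* 64x64 -> 128 bit carryless multiply built from byte products. */
static void clmul64(uint64_t a, uint64_t b, uint64_t res[2])
{
	int i, j;

	res[0] = res[1] = 0;
	for (i = 0; i < 8; i++) {
		for (j = 0; j < 8; j++) {
			uint64_t p = clmul8(a >> (8 * i), b >> (8 * j));
			int s = 8 * (i + j);	/* bit offset of partial product */

			if (s < 64) {
				res[0] ^= p << s;
				if (s > 48)		/* spill into high word */
					res[1] ^= p >> (64 - s);
			} else {
				res[1] ^= p << (s - 64);
			}
		}
	}
}

The NEON version batches these partial products vector-wide rather than
byte by byte, and realigns them with ext/veor instead of explicit shifts,
but it computes the same polynomial.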
From patchwork Mon Jul 24 10:28:18 2017
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel
Subject: [PATCH resend 16/18] crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL
Date: Mon, 24 Jul 2017 11:28:18 +0100
Message-Id: <20170724102820.16534-17-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>

Implement a NEON fallback for systems that do support NEON but have no
support for the optional 64x64->128 polynomial multiplication instruction
that is part of the ARMv8 Crypto Extensions.
It is based on the paper "Fast Software Polynomial Multiplication on ARM Processors Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and Ricardo Dahab (https://hal.inria.fr/hal-01506572), but has been reworked extensively for the AArch64 ISA. On a low-end core such as the Cortex-A53 found in the Raspberry Pi3, the NEON based implementation is 4x faster than the table based one, and is time invariant as well, making it less vulnerable to timing attacks. When combined with the bit-sliced NEON implementation of AES-CTR, the AES-GCM performance increases by 2x (from 58 to 29 cycles per byte). Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/ghash-ce-core.S | 248 +++++++++++++++++--- arch/arm64/crypto/ghash-ce-glue.c | 40 +++- 2 files changed, 252 insertions(+), 36 deletions(-) -- 2.9.3 diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce-core.S index cb22459eba85..11ebf1ae248a 100644 --- a/arch/arm64/crypto/ghash-ce-core.S +++ b/arch/arm64/crypto/ghash-ce-core.S @@ -1,7 +1,7 @@ /* * Accelerated GHASH implementation with ARMv8 PMULL instructions. * - * Copyright (C) 2014 Linaro Ltd. + * Copyright (C) 2014 - 2017 Linaro Ltd. * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License version 2 as published @@ -11,31 +11,215 @@ #include #include - SHASH .req v0 - SHASH2 .req v1 - T1 .req v2 - T2 .req v3 - MASK .req v4 - XL .req v5 - XM .req v6 - XH .req v7 - IN1 .req v7 + SHASH .req v0 + SHASH2 .req v1 + T1 .req v2 + T2 .req v3 + MASK .req v4 + XL .req v5 + XM .req v6 + XH .req v7 + IN1 .req v7 + + k00_16 .req v8 + k32_48 .req v9 + + t3 .req v10 + t4 .req v11 + t5 .req v12 + t6 .req v13 + t7 .req v14 + t8 .req v15 + t9 .req v16 + + perm1 .req v17 + perm2 .req v18 + perm3 .req v19 + + sh1 .req v20 + sh2 .req v21 + sh3 .req v22 + sh4 .req v23 + + ss1 .req v24 + ss2 .req v25 + ss3 .req v26 + ss4 .req v27 .text .arch armv8-a+crypto - /* - * void pmull_ghash_update(int blocks, u64 dg[], const char *src, - * struct ghash_key const *k, const char *head) - */ -ENTRY(pmull_ghash_update) + .macro __pmull_p64, rd, rn, rm + pmull \rd\().1q, \rn\().1d, \rm\().1d + .endm + + .macro __pmull2_p64, rd, rn, rm + pmull2 \rd\().1q, \rn\().2d, \rm\().2d + .endm + + .macro __pmull_p8, rq, ad, bd + ext t3.8b, \ad\().8b, \ad\().8b, #1 // A1 + ext t5.8b, \ad\().8b, \ad\().8b, #2 // A2 + ext t7.8b, \ad\().8b, \ad\().8b, #3 // A3 + + __pmull_p8_\bd \rq, \ad + .endm + + .macro __pmull2_p8, rq, ad, bd + tbl t3.16b, {\ad\().16b}, perm1.16b // A1 + tbl t5.16b, {\ad\().16b}, perm2.16b // A2 + tbl t7.16b, {\ad\().16b}, perm3.16b // A3 + + __pmull2_p8_\bd \rq, \ad + .endm + + .macro __pmull_p8_SHASH, rq, ad + __pmull_p8_tail \rq, \ad\().8b, SHASH.8b, 8b,, sh1, sh2, sh3, sh4 + .endm + + .macro __pmull_p8_SHASH2, rq, ad + __pmull_p8_tail \rq, \ad\().8b, SHASH2.8b, 8b,, ss1, ss2, ss3, ss4 + .endm + + .macro __pmull2_p8_SHASH, rq, ad + __pmull_p8_tail \rq, \ad\().16b, SHASH.16b, 16b, 2, sh1, sh2, sh3, sh4 + .endm + + .macro __pmull_p8_tail, rq, ad, bd, nb, t, b1, b2, b3, b4 + pmull\t t3.8h, t3.\nb, \bd // F = A1*B + pmull\t t4.8h, \ad, \b1\().\nb // E = A*B1 + pmull\t t5.8h, t5.\nb, \bd // H = A2*B + pmull\t t6.8h, \ad, \b2\().\nb // G = A*B2 + pmull\t t7.8h, t7.\nb, \bd // J = A3*B + pmull\t t8.8h, \ad, \b3\().\nb // I = A*B3 + pmull\t t9.8h, \ad, \b4\().\nb // K = A*B4 + pmull\t \rq\().8h, \ad, \bd // D = A*B + + eor t3.16b, t3.16b, t4.16b // L = E + F + eor t5.16b, t5.16b, t6.16b // M = G + H + eor t7.16b, t7.16b, 
t8.16b // N = I + J + + uzp1 t4.2d, t3.2d, t5.2d + uzp2 t3.2d, t3.2d, t5.2d + uzp1 t6.2d, t7.2d, t9.2d + uzp2 t7.2d, t7.2d, t9.2d + + // t3 = (L) (P0 + P1) << 8 + // t5 = (M) (P2 + P3) << 16 + eor t4.16b, t4.16b, t3.16b + and t3.16b, t3.16b, k32_48.16b + + // t7 = (N) (P4 + P5) << 24 + // t9 = (K) (P6 + P7) << 32 + eor t6.16b, t6.16b, t7.16b + and t7.16b, t7.16b, k00_16.16b + + eor t4.16b, t4.16b, t3.16b + eor t6.16b, t6.16b, t7.16b + + zip2 t5.2d, t4.2d, t3.2d + zip1 t3.2d, t4.2d, t3.2d + zip2 t9.2d, t6.2d, t7.2d + zip1 t7.2d, t6.2d, t7.2d + + ext t3.16b, t3.16b, t3.16b, #15 + ext t5.16b, t5.16b, t5.16b, #14 + ext t7.16b, t7.16b, t7.16b, #13 + ext t9.16b, t9.16b, t9.16b, #12 + + eor t3.16b, t3.16b, t5.16b + eor t7.16b, t7.16b, t9.16b + eor \rq\().16b, \rq\().16b, t3.16b + eor \rq\().16b, \rq\().16b, t7.16b + .endm + + .macro __pmull_pre_p64 + movi MASK.16b, #0xe1 + shl MASK.2d, MASK.2d, #57 + .endm + + .macro __pmull_pre_p8 + // k00_16 := 0x0000000000000000_000000000000ffff + // k32_48 := 0x00000000ffffffff_0000ffffffffffff + movi k32_48.2d, #0xffffffff + mov k32_48.h[2], k32_48.h[0] + ushr k00_16.2d, k32_48.2d, #32 + + // prepare the permutation vectors + mov_q x5, 0x080f0e0d0c0b0a09 + movi T1.8b, #8 + dup perm1.2d, x5 + eor perm1.16b, perm1.16b, T1.16b + ushr perm2.2d, perm1.2d, #8 + ushr perm3.2d, perm1.2d, #16 + ushr T1.2d, perm1.2d, #24 + sli perm2.2d, perm1.2d, #56 + sli perm3.2d, perm1.2d, #48 + sli T1.2d, perm1.2d, #40 + + // precompute loop invariants + tbl sh1.16b, {SHASH.16b}, perm1.16b + tbl sh2.16b, {SHASH.16b}, perm2.16b + tbl sh3.16b, {SHASH.16b}, perm3.16b + tbl sh4.16b, {SHASH.16b}, T1.16b + ext ss1.8b, SHASH2.8b, SHASH2.8b, #1 + ext ss2.8b, SHASH2.8b, SHASH2.8b, #2 + ext ss3.8b, SHASH2.8b, SHASH2.8b, #3 + ext ss4.8b, SHASH2.8b, SHASH2.8b, #4 + .endm + + // + // PMULL (64x64->128) based reduction for CPUs that can do + // it in a single instruction. 
+ // + .macro __pmull_reduce_p64 + pmull T2.1q, XL.1d, MASK.1d + eor XM.16b, XM.16b, T1.16b + + mov XH.d[0], XM.d[1] + mov XM.d[1], XL.d[0] + + eor XL.16b, XM.16b, T2.16b + ext T2.16b, XL.16b, XL.16b, #8 + pmull XL.1q, XL.1d, MASK.1d + .endm + + // + // Alternative reduction for CPUs that lack support for the + // 64x64->128 PMULL instruction + // + .macro __pmull_reduce_p8 + eor XM.16b, XM.16b, T1.16b + + mov XL.d[1], XM.d[0] + mov XH.d[0], XM.d[1] + + shl T1.2d, XL.2d, #57 + shl T2.2d, XL.2d, #62 + eor T2.16b, T2.16b, T1.16b + shl T1.2d, XL.2d, #63 + eor T2.16b, T2.16b, T1.16b + ext T1.16b, XL.16b, XH.16b, #8 + eor T2.16b, T2.16b, T1.16b + + mov XL.d[1], T2.d[0] + mov XH.d[0], T2.d[1] + + ushr T2.2d, XL.2d, #1 + eor XH.16b, XH.16b, XL.16b + eor XL.16b, XL.16b, T2.16b + ushr T2.2d, T2.2d, #6 + ushr XL.2d, XL.2d, #1 + .endm + + .macro __pmull_ghash, pn ld1 {SHASH.2d}, [x3] ld1 {XL.2d}, [x1] - movi MASK.16b, #0xe1 ext SHASH2.16b, SHASH.16b, SHASH.16b, #8 - shl MASK.2d, MASK.2d, #57 eor SHASH2.16b, SHASH2.16b, SHASH.16b + __pmull_pre_\pn + /* do the head block first, if supplied */ cbz x4, 0f ld1 {T1.2d}, [x4] @@ -52,23 +236,17 @@ CPU_LE( rev64 T1.16b, T1.16b ) eor T1.16b, T1.16b, T2.16b eor XL.16b, XL.16b, IN1.16b - pmull2 XH.1q, SHASH.2d, XL.2d // a1 * b1 + __pmull2_\pn XH, XL, SHASH // a1 * b1 eor T1.16b, T1.16b, XL.16b - pmull XL.1q, SHASH.1d, XL.1d // a0 * b0 - pmull XM.1q, SHASH2.1d, T1.1d // (a1 + a0)(b1 + b0) + __pmull_\pn XL, XL, SHASH // a0 * b0 + __pmull_\pn XM, T1, SHASH2 // (a1 + a0)(b1 + b0) - ext T1.16b, XL.16b, XH.16b, #8 eor T2.16b, XL.16b, XH.16b - eor XM.16b, XM.16b, T1.16b + ext T1.16b, XL.16b, XH.16b, #8 eor XM.16b, XM.16b, T2.16b - pmull T2.1q, XL.1d, MASK.1d - mov XH.d[0], XM.d[1] - mov XM.d[1], XL.d[0] + __pmull_reduce_\pn - eor XL.16b, XM.16b, T2.16b - ext T2.16b, XL.16b, XL.16b, #8 - pmull XL.1q, XL.1d, MASK.1d eor T2.16b, T2.16b, XH.16b eor XL.16b, XL.16b, T2.16b @@ -76,7 +254,19 @@ CPU_LE( rev64 T1.16b, T1.16b ) st1 {XL.2d}, [x1] ret -ENDPROC(pmull_ghash_update) + .endm + + /* + * void pmull_ghash_update(int blocks, u64 dg[], const char *src, + * struct ghash_key const *k, const char *head) + */ +ENTRY(pmull_ghash_update_p64) + __pmull_ghash p64 +ENDPROC(pmull_ghash_update_p64) + +ENTRY(pmull_ghash_update_p8) + __pmull_ghash p8 +ENDPROC(pmull_ghash_update_p8) KS .req v8 CTR .req v9 diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c index ee6aaac05905..cfc9c92814fd 100644 --- a/arch/arm64/crypto/ghash-ce-glue.c +++ b/arch/arm64/crypto/ghash-ce-glue.c @@ -26,6 +26,7 @@ MODULE_DESCRIPTION("GHASH and AES-GCM using ARMv8 Crypto Extensions"); MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL v2"); +MODULE_ALIAS_CRYPTO("ghash"); #define GHASH_BLOCK_SIZE 16 #define GHASH_DIGEST_SIZE 16 @@ -48,8 +49,17 @@ struct gcm_aes_ctx { struct ghash_key ghash_key; }; -asmlinkage void pmull_ghash_update(int blocks, u64 dg[], const char *src, - struct ghash_key const *k, const char *head); +asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src, + struct ghash_key const *k, + const char *head); + +asmlinkage void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src, + struct ghash_key const *k, + const char *head); + +static void (*pmull_ghash_update)(int blocks, u64 dg[], const char *src, + struct ghash_key const *k, + const char *head); asmlinkage void pmull_gcm_encrypt(int blocks, u64 dg[], u8 dst[], const u8 src[], struct ghash_key const *k, @@ -557,13 +567,24 @@ static int __init ghash_ce_mod_init(void) { int ret; - 
diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index ee6aaac05905..cfc9c92814fd 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -26,6 +26,7 @@
 MODULE_DESCRIPTION("GHASH and AES-GCM using ARMv8 Crypto Extensions");
 MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
 MODULE_LICENSE("GPL v2");
+MODULE_ALIAS_CRYPTO("ghash");
 
 #define GHASH_BLOCK_SIZE	16
 #define GHASH_DIGEST_SIZE	16
@@ -48,8 +49,17 @@ struct gcm_aes_ctx {
 	struct ghash_key	ghash_key;
 };
 
-asmlinkage void pmull_ghash_update(int blocks, u64 dg[], const char *src,
-				   struct ghash_key const *k, const char *head);
+asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src,
+				       struct ghash_key const *k,
+				       const char *head);
+
+asmlinkage void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src,
+				      struct ghash_key const *k,
+				      const char *head);
+
+static void (*pmull_ghash_update)(int blocks, u64 dg[], const char *src,
+				  struct ghash_key const *k,
+				  const char *head);
 
 asmlinkage void pmull_gcm_encrypt(int blocks, u64 dg[], u8 dst[],
 				  const u8 src[], struct ghash_key const *k,
@@ -557,13 +567,24 @@ static int __init ghash_ce_mod_init(void)
 {
 	int ret;
 
-	ret = crypto_register_aead(&gcm_aes_alg);
-	if (ret)
-		return ret;
+	if (!(elf_hwcap & HWCAP_ASIMD))
+		return -ENODEV;
+
+	if (elf_hwcap & HWCAP_PMULL)
+		pmull_ghash_update = pmull_ghash_update_p64;
+
+	else
+		pmull_ghash_update = pmull_ghash_update_p8;
 
 	ret = crypto_register_shash(&ghash_alg);
 	if (ret)
-		crypto_unregister_aead(&gcm_aes_alg);
+		return ret;
+
+	if (elf_hwcap & HWCAP_PMULL) {
+		ret = crypto_register_aead(&gcm_aes_alg);
+		if (ret)
+			crypto_unregister_shash(&ghash_alg);
+	}
 
 	return ret;
 }
@@ -573,5 +594,10 @@ static void __exit ghash_ce_mod_exit(void)
 	crypto_unregister_aead(&gcm_aes_alg);
 }
 
-module_cpu_feature_match(PMULL, ghash_ce_mod_init);
+static const struct cpu_feature ghash_cpu_feature[] = {
+	{ cpu_feature(PMULL) }, { }
+};
+MODULE_DEVICE_TABLE(cpu, ghash_cpu_feature);
+
+module_init(ghash_ce_mod_init);
 module_exit(ghash_ce_mod_exit);
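The net effect of the init changes above: the module now loads on any
NEON-capable system, backs the ghash shash with the p64 or p8 asm routine
depending on HWCAP_PMULL, and only registers the gcm(aes) AEAD when the
64-bit PMULL instruction exists. The same capability test can be
reproduced from userspace via the hwcap auxv word; a stand-alone probe for
arm64 Linux (assuming <asm/hwcap.h> exposes the HWCAP_* bits, as
glibc-based arm64 systems do — this harness is illustrative, not part of
the patch):

/* cc -o hwcap-probe hwcap-probe.c ; aarch64 Linux only */
#include <stdio.h>
#include <sys/auxv.h>
#include <asm/hwcap.h>

int main(void)
{
	unsigned long hwcap = getauxval(AT_HWCAP);

	if (!(hwcap & HWCAP_ASIMD)) {
		puts("no ASIMD: ghash-ce would refuse to load (-ENODEV)");
		return 1;
	}
	puts(hwcap & HWCAP_PMULL ?
	     "PMULL: p64 GHASH, gcm(aes) registered" :
	     "no PMULL: p8 GHASH fallback, no gcm(aes)");
	return 0;
}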
From patchwork Mon Jul 24 10:28:19 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108564
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 17/18] crypto: arm/aes - avoid expanded lookup tables in the final round
Date: Mon, 24 Jul 2017 11:28:19 +0100
Message-Id: <20170724102820.16534-18-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

For the final round, avoid the expanded and padded lookup tables
exported by the generic AES driver. Instead, for encryption, we can
perform byte loads from the same table we used for the inner rounds,
which will still be hot in the caches. For decryption, use the inverse
AES Sbox directly, which is 4x smaller than the inverse lookup table
exported by the generic driver.
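The "same table" trick works because of how the generic driver lays out
crypto_ft_tab: on a little-endian CPU, each 32-bit entry holds the plain
S-box output in its second byte, so a byte load at offset 4*x + 1 from the
table base returns S[x] while touching only cache lines the inner rounds
have already warmed. A stand-alone illustration (the two table words below
are copied from the generic driver's crypto_ft_tab; the harness itself is
mine and assumes a little-endian host):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* first two entries of the generic driver's forward table */
static const uint32_t ft_tab0[2] = { 0xa56363c6, 0x847c7cf8 };

int main(void)
{
	const uint8_t *t = (const uint8_t *)ft_tab0;

	/* little-endian: byte 4*x + 1 of entry x is the raw S-box value */
	assert(t[4 * 0 + 1] == 0x63);	/* S[0] */
	assert(t[4 * 1 + 1] == 0x7c);	/* S[1] */
	puts("ft_tab + 1, stride 4, is a plain S-box view");
	return 0;
}

This is exactly what the new "do_crypt fround, crypto_ft_tab,
crypto_ft_tab + 1, 2" invocation in the diff below encodes: the last-round
table is the forward table plus one, indexed with a left shift of two.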
This should significantly reduce the Dcache footprint of our code,
which makes the code more robust against timing attacks. It does not
introduce any additional module dependencies, given that we already
rely on the core AES module for the shared key expansion routines.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm/crypto/aes-cipher-core.S | 88 +++++++++++++++-----
 1 file changed, 65 insertions(+), 23 deletions(-)

-- 
2.9.3

diff --git a/arch/arm/crypto/aes-cipher-core.S b/arch/arm/crypto/aes-cipher-core.S
index c817a86c4ca8..54b384084637 100644
--- a/arch/arm/crypto/aes-cipher-core.S
+++ b/arch/arm/crypto/aes-cipher-core.S
@@ -10,6 +10,7 @@
  */
 
 #include <linux/linkage.h>
+#include <asm/cache.h>
 
 	.text
 	.align		5
@@ -32,19 +33,19 @@
 	.endif
 	.endm
 
-	.macro		__load, out, in, idx
+	.macro		__load, out, in, idx, sz, op
 	.if		__LINUX_ARM_ARCH__ < 7 && \idx > 0
-	ldr		\out, [ttab, \in, lsr #(8 * \idx) - 2]
+	ldr\op		\out, [ttab, \in, lsr #(8 * \idx) - \sz]
 	.else
-	ldr		\out, [ttab, \in, lsl #2]
+	ldr\op		\out, [ttab, \in, lsl #\sz]
 	.endif
 	.endm
 
-	.macro		__hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
+	.macro		__hround, out0, out1, in0, in1, in2, in3, t3, t4, enc, sz, op
 	__select	\out0, \in0, 0
 	__select	t0, \in1, 1
-	__load		\out0, \out0, 0
-	__load		t0, t0, 1
+	__load		\out0, \out0, 0, \sz, \op
+	__load		t0, t0, 1, \sz, \op
 
 	.if		\enc
 	__select	\out1, \in1, 0
@@ -53,10 +54,10 @@
 	__select	\out1, \in3, 0
 	__select	t1, \in0, 1
 	.endif
-	__load		\out1, \out1, 0
+	__load		\out1, \out1, 0, \sz, \op
 	__select	t2, \in2, 2
-	__load		t1, t1, 1
-	__load		t2, t2, 2
+	__load		t1, t1, 1, \sz, \op
+	__load		t2, t2, 2, \sz, \op
 
 	eor		\out0, \out0, t0, ror #24
@@ -68,9 +69,9 @@
 	__select	\t3, \in1, 2
 	__select	\t4, \in2, 3
 	.endif
-	__load		\t3, \t3, 2
-	__load		t0, t0, 3
-	__load		\t4, \t4, 3
+	__load		\t3, \t3, 2, \sz, \op
+	__load		t0, t0, 3, \sz, \op
+	__load		\t4, \t4, 3, \sz, \op
 
 	eor		\out1, \out1, t1, ror #24
 	eor		\out0, \out0, t2, ror #16
@@ -82,14 +83,14 @@
 	eor		\out1, \out1, t2
 	.endm
 
-	.macro		fround, out0, out1, out2, out3, in0, in1, in2, in3
-	__hround	\out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
-	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
+	.macro		fround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+	__hround	\out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1, \sz, \op
+	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1, \sz, \op
 	.endm
 
-	.macro		iround, out0, out1, out2, out3, in0, in1, in2, in3
-	__hround	\out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
-	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
+	.macro		iround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+	__hround	\out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0, \sz, \op
+	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0, \sz, \op
 	.endm
 
 	.macro		__rev, out, in
@@ -114,7 +115,7 @@
 	.endif
 	.endm
 
-	.macro		do_crypt, round, ttab, ltab
+	.macro		do_crypt, round, ttab, ltab, bsz
 	push		{r3-r11, lr}
 
 	ldr		r4, [in]
@@ -146,9 +147,12 @@
 
 1:	subs		rounds, rounds, #4
 	\round		r8, r9, r10, r11, r4, r5, r6, r7
-	__adrl		ttab, \ltab, ls
+	bls		2f
 	\round		r4, r5, r6, r7, r8, r9, r10, r11
-	bhi		0b
+	b		0b
+
+2:	__adrl		ttab, \ltab
+	\round		r4, r5, r6, r7, r8, r9, r10, r11, \bsz, b
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
 	__rev		r4, r4
@@ -170,10 +174,48 @@
 	.ltorg
 	.endm
 
+	.align		L1_CACHE_SHIFT
+	.type		__aes_arm_inverse_sbox, %object
+__aes_arm_inverse_sbox:
+	.byte		0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
+	.byte		0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
+	.byte		0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
+	.byte		0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
+	.byte		0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
+	.byte		0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
+	.byte		0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
+	.byte		0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
+	.byte		0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
+	.byte		0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
+	.byte		0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
+	.byte		0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
+	.byte		0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
+	.byte		0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
+	.byte		0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02
+	.byte		0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b
+	.byte		0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea
+	.byte		0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73
+	.byte		0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85
+	.byte		0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e
+	.byte		0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89
+	.byte		0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b
+	.byte		0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20
+	.byte		0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4
+	.byte		0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31
+	.byte		0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f
+	.byte		0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d
+	.byte		0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef
+	.byte		0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0
+	.byte		0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
+	.byte		0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
+	.byte		0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
+	.size		__aes_arm_inverse_sbox, . - __aes_arm_inverse_sbox
+
 ENTRY(__aes_arm_encrypt)
-	do_crypt	fround, crypto_ft_tab, crypto_fl_tab
+	do_crypt	fround, crypto_ft_tab, crypto_ft_tab + 1, 2
 ENDPROC(__aes_arm_encrypt)
 
+	.align		5
 ENTRY(__aes_arm_decrypt)
-	do_crypt	iround, crypto_it_tab, crypto_il_tab
+	do_crypt	iround, crypto_it_tab, __aes_arm_inverse_sbox, 0
 ENDPROC(__aes_arm_decrypt)
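Back-of-the-envelope numbers for the Dcache claim in this patch, assuming
64-byte L1 lines (the arithmetic here is illustrative, not from the
patch): the old last round indexed a 256-entry table of padded 32-bit
words, while the new decryption path indexes 256 single bytes, and the new
encryption path adds no table at all since it reuses ft_tab:

#include <stdio.h>

int main(void)
{
	unsigned line = 64;		/* assumed L1 line size */
	unsigned il_row = 256 * 4;	/* padded 32-bit last-round table */
	unsigned inv_sbox = 256;	/* __aes_arm_inverse_sbox */

	printf("old: %u bytes = %u lines\n", il_row, il_row / line);
	printf("new: %u bytes = %u lines\n", inv_sbox, inv_sbox / line);
	return 0;	/* old: 1024 B = 16 lines, new: 256 B = 4 lines */
}

Fewer distinct lines means fewer key-dependent misses for an attacker to
observe, which is the timing-attack angle of the commit message.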
From patchwork Mon Jul 24 10:28:20 2017
X-Patchwork-Submitter: Ard Biesheuvel <ard.biesheuvel@linaro.org>
X-Patchwork-Id: 108565
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: herbert@gondor.apana.org.au, dave.martin@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH resend 18/18] crypto: arm64/aes - avoid expanded lookup tables in the final round
Date: Mon, 24 Jul 2017 11:28:20 +0100
Message-Id: <20170724102820.16534-19-ard.biesheuvel@linaro.org>
In-Reply-To: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
References: <20170724102820.16534-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

For the final round, avoid the expanded and padded lookup tables
exported by the generic AES driver.
Instead, for encryption, we can
perform byte loads from the same table we used for the inner rounds,
which will still be hot in the caches. For decryption, use the inverse
AES Sbox directly, which is 4x smaller than the inverse lookup table
exported by the generic driver.

This should significantly reduce the Dcache footprint of our code,
which makes the code more robust against timing attacks. It does not
introduce any additional module dependencies, given that we already
rely on the core AES module for the shared key expansion routines.
It also frees up register x18, which is not available as a scratch
register on all platforms, so avoiding it improves the shareability
of this code.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/aes-cipher-core.S | 152 ++++++++++++++------
 1 file changed, 107 insertions(+), 45 deletions(-)

-- 
2.9.3

diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S
index f2f9cc519309..6d2445d603cc 100644
--- a/arch/arm64/crypto/aes-cipher-core.S
+++ b/arch/arm64/crypto/aes-cipher-core.S
@@ -10,6 +10,7 @@
 
 #include <linux/linkage.h>
 #include <asm/assembler.h>
+#include <asm/cache.h>
 
 	.text
 
@@ -17,94 +18,155 @@
 	out		.req	x1
 	in		.req	x2
 	rounds		.req	x3
-	tt		.req	x4
-	lt		.req	x2
+	tt		.req	x2
 
-	.macro		__pair, enc, reg0, reg1, in0, in1e, in1d, shift
+	.macro		__pair1, sz, op, reg0, reg1, in0, in1e, in1d, shift
+	.ifc		\op\shift, b0
+	ubfiz		\reg0, \in0, #2, #8
+	ubfiz		\reg1, \in1e, #2, #8
+	.else
 	ubfx		\reg0, \in0, #\shift, #8
-	.if		\enc
 	ubfx		\reg1, \in1e, #\shift, #8
-	.else
-	ubfx		\reg1, \in1d, #\shift, #8
 	.endif
+
+	/*
+	 * AArch64 cannot do byte size indexed loads from a table containing
+	 * 32-bit quantities, i.e., 'ldrb w12, [tt, w12, uxtw #2]' is not a
+	 * valid instruction. So perform the shift explicitly first for the
+	 * high bytes (the low byte is shifted implicitly by using ubfiz rather
+	 * than ubfx above)
+	 */
+	.ifnc		\op, b
 	ldr		\reg0, [tt, \reg0, uxtw #2]
 	ldr		\reg1, [tt, \reg1, uxtw #2]
+	.else
+	.if		\shift > 0
+	lsl		\reg0, \reg0, #2
+	lsl		\reg1, \reg1, #2
+	.endif
+	ldrb		\reg0, [tt, \reg0, uxtw]
+	ldrb		\reg1, [tt, \reg1, uxtw]
+	.endif
 	.endm
 
-	.macro		__hround, out0, out1, in0, in1, in2, in3, t0, t1, enc
+	.macro		__pair0, sz, op, reg0, reg1, in0, in1e, in1d, shift
+	ubfx		\reg0, \in0, #\shift, #8
+	ubfx		\reg1, \in1d, #\shift, #8
+	ldr\op		\reg0, [tt, \reg0, uxtw #\sz]
+	ldr\op		\reg1, [tt, \reg1, uxtw #\sz]
+	.endm
+
+	.macro		__hround, out0, out1, in0, in1, in2, in3, t0, t1, enc, sz, op
 	ldp		\out0, \out1, [rk], #8
 
-	__pair		\enc, w13, w14, \in0, \in1, \in3, 0
-	__pair		\enc, w15, w16, \in1, \in2, \in0, 8
-	__pair		\enc, w17, w18, \in2, \in3, \in1, 16
-	__pair		\enc, \t0, \t1, \in3, \in0, \in2, 24
-
-	eor		\out0, \out0, w13
-	eor		\out1, \out1, w14
-	eor		\out0, \out0, w15, ror #24
-	eor		\out1, \out1, w16, ror #24
-	eor		\out0, \out0, w17, ror #16
-	eor		\out1, \out1, w18, ror #16
+	__pair\enc	\sz, \op, w12, w13, \in0, \in1, \in3, 0
+	__pair\enc	\sz, \op, w14, w15, \in1, \in2, \in0, 8
+	__pair\enc	\sz, \op, w16, w17, \in2, \in3, \in1, 16
+	__pair\enc	\sz, \op, \t0, \t1, \in3, \in0, \in2, 24
+
+	eor		\out0, \out0, w12
+	eor		\out1, \out1, w13
+	eor		\out0, \out0, w14, ror #24
+	eor		\out1, \out1, w15, ror #24
+	eor		\out0, \out0, w16, ror #16
+	eor		\out1, \out1, w17, ror #16
 	eor		\out0, \out0, \t0, ror #8
 	eor		\out1, \out1, \t1, ror #8
 	.endm
 
-	.macro		fround, out0, out1, out2, out3, in0, in1, in2, in3
-	__hround	\out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
-	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
+	.macro		fround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+	__hround	\out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1, \sz, \op
+	__hround	\out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1, \sz, \op
 	.endm
 
-	.macro		iround, out0, out1, out2, out3, in0, in1, in2, in3
-	__hround	\out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
-	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
+	.macro		iround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+	__hround	\out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0, \sz, \op
+	__hround	\out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0, \sz, \op
 	.endm
 
-	.macro		do_crypt, round, ttab, ltab
-	ldp		w5, w6, [in]
-	ldp		w7, w8, [in, #8]
-	ldp		w9, w10, [rk], #16
-	ldp		w11, w12, [rk, #-8]
+	.macro		do_crypt, round, ttab, ltab, bsz
+	ldp		w4, w5, [in]
+	ldp		w6, w7, [in, #8]
+	ldp		w8, w9, [rk], #16
+	ldp		w10, w11, [rk, #-8]
 
+CPU_BE(	rev		w4, w4		)
 CPU_BE(	rev		w5, w5		)
 CPU_BE(	rev		w6, w6		)
 CPU_BE(	rev		w7, w7		)
-CPU_BE(	rev		w8, w8		)
 
+	eor		w4, w4, w8
 	eor		w5, w5, w9
 	eor		w6, w6, w10
 	eor		w7, w7, w11
-	eor		w8, w8, w12
 
 	adr_l		tt, \ttab
-	adr_l		lt, \ltab
 
 	tbnz		rounds, #1, 1f
 
-0:	\round		w9, w10, w11, w12, w5, w6, w7, w8
-	\round		w5, w6, w7, w8, w9, w10, w11, w12
+0:	\round		w8, w9, w10, w11, w4, w5, w6, w7
+	\round		w4, w5, w6, w7, w8, w9, w10, w11
 
 1:	subs		rounds, rounds, #4
-	\round		w9, w10, w11, w12, w5, w6, w7, w8
-	csel		tt, tt, lt, hi
-	\round		w5, w6, w7, w8, w9, w10, w11, w12
-	b.hi		0b
-
+	\round		w8, w9, w10, w11, w4, w5, w6, w7
+	b.ls		3f
+2:	\round		w4, w5, w6, w7, w8, w9, w10, w11
+	b		0b
+3:	adr_l		tt, \ltab
+	\round		w4, w5, w6, w7, w8, w9, w10, w11, \bsz, b
+
+CPU_BE(	rev		w4, w4		)
 CPU_BE(	rev		w5, w5		)
 CPU_BE(	rev		w6, w6		)
 CPU_BE(	rev		w7, w7		)
-CPU_BE(	rev		w8, w8		)
 
-	stp		w5, w6, [out]
-	stp		w7, w8, [out, #8]
+	stp		w4, w5, [out]
+	stp		w6, w7, [out, #8]
 	ret
 	.endm
 
-	.align		5
+	.align		L1_CACHE_SHIFT
+	.type		__aes_arm64_inverse_sbox, %object
+__aes_arm64_inverse_sbox:
+	.byte		0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
+	.byte		0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
+	.byte		0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
+	.byte		0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
+	.byte		0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
+	.byte		0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
+	.byte		0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
+	.byte		0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
+	.byte		0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
+	.byte		0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
+	.byte		0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
+	.byte		0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
+	.byte		0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
+	.byte		0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
+	.byte		0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02
+	.byte		0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b
+	.byte		0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea
+	.byte		0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73
+	.byte		0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85
+	.byte		0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e
+	.byte		0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89
+	.byte		0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b
+	.byte		0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20
+	.byte		0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4
+	.byte		0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31
+	.byte		0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f
+	.byte		0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d
+	.byte		0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef
+	.byte		0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0
+	.byte		0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61
+	.byte		0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26
+	.byte		0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
+	.size		__aes_arm64_inverse_sbox, . - __aes_arm64_inverse_sbox
+
 ENTRY(__aes_arm64_encrypt)
-	do_crypt	fround, crypto_ft_tab, crypto_fl_tab
+	do_crypt	fround, crypto_ft_tab, crypto_ft_tab + 1, 2
 ENDPROC(__aes_arm64_encrypt)
 
 	.align		5
 ENTRY(__aes_arm64_decrypt)
-	do_crypt	iround, crypto_it_tab, crypto_il_tab
+	do_crypt	iround, crypto_it_tab, __aes_arm64_inverse_sbox, 0
 ENDPROC(__aes_arm64_decrypt)
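A C rendering of the index arithmetic in the arm64 __pair1 macro above,
since the ubfiz/ubfx pairing is easy to misread (helper names and harness
are mine): ubfiz extracts the low byte and pre-scales it by the 4-byte
stride of the 32-bit table in a single instruction, which is why only the
shift > 0 cases need an explicit lsl before the ldrb.

#include <assert.h>
#include <stdint.h>

/* ubfx wd, wn, #shift, #8 : extract one byte field */
static uint32_t ubfx8(uint32_t in, unsigned shift)
{
	return (in >> shift) & 0xff;
}

/* ubfiz wd, wn, #2, #8 : extract the low byte, shifted left by 2 */
static uint32_t ubfiz8_lsl2(uint32_t in)
{
	return (in & 0xff) << 2;
}

int main(void)
{
	uint32_t col = 0xdeadbeef;

	/* one ubfiz replaces a ubfx + lsl pair for the shift-0 byte */
	assert(ubfiz8_lsl2(col) == ubfx8(col, 0) << 2);
	return 0;
}

The other freed resource named in the commit message, x18, drops out of
the register renumbering in __hround: the temporaries now stop at w17.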