From patchwork Thu Apr 9 10:55:45 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 46933 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-la0-f72.google.com (mail-la0-f72.google.com [209.85.215.72]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 69257218D6 for ; Thu, 9 Apr 2015 10:56:51 +0000 (UTC) Received: by lamp14 with SMTP id p14sf26164192lam.3 for ; Thu, 09 Apr 2015 03:56:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:delivered-to:from:to:cc:subject :date:message-id:in-reply-to:references:sender:precedence:list-id :x-original-sender:x-original-authentication-results:mailing-list :list-post:list-help:list-archive:list-unsubscribe; bh=BSi+9CrVbkXk2G92jTWjURWPhD7j/yuQQTojU3qrLEg=; b=MTsQLVS0yafJcofIxaxAJoLGSZ4nWQpFMdk4j7IRyKWKcspREWumRtOUBPy8DYKOte oXDnperXWU5ADccg/116ta4oDANZtKqvveNtfEcUmJkTwy7kQKqTv5STRsbV3ZnbkY9/ TxSpCu8w2GzvO4ib0wKoDJ7Qz07GwBt2bRaE3S/NYiL4I4bQ0WTx7a8eIYFeUHFs0sIx jRSashW/Z+x4zBFFpDtxrUzHxBSEhUFdyhjfpjykdkohmfiT3lxHoBf+o2SbgNFWgJgD druavoQRRGDZ17rw7S0G/TKJEb/nPj5DiyO/wclXRsmxdVTXU/7dYclAGipyol+VU6u0 eIIA== X-Gm-Message-State: ALoCoQlTjwgU1z/qOJYpP2UYSFAGN9WNMNNKAxmHgKb6ruAGuIh3llxL+A3heA3xJr0NA3fuSjtc X-Received: by 10.112.93.203 with SMTP id cw11mr5784733lbb.0.1428577010347; Thu, 09 Apr 2015 03:56:50 -0700 (PDT) MIME-Version: 1.0 X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.45.6 with SMTP id i6ls299408lam.10.gmail; Thu, 09 Apr 2015 03:56:50 -0700 (PDT) X-Received: by 10.112.47.73 with SMTP id b9mr24111635lbn.46.1428577010197; Thu, 09 Apr 2015 03:56:50 -0700 (PDT) Received: from mail-la0-f53.google.com (mail-la0-f53.google.com. [209.85.215.53]) by mx.google.com with ESMTPS id d1si11156402laa.39.2015.04.09.03.56.50 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 09 Apr 2015 03:56:50 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.53 as permitted sender) client-ip=209.85.215.53; Received: by lagv1 with SMTP id v1so86546463lag.3 for ; Thu, 09 Apr 2015 03:56:50 -0700 (PDT) X-Received: by 10.152.4.72 with SMTP id i8mr3894793lai.29.1428577010087; Thu, 09 Apr 2015 03:56:50 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.67.65 with SMTP id l1csp394222lbt; Thu, 9 Apr 2015 03:56:49 -0700 (PDT) X-Received: by 10.70.109.131 with SMTP id hs3mr53376040pdb.40.1428576997611; Thu, 09 Apr 2015 03:56:37 -0700 (PDT) Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id wy1si20935465pab.171.2015.04.09.03.56.36 for ; Thu, 09 Apr 2015 03:56:37 -0700 (PDT) Received-SPF: none (google.com: linux-crypto-owner@vger.kernel.org does not designate permitted sender hosts) client-ip=209.132.180.67; Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933308AbbDIK4g (ORCPT ); Thu, 9 Apr 2015 06:56:36 -0400 Received: from mail-wg0-f43.google.com ([74.125.82.43]:34110 "EHLO mail-wg0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933284AbbDIK4d (ORCPT ); Thu, 9 Apr 2015 06:56:33 -0400 Received: by wgso17 with SMTP id o17so4064918wgs.1 for ; Thu, 09 Apr 2015 03:56:32 -0700 (PDT) X-Received: by 10.180.109.136 with SMTP id hs8mr5067512wib.73.1428576992363; Thu, 09 Apr 2015 03:56:32 -0700 (PDT) Received: from ards-macbook-pro.local ([90.174.5.113]) by mx.google.com with ESMTPSA id b10sm2192186wiz.9.2015.04.09.03.56.30 (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 09 Apr 2015 03:56:31 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org, herbert@gondor.apana.org.au, samitolvanen@google.com, jussi.kivilinna@iki.fi Cc: stockhausen@collogia.de, Ard Biesheuvel Subject: [PATCH v4 13/16] crypto/arm64: move SHA-224/256 ARMv8 implementation to base layer Date: Thu, 9 Apr 2015 12:55:45 +0200 Message-Id: <1428576948-23863-14-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 1.8.3.2 In-Reply-To: <1428576948-23863-1-git-send-email-ard.biesheuvel@linaro.org> References: <1428576948-23863-1-git-send-email-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: list List-ID: X-Mailing-List: linux-crypto@vger.kernel.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: ard.biesheuvel@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.53 as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 List-Post: , List-Help: , List-Archive: List-Unsubscribe: , This removes all the boilerplate from the existing implementation, and replaces it with calls into the base layer. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/sha2-ce-core.S | 29 +++-- arch/arm64/crypto/sha2-ce-glue.c | 227 +++++++++------------------------------ 2 files changed, 63 insertions(+), 193 deletions(-) diff --git a/arch/arm64/crypto/sha2-ce-core.S b/arch/arm64/crypto/sha2-ce-core.S index 7f29fc031ea8..5df9d9d470ad 100644 --- a/arch/arm64/crypto/sha2-ce-core.S +++ b/arch/arm64/crypto/sha2-ce-core.S @@ -73,8 +73,8 @@ .word 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 /* - * void sha2_ce_transform(int blocks, u8 const *src, u32 *state, - * u8 *head, long bytes) + * void sha2_ce_transform(struct sha256_ce_state *sst, u8 const *src, + * int blocks) */ ENTRY(sha2_ce_transform) /* load round constants */ @@ -85,24 +85,21 @@ ENTRY(sha2_ce_transform) ld1 {v12.4s-v15.4s}, [x8] /* load state */ - ldp dga, dgb, [x2] + ldp dga, dgb, [x0] - /* load partial input (if supplied) */ - cbz x3, 0f - ld1 {v16.4s-v19.4s}, [x3] - b 1f + /* load sha256_ce_state::finalize */ + ldr w4, [x0, #:lo12:sha256_ce_offsetof_finalize] /* load input */ 0: ld1 {v16.4s-v19.4s}, [x1], #64 - sub w0, w0, #1 + sub w2, w2, #1 -1: CPU_LE( rev32 v16.16b, v16.16b ) CPU_LE( rev32 v17.16b, v17.16b ) CPU_LE( rev32 v18.16b, v18.16b ) CPU_LE( rev32 v19.16b, v19.16b ) -2: add t0.4s, v16.4s, v0.4s +1: add t0.4s, v16.4s, v0.4s mov dg0v.16b, dgav.16b mov dg1v.16b, dgbv.16b @@ -131,15 +128,15 @@ CPU_LE( rev32 v19.16b, v19.16b ) add dgbv.4s, dgbv.4s, dg1v.4s /* handled all input blocks? */ - cbnz w0, 0b + cbnz w2, 0b /* * Final block: add padding and total bit count. - * Skip if we have no total byte count in x4. In that case, the input - * size was not a round multiple of the block size, and the padding is - * handled by the C code. + * Skip if the input size was not a round multiple of the block size, + * the padding is handled by the C code in that case. */ cbz x4, 3f + ldr x4, [x0, #:lo12:sha256_ce_offsetof_count] movi v17.2d, #0 mov x8, #0x80000000 movi v18.2d, #0 @@ -148,9 +145,9 @@ CPU_LE( rev32 v19.16b, v19.16b ) mov x4, #0 mov v19.d[0], xzr mov v19.d[1], x7 - b 2b + b 1b /* store new state */ -3: stp dga, dgb, [x2] +3: stp dga, dgb, [x0] ret ENDPROC(sha2_ce_transform) diff --git a/arch/arm64/crypto/sha2-ce-glue.c b/arch/arm64/crypto/sha2-ce-glue.c index ae67e88c28b9..1340e44c048b 100644 --- a/arch/arm64/crypto/sha2-ce-glue.c +++ b/arch/arm64/crypto/sha2-ce-glue.c @@ -12,206 +12,82 @@ #include #include #include +#include #include #include #include +#define ASM_EXPORT(sym, val) \ + asm(".globl " #sym "; .set " #sym ", %0" :: "I"(val)); + MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash using ARMv8 Crypto Extensions"); MODULE_AUTHOR("Ard Biesheuvel "); MODULE_LICENSE("GPL v2"); -asmlinkage int sha2_ce_transform(int blocks, u8 const *src, u32 *state, - u8 *head, long bytes); - -static int sha224_init(struct shash_desc *desc) -{ - struct sha256_state *sctx = shash_desc_ctx(desc); - - *sctx = (struct sha256_state){ - .state = { - SHA224_H0, SHA224_H1, SHA224_H2, SHA224_H3, - SHA224_H4, SHA224_H5, SHA224_H6, SHA224_H7, - } - }; - return 0; -} - -static int sha256_init(struct shash_desc *desc) -{ - struct sha256_state *sctx = shash_desc_ctx(desc); - - *sctx = (struct sha256_state){ - .state = { - SHA256_H0, SHA256_H1, SHA256_H2, SHA256_H3, - SHA256_H4, SHA256_H5, SHA256_H6, SHA256_H7, - } - }; - return 0; -} - -static int sha2_update(struct shash_desc *desc, const u8 *data, - unsigned int len) -{ - struct sha256_state *sctx = shash_desc_ctx(desc); - unsigned int partial = sctx->count % SHA256_BLOCK_SIZE; - - sctx->count += len; - - if ((partial + len) >= SHA256_BLOCK_SIZE) { - int blocks; - - if (partial) { - int p = SHA256_BLOCK_SIZE - partial; - - memcpy(sctx->buf + partial, data, p); - data += p; - len -= p; - } +struct sha256_ce_state { + struct sha256_state sst; + u32 finalize; +}; - blocks = len / SHA256_BLOCK_SIZE; - len %= SHA256_BLOCK_SIZE; +asmlinkage void sha2_ce_transform(struct sha256_ce_state *sst, u8 const *src, + int blocks); - kernel_neon_begin_partial(28); - sha2_ce_transform(blocks, data, sctx->state, - partial ? sctx->buf : NULL, 0); - kernel_neon_end(); - - data += blocks * SHA256_BLOCK_SIZE; - partial = 0; - } - if (len) - memcpy(sctx->buf + partial, data, len); - return 0; -} - -static void sha2_final(struct shash_desc *desc) +static int sha256_ce_update(struct shash_desc *desc, const u8 *data, + unsigned int len) { - static const u8 padding[SHA256_BLOCK_SIZE] = { 0x80, }; - - struct sha256_state *sctx = shash_desc_ctx(desc); - __be64 bits = cpu_to_be64(sctx->count << 3); - u32 padlen = SHA256_BLOCK_SIZE - - ((sctx->count + sizeof(bits)) % SHA256_BLOCK_SIZE); - - sha2_update(desc, padding, padlen); - sha2_update(desc, (const u8 *)&bits, sizeof(bits)); -} - -static int sha224_final(struct shash_desc *desc, u8 *out) -{ - struct sha256_state *sctx = shash_desc_ctx(desc); - __be32 *dst = (__be32 *)out; - int i; - - sha2_final(desc); - - for (i = 0; i < SHA224_DIGEST_SIZE / sizeof(__be32); i++) - put_unaligned_be32(sctx->state[i], dst++); - - *sctx = (struct sha256_state){}; - return 0; -} + struct sha256_ce_state *sctx = shash_desc_ctx(desc); -static int sha256_final(struct shash_desc *desc, u8 *out) -{ - struct sha256_state *sctx = shash_desc_ctx(desc); - __be32 *dst = (__be32 *)out; - int i; - - sha2_final(desc); - - for (i = 0; i < SHA256_DIGEST_SIZE / sizeof(__be32); i++) - put_unaligned_be32(sctx->state[i], dst++); + sctx->finalize = 0; + kernel_neon_begin_partial(28); + sha256_base_do_update(desc, data, len, + (sha256_block_fn *)sha2_ce_transform); + kernel_neon_end(); - *sctx = (struct sha256_state){}; return 0; } -static void sha2_finup(struct shash_desc *desc, const u8 *data, - unsigned int len) +static int sha256_ce_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) { - struct sha256_state *sctx = shash_desc_ctx(desc); - int blocks; + struct sha256_ce_state *sctx = shash_desc_ctx(desc); + bool finalize = !sctx->sst.count && !(len % SHA256_BLOCK_SIZE); - if (sctx->count || !len || (len % SHA256_BLOCK_SIZE)) { - sha2_update(desc, data, len); - sha2_final(desc); - return; - } + ASM_EXPORT(sha256_ce_offsetof_count, + offsetof(struct sha256_ce_state, sst.count)); + ASM_EXPORT(sha256_ce_offsetof_finalize, + offsetof(struct sha256_ce_state, finalize)); /* - * Use a fast path if the input is a multiple of 64 bytes. In - * this case, there is no need to copy data around, and we can - * perform the entire digest calculation in a single invocation - * of sha2_ce_transform() + * Allow the asm code to perform the finalization if there is no + * partial data and the input is a round multiple of the block size. */ - blocks = len / SHA256_BLOCK_SIZE; + sctx->finalize = finalize; kernel_neon_begin_partial(28); - sha2_ce_transform(blocks, data, sctx->state, NULL, len); + sha256_base_do_update(desc, data, len, + (sha256_block_fn *)sha2_ce_transform); + if (!finalize) + sha256_base_do_finalize(desc, + (sha256_block_fn *)sha2_ce_transform); kernel_neon_end(); + return sha256_base_finish(desc, out); } -static int sha224_finup(struct shash_desc *desc, const u8 *data, - unsigned int len, u8 *out) +static int sha256_ce_final(struct shash_desc *desc, u8 *out) { - struct sha256_state *sctx = shash_desc_ctx(desc); - __be32 *dst = (__be32 *)out; - int i; - - sha2_finup(desc, data, len); - - for (i = 0; i < SHA224_DIGEST_SIZE / sizeof(__be32); i++) - put_unaligned_be32(sctx->state[i], dst++); - - *sctx = (struct sha256_state){}; - return 0; -} - -static int sha256_finup(struct shash_desc *desc, const u8 *data, - unsigned int len, u8 *out) -{ - struct sha256_state *sctx = shash_desc_ctx(desc); - __be32 *dst = (__be32 *)out; - int i; - - sha2_finup(desc, data, len); - - for (i = 0; i < SHA256_DIGEST_SIZE / sizeof(__be32); i++) - put_unaligned_be32(sctx->state[i], dst++); - - *sctx = (struct sha256_state){}; - return 0; -} - -static int sha2_export(struct shash_desc *desc, void *out) -{ - struct sha256_state *sctx = shash_desc_ctx(desc); - struct sha256_state *dst = out; - - *dst = *sctx; - return 0; -} - -static int sha2_import(struct shash_desc *desc, const void *in) -{ - struct sha256_state *sctx = shash_desc_ctx(desc); - struct sha256_state const *src = in; - - *sctx = *src; - return 0; + kernel_neon_begin_partial(28); + sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform); + kernel_neon_end(); + return sha256_base_finish(desc, out); } static struct shash_alg algs[] = { { - .init = sha224_init, - .update = sha2_update, - .final = sha224_final, - .finup = sha224_finup, - .export = sha2_export, - .import = sha2_import, - .descsize = sizeof(struct sha256_state), + .init = sha224_base_init, + .update = sha256_ce_update, + .final = sha256_ce_final, + .finup = sha256_ce_finup, + .descsize = sizeof(struct sha256_ce_state), .digestsize = SHA224_DIGEST_SIZE, - .statesize = sizeof(struct sha256_state), .base = { .cra_name = "sha224", .cra_driver_name = "sha224-ce", @@ -221,15 +97,12 @@ static struct shash_alg algs[] = { { .cra_module = THIS_MODULE, } }, { - .init = sha256_init, - .update = sha2_update, - .final = sha256_final, - .finup = sha256_finup, - .export = sha2_export, - .import = sha2_import, - .descsize = sizeof(struct sha256_state), + .init = sha256_base_init, + .update = sha256_ce_update, + .final = sha256_ce_final, + .finup = sha256_ce_finup, + .descsize = sizeof(struct sha256_ce_state), .digestsize = SHA256_DIGEST_SIZE, - .statesize = sizeof(struct sha256_state), .base = { .cra_name = "sha256", .cra_driver_name = "sha256-ce",