From patchwork Sat Jan 28 23:25:31 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92768 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847176qgi; Sat, 28 Jan 2017 15:26:01 -0800 (PST) X-Received: by 10.84.245.1 with SMTP id i1mr21897102pll.131.1485645961094; Sat, 28 Jan 2017 15:26:01 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 3si8337414pli.45.2017.01.28.15.26.00; Sat, 28 Jan 2017 15:26:01 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@linaro.org; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752532AbdA1XZ7 (ORCPT + 1 other); Sat, 28 Jan 2017 18:25:59 -0500 Received: from mail-wm0-f42.google.com ([74.125.82.42]:35019 "EHLO mail-wm0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751947AbdA1XZ6 (ORCPT ); Sat, 28 Jan 2017 18:25:58 -0500 Received: by mail-wm0-f42.google.com with SMTP id r126so165099089wmr.0 for ; Sat, 28 Jan 2017 15:25:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=uMxznzyzjyL6KPmIcSlEFgXHzKWi8SJ3MJxOaj5XmxA=; b=A0xRScx83L/xtlcdGb5DI8CKovFvM7LT0zl6Lxi5NgsIF7M1J1uEpPCw8l78wa23cc CJB7WjysdQdYcj4T7OZ4fT3rGIaToSPgX8KUUoPNqZkQE8iMVJ/tbBCdqgrcv19DuxUi 88CR8WqlFnsGo3Awj+5niVLg1jgVCrU8UL8VY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=uMxznzyzjyL6KPmIcSlEFgXHzKWi8SJ3MJxOaj5XmxA=; b=e2znwmQzLI1lYh8q/kj/PfvsX6Zaquq/mQpKFVkRqbINHc8hPTFxhLES0xOayu7i3K 6clpOYCzHHsc4795MdpDAXqkdVRHAddd61bhEBhj1qHvpvwPVSSPvjW3iC3xQIsr5sFW WAaUrw/EiFVkTcOmayygKrKGI4NojqfHqlyPJdDVWl8ab9t6YkzgMKNHIwPZtq1Ky9fV czZpRnSspGTUZt8CXHLcv9x7z/tBObgvNWxgj3FVtfFVEfCeEC5nCb+ZL/qZvnRUSU77 hinPcNAmq3jKt8L0MjfvAVOPtvOZ351q8dHzWgFTRAv9JoKP2YlPDQtHFflnaz4Oak9w 57lg== X-Gm-Message-State: AIkVDXLDZ6RjEZbLAIMuxI+3bJB5exiN4rQ/NbWlbzwdf94KV+1ZDE/TL+ZM/o7wAM9D76Ir X-Received: by 10.28.220.135 with SMTP id t129mr8996722wmg.97.1485645956680; Sat, 28 Jan 2017 15:25:56 -0800 (PST) Received: from localhost.localdomain ([160.163.215.165]) by smtp.gmail.com with ESMTPSA id 33sm14992064wrd.34.2017.01.28.15.25.54 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 28 Jan 2017 15:25:56 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel Subject: [PATCH v3 02/10] crypto: arm/aes-ce - remove cra_alignmask Date: Sat, 28 Jan 2017 23:25:31 +0000 Message-Id: <1485645939-17126-3-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1485645939-17126-1-git-send-email-ard.biesheuvel@linaro.org> References: <1485645939-17126-1-git-send-email-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Remove the unnecessary alignmask: it is much more efficient to deal with the misalignment in the core algorithm than relying on the crypto API to copy the data to a suitably aligned buffer. Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-core.S | 84 ++++++++++---------- arch/arm/crypto/aes-ce-glue.c | 15 ++-- 2 files changed, 47 insertions(+), 52 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S index 987aa632c9f0..ba8e6a32fdc9 100644 --- a/arch/arm/crypto/aes-ce-core.S +++ b/arch/arm/crypto/aes-ce-core.S @@ -169,19 +169,19 @@ ENTRY(ce_aes_ecb_encrypt) .Lecbencloop3x: subs r4, r4, #3 bmi .Lecbenc1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! bl aes_encrypt_3x - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lecbencloop3x .Lecbenc1x: adds r4, r4, #3 beq .Lecbencout .Lecbencloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! bl aes_encrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lecbencloop .Lecbencout: @@ -195,19 +195,19 @@ ENTRY(ce_aes_ecb_decrypt) .Lecbdecloop3x: subs r4, r4, #3 bmi .Lecbdec1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! bl aes_decrypt_3x - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lecbdecloop3x .Lecbdec1x: adds r4, r4, #3 beq .Lecbdecout .Lecbdecloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! bl aes_decrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lecbdecloop .Lecbdecout: @@ -226,10 +226,10 @@ ENTRY(ce_aes_cbc_encrypt) vld1.8 {q0}, [r5] prepare_key r2, r3 .Lcbcencloop: - vld1.8 {q1}, [r1, :64]! @ get next pt block + vld1.8 {q1}, [r1]! @ get next pt block veor q0, q0, q1 @ ..and xor with iv bl aes_encrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lcbcencloop vst1.8 {q0}, [r5] @@ -244,8 +244,8 @@ ENTRY(ce_aes_cbc_decrypt) .Lcbcdecloop3x: subs r4, r4, #3 bmi .Lcbcdec1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! vmov q3, q0 vmov q4, q1 vmov q5, q2 @@ -254,19 +254,19 @@ ENTRY(ce_aes_cbc_decrypt) veor q1, q1, q3 veor q2, q2, q4 vmov q6, q5 - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lcbcdecloop3x .Lcbcdec1x: adds r4, r4, #3 beq .Lcbcdecout vmov q15, q14 @ preserve last round key .Lcbcdecloop: - vld1.8 {q0}, [r1, :64]! @ get next ct block + vld1.8 {q0}, [r1]! @ get next ct block veor q14, q15, q6 @ combine prev ct with last key vmov q6, q0 bl aes_decrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lcbcdecloop .Lcbcdecout: @@ -300,15 +300,15 @@ ENTRY(ce_aes_ctr_encrypt) rev ip, r6 add r6, r6, #1 vmov s11, ip - vld1.8 {q3-q4}, [r1, :64]! - vld1.8 {q5}, [r1, :64]! + vld1.8 {q3-q4}, [r1]! + vld1.8 {q5}, [r1]! bl aes_encrypt_3x veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 rev ip, r6 - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! vmov s27, ip b .Lctrloop3x .Lctr1x: @@ -318,10 +318,10 @@ ENTRY(ce_aes_ctr_encrypt) vmov q0, q6 bl aes_encrypt subs r4, r4, #1 - bmi .Lctrhalfblock @ blocks < 0 means 1/2 block - vld1.8 {q3}, [r1, :64]! + bmi .Lctrtailblock @ blocks < 0 means tail block + vld1.8 {q3}, [r1]! veor q3, q0, q3 - vst1.8 {q3}, [r0, :64]! + vst1.8 {q3}, [r0]! adds r6, r6, #1 @ increment BE ctr rev ip, r6 @@ -333,10 +333,8 @@ ENTRY(ce_aes_ctr_encrypt) vst1.8 {q6}, [r5] pop {r4-r6, pc} -.Lctrhalfblock: - vld1.8 {d1}, [r1, :64] - veor d0, d0, d1 - vst1.8 {d0}, [r0, :64] +.Lctrtailblock: + vst1.8 {q0}, [r0, :64] @ return just the key stream pop {r4-r6, pc} .Lctrcarry: @@ -405,8 +403,8 @@ ENTRY(ce_aes_xts_encrypt) .Lxtsenc3x: subs r4, r4, #3 bmi .Lxtsenc1x - vld1.8 {q0-q1}, [r1, :64]! @ get 3 pt blocks - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! @ get 3 pt blocks + vld1.8 {q2}, [r1]! next_tweak q4, q3, q7, q6 veor q0, q0, q3 next_tweak q5, q4, q7, q6 @@ -416,8 +414,8 @@ ENTRY(ce_aes_xts_encrypt) veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 - vst1.8 {q0-q1}, [r0, :64]! @ write 3 ct blocks - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! @ write 3 ct blocks + vst1.8 {q2}, [r0]! vmov q3, q5 teq r4, #0 beq .Lxtsencout @@ -426,11 +424,11 @@ ENTRY(ce_aes_xts_encrypt) adds r4, r4, #3 beq .Lxtsencout .Lxtsencloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! veor q0, q0, q3 bl aes_encrypt veor q0, q0, q3 - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 beq .Lxtsencout next_tweak q3, q3, q7, q6 @@ -456,8 +454,8 @@ ENTRY(ce_aes_xts_decrypt) .Lxtsdec3x: subs r4, r4, #3 bmi .Lxtsdec1x - vld1.8 {q0-q1}, [r1, :64]! @ get 3 ct blocks - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! @ get 3 ct blocks + vld1.8 {q2}, [r1]! next_tweak q4, q3, q7, q6 veor q0, q0, q3 next_tweak q5, q4, q7, q6 @@ -467,8 +465,8 @@ ENTRY(ce_aes_xts_decrypt) veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 - vst1.8 {q0-q1}, [r0, :64]! @ write 3 pt blocks - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! @ write 3 pt blocks + vst1.8 {q2}, [r0]! vmov q3, q5 teq r4, #0 beq .Lxtsdecout @@ -477,12 +475,12 @@ ENTRY(ce_aes_xts_decrypt) adds r4, r4, #3 beq .Lxtsdecout .Lxtsdecloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! veor q0, q0, q3 add ip, r2, #32 @ 3rd round key bl aes_decrypt veor q0, q0, q3 - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 beq .Lxtsdecout next_tweak q3, q3, q7, q6 diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c index 8857531915bf..883b84d828c5 100644 --- a/arch/arm/crypto/aes-ce-glue.c +++ b/arch/arm/crypto/aes-ce-glue.c @@ -278,14 +278,15 @@ static int ctr_encrypt(struct skcipher_request *req) u8 *tsrc = walk.src.virt.addr; /* - * Minimum alignment is 8 bytes, so if nbytes is <= 8, we need - * to tell aes_ctr_encrypt() to only read half a block. + * Tell aes_ctr_encrypt() to process a tail block. */ - blocks = (nbytes <= 8) ? -1 : 1; + blocks = -1; - ce_aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc, + ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, num_rounds(ctx), blocks, walk.iv); - memcpy(tdst, tail, nbytes); + if (tdst != tsrc) + memcpy(tdst, tsrc, nbytes); + crypto_xor(tdst, tail, nbytes); err = skcipher_walk_done(&walk, 0); } kernel_neon_end(); @@ -345,7 +346,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -361,7 +361,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -378,7 +377,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -396,7 +394,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_xts_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = 2 * AES_MIN_KEY_SIZE,