From patchwork Sat Jan 28 23:25:30 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92767
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel
Subject: [PATCH v3 01/10] crypto: arm64/aes-neon-bs - honour iv_out requirement in CTR mode
Date: Sat, 28 Jan 2017 23:25:30 +0000
Message-Id: <1485645939-17126-2-git-send-email-ard.biesheuvel@linaro.org>
In-Reply-To: <1485645939-17126-1-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org Precedence:
bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Update the new bitsliced NEON AES implementation in CTR mode to return the next IV back to the skcipher API client. This is necessary for chaining to work correctly. Note that this is only done if the request is a round multiple of the block size, since otherwise, chaining is impossible anyway. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-neonbs-core.S | 25 +++++++++++++------- 1 file changed, 16 insertions(+), 9 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm64/crypto/aes-neonbs-core.S b/arch/arm64/crypto/aes-neonbs-core.S index 8d0cdaa2768d..2ada12dd768e 100644 --- a/arch/arm64/crypto/aes-neonbs-core.S +++ b/arch/arm64/crypto/aes-neonbs-core.S @@ -874,12 +874,19 @@ CPU_LE( rev x8, x8 ) csel x4, x4, xzr, pl csel x9, x9, xzr, le + tbnz x9, #1, 0f next_ctr v1 + tbnz x9, #2, 0f next_ctr v2 + tbnz x9, #3, 0f next_ctr v3 + tbnz x9, #4, 0f next_ctr v4 + tbnz x9, #5, 0f next_ctr v5 + tbnz x9, #6, 0f next_ctr v6 + tbnz x9, #7, 0f next_ctr v7 0: mov bskey, x2 @@ -928,11 +935,11 @@ CPU_LE( rev x8, x8 ) eor v5.16b, v5.16b, v15.16b st1 {v5.16b}, [x0], #16 - next_ctr v0 +8: next_ctr v0 cbnz x4, 99b 0: st1 {v0.16b}, [x5] -8: ldp x29, x30, [sp], #16 +9: ldp x29, x30, [sp], #16 ret /* @@ -941,23 +948,23 @@ CPU_LE( rev x8, x8 ) */ 1: cbz x6, 8b st1 {v1.16b}, [x5] - b 8b + b 9b 2: cbz x6, 8b st1 {v4.16b}, [x5] - b 8b + b 9b 3: cbz x6, 8b st1 {v6.16b}, [x5] - b 8b + b 9b 4: cbz x6, 8b st1 {v3.16b}, [x5] - b 8b + b 9b 5: cbz x6, 8b st1 {v7.16b}, [x5] - b 8b + b 9b 6: cbz x6, 8b st1 {v2.16b}, [x5] - b 8b + b 9b 7: cbz x6, 8b st1 {v5.16b}, [x5] - b 8b + b 9b ENDPROC(aesbs_ctr_encrypt) From patchwork Sat Jan 28 23:25:31 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92768 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847176qgi; Sat, 28 Jan 2017 15:26:01 -0800 (PST) X-Received: by 10.84.245.1 with SMTP id i1mr21897102pll.131.1485645961094; Sat, 28 Jan 2017 15:26:01 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. 
[209.132.180.67])
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel
Subject: [PATCH v3 02/10] crypto: arm/aes-ce - remove cra_alignmask
Date: Sat, 28 Jan 2017 23:25:31 +0000
Message-Id: <1485645939-17126-3-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.
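As a rough illustration of what the alignmask was buying (and costing): with
.cra_alignmask = 7 the generic skcipher layer guarantees 8-byte-aligned
buffers, bouncing the data through an aligned temporary whenever the caller's
pointers are not aligned. The sketch below is illustrative C only, not kernel
code; the helper names (encrypt_with_alignmask, cipher_fn) are invented for
the example. Dropping the alignmask removes the copy and lets the NEON
loads/stores (vld1.8/vst1.8 without an alignment qualifier) take unaligned
pointers directly.

/*
 * Illustrative sketch only -- not kernel code.  Assumes len <= sizeof(buf).
 */
#include <stdint.h>
#include <string.h>

typedef void (*cipher_fn)(uint8_t *dst, const uint8_t *src, size_t len);

static void encrypt_with_alignmask(uint8_t *dst, const uint8_t *src,
				   size_t len, unsigned long alignmask,
				   cipher_fn cipher)
{
	if (((uintptr_t)src | (uintptr_t)dst) & alignmask) {
		uint8_t buf[256] __attribute__((aligned(16)));

		memcpy(buf, src, len);		/* extra copy in */
		cipher(buf, buf, len);
		memcpy(dst, buf, len);		/* extra copy out */
	} else {
		cipher(dst, src, len);		/* aligned fast path */
	}
}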
Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/aes-ce-core.S | 84 ++++++++++---------- arch/arm/crypto/aes-ce-glue.c | 15 ++-- 2 files changed, 47 insertions(+), 52 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm/crypto/aes-ce-core.S b/arch/arm/crypto/aes-ce-core.S index 987aa632c9f0..ba8e6a32fdc9 100644 --- a/arch/arm/crypto/aes-ce-core.S +++ b/arch/arm/crypto/aes-ce-core.S @@ -169,19 +169,19 @@ ENTRY(ce_aes_ecb_encrypt) .Lecbencloop3x: subs r4, r4, #3 bmi .Lecbenc1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! bl aes_encrypt_3x - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lecbencloop3x .Lecbenc1x: adds r4, r4, #3 beq .Lecbencout .Lecbencloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! bl aes_encrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lecbencloop .Lecbencout: @@ -195,19 +195,19 @@ ENTRY(ce_aes_ecb_decrypt) .Lecbdecloop3x: subs r4, r4, #3 bmi .Lecbdec1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! bl aes_decrypt_3x - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lecbdecloop3x .Lecbdec1x: adds r4, r4, #3 beq .Lecbdecout .Lecbdecloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! bl aes_decrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lecbdecloop .Lecbdecout: @@ -226,10 +226,10 @@ ENTRY(ce_aes_cbc_encrypt) vld1.8 {q0}, [r5] prepare_key r2, r3 .Lcbcencloop: - vld1.8 {q1}, [r1, :64]! @ get next pt block + vld1.8 {q1}, [r1]! @ get next pt block veor q0, q0, q1 @ ..and xor with iv bl aes_encrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lcbcencloop vst1.8 {q0}, [r5] @@ -244,8 +244,8 @@ ENTRY(ce_aes_cbc_decrypt) .Lcbcdecloop3x: subs r4, r4, #3 bmi .Lcbcdec1x - vld1.8 {q0-q1}, [r1, :64]! - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! + vld1.8 {q2}, [r1]! vmov q3, q0 vmov q4, q1 vmov q5, q2 @@ -254,19 +254,19 @@ ENTRY(ce_aes_cbc_decrypt) veor q1, q1, q3 veor q2, q2, q4 vmov q6, q5 - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! b .Lcbcdecloop3x .Lcbcdec1x: adds r4, r4, #3 beq .Lcbcdecout vmov q15, q14 @ preserve last round key .Lcbcdecloop: - vld1.8 {q0}, [r1, :64]! @ get next ct block + vld1.8 {q0}, [r1]! @ get next ct block veor q14, q15, q6 @ combine prev ct with last key vmov q6, q0 bl aes_decrypt - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 bne .Lcbcdecloop .Lcbcdecout: @@ -300,15 +300,15 @@ ENTRY(ce_aes_ctr_encrypt) rev ip, r6 add r6, r6, #1 vmov s11, ip - vld1.8 {q3-q4}, [r1, :64]! - vld1.8 {q5}, [r1, :64]! + vld1.8 {q3-q4}, [r1]! + vld1.8 {q5}, [r1]! bl aes_encrypt_3x veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 rev ip, r6 - vst1.8 {q0-q1}, [r0, :64]! - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! + vst1.8 {q2}, [r0]! vmov s27, ip b .Lctrloop3x .Lctr1x: @@ -318,10 +318,10 @@ ENTRY(ce_aes_ctr_encrypt) vmov q0, q6 bl aes_encrypt subs r4, r4, #1 - bmi .Lctrhalfblock @ blocks < 0 means 1/2 block - vld1.8 {q3}, [r1, :64]! + bmi .Lctrtailblock @ blocks < 0 means tail block + vld1.8 {q3}, [r1]! veor q3, q0, q3 - vst1.8 {q3}, [r0, :64]! + vst1.8 {q3}, [r0]! 
adds r6, r6, #1 @ increment BE ctr rev ip, r6 @@ -333,10 +333,8 @@ ENTRY(ce_aes_ctr_encrypt) vst1.8 {q6}, [r5] pop {r4-r6, pc} -.Lctrhalfblock: - vld1.8 {d1}, [r1, :64] - veor d0, d0, d1 - vst1.8 {d0}, [r0, :64] +.Lctrtailblock: + vst1.8 {q0}, [r0, :64] @ return just the key stream pop {r4-r6, pc} .Lctrcarry: @@ -405,8 +403,8 @@ ENTRY(ce_aes_xts_encrypt) .Lxtsenc3x: subs r4, r4, #3 bmi .Lxtsenc1x - vld1.8 {q0-q1}, [r1, :64]! @ get 3 pt blocks - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! @ get 3 pt blocks + vld1.8 {q2}, [r1]! next_tweak q4, q3, q7, q6 veor q0, q0, q3 next_tweak q5, q4, q7, q6 @@ -416,8 +414,8 @@ ENTRY(ce_aes_xts_encrypt) veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 - vst1.8 {q0-q1}, [r0, :64]! @ write 3 ct blocks - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! @ write 3 ct blocks + vst1.8 {q2}, [r0]! vmov q3, q5 teq r4, #0 beq .Lxtsencout @@ -426,11 +424,11 @@ ENTRY(ce_aes_xts_encrypt) adds r4, r4, #3 beq .Lxtsencout .Lxtsencloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! veor q0, q0, q3 bl aes_encrypt veor q0, q0, q3 - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 beq .Lxtsencout next_tweak q3, q3, q7, q6 @@ -456,8 +454,8 @@ ENTRY(ce_aes_xts_decrypt) .Lxtsdec3x: subs r4, r4, #3 bmi .Lxtsdec1x - vld1.8 {q0-q1}, [r1, :64]! @ get 3 ct blocks - vld1.8 {q2}, [r1, :64]! + vld1.8 {q0-q1}, [r1]! @ get 3 ct blocks + vld1.8 {q2}, [r1]! next_tweak q4, q3, q7, q6 veor q0, q0, q3 next_tweak q5, q4, q7, q6 @@ -467,8 +465,8 @@ ENTRY(ce_aes_xts_decrypt) veor q0, q0, q3 veor q1, q1, q4 veor q2, q2, q5 - vst1.8 {q0-q1}, [r0, :64]! @ write 3 pt blocks - vst1.8 {q2}, [r0, :64]! + vst1.8 {q0-q1}, [r0]! @ write 3 pt blocks + vst1.8 {q2}, [r0]! vmov q3, q5 teq r4, #0 beq .Lxtsdecout @@ -477,12 +475,12 @@ ENTRY(ce_aes_xts_decrypt) adds r4, r4, #3 beq .Lxtsdecout .Lxtsdecloop: - vld1.8 {q0}, [r1, :64]! + vld1.8 {q0}, [r1]! veor q0, q0, q3 add ip, r2, #32 @ 3rd round key bl aes_decrypt veor q0, q0, q3 - vst1.8 {q0}, [r0, :64]! + vst1.8 {q0}, [r0]! subs r4, r4, #1 beq .Lxtsdecout next_tweak q3, q3, q7, q6 diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c index 8857531915bf..883b84d828c5 100644 --- a/arch/arm/crypto/aes-ce-glue.c +++ b/arch/arm/crypto/aes-ce-glue.c @@ -278,14 +278,15 @@ static int ctr_encrypt(struct skcipher_request *req) u8 *tsrc = walk.src.virt.addr; /* - * Minimum alignment is 8 bytes, so if nbytes is <= 8, we need - * to tell aes_ctr_encrypt() to only read half a block. + * Tell aes_ctr_encrypt() to process a tail block. */ - blocks = (nbytes <= 8) ? 
-1 : 1; + blocks = -1; - ce_aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc, + ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, num_rounds(ctx), blocks, walk.iv); - memcpy(tdst, tail, nbytes); + if (tdst != tsrc) + memcpy(tdst, tsrc, nbytes); + crypto_xor(tdst, tail, nbytes); err = skcipher_walk_done(&walk, 0); } kernel_neon_end(); @@ -345,7 +346,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -361,7 +361,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -378,7 +377,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -396,7 +394,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_xts_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = 2 * AES_MIN_KEY_SIZE, From patchwork Sat Jan 28 23:25:32 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92769 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847177qgi; Sat, 28 Jan 2017 15:26:01 -0800 (PST) X-Received: by 10.98.80.146 with SMTP id g18mr16361579pfj.74.1485645961305; Sat, 28 Jan 2017 15:26:01 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. 
[209.132.180.67])
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel
Subject: [PATCH v3 03/10] crypto: arm/chacha20 - remove cra_alignmask
Date: Sat, 28 Jan 2017 23:25:32 +0000
Message-Id: <1485645939-17126-4-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.
Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/chacha20-neon-glue.c | 1 - 1 file changed, 1 deletion(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm/crypto/chacha20-neon-glue.c b/arch/arm/crypto/chacha20-neon-glue.c index 592f75ae4fa1..59a7be08e80c 100644 --- a/arch/arm/crypto/chacha20-neon-glue.c +++ b/arch/arm/crypto/chacha20-neon-glue.c @@ -94,7 +94,6 @@ static struct skcipher_alg alg = { .base.cra_priority = 300, .base.cra_blocksize = 1, .base.cra_ctxsize = sizeof(struct chacha20_ctx), - .base.cra_alignmask = 1, .base.cra_module = THIS_MODULE, .min_keysize = CHACHA20_KEY_SIZE, From patchwork Sat Jan 28 23:25:33 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92774 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847287qgi; Sat, 28 Jan 2017 15:26:31 -0800 (PST) X-Received: by 10.99.44.138 with SMTP id s132mr16782601pgs.88.1485645991438; Sat, 28 Jan 2017 15:26:31 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d65si8383902pfl.73.2017.01.28.15.26.31; Sat, 28 Jan 2017 15:26:31 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@linaro.org; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752695AbdA1X0F (ORCPT + 1 other); Sat, 28 Jan 2017 18:26:05 -0500 Received: from mail-wm0-f48.google.com ([74.125.82.48]:38057 "EHLO mail-wm0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752007AbdA1X0C (ORCPT ); Sat, 28 Jan 2017 18:26:02 -0500 Received: by mail-wm0-f48.google.com with SMTP id r126so35379319wmr.1 for ; Sat, 28 Jan 2017 15:26:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=uEavdoflWkEgxBCfHtZRMITUTt3rHQF+vKAmVwYFXzM=; b=iX4RgwAwSImbXM2bgXM5ECffN/JuvHQEoDyaWSb9ZnkSvbbmjVYMJUMohA2+n70SRX 4dt7o364y8lxDBxbMv0RUSnMe/1K+m2UH1TfNFa/I1v9CZmX38XleBxIRjEjsFqrPBXs lmGfGqoZALALbh6Z/MSzR8QCSQcsoQxDdwp2w= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=uEavdoflWkEgxBCfHtZRMITUTt3rHQF+vKAmVwYFXzM=; b=KC98O4+iK+l2RoOrAG1jOvFK09PW7t37SW/91OajBtO8KnhaV8tf5XZL/lIvczNzzm fcTbg4hvj9ONWDTWtDrjG8Q1x1sG0/Sa9a+Bk0OZoejqyl8/eYVdginYiFfUuolXZuDz 6uEH02GEipXM15KaL8mVMJhwM/bdNpk8Oo6X6o7ZMv+jf8ckW0lWGX2qz4SMoKGNXHQ7 hLUtLtBSaUolvWoM42SaG8tarCn9ueEnxa69Jq3H2ndYRRrdbH+zncYGuaWNPIFqQeaf d09PvuoeXvZ0YVa4Bet0DrJF9cVRn3OYw6/o3CDKDq/48HCaaHowCg2UvVtZ8n40lwrZ yGKQ== X-Gm-Message-State: AIkVDXKbARihSOidN/UYGETwWQ9eG+769UH6E9cRPwWveu1MTeig/mUMsLZLvLykw5M32MTY X-Received: by 10.223.176.210 with SMTP id j18mr14926485wra.8.1485645960378; Sat, 28 Jan 2017 15:26:00 -0800 (PST) Received: from 
localhost.localdomain ([160.163.215.165])
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel
Subject: [PATCH v3 04/10] crypto: arm64/aes-ce-ccm - remove cra_alignmask
Date: Sat, 28 Jan 2017 23:25:33 +0000
Message-Id: <1485645939-17126-5-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/crypto/aes-ce-ccm-glue.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index cc5515dac74a..6a7dbc7c83a6 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -258,7 +258,6 @@ static struct aead_alg ccm_aes_alg = {
 		.cra_priority		= 300,
 		.cra_blocksize		= 1,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_alignmask		= 7,
 		.cra_module		= THIS_MODULE,
 	},
 	.ivsize		= AES_BLOCK_SIZE,

From patchwork Sat Jan 28 23:25:34 2017
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 92772
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel
Subject: [PATCH v3 05/10] crypto: arm64/aes-blk - remove cra_alignmask
Date: Sat, 28 Jan 2017 23:25:34 +0000
Message-Id: <1485645939-17126-6-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.
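For the final, possibly partial CTR block, the ctr_encrypt() hunk in the diff
below has the asm routine return only the keystream block, and the C glue then
xors it into the data with crypto_xor(). In scalar terms the idea looks
roughly like the sketch below; it is illustrative only, not the kernel code,
and the helper name ctr_final is invented for the example.

/*
 * Illustrative sketch of the CTR tail handling.  keystream[] holds
 * E_K(counter) for the last block; only 'nbytes' of it are consumed,
 * so a partial final block needs no padding.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void ctr_final(uint8_t *dst, const uint8_t *src, size_t nbytes,
		      const uint8_t keystream[16])
{
	if (dst != src)
		memcpy(dst, src, nbytes);
	for (size_t i = 0; i < nbytes; i++)	/* crypto_xor() equivalent */
		dst[i] ^= keystream[i];
}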
Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-glue.c | 16 ++++++---------- arch/arm64/crypto/aes-modes.S | 8 +++----- 2 files changed, 9 insertions(+), 15 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c index 5164aaf82c6a..8ee1fb7aaa4f 100644 --- a/arch/arm64/crypto/aes-glue.c +++ b/arch/arm64/crypto/aes-glue.c @@ -215,14 +215,15 @@ static int ctr_encrypt(struct skcipher_request *req) u8 *tsrc = walk.src.virt.addr; /* - * Minimum alignment is 8 bytes, so if nbytes is <= 8, we need - * to tell aes_ctr_encrypt() to only read half a block. + * Tell aes_ctr_encrypt() to process a tail block. */ - blocks = (nbytes <= 8) ? -1 : 1; + blocks = -1; - aes_ctr_encrypt(tail, tsrc, (u8 *)ctx->key_enc, rounds, + aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, rounds, blocks, walk.iv, first); - memcpy(tdst, tail, nbytes); + if (tdst != tsrc) + memcpy(tdst, tsrc, nbytes); + crypto_xor(tdst, tail, nbytes); err = skcipher_walk_done(&walk, 0); } kernel_neon_end(); @@ -282,7 +283,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -298,7 +298,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -315,7 +314,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -332,7 +330,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_priority = PRIO - 1, .cra_blocksize = 1, .cra_ctxsize = sizeof(struct crypto_aes_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = AES_MIN_KEY_SIZE, @@ -350,7 +347,6 @@ static struct skcipher_alg aes_algs[] = { { .cra_flags = CRYPTO_ALG_INTERNAL, .cra_blocksize = AES_BLOCK_SIZE, .cra_ctxsize = sizeof(struct crypto_aes_xts_ctx), - .cra_alignmask = 7, .cra_module = THIS_MODULE, }, .min_keysize = 2 * AES_MIN_KEY_SIZE, diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S index 838dad5c209f..92b982a8b112 100644 --- a/arch/arm64/crypto/aes-modes.S +++ b/arch/arm64/crypto/aes-modes.S @@ -337,7 +337,7 @@ AES_ENTRY(aes_ctr_encrypt) .Lctrcarrydone: subs w4, w4, #1 - bmi .Lctrhalfblock /* blocks < 0 means 1/2 block */ + bmi .Lctrtailblock /* blocks <0 means tail block */ ld1 {v3.16b}, [x1], #16 eor v3.16b, v0.16b, v3.16b st1 {v3.16b}, [x0], #16 @@ -348,10 +348,8 @@ AES_ENTRY(aes_ctr_encrypt) FRAME_POP ret -.Lctrhalfblock: - ld1 {v3.8b}, [x1] - eor v3.8b, v0.8b, v3.8b - st1 {v3.8b}, [x0] +.Lctrtailblock: + st1 {v0.16b}, [x0] FRAME_POP ret From patchwork Sat Jan 28 23:25:35 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92771 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847283qgi; Sat, 28 Jan 2017 15:26:31 -0800 (PST) X-Received: by 10.99.36.3 with SMTP id k3mr16762254pgk.136.1485645990963; Sat, 28 Jan 2017 
15:26:30 -0800 (PST)
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel
Subject: [PATCH v3 06/10] crypto: arm64/chacha20 - remove cra_alignmask
Date: Sat, 28 Jan 2017 23:25:35 +0000
Message-Id: <1485645939-17126-7-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

Remove the unnecessary alignmask: it is much more efficient to deal with
the misalignment in the core algorithm than relying on the crypto API to
copy the data to a suitably aligned buffer.
Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/chacha20-neon-glue.c | 1 - 1 file changed, 1 deletion(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm64/crypto/chacha20-neon-glue.c b/arch/arm64/crypto/chacha20-neon-glue.c index a7f2337d46cf..a7cd575ea223 100644 --- a/arch/arm64/crypto/chacha20-neon-glue.c +++ b/arch/arm64/crypto/chacha20-neon-glue.c @@ -93,7 +93,6 @@ static struct skcipher_alg alg = { .base.cra_priority = 300, .base.cra_blocksize = 1, .base.cra_ctxsize = sizeof(struct chacha20_ctx), - .base.cra_alignmask = 1, .base.cra_module = THIS_MODULE, .min_keysize = CHACHA20_KEY_SIZE, From patchwork Sat Jan 28 23:25:36 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92770 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847281qgi; Sat, 28 Jan 2017 15:26:30 -0800 (PST) X-Received: by 10.84.211.11 with SMTP id b11mr21891407pli.83.1485645990809; Sat, 28 Jan 2017 15:26:30 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d65si8383902pfl.73.2017.01.28.15.26.30; Sat, 28 Jan 2017 15:26:30 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@linaro.org; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751947AbdA1X0M (ORCPT + 1 other); Sat, 28 Jan 2017 18:26:12 -0500 Received: from mail-wm0-f44.google.com ([74.125.82.44]:38092 "EHLO mail-wm0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752710AbdA1X0I (ORCPT ); Sat, 28 Jan 2017 18:26:08 -0500 Received: by mail-wm0-f44.google.com with SMTP id r126so35381295wmr.1 for ; Sat, 28 Jan 2017 15:26:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=i8PTGdBLEln38Rqn6mFXAiQkiYiMzfbJpCdlXeMJ1E4=; b=VlgkLeDm7HbLVpKvgzQ9Tu5ttKUCVD0OSkPzujKY915qAWQIebFL5L8kLQgvFFR/GS uMeFArGeuL4nOrLhhm7doCwZgez2pyw36JSkp0VOQ8tRl2cMeUDZApawtZK8/tdDLUHo /M5JexHjIVLdPwVK23GHjh45GZmhm2UheAtbE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=i8PTGdBLEln38Rqn6mFXAiQkiYiMzfbJpCdlXeMJ1E4=; b=FTpssxBhAiAbnuuhG0BN16h/QygbV9DvBycIxHprOJKodTqLpRFFAczOBxtnR106yp JyNpimSVjNYYUfrdv5OzQVkkkxi3BPVtL2OOecJIRo7U9rpGQaP3C23K+O9k1G143h59 qi2Wo/eCU2+pX2w6BXXBTtQCKI0klILn+76rIEkNUQ/rAv6lYxfS35yglznx6GJztUfa lCGeCUElC+O+xDz37ifova5NHYI/hSLazCYXKLFlMVrJUshsDxv5yz9ZpoWsZ1inCDyh qOaUA4+70zCNCQ8XJMJXl8ImbqJbxSuSYfZxtRNTbOhInCjqxdTFY+UlThrCTvD5yqhN Ik6Q== X-Gm-Message-State: AIkVDXLa9QB3rGqOafxLdS1nqd+Q7M0Q02WnXtnBloUb3sQctWw13Z0El8xe9qzhx614ls3O X-Received: by 10.28.95.87 with SMTP id t84mr7893674wmb.135.1485645966782; Sat, 28 Jan 2017 15:26:06 -0800 (PST) Received: from 
localhost.localdomain ([160.163.215.165]) by smtp.gmail.com with ESMTPSA id 33sm14992064wrd.34.2017.01.28.15.26.04 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 28 Jan 2017 15:26:06 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel Subject: [PATCH v3 07/10] crypto: arm64/aes - avoid literals for cross-module symbol references Date: Sat, 28 Jan 2017 23:25:36 +0000 Message-Id: <1485645939-17126-8-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1485645939-17126-1-git-send-email-ard.biesheuvel@linaro.org> References: <1485645939-17126-1-git-send-email-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Using simple adrp/add pairs to refer to the AES lookup tables exposed by the generic AES driver (which could be loaded far away from this driver when KASLR is in effect) was unreliable at module load time before commit 41c066f2c4d4 ("arm64: assembler: make adr_l work in modules under KASLR"), which is why the AES code used literals instead. So now we can get rid of the literals, and switch to the adr_l macro. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-cipher-core.S | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S index 37590ab8121a..cd58c61e6677 100644 --- a/arch/arm64/crypto/aes-cipher-core.S +++ b/arch/arm64/crypto/aes-cipher-core.S @@ -89,8 +89,8 @@ CPU_BE( rev w8, w8 ) eor w7, w7, w11 eor w8, w8, w12 - ldr tt, =\ttab - ldr lt, =\ltab + adr_l tt, \ttab + adr_l lt, \ltab tbnz rounds, #1, 1f @@ -111,9 +111,6 @@ CPU_BE( rev w8, w8 ) stp w5, w6, [out] stp w7, w8, [out, #8] ret - - .align 4 - .ltorg .endm .align 5 From patchwork Sat Jan 28 23:25:37 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92773 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847284qgi; Sat, 28 Jan 2017 15:26:31 -0800 (PST) X-Received: by 10.84.192.137 with SMTP id c9mr21705110pld.7.1485645991128; Sat, 28 Jan 2017 15:26:31 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. 
[209.132.180.67])
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel
Subject: [PATCH v3 08/10] crypto: arm64/aes - performance tweak
Date: Sat, 28 Jan 2017 23:25:37 +0000
Message-Id: <1485645939-17126-9-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

Shuffle some instructions around in the __hround macro to shave off 0.1
cycles per byte on Cortex-A57.
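For reference, each output word produced by __hround is the usual single-table
AES round: four table lookups, three of them rotated, xored into a round key
word. Roughly, in C (illustrative only; hround_word is an invented name and
the table/rotation convention is assumed to match this file, not taken from
the patch):

/*
 * Illustrative C for one __hround output word (encryption direction).
 * tt is the 256-entry 32-bit lookup table preloaded by the asm.
 */
#include <stdint.h>

static inline uint32_t ror32(uint32_t v, unsigned int n)
{
	return (v >> n) | (v << (32 - n));
}

static uint32_t hround_word(const uint32_t tt[256], uint32_t rk,
			    uint32_t s0, uint32_t s1, uint32_t s2, uint32_t s3)
{
	return rk ^ tt[s0 & 0xff]
		  ^ ror32(tt[(s1 >>  8) & 0xff], 24)
		  ^ ror32(tt[(s2 >> 16) & 0xff], 16)
		  ^ ror32(tt[(s3 >> 24) & 0xff],  8);
}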
Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-cipher-core.S | 52 +++++++------------- 1 file changed, 19 insertions(+), 33 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S index cd58c61e6677..f2f9cc519309 100644 --- a/arch/arm64/crypto/aes-cipher-core.S +++ b/arch/arm64/crypto/aes-cipher-core.S @@ -20,46 +20,32 @@ tt .req x4 lt .req x2 - .macro __hround, out0, out1, in0, in1, in2, in3, t0, t1, enc - ldp \out0, \out1, [rk], #8 - - ubfx w13, \in0, #0, #8 - ubfx w14, \in1, #8, #8 - ldr w13, [tt, w13, uxtw #2] - ldr w14, [tt, w14, uxtw #2] - + .macro __pair, enc, reg0, reg1, in0, in1e, in1d, shift + ubfx \reg0, \in0, #\shift, #8 .if \enc - ubfx w17, \in1, #0, #8 - ubfx w18, \in2, #8, #8 + ubfx \reg1, \in1e, #\shift, #8 .else - ubfx w17, \in3, #0, #8 - ubfx w18, \in0, #8, #8 + ubfx \reg1, \in1d, #\shift, #8 .endif - ldr w17, [tt, w17, uxtw #2] - ldr w18, [tt, w18, uxtw #2] + ldr \reg0, [tt, \reg0, uxtw #2] + ldr \reg1, [tt, \reg1, uxtw #2] + .endm - ubfx w15, \in2, #16, #8 - ubfx w16, \in3, #24, #8 - ldr w15, [tt, w15, uxtw #2] - ldr w16, [tt, w16, uxtw #2] + .macro __hround, out0, out1, in0, in1, in2, in3, t0, t1, enc + ldp \out0, \out1, [rk], #8 - .if \enc - ubfx \t0, \in3, #16, #8 - ubfx \t1, \in0, #24, #8 - .else - ubfx \t0, \in1, #16, #8 - ubfx \t1, \in2, #24, #8 - .endif - ldr \t0, [tt, \t0, uxtw #2] - ldr \t1, [tt, \t1, uxtw #2] + __pair \enc, w13, w14, \in0, \in1, \in3, 0 + __pair \enc, w15, w16, \in1, \in2, \in0, 8 + __pair \enc, w17, w18, \in2, \in3, \in1, 16 + __pair \enc, \t0, \t1, \in3, \in0, \in2, 24 eor \out0, \out0, w13 - eor \out1, \out1, w17 - eor \out0, \out0, w14, ror #24 - eor \out1, \out1, w18, ror #24 - eor \out0, \out0, w15, ror #16 - eor \out1, \out1, \t0, ror #16 - eor \out0, \out0, w16, ror #8 + eor \out1, \out1, w14 + eor \out0, \out0, w15, ror #24 + eor \out1, \out1, w16, ror #24 + eor \out0, \out0, w17, ror #16 + eor \out1, \out1, w18, ror #16 + eor \out0, \out0, \t0, ror #8 eor \out1, \out1, \t1, ror #8 .endm From patchwork Sat Jan 28 23:25:38 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92775 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847289qgi; Sat, 28 Jan 2017 15:26:31 -0800 (PST) X-Received: by 10.98.144.22 with SMTP id a22mr16211066pfe.160.1485645991765; Sat, 28 Jan 2017 15:26:31 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. 
[209.132.180.67])
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel
Subject: [PATCH v3 09/10] crypto: arm64/aes-neon-blk - tweak performance for low end cores
Date: Sat, 28 Jan 2017 23:25:38 +0000
Message-Id: <1485645939-17126-10-git-send-email-ard.biesheuvel@linaro.org>
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID: X-Mailing-List: linux-crypto@vger.kernel.org

The non-bitsliced AES implementation using the NEON is highly sensitive
to micro-architectural details, and, as it turns out, the Cortex-A53 on
the Raspberry Pi 3 is a core that can benefit from this code, given that
its scalar AES performance is abysmal (32.9 cycles per byte). The new
bitsliced AES code manages 19.8 cycles per byte on this core, but can
only operate on 8 blocks at a time, which is not supported by all
chaining modes.
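As background for the MixColumns changes further down: multiplication by x and
by x^2 in GF(2^8) with the AES polynomial reduces the shifted-out bits with
0x1b, which is what the mul_by_x/mul_by_x2 NEON macros in the diff do on
sixteen byte lanes at once. A scalar C sketch, illustrative only and not part
of the patch:

/*
 * GF(2^8) multiply-by-x and multiply-by-x^2 with the AES polynomial
 * (x^8 + x^4 + x^3 + x + 1, i.e. reduction by 0x1b), one byte at a time.
 */
#include <stdint.h>

static uint8_t mul_by_x(uint8_t b)
{
	return (uint8_t)(b << 1) ^ ((b & 0x80) ? 0x1b : 0x00);
}

static uint8_t mul_by_x2(uint8_t b)
{
	/* the two bits shifted out select 0, 0x1b, 0x36 or 0x2d -- the
	   carry-less product of (b >> 6) and 0x1b, like the pmul in the
	   patch */
	static const uint8_t red[4] = { 0x00, 0x1b, 0x36, 0x2d };

	return (uint8_t)(b << 2) ^ red[b >> 6];
}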
With a bit of tweaking, we can get the plain NEON code to run at 22.0 cycles per byte, making it useful for sequential modes like CBC encryption. (Like bitsliced NEON, the plain NEON implementation does not use any lookup tables, which makes it easy on the D-cache, and invulnerable to cache timing attacks) So tweak the plain NEON AES code to use tbl instructions rather than shl/sri pairs, and to avoid the need to reload permutation vectors or other constants from memory in every round. Also, improve the decryption performance by switching to 16x8 pmul instructions for the performing the multiplications in GF(2^8). To allow the ECB and CBC encrypt routines to be reused by the bitsliced NEON code in a subsequent patch, export them from the module. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-glue.c | 2 + arch/arm64/crypto/aes-neon.S | 235 +++++++++----------- 2 files changed, 102 insertions(+), 135 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c index 8ee1fb7aaa4f..055bc3f61138 100644 --- a/arch/arm64/crypto/aes-glue.c +++ b/arch/arm64/crypto/aes-glue.c @@ -409,5 +409,7 @@ static int __init aes_init(void) module_cpu_feature_match(AES, aes_init); #else module_init(aes_init); +EXPORT_SYMBOL(neon_aes_ecb_encrypt); +EXPORT_SYMBOL(neon_aes_cbc_encrypt); #endif module_exit(aes_exit); diff --git a/arch/arm64/crypto/aes-neon.S b/arch/arm64/crypto/aes-neon.S index 85f07ead7c5c..f1e3aa2732f9 100644 --- a/arch/arm64/crypto/aes-neon.S +++ b/arch/arm64/crypto/aes-neon.S @@ -1,7 +1,7 @@ /* * linux/arch/arm64/crypto/aes-neon.S - AES cipher for ARMv8 NEON * - * Copyright (C) 2013 Linaro Ltd + * Copyright (C) 2013 - 2017 Linaro Ltd. 
* * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -17,17 +17,25 @@ /* multiply by polynomial 'x' in GF(2^8) */ .macro mul_by_x, out, in, temp, const sshr \temp, \in, #7 - add \out, \in, \in + shl \out, \in, #1 and \temp, \temp, \const eor \out, \out, \temp .endm + /* multiply by polynomial 'x^2' in GF(2^8) */ + .macro mul_by_x2, out, in, temp, const + ushr \temp, \in, #6 + shl \out, \in, #2 + pmul \temp, \temp, \const + eor \out, \out, \temp + .endm + /* preload the entire Sbox */ .macro prepare, sbox, shiftrows, temp adr \temp, \sbox - movi v12.16b, #0x40 + movi v12.16b, #0x1b ldr q13, \shiftrows - movi v14.16b, #0x1b + ldr q14, .Lror32by8 ld1 {v16.16b-v19.16b}, [\temp], #64 ld1 {v20.16b-v23.16b}, [\temp], #64 ld1 {v24.16b-v27.16b}, [\temp], #64 @@ -50,37 +58,31 @@ /* apply SubBytes transformation using the the preloaded Sbox */ .macro sub_bytes, in - sub v9.16b, \in\().16b, v12.16b + sub v9.16b, \in\().16b, v15.16b tbl \in\().16b, {v16.16b-v19.16b}, \in\().16b - sub v10.16b, v9.16b, v12.16b + sub v10.16b, v9.16b, v15.16b tbx \in\().16b, {v20.16b-v23.16b}, v9.16b - sub v11.16b, v10.16b, v12.16b + sub v11.16b, v10.16b, v15.16b tbx \in\().16b, {v24.16b-v27.16b}, v10.16b tbx \in\().16b, {v28.16b-v31.16b}, v11.16b .endm /* apply MixColumns transformation */ - .macro mix_columns, in - mul_by_x v10.16b, \in\().16b, v9.16b, v14.16b - rev32 v8.8h, \in\().8h - eor \in\().16b, v10.16b, \in\().16b - shl v9.4s, v8.4s, #24 - shl v11.4s, \in\().4s, #24 - sri v9.4s, v8.4s, #8 - sri v11.4s, \in\().4s, #8 - eor v9.16b, v9.16b, v8.16b - eor v10.16b, v10.16b, v9.16b - eor \in\().16b, v10.16b, v11.16b - .endm - + .macro mix_columns, in, enc + .if \enc == 0 /* Inverse MixColumns: pre-multiply by { 5, 0, 4, 0 } */ - .macro inv_mix_columns, in - mul_by_x v11.16b, \in\().16b, v10.16b, v14.16b - mul_by_x v11.16b, v11.16b, v10.16b, v14.16b - eor \in\().16b, \in\().16b, v11.16b - rev32 v11.8h, v11.8h - eor \in\().16b, \in\().16b, v11.16b - mix_columns \in + mul_by_x2 v8.16b, \in\().16b, v9.16b, v12.16b + eor \in\().16b, \in\().16b, v8.16b + rev32 v8.8h, v8.8h + eor \in\().16b, \in\().16b, v8.16b + .endif + + mul_by_x v9.16b, \in\().16b, v8.16b, v12.16b + rev32 v8.8h, \in\().8h + eor v8.16b, v8.16b, v9.16b + eor \in\().16b, \in\().16b, v8.16b + tbl \in\().16b, {\in\().16b}, v14.16b + eor \in\().16b, \in\().16b, v8.16b .endm .macro do_block, enc, in, rounds, rk, rkp, i @@ -88,16 +90,13 @@ add \rkp, \rk, #16 mov \i, \rounds 1111: eor \in\().16b, \in\().16b, v15.16b /* ^round key */ + movi v15.16b, #0x40 tbl \in\().16b, {\in\().16b}, v13.16b /* ShiftRows */ sub_bytes \in - ld1 {v15.4s}, [\rkp], #16 subs \i, \i, #1 + ld1 {v15.4s}, [\rkp], #16 beq 2222f - .if \enc == 1 - mix_columns \in - .else - inv_mix_columns \in - .endif + mix_columns \in, \enc b 1111b 2222: eor \in\().16b, \in\().16b, v15.16b /* ^round key */ .endm @@ -116,139 +115,114 @@ */ .macro sub_bytes_2x, in0, in1 - sub v8.16b, \in0\().16b, v12.16b - sub v9.16b, \in1\().16b, v12.16b + sub v8.16b, \in0\().16b, v15.16b tbl \in0\().16b, {v16.16b-v19.16b}, \in0\().16b + sub v9.16b, \in1\().16b, v15.16b tbl \in1\().16b, {v16.16b-v19.16b}, \in1\().16b - sub v10.16b, v8.16b, v12.16b - sub v11.16b, v9.16b, v12.16b + sub v10.16b, v8.16b, v15.16b tbx \in0\().16b, {v20.16b-v23.16b}, v8.16b + sub v11.16b, v9.16b, v15.16b tbx \in1\().16b, {v20.16b-v23.16b}, v9.16b - sub v8.16b, v10.16b, v12.16b - sub v9.16b, v11.16b, v12.16b + sub v8.16b, v10.16b, v15.16b tbx \in0\().16b, 
{v24.16b-v27.16b}, v10.16b + sub v9.16b, v11.16b, v15.16b tbx \in1\().16b, {v24.16b-v27.16b}, v11.16b tbx \in0\().16b, {v28.16b-v31.16b}, v8.16b tbx \in1\().16b, {v28.16b-v31.16b}, v9.16b .endm .macro sub_bytes_4x, in0, in1, in2, in3 - sub v8.16b, \in0\().16b, v12.16b + sub v8.16b, \in0\().16b, v15.16b tbl \in0\().16b, {v16.16b-v19.16b}, \in0\().16b - sub v9.16b, \in1\().16b, v12.16b + sub v9.16b, \in1\().16b, v15.16b tbl \in1\().16b, {v16.16b-v19.16b}, \in1\().16b - sub v10.16b, \in2\().16b, v12.16b + sub v10.16b, \in2\().16b, v15.16b tbl \in2\().16b, {v16.16b-v19.16b}, \in2\().16b - sub v11.16b, \in3\().16b, v12.16b + sub v11.16b, \in3\().16b, v15.16b tbl \in3\().16b, {v16.16b-v19.16b}, \in3\().16b tbx \in0\().16b, {v20.16b-v23.16b}, v8.16b tbx \in1\().16b, {v20.16b-v23.16b}, v9.16b - sub v8.16b, v8.16b, v12.16b + sub v8.16b, v8.16b, v15.16b tbx \in2\().16b, {v20.16b-v23.16b}, v10.16b - sub v9.16b, v9.16b, v12.16b + sub v9.16b, v9.16b, v15.16b tbx \in3\().16b, {v20.16b-v23.16b}, v11.16b - sub v10.16b, v10.16b, v12.16b + sub v10.16b, v10.16b, v15.16b tbx \in0\().16b, {v24.16b-v27.16b}, v8.16b - sub v11.16b, v11.16b, v12.16b + sub v11.16b, v11.16b, v15.16b tbx \in1\().16b, {v24.16b-v27.16b}, v9.16b - sub v8.16b, v8.16b, v12.16b + sub v8.16b, v8.16b, v15.16b tbx \in2\().16b, {v24.16b-v27.16b}, v10.16b - sub v9.16b, v9.16b, v12.16b + sub v9.16b, v9.16b, v15.16b tbx \in3\().16b, {v24.16b-v27.16b}, v11.16b - sub v10.16b, v10.16b, v12.16b + sub v10.16b, v10.16b, v15.16b tbx \in0\().16b, {v28.16b-v31.16b}, v8.16b - sub v11.16b, v11.16b, v12.16b + sub v11.16b, v11.16b, v15.16b tbx \in1\().16b, {v28.16b-v31.16b}, v9.16b tbx \in2\().16b, {v28.16b-v31.16b}, v10.16b tbx \in3\().16b, {v28.16b-v31.16b}, v11.16b .endm .macro mul_by_x_2x, out0, out1, in0, in1, tmp0, tmp1, const - sshr \tmp0\().16b, \in0\().16b, #7 - add \out0\().16b, \in0\().16b, \in0\().16b - sshr \tmp1\().16b, \in1\().16b, #7 + sshr \tmp0\().16b, \in0\().16b, #7 + shl \out0\().16b, \in0\().16b, #1 + sshr \tmp1\().16b, \in1\().16b, #7 and \tmp0\().16b, \tmp0\().16b, \const\().16b - add \out1\().16b, \in1\().16b, \in1\().16b + shl \out1\().16b, \in1\().16b, #1 and \tmp1\().16b, \tmp1\().16b, \const\().16b eor \out0\().16b, \out0\().16b, \tmp0\().16b eor \out1\().16b, \out1\().16b, \tmp1\().16b .endm - .macro mix_columns_2x, in0, in1 - mul_by_x_2x v8, v9, \in0, \in1, v10, v11, v14 - rev32 v10.8h, \in0\().8h - rev32 v11.8h, \in1\().8h - eor \in0\().16b, v8.16b, \in0\().16b - eor \in1\().16b, v9.16b, \in1\().16b - shl v12.4s, v10.4s, #24 - shl v13.4s, v11.4s, #24 - eor v8.16b, v8.16b, v10.16b - sri v12.4s, v10.4s, #8 - shl v10.4s, \in0\().4s, #24 - eor v9.16b, v9.16b, v11.16b - sri v13.4s, v11.4s, #8 - shl v11.4s, \in1\().4s, #24 - sri v10.4s, \in0\().4s, #8 - eor \in0\().16b, v8.16b, v12.16b - sri v11.4s, \in1\().4s, #8 - eor \in1\().16b, v9.16b, v13.16b - eor \in0\().16b, v10.16b, \in0\().16b - eor \in1\().16b, v11.16b, \in1\().16b + .macro mul_by_x2_2x, out0, out1, in0, in1, tmp0, tmp1, const + ushr \tmp0\().16b, \in0\().16b, #6 + shl \out0\().16b, \in0\().16b, #2 + ushr \tmp1\().16b, \in1\().16b, #6 + pmul \tmp0\().16b, \tmp0\().16b, \const\().16b + shl \out1\().16b, \in1\().16b, #2 + pmul \tmp1\().16b, \tmp1\().16b, \const\().16b + eor \out0\().16b, \out0\().16b, \tmp0\().16b + eor \out1\().16b, \out1\().16b, \tmp1\().16b .endm - .macro inv_mix_cols_2x, in0, in1 - mul_by_x_2x v8, v9, \in0, \in1, v10, v11, v14 - mul_by_x_2x v8, v9, v8, v9, v10, v11, v14 + .macro mix_columns_2x, in0, in1, enc + .if \enc == 0 + /* Inverse MixColumns: 
pre-multiply by { 5, 0, 4, 0 } */ + mul_by_x2_2x v8, v9, \in0, \in1, v10, v11, v12 eor \in0\().16b, \in0\().16b, v8.16b - eor \in1\().16b, \in1\().16b, v9.16b rev32 v8.8h, v8.8h - rev32 v9.8h, v9.8h - eor \in0\().16b, \in0\().16b, v8.16b - eor \in1\().16b, \in1\().16b, v9.16b - mix_columns_2x \in0, \in1 - .endm - - .macro inv_mix_cols_4x, in0, in1, in2, in3 - mul_by_x_2x v8, v9, \in0, \in1, v10, v11, v14 - mul_by_x_2x v10, v11, \in2, \in3, v12, v13, v14 - mul_by_x_2x v8, v9, v8, v9, v12, v13, v14 - mul_by_x_2x v10, v11, v10, v11, v12, v13, v14 - eor \in0\().16b, \in0\().16b, v8.16b eor \in1\().16b, \in1\().16b, v9.16b - eor \in2\().16b, \in2\().16b, v10.16b - eor \in3\().16b, \in3\().16b, v11.16b - rev32 v8.8h, v8.8h rev32 v9.8h, v9.8h - rev32 v10.8h, v10.8h - rev32 v11.8h, v11.8h eor \in0\().16b, \in0\().16b, v8.16b eor \in1\().16b, \in1\().16b, v9.16b - eor \in2\().16b, \in2\().16b, v10.16b - eor \in3\().16b, \in3\().16b, v11.16b - mix_columns_2x \in0, \in1 - mix_columns_2x \in2, \in3 + .endif + + mul_by_x_2x v8, v9, \in0, \in1, v10, v11, v12 + rev32 v10.8h, \in0\().8h + rev32 v11.8h, \in1\().8h + eor v10.16b, v10.16b, v8.16b + eor v11.16b, v11.16b, v9.16b + eor \in0\().16b, \in0\().16b, v10.16b + eor \in1\().16b, \in1\().16b, v11.16b + tbl \in0\().16b, {\in0\().16b}, v14.16b + tbl \in1\().16b, {\in1\().16b}, v14.16b + eor \in0\().16b, \in0\().16b, v10.16b + eor \in1\().16b, \in1\().16b, v11.16b .endm - .macro do_block_2x, enc, in0, in1 rounds, rk, rkp, i + .macro do_block_2x, enc, in0, in1, rounds, rk, rkp, i ld1 {v15.4s}, [\rk] add \rkp, \rk, #16 mov \i, \rounds 1111: eor \in0\().16b, \in0\().16b, v15.16b /* ^round key */ eor \in1\().16b, \in1\().16b, v15.16b /* ^round key */ - sub_bytes_2x \in0, \in1 + movi v15.16b, #0x40 tbl \in0\().16b, {\in0\().16b}, v13.16b /* ShiftRows */ tbl \in1\().16b, {\in1\().16b}, v13.16b /* ShiftRows */ - ld1 {v15.4s}, [\rkp], #16 + sub_bytes_2x \in0, \in1 subs \i, \i, #1 + ld1 {v15.4s}, [\rkp], #16 beq 2222f - .if \enc == 1 - mix_columns_2x \in0, \in1 - ldr q13, .LForward_ShiftRows - .else - inv_mix_cols_2x \in0, \in1 - ldr q13, .LReverse_ShiftRows - .endif - movi v12.16b, #0x40 + mix_columns_2x \in0, \in1, \enc b 1111b 2222: eor \in0\().16b, \in0\().16b, v15.16b /* ^round key */ eor \in1\().16b, \in1\().16b, v15.16b /* ^round key */ @@ -262,23 +236,17 @@ eor \in1\().16b, \in1\().16b, v15.16b /* ^round key */ eor \in2\().16b, \in2\().16b, v15.16b /* ^round key */ eor \in3\().16b, \in3\().16b, v15.16b /* ^round key */ - sub_bytes_4x \in0, \in1, \in2, \in3 + movi v15.16b, #0x40 tbl \in0\().16b, {\in0\().16b}, v13.16b /* ShiftRows */ tbl \in1\().16b, {\in1\().16b}, v13.16b /* ShiftRows */ tbl \in2\().16b, {\in2\().16b}, v13.16b /* ShiftRows */ tbl \in3\().16b, {\in3\().16b}, v13.16b /* ShiftRows */ - ld1 {v15.4s}, [\rkp], #16 + sub_bytes_4x \in0, \in1, \in2, \in3 subs \i, \i, #1 + ld1 {v15.4s}, [\rkp], #16 beq 2222f - .if \enc == 1 - mix_columns_2x \in0, \in1 - mix_columns_2x \in2, \in3 - ldr q13, .LForward_ShiftRows - .else - inv_mix_cols_4x \in0, \in1, \in2, \in3 - ldr q13, .LReverse_ShiftRows - .endif - movi v12.16b, #0x40 + mix_columns_2x \in0, \in1, \enc + mix_columns_2x \in2, \in3, \enc b 1111b 2222: eor \in0\().16b, \in0\().16b, v15.16b /* ^round key */ eor \in1\().16b, \in1\().16b, v15.16b /* ^round key */ @@ -305,19 +273,7 @@ #include "aes-modes.S" .text - .align 4 -.LForward_ShiftRows: -CPU_LE( .byte 0x0, 0x5, 0xa, 0xf, 0x4, 0x9, 0xe, 0x3 ) -CPU_LE( .byte 0x8, 0xd, 0x2, 0x7, 0xc, 0x1, 0x6, 0xb ) -CPU_BE( .byte 0xb, 0x6, 0x1, 0xc, 0x7, 0x2, 0xd, 0x8 
) -CPU_BE( .byte 0x3, 0xe, 0x9, 0x4, 0xf, 0xa, 0x5, 0x0 ) - -.LReverse_ShiftRows: -CPU_LE( .byte 0x0, 0xd, 0xa, 0x7, 0x4, 0x1, 0xe, 0xb ) -CPU_LE( .byte 0x8, 0x5, 0x2, 0xf, 0xc, 0x9, 0x6, 0x3 ) -CPU_BE( .byte 0x3, 0x6, 0x9, 0xc, 0xf, 0x2, 0x5, 0x8 ) -CPU_BE( .byte 0xb, 0xe, 0x1, 0x4, 0x7, 0xa, 0xd, 0x0 ) - + .align 6 .LForward_Sbox: .byte 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5 .byte 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76 @@ -385,3 +341,12 @@ CPU_BE( .byte 0xb, 0xe, 0x1, 0x4, 0x7, 0xa, 0xd, 0x0 ) .byte 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61 .byte 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26 .byte 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d + +.LForward_ShiftRows: + .octa 0x0b06010c07020d08030e09040f0a0500 + +.LReverse_ShiftRows: + .octa 0x0306090c0f0205080b0e0104070a0d00 + +.Lror32by8: + .octa 0x0c0f0e0d080b0a090407060500030201 From patchwork Sat Jan 28 23:25:39 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 92776 Delivered-To: patch@linaro.org Received: by 10.140.20.99 with SMTP id 90csp847288qgi; Sat, 28 Jan 2017 15:26:31 -0800 (PST) X-Received: by 10.99.138.201 with SMTP id y192mr16352326pgd.146.1485645991598; Sat, 28 Jan 2017 15:26:31 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d65si8383902pfl.73.2017.01.28.15.26.31; Sat, 28 Jan 2017 15:26:31 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=neutral (body hash did not verify) header.i=@linaro.org; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752616AbdA1X0V (ORCPT + 1 other); Sat, 28 Jan 2017 18:26:21 -0500 Received: from mail-wm0-f44.google.com ([74.125.82.44]:37311 "EHLO mail-wm0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752494AbdA1X0V (ORCPT ); Sat, 28 Jan 2017 18:26:21 -0500 Received: by mail-wm0-f44.google.com with SMTP id d140so1641758wmd.0 for ; Sat, 28 Jan 2017 15:26:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ElVhTH7rbhdNxoCETCPbiBrPki0mmyu75g9pZfBtnwA=; b=C9eLTgd5q9rgn3rZFpaxnlYNcWa7KDYWcb4dfNll+3N7UjzZ8AiggrYXv26TP/Xr8b Eu6h9qx5pyCOgyUbkqWwV+lH6AX5cITMsJZUnrX3ifvDa8MSTQXtF619jJnWm8pzTiKs AaPDYKscCiovl90PD+FmyRe6kQ6ONT/O2Vrow= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ElVhTH7rbhdNxoCETCPbiBrPki0mmyu75g9pZfBtnwA=; b=VkxmGJN2BjrhZlTxjUBhT47gsVYF140HbWfNgcJxtCY/OzT3uqXrJnNX3kPSXlEgo8 yjBK2biUwop6pmR31ZMxLH+Tokjo+mlNP/eSwwB+4sdApPk83+TTkbSYY4qsLHJ7g69r 2nCPZOauYllxJst1rd+6tpiyiKBNPsxkyJMEw9JUYpZzOJYeJbBSoFYs7rY8bGnOELwI 96OlQYUP5KpiGnAo51WA7BSW5FVErpdAp26K2DJARiKo3nTl0ojcGnjV7x2GJk4H1XFM e+qZgeLnJzCK0Yf4HNWUoKpHIM+5Lj1vX9pidH7yIUmfYTwW4TrAilbGNA4F1IN7D/C5 1d1g== X-Gm-Message-State: AIkVDXKc/eOBo8xgeucKC0pVTrF4qlnMsRZqLmtKbEu6R2wpT5WOUHiJd6mo6JgGmnRONFzZ X-Received: by 10.223.131.34 with SMTP id 
31mr15619163wrd.119.1485645974306; Sat, 28 Jan 2017 15:26:14 -0800 (PST) Received: from localhost.localdomain ([160.163.215.165]) by smtp.gmail.com with ESMTPSA id 33sm14992064wrd.34.2017.01.28.15.26.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 28 Jan 2017 15:26:13 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org, herbert@gondor.apana.org.au, Ard Biesheuvel Subject: [PATCH v3 10/10] crypto: arm64/aes - replace scalar fallback with plain NEON fallback Date: Sat, 28 Jan 2017 23:25:39 +0000 Message-Id: <1485645939-17126-11-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1485645939-17126-1-git-send-email-ard.biesheuvel@linaro.org> References: <1485645939-17126-1-git-send-email-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org The new bitsliced NEON implementation of AES uses a fallback in two places: CBC encryption (which is strictly sequential, whereas this driver can only operate efficiently on 8 blocks at a time), and the XTS tweak generation, which involves encrypting a single AES block with a different key schedule. The plain (i.e., non-bitsliced) NEON code is more suitable as a fallback, given that it is faster than scalar on low end cores (which is what the NEON implementations target, since high end cores have dedicated instructions for AES), and shows similar behavior in terms of D-cache footprint and sensitivity to cache timing attacks. So switch the fallback handling to the plain NEON driver. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/Kconfig | 2 +- arch/arm64/crypto/aes-neonbs-glue.c | 38 ++++++++++++++------ 2 files changed, 29 insertions(+), 11 deletions(-) -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index 5de75c3dcbd4..bed7feddfeed 100644 --- a/arch/arm64/crypto/Kconfig +++ b/arch/arm64/crypto/Kconfig @@ -86,7 +86,7 @@ config CRYPTO_AES_ARM64_BS tristate "AES in ECB/CBC/CTR/XTS modes using bit-sliced NEON algorithm" depends on KERNEL_MODE_NEON select CRYPTO_BLKCIPHER - select CRYPTO_AES_ARM64 + select CRYPTO_AES_ARM64_NEON_BLK select CRYPTO_SIMD endif diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c index 323dd76ae5f0..863e436ecf89 100644 --- a/arch/arm64/crypto/aes-neonbs-glue.c +++ b/arch/arm64/crypto/aes-neonbs-glue.c @@ -10,7 +10,6 @@ #include #include -#include #include #include #include @@ -42,7 +41,12 @@ asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, int blocks, u8 iv[]); -asmlinkage void __aes_arm64_encrypt(u32 *rk, u8 *out, const u8 *in, int rounds); +/* borrowed from aes-neon-blk.ko */ +asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], + int rounds, int blocks, int first); +asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], + int rounds, int blocks, u8 iv[], + int first); struct aesbs_ctx { u8 rk[13 * (8 * AES_BLOCK_SIZE) + 32]; @@ -140,16 +144,28 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key, return 0; } -static void cbc_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst) +static int 
cbc_encrypt(struct skcipher_request *req) { + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + int err, first = 1; - __aes_arm64_encrypt(ctx->enc, dst, src, ctx->key.rounds); -} + err = skcipher_walk_virt(&walk, req, true); -static int cbc_encrypt(struct skcipher_request *req) -{ - return crypto_cbc_encrypt_walk(req, cbc_encrypt_one); + kernel_neon_begin(); + while (walk.nbytes >= AES_BLOCK_SIZE) { + unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; + + /* fall back to the non-bitsliced NEON implementation */ + neon_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, + ctx->enc, ctx->key.rounds, blocks, walk.iv, + first); + err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); + first = 0; + } + kernel_neon_end(); + return err; } static int cbc_decrypt(struct skcipher_request *req) @@ -254,9 +270,11 @@ static int __xts_crypt(struct skcipher_request *req, err = skcipher_walk_virt(&walk, req, true); - __aes_arm64_encrypt(ctx->twkey, walk.iv, walk.iv, ctx->key.rounds); - kernel_neon_begin(); + + neon_aes_ecb_encrypt(walk.iv, walk.iv, ctx->twkey, + ctx->key.rounds, 1, 1); + while (walk.nbytes >= AES_BLOCK_SIZE) { unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
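The cbc_encrypt() routine above hands each walk to neon_aes_cbc_encrypt()
precisely because of the chaining dependency the commit message describes: in
CBC encryption every ciphertext block is an input to the next one, so the
blocks cannot be computed eight at a time. A minimal C model of the mode makes
the dependency explicit; this is an illustrative sketch only, and encrypt_block
below is a placeholder for a single-block AES primitive rather than a function
from this series.

#include <stdint.h>
#include <string.h>

#define AES_BLOCK_SIZE  16

/* stand-in for a single-block AES encryption with an expanded key */
typedef void (*aes_block_fn)(uint8_t out[AES_BLOCK_SIZE],
                             const uint8_t in[AES_BLOCK_SIZE],
                             const void *key);

/* Schematic CBC encryption: ciphertext block i feeds the XOR for block
 * i + 1, so there is a strict serial dependency between blocks. */
static void cbc_encrypt_model(uint8_t *dst, const uint8_t *src,
                              unsigned int blocks,
                              uint8_t iv[AES_BLOCK_SIZE],
                              aes_block_fn encrypt_block, const void *key)
{
        uint8_t buf[AES_BLOCK_SIZE];
        unsigned int i, j;

        for (i = 0; i < blocks; i++) {
                for (j = 0; j < AES_BLOCK_SIZE; j++)
                        buf[j] = src[j] ^ iv[j];
                encrypt_block(dst, buf, key);
                memcpy(iv, dst, AES_BLOCK_SIZE); /* chaining value for next block */
                src += AES_BLOCK_SIZE;
                dst += AES_BLOCK_SIZE;
        }
}

CBC decryption and CTR have no such inter-block dependency, which is why those
paths can stay on the 8-way bitsliced code.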
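The other fallback site, in __xts_crypt() above, only needs to encrypt the IV
once with the tweak key to produce the initial XTS tweak; the bitsliced
assembly derives the per-block tweaks from it on its own. For reference, that
per-block derivation in standard XTS is a multiplication by x in GF(2^128)
with the polynomial x^128 + x^7 + x^2 + x + 1. The sketch below models that
update; it is not code from the patch, and the byte order follows the usual
little-endian XTS convention.

#include <stdint.h>

#define AES_BLOCK_SIZE  16

/* Derive the next XTS tweak from the previous one: multiply by x in
 * GF(2^128), carrying from the low byte upwards and reducing the bit
 * shifted out of the top byte with the constant 0x87. */
static void xts_next_tweak(uint8_t t[AES_BLOCK_SIZE])
{
        unsigned int carry = 0, i;

        for (i = 0; i < AES_BLOCK_SIZE; i++) {
                unsigned int next_carry = t[i] >> 7;

                t[i] = (t[i] << 1) | carry;
                carry = next_carry;
        }
        if (carry)
                t[0] ^= 0x87;
}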