From patchwork Tue Nov 17 13:32:11 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 326411
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-13.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
	DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, INCLUDES_PATCH,
	MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id D09D8C64E90
	for ; Tue, 17 Nov 2020 13:50:51 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 71B1F24199
	for ; Tue, 17 Nov 2020 13:50:51 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org
	header.b="cet0n5Jf"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1732324AbgKQNu3 (ORCPT ); Tue, 17 Nov 2020 08:50:29 -0500
Received: from mail.kernel.org ([198.145.29.99]:42358 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1732319AbgKQNch (ORCPT ); Tue, 17 Nov 2020 08:32:37 -0500
Received: from e123331-lin.nice.arm.com
	(lfbn-nic-1-188-42.w2-15.abo.wanadoo.fr [2.15.37.42])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPSA id 8B5ED2463D;
	Tue, 17 Nov 2020 13:32:36 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=default; t=1605619958;
	bh=Iyf4Rgv1+x9PtEcqoWVcwfYPM8emMHSynQZDAw8Vlh8=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=cet0n5JfTryI2r5wS0lajmlBaXW1o1RRiB4pZeSqFUJRK0Wwka+h0yt1RlMiiiUbm
	 VXdHZ6gmvmY4vFt072CdGV3xX1YatYMgxHBI6/vxVm/7DPdVepaOZ8fjkdrp92Tnmb
	 wZ7aKg3DlscXPu8H+jZ5tdfyq2Ctp+NBCqcyRzzk=
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org,
	Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers
Subject: [PATCH v3 1/4] crypto: aegis128 - wipe plaintext and tag if
	decryption fails
Date: Tue, 17 Nov 2020 14:32:11 +0100
Message-Id: <20201117133214.29114-2-ardb@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20201117133214.29114-1-ardb@kernel.org>
References: <20201117133214.29114-1-ardb@kernel.org>
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org

The AEGIS spec mentions explicitly that the security guarantees hold only
if the resulting plaintext and tag of a failed decryption are withheld. So
ensure that we abide by this.

While at it, drop the unused struct aead_request *req parameter from
crypto_aegis128_process_crypt().

Reviewed-by: Ondrej Mosnacek
Signed-off-by: Ard Biesheuvel
---
 crypto/aegis128-core.c | 32 ++++++++++++++++----
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
index 44fb4956f0dd..3a71235892f5 100644
--- a/crypto/aegis128-core.c
+++ b/crypto/aegis128-core.c
@@ -154,6 +154,12 @@ static void crypto_aegis128_ad(struct aegis_state *state,
 	}
 }
 
+static void crypto_aegis128_wipe_chunk(struct aegis_state *state, u8 *dst,
+				       const u8 *src, unsigned int size)
+{
+	memzero_explicit(dst, size);
+}
+
 static void crypto_aegis128_encrypt_chunk(struct aegis_state *state, u8 *dst,
 					  const u8 *src, unsigned int size)
 {
@@ -324,7 +330,6 @@ static void crypto_aegis128_process_ad(struct aegis_state *state,
 
 static __always_inline
 int crypto_aegis128_process_crypt(struct aegis_state *state,
-				  struct aead_request *req,
 				  struct skcipher_walk *walk,
 				  void (*crypt)(struct aegis_state *state,
 					        u8 *dst, const u8 *src,
@@ -403,14 +408,14 @@ static int crypto_aegis128_encrypt(struct aead_request *req)
 	if (aegis128_do_simd()) {
 		crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, req, &walk,
+		crypto_aegis128_process_crypt(&state, &walk,
 				crypto_aegis128_encrypt_chunk_simd);
 		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
 					   cryptlen);
 	} else {
 		crypto_aegis128_init(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, req, &walk,
+		crypto_aegis128_process_crypt(&state, &walk,
 				crypto_aegis128_encrypt_chunk);
 		crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
 	}
@@ -438,19 +443,34 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
 	if (aegis128_do_simd()) {
 		crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, req, &walk,
+		crypto_aegis128_process_crypt(&state, &walk,
 				crypto_aegis128_decrypt_chunk_simd);
 		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
 					   cryptlen);
 	} else {
 		crypto_aegis128_init(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, req, &walk,
+		crypto_aegis128_process_crypt(&state, &walk,
 				crypto_aegis128_decrypt_chunk);
 		crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
 	}
 
-	return crypto_memneq(tag.bytes, zeros, authsize) ? -EBADMSG : 0;
+	if (unlikely(crypto_memneq(tag.bytes, zeros, authsize))) {
+		/*
+		 * From Chapter 4. 'Security Analysis' of the AEGIS spec [0]
+		 *
+		 * "3. If verification fails, the decrypted plaintext and the
+		 *     wrong authentication tag should not be given as output."
+		 *
+		 * [0] https://competitions.cr.yp.to/round3/aegisv11.pdf
+		 */
+		skcipher_walk_aead_decrypt(&walk, req, false);
+		crypto_aegis128_process_crypt(NULL, &walk,
+					      crypto_aegis128_wipe_chunk);
+		memzero_explicit(&tag, sizeof(tag));
+		return -EBADMSG;
+	}
+	return 0;
 }
 
 static struct aead_alg crypto_aegis128_alg = {

From patchwork Tue Nov 17 13:32:12 2020
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 324679
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org,
	Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers
Subject: [PATCH v3 2/4] crypto: aegis128/neon - optimize tail block handling
Date: Tue, 17 Nov 2020 14:32:12 +0100
Message-Id: <20201117133214.29114-3-ardb@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20201117133214.29114-1-ardb@kernel.org>
References: <20201117133214.29114-1-ardb@kernel.org>
X-Mailing-List: linux-crypto@vger.kernel.org

Avoid copying the tail block via a stack buffer if the total size exceeds
a single AEGIS block. In this case, we can use overlapping loads and stores
and NEON permutation instructions instead, which leads to a modest
performance improvement on some cores (< 5%), and is slightly cleaner.

Note that we still need to use a stack buffer if the entire input is
smaller than 16 bytes, given that we cannot use 16 byte NEON loads and
stores safely in this case.
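
The overlapping load/store idea can be illustrated with a scalar model. This
is only a sketch, not code from the patch: tbl16() is a hypothetical stand-in
for the NEON vqtbl1q_u8 table lookup (an out-of-range index yields 0), and
load_tail() and store_tail() are illustrative names. It assumes the buffer
holds at least 16 bytes in total, so the 16-byte window that ends at the tail
is safe to access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 16

/* Same layout as the permute[] table in the patch: 0xff, identity, 0xff. */
static const uint8_t permute[48] = {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
};

/* Scalar model of a 16-byte table lookup: indexes >= 16 select zero. */
static void tbl16(uint8_t *out, const uint8_t *tbl, const uint8_t *idx)
{
	uint8_t tmp[BLOCK];

	for (int i = 0; i < BLOCK; i++)
		tmp[i] = idx[i] < BLOCK ? tbl[idx[i]] : 0;
	memcpy(out, tmp, BLOCK);
}

/*
 * Read the 16-byte window that ends at 'end' and move its last 'size'
 * bytes to the start of 'm', zero padding the remainder.
 */
static void load_tail(uint8_t m[BLOCK], const uint8_t *end, unsigned int size)
{
	assert(size > 0 && size < BLOCK);
	tbl16(m, end - BLOCK, permute + 32 - size);
}

/*
 * Write the first 'size' bytes of 'm' as the last 'size' bytes before
 * 'end'. The 16 - size bytes in front of them are overwritten with zeroes,
 * so the caller has to store the preceding full block again afterwards.
 */
static void store_tail(uint8_t *end, const uint8_t m[BLOCK], unsigned int size)
{
	assert(size > 0 && size < BLOCK);
	tbl16(end - BLOCK, m, permute + size);
}
```

With a 21-byte input, load_tail(m, buf + 21, 5) yields the five tail bytes
followed by eleven zero bytes, which is the zero-padded final block that the
state update expects.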

Signed-off-by: Ard Biesheuvel
---
 crypto/aegis128-neon-inner.c | 89 +++++++++++++++++---
 1 file changed, 75 insertions(+), 14 deletions(-)

diff --git a/crypto/aegis128-neon-inner.c b/crypto/aegis128-neon-inner.c
index 2a660ac1bc3a..cd1b3ad1d1f3 100644
--- a/crypto/aegis128-neon-inner.c
+++ b/crypto/aegis128-neon-inner.c
@@ -20,7 +20,6 @@
 extern int aegis128_have_aes_insn;
 
 void *memcpy(void *dest, const void *src, size_t n);
-void *memset(void *s, int c, size_t n);
 
 struct aegis128_state {
 	uint8x16_t v[5];
@@ -173,10 +172,46 @@ void crypto_aegis128_update_neon(void *state, const void *msg)
 	aegis128_save_state_neon(st, state);
 }
 
+#ifdef CONFIG_ARM
+/*
+ * AArch32 does not provide these intrinsics natively because it does not
+ * implement the underlying instructions. AArch32 only provides 64-bit
+ * wide vtbl.8/vtbx.8 instruction, so use those instead.
+ */
+static uint8x16_t vqtbl1q_u8(uint8x16_t a, uint8x16_t b)
+{
+	union {
+		uint8x16_t	val;
+		uint8x8x2_t	pair;
+	} __a = { a };
+
+	return vcombine_u8(vtbl2_u8(__a.pair, vget_low_u8(b)),
+			   vtbl2_u8(__a.pair, vget_high_u8(b)));
+}
+
+static uint8x16_t vqtbx1q_u8(uint8x16_t v, uint8x16_t a, uint8x16_t b)
+{
+	union {
+		uint8x16_t	val;
+		uint8x8x2_t	pair;
+	} __a = { a };
+
+	return vcombine_u8(vtbx2_u8(vget_low_u8(v), __a.pair, vget_low_u8(b)),
+			   vtbx2_u8(vget_high_u8(v), __a.pair, vget_high_u8(b)));
+}
+#endif
+
+static const uint8_t permute[] __aligned(64) = {
+	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+	 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
+	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+};
+
 void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size)
 {
 	struct aegis128_state st = aegis128_load_state_neon(state);
+	const int short_input = size < AEGIS_BLOCK_SIZE;
 	uint8x16_t msg;
 
 	preload_sbox();
@@ -186,7 +221,8 @@ void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
 
 		msg = vld1q_u8(src);
 		st = aegis128_update_neon(st, msg);
-		vst1q_u8(dst, msg ^ s);
+		msg ^= s;
+		vst1q_u8(dst, msg);
 
 		size -= AEGIS_BLOCK_SIZE;
 		src += AEGIS_BLOCK_SIZE;
@@ -195,13 +231,26 @@ void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
 
 	if (size > 0) {
 		uint8x16_t s = st.v[1] ^ (st.v[2] & st.v[3]) ^ st.v[4];
-		uint8_t buf[AEGIS_BLOCK_SIZE] = {};
+		uint8_t buf[AEGIS_BLOCK_SIZE];
+		const void *in = src;
+		void *out = dst;
+		uint8x16_t m;
 
-		memcpy(buf, src, size);
-		msg = vld1q_u8(buf);
-		st = aegis128_update_neon(st, msg);
-		vst1q_u8(buf, msg ^ s);
-		memcpy(dst, buf, size);
+		if (__builtin_expect(short_input, 0))
+			in = out = memcpy(buf + AEGIS_BLOCK_SIZE - size, src, size);
+
+		m = vqtbl1q_u8(vld1q_u8(in + size - AEGIS_BLOCK_SIZE),
+			       vld1q_u8(permute + 32 - size));
+
+		st = aegis128_update_neon(st, m);
+
+		vst1q_u8(out + size - AEGIS_BLOCK_SIZE,
+			 vqtbl1q_u8(m ^ s, vld1q_u8(permute + size)));
+
+		if (__builtin_expect(short_input, 0))
+			memcpy(dst, out, size);
+		else
+			vst1q_u8(out - AEGIS_BLOCK_SIZE, msg);
 	}
 
 	aegis128_save_state_neon(st, state);
@@ -211,6 +260,7 @@ void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size)
 {
 	struct aegis128_state st = aegis128_load_state_neon(state);
+	const int short_input = size < AEGIS_BLOCK_SIZE;
 	uint8x16_t msg;
 
 	preload_sbox();
@@ -228,14 +278,25 @@ void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 	if (size > 0) {
 		uint8x16_t s = st.v[1] ^ (st.v[2] & st.v[3]) ^ st.v[4];
 		uint8_t buf[AEGIS_BLOCK_SIZE];
+		const void *in = src;
+		void *out = dst;
+		uint8x16_t m;
 
-		vst1q_u8(buf, s);
-		memcpy(buf, src, size);
-		msg = vld1q_u8(buf) ^ s;
-		vst1q_u8(buf, msg);
-		memcpy(dst, buf, size);
+		if (__builtin_expect(short_input, 0))
+			in = out = memcpy(buf + AEGIS_BLOCK_SIZE - size, src, size);
 
-		st = aegis128_update_neon(st, msg);
+		m = s ^ vqtbx1q_u8(s, vld1q_u8(in + size - AEGIS_BLOCK_SIZE),
+				   vld1q_u8(permute + 32 - size));
+
+		st = aegis128_update_neon(st, m);
+
+		vst1q_u8(out + size - AEGIS_BLOCK_SIZE,
+			 vqtbl1q_u8(m, vld1q_u8(permute + size)));
+
+		if (__builtin_expect(short_input, 0))
+			memcpy(dst, out, size);
+		else
+			vst1q_u8(out - AEGIS_BLOCK_SIZE, msg);
 	}
 
 	aegis128_save_state_neon(st, state);

From patchwork Tue Nov 17 13:32:13 2020
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 326082
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org,
	Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers
Subject: [PATCH v3 3/4] crypto: aegis128/neon - move final tag check to SIMD
	domain
Date: Tue, 17 Nov 2020 14:32:13 +0100
Message-Id: <20201117133214.29114-4-ardb@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20201117133214.29114-1-ardb@kernel.org>
References: <20201117133214.29114-1-ardb@kernel.org>
X-Mailing-List: linux-crypto@vger.kernel.org

Instead of calculating the tag and returning it to the caller on
decryption, use a SIMD compare and min across vector to perform
the comparison. This is slightly more efficient, and removes the
need on the caller's part to wipe the tag from memory if the
decryption failed.

While at it, switch to unsigned int when passing cryptlen and
assoclen - we don't support input sizes where it matters anyway.
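
The compare-and-minimum check can be modelled in scalar C. The sketch below
follows the data flow described above (per-byte compare, invert, keep only
the first authsize lanes, signed minimum across the vector), but
tag_mismatch() is an illustrative name and the plain loop stands in for the
NEON intrinsics. It makes no claim about the constant-time behaviour of the
compiled code.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK 16

/*
 * Returns 0 if the first 'authsize' bytes of 'calc' and 'recv' are equal,
 * and -1 otherwise.
 */
static int tag_mismatch(const uint8_t calc[BLOCK], const uint8_t recv[BLOCK],
			unsigned int authsize)
{
	int8_t min = 0;

	assert(authsize > 0 && authsize <= BLOCK);

	for (unsigned int i = 0; i < BLOCK; i++) {
		/*
		 * Inverted byte compare: all ones (-1 as a signed byte) where
		 * the lanes differ, zero where they are equal.
		 */
		int8_t lane = (calc[i] != recv[i]) ? -1 : 0;

		/*
		 * The table lookup in the patch keeps only the first
		 * 'authsize' lanes and zeroes the rest. Lane order does not
		 * matter for a minimum, so masking is equivalent here.
		 */
		if (i >= authsize)
			lane = 0;

		/* Signed minimum across all lanes: any mismatch gives -1. */
		if (lane < min)
			min = lane;
	}
	return min;
}
```

A nonzero return value corresponds to the -EBADMSG path in the caller, and
bytes beyond authsize never influence the result.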

Signed-off-by: Ard Biesheuvel
---
 crypto/aegis128-core.c       | 21 +++++++++----
 crypto/aegis128-neon-inner.c | 33 ++++++++++++++++----
 crypto/aegis128-neon.c       | 21 +++++++++----
 3 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
index 3a71235892f5..859c7b905618 100644
--- a/crypto/aegis128-core.c
+++ b/crypto/aegis128-core.c
@@ -67,9 +67,11 @@ void crypto_aegis128_encrypt_chunk_simd(struct aegis_state *state, u8 *dst,
 					const u8 *src, unsigned int size);
 void crypto_aegis128_decrypt_chunk_simd(struct aegis_state *state, u8 *dst,
 					const u8 *src, unsigned int size);
-void crypto_aegis128_final_simd(struct aegis_state *state,
-				union aegis_block *tag_xor,
-				u64 assoclen, u64 cryptlen);
+int crypto_aegis128_final_simd(struct aegis_state *state,
+			       union aegis_block *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize);
 
 static void crypto_aegis128_update(struct aegis_state *state)
 {
@@ -411,7 +413,7 @@ static int crypto_aegis128_encrypt(struct aead_request *req)
 		crypto_aegis128_process_crypt(&state, &walk,
 				crypto_aegis128_encrypt_chunk_simd);
 		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
-					   cryptlen);
+					   cryptlen, 0);
 	} else {
 		crypto_aegis128_init(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
@@ -445,8 +447,15 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
 		crypto_aegis128_process_crypt(&state, &walk,
 				crypto_aegis128_decrypt_chunk_simd);
-		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
-					   cryptlen);
+		if (unlikely(crypto_aegis128_final_simd(&state, &tag,
+							req->assoclen,
+							cryptlen, authsize))) {
+			skcipher_walk_aead_decrypt(&walk, req, false);
+			crypto_aegis128_process_crypt(NULL, req, &walk,
+					crypto_aegis128_wipe_chunk);
+			return -EBADMSG;
+		}
+		return 0;
 	} else {
 		crypto_aegis128_init(&state, &ctx->key, req->iv);
 		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
diff --git a/crypto/aegis128-neon-inner.c b/crypto/aegis128-neon-inner.c
index cd1b3ad1d1f3..7de485907d81 100644
--- a/crypto/aegis128-neon-inner.c
+++ b/crypto/aegis128-neon-inner.c
@@ -199,6 +199,17 @@ static uint8x16_t vqtbx1q_u8(uint8x16_t v, uint8x16_t a, uint8x16_t b)
 	return vcombine_u8(vtbx2_u8(vget_low_u8(v), __a.pair, vget_low_u8(b)),
 			   vtbx2_u8(vget_high_u8(v), __a.pair, vget_high_u8(b)));
 }
+
+static int8_t vminvq_s8(int8x16_t v)
+{
+	int8x8_t s = vpmin_s8(vget_low_s8(v), vget_high_s8(v));
+
+	s = vpmin_s8(s, s);
+	s = vpmin_s8(s, s);
+	s = vpmin_s8(s, s);
+
+	return vget_lane_s8(s, 0);
+}
 #endif
 
 static const uint8_t permute[] __aligned(64) = {
@@ -302,8 +313,10 @@ void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 	aegis128_save_state_neon(st, state);
 }
 
-void crypto_aegis128_final_neon(void *state, void *tag_xor, uint64_t assoclen,
-				uint64_t cryptlen)
+int crypto_aegis128_final_neon(void *state, void *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize)
 {
 	struct aegis128_state st = aegis128_load_state_neon(state);
 	uint8x16_t v;
@@ -311,13 +324,21 @@ void crypto_aegis128_final_neon(void *state, void *tag_xor, uint64_t assoclen,
 
 	preload_sbox();
 
-	v = st.v[3] ^ (uint8x16_t)vcombine_u64(vmov_n_u64(8 * assoclen),
-					       vmov_n_u64(8 * cryptlen));
+	v = st.v[3] ^ (uint8x16_t)vcombine_u64(vmov_n_u64(8ULL * assoclen),
+					       vmov_n_u64(8ULL * cryptlen));
 
 	for (i = 0; i < 7; i++)
 		st = aegis128_update_neon(st, v);
 
-	v = vld1q_u8(tag_xor);
-	v ^= st.v[0] ^ st.v[1] ^ st.v[2] ^ st.v[3] ^ st.v[4];
+	v = st.v[0] ^ st.v[1] ^ st.v[2] ^ st.v[3] ^ st.v[4];
+
+	if (authsize > 0) {
+		v = vqtbl1q_u8(~vceqq_u8(v, vld1q_u8(tag_xor)),
+			       vld1q_u8(permute + authsize));
+
+		return vminvq_s8((int8x16_t)v);
+	}
+
 	vst1q_u8(tag_xor, v);
+	return 0;
 }
diff --git a/crypto/aegis128-neon.c b/crypto/aegis128-neon.c
index 8271b1fa0fbc..94d591a002a4 100644
--- a/crypto/aegis128-neon.c
+++ b/crypto/aegis128-neon.c
@@ -14,8 +14,10 @@ void crypto_aegis128_encrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size);
 void crypto_aegis128_decrypt_chunk_neon(void *state, void *dst, const void *src,
 					unsigned int size);
-void crypto_aegis128_final_neon(void *state, void *tag_xor, uint64_t assoclen,
-				uint64_t cryptlen);
+int crypto_aegis128_final_neon(void *state, void *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize);
 
 int aegis128_have_aes_insn __ro_after_init;
 
@@ -60,11 +62,18 @@ void crypto_aegis128_decrypt_chunk_simd(union aegis_block *state, u8 *dst,
 	kernel_neon_end();
 }
 
-void crypto_aegis128_final_simd(union aegis_block *state,
-				union aegis_block *tag_xor,
-				u64 assoclen, u64 cryptlen)
+int crypto_aegis128_final_simd(union aegis_block *state,
+			       union aegis_block *tag_xor,
+			       unsigned int assoclen,
+			       unsigned int cryptlen,
+			       unsigned int authsize)
 {
+	int ret;
+
 	kernel_neon_begin();
-	crypto_aegis128_final_neon(state, tag_xor, assoclen, cryptlen);
+	ret = crypto_aegis128_final_neon(state, tag_xor, assoclen, cryptlen,
+					 authsize);
 	kernel_neon_end();
+
+	return ret;
 }

From patchwork Tue Nov 17 13:32:14 2020
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 326085
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org,
	Ard Biesheuvel, Ondrej Mosnacek, Eric Biggers
Subject: [PATCH v3 4/4] crypto: aegis128 - expose SIMD code path as separate
	driver
Date: Tue, 17 Nov 2020 14:32:14 +0100
Message-Id: <20201117133214.29114-5-ardb@kernel.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20201117133214.29114-1-ardb@kernel.org>
References: <20201117133214.29114-1-ardb@kernel.org>
X-Mailing-List: linux-crypto@vger.kernel.org

Wiring the SIMD code into the generic driver has the unfortunate side
effect that the tcrypt testing code cannot distinguish them, and will
therefore not use the latter to fuzz test the former, as it does for
other algorithms.

So let's refactor the code a bit so we can register two implementations:
aegis128-generic and aegis128-simd.

Signed-off-by: Ard Biesheuvel
Reviewed-by: Ondrej Mosnacek
---
 crypto/aegis128-core.c | 220 +++++++++++++-------
 1 file changed, 143 insertions(+), 77 deletions(-)

diff --git a/crypto/aegis128-core.c b/crypto/aegis128-core.c
index 859c7b905618..2b05f79475d3 100644
--- a/crypto/aegis128-core.c
+++ b/crypto/aegis128-core.c
@@ -86,9 +86,10 @@ static void crypto_aegis128_update(struct aegis_state *state)
 }
 
 static void crypto_aegis128_update_a(struct aegis_state *state,
-				     const union aegis_block *msg)
+				     const union aegis_block *msg,
+				     bool do_simd)
 {
-	if (aegis128_do_simd()) {
+	if (do_simd) {
 		crypto_aegis128_update_simd(state, msg);
 		return;
 	}
@@ -97,9 +98,10 @@ static void crypto_aegis128_update_a(struct aegis_state *state,
 	crypto_aegis_block_xor(&state->blocks[0], msg);
 }
 
-static void crypto_aegis128_update_u(struct aegis_state *state, const void *msg)
+static void crypto_aegis128_update_u(struct aegis_state *state, const void *msg,
+				     bool do_simd)
 {
-	if (aegis128_do_simd()) {
+	if (do_simd) {
 		crypto_aegis128_update_simd(state, msg);
 		return;
 	}
@@ -128,27 +130,28 @@ static void crypto_aegis128_init(struct aegis_state *state,
 	crypto_aegis_block_xor(&state->blocks[4], &crypto_aegis_const[1]);
 
 	for (i = 0; i < 5; i++) {
-		crypto_aegis128_update_a(state, key);
-		crypto_aegis128_update_a(state, &key_iv);
+		crypto_aegis128_update_a(state, key, false);
+		crypto_aegis128_update_a(state, &key_iv, false);
 	}
 }
 
 static void crypto_aegis128_ad(struct aegis_state *state,
-			       const u8 *src, unsigned int size)
+			       const u8 *src, unsigned int size,
+			       bool do_simd)
 {
 	if (AEGIS_ALIGNED(src)) {
 		const union aegis_block *src_blk =
 				(const union aegis_block *)src;
 
 		while (size >= AEGIS_BLOCK_SIZE) {
-			crypto_aegis128_update_a(state, src_blk);
+			crypto_aegis128_update_a(state, src_blk, do_simd);
 
 			size -= AEGIS_BLOCK_SIZE;
 			src_blk++;
 		}
 	} else {
 		while (size >= AEGIS_BLOCK_SIZE) {
-			crypto_aegis128_update_u(state, src);
+			crypto_aegis128_update_u(state, src, do_simd);
 
 			size -= AEGIS_BLOCK_SIZE;
 			src += AEGIS_BLOCK_SIZE;
@@ -180,7 +183,7 @@ static void crypto_aegis128_encrypt_chunk(struct aegis_state *state, u8 *dst,
 			crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 			crypto_aegis_block_xor(&tmp, src_blk);
 
-			crypto_aegis128_update_a(state, src_blk);
+			crypto_aegis128_update_a(state, src_blk, false);
 
 			*dst_blk = tmp;
 
@@ -196,7 +199,7 @@ static void crypto_aegis128_encrypt_chunk(struct aegis_state *state, u8 *dst,
 			crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 			crypto_xor(tmp.bytes, src, AEGIS_BLOCK_SIZE);
 
-			crypto_aegis128_update_u(state, src);
+			crypto_aegis128_update_u(state, src, false);
 
 			memcpy(dst, tmp.bytes, AEGIS_BLOCK_SIZE);
 
@@ -215,7 +218,7 @@ static void crypto_aegis128_encrypt_chunk(struct aegis_state *state, u8 *dst,
 		crypto_aegis_block_xor(&tmp, &state->blocks[4]);
 		crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 
-		crypto_aegis128_update_a(state, &msg);
+		crypto_aegis128_update_a(state, &msg, false);
 
 		crypto_aegis_block_xor(&msg, &tmp);
 
@@ -241,7 +244,7 @@ static void crypto_aegis128_decrypt_chunk(struct aegis_state *state, u8 *dst,
 			crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 			crypto_aegis_block_xor(&tmp, src_blk);
 
-			crypto_aegis128_update_a(state, &tmp);
+			crypto_aegis128_update_a(state, &tmp, false);
 
 			*dst_blk = tmp;
 
@@ -257,7 +260,7 @@ static void crypto_aegis128_decrypt_chunk(struct aegis_state *state, u8 *dst,
 			crypto_aegis_block_xor(&tmp, &state->blocks[1]);
 			crypto_xor(tmp.bytes, src, AEGIS_BLOCK_SIZE);
 
-			crypto_aegis128_update_a(state, &tmp);
+			crypto_aegis128_update_a(state, &tmp, false);
 
 			memcpy(dst, tmp.bytes, AEGIS_BLOCK_SIZE);
 
@@ -279,7 +282,7 @@ static void crypto_aegis128_decrypt_chunk(struct aegis_state *state, u8 *dst,
 
 		memset(msg.bytes + size, 0, AEGIS_BLOCK_SIZE - size);
 
-		crypto_aegis128_update_a(state, &msg);
+		crypto_aegis128_update_a(state, &msg, false);
 
 		memcpy(dst, msg.bytes, size);
 	}
@@ -287,7 +290,8 @@ static void crypto_aegis128_decrypt_chunk(struct aegis_state *state, u8 *dst,
 
 static void crypto_aegis128_process_ad(struct aegis_state *state,
 				       struct scatterlist *sg_src,
-				       unsigned int assoclen)
+				       unsigned int assoclen,
+				       bool do_simd)
 {
 	struct scatter_walk walk;
 	union aegis_block buf;
@@ -304,13 +308,13 @@ static void crypto_aegis128_process_ad(struct aegis_state *state,
 			if (pos > 0) {
 				unsigned int fill = AEGIS_BLOCK_SIZE - pos;
 				memcpy(buf.bytes + pos, src, fill);
-				crypto_aegis128_update_a(state, &buf);
+				crypto_aegis128_update_a(state, &buf, do_simd);
 				pos = 0;
 				left -= fill;
 				src += fill;
 			}
 
-			crypto_aegis128_ad(state, src, left);
+			crypto_aegis128_ad(state, src, left, do_simd);
 			src += left & ~(AEGIS_BLOCK_SIZE - 1);
 			left &= AEGIS_BLOCK_SIZE - 1;
 		}
@@ -326,7 +330,7 @@ static void crypto_aegis128_process_ad(struct aegis_state *state,
 
 	if (pos > 0) {
 		memset(buf.bytes + pos, 0, AEGIS_BLOCK_SIZE - pos);
-		crypto_aegis128_update_a(state, &buf);
+		crypto_aegis128_update_a(state, &buf, do_simd);
 	}
 }
 
@@ -368,7 +372,7 @@ static void crypto_aegis128_final(struct aegis_state *state,
 	crypto_aegis_block_xor(&tmp, &state->blocks[3]);
 
 	for (i = 0; i < 7; i++)
-		crypto_aegis128_update_a(state, &tmp);
+		crypto_aegis128_update_a(state, &tmp, false);
 
 	for (i = 0; i < AEGIS128_STATE_BLOCKS; i++)
 		crypto_aegis_block_xor(tag_xor, &state->blocks[i]);
@@ -396,7 +400,7 @@ static int crypto_aegis128_setauthsize(struct crypto_aead *tfm,
 	return 0;
 }
 
-static int crypto_aegis128_encrypt(struct aead_request *req)
+static int crypto_aegis128_encrypt_generic(struct aead_request *req)
 {
 	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
 	union aegis_block tag = {};
@@ -407,27 +411,18 @@ static int crypto_aegis128_encrypt(struct aead_request *req)
 	struct aegis_state state;
 
 	skcipher_walk_aead_encrypt(&walk, req, false);
-	if (aegis128_do_simd()) {
-		crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
-		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, &walk,
-				crypto_aegis128_encrypt_chunk_simd);
-		crypto_aegis128_final_simd(&state, &tag, req->assoclen,
-					   cryptlen, 0);
-	} else {
-		crypto_aegis128_init(&state, &ctx->key, req->iv);
-		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, &walk,
-				crypto_aegis128_encrypt_chunk);
-		crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
-	}
+	crypto_aegis128_init(&state, &ctx->key, req->iv);
+	crypto_aegis128_process_ad(&state, req->src, req->assoclen, false);
+	crypto_aegis128_process_crypt(&state, &walk,
+				      crypto_aegis128_encrypt_chunk);
+	crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
 
 	scatterwalk_map_and_copy(tag.bytes, req->dst, req->assoclen + cryptlen,
 				 authsize, 1);
 	return 0;
 }
 
-static int crypto_aegis128_decrypt(struct aead_request *req)
+static int crypto_aegis128_decrypt_generic(struct aead_request *req)
 {
 	static const u8 zeros[AEGIS128_MAX_AUTH_SIZE] = {};
 	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
@@ -442,27 +437,11 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
 				 authsize, 0);
 
 	skcipher_walk_aead_decrypt(&walk, req, false);
-	if (aegis128_do_simd()) {
-		crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
-		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, &walk,
-				crypto_aegis128_decrypt_chunk_simd);
-		if (unlikely(crypto_aegis128_final_simd(&state, &tag,
-							req->assoclen,
-							cryptlen, authsize))) {
-			skcipher_walk_aead_decrypt(&walk, req, false);
-			crypto_aegis128_process_crypt(NULL, req, &walk,
-					crypto_aegis128_wipe_chunk);
-			return -EBADMSG;
-		}
-		return 0;
-	} else {
-		crypto_aegis128_init(&state, &ctx->key, req->iv);
-		crypto_aegis128_process_ad(&state, req->src, req->assoclen);
-		crypto_aegis128_process_crypt(&state, &walk,
-				crypto_aegis128_decrypt_chunk);
-		crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
-	}
+	crypto_aegis128_init(&state, &ctx->key, req->iv);
+	crypto_aegis128_process_ad(&state, req->src, req->assoclen, false);
+	crypto_aegis128_process_crypt(&state, &walk,
+				      crypto_aegis128_decrypt_chunk);
+	crypto_aegis128_final(&state, &tag, req->assoclen, cryptlen);
 
 	if (unlikely(crypto_memneq(tag.bytes, zeros, authsize))) {
 		/*
@@ -482,42 +461,128 @@ static int crypto_aegis128_decrypt(struct aead_request *req)
 	return 0;
 }
 
-static struct aead_alg crypto_aegis128_alg = {
-	.setkey = crypto_aegis128_setkey,
-	.setauthsize = crypto_aegis128_setauthsize,
-	.encrypt = crypto_aegis128_encrypt,
-	.decrypt = crypto_aegis128_decrypt,
+static int crypto_aegis128_encrypt_simd(struct aead_request *req)
+{
+	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+	union aegis_block tag = {};
+	unsigned int authsize = crypto_aead_authsize(tfm);
+	struct aegis_ctx *ctx = crypto_aead_ctx(tfm);
+	unsigned int cryptlen = req->cryptlen;
+	struct skcipher_walk walk;
+	struct aegis_state state;
 
-	.ivsize = AEGIS128_NONCE_SIZE,
-	.maxauthsize = AEGIS128_MAX_AUTH_SIZE,
-	.chunksize = AEGIS_BLOCK_SIZE,
+	if (!aegis128_do_simd())
+		return crypto_aegis128_encrypt_generic(req);
 
-	.base = {
-		.cra_blocksize = 1,
-		.cra_ctxsize = sizeof(struct aegis_ctx),
-		.cra_alignmask = 0,
+	skcipher_walk_aead_encrypt(&walk, req, false);
+	crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
+	crypto_aegis128_process_ad(&state, req->src, req->assoclen, true);
+	crypto_aegis128_process_crypt(&state, &walk,
+				      crypto_aegis128_encrypt_chunk_simd);
+	crypto_aegis128_final_simd(&state, &tag, req->assoclen, cryptlen, 0);
 
-		.cra_priority = 100,
+	scatterwalk_map_and_copy(tag.bytes, req->dst, req->assoclen + cryptlen,
+				 authsize, 1);
+	return 0;
+}
 
-		.cra_name = "aegis128",
-		.cra_driver_name = "aegis128-generic",
+static int crypto_aegis128_decrypt_simd(struct aead_request *req)
+{
+	struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+	union aegis_block tag;
+	unsigned int authsize = crypto_aead_authsize(tfm);
+	unsigned int cryptlen = req->cryptlen - authsize;
+	struct aegis_ctx *ctx = crypto_aead_ctx(tfm);
+	struct skcipher_walk walk;
+	struct aegis_state state;
 
-		.cra_module = THIS_MODULE,
+	if (!aegis128_do_simd())
+		return crypto_aegis128_decrypt_generic(req);
+
+	scatterwalk_map_and_copy(tag.bytes, req->src, req->assoclen + cryptlen,
+				 authsize, 0);
+
+	skcipher_walk_aead_decrypt(&walk, req, false);
+	crypto_aegis128_init_simd(&state, &ctx->key, req->iv);
+	crypto_aegis128_process_ad(&state, req->src, req->assoclen, true);
+	crypto_aegis128_process_crypt(&state, &walk,
+				      crypto_aegis128_decrypt_chunk_simd);
+
+	if (unlikely(crypto_aegis128_final_simd(&state, &tag, req->assoclen,
+						cryptlen, authsize))) {
+		skcipher_walk_aead_decrypt(&walk, req, false);
+		crypto_aegis128_process_crypt(NULL, &walk,
+					      crypto_aegis128_wipe_chunk);
+		return -EBADMSG;
 	}
+	return 0;
+}
+
+static struct aead_alg crypto_aegis128_alg_generic = {
+	.setkey = crypto_aegis128_setkey,
+	.setauthsize = crypto_aegis128_setauthsize,
+	.encrypt = crypto_aegis128_encrypt_generic,
+	.decrypt = crypto_aegis128_decrypt_generic,
+
+	.ivsize = AEGIS128_NONCE_SIZE,
+	.maxauthsize = AEGIS128_MAX_AUTH_SIZE,
+	.chunksize = AEGIS_BLOCK_SIZE,
+
+	.base.cra_blocksize = 1,
+	.base.cra_ctxsize = sizeof(struct aegis_ctx),
+	.base.cra_alignmask = 0,
+	.base.cra_priority = 100,
+	.base.cra_name = "aegis128",
+	.base.cra_driver_name = "aegis128-generic",
+	.base.cra_module = THIS_MODULE,
+};
+
+static struct aead_alg crypto_aegis128_alg_simd = {
+	.setkey = crypto_aegis128_setkey,
+	.setauthsize = crypto_aegis128_setauthsize,
+	.encrypt = crypto_aegis128_encrypt_simd,
+	.decrypt = crypto_aegis128_decrypt_simd,
+
+	.ivsize = AEGIS128_NONCE_SIZE,
+	.maxauthsize = AEGIS128_MAX_AUTH_SIZE,
+	.chunksize = AEGIS_BLOCK_SIZE,
+
+	.base.cra_blocksize = 1,
+	.base.cra_ctxsize = sizeof(struct aegis_ctx),
+	.base.cra_alignmask = 0,
+	.base.cra_priority = 200,
+	.base.cra_name = "aegis128",
+	.base.cra_driver_name = "aegis128-simd",
+	.base.cra_module = THIS_MODULE,
 };
 
 static int __init 
crypto_aegis128_module_init(void) { + int ret; + + ret = crypto_register_aead(&crypto_aegis128_alg_generic); + if (ret) + return ret; + if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) && - crypto_aegis128_have_simd()) + crypto_aegis128_have_simd()) { + ret = crypto_register_aead(&crypto_aegis128_alg_simd); + if (ret) { + crypto_unregister_aead(&crypto_aegis128_alg_generic); + return ret; + } static_branch_enable(&have_simd); - - return crypto_register_aead(&crypto_aegis128_alg); + } + return 0; } static void __exit crypto_aegis128_module_exit(void) { - crypto_unregister_aead(&crypto_aegis128_alg); + if (IS_ENABLED(CONFIG_CRYPTO_AEGIS128_SIMD) && + crypto_aegis128_have_simd()) + crypto_unregister_aead(&crypto_aegis128_alg_simd); + + crypto_unregister_aead(&crypto_aegis128_alg_generic); } subsys_initcall(crypto_aegis128_module_init); @@ -528,3 +593,4 @@ MODULE_AUTHOR("Ondrej Mosnacek "); MODULE_DESCRIPTION("AEGIS-128 AEAD algorithm"); MODULE_ALIAS_CRYPTO("aegis128"); MODULE_ALIAS_CRYPTO("aegis128-generic"); +MODULE_ALIAS_CRYPTO("aegis128-simd");