From patchwork Fri Dec 1 21:19:23 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 120384 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp1655371qgn; Fri, 1 Dec 2017 13:26:01 -0800 (PST) X-Google-Smtp-Source: AGs4zMaGewPeBVmQLamPnhpsHJkdM4OBhHuR4+0tR3a6oZoygH/j7/bprfkoY4n889He1BCk6MM3 X-Received: by 10.84.128.106 with SMTP id 97mr7303246pla.250.1512163561807; Fri, 01 Dec 2017 13:26:01 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1512163561; cv=none; d=google.com; s=arc-20160816; b=xQNNcV9SqwCbi5bIjwMC/LeVF7K3upuzeQQMPI6mR5EwJ5YYXvpL0Ni/RwGpZsRPd1 bv7pMzIAgukqg9hiCn0WdugSky91q/CLuuVgVgNNqigsCE9dacd66eAd4Q7t4hWyOn/0 QNIuwcENvdt0iO/Deqa3cfdZZ++1WroFQTzCpmA2xKuSh3epQMwwMGKgM0N0LxfqX6uV oxVRe7ugl4sxYrMhCWKG7p3/sk/n1QKL5E8a7ncNAFeAYJzudZ1UzvozGmNLk1CnLnC9 dFlsNlmmvhYN2aPaZ+maGorMp6Dhh/ICssgC/yliPWtlB5o7jFGxaCPuCAtSgI2Q8GSs Q0zw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:dkim-signature:arc-authentication-results; bh=LihVXBSIHLub8TadiGHOcXJbFC4M3p/6b/xh6SsRLz4=; b=Vc6oSw3ApNJrDmANN+39uLKRHJjDkBKivt5h+bXHBhGdqZFXwJVt+sKR1vyxOQXYz4 d81y2cTi97BovfA5rhtWp0SKwyeKkDLh9KCiBWWoxyJQsL2ISGzQROxDD4hRl6eGV4z3 4gez/9BUefrArLDoTlSP5pdUJm9ltArzqLRt+yLQZAeU9Gg7qeZX8LlGJBvLZ4nEPG8J HARUAJtOL+cKFJnR90Oj1uaPqXdw3vRfXGZQ+1GR2vjDeHQxz0Ltbl4rRN1/riCkiRkG fmpeC3nkrpobzoW2jjBwdIWYvX8lFG4AB0kUVT/Lb/D2vtjnse+Q1MtW0LYLFztwHizv 7Ivg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=fjU3ZhUZ; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 5si5602719plx.33.2017.12.01.13.26.01; Fri, 01 Dec 2017 13:26:01 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=fjU3ZhUZ; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751111AbdLAV0A (ORCPT + 1 other); Fri, 1 Dec 2017 16:26:00 -0500 Received: from mail-wr0-f196.google.com ([209.85.128.196]:42442 "EHLO mail-wr0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751066AbdLAVZ7 (ORCPT ); Fri, 1 Dec 2017 16:25:59 -0500 Received: by mail-wr0-f196.google.com with SMTP id s66so11439618wrc.9 for ; Fri, 01 Dec 2017 13:25:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=LihVXBSIHLub8TadiGHOcXJbFC4M3p/6b/xh6SsRLz4=; b=fjU3ZhUZKxvpQORsyuLVH7Hafq7NwfrNuoamJw0prOoBFNwEzBlO5bCPkAWJwxJefX 1QFwBKh5nGLcGyLz3roihEyendrNEeypeYCMev6+IaoVY+PFw1clMA6W3Stz1prqJ8g0 KrVSSVY2jEshRBdIJEFIcc83R51PWlKD0st38= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=LihVXBSIHLub8TadiGHOcXJbFC4M3p/6b/xh6SsRLz4=; b=Gb8l0p218/9XsK9EZImWSeSKLQloOD4pYJD3EmXFXjh4tCcgbpTwwNGIRfnKGt/qu+ oMfQpY3RQTOzfmzTgdFrtRkcU9PeKxl7xFc0r4qLAlcXoCmaRQhrTJ0y2dq+wB99G2g1 T85xblb4+cQnvDBciy8n0vDlmoZBtbpU2EODw85vsSr9DSrhG3tMfMw9fSsSoZItUDxG XQZoNaSRS/yZqaCz75vznVOHlWSugWbT6G0v244ePC6L5rJ3mWD56850A9zDjtSj8BmK zmDw4FsWi8XAEMak1TPocC2jzmwj5MwHsKo2JaK1ccHV0BsCNxWuCt4Cvw51O8wDNtxg dCxQ== X-Gm-Message-State: AJaThX4lHOeZOqZx7VSdaVLAEJ9AXQNqi53a59ohvJ6/mjounXvNzNrC Uq2Chy6pEJVRaJ4GEOSMRrDhfda8U74= X-Received: by 10.223.187.210 with SMTP id z18mr5924080wrg.117.1512163557702; Fri, 01 Dec 2017 13:25:57 -0800 (PST) Received: from localhost.localdomain ([105.150.171.234]) by smtp.gmail.com with ESMTPSA id p17sm2161682wma.23.2017.12.01.13.25.54 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 01 Dec 2017 13:25:56 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Dave Martin , Russell King - ARM Linux , Sebastian Andrzej Siewior , Mark Rutland , linux-rt-users@vger.kernel.org, Peter Zijlstra , Catalin Marinas , Will Deacon , Steven Rostedt , Thomas Gleixner Subject: [PATCH 1/5] crypto: arm64/aes-ce-ccm - move kernel mode neon en/disable into loop Date: Fri, 1 Dec 2017 21:19:23 +0000 Message-Id: <20171201211927.24653-2-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20171201211927.24653-1-ard.biesheuvel@linaro.org> References: <20171201211927.24653-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org When kernel mode NEON was first introduced on arm64, the preserve and restore of the userland NEON state was completely unoptimized, and involved saving all registers on each call to kernel_neon_begin(), and restoring them on each call to kernel_neon_end(). For this reason, the NEON crypto code that was introduced at the time keeps the NEON enabled throughout the execution of the crypto API methods, which may include calls back into the crypto API that could result in memory allocation or other actions that we should avoid when running with preemption disabled. Since then, we have optimized the kernel mode NEON handling, which now restores lazily (upon return to userland), and so the preserve action is only costly the first time it is called after entering the kernel. So let's put the kernel_neon_begin() and kernel_neon_end() calls around the actual invocations of the NEON crypto code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-ce-ccm-glue.c | 47 ++++++++++---------- 1 file changed, 23 insertions(+), 24 deletions(-) -- 2.11.0 diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c index a1254036f2b1..68b11aa690e4 100644 --- a/arch/arm64/crypto/aes-ce-ccm-glue.c +++ b/arch/arm64/crypto/aes-ce-ccm-glue.c @@ -107,11 +107,13 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen) } static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[], - u32 abytes, u32 *macp, bool use_neon) + u32 abytes, u32 *macp) { - if (likely(use_neon)) { + if (may_use_simd()) { + kernel_neon_begin(); ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc, num_rounds(key)); + kernel_neon_end(); } else { if (*macp > 0 && *macp < AES_BLOCK_SIZE) { int added = min(abytes, AES_BLOCK_SIZE - *macp); @@ -143,8 +145,7 @@ static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[], } } -static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[], - bool use_neon) +static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[]) { struct crypto_aead *aead = crypto_aead_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_aead_ctx(aead); @@ -163,7 +164,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[], ltag.len = 6; } - ccm_update_mac(ctx, mac, (u8 *)<ag, ltag.len, &macp, use_neon); + ccm_update_mac(ctx, mac, (u8 *)<ag, ltag.len, &macp); scatterwalk_start(&walk, req->src); do { @@ -175,7 +176,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[], n = scatterwalk_clamp(&walk, len); } p = scatterwalk_map(&walk); - ccm_update_mac(ctx, mac, p, n, &macp, use_neon); + ccm_update_mac(ctx, mac, p, n, &macp); len -= n; scatterwalk_unmap(p); @@ -242,43 +243,42 @@ static int ccm_encrypt(struct aead_request *req) u8 __aligned(8) mac[AES_BLOCK_SIZE]; u8 buf[AES_BLOCK_SIZE]; u32 len = req->cryptlen; - bool use_neon = may_use_simd(); int err; err = ccm_init_mac(req, mac, len); if (err) return err; - if (likely(use_neon)) - kernel_neon_begin(); - if (req->assoclen) - ccm_calculate_auth_mac(req, mac, use_neon); + ccm_calculate_auth_mac(req, mac); /* preserve the original iv for the final round */ memcpy(buf, req->iv, AES_BLOCK_SIZE); err = skcipher_walk_aead_encrypt(&walk, req, true); - if (likely(use_neon)) { + if (may_use_simd()) { while (walk.nbytes) { u32 tail = walk.nbytes % AES_BLOCK_SIZE; if (walk.nbytes == walk.total) tail = 0; + kernel_neon_begin(); ce_aes_ccm_encrypt(walk.dst.virt.addr, walk.src.virt.addr, walk.nbytes - tail, ctx->key_enc, num_rounds(ctx), mac, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, tail); } - if (!err) + if (!err) { + kernel_neon_begin(); ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx)); - - kernel_neon_end(); + kernel_neon_end(); + } } else { err = ccm_crypt_fallback(&walk, mac, buf, ctx, true); } @@ -301,43 +301,42 @@ static int ccm_decrypt(struct aead_request *req) u8 __aligned(8) mac[AES_BLOCK_SIZE]; u8 buf[AES_BLOCK_SIZE]; u32 len = req->cryptlen - authsize; - bool use_neon = may_use_simd(); int err; err = ccm_init_mac(req, mac, len); if (err) return err; - if (likely(use_neon)) - kernel_neon_begin(); - if (req->assoclen) - ccm_calculate_auth_mac(req, mac, use_neon); + ccm_calculate_auth_mac(req, mac); /* preserve the original iv for the final round */ memcpy(buf, req->iv, AES_BLOCK_SIZE); err = skcipher_walk_aead_decrypt(&walk, req, true); - if (likely(use_neon)) { + if (may_use_simd()) { while (walk.nbytes) { u32 tail = walk.nbytes % AES_BLOCK_SIZE; if (walk.nbytes == walk.total) tail = 0; + kernel_neon_begin(); ce_aes_ccm_decrypt(walk.dst.virt.addr, walk.src.virt.addr, walk.nbytes - tail, ctx->key_enc, num_rounds(ctx), mac, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, tail); } - if (!err) + if (!err) { + kernel_neon_begin(); ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx)); - - kernel_neon_end(); + kernel_neon_end(); + } } else { err = ccm_crypt_fallback(&walk, mac, buf, ctx, false); } From patchwork Fri Dec 1 21:19:24 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 120385 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp1655434qgn; Fri, 1 Dec 2017 13:26:06 -0800 (PST) X-Google-Smtp-Source: AGs4zMaCPbugeIY/EUN/BQVEODx3zcHysRLwGXFaNNHkOaMKoJsGbWg/GUpOTFXMTz635cThghs/ X-Received: by 10.99.143.27 with SMTP id n27mr7212803pgd.121.1512163565976; Fri, 01 Dec 2017 13:26:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1512163565; cv=none; d=google.com; s=arc-20160816; b=mE4SRc008rtgNAWQv27U9tw5qDvvDAsy+qRDP0VLhBvUkOSwoJK1CHh7+BsKOEHsUk 9NlyhmvjbHI5J5har6MI7QjWjjuo/YKZyqy4HSM5YPNng+oLlCk9I/JastqE1JZ8Vu/O vmP9etEstcCtI8zoQYDv1P0Hzvb0rsWoTpqMxFS9RVsqnh/Lq0vpGTrLvp60jE7PVYy5 B9T8whsx1Ux7MWojdqonc89Gm8m+dXw1lsqZ3uk4Lok9mQ3GxDXXXxI+y4b85MfdAUvE nZUff8Tshgqj0c1LijTlLqIBljuUX6lSoWMO/iGxsu56mnfGpZmgSDYgXTpvrfDiH5KY gp7g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:dkim-signature:arc-authentication-results; bh=JvZzrv9lM7xwW32kubbNapxJZ5QCmwjDjg7R4pbAvFI=; b=KvMIQYvlqwM3/DpYoe3xR7/B6OBLd5j6wJWmIIHl/EiZZ9bCjRV1sMIBAQBBZRBd9t jIpkH2NyKnoS8JPlTenUkWxPPYTsfnKggPyd54tjVv871koTK5mhaRRALcY/g016C9eP OIdl+qFlFN09cYF44p3py/4Q48xaPQzOo/eJDWxyBSstBtTFeDWn6jx/VzHmS/lUS4Xb mt+hFxHxxhiMjRLb44EIbASd1H9pB3CBGlU8cs1Bk6YSgyxEgUGEWFfMgGpO6KqJko48 GyVK5zDgMjR2vSKzqpA/U9rzqyj6uFEKj97noJumcG2UZfJY47lgRkOO7M6HXQdBoLXd /vOA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=CpR2XZ6H; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 5si5602719plx.33.2017.12.01.13.26.05; Fri, 01 Dec 2017 13:26:05 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=CpR2XZ6H; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751089AbdLAV0E (ORCPT + 1 other); Fri, 1 Dec 2017 16:26:04 -0500 Received: from mail-wr0-f193.google.com ([209.85.128.193]:41134 "EHLO mail-wr0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751182AbdLAV0D (ORCPT ); Fri, 1 Dec 2017 16:26:03 -0500 Received: by mail-wr0-f193.google.com with SMTP id z18so11482573wrb.8 for ; Fri, 01 Dec 2017 13:26:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=JvZzrv9lM7xwW32kubbNapxJZ5QCmwjDjg7R4pbAvFI=; b=CpR2XZ6Hs08d8xQoTRI1Y7ObpfYX3jSkNnC5ort0a0jeogSLaVg93C9lhUxEeNzzEy tzpIGI7Smu5UYnYbmB21oKdgZpLkW91jWl53TWmTHhfDJyOV4A0hwb5rBY2q15wir5yw /BW94s2130J9jaQe6rGYNq8bRHUswjXzd+DsE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=JvZzrv9lM7xwW32kubbNapxJZ5QCmwjDjg7R4pbAvFI=; b=Ypx+TjN+9GSzuZ4hqqPo0MHhahQ6cGHce0U9YrN1SctuQPKLUoggCPbz5sUo/OrEjJ n/SCd4f+B9suCkvslHJ9dVSKUewmPNZu4hJ3haXMP4tpVRTeiC+nHkxav31xthPEBRLy VewWqRKFPI7Xl3cQc78G55pFFrQXWkEs581+9qE4UQMJk4RKowtl+4m+V+NMXL8Fvdwf twPLoO10Hd0gcUpvw4YL76qXwYoSc3onRs+ZAcVXMO7UdKvAoHTqsZc+u3CKqqhAj8sZ j7gDuM0Oc3e7NusmQ96Bi4Y/mTcisRPVtB1kRkheNLmRJbD0ZHOoa7OKeLkHBKRcQInt Fstg== X-Gm-Message-State: AJaThX6hDWhhwoB5lpXvdIUhn/bQ6MBEyTWvTRSaYkqI271jGjOtQn1N XoX1ZTOPsKFIVtNeomegJ/0mlx9gon0= X-Received: by 10.223.189.135 with SMTP id l7mr6729468wrh.231.1512163561736; Fri, 01 Dec 2017 13:26:01 -0800 (PST) Received: from localhost.localdomain ([105.150.171.234]) by smtp.gmail.com with ESMTPSA id p17sm2161682wma.23.2017.12.01.13.25.57 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 01 Dec 2017 13:26:01 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Dave Martin , Russell King - ARM Linux , Sebastian Andrzej Siewior , Mark Rutland , linux-rt-users@vger.kernel.org, Peter Zijlstra , Catalin Marinas , Will Deacon , Steven Rostedt , Thomas Gleixner Subject: [PATCH 2/5] crypto: arm64/aes-blk - move kernel mode neon en/disable into loop Date: Fri, 1 Dec 2017 21:19:24 +0000 Message-Id: <20171201211927.24653-3-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20171201211927.24653-1-ard.biesheuvel@linaro.org> References: <20171201211927.24653-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org When kernel mode NEON was first introduced on arm64, the preserve and restore of the userland NEON state was completely unoptimized, and involved saving all registers on each call to kernel_neon_begin(), and restoring them on each call to kernel_neon_end(). For this reason, the NEON crypto code that was introduced at the time keeps the NEON enabled throughout the execution of the crypto API methods, which may include calls back into the crypto API that could result in memory allocation or other actions that we should avoid when running with preemption disabled. Since then, we have optimized the kernel mode NEON handling, which now restores lazily (upon return to userland), and so the preserve action is only costly the first time it is called after entering the kernel. So let's put the kernel_neon_begin() and kernel_neon_end() calls around the actual invocations of the NEON crypto code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Note that this requires some reshuffling of the registers in the asm code, because the XTS routines can no longer rely on the registers to retain their contents between invocations. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/aes-glue.c | 81 +++++++++--------- arch/arm64/crypto/aes-modes.S | 90 ++++++++++---------- arch/arm64/crypto/aes-neonbs-glue.c | 14 ++- 3 files changed, 90 insertions(+), 95 deletions(-) -- 2.11.0 diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c index 998ba519a026..c983a2bfebe3 100644 --- a/arch/arm64/crypto/aes-glue.c +++ b/arch/arm64/crypto/aes-glue.c @@ -64,17 +64,17 @@ MODULE_LICENSE("GPL v2"); /* defined in aes-modes.S */ asmlinkage void aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], - int rounds, int blocks, int first); + int rounds, int blocks); asmlinkage void aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], - int rounds, int blocks, int first); + int rounds, int blocks); asmlinkage void aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[], - int rounds, int blocks, u8 iv[], int first); + int rounds, int blocks, u8 iv[]); asmlinkage void aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], - int rounds, int blocks, u8 iv[], int first); + int rounds, int blocks, u8 iv[]); asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], - int rounds, int blocks, u8 ctr[], int first); + int rounds, int blocks, u8 ctr[]); asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u8 const rk1[], int rounds, int blocks, u8 const rk2[], u8 iv[], @@ -133,19 +133,19 @@ static int ecb_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); - int err, first, rounds = 6 + ctx->key_length / 4; + int err, rounds = 6 + ctx->key_length / 4; struct skcipher_walk walk; unsigned int blocks; err = skcipher_walk_virt(&walk, req, true); - kernel_neon_begin(); - for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_enc, rounds, blocks, first); + (u8 *)ctx->key_enc, rounds, blocks); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -153,19 +153,19 @@ static int ecb_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); - int err, first, rounds = 6 + ctx->key_length / 4; + int err, rounds = 6 + ctx->key_length / 4; struct skcipher_walk walk; unsigned int blocks; err = skcipher_walk_virt(&walk, req, true); - kernel_neon_begin(); - for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); aes_ecb_decrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_dec, rounds, blocks, first); + (u8 *)ctx->key_dec, rounds, blocks); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -173,20 +173,19 @@ static int cbc_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); - int err, first, rounds = 6 + ctx->key_length / 4; + int err, rounds = 6 + ctx->key_length / 4; struct skcipher_walk walk; unsigned int blocks; err = skcipher_walk_virt(&walk, req, true); - kernel_neon_begin(); - for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_enc, rounds, blocks, walk.iv, - first); + (u8 *)ctx->key_enc, rounds, blocks, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -194,20 +193,19 @@ static int cbc_decrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); - int err, first, rounds = 6 + ctx->key_length / 4; + int err, rounds = 6 + ctx->key_length / 4; struct skcipher_walk walk; unsigned int blocks; err = skcipher_walk_virt(&walk, req, true); - kernel_neon_begin(); - for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); aes_cbc_decrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_dec, rounds, blocks, walk.iv, - first); + (u8 *)ctx->key_dec, rounds, blocks, walk.iv); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -215,20 +213,18 @@ static int ctr_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); - int err, first, rounds = 6 + ctx->key_length / 4; + int err, rounds = 6 + ctx->key_length / 4; struct skcipher_walk walk; int blocks; err = skcipher_walk_virt(&walk, req, true); - first = 1; - kernel_neon_begin(); while ((blocks = (walk.nbytes / AES_BLOCK_SIZE))) { + kernel_neon_begin(); aes_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - (u8 *)ctx->key_enc, rounds, blocks, walk.iv, - first); + (u8 *)ctx->key_enc, rounds, blocks, walk.iv); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); - first = 0; + kernel_neon_end(); } if (walk.nbytes) { u8 __aligned(8) tail[AES_BLOCK_SIZE]; @@ -241,12 +237,13 @@ static int ctr_encrypt(struct skcipher_request *req) */ blocks = -1; + kernel_neon_begin(); aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, rounds, - blocks, walk.iv, first); + blocks, walk.iv); + kernel_neon_end(); crypto_xor_cpy(tdst, tsrc, tail, nbytes); err = skcipher_walk_done(&walk, 0); } - kernel_neon_end(); return err; } @@ -272,14 +269,14 @@ static int xts_encrypt(struct skcipher_request *req) err = skcipher_walk_virt(&walk, req, true); - kernel_neon_begin(); for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + kernel_neon_begin(); aes_xts_encrypt(walk.dst.virt.addr, walk.src.virt.addr, (u8 *)ctx->key1.key_enc, rounds, blocks, (u8 *)ctx->key2.key_enc, walk.iv, first); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -294,14 +291,14 @@ static int xts_decrypt(struct skcipher_request *req) err = skcipher_walk_virt(&walk, req, true); - kernel_neon_begin(); for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) { + kernel_neon_begin(); aes_xts_decrypt(walk.dst.virt.addr, walk.src.virt.addr, (u8 *)ctx->key1.key_dec, rounds, blocks, (u8 *)ctx->key2.key_enc, walk.iv, first); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); return err; } @@ -425,7 +422,7 @@ static int cmac_setkey(struct crypto_shash *tfm, const u8 *in_key, /* encrypt the zero vector */ kernel_neon_begin(); - aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){}, rk, rounds, 1, 1); + aes_ecb_encrypt(ctx->consts, (u8[AES_BLOCK_SIZE]){}, rk, rounds, 1); kernel_neon_end(); cmac_gf128_mul_by_x(consts, consts); @@ -454,8 +451,8 @@ static int xcbc_setkey(struct crypto_shash *tfm, const u8 *in_key, return err; kernel_neon_begin(); - aes_ecb_encrypt(key, ks[0], rk, rounds, 1, 1); - aes_ecb_encrypt(ctx->consts, ks[1], rk, rounds, 2, 0); + aes_ecb_encrypt(key, ks[0], rk, rounds, 1); + aes_ecb_encrypt(ctx->consts, ks[1], rk, rounds, 2); kernel_neon_end(); return cbcmac_setkey(tfm, key, sizeof(key)); diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S index 2674d43d1384..65b273667b34 100644 --- a/arch/arm64/crypto/aes-modes.S +++ b/arch/arm64/crypto/aes-modes.S @@ -40,24 +40,24 @@ #if INTERLEAVE == 2 aes_encrypt_block2x: - encrypt_block2x v0, v1, w3, x2, x6, w7 + encrypt_block2x v0, v1, w3, x2, x8, w7 ret ENDPROC(aes_encrypt_block2x) aes_decrypt_block2x: - decrypt_block2x v0, v1, w3, x2, x6, w7 + decrypt_block2x v0, v1, w3, x2, x8, w7 ret ENDPROC(aes_decrypt_block2x) #elif INTERLEAVE == 4 aes_encrypt_block4x: - encrypt_block4x v0, v1, v2, v3, w3, x2, x6, w7 + encrypt_block4x v0, v1, v2, v3, w3, x2, x8, w7 ret ENDPROC(aes_encrypt_block4x) aes_decrypt_block4x: - decrypt_block4x v0, v1, v2, v3, w3, x2, x6, w7 + decrypt_block4x v0, v1, v2, v3, w3, x2, x8, w7 ret ENDPROC(aes_decrypt_block4x) @@ -86,33 +86,32 @@ ENDPROC(aes_decrypt_block4x) #define FRAME_POP .macro do_encrypt_block2x - encrypt_block2x v0, v1, w3, x2, x6, w7 + encrypt_block2x v0, v1, w3, x2, x8, w7 .endm .macro do_decrypt_block2x - decrypt_block2x v0, v1, w3, x2, x6, w7 + decrypt_block2x v0, v1, w3, x2, x8, w7 .endm .macro do_encrypt_block4x - encrypt_block4x v0, v1, v2, v3, w3, x2, x6, w7 + encrypt_block4x v0, v1, v2, v3, w3, x2, x8, w7 .endm .macro do_decrypt_block4x - decrypt_block4x v0, v1, v2, v3, w3, x2, x6, w7 + decrypt_block4x v0, v1, v2, v3, w3, x2, x8, w7 .endm #endif /* * aes_ecb_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, - * int blocks, int first) + * int blocks) * aes_ecb_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, - * int blocks, int first) + * int blocks) */ AES_ENTRY(aes_ecb_encrypt) FRAME_PUSH - cbz w5, .LecbencloopNx enc_prepare w3, x2, x5 @@ -148,7 +147,6 @@ AES_ENDPROC(aes_ecb_encrypt) AES_ENTRY(aes_ecb_decrypt) FRAME_PUSH - cbz w5, .LecbdecloopNx dec_prepare w3, x2, x5 @@ -184,14 +182,12 @@ AES_ENDPROC(aes_ecb_decrypt) /* * aes_cbc_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, - * int blocks, u8 iv[], int first) + * int blocks, u8 iv[]) * aes_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, - * int blocks, u8 iv[], int first) + * int blocks, u8 iv[]) */ AES_ENTRY(aes_cbc_encrypt) - cbz w6, .Lcbcencloop - ld1 {v0.16b}, [x5] /* get iv */ enc_prepare w3, x2, x6 @@ -209,7 +205,6 @@ AES_ENDPROC(aes_cbc_encrypt) AES_ENTRY(aes_cbc_decrypt) FRAME_PUSH - cbz w6, .LcbcdecloopNx ld1 {v7.16b}, [x5] /* get iv */ dec_prepare w3, x2, x6 @@ -264,20 +259,19 @@ AES_ENDPROC(aes_cbc_decrypt) /* * aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, - * int blocks, u8 ctr[], int first) + * int blocks, u8 ctr[]) */ AES_ENTRY(aes_ctr_encrypt) FRAME_PUSH - cbz w6, .Lctrnotfirst /* 1st time around? */ + enc_prepare w3, x2, x6 ld1 {v4.16b}, [x5] -.Lctrnotfirst: - umov x8, v4.d[1] /* keep swabbed ctr in reg */ - rev x8, x8 + umov x6, v4.d[1] /* keep swabbed ctr in reg */ + rev x6, x6 #if INTERLEAVE >= 2 - cmn w8, w4 /* 32 bit overflow? */ + cmn w6, w4 /* 32 bit overflow? */ bcs .Lctrloop .LctrloopNx: subs w4, w4, #INTERLEAVE @@ -285,11 +279,11 @@ AES_ENTRY(aes_ctr_encrypt) #if INTERLEAVE == 2 mov v0.8b, v4.8b mov v1.8b, v4.8b - rev x7, x8 - add x8, x8, #1 + rev x7, x6 + add x6, x6, #1 ins v0.d[1], x7 - rev x7, x8 - add x8, x8, #1 + rev x7, x6 + add x6, x6, #1 ins v1.d[1], x7 ld1 {v2.16b-v3.16b}, [x1], #32 /* get 2 input blocks */ do_encrypt_block2x @@ -298,7 +292,7 @@ AES_ENTRY(aes_ctr_encrypt) st1 {v0.16b-v1.16b}, [x0], #32 #else ldr q8, =0x30000000200000001 /* addends 1,2,3[,0] */ - dup v7.4s, w8 + dup v7.4s, w6 mov v0.16b, v4.16b add v7.4s, v7.4s, v8.4s mov v1.16b, v4.16b @@ -316,9 +310,9 @@ AES_ENTRY(aes_ctr_encrypt) eor v2.16b, v7.16b, v2.16b eor v3.16b, v5.16b, v3.16b st1 {v0.16b-v3.16b}, [x0], #64 - add x8, x8, #INTERLEAVE + add x6, x6, #INTERLEAVE #endif - rev x7, x8 + rev x7, x6 ins v4.d[1], x7 cbz w4, .Lctrout b .LctrloopNx @@ -328,10 +322,10 @@ AES_ENTRY(aes_ctr_encrypt) #endif .Lctrloop: mov v0.16b, v4.16b - encrypt_block v0, w3, x2, x6, w7 + encrypt_block v0, w3, x2, x8, w7 - adds x8, x8, #1 /* increment BE ctr */ - rev x7, x8 + adds x6, x6, #1 /* increment BE ctr */ + rev x7, x6 ins v4.d[1], x7 bcs .Lctrcarry /* overflow? */ @@ -385,15 +379,17 @@ CPU_BE( .quad 0x87, 1 ) AES_ENTRY(aes_xts_encrypt) FRAME_PUSH - cbz w7, .LxtsencloopNx - ld1 {v4.16b}, [x6] - enc_prepare w3, x5, x6 - encrypt_block v4, w3, x5, x6, w7 /* first tweak */ - enc_switch_key w3, x2, x6 + cbz w7, .Lxtsencnotfirst + + enc_prepare w3, x5, x8 + encrypt_block v4, w3, x5, x8, w7 /* first tweak */ + enc_switch_key w3, x2, x8 ldr q7, .Lxts_mul_x b .LxtsencNx +.Lxtsencnotfirst: + enc_prepare w3, x2, x8 .LxtsencloopNx: ldr q7, .Lxts_mul_x next_tweak v4, v4, v7, v8 @@ -442,7 +438,7 @@ AES_ENTRY(aes_xts_encrypt) .Lxtsencloop: ld1 {v1.16b}, [x1], #16 eor v0.16b, v1.16b, v4.16b - encrypt_block v0, w3, x2, x6, w7 + encrypt_block v0, w3, x2, x8, w7 eor v0.16b, v0.16b, v4.16b st1 {v0.16b}, [x0], #16 subs w4, w4, #1 @@ -450,6 +446,7 @@ AES_ENTRY(aes_xts_encrypt) next_tweak v4, v4, v7, v8 b .Lxtsencloop .Lxtsencout: + st1 {v4.16b}, [x6] FRAME_POP ret AES_ENDPROC(aes_xts_encrypt) @@ -457,15 +454,17 @@ AES_ENDPROC(aes_xts_encrypt) AES_ENTRY(aes_xts_decrypt) FRAME_PUSH - cbz w7, .LxtsdecloopNx - ld1 {v4.16b}, [x6] - enc_prepare w3, x5, x6 - encrypt_block v4, w3, x5, x6, w7 /* first tweak */ - dec_prepare w3, x2, x6 + cbz w7, .Lxtsdecnotfirst + + enc_prepare w3, x5, x8 + encrypt_block v4, w3, x5, x8, w7 /* first tweak */ + dec_prepare w3, x2, x8 ldr q7, .Lxts_mul_x b .LxtsdecNx +.Lxtsdecnotfirst: + dec_prepare w3, x2, x8 .LxtsdecloopNx: ldr q7, .Lxts_mul_x next_tweak v4, v4, v7, v8 @@ -514,7 +513,7 @@ AES_ENTRY(aes_xts_decrypt) .Lxtsdecloop: ld1 {v1.16b}, [x1], #16 eor v0.16b, v1.16b, v4.16b - decrypt_block v0, w3, x2, x6, w7 + decrypt_block v0, w3, x2, x8, w7 eor v0.16b, v0.16b, v4.16b st1 {v0.16b}, [x0], #16 subs w4, w4, #1 @@ -522,6 +521,7 @@ AES_ENTRY(aes_xts_decrypt) next_tweak v4, v4, v7, v8 b .Lxtsdecloop .Lxtsdecout: + st1 {v4.16b}, [x6] FRAME_POP ret AES_ENDPROC(aes_xts_decrypt) diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c index c55d68ccb89f..9d823c77ec84 100644 --- a/arch/arm64/crypto/aes-neonbs-glue.c +++ b/arch/arm64/crypto/aes-neonbs-glue.c @@ -46,10 +46,9 @@ asmlinkage void aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], /* borrowed from aes-neon-blk.ko */ asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], - int rounds, int blocks, int first); + int rounds, int blocks); asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], - int rounds, int blocks, u8 iv[], - int first); + int rounds, int blocks, u8 iv[]); struct aesbs_ctx { u8 rk[13 * (8 * AES_BLOCK_SIZE) + 32]; @@ -157,7 +156,7 @@ static int cbc_encrypt(struct skcipher_request *req) struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); struct skcipher_walk walk; - int err, first = 1; + int err; err = skcipher_walk_virt(&walk, req, true); @@ -167,10 +166,9 @@ static int cbc_encrypt(struct skcipher_request *req) /* fall back to the non-bitsliced NEON implementation */ neon_aes_cbc_encrypt(walk.dst.virt.addr, walk.src.virt.addr, - ctx->enc, ctx->key.rounds, blocks, walk.iv, - first); + ctx->enc, ctx->key.rounds, blocks, + walk.iv); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); - first = 0; } kernel_neon_end(); return err; @@ -311,7 +309,7 @@ static int __xts_crypt(struct skcipher_request *req, kernel_neon_begin(); neon_aes_ecb_encrypt(walk.iv, walk.iv, ctx->twkey, - ctx->key.rounds, 1, 1); + ctx->key.rounds, 1); while (walk.nbytes >= AES_BLOCK_SIZE) { unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; From patchwork Fri Dec 1 21:19:27 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 120388 Delivered-To: patch@linaro.org Received: by 10.140.22.227 with SMTP id 90csp1655577qgn; Fri, 1 Dec 2017 13:26:16 -0800 (PST) X-Google-Smtp-Source: AGs4zMb1VUBF6UYzTZA4p30rfqAzG3FuGCmkP89nAjnR5F+xdBVlaFuQi/lIXxLOP2qh7loMqz0c X-Received: by 10.98.11.218 with SMTP id 87mr11602917pfl.228.1512163576843; Fri, 01 Dec 2017 13:26:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1512163576; cv=none; d=google.com; s=arc-20160816; b=S7gbo8oxnOWEC6xtf47qnLHRbpshnd9JyX6mQuu3beq4i2lLtr1GVBgoYi7Ayv7oje KTTeiw4DgCmuV870xcw4XDSn+mSjWiXzLhUzSV6Il5HzuaYnJfM3u0W3QPZQVctLRjsv 03XU1+UKye9RXkAxCpbYWsCwTYode01XFFGLiAqvV9JlUlM3LFeGzVPDb4Rsa+obKvpw M/T889QoO4iIT6UZTGbDYhy5Gnl0dVU78bPapjw+vTO/bmcfS9TKHZIRFCHSoz1hmln3 ZyQyWmY2eQ5REP9nfEiPPkGeZBi+C8bbxwOlLjdnZIClhejPQaE8vZQxpQZrVhWPvt74 ObMw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:dkim-signature:arc-authentication-results; bh=SgKsLu2PdMAkeEjJB77DO8HL4XwrPtctezT2ceVL2+w=; b=GF0Hn7WsdAvAOCQVTRkAESTGvZrEnVnl0tZS3E50vyN78o1e2S+8Pv1iYQle1SCp8I qkDzc/t99JHWVZ4DOTd1tjLSz3eB+yvTimArrWbsMB0zMYNK1g09Y8YjPtqeIDssqRKx PKAlqU1YgQqDbcOLK1Wyy9M3F7mOhnzEs0aHSBRgNS5SpVoz4JQ+FT/ngiRCf5RcHxrS nB4/GZFkCYrrrl/PO4sH0vgTySGInp7735okC2v9jcvBU/YPukClycz/o56BRdTCg9Iz S2qzYNf4hiT3wgkyFUQqKl4RqzsfTYYStnjnJ5flrFAgFw7GSfhKm+DgqWjCco/EFyb/ R7OQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=TomDMick; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id u66si5760540pfa.237.2017.12.01.13.26.16; Fri, 01 Dec 2017 13:26:16 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=TomDMick; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751039AbdLAV0P (ORCPT + 1 other); Fri, 1 Dec 2017 16:26:15 -0500 Received: from mail-wr0-f194.google.com ([209.85.128.194]:42460 "EHLO mail-wr0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751188AbdLAV0O (ORCPT ); Fri, 1 Dec 2017 16:26:14 -0500 Received: by mail-wr0-f194.google.com with SMTP id s66so11440075wrc.9 for ; Fri, 01 Dec 2017 13:26:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=SgKsLu2PdMAkeEjJB77DO8HL4XwrPtctezT2ceVL2+w=; b=TomDMickzVqZNh64eSjZtwrzYo30etBaLKXiOAWPaaU7JdIAw+FpiedspC/C+MJbY3 skW85cd2T4R3j5zrtPbPDL4Ch1jJZ6lrUy5gGzi2kmkr1z8ADvmdO7RFcdjdo+ZZY598 gW6/OzozKtUHWkoJGWORQ2j/fDuvt8e9MC1bs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=SgKsLu2PdMAkeEjJB77DO8HL4XwrPtctezT2ceVL2+w=; b=oHb9X3M/NY0gIJEdwUVj6HuER2BSF3SM7eYxEHlkg/iekMUnf9XGnsxeY+K/lLFPHf lX8hYPTH8JNKqt5Ae3KOQCIH1uk5tcyjophQQJRE/sfc3NUvJFYMY5PeGBNdIuNSOHnO WUH17eCDuP+dqLhEnqJ9ikTnDzTh3ePNH9V8/T6U02DWfIIeNwLZP0dGaKKChdkaNd9c V7EZncx+hjOM1jAotrq0FVo9XY9e6nMt2mEzAYS2aeT5T+vmuuHBVvmpXo7atjE1lGkl R5Prg7UpDZnA2JnpDBftnUfhXgruKS/z0AqsGzoOq5EZqUOUBJYPjDeYBaJ4gMq+0HeM kpzg== X-Gm-Message-State: AJaThX7TUSRXgLLoUBQjkHX4DQIZP+apkLwK8QjfObOwwr9iUw1dD0kN AY8m+KcGRuDltNbM/W+hXBn9+lswzVk= X-Received: by 10.223.162.137 with SMTP id s9mr6921536wra.272.1512163572481; Fri, 01 Dec 2017 13:26:12 -0800 (PST) Received: from localhost.localdomain ([105.150.171.234]) by smtp.gmail.com with ESMTPSA id p17sm2161682wma.23.2017.12.01.13.26.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 01 Dec 2017 13:26:11 -0800 (PST) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org, Ard Biesheuvel , Dave Martin , Russell King - ARM Linux , Sebastian Andrzej Siewior , Mark Rutland , linux-rt-users@vger.kernel.org, Peter Zijlstra , Catalin Marinas , Will Deacon , Steven Rostedt , Thomas Gleixner Subject: [PATCH 5/5] crypto: arm64/ghash - move kernel mode neon en/disable into loop Date: Fri, 1 Dec 2017 21:19:27 +0000 Message-Id: <20171201211927.24653-6-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.11.0 In-Reply-To: <20171201211927.24653-1-ard.biesheuvel@linaro.org> References: <20171201211927.24653-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org When kernel mode NEON was first introduced on arm64, the preserve and restore of the userland NEON state was completely unoptimized, and involved saving all registers on each call to kernel_neon_begin(), and restoring them on each call to kernel_neon_end(). For this reason, the NEON crypto code that was introduced at the time keeps the NEON enabled throughout the execution of the crypto API methods, which may include calls back into the crypto API that could result in memory allocation or other actions that we should avoid when running with preemption disabled. Since then, we have optimized the kernel mode NEON handling, which now restores lazily (upon return to userland), and so the preserve action is only costly the first time it is called after entering the kernel. So let's put the kernel_neon_begin() and kernel_neon_end() calls around the actual invocations of the NEON crypto code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/ghash-ce-glue.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) -- 2.11.0 diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c index cfc9c92814fd..ddd9b565d775 100644 --- a/arch/arm64/crypto/ghash-ce-glue.c +++ b/arch/arm64/crypto/ghash-ce-glue.c @@ -368,20 +368,22 @@ static int gcm_encrypt(struct aead_request *req) pmull_gcm_encrypt_block(ks, iv, NULL, num_rounds(&ctx->aes_key)); put_unaligned_be32(3, iv + GCM_IV_SIZE); + kernel_neon_end(); err = skcipher_walk_aead_encrypt(&walk, req, true); while (walk.nbytes >= AES_BLOCK_SIZE) { int blocks = walk.nbytes / AES_BLOCK_SIZE; + kernel_neon_begin(); pmull_gcm_encrypt(blocks, dg, walk.dst.virt.addr, walk.src.virt.addr, &ctx->ghash_key, iv, num_rounds(&ctx->aes_key), ks); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); } - kernel_neon_end(); } else { __aes_arm64_encrypt(ctx->aes_key.key_enc, tag, iv, num_rounds(&ctx->aes_key)); @@ -467,15 +469,18 @@ static int gcm_decrypt(struct aead_request *req) pmull_gcm_encrypt_block(tag, iv, ctx->aes_key.key_enc, num_rounds(&ctx->aes_key)); put_unaligned_be32(2, iv + GCM_IV_SIZE); + kernel_neon_end(); err = skcipher_walk_aead_decrypt(&walk, req, true); while (walk.nbytes >= AES_BLOCK_SIZE) { int blocks = walk.nbytes / AES_BLOCK_SIZE; + kernel_neon_begin(); pmull_gcm_decrypt(blocks, dg, walk.dst.virt.addr, walk.src.virt.addr, &ctx->ghash_key, iv, num_rounds(&ctx->aes_key)); + kernel_neon_end(); err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE); @@ -483,8 +488,6 @@ static int gcm_decrypt(struct aead_request *req) if (walk.nbytes) pmull_gcm_encrypt_block(iv, iv, NULL, num_rounds(&ctx->aes_key)); - - kernel_neon_end(); } else { __aes_arm64_encrypt(ctx->aes_key.key_enc, tag, iv, num_rounds(&ctx->aes_key));