From patchwork Thu Jan 18 17:06:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 764509 Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9DEF72C843 for ; Thu, 18 Jan 2024 17:07:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705597662; cv=none; b=MynFcWzfJZm/eup6miO5B+gla2tdaOX3+azxo2xuYjhLd5gtvX3wxZq626CSat0I/cAHxfsSC/aacrGP+3a1kcsiyNYmARjercjvpdLE1OL1frzKyi3mutyutbuNlkx+l/UaDRLZP9z9l4/g/keLwO8lu/TmodUnVDDb0c3RYO0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705597662; c=relaxed/simple; bh=92UiQ6e2DOT3H/EOJ5xxiYlizB2H5aO6pwzx1rpb/FA=; h=Received:DKIM-Signature:X-Google-DKIM-Signature: X-Gm-Message-State:X-Google-Smtp-Source:X-Received:Date: In-Reply-To:Mime-Version:References:X-Developer-Key: X-Developer-Signature:X-Mailer:Message-ID:Subject:From:To:Cc: Content-Type; b=Abf+kAlFwa01vWc95u/frclPiQLEKTZZ3ZJ4+SB+Wn6b8LSnX7fzrhbBZ77wUj/U7gA+iwwnVTPIhfBqHvSqYgQ9LvwLHb4DYEjCxlKGJi1KFGGXnooJBzk4KQ/OiLgZ8laKS8fzn7L7ssXwXDktX7M0clTBTNFXa0aXtlEn57Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=O06Pg2Hk; arc=none smtp.client-ip=209.85.128.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
header.d=google.com header.i=@google.com header.b="O06Pg2Hk" Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-5eba564eb3fso224847927b3.1 for ; Thu, 18 Jan 2024 09:07:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1705597659; x=1706202459; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=dbblg1g18vESHRUanqd3gqjzLxnkVDJAYZyoYzsYl98=; b=O06Pg2Hks3eB8xgC7X/NogVHH/+UotKK+NUEpm+raSIu+WXhfad5HB4Fseo82eY5qc uMuSfiiZL/+0theVRCAW3+TMuQ38kZGq5lSt1dNUhuLispmkXbQptn2GzQThkipkKozY UR1WJwVXBtUNt16f8EF+MqZe0K0UmY3XGcnZGDkUUu8kWlEJGGwN/l03MCTp7qn2n3I+ XHUUdfpAxDTheA+jn2Ug43FE30RGqdkrGB/pWpJMUZ0x2eiExagDAuO1w7XXmhHOE/8o 5pieDyJROkQQHOpwzVsNhtFOnGGRlQBU8919LLCLb3If5U5OQxBea3LEkGBLaXoapibp PXJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705597659; x=1706202459; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=dbblg1g18vESHRUanqd3gqjzLxnkVDJAYZyoYzsYl98=; b=F9lbuEZHuPxyeIIMNi7AJKcs/qkPbVHkYsEQAFyu7R6ndBfQ/g73hUXLSV756mC/5H 3bFCBIg4FQ0taxb6ioPlATHHZAqn8/3R1D7v5MT3awG3ybpQsVwld1wNmgPRAmmOPl8A rGYm9KQKmWaOuP1BBex9BJFbI/ZaPP3ExDMIkH9NopYIjBd5UkfaLdrLt8DiS+b6MiRD euYnp81yqx4s0R5nVGr82/TlvyAkVJJPW3lvT/z44mO78pleOm3AWhGOgG10GCAvYsO3 BkvA+P8SEA0SZ87bRB2bb61ToxsppdlmHR7tKs2EMIgcZVktPatqpBi3Q8dqOFFstjfN +9kw== X-Gm-Message-State: AOJu0Yym2QX7mGgD0KP1LSICfApR1ebzM+Y59BLbysehUX81fr93Fzub pRKwyh5PQuU203YqPvONgLq0UlIsaIya/VnWyh3Sow4qvBhTV77swaS+EzUDka+aT2hN/mSfqRF TuP5X71ZzlPHzCBEQ6NOrbFNVg4ycbyBVwmQSXHQ3TZNMEAa2LCI3gt13F2AxIVwvRsIqDNWUCi iWN4z0DxNPuaDhvBhLd/Oi3s4XKoXR0g== X-Google-Smtp-Source: AGHT+IEj7o01HqXXidK9qvOo9cVEbWYcuM/kY5K8YovBWEfoN2D0lRYW5KX/6215Kh5B6yR1Os+UautR X-Received: from palermo.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:118a]) (user=ardb job=sendgmr) by 
2002:a81:99c4:0:b0:5e6:6b2c:8620 with SMTP id q187-20020a8199c4000000b005e66b2c8620mr528280ywg.7.1705597659591; Thu, 18 Jan 2024 09:07:39 -0800 (PST)
Date: Thu, 18 Jan 2024 18:06:30 +0100
In-Reply-To: <20240118170628.3049797-10-ardb+git@google.com>
Precedence: bulk
X-Mailing-List: linux-crypto@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20240118170628.3049797-10-ardb+git@google.com>
X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909
X-Developer-Signature: v=1; a=openpgp-sha256; l=4040; i=ardb@kernel.org; h=from:subject; bh=JFhgp0jLRo+TU0jJ43nYp0OZsvoylBSy/v8YLlAKwv8=;
 b=owGbwMvMwCFmkMcZplerG8N4Wi2JIXVl1DTDBw+qFOQN5F8/n8XbOi1JWuyAUMeFB9cz/rDon
 +L+Gze3o5SFQYyDQVZMkUVg9t93O09PlKp1niULM4eVCWQIAxenAEzk9U5Ghrvv5rEbtqtqW7J+
 7K5XkmCSe2EWKda8bX/81rUOj3xUNjMyLBU1jvRYXKcqKOfm/rvj/k6pRF/Nj7NYzrVxyC9lP7O
 LDwA=
X-Mailer: git-send-email 2.43.0.381.gb435a96ce8-goog
Message-ID: <20240118170628.3049797-11-ardb+git@google.com>
Subject: [PATCH v2 1/8] crypto: arm64/aes-ccm - Revert "Rewrite skcipher walker loop"
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: ebiggers@kernel.org, herbert@gondor.apana.org.au, Ard Biesheuvel

From: Ard Biesheuvel

This reverts commit 57ead1bf1c54, which updated the CCM code to only
rely on walk.nbytes to check for failures returned from the skcipher
walk API, mostly for the common good rather than to fix a particular
problem in the code.

This change introduces a problem of its own: the skcipher walk is
started with the 'atomic' argument set to false, which means that the
skcipher walk API is permitted to sleep. Subsequently, it invokes
skcipher_walk_done() with preemption disabled on the final iteration of
the loop. This appears to work by accident, but it is arguably a bad
example, and providing a better example was the point of the original
patch.

Given that future changes to the CCM code will rely on the original
behavior of entering the loop even for zero sized inputs, let's just
revert this change entirely, and proceed from there.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-glue.c | 57 +++++++++++---------
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index 25cd3808ecbe..c4f14415f5f0 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -161,39 +161,43 @@ static int ccm_encrypt(struct aead_request *req)
 	memcpy(buf, req->iv, AES_BLOCK_SIZE);
 
 	err = skcipher_walk_aead_encrypt(&walk, req, false);
+	if (unlikely(err))
+		return err;
 
 	kernel_neon_begin();
 
 	if (req->assoclen)
 		ccm_calculate_auth_mac(req, mac);
 
-	while (walk.nbytes) {
+	do {
 		u32 tail = walk.nbytes % AES_BLOCK_SIZE;
-		bool final = walk.nbytes == walk.total;
 
-		if (final)
+		if (walk.nbytes == walk.total)
 			tail = 0;
 
 		ce_aes_ccm_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
 				   walk.nbytes - tail, ctx->key_enc,
 				   num_rounds(ctx), mac, walk.iv);
 
-		if (!final)
-			kernel_neon_end();
-		err = skcipher_walk_done(&walk, tail);
-		if (!final)
-			kernel_neon_begin();
-	}
+		if (walk.nbytes == walk.total)
+			ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
 
-	ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+		kernel_neon_end();
 
-	kernel_neon_end();
+		if (walk.nbytes) {
+			err = skcipher_walk_done(&walk, tail);
+			if (unlikely(err))
+				return err;
+			if (unlikely(walk.nbytes))
+				kernel_neon_begin();
+		}
+	} while (walk.nbytes);
 
 	/* copy authtag to end of dst */
 	scatterwalk_map_and_copy(mac, req->dst, req->assoclen + req->cryptlen,
 				 crypto_aead_authsize(aead), 1);
 
-	return err;
+	return 0;
 }
 
 static int ccm_decrypt(struct aead_request *req)
@@ -215,36 +219,37 @@ static int ccm_decrypt(struct aead_request *req)
 	memcpy(buf, req->iv, AES_BLOCK_SIZE);
 
 	err = skcipher_walk_aead_decrypt(&walk, req, false);
+	if (unlikely(err))
+		return err;
 
 	kernel_neon_begin();
 
 	if (req->assoclen)
 		ccm_calculate_auth_mac(req, mac);
 
-	while (walk.nbytes) {
+	do {
 		u32 tail = walk.nbytes % AES_BLOCK_SIZE;
-		bool final = walk.nbytes == walk.total;
 
-		if (final)
+		if (walk.nbytes == walk.total)
 			tail = 0;
 
 		ce_aes_ccm_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
 				   walk.nbytes - tail, ctx->key_enc,
 				   num_rounds(ctx), mac, walk.iv);
 
-		if (!final)
-			kernel_neon_end();
-		err = skcipher_walk_done(&walk, tail);
-		if (!final)
-			kernel_neon_begin();
-	}
+		if (walk.nbytes == walk.total)
+			ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
 
-	ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+		kernel_neon_end();
 
-	kernel_neon_end();
-
-	if (unlikely(err))
-		return err;
+		if (walk.nbytes) {
+			err = skcipher_walk_done(&walk, tail);
+			if (unlikely(err))
+				return err;
+			if (unlikely(walk.nbytes))
+				kernel_neon_begin();
+		}
+	} while (walk.nbytes);
 
 	/* compare calculated auth tag with the stored one */
 	scatterwalk_map_and_copy(buf, req->src,

From patchwork Thu Jan 18 17:06:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 763716 Received: from mail-yb1-f202.google.com (mail-yb1-f202.google.com [209.85.219.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19A562C843 for ; Thu, 18 Jan 2024 17:07:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705597664; cv=none; b=bLLSB6kVMbmkNvqJefeM1AtTC6OhNHpeciFOA8OKq60Y6SIzPja0uYsRTcOlUkRyCWy2wqVOgAjjUfd0LhGxqCuXCSiaLdhXfwvG8BuJ4AyCHiwle/xKbduZH88jUDDUhMLgpl/NFZjEyQ6LRYqlf9drT4M4gLhWp7/ZnYhI6Ms= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705597664; c=relaxed/simple;
bh=esEuVtsGnTmd0hkaLNKKp1YZntXpEFU6zENBP770TFE=; h=Received:DKIM-Signature:X-Google-DKIM-Signature: X-Gm-Message-State:X-Google-Smtp-Source:X-Received:Date: In-Reply-To:Mime-Version:References:X-Developer-Key: X-Developer-Signature:X-Mailer:Message-ID:Subject:From:To:Cc: Content-Type; b=FEVAROfwP8kK9eH1ucfPYzaHXGuLWetvbAj317Pq4lfwnE2dEpgnzno/PG4DACCM+DwE7bEQ67zj07fQ2v65MsmANudDAApA8Ew5ZCYP5/AGSn6yRBnc3HUXDx8XeGDOFEDVYWiuhPsooiSULleI9tY0v4Zfgz4wA9BZ5WhUDz0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=dC0uuvG2; arc=none smtp.client-ip=209.85.219.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="dC0uuvG2" Received: by mail-yb1-f202.google.com with SMTP id 3f1490d57ef6-dc24de01bd9so2630594276.1 for ; Thu, 18 Jan 2024 09:07:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1705597662; x=1706202462; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=rOaubHCN0d/JbHeeyKhI2Cii265DwIb3nT8U1KTfMHU=; b=dC0uuvG2gTDmVJXshQ14JKXfWV3JCCSYpZhstZDb+oCvyNGC7zlwgyb1wR9+pQG6T/ V3bfmbjjdHK82cd+Y2X39Qz0Iy2KSvF27NjZjHrp1Sr75CWG2wbXHGFiYhzMRrQpkmd4 Zl7hD+0zNZhE2WO13MgToOsU4+BajAfUb+n57QorYFDAdspPAFtKTAsrdT0gCVqPvq7P i/8U9JzIle49ks+YHhCXLJzJn/+FTni7yJtwAqTmi4WMU5hTk1CJArk4p5MNn2wkG/G1 wa7FXpJYEOJXkHcvBWZZbk4vcxmT6ppaOKkNWQ2MkSavyFy+V+rTU6yp8GlkOEY0kpqo lUeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705597662; x=1706202462; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=rOaubHCN0d/JbHeeyKhI2Cii265DwIb3nT8U1KTfMHU=; b=GV+VQ2T6Ft1QwLAw4kBcpWVTJeYR3xrlyy9REXiLPJbPurWDu3PM3/NQ582yuFhwvo ZTMeer4Ez/Z1SqbIBltOE5OnOVeAxW4QxXEGJN/rehRm6SSSKhQjMe4mAbfp4EtY9z2Z FEpTNSFueutTdSOfxO6ndGabOKewcPIE5XuURAOeY4eOlXPIuLjiMc9PAvA+stH8X9LZ e+wKJzy1oBJsMTDFm/UrgMUfl8x6xLuZhcXM3o3m8nHcWx5TvonxgnNv7w/Nl4kvGJYY UidgIChveevfoAOWZF4+vTULtpsNQlnQkoMzrEqoNPJU4yGCNLGEZKJ9ZDWVjvIYzt// aU6A== X-Gm-Message-State: AOJu0YwVKIXuTsxfoKO/QsUXeKb+0xcCLT+tx7I3cKxUaeRI4vV1ljNJ fYJxo+n7u7xIue6+v+PshM2ANX16WRFe0TfGf8UmLmFUbmlvmGqAPzZSeEkBM3VvQW1pfEDMkSb 0bIfTl2LtWc8VeFxJZt9JLX5FFwB32W5afK5/1DZQaULCGvbH+CcNlg4MmINiRPLQZlDz2ykvzU 5fqtY9ySPjfBLJfGhVfyjkUY3U7jOUVg== X-Google-Smtp-Source: AGHT+IHM6SQvc5VkcIa9M59oWVSyxOnSY1QtA/erkBYS5yGmK6Kt14fMjuRsbozq1O3Nr8+Cby/i8Ve1 X-Received: from palermo.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:118a]) (user=ardb job=sendgmr) by 2002:a05:6902:2705:b0:dc2:3411:6424 with SMTP id dz5-20020a056902270500b00dc234116424mr477264ybb.2.1705597662157; Thu, 18 Jan 2024 09:07:42 -0800 (PST) Date: Thu, 18 Jan 2024 18:06:31 +0100 In-Reply-To: <20240118170628.3049797-10-ardb+git@google.com> Precedence: bulk X-Mailing-List: linux-crypto@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240118170628.3049797-10-ardb+git@google.com> X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909 X-Developer-Signature: v=1; a=openpgp-sha256; l=1868; i=ardb@kernel.org; h=from:subject; bh=8krDFuO6ohSM5gr3lqbU2dJ0WRAj0VbK0XKYMUR2NTg=; b=owGbwMvMwCFmkMcZplerG8N4Wi2JIXVl1PRHblMnpFydMvO9y8MT/1sOPy11/7lJ8t61LQHdb rMcTjDKdZSyMIhxMMiKKbIIzP77bufpiVK1zrNkYeawMoEMYeDiFICJaHsw/PfP/7J00ZtaY7v7 sjk7xb8E7wu686NMmVF2/p8Z8v0Fjx4wMrzT//i+caHsr1Un2ad+bT4c+ifc+dQlwQYOjdeTZxy 1+8sPAA== X-Mailer: git-send-email 2.43.0.381.gb435a96ce8-goog Message-ID: 
<20240118170628.3049797-12-ardb+git@google.com>
Subject: [PATCH v2 2/8] crypto: arm64/aes-ccm - Keep NEON enabled during skcipher walk
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: ebiggers@kernel.org, herbert@gondor.apana.org.au, Ard Biesheuvel

From: Ard Biesheuvel

Now that kernel mode NEON no longer disables preemption, we no longer
have to take care to disable and re-enable use of the NEON when calling
into the skcipher walk API. So just keep it enabled until done.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-glue.c | 22 +++++++++-----------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index c4f14415f5f0..b177ebea7d09 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -182,17 +182,16 @@ static int ccm_encrypt(struct aead_request *req)
 		if (walk.nbytes == walk.total)
 			ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
 
-		kernel_neon_end();
-
 		if (walk.nbytes) {
 			err = skcipher_walk_done(&walk, tail);
-			if (unlikely(err))
-				return err;
-			if (unlikely(walk.nbytes))
-				kernel_neon_begin();
 		}
 	} while (walk.nbytes);
 
+	kernel_neon_end();
+
+	if (unlikely(err))
+		return err;
+
 	/* copy authtag to end of dst */
 	scatterwalk_map_and_copy(mac, req->dst, req->assoclen + req->cryptlen,
 				 crypto_aead_authsize(aead), 1);
@@ -240,17 +239,16 @@ static int ccm_decrypt(struct aead_request *req)
 		if (walk.nbytes == walk.total)
 			ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
 
-		kernel_neon_end();
-
 		if (walk.nbytes) {
 			err = skcipher_walk_done(&walk, tail);
-			if (unlikely(err))
-				return err;
-			if (unlikely(walk.nbytes))
-				kernel_neon_begin();
 		}
 	} while (walk.nbytes);
 
+	kernel_neon_end();
+
+	if (unlikely(err))
+		return err;
+
 	/* compare calculated auth tag with the stored one */
 	scatterwalk_map_and_copy(buf, req->src,
 				 req->assoclen + req->cryptlen - authsize,

From patchwork Thu Jan 18 17:06:32 2024 Content-Type:
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 764508 Received: from mail-yw1-f201.google.com (mail-yw1-f201.google.com [209.85.128.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7C0A02C843 for ; Thu, 18 Jan 2024 17:07:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705597667; cv=none; b=WctVfWIxGd5NTsHv2w6lCA7m8+P7a+kUGbXf3cOaLRG4qkOenEb3nmLE1kGsGv24sDBvFQdvcJy27gaK+jNao3FVcWUOcUr1fxYTHz73tZxdu/0O0xbMj7QwA7z5YQS0pSs4RIHrybcdooGniflKpx7HCjpzs1FAkometXrr6t8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705597667; c=relaxed/simple; bh=pZ8oCY0r84RIgsUexJP/p9VeZlb6wrHZvW0eZmjWFv4=; h=Received:DKIM-Signature:X-Google-DKIM-Signature: X-Gm-Message-State:X-Google-Smtp-Source:X-Received:Date: In-Reply-To:Mime-Version:References:X-Developer-Key: X-Developer-Signature:X-Mailer:Message-ID:Subject:From:To:Cc: Content-Type; b=Z8EdrWVddVx5RJZqZVXyo+h1ZIOmlnu6qGsJ9jSwNUwtP2QN6GNSlH40EiEbTnrIO5r/5vmj1aPvIiOF/TckFTYvWzu840JSGfbodVDNsngjXrujY/ZwN81lkYzOYyz37vxwEPxQHwQFlKPSIB3Noshe45+/MK9FUamCAXsNEeM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=E7+PNovl; arc=none smtp.client-ip=209.85.128.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="E7+PNovl" 
Received: by mail-yw1-f201.google.com with SMTP id 00721157ae682-5fba6d7da06so101705407b3.0 for ; Thu, 18 Jan 2024 09:07:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1705597664; x=1706202464; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=HuhFJOxH8jB0ctCigCFsGkMCnb5I3c+w8zCI/ra5sf0=; b=E7+PNovljKBeZW72Ks1zI6CXuo+AQXQVwdo6dhqlsUQt59L/LlymiYoQGmkmIhNjRm fZtMqS4XUIhNgMKqZEPEjZvk+Yuvm1FNn3+hHdYsT4AfIWvvubCGiMt6v5c166G489Hk ukU4JA4B+lTn8GNcw7beoXmXucrCsNHUCrOgw7CxgZ19Wd+pxOjeBBiydksqFSqyh+80 2Wy4Ia+J+ynsnuBDbBMjjiI6NuAwFWEN/B9PRb1OlKkNChbCwQN5nFabC5VWlRpX+jG2 j9Wgj+WSxTJfOpMscLMi2XsO2qTxKDxzbUC4XWSntvcw4KLMxxhmTEpB/q6tKOGXscs1 x16A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705597664; x=1706202464; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=HuhFJOxH8jB0ctCigCFsGkMCnb5I3c+w8zCI/ra5sf0=; b=wpc7hKpmOMuMgN9dd/HtyEmyR2AmhrWlrPTslndrz0gTq16/S4Iccyme0ZPbVRR3Qe CmCosnXgf0oVOJN1vRHVsvSnzW7bwpeRF3Sc7E+lVCCPFXFdaypkzqK63tEFeSZ/EQA1 jVjuPAPv/hynTEOU+VgUSOAvWYLx6SdpxAUQZsoDTqy4gQZePYG1ms6aGsceBN46nV9N noDmPEAv5sQeHZCuXVFRmufb2VKz2ddy230cUumvpieMbo5Q6X2if/486PHTuf4p1yKC OFBdT4NerBMXHGQWDJPLEEnQJUZO9Adq43VzcTH7+kHsu1xZGo7txcw/YYEMTT7YvIXV WlkQ== X-Gm-Message-State: AOJu0YyInTqGthl2AfheRCOGlX7J61wQQnNZrQmRQn4qYOUpOvWdKYHK 1ZjS8VK6pMoSA8VSkmteuY7ll8YwZ6WGJpfiTLFvfbFkHJ1IEfLgXIOF6gckzEBLc6ZCIado5xs HGM9JSSjJUM2DiQGYa28WOtbY1oxJsdoCto0AY+I+B0q+RTQUsmQMN41VA70REjGQnQbnca7HLO 4BCWMaVcYMWqpILCqrNUQ6FafB1UR8fg== X-Google-Smtp-Source: AGHT+IHXIK11si6qpGYUlCmD+oq280gtATie85QcyK9Mxc34DtzJbmZzVV1q5MDX7rFIFjUSn+rdziXZ X-Received: from palermo.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:118a]) (user=ardb job=sendgmr) by 2002:a25:abd1:0:b0:dc2:5237:81c7 with SMTP id 
v75-20020a25abd1000000b00dc2523781c7mr501414ybi.3.1705597664418; Thu, 18 Jan 2024 09:07:44 -0800 (PST)
Date: Thu, 18 Jan 2024 18:06:32 +0100
In-Reply-To: <20240118170628.3049797-10-ardb+git@google.com>
Precedence: bulk
X-Mailing-List: linux-crypto@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20240118170628.3049797-10-ardb+git@google.com>
X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909
X-Developer-Signature: v=1; a=openpgp-sha256; l=5464; i=ardb@kernel.org; h=from:subject; bh=9yJ/w8NC+Jc2jXxJ0qx2wj4/busGKuIzmC9B7mGx820=;
 b=owGbwMvMwCFmkMcZplerG8N4Wi2JIXVl1IxNVR4LrN92Gf07P+PPrdSjhonituv2md69yDf3g
 GP9hsTpHaUsDGIcDLJiiiwCs/++23l6olSt8yxZmDmsTCBDGLg4BWAi7/MYGR4/7G+RY/gTYDiF
 a8kWluy99o7r71ouSM2fNf/65NlN7MIMf7jufJe6eibBYOGJLw0rJYwfs+QZ2T4/VZzF5/tPnrc
 3lx0A
X-Mailer: git-send-email 2.43.0.381.gb435a96ce8-goog
Message-ID: <20240118170628.3049797-13-ardb+git@google.com>
Subject: [PATCH v2 3/8] crypto: arm64/aes-ccm - Pass short inputs via stack buffer
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: ebiggers@kernel.org, herbert@gondor.apana.org.au, Ard Biesheuvel

From: Ard Biesheuvel

In preparation for optimizing the CCM core asm code using permutation
vectors and overlapping loads and stores, ensure that inputs shorter
than the size of an AES block are passed via a buffer on the stack, in a
way that positions the data at the end of a 16 byte buffer. This removes
the need for the asm code to reason about a rare corner case where the
tail of the data cannot be read/written using a single NEON load/store
instruction.

While at it, tweak the copyright header and authorship to bring it up to
date.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-glue.c | 57 ++++++++++++++------
 1 file changed, 40 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index b177ebea7d09..4710e59075f5 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -1,8 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * aes-ccm-glue.c - AES-CCM transform for ARMv8 with Crypto Extensions
+ * aes-ce-ccm-glue.c - AES-CCM transform for ARMv8 with Crypto Extensions
  *
- * Copyright (C) 2013 - 2017 Linaro Ltd
+ * Copyright (C) 2013 - 2017 Linaro Ltd.
+ * Copyright (C) 2024 Google LLC
+ *
+ * Author: Ard Biesheuvel
  */
 
 #include
@@ -149,7 +152,7 @@ static int ccm_encrypt(struct aead_request *req)
 	struct crypto_aes_ctx *ctx = crypto_aead_ctx(aead);
 	struct skcipher_walk walk;
 	u8 __aligned(8) mac[AES_BLOCK_SIZE];
-	u8 buf[AES_BLOCK_SIZE];
+	u8 orig_iv[AES_BLOCK_SIZE];
 	u32 len = req->cryptlen;
 	int err;
 
@@ -158,7 +161,7 @@ static int ccm_encrypt(struct aead_request *req)
 		return err;
 
 	/* preserve the original iv for the final round */
-	memcpy(buf, req->iv, AES_BLOCK_SIZE);
+	memcpy(orig_iv, req->iv, AES_BLOCK_SIZE);
 
 	err = skcipher_walk_aead_encrypt(&walk, req, false);
 	if (unlikely(err))
@@ -171,16 +174,26 @@ static int ccm_encrypt(struct aead_request *req)
 
 	do {
 		u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+		const u8 *src = walk.src.virt.addr;
+		u8 *dst = walk.dst.virt.addr;
+		u8 buf[AES_BLOCK_SIZE];
 
 		if (walk.nbytes == walk.total)
 			tail = 0;
 
-		ce_aes_ccm_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
-				   walk.nbytes - tail, ctx->key_enc,
-				   num_rounds(ctx), mac, walk.iv);
+		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
+			src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
+					   src, walk.nbytes);
+
+		ce_aes_ccm_encrypt(dst, src, walk.nbytes - tail,
+				   ctx->key_enc, num_rounds(ctx),
+				   mac, walk.iv);
+
+		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
+			memcpy(walk.dst.virt.addr, dst, walk.nbytes);
 
 		if (walk.nbytes == walk.total)
-			ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+			ce_aes_ccm_final(mac, orig_iv, ctx->key_enc, num_rounds(ctx));
 
 		if (walk.nbytes) {
 			err = skcipher_walk_done(&walk, tail);
@@ -206,7 +219,7 @@ static int ccm_decrypt(struct aead_request *req)
 	unsigned int authsize = crypto_aead_authsize(aead);
 	struct skcipher_walk walk;
 	u8 __aligned(8) mac[AES_BLOCK_SIZE];
-	u8 buf[AES_BLOCK_SIZE];
+	u8 orig_iv[AES_BLOCK_SIZE];
 	u32 len = req->cryptlen - authsize;
 	int err;
 
@@ -215,7 +228,7 @@ static int ccm_decrypt(struct aead_request *req)
 		return err;
 
 	/* preserve the original iv for the final round */
-	memcpy(buf, req->iv, AES_BLOCK_SIZE);
+	memcpy(orig_iv, req->iv, AES_BLOCK_SIZE);
 
 	err = skcipher_walk_aead_decrypt(&walk, req, false);
 	if (unlikely(err))
@@ -228,16 +241,26 @@ static int ccm_decrypt(struct aead_request *req)
 
 	do {
 		u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+		const u8 *src = walk.src.virt.addr;
+		u8 *dst = walk.dst.virt.addr;
+		u8 buf[AES_BLOCK_SIZE];
 
 		if (walk.nbytes == walk.total)
 			tail = 0;
 
-		ce_aes_ccm_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
-				   walk.nbytes - tail, ctx->key_enc,
-				   num_rounds(ctx), mac, walk.iv);
+		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
+			src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
+					   src, walk.nbytes);
+
+		ce_aes_ccm_decrypt(dst, src, walk.nbytes - tail,
+				   ctx->key_enc, num_rounds(ctx),
+				   mac, walk.iv);
+
+		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
+			memcpy(walk.dst.virt.addr, dst, walk.nbytes);
 
 		if (walk.nbytes == walk.total)
-			ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+			ce_aes_ccm_final(mac, orig_iv, ctx->key_enc, num_rounds(ctx));
 
 		if (walk.nbytes) {
 			err = skcipher_walk_done(&walk, tail);
@@ -250,11 +273,11 @@ static int ccm_decrypt(struct aead_request *req)
 		return err;
 
 	/* compare calculated auth tag with the stored one */
-	scatterwalk_map_and_copy(buf, req->src,
+	scatterwalk_map_and_copy(orig_iv, req->src,
 				 req->assoclen + req->cryptlen - authsize,
 				 authsize, 0);
 
-	if (crypto_memneq(mac, buf, authsize))
+	if (crypto_memneq(mac, orig_iv, authsize))
 		return -EBADMSG;
 	return 0;
 }
@@ -293,6 +316,6 @@ module_init(aes_mod_init);
 module_exit(aes_mod_exit);
 
 MODULE_DESCRIPTION("Synchronous AES in CCM mode using ARMv8 Crypto Extensions");
-MODULE_AUTHOR("Ard Biesheuvel ");
+MODULE_AUTHOR("Ard Biesheuvel ");
 MODULE_LICENSE("GPL v2");
 MODULE_ALIAS_CRYPTO("ccm(aes)");

From patchwork Thu Jan 18 17:06:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 763715 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E46822C843 for ; Thu, 18 Jan 2024 17:07:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705597669; cv=none; b=KDTRMKDRG/OcH9rGEZDwxhl4UCFofYtS3+CadySaMkHrZWze+j75Pgo6Jz9+hms8OrLsLWi7TvGjbu8/0TuYvvPyW/opCdUDqmUWXgtEDl2ZS/Lnz6ddSoOxcC17TjL08ysmnzIiwuhktIQNxDVIw7I6J04fuZl7/THtWTNXvg4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705597669; c=relaxed/simple; bh=gwH56e0c63ZRWs77oxSFBaciVUVHWYbhYw5bqsppNZA=; h=Received:DKIM-Signature:X-Google-DKIM-Signature: X-Gm-Message-State:X-Google-Smtp-Source:X-Received:Date: In-Reply-To:Mime-Version:References:X-Developer-Key: X-Developer-Signature:X-Mailer:Message-ID:Subject:From:To:Cc: Content-Type; b=azPyCsAoy8p6NK2qbpHMOTi68vQf2jbD2fHrqSfI7awuIW2wEDn2QYcAWy4Tl/8k4Z8sBP0spC5NXRZdhsrmRUWTuqAKtYQigvp/oZwow21zvz3RSC4j6YrcEcLWbEXjo8cFVo2i/nMlH9DzSUi6v4qw4cB8mH3sKuk9SBBRdNE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com; dkim=pass
(2048-bit key) header.d=google.com header.i=@google.com header.b=hT8dYwzr; arc=none smtp.client-ip=209.85.128.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--ardb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="hT8dYwzr" Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-5ecfd153ccfso225585697b3.2 for ; Thu, 18 Jan 2024 09:07:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1705597667; x=1706202467; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=MugIpKeTh5Ryctd5MeYebeB9tSZn9LJ8sjM7L85sl5s=; b=hT8dYwzr17gbbX4jPF9pRSqQFBi0yuSOwi2QUDxLZonQ/b8LtaecCb3lD1/Gyx3K6I +nYLOoXdvw27xbTiWuXoJAdcgAG7hEqU9w2YlugZGmn17etPRehlPXA8vyozzd5ectQv 1QncdM7JPNAKYpj/NvA9jbMylnnk/CRwoBvysmtupKY8LOrCHhQuQFdsciiutcuhj917 heUryRarzhrqxYoVrW7buGSB86doVEd20EzqKmwt4ec2/yDc/MkQpN+XOAIxHsjWOnAy +77crtCAJ82CntlNgwo1EF6Y1xSsFEwK7INTXy0nO1sw3qAiQOy9maoW9LjIil71DfEV h7ZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705597667; x=1706202467; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=MugIpKeTh5Ryctd5MeYebeB9tSZn9LJ8sjM7L85sl5s=; b=MGZVW9qM9z4extV+P/ApOgoNFZrH6paCve771li50chUlDwucQkc1jSyLZtY3qwVgL MxsuKM+cc5NMe4sifJNvEcZqNurcVflMZhQtyS0pdD3RhsPr/Fh/Cm8zRegNPESWqRXO obDL+LDMur8Bo9BkRuev81VBtUGtgxfh4g3YiKAvFXWxeTkrIM4AcRO2sA5c7woIgvsm gevpMXtCBZbZ+3KfBfGAyJo7EJgh+Y63DmKS6KMUgZbC30a02tfMHcSPvTzIDG32AazR zG/fXb3MJuq+I35W7L03CKO3B521KbjQisuYz/D3fzPdpkfL93cYs1A06hlRtA8aElCI d79g== X-Gm-Message-State: AOJu0Yy5CgyyLbm7hwhSdSGLIqQZDYgP3GSg7nNrc0PnfnO8PrH2+seW 
4sMLbPp4uNbWc4flCJav6KYeFkWVTYmcKc1mh1x4u9FsIveNqg5HuwXUGON3beQ6gq87FQc0J/Y lxs6dvWfrGZN8SrOxLAMYK7lcbKwhKVncCigosztu4nWbdBOw0QHz3TmrDpD+sjsKHKah3TtvQ2 36tZKnesoFtdRseq2Dh+kdY/hG8dgysA== X-Google-Smtp-Source: AGHT+IFzmFGMYipPUCWCf9jhIQke5LXuXQ3dIF3FvFy0k4NF6AROINPy/3x80U2ZRc2GZ92bfBLCAfTc X-Received: from palermo.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:118a]) (user=ardb job=sendgmr) by 2002:a25:dc82:0:b0:dc2:23f5:1791 with SMTP id y124-20020a25dc82000000b00dc223f51791mr474218ybe.6.1705597666848; Thu, 18 Jan 2024 09:07:46 -0800 (PST)
Date: Thu, 18 Jan 2024 18:06:33 +0100
In-Reply-To: <20240118170628.3049797-10-ardb+git@google.com>
Precedence: bulk
X-Mailing-List: linux-crypto@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20240118170628.3049797-10-ardb+git@google.com>
X-Developer-Key: i=ardb@kernel.org; a=openpgp; fpr=F43D03328115A198C90016883D200E9CA6329909
X-Developer-Signature: v=1; a=openpgp-sha256; l=3968; i=ardb@kernel.org; h=from:subject; bh=quStLKK0aPxXFozIrYQnjP2Hf1TbtvIgMZfqOvsyWqg=;
 b=owGbwMvMwCFmkMcZplerG8N4Wi2JIXVl1MzNXs/nZJ6/xnKh379QU9J7V11KY53qLmfmp9m1/
 OHK7godpSwMYhwMsmKKLAKz/77beXqiVK3zLFmYOaxMIEMYuDgFYCINsgz/60q0fD4pCPFVH9PI
 seWIz2sWUQ4IDg9ZXGgrsOLV+T8vGf7pOdf5PpDuF3zoGrQ3+XvBhP4Z31bOiYoLy7arsFO6UME
 BAA==
X-Mailer: git-send-email 2.43.0.381.gb435a96ce8-goog
Message-ID: <20240118170628.3049797-14-ardb+git@google.com>
Subject: [PATCH v2 4/8] crypto: arm64/aes-ccm - Replace bytewise tail handling with NEON permute
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: ebiggers@kernel.org, herbert@gondor.apana.org.au, Ard Biesheuvel

From: Ard Biesheuvel

Implement the CCM tail handling using a single sequence that uses
permute vectors and overlapping loads and stores, rather than going over
the tail byte by byte in a loop, and using scalar operations. This is
more efficient, even though the measured speedup is only around 1-2% on
the CPUs I have tried.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-core.S | 59 +++++++++++++-------
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index b03f7f71f893..b21a9b759ab2 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -1,8 +1,11 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
 /*
- * aesce-ccm-core.S - AES-CCM transform for ARMv8 with Crypto Extensions
+ * aes-ce-ccm-core.S - AES-CCM transform for ARMv8 with Crypto Extensions
  *
- * Copyright (C) 2013 - 2017 Linaro Ltd
+ * Copyright (C) 2013 - 2017 Linaro Ltd.
+ * Copyright (C) 2024 Google LLC
+ *
+ * Author: Ard Biesheuvel
  */
 
 #include
@@ -168,13 +171,13 @@ CPU_LE( rev x8, x8 ) /* keep swabbed ctr in reg */
 	ld1	{v2.16b}, [x1], #16		/* load next input block */
 	.if	\enc == 1
 	eor	v2.16b, v2.16b, v5.16b		/* final round enc+mac */
-	eor	v1.16b, v1.16b, v2.16b		/* xor with crypted ctr */
+	eor	v6.16b, v1.16b, v2.16b		/* xor with crypted ctr */
 	.else
 	eor	v2.16b, v2.16b, v1.16b		/* xor with crypted ctr */
-	eor	v1.16b, v2.16b, v5.16b		/* final round enc */
+	eor	v6.16b, v2.16b, v5.16b		/* final round enc */
 	.endif
 	eor	v0.16b, v0.16b, v2.16b		/* xor mac with pt ^ rk[last] */
-	st1	{v1.16b}, [x0], #16		/* write output block */
+	st1	{v6.16b}, [x0], #16		/* write output block */
 	bne	0b
 CPU_LE(	rev	x8, x8			)
 	st1	{v0.16b}, [x5]			/* store mac */
@@ -183,25 +186,31 @@ CPU_LE( rev x8, x8 )
 
 6:	eor	v0.16b, v0.16b, v5.16b		/* final round mac */
 	eor	v1.16b, v1.16b, v5.16b		/* final round enc */
-	st1	{v0.16b}, [x5]			/* store mac */
-	add	w2, w2, #16			/* process partial tail block */
-7:	ldrb	w9, [x1], #1			/* get 1 byte of input */
-	umov	w6, v1.b[0]			/* get top crypted ctr byte */
-	umov	w7, v0.b[0]			/* get top mac byte */
+
+	add	x1, x1, w2, sxtw		/* rewind the input pointer (w2 < 0) */
+	add	x0, x0, w2, sxtw		/* rewind the output pointer */
+
+	adr_l	x8, .Lpermute			/* load permute vectors */
+	add	x9, x8, w2, sxtw
+	sub	x8, x8, w2, sxtw
+	ld1	{v7.16b-v8.16b}, [x9]
+	ld1	{v9.16b}, [x8]
+
+	ld1	{v2.16b}, [x1]			/* load a full block of input */
+	tbl	v1.16b, {v1.16b}, v7.16b	/* move keystream to end of register */
 	.if	\enc == 1
-	eor	w7, w7, w9
-	eor	w9, w9, w6
+	tbl	v7.16b, {v2.16b}, v9.16b	/* copy plaintext to start of v7 */
+	eor	v2.16b, v2.16b, v1.16b		/* encrypt partial input block */
 	.else
-	eor	w9, w9, w6
-	eor	w7, w7, w9
+	eor	v2.16b, v2.16b, v1.16b		/* decrypt partial input block */
+	tbl	v7.16b, {v2.16b}, v9.16b	/* copy plaintext to start of v7 */
 	.endif
-	strb	w9, [x0], #1			/* store out byte */
-	strb	w7, [x5], #1			/* store mac byte */
-	subs	w2, w2, #1
-	beq	5b
-	ext	v0.16b, v0.16b, v0.16b, #1	/* shift out mac byte */
-	ext	v1.16b, v1.16b, v1.16b, #1	/* shift out ctr byte */
-	b	7b
+	eor	v0.16b, v0.16b, v7.16b		/* fold plaintext into mac */
+	tbx	v2.16b, {v6.16b}, v8.16b	/* insert output from previous iteration */
+
+	st1	{v0.16b}, [x5]			/* store mac */
+	st1	{v2.16b}, [x0]			/* store output block */
+	ret
 	.endm
 
 	/*
@@ -219,3 +228,11 @@ SYM_FUNC_END(ce_aes_ccm_encrypt)
 SYM_FUNC_START(ce_aes_ccm_decrypt)
 	aes_ccm_do_crypt	0
 SYM_FUNC_END(ce_aes_ccm_decrypt)
+
+	.section ".rodata", "a"
+	.align	6
+	.fill	15, 1, 0xff
+.Lpermute:
+	.byte	0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7
+	.byte	0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf
+	.fill	15, 1, 0xff

From patchwork Thu Jan 18 17:06:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 764507 Received: from mail-wm1-f74.google.com (mail-wm1-f74.google.com [209.85.128.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C97662C844 for ; Thu, 18 Jan 2024 17:07:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
Date: Thu, 18 Jan 2024 18:06:34 +0100
In-Reply-To: <20240118170628.3049797-10-ardb+git@google.com>
References: <20240118170628.3049797-10-ardb+git@google.com>
X-Mailing-List: linux-crypto@vger.kernel.org
Message-ID: <20240118170628.3049797-15-ardb+git@google.com>
Subject: [PATCH v2 5/8] crypto: arm64/aes-ccm - Reuse existing MAC update for AAD input
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: ebiggers@kernel.org, herbert@gondor.apana.org.au, Ard Biesheuvel

From: Ard Biesheuvel

CCM combines the counter (CTR) encryption mode with a MAC based on the
same block cipher. This MAC construction is a bit clunky: it invokes the
block cipher in a way that cannot be parallelized, resulting in poor CPU
pipeline efficiency.

The arm64 CCM code mitigates this by interleaving the encryption and MAC
at the AES round level, resulting in a substantial speedup. But this
approach does not apply to the additional authenticated data (AAD) which
is not encrypted.

This means the special asm routine dealing with the AAD is not any
better than the MAC update routine used by the arm64 AES block
encryption driver, so let's reuse that, and drop the special AES-CCM
version.
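
The serial dependency described above, and the role of the byte position
that the glue code calls macp, can be illustrated with a small C model.
This is a simplified sketch under stated assumptions and not the code
from the diff: toy_encrypt_block() is a stand-in that is NOT AES, and
mac_absorb() and mac_split_matches_oneshot() are hypothetical names. As
in the patch, a macp value equal to the block size means that the current
block is full and still has to be encrypted before more input is
absorbed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLK 16

/* Stand-in for one block cipher call (NOT AES; a fixed byte mixing). */
static void toy_encrypt_block(uint8_t b[BLK])
{
	for (int r = 0; r < 4; r++)
		for (int i = 0; i < BLK; i++)
			b[i] = (uint8_t)((b[i] ^ b[(i + 1) % BLK]) * 167u +
					 13u + (unsigned)r);
}

/*
 * Absorb len bytes into the MAC. macp counts the bytes already XORed into
 * the current block. A full block is encrypted lazily, once more input
 * arrives, and each encryption depends on the result of the previous one.
 * Returns the updated macp.
 */
static unsigned mac_absorb(uint8_t mac[BLK], const uint8_t *in, size_t len,
			   unsigned macp)
{
	while (len > 0) {
		size_t l;

		if (macp == BLK) {
			toy_encrypt_block(mac);
			macp = 0;
		}
		l = BLK - macp < len ? BLK - macp : len;
		for (size_t i = 0; i < l; i++)
			mac[macp + i] ^= in[i];
		in += l;
		len -= l;
		macp += (unsigned)l;
	}
	return macp;
}

/* Feeding the data in two pieces must match feeding it in one go. */
int mac_split_matches_oneshot(size_t split)
{
	uint8_t aad[45], a[BLK], b[BLK];
	unsigned pa, pb;

	assert(split <= sizeof(aad));
	for (size_t i = 0; i < sizeof(aad); i++)
		aad[i] = (uint8_t)(7 * i + 3);
	memset(a, 0x11, BLK);	/* models the initial block held in mac[] */
	memset(b, 0x11, BLK);

	pa = mac_absorb(a, aad, sizeof(aad), BLK);
	pb = mac_absorb(b, aad, split, BLK);
	pb = mac_absorb(b, aad + split, sizeof(aad) - split, pb);

	return pa == pb && !memcmp(a, b, BLK);
}
```

Because every call to the block function consumes the output of the
previous one, there is nothing to interleave for the AAD, which is why a
generic MAC update routine does no worse here than a CCM-specific one.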
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/Kconfig           |  1 +
 arch/arm64/crypto/aes-ce-ccm-core.S | 71 --------------------
 arch/arm64/crypto/aes-ce-ccm-glue.c | 49 +++++++++++---
 arch/arm64/crypto/aes-glue.c        |  1 +
 4 files changed, 43 insertions(+), 79 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index eb7b423ba463..e7d9bd8e4709 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -268,6 +268,7 @@ config CRYPTO_AES_ARM64_CE_CCM
 	depends on ARM64 && KERNEL_MODE_NEON
 	select CRYPTO_ALGAPI
 	select CRYPTO_AES_ARM64_CE
+	select CRYPTO_AES_ARM64_CE_BLK
 	select CRYPTO_AEAD
 	select CRYPTO_LIB_AES
 	help
diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index b21a9b759ab2..0132872bd780 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -14,77 +14,6 @@
 	.text
 	.arch	armv8-a+crypto
 
-	/*
-	 * u32 ce_aes_ccm_auth_data(u8 mac[], u8 const in[], u32 abytes,
-	 *			    u32 macp, u8 const rk[], u32 rounds);
-	 */
-SYM_FUNC_START(ce_aes_ccm_auth_data)
-	ld1	{v0.16b}, [x0]			/* load mac */
-	cbz	w3, 1f
-	sub	w3, w3, #16
-	eor	v1.16b, v1.16b, v1.16b
-0:	ldrb	w7, [x1], #1			/* get 1 byte of input */
-	subs	w2, w2, #1
-	add	w3, w3, #1
-	ins	v1.b[0], w7
-	ext	v1.16b, v1.16b, v1.16b, #1	/* rotate in the input bytes */
-	beq	8f				/* out of input? */
-	cbnz	w3, 0b
-	eor	v0.16b, v0.16b, v1.16b
-1:	ld1	{v3.4s}, [x4]			/* load first round key */
-	prfm	pldl1strm, [x1]
-	cmp	w5, #12				/* which key size? */
-	add	x6, x4, #16
-	sub	w7, w5, #2			/* modified # of rounds */
-	bmi	2f
-	bne	5f
-	mov	v5.16b, v3.16b
-	b	4f
-2:	mov	v4.16b, v3.16b
-	ld1	{v5.4s}, [x6], #16		/* load 2nd round key */
-3:	aese	v0.16b, v4.16b
-	aesmc	v0.16b, v0.16b
-4:	ld1	{v3.4s}, [x6], #16		/* load next round key */
-	aese	v0.16b, v5.16b
-	aesmc	v0.16b, v0.16b
-5:	ld1	{v4.4s}, [x6], #16		/* load next round key */
-	subs	w7, w7, #3
-	aese	v0.16b, v3.16b
-	aesmc	v0.16b, v0.16b
-	ld1	{v5.4s}, [x6], #16		/* load next round key */
-	bpl	3b
-	aese	v0.16b, v4.16b
-	subs	w2, w2, #16			/* last data? */
-	eor	v0.16b, v0.16b, v5.16b		/* final round */
-	bmi	6f
-	ld1	{v1.16b}, [x1], #16		/* load next input block */
-	eor	v0.16b, v0.16b, v1.16b		/* xor with mac */
-	bne	1b
-6:	st1	{v0.16b}, [x0]			/* store mac */
-	beq	10f
-	adds	w2, w2, #16
-	beq	10f
-	mov	w3, w2
-7:	ldrb	w7, [x1], #1
-	umov	w6, v0.b[0]
-	eor	w6, w6, w7
-	strb	w6, [x0], #1
-	subs	w2, w2, #1
-	beq	10f
-	ext	v0.16b, v0.16b, v0.16b, #1	/* rotate out the mac bytes */
-	b	7b
-8:	cbz	w3, 91f
-	mov	w7, w3
-	add	w3, w3, #16
-9:	ext	v1.16b, v1.16b, v1.16b, #1
-	adds	w7, w7, #1
-	bne	9b
-91:	eor	v0.16b, v0.16b, v1.16b
-	st1	{v0.16b}, [x0]
-10:	mov	w0, w3
-	ret
-SYM_FUNC_END(ce_aes_ccm_auth_data)
-
 	/*
 	 * void ce_aes_ccm_final(u8 mac[], u8 const ctr[], u8 const rk[],
 	 * 			 u32 rounds);
diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index 4710e59075f5..ed3d79e05112 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -18,6 +18,8 @@
 
 #include "aes-ce-setkey.h"
 
+MODULE_IMPORT_NS(CRYPTO_INTERNAL);
+
 static int num_rounds(struct crypto_aes_ctx *ctx)
 {
 	/*
@@ -30,8 +32,9 @@ static int num_rounds(struct crypto_aes_ctx *ctx)
 	return 6 + ctx->key_length / 4;
 }
 
-asmlinkage u32 ce_aes_ccm_auth_data(u8 mac[], u8 const in[], u32 abytes,
-				    u32 macp, u32 const rk[], u32 rounds);
+asmlinkage u32 ce_aes_mac_update(u8 const in[], u32 const rk[], int rounds,
+				 int blocks, u8 dg[], int enc_before,
+				 int enc_after);
 
 asmlinkage void ce_aes_ccm_encrypt(u8 out[], u8 const in[], u32 cbytes,
 				   u32 const rk[], u32 rounds, u8 mac[],
@@ -97,6 +100,41 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen)
 	return 0;
 }
 
+static u32 ce_aes_ccm_auth_data(u8 mac[], u8 const in[], u32 abytes,
+				u32 macp, u32 const rk[], u32 rounds)
+{
+	int enc_after = (macp + abytes) % AES_BLOCK_SIZE;
+
+	do {
+		u32 blocks = abytes / AES_BLOCK_SIZE;
+
+		if (macp == AES_BLOCK_SIZE || (!macp && blocks > 0)) {
+			u32 rem = ce_aes_mac_update(in, rk, rounds, blocks, mac,
+						    macp, enc_after);
+			u32 adv = (blocks - rem) * AES_BLOCK_SIZE;
+
+			macp = enc_after ? 0 : AES_BLOCK_SIZE;
+			in += adv;
+			abytes -= adv;
+
+			if (unlikely(rem)) {
+				kernel_neon_end();
+				kernel_neon_begin();
+				macp = 0;
+			}
+		} else {
+			u32 l = min(AES_BLOCK_SIZE - macp, abytes);
+
+			crypto_xor(&mac[macp], in, l);
+			in += l;
+			macp += l;
+			abytes -= l;
+		}
+	} while (abytes > 0);
+
+	return macp;
+}
+
 static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 {
 	struct crypto_aead *aead = crypto_aead_reqtfm(req);
@@ -104,7 +142,7 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 	struct __packed { __be16 l; __be32 h; u16 len; } ltag;
 	struct scatter_walk walk;
 	u32 len = req->assoclen;
-	u32 macp = 0;
+	u32 macp = AES_BLOCK_SIZE;
 
 	/* prepend the AAD with a length tag */
 	if (len < 0xff00) {
@@ -128,16 +166,11 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 			scatterwalk_start(&walk, sg_next(walk.sg));
 			n = scatterwalk_clamp(&walk, len);
 		}
-		n = min_t(u32, n, SZ_4K); /* yield NEON at least every 4k */
 		p = scatterwalk_map(&walk);
 
 		macp = ce_aes_ccm_auth_data(mac, p, n, macp, ctx->key_enc,
 					    num_rounds(ctx));
 
-		if (len / SZ_4K > (len - n) / SZ_4K) {
-			kernel_neon_end();
-			kernel_neon_begin();
-		}
 		len -= n;
 
 		scatterwalk_unmap(p);
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 162787c7aa86..a147e847a5a1 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -1048,6 +1048,7 @@ static int __init aes_init(void)
 
 #ifdef USE_V8_CRYPTO_EXTENSIONS
 module_cpu_feature_match(AES, aes_init);
+EXPORT_SYMBOL_NS(ce_aes_mac_update, CRYPTO_INTERNAL);
 #else
 module_init(aes_init);
 EXPORT_SYMBOL(neon_aes_ecb_encrypt);

From patchwork Thu Jan 18 17:06:35 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 763714
Date: Thu, 18 Jan 2024 18:06:35 +0100
In-Reply-To: <20240118170628.3049797-10-ardb+git@google.com>
References: <20240118170628.3049797-10-ardb+git@google.com>
X-Mailing-List: linux-crypto@vger.kernel.org
Message-ID: <20240118170628.3049797-16-ardb+git@google.com>
Subject: [PATCH v2 6/8] crypto: arm64/aes-ccm - Cache round keys and unroll AES loops
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: ebiggers@kernel.org, herbert@gondor.apana.org.au, Ard Biesheuvel

From: Ard Biesheuvel

The CCM code as originally written attempted to use as few NEON
registers as possible, to avoid having to eagerly preserve/restore the
entire NEON register file at every call to kernel_neon_begin/end. At
that time, this API took a number of NEON registers as a parameter, and
only preserved that many registers.

Today, the NEON register file is restored lazily, and the old API is
long gone.
This means we can use as many NEON registers as we can make meaningful
use of, which means in the AES case that we can keep all round keys in
registers rather than reloading each of them for each AES block
processed.

On Cortex-A53, this results in a speedup of more than 50%. (From 4
cycles per byte to 2.6 cycles per byte)

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-core.S | 95 ++++++++------------
 1 file changed, 38 insertions(+), 57 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index 0132872bd780..0ec59fc4ef3e 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -14,40 +14,46 @@
 	.text
 	.arch	armv8-a+crypto
 
+	.macro	load_round_keys, rk, nr, tmp
+	sub	w\tmp, \nr, #10
+	add	\tmp, \rk, w\tmp, sxtw #4
+	ld1	{v10.4s-v13.4s}, [\rk]
+	ld1	{v14.4s-v17.4s}, [\tmp], #64
+	ld1	{v18.4s-v21.4s}, [\tmp], #64
+	ld1	{v3.4s-v5.4s}, [\tmp]
+	.endm
+
+	.macro	dround, va, vb, vk
+	aese	\va\().16b, \vk\().16b
+	aesmc	\va\().16b, \va\().16b
+	aese	\vb\().16b, \vk\().16b
+	aesmc	\vb\().16b, \vb\().16b
+	.endm
+
+	.macro	aes_encrypt, va, vb, nr
+	tbz	\nr, #2, .L\@
+	dround	\va, \vb, v10
+	dround	\va, \vb, v11
+	tbz	\nr, #1, .L\@
+	dround	\va, \vb, v12
+	dround	\va, \vb, v13
+.L\@:	.irp	v, v14, v15, v16, v17, v18, v19, v20, v21, v3
+	dround	\va, \vb, \v
+	.endr
+	aese	\va\().16b, v4.16b
+	aese	\vb\().16b, v4.16b
+	.endm
+
 	/*
 	 * void ce_aes_ccm_final(u8 mac[], u8 const ctr[], u8 const rk[],
 	 * 			 u32 rounds);
 	 */
 SYM_FUNC_START(ce_aes_ccm_final)
-	ld1	{v3.4s}, [x2], #16		/* load first round key */
 	ld1	{v0.16b}, [x0]			/* load mac */
-	cmp	w3, #12				/* which key size? */
-	sub	w3, w3, #2			/* modified # of rounds */
 	ld1	{v1.16b}, [x1]			/* load 1st ctriv */
-	bmi	0f
-	bne	3f
-	mov	v5.16b, v3.16b
-	b	2f
-0:	mov	v4.16b, v3.16b
-1:	ld1	{v5.4s}, [x2], #16		/* load next round key */
-	aese	v0.16b, v4.16b
-	aesmc	v0.16b, v0.16b
-	aese	v1.16b, v4.16b
-	aesmc	v1.16b, v1.16b
-2:	ld1	{v3.4s}, [x2], #16		/* load next round key */
-	aese	v0.16b, v5.16b
-	aesmc	v0.16b, v0.16b
-	aese	v1.16b, v5.16b
-	aesmc	v1.16b, v1.16b
-3:	ld1	{v4.4s}, [x2], #16		/* load next round key */
-	subs	w3, w3, #3
-	aese	v0.16b, v3.16b
-	aesmc	v0.16b, v0.16b
-	aese	v1.16b, v3.16b
-	aesmc	v1.16b, v1.16b
-	bpl	1b
-	aese	v0.16b, v4.16b
-	aese	v1.16b, v4.16b
+
+	aes_encrypt	v0, v1, w3
+
 	/* final round key cancels out */
 	eor	v0.16b, v0.16b, v1.16b		/* en-/decrypt the mac */
 	st1	{v0.16b}, [x0]			/* store result */
@@ -55,6 +61,8 @@ SYM_FUNC_START(ce_aes_ccm_final)
 SYM_FUNC_END(ce_aes_ccm_final)
 
 	.macro	aes_ccm_do_crypt,enc
+	load_round_keys	x3, w4, x10
+
 	cbz	x2, 5f
 	ldr	x8, [x6, #8]			/* load lower ctr */
 	ld1	{v0.16b}, [x5]			/* load mac */
@@ -64,37 +72,10 @@ CPU_LE(	rev	x8, x8			)	/* keep swabbed ctr in reg */
 	prfm	pldl1strm, [x1]
 	add	x8, x8, #1
 	rev	x9, x8
-	cmp	w4, #12				/* which key size? */
-	sub	w7, w4, #2			/* get modified # of rounds */
 	ins	v1.d[1], x9			/* no carry in lower ctr */
-	ld1	{v3.4s}, [x3]			/* load first round key */
-	add	x10, x3, #16
-	bmi	1f
-	bne	4f
-	mov	v5.16b, v3.16b
-	b	3f
-1:	mov	v4.16b, v3.16b
-	ld1	{v5.4s}, [x10], #16		/* load 2nd round key */
-2:	/* inner loop: 3 rounds, 2x interleaved */
-	aese	v0.16b, v4.16b
-	aesmc	v0.16b, v0.16b
-	aese	v1.16b, v4.16b
-	aesmc	v1.16b, v1.16b
-3:	ld1	{v3.4s}, [x10], #16		/* load next round key */
-	aese	v0.16b, v5.16b
-	aesmc	v0.16b, v0.16b
-	aese	v1.16b, v5.16b
-	aesmc	v1.16b, v1.16b
-4:	ld1	{v4.4s}, [x10], #16		/* load next round key */
-	subs	w7, w7, #3
-	aese	v0.16b, v3.16b
-	aesmc	v0.16b, v0.16b
-	aese	v1.16b, v3.16b
-	aesmc	v1.16b, v1.16b
-	ld1	{v5.4s}, [x10], #16		/* load next round key */
-	bpl	2b
-	aese	v0.16b, v4.16b
-	aese	v1.16b, v4.16b
+
+	aes_encrypt	v0, v1, w4
+
 	subs	w2, w2, #16
 	bmi	6f				/* partial block? */
 	ld1	{v2.16b}, [x1], #16		/* load next input block */

From patchwork Thu Jan 18 17:06:36 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 764506
Date: Thu, 18 Jan 2024 18:06:36 +0100
In-Reply-To: <20240118170628.3049797-10-ardb+git@google.com>
References: <20240118170628.3049797-10-ardb+git@google.com>
X-Mailing-List: linux-crypto@vger.kernel.org
Message-ID: <20240118170628.3049797-17-ardb+git@google.com>
Subject: [PATCH v2 7/8] crypto: arm64/aes-ccm - Merge encrypt and decrypt tail handling
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: ebiggers@kernel.org, herbert@gondor.apana.org.au, Ard Biesheuvel

From: Ard Biesheuvel

The encryption and decryption code paths are mostly identical, except
for a small difference where the plaintext input into the MAC is taken
from either the input or the output block.

We can factor this in quite easily using a vector bit select, and a few
additional XORs, without the need for branches. This way, we can use the
same tail handling logic on the encrypt and decrypt code paths, allowing
further consolidation of the asm helpers in a subsequent patch.

(In the main loop, adding just a handful of ALU instructions results in
a noticeable performance hit [around 5% on Apple M2], so those routines
are kept separate)

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-core.S | 26 ++++++++++----------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index 0ec59fc4ef3e..bf3a888a5615 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -77,7 +77,7 @@ CPU_LE(	rev	x8, x8			)	/* keep swabbed ctr in reg */
 	aes_encrypt	v0, v1, w4
 
 	subs	w2, w2, #16
-	bmi	6f				/* partial block? */
+	bmi	ce_aes_ccm_crypt_tail
 	ld1	{v2.16b}, [x1], #16		/* load next input block */
 	.if	\enc == 1
 	eor	v2.16b, v2.16b, v5.16b		/* final round enc+mac */
@@ -93,8 +93,10 @@ CPU_LE(	rev	x8, x8			)
 	st1	{v0.16b}, [x5]			/* store mac */
 	str	x8, [x6, #8]			/* store lsb end of ctr (BE) */
 5:	ret
+	.endm
 
-6:	eor	v0.16b, v0.16b, v5.16b		/* final round mac */
+SYM_FUNC_START_LOCAL(ce_aes_ccm_crypt_tail)
+	eor	v0.16b, v0.16b, v5.16b		/* final round mac */
 	eor	v1.16b, v1.16b, v5.16b		/* final round enc */
 
 	add	x1, x1, w2, sxtw		/* rewind the input pointer (w2 < 0) */
@@ -108,20 +110,16 @@ CPU_LE(	rev	x8, x8			)
 
 	ld1	{v2.16b}, [x1]			/* load a full block of input */
 	tbl	v1.16b, {v1.16b}, v7.16b	/* move keystream to end of register */
-	.if	\enc == 1
-	tbl	v7.16b, {v2.16b}, v9.16b	/* copy plaintext to start of v7 */
-	eor	v2.16b, v2.16b, v1.16b		/* encrypt partial input block */
-	.else
-	eor	v2.16b, v2.16b, v1.16b		/* decrypt partial input block */
-	tbl	v7.16b, {v2.16b}, v9.16b	/* copy plaintext to start of v7 */
-	.endif
-	eor	v0.16b, v0.16b, v7.16b		/* fold plaintext into mac */
-	tbx	v2.16b, {v6.16b}, v8.16b	/* insert output from previous iteration */
+	eor	v7.16b, v2.16b, v1.16b		/* encrypt partial input block */
+	bif	v2.16b, v7.16b, v22.16b		/* select plaintext */
+	tbx	v7.16b, {v6.16b}, v8.16b	/* insert output from previous iteration */
+	tbl	v2.16b, {v2.16b}, v9.16b	/* copy plaintext to start of v2 */
+	eor	v0.16b, v0.16b, v2.16b		/* fold plaintext into mac */
 
 	st1	{v0.16b}, [x5]			/* store mac */
-	st1	{v2.16b}, [x0]			/* store output block */
+	st1	{v7.16b}, [x0]			/* store output block */
 	ret
-	.endm
+SYM_FUNC_END(ce_aes_ccm_crypt_tail)
 
 	/*
 	 * void ce_aes_ccm_encrypt(u8 out[], u8 const in[], u32 cbytes,
@@ -132,10 +130,12 @@ CPU_LE(	rev	x8, x8			)
 	 * 			   u8 ctr[]);
 	 */
 SYM_FUNC_START(ce_aes_ccm_encrypt)
+	movi	v22.16b, #255
 	aes_ccm_do_crypt	1
 SYM_FUNC_END(ce_aes_ccm_encrypt)
 
 SYM_FUNC_START(ce_aes_ccm_decrypt)
+	movi	v22.16b, #0
 	aes_ccm_do_crypt	0
 SYM_FUNC_END(ce_aes_ccm_decrypt)
 

From patchwork Thu Jan 18 17:06:37 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 763713
Date: Thu, 18 Jan 2024 18:06:37 +0100
In-Reply-To: <20240118170628.3049797-10-ardb+git@google.com>
References: <20240118170628.3049797-10-ardb+git@google.com>
X-Mailing-List: linux-crypto@vger.kernel.org
Message-ID: <20240118170628.3049797-18-ardb+git@google.com>
Subject: [PATCH v2 8/8] crypto: arm64/aes-ccm - Merge finalization into en/decrypt asm helpers
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: ebiggers@kernel.org, herbert@gondor.apana.org.au, Ard Biesheuvel

From: Ard Biesheuvel

The C glue code already infers whether or not the current iteration is
the final one, by comparing walk.nbytes with walk.total. This means we
can easily inform the asm helpers of this as well, by conditionally
passing a pointer to the original IV, which is used in the finalization
of the MAC. This removes the need for a separate call into the asm code
to perform the finalization.
Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-core.S | 40 +++++++++-----------
 arch/arm64/crypto/aes-ce-ccm-glue.c | 27 ++++++-------
 2 files changed, 29 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-core.S b/arch/arm64/crypto/aes-ce-ccm-core.S
index bf3a888a5615..f2624238fd95 100644
--- a/arch/arm64/crypto/aes-ce-ccm-core.S
+++ b/arch/arm64/crypto/aes-ce-ccm-core.S
@@ -44,28 +44,12 @@
 	aese	\vb\().16b, v4.16b
 	.endm

-	/*
-	 * void ce_aes_ccm_final(u8 mac[], u8 const ctr[], u8 const rk[],
-	 * 			 u32 rounds);
-	 */
-SYM_FUNC_START(ce_aes_ccm_final)
-	ld1	{v0.16b}, [x0]			/* load mac */
-	ld1	{v1.16b}, [x1]			/* load 1st ctriv */
-
-	aes_encrypt	v0, v1, w3
-
-	/* final round key cancels out */
-	eor	v0.16b, v0.16b, v1.16b		/* en-/decrypt the mac */
-	st1	{v0.16b}, [x0]			/* store result */
-	ret
-SYM_FUNC_END(ce_aes_ccm_final)
-
 	.macro	aes_ccm_do_crypt,enc
 	load_round_keys	x3, w4, x10

-	cbz	x2, 5f
-	ldr	x8, [x6, #8]			/* load lower ctr */
 	ld1	{v0.16b}, [x5]			/* load mac */
+	cbz	x2, ce_aes_ccm_final
+	ldr	x8, [x6, #8]			/* load lower ctr */
 CPU_LE(	rev	x8, x8			)	/* keep swabbed ctr in reg */
 0:	/* outer loop */
 	ld1	{v1.8b}, [x6]			/* load upper ctr */
@@ -90,9 +74,10 @@ CPU_LE(	rev	x8, x8			)	/* keep swabbed ctr in reg */
 	st1	{v6.16b}, [x0], #16		/* write output block */
 	bne	0b
 CPU_LE(	rev	x8, x8			)
-	st1	{v0.16b}, [x5]			/* store mac */
 	str	x8, [x6, #8]			/* store lsb end of ctr (BE) */
-5:	ret
+	cbnz	x7, ce_aes_ccm_final
+	st1	{v0.16b}, [x5]			/* store mac */
+	ret
 	.endm

 SYM_FUNC_START_LOCAL(ce_aes_ccm_crypt_tail)
@@ -116,18 +101,27 @@ SYM_FUNC_START_LOCAL(ce_aes_ccm_crypt_tail)
 	tbl	v2.16b, {v2.16b}, v9.16b	/* copy plaintext to start of v2 */
 	eor	v0.16b, v0.16b, v2.16b		/* fold plaintext into mac */

-	st1	{v0.16b}, [x5]			/* store mac */
 	st1	{v7.16b}, [x0]			/* store output block */
+	cbz	x7, 0f
+
+SYM_INNER_LABEL(ce_aes_ccm_final, SYM_L_LOCAL)
+	ld1	{v1.16b}, [x7]			/* load 1st ctriv */
+
+	aes_encrypt	v0, v1, w4
+
+	/* final round key cancels out */
+	eor	v0.16b, v0.16b, v1.16b		/* en-/decrypt the mac */
+0:	st1	{v0.16b}, [x5]			/* store result */
 	ret
 SYM_FUNC_END(ce_aes_ccm_crypt_tail)

 	/*
 	 * void ce_aes_ccm_encrypt(u8 out[], u8 const in[], u32 cbytes,
 	 * 			   u8 const rk[], u32 rounds, u8 mac[],
-	 * 			   u8 ctr[]);
+	 * 			   u8 ctr[], u8 const final_iv[]);
 	 * void ce_aes_ccm_decrypt(u8 out[], u8 const in[], u32 cbytes,
 	 * 			   u8 const rk[], u32 rounds, u8 mac[],
-	 * 			   u8 ctr[]);
+	 * 			   u8 ctr[], u8 const final_iv[]);
 	 */
 SYM_FUNC_START(ce_aes_ccm_encrypt)
 	movi	v22.16b, #255
diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index ed3d79e05112..ce9b28e3c7d6 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -38,14 +38,11 @@ asmlinkage u32 ce_aes_mac_update(u8 const in[], u32 const rk[], int rounds,

 asmlinkage void ce_aes_ccm_encrypt(u8 out[], u8 const in[], u32 cbytes,
 				   u32 const rk[], u32 rounds, u8 mac[],
-				   u8 ctr[]);
+				   u8 ctr[], u8 const final_iv[]);

 asmlinkage void ce_aes_ccm_decrypt(u8 out[], u8 const in[], u32 cbytes,
 				   u32 const rk[], u32 rounds, u8 mac[],
-				   u8 ctr[]);
-
-asmlinkage void ce_aes_ccm_final(u8 mac[], u8 const ctr[], u32 const rk[],
-				 u32 rounds);
+				   u8 ctr[], u8 const final_iv[]);

 static int ccm_setkey(struct crypto_aead *tfm, const u8 *in_key,
 		      unsigned int key_len)
@@ -210,9 +207,12 @@ static int ccm_encrypt(struct aead_request *req)
 		const u8 *src = walk.src.virt.addr;
 		u8 *dst = walk.dst.virt.addr;
 		u8 buf[AES_BLOCK_SIZE];
+		u8 *final_iv = NULL;

-		if (walk.nbytes == walk.total)
+		if (walk.nbytes == walk.total) {
 			tail = 0;
+			final_iv = orig_iv;
+		}

 		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
 			src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
@@ -220,14 +220,11 @@ static int ccm_encrypt(struct aead_request *req)

 		ce_aes_ccm_encrypt(dst, src, walk.nbytes - tail,
 				   ctx->key_enc, num_rounds(ctx),
-				   mac, walk.iv);
+				   mac, walk.iv, final_iv);

 		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
 			memcpy(walk.dst.virt.addr, dst, walk.nbytes);

-		if (walk.nbytes == walk.total)
-			ce_aes_ccm_final(mac, orig_iv, ctx->key_enc, num_rounds(ctx));
-
 		if (walk.nbytes) {
 			err = skcipher_walk_done(&walk, tail);
 		}
@@ -277,9 +274,12 @@ static int ccm_decrypt(struct aead_request *req)
 		const u8 *src = walk.src.virt.addr;
 		u8 *dst = walk.dst.virt.addr;
 		u8 buf[AES_BLOCK_SIZE];
+		u8 *final_iv = NULL;

-		if (walk.nbytes == walk.total)
+		if (walk.nbytes == walk.total) {
 			tail = 0;
+			final_iv = orig_iv;
+		}

 		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
 			src = dst = memcpy(&buf[sizeof(buf) - walk.nbytes],
@@ -287,14 +287,11 @@ static int ccm_decrypt(struct aead_request *req)

 		ce_aes_ccm_decrypt(dst, src, walk.nbytes - tail,
 				   ctx->key_enc, num_rounds(ctx),
-				   mac, walk.iv);
+				   mac, walk.iv, final_iv);

 		if (unlikely(walk.nbytes < AES_BLOCK_SIZE))
 			memcpy(walk.dst.virt.addr, dst, walk.nbytes);

-		if (walk.nbytes == walk.total)
-			ce_aes_ccm_final(mac, orig_iv, ctx->key_enc, num_rounds(ctx));
-
 		if (walk.nbytes) {
 			err = skcipher_walk_done(&walk, tail);
 		}
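
For readers who want to see the calling pattern from the commit message in isolation, here is a minimal, self-contained C sketch. It is not the kernel code: toy_crypt() and toy_walk() are hypothetical stand-ins invented for this illustration, and they use plain XOR instead of AES and CBC-MAC. Only two things are meant to mirror the patch: the "nbytes == total" test that detects the final iteration, and the final_iv pointer that is NULL on every call except the last, so that the helper can finalize the MAC itself instead of requiring a separate call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the asm helper (NOT ce_aes_ccm_encrypt()).
 * Folds a chunk into a 16-byte toy MAC and, if final_iv is non-NULL,
 * also applies a toy finalization step in the same call.
 */
static void toy_crypt(uint8_t mac[16], const uint8_t *in, size_t len,
                      const uint8_t *final_iv)
{
    for (size_t i = 0; i < len; i++)
        mac[i % 16] ^= in[i];           /* toy MAC update, not CBC-MAC */

    if (final_iv)                       /* only set on the last call */
        for (size_t i = 0; i < 16; i++)
            mac[i] ^= final_iv[i];      /* toy finalization, not AES */
}

/*
 * Hypothetical stand-in for the glue-code loop. Processes `total` bytes
 * in pieces of at most `chunk` bytes and returns the number of helper
 * calls made. Assumes chunk is a non-zero multiple of 16.
 */
static unsigned toy_walk(uint8_t mac[16], const uint8_t *in, size_t total,
                         size_t chunk, const uint8_t orig_iv[16])
{
    unsigned calls = 0;

    assert(chunk > 0 && chunk % 16 == 0);
    do {
        size_t nbytes = total < chunk ? total : chunk;
        /* final iteration: this piece covers everything that is left */
        const uint8_t *final_iv = (nbytes == total) ? orig_iv : NULL;

        toy_crypt(mac, in, nbytes, final_iv);
        calls++;
        in += nbytes;
        total -= nbytes;
    } while (total);

    return calls;
}
```

Because the loop runs at least once, an empty input still results in one helper call with a non-NULL final_iv, which corresponds to the "cbz x2, ce_aes_ccm_final" path in the asm above. The toy MAC comes out the same however the input is split, as long as the finalization is applied exactly once, on the last piece.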