From patchwork Sat Mar 10 15:21:52 2018
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 131297
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: herbert@gondor.apana.org.au, linux-arm-kernel@lists.infradead.org,
 Ard Biesheuvel, Dave Martin, Russell King - ARM Linux,
 Sebastian Andrzej Siewior, Mark Rutland, linux-rt-users@vger.kernel.org,
 Peter Zijlstra, Catalin Marinas, Will Deacon, Steven Rostedt,
 Thomas Gleixner
Subject: [PATCH v5 07/23] crypto: arm64/aes-blk - add 4 way interleave to CBC encrypt path
Date: Sat, 10 Mar 2018 15:21:52 +0000
Message-Id: <20180310152208.10369-8-ard.biesheuvel@linaro.org>
In-Reply-To: <20180310152208.10369-1-ard.biesheuvel@linaro.org>
References: <20180310152208.10369-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

CBC encryption is strictly sequential, and so the current AES code
simply processes the input one block at a time. However, we are
about to add yield support, which adds a bit of overhead, and which
we prefer to align with other modes in terms of granularity (i.e.,
it is better to have all routines yield every 64 bytes and not have
an exception for CBC encrypt which yields every 16 bytes).

So unroll the loop by 4. We still cannot perform the AES algorithm in
parallel, but we can at least merge the loads and stores.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-modes.S | 31 ++++++++++++++++----
 1 file changed, 25 insertions(+), 6 deletions(-)

--
2.15.1

diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index 27a235b2ddee..e86535a1329d 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -94,17 +94,36 @@ AES_ENDPROC(aes_ecb_decrypt)
 	 */
 
 AES_ENTRY(aes_cbc_encrypt)
-	ld1		{v0.16b}, [x5]			/* get iv */
+	ld1		{v4.16b}, [x5]			/* get iv */
 	enc_prepare	w3, x2, x6
 
-.Lcbcencloop:
-	ld1		{v1.16b}, [x1], #16		/* get next pt block */
-	eor		v0.16b, v0.16b, v1.16b		/* ..and xor with iv */
+.Lcbcencloop4x:
+	subs		w4, w4, #4
+	bmi		.Lcbcenc1x
+	ld1		{v0.16b-v3.16b}, [x1], #64	/* get 4 pt blocks */
+	eor		v0.16b, v0.16b, v4.16b		/* ..and xor with iv */
 	encrypt_block	v0, w3, x2, x6, w7
-	st1		{v0.16b}, [x0], #16
+	eor		v1.16b, v1.16b, v0.16b
+	encrypt_block	v1, w3, x2, x6, w7
+	eor		v2.16b, v2.16b, v1.16b
+	encrypt_block	v2, w3, x2, x6, w7
+	eor		v3.16b, v3.16b, v2.16b
+	encrypt_block	v3, w3, x2, x6, w7
+	st1		{v0.16b-v3.16b}, [x0], #64
+	mov		v4.16b, v3.16b
+	b		.Lcbcencloop4x
+.Lcbcenc1x:
+	adds		w4, w4, #4
+	beq		.Lcbcencout
+.Lcbcencloop:
+	ld1		{v0.16b}, [x1], #16		/* get next pt block */
+	eor		v4.16b, v4.16b, v0.16b		/* ..and xor with iv */
+	encrypt_block	v4, w3, x2, x6, w7
+	st1		{v4.16b}, [x0], #16
 	subs		w4, w4, #1
 	bne		.Lcbcencloop
-	st1		{v0.16b}, [x5]			/* return iv */
+.Lcbcencout:
+	st1		{v4.16b}, [x5]			/* return iv */
 	ret
 AES_ENDPROC(aes_cbc_encrypt)
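
For readers less familiar with A64 assembly, the control flow of the patched
routine can be sketched in C. This is only an illustration under stated
assumptions, not the kernel code: toy_encrypt_block() is a hypothetical
stand-in for the encrypt_block macro (it is NOT AES), and all other names
(xor_block, cbc_encrypt_1x, cbc_encrypt_4x, cbc_paths_agree) are invented
here. It shows that the 4x path keeps the chaining strictly sequential and
only merges the loads and stores, with a one-block tail loop for the
remaining 0 to 3 blocks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLK 16	/* block size in bytes */

/* Hypothetical stand-in for encrypt_block: NOT AES, just a deterministic
 * byte mixer so that the sketch is self-contained. */
static void toy_encrypt_block(uint8_t b[BLK], const uint8_t key[BLK])
{
	for (int r = 0; r < 4; r++)
		for (int i = 0; i < BLK; i++)
			b[i] = (uint8_t)((b[i] ^ key[(i + r) % BLK]) * 7u +
					 b[(i + 1) % BLK]);
}

static void xor_block(uint8_t *dst, const uint8_t *src)
{
	for (int i = 0; i < BLK; i++)
		dst[i] ^= src[i];
}

/* One block per iteration; iv plays the role of v4 in the tail loop. */
static void cbc_encrypt_1x(uint8_t *out, const uint8_t *in, int nblocks,
			   const uint8_t key[BLK], uint8_t iv[BLK])
{
	while (nblocks-- > 0) {
		xor_block(iv, in);
		toy_encrypt_block(iv, key);
		memcpy(out, iv, BLK);
		in += BLK;
		out += BLK;
	}
}

/* Four blocks per iteration: one merged 64-byte load and store, while
 * each block is still chained to the previous ciphertext block. The
 * assembly unrolls the inner loop by hand; a for loop is used here. */
static void cbc_encrypt_4x(uint8_t *out, const uint8_t *in, int nblocks,
			   const uint8_t key[BLK], uint8_t iv[BLK])
{
	uint8_t v[4][BLK];

	while (nblocks >= 4) {
		memcpy(v, in, 4 * BLK);		/* get 4 pt blocks */
		xor_block(v[0], iv);		/* first block uses the iv */
		toy_encrypt_block(v[0], key);
		for (int i = 1; i < 4; i++) {
			xor_block(v[i], v[i - 1]);
			toy_encrypt_block(v[i], key);
		}
		memcpy(out, v, 4 * BLK);	/* merged store */
		memcpy(iv, v[3], BLK);		/* last ct block is next iv */
		in += 4 * BLK;
		out += 4 * BLK;
		nblocks -= 4;
	}
	cbc_encrypt_1x(out, in, nblocks, key, iv);	/* 0..3 blocks left */
}

/* Returns 1 if both paths produce the same ciphertext and final iv. */
int cbc_paths_agree(int nblocks)
{
	uint8_t in[16 * BLK], a[16 * BLK], b[16 * BLK];
	uint8_t key[BLK], iv_a[BLK], iv_b[BLK];

	assert(nblocks >= 0 && nblocks <= 16);
	for (int i = 0; i < (int)sizeof(in); i++)
		in[i] = (uint8_t)(i * 31 + 5);
	for (int i = 0; i < BLK; i++) {
		key[i] = (uint8_t)(i + 1);
		iv_a[i] = iv_b[i] = (uint8_t)(0xA0 + i);
	}
	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));
	cbc_encrypt_1x(a, in, nblocks, key, iv_a);
	cbc_encrypt_4x(b, in, nblocks, key, iv_b);
	return memcmp(a, b, sizeof(a)) == 0 && memcmp(iv_a, iv_b, BLK) == 0;
}
```

Calling cbc_paths_agree() with block counts such as 0, 1, 4, and 7 exercises
both the 4x loop and the tail loop; the two paths should agree for every
count, since the unrolling changes only how data is moved, not the chaining.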