From patchwork Wed Jul 3 08:55:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 168394 Delivered-To: patch@linaro.org Received: by 2002:a92:4782:0:0:0:0:0 with SMTP id e2csp597331ilk; Wed, 3 Jul 2019 01:55:53 -0700 (PDT) X-Google-Smtp-Source: APXvYqy7qbW7XvLBmIyAfelWVjxnVJhiXwipAVFEXjeD265/zKzc1p+SFgf/dsHlSjOuLa0lbDty X-Received: by 2002:a17:90a:37ac:: with SMTP id v41mr10923879pjb.6.1562144153872; Wed, 03 Jul 2019 01:55:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1562144153; cv=none; d=google.com; s=arc-20160816; b=KhzKsMjoAh0+r0PHC+rVOGxt2CVgaxB7lcRBfB1jGyIOPKYGfIOwBAqlDUZFtyjRkR d7+1U68v3IhRY9OohP7B+8tzEb/yp+guFPOcmXPiVGFf51+cpxG3Q3AClgf2Ak8s1w+w 4NHJYrmhrB+ydLU3DOx8MWkkcXQwWtxYpuiuMaEPAUqd0S+E9ZXzuepUXds+CTeIgAfI YM5aXlefjc2xWTjy0l0m8z/yM5sGIxjahSpyAlMT6xZezzSoaoPglkWSKEY7jEz5dazH E/mgIS/8TwpEK8rMKSli9BygD//ZjdG6nji8xYSqw/7O6RNkQbQ2+Gbbh3wDXcpPX3XU ZQFQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:dkim-signature; bh=Dp9qP6sTONWkO4Yd/NU5NtSvJh7LYPkiXZbr4kC7GvU=; b=dpexEYnwslOTP0gRDQlktF3q3tBKkDGSpnDxRT/TVjzMEC+cCtRvcZC8K/ONpyXgHQ Pmw8eAe0vB0Xbe0kd+osN/dneMVw3VpBAOTagn45VUzWjVSVKZn9pQAc9A5kwtj/u6PL buQKKXm5aP1CTLgwEoduyKGX/sQbNz1BvAvWDY7xTkNCPeDyRsA/v3tcSE4KBdWjJA4r MZe2rCTnbCPuuD3+He6YO6JNq2bZkfCRvvb+8zJwyD64nkv/ZNuUSgHUQERy3aVJcAwt cQ+y+CuClEf5SZMBw4TeEFb4EPXlU93N0u/62GJfNvvJoQfUTNrv0El+ezdUyAgPDT8T FRMg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="dwq4x3/a"; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k11si1643982pll.377.2019.07.03.01.55.53; Wed, 03 Jul 2019 01:55:53 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b="dwq4x3/a"; spf=pass (google.com: best guess record for domain of linux-crypto-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727114AbfGCIzx (ORCPT + 3 others); Wed, 3 Jul 2019 04:55:53 -0400 Received: from mail-lf1-f67.google.com ([209.85.167.67]:32793 "EHLO mail-lf1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727004AbfGCIzx (ORCPT ); Wed, 3 Jul 2019 04:55:53 -0400 Received: by mail-lf1-f67.google.com with SMTP id y17so1214086lfe.0 for ; Wed, 03 Jul 2019 01:55:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Dp9qP6sTONWkO4Yd/NU5NtSvJh7LYPkiXZbr4kC7GvU=; b=dwq4x3/aV8/sFAShqdmmhGKwDR9NyeLVyz3rorwe9tC0cqmVbXv2cJytipQYf1CsaO B1bua8sN2Y7ZArkSHqYv1g2x+ds+JuXbJNuswlsjU5Y4aBN6MzCXid9ufYTRJg6AYTgg y0HIbQkgcsv/GoOkT9DQmqGb+U96wB2SLTWbWpCYlOihi8AQkxRDQKXFwPHPhO55zjV4 q6IWxsr7Mxc6rwsmzXwvP00jfz5aaRW+7xopDs5x75REJz9ukxRxvXh6WXbcshX+O1ux 59nTK1WJ7D6zm+jVMQPmY5wmMbU2uTTqK806IWg7wZ1FSY07dKL2nBN4x5bnUbobE4Gb Bd1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Dp9qP6sTONWkO4Yd/NU5NtSvJh7LYPkiXZbr4kC7GvU=; b=k9vpgBT87O1OkCGBFpkRScK63gq50tkLDgeGCd6kgItUT58gj6wEcPdnsfJdkaIq9Y DXBYeDjZllfs1FXEBrx5dJkpizjuyFpAllE5Lu3bpZ4IyzM/HXgANGE2S4GswvUsKY/a vIDBcpYcrkN/L5Ave8lUyfcPO45sWZFpJnySuMAHVKNn+n/mnMqk/k/PbDns5ne7Xj6w KO4KnlOdt6j9cbq6cFWnK9Yv7wv7kAKX+hewM4x2AIp3aRLptrRj2TUNm+UtlJMvpVnW vyObnRSNBT49Ml0t/vJb79+ON0a9IuQmCDyQKdz7h9a4DXDYClOY/eJRq/fzMnczTOYA q1Pg== X-Gm-Message-State: APjAAAXI65tCeStKPFQQQux2FO1aPQabSKYeQATapPjaVK8UF4DnAGt1 uFKsAxQWJCCAK2eGRhw+tlnpbkcZV2DjxhVL X-Received: by 2002:a19:ca4b:: with SMTP id h11mr3586816lfj.162.1562144150753; Wed, 03 Jul 2019 01:55:50 -0700 (PDT) Received: from e111045-lin.arm.com (89-212-78-239.static.t-2.net. [89.212.78.239]) by smtp.gmail.com with ESMTPSA id h4sm354529ljj.31.2019.07.03.01.55.49 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Wed, 03 Jul 2019 01:55:49 -0700 (PDT) From: Ard Biesheuvel To: linux-crypto@vger.kernel.org Cc: Ard Biesheuvel , Eric Biggers , Ondrej Mosnacek , Herbert Xu , Steve Capper , Milan Broz Subject: [PATCH v4 4/7] crypto: aegis - avoid prerotated AES tables Date: Wed, 3 Jul 2019 10:55:09 +0200 Message-Id: <20190703085512.13915-5-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190703085512.13915-1-ard.biesheuvel@linaro.org> References: <20190703085512.13915-1-ard.biesheuvel@linaro.org> Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org The generic AES code provides four sets of lookup tables, where each set consists of four tables containing the same 32-bit values, but rotated by 0, 8, 16 and 24 bits, respectively. This makes sense for CISC architectures such as x86 which support memory operands, but for other architectures, the rotates are quite cheap, and using all four tables needlessly thrashes the D-cache, and actually hurts rather than helps performance. Since x86 already has its own implementation of AEGIS based on AES-NI instructions, let's tweak the generic implementation towards other architectures, and avoid the prerotated tables, and perform the rotations inline. On ARM Cortex-A53, this results in a ~8% speedup. Acked-by: Ondrej Mosnacek Signed-off-by: Ard Biesheuvel --- crypto/aegis.h | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) -- 2.17.1 diff --git a/crypto/aegis.h b/crypto/aegis.h index 41a3090cda8e..3308066ddde0 100644 --- a/crypto/aegis.h +++ b/crypto/aegis.h @@ -10,6 +10,7 @@ #define _CRYPTO_AEGIS_H #include +#include #include #define AEGIS_BLOCK_SIZE 16 @@ -53,16 +54,13 @@ static void crypto_aegis_aesenc(union aegis_block *dst, const union aegis_block *key) { const u8 *s = src->bytes; - const u32 *t0 = crypto_ft_tab[0]; - const u32 *t1 = crypto_ft_tab[1]; - const u32 *t2 = crypto_ft_tab[2]; - const u32 *t3 = crypto_ft_tab[3]; + const u32 *t = crypto_ft_tab[0]; u32 d0, d1, d2, d3; - d0 = t0[s[ 0]] ^ t1[s[ 5]] ^ t2[s[10]] ^ t3[s[15]]; - d1 = t0[s[ 4]] ^ t1[s[ 9]] ^ t2[s[14]] ^ t3[s[ 3]]; - d2 = t0[s[ 8]] ^ t1[s[13]] ^ t2[s[ 2]] ^ t3[s[ 7]]; - d3 = t0[s[12]] ^ t1[s[ 1]] ^ t2[s[ 6]] ^ t3[s[11]]; + d0 = t[s[ 0]] ^ rol32(t[s[ 5]], 8) ^ rol32(t[s[10]], 16) ^ rol32(t[s[15]], 24); + d1 = t[s[ 4]] ^ rol32(t[s[ 9]], 8) ^ rol32(t[s[14]], 16) ^ rol32(t[s[ 3]], 24); + d2 = t[s[ 8]] ^ rol32(t[s[13]], 8) ^ rol32(t[s[ 2]], 16) ^ rol32(t[s[ 7]], 24); + d3 = t[s[12]] ^ rol32(t[s[ 1]], 8) ^ rol32(t[s[ 6]], 16) ^ rol32(t[s[11]], 24); dst->words32[0] = cpu_to_le32(d0) ^ key->words32[0]; dst->words32[1] = cpu_to_le32(d1) ^ key->words32[1];