From patchwork Mon Jun 24 07:38:15 2019
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 167553
From: Ard Biesheuvel
To: linux-crypto@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel, Eric Biggers,
 Ondrej Mosnacek, Herbert Xu, Steve Capper
Subject: [PATCH 3/6] crypto: aegis - avoid prerotated AES tables
Date: Mon, 24 Jun 2019 09:38:15 +0200
Message-Id: <20190624073818.29296-4-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190624073818.29296-1-ard.biesheuvel@linaro.org>
References: <20190624073818.29296-1-ard.biesheuvel@linaro.org>
X-Mailing-List: linux-crypto@vger.kernel.org

The generic AES code provides four sets of lookup tables, where each set
consists of four tables containing the same 32-bit values, but rotated
by 0, 8, 16 and 24 bits, respectively. This makes sense for CISC
architectures such as x86 which support memory operands, but for other
architectures, the rotates are quite cheap, and using all four tables
needlessly thrashes the D-cache, and actually hurts rather than helps
performance.

Since x86 already has its own implementation of AEGIS based on AES-NI
instructions, let's tweak the generic implementation towards other
architectures, and avoid the prerotated tables, and perform the
rotations inline. On ARM Cortex-A53, this results in a ~8% speedup.

Signed-off-by: Ard Biesheuvel
---
 crypto/aegis.h | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

-- 
2.20.1

Acked-by: Ondrej Mosnacek

diff --git a/crypto/aegis.h b/crypto/aegis.h
index 41a3090cda8e..3308066ddde0 100644
--- a/crypto/aegis.h
+++ b/crypto/aegis.h
@@ -10,6 +10,7 @@
 #define _CRYPTO_AEGIS_H
 
 #include
+#include
 #include
 
 #define AEGIS_BLOCK_SIZE 16
@@ -53,16 +54,13 @@ static void crypto_aegis_aesenc(union aegis_block *dst,
 				const union aegis_block *key)
 {
 	const u8 *s = src->bytes;
-	const u32 *t0 = crypto_ft_tab[0];
-	const u32 *t1 = crypto_ft_tab[1];
-	const u32 *t2 = crypto_ft_tab[2];
-	const u32 *t3 = crypto_ft_tab[3];
+	const u32 *t = crypto_ft_tab[0];
 	u32 d0, d1, d2, d3;
 
-	d0 = t0[s[ 0]] ^ t1[s[ 5]] ^ t2[s[10]] ^ t3[s[15]];
-	d1 = t0[s[ 4]] ^ t1[s[ 9]] ^ t2[s[14]] ^ t3[s[ 3]];
-	d2 = t0[s[ 8]] ^ t1[s[13]] ^ t2[s[ 2]] ^ t3[s[ 7]];
-	d3 = t0[s[12]] ^ t1[s[ 1]] ^ t2[s[ 6]] ^ t3[s[11]];
+	d0 = t[s[ 0]] ^ rol32(t[s[ 5]], 8) ^ rol32(t[s[10]], 16) ^ rol32(t[s[15]], 24);
+	d1 = t[s[ 4]] ^ rol32(t[s[ 9]], 8) ^ rol32(t[s[14]], 16) ^ rol32(t[s[ 3]], 24);
+	d2 = t[s[ 8]] ^ rol32(t[s[13]], 8) ^ rol32(t[s[ 2]], 16) ^ rol32(t[s[ 7]], 24);
+	d3 = t[s[12]] ^ rol32(t[s[ 1]], 8) ^ rol32(t[s[ 6]], 16) ^ rol32(t[s[11]], 24);
 
 	dst->words32[0] = cpu_to_le32(d0) ^ key->words32[0];
 	dst->words32[1] = cpu_to_le32(d1) ^ key->words32[1];
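
For anyone who wants to convince themselves of the equivalence outside the
kernel, the following is a minimal userspace sketch, not the kernel code. It
rebuilds the four forward tables from the AES S-box and compares the old
four-table lookup against the single-table form with inline rotates. The names
ft, build_tables, col_four, col_one and forms_agree are hypothetical, rol32()
here is a local stand-in for the kernel helper, and the byte layout of the
tables is an assumption chosen so that the first entry is 0xa56363c6.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's rol32(): rotate a 32-bit word left. */
static uint32_t rol32(uint32_t w, unsigned int shift)
{
	return (w << (shift & 31)) | (w >> ((-shift) & 31));
}

/* Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
	uint8_t r = 0;

	while (b) {
		if (b & 1)
			r ^= a;
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
		b >>= 1;
	}
	return r;
}

static uint8_t rotl8(uint8_t v, unsigned int n)
{
	return (uint8_t)((v << n) | (v >> (8 - n)));
}

/* AES S-box: multiplicative inverse (0 maps to 0), then affine transform. */
static uint8_t aes_sbox(uint8_t x)
{
	uint8_t inv = 0;
	unsigned int i;

	for (i = 1; i < 256; i++) {
		if (gf_mul(x, (uint8_t)i) == 1) {
			inv = (uint8_t)i;
			break;
		}
	}
	return (uint8_t)(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^
			 rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
}

/* Pack four bytes into a word, b0 in the least significant position. */
static uint32_t pack(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
	return (uint32_t)b0 | ((uint32_t)b1 << 8) |
	       ((uint32_t)b2 << 16) | ((uint32_t)b3 << 24);
}

/* Hypothetical stand-in for crypto_ft_tab: four prerotated tables. */
uint32_t ft[4][256];

/* Each table is filled from its own byte layout, not by rotating ft[0]. */
void build_tables(void)
{
	unsigned int x;

	for (x = 0; x < 256; x++) {
		uint8_t s = aes_sbox((uint8_t)x);
		uint8_t s2 = gf_mul(s, 2), s3 = gf_mul(s, 3);

		ft[0][x] = pack(s2, s, s, s3);
		ft[1][x] = pack(s3, s2, s, s);
		ft[2][x] = pack(s, s3, s2, s);
		ft[3][x] = pack(s, s, s3, s2);
	}
	/* S(0) = 0x63, so the first entry must be 0xa56363c6. */
	assert(ft[0][0] == 0xa56363c6u);
}

/* One output word using all four tables (the form the patch removes). */
static uint32_t col_four(const uint8_t *s, const int *i)
{
	return ft[0][s[i[0]]] ^ ft[1][s[i[1]]] ^ ft[2][s[i[2]]] ^ ft[3][s[i[3]]];
}

/* The same word using one table and inline rotates (the form it adds). */
static uint32_t col_one(const uint8_t *s, const int *i)
{
	const uint32_t *t = ft[0];

	return t[s[i[0]]] ^ rol32(t[s[i[1]]], 8) ^
	       rol32(t[s[i[2]]], 16) ^ rol32(t[s[i[3]]], 24);
}

/* Returns 1 if both forms agree on every word of an arbitrary sample block. */
int forms_agree(void)
{
	/* Byte indices per output word, copied from the diff above. */
	static const int idx[4][4] = {
		{ 0, 5, 10, 15 }, { 4, 9, 14, 3 }, { 8, 13, 2, 7 }, { 12, 1, 6, 11 },
	};
	uint8_t block[16];
	int i;

	build_tables();
	for (i = 0; i < 16; i++)
		block[i] = (uint8_t)(17 * i + 3);
	for (i = 0; i < 4; i++)
		if (col_four(block, idx[i]) != col_one(block, idx[i]))
			return 0;
	return 1;
}
```

Calling forms_agree() from a small main() should return 1. The sketch only
checks that the two forms compute the same words; it says nothing about the
D-cache behaviour or the ~8% figure, which depend on the target CPU.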