From patchwork Mon Jun 16 10:02:16 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 31925 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-ie0-f199.google.com (mail-ie0-f199.google.com [209.85.223.199]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 3186720E7A for ; Mon, 16 Jun 2014 10:10:19 +0000 (UTC) Received: by mail-ie0-f199.google.com with SMTP id rd18sf31636379iec.2 for ; Mon, 16 Jun 2014 03:10:18 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:from:to:subject:date:message-id :in-reply-to:references:cc:precedence:list-id:list-unsubscribe :list-archive:list-post:list-help:list-subscribe:mime-version:sender :errors-to:x-original-sender:x-original-authentication-results :mailing-list:content-type:content-transfer-encoding; bh=tcpsp8TrCRqlpD+e8SnUtH5chP7SyxXmJ0PhDmCiBdo=; b=EuMTxM2eYLDHwSGNCKOYbjWSQucbXy9oWdSHN9mkDCiljsGAfEOwqqibjFoF0Kt2FY kkmBXJRECycL1fHRR7hhJDPwnthEEa5SvD1ySPjK96HgTpULMkLm2YZAsvWqYcOrV8dv 06lR0J2F5vC1ukmv83tp1XX4eec5Pd6jQhLWfN1mheOPD82hisHh7Xr7KQeQsfaNbL4e Zqi8QWJ5OriKy5uNdVbIFQmpKSBPEHMPGGR1ODkGzmrfhk+sejtnZxQz/Hfj4mQ+fXzk bSqxgOIkEOJ1Iedkv64BMuxnfW0EZs9ijHQPDSWJsNTX34BhyvCVVWAeXNOWRY1H/+9j Wu3Q== X-Gm-Message-State: ALoCoQlHQ8gS0fb0pzUPq4dOYlkhhMhZmo+rpQwpGUCrdyCYyuFfZ9EwVwylIX9CjpPQE0NpvROf X-Received: by 10.42.86.145 with SMTP id u17mr334272icl.11.1402913418622; Mon, 16 Jun 2014 03:10:18 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.140.32.203 with SMTP id h69ls4562701qgh.56.gmail; Mon, 16 Jun 2014 03:10:18 -0700 (PDT) X-Received: by 10.58.85.3 with SMTP id d3mr717494vez.34.1402913418518; Mon, 16 Jun 2014 03:10:18 -0700 (PDT) Received: from mail-vc0-f180.google.com (mail-vc0-f180.google.com [209.85.220.180]) by mx.google.com with ESMTPS id tx9si3918808vcb.102.2014.06.16.03.10.18 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 16 Jun 2014 03:10:18 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.220.180 as permitted sender) client-ip=209.85.220.180; Received: by mail-vc0-f180.google.com with SMTP id im17so4615118vcb.39 for ; Mon, 16 Jun 2014 03:10:18 -0700 (PDT) X-Received: by 10.58.187.19 with SMTP id fo19mr11474vec.45.1402913418381; Mon, 16 Jun 2014 03:10:18 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.221.54.6 with SMTP id vs6csp123312vcb; Mon, 16 Jun 2014 03:10:18 -0700 (PDT) X-Received: by 10.224.21.1 with SMTP id h1mr2339090qab.103.1402913417885; Mon, 16 Jun 2014 03:10:17 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id w6si12609020qab.69.2014.06.16.03.10.17 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 16 Jun 2014 03:10:17 -0700 (PDT) Received-SPF: none (google.com: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org does not designate permitted sender hosts) client-ip=2001:1868:205::9; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1WwTkV-0000vL-Lm; Mon, 16 Jun 2014 10:02:47 +0000 Received: from mail-we0-f178.google.com ([74.125.82.178]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1WwTkO-0000rD-BX for linux-arm-kernel@lists.infradead.org; Mon, 16 Jun 2014 10:02:41 +0000 Received: by mail-we0-f178.google.com with SMTP id x48so4732436wes.9 for ; Mon, 16 Jun 2014 03:02:17 -0700 (PDT) X-Received: by 10.180.24.2 with SMTP id q2mr26050285wif.22.1402912937424; Mon, 16 Jun 2014 03:02:17 -0700 (PDT) Received: from ards-macbook-pro.guests.st.com (29-39.80-90.static-ip.oleane.fr. [90.80.39.29]) by mx.google.com with ESMTPSA id ge6sm6410698wic.0.2014.06.16.03.02.16 for (version=TLSv1.1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 16 Jun 2014 03:02:16 -0700 (PDT) From: Ard Biesheuvel To: catalin.marinas@arm.com Subject: [PATCH 2/2] arm64/crypto: improve performance of GHASH algorithm Date: Mon, 16 Jun 2014 12:02:16 +0200 Message-Id: <1402912936-9137-2-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 1.8.3.2 In-Reply-To: <1402912936-9137-1-git-send-email-ard.biesheuvel@linaro.org> References: <1402912936-9137-1-git-send-email-ard.biesheuvel@linaro.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20140616_030240_539498_CB446FC7 X-CRM114-Status: GOOD ( 12.47 ) X-Spam-Score: -0.7 (/) X-Spam-Report: SpamAssassin version 3.4.0 on bombadil.infradead.org summary: Content analysis details: (-0.7 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [74.125.82.178 listed in list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record -0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3) [74.125.82.178 listed in wl.mailspike.net] -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: ard.biesheuvel@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.220.180 as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 This patches modifies the GHASH secure hash implementation to switch to a faster, polynomial multiplication based reduction instead of one that uses shifts and rotates. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/ghash-ce-core.S | 92 ++++++++++++++++----------------------- arch/arm64/crypto/ghash-ce-glue.c | 4 +- 2 files changed, 40 insertions(+), 56 deletions(-) diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce-core.S index b9e6eaf41c9b..dc457015884e 100644 --- a/arch/arm64/crypto/ghash-ce-core.S +++ b/arch/arm64/crypto/ghash-ce-core.S @@ -3,14 +3,6 @@ * * Copyright (C) 2014 Linaro Ltd. * - * Based on arch/x86/crypto/ghash-pmullni-intel_asm.S - * - * Copyright (c) 2009 Intel Corp. - * Author: Huang Ying - * Vinodh Gopal - * Erdinc Ozturk - * Deniz Karakoyunlu - * * This program is free software; you can redistribute it and/or modify it * under the terms of the GNU General Public License version 2 as published * by the Free Software Foundation. @@ -19,13 +11,15 @@ #include #include - DATA .req v0 - SHASH .req v1 - IN1 .req v2 + SHASH .req v0 + SHASH2 .req v1 T1 .req v2 T2 .req v3 - T3 .req v4 - VZR .req v5 + MASK .req v4 + XL .req v5 + XM .req v6 + XH .req v7 + IN1 .req v7 .text .arch armv8-a+crypto @@ -35,61 +29,51 @@ * struct ghash_key const *k, const char *head) */ ENTRY(pmull_ghash_update) - ld1 {DATA.16b}, [x1] ld1 {SHASH.16b}, [x3] - eor VZR.16b, VZR.16b, VZR.16b + ld1 {XL.16b}, [x1] + movi MASK.16b, #0xe1 + ext SHASH2.16b, SHASH.16b, SHASH.16b, #8 + shl MASK.2d, MASK.2d, #57 + eor SHASH2.16b, SHASH2.16b, SHASH.16b /* do the head block first, if supplied */ cbz x4, 0f - ld1 {IN1.2d}, [x4] + ld1 {T1.2d}, [x4] b 1f -0: ld1 {IN1.2d}, [x2], #16 +0: ld1 {T1.2d}, [x2], #16 sub w0, w0, #1 -1: ext IN1.16b, IN1.16b, IN1.16b, #8 -CPU_LE( rev64 IN1.16b, IN1.16b ) - eor DATA.16b, DATA.16b, IN1.16b - /* multiply DATA by SHASH in GF(2^128) */ - ext T2.16b, DATA.16b, DATA.16b, #8 - ext T3.16b, SHASH.16b, SHASH.16b, #8 - eor T2.16b, T2.16b, DATA.16b - eor T3.16b, T3.16b, SHASH.16b +1: /* multiply XL by SHASH in GF(2^128) */ +CPU_LE( rev64 T1.16b, T1.16b ) - pmull2 T1.1q, SHASH.2d, DATA.2d // a1 * b1 - pmull DATA.1q, SHASH.1d, DATA.1d // a0 * b0 - pmull T2.1q, T2.1d, T3.1d // (a1 + a0)(b1 + b0) - eor T2.16b, T2.16b, T1.16b // (a0 * b1) + (a1 * b0) - eor T2.16b, T2.16b, DATA.16b + ext T2.16b, XL.16b, XL.16b, #8 + ext IN1.16b, T1.16b, T1.16b, #8 + eor T1.16b, T1.16b, T2.16b + eor XL.16b, XL.16b, IN1.16b - ext T3.16b, VZR.16b, T2.16b, #8 - ext T2.16b, T2.16b, VZR.16b, #8 - eor DATA.16b, DATA.16b, T3.16b - eor T1.16b, T1.16b, T2.16b // is result of - // carry-less multiplication + pmull2 XH.1q, SHASH.2d, XL.2d // a1 * b1 + eor T1.16b, T1.16b, XL.16b + pmull XL.1q, SHASH.1d, XL.1d // a0 * b0 + pmull XM.1q, SHASH2.1d, T1.1d // (a1 + a0)(b1 + b0) - /* first phase of the reduction */ - shl T3.2d, DATA.2d, #1 - eor T3.16b, T3.16b, DATA.16b - shl T3.2d, T3.2d, #5 - eor T3.16b, T3.16b, DATA.16b - shl T3.2d, T3.2d, #57 - ext T2.16b, VZR.16b, T3.16b, #8 - ext T3.16b, T3.16b, VZR.16b, #8 - eor DATA.16b, DATA.16b, T2.16b - eor T1.16b, T1.16b, T3.16b + ext T1.16b, XL.16b, XH.16b, #8 + eor T2.16b, XL.16b, XH.16b + eor XM.16b, XM.16b, T1.16b + eor XM.16b, XM.16b, T2.16b + pmull T2.1q, XL.1d, MASK.1d - /* second phase of the reduction */ - ushr T2.2d, DATA.2d, #5 - eor T2.16b, T2.16b, DATA.16b - ushr T2.2d, T2.2d, #1 - eor T2.16b, T2.16b, DATA.16b - ushr T2.2d, T2.2d, #1 - eor T1.16b, T1.16b, T2.16b - eor DATA.16b, DATA.16b, T1.16b + mov XH.d[0], XM.d[1] + mov XM.d[1], XL.d[0] + + eor XL.16b, XM.16b, T2.16b + ext T2.16b, XL.16b, XL.16b, #8 + pmull XL.1q, XL.1d, MASK.1d + eor T2.16b, T2.16b, XH.16b + eor XL.16b, XL.16b, T2.16b cbnz w0, 0b - st1 {DATA.16b}, [x1] + st1 {XL.16b}, [x1] ret ENDPROC(pmull_ghash_update) diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c index ef6aa69c4e0c..833ec1e3f3e9 100644 --- a/arch/arm64/crypto/ghash-ce-glue.c +++ b/arch/arm64/crypto/ghash-ce-glue.c @@ -67,7 +67,7 @@ static int ghash_update(struct shash_desc *desc, const u8 *src, blocks = len / GHASH_BLOCK_SIZE; len %= GHASH_BLOCK_SIZE; - kernel_neon_begin_partial(6); + kernel_neon_begin_partial(8); pmull_ghash_update(blocks, ctx->digest, src, key, partial ? ctx->buf : NULL); kernel_neon_end(); @@ -89,7 +89,7 @@ static int ghash_final(struct shash_desc *desc, u8 *dst) memset(ctx->buf + partial, 0, GHASH_BLOCK_SIZE - partial); - kernel_neon_begin_partial(6); + kernel_neon_begin_partial(8); pmull_ghash_update(1, ctx->digest, ctx->buf, key, NULL); kernel_neon_end(); }