From patchwork Tue Aug 5 20:34:52 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 34962
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org, linux@arm.linux.org.uk,
 jussi.kivilinna@iki.fi
Cc: linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au,
 Ard Biesheuvel
Subject: [PATCH v2] ARM: crypto: enable NEON SHA-1 for big endian
Date: Tue, 5 Aug 2014 22:34:52 +0200
Message-Id: <1407270892-20902-1-git-send-email-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 1.8.3.2
X-Mailing-List: linux-crypto@vger.kernel.org

This tweaks the SHA-1 NEON code slightly so it works correctly under big
endian, and removes the Kconfig condition preventing it from being
selected if CONFIG_CPU_BIG_ENDIAN is set.

Signed-off-by: Ard Biesheuvel
---
I accidentally submitted the version below to the patch system (#8125/1)
rather than the version I had posted to LAKML for review.

The difference between the two versions is that the first one just
changed some vld1.32 calls into vld1.8 calls, resulting in the data being
byte swapped twice after being read from memory: once by vld1.8 and once
by the subsequent vrev32.8 instruction.

Instead, this version retains the vld1.32 calls and makes the vrev32.8
calls conditional on !CPU_BIG_ENDIAN. As the vrev32.8 instruction did an
implicit move as well, some register names had to be reshuffled to avoid
having to move values between registers instead.

Both versions pass the tcrypt built-in test suite for SHA1, in both
big-endian and little-endian modes.
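
For anyone who wants to check the endianness reasoning outside the
kernel, here is a minimal scalar sketch in C. It is not part of the patch.
It assumes that a native-endian 32-bit load plays the role of vld1.32 and
that a 32-bit byte reversal plays the role of vrev32.8. All function names
are hypothetical, and the runtime probe only stands in for the build-time
CONFIG_CPU_BIG_ENDIAN test.

```c
#include <stdint.h>
#include <string.h>

/* Reverse the four bytes of a 32-bit word (scalar stand-in for vrev32.8). */
static uint32_t swab32_sketch(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Hypothetical runtime probe; the patch decides this at build time. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/*
 * Native-endian word load (stand-in for vld1.32), byte swapped only on
 * little-endian hosts, so the result is always the big-endian value that
 * SHA-1 expects.
 */
static uint32_t load_be32_sketch(const uint8_t *p)
{
	uint32_t w;

	memcpy(&w, p, sizeof(w));
	if (host_is_little_endian())
		w = swab32_sketch(w);
	return w;
}
```

On a little-endian host the swap is needed to recover the big-endian
message word. On a big-endian host the native load already yields it, so
the swap has to be skipped.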
 arch/arm/crypto/sha1-armv7-neon.S | 39 ++++++++++++++++++++++-----------------
 crypto/Kconfig                    |  2 +-
 2 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/arch/arm/crypto/sha1-armv7-neon.S b/arch/arm/crypto/sha1-armv7-neon.S
index 50013c0e2864..dcd01f3f0bb0 100644
--- a/arch/arm/crypto/sha1-armv7-neon.S
+++ b/arch/arm/crypto/sha1-armv7-neon.S
@@ -9,7 +9,7 @@
  */
 
 #include
-
+#include
 
 .syntax unified
 .code 32
@@ -61,13 +61,13 @@
 #define RT3 r12
 
 #define W0 q0
-#define W1 q1
+#define W1 q7
 #define W2 q2
 #define W3 q3
 #define W4 q4
-#define W5 q5
-#define W6 q6
-#define W7 q7
+#define W5 q6
+#define W6 q5
+#define W7 q1
 
 #define tmp0 q8
 #define tmp1 q9
@@ -79,6 +79,11 @@
 #define qK3 q14
 #define qK4 q15
 
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define ARM_LE(code...)
+#else
+#define ARM_LE(code...) code
+#endif
 
 /* Round function macros. */
 
@@ -150,45 +155,45 @@
 #define W_PRECALC_00_15() \
 	add RWK, sp, #(WK_offs(0)); \
 	\
-	vld1.32 {tmp0, tmp1}, [RDATA]!; \
-	vrev32.8 W0, tmp0; /* big => little */ \
-	vld1.32 {tmp2, tmp3}, [RDATA]!; \
+	vld1.32 {W0, W7}, [RDATA]!; \
+ ARM_LE(vrev32.8 W0, W0; ) /* big => little */ \
+	vld1.32 {W6, W5}, [RDATA]!; \
 	vadd.u32 tmp0, W0, curK; \
-	vrev32.8 W7, tmp1; /* big => little */ \
-	vrev32.8 W6, tmp2; /* big => little */ \
+ ARM_LE(vrev32.8 W7, W7; ) /* big => little */ \
+ ARM_LE(vrev32.8 W6, W6; ) /* big => little */ \
 	vadd.u32 tmp1, W7, curK; \
-	vrev32.8 W5, tmp3; /* big => little */ \
+ ARM_LE(vrev32.8 W5, W5; ) /* big => little */ \
 	vadd.u32 tmp2, W6, curK; \
 	vst1.32 {tmp0, tmp1}, [RWK]!; \
 	vadd.u32 tmp3, W5, curK; \
 	vst1.32 {tmp2, tmp3}, [RWK]; \
 
 #define WPRECALC_00_15_0(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
-	vld1.32 {tmp0, tmp1}, [RDATA]!; \
+	vld1.32 {W0, W7}, [RDATA]!; \
 
 #define WPRECALC_00_15_1(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
 	add RWK, sp, #(WK_offs(0)); \
 
 #define WPRECALC_00_15_2(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
-	vrev32.8 W0, tmp0; /* big => little */ \
+ ARM_LE(vrev32.8 W0, W0; ) /* big => little */ \
 
 #define WPRECALC_00_15_3(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
-	vld1.32 {tmp2, tmp3}, [RDATA]!; \
+	vld1.32 {W6, W5}, [RDATA]!; \
 
 #define WPRECALC_00_15_4(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
 	vadd.u32 tmp0, W0, curK; \
 
 #define WPRECALC_00_15_5(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
-	vrev32.8 W7, tmp1; /* big => little */ \
+ ARM_LE(vrev32.8 W7, W7; ) /* big => little */ \
 
 #define WPRECALC_00_15_6(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
-	vrev32.8 W6, tmp2; /* big => little */ \
+ ARM_LE(vrev32.8 W6, W6; ) /* big => little */ \
 
 #define WPRECALC_00_15_7(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
 	vadd.u32 tmp1, W7, curK; \
 
 #define WPRECALC_00_15_8(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
-	vrev32.8 W5, tmp3; /* big => little */ \
+ ARM_LE(vrev32.8 W5, W5; ) /* big => little */ \
 
 #define WPRECALC_00_15_9(i,W,W_m04,W_m08,W_m12,W_m16,W_m20,W_m24,W_m28) \
 	vadd.u32 tmp2, W6, curK; \
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 749b1e05c490..deef2a4b6559 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -542,7 +542,7 @@ config CRYPTO_SHA1_ARM
 
 config CRYPTO_SHA1_ARM_NEON
 	tristate "SHA1 digest algorithm (ARM NEON)"
-	depends on ARM && KERNEL_MODE_NEON && !CPU_BIG_ENDIAN
+	depends on ARM && KERNEL_MODE_NEON
 	select CRYPTO_SHA1_ARM
 	select CRYPTO_SHA1
 	select CRYPTO_HASH
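
As a side note on the ARM_LE() wrapper introduced in the diff, the
following is a rough user-space sketch in C of how such a conditional
macro behaves. It is not taken from the kernel. SKETCH_LE and
SKETCH_BIG_ENDIAN are hypothetical names, and the sketch uses the
standard __VA_ARGS__ form where the patch uses the GNU named variadic
form (code...).

```c
/*
 * Building with -DSKETCH_BIG_ENDIAN mimics CONFIG_CPU_BIG_ENDIAN.
 * Without it, the wrapped code is emitted unchanged.
 */
#ifdef SKETCH_BIG_ENDIAN
#define SKETCH_LE(...)               /* big endian: argument is dropped */
#else
#define SKETCH_LE(...) __VA_ARGS__   /* little endian: argument is kept */
#endif

/* Counts how many wrapped statements survive preprocessing. */
static int swaps_emitted(void)
{
	int count = 0;

	SKETCH_LE(count++;)
	SKETCH_LE(count++;)
	return count;
}
```

In a default build (SKETCH_BIG_ENDIAN undefined) both wrapped statements
are compiled in. With the define set they vanish, which is the same
effect the patch relies on to drop the vrev32.8 instructions on big
endian.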