From patchwork Mon Jun  9 14:57:24 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell <peter.maydell@linaro.org>
X-Patchwork-Id: 31578
Return-Path: <patchwork-forward+bncBC6Z756YVMIBB2EZ26OAKGQERSS5NEI@linaro.org>
X-Original-To: linaro@patches.linaro.org
Delivered-To: linaro@patches.linaro.org
Received: from mail-ve0-f197.google.com (mail-ve0-f197.google.com
 [209.85.128.197])
 by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 0218820675
 for <linaro@patches.linaro.org>; Mon,  9 Jun 2014 15:04:08 +0000 (UTC)
Received: by mail-ve0-f197.google.com with SMTP id jz11sf11667060veb.4
 for <linaro@patches.linaro.org>; Mon, 09 Jun 2014 08:04:08 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:delivered-to:from:to:date
 :message-id:in-reply-to:references:subject:precedence:list-id
 :list-unsubscribe:list-archive:list-post:list-help:list-subscribe
 :errors-to:sender:x-original-sender
 :x-original-authentication-results:mailing-list;
 bh=/qS1szAE++IZsv5r4pVqZupo5w2gyFJ+dePxO3el1aE=;
 b=fXI0usCbUdYgjaHQ7VH0PNitbgON9CVGbGC+MRosuZWiY2Mj2eLGZUWeLrL73tp13o
 DwOEP/CnyfsjkPIaeMD+ZuEHAv2HR5Br15t26rUyUi4ALLAxJkkM3W/5HUAFEISbmpef
 UmPhz6WsIooJM1iOu+OlD8yQiZFMuy0EO4pnxtk6Qt/nC4QAs49HvZGupj+MbseQVcrF
 37d/+K2hGovEluaz/LnodwKoSICqNtM4Dq2dyay5xwOBxDyghN674OUzlFS5iCNW1ieY
 L2KQh+OcSbDRz82QwI/2lkwNiHWp2O4BSKRXw86HKS0awi1DVuQm2Byzsc/7MpO3bXHy
 4p1Q==
X-Gm-Message-State: ALoCoQkdwjYUI6hExy8JBlsGKY2+svkaKEk+WByWw4G1gPaCxTrqwNt8K3yVqQuW9D4WjSz3Bf+C
X-Received: by 10.58.173.231 with SMTP id bn7mr8933969vec.21.1402326248538; 
 Mon, 09 Jun 2014 08:04:08 -0700 (PDT)
X-BeenThere: patchwork-forward@linaro.org
Received: by 10.140.103.118 with SMTP id x109ls1617372qge.74.gmail; Mon, 09
 Jun 2014 08:04:08 -0700 (PDT)
X-Received: by 10.52.241.98 with SMTP id wh2mr21986258vdc.37.1402326248322; 
 Mon, 09 Jun 2014 08:04:08 -0700 (PDT)
Received: from mail-ve0-f173.google.com (mail-ve0-f173.google.com
 [209.85.128.173])
 by mx.google.com with ESMTPS id 13si749498vdg.106.2014.06.09.08.04.08
 for <patchwork-forward@linaro.org>
 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128);
 Mon, 09 Jun 2014 08:04:08 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 209.85.128.173 as permitted sender) client-ip=209.85.128.173; 
Received: by mail-ve0-f173.google.com with SMTP id db11so1208392veb.4
 for <patchwork-forward@linaro.org>;
 Mon, 09 Jun 2014 08:04:08 -0700 (PDT)
X-Received: by 10.220.44.141 with SMTP id a13mr836238vcf.71.1402326248231;
 Mon, 09 Jun 2014 08:04:08 -0700 (PDT)
X-Forwarded-To: patchwork-forward@linaro.org
X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org
Delivered-To: patch@linaro.org
Received: by 10.221.54.6 with SMTP id vs6csp152679vcb;
 Mon, 9 Jun 2014 08:04:07 -0700 (PDT)
X-Received: by 10.224.8.131 with SMTP id h3mr33667388qah.61.1402326247523;
 Mon, 09 Jun 2014 08:04:07 -0700 (PDT)
Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11])
 by mx.google.com with ESMTPS id
 e3si12023441qci.15.2014.06.09.08.04.07 for <patch@linaro.org>
 (version=TLSv1 cipher=RC4-SHA bits=128/128);
 Mon, 09 Jun 2014 08:04:07 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 qemu-devel-bounces+patch=linaro.org@nongnu.org designates
 2001:4830:134:3::11 as permitted sender)
 client-ip=2001:4830:134:3::11; 
Received: from localhost ([::1]:33627 helo=lists.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <qemu-devel-bounces+patch=linaro.org@nongnu.org>)
 id 1Wu17G-0005kk-Sv
 for patch@linaro.org; Mon, 09 Jun 2014 11:04:06 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:55202)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <pm215@archaic.org.uk>) id 1Wu118-000592-Ob
 for qemu-devel@nongnu.org; Mon, 09 Jun 2014 10:57:48 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <pm215@archaic.org.uk>) id 1Wu116-0005h2-L6
 for qemu-devel@nongnu.org; Mon, 09 Jun 2014 10:57:46 -0400
Received: from mnementh.archaic.org.uk ([2001:8b0:1d0::1]:48572)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <pm215@archaic.org.uk>) id 1Wu116-0005dP-9l
 for qemu-devel@nongnu.org; Mon, 09 Jun 2014 10:57:44 -0400
Received: from pm215 by mnementh.archaic.org.uk with local (Exim 4.80)
 (envelope-from <pm215@archaic.org.uk>) id 1Wu110-00069p-BP
 for qemu-devel@nongnu.org; Mon, 09 Jun 2014 15:57:38 +0100
From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-devel@nongnu.org
Date: Mon,  9 Jun 2014 15:57:24 +0100
Message-Id: <1402325858-23615-7-git-send-email-peter.maydell@linaro.org>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1402325858-23615-1-git-send-email-peter.maydell@linaro.org>
References: <1402325858-23615-1-git-send-email-peter.maydell@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address
 (bad octet value).
X-Received-From: 2001:8b0:1d0::1
Subject: [Qemu-devel] [PULL 06/20] target-arm: add support for v8 SHA1 and
 SHA256 instructions
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: <patchwork-forward.linaro.org>
List-Unsubscribe: <http://groups.google.com/a/linaro.org/group/patchwork-forward/subscribe>, 
 <mailto:googlegroups-manage+836684582541+unsubscribe@googlegroups.com>
List-Archive: <http://groups.google.com/a/linaro.org/group/patchwork-forward/>
List-Post: <http://groups.google.com/a/linaro.org/group/patchwork-forward/post>, 
 <mailto:patchwork-forward@linaro.org>
List-Help: <http://support.google.com/a/linaro.org/bin/topic.py?topic=25838>, 
 <mailto:patchwork-forward+help@linaro.org>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org
X-Removed-Original-Auth: Dkim didn't pass.
X-Original-Sender: peter.maydell@linaro.org
X-Original-Authentication-Results: mx.google.com; spf=pass (google.com:
 domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 209.85.128.173 as permitted sender)
 smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org
Mailing-list: list patchwork-forward@linaro.org;
 contact patchwork-forward+owners@linaro.org
X-Google-Group-Id: 836684582541

From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

This adds support for the SHA1 and SHA256 instructions that are available
on some v8 implementations of AArch32.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401386724-26529-2-git-send-email-peter.maydell@linaro.org
[PMM:
 * rebase
 * fix bad indent
 * add a missing UNDEF check for Q!=1 in the 3-reg SHA1/SHA256 case
 * use g_assert_not_reached()
 * don't re-extract bit 6 for the 2-reg-misc encodings
 * set the ELF HWCAP2 bits for the new features
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
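Note for reviewers (not part of the commit): the 3-register SHA encodings
reuse the NEON size field as an op selector, with U choosing between SHA-1
and SHA-256 and Q required to be 1. The standalone sketch below only
summarises that mapping as this patch implements it. The enum and the
function name decode_sha_3reg() are hypothetical and exist only for
illustration; the feature-bit checks and the odd-register checks done in
disas_neon_data_insn() are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not part of the QEMU tree. */
enum sha_3reg_insn {
    SHA_UNDEF,
    SHA1C, SHA1P, SHA1M, SHA1SU0,
    SHA256H, SHA256H2, SHA256SU1,
};

/*
 * Map the Q, U and size fields of a NEON_3R_SHA encoding to an
 * instruction, mirroring the checks this patch adds to the decoder.
 */
enum sha_3reg_insn decode_sha_3reg(int q, int u, uint32_t size)
{
    /* U == 0: SHA-1, size selects sha1c/sha1p/sha1m/sha1su0 */
    static const enum sha_3reg_insn sha1[4] = {
        SHA1C, SHA1P, SHA1M, SHA1SU0,
    };
    /* U == 1: SHA-256, size == 3 is not allocated and UNDEFs */
    static const enum sha_3reg_insn sha256[4] = {
        SHA256H, SHA256H2, SHA256SU1, SHA_UNDEF,
    };

    assert(size < 4); /* size is a 2-bit field */

    if (!q) {
        return SHA_UNDEF; /* Q != 1 UNDEFs for all of these */
    }
    return u ? sha256[size] : sha1[size];
}
```

For example, decode_sha_3reg(1, 0, 3) selects SHA1SU0, which is the op == 3
case handled separately at the top of the crypto_sha1_3reg helper.
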
 linux-user/elfload.c       |   2 +
 target-arm/cpu.c           |   2 +
 target-arm/cpu.h           |   2 +
 target-arm/crypto_helper.c | 257 +++++++++++++++++++++++++++++++++++++++++++--
 target-arm/helper.h        |   9 ++
 target-arm/translate.c     |  84 +++++++++++++++
 6 files changed, 349 insertions(+), 7 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 995f999..9bda262 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -468,6 +468,8 @@ static uint32_t get_elf_hwcap2(void)
     uint32_t hwcaps = 0;
 
     GET_FEATURE(ARM_FEATURE_V8_AES, ARM_HWCAP2_ARM_AES);
+    GET_FEATURE(ARM_FEATURE_V8_SHA1, ARM_HWCAP2_ARM_SHA1);
+    GET_FEATURE(ARM_FEATURE_V8_SHA256, ARM_HWCAP2_ARM_SHA2);
     GET_FEATURE(ARM_FEATURE_CRC, ARM_HWCAP2_ARM_CRC32);
     return hwcaps;
 }
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 794dcb9..753f6cb 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -317,6 +317,8 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         set_feature(env, ARM_FEATURE_ARM_DIV);
         set_feature(env, ARM_FEATURE_LPAE);
         set_feature(env, ARM_FEATURE_V8_AES);
+        set_feature(env, ARM_FEATURE_V8_SHA1);
+        set_feature(env, ARM_FEATURE_V8_SHA256);
     }
     if (arm_feature(env, ARM_FEATURE_V7)) {
         set_feature(env, ARM_FEATURE_VAPA);
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index bf1886c..0cddf95 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -651,6 +651,8 @@ enum arm_features {
     ARM_FEATURE_CBAR_RO, /* has cp15 CBAR and it is read-only */
     ARM_FEATURE_EL2, /* has EL2 Virtualization support */
     ARM_FEATURE_EL3, /* has EL3 Secure monitor support */
+    ARM_FEATURE_V8_SHA1, /* implements SHA1 part of v8 Crypto Extensions */
+    ARM_FEATURE_V8_SHA256, /* implements SHA256 part of v8 Crypto Extensions */
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target-arm/crypto_helper.c b/target-arm/crypto_helper.c
index d8898ed..3e4b5f7 100644
--- a/target-arm/crypto_helper.c
+++ b/target-arm/crypto_helper.c
@@ -1,7 +1,7 @@
 /*
  * crypto_helper.c - emulate v8 Crypto Extensions instructions
  *
- * Copyright (C) 2013 Linaro Ltd <ard.biesheuvel@linaro.org>
+ * Copyright (C) 2013 - 2014 Linaro Ltd <ard.biesheuvel@linaro.org>
  *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public
@@ -15,9 +15,9 @@
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
 
-union AES_STATE {
+union CRYPTO_STATE {
     uint8_t    bytes[16];
-    uint32_t   cols[4];
+    uint32_t   words[4];
     uint64_t   l[2];
 };
 
@@ -99,11 +99,11 @@ void HELPER(crypto_aese)(CPUARMState *env, uint32_t rd, uint32_t rm,
         /* ShiftRows permutation vector for decryption */
         { 0, 13, 10,  7, 4, 1, 14, 11, 8,  5, 2, 15, 12, 9, 6,  3 },
     };
-    union AES_STATE rk = { .l = {
+    union CRYPTO_STATE rk = { .l = {
         float64_val(env->vfp.regs[rm]),
         float64_val(env->vfp.regs[rm + 1])
     } };
-    union AES_STATE st = { .l = {
+    union CRYPTO_STATE st = { .l = {
         float64_val(env->vfp.regs[rd]),
         float64_val(env->vfp.regs[rd + 1])
     } };
@@ -260,7 +260,7 @@ void HELPER(crypto_aesmc)(CPUARMState *env, uint32_t rd, uint32_t rm,
         0x92b479a7, 0x99b970a9, 0x84ae6bbb, 0x8fa362b5,
         0xbe805d9f, 0xb58d5491, 0xa89a4f83, 0xa397468d,
     } };
-    union AES_STATE st = { .l = {
+    union CRYPTO_STATE st = { .l = {
         float64_val(env->vfp.regs[rm]),
         float64_val(env->vfp.regs[rm + 1])
     } };
@@ -269,7 +269,7 @@ void HELPER(crypto_aesmc)(CPUARMState *env, uint32_t rd, uint32_t rm,
     assert(decrypt < 2);
 
     for (i = 0; i < 16; i += 4) {
-        st.cols[i >> 2] = cpu_to_le32(
+        st.words[i >> 2] = cpu_to_le32(
             mc[decrypt][st.bytes[i]] ^
             rol32(mc[decrypt][st.bytes[i + 1]], 8) ^
             rol32(mc[decrypt][st.bytes[i + 2]], 16) ^
@@ -279,3 +279,246 @@ void HELPER(crypto_aesmc)(CPUARMState *env, uint32_t rd, uint32_t rm,
     env->vfp.regs[rd] = make_float64(st.l[0]);
     env->vfp.regs[rd + 1] = make_float64(st.l[1]);
 }
+
+/*
+ * SHA-1 logical functions
+ */
+
+static uint32_t cho(uint32_t x, uint32_t y, uint32_t z)
+{
+    return (x & (y ^ z)) ^ z;
+}
+
+static uint32_t par(uint32_t x, uint32_t y, uint32_t z)
+{
+    return x ^ y ^ z;
+}
+
+static uint32_t maj(uint32_t x, uint32_t y, uint32_t z)
+{
+    return (x & y) | ((x | y) & z);
+}
+
+void HELPER(crypto_sha1_3reg)(CPUARMState *env, uint32_t rd, uint32_t rn,
+                              uint32_t rm, uint32_t op)
+{
+    union CRYPTO_STATE d = { .l = {
+        float64_val(env->vfp.regs[rd]),
+        float64_val(env->vfp.regs[rd + 1])
+    } };
+    union CRYPTO_STATE n = { .l = {
+        float64_val(env->vfp.regs[rn]),
+        float64_val(env->vfp.regs[rn + 1])
+    } };
+    union CRYPTO_STATE m = { .l = {
+        float64_val(env->vfp.regs[rm]),
+        float64_val(env->vfp.regs[rm + 1])
+    } };
+
+    if (op == 3) { /* sha1su0 */
+        d.l[0] ^= d.l[1] ^ m.l[0];
+        d.l[1] ^= n.l[0] ^ m.l[1];
+    } else {
+        int i;
+
+        for (i = 0; i < 4; i++) {
+            uint32_t t;
+
+            switch (op) {
+            case 0: /* sha1c */
+                t = cho(d.words[1], d.words[2], d.words[3]);
+                break;
+            case 1: /* sha1p */
+                t = par(d.words[1], d.words[2], d.words[3]);
+                break;
+            case 2: /* sha1m */
+                t = maj(d.words[1], d.words[2], d.words[3]);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+            t += rol32(d.words[0], 5) + n.words[0] + m.words[i];
+
+            n.words[0] = d.words[3];
+            d.words[3] = d.words[2];
+            d.words[2] = ror32(d.words[1], 2);
+            d.words[1] = d.words[0];
+            d.words[0] = t;
+        }
+    }
+    env->vfp.regs[rd] = make_float64(d.l[0]);
+    env->vfp.regs[rd + 1] = make_float64(d.l[1]);
+}
+
+void HELPER(crypto_sha1h)(CPUARMState *env, uint32_t rd, uint32_t rm)
+{
+    union CRYPTO_STATE m = { .l = {
+        float64_val(env->vfp.regs[rm]),
+        float64_val(env->vfp.regs[rm + 1])
+    } };
+
+    m.words[0] = ror32(m.words[0], 2);
+    m.words[1] = m.words[2] = m.words[3] = 0;
+
+    env->vfp.regs[rd] = make_float64(m.l[0]);
+    env->vfp.regs[rd + 1] = make_float64(m.l[1]);
+}
+
+void HELPER(crypto_sha1su1)(CPUARMState *env, uint32_t rd, uint32_t rm)
+{
+    union CRYPTO_STATE d = { .l = {
+        float64_val(env->vfp.regs[rd]),
+        float64_val(env->vfp.regs[rd + 1])
+    } };
+    union CRYPTO_STATE m = { .l = {
+        float64_val(env->vfp.regs[rm]),
+        float64_val(env->vfp.regs[rm + 1])
+    } };
+
+    d.words[0] = rol32(d.words[0] ^ m.words[1], 1);
+    d.words[1] = rol32(d.words[1] ^ m.words[2], 1);
+    d.words[2] = rol32(d.words[2] ^ m.words[3], 1);
+    d.words[3] = rol32(d.words[3] ^ d.words[0], 1);
+
+    env->vfp.regs[rd] = make_float64(d.l[0]);
+    env->vfp.regs[rd + 1] = make_float64(d.l[1]);
+}
+
+/*
+ * The SHA-256 logical functions, according to
+ * http://csrc.nist.gov/groups/STM/cavp/documents/shs/sha256-384-512.pdf
+ */
+
+static uint32_t S0(uint32_t x)
+{
+    return ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22);
+}
+
+static uint32_t S1(uint32_t x)
+{
+    return ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25);
+}
+
+static uint32_t s0(uint32_t x)
+{
+    return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3);
+}
+
+static uint32_t s1(uint32_t x)
+{
+    return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10);
+}
+
+void HELPER(crypto_sha256h)(CPUARMState *env, uint32_t rd, uint32_t rn,
+                            uint32_t rm)
+{
+    union CRYPTO_STATE d = { .l = {
+        float64_val(env->vfp.regs[rd]),
+        float64_val(env->vfp.regs[rd + 1])
+    } };
+    union CRYPTO_STATE n = { .l = {
+        float64_val(env->vfp.regs[rn]),
+        float64_val(env->vfp.regs[rn + 1])
+    } };
+    union CRYPTO_STATE m = { .l = {
+        float64_val(env->vfp.regs[rm]),
+        float64_val(env->vfp.regs[rm + 1])
+    } };
+    int i;
+
+    for (i = 0; i < 4; i++) {
+        uint32_t t = cho(n.words[0], n.words[1], n.words[2]) + n.words[3]
+                     + S1(n.words[0]) + m.words[i];
+
+        n.words[3] = n.words[2];
+        n.words[2] = n.words[1];
+        n.words[1] = n.words[0];
+        n.words[0] = d.words[3] + t;
+
+        t += maj(d.words[0], d.words[1], d.words[2]) + S0(d.words[0]);
+
+        d.words[3] = d.words[2];
+        d.words[2] = d.words[1];
+        d.words[1] = d.words[0];
+        d.words[0] = t;
+    }
+
+    env->vfp.regs[rd] = make_float64(d.l[0]);
+    env->vfp.regs[rd + 1] = make_float64(d.l[1]);
+}
+
+void HELPER(crypto_sha256h2)(CPUARMState *env, uint32_t rd, uint32_t rn,
+                             uint32_t rm)
+{
+    union CRYPTO_STATE d = { .l = {
+        float64_val(env->vfp.regs[rd]),
+        float64_val(env->vfp.regs[rd + 1])
+    } };
+    union CRYPTO_STATE n = { .l = {
+        float64_val(env->vfp.regs[rn]),
+        float64_val(env->vfp.regs[rn + 1])
+    } };
+    union CRYPTO_STATE m = { .l = {
+        float64_val(env->vfp.regs[rm]),
+        float64_val(env->vfp.regs[rm + 1])
+    } };
+    int i;
+
+    for (i = 0; i < 4; i++) {
+        uint32_t t = cho(d.words[0], d.words[1], d.words[2]) + d.words[3]
+                     + S1(d.words[0]) + m.words[i];
+
+        d.words[3] = d.words[2];
+        d.words[2] = d.words[1];
+        d.words[1] = d.words[0];
+        d.words[0] = n.words[3 - i] + t;
+    }
+
+    env->vfp.regs[rd] = make_float64(d.l[0]);
+    env->vfp.regs[rd + 1] = make_float64(d.l[1]);
+}
+
+void HELPER(crypto_sha256su0)(CPUARMState *env, uint32_t rd, uint32_t rm)
+{
+    union CRYPTO_STATE d = { .l = {
+        float64_val(env->vfp.regs[rd]),
+        float64_val(env->vfp.regs[rd + 1])
+    } };
+    union CRYPTO_STATE m = { .l = {
+        float64_val(env->vfp.regs[rm]),
+        float64_val(env->vfp.regs[rm + 1])
+    } };
+
+    d.words[0] += s0(d.words[1]);
+    d.words[1] += s0(d.words[2]);
+    d.words[2] += s0(d.words[3]);
+    d.words[3] += s0(m.words[0]);
+
+    env->vfp.regs[rd] = make_float64(d.l[0]);
+    env->vfp.regs[rd + 1] = make_float64(d.l[1]);
+}
+
+void HELPER(crypto_sha256su1)(CPUARMState *env, uint32_t rd, uint32_t rn,
+                              uint32_t rm)
+{
+    union CRYPTO_STATE d = { .l = {
+        float64_val(env->vfp.regs[rd]),
+        float64_val(env->vfp.regs[rd + 1])
+    } };
+    union CRYPTO_STATE n = { .l = {
+        float64_val(env->vfp.regs[rn]),
+        float64_val(env->vfp.regs[rn + 1])
+    } };
+    union CRYPTO_STATE m = { .l = {
+        float64_val(env->vfp.regs[rm]),
+        float64_val(env->vfp.regs[rm + 1])
+    } };
+
+    d.words[0] += s1(m.words[2]) + n.words[1];
+    d.words[1] += s1(m.words[3]) + n.words[2];
+    d.words[2] += s1(d.words[0]) + n.words[3];
+    d.words[3] += s1(d.words[1]) + m.words[0];
+
+    env->vfp.regs[rd] = make_float64(d.l[0]);
+    env->vfp.regs[rd + 1] = make_float64(d.l[1]);
+}
diff --git a/target-arm/helper.h b/target-arm/helper.h
index b63fd0f..113b09d 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -512,6 +512,15 @@ DEF_HELPER_3(neon_qzip32, void, env, i32, i32)
 DEF_HELPER_4(crypto_aese, void, env, i32, i32, i32)
 DEF_HELPER_4(crypto_aesmc, void, env, i32, i32, i32)
 
+DEF_HELPER_5(crypto_sha1_3reg, void, env, i32, i32, i32, i32)
+DEF_HELPER_3(crypto_sha1h, void, env, i32, i32)
+DEF_HELPER_3(crypto_sha1su1, void, env, i32, i32)
+
+DEF_HELPER_4(crypto_sha256h, void, env, i32, i32, i32)
+DEF_HELPER_4(crypto_sha256h2, void, env, i32, i32, i32)
+DEF_HELPER_3(crypto_sha256su0, void, env, i32, i32)
+DEF_HELPER_4(crypto_sha256su1, void, env, i32, i32, i32)
+
 DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_2(dc_zva, void, env, i64)
diff --git a/target-arm/translate.c b/target-arm/translate.c
index d499caa..38ef5b1 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4776,6 +4776,7 @@ static void gen_neon_narrow_op(int op, int u, int size,
 #define NEON_3R_VPMIN 21
 #define NEON_3R_VQDMULH_VQRDMULH 22
 #define NEON_3R_VPADD 23
+#define NEON_3R_SHA 24 /* SHA1C,SHA1P,SHA1M,SHA1SU0,SHA256H{2},SHA256SU1 */
 #define NEON_3R_VFM 25 /* VFMA, VFMS : float fused multiply-add */
 #define NEON_3R_FLOAT_ARITH 26 /* float VADD, VSUB, VPADD, VABD */
 #define NEON_3R_FLOAT_MULTIPLY 27 /* float VMLA, VMLS, VMUL */
@@ -4809,6 +4810,7 @@ static const uint8_t neon_3r_sizes[] = {
     [NEON_3R_VPMIN] = 0x7,
     [NEON_3R_VQDMULH_VQRDMULH] = 0x6,
     [NEON_3R_VPADD] = 0x7,
+    [NEON_3R_SHA] = 0xf, /* size field encodes op type */
     [NEON_3R_VFM] = 0x5, /* size bit 1 encodes op */
     [NEON_3R_FLOAT_ARITH] = 0x5, /* size bit 1 encodes op */
     [NEON_3R_FLOAT_MULTIPLY] = 0x5, /* size bit 1 encodes op */
@@ -4842,6 +4844,7 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VCEQ0 18
 #define NEON_2RM_VCLE0 19
 #define NEON_2RM_VCLT0 20
+#define NEON_2RM_SHA1H 21
 #define NEON_2RM_VABS 22
 #define NEON_2RM_VNEG 23
 #define NEON_2RM_VCGT0_F 24
@@ -4858,6 +4861,7 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */
 #define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
 #define NEON_2RM_VSHLL 38
+#define NEON_2RM_SHA1SU1 39 /* Includes SHA256SU0 */
 #define NEON_2RM_VRINTN 40
 #define NEON_2RM_VRINTX 41
 #define NEON_2RM_VRINTA 42
@@ -4918,6 +4922,7 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VCEQ0] = 0x7,
     [NEON_2RM_VCLE0] = 0x7,
     [NEON_2RM_VCLT0] = 0x7,
+    [NEON_2RM_SHA1H] = 0x4,
     [NEON_2RM_VABS] = 0x7,
     [NEON_2RM_VNEG] = 0x7,
     [NEON_2RM_VCGT0_F] = 0x4,
@@ -4934,6 +4939,7 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VMOVN] = 0x7,
     [NEON_2RM_VQMOVN] = 0x7,
     [NEON_2RM_VSHLL] = 0x7,
+    [NEON_2RM_SHA1SU1] = 0x4,
     [NEON_2RM_VRINTN] = 0x4,
     [NEON_2RM_VRINTX] = 0x4,
     [NEON_2RM_VRINTA] = 0x4,
@@ -5011,6 +5017,49 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
         if (q && ((rd | rn | rm) & 1)) {
             return 1;
         }
+        /*
+         * The SHA-1/SHA-256 3-register instructions require special treatment
+         * here, as their size field is overloaded as an op type selector, and
+         * they all consume their input in a single pass.
+         */
+        if (op == NEON_3R_SHA) {
+            if (!q) {
+                return 1;
+            }
+            if (!u) { /* SHA-1 */
+                if (!arm_feature(env, ARM_FEATURE_V8_SHA1)) {
+                    return 1;
+                }
+                tmp = tcg_const_i32(rd);
+                tmp2 = tcg_const_i32(rn);
+                tmp3 = tcg_const_i32(rm);
+                tmp4 = tcg_const_i32(size);
+                gen_helper_crypto_sha1_3reg(cpu_env, tmp, tmp2, tmp3, tmp4);
+                tcg_temp_free_i32(tmp4);
+            } else { /* SHA-256 */
+                if (!arm_feature(env, ARM_FEATURE_V8_SHA256) || size == 3) {
+                    return 1;
+                }
+                tmp = tcg_const_i32(rd);
+                tmp2 = tcg_const_i32(rn);
+                tmp3 = tcg_const_i32(rm);
+                switch (size) {
+                case 0:
+                    gen_helper_crypto_sha256h(cpu_env, tmp, tmp2, tmp3);
+                    break;
+                case 1:
+                    gen_helper_crypto_sha256h2(cpu_env, tmp, tmp2, tmp3);
+                    break;
+                case 2:
+                    gen_helper_crypto_sha256su1(cpu_env, tmp, tmp2, tmp3);
+                    break;
+                }
+            }
+            tcg_temp_free_i32(tmp);
+            tcg_temp_free_i32(tmp2);
+            tcg_temp_free_i32(tmp3);
+            return 0;
+        }
         if (size == 3 && op != NEON_3R_LOGIC) {
             /* 64-bit element instructions. */
             for (pass = 0; pass < (q ? 2 : 1); pass++) {
@@ -6486,6 +6535,41 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                     tcg_temp_free_i32(tmp2);
                     tcg_temp_free_i32(tmp3);
                     break;
+                case NEON_2RM_SHA1H:
+                    if (!arm_feature(env, ARM_FEATURE_V8_SHA1)
+                        || ((rm | rd) & 1)) {
+                        return 1;
+                    }
+                    tmp = tcg_const_i32(rd);
+                    tmp2 = tcg_const_i32(rm);
+
+                    gen_helper_crypto_sha1h(cpu_env, tmp, tmp2);
+
+                    tcg_temp_free_i32(tmp);
+                    tcg_temp_free_i32(tmp2);
+                    break;
+                case NEON_2RM_SHA1SU1:
+                    if ((rm | rd) & 1) {
+                        return 1;
+                    }
+                    /* bit 6 (q): set -> SHA256SU0, cleared -> SHA1SU1 */
+                    if (q) {
+                        if (!arm_feature(env, ARM_FEATURE_V8_SHA256)) {
+                            return 1;
+                        }
+                    } else if (!arm_feature(env, ARM_FEATURE_V8_SHA1)) {
+                        return 1;
+                    }
+                    tmp = tcg_const_i32(rd);
+                    tmp2 = tcg_const_i32(rm);
+                    if (q) {
+                        gen_helper_crypto_sha256su0(cpu_env, tmp, tmp2);
+                    } else {
+                        gen_helper_crypto_sha1su1(cpu_env, tmp, tmp2);
+                    }
+                    tcg_temp_free_i32(tmp);
+                    tcg_temp_free_i32(tmp2);
+                    break;
                 default:
                 elementwise:
                     for (pass = 0; pass < (q ? 4 : 2); pass++) {