From patchwork Mon Jun 9 14:57:26 2014
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 31577
From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-devel@nongnu.org
Date: Mon, 9 Jun 2014 15:57:26 +0100
Message-Id: <1402325858-23615-9-git-send-email-peter.maydell@linaro.org>
In-Reply-To: <1402325858-23615-1-git-send-email-peter.maydell@linaro.org>
References: <1402325858-23615-1-git-send-email-peter.maydell@linaro.org>
Subject: [Qemu-devel] [PULL 08/20] target-arm: add support for v8 VMULL.P64 instruction

Add support for the VMULL.P64 polynomial 64x64 to 128 bit multiplication
instruction in the A32/T32 instruction sets; this is part of the v8 Crypto
Extensions. To do this we have to move the neon_pmull_64_{lo,hi} helpers
from helper-a64.c into neon_helper.c so they can be used by the AArch32
translator.
Inspired-by: Ard Biesheuvel
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1401386724-26529-4-git-send-email-peter.maydell@linaro.org
---
 linux-user/elfload.c     |  1 +
 target-arm/cpu.c         |  1 +
 target-arm/cpu.h         |  1 +
 target-arm/helper-a64.c  | 30 ------------------------------
 target-arm/helper-a64.h  |  2 --
 target-arm/helper.h      |  3 +++
 target-arm/neon_helper.c | 30 ++++++++++++++++++++++++++++++
 target-arm/translate.c   | 26 +++++++++++++++++++++++++-
 8 files changed, 61 insertions(+), 33 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 9bda262..3241fec 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -468,6 +468,7 @@ static uint32_t get_elf_hwcap2(void)
     uint32_t hwcaps = 0;

     GET_FEATURE(ARM_FEATURE_V8_AES, ARM_HWCAP2_ARM_AES);
+    GET_FEATURE(ARM_FEATURE_V8_PMULL, ARM_HWCAP2_ARM_PMULL);
     GET_FEATURE(ARM_FEATURE_V8_SHA1, ARM_HWCAP2_ARM_SHA1);
     GET_FEATURE(ARM_FEATURE_V8_SHA256, ARM_HWCAP2_ARM_SHA2);
     GET_FEATURE(ARM_FEATURE_CRC, ARM_HWCAP2_ARM_CRC32);
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 753f6cb..fb9c12d 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -319,6 +319,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         set_feature(env, ARM_FEATURE_V8_AES);
         set_feature(env, ARM_FEATURE_V8_SHA1);
         set_feature(env, ARM_FEATURE_V8_SHA256);
+        set_feature(env, ARM_FEATURE_V8_PMULL);
     }
     if (arm_feature(env, ARM_FEATURE_V7)) {
         set_feature(env, ARM_FEATURE_VAPA);
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 0cddf95..369d472 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -653,6 +653,7 @@ enum arm_features {
     ARM_FEATURE_EL3, /* has EL3 Secure monitor support */
     ARM_FEATURE_V8_SHA1, /* implements SHA1 part of v8 Crypto Extensions */
     ARM_FEATURE_V8_SHA256, /* implements SHA256 part of v8 Crypto Extensions */
+    ARM_FEATURE_V8_PMULL, /* implements PMULL part of v8 Crypto Extensions */
 };

 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index cccda74..f0d2722 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -186,36 +186,6 @@ uint64_t HELPER(simd_tbl)(CPUARMState *env, uint64_t result, uint64_t indices,
     return result;
 }

-/* Helper function for 64 bit polynomial multiply case:
- * perform PolynomialMult(op1, op2) and return either the top or
- * bottom half of the 128 bit result.
- */
-uint64_t HELPER(neon_pmull_64_lo)(uint64_t op1, uint64_t op2)
-{
-    int bitnum;
-    uint64_t res = 0;
-
-    for (bitnum = 0; bitnum < 64; bitnum++) {
-        if (op1 & (1ULL << bitnum)) {
-            res ^= op2 << bitnum;
-        }
-    }
-    return res;
-}
-
-uint64_t HELPER(neon_pmull_64_hi)(uint64_t op1, uint64_t op2)
-{
-    int bitnum;
-    uint64_t res = 0;
-
-    /* bit 0 of op1 can't influence the high 64 bits at all */
-    for (bitnum = 1; bitnum < 64; bitnum++) {
-        if (op1 & (1ULL << bitnum)) {
-            res ^= op2 >> (64 - bitnum);
-        }
-    }
-    return res;
-}
-
 /* 64bit/double versions of the neon float compare functions */
 uint64_t HELPER(neon_ceq_f64)(float64 a, float64 b, void *fpstp)
 {
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index 3f05bed..8de7536 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -28,8 +28,6 @@ DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
 DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, ptr)
 DEF_HELPER_FLAGS_5(simd_tbl, TCG_CALL_NO_RWG_SE, i64, env, i64, i64, i32, i32)
-DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
-DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_3(vfp_mulxs, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
 DEF_HELPER_FLAGS_3(vfp_mulxd, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
 DEF_HELPER_FLAGS_3(neon_ceq_f64, TCG_CALL_NO_RWG, i64, i64, i64, ptr)
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 113b09d..0ef8fca 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -525,6 +525,9 @@ DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_2(dc_zva, void, env, i64)

+DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #endif
diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c
index 492e500..47d13e9 100644
--- a/target-arm/neon_helper.c
+++ b/target-arm/neon_helper.c
@@ -2211,3 +2211,33 @@ void HELPER(neon_zip16)(CPUARMState *env, uint32_t rd, uint32_t rm)
     env->vfp.regs[rm] = make_float64(m0);
     env->vfp.regs[rd] = make_float64(d0);
 }
+
+/* Helper function for 64 bit polynomial multiply case:
+ * perform PolynomialMult(op1, op2) and return either the top or
+ * bottom half of the 128 bit result.
+ */
+uint64_t HELPER(neon_pmull_64_lo)(uint64_t op1, uint64_t op2)
+{
+    int bitnum;
+    uint64_t res = 0;
+
+    for (bitnum = 0; bitnum < 64; bitnum++) {
+        if (op1 & (1ULL << bitnum)) {
+            res ^= op2 << bitnum;
+        }
+    }
+    return res;
+}
+
+uint64_t HELPER(neon_pmull_64_hi)(uint64_t op1, uint64_t op2)
+{
+    int bitnum;
+    uint64_t res = 0;
+
+    /* bit 0 of op1 can't influence the high 64 bits at all */
+    for (bitnum = 1; bitnum < 64; bitnum++) {
+        if (op1 & (1ULL << bitnum)) {
+            res ^= op2 >> (64 - bitnum);
+        }
+    }
+    return res;
+}
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 7124606..41c3fc7 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -5977,7 +5977,7 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
             {0, 0, 0, 9}, /* VQDMLSL */
             {0, 0, 0, 0}, /* Integer VMULL */
             {0, 0, 0, 1}, /* VQDMULL */
-            {0, 0, 0, 15}, /* Polynomial VMULL */
+            {0, 0, 0, 0xa}, /* Polynomial VMULL */
             {0, 0, 0, 7}, /* Reserved: always UNDEF */
         };

@@ -5996,6 +5996,30 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
             return 1;
         }

+        /* Handle VMULL.P64 (Polynomial 64x64 to 128 bit multiply)
+         * outside the loop below as it only performs a single pass.
+         */
+        if (op == 14 && size == 2) {
+            TCGv_i64 tcg_rn, tcg_rm, tcg_rd;
+
+            if (!arm_feature(env, ARM_FEATURE_V8_PMULL)) {
+                return 1;
+            }
+            tcg_rn = tcg_temp_new_i64();
+            tcg_rm = tcg_temp_new_i64();
+            tcg_rd = tcg_temp_new_i64();
+            neon_load_reg64(tcg_rn, rn);
+            neon_load_reg64(tcg_rm, rm);
+            gen_helper_neon_pmull_64_lo(tcg_rd, tcg_rn, tcg_rm);
+            neon_store_reg64(tcg_rd, rd);
+            gen_helper_neon_pmull_64_hi(tcg_rd, tcg_rn, tcg_rm);
+            neon_store_reg64(tcg_rd, rd + 1);
+            tcg_temp_free_i64(tcg_rn);
+            tcg_temp_free_i64(tcg_rm);
+            tcg_temp_free_i64(tcg_rd);
+            return 0;
+        }
+
         /* Avoid overlapping operands.  Wide source operands are
            always aligned so will never overlap with wide
            destinations in problematic ways.  */