From patchwork Mon Mar 17 22:12:03 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 26414
From: Peter Maydell
To: Anthony Liguori
Date: Mon, 17 Mar 2014 22:12:03 +0000
Message-Id: <1395094341-19339-13-git-send-email-peter.maydell@linaro.org>
In-Reply-To: <1395094341-19339-1-git-send-email-peter.maydell@linaro.org>
References: <1395094341-19339-1-git-send-email-peter.maydell@linaro.org>
Cc: Blue Swirl, Andreas Färber, qemu-devel@nongnu.org, Aurelien Jarno
Subject: [Qemu-devel] [PULL 12/30] target-arm: A64: Implement SADDLP, UADDLP, SADALP, UADALP

Implement the SADDLP, UADDLP, SADALP and UADALP instructions in
the SIMD 2-reg misc category.

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
Message-id: 1394822294-14837-8-git-send-email-peter.maydell@linaro.org
---
 target-arm/helper-a64.c    | 61 +++++++++++++++++++++++++++++++++++++
 target-arm/helper-a64.h    |  4 +++
 target-arm/translate-a64.c | 75 +++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 139 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 8f53223..c31c45e 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -293,3 +293,64 @@ float64 HELPER(rsqrtsf_f64)(float64 a, float64 b, void *fpstp)
     }
     return float64_muladd(a, b, float64_three, float_muladd_halve_result, fpst);
 }
+
+/* Pairwise long add: add pairs of adjacent elements into
+ * double-width elements in the result (eg _s8 is an 8x8->16 op)
+ */
+uint64_t HELPER(neon_addlp_s8)(uint64_t a)
+{
+    uint64_t nsignmask = 0x0080008000800080ULL;
+    uint64_t wsignmask = 0x8000800080008000ULL;
+    uint64_t elementmask = 0x00ff00ff00ff00ffULL;
+    uint64_t tmp1, tmp2;
+    uint64_t res, signres;
+
+    /* Extract odd elements, sign extend each to a 16 bit field */
+    tmp1 = a & elementmask;
+    tmp1 ^= nsignmask;
+    tmp1 |= wsignmask;
+    tmp1 = (tmp1 - nsignmask) ^ wsignmask;
+    /* Ditto for the even elements */
+    tmp2 = (a >> 8) & elementmask;
+    tmp2 ^= nsignmask;
+    tmp2 |= wsignmask;
+    tmp2 = (tmp2 - nsignmask) ^ wsignmask;
+
+    /* calculate the result by summing bits 0..14, 16..30, etc,
+     * and then adjusting the sign bits 15, 31, etc manually.
+     * This ensures the addition can't overflow the 16 bit field.
+     */
+    signres = (tmp1 ^ tmp2) & wsignmask;
+    res = (tmp1 & ~wsignmask) + (tmp2 & ~wsignmask);
+    res ^= signres;
+
+    return res;
+}
+
+uint64_t HELPER(neon_addlp_u8)(uint64_t a)
+{
+    uint64_t tmp;
+
+    tmp = a & 0x00ff00ff00ff00ffULL;
+    tmp += (a >> 8) & 0x00ff00ff00ff00ffULL;
+    return tmp;
+}
+
+uint64_t HELPER(neon_addlp_s16)(uint64_t a)
+{
+    int32_t reslo, reshi;
+
+    reslo = (int32_t)(int16_t)a + (int32_t)(int16_t)(a >> 16);
+    reshi = (int32_t)(int16_t)(a >> 32) + (int32_t)(int16_t)(a >> 48);
+
+    return (uint32_t)reslo | (((uint64_t)reshi) << 32);
+}
+
+uint64_t HELPER(neon_addlp_u16)(uint64_t a)
+{
+    uint64_t tmp;
+
+    tmp = a & 0x0000ffff0000ffffULL;
+    tmp += (a >> 16) & 0x0000ffff0000ffffULL;
+    return tmp;
+}
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index a113d22..88fc9fe 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -39,3 +39,7 @@ DEF_HELPER_FLAGS_3(recpsf_f32, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
 DEF_HELPER_FLAGS_3(recpsf_f64, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
 DEF_HELPER_FLAGS_3(rsqrtsf_f32, TCG_CALL_NO_RWG, f32, f32, f32, ptr)
 DEF_HELPER_FLAGS_3(rsqrtsf_f64, TCG_CALL_NO_RWG, f64, f64, f64, ptr)
+DEF_HELPER_FLAGS_1(neon_addlp_s8, TCG_CALL_NO_RWG_SE, i64, i64)
+DEF_HELPER_FLAGS_1(neon_addlp_u8, TCG_CALL_NO_RWG_SE, i64, i64)
+DEF_HELPER_FLAGS_1(neon_addlp_s16, TCG_CALL_NO_RWG_SE, i64, i64)
+DEF_HELPER_FLAGS_1(neon_addlp_u16, TCG_CALL_NO_RWG_SE, i64, i64)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index f8cae69..4562fac 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -81,6 +81,7 @@ typedef void NeonGenNarrowEnvFn(TCGv_i32, TCGv_ptr, TCGv_i64);
 typedef void NeonGenWidenFn(TCGv_i64, TCGv_i32);
 typedef void NeonGenTwoSingleOPFn(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
 typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
 
 /* initialize TCG globals.  */
 void a64_translate_init(void)
@@ -8456,6 +8457,78 @@ static void handle_rev(DisasContext *s, int opcode, bool u,
     }
 }
 
+static void handle_2misc_pairwise(DisasContext *s, int opcode, bool u,
+                                  bool is_q, int size, int rn, int rd)
+{
+    /* Implement the pairwise operations from 2-misc:
+     * SADDLP, UADDLP, SADALP, UADALP.
+     * These all add pairs of elements in the input to produce a
+     * double-width result element in the output (possibly accumulating).
+     */
+    bool accum = (opcode == 0x6);
+    int maxpass = is_q ? 2 : 1;
+    int pass;
+    TCGv_i64 tcg_res[2];
+
+    if (size == 2) {
+        /* 32 + 32 -> 64 op */
+        TCGMemOp memop = size + (u ? 0 : MO_SIGN);
+
+        for (pass = 0; pass < maxpass; pass++) {
+            TCGv_i64 tcg_op1 = tcg_temp_new_i64();
+            TCGv_i64 tcg_op2 = tcg_temp_new_i64();
+
+            tcg_res[pass] = tcg_temp_new_i64();
+
+            read_vec_element(s, tcg_op1, rn, pass * 2, memop);
+            read_vec_element(s, tcg_op2, rn, pass * 2 + 1, memop);
+            tcg_gen_add_i64(tcg_res[pass], tcg_op1, tcg_op2);
+            if (accum) {
+                read_vec_element(s, tcg_op1, rd, pass, MO_64);
+                tcg_gen_add_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
+            }
+
+            tcg_temp_free_i64(tcg_op1);
+            tcg_temp_free_i64(tcg_op2);
+        }
+    } else {
+        for (pass = 0; pass < maxpass; pass++) {
+            TCGv_i64 tcg_op = tcg_temp_new_i64();
+            NeonGenOneOpFn *genfn;
+            static NeonGenOneOpFn * const fns[2][2] = {
+                { gen_helper_neon_addlp_s8, gen_helper_neon_addlp_u8 },
+                { gen_helper_neon_addlp_s16, gen_helper_neon_addlp_u16 },
+            };
+
+            genfn = fns[size][u];
+
+            tcg_res[pass] = tcg_temp_new_i64();
+
+            read_vec_element(s, tcg_op, rn, pass, MO_64);
+            genfn(tcg_res[pass], tcg_op);
+
+            if (accum) {
+                read_vec_element(s, tcg_op, rd, pass, MO_64);
+                if (size == 0) {
+                    gen_helper_neon_addl_u16(tcg_res[pass],
+                                             tcg_res[pass], tcg_op);
+                } else {
+                    gen_helper_neon_addl_u32(tcg_res[pass],
+                                             tcg_res[pass], tcg_op);
+                }
+            }
+            tcg_temp_free_i64(tcg_op);
+        }
+    }
+    if (!is_q) {
+        tcg_res[1] = tcg_const_i64(0);
+    }
+    for (pass = 0; pass < 2; pass++) {
+        write_vec_element(s, tcg_res[pass], rd, pass, MO_64);
+        tcg_temp_free_i64(tcg_res[pass]);
+    }
+}
+
 /* C3.6.17 AdvSIMD two reg misc
  *   31  30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
  * +---+---+---+-----------+------+-----------+--------+-----+------+------+
@@ -8510,7 +8583,7 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
             unallocated_encoding(s);
             return;
         }
-        unsupported_encoding(s, insn);
+        handle_2misc_pairwise(s, opcode, u, is_q, size, rn, rd);
         return;
     case 0x13: /* SHLL, SHLL2 */
         if (u == 0 || size == 3) {
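
Note (not part of the patch): the masking trick in neon_addlp_s8 is easy to
get wrong, so here is a minimal standalone sketch that compares it with a
plain per-element loop. The names addlp_s8_swar, addlp_s8_ref, sext8 and
check_addlp_s8 are hypothetical and exist only for this illustration. The
sketch assumes nothing from QEMU beyond the arithmetic visible in the helper
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standalone copy of the signed 8+8->16 pairwise add used by the helper. */
static uint64_t addlp_s8_swar(uint64_t a)
{
    const uint64_t nsignmask = 0x0080008000800080ULL;   /* bit 7 of each field */
    const uint64_t wsignmask = 0x8000800080008000ULL;   /* bit 15 of each field */
    const uint64_t elementmask = 0x00ff00ff00ff00ffULL;
    uint64_t lo, hi, signres, res;

    /* Low byte of each 16 bit field, sign extended within the field */
    lo = a & elementmask;
    lo = (((lo ^ nsignmask) | wsignmask) - nsignmask) ^ wsignmask;
    /* High byte of each 16 bit field, treated the same way */
    hi = (a >> 8) & elementmask;
    hi = (((hi ^ nsignmask) | wsignmask) - nsignmask) ^ wsignmask;

    /* Add bits 0..14 of each field, then fix up bit 15 separately so
     * that no carry can leak into the neighbouring field.
     */
    signres = (lo ^ hi) & wsignmask;
    res = (lo & ~wsignmask) + (hi & ~wsignmask);
    return res ^ signres;
}

/* Sign extend the low 8 bits of v without relying on narrowing casts. */
static int sext8(uint64_t v)
{
    int b = (int)(v & 0xff);
    return b >= 0x80 ? b - 0x100 : b;
}

/* Reference: handle the four 16 bit result fields one at a time. */
static uint64_t addlp_s8_ref(uint64_t a)
{
    uint64_t res = 0;
    int i;

    for (i = 0; i < 4; i++) {
        int sum = sext8(a >> (16 * i)) + sext8(a >> (16 * i + 8));
        res |= (uint64_t)((unsigned)sum & 0xffffu) << (16 * i);
    }
    return res;
}

/* Compare both versions on a few edge cases; returns the number checked. */
static size_t check_addlp_s8(void)
{
    static const uint64_t samples[] = {
        0x0000000000000000ULL, 0xffffffffffffffffULL,
        0x80807f7fff010001ULL, 0x7f807f807f807f80ULL,
        0x0123456789abcdefULL, 0x8000800080008000ULL,
    };
    size_t i, n = sizeof(samples) / sizeof(samples[0]);

    for (i = 0; i < n; i++) {
        assert(addlp_s8_swar(samples[i]) == addlp_s8_ref(samples[i]));
    }
    return n;
}
```

Calling check_addlp_s8() from a small main() exercises both versions. As one
worked case, the input 0x80807f7fff010001 has byte pairs (-128, -128),
(127, 127), (1, -1) and (1, 0) from the top field down, so both functions
return 0xff0000fe00000001.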