From patchwork Thu Jun 21 01:53:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139403 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1469999lji; Wed, 20 Jun 2018 18:57:22 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJkObHgIIfTl3qgW/BmpDyRGLqBh48Ov9msE7sQkXeKBIhqDqNFNQGNHsVtRocZ7RG8d3FC X-Received: by 2002:ac8:7349:: with SMTP id q9-v6mr22010572qtp.132.1529546241942; Wed, 20 Jun 2018 18:57:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546241; cv=none; d=google.com; s=arc-20160816; b=B5P4ou9WP8+hqPh/1nNrl0DqGMTqdoydYG9CPXIz0SJjnJl+Mr71IkjEvSol1Ne8w+ ZV9IR7goYOWkJsGHDFA5zGTKaYxINU4uKEoX9gsuVpX5qbkoYufpbyoH4NG+MoblOjEn SqZ/koGcwDhnMThiZmVch9QH0MWYJY3Rpy/sv4n9xSVj6ZsepXZJMAnElf66iuATuc2i Kz+oausjfNxooA9IjfTV2l41mzqxEvR8e+bmzKnI9x0zgo3OBckTufbTdaG1vhxuTeUi 3ekCvZ/gNfPvXwWzzegvHD6pQ3EITWNQxhXKkiu2gYA+39S9i/oMn1XUzhK5hfJ2Htmk B/cQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=TtyyOtHe5R2tPmByYKa9JiSI9Mt4avEsP8TOldIsB08=; b=xh+7iSV8jHvC4A0dix2xPoB4zutNKlj27sP3xB3+WP9phur9fqgPVvLApBkXAK4sR+ qv7kv3VxGyau+4eNEw8XN5NekYfRSD73InxqwMpCt7sjk+qM0yAEfF1+tUtVQZ1gc40I vqsS4Pu/p1T0Lv5+KLlB4BYTFoV7hg8RNwpFtBtUw4RkEVaGUAFwq7mrrAEdT0ORK54g AdIMPlHG0DkUj6J0k35cuWlgG5vucd7s6yjaUPkoYxlHYYRTfphbqB2p/Cjr6hWkdZIF mb00dpidy3G6VZGdR5tUcu8cfU/poXLd/xSuyjMRwj2x9LQvE7WaarCSIa4eASqGS94j PVPA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Pc1lF5Yw; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id s5-v6si2539093qtj.203.2018.06.20.18.57.21 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 18:57:21 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Pc1lF5Yw; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52493 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoqf-0006eT-Bz for patch@linaro.org; Wed, 20 Jun 2018 21:57:21 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39102) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonb-0003wx-8g for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonZ-0002uA-6v for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:11 -0400 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:45790) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonY-0002t6-Sw for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:09 -0400 Received: by mail-pl0-x244.google.com with SMTP id o18-v6so58224pll.12 for ; Wed, 20 Jun 2018 18:54:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TtyyOtHe5R2tPmByYKa9JiSI9Mt4avEsP8TOldIsB08=; b=Pc1lF5YwMM3AeAh2r3jwPwQTED0OxlgWv6erb3xYZTSrnqyWm7/jr1uFvuJHJxRHV9 ybsm6KBUuP7aJhth7MO8oKIYIISSnV+SL90jOWdC3Ha6hRcQIu3ApkhO70KqwEkFWgsn uSZ9uBv25bqS2jwfb1QwOicxQrTyPyRdMf9Aw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TtyyOtHe5R2tPmByYKa9JiSI9Mt4avEsP8TOldIsB08=; b=e+t1bGCJ8xHLgdXlOFdkCK6YMddt+oO6pr5Rn3IbmdoJx2FFnKUBpf6Z+ThRwN7+JT e7kcEpP6310RyLBhpBtK+O0U/XitbqW+X4vKGG7KRL8X/399eqnsayklm/ut4i3zSoxt xk+0qPsiDdkegg5Q0ox2V1WiCP5ROkNL2FDh5B2Mu/J5Fir1Z94D9ZzP6B1ikQmefXhH ac8W59ll+xUSELmtZ7F3/rqr5GH2eyvdZBGjuwwYKSJivWekoZV5zT5BySFczKpJ7rXf N1IzsChcphEIXgzKTTJezzbfs/vaTarajqQPyd0L5JJFQ/iN+XMONpqhn4bODS5bABpA ABbg== X-Gm-Message-State: APt69E2eKYny7CMtO3lKH3dWK/XYtdGr1tGB9WYQBCoE+bkJ9xasBF+6 3Uwg8qvU/O2Nw8AhH1nPHtwR3qeaQZo= X-Received: by 2002:a17:902:b706:: with SMTP id d6-v6mr26453597pls.105.1529546047442; Wed, 20 Jun 2018 18:54:07 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.05 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:25 -1000 Message-Id: <20180621015359.12018-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v5 01/35] target/arm: Implement SVE Memory Contiguous Load Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 35 +++++++++ target/arm/sve_helper.c | 153 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 121 +++++++++++++++++++++++++++++ target/arm/sve.decode | 34 +++++++++ 4 files changed, 343 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2e76084992..fcc9ba5f50 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -719,3 +719,38 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 128bbf9b04..4e6ad282f9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2810,3 +2810,156 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } + +/* + * Load contiguous data, protected by a governing predicate. + */ +#define DO_LD1(NAME, FN, TYPEE, TYPEM, H) \ +static void do_##NAME(CPUARMState *env, void *vd, void *vg, \ + target_ulong addr, intptr_t oprsz, \ + uintptr_t ra) \ +{ \ + intptr_t i = 0; \ + do { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + m = FN(env, addr, ra); \ + } \ + *(TYPEE *)(vd + H(i)) = m; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += sizeof(TYPEM); \ + } while (i & 15); \ + } while (i < oprsz); \ +} \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + do_##NAME(env, &env->vfp.zregs[simd_data(desc)], vg, \ + addr, simd_oprsz(desc), GETPC()); \ +} + +#define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 2 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD3(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0, m3 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + *(TYPEE *)(d3 + H(i)) = m3; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 3 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD4(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0, m3 = 0, m4 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \ + m4 = FN(env, addr + 3 * sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + *(TYPEE *)(d3 + H(i)) = m3; \ + *(TYPEE *)(d4 + H(i)) = m4; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 4 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +DO_LD1(sve_ld1bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2) +DO_LD1(sve_ld1bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2) +DO_LD1(sve_ld1bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4) +DO_LD1(sve_ld1bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4) +DO_LD1(sve_ld1bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, ) +DO_LD1(sve_ld1bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, ) + +DO_LD1(sve_ld1hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4) +DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int8_t, H1_4) +DO_LD1(sve_ld1hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, ) +DO_LD1(sve_ld1hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, ) + +DO_LD1(sve_ld1sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, ) +DO_LD1(sve_ld1sds_r, cpu_ldl_data_ra, uint64_t, int32_t, ) + +DO_LD1(sve_ld1bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD2(sve_ld2bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD3(sve_ld3bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD4(sve_ld4bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) + +DO_LD1(sve_ld1hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD2(sve_ld2hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD3(sve_ld3hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD4(sve_ld4hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) + +DO_LD1(sve_ld1ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD2(sve_ld2ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD3(sve_ld3ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD4(sve_ld4ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) + +DO_LD1(sve_ld1dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) +DO_LD2(sve_ld2dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) +DO_LD3(sve_ld3dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) +DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) + +#undef DO_LD1 +#undef DO_LD2 +#undef DO_LD3 +#undef DO_LD4 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 226c97579c..3543daff48 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -42,6 +42,8 @@ typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32); + /* * Helpers for extracting complex instruction fields. */ @@ -82,6 +84,15 @@ static inline int expand_imm_sh8u(int x) return (uint8_t)x << (x & 0x100 ? 8 : 0); } +/* Convert a 2-bit memory size (msz) to a 4-bit data type (dtype) + * with unsigned data. C.f. SVE Memory Contiguous Load Group. + */ +static inline int msz_dtype(int msz) +{ + static const uint8_t dtype[4] = { 0, 5, 10, 15 }; + return dtype[msz]; +} + /* * Include the generated decoder. */ @@ -3526,3 +3537,113 @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn) } return true; } + +/* + *** SVE Memory - Contiguous Load Group + */ + +/* The memory mode of the dtype. */ +static const TCGMemOp dtype_mop[16] = { + MO_UB, MO_UB, MO_UB, MO_UB, + MO_SL, MO_UW, MO_UW, MO_UW, + MO_SW, MO_SW, MO_UL, MO_UL, + MO_SB, MO_SB, MO_SB, MO_Q +}; + +#define dtype_msz(x) (dtype_mop[x] & MO_SIZE) + +/* The vector element size of dtype. */ +static const uint8_t dtype_esz[16] = { + 0, 1, 2, 3, + 3, 1, 2, 3, + 3, 2, 2, 3, + 3, 2, 1, 3 +}; + +static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + gen_helper_gvec_mem *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_pg; + TCGv_i32 desc; + + /* For e.g. LD4, there are not enough arguments to pass all 4 + * registers as pointers, so encode the regno into the data field. + * For consistency, do this even for LD1. + */ + desc = tcg_const_i32(simd_desc(vsz, vsz, zt)); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + fn(cpu_env, t_pg, addr, desc); + + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); +} + +static void do_ld_zpa(DisasContext *s, int zt, int pg, + TCGv_i64 addr, int dtype, int nreg) +{ + static gen_helper_gvec_mem * const fns[16][4] = { + { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, + gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, + { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_r, gen_helper_sve_ld2hh_r, + gen_helper_sve_ld3hh_r, gen_helper_sve_ld4hh_r }, + { gen_helper_sve_ld1hsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_r, gen_helper_sve_ld2ss_r, + gen_helper_sve_ld3ss_r, gen_helper_sve_ld4ss_r }, + { gen_helper_sve_ld1sdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_r, gen_helper_sve_ld2dd_r, + gen_helper_sve_ld3dd_r, gen_helper_sve_ld4dd_r }, + }; + gen_helper_gvec_mem *fn = fns[dtype][nreg]; + + /* While there are holes in the table, they are not + * accessible via the instruction encoding. + */ + assert(fn != NULL); + do_mem_zpa(s, zt, pg, addr, fn); +} + +static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + if (a->rm == 31) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), + (a->nreg + 1) << dtype_msz(a->dtype)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg); + } + return true; +} + +static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + if (sve_access_check(s)) { + int vsz = vec_full_reg_size(s); + int elements = vsz >> dtype_esz[a->dtype]; + TCGv_i64 addr = new_tmp_a64(s); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), + (a->imm * elements * (a->nreg + 1)) + << dtype_msz(a->dtype)); + do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg); + } + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6f436f9096..cfb12da639 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -45,6 +45,9 @@ # Unsigned 8-bit immediate, optionally shifted left by 8. %sh8_i8u 5:9 !function=expand_imm_sh8u +# Unsigned load of msz into esz=2, represented as a dtype. +%msz_dtype 23:2 !function=msz_dtype + # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. %reg_movprfx 0:5 @@ -71,6 +74,8 @@ &incdec2_cnt rd rn pat esz imm d u &incdec_pred rd pg esz d u &incdec2_pred rd rn pg esz d u +&rprr_load rd pg rn rm dtype nreg +&rpri_load rd pg rn imm dtype nreg ########################################################################### # Named instruction formats. These are generally used to @@ -170,6 +175,15 @@ @incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ &incdec2_pred rn=%reg_movprfx +# Loads; user must fill in NREG. +@rprr_load_dt ....... dtype:4 rm:5 ... pg:3 rn:5 rd:5 &rprr_load +@rpri_load_dt ....... dtype:4 . imm:s4 ... pg:3 rn:5 rd:5 &rpri_load + +@rprr_load_msz ....... .... rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_load dtype=%msz_dtype +@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ + &rpri_load dtype=%msz_dtype + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -665,3 +679,23 @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9 # SVE load vector register LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 + +### SVE Memory Contiguous Load Group + +# SVE contiguous load (scalar plus scalar) +LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0 + +# SVE contiguous load (scalar plus immediate) +LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=0 + +# SVE contiguous non-temporal load (scalar plus scalar) +# LDNT1B, LDNT1H, LDNT1W, LDNT1D +# SVE load multiple structures (scalar plus scalar) +# LD2B, LD2H, LD2W, LD2D; etc. +LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz + +# SVE contiguous non-temporal load (scalar plus immediate) +# LDNT1B, LDNT1H, LDNT1W, LDNT1D +# SVE load multiple structures (scalar plus immediate) +# LD2B, LD2H, LD2W, LD2D; etc. +LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz From patchwork Thu Jun 21 01:53:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139401 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1468345lji; Wed, 20 Jun 2018 18:54:42 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIxzOKnRWhPqbFWs4tlwvDRN9PsjJIJyK2jvl8sjLbAeYlhLQpcKYVXPGkMqnlWqSwNWoMG X-Received: by 2002:a37:2709:: with SMTP id n9-v6mr19543213qkn.416.1529546082869; Wed, 20 Jun 2018 18:54:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546082; cv=none; d=google.com; s=arc-20160816; b=R3suRs4Eb32uC97mznkeVZtX6ZiWEUiVFDZZO7+g/+BEchaWaJ4eHCed0Ym+oziWbi /WdcTJXiK7N5lVxjVlwiQa741YXTUdt7QCqJo/6T+aKwCydqo23G2OxGt1JhFrhD4ZI6 LctvWnqQ712iec4K09v5JUEio4IRCuAYyGCr5iuojmASmz43mZ4SddStT1KQv8iHYV7L K6z94OZI0Wc70HjyHkzek4Wq9ar/+gsAbomOguM+aWkKYcBjGC/yVoT5xVK3XboWTdiI pvRFvE8ezfmZEj56gQBmWABXatj+9Mk6uvC+H8c3RdFwb3p6OTMkA7sg4a42jPAMjbmE SKAQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=Gdgl5W4Y8Z1LGarHK+ke+sNkDHVsnZS3jcF1WRaF47o=; b=qQwZ7psQNycTEbLGoz7MJDcWaCVWXpif6wXIsD8LHbnIgJPxIZgNwZkTBy/pZfiHUG snK3UuZ+aG/MtW0a47Y5azKu4hRPS0K48X2AgNdP0QzrtgYwQZe0tP7rvb9iKNrDPwE9 ufsd8qm2UwYEez/Posh3j1qUNuS7STHDt0baNGjKDi8vObAMvdcIJK15WEK4BjNlfwRb WLDtWL6D680NmDvwrfewJFzS+MFzc7rsIrtT7gpl1KeJqNpPory6xrocRcrw9D7CYnni Zkt7zM1OHoKqIXgWzd/P7FxCG0TO1O2rtMfNcY5d5ypv+e3vWmVYaehPJ22KaPGMlXCI BPtA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=TRzZFb7A; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id i196-v6si29616qki.231.2018.06.20.18.54.42 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 18:54:42 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=TRzZFb7A; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52478 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo6-0003zU-85 for patch@linaro.org; Wed, 20 Jun 2018 21:54:42 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39118) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonc-0003xF-MG for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonb-0002wF-0L for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:12 -0400 Received: from mail-pl0-x229.google.com ([2607:f8b0:400e:c01::229]:43894) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVona-0002vM-O9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:10 -0400 Received: by mail-pl0-x229.google.com with SMTP id c41-v6so745871plj.10 for ; Wed, 20 Jun 2018 18:54:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Gdgl5W4Y8Z1LGarHK+ke+sNkDHVsnZS3jcF1WRaF47o=; b=TRzZFb7ArDesOmDb1til8eQb2dKP0bLliOzqL+QGt8G/rWDB4/bTO/GHqMFKrwOcb4 WhqRl2MLiyKgJOetljbTLSM340fd6nWDxKC2EkJC7eo/ausfk+9C4hfZQnCZifX3FtHo moRLX7hlAOSOVOI+HtQRs1giduSmXanUn9+xA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Gdgl5W4Y8Z1LGarHK+ke+sNkDHVsnZS3jcF1WRaF47o=; b=C/pUVyEww9rgNT8tmeCfkf4uY58cHZI0ujqwDB9VxQxW6kpr2y5rDuUcliIuf9AdWG 82S3e5y0WIQqGPapNkJQAf8I2nROM0vs+StlKlurts8ohdEGa+TNwwa0uNp0IWF4bcSC zWC9AWzyryFiaBecTz0SVDZPrWLGHbvMLZga5r5pXG1Ezm+ZFNKBqT2vz+Qd/t7va+W6 W808H1tkiIR970+Y0OaKO1zalA7SFXYdH3qjd+cebJgIUDL2puatBvRk9f1Uq+gAplg3 +ggsC6hXeACHYu5/fKZl75uU45FrV6LWZErzGAk1E27SZlcYiApfkJMxlQBMbN6dxDxb NGoQ== X-Gm-Message-State: APt69E2PqTcQ2nIOt8KbnV4y9t1496K+m4Qj570bAvLwcYby+vajm82x bhmu1qZtEyulRDkfDwi2v/6XJvtqJ7M= X-Received: by 2002:a17:902:b285:: with SMTP id u5-v6mr26781847plr.183.1529546049415; Wed, 20 Jun 2018 18:54:09 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.07 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:08 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:26 -1000 Message-Id: <20180621015359.12018-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::229 Subject: [Qemu-devel] [PATCH v5 02/35] target/arm: Implement SVE Contiguous Load, first-fault and no-fault X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 40 ++++++++++ target/arm/sve_helper.c | 156 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 69 ++++++++++++++++ target/arm/sve.decode | 6 ++ 4 files changed, 271 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fcc9ba5f50..7338abbbcf 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -754,3 +754,43 @@ DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4e6ad282f9..6e1b539ce3 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2963,3 +2963,159 @@ DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) #undef DO_LD2 #undef DO_LD3 #undef DO_LD4 + +/* + * Load contiguous data, first-fault and no-fault. + */ + +#ifdef CONFIG_USER_ONLY + +/* Fault on byte I. All bits in FFR from I are cleared. The vector + * result from I is CONSTRAINED UNPREDICTABLE; we choose the MERGE + * option, which leaves subsequent data unchanged. + */ +static void __attribute__((cold)) +record_fault(CPUARMState *env, intptr_t i, intptr_t oprsz) +{ + uint64_t *ffr = env->vfp.pregs[FFR_PRED_NUM].p; + if (i & 63) { + ffr[i / 64] &= MAKE_64BIT_MASK(0, (i & 63) - 1); + i = ROUND_UP(i, 64); + } + for (; i < oprsz; i += 64) { + ffr[i / 64] = 0; + } +} + +/* Hold the mmap lock during the operation so that there is no race + * between page_check_range and the load operation. We expect the + * usual case to have no faults at all, so we check the whole range + * first and if successful defer to the normal load operation. + * + * TODO: Change mmap_lock to a rwlock so that multiple readers + * can run simultaneously. This will probably help other uses + * within QEMU as well. + */ +#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \ +static void do_sve_ldff1##PART(CPUARMState *env, void *vd, void *vg, \ + target_ulong addr, intptr_t oprsz, \ + bool first, uintptr_t ra) \ +{ \ + intptr_t i = 0; \ + do { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + if (!first && \ + page_check_range(addr, sizeof(TYPEM), PAGE_READ)) { \ + record_fault(env, i, oprsz); \ + return; \ + } \ + m = FN(env, addr, ra); \ + first = false; \ + } \ + *(TYPEE *)(vd + H(i)) = m; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += sizeof(TYPEM); \ + } while (i & 15); \ + } while (i < oprsz); \ +} \ +void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + unsigned rd = simd_data(desc); \ + void *vd = &env->vfp.zregs[rd]; \ + mmap_lock(); \ + if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \ + do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \ + } else { \ + do_sve_ldff1##PART(env, vd, vg, addr, oprsz, true, GETPC()); \ + } \ + mmap_unlock(); \ +} + +/* No-fault loads are like first-fault loads without the + * first faulting special case. + */ +#define DO_LDNF1(PART) \ +void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + unsigned rd = simd_data(desc); \ + void *vd = &env->vfp.zregs[rd]; \ + mmap_lock(); \ + if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \ + do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \ + } else { \ + do_sve_ldff1##PART(env, vd, vg, addr, oprsz, false, GETPC()); \ + } \ + mmap_unlock(); \ +} + +#else + +/* TODO: System mode is not yet supported. + * This would probably use tlb_vaddr_to_host. + */ +#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \ +void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + g_assert_not_reached(); \ +} + +#define DO_LDNF1(PART) \ +void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + g_assert_not_reached(); \ +} + +#endif + +DO_LDFF1(bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LDFF1(bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2) +DO_LDFF1(bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2) +DO_LDFF1(bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4) +DO_LDFF1(bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4) +DO_LDFF1(bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, ) +DO_LDFF1(bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, ) + +DO_LDFF1(hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LDFF1(hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4) +DO_LDFF1(hss_r, cpu_ldsw_data_ra, uint32_t, int8_t, H1_4) +DO_LDFF1(hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, ) +DO_LDFF1(hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, ) + +DO_LDFF1(ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LDFF1(sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, ) +DO_LDFF1(sds_r, cpu_ldl_data_ra, uint64_t, int32_t, ) + +DO_LDFF1(dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) + +#undef DO_LDFF1 + +DO_LDNF1(bb_r) +DO_LDNF1(bhu_r) +DO_LDNF1(bhs_r) +DO_LDNF1(bsu_r) +DO_LDNF1(bss_r) +DO_LDNF1(bdu_r) +DO_LDNF1(bds_r) + +DO_LDNF1(hh_r) +DO_LDNF1(hsu_r) +DO_LDNF1(hss_r) +DO_LDNF1(hdu_r) +DO_LDNF1(hds_r) + +DO_LDNF1(ss_r) +DO_LDNF1(sdu_r) +DO_LDNF1(sds_r) + +DO_LDNF1(dd_r) + +#undef DO_LDNF1 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3543daff48..09f77b5405 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3647,3 +3647,72 @@ static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) } return true; } + +static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + static gen_helper_gvec_mem * const fns[16] = { + gen_helper_sve_ldff1bb_r, + gen_helper_sve_ldff1bhu_r, + gen_helper_sve_ldff1bsu_r, + gen_helper_sve_ldff1bdu_r, + + gen_helper_sve_ldff1sds_r, + gen_helper_sve_ldff1hh_r, + gen_helper_sve_ldff1hsu_r, + gen_helper_sve_ldff1hdu_r, + + gen_helper_sve_ldff1hds_r, + gen_helper_sve_ldff1hss_r, + gen_helper_sve_ldff1ss_r, + gen_helper_sve_ldff1sdu_r, + + gen_helper_sve_ldff1bds_r, + gen_helper_sve_ldff1bss_r, + gen_helper_sve_ldff1bhs_r, + gen_helper_sve_ldff1dd_r, + }; + + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]); + } + return true; +} + +static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + static gen_helper_gvec_mem * const fns[16] = { + gen_helper_sve_ldnf1bb_r, + gen_helper_sve_ldnf1bhu_r, + gen_helper_sve_ldnf1bsu_r, + gen_helper_sve_ldnf1bdu_r, + + gen_helper_sve_ldnf1sds_r, + gen_helper_sve_ldnf1hh_r, + gen_helper_sve_ldnf1hsu_r, + gen_helper_sve_ldnf1hdu_r, + + gen_helper_sve_ldnf1hds_r, + gen_helper_sve_ldnf1hss_r, + gen_helper_sve_ldnf1ss_r, + gen_helper_sve_ldnf1sdu_r, + + gen_helper_sve_ldnf1bds_r, + gen_helper_sve_ldnf1bss_r, + gen_helper_sve_ldnf1bhs_r, + gen_helper_sve_ldnf1dd_r, + }; + + if (sve_access_check(s)) { + int vsz = vec_full_reg_size(s); + int elements = vsz >> dtype_esz[a->dtype]; + int off = (a->imm * elements) << dtype_msz(a->dtype); + TCGv_i64 addr = new_tmp_a64(s); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), off); + do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]); + } + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index cfb12da639..afbed57de1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -685,9 +685,15 @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 # SVE contiguous load (scalar plus scalar) LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0 +# SVE contiguous first-fault load (scalar plus scalar) +LDFF1_zprr 1010010 .... ..... 011 ... ..... ..... @rprr_load_dt nreg=0 + # SVE contiguous load (scalar plus immediate) LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=0 +# SVE contiguous non-fault load (scalar plus immediate) +LDNF1_zpri 1010010 .... 1.... 101 ... ..... ..... @rpri_load_dt nreg=0 + # SVE contiguous non-temporal load (scalar plus scalar) # LDNT1B, LDNT1H, LDNT1W, LDNT1D # SVE load multiple structures (scalar plus scalar) From patchwork Thu Jun 21 01:53:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139404 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1470087lji; Wed, 20 Jun 2018 18:57:27 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJS/LP1wf5RwsfMXIa1kV8VXA3fvsjROYDsc/jYGWl6ZNoXvkirtSAgFlih7JrUfi25LLK4 X-Received: by 2002:ac8:2a96:: with SMTP id b22-v6mr21672214qta.415.1529546247672; Wed, 20 Jun 2018 18:57:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546247; cv=none; d=google.com; s=arc-20160816; b=0/MFJRGad44Zdp/eWrlQi3dItFrPm/EERbBdOccnl5i/+aqqFbau/m1egm61YND6yC /uyG9VD0q5KfNWNk7PXWGdnkXLtonCMU3kDyeg7ktHGYoRFVUg9xljMkPmViz96WjD12 PmoHpdOzm42bb0V96A0Sze8pK/LSSG4WUdYSBDPrh9+YdJcu390gC/1GzrS/OdZsVig2 nvYhrP4Fjw3iPvF2cGbIVoqflfyuBM7aiICKreHHLCPazxNihlarAPbmERZspAncjks/ o8RMkCg6cEQW7P47QXZWLGwbTVGHwUA2n91r241I+0eIzTVK9SzFwuiBkxWFrSwEjN0t puxA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=sBIcMa1gXsMhdLwRtMVs9CJ9jMS0ijsQQUBNA3BsdtM=; b=n9O3Nr+kBmTsUyXqMkOZvG1I/GlTFDQyE410j9iV5EJxP1SY77AvJTiYs716XIbXu1 BM4p/VpEKSyrmYX23zWRkqor5UOSS3Mov1fVCpF3DHWkqCiQWFyq2PZUsmsnmNMJOttQ 4qU+dqJMfiRjKIjUOUogC5ql1l0WLzSpxSFnAhhZEIMaa+QMgmUTvraJM8Fq+k+ONo0T 2tL3JljA61e7SMCZYMr3cxvIwR8I5081m+IJBiYN97bUM1+WXdaFuaoSNKV8xFnP9lQa T9pNdHE12c/Q05qsoKRfE+zszfxGzYp6nZ26K7ISeKbHRdrjkjNEqzLh/Y70oaUI7NDy BT2Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=SEbhed5a; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id g13-v6si3574377qtj.99.2018.06.20.18.57.27 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 18:57:27 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=SEbhed5a; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52496 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoqk-0006j7-Vl for patch@linaro.org; Wed, 20 Jun 2018 21:57:27 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39140) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVone-0003yJ-NP for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonc-0002xV-U2 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:14 -0400 Received: from mail-pl0-x233.google.com ([2607:f8b0:400e:c01::233]:35619) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonc-0002x1-KC for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:12 -0400 Received: by mail-pl0-x233.google.com with SMTP id k1-v6so756619plt.2 for ; Wed, 20 Jun 2018 18:54:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=sBIcMa1gXsMhdLwRtMVs9CJ9jMS0ijsQQUBNA3BsdtM=; b=SEbhed5ahN/dd2hWA5H5hW5PR0+mpZvEzXzTL00z6n/J7lNZbBHUOWq7s+Xf2V8LeM WbYkXy55VSDmjN/9LO3+7Rr5XshRafsFUYq1BeZ3HGRksAKLL2WvrEEvRzry+dWhXVBg bjtQQJfXUpdR/u0m22a+0Cml92A05qn0EryfI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=sBIcMa1gXsMhdLwRtMVs9CJ9jMS0ijsQQUBNA3BsdtM=; b=TtG21WBjMND60sO3jqmn8LYZurDjgwDngAm873y8HUYyR5Zmneyt+/nSRODiyonrwx rN7iwFQFEqTmO1bpEYRinJ5AhoZzDV70ySvGJ/9NbjrXlQK70GKJjjQRfEGCn1DDst6u OFu/zGTA07Fau40D8quwfPPzh+zf/Hh0vV8qRhltZtqGQ9tyUKkcOOwc18FqvdAPl/PI 3H/kfJcc9J+WypRi6i7cRe1pvH9zjuF43Cxm5TbcRPaI/5Kxm9mGwB1ezz4yrfJLYviA Zcw9EhCTPLsCZdJXQj0ovFwk2jWawCxpQyHTd1ZuBKI7LS/XnBeehbW45fO8bx7I2I3k 0wUw== X-Gm-Message-State: APt69E2YFhF+o04if+MpKmn+wNpL7uypRxVKndEkV/wY/gGGh6G5Qkg2 X4Wf1lpoUaxnkm0nvzT89mvO17vorK8= X-Received: by 2002:a17:902:5501:: with SMTP id f1-v6mr26161465pli.108.1529546051222; Wed, 20 Jun 2018 18:54:11 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.09 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:10 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:27 -1000 Message-Id: <20180621015359.12018-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::233 Subject: [Qemu-devel] [PATCH v5 03/35] target/arm: Implement SVE Memory Contiguous Store Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 29 +++++ target/arm/sve_helper.c | 211 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 65 ++++++++++++ target/arm/sve.decode | 38 +++++++ 4 files changed, 343 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 7338abbbcf..b768128951 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -794,3 +794,32 @@ DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6e1b539ce3..f20774e240 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3119,3 +3119,214 @@ DO_LDNF1(sds_r) DO_LDNF1(dd_r) #undef DO_LDNF1 + +/* + * Store contiguous data, protected by a governing predicate. + */ +#define DO_ST1(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *vd = &env->vfp.zregs[rd]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m = *(TYPEE *)(vd + H(i)); \ + FN(env, addr, m, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST1_D(NAME, FN, TYPEM) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + uint64_t *d = &env->vfp.zregs[rd].d[0]; \ + uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i += 1) { \ + if (pg[H1(i)] & 1) { \ + FN(env, addr, d[i], ra); \ + } \ + addr += sizeof(TYPEM); \ + } \ +} + +#define DO_ST2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 2 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST3(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + TYPEM m3 = *(TYPEE *)(d3 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 3 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST4(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + TYPEM m3 = *(TYPEE *)(d3 + H(i)); \ + TYPEM m4 = *(TYPEE *)(d4 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \ + FN(env, addr + 3 * sizeof(TYPEM), m4, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 4 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +DO_ST1(sve_st1bh_r, cpu_stb_data_ra, uint16_t, uint8_t, H1_2) +DO_ST1(sve_st1bs_r, cpu_stb_data_ra, uint32_t, uint8_t, H1_4) +DO_ST1_D(sve_st1bd_r, cpu_stb_data_ra, uint8_t) + +DO_ST1(sve_st1hs_r, cpu_stw_data_ra, uint32_t, uint16_t, H1_4) +DO_ST1_D(sve_st1hd_r, cpu_stw_data_ra, uint16_t) + +DO_ST1_D(sve_st1sd_r, cpu_stl_data_ra, uint32_t) + +DO_ST1(sve_st1bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST2(sve_st2bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST3(sve_st3bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST4(sve_st4bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) + +DO_ST1(sve_st1hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST2(sve_st2hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST3(sve_st3hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST4(sve_st4hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) + +DO_ST1(sve_st1ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST2(sve_st2ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST3(sve_st3ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST4(sve_st4ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) + +DO_ST1_D(sve_st1dd_r, cpu_stq_data_ra, uint64_t) + +void HELPER(sve_st2dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + } + addr += 2 * 8; + } +} + +void HELPER(sve_st3dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + cpu_stq_data_ra(env, addr + 16, d3[i], ra); + } + addr += 3 * 8; + } +} + +void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint64_t *d4 = &env->vfp.zregs[(rd + 3) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + cpu_stq_data_ra(env, addr + 16, d3[i], ra); + cpu_stq_data_ra(env, addr + 24, d4[i], ra); + } + addr += 4 * 8; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 09f77b5405..b25fe96b77 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3716,3 +3716,68 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) } return true; } + +static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + int msz, int esz, int nreg) +{ + static gen_helper_gvec_mem * const fn_single[4][4] = { + { gen_helper_sve_st1bb_r, gen_helper_sve_st1bh_r, + gen_helper_sve_st1bs_r, gen_helper_sve_st1bd_r }, + { NULL, gen_helper_sve_st1hh_r, + gen_helper_sve_st1hs_r, gen_helper_sve_st1hd_r }, + { NULL, NULL, + gen_helper_sve_st1ss_r, gen_helper_sve_st1sd_r }, + { NULL, NULL, NULL, gen_helper_sve_st1dd_r }, + }; + static gen_helper_gvec_mem * const fn_multiple[3][4] = { + { gen_helper_sve_st2bb_r, gen_helper_sve_st2hh_r, + gen_helper_sve_st2ss_r, gen_helper_sve_st2dd_r }, + { gen_helper_sve_st3bb_r, gen_helper_sve_st3hh_r, + gen_helper_sve_st3ss_r, gen_helper_sve_st3dd_r }, + { gen_helper_sve_st4bb_r, gen_helper_sve_st4hh_r, + gen_helper_sve_st4ss_r, gen_helper_sve_st4dd_r }, + }; + gen_helper_gvec_mem *fn; + + if (nreg == 0) { + /* ST1 */ + fn = fn_single[msz][esz]; + } else { + /* ST2, ST3, ST4 -- msz == esz, enforced by encoding */ + assert(msz == esz); + fn = fn_multiple[nreg - 1][msz]; + } + assert(fn != NULL); + do_mem_zpa(s, zt, pg, addr, fn); +} + +static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t insn) +{ + if (a->rm == 31 || a->msz > a->esz) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << a->msz); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg); + } + return true; +} + +static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn) +{ + if (a->msz > a->esz) { + return false; + } + if (sve_access_check(s)) { + int vsz = vec_full_reg_size(s); + int elements = vsz >> a->esz; + TCGv_i64 addr = new_tmp_a64(s); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), + (a->imm * elements * (a->nreg + 1)) << a->msz); + do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg); + } + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index afbed57de1..6e159faaec 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -27,6 +27,7 @@ %imm7_22_16 22:2 16:5 %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 +%size_23 23:2 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -76,6 +77,8 @@ &incdec2_pred rd rn pg esz d u &rprr_load rd pg rn rm dtype nreg &rpri_load rd pg rn imm dtype nreg +&rprr_store rd pg rn rm msz esz nreg +&rpri_store rd pg rn imm msz esz nreg ########################################################################### # Named instruction formats. These are generally used to @@ -184,6 +187,12 @@ @rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ &rpri_load dtype=%msz_dtype +# Stores; user must fill in ESZ, MSZ, NREG as needed. +@rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store +@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store +@rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_store nreg=0 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -705,3 +714,32 @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz # SVE load multiple structures (scalar plus immediate) # LD2B, LD2H, LD2W, LD2D; etc. LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz + +### SVE Memory Store Group + +# SVE contiguous store (scalar plus immediate) +# ST1B, ST1H, ST1W, ST1D; require msz <= esz +ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \ + @rpri_store_msz nreg=0 + +# SVE contiguous store (scalar plus scalar) +# ST1B, ST1H, ST1W, ST1D; require msz <= esz +# Enumerate msz lest we conflict with STR_zri. +ST_zprr 1110010 00 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=0 +ST_zprr 1110010 01 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=1 +ST_zprr 1110010 10 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=2 +ST_zprr 1110010 11 11 ..... 010 ... ..... ..... \ + @rprr_store msz=3 esz=3 nreg=0 + +# SVE contiguous non-temporal store (scalar plus immediate) (nreg == 0) +# SVE store multiple structures (scalar plus immediate) (nreg != 0) +ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \ + @rpri_store_msz esz=%size_23 + +# SVE contiguous non-temporal store (scalar plus scalar) (nreg == 0) +# SVE store multiple structures (scalar plus scalar) (nreg != 0) +ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \ + @rprr_store esz=%size_23 From patchwork Thu Jun 21 01:53:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139407 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1471999lji; Wed, 20 Jun 2018 19:00:19 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIXfsCij695VfyHGhBH539i+J/nClMaBQ7Tyw1O/97CWHLdAmYjwJNYp23hGi5O3GquSTdM X-Received: by 2002:ac8:2f99:: with SMTP id l25-v6mr20402808qta.166.1529546419006; Wed, 20 Jun 2018 19:00:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546419; cv=none; d=google.com; s=arc-20160816; b=Nu1rtgBldX05TWQimWzNFurdpZ51WBujn8Ezn7n7K2cafXLpq+7mknTrH2XcXXOJsl +nLLLvb9vV2nLCFD0dKHR5GoS1LPMdmPmfeDAsoqDbs679yyDhIk/nMX+R1dAWAcqG9O qrw6w+GvNB9oeGc7+nqOZx1z/aZM0WCcH4N1I8s2piuC8rtNnFVVlDCIuLKFnym9syof Oleds/9MaiFxIgI4F3H5AUbHPMTTs//pKk2FqbgYBNX1bfqlandMguFKvQH09jnFtzEQ okG0K/8AgCMl+j2YG/jSNfGWqcXUAtJWxDBfjq3jCzhMQZZgrgbyvFE739+VB3G08F3i IueA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=MIKKw3KUuErnHai2QdqLnnCjJ3TTAEPOo7LFHNMeMjQ=; b=j/3cIYf8JRWmetp9Vk2FYSi7B8YZt141ChiU884d3H+sy0h7mhuh2CujQjrakiCJZn J7h5SQhFOyqeaaQetJIMEbMZ2VTQ31GJzkCv7yN+vs4hbx0l+aUnLNiR4fheOj0DkKhi Gstcm9cPOBF4mnBAXZ/7xseL23BpZjNNBk9640DWbiRJgfiD6e5Srj6jrOrbLth4BwOW 3GQkFLXdKN5x4pUtB5xe/hvUKzjpNTRGYqSJomBV6feBydI6mFHNXXyBAEPxPh6HBAWI IVupI+xJKCgrJmcocU30B/7dGJAbyzpcLxkiZkpo8V5CxNGK6XSY1FI+JcpgqbXYWp+G BwhQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="Exx/AZ1e"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id o19-v6si3145354qtb.380.2018.06.20.19.00.18 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:00:18 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="Exx/AZ1e"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52506 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVotW-0000UX-Ee for patch@linaro.org; Wed, 20 Jun 2018 22:00:18 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39146) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonf-0003zF-GI for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVone-0002yc-JI for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:15 -0400 Received: from mail-pf0-x234.google.com ([2607:f8b0:400e:c00::234]:41725) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVone-0002xp-DW for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:14 -0400 Received: by mail-pf0-x234.google.com with SMTP id a11-v6so683565pff.8 for ; Wed, 20 Jun 2018 18:54:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=MIKKw3KUuErnHai2QdqLnnCjJ3TTAEPOo7LFHNMeMjQ=; b=Exx/AZ1ezbHNXloM/rQKKAhGXVwwQbf8w2RLsNmYkRhuRm8qZqCGQoMUAf79ON8ENG bGLPAU/03nPbDkNRZV36TiFuMDMPdbn+WCrpFT5ws3+K/dczqnCtikNctVftzc9EWEa7 rZVxD7oe1Y8Nce56ZGSO76UM+iSTiDYQB+MDM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=MIKKw3KUuErnHai2QdqLnnCjJ3TTAEPOo7LFHNMeMjQ=; b=oz+8SNvtaV4nTqMaWDYjgAx4jFkkVatTbzhr/ikEMTLhXTGf6oc0+ew0X1eV76WgbN GPd/QN3VB+lQNsvSDLanJCC1fgppqUtHRKxQrX2Yay4ICmb+OUTdnr0vxh2KKwW/gSeb lEHvYjItiGRZiTAslrV90Rar3plbHe8ZNYFx7GBaxVrvQHwMLu4wfsa4d9z2c91Z9cvG TcQ4Mk2sPtGjLHqDTdqTNLtHP2ZC0degpYQoE6CjZxqPSpmEWrk4KK7f33dotbuaFrDu U21Q2AOEtBj4ZlEGtpxbqiReGzypmH/jdo7molBq6SjluV9Ka7EOlrJL722kEMoI6h2J lbZA== X-Gm-Message-State: APt69E35P5dg4tm3OQGIrJbXgPeVbfATKpT6ovjlhY3F2rqlYzM7FopB 1f5uMCalljPe+JEDzVbk/H/keIc4ZUc= X-Received: by 2002:a62:4556:: with SMTP id s83-v6mr25352737pfa.73.1529546053055; Wed, 20 Jun 2018 18:54:13 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.11 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:12 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:28 -1000 Message-Id: <20180621015359.12018-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::234 Subject: [Qemu-devel] [PATCH v5 04/35] target/arm: Implement SVE load and broadcast quadword X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 52 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 9 +++++++ 2 files changed, 61 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b25fe96b77..83de87ee0e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3717,6 +3717,58 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) return true; } +static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) +{ + static gen_helper_gvec_mem * const fns[4] = { + gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r, + gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_pg; + TCGv_i32 desc; + + /* Load the first quadword using the normal predicated load helpers. */ + desc = tcg_const_i32(simd_desc(16, 16, zt)); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + fns[msz](cpu_env, t_pg, addr, desc); + + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); + + /* Replicate that first quadword. */ + if (vsz > 16) { + unsigned dofs = vec_full_reg_offset(s, zt); + tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16); + } +} + +static bool trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + if (a->rm == 31) { + return false; + } + if (sve_access_check(s)) { + int msz = dtype_msz(a->dtype); + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ldrq(s, a->rd, a->pg, addr, msz); + } + return true; +} + +static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16); + do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype)); + } + return true; +} + static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz, int esz, int nreg) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6e159faaec..606c4f623c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -715,6 +715,15 @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz # LD2B, LD2H, LD2W, LD2D; etc. LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz +# SVE load and broadcast quadword (scalar plus scalar) +LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \ + @rprr_load_msz nreg=0 + +# SVE load and broadcast quadword (scalar plus immediate) +# LD1RQB, LD1RQH, LD1RQS, LD1RQD +LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ + @rpri_load_msz nreg=0 + ### SVE Memory Store Group # SVE contiguous store (scalar plus immediate) From patchwork Thu Jun 21 01:53:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139405 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1470374lji; Wed, 20 Jun 2018 18:57:51 -0700 (PDT) X-Google-Smtp-Source: AAOMgpeTEdFFeRkRhG5TcsWkyF5g7CLYoDALGVyIxWeaXVEZoXp5Kzly91fxZ6f+G6r6d7rYrcVw X-Received: by 2002:ac8:928:: with SMTP id t37-v6mr449386qth.109.1529546271172; Wed, 20 Jun 2018 18:57:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546271; cv=none; d=google.com; s=arc-20160816; b=HZBw/6X2bX2K6mngQe6wlb/YpG54jDM9Q6nrr5AYf0ugjhSR6KoPrwhW9cztNjCvXP GRdNe1ZakCTdafsZ0a8QlilTvLT1Yoy0BEzvRps6aekO+p+LdJn+HhgGT4oWbASfKZaf mNKmuRs/9zv2er9ReZoXj47/31lLWm4LnJsb5dx4Anv0nTkxsKWaX0YpDGfHbEYfTy5y WkBRlDn6oiYGFh5lNvSsdyfVPGwmU4Sr4Xf4GoeoyGqqyq1nuCpNUMxcT4bWw9cLocvr FNyt2xzeql8/oL6eYQZwHgl7oOKzGy4ir/tbGnQ+vYYHJmm7KHojBYWqSMj2uSni1HIL EifQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=jeoMd9BTrp3hGuVBUBWNDT8zmWrdd5+QUIlWX/rjX9U=; b=O4XzKPxMqc/++2HRa2w4IxByJYqE4OQ6kUs1lzr3mmqdbKPxbz/B6mUEkvELyoAow+ AaoAxNxJE0xLCYTImNu7p37cwJD0WKEB/JreCY1komVoOIEcfQXeqRxlgBHHkiKJLlR0 qxJqzgpmVvczbPPVegAKnFrVY0n5jMh3TL1XeafYjeIygMpUfPdhQ8ffLLMCTYE8yzG+ 74GH5DZyfGrfSvGVA51SVxd9xO1aAsuF3UZShQLRQRwlfVXKC0I+8PgyoGsAlNFs6zzh 8Ads/XvadX+qNZNh+nQ73hLfjEKo/uPg8GZmP5U0Fa+Nutu4l+UUn4u5uylZyl+sBMx2 2Sdw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=CqeRnEsx; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id c56-v6si3788531qtc.342.2018.06.20.18.57.50 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 18:57:51 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=CqeRnEsx; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52494 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVor8-0006hx-Ff for patch@linaro.org; Wed, 20 Jun 2018 21:57:50 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39168) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonk-000444-2O for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVong-000300-6b for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:20 -0400 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:36275) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonf-0002zP-UI for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:16 -0400 Received: by mail-pl0-x244.google.com with SMTP id a7-v6so757304plp.3 for ; Wed, 20 Jun 2018 18:54:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=jeoMd9BTrp3hGuVBUBWNDT8zmWrdd5+QUIlWX/rjX9U=; b=CqeRnEsxsXAV8dgaNdD8BcGACuEXpBxwBoooJMM5T1vP9JHFCXACf3deCWcfr5b5yN 4q3ntELPvZD3hKTGBO3I1CvTGn0Ycf4+yk9Ty07GZyfW12qoZuiuPoFw6wSYj+yIxgVu X1S9d3MMd6wCyOKpg0V6N/fxnU8XqJ7YqmwqE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jeoMd9BTrp3hGuVBUBWNDT8zmWrdd5+QUIlWX/rjX9U=; b=eBwkaoQGj4N8QUPadXugyjROHIOFYVPcSBmvH+b+C9gUPsMxrr3ST7y4b/TTYQNLnM ywjI/RWx20vrAM08WBWAaLTHE5+wa6nYI0QdLvCzgDA6y0VAMJlFNDQacezRekKrDoUt CMbcsQR8Nmx0DccKETuFZNA+gUBivgMYgPeEtRmJbx4q29IriqzDbE+wJco1CEXnipZU A5cK0gx/WiSuyWPivGH0lQNu6I0K3+z/0ea4eBKRzV98EYqNksZajBX0tSbRT1JcBsTw V8F2pZSa9gT6rfLHLe877+Q2xGCxwtb/Aypk+UFc/7+uVd/x453jmvuik7V/J+Pw/CtX h4jw== X-Gm-Message-State: APt69E1xsg2K5krm2rMOo/S0u3S8lQOuZMPKw1y0vgaiGJ1Ue9iQhqcf 7vKpOaL8y+yZMlZ0PGtueqq4NN4Y+80= X-Received: by 2002:a17:902:9f84:: with SMTP id g4-v6mr26463657plq.339.1529546054641; Wed, 20 Jun 2018 18:54:14 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.13 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:13 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:29 -1000 Message-Id: <20180621015359.12018-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v5 05/35] target/arm: Implement SVE integer convert to floating-point X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 30 +++++++++++++ target/arm/sve_helper.c | 38 ++++++++++++++++ target/arm/translate-sve.c | 90 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 22 ++++++++++ 4 files changed, 180 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b768128951..185112e1d2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,36 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f20774e240..a2f034820a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,44 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Fully general two-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPE); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) +DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) +DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) +DO_ZPZ_FP(sve_scvt_sd, uint64_t, , int32_to_float64) +DO_ZPZ_FP(sve_scvt_dh, uint64_t, , int64_to_float16) +DO_ZPZ_FP(sve_scvt_ds, uint64_t, , int64_to_float32) +DO_ZPZ_FP(sve_scvt_dd, uint64_t, , int64_to_float64) + +DO_ZPZ_FP(sve_ucvt_hh, uint16_t, H1_2, uint16_to_float16) +DO_ZPZ_FP(sve_ucvt_sh, uint32_t, H1_4, uint32_to_float16) +DO_ZPZ_FP(sve_ucvt_ss, uint32_t, H1_4, uint32_to_float32) +DO_ZPZ_FP(sve_ucvt_sd, uint64_t, , uint32_to_float64) +DO_ZPZ_FP(sve_ucvt_dh, uint64_t, , uint64_to_float16) +DO_ZPZ_FP(sve_ucvt_ds, uint64_t, , uint64_to_float32) +DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64) + +#undef DO_ZPZ_FP + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 83de87ee0e..7639e589f5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3425,6 +3425,96 @@ DO_FP3(FRSQRTS, rsqrts) #undef DO_FP3 + +/* + *** SVE Floating Point Unary Operations Prediated Group + */ + +static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, + bool is_fp16, gen_helper_gvec_3_ptr *fn) +{ + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(is_fp16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + pred_full_reg_offset(s, pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + +static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); +} + +static bool trans_SCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_sh); +} + +static bool trans_SCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_dh); +} + +static bool trans_SCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ss); +} + +static bool trans_SCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ds); +} + +static bool trans_SCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_sd); +} + +static bool trans_SCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_dd); +} + +static bool trans_UCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_hh); +} + +static bool trans_UCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_sh); +} + +static bool trans_UCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_dh); +} + +static bool trans_UCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ss); +} + +static bool trans_UCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ds); +} + +static bool trans_UCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_sd); +} + +static bool trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_dd); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 606c4f623c..3abdb87cf5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -133,6 +133,9 @@ @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz +# One register operand, with governing predicate, no vector element size +@rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0 + # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri @@ -681,6 +684,25 @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm +### SVE FP Unary Operations Predicated Group + +# SVE integer convert to floating-point +SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_dh 01100101 01 010 11 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_ss 01100101 10 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_sd 01100101 11 010 00 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_ds 01100101 11 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_dd 01100101 11 010 11 0 101 ... ..... ..... @rd_pg_rn_e0 + +UCVTF_hh 01100101 01 010 01 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_sh 01100101 01 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_dh 01100101 01 010 11 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_ss 01100101 10 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_sd 01100101 11 010 00 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_ds 01100101 11 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_dd 01100101 11 010 11 1 101 ... ..... ..... @rd_pg_rn_e0 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Thu Jun 21 01:53:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139408 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1472160lji; Wed, 20 Jun 2018 19:00:27 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLGE4uaA8eVBfSOtIDZyCwO2QQxOmetTBh+SS0ZG1TFN5QSd9N4EvOBbFQVKpTCRNr30jwy X-Received: by 2002:ae9:e004:: with SMTP id m4-v6mr5503428qkk.436.1529546427829; Wed, 20 Jun 2018 19:00:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546427; cv=none; d=google.com; s=arc-20160816; b=yBJ/gkwtmlj6PibfQ5tuXl51eju4DdWeJlyazloTCJ0YR3rV5ESsqHr1ccQkIT3mHw +H2WpAtM8+lBK43ovxUJTWq/NzTkxRYTgWldQEGPyBH0R/+gMex9j1zGU12B6W2dc7iP hfvGNVsdBdJOYiNY7poxqt5wXXFnBcjN1oyc4Xryd75oiC2aWFHFhUj9DVNMx8zLa4+l fFwtWnwtnHdiT6cqCxMrPpRbY4sm2WKmjJVSDbYZEUFOHGmY0LrpFD8nI1PNPNRhDbMN hdwWkji1uJxJMkJaCsnRs3wDYLz1kiajH0qBfiT+6QEYNh7s8SQ4FnQJR6Mq6F9puJsY gJjA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=AprqyjT71U0BaW5ZsVknsigpNMYP5WHOGkZ3Sh7Oqeo=; b=nXsKjZh8+fc/OvdMFnNaOH8RbHsUSYqw1kBTywmzBoWopUwZ6MrPrHEf54i5VZ7lOn 8vgwqTdAJnacGs0FFIZx+O/OwEyaCjJ7caUA69DLDdLOnni7zJFsbzChUBLsc0VlblcU FFEo7BjHM+zzS6suqsNR7FCh0IUzHSfDf2BR4GodQtSlslRz2aEUGZhQWRw4VPFaEByY O1sC1dDfbZOLGfy8Ibdibx20eKCV9WTp4gE6Xfy6Yt4Hal4HgQ+Q6IErcm1UHf5o93Wt 7Z/xo7H/3S4bN2bLt+QA4rEobuVJocwJME0q7fk3cEaxV7z/rZ++HoKurs/vNNpPdgJZ QsFA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Wrxeh8Ps; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id b187-v6si2319678qke.333.2018.06.20.19.00.27 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:00:27 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Wrxeh8Ps; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52512 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVotf-0000fW-6T for patch@linaro.org; Wed, 20 Jun 2018 22:00:27 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39169) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonk-000445-2Z for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoni-000318-3K for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:20 -0400 Received: from mail-pl0-x233.google.com ([2607:f8b0:400e:c01::233]:45166) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonh-00030p-Qv for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:18 -0400 Received: by mail-pl0-x233.google.com with SMTP id o18-v6so58394pll.12 for ; Wed, 20 Jun 2018 18:54:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=AprqyjT71U0BaW5ZsVknsigpNMYP5WHOGkZ3Sh7Oqeo=; b=Wrxeh8Psd52w4kPEeZmLspqgBP+x6twYEKIo8HC4SeaRilL+vSQBw1Xug7cjfPkYM9 ADOcpHdykwUQSqBGXs5diKclRGZ+nZda8pOCWobgRcNQzfz/BBR4/4cM8LSOpSXKqZ6g F/0uG2QNJZXw9+NJ93Zp5FkUjMZqVn37i/3yY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=AprqyjT71U0BaW5ZsVknsigpNMYP5WHOGkZ3Sh7Oqeo=; b=sa7hJl9EswSeS2493RwDeKM+M5kFZfu7EPccBjknFnKCXdHgRyyuRFw+yw3pqk2wRg v4sdMcm6dOdKZ+p4YATzLLOOPZg/LzyDucTeQ4xS4XtseaCn6VtdcPywC+1XoVabiQXB EZ6dw31IkklwVLPAEmOr4bozwDDzoaRAtH1FdGj3pSn7vDAza7Dy3lw4LFzXJZERU+qu xSFG3Yg5WSJCB12VhZLhWbCVXRCK0L4rWoFF96uIbjLIS5F6fXOj5oeGfTZYITJPTgPe liHOxCFuQPmzo0kPjyDHjBjDZ2hSag+9e99q4Poq1q+Ffrg5j8wD1qEnUx4WPtbY+Ueq pzaw== X-Gm-Message-State: APt69E1+uKGKRiSPFNkqTjqd7bcTPvH9sgfr5luxPR3FvNxH4Fvsk7mP L9q6q3LCf8p4cfzF8GSCGjngvq814yQ= X-Received: by 2002:a17:902:585:: with SMTP id f5-v6mr14171274plf.142.1529546056520; Wed, 20 Jun 2018 18:54:16 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.14 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:30 -1000 Message-Id: <20180621015359.12018-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::233 Subject: [Qemu-devel] [PATCH v5 06/35] target/arm: Implement SVE floating-point arithmetic (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 77 +++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 89 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 46 ++++++++++++++++++++ target/arm/sve.decode | 17 ++++++++ 4 files changed, 229 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 185112e1d2..4097b55f0e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,83 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a2f034820a..6bf30a3e66 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,95 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Fully general three-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPE); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + TYPE mm = *(TYPE *)(vm + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_ZPZZ_FP(sve_fadd_h, uint16_t, H1_2, float16_add) +DO_ZPZZ_FP(sve_fadd_s, uint32_t, H1_4, float32_add) +DO_ZPZZ_FP(sve_fadd_d, uint64_t, , float64_add) + +DO_ZPZZ_FP(sve_fsub_h, uint16_t, H1_2, float16_sub) +DO_ZPZZ_FP(sve_fsub_s, uint32_t, H1_4, float32_sub) +DO_ZPZZ_FP(sve_fsub_d, uint64_t, , float64_sub) + +DO_ZPZZ_FP(sve_fmul_h, uint16_t, H1_2, float16_mul) +DO_ZPZZ_FP(sve_fmul_s, uint32_t, H1_4, float32_mul) +DO_ZPZZ_FP(sve_fmul_d, uint64_t, , float64_mul) + +DO_ZPZZ_FP(sve_fdiv_h, uint16_t, H1_2, float16_div) +DO_ZPZZ_FP(sve_fdiv_s, uint32_t, H1_4, float32_div) +DO_ZPZZ_FP(sve_fdiv_d, uint64_t, , float64_div) + +DO_ZPZZ_FP(sve_fmin_h, uint16_t, H1_2, float16_min) +DO_ZPZZ_FP(sve_fmin_s, uint32_t, H1_4, float32_min) +DO_ZPZZ_FP(sve_fmin_d, uint64_t, , float64_min) + +DO_ZPZZ_FP(sve_fmax_h, uint16_t, H1_2, float16_max) +DO_ZPZZ_FP(sve_fmax_s, uint32_t, H1_4, float32_max) +DO_ZPZZ_FP(sve_fmax_d, uint64_t, , float64_max) + +DO_ZPZZ_FP(sve_fminnum_h, uint16_t, H1_2, float16_minnum) +DO_ZPZZ_FP(sve_fminnum_s, uint32_t, H1_4, float32_minnum) +DO_ZPZZ_FP(sve_fminnum_d, uint64_t, , float64_minnum) + +DO_ZPZZ_FP(sve_fmaxnum_h, uint16_t, H1_2, float16_maxnum) +DO_ZPZZ_FP(sve_fmaxnum_s, uint32_t, H1_4, float32_maxnum) +DO_ZPZZ_FP(sve_fmaxnum_d, uint64_t, , float64_maxnum) + +static inline float16 abd_h(float16 a, float16 b, float_status *s) +{ + return float16_abs(float16_sub(a, b, s)); +} + +static inline float32 abd_s(float32 a, float32 b, float_status *s) +{ + return float32_abs(float32_sub(a, b, s)); +} + +static inline float64 abd_d(float64 a, float64 b, float_status *s) +{ + return float64_abs(float64_sub(a, b, s)); +} + +DO_ZPZZ_FP(sve_fabd_h, uint16_t, H1_2, abd_h) +DO_ZPZZ_FP(sve_fabd_s, uint32_t, H1_4, abd_s) +DO_ZPZZ_FP(sve_fabd_d, uint64_t, , abd_d) + +static inline float64 scalbn_d(float64 a, int64_t b, float_status *s) +{ + int b_int = MIN(MAX(b, INT_MIN), INT_MAX); + return float64_scalbn(a, b_int, s); +} + +DO_ZPZZ_FP(sve_fscalbn_h, int16_t, H1_2, float16_scalbn) +DO_ZPZZ_FP(sve_fscalbn_s, int32_t, H1_4, float32_scalbn) +DO_ZPZZ_FP(sve_fscalbn_d, int64_t, , scalbn_d) + +DO_ZPZZ_FP(sve_fmulx_h, uint16_t, H1_2, helper_advsimd_mulxh) +DO_ZPZZ_FP(sve_fmulx_s, uint32_t, H1_4, helper_vfp_mulxs) +DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd) + +#undef DO_ZPZZ_FP + /* Fully general two-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7639e589f5..4df5360da9 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3425,6 +3425,52 @@ DO_FP3(FRSQRTS, rsqrts) #undef DO_FP3 +/* + *** SVE Floating Point Arithmetic - Predicated Group + */ + +static bool do_zpzz_fp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + if (fn == NULL) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + +#define DO_FP3(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + return do_zpzz_fp(s, a, fns[a->esz]); \ +} + +DO_FP3(FADD_zpzz, fadd) +DO_FP3(FSUB_zpzz, fsub) +DO_FP3(FMUL_zpzz, fmul) +DO_FP3(FMIN_zpzz, fmin) +DO_FP3(FMAX_zpzz, fmax) +DO_FP3(FMINNM_zpzz, fminnum) +DO_FP3(FMAXNM_zpzz, fmaxnum) +DO_FP3(FABD, fabd) +DO_FP3(FSCALE, fscalbn) +DO_FP3(FDIV, fdiv) +DO_FP3(FMULX, fmulx) + +#undef DO_FP3 /* *** SVE Floating Point Unary Operations Prediated Group diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 3abdb87cf5..636212a638 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -684,6 +684,23 @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm +### SVE FP Arithmetic Predicated Group + +# SVE floating-point arithmetic (predicated) +FADD_zpzz 01100101 .. 00 0000 100 ... ..... ..... @rdn_pg_rm +FSUB_zpzz 01100101 .. 00 0001 100 ... ..... ..... @rdn_pg_rm +FMUL_zpzz 01100101 .. 00 0010 100 ... ..... ..... @rdn_pg_rm +FSUB_zpzz 01100101 .. 00 0011 100 ... ..... ..... @rdm_pg_rn # FSUBR +FMAXNM_zpzz 01100101 .. 00 0100 100 ... ..... ..... @rdn_pg_rm +FMINNM_zpzz 01100101 .. 00 0101 100 ... ..... ..... @rdn_pg_rm +FMAX_zpzz 01100101 .. 00 0110 100 ... ..... ..... @rdn_pg_rm +FMIN_zpzz 01100101 .. 00 0111 100 ... ..... ..... @rdn_pg_rm +FABD 01100101 .. 00 1000 100 ... ..... ..... @rdn_pg_rm +FSCALE 01100101 .. 00 1001 100 ... ..... ..... @rdn_pg_rm +FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm +FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR +FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm + ### SVE FP Unary Operations Predicated Group # SVE integer convert to floating-point From patchwork Thu Jun 21 01:53:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139411 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1475156lji; Wed, 20 Jun 2018 19:03:42 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIU4PQSQTJb8flTVLy4Gy48zeuR/o+S2l9817rLzK5Hyf30xQ1IJSQ8/S151EepTtNqrEfo X-Received: by 2002:aed:22cc:: with SMTP id q12-v6mr20800786qtc.226.1529546622556; Wed, 20 Jun 2018 19:03:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546622; cv=none; d=google.com; s=arc-20160816; b=bCKGmgq4SSfHt5c9CrH/DFGPxmLaNRzUi256uV10SGa0XLjyIUcsgZty571hAhwpNV Y5ZCxPJ29+625aY76qmQlBy3tOXj5BGsL6wUMdWhVCiUtyRvbOQ2XyjI5ZJK2BS6G82e lR8moMeZAcOz+31gLua+2MQd4hUAGQVICOmLNfgWzQAMkPEQuP7zqs8arbmCQ9ulYQ/D xcX/Ekjz4k6dFdUEfhV7HuAp6qV/HD+eQOtHVZk/jXLKpPjjrcv56dE2/SatqAEXJRhc 0FRkG1M6En227uZvlUIxwsfw2YTq/dUPOkmvuFx0zHmJiFTPGXIPa+MYD6EAfyjEDRDI J3ZA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=jINFLC02lMv+InyN1iespI0Zzx3CuWqfLW8uGBHW3LI=; b=NDyMql7SMYRNT0mHBO7Kv+1QOpx2emmK9RnhuZtweb2NwgO0wbUsmHilk8LxHmVkpx KwtYKdy4LwsL/KDu9xFtEsNQdap9KVLb9XB0xK79+TLiBawU4T9nwzuE4CZyixKVBg8n VIIT/hP2kroR2i8DDZW7SdX3Nn32AllvVyEInH9i1/2L+fNJ7Xjq4ChQwsa3ksnjAooF QH2GErA5t35CMLZRy8mNeekXiO4l6j2Sbv+UkN8eIMtmPiz0CHVpfC6n8OUWUUvr6C6N DrSzEGUq3n3oJ/FSGiXiqNrGGwMQvFSvKqTiIL5ecqWpeJ6Ad1NTsX9KkyMyv/eZbn31 IWig== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Iu7CDaR5; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id q3-v6si3743136qtk.65.2018.06.20.19.03.42 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:03:42 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Iu7CDaR5; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52525 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVown-0002wc-WD for patch@linaro.org; Wed, 20 Jun 2018 22:03:42 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39184) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonl-00045Z-Kh for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonk-000321-4W for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:21 -0400 Received: from mail-pf0-x234.google.com ([2607:f8b0:400e:c00::234]:38383) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonj-00031i-RW for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:20 -0400 Received: by mail-pf0-x234.google.com with SMTP id b74-v6so688247pfl.5 for ; Wed, 20 Jun 2018 18:54:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=jINFLC02lMv+InyN1iespI0Zzx3CuWqfLW8uGBHW3LI=; b=Iu7CDaR5A0qYg9rvkRicFbrbTGtKK6l0qfEy4W4Wjx8XmZwj4BJr2tdbB/t5sst889 vtd/5KvWw6ae3eDb4rZqM4qG/hQboQDijbyQZOcH8itVczk8NidmA+XXHU5uCdIUJWxS JmEyy6qTc0Up6v9X39STi0HEzeKWciaucAbAE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jINFLC02lMv+InyN1iespI0Zzx3CuWqfLW8uGBHW3LI=; b=TXeSHKRHkzjxGV8dcaqEjriYD6UZKaUqdC/GlV7aNp7BNyw2JxXVOn1Xl3cSBLG243 NkMm9tbZJUP37VKAiWSlE1z9TFm8Uwsz5jABcm5g2pWRVyaAomr/PghXT4w9+y9i95Tw n8+IkfmkuFk2fVk903whWlAyNUK/hisXPVuTj+EM2kr8Pqi8yQ99X+wSQwORY5Z8CPBE 4c+hRlCSvD4fsyv/ya9slbgkoHnyyL1l+Fbbave+rUqpOd0JNPxo+MK8bLf/sjiYHYrP 0QJRfdevGTaNKnzmeYx9pHd2dj5PyNnbw46wrT9v5PA9FGxZxfCMOe0kzRQ/E8fJJNmx uFbA== X-Gm-Message-State: APt69E3a0t5yjyBRrIKou+xoCytecylOrh7zlwAUEBLEV6rSxVgKKQ2P ju5Hgs8U4O3H8t23mSgw8gq+V6MXeQk= X-Received: by 2002:a65:48c9:: with SMTP id o9-v6mr21085398pgs.262.1529546058526; Wed, 20 Jun 2018 18:54:18 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.16 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:17 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:31 -1000 Message-Id: <20180621015359.12018-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::234 Subject: [Qemu-devel] [PATCH v5 07/35] target/arm: Implement SVE FP Multiply-Add Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 ++++ target/arm/sve_helper.c | 158 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 49 ++++++++++++ target/arm/sve.decode | 17 ++++ 4 files changed, 240 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4097b55f0e..eb0645dd43 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -827,6 +827,22 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6bf30a3e66..aeb4ccadd9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2938,6 +2938,164 @@ DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64) #undef DO_ZPZ_FP +/* 4-operand predicated multiply-add. This requires 7 operands to pass + * "properly", so we need to encode some of the registers into DESC. + */ +QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32); + +static void do_fmla_zpzzz_h(CPUARMState *env, void *vg, uint32_t desc, + uint16_t neg1, uint16_t neg3) +{ + intptr_t i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + i -= 2; + if (likely((pg >> (i & 63)) & 1)) { + float16 e1, e2, e3, r; + + e1 = *(uint16_t *)(vn + H1_2(i)) ^ neg1; + e2 = *(uint16_t *)(vm + H1_2(i)); + e3 = *(uint16_t *)(va + H1_2(i)) ^ neg3; + r = float16_muladd(e1, e2, e3, 0, &env->vfp.fp_status); + *(uint16_t *)(vd + H1_2(i)) = r; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_h(env, vg, desc, 0, 0); +} + +void HELPER(sve_fmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0); +} + +void HELPER(sve_fnmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0x8000); +} + +void HELPER(sve_fnmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_h(env, vg, desc, 0, 0x8000); +} + +static void do_fmla_zpzzz_s(CPUARMState *env, void *vg, uint32_t desc, + uint32_t neg1, uint32_t neg3) +{ + intptr_t i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + i -= 4; + if (likely((pg >> (i & 63)) & 1)) { + float32 e1, e2, e3, r; + + e1 = *(uint32_t *)(vn + H1_4(i)) ^ neg1; + e2 = *(uint32_t *)(vm + H1_4(i)); + e3 = *(uint32_t *)(va + H1_4(i)) ^ neg3; + r = float32_muladd(e1, e2, e3, 0, &env->vfp.fp_status); + *(uint32_t *)(vd + H1_4(i)) = r; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_s(env, vg, desc, 0, 0); +} + +void HELPER(sve_fmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0); +} + +void HELPER(sve_fnmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0x80000000); +} + +void HELPER(sve_fnmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_s(env, vg, desc, 0, 0x80000000); +} + +static void do_fmla_zpzzz_d(CPUARMState *env, void *vg, uint32_t desc, + uint64_t neg1, uint64_t neg3) +{ + intptr_t i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + i -= 8; + if (likely((pg >> (i & 63)) & 1)) { + float64 e1, e2, e3, r; + + e1 = *(uint64_t *)(vn + i) ^ neg1; + e2 = *(uint64_t *)(vm + i); + e3 = *(uint64_t *)(va + i) ^ neg3; + r = float64_muladd(e1, e2, e3, 0, &env->vfp.fp_status); + *(uint64_t *)(vd + i) = r; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_d(env, vg, desc, 0, 0); +} + +void HELPER(sve_fmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, 0); +} + +void HELPER(sve_fnmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, INT64_MIN); +} + +void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN); +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4df5360da9..acad6374ef 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3472,6 +3472,55 @@ DO_FP3(FMULX, fmulx) #undef DO_FP3 +typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); + +static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn) +{ + if (fn == NULL) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = vec_full_reg_size(s); + unsigned desc; + TCGv_i32 t_desc; + TCGv_ptr pg = tcg_temp_new_ptr(); + + /* We would need 7 operands to pass these arguments "properly". + * So we encode all the register numbers into the descriptor. + */ + desc = deposit32(a->rd, 5, 5, a->rn); + desc = deposit32(desc, 10, 5, a->rm); + desc = deposit32(desc, 15, 5, a->ra); + desc = simd_desc(vsz, vsz, desc); + + t_desc = tcg_const_i32(desc); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + fn(cpu_env, pg, t_desc); + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(pg); + return true; +} + +#define DO_FMLA(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_sve_fmla * const fns[4] = { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + return do_fmla(s, a, fns[a->esz]); \ +} + +DO_FMLA(FMLA_zpzzz, fmla_zpzzz) +DO_FMLA(FMLS_zpzzz, fmls_zpzzz) +DO_FMLA(FNMLA_zpzzz, fnmla_zpzzz) +DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz) + +#undef DO_FMLA + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 636212a638..70e5a3aeb5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -128,6 +128,8 @@ &rprrr_esz ra=%reg_movprfx @rdn_pg_ra_rm ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \ &rprrr_esz rn=%reg_movprfx +@rdn_pg_rm_ra ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \ + &rprrr_esz rn=%reg_movprfx # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @@ -701,6 +703,21 @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm +### SVE FP Multiply-Add Group + +# SVE floating-point multiply-accumulate writing addend +FMLA_zpzzz 01100101 .. 1 ..... 000 ... ..... ..... @rda_pg_rn_rm +FMLS_zpzzz 01100101 .. 1 ..... 001 ... ..... ..... @rda_pg_rn_rm +FNMLA_zpzzz 01100101 .. 1 ..... 010 ... ..... ..... @rda_pg_rn_rm +FNMLS_zpzzz 01100101 .. 1 ..... 011 ... ..... ..... @rda_pg_rn_rm + +# SVE floating-point multiply-accumulate writing multiplicand +# FMAD, FMSB, FNMAD, FNMS +FMLA_zpzzz 01100101 .. 1 ..... 100 ... ..... ..... @rdn_pg_rm_ra +FMLS_zpzzz 01100101 .. 1 ..... 101 ... ..... ..... @rdn_pg_rm_ra +FNMLA_zpzzz 01100101 .. 1 ..... 110 ... ..... ..... @rdn_pg_rm_ra +FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra + ### SVE FP Unary Operations Predicated Group # SVE integer convert to floating-point From patchwork Thu Jun 21 01:53:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139406 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1471058lji; Wed, 20 Jun 2018 18:58:57 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJQz4fEj4l7niEOOURUK9w9O6FR23Ev+2LY93SwxeJE4iFyMOud+RT/Ten6FvOsAMt8EuPi X-Received: by 2002:a0c:e354:: with SMTP id a20-v6mr20455376qvm.229.1529546337481; Wed, 20 Jun 2018 18:58:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546337; cv=none; d=google.com; s=arc-20160816; b=0PU6o34NTJKuEm5uEmQ5NMVTI0XCCMp5TrFrzvgN9ASyfgJcgdTAzlGOVu124xn4bF /w0vyc2JTI1azbxds0Xz4dzNVkAcpsB7s3A+JPXhjZ0dmgMlgRjNkD604eXFJG4vSsP9 D5SyhLYc6EmdPX8E2a2oNN/QZ1zS9GgLD75CXp7/1VH5fgGD0zsB1h6wstxxNAibPKQQ jzQtHMMLoSdYhOXPXXu5DHjwYFbT+Pt9bQB3o08TuB6ahAwwSbRMKpM90SlFB7SKRUK1 rRS6QTSgK96RjX03LbgyvM6vbLBkZYy3oaF8NcPY09F7AmrquNK+j2p30y9YpwAd4DLK JOCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=zMUCpNHqvYI8Hwem4/9P2JgKC2kBIp0MNfhRl9e2Emo=; b=r96DnxjiR+pNJSN98NEfpmUkPX0JSqKVwxOre758v5By+nuLy1+jKDYS9Ro800wviZ kz7qL7MUd79kNzS3189b14e2Hf2DWvvJTjQLF7rG7a69L1rS4HBOu1Vwz9jHDWTQZyUx 8I7Jb1I5fp1A89IC5r7cctQIfxvShudNLBCr+mEO1J+kv5gYiSeLmhB5COvXdwYIBp7C lCpXo3TZF6XEsxco/g+LO8+URn6Df3Ltzl/tUiNS6YPg+tJ5hnbJ8lU5trQYnRmWFid0 UsztD4g64Uvmz/RKs8MDzR8aqy79YdF7LiEq00O3MQBtL983D8QMllGQ3q1pziXXkt58 Oerg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="X0b2Xv/f"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id l190-v6si508433qkf.37.2018.06.20.18.58.57 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 18:58:57 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b="X0b2Xv/f"; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52498 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVosC-0007T3-UW for patch@linaro.org; Wed, 20 Jun 2018 21:58:57 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39195) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonn-00047F-7A for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonm-00032y-4S for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:23 -0400 Received: from mail-pf0-x22a.google.com ([2607:f8b0:400e:c00::22a]:46298) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonl-00032b-Sf for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:22 -0400 Received: by mail-pf0-x22a.google.com with SMTP id q1-v6so676563pff.13 for ; Wed, 20 Jun 2018 18:54:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=zMUCpNHqvYI8Hwem4/9P2JgKC2kBIp0MNfhRl9e2Emo=; b=X0b2Xv/fpekxUwxARyRF5KM/iKI9DXWj/G0sK9mxEnRpfsERL0kh83FaCzqp4z+sKQ KfvUWm5n2SzFpI39LhkHWYRGIj2cNsiy8fScfv5rQ+C2tzgJ+6AwafZ9ZMZ9l6lMwfbO W+p+pcm8loYraOlmOF/RjtWlZBFq2AVdT4JRw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=zMUCpNHqvYI8Hwem4/9P2JgKC2kBIp0MNfhRl9e2Emo=; b=gY38APSSvgRPFZ+uf6eBbeve6M3FkzbUHsLP4JaBp3fx29tzXJf2G2PoassXojLrAS YgiiqkVElk/gW2E/ura9D5W5G3vlz4AaA4qEebTjgXyHUdh9bZ1ikT7bOOArAmUoUCUM 6M0wMa0ugtrR4qfV4Pmx5NTjJgE6W7DOZSUG34w8dFfJXGlkSmRqafmZLyNnK5xToDEZ jon4TkfrROts5sbim6Lx8I6T5yQ0rF0TZUTor8xzVqpV53GNUUXkpJjtFMfOHORXsmFO sef2Z/iVgLW4jAmDXCopziPa76zmFRtOHUAmsn7McrunxRS9XiRZ/iuubrfxImj0PMpL 2Vtw== X-Gm-Message-State: APt69E3px0yhOWojsMyQHKK3L2QITz8drR4vjMBzjKHYFRi4OzCM0zMo Mthi2UpRzGCf9Xi95W1RrXExfxIEouI= X-Received: by 2002:a65:65ca:: with SMTP id y10-v6mr20821260pgv.359.1529546060529; Wed, 20 Jun 2018 18:54:20 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.19 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:19 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:32 -1000 Message-Id: <20180621015359.12018-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22a Subject: [Qemu-devel] [PATCH v5 08/35] target/arm: Implement SVE Floating Point Accumulating Reduction Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 7 +++++ target/arm/sve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 45 ++++++++++++++++++++++++++++++ target/arm/sve.decode | 5 ++++ 4 files changed, 113 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index eb0645dd43..68e55a8d03 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,13 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index aeb4ccadd9..990e5f3900 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,62 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc); + float16 result = nn; + + do { + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); + do { + if (pg & 1) { + float16 mm = *(float16 *)(vm + H1_2(i)); + result = float16_add(result, mm, status); + } + i += sizeof(float16), pg >>= sizeof(float16); + } while (i & 15); + } while (i < opr_sz); + + return result; +} + +uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc); + float32 result = nn; + + do { + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); + do { + if (pg & 1) { + float32 mm = *(float32 *)(vm + H1_2(i)); + result = float32_add(result, mm, status); + } + i += sizeof(float32), pg >>= sizeof(float32); + } while (i & 15); + } while (i < opr_sz); + + return result; +} + +uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc) / 8; + uint64_t *m = vm; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i++) { + if (pg[H1(i)] & 1) { + nn = float64_add(nn, m[i], status); + } + } + + return nn; +} + /* Fully general three-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index acad6374ef..483ad33179 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3383,6 +3383,51 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Accumulating Reduction Group + */ + +static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + typedef void fadda_fn(TCGv_i64, TCGv_i64, TCGv_ptr, + TCGv_ptr, TCGv_ptr, TCGv_i32); + static fadda_fn * const fns[3] = { + gen_helper_sve_fadda_h, + gen_helper_sve_fadda_s, + gen_helper_sve_fadda_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_rm, t_pg, t_fpst; + TCGv_i64 t_val; + TCGv_i32 t_desc; + + if (a->esz == 0) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + t_val = load_esz(cpu_env, vec_reg_offset(s, a->rn, 0, a->esz), a->esz); + t_rm = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + t_fpst = get_fpstatus_ptr(a->esz == MO_16); + t_desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + + fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_fpst); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(t_rm); + + write_fp_dreg(s, a->rd, t_val); + tcg_temp_free_i64(t_val); + return true; +} + /* *** SVE Floating Point Arithmetic - Unpredicated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 70e5a3aeb5..ba10cddb8a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -676,6 +676,11 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +### SVE FP Accumulating Reduction Group + +# SVE floating-point serial reduction (predicated) +FADDA 01100101 .. 011 000 001 ... ..... ..... @rdn_pg_rm + ### SVE Floating Point Arithmetic - Unpredicated Group # SVE floating-point arithmetic (unpredicated) From patchwork Thu Jun 21 01:53:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139410 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1475123lji; Wed, 20 Jun 2018 19:03:39 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIFs5NVbrAHpmqFnBaVrmJfLff/t+Qkb8gLzJyRlO5xfDU4Omd4uCH1fdoXi6QkCFKCvD2G X-Received: by 2002:ac8:1692:: with SMTP id r18-v6mr21023657qtj.206.1529546619206; Wed, 20 Jun 2018 19:03:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546619; cv=none; d=google.com; s=arc-20160816; b=NlZCGWuLYpCRso4NhJAyTbRAwUnmSi/NSKdl9DzxOfRDXBM6kj21CYgNPWIUcH5rNI IoBWzMh8InQ3njNTMW2GuBsfHCZy4SLH9ahaiPCe3GST0cPLU0fELWuYsQ8z4JEYsRhJ IK4UCOFVP8aYV6ldcHrTTNt9a8YyMmohEuhHoCISCAODLaBLMCbTWWuakAst0YSpnOV7 L49XDHpsPasM7d/Ok1qV6McCi9wkn9wnNh/G2vdk347ORIppfjlm6DExEAyFfVZsv/f3 IVoZCNRrYNVRzfufNdxYv45fJuB37dpOlMgohGP9Ja6LTwyuJ9t5cTuskbY35K/B4Z75 0aCA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=4PBOZ+xP2oHjWmgB6IgGTqH/sc5fgdOszz58FGX7GC4=; b=ya/Q4cbhq9cUmchiUVnBpsx7XDH6TJUh6GOCFjOm7aekBIIoo6EGhNI98d0D3T8Sgf FBTx7CAFuoPWpP5FJP8FYiGAePHh5BFWflUtlkXY4G/a6+yb4e2RcBuVSQrQWB6M3DOy jDmCas0wz0SRz3vDtJ7D9h3Y4fierSH44U8QbJj2k+QKiW9VS59iPV7sFsTVPqgaLUxs VOmGw9B45Qlo34ZxxZGm+ff5Ieq/kEzibtML+cnRk8lRwG0qJvrJzFiR8iINIxuxF8CZ mMAhxqic44br8ArrPYeVmugrdLyMrbJu+YrcDIYZGUhCB8CyTGMAwa3Vo8KSdwAA8cMs K7Xg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Oyf1MzH+; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id e11-v6si3515629qvo.221.2018.06.20.19.03.39 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:03:39 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Oyf1MzH+; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52524 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVowk-0002u6-LR for patch@linaro.org; Wed, 20 Jun 2018 22:03:38 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39206) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonp-00049F-0v for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonn-00033s-OK for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:25 -0400 Received: from mail-pl0-x242.google.com ([2607:f8b0:400e:c01::242]:38157) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonn-00033P-Gj for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:23 -0400 Received: by mail-pl0-x242.google.com with SMTP id d10-v6so757258plo.5 for ; Wed, 20 Jun 2018 18:54:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=4PBOZ+xP2oHjWmgB6IgGTqH/sc5fgdOszz58FGX7GC4=; b=Oyf1MzH+3mQJZTcDMLYUGxG1kN+kOtogDBVfDt545Cjfzb2UnhAMW5H6dkH+SJaOYj mmUJg9HK0mj1Rtu5PCjCT/Ih5yBg4DlxyNnmhxojtNy2FPlpaHU1UBqEGjLqhJv9+ait YC4nj5M+l38Zqlv4AqlwNQ+TUZx1aeb6QtZug= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=4PBOZ+xP2oHjWmgB6IgGTqH/sc5fgdOszz58FGX7GC4=; b=g1BbFQ5Pxuud2wek4EDRhhxFOo4lWq/EnapYjCPqYJrorPWTCwwW3SU/+rBq5XRmdw 6vRH1zHXh1LIPmT6krEtpyGfu30/HUQs27iD4StK4IAnD/3MuQGuhsOu5qtaRsW9ps08 BHI6Y3X6nFeZGG9RFSDnSYwC3q3N4lffC5eqiDZhUxBRj35qhAbMYdIWU0QqfawKqe0W y7SuRGEg20HyP9+4KcsIPviZ6FAuDHxxnSalNcWo4cVo3NP7mfiPA8+6nuQFffyHucbK mHMS0GFE33ZgaLrFz5WWfRCuqSmCxEfQD59HH7BvxWr8FDkfCR+OmnkqynGAZkKjYS6N 1b1g== X-Gm-Message-State: APt69E2tezeKajKGlCIf559vFwdLstTAXpSkwYtvHvsVjDL7+PG8iO0b e2BE7Gh/cZyM5D3IRPRtrW7Ya+XcxiI= X-Received: by 2002:a17:902:64cf:: with SMTP id y15-v6mr26307178pli.53.1529546062178; Wed, 20 Jun 2018 18:54:22 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.20 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:21 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:33 -1000 Message-Id: <20180621015359.12018-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::242 Subject: [Qemu-devel] [PATCH v5 09/35] target/arm: Implement SVE load and broadcast element X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++ target/arm/sve_helper.c | 41 +++++++++++++++++++++++++ target/arm/translate-sve.c | 62 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 5 +++ 4 files changed, 113 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 68e55a8d03..a5d3bb121c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -274,6 +274,11 @@ DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 990e5f3900..a9c98bca32 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -995,6 +995,47 @@ void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc) } } +/* Copy Zn into Zn, and store zero into inactive elements. */ +void HELPER(sve_movz_b)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] & expand_pred_b(pg[H1(i)]); + } +} + +void HELPER(sve_movz_h)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] & expand_pred_h(pg[H1(i)]); + } +} + +void HELPER(sve_movz_s)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] & expand_pred_s(pg[H1(i)]); + } +} + +void HELPER(sve_movz_d)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[1] & -(uint64_t)(pg[H1(i)] & 1); + } +} + /* Three-operand expander, immediate operand, controlled by a predicate. */ #define DO_ZPZI(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 483ad33179..954d6653d3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -606,6 +606,20 @@ static bool do_clr_zp(DisasContext *s, int rd, int pg, int esz) return true; } +/* Copy Zn into Zd, storing zeros into inactive elements. */ +static void do_movz_zpz(DisasContext *s, int rd, int rn, int pg, int esz) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_movz_b, gen_helper_sve_movz_h, + gen_helper_sve_movz_s, gen_helper_sve_movz_d, + }; + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + pred_full_reg_offset(s, pg), + vsz, vsz, 0, fns[esz]); +} + static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a, gen_helper_gvec_3 *fn) { @@ -3999,6 +4013,54 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) return true; } +/* Load and broadcast element. */ +static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = vec_full_reg_size(s); + unsigned psz = pred_full_reg_size(s); + unsigned esz = dtype_esz[a->dtype]; + TCGLabel *over = gen_new_label(); + TCGv_i64 temp; + + /* If the guarding predicate has no bits set, no load occurs. */ + if (psz <= 8) { + /* Reduce the pred_esz_masks value simply to reduce the + * size of the code generated here. + */ + uint64_t psz_mask = MAKE_64BIT_MASK(0, psz * 8); + temp = tcg_temp_new_i64(); + tcg_gen_ld_i64(temp, cpu_env, pred_full_reg_offset(s, a->pg)); + tcg_gen_andi_i64(temp, temp, pred_esz_masks[esz] & psz_mask); + tcg_gen_brcondi_i64(TCG_COND_EQ, temp, 0, over); + tcg_temp_free_i64(temp); + } else { + TCGv_i32 t32 = tcg_temp_new_i32(); + find_last_active(s, t32, esz, a->pg); + tcg_gen_brcondi_i32(TCG_COND_LT, t32, 0, over); + tcg_temp_free_i32(t32); + } + + /* Load the data. */ + temp = tcg_temp_new_i64(); + tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << esz); + tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s), + s->be_data | dtype_mop[a->dtype]); + + /* Broadcast to *all* elements. */ + tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), + vsz, vsz, temp); + tcg_temp_free_i64(temp); + + /* Zero the inactive elements. */ + gen_set_label(over); + do_movz_zpz(s, a->rd, a->rd, a->pg, esz); + return true; +} + static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz, int esz, int nreg) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ba10cddb8a..56039e2193 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -28,6 +28,7 @@ %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 %size_23 23:2 +%dtype_23_13 23:2 13:2 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -750,6 +751,10 @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9 # SVE load vector register LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 +# SVE load and broadcast element +LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \ + &rpri_load dtype=%dtype_23_13 nreg=0 + ### SVE Memory Contiguous Load Group # SVE contiguous load (scalar plus scalar) From patchwork Thu Jun 21 01:53:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139415 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1477604lji; Wed, 20 Jun 2018 19:06:53 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKO+CVpi6w65TkAzOEXIX6HgYc3w2TNDQGMMYY9Mw+Tv0XZE4nWlARZZNTqIltvojKdxvJV X-Received: by 2002:a37:d1d0:: with SMTP id o77-v6mr19291978qkl.185.1529546813796; Wed, 20 Jun 2018 19:06:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546813; cv=none; d=google.com; s=arc-20160816; b=PiyL2SU68FY4GfT01XPYi1eEZuQFfpgCFsLqy67AESiOOKLiqXPlu9lzZQzomQfi+9 tT4lV1CwRrjRBDaZdUE7K69XlwjfpVG2G7iZnqUYCqinlvq43mvpyPSnNk+VBNBe+NpH a7Za8P6i13JCLx9RV+/KJdXWJPbI61ZZ5/q4atbxa7LJEDnWl5bbJgMiFTGZspB+KPjS DvhmCySRODd0g0TM2tDESArJqiFJcPbl8N0ms0H1HQe3WpHZYGYwVkVSlT83V7Y4L/xq p6HUS6gXusso2zNsto98UEgIILS+Tqy7IwUpEqZJeaf5wkIL4JLBfyA+4ZwKdO0ULAKy VYiQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=SEyYQK35r/FvibNeEMXdwWPmyznCQnXYKZpyUq8JXnM=; b=kzmaco/zr80e1ey4iVKyxepsVyI7OLbr4kkwgERsNIc2l2ZUDDp13nwMcfijJgndUG 9w/28+XIWX3vYZ1KXSIWpZa0L/I1mmUc2RfjtP8kVIiMMla8fUYkwqoq+2Ae7YxKRwod cdczSDgyXEm/MXzKUyl4Sgs0zI7TkxF4HCaimxYlO+8w6O0hV/guDz09+62Pg7Qm6w20 AflbFg7knVMq+MHJKTo+DNIENWBP4W3gcB1/43fsbLwBAlLzKSehQp/N9UJIIQLW+zsm MBucnfnwiqU1c6ScGJmL8QIkp0JzBIuty5M4fWLtF1xXnTUSLoVUNNaZT9kUTHaHW0fD i+EA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=IVeJyioO; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id 36-v6si3724432qvp.133.2018.06.20.19.06.53 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:06:53 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=IVeJyioO; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52542 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVozt-0005OV-7l for patch@linaro.org; Wed, 20 Jun 2018 22:06:53 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39217) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonq-0004BX-Gk for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonp-00034m-Eb for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:26 -0400 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:33126) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonp-00034H-7P for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:25 -0400 Received: by mail-pl0-x241.google.com with SMTP id 6-v6so763889plb.0 for ; Wed, 20 Jun 2018 18:54:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=SEyYQK35r/FvibNeEMXdwWPmyznCQnXYKZpyUq8JXnM=; b=IVeJyioO3nu3qs8l61Dr5svG97k3vhpH810KPNoFvYZRIsl+LB62dfo3DXOg8mSLDm LesSgf+TVqzTiNta6a2HhoknS6f8CsAaEjPcp55eJzCvUNohZOEphYzdJR1fwgxfIX/r 4YSUuxKQ7P5vKc6yU2WBACCIsZwsywxCIAAas= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=SEyYQK35r/FvibNeEMXdwWPmyznCQnXYKZpyUq8JXnM=; b=b5VIlntjQxh+RAZc/Bazj43cMtkIITR5oUwHZUrClm14F/gTMEAZCZCXpaZiRqS7Op +iB0U9YADLWh0pxgEqAGep8/1vjUJFlR/eikGQ2h3M5nLpAgAhjqPI1gylmQ2c/aT/A9 9FOjiQIsM8eV+NKcx3WX8LaR7ASE6iPC2yTSXURrispqLJVmrVGthE2tWedHcljd/tI+ FbZzvkPcUondO8DXS4+a7h6PK2gXgOvn/usPCfvI0PeJN4LFhog6AIq4LKOuMul2W9K+ EQTmqy+KT27cXDKutRw1btbcndnAoQb8XrJ4OMrg4Sju6sixuVMEH3bffjBcBSdGcORy E+UA== X-Gm-Message-State: APt69E2BCsLI9fX1Pbl/OaHMnUBteIDt+qT49qnlXDBxAo4QPvLAo1Jn lUgTFxHkIcZmSNnfV8KcKssmjyawduM= X-Received: by 2002:a17:902:b18b:: with SMTP id s11-v6mr26451373plr.190.1529546063925; Wed, 20 Jun 2018 18:54:23 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.22 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:23 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:34 -1000 Message-Id: <20180621015359.12018-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v5 10/35] target/arm: Implement SVE store vector/predicate register X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 103 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 6 +++ 2 files changed, 109 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 954d6653d3..50f1ff75ef 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3762,6 +3762,89 @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len, tcg_temp_free_i64(t0); } +/* Similarly for stores. */ +static void do_str(DisasContext *s, uint32_t vofs, uint32_t len, + int rn, int imm) +{ + uint32_t len_align = QEMU_ALIGN_DOWN(len, 8); + uint32_t len_remain = len % 8; + uint32_t nparts = len / 8 + ctpop8(len_remain); + int midx = get_mem_index(s); + TCGv_i64 addr, t0; + + addr = tcg_temp_new_i64(); + t0 = tcg_temp_new_i64(); + + /* Note that unpredicated load/store of vector/predicate registers + * are defined as a stream of bytes, which equates to little-endian + * operations on larger quantities. There is no nice way to force + * a little-endian store for aarch64_be-linux-user out of line. + * + * Attempt to keep code expansion to a minimum by limiting the + * amount of unrolling done. + */ + if (nparts <= 4) { + int i; + + for (i = 0; i < len_align; i += 8) { + tcg_gen_ld_i64(t0, cpu_env, vofs + i); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i); + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ); + } + } else { + TCGLabel *loop = gen_new_label(); + TCGv_ptr t2, i = tcg_const_local_ptr(0); + + gen_set_label(loop); + + t2 = tcg_temp_new_ptr(); + tcg_gen_add_ptr(t2, cpu_env, i); + tcg_gen_ld_i64(t0, t2, vofs); + + /* Minimize the number of local temps that must be re-read from + * the stack each iteration. Instead, re-compute values other + * than the loop counter. + */ + tcg_gen_addi_ptr(t2, i, imm); + tcg_gen_extu_ptr_i64(addr, t2); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn)); + tcg_temp_free_ptr(t2); + + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ); + + tcg_gen_addi_ptr(i, i, 8); + + tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop); + tcg_temp_free_ptr(i); + } + + /* Predicate register stores can be any multiple of 2. */ + if (len_remain) { + tcg_gen_ld_i64(t0, cpu_env, vofs + len_align); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align); + + switch (len_remain) { + case 2: + case 4: + case 8: + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LE | ctz32(len_remain)); + break; + + case 6: + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUL); + tcg_gen_addi_i64(addr, addr, 4); + tcg_gen_shri_i64(addr, addr, 32); + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUW); + break; + + default: + g_assert_not_reached(); + } + } + tcg_temp_free_i64(addr); + tcg_temp_free_i64(t0); +} + static bool trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn) { if (sve_access_check(s)) { @@ -3782,6 +3865,26 @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn) return true; } +static bool trans_STR_zri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + if (sve_access_check(s)) { + int size = vec_full_reg_size(s); + int off = vec_full_reg_offset(s, a->rd); + do_str(s, off, size, a->rn, a->imm * size); + } + return true; +} + +static bool trans_STR_pri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + if (sve_access_check(s)) { + int size = pred_full_reg_size(s); + int off = pred_full_reg_offset(s, a->rd); + do_str(s, off, size, a->rn, a->imm * size); + } + return true; +} + /* *** SVE Memory - Contiguous Load Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 56039e2193..c088e51493 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -792,6 +792,12 @@ LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ ### SVE Memory Store Group +# SVE store predicate register +STR_pri 1110010 11 0. ..... 000 ... ..... 0 .... @pd_rn_i9 + +# SVE store vector register +STR_zri 1110010 11 0. ..... 010 ... ..... ..... @rd_rn_i9 + # SVE contiguous store (scalar plus immediate) # ST1B, ST1H, ST1W, ST1D; require msz <= esz ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \ From patchwork Thu Jun 21 01:53:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139414 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1477595lji; Wed, 20 Jun 2018 19:06:52 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLZSH5adKcQ0lqJDrkZFiLBmyIhXHcQifeoHNaCC0P4iLm7MCQqS4aIq2aGmP1VA66EjuBx X-Received: by 2002:a0c:bfaa:: with SMTP id s39-v6mr20622624qvj.2.1529546812599; Wed, 20 Jun 2018 19:06:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546812; cv=none; d=google.com; s=arc-20160816; b=poXd8awyjjpWgJfeY+5Ihq1Cbhiunrr3iL6oX2XmdXgjHudOP2M2u2mpO2XAC0iaND qWtrILnlSSwbNOFgPkdI7jg/vE3Ljy/51deviO6zz7auUnOop+VbZhXZohQ4vynqhja8 mraURCfLUMbE7v+iqwbsSr7I4zFVIpykp+cUTFeFhAagP68iOSM+uIa9CEKABSp52I1S sXrr8RnNaFcK9CMMqslGdQk36IVub/9o8CL+RNIp6a1tEpQ9MgplDBX1EPf41lRFuqaZ vvQhSC1y+KKQo9Hb8EuFgbqqx44J86YUOlBgr3ad1ISjE0dBq6Tw/vFrtoHQ59a8nm5O mkvg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=1mQ1TMjGOJewWJHYTgzA7FQK3l6kp40/y15IFPWJw84=; b=sZRgfPui/2qXahISobw+8Hr4A6EZRSftND6fbwRaSPGkh28zgVi+J07VFqbsUPbjfw 2l4BToWg1S14C/r0g4Mp++iS1RKwGrfxmcUqUfU3Q9zsZnlWjROo+Vx3K9AZNxAiexlx mHrhTnIRXUL4dSDN0HVI7gL28lZu9q+MOSFFgfoNm9gDk0IAUyxENHGG0vDD/AjFbij8 IsUnWM04LCupUDObiEj2jWY29433e7FxO7W74801jKkiWg3zah+xk8utO3LXRjVuGiSo YRm6t6ZfqGPDfH30L2y6CrSa+evJdkKMTM1j2zCbKLXffk6gwBZ2pTUYGajB5nl9xDwb E7hw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=IGntk7hp; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id u2-v6si3703280qki.143.2018.06.20.19.06.52 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:06:52 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=IGntk7hp; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52541 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVozs-0005NW-2W for patch@linaro.org; Wed, 20 Jun 2018 22:06:52 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39234) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonu-0004PB-Jm for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonr-000361-8a for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:30 -0400 Received: from mail-pl0-x230.google.com ([2607:f8b0:400e:c01::230]:35617) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonq-00035O-W9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:27 -0400 Received: by mail-pl0-x230.google.com with SMTP id k1-v6so756899plt.2 for ; Wed, 20 Jun 2018 18:54:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=1mQ1TMjGOJewWJHYTgzA7FQK3l6kp40/y15IFPWJw84=; b=IGntk7hpd9zGMATxmEv6WefRle1U8d/B00/WI/Efr+1dQoCLjpEXto9HLxOMHPabo6 VnPHbLLOa9bj9f+PbUfinqdV/xwDUbv+jf+qYVmbqWtNmRZ+phg4JH2o4r6RY+lnfmYn sK8EXITIsFSSuFZ/0bcjndVyFUKBAfTZxFlf0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=1mQ1TMjGOJewWJHYTgzA7FQK3l6kp40/y15IFPWJw84=; b=eY4Dclx64rCK4g8q2kybpJ5xed2v+k2zHXYj6JpVHxXGsfI2AzkPXi2a3JNFHvP9B2 MTxFcZQ4oKdcpdJDwvWe5BrnjEsdY/pJ8CT06cF87BX0jRzkeauYUX0P5ra5j2Vc4mGB bIuxpRl7I59VJLVokQ7b1+OEeEuKcQpx0NjQ7/OqI1XFfYgaXWq6ySzrymfZ2+ZIN3tO TsVk8rpzYTbMVnMiAMKN64JejFpaQ7e8scHqbRrg2vtlxTewXkGBiNg6ELGJOKK2gw8l 64i3llwcPd6rZzjfllaAHEIHr106S+fkqCdWxvuXCSzcL9qAbiE5nBvHHq3O6fv6Elr/ dvQg== X-Gm-Message-State: APt69E3pRi6+ur5mgAZ9YT5tBdhvWtKD/yl0F9X5WfjeoFs2V4F1pRn6 3fhXWhsLHN6I9zqaS7FHQwXaMxF1gDU= X-Received: by 2002:a17:902:650a:: with SMTP id b10-v6mr26566350plk.45.1529546065677; Wed, 20 Jun 2018 18:54:25 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.24 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:24 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:35 -1000 Message-Id: <20180621015359.12018-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::230 Subject: [Qemu-devel] [PATCH v5 11/35] target/arm: Implement SVE scatter stores X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 41 +++++++++++++++++++++ target/arm/sve_helper.c | 62 ++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 74 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 39 ++++++++++++++++++++ 4 files changed, 216 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a5d3bb121c..8880128f9c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -958,3 +958,44 @@ DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a9c98bca32..ed4861a292 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3712,3 +3712,65 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, addr += 4 * 8; } } + +/* Stores with a vector index. */ + +#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + uint32_t *d = vd; TYPEI *m = vm; uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i++) { \ + uint8_t pp = pg[H1(i)]; \ + if (pp & 0x01) { \ + target_ulong off = (target_ulong)m[H4(i * 2)] << scale; \ + FN(env, base + off, d[H4(i * 2)], ra); \ + } \ + if (pp & 0x10) { \ + target_ulong off = (target_ulong)m[H4(i * 2 + 1)] << scale; \ + FN(env, base + off, d[H4(i * 2 + 1)], ra); \ + } \ + } \ +} + +#define DO_ST1_ZPZ_D(NAME, TYPEI, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i++) { \ + if (pg[H1(i)] & 1) { \ + target_ulong off = (target_ulong)(TYPEI)m[i] << scale; \ + FN(env, base + off, d[i], ra); \ + } \ + } \ +} + +DO_ST1_ZPZ_S(sve_stbs_zsu, uint32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_S(sve_sths_zsu, uint32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_S(sve_stss_zsu, uint32_t, cpu_stl_data_ra) + +DO_ST1_ZPZ_S(sve_stbs_zss, int32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_S(sve_sths_zss, int32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_S(sve_stss_zss, int32_t, cpu_stl_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zsu, uint32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zsu, uint32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zsu, uint32_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zsu, uint32_t, cpu_stq_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zss, int32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zss, int32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zss, int32_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zss, int32_t, cpu_stq_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zd, uint64_t, cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zd, uint64_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zd, uint64_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zd, uint64_t, cpu_stq_data_ra) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 50f1ff75ef..6e1907cedd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -43,6 +43,8 @@ typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32); +typedef void gen_helper_gvec_mem_scatter(TCGv_env, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i64, TCGv_i32); /* * Helpers for extracting complex instruction fields. @@ -4228,3 +4230,75 @@ static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn) } return true; } + +/* + *** SVE gather loads / scatter stores + */ + +static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, + TCGv_i64 scalar, gen_helper_gvec_mem_scatter *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, scale)); + TCGv_ptr t_zm = tcg_temp_new_ptr(); + TCGv_ptr t_pg = tcg_temp_new_ptr(); + TCGv_ptr t_zt = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + tcg_gen_addi_ptr(t_zm, cpu_env, vec_full_reg_offset(s, zm)); + tcg_gen_addi_ptr(t_zt, cpu_env, vec_full_reg_offset(s, zt)); + fn(cpu_env, t_zt, t_pg, t_zm, scalar, desc); + + tcg_temp_free_ptr(t_zt); + tcg_temp_free_ptr(t_zm); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); +} + +static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) +{ + /* Indexed by [xs][msz]. */ + static gen_helper_gvec_mem_scatter * const fn32[2][3] = { + { gen_helper_sve_stbs_zsu, + gen_helper_sve_sths_zsu, + gen_helper_sve_stss_zsu, }, + { gen_helper_sve_stbs_zss, + gen_helper_sve_sths_zss, + gen_helper_sve_stss_zss, }, + }; + static gen_helper_gvec_mem_scatter * const fn64[3][4] = { + { gen_helper_sve_stbd_zsu, + gen_helper_sve_sthd_zsu, + gen_helper_sve_stsd_zsu, + gen_helper_sve_stdd_zsu, }, + { gen_helper_sve_stbd_zss, + gen_helper_sve_sthd_zss, + gen_helper_sve_stsd_zss, + gen_helper_sve_stdd_zss, }, + { gen_helper_sve_stbd_zd, + gen_helper_sve_sthd_zd, + gen_helper_sve_stsd_zd, + gen_helper_sve_stdd_zd, }, + }; + gen_helper_gvec_mem_scatter *fn; + + if (a->esz < a->msz || (a->msz == 0 && a->scale)) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + switch (a->esz) { + case MO_32: + fn = fn32[a->xs][a->msz]; + break; + case MO_64: + fn = fn64[a->xs][a->msz]; + break; + default: + g_assert_not_reached(); + } + do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz, + cpu_reg_sp(s, a->rn), fn); + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index c088e51493..2ca0fd85e6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -80,6 +80,7 @@ &rpri_load rd pg rn imm dtype nreg &rprr_store rd pg rn rm msz esz nreg &rpri_store rd pg rn imm msz esz nreg +&rprr_scatter_store rd pg rn rm esz msz xs scale ########################################################################### # Named instruction formats. These are generally used to @@ -198,6 +199,8 @@ @rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store @rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \ &rprr_store nreg=0 +@rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_scatter_store ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -824,3 +827,39 @@ ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \ # SVE store multiple structures (scalar plus scalar) (nreg != 0) ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \ @rprr_store esz=%size_23 + +# SVE 32-bit scatter store (scalar plus 32-bit scaled offsets) +# Require msz > 0 && msz <= esz. +ST1_zprz 1110010 .. 11 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=2 scale=1 +ST1_zprz 1110010 .. 11 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=1 esz=2 scale=1 + +# SVE 32-bit scatter store (scalar plus 32-bit unscaled offsets) +# Require msz <= esz. +ST1_zprz 1110010 .. 10 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=2 scale=0 +ST1_zprz 1110010 .. 10 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=1 esz=2 scale=0 + +# SVE 64-bit scatter store (scalar plus 64-bit scaled offset) +# Require msz > 0 +ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \ + @rprr_scatter_store xs=2 esz=3 scale=1 + +# SVE 64-bit scatter store (scalar plus 64-bit unscaled offset) +ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \ + @rprr_scatter_store xs=2 esz=3 scale=0 + +# SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset) +# Require msz > 0 +ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=3 scale=1 +ST1_zprz 1110010 .. 01 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=1 esz=3 scale=1 + +# SVE 64-bit scatter store (scalar plus unpacked 32-bit unscaled offset) +ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=3 scale=0 +ST1_zprz 1110010 .. 00 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=1 esz=3 scale=0 From patchwork Thu Jun 21 01:53:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139417 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1479556lji; Wed, 20 Jun 2018 19:09:23 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKM2ZdEb4keE5Ybo1nNuyWjvOr7wKu0O6fz0yXBuA+Oti2EjRVF6p2L3oyfW3JijeI/ibNC X-Received: by 2002:aed:3b3b:: with SMTP id p56-v6mr21328002qte.372.1529546963523; Wed, 20 Jun 2018 19:09:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546963; cv=none; d=google.com; s=arc-20160816; b=mlvum8ZrWhUOYkK/IeU1ViN4KJzNgssM+9C+qjU2kybUGWyPtp3T1Z/BMs/vktceZY jG/2M1qgJxxVL3id8IDy/CIjeo8qtoXTRQjsCGcUX7gT2Kqnv4ZuXcHzHNRh7u0rrOPC 2e0paCBkeFPnktkWkBEuUEIRmXTA/j7A/qOdFD/MYf+0NsdT68LGiSAo/D0x425PWvaV gNBDmGrD5SizemlzuAHUuvxeg/JQptGPiPS64/dmWX6L/rQg75+N0fwyU6TjB1FqlvKz m0PXAo7hyHsRVYLLUow9n3owMXSq87TQpGHFaI3deAjhQTbIC7AaMcaB4Z3DdVp0V+du oLgQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=ABjeW7FWo1HVKOTk5GV1YLP/uN+2VusXjD7Yv5Umy/k=; b=WOvQg8v7UsBFkTdrm0chxuUdre+GN7l4hw9fyANechflrJR1dxPXVOa3TOVmibvE5X TQY/VA+WjQVl1UPwXoWpfiGxsXwdPMbneN43I6bqcpNXJAh6/kxmw6C6iRUbfFxghN3+ jciu7J463ErNFS531u6X3G0s3e9SvzhpZq1P26bA72R5I519uOhFEunLmrQm8gjvVkjL //o0PeKZi9sgPU1Pe8BNJpJKnJGbR+Ujl8HIjpq8SOZKWzbz4xG4jALpOOGKmR5oERZ5 0er7592HxmHdXxUhQleU6Fa6YVpcRCLaIyF+geZbE6EctlpamXSiKb7b+XKIaqomWVsk vg9w== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=PiwcLZ7B; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id l38-v6si3808292qta.93.2018.06.20.19.09.23 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:09:23 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=PiwcLZ7B; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52553 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp2I-0007AV-VM for patch@linaro.org; Wed, 20 Jun 2018 22:09:23 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39233) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonu-0004P9-JT for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVons-00036p-U6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:30 -0400 Received: from mail-pf0-x229.google.com ([2607:f8b0:400e:c00::229]:33683) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVons-00036R-NH for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:28 -0400 Received: by mail-pf0-x229.google.com with SMTP id b17-v6so694612pfi.0 for ; Wed, 20 Jun 2018 18:54:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ABjeW7FWo1HVKOTk5GV1YLP/uN+2VusXjD7Yv5Umy/k=; b=PiwcLZ7BYFg6q1immBv7qlBb5cLdO0M7fxvQmHG1CcOQd10qYjR+XlzSD9LIkHdXI+ sk9OeaWUsMHmpmeyB/oEonlBg5bJ+lYfHlPEGlGHGgqvrxc9V5TQIjRIa0a/djSxvK9U SUAbWQVRlqZqEwI20DXbyJ7LyzQilj5xSnlnE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ABjeW7FWo1HVKOTk5GV1YLP/uN+2VusXjD7Yv5Umy/k=; b=W2V0BaHuSg//4FjNhjnOaqltCKWD3uQZFqIRJ2IOxxrXqN9OPdGEpd7Y+WoRk/CE8p Tu+nkIhY8l9KQ+yO61x9II+5nBFYeaKPq7ddUF0Hj3DJsnDmB8oA1W4OvFU08xsx+hGI HfMDdfBEly95zEXBpyHdzBVU19bxkp6c8kqBHyS63U3VtC+QuLeSN9ZsSkkcLUxG8Ehx +wBibQactB3rmSqtYLSSKtSbaZv+dwdjdveePB0vJrjp1MciFcvceJyXAzgFCT5G/nNp 8L8Sv38srSe2Lj1sk8hi+3uB6DNmq/TIyBuwvr6LhgvEBnIELfBi4Y89w7aNbrDnVjzJ ogfg== X-Gm-Message-State: APt69E2xBhi4dKlX//7g0p+Swf8kG+j1Dt2Zd+9o/k7Gs/vv/wPiRQ4y XKpMnWqHe41pBoLXSsNmVQ9C/rNYfes= X-Received: by 2002:a65:5686:: with SMTP id v6-v6mr20906014pgs.141.1529546067428; Wed, 20 Jun 2018 18:54:27 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.25 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:36 -1000 Message-Id: <20180621015359.12018-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::229 Subject: [Qemu-devel] [PATCH v5 12/35] target/arm: Implement SVE prefetches X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 21 +++++++++++++++++++++ target/arm/sve.decode | 23 +++++++++++++++++++++++ 2 files changed, 44 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6e1907cedd..c054e3268b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4302,3 +4302,24 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) cpu_reg_sp(s, a->rn), fn); return true; } + +/* + * Prefetches + */ + +static bool trans_PRF(DisasContext *s, arg_PRF *a, uint32_t insn) +{ + /* Prefetch is a nop within QEMU. */ + sve_access_check(s); + return true; +} + +static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn) +{ + if (a->rm == 31) { + return false; + } + /* Prefetch is a nop within QEMU. */ + sve_access_check(s); + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2ca0fd85e6..a20f98b70c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -793,6 +793,29 @@ LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \ LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ @rpri_load_msz nreg=0 +# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets) +PRF 1000010 00 -1 ----- 0-- --- ----- 0 ---- + +# SVE 32-bit gather prefetch (vector plus immediate) +PRF 1000010 -- 00 ----- 111 --- ----- 0 ---- + +# SVE contiguous prefetch (scalar plus immediate) +PRF 1000010 11 1- ----- 0-- --- ----- 0 ---- + +# SVE contiguous prefetch (scalar plus scalar) +PRF_rr 1000010 -- 00 rm:5 110 --- ----- 0 ---- + +### SVE Memory 64-bit Gather Group + +# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets) +PRF 1100010 00 11 ----- 1-- --- ----- 0 ---- + +# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets) +PRF 1100010 00 -1 ----- 0-- --- ----- 0 ---- + +# SVE 64-bit gather prefetch (vector plus immediate) +PRF 1100010 -- 00 ----- 111 --- ----- 0 ---- + ### SVE Memory Store Group # SVE store predicate register From patchwork Thu Jun 21 01:53:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139412 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1475312lji; Wed, 20 Jun 2018 19:03:55 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKM8HUES8Qfien+8N1mMJj7T980eELddiHpbC+Ar1j9aHuKu9YCfocfMtAMMlmD97fX543a X-Received: by 2002:a37:3ccb:: with SMTP id j194-v6mr3281922qka.153.1529546635196; Wed, 20 Jun 2018 19:03:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546635; cv=none; d=google.com; s=arc-20160816; b=W0LfjipiAdg/xk39xj2dJ9qwk77zV2ZE8m2J2Iv9iLfexY+6aA7USiu6MrLpxe/sec TBKlon3Li1FKzZX3DPssI3bc3Kw0tEb+Go2mr5C1pPtsC0e7FI5nFVN5KWB3zk3832Jx OE6acQwKH5+dhUnzNE7cSSI6m5r6mMGfP7cySiTfU/H0xtIydvOULGpLZl6B4ulkiQet uow7xbsQXFyfD1+cLtqDpkMNgM2LTcyfUjMo2IQHOHW7YZS7oZSDucGtKVr+1aj+BvuU s24dK0IOCsrrj3pH72ADm1/gFA93Irr24MJyELbIcXIAWl0nxv8l2HkMsR+XDh5EpUbf Sv1Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=K1yCRF/mY5J+RXM8YLsiTaO0HBwTPISeSv7DvvgqE/M=; b=jT0yXAE3NzkLojHYvdEw0tFhUuiWTUl1v1bGr1ycDMgPfAReMgq83gMGl3bS7hN5dL l9X3jnkFkxuO5H4Xo5HuELHJSmPwbQ2j5EekH5Y9XVEIZsjm+UjV3RbIQUhWwsADHx8f xTJ/jqO75rG61iWjEDcT9ZGtp6xIPqJTl/98Ov71csZblEXGp1+IqxAvvTI5VR8dNm1H ac3bKJlczKaFLCrQlprSe1s9l/rnRf1h3tfgj0SEuVIYONK49ybAWqfyYZChU/XrbUjz Ms9XL4agB3E/SG03IWfCMTlZq4TesB01k69KbNjHxuiQSmcMsZgz6SU8nRlTUNjM7NKo 3Qdw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=FVBp4pQx; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id o133-v6si1145527qke.209.2018.06.20.19.03.54 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:03:55 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=FVBp4pQx; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52528 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVox0-00036T-HU for patch@linaro.org; Wed, 20 Jun 2018 22:03:54 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39251) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonx-0004S6-1D for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:35 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonv-000389-7c for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:33 -0400 Received: from mail-pg0-x232.google.com ([2607:f8b0:400e:c05::232]:32899) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonu-00037b-U2 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:31 -0400 Received: by mail-pg0-x232.google.com with SMTP id e11-v6so650176pgq.0 for ; Wed, 20 Jun 2018 18:54:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=K1yCRF/mY5J+RXM8YLsiTaO0HBwTPISeSv7DvvgqE/M=; b=FVBp4pQxf8qFaNCIoMrhRuks4RSSbupuHODvyJARQVdZvAjiHoOFUneemJA/yi5gzt +kbyTraTuuodArrbu8+xxZQOVEiMcwsbJC00XAjnlyrQs1TyD62ldtDYkhwXHvOtKUIu LbRpxnK713JGHlvSaBYbyhH0rbVkXVKqFb9Jw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=K1yCRF/mY5J+RXM8YLsiTaO0HBwTPISeSv7DvvgqE/M=; b=pLhk1JhzwFWlFDuuY4krkCAAIY6m+/fe9Hem4yH+rK9vGNa+UPRh/qPeyhm2omYQ3C ezih43hJF08DO6A01SZloCV+1H8HYtqQ/ltnFAHIcNEbeb7cYi9LwOOzGJUcc0I8A7jt 67UZ9kcIzBeyeIdTMOljP2gV5j2sIXYCczl4zyfL4l3L0jAHM913eJfaoFi/Lr4pkYLg v1OFkyHTCdJnphxNjQI2RzXksRgMHv4bzI1RvYYFBXTKBcHV/ztp7/EyMJURiND5kzOC aoJzvKsF330IvtvUP5UTEZmadlcZQu8CGwalYoE6djCGxgizhm+s24K+CxUugMjTbTX0 7LiQ== X-Gm-Message-State: APt69E3fQlD1VW9BHLyzSJNXCkDG3EZzAobAxFSDBHJGSVTHTIa3dCwc ba7Ckky38AzyP/JOvt9SBRjM81Ofrnk= X-Received: by 2002:a62:df89:: with SMTP id d9-v6mr5768561pfl.147.1529546069377; Wed, 20 Jun 2018 18:54:29 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.27 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:28 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:37 -1000 Message-Id: <20180621015359.12018-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::232 Subject: [Qemu-devel] [PATCH v5 13/35] target/arm: Implement SVE gather loads X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 67 ++++++++++++++++++++++++ target/arm/sve_helper.c | 77 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 104 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 53 +++++++++++++++++++ 4 files changed, 301 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 8880128f9c..aeb62afc34 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -959,6 +959,73 @@ DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ed4861a292..9b99718156 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3713,6 +3713,83 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, } } +/* Loads with a vector index. */ + +#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + for (i = 0; i < oprsz; i++) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + target_ulong off = *(TYPEI *)(vm + H1_4(i)); \ + m = FN(env, base + (off << scale), ra); \ + } \ + *(uint32_t *)(vd + H1_4(i)) = m; \ + i += 4, pg >>= 4; \ + } while (i & 15); \ + } \ +} + +#define DO_LD1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i++) { \ + TYPEM mm = 0; \ + if (pg[H1(i)] & 1) { \ + target_ulong off = (TYPEI)m[i]; \ + mm = FN(env, base + (off << scale), ra); \ + } \ + d[i] = mm; \ + } \ +} + +DO_LD1_ZPZ_S(sve_ldbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_S(sve_ldssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_S(sve_ldbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra) + +DO_LD1_ZPZ_S(sve_ldbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_S(sve_ldssu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_S(sve_ldbss_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhss_zss, int32_t, int16_t, cpu_lduw_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zss, int32_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zss, int32_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zss, int32_t, int32_t, cpu_ldl_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra) + /* Stores with a vector index. */ #define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c054e3268b..a53554f8d5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4255,6 +4255,110 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, tcg_temp_free_i32(desc); } +/* Indexed by [xs][u][msz]. */ +static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][3] = { + { { gen_helper_sve_ldbss_zsu, + gen_helper_sve_ldhss_zsu, + NULL, }, + { gen_helper_sve_ldbsu_zsu, + gen_helper_sve_ldhsu_zsu, + gen_helper_sve_ldssu_zsu, } }, + { { gen_helper_sve_ldbss_zss, + gen_helper_sve_ldhss_zss, + NULL, }, + { gen_helper_sve_ldbsu_zss, + gen_helper_sve_ldhsu_zss, + gen_helper_sve_ldssu_zss, } }, +}; + +static gen_helper_gvec_mem_scatter * const gather_load_fn64[3][2][4] = { + { { gen_helper_sve_ldbds_zsu, + gen_helper_sve_ldhds_zsu, + gen_helper_sve_ldsds_zsu, + NULL, }, + { gen_helper_sve_ldbdu_zsu, + gen_helper_sve_ldhdu_zsu, + gen_helper_sve_ldsdu_zsu, + gen_helper_sve_ldddu_zsu, } }, + { { gen_helper_sve_ldbds_zss, + gen_helper_sve_ldhds_zss, + gen_helper_sve_ldsds_zss, + NULL, }, + { gen_helper_sve_ldbdu_zss, + gen_helper_sve_ldhdu_zss, + gen_helper_sve_ldsdu_zss, + gen_helper_sve_ldddu_zss, } }, + { { gen_helper_sve_ldbds_zd, + gen_helper_sve_ldhds_zd, + gen_helper_sve_ldsds_zd, + NULL, }, + { gen_helper_sve_ldbdu_zd, + gen_helper_sve_ldhdu_zd, + gen_helper_sve_ldsdu_zd, + gen_helper_sve_ldddu_zd, } }, +}; + +static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + + if (a->esz < a->msz + || (a->msz == 0 && a->scale) + || (a->esz == a->msz && !a->u)) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + /* TODO: handle LDFF1. */ + switch (a->esz) { + case MO_32: + fn = gather_load_fn32[a->xs][a->u][a->msz]; + break; + case MO_64: + fn = gather_load_fn64[a->xs][a->u][a->msz]; + break; + } + assert(fn != NULL); + + do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz, + cpu_reg_sp(s, a->rn), fn); + return true; +} + +static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + TCGv_i64 imm; + + if (a->esz < a->msz || (a->esz == a->msz && !a->u)) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + /* TODO: handle LDFF1. */ + switch (a->esz) { + case MO_32: + fn = gather_load_fn32[0][a->u][a->msz]; + break; + case MO_64: + fn = gather_load_fn64[2][a->u][a->msz]; + break; + } + assert(fn != NULL); + + /* Treat LD1_zpiz (zn[x] + imm) the same way as LD1_zprz (rn + zm[x]) + * by loading the immediate into the scalar parameter. + */ + imm = tcg_const_i64(a->imm << a->msz); + do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn); + tcg_temp_free_i64(imm); + return true; +} + static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) { /* Indexed by [xs][msz]. */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a20f98b70c..fb73a22c0e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -80,6 +80,8 @@ &rpri_load rd pg rn imm dtype nreg &rprr_store rd pg rn rm msz esz nreg &rpri_store rd pg rn imm msz esz nreg +&rprr_gather_load rd pg rn rm esz msz u ff xs scale +&rpri_gather_load rd pg rn imm esz msz u ff &rprr_scatter_store rd pg rn rm esz msz xs scale ########################################################################### @@ -194,6 +196,18 @@ @rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ &rpri_load dtype=%msz_dtype +# Gather Loads. +@rprr_g_load_u ....... .. . . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=2 +@rprr_g_load_xs_u ....... .. xs:1 . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load +@rprr_g_load_xs_u_sc ....... .. xs:1 scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load +@rprr_g_load_u_sc ....... .. . scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=2 +@rpri_g_load ....... msz:2 .. imm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rpri_gather_load + # Stores; user must fill in ESZ, MSZ, NREG as needed. @rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store @rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store @@ -758,6 +772,19 @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \ &rpri_load dtype=%dtype_23_13 nreg=0 +# SVE 32-bit gather load (scalar plus 32-bit unscaled offsets) +# SVE 32-bit gather load (scalar plus 32-bit scaled offsets) +LD1_zprz 1000010 00 .0 ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u esz=2 msz=0 scale=0 +LD1_zprz 1000010 01 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=2 msz=1 +LD1_zprz 1000010 10 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=2 msz=2 + +# SVE 32-bit gather load (vector plus immediate) +LD1_zpiz 1000010 .. 01 ..... 1.. ... ..... ..... \ + @rpri_g_load esz=2 + ### SVE Memory Contiguous Load Group # SVE contiguous load (scalar plus scalar) @@ -807,6 +834,32 @@ PRF_rr 1000010 -- 00 rm:5 110 --- ----- 0 ---- ### SVE Memory 64-bit Gather Group +# SVE 64-bit gather load (scalar plus 32-bit unpacked unscaled offsets) +# SVE 64-bit gather load (scalar plus 32-bit unpacked scaled offsets) +LD1_zprz 1100010 00 .0 ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u esz=3 msz=0 scale=0 +LD1_zprz 1100010 01 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=3 msz=1 +LD1_zprz 1100010 10 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=3 msz=2 +LD1_zprz 1100010 11 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=3 msz=3 + +# SVE 64-bit gather load (scalar plus 64-bit unscaled offsets) +# SVE 64-bit gather load (scalar plus 64-bit scaled offsets) +LD1_zprz 1100010 00 10 ..... 1.. ... ..... ..... \ + @rprr_g_load_u esz=3 msz=0 scale=0 +LD1_zprz 1100010 01 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=1 +LD1_zprz 1100010 10 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=2 +LD1_zprz 1100010 11 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=3 + +# SVE 64-bit gather load (vector plus immediate) +LD1_zpiz 1100010 .. 01 ..... 1.. ... ..... ..... \ + @rpri_g_load esz=3 + # SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets) PRF 1100010 00 11 ----- 1-- --- ----- 0 ---- From patchwork Thu Jun 21 01:53:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139418 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1479558lji; Wed, 20 Jun 2018 19:09:23 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJQ77W/Hl0W5Gi2jgqNuIgdJvkWZjpOqlONZLGt+pc6bKXsovDedYR3D7nqJkxAR7s8c2Q+ X-Received: by 2002:a0c:8992:: with SMTP id 18-v6mr20757083qvr.61.1529546963547; Wed, 20 Jun 2018 19:09:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546963; cv=none; d=google.com; s=arc-20160816; b=QyyMAlZmmaful+/A5iceezjCbEK56J0ogCM72Fvvu8LoEKh8QtTgHwSlr14fk91RPH QgsTm47SW/6eWaq/H+7D6yo39mfUxwsohOTiswoLivigslrXFVaBRHrmKwSIH5+RQexN AsmIgZTrZKwA4YmAl3x3Fcg9OAxnPtQyizMtk3Vui+XfNbO+lXaCaLYUUjxnE99GjvLy qQ4Gc+EZM/KqABHmC5mPbnKbJQ4ji74qzaQS5ghnjqj3hzVr//cnQJzUFB/gOWQ/JEd6 gc1FjskDmUqxN2s87chHByChZgsG0Auvqdmk7QJiNG0SPVqyccnQ3DH2aNtDOXU9cj8P R/sw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=DhPHednOJI+L5QL2OOrjnicM5jhrFbP7kdaN7+aPugs=; b=TY3aTnDVkU2dWzGrSDzhuqkwdQrG0FnJG1i6QgK2kKRWmh6jVkfKYAinTTVH5zGhpa oDxsd0/IUdgLlTe6KNDOrWjI/VK9A/bUwm36MvnJxBRi1V4U9E9FVR8Jc0mAoJsHue10 kBUgwigoijTNV3I1dUkxFAV73tatatLknzeR+IAeOpEW2iJtKtMKzXjWQky8citr+/nM 0pEgqO0G/uC3R9Pk2gjV0GXx068AjwrA03J37Td6O0Uo/LzKXwDtaJD7Tk6LK8nXetYg HFZXDNcA6Xl51M+FYObm5/slbWC81zvZSulY+E/fc6+9g/T9YAbQiNg3eUneJ0YgWRR9 rh6w== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=emhfObw7; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id 11-v6si3722519qkh.249.2018.06.20.19.09.23 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:09:23 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=emhfObw7; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52554 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp2J-0007AY-1D for patch@linaro.org; Wed, 20 Jun 2018 22:09:23 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39259) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonz-0004S9-8x for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonw-000391-Nx for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:34 -0400 Received: from mail-pf0-x232.google.com ([2607:f8b0:400e:c00::232]:35928) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonw-00038g-FD for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:32 -0400 Received: by mail-pf0-x232.google.com with SMTP id a12-v6so689301pfi.3 for ; Wed, 20 Jun 2018 18:54:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=DhPHednOJI+L5QL2OOrjnicM5jhrFbP7kdaN7+aPugs=; b=emhfObw7jBmVCq4NgsiYnUzo/2H5lfDP7j8X9N/0gtuwlk2VuxtcpEnge8NpLCo1vu DchC3dxWTTBSkZVuXxHdW2R8fTqiqqYSrU6hHn2Nu9mkgkTgoTQuhJSdrLV16m0eRgKf G/8N16oGp3q0Dv2DTFMPTay7R3X9EpiuX3ncE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=DhPHednOJI+L5QL2OOrjnicM5jhrFbP7kdaN7+aPugs=; b=r/iaY3K2cjdhZ0wIWs8gUAf6Xor8zccY2yXlQY64WrenlHWHO6H+yx9OFt4HNYaLXP jOjj9Hckiawk0cYDGX3gkVrMkbimQ63ydkjt2P1rdZ4+7rdwCjLVHMccdOqrzeD6kOd+ 9MD794Zm6IsU/BOzkxr9EquNVccj9U28jMbzJ0o+wwUD5CUZ0H3dFiVF9r74gfF1ad9z VMcMR6GWcRib8ubZLK/oV+lG6nMw9cTbX8CJ3gY79JA7dvym+M9rZ6OPnx2v2HPPphFH uax7IQf+Fm1QZHDfcxv7Nz3fV/DPGz6xiKGvsQ48t8wji0ecBM6nU8e1gI6oGSGzcNWN vNfQ== X-Gm-Message-State: APt69E0/wnFLKz3Dy+UIY2VMYnN0N3ouQLS9cqJuiCw+Y3InXaelAPW0 c6AQ0xW58yRfyU4rnqlhrilG7IYj/DY= X-Received: by 2002:a62:9385:: with SMTP id r5-v6mr25212938pfk.59.1529546071051; Wed, 20 Jun 2018 18:54:31 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.29 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:38 -1000 Message-Id: <20180621015359.12018-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::232 Subject: [Qemu-devel] [PATCH v5 14/35] target/arm: Implement SVE first-fault gather loads X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 67 ++++++++++++++++++++ target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++ target/arm/translate-sve.c | 126 ++++++++++++++++++++++++------------- 3 files changed, 236 insertions(+), 45 deletions(-) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index aeb62afc34..55e8a908d4 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1026,6 +1026,73 @@ DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG, DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffssu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffssu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9b99718156..38f7cc2274 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3790,6 +3790,94 @@ DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra) DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra) DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra) +/* First fault loads with a vector index. */ + +#ifdef CONFIG_USER_ONLY + +#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + bool first = true; \ + mmap_lock(); \ + for (i = 0; i < oprsz; i++) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + target_ulong off = *(TYPEI *)(vm + H(i)); \ + target_ulong addr = base + (off << scale); \ + if (!first && \ + page_check_range(addr, sizeof(TYPEM), PAGE_READ)) { \ + record_fault(env, i, oprsz); \ + goto exit; \ + } \ + m = FN(env, addr, ra); \ + first = false; \ + } \ + *(TYPEE *)(vd + H(i)) = m; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + } while (i & 15); \ + } \ + exit: \ + mmap_unlock(); \ +} + +#else + +#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + g_assert_not_reached(); \ +} + +#endif + +#define DO_LDFF1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \ + DO_LDFF1_ZPZ(NAME, uint32_t, TYPEI, TYPEM, FN, H1_4) +#define DO_LDFF1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \ + DO_LDFF1_ZPZ(NAME, uint64_t, TYPEI, TYPEM, FN, ) + +DO_LDFF1_ZPZ_S(sve_ldffbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra) + +DO_LDFF1_ZPZ_S(sve_ldffbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffssu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffbss_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffhss_zss, int32_t, int16_t, cpu_lduw_data_ra) + +DO_LDFF1_ZPZ_D(sve_ldffbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra) + +DO_LDFF1_ZPZ_D(sve_ldffbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffddu_zss, int32_t, uint64_t, cpu_ldq_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffbds_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhds_zss, int32_t, int16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsds_zss, int32_t, int32_t, cpu_ldl_data_ra) + +DO_LDFF1_ZPZ_D(sve_ldffbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffbds_zd, uint64_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhds_zd, uint64_t, int16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsds_zd, uint64_t, int32_t, cpu_ldl_data_ra) + /* Stores with a vector index. */ #define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a53554f8d5..11c1bf112c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4255,47 +4255,85 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, tcg_temp_free_i32(desc); } -/* Indexed by [xs][u][msz]. */ -static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][3] = { - { { gen_helper_sve_ldbss_zsu, - gen_helper_sve_ldhss_zsu, - NULL, }, - { gen_helper_sve_ldbsu_zsu, - gen_helper_sve_ldhsu_zsu, - gen_helper_sve_ldssu_zsu, } }, - { { gen_helper_sve_ldbss_zss, - gen_helper_sve_ldhss_zss, - NULL, }, - { gen_helper_sve_ldbsu_zss, - gen_helper_sve_ldhsu_zss, - gen_helper_sve_ldssu_zss, } }, +/* Indexed by [ff][xs][u][msz]. */ +static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][3] = { + { { { gen_helper_sve_ldbss_zsu, + gen_helper_sve_ldhss_zsu, + NULL, }, + { gen_helper_sve_ldbsu_zsu, + gen_helper_sve_ldhsu_zsu, + gen_helper_sve_ldssu_zsu, } }, + { { gen_helper_sve_ldbss_zss, + gen_helper_sve_ldhss_zss, + NULL, }, + { gen_helper_sve_ldbsu_zss, + gen_helper_sve_ldhsu_zss, + gen_helper_sve_ldssu_zss, } }, }, + + { { { gen_helper_sve_ldffbss_zsu, + gen_helper_sve_ldffhss_zsu, + NULL, }, + { gen_helper_sve_ldffbsu_zsu, + gen_helper_sve_ldffhsu_zsu, + gen_helper_sve_ldffssu_zsu, } }, + { { gen_helper_sve_ldffbss_zss, + gen_helper_sve_ldffhss_zss, + NULL, }, + { gen_helper_sve_ldffbsu_zss, + gen_helper_sve_ldffhsu_zss, + gen_helper_sve_ldffssu_zss, } } } }; -static gen_helper_gvec_mem_scatter * const gather_load_fn64[3][2][4] = { - { { gen_helper_sve_ldbds_zsu, - gen_helper_sve_ldhds_zsu, - gen_helper_sve_ldsds_zsu, - NULL, }, - { gen_helper_sve_ldbdu_zsu, - gen_helper_sve_ldhdu_zsu, - gen_helper_sve_ldsdu_zsu, - gen_helper_sve_ldddu_zsu, } }, - { { gen_helper_sve_ldbds_zss, - gen_helper_sve_ldhds_zss, - gen_helper_sve_ldsds_zss, - NULL, }, - { gen_helper_sve_ldbdu_zss, - gen_helper_sve_ldhdu_zss, - gen_helper_sve_ldsdu_zss, - gen_helper_sve_ldddu_zss, } }, - { { gen_helper_sve_ldbds_zd, - gen_helper_sve_ldhds_zd, - gen_helper_sve_ldsds_zd, - NULL, }, - { gen_helper_sve_ldbdu_zd, - gen_helper_sve_ldhdu_zd, - gen_helper_sve_ldsdu_zd, - gen_helper_sve_ldddu_zd, } }, +static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][3][2][4] = { + { { { gen_helper_sve_ldbds_zsu, + gen_helper_sve_ldhds_zsu, + gen_helper_sve_ldsds_zsu, + NULL, }, + { gen_helper_sve_ldbdu_zsu, + gen_helper_sve_ldhdu_zsu, + gen_helper_sve_ldsdu_zsu, + gen_helper_sve_ldddu_zsu, } }, + { { gen_helper_sve_ldbds_zss, + gen_helper_sve_ldhds_zss, + gen_helper_sve_ldsds_zss, + NULL, }, + { gen_helper_sve_ldbdu_zss, + gen_helper_sve_ldhdu_zss, + gen_helper_sve_ldsdu_zss, + gen_helper_sve_ldddu_zss, } }, + { { gen_helper_sve_ldbds_zd, + gen_helper_sve_ldhds_zd, + gen_helper_sve_ldsds_zd, + NULL, }, + { gen_helper_sve_ldbdu_zd, + gen_helper_sve_ldhdu_zd, + gen_helper_sve_ldsdu_zd, + gen_helper_sve_ldddu_zd, } } }, + + { { { gen_helper_sve_ldffbds_zsu, + gen_helper_sve_ldffhds_zsu, + gen_helper_sve_ldffsds_zsu, + NULL, }, + { gen_helper_sve_ldffbdu_zsu, + gen_helper_sve_ldffhdu_zsu, + gen_helper_sve_ldffsdu_zsu, + gen_helper_sve_ldffddu_zsu, } }, + { { gen_helper_sve_ldffbds_zss, + gen_helper_sve_ldffhds_zss, + gen_helper_sve_ldffsds_zss, + NULL, }, + { gen_helper_sve_ldffbdu_zss, + gen_helper_sve_ldffhdu_zss, + gen_helper_sve_ldffsdu_zss, + gen_helper_sve_ldffddu_zss, } }, + { { gen_helper_sve_ldffbds_zd, + gen_helper_sve_ldffhds_zd, + gen_helper_sve_ldffsds_zd, + NULL, }, + { gen_helper_sve_ldffbdu_zd, + gen_helper_sve_ldffhdu_zd, + gen_helper_sve_ldffsdu_zd, + gen_helper_sve_ldffddu_zd, } } } }; static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn) @@ -4311,13 +4349,12 @@ static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn) return true; } - /* TODO: handle LDFF1. */ switch (a->esz) { case MO_32: - fn = gather_load_fn32[a->xs][a->u][a->msz]; + fn = gather_load_fn32[a->ff][a->xs][a->u][a->msz]; break; case MO_64: - fn = gather_load_fn64[a->xs][a->u][a->msz]; + fn = gather_load_fn64[a->ff][a->xs][a->u][a->msz]; break; } assert(fn != NULL); @@ -4339,13 +4376,12 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) return true; } - /* TODO: handle LDFF1. */ switch (a->esz) { case MO_32: - fn = gather_load_fn32[0][a->u][a->msz]; + fn = gather_load_fn32[a->ff][0][a->u][a->msz]; break; case MO_64: - fn = gather_load_fn64[2][a->u][a->msz]; + fn = gather_load_fn64[a->ff][2][a->u][a->msz]; break; } assert(fn != NULL); From patchwork Thu Jun 21 01:53:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139409 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1473829lji; Wed, 20 Jun 2018 19:02:06 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIFTO06QX5i63NFbvtGommwpNMkRhKWqksstuS7FKQ0mi5jaDuSYNuAfb+auwYPHihvk5Qv X-Received: by 2002:a37:4792:: with SMTP id u140-v6mr19934213qka.100.1529546526827; Wed, 20 Jun 2018 19:02:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546526; cv=none; d=google.com; s=arc-20160816; b=ZoITS/JgqRA71PSd0kkxJ+KWMFneLNT7CLBZjg++fKi0LYzaFxf7PtuCShC4MoPZie MMg0gol5hkd7COd9mveB3iTqaHyAISoP9fpgtD7NE7ajuVtBDp2pLeggbPwJQ4sPhyPt tkxg/93FtUv9xeQjhIFrqj5NiAeAqUH7hL0znjiUI3bmMh8qf42ry6oo3zokT0c9Eiut MCY5nf2KVmFK3ICbcOdKlonehKyrN+Nx8X6yCVCB6hHVlyjK9TVQxqKkZdzFGyqY9PJ4 AqmgTh8zSFOQoCjSU7mzUQA184A4y6MOp7RSGS7f4Pq4pTfitTZ0TGvvAffVpLKQvDL0 GRdg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=K/RaemtBMKMDMQiaXGvolQPI9idWFbIOOZMoovbxENI=; b=hNbFNOU1F8D3Hf3dG067NXqPEVOGDanv2ztUthaRcLS+400FfmMeITUBlG2VkSMfxc RG73bvq/C+heOvF6rZR3Lt1dLtcRI3UTcp5xhOX7aJzLLuJdtesvn6uhJE6H3i5rHE+L gcxBswVbZlICuz+HHFjWT9AzbSNKvWatnwnMUvFf+l5C3jlQU2IgCpeZCWLc0pRKqaiz ID4UE3zkW3XIcW0unHqvCmUt2GTBPHczmmlCjE3/eVKgiruy7sG3t3wYGQQndWPCvOf5 zon6d8T1NNO/HXxZcHFEHA1gdM0Lv3PXJwe0R/LlMXSJ7s1v3Mlicwfq/EMQiZbwPqSS KS+Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Bn4ppvSG; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id x9-v6si3734442qvh.260.2018.06.20.19.02.06 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:02:06 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Bn4ppvSG; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52519 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVovG-0001TI-5D for patch@linaro.org; Wed, 20 Jun 2018 22:02:06 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39273) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonz-0004SA-E6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:36 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVony-0003A3-Ax for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:35 -0400 Received: from mail-pl0-x22f.google.com ([2607:f8b0:400e:c01::22f]:45163) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVony-00039Y-2c for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:34 -0400 Received: by mail-pl0-x22f.google.com with SMTP id o18-v6so58673pll.12 for ; Wed, 20 Jun 2018 18:54:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=K/RaemtBMKMDMQiaXGvolQPI9idWFbIOOZMoovbxENI=; b=Bn4ppvSGM/GQJ+fdsN6FJSqINnqdxfomlfNFtdEPFvX6j8vaIWfaSV0WfGSb94E5mf GMpToHtkg/p0QpoSD4O3KVJuxvzvdoWNJ7DolkjH7GGeSG4kZzojqMdAucAVVXeHWhhf xfXytbP9mLaO3AQJiEbtq+1z9GWBjVWDMSSbA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=K/RaemtBMKMDMQiaXGvolQPI9idWFbIOOZMoovbxENI=; b=dpC0gt/X9j2HywNhlwUNsKuSB9P/d80OaNIbPm880aywP/pHBY5SoE/1xttYG8Bfb1 I32xzUzeGPP3cHxONxyTgoVad3Cnylyv4dhjXUdfLZfyItSGqFaEZ23bGLecn5tho1gl qSjViRYC3IvAF3BBCa0zmBMkW43yvc03LhgFB9GNu/rzZVqDlqPp515DwcP8VhmIszQE zK217qpLVKSTdgJkyuEiCP+033GsYb2aNG/gTDS2uy810uAZgI7uBz1oA/pdccRqHxao IX3OqKn6m0j9BvfhNenCaCKO2hxIHc28RB+nANfe9oFlX8jqSBx1xRI2fpUNxxVkNxRE fzNw== X-Gm-Message-State: APt69E3FHdg+oqKEnZT1menEJP+ClHXWJE34WSjBA3USWPdRbAcWppSi 7CzW93e3vWCek/gbeIaNUCmYl0XMF3M= X-Received: by 2002:a17:902:b217:: with SMTP id t23-v6mr26425520plr.312.1529546072766; Wed, 20 Jun 2018 18:54:32 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.31 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:31 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:39 -1000 Message-Id: <20180621015359.12018-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22f Subject: [Qemu-devel] [PATCH v5 15/35] target/arm: Implement SVE scatter store vector immediate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 83 ++++++++++++++++++++++++++------------ target/arm/sve.decode | 11 +++++ 2 files changed, 69 insertions(+), 25 deletions(-) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 11c1bf112c..fdc3231e63 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4395,31 +4395,33 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) return true; } +/* Indexed by [xs][msz]. */ +static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = { + { gen_helper_sve_stbs_zsu, + gen_helper_sve_sths_zsu, + gen_helper_sve_stss_zsu, }, + { gen_helper_sve_stbs_zss, + gen_helper_sve_sths_zss, + gen_helper_sve_stss_zss, }, +}; + +static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = { + { gen_helper_sve_stbd_zsu, + gen_helper_sve_sthd_zsu, + gen_helper_sve_stsd_zsu, + gen_helper_sve_stdd_zsu, }, + { gen_helper_sve_stbd_zss, + gen_helper_sve_sthd_zss, + gen_helper_sve_stsd_zss, + gen_helper_sve_stdd_zss, }, + { gen_helper_sve_stbd_zd, + gen_helper_sve_sthd_zd, + gen_helper_sve_stsd_zd, + gen_helper_sve_stdd_zd, }, +}; + static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) { - /* Indexed by [xs][msz]. */ - static gen_helper_gvec_mem_scatter * const fn32[2][3] = { - { gen_helper_sve_stbs_zsu, - gen_helper_sve_sths_zsu, - gen_helper_sve_stss_zsu, }, - { gen_helper_sve_stbs_zss, - gen_helper_sve_sths_zss, - gen_helper_sve_stss_zss, }, - }; - static gen_helper_gvec_mem_scatter * const fn64[3][4] = { - { gen_helper_sve_stbd_zsu, - gen_helper_sve_sthd_zsu, - gen_helper_sve_stsd_zsu, - gen_helper_sve_stdd_zsu, }, - { gen_helper_sve_stbd_zss, - gen_helper_sve_sthd_zss, - gen_helper_sve_stsd_zss, - gen_helper_sve_stdd_zss, }, - { gen_helper_sve_stbd_zd, - gen_helper_sve_sthd_zd, - gen_helper_sve_stsd_zd, - gen_helper_sve_stdd_zd, }, - }; gen_helper_gvec_mem_scatter *fn; if (a->esz < a->msz || (a->msz == 0 && a->scale)) { @@ -4430,10 +4432,10 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) } switch (a->esz) { case MO_32: - fn = fn32[a->xs][a->msz]; + fn = scatter_store_fn32[a->xs][a->msz]; break; case MO_64: - fn = fn64[a->xs][a->msz]; + fn = scatter_store_fn64[a->xs][a->msz]; break; default: g_assert_not_reached(); @@ -4443,6 +4445,37 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) return true; } +static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + TCGv_i64 imm; + + if (a->esz < a->msz) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + switch (a->esz) { + case MO_32: + fn = scatter_store_fn32[0][a->msz]; + break; + case MO_64: + fn = scatter_store_fn64[2][a->msz]; + break; + } + assert(fn != NULL); + + /* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x]) + * by loading the immediate into the scalar parameter. + */ + imm = tcg_const_i64(a->imm << a->msz); + do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn); + tcg_temp_free_i64(imm); + return true; +} + /* * Prefetches */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index fb73a22c0e..311ae6dbdf 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -83,6 +83,7 @@ &rprr_gather_load rd pg rn rm esz msz u ff xs scale &rpri_gather_load rd pg rn imm esz msz u ff &rprr_scatter_store rd pg rn rm esz msz xs scale +&rpri_scatter_store rd pg rn imm esz msz ########################################################################### # Named instruction formats. These are generally used to @@ -215,6 +216,8 @@ &rprr_store nreg=0 @rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \ &rprr_scatter_store +@rpri_scatter_store ....... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \ + &rpri_scatter_store ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -927,6 +930,14 @@ ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \ ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \ @rprr_scatter_store xs=2 esz=3 scale=0 +# SVE 64-bit scatter store (vector plus immediate) +ST1_zpiz 1110010 .. 10 ..... 101 ... ..... ..... \ + @rpri_scatter_store esz=3 + +# SVE 32-bit scatter store (vector plus immediate) +ST1_zpiz 1110010 .. 11 ..... 101 ... ..... ..... \ + @rpri_scatter_store esz=2 + # SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset) # Require msz > 0 ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \ From patchwork Thu Jun 21 01:53:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139423 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1481770lji; Wed, 20 Jun 2018 19:12:40 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLi/w2mI2/FdWu+3n7Hc31kbI0VkvQonOz+U+A7H3lh2oZc3S4IIp1mA6hZ/2xC5L/VCWDE X-Received: by 2002:ac8:2631:: with SMTP id u46-v6mr20821655qtu.306.1529547160386; Wed, 20 Jun 2018 19:12:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547160; cv=none; d=google.com; s=arc-20160816; b=r2IOs6NZPXUr1GTHMrlVl5jXFLa9dLGrs3vs3fhbQfGqsf3siXSOXuhPC1ly1DzM0O qASdD4X4GrLQhq44+Cu+GNm/4qNuLlm6G3zOo+YjrFDJ0Ckmfv8Rb65LjGVYXRtvR1KW 0xBWeERYj1mJHt9uDmsBrPtAQmKDIBYY9ThyX50AYstYITnDXNkExYOvcHuWEhN+1sh+ MTj5EZ/Tf0+ThgOkNXH8tys9ZBpG3mkLNDJgDVSTY4AuVRucBj1IGNuz+vSEtaNGJ78k AK3soKLWvLCK1GHRE6p4AAzEMF93gSWhqDXPIjxmtzw0FLRTqyReMjfL6wrIup5f6ltU myNA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=9e2ZEGdVMVF9+P5x8vHEftV4O66gf5blUE3ikkuxwT0=; b=IqkgP5M9SLgJHfCa//hVMZJWesE2P07MOxoNTo1A8V2c+9OJhtnL4kPd83avlkNxVg HC0Er3js3kTA7MH629wCcEsC5QG+JYfBXSBdULDO3dsPDt9iV5AKsKdaRDg4OmTs0NDV IXWwEqIXVVxMLF4cPKfXVE2kj4Fp/mWR01Ct/68J+ceNJrY3UkKegMphxFlo/hTKMPFu Uw2S5SHJupqdBkVltedjdzd5drBtUU68MFOCv3lZ6ktXWCWqcwgB4CnXxnZow1MGAYYi MYKSqSk/izxdjRmxCg7MAAHkKljQ54keRhawIUPglBMs7e2asbza8G76bxe6MH8rKQWU 3rlQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=EIJu97IH; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id a7-v6si3568984qvb.224.2018.06.20.19.12.40 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:12:40 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=EIJu97IH; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52575 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp5T-0001Hw-Qb for patch@linaro.org; Wed, 20 Jun 2018 22:12:39 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39301) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo4-0004XC-5i for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo0-0003B3-9a for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:40 -0400 Received: from mail-pl0-x233.google.com ([2607:f8b0:400e:c01::233]:44757) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo0-0003Ah-0s for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:36 -0400 Received: by mail-pl0-x233.google.com with SMTP id z9-v6so746293plk.11 for ; Wed, 20 Jun 2018 18:54:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=9e2ZEGdVMVF9+P5x8vHEftV4O66gf5blUE3ikkuxwT0=; b=EIJu97IH3/CM7JcmPhHXBLxcA6LCp8e+EG9Lveyro8IqzcWR4hxZVr6FDeJ8xuFtnY x6+xtwTyS9K+lXunAe/MHkWXUIx+I+qzLK1Hjm3QWjK2NNjfsHuF1eUoFEaqC+dJPasf K+TKAGmWAnp+52zIRMvzI6s19/PfgTj+hqQCo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9e2ZEGdVMVF9+P5x8vHEftV4O66gf5blUE3ikkuxwT0=; b=Du24WsEkSGoPgW7WKxPUSwRVsRKdVuU0AXWT5pZxJeCt5RGVdllyDOK2DiLt7dZHdW if72jYrLSoSFZVW64ZR92UGkO2v5xK5UmFLn9lc97ujElsZN1WNbgKOi99HOutaApO7K zXXKNEJBJKO++RQUY2DoF9fAzZhTRlgKw05FwYjI0+QTlEDOIbb4yvhQmC3W+51bjSO9 IcixDV0nv/HrdEgTTu0En8aogcXybyG931GXvgHicCwitNVZy/n9ILi/UwWeTI6Qg3ic GzHbIbwWdIo0CHx7g6XJhLndbCWmYrzTkrNQZNYSywHmpiBXD6CoJ4ckByZPMH+o0TXi mVlA== X-Gm-Message-State: APt69E2X2jGER0H4u+x+L8isLpVmVB1q9HZpvAUz8D+gnP2oY43XEUWY uE4W0Slx/33zJZ6t+GhmRZOXjUUREZE= X-Received: by 2002:a17:902:a989:: with SMTP id bh9-v6mr26782187plb.245.1529546074765; Wed, 20 Jun 2018 18:54:34 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.32 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:33 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:40 -1000 Message-Id: <20180621015359.12018-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::233 Subject: [Qemu-devel] [PATCH v5 16/35] target/arm: Implement SVE floating-point compare vectors X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 49 ++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 62 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 40 ++++++++++++++++++++++++ target/arm/sve.decode | 11 +++++++ 4 files changed, 162 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 55e8a908d4..6089b3a53f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -839,6 +839,55 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 38f7cc2274..2fd34d722b 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3193,6 +3193,68 @@ void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN); } +/* Two operand floating-point comparison controlled by a predicate. + * Unlike the integer version, we are not allowed to optimistically + * compare operands, since the comparison may have side effects wrt + * the FPSR. + */ +#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \ + uint64_t *d = vd, *g = vg; \ + do { \ + uint64_t out = 0, pg = g[j]; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + TYPE mm = *(TYPE *)(vm + H(i)); \ + out |= OP(TYPE, nn, mm, status); \ + } \ + } while (i & 63); \ + d[j--] = out; \ + } while (i > 0); \ +} + +#define DO_FPCMP_PPZZ_H(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_h, float16, H1_2, OP) +#define DO_FPCMP_PPZZ_S(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_s, float32, H1_4, OP) +#define DO_FPCMP_PPZZ_D(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_d, float64, , OP) + +#define DO_FPCMP_PPZZ_ALL(NAME, OP) \ + DO_FPCMP_PPZZ_H(NAME, OP) \ + DO_FPCMP_PPZZ_S(NAME, OP) \ + DO_FPCMP_PPZZ_D(NAME, OP) + +#define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0 +#define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0 +#define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0 +#define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0 +#define DO_FCMUO(TYPE, X, Y, ST) \ + TYPE##_compare_quiet(X, Y, ST) == float_relation_unordered +#define DO_FACGE(TYPE, X, Y, ST) \ + TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) <= 0 +#define DO_FACGT(TYPE, X, Y, ST) \ + TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) < 0 + +DO_FPCMP_PPZZ_ALL(sve_fcmge, DO_FCMGE) +DO_FPCMP_PPZZ_ALL(sve_fcmgt, DO_FCMGT) +DO_FPCMP_PPZZ_ALL(sve_fcmeq, DO_FCMEQ) +DO_FPCMP_PPZZ_ALL(sve_fcmne, DO_FCMNE) +DO_FPCMP_PPZZ_ALL(sve_fcmuo, DO_FCMUO) +DO_FPCMP_PPZZ_ALL(sve_facge, DO_FACGE) +DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT) + +#undef DO_FPCMP_PPZZ_ALL +#undef DO_FPCMP_PPZZ_D +#undef DO_FPCMP_PPZZ_S +#undef DO_FPCMP_PPZZ_H +#undef DO_FPCMP_PPZZ + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fdc3231e63..a68126b369 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3533,6 +3533,46 @@ DO_FP3(FMULX, fmulx) #undef DO_FP3 +static bool do_fp_cmp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + if (fn == NULL) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(pred_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + +#define DO_FPCMP(NAME, name) \ +static bool trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + return do_fp_cmp(s, a, fns[a->esz]); \ +} + +DO_FPCMP(FCMGE, fcmge) +DO_FPCMP(FCMGT, fcmgt) +DO_FPCMP(FCMEQ, fcmeq) +DO_FPCMP(FCMNE, fcmne) +DO_FPCMP(FCMUO, fcmuo) +DO_FPCMP(FACGE, facge) +DO_FPCMP(FACGT, facgt) + +#undef DO_FPCMP + typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 311ae6dbdf..6f4a36422d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -320,6 +320,17 @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn +### SVE Floating Point Compare - Vectors Group + +# SVE floating-point compare vectors +FCMGE_ppzz 01100101 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm +FCMGT_ppzz 01100101 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm +FCMEQ_ppzz 01100101 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm +FCMNE_ppzz 01100101 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm +FCMUO_ppzz 01100101 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm +FACGE_ppzz 01100101 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm +FACGT_ppzz 01100101 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm + ### SVE Integer Multiply-Add Group # SVE integer multiply-add writing addend (predicated) From patchwork Thu Jun 21 01:53:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139413 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1476399lji; Wed, 20 Jun 2018 19:05:24 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJY1CDPmCnKrf5Ubt9yxFPGwWlAwKqiKIOYpzgp55EhDpPl9ga6JsmzuoiwLxfr9qDDbiPw X-Received: by 2002:ac8:152:: with SMTP id f18-v6mr21163891qtg.296.1529546723978; Wed, 20 Jun 2018 19:05:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546723; cv=none; d=google.com; s=arc-20160816; b=L26jcXJYaBoXaTgu5Lj99RncYBnpHeA0+JbUTbWZ12BNtu68PkWtMLQHV5qcWar+yM UajnTT4CZwibQKgsj2b1j0BiaTjWGcAHS3Hn+dfsr8LQNnac7g7Qqkz3c7HjPurCwFh+ UgKhDgJbvK0QQiQ5ehZs9T+1Q1/DLVeCIXZBco5uireSrZpgtBzoSc6NAtADf+mVyUDb I4BPMMOv3gkaadZaBZo/9PLIrajG4D52w6oAKefdADcQMzBYrLYUvxT8Xk9OE5lI120b 5K+IGkTcIe/f1U0qREV27WwKt0fgTYg3rjsuRMpqL1kc+KwlA82ldmR0dDMXm10z56hs VOUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=9jJ4qWntsM8u8UvCttYqEQPIwhB/EyI5Rg+m1004LwY=; b=VAl8kyvb2+78Tkc2U1Q4RRdi+jOXL1GijkrudSUSI04+xawLVshVjbpgrFztvGhcsy mNlcnGfm6nmAz+zyQnRZCJ0CE0xV08fiZd343aa3uJVDGdIbq9eHMviGP1H/2xRRW7Ti izI5qcwd3I951E8e3k67mvMUbd0eTZ3+M7tPerw2aUC5PHGZyKai981eKOUzcOLkiVq4 CAbqKakaoPWgUZPoIhDmdsuW+JDCu9aWbk9LgVKMxK12f/tZl6hFep7NiDNulnzIAerg X3bPiEHw97GO1AkOZMyVl38GVOwOMitMupcguvAUUaYNJ75q+bE9XJoyUIBDvbak6CKK NsKg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=agiJRlvR; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id u123-v6si988450qkh.104.2018.06.20.19.05.23 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:05:23 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=agiJRlvR; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52531 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoyR-0003qV-Ex for patch@linaro.org; Wed, 20 Jun 2018 22:05:23 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39300) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo4-0004XB-5V for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo2-0003CZ-Eh for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:40 -0400 Received: from mail-pg0-x22b.google.com ([2607:f8b0:400e:c05::22b]:40125) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo2-0003C3-5W for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:38 -0400 Received: by mail-pg0-x22b.google.com with SMTP id w8-v6so640749pgp.7 for ; Wed, 20 Jun 2018 18:54:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=9jJ4qWntsM8u8UvCttYqEQPIwhB/EyI5Rg+m1004LwY=; b=agiJRlvRlyasQ1opd7JSdagT0MZLsdOECu6ZOZZiYBKGIVEPzjY/JMo8hSGuob3Tkn zy5/s838UTl10sG3j4rCUcLxbzXvoAODmEFus/cc9TUS5IwOQw4L26Qz7e6Kn+CWKGc0 2jtOvjBVYEbGCPW2zaCoAWxnlnkUr3gqsxNUU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9jJ4qWntsM8u8UvCttYqEQPIwhB/EyI5Rg+m1004LwY=; b=a7cJiLQk3Rvhd65eyubokuGeD+y1hs7OOAy68T9S8I8l8g3GLMGdvSGOXVaBaybzEU uSQH/vTwrnVTnWw8bEzZrfzTQgGP2c8Sr0iyT74LAOpdUkZlGKoumlvBDa6R23nOWp61 NWW5t7hqCsm8IeQOk67Hrv/JwElfGuM1Uot8ZKgYLCCA47VQswga4fDnyHFuO/trsvuE Aphh04GCEsEDn/y+bxdmNTox0xd1ixK7hSs1spS+OVxYeAHcZq942RDGxw6fIVDVH642 /8MHaVYMldkaRJ7Xfwe+0w9EboenNabzrn3ncQgn7qB90lDLTIwN750VRtkKwWTz1xIO eYiw== X-Gm-Message-State: APt69E2zLT1ssvvWAvodhzG2jmnFYPPgOzN/zUrEJCWaQIjLhuj5Lxgh HD38D1pnICvXpkRkhVW1i5G+dKswVPs= X-Received: by 2002:a65:5a4f:: with SMTP id z15-v6mr20049760pgs.283.1529546076773; Wed, 20 Jun 2018 18:54:36 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.34 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:35 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:41 -1000 Message-Id: <20180621015359.12018-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22b Subject: [Qemu-devel] [PATCH v5 17/35] target/arm: Implement SVE floating-point arithmetic with immediate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 56 ++++++++++++++++++++++++++++ target/arm/sve_helper.c | 69 +++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 14 +++++++ 4 files changed, 214 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6089b3a53f..087819ec2b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -809,6 +809,62 @@ DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 2fd34d722b..a40df62414 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2997,6 +2997,75 @@ DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd) #undef DO_ZPZZ_FP +/* Three-operand expander, with one scalar operand, controlled by + * a predicate, with the extra float_status parameter. + */ +#define DO_ZPZS_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + TYPE mm = scalar; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPE); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_ZPZS_FP(sve_fadds_h, float16, H1_2, float16_add) +DO_ZPZS_FP(sve_fadds_s, float32, H1_4, float32_add) +DO_ZPZS_FP(sve_fadds_d, float64, , float64_add) + +DO_ZPZS_FP(sve_fsubs_h, float16, H1_2, float16_sub) +DO_ZPZS_FP(sve_fsubs_s, float32, H1_4, float32_sub) +DO_ZPZS_FP(sve_fsubs_d, float64, , float64_sub) + +DO_ZPZS_FP(sve_fmuls_h, float16, H1_2, float16_mul) +DO_ZPZS_FP(sve_fmuls_s, float32, H1_4, float32_mul) +DO_ZPZS_FP(sve_fmuls_d, float64, , float64_mul) + +static inline float16 subr_h(float16 a, float16 b, float_status *s) +{ + return float16_sub(b, a, s); +} + +static inline float32 subr_s(float32 a, float32 b, float_status *s) +{ + return float32_sub(b, a, s); +} + +static inline float64 subr_d(float64 a, float64 b, float_status *s) +{ + return float64_sub(b, a, s); +} + +DO_ZPZS_FP(sve_fsubrs_h, float16, H1_2, subr_h) +DO_ZPZS_FP(sve_fsubrs_s, float32, H1_4, subr_s) +DO_ZPZS_FP(sve_fsubrs_d, float64, , subr_d) + +DO_ZPZS_FP(sve_fmaxnms_h, float16, H1_2, float16_maxnum) +DO_ZPZS_FP(sve_fmaxnms_s, float32, H1_4, float32_maxnum) +DO_ZPZS_FP(sve_fmaxnms_d, float64, , float64_maxnum) + +DO_ZPZS_FP(sve_fminnms_h, float16, H1_2, float16_minnum) +DO_ZPZS_FP(sve_fminnms_s, float32, H1_4, float32_minnum) +DO_ZPZS_FP(sve_fminnms_d, float64, , float64_minnum) + +DO_ZPZS_FP(sve_fmaxs_h, float16, H1_2, float16_max) +DO_ZPZS_FP(sve_fmaxs_s, float32, H1_4, float32_max) +DO_ZPZS_FP(sve_fmaxs_d, float64, , float64_max) + +DO_ZPZS_FP(sve_fmins_h, float16, H1_2, float16_min) +DO_ZPZS_FP(sve_fmins_s, float32, H1_4, float32_min) +DO_ZPZS_FP(sve_fmins_d, float64, , float64_min) + /* Fully general two-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a68126b369..12cfadf4e9 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -32,6 +32,7 @@ #include "exec/log.h" #include "trace-tcg.h" #include "translate-a64.h" +#include "fpu/softfloat.h" typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t, @@ -3533,6 +3534,80 @@ DO_FP3(FMULX, fmulx) #undef DO_FP3 +typedef void gen_helper_sve_fp2scalar(TCGv_ptr, TCGv_ptr, TCGv_ptr, + TCGv_i64, TCGv_ptr, TCGv_i32); + +static void do_fp_scalar(DisasContext *s, int zd, int zn, int pg, bool is_fp16, + TCGv_i64 scalar, gen_helper_sve_fp2scalar *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_zd, t_zn, t_pg, status; + TCGv_i32 desc; + + t_zd = tcg_temp_new_ptr(); + t_zn = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, zd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, zn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + + status = get_fpstatus_ptr(is_fp16); + desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + fn(t_zd, t_zn, t_pg, scalar, status, desc); + + tcg_temp_free_i32(desc); + tcg_temp_free_ptr(status); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_zd); +} + +static void do_fp_imm(DisasContext *s, arg_rpri_esz *a, uint64_t imm, + gen_helper_sve_fp2scalar *fn) +{ + TCGv_i64 temp = tcg_const_i64(imm); + do_fp_scalar(s, a->rd, a->rn, a->pg, a->esz == MO_16, temp, fn); + tcg_temp_free_i64(temp); +} + +#define DO_FP_IMM(NAME, name, const0, const1) \ +static bool trans_##NAME##_zpzi(DisasContext *s, arg_rpri_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_sve_fp2scalar * const fns[3] = { \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d \ + }; \ + static uint64_t const val[3][2] = { \ + { float16_##const0, float16_##const1 }, \ + { float32_##const0, float32_##const1 }, \ + { float64_##const0, float64_##const1 }, \ + }; \ + if (a->esz == 0) { \ + return false; \ + } \ + if (sve_access_check(s)) { \ + do_fp_imm(s, a, val[a->esz - 1][a->imm], fns[a->esz - 1]); \ + } \ + return true; \ +} + +#define float16_two make_float16(0x4000) +#define float32_two make_float32(0x40000000) +#define float64_two make_float64(0x4000000000000000ULL) + +DO_FP_IMM(FADD, fadds, half, one) +DO_FP_IMM(FSUB, fsubs, half, one) +DO_FP_IMM(FMUL, fmuls, half, two) +DO_FP_IMM(FSUBR, fsubrs, half, one) +DO_FP_IMM(FMAXNM, fmaxnms, zero, one) +DO_FP_IMM(FMINNM, fminnms, zero, one) +DO_FP_IMM(FMAX, fmaxs, zero, one) +DO_FP_IMM(FMIN, fmins, zero, one) + +#undef DO_FP_IMM + static bool do_fp_cmp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6f4a36422d..0068e415ac 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -160,6 +160,10 @@ @rdn_pg4 ........ esz:2 .. pg:4 ... ........ rd:5 \ &rpri_esz rn=%reg_movprfx +# Two register operand, one one-bit floating-point operand. +@rdn_i1 ........ esz:2 ......... pg:3 .... imm:1 rd:5 \ + &rpri_esz rn=%reg_movprfx + # Two register operand, one encoded bitmask. @rdn_dbm ........ .. .... dbm:13 rd:5 \ &rr_dbm rn=%reg_movprfx @@ -740,6 +744,16 @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm +# SVE floating-point arithmetic with immediate (predicated) +FADD_zpzi 01100101 .. 011 000 100 ... 0000 . ..... @rdn_i1 +FSUB_zpzi 01100101 .. 011 001 100 ... 0000 . ..... @rdn_i1 +FMUL_zpzi 01100101 .. 011 010 100 ... 0000 . ..... @rdn_i1 +FSUBR_zpzi 01100101 .. 011 011 100 ... 0000 . ..... @rdn_i1 +FMAXNM_zpzi 01100101 .. 011 100 100 ... 0000 . ..... @rdn_i1 +FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1 +FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1 +FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1 + ### SVE FP Multiply-Add Group # SVE floating-point multiply-accumulate writing addend From patchwork Thu Jun 21 01:53:42 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139420 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1480091lji; Wed, 20 Jun 2018 19:10:12 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIBZN/Ciif8OeuGAAYfYtCDukWZbPktICAZpz6d94klG4tAmw9jxpFpogcymv83blrboxQZ X-Received: by 2002:ac8:34db:: with SMTP id x27-v6mr21733166qtb.223.1529547012884; Wed, 20 Jun 2018 19:10:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547012; cv=none; d=google.com; s=arc-20160816; b=fEKRgRlYlQxqPqtp5bfkyFvDI7/DuBOXdhy39SB3GkwIA9V552nQzSaOHnp4WPD1U1 6ri0ppoe4oJoQHjcXAmvvgh8TM4P6cdOaAS+PWfBXs/w/hrXMn3nawH8cZPTfRsE+KDe JgtDzkb+PRLXoCcqJBzYtvgcLffdK/n0cYiiYPXgzidLp9UbDTvZmiOXMGL5pqJWFIwX ZLENFj+wQUzqMWhqbVOTAgeUoneljITJ+GuG19VTUhAtQJlpFSpS6Fe4ldhX2xNhR+Ph 5YbTulWE9wEF0KBAsjdqt1tyXMp4bzPN74kUqcYfwidI7tmFYWtORTJ2Zu4b2kGJWcXM W29Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=jp8pCSaPEPFwVcyoLa5oiFh6g5GNPr1vfSN3Kxz3GBU=; b=GC5Jpn77XuRtXpVcP+2pEgb4dMGT0ubCmVXKp+/DhBDTQ+ZlU1EkhbAQD4KLZ+hjAq pFcD55q6wTo5tGJ5OG1MFeGp7gEhr64Vw/QPDPTHNJlMtzqgLLeADlFIvtGX/J20w2w6 dJ3js0bwh+kumGjim0tzN7oM4ajEE17oMMmFFlI4hTknHWlzfaWjokO4rGNNlBpXDQWq j8Gmi3PAuvrXZ+XXXGj7+3xNGk+TQa8oC0X+FOEOWqjpPtGYk5U+7i+aKAmKLBl4M6B/ LIEJEfyHqimRpLyMgOs14KM7+FeqGaMULmaSYXOTBvzrUgGanRMo+TOsvEp4JjxBbt0x twPQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=avrrm2FT; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id c56-v6si3809570qtc.342.2018.06.20.19.10.12 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:10:12 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=avrrm2FT; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52560 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp36-0007n6-AK for patch@linaro.org; Wed, 20 Jun 2018 22:10:12 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39316) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo5-0004YX-G9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo4-0003DL-9U for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:41 -0400 Received: from mail-pg0-x22e.google.com ([2607:f8b0:400e:c05::22e]:34589) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo4-0003Cz-0x for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:40 -0400 Received: by mail-pg0-x22e.google.com with SMTP id q4-v6so650289pgr.1 for ; Wed, 20 Jun 2018 18:54:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=jp8pCSaPEPFwVcyoLa5oiFh6g5GNPr1vfSN3Kxz3GBU=; b=avrrm2FT+CqmhvMSc0/aaluOOU56WJzxxjaVMuOccjzXLOCVVSO3nFU3hZ8znXAM68 iVlMNUOyhfxGl8L1mNM6Rvcf89bPqRp93iqPGYI/XJDHZx06oAeCDjzKAV0UYgRxGW3H GSJkpqKhDih930J/m1Ao/Dd5IHZArcr/Tjq3A= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jp8pCSaPEPFwVcyoLa5oiFh6g5GNPr1vfSN3Kxz3GBU=; b=gvYsw+bz/UcdOxEiUuiwRpWKF74Twxs/QSBE3GVBws8BVQzJamXqmX6RsUUxlbUM3p eNRvAkkZeSKi6xV4zdiwDKEwYxidcd6RnkkgQxsyTmu5qi/SI8COnrN5C7FvZHf0M049 vNWlk83C1pxYd9V5jqPPwAhjWx/EKYzrGxTyKkuNjAbp/coFQXFKfwQ1mXqEMsWfIBh1 B0VV23K3TaSJkOJdgoPtj4a8iy/H0qoMaLEGsDoHavhSH6ApzqtDtpJOGGpMd016aQts J8EIEGf9bWTPhfZFe8POkl9r6oKdkIqrKgmnrNchO5LF8rpDezwmbyqGW0dquRsmH/Ul z/5g== X-Gm-Message-State: APt69E1i+ilpICuQRJBvUz9YNJb5iOCoRhWkyqMEVXpcndEE9UGgifDi 8faG54orqij/xXO/TfdY81jaKmnr8jM= X-Received: by 2002:a62:138c:: with SMTP id 12-v6mr25466554pft.34.1529546078605; Wed, 20 Jun 2018 18:54:38 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.36 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:37 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:42 -1000 Message-Id: <20180621015359.12018-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22e Subject: [Qemu-devel] [PATCH v5 18/35] target/arm: Implement SVE Floating Point Multiply Indexed Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 14 +++++++++++ target/arm/translate-sve.c | 50 ++++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 48 ++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 19 +++++++++++++++ 4 files changed, 131 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper.h b/target/arm/helper.h index 879a7229e9..56439ac1e4 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -620,6 +620,20 @@ DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 12cfadf4e9..e4ba84cadd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3400,6 +3400,56 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Multiply-Add Indexed Group + */ + +static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a, uint32_t insn) +{ + static gen_helper_gvec_4_ptr * const fns[3] = { + gen_helper_gvec_fmla_idx_h, + gen_helper_gvec_fmla_idx_s, + gen_helper_gvec_fmla_idx_d, + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + status, vsz, vsz, a->index * 2 + a->sub, + fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + +/* + *** SVE Floating Point Multiply Indexed Group + */ + +static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_gvec_fmul_idx_h, + gen_helper_gvec_fmul_idx_s, + gen_helper_gvec_fmul_idx_d, + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->index, fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index f504dd53c8..97af75a61b 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -495,3 +495,51 @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64) #endif #undef DO_3OP + +/* For the indexed ops, SVE applies the index per 128-bit vector segment. + * For AdvSIMD, there is of course only one such vector segment. + */ + +#define DO_MUL_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ + intptr_t idx = simd_data(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ + TYPE mm = m[H(i + idx)]; \ + for (j = 0; j < segment; j++) { \ + d[i + j] = TYPE##_mul(n[i + j], mm, stat); \ + } \ + } \ +} + +DO_MUL_IDX(gvec_fmul_idx_h, float16, H2) +DO_MUL_IDX(gvec_fmul_idx_s, float32, H4) +DO_MUL_IDX(gvec_fmul_idx_d, float64, ) + +#undef DO_MUL_IDX + +#define DO_FMLA_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \ + void *stat, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ + TYPE op1_neg = extract32(desc, SIMD_DATA_SHIFT, 1); \ + intptr_t idx = desc >> (SIMD_DATA_SHIFT + 1); \ + TYPE *d = vd, *n = vn, *m = vm, *a = va; \ + op1_neg <<= (8 * sizeof(TYPE) - 1); \ + for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ + TYPE mm = m[H(i + idx)]; \ + for (j = 0; j < segment; j++) { \ + d[i + j] = TYPE##_muladd(n[i + j] ^ op1_neg, \ + mm, a[i + j], 0, stat); \ + } \ + } \ +} + +DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2) +DO_FMLA_IDX(gvec_fmla_idx_s, float32, H4) +DO_FMLA_IDX(gvec_fmla_idx_d, float64, ) + +#undef DO_FMLA_IDX diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0068e415ac..f64e74fa9d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -29,6 +29,7 @@ %imm9_16_10 16:s6 10:3 %size_23 23:2 %dtype_23_13 23:2 13:2 +%index3_22_19 22:1 19:2 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -712,6 +713,24 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +### SVE FP Multiply-Add Indexed Group + +# SVE floating-point multiply-add (indexed) +FMLA_zzxz 01100100 0.1 .. rm:3 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx index=%index3_22_19 esz=1 +FMLA_zzxz 01100100 101 index:2 rm:3 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx esz=2 +FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx esz=3 + +### SVE FP Multiply Indexed Group + +# SVE floating-point multiply (indexed) +FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \ + index=%index3_22_19 esz=1 +FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2 +FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3 + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated) From patchwork Thu Jun 21 01:53:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139426 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1483381lji; Wed, 20 Jun 2018 19:14:58 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJtyUEdlIxwtCgqT36QPlunDNnmZyX0uxDApq9MNB4AAPycVU1buB/c8CCnk4ksttZxZ5Ji X-Received: by 2002:a37:2e46:: with SMTP id u67-v6mr18908642qkh.106.1529547298364; Wed, 20 Jun 2018 19:14:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547298; cv=none; d=google.com; s=arc-20160816; b=QE/HCS+iRX/1+Hl8/NkQ13VOD35+mSyUuAaYI/bV8laV6TUze6Ee0e7DjVZ9Jqv0kD jrk2yFk7mPfd5IhQ39kIlVx+Eyx5MqUPOUlavqS4IszZjy/QRF6Padv43tNKeJwe7N2A rvPchSQP9ZSMEBdwIZ7yB8W0HAO+ecoSVbinGTDXK1tC8wZpow3uEQ0JksT0amFm4TKh 128LdNtBMWzps+iKrXYXg8JSwiZI9yBdMV9dbZzdIie0Lf6mLrTxCuTbsG5h8mCgiGD8 +0FkWodzFLK5K7pbol55qZKeEjfAk9atWonAqMY/L7mPCUlMETqeeKqShSXWObFxxKNY jRzg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=UzNv4pLhvlwFa3l7wUH4tyICe6dAZvXuJw329+CC+Ss=; b=0gyHa+BFs9eGKhuKZ/+cniNE4Fh7IXT+JEjpYoIczxDw8JQBATViihGp/gUC8fJ3P9 xAARMIuRbGCAX15FnOlgOA1kDIcTuwhEPEoToUr1c4RTc/G6n85r1buNpqv0+fnbK5uQ QYMO5PxkvMpbZCblTKrn9SQPh3+S16CnupGAv+94+losQ/JIBlOXMLCB6+ezs8B3ns8e bV3MlmY2PNEQun2iuTeyiJebsaTdXMAMyorgw0lFT2FyUkiL3WdVFLg4/8Pbc5gm0Fs8 OxMgBmexkYUYlM+e3bLjOkUwbeVX8pKPIAEAEineraJzQJimY1bWLCekrSUyiYkHjCX4 w2Vw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=ffGshJrO; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id s11-v6si3355141qvn.166.2018.06.20.19.14.58 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:14:58 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=ffGshJrO; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52585 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp7h-00036F-Ii for patch@linaro.org; Wed, 20 Jun 2018 22:14:57 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39328) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo7-0004bG-Ug for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo6-0003Gz-G6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:44 -0400 Received: from mail-pg0-x22c.google.com ([2607:f8b0:400e:c05::22c]:35482) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo6-0003Fo-77 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:42 -0400 Received: by mail-pg0-x22c.google.com with SMTP id i7-v6so648518pgp.2 for ; Wed, 20 Jun 2018 18:54:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=UzNv4pLhvlwFa3l7wUH4tyICe6dAZvXuJw329+CC+Ss=; b=ffGshJrOFEfOSZmQgPcThaJYrK8RvDPP0rzSVZ1hgJAFwbRpZg3g2Lch/zgze+wLPZ VMeIbNofb/kDMTq6c7j/BxMy70zE9/Js4xwAzxDy0IgkDhNMBeGQydMjpX2y27prQwzf LRcjmEMHCGZ3yNgCbYJzI4O0zoMtHpoK192k0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=UzNv4pLhvlwFa3l7wUH4tyICe6dAZvXuJw329+CC+Ss=; b=MKjDeOXoEEhJ3gKZdBgfL+kfZYW+3r/WfqBpmcNeTSBT+yT/AW0AwbW8M8SBHEXxgV 7WEmbqWwL/irSYTS3AOXb2632lQmZUOCKA3iCuq5om/4IVBKWCoMmgDI9ddKoVxqOLjN GSpRNiVohmbCt1ZC9H8zsq4IEE/F1AfsaGKBusRFk5L4t2zjU7OS6q+6Dea/BQIIk9qE dqy6Oku3qECsj+LG/lXCt1yCrGKAutgcjuyPH2VAFHQThKjX00IgWUd7C08H1sQ7RjPk PQr3HQjMwa9rxXRnfcDfeSoR1+qq9jHqjvB3vbwa3wntmwcWomV3o/R8VPC2BPliFb6a Xjdg== X-Gm-Message-State: APt69E3SWHIxLcm+3CjJf2gY7pFaH/ehuNkPxsHLLtxEv5nzh4BitAW3 GIvyh/TRvtaF6Qf0aTyB0V1amgSpDbs= X-Received: by 2002:a65:5884:: with SMTP id d4-v6mr20324762pgu.292.1529546080823; Wed, 20 Jun 2018 18:54:40 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.38 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:39 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:43 -1000 Message-Id: <20180621015359.12018-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22c Subject: [Qemu-devel] [PATCH v5 19/35] target/arm: Implement SVE FP Fast Reduction Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 35 ++++++++++++++++++++++ target/arm/sve_helper.c | 61 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 57 +++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 +++++ 4 files changed, 161 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 087819ec2b..ff69d143a0 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -725,6 +725,41 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, i64, i64, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a40df62414..befea9ba54 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2852,6 +2852,67 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Recursive reduction on a function; + * C.f. the ARM ARM function ReducePredicated. + * + * While it would be possible to write this without the DATA temporary, + * it is much simpler to process the predicate register this way. + * The recursion is bounded to depth 7 (128 fp16 elements), so there's + * little to gain with a more complex non-recursive form. + */ +#define DO_REDUCE(NAME, TYPE, H, FUNC, IDENT) \ +static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \ +{ \ + if (n == 1) { \ + return *data; \ + } else { \ + uintptr_t half = n / 2; \ + TYPE lo = NAME##_reduce(data, status, half); \ + TYPE hi = NAME##_reduce(data + half, status, half); \ + return TYPE##_##FUNC(lo, hi, status); \ + } \ +} \ +uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc) \ +{ \ + uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_maxsz(desc); \ + TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)((void *)data + i) = (pg & 1 ? nn : IDENT); \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ + for (; i < maxsz; i += sizeof(TYPE)) { \ + *(TYPE *)((void *)data + i) = IDENT; \ + } \ + return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \ +} + +DO_REDUCE(sve_faddv_h, float16, H1_2, add, float16_zero) +DO_REDUCE(sve_faddv_s, float32, H1_4, add, float32_zero) +DO_REDUCE(sve_faddv_d, float64, , add, float64_zero) + +/* Identity is floatN_default_nan, without the function call. */ +DO_REDUCE(sve_fminnmv_h, float16, H1_2, minnum, 0x7E00) +DO_REDUCE(sve_fminnmv_s, float32, H1_4, minnum, 0x7FC00000) +DO_REDUCE(sve_fminnmv_d, float64, , minnum, 0x7FF8000000000000ULL) + +DO_REDUCE(sve_fmaxnmv_h, float16, H1_2, maxnum, 0x7E00) +DO_REDUCE(sve_fmaxnmv_s, float32, H1_4, maxnum, 0x7FC00000) +DO_REDUCE(sve_fmaxnmv_d, float64, , maxnum, 0x7FF8000000000000ULL) + +DO_REDUCE(sve_fminv_h, float16, H1_2, min, float16_infinity) +DO_REDUCE(sve_fminv_s, float32, H1_4, min, float32_infinity) +DO_REDUCE(sve_fminv_d, float64, , min, float64_infinity) + +DO_REDUCE(sve_fmaxv_h, float16, H1_2, max, float16_chs(float16_infinity)) +DO_REDUCE(sve_fmaxv_s, float32, H1_4, max, float32_chs(float32_infinity)) +DO_REDUCE(sve_fmaxv_d, float64, , max, float64_chs(float64_infinity)) + +#undef DO_REDUCE + uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg, void *status, uint32_t desc) { diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e4ba84cadd..47d64f2fc7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3450,6 +3450,63 @@ static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn) return true; } +/* + *** SVE Floating Point Fast Reduction Group + */ + +typedef void gen_helper_fp_reduce(TCGv_i64, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i32); + +static void do_reduce(DisasContext *s, arg_rpr_esz *a, + gen_helper_fp_reduce *fn) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned p2vsz = pow2ceil(vsz); + TCGv_i32 t_desc = tcg_const_i32(simd_desc(vsz, p2vsz, 0)); + TCGv_ptr t_zn, t_pg, status; + TCGv_i64 temp; + + temp = tcg_temp_new_i64(); + t_zn = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + status = get_fpstatus_ptr(a->esz == MO_16); + + fn(temp, t_zn, t_pg, status, t_desc); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(status); + tcg_temp_free_i32(t_desc); + + write_fp_dreg(s, a->rd, temp); + tcg_temp_free_i64(temp); +} + +#define DO_VPZ(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_fp_reduce * const fns[3] = { \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d, \ + }; \ + if (a->esz == 0) { \ + return false; \ + } \ + if (sve_access_check(s)) { \ + do_reduce(s, a, fns[a->esz - 1]); \ + } \ + return true; \ +} + +DO_VPZ(FADDV, faddv) +DO_VPZ(FMINNMV, fminnmv) +DO_VPZ(FMAXNMV, fmaxnmv) +DO_VPZ(FMINV, fminv) +DO_VPZ(FMAXV, fmaxv) + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f64e74fa9d..39a803621f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -731,6 +731,14 @@ FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \ FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2 FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3 +### SVE FP Fast Reduction Group + +FADDV 01100101 .. 000 000 001 ... ..... ..... @rd_pg_rn +FMAXNMV 01100101 .. 000 100 001 ... ..... ..... @rd_pg_rn +FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn +FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn +FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated) From patchwork Thu Jun 21 01:53:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139419 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1479812lji; Wed, 20 Jun 2018 19:09:48 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJ8VYUh1XDUeeOkQF+vq6O8+POSuWVzoQuJlnaLZqGOxc0nVldYndcO1FLBU1IMVAlGkVI9 X-Received: by 2002:a37:5041:: with SMTP id e62-v6mr20464183qkb.133.1529546988772; Wed, 20 Jun 2018 19:09:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546988; cv=none; d=google.com; s=arc-20160816; b=ImlTcIrWxjxSuVVX3+1noKvxSebWhwDJ1hI6ZTp6CKzTkhmHwSqL6EwC5DxCOi6Yl5 F1oxEc9lRLMbsMhLpY6kB9u7FctRUNqy547kMk0P+fevF8u6jXmpMeXMp7u/QtsFHC/x OqJWk84jrPikF50293dmGV47Dkn9L7+tD5wDg33ZKUn///Sfrxk1Yv73yEDcVLZHslmN DjwQIklYN2BJ7nM4CVxdHTreQIqdQoDON/lMrGbSUrtZRWh22zcVKSJ4vpmqYKCROyGH shBQF6dYkp+vdi6BM6OHhok+Rrd07KwQtGr8nd7cYs6+w7d918gh9EP+L/hfd19PRnu+ yWYA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=RCtfZpKBlW4ElNRMaWaS9zgg8qe947nw1g14SfBucD0=; b=CyYdn4e3UWpUx6j59a/DJz/LFaPzDMPFNaTA9SGtIkkQisWZB5pCG4Bn0Sr81dXkBs 8wSdaLoez0XJB/YRWEAW8MxnBRMMtCp7Z4mwYnTFA+C9hViwE7N2AT4ubYwdiPpa7fJ8 HKsRBp1rsWoQlFa6TJEgD9kwpD3YQevv/xggcSJzT2rluwlDd0BQrtQrbz9p6UXplU6/ mPVj9FtiZhgBjZCtRpmuHhFCQCCwAoH5mAYiXCzEHjCcOePT98989utNLezYcf+8nVfO GiD/3h1Oes93F86zYdXHSYyQDH4gc5PeSHPpEn/9MFs0uMXVJoyFG2/vow5Kv0s0ICYu f5fQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=iJW0uLe9; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id v29-v6si3730387qtj.237.2018.06.20.19.09.48 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:09:48 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=iJW0uLe9; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52556 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp2i-0007DW-5d for patch@linaro.org; Wed, 20 Jun 2018 22:09:48 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39336) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo9-0004bI-21 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo8-0003I8-18 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:45 -0400 Received: from mail-pl0-x22f.google.com ([2607:f8b0:400e:c01::22f]:38072) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo7-0003Hl-PM for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:43 -0400 Received: by mail-pl0-x22f.google.com with SMTP id d10-v6so757627plo.5 for ; Wed, 20 Jun 2018 18:54:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=RCtfZpKBlW4ElNRMaWaS9zgg8qe947nw1g14SfBucD0=; b=iJW0uLe9bOPvuuQn+kE+zY+X/rWphnXbttEfStGrY6TM2IjUFmjg7G+cW5AQKXp/Rv cNS8Eai3+gi1t24aUU1kxwdFJkmDP6dG5IoX+nLNViMMLc9CkMSDSOu1iyyzrfiB6G8C Q0VpLRGwedekZ2/z9sz1ZrBkt8sM5fqnpF2iI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=RCtfZpKBlW4ElNRMaWaS9zgg8qe947nw1g14SfBucD0=; b=uZK5TCKFcuK6XoPv13FEsBjgTXJNbc9nYVqkvRZL3KhAWtneySW+vRJGSQxRhuRx72 MmKNZUu+9tfsUKJ499rJ//EVs4NYgyDe2lA9n/0/BgrQIHyH6Clwp3l5Ar9r9Mjh1TB4 0cWzTYlBf/9J7s4hcn1Db5LsK1HiDEsQgnHUrZsRE/aPFMwPGbJe4VbP0JuClowHmpFj ZM9SVpFYgAUGDSAWPyYEgOgfly8QdBl1+U2vJH+SjMJbbsTk4PMs0s+I1MhET1Ym3unC +w5ja6yIV+RiSYonQ9QZC4ZeO4P30CzraOJnRrTsr8s79t1CZJ6rbjqA5jUJIgBpwKMI qkEA== X-Gm-Message-State: APt69E3nsT5tCRoJWLu4aB+uC+vKiWcEAZSISR/oVRLuRajPXtuk/AZX cbZs87sJr7gKZsbNqbnJq3droTxdWsA= X-Received: by 2002:a17:902:125:: with SMTP id 34-v6mr26252825plb.42.1529546082491; Wed, 20 Jun 2018 18:54:42 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.40 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:41 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:44 -1000 Message-Id: <20180621015359.12018-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22f Subject: [Qemu-devel] [PATCH v5 20/35] target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 8 +++++++ target/arm/translate-sve.c | 47 ++++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 20 ++++++++++++++++ target/arm/sve.decode | 5 ++++ 4 files changed, 80 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper.h b/target/arm/helper.h index 56439ac1e4..ad9cb6c7d5 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -601,6 +601,14 @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 47d64f2fc7..d7957cddbd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3507,6 +3507,53 @@ DO_VPZ(FMAXNMV, fmaxnmv) DO_VPZ(FMINV, fminv) DO_VPZ(FMAXV, fmaxv) +/* + *** SVE Floating Point Unary Operations - Unpredicated Group + */ + +static void do_zz_fp(DisasContext *s, arg_rr_esz *a, gen_helper_gvec_2_ptr *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +static bool trans_FRECPE(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2_ptr * const fns[3] = { + gen_helper_gvec_frecpe_h, + gen_helper_gvec_frecpe_s, + gen_helper_gvec_frecpe_d, + }; + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + do_zz_fp(s, a, fns[a->esz - 1]); + } + return true; +} + +static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2_ptr * const fns[3] = { + gen_helper_gvec_frsqrte_h, + gen_helper_gvec_frsqrte_s, + gen_helper_gvec_frsqrte_d, + }; + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + do_zz_fp(s, a, fns[a->esz - 1]); + } + return true; +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 97af75a61b..073e5c58e7 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -427,6 +427,26 @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +#define DO_2OP(NAME, FUNC, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = FUNC(n[i], stat); \ + } \ +} + +DO_2OP(gvec_frecpe_h, helper_recpe_f16, float16) +DO_2OP(gvec_frecpe_s, helper_recpe_f32, float32) +DO_2OP(gvec_frecpe_d, helper_recpe_f64, float64) + +DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16) +DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32) +DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64) + +#undef DO_2OP + /* Floating-point trigonometric starting value. * See the ARM ARM pseudocode function FPTrigSMul. */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 39a803621f..191be9463d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -739,6 +739,11 @@ FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn +## SVE Floating Point Unary Operations - Unpredicated Group + +FRECPE 01100101 .. 001 110 001100 ..... ..... @rd_rn +FRSQRTE 01100101 .. 001 111 001100 ..... ..... @rd_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated) From patchwork Thu Jun 21 01:53:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139429 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1484910lji; Wed, 20 Jun 2018 19:17:19 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIs96Nw7tnqnUDIKbjeMgjpA6KO7LO9L/EQqXhLbr81sgNjKok9UD/lkUhlzp9ZXpBRuYaP X-Received: by 2002:ac8:4603:: with SMTP id p3-v6mr21764728qtn.418.1529547438959; Wed, 20 Jun 2018 19:17:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547438; cv=none; d=google.com; s=arc-20160816; b=WqamyhmKFVpjQRTWfsIIIn8L9Bm7PBg2pt0pupz22uioH1WGwOcSZIxa9Q7KvFweZs kbwNN5yTipZL40g8TwH3wlVr9B03vdOOGQYrr37Pn38ltE1pk3x9GSPGV3CuMwu9CWyM ebcfz7srozNeDLA79+NMgAFI9/xD3FbaGCvv0+4DNS0BJVtNYA51YhVcrEJphykGB7ZN BatO6oX/vpLtms8GRXZ7Z2M7LOToGmA16Og0Lrj6RSByCBtJNPasf1HrN4DPqrLT1z3U fwF0Exu59T2vfk2MwyuNolcCxFxbgjTVZxCcHa/pQ3cpqUp1o/dSR1Lg4YzXuhCAfyFL 6F1g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=G7HJFf01ahwyJ3S0o1PGDi5YnhvesRg4vJI2sGihuSA=; b=LtiEsK6+zn8NyIayNz+j62FHm3kMY/5q2ysxuWudue65RKp3ttb31joY2+OBRKu/7m rWhu3J+c+8LUvHDER4clXFPvwlwjl8k4b/iWdTcT0XKz0cD0weloEIvVpEJ8bOwkMyD+ cLwBs1IBecBgwbohcBtJT5muvMudnvBwjYf70SvGS9B+GZm0IRDqXOhLqmJPVReRyD3F uIERzamjO3t2ZqQQjGBMiGEsT77k4QIZNwHmqQP53S7Vg+UIppmbuON1MuHT9diJ/4qI J0hde6napiEj9R0iBDnxOcqcOo8TtKRQKChyQXuOpy2i6aldt4JODM8oZso8NKIHJZV6 cRJw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=UmbomvWx; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id d18-v6si3634487qkc.200.2018.06.20.19.17.18 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:17:18 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=UmbomvWx; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52600 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp9y-0004tz-BT for patch@linaro.org; Wed, 20 Jun 2018 22:17:18 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39369) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooE-0004gh-19 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooA-0003JL-0M for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:50 -0400 Received: from mail-pl0-x234.google.com ([2607:f8b0:400e:c01::234]:45168) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo9-0003Ij-OZ for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:45 -0400 Received: by mail-pl0-x234.google.com with SMTP id o18-v6so58914pll.12 for ; Wed, 20 Jun 2018 18:54:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=G7HJFf01ahwyJ3S0o1PGDi5YnhvesRg4vJI2sGihuSA=; b=UmbomvWx7hafV6A8fKPrlz7fy1b2Mo3KDedLFZh1MbgFEU2RCs2tvAEWLhneDFIwzb +L+sE2TLGMMw4bHvVPyGnm9yw5wNLJg3APO9WNIMz2nlAP2xu6rBGeHaGybzSsqqgCXa Zxl9Kn6D0f0jQhSkI6FNxSna6b7Ec+71Zq5rQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=G7HJFf01ahwyJ3S0o1PGDi5YnhvesRg4vJI2sGihuSA=; b=FqD8sEWkGAwc5zlMw7XrhkX0y+DdrKNWBYew+9lTw6WT/ViTAZxPV6Q9X3S6kmJZXq Q4mk3CC8XTkJNA/4L0mowf1ZdaKgb3c7iJAKXwb2SOM1kekeN35P412T4j4g+4tsEBuR HQpcIZTIYfgfQANMCXNBmiueI152ptsM7A8NxRQ8+aGJvImbNu56MBqc65pTBpYkHr6O zJTnbf3pU1qwKWI4voMnrZtzXHRT4ZIIKRBm7iSSXYJvHpotPbvaVXB9YJh11MbM3ECn fXzSL496Nm6YDFgrKjxAY0KSZEqwonkMKuzQikJ1huMNrdiH4tbz2lRSoJtnao5Tu0M1 /Ajg== X-Gm-Message-State: APt69E1r7co+M6FpPPRttY9/MGjcrXh3HnLlB7scvTbMC8L4RFaNHIuf sVzd9R8DD9ODGCPQ7S0TUIA/YTzXBOg= X-Received: by 2002:a17:902:5501:: with SMTP id f1-v6mr26162868pli.108.1529546084438; Wed, 20 Jun 2018 18:54:44 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.42 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:43 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:45 -1000 Message-Id: <20180621015359.12018-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::234 Subject: [Qemu-devel] [PATCH v5 21/35] target/arm: Implement SVE FP Compare with Zero Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 42 +++++++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 43 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 43 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 10 +++++++++ 4 files changed, 138 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ff69d143a0..44a98440c9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -767,6 +767,48 @@ DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmlt0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index befea9ba54..06e963e9c2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3362,6 +3362,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ #define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0 #define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0 +#define DO_FCMLE(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) <= 0 +#define DO_FCMLT(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) < 0 #define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0 #define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0 #define DO_FCMUO(TYPE, X, Y, ST) \ @@ -3385,6 +3387,47 @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT) #undef DO_FPCMP_PPZZ_H #undef DO_FPCMP_PPZZ +/* One operand floating-point comparison against zero, controlled + * by a predicate. + */ +#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \ + uint64_t *d = vd, *g = vg; \ + do { \ + uint64_t out = 0, pg = g[j]; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + if ((pg >> (i & 63)) & 1) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + out |= OP(TYPE, nn, 0, status); \ + } \ + } while (i & 63); \ + d[j--] = out; \ + } while (i > 0); \ +} + +#define DO_FPCMP_PPZ0_H(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_h, float16, H1_2, OP) +#define DO_FPCMP_PPZ0_S(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_s, float32, H1_4, OP) +#define DO_FPCMP_PPZ0_D(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_d, float64, , OP) + +#define DO_FPCMP_PPZ0_ALL(NAME, OP) \ + DO_FPCMP_PPZ0_H(NAME, OP) \ + DO_FPCMP_PPZ0_S(NAME, OP) \ + DO_FPCMP_PPZ0_D(NAME, OP) + +DO_FPCMP_PPZ0_ALL(sve_fcmge0, DO_FCMGE) +DO_FPCMP_PPZ0_ALL(sve_fcmgt0, DO_FCMGT) +DO_FPCMP_PPZ0_ALL(sve_fcmle0, DO_FCMLE) +DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT) +DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ) +DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE) + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d7957cddbd..0c55501935 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3554,6 +3554,49 @@ static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn) return true; } +/* + *** SVE Floating Point Compare with Zero Group + */ + +static void do_ppz_fp(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_3_ptr *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + tcg_gen_gvec_3_ptr(pred_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +#define DO_PPZ(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_3_ptr * const fns[3] = { \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d, \ + }; \ + if (a->esz == 0) { \ + return false; \ + } \ + if (sve_access_check(s)) { \ + do_ppz_fp(s, a, fns[a->esz - 1]); \ + } \ + return true; \ +} + +DO_PPZ(FCMGE_ppz0, fcmge0) +DO_PPZ(FCMGT_ppz0, fcmgt0) +DO_PPZ(FCMLE_ppz0, fcmle0) +DO_PPZ(FCMLT_ppz0, fcmlt0) +DO_PPZ(FCMEQ_ppz0, fcmeq0) +DO_PPZ(FCMNE_ppz0, fcmne0) + +#undef DO_PPZ + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 191be9463d..5dea2e5e88 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -140,6 +140,7 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz +@pd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 . rd:4 &rpr_esz # One register operand, with governing predicate, no vector element size @rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0 @@ -744,6 +745,15 @@ FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn FRECPE 01100101 .. 001 110 001100 ..... ..... @rd_rn FRSQRTE 01100101 .. 001 111 001100 ..... ..... @rd_rn +### SVE FP Compare with Zero Group + +FCMGE_ppz0 01100101 .. 0100 00 001 ... ..... 0 .... @pd_pg_rn +FCMGT_ppz0 01100101 .. 0100 00 001 ... ..... 1 .... @pd_pg_rn +FCMLT_ppz0 01100101 .. 0100 01 001 ... ..... 0 .... @pd_pg_rn +FCMLE_ppz0 01100101 .. 0100 01 001 ... ..... 1 .... @pd_pg_rn +FCMEQ_ppz0 01100101 .. 0100 10 001 ... ..... 0 .... @pd_pg_rn +FCMNE_ppz0 01100101 .. 0100 11 001 ... ..... 0 .... @pd_pg_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated) From patchwork Thu Jun 21 01:53:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139421 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1481366lji; Wed, 20 Jun 2018 19:12:06 -0700 (PDT) X-Google-Smtp-Source: ADUXVKI0UwQg0W9BT3QwkuKR0pZ5jc5VWyXkO07VBoFJgKCIlk31SPX8B8TaB8ssgYJ6dvbQjM4A X-Received: by 2002:ac8:18d7:: with SMTP id o23-v6mr21893274qtk.28.1529547126219; Wed, 20 Jun 2018 19:12:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547126; cv=none; d=google.com; s=arc-20160816; b=xWsBTGSeUZqAOW5WYO62C4XOv4b94YWH0vToc+8h4DOjW4m3FtJZsdt90SKMKt2bNv CG8Es9nbklpahSfy5oQm9QH3Pm+1xpFXIMDo+8JTvnBRP60AqSJWCvyck/6qzxV0bRNi LMH3x5BOq54i8422EnmoQpklXIEXwrMY1HfrnYwuk8z5xYntWe8HbDscKR6rqzMvCgxa LF/f+9JW2Mm0354cKvyrzQf6nWhWQxO3mgAvMaMuDuwzx4CnUdhuYK7CDz3kiMR6oPDP rKj5X7quNeaxGbMjYLnPX5oYz6CPDel+MXbTY8uZ4HGCgS0zmAYVUPaPG53j7YtW9DeZ casA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=YM1glD7stHNvZqQN3AWRG/IvykMWfYM5AgvXp5srQR0=; b=J35V8NNgeD/NmI5wzdOnIgcyUgntqGImLEiEzDH1+keY/tnqBn/TDBtSTSEFAgF5GH o5uLvEEFKtX2MWzDg+iWXtj10cQuRc9UpY6J3ZAzac3bi2bhhGluk1ZidkiWb12oxGja 8ymFc1MmMP/XkVeg/D31UtNWLRzUTvK4SXLTt0qArMDxhN88odW4w1kZODEM96pUpUX4 a02VUgC/qXMD0+Wc82x/tQYTNU9npaVMFtZ9ON5F8h00i0ZjD/2Uhm5aCMRHHfxtRUJS jmhHs368DlVpZEuWYCgBBlMSLnQpYBbfOBJHm6Egbo7hoXSOFZwwjN7qegcXsXtiWIOK CUiA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=aEjp4Cfg; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id s11-v6si3351124qvn.166.2018.06.20.19.12.06 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:12:06 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=aEjp4Cfg; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52570 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp4v-0000b2-JU for patch@linaro.org; Wed, 20 Jun 2018 22:12:05 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39368) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooE-0004gg-0s for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooC-0003KT-35 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:50 -0400 Received: from mail-pl0-x22b.google.com ([2607:f8b0:400e:c01::22b]:39902) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooB-0003K2-QY for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:47 -0400 Received: by mail-pl0-x22b.google.com with SMTP id s24-v6so752994plq.6 for ; Wed, 20 Jun 2018 18:54:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=YM1glD7stHNvZqQN3AWRG/IvykMWfYM5AgvXp5srQR0=; b=aEjp4CfgU4VlL5WiVthx2wYXBFXcqdH3LemPp3n9hn72GsrgZO9zhZXM+KscqZIoj1 7pfReehU4z2UIPP69KrvHf3LWpVVV9z1luhhmRwN8ASS9HBycIiFiktvdO07gKkDEOwl I9B5bXQE0Uz8Olv89i8RyWCyvjjtP2TAdKYgI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=YM1glD7stHNvZqQN3AWRG/IvykMWfYM5AgvXp5srQR0=; b=s8xG3+Q7ww7COIxaLusHmoNtPTwHg5BI3K94urs4+BcrSfUbvFdWG49BWO8z6NlfmK PAMz4KO/IJ3nq9JiKfnO6eZlQUxslp5Qc5y+Ezh4B71G3qtV+0Hwd1RY3x2QM3C1lhJu CV4fpToLuu5KzRDc202ZpjJ22s/vM2ATpvEJujQ7btvxIs1OXmzOs9awFWQXotCmDnSJ x4+ZCySXwjifdMQZlX4KEtIIQetQlG1Rf5g5ImjCMiMEz/pDrSLv/wsPJAcMX/3RORT/ gp8GBxkXRP5qo8W9w0LlAu6ojseWf/Cs6Iu4N5oC1qM2O/tOrVgz2RBnbzNko6JfE6Yn E4bQ== X-Gm-Message-State: APt69E3fXhw2qwbw4v5VroPMD02HlJ9cUC9AhX5vtUS24KJpZTytPRyb cshwjhBgVhs6FtSAJ0223NUPtlttC8E= X-Received: by 2002:a17:902:b40f:: with SMTP id x15-v6mr26487854plr.270.1529546086488; Wed, 20 Jun 2018 18:54:46 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.44 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:46 -1000 Message-Id: <20180621015359.12018-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22b Subject: [Qemu-devel] [PATCH v5 22/35] target/arm: Implement SVE floating-point trig multiply-add coefficient X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 +++ target/arm/sve_helper.c | 70 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 27 +++++++++++++++ target/arm/sve.decode | 3 ++ 4 files changed, 104 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 44a98440c9..aca137fc37 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1037,6 +1037,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 06e963e9c2..3dc35d8b12 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3428,6 +3428,76 @@ DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT) DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ) DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE) +/* FP Trig Multiply-Add. */ + +void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float16 coeff[16] = { + 0x3c00, 0xb155, 0x2030, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + 0x3c00, 0xb800, 0x293a, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float16); + intptr_t x = simd_data(desc); + float16 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float16 mm = m[i]; + intptr_t xx = x; + if (float16_is_neg(mm)) { + mm = float16_abs(mm); + xx += 8; + } + d[i] = float16_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + +void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float32 coeff[16] = { + 0x3f800000, 0xbe2aaaab, 0x3c088886, 0xb95008b9, + 0x36369d6d, 0x00000000, 0x00000000, 0x00000000, + 0x3f800000, 0xbf000000, 0x3d2aaaa6, 0xbab60705, + 0x37cd37cc, 0x00000000, 0x00000000, 0x00000000, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float32); + intptr_t x = simd_data(desc); + float32 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float32 mm = m[i]; + intptr_t xx = x; + if (float32_is_neg(mm)) { + mm = float32_abs(mm); + xx += 8; + } + d[i] = float32_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + +void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float64 coeff[16] = { + 0x3ff0000000000000ull, 0xbfc5555555555543ull, + 0x3f8111111110f30cull, 0xbf2a01a019b92fc6ull, + 0x3ec71de351f3d22bull, 0xbe5ae5e2b60f7b91ull, + 0x3de5d8408868552full, 0x0000000000000000ull, + 0x3ff0000000000000ull, 0xbfe0000000000000ull, + 0x3fa5555555555536ull, 0xbf56c16c16c13a0bull, + 0x3efa01a019b1e8d8ull, 0xbe927e4f7282f468ull, + 0x3e21ee96d2641b13ull, 0xbda8f76380fbb401ull, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float64); + intptr_t x = simd_data(desc); + float64 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float64 mm = m[i]; + intptr_t xx = x; + if (float64_is_neg(mm)) { + mm = float64_abs(mm); + xx += 8; + } + d[i] = float64_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0c55501935..fb225d56a1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3597,6 +3597,33 @@ DO_PPZ(FCMNE_ppz0, fcmne0) #undef DO_PPZ +/* + *** SVE floating-point trig multiply-add coefficient + */ + +static bool trans_FTMAD(DisasContext *s, arg_FTMAD *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_ftmad_h, + gen_helper_sve_ftmad_s, + gen_helper_sve_ftmad_d, + }; + + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->imm, fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5dea2e5e88..e29c598783 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -796,6 +796,9 @@ FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1 FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1 FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1 +# SVE floating-point trig multiply-add coefficient +FTMAD 01100101 esz:2 010 imm:3 100000 rm:5 rd:5 rn=%reg_movprfx + ### SVE FP Multiply-Add Group # SVE floating-point multiply-accumulate writing addend From patchwork Thu Jun 21 01:53:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139422 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1481690lji; Wed, 20 Jun 2018 19:12:33 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKxZVpAIcLdiTbmygWetsfqTz7i+JNMdyA6/UgAV3zU5V0r3mN6q/8znIlwMr7uFDPpp2mW X-Received: by 2002:a37:16d1:: with SMTP id 78-v6mr3912355qkw.271.1529547152992; Wed, 20 Jun 2018 19:12:32 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547152; cv=none; d=google.com; s=arc-20160816; b=iOox3n2StXkHDB1fJrm4m7D48YZwBK4MjnJat6eMDgbkzVvEhgFy+4cxNxZXmNQuVx i7yMhQBTPXiyZD0aVEngiSB8ptWyV78MEqe+DXDDY+2Tof7I/7veYoDQz8zdK9qr2Kf4 zFY9aNQkiLlcAs1R/rbV6XJQu+yQizi9gJrRXBprO/n4gBwpLAzfrEkwSt4Bm4PwwmVb XRudmZsobJT+9sn2kQI3z1xetX9mzpeL2SqlC3Qp6OC0Fo6MQuC6uIZfa+Aj4GVQ1ofa W4GLkCxxcKrCkY90MmFNdk0Dt5Pw+taAXXRAVFVlar+Q5ArO0VBcjxps4hSsHh+7GJtx XVbA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=fcBqKOfz/OhWxZj3vl+NSnfdZzqnqtvhx9FbSAfYtQI=; b=sMsPwBpZbMoM+TkaFXj+x1eMVTz7G5C/7X4IuNWm9LikLaF3qP140oJ7mGcDZz36AT n4fUMH+yqNvlLJLhLfrGzgZEeuFPpYQ8ldJgEBBp1hhSnzf1YPlq0vwxqYLdZV/SEgF8 F25IOGYXPKWsozSRat4H0t5mf2B8yqhw2JisYSjGRwCQlC7APraWgjbLeg/Lk+qHFzLf M7VlpV1T/YaqbB+C8KctFjlV6YHrRYejBF/KqL3vBhk8eq2pQr2g2yrtJGaa7PPhe8ZJ JuKOe4Q3q5di36LyqCAPyM6oqZnencagiycDvacLlGwsvYsGVhBvtANQzoBL2EbPhFis w0+w== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=DDVtR+IP; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id w10-v6si2478483qtk.365.2018.06.20.19.12.32 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:12:32 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=DDVtR+IP; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52573 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp5M-0001CG-DP for patch@linaro.org; Wed, 20 Jun 2018 22:12:32 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39391) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooG-0004is-Cb for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:53 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooF-0003MF-BK for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:52 -0400 Received: from mail-pf0-x232.google.com ([2607:f8b0:400e:c00::232]:37729) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooF-0003Lj-2f for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:51 -0400 Received: by mail-pf0-x232.google.com with SMTP id y5-v6so685998pfn.4 for ; Wed, 20 Jun 2018 18:54:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=fcBqKOfz/OhWxZj3vl+NSnfdZzqnqtvhx9FbSAfYtQI=; b=DDVtR+IPTnfPti8/xA5a86+TdFJjdcq/TOilp6C4x6q2iwxo+R+wePs2ge2XcMtHMW Qwb4Dop8Z5Hhu1KKJ4SMiBSUOaeFdX1ORZh6JZ5oZOR8JKClLrMh2wuuoTRhev5Amwxg 4ceWPT98Yrrb9g0yYfSr8kLKK5CVt/f7zCwR4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=fcBqKOfz/OhWxZj3vl+NSnfdZzqnqtvhx9FbSAfYtQI=; b=Iy1XLUzmrLzJTFLBasDp/6aa4rvmK5tCC8HfPyLhgpgDw7k85HcxYrfSUytgxx1Lwl 4frbG21UKgPXfAsrADDDFcGujnAzTCdcM6p4z3+tWwE0ZR1apYm1j1KgNTAcsaPrA0cQ OLjVWbfe+Iu6exKN2wstpPZ6LwK2lxPZdaX4WM/tapYE6n1N8d6Zrnwsb8gvx/rNMAv0 lp10B+tZR8f9RIjZ3PDz9ZLTM1j45F/4afK3jLEHiEES+ak6HFK4EtdocAPQ3grQ8vch Kwor9eZx3O/cjEdTInKpOOOYL6T4EwuREyftWLhiOjleDSAZpK12O9KYWJujuxPX4Zzz X+3w== X-Gm-Message-State: APt69E3CtT/9O1e3tEfIKO9qOWvGhbb/o0FHA482QWQVkgjvOwCAdJJm rOcu0F0YYFlgUP879bCRzjHjiRZ2Nro= X-Received: by 2002:a62:4f4f:: with SMTP id d76-v6mr25014999pfb.188.1529546089754; Wed, 20 Jun 2018 18:54:49 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.46 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:48 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:47 -1000 Message-Id: <20180621015359.12018-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::232 Subject: [Qemu-devel] [PATCH v5 23/35] target/arm: Implement SVE floating-point convert precision X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 13 +++++++++++++ target/arm/sve_helper.c | 27 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 30 ++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++++++ 4 files changed, 78 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index aca137fc37..4c379dbb05 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -942,6 +942,19 @@ DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3dc35d8b12..e1bbe1f550 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3147,6 +3147,33 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ } while (i != 0); \ } +static inline float32 sve_f16_to_f32(float16 f, float_status *s) +{ + return float16_to_float32(f, true, s); +} + +static inline float64 sve_f16_to_f64(float16 f, float_status *s) +{ + return float16_to_float64(f, true, s); +} + +static inline float16 sve_f32_to_f16(float32 f, float_status *s) +{ + return float32_to_float16(f, true, s); +} + +static inline float16 sve_f64_to_f16(float64 f, float_status *s) +{ + return float64_to_float16(f, true, s); +} + +DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16) +DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32) +DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16) +DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64) +DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32) +DO_ZPZ_FP(sve_fcvt_sd, uint64_t, , float32_to_float64) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fb225d56a1..c08c1263f1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3940,6 +3940,36 @@ static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, return true; } +static bool trans_FCVT_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_sh); +} + +static bool trans_FCVT_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hs); +} + +static bool trans_FCVT_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_dh); +} + +static bool trans_FCVT_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hd); +} + +static bool trans_FCVT_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_ds); +} + +static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e29c598783..fd45f51029 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -816,6 +816,14 @@ FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra ### SVE FP Unary Operations Predicated Group +# SVE floating-point convert precision +FCVT_sh 01100101 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_hs 01100101 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_dh 01100101 11 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Thu Jun 21 01:53:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139424 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1483007lji; Wed, 20 Jun 2018 19:14:27 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLr/OYuRxtrP/ktmDpyLf31JF9gEnctswvX4g8bjczoot9WNAL2Q+zFx10hYTTF4Pxe4L5P X-Received: by 2002:a37:db0a:: with SMTP id e10-v6mr19521811qki.136.1529547267012; Wed, 20 Jun 2018 19:14:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547267; cv=none; d=google.com; s=arc-20160816; b=s//nkkHZaOsH2FFAVbjxcWRxE+nPrSuvtEpguvnSME7oztktxe9hDXwogREaUBqmz/ eiL7P3z5lXRco92Zsv9ow4V5lNQ/VQ4otC2HdaNS4ff6nuILvY2RtJPWX9KrZf5xASmG 8vrKK+X+1fpN34o5TX0DjG/VS7aveyc0fWBQ9x9S2Pq7/1RJGBETjzjG5xYcw0fOg6BQ AIN9LtTve9DuuDzDTltf/orksCiQI6I/zg1qMP3rcls1//rlZjIalmCbNKb60NcBLti5 fQXQcAzLQgmdXlJV8FTRwtG2eYXB1M65M2lIanKuwcQJNMq6AHmlhgLX5Be+eHrYpVS+ FnDA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=chAs1KQCufmRrHCN4/ljpOmhhg2MDuWbn1GZ6QxL+Fs=; b=kNSWHH+bJJcIeV9VilbS7QLzgh0I48PN+oivWyre3YFvD5joSTsbL+XVy9yt8z8p9K 5LULfNG1JTwGEKWQqnSPGagLsoa7fuPTcDxWI0Om4RaI6Ahv8podzFl7wnD8B25JReBM EvL3S1h2lnXIslK+neIG1G2cwM6yxalWdLcdbjZTrjT1np2wM1FM7glxqUZYG6h44BlL Wge0xsOYwhTfOyPTN0VvTA1dFzNkFOoWfL7yvkJZURtvRLXEOy4dxUcP7qlwMaXE++CU zrlhljK+lvJR2C0CmrK27iuhCn2K1xG625t1FIVXhQI1MQ0oe+YgA7j84fjiSckf9xno UA4w== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=NWopxSu0; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id d4-v6si3404966qtf.84.2018.06.20.19.14.26 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:14:26 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=NWopxSu0; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52581 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp7C-0002Ow-EZ for patch@linaro.org; Wed, 20 Jun 2018 22:14:26 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39409) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooJ-0004lF-7I for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooH-0003Nj-DL for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:55 -0400 Received: from mail-pf0-x22b.google.com ([2607:f8b0:400e:c00::22b]:46300) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooH-0003N0-51 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:53 -0400 Received: by mail-pf0-x22b.google.com with SMTP id q1-v6so677086pff.13 for ; Wed, 20 Jun 2018 18:54:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=chAs1KQCufmRrHCN4/ljpOmhhg2MDuWbn1GZ6QxL+Fs=; b=NWopxSu0548Zu8bzdF529X8cVpnrSkLPg+hDZcQKCYoG4EuFKf87WUIaMHRcq+elZn NGDyHhnyjYuJJFzq2GXOtI1X0pTYPbEEmgPv++Tz09/2Tlddqk+Kd233MqzE96oSyma+ h2hIeGnSs0SfIRexroUl1wKyl9amW/oGuKXQk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=chAs1KQCufmRrHCN4/ljpOmhhg2MDuWbn1GZ6QxL+Fs=; b=Yq5A/niaHE5Tjjjyn31KNSahUb/uaAoEjPltMLJENe/NWeYG7E42KEQWbqZfDOjLWE IH+xvxOMAwcMogjIhHVS5dwghn8J002A+CzDgdSyGY7Wt0THqNynqKuTNe2u03lkJQPC A9X/Nh6xF+uyQqDVyzIQsZCH26tH8aXQcEUlY3db6T58jm/GOCpp0FO/LZf+kYJZbOhS 3YKoSiJovKWkKFUrgX2krSMDkqc6MDSmUdQi2oMLUcoBOU0NFjw4JI7no2xu7jUFHiFc JTcFQ3eJE/f/Bg7Jjc9cZn5wT/F1r2fx0kg4kdRUx0vt0IRmQ+WnRUdGQwYkckTxgIto EeAw== X-Gm-Message-State: APt69E09fth6Tzcip1CX9Wt2dGgIi/x8/tYh8WzwTD6hs1UncB2Zwqw8 JrsPDeoQxepjZ5jEKmh/UwTEvXHodX8= X-Received: by 2002:a62:df89:: with SMTP id d9-v6mr5769449pfl.147.1529546091633; Wed, 20 Jun 2018 18:54:51 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.49 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:50 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:48 -1000 Message-Id: <20180621015359.12018-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22b Subject: [Qemu-devel] [PATCH v5 24/35] target/arm: Implement SVE floating-point convert to integer X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 30 +++++++++++++ target/arm/helper.h | 12 +++--- target/arm/helper.c | 2 +- target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 70 ++++++++++++++++++++++++++++++ target/arm/sve.decode | 16 +++++++ 6 files changed, 211 insertions(+), 7 deletions(-) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4c379dbb05..37fa9eb9bb 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -955,6 +955,36 @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/helper.h b/target/arm/helper.h index ad9cb6c7d5..8607077dda 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -134,12 +134,12 @@ DEF_HELPER_2(vfp_touid, i32, f64, ptr) DEF_HELPER_2(vfp_touizh, i32, f16, ptr) DEF_HELPER_2(vfp_touizs, i32, f32, ptr) DEF_HELPER_2(vfp_touizd, i32, f64, ptr) -DEF_HELPER_2(vfp_tosih, i32, f16, ptr) -DEF_HELPER_2(vfp_tosis, i32, f32, ptr) -DEF_HELPER_2(vfp_tosid, i32, f64, ptr) -DEF_HELPER_2(vfp_tosizh, i32, f16, ptr) -DEF_HELPER_2(vfp_tosizs, i32, f32, ptr) -DEF_HELPER_2(vfp_tosizd, i32, f64, ptr) +DEF_HELPER_2(vfp_tosih, s32, f16, ptr) +DEF_HELPER_2(vfp_tosis, s32, f32, ptr) +DEF_HELPER_2(vfp_tosid, s32, f64, ptr) +DEF_HELPER_2(vfp_tosizh, s32, f16, ptr) +DEF_HELPER_2(vfp_tosizs, s32, f32, ptr) +DEF_HELPER_2(vfp_tosizd, s32, f64, ptr) DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, ptr) DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, ptr) diff --git a/target/arm/helper.c b/target/arm/helper.c index 1248d84e6f..a36f5b1899 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -11360,7 +11360,7 @@ ftype HELPER(name)(uint32_t x, void *fpstp) \ } #define CONV_FTOI(name, ftype, fsz, sign, round) \ -uint32_t HELPER(name)(ftype x, void *fpstp) \ +sign##int32_t HELPER(name)(ftype x, void *fpstp) \ { \ float_status *fpst = fpstp; \ if (float##fsz##_is_any_nan(x)) { \ diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e1bbe1f550..497efbf3a8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3167,6 +3167,78 @@ static inline float16 sve_f64_to_f16(float64 f, float_status *s) return float64_to_float16(f, true, s); } +static inline int16_t vfp_float16_to_int16_rtz(float16 f, float_status *s) +{ + if (float16_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float16_to_int16_round_to_zero(f, s); +} + +static inline int64_t vfp_float16_to_int64_rtz(float16 f, float_status *s) +{ + if (float16_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float16_to_int64_round_to_zero(f, s); +} + +static inline int64_t vfp_float32_to_int64_rtz(float32 f, float_status *s) +{ + if (float32_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float32_to_int64_round_to_zero(f, s); +} + +static inline int64_t vfp_float64_to_int64_rtz(float64 f, float_status *s) +{ + if (float64_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float64_to_int64_round_to_zero(f, s); +} + +static inline uint16_t vfp_float16_to_uint16_rtz(float16 f, float_status *s) +{ + if (float16_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float16_to_uint16_round_to_zero(f, s); +} + +static inline uint64_t vfp_float16_to_uint64_rtz(float16 f, float_status *s) +{ + if (float16_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float16_to_uint64_round_to_zero(f, s); +} + +static inline uint64_t vfp_float32_to_uint64_rtz(float32 f, float_status *s) +{ + if (float32_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float32_to_uint64_round_to_zero(f, s); +} + +static inline uint64_t vfp_float64_to_uint64_rtz(float64 f, float_status *s) +{ + if (float64_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float64_to_uint64_round_to_zero(f, s); +} + DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16) DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32) DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16) @@ -3174,6 +3246,22 @@ DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64) DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32) DO_ZPZ_FP(sve_fcvt_sd, uint64_t, , float32_to_float64) +DO_ZPZ_FP(sve_fcvtzs_hh, uint16_t, H1_2, vfp_float16_to_int16_rtz) +DO_ZPZ_FP(sve_fcvtzs_hs, uint32_t, H1_4, helper_vfp_tosizh) +DO_ZPZ_FP(sve_fcvtzs_ss, uint32_t, H1_4, helper_vfp_tosizs) +DO_ZPZ_FP(sve_fcvtzs_hd, uint64_t, , vfp_float16_to_int64_rtz) +DO_ZPZ_FP(sve_fcvtzs_sd, uint64_t, , vfp_float32_to_int64_rtz) +DO_ZPZ_FP(sve_fcvtzs_ds, uint64_t, , helper_vfp_tosizd) +DO_ZPZ_FP(sve_fcvtzs_dd, uint64_t, , vfp_float64_to_int64_rtz) + +DO_ZPZ_FP(sve_fcvtzu_hh, uint16_t, H1_2, vfp_float16_to_uint16_rtz) +DO_ZPZ_FP(sve_fcvtzu_hs, uint32_t, H1_4, helper_vfp_touizh) +DO_ZPZ_FP(sve_fcvtzu_ss, uint32_t, H1_4, helper_vfp_touizs) +DO_ZPZ_FP(sve_fcvtzu_hd, uint64_t, , vfp_float16_to_uint64_rtz) +DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t, , vfp_float32_to_uint64_rtz) +DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t, , helper_vfp_touizd) +DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t, , vfp_float64_to_uint64_rtz) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c08c1263f1..fd94c5337e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3970,6 +3970,76 @@ static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); } +static bool trans_FCVTZS_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hh); +} + +static bool trans_FCVTZU_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hh); +} + +static bool trans_FCVTZS_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hs); +} + +static bool trans_FCVTZU_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hs); +} + +static bool trans_FCVTZS_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hd); +} + +static bool trans_FCVTZU_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hd); +} + +static bool trans_FCVTZS_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ss); +} + +static bool trans_FCVTZU_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ss); +} + +static bool trans_FCVTZS_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_sd); +} + +static bool trans_FCVTZU_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_sd); +} + +static bool trans_FCVTZS_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ds); +} + +static bool trans_FCVTZU_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ds); +} + +static bool trans_FCVTZS_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_dd); +} + +static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index fd45f51029..c4597d4fbe 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -824,6 +824,22 @@ FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 +# SVE floating-point convert to integer +FCVTZS_hh 01100101 01 011 01 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hh 01100101 01 011 01 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_hs 01100101 01 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hs 01100101 01 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_hd 01100101 01 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hd 01100101 01 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_ss 01100101 10 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_ss 01100101 10 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_ds 01100101 11 011 00 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_ds 01100101 11 011 00 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_sd 01100101 11 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Thu Jun 21 01:53:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139425 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1483294lji; Wed, 20 Jun 2018 19:14:51 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJljfSWCgcUQQ1e1Oth9dvsIoK4tDCHR6If/8WxZzmqNq7zwm9XbyySCHZ0fxZ0Ob4gT9qX X-Received: by 2002:aed:25cb:: with SMTP id y11-v6mr21365585qtc.333.1529547291022; Wed, 20 Jun 2018 19:14:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547291; cv=none; d=google.com; s=arc-20160816; b=VWM9CWqyssffaCUmoyOUui+e5EoxsdsYSgmYHANsrhKADs2MLHEa81x3qo+DaJYy1v YX6+3evDEghQeoFzcEioFcpNO4wi0QYHVizIr32lZ3Tm8s/aa2lZdejSJ5pxjVjiRGbI QdY8wGzS1GGHg7lvYUT4BjlPUphEDRpZN5AV87gppeYVx6AURU+62ubLzx/LmTOfxn8H 5SHhOLfKU48mEMAII7TEdCfNpLaZlfxKv7BaoYmh+ExbqUHAxL27yKqdYdIe7uAhItxF 3LdpfGF+ctBv5OSGLWxoAL1Q4xFIGT509Tp1VDzNgAlSNPngEFktE7DmYvwzPPmQ82H0 xWHQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=QIBCwmPy3yQFPEkh4dvSFFLAvuGXcxOXoxq2dHESWn8=; b=TBHpZy8GZuvMxtBHXvhGD9Bk+VjVitpsrj9bVQhmMXaCAvbomLILtNlyfu+ZFUadWe pcUkv1VxRv0gik27HvHAXcOT7L2kffCMohbmoqXPvaOlxnZIfx6AnEq7bArIlsuHvnX5 h6Q5X/a8V+ju5NN9DxxPyB4ApJogeb1gi/ku3yPIFgEVPv/z3nFFoMoccZ2yYJe+fP/g hHuYlG1bPPhrZEUyl+lvyeS9DC8aCa9fWWt7UEmtBI0azmWZdHCH1a2+OTfHOS8d1cYU epIdn9SNQYFKiEnwvEE6wLx5f3qiP8eCy/oakILJR1kRoIFoquu/42EqckMETf9GP29n Gl6g== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Ek5cy461; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id h46-v6si3405156qth.404.2018.06.20.19.14.50 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:14:51 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Ek5cy461; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52583 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp7a-00030R-H8 for patch@linaro.org; Wed, 20 Jun 2018 22:14:50 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39417) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooK-0004lG-5v for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooI-0003OZ-U0 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:56 -0400 Received: from mail-pg0-x22d.google.com ([2607:f8b0:400e:c05::22d]:36153) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooI-0003O6-LC for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:54 -0400 Received: by mail-pg0-x22d.google.com with SMTP id m5-v6so647544pgd.3 for ; Wed, 20 Jun 2018 18:54:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=QIBCwmPy3yQFPEkh4dvSFFLAvuGXcxOXoxq2dHESWn8=; b=Ek5cy461GCteWjNZEZ4h8VxhkYlxZpZFz6sBIoWOWbxKs/Muui3X9hvs/WNjeB0ZG1 as7Ck+Z2hRa363uvHF0LkkU+fjgdUOV9Zd//CShEw0AJhxBcGif+scg/H+Izs4D/QNV2 sSFuaQ4QR4jgdQ8s5lUfYlu2W2HWihXR3SSDg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=QIBCwmPy3yQFPEkh4dvSFFLAvuGXcxOXoxq2dHESWn8=; b=lnPP407+rDxRNxQy1ZfotS2WAXRVWTnkivpr/I1rLp5aK8LmVrrOOGPys1aMS0hm1+ MUwpte4Xxibw2D48QiK5+1JyTq07l7yCNBZ1px8UorGOI6kbDfO0vhPlTzfykFVUaiMe rBINQYiKi+/DSuTjJhaL8do0SzK87j6Vc5iuSqlxN/qozIb2UA3+awQvOc31d1iyXvea WPCp3VhsItMJFmr3W1zqEJXjr24OPvPDPE8ecI93W6M56qtz1IshcLSGvlj6P9Ebp4uf /pIELLZrkh2qMaNzV3TcTpqW4p0381y+BHXShu0k5rzGUkE/A9/rShGRfr2Gd0jjZh9+ I3TA== X-Gm-Message-State: APt69E3P2MIYWLrj+3tBZmYthJPTvRyG7dGJcRl+WKY6hNxZ+oQyD2GW ebEjAswuC2uf5kAr/EiHGx4q/CFK+cI= X-Received: by 2002:a63:8048:: with SMTP id j69-v6mr21062875pgd.429.1529546093325; Wed, 20 Jun 2018 18:54:53 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.51 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:49 -1000 Message-Id: <20180621015359.12018-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22d Subject: [Qemu-devel] [PATCH v5 25/35] target/arm: Implement SVE floating-point round to integral value X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 +++++++ target/arm/sve_helper.c | 8 ++++ target/arm/translate-sve.c | 77 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 9 +++++ 4 files changed, 108 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 37fa9eb9bb..36168c5bb2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -985,6 +985,20 @@ DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_frintx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 497efbf3a8..3b7a2f58c1 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3262,6 +3262,14 @@ DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t, , vfp_float32_to_uint64_rtz) DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t, , helper_vfp_touizd) DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t, , vfp_float64_to_uint64_rtz) +DO_ZPZ_FP(sve_frint_h, uint16_t, H1_2, helper_advsimd_rinth) +DO_ZPZ_FP(sve_frint_s, uint32_t, H1_4, helper_rints) +DO_ZPZ_FP(sve_frint_d, uint64_t, , helper_rintd) + +DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int) +DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int) +DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fd94c5337e..15e4f2888b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4040,6 +4040,83 @@ static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd); } +static gen_helper_gvec_3_ptr * const frint_fns[3] = { + gen_helper_sve_frint_h, + gen_helper_sve_frint_s, + gen_helper_sve_frint_d +}; + +static bool trans_FRINTI(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + if (a->esz == 0) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, + frint_fns[a->esz - 1]); +} + +static bool trans_FRINTX(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_frintx_h, + gen_helper_sve_frintx_s, + gen_helper_sve_frintx_d + }; + if (a->esz == 0) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + +static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) +{ + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 tmode = tcg_const_i32(mode); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + gen_helper_set_rmode(tmode, tmode, status); + + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, frint_fns[a->esz - 1]); + + gen_helper_set_rmode(tmode, tmode, status); + tcg_temp_free_i32(tmode); + tcg_temp_free_ptr(status); + } + return true; +} + +static bool trans_FRINTN(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_nearest_even); +} + +static bool trans_FRINTP(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_up); +} + +static bool trans_FRINTM(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_down); +} + +static bool trans_FRINTZ(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_to_zero); +} + +static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_ties_away); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index c4597d4fbe..eeab3d485f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -840,6 +840,15 @@ FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 +# SVE floating-point round to integral value +FRINTN 01100101 .. 000 000 101 ... ..... ..... @rd_pg_rn +FRINTP 01100101 .. 000 001 101 ... ..... ..... @rd_pg_rn +FRINTM 01100101 .. 000 010 101 ... ..... ..... @rd_pg_rn +FRINTZ 01100101 .. 000 011 101 ... ..... ..... @rd_pg_rn +FRINTA 01100101 .. 000 100 101 ... ..... ..... @rd_pg_rn +FRINTX 01100101 .. 000 110 101 ... ..... ..... @rd_pg_rn +FRINTI 01100101 .. 000 111 101 ... ..... ..... @rd_pg_rn + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Thu Jun 21 01:53:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139430 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1485574lji; Wed, 20 Jun 2018 19:18:33 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKWIy+CsqwxCTHMdpjmpTX12aZAubjaM7xXQzf02NGY3rkPj38+0Tnq2LkSZ9gdYp2zEI/T X-Received: by 2002:a0c:dd82:: with SMTP id v2-v6mr21041632qvk.176.1529547513559; Wed, 20 Jun 2018 19:18:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547513; cv=none; d=google.com; s=arc-20160816; b=WZqzmdncakz1bhCYNbZ/k8bzNSXEGV5DUWtxUE9WG8cApPV4+T6oXpZ4512xecHpjx BoJKWKFdAyIGcVFR+VqebZTNHpMJtRsXxLZ5GQH4as45AiT75+aP5DVk59ktjKZxaSrB VENWTktCK0DzzvybUyVyHAiAUCz43hhhLzp7P/VQZK9fQA3Z8RLlqiB8dDkMMatZgy3D ZmOrySBTEhYV9szyyEdL/j1Axj/Y7+HGdiA+ZZ5y9eFZw2tHBY4nm3KTCNzRlgjRSMDF zEqWaZPgDp7o0SEfxdJytb9TLwvbhipLObZdR+T1CZGDs9rmoZDDL9GfljNkyS+YNo/H +Nzg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=LHWFtiyOKa+HrAxinQWN0q0OwYqueH9VPd2ad8t6RQs=; b=UOy2DQ4j9cexTDhXzLY3xIjNiS1Hs/VHKCmhOmJVcg6sPh2qfpCCPQeS1zdFyGF8TG 90zsmQCizHDMWm+nsscg/5Z+jvEJpyYmqgOvLIfH4AEoE0yPFQZGdAXi9Y5Y8y1nDacc KN0gZyD7gtDdvuu3+eolM7/QRgOeTIlNlygeawm4hlFYcuksYqwqBE5wwW5quACChvjv SDGimgYqCsHT7EYxs3ptzjxnqhpT4fAuHmqK4m3yI2dnZJMo5ZIMZQEfOIV6pLB9KM36 fJyyQhl0KKtegCVyWYNb1LpLsH45E/nw4zDeKd88CQ49j33uZL8QjgGjAZUdTelZaEa1 gx3A== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=H1HfCJQ0; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id 9-v6si3855399qtn.162.2018.06.20.19.18.33 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:18:33 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=H1HfCJQ0; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52604 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpBA-0005dX-Us for patch@linaro.org; Wed, 20 Jun 2018 22:18:33 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39465) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooO-0004oz-8g for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooK-0003Q9-Ux for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: from mail-pl0-x235.google.com ([2607:f8b0:400e:c01::235]:38078) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooK-0003PO-PL for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:56 -0400 Received: by mail-pl0-x235.google.com with SMTP id d10-v6so757826plo.5 for ; Wed, 20 Jun 2018 18:54:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=LHWFtiyOKa+HrAxinQWN0q0OwYqueH9VPd2ad8t6RQs=; b=H1HfCJQ0nxatWBoXR0gS5E1IEyiaFdmQB2rJ61wgdCLtLpRrOHH0ZV7svJ+lLzTvzA Jyk8ziuFE6wscp0t5qgH4/nl7EDP6orGh3QDttx8jdYD7rZB0w7r6nh0S2bAuGm2yPb8 O+0Y7ElA1AgNCJ96irnVsfzPV5meiET+uNL0Q= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=LHWFtiyOKa+HrAxinQWN0q0OwYqueH9VPd2ad8t6RQs=; b=sTYoExAkkhDxJj30tyiqaLw/g/f7eLFVUWtx3WLQ7mMw2F48oVNaSewe9D+sQneAGa hupDNSYTBAyIymWEQe7gFswx/lEQRmLs6SANFOynzohlk7NkwwJffDG7KrUaLm/I1E2V putLOOS667LVwytPJGoJxLuj4QGjU+nFy4GpsYLZ5Fh+nyEmjnOO1+lJebbaCPHKTk9X ACHQTcgK1HBY82x8hyOzVkuTE2+nBcRmibJ/TJe0KgCyEvF3KC3FsL1llPD5tqP11zVS N2GpzgSwR/N1Q9RcQWooFoto1rN9sy8mrWFsIo96mLgvHCCF50ENkJ8HKgE6MiefdeXR Hq4w== X-Gm-Message-State: APt69E24xNpyS/By82YxuCs4fBKMkQQzaxV3NQYwxhkYkyRsj3swu9ee wd63RXfc5zpgusT9QV+aJ/kMoKHnM3U= X-Received: by 2002:a17:902:b582:: with SMTP id a2-v6mr26632985pls.335.1529546095408; Wed, 20 Jun 2018 18:54:55 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.53 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:50 -1000 Message-Id: <20180621015359.12018-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::235 Subject: [Qemu-devel] [PATCH v5 26/35] target/arm: Implement SVE floating-point unary operations X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++++++ target/arm/sve_helper.c | 8 ++++++++ target/arm/translate-sve.c | 26 ++++++++++++++++++++++++++ target/arm/sve.decode | 4 ++++ 4 files changed, 52 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 36168c5bb2..891346a5ac 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -999,6 +999,20 @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3b7a2f58c1..5309cf0866 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3270,6 +3270,14 @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int) DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int) DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int) +DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16) +DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32) +DO_ZPZ_FP(sve_frecpx_d, uint64_t, , helper_frecpx_f64) + +DO_ZPZ_FP(sve_fsqrt_h, uint16_t, H1_2, float16_sqrt) +DO_ZPZ_FP(sve_fsqrt_s, uint32_t, H1_4, float32_sqrt) +DO_ZPZ_FP(sve_fsqrt_d, uint64_t, , float64_sqrt) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 15e4f2888b..308c04de89 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4117,6 +4117,32 @@ static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_frint_mode(s, a, float_round_ties_away); } +static bool trans_FRECPX(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_frecpx_h, + gen_helper_sve_frecpx_s, + gen_helper_sve_frecpx_d + }; + if (a->esz == 0) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + +static bool trans_FSQRT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_fsqrt_h, + gen_helper_sve_fsqrt_s, + gen_helper_sve_fsqrt_d + }; + if (a->esz == 0) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index eeab3d485f..94d7b157b4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -849,6 +849,10 @@ FRINTA 01100101 .. 000 100 101 ... ..... ..... @rd_pg_rn FRINTX 01100101 .. 000 110 101 ... ..... ..... @rd_pg_rn FRINTI 01100101 .. 000 111 101 ... ..... ..... @rd_pg_rn +# SVE floating-point unary operations +FRECPX 01100101 .. 001 100 101 ... ..... ..... @rd_pg_rn +FSQRT 01100101 .. 001 101 101 ... ..... ..... @rd_pg_rn + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Thu Jun 21 01:53:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139431 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1487857lji; Wed, 20 Jun 2018 19:21:52 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJtctVo+pIjihe5O2Zchxal9Pc0h4isiOrjKTgHAmg5IdvWlCs1xwCVp6dctxDLBR0zRYVK X-Received: by 2002:ac8:44d3:: with SMTP id b19-v6mr20438971qto.167.1529547712128; Wed, 20 Jun 2018 19:21:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547712; cv=none; d=google.com; s=arc-20160816; b=E/nk3wIRhsZSiYU11PvGoa3Yz9LzbmPLIrEXgbQhKAtmpdFcSh7yPvHOIK5MR/6RAT pyDmSbtNWbDA2azMiKxnfKIOPAe0uJ0IGEsReuc4GrPIB6ddf8/IdlIAv0qf6CBt15jk t8FZOload6Y30dcwUuC3DenWC2gWM0LgE4XX07Yx7oeDoKZraCcJv+GsZ1berutjdKkr 820hU8u3NaChdyrx7LDsWJZeDo9cBaPDRSFvrhtVJXDwnpM5UEYzwBKqZiR6Mxnin9uF jSvcSAFisbx1+f6ADymBdGzFde/S/X8BuNTPliv70iGcgnjZvMXly0xD23cP0DWrfu+R IjZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=/0h8A3BZDF8RXvY+bWdtcrfvhJEFmIbQwJk4x2l8+Dk=; b=QJUBsrtmh1R95oem7DWi1X2uZgJ4iEKuHVpx/IaOSmDP7rvGeIEWENxkPS9LSQLTu9 dIr0fHJGlXH3g2+6FcZ/cAK0wZeFpP7N5J8kO/XTX3vjdtwWWOFBX+oTxZwYKEbD9lQt OvbjRUXznTs/UpRjzkla1XlOOfQfAivTqr0QIWMc0dY4yWmEXLljwDB4En/gZZ7RdcIS vsKmWfBkRUB35OQ4AteYBt6q8MRmfSVnRf9cO5mC9PFktuuU37ccT/svqERSEqASnPXR aaC3oKFK3oo7Kjmo1WOiajIB4ZTstWxXaIjqPtVZzyaxBBmms8aQVIq9pr32E8spDGxE UgRw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=AFLbihHr; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id g49-v6si80305qvg.7.2018.06.20.19.21.51 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:21:52 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=AFLbihHr; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52630 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpEN-0001aN-LM for patch@linaro.org; Wed, 20 Jun 2018 22:21:51 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39464) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooO-0004ov-8C for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooM-0003Qt-Ga for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: from mail-pg0-x232.google.com ([2607:f8b0:400e:c05::232]:44093) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooM-0003Qe-Ab for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:58 -0400 Received: by mail-pg0-x232.google.com with SMTP id g26-v6so636820pgn.11 for ; Wed, 20 Jun 2018 18:54:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=/0h8A3BZDF8RXvY+bWdtcrfvhJEFmIbQwJk4x2l8+Dk=; b=AFLbihHryh2nCHH15bOVKtmQXN2BBDJWIdp/FaNFfUOtsm3fAedPN81Cq7odxYIusQ zd3OPyXErv+P9eUG9BiS0fpiAJr8FRyOOps0X1ITtKaMcMDeH2LOTbxy+H9yJIawFLVq zHY0cu2IdvGWl/dA9J3IvMCoVfi8V0hLN1mPY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=/0h8A3BZDF8RXvY+bWdtcrfvhJEFmIbQwJk4x2l8+Dk=; b=mSbjkEtSneLUxlpOJ9i7N/JZvppN3NJsMOXXE7s3o7JImpIu4YAhSoub8Q9M4CA9Eq v6VATYJxR4SJ5PtLouA7RJmKSlFTm5WDITFGezD4NU1LvqlWkqNXN5UG5gj/5PN45dcm dzBxof1mUORPsHaycutztZbQ/3Lsf9iDASG0HL4EgfjUPEVY4IUlqVT5wh3Vsy+Ktx7w LmqwIjxZYiMzbaoFs523ZMwGqBlOv6Xb8TEaB6MCEcxiW366K5shEpCmxmlnhs5eAYGP xyjCw2u/qGhtmOe4O+SICJANOaLLHJ5BZh3YGQUUI5YdkYEuohNzvP09/zj0t+DWO7Yd S8zA== X-Gm-Message-State: APt69E1z5Zo/7AzJ5eorxtzQzUhPoM2TSj/xRz+XW8yCv0x84lqPvNx1 5gwjyZWxow3r0HjqzRkPUOeLhb7qkAI= X-Received: by 2002:a62:8202:: with SMTP id w2-v6mr25345344pfd.32.1529546097046; Wed, 20 Jun 2018 18:54:57 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.55 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:51 -1000 Message-Id: <20180621015359.12018-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::232 Subject: [Qemu-devel] [PATCH v5 27/35] target/arm: Implement SVE MOVPRFX X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 60 +++++++++++++++++++++++++++++++++++++- target/arm/sve.decode | 7 +++++ 2 files changed, 66 insertions(+), 1 deletion(-) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 308c04de89..067c219b54 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -351,6 +351,23 @@ static bool do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn) return true; } +/* Select active elememnts from Zn and inactive elements from Zm, + * storing the result in Zd. + */ +static void do_sel_z(DisasContext *s, int rd, int rn, int rm, int pg, int esz) +{ + static gen_helper_gvec_4 * const fns[4] = { + gen_helper_sve_sel_zpzz_b, gen_helper_sve_sel_zpzz_h, + gen_helper_sve_sel_zpzz_s, gen_helper_sve_sel_zpzz_d + }; + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + pred_full_reg_offset(s, pg), + vsz, vsz, 0, fns[esz]); +} + #define DO_ZPZZ(NAME, name) \ static bool trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, \ uint32_t insn) \ @@ -401,7 +418,13 @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) return do_zpzz_ool(s, a, fns[a->esz]); } -DO_ZPZZ(SEL, sel) +static bool trans_SEL_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_sel_z(s, a->rd, a->rn, a->rm, a->pg, a->esz); + } + return true; +} #undef DO_ZPZZ @@ -5038,3 +5061,38 @@ static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn) sve_access_check(s); return true; } + +/* + * Move Prefix + * + * TODO: The implementation so far could handle predicated merging movprfx. + * The helper functions as written take an extra source register to + * use in the operation, but the result is only written when predication + * succeeds. For unpredicated movprfx, we need to rearrange the helpers + * to allow the final write back to the destination to be unconditional. + * For predicated zering movprfz, we need to rearrange the helpers to + * allow the final write back to zero inactives. + * + * In the meantime, just emit the moves. + */ + +static bool trans_MOVPRFX(DisasContext *s, arg_MOVPRFX *a, uint32_t insn) +{ + return do_mov_z(s, a->rd, a->rn); +} + +static bool trans_MOVPRFX_m(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_sel_z(s, a->rd, a->rn, a->rd, a->pg, a->esz); + } + return true; +} + +static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz); + } + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 94d7b157b4..85f2b39776 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -266,6 +266,10 @@ ORV 00000100 .. 011 000 001 ... ..... ..... @rd_pg_rn EORV 00000100 .. 011 001 001 ... ..... ..... @rd_pg_rn ANDV 00000100 .. 011 010 001 ... ..... ..... @rd_pg_rn +# SVE constructive prefix (predicated) +MOVPRFX_z 00000100 .. 010 000 001 ... ..... ..... @rd_pg_rn +MOVPRFX_m 00000100 .. 010 001 001 ... ..... ..... @rd_pg_rn + # SVE integer add reduction (predicated) # Note that saddv requires size != 3. UADDV 00000100 .. 000 001 001 ... ..... ..... @rd_pg_rn @@ -414,6 +418,9 @@ ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm ### SVE Integer Misc - Unpredicated Group +# SVE constructive prefix (unpredicated) +MOVPRFX 00000100 00 1 00000 101111 rn:5 rd:5 + # SVE floating-point exponential accelerator # Note esz != 0 FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn From patchwork Thu Jun 21 01:53:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139432 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1488584lji; Wed, 20 Jun 2018 19:23:03 -0700 (PDT) X-Google-Smtp-Source: ADUXVKL+6U+LbjppYN6TX4I6BS7Lqce4yBhfUwYte2EmegOnFi0mbxQCB28T2BFG0eYOLQMqSMfQ X-Received: by 2002:ac8:2836:: with SMTP id 51-v6mr21703832qtq.131.1529547783437; Wed, 20 Jun 2018 19:23:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547783; cv=none; d=google.com; s=arc-20160816; b=tiTv/kewS692mXKIWPdnXpVh0ffK9iexeZZI2IAebPT49NZ3Y7CjYMTXy3W+788rzg sZpZ4TEvfhlLpil/s8yXHiz0Ne5LU8AbFHjRUBxKGZBZa5JRgTwu/RrD960e33UZn3e8 R9pNHWvFsQu9vAZapV/X2QabCYV6UOufByarAKx+P2Je9yeG0x9Na+Z2d4x5NE9Q80X1 CDeyUpYbekai3z8uaDKSNt33cMtbZZKo9DlT9X8ezwdYgZ18pTCQKZk1L/pKahC3NOTk CWwmk9YUc7jYwSIBnOH3Oho4q5Jmyn03QAjBTvwZ71Kexi9Ft2WtKIaaTI/cEvyADP6A aPQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=li9h4tydA6NYZ7tm8NAvM0RkuO1DvOT8U6gU0EDKrec=; b=v1mcEGmWZX2lrH0+xAcdMy0lHRJUEDhikSJilpta7YTLCNMasUJ6Ri71VLrcO5QNpI 6BfnlKC5XryXjnRBS2+TS+30lAJgCyVipjG/fzkXko585hFzK6hnE6/G5Ne7I4ekK6fw wZueXs3lf0gbhPT34RUoDpGazFtoLSXqWVYC+wR9TguK6V+2Pg4Pl0zlirIuuoGKoCZi HhxELHYspXUX0EcltdddUASxR4M7RAzhj3uQWrtacUiJSaBjC8GzEpdo1OlwjXXuu2Et lwoGj/mM3AoSANnasiVuQVrCXrEkNsoBKd24MKetvIR/FjTOkQwz7ORXZRHlFEZlte5I seFw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=IaRvnxPH; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id s22-v6si3811522qts.215.2018.06.20.19.23.03 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:23:03 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=IaRvnxPH; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52634 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpFW-0002Ah-V7 for patch@linaro.org; Wed, 20 Jun 2018 22:23:02 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39492) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooQ-0004rB-8L for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooO-0003Sc-H6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:02 -0400 Received: from mail-pl0-x235.google.com ([2607:f8b0:400e:c01::235]:43907) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooO-0003Rs-7u for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: by mail-pl0-x235.google.com with SMTP id c41-v6so746764plj.10 for ; Wed, 20 Jun 2018 18:55:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=li9h4tydA6NYZ7tm8NAvM0RkuO1DvOT8U6gU0EDKrec=; b=IaRvnxPHffmQGsxyZo10mwgRGxXD2Jcri9Crz7ytPRU/yckYyxv7X/Na3029U8H4at dBnb8v8SHwsw1UyX5iyNkONbyYbr1xqIWKEG1BTMpGF0fy77OSFeTHQ4Gd8b568EiJ8g TBzhlhOS9cRrBd9C2WsM/G454UsJYsFrntSok= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=li9h4tydA6NYZ7tm8NAvM0RkuO1DvOT8U6gU0EDKrec=; b=DjtZdkzKUJL4+L9AlJzcns5NK7Lq0HbtKpnGJVvrclXmYLnSxvySM3pk4XKLTneDyH +neIHvye2t7AVk97z7RY/zBMMVwr0EXHMrKnIJrojxvEw5QzsHO5Mc4frRNG7gq+dnkc shMW2F8eY7dHjwChgHLr6VZclEjfNXlHxT7IGJokEyJbbPZsanmvUCnQ8oYoCXZDidEH x/7ZwfreiwsbvRnb/I0QJYRR4b23cDH2A+bz1G37WZmgsHxKJU7+f2o8xc0H3oJTwHuJ kJeA4OpA31fxlEHlOY68gjMXLcADvVkEgd1uWOaa5+zCbeGmvqDavJwyD0CdacqJYXhT ecSw== X-Gm-Message-State: APt69E3h5HECrMOjBi44rJs6TBE2oKuA9ZAkQbq0M386aJhfYVQZmOVb U+zOM7QztCYvl6tbyAzEY/sTK+S3Qxo= X-Received: by 2002:a17:902:b18b:: with SMTP id s11-v6mr26452889plr.190.1529546098935; Wed, 20 Jun 2018 18:54:58 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.57 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:52 -1000 Message-Id: <20180621015359.12018-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::235 Subject: [Qemu-devel] [PATCH v5 28/35] target/arm: Implement SVE floating-point complex add X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 7 +++ target/arm/sve_helper.c | 100 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 24 +++++++++ target/arm/sve.decode | 4 ++ 4 files changed, 135 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 891346a5ac..0bd9fe2f28 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1092,6 +1092,13 @@ DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 5309cf0866..ee7fc23bb9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3629,6 +3629,106 @@ void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) } } +/* + * FP Complex Add + */ + +void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + uint64_t *g = vg; + float16 neg_imag = float16_set_sign(0, simd_data(desc)); + float16 neg_real = float16_chs(neg_imag); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float16 e0, e1, e2, e3; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float16); + i -= 2 * sizeof(float16); + + e0 = *(float16 *)(vn + H1_2(i)); + e1 = *(float16 *)(vm + H1_2(j)) ^ neg_real; + e2 = *(float16 *)(vn + H1_2(j)); + e3 = *(float16 *)(vm + H1_2(i)) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + *(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, vs); + } + if (likely((pg >> (j & 63)) & 1)) { + *(float16 *)(vd + H1_2(j)) = float16_add(e2, e3, vs); + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + uint64_t *g = vg; + float32 neg_imag = float32_set_sign(0, simd_data(desc)); + float32 neg_real = float32_chs(neg_imag); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float32 e0, e1, e2, e3; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float32); + i -= 2 * sizeof(float32); + + e0 = *(float32 *)(vn + H1_2(i)); + e1 = *(float32 *)(vm + H1_2(j)) ^ neg_real; + e2 = *(float32 *)(vn + H1_2(j)); + e3 = *(float32 *)(vm + H1_2(i)) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + *(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, vs); + } + if (likely((pg >> (j & 63)) & 1)) { + *(float32 *)(vd + H1_2(j)) = float32_add(e2, e3, vs); + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + uint64_t *g = vg; + float64 neg_imag = float64_set_sign(0, simd_data(desc)); + float64 neg_real = float64_chs(neg_imag); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float64 e0, e1, e2, e3; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float64); + i -= 2 * sizeof(float64); + + e0 = *(float64 *)(vn + H1_2(i)); + e1 = *(float64 *)(vm + H1_2(j)) ^ neg_real; + e2 = *(float64 *)(vn + H1_2(j)); + e3 = *(float64 *)(vm + H1_2(i)) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + *(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, vs); + } + if (likely((pg >> (j & 63)) & 1)) { + *(float64 *)(vd + H1_2(j)) = float64_add(e2, e3, vs); + } + } while (i & 63); + } while (i != 0); +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 067c219b54..7a39be9bdd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3895,6 +3895,30 @@ DO_FPCMP(FACGT, facgt) #undef DO_FPCMP +static bool trans_FCADD(DisasContext *s, arg_FCADD *a, uint32_t insn) +{ + static gen_helper_gvec_4_ptr * const fns[3] = { + gen_helper_sve_fcadd_h, + gen_helper_sve_fcadd_s, + gen_helper_sve_fcadd_d + }; + + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, a->rot, fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 85f2b39776..7b5ada1311 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -721,6 +721,10 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +# SVE floating-point complex add (predicated) +FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ + rn=%reg_movprfx + ### SVE FP Multiply-Add Indexed Group # SVE floating-point multiply-add (indexed) From patchwork Thu Jun 21 01:53:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139416 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1477782lji; Wed, 20 Jun 2018 19:07:07 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIx/KGXrVsOT+18GZpWhqTNfheJQmQRVmfGKI+nuGluPI0m52c+NbfqJmSAYkZzRdSQo+kE X-Received: by 2002:a37:21e6:: with SMTP id f99-v6mr700882qki.206.1529546827779; Wed, 20 Jun 2018 19:07:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529546827; cv=none; d=google.com; s=arc-20160816; b=V6Z+K4FGvdSjGLloxrbbo66fyouIUHDKmFm6S8/okHutS3uyy18y/oiWAfu3Sk2B0j 0NgzvlSOY8vxp092ORhcjt2FhnL/aeD5wV9eZ9jaJg35384WAEgM3N9DIie1P5Vmewr2 lp5hSKFA3CKx8B9eLJWtloftpxDf+Ay21QRLsKPao8S8Zz9t32hlABNd6WLP0VH+1QbK K602IAvTOosdqYj0k2LFXckKWZhqdYE6oWwkaNWcgFQcZGx0oMZHyssELQRUORUEBG7V MR5C97VGyuSlV8C8eGHfv26ylkTsFzd/nX1MEXffXKoryw3E7z1sYIRgQyg5ALl6jB45 /Ahg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=knsSl75J+cMYx24an5DGn279NUCI/wNVkLxKUvInCr0=; b=VRT1lyusHYbrszvAmhxKKrud5ITO/Z8Fj7xSp+Zv7/zGfDMO5EOm45h9MWx2FNnYNm 09zCB25N9fTtRvdAJnSl8d2Zr5SmcmbaeeVZilOmEZdPwTwdO9R+KIwwR6mjBED17hEn pSSGVCVVQNuKicSQUZNs/LHrS3hz5R2u+92Ro8rJJNU3v7sdaFvZa5GX0ZP28mlHRrAO rRFcJ7OaWOGJc0biwXFjkp0nQrQroJ2MXpslbgQI+/VQXokbPZItWj9Htzc1gBT6f//7 /Dhm7VwdSA907ibPx5NWMlx+9K/LQ3uzJRv/1EZ1e8If8GwwsOrg6jjEfYY8m0QrUPci TBYQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=GO4WWikd; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id j6-v6si3932043qvl.174.2018.06.20.19.07.07 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:07:07 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=GO4WWikd; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52545 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp07-0005Xk-45 for patch@linaro.org; Wed, 20 Jun 2018 22:07:07 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39507) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooS-0004tA-1x for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooQ-0003Tx-MC for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:04 -0400 Received: from mail-pf0-x242.google.com ([2607:f8b0:400e:c00::242]:35973) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooQ-0003TM-Dh for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:02 -0400 Received: by mail-pf0-x242.google.com with SMTP id a12-v6so689780pfi.3 for ; Wed, 20 Jun 2018 18:55:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=knsSl75J+cMYx24an5DGn279NUCI/wNVkLxKUvInCr0=; b=GO4WWikdwc74BHSAljIwlBGSvTkcetCiWOnbzCpa773a3vk+fsoO/MuyA0SOnKRETq mIWA17Tqhb9tTWt/IKfS4qUCO6IvGjMFJ1/hvFPECg/FKX9vMsLKvxRkkwq9MA+jmJPe ra3wvMhmC7Zim2aukV5tJpXOEyFIaSOVpF2l8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=knsSl75J+cMYx24an5DGn279NUCI/wNVkLxKUvInCr0=; b=WuxLD1iRrzGOLGguuoB/fJoNzzAv/EApF2+JICyKW2XVfLTl0Zsz0EusKJXJ2j5B4C 2e+YTjYmTLgyl5MKkknKbsvv0IfzLN3sgiuAsvaAPEEuN7wlcNJROgX4jidN3D17znqq cqpRe1Ds+IK4OLADZZu9HXbG2SrWN7TYy61g338ugdh3DiVj8XJP82XlvPCA2TbPiex6 tSLR9pQmvIGOJoC7HYB5eoJpUP/5G/RaLzQJ1EKr26RE+nnFOLODK3Mmhs9T0OiQLzwQ lKPQ+0jXi6iskiohLdlrEEpfoygxXxKqtSyk9utZrJk/UPj8DMYe2odQVrAbTOqiBBSL iNZw== X-Gm-Message-State: APt69E0OAvii7n8KkPt6nmEOKVM/shsHIRI6A6oEJNG9W1NqcCgMrvvI OkCjwMXrAmad4/FGVkin7GqhxuMWEHY= X-Received: by 2002:aa7:83d1:: with SMTP id j17-v6mr25500360pfn.236.1529546100920; Wed, 20 Jun 2018 18:55:00 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.59 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:53 -1000 Message-Id: <20180621015359.12018-30-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::242 Subject: [Qemu-devel] [PATCH v5 29/35] target/arm: Implement SVE fp complex multiply add X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 + target/arm/sve_helper.c | 162 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 37 +++++++++ target/arm/sve.decode | 4 + 4 files changed, 207 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0bd9fe2f28..023952a9a4 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1115,6 +1115,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ee7fc23bb9..cd3dfc8b26 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3729,6 +3729,168 @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg, } while (i != 0); } +/* + * FP Complex Multiply + */ + +QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 22 > 32); + +void HELPER(sve_fcmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2); + bool flip = rot & 1; + float16 neg_imag, neg_real; + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + neg_imag = float16_set_sign(0, (rot & 2) != 0); + neg_real = float16_set_sign(0, rot == 1 || rot == 2); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float16 e1, e2, e3, e4, nr, ni, mr, mi, d; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float16); + i -= 2 * sizeof(float16); + + nr = *(float16 *)(vn + H1_2(i)); + ni = *(float16 *)(vn + H1_2(j)); + mr = *(float16 *)(vm + H1_2(i)); + mi = *(float16 *)(vm + H1_2(j)); + + e2 = (flip ? ni : nr); + e1 = (flip ? mi : mr) ^ neg_real; + e4 = e2; + e3 = (flip ? mr : mi) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + d = *(float16 *)(va + H1_2(i)); + d = float16_muladd(e2, e1, d, 0, &env->vfp.fp_status_f16); + *(float16 *)(vd + H1_2(i)) = d; + } + if (likely((pg >> (j & 63)) & 1)) { + d = *(float16 *)(va + H1_2(j)); + d = float16_muladd(e4, e3, d, 0, &env->vfp.fp_status_f16); + *(float16 *)(vd + H1_2(j)) = d; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fcmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2); + bool flip = rot & 1; + float32 neg_imag, neg_real; + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + neg_imag = float32_set_sign(0, (rot & 2) != 0); + neg_real = float32_set_sign(0, rot == 1 || rot == 2); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float32 e1, e2, e3, e4, nr, ni, mr, mi, d; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float32); + i -= 2 * sizeof(float32); + + nr = *(float32 *)(vn + H1_2(i)); + ni = *(float32 *)(vn + H1_2(j)); + mr = *(float32 *)(vm + H1_2(i)); + mi = *(float32 *)(vm + H1_2(j)); + + e2 = (flip ? ni : nr); + e1 = (flip ? mi : mr) ^ neg_real; + e4 = e2; + e3 = (flip ? mr : mi) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + d = *(float32 *)(va + H1_2(i)); + d = float32_muladd(e2, e1, d, 0, &env->vfp.fp_status); + *(float32 *)(vd + H1_2(i)) = d; + } + if (likely((pg >> (j & 63)) & 1)) { + d = *(float32 *)(va + H1_2(j)); + d = float32_muladd(e4, e3, d, 0, &env->vfp.fp_status); + *(float32 *)(vd + H1_2(j)) = d; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2); + bool flip = rot & 1; + float64 neg_imag, neg_real; + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + neg_imag = float64_set_sign(0, (rot & 2) != 0); + neg_real = float64_set_sign(0, rot == 1 || rot == 2); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float64 e1, e2, e3, e4, nr, ni, mr, mi, d; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float64); + i -= 2 * sizeof(float64); + + nr = *(float64 *)(vn + H1_2(i)); + ni = *(float64 *)(vn + H1_2(j)); + mr = *(float64 *)(vm + H1_2(i)); + mi = *(float64 *)(vm + H1_2(j)); + + e2 = (flip ? ni : nr); + e1 = (flip ? mi : mr) ^ neg_real; + e4 = e2; + e3 = (flip ? mr : mi) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + d = *(float64 *)(va + H1_2(i)); + d = float64_muladd(e2, e1, d, 0, &env->vfp.fp_status); + *(float64 *)(vd + H1_2(i)) = d; + } + if (likely((pg >> (j & 63)) & 1)) { + d = *(float64 *)(va + H1_2(j)); + d = float64_muladd(e4, e3, d, 0, &env->vfp.fp_status); + *(float64 *)(vd + H1_2(j)) = d; + } + } while (i & 63); + } while (i != 0); +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7a39be9bdd..6487fe760a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3968,6 +3968,43 @@ DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz) #undef DO_FMLA +static bool trans_FCMLA_zpzzz(DisasContext *s, + arg_FCMLA_zpzzz *a, uint32_t insn) +{ + static gen_helper_sve_fmla * const fns[3] = { + gen_helper_sve_fcmla_zpzzz_h, + gen_helper_sve_fcmla_zpzzz_s, + gen_helper_sve_fcmla_zpzzz_d, + }; + + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned desc; + TCGv_i32 t_desc; + TCGv_ptr pg = tcg_temp_new_ptr(); + + /* We would need 7 operands to pass these arguments "properly". + * So we encode all the register numbers into the descriptor. + */ + desc = deposit32(a->rd, 5, 5, a->rn); + desc = deposit32(desc, 10, 5, a->rm); + desc = deposit32(desc, 15, 5, a->ra); + desc = deposit32(desc, 20, 2, a->rot); + desc = sextract32(desc, 0, 22); + desc = simd_desc(vsz, vsz, desc); + + t_desc = tcg_const_i32(desc); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + fns[a->esz - 1](cpu_env, pg, t_desc); + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(pg); + } + return true; +} + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7b5ada1311..da89697700 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -725,6 +725,10 @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ rn=%reg_movprfx +# SVE floating-point complex multiply-add (predicated) +FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \ + ra=%reg_movprfx + ### SVE FP Multiply-Add Indexed Group # SVE floating-point multiply-add (indexed) From patchwork Thu Jun 21 01:53:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139433 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1488699lji; Wed, 20 Jun 2018 19:23:15 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLPNiktgCRGxWt8XVZFm4xo9i1GfQeETWolpxkmCk5eRmXHn0zQyJAlcmt/H9RveyHZ6DW9 X-Received: by 2002:a37:a10:: with SMTP id 16-v6mr4348355qkk.228.1529547795262; Wed, 20 Jun 2018 19:23:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547795; cv=none; d=google.com; s=arc-20160816; b=pBOeXzVoOCKdav0hYkRy5NoubXKjCgXZBGKGyQpAbOn33hRtT/ENg6UKjnuIQ4v6xK 9eaNZQqKTpcQP7I7AfZDTUpqmC6X4FASNhTDbWHB7TZIt+BWxUqoTUr5rPqlKuj41BfR L6igGb+7/aEooesaXXWwyl3c0taCak7In0f1iuHlfehC1RIB5t3z4Mp/C15eZL0/0KRm M0bbLoV8tqhpHdHqvvFhpEG9K4k07q2cpv4C/ob1UJ2dxDyda1KQMTyX4XnSliBkIl5P oDeCGguk9gEeZgVJNe4s3fYuNFIdVKQvI4i5NOIEkVsoUSB8uPl5qFfjjxD3Fiy/Rapv YQeQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=NuYhoUhPplWoUIEzViyhFHHhbuisvWgK7aXq7dj7Bws=; b=gP9Uc2/Z4iFy7CAXitgxaxRAkFV1YJkH95W3Nf9zh50LRDqRiySdCbHnbKtA0w/+Si zZP5/XKvaeaLAVmU1WEFE9cBb4c/Kb56MWCR7kqLEaSr/8MXHR/VQElhP61w6xcIDrOS W745acslwHwm12n+fpNY4yjTpqTqukrj6xXEAk8ikZ7lfksTolYueifz263hPyKuB5Ww WNdr+mRfFlyGW4dP6/afaKW2vfsfMdbRiFBoMBLUI/7u09soar5pOFE7Me4cbyQJUfH5 0OlVxB4dPlD0SvSt1OKuQgItaRqUYCIPDKESdW/uOCRmEsVbN4Io87LaP7876Lc4X4UE YYHQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=iZl8jqOl; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id m7-v6si3522605qvo.247.2018.06.20.19.23.15 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:23:15 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=iZl8jqOl; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52635 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpFi-0002Be-R3 for patch@linaro.org; Wed, 20 Jun 2018 22:23:14 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39524) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooT-0004uH-5T for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooS-0003Ut-Bq for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:05 -0400 Received: from mail-pl0-x235.google.com ([2607:f8b0:400e:c01::235]:36790) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooS-0003UV-5W for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:04 -0400 Received: by mail-pl0-x235.google.com with SMTP id a7-v6so758236plp.3 for ; Wed, 20 Jun 2018 18:55:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=NuYhoUhPplWoUIEzViyhFHHhbuisvWgK7aXq7dj7Bws=; b=iZl8jqOl7KrFWi80RF2LGXQt82p2pfqtffZyTACNRDOWHwWYYl/rx6ZJdHdS3RfWNf nVR7y1alA9zxk71vOKbKH5njN1hGyc6Owt1HtRhV0QZmyx1ECW1DdfW14TeF63xhtayC xwRMMBtEMxH5jIr0WlGbJggCEQallIpWJ3VLU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=NuYhoUhPplWoUIEzViyhFHHhbuisvWgK7aXq7dj7Bws=; b=QyDoObEchFY5h02hd1ug4Nf3iQEZdToXng2eiDL6H/jwpiuK5TxtNs6Y5elhdeVpFK 47f10oS+b6WEE44gbU5h2WtftCnb87+rlfAwiKhocP9nVNn6C/ukHyWD4t61csN7N+hq skQhxb+Hh4/xfqkbrbeyqnCb/OxoQ5rIuLjMtCdDQgDDniFB5/dkshjNaY+fvDIXHcUG o26BjjzE8LYrjD30o/BUm/ZAY5jTuUn5cXtdIi+Yz70/nQIscZR6bUNpMcK/NRHwMXYB SpEG3RCUjc4e8smBJGldTJXR6KkjJBQoFGmIHK6GSG4ontXQryPxa5PJRsfgtJewEwEG fhGQ== X-Gm-Message-State: APt69E0e8q7KX3KPqRjn5jeQ59aFny74qcLTFJD+fMYKJ5c5EQNuEIGU 9dBYVtuRAX+ZHFhtjbP5XcRfbJLKqDc= X-Received: by 2002:a17:902:9a06:: with SMTP id v6-v6mr26242519plp.21.1529546102935; Wed, 20 Jun 2018 18:55:02 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.01 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:54 -1000 Message-Id: <20180621015359.12018-31-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::235 Subject: [Qemu-devel] [PATCH v5 30/35] target/arm: Pass index to AdvSIMD FCMLA (indexed) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" The original commit failed to pass, or use, the index. Fixes: d17b7cdcf4ea Signed-off-by: Richard Henderson --- target/arm/translate-a64.c | 21 ++++++++++++--------- target/arm/vec_helper.c | 10 ++++++---- 2 files changed, 18 insertions(+), 13 deletions(-) -- 2.17.1 diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 8d8a4cecb0..038e48278f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -12669,15 +12669,18 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ case 0x17: /* FCMLA #270 */ - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_reg_offset(s, rm, index, size), fpst, - is_q ? 16 : 8, vec_full_reg_size(s), - extract32(insn, 13, 2), /* rot */ - size == MO_64 - ? gen_helper_gvec_fcmlas_idx - : gen_helper_gvec_fcmlah_idx); - tcg_temp_free_ptr(fpst); + { + int rot = extract32(insn, 13, 2); + int data = index * 4 + rot; + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_reg_offset(s, rm, index, size), fpst, + is_q ? 16 : 8, vec_full_reg_size(s), data, + size == MO_64 + ? gen_helper_gvec_fcmlas_idx + : gen_helper_gvec_fcmlah_idx); + tcg_temp_free_ptr(fpst); + } return; } diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 073e5c58e7..8f2dc4b989 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -317,10 +317,11 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); + intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; uintptr_t i; - float16 e1 = m[H2(flip)]; - float16 e3 = m[H2(1 - flip)]; + float16 e1 = m[H2(2 * index + flip)]; + float16 e3 = m[H2(2 * index + 1 - flip)]; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 15; @@ -377,10 +378,11 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); + intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; uintptr_t i; - float32 e1 = m[H4(flip)]; - float32 e3 = m[H4(1 - flip)]; + float32 e1 = m[H4(2 * index + flip)]; + float32 e3 = m[H4(2 * index + 1 - flip)]; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 31; From patchwork Thu Jun 21 01:53:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139427 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1484550lji; Wed, 20 Jun 2018 19:16:45 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKzpCld0Vis/Y7njwFzVZ5lYriA4hpxBmI6tEErd3IqfDwSJ+/7Hbxo7A7nMOzmmUdejTF2 X-Received: by 2002:aed:3caa:: with SMTP id d39-v6mr21908892qtf.408.1529547405562; Wed, 20 Jun 2018 19:16:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547405; cv=none; d=google.com; s=arc-20160816; b=bbd3x1VvTrANph7UlLdjVDqJRiwBGZ6CdQLZkQ+USZS/37uI3LiOTlPuulLTKYEueB BOogLSU/2OVytQOayIcdthe4JTLF0uocvCcQYFTO2dHoUax5tKXvAWFm3a7wQ3WwQjkn /AWBQCwH1snPgBZcDQCM4BksGZaanC0j9c1s27SD+hBYL0KOKjYHmSrKgaqg9/8YEy6P LFmDIfH1Uy/GAr/NZUO9zJJr/1aLmCDxbUdDOOkYnaFRxWFeGNvqcUwQU6Ne3KXZhj4d CThqDVFMW8HLV0BqJsc9dhk/xage+S041Zog7joZDjnxZdwpC1lLtx+0YiDsz8NR5UYf FisQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=cTkQOLsJ7w2QJCExRny2kUXfmlF9jgGlwgL2VbhhBnA=; b=LaellNrXrItLARbBCki9AzkbwnVCw2ScgtInb9p6O0Zf40omPdvily8rsUCis8DBbx d+4s6XnW25zalbHx4PXhW0BW/eH/OE24mAWRtWpDC7xyh2v5LDN3Iigt1rgz1c6ALMgs 9JfPrC6fw0ef/6N13JlILtqhKZWma8tMrpFjMagQpUwqsI7ILiw8zz3SKu1zQztDMzC1 LMOTefHLLMIXdrkN1rEAywEWiJ486gR/PJsrSoY3dyBEPTYapC4cizYPEcqTgG3SKXkU ETh38FWBeSOVzA961rrSJpifzqXfwagLaccckAWpDQ0wF6Z10k+q5e2C+yiOl7XK56fI b7Ug== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=AHckJE8T; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id o21-v6si3718701qtm.130.2018.06.20.19.16.45 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:16:45 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=AHckJE8T; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52597 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp9R-0004GD-1Z for patch@linaro.org; Wed, 20 Jun 2018 22:16:45 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39567) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooY-0004zQ-3S for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooU-0003WH-7h for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:10 -0400 Received: from mail-pl0-x22b.google.com ([2607:f8b0:400e:c01::22b]:38069) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooU-0003Vg-04 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:06 -0400 Received: by mail-pl0-x22b.google.com with SMTP id d10-v6so757999plo.5 for ; Wed, 20 Jun 2018 18:55:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=cTkQOLsJ7w2QJCExRny2kUXfmlF9jgGlwgL2VbhhBnA=; b=AHckJE8Tqqlj9FzWbFYAQTxzHTf5m9XO/dr6iag24efxQvjWTx6rtwQ+JL2kL6RlDp +VHDChJNjUNAgJrgeV4CmWPjxvB0xcJEl+GQsYy5948wK9dni+gNnRM8/+Wd6PkfWtK7 jpdMHLlF/QJb1+gCbjoFpDR/9Gbtpvx8kuUPQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=cTkQOLsJ7w2QJCExRny2kUXfmlF9jgGlwgL2VbhhBnA=; b=j+fYh21dZ30JrVSYdZkHWtKhkv9dcg86JZFpo8hv+QbaaoZ0Lw32mut0jtKwGuj7SV CRLOP2qNbAF2xcD2FDR883hbWMnpVTSYPkvlv+tIizxJ+oFV4Tg2g7yxBgAJDNCm61Cn coX+3JMRjpqUvhxh6W42AlUEx6OeF01JxlwFWEF622WEalSY7BHZgkjp2uSMO4zVP4a/ VU9BsWuGiYC8HZzTdIOkpfFEgqPr3EcLqpEXzCFTFEqk9hQPiMOqW8kEPGpetdmFZGbH fJpYsOx/+tNcM+dh4NPWpEYq4zfobMTQxJjZBrvuVUTNY3TvSiQyBWqMNN/o6mTrSb5J UG6Q== X-Gm-Message-State: APt69E2CDHUhV02W+ODflvH/vNZ0Kna1ZNJRtjJZpKonPldtHtLnIb0M DfDjW8k3XEbGjGDtnNDUKFnc6VEZpzM= X-Received: by 2002:a17:902:650a:: with SMTP id b10-v6mr26568033plk.45.1529546104693; Wed, 20 Jun 2018 18:55:04 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.03 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:55 -1000 Message-Id: <20180621015359.12018-32-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22b Subject: [Qemu-devel] [PATCH v5 31/35] target/arm: Implement SVE fp complex multiply add (indexed) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Enhance the existing helpers to support SVE, which takes the index from each 128-bit segment. The change has no effect for AdvSIMD, since there is only one such segment. Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 23 ++++++++++++++++++ target/arm/vec_helper.c | 50 +++++++++++++++++++++++--------------- target/arm/sve.decode | 6 +++++ 3 files changed, 59 insertions(+), 20 deletions(-) -- 2.17.1 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6487fe760a..209a69cd76 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4005,6 +4005,29 @@ static bool trans_FCMLA_zpzzz(DisasContext *s, return true; } +static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_fcmlah_idx, + gen_helper_gvec_fcmlas_idx, + }; + + tcg_debug_assert(a->esz == 1 || a->esz == 2); + tcg_debug_assert(a->rd == a->ra); + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, + a->index * 4 + a->rot, + fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 8f2dc4b989..db5aeb9f24 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -319,22 +319,27 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; - uintptr_t i; - float16 e1 = m[H2(2 * index + flip)]; - float16 e3 = m[H2(2 * index + 1 - flip)]; + intptr_t elements = opr_sz / sizeof(float16); + intptr_t eltspersegment = 16 / sizeof(float16); + intptr_t i, j; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 15; neg_imag <<= 15; - e1 ^= neg_real; - e3 ^= neg_imag; - for (i = 0; i < opr_sz / 2; i += 2) { - float16 e2 = n[H2(i + flip)]; - float16 e4 = e2; + for (i = 0; i < elements; i += eltspersegment) { + float16 mr = m[H2(i + 2 * index + 0)]; + float16 mi = m[H2(i + 2 * index + 1)]; + float16 e1 = neg_real ^ (flip ? mi : mr); + float16 e3 = neg_imag ^ (flip ? mr : mi); - d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst); - d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst); + for (j = i; j < i + eltspersegment; j += 2) { + float16 e2 = n[H2(j + flip)]; + float16 e4 = e2; + + d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst); + d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst); + } } clear_tail(d, opr_sz, simd_maxsz(desc)); } @@ -380,22 +385,27 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; - uintptr_t i; - float32 e1 = m[H4(2 * index + flip)]; - float32 e3 = m[H4(2 * index + 1 - flip)]; + intptr_t elements = opr_sz / sizeof(float32); + intptr_t eltspersegment = 16 / sizeof(float32); + intptr_t i, j; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 31; neg_imag <<= 31; - e1 ^= neg_real; - e3 ^= neg_imag; - for (i = 0; i < opr_sz / 4; i += 2) { - float32 e2 = n[H4(i + flip)]; - float32 e4 = e2; + for (i = 0; i < elements; i += eltspersegment) { + float32 mr = m[H4(i + 2 * index + 0)]; + float32 mi = m[H4(i + 2 * index + 1)]; + float32 e1 = neg_real ^ (flip ? mi : mr); + float32 e3 = neg_imag ^ (flip ? mr : mi); - d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst); - d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst); + for (j = i; j < i + eltspersegment; j += 2) { + float32 e2 = n[H4(j + flip)]; + float32 e4 = e2; + + d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst); + d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst); + } } clear_tail(d, opr_sz, simd_maxsz(desc)); } diff --git a/target/arm/sve.decode b/target/arm/sve.decode index da89697700..b578d104c4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -729,6 +729,12 @@ FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \ ra=%reg_movprfx +# SVE floating-point complex multiply-add (indexed) +FCMLA_zzxz 01100100 10 1 index:2 rm:3 0001 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx esz=1 +FCMLA_zzxz 01100100 11 1 index:1 rm:4 0001 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx esz=2 + ### SVE FP Multiply-Add Indexed Group # SVE floating-point multiply-add (indexed) From patchwork Thu Jun 21 01:53:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139435 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1490114lji; Wed, 20 Jun 2018 19:25:35 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIsdw11Dhhih4IGdarUHS2Li6ZFH29ervSUQ7Gdrkd6fONkGbCa2aXWNmfX5y2vtJpaJbgd X-Received: by 2002:aed:3b3b:: with SMTP id p56-v6mr21359599qte.372.1529547935552; Wed, 20 Jun 2018 19:25:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547935; cv=none; d=google.com; s=arc-20160816; b=A9woxu+TU3l26OCPz/tr+tg0wwmYGCZPr5TKk7dmccevt+TNgwPW4Nsl/j2vJwrXrD Zpq+d4V1yys0oNyeK/O9IfNwdrzZC6THQZWmX8z4jFOnD2xObXily7eFzTYUYH8bTrtS BXww0gzNwanuvrCVJGpu8IiSahY1T6XzolCRrCJyI2l/QO0TZ+v04k9fUDsZPCwKaNac DTKpWXAMxWBDo11TT/g7rV3VfPjtSxVhVJZmNbppCCkLDfhNwlNfyHn8Q5WxuCrUef5c lcpYfXi+FI6Gu5POFO2xl6m55oCxDpa5eJQiH0b8TO0zBF2eyfollPH1txC9G5yce6P2 rSFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=/jTgOXL9VwGXcuJ4oLcHRPbBUE2s7YvX1Bg6uCTWz1E=; b=h2R4v27xZrMyGv8iwGLrakE5ifiYkl834p5Zm4OA6S4HnG+w6SjtPXDCix/mYJE7FQ 5/zSZRd1kVfvVqn1uPm1b5Ht+TDzDug6EwOyHb39KidiF/3Ceb8Xlw5TWAlbxXtaJNln 6xJfg2OQ4umGHapW2TS+2Rc8Dwl1NXhsTabcKDJvpyGPaTPoUJ8Up7EPyMugOz8cl4vV qAUGWH1RWTaMErurpO/6ENpxuDfQq3kvU9+ulBi0o/Kb6QMypNwTptFiumBlB/xIOldu cJmqWfdnSKIdTe88qOQ0rN+IKaI47zvX65Vc7BbOWL+O/ZGET7UBOoBy7PN+B6mmms4P MjXg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=SJoBpV6F; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id r2-v6si2784534qtj.249.2018.06.20.19.25.35 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:25:35 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=SJoBpV6F; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52648 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpHz-0003m7-2I for patch@linaro.org; Wed, 20 Jun 2018 22:25:35 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39570) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooY-0004zU-43 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooV-0003XO-ON for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:10 -0400 Received: from mail-pf0-x234.google.com ([2607:f8b0:400e:c00::234]:46309) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooV-0003Wt-HZ for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:07 -0400 Received: by mail-pf0-x234.google.com with SMTP id q1-v6so677329pff.13 for ; Wed, 20 Jun 2018 18:55:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=/jTgOXL9VwGXcuJ4oLcHRPbBUE2s7YvX1Bg6uCTWz1E=; b=SJoBpV6F3WRiO59/SsA0BQQlylcCDgPgeBoDp+ifaPxu/UZRDpe+L2buLPyZMGIQiJ Y5IrwleDOBB89I4JdrkbNGPnH7GuGZ1gtfq/YBKWsA67P+b7a7w4isl4jEm+5qAT5i4K JY7x8zvCZq9CdE2Of1j4i5UJRtATQKwFSGYWM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=/jTgOXL9VwGXcuJ4oLcHRPbBUE2s7YvX1Bg6uCTWz1E=; b=ZQUTioejahW057vf2kWVM2yBX/0WHP6jgEZnwruFp6wgtoKjYV++XOUZUDPHDbPr/F xN4/YK4XQHFpjDk6U6rN5uSxDETCWf7nCt+ZazMl4zrW3A1d4OsLSUr9KoGiHxvcLs5X mXTbdretrTwCs/d4rhNvg9jFjrvUedOvgz5Wm/3drNedswnpX+iS84oObnVF6V9lCrX9 L4tUa5ohJmc6WdSl0DTPquS349bLQKdlN7ibiMS1iFPkneBoLr9lHNBU8AM1Ev1neIAi wCnqH/afs4klVGmex4jpfZWFNRqoV7vj8lZmUh5aQ4JSBYKxjoxR5zOA+65vzNDthUpd BlTg== X-Gm-Message-State: APt69E0shthRutxMMhXdldAMPF2hWzBlk32YtDs8kegVI/l93IrgO3qR XU0N/KRo8LlS8/8/ITirQJvj6iLlBfA= X-Received: by 2002:a62:f705:: with SMTP id h5-v6mr25491227pfi.169.1529546106269; Wed, 20 Jun 2018 18:55:06 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.04 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:56 -1000 Message-Id: <20180621015359.12018-33-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::234 Subject: [Qemu-devel] [PATCH v5 32/35] target/arm: Implement SVE dot product (vectors) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 5 +++ target/arm/translate-sve.c | 17 ++++++++++ target/arm/vec_helper.c | 67 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 3 ++ 4 files changed, 92 insertions(+) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/helper.h b/target/arm/helper.h index 8607077dda..e23ce7ff19 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -583,6 +583,11 @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 209a69cd76..aa109208e5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3423,6 +3423,23 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[2][2] = { + { gen_helper_gvec_sdot_b, gen_helper_gvec_sdot_h }, + { gen_helper_gvec_udot_b, gen_helper_gvec_udot_h } + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fns[a->u][a->sz]); + } + return true; +} + /* *** SVE Floating Point Multiply-Add Indexed Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index db5aeb9f24..c16a30c3b5 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -194,6 +194,73 @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +/* Integer 8 and 16-bit dot-product. + * + * Note that for the loops herein, host endianness does not matter + * with respect to the ordering of data within the 64-bit lanes. + * All elements are treated equally, no matter where they are. + */ + +void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd; + int8_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] += n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd; + uint8_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] += n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd; + int16_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] += (int64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (int64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (int64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (int64_t)n[i * 4 + 3] * m[i * 4 + 3]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd; + uint16_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] += (uint64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm, void *vfpst, uint32_t desc) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b578d104c4..0b29da9f3a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -721,6 +721,9 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +# SVE integer dot product (unpredicated) +DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 + # SVE floating-point complex add (predicated) FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ rn=%reg_movprfx From patchwork Thu Jun 21 01:53:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139434 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1490000lji; Wed, 20 Jun 2018 19:25:22 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLKB0bALr7OPggYuBmWbu60giJ4hNUfSAwC7QdfEmch6ow8Uyyd+jDmT3xDIlit3OjqzWV8 X-Received: by 2002:a37:2a8e:: with SMTP id q14-v6mr20619868qkq.380.1529547922311; Wed, 20 Jun 2018 19:25:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547922; cv=none; d=google.com; s=arc-20160816; b=Giv9Ph8PdkOaUb5QFgzri8/eroUXMIILaYi9JvCjWaVPgsrm8+11TKRFWKbaq4Xma/ n5DE60jPw5aHOfDAhqQ8mkMLDu1tkut90M8s/jO7Jn1pVyG7jMiLLxQPPjECOmhKmAx4 YBqb/y/nvw6+g17GXDVgkBegQY8OdUYTLnVOOgzyU87xMM4KGl2N7Fz2iOVYitgijywZ 92zbtPj9aGTKeAMjjqLF4Eu1JZ3cF9EJ0SxtMJFHKDCAv9TZtUWf+DjQTC/xHZlGJR0J mGmcEHMjN9a13SHVRkq6+zGjgBr2MXcmaicdserjkcd1Irh7HSrhhW/zmsri/2RnIabH 11/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=VTLYFUk8noLkXNDXuc9Ff1/aKekYRtej65w/CxbcRBo=; b=bkpQXABSdM4Wm8FcBX0X6nCXBalDGVfKJT65tUr5QZEdoQa5YEY4VtLBHjoNZOpEvb Kf3zDYrL/q/65/Pd8LPo1JJIwRRFEg4xdwN6cdJVqZULIvX/MD4PiY2Krtyn+pKze4AI 5kCuBMZzslEiVGz3/QKTJR4TeH61skD2bvhpj2cG8dJt44zITwXmAgNpBBN0tbNCHnus 9LTWqI5t2q8cA0humH3HOkEhisRVYEmd9aaoJqFx41Hr+JUJNi0Pvc6JWZPrQuZMuhCJ 1+1gKhO+wG/1I1wQtRmbSQXSoyTFFmZ6Fkc9BpDDEHHAiF0w/8PuzD/K91Oy9ohzknhB cU2Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=aOZuopV2; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id o133-v6si1181431qke.209.2018.06.20.19.25.22 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:25:22 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=aOZuopV2; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52647 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpHl-0003ja-RR for patch@linaro.org; Wed, 20 Jun 2018 22:25:21 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39590) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooZ-00050I-BD for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooX-0003Ya-P6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:11 -0400 Received: from mail-pl0-x22f.google.com ([2607:f8b0:400e:c01::22f]:38073) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooX-0003Xw-GG for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:09 -0400 Received: by mail-pl0-x22f.google.com with SMTP id d10-v6so758077plo.5 for ; Wed, 20 Jun 2018 18:55:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=VTLYFUk8noLkXNDXuc9Ff1/aKekYRtej65w/CxbcRBo=; b=aOZuopV2Xits/v/nAXedaE3FJGRzUWFxnvI87kFStOR1xpGE3QD7HnYVh1fEwifXy4 ccJzOG4YXOjfaAMqIi4s3pOIWvXjxtqtRiAg56V+QIEy9VpNw/kKOazL8n1R3te5+497 DZxM9fUXtZBH1MFKCGCCMKJDADZfpqyEq3DHg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=VTLYFUk8noLkXNDXuc9Ff1/aKekYRtej65w/CxbcRBo=; b=C9DPXN0itOBrSVWPeF+E2fah96auQiREf18B8ISnoAmFN9uiB+eJCte0WvmxJrZiP4 59TUvyZK5Hzu2DzXaEl9ca0WL76aXeuCKtrPVQK6YuAsnYWo++hcy2ACuOMNYU3xpO+K vfBcjS6AKCxcunJpj6Zv3SPdEIc+dZ+UAe1OWqbQs13Cpm1xb6P7diQdaUagLdytPpkK UW8ba/ZS2b9mJQmW0ZaikAaYLlMCoKmDL5Gcv5hECVS+RzaWs+iPwEaGTeApOBc3EMuv bwWiobnSCdmtHIhOmFbj+FZcwCWctE54GyGPm+cY0F81o8xvGwIjsHYXmFf1zlFuG1YB 0R5g== X-Gm-Message-State: APt69E3gKc9szqq7zYnhqM5b9iQRMbnRAoMvt9PWtp+f/Gk1TXpbAkD9 5hmvJfqVwkaFCewKRhc0iHBKQ2TGSZg= X-Received: by 2002:a17:902:22e:: with SMTP id 43-v6mr9206749plc.82.1529546108212; Wed, 20 Jun 2018 18:55:08 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.06 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:07 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:57 -1000 Message-Id: <20180621015359.12018-34-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22f Subject: [Qemu-devel] [PATCH v5 33/35] target/arm: Implement SVE dot product (indexed) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 5 ++ target/arm/translate-sve.c | 18 +++++++ target/arm/vec_helper.c | 96 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 +++- 4 files changed, 126 insertions(+), 1 deletion(-) -- 2.17.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index e23ce7ff19..59e8c3bd1b 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -588,6 +588,11 @@ DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index aa109208e5..af2958be10 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3440,6 +3440,24 @@ static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn) return true; } +static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[2][2] = { + { gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h }, + { gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h } + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, a->index, fns[a->u][a->sz]); + } + return true; +} + + /* *** SVE Floating Point Multiply-Add Indexed Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index c16a30c3b5..3117ee39cd 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -261,6 +261,102 @@ void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; + intptr_t index = simd_data(desc); + uint32_t *d = vd; + int8_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz_4; i = j) { + int8_t m0 = m[(i + index) * 4 + 0]; + int8_t m1 = m[(i + index) * 4 + 1]; + int8_t m2 = m[(i + index) * 4 + 2]; + int8_t m3 = m[(i + index) * 4 + 3]; + + j = i; + do { + d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; + } while (++j < MIN(i + 4, opr_sz_4)); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; + intptr_t index = simd_data(desc); + uint32_t *d = vd; + uint8_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz_4; i = j) { + uint8_t m0 = m[(i + index) * 4 + 0]; + uint8_t m1 = m[(i + index) * 4 + 1]; + uint8_t m2 = m[(i + index) * 4 + 2]; + uint8_t m3 = m[(i + index) * 4 + 3]; + + j = i; + do { + d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; + } while (++j < MIN(i + 4, opr_sz_4)); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; + intptr_t index = simd_data(desc); + uint64_t *d = vd; + int16_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz_8; i = j) { + int64_t m0 = m[(i + index) * 4 + 0]; + int64_t m1 = m[(i + index) * 4 + 1]; + int64_t m2 = m[(i + index) * 4 + 2]; + int64_t m3 = m[(i + index) * 4 + 3]; + + j = i; + do { + d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; + } while (++j < MIN(i + 2, opr_sz_8)); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; + intptr_t index = simd_data(desc); + uint64_t *d = vd; + uint16_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz_8; i = j) { + uint64_t m0 = m[(i + index) * 4 + 0]; + uint64_t m1 = m[(i + index) * 4 + 1]; + uint64_t m2 = m[(i + index) * 4 + 2]; + uint64_t m3 = m[(i + index) * 4 + 3]; + + j = i; + do { + d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; + } while (++j < MIN(i + 2, opr_sz_8)); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm, void *vfpst, uint32_t desc) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0b29da9f3a..f69ff285a1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -722,7 +722,13 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s # SVE integer dot product (unpredicated) -DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 +DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 ra=%reg_movprfx + +# SVE integer dot product (indexed) +DOT_zzx 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \ + sz=0 ra=%reg_movprfx +DOT_zzx 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \ + sz=1 ra=%reg_movprfx # SVE floating-point complex add (predicated) FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ From patchwork Thu Jun 21 01:53:58 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139436 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1491443lji; Wed, 20 Jun 2018 19:27:43 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKkYlu+9/eheCg70JdBWTtqRztKI1+sV/EXy+c5nPPUQLThwNk5CTNuNeoT4NNqdv+Oub21 X-Received: by 2002:a37:4e8c:: with SMTP id c134-v6mr18977223qkb.207.1529548063224; Wed, 20 Jun 2018 19:27:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529548063; cv=none; d=google.com; s=arc-20160816; b=xbOZVU9iHOQLM2CA5/FgXPT4FrGslYd9LH4EPRAiMPGQtwdZMOZAeAgFGfTPqvJwnY PfswmCOJbpY+eECuGO3yLi5vBBI5Td3hsWN2v3hnBHLwDUgINwtRaydmiA+DBK+eYEaL bVvu6uQgcJDyftFIOT8UWhFAOxatdgdaAGzHhedyMc2Ybx3h0y4V5peR3MCHLtoXmbSZ Fcs5gKm2ZMTX9MkNle+pFlgyxN+Ug0gsODVpbmDIjiig08ru5rIcLztgCwsBARBmcnYO 6Tpn2gkjX7NeJv2KquqynLzCWo9iWRWrlksXCgQgypqnqo8djnUr/Lj2oi7+FN2WnH8T eUnA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=gLVizP9+OtQMsYfZB5CLtpDd26fS7Q4VxOJvrGxLcWk=; b=ZkGPjUYCmAJSXm7yNRA+/if1vkzdG0Mf29SKHPK1K5yjuYqPTG1qF/+h6aEkR/wdPM ge9wqk8YcZ3DCp8aCAhYRishryv4I4PYxrzdEwcYCXePdJmoxLaVXcExdajU3mZjPIvG SilP0anwv0Af3zGLchh6xAO3Q2RC2bquebLuYsvxwAkhQhS+r2PKCPuv4/8Tnzz6Oamw 3uGg/rlalTbLkLyPg3CBPGqReOYtxqmUnbK48VBl5TcV81KMh7eg+W4yX9AjxorYu6az oEXobwVsotwWyB8fpVd76yp/9jDUTBlRGA98ZaPS+cCzQpXNSPgr/4GUBG5CXnzmCJAt FooQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Okl5CQko; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id o4-v6si3142229qko.171.2018.06.20.19.27.43 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:27:43 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Okl5CQko; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52653 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpK2-0004vq-Pb for patch@linaro.org; Wed, 20 Jun 2018 22:27:42 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39617) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoob-00050J-0M for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooZ-0003a3-LD for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:43026) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooZ-0003ZL-Ft for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:11 -0400 Received: by mail-pl0-x243.google.com with SMTP id c41-v6so747004plj.10 for ; Wed, 20 Jun 2018 18:55:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=gLVizP9+OtQMsYfZB5CLtpDd26fS7Q4VxOJvrGxLcWk=; b=Okl5CQkoUBL0dxb31zKdiNx0CoQ7TviNHo/h07VEmSCUWfXIaQ38DlywpQ76qDh4tQ SZuXEDopk0BXZLUKB/yrvjAUH13iy8tYhwwtGVA8/zNQH3Lc1RThWhn9lFpz2LlzzKsN 7mfTIhxhXJ50SVliAvvArNnAScLqYgTe3FfpU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=gLVizP9+OtQMsYfZB5CLtpDd26fS7Q4VxOJvrGxLcWk=; b=S9SKte8EbE8RMJJYHD0dBN8SZ+zDcvanlc6qrPzs80tqFVM+U1LkQxz1yN97JtCE9v uqFS6aCuZuvRGRjFCxv8fVAzR+oSDODxYZno3uibhX8trgznbpe4XZDoZjwQG4siKAaT Sm9nZfRbRFYtkFr6iGvOxnEwTJ1yiZEcB6d6RJJmvg0JdqAYOMkhGzAymXowJM0BFpCm P5oHx5wHuGwfAs4Mc0OvPWQDS/cDmOdtusWoeFX9rZ5Sz088eEQ2ry4ydRFAjiZyzsKt 1Bn7q5kr+GAMwNDaobNcmVdEhAGtsDh/ALEVmolFaY4MzaaWvQgDOrUFAS2MyRwl349s Aqtw== X-Gm-Message-State: APt69E39sm1oSZmyGVfyosvOmPSzsphsP5dO6uCvLmzzPfZRrPI07vTM YvwRbsLx5eWUtp1TQyNZm2u0e1RUT1c= X-Received: by 2002:a17:902:1081:: with SMTP id c1-v6mr26228473pla.153.1529546110071; Wed, 20 Jun 2018 18:55:10 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.08 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:58 -1000 Message-Id: <20180621015359.12018-35-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v5 34/35] target/arm: Enable SVE for aarch64-linux-user X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Enable ARM_FEATURE_SVE for the generic "max" cpu. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.c | 7 +++++++ target/arm/cpu64.c | 1 + 2 files changed, 8 insertions(+) -- 2.17.1 diff --git a/target/arm/cpu.c b/target/arm/cpu.c index e1de45e904..8e4f4d8c21 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -164,6 +164,13 @@ static void arm_cpu_reset(CPUState *s) env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE; /* and to the FP/Neon instructions */ env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3); + /* and to the SVE instructions */ + env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3); + env->cp15.cptr_el[3] |= CPTR_EZ; + /* with maximum vector length */ + env->vfp.zcr_el[1] = ARM_MAX_VQ - 1; + env->vfp.zcr_el[2] = ARM_MAX_VQ - 1; + env->vfp.zcr_el[3] = ARM_MAX_VQ - 1; #else /* Reset into the highest available EL */ if (arm_feature(env, ARM_FEATURE_EL3)) { diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index c50dcd4077..0360d7efc5 100644 --- a/target/arm/cpu64.c +++ b/target/arm/cpu64.c @@ -252,6 +252,7 @@ static void aarch64_max_initfn(Object *obj) set_feature(&cpu->env, ARM_FEATURE_V8_RDM); set_feature(&cpu->env, ARM_FEATURE_V8_FP16); set_feature(&cpu->env, ARM_FEATURE_V8_FCMA); + set_feature(&cpu->env, ARM_FEATURE_SVE); /* For usermode -cpu max we can use a larger and more efficient DCZ * blocksize since we don't have to follow what the hardware does. */ From patchwork Thu Jun 21 01:53:59 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 139428 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1484852lji; Wed, 20 Jun 2018 19:17:13 -0700 (PDT) X-Google-Smtp-Source: ADUXVKL7ypSVIppHWnKYY5tfXnjafjmT7k2J3rRE1+yAYdp8tY78llfa48G7gzqpBEblnMRSe36V X-Received: by 2002:ac8:5416:: with SMTP id b22-v6mr21857440qtq.332.1529547433018; Wed, 20 Jun 2018 19:17:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529547433; cv=none; d=google.com; s=arc-20160816; b=nj9DJ0HuraVALy8bfyQNKUkpsohCCPWHDCKahyjjtL3Fbnb8PnEo8QbwBgbU/fHFoN l+dVzhH4AofC6Fpx5n4zanuledhVaz8sT6kwCUHX8wVogjfL1C732wUjTy4GFEUXv9y4 vRNjSoe15CCaSOUQxdAuR0rw6VhgSF7JGgkVRcE/fqDZxROPXSFeHSeR2F/OLJASskAl FrDNNvLwlgybHrv9gQyit5vBiLLCEI4Im73y0Gt+oig+R1Clo5uBeDK0Vg82wdHygXXs j5CTrR5he/qmOPszJFGtgjscWTvDkYdP3S1PggW9bn2sdT+eB7P/vRyulHJxUzQ66Ubi B3ew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=DAvQ6d8C7WhBxzP4vcx1hFuoE7NEqSILwo/Uh1DOpfk=; b=P4pKQJUD9nDGApjDayDnYLYGK2WhNSMV6uUkNSQMeBoYjw9Cwk2XW2NOlHGJEcFyKu t+xXv9X3bBIxIB1YxGjRHPvMUP4wRKX1hG0o//Gk9MIf+fHso9Xj43gmXj9HYoOIRQtb VrfO0oI3CQC96KxAQFEPjFOurYmJsyOz0ThNXfC61lpEe51HZ0M8UFU+os2RJmUjNkCB rYD5XoxWkZeRXRQnhagP+lmfAMk380gwwoEoyiXYOGuq/51uy52eNkvpMqTBTnEoec1K ZgbVG2nW3bTH2Rx7zBcrpjrvWIMEP+OipmTHP/nxwgs5LwzX+YwMmJBrSfX3VW9ohDWB O+aQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=MWxRzx4G; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id 75-v6si3585259qki.95.2018.06.20.19.17.12 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 20 Jun 2018 19:17:13 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=MWxRzx4G; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52598 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp9s-0004pI-F6 for patch@linaro.org; Wed, 20 Jun 2018 22:17:12 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39637) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVood-00052B-N9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoob-0003bI-Os for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:15 -0400 Received: from mail-pf0-x22a.google.com ([2607:f8b0:400e:c00::22a]:42129) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoob-0003af-FW for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: by mail-pf0-x22a.google.com with SMTP id w7-v6so683532pfn.9 for ; Wed, 20 Jun 2018 18:55:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=DAvQ6d8C7WhBxzP4vcx1hFuoE7NEqSILwo/Uh1DOpfk=; b=MWxRzx4GMNO4DAGIBuxFSOTcl1DHvFGagls5ki1R7inZcuW1pMMXAeT3qYVa7uIx0G lXpHLhvYFuSIpeJLlMcqdH/fgUeKCDWQ4iY0rgfXIu9Q94x+4H3AriOZtvrd8NOSb7MO n1Ix5VE6dYxvyLBeQN0AVFyICi2W/3l/SD5v8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=DAvQ6d8C7WhBxzP4vcx1hFuoE7NEqSILwo/Uh1DOpfk=; b=gsLvqoLiwR5IxexjjaGrSWyvM9WcLxCrbfNsR65hzSMkkWK7YFWxIp1S0ou3mipnMT YJPvkuqGHoyE8MLNPecJ5iq7R0SmBdwfrzAZeKIFzBD7zCLz0mbXPm57Q59qd8EUL7vS 8qGRLn/ARb7hhmw3eWu2ZNq0+/fsvba+GN/9muTojMbzR2oPQth1ksE13IMF9nseDkuA u+BKa9Z7lOSh8zzf6UnCSv4rGc1L9ASdVf0ajgDY2rACfE4wli4i8GpCucveZwCeG+QN D0jDBK2Sgbu0Xxe7biqHNJH7PlmmuM5wY3gSS9sajXcQcqH3RjxZzqzbKfwMlpjjwrgY t79A== X-Gm-Message-State: APt69E2jJMRB7h1GGgRp5fYICwFrGkzcV/77xxFa4rq0q0fysBqHLRVf Q4BKXfi7anHYvaXSgYgtUWH8LW0mFxc= X-Received: by 2002:a65:43c9:: with SMTP id n9-v6mr20527147pgp.399.1529546112151; Wed, 20 Jun 2018 18:55:12 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.10 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:59 -1000 Message-Id: <20180621015359.12018-36-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22a Subject: [Qemu-devel] [PATCH v5 35/35] target/arm: Implement ARMv8.2-DotProd X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" We've already added the helpers with an SVE patch, all that remains is to wire up the aa64 and aa32 translators. Enable the feature within -cpu max for CONFIG_USER_ONLY. Signed-off-by: Richard Henderson --- target/arm/cpu.h | 1 + linux-user/elfload.c | 1 + target/arm/cpu.c | 1 + target/arm/cpu64.c | 1 + target/arm/translate-a64.c | 36 +++++++++++++++++ target/arm/translate.c | 81 ++++++++++++++++++++++++++------------ 6 files changed, 96 insertions(+), 25 deletions(-) -- 2.17.1 Reviewed-by: Peter Maydell diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 8488273c5b..23098474e1 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -1480,6 +1480,7 @@ enum arm_features { ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */ ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */ ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */ + ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */ ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */ ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions. */ }; diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 13bc78d0c8..bdb023b477 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -583,6 +583,7 @@ static uint32_t get_elf_hwcap(void) ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP); GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS); GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM); + GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP); GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA); #undef GET_FEATURE diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 8e4f4d8c21..95ac9c064d 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1794,6 +1794,7 @@ static void arm_max_initfn(Object *obj) set_feature(&cpu->env, ARM_FEATURE_V8_PMULL); set_feature(&cpu->env, ARM_FEATURE_CRC); set_feature(&cpu->env, ARM_FEATURE_V8_RDM); + set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD); set_feature(&cpu->env, ARM_FEATURE_V8_FCMA); #endif } diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index 0360d7efc5..3b4bc73ffa 100644 --- a/target/arm/cpu64.c +++ b/target/arm/cpu64.c @@ -250,6 +250,7 @@ static void aarch64_max_initfn(Object *obj) set_feature(&cpu->env, ARM_FEATURE_CRC); set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS); set_feature(&cpu->env, ARM_FEATURE_V8_RDM); + set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD); set_feature(&cpu->env, ARM_FEATURE_V8_FP16); set_feature(&cpu->env, ARM_FEATURE_V8_FCMA); set_feature(&cpu->env, ARM_FEATURE_SVE); diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 038e48278f..903d6233d3 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -640,6 +640,16 @@ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, vec_full_reg_size(s), gvec_op); } +/* Expand a 3-operand operation using an out-of-line helper. */ +static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, + int rn, int rm, int data, gen_helper_gvec_3 *fn) +{ + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + is_q ? 16 : 8, vec_full_reg_size(s), data, fn); +} + /* Expand a 3-operand + env pointer operation using * an out-of-line helper. */ @@ -11336,6 +11346,14 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } feature = ARM_FEATURE_V8_RDM; break; + case 0x02: /* SDOT (vector) */ + case 0x12: /* UDOT (vector) */ + if (size != MO_32) { + unallocated_encoding(s); + return; + } + feature = ARM_FEATURE_V8_DOTPROD; + break; case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -11389,6 +11407,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } return; + case 0x2: /* SDOT / UDOT */ + gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0, + u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b); + return; + case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -12568,6 +12591,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) return; } break; + case 0x0e: /* SDOT */ + case 0x1e: /* UDOT */ + if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + unallocated_encoding(s); + return; + } + break; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ @@ -12665,6 +12695,12 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } switch (16 * u + opcode) { + case 0x0e: /* SDOT */ + case 0x1e: /* UDOT */ + gen_gvec_op3_ool(s, is_q, rd, rn, rm, index, + u ? gen_helper_gvec_udot_idx_b + : gen_helper_gvec_sdot_idx_b); + return; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ diff --git a/target/arm/translate.c b/target/arm/translate.c index f405c82fb2..09f2852b29 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -7748,9 +7748,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) */ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) { - gen_helper_gvec_3_ptr *fn_gvec_ptr; - int rd, rn, rm, rot, size, opr_sz; - TCGv_ptr fpst; + gen_helper_gvec_3 *fn_gvec = NULL; + gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL; + int rd, rn, rm, opr_sz; + int data = 0; bool q; q = extract32(insn, 6, 1); @@ -7763,8 +7764,8 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) if ((insn & 0xfe200f10) == 0xfc200800) { /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */ - size = extract32(insn, 20, 1); - rot = extract32(insn, 23, 2); + int size = extract32(insn, 20, 1); + data = extract32(insn, 23, 2); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; @@ -7772,13 +7773,20 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah; } else if ((insn & 0xfea00f10) == 0xfc800800) { /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */ - size = extract32(insn, 20, 1); - rot = extract32(insn, 24, 1); + int size = extract32(insn, 20, 1); + data = extract32(insn, 24, 1); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; } fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh; + } else if ((insn & 0xfeb00f00) == 0xfc200d00) { + /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */ + bool u = extract32(insn, 4, 1); + if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + return 1; + } + fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b; } else { return 1; } @@ -7793,12 +7801,19 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) } opr_sz = (1 + q) * 8; - fpst = get_fpstatus_ptr(1); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), fpst, - opr_sz, opr_sz, rot, fn_gvec_ptr); - tcg_temp_free_ptr(fpst); + if (fn_gvec_ptr) { + TCGv_ptr fpst = get_fpstatus_ptr(1); + tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), fpst, + opr_sz, opr_sz, data, fn_gvec_ptr); + tcg_temp_free_ptr(fpst); + } else { + tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), + opr_sz, opr_sz, data, fn_gvec); + } return 0; } @@ -7812,8 +7827,10 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) { - int rd, rn, rm, rot, size, opr_sz; - TCGv_ptr fpst; + gen_helper_gvec_3 *fn_gvec = NULL; + gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL; + int rd, rn, rm, opr_sz; + int data = 0; bool q; q = extract32(insn, 6, 1); @@ -7826,12 +7843,21 @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) if ((insn & 0xff000f10) == 0xfe000800) { /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */ - rot = extract32(insn, 20, 2); - size = extract32(insn, 23, 1); + int size = extract32(insn, 23, 1); + data = extract32(insn, 20, 2); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; } + fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx + : gen_helper_gvec_fcmlah_idx); + } else if ((insn & 0xffb00f00) == 0xfe200d00) { + /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */ + int u = extract32(insn, 4, 1); + if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + return 1; + } + fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b; } else { return 1; } @@ -7846,14 +7872,19 @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) } opr_sz = (1 + q) * 8; - fpst = get_fpstatus_ptr(1); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), fpst, - opr_sz, opr_sz, rot, - size ? gen_helper_gvec_fcmlas_idx - : gen_helper_gvec_fcmlah_idx); - tcg_temp_free_ptr(fpst); + if (fn_gvec_ptr) { + TCGv_ptr fpst = get_fpstatus_ptr(1); + tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), fpst, + opr_sz, opr_sz, data, fn_gvec_ptr); + tcg_temp_free_ptr(fpst); + } else { + tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), + opr_sz, opr_sz, data, fn_gvec); + } return 0; }