From patchwork Thu Aug 17 23:01:07 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 110350
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 17 Aug 2017 16:01:07 -0700
Message-Id: <20170817230114.3655-2-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.13.5
In-Reply-To: <20170817230114.3655-1-richard.henderson@linaro.org>
References: <20170817230114.3655-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::22a
Subject: [Qemu-devel] [PATCH 1/8] tcg: Add generic vector infrastructure and ops for add/sub/logic
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
Cc: qemu-arm@nongnu.org, alex.bennee@linaro.org
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: "Qemu-devel"

Signed-off-by: Richard Henderson
---
 Makefile.target        |   5 +-
 tcg/tcg-op-gvec.h      |  88 ++++++++++
 tcg/tcg-runtime.h      |  16 ++
 tcg/tcg-op-gvec.c      | 443 +++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg-runtime-gvec.c | 199 ++++++++++++++++++++++
 5 files changed, 749 insertions(+), 2 deletions(-)
 create mode 100644 tcg/tcg-op-gvec.h
 create mode 100644 tcg/tcg-op-gvec.c
 create mode 100644 tcg/tcg-runtime-gvec.c

-- 
2.13.5

Reviewed-by: Philippe Mathieu-Daudé

diff --git a/Makefile.target b/Makefile.target
index 7f42c45db8..9ae3e904f7 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -93,8 +93,9 @@ all: $(PROGS) stap
 # cpu emulator library
 obj-y += exec.o
 obj-y += accel/
-obj-$(CONFIG_TCG) += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
-obj-$(CONFIG_TCG) += tcg/tcg-common.o tcg/tcg-runtime.o
+obj-$(CONFIG_TCG) += tcg/tcg.o tcg/tcg-common.o tcg/optimize.o
+obj-$(CONFIG_TCG) += tcg/tcg-op.o tcg/tcg-op-gvec.o
+obj-$(CONFIG_TCG) += tcg/tcg-runtime.o tcg/tcg-runtime-gvec.o
 obj-$(CONFIG_TCG_INTERPRETER) += tcg/tci.o
 obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
 obj-y += fpu/softfloat.o
diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h
new file mode 100644
index 0000000000..10db3599a5
--- /dev/null
+++ b/tcg/tcg-op-gvec.h
@@ -0,0 +1,88 @@
+/*
+ * Generic vector operation expansion
+ *
+ * Copyright (c) 2017 Linaro
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * "Generic" vectors. All operands are given as offsets from ENV,
+ * and therefore cannot also be allocated via tcg_global_mem_new_*.
+ * OPSZ is the byte size of the vector upon which the operation is performed.
+ * CLSZ is the byte size of the full vector; bytes beyond OPSZ are cleared.
+ *
+ * All sizes must be 8 or any multiple of 16.
+ * When OPSZ is 8, the alignment may be 8, otherwise must be 16.
+ * Operands may completely, but not partially, overlap.
+ */
+
+/* Fundamental operation expanders. These are exposed to the front ends
+   so that target-specific SIMD operations can be handled similarly to
+   the standard SIMD operations. */
+
+typedef struct {
+    /* "Small" sizes: expand inline as a 64-bit or 32-bit lane.
+       Generally only one of these will be non-NULL. */
+    void (*fni8)(TCGv_i64, TCGv_i64, TCGv_i64);
+    void (*fni4)(TCGv_i32, TCGv_i32, TCGv_i32);
+    /* Similarly, but load up a constant and re-use across lanes. */
+    void (*fni8x)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64);
+    uint64_t extra_value;
+    /* Larger sizes: expand out-of-line helper w/size descriptor. */
+    void (*fno)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
+} GVecGen3;
+
+void tcg_gen_gvec_3(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                    uint32_t opsz, uint32_t clsz, const GVecGen3 *);
+
+#define DEF_GVEC_2(X) \
+    void tcg_gen_gvec_##X(uint32_t dofs, uint32_t aofs, uint32_t bofs, \
+                          uint32_t opsz, uint32_t clsz)
+
+DEF_GVEC_2(add8);
+DEF_GVEC_2(add16);
+DEF_GVEC_2(add32);
+DEF_GVEC_2(add64);
+
+DEF_GVEC_2(sub8);
+DEF_GVEC_2(sub16);
+DEF_GVEC_2(sub32);
+DEF_GVEC_2(sub64);
+
+DEF_GVEC_2(and8);
+DEF_GVEC_2(or8);
+DEF_GVEC_2(xor8);
+DEF_GVEC_2(andc8);
+DEF_GVEC_2(orc8);
+
+#undef DEF_GVEC_2
+
+/*
+ * 64-bit vector operations. Use these when the register has been
+ * allocated with tcg_global_mem_new_i64. OPSZ = CLSZ = 8.
+ */
+
+#define DEF_VEC8_2(X) \
+    void tcg_gen_vec8_##X(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+
+DEF_VEC8_2(add8);
+DEF_VEC8_2(add16);
+DEF_VEC8_2(add32);
+
+DEF_VEC8_2(sub8);
+DEF_VEC8_2(sub16);
+DEF_VEC8_2(sub32);
+
+#undef DEF_VEC8_2
diff --git a/tcg/tcg-runtime.h b/tcg/tcg-runtime.h
index c41d38a557..f8d07090f8 100644
--- a/tcg/tcg-runtime.h
+++ b/tcg/tcg-runtime.h
@@ -134,3 +134,19 @@ GEN_ATOMIC_HELPERS(xor_fetch)
 GEN_ATOMIC_HELPERS(xchg)

 #undef GEN_ATOMIC_HELPERS
+
+DEF_HELPER_FLAGS_4(gvec_add8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_add16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_add32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_add64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_sub8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_and8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_or8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_xor8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_andc8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_orc8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
new file mode 100644
index 0000000000..6de49dc07f
--- /dev/null
+++ b/tcg/tcg-op-gvec.c
@@ -0,0 +1,443 @@
+/*
+ * Generic vector operation expansion
+ *
+ * Copyright (c) 2017 Linaro
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "cpu.h"
+#include "exec/exec-all.h"
+#include "tcg.h"
+#include "tcg-op.h"
+#include "tcg-op-gvec.h"
+#include "trace-tcg.h"
+#include "trace/mem.h"
+
+#define REP8(x) ((x) * 0x0101010101010101ull)
+#define REP16(x) ((x) * 0x0001000100010001ull)
+
+#define MAX_INLINE 16
+
+static inline void check_size_s(uint32_t opsz, uint32_t clsz)
+{
+    tcg_debug_assert(opsz % 8 == 0);
+    tcg_debug_assert(clsz % 8 == 0);
+    tcg_debug_assert(opsz <= clsz);
+}
+
+static inline void check_align_s_3(uint32_t dofs, uint32_t aofs, uint32_t bofs)
+{
+    tcg_debug_assert(dofs % 8 == 0);
+    tcg_debug_assert(aofs % 8 == 0);
+    tcg_debug_assert(bofs % 8 == 0);
+}
+
+static inline void check_size_l(uint32_t opsz, uint32_t clsz)
+{
+    tcg_debug_assert(opsz % 16 == 0);
+    tcg_debug_assert(clsz % 16 == 0);
+    tcg_debug_assert(opsz <= clsz);
+}
+
+static inline void check_align_l_3(uint32_t dofs, uint32_t aofs, uint32_t bofs)
+{
+    tcg_debug_assert(dofs % 16 == 0);
+    tcg_debug_assert(aofs % 16 == 0);
+    tcg_debug_assert(bofs % 16 == 0);
+}
+
+static inline void check_overlap_3(uint32_t d, uint32_t a,
+                                   uint32_t b, uint32_t s)
+{
+    tcg_debug_assert(d == a || d + s <= a || a + s <= d);
+    tcg_debug_assert(d == b || d + s <= b || b + s <= d);
+    tcg_debug_assert(a == b || a + s <= b || b + s <= a);
+}
+
+static void expand_clr(uint32_t dofs, uint32_t opsz, uint32_t clsz)
+{
+    if (clsz > opsz) {
+        TCGv_i64 zero = tcg_const_i64(0);
+        uint32_t i;
+
+        for (i = opsz; i < clsz; i += 8) {
+            tcg_gen_st_i64(zero, tcg_ctx.tcg_env, dofs + i);
+        }
+        tcg_temp_free_i64(zero);
+    }
+}
+
+static TCGv_i32 make_desc(uint32_t opsz, uint32_t clsz)
+{
+    tcg_debug_assert(opsz >= 16 && opsz <= 255 * 16 && opsz % 16 == 0);
+    tcg_debug_assert(clsz >= 16 && clsz <= 255 * 16 && clsz % 16 == 0);
+    opsz /= 16;
+    clsz /= 16;
+    opsz -= 1;
+    clsz -= 1;
+    return tcg_const_i32(deposit32(opsz, 8, 8, clsz));
+}
+
+static void expand_3_o(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                       uint32_t opsz, uint32_t clsz,
+                       void (*fno)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32))
+{
+    TCGv_ptr d = tcg_temp_new_ptr();
+    TCGv_ptr a = tcg_temp_new_ptr();
+    TCGv_ptr b = tcg_temp_new_ptr();
+    TCGv_i32 desc = make_desc(opsz, clsz);
+
+    tcg_gen_addi_ptr(d, tcg_ctx.tcg_env, dofs);
+    tcg_gen_addi_ptr(a, tcg_ctx.tcg_env, aofs);
+    tcg_gen_addi_ptr(b, tcg_ctx.tcg_env, bofs);
+    fno(d, a, b, desc);
+
+    tcg_temp_free_ptr(d);
+    tcg_temp_free_ptr(a);
+    tcg_temp_free_ptr(b);
+    tcg_temp_free_i32(desc);
+}
+
+static void expand_3x4(uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t opsz,
+                       void (*fni)(TCGv_i32, TCGv_i32, TCGv_i32))
+{
+    TCGv_i32 t0 = tcg_temp_new_i32();
+    uint32_t i;
+
+    if (aofs == bofs) {
+        for (i = 0; i < opsz; i += 4) {
+            tcg_gen_ld_i32(t0, tcg_ctx.tcg_env, aofs + i);
+            fni(t0, t0, t0);
+            tcg_gen_st_i32(t0, tcg_ctx.tcg_env, dofs + i);
+        }
+    } else {
+        TCGv_i32 t1 = tcg_temp_new_i32();
+        for (i = 0; i < opsz; i += 4) {
+            tcg_gen_ld_i32(t0, tcg_ctx.tcg_env, aofs + i);
+            tcg_gen_ld_i32(t1, tcg_ctx.tcg_env, bofs + i);
+            fni(t0, t0, t1);
+            tcg_gen_st_i32(t0, tcg_ctx.tcg_env, dofs + i);
+        }
+        tcg_temp_free_i32(t1);
+    }
+    tcg_temp_free_i32(t0);
+}
+
+static void expand_3x8(uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t opsz,
+                       void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 t0 = tcg_temp_new_i64();
+    uint32_t i;
+
+    if (aofs == bofs) {
+        for (i = 0; i < opsz; i += 8) {
+            tcg_gen_ld_i64(t0, tcg_ctx.tcg_env, aofs + i);
+            fni(t0, t0, t0);
+            tcg_gen_st_i64(t0, tcg_ctx.tcg_env, dofs + i);
+        }
+    } else {
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        for (i = 0; i < opsz; i += 8) {
+            tcg_gen_ld_i64(t0, tcg_ctx.tcg_env, aofs + i);
+            tcg_gen_ld_i64(t1, tcg_ctx.tcg_env, bofs + i);
+            fni(t0, t0, t1);
+            tcg_gen_st_i64(t0, tcg_ctx.tcg_env, dofs + i);
+        }
+        tcg_temp_free_i64(t1);
+    }
+    tcg_temp_free_i64(t0);
+}
+
+static void expand_3x8p1(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                         uint32_t opsz, uint64_t data,
+                         void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
+{
+    TCGv_i64 t0 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_const_i64(data);
+    uint32_t i;
+
+    if (aofs == bofs) {
+        for (i = 0; i < opsz; i += 8) {
+            tcg_gen_ld_i64(t0, tcg_ctx.tcg_env, aofs + i);
+            fni(t0, t0, t0, t2);
+            tcg_gen_st_i64(t0, tcg_ctx.tcg_env, dofs + i);
+        }
+    } else {
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        for (i = 0; i < opsz; i += 8) {
+            tcg_gen_ld_i64(t0, tcg_ctx.tcg_env, aofs + i);
+            tcg_gen_ld_i64(t1, tcg_ctx.tcg_env, bofs + i);
+            fni(t0, t0, t1, t2);
+            tcg_gen_st_i64(t0, tcg_ctx.tcg_env, dofs + i);
+        }
+        tcg_temp_free_i64(t1);
+    }
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t2);
+}
+
+void tcg_gen_gvec_3(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                    uint32_t opsz, uint32_t clsz, const GVecGen3 *g)
+{
+    check_overlap_3(dofs, aofs, bofs, clsz);
+    if (opsz <= MAX_INLINE) {
+        check_size_s(opsz, clsz);
+        check_align_s_3(dofs, aofs, bofs);
+        if (g->fni8) {
+            expand_3x8(dofs, aofs, bofs, opsz, g->fni8);
+        } else if (g->fni4) {
+            expand_3x4(dofs, aofs, bofs, opsz, g->fni4);
+        } else if (g->fni8x) {
+            expand_3x8p1(dofs, aofs, bofs, opsz, g->extra_value, g->fni8x);
+        } else {
+            g_assert_not_reached();
+        }
+        expand_clr(dofs, opsz, clsz);
+    } else {
+        check_size_l(opsz, clsz);
+        check_align_l_3(dofs, aofs, bofs);
+        expand_3_o(dofs, aofs, bofs, opsz, clsz, g->fno);
+    }
+}
+
+static void gen_addv_mask(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b, TCGv_i64 m)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    TCGv_i64 t3 = tcg_temp_new_i64();
+
+    tcg_gen_andc_i64(t1, a, m);
+    tcg_gen_andc_i64(t2, b, m);
+    tcg_gen_xor_i64(t3, a, b);
+    tcg_gen_add_i64(d, t1, t2);
+    tcg_gen_and_i64(t3, t3, m);
+    tcg_gen_xor_i64(d, d, t3);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+    tcg_temp_free_i64(t3);
+}
+
+void tcg_gen_gvec_add8(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                       uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .extra_value = REP8(0x80),
+        .fni8x = gen_addv_mask,
+        .fno = gen_helper_gvec_add8,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_add16(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .extra_value = REP16(0x8000),
+        .fni8x = gen_addv_mask,
+        .fno = gen_helper_gvec_add16,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_add32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni4 = tcg_gen_add_i32,
+        .fno = gen_helper_gvec_add32,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_add64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_add_i64,
+        .fno = gen_helper_gvec_add64,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_vec8_add8(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 m = tcg_const_i64(REP8(0x80));
+    gen_addv_mask(d, a, b, m);
+    tcg_temp_free_i64(m);
+}
+
+void tcg_gen_vec8_add16(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 m = tcg_const_i64(REP16(0x8000));
+    gen_addv_mask(d, a, b, m);
+    tcg_temp_free_i64(m);
+}
+
+void tcg_gen_vec8_add32(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+
+    tcg_gen_andi_i64(t1, a, ~0xffffffffull);
+    tcg_gen_add_i64(t2, a, b);
+    tcg_gen_add_i64(t1, t1, b);
+    tcg_gen_deposit_i64(d, t1, t2, 0, 32);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+static void gen_subv_mask(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b, TCGv_i64 m)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+    TCGv_i64 t3 = tcg_temp_new_i64();
+
+    tcg_gen_or_i64(t1, a, m);
+    tcg_gen_andc_i64(t2, b, m);
+    tcg_gen_eqv_i64(t3, a, b);
+    tcg_gen_sub_i64(d, t1, t2);
+    tcg_gen_and_i64(t3, t3, m);
+    tcg_gen_xor_i64(d, d, t3);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+    tcg_temp_free_i64(t3);
+}
+
+void tcg_gen_gvec_sub8(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                       uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .extra_value = REP8(0x80),
+        .fni8x = gen_subv_mask,
+        .fno = gen_helper_gvec_sub8,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_sub16(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .extra_value = REP16(0x8000),
+        .fni8x = gen_subv_mask,
+        .fno = gen_helper_gvec_sub16,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_sub32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni4 = tcg_gen_sub_i32,
+        .fno = gen_helper_gvec_sub32,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_sub64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_sub_i64,
+        .fno = gen_helper_gvec_sub64,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_vec8_sub8(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 m = tcg_const_i64(REP8(0x80));
+    gen_subv_mask(d, a, b, m);
+    tcg_temp_free_i64(m);
+}
+
+void tcg_gen_vec8_sub16(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 m = tcg_const_i64(REP16(0x8000));
+    gen_subv_mask(d, a, b, m);
+    tcg_temp_free_i64(m);
+}
+
+void tcg_gen_vec8_sub32(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i64 t2 = tcg_temp_new_i64();
+
+    tcg_gen_andi_i64(t1, b, ~0xffffffffull);
+    tcg_gen_sub_i64(t2, a, b);
+    tcg_gen_sub_i64(t1, a, t1);
+    tcg_gen_deposit_i64(d, t1, t2, 0, 32);
+
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(t2);
+}
+
+void tcg_gen_gvec_and8(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                       uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_and_i64,
+        .fno = gen_helper_gvec_and8,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_or8(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                      uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_or_i64,
+        .fno = gen_helper_gvec_or8,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_xor8(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                       uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_xor_i64,
+        .fno = gen_helper_gvec_xor8,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_andc8(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                        uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_andc_i64,
+        .fno = gen_helper_gvec_andc8,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
+
+void tcg_gen_gvec_orc8(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+                       uint32_t opsz, uint32_t clsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_orc_i64,
+        .fno = gen_helper_gvec_orc8,
+    };
+    tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g);
+}
diff --git a/tcg/tcg-runtime-gvec.c b/tcg/tcg-runtime-gvec.c
new file mode 100644
index 0000000000..9a37ce07a2
--- /dev/null
+++ b/tcg/tcg-runtime-gvec.c
@@ -0,0 +1,199 @@
+/*
+ * Generic vectorized operation runtime
+ *
+ * Copyright (c) 2017 Linaro
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/host-utils.h"
+#include "cpu.h"
+#include "exec/helper-proto.h"
+
+/* Virtually all hosts support 16-byte vectors. Those that don't
+   can emulate them via GCC's generic vector extension.
+
+   In tcg-op-gvec.c, we asserted that both the size and alignment
+   of the data are multiples of 16. */
+
+typedef uint8_t vec8 __attribute__((vector_size(16)));
+typedef uint16_t vec16 __attribute__((vector_size(16)));
+typedef uint32_t vec32 __attribute__((vector_size(16)));
+typedef uint64_t vec64 __attribute__((vector_size(16)));
+
+static inline intptr_t extract_opsz(uint32_t desc)
+{
+    return ((desc & 0xff) + 1) * 16;
+}
+
+static inline intptr_t extract_clsz(uint32_t desc)
+{
+    return (((desc >> 8) & 0xff) + 1) * 16;
+}
+
+static inline void clear_high(void *d, intptr_t opsz, uint32_t desc)
+{
+    intptr_t clsz = extract_clsz(desc);
+    intptr_t i;
+
+    if (unlikely(clsz > opsz)) {
+        for (i = opsz; i < clsz; i += sizeof(vec64)) {
+            *(vec64 *)(d + i) = (vec64){ 0 };
+        }
+    }
+}
+
+void HELPER(gvec_add8)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t opsz = extract_opsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < opsz; i += sizeof(vec8)) {
+        *(vec8 *)(d + i) = *(vec8 *)(a + i) + *(vec8 *)(b + i);
+    }
+    clear_high(d, opsz, desc);
+}
+
+void HELPER(gvec_add16)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t opsz = extract_opsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < opsz; i += sizeof(vec16)) {
+        *(vec16 *)(d + i) = *(vec16 *)(a + i) + *(vec16 *)(b + i);
+    }
+    clear_high(d, opsz, desc);
+}
+
+void HELPER(gvec_add32)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t opsz = extract_opsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < opsz; i += sizeof(vec32)) {
+        *(vec32 *)(d + i) = *(vec32 *)(a + i) + *(vec32 *)(b + i);
+    }
+    clear_high(d, opsz, desc);
+}
+
+void HELPER(gvec_add64)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t opsz = extract_opsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < opsz; i += sizeof(vec64)) {
+ *(vec64 *)(d + i) = *(vec64 *)(a + i) + *(vec64 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void HELPER(gvec_sub8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec8)) { + *(vec8 *)(d + i) = *(vec8 *)(a + i) - *(vec8 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void HELPER(gvec_sub16)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec16)) { + *(vec16 *)(d + i) = *(vec16 *)(a + i) - *(vec16 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void HELPER(gvec_sub32)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec32)) { + *(vec32 *)(d + i) = *(vec32 *)(a + i) - *(vec32 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void HELPER(gvec_sub64)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec64)) { + *(vec64 *)(d + i) = *(vec64 *)(a + i) - *(vec64 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void HELPER(gvec_and8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec64)) { + *(vec64 *)(d + i) = *(vec64 *)(a + i) & *(vec64 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void HELPER(gvec_or8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec64)) { + *(vec64 *)(d + i) = *(vec64 *)(a + i) | *(vec64 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void HELPER(gvec_xor8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec64)) { + *(vec64 *)(d + i) = *(vec64 *)(a + i) ^ *(vec64 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void 
HELPER(gvec_andc8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec64)) { + *(vec64 *)(d + i) = *(vec64 *)(a + i) &~ *(vec64 *)(b + i); + } + clear_high(d, opsz, desc); +} + +void HELPER(gvec_orc8)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t opsz = extract_opsz(desc); + intptr_t i; + + for (i = 0; i < opsz; i += sizeof(vec64)) { + *(vec64 *)(d + i) = *(vec64 *)(a + i) |~ *(vec64 *)(b + i); + } + clear_high(d, opsz, desc); +} From patchwork Thu Aug 17 23:01:08 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 110349 Delivered-To: patch@linaro.org Received: by 10.140.95.78 with SMTP id h72csp146155qge; Thu, 17 Aug 2017 16:03:38 -0700 (PDT) X-Received: by 10.233.216.197 with SMTP id u188mr859813qkf.215.1503011018279; Thu, 17 Aug 2017 16:03:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1503011018; cv=none; d=google.com; s=arc-20160816; b=SGCtyxaHcasgnxwvOUlqZwwei3mixzWrMNuTiyzINygbw7OnVTsLvXQEioA3erWY1I 5W+2YYnJ3Zhvota8iMW58nuJ2TF+hr5svixybQUnfD36CRcq0FOjUPyBbLxjCRqQA6kg nN+ZyMWhDB2sG6kgKIhEzL533IkqUOmBS68BzbEhQGh7UUnyU+HamyVwnW4Cmh6OD7W/ VxQURP9psqoai/q0e1EpTWQl5NP11eu5rhdLYmnGnFBE6lxJF3J4106cUO0+jWcNAGsA vuG6DaUDhVNa8FbvZPMLTKMu5HGZv/3oV+MTJXha5o14WExqTa6IBk/991gKxg8s5d8d zV7g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=aXDZ4Qr9mwfYKP1Hb4K6fjgDzUCvCw9IzsWpbOBP7G0=; b=jowOdnaTrHZNc39rkT508EwRi62xHdqX809LNvGON2xG5XJZnADh48qQSLDOnPTQP9 ZZI+7Q0APnUdULajSn2zprQqFsOn7xBp6hp4PuvWnrcFM37T57kOyZy0L+WU1ID6GYKd nrRfVmQfQMGwcrOJGQCTbBmWWgKD3UMB6YLskEn+YZjlBW3uwudw6ryUffpPgG8X59JG 
KEj8i1fWe0r8HsX6qEUgq6VFJt9nQvHJ0L+NqFYnPnRZNI3h4aLz4OyHkEwfoDhkVRTm VBxfMYsSooC4UAymU7sW16bXFtTxv8cAlSUlhVTU3faewwj5aJ0MeHQuwWTUIhFME4nt Y7xg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=TXNcPQ8q; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id m88si3891376qtd.363.2017.08.17.16.03.37 for (version=TLS1 cipher=AES128-SHA bits=128/128); Thu, 17 Aug 2017 16:03:38 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=TXNcPQ8q; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:56297 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1diTpA-0006A2-3R for patch@linaro.org; Thu, 17 Aug 2017 19:03:36 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44460) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1diTn1-0004so-0O for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1diTmz-0000nM-1m for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:23 -0400 Received: from mail-pg0-x235.google.com ([2607:f8b0:400e:c05::235]:33446) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) 
id 1diTmy-0000lm-Qa for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:20 -0400 Received: by mail-pg0-x235.google.com with SMTP id t3so23902992pgt.0 for ; Thu, 17 Aug 2017 16:01:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=aXDZ4Qr9mwfYKP1Hb4K6fjgDzUCvCw9IzsWpbOBP7G0=; b=TXNcPQ8qLiZ3cZen6iUEA4m0Jsfn/DYXvMtximn7ImIWf0YkK5RkHY5Z+lwX+gpuvm k5c4PyN7e1pAPlEPAxkLgwaCpUKDkaljzxKAVNNrSYY9R9FE0k5S+IYD+LVZbqAR40vi VJL9vyISIzfpC7MvYPE8nBnleg+IbgdwNGpIM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=aXDZ4Qr9mwfYKP1Hb4K6fjgDzUCvCw9IzsWpbOBP7G0=; b=PfcdjFlZa2QKmjZLYnxAM8Z+rwM5QGbWJP+hZ72h/pbCVaRMNp11rezXpVRrV0hAfX HSck5dn9BQYJEwJBP/+ywybYJPj7c9yv5Xj72oq269fdRFaoQ/2MQTzn5kVtKum8ca07 FNpx794FNO76MV/F9pNh7KAtpmjSSwlV+WlxloE2S7fbCq17p2KznL29MRJ1SuY2K8Ur BVYvMUZ0Csl7KszS2R5lwNS5morS6tGGcKPTqgFNN9eceqbQjUazGjSbiJ2dkfMQ1Mrp 0CE1hPijCYzlWVQQ2k+z5BD09zWj1JEsu/zo17tiKNevm2jv/QXDX/2u7T+wUmq28BK0 wGPA== X-Gm-Message-State: AHYfb5gYeGIhryK6mDjdyjj5wOrkcUTc2dDvAeObis/yVQU/ebgv+1st c0HO4iHjWfz2DkM5SvHztA== X-Received: by 10.84.171.195 with SMTP id l61mr7755983plb.464.1503010879346; Thu, 17 Aug 2017 16:01:19 -0700 (PDT) Received: from bigtime.twiddle.net (97-126-108-236.tukw.qwest.net. 
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 17 Aug 2017 16:01:08 -0700
Message-Id: <20170817230114.3655-3-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.13.5
In-Reply-To: <20170817230114.3655-1-richard.henderson@linaro.org>
References: <20170817230114.3655-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 2/8] target/arm: Use generic vector infrastructure for aa64 add/sub/logic
Cc: qemu-arm@nongnu.org, alex.bennee@linaro.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.c | 137 ++++++++++++++++++++++++++++-----------------
 1 file changed, 87 insertions(+), 50 deletions(-)

-- 
2.13.5

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 2200e25be0..025354f983 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -21,6 +21,7 @@
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "tcg-op.h"
+#include "tcg-op-gvec.h"
 #include "qemu/log.h"
 #include "arm_ldst.h"
 #include "translate.h"
@@ -82,6 +83,7 @@ typedef void NeonGenTwoDoubleOPFn(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
 typedef void NeonGenOneOpFn(TCGv_i64, TCGv_i64);
 typedef void CryptoTwoOpEnvFn(TCGv_ptr, TCGv_i32, TCGv_i32);
 typedef void CryptoThreeOpEnvFn(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32);
+typedef void GVecGenTwoFn(uint32_t, uint32_t, uint32_t, uint32_t, uint32_t);
 
 /* initialize TCG globals. */
 void a64_translate_init(void)
@@ -537,6 +539,21 @@ static inline int vec_reg_offset(DisasContext *s, int regno,
     return offs;
 }
 
+/* Return the offset into CPUARMState of the "whole" vector register Qn. */
+static inline int vec_full_reg_offset(DisasContext *s, int regno)
+{
+    assert_fp_access_checked(s);
+    return offsetof(CPUARMState, vfp.regs[regno * 2]);
+}
+
+/* Return the byte size of the "whole" vector register, VL / 8. */
+static inline int vec_full_reg_size(DisasContext *s)
+{
+    /* FIXME SVE: We should put the composite ZCR_EL* value into tb->flags.
+       In the meantime this is just the AdvSIMD length of 128. */
+    return 128 / 8;
+}
+
 /* Return the offset into CPUARMState of a slice (from
  * the least significant end) of FP register Qn (ie
  * Dn, Sn, Hn or Bn).
@@ -9042,11 +9059,38 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
     bool is_q = extract32(insn, 30, 1);
     TCGv_i64 tcg_op1, tcg_op2, tcg_res[2];
     int pass;
+    GVecGenTwoFn *gvec_op;
 
     if (!fp_access_check(s)) {
         return;
     }
 
+    switch (size + 4 * is_u) {
+    case 0: /* AND */
+        gvec_op = tcg_gen_gvec_and8;
+        goto do_gvec;
+    case 1: /* BIC */
+        gvec_op = tcg_gen_gvec_andc8;
+        goto do_gvec;
+    case 2: /* ORR */
+        gvec_op = tcg_gen_gvec_or8;
+        goto do_gvec;
+    case 3: /* ORN */
+        gvec_op = tcg_gen_gvec_orc8;
+        goto do_gvec;
+    case 4: /* EOR */
+        gvec_op = tcg_gen_gvec_xor8;
+        goto do_gvec;
+    do_gvec:
+        gvec_op(vec_full_reg_offset(s, rd),
+                vec_full_reg_offset(s, rn),
+                vec_full_reg_offset(s, rm),
+                is_q ? 16 : 8, vec_full_reg_size(s));
+        return;
+    }
+
+    /* Note that we've now eliminated all !is_u. */
+
     tcg_op1 = tcg_temp_new_i64();
     tcg_op2 = tcg_temp_new_i64();
     tcg_res[0] = tcg_temp_new_i64();
@@ -9056,47 +9100,27 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
         read_vec_element(s, tcg_op1, rn, pass, MO_64);
         read_vec_element(s, tcg_op2, rm, pass, MO_64);
 
-        if (!is_u) {
-            switch (size) {
-            case 0: /* AND */
-                tcg_gen_and_i64(tcg_res[pass], tcg_op1, tcg_op2);
-                break;
-            case 1: /* BIC */
-                tcg_gen_andc_i64(tcg_res[pass], tcg_op1, tcg_op2);
-                break;
-            case 2: /* ORR */
-                tcg_gen_or_i64(tcg_res[pass], tcg_op1, tcg_op2);
-                break;
-            case 3: /* ORN */
-                tcg_gen_orc_i64(tcg_res[pass], tcg_op1, tcg_op2);
-                break;
-            }
-        } else {
-            if (size != 0) {
-                /* B* ops need res loaded to operate on */
-                read_vec_element(s, tcg_res[pass], rd, pass, MO_64);
-            }
+        /* B* ops need res loaded to operate on */
+        read_vec_element(s, tcg_res[pass], rd, pass, MO_64);
 
-            switch (size) {
-            case 0: /* EOR */
-                tcg_gen_xor_i64(tcg_res[pass], tcg_op1, tcg_op2);
-                break;
-            case 1: /* BSL bitwise select */
-                tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_op2);
-                tcg_gen_and_i64(tcg_op1, tcg_op1, tcg_res[pass]);
-                tcg_gen_xor_i64(tcg_res[pass], tcg_op2, tcg_op1);
-                break;
-            case 2: /* BIT, bitwise insert if true */
-                tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_res[pass]);
-                tcg_gen_and_i64(tcg_op1, tcg_op1, tcg_op2);
-                tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
-                break;
-            case 3: /* BIF, bitwise insert if false */
-                tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_res[pass]);
-                tcg_gen_andc_i64(tcg_op1, tcg_op1, tcg_op2);
-                tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
-                break;
-            }
+        switch (size) {
+        case 1: /* BSL bitwise select */
+            tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_op2);
+            tcg_gen_and_i64(tcg_op1, tcg_op1, tcg_res[pass]);
+            tcg_gen_xor_i64(tcg_res[pass], tcg_op2, tcg_op1);
+            break;
+        case 2: /* BIT, bitwise insert if true */
+            tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_res[pass]);
+            tcg_gen_and_i64(tcg_op1, tcg_op1, tcg_op2);
+            tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
+            break;
+        case 3: /* BIF, bitwise insert if false */
+            tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_res[pass]);
+            tcg_gen_andc_i64(tcg_op1, tcg_op1, tcg_op2);
+            tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
+            break;
+        default:
+            g_assert_not_reached();
         }
     }
 
@@ -9370,6 +9394,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
     int rn = extract32(insn, 5, 5);
     int rd = extract32(insn, 0, 5);
     int pass;
+    GVecGenTwoFn *gvec_op;
 
     switch (opcode) {
     case 0x13: /* MUL, PMUL */
@@ -9409,6 +9434,28 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
         return;
     }
 
+    switch (opcode) {
+    case 0x10: /* ADD, SUB */
+        {
+            static GVecGenTwoFn * const fns[4][2] = {
+                { tcg_gen_gvec_add8, tcg_gen_gvec_sub8 },
+                { tcg_gen_gvec_add16, tcg_gen_gvec_sub16 },
+                { tcg_gen_gvec_add32, tcg_gen_gvec_sub32 },
+                { tcg_gen_gvec_add64, tcg_gen_gvec_sub64 },
+            };
+            gvec_op = fns[size][u];
+            goto do_gvec;
+        }
+        break;
+
+    do_gvec:
+        gvec_op(vec_full_reg_offset(s, rd),
+                vec_full_reg_offset(s, rn),
+                vec_full_reg_offset(s, rm),
+                is_q ? 16 : 8, vec_full_reg_size(s));
+        return;
+    }
+
     if (size == 3) {
         assert(is_q);
         for (pass = 0; pass < 2; pass++) {
@@ -9581,16 +9628,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genfn = fns[size][u];
                 break;
             }
-            case 0x10: /* ADD, SUB */
-            {
-                static NeonGenTwoOpFn * const fns[3][2] = {
-                    { gen_helper_neon_add_u8, gen_helper_neon_sub_u8 },
-                    { gen_helper_neon_add_u16, gen_helper_neon_sub_u16 },
-                    { tcg_gen_add_i32, tcg_gen_sub_i32 },
-                };
-                genfn = fns[size][u];
-                break;
-            }
             case 0x11: /* CMTST, CMEQ */
             {
                 static NeonGenTwoOpFn * const fns[3][2] = {

From patchwork Thu Aug 17 23:01:09 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 110348
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 17 Aug 2017 16:01:09 -0700
Message-Id: <20170817230114.3655-4-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.13.5
In-Reply-To: <20170817230114.3655-1-richard.henderson@linaro.org>
References: <20170817230114.3655-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 3/8] tcg: Add types for host vectors
Cc: qemu-arm@nongnu.org, alex.bennee@linaro.org

Nothing uses or enables them yet.

Signed-off-by: Richard Henderson
---
 tcg/tcg.h | 5 +++++
 tcg/tcg.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

-- 
2.13.5

Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Alex Bennée

diff --git a/tcg/tcg.h b/tcg/tcg.h
index dd97095af5..1277caed3d 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -256,6 +256,11 @@ typedef struct TCGPool {
 typedef enum TCGType {
     TCG_TYPE_I32,
     TCG_TYPE_I64,
+
+    TCG_TYPE_V64,
+    TCG_TYPE_V128,
+    TCG_TYPE_V256,
+
     TCG_TYPE_COUNT, /* number of different types */
 
     /* An alias for the size of the host register. */
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 787c8ba0f7..ea78d47fad 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -118,7 +118,7 @@ static TCGReg tcg_reg_alloc_new(TCGContext *s, TCGType t)
 static bool tcg_out_ldst_finalize(TCGContext *s);
 #endif
 
-static TCGRegSet tcg_target_available_regs[2];
+static TCGRegSet tcg_target_available_regs[TCG_TYPE_COUNT];
 static TCGRegSet tcg_target_call_clobber_regs;
 
 #if TCG_TARGET_INSN_UNIT_SIZE == 1

From patchwork Thu Aug 17 23:01:10 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 110347
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 17 Aug 2017 16:01:10 -0700
Message-Id: <20170817230114.3655-5-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.13.5
In-Reply-To: <20170817230114.3655-1-richard.henderson@linaro.org>
References: <20170817230114.3655-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 4/8] tcg: Add operations for host vectors
Cc: qemu-arm@nongnu.org, alex.bennee@linaro.org

Nothing uses or implements them yet.

Signed-off-by: Richard Henderson
---
 tcg/tcg-opc.h | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.h     | 24 ++++++++++++++++
 2 files changed, 113 insertions(+)

-- 
2.13.5

Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Alex Bennée

diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index 956fb1e9f3..9162125fac 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -206,6 +206,95 @@ DEF(qemu_st_i64, 0, TLADDR_ARGS + DATA64_ARGS, 1,
 
 #undef TLADDR_ARGS
 #undef DATA64_ARGS
+
+/* Host integer vector operations. */
+/* These opcodes are required whenever the base vector size is enabled. */
+
+DEF(mov_v64, 1, 1, 0, IMPL(TCG_TARGET_HAS_v64))
+DEF(mov_v128, 1, 1, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(mov_v256, 1, 1, 0, IMPL(TCG_TARGET_HAS_v256))
+
+DEF(movi_v64, 1, 0, 1, IMPL(TCG_TARGET_HAS_v64))
+DEF(movi_v128, 1, 0, 1, IMPL(TCG_TARGET_HAS_v128))
+DEF(movi_v256, 1, 0, 1, IMPL(TCG_TARGET_HAS_v256))
+
+DEF(ld_v64, 1, 1, 1, IMPL(TCG_TARGET_HAS_v64))
+DEF(ld_v128, 1, 1, 1, IMPL(TCG_TARGET_HAS_v128))
+DEF(ld_v256, 1, 1, 1, IMPL(TCG_TARGET_HAS_v256))
+
+DEF(st_v64, 0, 2, 1, IMPL(TCG_TARGET_HAS_v64))
+DEF(st_v128, 0, 2, 1, IMPL(TCG_TARGET_HAS_v128))
+DEF(st_v256, 0, 2, 1, IMPL(TCG_TARGET_HAS_v256))
+
+DEF(and_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+DEF(and_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(and_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+
+DEF(or_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+DEF(or_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(or_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+
+DEF(xor_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+DEF(xor_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(xor_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+
+DEF(add8_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+DEF(add16_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+DEF(add32_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+
+DEF(add8_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(add16_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(add32_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(add64_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+
+DEF(add8_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+DEF(add16_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+DEF(add32_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+DEF(add64_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+
+DEF(sub8_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+DEF(sub16_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+DEF(sub32_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_v64))
+
+DEF(sub8_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(sub16_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(sub32_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+DEF(sub64_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_v128))
+
+DEF(sub8_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+DEF(sub16_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+DEF(sub32_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+DEF(sub64_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_v256))
+
+/* These opcodes are optional.
+   All element counts must be supported if any are. */
+
+DEF(not_v64, 1, 1, 0, IMPL(TCG_TARGET_HAS_not_v64))
+DEF(not_v128, 1, 1, 0, IMPL(TCG_TARGET_HAS_not_v128))
+DEF(not_v256, 1, 1, 0, IMPL(TCG_TARGET_HAS_not_v256))
+
+DEF(andc_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_andc_v64))
+DEF(andc_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_andc_v128))
+DEF(andc_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_andc_v256))
+
+DEF(orc_v64, 1, 2, 0, IMPL(TCG_TARGET_HAS_orc_v64))
+DEF(orc_v128, 1, 2, 0, IMPL(TCG_TARGET_HAS_orc_v128))
+DEF(orc_v256, 1, 2, 0, IMPL(TCG_TARGET_HAS_orc_v256))
+
+DEF(neg8_v64, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v64))
+DEF(neg16_v64, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v64))
+DEF(neg32_v64, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v64))
+
+DEF(neg8_v128, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v128))
+DEF(neg16_v128, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v128))
+DEF(neg32_v128, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v128))
+DEF(neg64_v128, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v128))
+
+DEF(neg8_v256, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v256))
+DEF(neg16_v256, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v256))
+DEF(neg32_v256, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v256))
+DEF(neg64_v256, 1, 1, 0, IMPL(TCG_TARGET_HAS_neg_v256))
+
 #undef IMPL
 #undef IMPL64
 #undef DEF
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 1277caed3d..b9e15da13b 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -166,6 +166,30 @@ typedef uint64_t TCGRegSet;
 #define TCG_TARGET_HAS_rem_i64 0
 #endif
 
+#ifndef TCG_TARGET_HAS_v64
+#define TCG_TARGET_HAS_v64 0
+#define TCG_TARGET_HAS_andc_v64 0
+#define TCG_TARGET_HAS_orc_v64 0
+#define TCG_TARGET_HAS_not_v64 0
+#define TCG_TARGET_HAS_neg_v64 0
+#endif
+
+#ifndef TCG_TARGET_HAS_v128
+#define TCG_TARGET_HAS_v128 0
+#define TCG_TARGET_HAS_andc_v128 0
+#define TCG_TARGET_HAS_orc_v128 0
+#define TCG_TARGET_HAS_not_v128 0
+#define TCG_TARGET_HAS_neg_v128 0
+#endif
+
+#ifndef TCG_TARGET_HAS_v256
+#define TCG_TARGET_HAS_v256 0
+#define TCG_TARGET_HAS_andc_v256 0
+#define TCG_TARGET_HAS_orc_v256 0
+#define TCG_TARGET_HAS_not_v256 0
+#define TCG_TARGET_HAS_neg_v256 0
+#endif
+
 /* For 32-bit targets, some sort of unsigned widening multiply is required. */
 #if TCG_TARGET_REG_BITS == 32 \
     && !(defined(TCG_TARGET_HAS_mulu2_i32) \

From patchwork Thu Aug 17 23:01:11 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 110352
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 17 Aug 2017 16:01:11 -0700
Message-Id: <20170817230114.3655-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.13.5
In-Reply-To: <20170817230114.3655-1-richard.henderson@linaro.org>
References: <20170817230114.3655-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 5/8] tcg: Add tcg_op_supported
Cc: qemu-arm@nongnu.org, alex.bennee@linaro.org

Signed-off-by: Richard Henderson
---
 tcg/tcg.h |   2 +
 tcg/tcg.c | 310 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 312 insertions(+)

-- 
2.13.5

Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Alex Bennée

diff --git a/tcg/tcg.h b/tcg/tcg.h
index b9e15da13b..b443143b21 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -962,6 +962,8 @@ do {\
 #define tcg_temp_free_ptr(T) tcg_temp_free_i64(TCGV_PTR_TO_NAT(T))
 #endif
 
+bool tcg_op_supported(TCGOpcode op);
+
 void tcg_gen_callN(TCGContext *s, void *func,
                    TCGArg ret, int nargs, TCGArg *args);
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index ea78d47fad..3c3cdda938 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -751,6 +751,316 @@ int tcg_check_temp_count(void)
 }
 #endif
 
+/* Return true if OP may appear in the opcode stream.
+   Test the runtime variable that controls each opcode. */
+bool tcg_op_supported(TCGOpcode op)
+{
+    switch (op) {
+    case INDEX_op_discard:
+    case INDEX_op_set_label:
+    case INDEX_op_call:
+    case INDEX_op_br:
+    case INDEX_op_mb:
+    case INDEX_op_insn_start:
+    case INDEX_op_exit_tb:
+    case INDEX_op_goto_tb:
+    case INDEX_op_qemu_ld_i32:
+    case INDEX_op_qemu_st_i32:
+    case INDEX_op_qemu_ld_i64:
+    case INDEX_op_qemu_st_i64:
+        return true;
+
+    case INDEX_op_goto_ptr:
+        return TCG_TARGET_HAS_goto_ptr;
+
+    case INDEX_op_mov_i32:
+    case INDEX_op_movi_i32:
+    case INDEX_op_setcond_i32:
+    case INDEX_op_brcond_i32:
+    case INDEX_op_ld8u_i32:
+    case INDEX_op_ld8s_i32:
+    case INDEX_op_ld16u_i32:
+    case INDEX_op_ld16s_i32:
+    case INDEX_op_ld_i32:
+    case INDEX_op_st8_i32:
+    case INDEX_op_st16_i32:
+    case INDEX_op_st_i32:
+    case INDEX_op_add_i32:
+    case INDEX_op_sub_i32:
+    case INDEX_op_mul_i32:
+    case INDEX_op_and_i32:
+    case INDEX_op_or_i32:
+    case INDEX_op_xor_i32:
+    case INDEX_op_shl_i32:
+    case INDEX_op_shr_i32:
+    case INDEX_op_sar_i32:
+        return true;
+
+    case INDEX_op_movcond_i32:
+        return TCG_TARGET_HAS_movcond_i32;
+    case INDEX_op_div_i32:
+    case INDEX_op_divu_i32:
+        return TCG_TARGET_HAS_div_i32;
+    case INDEX_op_rem_i32:
+    case INDEX_op_remu_i32:
+        return TCG_TARGET_HAS_rem_i32;
+    case INDEX_op_div2_i32:
+    case INDEX_op_divu2_i32:
+        return TCG_TARGET_HAS_div2_i32;
+    case INDEX_op_rotl_i32:
+    case INDEX_op_rotr_i32:
+        return TCG_TARGET_HAS_rot_i32;
+    case INDEX_op_deposit_i32:
+        return TCG_TARGET_HAS_deposit_i32;
+    case INDEX_op_extract_i32:
+        return TCG_TARGET_HAS_extract_i32;
+    case INDEX_op_sextract_i32:
+        return TCG_TARGET_HAS_sextract_i32;
+    case INDEX_op_add2_i32:
+        return TCG_TARGET_HAS_add2_i32;
+    case INDEX_op_sub2_i32:
+        return TCG_TARGET_HAS_sub2_i32;
+    case INDEX_op_mulu2_i32:
+        return TCG_TARGET_HAS_mulu2_i32;
+    case INDEX_op_muls2_i32:
+        return TCG_TARGET_HAS_muls2_i32;
+    case INDEX_op_muluh_i32:
+        return TCG_TARGET_HAS_muluh_i32;
+    case INDEX_op_mulsh_i32:
+        return TCG_TARGET_HAS_mulsh_i32;
+    case INDEX_op_ext8s_i32:
+        return TCG_TARGET_HAS_ext8s_i32;
+    case INDEX_op_ext16s_i32:
+        return TCG_TARGET_HAS_ext16s_i32;
+    case INDEX_op_ext8u_i32:
+        return TCG_TARGET_HAS_ext8u_i32;
+    case INDEX_op_ext16u_i32:
+        return TCG_TARGET_HAS_ext16u_i32;
+    case INDEX_op_bswap16_i32:
+        return TCG_TARGET_HAS_bswap16_i32;
+    case INDEX_op_bswap32_i32:
+        return TCG_TARGET_HAS_bswap32_i32;
+    case INDEX_op_not_i32:
+        return TCG_TARGET_HAS_not_i32;
+    case INDEX_op_neg_i32:
+        return TCG_TARGET_HAS_neg_i32;
+    case INDEX_op_andc_i32:
+        return TCG_TARGET_HAS_andc_i32;
+    case INDEX_op_orc_i32:
+        return TCG_TARGET_HAS_orc_i32;
+    case INDEX_op_eqv_i32:
+        return TCG_TARGET_HAS_eqv_i32;
+    case INDEX_op_nand_i32:
+        return TCG_TARGET_HAS_nand_i32;
+    case INDEX_op_nor_i32:
+        return TCG_TARGET_HAS_nor_i32;
+    case INDEX_op_clz_i32:
+        return TCG_TARGET_HAS_clz_i32;
+    case INDEX_op_ctz_i32:
+        return TCG_TARGET_HAS_ctz_i32;
+    case INDEX_op_ctpop_i32:
+        return TCG_TARGET_HAS_ctpop_i32;
+
+    case INDEX_op_brcond2_i32:
+    case INDEX_op_setcond2_i32:
+        return TCG_TARGET_REG_BITS == 32;
+
+    case INDEX_op_mov_i64:
+    case INDEX_op_movi_i64:
+    case INDEX_op_setcond_i64:
+    case INDEX_op_brcond_i64:
+    case INDEX_op_ld8u_i64:
+    case INDEX_op_ld8s_i64:
+    case INDEX_op_ld16u_i64:
+    case INDEX_op_ld16s_i64:
+    case INDEX_op_ld32u_i64:
+    case INDEX_op_ld32s_i64:
+    case INDEX_op_ld_i64:
+    case INDEX_op_st8_i64:
+    case INDEX_op_st16_i64:
+    case INDEX_op_st32_i64:
+    case INDEX_op_st_i64:
+    case INDEX_op_add_i64:
+    case INDEX_op_sub_i64:
+    case INDEX_op_mul_i64:
+    case INDEX_op_and_i64:
+    case INDEX_op_or_i64:
+    case INDEX_op_xor_i64:
+    case INDEX_op_shl_i64:
+    case INDEX_op_shr_i64:
+    case INDEX_op_sar_i64:
+    case INDEX_op_ext_i32_i64:
+    case INDEX_op_extu_i32_i64:
+        return TCG_TARGET_REG_BITS == 64;
+
+    case INDEX_op_movcond_i64:
+        return TCG_TARGET_HAS_movcond_i64;
+    case INDEX_op_div_i64:
+    case INDEX_op_divu_i64:
+        return TCG_TARGET_HAS_div_i64;
+    case INDEX_op_rem_i64:
+    case INDEX_op_remu_i64:
+        return TCG_TARGET_HAS_rem_i64;
+    case INDEX_op_div2_i64:
+    case INDEX_op_divu2_i64:
+        return TCG_TARGET_HAS_div2_i64;
+    case INDEX_op_rotl_i64:
+    case INDEX_op_rotr_i64:
+        return TCG_TARGET_HAS_rot_i64;
+    case INDEX_op_deposit_i64:
+        return TCG_TARGET_HAS_deposit_i64;
+    case INDEX_op_extract_i64:
+        return TCG_TARGET_HAS_extract_i64;
+    case INDEX_op_sextract_i64:
+        return TCG_TARGET_HAS_sextract_i64;
+    case INDEX_op_extrl_i64_i32:
+        return TCG_TARGET_HAS_extrl_i64_i32;
+    case INDEX_op_extrh_i64_i32:
+        return TCG_TARGET_HAS_extrh_i64_i32;
+    case INDEX_op_ext8s_i64:
+        return TCG_TARGET_HAS_ext8s_i64;
+    case INDEX_op_ext16s_i64:
+        return TCG_TARGET_HAS_ext16s_i64;
+    case INDEX_op_ext32s_i64:
+        return TCG_TARGET_HAS_ext32s_i64;
+    case INDEX_op_ext8u_i64:
+        return TCG_TARGET_HAS_ext8u_i64;
+    case INDEX_op_ext16u_i64:
+        return TCG_TARGET_HAS_ext16u_i64;
+    case INDEX_op_ext32u_i64:
+        return TCG_TARGET_HAS_ext32u_i64;
+    case INDEX_op_bswap16_i64:
+        return TCG_TARGET_HAS_bswap16_i64;
+    case INDEX_op_bswap32_i64:
+        return TCG_TARGET_HAS_bswap32_i64;
+    case INDEX_op_bswap64_i64:
+        return TCG_TARGET_HAS_bswap64_i64;
+    case INDEX_op_not_i64:
+        return TCG_TARGET_HAS_not_i64;
+    case INDEX_op_neg_i64:
+        return TCG_TARGET_HAS_neg_i64;
+    case INDEX_op_andc_i64:
+        return TCG_TARGET_HAS_andc_i64;
+    case INDEX_op_orc_i64:
+        return TCG_TARGET_HAS_orc_i64;
+    case INDEX_op_eqv_i64:
+        return TCG_TARGET_HAS_eqv_i64;
+    case INDEX_op_nand_i64:
+        return TCG_TARGET_HAS_nand_i64;
+    case INDEX_op_nor_i64:
+        return TCG_TARGET_HAS_nor_i64;
+    case INDEX_op_clz_i64:
+        return TCG_TARGET_HAS_clz_i64;
+    case INDEX_op_ctz_i64:
+        return TCG_TARGET_HAS_ctz_i64;
+    case INDEX_op_ctpop_i64:
+        return TCG_TARGET_HAS_ctpop_i64;
+    case INDEX_op_add2_i64:
+        return TCG_TARGET_HAS_add2_i64;
+    case INDEX_op_sub2_i64:
+        return TCG_TARGET_HAS_sub2_i64;
+    case INDEX_op_mulu2_i64:
+        return TCG_TARGET_HAS_mulu2_i64;
+    case INDEX_op_muls2_i64:
+        return TCG_TARGET_HAS_muls2_i64;
+    case INDEX_op_muluh_i64:
+        return
TCG_TARGET_HAS_muluh_i64; + case INDEX_op_mulsh_i64: + return TCG_TARGET_HAS_mulsh_i64; + + case INDEX_op_mov_v64: + case INDEX_op_movi_v64: + case INDEX_op_ld_v64: + case INDEX_op_st_v64: + case INDEX_op_and_v64: + case INDEX_op_or_v64: + case INDEX_op_xor_v64: + case INDEX_op_add8_v64: + case INDEX_op_add16_v64: + case INDEX_op_add32_v64: + case INDEX_op_sub8_v64: + case INDEX_op_sub16_v64: + case INDEX_op_sub32_v64: + return TCG_TARGET_HAS_v64; + + case INDEX_op_mov_v128: + case INDEX_op_movi_v128: + case INDEX_op_ld_v128: + case INDEX_op_st_v128: + case INDEX_op_and_v128: + case INDEX_op_or_v128: + case INDEX_op_xor_v128: + case INDEX_op_add8_v128: + case INDEX_op_add16_v128: + case INDEX_op_add32_v128: + case INDEX_op_add64_v128: + case INDEX_op_sub8_v128: + case INDEX_op_sub16_v128: + case INDEX_op_sub32_v128: + case INDEX_op_sub64_v128: + return TCG_TARGET_HAS_v128; + + case INDEX_op_mov_v256: + case INDEX_op_movi_v256: + case INDEX_op_ld_v256: + case INDEX_op_st_v256: + case INDEX_op_and_v256: + case INDEX_op_or_v256: + case INDEX_op_xor_v256: + case INDEX_op_add8_v256: + case INDEX_op_add16_v256: + case INDEX_op_add32_v256: + case INDEX_op_add64_v256: + case INDEX_op_sub8_v256: + case INDEX_op_sub16_v256: + case INDEX_op_sub32_v256: + case INDEX_op_sub64_v256: + return TCG_TARGET_HAS_v256; + + case INDEX_op_not_v64: + return TCG_TARGET_HAS_not_v64; + case INDEX_op_not_v128: + return TCG_TARGET_HAS_not_v128; + case INDEX_op_not_v256: + return TCG_TARGET_HAS_not_v256; + + case INDEX_op_andc_v64: + return TCG_TARGET_HAS_andc_v64; + case INDEX_op_andc_v128: + return TCG_TARGET_HAS_andc_v128; + case INDEX_op_andc_v256: + return TCG_TARGET_HAS_andc_v256; + + case INDEX_op_orc_v64: + return TCG_TARGET_HAS_orc_v64; + case INDEX_op_orc_v128: + return TCG_TARGET_HAS_orc_v128; + case INDEX_op_orc_v256: + return TCG_TARGET_HAS_orc_v256; + + case INDEX_op_neg8_v64: + case INDEX_op_neg16_v64: + case INDEX_op_neg32_v64: + return TCG_TARGET_HAS_neg_v64; + + case 
INDEX_op_neg8_v128: + case INDEX_op_neg16_v128: + case INDEX_op_neg32_v128: + case INDEX_op_neg64_v128: + return TCG_TARGET_HAS_neg_v128; + + case INDEX_op_neg8_v256: + case INDEX_op_neg16_v256: + case INDEX_op_neg32_v256: + case INDEX_op_neg64_v256: + return TCG_TARGET_HAS_neg_v256; + + case NB_OPS: + break; + } + g_assert_not_reached(); +} + /* Note: we convert the 64 bit args to 32 bit and do some alignment and endian swap. Maybe it would be better to do the alignment and endian swap in tcg_reg_alloc_call(). */ From patchwork Thu Aug 17 23:01:12 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 110353 Delivered-To: patch@linaro.org Received: by 10.140.95.78 with SMTP id h72csp151032qge; Thu, 17 Aug 2017 16:08:24 -0700 (PDT) X-Received: by 10.200.8.20 with SMTP id u20mr9757668qth.287.1503011304521; Thu, 17 Aug 2017 16:08:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1503011304; cv=none; d=google.com; s=arc-20160816; b=XerWPOD7wkKeQ+f3/iD2tNX2ciGTM+Pd7tDBIaebOF8bNBXXmd/eEmgn0PluTQjLeO xOjnJot6uhOi2bMQbn2MJw7ZNLeMVDiRnBkfzn4LhqYxlkZVw95lUl+vJjlTCHDm/32Y 8MoeDAbOQUYMqZLUma3jq6CX3kqxf39z+9xSDJ467kmFMlp4pmYuMDPuIjHx/KWRVj3/ d4UB96oXbprPgUlRXUfSMWGd4q79UPuTQt8252+o3deweFGXkNO4bWON2DV6zGrGevmu EphmOIAJAdrdBW/afPAOjf+9NVp/eVumMrBhJ0c1tQyj5x+/HIoSkzNmDZEkFDNDBPi/ +Cow== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=dpdlyo3XtVpsa3ftb6C8wRlu5yzXE0Obkds6jVxb5hw=; b=W5atO/UqhwcTVQhzacIccUBNK7EpeaFtKdRppGGPOUWuukUF5t0uj9tVehq7V2XU/5 ZDQkPiFTD0MZzbrSpHr8xEwsMUEtezmTvi+/4wVGHljW3v7KzlD1hVo/+2cxNy+PAG4d 9Pfy5pQxv3lSeTGkGrQJFHROTwofNMu/9EtI1WBsfSE9RDKrPBhBPPDZ2sH1/KaIIWTw 
mbjkpNOjOlxFRQrOdYfmUmP1YGs00XcNKf4I6GKxMsalfsgEG9M+BPBeqoYLVannatWC u/6LRZ26YWQcjMJijXgn/erPF8q/pNGrSfAmOCCpkw/XNPQWrLcT5OtH7aJrwNhpitzU eKqA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=NYmCuYgE; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id b189si3934029qkd.383.2017.08.17.16.08.24 for (version=TLS1 cipher=AES128-SHA bits=128/128); Thu, 17 Aug 2017 16:08:24 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=NYmCuYgE; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:56502 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1diTtm-0001rw-FC for patch@linaro.org; Thu, 17 Aug 2017 19:08:22 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44564) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1diTn5-0004xV-P7 for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:29 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1diTn5-0000y2-3z for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:27 -0400 Received: from mail-pg0-x22d.google.com ([2607:f8b0:400e:c05::22d]:36560) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from 
) id 1diTn4-0000xc-VB for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:27 -0400 Received: by mail-pg0-x22d.google.com with SMTP id i12so51987893pgr.3 for ; Thu, 17 Aug 2017 16:01:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=dpdlyo3XtVpsa3ftb6C8wRlu5yzXE0Obkds6jVxb5hw=; b=NYmCuYgEnbczxXFGsRPIauPB+03hnctKKtTaOoKZa/Yek3w5jieLYsou8TgpIhmJ1o RFdWq32rcqfaZdS8/7R9H5HZPnuhWF6LajCpasBijtif4pxANchy6RgjHRZnZeX5TYRR 4aim+tHJHsAEv80MSBeZJW1bQMHHRUpXc+Ow8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=dpdlyo3XtVpsa3ftb6C8wRlu5yzXE0Obkds6jVxb5hw=; b=AAac+EqPjQcW2nBivgp7f20nVa/wcaRNHmDw97zYtXxmT0ssBaIcKX9iTJFf9uYsF+ sbpUeXuNJwInkgeEYjx/AO1ByBc9NobBkYYk26ITJFkix1xqdA4ByzJ3+Eet3/JkALdz RU+wegMyQ+NVe6Qb21EKUanUM/IK/H69jvWmbnaZtSSnndcGpZNMm0xK/CDDgl9Xx79x H/WKlhDYvP/AyFx/XybegBhYo2qOR1iCZf/rmrS/u8a/kISoJrGInDTRxha02Vbi8nYh WrPgi6z/i1Of686Nrzsn/oprCCMqcOzClfR2K0jvyjlZx0m+4zV33zonbVe+NujJ/Ub6 5fhQ== X-Gm-Message-State: AHYfb5ggvXa39lZMOfqovHkku7/kvZFFHjii007A8WMriMzADkw5cCZx QTO1Ox/Dpal3LrqASSBCwQ== X-Received: by 10.98.14.93 with SMTP id w90mr6909429pfi.298.1503010885018; Thu, 17 Aug 2017 16:01:25 -0700 (PDT) Received: from bigtime.twiddle.net (97-126-108-236.tukw.qwest.net. 
[97.126.108.236]) by smtp.gmail.com with ESMTPSA id c23sm5190043pfc.136.2017.08.17.16.01.23 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 17 Aug 2017 16:01:24 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Thu, 17 Aug 2017 16:01:12 -0700 Message-Id: <20170817230114.3655-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20170817230114.3655-1-richard.henderson@linaro.org> References: <20170817230114.3655-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22d Subject: [Qemu-devel] [PATCH 6/8] tcg: Add INDEX_op_invalid X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Add with value 0 so that structure zero initialization can indicate that the field is not present. 
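
The intent can be illustrated with a minimal standalone sketch. The names
below (DemoOpcode, DemoGen, demo_gen, demo_op_supported) are hypothetical
stand-ins rather than QEMU identifiers. The sketch assumes only that the
sentinel enumerator has value 0, so that a member omitted from a designated
initializer reads back as the invalid opcode.

```c
#include <stdbool.h>

/* Hypothetical opcode enumeration; the sentinel comes first, so it is 0. */
typedef enum {
    DEMO_op_invalid = 0,
    DEMO_op_add_v64,
    DEMO_op_add_v128,
} DemoOpcode;

/* Hypothetical descriptor with optional per-width opcodes. */
typedef struct {
    DemoOpcode op_v128;
    DemoOpcode op_v64;
} DemoGen;

/* Members left out of the initializer are zero-initialized by C,
   so op_v64 is DEMO_op_invalid here without being named. */
static const DemoGen demo_gen = {
    .op_v128 = DEMO_op_add_v128,
};

/* Stand-in for a support query: the sentinel is never supported. */
static bool demo_op_supported(DemoOpcode op)
{
    switch (op) {
    case DEMO_op_invalid:
        return false;
    case DEMO_op_add_v64:
    case DEMO_op_add_v128:
        return true;
    }
    return false;
}
```

With this layout a caller can test demo_op_supported(demo_gen.op_v64) and
move on to another expansion when the member was never set.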
Signed-off-by: Richard Henderson --- tcg/tcg-opc.h | 2 ++ tcg/tcg.c | 3 +++ 2 files changed, 5 insertions(+) -- 2.13.5 Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Alex Bennée diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h index 9162125fac..b1445a4c24 100644 --- a/tcg/tcg-opc.h +++ b/tcg/tcg-opc.h @@ -26,6 +26,8 @@ * DEF(name, oargs, iargs, cargs, flags) */ +DEF(invalid, 0, 0, 0, TCG_OPF_NOT_PRESENT) + /* predefined ops */ DEF(discard, 1, 0, 0, TCG_OPF_NOT_PRESENT) DEF(set_label, 0, 0, 1, TCG_OPF_BB_END | TCG_OPF_NOT_PRESENT) diff --git a/tcg/tcg.c b/tcg/tcg.c index 3c3cdda938..879b29e81f 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -756,6 +756,9 @@ int tcg_check_temp_count(void) bool tcg_op_supported(TCGOpcode op) { switch (op) { + case INDEX_op_invalid: + return false; + case INDEX_op_discard: case INDEX_op_set_label: case INDEX_op_call: From patchwork Thu Aug 17 23:01:13 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 110354 Delivered-To: patch@linaro.org Received: by 10.140.95.78 with SMTP id h72csp152339qge; Thu, 17 Aug 2017 16:09:53 -0700 (PDT) X-Received: by 10.237.60.8 with SMTP id t8mr9859979qte.56.1503011393901; Thu, 17 Aug 2017 16:09:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1503011393; cv=none; d=google.com; s=arc-20160816; b=WtBXH+A8HHVc8yRdzENURKNIAXJsSoxiGa/lp4SDmgtE1HrtZH9TiO0UHsZ675CIxD z4pzrkXnQlecv9BSkZzY+OPHswlWs2PzXUSAvbayRB3EQ3qf4YZANYxEtfOUvfv1bu7P 8W5q4NTekCFw/ojvrbRkRgJ5tekveUTo/fozHCJD2tu340tqpF41d1588pyzbFFMeIUK onwNRLtvUnTc2FTCvyd2NDYadx2Do3NCwOqkHaIC8dTSrfN/xwVRS8ta1gM+/fPGtXVy iq0aFtWV/TiJYifx6NBYc8UZjX6Erq9Zok2U/V2sGtZtpqo+fuwy46/4w3Y1TBfx73dr DNQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to 
:message-id:date:to:from:dkim-signature:arc-authentication-results; bh=ayDUNuMxBJiB70+Pu5TZupwFTb2kohK2+QYR/Q2F6rg=; b=jghcsngs1L36+18vf/P6TTH+T4pdjtOFjUgZ2IQMsaBzBnqHmlUxuAcRrC6Xns6p39 vLgoXKdKpYgIj8/DQcOJ0rBl9vsNI2kVvHIJ32Qf8EX77//4g9eUiGuwfq6c3CqVaHt2 ErMG8B2soPLlkuKf4IPpoyUQV6IlB1o1II/qDvS02TEK771GeY3orGL31HJG9efNXjWM BQt/VRnyfxjifisIqbpphTiqoXU4xFSNGVem7sRiY+0G2birIokkjT/i86sWf1mOj7zn z9QAJIvRWFZLNrk3jPBxOM69o5LS+af9BFxRrmms7MXsuAi01v+ce3/KzCuH5oxliHvW E8/A== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=V4mEpb1x; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id n19si3955802qtk.551.2017.08.17.16.09.53 for (version=TLS1 cipher=AES128-SHA bits=128/128); Thu, 17 Aug 2017 16:09:53 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=V4mEpb1x; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:56685 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1diTvD-0003Hs-Kh for patch@linaro.org; Thu, 17 Aug 2017 19:09:51 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44637) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1diTnB-00053N-U4 for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:35 -0400 
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1diTn7-0000yz-1O for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:34 -0400 Received: from mail-pg0-x230.google.com ([2607:f8b0:400e:c05::230]:36567) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1diTn6-0000yi-QL for qemu-devel@nongnu.org; Thu, 17 Aug 2017 19:01:28 -0400 Received: by mail-pg0-x230.google.com with SMTP id i12so51988336pgr.3 for ; Thu, 17 Aug 2017 16:01:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ayDUNuMxBJiB70+Pu5TZupwFTb2kohK2+QYR/Q2F6rg=; b=V4mEpb1xOMQrlcvGv1KqgEZ+kPVth2kZllsNtYaFD+kChXCtn2lwjYNfWyXJpIrLFJ 7gJyLIRWPOgAFTZUVy/76rkddSgBcQYecfccNy3Zpm8nuSokPtFo/0NPC5s+LJjhM5gS XQNTD20jWF4tLuNrw6wJAa1wZy0l2D1nhZnk8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ayDUNuMxBJiB70+Pu5TZupwFTb2kohK2+QYR/Q2F6rg=; b=nnW/Hgyx0YnBWYxgd2BOujoUZT4UXyXCIsd++oAiUqIcvzjsops9liL764U/t+/FU1 KvzRg1gDdbgFj86owSiFlfSciGmK47oUMH20sBs6X7/WuPR193iu0hYFEn1tG7xzFQNH 8ONJ4QmVkXLOkXHKn5x1kPQ0CvWah5i45Baykbq808sd4CXFQ/DojLjJCELSDnM1hf3g prlQcTu1LqQnr47g+0PALhdpXWq2nXls5noCuktvn6pzbZGlHXznH82UtITMmIxGxf9H xKLnDnnqSO+VcqHsbya1CALVhn7JTF+l2tPfINf+/48ZwFAJ4iEnsz3+ZXFJnx+m5uIi +dFA== X-Gm-Message-State: AHYfb5hdQNcoxsHZv5wsu56t69WvMks+IfY6TdM6tPCVj5JzXrDUrktK 0KBH35Qi+yXNSdv4JXuKkA== X-Received: by 10.98.86.195 with SMTP id h64mr6585247pfj.99.1503010886329; Thu, 17 Aug 2017 16:01:26 -0700 (PDT) Received: from bigtime.twiddle.net (97-126-108-236.tukw.qwest.net. 
[97.126.108.236]) by smtp.gmail.com with ESMTPSA id c23sm5190043pfc.136.2017.08.17.16.01.25 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 17 Aug 2017 16:01:25 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Thu, 17 Aug 2017 16:01:13 -0700 Message-Id: <20170817230114.3655-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20170817230114.3655-1-richard.henderson@linaro.org> References: <20170817230114.3655-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::230 Subject: [Qemu-devel] [PATCH 7/8] tcg: Expand target vector ops with host vector ops X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/tcg-op-gvec.h | 4 + tcg/tcg.h | 6 +- tcg/tcg-op-gvec.c | 230 +++++++++++++++++++++++++++++++++++++++++++----------- tcg/tcg.c | 8 +- 4 files changed, 197 insertions(+), 51 deletions(-) -- 2.13.5 diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h index 10db3599a5..99f36d208e 100644 --- a/tcg/tcg-op-gvec.h +++ b/tcg/tcg-op-gvec.h @@ -40,6 +40,10 @@ typedef struct { /* Similarly, but load up a constant and re-use across lanes. */ void (*fni8x)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64); uint64_t extra_value; + /* Operations with host vector ops. */ + TCGOpcode op_v256; + TCGOpcode op_v128; + TCGOpcode op_v64; /* Larger sizes: expand out-of-line helper w/size descriptor. 
*/ void (*fno)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); } GVecGen3; diff --git a/tcg/tcg.h b/tcg/tcg.h index b443143b21..7f10501d31 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -825,9 +825,11 @@ int tcg_global_mem_new_internal(TCGType, TCGv_ptr, intptr_t, const char *); TCGv_i32 tcg_global_reg_new_i32(TCGReg reg, const char *name); TCGv_i64 tcg_global_reg_new_i64(TCGReg reg, const char *name); -TCGv_i32 tcg_temp_new_internal_i32(int temp_local); -TCGv_i64 tcg_temp_new_internal_i64(int temp_local); +int tcg_temp_new_internal(TCGType type, bool temp_local); +TCGv_i32 tcg_temp_new_internal_i32(bool temp_local); +TCGv_i64 tcg_temp_new_internal_i64(bool temp_local); +void tcg_temp_free_internal(int arg); void tcg_temp_free_i32(TCGv_i32 arg); void tcg_temp_free_i64(TCGv_i64 arg); diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 6de49dc07f..3aca565dc0 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -30,54 +30,73 @@ #define REP8(x) ((x) * 0x0101010101010101ull) #define REP16(x) ((x) * 0x0001000100010001ull) -#define MAX_INLINE 16 +#define MAX_UNROLL 4 -static inline void check_size_s(uint32_t opsz, uint32_t clsz) +static inline void check_size_align(uint32_t opsz, uint32_t clsz, uint32_t ofs) { - tcg_debug_assert(opsz % 8 == 0); - tcg_debug_assert(clsz % 8 == 0); + uint32_t align = clsz > 16 || opsz >= 16 ? 
15 : 7; + tcg_debug_assert(opsz > 0); tcg_debug_assert(opsz <= clsz); + tcg_debug_assert((opsz & align) == 0); + tcg_debug_assert((clsz & align) == 0); + tcg_debug_assert((ofs & align) == 0); } -static inline void check_align_s_3(uint32_t dofs, uint32_t aofs, uint32_t bofs) +static inline void check_overlap_3(uint32_t d, uint32_t a, + uint32_t b, uint32_t s) { - tcg_debug_assert(dofs % 8 == 0); - tcg_debug_assert(aofs % 8 == 0); - tcg_debug_assert(bofs % 8 == 0); + tcg_debug_assert(d == a || d + s <= a || a + s <= d); + tcg_debug_assert(d == b || d + s <= b || b + s <= d); + tcg_debug_assert(a == b || a + s <= b || b + s <= a); } -static inline void check_size_l(uint32_t opsz, uint32_t clsz) +static inline bool check_size_impl(uint32_t opsz, uint32_t lnsz) { - tcg_debug_assert(opsz % 16 == 0); - tcg_debug_assert(clsz % 16 == 0); - tcg_debug_assert(opsz <= clsz); + uint32_t lnct = opsz / lnsz; + return lnct >= 1 && lnct <= MAX_UNROLL; } -static inline void check_align_l_3(uint32_t dofs, uint32_t aofs, uint32_t bofs) +static void expand_clr_v(uint32_t dofs, uint32_t clsz, uint32_t lnsz, + TCGType type, TCGOpcode opc_mv, TCGOpcode opc_st) { - tcg_debug_assert(dofs % 16 == 0); - tcg_debug_assert(aofs % 16 == 0); - tcg_debug_assert(bofs % 16 == 0); -} + TCGArg t0 = tcg_temp_new_internal(type, 0); + TCGArg env = GET_TCGV_PTR(tcg_ctx.tcg_env); + uint32_t i; -static inline void check_overlap_3(uint32_t d, uint32_t a, - uint32_t b, uint32_t s) -{ - tcg_debug_assert(d == a || d + s <= a || a + s <= d); - tcg_debug_assert(d == b || d + s <= b || b + s <= d); - tcg_debug_assert(a == b || a + s <= b || b + s <= a); + tcg_gen_op2(&tcg_ctx, opc_mv, t0, 0); + for (i = 0; i < clsz; i += lnsz) { + tcg_gen_op3(&tcg_ctx, opc_st, t0, env, dofs + i); + } + tcg_temp_free_internal(t0); } -static void expand_clr(uint32_t dofs, uint32_t opsz, uint32_t clsz) +static void expand_clr(uint32_t dofs, uint32_t clsz) { - if (clsz > opsz) { - TCGv_i64 zero = tcg_const_i64(0); - uint32_t i; + if 
(clsz >= 32 && TCG_TARGET_HAS_v256) { + uint32_t done = QEMU_ALIGN_DOWN(clsz, 32); + expand_clr_v(dofs, done, 32, TCG_TYPE_V256, + INDEX_op_movi_v256, INDEX_op_st_v256); + dofs += done; + clsz -= done; + } - for (i = opsz; i < clsz; i += 8) { - tcg_gen_st_i64(zero, tcg_ctx.tcg_env, dofs + i); - } - tcg_temp_free_i64(zero); + if (clsz >= 16 && TCG_TARGET_HAS_v128) { + uint16_t done = QEMU_ALIGN_DOWN(clsz, 16); + expand_clr_v(dofs, done, 16, TCG_TYPE_V128, + INDEX_op_movi_v128, INDEX_op_st_v128); + dofs += done; + clsz -= done; + } + + if (TCG_TARGET_REG_BITS == 64) { + expand_clr_v(dofs, clsz, 8, TCG_TYPE_I64, + INDEX_op_movi_i64, INDEX_op_st_i64); + } else if (TCG_TARGET_HAS_v64) { + expand_clr_v(dofs, clsz, 8, TCG_TYPE_V64, + INDEX_op_movi_v64, INDEX_op_st_v64); + } else { + expand_clr_v(dofs, clsz, 4, TCG_TYPE_I32, + INDEX_op_movi_i32, INDEX_op_st_i32); } } @@ -164,6 +183,7 @@ static void expand_3x8(uint32_t dofs, uint32_t aofs, tcg_temp_free_i64(t0); } +/* FIXME: add CSE for constants and we can eliminate this. 
*/ static void expand_3x8p1(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t opsz, uint64_t data, void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64)) @@ -192,28 +212,111 @@ static void expand_3x8p1(uint32_t dofs, uint32_t aofs, uint32_t bofs, tcg_temp_free_i64(t2); } +static void expand_3_v(uint32_t dofs, uint32_t aofs, uint32_t bofs, + uint32_t opsz, uint32_t lnsz, TCGType type, + TCGOpcode opc_op, TCGOpcode opc_ld, TCGOpcode opc_st) +{ + TCGArg t0 = tcg_temp_new_internal(type, 0); + TCGArg env = GET_TCGV_PTR(tcg_ctx.tcg_env); + uint32_t i; + + if (aofs == bofs) { + for (i = 0; i < opsz; i += lnsz) { + tcg_gen_op3(&tcg_ctx, opc_ld, t0, env, aofs + i); + tcg_gen_op3(&tcg_ctx, opc_op, t0, t0, t0); + tcg_gen_op3(&tcg_ctx, opc_st, t0, env, dofs + i); + } + } else { + TCGArg t1 = tcg_temp_new_internal(type, 0); + for (i = 0; i < opsz; i += lnsz) { + tcg_gen_op3(&tcg_ctx, opc_ld, t0, env, aofs + i); + tcg_gen_op3(&tcg_ctx, opc_ld, t1, env, bofs + i); + tcg_gen_op3(&tcg_ctx, opc_op, t0, t0, t1); + tcg_gen_op3(&tcg_ctx, opc_st, t0, env, dofs + i); + } + tcg_temp_free_internal(t1); + } + tcg_temp_free_internal(t0); +} + void tcg_gen_gvec_3(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t opsz, uint32_t clsz, const GVecGen3 *g) { + check_size_align(opsz, clsz, dofs | aofs | bofs); check_overlap_3(dofs, aofs, bofs, clsz); - if (opsz <= MAX_INLINE) { - check_size_s(opsz, clsz); - check_align_s_3(dofs, aofs, bofs); - if (g->fni8) { - expand_3x8(dofs, aofs, bofs, opsz, g->fni8); - } else if (g->fni4) { - expand_3x4(dofs, aofs, bofs, opsz, g->fni4); + + if (opsz > MAX_UNROLL * 32 || clsz > MAX_UNROLL * 32) { + goto do_ool; + } + + /* Recall that ARM SVE allows vector sizes that are not a power of 2. + Expand with successively smaller host vector sizes. The intent is + that e.g. opsz == 80 would be expanded with 2x32 + 1x16. */ + /* ??? For clsz > opsz, the host may be able to use an op-sized + operation, zeroing the balance of the register. 
We can then + use a cl-sized store to implement the clearing without an extra + store operation. This is true for aarch64 and x86_64 hosts. */ + + if (check_size_impl(opsz, 32) && tcg_op_supported(g->op_v256)) { + uint32_t done = QEMU_ALIGN_DOWN(opsz, 32); + expand_3_v(dofs, aofs, bofs, done, 32, TCG_TYPE_V256, + g->op_v256, INDEX_op_ld_v256, INDEX_op_st_v256); + dofs += done; + aofs += done; + bofs += done; + opsz -= done; + clsz -= done; + } + + if (check_size_impl(opsz, 16) && tcg_op_supported(g->op_v128)) { + uint32_t done = QEMU_ALIGN_DOWN(opsz, 16); + expand_3_v(dofs, aofs, bofs, done, 16, TCG_TYPE_V128, + g->op_v128, INDEX_op_ld_v128, INDEX_op_st_v128); + dofs += done; + aofs += done; + bofs += done; + opsz -= done; + clsz -= done; + } + + if (check_size_impl(opsz, 8)) { + uint32_t done = QEMU_ALIGN_DOWN(opsz, 8); + if (tcg_op_supported(g->op_v64)) { + expand_3_v(dofs, aofs, bofs, done, 8, TCG_TYPE_V64, + g->op_v64, INDEX_op_ld_v64, INDEX_op_st_v64); + } else if (g->fni8) { + expand_3x8(dofs, aofs, bofs, done, g->fni8); } else if (g->fni8x) { - expand_3x8p1(dofs, aofs, bofs, opsz, g->extra_value, g->fni8x); + expand_3x8p1(dofs, aofs, bofs, done, g->extra_value, g->fni8x); } else { - g_assert_not_reached(); + done = 0; } - expand_clr(dofs, opsz, clsz); - } else { - check_size_l(opsz, clsz); - check_align_l_3(dofs, aofs, bofs); - expand_3_o(dofs, aofs, bofs, opsz, clsz, g->fno); + dofs += done; + aofs += done; + bofs += done; + opsz -= done; + clsz -= done; } + + if (check_size_impl(opsz, 4)) { + uint32_t done = QEMU_ALIGN_DOWN(opsz, 4); + expand_3x4(dofs, aofs, bofs, done, g->fni4); + dofs += done; + aofs += done; + bofs += done; + opsz -= done; + clsz -= done; + } + + if (opsz == 0) { + if (clsz != 0) { + expand_clr(dofs, clsz); + } + return; + } + + do_ool: + expand_3_o(dofs, aofs, bofs, opsz, clsz, g->fno); } static void gen_addv_mask(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b, TCGv_i64 m) @@ -240,6 +343,9 @@ void tcg_gen_gvec_add8(uint32_t dofs, uint32_t aofs, 
uint32_t bofs, static const GVecGen3 g = { .extra_value = REP8(0x80), .fni8x = gen_addv_mask, + .op_v256 = INDEX_op_add8_v256, + .op_v128 = INDEX_op_add8_v128, + .op_v64 = INDEX_op_add8_v64, .fno = gen_helper_gvec_add8, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -251,6 +357,9 @@ void tcg_gen_gvec_add16(uint32_t dofs, uint32_t aofs, uint32_t bofs, static const GVecGen3 g = { .extra_value = REP16(0x8000), .fni8x = gen_addv_mask, + .op_v256 = INDEX_op_add16_v256, + .op_v128 = INDEX_op_add16_v128, + .op_v64 = INDEX_op_add16_v64, .fno = gen_helper_gvec_add16, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -261,6 +370,9 @@ void tcg_gen_gvec_add32(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni4 = tcg_gen_add_i32, + .op_v256 = INDEX_op_add32_v256, + .op_v128 = INDEX_op_add32_v128, + .op_v64 = INDEX_op_add32_v64, .fno = gen_helper_gvec_add32, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -271,6 +383,8 @@ void tcg_gen_gvec_add64(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni8 = tcg_gen_add_i64, + .op_v256 = INDEX_op_add64_v256, + .op_v128 = INDEX_op_add64_v128, .fno = gen_helper_gvec_add64, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -328,6 +442,9 @@ void tcg_gen_gvec_sub8(uint32_t dofs, uint32_t aofs, uint32_t bofs, static const GVecGen3 g = { .extra_value = REP8(0x80), .fni8x = gen_subv_mask, + .op_v256 = INDEX_op_sub8_v256, + .op_v128 = INDEX_op_sub8_v128, + .op_v64 = INDEX_op_sub8_v64, .fno = gen_helper_gvec_sub8, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -339,6 +456,9 @@ void tcg_gen_gvec_sub16(uint32_t dofs, uint32_t aofs, uint32_t bofs, static const GVecGen3 g = { .extra_value = REP16(0x8000), .fni8x = gen_subv_mask, + .op_v256 = INDEX_op_sub16_v256, + .op_v128 = INDEX_op_sub16_v128, + .op_v64 = INDEX_op_sub16_v64, .fno = gen_helper_gvec_sub16, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -349,6 +469,9 @@ void 
tcg_gen_gvec_sub32(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni4 = tcg_gen_sub_i32, + .op_v256 = INDEX_op_sub32_v256, + .op_v128 = INDEX_op_sub32_v128, + .op_v64 = INDEX_op_sub32_v64, .fno = gen_helper_gvec_sub32, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -359,6 +482,8 @@ void tcg_gen_gvec_sub64(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni8 = tcg_gen_sub_i64, + .op_v256 = INDEX_op_sub64_v256, + .op_v128 = INDEX_op_sub64_v128, .fno = gen_helper_gvec_sub64, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -397,6 +522,9 @@ void tcg_gen_gvec_and8(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni8 = tcg_gen_and_i64, + .op_v256 = INDEX_op_and_v256, + .op_v128 = INDEX_op_and_v128, + .op_v64 = INDEX_op_and_v64, .fno = gen_helper_gvec_and8, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -407,6 +535,9 @@ void tcg_gen_gvec_or8(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni8 = tcg_gen_or_i64, + .op_v256 = INDEX_op_or_v256, + .op_v128 = INDEX_op_or_v128, + .op_v64 = INDEX_op_or_v64, .fno = gen_helper_gvec_or8, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -417,6 +548,9 @@ void tcg_gen_gvec_xor8(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni8 = tcg_gen_xor_i64, + .op_v256 = INDEX_op_xor_v256, + .op_v128 = INDEX_op_xor_v128, + .op_v64 = INDEX_op_xor_v64, .fno = gen_helper_gvec_xor8, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -427,6 +561,9 @@ void tcg_gen_gvec_andc8(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni8 = tcg_gen_andc_i64, + .op_v256 = INDEX_op_andc_v256, + .op_v128 = INDEX_op_andc_v128, + .op_v64 = INDEX_op_andc_v64, .fno = gen_helper_gvec_andc8, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); @@ -437,6 +574,9 @@ void tcg_gen_gvec_orc8(uint32_t dofs, uint32_t aofs, uint32_t bofs, { static const GVecGen3 g = { .fni8 = 
tcg_gen_orc_i64, + .op_v256 = INDEX_op_orc_v256, + .op_v128 = INDEX_op_orc_v128, + .op_v64 = INDEX_op_orc_v64, .fno = gen_helper_gvec_orc8, }; tcg_gen_gvec_3(dofs, aofs, bofs, opsz, clsz, &g); diff --git a/tcg/tcg.c b/tcg/tcg.c index 879b29e81f..86eb4214b0 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -604,7 +604,7 @@ int tcg_global_mem_new_internal(TCGType type, TCGv_ptr base, return temp_idx(s, ts); } -static int tcg_temp_new_internal(TCGType type, int temp_local) +int tcg_temp_new_internal(TCGType type, bool temp_local) { TCGContext *s = &tcg_ctx; TCGTemp *ts; @@ -650,7 +650,7 @@ static int tcg_temp_new_internal(TCGType type, int temp_local) return idx; } -TCGv_i32 tcg_temp_new_internal_i32(int temp_local) +TCGv_i32 tcg_temp_new_internal_i32(bool temp_local) { int idx; @@ -658,7 +658,7 @@ TCGv_i32 tcg_temp_new_internal_i32(int temp_local) return MAKE_TCGV_I32(idx); } -TCGv_i64 tcg_temp_new_internal_i64(int temp_local) +TCGv_i64 tcg_temp_new_internal_i64(bool temp_local) { int idx; @@ -666,7 +666,7 @@ TCGv_i64 tcg_temp_new_internal_i64(int temp_local) return MAKE_TCGV_I64(idx); } -static void tcg_temp_free_internal(int idx) +void tcg_temp_free_internal(int idx) { TCGContext *s = &tcg_ctx; TCGTemp *ts; From patchwork Thu Aug 17 23:01:14 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 110355
From: Richard Henderson To: qemu-devel@nongnu.org Date: Thu, 17 Aug 2017 16:01:14 -0700 Message-Id: <20170817230114.3655-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.13.5 In-Reply-To: <20170817230114.3655-1-richard.henderson@linaro.org> References: <20170817230114.3655-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 8/8] tcg/i386: Add vector operations Cc: qemu-arm@nongnu.org, alex.bennee@linaro.org Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 46 +++++- tcg/tcg-opc.h | 12 +- tcg/i386/tcg-target.inc.c | 382 ++++++++++++++++++++++++++++++++++++++++++---- 3 files changed, 399 insertions(+), 41 deletions(-) -- 2.13.5 diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index e512648c95..147f82062b 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -30,11 +30,10 @@ #ifdef __x86_64__ # define TCG_TARGET_REG_BITS 64 -# define TCG_TARGET_NB_REGS 16 #else # define TCG_TARGET_REG_BITS 32 -# define TCG_TARGET_NB_REGS 8 #endif +# define TCG_TARGET_NB_REGS 24 typedef enum { TCG_REG_EAX = 0, @@ -56,6 +55,19 @@ typedef enum { TCG_REG_R13, TCG_REG_R14, TCG_REG_R15, + + /* SSE registers; 64-bit has access to 8 more, but we won't + need more than a few and using only the first 8 minimizes + the need for a rex prefix on the sse instructions. 
*/ + TCG_REG_XMM0, + TCG_REG_XMM1, + TCG_REG_XMM2, + TCG_REG_XMM3, + TCG_REG_XMM4, + TCG_REG_XMM5, + TCG_REG_XMM6, + TCG_REG_XMM7, + TCG_REG_RAX = TCG_REG_EAX, TCG_REG_RCX = TCG_REG_ECX, TCG_REG_RDX = TCG_REG_EDX, @@ -79,6 +91,17 @@ extern bool have_bmi1; extern bool have_bmi2; extern bool have_popcnt; +#ifdef __SSE2__ +#define have_sse2 true +#else +extern bool have_sse2; +#endif +#ifdef __AVX2__ +#define have_avx2 true +#else +extern bool have_avx2; +#endif + /* optional instructions */ #define TCG_TARGET_HAS_div2_i32 1 #define TCG_TARGET_HAS_rot_i32 1 @@ -147,6 +170,25 @@ extern bool have_popcnt; #define TCG_TARGET_HAS_mulsh_i64 0 #endif +#define TCG_TARGET_HAS_v64 have_sse2 +#define TCG_TARGET_HAS_v128 have_sse2 +#define TCG_TARGET_HAS_v256 have_avx2 + +#define TCG_TARGET_HAS_andc_v64 TCG_TARGET_HAS_v64 +#define TCG_TARGET_HAS_orc_v64 0 +#define TCG_TARGET_HAS_not_v64 0 +#define TCG_TARGET_HAS_neg_v64 0 + +#define TCG_TARGET_HAS_andc_v128 TCG_TARGET_HAS_v128 +#define TCG_TARGET_HAS_orc_v128 0 +#define TCG_TARGET_HAS_not_v128 0 +#define TCG_TARGET_HAS_neg_v128 0 + +#define TCG_TARGET_HAS_andc_v256 TCG_TARGET_HAS_v256 +#define TCG_TARGET_HAS_orc_v256 0 +#define TCG_TARGET_HAS_not_v256 0 +#define TCG_TARGET_HAS_neg_v256 0 + #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (have_bmi2 || \ ((ofs) == 0 && (len) == 8) || \ diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h index b1445a4c24..b84cd584fb 100644 --- a/tcg/tcg-opc.h +++ b/tcg/tcg-opc.h @@ -212,13 +212,13 @@ DEF(qemu_st_i64, 0, TLADDR_ARGS + DATA64_ARGS, 1, /* Host integer vector operations. */ /* These opcodes are required whenever the base vector size is enabled. 
*/ -DEF(mov_v64, 1, 1, 0, IMPL(TCG_TARGET_HAS_v64)) -DEF(mov_v128, 1, 1, 0, IMPL(TCG_TARGET_HAS_v128)) -DEF(mov_v256, 1, 1, 0, IMPL(TCG_TARGET_HAS_v256)) +DEF(mov_v64, 1, 1, 0, TCG_OPF_NOT_PRESENT) +DEF(mov_v128, 1, 1, 0, TCG_OPF_NOT_PRESENT) +DEF(mov_v256, 1, 1, 0, TCG_OPF_NOT_PRESENT) -DEF(movi_v64, 1, 0, 1, IMPL(TCG_TARGET_HAS_v64)) -DEF(movi_v128, 1, 0, 1, IMPL(TCG_TARGET_HAS_v128)) -DEF(movi_v256, 1, 0, 1, IMPL(TCG_TARGET_HAS_v256)) +DEF(movi_v64, 1, 0, 1, TCG_OPF_NOT_PRESENT) +DEF(movi_v128, 1, 0, 1, TCG_OPF_NOT_PRESENT) +DEF(movi_v256, 1, 0, 1, TCG_OPF_NOT_PRESENT) DEF(ld_v64, 1, 1, 1, IMPL(TCG_TARGET_HAS_v64)) DEF(ld_v128, 1, 1, 1, IMPL(TCG_TARGET_HAS_v128)) diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index aeefb72aa0..0e01b54aa0 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -31,7 +31,9 @@ static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = { "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15", #else "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi", + NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, #endif + "%xmm0", "%xmm1", "%xmm2", "%xmm3", "%xmm4", "%xmm5", "%xmm6", "%xmm7", }; #endif @@ -61,6 +63,14 @@ static const int tcg_target_reg_alloc_order[] = { TCG_REG_EDX, TCG_REG_EAX, #endif + TCG_REG_XMM0, + TCG_REG_XMM1, + TCG_REG_XMM2, + TCG_REG_XMM3, + TCG_REG_XMM4, + TCG_REG_XMM5, + TCG_REG_XMM6, + TCG_REG_XMM7, }; static const int tcg_target_call_iarg_regs[] = { @@ -94,7 +104,7 @@ static const int tcg_target_call_oarg_regs[] = { #define TCG_CT_CONST_I32 0x400 #define TCG_CT_CONST_WSZ 0x800 -/* Registers used with L constraint, which are the first argument +/* Registers used with L constraint, which are the first argument registers on x86_64, and two random call clobbered registers on i386. 
*/ #if TCG_TARGET_REG_BITS == 64 @@ -127,6 +137,16 @@ bool have_bmi1; bool have_bmi2; bool have_popcnt; +#ifndef have_sse2 +bool have_sse2; +#endif +#ifdef have_avx2 +#define have_avx1 have_avx2 +#else +static bool have_avx1; +bool have_avx2; +#endif + #ifdef CONFIG_CPUID_H static bool have_movbe; static bool have_lzcnt; @@ -215,6 +235,10 @@ static const char *target_parse_constraint(TCGArgConstraint *ct, /* With TZCNT/LZCNT, we can have operand-size as an input. */ ct->ct |= TCG_CT_CONST_WSZ; break; + case 'x': + ct->ct |= TCG_CT_REG; + tcg_regset_set32(ct->u.regs, 0, 0xff0000); + break; /* qemu_ld/st address constraint */ case 'L': @@ -292,6 +316,7 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #endif #define P_SIMDF3 0x20000 /* 0xf3 opcode prefix */ #define P_SIMDF2 0x40000 /* 0xf2 opcode prefix */ +#define P_VEXL 0x80000 /* Set VEX.L = 1 */ #define OPC_ARITH_EvIz (0x81) #define OPC_ARITH_EvIb (0x83) @@ -324,13 +349,31 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_MOVL_Iv (0xb8) #define OPC_MOVBE_GyMy (0xf0 | P_EXT38) #define OPC_MOVBE_MyGy (0xf1 | P_EXT38) +#define OPC_MOVDQA_GyMy (0x6f | P_EXT | P_DATA16) +#define OPC_MOVDQA_MyGy (0x7f | P_EXT | P_DATA16) +#define OPC_MOVDQU_GyMy (0x6f | P_EXT | P_SIMDF3) +#define OPC_MOVDQU_MyGy (0x7f | P_EXT | P_SIMDF3) +#define OPC_MOVQ_GyMy (0x7e | P_EXT | P_SIMDF3) +#define OPC_MOVQ_MyGy (0xd6 | P_EXT | P_DATA16) #define OPC_MOVSBL (0xbe | P_EXT) #define OPC_MOVSWL (0xbf | P_EXT) #define OPC_MOVSLQ (0x63 | P_REXW) #define OPC_MOVZBL (0xb6 | P_EXT) #define OPC_MOVZWL (0xb7 | P_EXT) +#define OPC_PADDB (0xfc | P_EXT | P_DATA16) +#define OPC_PADDW (0xfd | P_EXT | P_DATA16) +#define OPC_PADDD (0xfe | P_EXT | P_DATA16) +#define OPC_PADDQ (0xd4 | P_EXT | P_DATA16) +#define OPC_PAND (0xdb | P_EXT | P_DATA16) +#define OPC_PANDN (0xdf | P_EXT | P_DATA16) #define OPC_PDEP (0xf5 | P_EXT38 | P_SIMDF2) #define OPC_PEXT (0xf5 | P_EXT38 | P_SIMDF3) +#define 
OPC_POR (0xeb | P_EXT | P_DATA16) +#define OPC_PSUBB (0xf8 | P_EXT | P_DATA16) +#define OPC_PSUBW (0xf9 | P_EXT | P_DATA16) +#define OPC_PSUBD (0xfa | P_EXT | P_DATA16) +#define OPC_PSUBQ (0xfb | P_EXT | P_DATA16) +#define OPC_PXOR (0xef | P_EXT | P_DATA16) #define OPC_POP_r32 (0x58) #define OPC_POPCNT (0xb8 | P_EXT | P_SIMDF3) #define OPC_PUSH_r32 (0x50) @@ -500,7 +543,8 @@ static void tcg_out_modrm(TCGContext *s, int opc, int r, int rm) tcg_out8(s, 0xc0 | (LOWREGMASK(r) << 3) | LOWREGMASK(rm)); } -static void tcg_out_vex_pfx_opc(TCGContext *s, int opc, int r, int v, int rm) +static void tcg_out_vex_pfx_opc(TCGContext *s, int opc, int r, int v, + int rm, int index) { int tmp; @@ -515,14 +559,16 @@ static void tcg_out_vex_pfx_opc(TCGContext *s, int opc, int r, int v, int rm) } else if (opc & P_EXT) { tmp = 1; } else { - tcg_abort(); + g_assert_not_reached(); } - tmp |= 0x40; /* VEX.X */ tmp |= (r & 8 ? 0 : 0x80); /* VEX.R */ + tmp |= (index & 8 ? 0 : 0x40); /* VEX.X */ tmp |= (rm & 8 ? 0 : 0x20); /* VEX.B */ tcg_out8(s, tmp); tmp = (opc & P_REXW ? 0x80 : 0); /* VEX.W */ + tmp |= (opc & P_VEXL ? 
0x04 : 0); /* VEX.L */ + /* VEX.pp */ if (opc & P_DATA16) { tmp |= 1; /* 0x66 */ @@ -538,7 +584,7 @@ static void tcg_out_vex_pfx_opc(TCGContext *s, int opc, int r, int v, int rm) static void tcg_out_vex_modrm(TCGContext *s, int opc, int r, int v, int rm) { - tcg_out_vex_pfx_opc(s, opc, r, v, rm); + tcg_out_vex_pfx_opc(s, opc, r, v, rm, 0); tcg_out8(s, 0xc0 | (LOWREGMASK(r) << 3) | LOWREGMASK(rm)); } @@ -565,7 +611,7 @@ static void tcg_out_opc_pool_imm(TCGContext *s, int opc, int r, static void tcg_out_vex_pool_imm(TCGContext *s, int opc, int r, int v, tcg_target_ulong data) { - tcg_out_vex_pfx_opc(s, opc, r, v, 0); + tcg_out_vex_pfx_opc(s, opc, r, v, 0, 0); tcg_out_sfx_pool_imm(s, r, data); } @@ -574,8 +620,8 @@ static void tcg_out_vex_pool_imm(TCGContext *s, int opc, int r, int v, mode for absolute addresses, ~RM is the size of the immediate operand that will follow the instruction. */ -static void tcg_out_modrm_sib_offset(TCGContext *s, int opc, int r, int rm, - int index, int shift, intptr_t offset) +static void tcg_out_sib_offset(TCGContext *s, int r, int rm, int index, + int shift, intptr_t offset) { int mod, len; @@ -586,7 +632,6 @@ static void tcg_out_modrm_sib_offset(TCGContext *s, int opc, int r, int rm, intptr_t pc = (intptr_t)s->code_ptr + 5 + ~rm; intptr_t disp = offset - pc; if (disp == (int32_t)disp) { - tcg_out_opc(s, opc, r, 0, 0); tcg_out8(s, (LOWREGMASK(r) << 3) | 5); tcg_out32(s, disp); return; @@ -596,7 +641,6 @@ static void tcg_out_modrm_sib_offset(TCGContext *s, int opc, int r, int rm, use of the MODRM+SIB encoding and is therefore larger than rip-relative addressing. */ if (offset == (int32_t)offset) { - tcg_out_opc(s, opc, r, 0, 0); tcg_out8(s, (LOWREGMASK(r) << 3) | 4); tcg_out8(s, (4 << 3) | 5); tcg_out32(s, offset); @@ -604,10 +648,9 @@ static void tcg_out_modrm_sib_offset(TCGContext *s, int opc, int r, int rm, } /* ??? The memory isn't directly addressable. */ - tcg_abort(); + g_assert_not_reached(); } else { /* Absolute address. 
*/ - tcg_out_opc(s, opc, r, 0, 0); tcg_out8(s, (r << 3) | 5); tcg_out32(s, offset); return; @@ -630,7 +673,6 @@ static void tcg_out_modrm_sib_offset(TCGContext *s, int opc, int r, int rm, that would be used for %esp is the escape to the two byte form. */ if (index < 0 && LOWREGMASK(rm) != TCG_REG_ESP) { /* Single byte MODRM format. */ - tcg_out_opc(s, opc, r, rm, 0); tcg_out8(s, mod | (LOWREGMASK(r) << 3) | LOWREGMASK(rm)); } else { /* Two byte MODRM+SIB format. */ @@ -644,7 +686,6 @@ static void tcg_out_modrm_sib_offset(TCGContext *s, int opc, int r, int rm, tcg_debug_assert(index != TCG_REG_ESP); } - tcg_out_opc(s, opc, r, rm, index); tcg_out8(s, mod | (LOWREGMASK(r) << 3) | 4); tcg_out8(s, (shift << 6) | (LOWREGMASK(index) << 3) | LOWREGMASK(rm)); } @@ -656,6 +697,21 @@ static void tcg_out_modrm_sib_offset(TCGContext *s, int opc, int r, int rm, } } +static void tcg_out_modrm_sib_offset(TCGContext *s, int opc, int r, int rm, + int index, int shift, intptr_t offset) +{ + tcg_out_opc(s, opc, r, rm < 0 ? 0 : rm, index < 0 ? 0 : index); + tcg_out_sib_offset(s, r, rm, index, shift, offset); +} + +static void tcg_out_vex_modrm_sib_offset(TCGContext *s, int opc, int r, int v, + int rm, int index, int shift, + intptr_t offset) +{ + tcg_out_vex_pfx_opc(s, opc, r, v, rm < 0 ? 0 : rm, index < 0 ? 0 : index); + tcg_out_sib_offset(s, r, rm, index, shift, offset); +} + /* A simplification of the above with no index or shift. 
*/ static inline void tcg_out_modrm_offset(TCGContext *s, int opc, int r, int rm, intptr_t offset) @@ -663,6 +719,31 @@ static inline void tcg_out_modrm_offset(TCGContext *s, int opc, int r, tcg_out_modrm_sib_offset(s, opc, r, rm, -1, 0, offset); } +static inline void tcg_out_vex_modrm_offset(TCGContext *s, int opc, int r, + int v, int rm, intptr_t offset) +{ + tcg_out_vex_modrm_sib_offset(s, opc, r, v, rm, -1, 0, offset); +} + +static void tcg_out_maybe_vex_modrm(TCGContext *s, int opc, int r, int rm) +{ + if (have_avx1) { + tcg_out_vex_modrm(s, opc, r, 0, rm); + } else { + tcg_out_modrm(s, opc, r, rm); + } +} + +static void tcg_out_maybe_vex_modrm_offset(TCGContext *s, int opc, int r, + int rm, intptr_t offset) +{ + if (have_avx1) { + tcg_out_vex_modrm_offset(s, opc, r, 0, rm, offset); + } else { + tcg_out_modrm_offset(s, opc, r, rm, offset); + } +} + /* Generate dest op= src. Uses the same ARITH_* codes as tgen_arithi. */ static inline void tgen_arithr(TCGContext *s, int subop, int dest, int src) { @@ -673,12 +754,32 @@ static inline void tgen_arithr(TCGContext *s, int subop, int dest, int src) tcg_out_modrm(s, OPC_ARITH_GvEv + (subop << 3) + ext, dest, src); } -static inline void tcg_out_mov(TCGContext *s, TCGType type, - TCGReg ret, TCGReg arg) +static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { if (arg != ret) { - int opc = OPC_MOVL_GvEv + (type == TCG_TYPE_I64 ? 
P_REXW : 0); - tcg_out_modrm(s, opc, ret, arg); + int opc = 0; + + switch (type) { + case TCG_TYPE_I64: + opc = P_REXW; + /* fallthru */ + case TCG_TYPE_I32: + opc |= OPC_MOVL_GvEv; + tcg_out_modrm(s, opc, ret, arg); + break; + + case TCG_TYPE_V256: + opc = P_VEXL; + /* fallthru */ + case TCG_TYPE_V128: + case TCG_TYPE_V64: + opc |= OPC_MOVDQA_GyMy; + tcg_out_maybe_vex_modrm(s, opc, ret, arg); + break; + + default: + g_assert_not_reached(); + } } } @@ -687,6 +788,27 @@ static void tcg_out_movi(TCGContext *s, TCGType type, { tcg_target_long diff; + switch (type) { + case TCG_TYPE_I32: + case TCG_TYPE_I64: + break; + + case TCG_TYPE_V64: + case TCG_TYPE_V128: + case TCG_TYPE_V256: + /* ??? Revisit this as the implementation progresses. */ + tcg_debug_assert(arg == 0); + if (have_avx1) { + tcg_out_vex_modrm(s, OPC_PXOR, ret, ret, ret); + } else { + tcg_out_modrm(s, OPC_PXOR, ret, ret); + } + return; + + default: + g_assert_not_reached(); + } + if (arg == 0) { tgen_arithr(s, ARITH_XOR, ret, ret); return; @@ -750,18 +872,54 @@ static inline void tcg_out_pop(TCGContext *s, int reg) tcg_out_opc(s, OPC_POP_r32 + LOWREGMASK(reg), 0, reg, 0); } -static inline void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, - TCGReg arg1, intptr_t arg2) +static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, + TCGReg arg1, intptr_t arg2) { - int opc = OPC_MOVL_GvEv + (type == TCG_TYPE_I64 ? 
P_REXW : 0); - tcg_out_modrm_offset(s, opc, ret, arg1, arg2); + switch (type) { + case TCG_TYPE_I64: + tcg_out_modrm_offset(s, OPC_MOVL_GvEv | P_REXW, ret, arg1, arg2); + break; + case TCG_TYPE_I32: + tcg_out_modrm_offset(s, OPC_MOVL_GvEv, ret, arg1, arg2); + break; + case TCG_TYPE_V64: + tcg_out_maybe_vex_modrm_offset(s, OPC_MOVQ_GyMy, ret, arg1, arg2); + break; + case TCG_TYPE_V128: + tcg_out_maybe_vex_modrm_offset(s, OPC_MOVDQU_GyMy, ret, arg1, arg2); + break; + case TCG_TYPE_V256: + tcg_out_vex_modrm_offset(s, OPC_MOVDQU_GyMy | P_VEXL, + ret, 0, arg1, arg2); + break; + default: + g_assert_not_reached(); + } } -static inline void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, - TCGReg arg1, intptr_t arg2) +static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg, + TCGReg arg1, intptr_t arg2) { - int opc = OPC_MOVL_EvGv + (type == TCG_TYPE_I64 ? P_REXW : 0); - tcg_out_modrm_offset(s, opc, arg, arg1, arg2); + switch (type) { + case TCG_TYPE_I64: + tcg_out_modrm_offset(s, OPC_MOVL_EvGv | P_REXW, arg, arg1, arg2); + break; + case TCG_TYPE_I32: + tcg_out_modrm_offset(s, OPC_MOVL_EvGv, arg, arg1, arg2); + break; + case TCG_TYPE_V64: + tcg_out_maybe_vex_modrm_offset(s, OPC_MOVQ_MyGy, arg, arg1, arg2); + break; + case TCG_TYPE_V128: + tcg_out_maybe_vex_modrm_offset(s, OPC_MOVDQU_MyGy, arg, arg1, arg2); + break; + case TCG_TYPE_V256: + tcg_out_vex_modrm_offset(s, OPC_MOVDQU_MyGy | P_VEXL, + arg, 0, arg1, arg2); + break; + default: + g_assert_not_reached(); + } } static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val, @@ -773,6 +931,8 @@ static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val, return false; } rexw = P_REXW; + } else if (type != TCG_TYPE_I32) { + return false; } tcg_out_modrm_offset(s, OPC_MOVL_EvIz | rexw, 0, base, ofs); tcg_out32(s, val); @@ -1914,6 +2074,15 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, case glue(glue(INDEX_op_, x), _i32) #endif +#define OP_128_256(x) \ + case glue(glue(INDEX_op_, x), _v256): 
\ + rexw = P_VEXL; /* FALLTHRU */ \ + case glue(glue(INDEX_op_, x), _v128) + +#define OP_64_128_256(x) \ + OP_128_256(x): \ + case glue(glue(INDEX_op_, x), _v64) + /* Hoist the loads of the most common arguments. */ a0 = args[0]; a1 = args[1]; @@ -2379,19 +2548,94 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, } break; + OP_64_128_256(add8): + c = OPC_PADDB; + goto gen_simd; + OP_64_128_256(add16): + c = OPC_PADDW; + goto gen_simd; + OP_64_128_256(add32): + c = OPC_PADDD; + goto gen_simd; + OP_128_256(add64): + c = OPC_PADDQ; + goto gen_simd; + OP_64_128_256(sub8): + c = OPC_PSUBB; + goto gen_simd; + OP_64_128_256(sub16): + c = OPC_PSUBW; + goto gen_simd; + OP_64_128_256(sub32): + c = OPC_PSUBD; + goto gen_simd; + OP_128_256(sub64): + c = OPC_PSUBQ; + goto gen_simd; + OP_64_128_256(and): + c = OPC_PAND; + goto gen_simd; + OP_64_128_256(andc): + c = OPC_PANDN; + goto gen_simd; + OP_64_128_256(or): + c = OPC_POR; + goto gen_simd; + OP_64_128_256(xor): + c = OPC_PXOR; + gen_simd: + if (have_avx1) { + tcg_out_vex_modrm(s, c, a0, a1, a2); + } else { + tcg_out_modrm(s, c, a0, a2); + } + break; + + case INDEX_op_ld_v64: + c = TCG_TYPE_V64; + goto gen_simd_ld; + case INDEX_op_ld_v128: + c = TCG_TYPE_V128; + goto gen_simd_ld; + case INDEX_op_ld_v256: + c = TCG_TYPE_V256; + gen_simd_ld: + tcg_out_ld(s, c, a0, a1, a2); + break; + + case INDEX_op_st_v64: + c = TCG_TYPE_V64; + goto gen_simd_st; + case INDEX_op_st_v128: + c = TCG_TYPE_V128; + goto gen_simd_st; + case INDEX_op_st_v256: + c = TCG_TYPE_V256; + gen_simd_st: + tcg_out_st(s, c, a0, a1, a2); + break; + case INDEX_op_mb: tcg_out_mb(s, a0); break; case INDEX_op_mov_i32: /* Always emitted via tcg_out_mov. */ case INDEX_op_mov_i64: + case INDEX_op_mov_v64: + case INDEX_op_mov_v128: + case INDEX_op_mov_v256: case INDEX_op_movi_i32: /* Always emitted via tcg_out_movi. 
*/ case INDEX_op_movi_i64: + case INDEX_op_movi_v64: + case INDEX_op_movi_v128: + case INDEX_op_movi_v256: case INDEX_op_call: /* Always emitted via tcg_out_call. */ default: tcg_abort(); } #undef OP_32_64 +#undef OP_128_256 +#undef OP_64_128_256 } static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) @@ -2417,6 +2661,9 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) = { .args_ct_str = { "r", "r", "L", "L" } }; static const TCGTargetOpDef L_L_L_L = { .args_ct_str = { "L", "L", "L", "L" } }; + static const TCGTargetOpDef x_0_x = { .args_ct_str = { "x", "0", "x" } }; + static const TCGTargetOpDef x_x_x = { .args_ct_str = { "x", "x", "x" } }; + static const TCGTargetOpDef x_r = { .args_ct_str = { "x", "r" } }; switch (op) { case INDEX_op_goto_ptr: @@ -2620,6 +2867,52 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) return &s2; } + case INDEX_op_ld_v64: + case INDEX_op_ld_v128: + case INDEX_op_ld_v256: + case INDEX_op_st_v64: + case INDEX_op_st_v128: + case INDEX_op_st_v256: + return &x_r; + + case INDEX_op_add8_v64: + case INDEX_op_add8_v128: + case INDEX_op_add16_v64: + case INDEX_op_add16_v128: + case INDEX_op_add32_v64: + case INDEX_op_add32_v128: + case INDEX_op_add64_v128: + case INDEX_op_sub8_v64: + case INDEX_op_sub8_v128: + case INDEX_op_sub16_v64: + case INDEX_op_sub16_v128: + case INDEX_op_sub32_v64: + case INDEX_op_sub32_v128: + case INDEX_op_sub64_v128: + case INDEX_op_and_v64: + case INDEX_op_and_v128: + case INDEX_op_andc_v64: + case INDEX_op_andc_v128: + case INDEX_op_or_v64: + case INDEX_op_or_v128: + case INDEX_op_xor_v64: + case INDEX_op_xor_v128: + return have_avx1 ? 
&x_x_x : &x_0_x; + + case INDEX_op_add8_v256: + case INDEX_op_add16_v256: + case INDEX_op_add32_v256: + case INDEX_op_add64_v256: + case INDEX_op_sub8_v256: + case INDEX_op_sub16_v256: + case INDEX_op_sub32_v256: + case INDEX_op_sub64_v256: + case INDEX_op_and_v256: + case INDEX_op_andc_v256: + case INDEX_op_or_v256: + case INDEX_op_xor_v256: + return &x_x_x; + default: break; } @@ -2725,9 +3018,16 @@ static void tcg_out_nop_fill(tcg_insn_unit *p, int count) static void tcg_target_init(TCGContext *s) { #ifdef CONFIG_CPUID_H - unsigned a, b, c, d; + unsigned a, b, c, d, b7 = 0; int max = __get_cpuid_max(0, 0); + if (max >= 7) { + /* BMI1 is available on AMD Piledriver and Intel Haswell CPUs. */ + __cpuid_count(7, 0, a, b7, c, d); + have_bmi1 = (b7 & bit_BMI) != 0; + have_bmi2 = (b7 & bit_BMI2) != 0; + } + if (max >= 1) { __cpuid(1, a, b, c, d); #ifndef have_cmov @@ -2736,17 +3036,26 @@ static void tcg_target_init(TCGContext *s) available, we'll use a small forward branch. */ have_cmov = (d & bit_CMOV) != 0; #endif +#ifndef have_sse2 + have_sse2 = (d & bit_SSE2) != 0; +#endif /* MOVBE is only available on Intel Atom and Haswell CPUs, so we need to probe for it. */ have_movbe = (c & bit_MOVBE) != 0; have_popcnt = (c & bit_POPCNT) != 0; - } - if (max >= 7) { - /* BMI1 is available on AMD Piledriver and Intel Haswell CPUs. */ - __cpuid_count(7, 0, a, b, c, d); - have_bmi1 = (b & bit_BMI) != 0; - have_bmi2 = (b & bit_BMI2) != 0; +#ifndef have_avx2 + /* There are a number of things we must check before we can be + sure of not hitting invalid opcode. 
*/ + if (c & bit_OSXSAVE) { + unsigned xcrl, xcrh; + asm ("xgetbv" : "=a" (xcrl), "=d" (xcrh) : "c" (0)); + if ((xcrl & 6) == 6) { + have_avx1 = (c & bit_AVX) != 0; + have_avx2 = (b7 & bit_AVX2) != 0; + } + } +#endif } max = __get_cpuid_max(0x8000000, 0); @@ -2763,6 +3072,13 @@ static void tcg_target_init(TCGContext *s) } else { tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xff); } + if (have_sse2) { + tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_V64], 0, 0xff0000); + tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_V128], 0, 0xff0000); + } + if (have_avx2) { + tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_V256], 0, 0xff0000); + } tcg_regset_clear(tcg_target_call_clobber_regs); tcg_regset_set_reg(tcg_target_call_clobber_regs, TCG_REG_EAX);