From patchwork Thu Oct 11 20:52:03 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 148673
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 11 Oct 2018 13:52:03 -0700
Message-Id: <20181011205206.3552-18-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20181011205206.3552-1-richard.henderson@linaro.org>
References: <20181011205206.3552-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 17/20] target/arm: Use gvec for NEON VLD all lanes
Cc: peter.maydell@linaro.org

Signed-off-by: Richard Henderson
---
 target/arm/translate.c | 81 ++++++++++++++----------------------------
 1 file changed, 26 insertions(+), 55 deletions(-)

-- 
2.17.1

diff --git a/target/arm/translate.c b/target/arm/translate.c
index a9bd93bba1..1e79a1eec0 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -2993,19 +2993,6 @@ static void gen_vfp_msr(TCGv_i32 tmp)
     tcg_temp_free_i32(tmp);
 }
 
-static void gen_neon_dup_u8(TCGv_i32 var, int shift)
-{
-    TCGv_i32 tmp = tcg_temp_new_i32();
-    if (shift)
-        tcg_gen_shri_i32(var, var, shift);
-    tcg_gen_ext8u_i32(var, var);
-    tcg_gen_shli_i32(tmp, var, 8);
-    tcg_gen_or_i32(var, var, tmp);
-    tcg_gen_shli_i32(tmp, var, 16);
-    tcg_gen_or_i32(var, var, tmp);
-    tcg_temp_free_i32(tmp);
-}
-
 static void gen_neon_dup_low16(TCGv_i32 var)
 {
     TCGv_i32 tmp = tcg_temp_new_i32();
@@ -3024,28 +3011,6 @@ static void gen_neon_dup_high16(TCGv_i32 var)
     tcg_temp_free_i32(tmp);
 }
 
-static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size)
-{
-    /* Load a single Neon element and replicate into a 32 bit TCG reg */
-    TCGv_i32 tmp = tcg_temp_new_i32();
-    switch (size) {
-    case 0:
-        gen_aa32_ld8u(s, tmp, addr, get_mem_index(s));
-        gen_neon_dup_u8(tmp, 0);
-        break;
-    case 1:
-        gen_aa32_ld16u(s, tmp, addr, get_mem_index(s));
-        gen_neon_dup_low16(tmp);
-        break;
-    case 2:
-        gen_aa32_ld32u(s, tmp, addr, get_mem_index(s));
-        break;
-    default: /* Avoid compiler warnings.  */
-        abort();
-    }
-    return tmp;
-}
-
 static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
                        uint32_t dp)
 {
@@ -4949,6 +4914,7 @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
     int load;
     int shift;
     int n;
+    int vec_size;
     TCGv_i32 addr;
     TCGv_i32 tmp;
     TCGv_i32 tmp2;
@@ -5118,28 +5084,33 @@ static int disas_neon_ls_insn(DisasContext *s, uint32_t insn)
             }
             addr = tcg_temp_new_i32();
             load_reg_var(s, addr, rn);
-            if (nregs == 1) {
-                /* VLD1 to all lanes: bit 5 indicates how many Dregs to write */
-                tmp = gen_load_and_replicate(s, addr, size);
-                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
-                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
-                if (insn & (1 << 5)) {
-                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 0));
-                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 1));
-                }
-                tcg_temp_free_i32(tmp);
-            } else {
-                /* VLD2/3/4 to all lanes: bit 5 indicates register stride */
-                stride = (insn & (1 << 5)) ? 2 : 1;
-                for (reg = 0; reg < nregs; reg++) {
-                    tmp = gen_load_and_replicate(s, addr, size);
-                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
-                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
-                    tcg_temp_free_i32(tmp);
-                    tcg_gen_addi_i32(addr, addr, 1 << size);
-                    rd += stride;
+
+            /* VLD1 to all lanes: bit 5 indicates how many Dregs to write.
+             * VLD2/3/4 to all lanes: bit 5 indicates register stride.
+             */
+            stride = insn & (1 << 5) ? 2 : 1;
+            vec_size = nregs == 1 ? stride * 8 : 8;
+
+            tmp = tcg_temp_new_i32();
+            for (reg = 0; reg < nregs; reg++) {
+                gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s),
+                                s->be_data | size);
+                if ((rd & 1) && vec_size == 16) {
+                    /* We cannot write 16 bytes at once because the
+                     * destination is unaligned.
+                     */
+                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
+                                         8, 8, tmp);
+                    tcg_gen_gvec_mov(0, neon_reg_offset(rd + 1, 0),
+                                     neon_reg_offset(rd, 0), 8, 8);
+                } else {
+                    tcg_gen_gvec_dup_i32(size, neon_reg_offset(rd, 0),
+                                         vec_size, vec_size, tmp);
                 }
+                tcg_gen_addi_i32(addr, addr, 1 << size);
+                rd += stride;
             }
+            tcg_temp_free_i32(tmp);
             tcg_temp_free_i32(addr);
             stride = (1 << size) * nregs;
         } else {
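
Note on the alignment comment in the last hunk: the following is a plain-C
model of why the odd-rd case is split into an 8-byte dup plus an 8-byte move.
It is only a sketch and is not QEMU code. NeonRegFile, model_dup and
model_vld1_all_lanes are hypothetical names, the contiguous 8-bytes-per-Dreg
layout is a simplifying assumption, and the rule that a 16-byte vector
operation needs a 16-byte-aligned offset is inferred from the patch comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: 32 D registers of 8 bytes each, stored back to back,
 * so D<n> starts at byte offset 8 * n.  This layout is an assumption. */
enum { DREG_BYTES = 8, NUM_DREGS = 32 };

typedef struct {
    uint8_t bytes[NUM_DREGS * DREG_BYTES];
} NeonRegFile;

/* Replicate the low (1 << size_log2) bytes of value across nbytes bytes at
 * byte offset off.  Assumed constraint: the offset must be aligned to the
 * operation size (8 or 16).  Returns false if a constraint is violated. */
static bool model_dup(NeonRegFile *rf, unsigned off, unsigned nbytes,
                      unsigned size_log2, uint32_t value)
{
    unsigned esize = 1u << size_log2;

    if (size_log2 > 2 || (nbytes != 8 && nbytes != 16)) {
        return false;
    }
    if (off % nbytes != 0 || off + nbytes > sizeof(rf->bytes)) {
        return false;
    }
    for (unsigned i = 0; i < nbytes; i += esize) {
        for (unsigned b = 0; b < esize; b++) {
            /* Elements are stored little-endian in this model. */
            rf->bytes[off + i + b] = (uint8_t)(value >> (8 * b));
        }
    }
    return true;
}

/* Model of the per-register step in the new loop: a 16-byte write to an
 * odd D register is split into an 8-byte dup of D<rd> followed by an
 * 8-byte copy of D<rd> into D<rd + 1>. */
static bool model_vld1_all_lanes(NeonRegFile *rf, unsigned rd,
                                 unsigned vec_size, unsigned size_log2,
                                 uint32_t value)
{
    unsigned off = rd * DREG_BYTES;

    if ((rd & 1) && vec_size == 16) {
        if (off + 16 > sizeof(rf->bytes)) {
            return false;
        }
        if (!model_dup(rf, off, 8, size_log2, value)) {
            return false;
        }
        memcpy(&rf->bytes[off + 8], &rf->bytes[off], 8);
        return true;
    }
    return model_dup(rf, off, vec_size, size_log2, value);
}
```

In this model, a direct 16-byte dup at the offset of D1 is rejected, while
model_vld1_all_lanes with rd = 1 and vec_size = 16 fills D1 and D2 with the
replicated element and leaves D0 and D3 untouched.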