From patchwork Thu Jun 21 01:53:55 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 139427
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Wed, 20 Jun 2018 15:53:55 -1000
Message-Id: <20180621015359.12018-32-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org>
References: <20180621015359.12018-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v5 31/35] target/arm: Implement SVE fp complex
 multiply add (indexed)
Cc: peter.maydell@linaro.org

Enhance the existing helpers to support SVE, which takes the
index from each 128-bit segment.  The change has no effect
for AdvSIMD, since there is only one such segment.

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 23 ++++++++++++++++++
 target/arm/vec_helper.c    | 50 +++++++++++++++++++++++---------------
 target/arm/sve.decode      |  6 +++++
 3 files changed, 59 insertions(+), 20 deletions(-)

--
2.17.1

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 6487fe760a..209a69cd76 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4005,6 +4005,29 @@ static bool trans_FCMLA_zpzzz(DisasContext *s,
     return true;
 }
 
+static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3_ptr * const fns[2] = {
+        gen_helper_gvec_fcmlah_idx,
+        gen_helper_gvec_fcmlas_idx,
+    };
+
+    tcg_debug_assert(a->esz == 1 || a->esz == 2);
+    tcg_debug_assert(a->rd == a->ra);
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+        tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           status, vsz, vsz,
+                           a->index * 4 + a->rot,
+                           fns[a->esz - 1]);
+        tcg_temp_free_ptr(status);
+    }
+    return true;
+}
+
 /*
  *** SVE Floating Point Unary Operations Prediated Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 8f2dc4b989..db5aeb9f24 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -319,22 +319,27 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm,
     uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
     intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
     uint32_t neg_real = flip ^ neg_imag;
-    uintptr_t i;
-    float16 e1 = m[H2(2 * index + flip)];
-    float16 e3 = m[H2(2 * index + 1 - flip)];
+    intptr_t elements = opr_sz / sizeof(float16);
+    intptr_t eltspersegment = 16 / sizeof(float16);
+    intptr_t i, j;
 
     /* Shift boolean to the sign bit so we can xor to negate.  */
     neg_real <<= 15;
     neg_imag <<= 15;
-    e1 ^= neg_real;
-    e3 ^= neg_imag;
 
-    for (i = 0; i < opr_sz / 2; i += 2) {
-        float16 e2 = n[H2(i + flip)];
-        float16 e4 = e2;
+    for (i = 0; i < elements; i += eltspersegment) {
+        float16 mr = m[H2(i + 2 * index + 0)];
+        float16 mi = m[H2(i + 2 * index + 1)];
+        float16 e1 = neg_real ^ (flip ? mi : mr);
+        float16 e3 = neg_imag ^ (flip ? mr : mi);
 
-        d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst);
-        d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst);
+        for (j = i; j < i + eltspersegment; j += 2) {
+            float16 e2 = n[H2(j + flip)];
+            float16 e4 = e2;
+
+            d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst);
+            d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst);
+        }
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
@@ -380,22 +385,27 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm,
     uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
     intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
     uint32_t neg_real = flip ^ neg_imag;
-    uintptr_t i;
-    float32 e1 = m[H4(2 * index + flip)];
-    float32 e3 = m[H4(2 * index + 1 - flip)];
+    intptr_t elements = opr_sz / sizeof(float32);
+    intptr_t eltspersegment = 16 / sizeof(float32);
+    intptr_t i, j;
 
     /* Shift boolean to the sign bit so we can xor to negate.  */
     neg_real <<= 31;
     neg_imag <<= 31;
-    e1 ^= neg_real;
-    e3 ^= neg_imag;
 
-    for (i = 0; i < opr_sz / 4; i += 2) {
-        float32 e2 = n[H4(i + flip)];
-        float32 e4 = e2;
+    for (i = 0; i < elements; i += eltspersegment) {
+        float32 mr = m[H4(i + 2 * index + 0)];
+        float32 mi = m[H4(i + 2 * index + 1)];
+        float32 e1 = neg_real ^ (flip ? mi : mr);
+        float32 e3 = neg_imag ^ (flip ? mr : mi);
 
-        d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst);
-        d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst);
+        for (j = i; j < i + eltspersegment; j += 2) {
+            float32 e2 = n[H4(j + flip)];
+            float32 e4 = e2;
+
+            d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst);
+            d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst);
+        }
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index da89697700..b578d104c4 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -729,6 +729,12 @@ FCADD           01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \
 FCMLA_zpzzz     01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \
                 ra=%reg_movprfx
 
+# SVE floating-point complex multiply-add (indexed)
+FCMLA_zzxz      01100100 10 1 index:2 rm:3 0001 rot:2 rn:5 rd:5 \
+                ra=%reg_movprfx esz=1
+FCMLA_zzxz      01100100 11 1 index:1 rm:4 0001 rot:2 rn:5 rd:5 \
+                ra=%reg_movprfx esz=2
+
 ### SVE FP Multiply-Add Indexed Group
 
 # SVE floating-point multiply-add (indexed)
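
For reviewers who want to check the per-segment indexing by hand, here is a rough standalone model of the single-precision loop. It is only a sketch: the name fcmla_idx_f32_sketch is hypothetical, it uses plain host float arithmetic instead of float32_muladd with an fpst, it drops the H4() host-endian adjustment and clear_tail(), and it assumes that flip is bit 0 of rot and neg_imag is bit 1, consistent with the a->index * 4 + a->rot encoding passed by the translator.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of gvec_fcmlas_idx, not the QEMU helper itself.
 * Assumptions: plain float math instead of softfloat, no H4() swizzle,
 * flip = rot bit 0, neg_imag = rot bit 1.
 */
static void fcmla_idx_f32_sketch(float *d, const float *n, const float *m,
                                 size_t elements, unsigned index, unsigned rot)
{
    /* One 128-bit segment holds 4 floats, i.e. 2 complex pairs. */
    const size_t eltspersegment = 16 / sizeof(float);
    unsigned flip = rot & 1;
    unsigned neg_imag = (rot >> 1) & 1;
    unsigned neg_real = flip ^ neg_imag;

    for (size_t i = 0; i < elements; i += eltspersegment) {
        /* The indexed complex pair is re-read from each segment of m. */
        float mr = m[i + 2 * index + 0];
        float mi = m[i + 2 * index + 1];
        float e1 = flip ? mi : mr;
        float e3 = flip ? mr : mi;

        if (neg_real) {
            e1 = -e1;
        }
        if (neg_imag) {
            e3 = -e3;
        }

        /* Apply that pair to every complex element of the same segment. */
        for (size_t j = i; j < i + eltspersegment; j += 2) {
            float e2 = n[j + flip];

            d[j] += e2 * e1;
            d[j + 1] += e2 * e3;
        }
    }
}
```

With elements == 4 (a single segment, the AdvSIMD case) the outer loop runs once and the result matches the old code that read m[2 * index] up front. With two or more segments, each segment picks up its own m[i + 2 * index] pair, which is the behaviour the SVE instruction needs.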