From patchwork Tue Mar 8 01:53:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 549221 Delivered-To: patch@linaro.org Received: by 2002:a05:7000:6713:0:0:0:0 with SMTP id u19csp2214636mag; Mon, 7 Mar 2022 17:58:19 -0800 (PST) X-Google-Smtp-Source: ABdhPJw5E8U7hmxEnpezx0Hdjd80n5Ncag8F+l/MrFMCI46N6e+A2DaSVPXFU7M9UNR/YmVhtnyO X-Received: by 2002:a5b:712:0:b0:629:5d03:4b1b with SMTP id g18-20020a5b0712000000b006295d034b1bmr3963293ybq.380.1646704699813; Mon, 07 Mar 2022 17:58:19 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1646704699; cv=none; d=google.com; s=arc-20160816; b=AFq+ZcAgPfzwXhhTL62kX4XKwrFVh/MIx9nzzED+STG8y+iEQdjaX6UeWde8Dno1ba hbayQOFVIlhwUav4w1ZXMlA895on3n0psbj1OZpG6VIFkf1ELIKnNekwvTuWn2Q+F8rc ZPullWMw1OzPQXj+pUsIg/ekL+Lag8BaCvzA0C+4d8BjMKtbKdzcgngY8k+p9+govwU6 HKckcnobdSdQAVCM26twFx5nN6gXZhc0xUxrTbzf5wkl4CYFGDbY9cMnbCfwKo9Ka9dT 6lv845ja1BOBksUVeffC9dgxLb/YUxAsA83WolLp2tAGnMdiPVkyolmVFynS0SsgTZAH HF8w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=AIkDrKPjIFQDWf/u9lIIkn47c2ajCrHecdrKXw/wTUs=; b=Kq7gFA/hnv/goWEnGZ4WU56u1rKw/FmP2+AZjcfUSPUzQZuccb2PhrD8cGWOiAWT4s rdbln8iMgQ/mcjijpG2RhzpWhaOqj7hsNiQsPSQNpXATRHxiX0Wulh9rNzY+cnp2BxfL yFl2qoOjvt9Mjbw7Yp1djmbBcttqJ2lhJb2VOLJwiQpUnBSaomz2MwOOESaAldZPGh0H FUTBJY7BZlEG1Hz9erFYRWZJBPWCdLLZBoEShW81w6p0MRiXwCfjR8Q+SPkHji445EPm vmrUq1wMV+j3CSs+8n4mVSzaayuG2Pyuws5s6/phfP+5y+NuQYmNBti57FdBwjKgwduh P23Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=XMtOxnxC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id q11-20020a25bfcb000000b00628ca1ba3b5si8748621ybm.537.2022.03.07.17.58.19 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Mon, 07 Mar 2022 17:58:19 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=XMtOxnxC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:48154 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nRP7D-000504-Ar for patch@linaro.org; Mon, 07 Mar 2022 20:58:19 -0500 Received: from eggs.gnu.org ([209.51.188.92]:54602) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nRP3C-0006GD-7f for qemu-devel@nongnu.org; Mon, 07 Mar 2022 20:54:10 -0500 Received: from [2607:f8b0:4864:20::1031] (port=37692 helo=mail-pj1-x1031.google.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nRP3A-0007u7-8N for qemu-devel@nongnu.org; Mon, 07 Mar 2022 20:54:09 -0500 Received: by mail-pj1-x1031.google.com with SMTP id p3-20020a17090a680300b001bbfb9d760eso884843pjj.2 for ; Mon, 07 Mar 2022 17:54:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=AIkDrKPjIFQDWf/u9lIIkn47c2ajCrHecdrKXw/wTUs=; b=XMtOxnxCfoAL6ctEmMoUTRHa92xlDq0Tg8ZrK64FoRC/5kUa8JMs/Ml9qXkSKYp9+C uDQkU/cFGq31ClH2Tl6vDsjFFsv8MI86MeVtjdaWEw2qANdhUgUZG1MVhD92X5aMQs9r LGuKrXvI13Pupr5+3kVBE191snMLP90gLgsS88V2Q+Jy8KRnJbxWGf4iH/CmoLvVZaDZ i4LZbKxmV7me3rMsg8FLnkMKyzXWrZ1qr1znSqND7S96rdsm64OeMAFY8gIYRFwscTdz J59gpPLcI0T5sW2DKuhKDXKLDfVKNXJGXF+5Z2NcDb+s5fu+MpM4Ss6reVyvffCfMpVh FBcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=AIkDrKPjIFQDWf/u9lIIkn47c2ajCrHecdrKXw/wTUs=; b=S+MXyEkJgXXzp//wfQksm7NzzXnmHsJWSlOxMyHzheOvrhuvzVYly+d3RucXJIujcE QN0N2dXzlnvE9Y+rG/z6rTbtVDsS+HDGc4I8ZBxhxDc5A/dRBFbmvGuOrspHUSxUHMSk VEn44/zVqzhzGVZAdEpYt91SeX77HARbXk0yvx4VximBXj7LaPw3ZRKvKoX3AYAjUYF1 Mkw6pY8EWE87NiW1z6ulFBhWiG4PiFIDExV2VSIopkL5IXYq7KagrfxFKQQYhet8cIns LCAkvXYairUjztesQjLx6C3nC2u3UC3Tl6SBFrpgyo1tg63gfv3QXp02pAALnei8AzJu EaUA== X-Gm-Message-State: AOAM531DnAvyFO5Iuvf70UzciCElnBGXVyXC/SpYyOx+UhK+48T6itT2 6ZVPY6goxJvEHd29mAGK01eKQWNiK25y5g== X-Received: by 2002:a17:902:8ec7:b0:14a:c442:8ca2 with SMTP id x7-20020a1709028ec700b0014ac4428ca2mr15334745plo.12.1646704446989; Mon, 07 Mar 2022 17:54:06 -0800 (PST) Received: from localhost.localdomain (cpe-50-113-46-110.hawaii.res.rr.com. [50.113.46.110]) by smtp.gmail.com with ESMTPSA id bh3-20020a056a02020300b00378b62df320sm12775221pgb.73.2022.03.07.17.54.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 07 Mar 2022 17:54:06 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 04/11] target/s390x: vxeh2: Update for changes to vector shifts Date: Mon, 7 Mar 2022 15:53:51 -1000 Message-Id: <20220308015358.188499-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220308015358.188499-1-richard.henderson@linaro.org> References: <20220308015358.188499-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-Host-Lookup-Failed: Reverse DNS lookup failed for 2607:f8b0:4864:20::1031 (failed) Received-SPF: pass client-ip=2607:f8b0:4864:20::1031; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1031.google.com X-Spam_score_int: -6 X-Spam_score: -0.7 X-Spam_bar: / X-Spam_report: (-0.7 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, PDS_HP_HELO_NORDNS=0.659, RCVD_IN_DNSWL_NONE=-0.0001, RDNS_NONE=0.793, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-s390x@nongnu.org, dmiller423@gmail.com Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" From: David Miller Prior to vector enhancements 2, the shift count was supposed to be equal for each byte lest the result be unpredictable, which allowed us to assume that the shift count was the same, and optimize accordingly. With vector enhancements 2, the shift count is allowed to be different for each byte, and we must cope with that. Signed-off-by: David Miller Message-Id: <20220307020327.3003-4-dmiller423@gmail.com> [rth: Split out of larger patch; simplify shift/merge code.] Signed-off-by: Richard Henderson --- target/s390x/helper.h | 3 ++ target/s390x/tcg/vec_int_helper.c | 58 ++++++++++++++++++++++ target/s390x/tcg/translate_vx.c.inc | 77 ++++++++++++----------------- target/s390x/tcg/insn-data.def | 12 ++--- 4 files changed, 99 insertions(+), 51 deletions(-) diff --git a/target/s390x/helper.h b/target/s390x/helper.h index 7412130883..bf33d86f74 100644 --- a/target/s390x/helper.h +++ b/target/s390x/helper.h @@ -203,8 +203,11 @@ DEF_HELPER_FLAGS_3(gvec_vpopct16, TCG_CALL_NO_RWG, void, ptr, cptr, i32) DEF_HELPER_FLAGS_4(gvec_verim8, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32) DEF_HELPER_FLAGS_4(gvec_verim16, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32) DEF_HELPER_FLAGS_4(gvec_vsl, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32) +DEF_HELPER_FLAGS_4(gvec_vsl_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32) DEF_HELPER_FLAGS_4(gvec_vsra, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32) +DEF_HELPER_FLAGS_4(gvec_vsra_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32) DEF_HELPER_FLAGS_4(gvec_vsrl, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32) +DEF_HELPER_FLAGS_4(gvec_vsrl_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32) DEF_HELPER_FLAGS_4(gvec_vscbi8, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32) DEF_HELPER_FLAGS_4(gvec_vscbi16, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32) DEF_HELPER_4(gvec_vtm, void, ptr, cptr, env, i32) diff --git a/target/s390x/tcg/vec_int_helper.c b/target/s390x/tcg/vec_int_helper.c index 5561b3ed90..a881d5d267 100644 --- a/target/s390x/tcg/vec_int_helper.c +++ b/target/s390x/tcg/vec_int_helper.c @@ -540,18 +540,76 @@ void HELPER(gvec_vsl)(void *v1, const void *v2, uint64_t count, s390_vec_shl(v1, v2, count); } +void HELPER(gvec_vsl_ve2)(void *v1, const void *v2, const void *v3, + uint32_t desc) +{ + S390Vector tmp; + uint32_t sh, e0, e1 = 0; + + for (int i = 15; i >= 0; --i, e1 = e0 << 24) { + e0 = s390_vec_read_element8(v2, i); + sh = s390_vec_read_element8(v3, i) & 7; + + s390_vec_write_element8(&tmp, i, rol32(e0 | e1, sh)); + } + + *(S390Vector *)v1 = tmp; +} + void HELPER(gvec_vsra)(void *v1, const void *v2, uint64_t count, uint32_t desc) { s390_vec_sar(v1, v2, count); } +void HELPER(gvec_vsra_ve2)(void *v1, const void *v2, const void *v3, + uint32_t desc) +{ + S390Vector tmp; + uint32_t sh, e0, e1; + int i = 0; + + e0 = s390_vec_read_element8(v2, 0); + e1 = -(e0 >> 7) << 8; + + for (;;) { + sh = s390_vec_read_element8(v3, i) & 7; + + s390_vec_write_element8(&tmp, i, (e0 | e1) >> sh); + + if (++i >= 16) { + break; + } + + e1 = e0 << 8; + e0 = s390_vec_read_element8(v2, i); + } + + *(S390Vector *)v1 = tmp; +} + void HELPER(gvec_vsrl)(void *v1, const void *v2, uint64_t count, uint32_t desc) { s390_vec_shr(v1, v2, count); } +void HELPER(gvec_vsrl_ve2)(void *v1, const void *v2, const void *v3, + uint32_t desc) +{ + S390Vector tmp; + uint32_t sh, e0, e1 = 0; + + for (int i = 0; i < 16; ++i, e1 = e0 << 8) { + e0 = s390_vec_read_element8(v2, i); + sh = s390_vec_read_element8(v3, i) & 7; + + s390_vec_write_element8(&tmp, i, (e0 | e1) >> sh); + } + + *(S390Vector *)v1 = tmp; +} + #define DEF_VSCBI(BITS) \ void HELPER(gvec_vscbi##BITS)(void *v1, const void *v2, const void *v3, \ uint32_t desc) \ diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc index d514e8b218..967f6213d8 100644 --- a/target/s390x/tcg/translate_vx.c.inc +++ b/target/s390x/tcg/translate_vx.c.inc @@ -2018,21 +2018,42 @@ static DisasJumpType op_ves(DisasContext *s, DisasOps *o) return DISAS_NEXT; } +static DisasJumpType gen_vsh_bit_byte(DisasContext *s, DisasOps *o, + gen_helper_gvec_2i *gen, + gen_helper_gvec_3 *gen_ve2) +{ + bool byte = s->insn->data; + + if (!byte && s390_has_feat(S390_FEAT_VECTOR_ENH2)) { + gen_gvec_3_ool(get_field(s, v1), get_field(s, v2), + get_field(s, v3), 0, gen_ve2); + } else { + TCGv_i64 shift = tcg_temp_new_i64(); + + read_vec_element_i64(shift, get_field(s, v3), 7, ES_8); + tcg_gen_andi_i64(shift, shift, byte ? 0x78 : 7); + gen_gvec_2i_ool(get_field(s, v1), get_field(s, v2), shift, 0, gen); + tcg_temp_free_i64(shift); + } + return DISAS_NEXT; +} + static DisasJumpType op_vsl(DisasContext *s, DisasOps *o) { - TCGv_i64 shift = tcg_temp_new_i64(); + return gen_vsh_bit_byte(s, o, gen_helper_gvec_vsl, + gen_helper_gvec_vsl_ve2); +} - read_vec_element_i64(shift, get_field(s, v3), 7, ES_8); - if (s->fields.op2 == 0x74) { - tcg_gen_andi_i64(shift, shift, 0x7); - } else { - tcg_gen_andi_i64(shift, shift, 0x78); - } +static DisasJumpType op_vsra(DisasContext *s, DisasOps *o) +{ + return gen_vsh_bit_byte(s, o, gen_helper_gvec_vsra, + gen_helper_gvec_vsra_ve2); +} - gen_gvec_2i_ool(get_field(s, v1), get_field(s, v2), - shift, 0, gen_helper_gvec_vsl); - tcg_temp_free_i64(shift); - return DISAS_NEXT; +static DisasJumpType op_vsrl(DisasContext *s, DisasOps *o) +{ + return gen_vsh_bit_byte(s, o, gen_helper_gvec_vsrl, + gen_helper_gvec_vsrl_ve2); } static DisasJumpType op_vsldb(DisasContext *s, DisasOps *o) @@ -2064,40 +2085,6 @@ static DisasJumpType op_vsldb(DisasContext *s, DisasOps *o) return DISAS_NEXT; } -static DisasJumpType op_vsra(DisasContext *s, DisasOps *o) -{ - TCGv_i64 shift = tcg_temp_new_i64(); - - read_vec_element_i64(shift, get_field(s, v3), 7, ES_8); - if (s->fields.op2 == 0x7e) { - tcg_gen_andi_i64(shift, shift, 0x7); - } else { - tcg_gen_andi_i64(shift, shift, 0x78); - } - - gen_gvec_2i_ool(get_field(s, v1), get_field(s, v2), - shift, 0, gen_helper_gvec_vsra); - tcg_temp_free_i64(shift); - return DISAS_NEXT; -} - -static DisasJumpType op_vsrl(DisasContext *s, DisasOps *o) -{ - TCGv_i64 shift = tcg_temp_new_i64(); - - read_vec_element_i64(shift, get_field(s, v3), 7, ES_8); - if (s->fields.op2 == 0x7c) { - tcg_gen_andi_i64(shift, shift, 0x7); - } else { - tcg_gen_andi_i64(shift, shift, 0x78); - } - - gen_gvec_2i_ool(get_field(s, v1), get_field(s, v2), - shift, 0, gen_helper_gvec_vsrl); - tcg_temp_free_i64(shift); - return DISAS_NEXT; -} - static DisasJumpType op_vs(DisasContext *s, DisasOps *o) { const uint8_t es = get_field(s, m4); diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def index 46add91a0e..f487a64abf 100644 --- a/target/s390x/tcg/insn-data.def +++ b/target/s390x/tcg/insn-data.def @@ -1204,19 +1204,19 @@ F(0xe778, VESRLV, VRR_c, V, 0, 0, 0, 0, vesv, 0, IF_VEC) F(0xe738, VESRL, VRS_a, V, la2, 0, 0, 0, ves, 0, IF_VEC) /* VECTOR SHIFT LEFT */ - F(0xe774, VSL, VRR_c, V, 0, 0, 0, 0, vsl, 0, IF_VEC) + E(0xe774, VSL, VRR_c, V, 0, 0, 0, 0, vsl, 0, 0, IF_VEC) /* VECTOR SHIFT LEFT BY BYTE */ - F(0xe775, VSLB, VRR_c, V, 0, 0, 0, 0, vsl, 0, IF_VEC) + E(0xe775, VSLB, VRR_c, V, 0, 0, 0, 0, vsl, 0, 1, IF_VEC) /* VECTOR SHIFT LEFT DOUBLE BY BYTE */ F(0xe777, VSLDB, VRI_d, V, 0, 0, 0, 0, vsldb, 0, IF_VEC) /* VECTOR SHIFT RIGHT ARITHMETIC */ - F(0xe77e, VSRA, VRR_c, V, 0, 0, 0, 0, vsra, 0, IF_VEC) + E(0xe77e, VSRA, VRR_c, V, 0, 0, 0, 0, vsra, 0, 0, IF_VEC) /* VECTOR SHIFT RIGHT ARITHMETIC BY BYTE */ - F(0xe77f, VSRAB, VRR_c, V, 0, 0, 0, 0, vsra, 0, IF_VEC) + E(0xe77f, VSRAB, VRR_c, V, 0, 0, 0, 0, vsra, 0, 1, IF_VEC) /* VECTOR SHIFT RIGHT LOGICAL */ - F(0xe77c, VSRL, VRR_c, V, 0, 0, 0, 0, vsrl, 0, IF_VEC) + E(0xe77c, VSRL, VRR_c, V, 0, 0, 0, 0, vsrl, 0, 0, IF_VEC) /* VECTOR SHIFT RIGHT LOGICAL BY BYTE */ - F(0xe77d, VSRLB, VRR_c, V, 0, 0, 0, 0, vsrl, 0, IF_VEC) + E(0xe77d, VSRLB, VRR_c, V, 0, 0, 0, 0, vsrl, 0, 1, IF_VEC) /* VECTOR SUBTRACT */ F(0xe7f7, VS, VRR_c, V, 0, 0, 0, 0, vs, 0, IF_VEC) /* VECTOR SUBTRACT COMPUTE BORROW INDICATION */