[v2,26/67] target/arm: Convert FABD to decodetree

Message ID	20240524232121.284515-27-richard.henderson@linaro.org
State	Superseded
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; From: Richard Henderson <richard.henderson@linaro.org> To: qemu-devel@nongnu.org Cc: qemu-arm@nongnu.org, Peter Maydell <peter.maydell@linaro.org> Subject: [PATCH v2 26/67] target/arm: Convert FABD to decodetree Date: Fri, 24 May 2024 16:20:40 -0700 Message-Id: <20240524232121.284515-27-richard.henderson@linaro.org> In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org> References: <20240524232121.284515-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=2607:f8b0:4864:20::42e; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action Precedence: list Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org
Series	target/arm: Convert a64 advsimd to decodetree (part 1) \| expand [v2,00/67] target/arm: Convert a64 advsimd to decodetree (part 1) [v2,01/67] target/arm: Add neoverse-n1 to qemu-arm (DO NOT MERGE) [v2,02/67] target/arm: Use PLD, PLDW, PLI not NOP for t32 [v2,03/67] target/arm: Reject incorrect operands to PLD, PLDW, PLI [v2,04/67] target/arm: Zero-extend writeback for fp16 FCVTZS (scalar, integer) [v2,05/67] target/arm: Fix decode of FMOV (hp) vs MOVI [v2,06/67] target/arm: Verify sz=0 for Advanced SIMD scalar pairwise (fp16) [v2,07/67] target/arm: Split out gengvec.c [v2,08/67] target/arm: Split out gengvec64.c [v2,09/67] target/arm: Convert Cryptographic AES to decodetree [v2,10/67] target/arm: Convert Cryptographic 3-register SHA to decodetree [v2,11/67] target/arm: Convert Cryptographic 2-register SHA to decodetree [v2,12/67] target/arm: Convert Cryptographic 3-register SHA512 to decodetree [v2,13/67] target/arm: Convert Cryptographic 2-register SHA512 to decodetree [v2,14/67] target/arm: Convert Cryptographic 4-register to decodetree [v2,15/67] target/arm: Convert Cryptographic 3-register, imm2 to decodetree [v2,16/67] target/arm: Convert XAR to decodetree [v2,17/67] target/arm: Convert Advanced SIMD copy to decodetree [v2,18/67] target/arm: Convert FMULX to decodetree [v2,19/67] target/arm: Convert FADD, FSUB, FDIV, FMUL to decodetree [v2,20/67] target/arm: Convert FMAX, FMIN, FMAXNM, FMINNM to decodetree [v2,21/67] target/arm: Introduce vfp_load_reg16 [v2,22/67] target/arm: Expand vfp neg and abs inline [v2,23/67] target/arm: Convert FNMUL to decodetree [v2,24/67] target/arm: Convert FMLA, FMLS to decodetree [v2,25/67] target/arm: Convert FCMEQ, FCMGE, FCMGT, FACGE, FACGT to decodetree [v2,26/67] target/arm: Convert FABD to decodetree [v2,27/67] target/arm: Convert FRECPS, FRSQRTS to decodetree [v2,28/67] target/arm: Convert FADDP to decodetree [v2,29/67] target/arm: Convert FMAXP, FMINP, FMAXNMP, FMINNMP to decodetree [v2,30/67] target/arm: Use gvec for neon faddp, fmaxp, fminp [v2,31/67] target/arm: Convert ADDP to decodetree [v2,32/67] target/arm: Use gvec for neon padd [v2,33/67] target/arm: Convert SMAXP, SMINP, UMAXP, UMINP to decodetree [v2,34/67] target/arm: Use gvec for neon pmax, pmin [v2,35/67] target/arm: Convert FMLAL, FMLSL to decodetree [v2,36/67] target/arm: Convert disas_simd_3same_logic to decodetree [v2,37/67] target/arm: Improve vector UQADD, UQSUB, SQADD, SQSUB [v2,38/67] target/arm: Convert SUQADD and USQADD to gvec [v2,39/67] target/arm: Inline scalar SUQADD and USQADD [v2,40/67] target/arm: Inline scalar SQADD, UQADD, SQSUB, UQSUB [v2,41/67] target/arm: Convert SQADD, SQSUB, UQADD, UQSUB to decodetree [v2,42/67] target/arm: Convert SUQADD, USQADD to decodetree [v2,43/67] target/arm: Convert SSHL, USHL to decodetree [v2,44/67] target/arm: Convert SRSHL and URSHL (register) to gvec [v2,45/67] target/arm: Convert SRSHL, URSHL to decodetree [v2,46/67] target/arm: Convert SQSHL and UQSHL (register) to gvec [v2,47/67] target/arm: Convert SQSHL, UQSHL to decodetree [v2,48/67] target/arm: Convert SQRSHL and UQRSHL (register) to gvec [v2,49/67] target/arm: Convert SQRSHL, UQRSHL to decodetree [v2,50/67] target/arm: Convert ADD, SUB (vector) to decodetree [v2,51/67] target/arm: Convert CMGT, CMHI, CMGE, CMHS, CMTST, CMEQ to decodetree [v2,52/67] target/arm: Use TCG_COND_TSTNE in gen_cmtst_{i32, i64} [v2,53/67] target/arm: Use TCG_COND_TSTNE in gen_cmtst_vec [v2,54/67] target/arm: Convert SHADD, UHADD to gvec [v2,55/67] target/arm: Convert SHADD, UHADD to decodetree [v2,56/67] target/arm: Convert SHSUB, UHSUB to gvec [v2,57/67] target/arm: Convert SHSUB, UHSUB to decodetree [v2,58/67] target/arm: Convert SRHADD, URHADD to gvec [v2,59/67] target/arm: Convert SRHADD, URHADD to decodetree [v2,60/67] target/arm: Convert SMAX, SMIN, UMAX, UMIN to decodetree [v2,61/67] target/arm: Convert SABA, SABD, UABA, UABD to decodetree [v2,62/67] target/arm: Convert MUL, PMUL to decodetree [v2,63/67] target/arm: Convert MLA, MLS to decodetree [v2,64/67] target/arm: Tidy SQDMULH, SQRDMULH (vector) [v2,65/67] target/arm: Convert SQDMULH, SQRDMULH to decodetree [v2,66/67] target/arm: Convert FMADD, FMSUB, FNMADD, FNMSUB to decodetree [v2,67/67] target/arm: Convert FCSEL to decodetree

Message ID

20240524232121.284515-27-richard.henderson@linaro.org

State

Superseded

Headers

Received-SPF: pass (google.com: domain of
 qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as
 permitted sender) client-ip=209.51.188.17;
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org,
	Peter Maydell <peter.maydell@linaro.org>
Subject: [PATCH v2 26/67] target/arm: Convert FABD to decodetree
Date: Fri, 24 May 2024 16:20:40 -0700
Message-Id: <20240524232121.284515-27-richard.henderson@linaro.org>
In-Reply-To: <20240524232121.284515-1-richard.henderson@linaro.org>
References: <20240524232121.284515-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=2607:f8b0:4864:20::42e;
 envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42e.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org

Series

target/arm: Convert a64 advsimd to decodetree (part 1) | expand

Commit Message

Richard Henderson May 24, 2024, 11:20 p.m. UTC

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/helper.h            |  1 +
 target/arm/tcg/a64.decode      |  6 ++++
 target/arm/tcg/translate-a64.c | 60 ++++++++++++++++++++++------------
 target/arm/tcg/vec_helper.c    |  6 ++++
 4 files changed, 53 insertions(+), 20 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 8d076011c1..ff6e3094f4 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -724,6 +724,7 @@  DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(gvec_fabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(gvec_fceq_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(gvec_fceq_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 7fc3277be6..a852b5f06f 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -728,6 +728,9 @@  FACGE_s         0111 1110 0.1 ..... 11101 1 ..... ..... @rrr_sd
 FACGT_s         0111 1110 110 ..... 00101 1 ..... ..... @rrr_h
 FACGT_s         0111 1110 1.1 ..... 11101 1 ..... ..... @rrr_sd
 
+FABD_s          0111 1110 110 ..... 00010 1 ..... ..... @rrr_h
+FABD_s          0111 1110 1.1 ..... 11010 1 ..... ..... @rrr_sd
+
 ### Advanced SIMD three same
 
 FADD_v          0.00 1110 010 ..... 00010 1 ..... ..... @qrrr_h
@@ -778,6 +781,9 @@  FACGE_v         0.10 1110 0.1 ..... 11101 1 ..... ..... @qrrr_sd
 FACGT_v         0.10 1110 110 ..... 00101 1 ..... ..... @qrrr_h
 FACGT_v         0.10 1110 1.1 ..... 11101 1 ..... ..... @qrrr_sd
 
+FABD_v          0.10 1110 110 ..... 00010 1 ..... ..... @qrrr_h
+FABD_v          0.10 1110 1.1 ..... 11010 1 ..... ..... @qrrr_sd
+
 ### Advanced SIMD scalar x indexed element
 
 FMUL_si         0101 1111 00 .. .... 1001 . 0 ..... .....   @rrx_h
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 75b0c1a005..633384d2a5 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -5010,6 +5010,31 @@  static const FPScalar f_scalar_facgt = {
 };
 TRANS(FACGT_s, do_fp3_scalar, a, &f_scalar_facgt)
 
+static void gen_fabd_h(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
+{
+    gen_helper_vfp_subh(d, n, m, s);
+    gen_vfp_absh(d, d);
+}
+
+static void gen_fabd_s(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, TCGv_ptr s)
+{
+    gen_helper_vfp_subs(d, n, m, s);
+    gen_vfp_abss(d, d);
+}
+
+static void gen_fabd_d(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_ptr s)
+{
+    gen_helper_vfp_subd(d, n, m, s);
+    gen_vfp_absd(d, d);
+}
+
+static const FPScalar f_scalar_fabd = {
+    gen_fabd_h,
+    gen_fabd_s,
+    gen_fabd_d,
+};
+TRANS(FABD_s, do_fp3_scalar, a, &f_scalar_fabd)
+
 static bool do_fp3_vector(DisasContext *s, arg_qrrr_e *a,
                           gen_helper_gvec_3_ptr * const fns[3])
 {
@@ -5150,6 +5175,13 @@  static gen_helper_gvec_3_ptr * const f_vector_facgt[3] = {
 };
 TRANS(FACGT_v, do_fp3_vector, a, f_vector_facgt)
 
+static gen_helper_gvec_3_ptr * const f_vector_fabd[3] = {
+    gen_helper_gvec_fabd_h,
+    gen_helper_gvec_fabd_s,
+    gen_helper_gvec_fabd_d,
+};
+TRANS(FABD_v, do_fp3_vector, a, f_vector_fabd)
+
 /*
  * Advanced SIMD scalar/vector x indexed element
  */
@@ -9303,10 +9335,6 @@  static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x3f: /* FRSQRTS */
                 gen_helper_rsqrtsf_f64(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x7a: /* FABD */
-                gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
-                gen_vfp_absd(tcg_res, tcg_res);
-                break;
             default:
             case 0x18: /* FMAXNM */
             case 0x19: /* FMLA */
@@ -9322,6 +9350,7 @@  static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x5c: /* FCMGE */
             case 0x5d: /* FACGE */
             case 0x5f: /* FDIV */
+            case 0x7a: /* FABD */
             case 0x7c: /* FCMGT */
             case 0x7d: /* FACGT */
                 g_assert_not_reached();
@@ -9344,10 +9373,6 @@  static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x3f: /* FRSQRTS */
                 gen_helper_rsqrtsf_f32(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x7a: /* FABD */
-                gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
-                gen_vfp_abss(tcg_res, tcg_res);
-                break;
             default:
             case 0x18: /* FMAXNM */
             case 0x19: /* FMLA */
@@ -9363,6 +9388,7 @@  static void handle_3same_float(DisasContext *s, int size, int elements,
             case 0x5c: /* FCMGE */
             case 0x5d: /* FACGE */
             case 0x5f: /* FDIV */
+            case 0x7a: /* FABD */
             case 0x7c: /* FCMGT */
             case 0x7d: /* FACGT */
                 g_assert_not_reached();
@@ -9405,7 +9431,6 @@  static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         switch (fpopcode) {
         case 0x1f: /* FRECPS */
         case 0x3f: /* FRSQRTS */
-        case 0x7a: /* FABD */
             break;
         default:
         case 0x1b: /* FMULX */
@@ -9413,6 +9438,7 @@  static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         case 0x7d: /* FACGT */
         case 0x1c: /* FCMEQ */
         case 0x5c: /* FCMGE */
+        case 0x7a: /* FABD */
         case 0x7c: /* FCMGT */
             unallocated_encoding(s);
             return;
@@ -9568,13 +9594,13 @@  static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     switch (fpopcode) {
     case 0x07: /* FRECPS */
     case 0x0f: /* FRSQRTS */
-    case 0x1a: /* FABD */
         break;
     default:
     case 0x03: /* FMULX */
     case 0x04: /* FCMEQ (reg) */
     case 0x14: /* FCMGE (reg) */
     case 0x15: /* FACGE */
+    case 0x1a: /* FABD */
     case 0x1c: /* FCMGT (reg) */
     case 0x1d: /* FACGT */
         unallocated_encoding(s);
@@ -9602,15 +9628,12 @@  static void disas_simd_scalar_three_reg_same_fp16(DisasContext *s,
     case 0x0f: /* FRSQRTS */
         gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
         break;
-    case 0x1a: /* FABD */
-        gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
-        tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
-        break;
     default:
     case 0x03: /* FMULX */
     case 0x04: /* FCMEQ (reg) */
     case 0x14: /* FCMGE (reg) */
     case 0x15: /* FACGE */
+    case 0x1a: /* FABD */
     case 0x1c: /* FCMGT (reg) */
     case 0x1d: /* FACGT */
         g_assert_not_reached();
@@ -11272,7 +11295,6 @@  static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
         return;
     case 0x1f: /* FRECPS */
     case 0x3f: /* FRSQRTS */
-    case 0x7a: /* FABD */
         if (!fp_access_check(s)) {
             return;
         }
@@ -11314,6 +11336,7 @@  static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
     case 0x5c: /* FCMGE */
     case 0x5d: /* FACGE */
     case 0x5f: /* FDIV */
+    case 0x7a: /* FABD */
     case 0x7d: /* FACGT */
     case 0x7c: /* FCMGT */
         unallocated_encoding(s);
@@ -11659,7 +11682,6 @@  static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     switch (fpopcode) {
     case 0x7: /* FRECPS */
     case 0xf: /* FRSQRTS */
-    case 0x1a: /* FABD */
         pairwise = false;
         break;
     case 0x10: /* FMAXNMP */
@@ -11684,6 +11706,7 @@  static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
     case 0x14: /* FCMGE */
     case 0x15: /* FACGE */
     case 0x17: /* FDIV */
+    case 0x1a: /* FABD */
     case 0x1c: /* FCMGT */
     case 0x1d: /* FACGT */
         unallocated_encoding(s);
@@ -11757,10 +11780,6 @@  static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
             case 0xf: /* FRSQRTS */
                 gen_helper_rsqrtsf_f16(tcg_res, tcg_op1, tcg_op2, fpst);
                 break;
-            case 0x1a: /* FABD */
-                gen_helper_advsimd_subh(tcg_res, tcg_op1, tcg_op2, fpst);
-                tcg_gen_andi_i32(tcg_res, tcg_res, 0x7fff);
-                break;
             default:
             case 0x0: /* FMAXNM */
             case 0x1: /* FMLA */
@@ -11776,6 +11795,7 @@  static void disas_simd_three_reg_same_fp16(DisasContext *s, uint32_t insn)
             case 0x14: /* FCMGE */
             case 0x15: /* FACGE */
             case 0x17: /* FDIV */
+            case 0x1a: /* FABD */
             case 0x1c: /* FCMGT */
             case 0x1d: /* FACGT */
                 g_assert_not_reached();
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index dabefa3526..e9d7922f30 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -1154,6 +1154,11 @@  static float32 float32_abd(float32 op1, float32 op2, float_status *stat)
     return float32_abs(float32_sub(op1, op2, stat));
 }
 
+static float64 float64_abd(float64 op1, float64 op2, float_status *stat)
+{
+    return float64_abs(float64_sub(op1, op2, stat));
+}
+
 /*
  * Reciprocal step. These are the AArch32 version which uses a
  * non-fused multiply-and-subtract.
@@ -1238,6 +1243,7 @@  DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
 
 DO_3OP(gvec_fabd_h, float16_abd, float16)
 DO_3OP(gvec_fabd_s, float32_abd, float32)
+DO_3OP(gvec_fabd_d, float64_abd, float64)
 
 DO_3OP(gvec_fceq_h, float16_ceq, float16)
 DO_3OP(gvec_fceq_s, float32_ceq, float32)

[v2,26/67] target/arm: Convert FABD to decodetree

Commit Message

Patch