[v4,00/36] tcg: Support for Int128 with helpers

Message ID 20230108023719.2466341-1-richard.henderson@linaro.org

Message

Richard Henderson Jan. 8, 2023, 2:36 a.m. UTC
Changes for v4:
  * About half of the v3 series has been merged,
  * AArch64 host requires even argument register.
  * target/{arm,ppc,s390x,i386} uses included here.
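
For context on the AArch64 item: the AArch64 procedure call standard passes a 128-bit integer in two consecutive general registers, and the first of the pair must be even-numbered, so an odd register may be left unused. The rounding can be sketched in standalone C as below. The names pick_arg_reg and NUM_ARG_REGS are hypothetical and are not taken from this series; this is an illustration of the rule, not the code in tcg/aarch64.

```c
#include <stdbool.h>

/* Hypothetical constant: AArch64 passes integer arguments in x0..x7. */
enum { NUM_ARG_REGS = 8 };

/*
 * Return the first register used for the next argument, or -1 if the
 * argument no longer fits in registers.  'next_free' is the lowest
 * unused argument register; 'is_i128' selects a 128-bit argument,
 * which needs an even/odd register pair.
 */
static int pick_arg_reg(int next_free, bool is_i128)
{
    if (is_i128) {
        next_free = (next_free + 1) & ~1;   /* round up to an even register */
        return next_free + 1 < NUM_ARG_REGS ? next_free : -1;
    }
    return next_free < NUM_ARG_REGS ? next_free : -1;
}
```

With x1 as the next free register, a 128-bit argument would land in x2/x3 under this sketch, leaving x1 unused.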

Patches requiring review:
  01-tcg-Define-TCG_TYPE_I128-and-related-helper-macro.patch
  02-tcg-Handle-dh_typecode_i128-with-TCG_CALL_-RET-AR.patch
  03-tcg-Allocate-objects-contiguously-in-temp_allocat.patch
  05-tcg-Add-TCG_CALL_-RET-ARG-_BY_REF.patch
  07-tcg-Add-TCG_CALL_RET_BY_VEC.patch
  08-include-qemu-int128-Use-Int128-structure-for-TCI.patch
  09-tcg-i386-Add-TCG_TARGET_CALL_-RET-ARG-_I128.patch
  10-tcg-tci-Fix-big-endian-return-register-ordering.patch
  11-tcg-tci-Add-TCG_TARGET_CALL_-RET-ARG-_I128.patch
  13-tcg-Add-temp-allocation-for-TCGv_i128.patch
  14-tcg-Add-basic-data-movement-for-TCGv_i128.patch
  15-tcg-Add-guest-load-store-primitives-for-TCGv_i128.patch
  16-tcg-Add-tcg_gen_-non-atomic_cmpxchg_i128.patch
  17-tcg-Split-out-tcg_gen_nonatomic_cmpxchg_i-32-64.patch
  24-target-s390x-Use-a-single-return-for-helper_divs3.patch
  31-target-s390x-Use-Int128-for-passing-float128.patch
  32-target-s390x-Use-tcg_gen_atomic_cmpxchg_i128-for-.patch
  33-target-s390x-Implement-CC_OP_NZ-in-gen_op_calc_cc.patch
  34-target-i386-Split-out-gen_cmpxchg8b-gen_cmpxchg16.patch
  35-target-i386-Inline-cmpxchg8b.patch
  36-target-i386-Inline-cmpxchg16b.patch


r~


Ilya Leoshkevich (2):
  tests/tcg/s390x: Add div.c
  tests/tcg/s390x: Add clst.c

Richard Henderson (34):
  tcg: Define TCG_TYPE_I128 and related helper macros
  tcg: Handle dh_typecode_i128 with TCG_CALL_{RET,ARG}_NORMAL
  tcg: Allocate objects contiguously in temp_allocate_frame
  tcg: Introduce tcg_out_addi_ptr
  tcg: Add TCG_CALL_{RET,ARG}_BY_REF
  tcg: Introduce tcg_target_call_oarg_reg
  tcg: Add TCG_CALL_RET_BY_VEC
  include/qemu/int128: Use Int128 structure for TCI
  tcg/i386: Add TCG_TARGET_CALL_{RET,ARG}_I128
  tcg/tci: Fix big-endian return register ordering
  tcg/tci: Add TCG_TARGET_CALL_{RET,ARG}_I128
  tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128
  tcg: Add temp allocation for TCGv_i128
  tcg: Add basic data movement for TCGv_i128
  tcg: Add guest load/store primitives for TCGv_i128
  tcg: Add tcg_gen_{non}atomic_cmpxchg_i128
  tcg: Split out tcg_gen_nonatomic_cmpxchg_i{32,64}
  target/arm: Use tcg_gen_atomic_cmpxchg_i128 for STXP
  target/arm: Use tcg_gen_atomic_cmpxchg_i128 for CASP
  target/ppc: Use tcg_gen_atomic_cmpxchg_i128 for STQCX
  tests/tcg/s390x: Add long-double.c
  target/s390x: Use a single return for helper_divs32/u32
  target/s390x: Use a single return for helper_divs64/u64
  target/s390x: Use Int128 for return from CLST
  target/s390x: Use Int128 for return from CKSM
  target/s390x: Use Int128 for return from TRE
  target/s390x: Copy wout_x1 to wout_x1_P
  target/s390x: Use Int128 for returning float128
  target/s390x: Use Int128 for passing float128
  target/s390x: Use tcg_gen_atomic_cmpxchg_i128 for CDSG
  target/s390x: Implement CC_OP_NZ in gen_op_calc_cc
  target/i386: Split out gen_cmpxchg8b, gen_cmpxchg16b
  target/i386: Inline cmpxchg8b
  target/i386: Inline cmpxchg16b

 accel/tcg/tcg-runtime.h          |  11 +
 include/exec/cpu_ldst.h          |  10 +
 include/exec/helper-head.h       |   7 +
 include/qemu/atomic128.h         |  29 ++-
 include/qemu/int128.h            |  25 +-
 include/tcg/tcg-op.h             |  15 ++
 include/tcg/tcg.h                |  49 +++-
 target/arm/helper-a64.h          |   8 -
 target/i386/helper.h             |   6 -
 target/ppc/helper.h              |   2 -
 target/s390x/helper.h            |  54 ++---
 tcg/aarch64/tcg-target.h         |   2 +
 tcg/arm/tcg-target.h             |   2 +
 tcg/i386/tcg-target.h            |  10 +
 tcg/loongarch64/tcg-target.h     |   2 +
 tcg/mips/tcg-target.h            |   2 +
 tcg/riscv/tcg-target.h           |   3 +
 tcg/s390x/tcg-target.h           |   2 +
 tcg/sparc64/tcg-target.h         |   2 +
 tcg/tcg-internal.h               |  17 ++
 tcg/tci/tcg-target.h             |   3 +
 target/s390x/tcg/insn-data.h.inc |  60 ++---
 accel/tcg/cputlb.c               | 112 +++++++++
 accel/tcg/user-exec.c            |  66 ++++++
 target/arm/helper-a64.c          | 147 ------------
 target/arm/translate-a64.c       | 121 +++++-----
 target/i386/tcg/mem_helper.c     | 126 ----------
 target/i386/tcg/translate.c      | 126 ++++++++--
 target/ppc/mem_helper.c          |  44 ----
 target/ppc/translate.c           | 102 ++++----
 target/s390x/tcg/fpu_helper.c    | 103 ++++----
 target/s390x/tcg/int_helper.c    |  64 ++---
 target/s390x/tcg/mem_helper.c    |  77 +-----
 target/s390x/tcg/translate.c     | 217 +++++++++++------
 tcg/tcg-op.c                     | 393 ++++++++++++++++++++++++++-----
 tcg/tcg.c                        | 303 +++++++++++++++++++++---
 tcg/tci.c                        |  65 ++---
 tests/tcg/s390x/clst.c           |  82 +++++++
 tests/tcg/s390x/div.c            |  75 ++++++
 tests/tcg/s390x/long-double.c    |  24 ++
 util/int128.c                    |  42 ++++
 accel/tcg/atomic_common.c.inc    |  45 ++++
 tcg/aarch64/tcg-target.c.inc     |  17 +-
 tcg/arm/tcg-target.c.inc         |  30 ++-
 tcg/i386/tcg-target.c.inc        |  52 +++-
 tcg/loongarch64/tcg-target.c.inc |  17 +-
 tcg/mips/tcg-target.c.inc        |  17 +-
 tcg/ppc/tcg-target.c.inc         |  20 +-
 tcg/riscv/tcg-target.c.inc       |  17 +-
 tcg/s390x/tcg-target.c.inc       |  16 +-
 tcg/sparc64/tcg-target.c.inc     |  19 +-
 tcg/tci/tcg-target.c.inc         |  27 ++-
 tests/tcg/s390x/Makefile.target  |   3 +
 53 files changed, 1936 insertions(+), 954 deletions(-)
 create mode 100644 tests/tcg/s390x/clst.c
 create mode 100644 tests/tcg/s390x/div.c
 create mode 100644 tests/tcg/s390x/long-double.c

Comments

Mark Cave-Ayland Jan. 10, 2023, 11:12 p.m. UTC | #1
On 08/01/2023 02:36, Richard Henderson wrote:

> Changes for v4:
>    * About half of the v3 series has been merged,
>    * AArch64 host requires even argument register.
>    * target/{arm,ppc,s390x,i386} uses included here.

Now that the TCG documentation is more visible, would it be possible to add a patch 
to update the relevant parts of docs/devel/tcg-ops.rst to reflect the new Int128 support?


ATB,

Mark.
Richard Henderson Jan. 24, 2023, 9:46 p.m. UTC | #2
On 1/10/23 13:12, Mark Cave-Ayland wrote:
> Now that the TCG documentation is more visible, would it be possible to add a patch to 
> update the relevant parts of docs/devel/tcg-ops.rst to reflect the new Int128 support?

For avoidance of doubt, this document covers the intermediate representation and some 
backend specifics.  There are no changes to either of these at this time.  The TCGv_i128 
type is lowered to TCG_TYPE_REG (either I32 or I64 per host) during translation of guest 
instructions to intermediate opcodes.
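
As a rough illustration of what "lowered" means here: a 128-bit value can be carried as two 64-bit halves, each of which fits a host register on a 64-bit host, and operations on it become operations on the halves. The sketch below is standalone C with hypothetical names (I128Pair, i128_cmpxchg_nonatomic); it is not the TCG translation interface and makes no claim about how the series implements it.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value held as two 64-bit parts. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} I128Pair;

/*
 * Non-atomic compare-and-exchange on the pair: compare both halves,
 * store the new value only if they all match, and return the value
 * that was in memory beforehand.
 */
static I128Pair i128_cmpxchg_nonatomic(I128Pair *mem, I128Pair cmp,
                                       I128Pair newv)
{
    I128Pair old = *mem;

    /* Equal only if neither half differs. */
    if (((old.lo ^ cmp.lo) | (old.hi ^ cmp.hi)) == 0) {
        *mem = newv;
    }
    return old;
}
```

On a 32-bit host the same idea would presumably apply with more, smaller pieces.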

Not to say another document shouldn't be written covering the translation interface...


r~
Richard Henderson Jan. 24, 2023, 9:54 p.m. UTC | #3
On 1/7/23 16:36, Richard Henderson wrote:
> Patches requiring review:
>    01-tcg-Define-TCG_TYPE_I128-and-related-helper-macro.patch
>    02-tcg-Handle-dh_typecode_i128-with-TCG_CALL_-RET-AR.patch
>    03-tcg-Allocate-objects-contiguously-in-temp_allocat.patch
>    05-tcg-Add-TCG_CALL_-RET-ARG-_BY_REF.patch
>    07-tcg-Add-TCG_CALL_RET_BY_VEC.patch
>    08-include-qemu-int128-Use-Int128-structure-for-TCI.patch
>    09-tcg-i386-Add-TCG_TARGET_CALL_-RET-ARG-_I128.patch
>    10-tcg-tci-Fix-big-endian-return-register-ordering.patch
>    11-tcg-tci-Add-TCG_TARGET_CALL_-RET-ARG-_I128.patch
>    13-tcg-Add-temp-allocation-for-TCGv_i128.patch
>    14-tcg-Add-basic-data-movement-for-TCGv_i128.patch
>    15-tcg-Add-guest-load-store-primitives-for-TCGv_i128.patch
>    16-tcg-Add-tcg_gen_-non-atomic_cmpxchg_i128.patch
>    17-tcg-Split-out-tcg_gen_nonatomic_cmpxchg_i-32-64.patch
>    24-target-s390x-Use-a-single-return-for-helper_divs3.patch
>    31-target-s390x-Use-Int128-for-passing-float128.patch
>    32-target-s390x-Use-tcg_gen_atomic_cmpxchg_i128-for-.patch
>    33-target-s390x-Implement-CC_OP_NZ-in-gen_op_calc_cc.patch
>    34-target-i386-Split-out-gen_cmpxchg8b-gen_cmpxchg16.patch
>    35-target-i386-Inline-cmpxchg8b.patch
>    36-target-i386-Inline-cmpxchg16b.patch

Ping.  Only patches 2, 3, 10, and 14 have been reviewed in the past 2 weeks.
There is a very minor patch conflict now in patch 4, nothing worth re-posting over.


r~
Alex Bennée Jan. 25, 2023, 9:50 p.m. UTC | #4
Richard Henderson <richard.henderson@linaro.org> writes:

> Changes for v4:
>   * About half of the v3 series has been merged,
>   * AArch64 host requires even argument register.
>   * target/{arm,ppc,s390x,i386} uses included here.

Have you got a branch or a new re-base? I tried applying but got messy
conflicts I couldn't cleanly resolve.