[for-6.2,00/53] target/arm: MVE slices 3 and 4

Message ID 20210729111512.16541-1-peter.maydell@linaro.org

Peter Maydell July 29, 2021, 11:14 a.m. UTC
This patch series provides the third and fourth slices of the MVE
implementation, which gives us complete coverage of all instructions
and brings us to the point where we can actually enable it.

In this series:
 * fixes for minor bugs in a couple of the insns already upstream
 * all the remaining integer instructions
 * the remaining loads and stores (scatter-gather and interleaving)
 * the floating point instructions
 * patch enabling MVE for the Cortex-M55

Things still to do:
 * MVE loads/stores should check alignment (this will depend on
   the patchset that RTH just sent out, and I didn't want to
   entangle the two features unnecessarily)
 * gdbstub support (blocked on the gdb folks nailing down what
   the XML for it should be)
 * optimization: many of the insns should have inline versions
   to use when we know we aren't doing any predication
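
The optimization item above can be illustrated with a rough sketch in
plain C. This is not code from the series: Vec128, vadd32_predicated(),
vadd32_unpredicated() and vadd32() are hypothetical names, and the only
architectural fact assumed is that MVE predication uses a 16-bit mask
with one bit per byte of the 128-bit vector. In QEMU itself the choice
would presumably be made at translate time (emitting inline code rather
than calling a helper); the runtime check here only shows why the
unpredicated case is so much cheaper.

```c
#include <stdint.h>

/* Hypothetical 128-bit vector of four 32-bit lanes (not QEMU's real type). */
typedef struct {
    uint32_t lane[4];
} Vec128;

/*
 * Predicated 32-bit add. 'mask' has one bit per byte of the vector:
 * bit (e * 4 + b) controls byte b of lane e. Bytes whose bit is clear
 * keep the old destination value.
 */
static void vadd32_predicated(Vec128 *d, const Vec128 *n, const Vec128 *m,
                              uint16_t mask)
{
    for (int e = 0; e < 4; e++) {
        uint32_t r = n->lane[e] + m->lane[e];
        uint32_t bytemask = 0;

        /* Expand this lane's four predicate bits into a byte-wise mask. */
        for (int b = 0; b < 4; b++) {
            if (mask & (1u << (e * 4 + b))) {
                bytemask |= 0xffu << (b * 8);
            }
        }
        d->lane[e] = (d->lane[e] & ~bytemask) | (r & bytemask);
    }
}

/* Fast path: every byte is active, so no merging with the old value. */
static void vadd32_unpredicated(Vec128 *d, const Vec128 *n, const Vec128 *m)
{
    for (int e = 0; e < 4; e++) {
        d->lane[e] = n->lane[e] + m->lane[e];
    }
}

/* Dispatch: take the fast path only when the whole vector is active. */
static void vadd32(Vec128 *d, const Vec128 *n, const Vec128 *m, uint16_t mask)
{
    if (mask == 0xffff) {
        vadd32_unpredicated(d, n, m);
    } else {
        vadd32_predicated(d, n, m, mask);
    }
}
```

With a mask of 0x00f0 only lane 1 is written; with 0xffff the merge step
disappears entirely, which is the case an inline version would target.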

But none of those are blockers for this landing upstream once
we reopen for 6.2.

Still to review:
 03, 07, 10, 21, 26, and the new patches 36-53

thanks
-- PMM

Peter Maydell (53):
  target/arm: Note that we handle VMOVL as a special case of VSHLL
  target/arm: Print MVE VPR in CPU dumps
  target/arm: Fix MVE VSLI by 0 and VSRI by <dt>
  target/arm: Fix signed VADDV
  target/arm: Fix mask handling for MVE narrowing operations
  target/arm: Fix 48-bit saturating shifts
  target/arm: Fix MVE 48-bit SQRSHRL for small right shifts
  target/arm: Fix calculation of LTP mask when LR is 0
  target/arm: Factor out mve_eci_mask()
  target/arm: Fix VPT advance when ECI is non-zero
  target/arm: Fix VLDRB/H/W for predicated elements
  target/arm: Implement MVE VMULL (polynomial)
  target/arm: Implement MVE incrementing/decrementing dup insns
  target/arm: Factor out gen_vpst()
  target/arm: Implement MVE integer vector comparisons
  target/arm: Implement MVE integer vector-vs-scalar comparisons
  target/arm: Implement MVE VPSEL
  target/arm: Implement MVE VMLAS
  target/arm: Implement MVE shift-by-scalar
  target/arm: Move 'x' and 'a' bit definitions into vmlaldav formats
  target/arm: Implement MVE integer min/max across vector
  target/arm: Implement MVE VABAV
  target/arm: Implement MVE narrowing moves
  target/arm: Rename MVEGenDualAccOpFn to MVEGenLongDualAccOpFn
  target/arm: Implement MVE VMLADAV and VMLSLDAV
  target/arm: Implement MVE VMLA
  target/arm: Implement MVE saturating doubling multiply accumulates
  target/arm: Implement MVE VQABS, VQNEG
  target/arm: Implement MVE VMAXA, VMINA
  target/arm: Implement MVE VMOV to/from 2 general-purpose registers
  target/arm: Implement MVE VPNOT
  target/arm: Implement MVE VCTP
  target/arm: Implement MVE scatter-gather insns
  target/arm: Implement MVE scatter-gather immediate forms
  target/arm: Implement MVE interleaving loads/stores
  target/arm: Implement MVE VADD (floating-point)
  target/arm: Implement MVE VSUB, VMUL, VABD, VMAXNM, VMINNM
  target/arm: Implement MVE VCADD
  target/arm: Implement MVE VFMA and VFMS
  target/arm: Implement MVE VCMUL and VCMLA
  target/arm: Implement MVE VMAXNMA and VMINNMA
  target/arm: Implement MVE scalar fp insns
  target/arm: Implement MVE fp-with-scalar VFMA, VFMAS
  softfloat: Remove assertion preventing silencing of NaN in default-NaN
    mode
  target/arm: Implement MVE FP max/min across vector
  target/arm: Implement MVE fp vector comparisons
  target/arm: Implement MVE fp scalar comparisons
  target/arm: Implement MVE VCVT between floating and fixed point
  target/arm: Implement MVE VCVT between fp and integer
  target/arm: Implement MVE VCVT with specified rounding mode
  target/arm: Implement MVE VCVT between single and half precision
  target/arm: Implement MVE VRINT insns
  target/arm: Enable MVE in Cortex-M55

 docs/system/arm/emulation.rst  |    1 +
 target/arm/helper-mve.h        |  425 +++++++
 target/arm/translate-a32.h     |    2 +
 target/arm/translate.h         |    6 +
 target/arm/vec_internal.h      |   11 +
 target/arm/mve.decode          |  463 +++++++-
 target/arm/t32.decode          |    1 +
 target/arm/cpu.c               |    3 +
 target/arm/cpu_tcg.c           |    7 +-
 target/arm/mve_helper.c        | 1899 +++++++++++++++++++++++++++++++-
 target/arm/translate-mve.c     | 1154 ++++++++++++++++++-
 target/arm/translate-neon.c    |    6 -
 target/arm/translate-vfp.c     |    2 +-
 target/arm/translate.c         |   33 +
 target/arm/vec_helper.c        |   14 +-
 fpu/softfloat-specialize.c.inc |    1 -
 16 files changed, 3911 insertions(+), 117 deletions(-)

-- 
2.20.1