[v4,00/31] Add ARMv8.2 half-precision functions

Message ID 20180227143852.11175-1-alex.bennee@linaro.org
Headers show
Series
  • Add ARMv8.2 half-precision functions
Related show

Message

Alex Bennée Feb. 27, 2018, 2:38 p.m.
A few minor fixes and a chunk of Richard's r-b tags. Now all that is
left is:

  patch 0014/arm translate a64 add FP16 FMULX MLS FMLA to simd.patch needs review

Otherwise see comments bellow --- for other changes

Alex Bennée (31):
  include/exec/helper-head.h: support f16 in helper calls
  target/arm/cpu64: introduce ARM_V8_FP16 feature bit
  target/arm/cpu.h: update comment for half-precision values
  target/arm/cpu.h: add additional float_status flags
  target/arm/helper: pass explicit fpst to set_rmode
  arm/translate-a64: implement half-precision F(MIN|MAX)(V|NMV)
  arm/translate-a64: handle_3same_64 comment fix
  arm/translate-a64: initial decode for simd_three_reg_same_fp16
  arm/translate-a64: add FP16 FADD/FABD/FSUB/FMUL/FDIV to
    simd_three_reg_same_fp16
  arm/translate-a64: add FP16 F[A]C[EQ/GE/GT] to
    simd_three_reg_same_fp16
  arm/translate-a64: add FP16 FMULA/X/S to simd_three_reg_same_fp16
  arm/translate-a64: add FP16 FR[ECP/SQRT]S to simd_three_reg_same_fp16
  arm/translate-a64: add FP16 pairwise ops simd_three_reg_same_fp16
  arm/translate-a64: add FP16 FMULX/MLS/FMLA to simd_indexed
  arm/translate-a64: add FP16 x2 ops for simd_indexed
  arm/translate-a64: initial decode for simd_two_reg_misc_fp16
  arm/translate-a64: add FP16 FPRINTx to simd_two_reg_misc_fp16
  arm/translate-a64: add FCVTxx to simd_two_reg_misc_fp16
  arm/translate-a64: add FP16 FCMxx (zero) to simd_two_reg_misc_fp16
  arm/translate-a64: add FP16 SCVTF/UCVFT to simd_two_reg_misc_fp16
  arm/translate-a64: add FP16 FNEG/FABS to simd_two_reg_misc_fp16
  arm/helper.c: re-factor recpe and add recepe_f16
  arm/translate-a64: add FP16 FRECPE
  arm/translate-a64: add FP16 FRCPX to simd_two_reg_misc_fp16
  arm/translate-a64: add FP16 FSQRT to simd_two_reg_misc_fp16
  arm/helper.c: re-factor rsqrte and add rsqrte_f16
  arm/translate-a64: add FP16 FRSQRTE to simd_two_reg_misc_fp16
  arm/translate-a64: add FP16 FMOV to simd_mod_imm
  arm/translate-a64: add all FP16 ops in simd_scalar_pairwise
  arm/translate-a64: implement simd_scalar_three_reg_same_fp16
  arm/translate-a64: add all single op FP16 to handle_fp_1src_half

 include/exec/helper-head.h |    3 +
 include/fpu/softfloat.h    |   16 +-
 target/arm/cpu.h           |   34 +-
 target/arm/cpu64.c         |    1 +
 target/arm/helper-a64.c    |  269 ++++++++++
 target/arm/helper-a64.h    |   33 ++
 target/arm/helper.c        |  479 +++++++++--------
 target/arm/helper.h        |   14 +-
 target/arm/translate-a64.c | 1260 +++++++++++++++++++++++++++++++++++++-------
 target/arm/translate.c     |   12 +-
 10 files changed, 1694 insertions(+), 427 deletions(-)

-- 
2.15.1

Comments

no-reply@patchew.org Feb. 27, 2018, 3:14 p.m. | #1
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180227143852.11175-1-alex.bennee@linaro.org
Subject: [Qemu-devel] [PATCH v4 00/31] Add ARMv8.2 half-precision functions

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]            patchew/1519303376-92875-1-git-send-email-imammedo@redhat.com -> patchew/1519303376-92875-1-git-send-email-imammedo@redhat.com
 * [new tag]               patchew/20180227143852.11175-1-alex.bennee@linaro.org -> patchew/20180227143852.11175-1-alex.bennee@linaro.org
Switched to a new branch 'test'
760d23f344 arm/translate-a64: add all single op FP16 to handle_fp_1src_half
3ea247e123 arm/translate-a64: implement simd_scalar_three_reg_same_fp16
37e52e25c8 arm/translate-a64: add all FP16 ops in simd_scalar_pairwise
610a2ded87 arm/translate-a64: add FP16 FMOV to simd_mod_imm
5424b22a64 arm/translate-a64: add FP16 FRSQRTE to simd_two_reg_misc_fp16
3e2470a1f2 arm/helper.c: re-factor rsqrte and add rsqrte_f16
f65dc9c221 arm/translate-a64: add FP16 FSQRT to simd_two_reg_misc_fp16
02a876e8b2 arm/translate-a64: add FP16 FRCPX to simd_two_reg_misc_fp16
10c2769281 arm/translate-a64: add FP16 FRECPE
99ab8acc6b arm/helper.c: re-factor recpe and add recepe_f16
16b68c0355 arm/translate-a64: add FP16 FNEG/FABS to simd_two_reg_misc_fp16
719ccc9c2b arm/translate-a64: add FP16 SCVTF/UCVFT to simd_two_reg_misc_fp16
4356d7486e arm/translate-a64: add FP16 FCMxx (zero) to simd_two_reg_misc_fp16
dbbfb5f3db arm/translate-a64: add FCVTxx to simd_two_reg_misc_fp16
34431e3779 arm/translate-a64: add FP16 FPRINTx to simd_two_reg_misc_fp16
9fb95bb359 arm/translate-a64: initial decode for simd_two_reg_misc_fp16
ff81b5a5df arm/translate-a64: add FP16 x2 ops for simd_indexed
ea00fa8eb0 arm/translate-a64: add FP16 FMULX/MLS/FMLA to simd_indexed
87f41ae057 arm/translate-a64: add FP16 pairwise ops simd_three_reg_same_fp16
d4787194de arm/translate-a64: add FP16 FR[ECP/SQRT]S to simd_three_reg_same_fp16
ad7b642414 arm/translate-a64: add FP16 FMULA/X/S to simd_three_reg_same_fp16
cd51ab787c arm/translate-a64: add FP16 F[A]C[EQ/GE/GT] to simd_three_reg_same_fp16
e70d8c2a6c arm/translate-a64: add FP16 FADD/FABD/FSUB/FMUL/FDIV to simd_three_reg_same_fp16
e59dc95dc1 arm/translate-a64: initial decode for simd_three_reg_same_fp16
87ece4e0d1 arm/translate-a64: handle_3same_64 comment fix
bf53e72c06 arm/translate-a64: implement half-precision F(MIN|MAX)(V|NMV)
65949ff25e target/arm/helper: pass explicit fpst to set_rmode
4a2b48fb32 target/arm/cpu.h: add additional float_status flags
ca4f33ba81 target/arm/cpu.h: update comment for half-precision values
e8be261bd7 target/arm/cpu64: introduce ARM_V8_FP16 feature bit
e15c5b2b04 include/exec/helper-head.h: support f16 in helper calls

=== OUTPUT BEGIN ===
Checking PATCH 1/31: include/exec/helper-head.h: support f16 in helper calls...
Checking PATCH 2/31: target/arm/cpu64: introduce ARM_V8_FP16 feature bit...
Checking PATCH 3/31: target/arm/cpu.h: update comment for half-precision values...
Checking PATCH 4/31: target/arm/cpu.h: add additional float_status flags...
Checking PATCH 5/31: target/arm/helper: pass explicit fpst to set_rmode...
Checking PATCH 6/31: arm/translate-a64: implement half-precision F(MIN|MAX)(V|NMV)...
Checking PATCH 7/31: arm/translate-a64: handle_3same_64 comment fix...
Checking PATCH 8/31: arm/translate-a64: initial decode for simd_three_reg_same_fp16...
Checking PATCH 9/31: arm/translate-a64: add FP16 FADD/FABD/FSUB/FMUL/FDIV to simd_three_reg_same_fp16...
Checking PATCH 10/31: arm/translate-a64: add FP16 F[A]C[EQ/GE/GT] to simd_three_reg_same_fp16...
Checking PATCH 11/31: arm/translate-a64: add FP16 FMULA/X/S to simd_three_reg_same_fp16...
Checking PATCH 12/31: arm/translate-a64: add FP16 FR[ECP/SQRT]S to simd_three_reg_same_fp16...
Checking PATCH 13/31: arm/translate-a64: add FP16 pairwise ops simd_three_reg_same_fp16...
Checking PATCH 14/31: arm/translate-a64: add FP16 FMULX/MLS/FMLA to simd_indexed...
Checking PATCH 15/31: arm/translate-a64: add FP16 x2 ops for simd_indexed...
Checking PATCH 16/31: arm/translate-a64: initial decode for simd_two_reg_misc_fp16...
Checking PATCH 17/31: arm/translate-a64: add FP16 FPRINTx to simd_two_reg_misc_fp16...
Checking PATCH 18/31: arm/translate-a64: add FCVTxx to simd_two_reg_misc_fp16...
Checking PATCH 19/31: arm/translate-a64: add FP16 FCMxx (zero) to simd_two_reg_misc_fp16...
Checking PATCH 20/31: arm/translate-a64: add FP16 SCVTF/UCVFT to simd_two_reg_misc_fp16...
ERROR: space prohibited before that close parenthesis ')'
#26: FILE: target/arm/helper.c:11305:
+FLOAT_CONVS(si, h, 16, )

total: 1 errors, 0 warnings, 208 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 21/31: arm/translate-a64: add FP16 FNEG/FABS to simd_two_reg_misc_fp16...
Checking PATCH 22/31: arm/helper.c: re-factor recpe and add recepe_f16...
Checking PATCH 23/31: arm/translate-a64: add FP16 FRECPE...
Checking PATCH 24/31: arm/translate-a64: add FP16 FRCPX to simd_two_reg_misc_fp16...
Checking PATCH 25/31: arm/translate-a64: add FP16 FSQRT to simd_two_reg_misc_fp16...
Checking PATCH 26/31: arm/helper.c: re-factor rsqrte and add rsqrte_f16...
Checking PATCH 27/31: arm/translate-a64: add FP16 FRSQRTE to simd_two_reg_misc_fp16...
Checking PATCH 28/31: arm/translate-a64: add FP16 FMOV to simd_mod_imm...
Checking PATCH 29/31: arm/translate-a64: add all FP16 ops in simd_scalar_pairwise...
Checking PATCH 30/31: arm/translate-a64: implement simd_scalar_three_reg_same_fp16...
Checking PATCH 31/31: arm/translate-a64: add all single op FP16 to handle_fp_1src_half...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org
no-reply@patchew.org Feb. 27, 2018, 3:14 p.m. | #2
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180227143852.11175-1-alex.bennee@linaro.org
Subject: [Qemu-devel] [PATCH v4 00/31] Add ARMv8.2 half-precision functions

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]            patchew/1519303376-92875-1-git-send-email-imammedo@redhat.com -> patchew/1519303376-92875-1-git-send-email-imammedo@redhat.com
 * [new tag]               patchew/20180227143852.11175-1-alex.bennee@linaro.org -> patchew/20180227143852.11175-1-alex.bennee@linaro.org
Switched to a new branch 'test'
760d23f344 arm/translate-a64: add all single op FP16 to handle_fp_1src_half
3ea247e123 arm/translate-a64: implement simd_scalar_three_reg_same_fp16
37e52e25c8 arm/translate-a64: add all FP16 ops in simd_scalar_pairwise
610a2ded87 arm/translate-a64: add FP16 FMOV to simd_mod_imm
5424b22a64 arm/translate-a64: add FP16 FRSQRTE to simd_two_reg_misc_fp16
3e2470a1f2 arm/helper.c: re-factor rsqrte and add rsqrte_f16
f65dc9c221 arm/translate-a64: add FP16 FSQRT to simd_two_reg_misc_fp16
02a876e8b2 arm/translate-a64: add FP16 FRCPX to simd_two_reg_misc_fp16
10c2769281 arm/translate-a64: add FP16 FRECPE
99ab8acc6b arm/helper.c: re-factor recpe and add recepe_f16
16b68c0355 arm/translate-a64: add FP16 FNEG/FABS to simd_two_reg_misc_fp16
719ccc9c2b arm/translate-a64: add FP16 SCVTF/UCVFT to simd_two_reg_misc_fp16
4356d7486e arm/translate-a64: add FP16 FCMxx (zero) to simd_two_reg_misc_fp16
dbbfb5f3db arm/translate-a64: add FCVTxx to simd_two_reg_misc_fp16
34431e3779 arm/translate-a64: add FP16 FPRINTx to simd_two_reg_misc_fp16
9fb95bb359 arm/translate-a64: initial decode for simd_two_reg_misc_fp16
ff81b5a5df arm/translate-a64: add FP16 x2 ops for simd_indexed
ea00fa8eb0 arm/translate-a64: add FP16 FMULX/MLS/FMLA to simd_indexed
87f41ae057 arm/translate-a64: add FP16 pairwise ops simd_three_reg_same_fp16
d4787194de arm/translate-a64: add FP16 FR[ECP/SQRT]S to simd_three_reg_same_fp16
ad7b642414 arm/translate-a64: add FP16 FMULA/X/S to simd_three_reg_same_fp16
cd51ab787c arm/translate-a64: add FP16 F[A]C[EQ/GE/GT] to simd_three_reg_same_fp16
e70d8c2a6c arm/translate-a64: add FP16 FADD/FABD/FSUB/FMUL/FDIV to simd_three_reg_same_fp16
e59dc95dc1 arm/translate-a64: initial decode for simd_three_reg_same_fp16
87ece4e0d1 arm/translate-a64: handle_3same_64 comment fix
bf53e72c06 arm/translate-a64: implement half-precision F(MIN|MAX)(V|NMV)
65949ff25e target/arm/helper: pass explicit fpst to set_rmode
4a2b48fb32 target/arm/cpu.h: add additional float_status flags
ca4f33ba81 target/arm/cpu.h: update comment for half-precision values
e8be261bd7 target/arm/cpu64: introduce ARM_V8_FP16 feature bit
e15c5b2b04 include/exec/helper-head.h: support f16 in helper calls

=== OUTPUT BEGIN ===
Checking PATCH 1/31: include/exec/helper-head.h: support f16 in helper calls...
Checking PATCH 2/31: target/arm/cpu64: introduce ARM_V8_FP16 feature bit...
Checking PATCH 3/31: target/arm/cpu.h: update comment for half-precision values...
Checking PATCH 4/31: target/arm/cpu.h: add additional float_status flags...
Checking PATCH 5/31: target/arm/helper: pass explicit fpst to set_rmode...
Checking PATCH 6/31: arm/translate-a64: implement half-precision F(MIN|MAX)(V|NMV)...
Checking PATCH 7/31: arm/translate-a64: handle_3same_64 comment fix...
Checking PATCH 8/31: arm/translate-a64: initial decode for simd_three_reg_same_fp16...
Checking PATCH 9/31: arm/translate-a64: add FP16 FADD/FABD/FSUB/FMUL/FDIV to simd_three_reg_same_fp16...
Checking PATCH 10/31: arm/translate-a64: add FP16 F[A]C[EQ/GE/GT] to simd_three_reg_same_fp16...
Checking PATCH 11/31: arm/translate-a64: add FP16 FMULA/X/S to simd_three_reg_same_fp16...
Checking PATCH 12/31: arm/translate-a64: add FP16 FR[ECP/SQRT]S to simd_three_reg_same_fp16...
Checking PATCH 13/31: arm/translate-a64: add FP16 pairwise ops simd_three_reg_same_fp16...
Checking PATCH 14/31: arm/translate-a64: add FP16 FMULX/MLS/FMLA to simd_indexed...
Checking PATCH 15/31: arm/translate-a64: add FP16 x2 ops for simd_indexed...
Checking PATCH 16/31: arm/translate-a64: initial decode for simd_two_reg_misc_fp16...
Checking PATCH 17/31: arm/translate-a64: add FP16 FPRINTx to simd_two_reg_misc_fp16...
Checking PATCH 18/31: arm/translate-a64: add FCVTxx to simd_two_reg_misc_fp16...
Checking PATCH 19/31: arm/translate-a64: add FP16 FCMxx (zero) to simd_two_reg_misc_fp16...
Checking PATCH 20/31: arm/translate-a64: add FP16 SCVTF/UCVFT to simd_two_reg_misc_fp16...
ERROR: space prohibited before that close parenthesis ')'
#26: FILE: target/arm/helper.c:11305:
+FLOAT_CONVS(si, h, 16, )

total: 1 errors, 0 warnings, 208 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 21/31: arm/translate-a64: add FP16 FNEG/FABS to simd_two_reg_misc_fp16...
Checking PATCH 22/31: arm/helper.c: re-factor recpe and add recepe_f16...
Checking PATCH 23/31: arm/translate-a64: add FP16 FRECPE...
Checking PATCH 24/31: arm/translate-a64: add FP16 FRCPX to simd_two_reg_misc_fp16...
Checking PATCH 25/31: arm/translate-a64: add FP16 FSQRT to simd_two_reg_misc_fp16...
Checking PATCH 26/31: arm/helper.c: re-factor rsqrte and add rsqrte_f16...
Checking PATCH 27/31: arm/translate-a64: add FP16 FRSQRTE to simd_two_reg_misc_fp16...
Checking PATCH 28/31: arm/translate-a64: add FP16 FMOV to simd_mod_imm...
Checking PATCH 29/31: arm/translate-a64: add all FP16 ops in simd_scalar_pairwise...
Checking PATCH 30/31: arm/translate-a64: implement simd_scalar_three_reg_same_fp16...
Checking PATCH 31/31: arm/translate-a64: add all single op FP16 to handle_fp_1src_half...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org
Peter Maydell Feb. 27, 2018, 6:11 p.m. | #3
On 27 February 2018 at 14:38, Alex Bennée <alex.bennee@linaro.org> wrote:
> A few minor fixes and a chunk of Richard's r-b tags. Now all that is

> left is:

>

>   patch 0014/arm translate a64 add FP16 FMULX MLS FMLA to simd.patch needs review

>

> Otherwise see comments bellow --- for other changes

>


Thanks -- applied to target-arm.next. Some caveats:

(1) we can fix the nit RTH noted about FMULX later

(2) I notice that there's no patch here that adds the linux-user/elfload.c
code to set a hwcap for the guest program to indicate FP16 presence.
Presumably there is such a hwcap?

(3) Is this complete fp16 support or are there still more pieces to come?
I'm assuming it's all done...

(4) I've split the "add new ARM_V8_FP16 feature bit to the enum"
and "enable the feature on the 'any' CPU" parts of patch 2, so
we can do the latter at the end. If there is still missing parts
to fp16 then we can drop the enable-feature half of that
for the moment.

thanks
-- PMM
Alex Bennée Feb. 28, 2018, 1:32 p.m. | #4
Peter Maydell <peter.maydell@linaro.org> writes:

> On 27 February 2018 at 14:38, Alex Bennée <alex.bennee@linaro.org> wrote:

>> A few minor fixes and a chunk of Richard's r-b tags. Now all that is

>> left is:

>>

>>   patch 0014/arm translate a64 add FP16 FMULX MLS FMLA to simd.patch needs review

>>

>> Otherwise see comments bellow --- for other changes

>>

>

> Thanks -- applied to target-arm.next. Some caveats:

>

> (1) we can fix the nit RTH noted about FMULX later

>

> (2) I notice that there's no patch here that adds the linux-user/elfload.c

> code to set a hwcap for the guest program to indicate FP16 presence.

> Presumably there is such a hwcap?


I'd missed it as risu doesn't need it. I see rth has sent a patch so
I'll read up on it and see if I can extend vector-benchmark to use it to
detect FP16.

>

> (3) Is this complete fp16 support or are there still more pieces to come?

> I'm assuming it's all done...


All AArch64 is done. I'm not sure how much AArch32 is needed for SVE
support. The ARM ARM says "When this feature is implemented it is
implemented in both Advanced SIMD and floating-point, and in AArch64 and
AArch32 states." but I think it is legal to have a 64 bit only CPU
without AArch32?

Unfortunately the magic I used to extract all the AArch64 HP
instructions from the ASL doesn't work on the AArch32 definitions which
put important differentiating notes in different places.

Once I've got the list I'll document it so we don't forget...

>

> (4) I've split the "add new ARM_V8_FP16 feature bit to the enum"

> and "enable the feature on the 'any' CPU" parts of patch 2, so

> we can do the latter at the end. If there is still missing parts

> to fp16 then we can drop the enable-feature half of that

> for the moment.


I guess that depends on if we model any AArch64 only CPUs?

>

> thanks

> -- PMM



--
Alex Bennée
Peter Maydell Feb. 28, 2018, 3:02 p.m. | #5
On 28 February 2018 at 13:32, Alex Bennée <alex.bennee@linaro.org> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:

>> (3) Is this complete fp16 support or are there still more pieces to come?

>> I'm assuming it's all done...

>

> All AArch64 is done. I'm not sure how much AArch32 is needed for SVE

> support. The ARM ARM says "When this feature is implemented it is

> implemented in both Advanced SIMD and floating-point, and in AArch64 and

> AArch32 states." but I think it is legal to have a 64 bit only CPU

> without AArch32?


64-bit only is legal, but none of our current implemented CPUs
(including the proposed "max") are 64-bit only :-)
For linux-user it doesn't matter, because the code will only
run as AArch64 anyway, but before we advertise it in the ID
registers for system emulation I would definitely prefer to have
the AArch32 instructions.

Do you have a feel for how much work the AArch32 side is?

thanks
-- PMM
Richard Henderson Feb. 28, 2018, 4:58 p.m. | #6
On 02/28/2018 07:02 AM, Peter Maydell wrote:
> Do you have a feel for how much work the AArch32 side is?


Based on my experience adding fcmla support to aa32, probably less than a week.


r~