[v4,13/22] fpu/softfloat: re-factor mul

Message ID	20180206164815.10084-14-alex.bennee@linaro.org
State	Superseded
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; From: =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org> To: richard.henderson@linaro.org, peter.maydell@linaro.org, laurent@vivier.eu, bharata@linux.vnet.ibm.com, andrew@andrewdutcher.com Date: Tue, 6 Feb 2018 16:48:06 +0000 Message-Id: <20180206164815.10084-14-alex.bennee@linaro.org> In-Reply-To: <20180206164815.10084-1-alex.bennee@linaro.org> References: <20180206164815.10084-1-alex.bennee@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Subject: [Qemu-devel] [PATCH v4 13/22] fpu/softfloat: re-factor mul Precedence: list Cc: =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org>, qemu-devel@nongnu.org, Aurelien Jarno <aurelien@aurel32.net> Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patch=linaro.org@nongnu.org>
Series	re-factor softfloat and add fp16 functions \| expand [v4,00/22] re-factor softfloat and add fp16 functions [v4,01/22] fpu/softfloat: implement float16_squash_input_denormal [v4,02/22] include/fpu/softfloat: remove USE_SOFTFLOAT_STRUCT_TYPES [v4,03/22] fpu/softfloat-types: new header to prevent excessive re-builds [v4,04/22] target/*/cpu.h: remove softfloat.h [v4,05/22] include/fpu/softfloat: implement float16_abs helper [v4,06/22] include/fpu/softfloat: implement float16_chs helper [v4,07/22] include/fpu/softfloat: implement float16_set_sign helper [v4,08/22] include/fpu/softfloat: add some float16 constants [v4,09/22] fpu/softfloat: improve comments on ARM NaN propagation [v4,10/22] fpu/softfloat: move the extract functions to the top of the file [v4,11/22] fpu/softfloat: define decompose structures [v4,12/22] fpu/softfloat: re-factor add/sub [v4,13/22] fpu/softfloat: re-factor mul [v4,14/22] fpu/softfloat: re-factor div [v4,15/22] fpu/softfloat: re-factor muladd [v4,16/22] fpu/softfloat: re-factor round_to_int [v4,17/22] fpu/softfloat: re-factor float to int/uint [v4,18/22] fpu/softfloat: re-factor int/uint to float [v4,19/22] fpu/softfloat: re-factor scalbn [v4,20/22] fpu/softfloat: re-factor minmax [v4,21/22] fpu/softfloat: re-factor compare [v4,22/22] fpu/softfloat: re-factor sqrt

Message ID

20180206164815.10084-14-alex.bennee@linaro.org

State

Superseded

Headers

Received-SPF: pass (google.com: domain of
	qemu-devel-bounces+patch=linaro.org@nongnu.org designates
	2001:4830:134:3::11 as permitted sender)
	client-ip=2001:4830:134:3::11; 
From: =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org>
To: richard.henderson@linaro.org, peter.maydell@linaro.org, laurent@vivier.eu,
	bharata@linux.vnet.ibm.com, andrew@andrewdutcher.com
Date: Tue,  6 Feb 2018 16:48:06 +0000
Message-Id: <20180206164815.10084-14-alex.bennee@linaro.org>
In-Reply-To: <20180206164815.10084-1-alex.bennee@linaro.org>
References: <20180206164815.10084-1-alex.bennee@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [Qemu-devel] [PATCH v4 13/22] fpu/softfloat: re-factor mul
Precedence: list
Cc: =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org>,
	qemu-devel@nongnu.org, Aurelien Jarno <aurelien@aurel32.net>
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+patch=linaro.org@nongnu.org>

Series

re-factor softfloat and add fp16 functions | expand

Commit Message

Alex Bennée Feb. 6, 2018, 4:48 p.m. UTC

We can now add float16_mul and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 versions.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


---
v3
  - mul compile fixes for new names
  - remove duplicate s-o-b
  - more explicit inf * zero check
v4
  - use is_nan
  - pick_nan_parts/pick_nan
---
 fpu/softfloat.c         | 209 +++++++++++++++++++-----------------------------
 include/fpu/softfloat.h |   1 +
 2 files changed, 82 insertions(+), 128 deletions(-)

-- 
2.15.1

Comments

Peter Maydell Feb. 13, 2018, 3:20 p.m. UTC | #1

On 6 February 2018 at 16:48, Alex Bennée <alex.bennee@linaro.org> wrote:
> We can now add float16_mul and use the common decompose and

> canonicalize functions to have a single implementation for

> float16/32/64 versions.

>

> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

>

> ---

> v3


> +/*

> + * Returns the result of multiplying the floating-point values `a' and

> + * `b'. The operation is performed according to the IEC/IEEE Standard

> + * for Binary Floating-Point Arithmetic.

> + */

> +

> +static FloatParts mul_floats(FloatParts a, FloatParts b, float_status *s)

> +{

> +    bool sign = a.sign ^ b.sign;

> +

> +    if (a.cls == float_class_normal && b.cls == float_class_normal) {

> +        uint64_t hi, lo;

> +        int exp = a.exp + b.exp;

> +

> +        mul64To128(a.frac, b.frac, &hi, &lo);


It seems a shame that we previously were able to use a
32x32->64 multiply for the float32 case, and now we have to
do an expensive 64x64->128 multiply regardless...

Regardless
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>


thanks
-- PMM

Richard Henderson Feb. 13, 2018, 3:39 p.m. UTC | #2

On 02/13/2018 07:20 AM, Peter Maydell wrote:
>> +static FloatParts mul_floats(FloatParts a, FloatParts b, float_status *s)

>> +{

>> +    bool sign = a.sign ^ b.sign;

>> +

>> +    if (a.cls == float_class_normal && b.cls == float_class_normal) {

>> +        uint64_t hi, lo;

>> +        int exp = a.exp + b.exp;

>> +

>> +        mul64To128(a.frac, b.frac, &hi, &lo);

> 

> It seems a shame that we previously were able to use a

> 32x32->64 multiply for the float32 case, and now we have to

> do an expensive 64x64->128 multiply regardless...


To be fair, I've proposed two different solutions addressing that -- c++
templates and glibc macros -- and you like neither.  Is there a third
alternative that does not involve code duplication?


r~

Alex Bennée Feb. 19, 2018, 4:04 p.m. UTC | #3

Peter Maydell <peter.maydell@linaro.org> writes:

> On 6 February 2018 at 16:48, Alex Bennée <alex.bennee@linaro.org> wrote:

>> We can now add float16_mul and use the common decompose and

>> canonicalize functions to have a single implementation for

>> float16/32/64 versions.

>>

>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

>>

>> ---

>> v3

>

>> +/*

>> + * Returns the result of multiplying the floating-point values `a' and

>> + * `b'. The operation is performed according to the IEC/IEEE Standard

>> + * for Binary Floating-Point Arithmetic.

>> + */

>> +

>> +static FloatParts mul_floats(FloatParts a, FloatParts b, float_status *s)

>> +{

>> +    bool sign = a.sign ^ b.sign;

>> +

>> +    if (a.cls == float_class_normal && b.cls == float_class_normal) {

>> +        uint64_t hi, lo;

>> +        int exp = a.exp + b.exp;

>> +

>> +        mul64To128(a.frac, b.frac, &hi, &lo);

>

> It seems a shame that we previously were able to use a

> 32x32->64 multiply for the float32 case, and now we have to

> do an expensive 64x64->128 multiply regardless...

Actually for mul the hit isn't too bad. When we do a div however you do
notice a bit of a gulf:

 https://i.imgur.com/KMWceo8.png

We could start passing &floatN_params to the functions much like the
sqrt function and be a bit smarter when we do our multiply and let the
compiler figure it out as we go.

Another avenue worth exploring is ensuring we use native Int128 support
where we can so these wide operations can use wide registers where
available.

However both of these things for future optimisations given it doesn't
show up in dbt-bench timings.

>

> Regardless

> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

>

> thanks

> -- PMM

--
Alex Bennée

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 2190e7de56..6d29e1a103 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -735,6 +735,87 @@  float64 __attribute__((flatten)) float64_sub(float64 a, float64 b,
     return float64_round_pack_canonical(pr, status);
 }
 
+/*
+ * Returns the result of multiplying the floating-point values `a' and
+ * `b'. The operation is performed according to the IEC/IEEE Standard
+ * for Binary Floating-Point Arithmetic.
+ */
+
+static FloatParts mul_floats(FloatParts a, FloatParts b, float_status *s)
+{
+    bool sign = a.sign ^ b.sign;
+
+    if (a.cls == float_class_normal && b.cls == float_class_normal) {
+        uint64_t hi, lo;
+        int exp = a.exp + b.exp;
+
+        mul64To128(a.frac, b.frac, &hi, &lo);
+        shift128RightJamming(hi, lo, DECOMPOSED_BINARY_POINT, &hi, &lo);
+        if (lo & DECOMPOSED_OVERFLOW_BIT) {
+            shift64RightJamming(lo, 1, &lo);
+            exp += 1;
+        }
+
+        /* Re-use a */
+        a.exp = exp;
+        a.sign = sign;
+        a.frac = lo;
+        return a;
+    }
+    /* handle all the NaN cases */
+    if (is_nan(a.cls) || is_nan(b.cls)) {
+        return pick_nan(a, b, s);
+    }
+    /* Inf * Zero == NaN */
+    if ((a.cls == float_class_inf && b.cls == float_class_zero) ||
+        (a.cls == float_class_zero && b.cls == float_class_inf)) {
+        s->float_exception_flags |= float_flag_invalid;
+        a.cls = float_class_dnan;
+        a.sign = sign;
+        return a;
+    }
+    /* Multiply by 0 or Inf */
+    if (a.cls == float_class_inf || a.cls == float_class_zero) {
+        a.sign = sign;
+        return a;
+    }
+    if (b.cls == float_class_inf || b.cls == float_class_zero) {
+        b.sign = sign;
+        return b;
+    }
+    g_assert_not_reached();
+}
+
+float16 __attribute__((flatten)) float16_mul(float16 a, float16 b,
+                                             float_status *status)
+{
+    FloatParts pa = float16_unpack_canonical(a, status);
+    FloatParts pb = float16_unpack_canonical(b, status);
+    FloatParts pr = mul_floats(pa, pb, status);
+
+    return float16_round_pack_canonical(pr, status);
+}
+
+float32 __attribute__((flatten)) float32_mul(float32 a, float32 b,
+                                             float_status *status)
+{
+    FloatParts pa = float32_unpack_canonical(a, status);
+    FloatParts pb = float32_unpack_canonical(b, status);
+    FloatParts pr = mul_floats(pa, pb, status);
+
+    return float32_round_pack_canonical(pr, status);
+}
+
+float64 __attribute__((flatten)) float64_mul(float64 a, float64 b,
+                                             float_status *status)
+{
+    FloatParts pa = float64_unpack_canonical(a, status);
+    FloatParts pb = float64_unpack_canonical(b, status);
+    FloatParts pr = mul_floats(pa, pb, status);
+
+    return float64_round_pack_canonical(pr, status);
+}
+
 /*----------------------------------------------------------------------------
 | Takes a 64-bit fixed-point value `absZ' with binary point between bits 6
 | and 7, and returns the properly rounded 32-bit integer corresponding to the
@@ -2546,70 +2627,6 @@  float32 float32_round_to_int(float32 a, float_status *status)
 
 }
 
-/*----------------------------------------------------------------------------
-| Returns the result of multiplying the single-precision floating-point values
-| `a' and `b'.  The operation is performed according to the IEC/IEEE Standard
-| for Binary Floating-Point Arithmetic.
-*----------------------------------------------------------------------------*/
-
-float32 float32_mul(float32 a, float32 b, float_status *status)
-{
-    flag aSign, bSign, zSign;
-    int aExp, bExp, zExp;
-    uint32_t aSig, bSig;
-    uint64_t zSig64;
-    uint32_t zSig;
-
-    a = float32_squash_input_denormal(a, status);
-    b = float32_squash_input_denormal(b, status);
-
-    aSig = extractFloat32Frac( a );
-    aExp = extractFloat32Exp( a );
-    aSign = extractFloat32Sign( a );
-    bSig = extractFloat32Frac( b );
-    bExp = extractFloat32Exp( b );
-    bSign = extractFloat32Sign( b );
-    zSign = aSign ^ bSign;
-    if ( aExp == 0xFF ) {
-        if ( aSig || ( ( bExp == 0xFF ) && bSig ) ) {
-            return propagateFloat32NaN(a, b, status);
-        }
-        if ( ( bExp | bSig ) == 0 ) {
-            float_raise(float_flag_invalid, status);
-            return float32_default_nan(status);
-        }
-        return packFloat32( zSign, 0xFF, 0 );
-    }
-    if ( bExp == 0xFF ) {
-        if (bSig) {
-            return propagateFloat32NaN(a, b, status);
-        }
-        if ( ( aExp | aSig ) == 0 ) {
-            float_raise(float_flag_invalid, status);
-            return float32_default_nan(status);
-        }
-        return packFloat32( zSign, 0xFF, 0 );
-    }
-    if ( aExp == 0 ) {
-        if ( aSig == 0 ) return packFloat32( zSign, 0, 0 );
-        normalizeFloat32Subnormal( aSig, &aExp, &aSig );
-    }
-    if ( bExp == 0 ) {
-        if ( bSig == 0 ) return packFloat32( zSign, 0, 0 );
-        normalizeFloat32Subnormal( bSig, &bExp, &bSig );
-    }
-    zExp = aExp + bExp - 0x7F;
-    aSig = ( aSig | 0x00800000 )<<7;
-    bSig = ( bSig | 0x00800000 )<<8;
-    shift64RightJamming( ( (uint64_t) aSig ) * bSig, 32, &zSig64 );
-    zSig = zSig64;
-    if ( 0 <= (int32_t) ( zSig<<1 ) ) {
-        zSig <<= 1;
-        --zExp;
-    }
-    return roundAndPackFloat32(zSign, zExp, zSig, status);
-
-}
 
 /*----------------------------------------------------------------------------
 | Returns the result of dividing the single-precision floating-point value `a'
@@ -4142,70 +4159,6 @@  float64 float64_trunc_to_int(float64 a, float_status *status)
     return res;
 }
 
-
-/*----------------------------------------------------------------------------
-| Returns the result of multiplying the double-precision floating-point values
-| `a' and `b'.  The operation is performed according to the IEC/IEEE Standard
-| for Binary Floating-Point Arithmetic.
-*----------------------------------------------------------------------------*/
-
-float64 float64_mul(float64 a, float64 b, float_status *status)
-{
-    flag aSign, bSign, zSign;
-    int aExp, bExp, zExp;
-    uint64_t aSig, bSig, zSig0, zSig1;
-
-    a = float64_squash_input_denormal(a, status);
-    b = float64_squash_input_denormal(b, status);
-
-    aSig = extractFloat64Frac( a );
-    aExp = extractFloat64Exp( a );
-    aSign = extractFloat64Sign( a );
-    bSig = extractFloat64Frac( b );
-    bExp = extractFloat64Exp( b );
-    bSign = extractFloat64Sign( b );
-    zSign = aSign ^ bSign;
-    if ( aExp == 0x7FF ) {
-        if ( aSig || ( ( bExp == 0x7FF ) && bSig ) ) {
-            return propagateFloat64NaN(a, b, status);
-        }
-        if ( ( bExp | bSig ) == 0 ) {
-            float_raise(float_flag_invalid, status);
-            return float64_default_nan(status);
-        }
-        return packFloat64( zSign, 0x7FF, 0 );
-    }
-    if ( bExp == 0x7FF ) {
-        if (bSig) {
-            return propagateFloat64NaN(a, b, status);
-        }
-        if ( ( aExp | aSig ) == 0 ) {
-            float_raise(float_flag_invalid, status);
-            return float64_default_nan(status);
-        }
-        return packFloat64( zSign, 0x7FF, 0 );
-    }
-    if ( aExp == 0 ) {
-        if ( aSig == 0 ) return packFloat64( zSign, 0, 0 );
-        normalizeFloat64Subnormal( aSig, &aExp, &aSig );
-    }
-    if ( bExp == 0 ) {
-        if ( bSig == 0 ) return packFloat64( zSign, 0, 0 );
-        normalizeFloat64Subnormal( bSig, &bExp, &bSig );
-    }
-    zExp = aExp + bExp - 0x3FF;
-    aSig = ( aSig | LIT64( 0x0010000000000000 ) )<<10;
-    bSig = ( bSig | LIT64( 0x0010000000000000 ) )<<11;
-    mul64To128( aSig, bSig, &zSig0, &zSig1 );
-    zSig0 |= ( zSig1 != 0 );
-    if ( 0 <= (int64_t) ( zSig0<<1 ) ) {
-        zSig0 <<= 1;
-        --zExp;
-    }
-    return roundAndPackFloat64(zSign, zExp, zSig0, status);
-
-}
-
 /*----------------------------------------------------------------------------
 | Returns the result of dividing the double-precision floating-point value `a'
 | by the corresponding value `b'.  The operation is performed according to
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 693ece0974..7fc63dd60f 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -239,6 +239,7 @@  float64 float16_to_float64(float16 a, flag ieee, float_status *status);
 
 float16 float16_add(float16, float16, float_status *status);
 float16 float16_sub(float16, float16, float_status *status);
+float16 float16_mul(float16, float16, float_status *status);
 
 int float16_is_quiet_nan(float16, float_status *status);
 int float16_is_signaling_nan(float16, float_status *status);

[v4,13/22] fpu/softfloat: re-factor mul

Commit Message

Comments

Patch