[09/17] math: Use exp10m1f from CORE-MATH

Message ID	20241025182614.2022697-10-adhemerval.zanella@linaro.org
State	New
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: domain of libc-alpha-bounces~patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 554FC3858CDB From: Adhemerval Zanella <adhemerval.zanella@linaro.org> To: libc-alpha@sourceware.org Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr>, Alexei Sibidanov <sibid@uvic.ca> Subject: [PATCH 09/17] math: Use exp10m1f from CORE-MATH Date: Fri, 25 Oct 2024 15:21:47 -0300 Message-ID: <20241025182614.2022697-10-adhemerval.zanella@linaro.org> In-Reply-To: <20241025182614.2022697-1-adhemerval.zanella@linaro.org> References: <20241025182614.2022697-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: libc-alpha-bounces~patch=linaro.org@sourceware.org
Series	Add more CORE-MATH on libm \| expand [00/17] Add more CORE-MATH on libm [01/17] math: Add e_gammaf_r to glibc code and style [02/17] benchtests: Add exp10m1f benchmark [03/17] benchtests: Add exp2m1f benchmark [04/17] benchtests: Add expm1f benchmark [05/17] benchtests: Add log10f benchmark [06/17] benchtests: Add log2p1f benchmark [07/17] benchtests: Add log1p benchmark [08/17] benchtests: Add log10p1f benchmark [09/17] math: Use exp10m1f from CORE-MATH [10/17] math: Use exp2m1f from CORE-MATH [11/17] math: Use expm1f from CORE-MATH [13/17] math: Use log2p1f from CORE-MATH [15/17] math: Use log10p1f from CORE-MATH [16/17] x86_64: Add exp10m1f with FMA [17/17] x86_64: Add exp2m1f with FMA

diff --git a/SHARED-FILES b/SHARED-FILES index 2aefac8a95..a1dd3878f9 100644 --- a/SHARED-FILES +++ b/SHARED-FILES @@ -231,3 +231,7 @@ sysdeps/ieee754/flt-32/e_gammaf_r.c - replace 0x1p127f * 0x1p127f by math_narrow_eval (x * 0x1p127f) - replace 0x1p-127f * sgn[k&1] by math_narrow_eval (0x1p-127f * sgn[k&1]) - add libm_alias_finite (__ieee754_gammaf_r, __gammaf_r) at the end +sysdeps/ieee754/flt-32/s_exp10m1f.c + (file src/binary32/exp10m1/exp10m1f.c in CORE_MATH) + - The code was adapted to use glibc code style and internal + functions to handle errno, overflow, and underflow. diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps index 8c333fcd29..00832393f6 100644 --- a/sysdeps/aarch64/libm-test-ulps +++ b/sysdeps/aarch64/libm-test-ulps @@ -1083,22 +1083,18 @@ ldouble: 3 Function: "exp10m1": double: 4 -float: 2 ldouble: 3 Function: "exp10m1_downward": double: 3 -float: 3 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 3 ldouble: 6 Function: "exp10m1_upward": double: 5 -float: 3 ldouble: 6 Function: "exp2": diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps index d1665b9b72..ad4ae4c632 100644 --- a/sysdeps/arc/fpu/libm-test-ulps +++ b/sysdeps/arc/fpu/libm-test-ulps @@ -789,7 +789,6 @@ float: 1 Function: "exp10m1": double: 4 -float: 2 Function: "exp10m1_downward": double: 3 diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps index e768024eb5..4cecea6919 100644 --- a/sysdeps/arc/nofpu/libm-test-ulps +++ b/sysdeps/arc/nofpu/libm-test-ulps @@ -194,7 +194,6 @@ float: 1 Function: "exp10m1": double: 3 -float: 2 Function: "exp2": double: 1 diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps index 21fe98a7bd..bcbe82ca34 100644 --- a/sysdeps/arm/libm-test-ulps +++ b/sysdeps/arm/libm-test-ulps @@ -779,19 +779,15 @@ float: 1 Function: "exp10m1": double: 3 -float: 2 Function: "exp10m1_downward": double: 3 -float: 3 Function: "exp10m1_towardzero": double: 2 -float: 3 Function: "exp10m1_upward": double: 4 -float: 4 Function: "exp2": double: 1 diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps index 57c1155e8b..3dc1f2f8c3 100644 --- a/sysdeps/hppa/fpu/libm-test-ulps +++ b/sysdeps/hppa/fpu/libm-test-ulps @@ -798,19 +798,15 @@ float: 1 Function: "exp10m1": double: 4 -float: 2 Function: "exp10m1_downward": double: 3 -float: 3 Function: "exp10m1_towardzero": double: 2 -float: 3 Function: "exp10m1_upward": double: 5 -float: 4 Function: "exp2": double: 1 diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps index d13b05a220..96d02334e4 100644 --- a/sysdeps/i386/fpu/libm-test-ulps +++ b/sysdeps/i386/fpu/libm-test-ulps @@ -1155,25 +1155,21 @@ ldouble: 2 Function: "exp10m1": double: 3 -float: 1 float128: 3 ldouble: 4 Function: "exp10m1_downward": double: 3 -float: 3 float128: 6 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 2 float128: 6 ldouble: 5 Function: "exp10m1_upward": double: 5 -float: 3 float128: 6 ldouble: 6 diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps index 0a813aa8dd..c18e2bc8de 100644 --- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps +++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps @@ -1158,25 +1158,21 @@ ldouble: 2 Function: "exp10m1": double: 3 -float: 1 float128: 3 ldouble: 4 Function: "exp10m1_downward": double: 3 -float: 3 float128: 6 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 2 float128: 6 ldouble: 5 Function: "exp10m1_upward": double: 5 -float: 3 float128: 6 ldouble: 6 diff --git a/sysdeps/ieee754/flt-32/s_exp10m1f.c b/sysdeps/ieee754/flt-32/s_exp10m1f.c new file mode 100644 index 0000000000..610f269b01 --- /dev/null +++ b/sysdeps/ieee754/flt-32/s_exp10m1f.c @@ -0,0 +1,225 @@ +/* Implementation of the exp10m1 function for binary32. + +Copyright (c) 2022-2024 Alexei Sibidanov. Paul Zimmermann. + +The original version of this file was copied from the CORE-MATH +project (file src/binary32/exp10m1/exp10m1f.c, revision d680516). + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + */ + +#include <math.h> +#include "math_config.h" +#include <libm-alias-float.h> + +float +__exp10m1f (float x) +{ + const double iln10h = 0x1.a934f09p+1 * 16; + const double iln10l = 0x1.e68dc57f2496p-29 * 16; + double z = x; + uint32_t ux = asuint (x); + uint32_t ax = ux & (~0u >> 1); + if (__glibc_unlikely (ux > 0xc0f0d2f1u)) + { /* x < -7.52575 */ + if (ax > (0xffu << 23)) + return x + x; /* nan */ + return (ux == 0xff800000) ? -0x1p+0 : -0x1p+0 + 0x1p-26f; + } + else if (__glibc_unlikely (ax > 0x421a209au)) + { /* x > 38.5318 */ + if (ax >= asuint (INFINITY)) + return x + x; /* +Inf or NaN */ + return __math_oflowf (0); + } + else if (__glibc_unlikely (ax < 0x3d89c604u)) + { /* |x| < 0.1549/log(10) */ + double z2 = z * z, r; + if (__glibc_unlikely (ax < 0x3d1622fbu)) + { /* |x| < 8.44e-2/log(10) */ + if (__glibc_unlikely (ax < 0x3c8b76a3u)) + { /* |x| < 3.92e-2/log(10) */ + if (__glibc_unlikely (ax < 0x3bcced04u)) + { /* |x| < 1.44e-2/log(10) */ + if (__glibc_unlikely (ax < 0x3acf33ebu)) + { /* |x| < 3.64e-3/log(10 */ + if (__glibc_unlikely (ax < 0x395a966bu)) + { /* |x| < 4.8e-4/log(10 */ + if (__glibc_unlikely (ax < 0x36fe4a4bu)) + { /* |x| < 1.745e-5/log(10) */ + if (__glibc_unlikely (ax < 0x32407f39u)) + { /* |x| < 2.58e-8/log(10) */ + if (__glibc_unlikely (ax < 0x245e5bd9u)) + { /* |x| < 4.82164e-17 */ + r = 0x1.26bb1bbb55516p+1; + } + else + { + if (__glibc_unlikely (ux == 0x2c994b7bu)) + return 0x1.60f974p-37f - 0x1p-90f; + r = 0x1.26bb1bbb55516p+1 + + z * 0x1.53524c73cea69p+1; + } + } + else + { + if (__glibc_unlikely (ux == 0xb6fa215bu)) + return -0x1.1ff87ep-16 + 0x1p-68; + r = 0x1.26bb1bbb55516p+1 + + z * (0x1.53524c73ea62fp+1 + + z * 0x1.0470591de2c75p+1); + } + } + else + { + static const double cp[] = + { + 0x1.26bb1bbb55515p+1, 0x1.53524c73cea69p+1, + 0x1.0470595038cc2p+1, 0x1.2bd7609fe1561p+0 + }; + r = (cp[0] + z * cp[1]) + + z2 * (cp[2] + z * cp[3]); + } + } + else + { + static const double cp[] = + { + 0x1.26bb1bbb55516p+1, 0x1.53524c73ce6dbp+1, + 0x1.0470591de3024p+1, 0x1.2bd76b79060e6p+0, + 0x1.1429ffd3a963dp-1 + }; + r = (cp[0] + z * cp[1]) + + z2 * (cp[2] + z * (cp[3] + z * cp[4])); + } + } + else + { + static const double cp[] = + { + 0x1.26bb1bbb55516p+1, 0x1.53524c73cea67p+1, + 0x1.0470591dc2953p+1, 0x1.2bd760a004d64p+0, + 0x1.142a85da6f072p-1, 0x1.a7ed70725b00ep-3 + }; + r = (cp[0] + z * cp[1]) + z2 + * ((cp[2] + z * cp[3]) + + z2 * (cp[4] + z * cp[5])); + } + } + else + { + static const double cp[] = + { + 0x1.26bb1bbb55516p+1, 0x1.53524c73ceadep+1, + 0x1.0470591de2bb4p+1, 0x1.2bd76099a9d33p+0, + 0x1.1429ffd829b0bp-1, 0x1.a7f2a6a0f7dc8p-3, + 0x1.16e4dfbce0f56p-4 + }; + r = (cp[0] + z * cp[1]) + + z2 * ((cp[2] + z * cp[3]) + + z2 * (cp[4] + z * (cp[5] + z * cp[6]))); + } + } + else + { + static const double cp[] = + { + 0x1.26bb1bbb55515p+1, 0x1.53524c73cea6ap+1, + 0x1.0470591de476p+1, 0x1.2bd7609fd4ee2p+0, + 0x1.1429ff70a9b48p-1, 0x1.a7ed71259ba5bp-3, + 0x1.16f3004fb3ac1p-4, 0x1.4116b0388aa9fp-6 + }; + r = ((cp[0] + z * cp[1]) + z2 * (cp[2] + z * cp[3])) + + (z2 * z2) * ((cp[4] + z * cp[5]) + z2 * (cp[6] + + z * cp[7])); + } + } + else + { + static const double cp[] = + { + 0x1.26bb1bbb55515p+1, 0x1.53524c73cea42p+1, 0x1.0470591de2d1dp+1, + 0x1.2bd760a010a53p+0, 0x1.1429ffd16170cp-1, 0x1.a7ed6b2a0d97fp-3, + 0x1.16e4e37fa51e4p-4, 0x1.4147fe4c1676fp-6, 0x1.4897c4b3e329ap-8 + }; + r = ((cp[0] + z * cp[1]) + z2 * (cp[2] + z * cp[3])) + + (z2 * z2) * ((cp[4] + z * cp[5]) + + z2 * (cp[6] + z * (cp[7] + z * cp[8]))); + } + r *= z; + return r; + } + else + { + /* -7.52575 < x < -0.1549/log(10) or 0.1549/log(10) < x < 38.5318 */ + static const double tb[] = + { + 0x1p+0, 0x1.0b5586cf9890fp+0, 0x1.172b83c7d517bp+0, + 0x1.2387a6e756238p+0, 0x1.306fe0a31b715p+0, 0x1.3dea64c123422p+0, + 0x1.4bfdad5362a27p+0, 0x1.5ab07dd485429p+0, 0x1.6a09e667f3bcdp+0, + 0x1.7a11473eb0187p+0, 0x1.8ace5422aa0dap+0, 0x1.9c49182a3f09p+0, + 0x1.ae89f995ad3adp+0, 0x1.c199bdd85529cp+0, 0x1.d5818dcfba487p+0, + 0x1.ea4afa2a490dap+0 + }; + static const double c[] = + { + 0x1.62e42fefa398bp-5, 0x1.ebfbdff84555ap-11, 0x1.c6b08d4ad86d3p-17, + 0x1.3b2ad1b1716a2p-23, 0x1.5d7472718ce9dp-30, 0x1.4a1d7f457ac56p-37 + }; + + if (__glibc_unlikely ((ux << 11) == 0)) + { + uint32_t k = (ux >> 21) - 0x1fc; + if (k <= 0xb) + { + if (k == 0) + return 10.0f - 1.0f; + if (k == 4) + return 100.0f - 1.0f; + if (k == 6) + return 1000.0f - 1.0f; + if (k == 8) + return 10000.0f - 1.0f; + if (k == 9) + return 100000.0f - 1.0f; + if (k == 10) + return 1000000.0f - 1.0f; + if (k == 11) + return 10000000.0f - 1.0f; + } + } + double a = iln10h * z; + double ia = floor (a); + double h = (a - ia) + iln10l * z; + int64_t i = ia; + int64_t j = i & 0xf; + int64_t e = i - j; + e >>= 4; + double s = tb[j]; + s *= asdouble ((e + 0x3ffull) << 52); + double h2 = h * h; + double c0 = c[0] + h * c[1]; + double c2 = c[2] + h * c[3]; + double c4 = c[4] + h * c[5]; + c0 += h2 * (c2 + h2 * c4); + double w = s * h; + return (s - 1.0) + w * c0; + } +} +libm_alias_float (__exp10m1, exp10m1) diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps index 75e21ee2a8..de7e526008 100644 --- a/sysdeps/loongarch/lp64/libm-test-ulps +++ b/sysdeps/loongarch/lp64/libm-test-ulps @@ -977,22 +977,18 @@ ldouble: 3 Function: "exp10m1": double: 4 -float: 2 ldouble: 3 Function: "exp10m1_downward": double: 3 -float: 3 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 3 ldouble: 6 Function: "exp10m1_upward": double: 5 -float: 3 ldouble: 6 Function: "exp2": diff --git a/sysdeps/m68k/m680x0/fpu/libm-test-ulps b/sysdeps/m68k/m680x0/fpu/libm-test-ulps index 73e98eb479..96cd7b3fd5 100644 --- a/sysdeps/m68k/m680x0/fpu/libm-test-ulps +++ b/sysdeps/m68k/m680x0/fpu/libm-test-ulps @@ -882,19 +882,15 @@ ldouble: 3 Function: "exp10m1": double: 1 -float: 1 Function: "exp10m1_downward": double: 1 -float: 1 Function: "exp10m1_towardzero": double: 1 -float: 1 Function: "exp10m1_upward": double: 3 -float: 1 Function: "exp2": double: 1 diff --git a/sysdeps/mips/mips32/libm-test-ulps b/sysdeps/mips/mips32/libm-test-ulps index cc1600c2ab..829da3ebf1 100644 --- a/sysdeps/mips/mips32/libm-test-ulps +++ b/sysdeps/mips/mips32/libm-test-ulps @@ -779,19 +779,15 @@ float: 1 Function: "exp10m1": double: 3 -float: 2 Function: "exp10m1_downward": double: 3 -float: 3 Function: "exp10m1_towardzero": double: 2 -float: 3 Function: "exp10m1_upward": double: 4 -float: 4 Function: "exp2": double: 1 diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps index 88dbb20401..5c7b050b11 100644 --- a/sysdeps/mips/mips64/libm-test-ulps +++ b/sysdeps/mips/mips64/libm-test-ulps @@ -978,22 +978,18 @@ ldouble: 3 Function: "exp10m1": double: 3 -float: 2 ldouble: 3 Function: "exp10m1_downward": double: 3 -float: 3 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 3 ldouble: 6 Function: "exp10m1_upward": double: 4 -float: 4 ldouble: 6 Function: "exp2": diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps index dcefbcefcf..30ddbaf05f 100644 --- a/sysdeps/powerpc/fpu/libm-test-ulps +++ b/sysdeps/powerpc/fpu/libm-test-ulps @@ -1186,25 +1186,21 @@ ldouble: 4 Function: "exp10m1": double: 4 -float: 2 float128: 3 ldouble: 4 Function: "exp10m1_downward": double: 3 -float: 3 float128: 6 ldouble: 9 Function: "exp10m1_towardzero": double: 2 -float: 3 float128: 6 ldouble: 9 Function: "exp10m1_upward": double: 5 -float: 3 float128: 6 ldouble: 8 diff --git a/sysdeps/powerpc/nofpu/libm-test-ulps b/sysdeps/powerpc/nofpu/libm-test-ulps index 14be978dc3..e15e1f8fc0 100644 --- a/sysdeps/powerpc/nofpu/libm-test-ulps +++ b/sysdeps/powerpc/nofpu/libm-test-ulps @@ -987,22 +987,18 @@ ldouble: 4 Function: "exp10m1": double: 3 -float: 2 ldouble: 3 Function: "exp10m1_downward": double: 3 -float: 3 ldouble: 9 Function: "exp10m1_towardzero": double: 2 -float: 3 ldouble: 9 Function: "exp10m1_upward": double: 4 -float: 4 ldouble: 10 Function: "exp2": diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps index 4adad138b0..2271e1ee6c 100644 --- a/sysdeps/riscv/nofpu/libm-test-ulps +++ b/sysdeps/riscv/nofpu/libm-test-ulps @@ -974,7 +974,6 @@ ldouble: 3 Function: "exp10m1": double: 2 -float: 1 ldouble: 1 Function: "exp2": diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps index 07a8794e48..0bc61bf3c8 100644 --- a/sysdeps/riscv/rvd/libm-test-ulps +++ b/sysdeps/riscv/rvd/libm-test-ulps @@ -977,22 +977,18 @@ ldouble: 3 Function: "exp10m1": double: 4 -float: 2 ldouble: 3 Function: "exp10m1_downward": double: 3 -float: 3 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 3 ldouble: 6 Function: "exp10m1_upward": double: 5 -float: 3 ldouble: 6 Function: "exp2": diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps index 6e4084ae37..860479f3ff 100644 --- a/sysdeps/s390/fpu/libm-test-ulps +++ b/sysdeps/s390/fpu/libm-test-ulps @@ -978,22 +978,18 @@ ldouble: 3 Function: "exp10m1": double: 4 -float: 2 ldouble: 3 Function: "exp10m1_downward": double: 3 -float: 3 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 3 ldouble: 6 Function: "exp10m1_upward": double: 5 -float: 3 ldouble: 6 Function: "exp2": diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps index ccb6c794e7..4d8064b543 100644 --- a/sysdeps/sparc/fpu/libm-test-ulps +++ b/sysdeps/sparc/fpu/libm-test-ulps @@ -978,22 +978,18 @@ ldouble: 3 Function: "exp10m1": double: 3 -float: 2 ldouble: 3 Function: "exp10m1_downward": double: 3 -float: 3 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 3 ldouble: 6 Function: "exp10m1_upward": double: 4 -float: 4 ldouble: 6 Function: "exp2": diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps index 80d0125642..3f438c2090 100644 --- a/sysdeps/x86_64/fpu/libm-test-ulps +++ b/sysdeps/x86_64/fpu/libm-test-ulps @@ -1431,25 +1431,21 @@ float: 1 Function: "exp10m1": double: 4 -float: 2 float128: 3 ldouble: 4 Function: "exp10m1_downward": double: 3 -float: 3 float128: 6 ldouble: 6 Function: "exp10m1_towardzero": double: 2 -float: 3 float128: 6 ldouble: 5 Function: "exp10m1_upward": double: 5 -float: 4 float128: 6 ldouble: 6

[09/17] math: Use exp10m1f from CORE-MATH

Commit Message

Comments

Patch