Message ID | 20250131191844.2582716-12-adhemerval.zanella@linaro.org |
---|---|
State | New |
Series | Add c23 CORE-MATH binary32 implementations to libm |
I confirm all binary32 inputs yield correct rounding on x86-64 with
all rounding modes.

Paul

> From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
> Cc: DJ Delorie <dj@redhat.com>,
>  Joseph Myers <josmyers@redhat.com>,
>  Paul Zimmermann <Paul.Zimmermann@inria.fr>,
>  Alexei Sibidanov <sibid@uvic.ca>
> Date: Fri, 31 Jan 2025 16:17:15 -0300
>
> The CORE-MATH implementation is correctly rounded (for any rounding mode)
> and performs better than the generic atanpif.
>
> The code was adapted to glibc style and to use the definitions in
> math_config.h (to handle errno, overflow, and underflow).
>
> Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
> gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
>
> latency                      master       patched  improvement
> x86_64                      66.3296       52.7558       20.46%
> x86_64v2                    66.0429       51.4007       22.17%
> x86_64v3                    60.6294       48.7876       19.53%
> aarch64 (Neoverse)          24.3163       20.9110       14.00%
> power8                      16.5766       13.3620       19.39%
> power10                     16.5115       13.4072       18.80%
>
> reciprocal-throughput        master       patched  improvement
> x86_64                      30.8599       16.0866       47.87%
> x86_64v2                    29.2286       15.4688       47.08%
> x86_64v3                    23.0960       12.8510       44.36%
> aarch64 (Neoverse)          15.4619       10.6752       30.96%
> power8                       7.9200        5.2483       33.73%
> power10                      6.8539        4.6262       32.50%
> ---
>  SHARED-FILES                                  |   4 +
>  sysdeps/aarch64/libm-test-ulps                |   4 -
>  sysdeps/arc/fpu/libm-test-ulps                |   4 -
>  sysdeps/arc/nofpu/libm-test-ulps              |   1 -
>  sysdeps/arm/libm-test-ulps                    |   4 -
>  sysdeps/hppa/fpu/libm-test-ulps               |   4 -
>  sysdeps/i386/fpu/libm-test-ulps               |   4 -
>  .../i386/i686/fpu/multiarch/libm-test-ulps    |   4 -
>  sysdeps/ieee754/flt-32/s_atanpif.c            | 109 ++++++++++++++++++
>  sysdeps/loongarch/lp64/libm-test-ulps         |   4 -
>  sysdeps/mips/mips64/libm-test-ulps            |   4 -
>  sysdeps/or1k/fpu/libm-test-ulps               |   4 -
>  sysdeps/or1k/nofpu/libm-test-ulps             |   1 -
>  sysdeps/powerpc/fpu/libm-test-ulps            |   4 -
>  sysdeps/riscv/nofpu/libm-test-ulps            |   1 -
>  sysdeps/riscv/rvd/libm-test-ulps              |   4 -
>  sysdeps/s390/fpu/libm-test-ulps               |   4 -
>  sysdeps/sparc/fpu/libm-test-ulps              |   4 -
>  sysdeps/x86_64/fpu/libm-test-ulps             |   4 -
>  19 files changed, 113 insertions(+), 59 deletions(-)
>  create mode 100644 sysdeps/ieee754/flt-32/s_atanpif.c
diff --git a/SHARED-FILES b/SHARED-FILES
index b403a2a6f0..5702a2d1c3 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -346,3 +346,7 @@ sysdeps/ieee754/flt-32/s_atan2pif.c:
   (src/binary32/atan2pi/atan2pif.c in CORE-MATH)
   - the code was adapted to use glibc code style and internal
     functions to handle errno, overflow, and underflow.
+sysdeps/ieee754/flt-32/s_atanpif.c:
+  (src/binary32/atanpi/atanpif.c in CORE-MATH)
+  - the code was adapted to use glibc code style and internal
+    functions to handle errno, overflow, and underflow.
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index be29b37721..10f182a211 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -218,22 +218,18 @@ ldouble: 4
 
 Function: "atanpi":
 double: 2
-float: 1
 ldouble: 2
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "cabs":
diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps
index 1383c88b95..7fb407cecd 100644
--- a/sysdeps/arc/fpu/libm-test-ulps
+++ b/sysdeps/arc/fpu/libm-test-ulps
@@ -123,19 +123,15 @@ double: 3
 
 Function: "atanpi":
 double: 2
-float: 1
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 
 Function: "cabs":
 double: 1
diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps
index 9028f5cbe7..1859c2bd4f 100644
--- a/sysdeps/arc/nofpu/libm-test-ulps
+++ b/sysdeps/arc/nofpu/libm-test-ulps
@@ -30,7 +30,6 @@ double: 2
 
 Function: "atanpi":
 double: 2
-float: 1
 
 Function: "cabs":
 double: 1
diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps
index e1c538f79f..fa9864adee 100644
--- a/sysdeps/arm/libm-test-ulps
+++ b/sysdeps/arm/libm-test-ulps
@@ -120,19 +120,15 @@ double: 3
 
 Function: "atanpi":
 double: 2
-float: 1
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 
 Function: "cabs":
 double: 1
diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps
index 796da7b5ab..a59f61fc4e 100644
--- a/sysdeps/hppa/fpu/libm-test-ulps
+++ b/sysdeps/hppa/fpu/libm-test-ulps
@@ -120,19 +120,15 @@ double: 3
 
 Function: "atanpi":
 double: 2
-float: 1
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 
 Function: "cabs":
 double: 1
diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps
index 4f687c762b..8aa52f4600 100644
--- a/sysdeps/i386/fpu/libm-test-ulps
+++ b/sysdeps/i386/fpu/libm-test-ulps
@@ -201,25 +201,21 @@ ldouble: 5
 
 Function: "atanpi":
 double: 1
-float: 1
 float128: 2
 ldouble: 1
 
 Function: "atanpi_downward":
 double: 2
-float: 2
 float128: 1
 ldouble: 2
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 float128: 1
 ldouble: 2
 
 Function: "atanpi_upward":
 double: 2
-float: 1
 float128: 2
 ldouble: 1
 
diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
index f24c87b302..8032636808 100644
--- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
+++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
@@ -201,25 +201,21 @@ ldouble: 5
 
 Function: "atanpi":
 double: 1
-float: 1
 float128: 2
 ldouble: 2
 
 Function: "atanpi_downward":
 double: 2
-float: 2
 float128: 1
 ldouble: 2
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 float128: 1
 ldouble: 2
 
 Function: "atanpi_upward":
 double: 2
-float: 1
 float128: 2
 ldouble: 1
 
diff --git a/sysdeps/ieee754/flt-32/s_atanpif.c b/sysdeps/ieee754/flt-32/s_atanpif.c
new file mode 100644
index 0000000000..40ca9f5053
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/s_atanpif.c
@@ -0,0 +1,109 @@
+/* Correctly-rounded half-revolution arc-tangent of binary32 value.
+
+Copyright (c) 2022-2025 Alexei Sibidanov.
+
+The original version of this file was copied from the CORE-MATH
+project (file src/binary32/atanpi/atanpif.c, revision e02000e).
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
+*/
+
+#include <errno.h>
+#include <math.h>
+#include <stdint.h>
+#include <libm-alias-float.h>
+#include "math_config.h"
+
+float
+__atanpif (float x)
+{
+  uint32_t t = asuint (x);
+  int32_t e = (t >> 23) & 0xff;
+  bool gt = e >= 127;
+  if (__glibc_unlikely (e > 127 + 24))
+    {
+      float f = copysignf (0.5f, x);
+      if (__glibc_unlikely (e == 0xff))
+        {
+          if (t << 9)
+            return x + x; /* nan */
+          return f; /* inf */
+        }
+      /* Warning: 0x1.45f306p-2f / x underflows for |x| >= 0x1.45f306p+124 */
+      if (fabsf (x) >= 0x1.45f306p+124f)
+        return f - 4.0f / x;
+      else
+        return f - 0x1.45f306p-2f / x;
+    }
+  double z = x;
+  if (__glibc_unlikely (e < 127 - 13))
+    {
+      double sx = z * 0x1.45f306dc9c883p-2;
+      if (__glibc_unlikely (e < 127 - 25))
+        {
+          float rsx = sx;
+          if (x != 0 && rsx == 0)
+            __set_errno (ERANGE);
+          return rsx;
+        }
+      return sx - (0x1.5555555555555p-2 * sx) * (x * x);
+    }
+  uint32_t ax = t & (~0u >> 1);
+  if (__glibc_unlikely (ax == 0x3fa267ddu))
+    return copysignf (0x1.267004p-2f, x) - copysignf (0x1p-55f, x);
+  if (__glibc_unlikely (ax == 0x3f693531u))
+    return copysignf (0x1.e1a662p-3f, x) + copysignf (0x1p-28f, x);
+  if (__glibc_unlikely (ax == 0x3f800000u))
+    return copysignf (0x1p-2f, x);
+  if (gt)
+    z = 1 / z;
+  double z2 = z * z;
+  double z4 = z2 * z2;
+  double z8 = z4 * z4;
+  static const double cn[] =
+    {
+      0x1.45f306dc9c882p-2, 0x1.733b561bc23d5p-1, 0x1.28d9805bdfbf2p-1,
+      0x1.8c3ba966ae287p-3, 0x1.94a7f81ee634bp-6, 0x1.a6bbf6127a6dfp-11
+    };
+  static const double cd[] =
+    {
+      0x1p+0, 0x1.4e3b3ecc2518fp+1, 0x1.3ef4a360ff063p+1,
+      0x1.0f1dc55bad551p+0, 0x1.8da0fecc018a4p-3, 0x1.8fa87803776bfp-7,
+      0x1.dadf2ca0acb43p-14
+    };
+  double cn0 = cn[0] + z2 * cn[1];
+  double cn2 = cn[2] + z2 * cn[3];
+  double cn4 = cn[4] + z2 * cn[5];
+  cn0 += z4 * cn2;
+  cn0 += z8 * cn4;
+  cn0 *= z;
+  double cd0 = cd[0] + z2 * cd[1];
+  double cd2 = cd[2] + z2 * cd[3];
+  double cd4 = cd[4] + z2 * cd[5];
+  double cd6 = cd[6];
+  cd0 += z4 * cd2;
+  cd4 += z4 * cd6;
+  cd0 += z8 * cd4;
+  double r = cn0 / cd0;
+  if (gt)
+    r = copysign (0.5, z) - r;
+  return r;
+}
+libm_alias_float (__atanpi, atanpi)
diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps
index d5adc119cf..0cac55cbe4 100644
--- a/sysdeps/loongarch/lp64/libm-test-ulps
+++ b/sysdeps/loongarch/lp64/libm-test-ulps
@@ -162,22 +162,18 @@ ldouble: 4
 
 Function: "atanpi":
 double: 2
-float: 1
 ldouble: 2
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "cabs":
diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps
index c901b00f20..1b5bcff11e 100644
--- a/sysdeps/mips/mips64/libm-test-ulps
+++ b/sysdeps/mips/mips64/libm-test-ulps
@@ -162,22 +162,18 @@ ldouble: 4
 
 Function: "atanpi":
 double: 2
-float: 1
 ldouble: 2
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "cabs":
diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps
index 9934382bde..a608e3c949 100644
--- a/sysdeps/or1k/fpu/libm-test-ulps
+++ b/sysdeps/or1k/fpu/libm-test-ulps
@@ -120,19 +120,15 @@ double: 3
 
 Function: "atanpi":
 double: 2
-float: 1
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 
 Function: "cabs":
 double: 1
diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps
index 7ff5ee4425..56986f0be0 100644
--- a/sysdeps/or1k/nofpu/libm-test-ulps
+++ b/sysdeps/or1k/nofpu/libm-test-ulps
@@ -93,7 +93,6 @@ double: 3
 
 Function: "atanpi":
 double: 2
-float: 1
 
 Function: "cabs":
 double: 1
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index b1c01b4d94..630111e6c4 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -206,25 +206,21 @@ ldouble: 4
 
 Function: "atanpi":
 double: 2
-float: 1
 float128: 2
 ldouble: 1
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 float128: 1
 ldouble: 2
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 float128: 1
 ldouble: 3
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 float128: 2
 ldouble: 5
 
diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps
index f55df65c6a..087dcd79fc 100644
--- a/sysdeps/riscv/nofpu/libm-test-ulps
+++ b/sysdeps/riscv/nofpu/libm-test-ulps
@@ -126,7 +126,6 @@ ldouble: 4
 
 Function: "atanpi":
 double: 2
-float: 1
 ldouble: 2
 
 Function: "cabs":
diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps
index 879f5c5669..efd83affa4 100644
--- a/sysdeps/riscv/rvd/libm-test-ulps
+++ b/sysdeps/riscv/rvd/libm-test-ulps
@@ -166,22 +166,18 @@ ldouble: 4
 
 Function: "atanpi":
 double: 2
-float: 1
 ldouble: 2
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "cabs":
diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
index c4a27b96ad..709debb205 100644
--- a/sysdeps/s390/fpu/libm-test-ulps
+++ b/sysdeps/s390/fpu/libm-test-ulps
@@ -162,22 +162,18 @@ ldouble: 4
 
 Function: "atanpi":
 double: 2
-float: 1
 ldouble: 2
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "cabs":
diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
index fbf1507bd9..becf5da3d6 100644
--- a/sysdeps/sparc/fpu/libm-test-ulps
+++ b/sysdeps/sparc/fpu/libm-test-ulps
@@ -162,22 +162,18 @@ ldouble: 4
 
 Function: "atanpi":
 double: 2
-float: 1
 ldouble: 2
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 ldouble: 1
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "cabs":
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index a340df6243..8c5d4fd471 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -339,25 +339,21 @@ float: 1
 
 Function: "atanpi":
 double: 2
-float: 1
 float128: 2
 ldouble: 2
 
 Function: "atanpi_downward":
 double: 1
-float: 2
 float128: 1
 ldouble: 2
 
 Function: "atanpi_towardzero":
 double: 1
-float: 2
 float128: 1
 ldouble: 2
 
 Function: "atanpi_upward":
 double: 1
-float: 1
 float128: 2
 ldouble: 1
 
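The branch structure of `__atanpif` in the patch follows from the biased binary32 exponent: `e > 127 + 24` means |x| >= 2^25 (or inf/nan), `e < 127 - 25` means |x| < 2^-25, and `e < 127 - 13` means |x| < 2^-13. What follows is a minimal standalone sketch of that classification and of the two closed-form paths, assuming plain C99 outside glibc (`memcpy` stands in for `asuint`). The names `classify`, `atanpi_small`, `atanpi_huge`, `INV_PI`, and the `PATH_*` enumerators are hypothetical and only mirror the thresholds and constants visible in the diff. It leaves out the exact-case checks, the errno handling, and the rational approximation, and it evaluates the large-argument form in double where the patch uses float.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* 1/pi rounded to double; the same constant appears in the patch.  */
static const double INV_PI = 0x1.45f306dc9c883p-2;

enum atanpi_path { PATH_HUGE, PATH_TINY, PATH_SMALL, PATH_RATIONAL };

/* Biased binary32 exponent, extracted as the patch does via asuint.  */
static int
biased_exponent (float x)
{
  uint32_t t;
  memcpy (&t, &x, sizeof t);
  return (t >> 23) & 0xff;
}

/* Same exponent thresholds as the patch, without the special cases.  */
static enum atanpi_path
classify (float x)
{
  int e = biased_exponent (x);
  if (e > 127 + 24)
    return PATH_HUGE;      /* |x| >= 2^25, inf, or nan */
  if (e < 127 - 25)
    return PATH_TINY;      /* |x| < 2^-25: result is x/pi rounded */
  if (e < 127 - 13)
    return PATH_SMALL;     /* |x| < 2^-13: two-term series */
  return PATH_RATIONAL;    /* everything else: cn/cd rational form */
}

/* Two-term series for small |x|: atanpi(x) ~ x/pi - x^3/(3*pi).  */
static double
atanpi_small (double x)
{
  double sx = x * INV_PI;
  return sx - (0x1.5555555555555p-2 * sx) * (x * x);
}

/* Asymptotic form for huge |x|: atanpi(x) ~ sign(x)/2 - 1/(pi*x).  */
static double
atanpi_huge (double x)
{
  return copysign (0.5, x) - INV_PI / x;
}
```

For example, `classify (0x1p24f)` lands on the rational path while `classify (0x1p25f)` takes the large-argument path, and `atanpi_small (0x1p-14)` agrees with `atan (0x1p-14) / acos (-1.0)` to within double rounding error, since the first omitted series term is of order x^5.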