Message ID | 20250131191844.2582716-15-adhemerval.zanella@linaro.org |
---|---|
State | New |
Series | Add c23 CORE-MATH binary32 implementations to libm |
I confirm all binary32 inputs are correctly rounded on x86-64 for all rounding modes.

Paul

> From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
> Cc: DJ Delorie <dj@redhat.com>,
>  Joseph Myers <josmyers@redhat.com>,
>  Paul Zimmermann <Paul.Zimmermann@inria.fr>,
>  Alexei Sibidanov <sibid@uvic.ca>
> Date: Fri, 31 Jan 2025 16:17:18 -0300
>
> The CORE-MATH implementation is correctly rounded (for any rounding mode)
> and shows better performance than the generic tanpif.
>
> The code was adapted to glibc style and to use the definition of
> math_config.h (to handle errno, overflow, and underflow).
>
> Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
> gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
>
> latency                      master    patched   improvement
> x86_64                      85.1683    47.7990        43.88%
> x86_64v2                    76.8219    41.4679        46.02%
> x86_64v3                    73.7775    37.7734        48.80%
> aarch64 (Neoverse)          35.4514    18.0742        49.02%
> power8                      22.7604    10.1054        55.60%
> power10                     22.1358     9.9553        55.03%
>
> reciprocal-throughput        master    patched   improvement
> x86_64                      41.0174    19.4718        52.53%
> x86_64v2                    34.8565    11.3761        67.36%
> x86_64v3                    34.0325     9.6989        71.50%
> aarch64 (Neoverse)          25.4349     9.2017        63.82%
> power8                      13.8626     3.8486        72.24%
> power10                     11.7933     3.6420        69.12%
> ---
>  SHARED-FILES                                  |  4 +
>  sysdeps/aarch64/libm-test-ulps                |  4 -
>  sysdeps/arc/fpu/libm-test-ulps                |  4 -
>  sysdeps/arc/nofpu/libm-test-ulps              |  1 -
>  sysdeps/arm/libm-test-ulps                    |  4 -
>  sysdeps/hppa/fpu/libm-test-ulps               |  4 -
>  sysdeps/i386/fpu/libm-test-ulps               |  4 -
>  .../i386/i686/fpu/multiarch/libm-test-ulps    |  4 -
>  sysdeps/ieee754/flt-32/math_config.h          | 25 ++++++
>  sysdeps/ieee754/flt-32/s_tanpif.c             | 88 +++++++++++++++++++
>  sysdeps/loongarch/lp64/libm-test-ulps         |  4 -
>  sysdeps/mips/mips64/libm-test-ulps            |  4 -
>  sysdeps/or1k/fpu/libm-test-ulps               |  4 -
>  sysdeps/or1k/nofpu/libm-test-ulps             |  1 -
>  sysdeps/powerpc/fpu/libm-test-ulps            |  4 -
>  sysdeps/powerpc/fpu/math_private.h            |  1 +
>  sysdeps/riscv/nofpu/libm-test-ulps            |  1 -
>  sysdeps/riscv/rvd/libm-test-ulps              |  4 -
>  sysdeps/s390/fpu/libm-test-ulps               |  4 -
>  sysdeps/sparc/fpu/libm-test-ulps              |  4 -
>  sysdeps/x86_64/fpu/libm-test-ulps             |  4 -
>  21 files changed, 118 insertions(+), 59 deletions(-)
>  create mode 100644 sysdeps/ieee754/flt-32/s_tanpif.c
>
> diff --git a/SHARED-FILES b/SHARED-FILES
> index c108f3b308..25ece987f1 100644
> --- a/SHARED-FILES
> +++ b/SHARED-FILES
> @@ -358,3 +358,7 @@ sysdeps/ieee754/flt-32/s_sinpif.c:
>    (src/binary32/sinpi/sinpif.c in CORE-MATH)
>    - the code was adapted to use glibc code style and internal
>      functions to handle errno, overflow, and underflow.
> +sysdeps/ieee754/flt-32/s_tanpif.c:
> +  (src/binary32/tanpi/tanpif.c in CORE-MATH)
> +  - the code was adapted to use glibc code style and internal
> +    functions to handle errno, overflow, and underflow.
> diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
> index c6c93aa0e4..3bcd0e5ae4 100644
> --- a/sysdeps/aarch64/libm-test-ulps
> +++ b/sysdeps/aarch64/libm-test-ulps
> @@ -1681,7 +1681,6 @@ ldouble: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  ldouble: 3
>
>  Function: "tanpi_advsimd":
> @@ -1690,7 +1689,6 @@ float: 2
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_sve":
> @@ -1699,12 +1697,10 @@ float: 2
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  ldouble: 4
>
>  Function: "tgamma":
> diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps
> index 65ebf6f9a0..24e6036954 100644
> --- a/sysdeps/arc/fpu/libm-test-ulps
> +++ b/sysdeps/arc/fpu/libm-test-ulps
> @@ -1137,19 +1137,15 @@ double: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>
>  Function: "tgamma":
>  double: 9
> diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps
> index 3ba4f01cbf..6359d6e038 100644
> --- a/sysdeps/arc/nofpu/libm-test-ulps
> +++ b/sysdeps/arc/nofpu/libm-test-ulps
> @@ -271,7 +271,6 @@ double: 2
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>
>  Function: "tgamma":
>  double: 9
> diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps
> index f887712d8e..273c54dd4d 100644
> --- a/sysdeps/arm/libm-test-ulps
> +++ b/sysdeps/arm/libm-test-ulps
> @@ -1130,19 +1130,15 @@ double: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>
>  Function: "tgamma":
>  double: 9
> diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps
> index 10f7f2ebde..723cb79d12 100644
> --- a/sysdeps/hppa/fpu/libm-test-ulps
> +++ b/sysdeps/hppa/fpu/libm-test-ulps
> @@ -1160,19 +1160,15 @@ double: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>
>  Function: "tgamma":
>  double: 9
> diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps
> index 77aa7155db..8107d2fa2c 100644
> --- a/sysdeps/i386/fpu/libm-test-ulps
> +++ b/sysdeps/i386/fpu/libm-test-ulps
> @@ -1750,25 +1750,21 @@ ldouble: 4
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  float128: 3
>  ldouble: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  float128: 4
>  ldouble: 4
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  float128: 4
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  float128: 4
>  ldouble: 4
>
> diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
> index 7168d577d8..b99c50214c 100644
> --- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
> +++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
> @@ -1755,25 +1755,21 @@ ldouble: 4
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  float128: 3
>  ldouble: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  float128: 4
>  ldouble: 4
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  float128: 4
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  float128: 4
>  ldouble: 4
>
> diff --git a/sysdeps/ieee754/flt-32/math_config.h b/sysdeps/ieee754/flt-32/math_config.h
> index 035461199c..8d9c8ee3ad 100644
> --- a/sysdeps/ieee754/flt-32/math_config.h
> +++ b/sysdeps/ieee754/flt-32/math_config.h
> @@ -84,6 +84,31 @@ roundeven_finite (double x)
>  #endif
>  }
>
> +#ifndef ROUNDEVENF_INTRINSICS
> +/* When set, roundevenf_finite will route to the internal roundevenf function. */
> +# define ROUNDEVENF_INTRINSICS 1
> +#endif
> +
> +static inline float
> +roundevenf_finite (float x)
> +{
> +  if (!isfinite (x))
> +    __builtin_unreachable ();
> +#if ROUNDEVENF_INTRINSICS
> +  return roundevenf (x);
> +#else
> +  float y = roundf (x);
> +  if (fabs (x - y) == 0.5)
> +    {
> +      union { float f; uint32_t i; } u = {y};
> +      union { float f; uint32_t i; } v = {y - copysignf (1.0, x)};
> +      if (__builtin_ctzl (v.i) > __builtin_ctzl (u.i))
> +        y = v.f;
> +    }
> +  return y;
> +#endif
> +}
> +
>  static inline uint32_t
>  asuint (float f)
>  {
> diff --git a/sysdeps/ieee754/flt-32/s_tanpif.c b/sysdeps/ieee754/flt-32/s_tanpif.c
> new file mode 100644
> index 0000000000..efbc47b507
> --- /dev/null
> +++ b/sysdeps/ieee754/flt-32/s_tanpif.c
> @@ -0,0 +1,88 @@
> +/* Correctly-rounded tangent of binary32 value for angles in half-revolutions
> +
> +Copyright (c) 2022-2025 Alexei Sibidanov.
> +
> +The original version of this file was copied from the CORE-MATH
> +project (src/binary32/tanpi/tanpif.c, revision 3bbf907).
> +
> +Permission is hereby granted, free of charge, to any person obtaining a copy
> +of this software and associated documentation files (the "Software"), to deal
> +in the Software without restriction, including without limitation the rights
> +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> +copies of the Software, and to permit persons to whom the Software is
> +furnished to do so, subject to the following conditions:
> +
> +The above copyright notice and this permission notice shall be included in all
> +copies or substantial portions of the Software.
> +
> +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
> +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> +SOFTWARE.
> +
> +*/
> +
> +#include <stdint.h>
> +#include <errno.h>
> +#include <libm-alias-float.h>
> +#include "math_config.h"
> +
> +float
> +__tanpif (float x)
> +{
> +  uint32_t ix = asuint (x);
> +  uint32_t e = ix & (0xff << 23);
> +  if (__glibc_unlikely (e > (150 << 23)))
> +    {
> +      if (e == (0xff << 23))
> +        {
> +          if (!(ix << 9))
> +            return __math_invalidf (x);
> +          return x + x; /* nan */
> +        }
> +      return copysign (0.0f, x);
> +    }
> +  float x4 = 4.0f * x;
> +  float nx4 = roundevenf_finite (x4);
> +  float dx4 = x4 - nx4;
> +  float ni = roundevenf_finite (x);
> +  float zf = x - ni;
> +  if (__glibc_unlikely (dx4 == 0.0f))
> +    {
> +      int k = x4;
> +      if (k & 1)
> +        return copysignf (1.0f, zf);
> +      k &= 7;
> +      if (k == 0)
> +        return copysignf (0.0f, x);
> +      if (k == 4)
> +        return -copysignf (0.0f, x);
> +      __set_errno (ERANGE);
> +      if (k == 2)
> +        return 1.0f / 0.0f;
> +      if (k == 6)
> +        return -1.0f / 0.0f;
> +    }
> +  ix = asuint (zf);
> +  uint32_t a = ix & (~0u >> 1);
> +  if (__glibc_unlikely (a == 0x3e933802u))
> +    return copysignf (0x1.44cfbap+0f, zf) + copysignf (0x1p-25f, zf);
> +  if (__glibc_unlikely (a == 0x38f26685u))
> +    return copysignf (0x1.7cc304p-12, zf) + copysignf (0x1p-37f, zf);
> +
> +  double z = zf, z2 = z * z;
> +
> +  static const double cn[] = { 0x1.921fb54442d19p-1, -0x1.1f458b3e1f8d6p-2,
> +                               0x1.68a34bd0b8f6ap-6, -0x1.e4866f7a25f99p-13 };
> +  static const double cd[] = { 0x1p+0, -0x1.4b4b98d2df3a7p-1,
> +                               0x1.8e9926d2bb901p-4, -0x1.a6f77fd847eep-9 };
> +  double z4
> +      = z2 * z2,
> +      r = (z - z * z2) * ((cn[0] + z2 * cn[1]) + z4 * (cn[2] + z2 * cn[3]))
> +          / (((cd[0] + z2 * cd[1]) + z4 * (cd[2] + z2 * cd[3])) * (0.25 - z2));
> +  return r;
> +}
> +libm_alias_float (__tanpi, tanpi)
> diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps
> index 4fadba43c2..b4a6a3ba35 100644
> --- a/sysdeps/loongarch/lp64/libm-test-ulps
> +++ b/sysdeps/loongarch/lp64/libm-test-ulps
> @@ -1437,22 +1437,18 @@ ldouble: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  ldouble: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  ldouble: 4
>
>  Function: "tgamma":
> diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps
> index 5177b54557..3b1c725fae 100644
> --- a/sysdeps/mips/mips64/libm-test-ulps
> +++ b/sysdeps/mips/mips64/libm-test-ulps
> @@ -1449,22 +1449,18 @@ ldouble: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  ldouble: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  ldouble: 4
>
>  Function: "tgamma":
> diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps
> index 1fb4ec57c0..accf30904d 100644
> --- a/sysdeps/or1k/fpu/libm-test-ulps
> +++ b/sysdeps/or1k/fpu/libm-test-ulps
> @@ -1115,19 +1115,15 @@ double: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>
>  Function: "tgamma":
>  double: 9
> diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps
> index aff11b5148..f3d5604e6a 100644
> --- a/sysdeps/or1k/nofpu/libm-test-ulps
> +++ b/sysdeps/or1k/nofpu/libm-test-ulps
> @@ -1015,7 +1015,6 @@ double: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>
>  Function: "tgamma":
>  double: 9
> diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
> index e59c3e47ef..404d3afc27 100644
> --- a/sysdeps/powerpc/fpu/libm-test-ulps
> +++ b/sysdeps/powerpc/fpu/libm-test-ulps
> @@ -1857,25 +1857,21 @@ ldouble: 6
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  float128: 2
>  ldouble: 2
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  float128: 4
>  ldouble: 8
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  float128: 4
>  ldouble: 8
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  float128: 4
>  ldouble: 8
>
> diff --git a/sysdeps/powerpc/fpu/math_private.h b/sysdeps/powerpc/fpu/math_private.h
> index aace1a8708..7065d276c0 100644
> --- a/sysdeps/powerpc/fpu/math_private.h
> +++ b/sysdeps/powerpc/fpu/math_private.h
> @@ -62,6 +62,7 @@ __ieee754_sqrtf128 (_Float128 __x)
>  #ifdef _ARCH_PWR6
>  /* ISA 2.03 provides frin/round() and cntlzw/ctznll(). */
>  # define ROUNDEVEN_INTRINSICS 0
> +# define ROUNDEVENF_INTRINSICS 0
>  #endif
>
>  #endif /* _PPC_MATH_PRIVATE_H_ */
> diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps
> index 2545d0e166..720250e208 100644
> --- a/sysdeps/riscv/nofpu/libm-test-ulps
> +++ b/sysdeps/riscv/nofpu/libm-test-ulps
> @@ -1306,7 +1306,6 @@ ldouble: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  ldouble: 3
>
>  Function: "tgamma":
> diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps
> index 94534a4f80..ee5df22f81 100644
> --- a/sysdeps/riscv/rvd/libm-test-ulps
> +++ b/sysdeps/riscv/rvd/libm-test-ulps
> @@ -1452,22 +1452,18 @@ ldouble: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  ldouble: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  ldouble: 4
>
>  Function: "tgamma":
> diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
> index 2c89048b56..1491089e84 100644
> --- a/sysdeps/s390/fpu/libm-test-ulps
> +++ b/sysdeps/s390/fpu/libm-test-ulps
> @@ -1434,22 +1434,18 @@ ldouble: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  ldouble: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  ldouble: 4
>
>  Function: "tgamma":
> diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
> index 3af2355545..d894901bbd 100644
> --- a/sysdeps/sparc/fpu/libm-test-ulps
> +++ b/sysdeps/sparc/fpu/libm-test-ulps
> @@ -1449,22 +1449,18 @@ ldouble: 3
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  ldouble: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  ldouble: 4
>
>  Function: "tgamma":
> diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
> index f6da5ba186..a4bd2edcbc 100644
> --- a/sysdeps/x86_64/fpu/libm-test-ulps
> +++ b/sysdeps/x86_64/fpu/libm-test-ulps
> @@ -2292,25 +2292,21 @@ double: 1
>
>  Function: "tanpi":
>  double: 3
> -float: 3
>  float128: 3
>  ldouble: 3
>
>  Function: "tanpi_downward":
>  double: 2
> -float: 3
>  float128: 4
>  ldouble: 4
>
>  Function: "tanpi_towardzero":
>  double: 2
> -float: 3
>  float128: 4
>  ldouble: 4
>
>  Function: "tanpi_upward":
>  double: 2
> -float: 4
>  float128: 4
>  ldouble: 4
>
> --
> 2.43.0
>
>
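
For readers following the `dx4 == 0.0f` branch of `__tanpif`: when 4x is an integer, the result is exact and depends only on 4x modulo 8. The sketch below restates that case table on its own. `tanpif_exact_sketch` is a hypothetical helper name, not part of the patch; it assumes the caller already knows that 4x is an integer small enough to fit in an `int`, and unlike the patch it does not set `errno` to `ERANGE` at the poles.

```c
#include <assert.h>
#include <math.h> /* INFINITY and signbit are macros; no libm call is made */

/* Hypothetical helper (not in the patch): tanpi (x) for inputs where
   4 * x is an integer that fits in an int.  errno is left untouched.  */
float
tanpif_exact_sketch (float x)
{
  int k = (int) (4.0f * x); /* exact, since x is a multiple of 1/4 */
  float zero = signbit (x) ? -0.0f : 0.0f;
  switch (k & 7) /* position of x within an interval of length 2 */
    {
    case 0:
      return zero; /* even integer: zero with the sign of x */
    case 4:
      return -zero; /* odd integer: zero with the opposite sign */
    case 2:
      return INFINITY; /* x = 1/2 (mod 2): pole */
    case 6:
      return -INFINITY; /* x = 3/2 (mod 2): pole */
    case 1:
    case 5:
      return 1.0f; /* x = 1/4 (mod 1) */
    default:
      return -1.0f; /* x = 3/4 (mod 1), i.e. k & 7 is 3 or 7 */
    }
}

/* Small usage example covering one input per case.  */
void
tanpif_exact_sketch_demo (void)
{
  assert (tanpif_exact_sketch (0.25f) == 1.0f);
  assert (tanpif_exact_sketch (0.75f) == -1.0f);
  assert (tanpif_exact_sketch (0.5f) == INFINITY);
  assert (tanpif_exact_sketch (1.5f) == -INFINITY);
  assert (signbit (tanpif_exact_sketch (1.0f)));
  assert (!signbit (tanpif_exact_sketch (2.0f)));
}
```

The patch reaches the same values slightly differently: for odd 4x it returns `copysignf (1.0f, zf)` using the reduced argument `zf`, which gives the same sign as the `k & 7` lookup used here.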
diff --git a/SHARED-FILES b/SHARED-FILES
index c108f3b308..25ece987f1 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -358,3 +358,7 @@ sysdeps/ieee754/flt-32/s_sinpif.c:
   (src/binary32/sinpi/sinpif.c in CORE-MATH)
   - the code was adapted to use glibc code style and internal
     functions to handle errno, overflow, and underflow.
+sysdeps/ieee754/flt-32/s_tanpif.c:
+  (src/binary32/tanpi/tanpif.c in CORE-MATH)
+  - the code was adapted to use glibc code style and internal
+    functions to handle errno, overflow, and underflow.
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index c6c93aa0e4..3bcd0e5ae4 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -1681,7 +1681,6 @@ ldouble: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tanpi_advsimd":
@@ -1690,7 +1689,6 @@ float: 2
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_sve":
@@ -1699,12 +1697,10 @@ float: 2
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 ldouble: 4
 
 Function: "tgamma":
diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps
index 65ebf6f9a0..24e6036954 100644
--- a/sysdeps/arc/fpu/libm-test-ulps
+++ b/sysdeps/arc/fpu/libm-test-ulps
@@ -1137,19 +1137,15 @@ double: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps
index 3ba4f01cbf..6359d6e038 100644
--- a/sysdeps/arc/nofpu/libm-test-ulps
+++ b/sysdeps/arc/nofpu/libm-test-ulps
@@ -271,7 +271,6 @@ double: 2
 
 Function: "tanpi":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps
index f887712d8e..273c54dd4d 100644
--- a/sysdeps/arm/libm-test-ulps
+++ b/sysdeps/arm/libm-test-ulps
@@ -1130,19 +1130,15 @@ double: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps
index 10f7f2ebde..723cb79d12 100644
--- a/sysdeps/hppa/fpu/libm-test-ulps
+++ b/sysdeps/hppa/fpu/libm-test-ulps
@@ -1160,19 +1160,15 @@ double: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps
index 77aa7155db..8107d2fa2c 100644
--- a/sysdeps/i386/fpu/libm-test-ulps
+++ b/sysdeps/i386/fpu/libm-test-ulps
@@ -1750,25 +1750,21 @@ ldouble: 4
 
 Function: "tanpi":
 double: 3
-float: 3
 float128: 3
 ldouble: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 float128: 4
 ldouble: 4
 
diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
index 7168d577d8..b99c50214c 100644
--- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
+++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
@@ -1755,25 +1755,21 @@ ldouble: 4
 
 Function: "tanpi":
 double: 3
-float: 3
 float128: 3
 ldouble: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 float128: 4
 ldouble: 4
 
diff --git a/sysdeps/ieee754/flt-32/math_config.h b/sysdeps/ieee754/flt-32/math_config.h
index 035461199c..8d9c8ee3ad 100644
--- a/sysdeps/ieee754/flt-32/math_config.h
+++ b/sysdeps/ieee754/flt-32/math_config.h
@@ -84,6 +84,31 @@ roundeven_finite (double x)
 #endif
 }
 
+#ifndef ROUNDEVENF_INTRINSICS
+/* When set, roundevenf_finite will route to the internal roundevenf function. */
+# define ROUNDEVENF_INTRINSICS 1
+#endif
+
+static inline float
+roundevenf_finite (float x)
+{
+  if (!isfinite (x))
+    __builtin_unreachable ();
+#if ROUNDEVENF_INTRINSICS
+  return roundevenf (x);
+#else
+  float y = roundf (x);
+  if (fabs (x - y) == 0.5)
+    {
+      union { float f; uint32_t i; } u = {y};
+      union { float f; uint32_t i; } v = {y - copysignf (1.0, x)};
+      if (__builtin_ctzl (v.i) > __builtin_ctzl (u.i))
+        y = v.f;
+    }
+  return y;
+#endif
+}
+
 static inline uint32_t
 asuint (float f)
 {
diff --git a/sysdeps/ieee754/flt-32/s_tanpif.c b/sysdeps/ieee754/flt-32/s_tanpif.c
new file mode 100644
index 0000000000..efbc47b507
--- /dev/null
+++ b/sysdeps/ieee754/flt-32/s_tanpif.c
@@ -0,0 +1,88 @@
+/* Correctly-rounded tangent of binary32 value for angles in half-revolutions
+
+Copyright (c) 2022-2025 Alexei Sibidanov.
+
+The original version of this file was copied from the CORE-MATH
+project (src/binary32/tanpi/tanpif.c, revision 3bbf907).
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
+*/
+
+#include <stdint.h>
+#include <errno.h>
+#include <libm-alias-float.h>
+#include "math_config.h"
+
+float
+__tanpif (float x)
+{
+  uint32_t ix = asuint (x);
+  uint32_t e = ix & (0xff << 23);
+  if (__glibc_unlikely (e > (150 << 23)))
+    {
+      if (e == (0xff << 23))
+        {
+          if (!(ix << 9))
+            return __math_invalidf (x);
+          return x + x; /* nan */
+        }
+      return copysign (0.0f, x);
+    }
+  float x4 = 4.0f * x;
+  float nx4 = roundevenf_finite (x4);
+  float dx4 = x4 - nx4;
+  float ni = roundevenf_finite (x);
+  float zf = x - ni;
+  if (__glibc_unlikely (dx4 == 0.0f))
+    {
+      int k = x4;
+      if (k & 1)
+        return copysignf (1.0f, zf);
+      k &= 7;
+      if (k == 0)
+        return copysignf (0.0f, x);
+      if (k == 4)
+        return -copysignf (0.0f, x);
+      __set_errno (ERANGE);
+      if (k == 2)
+        return 1.0f / 0.0f;
+      if (k == 6)
+        return -1.0f / 0.0f;
+    }
+  ix = asuint (zf);
+  uint32_t a = ix & (~0u >> 1);
+  if (__glibc_unlikely (a == 0x3e933802u))
+    return copysignf (0x1.44cfbap+0f, zf) + copysignf (0x1p-25f, zf);
+  if (__glibc_unlikely (a == 0x38f26685u))
+    return copysignf (0x1.7cc304p-12, zf) + copysignf (0x1p-37f, zf);
+
+  double z = zf, z2 = z * z;
+
+  static const double cn[] = { 0x1.921fb54442d19p-1, -0x1.1f458b3e1f8d6p-2,
+                               0x1.68a34bd0b8f6ap-6, -0x1.e4866f7a25f99p-13 };
+  static const double cd[] = { 0x1p+0, -0x1.4b4b98d2df3a7p-1,
+                               0x1.8e9926d2bb901p-4, -0x1.a6f77fd847eep-9 };
+  double z4
+      = z2 * z2,
+      r = (z - z * z2) * ((cn[0] + z2 * cn[1]) + z4 * (cn[2] + z2 * cn[3]))
+          / (((cd[0] + z2 * cd[1]) + z4 * (cd[2] + z2 * cd[3])) * (0.25 - z2));
+  return r;
+}
+libm_alias_float (__tanpi, tanpi)
diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps
index 4fadba43c2..b4a6a3ba35 100644
--- a/sysdeps/loongarch/lp64/libm-test-ulps
+++ b/sysdeps/loongarch/lp64/libm-test-ulps
@@ -1437,22 +1437,18 @@ ldouble: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 ldouble: 4
 
 Function: "tgamma":
diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps
index 5177b54557..3b1c725fae 100644
--- a/sysdeps/mips/mips64/libm-test-ulps
+++ b/sysdeps/mips/mips64/libm-test-ulps
@@ -1449,22 +1449,18 @@ ldouble: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 ldouble: 4
 
 Function: "tgamma":
diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps
index 1fb4ec57c0..accf30904d 100644
--- a/sysdeps/or1k/fpu/libm-test-ulps
+++ b/sysdeps/or1k/fpu/libm-test-ulps
@@ -1115,19 +1115,15 @@ double: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps
index aff11b5148..f3d5604e6a 100644
--- a/sysdeps/or1k/nofpu/libm-test-ulps
+++ b/sysdeps/or1k/nofpu/libm-test-ulps
@@ -1015,7 +1015,6 @@ double: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index e59c3e47ef..404d3afc27 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1857,25 +1857,21 @@ ldouble: 6
 
 Function: "tanpi":
 double: 3
-float: 3
 float128: 2
 ldouble: 2
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 float128: 4
 ldouble: 8
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 float128: 4
 ldouble: 8
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 float128: 4
 ldouble: 8
 
diff --git a/sysdeps/powerpc/fpu/math_private.h b/sysdeps/powerpc/fpu/math_private.h
index aace1a8708..7065d276c0 100644
--- a/sysdeps/powerpc/fpu/math_private.h
+++ b/sysdeps/powerpc/fpu/math_private.h
@@ -62,6 +62,7 @@ __ieee754_sqrtf128 (_Float128 __x)
 #ifdef _ARCH_PWR6
 /* ISA 2.03 provides frin/round() and cntlzw/ctznll(). */
 # define ROUNDEVEN_INTRINSICS 0
+# define ROUNDEVENF_INTRINSICS 0
 #endif
 
 #endif /* _PPC_MATH_PRIVATE_H_ */
diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps
index 2545d0e166..720250e208 100644
--- a/sysdeps/riscv/nofpu/libm-test-ulps
+++ b/sysdeps/riscv/nofpu/libm-test-ulps
@@ -1306,7 +1306,6 @@ ldouble: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps
index 94534a4f80..ee5df22f81 100644
--- a/sysdeps/riscv/rvd/libm-test-ulps
+++ b/sysdeps/riscv/rvd/libm-test-ulps
@@ -1452,22 +1452,18 @@ ldouble: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 ldouble: 4
 
 Function: "tgamma":
diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
index 2c89048b56..1491089e84 100644
--- a/sysdeps/s390/fpu/libm-test-ulps
+++ b/sysdeps/s390/fpu/libm-test-ulps
@@ -1434,22 +1434,18 @@ ldouble: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 ldouble: 4
 
 Function: "tgamma":
diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
index 3af2355545..d894901bbd 100644
--- a/sysdeps/sparc/fpu/libm-test-ulps
+++ b/sysdeps/sparc/fpu/libm-test-ulps
@@ -1449,22 +1449,18 @@ ldouble: 3
 
 Function: "tanpi":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 ldouble: 4
 
 Function: "tgamma":
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index f6da5ba186..a4bd2edcbc 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -2292,25 +2292,21 @@ double: 1
 
 Function: "tanpi":
 double: 3
-float: 3
 float128: 3
 ldouble: 3
 
 Function: "tanpi_downward":
 double: 2
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanpi_towardzero":
 double: 2
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanpi_upward":
 double: 2
-float: 4
 float128: 4
 ldouble: 4
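
As a reading aid for the `roundevenf_finite` helper added to math_config.h, here is a minimal standalone sketch of round-half-to-even for binary32. It is a different technique from the patch's non-intrinsic fallback (which calls `roundf` and then compares trailing-zero counts of the two candidate bit patterns): this version truncates through an integer conversion and adjusts by the fraction. The name `roundevenf_sketch` is hypothetical and the code is meant only to illustrate the expected results, including ties and the sign of zero.

```c
#include <assert.h>
#include <math.h> /* signbit is a macro; no libm call is made */
#include <stdint.h>

/* Hypothetical illustration (not the patch's code): round x to the
   nearest integer, ties to even, independently of the rounding mode.  */
float
roundevenf_sketch (float x)
{
  /* Binary32 values with |x| >= 2^23 are already integers.  The negated
     comparison also sends NaN down this path.  */
  if (!(x > -0x1p23f && x < 0x1p23f))
    return x;
  int32_t n = (int32_t) x; /* truncates toward zero */
  float d = x - (float) n; /* exact fraction, same sign as x */
  if (d > 0.5f || (d == 0.5f && (n & 1)))
    n += 1; /* round up; on a tie only when n is odd */
  else if (d < -0.5f || (d == -0.5f && (n & 1)))
    n -= 1; /* round down; on a tie only when n is odd */
  float y = (float) n;
  if (n == 0 && signbit (x))
    y = -0.0f; /* preserve the sign of a zero result */
  return y;
}

/* Small usage example: ties go to the even neighbour.  */
void
roundevenf_sketch_demo (void)
{
  assert (roundevenf_sketch (0.5f) == 0.0f);
  assert (roundevenf_sketch (1.5f) == 2.0f);
  assert (roundevenf_sketch (2.5f) == 2.0f);
  assert (roundevenf_sketch (-2.5f) == -2.0f);
  assert (roundevenf_sketch (2.75f) == 3.0f);
}
```

In `__tanpif` the helper is applied to both `4.0f * x` and `x`, so ties at multiples of 0.5 are exactly the inputs where the even-rounding rule matters.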