Message ID | 20250131191844.2582716-13-adhemerval.zanella@linaro.org |
---|---|
State | New |
Series | Add c23 CORE-MATH binary32 implementations to libm |
I confirm all binary32 inputs yield correct rounding for all rounding modes on x86-64.

Paul

> From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
> Cc: DJ Delorie <dj@redhat.com>,
>  Joseph Myers <josmyers@redhat.com>,
>  Paul Zimmermann <Paul.Zimmermann@inria.fr>,
>  Alexei Sibidanov <sibid@uvic.ca>
> Date: Fri, 31 Jan 2025 16:17:16 -0300
>
> The CORE-MATH implementation is correctly rounded (for any rounding mode)
> and shows better performance than the generic cospif.
>
> The code was adapted to glibc style and to use the definition of
> math_config.h (to handle errno, overflow, and underflow).
>
> Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
> gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
>
> latency                      master    patched   improvement
> x86_64                      47.4679    38.4157        19.07%
> x86_64v2                    46.9686    38.3329        18.39%
> x86_64v3                    43.8929    31.8510        27.43%
> aarch64 (Neoverse)          18.8867    13.2089        30.06%
> power8                      22.9435     7.8023        65.99%
> power10                     15.4472    7.77505        49.67%
>
> reciprocal-throughput        master    patched   improvement
> x86_64                      20.9518    11.4991        45.12%
> x86_64v2                    19.8699    10.5921        46.69%
> x86_64v3                    19.3475     9.3998        51.42%
> aarch64 (Neoverse)          12.5767     6.2158        50.58%
> power8                      15.0566     3.2654        78.31%
> power10                      9.2866     3.1147        66.46%
> ---
>  SHARED-FILES                                  |   4 +
>  sysdeps/aarch64/libm-test-ulps                |   4 -
>  sysdeps/arc/fpu/libm-test-ulps                |   4 -
>  sysdeps/arc/nofpu/libm-test-ulps              |   1 -
>  sysdeps/arm/libm-test-ulps                    |   4 -
>  sysdeps/hppa/fpu/libm-test-ulps               |   4 -
>  sysdeps/i386/fpu/libm-test-ulps               |   4 -
>  .../i386/i686/fpu/multiarch/libm-test-ulps    |   4 -
>  sysdeps/ieee754/flt-32/s_cospif.c             | 136 ++++++++++++++++++
>  sysdeps/loongarch/lp64/libm-test-ulps         |   4 -
>  sysdeps/mips/mips64/libm-test-ulps            |   4 -
>  sysdeps/or1k/fpu/libm-test-ulps               |   4 -
>  sysdeps/or1k/nofpu/libm-test-ulps             |   1 -
>  sysdeps/powerpc/fpu/libm-test-ulps            |   4 -
>  sysdeps/riscv/nofpu/libm-test-ulps            |   1 -
>  sysdeps/riscv/rvd/libm-test-ulps              |   4 -
>  sysdeps/s390/fpu/libm-test-ulps               |   4 -
>  sysdeps/sparc/fpu/libm-test-ulps              |   4 -
sysdeps/x86_64/fpu/libm-test-ulps | 4 - > 19 files changed, 140 insertions(+), 59 deletions(-) > create mode 100644 sysdeps/ieee754/flt-32/s_cospif.c > > diff --git a/SHARED-FILES b/SHARED-FILES > index 5702a2d1c3..3ce38d1542 100644 > --- a/SHARED-FILES > +++ b/SHARED-FILES > @@ -350,3 +350,7 @@ sysdeps/ieee754/flt-32/s_atanpif.c: > (src/binary32/atanpi/atanpif.c in CORE-MATH) > - the code was adapted to use glibc code style and internal > functions to handle errno, overflow, and underflow. > +sysdeps/ieee754/flt-32/s_cospif.c: > + (src/binary32/cospi/cospif.c in CORE-MATH) > + - the code was adapted to use glibc code style and internal > + functions to handle errno, overflow, and underflow. > diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps > index 10f182a211..a15f3effa0 100644 > --- a/sysdeps/aarch64/libm-test-ulps > +++ b/sysdeps/aarch64/libm-test-ulps > @@ -782,7 +782,6 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > ldouble: 2 > > Function: "cospi_advsimd": > @@ -791,7 +790,6 @@ float: 1 > > Function: "cospi_downward": > double: 1 > -float: 2 > ldouble: 2 > > Function: "cospi_sve": > @@ -800,12 +798,10 @@ float: 1 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > ldouble: 2 > > Function: Real part of "cpow": > diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps > index 7fb407cecd..f2acbf453e 100644 > --- a/sysdeps/arc/fpu/libm-test-ulps > +++ b/sysdeps/arc/fpu/libm-test-ulps > @@ -553,19 +553,15 @@ double: 3 > > Function: "cospi": > double: 2 > -float: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > > Function: "cospi_upward": > double: 1 > -float: 2 > > Function: Real part of "cpow": > double: 9 > diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps > index 1859c2bd4f..8716e5d29e 100644 > --- 
a/sysdeps/arc/nofpu/libm-test-ulps > +++ b/sysdeps/arc/nofpu/libm-test-ulps > @@ -134,7 +134,6 @@ double: 2 > > Function: "cospi": > double: 2 > -float: 2 > > Function: Real part of "cpow": > double: 2 > diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps > index fa9864adee..647f92944c 100644 > --- a/sysdeps/arm/libm-test-ulps > +++ b/sysdeps/arm/libm-test-ulps > @@ -545,19 +545,15 @@ double: 2 > > Function: "cospi": > double: 2 > -float: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > > Function: "cospi_upward": > double: 1 > -float: 2 > > Function: Real part of "cpow": > double: 2 > diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps > index a59f61fc4e..88f7701c0e 100644 > --- a/sysdeps/hppa/fpu/libm-test-ulps > +++ b/sysdeps/hppa/fpu/libm-test-ulps > @@ -555,19 +555,15 @@ double: 2 > > Function: "cospi": > double: 2 > -float: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > > Function: "cospi_upward": > double: 1 > -float: 2 > > Function: Real part of "cpow": > double: 2 > diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps > index 8aa52f4600..39066956b0 100644 > --- a/sysdeps/i386/fpu/libm-test-ulps > +++ b/sysdeps/i386/fpu/libm-test-ulps > @@ -854,25 +854,21 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > float128: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > float128: 2 > ldouble: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > float128: 2 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > float128: 2 > ldouble: 2 > > diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps > index 8032636808..a8c4723850 100644 > --- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps > +++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps > @@ 
-854,25 +854,21 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > float128: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > float128: 2 > ldouble: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > float128: 2 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > float128: 2 > ldouble: 2 > > diff --git a/sysdeps/ieee754/flt-32/s_cospif.c b/sysdeps/ieee754/flt-32/s_cospif.c > new file mode 100644 > index 0000000000..d4c652f8c0 > --- /dev/null > +++ b/sysdeps/ieee754/flt-32/s_cospif.c > @@ -0,0 +1,136 @@ > +/* Correctly-rounded cosine of binary32 value for angles in half-revolutions > + > +Copyright (c) 2022-2025 Alexei Sibidanov. > + > +The original version of this file was copied from the CORE-MATH > +project (src/binary32/cospi/cospif.c, revision f786e13). > + > +Permission is hereby granted, free of charge, to any person obtaining a copy > +of this software and associated documentation files (the "Software"), to deal > +in the Software without restriction, including without limitation the rights > +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell > +copies of the Software, and to permit persons to whom the Software is > +furnished to do so, subject to the following conditions: > + > +The above copyright notice and this permission notice shall be included in all > +copies or substantial portions of the Software. > + > +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE > +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, > +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > +SOFTWARE. 
> + > +*/ > + > +#include <math.h> > +#include <stdint.h> > +#include <libm-alias-float.h> > +#include "math_config.h" > + > +float > +__cospif (float x) > +{ > + static const double sn[] = > + { > + 0x1.921fb54442d0fp-37, -0x1.4abbce6102b94p-112, 0x1.4669fa3c58463p-189 > + }; > + static const double cn[] = > + { > + -0x1.3bd3cc9be45cfp-74, 0x1.03c1f08088742p-150, -0x1.55d1e5eff55a5p-228 > + }; > + /* S[i] approximates sin(i*pi/2^6) */ > + static const double S[] = > + { > + 0x0p+0, 0x1.91f65f10dd814p-5, 0x1.917a6bc29b42cp-4, > + 0x1.2c8106e8e613ap-3, 0x1.8f8b83c69a60bp-3, 0x1.f19f97b215f1bp-3, > + 0x1.294062ed59f06p-2, 0x1.58f9a75ab1fddp-2, 0x1.87de2a6aea963p-2, > + 0x1.b5d1009e15ccp-2, 0x1.e2b5d3806f63bp-2, 0x1.073879922ffeep-1, > + 0x1.1c73b39ae68c8p-1, 0x1.30ff7fce17035p-1, 0x1.44cf325091dd6p-1, > + 0x1.57d69348cecap-1, 0x1.6a09e667f3bcdp-1, 0x1.7b5df226aafafp-1, > + 0x1.8bc806b151741p-1, 0x1.9b3e047f38741p-1, 0x1.a9b66290ea1a3p-1, > + 0x1.b728345196e3ep-1, 0x1.c38b2f180bdb1p-1, 0x1.ced7af43cc773p-1, > + 0x1.d906bcf328d46p-1, 0x1.e212104f686e5p-1, 0x1.e9f4156c62ddap-1, > + 0x1.f0a7efb9230d7p-1, 0x1.f6297cff75cbp-1, 0x1.fa7557f08a517p-1, > + 0x1.fd88da3d12526p-1, 0x1.ff621e3796d7ep-1, 0x1p+0, > + 0x1.ff621e3796d7ep-1, 0x1.fd88da3d12526p-1, 0x1.fa7557f08a517p-1, > + 0x1.f6297cff75cbp-1, 0x1.f0a7efb9230d7p-1, 0x1.e9f4156c62ddap-1, > + 0x1.e212104f686e5p-1, 0x1.d906bcf328d46p-1, 0x1.ced7af43cc773p-1, > + 0x1.c38b2f180bdb1p-1, 0x1.b728345196e3ep-1, 0x1.a9b66290ea1a3p-1, > + 0x1.9b3e047f38741p-1, 0x1.8bc806b151741p-1, 0x1.7b5df226aafafp-1, > + 0x1.6a09e667f3bcdp-1, 0x1.57d69348cecap-1, 0x1.44cf325091dd6p-1, > + 0x1.30ff7fce17035p-1, 0x1.1c73b39ae68c8p-1, 0x1.073879922ffeep-1, > + 0x1.e2b5d3806f63bp-2, 0x1.b5d1009e15ccp-2, 0x1.87de2a6aea963p-2, > + 0x1.58f9a75ab1fddp-2, 0x1.294062ed59f06p-2, 0x1.f19f97b215f1bp-3, > + 0x1.8f8b83c69a60bp-3, 0x1.2c8106e8e613ap-3, 0x1.917a6bc29b42cp-4, > + 0x1.91f65f10dd814p-5, 0x0p+0, -0x1.91f65f10dd814p-5, > + -0x1.917a6bc29b42cp-4, 
-0x1.2c8106e8e613ap-3, -0x1.8f8b83c69a60bp-3, > + -0x1.f19f97b215f1bp-3, -0x1.294062ed59f06p-2, -0x1.58f9a75ab1fddp-2, > + -0x1.87de2a6aea963p-2, -0x1.b5d1009e15ccp-2, -0x1.e2b5d3806f63bp-2, > + -0x1.073879922ffeep-1, -0x1.1c73b39ae68c8p-1, -0x1.30ff7fce17035p-1, > + -0x1.44cf325091dd6p-1, -0x1.57d69348cecap-1, -0x1.6a09e667f3bcdp-1, > + -0x1.7b5df226aafafp-1, -0x1.8bc806b151741p-1, -0x1.9b3e047f38741p-1, > + -0x1.a9b66290ea1a3p-1, -0x1.b728345196e3ep-1, -0x1.c38b2f180bdb1p-1, > + -0x1.ced7af43cc773p-1, -0x1.d906bcf328d46p-1, -0x1.e212104f686e5p-1, > + -0x1.e9f4156c62ddap-1, -0x1.f0a7efb9230d7p-1, -0x1.f6297cff75cbp-1, > + -0x1.fa7557f08a517p-1, -0x1.fd88da3d12526p-1, -0x1.ff621e3796d7ep-1, > + -0x1p+0, -0x1.ff621e3796d7ep-1, -0x1.fd88da3d12526p-1, > + -0x1.fa7557f08a517p-1, -0x1.f6297cff75cbp-1, -0x1.f0a7efb9230d7p-1, > + -0x1.e9f4156c62ddap-1, -0x1.e212104f686e5p-1, -0x1.d906bcf328d46p-1, > + -0x1.ced7af43cc773p-1, -0x1.c38b2f180bdb1p-1, -0x1.b728345196e3ep-1, > + -0x1.a9b66290ea1a3p-1, -0x1.9b3e047f38741p-1, -0x1.8bc806b151741p-1, > + -0x1.7b5df226aafafp-1, -0x1.6a09e667f3bcdp-1, -0x1.57d69348cecap-1, > + -0x1.44cf325091dd6p-1, -0x1.30ff7fce17035p-1, -0x1.1c73b39ae68c8p-1, > + -0x1.073879922ffeep-1, -0x1.e2b5d3806f63bp-2, -0x1.b5d1009e15ccp-2, > + -0x1.87de2a6aea963p-2, -0x1.58f9a75ab1fddp-2, -0x1.294062ed59f06p-2, > + -0x1.f19f97b215f1bp-3, -0x1.8f8b83c69a60bp-3, -0x1.2c8106e8e613ap-3, > + -0x1.917a6bc29b42cp-4, -0x1.91f65f10dd814p-5 > + }; > + > + uint32_t ix = asuint (x); > + int32_t e = (ix >> 23) & 0xff; > + if (__glibc_unlikely (e == 0xff)) > + { > + if (!(ix << 9)) > + return __math_invalidf (x); > + return x + x; /* nan */ > + } > + int32_t m = (ix & ~0u >> 9) | 1 << 23; > + int32_t s = 143 - e; > + int32_t p = e - 112; > + if (__glibc_unlikely (p < 0)) /* |x| < 2^-15 */ > + { > + uint32_t ax = ix & (~0u>>1); > + /* Warning: -0x1.3bd3ccp+2f * x underflows for |x| < 0x1.9f03p-129 */ > + if (ax >= 0x19f030u) > + return fmaf (-0x1.3bd3ccp+2f * x, x, 1.0f); 
> + else /* |x| < 0x1.9f03p-129 */ > + return fmaf (-x, x, 1.0f); > + } > + if (__glibc_unlikely (p > 31)) > + { > + if (__glibc_unlikely (p > 63)) > + return 1.0f; > + int32_t iq = m << (p - 32); > + return S[(iq + 32) & 127]; > + } > + int32_t k = m << p; > + if (__glibc_unlikely (k == 0)) > + { > + int32_t iq = m >> (32 - p); > + return S[(iq + 32) & 127]; > + } > + double z = k; > + double z2 = z * z; > + double fs = sn[0] + z2 * (sn[1] + z2 * sn[2]); > + double fc = cn[0] + z2 * (cn[1] + z2 * cn[2]); > + uint32_t iq = m >> s; > + iq = (iq + 1) >> 1; > + uint32_t is = iq & 127, ic = (iq + 32) & 127; > + double ts = S[ic], tc = S[is]; > + double r = ts + (ts * z2) * fc - (tc * z) * fs; > + return r; > +} > +libm_alias_float (__cospi, cospi) > diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps > index 0cac55cbe4..fe84c60913 100644 > --- a/sysdeps/loongarch/lp64/libm-test-ulps > +++ b/sysdeps/loongarch/lp64/libm-test-ulps > @@ -701,22 +701,18 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > ldouble: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > ldouble: 2 > > Function: Real part of "cpow": > diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps > index 1b5bcff11e..ddc78d0239 100644 > --- a/sysdeps/mips/mips64/libm-test-ulps > +++ b/sysdeps/mips/mips64/libm-test-ulps > @@ -701,22 +701,18 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > ldouble: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > ldouble: 2 > > Function: Real part of "cpow": > diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps > index a608e3c949..884b4cc361 100644 > --- 
a/sysdeps/or1k/fpu/libm-test-ulps > +++ b/sysdeps/or1k/fpu/libm-test-ulps > @@ -545,19 +545,15 @@ double: 2 > > Function: "cospi": > double: 2 > -float: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > > Function: "cospi_upward": > double: 1 > -float: 2 > > Function: Real part of "cpow": > double: 2 > diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps > index 56986f0be0..aec66e0fa3 100644 > --- a/sysdeps/or1k/nofpu/libm-test-ulps > +++ b/sysdeps/or1k/nofpu/libm-test-ulps > @@ -509,7 +509,6 @@ double: 2 > > Function: "cospi": > double: 2 > -float: 2 > > Function: Real part of "cpow": > double: 2 > diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps > index 630111e6c4..bdf0c98dc7 100644 > --- a/sysdeps/powerpc/fpu/libm-test-ulps > +++ b/sysdeps/powerpc/fpu/libm-test-ulps > @@ -858,25 +858,21 @@ ldouble: 2 > > Function: "cospi": > double: 2 > -float: 2 > float128: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > float128: 2 > ldouble: 4 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > float128: 2 > ldouble: 6 > > Function: "cospi_upward": > double: 1 > -float: 2 > float128: 2 > ldouble: 6 > > diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps > index 087dcd79fc..08af2495f3 100644 > --- a/sysdeps/riscv/nofpu/libm-test-ulps > +++ b/sysdeps/riscv/nofpu/libm-test-ulps > @@ -650,7 +650,6 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > ldouble: 2 > > Function: Real part of "cpow": > diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps > index efd83affa4..6644e38ebc 100644 > --- a/sysdeps/riscv/rvd/libm-test-ulps > +++ b/sysdeps/riscv/rvd/libm-test-ulps > @@ -709,22 +709,18 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > ldouble: 2 > > Function: 
"cospi_towardzero": > double: 1 > -float: 1 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > ldouble: 2 > > Function: Real part of "cpow": > diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps > index 709debb205..6318760eb5 100644 > --- a/sysdeps/s390/fpu/libm-test-ulps > +++ b/sysdeps/s390/fpu/libm-test-ulps > @@ -701,22 +701,18 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > ldouble: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > ldouble: 2 > > Function: Real part of "cpow": > diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps > index becf5da3d6..2c319f8ae2 100644 > --- a/sysdeps/sparc/fpu/libm-test-ulps > +++ b/sysdeps/sparc/fpu/libm-test-ulps > @@ -701,22 +701,18 @@ ldouble: 3 > > Function: "cospi": > double: 2 > -float: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > ldouble: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > ldouble: 2 > > Function: Real part of "cpow": > diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps > index 8c5d4fd471..e2cf3e04b6 100644 > --- a/sysdeps/x86_64/fpu/libm-test-ulps > +++ b/sysdeps/x86_64/fpu/libm-test-ulps > @@ -1050,25 +1050,21 @@ float: 2 > > Function: "cospi": > double: 2 > -float: 2 > float128: 2 > ldouble: 2 > > Function: "cospi_downward": > double: 1 > -float: 2 > float128: 2 > ldouble: 2 > > Function: "cospi_towardzero": > double: 1 > -float: 1 > float128: 2 > ldouble: 2 > > Function: "cospi_upward": > double: 1 > -float: 2 > float128: 2 > ldouble: 2 > > -- > 2.43.0 > >
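The "improvement" column in the quoted benchmark tables appears to be the relative reduction in time, (master - patched) / master. The following is a minimal sketch under that assumption; `improvement_percent` is a hypothetical helper written for illustration and is not part of glibc or its benchtests.

```c
#include <assert.h>

/* Hypothetical helper: relative improvement, in percent, of the patched
   timing over the master timing.  Assumes the table's "improvement"
   column is defined as (master - patched) / master.  */
static double
improvement_percent (double master, double patched)
{
  assert (master > 0.0);   /* a timing of zero would make the ratio meaningless */
  return 100.0 * (master - patched) / master;
}
```

Under this definition, the x86_64 latency row (47.4679 against 38.4157) works out to roughly 19.07%, which matches the value reported in the table.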
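The main path of `__cospif` in the patch splits the input with integer operations only: `m >> s` followed by `(iq + 1) >> 1` gives the nearest multiple of 1/64, and `m << p` keeps the remainder as a 32-bit fixed-point value. The sketch below isolates that step so it can be inspected on its own. The names `halfrev_split`, `float_bits`, and `split_halfrev` are hypothetical and not in the patch, `float_bits` stands in for glibc's internal `asuint`, and unsigned shifts are used here instead of the signed shifts in the original to avoid relying on signed overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the two parts of the reduction.  */
struct halfrev_split
{
  uint32_t q;   /* round (64 * |x|), the table index before masking with 127 */
  int32_t k;    /* (64 * |x| - q) scaled by 2^32, so it lies in [-2^31, 2^31) */
};

/* Bit pattern of a float; an illustrative stand-in for asuint.  */
static uint32_t
float_bits (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Split |x| for 2^-15 <= |x| < 2^17, mirroring the shifts in the patch.  */
static struct halfrev_split
split_halfrev (float x)
{
  uint32_t ix = float_bits (x);
  int32_t e = (ix >> 23) & 0xff;                 /* biased exponent */
  uint32_t m = (ix & (~0u >> 9)) | (1u << 23);   /* 24-bit significand */
  int32_t p = e - 112;                           /* left shift: keeps frac (64|x|) */
  int32_t s = 143 - e;                           /* right shift: floor (128|x|) */
  assert (p >= 0 && p <= 31);                    /* outside the range this sketch covers */

  struct halfrev_split r;
  /* The conversion to int32_t is implementation-defined for values above
     INT32_MAX; GCC and Clang wrap modulo 2^32, which is assumed here.  */
  r.k = (int32_t) (m << p);
  r.q = ((m >> s) + 1) >> 1;
  return r;
}
```

For `x = 1.0f` this gives `q = 64` and `k = 0`, which corresponds to the patch's `k == 0` shortcut: the result is read directly from `S[(64 + 32) & 127]`, the table entry for sin(3π/2) = -1. When `k` is nonzero, the patch feeds it into the short polynomials `sn` and `cn` to correct the table value.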