From patchwork Fri Nov 29 13:17:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 846141
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: DJ Delorie, Alexei Sibidanov, Paul Zimmermann
Subject: [PATCH 21/23] math: Use coshf from CORE-MATH
Date: Fri, 29 Nov 2024 10:17:45 -0300
Message-ID: <20241129132032.476978-22-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241129132032.476978-1-adhemerval.zanella@linaro.org>
References: <20241129132032.476978-1-adhemerval.zanella@linaro.org>

The CORE-MATH implementation is correctly rounded (for any rounding
mode), although it shows worse performance than the current one.  The
current implementation's performance comes mainly from its internal
use of the optimized expf implementation, and it shows a maximum error
of 2 ULPs for FE_TONEAREST and 3 ULPs for the other rounding modes.

The code was adapted to glibc style and to use the definitions from
math_config.h (to handle errno, overflow, and underflow).
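To make the ULP figures above concrete, here is a minimal sketch of one way to count how many binary32 steps separate two results. The helper names float_to_ordered and ulp_distance are hypothetical and not part of glibc, and this is a simplification of how the libm tests measure error: it counts representable values between two floats rather than measuring against the exact result.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map a float's bit pattern to an integer that increases monotonically
   with the float's value, so adjacent floats differ by exactly 1.
   Hypothetical helper for illustration only.  */
static int64_t
float_to_ordered (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  /* Floats with the sign bit set are mirrored below zero; +0 and -0
     both map to 0.  */
  return (u & 0x80000000u) ? -(int64_t) (u & 0x7fffffffu) : (int64_t) u;
}

/* Distance between A and B counted in representable binary32 steps
   (0 if they are equal).  NaN inputs are rejected.  */
static int64_t
ulp_distance (float a, float b)
{
  assert (a == a && b == b);
  int64_t d = float_to_ordered (a) - float_to_ordered (b);
  return d < 0 ? -d : d;
}
```

Under FE_TONEAREST a correctly rounded coshf should be at distance 0 from the float nearest to the exact value, whereas a 2 ULP bound allows a distance of up to 2. Comparing against a higher-precision reference rounded to float is only an approximation of that check, since the reference itself carries error.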
Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                      master   patched  improvement
x86_64                      40.6995   49.0737      -20.58%
x86_64v2                    40.5841   44.3604       -9.30%
x86_64v3                    39.3879   39.7502       -0.92%
i686                       112.3380  129.8570      -15.59%
aarch64 (Neoverse)          18.6914   17.0946        8.54%
power10                     11.1343    9.3245       16.25%

reciprocal-throughput        master   patched  improvement
x86_64                      18.6471   24.1077      -29.28%
x86_64v2                    17.7501   20.2946      -14.34%
x86_64v3                    17.8262   17.1877        3.58%
i686                        64.1454   86.5645      -34.95%
aarch64 (Neoverse)          9.77226   12.2314      -25.16%
power10                      4.0200    5.3316      -32.63%

Signed-off-by: Alexei Sibidanov
Signed-off-by: Paul Zimmermann
Signed-off-by: Adhemerval Zanella
---
 SHARED-FILES                                  |   4 +
 sysdeps/aarch64/libm-test-ulps                |   4 -
 sysdeps/alpha/fpu/libm-test-ulps              |   4 -
 sysdeps/arc/fpu/libm-test-ulps                |   4 -
 sysdeps/arc/nofpu/libm-test-ulps              |   1 -
 sysdeps/arm/libm-test-ulps                    |   8 +-
 sysdeps/csky/fpu/libm-test-ulps               |   4 -
 sysdeps/csky/nofpu/libm-test-ulps             |   4 -
 sysdeps/hppa/fpu/libm-test-ulps               |   4 -
 sysdeps/i386/fpu/libm-test-ulps               |   4 -
 .../i386/i686/fpu/multiarch/libm-test-ulps    |   4 -
 sysdeps/ieee754/flt-32/e_atan2f.c             |   2 +-
 sysdeps/ieee754/flt-32/e_coshf.c              | 156 ++++++++++++------
 sysdeps/loongarch/lp64/libm-test-ulps         |   4 -
 sysdeps/microblaze/libm-test-ulps             |   1 -
 sysdeps/mips/mips32/libm-test-ulps            |   4 -
 sysdeps/mips/mips64/libm-test-ulps            |   4 -
 sysdeps/or1k/fpu/libm-test-ulps               |   4 -
 sysdeps/or1k/nofpu/libm-test-ulps             |   4 -
 sysdeps/powerpc/fpu/libm-test-ulps            |   4 -
 sysdeps/powerpc/nofpu/libm-test-ulps          |   4 -
 sysdeps/riscv/nofpu/libm-test-ulps            |   4 -
 sysdeps/riscv/rvd/libm-test-ulps              |   4 -
 sysdeps/s390/fpu/libm-test-ulps               |   4 -
 sysdeps/sh/libm-test-ulps                     |   2 -
 sysdeps/sparc/fpu/libm-test-ulps              |   4 -
 sysdeps/x86_64/fpu/libm-test-ulps             |   6 +-
 27 files changed, 112 insertions(+), 144 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES
index d32c837b46..320e0b3be9 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -322,3 +322,7 @@ sysdeps/ieee754/flt-32/e_atanhf.c:
   (src/binary32/atanh/atanhf.c in CORE-MATH)
   - The code was adapted to use glibc code style and internal
     functions to handle errno, overflow, and underflow.
+sysdeps/ieee754/flt-32/e_coshf.c:
+  (src/binary32/cosh/coshf.c in CORE-MATH)
+  - the code was adapted to use glibc code style and internal
+    functions to handle errno, overflow, and underflow.
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index 0c686221d2..2bbaf97239 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -698,7 +698,6 @@ ldouble: 2
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "cosh_advsimd":
@@ -707,7 +706,6 @@ float: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_sve":
@@ -716,12 +714,10 @@ float: 2
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_upward":
 double: 2
-float: 2
 ldouble: 3
 
 Function: Real part of "cpow":
diff --git a/sysdeps/alpha/fpu/libm-test-ulps b/sysdeps/alpha/fpu/libm-test-ulps
index e108b2543c..6b433fbba7 100644
--- a/sysdeps/alpha/fpu/libm-test-ulps
+++ b/sysdeps/alpha/fpu/libm-test-ulps
@@ -625,22 +625,18 @@ ldouble: 2
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_upward":
 double: 2
-float: 2
 ldouble: 3
 
 Function: Real part of "cpow":
diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps
index a0d6e89b49..a16c1097f0 100644
--- a/sysdeps/arc/fpu/libm-test-ulps
+++ b/sysdeps/arc/fpu/libm-test-ulps
@@ -497,19 +497,15 @@ float: 2
 
 Function: "cosh":
 double: 3
-float: 3
 
 Function: "cosh_downward":
 double: 3
-float: 1
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 
 Function: "cosh_upward":
 double: 3
-float: 2
 
 Function: Real part of "cpow":
 double: 9
diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps
index 1a6b37a728..de09fa9b92 100644
--- a/sysdeps/arc/nofpu/libm-test-ulps
+++ b/sysdeps/arc/nofpu/libm-test-ulps
@@ -124,7 +124,6 @@ float: 1
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps
index ea2f3a22b0..df423594de 100644
--- a/sysdeps/arm/libm-test-ulps
+++ b/sysdeps/arm/libm-test-ulps
@@ -52,8 +52,6 @@ double: 3
 
 Function: "atan":
 double: 1
 
-Function: "atan2":
-
 Function: "atan2_downward":
 double: 1
@@ -493,19 +491,15 @@ float: 2
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 
 Function: "cosh_upward":
 double: 2
-float: 2
 
 Function: Real part of "cpow":
 double: 2
@@ -674,7 +668,7 @@ float: 2
 
 Function: Real part of "ctanh_downward":
 double: 4
-float: 2
+float: 3
 
 Function: Imaginary part of "ctanh_downward":
 double: 6
diff --git a/sysdeps/csky/fpu/libm-test-ulps b/sysdeps/csky/fpu/libm-test-ulps
index f8ab682f84..ee95d85682 100644
--- a/sysdeps/csky/fpu/libm-test-ulps
+++ b/sysdeps/csky/fpu/libm-test-ulps
@@ -489,19 +489,15 @@ float: 1
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 
 Function: "cosh_upward":
 double: 2
-float: 2
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/csky/nofpu/libm-test-ulps b/sysdeps/csky/nofpu/libm-test-ulps
index f8b888ea9d..64239e6e64 100644
--- a/sysdeps/csky/nofpu/libm-test-ulps
+++ b/sysdeps/csky/nofpu/libm-test-ulps
@@ -487,19 +487,15 @@ float: 2
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: "cosh_downward":
 double: 1
-float: 1
 
 Function: "cosh_towardzero":
 double: 1
-float: 1
 
 Function: "cosh_upward":
 double: 1
-float: 2
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps
index c8d4423ae0..845f6a8331 100644
--- a/sysdeps/hppa/fpu/libm-test-ulps
+++ b/sysdeps/hppa/fpu/libm-test-ulps
@@ -503,19 +503,15 @@ float: 2
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 
 Function: "cosh_upward":
 double: 2
-float: 2
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps
index f0cc6594af..f42aed258b 100644
--- a/sysdeps/i386/fpu/libm-test-ulps
+++ b/sysdeps/i386/fpu/libm-test-ulps
@@ -757,25 +757,21 @@ ldouble: 2
 
 Function: "cosh":
 double: 1
-float: 2
 float128: 2
 ldouble: 3
 
 Function: "cosh_downward":
 double: 3
-float: 1
 float128: 3
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 float128: 3
 ldouble: 3
 
 Function: "cosh_upward":
 double: 4
-float: 2
 float128: 3
 ldouble: 3
 
diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
index 8a56f383a9..374aa0a939 100644
--- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
+++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
@@ -757,25 +757,21 @@ ldouble: 2
 
 Function: "cosh":
 double: 1
-float: 2
 float128: 2
 ldouble: 3
 
 Function: "cosh_downward":
 double: 3
-float: 1
 float128: 3
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 float128: 3
 ldouble: 3
 
 Function: "cosh_upward":
 double: 4
-float: 2
 float128: 3
 ldouble: 3
 
diff --git a/sysdeps/ieee754/flt-32/e_atan2f.c b/sysdeps/ieee754/flt-32/e_atan2f.c
index 836202f122..5ebb139eea 100644
--- a/sysdeps/ieee754/flt-32/e_atan2f.c
+++ b/sysdeps/ieee754/flt-32/e_atan2f.c
@@ -121,7 +121,7 @@ __ieee754_atan2f (float y, float x)
   /* we use x+y below so that the invalid exception is set
      for (x,y) = (qnan,snan) or (snan,qnan) */
   if (ay > (0xff << 23))
-    return y + y; /* nan */
+    return x + y; /* nan */
   if (ax > (0xff << 23))
     return x + y; /* nan */
   bool yinf = ay == (0xff << 23);
diff --git a/sysdeps/ieee754/flt-32/e_coshf.c b/sysdeps/ieee754/flt-32/e_coshf.c
index 052d387e42..281c86bf3b 100644
--- a/sysdeps/ieee754/flt-32/e_coshf.c
+++ b/sysdeps/ieee754/flt-32/e_coshf.c
@@ -1,63 +1,117 @@
-/* e_coshf.c -- float version of e_cosh.c.
- */
+/* Correctly-rounded hyperbolic cosine function for binary32 value.
 
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+Copyright (c) 2022-2024 Alexei Sibidanov.
 
+The original version of this file was copied from the CORE-MATH
+project (file src/binary32/cosh/coshf.c, revision c26f1e4).
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+*/
+
+#include
 #include
-#include
-#include
 #include
-
-static const float huge = 1.0e30;
-static const float one = 1.0, half=0.5;
+#include "math_config.h"
 
 float
 __ieee754_coshf (float x)
 {
-        float t,w;
-        int32_t ix;
-
-        GET_FLOAT_WORD(ix,x);
-        ix &= 0x7fffffff;
-
-    /* |x| in [0,22] */
-        if (ix < 0x41b00000) {
-            /* |x| in [0,0.5*ln2], return 1+expm1(|x|)^2/(2*exp(|x|)) */
-            if(ix<0x3eb17218) {
-                if (ix<0x24000000) return one; /* cosh(tiny) = 1 */
-                t = __expm1f(fabsf(x));
-                w = one+t;
-                return one+(t*t)/(w+w);
-            }
-
-            /* |x| in [0.5*ln2,22], return (exp(|x|)+1/exp(|x|)/2; */
-            t = __ieee754_expf(fabsf(x));
-            return half*t+half/t;
+  static const double c[] =
+    {
+      1, 0x1.62e42fef4c4e7p-6, 0x1.ebfd1b232f475p-13, 0x1.c6b19384ecd93p-20
+    };
+  static const double ch[] =
+    {
+      1, 0x1.62e42fefa39efp-6, 0x1.ebfbdff82c58fp-13,
+      0x1.c6b08d702e0edp-20, 0x1.3b2ab6fb92e5ep-27, 0x1.5d886e6d54203p-35,
+      0x1.430976b8ce6efp-43
+    };
+  static const uint64_t tb[] =
+    {
+      0x3fe0000000000000, 0x3fe059b0d3158574, 0x3fe0b5586cf9890f,
+      0x3fe11301d0125b51, 0x3fe172b83c7d517b, 0x3fe1d4873168b9aa,
+      0x3fe2387a6e756238, 0x3fe29e9df51fdee1, 0x3fe306fe0a31b715,
+      0x3fe371a7373aa9cb, 0x3fe3dea64c123422, 0x3fe44e086061892d,
+      0x3fe4bfdad5362a27, 0x3fe5342b569d4f82, 0x3fe5ab07dd485429,
+      0x3fe6247eb03a5585, 0x3fe6a09e667f3bcd, 0x3fe71f75e8ec5f74,
+      0x3fe7a11473eb0187, 0x3fe82589994cce13, 0x3fe8ace5422aa0db,
+      0x3fe93737b0cdc5e5, 0x3fe9c49182a3f090, 0x3fea5503b23e255d,
+      0x3feae89f995ad3ad, 0x3feb7f76f2fb5e47, 0x3fec199bdd85529c,
+      0x3fecb720dcef9069, 0x3fed5818dcfba487, 0x3fedfc97337b9b5f,
+      0x3feea4afa2a490da, 0x3fef50765b6e4540
+    };
+  const double iln2 = 0x1.71547652b82fep+5;
+  double z = x;
+  uint32_t ax = asuint (x) << 1;
+  if (__glibc_unlikely (ax > 0x8565a9f8u))
+    { /* |x| >~ 89.4 */
+      if (ax >= 0xff000000u)
+        {
+          if (ax << 8)
+            return x + x; /* nan */
+          return INFINITY; /* +-inf */
         }
-
-    /* |x| in [22, log(maxdouble)] return half*exp(|x|) */
-        if (ix < 0x42b17180) return half*__ieee754_expf(fabsf(x));
-
-    /* |x| in [log(maxdouble), overflowthresold] */
-        if (ix<=0x42b2d4fc) {
-            w = __ieee754_expf(half*fabsf(x));
-            t = half*w;
-            return t*w;
+      return __math_oflowf (0);
+    }
+  if (__glibc_unlikely (ax < 0x7c000000u))
+    { /* |x| < 0.125 */
+      if (__glibc_unlikely (ax < 0x74000000u))
+        { /* |x| < 0x1p-11 */
+          if (__glibc_unlikely (ax < 0x66000000u)) /* |x| < 0x1p-24 */
+            return fmaf (fabsf (x), 0x1p-25, 1.0f);
+          return (0.5f * x) * x + 1.0f;
         }
-
-    /* x is INF or NaN */
-        if(ix>=0x7f800000) return x*x;
-
-    /* |x| > overflowthresold, cosh(x) overflow */
-        return math_narrow_eval (huge*huge);
+      static const double cp[] =
+        {
+          0x1.fffffffffffe3p-2, 0x1.55555555723cfp-5,
+          0x1.6c16bee4a5986p-10, 0x1.a0483fc0328f7p-16
+        };
+      double z2 = z * z;
+      double z4 = z2 * z2;
+      return 1 + z2 * ((cp[0] + z2 * cp[1]) + z4 * (cp[2] + z2 * (cp[3])));
+    }
+  double a = iln2 * z;
+  double ia = roundeven_finite (a);
+  double h = a - ia;
+  double h2 = h * h;
+  int64_t jp = asuint64 (ia + 0x1.8p52);
+  int64_t jm = -jp;
+  double sp = asdouble (tb[jp & 31] + ((jp >> 5) << 52));
+  double sm = asdouble (tb[jm & 31] + ((jm >> 5) << 52));
+  double te = c[0] + h2 * c[2];
+  double to = (c[1] + h2 * c[3]);
+  double rp = sp * (te + h * to);
+  double rm = sm * (te - h * to);
+  double r = rp + rm;
+  float ub = r;
+  double lb = r - 1.45e-10 * r;
+  if (__glibc_unlikely (ub != lb))
+    {
+      const double iln2h = 0x1.7154765p+5;
+      const double iln2l = 0x1.5c17f0bbbe88p-26;
+      h = (iln2h * z - ia) + iln2l * z;
+      h2 = h * h;
+      te = ch[0] + h2 * ch[2] + (h2 * h2) * (ch[4] + h2 * ch[6]);
+      to = ch[1] + h2 * (ch[3] + h2 * ch[5]);
+      r = sp * (te + h * to) + sm * (te - h * to);
+      ub = r;
+    }
+  return ub;
 }
 libm_alias_finite (__ieee754_coshf, __coshf)
diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps
index 08a3d63393..da24aa920d 100644
--- a/sysdeps/loongarch/lp64/libm-test-ulps
+++ b/sysdeps/loongarch/lp64/libm-test-ulps
@@ -625,22 +625,18 @@ ldouble: 2
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_upward":
 double: 2
-float: 2
 ldouble: 3
 
 Function: Real part of "cpow":
diff --git a/sysdeps/microblaze/libm-test-ulps b/sysdeps/microblaze/libm-test-ulps
index a6fa8376e1..367201f937 100644
--- a/sysdeps/microblaze/libm-test-ulps
+++ b/sysdeps/microblaze/libm-test-ulps
@@ -119,7 +119,6 @@ float: 1
 
 Function: "cosh":
 double: 1
-float: 1
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/mips/mips32/libm-test-ulps b/sysdeps/mips/mips32/libm-test-ulps
index 2487ffe183..6a5c723b3a 100644
--- a/sysdeps/mips/mips32/libm-test-ulps
+++ b/sysdeps/mips/mips32/libm-test-ulps
@@ -493,19 +493,15 @@ float: 2
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 
 Function: "cosh_upward":
 double: 2
-float: 2
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps
index 0ce7c06b2b..0e6a383ad2 100644
--- a/sysdeps/mips/mips64/libm-test-ulps
+++ b/sysdeps/mips/mips64/libm-test-ulps
@@ -625,22 +625,18 @@ ldouble: 2
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_upward":
 double: 2
-float: 2
 ldouble: 3
 
 Function: Real part of "cpow":
diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps
index c673d62a62..3037f731b9 100644
--- a/sysdeps/or1k/fpu/libm-test-ulps
+++ b/sysdeps/or1k/fpu/libm-test-ulps
@@ -493,19 +493,15 @@ float: 1
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 
 Function: "cosh_upward":
 double: 2
-float: 2
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps
index 8e4801fe24..f5646ee5cf 100644
--- a/sysdeps/or1k/nofpu/libm-test-ulps
+++ b/sysdeps/or1k/nofpu/libm-test-ulps
@@ -493,19 +493,15 @@ float: 1
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: "cosh_downward":
 double: 2
-float: 1
 
 Function: "cosh_towardzero":
 double: 2
-float: 1
 
 Function: "cosh_upward":
 double: 2
-float: 2
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index e394159703..0594638bf5 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -762,25 +762,21 @@ ldouble: 5
 
 Function: "cosh":
 double: 2
-float: 2
 float128: 2
 ldouble: 3
 
 Function: "cosh_downward":
 double: 3
-float: 1
 float128: 3
 ldouble: 6
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 float128: 3
 ldouble: 6
 
 Function: "cosh_upward":
 double: 2
-float: 2
 float128: 3
 ldouble: 2
 
diff --git a/sysdeps/powerpc/nofpu/libm-test-ulps b/sysdeps/powerpc/nofpu/libm-test-ulps
index deec711353..80ff04b318 100644
--- a/sysdeps/powerpc/nofpu/libm-test-ulps
+++ b/sysdeps/powerpc/nofpu/libm-test-ulps
@@ -629,22 +629,18 @@ ldouble: 5
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "cosh_downward":
 double: 3
-float: 1
 ldouble: 6
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 ldouble: 6
 
 Function: "cosh_upward":
 double: 2
-float: 2
 ldouble: 2
 
 Function: Real part of "cpow":
diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps
index 3a551201e7..48eb063323 100644
--- a/sysdeps/riscv/nofpu/libm-test-ulps
+++ b/sysdeps/riscv/nofpu/libm-test-ulps
@@ -622,22 +622,18 @@ ldouble: 2
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "cosh_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "cosh_towardzero":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "cosh_upward":
 double: 1
-float: 2
 ldouble: 3
 
 Function: Real part of "cpow":
diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps
index 2f89c82811..385c746328 100644
--- a/sysdeps/riscv/rvd/libm-test-ulps
+++ b/sysdeps/riscv/rvd/libm-test-ulps
@@ -625,22 +625,18 @@ ldouble: 2
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_upward":
 double: 2
-float: 2
 ldouble: 3
 
 Function: Real part of "cpow":
diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
index 54a42a8c51..ccc6e06a97 100644
--- a/sysdeps/s390/fpu/libm-test-ulps
+++ b/sysdeps/s390/fpu/libm-test-ulps
@@ -625,22 +625,18 @@ ldouble: 2
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_upward":
 double: 2
-float: 2
 ldouble: 3
 
 Function: Real part of "cpow":
diff --git a/sysdeps/sh/libm-test-ulps b/sysdeps/sh/libm-test-ulps
index 3694110fc1..f7131fdc86 100644
--- a/sysdeps/sh/libm-test-ulps
+++ b/sysdeps/sh/libm-test-ulps
@@ -243,11 +243,9 @@ float: 1
 
 Function: "cosh":
 double: 2
-float: 2
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 
 Function: Real part of "cpow":
 double: 2
diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
index 56a7a04480..b004005134 100644
--- a/sysdeps/sparc/fpu/libm-test-ulps
+++ b/sysdeps/sparc/fpu/libm-test-ulps
@@ -625,22 +625,18 @@ ldouble: 2
 
 Function: "cosh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "cosh_downward":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 ldouble: 3
 
 Function: "cosh_upward":
 double: 2
-float: 2
 ldouble: 3
 
 Function: Real part of "cpow":
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index 74192a4c77..89fef415b5 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -934,25 +934,21 @@ float: 1
 
 Function: "cosh":
 double: 2
-float: 2
 float128: 2
 ldouble: 3
 
 Function: "cosh_downward":
 double: 3
-float: 1
 float128: 3
 ldouble: 3
 
 Function: "cosh_towardzero":
 double: 3
-float: 1
 float128: 3
 ldouble: 3
 
 Function: "cosh_upward":
 double: 2
-float: 2
 float128: 3
 ldouble: 3
 
@@ -1226,7 +1222,7 @@ ldouble: 2
 
 Function: Real part of "ctanh_downward":
 double: 4
-float: 2
+float: 3
 float128: 5
 ldouble: 4
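
For readers following the new e_coshf.c, the fast path of the patched __ieee754_coshf can be sketched as a standalone function. This is an illustration under stated assumptions, not the glibc code: the tb[] and c[] tables and the iln2 constant are copied from the patch, but the names bits_to_double, half_exp2_32 and cosh_fast_sketch are hypothetical, roundeven_finite is replaced by a plain round-half-away conversion, and the special cases (NaN, infinity, overflow, |x| < 0.125) and the accurate ch[] fallback are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 2^(k/32) / 2 for k = 0..31 as binary64 bit patterns (the patch's tb[]).  */
static const uint64_t tb[32] =
  {
    0x3fe0000000000000, 0x3fe059b0d3158574, 0x3fe0b5586cf9890f,
    0x3fe11301d0125b51, 0x3fe172b83c7d517b, 0x3fe1d4873168b9aa,
    0x3fe2387a6e756238, 0x3fe29e9df51fdee1, 0x3fe306fe0a31b715,
    0x3fe371a7373aa9cb, 0x3fe3dea64c123422, 0x3fe44e086061892d,
    0x3fe4bfdad5362a27, 0x3fe5342b569d4f82, 0x3fe5ab07dd485429,
    0x3fe6247eb03a5585, 0x3fe6a09e667f3bcd, 0x3fe71f75e8ec5f74,
    0x3fe7a11473eb0187, 0x3fe82589994cce13, 0x3fe8ace5422aa0db,
    0x3fe93737b0cdc5e5, 0x3fe9c49182a3f090, 0x3fea5503b23e255d,
    0x3feae89f995ad3ad, 0x3feb7f76f2fb5e47, 0x3fec199bdd85529c,
    0x3fecb720dcef9069, 0x3fed5818dcfba487, 0x3fedfc97337b9b5f,
    0x3feea4afa2a490da, 0x3fef50765b6e4540
  };

/* Degree-3 approximation of 2^(h/32) for |h| <= 0.5 (the patch's c[]).  */
static const double c[4] =
  {
    1, 0x1.62e42fef4c4e7p-6, 0x1.ebfd1b232f475p-13, 0x1.c6b19384ecd93p-20
  };

static double
bits_to_double (uint64_t u)
{
  double d;
  memcpy (&d, &u, sizeof d);
  return d;
}

/* 2^(j/32) / 2: look up the low 5 bits of J, then add J >> 5 to the
   exponent field.  Assumes >> on a negative int64_t is an arithmetic
   shift, as the patch itself does.  */
static double
half_exp2_32 (int64_t j)
{
  return bits_to_double (tb[j & 31] + ((uint64_t) (j >> 5) << 52));
}

/* Illustrative fast path only: no special cases and no rounding test.  */
static double
cosh_fast_sketch (float x)
{
  assert (x >= -89.0f && x <= 89.0f);   /* also rejects NaN */
  const double iln2 = 0x1.71547652b82fep+5;   /* 32 / ln(2) */
  double a = iln2 * x;
  /* Round to the nearest integer; stands in for roundeven_finite.  */
  int64_t j = (int64_t) (a < 0 ? a - 0.5 : a + 0.5);
  double h = a - (double) j;
  double h2 = h * h;
  double te = c[0] + h2 * c[2];   /* even part of the polynomial */
  double to = c[1] + h2 * c[3];   /* odd part, divided by h */
  double rp = half_exp2_32 (j) * (te + h * to);    /* ~ exp(x) / 2 */
  double rm = half_exp2_32 (-j) * (te - h * to);   /* ~ exp(-x) / 2 */
  return rp + rm;
}
```

Calling cosh_fast_sketch (1.0f) should give a value close to 1.5430806348, which is cosh(1). The patch then rounds this double to float and falls back to the longer ch[] polynomial only when its 1.45e-10 relative error bound cannot guarantee the rounding; that step is what the sketch omits.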