From patchwork Fri Nov 29 13:17:47 2024
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 846142
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: DJ Delorie, Alexei Sibidanov, Paul Zimmermann
Subject: [PATCH 23/23] math: Use tanhf from CORE-MATH
Date: Fri, 29 Nov 2024 10:17:47 -0300
Message-ID: <20241129132032.476978-24-adhemerval.zanella@linaro.org>
In-Reply-To: <20241129132032.476978-1-adhemerval.zanella@linaro.org>
References: <20241129132032.476978-1-adhemerval.zanella@linaro.org>
List-Id: Libc-alpha mailing list

The CORE-MATH implementation is correctly rounded (for any rounding
mode) and shows slightly better performance than the generic tanhf.

The code was adapted to glibc style and to use the definitions of
math_config.h (to handle errno, overflow, and underflow).

Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

Latency                   master    patched  improvement
x86_64                   51.5273    41.0951       20.25%
x86_64v2                 47.7021    39.1526       17.92%
x86_64v3                 45.0373    34.2737       23.90%
i686                    133.9970    83.8596       37.42%
aarch64 (Neoverse)       21.5439    14.7961       31.32%
power10                  13.3301     8.4406       36.68%

reciprocal-throughput     master    patched  improvement
x86_64                   24.9493    12.8547       48.48%
x86_64v2                 20.7051    12.7761       38.29%
x86_64v3                 19.2492    11.0851       42.41%
i686                     78.6498    29.8211       62.08%
aarch64 (Neoverse)       11.6026    7.11487       38.68%
power10                   6.3328     2.8746       54.61%

Signed-off-by: Alexei Sibidanov
Signed-off-by: Paul Zimmermann
Signed-off-by: Adhemerval Zanella
---
 SHARED-FILES                               |   4 +
 sysdeps/aarch64/libm-test-ulps             |   4 -
 sysdeps/alpha/fpu/libm-test-ulps           |   4 -
 sysdeps/arc/fpu/libm-test-ulps             |   4 -
 sysdeps/arc/nofpu/libm-test-ulps           |   1 -
 sysdeps/arm/libm-test-ulps                 |   4 -
 sysdeps/csky/fpu/libm-test-ulps            |   4 -
 sysdeps/csky/nofpu/libm-test-ulps          |   4 -
 sysdeps/hppa/fpu/libm-test-ulps            |   4 -
 sysdeps/i386/fpu/libm-test-ulps            |   4 -
 .../i386/i686/fpu/multiarch/libm-test-ulps |   4 -
 sysdeps/ieee754/flt-32/s_tanhf.c           | 131 +++++++++++-------
 sysdeps/loongarch/lp64/libm-test-ulps      |   4 -
 sysdeps/m68k/m680x0/fpu/libm-test-ulps     |   3 -
 sysdeps/microblaze/libm-test-ulps          |   1 -
 sysdeps/mips/mips32/libm-test-ulps         |   4 -
 sysdeps/mips/mips64/libm-test-ulps         |   4 -
 sysdeps/or1k/fpu/libm-test-ulps            |   4 -
 sysdeps/or1k/nofpu/libm-test-ulps          |   4 -
 sysdeps/powerpc/fpu/libm-test-ulps         |   4 -
 sysdeps/powerpc/nofpu/libm-test-ulps       |   4 -
 sysdeps/riscv/nofpu/libm-test-ulps         |   4 -
 sysdeps/riscv/rvd/libm-test-ulps           |   4 -
 sysdeps/s390/fpu/libm-test-ulps            |   4 -
 sysdeps/sh/libm-test-ulps                  |   2 -
 sysdeps/sparc/fpu/libm-test-ulps           |   4 -
 sysdeps/x86_64/fpu/libm-test-ulps          |   4 -
 27 files changed, 82 insertions(+), 144 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES
index 3bd4e7fb4a..032c407881 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -330,3 +330,7 @@ sysdeps/ieee754/flt-32/e_sinhf.c:
   (src/binary32/sinh/sinhf.c in CORE-MATH)
   - the code was adapted to use glibc code style and internal
     functions to handle errno, overflow, and underflow.
+sysdeps/ieee754/flt-32/s_tanhf.c:
+  (src/binary32/tanh/tanhf.c in CORE-MATH)
+  - the code was adapted to use glibc code style and internal
+    functions to handle errno, overflow, and underflow.
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index 4545d8236b..ab8f7e830e 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -1549,7 +1549,6 @@ ldouble: 1
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "tanh_advsimd":
@@ -1558,7 +1557,6 @@ float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_sve":
@@ -1567,12 +1565,10 @@ float: 2
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/alpha/fpu/libm-test-ulps b/sysdeps/alpha/fpu/libm-test-ulps
index d4329e060d..59ae18c412 100644
--- a/sysdeps/alpha/fpu/libm-test-ulps
+++ b/sysdeps/alpha/fpu/libm-test-ulps
@@ -1322,22 +1322,18 @@ ldouble: 1
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps
index eb8296d736..f11dc58686 100644
--- a/sysdeps/arc/fpu/libm-test-ulps
+++ b/sysdeps/arc/fpu/libm-test-ulps
@@ -1057,19 +1057,15 @@ double: 1
 
 Function: "tanh":
 double: 3
-float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 
 Function: "tanh_towardzero":
 double: 3
-float: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps
index 519a174f7a..f549db1801 100644
--- a/sysdeps/arc/nofpu/libm-test-ulps
+++ b/sysdeps/arc/nofpu/libm-test-ulps
@@ -252,7 +252,6 @@ double: 2
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps
index 7be1a7c75b..705b0641f5 100644
--- a/sysdeps/arm/libm-test-ulps
+++ b/sysdeps/arm/libm-test-ulps
@@ -1051,19 +1051,15 @@ double: 1
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 
 Function: "tanh_upward":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/csky/fpu/libm-test-ulps b/sysdeps/csky/fpu/libm-test-ulps
index ffc0676765..0e0cc63f07 100644
--- a/sysdeps/csky/fpu/libm-test-ulps
+++ b/sysdeps/csky/fpu/libm-test-ulps
@@ -975,19 +975,15 @@ double: 1
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 
 Function: "tanh_upward":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/csky/nofpu/libm-test-ulps b/sysdeps/csky/nofpu/libm-test-ulps
index a7c85db00d..a9f6566e2c 100644
--- a/sysdeps/csky/nofpu/libm-test-ulps
+++ b/sysdeps/csky/nofpu/libm-test-ulps
@@ -1006,19 +1006,15 @@ double: 1
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 
 Function: "tanh_upward":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps
index 0fbb2f81bb..ea344c2f6c 100644
--- a/sysdeps/hppa/fpu/libm-test-ulps
+++ b/sysdeps/hppa/fpu/libm-test-ulps
@@ -1084,19 +1084,15 @@ double: 1
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 
 Function: "tanh_upward":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps
index 26101c933e..3f00faf29f 100644
--- a/sysdeps/i386/fpu/libm-test-ulps
+++ b/sysdeps/i386/fpu/libm-test-ulps
@@ -1613,25 +1613,21 @@ ldouble: 2
 
 Function: "tanh":
 double: 2
-float: 2
 float128: 2
 ldouble: 3
 
 Function: "tanh_downward":
 double: 3
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 float128: 3
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 float128: 3
 ldouble: 4
 
diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
index 92e821d609..af9ac1cf1c 100644
--- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
+++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
@@ -1618,25 +1618,21 @@ ldouble: 2
 
 Function: "tanh":
 double: 2
-float: 2
 float128: 2
 ldouble: 3
 
 Function: "tanh_downward":
 double: 3
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 float128: 3
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 float128: 3
 ldouble: 4
 
diff --git a/sysdeps/ieee754/flt-32/s_tanhf.c b/sysdeps/ieee754/flt-32/s_tanhf.c
index 2c12f04569..b7273d9724 100644
--- a/sysdeps/ieee754/flt-32/s_tanhf.c
+++ b/sysdeps/ieee754/flt-32/s_tanhf.c
@@ -1,63 +1,88 @@
-/* s_tanhf.c -- float version of s_tanh.c.
- */
+/* Correctly-rounded hyperbolic tangent function for binary32 value.
 
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+Copyright (c) 2022-2024 Alexei Sibidanov.
 
-#if defined(LIBM_SCCS) && !defined(lint)
-static char rcsid[] = "$NetBSD: s_tanhf.c,v 1.4 1995/05/10 20:48:24 jtc Exp $";
-#endif
+The original version of this file was copied from the CORE-MATH
+project (file src/binary32/tanh/tanhf.c, revision bc385c2).
 
-#include
-#include
-#include
-#include
-#include
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
 
-static const float one=1.0, two=2.0, tiny = 1.0e-30;
-
-float __tanhf(float x)
-{
-        float t,z;
-        int32_t jx,ix;
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
 
-        GET_FLOAT_WORD(jx,x);
-        ix = jx&0x7fffffff;
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+*/
 
-    /* x is INF or NaN */
-        if(ix>=0x7f800000) {
-            if (jx>=0) return one/x+one;    /* tanh(+-inf)=+-1 */
-            else       return one/x-one;    /* tanh(NaN) = NaN */
-        }
+#include
+#include
+#include
+#include "math_config.h"
 
-    /* |x| < 22 */
-        if (ix < 0x41b00000) {          /* |x|<22 */
-            if (ix == 0)
-                return x;               /* x == +-0 */
-            if (ix<0x24000000)          /* |x|<2**-55 */
-              {
-                math_check_force_underflow (x);
-                return x*(one+x);       /* tanh(small) = small */
-              }
-            if (ix>=0x3f800000) {       /* |x|>=1 */
-                t = __expm1f(two*fabsf(x));
-                z = one - two/(t+two);
-            } else {
-                t = __expm1f(-two*fabsf(x));
-                z= -t/(t+two);
-            }
-    /* |x| > 22, return +-1 */
-        } else {
-            z = one - tiny;             /* raised inexact flag */
+float
+__tanhf (float x)
+{
+  double z = x;
+  uint32_t ux = asuint (x);
+  int e = (ux >> 23) & 0xff;
+  if (__glibc_unlikely (e == 0xff))
+    {
+      if (ux << 9)
+        return x + x; /* nan */
+      static const float ir[] = { 1.0f, -1.0f };
+      return ir[ux >> 31]; /* +-inf */
+    }
+  if (__glibc_unlikely (e < 115))
+    {
+      if (__glibc_unlikely (e < 102))
+        {
+          if (__glibc_unlikely ((ux << 1) == 0))
+            return x;
+          return fmaf (-x, fabsf (x), x);
         }
-        return (jx>=0)? z: -z;
+      float x2 = x * x;
+      return fmaf (x, -0x1.555556p-2f * x2, x);
+    }
+  if ((ux << 1) > (0x41102cb3u << 1))
+    return copysignf (1.0f, x) - copysignf (0x1p-25f, x);
+  double z2 = z * z, z4 = z2 * z2, z8 = z4 * z4;
+  static const double cn[] =
+    {
+      0x1p+0, 0x1.30877b8b72d33p-3, 0x1.694aa09ae9e5ep-8,
+      0x1.4101377abb729p-14, 0x1.e0392b1db0018p-22, 0x1.2533756e546f7p-30,
+      0x1.d62e5abe6ae8ap-41, 0x1.b06be534182dep-54
+    };
+  static const double cd[] =
+    {
+      0x1p+0, 0x1.ed99131b0ebeap-2, 0x1.0d27ed6c95a69p-5,
+      0x1.7cbdaca0e9fccp-11, 0x1.b4e60b892578ep-18, 0x1.a6f707c5c71abp-26,
+      0x1.35a8b6e2cd94cp-35, 0x1.ca8230677aa01p-47
+    };
+  double n0 = cn[0] + z2 * cn[1];
+  double n2 = cn[2] + z2 * cn[3];
+  double n4 = cn[4] + z2 * cn[5];
+  double n6 = cn[6] + z2 * cn[7];
+  n0 += z4 * n2;
+  n4 += z4 * n6;
+  n0 += z8 * n4;
+  double d0 = cd[0] + z2 * cd[1];
+  double d2 = cd[2] + z2 * cd[3];
+  double d4 = cd[4] + z2 * cd[5];
+  double d6 = cd[6] + z2 * cd[7];
+  d0 += z4 * d2;
+  d4 += z4 * d6;
+  d0 += z8 * d4;
+  double r = z * n0 / d0;
+  return r;
 }
 libm_alias_float (__tanh, tanh)
diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps
index 1a9ef3c217..a5a8850929 100644
--- a/sysdeps/loongarch/lp64/libm-test-ulps
+++ b/sysdeps/loongarch/lp64/libm-test-ulps
@@ -1329,22 +1329,18 @@ ldouble: 1
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/m68k/m680x0/fpu/libm-test-ulps b/sysdeps/m68k/m680x0/fpu/libm-test-ulps
index 7b3e67efaf..573cfdd45a 100644
--- a/sysdeps/m68k/m680x0/fpu/libm-test-ulps
+++ b/sysdeps/m68k/m680x0/fpu/libm-test-ulps
@@ -1160,15 +1160,12 @@ double: 1
 
 Function: "tanh_downward":
 double: 1
-float: 1
 
 Function: "tanh_towardzero":
 double: 1
-float: 1
 
 Function: "tanh_upward":
 double: 1
-float: 1
 
 Function: "tgamma":
 double: 3
diff --git a/sysdeps/microblaze/libm-test-ulps b/sysdeps/microblaze/libm-test-ulps
index c9fc2aa5aa..f7eac16221 100644
--- a/sysdeps/microblaze/libm-test-ulps
+++ b/sysdeps/microblaze/libm-test-ulps
@@ -234,7 +234,6 @@ double: 2
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tgamma":
 double: 5
diff --git a/sysdeps/mips/mips32/libm-test-ulps b/sysdeps/mips/mips32/libm-test-ulps
index 2155853acf..418887d215 100644
--- a/sysdeps/mips/mips32/libm-test-ulps
+++ b/sysdeps/mips/mips32/libm-test-ulps
@@ -1054,19 +1054,15 @@ double: 1
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 
 Function: "tanh_upward":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps
index c4dfef0b23..fb2689efb7 100644
--- a/sysdeps/mips/mips64/libm-test-ulps
+++ b/sysdeps/mips/mips64/libm-test-ulps
@@ -1340,22 +1340,18 @@ ldouble: 1
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps
index f352f71c7d..1acfd0fcaf 100644
--- a/sysdeps/or1k/fpu/libm-test-ulps
+++ b/sysdeps/or1k/fpu/libm-test-ulps
@@ -988,19 +988,15 @@ double: 1
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 
 Function: "tanh_upward":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps
index a69026e9ee..d0a00f9de1 100644
--- a/sysdeps/or1k/nofpu/libm-test-ulps
+++ b/sysdeps/or1k/nofpu/libm-test-ulps
@@ -978,19 +978,15 @@ double: 1
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 
 Function: "tanh_upward":
 double: 3
-float: 3
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 210ea0a26b..4d22d956cd 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1721,25 +1721,21 @@ ldouble: 3
 
 Function: "tanh":
 double: 2
-float: 2
 float128: 2
 ldouble: 1
 
 Function: "tanh_downward":
 double: 3
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 float128: 3
 ldouble: 4
 
 Function: "tanh_upward":
 double: 3
-float: 3
 float128: 3
 ldouble: 6
 
diff --git a/sysdeps/powerpc/nofpu/libm-test-ulps b/sysdeps/powerpc/nofpu/libm-test-ulps
index c2a0a64d50..244345f934 100644
--- a/sysdeps/powerpc/nofpu/libm-test-ulps
+++ b/sysdeps/powerpc/nofpu/libm-test-ulps
@@ -1456,22 +1456,18 @@ ldouble: 3
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 1
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 4
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 6
 
 Function: "tgamma":
diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps
index b29beefdba..22041cf093 100644
--- a/sysdeps/riscv/nofpu/libm-test-ulps
+++ b/sysdeps/riscv/nofpu/libm-test-ulps
@@ -1269,22 +1269,18 @@ ldouble: 1
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps
index b78c11ec09..511d6a3b90 100644
--- a/sysdeps/riscv/rvd/libm-test-ulps
+++ b/sysdeps/riscv/rvd/libm-test-ulps
@@ -1327,22 +1327,18 @@ ldouble: 1
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
index 2beaf10dc4..528d1ade6e 100644
--- a/sysdeps/s390/fpu/libm-test-ulps
+++ b/sysdeps/s390/fpu/libm-test-ulps
@@ -1326,22 +1326,18 @@ ldouble: 1
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/sh/libm-test-ulps b/sysdeps/sh/libm-test-ulps
index 002218c7fa..783ee44a7b 100644
--- a/sysdeps/sh/libm-test-ulps
+++ b/sysdeps/sh/libm-test-ulps
@@ -488,11 +488,9 @@ double: 1
 
 Function: "tanh":
 double: 2
-float: 2
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 
 Function: "tgamma":
 double: 9
diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
index 72ecefff12..dbafba668e 100644
--- a/sysdeps/sparc/fpu/libm-test-ulps
+++ b/sysdeps/sparc/fpu/libm-test-ulps
@@ -1340,22 +1340,18 @@ ldouble: 1
 
 Function: "tanh":
 double: 2
-float: 2
 ldouble: 2
 
 Function: "tanh_downward":
 double: 3
-float: 3
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 ldouble: 3
 
 Function: "tgamma":
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index 038622d624..7df7b51a3f 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -2140,25 +2140,21 @@ float: 2
 
 Function: "tanh":
 double: 2
-float: 2
 float128: 2
 ldouble: 3
 
 Function: "tanh_downward":
 double: 3
-float: 3
 float128: 4
 ldouble: 4
 
 Function: "tanh_towardzero":
 double: 2
-float: 2
 float128: 3
 ldouble: 3
 
 Function: "tanh_upward":
 double: 3
-float: 3
 float128: 3
 ldouble: 4
 