From patchwork Mon Nov 11 13:45:46 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 842459
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Alexei Sibidanov, Paul Zimmermann
Subject: [PATCH 08/11] math: Use erff from CORE-MATH
Date: Mon, 11 Nov 2024 10:45:46 -0300
Message-ID: <20241111134740.1410635-9-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241111134740.1410635-1-adhemerval.zanella@linaro.org>
References: <20241111134740.1410635-1-adhemerval.zanella@linaro.org>

The CORE-MATH implementation is correctly rounded (for any rounding
mode) and shows better performance than the generic erff.

The code was adapted to glibc style and to use the definitions from
math_config.h.

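As a rough illustration only (not part of the patch): for |x| < 0.4375 the
new code evaluates a degree-7 polynomial in x^2 in double precision with an
Estrin-style scheme. The sketch below copies the coefficients from the
patched s_erff.c; the helper name erf_small_sketch is hypothetical, and it
returns the double result before the final rounding to float that __erff
performs.

```c
#include <assert.h>

/* Coefficients copied from the |x| < 0.4375 branch of the patched
   s_erff.c: a degree-7 polynomial in x^2 approximating erf(x)/x.  */
static const double c[] = {
  0x1.20dd750429b6dp+0, -0x1.812746b0375fbp-2,
  0x1.ce2f219fd6f45p-4, -0x1.b82ce2cbf0838p-6,
  0x1.565bb655adb85p-8, -0x1.c025bfc879c94p-11,
  0x1.f81718f61309cp-14, -0x1.cc67bd88f5867p-17
};

/* Hypothetical helper for illustration, not a glibc symbol.  It is only
   meaningful for |x| < 0.4375, the range handled by this branch.  */
static double
erf_small_sketch (float x)
{
  assert (x > -0.4375f && x < 0.4375f);
  double s = x;
  double z2 = s * s, z4 = z2 * z2, z8 = z4 * z4;
  /* Estrin-style evaluation: pair neighbouring terms first, then
     combine the pairs using z4 and z8.  */
  double c0 = c[0] + z2 * c[1];
  double c2 = c[2] + z2 * c[3];
  double c4 = c[4] + z2 * c[5];
  double c6 = c[6] + z2 * c[7];
  c0 += z4 * c2;
  c4 += z4 * c6;
  c0 += z8 * c4;
  /* The polynomial approximates erf(x)/x, so multiply by x.  */
  return s * c0;
}
```

Pairing the terms shortens the dependency chain compared with a plain
Horner loop, which is presumably one reason for the latency gains reported
in the benchmark; that reading is an assumption, not a claim from the
patch.
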
Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (M1,
gcc 13.2.1), and powerpc (POWER10, gcc 13.2.1):

latency                       master        patched   improvement
x86_64                       85.7363        45.1372        47.35%
x86_64v2                     86.6337        38.5816        55.47%
x86_64v3                     71.3810        34.0843        52.25%
i686                         190.143        97.5014        48.72%
aarch64                      34.9091        14.9320        57.23%
power10                      38.6160         8.5188        77.94%
powerpc                      39.7446        8.45781        78.72%

reciprocal-throughput         master        patched   improvement
x86_64                       35.1739        14.7603        58.04%
x86_64v2                     34.5976        11.2283        67.55%
x86_64v3                     27.3260         9.8550        63.94%
i686                         91.0282        30.8840        66.07%
aarch64                      22.5831         6.9615        69.17%
power10                      18.0386         3.0918        82.86%
powerpc                      20.7277        3.63396        82.47%

Signed-off-by: Alexei Sibidanov
Signed-off-by: Paul Zimmermann
Signed-off-by: Adhemerval Zanella
---
 SHARED-FILES                                  |   4 +
 sysdeps/aarch64/libm-test-ulps                |   1 -
 sysdeps/alpha/fpu/libm-test-ulps              |   4 -
 sysdeps/arc/fpu/libm-test-ulps                |   4 -
 sysdeps/arc/nofpu/libm-test-ulps              |   1 -
 sysdeps/arm/libm-test-ulps                    |   4 -
 sysdeps/csky/fpu/libm-test-ulps               |   4 -
 sysdeps/csky/nofpu/libm-test-ulps             |   4 -
 sysdeps/hppa/fpu/libm-test-ulps               |   4 -
 sysdeps/i386/fpu/libm-test-ulps               |   4 -
 .../i386/i686/fpu/multiarch/libm-test-ulps    |   4 -
 sysdeps/ieee754/flt-32/s_erff.c               | 393 +++++++++++------
 sysdeps/loongarch/lp64/libm-test-ulps         |   4 -
 sysdeps/m68k/m680x0/fpu/libm-test-ulps        |   4 -
 sysdeps/microblaze/libm-test-ulps             |   1 -
 sysdeps/mips/mips32/libm-test-ulps            |   4 -
 sysdeps/mips/mips64/libm-test-ulps            |   4 -
 sysdeps/nios2/libm-test-ulps                  |   1 -
 sysdeps/or1k/fpu/libm-test-ulps               |   4 -
 sysdeps/or1k/nofpu/libm-test-ulps             |   4 -
 sysdeps/powerpc/fpu/libm-test-ulps            |   4 -
 sysdeps/powerpc/nofpu/libm-test-ulps          |   4 -
 sysdeps/riscv/nofpu/libm-test-ulps            |   4 -
 sysdeps/riscv/rvd/libm-test-ulps              |   4 -
 sysdeps/s390/fpu/libm-test-ulps               |   4 -
 sysdeps/sh/libm-test-ulps                     |   2 -
 sysdeps/sparc/fpu/libm-test-ulps              |   4 -
 sysdeps/x86_64/fpu/libm-test-ulps             |   4 -
 28 files changed, 251 insertions(+), 236 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES
index d367f4b62f..ccc5179f80 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -272,3 +272,7 @@ sysdeps/ieee754/flt-32/s_cbrtf.c
   (file src/binary32/cbrt/cbrtf.c in CORE-MATH)
   - The code was adapted to use glibc code style and internal
     functions to handle errno, overflow, and underflow.
+sysdeps/ieee754/flt-32/s_erff.c
+  (file src/binary32/erf/erff.c in CORE-MATH)
+  - The code was adapted to use glibc code style and internal
+    functions to handle errno, overflow, and underflow.
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index 4979769b58..fc10f7f80d 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -991,7 +991,6 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_advsimd":
diff --git a/sysdeps/alpha/fpu/libm-test-ulps b/sysdeps/alpha/fpu/libm-test-ulps
index a2b5404f9d..bbb3a5c459 100644
--- a/sysdeps/alpha/fpu/libm-test-ulps
+++ b/sysdeps/alpha/fpu/libm-test-ulps
@@ -909,22 +909,18 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erfc":
diff --git a/sysdeps/arc/fpu/libm-test-ulps b/sysdeps/arc/fpu/libm-test-ulps
index c6f3646797..9e422da289 100644
--- a/sysdeps/arc/fpu/libm-test-ulps
+++ b/sysdeps/arc/fpu/libm-test-ulps
@@ -733,19 +733,15 @@ float: 6
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 2
 
 Function: "erf_upward":
 double: 2
-float: 2
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/arc/nofpu/libm-test-ulps b/sysdeps/arc/nofpu/libm-test-ulps
index 6319012db5..2c24fdf663 100644
--- a/sysdeps/arc/nofpu/libm-test-ulps
+++ b/sysdeps/arc/nofpu/libm-test-ulps
@@ -177,7 +177,6 @@ float: 2
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/arm/libm-test-ulps b/sysdeps/arm/libm-test-ulps
index d9317046a9..153cd1f3d7 100644
--- a/sysdeps/arm/libm-test-ulps
+++ b/sysdeps/arm/libm-test-ulps
@@ -726,19 +726,15 @@ float: 3
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/csky/fpu/libm-test-ulps b/sysdeps/csky/fpu/libm-test-ulps
index c3a3db9bcb..d276db245b 100644
--- a/sysdeps/csky/fpu/libm-test-ulps
+++ b/sysdeps/csky/fpu/libm-test-ulps
@@ -719,19 +719,15 @@ float: 3
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/csky/nofpu/libm-test-ulps b/sysdeps/csky/nofpu/libm-test-ulps
index 68a74bf1d0..ea08fd5378 100644
--- a/sysdeps/csky/nofpu/libm-test-ulps
+++ b/sysdeps/csky/nofpu/libm-test-ulps
@@ -717,19 +717,15 @@ float: 3
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/hppa/fpu/libm-test-ulps b/sysdeps/hppa/fpu/libm-test-ulps
index a54737db2e..7e4f6ebe77 100644
--- a/sysdeps/hppa/fpu/libm-test-ulps
+++ b/sysdeps/hppa/fpu/libm-test-ulps
@@ -739,20 +739,16 @@ float: 3
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/i386/fpu/libm-test-ulps b/sysdeps/i386/fpu/libm-test-ulps
index a77ded2648..041c180f7a 100644
--- a/sysdeps/i386/fpu/libm-test-ulps
+++ b/sysdeps/i386/fpu/libm-test-ulps
@@ -1078,25 +1078,21 @@ ldouble: 3
 
 Function: "erf":
 double: 1
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 float128: 2
 ldouble: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 float128: 2
 ldouble: 1
 
diff --git a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
index a9cd01bf03..e3ee0c61f6 100644
--- a/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
+++ b/sysdeps/i386/i686/fpu/multiarch/libm-test-ulps
@@ -1081,25 +1081,21 @@ ldouble: 3
 
 Function: "erf":
 double: 1
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 float128: 2
 ldouble: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 float128: 2
 ldouble: 1
 
diff --git a/sysdeps/ieee754/flt-32/s_erff.c b/sysdeps/ieee754/flt-32/s_erff.c
index 6c541dba23..762f160e9f 100644
--- a/sysdeps/ieee754/flt-32/s_erff.c
+++ b/sysdeps/ieee754/flt-32/s_erff.c
@@ -1,155 +1,256 @@
-/* s_erff.c -- float version of s_erf.c.
- */
+/* Correctly-rounded error function for binary32 value.
 
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+Copyright (c) 2022-2024 Alexei Sibidanov.
 
-#if defined(LIBM_SCCS) && !defined(lint)
-static char rcsid[] = "$NetBSD: s_erff.c,v 1.4 1995/05/10 20:47:07 jtc Exp $";
-#endif
+This file is part of the CORE-MATH project
+project (file src/binary32/erf/erff.c revision bc385c2).
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+*/ -#include -#include #include -#include -#include -#include +#include #include -#include +#include "math_config.h" -static const float -tiny = 1e-30, -one = 1.0000000000e+00, /* 0x3F800000 */ -erx = 8.4506291151e-01, /* 0x3f58560b */ -/* - * Coefficients for approximation to erf on [0,0.84375] - */ -efx = 1.2837916613e-01, /* 0x3e0375d4 */ -pp0 = 1.2837916613e-01, /* 0x3e0375d4 */ -pp1 = -3.2504209876e-01, /* 0xbea66beb */ -pp2 = -2.8481749818e-02, /* 0xbce9528f */ -pp3 = -5.7702702470e-03, /* 0xbbbd1489 */ -pp4 = -2.3763017452e-05, /* 0xb7c756b1 */ -qq1 = 3.9791721106e-01, /* 0x3ecbbbce */ -qq2 = 6.5022252500e-02, /* 0x3d852a63 */ -qq3 = 5.0813062117e-03, /* 0x3ba68116 */ -qq4 = 1.3249473704e-04, /* 0x390aee49 */ -qq5 = -3.9602282413e-06, /* 0xb684e21a */ -/* - * Coefficients for approximation to erf in [0.84375,1.25] - */ -pa0 = -2.3621185683e-03, /* 0xbb1acdc6 */ -pa1 = 4.1485610604e-01, /* 0x3ed46805 */ -pa2 = -3.7220788002e-01, /* 0xbebe9208 */ -pa3 = 3.1834661961e-01, /* 0x3ea2fe54 */ -pa4 = -1.1089469492e-01, /* 0xbde31cc2 */ -pa5 = 3.5478305072e-02, /* 0x3d1151b3 */ -pa6 = -2.1663755178e-03, /* 0xbb0df9c0 */ -qa1 = 1.0642088205e-01, /* 0x3dd9f331 */ -qa2 = 5.4039794207e-01, /* 0x3f0a5785 */ -qa3 = 7.1828655899e-02, /* 0x3d931ae7 */ -qa4 = 1.2617121637e-01, /* 0x3e013307 */ -qa5 = 1.3637083583e-02, /* 0x3c5f6e13 */ -qa6 = 1.1984500103e-02, /* 0x3c445aa3 */ -/* - * Coefficients for approximation to erfc in [1.25,1/0.35] - */ -ra0 = -9.8649440333e-03, /* 0xbc21a093 */ -ra1 = -6.9385856390e-01, /* 0xbf31a0b7 */ -ra2 = -1.0558626175e+01, /* 0xc128f022 */ -ra3 = -6.2375331879e+01, /* 0xc2798057 */ -ra4 = -1.6239666748e+02, /* 0xc322658c */ -ra5 = -1.8460508728e+02, /* 0xc3389ae7 */ -ra6 = -8.1287437439e+01, /* 0xc2a2932b */ -ra7 = -9.8143291473e+00, /* 0xc11d077e */ -sa1 = 1.9651271820e+01, /* 0x419d35ce */ -sa2 = 1.3765776062e+02, /* 0x4309a863 */ -sa3 = 4.3456588745e+02, /* 0x43d9486f */ -sa4 = 6.4538726807e+02, /* 0x442158c9 */ -sa5 = 4.2900814819e+02, /* 
0x43d6810b */ -sa6 = 1.0863500214e+02, /* 0x42d9451f */ -sa7 = 6.5702495575e+00, /* 0x40d23f7c */ -sa8 = -6.0424413532e-02, /* 0xbd777f97 */ -/* - * Coefficients for approximation to erfc in [1/.35,28] - */ -rb0 = -9.8649431020e-03, /* 0xbc21a092 */ -rb1 = -7.9928326607e-01, /* 0xbf4c9dd4 */ -rb2 = -1.7757955551e+01, /* 0xc18e104b */ -rb3 = -1.6063638306e+02, /* 0xc320a2ea */ -rb4 = -6.3756646729e+02, /* 0xc41f6441 */ -rb5 = -1.0250950928e+03, /* 0xc480230b */ -rb6 = -4.8351919556e+02, /* 0xc3f1c275 */ -sb1 = 3.0338060379e+01, /* 0x41f2b459 */ -sb2 = 3.2579251099e+02, /* 0x43a2e571 */ -sb3 = 1.5367296143e+03, /* 0x44c01759 */ -sb4 = 3.1998581543e+03, /* 0x4547fdbb */ -sb5 = 2.5530502930e+03, /* 0x451f90ce */ -sb6 = 4.7452853394e+02, /* 0x43ed43a7 */ -sb7 = -2.2440952301e+01; /* 0xc1b38712 */ - -float __erff(float x) +float +__erff (float x) { - int32_t hx,ix,i; - float R,S,P,Q,s,y,z,r; - GET_FLOAT_WORD(hx,x); - ix = hx&0x7fffffff; - if(ix>=0x7f800000) { /* erf(nan)=nan */ - i = ((uint32_t)hx>>31)<<1; - return (float)(1-i)+one/x; /* erf(+-inf)=+-1 */ - } - - if(ix < 0x3f580000) { /* |x|<0.84375 */ - if(ix < 0x31800000) { /* |x|<2**-28 */ - if (ix < 0x04000000) - { - /* Avoid spurious underflow. 
*/ - float ret = 0.0625f * (16.0f * x + (16.0f * efx) * x); - math_check_force_underflow (ret); - return ret; - } - return x + efx*x; - } - z = x*x; - r = pp0+z*(pp1+z*(pp2+z*(pp3+z*pp4))); - s = one+z*(qq1+z*(qq2+z*(qq3+z*(qq4+z*qq5)))); - y = r/s; - return x + x*y; - } - if(ix < 0x3fa00000) { /* 0.84375 <= |x| < 1.25 */ - s = fabsf(x)-one; - P = pa0+s*(pa1+s*(pa2+s*(pa3+s*(pa4+s*(pa5+s*pa6))))); - Q = one+s*(qa1+s*(qa2+s*(qa3+s*(qa4+s*(qa5+s*qa6))))); - if(hx>=0) return erx + P/Q; else return -erx - P/Q; - } - if (ix >= 0x40c00000) { /* inf>|x|>=6 */ - if(hx>=0) return one-tiny; else return tiny-one; - } - x = fabsf(x); - s = one/(x*x); - if(ix< 0x4036DB6E) { /* |x| < 1/0.35 */ - R=ra0+s*(ra1+s*(ra2+s*(ra3+s*(ra4+s*( - ra5+s*(ra6+s*ra7)))))); - S=one+s*(sa1+s*(sa2+s*(sa3+s*(sa4+s*( - sa5+s*(sa6+s*(sa7+s*sa8))))))); - } else { /* |x| >= 1/0.35 */ - R=rb0+s*(rb1+s*(rb2+s*(rb3+s*(rb4+s*( - rb5+s*rb6))))); - S=one+s*(sb1+s*(sb2+s*(sb3+s*(sb4+s*( - sb5+s*(sb6+s*sb7)))))); - } - GET_FLOAT_WORD(ix,x); - SET_FLOAT_WORD(z,ix&0xfffff000); - r = __ieee754_expf(-z*z-(float)0.5625)*__ieee754_expf((z-x)*(z+x)+R/S); - if(hx>=0) return one-r/x; else return r/x-one; + /* for 7 <= i < 63, C[i-7] is a degree-7 polynomial approximation of + erf(i/16+1/32+x) for -1/32 <= x <= 1/32 */ + static const double C[56][8] = { + { 0x1.f86faa9428f9cp-2, 0x1.cfc41e36c7dfap-1, -0x1.b2c7dc53508b9p-2, + -0x1.5a9de93fa556ep-3, 0x1.731793dbb01b5p-3, 0x1.133e06426cf18p-6, + -0x1.a12a6289cafd8p-5, 0x1.717d6f1d6f557p-9 }, + { 0x1.1855a5fd3dd50p-1, 0x1.b3aafcc27502fp-1, -0x1.cee5ac8e92bb2p-2, + -0x1.fa02983ca2d79p-4, 0x1.77cd746cb1922p-3, -0x1.fa6f277886487p-10, + -0x1.8de75458db416p-5, 0x1.00899c98551c9p-7 }, + { 0x1.32a54cb8db67ap-1, 0x1.96164fafd8de5p-1, -0x1.e23a7ea0c9ad3p-2, + -0x1.3f5ee15671cf4p-4, 0x1.70e468a3d72d9p-3, -0x1.3da68037cfc99p-6, + -0x1.69ed9ba1f9839p-5, 0x1.8cab9244a4ff4p-7 }, + { 0x1.4b13713ad3513p-1, 0x1.7791b886e7405p-1, -0x1.ecef423109bf5p-2, + -0x1.15c3c5cec6847p-5, 
0x1.5f688fc931ba6p-3, -0x1.1da63ed190037p-5, + -0x1.38427ca63cca4p-5, 0x1.fa00e52525e17p-7 }, + { 0x1.61955607dd15dp-1, 0x1.58a445da7c74ep-1, -0x1.ef6c246a0f66cp-2, + 0x1.e83e0d9d61330p-8, 0x1.44cc65535bc9fp-3, -0x1.87d3c4860435dp-5, + -0x1.f90b10501169bp-6, 0x1.22295856d427ap-6 }, + { 0x1.762870f720c6fp-1, 0x1.39ccc1b136d5cp-1, -0x1.ea4feea4e4744p-2, + 0x1.715e5952ebfbap-5, 0x1.22cdbd83c75c4p-3, -0x1.da50aa1d925b6p-5, + -0x1.754dc0a29b4ddp-6, 0x1.350b6bef9392cp-6 }, + { 0x1.88d1cd474a2e0p-1, 0x1.1b7e98fe26219p-1, -0x1.de65a22ce1419p-2, + 0x1.40686a3f16400p-4, 0x1.f6b0cbb216b2bp-4, -0x1.09c7c903edd57p-4, + -0x1.da7529fde641p-7, 0x1.362a7a0588eabp-6 }, + { 0x1.999d4192a5717p-1, 0x1.fc3ee5d1524b3p-2, -0x1.cc990045b55c8p-2, + 0x1.b37338e68b37dp-4, 0x1.a0d120c872ea7p-4, -0x1.19bb2b07ecff6p-4, + -0x1.a110f5f593aafp-8, 0x1.272c15a57720ep-6 }, + { 0x1.a89c850b7d54dp-1, 0x1.c40b0729ed54ap-2, -0x1.b5eaaef0a2346p-2, + 0x1.0847c7dacbae1p-3, 0x1.47de0ba6d18fbp-4, -0x1.1d9de77a4b648p-4, + 0x1.30ffbe56f0726p-10, 0x1.0a9cb99feea01p-6 }, + { 0x1.b5e62fce16096p-1, 0x1.8eed36b886d95p-2, -0x1.9b64a06e50705p-2, + 0x1.2bb6e2c744df5p-3, 0x1.dee3261ca61bcp-5, -0x1.16996004f7da5p-4, + 0x1.fdff37bae983ep-8, 0x1.c750083e65f9ap-7 }, + { 0x1.c194b1d49a184p-1, 0x1.5d4fd33729015p-2, -0x1.7e0f4f045addbp-2, + 0x1.444bc66c31a1bp-3, 0x1.356dbf16ec8f1p-5, -0x1.0643de0906cd8p-4, + 0x1.b281af7bd3a2cp-7, 0x1.6b97eaa2c6abdp-7 }, + { 0x1.cbc54b476248ep-1, 0x1.2f7cc3fe6f423p-2, -0x1.5ee8429e36de8p-2, + 0x1.52a8395f96177p-3, 0x1.313761ba257dcp-6, -0x1.dcf844d5fed8fp-5, + 0x1.1e1420f475fa9p-6, 0x1.091c7dc1e18b2p-7 }, + { 0x1.d4970f9ce00d9p-1, 0x1.059f59af7a905p-2, -0x1.3eda354de36c3p-2, + 0x1.57b85ad439779p-3, 0x1.8e913b9778136p-10, -0x1.a2893bd3435f4p-5, + 0x1.4d3a90e37164ap-6, 0x1.4ce7f6e19a902p-8 }, + { 0x1.dc29fb60715b0p-1, 0x1.bf8e1b1ca2277p-3, -0x1.1eb7095e5d6d2p-2, + 0x1.549ea6f7a64f4p-3, -0x1.b10f12f3877a3p-7, -0x1.61420c8f7156ap-5, + 0x1.674f1f92a8812p-6, 0x1.25543ffd74d52p-9 }, + { 
0x1.e29e22a89d767p-1, 0x1.7bd5c7df3fe99p-3, -0x1.fe674494077bfp-3, + 0x1.4a9feacf86578p-3, -0x1.a008269076644p-6, -0x1.1cf0e8fb4f1cbp-5, + 0x1.6e0d2ef105fb3p-6, -0x1.367205fbd7876p-12 }, + { 0x1.e812fc64db36ap-1, 0x1.3fda6bc016991p-3, -0x1.c1cb278627920p-3, + 0x1.3b10512314f1ep-3, -0x1.1e6457bb1b9a9p-5, -0x1.b1f6474e2388cp-6, + 0x1.640a5345f7ec7p-6, -0x1.3dae5a997fdbp-9 }, + { 0x1.eca6ccd709544p-1, 0x1.0b3f52ce8c380p-3, -0x1.8885019f63c6dp-3, + 0x1.274275fc91a05p-3, -0x1.57f73699a8372p-5, -0x1.3076a305fc7cep-6, + 0x1.4c6ae04843a41p-6, -0x1.0be5fcf5ecc91p-8 }, + { 0x1.f0762fde45ee7p-1, 0x1.bb1c972f23e4ap-4, -0x1.5341e3c01b58dp-3, + 0x1.107929f6f0b60p-3, -0x1.7e1b34f976c02p-5, -0x1.73b62589c234ap-7, + 0x1.2a97ee1876486p-6, -0x1.595f40a3150fep-8 }, + { 0x1.f39bc242e43e6p-1, 0x1.6c7e64e7281c5p-4, -0x1.2274b86835fd3p-3, + 0x1.efb890e5c770dp-4, -0x1.92c7db16847e0p-5, -0x1.45477db5e2dd4p-8, + 0x1.01fc6165fc866p-6, -0x1.8845509030c2cp-8 }, + { 0x1.f62fe80272419p-1, 0x1.297db960e4f5dp-4, -0x1.ecb83b087c04fp-4, + 0x1.bce18363ca3d1p-4, -0x1.985aaf776482cp-5, 0x1.cd953efdae886p-12, + 0x1.ab9a0b89b54ffp-7, -0x1.9b5e576ccc31cp-8 }, + { 0x1.f848acb544e95p-1, 0x1.e1d4cf1e24501p-5, -0x1.9e12e1fde5552p-4, + 0x1.8a27806df3d1bp-4, -0x1.91674e5eb3319p-5, 0x1.3bc75595b2db8p-8, + 0x1.51bc537ac61afp-7, -0x1.96b23b19ea04dp-8 }, + { 0x1.f9f9ba8d3c733p-1, 0x1.83298d7172108p-5, -0x1.58d101f905a75p-4, + 0x1.58f1456f8639bp-4, -0x1.808d1850b8231p-5, 0x1.0c1bd99c348a7p-7, + 0x1.f61e9d7bc48cap-8, -0x1.7f07c13441774p-8 }, + { 0x1.fb54641aebbc9p-1, 0x1.34ac36ad8dafap-5, -0x1.1c8ec267f9405p-4, + 0x1.2a52c5d841848p-4, -0x1.68541c02b3b6bp-5, 0x1.5afe400196379p-7, + 0x1.565b2d6eda3d6p-8, -0x1.596aaff29e739p-8 }, + { 0x1.fc67bcf2d7b8fp-1, 0x1.e85c449e377efp-6, -0x1.d177f166c07c6p-5, + 0x1.fe23b7584b504p-5, -0x1.4b12109613313p-5, 0x1.8d9905c0acf7dp-7, + 0x1.9265032a669dap-9, -0x1.2ac4a6dbcbf3ep-8 }, + { 0x1.fd40bd6d7a785p-1, 0x1.7f5188610ddc7p-6, -0x1.7954423f7c998p-5, + 0x1.af5baae33887fp-5, 
-0x1.2ad77c7cbc474p-5, 0x1.a7b8c47ec2a51p-7, + 0x1.46646ee094bccp-10, -0x1.ef19d8db8673p-9 }, + { 0x1.fdea6e062d0c9p-1, 0x1.2a875b5ffab58p-6, -0x1.2f3178cd6dcd5p-5, + 0x1.68d1c45b94182p-5, -0x1.09648ed3aeaefp-5, 0x1.ad8b150d38164p-7, + -0x1.e9a6023d9429fp-13, -0x1.8722d19ee2e8ep-9 }, + { 0x1.fe6e1742f7cf5p-1, 0x1.cd5ec93c1243ap-7, -0x1.e2ff3aaacb386p-6, + 0x1.2aa4e5823cc89p-5, -0x1.d049842dbe399p-6, 0x1.a34edb21ab302p-7, + -0x1.676e5996c7f9bp-10, -0x1.23b01a35140bfp-9 }, + { 0x1.fed37386190fbp-1, 0x1.61beae53b72c2p-7, -0x1.7d6193f22c3c1p-6, + 0x1.e947279e3bb7dp-6, -0x1.906031b97ca97p-6, 0x1.8d14d62561755p-7, + -0x1.1f245e7178882p-9, -0x1.9257d4eb47685p-10 }, + { 0x1.ff20e0a7ba8c2p-1, 0x1.0d1d69569b839p-7, -0x1.2a8ca0dc02752p-6, + 0x1.8cc071b709751p-6, -0x1.54a149f1b070cp-6, 0x1.6e9137b13412cp-7, + -0x1.6577ed3d8e83bp-9, -0x1.e9c1a5178a289p-11 }, + { 0x1.ff5b8fb26f5f6p-1, 0x1.9646f35a7663cp-8, -0x1.cf68ed9311b0bp-7, + 0x1.3e8735b5a694fp-6, -0x1.1e1612d026fdfp-6, 0x1.4afd8e6ca636dp-7, + -0x1.8c375170ccb22p-9, -0x1.c799443c4fd3bp-12 }, + { 0x1.ff87b1913e853p-1, 0x1.30499b5039596p-8, -0x1.64964201ec8bap-7, + 0x1.fa73d7eafba98p-7, -0x1.daa3022141fbbp-7, 0x1.2509444c063b7p-7, + -0x1.99482a2f8a0a1p-9, -0x1.403d1f76c9454p-15 }, + { 0x1.ffa89fe5b3625p-1, 0x1.c4412bf4b8f35p-9, -0x1.100f347126cf0p-7, + 0x1.8ebda07671d40p-7, -0x1.850c6a31c98c1p-7, 0x1.fdac860c67d21p-8, + -0x1.927d03d2ba12cp-9, 0x1.0ff620b4190fep-12 }, + { 0x1.ffc10194fcb64p-1, 0x1.4d78bba8ca621p-9, -0x1.9ba107a443e02p-8, + 0x1.36f273fbc04ccp-7, -0x1.3b38716ac7e6fp-7, 0x1.b3fe0181914acp-8, + -0x1.7d3fe7de98c5cp-9, 0x1.ea31f8e5317f7p-12 }, + { 0x1.ffd2eae369a07p-1, 0x1.e7f232d9e266cp-10, -0x1.34c7442dd48d9p-8, + 0x1.e066bed070a0bp-8, -0x1.f914f3c42fc0dp-8, 0x1.6f4664ed2260fp-8, + -0x1.5e59910761d24p-9, 0x1.39cbb6e84c126p-11 }, + { 0x1.ffdff92db56e5p-1, 0x1.6235fbd7a4373p-10, -0x1.cb5e029b9e56ap-9, + 0x1.6fa4c7ef274dap-8, -0x1.903a089a835f3p-8, 0x1.30f12e0ca1901p-8, + -0x1.39d21b6957f99p-9, 0x1.5d3f8495a703cp-11 
}, + { 0x1.ffe96a78a04a9p-1, 0x1.fe41cd9bb4f2cp-11, -0x1.52d7b28966c0cp-9, + 0x1.16c192d86a1a7p-8, -0x1.39bfce951100cp-8, 0x1.f376a7869f9e3p-9, + -0x1.12e6cef999c4fp-9, 0x1.66acd4d667b5p-11 }, + { 0x1.fff0312b010b5p-1, 0x1.6caa0d3583018p-11, -0x1.efb729f4cf75bp-10, + 0x1.a2da7cebe12acp-9, -0x1.e6c27a24bc759p-9, 0x1.93b1f4d8ea65p-9, + -0x1.d82050aa94a08p-10, 0x1.5cd7dc75d6cbap-11 }, + { 0x1.fff50456dab8cp-1, 0x1.0295ef6591865p-11, -0x1.679880e95a4dap-10, + 0x1.37d38e3a5c8ebp-9, -0x1.75b3708aebb8fp-9, 0x1.4231c4b4b0296p-9, + -0x1.8e26476489318p-10, 0x1.45c3b570dd924p-11 }, + { 0x1.fff86cfd3e657p-1, 0x1.6be02102b353dp-12, -0x1.02b157780d6aep-10, + 0x1.cc1d886861133p-10, -0x1.1bff6f12ec9abp-9, 0x1.fc0f77bd9c736p-10, + -0x1.4a3320bd0959dp-10, 0x1.267f8b4f95d2p-11 }, + { 0x1.fffad0b901755p-1, 0x1.fc0d55470cf5ep-13, -0x1.7121aff5e820ep-11, + 0x1.506d6992f7de5p-10, -0x1.ab595d3ecd0d6p-10, 0x1.8bdd79daaf754p-10, + -0x1.0d9b090f997c1p-10, 0x1.031ab9fd1c7dap-11 }, + { 0x1.fffc7a37857d2p-1, 0x1.5feada379d8a5p-13, -0x1.05304df58f3aap-11, + 0x1.e79c081b8600fp-11, -0x1.3e5dbe33232e0p-10, 0x1.30eb208200729p-10, + -0x1.b1d493b147945p-11, 0x1.bd587bbc071bep-12 }, + { 0x1.fffd9fdeabccep-1, 0x1.e3bcf436a1a49p-14, -0x1.6e953111ef0a1p-12, + 0x1.5e3edf6768654p-11, -0x1.d5be67c0547a4p-11, 0x1.d07d9ffa1d435p-11, + -0x1.58328f5f358cap-11, 0x1.76d42d95c42c4p-12 }, + { 0x1.fffe68f4fa777p-1, 0x1.49e17724f4cddp-14, -0x1.fe48c44e229c1p-13, + 0x1.f2bd95d76f188p-12, -0x1.57388cb12d011p-11, 0x1.5decc25c5c079p-11, + -0x1.0d7499d1b0d2dp-11, 0x1.359332c94ecdcp-12 }, + { 0x1.fffef1960d85dp-1, 0x1.be6abbb10a4cdp-15, -0x1.6040381a8c313p-13, + 0x1.5fff1dde9ee9dp-12, -0x1.f0c933efa9971p-12, 0x1.04cbf4a5cd760p-11, + -0x1.a07f150af6dadp-12, 0x1.f68dd183426bap-13 }, + { 0x1.ffff4db27f146p-1, 0x1.2bb5cc22e5cd8p-15, -0x1.e25894899f526p-14, + 0x1.ec8a8e5a72757p-13, -0x1.64256ae0a3cf9p-12, 0x1.80a836c18c46cp-12, + -0x1.3dea401af6775p-12, 0x1.915ddff3fe0d1p-13 }, + { 0x1.ffff8b500e77cp-1, 0x1.8f4ccca7fc769p-16, 
-0x1.478cffe305946p-14,
+        0x1.559f04adde504p-13, -0x1.f9e1577d6961dp-13, 0x1.18bda53c14716p-12,
+        -0x1.df8634c35541cp-13, 0x1.3bb5c6b616337p-13 },
+      { 0x1.ffffb43555b5fp-1, 0x1.07ebd2a2d26c8p-16, -0x1.b93e442a37f2bp-15,
+        0x1.d5cf15159ce28p-14, -0x1.63f5e1469c006p-13, 0x1.95a03acebac18p-13,
+        -0x1.656e5e2a1f8e2p-13, 0x1.e98c437189bdep-14 },
+      { 0x1.ffffcf23ff5fcp-1, 0x1.5a2adfa0b492cp-17, -0x1.26c88270759f0p-15,
+        0x1.40473572b99a8p-14, -0x1.f057cbde578a5p-14, 0x1.22178d1c3c948p-13,
+        -0x1.0765b61a0d859p-13, 0x1.765b3ea03ddbep-14 },
+      { 0x1.ffffe0bd3e852p-1, 0x1.c282cd3957a72p-18, -0x1.86ad6dfa44faap-16,
+        0x1.b0f313f03a029p-15, -0x1.56e44abecd255p-14, 0x1.9ad1ecfe34a89p-14,
+        -0x1.7fe4033478618p-14, 0x1.1a8184e049fbfp-14 },
+      { 0x1.ffffec2641a9ep-1, 0x1.22df29821407ep-18, -0x1.00c902a6cfd98p-16,
+        0x1.22234eb88671fp-15, -0x1.d57a181c9e6e1p-15, 0x1.200c283b54a90p-14,
+        -0x1.14b4c3295a7d0p-14, 0x1.a4f966f713bdep-15 },
+      { 0x1.fffff37d63a36p-1, 0x1.74adc8f405eecp-19, -0x1.4ed4228e44858p-17,
+        0x1.81918baea92bap-16, -0x1.3e81b17a0009cp-15, 0x1.9004a36116436p-15,
+        -0x1.8aa1ba400e076p-15, 0x1.35cd4e2340a9ep-15 },
+      { 0x1.fffff82cdcf1bp-1, 0x1.d9c73698fa87dp-20, -0x1.b11017ec67115p-18,
+        0x1.fc0dfadf653f8p-17, -0x1.ac4e03cd2dfc2p-16, 0x1.131806b5abbc5p-15,
+        -0x1.1672ef66fcaafp-15, 0x1.c2882c7debed7p-16 },
+      { 0x1.fffffb248c39dp-1, 0x1.2acee2f5ec66ap-20, -0x1.15cc570408a36p-18,
+        0x1.4be757bbb75a3p-17, -0x1.1d6aa5f8d2940p-16, 0x1.76c5937d5105ep-16,
+        -0x1.84dffc3ca9302p-16, 0x1.43c8315f2c30ap-16 },
+      { 0x1.fffffd01f36afp-1, 0x1.75fa8dbc840bap-21, -0x1.6186da0133f5ap-19,
+        0x1.ae023231e1af5p-18, -0x1.790812f7ca394p-17, 0x1.f9c25656d0ef2p-17,
+        -0x1.0cc66682e304cp-16, 0x1.cc170a75d6f9cp-17 },
+      { 0x1.fffffe2ba0ea5p-1, 0x1.d06ad6ecde88ep-22, -0x1.be46aa8edc9a1p-20,
+        0x1.143860c7840b8p-18, -0x1.edaba78fb1260p-18, 0x1.52138a96ecee2p-17,
+        -0x1.6fca538c4e2eep-17, 0x1.434040640bcefp-17 },
+      { 0x1.fffffee3cc32cp-1, 0x1.1e1e857adb8ddp-22, -0x1.1769ce5f2a6e8p-20,
+        0x1.5fe5d479b0543p-19, -0x1.405d865c94c2ap-18, 0x1.bfc94feb96afcp-18,
+        -0x1.f245d5f3e8358p-18, 0x1.c142456acf443p-18 },
+    };
+  float ax = fabsf (x);
+  uint32_t ux = asuint (ax);
+  double s = x;
+  double z = ax;
+  /* 0x407ad444 corresponds to x = 0x1.f5a888p+1 = 3.91921..., which is the
+     largest float such that erf(x) does not round to 1 (to nearest). */
+  if (__glibc_unlikely (ux > 0x407ad444u))
+    {
+      float os = copysignf (1.0f, x);
+      if (ux > (0xffu << 23))
+        return x + x; /* nan */
+      if (ux == (0xffu << 23))
+        return os; /* +-inf */
+      return os - 0x1p-25f * os;
+    }
+  double v = floor (16.0 * z);
+  uint32_t i = 16.0f * ax;
+  /* 0x3ee00000 corresponds to x = 0.4375, for smaller x we have i < 7. */
+  if (__glibc_unlikely (ux < 0x3ee00000u))
+    {
+      static const double c[] =
+        {
+          0x1.20dd750429b6dp+0, -0x1.812746b0375fbp-2,
+          0x1.ce2f219fd6f45p-4, -0x1.b82ce2cbf0838p-6,
+          0x1.565bb655adb85p-8, -0x1.c025bfc879c94p-11,
+          0x1.f81718f61309cp-14, -0x1.cc67bd88f5867p-17
+        };
+      double z2 = s * s, z4 = z2 * z2, z8 = z4 * z4;
+      double c0 = c[0] + z2 * c[1];
+      double c2 = c[2] + z2 * c[3];
+      double c4 = c[4] + z2 * c[5];
+      double c6 = c[6] + z2 * c[7];
+      c0 += z4 * c2;
+      c4 += z4 * c6;
+      c0 += z8 * c4;
+      return s * c0;
+    }
+  z = (z - 0.03125) - 0.0625 * v;
+  const double *c = C[i - 7];
+  double z2 = z * z, z4 = z2 * z2;
+  double c0 = c[0] + z * c[1];
+  double c2 = c[2] + z * c[3];
+  double c4 = c[4] + z * c[5];
+  double c6 = c[6] + z * c[7];
+  c0 += z2 * c2;
+  c4 += z2 * c6;
+  c0 += z4 * c4;
+  return copysign (c0, s);
 }
 libm_alias_float (__erf, erf)
-
diff --git a/sysdeps/loongarch/lp64/libm-test-ulps b/sysdeps/loongarch/lp64/libm-test-ulps
index ba070f8224..2af6da3638 100644
--- a/sysdeps/loongarch/lp64/libm-test-ulps
+++ b/sysdeps/loongarch/lp64/libm-test-ulps
@@ -909,22 +909,18 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erfc":
diff --git a/sysdeps/m68k/m680x0/fpu/libm-test-ulps b/sysdeps/m68k/m680x0/fpu/libm-test-ulps
index 8456a59010..3ea3f74e89 100644
--- a/sysdeps/m68k/m680x0/fpu/libm-test-ulps
+++ b/sysdeps/m68k/m680x0/fpu/libm-test-ulps
@@ -839,21 +839,17 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erfc":
diff --git a/sysdeps/microblaze/libm-test-ulps b/sysdeps/microblaze/libm-test-ulps
index c89096defd..c079a4f501 100644
--- a/sysdeps/microblaze/libm-test-ulps
+++ b/sysdeps/microblaze/libm-test-ulps
@@ -172,7 +172,6 @@ float: 2
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 3
diff --git a/sysdeps/mips/mips32/libm-test-ulps b/sysdeps/mips/mips32/libm-test-ulps
index cef264d649..ebd88e0cef 100644
--- a/sysdeps/mips/mips32/libm-test-ulps
+++ b/sysdeps/mips/mips32/libm-test-ulps
@@ -723,19 +723,15 @@ float: 3
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/mips/mips64/libm-test-ulps b/sysdeps/mips/mips64/libm-test-ulps
index 724249d3ad..ca658b945c 100644
--- a/sysdeps/mips/mips64/libm-test-ulps
+++ b/sysdeps/mips/mips64/libm-test-ulps
@@ -909,22 +909,18 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erfc":
diff --git a/sysdeps/nios2/libm-test-ulps b/sysdeps/nios2/libm-test-ulps
index dbccba13cb..6416c7ff38 100644
--- a/sysdeps/nios2/libm-test-ulps
+++ b/sysdeps/nios2/libm-test-ulps
@@ -177,7 +177,6 @@ float: 2
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/or1k/fpu/libm-test-ulps b/sysdeps/or1k/fpu/libm-test-ulps
index df2b69ac75..fb1606801b 100644
--- a/sysdeps/or1k/fpu/libm-test-ulps
+++ b/sysdeps/or1k/fpu/libm-test-ulps
@@ -723,19 +723,15 @@ float: 3
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/or1k/nofpu/libm-test-ulps b/sysdeps/or1k/nofpu/libm-test-ulps
index 2263f3f0b7..2742e3b8be 100644
--- a/sysdeps/or1k/nofpu/libm-test-ulps
+++ b/sysdeps/or1k/nofpu/libm-test-ulps
@@ -723,19 +723,15 @@ float: 3
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 36fa54d97e..b22aaf90bd 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1105,25 +1105,21 @@ double: 1
 
 Function: "erf":
 double: 1
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 float128: 2
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 float128: 1
 ldouble: 2
 
 Function: "erf_upward":
 double: 1
-float: 1
 float128: 2
 ldouble: 3
 
diff --git a/sysdeps/powerpc/nofpu/libm-test-ulps b/sysdeps/powerpc/nofpu/libm-test-ulps
index c32c8017b4..98cd67bd07 100644
--- a/sysdeps/powerpc/nofpu/libm-test-ulps
+++ b/sysdeps/powerpc/nofpu/libm-test-ulps
@@ -919,22 +919,18 @@ double: 1
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erfc":
diff --git a/sysdeps/riscv/nofpu/libm-test-ulps b/sysdeps/riscv/nofpu/libm-test-ulps
index 79927c2bd9..8cfeb7bcb2 100644
--- a/sysdeps/riscv/nofpu/libm-test-ulps
+++ b/sysdeps/riscv/nofpu/libm-test-ulps
@@ -906,22 +906,18 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erfc":
diff --git a/sysdeps/riscv/rvd/libm-test-ulps b/sysdeps/riscv/rvd/libm-test-ulps
index fbd5b8fed7..f7c6c82dd1 100644
--- a/sysdeps/riscv/rvd/libm-test-ulps
+++ b/sysdeps/riscv/rvd/libm-test-ulps
@@ -909,22 +909,18 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erfc":
diff --git a/sysdeps/s390/fpu/libm-test-ulps b/sysdeps/s390/fpu/libm-test-ulps
index ade5a39db4..36da5d742a 100644
--- a/sysdeps/s390/fpu/libm-test-ulps
+++ b/sysdeps/s390/fpu/libm-test-ulps
@@ -910,22 +910,18 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erfc":
diff --git a/sysdeps/sh/libm-test-ulps b/sysdeps/sh/libm-test-ulps
index b0040d7218..6b55797f81 100644
--- a/sysdeps/sh/libm-test-ulps
+++ b/sysdeps/sh/libm-test-ulps
@@ -355,11 +355,9 @@ float: 3
 
 Function: "erf":
 double: 1
-float: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 
 Function: "erfc":
 double: 5
diff --git a/sysdeps/sparc/fpu/libm-test-ulps b/sysdeps/sparc/fpu/libm-test-ulps
index d78b46b97b..3aed9d4d06 100644
--- a/sysdeps/sparc/fpu/libm-test-ulps
+++ b/sysdeps/sparc/fpu/libm-test-ulps
@@ -909,22 +909,18 @@ ldouble: 5
 
 Function: "erf":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 ldouble: 2
 
 Function: "erfc":
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index 327937929d..e00ad56c62 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -1290,25 +1290,21 @@ ldouble: 3
 
 Function: "erf":
 double: 1
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "erf_downward":
 double: 1
-float: 1
 float128: 2
 ldouble: 1
 
 Function: "erf_towardzero":
 double: 1
-float: 1
 float128: 1
 ldouble: 1
 
 Function: "erf_upward":
 double: 1
-float: 1
 float128: 2
 ldouble: 1
 