From patchwork Tue Apr 29 16:29:59 2025
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 885756
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Wilco Dijkstra, Xiaolin Tang, Peter Bergner
Subject: [PATCH v3 1/6] math: Remove UB and optimize double ilogb
Date: Tue, 29 Apr 2025 13:29:59 -0300
Message-ID: <20250429164007.2928271-2-adhemerval.zanella@linaro.org>
In-Reply-To: <20250429164007.2928271-1-adhemerval.zanella@linaro.org>
References: <20250429164007.2928271-1-adhemerval.zanella@linaro.org>

The subnormal exponent calculation invokes UB by left-shifting the
signed 32-bit high and low words of the input to find the leading set
bit.  The implementation also uses 32-bit operations, which generate
suboptimal code on 64-bit architectures.

The patch reimplements ilogb using the math_config.h macros and uses
the new stdbit function stdc_leading_zeros to simplify the subnormal
handling.

On aarch64 it generates better code (16 instructions and no loops,
versus 33 instructions with two shift loops):

* master:

0000000000000000 <__ieee754_ilogb>:
   0:   9e660000        fmov    x0, d0
   4:   d360fc02        lsr     x2, x0, #32
   8:   d360f801        ubfx    x1, x0, #32, #31
   c:   f26c285f        tst     x2, #0x7ff00000
  10:   540001a1        b.ne    44 <__ieee754_ilogb+0x44>  // b.any
  14:   2a000022        orr     w2, w1, w0
  18:   34000322        cbz     w2, 7c <__ieee754_ilogb+0x7c>
  1c:   35000221        cbnz    w1, 60 <__ieee754_ilogb+0x60>
  20:   2a0003e1        mov     w1, w0
  24:   7100001f        cmp     w0, #0x0
  28:   12808240        mov     w0, #0xfffffbed                 // #-1043
  2c:   540000ad        b.le    40 <__ieee754_ilogb+0x40>
  30:   531f7821        lsl     w1, w1, #1
  34:   51000400        sub     w0, w0, #0x1
  38:   7100003f        cmp     w1, #0x0
  3c:   54ffffac        b.gt    30 <__ieee754_ilogb+0x30>
  40:   d65f03c0        ret
  44:   13147c20        asr     w0, w1, #20
  48:   12b00202        mov     w2, #0x7fefffff                 // #2146435071
  4c:   510ffc00        sub     w0, w0, #0x3ff
  50:   6b02003f        cmp     w1, w2
  54:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  58:   1a819000        csel    w0, w0, w1, ls  // ls = plast
  5c:   d65f03c0        ret
  60:   53155021        lsl     w1, w1, #11
  64:   12807fa0        mov     w0, #0xfffffc02                 // #-1022
  68:   531f7821        lsl     w1, w1, #1
  6c:   51000400        sub     w0, w0, #0x1
  70:   7100003f        cmp     w1, #0x0
  74:   54ffffac        b.gt    68 <__ieee754_ilogb+0x68>
  78:   d65f03c0        ret
  7c:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  80:   d65f03c0        ret

* patch:

0000000000000000 <__ieee754_ilogb>:
   0:   9e660001        fmov    x1, d0
   4:   d374f820        ubfx    x0, x1, #52, #11
   8:   350000e0        cbnz    w0, 24 <__ieee754_ilogb+0x24>
   c:   d374cc21        lsl     x1, x1, #12
  10:   b4000141        cbz     x1, 38 <__ieee754_ilogb+0x38>
  14:   dac01021        clz     x1, x1
  18:   12807fc0        mov     w0, #0xfffffc01                 // #-1023
  1c:   4b010000        sub     w0, w0, w1
  20:   d65f03c0        ret
  24:   711ffc1f        cmp     w0, #0x7ff
  28:   510ffc00        sub     w0, w0, #0x3ff
  2c:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  30:   1a811000        csel    w0, w0, w1, ne  // ne = any
  34:   d65f03c0        ret
  38:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  3c:   d65f03c0        ret

Other architectures with support for stdc_leading_zeros and/or
__builtin_clzll should see similar improvements.

Checked on aarch64-linux-gnu and x86_64-linux-gnu.
---
 sysdeps/ieee754/dbl-64/e_ilogb.c | 80 ++++++++++++--------------------
 1 file changed, 29 insertions(+), 51 deletions(-)

diff --git a/sysdeps/ieee754/dbl-64/e_ilogb.c b/sysdeps/ieee754/dbl-64/e_ilogb.c
index 1e338a59c1..89e7498266 100644
--- a/sysdeps/ieee754/dbl-64/e_ilogb.c
+++ b/sysdeps/ieee754/dbl-64/e_ilogb.c
@@ -1,63 +1,41 @@
-/* @(#)s_ilogb.c 5.1 93/09/24 */
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+/* Get integer exponent of a floating-point value.
+   Copyright (C) 1999-2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
 
-#if defined(LIBM_SCCS) && !defined(lint)
-static char rcsid[] = "$NetBSD: s_ilogb.c,v 1.9 1995/05/10 20:47:28 jtc Exp $";
-#endif
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
 
-/* ilogb(double x)
- * return the binary exponent of non-zero x
- * ilogb(0) = FP_ILOGB0
- * ilogb(NaN) = FP_ILOGBNAN (no signal is raised)
- * ilogb(+-Inf) = INT_MAX (no signal is raised)
- */
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
 
 #include
 #include
-#include
+#include
+#include "math_config.h"
 
 int
 __ieee754_ilogb (double x)
 {
-  int32_t hx, lx, ix;
-
-  GET_HIGH_WORD (hx, x);
-  hx &= 0x7fffffff;
-  if (hx < 0x00100000)
+  uint64_t ux = asuint64 (x);
+  int ex = (ux & ~SIGN_MASK) >> MANTISSA_WIDTH;
+  if (ex == 0) /* zero or subnormal */
     {
-      GET_LOW_WORD (lx, x);
-      if ((hx | lx) == 0)
-        return FP_ILOGB0;       /* ilogb(0) = FP_ILOGB0 */
-      else                      /* subnormal x */
-        if (hx == 0)
-          {
-            for (ix = -1043; lx > 0; lx <<= 1)
-              ix -= 1;
-          }
-        else
-          {
-            for (ix = -1022, hx <<= 11; hx > 0; hx <<= 1)
-              ix -= 1;
-          }
-      return ix;
+      /* Clear sign and exponent */
+      ux <<= 12;
+      if (ux == 0)
+        return FP_ILOGB0;
+      /* subnormal */
+      return -1023 - stdc_leading_zeros (ux);
     }
-  else if (hx < 0x7ff00000)
-    return (hx >> 20) - 1023;
-  else if (FP_ILOGBNAN != INT_MAX)
-    {
-      /* ISO C99 requires ilogb(+-Inf) == INT_MAX.  */
-      GET_LOW_WORD (lx, x);
-      if (((hx ^ 0x7ff00000) | lx) == 0)
-        return INT_MAX;
-    }
-  return FP_ILOGBNAN;
+  if (ex == EXPONENT_MASK >> MANTISSA_WIDTH) /* NaN or Inf */
+    return ux << 12 ? FP_ILOGBNAN : INT_MAX;
+  return ex - 1023;
 }
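
For anyone who wants to try the new logic outside the glibc tree, it can
be sketched as a standalone function.  This is only an illustration and
not the code in the patch: bits_of, ilogb_sketch and ilogb_sketch_checks
are hypothetical names, memcpy stands in for the internal asuint64
helper, the binary64 constants (52, 0x7ff, 1023) are written out instead
of using the math_config.h macros, and the GCC/Clang builtin
__builtin_clzll replaces stdc_leading_zeros, assuming unsigned long long
is 64 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for glibc's internal asuint64: reinterpret the
   bits of a double without violating aliasing rules.  */
static uint64_t
bits_of (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Illustrative sketch of the patched algorithm (assumed names).  */
static int
ilogb_sketch (double x)
{
  uint64_t ux = bits_of (x);
  int ex = (int) ((ux >> 52) & 0x7ff);  /* biased exponent, sign dropped */
  uint64_t frac = ux << 12;             /* fraction moved to the top bits */

  if (ex == 0)
    {
      if (frac == 0)
        return FP_ILOGB0;               /* +0 or -0 */
      /* Subnormal: the top fraction bit has exponent -1023, and every
         leading zero lowers the result by one.  frac is nonzero here,
         so the builtin's argument is valid.  */
      return -1023 - __builtin_clzll (frac);
    }
  if (ex == 0x7ff)
    return frac != 0 ? FP_ILOGBNAN : INT_MAX;  /* NaN, otherwise Inf */
  return ex - 1023;
}

/* A few spot checks covering each branch; call this from main.  */
static void
ilogb_sketch_checks (void)
{
  assert (ilogb_sketch (1.0) == 0);
  assert (ilogb_sketch (-8.0) == 3);
  assert (ilogb_sketch (0x1p-1022) == -1022);   /* smallest normal */
  assert (ilogb_sketch (0x1p-1043) == -1043);   /* subnormal */
  assert (ilogb_sketch (0x1p-1074) == -1074);   /* smallest subnormal */
  assert (ilogb_sketch (-0.0) == FP_ILOGB0);
  assert (ilogb_sketch (INFINITY) == INT_MAX);
  assert (ilogb_sketch (NAN) == FP_ILOGBNAN);
}
```

All shifts operate on an unsigned 64-bit value, so none of them can
overflow a signed type, and the subnormal case is a single
count-leading-zeros operation rather than a bit-by-bit loop.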