From patchwork Fri Apr 25 20:44:07 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 884426
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH 1/4] math: Remove UB and optimize double ilogb
Date: Fri, 25 Apr 2025 17:44:07 -0300
Message-ID: <20250425205309.3866442-2-adhemerval.zanella@linaro.org>
In-Reply-To: <20250425205309.3866442-1-adhemerval.zanella@linaro.org>
References: <20250425205309.3866442-1-adhemerval.zanella@linaro.org>

The subnormal exponent calculation invokes UB by left shifting the
signed exponent to find the first leading bit.  The implementation
also used 32-bit operations, which generate suboptimal code on
64-bit architectures.

The patch reimplements ilogb using the math_config.h macros and uses
the new stdbit function to simplify the subnormal handling.
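
To make the subnormal handling concrete, here is a minimal standalone
sketch of the same idea.  It is not the glibc code: it assumes IEEE 754
binary64 doubles and a GCC or Clang compiler, the names sketch_asuint64
and sketch_ilogb are hypothetical, the constants 52, 12, 0x7ff and 1023
are written out instead of using the math_config.h macros, and
__builtin_clzll stands in for stdc_leading_zeros.

```c
#include <limits.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the asuint64 helper: reinterpret the bits
   of a double as a 64-bit unsigned integer without aliasing issues.  */
static inline uint64_t
sketch_asuint64 (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Sketch of an ilogb built only on unsigned 64-bit operations.
   Assumes binary64: 1 sign bit, 11 exponent bits, 52 mantissa bits.  */
static int
sketch_ilogb (double x)
{
  uint64_t ux = sketch_asuint64 (x);
  int ex = (int) ((ux >> 52) & 0x7ff);  /* biased exponent, sign dropped */
  uint64_t frac = ux << 12;             /* mantissa moved to the top bits */

  if (ex == 0)
    {
      if (frac == 0)
        return FP_ILOGB0;               /* +0.0 or -0.0 */
      /* Subnormal: each leading zero in the mantissa lowers the
         exponent by one, starting from -1023.  frac is nonzero here,
         so the builtin's result is well defined.  */
      return -1023 - __builtin_clzll (frac);
    }
  if (ex == 0x7ff)                      /* NaN or Inf */
    return frac != 0 ? FP_ILOGBNAN : INT_MAX;
  return ex - 1023;                     /* normal number: remove the bias */
}
```

As a worked case, the smallest subnormal 0x1p-1074 has the bit pattern
1, so the shifted mantissa is 0x1000 with 51 leading zeros and the
result is -1023 - 51 = -1074.  Every shift in the sketch operates on
uint64_t, so none of them can overflow a signed type.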

On aarch64 it generates better code:

* master:

0000000000000000 <__ieee754_ilogb>:
   0:   9e660000        fmov    x0, d0
   4:   d360fc02        lsr     x2, x0, #32
   8:   d360f801        ubfx    x1, x0, #32, #31
   c:   f26c285f        tst     x2, #0x7ff00000
  10:   540001a1        b.ne    44 <__ieee754_ilogb+0x44>  // b.any
  14:   2a000022        orr     w2, w1, w0
  18:   34000322        cbz     w2, 7c <__ieee754_ilogb+0x7c>
  1c:   35000221        cbnz    w1, 60 <__ieee754_ilogb+0x60>
  20:   2a0003e1        mov     w1, w0
  24:   7100001f        cmp     w0, #0x0
  28:   12808240        mov     w0, #0xfffffbed                 // #-1043
  2c:   540000ad        b.le    40 <__ieee754_ilogb+0x40>
  30:   531f7821        lsl     w1, w1, #1
  34:   51000400        sub     w0, w0, #0x1
  38:   7100003f        cmp     w1, #0x0
  3c:   54ffffac        b.gt    30 <__ieee754_ilogb+0x30>
  40:   d65f03c0        ret
  44:   13147c20        asr     w0, w1, #20
  48:   12b00202        mov     w2, #0x7fefffff                 // #2146435071
  4c:   510ffc00        sub     w0, w0, #0x3ff
  50:   6b02003f        cmp     w1, w2
  54:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  58:   1a819000        csel    w0, w0, w1, ls  // ls = plast
  5c:   d65f03c0        ret
  60:   53155021        lsl     w1, w1, #11
  64:   12807fa0        mov     w0, #0xfffffc02                 // #-1022
  68:   531f7821        lsl     w1, w1, #1
  6c:   51000400        sub     w0, w0, #0x1
  70:   7100003f        cmp     w1, #0x0
  74:   54ffffac        b.gt    68 <__ieee754_ilogb+0x68>
  78:   d65f03c0        ret
  7c:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  80:   d65f03c0        ret

* patch:

0000000000000000 <__ieee754_ilogb>:
   0:   9e660001        fmov    x1, d0
   4:   d374f820        ubfx    x0, x1, #52, #11
   8:   350000e0        cbnz    w0, 24 <__ieee754_ilogb+0x24>
   c:   d374cc21        lsl     x1, x1, #12
  10:   b4000141        cbz     x1, 38 <__ieee754_ilogb+0x38>
  14:   dac01021        clz     x1, x1
  18:   12807fc0        mov     w0, #0xfffffc01                 // #-1023
  1c:   4b010000        sub     w0, w0, w1
  20:   d65f03c0        ret
  24:   711ffc1f        cmp     w0, #0x7ff
  28:   510ffc00        sub     w0, w0, #0x3ff
  2c:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  30:   1a811000        csel    w0, w0, w1, ne  // ne = any
  34:   d65f03c0        ret
  38:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  3c:   d65f03c0        ret

Other architectures with support for stdc_leading_zeros and/or
__builtin_clzll should have similar improvements.

Checked on aarch64-linux-gnu and x86_64-linux-gnu.
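
For targets without a count-leading-zeros instruction or builtin, the
count can still be computed with unsigned shifts only, so the signed
shift UB does not come back.  The following is a hypothetical fallback
sketch, not something this patch adds; the name sketch_clz64 is an
assumption, and it returns 64 for a zero input, matching what
stdc_leading_zeros does for a 64-bit argument.

```c
#include <stdint.h>

/* Hypothetical portable leading-zero count for a 64-bit value.
   Uses a binary search over the bit positions; all shifts are on an
   unsigned type, so no signed overflow can occur.  */
static inline int
sketch_clz64 (uint64_t x)
{
  int n = 0;

  if (x == 0)
    return 64;                                  /* every bit is zero */
  if ((x >> 32) == 0) { n += 32; x <<= 32; }    /* top 32 bits empty */
  if ((x >> 48) == 0) { n += 16; x <<= 16; }    /* top 16 bits empty */
  if ((x >> 56) == 0) { n += 8;  x <<= 8;  }    /* top 8 bits empty */
  if ((x >> 60) == 0) { n += 4;  x <<= 4;  }    /* top 4 bits empty */
  if ((x >> 62) == 0) { n += 2;  x <<= 2;  }    /* top 2 bits empty */
  if ((x >> 63) == 0) { n += 1; }               /* top bit empty */
  return n;
}
```

Such a helper costs six compare-and-shift steps at most, in contrast to
the up to 52 loop iterations the old subnormal path could take.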
---
 sysdeps/ieee754/dbl-64/e_ilogb.c | 80 ++++++++++++--------------------
 1 file changed, 29 insertions(+), 51 deletions(-)

diff --git a/sysdeps/ieee754/dbl-64/e_ilogb.c b/sysdeps/ieee754/dbl-64/e_ilogb.c
index 1e338a59c1..89e7498266 100644
--- a/sysdeps/ieee754/dbl-64/e_ilogb.c
+++ b/sysdeps/ieee754/dbl-64/e_ilogb.c
@@ -1,63 +1,41 @@
-/* @(#)s_ilogb.c 5.1 93/09/24 */
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+/* Get integer exponent of a floating-point value.
+   Copyright (C) 1999-2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
 
-#if defined(LIBM_SCCS) && !defined(lint)
-static char rcsid[] = "$NetBSD: s_ilogb.c,v 1.9 1995/05/10 20:47:28 jtc Exp $";
-#endif
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
 
-/* ilogb(double x)
- * return the binary exponent of non-zero x
- * ilogb(0) = FP_ILOGB0
- * ilogb(NaN) = FP_ILOGBNAN (no signal is raised)
- * ilogb(+-Inf) = INT_MAX (no signal is raised)
- */
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
 
 #include
 #include
-#include
+#include
+#include "math_config.h"
 
 int
 __ieee754_ilogb (double x)
 {
-  int32_t hx, lx, ix;
-
-  GET_HIGH_WORD (hx, x);
-  hx &= 0x7fffffff;
-  if (hx < 0x00100000)
+  uint64_t ux = asuint64 (x);
+  int ex = (ux & ~SIGN_MASK) >> MANTISSA_WIDTH;
+  if (ex == 0) /* zero or subnormal */
     {
-      GET_LOW_WORD (lx, x);
-      if ((hx | lx) == 0)
-        return FP_ILOGB0;       /* ilogb(0) = FP_ILOGB0 */
-      else                      /* subnormal x */
-        if (hx == 0)
-          {
-            for (ix = -1043; lx > 0; lx <<= 1)
-              ix -= 1;
-          }
-        else
-          {
-            for (ix = -1022, hx <<= 11; hx > 0; hx <<= 1)
-              ix -= 1;
-          }
-      return ix;
+      /* Clear sign and exponent */
+      ux <<= 12;
+      if (ux == 0)
+        return FP_ILOGB0;
+      /* subnormal */
+      return -1023 - stdc_leading_zeros (ux);
     }
-  else if (hx < 0x7ff00000)
-    return (hx >> 20) - 1023;
-  else if (FP_ILOGBNAN != INT_MAX)
-    {
-      /* ISO C99 requires ilogb(+-Inf) == INT_MAX.  */
-      GET_LOW_WORD (lx, x);
-      if (((hx ^ 0x7ff00000) | lx) == 0)
-        return INT_MAX;
-    }
-  return FP_ILOGBNAN;
+  if (ex == EXPONENT_MASK >> MANTISSA_WIDTH) /* NaN or Inf */
+    return ux << 12 ? FP_ILOGBNAN : INT_MAX;
+  return ex - 1023;
 }