From patchwork Fri Mar 29 13:35:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 161449 Delivered-To: patch@linaro.org Received: by 2002:a02:c6d8:0:0:0:0:0 with SMTP id r24csp2055251jan; Fri, 29 Mar 2019 06:41:51 -0700 (PDT) X-Google-Smtp-Source: APXvYqxzGRMowquUTbex+R+/5MwIrPiiiIUw92zGZuYrTzUFUCs1HQaFxayX2t0pyEJceaSr3e+F X-Received: by 2002:a17:902:8bca:: with SMTP id r10mr38814762plo.67.1553866911389; Fri, 29 Mar 2019 06:41:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1553866911; cv=none; d=google.com; s=arc-20160816; b=mcy6MbYRnOAWZXZEDHx8x3L1DQdjyufVoKQaAw4OUBvJ1w+aT1twfUmnEQS+e2CqtI cJASCIqg4zfpc0E/JokvQTxdVt2ciOsAE6xMBIWrPQL0ZtAfDtDAmKwAPWR6sTqE5d0L ERViVUb0dNVgYrzy0UDkcc8KEj6qo7RAMuX3piB5yq1r9k07C6u0Hqv5j4BRkNtlpZfT LjgfwJOFyvy1iCbfcNJw9fly4fgke8v+XicvV4DaxvFW3FGeEkIoMk1H+TueqWSk6qYj Ra1Wnt7MiHcJ7ypPAsYwuqLHgSOO9fsjAVJHRNGo2AeAru4ssNSoe9DtBaiBvmAneJ6m VjbA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:to:from :dkim-signature:delivered-to:sender:list-help:list-post:list-archive :list-subscribe:list-unsubscribe:list-id:precedence:mailing-list :dkim-signature:domainkey-signature; bh=y7hDoDPm7xa4KUIeRhgc0uJ+6/iWhVuzW4W4DS9A3H0=; b=oP5M+xoEbE/skX4YGHYq/cWzVN6LEvsicxFjJrM9gC46GWrrKNxa8TQz+W3HMf/ACS Ov8zngWqcuwFy6vzsKZ9qT2KbhputNSZGX2bet5Y6sDe32hgBkuVWdCtICe1ixqsmtSy qpucNS8w/QvUtiAGY76BWMl/fvaeKMUBQltKVQTYEUOqImL03xofueD2wYretIMdxK5F Tm6/NgjYzf7+/rmBsXZvCAhLaUU4itTwNkuug69UzpdG7Si0cPtI42sLf6qmLf3Yo2v8 Ym/PsWmF59OmASniDGJxytZZrivMYcMW4xJ+NJ5UXi6NUuettqvr2KzL3m8L2Dj0na5U YB+A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=WLNRhYz9; dkim=pass header.i=@linaro.org header.s=google header.b=jsc1gl0x; spf=pass (google.com: domain of libc-alpha-return-101018-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom="libc-alpha-return-101018-patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from sourceware.org (server1.sourceware.org. [209.132.180.131]) by mx.google.com with ESMTPS id q61si2038696plb.308.2019.03.29.06.41.51 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 29 Mar 2019 06:41:51 -0700 (PDT) Received-SPF: pass (google.com: domain of libc-alpha-return-101018-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=WLNRhYz9; dkim=pass header.i=@linaro.org header.s=google header.b=jsc1gl0x; spf=pass (google.com: domain of libc-alpha-return-101018-patch=linaro.org@sourceware.org designates 209.132.180.131 as permitted sender) smtp.mailfrom="libc-alpha-return-101018-patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:subject:date:message-id:in-reply-to :references; q=dns; s=default; b=Ioq4gyd2AgshNeKVtzo+fqefD8ZKeS6 QQDaqyrkyhFNd6IIHFLg6Av38pXQNhD+rPHEQ5dPuhjDrZl123ny3HtIlKjAgE5C CBVDFW/KgPEzzqC7Qo2Can9lclgedU0dgioFaXZ0BOjMW0ZDU9aPXDUxjRADgSMk CoKvaew2wXyE= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:subject:date:message-id:in-reply-to :references; s=default; bh=FdBl6bl8b0lEDcqFW7/wX2FOlkA=; b=WLNRh Yz9Q3FrS9k8PI55YQv6aq7YOBtfLLXMmVdtySMG0+qRraYoWVo5lfa3ixMWvtqES K0YFiWMxfjnf06EfUz7pFobVZ6h8euW6taZ2oBSjdWk1GxNL38f6zAFm8v/UL22N 3gEjYcMPCQuWiBJE09s1xEJ6GIgMe6az3+gl4c= Received: (qmail 123692 invoked by alias); 29 Mar 2019 13:36:39 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 121370 invoked by uid 89); 29 Mar 2019 13:36:15 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.8 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mail-vs1-f68.google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=y7hDoDPm7xa4KUIeRhgc0uJ+6/iWhVuzW4W4DS9A3H0=; b=jsc1gl0xWuCosxurNeEbrFKHd4YWUvaPIok8788ALO14KqsI7GJN94m6cQyKqbwSu2 ipQtsCszHNYSlPOBVPJkiNjvLwOKQYMgO2KxsM85Dj/wtsTuVvWW3VhtECJNm3fWmhBC CVFekYetb4wact1hpIIZDGRyTUfU2iOQHb9jkNxMU/vTJnly8aL7Mj6FfijRcq8St4EW CcuoZayEnu2Y+Jf+9cD76vhSGuAOVUQ1+GPeR5p2O5oTfO9cWxj7fcfMtpTBkxWv4rRk xKEXdZhjiM/w5RJY3G1hr8iS5KFjxMLcDuNQnu4xYXN/i5LVZ8Loa/OxZVYlFBKEv8U4 JvGg== Return-Path: From: Adhemerval Zanella To: libc-alpha@sourceware.org Subject: [PATCH 21/28] powerpc: Refactor powerpc32 lround/lroundf/llround/llroundf Date: Fri, 29 Mar 2019 10:35:22 -0300 Message-Id: <20190329133529.22523-22-adhemerval.zanella@linaro.org> In-Reply-To: <20190329133529.22523-1-adhemerval.zanella@linaro.org> References: <20190329133529.22523-1-adhemerval.zanella@linaro.org> This patches consolidates all the powerpc llround{f} implementations on the generic sysdeps/powerpc/powerpc32/fpu/s_llround{f}. The only missing optimization is the power6x one which I could not make GCC generates mftgpr for 32 bits output. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/fpu/Makefile [$(subdir) == math] (CFLAGS-s_lround.c): New rule. * sysdeps/powerpc/powerpc32/fpu/s_llround.c (__llround): Add power5+ and fctidz optimization. * sysdeps/powerpc/powerpc32/fpu/s_lround.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_lround.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_llround-power6.c, CFLAGS-s_llround-power5+.c, CFLAGS-s_llround-ppc32.c, CFLAGS-s_lround-ppc32.c, CFLAGS-s_lround-power5+.c): New rule. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S: Likewise. --- sysdeps/powerpc/powerpc32/fpu/Makefile | 1 + sysdeps/powerpc/powerpc32/fpu/s_llround.c | 51 +++++++- sysdeps/powerpc/powerpc32/fpu/s_lround.S | 123 ------------------ sysdeps/powerpc/powerpc32/fpu/s_lround.c | 77 +++++++++++ .../powerpc32/power4/fpu/multiarch/Makefile | 5 + .../power4/fpu/multiarch/s_llround-power5+.S | 31 ----- .../power4/fpu/multiarch/s_llround-power5+.c | 2 + .../power4/fpu/multiarch/s_llround-power6.S | 31 ----- .../power4/fpu/multiarch/s_llround-power6.c | 2 + .../power4/fpu/multiarch/s_llround-ppc32.S | 31 ----- .../power4/fpu/multiarch/s_llround-ppc32.c | 2 + .../power4/fpu/multiarch/s_lround-power5+.S | 33 ----- .../power4/fpu/multiarch/s_lround-power5+.c | 2 + .../power4/fpu/multiarch/s_lround-ppc32.S | 31 ----- .../power4/fpu/multiarch/s_lround-ppc32.c | 2 + .../powerpc/powerpc32/power4/fpu/s_llround.S | 105 --------------- .../powerpc/powerpc32/power4/fpu/s_llroundf.S | 1 - .../powerpc/powerpc32/power5+/fpu/s_llround.S | 53 -------- .../powerpc32/power5+/fpu/s_llroundf.S | 1 - .../powerpc/powerpc32/power5+/fpu/s_lround.S | 51 -------- .../powerpc/powerpc32/power6/fpu/s_llround.S | 53 -------- .../powerpc/powerpc32/power6/fpu/s_llroundf.S | 1 - 22 files changed, 142 insertions(+), 547 deletions(-) delete mode 100644 sysdeps/powerpc/powerpc32/fpu/s_lround.S create mode 100644 sysdeps/powerpc/powerpc32/fpu/s_lround.c delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.S create mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.c delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.S create mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.c delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.S create mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.c delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.S create mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.c delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.S create mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.c delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S delete mode 100644 sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S delete mode 100644 sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S delete mode 100644 sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S delete mode 100644 sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S delete mode 100644 sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S delete mode 100644 sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S -- 2.17.1 Reviewed-by: Gabriel F. T. Gomes diff --git a/sysdeps/powerpc/powerpc32/fpu/Makefile b/sysdeps/powerpc/powerpc32/fpu/Makefile index c79b192f60..b8b6bb0fa2 100644 --- a/sysdeps/powerpc/powerpc32/fpu/Makefile +++ b/sysdeps/powerpc/powerpc32/fpu/Makefile @@ -2,6 +2,7 @@ ifeq ($(subdir),math) # lrint is aliased to lrintf, so suppress compiler builtins to # avoid mismatched signatures. CFLAGS-s_lrint.c += -fno-builtin-lrintf +CFLAGS-s_lround.c += -fno-builtin-lroundf endif ifeq ($(subdir),misc) diff --git a/sysdeps/powerpc/powerpc32/fpu/s_llround.c b/sysdeps/powerpc/powerpc32/fpu/s_llround.c index 14d10e1e63..dc9f41b8e7 100644 --- a/sysdeps/powerpc/powerpc32/fpu/s_llround.c +++ b/sysdeps/powerpc/powerpc32/fpu/s_llround.c @@ -18,10 +18,10 @@ #include #include -#include #include #include #include +#include /* Round to the nearest integer, with values exactly on a 0.5 boundary rounded away from zero, regardless of the current rounding mode. @@ -31,9 +31,53 @@ long long int __llround (double x) { +#ifdef _ARCH_PWR5X + x = round (x); + /* The barrier prevents compiler from optimizing it to llround when + compiled with -fno-math-errno */ + math_opt_barrier (x); + return x; +#else long long xr; if (HAVE_PPC_FCTIDZ) - xr = (long long) x; + { + /* IEEE 1003.1 lround function. IEEE specifies "round to the nearest + integer value, rounding halfway cases away from zero, regardless of + the current rounding mode." However PowerPC Architecture defines + "round to Nearest" as "Choose the best approximation. In case of a + tie, choose the one that is even (least significant bit o).". + So we can't use the PowerPC "round to Nearest" mode. Instead we set + "round toward Zero" mode and round by adding +-0.5 before rounding + to the integer value. + + It is necessary to detect when x is (+-)0x1.fffffffffffffp-2 + because adding +-0.5 in this case will cause an erroneous shift, + carry and round. We simply return 0 if 0.5 > x > -0.5. Likewise + if x is and odd number between +-(2^52 and 2^53-1) a shift and + carry will erroneously round if biased with +-0.5. Therefore if x + is greater/less than +-2^52 we don't need to bias the number with + +-0.5. */ + double ax = fabs (x); + + if (ax < 0.5) + return 0; + + if (ax < 0x1p+52) + { + /* Test whether an integer to avoid spurious "inexact". */ + double t = ax + 0x1p+52; + t = t - 0x1p+52; + if (ax != t) + { + ax = ax + 0.5; + if (x < 0.0) + ax = -fabs (ax); + x = ax; + } + } + + return x; + } else { /* Avoid incorrect exceptions from libgcc conversions (as of GCC @@ -80,5 +124,8 @@ __llround (double x) xr -= (long long) ((unsigned long long) xr - 1) < 0; } return xr; +#endif } +#ifndef __llround libm_alias_double (__llround, llround) +#endif diff --git a/sysdeps/powerpc/powerpc32/fpu/s_lround.S b/sysdeps/powerpc/powerpc32/fpu/s_lround.S deleted file mode 100644 index 2d9540ceed..0000000000 --- a/sysdeps/powerpc/powerpc32/fpu/s_lround.S +++ /dev/null @@ -1,123 +0,0 @@ -/* lround function. PowerPC32 version. - Copyright (C) 2004-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include -#include -#include - - .section .rodata.cst4,"aM",@progbits,4 - .align 2 -.LC0: /* 0.5 */ - .long 0x3f000000 -.LC1: /* 2^52. */ - .long 0x59800000 - .section .rodata.cst8,"aM",@progbits,8 - .align 3 -.LC2: /* 0x7fffffff.8p0. */ - .long 0x41dfffff - .long 0xffe00000 -.LC3: /* -0x80000000.8p0. */ - .long 0xc1e00000 - .long 0x00100000 - .section ".text" - -/* long [r3] lround (float x [fp1]) - IEEE 1003.1 lround function. IEEE specifies "round to the nearest - integer value, rounding halfway cases away from zero, regardless of - the current rounding mode." However PowerPC Architecture defines - "round to Nearest" as "Choose the best approximation. In case of a - tie, choose the one that is even (least significant bit o).". - So we can't use the PowerPC "round to Nearest" mode. Instead we set - "round toward Zero" mode and round by adding +-0.5 before rounding - to the integer value. It is necessary to detect when x is - (+-)0x1.fffffffffffffp-2 because adding +-0.5 in this case will - cause an erroneous shift, carry and round. We simply return 0 if - 0.5 > x > -0.5. */ - -ENTRY (__lround) - stwu r1,-16(r1) - cfi_adjust_cfa_offset (16) -#ifdef SHARED - mflr r11 - cfi_register(lr,r11) - SETUP_GOT_ACCESS(r9,got_label) - addis r10,r9,.LC0-got_label@ha - lfs fp10,.LC0-got_label@l(r10) - addis r10,r9,.LC1-got_label@ha - lfs fp11,.LC1-got_label@l(r10) - addis r10,r9,.LC2-got_label@ha - lfd fp9,.LC2-got_label@l(r10) - addis r10,r9,.LC3-got_label@ha - lfd fp8,.LC3-got_label@l(r10) - mtlr r11 - cfi_same_value (lr) -#else - lis r9,.LC0@ha - lfs fp10,.LC0@l(r9) - lis r9,.LC1@ha - lfs fp11,.LC1@l(r9) - lis r9,.LC2@ha - lfd fp9,.LC2@l(r9) - lis r9,.LC3@ha - lfd fp8,.LC3@l(r9) -#endif - fabs fp2, fp1 /* Get the absolute value of x. */ - fsub fp12,fp10,fp10 /* Compute 0.0. */ - fcmpu cr6, fp2, fp10 /* if |x| < 0.5 */ - fcmpu cr5, fp1, fp9 /* if x >= 0x7fffffff.8p0 */ - fcmpu cr1, fp1, fp8 /* if x <= -0x80000000.8p0 */ - fcmpu cr7, fp1, fp12 /* x is negative? x < 0.0 */ - blt- cr6,.Lretzero - bge- cr5,.Loflow - ble- cr1,.Loflow - /* Test whether an integer to avoid spurious "inexact". */ - fadd fp3,fp2,fp11 - fsub fp3,fp3,fp11 - fcmpu cr5, fp2, fp3 - beq cr5,.Lnobias - fadd fp3,fp2,fp10 /* |x|+=0.5 bias to prepare to round. */ - bge cr7,.Lconvert /* x is positive so don't negate x. */ - fnabs fp3,fp3 /* -(|x|+=0.5) */ -.Lconvert: - fctiwz fp4,fp3 /* Convert to Integer word lround toward 0. */ - stfd fp4,8(r1) - nop /* Ensure the following load is in a different dispatch */ - nop /* group to avoid pipe stall on POWER4&5. */ - nop - lwz r3,8+LOWORD(r1) /* Load return as integer. */ -.Lout: - addi r1,r1,16 - blr -.Lretzero: /* when 0.5 > x > -0.5 */ - li r3,0 /* return 0. */ - b .Lout -.Lnobias: - fmr fp3,fp1 - b .Lconvert -.Loflow: - fmr fp3,fp11 - bge cr7,.Lconvert - fnabs fp3,fp3 - b .Lconvert - END (__lround) - -libm_alias_double (__lround, lround) - -strong_alias (__lround, __lroundf) -libm_alias_float (__lround, lround) diff --git a/sysdeps/powerpc/powerpc32/fpu/s_lround.c b/sysdeps/powerpc/powerpc32/fpu/s_lround.c new file mode 100644 index 0000000000..28c06d25d2 --- /dev/null +++ b/sysdeps/powerpc/powerpc32/fpu/s_lround.c @@ -0,0 +1,77 @@ +/* lround function. PowerPC32 version. + Copyright (C) 2004-2019 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define lroundf __redirect_lroundf +#define __lroundf __redirect___lroundf +#include +#undef lroundf +#undef __lroundf +#include +#include + +long int +__lround (double x) +{ +#ifdef _ARCH_PWR5X + x = round (x); +#else + /* Ieee 1003.1 lround function. ieee specifies "round to the nearest + integer value, rounding halfway cases away from zero, regardless of + the current rounding mode." however powerpc architecture defines + "round to nearest" as "choose the best approximation. in case of a + tie, choose the one that is even (least significant bit o).". + so we can't use the powerpc "round to nearest" mode. instead we set + "round toward zero" mode and round by adding +-0.5 before rounding + to the integer value. it is necessary to detect when x is + (+-)0x1.fffffffffffffp-2 because adding +-0.5 in this case will + cause an erroneous shift, carry and round. we simply return 0 if + 0.5 > x > -0.5. */ + + double ax = fabs (x); + + if (ax < 0.5) + return 0; + + if (x >= 0x7fffffff.8p0 || x <= -0x80000000.8p0) + x = (x < 0.0) ? -0x1p+52 : 0x1p+52; + else + { + /* Test whether an integer to avoid spurious "inexact". */ + double t = ax + 0x1p+52; + t = t - 0x1p+52; + if (ax != t) + { + ax = ax + 0.5; + if (x < 0.0) + ax = -fabs (ax); + x = ax; + } + } +#endif + /* Force evaluation of values larger than long int, so invalid + exceptions are raise. */ + long long int ret; + asm ("fctiwz %0, %1" : "=d" (ret) : "d" (x)); + return ret; +} +#ifndef __lround +libm_alias_double (__lround, lround) + +strong_alias (__lround, __lroundf) +libm_alias_float (__lround, lround) +#endif diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile index d33d403a1b..60f2c95532 100644 --- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile +++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile @@ -23,6 +23,11 @@ CFLAGS-s_llrintf-ppc32.c += -mcpu=power4 CFLAGS-s_llrint-power6.c += -mcpu=power6 CFLAGS-s_llrint-ppc32.c += -mcpu=power4 CFLAGS-s_lrint-ppc32.c += -mcpu=power4 +CFLAGS-s_llround-power6.c += -mcpu=power6 +CFLAGS-s_llround-power5+.c += -mcpu=power5+ +CFLAGS-s_llround-ppc32.c += -mcpu=power4 +CFLAGS-s_lround-ppc32.c += -mcpu=power4 +CFLAGS-s_lround-power5+.c += -mcpu=power5+ CFLAGS-s_ceil-power5+.c = -mcpu=power5+ CFLAGS-s_ceilf-power5+.c = -mcpu=power5+ CFLAGS-s_modf-power5+.c = -mcpu=power5+ diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.S b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.S deleted file mode 100644 index 80fc5b0f6c..0000000000 --- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.S +++ /dev/null @@ -1,31 +0,0 @@ -/* lround function. PowerPC32/POWER5+ version. - Copyright (C) 2013-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include - -#undef weak_alias -#define weak_alias(a,b) -#undef strong_alias -#define strong_alias(a,b) -#undef compat_symbol -#define compat_symbol(a,b,c,d) - -#define __llround __llround_power5plus - -#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.c new file mode 100644 index 0000000000..794fad71c2 --- /dev/null +++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.c @@ -0,0 +1,2 @@ +#define __llround __llround_power5plus +#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.S b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.S deleted file mode 100644 index d6d9a0a4cb..0000000000 --- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.S +++ /dev/null @@ -1,31 +0,0 @@ -/* lround function. PowerPC32/POWER6 version. - Copyright (C) 2013-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include - -#undef weak_alias -#define weak_alias(a,b) -#undef strong_alias -#define strong_alias(a,b) -#undef compat_symbol -#define compat_symbol(a,b,c,d) - -#define __llround __llround_power6 - -#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.c new file mode 100644 index 0000000000..2b588c4c54 --- /dev/null +++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.c @@ -0,0 +1,2 @@ +#define __llround __llround_power6 +#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.S b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.S deleted file mode 100644 index 085fbc6de4..0000000000 --- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.S +++ /dev/null @@ -1,31 +0,0 @@ -/* llround function. PowerPC32 default version. - Copyright (C) 2013-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include - -#undef weak_alias -#define weak_alias(a,b) -#undef strong_alias -#define strong_alias(a,b) -#undef compat_symbol -#define compat_symbol(a,b,c,d) - -#define __llround __llround_ppc32 - -#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.c new file mode 100644 index 0000000000..3b5ff344bd --- /dev/null +++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.c @@ -0,0 +1,2 @@ +#define __llround __llround_ppc32 +#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.S b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.S deleted file mode 100644 index 350315e38b..0000000000 --- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.S +++ /dev/null @@ -1,33 +0,0 @@ -/* lround function. POWER5+, PowerPC32 version. - Copyright (C) 2013-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include - -#undef hidden_def -#define hidden_def(name) -#undef weak_alias -#define weak_alias(name, alias) -#undef strong_alias -#define strong_alias(name, alias) -#undef compat_symbol -#define compat_symbol(lib, name, alias, ver) - -#define __lround __lround_power5plus - -#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.c new file mode 100644 index 0000000000..d6a17c4bca --- /dev/null +++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.c @@ -0,0 +1,2 @@ +#define __lround __lround_power5plus +#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.S b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.S deleted file mode 100644 index 245a682720..0000000000 --- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.S +++ /dev/null @@ -1,31 +0,0 @@ -/* lround function. PowerPC32 default version. - Copyright (C) 2013-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include - -#undef weak_alias -#define weak_alias(a,b) -#undef strong_alias -#define strong_alias(a,b) -#undef compat_symbol -#define compat_symbol(a,b,c,d) - -#define __lround __lround_ppc32 - -#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.c new file mode 100644 index 0000000000..5355e1f712 --- /dev/null +++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.c @@ -0,0 +1,2 @@ +#define __lround __lround_ppc32 +#include diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S b/sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S deleted file mode 100644 index 3d6ba34180..0000000000 --- a/sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S +++ /dev/null @@ -1,105 +0,0 @@ -/* llround function. PowerPC32 on PowerPC64 version. - Copyright (C) 2004-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include -#include -#include - - .section .rodata.cst8,"aM",@progbits,8 - .align 3 - .LC0: .long (52+127)<<23 /* 0x1p+52 */ - .long (-1+127)<<23 /* 0.5 */ - - .section ".text" - -/* long [r3] lround (float x [fp1]) - IEEE 1003.1 lround function. IEEE specifies "round to the nearest - integer value, rounding halfway cases away from zero, regardless of - the current rounding mode." However PowerPC Architecture defines - "round to Nearest" as "Choose the best approximation. In case of a - tie, choose the one that is even (least significant bit o).". - So we can't use the PowerPC "round to Nearest" mode. Instead we set - "round toward Zero" mode and round by adding +-0.5 before rounding - to the integer value. - - It is necessary to detect when x is (+-)0x1.fffffffffffffp-2 - because adding +-0.5 in this case will cause an erroneous shift, - carry and round. We simply return 0 if 0.5 > x > -0.5. Likewise - if x is and odd number between +-(2^52 and 2^53-1) a shift and - carry will erroneously round if biased with +-0.5. Therefore if x - is greater/less than +-2^52 we don't need to bias the number with - +-0.5. */ - -ENTRY (__llround) - stwu r1,-16(r1) - cfi_adjust_cfa_offset (16) -#ifdef SHARED - mflr r11 - cfi_register(lr,r11) - SETUP_GOT_ACCESS(r9,got_label) - addis r9,r9,.LC0-got_label@ha - addi r9,r9,.LC0-got_label@l - mtlr r11 - cfi_same_value (lr) - lfs fp9,0(r9) - lfs fp10,4(r9) -#else - lis r9,.LC0@ha - lfs fp9,.LC0@l(r9) /* Load 2^52 into fpr9. */ - lfs fp10,.LC0@l+4(r9) /* Load 0.5 into fpr10. */ -#endif - fabs fp2,fp1 /* Get the absolute value of x. */ - fsub fp12,fp10,fp10 /* Compute 0.0 into fpr12. */ - fcmpu cr6,fp2,fp10 /* if |x| < 0.5 */ - fcmpu cr7,fp2,fp9 /* if |x| >= 2^52 */ - fcmpu cr1,fp1,fp12 /* x is negative? x < 0.0 */ - blt- cr6,.Lretzero /* 0.5 > x < -0.5 so just return 0. */ - bge- cr7,.Lnobias /* 2^52 > x < -2^52 just convert with no bias. */ - /* Test whether an integer to avoid spurious "inexact". */ - fadd fp3,fp2,fp9 - fsub fp3,fp3,fp9 - fcmpu cr5,fp2,fp3 - beq cr5,.Lnobias - fadd fp3,fp2,fp10 /* |x|+=0.5 bias to prepare to round. */ - bge cr1,.Lconvert /* x is positive so don't negate x. */ - fnabs fp3,fp3 /* -(|x|+=0.5) */ -.Lconvert: - fctidz fp4,fp3 /* Convert to Integer double word round toward 0. */ - stfd fp4,8(r1) - nop - nop - nop - lwz r3,8+HIWORD(r1) /* Load return as integer. */ - lwz r4,8+LOWORD(r1) -.Lout: - addi r1,r1,16 - blr -.Lretzero: /* 0.5 > x > -0.5 */ - li r3,0 /* return 0. */ - li r4,0 - b .Lout -.Lnobias: - fmr fp3,fp1 - b .Lconvert - END (__llround) - -libm_alias_double (__llround, llround) - -strong_alias (__llround, __llroundf) -libm_alias_float (__llround, llround) diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S b/sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S deleted file mode 100644 index 72d6181541..0000000000 --- a/sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S +++ /dev/null @@ -1 +0,0 @@ -/* __llroundf is in s_llround.S */ diff --git a/sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S b/sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S deleted file mode 100644 index 5035170d94..0000000000 --- a/sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S +++ /dev/null @@ -1,53 +0,0 @@ -/* lround function. POWER5+, PowerPC32 version. - Copyright (C) 2006-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include -#include -#include - -/* long [r3] llround (float x [fp1]) - IEEE 1003.1 lround function. IEEE specifies "round to the nearest - integer value, rounding halfway cases away from zero, regardless of - the current rounding mode." However PowerPC Architecture defines - "round to Nearest" as "Choose the best approximation. In case of a - tie, choose the one that is even (least significant bit o).". - So we pre-round using the V2.02 Floating Round to Integer Nearest - instruction before we use the Floating Convert to Integer Word with - round to zero instruction. */ - - .machine "power5" -ENTRY (__llround) - stwu r1,-16(r1) - cfi_adjust_cfa_offset (16) - frin fp2,fp1 - fctidz fp3,fp2 /* Convert To Integer Word lround toward 0. */ - stfd fp3,8(r1) - nop /* Ensure the following load is in a different dispatch */ - nop /* group to avoid pipe stall on POWER4&5. */ - nop - lwz r3,8+HIWORD(r1) - lwz r4,8+LOWORD(r1) - addi r1,r1,16 - blr - END (__llround) - -libm_alias_double (__llround, llround) - -strong_alias (__llround, __llroundf) -libm_alias_float (__llround, llround) diff --git a/sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S b/sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S deleted file mode 100644 index 030d2fdff8..0000000000 --- a/sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S +++ /dev/null @@ -1 +0,0 @@ -/* __llroundf is in s_llround.S */ diff --git a/sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S b/sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S deleted file mode 100644 index f30a0bff3e..0000000000 --- a/sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S +++ /dev/null @@ -1,51 +0,0 @@ -/* lround function. POWER5+, PowerPC32 version. - Copyright (C) 2006-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ -#include -#include -#include -#include - -/* long [r3] lround (float x [fp1]) - IEEE 1003.1 lround function. IEEE specifies "round to the nearest - integer value, rounding halfway cases away from zero, regardless of - the current rounding mode." However PowerPC Architecture defines - "round to Nearest" as "Choose the best approximation. In case of a - tie, choose the one that is even (least significant bit o).". - So we pre-round using the V2.02 Floating Round to Integer Nearest - instruction before we use the Floating Convert to Integer Word with - round to zero instruction. */ - - .machine "power5" -ENTRY (__lround) - stwu r1,-16(r1) - cfi_adjust_cfa_offset (16) - frin fp2,fp1 - fctiwz fp3,fp2 /* Convert To Integer Word lround toward 0. */ - stfd fp3,8(r1) - nop /* Ensure the following load is in a different dispatch */ - nop /* group to avoid pipe stall on POWER4&5. */ - nop - lwz r3,8+LOWORD(r1) - addi r1,r1,16 - blr - END (__lround) - -libm_alias_double (__lround, lround) - -strong_alias (__lround, __lroundf) -libm_alias_float (__lround, lround) diff --git a/sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S b/sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S deleted file mode 100644 index 7ba312cc6b..0000000000 --- a/sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S +++ /dev/null @@ -1,53 +0,0 @@ -/* lround function. POWER5+, PowerPC32 version. - Copyright (C) 2006-2019 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include -#include -#include - -/* long [r3] llround (float x [fp1]) - IEEE 1003.1 lround function. IEEE specifies "round to the nearest - integer value, rounding halfway cases away from zero, regardless of - the current rounding mode." However PowerPC Architecture defines - "round to Nearest" as "Choose the best approximation. In case of a - tie, choose the one that is even (least significant bit o).". - So we pre-round using the V2.02 Floating Round to Integer Nearest - instruction before we use the Floating Convert to Integer Word with - round to zero instruction. */ - - .machine "power5" -ENTRY (__llround) - stwu r1,-16(r1) - cfi_adjust_cfa_offset (16) - frin fp2,fp1 - fctidz fp3,fp2 /* Convert To Integer Word lround toward 0. */ - stfd fp3,8(r1) -/* Insure the following load is in a different dispatch group by - inserting "group ending nop". */ - ori r1,r1,0 - lwz r3,8+HIWORD(r1) - lwz r4,8+LOWORD(r1) - addi r1,r1,16 - blr - END (__llround) - -libm_alias_double (__llround, llround) - -strong_alias (__llround, __llroundf) -libm_alias_float (__llround, llround) diff --git a/sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S b/sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S deleted file mode 100644 index 030d2fdff8..0000000000 --- a/sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S +++ /dev/null @@ -1 +0,0 @@ -/* __llroundf is in s_llround.S */