From patchwork Mon Dec 19 14:04:23 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 88476
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations
Date: Mon, 19 Dec 2016 12:04:23 -0200
Message-Id: <1482156263-22267-4-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1482156263-22267-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1482156263-22267-1-git-send-email-adhemerval.zanella@linaro.org>

This patch removes the powerpc assembly implementation of fmax/fmin.
Based on benchtests, the assembly implementations show:

$ ./testrun.sh benchtests/bench-fmax
  "fmax": {
   "": {
    "duration": 5.07586e+09,
    "iterations": 2.01676e+09,
    "max": 1350.39,
    "min": 2.073,
    "mean": 2.51684
   },
   "qNaN": {
    "duration": 5.09315e+09,
    "iterations": 8.4568e+08,
    "max": 2788,
    "min": 5.806,
    "mean": 6.02255
   },
   "sNaN": {
    "duration": 5.09073e+09,
    "iterations": 8.42316e+08,
    "max": 4215.84,
    "min": 5.737,
    "mean": 6.04373
   }
  }

And:

$ ./testrun.sh benchtests/bench-fmin
  "fmin": {
   "": {
    "duration": 5.07711e+09,
    "iterations": 2.02982e+09,
    "max": 497.094,
    "min": 2.073,
    "mean": 2.50126
   },
   "qNaN": {
    "duration": 5.09134e+09,
    "iterations": 8.46968e+08,
    "max": 2255.14,
    "min": 5.807,
    "mean": 6.01125
   },
   "sNaN": {
    "duration": 5.09122e+09,
    "iterations": 8.4746e+08,
    "max": 1969.38,
    "min": 5.729,
    "mean": 6.00763
   }
  }

The default implementation (math/s_f{max,min}_template.c) shows
slightly better mean latency in all cases:

$ ./testrun.sh benchtests/bench-fmax
  "fmax": {
   "": {
    "duration": 5.07044e+09,
    "iterations": 2.38695e+09,
    "max": 2048.58,
    "min": 2.073,
    "mean": 2.12423
   },
   "qNaN": {
    "duration": 5.09004e+09,
    "iterations": 9.45428e+08,
    "max": 3306.93,
    "min": 5.138,
    "mean": 5.38385
   },
   "sNaN": {
    "duration": 5.08458e+09,
    "iterations": 1.15959e+09,
    "max": 972.008,
    "min": 3.321,
    "mean": 4.3848
   }
  }

And:

$ ./testrun.sh benchtests/bench-fmin
  "fmin": {
   "": {
    "duration": 5.06817e+09,
    "iterations": 2.3913e+09,
    "max": 1177.9,
    "min": 2.073,
    "mean": 2.11942
   },
   "qNaN": {
    "duration": 5.08857e+09,
    "iterations": 9.45656e+08,
    "max": 2658.83,
    "min": 5.09,
    "mean": 5.38099
   },
   "sNaN": {
    "duration": 5.08093e+09,
    "iterations": 1.16725e+09,
    "max": 1030.74,
    "min": 3.323,
    "mean": 4.3529
   }
  }

Both runs used GCC 5.4 (Ubuntu 16 default installation) with default
compiler flags on a 3.4GHz POWER8E (powerpc64le-linux-gnu).
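For readers unfamiliar with the generic code path, the logic a portable C
fmax needs can be sketched roughly as below. This is a minimal sketch under
stated assumptions, not the contents of math/s_fmax_template.c: the names
sketch_fmax and sketch_issignaling are hypothetical, and the signaling-NaN
check assumes IEEE 754 binary64 with the quiet bit in mantissa bit 51.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if X is a signaling NaN.  Assumes IEEE 754
   binary64, where a NaN with mantissa bit 51 clear is signaling.  */
static int
sketch_issignaling (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  uint64_t exponent = bits & 0x7ff0000000000000ULL;
  uint64_t mantissa = bits & 0x000fffffffffffffULL;
  return exponent == 0x7ff0000000000000ULL
	 && mantissa != 0
	 && (mantissa & 0x0008000000000000ULL) == 0;
}

/* Hypothetical stand-in for a generic fmax.  The quiet comparison macros
   return false when either operand is a NaN.  */
double
sketch_fmax (double x, double y)
{
  if (isgreaterequal (x, y))
    return x;
  else if (isless (x, y))
    return y;
  else if (sketch_issignaling (x) || sketch_issignaling (y))
    return x + y;		/* A signaling NaN operand yields a quiet NaN.  */
  else
    return isnan (y) ? x : y;	/* Prefer the operand that is not a NaN.  */
}
```

With one quiet NaN operand the other operand is returned, so
sketch_fmax (NAN, 3.0) gives 3.0; only when both operands are NaNs, or one
is signaling, is a NaN returned.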
---
 ChangeLog                              | 10 +++++
 sysdeps/powerpc/fpu/s_fmax.S           | 77 ----------------------------------
 sysdeps/powerpc/fpu/s_fmaxf.S          |  1 -
 sysdeps/powerpc/fpu/s_fmin.S           | 77 ----------------------------------
 sysdeps/powerpc/fpu/s_fminf.S          |  1 -
 sysdeps/powerpc/powerpc32/fpu/s_fmax.S |  5 ---
 sysdeps/powerpc/powerpc32/fpu/s_fmin.S |  5 ---
 sysdeps/powerpc/powerpc64/fpu/s_fmax.S |  5 ---
 sysdeps/powerpc/powerpc64/fpu/s_fmin.S |  5 ---
 9 files changed, 10 insertions(+), 176 deletions(-)
 delete mode 100644 sysdeps/powerpc/fpu/s_fmax.S
 delete mode 100644 sysdeps/powerpc/fpu/s_fmaxf.S
 delete mode 100644 sysdeps/powerpc/fpu/s_fmin.S
 delete mode 100644 sysdeps/powerpc/fpu/s_fminf.S
 delete mode 100644 sysdeps/powerpc/powerpc32/fpu/s_fmax.S
 delete mode 100644 sysdeps/powerpc/powerpc32/fpu/s_fmin.S
 delete mode 100644 sysdeps/powerpc/powerpc64/fpu/s_fmax.S
 delete mode 100644 sysdeps/powerpc/powerpc64/fpu/s_fmin.S

-- 
2.7.4

diff --git a/sysdeps/powerpc/fpu/s_fmax.S b/sysdeps/powerpc/fpu/s_fmax.S
deleted file mode 100644
index e6405c0..0000000
--- a/sysdeps/powerpc/fpu/s_fmax.S
+++ /dev/null
@@ -1,77 +0,0 @@
-/* Floating-point maximum.  PowerPC version.
-   Copyright (C) 1997-2016 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   .  */
-
-#include
-
-ENTRY(__fmax)
-/* double [f1] fmax (double [f1] x, double [f2] y); */
-	fcmpu	cr0,fp1,fp2
-	blt	cr0,0f		/* if x < y, neither x nor y can be NaN... */
-	bnulr+	cr0
-/* x and y are unordered, so one of x or y must be a NaN... */
-	fcmpu	cr1,fp2,fp2
-	bun	cr1,1f
-/* x is a NaN; y is not.  Test if x is signaling.  */
-#ifdef __powerpc64__
-	stfd	fp1,-8(r1)
-	lwz	r3,-8+HIWORD(r1)
-#else
-	stwu	r1,-16(r1)
-	cfi_adjust_cfa_offset (16)
-	stfd	fp1,8(r1)
-	lwz	r3,8+HIWORD(r1)
-	addi	r1,r1,16
-	cfi_adjust_cfa_offset (-16)
-#endif
-	andis.	r3,r3,8
-	bne	cr0,0f
-	b	2f
-1:	/* y is a NaN; x may or may not be.  */
-	fcmpu	cr1,fp1,fp1
-	bun	cr1,2f
-/* y is a NaN; x is not.  Test if y is signaling.  */
-#ifdef __powerpc64__
-	stfd	fp2,-8(r1)
-	lwz	r3,-8+HIWORD(r1)
-#else
-	stwu	r1,-16(r1)
-	cfi_adjust_cfa_offset (16)
-	stfd	fp2,8(r1)
-	lwz	r3,8+HIWORD(r1)
-	addi	r1,r1,16
-	cfi_adjust_cfa_offset (-16)
-#endif
-	andis.	r3,r3,8
-	bnelr	cr0
-2:	/* x and y are NaNs, or one is a signaling NaN.  */
-	fadd	fp1,fp1,fp2
-	blr
-0:	fmr	fp1,fp2
-	blr
-END(__fmax)
-
-weak_alias (__fmax,fmax)
-
-/* It turns out that it's safe to use this code even for single-precision.  */
-strong_alias(__fmax,__fmaxf)
-weak_alias (__fmax,fmaxf)
-
-#ifdef NO_LONG_DOUBLE
-weak_alias (__fmax,__fmaxl)
-weak_alias (__fmax,fmaxl)
-#endif
diff --git a/sysdeps/powerpc/fpu/s_fmaxf.S b/sysdeps/powerpc/fpu/s_fmaxf.S
deleted file mode 100644
index 3c2d62b..0000000
--- a/sysdeps/powerpc/fpu/s_fmaxf.S
+++ /dev/null
@@ -1 +0,0 @@
-/* __fmaxf is in s_fmax.c  */
diff --git a/sysdeps/powerpc/fpu/s_fmin.S b/sysdeps/powerpc/fpu/s_fmin.S
deleted file mode 100644
index 9ae77fe..0000000
--- a/sysdeps/powerpc/fpu/s_fmin.S
+++ /dev/null
@@ -1,77 +0,0 @@
-/* Floating-point minimum.  PowerPC version.
-   Copyright (C) 1997-2016 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   .  */
-
-#include
-
-ENTRY(__fmin)
-/* double [f1] fmin (double [f1] x, double [f2] y); */
-	fcmpu	cr0,fp1,fp2
-	bgt	cr0,0f		/* if x > y, neither x nor y can be NaN... */
-	bnulr+	cr0
-/* x and y are unordered, so one of x or y must be a NaN... */
-	fcmpu	cr1,fp2,fp2
-	bun	cr1,1f
-/* x is a NaN; y is not.  Test if x is signaling.  */
-#ifdef __powerpc64__
-	stfd	fp1,-8(r1)
-	lwz	r3,-8+HIWORD(r1)
-#else
-	stwu	r1,-16(r1)
-	cfi_adjust_cfa_offset (16)
-	stfd	fp1,8(r1)
-	lwz	r3,8+HIWORD(r1)
-	addi	r1,r1,16
-	cfi_adjust_cfa_offset (-16)
-#endif
-	andis.	r3,r3,8
-	bne	cr0,0f
-	b	2f
-1:	/* y is a NaN; x may or may not be.  */
-	fcmpu	cr1,fp1,fp1
-	bun	cr1,2f
-/* y is a NaN; x is not.  Test if y is signaling.  */
-#ifdef __powerpc64__
-	stfd	fp2,-8(r1)
-	lwz	r3,-8+HIWORD(r1)
-#else
-	stwu	r1,-16(r1)
-	cfi_adjust_cfa_offset (16)
-	stfd	fp2,8(r1)
-	lwz	r3,8+HIWORD(r1)
-	addi	r1,r1,16
-	cfi_adjust_cfa_offset (-16)
-#endif
-	andis.	r3,r3,8
-	bnelr	cr0
-2:	/* x and y are NaNs, or one is a signaling NaN.  */
-	fadd	fp1,fp1,fp2
-	blr
-0:	fmr	fp1,fp2
-	blr
-END(__fmin)
-
-weak_alias (__fmin,fmin)
-
-/* It turns out that it's safe to use this code even for single-precision.  */
-strong_alias(__fmin,__fminf)
-weak_alias (__fmin,fminf)
-
-#ifdef NO_LONG_DOUBLE
-weak_alias (__fmin,__fminl)
-weak_alias (__fmin,fminl)
-#endif
diff --git a/sysdeps/powerpc/fpu/s_fminf.S b/sysdeps/powerpc/fpu/s_fminf.S
deleted file mode 100644
index 10ab7fe..0000000
--- a/sysdeps/powerpc/fpu/s_fminf.S
+++ /dev/null
@@ -1 +0,0 @@
-/* __fminf is in s_fmin.c  */
diff --git a/sysdeps/powerpc/powerpc32/fpu/s_fmax.S b/sysdeps/powerpc/powerpc32/fpu/s_fmax.S
deleted file mode 100644
index 6973576..0000000
--- a/sysdeps/powerpc/powerpc32/fpu/s_fmax.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#include
-#include
-#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
-compat_symbol (libm, __fmax, fmaxl, GLIBC_2_1)
-#endif
diff --git a/sysdeps/powerpc/powerpc32/fpu/s_fmin.S b/sysdeps/powerpc/powerpc32/fpu/s_fmin.S
deleted file mode 100644
index 6d4a0a9..0000000
--- a/sysdeps/powerpc/powerpc32/fpu/s_fmin.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#include
-#include
-#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
-compat_symbol (libm, __fmin, fminl, GLIBC_2_1)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/s_fmax.S b/sysdeps/powerpc/powerpc64/fpu/s_fmax.S
deleted file mode 100644
index 6973576..0000000
--- a/sysdeps/powerpc/powerpc64/fpu/s_fmax.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#include
-#include
-#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
-compat_symbol (libm, __fmax, fmaxl, GLIBC_2_1)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/s_fmin.S b/sysdeps/powerpc/powerpc64/fpu/s_fmin.S
deleted file mode 100644
index 6d4a0a9..0000000
--- a/sysdeps/powerpc/powerpc64/fpu/s_fmin.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#include
-#include
-#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
-compat_symbol (libm, __fmin, fminl, GLIBC_2_1)
-#endif
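
The removed assembly decides whether a NaN is signaling by storing the
double to the stack, loading its high 32-bit word, and masking it with
"andis. r3,r3,8". The C rendering below is a rough sketch of that one test,
not code from glibc. The function and macro names are hypothetical, and it
assumes IEEE 754 binary64, where the quiet bit is bit 19 of the high word.

```c
#include <stdint.h>
#include <string.h>

/* "andis." shifts its immediate left by 16 bits before the AND, so an
   immediate of 8 selects 0x00080000 in the high word.  */
#define QUIET_BIT_IN_HIGH_WORD (UINT32_C (8) << 16)

/* Hypothetical helper: classify a raw binary64 bit pattern.  Unlike the
   assembly, which only reaches the mask after fcmpu has reported a NaN,
   this also checks that the pattern is a NaN at all.  */
static int
is_signaling_nan_bits (uint64_t bits)
{
  uint32_t hi = (uint32_t) (bits >> 32);
  uint32_t lo = (uint32_t) bits;
  int exponent_all_ones = (hi & 0x7ff00000u) == 0x7ff00000u;
  int mantissa_nonzero = (hi & 0x000fffffu) != 0 || lo != 0;
  if (!(exponent_all_ones && mantissa_nonzero))
    return 0;			/* Finite or infinite: not a NaN.  */
  return (hi & QUIET_BIT_IN_HIGH_WORD) == 0;
}

/* Hypothetical wrapper: memcpy plays the role of the stfd/lwz pair.  */
static int
is_signaling_nan (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return is_signaling_nan_bits (bits);
}
```

For example, the pattern 0x7ff0000000000001 is classified as signaling and
0x7ff8000000000000 as quiet. In the assembly, a signaling operand falls
through to label 2, where fadd turns it into a quiet NaN.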