diff mbox series

[01/28] powerpc: Use generic fabs{f} implementations

Message ID 20190329133529.22523-2-adhemerval.zanella@linaro.org
State New
Headers show
Series powerpc floating-point optimization refactor | expand

Commit Message

Adhemerval Zanella Netto March 29, 2019, 1:35 p.m. UTC
Since be2e25bbd78f9fdf the generic ieee754 implementation uses
compiler builtin which generates fabs{f} for all supported targets.

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/fpu/s_fabs.S: Remove file.
	* sysdeps/powerpc/fpu/s_fabsf.S: Likewise.
---
 sysdeps/powerpc/fpu/s_fabs.S  | 33 ---------------------------------
 sysdeps/powerpc/fpu/s_fabsf.S |  1 -
 2 files changed, 34 deletions(-)
 delete mode 100644 sysdeps/powerpc/fpu/s_fabs.S
 delete mode 100644 sysdeps/powerpc/fpu/s_fabsf.S

-- 
2.17.1

Comments

Joseph Myers April 1, 2019, 8:04 p.m. UTC | #1
On Fri, 29 Mar 2019, Adhemerval Zanella wrote:

> Since be2e25bbd78f9fdf the generic ieee754 implementation uses

> compiler builtin which generates fabs{f} for all supported targets.


One reason for the existing version might be a microoptimization for code 
size to make the float and double versions aliases, as permitted by the 
ABIs and instruction set in this case.  (This is not an objection to this 
patch.)

-- 
Joseph S. Myers
joseph@codesourcery.com
Adhemerval Zanella Netto April 3, 2019, 1:04 a.m. UTC | #2
On 02/04/2019 03:04, Joseph Myers wrote:
> On Fri, 29 Mar 2019, Adhemerval Zanella wrote:

> 

>> Since be2e25bbd78f9fdf the generic ieee754 implementation uses

>> compiler builtin which generates fabs{f} for all supported targets.

> 

> One reason for the existing version might be a microoptimization for code 

> size to make the float and double versions aliases, as permitted by the 

> ABIs and instruction set in this case.  (This is not an objection to this 

> patch.)

> 


Indeed powerpc does such microoptimizations for some implementations, but
if the idea is indeed to push such optimizations couldn't add on generic
implementation though a flag defined in arch-specific headers?
Gabriel F. T. Gomes April 15, 2019, 8:23 p.m. UTC | #3
On Wed, Apr 03 2019, Adhemerval Zanella wrote:
> 

> 

> On 02/04/2019 03:04, Joseph Myers wrote:

> > On Fri, 29 Mar 2019, Adhemerval Zanella wrote:

> > 

> >> Since be2e25bbd78f9fdf the generic ieee754 implementation uses

> >> compiler builtin which generates fabs{f} for all supported targets.

> > 

> > One reason for the existing version might be a microoptimization for code 

> > size to make the float and double versions aliases, as permitted by the 

> > ABIs and instruction set in this case.  (This is not an objection to this 

> > patch.)

> > 

> 

> Indeed powerpc does such microoptimizations for some implementations, but


The code generated by these functions is as follows (on powerpc64le):

Before the patch:
000000000004dac0 <fabs>:
   4dac0:       0e 00 4c 3c     addis   r2,r12,14
   4dac4:       40 9b 42 38     addi    r2,r2,-25792
   4dac8:       10 0a 20 fc     fabs    f1,f1
   4dacc:       20 00 80 4e     blr

After the patch:
000000000004dac0 <fabs>:
   4dac0:       10 0a 20 fc     fabs    f1,f1
   4dac4:       20 00 80 4e     blr

(this is true when --enable-stack-protector is not in use, or when it is
set to strong, as is the case for the distros I checked (Debian, Fedora
and OpenSuse); when set to all, there would be a lot more differences
before and after the patch, as the assembly implementation wouldn't get
stack protection code, but that's not a problem, imo).

(also, the absence of the TOC-setup code is fine, as the function does
not need anything from the TOC).

> if the idea is indeed to push such optimizations couldn't add on generic

> implementation though a flag defined in arch-specific headers?


Given the size of these functions, I believe that this is not worth the
effort, and I believe this patch is OK as is.

Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
Tulio Magno Quites Machado Filho April 15, 2019, 9:32 p.m. UTC | #4
"Gabriel F. T. Gomes" <gabriel@inconstante.eti.br> writes:

> The code generated by these functions is as follows (on powerpc64le):

>

> Before the patch:

> 000000000004dac0 <fabs>:

>    4dac0:       0e 00 4c 3c     addis   r2,r12,14

>    4dac4:       40 9b 42 38     addi    r2,r2,-25792

>    4dac8:       10 0a 20 fc     fabs    f1,f1

>    4dacc:       20 00 80 4e     blr

>

> After the patch:

> 000000000004dac0 <fabs>:

>    4dac0:       10 0a 20 fc     fabs    f1,f1

>    4dac4:       20 00 80 4e     blr


This is showing that __fabs could have been using ENTRY_TOCLESS instead of
ENTRY. ;-)

-- 
Tulio Magno
Adhemerval Zanella Netto April 17, 2019, 5:08 p.m. UTC | #5
On 15/04/2019 18:32, Tulio Magno Quites Machado Filho wrote:
> "Gabriel F. T. Gomes" <gabriel@inconstante.eti.br> writes:

> 

>> The code generated by these functions is as follows (on powerpc64le):

>>

>> Before the patch:

>> 000000000004dac0 <fabs>:

>>    4dac0:       0e 00 4c 3c     addis   r2,r12,14

>>    4dac4:       40 9b 42 38     addi    r2,r2,-25792

>>    4dac8:       10 0a 20 fc     fabs    f1,f1

>>    4dacc:       20 00 80 4e     blr

>>

>> After the patch:

>> 000000000004dac0 <fabs>:

>>    4dac0:       10 0a 20 fc     fabs    f1,f1

>>    4dac4:       20 00 80 4e     blr

> 

> This is showing that __fabs could have been using ENTRY_TOCLESS instead of

> ENTRY. ;-)

> 


And another point try use C implementation where possible (since it
abstracts that kind of possible ABI issues) ;)
diff mbox series

Patch

diff --git a/sysdeps/powerpc/fpu/s_fabs.S b/sysdeps/powerpc/fpu/s_fabs.S
deleted file mode 100644
index 7147f6b4a2..0000000000
--- a/sysdeps/powerpc/fpu/s_fabs.S
+++ /dev/null
@@ -1,33 +0,0 @@ 
-/* Floating-point absolute value.  PowerPC version.
-   Copyright (C) 1997-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-#include <libm-alias-float.h>
-#include <libm-alias-double.h>
-
-ENTRY(__fabs)
-/* double [f1] fabs (double [f1] x); */
-	fabs fp1,fp1
-	blr
-END(__fabs)
-
-libm_alias_double (__fabs, fabs)
-
-/* It turns out that it's safe to use this code even for single-precision.  */
-strong_alias(__fabs,__fabsf)
-libm_alias_float (__fabs, fabs)
diff --git a/sysdeps/powerpc/fpu/s_fabsf.S b/sysdeps/powerpc/fpu/s_fabsf.S
deleted file mode 100644
index 877c710ce8..0000000000
--- a/sysdeps/powerpc/fpu/s_fabsf.S
+++ /dev/null
@@ -1 +0,0 @@ 
-/* __fabsf is in s_fabs.S  */