diff mbox

[AArch64] Emit square root using the Newton series

Message ID 56685C62.9070705@arm.com
State New
Headers show

Commit Message

Kyrylo Tkachov Dec. 9, 2015, 4:52 p.m. UTC
Hi Evandro,

On 08/12/15 21:35, Evandro Menezes wrote:
> Emit square root using the Newton series

>

>    2015-12-03  Evandro Menezes  <e.menezes@samsung.com>

>

>    gcc/

>             * config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt):

>    Declare new

>             function.

>             * config/aarch64/aarch64-simd.md (sqrt<mode>2): New

>    expansion and

>             insn definitions.

>             * config/aarch64/aarch64-tuning-flags.def

>             (AARCH64_EXTRA_TUNE_FAST_SQRT): New tuning macro.

>             * config/aarch64/aarch64.c (aarch64_emit_swsqrt): Define

>    new function.

>             * config/aarch64/aarch64.md (sqrt<mode>2): New expansion

>    and insn

>             definitions.

>             * config/aarch64/aarch64.opt (mlow-precision-recip-sqrt):

>    Expand option

>             description.

>             * doc/invoke.texi (mlow-precision-recip-sqrt): Likewise.

>

> This patch extends the patch that added support for implementing x^-1/2 using the Newton series by adding support for x^1/2 as well.

>

> Is it OK at this point of stage 3?

>


A comment on the patch itself from me...


That seems like a misleading name to me.
If we're doing this, that means that the sqrt instruction is not faster
than doing the inverse sqrt estimation followed by a multiply.
I think a name like "synth_sqrt" or "estimate_sqrt" or something along those lines
is more appropriate.

Thanks,
Kyrill
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
index 6f7dbce..11c6c9a 100644
--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -30,4 +30,4 @@ 
  
  AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
  AARCH64_EXTRA_TUNING_OPTION ("recip_sqrt", RECIP_SQRT)
-
+AARCH64_EXTRA_TUNING_OPTION ("fast_sqrt", FAST_SQRT)