diff mbox

[auto-inc-dec.c] Account for cost of move operation in FORM_PRE_ADD and FORM_POST_ADD cases

Message ID 562E13EC.3010305@arm.com
State Accepted
Commit 477ee35f51115282b8eb56c0077e197f282b765e
Headers show

Commit Message

Kyrylo Tkachov Oct. 26, 2015, 11:52 a.m. UTC
On 26/10/15 11:28, Bernd Schmidt wrote:
> On 10/26/2015 12:12 PM, Bernd Schmidt wrote:

>>

>> But isn't that balanced by the fact that it doesn't seem to take into

>> account the gain of removing the inc_insn either? So I think this can't

>> be right.

>

> Argh, misread the code. The patch is OK with the change I suggested.

>


Thanks!
Here's what I committed with r229344.

Kyrill

2015-10-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * auto-inc-dec.c (insert_move_insn_before): Delete.
     (attempt_change): Remember to cost the simple move in the
     FORM_PRE_ADD and FORM_POST_ADD cases.

>

> Bernd

>

Comments

Christophe Lyon Oct. 28, 2015, 5:23 p.m. UTC | #1
Hi Kyrill,

On 26 October 2015 at 12:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>

> On 26/10/15 11:28, Bernd Schmidt wrote:

>>

>> On 10/26/2015 12:12 PM, Bernd Schmidt wrote:

>>>

>>>

>>> But isn't that balanced by the fact that it doesn't seem to take into

>>> account the gain of removing the inc_insn either? So I think this can't

>>> be right.

>>

>>

>> Argh, misread the code. The patch is OK with the change I suggested.

>>

>

> Thanks!

> Here's what I committed with r229344.

>

Since this commit, I've noticed:

FAIL: gcc.target/arm/lp1243022.c scan-rtl-dump subreg2 "REG_INC"
with --target arm-none-eabi --with-thumb --with-cpu=cortex-a9

as well as ICEs:
  gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2  (internal compiler error)
  gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (internal compiler error)

with --target arm-none-linux-gnueabihf --with-thumb
--with-cpu=cortex-a15 --with-fpu=neon-vfpv4
and --target arm-none-linux-gnueabihf --with-thumb
--with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8

See for a more synthetic view:
http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/229344/report-build-info.html

Can you have a look?

Thanks,

Christophe.

> Kyrill

>

> 2015-10-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>

>     * auto-inc-dec.c (insert_move_insn_before): Delete.

>     (attempt_change): Remember to cost the simple move in the

>     FORM_PRE_ADD and FORM_POST_ADD cases.

>

>>

>> Bernd

>>

>
Kyrylo Tkachov Oct. 28, 2015, 5:45 p.m. UTC | #2
On 28/10/15 17:23, Christophe Lyon wrote:
> Hi Kyrill,


Hi Christophe,

>

> On 26 October 2015 at 12:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:

>> On 26/10/15 11:28, Bernd Schmidt wrote:

>>> On 10/26/2015 12:12 PM, Bernd Schmidt wrote:

>>>>

>>>> But isn't that balanced by the fact that it doesn't seem to take into

>>>> account the gain of removing the inc_insn either? So I think this can't

>>>> be right.

>>>

>>> Argh, misread the code. The patch is OK with the change I suggested.

>>>

>> Thanks!

>> Here's what I committed with r229344.

>>

> Since this commit, I've noticed:

>

> FAIL: gcc.target/arm/lp1243022.c scan-rtl-dump subreg2 "REG_INC"

> with --target arm-none-eabi --with-thumb --with-cpu=cortex-a9


I think this is a testcase issue, I'll look into updating it separately.

> as well as ICEs:

>    gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2  (internal compiler error)

>    gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2 -flto

> -fno-use-linker-plugin -flto-partition=none  (internal compiler error)

>

> with --target arm-none-linux-gnueabihf --with-thumb

> --with-cpu=cortex-a15 --with-fpu=neon-vfpv4

> and --target arm-none-linux-gnueabihf --with-thumb

> --with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8

>

> See for a more synthetic view:

> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/229344/report-build-info.html

>

> Can you have a look?


The ICE backtrace is:
0x7a4494 change_address_1
         $SRC/gcc/emit-rtl.c:2132
0x7a7453 adjust_address_1(rtx_def*, machine_mode, long, int, int, int, long)
         $SRC/gcc/emit-rtl.c:2270
0xab792d gen_lowpart_general(machine_mode, rtx_def*)
         $SRC/gcc/rtlhooks.c:90
0xfd721d gen_split_47(rtx_insn*, rtx_def**)
         $SRC/gcc/config/arm/arm.md:4336
0x12473f2 split_11
         $SRC/gcc/config/arm/arm.md:4331
0x12473f2 split_insns(rtx_def*, rtx_insn*)
         $SRC/gcc/config/arm/sync.md:361
0x7af30e try_split(rtx_def*, rtx_insn*, int)
         $SRC/gcc/emit-rtl.c:3664
0xa53dc5 split_insn
         $SRC/gcc/recog.c:2874
0xa5ccf5 split_all_insns()
         $SRC/gcc/recog.c:2964
0xa5cde3 rest_of_handle_split_after_reload
         $SRC/gcc/recog.c:3900
0xa5cde3 execute
         $SRC/gcc/recog.c:3929


Looks like a latent issue exposed by my patch.
Could you please file a BZ entry if get the chance?

Thanks,
Kyrill


> Thanks,

>

> Christophe.

>

>> Kyrill

>>

>> 2015-10-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>

>>      * auto-inc-dec.c (insert_move_insn_before): Delete.

>>      (attempt_change): Remember to cost the simple move in the

>>      FORM_PRE_ADD and FORM_POST_ADD cases.

>>

>>> Bernd

>>>
Kyrylo Tkachov Oct. 29, 2015, 3:53 p.m. UTC | #3
On 28/10/15 17:45, Kyrill Tkachov wrote:
>

> On 28/10/15 17:23, Christophe Lyon wrote:

>> Hi Kyrill,

>

> Hi Christophe,

>

>>

>> On 26 October 2015 at 12:52, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:

>>> On 26/10/15 11:28, Bernd Schmidt wrote:

>>>> On 10/26/2015 12:12 PM, Bernd Schmidt wrote:

>>>>>

>>>>> But isn't that balanced by the fact that it doesn't seem to take into

>>>>> account the gain of removing the inc_insn either? So I think this can't

>>>>> be right.

>>>>

>>>> Argh, misread the code. The patch is OK with the change I suggested.

>>>>

>>> Thanks!

>>> Here's what I committed with r229344.

>>>

>> Since this commit, I've noticed:

>>

>> FAIL: gcc.target/arm/lp1243022.c scan-rtl-dump subreg2 "REG_INC"

>> with --target arm-none-eabi --with-thumb --with-cpu=cortex-a9

>

> I think this is a testcase issue, I'll look into updating it separately.

>

>> as well as ICEs:

>>    gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2  (internal compiler error)

>>    gcc.target/aarch64/advsimd-intrinsics/vldX.c   -O2 -flto

>> -fno-use-linker-plugin -flto-partition=none  (internal compiler error)

>>

>> with --target arm-none-linux-gnueabihf --with-thumb

>> --with-cpu=cortex-a15 --with-fpu=neon-vfpv4

>> and --target arm-none-linux-gnueabihf --with-thumb

>> --with-cpu=cortex-a57 --with-fpu=crypto-neon-fp-armv8

>>

>> See for a more synthetic view:

>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/229344/report-build-info.html

>>

>> Can you have a look?

>

> The ICE backtrace is:

> 0x7a4494 change_address_1

>         $SRC/gcc/emit-rtl.c:2132

> 0x7a7453 adjust_address_1(rtx_def*, machine_mode, long, int, int, int, long)

>         $SRC/gcc/emit-rtl.c:2270

> 0xab792d gen_lowpart_general(machine_mode, rtx_def*)

>         $SRC/gcc/rtlhooks.c:90

> 0xfd721d gen_split_47(rtx_insn*, rtx_def**)

>         $SRC/gcc/config/arm/arm.md:4336

> 0x12473f2 split_11

>         $SRC/gcc/config/arm/arm.md:4331

> 0x12473f2 split_insns(rtx_def*, rtx_insn*)

>         $SRC/gcc/config/arm/sync.md:361

> 0x7af30e try_split(rtx_def*, rtx_insn*, int)

>         $SRC/gcc/emit-rtl.c:3664

> 0xa53dc5 split_insn

>         $SRC/gcc/recog.c:2874

> 0xa5ccf5 split_all_insns()

>         $SRC/gcc/recog.c:2964

> 0xa5cde3 rest_of_handle_split_after_reload

>         $SRC/gcc/recog.c:3900

> 0xa5cde3 execute

>         $SRC/gcc/recog.c:3929

>

>

> Looks like a latent issue exposed by my patch.

> Could you please file a BZ entry if get the chance?

>


I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68149 for this one.

Kyrill

> Thanks,

> Kyrill

>

>

>> Thanks,

>>

>> Christophe.

>>

>>> Kyrill

>>>

>>> 2015-10-26  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

>>>

>>>      * auto-inc-dec.c (insert_move_insn_before): Delete.

>>>      (attempt_change): Remember to cost the simple move in the

>>>      FORM_PRE_ADD and FORM_POST_ADD cases.

>>>

>>>> Bernd

>>>>

>
diff mbox

Patch

commit cc7c4748eac1f9d59ceb5393132c098aba99765d
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Fri Oct 16 13:46:51 2015 +0100

    [auto-inc-dec.c] Account for cost of move operation in FORM_PRE_ADD and FORM_POST_ADD cases

diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
index e003b13..9f7c8e0 100644
--- a/gcc/auto-inc-dec.c
+++ b/gcc/auto-inc-dec.c
@@ -439,24 +439,6 @@  move_dead_notes (rtx_insn *to_insn, rtx_insn *from_insn, rtx pattern)
     }
 }
 
-
-/* Create a mov insn DEST_REG <- SRC_REG and insert it before
-   NEXT_INSN.  */
-
-static rtx_insn *
-insert_move_insn_before (rtx_insn *next_insn, rtx dest_reg, rtx src_reg)
-{
-  rtx_insn *insns;
-
-  start_sequence ();
-  emit_move_insn (dest_reg, src_reg);
-  insns = get_insns ();
-  end_sequence ();
-  emit_insn_before (insns, next_insn);
-  return insns;
-}
-
-
 /* Change mem_insn.mem_loc so that uses NEW_ADDR which has an
    increment of INC_REG.  To have reached this point, the change is a
    legitimate one from a dataflow point of view.  The only questions
@@ -490,8 +472,21 @@  attempt_change (rtx new_addr, rtx inc_reg)
 
   old_cost = (set_src_cost (mem, mode, speed)
 	      + set_rtx_cost (PATTERN (inc_insn.insn), speed));
+
   new_cost = set_src_cost (mem_tmp, mode, speed);
 
+  /* In the FORM_PRE_ADD and FORM_POST_ADD cases we emit an extra move
+     whose cost we should account for.  */
+  if (inc_insn.form == FORM_PRE_ADD
+      || inc_insn.form == FORM_POST_ADD)
+    {
+      start_sequence ();
+      emit_move_insn (inc_insn.reg_res, inc_insn.reg0);
+      mov_insn = get_insns ();
+      end_sequence ();
+      new_cost += seq_cost (mov_insn, speed);
+    }
+
   /* The first item of business is to see if this is profitable.  */
   if (old_cost < new_cost)
     {
@@ -522,8 +517,8 @@  attempt_change (rtx new_addr, rtx inc_reg)
       /* Replace the addition with a move.  Do it at the location of
 	 the addition since the operand of the addition may change
 	 before the memory reference.  */
-      mov_insn = insert_move_insn_before (inc_insn.insn,
-					  inc_insn.reg_res, inc_insn.reg0);
+      gcc_assert (mov_insn);
+      emit_insn_before (mov_insn, inc_insn.insn);
       move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
 
       regno = REGNO (inc_insn.reg_res);
@@ -548,8 +543,8 @@  attempt_change (rtx new_addr, rtx inc_reg)
       break;
 
     case FORM_POST_ADD:
-      mov_insn = insert_move_insn_before (mem_insn.insn,
-					  inc_insn.reg_res, inc_insn.reg0);
+      gcc_assert (mov_insn);
+      emit_insn_before (mov_insn, mem_insn.insn);
       move_dead_notes (mov_insn, inc_insn.insn, inc_insn.reg0);
 
       /* Do not move anything to the mov insn because the instruction