diff mbox

[ira] Miss checks in split_live_ranges_for_shrink_wrap

Message ID CACgzC7D-f5gmnJP0rTOrWG0aporGcSwY+6yD7HuQSRkySBTEiQ@mail.gmail.com
State New
Headers show

Commit Message

Zhenqiang Chen Aug. 14, 2014, 2:55 a.m. UTC
Hi,

Function split_live_ranges_for_shrink_wrap has code

  if (!flag_shrink_wrap)
    return false;

But flag_shrink_wrap is TRUE by default when optimize > 0 even if the
port does not support shrink-wrap. To make sure shrink-wrap is
enabled, "HAVE_simple_return" must be defined and "HAVE_simple_return"
must be TRUE.

Please refer function.c and shrink-wrap.c on how shrink-wrap is
enabled in thread_prologue_and_epilogue_insns.

To make the check easy, the patch defines a MICRO:
SUPPORT_SHRINK_WRAP_P and replace the uses in ira.c and ifcvt.c

Bootstrap and no make check regression on X86-64.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-08-14  Zhenqiang Chen  <zhenqiang.chen@arm.com>

        * shrink-wrap.h: #define SUPPORT_SHRINK_WRAP_P.
        * ira.c: #include "shrink-wrap.h"
        (split_live_ranges_for_shrink_wrap): Use SUPPORT_SHRINK_WRAP_P.
        * ifcvt.c: #include "shrink-wrap.h"
        (dead_or_predicable): Use SUPPORT_SHRINK_WRAP_P.

testsuite/ChangeLog:
2014-08-14  Zhenqiang Chen  <zhenqiang.chen@arm.com>

        * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.

          && single_succ (new_dest) == EXIT_BLOCK_PTR_FOR_FN (cfun)
@@ -4293,7 +4293,6 @@ dead_or_predicable (basic_block test_bb,
basic_block merge_bb,
            }
          BITMAP_FREE (return_regs);
        }
-#endif
     }

  no_body:

Comments

Jeff Law Aug. 29, 2014, 8:53 p.m. UTC | #1
On 08/13/14 20:55, Zhenqiang Chen wrote:
> Hi,
>
> Function split_live_ranges_for_shrink_wrap has code
>
>    if (!flag_shrink_wrap)
>      return false;
>
> But flag_shrink_wrap is TRUE by default when optimize > 0 even if the
> port does not support shrink-wrap. To make sure shrink-wrap is
> enabled, "HAVE_simple_return" must be defined and "HAVE_simple_return"
> must be TRUE.
>
> Please refer function.c and shrink-wrap.c on how shrink-wrap is
> enabled in thread_prologue_and_epilogue_insns.
>
> To make the check easy, the patch defines a MICRO:
> SUPPORT_SHRINK_WRAP_P and replace the uses in ira.c and ifcvt.c
>
> Bootstrap and no make check regression on X86-64.
>
> OK for trunk?
>
> Thanks!
> -Zhenqiang
>
> ChangeLog:
> 2014-08-14  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>          * shrink-wrap.h: #define SUPPORT_SHRINK_WRAP_P.
>          * ira.c: #include "shrink-wrap.h"
>          (split_live_ranges_for_shrink_wrap): Use SUPPORT_SHRINK_WRAP_P.
>          * ifcvt.c: #include "shrink-wrap.h"
>          (dead_or_predicable): Use SUPPORT_SHRINK_WRAP_P.
So what's the motivation behind this patch?   I can probably guess the 
motivation, but I might guess wrong.  Since you know the motivation, 
it's best if you just tell everyone what it is.


>
> testsuite/ChangeLog:
> 2014-08-14  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>          * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.
Testcase wasn't included in the patchkit.

 From a pure bikeshedding standpoint "SUPPORT_SHRINK_WRAP_P" seems 
poorly named.  SHRINK_WRAPPING_ENABLED seems like a better name to me.

Can you repost with the testcase included, name change and basic 
rationale behind why you want to make this change.  I'm pretty sure 
it'll be OK at that point.

jeff
Jeff Law Sept. 5, 2014, 4:45 a.m. UTC | #2
On 09/01/14 02:13, Zhenqiang Chen wrote:
>
> To split live-range of register, split_live_ranges_for_shrink_wrap will
> introduce additional register copies. If such copies can not be optimized by
> later optimizations, it will lead to code size and performance regression.
> My tests on ARM THUMB1 code size show lots of regressions due to additional
> register copies. Shrink-wrap is not enabled for ARM THUMB1, so I think
> split_live_ranges_for_shrink_wrap should not be called.
So has anyone looked at why IRA ends up selecting different registers 
for the source/dest of these copies?   Odds are it's just an artifact of 
the heuristics in use, but I'd like to make sure there isn't something 
inherently wrong happening in IRA that's causing it to not tie the 
source/dest of those copies.



> ChangeLog:
> 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>          * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED.
>          * ira.c: #include "shrink-wrap.h"
>          (split_live_ranges_for_shrink_wrap): Use SHRINK_WRAPPING_ENABLED.
>          * ifcvt.c: #include "shrink-wrap.h"
>          (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED.
>
> testsuite/ChangeLog:
> 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
>
>          * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.
Thanks.  OK for the trunk.

As noted above, it'd may be worth spending a little time looking at the 
regressions without this patch installed to see why IRA isn't doing a 
good job of tying the source/dest of these copies together -- perhaps 
there's something that's been overlooked and fixing it may be beneficial.

jeff
Zhenqiang Chen Sept. 9, 2014, 6:23 a.m. UTC | #3
> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Friday, September 05, 2014 12:45 PM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, ira] Miss checks in split_live_ranges_for_shrink_wrap
> 
> On 09/01/14 02:13, Zhenqiang Chen wrote:
> >
> > To split live-range of register, split_live_ranges_for_shrink_wrap
> > will introduce additional register copies. If such copies can not be
> > optimized by later optimizations, it will lead to code size and
performance
> regression.
> > My tests on ARM THUMB1 code size show lots of regressions due to
> > additional register copies. Shrink-wrap is not enabled for ARM THUMB1,
> > so I think split_live_ranges_for_shrink_wrap should not be called.
> So has anyone looked at why IRA ends up selecting different registers
> for the source/dest of these copies?   Odds are it's just an artifact of
> the heuristics in use, but I'd like to make sure there isn't something
> inherently wrong happening in IRA that's causing it to not tie the
source/dest
> of those copies.
> 
> 
> 
> > ChangeLog:
> > 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >          * shrink-wrap.h: #define SHRINK_WRAPPING_ENABLED.
> >          * ira.c: #include "shrink-wrap.h"
> >          (split_live_ranges_for_shrink_wrap): Use
> SHRINK_WRAPPING_ENABLED.
> >          * ifcvt.c: #include "shrink-wrap.h"
> >          (dead_or_predicable): Use SHRINK_WRAPPING_ENABLED.
> >
> > testsuite/ChangeLog:
> > 2014-09-01  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> >
> >          * gcc.target/arm/split-live-ranges-for-shrink-wrap.c: New test.
> Thanks.  OK for the trunk.

Thanks. The patch is installed @r215041.
 
> As noted above, it'd may be worth spending a little time looking at the
> regressions without this patch installed to see why IRA isn't doing a good
job
> of tying the source/dest of these copies together -- perhaps there's
> something that's been overlooked and fixing it may be beneficial.

I had investigated it. Compared with 4.8, the allocation order and conflict
cost might be the root cause. A bug is submitted: PR63210.

Thanks!
-Zhenqiang 
 
> jeff
>
diff mbox

Patch

diff --git a/gcc/shrink-wrap.h b/gcc/shrink-wrap.h
index bccfb31..31ce2d4 100644
--- a/gcc/shrink-wrap.h
+++ b/gcc/shrink-wrap.h
@@ -45,6 +45,9 @@  extern edge get_unconverted_simple_return (edge, bitmap_head,
 extern void convert_to_simple_return (edge entry_edge, edge orig_entry_edge,
                                      bitmap_head bb_flags, rtx returnjump,
                                      vec<edge> unconverted_simple_returns);
+#define SUPPORT_SHRINK_WRAP_P (flag_shrink_wrap && HAVE_simple_return)
+#else
+#define SUPPORT_SHRINK_WRAP_P false
 #endif

 #endif  /* GCC_SHRINK_WRAP_H  */
diff --git a/gcc/ira.c b/gcc/ira.c
index ccc6c79..8d58b60 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -392,6 +392,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "lra.h"
 #include "dce.h"
 #include "dbgcnt.h"
+#include "shrink-wrap.h"

 struct target_ira default_target_ira;
 struct target_ira_int default_target_ira_int;
@@ -4775,7 +4776,7 @@  split_live_ranges_for_shrink_wrap (void)
   bitmap_head need_new, reachable;
   vec<basic_block> queue;

-  if (!flag_shrink_wrap)
+  if (!SUPPORT_SHRINK_WRAP_P)
     return false;

   bitmap_initialize (&need_new, 0);
diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index e44c1dc..c44e1c2 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -42,6 +42,7 @@ 
 #include "df.h"
 #include "vec.h"
 #include "dbgcnt.h"
+#include "shrink-wrap.h"

 #ifndef HAVE_conditional_move
 #define HAVE_conditional_move 0
@@ -4239,14 +4240,13 @@  dead_or_predicable (basic_block test_bb,
basic_block merge_bb,
        if (NONDEBUG_INSN_P (insn))
          df_simulate_find_defs (insn, merge_set);

-#ifdef HAVE_simple_return
       /* If shrink-wrapping, disable this optimization when test_bb is
         the first basic block and merge_bb exits.  The idea is to not
         move code setting up a return register as that may clobber a
         register used to pass function parameters, which then must be
         saved in caller-saved regs.  A caller-saved reg requires the
         prologue, killing a shrink-wrap opportunity.  */
-      if ((flag_shrink_wrap && HAVE_simple_return && !epilogue_completed)
+      if ((SUPPORT_SHRINK_WRAP_P && !epilogue_completed)
          && ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb == test_bb
          && single_succ_p (new_dest)