diff mbox

Fix PR 61225

Message ID CACgzC7D40=bxqag59GYQuXtze_ZAquo3PhpLmh0EGxZte5Mw4A@mail.gmail.com
State New
Headers show

Commit Message

Zhenqiang Chen May 22, 2014, 9:52 a.m. UTC
On 21 May 2014 20:43, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Wed, May 21, 2014 at 11:58 AM, Zhenqiang Chen wrote:
>> Hi,
>>
>> The patch fixes the gcc.target/i386/pr49095.c FAIL in PR61225. The
>> test case tends to check a peephole2 optimization, which optimizes the
>> following sequence
>>
>>     2: bx:SI=ax:SI
>>     25: ax:SI=[bx:SI]
>>     7: {ax:SI=ax:SI-0x1;clobber flags:CC;}
>>     8: [bx:SI]=ax:SI
>>     9: flags:CCZ=cmp(ax:SI,0)
>> to
>>    2: bx:SI=ax:SI
>>    41: {flags:CCZ=cmp([bx:SI]-0x1,0);[bx:SI]=[bx:SI]-0x1;}
>>
>> The enhanced shrink-wrapping, which calls copyprop_hardreg_forward
>> changes the INSN 25 to
>>
>>     25: ax:SI=[ax:SI]
>>
>> Then peephole2 can not optimize it since two memory_operands look like
>> different.
>>
>> To fix it, the patch adds another peephole2 rule to read one more
>> insn. From the register copy, it knows the address is the same.
>
> That is one complex peephole2 to deal with a transformation like this.
> It seems to be like it's a too specific solution for a bigger problem.
>
> Could you please try one of the following solutions instead:
>
> 1. Track register values for peephole2 and try different alternatives
> based on known register equivalences? E.g. in your example, perhaps
> there is already a REG_EQUAL/REG_EQUIV note available on insn 25 after
> copyprop_hardreg_forward, to annotate that [ax:SI] is equivalent to
> [bx:SI] at that point (or if that information is not available, it is
> not very difficult to make it available). Then you could try applying
> peephole2 on the original pattern but also on patterns modified with
> the known equivalences (i.e. try peephole2 on multiple equivalent
> patterns for the same insn). This may expose other peephole2
> opportunities, not just the specific one your patch addresses.

Patch is updated according to the comment. There is no REG_EQUAL. So I
add it when replace_oldest_value_reg.

ChangeLog:
2014-05-22  Zhenqiang Chen  <zhenqiang.chen@linaro.org>

        Part of PR rtl-optimization/61225
        * config/i386/i386-protos.h (ix86_peephole2_rtx_equal_p): New proto.
        * config/i386/i386.c (ix86_peephole2_rtx_equal_p): New function.
        * regcprop.c (replace_oldest_value_reg): Add REG_EQUAL note when
        propagating to SET.

     }

Comments

Jeff Law July 17, 2014, 3:10 a.m. UTC | #1
On 05/22/14 03:52, Zhenqiang Chen wrote:
> On 21 May 2014 20:43, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> On Wed, May 21, 2014 at 11:58 AM, Zhenqiang Chen wrote:
>>> Hi,
>>>
>>> The patch fixes the gcc.target/i386/pr49095.c FAIL in PR61225. The
>>> test case tends to check a peephole2 optimization, which optimizes the
>>> following sequence
>>>
>>>      2: bx:SI=ax:SI
>>>      25: ax:SI=[bx:SI]
>>>      7: {ax:SI=ax:SI-0x1;clobber flags:CC;}
>>>      8: [bx:SI]=ax:SI
>>>      9: flags:CCZ=cmp(ax:SI,0)
>>> to
>>>     2: bx:SI=ax:SI
>>>     41: {flags:CCZ=cmp([bx:SI]-0x1,0);[bx:SI]=[bx:SI]-0x1;}
>>>
>>> The enhanced shrink-wrapping, which calls copyprop_hardreg_forward
>>> changes the INSN 25 to
>>>
>>>      25: ax:SI=[ax:SI]
>>>
>>> Then peephole2 can not optimize it since two memory_operands look like
>>> different.
>>>
>>> To fix it, the patch adds another peephole2 rule to read one more
>>> insn. From the register copy, it knows the address is the same.
>>
>> That is one complex peephole2 to deal with a transformation like this.
>> It seems to be like it's a too specific solution for a bigger problem.
>>
>> Could you please try one of the following solutions instead:
>>
>> 1. Track register values for peephole2 and try different alternatives
>> based on known register equivalences? E.g. in your example, perhaps
>> there is already a REG_EQUAL/REG_EQUIV note available on insn 25 after
>> copyprop_hardreg_forward, to annotate that [ax:SI] is equivalent to
>> [bx:SI] at that point (or if that information is not available, it is
>> not very difficult to make it available). Then you could try applying
>> peephole2 on the original pattern but also on patterns modified with
>> the known equivalences (i.e. try peephole2 on multiple equivalent
>> patterns for the same insn). This may expose other peephole2
>> opportunities, not just the specific one your patch addresses.
>
> Patch is updated according to the comment. There is no REG_EQUAL. So I
> add it when replace_oldest_value_reg.
>
> ChangeLog:
> 2014-05-22  Zhenqiang Chen  <zhenqiang.chen@linaro.org>
>
>          Part of PR rtl-optimization/61225
>          * config/i386/i386-protos.h (ix86_peephole2_rtx_equal_p): New proto.
>          * config/i386/i386.c (ix86_peephole2_rtx_equal_p): New function.
>          * regcprop.c (replace_oldest_value_reg): Add REG_EQUAL note when
>          propagating to SET.
I can't help but wonder why the new 4 insn combination code isn't 
presenting this as a nice big fat insn to the x86 backend which would 
eliminate the need for the peep2.

But, assuming there's a fundamental reason why that's not kicking in...

In replace_oldest_value_reg, why not use reg_overlap_mentioned_p to 
determine if the REGNO of NEW_RTX is modified by INSN?  I'd look to 
avoid some of those calls to single_set (insn).  Just call it once and 
reuse the value.

Shouldn't you be ensuring the REG_EQUAL note is unique?  I think we have 
a routine to avoid creating a note that already exists.

Don't you have to ensure that the value in the REG_EQUAL note has not 
changed?  A REG_EQUAL note denotes an equivalence that holds at the 
single insn where it appears.  If you want to use the value elsewhere 
you'd have to ensure the value hasn't been changed.  If RTX referred to 
by the REG_EQUAL note is a MEM, this can be relatively difficult due to 
aliasing issues.

Jeff
diff mbox

Patch

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 39462bd..0c4a2b9 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -42,6 +42,7 @@  extern enum calling_abi ix86_function_type_abi (const_tree);

 extern void ix86_reset_previous_fndecl (void);

+extern bool ix86_peephole2_rtx_equal_p (rtx, rtx, rtx, rtx);
 #ifdef RTX_CODE
 extern int standard_80387_constant_p (rtx);
 extern const char *standard_80387_constant_opcode (rtx);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 6ffb788..583ebe8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -46856,6 +46856,29 @@  ix86_atomic_assign_expand_fenv (tree *hold,
tree *clear, tree *update)
                    atomic_feraiseexcept_call);
 }

+/* OP0 is the SET_DEST of INSN and OP1 is the SET_SRC of INSN.
+   Check whether OP1 and OP6 is equal.  */
+
+bool
+ix86_peephole2_rtx_equal_p (rtx insn, rtx op0, rtx op1, rtx op6)
+{
+  rtx note;
+
+  if (!reg_overlap_mentioned_p (op0, op1) && rtx_equal_p (op1, op6))
+    return true;
+
+  gcc_assert (single_set (insn)
+             && op0 == SET_DEST (single_set (insn))
+             && op1 == SET_SRC (single_set (insn)));
+
+  note = find_reg_note (insn, REG_EQUAL, NULL_RTX);
+  if (note
+      && !reg_overlap_mentioned_p (op0, XEXP (note, 0))
+      && rtx_equal_p (XEXP (note, 0), op6))
+    return true;
+
+  return false;
+}
 /* Initialize the GCC target structure.  */
 #undef TARGET_RETURN_IN_MEMORY
 #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 44e80ec..b57fc86 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -16996,11 +16996,12 @@ 
                     [(match_dup 0)
                      (match_operand:SWI 2 "<nonmemory_operand>")]))
              (clobber (reg:CC FLAGS_REG))])
-   (set (match_dup 1) (match_dup 0))
+   (set (match_operand:SWI 6 "memory_operand") (match_dup 0))
    (set (reg FLAGS_REG) (compare (match_dup 0) (const_int 0)))]
   "(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
    && peep2_reg_dead_p (4, operands[0])
-   && !reg_overlap_mentioned_p (operands[0], operands[1])
+   && ix86_peephole2_rtx_equal_p (peep2_next_insn (0), operands[0],
+                                 operands[1], operands[6])
    && !reg_overlap_mentioned_p (operands[0], operands[2])
    && (<MODE>mode != QImode
        || immediate_operand (operands[2], QImode)
diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 7a5a4f6..4e09724 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -510,6 +510,22 @@  replace_oldest_value_reg (rtx *loc, enum
reg_class cl, rtx insn,
        fprintf (dump_file, "insn %u: replaced reg %u with %u\n",
                 INSN_UID (insn), REGNO (*loc), REGNO (new_rtx));

+      if (single_set (insn) && GET_CODE (PATTERN (insn)) == SET
+         && GET_MODE_CLASS (GET_MODE (SET_DEST (insn))) != MODE_CC
+         && GET_CODE (SET_SRC (single_set (insn))) != COMPARE)
+       {
+         rtx set = single_set (insn);
+         rtx dest = SET_DEST (set);
+
+         if (REG_P (dest) && REG_P (new_rtx)
+             && REGNO (dest) == REGNO (new_rtx))
+           /* REGNO of the NEW_RTX is modified by INSN.  No way to forwarded
+              it any more.  So add REG_EQUAL note to record its previous
+              value.  This note can be used to check whether two RTXs
+              equal or not.  */
+           add_reg_note (insn, REG_EQUAL, copy_rtx (SET_SRC (set)));
+       }
+
       validate_change (insn, loc, new_rtx, 1);
       return true;