diff mbox

fold x ^ y to 0 if x == y

Message ID CAAgBjM=K8jOscVOjPtmycVQRVwEp_1ZEvi9jFNSKAze-tFqp4g@mail.gmail.com
State New
Headers show

Commit Message

Prathamesh Kulkarni July 20, 2016, 10:07 p.m. UTC
On 20 July 2016 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 20 Jul 2016, Prathamesh Kulkarni wrote:

>

>> On 8 July 2016 at 12:29, Richard Biener <rguenther@suse.de> wrote:

>> > On Fri, 8 Jul 2016, Richard Biener wrote:

>> >

>> >> On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote:

>> >>

>> >> > Hi Richard,

>> >> > For the following test-case:

>> >> >

>> >> > int f(int x, int y)

>> >> > {

>> >> >    int ret;

>> >> >

>> >> >    if (x == y)

>> >> >      ret = x ^ y;

>> >> >    else

>> >> >      ret = 1;

>> >> >

>> >> >    return ret;

>> >> > }

>> >> >

>> >> > I was wondering if x ^ y should be folded to 0 since

>> >> > it's guarded by condition x == y ?

>> >> >

>> >> > optimized dump shows:

>> >> > f (int x, int y)

>> >> > {

>> >> >   int iftmp.0_1;

>> >> >   int iftmp.0_4;

>> >> >

>> >> >   <bb 2>:

>> >> >   if (x_2(D) == y_3(D))

>> >> >     goto <bb 3>;

>> >> >   else

>> >> >     goto <bb 4>;

>> >> >

>> >> >   <bb 3>:

>> >> >   iftmp.0_4 = x_2(D) ^ y_3(D);

>> >> >

>> >> >   <bb 4>:

>> >> >   # iftmp.0_1 = PHI <iftmp.0_4(3), 1(2)>

>> >> >   return iftmp.0_1;

>> >> >

>> >> > }

>> >> >

>> >> > The attached patch tries to fold for above case.

>> >> > I am checking if op0 and op1 are equal using:

>> >> > if (bitmap_intersect_p (vr1->equiv, vr2->equiv)

>> >> >    && operand_equal_p (vr1->min, vr1->max)

>> >> >    && operand_equal_p (vr2->min, vr2->max))

>> >> >   { /* equal /* }

>> >> >

>> >> > I suppose intersection would check if op0 and op1 have equivalent ranges,

>> >> > and added operand_equal_p check to ensure that there is only one

>> >> > element within the range. Does that look correct ?

>> >> > Bootstrap+test in progress on x86_64-unknown-linux-gnu.

>> >>

>> >> I think VRP is the wrong place to catch this and DOM should have but it

>> >> does

>> >>

>> >> Optimizing block #3

>> >>

>> >> 1>>> STMT 1 = x_2(D) le_expr y_3(D)

>> >> 1>>> STMT 1 = x_2(D) ge_expr y_3(D)

>> >> 1>>> STMT 1 = x_2(D) eq_expr y_3(D)

>> >> 1>>> STMT 0 = x_2(D) ne_expr y_3(D)

>> >> 0>>> COPY x_2(D) = y_3(D)

>> >> 0>>> COPY y_3(D) = x_2(D)

>> >> Optimizing statement ret_4 = x_2(D) ^ y_3(D);

>> >>   Replaced 'x_2(D)' with variable 'y_3(D)'

>> >>   Replaced 'y_3(D)' with variable 'x_2(D)'

>> >>   Folded to: ret_4 = x_2(D) ^ y_3(D);

>> >> LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D)

>> >>

>> >> heh, registering both reqivalencies is obviously not going to help...

>> >>

>> >> The 2nd equivalence is from doing

>> >>

>> >>       /* We already recorded that LHS = RHS, with canonicalization,

>> >>          value chain following, etc.

>> >>

>> >>          We also want to record RHS = LHS, but without any

>> >> canonicalization

>> >>          or value chain following.  */

>> >>       if (TREE_CODE (rhs) == SSA_NAME)

>> >>         const_and_copies->record_const_or_copy_raw (rhs, lhs,

>> >>                                                     SSA_NAME_VALUE (rhs));

>> >>

>> >> generally recording both is not helpful.  Jeff?  This seems to be

>> >> r233207 (fix for PR65917) which must have regressed this testcase.

>> >

>> > Just verified it works fine on the GCC 5 branch:

>> >

>> > Optimizing block #3

>> >

>> > 0>>> COPY y_3(D) = x_2(D)

>> > 1>>> STMT 1 = x_2(D) le_expr y_3(D)

>> > 1>>> STMT 1 = x_2(D) ge_expr y_3(D)

>> > 1>>> STMT 1 = x_2(D) eq_expr y_3(D)

>> > 1>>> STMT 0 = x_2(D) ne_expr y_3(D)

>> > Optimizing statement ret_4 = x_2(D) ^ y_3(D);

>> >   Replaced 'y_3(D)' with variable 'x_2(D)'

>> > Applying pattern match.pd:240, gimple-match.c:11346

>> > gimple_simplified to ret_4 = 0;

>> >   Folded to: ret_4 = 0;

>> I have reported it as PR71947.

>> Could you help me point out how to fix this ?

>

> Not record both equivalences.  This might break the testcase it was

> introduced for (obviously).  Which is why I CCed Jeff for his opinion.

Well, folding happens for x - y, if x == y.

int f(int x, int y)
{
  int ret;
  if (x == y)
    ret = x - y;
  else
    ret = 1;

  return ret;
}

For the above test-case, extract_range_from_binary_expr_1()
determines that range of ret = [0, 0]
and propagates it.

vrp1 dump:
f (int x, int y)
{
  int ret;

  <bb 2>:
  if (x_2(D) == y_3(D))
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  ret_4 = x_2(D) - y_3(D);

  <bb 4>:
  # ret_1 = PHI <0(3), 10(2)>
  return ret_1;

}

Then the dce pass removes ret_4 = x_2(D) - y_3(D) since it's
redundant.
However it appears vrp fails to notice the equality for the following test-case,
and sets range for ret to VARYING.

int f(int x, int y, int a, int b)
{
  int ret = 10;
  if (a == x
      && b == y
      && a == b)
    ret = x - y;

  return ret;
}

Looking at the vrp dump, shows the following form
after inserting ASSERT_EXPR:

SSA form after inserting ASSERT_EXPRs
f (int x, int y, int a, int b)
{
  int ret;
  _Bool _1;
  _Bool _2;
  _Bool _3;

  <bb 2>:
  _1 = a_5(D) == x_6(D);
  _2 = b_7(D) == y_8(D);
  _3 = _1 & _2;
  if (_3 != 0)
    goto <bb 3>;
  else
    goto <bb 5>;

  <bb 3>:
  a_11 = ASSERT_EXPR <a_5(D), a_5(D) == x_6(D)>;
  x_12 = ASSERT_EXPR <x_6(D), x_6(D) == a_11>;
  b_13 = ASSERT_EXPR <b_7(D), b_7(D) == y_8(D)>;
  y_14 = ASSERT_EXPR <y_8(D), y_8(D) == b_13>;
  if (a_11 == b_13)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:
  ret_9 = x_12 ^ y_14;

  <bb 5>:
  # ret_4 = PHI <10(2), 10(3), ret_9(4)>
  return ret_4;

}

Shouldn't there also be a range assertion for a_11 == b_13 ?

Should we modify extract_range_from_binary_expr_1 to handle
this case for bit_xor_expr so similar to minus_expr case,
the range [0, 0] would get propagated and make ret = x ^ y redundant
which will then be removed by dce pass ?
I have attached untested patch for that.
Does it look OK ?
(It also doesn't handle  a == x && b == y && a == b case).

Thanks,
Prathamesh
>

> Richard.

>

> --

> Richard Biener <rguenther@suse.de>

> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

Comments

Prathamesh Kulkarni July 20, 2016, 10:09 p.m. UTC | #1
On 20 July 2016 at 23:07, Prathamesh Kulkarni
<prathamesh.kulkarni@linaro.org> wrote:
> On 20 July 2016 at 16:35, Richard Biener <rguenther@suse.de> wrote:

>> On Wed, 20 Jul 2016, Prathamesh Kulkarni wrote:

>>

>>> On 8 July 2016 at 12:29, Richard Biener <rguenther@suse.de> wrote:

>>> > On Fri, 8 Jul 2016, Richard Biener wrote:

>>> >

>>> >> On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote:

>>> >>

>>> >> > Hi Richard,

>>> >> > For the following test-case:

>>> >> >

>>> >> > int f(int x, int y)

>>> >> > {

>>> >> >    int ret;

>>> >> >

>>> >> >    if (x == y)

>>> >> >      ret = x ^ y;

>>> >> >    else

>>> >> >      ret = 1;

>>> >> >

>>> >> >    return ret;

>>> >> > }

>>> >> >

>>> >> > I was wondering if x ^ y should be folded to 0 since

>>> >> > it's guarded by condition x == y ?

>>> >> >

>>> >> > optimized dump shows:

>>> >> > f (int x, int y)

>>> >> > {

>>> >> >   int iftmp.0_1;

>>> >> >   int iftmp.0_4;

>>> >> >

>>> >> >   <bb 2>:

>>> >> >   if (x_2(D) == y_3(D))

>>> >> >     goto <bb 3>;

>>> >> >   else

>>> >> >     goto <bb 4>;

>>> >> >

>>> >> >   <bb 3>:

>>> >> >   iftmp.0_4 = x_2(D) ^ y_3(D);

>>> >> >

>>> >> >   <bb 4>:

>>> >> >   # iftmp.0_1 = PHI <iftmp.0_4(3), 1(2)>

>>> >> >   return iftmp.0_1;

>>> >> >

>>> >> > }

>>> >> >

>>> >> > The attached patch tries to fold for above case.

>>> >> > I am checking if op0 and op1 are equal using:

>>> >> > if (bitmap_intersect_p (vr1->equiv, vr2->equiv)

>>> >> >    && operand_equal_p (vr1->min, vr1->max)

>>> >> >    && operand_equal_p (vr2->min, vr2->max))

>>> >> >   { /* equal /* }

>>> >> >

>>> >> > I suppose intersection would check if op0 and op1 have equivalent ranges,

>>> >> > and added operand_equal_p check to ensure that there is only one

>>> >> > element within the range. Does that look correct ?

>>> >> > Bootstrap+test in progress on x86_64-unknown-linux-gnu.

>>> >>

>>> >> I think VRP is the wrong place to catch this and DOM should have but it

>>> >> does

>>> >>

>>> >> Optimizing block #3

>>> >>

>>> >> 1>>> STMT 1 = x_2(D) le_expr y_3(D)

>>> >> 1>>> STMT 1 = x_2(D) ge_expr y_3(D)

>>> >> 1>>> STMT 1 = x_2(D) eq_expr y_3(D)

>>> >> 1>>> STMT 0 = x_2(D) ne_expr y_3(D)

>>> >> 0>>> COPY x_2(D) = y_3(D)

>>> >> 0>>> COPY y_3(D) = x_2(D)

>>> >> Optimizing statement ret_4 = x_2(D) ^ y_3(D);

>>> >>   Replaced 'x_2(D)' with variable 'y_3(D)'

>>> >>   Replaced 'y_3(D)' with variable 'x_2(D)'

>>> >>   Folded to: ret_4 = x_2(D) ^ y_3(D);

>>> >> LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D)

>>> >>

>>> >> heh, registering both reqivalencies is obviously not going to help...

>>> >>

>>> >> The 2nd equivalence is from doing

>>> >>

>>> >>       /* We already recorded that LHS = RHS, with canonicalization,

>>> >>          value chain following, etc.

>>> >>

>>> >>          We also want to record RHS = LHS, but without any

>>> >> canonicalization

>>> >>          or value chain following.  */

>>> >>       if (TREE_CODE (rhs) == SSA_NAME)

>>> >>         const_and_copies->record_const_or_copy_raw (rhs, lhs,

>>> >>                                                     SSA_NAME_VALUE (rhs));

>>> >>

>>> >> generally recording both is not helpful.  Jeff?  This seems to be

>>> >> r233207 (fix for PR65917) which must have regressed this testcase.

>>> >

>>> > Just verified it works fine on the GCC 5 branch:

>>> >

>>> > Optimizing block #3

>>> >

>>> > 0>>> COPY y_3(D) = x_2(D)

>>> > 1>>> STMT 1 = x_2(D) le_expr y_3(D)

>>> > 1>>> STMT 1 = x_2(D) ge_expr y_3(D)

>>> > 1>>> STMT 1 = x_2(D) eq_expr y_3(D)

>>> > 1>>> STMT 0 = x_2(D) ne_expr y_3(D)

>>> > Optimizing statement ret_4 = x_2(D) ^ y_3(D);

>>> >   Replaced 'y_3(D)' with variable 'x_2(D)'

>>> > Applying pattern match.pd:240, gimple-match.c:11346

>>> > gimple_simplified to ret_4 = 0;

>>> >   Folded to: ret_4 = 0;

>>> I have reported it as PR71947.

>>> Could you help me point out how to fix this ?

>>

>> Not record both equivalences.  This might break the testcase it was

>> introduced for (obviously).  Which is why I CCed Jeff for his opinion.

> Well, folding happens for x - y, if x == y.

>

> int f(int x, int y)

> {

>   int ret;

>   if (x == y)

>     ret = x - y;

>   else

>     ret = 1;

>

>   return ret;

> }

Oops wrong test-case, the dump below is of the following test-case:
int f(int x, int y)
{
  int ret = 10;
  if (x == y)
    ret = x  -  y;
  return ret;
}
>

> For the above test-case, extract_range_from_binary_expr_1()

> determines that range of ret = [0, 0]

> and propagates it.

>

> vrp1 dump:

> f (int x, int y)

> {

>   int ret;

>

>   <bb 2>:

>   if (x_2(D) == y_3(D))

>     goto <bb 3>;

>   else

>     goto <bb 4>;

>

>   <bb 3>:

>   ret_4 = x_2(D) - y_3(D);

>

>   <bb 4>:

>   # ret_1 = PHI <0(3), 10(2)>

>   return ret_1;

>

> }

>

> Then the dce pass removes ret_4 = x_2(D) - y_3(D) since it's

> redundant.

> However it appears vrp fails to notice the equality for the following test-case,

> and sets range for ret to VARYING.

>

> int f(int x, int y, int a, int b)

> {

>   int ret = 10;

>   if (a == x

>       && b == y

>       && a == b)

>     ret = x - y;

>

>   return ret;

> }

>

> Looking at the vrp dump, shows the following form

> after inserting ASSERT_EXPR:

>

> SSA form after inserting ASSERT_EXPRs

> f (int x, int y, int a, int b)

> {

>   int ret;

>   _Bool _1;

>   _Bool _2;

>   _Bool _3;

>

>   <bb 2>:

>   _1 = a_5(D) == x_6(D);

>   _2 = b_7(D) == y_8(D);

>   _3 = _1 & _2;

>   if (_3 != 0)

>     goto <bb 3>;

>   else

>     goto <bb 5>;

>

>   <bb 3>:

>   a_11 = ASSERT_EXPR <a_5(D), a_5(D) == x_6(D)>;

>   x_12 = ASSERT_EXPR <x_6(D), x_6(D) == a_11>;

>   b_13 = ASSERT_EXPR <b_7(D), b_7(D) == y_8(D)>;

>   y_14 = ASSERT_EXPR <y_8(D), y_8(D) == b_13>;

>   if (a_11 == b_13)

>     goto <bb 4>;

>   else

>     goto <bb 5>;

>

>   <bb 4>:

>   ret_9 = x_12 ^ y_14;

>

>   <bb 5>:

>   # ret_4 = PHI <10(2), 10(3), ret_9(4)>

>   return ret_4;

>

> }

>

> Shouldn't there also be a range assertion for a_11 == b_13 ?

>

> Should we modify extract_range_from_binary_expr_1 to handle

> this case for bit_xor_expr so similar to minus_expr case,

> the range [0, 0] would get propagated and make ret = x ^ y redundant

> which will then be removed by dce pass ?

> I have attached untested patch for that.

> Does it look OK ?

> (It also doesn't handle  a == x && b == y && a == b case).

>

> Thanks,

> Prathamesh

>>

>> Richard.

>>

>> --

>> Richard Biener <rguenther@suse.de>

>> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
diff mbox

Patch

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 4333d60..ef00ee2 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2250,6 +2250,7 @@  extract_range_from_binary_expr_1 (value_range *vr,
      TODO, we may be able to derive anti-ranges in some cases.  */
   if (code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
+      && code != BIT_XOR_EXPR
       && code != TRUNC_DIV_EXPR
       && code != FLOOR_DIV_EXPR
       && code != CEIL_DIV_EXPR
@@ -3104,6 +3105,19 @@  extract_range_from_binary_expr_1 (value_range *vr,
 	    ;
 	  else
 	    max = min = NULL_TREE;
+
+	  /* Set range 0 for r, where r = op0 ^ op1 and op0 equals op1.  */
+	  /* FIXME: I am using bitmap_intersect_p() to determine if vr0
+	     and vr1 are equivalent and operand_equal_p() to ensure
+	     that range has only one symbol. Is this correct ?.  */
+	  if (TREE_CODE (vr0.min) == SSA_NAME
+	      && TREE_CODE (vr0.max) == SSA_NAME
+	      && TREE_CODE (vr1.min) == SSA_NAME
+	      && TREE_CODE (vr1.max) == SSA_NAME
+	      && vr0.equiv && vr1.equiv
+	      && bitmap_intersect_p (vr0.equiv, vr1.equiv)
+	      && operand_equal_p (vr0.min, vr0.max, 0))
+	    max = min = build_int_cst (expr_type, 0);
 	}
     }
   else