fold x ^ y to 0 if x == y

On 20 July 2016 at 16:35, Richard Biener <rguenther@suse.de> wrote:
> On Wed, 20 Jul 2016, Prathamesh Kulkarni wrote:

>

>> On 8 July 2016 at 12:29, Richard Biener <rguenther@suse.de> wrote:

>> > On Fri, 8 Jul 2016, Richard Biener wrote:

>> >

>> >> On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote:

>> >>

>> >> > Hi Richard,

>> >> > For the following test-case:

>> >> >

>> >> > int f(int x, int y)

>> >> > {

>> >> >    int ret;

>> >> >

>> >> >    if (x == y)

>> >> >      ret = x ^ y;

>> >> >    else

>> >> >      ret = 1;

>> >> >

>> >> >    return ret;

>> >> > }

>> >> >

>> >> > I was wondering if x ^ y should be folded to 0 since

>> >> > it's guarded by condition x == y ?

>> >> >

>> >> > optimized dump shows:

>> >> > f (int x, int y)

>> >> > {

>> >> >   int iftmp.0_1;

>> >> >   int iftmp.0_4;

>> >> >

>> >> >   <bb 2>:

>> >> >   if (x_2(D) == y_3(D))

>> >> >     goto <bb 3>;

>> >> >   else

>> >> >     goto <bb 4>;

>> >> >

>> >> >   <bb 3>:

>> >> >   iftmp.0_4 = x_2(D) ^ y_3(D);

>> >> >

>> >> >   <bb 4>:

>> >> >   # iftmp.0_1 = PHI <iftmp.0_4(3), 1(2)>

>> >> >   return iftmp.0_1;

>> >> >

>> >> > }

>> >> >

>> >> > The attached patch tries to fold for above case.

>> >> > I am checking if op0 and op1 are equal using:

>> >> > if (bitmap_intersect_p (vr1->equiv, vr2->equiv)

>> >> >    && operand_equal_p (vr1->min, vr1->max)

>> >> >    && operand_equal_p (vr2->min, vr2->max))

>> >> >   { /* equal /* }

>> >> >

>> >> > I suppose intersection would check if op0 and op1 have equivalent ranges,

>> >> > and added operand_equal_p check to ensure that there is only one

>> >> > element within the range. Does that look correct ?

>> >> > Bootstrap+test in progress on x86_64-unknown-linux-gnu.

>> >>

>> >> I think VRP is the wrong place to catch this and DOM should have but it

>> >> does

>> >>

>> >> Optimizing block #3

>> >>

>> >> 1>>> STMT 1 = x_2(D) le_expr y_3(D)

>> >> 1>>> STMT 1 = x_2(D) ge_expr y_3(D)

>> >> 1>>> STMT 1 = x_2(D) eq_expr y_3(D)

>> >> 1>>> STMT 0 = x_2(D) ne_expr y_3(D)

>> >> 0>>> COPY x_2(D) = y_3(D)

>> >> 0>>> COPY y_3(D) = x_2(D)

>> >> Optimizing statement ret_4 = x_2(D) ^ y_3(D);

>> >>   Replaced 'x_2(D)' with variable 'y_3(D)'

>> >>   Replaced 'y_3(D)' with variable 'x_2(D)'

>> >>   Folded to: ret_4 = x_2(D) ^ y_3(D);

>> >> LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D)

>> >>

>> >> heh, registering both reqivalencies is obviously not going to help...

>> >>

>> >> The 2nd equivalence is from doing

>> >>

>> >>       /* We already recorded that LHS = RHS, with canonicalization,

>> >>          value chain following, etc.

>> >>

>> >>          We also want to record RHS = LHS, but without any

>> >> canonicalization

>> >>          or value chain following.  */

>> >>       if (TREE_CODE (rhs) == SSA_NAME)

>> >>         const_and_copies->record_const_or_copy_raw (rhs, lhs,

>> >>                                                     SSA_NAME_VALUE (rhs));

>> >>

>> >> generally recording both is not helpful.  Jeff?  This seems to be

>> >> r233207 (fix for PR65917) which must have regressed this testcase.

>> >

>> > Just verified it works fine on the GCC 5 branch:

>> >

>> > Optimizing block #3

>> >

>> > 0>>> COPY y_3(D) = x_2(D)

>> > 1>>> STMT 1 = x_2(D) le_expr y_3(D)

>> > 1>>> STMT 1 = x_2(D) ge_expr y_3(D)

>> > 1>>> STMT 1 = x_2(D) eq_expr y_3(D)

>> > 1>>> STMT 0 = x_2(D) ne_expr y_3(D)

>> > Optimizing statement ret_4 = x_2(D) ^ y_3(D);

>> >   Replaced 'y_3(D)' with variable 'x_2(D)'

>> > Applying pattern match.pd:240, gimple-match.c:11346

>> > gimple_simplified to ret_4 = 0;

>> >   Folded to: ret_4 = 0;

>> I have reported it as PR71947.

>> Could you help me point out how to fix this ?

>

> Not record both equivalences.  This might break the testcase it was

> introduced for (obviously).  Which is why I CCed Jeff for his opinion.

Well, folding happens for x - y, if x == y.

int f(int x, int y)
{
  int ret;
  if (x == y)
    ret = x - y;
  else
    ret = 1;

  return ret;
}

For the above test-case, extract_range_from_binary_expr_1()
determines that range of ret = [0, 0]
and propagates it.

vrp1 dump:
f (int x, int y)
{
  int ret;

  <bb 2>:
  if (x_2(D) == y_3(D))
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  ret_4 = x_2(D) - y_3(D);

  <bb 4>:
  # ret_1 = PHI <0(3), 10(2)>
  return ret_1;

}

Then the dce pass removes ret_4 = x_2(D) - y_3(D) since it's
redundant.
However it appears vrp fails to notice the equality for the following test-case,
and sets range for ret to VARYING.

int f(int x, int y, int a, int b)
{
  int ret = 10;
  if (a == x
      && b == y
      && a == b)
    ret = x - y;

  return ret;
}

Looking at the vrp dump, shows the following form
after inserting ASSERT_EXPR:

SSA form after inserting ASSERT_EXPRs
f (int x, int y, int a, int b)
{
  int ret;
  _Bool _1;
  _Bool _2;
  _Bool _3;

  <bb 2>:
  _1 = a_5(D) == x_6(D);
  _2 = b_7(D) == y_8(D);
  _3 = _1 & _2;
  if (_3 != 0)
    goto <bb 3>;
  else
    goto <bb 5>;

  <bb 3>:
  a_11 = ASSERT_EXPR <a_5(D), a_5(D) == x_6(D)>;
  x_12 = ASSERT_EXPR <x_6(D), x_6(D) == a_11>;
  b_13 = ASSERT_EXPR <b_7(D), b_7(D) == y_8(D)>;
  y_14 = ASSERT_EXPR <y_8(D), y_8(D) == b_13>;
  if (a_11 == b_13)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:
  ret_9 = x_12 ^ y_14;

  <bb 5>:
  # ret_4 = PHI <10(2), 10(3), ret_9(4)>
  return ret_4;

}

Shouldn't there also be a range assertion for a_11 == b_13 ?

Should we modify extract_range_from_binary_expr_1 to handle
this case for bit_xor_expr so similar to minus_expr case,
the range [0, 0] would get propagated and make ret = x ^ y redundant
which will then be removed by dce pass ?
I have attached untested patch for that.
Does it look OK ?
(It also doesn't handle  a == x && b == y && a == b case).

Thanks,
Prathamesh
>

> Richard.

>

> --

> Richard Biener <rguenther@suse.de>

> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

fold x ^ y to 0 if x == y

Commit Message

Comments

Patch