PR78153

Message ID	CAAgBjMm5bQhfPjM8Ff5DrNz+zhhJqq0eTj9_r2G5AEB=LPnHHw@mail.gmail.com
State	Superseded
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: domain of gcc-patches-return-442072-patch=linaro.org@gcc.gnu.org designates 209.132.180.131 as permitted sender) client-ip=209.132.180.131; DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:from:date:message-id:subject:to:content-type; q= dns; s=default; b=m2z1roIPJH7TQ3rkQNLGU2Nb7c4DhUSPDma4Ce55oDu2hy yXkwwP58kQnDnI8ivKel4/GNGh6rUG3cIbEGID+/s+pOVKQg4FB4rk3ixOFA7/HM aGsTdPKWDk/Jpd2Tzt2q5MfgEtDYW2VUwxUZWwov4HgqxcQsUp9o0c6h6cH00= Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk Sender: gcc-patches-owner@gcc.gnu.org MIME-Version: 1.0 From: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> Date: Sun, 20 Nov 2016 19:20:20 +0530 Message-ID: <CAAgBjMm5bQhfPjM8Ff5DrNz+zhhJqq0eTj9_r2G5AEB=LPnHHw@mail.gmail.com> Subject: PR78153 To: gcc Patches <gcc-patches@gcc.gnu.org>, Richard Biener <rguenther@suse.de>, Martin Sebor <msebor@gmail.com> Content-Type: multipart/mixed; boundary=001a114ac280f60aef0541bbd167

Prathamesh Kulkarni Nov. 20, 2016, 1:50 p.m. UTC

Hi,
As suggested by Martin in PR78153 strlen's return value cannot exceed
PTRDIFF_MAX.
So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()
in the attached patch.

However it regressed strlenopt-3.c:

Consider fn1() from strlenopt-3.c:

__attribute__((noinline, noclone)) size_t
fn1 (char *p, char *q)
{
  size_t s = strlen (q);
  strcpy (p, q);
  return s - strlen (p);
}

The optimized dump shows the following:

__attribute__((noclone, noinline))
fn1 (char * p, char * q)
{
  size_t s;
  size_t _7;
  long unsigned int _9;

  <bb 2>:
  s_4 = strlen (q_3(D));
  _9 = s_4 + 1;
  __builtin_memcpy (p_5(D), q_3(D), _9);
  _7 = 0;
  return _7;

}

which introduces the regression, because the test expects "return 0;" in fn1().

The issue seems to be in vrp2:

Before the patch:
Visiting statement:
s_4 = strlen (q_3(D));
Found new range for s_4: VARYING

Visiting statement:
_1 = s_4;
Found new range for _1: [s_4, s_4]
marking stmt to be not simulated again

Visiting statement:
_7 = s_4 - _1;
Applying pattern match.pd:111, gimple-match.c:27997
Match-and-simplified s_4 - _1 to 0
Intersecting
  [0, 0]
and
  [0, +INF]
to
  [0, 0]
Found new range for _7: [0, 0]

__attribute__((noclone, noinline))
fn1 (char * p, char * q)
{
  size_t s;
  long unsigned int _1;
  long unsigned int _9;

  <bb 2>:
  s_4 = strlen (q_3(D));
  _9 = s_4 + 1;
  __builtin_memcpy (p_5(D), q_3(D), _9);
  _1 = s_4;
  return 0;

}


After the patch:
Visiting statement:
s_4 = strlen (q_3(D));
Intersecting
  [0, 9223372036854775806]
and
  [0, 9223372036854775806]
to
  [0, 9223372036854775806]
Found new range for s_4: [0, 9223372036854775806]
marking stmt to be not simulated again

Visiting statement:
_1 = s_4;
Intersecting
  [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
and
  [0, 9223372036854775806]
to
  [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)
Found new range for _1: [0, 9223372036854775806]
marking stmt to be not simulated again

Visiting statement:
_7 = s_4 - _1;
Intersecting
  ~[9223372036854775807, 9223372036854775809]
and
  ~[9223372036854775807, 9223372036854775809]
to
  ~[9223372036854775807, 9223372036854775809]
Found new range for _7: ~[9223372036854775807, 9223372036854775809]
marking stmt to be not simulated again

__attribute__((noclone, noinline))
fn1 (char * p, char * q)
{
  size_t s;
  long unsigned int _1;
  size_t _7;
  long unsigned int _9;

  <bb 2>:
  s_4 = strlen (q_3(D));
  _9 = s_4 + 1;
  __builtin_memcpy (p_5(D), q_3(D), _9);
  _1 = s_4;
  _7 = s_4 - _1;
  return _7;

}

Then forwprop4 turns
_1 = s_4
_7 = s_4 - _1
into
_7 = 0

and we end up with:
_7 = 0
return _7
in optimized dump.

Running ccp again after forwprop4 trivially solves the issue, however
I am not sure if we want to run ccp again ?

The issue is probably with extract_range_from_ssa_name():
For _1 = s_4

Before patch:
VR for s_4 is set to varying.
So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.
Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,
and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using
match.pd pattern x - x -> 0).

After patch:
VR for s_4 is set to [0, PTRDIFF_MAX - 1]
And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]
so IIUC, we then lose the information that _1 is equal to s_4,
and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.
forwprop4 does that because it sees that s_4 and _1 are equivalent.
Does this sound correct ?

I am not sure how to proceed with the patch,
and would be grateful for suggestions.

Thanks,
Prathamesh

Jakub Jelinek Nov. 20, 2016, 2:04 p.m. UTC | #1

On Sun, Nov 20, 2016 at 07:20:20PM +0530, Prathamesh Kulkarni wrote:
> --- a/gcc/tree-vrp.c

> +++ b/gcc/tree-vrp.c

> @@ -4013,6 +4013,16 @@ extract_range_basic (value_range *vr, gimple *stmt)

>  			          : vrp_val_max (type), NULL);

>  	  }

>  	  return;

> +	case CFN_BUILT_IN_STRLEN:

> +	  {

> +	    tree type = TREE_TYPE (gimple_call_lhs (stmt));

> +	    unsigned HOST_WIDE_INT max =

> +		TREE_INT_CST_LOW (vrp_val_max (ptrdiff_type_node)) - 1;


Wrong formatting, = should go on the next line, and should be indented only
2 columns more than the previous line.  Plus TREE_INT_CST_LOW really
shouldn't be used in new code.  You should use tree_to_uhwi or tree_to_shwi
instead.  Why the -1?  Can you just
fold_convert (type, TYPE_MAX_VALUE (ptrdiff_type_node)); ?
Or, if you really want the -1, e.g. wide_int max = vrp_val_max (ptrdiff_type_node);
wide_int_to_tree (type, max - 1);
or something similar.
> +

> +	    set_value_range (vr, VR_RANGE, build_int_cst (type, 0),

> +			     build_int_cst (type, max), NULL);

> +	  }

> +	  return;

>  	default:

>  	  break;

>  	}


	Jakub

Prathamesh Kulkarni Nov. 20, 2016, 2:13 p.m. UTC | #2

On 20 November 2016 at 19:34, Jakub Jelinek <jakub@redhat.com> wrote:
> On Sun, Nov 20, 2016 at 07:20:20PM +0530, Prathamesh Kulkarni wrote:

>> --- a/gcc/tree-vrp.c

>> +++ b/gcc/tree-vrp.c

>> @@ -4013,6 +4013,16 @@ extract_range_basic (value_range *vr, gimple *stmt)

>>                                 : vrp_val_max (type), NULL);

>>         }

>>         return;

>> +     case CFN_BUILT_IN_STRLEN:

>> +       {

>> +         tree type = TREE_TYPE (gimple_call_lhs (stmt));

>> +         unsigned HOST_WIDE_INT max =

>> +             TREE_INT_CST_LOW (vrp_val_max (ptrdiff_type_node)) - 1;

>

> Wrong formatting, = should go on the next line, and should be indented only

> 2 columns more than the previous line.  Plus TREE_INT_CST_LOW really

> shouldn't be used in new code.  You should use tree_to_uhwi or tree_to_shwi

> instead.  Why the -1?  Can you just

> fold_convert (type, TYPE_MAX_VALUE (ptrdiff_type_node)); ?

> Or, if you really want the -1, e.g. wide_int max = vrp_val_max (ptrdiff_type_node);

> wide_int_to_tree (type, max - 1);

> or something similar.

Hi Jakub,
Thanks for the suggestions.
Sorry I wrote misleading info in the patch. As per PR, strlen's return value
should always be less than PTRDIFF_MAX, so I am setting the range to
[0, PTRDIFF_MAX - 1].
I will use wide_int in the next version of patch.

Thanks,
Prathamesh
>> +

>> +         set_value_range (vr, VR_RANGE, build_int_cst (type, 0),

>> +                          build_int_cst (type, max), NULL);

>> +       }

>> +       return;

>>       default:

>>         break;

>>       }

>

>         Jakub

Richard Biener Nov. 21, 2016, 9:40 a.m. UTC | #3

On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:

> Hi,

> As suggested by Martin in PR78153 strlen's return value cannot exceed

> PTRDIFF_MAX.

> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()

> in the attached patch.

> 

> However it regressed strlenopt-3.c:

> 

> Consider fn1() from strlenopt-3.c:

> 

> __attribute__((noinline, noclone)) size_t

> fn1 (char *p, char *q)

> {

>   size_t s = strlen (q);

>   strcpy (p, q);

>   return s - strlen (p);

> }

> 

> The optimized dump shows the following:

> 

> __attribute__((noclone, noinline))

> fn1 (char * p, char * q)

> {

>   size_t s;

>   size_t _7;

>   long unsigned int _9;

> 

>   <bb 2>:

>   s_4 = strlen (q_3(D));

>   _9 = s_4 + 1;

>   __builtin_memcpy (p_5(D), q_3(D), _9);

>   _7 = 0;

>   return _7;

> 

> }

> 

> which introduces the regression, because the test expects "return 0;" in fn1().

> 

> The issue seems to be in vrp2:

> 

> Before the patch:

> Visiting statement:

> s_4 = strlen (q_3(D));

> Found new range for s_4: VARYING

> 

> Visiting statement:

> _1 = s_4;

> Found new range for _1: [s_4, s_4]

> marking stmt to be not simulated again

> 

> Visiting statement:

> _7 = s_4 - _1;

> Applying pattern match.pd:111, gimple-match.c:27997

> Match-and-simplified s_4 - _1 to 0

> Intersecting

>   [0, 0]

> and

>   [0, +INF]

> to

>   [0, 0]

> Found new range for _7: [0, 0]

> 

> __attribute__((noclone, noinline))

> fn1 (char * p, char * q)

> {

>   size_t s;

>   long unsigned int _1;

>   long unsigned int _9;

> 

>   <bb 2>:

>   s_4 = strlen (q_3(D));

>   _9 = s_4 + 1;

>   __builtin_memcpy (p_5(D), q_3(D), _9);

>   _1 = s_4;

>   return 0;

> 

> }

> 

> 

> After the patch:

> Visiting statement:

> s_4 = strlen (q_3(D));

> Intersecting

>   [0, 9223372036854775806]

> and

>   [0, 9223372036854775806]

> to

>   [0, 9223372036854775806]

> Found new range for s_4: [0, 9223372036854775806]

> marking stmt to be not simulated again

> 

> Visiting statement:

> _1 = s_4;

> Intersecting

>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

> and

>   [0, 9223372036854775806]

> to

>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

> Found new range for _1: [0, 9223372036854775806]

> marking stmt to be not simulated again

> 

> Visiting statement:

> _7 = s_4 - _1;

> Intersecting

>   ~[9223372036854775807, 9223372036854775809]

> and

>   ~[9223372036854775807, 9223372036854775809]

> to

>   ~[9223372036854775807, 9223372036854775809]

> Found new range for _7: ~[9223372036854775807, 9223372036854775809]

> marking stmt to be not simulated again

> 

> __attribute__((noclone, noinline))

> fn1 (char * p, char * q)

> {

>   size_t s;

>   long unsigned int _1;

>   size_t _7;

>   long unsigned int _9;

> 

>   <bb 2>:

>   s_4 = strlen (q_3(D));

>   _9 = s_4 + 1;

>   __builtin_memcpy (p_5(D), q_3(D), _9);

>   _1 = s_4;

>   _7 = s_4 - _1;

>   return _7;

> 

> }

> 

> Then forwprop4 turns

> _1 = s_4

> _7 = s_4 - _1

> into

> _7 = 0

> 

> and we end up with:

> _7 = 0

> return _7

> in optimized dump.

> 

> Running ccp again after forwprop4 trivially solves the issue, however

> I am not sure if we want to run ccp again ?

> 

> The issue is probably with extract_range_from_ssa_name():

> For _1 = s_4

> 

> Before patch:

> VR for s_4 is set to varying.

> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.

> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,

> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using

> match.pd pattern x - x -> 0).

> 

> After patch:

> VR for s_4 is set to [0, PTRDIFF_MAX - 1]

> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]

> so IIUC, we then lose the information that _1 is equal to s_4,


We don't lose it, it's in its set of equivalencies.

> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.

> forwprop4 does that because it sees that s_4 and _1 are equivalent.

> Does this sound correct ?


Yes.  So the issue is really that vrp_visit_assignment_or_call calls
gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when
we do not have a singleton VR_RANGE does not fall back to looking
at equivalences (there's not a good cheap way to do that currently because
VRP doesn't keep a proper copy lattice but simply IORs equivalences
from all equivalences).  In theory simply using the first set bit
might work.  Thus sth like

@@ -7057,6 +7030,12 @@ vrp_valueize (tree name)
              || is_gimple_min_invariant (vr->min))
          && vrp_operand_equal_p (vr->min, vr->max))
        return vr->min;
+      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))
+       {
+         unsigned num = bitmap_first_set_bit (vr->equiv);
+         if (num < SSA_NAME_VERSION (name))
+           return ssa_name (num);
+       }
     }
   return name;
 }

might work with the idea of simply doing canonicalization to one of
the equivalences.  But as we don't allow copies in the SSA def stmt
(via vrp_valueize_1) I'm not sure that's good enough canonicalization.

Richard.

Prathamesh Kulkarni Nov. 22, 2016, 1:02 p.m. UTC | #4

On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote:
> On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:

>

>> Hi,

>> As suggested by Martin in PR78153 strlen's return value cannot exceed

>> PTRDIFF_MAX.

>> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()

>> in the attached patch.

>>

>> However it regressed strlenopt-3.c:

>>

>> Consider fn1() from strlenopt-3.c:

>>

>> __attribute__((noinline, noclone)) size_t

>> fn1 (char *p, char *q)

>> {

>>   size_t s = strlen (q);

>>   strcpy (p, q);

>>   return s - strlen (p);

>> }

>>

>> The optimized dump shows the following:

>>

>> __attribute__((noclone, noinline))

>> fn1 (char * p, char * q)

>> {

>>   size_t s;

>>   size_t _7;

>>   long unsigned int _9;

>>

>>   <bb 2>:

>>   s_4 = strlen (q_3(D));

>>   _9 = s_4 + 1;

>>   __builtin_memcpy (p_5(D), q_3(D), _9);

>>   _7 = 0;

>>   return _7;

>>

>> }

>>

>> which introduces the regression, because the test expects "return 0;" in fn1().

>>

>> The issue seems to be in vrp2:

>>

>> Before the patch:

>> Visiting statement:

>> s_4 = strlen (q_3(D));

>> Found new range for s_4: VARYING

>>

>> Visiting statement:

>> _1 = s_4;

>> Found new range for _1: [s_4, s_4]

>> marking stmt to be not simulated again

>>

>> Visiting statement:

>> _7 = s_4 - _1;

>> Applying pattern match.pd:111, gimple-match.c:27997

>> Match-and-simplified s_4 - _1 to 0

>> Intersecting

>>   [0, 0]

>> and

>>   [0, +INF]

>> to

>>   [0, 0]

>> Found new range for _7: [0, 0]

>>

>> __attribute__((noclone, noinline))

>> fn1 (char * p, char * q)

>> {

>>   size_t s;

>>   long unsigned int _1;

>>   long unsigned int _9;

>>

>>   <bb 2>:

>>   s_4 = strlen (q_3(D));

>>   _9 = s_4 + 1;

>>   __builtin_memcpy (p_5(D), q_3(D), _9);

>>   _1 = s_4;

>>   return 0;

>>

>> }

>>

>>

>> After the patch:

>> Visiting statement:

>> s_4 = strlen (q_3(D));

>> Intersecting

>>   [0, 9223372036854775806]

>> and

>>   [0, 9223372036854775806]

>> to

>>   [0, 9223372036854775806]

>> Found new range for s_4: [0, 9223372036854775806]

>> marking stmt to be not simulated again

>>

>> Visiting statement:

>> _1 = s_4;

>> Intersecting

>>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

>> and

>>   [0, 9223372036854775806]

>> to

>>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

>> Found new range for _1: [0, 9223372036854775806]

>> marking stmt to be not simulated again

>>

>> Visiting statement:

>> _7 = s_4 - _1;

>> Intersecting

>>   ~[9223372036854775807, 9223372036854775809]

>> and

>>   ~[9223372036854775807, 9223372036854775809]

>> to

>>   ~[9223372036854775807, 9223372036854775809]

>> Found new range for _7: ~[9223372036854775807, 9223372036854775809]

>> marking stmt to be not simulated again

>>

>> __attribute__((noclone, noinline))

>> fn1 (char * p, char * q)

>> {

>>   size_t s;

>>   long unsigned int _1;

>>   size_t _7;

>>   long unsigned int _9;

>>

>>   <bb 2>:

>>   s_4 = strlen (q_3(D));

>>   _9 = s_4 + 1;

>>   __builtin_memcpy (p_5(D), q_3(D), _9);

>>   _1 = s_4;

>>   _7 = s_4 - _1;

>>   return _7;

>>

>> }

>>

>> Then forwprop4 turns

>> _1 = s_4

>> _7 = s_4 - _1

>> into

>> _7 = 0

>>

>> and we end up with:

>> _7 = 0

>> return _7

>> in optimized dump.

>>

>> Running ccp again after forwprop4 trivially solves the issue, however

>> I am not sure if we want to run ccp again ?

>>

>> The issue is probably with extract_range_from_ssa_name():

>> For _1 = s_4

>>

>> Before patch:

>> VR for s_4 is set to varying.

>> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.

>> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,

>> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using

>> match.pd pattern x - x -> 0).

>>

>> After patch:

>> VR for s_4 is set to [0, PTRDIFF_MAX - 1]

>> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]

>> so IIUC, we then lose the information that _1 is equal to s_4,

>

> We don't lose it, it's in its set of equivalencies.

Ah, I missed that, thanks. For some reason I had mis-conception that
equivalences stores
variables which have same value-ranges but are not necessarily equal.
>

>> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.

>> forwprop4 does that because it sees that s_4 and _1 are equivalent.

>> Does this sound correct ?

>

> Yes.  So the issue is really that vrp_visit_assignment_or_call calls

> gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when

> we do not have a singleton VR_RANGE does not fall back to looking

> at equivalences (there's not a good cheap way to do that currently because

> VRP doesn't keep a proper copy lattice but simply IORs equivalences

> from all equivalences).  In theory simply using the first set bit

> might work.  Thus sth like

>

> @@ -7057,6 +7030,12 @@ vrp_valueize (tree name)

>               || is_gimple_min_invariant (vr->min))

>           && vrp_operand_equal_p (vr->min, vr->max))

>         return vr->min;

> +      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))

> +       {

> +         unsigned num = bitmap_first_set_bit (vr->equiv);

> +         if (num < SSA_NAME_VERSION (name))

> +           return ssa_name (num);

> +       }

>      }

>    return name;

>  }

>

> might work with the idea of simply doing canonicalization to one of

> the equivalences.  But as we don't allow copies in the SSA def stmt

> (via vrp_valueize_1) I'm not sure that's good enough canonicalization.

IIUC, we record the equivalent variables in vr->equiv
but do not canonicalize to one of the equivalence like "copy-of value"
in copyprop ?
Using first set bit unfortunately doesn't help for the above case.

Sorry if this sounds silly, should we just run copyprop/ccp once again
after vrp2 to ensure that there are no copies left ?
However that might be quite expensive ?
Or make vrp track copies like copyprop using a separate copy-of lattice ?

Thanks,
Prathamesh
>

> Richard.

Richard Biener Nov. 22, 2016, 2:48 p.m. UTC | #5

On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:

> On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote:

> > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:

> >

> >> Hi,

> >> As suggested by Martin in PR78153 strlen's return value cannot exceed

> >> PTRDIFF_MAX.

> >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()

> >> in the attached patch.

> >>

> >> However it regressed strlenopt-3.c:

> >>

> >> Consider fn1() from strlenopt-3.c:

> >>

> >> __attribute__((noinline, noclone)) size_t

> >> fn1 (char *p, char *q)

> >> {

> >>   size_t s = strlen (q);

> >>   strcpy (p, q);

> >>   return s - strlen (p);

> >> }

> >>

> >> The optimized dump shows the following:

> >>

> >> __attribute__((noclone, noinline))

> >> fn1 (char * p, char * q)

> >> {

> >>   size_t s;

> >>   size_t _7;

> >>   long unsigned int _9;

> >>

> >>   <bb 2>:

> >>   s_4 = strlen (q_3(D));

> >>   _9 = s_4 + 1;

> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

> >>   _7 = 0;

> >>   return _7;

> >>

> >> }

> >>

> >> which introduces the regression, because the test expects "return 0;" in fn1().

> >>

> >> The issue seems to be in vrp2:

> >>

> >> Before the patch:

> >> Visiting statement:

> >> s_4 = strlen (q_3(D));

> >> Found new range for s_4: VARYING

> >>

> >> Visiting statement:

> >> _1 = s_4;

> >> Found new range for _1: [s_4, s_4]

> >> marking stmt to be not simulated again

> >>

> >> Visiting statement:

> >> _7 = s_4 - _1;

> >> Applying pattern match.pd:111, gimple-match.c:27997

> >> Match-and-simplified s_4 - _1 to 0

> >> Intersecting

> >>   [0, 0]

> >> and

> >>   [0, +INF]

> >> to

> >>   [0, 0]

> >> Found new range for _7: [0, 0]

> >>

> >> __attribute__((noclone, noinline))

> >> fn1 (char * p, char * q)

> >> {

> >>   size_t s;

> >>   long unsigned int _1;

> >>   long unsigned int _9;

> >>

> >>   <bb 2>:

> >>   s_4 = strlen (q_3(D));

> >>   _9 = s_4 + 1;

> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

> >>   _1 = s_4;

> >>   return 0;

> >>

> >> }

> >>

> >>

> >> After the patch:

> >> Visiting statement:

> >> s_4 = strlen (q_3(D));

> >> Intersecting

> >>   [0, 9223372036854775806]

> >> and

> >>   [0, 9223372036854775806]

> >> to

> >>   [0, 9223372036854775806]

> >> Found new range for s_4: [0, 9223372036854775806]

> >> marking stmt to be not simulated again

> >>

> >> Visiting statement:

> >> _1 = s_4;

> >> Intersecting

> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

> >> and

> >>   [0, 9223372036854775806]

> >> to

> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

> >> Found new range for _1: [0, 9223372036854775806]

> >> marking stmt to be not simulated again

> >>

> >> Visiting statement:

> >> _7 = s_4 - _1;

> >> Intersecting

> >>   ~[9223372036854775807, 9223372036854775809]

> >> and

> >>   ~[9223372036854775807, 9223372036854775809]

> >> to

> >>   ~[9223372036854775807, 9223372036854775809]

> >> Found new range for _7: ~[9223372036854775807, 9223372036854775809]

> >> marking stmt to be not simulated again

> >>

> >> __attribute__((noclone, noinline))

> >> fn1 (char * p, char * q)

> >> {

> >>   size_t s;

> >>   long unsigned int _1;

> >>   size_t _7;

> >>   long unsigned int _9;

> >>

> >>   <bb 2>:

> >>   s_4 = strlen (q_3(D));

> >>   _9 = s_4 + 1;

> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

> >>   _1 = s_4;

> >>   _7 = s_4 - _1;

> >>   return _7;

> >>

> >> }

> >>

> >> Then forwprop4 turns

> >> _1 = s_4

> >> _7 = s_4 - _1

> >> into

> >> _7 = 0

> >>

> >> and we end up with:

> >> _7 = 0

> >> return _7

> >> in optimized dump.

> >>

> >> Running ccp again after forwprop4 trivially solves the issue, however

> >> I am not sure if we want to run ccp again ?

> >>

> >> The issue is probably with extract_range_from_ssa_name():

> >> For _1 = s_4

> >>

> >> Before patch:

> >> VR for s_4 is set to varying.

> >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.

> >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,

> >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using

> >> match.pd pattern x - x -> 0).

> >>

> >> After patch:

> >> VR for s_4 is set to [0, PTRDIFF_MAX - 1]

> >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]

> >> so IIUC, we then lose the information that _1 is equal to s_4,

> >

> > We don't lose it, it's in its set of equivalencies.

> Ah, I missed that, thanks. For some reason I had mis-conception that

> equivalences stores

> variables which have same value-ranges but are not necessarily equal.

> >

> >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.

> >> forwprop4 does that because it sees that s_4 and _1 are equivalent.

> >> Does this sound correct ?

> >

> > Yes.  So the issue is really that vrp_visit_assignment_or_call calls

> > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when

> > we do not have a singleton VR_RANGE does not fall back to looking

> > at equivalences (there's not a good cheap way to do that currently because

> > VRP doesn't keep a proper copy lattice but simply IORs equivalences

> > from all equivalences).  In theory simply using the first set bit

> > might work.  Thus sth like

> >

> > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name)

> >               || is_gimple_min_invariant (vr->min))

> >           && vrp_operand_equal_p (vr->min, vr->max))

> >         return vr->min;

> > +      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))

> > +       {

> > +         unsigned num = bitmap_first_set_bit (vr->equiv);

> > +         if (num < SSA_NAME_VERSION (name))

> > +           return ssa_name (num);

> > +       }

> >      }

> >    return name;

> >  }

> >

> > might work with the idea of simply doing canonicalization to one of

> > the equivalences.  But as we don't allow copies in the SSA def stmt

> > (via vrp_valueize_1) I'm not sure that's good enough canonicalization.

> IIUC, we record the equivalent variables in vr->equiv

> but do not canonicalize to one of the equivalence like "copy-of value"

> in copyprop ?

> Using first set bit unfortunately doesn't help for the above case.

> 

> Sorry if this sounds silly, should we just run copyprop/ccp once again

> after vrp2 to ensure that there are no copies left ?


why?  forwprop also does copy and constant propagation.  For the 
regression simply adjust the pass dump you scan.

> However that might be quite expensive ?

> Or make vrp track copies like copyprop using a separate copy-of lattice ?


Ideally we'd unify the three SSA propagation passes into one.  We'd
have to have separate lattices for copy&constant and range&known-bits.

Richard.

> Thanks,

> Prathamesh

> >

> > Richard.

> 

> 


-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

Prathamesh Kulkarni Nov. 22, 2016, 3:21 p.m. UTC | #6

On 22 November 2016 at 20:18, Richard Biener <rguenther@suse.de> wrote:
> On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:

>

>> On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote:

>> > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:

>> >

>> >> Hi,

>> >> As suggested by Martin in PR78153 strlen's return value cannot exceed

>> >> PTRDIFF_MAX.

>> >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()

>> >> in the attached patch.

>> >>

>> >> However it regressed strlenopt-3.c:

>> >>

>> >> Consider fn1() from strlenopt-3.c:

>> >>

>> >> __attribute__((noinline, noclone)) size_t

>> >> fn1 (char *p, char *q)

>> >> {

>> >>   size_t s = strlen (q);

>> >>   strcpy (p, q);

>> >>   return s - strlen (p);

>> >> }

>> >>

>> >> The optimized dump shows the following:

>> >>

>> >> __attribute__((noclone, noinline))

>> >> fn1 (char * p, char * q)

>> >> {

>> >>   size_t s;

>> >>   size_t _7;

>> >>   long unsigned int _9;

>> >>

>> >>   <bb 2>:

>> >>   s_4 = strlen (q_3(D));

>> >>   _9 = s_4 + 1;

>> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

>> >>   _7 = 0;

>> >>   return _7;

>> >>

>> >> }

>> >>

>> >> which introduces the regression, because the test expects "return 0;" in fn1().

>> >>

>> >> The issue seems to be in vrp2:

>> >>

>> >> Before the patch:

>> >> Visiting statement:

>> >> s_4 = strlen (q_3(D));

>> >> Found new range for s_4: VARYING

>> >>

>> >> Visiting statement:

>> >> _1 = s_4;

>> >> Found new range for _1: [s_4, s_4]

>> >> marking stmt to be not simulated again

>> >>

>> >> Visiting statement:

>> >> _7 = s_4 - _1;

>> >> Applying pattern match.pd:111, gimple-match.c:27997

>> >> Match-and-simplified s_4 - _1 to 0

>> >> Intersecting

>> >>   [0, 0]

>> >> and

>> >>   [0, +INF]

>> >> to

>> >>   [0, 0]

>> >> Found new range for _7: [0, 0]

>> >>

>> >> __attribute__((noclone, noinline))

>> >> fn1 (char * p, char * q)

>> >> {

>> >>   size_t s;

>> >>   long unsigned int _1;

>> >>   long unsigned int _9;

>> >>

>> >>   <bb 2>:

>> >>   s_4 = strlen (q_3(D));

>> >>   _9 = s_4 + 1;

>> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

>> >>   _1 = s_4;

>> >>   return 0;

>> >>

>> >> }

>> >>

>> >>

>> >> After the patch:

>> >> Visiting statement:

>> >> s_4 = strlen (q_3(D));

>> >> Intersecting

>> >>   [0, 9223372036854775806]

>> >> and

>> >>   [0, 9223372036854775806]

>> >> to

>> >>   [0, 9223372036854775806]

>> >> Found new range for s_4: [0, 9223372036854775806]

>> >> marking stmt to be not simulated again

>> >>

>> >> Visiting statement:

>> >> _1 = s_4;

>> >> Intersecting

>> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

>> >> and

>> >>   [0, 9223372036854775806]

>> >> to

>> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

>> >> Found new range for _1: [0, 9223372036854775806]

>> >> marking stmt to be not simulated again

>> >>

>> >> Visiting statement:

>> >> _7 = s_4 - _1;

>> >> Intersecting

>> >>   ~[9223372036854775807, 9223372036854775809]

>> >> and

>> >>   ~[9223372036854775807, 9223372036854775809]

>> >> to

>> >>   ~[9223372036854775807, 9223372036854775809]

>> >> Found new range for _7: ~[9223372036854775807, 9223372036854775809]

>> >> marking stmt to be not simulated again

>> >>

>> >> __attribute__((noclone, noinline))

>> >> fn1 (char * p, char * q)

>> >> {

>> >>   size_t s;

>> >>   long unsigned int _1;

>> >>   size_t _7;

>> >>   long unsigned int _9;

>> >>

>> >>   <bb 2>:

>> >>   s_4 = strlen (q_3(D));

>> >>   _9 = s_4 + 1;

>> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

>> >>   _1 = s_4;

>> >>   _7 = s_4 - _1;

>> >>   return _7;

>> >>

>> >> }

>> >>

>> >> Then forwprop4 turns

>> >> _1 = s_4

>> >> _7 = s_4 - _1

>> >> into

>> >> _7 = 0

>> >>

>> >> and we end up with:

>> >> _7 = 0

>> >> return _7

>> >> in optimized dump.

>> >>

>> >> Running ccp again after forwprop4 trivially solves the issue, however

>> >> I am not sure if we want to run ccp again ?

>> >>

>> >> The issue is probably with extract_range_from_ssa_name():

>> >> For _1 = s_4

>> >>

>> >> Before patch:

>> >> VR for s_4 is set to varying.

>> >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.

>> >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,

>> >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using

>> >> match.pd pattern x - x -> 0).

>> >>

>> >> After patch:

>> >> VR for s_4 is set to [0, PTRDIFF_MAX - 1]

>> >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]

>> >> so IIUC, we then lose the information that _1 is equal to s_4,

>> >

>> > We don't lose it, it's in its set of equivalencies.

>> Ah, I missed that, thanks. For some reason I had mis-conception that

>> equivalences stores

>> variables which have same value-ranges but are not necessarily equal.

>> >

>> >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.

>> >> forwprop4 does that because it sees that s_4 and _1 are equivalent.

>> >> Does this sound correct ?

>> >

>> > Yes.  So the issue is really that vrp_visit_assignment_or_call calls

>> > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when

>> > we do not have a singleton VR_RANGE does not fall back to looking

>> > at equivalences (there's not a good cheap way to do that currently because

>> > VRP doesn't keep a proper copy lattice but simply IORs equivalences

>> > from all equivalences).  In theory simply using the first set bit

>> > might work.  Thus sth like

>> >

>> > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name)

>> >               || is_gimple_min_invariant (vr->min))

>> >           && vrp_operand_equal_p (vr->min, vr->max))

>> >         return vr->min;

>> > +      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))

>> > +       {

>> > +         unsigned num = bitmap_first_set_bit (vr->equiv);

>> > +         if (num < SSA_NAME_VERSION (name))

>> > +           return ssa_name (num);

>> > +       }

>> >      }

>> >    return name;

>> >  }

>> >

>> > might work with the idea of simply doing canonicalization to one of

>> > the equivalences.  But as we don't allow copies in the SSA def stmt

>> > (via vrp_valueize_1) I'm not sure that's good enough canonicalization.

>> IIUC, we record the equivalent variables in vr->equiv

>> but do not canonicalize to one of the equivalence like "copy-of value"

>> in copyprop ?

>> Using first set bit unfortunately doesn't help for the above case.

>>

>> Sorry if this sounds silly, should we just run copyprop/ccp once again

>> after vrp2 to ensure that there are no copies left ?

>

> why?  forwprop also does copy and constant propagation.  For the

> regression simply adjust the pass dump you scan.

Well, with the patch the redundant store to and load from _7 still remains
in optimized dump for fn1() in strlenopt-3.c:

__attribute__((noclone, noinline))
fn1 (char * p, char * q)
{
  size_t s;
  size_t _7;
  long unsigned int _9;

  <bb 2>:
  s_4 = strlen (q_3(D));
  _9 = s_4 + 1;
  __builtin_memcpy (p_5(D), q_3(D), _9);
  _7 = 0;
  return _7;

}

Running ccp again after forwprop4 would get rid of _7.
Without the patch we have return _0; in optimized dump.

Thanks,
Prathamesh
>

>> However that might be quite expensive ?

>> Or make vrp track copies like copyprop using a separate copy-of lattice ?

>

> Ideally we'd unify the three SSA propagation passes into one.  We'd

> have to have separate lattices for copy&constant and range&known-bits.

>

> Richard.

>

>> Thanks,

>> Prathamesh

>> >

>> > Richard.

>>

>>

>

> --

> Richard Biener <rguenther@suse.de>

> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

Richard Biener Nov. 22, 2016, 3:23 p.m. UTC | #7

On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:

> On 22 November 2016 at 20:18, Richard Biener <rguenther@suse.de> wrote:

> > On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote:

> >

> >> On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote:

> >> > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote:

> >> >

> >> >> Hi,

> >> >> As suggested by Martin in PR78153 strlen's return value cannot exceed

> >> >> PTRDIFF_MAX.

> >> >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic()

> >> >> in the attached patch.

> >> >>

> >> >> However it regressed strlenopt-3.c:

> >> >>

> >> >> Consider fn1() from strlenopt-3.c:

> >> >>

> >> >> __attribute__((noinline, noclone)) size_t

> >> >> fn1 (char *p, char *q)

> >> >> {

> >> >>   size_t s = strlen (q);

> >> >>   strcpy (p, q);

> >> >>   return s - strlen (p);

> >> >> }

> >> >>

> >> >> The optimized dump shows the following:

> >> >>

> >> >> __attribute__((noclone, noinline))

> >> >> fn1 (char * p, char * q)

> >> >> {

> >> >>   size_t s;

> >> >>   size_t _7;

> >> >>   long unsigned int _9;

> >> >>

> >> >>   <bb 2>:

> >> >>   s_4 = strlen (q_3(D));

> >> >>   _9 = s_4 + 1;

> >> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

> >> >>   _7 = 0;

> >> >>   return _7;

> >> >>

> >> >> }

> >> >>

> >> >> which introduces the regression, because the test expects "return 0;" in fn1().

> >> >>

> >> >> The issue seems to be in vrp2:

> >> >>

> >> >> Before the patch:

> >> >> Visiting statement:

> >> >> s_4 = strlen (q_3(D));

> >> >> Found new range for s_4: VARYING

> >> >>

> >> >> Visiting statement:

> >> >> _1 = s_4;

> >> >> Found new range for _1: [s_4, s_4]

> >> >> marking stmt to be not simulated again

> >> >>

> >> >> Visiting statement:

> >> >> _7 = s_4 - _1;

> >> >> Applying pattern match.pd:111, gimple-match.c:27997

> >> >> Match-and-simplified s_4 - _1 to 0

> >> >> Intersecting

> >> >>   [0, 0]

> >> >> and

> >> >>   [0, +INF]

> >> >> to

> >> >>   [0, 0]

> >> >> Found new range for _7: [0, 0]

> >> >>

> >> >> __attribute__((noclone, noinline))

> >> >> fn1 (char * p, char * q)

> >> >> {

> >> >>   size_t s;

> >> >>   long unsigned int _1;

> >> >>   long unsigned int _9;

> >> >>

> >> >>   <bb 2>:

> >> >>   s_4 = strlen (q_3(D));

> >> >>   _9 = s_4 + 1;

> >> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

> >> >>   _1 = s_4;

> >> >>   return 0;

> >> >>

> >> >> }

> >> >>

> >> >>

> >> >> After the patch:

> >> >> Visiting statement:

> >> >> s_4 = strlen (q_3(D));

> >> >> Intersecting

> >> >>   [0, 9223372036854775806]

> >> >> and

> >> >>   [0, 9223372036854775806]

> >> >> to

> >> >>   [0, 9223372036854775806]

> >> >> Found new range for s_4: [0, 9223372036854775806]

> >> >> marking stmt to be not simulated again

> >> >>

> >> >> Visiting statement:

> >> >> _1 = s_4;

> >> >> Intersecting

> >> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

> >> >> and

> >> >>   [0, 9223372036854775806]

> >> >> to

> >> >>   [0, 9223372036854775806]  EQUIVALENCES: { s_4 } (1 elements)

> >> >> Found new range for _1: [0, 9223372036854775806]

> >> >> marking stmt to be not simulated again

> >> >>

> >> >> Visiting statement:

> >> >> _7 = s_4 - _1;

> >> >> Intersecting

> >> >>   ~[9223372036854775807, 9223372036854775809]

> >> >> and

> >> >>   ~[9223372036854775807, 9223372036854775809]

> >> >> to

> >> >>   ~[9223372036854775807, 9223372036854775809]

> >> >> Found new range for _7: ~[9223372036854775807, 9223372036854775809]

> >> >> marking stmt to be not simulated again

> >> >>

> >> >> __attribute__((noclone, noinline))

> >> >> fn1 (char * p, char * q)

> >> >> {

> >> >>   size_t s;

> >> >>   long unsigned int _1;

> >> >>   size_t _7;

> >> >>   long unsigned int _9;

> >> >>

> >> >>   <bb 2>:

> >> >>   s_4 = strlen (q_3(D));

> >> >>   _9 = s_4 + 1;

> >> >>   __builtin_memcpy (p_5(D), q_3(D), _9);

> >> >>   _1 = s_4;

> >> >>   _7 = s_4 - _1;

> >> >>   return _7;

> >> >>

> >> >> }

> >> >>

> >> >> Then forwprop4 turns

> >> >> _1 = s_4

> >> >> _7 = s_4 - _1

> >> >> into

> >> >> _7 = 0

> >> >>

> >> >> and we end up with:

> >> >> _7 = 0

> >> >> return _7

> >> >> in optimized dump.

> >> >>

> >> >> Running ccp again after forwprop4 trivially solves the issue, however

> >> >> I am not sure if we want to run ccp again ?

> >> >>

> >> >> The issue is probably with extract_range_from_ssa_name():

> >> >> For _1 = s_4

> >> >>

> >> >> Before patch:

> >> >> VR for s_4 is set to varying.

> >> >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name.

> >> >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4,

> >> >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using

> >> >> match.pd pattern x - x -> 0).

> >> >>

> >> >> After patch:

> >> >> VR for s_4 is set to [0, PTRDIFF_MAX - 1]

> >> >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1]

> >> >> so IIUC, we then lose the information that _1 is equal to s_4,

> >> >

> >> > We don't lose it, it's in its set of equivalencies.

> >> Ah, I missed that, thanks. For some reason I had mis-conception that

> >> equivalences stores

> >> variables which have same value-ranges but are not necessarily equal.

> >> >

> >> >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0.

> >> >> forwprop4 does that because it sees that s_4 and _1 are equivalent.

> >> >> Does this sound correct ?

> >> >

> >> > Yes.  So the issue is really that vrp_visit_assignment_or_call calls

> >> > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when

> >> > we do not have a singleton VR_RANGE does not fall back to looking

> >> > at equivalences (there's not a good cheap way to do that currently because

> >> > VRP doesn't keep a proper copy lattice but simply IORs equivalences

> >> > from all equivalences).  In theory simply using the first set bit

> >> > might work.  Thus sth like

> >> >

> >> > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name)

> >> >               || is_gimple_min_invariant (vr->min))

> >> >           && vrp_operand_equal_p (vr->min, vr->max))

> >> >         return vr->min;

> >> > +      else if (vr->equiv && ! bitmap_empty_p (vr->equiv))

> >> > +       {

> >> > +         unsigned num = bitmap_first_set_bit (vr->equiv);

> >> > +         if (num < SSA_NAME_VERSION (name))

> >> > +           return ssa_name (num);

> >> > +       }

> >> >      }

> >> >    return name;

> >> >  }

> >> >

> >> > might work with the idea of simply doing canonicalization to one of

> >> > the equivalences.  But as we don't allow copies in the SSA def stmt

> >> > (via vrp_valueize_1) I'm not sure that's good enough canonicalization.

> >> IIUC, we record the equivalent variables in vr->equiv

> >> but do not canonicalize to one of the equivalence like "copy-of value"

> >> in copyprop ?

> >> Using first set bit unfortunately doesn't help for the above case.

> >>

> >> Sorry if this sounds silly, should we just run copyprop/ccp once again

> >> after vrp2 to ensure that there are no copies left ?

> >

> > why?  forwprop also does copy and constant propagation.  For the

> > regression simply adjust the pass dump you scan.

> Well, with the patch the redundant store to and load from _7 still remains

> in optimized dump for fn1() in strlenopt-3.c:

> 

> __attribute__((noclone, noinline))

> fn1 (char * p, char * q)

> {

>   size_t s;

>   size_t _7;

>   long unsigned int _9;

> 

>   <bb 2>:

>   s_4 = strlen (q_3(D));

>   _9 = s_4 + 1;

>   __builtin_memcpy (p_5(D), q_3(D), _9);

>   _7 = 0;

>   return _7;

> 

> }

> 

> Running ccp again after forwprop4 would get rid of _7.

> Without the patch we have return _0; in optimized dump.


Ah, but then that's a missing "folding" of the return.  It's not
a load/store anyway.

Richard.

> Thanks,

> Prathamesh

> >

> >> However that might be quite expensive ?

> >> Or make vrp track copies like copyprop using a separate copy-of lattice ?

> >

> > Ideally we'd unify the three SSA propagation passes into one.  We'd

> > have to have separate lattices for copy&constant and range&known-bits.

> >

> > Richard.

> >

> >> Thanks,

> >> Prathamesh

> >> >

> >> > Richard.

> >>

> >>

> >

> > --

> > Richard Biener <rguenther@suse.de>

> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

> 

> 


-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

Commit Message

Comments

Patch