Message ID | CAAgBjMm5bQhfPjM8Ff5DrNz+zhhJqq0eTj9_r2G5AEB=LPnHHw@mail.gmail.com |
---|---|
State | Superseded |
Headers | show |
On Sun, Nov 20, 2016 at 07:20:20PM +0530, Prathamesh Kulkarni wrote: > --- a/gcc/tree-vrp.c > +++ b/gcc/tree-vrp.c > @@ -4013,6 +4013,16 @@ extract_range_basic (value_range *vr, gimple *stmt) > : vrp_val_max (type), NULL); > } > return; > + case CFN_BUILT_IN_STRLEN: > + { > + tree type = TREE_TYPE (gimple_call_lhs (stmt)); > + unsigned HOST_WIDE_INT max = > + TREE_INT_CST_LOW (vrp_val_max (ptrdiff_type_node)) - 1; Wrong formatting, = should go on the next line, and should be indented only 2 columns more than the previous line. Plus TREE_INT_CST_LOW really shouldn't be used in new code. You should use tree_to_uhwi or tree_to_shwi instead. Why the -1? Can you just fold_convert (type, TYPE_MAX_VALUE (ptrdiff_type_node)); ? Or, if you really want the -1, e.g. wide_int max = vrp_val_max (ptrdiff_type_node); wide_int_to_tree (type, max - 1); or something similar. > + > + set_value_range (vr, VR_RANGE, build_int_cst (type, 0), > + build_int_cst (type, max), NULL); > + } > + return; > default: > break; > } Jakub
On 20 November 2016 at 19:34, Jakub Jelinek <jakub@redhat.com> wrote: > On Sun, Nov 20, 2016 at 07:20:20PM +0530, Prathamesh Kulkarni wrote: >> --- a/gcc/tree-vrp.c >> +++ b/gcc/tree-vrp.c >> @@ -4013,6 +4013,16 @@ extract_range_basic (value_range *vr, gimple *stmt) >> : vrp_val_max (type), NULL); >> } >> return; >> + case CFN_BUILT_IN_STRLEN: >> + { >> + tree type = TREE_TYPE (gimple_call_lhs (stmt)); >> + unsigned HOST_WIDE_INT max = >> + TREE_INT_CST_LOW (vrp_val_max (ptrdiff_type_node)) - 1; > > Wrong formatting, = should go on the next line, and should be indented only > 2 columns more than the previous line. Plus TREE_INT_CST_LOW really > shouldn't be used in new code. You should use tree_to_uhwi or tree_to_shwi > instead. Why the -1? Can you just > fold_convert (type, TYPE_MAX_VALUE (ptrdiff_type_node)); ? > Or, if you really want the -1, e.g. wide_int max = vrp_val_max (ptrdiff_type_node); > wide_int_to_tree (type, max - 1); > or something similar. Hi Jakub, Thanks for the suggestions. Sorry I wrote misleading info in the patch. As per PR, strlen's return value should always be less than PTRDIFF_MAX, so I am setting the range to [0, PTRDIFF_MAX - 1]. I will use wide_int in the next version of patch. Thanks, Prathamesh >> + >> + set_value_range (vr, VR_RANGE, build_int_cst (type, 0), >> + build_int_cst (type, max), NULL); >> + } >> + return; >> default: >> break; >> } > > Jakub
On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote: > Hi, > As suggested by Martin in PR78153 strlen's return value cannot exceed > PTRDIFF_MAX. > So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic() > in the attached patch. > > However it regressed strlenopt-3.c: > > Consider fn1() from strlenopt-3.c: > > __attribute__((noinline, noclone)) size_t > fn1 (char *p, char *q) > { > size_t s = strlen (q); > strcpy (p, q); > return s - strlen (p); > } > > The optimized dump shows the following: > > __attribute__((noclone, noinline)) > fn1 (char * p, char * q) > { > size_t s; > size_t _7; > long unsigned int _9; > > <bb 2>: > s_4 = strlen (q_3(D)); > _9 = s_4 + 1; > __builtin_memcpy (p_5(D), q_3(D), _9); > _7 = 0; > return _7; > > } > > which introduces the regression, because the test expects "return 0;" in fn1(). > > The issue seems to be in vrp2: > > Before the patch: > Visiting statement: > s_4 = strlen (q_3(D)); > Found new range for s_4: VARYING > > Visiting statement: > _1 = s_4; > Found new range for _1: [s_4, s_4] > marking stmt to be not simulated again > > Visiting statement: > _7 = s_4 - _1; > Applying pattern match.pd:111, gimple-match.c:27997 > Match-and-simplified s_4 - _1 to 0 > Intersecting > [0, 0] > and > [0, +INF] > to > [0, 0] > Found new range for _7: [0, 0] > > __attribute__((noclone, noinline)) > fn1 (char * p, char * q) > { > size_t s; > long unsigned int _1; > long unsigned int _9; > > <bb 2>: > s_4 = strlen (q_3(D)); > _9 = s_4 + 1; > __builtin_memcpy (p_5(D), q_3(D), _9); > _1 = s_4; > return 0; > > } > > > After the patch: > Visiting statement: > s_4 = strlen (q_3(D)); > Intersecting > [0, 9223372036854775806] > and > [0, 9223372036854775806] > to > [0, 9223372036854775806] > Found new range for s_4: [0, 9223372036854775806] > marking stmt to be not simulated again > > Visiting statement: > _1 = s_4; > Intersecting > [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) > and > [0, 9223372036854775806] > to > [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) > Found new range for _1: [0, 9223372036854775806] > marking stmt to be not simulated again > > Visiting statement: > _7 = s_4 - _1; > Intersecting > ~[9223372036854775807, 9223372036854775809] > and > ~[9223372036854775807, 9223372036854775809] > to > ~[9223372036854775807, 9223372036854775809] > Found new range for _7: ~[9223372036854775807, 9223372036854775809] > marking stmt to be not simulated again > > __attribute__((noclone, noinline)) > fn1 (char * p, char * q) > { > size_t s; > long unsigned int _1; > size_t _7; > long unsigned int _9; > > <bb 2>: > s_4 = strlen (q_3(D)); > _9 = s_4 + 1; > __builtin_memcpy (p_5(D), q_3(D), _9); > _1 = s_4; > _7 = s_4 - _1; > return _7; > > } > > Then forwprop4 turns > _1 = s_4 > _7 = s_4 - _1 > into > _7 = 0 > > and we end up with: > _7 = 0 > return _7 > in optimized dump. > > Running ccp again after forwprop4 trivially solves the issue, however > I am not sure if we want to run ccp again ? > > The issue is probably with extract_range_from_ssa_name(): > For _1 = s_4 > > Before patch: > VR for s_4 is set to varying. > So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name. > Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4, > and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using > match.pd pattern x - x -> 0). > > After patch: > VR for s_4 is set to [0, PTRDIFF_MAX - 1] > And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1] > so IIUC, we then lose the information that _1 is equal to s_4, We don't lose it, it's in its set of equivalencies. > and vrp doesn't transform _7 = s_4 - _1 to _7 = 0. > forwprop4 does that because it sees that s_4 and _1 are equivalent. > Does this sound correct ? Yes. So the issue is really that vrp_visit_assignment_or_call calls gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when we do not have a singleton VR_RANGE does not fall back to looking at equivalences (there's not a good cheap way to do that currently because VRP doesn't keep a proper copy lattice but simply IORs equivalences from all equivalences). In theory simply using the first set bit might work. Thus sth like @@ -7057,6 +7030,12 @@ vrp_valueize (tree name) || is_gimple_min_invariant (vr->min)) && vrp_operand_equal_p (vr->min, vr->max)) return vr->min; + else if (vr->equiv && ! bitmap_empty_p (vr->equiv)) + { + unsigned num = bitmap_first_set_bit (vr->equiv); + if (num < SSA_NAME_VERSION (name)) + return ssa_name (num); + } } return name; } might work with the idea of simply doing canonicalization to one of the equivalences. But as we don't allow copies in the SSA def stmt (via vrp_valueize_1) I'm not sure that's good enough canonicalization. Richard.
On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote: > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote: > >> Hi, >> As suggested by Martin in PR78153 strlen's return value cannot exceed >> PTRDIFF_MAX. >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic() >> in the attached patch. >> >> However it regressed strlenopt-3.c: >> >> Consider fn1() from strlenopt-3.c: >> >> __attribute__((noinline, noclone)) size_t >> fn1 (char *p, char *q) >> { >> size_t s = strlen (q); >> strcpy (p, q); >> return s - strlen (p); >> } >> >> The optimized dump shows the following: >> >> __attribute__((noclone, noinline)) >> fn1 (char * p, char * q) >> { >> size_t s; >> size_t _7; >> long unsigned int _9; >> >> <bb 2>: >> s_4 = strlen (q_3(D)); >> _9 = s_4 + 1; >> __builtin_memcpy (p_5(D), q_3(D), _9); >> _7 = 0; >> return _7; >> >> } >> >> which introduces the regression, because the test expects "return 0;" in fn1(). >> >> The issue seems to be in vrp2: >> >> Before the patch: >> Visiting statement: >> s_4 = strlen (q_3(D)); >> Found new range for s_4: VARYING >> >> Visiting statement: >> _1 = s_4; >> Found new range for _1: [s_4, s_4] >> marking stmt to be not simulated again >> >> Visiting statement: >> _7 = s_4 - _1; >> Applying pattern match.pd:111, gimple-match.c:27997 >> Match-and-simplified s_4 - _1 to 0 >> Intersecting >> [0, 0] >> and >> [0, +INF] >> to >> [0, 0] >> Found new range for _7: [0, 0] >> >> __attribute__((noclone, noinline)) >> fn1 (char * p, char * q) >> { >> size_t s; >> long unsigned int _1; >> long unsigned int _9; >> >> <bb 2>: >> s_4 = strlen (q_3(D)); >> _9 = s_4 + 1; >> __builtin_memcpy (p_5(D), q_3(D), _9); >> _1 = s_4; >> return 0; >> >> } >> >> >> After the patch: >> Visiting statement: >> s_4 = strlen (q_3(D)); >> Intersecting >> [0, 9223372036854775806] >> and >> [0, 9223372036854775806] >> to >> [0, 9223372036854775806] >> Found new range for s_4: [0, 9223372036854775806] >> marking stmt to be not simulated again >> >> Visiting statement: >> _1 = s_4; >> Intersecting >> [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) >> and >> [0, 9223372036854775806] >> to >> [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) >> Found new range for _1: [0, 9223372036854775806] >> marking stmt to be not simulated again >> >> Visiting statement: >> _7 = s_4 - _1; >> Intersecting >> ~[9223372036854775807, 9223372036854775809] >> and >> ~[9223372036854775807, 9223372036854775809] >> to >> ~[9223372036854775807, 9223372036854775809] >> Found new range for _7: ~[9223372036854775807, 9223372036854775809] >> marking stmt to be not simulated again >> >> __attribute__((noclone, noinline)) >> fn1 (char * p, char * q) >> { >> size_t s; >> long unsigned int _1; >> size_t _7; >> long unsigned int _9; >> >> <bb 2>: >> s_4 = strlen (q_3(D)); >> _9 = s_4 + 1; >> __builtin_memcpy (p_5(D), q_3(D), _9); >> _1 = s_4; >> _7 = s_4 - _1; >> return _7; >> >> } >> >> Then forwprop4 turns >> _1 = s_4 >> _7 = s_4 - _1 >> into >> _7 = 0 >> >> and we end up with: >> _7 = 0 >> return _7 >> in optimized dump. >> >> Running ccp again after forwprop4 trivially solves the issue, however >> I am not sure if we want to run ccp again ? >> >> The issue is probably with extract_range_from_ssa_name(): >> For _1 = s_4 >> >> Before patch: >> VR for s_4 is set to varying. >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name. >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4, >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using >> match.pd pattern x - x -> 0). >> >> After patch: >> VR for s_4 is set to [0, PTRDIFF_MAX - 1] >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1] >> so IIUC, we then lose the information that _1 is equal to s_4, > > We don't lose it, it's in its set of equivalencies. Ah, I missed that, thanks. For some reason I had mis-conception that equivalences stores variables which have same value-ranges but are not necessarily equal. > >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0. >> forwprop4 does that because it sees that s_4 and _1 are equivalent. >> Does this sound correct ? > > Yes. So the issue is really that vrp_visit_assignment_or_call calls > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when > we do not have a singleton VR_RANGE does not fall back to looking > at equivalences (there's not a good cheap way to do that currently because > VRP doesn't keep a proper copy lattice but simply IORs equivalences > from all equivalences). In theory simply using the first set bit > might work. Thus sth like > > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name) > || is_gimple_min_invariant (vr->min)) > && vrp_operand_equal_p (vr->min, vr->max)) > return vr->min; > + else if (vr->equiv && ! bitmap_empty_p (vr->equiv)) > + { > + unsigned num = bitmap_first_set_bit (vr->equiv); > + if (num < SSA_NAME_VERSION (name)) > + return ssa_name (num); > + } > } > return name; > } > > might work with the idea of simply doing canonicalization to one of > the equivalences. But as we don't allow copies in the SSA def stmt > (via vrp_valueize_1) I'm not sure that's good enough canonicalization. IIUC, we record the equivalent variables in vr->equiv but do not canonicalize to one of the equivalence like "copy-of value" in copyprop ? Using first set bit unfortunately doesn't help for the above case. Sorry if this sounds silly, should we just run copyprop/ccp once again after vrp2 to ensure that there are no copies left ? However that might be quite expensive ? Or make vrp track copies like copyprop using a separate copy-of lattice ? Thanks, Prathamesh > > Richard.
On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote: > On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote: > > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote: > > > >> Hi, > >> As suggested by Martin in PR78153 strlen's return value cannot exceed > >> PTRDIFF_MAX. > >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic() > >> in the attached patch. > >> > >> However it regressed strlenopt-3.c: > >> > >> Consider fn1() from strlenopt-3.c: > >> > >> __attribute__((noinline, noclone)) size_t > >> fn1 (char *p, char *q) > >> { > >> size_t s = strlen (q); > >> strcpy (p, q); > >> return s - strlen (p); > >> } > >> > >> The optimized dump shows the following: > >> > >> __attribute__((noclone, noinline)) > >> fn1 (char * p, char * q) > >> { > >> size_t s; > >> size_t _7; > >> long unsigned int _9; > >> > >> <bb 2>: > >> s_4 = strlen (q_3(D)); > >> _9 = s_4 + 1; > >> __builtin_memcpy (p_5(D), q_3(D), _9); > >> _7 = 0; > >> return _7; > >> > >> } > >> > >> which introduces the regression, because the test expects "return 0;" in fn1(). > >> > >> The issue seems to be in vrp2: > >> > >> Before the patch: > >> Visiting statement: > >> s_4 = strlen (q_3(D)); > >> Found new range for s_4: VARYING > >> > >> Visiting statement: > >> _1 = s_4; > >> Found new range for _1: [s_4, s_4] > >> marking stmt to be not simulated again > >> > >> Visiting statement: > >> _7 = s_4 - _1; > >> Applying pattern match.pd:111, gimple-match.c:27997 > >> Match-and-simplified s_4 - _1 to 0 > >> Intersecting > >> [0, 0] > >> and > >> [0, +INF] > >> to > >> [0, 0] > >> Found new range for _7: [0, 0] > >> > >> __attribute__((noclone, noinline)) > >> fn1 (char * p, char * q) > >> { > >> size_t s; > >> long unsigned int _1; > >> long unsigned int _9; > >> > >> <bb 2>: > >> s_4 = strlen (q_3(D)); > >> _9 = s_4 + 1; > >> __builtin_memcpy (p_5(D), q_3(D), _9); > >> _1 = s_4; > >> return 0; > >> > >> } > >> > >> > >> After the patch: > >> Visiting statement: > >> s_4 = strlen (q_3(D)); > >> Intersecting > >> [0, 9223372036854775806] > >> and > >> [0, 9223372036854775806] > >> to > >> [0, 9223372036854775806] > >> Found new range for s_4: [0, 9223372036854775806] > >> marking stmt to be not simulated again > >> > >> Visiting statement: > >> _1 = s_4; > >> Intersecting > >> [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) > >> and > >> [0, 9223372036854775806] > >> to > >> [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) > >> Found new range for _1: [0, 9223372036854775806] > >> marking stmt to be not simulated again > >> > >> Visiting statement: > >> _7 = s_4 - _1; > >> Intersecting > >> ~[9223372036854775807, 9223372036854775809] > >> and > >> ~[9223372036854775807, 9223372036854775809] > >> to > >> ~[9223372036854775807, 9223372036854775809] > >> Found new range for _7: ~[9223372036854775807, 9223372036854775809] > >> marking stmt to be not simulated again > >> > >> __attribute__((noclone, noinline)) > >> fn1 (char * p, char * q) > >> { > >> size_t s; > >> long unsigned int _1; > >> size_t _7; > >> long unsigned int _9; > >> > >> <bb 2>: > >> s_4 = strlen (q_3(D)); > >> _9 = s_4 + 1; > >> __builtin_memcpy (p_5(D), q_3(D), _9); > >> _1 = s_4; > >> _7 = s_4 - _1; > >> return _7; > >> > >> } > >> > >> Then forwprop4 turns > >> _1 = s_4 > >> _7 = s_4 - _1 > >> into > >> _7 = 0 > >> > >> and we end up with: > >> _7 = 0 > >> return _7 > >> in optimized dump. > >> > >> Running ccp again after forwprop4 trivially solves the issue, however > >> I am not sure if we want to run ccp again ? > >> > >> The issue is probably with extract_range_from_ssa_name(): > >> For _1 = s_4 > >> > >> Before patch: > >> VR for s_4 is set to varying. > >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name. > >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4, > >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using > >> match.pd pattern x - x -> 0). > >> > >> After patch: > >> VR for s_4 is set to [0, PTRDIFF_MAX - 1] > >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1] > >> so IIUC, we then lose the information that _1 is equal to s_4, > > > > We don't lose it, it's in its set of equivalencies. > Ah, I missed that, thanks. For some reason I had mis-conception that > equivalences stores > variables which have same value-ranges but are not necessarily equal. > > > >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0. > >> forwprop4 does that because it sees that s_4 and _1 are equivalent. > >> Does this sound correct ? > > > > Yes. So the issue is really that vrp_visit_assignment_or_call calls > > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when > > we do not have a singleton VR_RANGE does not fall back to looking > > at equivalences (there's not a good cheap way to do that currently because > > VRP doesn't keep a proper copy lattice but simply IORs equivalences > > from all equivalences). In theory simply using the first set bit > > might work. Thus sth like > > > > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name) > > || is_gimple_min_invariant (vr->min)) > > && vrp_operand_equal_p (vr->min, vr->max)) > > return vr->min; > > + else if (vr->equiv && ! bitmap_empty_p (vr->equiv)) > > + { > > + unsigned num = bitmap_first_set_bit (vr->equiv); > > + if (num < SSA_NAME_VERSION (name)) > > + return ssa_name (num); > > + } > > } > > return name; > > } > > > > might work with the idea of simply doing canonicalization to one of > > the equivalences. But as we don't allow copies in the SSA def stmt > > (via vrp_valueize_1) I'm not sure that's good enough canonicalization. > IIUC, we record the equivalent variables in vr->equiv > but do not canonicalize to one of the equivalence like "copy-of value" > in copyprop ? > Using first set bit unfortunately doesn't help for the above case. > > Sorry if this sounds silly, should we just run copyprop/ccp once again > after vrp2 to ensure that there are no copies left ? why? forwprop also does copy and constant propagation. For the regression simply adjust the pass dump you scan. > However that might be quite expensive ? > Or make vrp track copies like copyprop using a separate copy-of lattice ? Ideally we'd unify the three SSA propagation passes into one. We'd have to have separate lattices for copy&constant and range&known-bits. Richard. > Thanks, > Prathamesh > > > > Richard. > > -- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
On 22 November 2016 at 20:18, Richard Biener <rguenther@suse.de> wrote: > On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote: > >> On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote: >> > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote: >> > >> >> Hi, >> >> As suggested by Martin in PR78153 strlen's return value cannot exceed >> >> PTRDIFF_MAX. >> >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic() >> >> in the attached patch. >> >> >> >> However it regressed strlenopt-3.c: >> >> >> >> Consider fn1() from strlenopt-3.c: >> >> >> >> __attribute__((noinline, noclone)) size_t >> >> fn1 (char *p, char *q) >> >> { >> >> size_t s = strlen (q); >> >> strcpy (p, q); >> >> return s - strlen (p); >> >> } >> >> >> >> The optimized dump shows the following: >> >> >> >> __attribute__((noclone, noinline)) >> >> fn1 (char * p, char * q) >> >> { >> >> size_t s; >> >> size_t _7; >> >> long unsigned int _9; >> >> >> >> <bb 2>: >> >> s_4 = strlen (q_3(D)); >> >> _9 = s_4 + 1; >> >> __builtin_memcpy (p_5(D), q_3(D), _9); >> >> _7 = 0; >> >> return _7; >> >> >> >> } >> >> >> >> which introduces the regression, because the test expects "return 0;" in fn1(). >> >> >> >> The issue seems to be in vrp2: >> >> >> >> Before the patch: >> >> Visiting statement: >> >> s_4 = strlen (q_3(D)); >> >> Found new range for s_4: VARYING >> >> >> >> Visiting statement: >> >> _1 = s_4; >> >> Found new range for _1: [s_4, s_4] >> >> marking stmt to be not simulated again >> >> >> >> Visiting statement: >> >> _7 = s_4 - _1; >> >> Applying pattern match.pd:111, gimple-match.c:27997 >> >> Match-and-simplified s_4 - _1 to 0 >> >> Intersecting >> >> [0, 0] >> >> and >> >> [0, +INF] >> >> to >> >> [0, 0] >> >> Found new range for _7: [0, 0] >> >> >> >> __attribute__((noclone, noinline)) >> >> fn1 (char * p, char * q) >> >> { >> >> size_t s; >> >> long unsigned int _1; >> >> long unsigned int _9; >> >> >> >> <bb 2>: >> >> s_4 = strlen (q_3(D)); >> >> _9 = s_4 + 1; >> >> __builtin_memcpy (p_5(D), q_3(D), _9); >> >> _1 = s_4; >> >> return 0; >> >> >> >> } >> >> >> >> >> >> After the patch: >> >> Visiting statement: >> >> s_4 = strlen (q_3(D)); >> >> Intersecting >> >> [0, 9223372036854775806] >> >> and >> >> [0, 9223372036854775806] >> >> to >> >> [0, 9223372036854775806] >> >> Found new range for s_4: [0, 9223372036854775806] >> >> marking stmt to be not simulated again >> >> >> >> Visiting statement: >> >> _1 = s_4; >> >> Intersecting >> >> [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) >> >> and >> >> [0, 9223372036854775806] >> >> to >> >> [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) >> >> Found new range for _1: [0, 9223372036854775806] >> >> marking stmt to be not simulated again >> >> >> >> Visiting statement: >> >> _7 = s_4 - _1; >> >> Intersecting >> >> ~[9223372036854775807, 9223372036854775809] >> >> and >> >> ~[9223372036854775807, 9223372036854775809] >> >> to >> >> ~[9223372036854775807, 9223372036854775809] >> >> Found new range for _7: ~[9223372036854775807, 9223372036854775809] >> >> marking stmt to be not simulated again >> >> >> >> __attribute__((noclone, noinline)) >> >> fn1 (char * p, char * q) >> >> { >> >> size_t s; >> >> long unsigned int _1; >> >> size_t _7; >> >> long unsigned int _9; >> >> >> >> <bb 2>: >> >> s_4 = strlen (q_3(D)); >> >> _9 = s_4 + 1; >> >> __builtin_memcpy (p_5(D), q_3(D), _9); >> >> _1 = s_4; >> >> _7 = s_4 - _1; >> >> return _7; >> >> >> >> } >> >> >> >> Then forwprop4 turns >> >> _1 = s_4 >> >> _7 = s_4 - _1 >> >> into >> >> _7 = 0 >> >> >> >> and we end up with: >> >> _7 = 0 >> >> return _7 >> >> in optimized dump. >> >> >> >> Running ccp again after forwprop4 trivially solves the issue, however >> >> I am not sure if we want to run ccp again ? >> >> >> >> The issue is probably with extract_range_from_ssa_name(): >> >> For _1 = s_4 >> >> >> >> Before patch: >> >> VR for s_4 is set to varying. >> >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name. >> >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4, >> >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using >> >> match.pd pattern x - x -> 0). >> >> >> >> After patch: >> >> VR for s_4 is set to [0, PTRDIFF_MAX - 1] >> >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1] >> >> so IIUC, we then lose the information that _1 is equal to s_4, >> > >> > We don't lose it, it's in its set of equivalencies. >> Ah, I missed that, thanks. For some reason I had mis-conception that >> equivalences stores >> variables which have same value-ranges but are not necessarily equal. >> > >> >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0. >> >> forwprop4 does that because it sees that s_4 and _1 are equivalent. >> >> Does this sound correct ? >> > >> > Yes. So the issue is really that vrp_visit_assignment_or_call calls >> > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when >> > we do not have a singleton VR_RANGE does not fall back to looking >> > at equivalences (there's not a good cheap way to do that currently because >> > VRP doesn't keep a proper copy lattice but simply IORs equivalences >> > from all equivalences). In theory simply using the first set bit >> > might work. Thus sth like >> > >> > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name) >> > || is_gimple_min_invariant (vr->min)) >> > && vrp_operand_equal_p (vr->min, vr->max)) >> > return vr->min; >> > + else if (vr->equiv && ! bitmap_empty_p (vr->equiv)) >> > + { >> > + unsigned num = bitmap_first_set_bit (vr->equiv); >> > + if (num < SSA_NAME_VERSION (name)) >> > + return ssa_name (num); >> > + } >> > } >> > return name; >> > } >> > >> > might work with the idea of simply doing canonicalization to one of >> > the equivalences. But as we don't allow copies in the SSA def stmt >> > (via vrp_valueize_1) I'm not sure that's good enough canonicalization. >> IIUC, we record the equivalent variables in vr->equiv >> but do not canonicalize to one of the equivalence like "copy-of value" >> in copyprop ? >> Using first set bit unfortunately doesn't help for the above case. >> >> Sorry if this sounds silly, should we just run copyprop/ccp once again >> after vrp2 to ensure that there are no copies left ? > > why? forwprop also does copy and constant propagation. For the > regression simply adjust the pass dump you scan. Well, with the patch the redundant store to and load from _7 still remains in optimized dump for fn1() in strlenopt-3.c: __attribute__((noclone, noinline)) fn1 (char * p, char * q) { size_t s; size_t _7; long unsigned int _9; <bb 2>: s_4 = strlen (q_3(D)); _9 = s_4 + 1; __builtin_memcpy (p_5(D), q_3(D), _9); _7 = 0; return _7; } Running ccp again after forwprop4 would get rid of _7. Without the patch we have return _0; in optimized dump. Thanks, Prathamesh > >> However that might be quite expensive ? >> Or make vrp track copies like copyprop using a separate copy-of lattice ? > > Ideally we'd unify the three SSA propagation passes into one. We'd > have to have separate lattices for copy&constant and range&known-bits. > > Richard. > >> Thanks, >> Prathamesh >> > >> > Richard. >> >> > > -- > Richard Biener <rguenther@suse.de> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote: > On 22 November 2016 at 20:18, Richard Biener <rguenther@suse.de> wrote: > > On Tue, 22 Nov 2016, Prathamesh Kulkarni wrote: > > > >> On 21 November 2016 at 15:10, Richard Biener <rguenther@suse.de> wrote: > >> > On Sun, 20 Nov 2016, Prathamesh Kulkarni wrote: > >> > > >> >> Hi, > >> >> As suggested by Martin in PR78153 strlen's return value cannot exceed > >> >> PTRDIFF_MAX. > >> >> So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic() > >> >> in the attached patch. > >> >> > >> >> However it regressed strlenopt-3.c: > >> >> > >> >> Consider fn1() from strlenopt-3.c: > >> >> > >> >> __attribute__((noinline, noclone)) size_t > >> >> fn1 (char *p, char *q) > >> >> { > >> >> size_t s = strlen (q); > >> >> strcpy (p, q); > >> >> return s - strlen (p); > >> >> } > >> >> > >> >> The optimized dump shows the following: > >> >> > >> >> __attribute__((noclone, noinline)) > >> >> fn1 (char * p, char * q) > >> >> { > >> >> size_t s; > >> >> size_t _7; > >> >> long unsigned int _9; > >> >> > >> >> <bb 2>: > >> >> s_4 = strlen (q_3(D)); > >> >> _9 = s_4 + 1; > >> >> __builtin_memcpy (p_5(D), q_3(D), _9); > >> >> _7 = 0; > >> >> return _7; > >> >> > >> >> } > >> >> > >> >> which introduces the regression, because the test expects "return 0;" in fn1(). > >> >> > >> >> The issue seems to be in vrp2: > >> >> > >> >> Before the patch: > >> >> Visiting statement: > >> >> s_4 = strlen (q_3(D)); > >> >> Found new range for s_4: VARYING > >> >> > >> >> Visiting statement: > >> >> _1 = s_4; > >> >> Found new range for _1: [s_4, s_4] > >> >> marking stmt to be not simulated again > >> >> > >> >> Visiting statement: > >> >> _7 = s_4 - _1; > >> >> Applying pattern match.pd:111, gimple-match.c:27997 > >> >> Match-and-simplified s_4 - _1 to 0 > >> >> Intersecting > >> >> [0, 0] > >> >> and > >> >> [0, +INF] > >> >> to > >> >> [0, 0] > >> >> Found new range for _7: [0, 0] > >> >> > >> >> __attribute__((noclone, noinline)) > >> >> fn1 (char * p, char * q) > >> >> { > >> >> size_t s; > >> >> long unsigned int _1; > >> >> long unsigned int _9; > >> >> > >> >> <bb 2>: > >> >> s_4 = strlen (q_3(D)); > >> >> _9 = s_4 + 1; > >> >> __builtin_memcpy (p_5(D), q_3(D), _9); > >> >> _1 = s_4; > >> >> return 0; > >> >> > >> >> } > >> >> > >> >> > >> >> After the patch: > >> >> Visiting statement: > >> >> s_4 = strlen (q_3(D)); > >> >> Intersecting > >> >> [0, 9223372036854775806] > >> >> and > >> >> [0, 9223372036854775806] > >> >> to > >> >> [0, 9223372036854775806] > >> >> Found new range for s_4: [0, 9223372036854775806] > >> >> marking stmt to be not simulated again > >> >> > >> >> Visiting statement: > >> >> _1 = s_4; > >> >> Intersecting > >> >> [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) > >> >> and > >> >> [0, 9223372036854775806] > >> >> to > >> >> [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) > >> >> Found new range for _1: [0, 9223372036854775806] > >> >> marking stmt to be not simulated again > >> >> > >> >> Visiting statement: > >> >> _7 = s_4 - _1; > >> >> Intersecting > >> >> ~[9223372036854775807, 9223372036854775809] > >> >> and > >> >> ~[9223372036854775807, 9223372036854775809] > >> >> to > >> >> ~[9223372036854775807, 9223372036854775809] > >> >> Found new range for _7: ~[9223372036854775807, 9223372036854775809] > >> >> marking stmt to be not simulated again > >> >> > >> >> __attribute__((noclone, noinline)) > >> >> fn1 (char * p, char * q) > >> >> { > >> >> size_t s; > >> >> long unsigned int _1; > >> >> size_t _7; > >> >> long unsigned int _9; > >> >> > >> >> <bb 2>: > >> >> s_4 = strlen (q_3(D)); > >> >> _9 = s_4 + 1; > >> >> __builtin_memcpy (p_5(D), q_3(D), _9); > >> >> _1 = s_4; > >> >> _7 = s_4 - _1; > >> >> return _7; > >> >> > >> >> } > >> >> > >> >> Then forwprop4 turns > >> >> _1 = s_4 > >> >> _7 = s_4 - _1 > >> >> into > >> >> _7 = 0 > >> >> > >> >> and we end up with: > >> >> _7 = 0 > >> >> return _7 > >> >> in optimized dump. > >> >> > >> >> Running ccp again after forwprop4 trivially solves the issue, however > >> >> I am not sure if we want to run ccp again ? > >> >> > >> >> The issue is probably with extract_range_from_ssa_name(): > >> >> For _1 = s_4 > >> >> > >> >> Before patch: > >> >> VR for s_4 is set to varying. > >> >> So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name. > >> >> Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4, > >> >> and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using > >> >> match.pd pattern x - x -> 0). > >> >> > >> >> After patch: > >> >> VR for s_4 is set to [0, PTRDIFF_MAX - 1] > >> >> And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1] > >> >> so IIUC, we then lose the information that _1 is equal to s_4, > >> > > >> > We don't lose it, it's in its set of equivalencies. > >> Ah, I missed that, thanks. For some reason I had mis-conception that > >> equivalences stores > >> variables which have same value-ranges but are not necessarily equal. > >> > > >> >> and vrp doesn't transform _7 = s_4 - _1 to _7 = 0. > >> >> forwprop4 does that because it sees that s_4 and _1 are equivalent. > >> >> Does this sound correct ? > >> > > >> > Yes. So the issue is really that vrp_visit_assignment_or_call calls > >> > gimple_fold_stmt_to_constant_1 with vrp_valueize[_1] which when > >> > we do not have a singleton VR_RANGE does not fall back to looking > >> > at equivalences (there's not a good cheap way to do that currently because > >> > VRP doesn't keep a proper copy lattice but simply IORs equivalences > >> > from all equivalences). In theory simply using the first set bit > >> > might work. Thus sth like > >> > > >> > @@ -7057,6 +7030,12 @@ vrp_valueize (tree name) > >> > || is_gimple_min_invariant (vr->min)) > >> > && vrp_operand_equal_p (vr->min, vr->max)) > >> > return vr->min; > >> > + else if (vr->equiv && ! bitmap_empty_p (vr->equiv)) > >> > + { > >> > + unsigned num = bitmap_first_set_bit (vr->equiv); > >> > + if (num < SSA_NAME_VERSION (name)) > >> > + return ssa_name (num); > >> > + } > >> > } > >> > return name; > >> > } > >> > > >> > might work with the idea of simply doing canonicalization to one of > >> > the equivalences. But as we don't allow copies in the SSA def stmt > >> > (via vrp_valueize_1) I'm not sure that's good enough canonicalization. > >> IIUC, we record the equivalent variables in vr->equiv > >> but do not canonicalize to one of the equivalence like "copy-of value" > >> in copyprop ? > >> Using first set bit unfortunately doesn't help for the above case. > >> > >> Sorry if this sounds silly, should we just run copyprop/ccp once again > >> after vrp2 to ensure that there are no copies left ? > > > > why? forwprop also does copy and constant propagation. For the > > regression simply adjust the pass dump you scan. > Well, with the patch the redundant store to and load from _7 still remains > in optimized dump for fn1() in strlenopt-3.c: > > __attribute__((noclone, noinline)) > fn1 (char * p, char * q) > { > size_t s; > size_t _7; > long unsigned int _9; > > <bb 2>: > s_4 = strlen (q_3(D)); > _9 = s_4 + 1; > __builtin_memcpy (p_5(D), q_3(D), _9); > _7 = 0; > return _7; > > } > > Running ccp again after forwprop4 would get rid of _7. > Without the patch we have return _0; in optimized dump. Ah, but then that's a missing "folding" of the return. It's not a load/store anyway. Richard. > Thanks, > Prathamesh > > > >> However that might be quite expensive ? > >> Or make vrp track copies like copyprop using a separate copy-of lattice ? > > > > Ideally we'd unify the three SSA propagation passes into one. We'd > > have to have separate lattices for copy&constant and range&known-bits. > > > > Richard. > > > >> Thanks, > >> Prathamesh > >> > > >> > Richard. > >> > >> > > > > -- > > Richard Biener <rguenther@suse.de> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg) > > -- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr78153-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr78153-1.c new file mode 100644 index 0000000..2530ba0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr78153-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-evrp-slim" } */ + +void f(const char *s) +{ + if (__PTRDIFF_MAX__ <= __builtin_strlen (s)) + __builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not "__builtin_abort" "evrp" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr78153-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr78153-2.c new file mode 100644 index 0000000..de70450 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr78153-2.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-evrp-slim" } */ + +void f(const char *s) +{ + __PTRDIFF_TYPE__ n = __builtin_strlen (s); + if (n < 0) + __builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not "__builtin_abort" "evrp" } } */ diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index c2a4133..d17b413 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -4013,6 +4013,16 @@ extract_range_basic (value_range *vr, gimple *stmt) : vrp_val_max (type), NULL); } return; + case CFN_BUILT_IN_STRLEN: + { + tree type = TREE_TYPE (gimple_call_lhs (stmt)); + unsigned HOST_WIDE_INT max = + TREE_INT_CST_LOW (vrp_val_max (ptrdiff_type_node)) - 1; + + set_value_range (vr, VR_RANGE, build_int_cst (type, 0), + build_int_cst (type, max), NULL); + } + return; default: break; }