Message ID | 87k25x3yp7.fsf@linaro.org |
---|---|
State | New |
On Thu, May 4, 2017 at 8:47 AM, Richard Sandiford <richard.sandiford@linaro.org> wrote:
> For the reasons explained in PR77536, niter_for_unrolled_loop assumes 5
> iterations in the absence of profiling information, although it doesn't
> increase beyond the estimate for the original loop. This left a hole in
> which the new estimate could be less than the old one but still greater
> than the limit imposed by CEIL (nb_iterations_upper_bound, unroll factor).
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> 2017-05-04 Richard Sandiford <richard.sandiford@linaro.org>
>
>     * tree-ssa-loop-manip.c (niter_for_unrolled_loop): Add commentary
>     to explain the use of truncating division. Cap the number of
>     iterations to the maximum given by nb_iterations_upper_bound,
>     if defined.
>
> gcc/testsuite/
>     * gcc.dg/vect/vect-profile-1.c: New test.
>
> Index: gcc/tree-ssa-loop-manip.c
> ===================================================================
> --- gcc/tree-ssa-loop-manip.c 2017-05-03 08:46:26.068861808 +0100
> +++ gcc/tree-ssa-loop-manip.c 2017-05-04 07:41:56.686034705 +0100
> @@ -1104,6 +1104,9 @@ niter_for_unrolled_loop (struct loop *lo
>    gcc_assert (factor != 0);
>    bool profile_p = false;
>    gcov_type est_niter = expected_loop_iterations_unbounded (loop, &profile_p);
> +  /* Note that this is really CEIL (est_niter + 1, factor) - 1, where the
> +     "+ 1" converts latch iterations to loop iterations and the "- 1"
> +     converts back. */
>    gcov_type new_est_niter = est_niter / factor;
>
>    /* Without profile feedback, loops for which we do not know a better estimate
> @@ -1120,6 +1123,15 @@ niter_for_unrolled_loop (struct loop *lo
>          new_est_niter = 5;
>      }
>
> +  if (loop->any_upper_bound)
> +    {
> +      /* As above, this is really CEIL (upper_bound + 1, factor) - 1. */
> +      widest_int bound = wi::udiv_floor (loop->nb_iterations_upper_bound,
> +                                         factor);
> +      if (wi::ltu_p (bound, new_est_niter))
> +        new_est_niter = bound.to_uhwi ();
> +    }
> +
>    return new_est_niter;
>  }
>
> Index: gcc/testsuite/gcc.dg/vect/vect-profile-1.c
> ===================================================================
> --- /dev/null 2017-05-04 07:24:39.449302696 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-profile-1.c 2017-05-04 07:41:56.685075916 +0100
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-additional-options "-fdump-tree-vect-details-blocks" } */
> +
> +/* At least one of these should correspond to a full vector. */
> +
> +void
> +f1 (int *x)
> +{
> +  for (int j = 0; j < 2; ++j)
> +    x[j] += 1;
> +}
> +
> +void
> +f2 (int *x)
> +{
> +  for (int j = 0; j < 4; ++j)
> +    x[j] += 1;
> +}
> +
> +void
> +f3 (int *x)
> +{
> +  for (int j = 0; j < 8; ++j)
> +    x[j] += 1;
> +}
> +
> +void
> +f4 (int *x)
> +{
> +  for (int j = 0; j < 16; ++j)
> +    x[j] += 1;
> +}
> +
> +/* { dg-final { scan-tree-dump {goto <bb [0-9]+>; \[0+.0*%\]} vect } } */
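The new comment's claim that truncating division is "really CEIL (est_niter + 1, factor) - 1" can be checked with a few lines of standalone C. This is only an illustrative sketch, not part of the patch: the function names are hypothetical, and CEIL is defined locally as the usual round-up division for non-negative operands, which is assumed to be what the comment means.

```c
#include <assert.h>

/* Round-up division for non-negative operands (local definition, assumed
   to match the CEIL referred to in the patch comment).  */
#define CEIL(x, y) (((x) + (y) - 1) / (y))

/* Hypothetical helper: latch iterations after unrolling, written the long
   way.  The "+ 1" turns latch iterations into loop iterations, CEIL divides
   rounding up, and the "- 1" converts back to latch iterations.  */
long
latch_iters_ceil_form (long est_niter, long factor)
{
  return CEIL (est_niter + 1, factor) - 1;
}

/* Hypothetical helper: the same quantity via truncating division, which is
   the form the patched function actually uses.  */
long
latch_iters_trunc_form (long est_niter, long factor)
{
  return est_niter / factor;
}

/* Exhaustively compare the two forms over a small non-negative range.  */
void
check_identity (long max_factor, long max_niter)
{
  for (long factor = 1; factor <= max_factor; ++factor)
    for (long n = 0; n <= max_niter; ++n)
      assert (latch_iters_ceil_form (n, factor)
              == latch_iters_trunc_form (n, factor));
}
```

For example, 9 latch iterations with an unroll factor of 4 give 2 in both forms: 10 loop iterations round up to 3 unrolled iterations, which is 2 latch iterations, and 9 / 4 truncates to 2.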
Index: gcc/tree-ssa-loop-manip.c
===================================================================
--- gcc/tree-ssa-loop-manip.c 2017-05-03 08:46:26.068861808 +0100
+++ gcc/tree-ssa-loop-manip.c 2017-05-04 07:41:56.686034705 +0100
@@ -1104,6 +1104,9 @@ niter_for_unrolled_loop (struct loop *lo
   gcc_assert (factor != 0);
   bool profile_p = false;
   gcov_type est_niter = expected_loop_iterations_unbounded (loop, &profile_p);
+  /* Note that this is really CEIL (est_niter + 1, factor) - 1, where the
+     "+ 1" converts latch iterations to loop iterations and the "- 1"
+     converts back. */
   gcov_type new_est_niter = est_niter / factor;
 
   /* Without profile feedback, loops for which we do not know a better estimate
@@ -1120,6 +1123,15 @@ niter_for_unrolled_loop (struct loop *lo
         new_est_niter = 5;
     }
 
+  if (loop->any_upper_bound)
+    {
+      /* As above, this is really CEIL (upper_bound + 1, factor) - 1. */
+      widest_int bound = wi::udiv_floor (loop->nb_iterations_upper_bound,
+                                         factor);
+      if (wi::ltu_p (bound, new_est_niter))
+        new_est_niter = bound.to_uhwi ();
+    }
+
   return new_est_niter;
 }
 
Index: gcc/testsuite/gcc.dg/vect/vect-profile-1.c
===================================================================
--- /dev/null 2017-05-04 07:24:39.449302696 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-profile-1.c 2017-05-04 07:41:56.685075916 +0100
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-fdump-tree-vect-details-blocks" } */
+
+/* At least one of these should correspond to a full vector. */
+
+void
+f1 (int *x)
+{
+  for (int j = 0; j < 2; ++j)
+    x[j] += 1;
+}
+
+void
+f2 (int *x)
+{
+  for (int j = 0; j < 4; ++j)
+    x[j] += 1;
+}
+
+void
+f3 (int *x)
+{
+  for (int j = 0; j < 8; ++j)
+    x[j] += 1;
+}
+
+void
+f4 (int *x)
+{
+  for (int j = 0; j < 16; ++j)
+    x[j] += 1;
+}
+
+/* { dg-final { scan-tree-dump {goto <bb [0-9]+>; \[0+.0*%\]} vect } } */
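To see the hole the patch closes, the estimate logic can be modelled outside GCC. The following is a rough sketch under stated assumptions, not the real function: `struct loop_sketch` and `niter_for_unrolled_loop_sketch` are hypothetical stand-ins, plain `uint64_t` replaces `gcov_type` and `widest_int`, the original estimate is passed in rather than read from profile data, and the "at least 5, but no more than before unrolling" step is reconstructed from the description in the patch email because the diff shows only part of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct loop fields the patch reads.  */
struct loop_sketch
{
  bool any_upper_bound;
  uint64_t nb_iterations_upper_bound; /* In latch iterations.  */
};

/* Simplified model of the patched estimate.  EST_NITER is the estimate for
   the original loop and PROFILE_P says whether it came from profile data;
   both are parameters here purely for illustration.  */
uint64_t
niter_for_unrolled_loop_sketch (const struct loop_sketch *loop,
                                uint64_t est_niter, bool profile_p,
                                unsigned factor)
{
  assert (factor != 0);
  uint64_t new_est_niter = est_niter / factor;

  /* Without profile feedback, assume 5 iterations, but never more than
     the original estimate (reconstructed from the email's description).  */
  if (new_est_niter < 5 && !profile_p)
    new_est_niter = est_niter < 5 ? est_niter : 5;

  /* The new step: cap the estimate at the known upper bound divided by
     the unroll factor.  */
  if (loop->any_upper_bound)
    {
      uint64_t bound = loop->nb_iterations_upper_bound / factor;
      if (bound < new_est_niter)
        new_est_niter = bound;
    }

  return new_est_niter;
}
```

Take a loop like f3 in the new test, with 7 latch iterations, and assume a factor of 4 (the real factor depends on the target's vector width). Without the cap the sketch returns 5, which exceeds what the loop can execute; with `any_upper_bound` set and a bound of 7 it returns 1.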