Cap niter_for_unrolled_loop to upper bound

Message ID 87k25x3yp7.fsf@linaro.org
State New

Commit Message

Richard Sandiford May 4, 2017, 6:47 a.m.
For the reasons explained in PR77536, niter_for_unrolled_loop assumes 5
iterations in the absence of profiling information, although it doesn't
increase beyond the estimate for the original loop.  This left a hole in
which the new estimate could be less than the old one but still greater
than the limit imposed by CEIL (nb_iterations_upper_bound, unroll factor).
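
To make the hole concrete, here is a minimal standalone sketch of the estimate, not the GCC code itself. The type toy_loop, the function toy_niter_for_unrolled_loop and the apply_cap flag are hypothetical names introduced only for illustration, and the exact shape of the no-profile heuristic is an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few pieces of loop state the estimate
   depends on; this is not GCC's struct loop.  */
struct toy_loop
{
  uint64_t est_niter;                  /* Estimated latch iterations before unrolling.  */
  bool profile_p;                      /* True if est_niter came from profile data.  */
  bool any_upper_bound;                /* True if the bound below is valid.  */
  uint64_t nb_iterations_upper_bound;  /* Maximum latch iterations before unrolling.  */
};

/* Simplified model of the estimate for the unrolled loop.  APPLY_CAP
   selects between the old behaviour (false) and the capped one (true).  */
static uint64_t
toy_niter_for_unrolled_loop (const struct toy_loop *loop, uint64_t factor,
                             bool apply_cap)
{
  assert (factor != 0);
  uint64_t new_est_niter = loop->est_niter / factor;

  /* Assumed form of the no-profile heuristic: pretend the unrolled loop
     rolls 5 times, but never more than the original estimate.  */
  if (!loop->profile_p && new_est_niter < 5)
    new_est_niter = loop->est_niter < 5 ? loop->est_niter : 5;

  /* The cap: never exceed the limit implied by the original upper bound.  */
  if (apply_cap && loop->any_upper_bound)
    {
      uint64_t bound = loop->nb_iterations_upper_bound / factor;
      if (bound < new_est_niter)
        new_est_niter = bound;
    }
  return new_est_niter;
}
```

In this model, a loop with an estimate of 10, no profile data, an upper bound of 7 and an unroll factor of 4 gets an estimate of 5 without the cap, which is below the old estimate of 10 but above the limit of 7 / 4 = 1. With the cap the estimate becomes 1.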

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
2017-05-04  Richard Sandiford  <richard.sandiford@linaro.org>

	* tree-ssa-loop-manip.c (niter_for_unrolled_loop): Add commentary
	to explain the use of truncating division.  Cap the number of
	iterations to the maximum given by nb_iterations_upper_bound,
	if defined.

gcc/testsuite/
	* gcc.dg/vect/vect-profile-1.c: New test.
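
The commentary mentioned in the ChangeLog says that the truncating division is equivalent to CEIL (n + 1, factor) - 1. The following small check illustrates why, under the assumption that CEIL is the usual round-up division for non-negative operands; the macro is defined locally here and the function names are hypothetical. Since CEIL (n + 1, factor) expands to (n + factor) / factor, which is n / factor + 1, subtracting 1 gives back n / factor.

```c
#include <assert.h>
#include <stdint.h>

/* Local round-up division for non-negative operands (an assumption about
   what CEIL means in the patch commentary).  */
#define CEIL(x, y) (((x) + (y) - 1) / (y))

/* Latch iterations after unrolling, written as a truncating division.  */
static uint64_t
latch_iters_trunc (uint64_t n, uint64_t factor)
{
  return n / factor;
}

/* The same quantity spelled out: "+ 1" converts latch iterations to loop
   iterations, CEIL divides them among the unrolled copies, and "- 1"
   converts back to latch iterations.  */
static uint64_t
latch_iters_ceil (uint64_t n, uint64_t factor)
{
  return CEIL (n + 1, factor) - 1;
}

/* Return 1 if both forms agree for all 0 <= n <= MAX_N and
   1 <= factor <= MAX_FACTOR, and 0 otherwise.  */
static int
forms_agree (uint64_t max_n, uint64_t max_factor)
{
  for (uint64_t factor = 1; factor <= max_factor; ++factor)
    for (uint64_t n = 0; n <= max_n; ++n)
      if (latch_iters_trunc (n, factor) != latch_iters_ceil (n, factor))
        return 0;
  return 1;
}
```

For example, an upper bound of 7 latch iterations (8 loop iterations) with a factor of 4 gives 8 / 4 = 2 unrolled loop iterations, which is 1 latch iteration and matches 7 / 4 = 1.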

Comments

Richard Biener May 4, 2017, 10:25 a.m. | #1
On Thu, May 4, 2017 at 8:47 AM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> For the reasons explained in PR77536, niter_for_unrolled_loop assumes 5
> iterations in the absence of profiling information, although it doesn't
> increase beyond the estimate for the original loop.  This left a hole in
> which the new estimate could be less than the old one but still greater
> than the limit imposed by CEIL (nb_iterations_upper_bound, unroll factor).
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?


Ok.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> 2017-05-04  Richard Sandiford  <richard.sandiford@linaro.org>
>
>         * tree-ssa-loop-manip.c (niter_for_unrolled_loop): Add commentary
>         to explain the use of truncating division.  Cap the number of
>         iterations to the maximum given by nb_iterations_upper_bound,
>         if defined.
>
> gcc/testsuite/
>         * gcc.dg/vect/vect-profile-1.c: New test.
>
> Index: gcc/tree-ssa-loop-manip.c
> ===================================================================
> --- gcc/tree-ssa-loop-manip.c   2017-05-03 08:46:26.068861808 +0100
> +++ gcc/tree-ssa-loop-manip.c   2017-05-04 07:41:56.686034705 +0100
> @@ -1104,6 +1104,9 @@ niter_for_unrolled_loop (struct loop *lo
>    gcc_assert (factor != 0);
>    bool profile_p = false;
>    gcov_type est_niter = expected_loop_iterations_unbounded (loop, &profile_p);
> +  /* Note that this is really CEIL (est_niter + 1, factor) - 1, where the
> +     "+ 1" converts latch iterations to loop iterations and the "- 1"
> +     converts back.  */
>    gcov_type new_est_niter = est_niter / factor;
>
>    /* Without profile feedback, loops for which we do not know a better estimate
> @@ -1120,6 +1123,15 @@ niter_for_unrolled_loop (struct loop *lo
>         new_est_niter = 5;
>      }
>
> +  if (loop->any_upper_bound)
> +    {
> +      /* As above, this is really CEIL (upper_bound + 1, factor) - 1.  */
> +      widest_int bound = wi::udiv_floor (loop->nb_iterations_upper_bound,
> +                                        factor);
> +      if (wi::ltu_p (bound, new_est_niter))
> +       new_est_niter = bound.to_uhwi ();
> +    }
> +
>    return new_est_niter;
>  }
>
> Index: gcc/testsuite/gcc.dg/vect/vect-profile-1.c
> ===================================================================
> --- /dev/null   2017-05-04 07:24:39.449302696 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-profile-1.c  2017-05-04 07:41:56.685075916 +0100
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-additional-options "-fdump-tree-vect-details-blocks" } */
> +
> +/* At least one of these should correspond to a full vector.  */
> +
> +void
> +f1 (int *x)
> +{
> +  for (int j = 0; j < 2; ++j)
> +    x[j] += 1;
> +}
> +
> +void
> +f2 (int *x)
> +{
> +  for (int j = 0; j < 4; ++j)
> +    x[j] += 1;
> +}
> +
> +void
> +f3 (int *x)
> +{
> +  for (int j = 0; j < 8; ++j)
> +    x[j] += 1;
> +}
> +
> +void
> +f4 (int *x)
> +{
> +  for (int j = 0; j < 16; ++j)
> +    x[j] += 1;
> +}
> +
> +/* { dg-final { scan-tree-dump {goto <bb [0-9]+>; \[0+.0*%\]} vect } } */

Patch

Index: gcc/tree-ssa-loop-manip.c
===================================================================
--- gcc/tree-ssa-loop-manip.c	2017-05-03 08:46:26.068861808 +0100
+++ gcc/tree-ssa-loop-manip.c	2017-05-04 07:41:56.686034705 +0100
@@ -1104,6 +1104,9 @@  niter_for_unrolled_loop (struct loop *lo
   gcc_assert (factor != 0);
   bool profile_p = false;
   gcov_type est_niter = expected_loop_iterations_unbounded (loop, &profile_p);
+  /* Note that this is really CEIL (est_niter + 1, factor) - 1, where the
+     "+ 1" converts latch iterations to loop iterations and the "- 1"
+     converts back.  */
   gcov_type new_est_niter = est_niter / factor;
 
   /* Without profile feedback, loops for which we do not know a better estimate
@@ -1120,6 +1123,15 @@  niter_for_unrolled_loop (struct loop *lo
 	new_est_niter = 5;
     }
 
+  if (loop->any_upper_bound)
+    {
+      /* As above, this is really CEIL (upper_bound + 1, factor) - 1.  */
+      widest_int bound = wi::udiv_floor (loop->nb_iterations_upper_bound,
+					 factor);
+      if (wi::ltu_p (bound, new_est_niter))
+	new_est_niter = bound.to_uhwi ();
+    }
+
   return new_est_niter;
 }
 
Index: gcc/testsuite/gcc.dg/vect/vect-profile-1.c
===================================================================
--- /dev/null	2017-05-04 07:24:39.449302696 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-profile-1.c	2017-05-04 07:41:56.685075916 +0100
@@ -0,0 +1,35 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-fdump-tree-vect-details-blocks" } */
+
+/* At least one of these should correspond to a full vector.  */
+
+void
+f1 (int *x)
+{
+  for (int j = 0; j < 2; ++j)
+    x[j] += 1;
+}
+
+void
+f2 (int *x)
+{
+  for (int j = 0; j < 4; ++j)
+    x[j] += 1;
+}
+
+void
+f3 (int *x)
+{
+  for (int j = 0; j < 8; ++j)
+    x[j] += 1;
+}
+
+void
+f4 (int *x)
+{
+  for (int j = 0; j < 16; ++j)
+    x[j] += 1;
+}
+
+/* { dg-final { scan-tree-dump {goto <bb [0-9]+>; \[0+.0*%\]} vect } } */