diff mbox

[ifcvt] Add a new parameter to limit if-conversion

Message ID 1450446761-5209-1-git-send-email-james.greenhalgh@arm.com
State New
Headers show

Commit Message

James Greenhalgh Dec. 18, 2015, 1:52 p.m. UTC
Hi,

PR68920 talks about undesirable if-conversion in the x86 back-end.
The if-conversion cost model simply uses BRANCH_COST (I want to revisit
this for GCC 7), but BRANCH_COST is overloaded in the compiler and reducing
it causes other optimisations to be disabled.

Consequently, to fix the PR we want something new that the target can set
to override BRANCH_COST and reduce the number of instructions that get
if-converted. This patch adds this mechanism through a param.

Bootstrapped on x86_64-none-linux-gnu and aarch64-none-linux-gnu.

OK for trunk?

Thanks,
James

---
gcc/

2015-12-17  James Greenhalgh  <james.greenhalgh@arm.com>

	PR rtl-optimization/68920
	* params.def (PARAM_MAX_RTL_IF_CONVERSION_INSNS): New.
	* ifcvt.c (noce_find_if_block): Limit by new param.
	* doc/invoke.texi (max-rtl-if-conversion-insns): Document it.

gcc/testsuite/

2015-12-17  James Greenhalgh  <james.greenhalgh@arm.com>

	PR rtl-optimization/68920
	* gcc.deg/ifcvt-5.c: New.

Comments

James Greenhalgh Dec. 18, 2015, 2:19 p.m. UTC | #1
On Fri, Dec 18, 2015 at 05:09:14PM +0300, Yuri Rumyantsev wrote:
> James,

> 

> We implemented slightly different patch - we restrict number of SET

> instructions for if-conversion through new parameter and add check in

> bb_ok_for_noce_convert_multiple_sets:

> 

> +  unsigned limit = MIN (ii->branch_cost,

> + (unsigned) PARAM_VALUE (PARAM_MAX_IF_CONV_SET_INSNS));

> ..

> -  if (count > ii->branch_cost)

> -    return FALSE;

> +  if (count > limit)

> +    return false;

> 

> Now we are testing it for different suites/chips but I saw that real

> benchmark for which we saw huge performance degradation gets speed-ip

> on 65% for -march=slm -m32.


Yes, that will work too. I put it where I did to give back-ends the choice
to turn off all if-conversion by setting the param to zero. Your fix is more
targeted to fixing just the one regression. I don't have a preference as
to which direction we take this. I saw a similar performance boost for your
testcase on my development box with my patch - so at least we now have two
candidate patches to fix the performance regression.

Thanks,
James

> 

> 2015-12-18 16:52 GMT+03:00 James Greenhalgh <james.greenhalgh@arm.com>:

> >

> > Hi,

> >

> > PR68920 talks about undesirable if-conversion in the x86 back-end.

> > The if-conversion cost model simply uses BRANCH_COST (I want to revisit

> > this for GCC 7), but BRANCH_COST is overloaded in the compiler and reducing

> > it causes other optimisations to be disabled.

> >

> > Consequently, to fix the PR we want something new that the target can set

> > to override BRANCH_COST and reduce the number of instructions that get

> > if-converted. This patch adds this mechanism through a param.

> >

> > Bootstrapped on x86_64-none-linux-gnu and aarch64-none-linux-gnu.

> >

> > OK for trunk?

> >

> > Thanks,

> > James

> >

> > ---

> > gcc/

> >

> > 2015-12-17  James Greenhalgh  <james.greenhalgh@arm.com>

> >

> >         PR rtl-optimization/68920

> >         * params.def (PARAM_MAX_RTL_IF_CONVERSION_INSNS): New.

> >         * ifcvt.c (noce_find_if_block): Limit by new param.

> >         * doc/invoke.texi (max-rtl-if-conversion-insns): Document it.

> >

> > gcc/testsuite/

> >

> > 2015-12-17  James Greenhalgh  <james.greenhalgh@arm.com>

> >

> >         PR rtl-optimization/68920

> >         * gcc.deg/ifcvt-5.c: New.

> >

>
diff mbox

Patch

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9b3e2fe..1f9b0fe 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10341,6 +10341,14 @@  In each case, the @var{value} is an integer.  The allowable choices for
 When branch is predicted to be taken with probability lower than this threshold
 (in percent), then it is considered well predictable. The default is 10.
 
+@item max-rtl-if-conversion-insns
+RTL if-conversion tries to remove conditional branches around a block and
+replace them with conditionally executed instructions.  This parameter
+gives the maximum number of instructions in a block which should be
+considered for if-conversion.  The default is 10, though the compiler will
+also use other heuristics to decide whether if-conversion is likely to be
+profitable.
+
 @item max-crossjump-edges
 The maximum number of incoming edges to consider for cross-jumping.
 The algorithm used by @option{-fcrossjumping} is @math{O(N^2)} in
diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 6164232..2674baa 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -32,6 +32,7 @@ 
 #include "regs.h"
 #include "emit-rtl.h"
 #include "recog.h"
+#include "params.h"
 
 #include "cfgrtl.h"
 #include "cfganal.h"
@@ -3944,6 +3945,9 @@  noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
   if_info.then_else_reversed = then_else_reversed;
   if_info.branch_cost = BRANCH_COST (optimize_bb_for_speed_p (test_bb),
 				     predictable_edge_p (then_edge));
+  if_info.branch_cost
+    = MIN (if_info.branch_cost,
+	   (unsigned int) PARAM_VALUE (PARAM_MAX_RTL_IF_CONVERSION_INSNS));
 
   /* Do the real work.  */
 
diff --git a/gcc/params.def b/gcc/params.def
index 41fd8a8..a0d9787 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -49,6 +49,15 @@  DEFPARAM (PARAM_PREDICTABLE_BRANCH_OUTCOME,
 	  "Maximal estimated outcome of branch considered predictable.",
 	  2, 0, 50)
 
+/* The maximum number of insns in a basic block to consider for
+   RTL if-conversion.  A target might use BRANCH_COST to further limit the
+   number of insns considered.  */
+DEFPARAM (PARAM_MAX_RTL_IF_CONVERSION_INSNS,
+	  "max-rtl-if-conversion-insns",
+	  "Maximum number of insns in a basic block to consider for RTL "
+	  "if-conversion.",
+	  10, 0, 10)
+
 DEFPARAM (PARAM_INLINE_MIN_SPEEDUP,
 	  "inline-min-speedup",
 	  "The minimal estimated speedup allowing inliner to ignore inline-insns-single and inline-isnsns-auto.",
diff --git a/gcc/testsuite/gcc.dg/ifcvt-5.c b/gcc/testsuite/gcc.dg/ifcvt-5.c
new file mode 100644
index 0000000..fc6a2ca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ifcvt-5.c
@@ -0,0 +1,19 @@ 
+/* Check that multi-insn if-conversion is not done if the override
+   parameter would not allow it.  */
+
+/* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-insns=1" } */
+int
+foo (int x, int y, int a)
+{
+  int i = x;
+  int j = y;
+  /* Try to make taking the branch likely.  */
+  __builtin_expect (x > y, 1);
+  if (x > y)
+    {
+      i = a;
+      j = i;
+    }
+  return i * j;
+}
+/* { dg-final { scan-rtl-dump "0 true changes made" "ce1" } } */