diff mbox

Use straight-line sequence for signed overflow additive operation

Message ID 20524360.1RVvHARErM@polaris
State New
Headers show

Commit Message

Eric Botcazou Oct. 27, 2016, 3:26 p.m. UTC
Hi,

as suggested by Segher, this changes the generic signed-signed-signed case of 
expand_addsub_overflow to using a straight-line code sequence instead of a 
branchy one, the new sequence being also shorter.  On 32-bit PowerPC the code 
generated at -O2 for 32-bit addition and subtraction is:

        add 10,3,4
        eqv 3,3,4
        xor 9,10,4
        and. 8,9,3
        blt- 0, <overflow>

        subf 9,4,3
        xor 3,3,4
        eqv 4,9,4
        and. 10,3,4
        blt- 0, <overflow>

eqv is not(xor) so a typical RISC machine will have 1 more instruction (if it 
doesn't have and(not) like e.g. SPARC).  What's even more interesting is that 
the overflow overhead is constant, e.g. for 64-bit addition and subtraction:

        addc 6,4,6
        adde 10,3,5
        eqv 3,3,5
        xor 9,10,5
        and. 9,9,3
        blt- 0, <overflow>

        subfc 6,6,4
        subfe 10,5,3
        xor 3,3,5
        eqv 5,10,5
        and. 9,3,5
        blt- 0, <overflow>

so this will also help 32-bit x86, which doesn't implement 64-bit overflow 
operations in the MD file.  There is a fixlet for do_jump_by_parts_greater_rtx 
in the patch too, which forgets to invert the probability of the jump when it 
swaps the arms of the branch.

Tested on x86-64/Linux, PowerPC/Linux and PowerPC64/Linux, OK for mainline?


2016-10-27  Eric Botcazou  <ebotcazou@adacore.com>
            Segher Boessenkool  <segher@kernel.crashing.org>

	* dojump.c (do_jump_by_parts_greater_rtx): Invert probability when
	swapping the arms of the branch.
	* internal-fn.c (expand_addsub_overflow): Use a straight-line code
	sequence for the generic signed-signed-signed case.

-- 
Eric Botcazou

Comments

Bernd Schmidt Oct. 28, 2016, 10:49 a.m. UTC | #1
On 10/27/2016 05:26 PM, Eric Botcazou wrote:
> as suggested by Segher, this changes the generic signed-signed-signed case of

> expand_addsub_overflow to using a straight-line code sequence instead of a

> branchy one, the new sequence being also shorter.


> 	* dojump.c (do_jump_by_parts_greater_rtx): Invert probability when

> 	swapping the arms of the branch.

> 	* internal-fn.c (expand_addsub_overflow): Use a straight-line code

> 	sequence for the generic signed-signed-signed case.


Ok.


Bernd
diff mbox

Patch

Index: dojump.c
===================================================================
--- dojump.c	(revision 241611)
+++ dojump.c	(working copy)
@@ -703,6 +703,7 @@  do_jump_by_parts_greater_rtx (machine_mo
       if_false_label = drop_through_label;
       drop_through_if_true = false;
       drop_through_if_false = true;
+      prob = inv (prob);
     }
 
   /* Compare a word at a time, high order first.  */
Index: internal-fn.c
===================================================================
--- internal-fn.c	(revision 241611)
+++ internal-fn.c	(working copy)
@@ -847,56 +847,68 @@  expand_addsub_overflow (location_t loc,
 	delete_insns_since (last);
       }
 
-    rtx_code_label *sub_check = gen_label_rtx ();
-    int pos_neg = 3;
-
     /* Compute the operation.  On RTL level, the addition is always
        unsigned.  */
     res = expand_binop (mode, code == PLUS_EXPR ? add_optab : sub_optab,
 			op0, op1, NULL_RTX, false, OPTAB_LIB_WIDEN);
 
-    /* If we can prove one of the arguments (for MINUS_EXPR only
+    /* If we can prove that one of the arguments (for MINUS_EXPR only
        the second operand, as subtraction is not commutative) is always
        non-negative or always negative, we can do just one comparison
-       and conditional jump instead of 2 at runtime, 3 present in the
-       emitted code.  If one of the arguments is CONST_INT, all we
-       need is to make sure it is op1, then the first
-       do_compare_rtx_and_jump will be just folded.  Otherwise try
-       to use range info if available.  */
-    if (code == PLUS_EXPR && CONST_INT_P (op0))
-      std::swap (op0, op1);
-    else if (CONST_INT_P (op1))
-      ;
-    else if (code == PLUS_EXPR && TREE_CODE (arg0) == SSA_NAME)
+       and conditional jump.  */
+    int pos_neg = get_range_pos_neg (arg1);
+    if (code == PLUS_EXPR)
       {
-        pos_neg = get_range_pos_neg (arg0);
-        if (pos_neg != 3)
-	  std::swap (op0, op1);
+	int pos_neg0 = get_range_pos_neg (arg0);
+	if (pos_neg0 != 3 && pos_neg == 3)
+	  {
+	    std::swap (op0, op1);
+	    pos_neg = pos_neg0;
+	  }
       }
-    if (pos_neg == 3 && !CONST_INT_P (op1) && TREE_CODE (arg1) == SSA_NAME)
-      pos_neg = get_range_pos_neg (arg1);
-
-    /* If the op1 is negative, we have to use a different check.  */
-    if (pos_neg == 3)
-      do_compare_rtx_and_jump (op1, const0_rtx, LT, false, mode, NULL_RTX,
-			       NULL, sub_check, PROB_EVEN);
-
-    /* Compare the result of the operation with one of the operands.  */
-    if (pos_neg & 1)
-      do_compare_rtx_and_jump (res, op0, code == PLUS_EXPR ? GE : LE,
-			       false, mode, NULL_RTX, NULL, done_label,
-			       PROB_VERY_LIKELY);
 
-    /* If we get here, we have to print the error.  */
+    /* Addition overflows if and only if the two operands have the same sign,
+       and the result has the opposite sign.  Subtraction overflows if and
+       only if the two operands have opposite sign, and the subtrahend has
+       the same sign as the result.  Here 0 is counted as positive.  */
     if (pos_neg == 3)
       {
-	emit_jump (do_error);
-	emit_label (sub_check);
+	/* Compute op0 ^ op1 (operands have opposite sign).  */
+        rtx op_xor = expand_binop (mode, xor_optab, op0, op1, NULL_RTX, false,
+				   OPTAB_LIB_WIDEN);
+
+	/* Compute res ^ op1 (result and 2nd operand have opposite sign).  */
+	rtx res_xor = expand_binop (mode, xor_optab, res, op1, NULL_RTX, false,
+				    OPTAB_LIB_WIDEN);
+
+	rtx tem;
+	if (code == PLUS_EXPR)
+	  {
+	    /* Compute (res ^ op1) & ~(op0 ^ op1).  */
+	    tem = expand_unop (mode, one_cmpl_optab, op_xor, NULL_RTX, false);
+	    tem = expand_binop (mode, and_optab, res_xor, tem, NULL_RTX, false,
+				OPTAB_LIB_WIDEN);
+	  }
+	else
+	  {
+	    /* Compute (op0 ^ op1) & ~(res ^ op1).  */
+	    tem = expand_unop (mode, one_cmpl_optab, res_xor, NULL_RTX, false);
+	    tem = expand_binop (mode, and_optab, op_xor, tem, NULL_RTX, false,
+				OPTAB_LIB_WIDEN);
+	  }
+
+	/* No overflow if the result has bit sign cleared.  */
+	do_compare_rtx_and_jump (tem, const0_rtx, GE, false, mode, NULL_RTX,
+				 NULL, done_label, PROB_VERY_LIKELY);
       }
 
-    /* We have k = a + b for b < 0 here.  k <= a must hold.  */
-    if (pos_neg & 2)
-      do_compare_rtx_and_jump (res, op0, code == PLUS_EXPR ? LE : GE,
+    /* Compare the result of the operation with the first operand.
+       No overflow for addition if second operand is positive and result
+       is larger or second operand is negative and result is smaller.
+       Likewise for subtraction with sign of second operand flipped.  */
+    else
+      do_compare_rtx_and_jump (res, op0,
+			       (pos_neg == 1) ^ (code == MINUS_EXPR) ? GE : LE,
 			       false, mode, NULL_RTX, NULL, done_label,
 			       PROB_VERY_LIKELY);
   }