diff mbox

[2/2] Enable elimination of zext/sext

Message ID 53AA8501.809@linaro.org
State New
Headers show

Commit Message

Kugan Vivekanandarajah June 25, 2014, 8:14 a.m. UTC
On 24/06/14 22:21, Jakub Jelinek wrote:
> On Tue, Jun 24, 2014 at 09:53:35PM +1000, Kugan wrote:
>> 2014-06-24  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>
>> 	* gcc/calls.c (precompute_arguments: Check is_promoted_for_type
>> 	and set the promoted mode.
>> 	(is_promoted_for_type) : New function.
>> 	(expand_expr_real_1) : Check is_promoted_for_type
>> 	and set the promoted mode.
>> 	* gcc/expr.h (is_promoted_for_type) : New function definition.
>> 	* gcc/cfgexpand.c (expand_gimple_stmt_1) : Call emit_move_insn if
>> 	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.
> 
> Similarly to the other patch, no gcc/ prefix in ChangeLog, no space before
> :, watch for too long lines, remove useless ()s around conditions.

Changed it.

>> +bool
>> +is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
>> +{
>> +  wide_int type_min, type_max;
>> +  wide_int min, max, limit;
>> +  unsigned int prec;
>> +  tree lhs_type;
>> +  bool rhs_uns;
>> +
>> +  if (flag_wrapv
> 
> Why?
> 
>> +      || (flag_strict_overflow == false)
> 
> Why?  Also, that would be !flag_strict_overflow instead of
> (flag_strict_overflow == false)

For these flags, value ranges generated are not usable for extension
eliminations. Therefore, without this some of the test cases in
regression fails. For example:

short a;
void
foo (void)
{
  for (a = 0; a >= 0; a++)
    ;
}
-Os  -fno-strict-overflow produces the following range for the index
increment and hence goes into infinite loop.
_10: [1, 32768]
_10 = _4 + 1;

> 
>> +      || (ssa == NULL_TREE)
>> +      || (TREE_CODE (ssa) != SSA_NAME)
>> +      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa))
>> +      || POINTER_TYPE_P (TREE_TYPE (ssa)))
> 
> All pointer types are !INTEGRAL_TYPE_P, so the last condition
> doesn't make any sense.

I have changed this. Please see the attached patch.


Thanks,
Kugan

gcc/
2014-06-25  Kugan Vivekanandarajah  <kuganv@linaro.org>

	* calls.c (precompute_arguments): Check is_promoted_for_type
	and set the promoted mode.
	(is_promoted_for_type): New function.
	(expand_expr_real_1): Check is_promoted_for_type
	and set the promoted mode.
	* expr.h (is_promoted_for_type): New function definition.
	* cfgexpand.c (expand_gimple_stmt_1): Call emit_move_insn if
	SUBREG is promoted with SRP_SIGNED_AND_UNSIGNED.

Comments

Jakub Jelinek June 25, 2014, 8:36 a.m. UTC | #1
On Wed, Jun 25, 2014 at 06:14:57PM +1000, Kugan wrote:
> For these flags, value ranges generated are not usable for extension
> eliminations. Therefore, without this some of the test cases in
> regression fails. For example:
> 
> short a;
> void
> foo (void)
> {
>   for (a = 0; a >= 0; a++)
>     ;
> }
> -Os  -fno-strict-overflow produces the following range for the index
> increment and hence goes into infinite loop.
> _10: [1, 32768]
> _10 = _4 + 1;

For -fwrapv I don't see why you'd get into trouble ever, the VRP computation
should be well aware of the -fwrapv semantics and the value ranges should
reflect that.

For -fno-strict-overflow, I have no idea since it is very weirdly defined.

In any case, for your example above, the loop is always well defined,
because for char/short a++ is performed as:
a = (short) ((int) a + 1)
So, if the patch turns it into infinite loop, with -Os -fno-strict-overflow
or -Os, it is simply a problem with the patch.  VR [1, 32768] looks correct,
a++ is performed only if a is >= 0, therefore before addition [0, 32767].
But from VR [1, 32768] you can't optimize away the sign extension, make sure
you don't have there off-by-one?

It would be nice if the patch contained some testcases, it is easy
to construct testcases where you have arbitrary VRs on some SSA_NAMEs,
you just need something to stick the VR on, so you can do something like:
type foo (type a)
{
  if (a < VR_min + 1 || a > VR_max + 1) return; // If VR_min is type minimum or VR_max type maximum this needs to be adjusted of course.
  a = a + 1;
  // now you can try some cast that your optimization would try to optimize
  return a;
}
Or void bar (type a) { a = (a & mask) + bias; (or similarly) }
Make sure to cover the boundary cases, where VR minimum or maximum still
allow optimizing away zero and/or sign extensions, and another case where
they are +- 1 and already don't allow it.

	Jakub
diff mbox

Patch

diff --git a/gcc/calls.c b/gcc/calls.c
index a3e6faa..eac512f 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1484,7 +1484,10 @@  precompute_arguments (int num_actuals, struct arg_data *args)
 	      args[i].initial_value
 		= gen_lowpart_SUBREG (mode, args[i].value);
 	      SUBREG_PROMOTED_VAR_P (args[i].initial_value) = 1;
-	      SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
+	      if (is_promoted_for_type (args[i].tree_value, mode, !args[i].unsignedp))
+		SUBREG_PROMOTED_SET (args[i].initial_value, SRP_SIGNED_AND_UNSIGNED);
+	      else
+		SUBREG_PROMOTED_SET (args[i].initial_value, args[i].unsignedp);
 	    }
 	}
     }
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index e8cd87f..0540b4d 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3309,7 +3309,13 @@  expand_gimple_stmt_1 (gimple stmt)
 					  GET_MODE (target), temp, unsignedp);
 		  }
 
-		convert_move (SUBREG_REG (target), temp, unsignedp);
+		if ((SUBREG_PROMOTED_GET (target) == SRP_SIGNED_AND_UNSIGNED)
+		    && (GET_CODE (temp) == SUBREG)
+		    && (GET_MODE (target) == GET_MODE (temp))
+		    && (GET_MODE (SUBREG_REG (target)) == GET_MODE (SUBREG_REG (temp))))
+		  emit_move_insn (SUBREG_REG (target), SUBREG_REG (temp));
+		else
+		  convert_move (SUBREG_REG (target), temp, unsignedp);
 	      }
 	    else if (nontemporal && emit_storent_insn (target, temp))
 	      ;
diff --git a/gcc/expr.c b/gcc/expr.c
index f9103a5..15da092 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9210,6 +9210,59 @@  expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 }
 #undef REDUCE_BIT_FIELD
 
+/* Return TRUE if value in SSA is already zero/sign extended for lhs type
+   (type here is the combination of LHS_MODE and LHS_UNS) using value range
+   information stored.  Return FALSE otherwise.  */
+bool
+is_promoted_for_type (tree ssa, enum machine_mode lhs_mode, bool lhs_uns)
+{
+  wide_int type_min, type_max;
+  wide_int min, max, limit;
+  unsigned int prec;
+  tree lhs_type;
+  bool rhs_uns;
+
+  if (flag_wrapv
+      || !flag_strict_overflow
+      || ssa == NULL_TREE
+      || TREE_CODE (ssa) != SSA_NAME
+      || !INTEGRAL_TYPE_P (TREE_TYPE (ssa)))
+    return false;
+
+  /* Return FALSE if value_range is not recorded for SSA.  */
+  if (get_range_info (ssa, &min, &max) != VR_RANGE)
+    return false;
+
+  lhs_type = lang_hooks.types.type_for_mode (lhs_mode, lhs_uns);
+  rhs_uns = TYPE_UNSIGNED (TREE_TYPE (ssa));
+  prec = min.get_precision ();
+
+  /* Signed maximum value.  */
+  limit = wide_int::from (TYPE_MAX_VALUE (TREE_TYPE (ssa)), prec, SIGNED);
+
+  /* Signedness of LHS and RHS differs but values in range.  */
+  if ((rhs_uns != lhs_uns)
+      && ((!lhs_uns && !wi::neg_p (min, TYPE_SIGN (lhs_type)))
+	  || (lhs_uns && (wi::cmp (max, limit, TYPE_SIGN (TREE_TYPE (ssa))) == -1))))
+    lhs_uns = !lhs_uns;
+
+  /* Signedness of LHS and RHS should match.  */
+  if (rhs_uns != lhs_uns)
+    return false;
+
+  type_min = wide_int::from (TYPE_MIN_VALUE (lhs_type), prec,
+			     TYPE_SIGN (TREE_TYPE (ssa)));
+  type_max = wide_int::from (TYPE_MAX_VALUE (lhs_type), prec,
+			     TYPE_SIGN (TREE_TYPE (ssa)));
+
+  /* Check if values lies in-between the type range.  */
+  if ((wi::neg_p (max, TYPE_SIGN (TREE_TYPE (ssa)))
+       || (wi::cmp (max, type_max, TYPE_SIGN (TREE_TYPE (ssa))) != 1))
+      && (!wi::neg_p (min, TYPE_SIGN (TREE_TYPE (ssa)))
+	  || (wi::cmp (type_min, min, TYPE_SIGN (TREE_TYPE (ssa))) != 1)))
+    return true;
+  return false;
+}
 
 /* Return TRUE if expression STMT is suitable for replacement.  
    Never consider memory loads as replaceable, because those don't ever lead 
@@ -9513,7 +9566,10 @@  expand_expr_real_1 (tree exp, rtx target, enum machine_mode tmode,
 
 	  temp = gen_lowpart_SUBREG (mode, decl_rtl);
 	  SUBREG_PROMOTED_VAR_P (temp) = 1;
-	  SUBREG_PROMOTED_SET (temp, unsignedp);
+	  if (is_promoted_for_type (ssa_name, mode, !unsignedp))
+	    SUBREG_PROMOTED_SET (temp, SRP_SIGNED_AND_UNSIGNED);
+	  else
+	    SUBREG_PROMOTED_SET (temp, unsignedp);
 
 	  return temp;
 	}
diff --git a/gcc/expr.h b/gcc/expr.h
index 6a1d3ab..e99d000 100644
--- a/gcc/expr.h
+++ b/gcc/expr.h
@@ -440,6 +440,7 @@  extern rtx expand_expr_real_1 (tree, rtx, enum machine_mode,
 			       enum expand_modifier, rtx *, bool);
 extern rtx expand_expr_real_2 (sepops, rtx, enum machine_mode,
 			       enum expand_modifier);
+extern bool is_promoted_for_type (tree, enum machine_mode, bool);
 
 /* Generate code for computing expression EXP.
    An rtx for the computed value is returned.  The value is never null.