diff mbox

[2/2,expand] make expand_builtin_strncmp more general

Message ID 1478039355.9676.68.camel@linux.vnet.ibm.com
State New
Headers show

Commit Message

Aaron Sawdey Nov. 1, 2016, 10:29 p.m. UTC
This patch adds code to expand_builtin_strncmp so it also attempts
expansion via cmpstrnsi in the case where c_strlen() returns NULL for
both string arguments, meaning that neither one is a constant.

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Comments

Aaron Sawdey Nov. 4, 2016, 4:27 p.m. UTC | #1
ChangeLog for this patch:

2016-11-03  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>

	* builtins.c (expand_builtin_strncmp): Attempt expansion of strncmp
	via cmpstrnsi even if neither string is constant.

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Richard Biener Nov. 8, 2016, 12:36 p.m. UTC | #2
On Tue, Nov 1, 2016 at 11:29 PM, Aaron Sawdey
<acsawdey@linux.vnet.ibm.com> wrote:
> This patch adds code to expand_builtin_strncmp so it also attempts

> expansion via cmpstrnsi in the case where c_strlen() returns NULL for

> both string arguments, meaning that neither one is a constant.


@@ -3929,61 +3929,85 @@
     unsigned int arg1_align = get_pointer_alignment (arg1) / BITS_PER_UNIT;
     unsigned int arg2_align = get_pointer_alignment (arg2) / BITS_PER_UNIT;

+    /* If we don't have POINTER_TYPE, call the function.  */
+    if (arg1_align == 0 || arg2_align == 0)
+      return NULL_RTX;
+

hm?  we cann validate_arglist at the beginning...

+    if (!len1 && !len2)
+      {
+       /* If neither arg1 nor arg2 are constant strings.  */
+       /* Stabilize the arguments in case gen_cmpstrnsi fails.  */
+       arg1 = builtin_save_expr (arg1);
+       arg2 = builtin_save_expr (arg2);

we no longer need the stabilization dance since we expand from GIMPLE.

+       /* Use save_expr here because cmpstrnsi may modify arg3
+          and builtin_save_expr() doesn't force the save to happen.  */
+       len = save_expr (arg3);
+       len = fold_convert_loc (loc, sizetype, len);

cmpstrnsi may certainly not modify trees in-place.  If it does it
needs to be fixed.

+       /* If both arguments have side effects, we cannot optimize.  */
+       if (TREE_SIDE_EFFECTS (len))
+         return NULL_RTX;

this can't happen anymore

btw, due to the re-indention a context diff would be _much_ easier to review.

So the real "meat" of your change is

    /* If both arguments have side effects, we cannot optimize.  */
-    if (!len || TREE_SIDE_EFFECTS (len))
+   if (TREE_SIDE_EFFECTS (len))
      return NULL_RTX;

and falling back to arg3 as len if !len1 && !len2.

Plus

+    /* Set MEM_SIZE as appropriate.  This will only happen in
+       the case where incoming arg3 is constant and arg1/arg2 are not.  */
+    if (CONST_INT_P (arg3_rtx))
+      {
+       set_mem_size (arg1_rtx, INTVAL (arg3_rtx));
+       set_mem_size (arg2_rtx, INTVAL (arg3_rtx));
+      }

where I don't really see why you need it or how it is even correct (arg1 might
terminate with a '\0' before arg3).

It would  be nice to simplify the patch to simply do

   if (!len1 && !len2)
     len = arg3;
   else if (!len1)
...

Richard.

> --

> Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com

> 050-2/C113  (507) 253-7520 home: 507/263-0782

> IBM Linux Technology Center - PPC Toolchain
Aaron Sawdey Nov. 8, 2016, 1:04 p.m. UTC | #3
Richard,
  Thanks for the review ... comments below.

On Tue, 2016-11-08 at 13:36 +0100, Richard Biener wrote:
> On Tue, Nov 1, 2016 at 11:29 PM, Aaron Sawdey

> <acsawdey@linux.vnet.ibm.com> wrote:

> > 

> > This patch adds code to expand_builtin_strncmp so it also attempts

> > expansion via cmpstrnsi in the case where c_strlen() returns NULL

> > for

> > both string arguments, meaning that neither one is a constant.

> 

> @@ -3929,61 +3929,85 @@

>      unsigned int arg1_align = get_pointer_alignment (arg1) /

> BITS_PER_UNIT;

>      unsigned int arg2_align = get_pointer_alignment (arg2) /

> BITS_PER_UNIT;

> 

> +    /* If we don't have POINTER_TYPE, call the function.  */

> +    if (arg1_align == 0 || arg2_align == 0)

> +      return NULL_RTX;

> +

> 

> hm?  we cann validate_arglist at the beginning...

> 

> +    if (!len1 && !len2)

> +      {

> +       /* If neither arg1 nor arg2 are constant strings.  */

> +       /* Stabilize the arguments in case gen_cmpstrnsi fails.  */

> +       arg1 = builtin_save_expr (arg1);

> +       arg2 = builtin_save_expr (arg2);

> 

> we no longer need the stabilization dance since we expand from

> GIMPLE.

> 

> +       /* Use save_expr here because cmpstrnsi may modify arg3

> +          and builtin_save_expr() doesn't force the save to

> happen.  */

> +       len = save_expr (arg3);

> +       len = fold_convert_loc (loc, sizetype, len);

> 

> cmpstrnsi may certainly not modify trees in-place.  If it does it

> needs to be fixed.


I needed this to get past a bootstrap failure on i386 where the
cmpstrnsi pattern was modifying cx which also contained a length used
elsewhere. The pattern does this:

  count = operands[3];
  countreg = ix86_zero_extend_to_Pmode (count);

but does not clobber operand 3 even though it uses it as the count for
the repz cmpsb. I wonder if this is the source of that problem?

> 

> +       /* If both arguments have side effects, we cannot

> optimize.  */

> +       if (TREE_SIDE_EFFECTS (len))

> +         return NULL_RTX;

> 

> this can't happen anymore

> 

> btw, due to the re-indention a context diff would be _much_ easier to

> review.


I assume you mean a diff that ignores whitespace -- I'll make sure to
do that.

> 

> So the real "meat" of your change is

> 

>     /* If both arguments have side effects, we cannot optimize.  */

> -    if (!len || TREE_SIDE_EFFECTS (len))

> +   if (TREE_SIDE_EFFECTS (len))

>       return NULL_RTX;

> 

> and falling back to arg3 as len if !len1 && !len2.

> 

> Plus

> 

> +    /* Set MEM_SIZE as appropriate.  This will only happen in

> +       the case where incoming arg3 is constant and arg1/arg2 are

> not.  */

> +    if (CONST_INT_P (arg3_rtx))

> +      {

> +       set_mem_size (arg1_rtx, INTVAL (arg3_rtx));

> +       set_mem_size (arg2_rtx, INTVAL (arg3_rtx));

> +      }

> 

> where I don't really see why you need it or how it is even correct

> (arg1 might

> terminate with a '\0' before arg3).

> 

> It would  be nice to simplify the patch to simply do

> 

>    if (!len1 && !len2)

>      len = arg3;

>    else if (!len1)

> ...


I'll see if I can simplify it to do just this and also remove the
unnecessary stuff you've flagged.

Thanks,
    Aaron

-- 
Aaron Sawdey, Ph.D.  acsawdey@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
diff mbox

Patch

Index: builtins.c
===================================================================
--- builtins.c	(revision 241743)
+++ builtins.c	(working copy)
@@ -3929,61 +3929,85 @@ 
     unsigned int arg1_align = get_pointer_alignment (arg1) / BITS_PER_UNIT;
     unsigned int arg2_align = get_pointer_alignment (arg2) / BITS_PER_UNIT;
 
+    /* If we don't have POINTER_TYPE, call the function.  */
+    if (arg1_align == 0 || arg2_align == 0)
+      return NULL_RTX;
+
     len1 = c_strlen (arg1, 1);
     len2 = c_strlen (arg2, 1);
 
-    if (len1)
-      len1 = size_binop_loc (loc, PLUS_EXPR, ssize_int (1), len1);
-    if (len2)
-      len2 = size_binop_loc (loc, PLUS_EXPR, ssize_int (1), len2);
+    if (!len1 && !len2)
+      {
+	/* If neither arg1 nor arg2 are constant strings.  */
+	/* Stabilize the arguments in case gen_cmpstrnsi fails.  */
+	arg1 = builtin_save_expr (arg1);
+	arg2 = builtin_save_expr (arg2);
+	/* Use save_expr here because cmpstrnsi may modify arg3
+	   and builtin_save_expr() doesn't force the save to happen.  */
+	len = save_expr (arg3);
+	len = fold_convert_loc (loc, sizetype, len);
+      }
+    else
+      {
+	if (len1)
+	  len1 = size_binop_loc (loc, PLUS_EXPR, ssize_int (1), len1);
+	if (len2)
+	  len2 = size_binop_loc (loc, PLUS_EXPR, ssize_int (1), len2);
 
-    /* If we don't have a constant length for the first, use the length
-       of the second, if we know it.  We don't require a constant for
-       this case; some cost analysis could be done if both are available
-       but neither is constant.  For now, assume they're equally cheap,
-       unless one has side effects.  If both strings have constant lengths,
-       use the smaller.  */
+	/* If we don't have a constant length for the first, use the length
+	   of the second, if we know it.  We don't require a constant for
+	   this case; some cost analysis could be done if both are available
+	   but neither is constant.  For now, assume they're equally cheap,
+	   unless one has side effects.  If both strings have constant lengths,
+	   use the smaller.  */
 
-    if (!len1)
-      len = len2;
-    else if (!len2)
-      len = len1;
-    else if (TREE_SIDE_EFFECTS (len1))
-      len = len2;
-    else if (TREE_SIDE_EFFECTS (len2))
-      len = len1;
-    else if (TREE_CODE (len1) != INTEGER_CST)
-      len = len2;
-    else if (TREE_CODE (len2) != INTEGER_CST)
-      len = len1;
-    else if (tree_int_cst_lt (len1, len2))
-      len = len1;
-    else
-      len = len2;
+	if (!len1)
+	  len = len2;
+	else if (!len2)
+	  len = len1;
+	else if (TREE_SIDE_EFFECTS (len1))
+	  len = len2;
+	else if (TREE_SIDE_EFFECTS (len2))
+	  len = len1;
+	else if (TREE_CODE (len1) != INTEGER_CST)
+	  len = len2;
+	else if (TREE_CODE (len2) != INTEGER_CST)
+	  len = len1;
+	else if (tree_int_cst_lt (len1, len2))
+	  len = len1;
+	else
+	  len = len2;
 
-    /* If both arguments have side effects, we cannot optimize.  */
-    if (!len || TREE_SIDE_EFFECTS (len))
-      return NULL_RTX;
+	/* If both arguments have side effects, we cannot optimize.  */
+	if (TREE_SIDE_EFFECTS (len))
+	  return NULL_RTX;
 
-    /* The actual new length parameter is MIN(len,arg3).  */
-    len = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (len), len,
-		       fold_convert_loc (loc, TREE_TYPE (len), arg3));
+	/* The actual new length parameter is MIN(len,arg3).  */
+	len = fold_build2_loc (loc, MIN_EXPR, TREE_TYPE (len), len,
+			       fold_convert_loc (loc, TREE_TYPE (len), arg3));
 
-    /* If we don't have POINTER_TYPE, call the function.  */
-    if (arg1_align == 0 || arg2_align == 0)
-      return NULL_RTX;
+	/* Stabilize the arguments in case gen_cmpstrnsi fails.  */
+	arg1 = builtin_save_expr (arg1);
+	arg2 = builtin_save_expr (arg2);
+	len = builtin_save_expr (len);
+      }
 
-    /* Stabilize the arguments in case gen_cmpstrnsi fails.  */
-    arg1 = builtin_save_expr (arg1);
-    arg2 = builtin_save_expr (arg2);
-    len = builtin_save_expr (len);
-
     arg1_rtx = get_memory_rtx (arg1, len);
     arg2_rtx = get_memory_rtx (arg2, len);
     arg3_rtx = expand_normal (len);
+    
+    /* Set MEM_SIZE as appropriate.  This will only happen in 
+       the case where incoming arg3 is constant and arg1/arg2 are not.  */
+    if (CONST_INT_P (arg3_rtx))
+      {
+	set_mem_size (arg1_rtx, INTVAL (arg3_rtx));
+	set_mem_size (arg2_rtx, INTVAL (arg3_rtx));
+      }
+    
     result = expand_cmpstrn_or_cmpmem (cmpstrn_icode, target, arg1_rtx,
 				       arg2_rtx, TREE_TYPE (len), arg3_rtx,
 				       MIN (arg1_align, arg2_align));
+
     if (result)
       {
 	/* Return the value in the proper mode for this function.  */