Subject: [RFC ARM] Audit uses of optimize_size in the ARM backend.
From: Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
To: gcc-patches
Cc: Patch Tracking, Richard Earnshaw
Date: Fri, 12 Aug 2011 13:24:55 +0100

Hi,

Quite some time back someone pointed out that the ARM backend still uses
optimize_size in quite a few places, and that backends shouldn't test it
directly in patterns any more.  I wrote this patch up a few weeks ago; it has
been sitting in one of my trees and has gone through some degree of testing.
While the ARM backend doesn't support hot/cold partitioning of basic blocks
because of issues with minipool placement, I suspect this is a worthwhile
cleanup by itself.

The part I'm not yet convinced about is the change from optimize_size to
optimize_insn_for_size_p in thumb_legitimize_address, and I'm looking for
comments there.
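For reference, the substitution pattern is roughly the following (a minimal
sketch, not part of the patch; both predicates already exist in the middle
end, and cfun is the usual current-function pointer):

    /* Before: a single global switch, blind to profile information.  */
    if (optimize_size)
      ...

    /* After, for whole-function decisions: true for -Os, and also for
       functions the profile says are cold.  */
    if (optimize_function_for_size_p (cfun))
      ...

    /* After, for decisions tied to the code currently being expanded or
       split: as above, but the hotness of the current basic block is also
       taken into account.  */
    if (optimize_insn_for_size_p ())
      ...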
There are still other uses of optimize_size; here are some thoughts on what we
should do there.  I will go back and do this when I next have some free time,
but I hope to have the changes in before stage1 closes if they are deemed
useful.

- arm/aout.h : ASM_OUTPUT_ADDR_DIFF_ELT : replace with
  optimize_function_for_size_p ?
- arm/arm.h : TARGET_USE_MOVT - probably again something that could benefit
  from the change.
- arm/arm.h : CONSTANT_ALIGNMENT - probably should retain optimize_size.
- arm/arm.h : DATA_ALIGNMENT - Likewise.
- arm/arm.h : CASE_VECTOR_PC_RELATIVE - should go hand in glove with the
  addr_diff_elt output.
- arm/coff.h or arm/elf.h : JUMP_TABLES_IN_TEXT_SECTION :
  optimize_function_for_size_p () ?
- arm/arm.c (arm_compute_save_reg_mask): Replace optimize_size with
  optimize_function_for_size_p ().
- arm/arm.c (arm_output_epilogue): Replace optimize_size with
  optimize_function_for_size_p ().
- arm/arm.c (arm_expand_prologue): Likewise.
- arm/arm.c (thumb1_extra_regs_pushed): optimize_function_for_size_p.
- arm/arm.c (arm_final_prescan_insn): Probably optimize_insn_for_size_p ().
- arm/arm.c (arm_conditional_register_usage): optimize_function_for_size_p.

OK for trunk after a bootstrap and test run?  Thoughts on what to do with the
rest of the uses?

cheers
Ramana

2011-08-12  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

	* config/arm/arm.md ("*mulsi3_compare0_v6"): Replace optimize_size
	with optimize_insn_for_size_p.
	("*mulsi_compare0_scratch_v6"): Likewise.
	("*mulsi3addsi_compare0_v6"): Likewise.
	("casesi"): Likewise.
	(dimode_general_splitter): Name existing splitter; likewise.
	("bswapsi2"): Likewise.
	* config/arm/thumb2.md (t2_muls_peepholes): Likewise.
	* config/arm/arm.c (thumb_legitimize_address): Replace optimize_size
	with optimize_insn_for_size_p.
	(adjacent_mem_locations): Likewise.
	(multiple_operation_profitable_p): Likewise.
	(arm_const_double_by_parts): Likewise.
	* config/arm/arm.h (FUNCTION_BOUNDARY): Use
	optimize_function_for_size_p.
	(MODE_BASE_REG_CLASS): Likewise.
	* config/arm/constraints.md (constraint "Dc"): Use
	optimize_insn_for_size_p.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6cd80f8..97dd249 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6359,7 +6359,7 @@ thumb_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode)
       /* Try and fold the offset into a biasing of the base register and
 	 then offsetting that.  Don't do this when optimizing for space
 	 since it can cause too many CSEs.  */
-      if (optimize_size && offset >= 0
+      if (optimize_insn_for_size_p () && offset >= 0
 	  && offset < 256 + 31 * GET_MODE_SIZE (mode))
 	{
 	  HOST_WIDE_INT delta;
@@ -9787,7 +9787,7 @@ adjacent_mem_locations (rtx a, rtx b)
       /* If the target has load delay slots, then there's no benefit
 	 to using an ldm instruction unless the offset is zero and
 	 we are optimizing for size.  */
-      return (optimize_size && (REGNO (reg0) == REGNO (reg1))
+      return (optimize_insn_for_size_p () && (REGNO (reg0) == REGNO (reg1))
 	      && (val0 == 0 || val1 == 0 || val0 == 4 || val1 == 4)
 	      && (val_diff == 4 || val_diff == -4));
     }
@@ -9868,7 +9868,7 @@ multiple_operation_profitable_p (bool is_store ATTRIBUTE_UNUSED,
      As a compromise, we use ldr for counts of 1 or 2 regs, and ldm
      for counts of 3 or 4 regs.  */
-  if (nops <= 2 && arm_tune_xscale && !optimize_size)
+  if (nops <= 2 && arm_tune_xscale && !optimize_insn_for_size_p ())
     return false;
   return true;
 }
@@ -12445,7 +12445,7 @@ arm_const_double_by_parts (rtx val)
   enum machine_mode mode = GET_MODE (val);
   rtx part;
 
-  if (optimize_size || arm_ld_sched)
+  if (optimize_insn_for_size_p () || arm_ld_sched)
     return true;
 
   if (mode == VOIDmode)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 869b9a9..b18f08e 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -533,7 +533,7 @@ extern int arm_arch_thumb_hwdiv;
 #define PREFERRED_STACK_BOUNDARY \
     (arm_abi == ARM_ABI_ATPCS ? 64 : STACK_BOUNDARY)
 
-#define FUNCTION_BOUNDARY ((TARGET_THUMB && optimize_size) ? 16 : 32)
+#define FUNCTION_BOUNDARY ((TARGET_THUMB && optimize_function_for_size_p (cfun)) ? 16 : 32)
 
 /* The lowest bit is used to indicate Thumb-mode functions, so the
    vbit must go into the delta field of pointers to member
@@ -1141,9 +1141,10 @@ enum reg_class
 /* For the Thumb the high registers cannot be used as base registers
    when addressing quantities in QI or HI mode; if we don't know the
    mode, then we must be conservative.  */
-#define MODE_BASE_REG_CLASS(MODE)				\
-    (TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS :	\
-    (((MODE) == SImode) ? BASE_REGS : LO_REGS))
+#define MODE_BASE_REG_CLASS(MODE)				\
+    (TARGET_ARM || (TARGET_THUMB2 && !optimize_function_for_size_p (cfun)) ? \
+     CORE_REGS :						\
+    (((MODE) == SImode) ? BASE_REGS : LO_REGS))
 
 /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
    instead of BASE_REGS.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 3d4dcfa..8e17930 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1400,7 +1400,7 @@
 		      (const_int 0)))
    (set (match_operand:SI 0 "s_register_operand" "=r")
 	(mult:SI (match_dup 2) (match_dup 1)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1426,7 +1426,7 @@
 		   (match_operand:SI 1 "s_register_operand" "r"))
 	  (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1486,7 +1486,7 @@
    (set (match_operand:SI 0 "s_register_operand" "=r")
 	(plus:SI (mult:SI (match_dup 2) (match_dup 1))
 		 (match_dup 3)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -1516,7 +1516,7 @@
 			     (match_operand:SI 3 "s_register_operand" "r"))
 	  (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -4992,13 +4992,13 @@
    (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
 )
 
-(define_split
+(define_split ;; dimode_general_splitter
   [(set (match_operand:ANY64 0 "arm_general_register_operand" "")
 	(match_operand:ANY64 1 "const_double_operand" ""))]
   "TARGET_32BIT
    && reload_completed
   && (arm_const_double_inline_cost (operands[1])
-       <= ((optimize_size || arm_ld_sched) ? 3 : 4))"
+       <= ((optimize_insn_for_size_p () || arm_ld_sched) ? 3 : 4))"
  [(const_int 0)]
  "
  arm_split_constant (SET, SImode, curr_insn,
@@ -8477,7 +8477,7 @@
 			    (match_operand:SI 2 "const_int_operand" "")	; total range
 			    (match_operand:SI 3 "" "")			; table label
 			    (match_operand:SI 4 "" "")]			; Out of range label
-  "TARGET_32BIT || optimize_size || flag_pic"
+  "TARGET_32BIT || optimize_insn_for_size_p () || flag_pic"
   "
   {
     enum insn_code code;
@@ -10845,7 +10845,7 @@
 (define_expand "bswapsi2"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
 	(bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
-"TARGET_EITHER && (arm_arch6 || !optimize_size)"
+"TARGET_EITHER && (arm_arch6 || !optimize_insn_for_size_p ())"
 "
   if (!arm_arch6)
     {
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index f5b8521..c7e13ec 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -232,7 +232,7 @@
    if optimizing for space or when we have load-delay slots to fill."
  (and (match_code "const_double,const_int,const_vector")
       (match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4
-		   && !(optimize_size || arm_ld_sched)")))
+		   && !(optimize_insn_for_size_p () || arm_ld_sched)")))
 
 (define_constraint "Di"
   "@internal
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 9a11012..e228963 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -873,11 +873,11 @@
 ;; implementations and since "mul" will be generated by
 ;; "*arm_mulsi3_v6" anyhow.  The assembler will use a 16-bit encoding
 ;; for "mul" whenever possible anyhow.
-(define_peephole2
+(define_peephole2 ;; t2_muls_peepholes
   [(set (match_operand:SI 0 "low_register_operand" "")
 	(mult:SI (match_operand:SI 1 "low_register_operand" "")
 		 (match_dup 0)))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
 	  (mult:SI (match_dup 0) (match_dup 1)))
@@ -889,7 +889,7 @@
   [(set (match_operand:SI 0 "low_register_operand" "")
 	(mult:SI (match_dup 0)
 		 (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
 	  (mult:SI (match_dup 0) (match_dup 1)))
@@ -902,7 +902,7 @@
 	(mult:SI (match_operand:SI 1 "low_register_operand" "%0")
 		 (match_operand:SI 2 "low_register_operand" "l")))
    (clobber (reg:CC CC_REGNUM))]
-  "TARGET_THUMB2 && optimize_size && reload_completed"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && reload_completed"
   "mul%!\\t%0, %2, %0"
   [(set_attr "predicable" "yes")
    (set_attr "length" "2")
@@ -916,7 +916,7 @@
 			  (const_int 0)))
    (set (match_operand:SI 0 "register_operand" "=l")
 	(mult:SI (match_dup 1) (match_dup 2)))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])
@@ -928,7 +928,7 @@
 		    (match_operand:SI 2 "register_operand" "l"))
 	  (const_int 0)))
    (clobber (match_scratch:SI 0 "=l"))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])
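Not part of the patch, but a quick way to watch the size-only paths above,
assuming an arm-none-eabi cross compiler and a Thumb-2 capable core (the file
name and -mcpu selection are only an example):

    /* muls.c: at -O2 you should see the plain "mul" form; at -Os the
       peephole above is free to pick the flag-setting 16-bit "muls".  */
    int scale (int a, int b)
    {
      return a * b;
    }

    arm-none-eabi-gcc -mthumb -mcpu=cortex-a9 -O2 -S muls.c
    arm-none-eabi-gcc -mthumb -mcpu=cortex-a9 -Os -S muls.c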