Subject: [RFC ARM] Audit uses of optimize_size in the ARM backend.
From: Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org>
To: gcc-patches
Cc: Patch Tracking, Richard Earnshaw
Date: Fri, 12 Aug 2011 13:24:55 +0100

Hi,

Quite some time back someone pointed out that the ARM backend still uses
optimize_size in quite a few places, and that backends shouldn't test it
directly in patterns any more.  I wrote this patch up a few weeks ago; it has
been sitting in one of my trees and has gone through some degree of testing.
While the ARM backend doesn't support hot/cold partitioning of basic blocks
because of issues with minipool placement, I suspect this is a worthwhile
cleanup by itself.

The part I'm not yet convinced about is the change from optimize_size to
optimize_insn_for_size_p in thumb_legitimize_address, and I'm looking for
comments there.
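For reference, the substitution pattern is roughly the following (a minimal
sketch, not part of the patch; both predicates already exist in the middle
end, and cfun is the usual current-function pointer):

    /* Before: a single global switch, blind to profile information.  */
    if (optimize_size)
      ...

    /* After, for whole-function decisions: true for -Os, and also for
       functions the profile says are cold.  */
    if (optimize_function_for_size_p (cfun))
      ...

    /* After, for decisions tied to the code currently being expanded or
       split: as above, but the hotness of the current basic block is also
       taken into account.  */
    if (optimize_insn_for_size_p ())
      ...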
There are still other uses of optimize_size; here are some thoughts on what we
should do there.  I will go back and do this when I next have some free time,
but I hope to have the changes in before stage1 closes if they are deemed
useful.

- arm/aout.h : ASM_OUTPUT_ADDR_DIFF_ELT : replace with
  optimize_function_for_size_p ?
- arm/arm.h : TARGET_USE_MOVT - probably again something that could benefit
  from the change.
- arm/arm.h : CONSTANT_ALIGNMENT - probably should retain optimize_size.
- arm/arm.h : DATA_ALIGNMENT - Likewise.
- arm/arm.h : CASE_VECTOR_PC_RELATIVE - should go hand in glove with the
  addr_diff_elt output.
- arm/coff.h or arm/elf.h : JUMP_TABLES_IN_TEXT_SECTION :
  optimize_function_for_size_p () ?
- arm/arm.c (arm_compute_save_reg_mask): Replace optimize_size with
  optimize_function_for_size_p ().
- arm/arm.c (arm_output_epilogue): Replace optimize_size with
  optimize_function_for_size_p ().
- arm/arm.c (arm_expand_prologue): Likewise.
- arm/arm.c (thumb1_extra_regs_pushed): optimize_function_for_size_p.
- arm/arm.c (arm_final_prescan_insn): Probably optimize_insn_for_size_p ().
- arm/arm.c (arm_conditional_register_usage): optimize_function_for_size_p.

OK for trunk after a bootstrap and test run?  Thoughts on what to do with the
rest of the uses?

cheers
Ramana

2011-08-12  Ramana Radhakrishnan  <ramana.radhakrishnan@linaro.org>

	* config/arm/arm.md ("*mulsi3_compare0_v6"): Replace optimize_size
	with optimize_insn_for_size_p.
	("*mulsi_compare0_scratch_v6"): Likewise.
	("*mulsi3addsi_compare0_v6"): Likewise.
	("casesi"): Likewise.
	(dimode_general_splitter): Name existing splitter; likewise.
	("bswapsi2"): Likewise.
	* config/arm/thumb2.md (t2_muls_peepholes): Likewise.
	* config/arm/arm.c (thumb_legitimize_address): Replace optimize_size
	with optimize_insn_for_size_p.
	(adjacent_mem_locations): Likewise.
	(multiple_operation_profitable_p): Likewise.
	(arm_const_double_by_parts): Likewise.
	* config/arm/arm.h (FUNCTION_BOUNDARY): Use
	optimize_function_for_size_p.
	(MODE_BASE_REG_CLASS): Likewise.
	* config/arm/constraints.md (constraint "Dc"): Use
	optimize_insn_for_size_p.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6cd80f8..97dd249 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6359,7 +6359,7 @@ thumb_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode)
       /* Try and fold the offset into a biasing of the base register and
 	 then offsetting that.  Don't do this when optimizing for space
 	 since it can cause too many CSEs.  */
-      if (optimize_size && offset >= 0
+      if (optimize_insn_for_size_p () && offset >= 0
 	  && offset < 256 + 31 * GET_MODE_SIZE (mode))
 	{
 	  HOST_WIDE_INT delta;
@@ -9787,7 +9787,7 @@ adjacent_mem_locations (rtx a, rtx b)
       /* If the target has load delay slots, then there's no benefit
 	 to using an ldm instruction unless the offset is zero and
 	 we are optimizing for size.  */
-      return (optimize_size && (REGNO (reg0) == REGNO (reg1))
+      return (optimize_insn_for_size_p () && (REGNO (reg0) == REGNO (reg1))
 	      && (val0 == 0 || val1 == 0 || val0 == 4 || val1 == 4)
 	      && (val_diff == 4 || val_diff == -4));
     }
@@ -9868,7 +9868,7 @@ multiple_operation_profitable_p (bool is_store ATTRIBUTE_UNUSED,
      As a compromise, we use ldr for counts of 1 or 2 regs, and ldm
      for counts of 3 or 4 regs.  */
-  if (nops <= 2 && arm_tune_xscale && !optimize_size)
+  if (nops <= 2 && arm_tune_xscale && !optimize_insn_for_size_p ())
     return false;
   return true;
 }
@@ -12445,7 +12445,7 @@ arm_const_double_by_parts (rtx val)
   enum machine_mode mode = GET_MODE (val);
   rtx part;
 
-  if (optimize_size || arm_ld_sched)
+  if (optimize_insn_for_size_p () || arm_ld_sched)
     return true;
 
   if (mode == VOIDmode)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 869b9a9..b18f08e 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -533,7 +533,7 @@ extern int arm_arch_thumb_hwdiv;
 #define PREFERRED_STACK_BOUNDARY \
     (arm_abi == ARM_ABI_ATPCS ? 64 : STACK_BOUNDARY)
 
-#define FUNCTION_BOUNDARY ((TARGET_THUMB && optimize_size) ? 16 : 32)
+#define FUNCTION_BOUNDARY ((TARGET_THUMB && optimize_function_for_size_p (cfun)) ? 16 : 32)
 
 /* The lowest bit is used to indicate Thumb-mode functions, so the
    vbit must go into the delta field of pointers to member
@@ -1141,9 +1141,10 @@ enum reg_class
 /* For the Thumb the high registers cannot be used as base registers
    when addressing quantities in QI or HI mode; if we don't know the
    mode, then we must be conservative.  */
-#define MODE_BASE_REG_CLASS(MODE)				\
-    (TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS :	\
-    (((MODE) == SImode) ? BASE_REGS : LO_REGS))
+#define MODE_BASE_REG_CLASS(MODE)				\
+    (TARGET_ARM || (TARGET_THUMB2 && !optimize_function_for_size_p (cfun)) ? \
+     CORE_REGS :						\
+    (((MODE) == SImode) ? BASE_REGS : LO_REGS))
 
 /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
    instead of BASE_REGS.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 3d4dcfa..8e17930 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1400,7 +1400,7 @@
 		      (const_int 0)))
    (set (match_operand:SI 0 "s_register_operand" "=r")
 	(mult:SI (match_dup 2) (match_dup 1)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1426,7 +1426,7 @@
 		   (match_operand:SI 1 "s_register_operand" "r"))
 	  (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1486,7 +1486,7 @@
    (set (match_operand:SI 0 "s_register_operand" "=r")
 	(plus:SI (mult:SI (match_dup 2) (match_dup 1))
 		 (match_dup 3)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -1516,7 +1516,7 @@
 			     (match_operand:SI 3 "s_register_operand" "r"))
 	  (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -4992,13 +4992,13 @@
    (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
 )
 
-(define_split
+(define_split ;; dimode_general_splitter
   [(set (match_operand:ANY64 0 "arm_general_register_operand" "")
 	(match_operand:ANY64 1 "const_double_operand" ""))]
   "TARGET_32BIT
    && reload_completed
   && (arm_const_double_inline_cost (operands[1])
-       <= ((optimize_size || arm_ld_sched) ? 3 : 4))"
+       <= ((optimize_insn_for_size_p () || arm_ld_sched) ? 3 : 4))"
  [(const_int 0)]
  "
  arm_split_constant (SET, SImode, curr_insn,
@@ -8477,7 +8477,7 @@
 			    (match_operand:SI 2 "const_int_operand" "")	; total range
 			    (match_operand:SI 3 "" "")			; table label
 			    (match_operand:SI 4 "" "")]			; Out of range label
-  "TARGET_32BIT || optimize_size || flag_pic"
+  "TARGET_32BIT || optimize_insn_for_size_p () || flag_pic"
   "
   {
     enum insn_code code;
@@ -10845,7 +10845,7 @@
 (define_expand "bswapsi2"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
 	(bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
-"TARGET_EITHER && (arm_arch6 || !optimize_size)"
+"TARGET_EITHER && (arm_arch6 || !optimize_insn_for_size_p ())"
 "
   if (!arm_arch6)
     {
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index f5b8521..c7e13ec 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -232,7 +232,7 @@
    if optimizing for space or when we have load-delay slots to fill."
  (and (match_code "const_double,const_int,const_vector")
       (match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4
-		   && !(optimize_size || arm_ld_sched)")))
+		   && !(optimize_insn_for_size_p () || arm_ld_sched)")))
 
 (define_constraint "Di"
   "@internal
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 9a11012..e228963 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -873,11 +873,11 @@
 ;; implementations and since "mul" will be generated by
 ;; "*arm_mulsi3_v6" anyhow.  The assembler will use a 16-bit encoding
 ;; for "mul" whenever possible anyhow.
-(define_peephole2
+(define_peephole2 ;; t2_muls_peepholes
   [(set (match_operand:SI 0 "low_register_operand" "")
 	(mult:SI (match_operand:SI 1 "low_register_operand" "")
 		 (match_dup 0)))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
 	  (mult:SI (match_dup 0) (match_dup 1)))
@@ -889,7 +889,7 @@
   [(set (match_operand:SI 0 "low_register_operand" "")
 	(mult:SI (match_dup 0)
 		 (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
 	  (mult:SI (match_dup 0) (match_dup 1)))
@@ -902,7 +902,7 @@
 	(mult:SI (match_operand:SI 1 "low_register_operand" "%0")
 		 (match_operand:SI 2 "low_register_operand" "l")))
    (clobber (reg:CC CC_REGNUM))]
-  "TARGET_THUMB2 && optimize_size && reload_completed"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && reload_completed"
   "mul%!\\t%0, %2, %0"
   [(set_attr "predicable" "yes")
    (set_attr "length" "2")
@@ -916,7 +916,7 @@
 			  (const_int 0)))
    (set (match_operand:SI 0 "register_operand" "=l")
 	(mult:SI (match_dup 1) (match_dup 2)))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])
@@ -928,7 +928,7 @@
 		    (match_operand:SI 2 "register_operand" "l"))
 	  (const_int 0)))
    (clobber (match_scratch:SI 0 "=l"))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])
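Not part of the patch, but a quick way to watch the size-only paths above,
assuming an arm-none-eabi cross compiler and a Thumb-2 capable core (the file
name and -mcpu selection are only an example):

    /* muls.c: at -O2 you should see the plain "mul" form; at -Os the
       peephole above is free to pick the flag-setting 16-bit "muls".  */
    int scale (int a, int b)
    {
      return a * b;
    }

    arm-none-eabi-gcc -mthumb -mcpu=cortex-a9 -O2 -S muls.c
    arm-none-eabi-gcc -mthumb -mcpu=cortex-a9 -Os -S muls.c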