diff mbox

[ARM,v3] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc generated by -pg

Message ID CADnVucDrK4LXBEwEEVJi8fNre42LvyqOx+PjHY18SyhK11fMtQ@mail.gmail.com
State New
Headers show

Commit Message

Charles Baylis March 29, 2016, 11:41 a.m. UTC
On 29 March 2016 at 02:16, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>

> Hi Charles,

>

> +static void

> +arm_emit_long_call_profile_insn ()

> +{

> +  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");

> +  /* if movt/movw are not available, use a constant pool */

> +  if (!arm_arch_thumb2)

>

> Should this be !TARGET_USE_MOVT?


Hi Kugan,

Thanks for the review.

TARGET_USE_MOVT has additional conditions which mean that it can be
false on targets with MOVW/MOVT depending on the tuning parameters for
the target CPU. Because this patch works in a slightly odd way, I
think it is better to use MOVW/MOVT where possible so that the
slightly hacky use of the literal pool is avoided. Since this only
happens when profiling, it is not essential to have the fully
optimised code sequence here. I'm happy to change it if anybody feels
strongly though.

I've noticed in the quoted snippet that there are some GNU coding
style errors, so I've respun the patch with those corrected.

gcc/ChangeLog:

2016-03-29  Charles Baylis  <charles.baylis@linaro.org>

        * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
        * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
        (arm_expand_prologue): Likewise.
        (thumb1_expand_prologue): Likewise.
        (arm_output_long_call_to_profile_func): Likewise.
        (arm_emit_long_call_profile): Likewise.
        * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
        * config/arm/arm.md (arm_long_call_profile): New pattern.
        * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
        define.
        * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
        * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

gcc/testsuite/ChangeLog:

2016-03-29  Charles Baylis  <charles.baylis@linaro.org>

        * gcc.target/arm/pr69770.c: New test.

Comments

Christophe Lyon March 29, 2016, 1:03 p.m. UTC | #1
On 29 March 2016 at 13:41, Charles Baylis <charles.baylis@linaro.org> wrote:
> On 29 March 2016 at 02:16, Kugan <kugan.vivekanandarajah@linaro.org> wrote:

>>

>> Hi Charles,

>>

>> +static void

>> +arm_emit_long_call_profile_insn ()

>> +{

>> +  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");

>> +  /* if movt/movw are not available, use a constant pool */

>> +  if (!arm_arch_thumb2)

>>

>> Should this be !TARGET_USE_MOVT?

>

> Hi Kugan,

>

> Thanks for the review.

>

> TARGET_USE_MOVT has additional conditions which mean that it can be

> false on targets with MOVW/MOVT depending on the tuning parameters for

> the target CPU. Because this patch works in a slightly odd way, I

> think it is better to use MOVW/MOVT where possible so that the

> slightly hacky use of the literal pool is avoided. Since this only

> happens when profiling, it is not essential to have the fully

> optimised code sequence here. I'm happy to change it if anybody feels

> strongly though.

>

> I've noticed in the quoted snippet that there are some GNU coding

> style errors, so I've respun the patch with those corrected.

>

> gcc/ChangeLog:

>

> 2016-03-29  Charles Baylis  <charles.baylis@linaro.org>

>

>         * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.

>         * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.

>         (arm_expand_prologue): Likewise.

>         (thumb1_expand_prologue): Likewise.

>         (arm_output_long_call_to_profile_func): Likewise.

>         (arm_emit_long_call_profile): Likewise.

>         * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.

>         * config/arm/arm.md (arm_long_call_profile): New pattern.

>         * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New

>         define.

>         * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.


Hi Charles,

In thumb1.md, I noticed:
@@ -1798,7 +1798,7 @@
   [(unspec_volatile [(match_operand:SI 0 "s_register_operand" "l")]
             VUNSPEC_EH_RETURN)
    (clobber (match_scratch:SI 1 "=&l"))]
-  "TARGET_THUMB1"
+  "TARGET_THUMB1 && 0"
   "#"
   "&& reload_completed"
   [(const_int 0)]

which looks like an artifact of WIP.

>         * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

>

> gcc/testsuite/ChangeLog:

>

> 2016-03-29  Charles Baylis  <charles.baylis@linaro.org>

>

>         * gcc.target/arm/pr69770.c: New test.
diff mbox

Patch

From 5785ddcfd518c44cf87b0fc74b4397fd98d1b0c1 Mon Sep 17 00:00:00 2001
From: Charles Baylis <charles.baylis@linaro.org>
Date: Tue, 29 Mar 2016 12:28:25 +0100
Subject: [PATCH] PR69770 -mlong-calls does not affect calls to __gnu_mcount_nc
 generated by -pg

gcc/ChangeLog:

2016-03-29  Charles Baylis  <charles.baylis@linaro.org>

        * config/arm/arm-protos.h (arm_emit_long_call_profile): New function.
        * config/arm/arm.c (arm_emit_long_call_profile_insn): New function.
        (arm_expand_prologue): Likewise.
        (thumb1_expand_prologue): Likewise.
        (arm_output_long_call_to_profile_func): Likewise.
        (arm_emit_long_call_profile): Likewise.
        * config/arm/arm.h: (ASM_OUTPUT_REG_PUSH) Update comment.
        * config/arm/arm.md (arm_long_call_profile): New pattern.
        * config/arm/bpabi.h (ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS): New
	define.
        * config/arm/thumb1.md (thumb1_long_call_profile): New pattern.
        * config/arm/unspecs.md (unspecv): Add VUNSPEC_LONG_CALL_PROFILE.

gcc/testsuite/ChangeLog:

2016-03-29  Charles Baylis  <charles.baylis@linaro.org>

        * gcc.target/arm/pr69770.c: New test.

Change-Id: I9b8de01fea083f17f729c3801f83174bedb3b0c6

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 0083673..324c9f4 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -343,6 +343,7 @@  extern void arm_register_target_pragmas (void);
 extern void arm_cpu_cpp_builtins (struct cpp_reader *);
 
 extern bool arm_is_constant_pool_ref (rtx);
+void arm_emit_long_call_profile ();
 
 /* Flags used to identify the presence of processor capabilities.  */
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c868490..885657a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21426,6 +21426,21 @@  output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }
 
+static void
+arm_emit_long_call_profile_insn ()
+{
+  rtx sym_ref = gen_rtx_SYMBOL_REF (Pmode, "__gnu_mcount_nc");
+  /* If movt/movw are not available, use a constant pool.  */
+  if (!arm_arch_thumb2)
+  {
+    sym_ref = force_const_mem (Pmode, sym_ref);
+  }
+  rtvec vec = gen_rtvec (1, sym_ref);
+  rtx tmp = gen_rtx_UNSPEC_VOLATILE (VOIDmode, vec, VUNSPEC_LONG_CALL_PROFILE);
+  emit_insn (tmp);
+}
+
+
 /* Generate the prologue instructions for entry into an ARM or Thumb-2
    function.  */
 void
@@ -21789,6 +21804,10 @@  arm_expand_prologue (void)
       arm_load_pic_register (mask);
     }
 
+  if (crtl->profile && TARGET_LONG_CALLS
+      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
+    arm_emit_long_call_profile_insn ();
+
   /* If we are profiling, make sure no instructions are scheduled before
      the call to mcount.  Similarly if the user has requested no
      scheduling in the prolog.  Similarly if we want non-call exceptions
@@ -24985,6 +25004,10 @@  thumb1_expand_prologue (void)
   if (frame_pointer_needed)
     thumb_set_frame_pointer (offsets);
 
+  if (crtl->profile && TARGET_LONG_CALLS
+      && ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS)
+    arm_emit_long_call_profile_insn ();
+
   /* If we are profiling, make sure no instructions are scheduled before
      the call to mcount.  Similarly if the user has requested no
      scheduling in the prolog.  Similarly if we want non-call exceptions
@@ -30289,4 +30312,70 @@  arm_sched_fusion_priority (rtx_insn *insn, int max_pri,
   return;
 }
 
+static void
+arm_output_long_call_to_profile_func (rtx * operands, bool push_scratch)
+{
+  /* operands[0] is the address of the __gnu_mcount_nc function.
+     operands[1] is the scratch register we use to load that address.  */
+  if (push_scratch)
+    output_asm_insn ("push\t{%1}", operands);
+  output_asm_insn ("push\t{lr}", operands);
+  if (GET_CODE (operands[0]) == SYMBOL_REF)
+    {
+      output_asm_insn ("movw\t%1, #:lower16:%c0", operands);
+      output_asm_insn ("movt\t%1, #:upper16:%c0", operands);
+    }
+  else
+    {
+      output_asm_insn ("ldr\t%1, %0", operands);
+    }
+  if (!arm_arch5)
+    {
+      output_asm_insn ("mov\tlr, pc", operands);
+      output_asm_insn ("mov\tpc, %1", operands);
+    }
+  else
+    output_asm_insn ("blx\t%1", operands);
+  if (push_scratch)
+    output_asm_insn ("pop\t{%1}", operands);
+}
+
+void
+arm_emit_long_call_profile ()
+{
+  rtx alcp = NULL;
+  rtx operands[2];
+  bool push_scratch;
+  /* Find the use of arm_long_call_profile.  */
+  for (rtx_insn * insn = get_insns (); insn; insn = NEXT_INSN (insn))
+    {
+      if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_PROLOGUE_END)
+	break;
+      if (INSN_CODE (insn) == CODE_FOR_arm_long_call_profile
+	  || INSN_CODE (insn) == CODE_FOR_thumb1_long_call_profile)
+	{
+	  alcp = PATTERN (insn);
+	  break;
+	}
+    }
+  gcc_assert (alcp);
+
+  operands[0] = XEXP (XEXP (alcp, 0), 0);
+  if (TARGET_32BIT)
+    {
+      operands[1] = gen_rtx_REG (SImode, IP_REGNUM);
+      push_scratch = false;
+    }
+  else
+    {
+      /* for nested functions, we can set push_scratch to false, since
+	 final.c:profile_function.c and ASM_OUTPUT_REG_PUSH preserve it as
+	 part of the sequence to preserve ip across the call to the
+	 profiling function.  */
+      operands[1] = gen_rtx_REG (SImode, R0_REGNUM + 7);
+      push_scratch = !IS_NESTED (arm_current_func_type ());
+    }
+  arm_output_long_call_to_profile_func (operands, push_scratch);
+}
+
 #include "gt-arm.h"
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 6352140..c8e2e47 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2044,7 +2044,9 @@  extern int making_const_table;
    that ASM_OUTPUT_REG_PUSH will be matched with ASM_OUTPUT_REG_POP, and
    that r7 isn't used by the function profiler, so we can use it as a
    scratch reg.  WARNING: This isn't safe in the general case!  It may be
-   sensitive to future changes in final.c:profile_function.  */
+   sensitive to future changes in final.c:profile_function.  This is also
+   relied on in arm_emit_long_call_profile () which assumes r7 can be
+   used as a scratch register to load the address of the function profiler.  */
 #define ASM_OUTPUT_REG_PUSH(STREAM, REGNO)		\
   do							\
     {							\
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 47171b9..0c9710b 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11424,6 +11424,15 @@ 
   DONE;
 })
 
+(define_insn "arm_long_call_profile"
+  [(unspec_volatile [(match_operand:SI 0 "general_operand" "ji,m")
+		    ] VUNSPEC_LONG_CALL_PROFILE)]
+  "TARGET_32BIT"
+  "%@ arm_long_call_profile"
+  [(set_attr "arm_pool_range" "*,4096")
+   (set_attr "arm_neg_pool_range" "*,4084")]
+)
+
 ;; Vector bits common to IWMMXT and Neon
 (include "vec-common.md")
 ;; Load the Intel Wireless Multimedia Extension patterns
diff --git a/gcc/config/arm/bpabi.h b/gcc/config/arm/bpabi.h
index 5d6c4ed..89bf698 100644
--- a/gcc/config/arm/bpabi.h
+++ b/gcc/config/arm/bpabi.h
@@ -174,11 +174,20 @@ 
 
 #undef  NO_PROFILE_COUNTERS
 #define NO_PROFILE_COUNTERS 1
+#undef  ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS
+#define ARM_FUNCTION_PROFILER_SUPPORTS_LONG_CALLS 1
 #undef  ARM_FUNCTION_PROFILER
 #define ARM_FUNCTION_PROFILER(STREAM, LABELNO)  			\
 {									\
-  fprintf (STREAM, "\tpush\t{lr}\n");					\
-  fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");				\
+  if (TARGET_LONG_CALLS)						\
+    { 									\
+      arm_emit_long_call_profile ();					\
+    }									\
+  else									\
+    {									\
+      fprintf (STREAM, "\tpush\t{lr}\n");				\
+      fprintf (STREAM, "\tbl\t__gnu_mcount_nc\n");			\
+    }									\
 }
 
 #undef SUBTARGET_FRAME_POINTER_REQUIRED
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 072ed4d..482d8cb 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -1798,7 +1798,7 @@ 
   [(unspec_volatile [(match_operand:SI 0 "s_register_operand" "l")]
 		    VUNSPEC_EH_RETURN)
    (clobber (match_scratch:SI 1 "=&l"))]
-  "TARGET_THUMB1"
+  "TARGET_THUMB1 && 0"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -1809,4 +1809,13 @@ 
   }"
   [(set_attr "type" "mov_reg")]
 )
+
+(define_insn "thumb1_long_call_profile"
+  [(unspec_volatile [(match_operand:SI 0 "general_operand" "j,m")
+		    ] VUNSPEC_LONG_CALL_PROFILE)]
+  "TARGET_THUMB1"
+  "%@ thumb1_long_call_profile"
+  [(set_attr "pool_range" "1018")]
+)
+
 
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index 5744c62..d7ddc3a 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -148,6 +148,7 @@ 
   VUNSPEC_GET_FPSCR	; Represent fetch of FPSCR content.
   VUNSPEC_SET_FPSCR	; Represent assign of FPSCR content.
   VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
+  VUNSPEC_LONG_CALL_PROFILE ; Represent a long call to profile function
 ])
 
 ;; Enumerators for NEON unspecs.
diff --git a/gcc/testsuite/gcc.target/arm/pr69770.c b/gcc/testsuite/gcc.target/arm/pr69770.c
new file mode 100644
index 0000000..93b433d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr69770.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-pg -mlong-calls" } */
+
+extern void g (void);
+
+int
+f ()
+{
+  g ();
+  return 0;
+}
+
+
+/* { dg-final { scan-assembler-not "bl\[ \t\]+__gnu_mcount_nc" } } */
+/* { dg-final { scan-assembler "__gnu_mcount_nc" } } */
-- 
1.9.1