[1/3] aarch64: Clean up _dl_runtime_resolve

Message ID 20180801222347.18903-2-rth@twiddle.net
State New
Headers show
Series
  • aarch64: Update ld.so for vector abi
Related show

Commit Message

Richard Henderson Aug. 1, 2018, 10:23 p.m.
From: Richard Henderson <richard.henderson@linaro.org>


	* sysdeps/aarch64/dl-trampoline.S (_dl_runtime_resolve):
	Do not record unwind info for arguments; this is unneeded;
	do not save x9 just to have a register to pair with x8;
	properly include the 16 bytes of PLT stack into the unwind;
	create a frame pointer with the spare stack slot;
	rearrange the exit to only adjust the stack once.
---
 sysdeps/aarch64/dl-trampoline.S | 50 +++++++++------------------------
 1 file changed, 14 insertions(+), 36 deletions(-)

-- 
2.17.1

Comments

Szabolcs Nagy Aug. 2, 2018, 4:03 p.m. | #1
On 01/08/18 23:23, rth@twiddle.net wrote:
> From: Richard Henderson <richard.henderson@linaro.org>

> 

> 	* sysdeps/aarch64/dl-trampoline.S (_dl_runtime_resolve):

> 	Do not record unwind info for arguments; this is unneeded;

> 	do not save x9 just to have a register to pair with x8;

> 	properly include the 16 bytes of PLT stack into the unwind;

> 	create a frame pointer with the spare stack slot;

> 	rearrange the exit to only adjust the stack once.


i thought the cfi annotations were needed for all registers
in case the debugger wants to investigate register content
across a _dl_runtime_resolve frame (possibly several frames
up in the call stack), this may not be a common use case though
and i don't know what's the convention in glibc asm, the compiler
seems to emit annotation for all spilled registers with -g.

> --- a/sysdeps/aarch64/dl-trampoline.S

> +++ b/sysdeps/aarch64/dl-trampoline.S

> @@ -32,7 +32,6 @@

>   	.text

>   	.globl _dl_runtime_resolve

>   	.type _dl_runtime_resolve, #function

> -	cfi_startproc

>   	.align 2

>   _dl_runtime_resolve:

>   	/* AArch64 we get called with:

> @@ -41,46 +40,24 @@ _dl_runtime_resolve:

>   	   [sp, #8]	lr

>   	   [sp, #0]	&PLTGOT[n]

>   	 */

> -

> +	cfi_startproc


is there a problem keeping it at its original place above?
the tlsdesc asm has cfi_startproc at the same place.

otherwise the patch looks ok.
Richard Henderson Aug. 2, 2018, 5:04 p.m. | #2
On 08/02/2018 12:03 PM, Szabolcs Nagy wrote:
> On 01/08/18 23:23, rth@twiddle.net wrote:

>> From: Richard Henderson <richard.henderson@linaro.org>

>>

>>     * sysdeps/aarch64/dl-trampoline.S (_dl_runtime_resolve):

>>     Do not record unwind info for arguments; this is unneeded;

>>     do not save x9 just to have a register to pair with x8;

>>     properly include the 16 bytes of PLT stack into the unwind;

>>     create a frame pointer with the spare stack slot;

>>     rearrange the exit to only adjust the stack once.

> 

> i thought the cfi annotations were needed for all registers

> in case the debugger wants to investigate register content

> across a _dl_runtime_resolve frame (possibly several frames

> up in the call stack),


However that's typically for the call-saved registers, where the compiler might
save data in that register across the call.  These are not call-saved
registers.  They are argument registers.  There will not be any debug info that
refers to them.

> this may not be a common use case though

> and i don't know what's the convention in glibc asm, the compiler

> seems to emit annotation for all spilled registers with -g.


Sure, because the compiler is spilling call-saved registers.
The others it just clobbers with no annotation.

>> -    cfi_startproc

>>       .align 2

>>   _dl_runtime_resolve:

>>       /* AArch64 we get called with:

>> @@ -41,46 +40,24 @@ _dl_runtime_resolve:

>>          [sp, #8]    lr

>>          [sp, #0]    &PLTGOT[n]

>>        */

>> -

>> +    cfi_startproc

> 

> is there a problem keeping it at its original place above?

> the tlsdesc asm has cfi_startproc at the same place.


It could stay where it is, but I thought it clearer to place the following two
annotations immediately adjacent (because it's state incoming, not anything we
are doing here, and should not be separated from the start).  Further, to place
all of the annotations immediately after the comment that describes why.


r~
Szabolcs Nagy Aug. 2, 2018, 5:20 p.m. | #3
On 02/08/18 18:04, Richard Henderson wrote:
> On 08/02/2018 12:03 PM, Szabolcs Nagy wrote:

>> On 01/08/18 23:23, rth@twiddle.net wrote:

>>> From: Richard Henderson <richard.henderson@linaro.org>

>>>

>>>      * sysdeps/aarch64/dl-trampoline.S (_dl_runtime_resolve):

>>>      Do not record unwind info for arguments; this is unneeded;

>>>      do not save x9 just to have a register to pair with x8;

>>>      properly include the 16 bytes of PLT stack into the unwind;

>>>      create a frame pointer with the spare stack slot;

>>>      rearrange the exit to only adjust the stack once.

>>

>> i thought the cfi annotations were needed for all registers

>> in case the debugger wants to investigate register content

>> across a _dl_runtime_resolve frame (possibly several frames

>> up in the call stack),

> 

> However that's typically for the call-saved registers, where the compiler might

> save data in that register across the call.  These are not call-saved

> registers.  They are argument registers.  There will not be any debug info that

> refers to them.

> 

>> this may not be a common use case though

>> and i don't know what's the convention in glibc asm, the compiler

>> seems to emit annotation for all spilled registers with -g.

> 

> Sure, because the compiler is spilling call-saved registers.

> The others it just clobbers with no annotation.

> 

>>> -    cfi_startproc

>>>        .align 2

>>>    _dl_runtime_resolve:

>>>        /* AArch64 we get called with:

>>> @@ -41,46 +40,24 @@ _dl_runtime_resolve:

>>>           [sp, #8]    lr

>>>           [sp, #0]    &PLTGOT[n]

>>>         */

>>> -

>>> +    cfi_startproc

>>

>> is there a problem keeping it at its original place above?

>> the tlsdesc asm has cfi_startproc at the same place.

> 

> It could stay where it is, but I thought it clearer to place the following two

> annotations immediately adjacent (because it's state incoming, not anything we

> are doing here, and should not be separated from the start).  Further, to place

> all of the annotations immediately after the comment that describes why.

> 


i see, then this patch is OK to commit, thanks.

Patch

diff --git a/sysdeps/aarch64/dl-trampoline.S b/sysdeps/aarch64/dl-trampoline.S
index a86d0722d4..e8e2af485a 100644
--- a/sysdeps/aarch64/dl-trampoline.S
+++ b/sysdeps/aarch64/dl-trampoline.S
@@ -32,7 +32,6 @@ 
 	.text
 	.globl _dl_runtime_resolve
 	.type _dl_runtime_resolve, #function
-	cfi_startproc
 	.align 2
 _dl_runtime_resolve:
 	/* AArch64 we get called with:
@@ -41,46 +40,24 @@  _dl_runtime_resolve:
 	   [sp, #8]	lr
 	   [sp, #0]	&PLTGOT[n]
 	 */
-
+	cfi_startproc
+	cfi_adjust_cfa_offset(16)	/* Incorporate PLT */
 	cfi_rel_offset (lr, 8)
 
 	/* Save arguments.  */
-	stp	x8, x9, [sp, #-(80+8*16)]!
+	stp	x29, x8, [sp, #-(80+8*16)]!
 	cfi_adjust_cfa_offset (80+8*16)
-	cfi_rel_offset (x8, 0)
-	cfi_rel_offset (x9, 8)
+	cfi_rel_offset (x29, 0)
+	mov	x29, sp
 
 	stp	x6, x7, [sp,  #16]
-	cfi_rel_offset (x6, 16)
-	cfi_rel_offset (x7, 24)
-
 	stp	x4, x5, [sp,  #32]
-	cfi_rel_offset (x4, 32)
-	cfi_rel_offset (x5, 40)
-
 	stp	x2, x3, [sp,  #48]
-	cfi_rel_offset (x2, 48)
-	cfi_rel_offset (x3, 56)
-
 	stp	x0, x1, [sp,  #64]
-	cfi_rel_offset (x0, 64)
-	cfi_rel_offset (x1, 72)
-
 	stp	q0, q1, [sp, #(80+0*16)]
-	cfi_rel_offset (q0, 80+0*16)
-	cfi_rel_offset (q1, 80+1*16)
-
 	stp	q2, q3, [sp, #(80+2*16)]
-	cfi_rel_offset (q0, 80+2*16)
-	cfi_rel_offset (q1, 80+3*16)
-
 	stp	q4, q5, [sp, #(80+4*16)]
-	cfi_rel_offset (q0, 80+4*16)
-	cfi_rel_offset (q1, 80+5*16)
-
 	stp	q6, q7, [sp, #(80+6*16)]
-	cfi_rel_offset (q0, 80+6*16)
-	cfi_rel_offset (q1, 80+7*16)
 
 	/* Get pointer to linker struct.  */
 	ldr	PTR_REG (0), [ip0, #-PTR_SIZE]
@@ -101,25 +78,26 @@  _dl_runtime_resolve:
 	mov	ip0, x0
 
 	/* Get arguments and return address back.  */
-	ldp	q0, q1, [sp, #(80+0*16)]
-	ldp	q2, q3, [sp, #(80+2*16)]
-	ldp	q4, q5, [sp, #(80+4*16)]
+	ldr	lr, [sp, #80+8*16+8]
 	ldp	q6, q7, [sp, #(80+6*16)]
+	ldp	q4, q5, [sp, #(80+4*16)]
+	ldp	q2, q3, [sp, #(80+2*16)]
+	ldp	q0, q1, [sp, #(80+0*16)]
 	ldp	x0, x1, [sp, #64]
 	ldp	x2, x3, [sp, #48]
 	ldp	x4, x5, [sp, #32]
 	ldp	x6, x7, [sp, #16]
-	ldp	x8, x9, [sp], #(80+8*16)
-	cfi_adjust_cfa_offset (-(80+8*16))
-
-	ldp	ip1, lr, [sp], #16
-	cfi_adjust_cfa_offset (-16)
+	ldp	x29, x8, [sp], 80+8*16+16
+	cfi_adjust_cfa_offset (-(80+8*16+16))
+	cfi_restore (lr)
+	cfi_restore (x29)
 
 	/* Jump to the newly found address.  */
 	br	ip0
 
 	cfi_endproc
 	.size _dl_runtime_resolve, .-_dl_runtime_resolve
+
 #ifndef PROF
 	.globl _dl_runtime_profile
 	.type _dl_runtime_profile, #function