[stable/4.0.y] ARM: 8325/1: exynos: move resume code to .text section

Message ID 1433299379-18183-1-git-send-email-khilman@kernel.org
State New
Headers show

Commit Message

Kevin Hilman June 3, 2015, 2:42 a.m.
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

This code calls cpu_resume() using a straight branch (b), so
now that we have moved cpu_resume() back to .text, this should
be moved there as well. Any direct references to symbols that will
remain in the .data section are replaced with explicit PC-relative
references.

Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
(cherry picked from commit 12833bacf5d904c2dac0c3f52b2ebde5f2c5a2bc)
Cc: <stable@vger.kernel.org> # v4.0+
Signed-off-by: Kevin Hilman <khilman@linaro.org>
---
This fixes compile errors on stable/linux-4.0.y when building for ARM
using multi_v7_defconfig + CONFIG_THUMB2_KERNEL=y:

../arch/arm/mach-exynos/sleep.S:72: Error: invalid immediate for address calculation (value = 0x00000004)
../arch/arm/mach-exynos/sleep.S:74: Error: invalid immediate for address calculation (value = 0x00000004)
 
 arch/arm/mach-exynos/sleep.S | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

Comments

Ard Biesheuvel June 15, 2015, 7:04 a.m. | #1
On 15 June 2015 at 01:13, Ben Hutchings <ben@decadent.org.uk> wrote:
> On Tue, 2015-06-02 at 19:42 -0700, Kevin Hilman wrote:
>> From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>
>> This code calls cpu_resume() using a straight branch (b), so
>> now that we have moved cpu_resume() back to .text, this should
>> be moved there as well. Any direct references to symbols that will
>> remain in the .data section are replaced with explicit PC-relative
>> references.
>
> I don't get it.  cpu_resume() is still in the .data section in 4.0.
> This appears to depend on:
>
> commit d0776aff9a38b1390cc06ffc2c4dcf6ece7c05b9
> Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Date:   Wed Mar 25 07:39:21 2015 +0100
>
>     ARM: 8324/1: move cpu_resume() to .text section
>

You are right. So this patch results in the Exynos resume function to
call cpu_resume() in .data from the .text section. This turns out to
work fine for normal configs (exynos_defconfig, multi_v7_defconfig)
built for ARM since the distance is < 16 MB, and -apparently- fixes an
issue Kevin spotted with the Thumb build on top of that. Whether
cpu_resume() may now be out of Thumb range (8 MB) in some configs is
irrelevant since the Thumb build was broken in the first place.
Ard Biesheuvel June 15, 2015, 2:20 p.m. | #2
On 15 June 2015 at 13:08, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Jun 15, 2015 at 09:04:20AM +0200, Ard Biesheuvel wrote:
>> On 15 June 2015 at 01:13, Ben Hutchings <ben@decadent.org.uk> wrote:
>> > On Tue, 2015-06-02 at 19:42 -0700, Kevin Hilman wrote:
>> >> From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> >>
>> >> This code calls cpu_resume() using a straight branch (b), so
>> >> now that we have moved cpu_resume() back to .text, this should
>> >> be moved there as well. Any direct references to symbols that will
>> >> remain in the .data section are replaced with explicit PC-relative
>> >> references.
>> >
>> > I don't get it.  cpu_resume() is still in the .data section in 4.0.
>> > This appears to depend on:
>> >
>> > commit d0776aff9a38b1390cc06ffc2c4dcf6ece7c05b9
>> > Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> > Date:   Wed Mar 25 07:39:21 2015 +0100
>> >
>> >     ARM: 8324/1: move cpu_resume() to .text section
>> >
>>
>> You are right. So this patch results in the Exynos resume function to
>> call cpu_resume() in .data from the .text section. This turns out to
>> work fine for normal configs (exynos_defconfig, multi_v7_defconfig)
>> built for ARM since the distance is < 16 MB, and -apparently- fixes an
>> issue Kevin spotted with the Thumb build on top of that. Whether
>> cpu_resume() may now be out of Thumb range (8 MB) in some configs is
>> irrelevant since the Thumb build was broken in the first place.
>
> Err, commit 12833bacf5d904c2dac0c3f52b2ebde5f2c5a2bc should not be
> taken into older kernels at all, precisely because 8324/1 is not
> backported either.
>
> The whole point of moving it (if you read the commit text) is because
> we've moved cpu_resume to .text, which then allows the mach-* asm
> which calls cpu_resume to also move.  Without the first, the second
> doesn't make sense.
>

It appears that the introduction of open coded PC-relative references
works around an adr range issue on Thumb2. So it's kind of a side
effect of this patch, and not the original purpose. I was aware of
this when it was proposed for -stable, and I didn't see any harm. I
guess I should have said something ...

But looking at the original error:

../arch/arm/mach-exynos/sleep.S:72: Error: invalid immediate for
address calculation (value = 0x00000004)
../arch/arm/mach-exynos/sleep.S:74: Error: invalid immediate for
address calculation (value = 0x00000004)

there is something odd going on, since a value of 0x4 should obviously
be in range.

I can reproduce the error with this minimal .S:

"""
    adr r0, cp15_save_power

    .align
    .globl cp15_save_power
cp15_save_power:
    .long 0 @ cp15 power control
"""

$ arm-linux-gnueabihf-as -o /tmp/sleep.o /tmp/sleep.S -mthumb
/tmp/sleep.S: Assembler messages:
/tmp/sleep.S:1: Error: invalid immediate for address calculation
(value = 0x00000004)

The error goes away when I drop the -mthumb or when I replace the code with
"""
    adr r0, 0f

    .align
    .globl cp15_save_power
cp15_save_power:
0:  .long 0 @ cp15 power control
"""

In summary, this is an unrelated binutils issue that happens to go
away with $subject patch. (I am using binutils v2.24 btw)

> I think the question is - what's caused stable-4.0 to start spitting
> these errors?  Presumably, 4.0 didn't, and stable-4.0 has regressed?
> Maybe, rather than trying to fix this new regression, the original
> cause should be reverted?
>

Not sure whether it's a regression. I think this code does not usually
get built in Thumb2 mode in the first place.
Ard Biesheuvel June 16, 2015, 10:09 a.m. | #3
On 16 June 2015 at 00:59, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Jun 15, 2015 at 03:17:34PM -0700, Kevin Hilman wrote:
>> Ard Biesheuvel <ard.biesheuvel@linaro.org> writes:
>>
>> > On 15 June 2015 at 13:08, Russell King - ARM Linux
>> [...]
>>
>> >> I think the question is - what's caused stable-4.0 to start spitting
>> >> these errors?  Presumably, 4.0 didn't, and stable-4.0 has regressed?
>> >> Maybe, rather than trying to fix this new regression, the original
>> >> cause should be reverted?
>> >>
>> >
>> > Not sure whether it's a regression. I think this code does not usually
>> > get built in Thumb2 mode in the first place.
>>
>> It's not a regression in stable-4.0, v4.0 has the same build failure.
>>
>> I think we've only caught this now since I added multi_v7 +
>> CONFIG_THUMB2_KERNEL=y builds to kernelci.org for mainline and the
>> stable trees.
>
> Okay, so it's not a user reported regression, but comes from a build
> system.  So I continue to wonder what the value is of trying to fix
> it in stable kernels, vs the risk of de-stabilising them, especially
> when the fix we have in mainline can't be applied.
>
> I'd suggest waiting until we have proper users reporting a failure
> over this.  (Who's a proper user?)
>

Well, unfortunately that ship has sailed. This patch has already been
included in 4.0.5.

So the question is whether we should revert it or not. My position is
that there is no need:

In vanilla v4.0, there are 8 mach-specific invocations of cpu_resume()
(which resides in .data itself):

From the .text section:
arch/arm/mach-imx/suspend-imx6.S
arch/arm/mach-mvebu/pmsu_ll.S
arch/arm/mach-omap2/sleep34xx.S
arch/arm/mach-s3c24xx/sleep.S
arch/arm/mach-s3c64xx/sleep.S
arch/arm/mach-tegra/reset-handler.S

From the .data section:
arch/arm/mach-exynos/sleep.S
arch/arm/mach-s5pv210/sleep.S

In mainline, we moved cpu_resume() and the two latter callers into
.text as well, but strictly for the purpose of allowing much larger
kernels to be linked, This is not supported by v4.0 anyway, so moving
the call to cpu_resume() from .data to .text for exynos is unlikely to
cause trouble since it just brings it in line with what the majority
(including s3c24xx and s3c64xx) are already doing. (The link problem
for large kernels needs all 3 patches, but not necessarily in a strict
order.)

And the side effect is that it dodges the binutils issue, allowing
v4.0-stable to be built in Thumb2 mode for Exynos and
multi_v7_defconfig.

Patch

diff --git a/arch/arm/mach-exynos/sleep.S b/arch/arm/mach-exynos/sleep.S
index 31d25834b9c4..cf950790fbdc 100644
--- a/arch/arm/mach-exynos/sleep.S
+++ b/arch/arm/mach-exynos/sleep.S
@@ -23,14 +23,7 @@ 
 #define CPU_MASK	0xff0ffff0
 #define CPU_CORTEX_A9	0x410fc090
 
-	/*
-	 * The following code is located into the .data section. This is to
-	 * allow l2x0_regs_phys to be accessed with a relative load while we
-	 * can't rely on any MMU translation. We could have put l2x0_regs_phys
-	 * in the .text section as well, but some setups might insist on it to
-	 * be truly read-only. (Reference from: arch/arm/kernel/sleep.S)
-	 */
-	.data
+	.text
 	.align
 
 	/*
@@ -69,10 +62,12 @@  ENTRY(exynos_cpu_resume_ns)
 	cmp	r0, r1
 	bne	skip_cp15
 
-	adr	r0, cp15_save_power
+	adr	r0, _cp15_save_power
 	ldr	r1, [r0]
-	adr	r0, cp15_save_diag
+	ldr	r1, [r0, r1]
+	adr	r0, _cp15_save_diag
 	ldr	r2, [r0]
+	ldr	r2, [r0, r2]
 	mov	r0, #SMC_CMD_C15RESUME
 	dsb
 	smc	#0
@@ -118,14 +113,20 @@  skip_l2x0:
 skip_cp15:
 	b	cpu_resume
 ENDPROC(exynos_cpu_resume_ns)
+
+	.align
+_cp15_save_power:
+	.long	cp15_save_power - .
+_cp15_save_diag:
+	.long	cp15_save_diag - .
+#ifdef CONFIG_CACHE_L2X0
+1:	.long	l2x0_saved_regs - .
+#endif /* CONFIG_CACHE_L2X0 */
+
+	.data
 	.globl cp15_save_diag
 cp15_save_diag:
 	.long	0	@ cp15 diagnostic
 	.globl cp15_save_power
 cp15_save_power:
 	.long	0	@ cp15 power control
-
-#ifdef CONFIG_CACHE_L2X0
-	.align
-1:	.long	l2x0_saved_regs - .
-#endif /* CONFIG_CACHE_L2X0 */