[11/11] ARM: Get rid of .LCcralign local label usage in alignment_trap macro

Message ID 1343649500-18491-11-git-send-email-anton.vorontsov@linaro.org
State New

Commit Message

Anton Vorontsov July 30, 2012, 11:58 a.m. UTC
This makes the code more isolated.

The downside of this is that we now have an additional branch and the
code itself is 8 bytes longer. But on the bright side, this new layout
can be more cache friendly since the cr_alignment address might already be
in the cache line (not that I measured anything, it's just fun to think
about it).

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/arm/kernel/entry-armv.S     |    2 --
 arch/arm/kernel/entry-header.S   |    6 +++++-
 arch/arm/kernel/kgdb_fiq_entry.S |    3 ---
 3 files changed, 5 insertions(+), 6 deletions(-)
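
The "8 bytes longer" figure can be checked with a little arithmetic. The
sketch below is only a back-of-envelope model, assuming the ARM (A32)
encoding where every instruction and every .word literal takes 4 bytes;
a Thumb-2 build would give different numbers. The function names and the
`sites` / `files_with_literal` parameters are made up for illustration
and do not come from the patch.

```python
# Back-of-envelope size model for the alignment_trap macro.
# Assumption: A32 encoding, so each instruction and each .word is 4 bytes.
WORD = 4


def old_site_bytes():
    # ldr rtemp, .LCcralign ; ldr rtemp, [rtemp] ; mcr p15, ...
    return 3 * WORD


def new_site_bytes():
    # The same three instructions, plus "b 2f" and the inline .word.
    return 3 * WORD + WORD + WORD


def old_total_bytes(sites, files_with_literal):
    # Old layout: every file that expands the macro keeps one shared
    # .LCcralign word, regardless of how many expansions it contains.
    return sites * old_site_bytes() + files_with_literal * WORD


def new_total_bytes(sites):
    # New layout: each expansion carries its own literal and branch.
    return sites * new_site_bytes()


if __name__ == "__main__":
    # Growth per macro expansion.
    print(new_site_bytes() - old_site_bytes())  # prints 8
```

Under these assumptions each expansion grows by 8 bytes (one branch plus
one literal), while each file that used to carry .LCcralign gets 4 bytes
back once.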

Comments

Russell King - ARM Linux July 30, 2012, 2:15 p.m. UTC | #1
On Mon, Jul 30, 2012 at 04:58:20AM -0700, Anton Vorontsov wrote:
> This makes the code more isolated.
> 
> The downside of this is that we now have an additional branch and the
> code itself is 8 bytes longer. But on the bright side, this new layout
> can be more cache friendly since the cr_alignment address might already be
> in the cache line (not that I measured anything, it's just fun to think
> about it).

The caches are Harvard, so mixing data and code together does not increase
performance.  Having data which is used by the same code in the same cache
line results in better performance.

The additional branch will also cause a pipeline stall on older CPUs.

So no, I don't see any way that this is a performance improvement.  Please
leave this as is.
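
The Harvard-cache point can be illustrated with a toy model. This is a
sketch only, not a description of any particular CPU: it assumes 32-byte
cache lines, ignores associativity and eviction, and uses a hypothetical
address 0x1000 for the macro expansion. The `Cache` class and
`literal_load_hits` are names invented for this example.

```python
LINE_BYTES = 32  # assumed cache line size


class Cache:
    """Toy cache that only remembers which lines it has seen."""

    def __init__(self):
        self.lines = set()

    def access(self, addr):
        # Returns True on a hit; on a miss the line is filled.
        line = addr // LINE_BYTES
        hit = line in self.lines
        self.lines.add(line)
        return hit


def literal_load_hits(split_caches):
    icache = Cache()
    # With split (Harvard) caches, data reads go to a separate cache.
    dcache = Cache() if split_caches else icache
    base = 0x1000                    # hypothetical address of "ldr rtemp, 1f"
    icache.access(base)              # instruction fetch fills the I-cache line
    return dcache.access(base + 16)  # data read of the inline .word


if __name__ == "__main__":
    print(literal_load_hits(split_caches=True))
    print(literal_load_hits(split_caches=False))
```

In this model the inline literal sits in the same line as the code, but
with split caches the data read still misses, because the line was only
brought into the instruction cache. Only a unified cache would turn that
read into a hit.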

Anton Vorontsov Aug. 1, 2012, 8:53 p.m. UTC | #2
On Mon, Jul 30, 2012 at 03:15:44PM +0100, Russell King - ARM Linux wrote:
> On Mon, Jul 30, 2012 at 04:58:20AM -0700, Anton Vorontsov wrote:
> > This makes the code more isolated.
> > 
> > The downside of this is that we now have an additional branch and the
> > code itself is 8 bytes longer. But on the bright side, this new layout
> > can be more cache friendly since the cr_alignment address might already be
> > in the cache line (not that I measured anything, it's just fun to think
> > about it).
> 
> The caches are Harvard, so mixing data and code together does not increase
> performance.  Having data which is used by the same code in the same cache
> line results in better performance.
> 
> The additional branch will also cause a pipeline stall on older CPUs.
> 
> So no, I don't see any way that this is a performance improvement.  Please
> leave this as is.

Sure, will drop it.

Thanks!

Patch

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 6aeb9b8..6b04ab5 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -266,8 +266,6 @@  __pabt_svc:
 ENDPROC(__pabt_svc)
 
 	.align	5
-.LCcralign:
-	.word	cr_alignment
 #ifdef MULTI_DABORT
 .LCprocfns:
 	.word	processor
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index c3c09ac..5a05e7f 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -38,9 +38,13 @@ 
 
 	.macro	alignment_trap, rtemp
 #ifdef CONFIG_ALIGNMENT_TRAP
-	ldr	\rtemp, .LCcralign
+	ldr	\rtemp, 1f
 	ldr	\rtemp, [\rtemp]
 	mcr	p15, 0, \rtemp, c1, c0
+	b	2f
+1:
+	.word	cr_alignment
+2:
 #endif
 	.endm
 
diff --git a/arch/arm/kernel/kgdb_fiq_entry.S b/arch/arm/kernel/kgdb_fiq_entry.S
index 7be3726..e7c05fc 100644
--- a/arch/arm/kernel/kgdb_fiq_entry.S
+++ b/arch/arm/kernel/kgdb_fiq_entry.S
@@ -18,9 +18,6 @@ 
 
 	.text
 
-@ This is needed for usr_entry/alignment_trap
-.LCcralign:
-	.long	cr_alignment
 .LCdohandle:
 	.long	kgdb_fiq_do_handle