[v2,3/3] arm64: spinlock: use lock->owner to optimise spin_unlock_wait

Message ID 1465403139-21054-3-git-send-email-will.deacon@arm.com
State New

Commit Message

Will Deacon June 8, 2016, 4:25 p.m. UTC
Rather than wait until we observe the lock being free, we can also
return from spin_unlock_wait if we observe that the lock is now held
by somebody else, which implies that it was unlocked but we just missed
seeing it in that state.

Furthermore, in such a scenario there is no longer a need to write back
the value that we loaded, since we know that there has been a lock
hand-off, which is sufficient to publish any stores prior to the
unlock_wait. The litmus test is something like:

AArch64
{
0:X1=x; 0:X3=y;
1:X1=y;
2:X1=y; 2:X3=x;
}
 P0          | P1           | P2           ;
 MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
 STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
 DMB SY      |              |              ;
 LDR W2,[X3] |              |              ;
exists
(0:X2=0 /\ 2:X0=1 /\ 2:X2=0)

where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
doing spin_lock.
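
For readers less fluent in litmus syntax, the three columns can be read as
the C11-atomics sketch below. This is illustrative only and not part of the
patch: the thread functions and harness are invented here, DMB SY / STLR /
LDAR are approximated by a seq_cst fence, a release store and an acquire
load, and C11 by itself does not necessarily forbid the final state in the
exists clause -- the point is that the AArch64 architecture does.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int x, y;			/* both initially 0 */
static int p0_r2, p2_r0, p2_r2;		/* observed "register" values */

static void *p0(void *unused)		/* spin_unlock_wait side */
{
	atomic_store_explicit(&x, 1, memory_order_relaxed);	 /* STR  W0,[X1] */
	atomic_thread_fence(memory_order_seq_cst);		 /* DMB SY       */
	p0_r2 = atomic_load_explicit(&y, memory_order_relaxed); /* LDR  W2,[X3] */
	return NULL;
}

static void *p1(void *unused)		/* spin_unlock side */
{
	atomic_store_explicit(&y, 1, memory_order_release);	 /* STLR W0,[X1] */
	return NULL;
}

static void *p2(void *unused)		/* spin_lock side */
{
	p2_r0 = atomic_load_explicit(&y, memory_order_acquire); /* LDAR W0,[X1] */
	p2_r2 = atomic_load_explicit(&x, memory_order_relaxed); /* LDR  W2,[X3] */
	return NULL;
}

int main(void)
{
	pthread_t t[3];

	pthread_create(&t[0], NULL, p0, NULL);
	pthread_create(&t[1], NULL, p1, NULL);
	pthread_create(&t[2], NULL, p2, NULL);
	for (int i = 0; i < 3; i++)
		pthread_join(t[i], NULL);

	/* The exists clause: forbidden when run on ARMv8 hardware. */
	printf("0:X2=%d 2:X0=%d 2:X2=%d\n", p0_r2, p2_r0, p2_r2);
	return 0;
}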

Signed-off-by: Will Deacon <will.deacon@arm.com>

---
 arch/arm64/include/asm/spinlock.h | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

-- 
2.1.4


Comments

Will Deacon June 10, 2016, 12:46 p.m. UTC | #1
On Fri, Jun 10, 2016 at 02:25:20PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 08, 2016 at 05:25:39PM +0100, Will Deacon wrote:
> > Rather than wait until we observe the lock being free, we can also
> > return from spin_unlock_wait if we observe that the lock is now held
> > by somebody else, which implies that it was unlocked but we just missed
> > seeing it in that state.
> > 
> > Furthermore, in such a scenario there is no longer a need to write back
> > the value that we loaded, since we know that there has been a lock
> > hand-off, which is sufficient to publish any stores prior to the
> > unlock_wait.
> 
> You might want a few words on _why_ here. It took me a little while to
> figure that out.

How about "... because the ARM architecture ensures that a Store-Release
is multi-copy-atomic when observed by a Load-Acquire instruction"?

> Also; human readable arguments to support the thing below go a long way
> into validating the test is indeed correct. Because as you've shown,
> even the validators cannot be trusted ;-)

Well, I didn't actually provide the output of a model here. I'm just
capturing the rationale in a non-ambiguous form.

> > The litmus test is something like:
> > 
> > AArch64
> > {
> > 0:X1=x; 0:X3=y;
> > 1:X1=y;
> > 2:X1=y; 2:X3=x;
> > }
> >  P0          | P1           | P2           ;
> >  MOV W0,#1   | MOV W0,#1    | LDAR W0,[X1] ;
> >  STR W0,[X1] | STLR W0,[X1] | LDR W2,[X3]  ;
> >  DMB SY      |              |              ;
> >  LDR W2,[X3] |              |              ;
> > exists
> > (0:X2=0 /\ 2:X0=1 /\ 2:X2=0)
> > 
> > where P0 is doing spin_unlock_wait, P1 is doing spin_unlock and P2 is
> > doing spin_lock.
> 
> I still have a hard time deciphering these things..

I'll nail you down at LPC and share the kool-aid :)

Will


Patch

diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index d5c894253e73..e875a5a551d7 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -30,20 +30,39 @@  static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
 {
 	unsigned int tmp;
 	arch_spinlock_t lockval;
+	u32 owner;
 
 	/*
 	 * Ensure prior spin_lock operations to other locks have completed
 	 * on this CPU before we test whether "lock" is locked.
 	 */
 	smp_mb();
+	owner = READ_ONCE(lock->owner) << 16;
 
 	asm volatile(
 "	sevl\n"
 "1:	wfe\n"
 "2:	ldaxr	%w0, %2\n"
+	/* Is the lock free? */
 "	eor	%w1, %w0, %w0, ror #16\n"
-"	cbnz	%w1, 1b\n"
-	/* Serialise against any concurrent lockers */
+"	cbz	%w1, 3f\n"
+	/* Lock taken -- has there been a subsequent unlock->lock transition? */
+"	eor	%w1, %w3, %w0, lsl #16\n"
+"	cbz	%w1, 1b\n"
+	/*
+	 * The owner has been updated, so there was an unlock->lock
+	 * transition that we missed. That means we can rely on the
+	 * store-release of the unlock operation paired with the
+	 * load-acquire of the lock operation to publish any of our
+	 * previous stores to the new lock owner and therefore don't
+	 * need to bother with the writeback below.
+	 */
+"	b	4f\n"
+"3:\n"
+	/*
+	 * Serialise against any concurrent lockers by writing back the
+	 * unlocked lock value
+	 */
 	ARM64_LSE_ATOMIC_INSN(
 	/* LL/SC */
 "	stxr	%w1, %w0, %2\n"
@@ -53,9 +72,11 @@  static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
 "	mov	%w1, %w0\n"
 "	cas	%w0, %w0, %2\n"
 "	eor	%w1, %w1, %w0\n")
+	/* Somebody else wrote to the lock -- go back to 2 and reload the value */
 "	cbnz	%w1, 2b\n"
+"4:"
 	: "=&r" (lockval), "=&r" (tmp), "+Q" (*lock)
-	:
+	: "r" (owner)
 	: "memory");
 }
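
For anyone following the labels in the inline asm, the control flow of the
updated arch_spin_unlock_wait is roughly the C sketch below. It is
illustrative only: try_writeback() is a made-up helper standing in for the
STXR/CAS writeback, cpu_relax() stands in for the WFE, and plain C cannot
express the load-acquire/store-exclusive pairing the real code relies on.

/* Illustrative sketch only -- not the real implementation. */
static inline void arch_spin_unlock_wait_sketch(arch_spinlock_t *lock)
{
	arch_spinlock_t lockval;
	u16 owner;

	/* Order prior spin_lock operations against the test below. */
	smp_mb();
	owner = READ_ONCE(lock->owner);	/* the asm keeps this pre-shifted by 16 */

	for (;;) {
		lockval = READ_ONCE(*lock);		/* really LDAXR (label 2) */

		if (lockval.owner == lockval.next) {
			/*
			 * Label 3: the lock is free. Write the value back to
			 * serialise against concurrent lockers, then return.
			 */
			if (try_writeback(lock, lockval))	/* really STXR/CAS */
				return;
			continue;	/* lost the race: reload at label 2 */
		}

		if (lockval.owner != owner)
			/*
			 * Label 4: the owner changed, so an unlock->lock
			 * hand-off happened. The unlock's store-release paired
			 * with the new owner's load-acquire publishes our
			 * prior stores, so the writeback is not needed.
			 */
			return;

		/* Same owner still holds the lock: wait and retry (label 1). */
		cpu_relax();				/* really WFE */
	}
}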