From patchwork Fri Oct 24 12:22:20 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steve Capper X-Patchwork-Id: 39470 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-la0-f72.google.com (mail-la0-f72.google.com [209.85.215.72]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 8A70224026 for ; Fri, 24 Oct 2014 12:24:17 +0000 (UTC) Received: by mail-la0-f72.google.com with SMTP id gq15sf1767949lab.7 for ; Fri, 24 Oct 2014 05:24:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:from:to:subject:date:message-id :in-reply-to:references:cc:precedence:list-id:list-unsubscribe :list-archive:list-post:list-help:list-subscribe:mime-version:sender :errors-to:x-original-sender:x-original-authentication-results :mailing-list:content-type:content-transfer-encoding; bh=gpbVsrm7W8m8yuOrD5zhzeqJdTsvSgWDtuflHS3OxNs=; b=Lqxv5YasmMKHuRHe14Jzgmk/8oVYvIOFjF0K2Cr+ihNGgiZ+EzGcwviORxuEuOABvf pVqbL/p2/T6J++EO/8cxzADieNovKeNm6MXVd+rHHsHAMvtSEaZWR9Xel+8m+TEETpZD h5Crr+2/jCI9ecElSNAnlSX9MCO3vwNmaJ1rU+aXwkKrr6nmHpi8hdcd1eZp5wmReRtp PfN0NKVF6/adlWfZuIc38g3YGb5ANs+5MoRLwPORQnxgUU1aGQrq/JUcxO71O1xH9ulV eqqyzq+fq2+3+HsM2YDX1cgvrXercA8gTioIzj72jfdHgQ3L1RLxyWfSjkCfAdpvkyZH KM2Q== X-Gm-Message-State: ALoCoQkjUeTUusyqd9+39JIlC9yuCsqhNAOeQPmamDM0fJ/n0dTKciKdb7LkX9ci6mSW2qRQN3f4 X-Received: by 10.180.101.136 with SMTP id fg8mr775999wib.3.1414153456110; Fri, 24 Oct 2014 05:24:16 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.202.230 with SMTP id kl6ls193407lac.99.gmail; Fri, 24 Oct 2014 05:24:15 -0700 (PDT) X-Received: by 10.152.22.194 with SMTP id g2mr4123979laf.33.1414153455586; Fri, 24 Oct 2014 05:24:15 -0700 (PDT) Received: from mail-lb0-f179.google.com (mail-lb0-f179.google.com. [209.85.217.179]) by mx.google.com with ESMTPS id ba16si6794910lab.35.2014.10.24.05.24.15 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 24 Oct 2014 05:24:15 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.179 as permitted sender) client-ip=209.85.217.179; Received: by mail-lb0-f179.google.com with SMTP id l4so2466341lbv.10 for ; Fri, 24 Oct 2014 05:24:15 -0700 (PDT) X-Received: by 10.152.120.199 with SMTP id le7mr4018876lab.67.1414153455353; Fri, 24 Oct 2014 05:24:15 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.84.229 with SMTP id c5csp417882lbz; Fri, 24 Oct 2014 05:24:14 -0700 (PDT) X-Received: by 10.67.16.106 with SMTP id fv10mr4322987pad.47.1414153453677; Fri, 24 Oct 2014 05:24:13 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id ui1si4181197pbc.4.2014.10.24.05.24.13 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 24 Oct 2014 05:24:13 -0700 (PDT) Received-SPF: none (google.com: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org does not designate permitted sender hosts) client-ip=2001:1868:205::9; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1XhdtN-0003HO-Q9; Fri, 24 Oct 2014 12:22:53 +0000 Received: from mail-wi0-f174.google.com ([209.85.212.174]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1XhdtK-0002wr-2R for linux-arm-kernel@lists.infradead.org; Fri, 24 Oct 2014 12:22:51 +0000 Received: by mail-wi0-f174.google.com with SMTP id q5so1070363wiv.1 for ; Fri, 24 Oct 2014 05:22:27 -0700 (PDT) X-Received: by 10.180.90.237 with SMTP id bz13mr3835786wib.50.1414153347221; Fri, 24 Oct 2014 05:22:27 -0700 (PDT) Received: from marmot.wormnet.eu (marmot.wormnet.eu. [188.246.204.87]) by mx.google.com with ESMTPSA id p4sm1833034wiz.23.2014.10.24.05.22.26 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 24 Oct 2014 05:22:26 -0700 (PDT) From: Steve Capper To: linux-arm-kernel@lists.infradead.org Subject: [PATCH V2] arm64: xchg: Implement cmpxchg_double Date: Fri, 24 Oct 2014 13:22:20 +0100 Message-Id: <1414153340-19552-1-git-send-email-steve.capper@linaro.org> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1413374323-2062-1-git-send-email-steve.capper@linaro.org> References: <1413374323-2062-1-git-send-email-steve.capper@linaro.org> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20141024_052250_265192_284F94AB X-CRM114-Status: GOOD ( 14.76 ) X-Spam-Score: -0.7 (/) X-Spam-Report: SpamAssassin version 3.4.0 on bombadil.infradead.org summary: Content analysis details: (-0.7 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [209.85.212.174 listed in list.dnswl.org] -0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3) [209.85.212.174 listed in wl.mailspike.net] -0.0 SPF_PASS SPF: sender matches SPF record -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders Cc: catalin.marinas@arm.com, Liviu.Dudau@arm.com, will.deacon@arm.com, Steve Capper X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: steve.capper@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.179 as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 The arm64 architecture has the ability to exclusively load and store a pair of registers from an address (ldxp/stxp). Also the SLUB can take advantage of a cmpxchg_double implementation to avoid taking some locks. This patch provides an implementation of cmpxchg_double for 64-bit pairs, and activates the logic required for the SLUB to use these functions (HAVE_ALIGNED_STRUCT_PAGE and HAVE_CMPXCHG_DOUBLE). Also definitions of this_cpu_cmpxchg_8 and this_cpu_cmpxchg_double_8 are wired up to cmpxchg_local and cmpxchg_double_local (rather than the stock implementations that perform non-atomic operations with interrupts disabled) as they are used by the SLUB. On a Juno platform running on only the A57s I get quite a noticeable performance improvement with 5 runs of hackbench on v3.17: Baseline | With Patch -----------------+----------- Mean 119.2312 | 106.1782 StdDev 0.4919 | 0.4494 (times taken to complete `./hackbench 100 process 1000', in seconds) Signed-off-by: Steve Capper --- Changed in V2, added the this_cpu_cmpxchg* definitions, these are used by the fast path of the SLUB (without this our hackbench mean goes up to 111.9 seconds). Cheers Liviu for pointing out this ommission! The performance measurements were taken against a newer kernel running on a board with newer firmware, thus the baseline is faster than the one posted in V1. --- arch/arm64/Kconfig | 2 ++ arch/arm64/include/asm/cmpxchg.h | 71 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 73 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index fd4e81a..4a0f9a1 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -31,12 +31,14 @@ config ARM64 select GENERIC_STRNLEN_USER select GENERIC_TIME_VSYSCALL select HARDIRQS_SW_RESEND + select HAVE_ALIGNED_STRUCT_PAGE if SLUB select HAVE_ARCH_AUDITSYSCALL select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_KGDB select HAVE_ARCH_TRACEHOOK select HAVE_C_RECORDMCOUNT select HAVE_CC_STACKPROTECTOR + select HAVE_CMPXCHG_DOUBLE select HAVE_DEBUG_BUGVERBOSE select HAVE_DEBUG_KMEMLEAK select HAVE_DMA_API_DEBUG diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h index ddb9d78..89e397b 100644 --- a/arch/arm64/include/asm/cmpxchg.h +++ b/arch/arm64/include/asm/cmpxchg.h @@ -19,6 +19,7 @@ #define __ASM_CMPXCHG_H #include +#include #include @@ -152,6 +153,51 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, return oldval; } +#define system_has_cmpxchg_double() 1 + +static inline int __cmpxchg_double(volatile void *ptr1, volatile void *ptr2, + unsigned long old1, unsigned long old2, + unsigned long new1, unsigned long new2, int size) +{ + unsigned long loop, lost; + + switch (size) { + case 8: + VM_BUG_ON((unsigned long *)ptr2 - (unsigned long *)ptr1 != 1); + do { + asm volatile("// __cmpxchg_double8\n" + " ldxp %0, %1, %2\n" + " eor %0, %0, %3\n" + " eor %1, %1, %4\n" + " orr %1, %0, %1\n" + " mov %w0, #0\n" + " cbnz %1, 1f\n" + " stxp %w0, %5, %6, %2\n" + "1:\n" + : "=&r"(loop), "=&r"(lost), "+Q" (*(u64 *)ptr1) + : "r" (old1), "r"(old2), "r"(new1), "r"(new2)); + } while (loop); + break; + default: + BUILD_BUG(); + } + + return !lost; +} + +static inline int __cmpxchg_double_mb(volatile void *ptr1, volatile void *ptr2, + unsigned long old1, unsigned long old2, + unsigned long new1, unsigned long new2, int size) +{ + int ret; + + smp_mb(); + ret = __cmpxchg_double(ptr1, ptr2, old1, old2, new1, new2, size); + smp_mb(); + + return ret; +} + static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old, unsigned long new, int size) { @@ -182,6 +228,31 @@ static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old, __ret; \ }) +#define cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \ +({\ + int __ret;\ + __ret = __cmpxchg_double_mb((ptr1), (ptr2), (unsigned long)(o1), \ + (unsigned long)(o2), (unsigned long)(n1), \ + (unsigned long)(n2), sizeof(*(ptr1)));\ + __ret; \ +}) + +#define cmpxchg_double_local(ptr1, ptr2, o1, o2, n1, n2) \ +({\ + int __ret;\ + __ret = __cmpxchg_double((ptr1), (ptr2), (unsigned long)(o1), \ + (unsigned long)(o2), (unsigned long)(n1), \ + (unsigned long)(n2), sizeof(*(ptr1)));\ + __ret; \ +}) + +#define this_cpu_cmpxchg_8(ptr, o, n) \ + cmpxchg_local(raw_cpu_ptr(&(ptr)), o, n) + +#define this_cpu_cmpxchg_double_8(ptr1, ptr2, o1, o2, n1, n2) \ + cmpxchg_double_local(raw_cpu_ptr(&(ptr1)), raw_cpu_ptr(&(ptr2)), \ + o1, o2, n1, n2) + #define cmpxchg64(ptr,o,n) cmpxchg((ptr),(o),(n)) #define cmpxchg64_local(ptr,o,n) cmpxchg_local((ptr),(o),(n))