From patchwork Mon May 15 07:57:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 684081 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CAF8C7EE26 for ; Mon, 15 May 2023 08:07:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230002AbjEOIHY (ORCPT ); Mon, 15 May 2023 04:07:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38464 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230074AbjEOIHW (ORCPT ); Mon, 15 May 2023 04:07:22 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DFA211716; Mon, 15 May 2023 01:07:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=gp7OCKaBYjIIkPcuPMybL+ixdp/AdwhFq0MWXNBGUCo=; b=mfr/Lee6q6wb3hW9VLMJP6RMas nTl9Qe1+9Iu48xOQT4aXYKIo3YdHu4pVixS6ep5klWjn2Hleinv1VLcJZJw+35Dak53xO4M9X9KIK FpMiCIIAsi7/QgdKTHO+THopVLnjgnmza7zDZfiP5yT3vka2IQVSR4N/m/pdNqYPnTzsA8lIVgWqD P5m3mwC26ExmfD/meFO5f1ezwrEtuIohzrkIWoLo3mQANg3Hx3wLE8RBN4yr9HOkOwiYyEuNQ3/1W PLL/LI+1uLThBRAeCCPMwP+WHJ7vOYl/0orY1PoeN18xp4Y80oyboD7BypdwlKtzQ00oQnrhgh54F yIuqhB5w==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pyTDk-00BQMy-1T; Mon, 15 May 2023 08:06:16 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 0C5C8300786; Mon, 15 May 2023 10:06:10 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id AB37F202674A5; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080553.979680310@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:00 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 01/11] cyrpto/b128ops: Remove struct u128 References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Per git-grep u128_xor() and its related struct u128 are unused except to implement {be,le}128_xor(). Remove them to free up the namespace. Signed-off-by: Peter Zijlstra (Intel) Acked-by: Herbert Xu --- include/crypto/b128ops.h | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-) --- a/include/crypto/b128ops.h +++ b/include/crypto/b128ops.h @@ -50,10 +50,6 @@ #include typedef struct { - u64 a, b; -} u128; - -typedef struct { __be64 a, b; } be128; @@ -61,20 +57,16 @@ typedef struct { __le64 b, a; } le128; -static inline void u128_xor(u128 *r, const u128 *p, const u128 *q) +static inline void be128_xor(be128 *r, const be128 *p, const be128 *q) { r->a = p->a ^ q->a; r->b = p->b ^ q->b; } -static inline void be128_xor(be128 *r, const be128 *p, const be128 *q) -{ - u128_xor((u128 *)r, (u128 *)p, (u128 *)q); -} - static inline void le128_xor(le128 *r, const le128 *p, const le128 *q) { - u128_xor((u128 *)r, (u128 *)p, (u128 *)q); + r->a = p->a ^ q->a; + r->b = p->b ^ q->b; } #endif /* _CRYPTO_B128OPS_H */ From patchwork Mon May 15 07:57:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 682070 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47B6DC7EE24 for ; Mon, 15 May 2023 08:07:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239801AbjEOIHi (ORCPT ); Mon, 15 May 2023 04:07:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38552 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236245AbjEOIH1 (ORCPT ); Mon, 15 May 2023 04:07:27 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 50FFF1723; Mon, 15 May 2023 01:07:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=axnHoGjIL7t9C+QVLq6pAPgEZLVVerXkwPR51K17BxQ=; b=J+RYi2Olcu/GDwkb7uyoMxOowZ gdnnCNimNXMzjHfiHbHeyq75lJdwzqja8okQUmitQ1ua7iWanYtjpUVa32iVsf63pmox4vvEN+Ejz 1wpagut3YpzUSwNPpwpsl3jsBlJSSesN7UlPVAxsCXGiTi2d8aMeHHRkqQGQKv/P2cwZEs4JCNCfS xBlhWHGUBVitTRjLQwLm3qFy1fFLZshzr1Jf7x5B4z/zbEO4fjLkYFu6HtcQ+JbhA2M4mY6B7vLd7 0ZMD0sEUiA88G6VeDUxFw1EIBsMuAOuqnZrusc8ZVstoP2bAOXDwpltv/Y/kKX6d4HXBKVWWAlia0 mx0Oe8lw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pyTDk-00BQMz-1T; Mon, 15 May 2023 08:06:16 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 09F29300244; Mon, 15 May 2023 10:06:10 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id AEDEA202FCE90; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.047618712@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:01 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 02/11] types: Introduce [us]128 References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Introduce [us]128 (when available). Unlike [us]64, ensure they are always naturally aligned. This also enables 128bit wide atomics (which require natural alignment) such as cmpxchg128(). Signed-off-by: Peter Zijlstra (Intel) Acked-by: Herbert Xu --- include/linux/types.h | 5 +++++ include/uapi/linux/types.h | 4 ++++ 2 files changed, 9 insertions(+) --- a/include/linux/types.h +++ b/include/linux/types.h @@ -10,6 +10,11 @@ #define DECLARE_BITMAP(name,bits) \ unsigned long name[BITS_TO_LONGS(bits)] +#ifdef __SIZEOF_INT128__ +typedef __s128 s128; +typedef __u128 u128; +#endif + typedef u32 __kernel_dev_t; typedef __kernel_fd_set fd_set; --- a/include/uapi/linux/types.h +++ b/include/uapi/linux/types.h @@ -13,6 +13,10 @@ #include +#ifdef __SIZEOF_INT128__ +typedef __signed__ __int128 __s128 __attribute__((aligned(16))); +typedef unsigned __int128 __u128 __attribute__((aligned(16))); +#endif /* * Below are truly Linux-specific types that should never collide with --- a/lib/crypto/curve25519-hacl64.c +++ b/lib/crypto/curve25519-hacl64.c @@ -14,8 +14,6 @@ #include #include -typedef __uint128_t u128; - static __always_inline u64 u64_eq_mask(u64 a, u64 b) { u64 x = a ^ b; --- a/lib/crypto/poly1305-donna64.c +++ b/lib/crypto/poly1305-donna64.c @@ -10,8 +10,6 @@ #include #include -typedef __uint128_t u128; - void poly1305_core_setkey(struct poly1305_core_key *key, const u8 raw_key[POLY1305_BLOCK_SIZE]) { From patchwork Mon May 15 07:57:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 682071 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7119EC7EE24 for ; Mon, 15 May 2023 08:07:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239554AbjEOIHe (ORCPT ); Mon, 15 May 2023 04:07:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236077AbjEOIH1 (ORCPT ); Mon, 15 May 2023 04:07:27 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D47FE10FF; Mon, 15 May 2023 01:07:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=1dpaKFw/ImnX3ivBkF4/OTirLfnK12XzoUFqOemVS9M=; b=nCa92XFnMhdkelbE+tRZRrXIYS kOcZdCKGEMW4KSKB+6B4aivSqYkUHANT/UlD4Aj6tqOcvWV54dCTn3Zgoan/4wXM6P8uTariGluRA dK6MON9zNhNVc6sBNEvJzPx2w9m+W7e4I4aCZpl5DVMtACFO18Ch9JPZ1jyUpU83EMRKXHtV8Iz1D XfR1l9t+Il6O7f3x5b8EmWQM90cSGueNT4YgQOkrbRReRJORMk8uuKxUx66lTPr2d1C5h5y33LUiC TgvfbRNNsp0dRP9g9qP2uK7HzZtalOk93SWeVN83x38l4Z+We3DOMEzC6VzL2KU4lEfmaiaAB2JGw HPP9047A==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pyTDk-003HUP-En; Mon, 15 May 2023 08:06:17 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 10D4F300912; Mon, 15 May 2023 10:06:10 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id B1D27202FCE9F; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.114813040@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:02 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 03/11] arch: Introduce arch_{,try_}_cmpxchg128{,_local}() References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org For all architectures that currently support cmpxchg_double() implement the cmpxchg128() family of functions that is basically the same but with a saner interface. Signed-off-by: Peter Zijlstra (Intel) Acked-by: Heiko Carstens Acked-by: Mark Rutland --- arch/arm64/include/asm/atomic_ll_sc.h | 41 +++++++++++++++++++++ arch/arm64/include/asm/atomic_lse.h | 31 ++++++++++++++++ arch/arm64/include/asm/cmpxchg.h | 26 +++++++++++++ arch/s390/include/asm/cmpxchg.h | 14 +++++++ arch/x86/include/asm/cmpxchg_32.h | 3 + arch/x86/include/asm/cmpxchg_64.h | 64 +++++++++++++++++++++++++++++++++- 6 files changed, 177 insertions(+), 2 deletions(-) --- a/arch/arm64/include/asm/atomic_ll_sc.h +++ b/arch/arm64/include/asm/atomic_ll_sc.h @@ -326,6 +326,47 @@ __CMPXCHG_DBL( , , , ) __CMPXCHG_DBL(_mb, dmb ish, l, "memory") #undef __CMPXCHG_DBL + +union __u128_halves { + u128 full; + struct { + u64 low, high; + }; +}; + +#define __CMPXCHG128(name, mb, rel, cl...) \ +static __always_inline u128 \ +__ll_sc__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ +{ \ + union __u128_halves r, o = { .full = (old) }, \ + n = { .full = (new) }; \ + unsigned int tmp; \ + \ + asm volatile("// __cmpxchg128" #name "\n" \ + " prfm pstl1strm, %[v]\n" \ + "1: ldxp %[rl], %[rh], %[v]\n" \ + " cmp %[rl], %[ol]\n" \ + " ccmp %[rh], %[oh], 0, eq\n" \ + " b.ne 2f\n" \ + " st" #rel "xp %w[tmp], %[nl], %[nh], %[v]\n" \ + " cbnz %w[tmp], 1b\n" \ + " " #mb "\n" \ + "2:" \ + : [v] "+Q" (*(u128 *)ptr), \ + [rl] "=&r" (r.low), [rh] "=&r" (r.high), \ + [tmp] "=&r" (tmp) \ + : [ol] "r" (o.low), [oh] "r" (o.high), \ + [nl] "r" (n.low), [nh] "r" (n.high) \ + : "cc", ##cl); \ + \ + return r.full; \ +} + +__CMPXCHG128( , , ) +__CMPXCHG128(_mb, dmb ish, l, "memory") + +#undef __CMPXCHG128 + #undef K #endif /* __ASM_ATOMIC_LL_SC_H */ --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -317,4 +317,35 @@ __CMPXCHG_DBL(_mb, al, "memory") #undef __CMPXCHG_DBL +#define __CMPXCHG128(name, mb, cl...) \ +static __always_inline u128 \ +__lse__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ +{ \ + union __u128_halves r, o = { .full = (old) }, \ + n = { .full = (new) }; \ + register unsigned long x0 asm ("x0") = o.low; \ + register unsigned long x1 asm ("x1") = o.high; \ + register unsigned long x2 asm ("x2") = n.low; \ + register unsigned long x3 asm ("x3") = n.high; \ + register unsigned long x4 asm ("x4") = (unsigned long)ptr; \ + \ + asm volatile( \ + __LSE_PREAMBLE \ + " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\ + : [old1] "+&r" (x0), [old2] "+&r" (x1), \ + [v] "+Q" (*(u128 *)ptr) \ + : [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), \ + [oldval1] "r" (o.low), [oldval2] "r" (o.high) \ + : cl); \ + \ + r.low = x0; r.high = x1; \ + \ + return r.full; \ +} + +__CMPXCHG128( , ) +__CMPXCHG128(_mb, al, "memory") + +#undef __CMPXCHG128 + #endif /* __ASM_ATOMIC_LSE_H */ --- a/arch/arm64/include/asm/cmpxchg.h +++ b/arch/arm64/include/asm/cmpxchg.h @@ -146,6 +146,19 @@ __CMPXCHG_DBL(_mb) #undef __CMPXCHG_DBL +#define __CMPXCHG128(name) \ +static inline u128 __cmpxchg128##name(volatile u128 *ptr, \ + u128 old, u128 new) \ +{ \ + return __lse_ll_sc_body(_cmpxchg128##name, \ + ptr, old, new); \ +} + +__CMPXCHG128( ) +__CMPXCHG128(_mb) + +#undef __CMPXCHG128 + #define __CMPXCHG_GEN(sfx) \ static __always_inline unsigned long __cmpxchg##sfx(volatile void *ptr, \ unsigned long old, \ @@ -228,6 +241,19 @@ __CMPXCHG_GEN(_mb) __ret; \ }) +/* cmpxchg128 */ +#define system_has_cmpxchg128() 1 + +#define arch_cmpxchg128(ptr, o, n) \ +({ \ + __cmpxchg128_mb((ptr), (o), (n)); \ +}) + +#define arch_cmpxchg128_local(ptr, o, n) \ +({ \ + __cmpxchg128((ptr), (o), (n)); \ +}) + #define __CMPWAIT_CASE(w, sfx, sz) \ static inline void __cmpwait_case_##sz(volatile void *ptr, \ unsigned long val) \ --- a/arch/s390/include/asm/cmpxchg.h +++ b/arch/s390/include/asm/cmpxchg.h @@ -224,4 +224,18 @@ static __always_inline int __cmpxchg_dou (unsigned long)(n1), (unsigned long)(n2)); \ }) +#define system_has_cmpxchg128() 1 + +static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) +{ + asm volatile( + " cdsg %[old],%[new],%[ptr]\n" + : [old] "+d" (old), [ptr] "+QS" (*ptr) + : [new] "d" (new) + : "memory", "cc"); + return old; +} + +#define arch_cmpxchg128 arch_cmpxchg128 + #endif /* __ASM_CMPXCHG_H */ --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -103,6 +103,7 @@ static inline bool __try_cmpxchg64(volat #endif -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) +#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) +#define system_has_cmpxchg64() boot_cpu_has(X86_FEATURE_CX8) #endif /* _ASM_X86_CMPXCHG_32_H */ --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -20,6 +20,68 @@ arch_try_cmpxchg((ptr), (po), (n)); \ }) -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) +union __u128_halves { + u128 full; + struct { + u64 low, high; + }; +}; + +#define __arch_cmpxchg128(_ptr, _old, _new, _lock) \ +({ \ + union __u128_halves o = { .full = (_old), }, \ + n = { .full = (_new), }; \ + \ + asm volatile(_lock "cmpxchg16b %[ptr]" \ + : [ptr] "+m" (*(_ptr)), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high) \ + : "memory"); \ + \ + o.full; \ +}) + +static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) +{ + return __arch_cmpxchg128(ptr, old, new, LOCK_PREFIX); +} + +static __always_inline u128 arch_cmpxchg128_local(volatile u128 *ptr, u128 old, u128 new) +{ + return __arch_cmpxchg128(ptr, old, new,); +} + +#define __arch_try_cmpxchg128(_ptr, _oldp, _new, _lock) \ +({ \ + union __u128_halves o = { .full = *(_oldp), }, \ + n = { .full = (_new), }; \ + bool ret; \ + \ + asm volatile(_lock "cmpxchg16b %[ptr]" \ + CC_SET(e) \ + : CC_OUT(e) (ret), \ + [ptr] "+m" (*ptr), \ + "+a" (o.low), "+d" (o.high) \ + : "b" (n.low), "c" (n.high) \ + : "memory"); \ + \ + if (unlikely(!ret)) \ + *(_oldp) = o.full; \ + \ + likely(ret); \ +}) + +static __always_inline bool arch_try_cmpxchg128(volatile u128 *ptr, u128 *oldp, u128 new) +{ + return __arch_try_cmpxchg128(ptr, oldp, new, LOCK_PREFIX); +} + +static __always_inline bool arch_try_cmpxchg128_local(volatile u128 *ptr, u128 *oldp, u128 new) +{ + return __arch_try_cmpxchg128(ptr, oldp, new,); +} + +#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) +#define system_has_cmpxchg128() boot_cpu_has(X86_FEATURE_CX16) #endif /* _ASM_X86_CMPXCHG_64_H */ From patchwork Mon May 15 07:57:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 682069 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A20C9C7EE30 for ; Mon, 15 May 2023 08:07:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239918AbjEOIHk (ORCPT ); Mon, 15 May 2023 04:07:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38644 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238550AbjEOIHa (ORCPT ); Mon, 15 May 2023 04:07:30 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CC511A5; Mon, 15 May 2023 01:07:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=RCr9Xn0mKbWganhsRQfOR6jVFCjDlLx/rSxLfiFKpAk=; b=d+Q3WNOaRhK0mj1LuRITzeVvaf IUGC/nzRDUh6SubVu5LMHXgslcA+Kt+w9WLxc1CJgrI/DvmbjYTbVGjYoXL4/7oHpiiRhtZ1ltogj Ai4KCHKJzkavv9NelV+OYkzOPvTUvWyNYNhW3a95xNkNqyXJG5aOP4h2Y/4WSrkkOhu+RwFm5Eiqi k+dC8bMZ7KL0Dw9NG0LJbxdLqFErv1WaMC426/uXywDkDIMDgMr3SEU65fJxW4ItF5lXQl2/zrgQI O3UoOHKOCNf+8ZYv/SqyRtrWEytfLqwmt3qDqBRzsaf6TUaInYa/0xBEkXbiOB4cFhqmra4E5jnbT DTUpnFTA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pyTDk-003HUR-Ep; Mon, 15 May 2023 08:06:17 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 0E9FF30080C; Mon, 15 May 2023 10:06:10 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id B721D202FCEA3; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.181725437@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:03 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 04/11] instrumentation: Wire up cmpxchg128() References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Wire up the cmpxchg128 family in the atomic wrapper scripts. These provide the generic cmpxchg128 family of functions from the arch_ prefixed version, adding explicit instrumentation where needed. Signed-off-by: Peter Zijlstra (Intel) Acked-by: Mark Rutland --- include/linux/atomic/atomic-arch-fallback.h | 95 +++++++++++++++++++++++++++- include/linux/atomic/atomic-instrumented.h | 86 +++++++++++++++++++++++++ scripts/atomic/gen-atomic-fallback.sh | 4 - scripts/atomic/gen-atomic-instrumented.sh | 4 - 4 files changed, 183 insertions(+), 6 deletions(-) --- a/include/linux/atomic/atomic-arch-fallback.h +++ b/include/linux/atomic/atomic-arch-fallback.h @@ -77,6 +77,29 @@ #endif /* arch_cmpxchg64_relaxed */ +#ifndef arch_cmpxchg128_relaxed +#define arch_cmpxchg128_acquire arch_cmpxchg128 +#define arch_cmpxchg128_release arch_cmpxchg128 +#define arch_cmpxchg128_relaxed arch_cmpxchg128 +#else /* arch_cmpxchg128_relaxed */ + +#ifndef arch_cmpxchg128_acquire +#define arch_cmpxchg128_acquire(...) \ + __atomic_op_acquire(arch_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_cmpxchg128_release +#define arch_cmpxchg128_release(...) \ + __atomic_op_release(arch_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_cmpxchg128 +#define arch_cmpxchg128(...) \ + __atomic_op_fence(arch_cmpxchg128, __VA_ARGS__) +#endif + +#endif /* arch_cmpxchg128_relaxed */ + #ifndef arch_try_cmpxchg_relaxed #ifdef arch_try_cmpxchg #define arch_try_cmpxchg_acquire arch_try_cmpxchg @@ -217,6 +240,76 @@ #endif /* arch_try_cmpxchg64_relaxed */ +#ifndef arch_try_cmpxchg128_relaxed +#ifdef arch_try_cmpxchg128 +#define arch_try_cmpxchg128_acquire arch_try_cmpxchg128 +#define arch_try_cmpxchg128_release arch_try_cmpxchg128 +#define arch_try_cmpxchg128_relaxed arch_try_cmpxchg128 +#endif /* arch_try_cmpxchg128 */ + +#ifndef arch_try_cmpxchg128 +#define arch_try_cmpxchg128(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128 */ + +#ifndef arch_try_cmpxchg128_acquire +#define arch_try_cmpxchg128_acquire(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_acquire((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_acquire */ + +#ifndef arch_try_cmpxchg128_release +#define arch_try_cmpxchg128_release(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_release((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_release */ + +#ifndef arch_try_cmpxchg128_relaxed +#define arch_try_cmpxchg128_relaxed(_ptr, _oldp, _new) \ +({ \ + typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \ + ___r = arch_cmpxchg128_relaxed((_ptr), ___o, (_new)); \ + if (unlikely(___r != ___o)) \ + *___op = ___r; \ + likely(___r == ___o); \ +}) +#endif /* arch_try_cmpxchg128_relaxed */ + +#else /* arch_try_cmpxchg128_relaxed */ + +#ifndef arch_try_cmpxchg128_acquire +#define arch_try_cmpxchg128_acquire(...) \ + __atomic_op_acquire(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_try_cmpxchg128_release +#define arch_try_cmpxchg128_release(...) \ + __atomic_op_release(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#ifndef arch_try_cmpxchg128 +#define arch_try_cmpxchg128(...) \ + __atomic_op_fence(arch_try_cmpxchg128, __VA_ARGS__) +#endif + +#endif /* arch_try_cmpxchg128_relaxed */ + #ifndef arch_try_cmpxchg_local #define arch_try_cmpxchg_local(_ptr, _oldp, _new) \ ({ \ @@ -2668,4 +2761,4 @@ arch_atomic64_dec_if_positive(atomic64_t #endif #endif /* _LINUX_ATOMIC_FALLBACK_H */ -// ad2e2b4d168dbc60a73922616047a9bfa446af36 +// 52dfc6fe4a2e7234bbd2aa3e16a377c1db793a53 --- a/include/linux/atomic/atomic-instrumented.h +++ b/include/linux/atomic/atomic-instrumented.h @@ -2034,6 +2034,36 @@ atomic_long_dec_if_positive(atomic_long_ arch_cmpxchg64_relaxed(__ai_ptr, __VA_ARGS__); \ }) +#define cmpxchg128(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + kcsan_mb(); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_acquire(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_acquire(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_release(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + kcsan_release(); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_release(__ai_ptr, __VA_ARGS__); \ +}) + +#define cmpxchg128_relaxed(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_relaxed(__ai_ptr, __VA_ARGS__); \ +}) + #define try_cmpxchg(ptr, oldp, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2110,6 +2140,44 @@ atomic_long_dec_if_positive(atomic_long_ arch_try_cmpxchg64_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \ }) +#define try_cmpxchg128(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + kcsan_mb(); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_acquire(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_acquire(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_release(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + kcsan_release(); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_release(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + +#define try_cmpxchg128_relaxed(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + #define cmpxchg_local(ptr, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2124,6 +2192,13 @@ atomic_long_dec_if_positive(atomic_long_ arch_cmpxchg64_local(__ai_ptr, __VA_ARGS__); \ }) +#define cmpxchg128_local(ptr, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + arch_cmpxchg128_local(__ai_ptr, __VA_ARGS__); \ +}) + #define sync_cmpxchg(ptr, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2150,6 +2225,15 @@ atomic_long_dec_if_positive(atomic_long_ arch_try_cmpxchg64_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \ }) +#define try_cmpxchg128_local(ptr, oldp, ...) \ +({ \ + typeof(ptr) __ai_ptr = (ptr); \ + typeof(oldp) __ai_oldp = (oldp); \ + instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \ + instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \ + arch_try_cmpxchg128_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \ +}) + #define cmpxchg_double(ptr, ...) \ ({ \ typeof(ptr) __ai_ptr = (ptr); \ @@ -2167,4 +2251,4 @@ atomic_long_dec_if_positive(atomic_long_ }) #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */ -// 6b513a42e1a1b5962532a019b7fc91eaa044ad5e +// 82d1be694fab30414527d0877c29fa75ed5a0b74 --- a/scripts/atomic/gen-atomic-fallback.sh +++ b/scripts/atomic/gen-atomic-fallback.sh @@ -217,11 +217,11 @@ cat << EOF EOF -for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64"; do +for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64" "arch_cmpxchg128"; do gen_xchg_fallbacks "${xchg}" done -for cmpxchg in "cmpxchg" "cmpxchg64"; do +for cmpxchg in "cmpxchg" "cmpxchg64" "cmpxchg128"; do gen_try_cmpxchg_fallbacks "${cmpxchg}" done --- a/scripts/atomic/gen-atomic-instrumented.sh +++ b/scripts/atomic/gen-atomic-instrumented.sh @@ -166,14 +166,14 @@ grep '^[a-z]' "$1" | while read name met done -for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg" "try_cmpxchg64"; do +for xchg in "xchg" "cmpxchg" "cmpxchg64" "cmpxchg128" "try_cmpxchg" "try_cmpxchg64" "try_cmpxchg128"; do for order in "" "_acquire" "_release" "_relaxed"; do gen_xchg "${xchg}" "${order}" "" printf "\n" done done -for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" ; do +for xchg in "cmpxchg_local" "cmpxchg64_local" "cmpxchg128_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" "try_cmpxchg128_local"; do gen_xchg "${xchg}" "" "" printf "\n" done From patchwork Mon May 15 07:57:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 682073 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DD56C7EE2F for ; Mon, 15 May 2023 08:07:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233460AbjEOIHZ (ORCPT ); Mon, 15 May 2023 04:07:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38478 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230288AbjEOIHX (ORCPT ); Mon, 15 May 2023 04:07:23 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9FFDA1731; Mon, 15 May 2023 01:07:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=3RRGLs3vZsBA3WquAm9CYBrs3TiBBk1z4f/JSTl06ko=; b=C7Q+9vEV1QZqlrWrdiRS/4ZaOo C4QaD94dYLKM4rVSpBMuM4MPCUEwWG2wlZb7z7L+ytDDmEzthuIXFw5s4xhl1Zhwa8PHyegz3LsWv sX9yk6m2RhCvKI4eawfr4xg6NlwXbIHLgql/4Hh2IAyjJ5EKkfn282tlp6HqkTBBJgH+es2+j28ap mwmpcCi64hREGLbEv+l39wIXLA29adUS/A2oGsp2yAeCyreAL19bzS9x+aYuSwanUcEPwpYAb2/WQ Q6Uzz57SiaFlBid1/0NAHDSF0uzCiEKR6fHXhBcPEjn8/4auGrU3RiX6Ur+Re+rAmPdmvbZq7AL9R kV+s3owA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pyTDl-00BQN3-1H; Mon, 15 May 2023 08:06:17 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 8708C300ECD; Mon, 15 May 2023 10:06:15 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id BBF7D202FCEA5; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.248739380@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:04 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 05/11] percpu: Wire up cmpxchg128 References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org In order to replace cmpxchg_double() with the newly minted cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg(). Signed-off-by: Peter Zijlstra (Intel) --- arch/arm64/include/asm/percpu.h | 20 +++++++++++++ arch/s390/include/asm/percpu.h | 16 ++++++++++ arch/x86/include/asm/percpu.h | 59 ++++++++++++++++++++++++++++++++++++++++ include/asm-generic/percpu.h | 16 ++++++++++ 4 files changed, 111 insertions(+) --- a/arch/arm64/include/asm/percpu.h +++ b/arch/arm64/include/asm/percpu.h @@ -140,6 +140,10 @@ PERCPU_RET_OP(add, add, ldadd) * re-enabling preemption for preemptible kernels, but doing that in a way * which builds inside a module would mean messing directly with the preempt * count. If you do this, peterz and tglx will hunt you down. + * + * Not to mention it'll break the actual preemption model for missing a + * preemption point when TIF_NEED_RESCHED gets set while preemption is + * disabled. */ #define this_cpu_cmpxchg_double_8(ptr1, ptr2, o1, o2, n1, n2) \ ({ \ @@ -240,6 +244,22 @@ PERCPU_RET_OP(add, add, ldadd) #define this_cpu_cmpxchg_8(pcp, o, n) \ _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) +#define this_cpu_cmpxchg64(pcp, o, n) this_cpu_cmpxchg_8(pcp, o, n) + +#define this_cpu_cmpxchg128(pcp, o, n) \ +({ \ + typedef typeof(pcp) pcp_op_T__; \ + u128 old__, new__, ret__; \ + pcp_op_T__ *ptr__; \ + old__ = o; \ + new__ = n; \ + preempt_disable_notrace(); \ + ptr__ = raw_cpu_ptr(&(pcp)); \ + ret__ = cmpxchg128_local((void *)ptr__, old__, new__); \ + preempt_enable_notrace(); \ + ret__; \ +}) + #ifdef __KVM_NVHE_HYPERVISOR__ extern unsigned long __hyp_per_cpu_offset(unsigned int cpu); #define __per_cpu_offset --- a/arch/s390/include/asm/percpu.h +++ b/arch/s390/include/asm/percpu.h @@ -148,6 +148,22 @@ #define this_cpu_cmpxchg_4(pcp, oval, nval) arch_this_cpu_cmpxchg(pcp, oval, nval) #define this_cpu_cmpxchg_8(pcp, oval, nval) arch_this_cpu_cmpxchg(pcp, oval, nval) +#define this_cpu_cmpxchg64(pcp, o, n) this_cpu_cmpxchg_8(pcp, o, n) + +#define this_cpu_cmpxchg128(pcp, oval, nval) \ +({ \ + typedef typeof(pcp) pcp_op_T__; \ + u128 old__, new__, ret__; \ + pcp_op_T__ *ptr__; \ + old__ = oval; \ + new__ = nval; \ + preempt_disable_notrace(); \ + ptr__ = raw_cpu_ptr(&(pcp)); \ + ret__ = cmpxchg128((void *)ptr__, old__, new__); \ + preempt_enable_notrace(); \ + ret__; \ +}) + #define arch_this_cpu_xchg(pcp, nval) \ ({ \ typeof(pcp) *ptr__; \ --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -210,6 +210,65 @@ do { \ (typeof(_var))(unsigned long) pco_old__; \ }) +#if defined(CONFIG_X86_32) && defined(CONFIG_X86_CMPXCHG64) +#define percpu_cmpxchg64_op(size, qual, _var, _oval, _nval) \ +({ \ + union { \ + u64 var; \ + struct { \ + u32 low, high; \ + }; \ + } old__, new__; \ + \ + old__.var = _oval; \ + new__.var = _nval; \ + \ + asm qual ("cmpxchg8b " __percpu_arg([var]) \ + : [var] "+m" (_var), \ + "+a" (old__.low), \ + "+d" (old__.high) \ + : "b" (new__.low), \ + "c" (new__.high) \ + : "memory"); \ + \ + old__.var; \ +}) + +#define raw_cpu_cmpxchg64(pcp, oval, nval) percpu_cmpxchg64_op(8, , pcp, oval, nval) +#define this_cpu_cmpxchg64(pcp, oval, nval) percpu_cmpxchg64_op(8, volatile, pcp, oval, nval) +#endif + +#ifdef CONFIG_X86_64 +#define raw_cpu_cmpxchg64(pcp, oval, nval) percpu_cmpxchg_op(8, , pcp, oval, nval); +#define this_cpu_cmpxchg64(pcp, oval, nval) percpu_cmpxchg_op(8, volatile, pcp, oval, nval); + +#define percpu_cmpxchg128_op(size, qual, _var, _oval, _nval) \ +({ \ + union { \ + u128 var; \ + struct { \ + u64 low, high; \ + }; \ + } old__, new__; \ + \ + old__.var = _oval; \ + new__.var = _nval; \ + \ + asm qual ("cmpxchg16b " __percpu_arg([var]) \ + : [var] "+m" (_var), \ + "+a" (old__.low), \ + "+d" (old__.high) \ + : "b" (new__.low), \ + "c" (new__.high) \ + : "memory"); \ + \ + old__.var; \ +}) + +#define raw_cpu_cmpxchg128(pcp, oval, nval) percpu_cmpxchg128_op(16, , pcp, oval, nval) +#define this_cpu_cmpxchg128(pcp, oval, nval) percpu_cmpxchg128_op(16, volatile, pcp, oval, nval) +#endif + /* * this_cpu_read() makes gcc load the percpu variable every time it is * accessed while this_cpu_read_stable() allows the value to be cached. --- a/include/asm-generic/percpu.h +++ b/include/asm-generic/percpu.h @@ -298,6 +298,14 @@ do { \ #define raw_cpu_cmpxchg_8(pcp, oval, nval) \ raw_cpu_generic_cmpxchg(pcp, oval, nval) #endif +#ifndef raw_cpu_cmpxchg64 +#define raw_cpu_cmpxchg64(pcp, oval, nval) \ + raw_cpu_generic_cmpxchg(pcp, oval, nval) +#endif +#ifndef raw_cpu_cmpxchg128 +#define raw_cpu_cmpxchg128(pcp, oval, nval) \ + raw_cpu_generic_cmpxchg(pcp, oval, nval) +#endif #ifndef raw_cpu_cmpxchg_double_1 #define raw_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ @@ -423,6 +431,14 @@ do { \ #define this_cpu_cmpxchg_8(pcp, oval, nval) \ this_cpu_generic_cmpxchg(pcp, oval, nval) #endif +#ifndef this_cpu_cmpxchg64 +#define this_cpu_cmpxchg64(pcp, oval, nval) \ + this_cpu_generic_cmpxchg(pcp, oval, nval) +#endif +#ifndef this_cpu_cmpxchg128 +#define this_cpu_cmpxchg128(pcp, oval, nval) \ + this_cpu_generic_cmpxchg(pcp, oval, nval) +#endif #ifndef this_cpu_cmpxchg_double_1 #define this_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ From patchwork Mon May 15 07:57:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 684079 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E2D2C7EE2E for ; Mon, 15 May 2023 08:07:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239252AbjEOIHd (ORCPT ); Mon, 15 May 2023 04:07:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38536 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235190AbjEOIH0 (ORCPT ); Mon, 15 May 2023 04:07:26 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DB854170B; Mon, 15 May 2023 01:07:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=rgZ7hZWkHDtTpF84Xw00Ifyh/7yejqO/hjYHP4KrVyk=; b=VA/C0CmchcfXpbAczSsPgonOyt 2aUt/agGqPFIFpZ6l2I6Qg5nqdJuz9IoNMj9V/ozfrCxRR6fb5YEf1Ga0kiJRyftM4e6pWjqWSKkc dIb0BRa5UE1aFif3qoM2MZ9k3ki3NvKxYPbB/R6/S547Pj10WkYXT/Q4uPhD0UtHAZXSrtLCyzGlN kb1o1hAHxW8lgC6PteRUH9iYuSUB3Gmflm7Y6x74H7yhLbWGQ2MymwpDc7/LOZXgNN8aLC7xVvmQ7 r4CZEX4BQYELzBYNeKfUDIJMu3CGkBsiIghUyTmTjEZk9+BAmwahfI7LbKKQJ31UK8rF4TV1yz3Wq y8BpnERg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pyTDl-003HUW-Ez; Mon, 15 May 2023 08:06:17 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 872123033CC; Mon, 15 May 2023 10:06:15 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id C2169202FCEA4; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.315901115@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:05 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org, Vasant Hegde Subject: [PATCH v3 06/11] x86,amd_iommu: Replace cmpxchg_double() References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Signed-off-by: Peter Zijlstra (Intel) Tested-by: Vasant Hegde --- drivers/iommu/amd/amd_iommu_types.h | 9 +++++++-- drivers/iommu/amd/iommu.c | 10 ++++------ 2 files changed, 11 insertions(+), 8 deletions(-) --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -986,8 +986,13 @@ union irte_ga_hi { }; struct irte_ga { - union irte_ga_lo lo; - union irte_ga_hi hi; + union { + struct { + union irte_ga_lo lo; + union irte_ga_hi hi; + }; + u128 irte; + }; }; struct irq_2_irte { --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3003,10 +3003,10 @@ static int alloc_irq_index(struct amd_io static int modify_irte_ga(struct amd_iommu *iommu, u16 devid, int index, struct irte_ga *irte, struct amd_ir_data *data) { - bool ret; struct irq_remap_table *table; - unsigned long flags; struct irte_ga *entry; + unsigned long flags; + u128 old; table = get_irq_table(iommu, devid); if (!table) @@ -3017,16 +3017,14 @@ static int modify_irte_ga(struct amd_iom entry = (struct irte_ga *)table->table; entry = &entry[index]; - ret = cmpxchg_double(&entry->lo.val, &entry->hi.val, - entry->lo.val, entry->hi.val, - irte->lo.val, irte->hi.val); /* * We use cmpxchg16 to atomically update the 128-bit IRTE, * and it cannot be updated by the hardware or other processors * behind us, so the return value of cmpxchg16 should be the * same as the old value. */ - WARN_ON(!ret); + old = entry->irte; + WARN_ON(!try_cmpxchg128(&entry->irte, &old, irte->irte)); if (data) data->ref = entry; From patchwork Mon May 15 07:57:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 682072 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B20B2C7EE24 for ; Mon, 15 May 2023 08:07:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238839AbjEOIHb (ORCPT ); Mon, 15 May 2023 04:07:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38512 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231400AbjEOIH0 (ORCPT ); Mon, 15 May 2023 04:07:26 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AA6EF1735; Mon, 15 May 2023 01:07:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=/ezz9F43epJmjX+2+q2EuT6G37yn9BJ4CMV+iiE+KIk=; b=EaiXXa+JLNkopxcm154uvINSoT XXhhxSNYI9Dj8TJsMePYGsmPawAk4PgxqA24LoC8ToY8T+UvgX6Mt1Iok43XBxw17lZCw5OZdQ2tQ ZaaSSGuhzCCsuKXfTSd/sC/q9v1c7d0frobCxDxfEp+AjNlpv5Qz2smjLvBY1tflTjx41hwRaKkev GYei11HqZz4Lpne0uwvABdXqI3PMUjfJYYIBLxOseh77Y/9EJ3pflT5Yk/oyQZ6FSRQ1NCjm4r6UF NHWZLaCOz+6hbMv+Kar15kRPnKaI+ssxzGQ048NySZiovPqlPKgHK6tfNCL3XXal7p8b7Oo4dmUEi /ipC8uhw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pyTDl-00BQN4-1M; Mon, 15 May 2023 08:06:17 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 87106302FB8; Mon, 15 May 2023 10:06:15 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id C75F2202FCEA7; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.382778664@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:06 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 07/11] x86,intel_iommu: Replace cmpxchg_double() References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Lu Baolu --- drivers/iommu/intel/irq_remapping.c | 8 -- include/linux/dmar.h | 125 +++++++++++++++++++----------------- 2 files changed, 68 insertions(+), 65 deletions(-) --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -174,18 +174,14 @@ static int modify_irte(struct irq_2_iomm irte = &iommu->ir_table->base[index]; if ((irte->pst == 1) || (irte_modified->pst == 1)) { - bool ret; - - ret = cmpxchg_double(&irte->low, &irte->high, - irte->low, irte->high, - irte_modified->low, irte_modified->high); /* * We use cmpxchg16 to atomically update the 128-bit IRTE, * and it cannot be updated by the hardware or other processors * behind us, so the return value of cmpxchg16 should be the * same as the old value. */ - WARN_ON(!ret); + u128 old = irte->irte; + WARN_ON(!try_cmpxchg128(&irte->irte, &old, irte_modified->irte)); } else { WRITE_ONCE(irte->low, irte_modified->low); WRITE_ONCE(irte->high, irte_modified->high); --- a/include/linux/dmar.h +++ b/include/linux/dmar.h @@ -201,67 +201,74 @@ static inline void detect_intel_iommu(vo struct irte { union { - /* Shared between remapped and posted mode*/ struct { - __u64 present : 1, /* 0 */ - fpd : 1, /* 1 */ - __res0 : 6, /* 2 - 6 */ - avail : 4, /* 8 - 11 */ - __res1 : 3, /* 12 - 14 */ - pst : 1, /* 15 */ - vector : 8, /* 16 - 23 */ - __res2 : 40; /* 24 - 63 */ + union { + /* Shared between remapped and posted mode*/ + struct { + __u64 present : 1, /* 0 */ + fpd : 1, /* 1 */ + __res0 : 6, /* 2 - 6 */ + avail : 4, /* 8 - 11 */ + __res1 : 3, /* 12 - 14 */ + pst : 1, /* 15 */ + vector : 8, /* 16 - 23 */ + __res2 : 40; /* 24 - 63 */ + }; + + /* Remapped mode */ + struct { + __u64 r_present : 1, /* 0 */ + r_fpd : 1, /* 1 */ + dst_mode : 1, /* 2 */ + redir_hint : 1, /* 3 */ + trigger_mode : 1, /* 4 */ + dlvry_mode : 3, /* 5 - 7 */ + r_avail : 4, /* 8 - 11 */ + r_res0 : 4, /* 12 - 15 */ + r_vector : 8, /* 16 - 23 */ + r_res1 : 8, /* 24 - 31 */ + dest_id : 32; /* 32 - 63 */ + }; + + /* Posted mode */ + struct { + __u64 p_present : 1, /* 0 */ + p_fpd : 1, /* 1 */ + p_res0 : 6, /* 2 - 7 */ + p_avail : 4, /* 8 - 11 */ + p_res1 : 2, /* 12 - 13 */ + p_urgent : 1, /* 14 */ + p_pst : 1, /* 15 */ + p_vector : 8, /* 16 - 23 */ + p_res2 : 14, /* 24 - 37 */ + pda_l : 26; /* 38 - 63 */ + }; + __u64 low; + }; + + union { + /* Shared between remapped and posted mode*/ + struct { + __u64 sid : 16, /* 64 - 79 */ + sq : 2, /* 80 - 81 */ + svt : 2, /* 82 - 83 */ + __res3 : 44; /* 84 - 127 */ + }; + + /* Posted mode*/ + struct { + __u64 p_sid : 16, /* 64 - 79 */ + p_sq : 2, /* 80 - 81 */ + p_svt : 2, /* 82 - 83 */ + p_res3 : 12, /* 84 - 95 */ + pda_h : 32; /* 96 - 127 */ + }; + __u64 high; + }; }; - - /* Remapped mode */ - struct { - __u64 r_present : 1, /* 0 */ - r_fpd : 1, /* 1 */ - dst_mode : 1, /* 2 */ - redir_hint : 1, /* 3 */ - trigger_mode : 1, /* 4 */ - dlvry_mode : 3, /* 5 - 7 */ - r_avail : 4, /* 8 - 11 */ - r_res0 : 4, /* 12 - 15 */ - r_vector : 8, /* 16 - 23 */ - r_res1 : 8, /* 24 - 31 */ - dest_id : 32; /* 32 - 63 */ - }; - - /* Posted mode */ - struct { - __u64 p_present : 1, /* 0 */ - p_fpd : 1, /* 1 */ - p_res0 : 6, /* 2 - 7 */ - p_avail : 4, /* 8 - 11 */ - p_res1 : 2, /* 12 - 13 */ - p_urgent : 1, /* 14 */ - p_pst : 1, /* 15 */ - p_vector : 8, /* 16 - 23 */ - p_res2 : 14, /* 24 - 37 */ - pda_l : 26; /* 38 - 63 */ - }; - __u64 low; - }; - - union { - /* Shared between remapped and posted mode*/ - struct { - __u64 sid : 16, /* 64 - 79 */ - sq : 2, /* 80 - 81 */ - svt : 2, /* 82 - 83 */ - __res3 : 44; /* 84 - 127 */ - }; - - /* Posted mode*/ - struct { - __u64 p_sid : 16, /* 64 - 79 */ - p_sq : 2, /* 80 - 81 */ - p_svt : 2, /* 82 - 83 */ - p_res3 : 12, /* 84 - 95 */ - pda_h : 32; /* 96 - 127 */ - }; - __u64 high; +#ifdef CONFIG_IRQ_REMAP + __u128 irte; +#endif }; }; From patchwork Mon May 15 07:57:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 684076 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88907C7EE26 for ; Mon, 15 May 2023 08:07:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239935AbjEOIHl (ORCPT ); Mon, 15 May 2023 04:07:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38642 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238556AbjEOIHa (ORCPT ); Mon, 15 May 2023 04:07:30 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CCEA1737; Mon, 15 May 2023 01:07:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=1Hyxt0TX1t/0uiattJSuj2ZTNM+BZCXyokp3fCNZGCs=; b=pfsV7Qbq81Z0IkI+Xvza6f76LG geyP1yqSpQ0cdv4M5vnGLyJVao4N1TuqaCz9CzumRH4FSACCzlrD9SnOzgfbEEzEvlPz+a1s+R1SD XU6IhOkvFQJ93yyWdsEupR2Mqx52rhYjF0Seid0p6zqtjuBqiFf3cZqv6TBGDQ8ShnrRoZ97CPad8 +l+HuRdyiowbD/zkPhjH4ZDmYadt9tcyGKRYa7V3/pNYx2IcgbVK+CnGZ/uUl0ZtJbAq33shpR5c0 1ZAI8rrHM/oDGodef2VoDHWu43MSu7ivZC62mvy5lcoU6nvfYdh41GMGxjhG5dmJecYvR0ryroQ/Y r3AbSCuA==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pyTDl-003HUT-CX; Mon, 15 May 2023 08:06:17 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 870AD302680; Mon, 15 May 2023 10:06:15 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id CBFAF202FCEA6; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.453785148@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:07 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 08/11] slub: Replace cmpxchg_double() References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Signed-off-by: Peter Zijlstra (Intel) Acked-by: Vlastimil Babka Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> --- include/linux/slub_def.h | 12 ++- mm/slab.h | 49 ++++++++++++++-- mm/slub.c | 143 ++++++++++++++++++++++++++++------------------- 3 files changed, 138 insertions(+), 66 deletions(-) --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -39,7 +39,8 @@ enum stat_item { CPU_PARTIAL_FREE, /* Refill cpu partial on free */ CPU_PARTIAL_NODE, /* Refill cpu partial from node partial */ CPU_PARTIAL_DRAIN, /* Drain cpu partial to node partial */ - NR_SLUB_STAT_ITEMS }; + NR_SLUB_STAT_ITEMS +}; #ifndef CONFIG_SLUB_TINY /* @@ -47,8 +48,13 @@ enum stat_item { * with this_cpu_cmpxchg_double() alignment requirements. */ struct kmem_cache_cpu { - void **freelist; /* Pointer to next available object */ - unsigned long tid; /* Globally unique transaction id */ + union { + struct { + void **freelist; /* Pointer to next available object */ + unsigned long tid; /* Globally unique transaction id */ + }; + freelist_aba_t freelist_tid; + }; struct slab *slab; /* The slab from which we are allocating */ #ifdef CONFIG_SLUB_CPU_PARTIAL struct slab *partial; /* Partially allocated frozen slabs */ --- a/mm/slab.h +++ b/mm/slab.h @@ -6,6 +6,38 @@ */ void __init kmem_cache_init(void); +#ifdef CONFIG_HAVE_ALIGNED_STRUCT_PAGE +#ifdef CONFIG_64BIT +# ifdef system_has_cmpxchg128 +# define system_has_freelist_aba() system_has_cmpxchg128() +# define try_cmpxchg_freelist try_cmpxchg128 +# define this_cpu_cmpxchg_freelist this_cpu_cmpxchg128 +typedef u128 freelist_full_t; +# endif +#else /* CONFIG_64BIT */ +# ifdef system_has_cmpxchg64 +# define system_has_freelist_aba() system_has_cmpxchg64() +# define try_cmpxchg_freelist try_cmpxchg64 +# define this_cpu_cmpxchg_freelist this_cpu_cmpxchg64 +typedef u64 freelist_full_t; +# endif +#endif /* CONFIG_64BIT */ +#endif /* CONFIG_HAVE_ALIGNED_STRUCT_PAGE */ + +/* + * Freelist pointer and counter to cmpxchg together, avoids the typical ABA + * problems with cmpxchg of just a pointer. + */ +typedef union { +#ifdef system_has_freelist_aba + struct { + void *freelist; + unsigned long counter; + }; + freelist_full_t full; +#endif +} freelist_aba_t; + /* Reuses the bits in struct page */ struct slab { unsigned long __page_flags; @@ -38,14 +70,19 @@ struct slab { #endif }; /* Double-word boundary */ - void *freelist; /* first free object */ union { - unsigned long counters; struct { - unsigned inuse:16; - unsigned objects:15; - unsigned frozen:1; + void *freelist; /* first free object */ + union { + unsigned long counters; + struct { + unsigned inuse:16; + unsigned objects:15; + unsigned frozen:1; + }; + }; }; + freelist_aba_t freelist_counter; }; }; struct rcu_head rcu_head; @@ -72,7 +109,7 @@ SLAB_MATCH(memcg_data, memcg_data); #endif #undef SLAB_MATCH static_assert(sizeof(struct slab) <= sizeof(struct page)); -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && defined(CONFIG_SLUB) +#if defined(system_has_freelist_aba) && defined(CONFIG_SLUB) static_assert(IS_ALIGNED(offsetof(struct slab, freelist), 2*sizeof(void *))); #endif --- a/mm/slub.c +++ b/mm/slub.c @@ -292,7 +292,12 @@ static inline bool kmem_cache_has_cpu_pa /* Poison object */ #define __OBJECT_POISON ((slab_flags_t __force)0x80000000U) /* Use cmpxchg_double */ + +#ifdef system_has_freelist_aba #define __CMPXCHG_DOUBLE ((slab_flags_t __force)0x40000000U) +#else +#define __CMPXCHG_DOUBLE ((slab_flags_t __force)0U) +#endif /* * Tracking user of a slab. @@ -512,6 +517,40 @@ static __always_inline void slab_unlock( __bit_spin_unlock(PG_locked, &page->flags); } +static inline bool +__update_freelist_fast(struct slab *slab, + void *freelist_old, unsigned long counters_old, + void *freelist_new, unsigned long counters_new) +{ +#ifdef system_has_freelist_aba + freelist_aba_t old = { .freelist = freelist_old, .counter = counters_old }; + freelist_aba_t new = { .freelist = freelist_new, .counter = counters_new }; + + return try_cmpxchg_freelist(&slab->freelist_counter.full, &old.full, new.full); +#else + return false; +#endif +} + +static inline bool +__update_freelist_slow(struct slab *slab, + void *freelist_old, unsigned long counters_old, + void *freelist_new, unsigned long counters_new) +{ + bool ret = false; + + slab_lock(slab); + if (slab->freelist == freelist_old && + slab->counters == counters_old) { + slab->freelist = freelist_new; + slab->counters = counters_new; + ret = true; + } + slab_unlock(slab); + + return ret; +} + /* * Interrupts must be disabled (for the fallback code to work right), typically * by an _irqsave() lock variant. On PREEMPT_RT the preempt_disable(), which is @@ -519,33 +558,25 @@ static __always_inline void slab_unlock( * allocation/ free operation in hardirq context. Therefore nothing can * interrupt the operation. */ -static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab, +static inline bool __slab_update_freelist(struct kmem_cache *s, struct slab *slab, void *freelist_old, unsigned long counters_old, void *freelist_new, unsigned long counters_new, const char *n) { + bool ret; + if (USE_LOCKLESS_FAST_PATH()) lockdep_assert_irqs_disabled(); -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ - defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) + if (s->flags & __CMPXCHG_DOUBLE) { - if (cmpxchg_double(&slab->freelist, &slab->counters, - freelist_old, counters_old, - freelist_new, counters_new)) - return true; - } else -#endif - { - slab_lock(slab); - if (slab->freelist == freelist_old && - slab->counters == counters_old) { - slab->freelist = freelist_new; - slab->counters = counters_new; - slab_unlock(slab); - return true; - } - slab_unlock(slab); + ret = __update_freelist_fast(slab, freelist_old, counters_old, + freelist_new, counters_new); + } else { + ret = __update_freelist_slow(slab, freelist_old, counters_old, + freelist_new, counters_new); } + if (likely(ret)) + return true; cpu_relax(); stat(s, CMPXCHG_DOUBLE_FAIL); @@ -557,36 +588,26 @@ static inline bool __cmpxchg_double_slab return false; } -static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct slab *slab, +static inline bool slab_update_freelist(struct kmem_cache *s, struct slab *slab, void *freelist_old, unsigned long counters_old, void *freelist_new, unsigned long counters_new, const char *n) { -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ - defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) + bool ret; + if (s->flags & __CMPXCHG_DOUBLE) { - if (cmpxchg_double(&slab->freelist, &slab->counters, - freelist_old, counters_old, - freelist_new, counters_new)) - return true; - } else -#endif - { + ret = __update_freelist_fast(slab, freelist_old, counters_old, + freelist_new, counters_new); + } else { unsigned long flags; local_irq_save(flags); - slab_lock(slab); - if (slab->freelist == freelist_old && - slab->counters == counters_old) { - slab->freelist = freelist_new; - slab->counters = counters_new; - slab_unlock(slab); - local_irq_restore(flags); - return true; - } - slab_unlock(slab); + ret = __update_freelist_slow(slab, freelist_old, counters_old, + freelist_new, counters_new); local_irq_restore(flags); } + if (likely(ret)) + return true; cpu_relax(); stat(s, CMPXCHG_DOUBLE_FAIL); @@ -2228,7 +2249,7 @@ static inline void *acquire_slab(struct VM_BUG_ON(new.frozen); new.frozen = 1; - if (!__cmpxchg_double_slab(s, slab, + if (!__slab_update_freelist(s, slab, freelist, counters, new.freelist, new.counters, "acquire_slab")) @@ -2554,7 +2575,7 @@ static void deactivate_slab(struct kmem_ } - if (!cmpxchg_double_slab(s, slab, + if (!slab_update_freelist(s, slab, old.freelist, old.counters, new.freelist, new.counters, "unfreezing slab")) { @@ -2611,7 +2632,7 @@ static void __unfreeze_partials(struct k new.frozen = 0; - } while (!__cmpxchg_double_slab(s, slab, + } while (!__slab_update_freelist(s, slab, old.freelist, old.counters, new.freelist, new.counters, "unfreezing slab")); @@ -3008,6 +3029,22 @@ static inline bool pfmemalloc_match(stru } #ifndef CONFIG_SLUB_TINY +static inline bool +__update_cpu_freelist_fast(struct kmem_cache *s, + void *freelist_old, void *freelist_new, + unsigned long tid) +{ +#ifdef system_has_freelist_aba + freelist_aba_t old = { .freelist = freelist_old, .counter = tid }; + freelist_aba_t new = { .freelist = freelist_new, .counter = next_tid(tid) }; + + return this_cpu_cmpxchg_freelist(s->cpu_slab->freelist_tid.full, + old.full, new.full) == old.full; +#else + return false; +#endif +} + /* * Check the slab->freelist and either transfer the freelist to the * per cpu freelist or deactivate the slab. @@ -3034,7 +3071,7 @@ static inline void *get_freelist(struct new.inuse = slab->objects; new.frozen = freelist != NULL; - } while (!__cmpxchg_double_slab(s, slab, + } while (!__slab_update_freelist(s, slab, freelist, counters, NULL, new.counters, "get_freelist")); @@ -3359,11 +3396,7 @@ static __always_inline void *__slab_allo * against code executing on this cpu *not* from access by * other cpus. */ - if (unlikely(!this_cpu_cmpxchg_double( - s->cpu_slab->freelist, s->cpu_slab->tid, - object, tid, - next_object, next_tid(tid)))) { - + if (unlikely(!__update_cpu_freelist_fast(s, object, next_object, tid))) { note_cmpxchg_failure("slab_alloc", s, tid); goto redo; } @@ -3631,7 +3664,7 @@ static void __slab_free(struct kmem_cach } } - } while (!cmpxchg_double_slab(s, slab, + } while (!slab_update_freelist(s, slab, prior, counters, head, new.counters, "__slab_free")); @@ -3736,11 +3769,7 @@ static __always_inline void do_slab_free set_freepointer(s, tail_obj, freelist); - if (unlikely(!this_cpu_cmpxchg_double( - s->cpu_slab->freelist, s->cpu_slab->tid, - freelist, tid, - head, next_tid(tid)))) { - + if (unlikely(!__update_cpu_freelist_fast(s, freelist, head, tid))) { note_cmpxchg_failure("slab_free", s, tid); goto redo; } @@ -4505,11 +4534,11 @@ static int kmem_cache_open(struct kmem_c } } -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \ - defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE) - if (system_has_cmpxchg_double() && (s->flags & SLAB_NO_CMPXCHG) == 0) +#ifdef system_has_freelist_aba + if (system_has_freelist_aba() && !(s->flags & SLAB_NO_CMPXCHG)) { /* Enable fast mode */ s->flags |= __CMPXCHG_DOUBLE; + } #endif /* From patchwork Mon May 15 07:57:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 684080 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0880DC77B7D for ; Mon, 15 May 2023 08:07:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236019AbjEOIH0 (ORCPT ); Mon, 15 May 2023 04:07:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38502 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233752AbjEOIHZ (ORCPT ); Mon, 15 May 2023 04:07:25 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 682DB1706; Mon, 15 May 2023 01:07:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=4kaqfhENJznU2Mt7hBhEZrEbaoSBpAbVRLS0RsxQHdk=; b=XLXPvYSK56u1FQB+k+bhon/4pv zSD0YSZlWXiZHPUqIUl6Yyjf5NB4UWFAN3n7Oyp5hyeiOUtKZ9mBZT9y0+I9Onvas22ORYZ6Y3maD /QyoXUY4K27SRULt1VqM5bU6j/J3FLtlzDzZjEbqJpQgQUxuBK0FWXTF3USAeGyIQx0wTYpj+p7lO J/i/ZJRbdXbF8MlsCfnRwwzoTdOCq+IXrVCfsrCDR2uvOke13V5Dr4oE/McXFo/bEOt60OgXbLqEc XeXVrkZaRUSBBo2VPSQDMRDARGlBnk/FVOdYNtIj6j7ZjCFKOzNR7PzQQ9P/94djJbRTqJ9LyTCal lulzxa5w==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pyTDl-00BQN5-1u; Mon, 15 May 2023 08:06:17 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 870D5302DA8; Mon, 15 May 2023 10:06:15 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id CF2C4202FCEAA; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.520976397@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:08 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 09/11] mm/slub: Fold slab_update_freelist() References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org The two functions slab_update_freelist() and __slab_update_freelist() are nearly identical, fold and add a boolean argument and rely on constant propagation. Signed-off-by: Peter Zijlstra (Intel) --- mm/slub.c | 80 +++++++++++++++++++++----------------------------------------- 1 file changed, 28 insertions(+), 52 deletions(-) --- a/mm/slub.c +++ b/mm/slub.c @@ -559,53 +559,29 @@ __update_freelist_slow(struct slab *slab * allocation/ free operation in hardirq context. Therefore nothing can * interrupt the operation. */ -static inline bool __slab_update_freelist(struct kmem_cache *s, struct slab *slab, - void *freelist_old, unsigned long counters_old, - void *freelist_new, unsigned long counters_new, - const char *n) +static __always_inline +bool slab_update_freelist(struct kmem_cache *s, struct slab *slab, + void *freelist_old, unsigned long counters_old, + void *freelist_new, unsigned long counters_new, + bool irq_save, const char *n) { bool ret; - if (USE_LOCKLESS_FAST_PATH()) + if (!irq_save && USE_LOCKLESS_FAST_PATH()) lockdep_assert_irqs_disabled(); if (s->flags & __CMPXCHG_DOUBLE) { ret = __update_freelist_fast(slab, freelist_old, counters_old, freelist_new, counters_new); } else { - ret = __update_freelist_slow(slab, freelist_old, counters_old, - freelist_new, counters_new); - } - if (likely(ret)) - return true; - - cpu_relax(); - stat(s, CMPXCHG_DOUBLE_FAIL); - -#ifdef SLUB_DEBUG_CMPXCHG - pr_info("%s %s: cmpxchg double redo ", n, s->name); -#endif - - return false; -} - -static inline bool slab_update_freelist(struct kmem_cache *s, struct slab *slab, - void *freelist_old, unsigned long counters_old, - void *freelist_new, unsigned long counters_new, - const char *n) -{ - bool ret; - - if (s->flags & __CMPXCHG_DOUBLE) { - ret = __update_freelist_fast(slab, freelist_old, counters_old, - freelist_new, counters_new); - } else { unsigned long flags; - local_irq_save(flags); + if (irq_save) + local_irq_save(flags); ret = __update_freelist_slow(slab, freelist_old, counters_old, freelist_new, counters_new); - local_irq_restore(flags); + if (irq_save) + local_irq_restore(flags); } if (likely(ret)) return true; @@ -2250,10 +2226,10 @@ static inline void *acquire_slab(struct VM_BUG_ON(new.frozen); new.frozen = 1; - if (!__slab_update_freelist(s, slab, - freelist, counters, - new.freelist, new.counters, - "acquire_slab")) + if (!slab_update_freelist(s, slab, + freelist, counters, + new.freelist, new.counters, + false, "acquire_slab")) return NULL; remove_partial(n, slab); @@ -2577,9 +2553,9 @@ static void deactivate_slab(struct kmem_ if (!slab_update_freelist(s, slab, - old.freelist, old.counters, - new.freelist, new.counters, - "unfreezing slab")) { + old.freelist, old.counters, + new.freelist, new.counters, + true, "unfreezing slab")) { if (mode == M_PARTIAL) spin_unlock_irqrestore(&n->list_lock, flags); goto redo; @@ -2633,10 +2609,10 @@ static void __unfreeze_partials(struct k new.frozen = 0; - } while (!__slab_update_freelist(s, slab, - old.freelist, old.counters, - new.freelist, new.counters, - "unfreezing slab")); + } while (!slab_update_freelist(s, slab, + old.freelist, old.counters, + new.freelist, new.counters, + false, "unfreezing slab")); if (unlikely(!new.inuse && n->nr_partial >= s->min_partial)) { slab->next = slab_to_discard; @@ -3072,10 +3048,10 @@ static inline void *get_freelist(struct new.inuse = slab->objects; new.frozen = freelist != NULL; - } while (!__slab_update_freelist(s, slab, - freelist, counters, - NULL, new.counters, - "get_freelist")); + } while (!slab_update_freelist(s, slab, + freelist, counters, + NULL, new.counters, + false, "get_freelist")); return freelist; } @@ -3666,9 +3642,9 @@ static void __slab_free(struct kmem_cach } } while (!slab_update_freelist(s, slab, - prior, counters, - head, new.counters, - "__slab_free")); + prior, counters, + head, new.counters, + true, "__slab_free")); if (likely(!n)) { From patchwork Mon May 15 07:57:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Zijlstra X-Patchwork-Id: 682068 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06232C77B75 for ; Mon, 15 May 2023 08:07:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240034AbjEOIHn (ORCPT ); Mon, 15 May 2023 04:07:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38666 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238643AbjEOIHb (ORCPT ); Mon, 15 May 2023 04:07:31 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1EFF71738; Mon, 15 May 2023 01:07:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=A512lmdZ5mu+BsxXuNhkbi41aEPQspSS2JnnTxvGyYs=; b=oExTQSqYYsszhfdd1ylxeG6Q6a tG+d+I5HsF7cIpvCdFkNzi543XG1LNV7upSjhuLCTeHoQtkIqE1s4RRIOaAckIGvkm5Oryhf142XL LhN3/umz0XmDOUxaMFls5DCRmjf38fycbdxzwI48qvCv8QSvCS1reHFBKHmNnsXXZygD3Ygpe/x28 QXWWWfWquxFT5aJynme5K7e9jcFTyZlADDy4RDwJSIxMdcShS+XcLkh0Ppy0gFczCWd9MIhu3RcHN jQT1+d8loKxnQaJeBZXMi87o9/Mn2j+Qur//DsCEwMzzg/2PQt/VmqMZrBOEjnIeaxQZnsZqk9gr9 xMzDSeiQ==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1pyTDl-003HUa-Jl; Mon, 15 May 2023 08:06:17 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 8DFA5303434; Mon, 15 May 2023 10:06:15 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id D522D202FCEA9; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.589824283@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:09 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 10/11] arch: Remove cmpxchg_double References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org No moar users, remove the monster. Signed-off-by: Peter Zijlstra (Intel) Acked-by: Heiko Carstens --- Documentation/core-api/this_cpu_ops.rst | 2 - arch/arm64/include/asm/atomic_ll_sc.h | 33 ---------------- arch/arm64/include/asm/atomic_lse.h | 36 ------------------ arch/arm64/include/asm/cmpxchg.h | 46 ----------------------- arch/arm64/include/asm/percpu.h | 10 ----- arch/s390/include/asm/cmpxchg.h | 34 ----------------- arch/s390/include/asm/percpu.h | 18 --------- arch/x86/include/asm/cmpxchg.h | 25 ------------ arch/x86/include/asm/cmpxchg_32.h | 1 arch/x86/include/asm/cmpxchg_64.h | 1 arch/x86/include/asm/percpu.h | 41 -------------------- include/asm-generic/percpu.h | 58 ----------------------------- include/linux/atomic/atomic-instrumented.h | 17 -------- include/linux/percpu-defs.h | 38 ------------------- scripts/atomic/gen-atomic-instrumented.sh | 15 ++----- 15 files changed, 5 insertions(+), 370 deletions(-) --- a/Documentation/core-api/this_cpu_ops.rst +++ b/Documentation/core-api/this_cpu_ops.rst @@ -53,7 +53,6 @@ are defined. These operations can be use this_cpu_add_return(pcp, val) this_cpu_xchg(pcp, nval) this_cpu_cmpxchg(pcp, oval, nval) - this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) this_cpu_sub(pcp, val) this_cpu_inc(pcp) this_cpu_dec(pcp) @@ -242,7 +241,6 @@ modifies the variable, then RMW actions __this_cpu_add_return(pcp, val) __this_cpu_xchg(pcp, nval) __this_cpu_cmpxchg(pcp, oval, nval) - __this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) __this_cpu_sub(pcp, val) __this_cpu_inc(pcp) __this_cpu_dec(pcp) --- a/arch/arm64/include/asm/atomic_ll_sc.h +++ b/arch/arm64/include/asm/atomic_ll_sc.h @@ -294,39 +294,6 @@ __CMPXCHG_CASE( , , mb_, 64, dmb ish, #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name, mb, rel, cl) \ -static __always_inline long \ -__ll_sc__cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - unsigned long tmp, ret; \ - \ - asm volatile("// __cmpxchg_double" #name "\n" \ - " prfm pstl1strm, %2\n" \ - "1: ldxp %0, %1, %2\n" \ - " eor %0, %0, %3\n" \ - " eor %1, %1, %4\n" \ - " orr %1, %0, %1\n" \ - " cbnz %1, 2f\n" \ - " st" #rel "xp %w0, %5, %6, %2\n" \ - " cbnz %w0, 1b\n" \ - " " #mb "\n" \ - "2:" \ - : "=&r" (tmp), "=&r" (ret), "+Q" (*(__uint128_t *)ptr) \ - : "r" (old1), "r" (old2), "r" (new1), "r" (new2) \ - : cl); \ - \ - return ret; \ -} - -__CMPXCHG_DBL( , , , ) -__CMPXCHG_DBL(_mb, dmb ish, l, "memory") - -#undef __CMPXCHG_DBL - union __u128_halves { u128 full; struct { --- a/arch/arm64/include/asm/atomic_lse.h +++ b/arch/arm64/include/asm/atomic_lse.h @@ -281,42 +281,6 @@ __CMPXCHG_CASE(x, , mb_, 64, al, "memo #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name, mb, cl...) \ -static __always_inline long \ -__lse__cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - unsigned long oldval1 = old1; \ - unsigned long oldval2 = old2; \ - register unsigned long x0 asm ("x0") = old1; \ - register unsigned long x1 asm ("x1") = old2; \ - register unsigned long x2 asm ("x2") = new1; \ - register unsigned long x3 asm ("x3") = new2; \ - register unsigned long x4 asm ("x4") = (unsigned long)ptr; \ - \ - asm volatile( \ - __LSE_PREAMBLE \ - " casp" #mb "\t%[old1], %[old2], %[new1], %[new2], %[v]\n"\ - " eor %[old1], %[old1], %[oldval1]\n" \ - " eor %[old2], %[old2], %[oldval2]\n" \ - " orr %[old1], %[old1], %[old2]" \ - : [old1] "+&r" (x0), [old2] "+&r" (x1), \ - [v] "+Q" (*(__uint128_t *)ptr) \ - : [new1] "r" (x2), [new2] "r" (x3), [ptr] "r" (x4), \ - [oldval1] "r" (oldval1), [oldval2] "r" (oldval2) \ - : cl); \ - \ - return x0; \ -} - -__CMPXCHG_DBL( , ) -__CMPXCHG_DBL(_mb, al, "memory") - -#undef __CMPXCHG_DBL - #define __CMPXCHG128(name, mb, cl...) \ static __always_inline u128 \ __lse__cmpxchg128##name(volatile u128 *ptr, u128 old, u128 new) \ --- a/arch/arm64/include/asm/cmpxchg.h +++ b/arch/arm64/include/asm/cmpxchg.h @@ -130,22 +130,6 @@ __CMPXCHG_CASE(mb_, 64) #undef __CMPXCHG_CASE -#define __CMPXCHG_DBL(name) \ -static inline long __cmpxchg_double##name(unsigned long old1, \ - unsigned long old2, \ - unsigned long new1, \ - unsigned long new2, \ - volatile void *ptr) \ -{ \ - return __lse_ll_sc_body(_cmpxchg_double##name, \ - old1, old2, new1, new2, ptr); \ -} - -__CMPXCHG_DBL( ) -__CMPXCHG_DBL(_mb) - -#undef __CMPXCHG_DBL - #define __CMPXCHG128(name) \ static inline u128 __cmpxchg128##name(volatile u128 *ptr, \ u128 old, u128 new) \ @@ -211,36 +195,6 @@ __CMPXCHG_GEN(_mb) #define arch_cmpxchg64 arch_cmpxchg #define arch_cmpxchg64_local arch_cmpxchg_local -/* cmpxchg_double */ -#define system_has_cmpxchg_double() 1 - -#define __cmpxchg_double_check(ptr1, ptr2) \ -({ \ - if (sizeof(*(ptr1)) != 8) \ - BUILD_BUG(); \ - VM_BUG_ON((unsigned long *)(ptr2) - (unsigned long *)(ptr1) != 1); \ -}) - -#define arch_cmpxchg_double(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - __cmpxchg_double_check(ptr1, ptr2); \ - __ret = !__cmpxchg_double_mb((unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2), \ - ptr1); \ - __ret; \ -}) - -#define arch_cmpxchg_double_local(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - __cmpxchg_double_check(ptr1, ptr2); \ - __ret = !__cmpxchg_double((unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2), \ - ptr1); \ - __ret; \ -}) - /* cmpxchg128 */ #define system_has_cmpxchg128() 1 --- a/arch/arm64/include/asm/percpu.h +++ b/arch/arm64/include/asm/percpu.h @@ -145,16 +145,6 @@ PERCPU_RET_OP(add, add, ldadd) * preemption point when TIF_NEED_RESCHED gets set while preemption is * disabled. */ -#define this_cpu_cmpxchg_double_8(ptr1, ptr2, o1, o2, n1, n2) \ -({ \ - int __ret; \ - preempt_disable_notrace(); \ - __ret = cmpxchg_double_local( raw_cpu_ptr(&(ptr1)), \ - raw_cpu_ptr(&(ptr2)), \ - o1, o2, n1, n2); \ - preempt_enable_notrace(); \ - __ret; \ -}) #define _pcp_protect(op, pcp, ...) \ ({ \ --- a/arch/s390/include/asm/cmpxchg.h +++ b/arch/s390/include/asm/cmpxchg.h @@ -190,40 +190,6 @@ static __always_inline unsigned long __c #define arch_cmpxchg_local arch_cmpxchg #define arch_cmpxchg64_local arch_cmpxchg -#define system_has_cmpxchg_double() 1 - -static __always_inline int __cmpxchg_double(unsigned long p1, unsigned long p2, - unsigned long o1, unsigned long o2, - unsigned long n1, unsigned long n2) -{ - union register_pair old = { .even = o1, .odd = o2, }; - union register_pair new = { .even = n1, .odd = n2, }; - int cc; - - asm volatile( - " cdsg %[old],%[new],%[ptr]\n" - " ipm %[cc]\n" - " srl %[cc],28\n" - : [cc] "=&d" (cc), [old] "+&d" (old.pair) - : [new] "d" (new.pair), - [ptr] "QS" (*(unsigned long *)p1), "Q" (*(unsigned long *)p2) - : "memory", "cc"); - return !cc; -} - -#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \ -({ \ - typeof(p1) __p1 = (p1); \ - typeof(p2) __p2 = (p2); \ - \ - BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long)); \ - BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long)); \ - VM_BUG_ON((unsigned long)((__p1) + 1) != (unsigned long)(__p2));\ - __cmpxchg_double((unsigned long)__p1, (unsigned long)__p2, \ - (unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2)); \ -}) - #define system_has_cmpxchg128() 1 static __always_inline u128 arch_cmpxchg128(volatile u128 *ptr, u128 old, u128 new) --- a/arch/s390/include/asm/percpu.h +++ b/arch/s390/include/asm/percpu.h @@ -180,24 +180,6 @@ #define this_cpu_xchg_4(pcp, nval) arch_this_cpu_xchg(pcp, nval) #define this_cpu_xchg_8(pcp, nval) arch_this_cpu_xchg(pcp, nval) -#define arch_this_cpu_cmpxchg_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - typeof(pcp1) *p1__; \ - typeof(pcp2) *p2__; \ - int ret__; \ - \ - preempt_disable_notrace(); \ - p1__ = raw_cpu_ptr(&(pcp1)); \ - p2__ = raw_cpu_ptr(&(pcp2)); \ - ret__ = __cmpxchg_double((unsigned long)p1__, (unsigned long)p2__, \ - (unsigned long)(o1), (unsigned long)(o2), \ - (unsigned long)(n1), (unsigned long)(n2)); \ - preempt_enable_notrace(); \ - ret__; \ -}) - -#define this_cpu_cmpxchg_double_8 arch_this_cpu_cmpxchg_double - #include #endif /* __ARCH_S390_PERCPU__ */ --- a/arch/x86/include/asm/cmpxchg.h +++ b/arch/x86/include/asm/cmpxchg.h @@ -239,29 +239,4 @@ extern void __add_wrong_size(void) #define __xadd(ptr, inc, lock) __xchg_op((ptr), (inc), xadd, lock) #define xadd(ptr, inc) __xadd((ptr), (inc), LOCK_PREFIX) -#define __cmpxchg_double(pfx, p1, p2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - __typeof__(*(p1)) __old1 = (o1), __new1 = (n1); \ - __typeof__(*(p2)) __old2 = (o2), __new2 = (n2); \ - BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long)); \ - BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long)); \ - VM_BUG_ON((unsigned long)(p1) % (2 * sizeof(long))); \ - VM_BUG_ON((unsigned long)((p1) + 1) != (unsigned long)(p2)); \ - asm volatile(pfx "cmpxchg%c5b %1" \ - CC_SET(e) \ - : CC_OUT(e) (__ret), \ - "+m" (*(p1)), "+m" (*(p2)), \ - "+a" (__old1), "+d" (__old2) \ - : "i" (2 * sizeof(long)), \ - "b" (__new1), "c" (__new2)); \ - __ret; \ -}) - -#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \ - __cmpxchg_double(LOCK_PREFIX, p1, p2, o1, o2, n1, n2) - -#define arch_cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \ - __cmpxchg_double(, p1, p2, o1, o2, n1, n2) - #endif /* ASM_X86_CMPXCHG_H */ --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -103,7 +103,6 @@ static inline bool __try_cmpxchg64(volat #endif -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX8) #define system_has_cmpxchg64() boot_cpu_has(X86_FEATURE_CX8) #endif /* _ASM_X86_CMPXCHG_32_H */ --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -81,7 +81,6 @@ static __always_inline bool arch_try_cmp return __arch_try_cmpxchg128(ptr, oldp, new,); } -#define system_has_cmpxchg_double() boot_cpu_has(X86_FEATURE_CX16) #define system_has_cmpxchg128() boot_cpu_has(X86_FEATURE_CX16) #endif /* _ASM_X86_CMPXCHG_64_H */ --- a/arch/x86/include/asm/percpu.h +++ b/arch/x86/include/asm/percpu.h @@ -349,23 +349,6 @@ do { \ #define this_cpu_cmpxchg_2(pcp, oval, nval) percpu_cmpxchg_op(2, volatile, pcp, oval, nval) #define this_cpu_cmpxchg_4(pcp, oval, nval) percpu_cmpxchg_op(4, volatile, pcp, oval, nval) -#ifdef CONFIG_X86_CMPXCHG64 -#define percpu_cmpxchg8b_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - typeof(pcp1) __o1 = (o1), __n1 = (n1); \ - typeof(pcp2) __o2 = (o2), __n2 = (n2); \ - asm volatile("cmpxchg8b "__percpu_arg(1) \ - CC_SET(z) \ - : CC_OUT(z) (__ret), "+m" (pcp1), "+m" (pcp2), "+a" (__o1), "+d" (__o2) \ - : "b" (__n1), "c" (__n2)); \ - __ret; \ -}) - -#define raw_cpu_cmpxchg_double_4 percpu_cmpxchg8b_double -#define this_cpu_cmpxchg_double_4 percpu_cmpxchg8b_double -#endif /* CONFIG_X86_CMPXCHG64 */ - /* * Per cpu atomic 64 bit operations are only available under 64 bit. * 32 bit must fall back to generic operations. @@ -388,30 +371,6 @@ do { \ #define this_cpu_add_return_8(pcp, val) percpu_add_return_op(8, volatile, pcp, val) #define this_cpu_xchg_8(pcp, nval) percpu_xchg_op(8, volatile, pcp, nval) #define this_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg_op(8, volatile, pcp, oval, nval) - -/* - * Pretty complex macro to generate cmpxchg16 instruction. The instruction - * is not supported on early AMD64 processors so we must be able to emulate - * it in software. The address used in the cmpxchg16 instruction must be - * aligned to a 16 byte boundary. - */ -#define percpu_cmpxchg16b_double(pcp1, pcp2, o1, o2, n1, n2) \ -({ \ - bool __ret; \ - typeof(pcp1) __o1 = (o1), __n1 = (n1); \ - typeof(pcp2) __o2 = (o2), __n2 = (n2); \ - alternative_io("leaq %P1,%%rsi\n\tcall this_cpu_cmpxchg16b_emu\n\t", \ - "cmpxchg16b " __percpu_arg(1) "\n\tsetz %0\n\t", \ - X86_FEATURE_CX16, \ - ASM_OUTPUT2("=a" (__ret), "+m" (pcp1), \ - "+m" (pcp2), "+d" (__o2)), \ - "b" (__n1), "c" (__n2), "a" (__o1) : "rsi"); \ - __ret; \ -}) - -#define raw_cpu_cmpxchg_double_8 percpu_cmpxchg16b_double -#define this_cpu_cmpxchg_double_8 percpu_cmpxchg16b_double - #endif static __always_inline bool x86_this_cpu_constant_test_bit(unsigned int nr, --- a/include/asm-generic/percpu.h +++ b/include/asm-generic/percpu.h @@ -99,19 +99,6 @@ do { \ __ret; \ }) -#define raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ \ - typeof(pcp1) *__p1 = raw_cpu_ptr(&(pcp1)); \ - typeof(pcp2) *__p2 = raw_cpu_ptr(&(pcp2)); \ - int __ret = 0; \ - if (*__p1 == (oval1) && *__p2 == (oval2)) { \ - *__p1 = nval1; \ - *__p2 = nval2; \ - __ret = 1; \ - } \ - (__ret); \ -}) - #define __this_cpu_generic_read_nopreempt(pcp) \ ({ \ typeof(pcp) ___ret; \ @@ -180,17 +167,6 @@ do { \ __ret; \ }) -#define this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ \ - int __ret; \ - unsigned long __flags; \ - raw_local_irq_save(__flags); \ - __ret = raw_cpu_generic_cmpxchg_double(pcp1, pcp2, \ - oval1, oval2, nval1, nval2); \ - raw_local_irq_restore(__flags); \ - __ret; \ -}) - #ifndef raw_cpu_read_1 #define raw_cpu_read_1(pcp) raw_cpu_generic_read(pcp) #endif @@ -307,23 +283,6 @@ do { \ raw_cpu_generic_cmpxchg(pcp, oval, nval) #endif -#ifndef raw_cpu_cmpxchg_double_1 -#define raw_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_2 -#define raw_cpu_cmpxchg_double_2(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_4 -#define raw_cpu_cmpxchg_double_4(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef raw_cpu_cmpxchg_double_8 -#define raw_cpu_cmpxchg_double_8(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - raw_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif - #ifndef this_cpu_read_1 #define this_cpu_read_1(pcp) this_cpu_generic_read(pcp) #endif @@ -440,21 +399,4 @@ do { \ this_cpu_generic_cmpxchg(pcp, oval, nval) #endif -#ifndef this_cpu_cmpxchg_double_1 -#define this_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_2 -#define this_cpu_cmpxchg_double_2(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_4 -#define this_cpu_cmpxchg_double_4(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif -#ifndef this_cpu_cmpxchg_double_8 -#define this_cpu_cmpxchg_double_8(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) -#endif - #endif /* _ASM_GENERIC_PERCPU_H_ */ --- a/include/linux/atomic/atomic-instrumented.h +++ b/include/linux/atomic/atomic-instrumented.h @@ -2234,21 +2234,6 @@ atomic_long_dec_if_positive(atomic_long_ arch_try_cmpxchg128_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \ }) -#define cmpxchg_double(ptr, ...) \ -({ \ - typeof(ptr) __ai_ptr = (ptr); \ - kcsan_mb(); \ - instrument_atomic_read_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \ - arch_cmpxchg_double(__ai_ptr, __VA_ARGS__); \ -}) - - -#define cmpxchg_double_local(ptr, ...) \ -({ \ - typeof(ptr) __ai_ptr = (ptr); \ - instrument_atomic_read_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \ - arch_cmpxchg_double_local(__ai_ptr, __VA_ARGS__); \ -}) #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */ -// 82d1be694fab30414527d0877c29fa75ed5a0b74 +// 3611991b015450e119bcd7417a9431af7f3ba13c --- a/include/linux/percpu-defs.h +++ b/include/linux/percpu-defs.h @@ -343,33 +343,6 @@ static __always_inline void __this_cpu_p pscr2_ret__; \ }) -/* - * Special handling for cmpxchg_double. cmpxchg_double is passed two - * percpu variables. The first has to be aligned to a double word - * boundary and the second has to follow directly thereafter. - * We enforce this on all architectures even if they don't support - * a double cmpxchg instruction, since it's a cheap requirement, and it - * avoids breaking the requirement for architectures with the instruction. - */ -#define __pcpu_double_call_return_bool(stem, pcp1, pcp2, ...) \ -({ \ - bool pdcrb_ret__; \ - __verify_pcpu_ptr(&(pcp1)); \ - BUILD_BUG_ON(sizeof(pcp1) != sizeof(pcp2)); \ - VM_BUG_ON((unsigned long)(&(pcp1)) % (2 * sizeof(pcp1))); \ - VM_BUG_ON((unsigned long)(&(pcp2)) != \ - (unsigned long)(&(pcp1)) + sizeof(pcp1)); \ - switch(sizeof(pcp1)) { \ - case 1: pdcrb_ret__ = stem##1(pcp1, pcp2, __VA_ARGS__); break; \ - case 2: pdcrb_ret__ = stem##2(pcp1, pcp2, __VA_ARGS__); break; \ - case 4: pdcrb_ret__ = stem##4(pcp1, pcp2, __VA_ARGS__); break; \ - case 8: pdcrb_ret__ = stem##8(pcp1, pcp2, __VA_ARGS__); break; \ - default: \ - __bad_size_call_parameter(); break; \ - } \ - pdcrb_ret__; \ -}) - #define __pcpu_size_call(stem, variable, ...) \ do { \ __verify_pcpu_ptr(&(variable)); \ @@ -426,9 +399,6 @@ do { \ #define raw_cpu_xchg(pcp, nval) __pcpu_size_call_return2(raw_cpu_xchg_, pcp, nval) #define raw_cpu_cmpxchg(pcp, oval, nval) \ __pcpu_size_call_return2(raw_cpu_cmpxchg_, pcp, oval, nval) -#define raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - __pcpu_double_call_return_bool(raw_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2) - #define raw_cpu_sub(pcp, val) raw_cpu_add(pcp, -(val)) #define raw_cpu_inc(pcp) raw_cpu_add(pcp, 1) #define raw_cpu_dec(pcp) raw_cpu_sub(pcp, 1) @@ -488,11 +458,6 @@ do { \ raw_cpu_cmpxchg(pcp, oval, nval); \ }) -#define __this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ -({ __this_cpu_preempt_check("cmpxchg_double"); \ - raw_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2); \ -}) - #define __this_cpu_sub(pcp, val) __this_cpu_add(pcp, -(typeof(pcp))(val)) #define __this_cpu_inc(pcp) __this_cpu_add(pcp, 1) #define __this_cpu_dec(pcp) __this_cpu_sub(pcp, 1) @@ -513,9 +478,6 @@ do { \ #define this_cpu_xchg(pcp, nval) __pcpu_size_call_return2(this_cpu_xchg_, pcp, nval) #define this_cpu_cmpxchg(pcp, oval, nval) \ __pcpu_size_call_return2(this_cpu_cmpxchg_, pcp, oval, nval) -#define this_cpu_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) \ - __pcpu_double_call_return_bool(this_cpu_cmpxchg_double_, pcp1, pcp2, oval1, oval2, nval1, nval2) - #define this_cpu_sub(pcp, val) this_cpu_add(pcp, -(typeof(pcp))(val)) #define this_cpu_inc(pcp) this_cpu_add(pcp, 1) #define this_cpu_dec(pcp) this_cpu_sub(pcp, 1) --- a/scripts/atomic/gen-atomic-instrumented.sh +++ b/scripts/atomic/gen-atomic-instrumented.sh @@ -84,7 +84,6 @@ gen_xchg() { local xchg="$1"; shift local order="$1"; shift - local mult="$1"; shift kcsan_barrier="" if [ "${xchg%_local}" = "${xchg}" ]; then @@ -104,8 +103,8 @@ cat < X-Patchwork-Id: 684077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58656C7EE2E for ; Mon, 15 May 2023 08:07:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239672AbjEOIHj (ORCPT ); Mon, 15 May 2023 04:07:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38622 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238329AbjEOIH3 (ORCPT ); Mon, 15 May 2023 04:07:29 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8B32DE41; Mon, 15 May 2023 01:07:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=qQ97W8DjtNWYNvTOAleOKLKZr+6FI4styhSio5/BDPE=; b=jDA5YrBAqv2XXwX2hyoap3XhWh FUFqXWI9xbofaUhrnD/ux504IzigT+5UAWA2nMY5JWOA6Oqia5Te8jXiDxFNK8pNpovPpq5Ipx4Tp BoP1Xn/k2yhszEl4J3D1i1ULvw2aVOT2NmbFxiAt/QCeyGGnPAoLUpB5+TENS4VG9T0BTK9qeOYph j0IJUDpNmFgbT4b5PUux6ZW+r1nWua3QIgMmTlB8o48//Is03yKfVTJtReG1SFLB/GbnnyxmvNPD4 42RHsxPs8LP1AfUAk0r3kC5W2n+QFqPC+el6SkJJV5JBjO1M1OFIEMoIuucEObRX9HHJj+weRarpG WQ5sFIOg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.96 #2 (Red Hat Linux)) id 1pyTDl-00BQN6-1z; Mon, 15 May 2023 08:06:18 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 934F2305ECF; Mon, 15 May 2023 10:06:15 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id D82FB202FCEA8; Mon, 15 May 2023 10:06:10 +0200 (CEST) Message-ID: <20230515080554.657068280@infradead.org> User-Agent: quilt/0.66 Date: Mon, 15 May 2023 09:57:10 +0200 From: Peter Zijlstra To: torvalds@linux-foundation.org Cc: corbet@lwn.net, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, mark.rutland@arm.com, catalin.marinas@arm.com, dennis@kernel.org, tj@kernel.org, cl@linux.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, borntraeger@linux.ibm.com, svens@linux.ibm.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, joro@8bytes.org, suravee.suthikulpanit@amd.com, robin.murphy@arm.com, dwmw2@infradead.org, baolu.lu@linux.intel.com, Arnd Bergmann , Herbert Xu , davem@davemloft.net, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, Andrew Morton , vbabka@suse.cz, roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-s390@vger.kernel.org, iommu@lists.linux.dev, linux-arch@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 11/11] s390/cpum_sf: Convert to cmpxchg128() References: <20230515075659.118447996@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Now that there is a cross arch u128 and cmpxchg128(), use those instead of the custom CDSG helper. Signed-off-by: Peter Zijlstra (Intel) Acked-by: Heiko Carstens --- arch/s390/include/asm/cpu_mf.h | 2 +- arch/s390/kernel/perf_cpum_sf.c | 16 +++------------- 2 files changed, 4 insertions(+), 14 deletions(-) --- a/arch/s390/include/asm/cpu_mf.h +++ b/arch/s390/include/asm/cpu_mf.h @@ -140,7 +140,7 @@ union hws_trailer_header { unsigned int dsdes:16; /* 48-63: size of diagnostic SDE */ unsigned long long overflow; /* 64 - Overflow Count */ }; - __uint128_t val; + u128 val; }; struct hws_trailer_entry { --- a/arch/s390/kernel/perf_cpum_sf.c +++ b/arch/s390/kernel/perf_cpum_sf.c @@ -1275,16 +1275,6 @@ static void hw_collect_samples(struct pe } } -static inline __uint128_t __cdsg(__uint128_t *ptr, __uint128_t old, __uint128_t new) -{ - asm volatile( - " cdsg %[old],%[new],%[ptr]\n" - : [old] "+d" (old), [ptr] "+QS" (*ptr) - : [new] "d" (new) - : "memory", "cc"); - return old; -} - /* hw_perf_event_update() - Process sampling buffer * @event: The perf event * @flush_all: Flag to also flush partially filled sample-data-blocks @@ -1362,7 +1352,7 @@ static void hw_perf_event_update(struct new.f = 0; new.a = 1; new.overflow = 0; - prev.val = __cdsg(&te->header.val, old.val, new.val); + prev.val = cmpxchg128(&te->header.val, old.val, new.val); } while (prev.val != old.val); /* Advance to next sample-data-block */ @@ -1572,7 +1562,7 @@ static bool aux_set_alert(struct aux_buf } new.a = 1; new.overflow = 0; - prev.val = __cdsg(&te->header.val, old.val, new.val); + prev.val = cmpxchg128(&te->header.val, old.val, new.val); } while (prev.val != old.val); return true; } @@ -1646,7 +1636,7 @@ static bool aux_reset_buffer(struct aux_ new.a = 1; else new.a = 0; - prev.val = __cdsg(&te->header.val, old.val, new.val); + prev.val = cmpxchg128(&te->header.val, old.val, new.val); } while (prev.val != old.val); *overflow += orig_overflow; }