Message ID: 20180504173937.25300-2-mark.rutland@arm.com
State: New
Series: arm64: add instrumented atomics
On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote: > Currently <asm-generic/atomic-instrumented.h> only instruments the fully > ordered variants of atomic functions, ignoring the {relaxed,acquire,release} > ordering variants. > > This patch reworks the header to instrument all ordering variants of the atomic > functions, so that architectures implementing these are instrumented > appropriately. > > To minimise repetition, a macro is used to generate each variant from a common > template. The {full,relaxed,acquire,release} order variants respectively are > then built using this template, where the architecture provides an > implementation. > > To stick to an 80 column limit while keeping the templates legible, the return > type and function name of each template are split over two lines. For > consistency, this is done even when not strictly necessary. > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> > Cc: Boqun Feng <boqun.feng@gmail.com> > Cc: Dmitry Vyukov <dvyukov@google.com> > Cc: Ingo Molnar <mingo@kernel.org> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Will Deacon <will.deacon@arm.com> > --- > include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++----- > 1 file changed, 1008 insertions(+), 187 deletions(-) Is there really no way to either generate or further macro compress this? This is stupid repetitive, we just got rid of all that endless copy paste crap in atomic implementations and now we're going back to that. Adding or changing atomic bits becomes horrifically painful because of this.
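As a rough illustration of the template approach the changelog describes, a hand-written sketch follows; it is not the code from the patch itself, and the INSTR_ATOMIC_FETCH_ADD() name and the exact #ifdef detection are assumptions, though the arch_atomic_*() naming and the kasan_check_write() calls are what <asm-generic/atomic-instrumented.h> already used at the time. Each wrapper instruments the target address and forwards to the architecture's primitive for that ordering, with one wrapper emitted per variant the architecture provides:

    #include <linux/kasan-checks.h>

    /*
     * Hypothetical template: emit one instrumented wrapper for the given
     * ordering suffix ("", _relaxed, _acquire, _release).  As the
     * changelog notes, the return type and function name are split over
     * two lines to stay within 80 columns.
     */
    #define INSTR_ATOMIC_FETCH_ADD(order)                                 \
    static __always_inline int                                            \
    atomic_fetch_add##order(int i, atomic_t *v)                           \
    {                                                                     \
            kasan_check_write(v, sizeof(*v));                             \
            return arch_atomic_fetch_add##order(i, v);                    \
    }

    /* The fully ordered variant always exists... */
    INSTR_ATOMIC_FETCH_ADD()

    /* ...the ordering variants only where the architecture provides them. */
    #ifdef arch_atomic_fetch_add_relaxed
    INSTR_ATOMIC_FETCH_ADD(_relaxed)
    #endif

    #ifdef arch_atomic_fetch_add_acquire
    INSTR_ATOMIC_FETCH_ADD(_acquire)
    #endif

    #ifdef arch_atomic_fetch_add_release
    INSTR_ATOMIC_FETCH_ADD(_release)
    #endif

It is these per-variant #ifdef blocks, repeated for every operation, that make up the bulk of the "ifdeffery" discussed below.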
On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote: > On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote: > > Currently <asm-generic/atomic-instrumented.h> only instruments the fully > > ordered variants of atomic functions, ignoring the {relaxed,acquire,release} > > ordering variants. > > > > This patch reworks the header to instrument all ordering variants of the atomic > > functions, so that architectures implementing these are instrumented > > appropriately. > > > > To minimise repetition, a macro is used to generate each variant from a common > > template. The {full,relaxed,acquire,release} order variants respectively are > > then built using this template, where the architecture provides an > > implementation. > > include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++----- > > 1 file changed, 1008 insertions(+), 187 deletions(-) > > Is there really no way to either generate or further macro compress this? I can definitely macro compress this somewhat, but the bulk of the repetition will be the ifdeffery, which can't be macro'd away IIUC. Generating this with a script is possible -- do we do anything like that elsewhere? > This is stupid repetitive, we just got rid of all that endless copy > paste crap in atomic implementations and now we're going back to that. > > Adding or changing atomic bits becomes horrifically painful because of this. Sure thing; mangling it to its current state was a pain enough. Thanks, Mark.
On Fri, May 04, 2018 at 07:09:09PM +0100, Mark Rutland wrote: > On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote: > > On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote: > > > include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++----- > > > 1 file changed, 1008 insertions(+), 187 deletions(-) > > > > Is there really no way to either generate or further macro compress this? > > I can definitely macro compress this somewhat, but the bulk of the > repetition will be the ifdeffery, which can't be macro'd away IIUC. Right, much like what we already have in linux/atomic.h I suspect, having to duplicating that isn't brilliant either. > Generating this with a script is possible -- do we do anything like that > elsewhere? There's include/generated/ in your build directory. But nothing on this scale I think.
On Sat, May 05, 2018 at 10:11:00AM +0200, Ingo Molnar wrote: > Before: > > #ifndef atomic_fetch_dec_relaxed > > #ifndef atomic_fetch_dec > #define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > #define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > #define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > #define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > #else /* atomic_fetch_dec */ > #define atomic_fetch_dec_relaxed atomic_fetch_dec > #define atomic_fetch_dec_acquire atomic_fetch_dec > #define atomic_fetch_dec_release atomic_fetch_dec > #endif /* atomic_fetch_dec */ > > #else /* atomic_fetch_dec_relaxed */ > > #ifndef atomic_fetch_dec_acquire > #define atomic_fetch_dec_acquire(...) \ > __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > #endif > > #ifndef atomic_fetch_dec_release > #define atomic_fetch_dec_release(...) \ > __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > #endif > > #ifndef atomic_fetch_dec > #define atomic_fetch_dec(...) \ > __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > #endif > #endif /* atomic_fetch_dec_relaxed */ > > After: > > #ifndef atomic_fetch_dec_relaxed > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > # else > # define atomic_fetch_dec_relaxed atomic_fetch_dec > # define atomic_fetch_dec_acquire atomic_fetch_dec > # define atomic_fetch_dec_release atomic_fetch_dec > # endif > #else > # ifndef atomic_fetch_dec_acquire > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > # endif > # ifndef atomic_fetch_dec_release > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > # endif > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > # endif > #endif > > The new variant is readable at a glance, and the hierarchy of defines is very > obvious as well. It wraps and looks hideous in my normal setup. And I do detest that indent after # thing. > And I think we could do even better - there's absolutely no reason why _every_ > operation has to be made conditional on a finegrained level - they are overriden > in API groups. In fact allowing individual override is arguably a fragility. > > So we could do the following simplification on top of that: > > #ifndef atomic_fetch_dec_relaxed > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > # else > # define atomic_fetch_dec_relaxed atomic_fetch_dec > # define atomic_fetch_dec_acquire atomic_fetch_dec > # define atomic_fetch_dec_release atomic_fetch_dec > # endif > #else > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > # endif > #endif This would disallow an architecture to override just fetch_dec_release for instance. 
I don't think there currently is any architecture that does that, but the intent was to allow it to override anything and only provide defaults where it does not. None of this takes away the giant trainwreck that is the annotated atomic stuff though. And I seriously hate this one: ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation") and will likely undo that the moment I need to change anything there. So no, don't like.
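For reference, the __atomic_op_*() helpers used by these fallbacks are defined generically in include/linux/atomic.h, roughly as follows at the time (architectures may override them, as PowerPC does later in this thread); they build the stronger orderings from the _relaxed operation plus explicit full barriers:

    #ifndef __atomic_op_acquire
    #define __atomic_op_acquire(op, args...)                              \
    ({                                                                    \
            typeof(op##_relaxed(args)) __ret = op##_relaxed(args);        \
            smp_mb__after_atomic();                                       \
            __ret;                                                        \
    })
    #endif

    #ifndef __atomic_op_release
    #define __atomic_op_release(op, args...)                              \
    ({                                                                    \
            smp_mb__before_atomic();                                      \
            op##_relaxed(args);                                           \
    })
    #endif

    #ifndef __atomic_op_fence
    #define __atomic_op_fence(op, args...)                                \
    ({                                                                    \
            typeof(op##_relaxed(args)) __ret;                             \
            smp_mb__before_atomic();                                      \
            __ret = op##_relaxed(args);                                   \
            smp_mb__after_atomic();                                       \
            __ret;                                                        \
    })
    #endif

This is why, later in the thread, a missing arch-specific _release variant silently becomes smp_mb__before_atomic() followed by the _relaxed operation unless the architecture supplies its own __atomic_op_release().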
* Peter Zijlstra <peterz@infradead.org> wrote: > > So we could do the following simplification on top of that: > > > > #ifndef atomic_fetch_dec_relaxed > > # ifndef atomic_fetch_dec > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > # else > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > # define atomic_fetch_dec_release atomic_fetch_dec > > # endif > > #else > > # ifndef atomic_fetch_dec > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > # endif > > #endif > > This would disallow an architecture to override just fetch_dec_release for > instance. Couldn't such a crazy arch just define _all_ the 3 APIs in this group? That's really a small price and makes the place pay the complexity price that does the weirdness... > I don't think there currently is any architecture that does that, but the > intent was to allow it to override anything and only provide defaults where it > does not. I'd argue that if a new arch only defines one of these APIs that's probably a bug. If they absolutely want to do it, they still can - by defining all 3 APIs. So there's no loss in arch flexibility. > None of this takes away the giant trainwreck that is the annotated atomic stuff > though. > > And I seriously hate this one: > > ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation") > > and will likely undo that the moment I need to change anything there. If it makes the code more readable then I don't object - the problem was that the instrumentation indirection made all that code much harder to follow. Thanks, Ingo
On Sat, May 5, 2018 at 10:47 AM, Peter Zijlstra <peterz@infradead.org> wrote: > On Sat, May 05, 2018 at 10:11:00AM +0200, Ingo Molnar wrote: > >> Before: >> >> #ifndef atomic_fetch_dec_relaxed >> >> #ifndef atomic_fetch_dec >> #define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) >> #define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) >> #define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) >> #define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) >> #else /* atomic_fetch_dec */ >> #define atomic_fetch_dec_relaxed atomic_fetch_dec >> #define atomic_fetch_dec_acquire atomic_fetch_dec >> #define atomic_fetch_dec_release atomic_fetch_dec >> #endif /* atomic_fetch_dec */ >> >> #else /* atomic_fetch_dec_relaxed */ >> >> #ifndef atomic_fetch_dec_acquire >> #define atomic_fetch_dec_acquire(...) \ >> __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) >> #endif >> >> #ifndef atomic_fetch_dec_release >> #define atomic_fetch_dec_release(...) \ >> __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) >> #endif >> >> #ifndef atomic_fetch_dec >> #define atomic_fetch_dec(...) \ >> __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) >> #endif >> #endif /* atomic_fetch_dec_relaxed */ >> >> After: >> >> #ifndef atomic_fetch_dec_relaxed >> # ifndef atomic_fetch_dec >> # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) >> # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) >> # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) >> # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) >> # else >> # define atomic_fetch_dec_relaxed atomic_fetch_dec >> # define atomic_fetch_dec_acquire atomic_fetch_dec >> # define atomic_fetch_dec_release atomic_fetch_dec >> # endif >> #else >> # ifndef atomic_fetch_dec_acquire >> # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) >> # endif >> # ifndef atomic_fetch_dec_release >> # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) >> # endif >> # ifndef atomic_fetch_dec >> # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) >> # endif >> #endif >> >> The new variant is readable at a glance, and the hierarchy of defines is very >> obvious as well. > > It wraps and looks hideous in my normal setup. And I do detest that indent > after # thing. > >> And I think we could do even better - there's absolutely no reason why _every_ >> operation has to be made conditional on a finegrained level - they are overriden >> in API groups. In fact allowing individual override is arguably a fragility. >> >> So we could do the following simplification on top of that: >> >> #ifndef atomic_fetch_dec_relaxed >> # ifndef atomic_fetch_dec >> # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) >> # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) >> # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) >> # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) >> # else >> # define atomic_fetch_dec_relaxed atomic_fetch_dec >> # define atomic_fetch_dec_acquire atomic_fetch_dec >> # define atomic_fetch_dec_release atomic_fetch_dec >> # endif >> #else >> # ifndef atomic_fetch_dec >> # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) >> # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) >> # define atomic_fetch_dec_release(...) 
__atomic_op_release(atomic_fetch_dec, __VA_ARGS__) >> # endif >> #endif > > This would disallow an architecture to override just fetch_dec_release for > instance. > > I don't think there currently is any architecture that does that, but the > intent was to allow it to override anything and only provide defaults where it > does not. > > None of this takes away the giant trainwreck that is the annotated atomic stuff > though. > > And I seriously hate this one: > > ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation") > > and will likely undo that the moment I need to change anything there. > > So no, don't like. That was asked by Ingo: https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/Xz1uVWaaAAAJ I think in the end all of current options suck in one way or another, so we are just going in circles. We either need something different (e.g. codegen), or settle on one option for doing it.
* Peter Zijlstra <peterz@infradead.org> wrote: > On Sat, May 05, 2018 at 10:11:00AM +0200, Ingo Molnar wrote: > > > Before: > > > > #ifndef atomic_fetch_dec_relaxed > > > > #ifndef atomic_fetch_dec > > #define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > #define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > #define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > #define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > #else /* atomic_fetch_dec */ > > #define atomic_fetch_dec_relaxed atomic_fetch_dec > > #define atomic_fetch_dec_acquire atomic_fetch_dec > > #define atomic_fetch_dec_release atomic_fetch_dec > > #endif /* atomic_fetch_dec */ > > > > #else /* atomic_fetch_dec_relaxed */ > > > > #ifndef atomic_fetch_dec_acquire > > #define atomic_fetch_dec_acquire(...) \ > > __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > #endif > > > > #ifndef atomic_fetch_dec_release > > #define atomic_fetch_dec_release(...) \ > > __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > #endif > > > > #ifndef atomic_fetch_dec > > #define atomic_fetch_dec(...) \ > > __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > #endif > > #endif /* atomic_fetch_dec_relaxed */ > > > > After: > > > > #ifndef atomic_fetch_dec_relaxed > > # ifndef atomic_fetch_dec > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > # else > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > # define atomic_fetch_dec_release atomic_fetch_dec > > # endif > > #else > > # ifndef atomic_fetch_dec_acquire > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > # endif > > # ifndef atomic_fetch_dec_release > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > # endif > > # ifndef atomic_fetch_dec > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > # endif > > #endif > > > > The new variant is readable at a glance, and the hierarchy of defines is very > > obvious as well. > > It wraps and looks hideous in my normal setup. And I do detest that indent > after # thing. You should use wider terminals if you take a look at such code - there's already numerous areas of the kernel that are not readable on 80x25 terminals. _Please_ try the following experiment, for me: Enter the 21st century temporarily and widen two of your terminals from 80 cols to 100 cols - it's only ~20% wider. Apply the 3 patches I sent and then open the new and the old atomic.h in the two terminals and compare them visually. The new structure is _much_ more compact, it is nicer looking and much more readable. Thanks, Ingo
On Fri, May 04, 2018 at 08:24:22PM +0200, Peter Zijlstra wrote: > On Fri, May 04, 2018 at 07:09:09PM +0100, Mark Rutland wrote: > > On Fri, May 04, 2018 at 08:01:05PM +0200, Peter Zijlstra wrote: > > > On Fri, May 04, 2018 at 06:39:32PM +0100, Mark Rutland wrote: > > > > > include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++----- > > > > 1 file changed, 1008 insertions(+), 187 deletions(-) > > > > > > Is there really no way to either generate or further macro compress this? > > > > I can definitely macro compress this somewhat, but the bulk of the > > repetition will be the ifdeffery, which can't be macro'd away IIUC. > > Right, much like what we already have in linux/atomic.h I suspect, > having to duplicating that isn't brilliant either. > > > Generating this with a script is possible -- do we do anything like that > > elsewhere? > > There's include/generated/ in your build directory. But nothing on this > scale I think. Sure. I'm not familiar with how we generate those, so I'll go digging through the build infrastructure. Thanks, Mark.
On Sat, May 05, 2018 at 11:04:03AM +0200, Ingo Molnar wrote: > > * Peter Zijlstra <peterz@infradead.org> wrote: > > > > So we could do the following simplification on top of that: > > > > > > #ifndef atomic_fetch_dec_relaxed > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > > # else > > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > > # define atomic_fetch_dec_release atomic_fetch_dec > > > # endif > > > #else > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > #endif > > > > This would disallow an architecture to override just fetch_dec_release for > > instance. > > Couldn't such a crazy arch just define _all_ the 3 APIs in this group? > That's really a small price and makes the place pay the complexity > price that does the weirdness... I would expect the pattern where it can do all 'release' and/or all 'acquire' variants special but cannot use the __atomic_op_*() wrappery. > > I don't think there currently is any architecture that does that, but the > > intent was to allow it to override anything and only provide defaults where it > > does not. > > I'd argue that if a new arch only defines one of these APIs that's probably a bug. > If they absolutely want to do it, they still can - by defining all 3 APIs. > > So there's no loss in arch flexibility. Ideally we'd generate the whole mess.. and then allowing these extra few overrides is not a problem at all. > > None of this takes away the giant trainwreck that is the annotated atomic stuff > > though. > > > > And I seriously hate this one: > > > > ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation") > > > > and will likely undo that the moment I need to change anything there. > > If it makes the code more readable then I don't object - the problem was that the > instrumentation indirection made all that code much harder to follow. Thing is, it is all the exact same loop, and bitrot mandates they drift over time. When I cleaned up all the architectures I found plenty cases where there were spurious differences between things.
On Sat, May 05, 2018 at 11:09:03AM +0200, Ingo Molnar wrote: > > > # ifndef atomic_fetch_dec_acquire > > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > # ifndef atomic_fetch_dec_release > > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > #endif > > > > > > The new variant is readable at a glance, and the hierarchy of defines is very > > > obvious as well. > > > > It wraps and looks hideous in my normal setup. And I do detest that indent > > after # thing. > > You should use wider terminals if you take a look at such code - there's already > numerous areas of the kernel that are not readable on 80x25 terminals. > > _Please_ try the following experiment, for me: > > Enter the 21st century temporarily and widen two of your terminals from 80 cols to > 100 cols - it's only ~20% wider. Doesn't work that way. The only way I get more columns is if I shrink my font further. I work with tiles per monitor (left/right obv.) and use two columns per editor. This gets me a total of 4 columns. On my desktop that is slightly over 100 characters per column, on my laptop that is slightly below 100 -- mostly because I'm pixel limited on fontsize on that thing (FullHD sucks). If it wraps it wraps.
On Sat, May 05, 2018 at 11:05:51AM +0200, Dmitry Vyukov wrote: > On Sat, May 5, 2018 at 10:47 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > And I seriously hate this one: > > > > ba1c9f83f633 ("locking/atomic/x86: Un-macro-ify atomic ops implementation") > > > > and will likely undo that the moment I need to change anything there. > That was asked by Ingo: > https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/Xz1uVWaaAAAJ > > I think in the end all of current options suck in one way or another, > so we are just going in circles. Yeah, and I disagree with him, but didn't have the energy to fight at that time (and still don't really, I'm just complaining). > We either need something different (e.g. codegen), or settle on one > option for doing it. Codegen I think is the only sensible option at this point for all the wrappers. The existing ones (without the annotation muck) were already cumbersome, the annotation stuff just makes it completely horrid.
* Ingo Molnar <mingo@kernel.org> wrote: > * Peter Zijlstra <peterz@infradead.org> wrote: > > > > So we could do the following simplification on top of that: > > > > > > #ifndef atomic_fetch_dec_relaxed > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > > # else > > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > > # define atomic_fetch_dec_release atomic_fetch_dec > > > # endif > > > #else > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > #endif > > > > This would disallow an architecture to override just fetch_dec_release for > > instance. > > Couldn't such a crazy arch just define _all_ the 3 APIs in this group? > That's really a small price and makes the place pay the complexity > price that does the weirdness... > > > I don't think there currently is any architecture that does that, but the > > intent was to allow it to override anything and only provide defaults where it > > does not. > > I'd argue that if a new arch only defines one of these APIs that's probably a bug. > If they absolutely want to do it, they still can - by defining all 3 APIs. > > So there's no loss in arch flexibility. BTW., PowerPC for example is already in such a situation, it does not define atomic_cmpxchg_release(), only the other APIs: #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n))) #define atomic_cmpxchg_relaxed(v, o, n) \ cmpxchg_relaxed(&((v)->counter), (o), (n)) #define atomic_cmpxchg_acquire(v, o, n) \ cmpxchg_acquire(&((v)->counter), (o), (n)) Was it really the intention on the PowerPC side that the generic code falls back to cmpxchg(), i.e.: # define atomic_cmpxchg_release(...) __atomic_op_release(atomic_cmpxchg, __VA_ARGS__) Which after macro expansion becomes: smp_mb__before_atomic(); atomic_cmpxchg_relaxed(v, o, n); smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which falls back to mb(), which on PowerPC is the 'sync' instruction. Isn't this a inefficiency bug? While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to have the following basic structure: full cmpxchg(): PPC_ATOMIC_ENTRY_BARRIER # sync ldarx + stdcx PPC_ATOMIC_EXIT_BARRIER # sync cmpxchg_relaxed(): ldarx + stdcx cmpxchg_acquire(): ldarx + stdcx PPC_ACQUIRE_BARRIER # lwsync The logical extension for cmpxchg_release() would be: cmpxchg_release(): PPC_RELEASE_BARRIER # lwsync ldarx + stdcx But instead we silently get the generic fallback, which does: smp_mb__before_atomic(); atomic_cmpxchg_relaxed(v, o, n); Which maps to: sync ldarx + stdcx Note that it uses a full barrier instead of lwsync (does that stand for 'lightweight sync'?). Even if it turns out we need the full barrier, with the overly finegrained structure of the atomics this detail is totally undocumented and non-obvious. Thanks, Ingo
On Sat, May 05, 2018 at 11:38:29AM +0200, Ingo Molnar wrote: > > * Ingo Molnar <mingo@kernel.org> wrote: > > > * Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > So we could do the following simplification on top of that: > > > > > > > > #ifndef atomic_fetch_dec_relaxed > > > > # ifndef atomic_fetch_dec > > > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > > > # else > > > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > > > # define atomic_fetch_dec_release atomic_fetch_dec > > > > # endif > > > > #else > > > > # ifndef atomic_fetch_dec > > > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > > > # endif > > > > #endif > > > > > > This would disallow an architecture to override just fetch_dec_release for > > > instance. > > > > Couldn't such a crazy arch just define _all_ the 3 APIs in this group? > > That's really a small price and makes the place pay the complexity > > price that does the weirdness... > > > > > I don't think there currently is any architecture that does that, but the > > > intent was to allow it to override anything and only provide defaults where it > > > does not. > > > > I'd argue that if a new arch only defines one of these APIs that's probably a bug. > > If they absolutely want to do it, they still can - by defining all 3 APIs. > > > > So there's no loss in arch flexibility. > > BTW., PowerPC for example is already in such a situation, it does not define > atomic_cmpxchg_release(), only the other APIs: > > #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n))) > #define atomic_cmpxchg_relaxed(v, o, n) \ > cmpxchg_relaxed(&((v)->counter), (o), (n)) > #define atomic_cmpxchg_acquire(v, o, n) \ > cmpxchg_acquire(&((v)->counter), (o), (n)) > > Was it really the intention on the PowerPC side that the generic code falls back > to cmpxchg(), i.e.: > > # define atomic_cmpxchg_release(...) __atomic_op_release(atomic_cmpxchg, __VA_ARGS__) > So ppc has its own definition __atomic_op_release() in arch/powerpc/include/asm/atomic.h: #define __atomic_op_release(op, args...) \ ({ \ __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory"); \ op##_relaxed(args); \ }) , and PPC_RELEASE_BARRIER is lwsync, so we map to lwsync(); atomic_cmpxchg_relaxed(v, o, n); And the reason, why we don't define atomic_cmpxchg_release() but define atomic_cmpxchg_acquire() is that, atomic_cmpxchg_*() could provide no ordering guarantee if the cmp fails, we did this for atomic_cmpxchg_acquire() but not for atomic_cmpxchg_release(), because doing so may introduce a memory barrier inside a ll/sc critical section, please see the comment before __cmpxchg_u32_acquire() in arch/powerpc/include/asm/cmpxchg.h: /* * cmpxchg family don't have order guarantee if cmp part fails, therefore we * can avoid superfluous barriers if we use assembly code to implement * cmpxchg() and cmpxchg_acquire(), however we don't do the similar for * cmpxchg_release() because that will result in putting a barrier in the * middle of a ll/sc loop, which is probably a bad idea. 
For example, this * might cause the conditional store more likely to fail. */ Regards, Boqun > Which after macro expansion becomes: > > smp_mb__before_atomic(); > atomic_cmpxchg_relaxed(v, o, n); > > smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which > falls back to mb(), which on PowerPC is the 'sync' instruction. > > Isn't this a inefficiency bug? > > While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to > have the following basic structure: > > full cmpxchg(): > > PPC_ATOMIC_ENTRY_BARRIER # sync > ldarx + stdcx > PPC_ATOMIC_EXIT_BARRIER # sync > > cmpxchg_relaxed(): > > ldarx + stdcx > > cmpxchg_acquire(): > > ldarx + stdcx > PPC_ACQUIRE_BARRIER # lwsync > > The logical extension for cmpxchg_release() would be: > > cmpxchg_release(): > > PPC_RELEASE_BARRIER # lwsync > ldarx + stdcx > > But instead we silently get the generic fallback, which does: > > smp_mb__before_atomic(); > atomic_cmpxchg_relaxed(v, o, n); > > Which maps to: > > sync > ldarx + stdcx > > Note that it uses a full barrier instead of lwsync (does that stand for > 'lightweight sync'?). > > Even if it turns out we need the full barrier, with the overly finegrained > structure of the atomics this detail is totally undocumented and non-obvious. > > Thanks, > > Ingo
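To make the asymmetry concrete, below is a simplified sketch (not the exact kernel assembly) of the acquire-variant shape Boqun describes; on a failed compare the branch to the exit label skips the trailing barrier entirely:

    /*
     * Sketch of the __cmpxchg_u32_acquire() structure, simplified:
     *
     *   1:  lwarx   %0,0,%2        # load-reserve the old value
     *       cmpw    0,%0,%3        # compare against the expected value
     *       bne-    2f             # mismatch: exit with *no barrier*
     *       stwcx.  %4,0,%2        # store-conditional the new value
     *       bne-    1b             # lost the reservation: retry
     *       PPC_ACQUIRE_BARRIER    # lwsync (isync where lwsync is unavailable)
     *   2:
     *
     * Giving cmpxchg_release() the same only-on-success treatment would
     * require a barrier between the compare and the stwcx., i.e. inside
     * the ll/sc reservation window -- the "barrier in the middle of a
     * ll/sc loop" the comment above warns about, which makes the
     * conditional store more likely to fail.  Instead PowerPC issues
     * PPC_RELEASE_BARRIER (lwsync) before the whole sequence via its own
     * __atomic_op_release().
     */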
Hi Ingo, On Sat, May 05, 2018 at 12:00:55PM +0200, Ingo Molnar wrote: > > * Ingo Molnar <mingo@kernel.org> wrote: > > > > So there's no loss in arch flexibility. > > > > BTW., PowerPC for example is already in such a situation, it does not define > > atomic_cmpxchg_release(), only the other APIs: > > > > #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n))) > > #define atomic_cmpxchg_relaxed(v, o, n) \ > > cmpxchg_relaxed(&((v)->counter), (o), (n)) > > #define atomic_cmpxchg_acquire(v, o, n) \ > > cmpxchg_acquire(&((v)->counter), (o), (n)) > > > > Was it really the intention on the PowerPC side that the generic code falls back > > to cmpxchg(), i.e.: > > > > # define atomic_cmpxchg_release(...) __atomic_op_release(atomic_cmpxchg, __VA_ARGS__) > > > > Which after macro expansion becomes: > > > > smp_mb__before_atomic(); > > atomic_cmpxchg_relaxed(v, o, n); > > > > smp_mb__before_atomic() on PowerPC falls back to the generic __smp_mb(), which > > falls back to mb(), which on PowerPC is the 'sync' instruction. > > > > Isn't this a inefficiency bug? > > > > While I'm pretty clueless about PowerPC low level cmpxchg atomics, they appear to > > have the following basic structure: > > > > full cmpxchg(): > > > > PPC_ATOMIC_ENTRY_BARRIER # sync > > ldarx + stdcx > > PPC_ATOMIC_EXIT_BARRIER # sync > > > > cmpxchg_relaxed(): > > > > ldarx + stdcx > > > > cmpxchg_acquire(): > > > > ldarx + stdcx > > PPC_ACQUIRE_BARRIER # lwsync > > > > The logical extension for cmpxchg_release() would be: > > > > cmpxchg_release(): > > > > PPC_RELEASE_BARRIER # lwsync > > ldarx + stdcx > > > > But instead we silently get the generic fallback, which does: > > > > smp_mb__before_atomic(); > > atomic_cmpxchg_relaxed(v, o, n); > > > > Which maps to: > > > > sync > > ldarx + stdcx > > > > Note that it uses a full barrier instead of lwsync (does that stand for > > 'lightweight sync'?). > > > > Even if it turns out we need the full barrier, with the overly finegrained > > structure of the atomics this detail is totally undocumented and non-obvious. > > The patch below fills in those bits and implements the optimized cmpxchg_release() > family of APIs. The end effect should be that cmpxchg_release() will now use > 'lwsync' instead of 'sync' on PowerPC, for the following APIs: > > cmpxchg_release() > cmpxchg64_release() > atomic_cmpxchg_release() > atomic64_cmpxchg_release() > > I based this choice of the release barrier on an existing bitops low level PowerPC > method: > > DEFINE_BITOP(clear_bits_unlock, andc, PPC_RELEASE_BARRIER) > > This clearly suggests that PPC_RELEASE_BARRIER is in active use and 'lwsync' is > the 'release barrier' instruction, if I interpreted that right. > Thanks for looking into this, but as I said in other email: https://marc.info/?l=linux-kernel&m=152551511324210&w=2 , we actually generate light weight barriers for cmpxchg_release() familiy. The reason of the asymmetry between cmpxchg_acquire() and cmpxchg_release() is that we want to save a barrier for cmpxchg_acquire() if the cmp fails, but doing the similar for cmpxchg_release() will introduce a scenario that puts a barrier in a ll/sc loop, which may be a bad idea. > But I know very little about PowerPC so this might be spectacularly wrong. It's > totally untested as well. I also pretty sick today so my mental capabilities are > significantly reduced ... > Feel sorry about that, hope you well! Please let me know if you think I should provide more document work to make this more informative. 
Regards, Boqun > So not signed off and such. > > Thanks, > > Ingo > > --- > arch/powerpc/include/asm/atomic.h | 4 ++ > arch/powerpc/include/asm/cmpxchg.h | 81 ++++++++++++++++++++++++++++++++++++++ > 2 files changed, 85 insertions(+) > > diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h > index 682b3e6a1e21..f7a6f29acb12 100644 > --- a/arch/powerpc/include/asm/atomic.h > +++ b/arch/powerpc/include/asm/atomic.h > @@ -213,6 +213,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v) > cmpxchg_relaxed(&((v)->counter), (o), (n)) > #define atomic_cmpxchg_acquire(v, o, n) \ > cmpxchg_acquire(&((v)->counter), (o), (n)) > +#define atomic_cmpxchg_release(v, o, n) \ > + cmpxchg_release(&((v)->counter), (o), (n)) > > #define atomic_xchg(v, new) (xchg(&((v)->counter), new)) > #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new)) > @@ -519,6 +521,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v) > cmpxchg_relaxed(&((v)->counter), (o), (n)) > #define atomic64_cmpxchg_acquire(v, o, n) \ > cmpxchg_acquire(&((v)->counter), (o), (n)) > +#define atomic64_cmpxchg_release(v, o, n) \ > + cmpxchg_release(&((v)->counter), (o), (n)) > > #define atomic64_xchg(v, new) (xchg(&((v)->counter), new)) > #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new)) > diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h > index 9b001f1f6b32..6e46310b1833 100644 > --- a/arch/powerpc/include/asm/cmpxchg.h > +++ b/arch/powerpc/include/asm/cmpxchg.h > @@ -213,10 +213,12 @@ __xchg_relaxed(void *ptr, unsigned long x, unsigned int size) > CMPXCHG_GEN(u8, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, "memory"); > CMPXCHG_GEN(u8, _local, , , "memory"); > CMPXCHG_GEN(u8, _acquire, , PPC_ACQUIRE_BARRIER, "memory"); > +CMPXCHG_GEN(u8, _release, PPC_RELEASE_BARRIER, , "memory"); > CMPXCHG_GEN(u8, _relaxed, , , "cc"); > CMPXCHG_GEN(u16, , PPC_ATOMIC_ENTRY_BARRIER, PPC_ATOMIC_EXIT_BARRIER, "memory"); > CMPXCHG_GEN(u16, _local, , , "memory"); > CMPXCHG_GEN(u16, _acquire, , PPC_ACQUIRE_BARRIER, "memory"); > +CMPXCHG_GEN(u16, _release, PPC_RELEASE_BARRIER, , "memory"); > CMPXCHG_GEN(u16, _relaxed, , , "cc"); > > static __always_inline unsigned long > @@ -314,6 +316,29 @@ __cmpxchg_u32_acquire(u32 *p, unsigned long old, unsigned long new) > return prev; > } > > +static __always_inline unsigned long > +__cmpxchg_u32_release(u32 *p, unsigned long old, unsigned long new) > +{ > + unsigned long prev; > + > + __asm__ __volatile__ ( > + PPC_RELEASE_BARRIER > +"1: lwarx %0,0,%2 # __cmpxchg_u32_release\n" > +" cmpw 0,%0,%3\n" > +" bne- 2f\n" > + PPC405_ERR77(0, %2) > +" stwcx. %4,0,%2\n" > +" bne- 1b\n" > + "\n" > +"2:" > + : "=&r" (prev), "+m" (*p) > + : "r" (p), "r" (old), "r" (new) > + : "cc", "memory"); > + > + return prev; > +} > + > + > #ifdef CONFIG_PPC64 > static __always_inline unsigned long > __cmpxchg_u64(volatile unsigned long *p, unsigned long old, unsigned long new) > @@ -397,6 +422,27 @@ __cmpxchg_u64_acquire(u64 *p, unsigned long old, unsigned long new) > > return prev; > } > + > +static __always_inline unsigned long > +__cmpxchg_u64_release(u64 *p, unsigned long old, unsigned long new) > +{ > + unsigned long prev; > + > + __asm__ __volatile__ ( > + PPC_RELEASE_BARRIER > +"1: ldarx %0,0,%2 # __cmpxchg_u64_release\n" > +" cmpd 0,%0,%3\n" > +" bne- 2f\n" > +" stdcx. 
%4,0,%2\n" > +" bne- 1b\n" > + "\n" > +"2:" > + : "=&r" (prev), "+m" (*p) > + : "r" (p), "r" (old), "r" (new) > + : "cc", "memory"); > + > + return prev; > +} > #endif > > static __always_inline unsigned long > @@ -478,6 +524,27 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new, > BUILD_BUG_ON_MSG(1, "Unsupported size for __cmpxchg_acquire"); > return old; > } > + > +static __always_inline unsigned long > +__cmpxchg_release(void *ptr, unsigned long old, unsigned long new, > + unsigned int size) > +{ > + switch (size) { > + case 1: > + return __cmpxchg_u8_release(ptr, old, new); > + case 2: > + return __cmpxchg_u16_release(ptr, old, new); > + case 4: > + return __cmpxchg_u32_release(ptr, old, new); > +#ifdef CONFIG_PPC64 > + case 8: > + return __cmpxchg_u64_release(ptr, old, new); > +#endif > + } > + BUILD_BUG_ON_MSG(1, "Unsupported size for __cmpxchg_release"); > + return old; > +} > + > #define cmpxchg(ptr, o, n) \ > ({ \ > __typeof__(*(ptr)) _o_ = (o); \ > @@ -512,6 +579,15 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new, > (unsigned long)_o_, (unsigned long)_n_, \ > sizeof(*(ptr))); \ > }) > + > +#define cmpxchg_release(ptr, o, n) \ > +({ \ > + __typeof__(*(ptr)) _o_ = (o); \ > + __typeof__(*(ptr)) _n_ = (n); \ > + (__typeof__(*(ptr))) __cmpxchg_release((ptr), \ > + (unsigned long)_o_, (unsigned long)_n_, \ > + sizeof(*(ptr))); \ > +}) > #ifdef CONFIG_PPC64 > #define cmpxchg64(ptr, o, n) \ > ({ \ > @@ -533,6 +609,11 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new, > BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ > cmpxchg_acquire((ptr), (o), (n)); \ > }) > +#define cmpxchg64_release(ptr, o, n) \ > +({ \ > + BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ > + cmpxchg_release((ptr), (o), (n)); \ > +}) > #else > #include <asm-generic/cmpxchg-local.h> > #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
* Ingo Molnar <mingo@kernel.org> wrote: > If that's too [^-- too much] then there's a few more things we could do - for example the > attached patch renames a (very minor) misnomer to a shorter name and thus saves on > the longest lines, the column histogram now looks like this: Thanks, Ingo
On Sat, May 05, 2018 at 12:35:50PM +0200, Ingo Molnar wrote: > > * Boqun Feng <boqun.feng@gmail.com> wrote: > > > On Sat, May 05, 2018 at 11:38:29AM +0200, Ingo Molnar wrote: > > > > > > * Ingo Molnar <mingo@kernel.org> wrote: > > > > > > > * Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > > > > > So we could do the following simplification on top of that: > > > > > > > > > > > > #ifndef atomic_fetch_dec_relaxed > > > > > > # ifndef atomic_fetch_dec > > > > > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > > > > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > > > > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > > > > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > > > > > # else > > > > > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > > > > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > > > > > # define atomic_fetch_dec_release atomic_fetch_dec > > > > > > # endif > > > > > > #else > > > > > > # ifndef atomic_fetch_dec > > > > > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > > > > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > > > > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > > > > > # endif > > > > > > #endif > > > > > > > > > > This would disallow an architecture to override just fetch_dec_release for > > > > > instance. > > > > > > > > Couldn't such a crazy arch just define _all_ the 3 APIs in this group? > > > > That's really a small price and makes the place pay the complexity > > > > price that does the weirdness... > > > > > > > > > I don't think there currently is any architecture that does that, but the > > > > > intent was to allow it to override anything and only provide defaults where it > > > > > does not. > > > > > > > > I'd argue that if a new arch only defines one of these APIs that's probably a bug. > > > > If they absolutely want to do it, they still can - by defining all 3 APIs. > > > > > > > > So there's no loss in arch flexibility. > > > > > > BTW., PowerPC for example is already in such a situation, it does not define > > > atomic_cmpxchg_release(), only the other APIs: > > > > > > #define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n))) > > > #define atomic_cmpxchg_relaxed(v, o, n) \ > > > cmpxchg_relaxed(&((v)->counter), (o), (n)) > > > #define atomic_cmpxchg_acquire(v, o, n) \ > > > cmpxchg_acquire(&((v)->counter), (o), (n)) > > > > > > Was it really the intention on the PowerPC side that the generic code falls back > > > to cmpxchg(), i.e.: > > > > > > # define atomic_cmpxchg_release(...) __atomic_op_release(atomic_cmpxchg, __VA_ARGS__) > > > > > > > So ppc has its own definition __atomic_op_release() in > > arch/powerpc/include/asm/atomic.h: > > > > #define __atomic_op_release(op, args...) 
\ > > ({ \ > > __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory"); \ > > op##_relaxed(args); \ > > }) > > > > , and PPC_RELEASE_BARRIER is lwsync, so we map to > > > > lwsync(); > > atomic_cmpxchg_relaxed(v, o, n); > > > > And the reason, why we don't define atomic_cmpxchg_release() but define > > atomic_cmpxchg_acquire() is that, atomic_cmpxchg_*() could provide no > > ordering guarantee if the cmp fails, we did this for > > atomic_cmpxchg_acquire() but not for atomic_cmpxchg_release(), because > > doing so may introduce a memory barrier inside a ll/sc critical section, > > please see the comment before __cmpxchg_u32_acquire() in > > arch/powerpc/include/asm/cmpxchg.h: > > > > /* > > * cmpxchg family don't have order guarantee if cmp part fails, therefore we > > * can avoid superfluous barriers if we use assembly code to implement > > * cmpxchg() and cmpxchg_acquire(), however we don't do the similar for > > * cmpxchg_release() because that will result in putting a barrier in the > > * middle of a ll/sc loop, which is probably a bad idea. For example, this > > * might cause the conditional store more likely to fail. > > */ > > Makes sense, thanks a lot for the explanation, missed that comment in the middle > of the assembly functions! > ;-) I could move it so somewhere else in the future. > So the patch I sent is buggy, please disregard it. > > May I suggest the patch below? No change in functionality, but it documents the > lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() > version. (Which the generic code does now in a rather roundabout way.) > Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but you just make it sync() + cmpxchg_relaxed() + sync() with the fallback, and sync() is much heavier, so I don't think the fallback is correct. I think maybe you can move powerpc's __atomic_op_{acqurie,release}() from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and #define cmpxchg_release __atomic_op_release(cmpxchg, __VA_ARGS__); #define cmpxchg64_release __atomic_op_release(cmpxchg64, __VA_ARGS__); I put a diff below to say what I mean (untested). > Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect > right now either, but should anyone add a _relaxed() variant in the future, with > this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that > up automatically. > You mean with your other modification in include/linux/atomic.h, right? Because with the unmodified include/linux/atomic.h, we already pick that autmatically. If so, I think that's fine. Here is the diff for the modification for cmpxchg_release(), the idea is we generate them in asm/cmpxchg.h other than linux/atomic.h for ppc, so we keep the new linux/atomic.h working. Because if I understand correctly, the next linux/atomic.h only accepts that 1) architecture only defines fully ordered primitives or 2) architecture only defines _relaxed primitives or 3) architecture defines all four (fully, _relaxed, _acquire, _release) primitives So powerpc needs to define all four primitives in its only asm/cmpxchg.h. 
Regards, Boqun diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h index 682b3e6a1e21..0136be11c84f 100644 --- a/arch/powerpc/include/asm/atomic.h +++ b/arch/powerpc/include/asm/atomic.h @@ -13,24 +13,6 @@ #define ATOMIC_INIT(i) { (i) } -/* - * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with - * a "bne-" instruction at the end, so an isync is enough as a acquire barrier - * on the platform without lwsync. - */ -#define __atomic_op_acquire(op, args...) \ -({ \ - typeof(op##_relaxed(args)) __ret = op##_relaxed(args); \ - __asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory"); \ - __ret; \ -}) - -#define __atomic_op_release(op, args...) \ -({ \ - __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory"); \ - op##_relaxed(args); \ -}) - static __inline__ int atomic_read(const atomic_t *v) { int t; diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h index 9b001f1f6b32..9e20a942aff9 100644 --- a/arch/powerpc/include/asm/cmpxchg.h +++ b/arch/powerpc/include/asm/cmpxchg.h @@ -8,6 +8,24 @@ #include <asm/asm-compat.h> #include <linux/bug.h> +/* + * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier + * on the platform without lwsync. + */ +#define __atomic_op_acquire(op, args...) \ +({ \ + typeof(op##_relaxed(args)) __ret = op##_relaxed(args); \ + __asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory"); \ + __ret; \ +}) + +#define __atomic_op_release(op, args...) \ +({ \ + __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory"); \ + op##_relaxed(args); \ +}) + #ifdef __BIG_ENDIAN #define BITOFF_CAL(size, off) ((sizeof(u32) - size - off) * BITS_PER_BYTE) #else @@ -512,6 +530,8 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new, (unsigned long)_o_, (unsigned long)_n_, \ sizeof(*(ptr))); \ }) + +#define cmpxchg_release(ptr, o, n) __atomic_op_release(cmpxchg, __VA_ARGS__) #ifdef CONFIG_PPC64 #define cmpxchg64(ptr, o, n) \ ({ \ @@ -533,6 +553,7 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new, BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ cmpxchg_acquire((ptr), (o), (n)); \ }) +#define cmpxchg64_release(ptr, o, n) __atomic_op_release(cmpxchg64, __VA_ARGS__) #else #include <asm-generic/cmpxchg-local.h> #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
On Sat, May 05, 2018 at 03:27:51PM +0200, Ingo Molnar wrote: > > * Boqun Feng <boqun.feng@gmail.com> wrote: > > > > May I suggest the patch below? No change in functionality, but it documents the > > > lack of the cmpxchg_release() APIs and maps them explicitly to the full cmpxchg() > > > version. (Which the generic code does now in a rather roundabout way.) > > > > > > > Hmm.. cmpxchg_release() is actually lwsync() + cmpxchg_relaxed(), but > > you just make it sync() + cmpxchg_relaxed() + sync() with the fallback, > > and sync() is much heavier, so I don't think the fallback is correct. > > Indeed! > > The bit I missed previously is that PowerPC provides its own __atomic_op_release() > method: > > #define __atomic_op_release(op, args...) \ > ({ \ > __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory"); \ > op##_relaxed(args); \ > }) > > ... which maps to LWSYNC as you say, and my patch made that worse. > > > I think maybe you can move powerpc's __atomic_op_{acqurie,release}() > > from atomic.h to cmpxchg.h (in arch/powerpc/include/asm), and > > > > #define cmpxchg_release __atomic_op_release(cmpxchg, __VA_ARGS__); > > #define cmpxchg64_release __atomic_op_release(cmpxchg64, __VA_ARGS__); > > > > I put a diff below to say what I mean (untested). > > > > > Also, the change to arch/powerpc/include/asm/atomic.h has no functional effect > > > right now either, but should anyone add a _relaxed() variant in the future, with > > > this change atomic_cmpxchg_release() and atomic64_cmpxchg_release() will pick that > > > up automatically. > > > > > > > You mean with your other modification in include/linux/atomic.h, right? > > Because with the unmodified include/linux/atomic.h, we already pick that > > autmatically. If so, I think that's fine. > > > > Here is the diff for the modification for cmpxchg_release(), the idea is > > we generate them in asm/cmpxchg.h other than linux/atomic.h for ppc, so > > we keep the new linux/atomic.h working. Because if I understand > > correctly, the next linux/atomic.h only accepts that > > > > 1) architecture only defines fully ordered primitives > > > > or > > > > 2) architecture only defines _relaxed primitives > > > > or > > > > 3) architecture defines all four (fully, _relaxed, _acquire, > > _release) primitives > > > > So powerpc needs to define all four primitives in its only > > asm/cmpxchg.h. > > Correct, although the new logic is still RFC, PeterZ didn't like the first version > I proposed and might NAK them. > Understood. From my side, I don't have strong feelings for either way. But since powerpc gets affected with the new logic, so I'm glad I could help. > Thanks for the patch - I have created the patch below from it and added your > Signed-off-by. > Thanks ;-) > The only change I made beyond a trivial build fix is that I also added the release > atomics variants explicitly: > > +#define atomic_cmpxchg_release(v, o, n) \ > + cmpxchg_release(&((v)->counter), (o), (n)) > +#define atomic64_cmpxchg_release(v, o, n) \ > + cmpxchg_release(&((v)->counter), (o), (n)) > > It has passed a PowerPC cross-build test here, but no runtime tests. > Do you have the commit at any branch in tip tree? I could pull it and cross-build and check the assembly code of lib/atomic64_test.c, that way I could verify whether we mess something up. > Does this patch look good to you? > Yep! Regards, Boqun > (Still subject to PeterZ's Ack/NAK.) 
> > Thanks, > > Ingo > > ======================> > From: Boqun Feng <boqun.feng@gmail.com> > Date: Sat, 5 May 2018 19:28:17 +0800 > Subject: [PATCH] locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs > > Move PowerPC's __op_{acqurie,release}() from atomic.h to > cmpxchg.h (in arch/powerpc/include/asm), plus use them to > define these two methods: > > #define cmpxchg_release __op_release(cmpxchg, __VA_ARGS__); > #define cmpxchg64_release __op_release(cmpxchg64, __VA_ARGS__); > > ... the idea is to generate all these methods in cmpxchg.h and to define the full > array of atomic primitives, including the cmpxchg_release() methods which were > defined by the generic code before. > > Also define the atomic[64]_() variants explicitly. > > This ensures that all these low level cmpxchg APIs are defined in > PowerPC headers, with no generic header fallbacks. > > No change in functionality or code generation. > > Signed-off-by: Boqun Feng <boqun.feng@gmail.com> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: aryabinin@virtuozzo.com > Cc: catalin.marinas@arm.com > Cc: dvyukov@google.com > Cc: linux-arm-kernel@lists.infradead.org > Cc: will.deacon@arm.com > Link: http://lkml.kernel.org/r/20180505112817.ihrb726i37bwm4cj@tardis > Signed-off-by: Ingo Molnar <mingo@kernel.org> > --- > arch/powerpc/include/asm/atomic.h | 22 ++++------------------ > arch/powerpc/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++ > 2 files changed, 28 insertions(+), 18 deletions(-) > > diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h > index 682b3e6a1e21..4e06955ec10f 100644 > --- a/arch/powerpc/include/asm/atomic.h > +++ b/arch/powerpc/include/asm/atomic.h > @@ -13,24 +13,6 @@ > > #define ATOMIC_INIT(i) { (i) } > > -/* > - * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with > - * a "bne-" instruction at the end, so an isync is enough as a acquire barrier > - * on the platform without lwsync. > - */ > -#define __atomic_op_acquire(op, args...) \ > -({ \ > - typeof(op##_relaxed(args)) __ret = op##_relaxed(args); \ > - __asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory"); \ > - __ret; \ > -}) > - > -#define __atomic_op_release(op, args...) 
\ > -({ \ > - __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory"); \ > - op##_relaxed(args); \ > -}) > - > static __inline__ int atomic_read(const atomic_t *v) > { > int t; > @@ -213,6 +195,8 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v) > cmpxchg_relaxed(&((v)->counter), (o), (n)) > #define atomic_cmpxchg_acquire(v, o, n) \ > cmpxchg_acquire(&((v)->counter), (o), (n)) > +#define atomic_cmpxchg_release(v, o, n) \ > + cmpxchg_release(&((v)->counter), (o), (n)) > > #define atomic_xchg(v, new) (xchg(&((v)->counter), new)) > #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new)) > @@ -519,6 +503,8 @@ static __inline__ long atomic64_dec_if_positive(atomic64_t *v) > cmpxchg_relaxed(&((v)->counter), (o), (n)) > #define atomic64_cmpxchg_acquire(v, o, n) \ > cmpxchg_acquire(&((v)->counter), (o), (n)) > +#define atomic64_cmpxchg_release(v, o, n) \ > + cmpxchg_release(&((v)->counter), (o), (n)) > > #define atomic64_xchg(v, new) (xchg(&((v)->counter), new)) > #define atomic64_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new)) > diff --git a/arch/powerpc/include/asm/cmpxchg.h b/arch/powerpc/include/asm/cmpxchg.h > index 9b001f1f6b32..e27a612b957f 100644 > --- a/arch/powerpc/include/asm/cmpxchg.h > +++ b/arch/powerpc/include/asm/cmpxchg.h > @@ -8,6 +8,24 @@ > #include <asm/asm-compat.h> > #include <linux/bug.h> > > +/* > + * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with > + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier > + * on the platform without lwsync. > + */ > +#define __atomic_op_acquire(op, args...) \ > +({ \ > + typeof(op##_relaxed(args)) __ret = op##_relaxed(args); \ > + __asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory"); \ > + __ret; \ > +}) > + > +#define __atomic_op_release(op, args...) \ > +({ \ > + __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory"); \ > + op##_relaxed(args); \ > +}) > + > #ifdef __BIG_ENDIAN > #define BITOFF_CAL(size, off) ((sizeof(u32) - size - off) * BITS_PER_BYTE) > #else > @@ -512,6 +530,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new, > (unsigned long)_o_, (unsigned long)_n_, \ > sizeof(*(ptr))); \ > }) > + > +#define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__) > + > #ifdef CONFIG_PPC64 > #define cmpxchg64(ptr, o, n) \ > ({ \ > @@ -533,6 +554,9 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned long new, > BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ > cmpxchg_acquire((ptr), (o), (n)); \ > }) > + > +#define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__) > + > #else > #include <asm-generic/cmpxchg-local.h> > #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
On Sat, 2018-05-05 at 12:00 +0200, Ingo Molnar wrote: > This clearly suggests that PPC_RELEASE_BARRIER is in active use and 'lwsync' is > the 'release barrier' instruction, if I interpreted that right. It's the closest to one we've got. The semantics are that it orders all load/store pairs to cacheable storage except store+load. Cheers, Ben.
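Putting Ben's description together with the __atomic_op_release() template quoted earlier in the thread, a release-ordered PowerPC primitive is simply the release barrier followed by the relaxed operation. The sketch below is illustrative only: it hand-expands the macros for one call site, assumes PPC_RELEASE_BARRIER emits "lwsync" (CPUs without lwsync fall back to a heavier barrier), and uses a hypothetical helper name rather than the real kernel definitions.

/*
 * Hand-expanded sketch (hypothetical helper, not actual kernel code):
 * roughly what atomic_cmpxchg_release(v, old, new) does on PowerPC once
 * cmpxchg_release() -> __atomic_op_release() is expanded.
 */
static inline int example_atomic_cmpxchg_release(atomic_t *v, int old, int new)
{
	/* release barrier: orders all prior loads/stores before the op */
	__asm__ __volatile__("lwsync" : : : "memory");
	/* the relaxed primitive itself provides no ordering */
	return cmpxchg_relaxed(&v->counter, old, new);
}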
* Boqun Feng <boqun.feng@gmail.com> wrote: > > The only change I made beyond a trivial build fix is that I also added the release > > atomics variants explicitly: > > > > +#define atomic_cmpxchg_release(v, o, n) \ > > + cmpxchg_release(&((v)->counter), (o), (n)) > > +#define atomic64_cmpxchg_release(v, o, n) \ > > + cmpxchg_release(&((v)->counter), (o), (n)) > > > > It has passed a PowerPC cross-build test here, but no runtime tests. > > > > Do you have the commit at any branch in tip tree? I could pull it and > cross-build and check the assembly code of lib/atomic64_test.c, that way > I could verify whether we mess something up. > > > Does this patch look good to you? > > > > Yep! Great - I have pushed the commits out into the locking tree, they can be found in: git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core The PowerPC preparatory commit from you is: 0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs Thanks, Ingo
Hi Ingo, > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001 > From: Ingo Molnar <mingo@kernel.org> > Date: Sat, 5 May 2018 10:23:23 +0200 > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more > > Before: > > #ifndef atomic_fetch_dec_relaxed > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > # else > # define atomic_fetch_dec_relaxed atomic_fetch_dec > # define atomic_fetch_dec_acquire atomic_fetch_dec > # define atomic_fetch_dec_release atomic_fetch_dec > # endif > #else > # ifndef atomic_fetch_dec_acquire > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > # endif > # ifndef atomic_fetch_dec_release > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > # endif > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > # endif > #endif > > After: > > #ifndef atomic_fetch_dec_relaxed > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > # else > # define atomic_fetch_dec_relaxed atomic_fetch_dec > # define atomic_fetch_dec_acquire atomic_fetch_dec > # define atomic_fetch_dec_release atomic_fetch_dec > # endif > #else > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > # endif > #endif > > The idea is that because we already group these APIs by certain defines > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary > branches - we can do the same in the secondary branch as well. > > ( Also remove some unnecessarily duplicate comments, as the API > group defines are now pretty much self-documenting. ) > > No change in functionality. > > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Cc: Will Deacon <will.deacon@arm.com> > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Ingo Molnar <mingo@kernel.org> This breaks compilation on RISC-V. (For some of its atomics, the arch currently defines the _relaxed and the full variants and it relies on the generic definitions for the _acquire and the _release variants.) 
Andrea > --- > include/linux/atomic.h | 312 ++++++++++--------------------------------------- > 1 file changed, 62 insertions(+), 250 deletions(-) > > diff --git a/include/linux/atomic.h b/include/linux/atomic.h > index 67aaafba256b..352ecc72d7f5 100644 > --- a/include/linux/atomic.h > +++ b/include/linux/atomic.h > @@ -71,98 +71,66 @@ > }) > #endif > > -/* atomic_add_return_relaxed() et al: */ > - > #ifndef atomic_add_return_relaxed > # define atomic_add_return_relaxed atomic_add_return > # define atomic_add_return_acquire atomic_add_return > # define atomic_add_return_release atomic_add_return > #else > -# ifndef atomic_add_return_acquire > -# define atomic_add_return_acquire(...) __atomic_op_acquire(atomic_add_return, __VA_ARGS__) > -# endif > -# ifndef atomic_add_return_release > -# define atomic_add_return_release(...) __atomic_op_release(atomic_add_return, __VA_ARGS__) > -# endif > # ifndef atomic_add_return > # define atomic_add_return(...) __atomic_op_fence(atomic_add_return, __VA_ARGS__) > +# define atomic_add_return_acquire(...) __atomic_op_acquire(atomic_add_return, __VA_ARGS__) > +# define atomic_add_return_release(...) __atomic_op_release(atomic_add_return, __VA_ARGS__) > # endif > #endif > > -/* atomic_inc_return_relaxed() et al: */ > - > #ifndef atomic_inc_return_relaxed > # define atomic_inc_return_relaxed atomic_inc_return > # define atomic_inc_return_acquire atomic_inc_return > # define atomic_inc_return_release atomic_inc_return > #else > -# ifndef atomic_inc_return_acquire > -# define atomic_inc_return_acquire(...) __atomic_op_acquire(atomic_inc_return, __VA_ARGS__) > -# endif > -# ifndef atomic_inc_return_release > -# define atomic_inc_return_release(...) __atomic_op_release(atomic_inc_return, __VA_ARGS__) > -# endif > # ifndef atomic_inc_return > # define atomic_inc_return(...) __atomic_op_fence(atomic_inc_return, __VA_ARGS__) > +# define atomic_inc_return_acquire(...) __atomic_op_acquire(atomic_inc_return, __VA_ARGS__) > +# define atomic_inc_return_release(...) __atomic_op_release(atomic_inc_return, __VA_ARGS__) > # endif > #endif > > -/* atomic_sub_return_relaxed() et al: */ > - > #ifndef atomic_sub_return_relaxed > # define atomic_sub_return_relaxed atomic_sub_return > # define atomic_sub_return_acquire atomic_sub_return > # define atomic_sub_return_release atomic_sub_return > #else > -# ifndef atomic_sub_return_acquire > -# define atomic_sub_return_acquire(...) __atomic_op_acquire(atomic_sub_return, __VA_ARGS__) > -# endif > -# ifndef atomic_sub_return_release > -# define atomic_sub_return_release(...) __atomic_op_release(atomic_sub_return, __VA_ARGS__) > -# endif > # ifndef atomic_sub_return > # define atomic_sub_return(...) __atomic_op_fence(atomic_sub_return, __VA_ARGS__) > +# define atomic_sub_return_acquire(...) __atomic_op_acquire(atomic_sub_return, __VA_ARGS__) > +# define atomic_sub_return_release(...) __atomic_op_release(atomic_sub_return, __VA_ARGS__) > # endif > #endif > > -/* atomic_dec_return_relaxed() et al: */ > - > #ifndef atomic_dec_return_relaxed > # define atomic_dec_return_relaxed atomic_dec_return > # define atomic_dec_return_acquire atomic_dec_return > # define atomic_dec_return_release atomic_dec_return > #else > -# ifndef atomic_dec_return_acquire > -# define atomic_dec_return_acquire(...) __atomic_op_acquire(atomic_dec_return, __VA_ARGS__) > -# endif > -# ifndef atomic_dec_return_release > -# define atomic_dec_return_release(...) 
__atomic_op_release(atomic_dec_return, __VA_ARGS__) > -# endif > # ifndef atomic_dec_return > # define atomic_dec_return(...) __atomic_op_fence(atomic_dec_return, __VA_ARGS__) > +# define atomic_dec_return_acquire(...) __atomic_op_acquire(atomic_dec_return, __VA_ARGS__) > +# define atomic_dec_return_release(...) __atomic_op_release(atomic_dec_return, __VA_ARGS__) > # endif > #endif > > -/* atomic_fetch_add_relaxed() et al: */ > - > #ifndef atomic_fetch_add_relaxed > # define atomic_fetch_add_relaxed atomic_fetch_add > # define atomic_fetch_add_acquire atomic_fetch_add > # define atomic_fetch_add_release atomic_fetch_add > #else > -# ifndef atomic_fetch_add_acquire > -# define atomic_fetch_add_acquire(...) __atomic_op_acquire(atomic_fetch_add, __VA_ARGS__) > -# endif > -# ifndef atomic_fetch_add_release > -# define atomic_fetch_add_release(...) __atomic_op_release(atomic_fetch_add, __VA_ARGS__) > -# endif > # ifndef atomic_fetch_add > # define atomic_fetch_add(...) __atomic_op_fence(atomic_fetch_add, __VA_ARGS__) > +# define atomic_fetch_add_acquire(...) __atomic_op_acquire(atomic_fetch_add, __VA_ARGS__) > +# define atomic_fetch_add_release(...) __atomic_op_release(atomic_fetch_add, __VA_ARGS__) > # endif > #endif > > -/* atomic_fetch_inc_relaxed() et al: */ > - > #ifndef atomic_fetch_inc_relaxed > # ifndef atomic_fetch_inc > # define atomic_fetch_inc(v) atomic_fetch_add(1, (v)) > @@ -175,37 +143,25 @@ > # define atomic_fetch_inc_release atomic_fetch_inc > # endif > #else > -# ifndef atomic_fetch_inc_acquire > -# define atomic_fetch_inc_acquire(...) __atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__) > -# endif > -# ifndef atomic_fetch_inc_release > -# define atomic_fetch_inc_release(...) __atomic_op_release(atomic_fetch_inc, __VA_ARGS__) > -# endif > # ifndef atomic_fetch_inc > # define atomic_fetch_inc(...) __atomic_op_fence(atomic_fetch_inc, __VA_ARGS__) > +# define atomic_fetch_inc_acquire(...) __atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__) > +# define atomic_fetch_inc_release(...) __atomic_op_release(atomic_fetch_inc, __VA_ARGS__) > # endif > #endif > > -/* atomic_fetch_sub_relaxed() et al: */ > - > #ifndef atomic_fetch_sub_relaxed > # define atomic_fetch_sub_relaxed atomic_fetch_sub > # define atomic_fetch_sub_acquire atomic_fetch_sub > # define atomic_fetch_sub_release atomic_fetch_sub > #else > -# ifndef atomic_fetch_sub_acquire > -# define atomic_fetch_sub_acquire(...) __atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__) > -# endif > -# ifndef atomic_fetch_sub_release > -# define atomic_fetch_sub_release(...) __atomic_op_release(atomic_fetch_sub, __VA_ARGS__) > -# endif > # ifndef atomic_fetch_sub > # define atomic_fetch_sub(...) __atomic_op_fence(atomic_fetch_sub, __VA_ARGS__) > +# define atomic_fetch_sub_acquire(...) __atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__) > +# define atomic_fetch_sub_release(...) __atomic_op_release(atomic_fetch_sub, __VA_ARGS__) > # endif > #endif > > -/* atomic_fetch_dec_relaxed() et al: */ > - > #ifndef atomic_fetch_dec_relaxed > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > @@ -218,127 +174,86 @@ > # define atomic_fetch_dec_release atomic_fetch_dec > # endif > #else > -# ifndef atomic_fetch_dec_acquire > -# define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > -# endif > -# ifndef atomic_fetch_dec_release > -# define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > -# endif > # ifndef atomic_fetch_dec > # define atomic_fetch_dec(...) 
__atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > +# define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > +# define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > # endif > #endif > > -/* atomic_fetch_or_relaxed() et al: */ > - > #ifndef atomic_fetch_or_relaxed > # define atomic_fetch_or_relaxed atomic_fetch_or > # define atomic_fetch_or_acquire atomic_fetch_or > # define atomic_fetch_or_release atomic_fetch_or > #else > -# ifndef atomic_fetch_or_acquire > -# define atomic_fetch_or_acquire(...) __atomic_op_acquire(atomic_fetch_or, __VA_ARGS__) > -# endif > -# ifndef atomic_fetch_or_release > -# define atomic_fetch_or_release(...) __atomic_op_release(atomic_fetch_or, __VA_ARGS__) > -# endif > # ifndef atomic_fetch_or > # define atomic_fetch_or(...) __atomic_op_fence(atomic_fetch_or, __VA_ARGS__) > +# define atomic_fetch_or_acquire(...) __atomic_op_acquire(atomic_fetch_or, __VA_ARGS__) > +# define atomic_fetch_or_release(...) __atomic_op_release(atomic_fetch_or, __VA_ARGS__) > # endif > #endif > > -/* atomic_fetch_and_relaxed() et al: */ > - > #ifndef atomic_fetch_and_relaxed > # define atomic_fetch_and_relaxed atomic_fetch_and > # define atomic_fetch_and_acquire atomic_fetch_and > # define atomic_fetch_and_release atomic_fetch_and > #else > -# ifndef atomic_fetch_and_acquire > -# define atomic_fetch_and_acquire(...) __atomic_op_acquire(atomic_fetch_and, __VA_ARGS__) > -# endif > -# ifndef atomic_fetch_and_release > -# define atomic_fetch_and_release(...) __atomic_op_release(atomic_fetch_and, __VA_ARGS__) > -# endif > # ifndef atomic_fetch_and > # define atomic_fetch_and(...) __atomic_op_fence(atomic_fetch_and, __VA_ARGS__) > +# define atomic_fetch_and_acquire(...) __atomic_op_acquire(atomic_fetch_and, __VA_ARGS__) > +# define atomic_fetch_and_release(...) __atomic_op_release(atomic_fetch_and, __VA_ARGS__) > # endif > #endif > > #ifdef atomic_andnot > > -/* atomic_fetch_andnot_relaxed() et al: */ > - > #ifndef atomic_fetch_andnot_relaxed > # define atomic_fetch_andnot_relaxed atomic_fetch_andnot > # define atomic_fetch_andnot_acquire atomic_fetch_andnot > # define atomic_fetch_andnot_release atomic_fetch_andnot > #else > -# ifndef atomic_fetch_andnot_acquire > -# define atomic_fetch_andnot_acquire(...) __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__) > -# endif > -# ifndef atomic_fetch_andnot_release > -# define atomic_fetch_andnot_release(...) __atomic_op_release(atomic_fetch_andnot, __VA_ARGS__) > -# endif > # ifndef atomic_fetch_andnot > # define atomic_fetch_andnot(...) __atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__) > +# define atomic_fetch_andnot_acquire(...) __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__) > +# define atomic_fetch_andnot_release(...) __atomic_op_release(atomic_fetch_andnot, __VA_ARGS__) > # endif > #endif > > #endif /* atomic_andnot */ > > -/* atomic_fetch_xor_relaxed() et al: */ > - > #ifndef atomic_fetch_xor_relaxed > # define atomic_fetch_xor_relaxed atomic_fetch_xor > # define atomic_fetch_xor_acquire atomic_fetch_xor > # define atomic_fetch_xor_release atomic_fetch_xor > #else > -# ifndef atomic_fetch_xor_acquire > -# define atomic_fetch_xor_acquire(...) __atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__) > -# endif > -# ifndef atomic_fetch_xor_release > -# define atomic_fetch_xor_release(...) __atomic_op_release(atomic_fetch_xor, __VA_ARGS__) > -# endif > # ifndef atomic_fetch_xor > # define atomic_fetch_xor(...) 
__atomic_op_fence(atomic_fetch_xor, __VA_ARGS__) > +# define atomic_fetch_xor_acquire(...) __atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__) > +# define atomic_fetch_xor_release(...) __atomic_op_release(atomic_fetch_xor, __VA_ARGS__) > # endif > #endif > > - > -/* atomic_xchg_relaxed() et al: */ > - > #ifndef atomic_xchg_relaxed > #define atomic_xchg_relaxed atomic_xchg > #define atomic_xchg_acquire atomic_xchg > #define atomic_xchg_release atomic_xchg > #else > -# ifndef atomic_xchg_acquire > -# define atomic_xchg_acquire(...) __atomic_op_acquire(atomic_xchg, __VA_ARGS__) > -# endif > -# ifndef atomic_xchg_release > -# define atomic_xchg_release(...) __atomic_op_release(atomic_xchg, __VA_ARGS__) > -# endif > # ifndef atomic_xchg > # define atomic_xchg(...) __atomic_op_fence(atomic_xchg, __VA_ARGS__) > +# define atomic_xchg_acquire(...) __atomic_op_acquire(atomic_xchg, __VA_ARGS__) > +# define atomic_xchg_release(...) __atomic_op_release(atomic_xchg, __VA_ARGS__) > # endif > #endif > > -/* atomic_cmpxchg_relaxed() et al: */ > - > #ifndef atomic_cmpxchg_relaxed > # define atomic_cmpxchg_relaxed atomic_cmpxchg > # define atomic_cmpxchg_acquire atomic_cmpxchg > # define atomic_cmpxchg_release atomic_cmpxchg > #else > -# ifndef atomic_cmpxchg_acquire > -# define atomic_cmpxchg_acquire(...) __atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__) > -# endif > -# ifndef atomic_cmpxchg_release > -# define atomic_cmpxchg_release(...) __atomic_op_release(atomic_cmpxchg, __VA_ARGS__) > -# endif > # ifndef atomic_cmpxchg > # define atomic_cmpxchg(...) __atomic_op_fence(atomic_cmpxchg, __VA_ARGS__) > +# define atomic_cmpxchg_acquire(...) __atomic_op_acquire(atomic_cmpxchg, __VA_ARGS__) > +# define atomic_cmpxchg_release(...) __atomic_op_release(atomic_cmpxchg, __VA_ARGS__) > # endif > #endif > > @@ -362,57 +277,39 @@ > # define atomic_try_cmpxchg_release atomic_try_cmpxchg > #endif > > -/* cmpxchg_relaxed() et al: */ > - > #ifndef cmpxchg_relaxed > # define cmpxchg_relaxed cmpxchg > # define cmpxchg_acquire cmpxchg > # define cmpxchg_release cmpxchg > #else > -# ifndef cmpxchg_acquire > -# define cmpxchg_acquire(...) __atomic_op_acquire(cmpxchg, __VA_ARGS__) > -# endif > -# ifndef cmpxchg_release > -# define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__) > -# endif > # ifndef cmpxchg > # define cmpxchg(...) __atomic_op_fence(cmpxchg, __VA_ARGS__) > +# define cmpxchg_acquire(...) __atomic_op_acquire(cmpxchg, __VA_ARGS__) > +# define cmpxchg_release(...) __atomic_op_release(cmpxchg, __VA_ARGS__) > # endif > #endif > > -/* cmpxchg64_relaxed() et al: */ > - > #ifndef cmpxchg64_relaxed > # define cmpxchg64_relaxed cmpxchg64 > # define cmpxchg64_acquire cmpxchg64 > # define cmpxchg64_release cmpxchg64 > #else > -# ifndef cmpxchg64_acquire > -# define cmpxchg64_acquire(...) __atomic_op_acquire(cmpxchg64, __VA_ARGS__) > -# endif > -# ifndef cmpxchg64_release > -# define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__) > -# endif > # ifndef cmpxchg64 > # define cmpxchg64(...) __atomic_op_fence(cmpxchg64, __VA_ARGS__) > +# define cmpxchg64_acquire(...) __atomic_op_acquire(cmpxchg64, __VA_ARGS__) > +# define cmpxchg64_release(...) __atomic_op_release(cmpxchg64, __VA_ARGS__) > # endif > #endif > > -/* xchg_relaxed() et al: */ > - > #ifndef xchg_relaxed > # define xchg_relaxed xchg > # define xchg_acquire xchg > # define xchg_release xchg > #else > -# ifndef xchg_acquire > -# define xchg_acquire(...) 
__atomic_op_acquire(xchg, __VA_ARGS__) > -# endif > -# ifndef xchg_release > -# define xchg_release(...) __atomic_op_release(xchg, __VA_ARGS__) > -# endif > # ifndef xchg > # define xchg(...) __atomic_op_fence(xchg, __VA_ARGS__) > +# define xchg_acquire(...) __atomic_op_acquire(xchg, __VA_ARGS__) > +# define xchg_release(...) __atomic_op_release(xchg, __VA_ARGS__) > # endif > #endif > > @@ -569,98 +466,66 @@ static inline int atomic_dec_if_positive(atomic_t *v) > # define atomic64_set_release(v, i) smp_store_release(&(v)->counter, (i)) > #endif > > -/* atomic64_add_return_relaxed() et al: */ > - > #ifndef atomic64_add_return_relaxed > # define atomic64_add_return_relaxed atomic64_add_return > # define atomic64_add_return_acquire atomic64_add_return > # define atomic64_add_return_release atomic64_add_return > #else > -# ifndef atomic64_add_return_acquire > -# define atomic64_add_return_acquire(...) __atomic_op_acquire(atomic64_add_return, __VA_ARGS__) > -# endif > -# ifndef atomic64_add_return_release > -# define atomic64_add_return_release(...) __atomic_op_release(atomic64_add_return, __VA_ARGS__) > -# endif > # ifndef atomic64_add_return > # define atomic64_add_return(...) __atomic_op_fence(atomic64_add_return, __VA_ARGS__) > +# define atomic64_add_return_acquire(...) __atomic_op_acquire(atomic64_add_return, __VA_ARGS__) > +# define atomic64_add_return_release(...) __atomic_op_release(atomic64_add_return, __VA_ARGS__) > # endif > #endif > > -/* atomic64_inc_return_relaxed() et al: */ > - > #ifndef atomic64_inc_return_relaxed > # define atomic64_inc_return_relaxed atomic64_inc_return > # define atomic64_inc_return_acquire atomic64_inc_return > # define atomic64_inc_return_release atomic64_inc_return > #else > -# ifndef atomic64_inc_return_acquire > -# define atomic64_inc_return_acquire(...) __atomic_op_acquire(atomic64_inc_return, __VA_ARGS__) > -# endif > -# ifndef atomic64_inc_return_release > -# define atomic64_inc_return_release(...) __atomic_op_release(atomic64_inc_return, __VA_ARGS__) > -# endif > # ifndef atomic64_inc_return > # define atomic64_inc_return(...) __atomic_op_fence(atomic64_inc_return, __VA_ARGS__) > +# define atomic64_inc_return_acquire(...) __atomic_op_acquire(atomic64_inc_return, __VA_ARGS__) > +# define atomic64_inc_return_release(...) __atomic_op_release(atomic64_inc_return, __VA_ARGS__) > # endif > #endif > > -/* atomic64_sub_return_relaxed() et al: */ > - > #ifndef atomic64_sub_return_relaxed > # define atomic64_sub_return_relaxed atomic64_sub_return > # define atomic64_sub_return_acquire atomic64_sub_return > # define atomic64_sub_return_release atomic64_sub_return > #else > -# ifndef atomic64_sub_return_acquire > -# define atomic64_sub_return_acquire(...) __atomic_op_acquire(atomic64_sub_return, __VA_ARGS__) > -# endif > -# ifndef atomic64_sub_return_release > -# define atomic64_sub_return_release(...) __atomic_op_release(atomic64_sub_return, __VA_ARGS__) > -# endif > # ifndef atomic64_sub_return > # define atomic64_sub_return(...) __atomic_op_fence(atomic64_sub_return, __VA_ARGS__) > +# define atomic64_sub_return_acquire(...) __atomic_op_acquire(atomic64_sub_return, __VA_ARGS__) > +# define atomic64_sub_return_release(...) 
__atomic_op_release(atomic64_sub_return, __VA_ARGS__) > # endif > #endif > > -/* atomic64_dec_return_relaxed() et al: */ > - > #ifndef atomic64_dec_return_relaxed > # define atomic64_dec_return_relaxed atomic64_dec_return > # define atomic64_dec_return_acquire atomic64_dec_return > # define atomic64_dec_return_release atomic64_dec_return > #else > -# ifndef atomic64_dec_return_acquire > -# define atomic64_dec_return_acquire(...) __atomic_op_acquire(atomic64_dec_return, __VA_ARGS__) > -# endif > -# ifndef atomic64_dec_return_release > -# define atomic64_dec_return_release(...) __atomic_op_release(atomic64_dec_return, __VA_ARGS__) > -# endif > # ifndef atomic64_dec_return > # define atomic64_dec_return(...) __atomic_op_fence(atomic64_dec_return, __VA_ARGS__) > +# define atomic64_dec_return_acquire(...) __atomic_op_acquire(atomic64_dec_return, __VA_ARGS__) > +# define atomic64_dec_return_release(...) __atomic_op_release(atomic64_dec_return, __VA_ARGS__) > # endif > #endif > > -/* atomic64_fetch_add_relaxed() et al: */ > - > #ifndef atomic64_fetch_add_relaxed > # define atomic64_fetch_add_relaxed atomic64_fetch_add > # define atomic64_fetch_add_acquire atomic64_fetch_add > # define atomic64_fetch_add_release atomic64_fetch_add > #else > -# ifndef atomic64_fetch_add_acquire > -# define atomic64_fetch_add_acquire(...) __atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__) > -# endif > -# ifndef atomic64_fetch_add_release > -# define atomic64_fetch_add_release(...) __atomic_op_release(atomic64_fetch_add, __VA_ARGS__) > -# endif > # ifndef atomic64_fetch_add > # define atomic64_fetch_add(...) __atomic_op_fence(atomic64_fetch_add, __VA_ARGS__) > +# define atomic64_fetch_add_acquire(...) __atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__) > +# define atomic64_fetch_add_release(...) __atomic_op_release(atomic64_fetch_add, __VA_ARGS__) > # endif > #endif > > -/* atomic64_fetch_inc_relaxed() et al: */ > - > #ifndef atomic64_fetch_inc_relaxed > # ifndef atomic64_fetch_inc > # define atomic64_fetch_inc(v) atomic64_fetch_add(1, (v)) > @@ -673,37 +538,25 @@ static inline int atomic_dec_if_positive(atomic_t *v) > # define atomic64_fetch_inc_release atomic64_fetch_inc > # endif > #else > -# ifndef atomic64_fetch_inc_acquire > -# define atomic64_fetch_inc_acquire(...) __atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__) > -# endif > -# ifndef atomic64_fetch_inc_release > -# define atomic64_fetch_inc_release(...) __atomic_op_release(atomic64_fetch_inc, __VA_ARGS__) > -# endif > # ifndef atomic64_fetch_inc > # define atomic64_fetch_inc(...) __atomic_op_fence(atomic64_fetch_inc, __VA_ARGS__) > +# define atomic64_fetch_inc_acquire(...) __atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__) > +# define atomic64_fetch_inc_release(...) __atomic_op_release(atomic64_fetch_inc, __VA_ARGS__) > # endif > #endif > > -/* atomic64_fetch_sub_relaxed() et al: */ > - > #ifndef atomic64_fetch_sub_relaxed > # define atomic64_fetch_sub_relaxed atomic64_fetch_sub > # define atomic64_fetch_sub_acquire atomic64_fetch_sub > # define atomic64_fetch_sub_release atomic64_fetch_sub > #else > -# ifndef atomic64_fetch_sub_acquire > -# define atomic64_fetch_sub_acquire(...) __atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__) > -# endif > -# ifndef atomic64_fetch_sub_release > -# define atomic64_fetch_sub_release(...) __atomic_op_release(atomic64_fetch_sub, __VA_ARGS__) > -# endif > # ifndef atomic64_fetch_sub > # define atomic64_fetch_sub(...) __atomic_op_fence(atomic64_fetch_sub, __VA_ARGS__) > +# define atomic64_fetch_sub_acquire(...) 
__atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__) > +# define atomic64_fetch_sub_release(...) __atomic_op_release(atomic64_fetch_sub, __VA_ARGS__) > # endif > #endif > > -/* atomic64_fetch_dec_relaxed() et al: */ > - > #ifndef atomic64_fetch_dec_relaxed > # ifndef atomic64_fetch_dec > # define atomic64_fetch_dec(v) atomic64_fetch_sub(1, (v)) > @@ -716,127 +569,86 @@ static inline int atomic_dec_if_positive(atomic_t *v) > # define atomic64_fetch_dec_release atomic64_fetch_dec > # endif > #else > -# ifndef atomic64_fetch_dec_acquire > -# define atomic64_fetch_dec_acquire(...) __atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__) > -# endif > -# ifndef atomic64_fetch_dec_release > -# define atomic64_fetch_dec_release(...) __atomic_op_release(atomic64_fetch_dec, __VA_ARGS__) > -# endif > # ifndef atomic64_fetch_dec > # define atomic64_fetch_dec(...) __atomic_op_fence(atomic64_fetch_dec, __VA_ARGS__) > +# define atomic64_fetch_dec_acquire(...) __atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__) > +# define atomic64_fetch_dec_release(...) __atomic_op_release(atomic64_fetch_dec, __VA_ARGS__) > # endif > #endif > > -/* atomic64_fetch_or_relaxed() et al: */ > - > #ifndef atomic64_fetch_or_relaxed > # define atomic64_fetch_or_relaxed atomic64_fetch_or > # define atomic64_fetch_or_acquire atomic64_fetch_or > # define atomic64_fetch_or_release atomic64_fetch_or > #else > -# ifndef atomic64_fetch_or_acquire > -# define atomic64_fetch_or_acquire(...) __atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__) > -# endif > -# ifndef atomic64_fetch_or_release > -# define atomic64_fetch_or_release(...) __atomic_op_release(atomic64_fetch_or, __VA_ARGS__) > -# endif > # ifndef atomic64_fetch_or > # define atomic64_fetch_or(...) __atomic_op_fence(atomic64_fetch_or, __VA_ARGS__) > +# define atomic64_fetch_or_acquire(...) __atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__) > +# define atomic64_fetch_or_release(...) __atomic_op_release(atomic64_fetch_or, __VA_ARGS__) > # endif > #endif > > - > -/* atomic64_fetch_and_relaxed() et al: */ > - > #ifndef atomic64_fetch_and_relaxed > # define atomic64_fetch_and_relaxed atomic64_fetch_and > # define atomic64_fetch_and_acquire atomic64_fetch_and > # define atomic64_fetch_and_release atomic64_fetch_and > #else > -# ifndef atomic64_fetch_and_acquire > -# define atomic64_fetch_and_acquire(...) __atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__) > -# endif > -# ifndef atomic64_fetch_and_release > -# define atomic64_fetch_and_release(...) __atomic_op_release(atomic64_fetch_and, __VA_ARGS__) > -# endif > # ifndef atomic64_fetch_and > # define atomic64_fetch_and(...) __atomic_op_fence(atomic64_fetch_and, __VA_ARGS__) > +# define atomic64_fetch_and_acquire(...) __atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__) > +# define atomic64_fetch_and_release(...) __atomic_op_release(atomic64_fetch_and, __VA_ARGS__) > # endif > #endif > > #ifdef atomic64_andnot > > -/* atomic64_fetch_andnot_relaxed() et al: */ > - > #ifndef atomic64_fetch_andnot_relaxed > # define atomic64_fetch_andnot_relaxed atomic64_fetch_andnot > # define atomic64_fetch_andnot_acquire atomic64_fetch_andnot > # define atomic64_fetch_andnot_release atomic64_fetch_andnot > #else > -# ifndef atomic64_fetch_andnot_acquire > -# define atomic64_fetch_andnot_acquire(...) __atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__) > -# endif > -# ifndef atomic64_fetch_andnot_release > -# define atomic64_fetch_andnot_release(...) 
__atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__) > -# endif > # ifndef atomic64_fetch_andnot > # define atomic64_fetch_andnot(...) __atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__) > +# define atomic64_fetch_andnot_acquire(...) __atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__) > +# define atomic64_fetch_andnot_release(...) __atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__) > # endif > #endif > > #endif /* atomic64_andnot */ > > -/* atomic64_fetch_xor_relaxed() et al: */ > - > #ifndef atomic64_fetch_xor_relaxed > # define atomic64_fetch_xor_relaxed atomic64_fetch_xor > # define atomic64_fetch_xor_acquire atomic64_fetch_xor > # define atomic64_fetch_xor_release atomic64_fetch_xor > #else > -# ifndef atomic64_fetch_xor_acquire > -# define atomic64_fetch_xor_acquire(...) __atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__) > -# endif > -# ifndef atomic64_fetch_xor_release > -# define atomic64_fetch_xor_release(...) __atomic_op_release(atomic64_fetch_xor, __VA_ARGS__) > -# endif > # ifndef atomic64_fetch_xor > # define atomic64_fetch_xor(...) __atomic_op_fence(atomic64_fetch_xor, __VA_ARGS__) > +# define atomic64_fetch_xor_acquire(...) __atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__) > +# define atomic64_fetch_xor_release(...) __atomic_op_release(atomic64_fetch_xor, __VA_ARGS__) > # endif > #endif > > -/* atomic64_xchg_relaxed() et al: */ > - > #ifndef atomic64_xchg_relaxed > # define atomic64_xchg_relaxed atomic64_xchg > # define atomic64_xchg_acquire atomic64_xchg > # define atomic64_xchg_release atomic64_xchg > #else > -# ifndef atomic64_xchg_acquire > -# define atomic64_xchg_acquire(...) __atomic_op_acquire(atomic64_xchg, __VA_ARGS__) > -# endif > -# ifndef atomic64_xchg_release > -# define atomic64_xchg_release(...) __atomic_op_release(atomic64_xchg, __VA_ARGS__) > -# endif > # ifndef atomic64_xchg > # define atomic64_xchg(...) __atomic_op_fence(atomic64_xchg, __VA_ARGS__) > +# define atomic64_xchg_acquire(...) __atomic_op_acquire(atomic64_xchg, __VA_ARGS__) > +# define atomic64_xchg_release(...) __atomic_op_release(atomic64_xchg, __VA_ARGS__) > # endif > #endif > > -/* atomic64_cmpxchg_relaxed() et al: */ > - > #ifndef atomic64_cmpxchg_relaxed > # define atomic64_cmpxchg_relaxed atomic64_cmpxchg > # define atomic64_cmpxchg_acquire atomic64_cmpxchg > # define atomic64_cmpxchg_release atomic64_cmpxchg > #else > -# ifndef atomic64_cmpxchg_acquire > -# define atomic64_cmpxchg_acquire(...) __atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__) > -# endif > -# ifndef atomic64_cmpxchg_release > -# define atomic64_cmpxchg_release(...) __atomic_op_release(atomic64_cmpxchg, __VA_ARGS__) > -# endif > # ifndef atomic64_cmpxchg > # define atomic64_cmpxchg(...) __atomic_op_fence(atomic64_cmpxchg, __VA_ARGS__) > +# define atomic64_cmpxchg_acquire(...) __atomic_op_acquire(atomic64_cmpxchg, __VA_ARGS__) > +# define atomic64_cmpxchg_release(...) __atomic_op_release(atomic64_cmpxchg, __VA_ARGS__) > # endif > #endif >
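To make the RISC-V breakage Andrea reports at the top of the message above concrete, here is a hedged sketch of the preprocessor situation for one representative operation (atomic_fetch_add is used as the example; the arch's actual defines are quoted later in the thread):

/* Sketch only: the arch provides the _relaxed and the fully ordered forms itself ... */
#define atomic_fetch_add_relaxed	atomic_fetch_add_relaxed	/* arch implementation */
#define atomic_fetch_add		atomic_fetch_add		/* arch implementation */

/*
 * ... and previously relied on the generic header's separate
 * "#ifndef atomic_fetch_add_acquire" / "#ifndef atomic_fetch_add_release"
 * tests to fill in the remaining variants:
 *
 *	atomic_fetch_add_acquire(...) -> __atomic_op_acquire(atomic_fetch_add, ...)
 *	atomic_fetch_add_release(...) -> __atomic_op_release(atomic_fetch_add, ...)
 *
 * After the simplification, those two definitions sit under a single
 * "#ifndef atomic_fetch_add".  Since the arch already defines
 * atomic_fetch_add, that branch is skipped, the _acquire/_release
 * variants are never defined, and any caller of them fails to build.
 */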
Hi Ingo, > From f5efafa83af8c46b9e81b010b46caeeadb450179 Mon Sep 17 00:00:00 2001 > From: Ingo Molnar <mingo@kernel.org> > Date: Sat, 5 May 2018 10:46:41 +0200 > Subject: [PATCH] locking/atomics: Combine the atomic_andnot() and atomic64_andnot() API definitions > > The atomic_andnot() and atomic64_andnot() are defined in 4 separate groups > spred out in the atomic.h header: > > #ifdef atomic_andnot > ... > #endif /* atomic_andnot */ > ... > #ifndef atomic_andnot > ... > #endif > ... > #ifdef atomic64_andnot > ... > #endif /* atomic64_andnot */ > ... > #ifndef atomic64_andnot > ... > #endif > > Combine them into unify them into two groups: Nit: "Combine them into unify them into" Andrea > > #ifdef atomic_andnot > #else > #endif > > ... > > #ifdef atomic64_andnot > #else > #endif > > So that one API group is defined in a single place within the header. > > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Cc: Will Deacon <will.deacon@arm.com> > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Ingo Molnar <mingo@kernel.org> > --- > include/linux/atomic.h | 72 +++++++++++++++++++++++++------------------------- > 1 file changed, 36 insertions(+), 36 deletions(-) > > diff --git a/include/linux/atomic.h b/include/linux/atomic.h > index 352ecc72d7f5..1176cf7c6f03 100644 > --- a/include/linux/atomic.h > +++ b/include/linux/atomic.h > @@ -205,22 +205,6 @@ > # endif > #endif > > -#ifdef atomic_andnot > - > -#ifndef atomic_fetch_andnot_relaxed > -# define atomic_fetch_andnot_relaxed atomic_fetch_andnot > -# define atomic_fetch_andnot_acquire atomic_fetch_andnot > -# define atomic_fetch_andnot_release atomic_fetch_andnot > -#else > -# ifndef atomic_fetch_andnot > -# define atomic_fetch_andnot(...) __atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__) > -# define atomic_fetch_andnot_acquire(...) __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__) > -# define atomic_fetch_andnot_release(...) __atomic_op_release(atomic_fetch_andnot, __VA_ARGS__) > -# endif > -#endif > - > -#endif /* atomic_andnot */ > - > #ifndef atomic_fetch_xor_relaxed > # define atomic_fetch_xor_relaxed atomic_fetch_xor > # define atomic_fetch_xor_acquire atomic_fetch_xor > @@ -338,7 +322,22 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u) > # define atomic_inc_not_zero(v) atomic_add_unless((v), 1, 0) > #endif > > -#ifndef atomic_andnot > +#ifdef atomic_andnot > + > +#ifndef atomic_fetch_andnot_relaxed > +# define atomic_fetch_andnot_relaxed atomic_fetch_andnot > +# define atomic_fetch_andnot_acquire atomic_fetch_andnot > +# define atomic_fetch_andnot_release atomic_fetch_andnot > +#else > +# ifndef atomic_fetch_andnot > +# define atomic_fetch_andnot(...) __atomic_op_fence(atomic_fetch_andnot, __VA_ARGS__) > +# define atomic_fetch_andnot_acquire(...) __atomic_op_acquire(atomic_fetch_andnot, __VA_ARGS__) > +# define atomic_fetch_andnot_release(...) 
__atomic_op_release(atomic_fetch_andnot, __VA_ARGS__) > +# endif > +#endif > + > +#else /* !atomic_andnot: */ > + > static inline void atomic_andnot(int i, atomic_t *v) > { > atomic_and(~i, v); > @@ -363,7 +362,8 @@ static inline int atomic_fetch_andnot_release(int i, atomic_t *v) > { > return atomic_fetch_and_release(~i, v); > } > -#endif > + > +#endif /* !atomic_andnot */ > > /** > * atomic_inc_not_zero_hint - increment if not null > @@ -600,22 +600,6 @@ static inline int atomic_dec_if_positive(atomic_t *v) > # endif > #endif > > -#ifdef atomic64_andnot > - > -#ifndef atomic64_fetch_andnot_relaxed > -# define atomic64_fetch_andnot_relaxed atomic64_fetch_andnot > -# define atomic64_fetch_andnot_acquire atomic64_fetch_andnot > -# define atomic64_fetch_andnot_release atomic64_fetch_andnot > -#else > -# ifndef atomic64_fetch_andnot > -# define atomic64_fetch_andnot(...) __atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__) > -# define atomic64_fetch_andnot_acquire(...) __atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__) > -# define atomic64_fetch_andnot_release(...) __atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__) > -# endif > -#endif > - > -#endif /* atomic64_andnot */ > - > #ifndef atomic64_fetch_xor_relaxed > # define atomic64_fetch_xor_relaxed atomic64_fetch_xor > # define atomic64_fetch_xor_acquire atomic64_fetch_xor > @@ -672,7 +656,22 @@ static inline int atomic_dec_if_positive(atomic_t *v) > # define atomic64_try_cmpxchg_release atomic64_try_cmpxchg > #endif > > -#ifndef atomic64_andnot > +#ifdef atomic64_andnot > + > +#ifndef atomic64_fetch_andnot_relaxed > +# define atomic64_fetch_andnot_relaxed atomic64_fetch_andnot > +# define atomic64_fetch_andnot_acquire atomic64_fetch_andnot > +# define atomic64_fetch_andnot_release atomic64_fetch_andnot > +#else > +# ifndef atomic64_fetch_andnot > +# define atomic64_fetch_andnot(...) __atomic_op_fence(atomic64_fetch_andnot, __VA_ARGS__) > +# define atomic64_fetch_andnot_acquire(...) __atomic_op_acquire(atomic64_fetch_andnot, __VA_ARGS__) > +# define atomic64_fetch_andnot_release(...) __atomic_op_release(atomic64_fetch_andnot, __VA_ARGS__) > +# endif > +#endif > + > +#else /* !atomic64_andnot: */ > + > static inline void atomic64_andnot(long long i, atomic64_t *v) > { > atomic64_and(~i, v); > @@ -697,7 +696,8 @@ static inline long long atomic64_fetch_andnot_release(long long i, atomic64_t *v > { > return atomic64_fetch_and_release(~i, v); > } > -#endif > + > +#endif /* !atomic64_andnot */ > > #define atomic64_cond_read_relaxed(v, c) smp_cond_load_relaxed(&(v)->counter, (c)) > #define atomic64_cond_read_acquire(v, c) smp_cond_load_acquire(&(v)->counter, (c))
* Andrea Parri <andrea.parri@amarulasolutions.com> wrote: > Hi Ingo, > > > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001 > > From: Ingo Molnar <mingo@kernel.org> > > Date: Sat, 5 May 2018 10:23:23 +0200 > > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more > > > > Before: > > > > #ifndef atomic_fetch_dec_relaxed > > # ifndef atomic_fetch_dec > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > # else > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > # define atomic_fetch_dec_release atomic_fetch_dec > > # endif > > #else > > # ifndef atomic_fetch_dec_acquire > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > # endif > > # ifndef atomic_fetch_dec_release > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > # endif > > # ifndef atomic_fetch_dec > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > # endif > > #endif > > > > After: > > > > #ifndef atomic_fetch_dec_relaxed > > # ifndef atomic_fetch_dec > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > # else > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > # define atomic_fetch_dec_release atomic_fetch_dec > > # endif > > #else > > # ifndef atomic_fetch_dec > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > # endif > > #endif > > > > The idea is that because we already group these APIs by certain defines > > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary > > branches - we can do the same in the secondary branch as well. > > > > ( Also remove some unnecessarily duplicate comments, as the API > > group defines are now pretty much self-documenting. ) > > > > No change in functionality. > > > > Cc: Peter Zijlstra <peterz@infradead.org> > > Cc: Linus Torvalds <torvalds@linux-foundation.org> > > Cc: Andrew Morton <akpm@linux-foundation.org> > > Cc: Thomas Gleixner <tglx@linutronix.de> > > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Cc: Will Deacon <will.deacon@arm.com> > > Cc: linux-kernel@vger.kernel.org > > Signed-off-by: Ingo Molnar <mingo@kernel.org> > > This breaks compilation on RISC-V. (For some of its atomics, the arch > currently defines the _relaxed and the full variants and it relies on > the generic definitions for the _acquire and the _release variants.) I don't have cross-compilation for RISC-V, which is a relatively new arch. (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?) 
Could you please send a patch that defines those variants against Linus's tree, like the PowerPC patch that does something similar: 0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs ? ... and I'll integrate it into the proper place to make it all bisectable, etc. Thanks, Ingo
On Sun, May 6, 2018, at 8:11 PM, Ingo Molnar wrote: > > * Boqun Feng <boqun.feng@gmail.com> wrote: > > > > The only change I made beyond a trivial build fix is that I also added the release > > > atomics variants explicitly: > > > > > > +#define atomic_cmpxchg_release(v, o, n) \ > > > + cmpxchg_release(&((v)->counter), (o), (n)) > > > +#define atomic64_cmpxchg_release(v, o, n) \ > > > + cmpxchg_release(&((v)->counter), (o), (n)) > > > > > > It has passed a PowerPC cross-build test here, but no runtime tests. > > > > > > > Do you have the commit at any branch in tip tree? I could pull it and > > cross-build and check the assembly code of lib/atomic64_test.c, that way > > I could verify whether we mess something up. > > > > > Does this patch look good to you? > > > > > > > Yep! > > Great - I have pushed the commits out into the locking tree, they can be > found in: > > git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git > locking/core > Thanks! My compile test told me that we need to remove the definitions of atomic_xchg and atomic64_xchg in ppc's asm/atomic.h: they are now duplicates and will prevent the generation of _release and _acquire in the new logic. If you need an updated patch for this from me, I can send one later today. (I don't have a handy environment for patch sending now, so...) Other than this, the modification looks fine: lib/atomic64_test.c generated the same asm before and after the patches. Regards, Boqun > The PowerPC preparatory commit from you is: > > 0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/ > cmpxchg.h and define the full set of cmpxchg APIs > > Thanks, > > Ingo
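For reference, a hedged sketch of the duplicate-definition problem Boqun describes above, using atomic_xchg as the example; the arch defines shown here are taken from the PowerPC atomic.h context quoted earlier in the thread:

/* arch/powerpc/include/asm/atomic.h currently has both: */
#define atomic_xchg(v, new)		(xchg(&((v)->counter), new))
#define atomic_xchg_relaxed(v, new)	xchg_relaxed(&((v)->counter), (new))

/*
 * In the branch taken when the arch defines the _relaxed form, the
 * regrouped <linux/atomic.h> now generates the remaining variants only
 * when the full version is *not* already defined:
 *
 *	# ifndef atomic_xchg
 *	#  define atomic_xchg(...)	   __atomic_op_fence(atomic_xchg, __VA_ARGS__)
 *	#  define atomic_xchg_acquire(...) __atomic_op_acquire(atomic_xchg, __VA_ARGS__)
 *	#  define atomic_xchg_release(...) __atomic_op_release(atomic_xchg, __VA_ARGS__)
 *	# endif
 *
 * With the arch's full atomic_xchg still present, that block is skipped
 * and atomic_xchg_acquire()/atomic_xchg_release() are never generated.
 * Dropping the arch's full definition and keeping only the _relaxed one
 * lets the generic header build all three variants, which is the
 * removal Boqun suggests.
 */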
* Boqun Feng <boqun.feng@gmail.com> wrote: > > > On Sun, May 6, 2018, at 8:11 PM, Ingo Molnar wrote: > > > > * Boqun Feng <boqun.feng@gmail.com> wrote: > > > > > > The only change I made beyond a trivial build fix is that I also added the release > > > > atomics variants explicitly: > > > > > > > > +#define atomic_cmpxchg_release(v, o, n) \ > > > > + cmpxchg_release(&((v)->counter), (o), (n)) > > > > +#define atomic64_cmpxchg_release(v, o, n) \ > > > > + cmpxchg_release(&((v)->counter), (o), (n)) > > > > > > > > It has passed a PowerPC cross-build test here, but no runtime tests. > > > > > > > > > > Do you have the commit at any branch in tip tree? I could pull it and > > > cross-build and check the assembly code of lib/atomic64_test.c, that way > > > I could verify whether we mess something up. > > > > > > > Does this patch look good to you? > > > > > > > > > > Yep! > > > > Great - I have pushed the commits out into the locking tree, they can be > > found in: > > > > git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git > > locking/core > > > > Thanks! My compile test told me that we need to remove the definitions of > atomic_xchg and atomic64_xchg in ppc's asm/atomic.h: they are now > duplicate, and will prevent the generation of _release and _acquire in the > new logic. > > If you need a updated patch for this from me, I could send later today. > (I don't have a handy environment for patch sending now, so...) That would be cool, thanks! My own cross-build testing didn't trigger that build failure. > Other than this, the modification looks fine, the lib/atomic64_test.c > generated the same asm before and after the patches. Cool, thanks for checking! Thanks, Ingo
On Sun, May 06, 2018 at 04:57:27PM +0200, Ingo Molnar wrote: > > * Andrea Parri <andrea.parri@amarulasolutions.com> wrote: > > > Hi Ingo, > > > > > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001 > > > From: Ingo Molnar <mingo@kernel.org> > > > Date: Sat, 5 May 2018 10:23:23 +0200 > > > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more > > > > > > Before: > > > > > > #ifndef atomic_fetch_dec_relaxed > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > > # else > > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > > # define atomic_fetch_dec_release atomic_fetch_dec > > > # endif > > > #else > > > # ifndef atomic_fetch_dec_acquire > > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > # ifndef atomic_fetch_dec_release > > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > #endif > > > > > > After: > > > > > > #ifndef atomic_fetch_dec_relaxed > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) > > > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) > > > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) > > > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) > > > # else > > > # define atomic_fetch_dec_relaxed atomic_fetch_dec > > > # define atomic_fetch_dec_acquire atomic_fetch_dec > > > # define atomic_fetch_dec_release atomic_fetch_dec > > > # endif > > > #else > > > # ifndef atomic_fetch_dec > > > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) > > > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) > > > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) > > > # endif > > > #endif > > > > > > The idea is that because we already group these APIs by certain defines > > > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary > > > branches - we can do the same in the secondary branch as well. > > > > > > ( Also remove some unnecessarily duplicate comments, as the API > > > group defines are now pretty much self-documenting. ) > > > > > > No change in functionality. > > > > > > Cc: Peter Zijlstra <peterz@infradead.org> > > > Cc: Linus Torvalds <torvalds@linux-foundation.org> > > > Cc: Andrew Morton <akpm@linux-foundation.org> > > > Cc: Thomas Gleixner <tglx@linutronix.de> > > > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > > Cc: Will Deacon <will.deacon@arm.com> > > > Cc: linux-kernel@vger.kernel.org > > > Signed-off-by: Ingo Molnar <mingo@kernel.org> > > > > This breaks compilation on RISC-V. (For some of its atomics, the arch > > currently defines the _relaxed and the full variants and it relies on > > the generic definitions for the _acquire and the _release variants.) > > I don't have cross-compilation for RISC-V, which is a relatively new arch. > (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?) 
I'm using the toolchain from: https://riscv.org/software-tools/ (adding Palmer and Albert in Cc:) > > Could you please send a patch that defines those variants against Linus's tree, > like the PowerPC patch that does something similar: > > 0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs > > ? Yes, please see below for a first RFC. (BTW, get_maintainer.pl says that that patch missed Benjamin, Paul, Michael and linuxppc-dev@lists.ozlabs.org: FWIW, I'm Cc-ing the maintainers here.) Andrea From 411f05a44e0b53a435331b977ff864fba7501a95 Mon Sep 17 00:00:00 2001 From: Andrea Parri <andrea.parri@amarulasolutions.com> Date: Mon, 7 May 2018 10:59:20 +0200 Subject: [RFC PATCH] riscv/atomic: Defines _acquire/_release variants In preparation for Ingo's renovation of the generic atomic.h header [1], define the _acquire/_release variants in the arch's header. No change in code generation. [1] http://lkml.kernel.org/r/20180505081100.nsyrqrpzq2vd27bk@gmail.com http://lkml.kernel.org/r/20180505083635.622xmcvb42dw5xxh@gmail.com Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Albert Ou <albert@sifive.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: linux-riscv@lists.infradead.org --- arch/riscv/include/asm/atomic.h | 88 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 88 insertions(+) diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h index 855115ace98c8..7cbd8033dfb5d 100644 --- a/arch/riscv/include/asm/atomic.h +++ b/arch/riscv/include/asm/atomic.h @@ -153,22 +153,54 @@ ATOMIC_OPS(sub, add, +, -i) #define atomic_add_return_relaxed atomic_add_return_relaxed #define atomic_sub_return_relaxed atomic_sub_return_relaxed +#define atomic_add_return_acquire(...) \ + __atomic_op_acquire(atomic_add_return, __VA_ARGS__) +#define atomic_sub_return_acquire(...) \ + __atomic_op_acquire(atomic_sub_return, __VA_ARGS__) +#define atomic_add_return_release(...) \ + __atomic_op_release(atomic_add_return, __VA_ARGS__) +#define atomic_sub_return_release(...) \ + __atomic_op_release(atomic_sub_return, __VA_ARGS__) #define atomic_add_return atomic_add_return #define atomic_sub_return atomic_sub_return #define atomic_fetch_add_relaxed atomic_fetch_add_relaxed #define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed +#define atomic_fetch_add_acquire(...) \ + __atomic_op_acquire(atomic_fetch_add, __VA_ARGS__) +#define atomic_fetch_sub_acquire(...) \ + __atomic_op_acquire(atomic_fetch_sub, __VA_ARGS__) +#define atomic_fetch_add_release(...) \ + __atomic_op_release(atomic_fetch_add, __VA_ARGS__) +#define atomic_fetch_sub_release(...) \ + __atomic_op_release(atomic_fetch_sub, __VA_ARGS__) #define atomic_fetch_add atomic_fetch_add #define atomic_fetch_sub atomic_fetch_sub #ifndef CONFIG_GENERIC_ATOMIC64 #define atomic64_add_return_relaxed atomic64_add_return_relaxed #define atomic64_sub_return_relaxed atomic64_sub_return_relaxed +#define atomic64_add_return_acquire(...) \ + __atomic_op_acquire(atomic64_add_return, __VA_ARGS__) +#define atomic64_sub_return_acquire(...) \ + __atomic_op_acquire(atomic64_sub_return, __VA_ARGS__) +#define atomic64_add_return_release(...) \ + __atomic_op_release(atomic64_add_return, __VA_ARGS__) +#define atomic64_sub_return_release(...) 
\ + __atomic_op_release(atomic64_sub_return, __VA_ARGS__) #define atomic64_add_return atomic64_add_return #define atomic64_sub_return atomic64_sub_return #define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed #define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed +#define atomic64_fetch_add_acquire(...) \ + __atomic_op_acquire(atomic64_fetch_add, __VA_ARGS__) +#define atomic64_fetch_sub_acquire(...) \ + __atomic_op_acquire(atomic64_fetch_sub, __VA_ARGS__) +#define atomic64_fetch_add_release(...) \ + __atomic_op_release(atomic64_fetch_add, __VA_ARGS__) +#define atomic64_fetch_sub_release(...) \ + __atomic_op_release(atomic64_fetch_sub, __VA_ARGS__) #define atomic64_fetch_add atomic64_fetch_add #define atomic64_fetch_sub atomic64_fetch_sub #endif @@ -191,6 +223,18 @@ ATOMIC_OPS(xor, xor, i) #define atomic_fetch_and_relaxed atomic_fetch_and_relaxed #define atomic_fetch_or_relaxed atomic_fetch_or_relaxed #define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed +#define atomic_fetch_and_acquire(...) \ + __atomic_op_acquire(atomic_fetch_and, __VA_ARGS__) +#define atomic_fetch_or_acquire(...) \ + __atomic_op_acquire(atomic_fetch_or, __VA_ARGS__) +#define atomic_fetch_xor_acquire(...) \ + __atomic_op_acquire(atomic_fetch_xor, __VA_ARGS__) +#define atomic_fetch_and_release(...) \ + __atomic_op_release(atomic_fetch_and, __VA_ARGS__) +#define atomic_fetch_or_release(...) \ + __atomic_op_release(atomic_fetch_or, __VA_ARGS__) +#define atomic_fetch_xor_release(...) \ + __atomic_op_release(atomic_fetch_xor, __VA_ARGS__) #define atomic_fetch_and atomic_fetch_and #define atomic_fetch_or atomic_fetch_or #define atomic_fetch_xor atomic_fetch_xor @@ -199,6 +243,18 @@ ATOMIC_OPS(xor, xor, i) #define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed #define atomic64_fetch_or_relaxed atomic64_fetch_or_relaxed #define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed +#define atomic64_fetch_and_acquire(...) \ + __atomic_op_acquire(atomic64_fetch_and, __VA_ARGS__) +#define atomic64_fetch_or_acquire(...) \ + __atomic_op_acquire(atomic64_fetch_or, __VA_ARGS__) +#define atomic64_fetch_xor_acquire(...) \ + __atomic_op_acquire(atomic64_fetch_xor, __VA_ARGS__) +#define atomic64_fetch_and_release(...) \ + __atomic_op_release(atomic64_fetch_and, __VA_ARGS__) +#define atomic64_fetch_or_release(...) \ + __atomic_op_release(atomic64_fetch_or, __VA_ARGS__) +#define atomic64_fetch_xor_release(...) \ + __atomic_op_release(atomic64_fetch_xor, __VA_ARGS__) #define atomic64_fetch_and atomic64_fetch_and #define atomic64_fetch_or atomic64_fetch_or #define atomic64_fetch_xor atomic64_fetch_xor @@ -290,22 +346,54 @@ ATOMIC_OPS(dec, add, +, -1) #define atomic_inc_return_relaxed atomic_inc_return_relaxed #define atomic_dec_return_relaxed atomic_dec_return_relaxed +#define atomic_inc_return_acquire(...) \ + __atomic_op_acquire(atomic_inc_return, __VA_ARGS__) +#define atomic_dec_return_acquire(...) \ + __atomic_op_acquire(atomic_dec_return, __VA_ARGS__) +#define atomic_inc_return_release(...) \ + __atomic_op_release(atomic_inc_return, __VA_ARGS__) +#define atomic_dec_return_release(...) \ + __atomic_op_release(atomic_dec_return, __VA_ARGS__) #define atomic_inc_return atomic_inc_return #define atomic_dec_return atomic_dec_return #define atomic_fetch_inc_relaxed atomic_fetch_inc_relaxed #define atomic_fetch_dec_relaxed atomic_fetch_dec_relaxed +#define atomic_fetch_inc_acquire(...) \ + __atomic_op_acquire(atomic_fetch_inc, __VA_ARGS__) +#define atomic_fetch_dec_acquire(...) 
\ + __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) +#define atomic_fetch_inc_release(...) \ + __atomic_op_release(atomic_fetch_inc, __VA_ARGS__) +#define atomic_fetch_dec_release(...) \ + __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) #define atomic_fetch_inc atomic_fetch_inc #define atomic_fetch_dec atomic_fetch_dec #ifndef CONFIG_GENERIC_ATOMIC64 #define atomic64_inc_return_relaxed atomic64_inc_return_relaxed #define atomic64_dec_return_relaxed atomic64_dec_return_relaxed +#define atomic64_inc_return_acquire(...) \ + __atomic_op_acquire(atomic64_inc_return, __VA_ARGS__) +#define atomic64_dec_return_acquire(...) \ + __atomic_op_acquire(atomic64_dec_return, __VA_ARGS__) +#define atomic64_inc_return_release(...) \ + __atomic_op_release(atomic64_inc_return, __VA_ARGS__) +#define atomic64_dec_return_release(...) \ + __atomic_op_release(atomic64_dec_return, __VA_ARGS__) #define atomic64_inc_return atomic64_inc_return #define atomic64_dec_return atomic64_dec_return #define atomic64_fetch_inc_relaxed atomic64_fetch_inc_relaxed #define atomic64_fetch_dec_relaxed atomic64_fetch_dec_relaxed +#define atomic64_fetch_inc_acquire(...) \ + __atomic_op_acquire(atomic64_fetch_inc, __VA_ARGS__) +#define atomic64_fetch_dec_acquire(...) \ + __atomic_op_acquire(atomic64_fetch_dec, __VA_ARGS__) +#define atomic64_fetch_inc_release(...) \ + __atomic_op_release(atomic64_fetch_inc, __VA_ARGS__) +#define atomic64_fetch_dec_release(...) \ + __atomic_op_release(atomic64_fetch_dec, __VA_ARGS__) #define atomic64_fetch_inc atomic64_fetch_inc #define atomic64_fetch_dec atomic64_fetch_dec #endif -- 2.7.4 > > ... and I'll integrate it into the proper place to make it all bisectable, etc. > > Thanks, > > Ingo
On Sun, 06 May 2018 07:57:27 PDT (-0700), mingo@kernel.org wrote: > > * Andrea Parri <andrea.parri@amarulasolutions.com> wrote: > >> Hi Ingo, >> >> > From 5affbf7e91901143f84f1b2ca64f4afe70e210fd Mon Sep 17 00:00:00 2001 >> > From: Ingo Molnar <mingo@kernel.org> >> > Date: Sat, 5 May 2018 10:23:23 +0200 >> > Subject: [PATCH] locking/atomics: Simplify the op definitions in atomic.h some more >> > >> > Before: >> > >> > #ifndef atomic_fetch_dec_relaxed >> > # ifndef atomic_fetch_dec >> > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) >> > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) >> > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) >> > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) >> > # else >> > # define atomic_fetch_dec_relaxed atomic_fetch_dec >> > # define atomic_fetch_dec_acquire atomic_fetch_dec >> > # define atomic_fetch_dec_release atomic_fetch_dec >> > # endif >> > #else >> > # ifndef atomic_fetch_dec_acquire >> > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) >> > # endif >> > # ifndef atomic_fetch_dec_release >> > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) >> > # endif >> > # ifndef atomic_fetch_dec >> > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) >> > # endif >> > #endif >> > >> > After: >> > >> > #ifndef atomic_fetch_dec_relaxed >> > # ifndef atomic_fetch_dec >> > # define atomic_fetch_dec(v) atomic_fetch_sub(1, (v)) >> > # define atomic_fetch_dec_relaxed(v) atomic_fetch_sub_relaxed(1, (v)) >> > # define atomic_fetch_dec_acquire(v) atomic_fetch_sub_acquire(1, (v)) >> > # define atomic_fetch_dec_release(v) atomic_fetch_sub_release(1, (v)) >> > # else >> > # define atomic_fetch_dec_relaxed atomic_fetch_dec >> > # define atomic_fetch_dec_acquire atomic_fetch_dec >> > # define atomic_fetch_dec_release atomic_fetch_dec >> > # endif >> > #else >> > # ifndef atomic_fetch_dec >> > # define atomic_fetch_dec(...) __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__) >> > # define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__) >> > # define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__) >> > # endif >> > #endif >> > >> > The idea is that because we already group these APIs by certain defines >> > such as atomic_fetch_dec_relaxed and atomic_fetch_dec in the primary >> > branches - we can do the same in the secondary branch as well. >> > >> > ( Also remove some unnecessarily duplicate comments, as the API >> > group defines are now pretty much self-documenting. ) >> > >> > No change in functionality. >> > >> > Cc: Peter Zijlstra <peterz@infradead.org> >> > Cc: Linus Torvalds <torvalds@linux-foundation.org> >> > Cc: Andrew Morton <akpm@linux-foundation.org> >> > Cc: Thomas Gleixner <tglx@linutronix.de> >> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> >> > Cc: Will Deacon <will.deacon@arm.com> >> > Cc: linux-kernel@vger.kernel.org >> > Signed-off-by: Ingo Molnar <mingo@kernel.org> >> >> This breaks compilation on RISC-V. (For some of its atomics, the arch >> currently defines the _relaxed and the full variants and it relies on >> the generic definitions for the _acquire and the _release variants.) > > I don't have cross-compilation for RISC-V, which is a relatively new arch. > (Is there any RISC-V set of cross-compilation tools on kernel.org somewhere?) 
Arnd added RISC-V to the cross compiler list a month or two ago when he
updated them all.  I use the "make.cross" script from the Intel test robot,
which will fetch the cross compilers for you.  It looks like I made a GitHub
pull request to update the script for RISC-V; it fetches the toolchains from
kernel.org:

https://github.com/palmer-dabbelt/lkp-tests/blob/e14f4236ccd0572f4b87ffd480fecefee412dedc/sbin/make.cross
http://cdn.kernel.org/pub/tools/crosstool/files/bin/
http://cdn.kernel.org/pub/tools/crosstool/files/bin/x86_64/7.3.0/x86_64-gcc-7.3.0-nolibc_riscv64-linux.tar.gz

> Could you please send a patch that defines those variants against Linus's tree,
> like the PowerPC patch that does something similar:
>
>   0476a632cb3a: locking/atomics/powerpc: Move cmpxchg helpers to asm/cmpxchg.h and define the full set of cmpxchg APIs
>
> ?
>
> ... and I'll integrate it into the proper place to make it all bisectable, etc.

Sorry, I got buried in email again.  Did this get merged, or is there a
current version of the patch set I should look at?
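To make the breakage Andrea reported a bit more concrete (this is my reading of his parenthetical, not something spelled out further in the thread): with the simplified secondary branch, an architecture that supplies both the _relaxed and the fully ordered form of an op, but neither _acquire nor _release, no longer gets the remaining two generated. A preprocessor-only sketch, using atomic_fetch_dec as the example op; whether that exact op is one of the affected ones on RISC-V is incidental:

/* The architecture provides these two variants... */
#define atomic_fetch_dec_relaxed        atomic_fetch_dec_relaxed
#define atomic_fetch_dec                atomic_fetch_dec

/* ...so in the simplified secondary branch: */
#ifndef atomic_fetch_dec_relaxed
/* primary branch: not taken, the arch defined the _relaxed op */
#else
# ifndef atomic_fetch_dec               /* false: the arch also defined the full op */
#  define atomic_fetch_dec(...)         __atomic_op_fence(atomic_fetch_dec, __VA_ARGS__)
#  define atomic_fetch_dec_acquire(...) __atomic_op_acquire(atomic_fetch_dec, __VA_ARGS__)
#  define atomic_fetch_dec_release(...) __atomic_op_release(atomic_fetch_dec, __VA_ARGS__)
# endif
#endif

/*
 * Result: atomic_fetch_dec_acquire() and atomic_fetch_dec_release() are
 * never defined, so any caller of them fails to build.  In the old layout
 * each of the three definitions sat under its own #ifndef, so the two
 * missing variants were still generated from __atomic_op_acquire() and
 * __atomic_op_release().
 */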
diff --git a/include/asm-generic/atomic-instrumented.h b/include/asm-generic/atomic-instrumented.h index ec07f23678ea..26f0e3098442 100644 --- a/include/asm-generic/atomic-instrumented.h +++ b/include/asm-generic/atomic-instrumented.h @@ -40,171 +40,664 @@ static __always_inline void atomic64_set(atomic64_t *v, s64 i) arch_atomic64_set(v, i); } -static __always_inline int atomic_xchg(atomic_t *v, int i) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_xchg(v, i); +#define INSTR_ATOMIC_XCHG(order) \ +static __always_inline int \ +atomic_xchg##order(atomic_t *v, int i) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_xchg##order(v, i); \ } -static __always_inline s64 atomic64_xchg(atomic64_t *v, s64 i) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_xchg(v, i); +INSTR_ATOMIC_XCHG() + +#ifdef arch_atomic_xchg_relaxed +INSTR_ATOMIC_XCHG(_relaxed) +#define atomic_xchg_relaxed atomic_xchg_relaxed +#endif + +#ifdef arch_atomic_xchg_acquire +INSTR_ATOMIC_XCHG(_acquire) +#define atomic_xchg_acquire atomic_xchg_acquire +#endif + +#ifdef arch_atomic_xchg_release +INSTR_ATOMIC_XCHG(_release) +#define atomic_xchg_release atomic_xchg_release +#endif + +#define INSTR_ATOMIC64_XCHG(order) \ +static __always_inline s64 \ +atomic64_xchg##order(atomic64_t *v, s64 i) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_xchg##order(v, i); \ } -static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_cmpxchg(v, old, new); +INSTR_ATOMIC64_XCHG() + +#ifdef arch_atomic64_xchg_relaxed +INSTR_ATOMIC64_XCHG(_relaxed) +#define atomic64_xchg_relaxed atomic64_xchg_relaxed +#endif + +#ifdef arch_atomic64_xchg_acquire +INSTR_ATOMIC64_XCHG(_acquire) +#define atomic64_xchg_acquire atomic64_xchg_acquire +#endif + +#ifdef arch_atomic64_xchg_release +INSTR_ATOMIC64_XCHG(_release) +#define atomic64_xchg_release atomic64_xchg_release +#endif + +#define INSTR_ATOMIC_CMPXCHG(order) \ +static __always_inline int \ +atomic_cmpxchg##order(atomic_t *v, int old, int new) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_cmpxchg##order(v, old, new); \ } -static __always_inline s64 atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_cmpxchg(v, old, new); +INSTR_ATOMIC_CMPXCHG() + +#ifdef arch_atomic_cmpxchg_relaxed +INSTR_ATOMIC_CMPXCHG(_relaxed) +#define atomic_cmpxchg_relaxed atomic_cmpxchg_relaxed +#endif + +#ifdef arch_atomic_cmpxchg_acquire +INSTR_ATOMIC_CMPXCHG(_acquire) +#define atomic_cmpxchg_acquire atomic_cmpxchg_acquire +#endif + +#ifdef arch_atomic_cmpxchg_release +INSTR_ATOMIC_CMPXCHG(_release) +#define atomic_cmpxchg_release atomic_cmpxchg_release +#endif + +#define INSTR_ATOMIC64_CMPXCHG(order) \ +static __always_inline s64 \ +atomic64_cmpxchg##order(atomic64_t *v, s64 old, s64 new) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_cmpxchg##order(v, old, new); \ +} + +INSTR_ATOMIC64_CMPXCHG() + +#ifdef arch_atomic64_cmpxchg_relaxed +INSTR_ATOMIC64_CMPXCHG(_relaxed) +#define atomic64_cmpxchg_relaxed atomic64_cmpxchg_relaxed +#endif + +#ifdef arch_atomic64_cmpxchg_acquire +INSTR_ATOMIC64_CMPXCHG(_acquire) +#define atomic64_cmpxchg_acquire atomic64_cmpxchg_acquire +#endif + +#ifdef arch_atomic64_cmpxchg_release +INSTR_ATOMIC64_CMPXCHG(_release) +#define atomic64_cmpxchg_release atomic64_cmpxchg_release +#endif + +#define INSTR_ATOMIC_TRY_CMPXCHG(order) \ +static __always_inline bool \ 
+atomic_try_cmpxchg##order(atomic_t *v, int *old, int new) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + kasan_check_read(old, sizeof(*old)); \ + return arch_atomic_try_cmpxchg##order(v, old, new); \ } #ifdef arch_atomic_try_cmpxchg +INSTR_ATOMIC_TRY_CMPXCHG() #define atomic_try_cmpxchg atomic_try_cmpxchg -static __always_inline bool atomic_try_cmpxchg(atomic_t *v, int *old, int new) -{ - kasan_check_write(v, sizeof(*v)); - kasan_check_read(old, sizeof(*old)); - return arch_atomic_try_cmpxchg(v, old, new); -} #endif -#ifdef arch_atomic64_try_cmpxchg -#define atomic64_try_cmpxchg atomic64_try_cmpxchg -static __always_inline bool atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new) -{ - kasan_check_write(v, sizeof(*v)); - kasan_check_read(old, sizeof(*old)); - return arch_atomic64_try_cmpxchg(v, old, new); +#ifdef arch_atomic_try_cmpxchg_relaxed +INSTR_ATOMIC_TRY_CMPXCHG(_relaxed) +#define atomic_try_cmpxchg_relaxed atomic_try_cmpxchg_relaxed +#endif + +#ifdef arch_atomic_try_cmpxchg_acquire +INSTR_ATOMIC_TRY_CMPXCHG(_acquire) +#define atomic_try_cmpxchg_acquire atomic_try_cmpxchg_acquire +#endif + +#ifdef arch_atomic_try_cmpxchg_release +INSTR_ATOMIC_TRY_CMPXCHG(_release) +#define atomic_try_cmpxchg_release atomic_try_cmpxchg_release +#endif + +#define INSTR_ATOMIC64_TRY_CMPXCHG(order) \ +static __always_inline bool \ +atomic64_try_cmpxchg##order(atomic64_t *v, s64 *old, s64 new) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + kasan_check_read(old, sizeof(*old)); \ + return arch_atomic64_try_cmpxchg##order(v, old, new); \ } + +#ifdef arch_atomic64_try_cmpxchg +INSTR_ATOMIC64_TRY_CMPXCHG() +#define atomic_try_cmpxchg atomic_try_cmpxchg #endif -static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u) -{ - kasan_check_write(v, sizeof(*v)); - return __arch_atomic_add_unless(v, a, u); +#ifdef arch_atomic64_try_cmpxchg_relaxed +INSTR_ATOMIC64_TRY_CMPXCHG(_relaxed) +#define atomic_try_cmpxchg_relaxed atomic_try_cmpxchg_relaxed +#endif + +#ifdef arch_atomic64_try_cmpxchg_acquire +INSTR_ATOMIC64_TRY_CMPXCHG(_acquire) +#define atomic_try_cmpxchg_acquire atomic_try_cmpxchg_acquire +#endif + +#ifdef arch_atomic64_try_cmpxchg_release +INSTR_ATOMIC64_TRY_CMPXCHG(_release) +#define atomic_try_cmpxchg_release atomic_try_cmpxchg_release +#endif + +#define __INSTR_ATOMIC_ADD_UNLESS(order) \ +static __always_inline int \ +__atomic_add_unless##order(atomic_t *v, int a, int u) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return __arch_atomic_add_unless##order(v, a, u); \ } +__INSTR_ATOMIC_ADD_UNLESS() -static __always_inline bool atomic64_add_unless(atomic64_t *v, s64 a, s64 u) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_add_unless(v, a, u); +#ifdef __arch_atomic_add_unless_relaxed +__INSTR_ATOMIC_ADD_UNLESS(_relaxed) +#define __atomic_add_unless_relaxed __atomic_add_unless_relaxed +#endif + +#ifdef __arch_atomic_add_unless_acquire +__INSTR_ATOMIC_ADD_UNLESS(_acquire) +#define __atomic_add_unless_acquire __atomic_add_unless_acquire +#endif + +#ifdef __arch_atomic_add_unless_release +__INSTR_ATOMIC_ADD_UNLESS(_release) +#define __atomic_add_unless_release __atomic_add_unless_release +#endif + +#define INSTR_ATOMIC64_ADD_UNLESS(order) \ +static __always_inline bool \ +atomic64_add_unless##order(atomic64_t *v, s64 a, s64 u) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_add_unless##order(v, a, u); \ } -static __always_inline void atomic_inc(atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic_inc(v); +INSTR_ATOMIC64_ADD_UNLESS() + +#ifdef 
arch_atomic64_add_unless_relaxed +INSTR_ATOMIC64_ADD_UNLESS(_relaxed) +#define atomic64_add_unless_relaxed atomic64_add_unless_relaxed +#endif + +#ifdef arch_atomic64_add_unless_acquire +INSTR_ATOMIC64_ADD_UNLESS(_acquire) +#define atomic64_add_unless_acquire atomic64_add_unless_acquire +#endif + +#ifdef arch_atomic64_add_unless_release +INSTR_ATOMIC64_ADD_UNLESS(_release) +#define atomic64_add_unless_release atomic64_add_unless_release +#endif + +#define INSTR_ATOMIC_INC(order) \ +static __always_inline void \ +atomic_inc##order(atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic_inc##order(v); \ } -static __always_inline void atomic64_inc(atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic64_inc(v); +INSTR_ATOMIC_INC() + +#ifdef arch_atomic_inc_relaxed +INSTR_ATOMIC_INC(_relaxed) +#define atomic_inc_relaxed atomic_inc_relaxed +#endif + +#ifdef arch_atomic_inc_acquire +INSTR_ATOMIC_INC(_acquire) +#define atomic_inc_acquire atomic_inc_acquire +#endif + +#ifdef arch_atomic_inc_release +INSTR_ATOMIC_INC(_release) +#define atomic_inc_release atomic_inc_release +#endif + +#define INSTR_ATOMIC64_INC(order) \ +static __always_inline void \ +atomic64_inc##order(atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic64_inc##order(v); \ } -static __always_inline void atomic_dec(atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic_dec(v); +INSTR_ATOMIC64_INC() + +#ifdef arch_atomic64_inc_relaxed +INSTR_ATOMIC64_INC(_relaxed) +#define atomic64_inc_relaxed atomic64_inc_relaxed +#endif + +#ifdef arch_atomic64_inc_acquire +INSTR_ATOMIC64_INC(_acquire) +#define atomic64_inc_acquire atomic64_inc_acquire +#endif + +#ifdef arch_atomic64_inc_release +INSTR_ATOMIC64_INC(_release) +#define atomic64_inc_release atomic64_inc_release +#endif + +#define INSTR_ATOMIC_DEC(order) \ +static __always_inline void \ +atomic_dec##order(atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic_dec##order(v); \ } -static __always_inline void atomic64_dec(atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic64_dec(v); +INSTR_ATOMIC_DEC() + +#ifdef arch_atomic_dec_relaxed +INSTR_ATOMIC_DEC(_relaxed) +#define atomic_dec_relaxed atomic_dec_relaxed +#endif + +#ifdef arch_atomic_dec_acquire +INSTR_ATOMIC_DEC(_acquire) +#define atomic_dec_acquire atomic_dec_acquire +#endif + +#ifdef arch_atomic_dec_release +INSTR_ATOMIC_DEC(_release) +#define atomic_dec_release atomic_dec_release +#endif + +#define INSTR_ATOMIC64_DEC(order) \ +static __always_inline void \ +atomic64_dec##order(atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic64_dec##order(v); \ } -static __always_inline void atomic_add(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic_add(i, v); +INSTR_ATOMIC64_DEC() + +#ifdef arch_atomic64_dec_relaxed +INSTR_ATOMIC64_DEC(_relaxed) +#define atomic64_dec_relaxed atomic64_dec_relaxed +#endif + +#ifdef arch_atomic64_dec_acquire +INSTR_ATOMIC64_DEC(_acquire) +#define atomic64_dec_acquire atomic64_dec_acquire +#endif + +#ifdef arch_atomic64_dec_release +INSTR_ATOMIC64_DEC(_release) +#define atomic64_dec_release atomic64_dec_release +#endif + +#define INSTR_ATOMIC_ADD(order) \ +static __always_inline void \ +atomic_add##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic_add##order(i, v); \ } -static __always_inline void atomic64_add(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic64_add(i, v); +INSTR_ATOMIC_ADD() + +#ifdef 
arch_atomic_add_relaxed +INSTR_ATOMIC_ADD(_relaxed) +#define atomic_add_relaxed atomic_add_relaxed +#endif + +#ifdef arch_atomic_add_acquire +INSTR_ATOMIC_ADD(_acquire) +#define atomic_add_acquire atomic_add_acquire +#endif + +#ifdef arch_atomic_add_release +INSTR_ATOMIC_ADD(_release) +#define atomic_add_release atomic_add_release +#endif + +#define INSTR_ATOMIC64_ADD(order) \ +static __always_inline void \ +atomic64_add##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic64_add##order(i, v); \ } -static __always_inline void atomic_sub(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic_sub(i, v); +INSTR_ATOMIC64_ADD() + +#ifdef arch_atomic64_add_relaxed +INSTR_ATOMIC64_ADD(_relaxed) +#define atomic64_add_relaxed atomic64_add_relaxed +#endif + +#ifdef arch_atomic64_add_acquire +INSTR_ATOMIC64_ADD(_acquire) +#define atomic64_add_acquire atomic64_add_acquire +#endif + +#ifdef arch_atomic64_add_release +INSTR_ATOMIC64_ADD(_release) +#define atomic64_add_release atomic64_add_release +#endif + +#define INSTR_ATOMIC_SUB(order) \ +static __always_inline void \ +atomic_sub##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic_sub##order(i, v); \ } -static __always_inline void atomic64_sub(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic64_sub(i, v); +INSTR_ATOMIC_SUB() + +#ifdef arch_atomic_sub_relaxed +INSTR_ATOMIC_SUB(_relaxed) +#define atomic_sub_relaxed atomic_sub_relaxed +#endif + +#ifdef arch_atomic_sub_acquire +INSTR_ATOMIC_SUB(_acquire) +#define atomic_sub_acquire atomic_sub_acquire +#endif + +#ifdef arch_atomic_sub_release +INSTR_ATOMIC_SUB(_release) +#define atomic_sub_release atomic_sub_release +#endif + +#define INSTR_ATOMIC64_SUB(order) \ +static __always_inline void \ +atomic64_sub##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic64_sub##order(i, v); \ } -static __always_inline void atomic_and(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic_and(i, v); +INSTR_ATOMIC64_SUB() + +#ifdef arch_atomic64_sub_relaxed +INSTR_ATOMIC64_SUB(_relaxed) +#define atomic64_sub_relaxed atomic64_sub_relaxed +#endif + +#ifdef arch_atomic64_sub_acquire +INSTR_ATOMIC64_SUB(_acquire) +#define atomic64_sub_acquire atomic64_sub_acquire +#endif + +#ifdef arch_atomic64_sub_release +INSTR_ATOMIC64_SUB(_release) +#define atomic64_sub_release atomic64_sub_release +#endif + +#define INSTR_ATOMIC_AND(order) \ +static __always_inline void \ +atomic_and##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic_and##order(i, v); \ } -static __always_inline void atomic64_and(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic64_and(i, v); +INSTR_ATOMIC_AND() + +#ifdef arch_atomic_and_relaxed +INSTR_ATOMIC_AND(_relaxed) +#define atomic_and_relaxed atomic_and_relaxed +#endif + +#ifdef arch_atomic_and_acquire +INSTR_ATOMIC_AND(_acquire) +#define atomic_and_acquire atomic_and_acquire +#endif + +#ifdef arch_atomic_and_release +INSTR_ATOMIC_AND(_release) +#define atomic_and_release atomic_and_release +#endif + +#define INSTR_ATOMIC64_AND(order) \ +static __always_inline void \ +atomic64_and##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic64_and##order(i, v); \ } -static __always_inline void atomic_or(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic_or(i, v); +INSTR_ATOMIC64_AND() + +#ifdef arch_atomic64_and_relaxed 
+INSTR_ATOMIC64_AND(_relaxed) +#define atomic64_and_relaxed atomic64_and_relaxed +#endif + +#ifdef arch_atomic64_and_acquire +INSTR_ATOMIC64_AND(_acquire) +#define atomic64_and_acquire atomic64_and_acquire +#endif + +#ifdef arch_atomic64_and_release +INSTR_ATOMIC64_AND(_release) +#define atomic64_and_release atomic64_and_release +#endif + +#define INSTR_ATOMIC_OR(order) \ +static __always_inline void \ +atomic_or##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic_or##order(i, v); \ } -static __always_inline void atomic64_or(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic64_or(i, v); +INSTR_ATOMIC_OR() + +#ifdef arch_atomic_or_relaxed +INSTR_ATOMIC_OR(_relaxed) +#define atomic_or_relaxed atomic_or_relaxed +#endif + +#ifdef arch_atomic_or_acquire +INSTR_ATOMIC_OR(_acquire) +#define atomic_or_acquire atomic_or_acquire +#endif + +#ifdef arch_atomic_or_release +INSTR_ATOMIC_OR(_release) +#define atomic_or_release atomic_or_release +#endif + +#define INSTR_ATOMIC64_OR(order) \ +static __always_inline void \ +atomic64_or##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic64_or##order(i, v); \ } -static __always_inline void atomic_xor(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic_xor(i, v); +INSTR_ATOMIC64_OR() + +#ifdef arch_atomic64_or_relaxed +INSTR_ATOMIC64_OR(_relaxed) +#define atomic64_or_relaxed atomic64_or_relaxed +#endif + +#ifdef arch_atomic64_or_acquire +INSTR_ATOMIC64_OR(_acquire) +#define atomic64_or_acquire atomic64_or_acquire +#endif + +#ifdef arch_atomic64_or_release +INSTR_ATOMIC64_OR(_release) +#define atomic64_or_release atomic64_or_release +#endif + +#define INSTR_ATOMIC_XOR(order) \ +static __always_inline void \ +atomic_xor##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic_xor##order(i, v); \ } -static __always_inline void atomic64_xor(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - arch_atomic64_xor(i, v); +INSTR_ATOMIC_XOR() + +#ifdef arch_atomic_xor_relaxed +INSTR_ATOMIC_XOR(_relaxed) +#define atomic_xor_relaxed atomic_xor_relaxed +#endif + +#ifdef arch_atomic_xor_acquire +INSTR_ATOMIC_XOR(_acquire) +#define atomic_xor_acquire atomic_xor_acquire +#endif + +#ifdef arch_atomic_xor_release +INSTR_ATOMIC_XOR(_release) +#define atomic_xor_release atomic_xor_release +#endif + +#define INSTR_ATOMIC64_XOR(order) \ +static __always_inline void \ +atomic64_xor##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + arch_atomic64_xor##order(i, v); \ } -static __always_inline int atomic_inc_return(atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_inc_return(v); +INSTR_ATOMIC64_XOR() + +#ifdef arch_atomic64_xor_relaxed +INSTR_ATOMIC64_XOR(_relaxed) +#define atomic64_xor_relaxed atomic64_xor_relaxed +#endif + +#ifdef arch_atomic64_xor_acquire +INSTR_ATOMIC64_XOR(_acquire) +#define atomic64_xor_acquire atomic64_xor_acquire +#endif + +#ifdef arch_atomic64_xor_release +INSTR_ATOMIC64_XOR(_release) +#define atomic64_xor_release atomic64_xor_release +#endif + +#define INSTR_ATOMIC_INC_RETURN(order) \ +static __always_inline int \ +atomic_inc_return##order(atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_inc_return##order(v); \ } -static __always_inline s64 atomic64_inc_return(atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_inc_return(v); +INSTR_ATOMIC_INC_RETURN() + +#ifdef arch_atomic_inc_return_relaxed 
+INSTR_ATOMIC_INC_RETURN(_relaxed) +#define atomic_inc_return_relaxed atomic_inc_return_relaxed +#endif + +#ifdef arch_atomic_inc_return_acquire +INSTR_ATOMIC_INC_RETURN(_acquire) +#define atomic_inc_return_acquire atomic_inc_return_acquire +#endif + +#ifdef arch_atomic_inc_return_release +INSTR_ATOMIC_INC_RETURN(_release) +#define atomic_inc_return_release atomic_inc_return_release +#endif + +#define INSTR_ATOMIC64_INC_RETURN(order) \ +static __always_inline s64 \ +atomic64_inc_return##order(atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_inc_return##order(v); \ } -static __always_inline int atomic_dec_return(atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_dec_return(v); +INSTR_ATOMIC64_INC_RETURN() + +#ifdef arch_atomic64_inc_return_relaxed +INSTR_ATOMIC64_INC_RETURN(_relaxed) +#define atomic64_inc_return_relaxed atomic64_inc_return_relaxed +#endif + +#ifdef arch_atomic64_inc_return_acquire +INSTR_ATOMIC64_INC_RETURN(_acquire) +#define atomic64_inc_return_acquire atomic64_inc_return_acquire +#endif + +#ifdef arch_atomic64_inc_return_release +INSTR_ATOMIC64_INC_RETURN(_release) +#define atomic64_inc_return_release atomic64_inc_return_release +#endif + +#define INSTR_ATOMIC_DEC_RETURN(order) \ +static __always_inline int \ +atomic_dec_return##order(atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_dec_return##order(v); \ } -static __always_inline s64 atomic64_dec_return(atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_dec_return(v); +INSTR_ATOMIC_DEC_RETURN() + +#ifdef arch_atomic_dec_return_relaxed +INSTR_ATOMIC_DEC_RETURN(_relaxed) +#define atomic_dec_return_relaxed atomic_dec_return_relaxed +#endif + +#ifdef arch_atomic_dec_return_acquire +INSTR_ATOMIC_DEC_RETURN(_acquire) +#define atomic_dec_return_acquire atomic_dec_return_acquire +#endif + +#ifdef arch_atomic_dec_return_release +INSTR_ATOMIC_DEC_RETURN(_release) +#define atomic_dec_return_release atomic_dec_return_release +#endif + +#define INSTR_ATOMIC64_DEC_RETURN(order) \ +static __always_inline s64 \ +atomic64_dec_return##order(atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_dec_return##order(v); \ } +INSTR_ATOMIC64_DEC_RETURN() + +#ifdef arch_atomic64_dec_return_relaxed +INSTR_ATOMIC64_DEC_RETURN(_relaxed) +#define atomic64_dec_return_relaxed atomic64_dec_return_relaxed +#endif + +#ifdef arch_atomic64_dec_return_acquire +INSTR_ATOMIC64_DEC_RETURN(_acquire) +#define atomic64_dec_return_acquire atomic64_dec_return_acquire +#endif + +#ifdef arch_atomic64_dec_return_release +INSTR_ATOMIC64_DEC_RETURN(_release) +#define atomic64_dec_return_release atomic64_dec_return_release +#endif + static __always_inline s64 atomic64_inc_not_zero(atomic64_t *v) { kasan_check_write(v, sizeof(*v)); @@ -241,90 +734,356 @@ static __always_inline bool atomic64_inc_and_test(atomic64_t *v) return arch_atomic64_inc_and_test(v); } -static __always_inline int atomic_add_return(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_add_return(i, v); +#define INSTR_ATOMIC_ADD_RETURN(order) \ +static __always_inline int \ +atomic_add_return##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_add_return##order(i, v); \ } -static __always_inline s64 atomic64_add_return(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_add_return(i, v); +INSTR_ATOMIC_ADD_RETURN() + +#ifdef arch_atomic_add_return_relaxed 
+INSTR_ATOMIC_ADD_RETURN(_relaxed) +#define atomic_add_return_relaxed atomic_add_return_relaxed +#endif + +#ifdef arch_atomic_add_return_acquire +INSTR_ATOMIC_ADD_RETURN(_acquire) +#define atomic_add_return_acquire atomic_add_return_acquire +#endif + +#ifdef arch_atomic_add_return_release +INSTR_ATOMIC_ADD_RETURN(_release) +#define atomic_add_return_release atomic_add_return_release +#endif + +#define INSTR_ATOMIC64_ADD_RETURN(order) \ +static __always_inline s64 \ +atomic64_add_return##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_add_return##order(i, v); \ } -static __always_inline int atomic_sub_return(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_sub_return(i, v); +INSTR_ATOMIC64_ADD_RETURN() + +#ifdef arch_atomic64_add_return_relaxed +INSTR_ATOMIC64_ADD_RETURN(_relaxed) +#define atomic64_add_return_relaxed atomic64_add_return_relaxed +#endif + +#ifdef arch_atomic64_add_return_acquire +INSTR_ATOMIC64_ADD_RETURN(_acquire) +#define atomic64_add_return_acquire atomic64_add_return_acquire +#endif + +#ifdef arch_atomic64_add_return_release +INSTR_ATOMIC64_ADD_RETURN(_release) +#define atomic64_add_return_release atomic64_add_return_release +#endif + +#define INSTR_ATOMIC_SUB_RETURN(order) \ +static __always_inline int \ +atomic_sub_return##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_sub_return##order(i, v); \ } -static __always_inline s64 atomic64_sub_return(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_sub_return(i, v); +INSTR_ATOMIC_SUB_RETURN() + +#ifdef arch_atomic_sub_return_relaxed +INSTR_ATOMIC_SUB_RETURN(_relaxed) +#define atomic_sub_return_relaxed atomic_sub_return_relaxed +#endif + +#ifdef arch_atomic_sub_return_acquire +INSTR_ATOMIC_SUB_RETURN(_acquire) +#define atomic_sub_return_acquire atomic_sub_return_acquire +#endif + +#ifdef arch_atomic_sub_return_release +INSTR_ATOMIC_SUB_RETURN(_release) +#define atomic_sub_return_release atomic_sub_return_release +#endif + +#define INSTR_ATOMIC64_SUB_RETURN(order) \ +static __always_inline s64 \ +atomic64_sub_return##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_sub_return##order(i, v); \ } -static __always_inline int atomic_fetch_add(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_fetch_add(i, v); +INSTR_ATOMIC64_SUB_RETURN() + +#ifdef arch_atomic64_sub_return_relaxed +INSTR_ATOMIC64_SUB_RETURN(_relaxed) +#define atomic64_sub_return_relaxed atomic64_sub_return_relaxed +#endif + +#ifdef arch_atomic64_sub_return_acquire +INSTR_ATOMIC64_SUB_RETURN(_acquire) +#define atomic64_sub_return_acquire atomic64_sub_return_acquire +#endif + +#ifdef arch_atomic64_sub_return_release +INSTR_ATOMIC64_SUB_RETURN(_release) +#define atomic64_sub_return_release atomic64_sub_return_release +#endif + +#define INSTR_ATOMIC_FETCH_ADD(order) \ +static __always_inline int \ +atomic_fetch_add##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_fetch_add##order(i, v); \ } -static __always_inline s64 atomic64_fetch_add(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_fetch_add(i, v); +INSTR_ATOMIC_FETCH_ADD() + +#ifdef arch_atomic_fetch_add_relaxed +INSTR_ATOMIC_FETCH_ADD(_relaxed) +#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed +#endif + +#ifdef arch_atomic_fetch_add_acquire +INSTR_ATOMIC_FETCH_ADD(_acquire) +#define 
atomic_fetch_add_acquire atomic_fetch_add_acquire +#endif + +#ifdef arch_atomic_fetch_add_release +INSTR_ATOMIC_FETCH_ADD(_release) +#define atomic_fetch_add_release atomic_fetch_add_release +#endif + +#define INSTR_ATOMIC64_FETCH_ADD(order) \ +static __always_inline s64 \ +atomic64_fetch_add##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_fetch_add##order(i, v); \ } -static __always_inline int atomic_fetch_sub(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_fetch_sub(i, v); +INSTR_ATOMIC64_FETCH_ADD() + +#ifdef arch_atomic64_fetch_add_relaxed +INSTR_ATOMIC64_FETCH_ADD(_relaxed) +#define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed +#endif + +#ifdef arch_atomic64_fetch_add_acquire +INSTR_ATOMIC64_FETCH_ADD(_acquire) +#define atomic64_fetch_add_acquire atomic64_fetch_add_acquire +#endif + +#ifdef arch_atomic64_fetch_add_release +INSTR_ATOMIC64_FETCH_ADD(_release) +#define atomic64_fetch_add_release atomic64_fetch_add_release +#endif + +#define INSTR_ATOMIC_FETCH_SUB(order) \ +static __always_inline int \ +atomic_fetch_sub##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_fetch_sub##order(i, v); \ } -static __always_inline s64 atomic64_fetch_sub(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_fetch_sub(i, v); +INSTR_ATOMIC_FETCH_SUB() + +#ifdef arch_atomic_fetch_sub_relaxed +INSTR_ATOMIC_FETCH_SUB(_relaxed) +#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed +#endif + +#ifdef arch_atomic_fetch_sub_acquire +INSTR_ATOMIC_FETCH_SUB(_acquire) +#define atomic_fetch_sub_acquire atomic_fetch_sub_acquire +#endif + +#ifdef arch_atomic_fetch_sub_release +INSTR_ATOMIC_FETCH_SUB(_release) +#define atomic_fetch_sub_release atomic_fetch_sub_release +#endif + +#define INSTR_ATOMIC64_FETCH_SUB(order) \ +static __always_inline s64 \ +atomic64_fetch_sub##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_fetch_sub##order(i, v); \ } -static __always_inline int atomic_fetch_and(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_fetch_and(i, v); +INSTR_ATOMIC64_FETCH_SUB() + +#ifdef arch_atomic64_fetch_sub_relaxed +INSTR_ATOMIC64_FETCH_SUB(_relaxed) +#define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed +#endif + +#ifdef arch_atomic64_fetch_sub_acquire +INSTR_ATOMIC64_FETCH_SUB(_acquire) +#define atomic64_fetch_sub_acquire atomic64_fetch_sub_acquire +#endif + +#ifdef arch_atomic64_fetch_sub_release +INSTR_ATOMIC64_FETCH_SUB(_release) +#define atomic64_fetch_sub_release atomic64_fetch_sub_release +#endif + +#define INSTR_ATOMIC_FETCH_AND(order) \ +static __always_inline int \ +atomic_fetch_and##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_fetch_and##order(i, v); \ } -static __always_inline s64 atomic64_fetch_and(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_fetch_and(i, v); +INSTR_ATOMIC_FETCH_AND() + +#ifdef arch_atomic_fetch_and_relaxed +INSTR_ATOMIC_FETCH_AND(_relaxed) +#define atomic_fetch_and_relaxed atomic_fetch_and_relaxed +#endif + +#ifdef arch_atomic_fetch_and_acquire +INSTR_ATOMIC_FETCH_AND(_acquire) +#define atomic_fetch_and_acquire atomic_fetch_and_acquire +#endif + +#ifdef arch_atomic_fetch_and_release +INSTR_ATOMIC_FETCH_AND(_release) +#define atomic_fetch_and_release atomic_fetch_and_release +#endif + +#define INSTR_ATOMIC64_FETCH_AND(order) \ +static 
__always_inline s64 \ +atomic64_fetch_and##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_fetch_and##order(i, v); \ } -static __always_inline int atomic_fetch_or(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_fetch_or(i, v); +INSTR_ATOMIC64_FETCH_AND() + +#ifdef arch_atomic64_fetch_and_relaxed +INSTR_ATOMIC64_FETCH_AND(_relaxed) +#define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed +#endif + +#ifdef arch_atomic64_fetch_and_acquire +INSTR_ATOMIC64_FETCH_AND(_acquire) +#define atomic64_fetch_and_acquire atomic64_fetch_and_acquire +#endif + +#ifdef arch_atomic64_fetch_and_release +INSTR_ATOMIC64_FETCH_AND(_release) +#define atomic64_fetch_and_release atomic64_fetch_and_release +#endif + +#define INSTR_ATOMIC_FETCH_OR(order) \ +static __always_inline int \ +atomic_fetch_or##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_fetch_or##order(i, v); \ } -static __always_inline s64 atomic64_fetch_or(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_fetch_or(i, v); +INSTR_ATOMIC_FETCH_OR() + +#ifdef arch_atomic_fetch_or_relaxed +INSTR_ATOMIC_FETCH_OR(_relaxed) +#define atomic_fetch_or_relaxed atomic_fetch_or_relaxed +#endif + +#ifdef arch_atomic_fetch_or_acquire +INSTR_ATOMIC_FETCH_OR(_acquire) +#define atomic_fetch_or_acquire atomic_fetch_or_acquire +#endif + +#ifdef arch_atomic_fetch_or_release +INSTR_ATOMIC_FETCH_OR(_release) +#define atomic_fetch_or_release atomic_fetch_or_release +#endif + +#define INSTR_ATOMIC64_FETCH_OR(order) \ +static __always_inline s64 \ +atomic64_fetch_or##order(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_fetch_or##order(i, v); \ } -static __always_inline int atomic_fetch_xor(int i, atomic_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic_fetch_xor(i, v); +INSTR_ATOMIC64_FETCH_OR() + +#ifdef arch_atomic64_fetch_or_relaxed +INSTR_ATOMIC64_FETCH_OR(_relaxed) +#define atomic64_fetch_or_relaxed atomic64_fetch_or_relaxed +#endif + +#ifdef arch_atomic64_fetch_or_acquire +INSTR_ATOMIC64_FETCH_OR(_acquire) +#define atomic64_fetch_or_acquire atomic64_fetch_or_acquire +#endif + +#ifdef arch_atomic64_fetch_or_release +INSTR_ATOMIC64_FETCH_OR(_release) +#define atomic64_fetch_or_release atomic64_fetch_or_release +#endif + +#define INSTR_ATOMIC_FETCH_XOR(order) \ +static __always_inline int \ +atomic_fetch_xor##order(int i, atomic_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic_fetch_xor##order(i, v); \ } -static __always_inline s64 atomic64_fetch_xor(s64 i, atomic64_t *v) -{ - kasan_check_write(v, sizeof(*v)); - return arch_atomic64_fetch_xor(i, v); +INSTR_ATOMIC_FETCH_XOR() + +#ifdef arch_atomic_fetch_xor_relaxed +INSTR_ATOMIC_FETCH_XOR(_relaxed) +#define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed +#endif + +#ifdef arch_atomic_fetch_xor_acquire +INSTR_ATOMIC_FETCH_XOR(_acquire) +#define atomic_fetch_xor_acquire atomic_fetch_xor_acquire +#endif + +#ifdef arch_atomic_fetch_xor_release +INSTR_ATOMIC_FETCH_XOR(_release) +#define atomic_fetch_xor_release atomic_fetch_xor_release +#endif + +#define INSTR_ATOMIC64_FETCH_XOR(xorder) \ +static __always_inline s64 \ +atomic64_fetch_xor##xorder(s64 i, atomic64_t *v) \ +{ \ + kasan_check_write(v, sizeof(*v)); \ + return arch_atomic64_fetch_xor##xorder(i, v); \ } +INSTR_ATOMIC64_FETCH_XOR() + +#ifdef arch_atomic64_fetch_xor_relaxed +INSTR_ATOMIC64_FETCH_XOR(_relaxed) +#define 
atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed +#endif + +#ifdef arch_atomic64_fetch_xor_acquire +INSTR_ATOMIC64_FETCH_XOR(_acquire) +#define atomic64_fetch_xor_acquire atomic64_fetch_xor_acquire +#endif + +#ifdef arch_atomic64_fetch_xor_release +INSTR_ATOMIC64_FETCH_XOR(_release) +#define atomic64_fetch_xor_release atomic64_fetch_xor_release +#endif + static __always_inline bool atomic_sub_and_test(int i, atomic_t *v) { kasan_check_write(v, sizeof(*v)); @@ -349,31 +1108,64 @@ static __always_inline bool atomic64_add_negative(s64 i, atomic64_t *v) return arch_atomic64_add_negative(i, v); } -static __always_inline unsigned long -cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new, int size) -{ - kasan_check_write(ptr, size); - switch (size) { - case 1: - return arch_cmpxchg((u8 *)ptr, (u8)old, (u8)new); - case 2: - return arch_cmpxchg((u16 *)ptr, (u16)old, (u16)new); - case 4: - return arch_cmpxchg((u32 *)ptr, (u32)old, (u32)new); - case 8: - BUILD_BUG_ON(sizeof(unsigned long) != 8); - return arch_cmpxchg((u64 *)ptr, (u64)old, (u64)new); - } - BUILD_BUG(); - return 0; +#define INSTR_CMPXCHG(order) \ +static __always_inline unsigned long \ +cmpxchg##order##_size(volatile void *ptr, unsigned long old, \ + unsigned long new, int size) \ +{ \ + kasan_check_write(ptr, size); \ + switch (size) { \ + case 1: \ + return arch_cmpxchg##order((u8 *)ptr, (u8)old, (u8)new); \ + case 2: \ + return arch_cmpxchg##order((u16 *)ptr, (u16)old, (u16)new); \ + case 4: \ + return arch_cmpxchg##order((u32 *)ptr, (u32)old, (u32)new); \ + case 8: \ + BUILD_BUG_ON(sizeof(unsigned long) != 8); \ + return arch_cmpxchg##order((u64 *)ptr, (u64)old, (u64)new); \ + } \ + BUILD_BUG(); \ + return 0; \ } +INSTR_CMPXCHG() #define cmpxchg(ptr, old, new) \ ({ \ ((__typeof__(*(ptr)))cmpxchg_size((ptr), (unsigned long)(old), \ (unsigned long)(new), sizeof(*(ptr)))); \ }) +#ifdef arch_cmpxchg_relaxed +INSTR_CMPXCHG(_relaxed) +#define cmpxchg_relaxed(ptr, old, new) \ +({ \ + ((__typeof__(*(ptr)))cmpxchg_relaxed_size((ptr), \ + (unsigned long)(old), (unsigned long)(new), \ + sizeof(*(ptr)))); \ +}) +#endif + +#ifdef arch_cmpxchg_acquire +INSTR_CMPXCHG(_acquire) +#define cmpxchg_acquire(ptr, old, new) \ +({ \ + ((__typeof__(*(ptr)))cmpxchg_acquire_size((ptr), \ + (unsigned long)(old), (unsigned long)(new), \ + sizeof(*(ptr)))); \ +}) +#endif + +#ifdef arch_cmpxchg_release +INSTR_CMPXCHG(_release) +#define cmpxchg_release(ptr, old, new) \ +({ \ + ((__typeof__(*(ptr)))cmpxchg_release_size((ptr), \ + (unsigned long)(old), (unsigned long)(new), \ + sizeof(*(ptr)))); \ +}) +#endif + static __always_inline unsigned long sync_cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new, int size) @@ -428,19 +1220,48 @@ cmpxchg_local_size(volatile void *ptr, unsigned long old, unsigned long new, sizeof(*(ptr)))); \ }) -static __always_inline u64 -cmpxchg64_size(volatile u64 *ptr, u64 old, u64 new) -{ - kasan_check_write(ptr, sizeof(*ptr)); - return arch_cmpxchg64(ptr, old, new); +#define INSTR_CMPXCHG64(order) \ +static __always_inline u64 \ +cmpxchg64##order##_size(volatile u64 *ptr, u64 old, u64 new) \ +{ \ + kasan_check_write(ptr, sizeof(*ptr)); \ + return arch_cmpxchg64##order(ptr, old, new); \ } +INSTR_CMPXCHG64() #define cmpxchg64(ptr, old, new) \ ({ \ ((__typeof__(*(ptr)))cmpxchg64_size((ptr), (u64)(old), \ (u64)(new))); \ }) +#ifdef arch_cmpxchg64_relaxed +INSTR_CMPXCHG64(_relaxed) +#define cmpxchg64_relaxed(ptr, old, new) \ +({ \ + ((__typeof__(*(ptr)))cmpxchg64_relaxed_size((ptr), (u64)(old), \ + 
(u64)(new))); \ +}) +#endif + +#ifdef arch_cmpxchg64_acquire +INSTR_CMPXCHG64(_acquire) +#define cmpxchg64_acquire(ptr, old, new) \ +({ \ + ((__typeof__(*(ptr)))cmpxchg64_acquire_size((ptr), (u64)(old), \ + (u64)(new))); \ +}) +#endif + +#ifdef arch_cmpxchg64_release +INSTR_CMPXCHG64(_release) +#define cmpxchg64_release(ptr, old, new) \ +({ \ + ((__typeof__(*(ptr)))cmpxchg64_release_size((ptr), (u64)(old), \ + (u64)(new))); \ +}) +#endif + static __always_inline u64 cmpxchg64_local_size(volatile u64 *ptr, u64 old, u64 new) {
Currently <asm-generic/atomic-instrumented.h> only instruments the fully ordered variants of atomic functions, ignoring the {relaxed,acquire,release} ordering variants. This patch reworks the header to instrument all ordering variants of the atomic functions, so that architectures implementing these are instrumented appropriately. To minimise repetition, a macro is used to generate each variant from a common template. The {full,relaxed,acquire,release} order variants respectively are then built using this template, where the architecture provides an implementation. To stick to an 80 column limit while keeping the templates legible, the return type and function name of each template are split over two lines. For consistency, this is done even when not strictly necessary. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> --- include/asm-generic/atomic-instrumented.h | 1195 ++++++++++++++++++++++++----- 1 file changed, 1008 insertions(+), 187 deletions(-) -- 2.11.0
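As a usage note on the template pattern the diff introduces: each INSTR_ATOMIC_*() macro stamps out one instrumented wrapper per ordering variant, and the surrounding #ifdef arch_*() tests ensure a wrapper is only emitted when the architecture actually provides that variant. The following is a self-contained userspace approximation, purely to show the mechanics; atomic_t, kasan_check_write() and the arch_*() ops here are stand-ins of my own built on a __sync builtin, not kernel code:

#include <stdio.h>

typedef struct { int counter; } atomic_t;

/* stand-in for the real kasan_check_write() */
static inline void kasan_check_write(const volatile void *p, unsigned int size)
{
        (void)p; (void)size;
}

/* the "architecture" provides the full and relaxed variants only */
static inline int arch_atomic_add_return(int i, atomic_t *v)
{
        return __sync_add_and_fetch(&v->counter, i);
}
#define arch_atomic_add_return arch_atomic_add_return

static inline int arch_atomic_add_return_relaxed(int i, atomic_t *v)
{
        return __sync_add_and_fetch(&v->counter, i);
}
#define arch_atomic_add_return_relaxed arch_atomic_add_return_relaxed

/* one template generates every ordering variant of the wrapper */
#define INSTR_ATOMIC_ADD_RETURN(order)                                  \
static inline int                                                       \
atomic_add_return##order(int i, atomic_t *v)                            \
{                                                                       \
        kasan_check_write(v, sizeof(*v));                               \
        return arch_atomic_add_return##order(i, v);                     \
}

INSTR_ATOMIC_ADD_RETURN()               /* fully ordered wrapper, always built */

#ifdef arch_atomic_add_return_relaxed
INSTR_ATOMIC_ADD_RETURN(_relaxed)       /* built: the arch advertises it */
#define atomic_add_return_relaxed atomic_add_return_relaxed
#endif

#ifdef arch_atomic_add_return_acquire   /* not defined here: no wrapper emitted */
INSTR_ATOMIC_ADD_RETURN(_acquire)
#define atomic_add_return_acquire atomic_add_return_acquire
#endif

int main(void)
{
        atomic_t v = { 0 };

        printf("%d\n", atomic_add_return(1, &v));               /* 1 */
        printf("%d\n", atomic_add_return_relaxed(1, &v));       /* 2 */
        return 0;
}

The payoff of the pattern is that supporting a new ordering variant costs one extra macro invocation under the matching #ifdef instead of another hand-written wrapper.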