[PATCHv2,06/11] atomics/treewide: rework ordering barriers

Message ID 20180625105952.3756-7-mark.rutland@arm.com
State Superseded
Series atomics: generate atomic headers / instrument arm64

Commit Message

Mark Rutland June 25, 2018, 10:59 a.m. UTC
Currently architectures can override __atomic_op_*() to define the barriers
used before/after a relaxed atomic when used to build acquire/release/fence
variants.

This has the unfortunate property of requiring the architecture to define the
full wrapper for the atomics, rather than just the barriers they care about,
and gets in the way of generating atomics which can be easily read.

Instead, this patch has architectures define an optional set of barriers,
__atomic_mb_{before,after}_{acquire,release,fence}(), which <linux/atomic.h>
uses to build the wrappers.
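
For example, <linux/atomic.h> can then compose the acquire form of an op
from the arch-provided barrier. A sketch of the shape, using the barrier
names as the patch below actually defines them (the full definitions are
in the diff):

	#define __atomic_op_acquire(op, args...)			\
	({								\
		typeof(op##_relaxed(args)) __ret = op##_relaxed(args);	\
		__atomic_acquire_fence();				\
		__ret;							\
	})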

At the same time, let's avoid any potential reliance on these elsewhere
by ensuring each is undef'd at the end of <linux/atomic.h>.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>

Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrea Parri <parri.andrea@gmail.com>
---
 arch/alpha/include/asm/atomic.h   |  8 +++----
 arch/powerpc/include/asm/atomic.h | 17 +++++---------
 arch/riscv/include/asm/atomic.h   | 17 +++++---------
 include/linux/atomic.h            | 47 ++++++++++++++++++++++++++-------------
 4 files changed, 45 insertions(+), 44 deletions(-)

-- 
2.11.0

Comments

Mark Rutland June 25, 2018, 3:44 p.m. UTC | #1
On Mon, Jun 25, 2018 at 11:59:47AM +0100, Mark Rutland wrote:
> Currently architectures can override __atomic_op_*() to define the barriers
> used before/after a relaxed atomic when used to build acquire/release/fence
> variants.
> 
> This has the unfortunate property of requiring the architecture to define the
> full wrapper for the atomics, rather than just the barriers they care about,
> and gets in the way of generating atomics which can be easily read.
> 
> Instead, this patch has architectures define an optional set of barriers,
> __atomic_mb_{before,after}_{acquire,release,fence}(), which <linux/atomic.h>
> uses to build the wrappers.
> 
> At the same time, let's avoid any potential reliance on these elsewhere
> by ensuring each is undef'd at the end of <linux/atomic.h>.

> +#undef __atomic_op_acquire
> +#undef __atomic_op_release
> +#undef __atomic_op_fence
> +
> +#undef __atomic_acquire_fence
> +#undef __atomic_release_fence
> +#undef __atomic_pre_fence
> +#undef __atomic_post_fence

Due to the way macros get expanded, this happens to break the cmpxchg()
wrappers on some architectures. Without re-implementing those with some
pretty horrendous static inlines and wrappers to guarantee type invariance,
it's not possible to solve that.
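
A sketch of the failure mode, with a hypothetical arch wrapper (the shape
mirrors real arch code, but this exact snippet is illustrative):

	/* Hypothetical arch header -- the cmpxchg() wrappers have to
	 * stay macros to remain type-generic: */
	#define cmpxchg_acquire(ptr, o, n)				\
		__atomic_op_acquire(cmpxchg, ptr, o, n)

	/* In some .c file, after <linux/atomic.h> has been included: */
	old = cmpxchg_acquire(&x, 1, 2);
	/*
	 * The wrapper only expands here, at the call site, by which
	 * point __atomic_op_acquire() has been #undef'd, so the build
	 * fails. A static inline would expand at definition time
	 * instead, but would fix the operand types.
	 */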

As agreed over IRC, I've removed the undefs for now.

Alas.

Mark.
Will Deacon July 4, 2018, 3:06 p.m. UTC | #2
On Mon, Jun 25, 2018 at 11:59:47AM +0100, Mark Rutland wrote:
> Currently architectures can override __atomic_op_*() to define the barriers
> used before/after a relaxed atomic when used to build acquire/release/fence
> variants.
> 
> This has the unfortunate property of requiring the architecture to define the
> full wrapper for the atomics, rather than just the barriers they care about,
> and gets in the way of generating atomics which can be easily read.
> 
> Instead, this patch has architectures define an optional set of barriers,
> __atomic_mb_{before,after}_{acquire,release,fence}(), which <linux/atomic.h>
> uses to build the wrappers.

Looks like you've renamed these in the patch but not updated the commit
message. Also, to add to the bikeshedding, would it be worth adding "rmw" in
there somewhere, e.g. __atomic_post_rmw_fence, since I assume these only
apply to value-returning stuff?

Will
Mark Rutland July 4, 2018, 3:56 p.m. UTC | #3
On Wed, Jul 04, 2018 at 04:06:46PM +0100, Will Deacon wrote:
> On Mon, Jun 25, 2018 at 11:59:47AM +0100, Mark Rutland wrote:
> > Currently architectures can override __atomic_op_*() to define the barriers
> > used before/after a relaxed atomic when used to build acquire/release/fence
> > variants.
> > 
> > This has the unfortunate property of requiring the architecture to define the
> > full wrapper for the atomics, rather than just the barriers they care about,
> > and gets in the way of generating atomics which can be easily read.
> > 
> > Instead, this patch has architectures define an optional set of barriers,
> > __atomic_mb_{before,after}_{acquire,release,fence}(), which <linux/atomic.h>
> > uses to build the wrappers.
> 
> Looks like you've renamed these in the patch but not updated the commit
> message.

Yup; Peter also pointed that out. In my branch this now looks like:

----
Instead, this patch has architectures define an optional set of barriers:

* __atomic_acquire_fence()
* __atomic_release_fence()
* __atomic_pre_fence()
* __atomic_post_fence()

... which <linux/atomic.h> uses to build the wrappers.
----

... which is hopefully more legible, too!

> Also, to add to the bikeshedding, would it be worth adding "rmw" in there
> somewhere, e.g. __atomic_post_rmw_fence, since I assume these only
> apply to value-returning stuff?

I don't have any opinion there, but I'm also not sure I've parsed your
rationale correctly. I guess a !RMW full-fence op doesn't make sense? Or
that's something we want to avoid in the API?

AFAICT, we only use __atomic_{pre,post}_fence() for RMW ops today.

Thanks,
Mark.
Will Deacon July 4, 2018, 5:50 p.m. UTC | #4
On Wed, Jul 04, 2018 at 04:56:19PM +0100, Mark Rutland wrote:
> On Wed, Jul 04, 2018 at 04:06:46PM +0100, Will Deacon wrote:
> > On Mon, Jun 25, 2018 at 11:59:47AM +0100, Mark Rutland wrote:
> > > Currently architectures can override __atomic_op_*() to define the barriers
> > > used before/after a relaxed atomic when used to build acquire/release/fence
> > > variants.
> > > 
> > > This has the unfortunate property of requiring the architecture to define the
> > > full wrapper for the atomics, rather than just the barriers they care about,
> > > and gets in the way of generating atomics which can be easily read.
> > > 
> > > Instead, this patch has architectures define an optional set of barriers,
> > > __atomic_mb_{before,after}_{acquire,release,fence}(), which <linux/atomic.h>
> > > uses to build the wrappers.
> > 
> > Looks like you've renamed these in the patch but not updated the commit
> > message.
> 
> Yup; Peter also pointed that out. In my branch this now looks like:
> 
> ----
> Instead, this patch has architectures define an optional set of barriers:
> 
> * __atomic_acquire_fence()
> * __atomic_release_fence()
> * __atomic_pre_fence()
> * __atomic_post_fence()
> 
> ... which <linux/atomic.h> uses to build the wrappers.
> ----
> 
> ... which is hopefully more legible, too!
> 
> > Also, to add to the bikeshedding, would it be worth adding "rmw" in there
> > somewhere, e.g. __atomic_post_rmw_fence, since I assume these only
> > apply to value-returning stuff?
> 
> I don't have any opinion there, but I'm also not sure I've parsed your
> rationale correctly. I guess a !RMW full-fence op doesn't make sense? Or
> that's something we want to avoid in the API?
> 
> AFAICT, we only use __atomic_{pre,post}_fence() for RMW ops today.

No, I think you're right and my terminology is confused. Leave it as-is
for the moment.

Cheers,

Will
Mark Rutland July 5, 2018, 10:12 a.m. UTC | #5
On Wed, Jul 04, 2018 at 06:50:00PM +0100, Will Deacon wrote:
> On Wed, Jul 04, 2018 at 04:56:19PM +0100, Mark Rutland wrote:
> > On Wed, Jul 04, 2018 at 04:06:46PM +0100, Will Deacon wrote:
> > > On Mon, Jun 25, 2018 at 11:59:47AM +0100, Mark Rutland wrote:
> > > > Currently architectures can override __atomic_op_*() to define the barriers
> > > > used before/after a relaxed atomic when used to build acquire/release/fence
> > > > variants.
> > > > 
> > > > This has the unfortunate property of requiring the architecture to define the
> > > > full wrapper for the atomics, rather than just the barriers they care about,
> > > > and gets in the way of generating atomics which can be easily read.
> > > > 
> > > > Instead, this patch has architectures define an optional set of barriers,
> > > > __atomic_mb_{before,after}_{acquire,release,fence}(), which <linux/atomic.h>
> > > > uses to build the wrappers.
> > > 
> > > Looks like you've renamed these in the patch but not updated the commit
> > > message.
> > 
> > Yup; Peter also pointed that out. In my branch this now looks like:
> > 
> > ----
> > Instead, this patch has architectures define an optional set of barriers:
> > 
> > * __atomic_acquire_fence()
> > * __atomic_release_fence()
> > * __atomic_pre_fence()
> > * __atomic_post_fence()
> > 
> > ... which <linux/atomic.h> uses to build the wrappers.
> > ----
> > 
> > ... which is hopefully more legible, too!
> > 
> > > Also, to add to the bikeshedding, would it be worth adding "rmw" in there
> > > somewhere, e.g. __atomic_post_rmw_fence, since I assume these only
> > > apply to value-returning stuff?
> > 
> > I don't have any opinion there, but I'm also not sure I've parsed your
> > rationale correctly. I guess a !RMW full-fence op doesn't make sense? Or
> > that's something we want to avoid in the API?
> > 
> > AFAICT, we only use __atomic_{pre,post}_fence() for RMW ops today.
> 
> No, I think you're right and my terminology is confused. Leave it as-is
> for the moment.

Sure thing.

Perhaps __atomic_{pre,post}_full_fence() might be better, if the aim is
to avoid people erroneously assuming that
__atomic_{pre,post}_fence() are like acquire/release fences.

Thanks,
Mark.
Will Deacon July 5, 2018, 4:25 p.m. UTC | #6
On Thu, Jul 05, 2018 at 11:12:41AM +0100, Mark Rutland wrote:
> On Wed, Jul 04, 2018 at 06:50:00PM +0100, Will Deacon wrote:
> > On Wed, Jul 04, 2018 at 04:56:19PM +0100, Mark Rutland wrote:
> > > On Wed, Jul 04, 2018 at 04:06:46PM +0100, Will Deacon wrote:
> > > > On Mon, Jun 25, 2018 at 11:59:47AM +0100, Mark Rutland wrote:
> > > > > Currently architectures can override __atomic_op_*() to define the barriers
> > > > > used before/after a relaxed atomic when used to build acquire/release/fence
> > > > > variants.
> > > > > 
> > > > > This has the unfortunate property of requiring the architecture to define the
> > > > > full wrapper for the atomics, rather than just the barriers they care about,
> > > > > and gets in the way of generating atomics which can be easily read.
> > > > > 
> > > > > Instead, this patch has architectures define an optional set of barriers,
> > > > > __atomic_mb_{before,after}_{acquire,release,fence}(), which <linux/atomic.h>
> > > > > uses to build the wrappers.
> > > > 
> > > > Looks like you've renamed these in the patch but not updated the commit
> > > > message.
> > > 
> > > Yup; Peter also pointed that out. In my branch this now looks like:
> > > 
> > > ----
> > > Instead, this patch has architectures define an optional set of barriers:
> > > 
> > > * __atomic_acquire_fence()
> > > * __atomic_release_fence()
> > > * __atomic_pre_fence()
> > > * __atomic_post_fence()
> > > 
> > > ... which <linux/atomic.h> uses to build the wrappers.
> > > ----
> > > 
> > > ... which is hopefully more legible, too!
> > > 
> > > > Also, to add to the bikeshedding, would it be worth adding "rmw" in there
> > > > somewhere, e.g. __atomic_post_rmw_fence, since I assume these only
> > > > apply to value-returning stuff?
> > > 
> > > I don't have any opinion there, but I'm also not sure I've parsed your
> > > rationale correctly. I guess a !RMW full-fence op doesn't make sense? Or
> > > that's something we want to avoid in the API?
> > > 
> > > AFAICT, we only use __atomic_{pre,post}_fence() for RMW ops today.
> > 
> > No, I think you're right and my terminology is confused. Leave it as-is
> > for the moment.
> 
> Sure thing.
> 
> Perhaps __atomic_{pre,post}_full_fence() might be better, if the aim is
> to avoid people erroneously assuming that
> __atomic_{pre,post}_fence() are like acquire/release fences.

Good idea, I think that's better.

Will
Patch

diff --git a/arch/alpha/include/asm/atomic.h b/arch/alpha/include/asm/atomic.h
index 4a6a8f58c9c9..df8d9aa2a47e 100644
--- a/arch/alpha/include/asm/atomic.h
+++ b/arch/alpha/include/asm/atomic.h
@@ -18,11 +18,11 @@ 
  * To ensure dependency ordering is preserved for the _relaxed and
  * _release atomics, an smp_read_barrier_depends() is unconditionally
  * inserted into the _relaxed variants, which are used to build the
- * barriered versions. To avoid redundant back-to-back fences, we can
- * define the _acquire and _fence versions explicitly.
+ * barriered versions. Avoid redundant back-to-back fences in the
+ * _acquire and _fence versions.
  */
-#define __atomic_op_acquire(op, args...)	op##_relaxed(args)
-#define __atomic_op_fence			__atomic_op_release
+#define __atomic_acquire_fence()
+#define __atomic_post_fence()
 
 #define ATOMIC_INIT(i)		{ (i) }
 #define ATOMIC64_INIT(i)	{ (i) }
diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index a0156cb43d1f..963abf8bf1c0 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -18,18 +18,11 @@ 
  * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
  * on the platform without lwsync.
  */
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
-	op##_relaxed(args);						\
-})
+#define __atomic_acquire_fence()					\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory")
+
+#define __atomic_release_fence()					\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory")
 
 static __inline__ int atomic_read(const atomic_t *v)
 {
diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 512b89485790..c452359c9cb8 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -25,18 +25,11 @@ 
 
 #define ATOMIC_INIT(i)	{ (i) }
 
-#define __atomic_op_acquire(op, args...)				\
-({									\
-	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	__asm__ __volatile__(RISCV_ACQUIRE_BARRIER "" ::: "memory");	\
-	__ret;								\
-})
-
-#define __atomic_op_release(op, args...)				\
-({									\
-	__asm__ __volatile__(RISCV_RELEASE_BARRIER "" ::: "memory");	\
-	op##_relaxed(args);						\
-})
+#define __atomic_acquire_fence()					\
+	__asm__ __volatile__(RISCV_ACQUIRE_BARRIER "" ::: "memory")
+
+#define __atomic_release_fence()					\
+	__asm__ __volatile__(RISCV_RELEASE_BARRIER "" ::: "memory")
 
 static __always_inline int atomic_read(const atomic_t *v)
 {
diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 8e04f1f69bd9..fc39e4bead30 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -38,40 +38,46 @@ 
  * barriers on top of the relaxed variant. In the case where the relaxed
  * variant is already fully ordered, no additional barriers are needed.
  *
- * Besides, if an arch has a special barrier for acquire/release, it could
- * implement its own __atomic_op_* and use the same framework for building
- * variants
- *
- * If an architecture overrides __atomic_op_acquire() it will probably want
- * to define smp_mb__after_spinlock().
+ * If an architecture overrides __atomic_acquire_fence() it will probably
+ * want to define smp_mb__after_spinlock().
  */
-#ifndef __atomic_op_acquire
+#ifndef __atomic_acquire_fence
+#define __atomic_acquire_fence		smp_mb__after_atomic
+#endif
+
+#ifndef __atomic_release_fence
+#define __atomic_release_fence		smp_mb__before_atomic
+#endif
+
+#ifndef __atomic_pre_fence
+#define __atomic_pre_fence		smp_mb__before_atomic
+#endif
+
+#ifndef __atomic_post_fence
+#define __atomic_post_fence		smp_mb__after_atomic
+#endif
+
 #define __atomic_op_acquire(op, args...)				\
 ({									\
 	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
-	smp_mb__after_atomic();						\
+	__atomic_acquire_fence();					\
 	__ret;								\
 })
-#endif
 
-#ifndef __atomic_op_release
 #define __atomic_op_release(op, args...)				\
 ({									\
-	smp_mb__before_atomic();					\
+	__atomic_release_fence();					\
 	op##_relaxed(args);						\
 })
-#endif
 
-#ifndef __atomic_op_fence
 #define __atomic_op_fence(op, args...)					\
 ({									\
 	typeof(op##_relaxed(args)) __ret;				\
-	smp_mb__before_atomic();					\
+	__atomic_pre_fence();						\
 	__ret = op##_relaxed(args);					\
-	smp_mb__after_atomic();						\
+	__atomic_post_fence();						\
 	__ret;								\
 })
-#endif
 
 /* atomic_add_return_relaxed */
 #ifndef atomic_add_return_relaxed
@@ -1308,4 +1314,13 @@  static inline long long atomic64_dec_if_positive(atomic64_t *v)
 
 #include <asm-generic/atomic-long.h>
 
+#undef __atomic_op_acquire
+#undef __atomic_op_release
+#undef __atomic_op_fence
+
+#undef __atomic_acquire_fence
+#undef __atomic_release_fence
+#undef __atomic_pre_fence
+#undef __atomic_post_fence
+
 #endif /* _LINUX_ATOMIC_H */