[2/3] sysdeps/arm/bits/atomic.h: Add a wider range of atomic operations

Message ID 1412349086-11473-3-git-send-email-will.newton@linaro.org
State Superseded

Commit Message

Will Newton Oct. 3, 2014, 3:11 p.m.
For the case where atomic operations are fully supported by the
compiler, expose more of these operations directly to glibc. For
example, instead of implementing atomic_or via the compare-and-exchange
compiler builtin, implement it using the atomic-or compiler builtin
directly.

This results in an approximate 1kB code size reduction in libc.so and
a small improvement on the malloc benchtest:

Before: 266.279
After: 259.073

ChangeLog:

2014-10-03  Will Newton  <will.newton@linaro.org>

	* sysdeps/arm/bits/atomic.h [__GNUC_PREREQ (4, 7) &&
	__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4]
	(__arch_compare_and_exchange_bool_8_int): Define in terms
	of gcc atomic builtin rather than link error.
	(__arch_compare_and_exchange_bool_16_int): Likewise.
	(__arch_compare_and_exchange_bool_64_int): Likewise.
	(__arch_compare_and_exchange_val_8_int): Likewise.
	(__arch_compare_and_exchange_val_16_int): Likewise.
	(__arch_compare_and_exchange_val_64_int): Likewise.
	(__arch_exchange_8_int): Likewise.
	(__arch_exchange_16_int): Likewise.
	(__arch_exchange_64_int): Likewise.
	(__arch_exchange_and_add_8_int): New define.
	(__arch_exchange_and_add_16_int): Likewise.
	(__arch_exchange_and_add_32_int): Likewise.
	(__arch_exchange_and_add_64_int): Likewise.
	(atomic_exchange_and_add_acq): Likewise.
	(atomic_exchange_and_add_rel): Likewise.
	(catomic_exchange_and_add): Likewise.
	(__arch_exchange_and_and_8_int): New define.
	(__arch_exchange_and_and_16_int): Likewise.
	(__arch_exchange_and_and_32_int): Likewise.
	(__arch_exchange_and_and_64_int): Likewise.
	(atomic_and): Likewise.
	(atomic_and_val): Likewise.
	(catomic_and): Likewise.
	(__arch_exchange_and_or_8_int): New define.
	(__arch_exchange_and_or_16_int): Likewise.
	(__arch_exchange_and_or_32_int): Likewise.
	(__arch_exchange_and_or_64_int): Likewise.
	(atomic_or): Likewise.
	(atomic_or_val): Likewise.
	(catomic_or): Likewise.
---
 sysdeps/arm/bits/atomic.h | 203 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 153 insertions(+), 50 deletions(-)

Comments

Joseph Myers Oct. 3, 2014, 4:31 p.m. | #1
On Fri, 3 Oct 2014, Will Newton wrote:

> -# define __arch_compare_and_exchange_bool_64_int(mem, newval, oldval, model) \
> -  ({__arm_link_error (); 0; })
> +# define __arch_exchange_32_int(mem, newval, model)	\
> +  __atomic_exchange_n (mem, newval, model)

I think obvious link errors are desirable for 64-bit atomics, rather than 
possibly calling libatomic functions using locks, or libgcc functions that 
would introduce a hidden dependency on a more recent kernel version (for 
64-bit atomics helpers) than we currently require (the helpers having been 
introduced in 3.1).

(So in a generic header it should be configurable whether it provides 
64-bit atomic operations or not.)
Torvald Riegel Oct. 6, 2014, 1:29 p.m. | #2
On Fri, 2014-10-03 at 16:31 +0000, Joseph S. Myers wrote:
> On Fri, 3 Oct 2014, Will Newton wrote:
> 
> > -# define __arch_compare_and_exchange_bool_64_int(mem, newval, oldval, model) \
> > -  ({__arm_link_error (); 0; })
> > +# define __arch_exchange_32_int(mem, newval, model)	\
> > +  __atomic_exchange_n (mem, newval, model)
> 
> I think obvious link errors are desirable for 64-bit atomics, rather than 
> possibly calling libatomic functions using locks, or libgcc functions that 
> would introduce a hidden dependency on a more recent kernel version (for 
> 64-bit atomics helpers) than we currently require (the helpers having been 
> introduced in 3.1).
> 
> (So in a generic header it should be configurable whether it provides 
> 64-bit atomic operations or not.)

I don't feel like we need link errors for 64b atomics, but if we do I
agree that the generic header should take care of this, under
arch-specific control.  I don't think we need to have finer granularity
than, roughly, "supports 64b atomic ops on naturally aligned 64b
integers".

Some concurrent algorithms might be written differently if an arch
provides atomic read-modify-write ops like atomic_or instead of just a
CAS; but I don't think we have such optimizations in current code, so we
can just add this later (including the arch-specific flags for this)
instead of having to consider this now.
Joseph Myers Oct. 6, 2014, 3:54 p.m. | #3
On Mon, 6 Oct 2014, Torvald Riegel wrote:

> On Fri, 2014-10-03 at 16:31 +0000, Joseph S. Myers wrote:
> > On Fri, 3 Oct 2014, Will Newton wrote:
> > 
> > > -# define __arch_compare_and_exchange_bool_64_int(mem, newval, oldval, model) \
> > > -  ({__arm_link_error (); 0; })
> > > +# define __arch_exchange_32_int(mem, newval, model)	\
> > > +  __atomic_exchange_n (mem, newval, model)
> > 
> > I think obvious link errors are desirable for 64-bit atomics, rather than 
> > possibly calling libatomic functions using locks, or libgcc functions that 
> > would introduce a hidden dependency on a more recent kernel version (for 
> > 64-bit atomics helpers) than we currently require (the helpers having been 
> > introduced in 3.1).
> > 
> > (So in a generic header it should be configurable whether it provides 
> > 64-bit atomic operations or not.)
> 
> I don't feel like we need link errors for 64b atomics, but if we do I

It seems prudent to me to avoid non-obvious problems if such an atomic 
operation creeps in.

> agree that the generic header should take care of this, under
> arch-specific control.  I don't think we need to have finer granularity
> than, roughly, "supports 64b atomic ops on naturally aligned 64b
> integers".

Well, architectures might also configure whether 8-bit or 16-bit atomic 
operations are available (but I'm not sure there's anything in glibc that 
would try to use them even conditionally, so maybe the generic code should 
just make those into link errors unconditionally).

Patch

diff --git a/sysdeps/arm/bits/atomic.h b/sysdeps/arm/bits/atomic.h
index 88cbe67..be314e4 100644
--- a/sysdeps/arm/bits/atomic.h
+++ b/sysdeps/arm/bits/atomic.h
@@ -52,84 +52,184 @@  void __arm_link_error (void);
    a pattern to do this efficiently.  */
 #if __GNUC_PREREQ (4, 7) && defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
 
-# define atomic_exchange_acq(mem, value)                                \
-  __atomic_val_bysize (__arch_exchange, int, mem, value, __ATOMIC_ACQUIRE)
+/* Compare and exchange.
+   For all "bool" routines, we return FALSE if exchange successful.  */
 
-# define atomic_exchange_rel(mem, value)                                \
-  __atomic_val_bysize (__arch_exchange, int, mem, value, __ATOMIC_RELEASE)
+# define __arch_compare_and_exchange_bool_8_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    !__atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				  model, __ATOMIC_RELAXED);		\
+  })
 
-/* Atomic exchange (without compare).  */
+# define __arch_compare_and_exchange_bool_16_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    !__atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				  model, __ATOMIC_RELAXED);		\
+  })
 
-# define __arch_exchange_8_int(mem, newval, model)      \
-  (__arm_link_error (), (typeof (*mem)) 0)
+# define __arch_compare_and_exchange_bool_32_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    !__atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				  model, __ATOMIC_RELAXED);		\
+  })
 
-# define __arch_exchange_16_int(mem, newval, model)     \
-  (__arm_link_error (), (typeof (*mem)) 0)
+#  define __arch_compare_and_exchange_bool_64_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    !__atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				  model, __ATOMIC_RELAXED);		\
+  })
 
-# define __arch_exchange_32_int(mem, newval, model)     \
-  __atomic_exchange_n (mem, newval, model)
+# define __arch_compare_and_exchange_val_8_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    __atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				 model, __ATOMIC_RELAXED);		\
+    __oldval;								\
+  })
+
+# define __arch_compare_and_exchange_val_16_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    __atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				 model, __ATOMIC_RELAXED);		\
+    __oldval;								\
+  })
+
+# define __arch_compare_and_exchange_val_32_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    __atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				 model, __ATOMIC_RELAXED);		\
+    __oldval;								\
+  })
+
+#  define __arch_compare_and_exchange_val_64_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    __atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				 model, __ATOMIC_RELAXED);		\
+    __oldval;								\
+  })
 
-# define __arch_exchange_64_int(mem, newval, model)     \
-  (__arm_link_error (), (typeof (*mem)) 0)
 
 /* Compare and exchange with "acquire" semantics, ie barrier after.  */
 
-# define atomic_compare_and_exchange_bool_acq(mem, new, old)    \
-  __atomic_bool_bysize (__arch_compare_and_exchange_bool, int,  \
-                        mem, new, old, __ATOMIC_ACQUIRE)
+# define atomic_compare_and_exchange_bool_acq(mem, new, old)	\
+  __atomic_bool_bysize (__arch_compare_and_exchange_bool, int,	\
+			mem, new, old, __ATOMIC_ACQUIRE)
 
-# define atomic_compare_and_exchange_val_acq(mem, new, old)     \
-  __atomic_val_bysize (__arch_compare_and_exchange_val, int,    \
-                       mem, new, old, __ATOMIC_ACQUIRE)
+# define atomic_compare_and_exchange_val_acq(mem, new, old)	\
+  __atomic_val_bysize (__arch_compare_and_exchange_val, int,	\
+		       mem, new, old, __ATOMIC_ACQUIRE)
 
 /* Compare and exchange with "release" semantics, ie barrier before.  */
 
-# define atomic_compare_and_exchange_bool_rel(mem, new, old)    \
-  __atomic_bool_bysize (__arch_compare_and_exchange_bool, int,  \
-                        mem, new, old, __ATOMIC_RELEASE)
+# define atomic_compare_and_exchange_bool_rel(mem, new, old)	\
+  __atomic_bool_bysize (__arch_compare_and_exchange_bool, int,	\
+			mem, new, old, __ATOMIC_RELEASE)
 
-# define atomic_compare_and_exchange_val_rel(mem, new, old)      \
+# define atomic_compare_and_exchange_val_rel(mem, new, old)	 \
   __atomic_val_bysize (__arch_compare_and_exchange_val, int,    \
                        mem, new, old, __ATOMIC_RELEASE)
 
-/* Compare and exchange.
-   For all "bool" routines, we return FALSE if exchange succesful.  */
 
-# define __arch_compare_and_exchange_bool_8_int(mem, newval, oldval, model) \
-  ({__arm_link_error (); 0; })
+/* Atomic exchange (without compare).  */
 
-# define __arch_compare_and_exchange_bool_16_int(mem, newval, oldval, model) \
-  ({__arm_link_error (); 0; })
+# define __arch_exchange_8_int(mem, newval, model)	\
+  __atomic_exchange_n (mem, newval, model)
 
-# define __arch_compare_and_exchange_bool_32_int(mem, newval, oldval, model) \
-  ({                                                                    \
-    typeof (*mem) __oldval = (oldval);                                  \
-    !__atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,   \
-                                  model, __ATOMIC_RELAXED);             \
-  })
+# define __arch_exchange_16_int(mem, newval, model)	\
+  __atomic_exchange_n (mem, newval, model)
 
-# define __arch_compare_and_exchange_bool_64_int(mem, newval, oldval, model) \
-  ({__arm_link_error (); 0; })
+# define __arch_exchange_32_int(mem, newval, model)	\
+  __atomic_exchange_n (mem, newval, model)
 
-# define __arch_compare_and_exchange_val_8_int(mem, newval, oldval, model) \
-  ({__arm_link_error (); oldval; })
+#  define __arch_exchange_64_int(mem, newval, model)	\
+  __atomic_exchange_n (mem, newval, model)
 
-# define __arch_compare_and_exchange_val_16_int(mem, newval, oldval, model) \
-  ({__arm_link_error (); oldval; })
+# define atomic_exchange_acq(mem, value)				\
+  __atomic_val_bysize (__arch_exchange, int, mem, value, __ATOMIC_ACQUIRE)
 
-# define __arch_compare_and_exchange_val_32_int(mem, newval, oldval, model) \
-  ({                                                                    \
-    typeof (*mem) __oldval = (oldval);                                  \
-    __atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,    \
-                                 model, __ATOMIC_RELAXED);              \
-    __oldval;                                                           \
-  })
+# define atomic_exchange_rel(mem, value)				\
+  __atomic_val_bysize (__arch_exchange, int, mem, value, __ATOMIC_RELEASE)
+
+
+/* Atomically add value and return the previous (unincremented) value.  */
+
+# define __arch_exchange_and_add_8_int(mem, value, model)	\
+  __atomic_fetch_add (mem, value, model)
+
+# define __arch_exchange_and_add_16_int(mem, value, model)	\
+  __atomic_fetch_add (mem, value, model)
+
+# define __arch_exchange_and_add_32_int(mem, value, model)	\
+  __atomic_fetch_add (mem, value, model)
+
+#  define __arch_exchange_and_add_64_int(mem, value, model)	\
+  __atomic_fetch_add (mem, value, model)
+
+# define atomic_exchange_and_add_acq(mem, value)			\
+  __atomic_val_bysize (__arch_exchange_and_add, int, mem, value,	\
+		       __ATOMIC_ACQUIRE)
+
+# define atomic_exchange_and_add_rel(mem, value)			\
+  __atomic_val_bysize (__arch_exchange_and_add, int, mem, value,	\
+		       __ATOMIC_RELEASE)
+
+# define catomic_exchange_and_add atomic_exchange_and_add
+
+/* Atomically bitwise and value and return the previous value.  */
+
+# define __arch_exchange_and_and_8_int(mem, value, model)	\
+  __atomic_fetch_and (mem, value, model)
+
+# define __arch_exchange_and_and_16_int(mem, value, model)	\
+  __atomic_fetch_and (mem, value, model)
 
-# define __arch_compare_and_exchange_val_64_int(mem, newval, oldval, model) \
-  ({__arm_link_error (); oldval; })
+# define __arch_exchange_and_and_32_int(mem, value, model)	\
+  __atomic_fetch_and (mem, value, model)
+
+#  define __arch_exchange_and_and_64_int(mem, value, model)	\
+  __atomic_fetch_and (mem, value, model)
+
+# define atomic_and(mem, value)						\
+  __atomic_val_bysize (__arch_exchange_and_and, int, mem, value,	\
+		       __ATOMIC_ACQUIRE)
+
+# define atomic_and_val atomic_and
+
+# define catomic_and atomic_and
+
+/* Atomically bitwise or value and return the previous value.  */
+
+# define __arch_exchange_and_or_8_int(mem, value, model)	\
+  __atomic_fetch_or (mem, value, model)
+
+# define __arch_exchange_and_or_16_int(mem, value, model)	\
+  __atomic_fetch_or (mem, value, model)
+
+# define __arch_exchange_and_or_32_int(mem, value, model)	\
+  __atomic_fetch_or (mem, value, model)
+
+#  define __arch_exchange_and_or_64_int(mem, value, model)	\
+  __atomic_fetch_or (mem, value, model)
+
+# define atomic_or(mem, value)					\
+  __atomic_val_bysize (__arch_exchange_and_or, int, mem, value,	\
+		       __ATOMIC_ACQUIRE)
+
+# define atomic_or_val atomic_or
+
+# define catomic_or atomic_or
 
 #elif defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
+
 /* Atomic compare and exchange.  */
+
 # define __arch_compare_and_exchange_val_32_acq(mem, newval, oldval) \
   __sync_val_compare_and_swap ((mem), (oldval), (newval))
 #else
@@ -138,8 +238,10 @@  void __arm_link_error (void);
 #endif
 
 #if !__GNUC_PREREQ (4, 7) || !defined (__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
+
 /* We don't support atomic operations on any non-word types.
    So make them link errors.  */
+
 # define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \
   ({ __arm_link_error (); oldval; })
 
@@ -148,6 +250,7 @@  void __arm_link_error (void);
 
 # define __arch_compare_and_exchange_val_64_acq(mem, newval, oldval) \
   ({ __arm_link_error (); oldval; })
+
 #endif
 
 /* An OS-specific bits/atomic.h file will define this macro if