[API-NEXT,v3] api: atomic: added atomic_lock_free_u64

Message ID CAPiYAf4YioHivTBsK1Sj-zruCR9yse+GyfjLsD+eBH=q4E_Fxw@mail.gmail.com
State New
Headers show

Commit Message

Ola Liljedahl Dec. 8, 2015, 4:44 p.m.
The patch doesn't apply. It depends on this patch [API-NEXT,v3,7/7] api:
atomic: added 32 bit acquire and release
<http://patches.opendataplane.org/patch/3435/>

As can be seen here in the patch:
> >

> >  /**

> > + * Atomic operations

> > + *

> > + * Atomic operations listed in a bit field structure.

> > + */

> > +typedef union odp_atomic_op_t {

> > +     /** Operation flags */

> > +     struct {

> > +             uint32_t init      : 1;  /**< Init atomic variable */

> > +             uint32_t load      : 1;  /**< Atomic load */

> > +             uint32_t store     : 1;  /**< Atomic store */

> > +             uint32_t fetch_add : 1;  /**< Atomic fetch and add */

> > +             uint32_t add       : 1;  /**< Atomic add */

> > +             uint32_t fetch_sub : 1;  /**< Atomic fetch and substract */

> > +             uint32_t sub       : 1;  /**< Atomic substract */

> > +             uint32_t fetch_inc : 1;  /**< Atomic fetch and increment */

> > +             uint32_t inc       : 1;  /**< Atomic increment */

> > +             uint32_t fetch_dec : 1;  /**< Atomic fetch and decrement */

> > +             uint32_t dec       : 1;  /**< Atomic decrement */

> > +             uint32_t min       : 1;  /**< Atomic minimum */

> > +             uint32_t max       : 1;  /**< Atomic maximum */

> > +             uint32_t cas       : 1;  /**< Atomic compare and swap */

> > +     } op;

> > +

> > +     /** All bits of the bit field structure.

> > +       * Operation flag mapping is architecture specific. This field can

> > be

> > +       * used to set/clear all flags, or bitwise operations over the

> > entire

> > +       * structure. */

> > +     uint32_t all_bits;

> > +} odp_atomic_op_t;

> > +

> > +/**

> > + * Query which atomic uint64 operations are lock-free

> > + *

> > + * Lock-free implementations have higher performance and scale better

> > than

> > + * implementations using locks. User can decide to use e.g. uint32

> atomic

> > + * variables instead of uint64 to optimize performance on platforms that

> > + * implement a performance critical operation using locks.

> > + *

> > + * @param atomic_op  Pointer to atomic operation structure for storing

> > + *                   operation flags. All bits are initialized to zero

> > during

> > + *                   the operation. The parameter is ignored when NULL.

> > + * @retval 0 None of the operations are lock-free

> > + * @retval 1 Some of the operations are lock-free

> > + * @retval 2 All operations are lock-free

> > + */

> > +int odp_atomic_lock_free_u64(odp_atomic_op_t *atomic_op);

> > +

> > +/**

> >   * @}

> >   */

> >

> > diff --git a/platform/linux-generic/Makefile.am b/platform/linux-

> > generic/Makefile.am

> > index a6b6029..0b7234e 100644

> > --- a/platform/linux-generic/Makefile.am

> > +++ b/platform/linux-generic/Makefile.am

> > @@ -151,6 +151,7 @@ noinst_HEADERS = \

> >                 ${srcdir}/Makefile.inc

> >

> >  __LIB__libodp_la_SOURCES = \

> > +                        odp_atomic.c \

> >                          odp_barrier.c \

> >                          odp_buffer.c \

> >                          odp_classification.c \

> > diff --git a/platform/linux-generic/odp_atomic.c b/platform/linux-

> > generic/odp_atomic.c

> > new file mode 100644

> > index 0000000..996d09a

> > --- /dev/null

> > +++ b/platform/linux-generic/odp_atomic.c

> > @@ -0,0 +1,24 @@

> > +/* Copyright (c) 2015, Linaro Limited

> > + * All rights reserved.

> > + *

> > + * SPDX-License-Identifier:     BSD-3-Clause

> > + */

> > +

> > +#include <odp/atomic.h>

> > +

> > +int odp_atomic_lock_free_u64(odp_atomic_op_t *atomic_op)

> > +{

> > +#if __GCC_ATOMIC_LLONG_LOCK_FREE < 2

> > +     /* All operations have locks */

> > +     if (atomic_op)

> > +             atomic_op->all_bits = 0;

> > +

> > +     return 0;

> > +#else

> > +     /* All operations are lock-free */

> > +     if (atomic_op)

> > +             atomic_op->all_bits = ~((uint32_t)0);

> > +

> > +     return 2;

> > +#endif

> > +}

> > --

> > 2.6.2

> >

> > _______________________________________________

> > lng-odp mailing list

> > lng-odp@lists.linaro.org

> > https://lists.linaro.org/mailman/listinfo/lng-odp

> _______________________________________________

> lng-odp mailing list

> lng-odp@lists.linaro.org

> https://lists.linaro.org/mailman/listinfo/lng-odp

>

Comments

Ola Liljedahl Dec. 9, 2015, 12:31 p.m. | #1
On 9 December 2015 at 11:58, Savolainen, Petri (Nokia - FI/Espoo) <
petri.savolainen@nokia.com> wrote:

> It patches API-NEXT branch. Did you try to apply it to master? Just tried

> it out, and it applies nicely.

>

I get it now. The branch is actually called "api-next" (lower case), not
API-NEXT. You fooled me there. It applies now. I approve of it.


>

> “[API-NEXT,v3,7/7] api: atomic: added 32 bit acquire and release” is in

> api-next.

>

And I never approved of it.


>

>

> The idea is to add those operation-mem_model combinations, that are

> commonly used and make sense (from our opinion, mainly acq/rel ?).

>

Even just relaxed/acquire/release flavours of the basic set of atomic
operations causes a combinatorial explosion.


>

>

> For example:

>

> ·         There will never be store_acq, store_csm, store_acq_rel,

> load_rls, load_acq_rel, etc … since those are illegal combinations by the

> C11 spec

>

And the compiler should detect such violations if the ordering parameter is
a constant.

> ·         seq_cst could be left out altogether, since we could reason

> that it’s commonly an overkill, etc

>

Someone might prefer to program according to sequential consistency instead
of e.g. release consistency. ODP doesn't specify any specific memory model
AFAIK.

> o   we could have a full barrier, which uses it, but do not provide

> atomic_xxx_seq_cst

>

> ·         Is ‘consume’ needed at all?

>

One reason could be to optimise performance on architectures (and CPU
implementations) where the acquire ordering is expensive (e.g. ARMv7A
requires this to be implemented using DMB) and a simple data dependency
would be sufficient (e.g. because the acquired data is accessed through the
pointer which was read or obtained using the atomic operation, possibly
with some intermediate steps (transforming a scalar index to a pointer)). I
don't know what performance load-acquire has on ARMv8 CPU's.


> ·         Relaxed is the default

>

> ·         Is acq_rel interesting with other operations than cas?

>

Exchange?

Every atomic operation which is both a read and a write may be used in
situations where the written data is used to release something (and you
need release ordering) and the read data is used to acquire something (and
you need acquire ordering). As all fetch_xxx operations are atomic RMW
operations, they could potentially be used with acq_rel ordering.


>

>

> I think it’s better to define a set of operations that are really needed

> (and legal),

>

A lot of them are needed but perhaps not in *your* code.

than have a memory order param which opens up all combinations (also
> illegal ones).  Every function would need to check validity of the

> mem_order param, and have branches in the (inlined) code to select the

> correct path. Compiler based atomics have better chance to check and

> optimize code on build time (== can be efficient with the param).

>

GCC already does a good work here, selecting the proper code at compile
time (when the memory order parameter is a constant). I don't see why we
need a different solution from what GCC and Clang do. Some poor
architecture that hasn't got the same level of support in the compiler
(perhaps depending on a very old version of GCC) needs to work a little
harder in their version of odp_atomic.h (I was originally doing the same in
linux-generic odp_atomic_internal.h) but I don't see this as a big problem.
The compiler will still select the proper code path when generating
instructions.

I think this is a suboptimal solution catering to hypothetical problems.



>

> -Petri

>

>

>

>

>

>

>

> *From:* EXT Ola Liljedahl [mailto:ola.liljedahl@linaro.org]

> *Sent:* Tuesday, December 08, 2015 6:44 PM

> *To:* Savolainen, Petri (Nokia - FI/Espoo)

> *Cc:* lng-odp@lists.linaro.org

> *Subject:* Re: [lng-odp] [API-NEXT PATCH v3] api: atomic: added

> atomic_lock_free_u64

>

>

>

> The patch doesn't apply. It depends on this patch [API-NEXT,v3,7/7] api:

> atomic: added 32 bit acquire and release

> <http://patches.opendataplane.org/patch/3435/>

>

>

>

> As can be seen here in the patch:

>

> diff --git a/include/odp/api/atomic.h b/include/odp/api/atomic.h

>

> index 316f13a..5e897c1 100644

>

> --- a/include/odp/api/atomic.h

>

> +++ b/include/odp/api/atomic.h

>

> @@ -388,6 +388,54 @@ void odp_atomic_add_rls_u32(odp_atomic_u32_t *atom,

> uint32_t val);

>

> * void odp_atomic_sub_rls_u32(odp_atomic_u32_t *atom, uint32_t val);*

>

>

>

>  /**

>

> + * Atomic operations

>

> + *

>

> + * Atomic operations listed in a bit field structure.

>

>

>

> What happened to that earlier patch series (cleanup of atomics

> documentation and added some new atomic operations)? I know I wasn't fond

> of adding new operations with just some of the different possible ordering

> models, the choice seem quite haphazard to me. I have used the same

> operations for lock-less programming but with different memory orderings.

> Better to expose the operations in linux-generic odp_atomic_internal.h

> where the memory ordering is a parameter.

>

>

>

>

>

> On 26 November 2015 at 09:56, Savolainen, Petri (Nokia - FI/Espoo) <

> petri.savolainen@nokia.com> wrote:

>

> ping.

>

> > -----Original Message-----

> > From: lng-odp [mailto:lng-odp-bounces@lists.linaro.org] On Behalf Of EXT

> > Petri Savolainen

> > Sent: Thursday, November 12, 2015 10:30 AM

> > To: lng-odp@lists.linaro.org

> > Subject: [lng-odp] [API-NEXT PATCH v3] api: atomic: added

> > atomic_lock_free_u64

> >

> > Platforms may support some uint64 operations lock-free and

> > others not. For example, inc_64 can be natively supported but

> > cas_64 (or max_64/min_64) not. User may be able to switch to

> > 32 bit variables when a 64 bit operation uses locks. The same

> > atomic operation struture could be used for platform init to guide

> > lock vs. lock-free implementation (e.g. if cas_64 is never

> > called, inc_64 can be lock-free).

> >

> > Signed-off-by: Petri Savolainen <petri.savolainen@nokia.com>

> > ---

> >  include/odp/api/atomic.h            | 48

> > +++++++++++++++++++++++++++++++++++++

> >  platform/linux-generic/Makefile.am  |  1 +

> >  platform/linux-generic/odp_atomic.c | 24 +++++++++++++++++++

> >  3 files changed, 73 insertions(+)

> >  create mode 100644 platform/linux-generic/odp_atomic.c

> >

> > diff --git a/include/odp/api/atomic.h b/include/odp/api/atomic.h

> > index 316f13a..5e897c1 100644

> > --- a/include/odp/api/atomic.h

> > +++ b/include/odp/api/atomic.h

> > @@ -388,6 +388,54 @@ void odp_atomic_add_rls_u32(odp_atomic_u32_t *atom,

> > uint32_t val);

> >  void odp_atomic_sub_rls_u32(odp_atomic_u32_t *atom, uint32_t val);

> >

> >  /**

> > + * Atomic operations

> > + *

> > + * Atomic operations listed in a bit field structure.

> > + */

> > +typedef union odp_atomic_op_t {

> > +     /** Operation flags */

> > +     struct {

> > +             uint32_t init      : 1;  /**< Init atomic variable */

> > +             uint32_t load      : 1;  /**< Atomic load */

> > +             uint32_t store     : 1;  /**< Atomic store */

> > +             uint32_t fetch_add : 1;  /**< Atomic fetch and add */

> > +             uint32_t add       : 1;  /**< Atomic add */

> > +             uint32_t fetch_sub : 1;  /**< Atomic fetch and substract */

> > +             uint32_t sub       : 1;  /**< Atomic substract */

> > +             uint32_t fetch_inc : 1;  /**< Atomic fetch and increment */

> > +             uint32_t inc       : 1;  /**< Atomic increment */

> > +             uint32_t fetch_dec : 1;  /**< Atomic fetch and decrement */

> > +             uint32_t dec       : 1;  /**< Atomic decrement */

> > +             uint32_t min       : 1;  /**< Atomic minimum */

> > +             uint32_t max       : 1;  /**< Atomic maximum */

> > +             uint32_t cas       : 1;  /**< Atomic compare and swap */

> > +     } op;

> > +

> > +     /** All bits of the bit field structure.

> > +       * Operation flag mapping is architecture specific. This field can

> > be

> > +       * used to set/clear all flags, or bitwise operations over the

> > entire

> > +       * structure. */

> > +     uint32_t all_bits;

> > +} odp_atomic_op_t;

> > +

> > +/**

> > + * Query which atomic uint64 operations are lock-free

> > + *

> > + * Lock-free implementations have higher performance and scale better

> > than

> > + * implementations using locks. User can decide to use e.g. uint32

> atomic

> > + * variables instead of uint64 to optimize performance on platforms that

> > + * implement a performance critical operation using locks.

> > + *

> > + * @param atomic_op  Pointer to atomic operation structure for storing

> > + *                   operation flags. All bits are initialized to zero

> > during

> > + *                   the operation. The parameter is ignored when NULL.

> > + * @retval 0 None of the operations are lock-free

> > + * @retval 1 Some of the operations are lock-free

> > + * @retval 2 All operations are lock-free

> > + */

> > +int odp_atomic_lock_free_u64(odp_atomic_op_t *atomic_op);

> > +

> > +/**

> >   * @}

> >   */

> >

> > diff --git a/platform/linux-generic/Makefile.am b/platform/linux-

> > generic/Makefile.am

> > index a6b6029..0b7234e 100644

> > --- a/platform/linux-generic/Makefile.am

> > +++ b/platform/linux-generic/Makefile.am

> > @@ -151,6 +151,7 @@ noinst_HEADERS = \

> >                 ${srcdir}/Makefile.inc

> >

> >  __LIB__libodp_la_SOURCES = \

> > +                        odp_atomic.c \

> >                          odp_barrier.c \

> >                          odp_buffer.c \

> >                          odp_classification.c \

> > diff --git a/platform/linux-generic/odp_atomic.c b/platform/linux-

> > generic/odp_atomic.c

> > new file mode 100644

> > index 0000000..996d09a

> > --- /dev/null

> > +++ b/platform/linux-generic/odp_atomic.c

> > @@ -0,0 +1,24 @@

> > +/* Copyright (c) 2015, Linaro Limited

> > + * All rights reserved.

> > + *

> > + * SPDX-License-Identifier:     BSD-3-Clause

> > + */

> > +

> > +#include <odp/atomic.h>

> > +

> > +int odp_atomic_lock_free_u64(odp_atomic_op_t *atomic_op)

> > +{

> > +#if __GCC_ATOMIC_LLONG_LOCK_FREE < 2

> > +     /* All operations have locks */

> > +     if (atomic_op)

> > +             atomic_op->all_bits = 0;

> > +

> > +     return 0;

> > +#else

> > +     /* All operations are lock-free */

> > +     if (atomic_op)

> > +             atomic_op->all_bits = ~((uint32_t)0);

> > +

> > +     return 2;

> > +#endif

> > +}

> > --

> > 2.6.2

> >

> > _______________________________________________

> > lng-odp mailing list

> > lng-odp@lists.linaro.org

> > https://lists.linaro.org/mailman/listinfo/lng-odp

> _______________________________________________

> lng-odp mailing list

> lng-odp@lists.linaro.org

> https://lists.linaro.org/mailman/listinfo/lng-odp

>

>

>

Patch

diff --git a/include/odp/api/atomic.h b/include/odp/api/atomic.h
index 316f13a..5e897c1 100644
--- a/include/odp/api/atomic.h
+++ b/include/odp/api/atomic.h
@@ -388,6 +388,54 @@  void odp_atomic_add_rls_u32(odp_atomic_u32_t *atom,
uint32_t val);
* void odp_atomic_sub_rls_u32(odp_atomic_u32_t *atom, uint32_t val);*

 /**
+ * Atomic operations
+ *
+ * Atomic operations listed in a bit field structure.

What happened to that earlier patch series (cleanup of atomics
documentation and added some new atomic operations)? I know I wasn't fond
of adding new operations with just some of the different possible ordering
models, the choice seem quite haphazard to me. I have used the same
operations for lock-less programming but with different memory orderings.
Better to expose the operations in linux-generic odp_atomic_internal.h
where the memory ordering is a parameter.


On 26 November 2015 at 09:56, Savolainen, Petri (Nokia - FI/Espoo) <
petri.savolainen@nokia.com> wrote:

> ping.
>
> > -----Original Message-----
> > From: lng-odp [mailto:lng-odp-bounces@lists.linaro.org] On Behalf Of EXT
> > Petri Savolainen
> > Sent: Thursday, November 12, 2015 10:30 AM
> > To: lng-odp@lists.linaro.org
> > Subject: [lng-odp] [API-NEXT PATCH v3] api: atomic: added
> > atomic_lock_free_u64
> >
> > Platforms may support some uint64 operations lock-free and
> > others not. For example, inc_64 can be natively supported but
> > cas_64 (or max_64/min_64) not. User may be able to switch to
> > 32 bit variables when a 64 bit operation uses locks. The same
> > atomic operation struture could be used for platform init to guide
> > lock vs. lock-free implementation (e.g. if cas_64 is never
> > called, inc_64 can be lock-free).
> >
> > Signed-off-by: Petri Savolainen <petri.savolainen@nokia.com>
> > ---
> >  include/odp/api/atomic.h            | 48
> > +++++++++++++++++++++++++++++++++++++
> >  platform/linux-generic/Makefile.am  |  1 +
> >  platform/linux-generic/odp_atomic.c | 24 +++++++++++++++++++
> >  3 files changed, 73 insertions(+)
> >  create mode 100644 platform/linux-generic/odp_atomic.c
> >
> > diff --git a/include/odp/api/atomic.h b/include/odp/api/atomic.h
> > index 316f13a..5e897c1 100644
> > --- a/include/odp/api/atomic.h
> > +++ b/include/odp/api/atomic.h
> > @@ -388,6 +388,54 @@ void odp_atomic_add_rls_u32(odp_atomic_u32_t *atom,
> > uint32_t val);
> >  void odp_atomic_sub_rls_u32(odp_atomic_u32_t *atom, uint32_t val);