[ARM,3/7,ping3] Refactor atomic compare_and_swap to make it fit for ARMv8-M Baseline

Message ID 797c1fda-ba0a-f6be-f9e4-43bba28a8fc7@foss.arm.com
State New
Commit Message

Thomas Preudhomme Oct. 24, 2016, 8:05 a.m. UTC
Ping?

Best regards,

Thomas

On 14/10/16 14:50, Thomas Preudhomme wrote:
> Ping?
>
> Best regards,
>
> Thomas
>
> On 03/10/16 17:44, Thomas Preudhomme wrote:
>> Ping?
>>
>> Best regards,
>>
>> Thomas
>>
>> On 22/09/16 14:44, Thomas Preudhomme wrote:
>>> Hi,
>>>
>>> This patch is part of a patch series to add support for atomic operations on
>>> ARMv8-M Baseline targets in GCC. This specific patch refactors the expander and
>>> splitter for atomics so that the logic works with ARMv8-M Baseline, which has
>>> the limitations of Thumb-1 in terms of CC flag setting and uses different
>>> conditional compare insn patterns.
>>>
>>> ChangeLog entry is as follows:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2016-09-02  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>>>
>>>         * config/arm/arm.c (arm_expand_compare_and_swap): Add new bdst local
>>>         variable.  Add the new parameter to the insn generator.  Set that
>>>         parameter to be CC flag for 32-bit targets, bval otherwise.  Set the
>>>         return value from the negation of that parameter for Thumb-1, keeping
>>>         the logic unchanged otherwise except for using bdst as the destination
>>>         register of the compare_and_swap insn.
>>>         (arm_split_compare_and_swap): Add explanation of how the value is
>>>         returned to the function comment.  Rename scratch variable to
>>>         neg_bval.  Adapt initialization of variables holding operands to the
>>>         new operand numbers.  Use return register to hold result of store
>>>         exclusive for Thumb-1, scratch register otherwise.  Construct the
>>>         appropriate cbranch for Thumb-1 targets, keeping the logic unchanged
>>>         for 32-bit targets.  Guard Z flag setting to restrict to 32-bit targets.
>>>         Use gen_cbranchsi4 rather than hand-written conditional branch to loop
>>>         for strongly ordered compare_and_swap.
>>>         * config/arm/predicates.md (cc_register_operand): New predicate.
>>>         * config/arm/sync.md (atomic_compare_and_swap<mode>_1): Use a
>>>         match_operand with the new predicate to accept either the CC flag or a
>>>         destination register for the boolean return value, restricting it to
>>>         CC flag only via constraint.  Adapt operand numbers accordingly.
>>>
>>>
>>> Testing: No code generation difference for ARMv7-A, ARMv7VE and ARMv8-A on all
>>> atomic and synchronization testcases in the testsuite [2]. Patchset was also
>>> bootstrapped with --enable-itm --enable-gomp on ARMv8-A in ARM and Thumb mode at
>>> optimization level -O1 and above [1] without any regression in the testsuite and
>>> no code generation difference in libitm and libgomp.
>>>
>>> Code generation for ARMv8-M Baseline has been manually examined and compared
>>> against ARMv8-A Thumb-2 for the following configurations without finding any
>>> issue:
>>>
>>> gcc.dg/atomic-op-2.c at -Os
>>> gcc.dg/atomic-compare-exchange-2.c at -Os
>>> gcc.dg/atomic-compare-exchange-3.c at -O3
>>>
>>>
>>> Is this ok for trunk?
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> [1] CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET were set to "-O1 -g" and "-O3 -g",
>>> and left undefined (equivalent to "-O2 -g")
>>> [2] The exact list is:
>>>
>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-1.c
>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-2.c
>>> gcc/testsuite/gcc.dg/atomic-compare-exchange-3.c
>>> gcc/testsuite/gcc.dg/atomic-exchange-1.c
>>> gcc/testsuite/gcc.dg/atomic-exchange-2.c
>>> gcc/testsuite/gcc.dg/atomic-exchange-3.c
>>> gcc/testsuite/gcc.dg/atomic-fence.c
>>> gcc/testsuite/gcc.dg/atomic-flag.c
>>> gcc/testsuite/gcc.dg/atomic-generic.c
>>> gcc/testsuite/gcc.dg/atomic-generic-aux.c
>>> gcc/testsuite/gcc.dg/atomic-invalid-2.c
>>> gcc/testsuite/gcc.dg/atomic-load-1.c
>>> gcc/testsuite/gcc.dg/atomic-load-2.c
>>> gcc/testsuite/gcc.dg/atomic-load-3.c
>>> gcc/testsuite/gcc.dg/atomic-lockfree.c
>>> gcc/testsuite/gcc.dg/atomic-lockfree-aux.c
>>> gcc/testsuite/gcc.dg/atomic-noinline.c
>>> gcc/testsuite/gcc.dg/atomic-noinline-aux.c
>>> gcc/testsuite/gcc.dg/atomic-op-1.c
>>> gcc/testsuite/gcc.dg/atomic-op-2.c
>>> gcc/testsuite/gcc.dg/atomic-op-3.c
>>> gcc/testsuite/gcc.dg/atomic-op-6.c
>>> gcc/testsuite/gcc.dg/atomic-store-1.c
>>> gcc/testsuite/gcc.dg/atomic-store-2.c
>>> gcc/testsuite/gcc.dg/atomic-store-3.c
>>> gcc/testsuite/g++.dg/ext/atomic-1.C
>>> gcc/testsuite/g++.dg/ext/atomic-2.C
>>> gcc/testsuite/gcc.target/arm/atomic-comp-swap-release-acquire.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-acq_rel.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-acquire.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-char.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-consume.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-int.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-relaxed.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-release.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-seq_cst.c
>>> gcc/testsuite/gcc.target/arm/atomic-op-short.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_1.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_2.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_3.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_4.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_5.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_6.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_7.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_8.c
>>> gcc/testsuite/gcc.target/arm/atomic_loaddi_9.c
>>> gcc/testsuite/gcc.target/arm/sync-1.c
>>> gcc/testsuite/gcc.target/arm/synchronize.c
>>> gcc/testsuite/gcc.target/arm/armv8-sync-comp-swap.c
>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-acquire.c
>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-full.c
>>> gcc/testsuite/gcc.target/arm/armv8-sync-op-release.c
>>> libstdc++-v3/testsuite/29_atomics/atomic/60658.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/64658.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/65147.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/65913.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/70766.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/49445.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/constexpr.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/copy_list.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/default.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/direct_list.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/single_value.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/cons/user_pod.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/51811.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/56011.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_assignment.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/integral_conversion.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/operators/pointer_partial_void.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/base_classes.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/compare_exchange_lowering.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic/requirements/explicit_instantiation/1.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/1.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/56012.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/aggregate.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/default.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/standard_layout.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/requirements/trivial.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/60940.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/65147.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/constexpr.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/copy_list.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/default.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/direct_list.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/cons/single_value.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/bitwise.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/decrement.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/increment.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_assignment.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/operators/integral_conversion.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/standard_layout.cc
>>> libstdc++-v3/testsuite/29_atomics/atomic_integral/requirements/trivial.cc
>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/functions_std_c++0x.cc
>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/macros.cc
>>> libstdc++-v3/testsuite/29_atomics/headers/atomic/types_std_c++0x.cc
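
For readers less familiar with the operation being refactored: the compare_and_swap patterns touched by this series are what GCC uses when expanding the __atomic_compare_exchange family of built-ins. The sketch below shows such a call from C. The helper names cas_u32 and fetch_increment are illustrative assumptions; they are not part of the patch or of the testcases listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Replace *ptr with DESIRED only if it still holds *EXPECTED.
   Returns true on success.  On failure, *EXPECTED is updated to the
   value that was actually observed in *ptr.  */
static bool
cas_u32 (uint32_t *ptr, uint32_t *expected, uint32_t desired)
{
  return __atomic_compare_exchange_n (ptr, expected, desired,
				      false /* strong, not weak */,
				      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Atomic increment built from a compare-and-swap retry loop.
   Returns the value seen before the increment.  */
static uint32_t
fetch_increment (uint32_t *counter)
{
  uint32_t seen;

  assert (counter != 0);
  seen = __atomic_load_n (counter, __ATOMIC_RELAXED);
  /* On failure cas_u32 refreshes SEEN, so the loop simply retries.  */
  while (!cas_u32 (counter, &seen, seen + 1))
    ;
  return seen;
}
```

The boolean returned by the built-in corresponds to the "bool out" operand of the insn pattern, which is the value this patch allows to live either in the CC flag or in a general register.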

Comments

Kyrill Tkachov Oct. 26, 2016, 3:47 p.m. UTC | #1
Hi Thomas,

>>> On 22/09/16 14:44, Thomas Preudhomme wrote:

>>>> Is this ok for trunk?

>>>>


This is ok.
Thanks,
Kyrill


Patch
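
Before reading the diff, it may help to see the success convention described in the arm_split_compare_and_swap comment as a plain C model. This is a single-threaded sketch of the Thumb-1 flavour only, under the assumption that a store-exclusive reports 0 on success and non-zero on failure. The names monitor, model_load_exclusive, model_store_exclusive and model_cas_thumb1 are hypothetical and do not exist in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the exclusive monitor: FAIL_BUDGET is the
   number of store-exclusive attempts that will be made to fail.  */
struct monitor
{
  int fail_budget;
};

static uint32_t
model_load_exclusive (const uint32_t *mem)
{
  return *mem;
}

/* Returns 0 on success and 1 on failure, mirroring the status result
   of a store-exclusive instruction.  */
static uint32_t
model_store_exclusive (struct monitor *mon, uint32_t *mem, uint32_t newval)
{
  if (mon->fail_budget > 0)
    {
      mon->fail_budget--;
      return 1;
    }
  *mem = newval;
  return 0;
}

/* Model of the split sequence for Thumb-1: neg_bval == 0 means success
   and the boolean result is its negation.  */
static bool
model_cas_thumb1 (struct monitor *mon, uint32_t *mem, uint32_t *rval,
		  uint32_t oldval, uint32_t newval, bool is_weak)
{
  uint32_t neg_bval;

  assert (mon != 0 && mem != 0 && rval != 0);
  do
    {
      neg_bval = 1;			/* Assume failure.  */
      *rval = model_load_exclusive (mem);
      if (*rval != oldval)
	break;				/* Comparison failed: skip the store.  */
      neg_bval = model_store_exclusive (mon, mem, newval);
    }
  while (!is_weak && neg_bval != 0);	/* Strong CAS retries the store.  */

  return neg_bval == 0;			/* bval is the negation of neg_bval.  */
}
```

In the model, a weak compare-and-swap may report failure even though the comparison succeeded, whereas the strong variant loops until the store goes through or the comparison fails.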

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 39e3aa85c0cc1d42b0c58dda143513feb248827e..c3249d42ae6720369eaaebb460b25687fde0af6c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28152,9 +28152,9 @@  emit_unlikely_jump (rtx insn)
 void
 arm_expand_compare_and_swap (rtx operands[])
 {
-  rtx bval, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x;
+  rtx bval, bdst, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x;
   machine_mode mode;
-  rtx (*gen) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  rtx (*gen) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
 
   bval = operands[0];
   rval = operands[1];
@@ -28211,43 +28211,54 @@  arm_expand_compare_and_swap (rtx operands[])
       gcc_unreachable ();
     }
 
-  emit_insn (gen (rval, mem, oldval, newval, is_weak, mod_s, mod_f));
+  bdst = TARGET_THUMB1 ? bval : gen_rtx_REG (CCmode, CC_REGNUM);
+  emit_insn (gen (bdst, rval, mem, oldval, newval, is_weak, mod_s, mod_f));
 
   if (mode == QImode || mode == HImode)
     emit_move_insn (operands[1], gen_lowpart (mode, rval));
 
   /* In all cases, we arrange for success to be signaled by Z set.
      This arrangement allows for the boolean result to be used directly
-     in a subsequent branch, post optimization.  */
-  x = gen_rtx_REG (CCmode, CC_REGNUM);
-  x = gen_rtx_EQ (SImode, x, const0_rtx);
-  emit_insn (gen_rtx_SET (bval, x));
+     in a subsequent branch, post optimization.  For Thumb-1 targets, the
+     boolean negation of the result is also stored in bval because Thumb-1
+     backend lacks dependency tracking for CC flag due to flag-setting not
+     being represented at RTL level.  */
+  if (TARGET_THUMB1)
+    emit_insn (gen_cstoresi_eq0_thumb1 (bval, bdst));
+  else
+    {
+      x = gen_rtx_EQ (SImode, bdst, const0_rtx);
+      emit_insn (gen_rtx_SET (bval, x));
+    }
 }
 
 /* Split a compare and swap pattern.  It is IMPLEMENTATION DEFINED whether
    another memory store between the load-exclusive and store-exclusive can
    reset the monitor from Exclusive to Open state.  This means we must wait
    until after reload to split the pattern, lest we get a register spill in
-   the middle of the atomic sequence.  */
+   the middle of the atomic sequence.  Success of the compare and swap is
+   indicated by the Z flag set for 32bit targets and by neg_bval being zero
+   for Thumb-1 targets (i.e. negation of the boolean value returned by
+   atomic_compare_and_swapmode standard pattern in operand 0).  */
 
 void
 arm_split_compare_and_swap (rtx operands[])
 {
-  rtx rval, mem, oldval, newval, scratch;
+  rtx rval, mem, oldval, newval, neg_bval;
   machine_mode mode;
   enum memmodel mod_s, mod_f;
   bool is_weak;
   rtx_code_label *label1, *label2;
   rtx x, cond;
 
-  rval = operands[0];
-  mem = operands[1];
-  oldval = operands[2];
-  newval = operands[3];
-  is_weak = (operands[4] != const0_rtx);
-  mod_s = memmodel_from_int (INTVAL (operands[5]));
-  mod_f = memmodel_from_int (INTVAL (operands[6]));
-  scratch = operands[7];
+  rval = operands[1];
+  mem = operands[2];
+  oldval = operands[3];
+  newval = operands[4];
+  is_weak = (operands[5] != const0_rtx);
+  mod_s = memmodel_from_int (INTVAL (operands[6]));
+  mod_f = memmodel_from_int (INTVAL (operands[7]));
+  neg_bval = TARGET_THUMB1 ? operands[0] : operands[8];
   mode = GET_MODE (mem);
 
   bool is_armv8_sync = arm_arch8 && is_mm_sync (mod_s);
@@ -28279,26 +28290,44 @@  arm_split_compare_and_swap (rtx operands[])
 
   arm_emit_load_exclusive (mode, rval, mem, use_acquire);
 
-  cond = arm_gen_compare_reg (NE, rval, oldval, scratch);
-  x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
-  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-  emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+  /* Z is set to 0 for 32bit targets (resp. neg_bval is 1) if oldval != rval,
+     as required to communicate with arm_expand_compare_and_swap.  */
+  if (TARGET_32BIT)
+    {
+      cond = arm_gen_compare_reg (NE, rval, oldval, neg_bval);
+      x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
+      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+				gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
+      emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+    }
+  else
+    {
+      emit_move_insn (neg_bval, const1_rtx);
+      cond = gen_rtx_NE (VOIDmode, rval, oldval);
+      if (thumb1_cmpneg_operand (oldval, SImode))
+	emit_unlikely_jump (gen_cbranchsi4_scratch (neg_bval, rval, oldval,
+						    label2, cond));
+      else
+	emit_unlikely_jump (gen_cbranchsi4_insn (cond, rval, oldval, label2));
+    }
 
-  arm_emit_store_exclusive (mode, scratch, mem, newval, use_release);
+  arm_emit_store_exclusive (mode, neg_bval, mem, newval, use_release);
 
   /* Weak or strong, we want EQ to be true for success, so that we
      match the flags that we got from the compare above.  */
-  cond = gen_rtx_REG (CCmode, CC_REGNUM);
-  x = gen_rtx_COMPARE (CCmode, scratch, const0_rtx);
-  emit_insn (gen_rtx_SET (cond, x));
+  if (TARGET_32BIT)
+    {
+      cond = gen_rtx_REG (CCmode, CC_REGNUM);
+      x = gen_rtx_COMPARE (CCmode, neg_bval, const0_rtx);
+      emit_insn (gen_rtx_SET (cond, x));
+    }
 
   if (!is_weak)
     {
-      x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
-      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-				gen_rtx_LABEL_REF (Pmode, label1), pc_rtx);
-      emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+      /* Z is set to boolean value of !neg_bval, as required to communicate
+	 with arm_expand_compare_and_swap.  */
+      x = gen_rtx_NE (VOIDmode, neg_bval, const0_rtx);
+      emit_unlikely_jump (gen_cbranchsi4 (x, neg_bval, const0_rtx, label1));
     }
 
   if (!is_mm_relaxed (mod_f))
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 3e747d682300fe4c232a618e9a549a833ee153fe..af727edaa570fe67948c4432d9fa7bb90815feb8 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -391,6 +391,12 @@ 
 	     || mode == CC_DGTUmode));
 })
 
+;; Any register, including CC
+(define_predicate "cc_register_operand"
+  (and (match_code "reg")
+       (ior (match_operand 0 "s_register_operand")
+	    (match_operand 0 "cc_register"))))
+
 (define_special_predicate "arm_extendqisi_mem_op"
   (and (match_operand 0 "memory_operand")
        (match_test "TARGET_ARM ? arm_legitimate_address_outer_p (mode,
diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md
index d36c24f76f670d7602f766d7172286504faa7af5..b4e0713108d9867d7226fad3241e46d1faf3172a 100644
--- a/gcc/config/arm/sync.md
+++ b/gcc/config/arm/sync.md
@@ -190,20 +190,20 @@ 
 })
 
 (define_insn_and_split "atomic_compare_and_swap<mode>_1"
-  [(set (reg:CC_Z CC_REGNUM)					;; bool out
+  [(set (match_operand 0 "cc_register_operand" "=&c")		;; bool out
 	(unspec_volatile:CC_Z [(const_int 0)] VUNSPEC_ATOMIC_CAS))
-   (set (match_operand:SI 0 "s_register_operand" "=&r")		;; val out
+   (set (match_operand:SI 1 "s_register_operand" "=&r")		;; val out
 	(zero_extend:SI
-	  (match_operand:NARROW 1 "mem_noofs_operand" "+Ua")))	;; memory
-   (set (match_dup 1)
+	  (match_operand:NARROW 2 "mem_noofs_operand" "+Ua")))	;; memory
+   (set (match_dup 2)
 	(unspec_volatile:NARROW
-	  [(match_operand:SI 2 "arm_add_operand" "rIL")		;; expected
-	   (match_operand:NARROW 3 "s_register_operand" "r")	;; desired
-	   (match_operand:SI 4 "const_int_operand")		;; is_weak
-	   (match_operand:SI 5 "const_int_operand")		;; mod_s
-	   (match_operand:SI 6 "const_int_operand")]		;; mod_f
+	  [(match_operand:SI 3 "arm_add_operand" "rIL")		;; expected
+	   (match_operand:NARROW 4 "s_register_operand" "r")	;; desired
+	   (match_operand:SI 5 "const_int_operand")		;; is_weak
+	   (match_operand:SI 6 "const_int_operand")		;; mod_s
+	   (match_operand:SI 7 "const_int_operand")]		;; mod_f
 	  VUNSPEC_ATOMIC_CAS))
-   (clobber (match_scratch:SI 7 "=&r"))]
+   (clobber (match_scratch:SI 8 "=&r"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"
@@ -219,19 +219,19 @@ 
   [(SI "rIL") (DI "rDi")])
 
 (define_insn_and_split "atomic_compare_and_swap<mode>_1"
-  [(set (reg:CC_Z CC_REGNUM)					;; bool out
+  [(set (match_operand 0 "cc_register_operand" "=&c")		;; bool out
 	(unspec_volatile:CC_Z [(const_int 0)] VUNSPEC_ATOMIC_CAS))
-   (set (match_operand:SIDI 0 "s_register_operand" "=&r")	;; val out
-	(match_operand:SIDI 1 "mem_noofs_operand" "+Ua"))	;; memory
-   (set (match_dup 1)
+   (set (match_operand:SIDI 1 "s_register_operand" "=&r")	;; val out
+	(match_operand:SIDI 2 "mem_noofs_operand" "+Ua"))	;; memory
+   (set (match_dup 2)
 	(unspec_volatile:SIDI
-	  [(match_operand:SIDI 2 "<cas_cmp_operand>" "<cas_cmp_str>") ;; expect
-	   (match_operand:SIDI 3 "s_register_operand" "r")	;; desired
-	   (match_operand:SI 4 "const_int_operand")		;; is_weak
-	   (match_operand:SI 5 "const_int_operand")		;; mod_s
-	   (match_operand:SI 6 "const_int_operand")]		;; mod_f
+	  [(match_operand:SIDI 3 "<cas_cmp_operand>" "<cas_cmp_str>") ;; expect
+	   (match_operand:SIDI 4 "s_register_operand" "r")	;; desired
+	   (match_operand:SI 5 "const_int_operand")		;; is_weak
+	   (match_operand:SI 6 "const_int_operand")		;; mod_s
+	   (match_operand:SI 7 "const_int_operand")]		;; mod_f
 	  VUNSPEC_ATOMIC_CAS))
-   (clobber (match_scratch:SI 7 "=&r"))]
+   (clobber (match_scratch:SI 8 "=&r"))]
   "<sync_predtab>"
   "#"
   "&& reload_completed"