diff mbox

[v3,03/15] exec-all.h: revert tb_page_addr_t to target_ulong

Message ID 20160930213106.20186-4-alex.bennee@linaro.org
State New
Headers show

Commit Message

Alex Bennée Sept. 30, 2016, 9:30 p.m. UTC
Commit b480d9b74 converted tb_page_addr_t to abi_ulong which while the
right size imposes additional alignment restrictions on the type. This
gets in the way of using atomic accesses on certain guest platforms
which allow finer alignments. As tb_page_addr_t isn't actually visible
to the guest we can revert the change.

This is potentially less efficient for ILP32 style guests but it is the
simpler change to make.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
 include/exec/exec-all.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.9.3

Comments

Alex Bennée Oct. 3, 2016, 9:32 a.m. UTC | #1
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 30/09/2016 23:30, Alex Bennée wrote:

>> Commit b480d9b74 converted tb_page_addr_t to abi_ulong which while the

>> right size imposes additional alignment restrictions on the type. This

>> gets in the way of using atomic accesses on certain guest platforms

>> which allow finer alignments. As tb_page_addr_t isn't actually visible

>> to the guest we can revert the change.

>>

>> This is potentially less efficient for ILP32 style guests but it is the

>> simpler change to make.

>>

>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

>> ---

>>  include/exec/exec-all.h | 2 +-

>>  1 file changed, 1 insertion(+), 1 deletion(-)

>>

>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h

>> index 336a57c..c3596a6 100644

>> --- a/include/exec/exec-all.h

>> +++ b/include/exec/exec-all.h

>> @@ -30,7 +30,7 @@

>>     addresses in userspace mode.  Define tb_page_addr_t to be an appropriate

>>     type.  */

>>  #if defined(CONFIG_USER_ONLY)

>> -typedef abi_ulong tb_page_addr_t;

>> +typedef target_ulong tb_page_addr_t;

>>  #else

>>  typedef ram_addr_t tb_page_addr_t;

>>  #endif

>>

>

> I think target_ulong is in general not accessible atomically, because we

> still haven't dropped 64-on-32 support.


True. I've been thinking about our options here.

1. For atomic_read/set fall back to plain access for sizeof(*ptr) >
sizeof(void *)

This would throw up warnings in ThreadSanitizer on 64-on-32 but
practically generate the same code as before.

2. Split sizeof(*ptr) > sizeof(void *) accesses over two

This would keep the sanitizer happy but be incorrect behaviour, you
could get tears.

3. Add a generic 64-on-32 lock for these accesses

It would be a global lock for any oversized access which could end up
being highly contended but at least any behaviour would be always
correct.

It's tricky because pretty much all of the atomic_set/read use is purely
for C11 correctness, actual sequential correctness is guaranteed by
using more restrictive primitives and/or additional locking. I'm not
suggesting we support 64-on-32 for any of the more strict primitives.

However the series as a whole does have value. As you can see from the
other patches there are some real races being picked up by the sanitizer
which only really become visible when a) you remove the noise of the
"false" positives and b) run the test many many times. For example this
one:

==================
WARNING: ThreadSanitizer: data race (pid=24906)
  Read of size 8 at 0x7db4000261f0 by thread T3 (mutexes: write M8203):
    #0 do_tb_flush /home/alex/lsrc/qemu/qemu.git/translate-all.c:872 (qemu-arm+0x00006000ce68)
    #1 process_queued_cpu_work /home/alex/lsrc/qemu/qemu.git/cpus-common.c:337 (qemu-arm+0x000060116712)
    #2 cpu_loop /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:654 (qemu-arm+0x000060052213)
    #3 clone_func /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:6070 (qemu-arm+0x0000600686fb)
    #4 <null> <null> (libtsan.so.0+0x0000000230d9)

  Previous write of size 8 at 0x7db4000261f0 by main thread (mutexes: write M8):
    #0 cpu_list_add /home/alex/lsrc/qemu/qemu.git/cpus-common.c:87 (qemu-arm+0x000060115b7a)
    #1 cpu_exec_init /home/alex/lsrc/qemu/qemu.git/exec.c:641 (qemu-arm+0x000060009900)
    #2 arm_cpu_initfn /home/alex/lsrc/qemu/qemu.git/target-arm/cpu.c:447 (qemu-arm+0x0000600f833b)
    #3 object_init_with_type qom/object.c:339 (qemu-arm+0x000060156c73)
    #4 object_init_with_type qom/object.c:335 (qemu-arm+0x000060156c35)
    #5 object_initialize_with_type qom/object.c:370 (qemu-arm+0x000060156f35)
    #6 object_new_with_type qom/object.c:478 (qemu-arm+0x00006015763a)
    #7 object_new qom/object.c:488 (qemu-arm+0x00006015769e)
    #8 cpu_generic_init qom/cpu.c:76 (qemu-arm+0x0000601537f4)
    #9 cpu_arm_init /home/alex/lsrc/qemu/qemu.git/target-arm/helper.c:5101 (qemu-arm+0x0000600eed19)
    #10 cpu_copy /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:3744 (qemu-arm+0x000060053516)
    #11 do_fork /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:6107 (qemu-arm+0x000060068832)
    #12 do_syscall /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:9489 (qemu-arm+0x000060073bcc)
    #13 cpu_loop /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:789 (qemu-arm+0x00006005305e)
    #14 main /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:4751 (qemu-arm+0x0000600555c9)

  Location is heap block of size 37200 at 0x7db40001e000 allocated by main thread:
    #0 malloc <null> (libtsan.so.0+0x0000000254a3)
    #1 g_malloc <null> (libglib-2.0.so.0+0x00000004f728)
    #2 object_new qom/object.c:488 (qemu-arm+0x00006015769e)
    #3 cpu_generic_init qom/cpu.c:76 (qemu-arm+0x0000601537f4)
    #4 cpu_arm_init /home/alex/lsrc/qemu/qemu.git/target-arm/helper.c:5101 (qemu-arm+0x0000600eed19)
    #5 cpu_copy /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:3744 (qemu-arm+0x000060053516)
    #6 do_fork /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:6107 (qemu-arm+0x000060068832)
    #7 do_syscall /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:9489 (qemu-arm+0x000060073bcc)
    #8 cpu_loop /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:789 (qemu-arm+0x00006005305e)
    #9 main /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:4751 (qemu-arm+0x0000600555c9)

  Mutex M8203 (0x0000604ad638) created at:
    #0 pthread_mutex_init <null> (libtsan.so.0+0x0000000280b5)
    #1 qemu_mutex_init util/qemu-thread-posix.c:46 (qemu-arm+0x0000601be220)
    #2 code_gen_alloc /home/alex/lsrc/qemu/qemu.git/translate-all.c:739 (qemu-arm+0x00006000c7ce)
    #3 tcg_exec_init /home/alex/lsrc/qemu/qemu.git/translate-all.c:757 (qemu-arm+0x00006000c845)
    #4 main /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:4250 (qemu-arm+0x000060054aca)

  Mutex M8 (0x000060508920) created at:
    #0 pthread_mutex_init <null> (libtsan.so.0+0x0000000280b5)
    #1 qemu_mutex_init util/qemu-thread-posix.c:46 (qemu-arm+0x0000601be220)
    #2 qemu_init_cpu_list /home/alex/lsrc/qemu/qemu.git/cpus-common.c:42 (qemu-arm+0x000060115973)
    #3 main /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:4161 (qemu-arm+0x000060054842)

  Thread T3 (tid=24910, running) created by main thread at:
    #0 pthread_create <null> (libtsan.so.0+0x000000027577)
    #1 do_fork /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:6147 (qemu-arm+0x000060068c1a)
    #2 do_syscall /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:9489 (qemu-arm+0x000060073bcc)
    #3 cpu_loop /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:789 (qemu-arm+0x00006005305e)
    #4 main /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:4751 (qemu-arm+0x0000600555c9)

SUMMARY: ThreadSanitizer: data race /home/alex/lsrc/qemu/qemu.git/translate-all.c:872 do_tb_flush
==================

which showed up on run 619 of a 1000 run test...


>

> Paolo



--
Alex Bennée
Alex Bennée Oct. 3, 2016, 4:16 p.m. UTC | #2
Emilio G. Cota <cota@braap.org> writes:

> On Mon, Oct 03, 2016 at 10:32:55 +0100, Alex Bennée wrote:

> (snip)

>> However the series as a whole does have value. As you can see from the

>> other patches there are some real races being picked up by the sanitizer

>> which only really become visible when a) you remove the noise of the

>> "false" positives and b) run the test many many times. For example this

>> one:

>>

>> ==================

>> WARNING: ThreadSanitizer: data race (pid=24906)

>>   Read of size 8 at 0x7db4000261f0 by thread T3 (mutexes: write M8203):

>>     #0 do_tb_flush /home/alex/lsrc/qemu/qemu.git/translate-all.c:872 (qemu-arm+0x00006000ce68)

>>     #1 process_queued_cpu_work /home/alex/lsrc/qemu/qemu.git/cpus-common.c:337 (qemu-arm+0x000060116712)

>>     #2 cpu_loop /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:654 (qemu-arm+0x000060052213)

>>     #3 clone_func /home/alex/lsrc/qemu/qemu.git/linux-user/syscall.c:6070 (qemu-arm+0x0000600686fb)

>>     #4 <null> <null> (libtsan.so.0+0x0000000230d9)

>>

>>   Previous write of size 8 at 0x7db4000261f0 by main thread (mutexes: write M8):

>>     #0 cpu_list_add /home/alex/lsrc/qemu/qemu.git/cpus-common.c:87 (qemu-arm+0x000060115b7a)

>>     #1 cpu_exec_init /home/alex/lsrc/qemu/qemu.git/exec.c:641 (qemu-arm+0x000060009900)

>>     #2 arm_cpu_initfn /home/alex/lsrc/qemu/qemu.git/target-arm/cpu.c:447 (qemu-arm+0x0000600f833b)

> [..]

>

> Nice! Which patch fixes this--patch 10? It would be cool to have this

> report in the corresponding commit message.


This particular one only actually showed up after I sent the last
series - I'd kicked off 1000 repeating tests just before I boarded my
flight back home :-)

However patch 10 fixes another rare case which is in the class of races
caused by creating or destroying a thread just as we flush.

>

> Thanks,

>

> 		Emilio



--
Alex Bennée
Alex Bennée Oct. 4, 2016, 2:08 p.m. UTC | #3
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 03/10/2016 11:32, Alex Bennée wrote:

>> 1. For atomic_read/set fall back to plain access for sizeof(*ptr) >

>> sizeof(void *)

>>

>> This would throw up warnings in ThreadSanitizer on 64-on-32 but

>> practically generate the same code as before.

>>

>> 2. Split sizeof(*ptr) > sizeof(void *) accesses over two

>>

>> This would keep the sanitizer happy but be incorrect behaviour, you

>> could get tears.

>>

>> 3. Add a generic 64-on-32 lock for these accesses

>>

>> It would be a global lock for any oversized access which could end up

>> being highly contended but at least any behaviour would be always

>> correct.

>

> (3) is what libatomic does.

>

> For generic statistics/counters I have written, but not yet upstreamed,

> a module which defines two abstract data types:

>

> - Stat64 is a 64-bit value and it can do 64-bit add/min/max.  Reads

> block writes, but writes try to bypass the lock if possible (e.g.

> min/max that don't modify the stored value, or add that only touches the

> lower 32-bits).

>

> - Count64 is actually a 63-bit value and can do 32-bit add or 63-bit

> get/set.  Writes block reads, but 32-bit adds try pretty hard not to.

>

> These are meant for the block layer, so they would be necessary even if

> 64-on-32 went away.  Unfortunately, neither is a perfect replacement for

> a 64-bit variable...


With rth's CONFIG_ATOMIC64 patch mingw doesn't complain with this:

atomic.h: relax sizeof(*ptr) > sizeof(*void) check

If the compiler can already handle oversized accesses let it do so.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>


1 file changed, 10 insertions(+), 4 deletions(-)
include/qemu/atomic.h | 14 ++++++++++----

modified   include/qemu/atomic.h
@@ -99,15 +99,21 @@
  * no effect on the generated code but not using the atomic primitives
  * will get flagged by sanitizers as a violation.
  */
+#ifdef CONFIG_ATOMIC64
+#define ATOMIC_BUILD_BUG_ON(ptr) QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(uint64_t));
+#else
+#define ATOMIC_BUILD_BUG_ON(ptr) QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *));
+#endif
+
 #define atomic_read(ptr)                              \
     ({                                                \
-    QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *)); \
-    __atomic_load_n(ptr, __ATOMIC_RELAXED);           \
+        ATOMIC_BUILD_BUG_ON(ptr);                     \
+        __atomic_load_n(ptr, __ATOMIC_RELAXED);       \
     })

 #define atomic_set(ptr, i)  do {                      \
-    QEMU_BUILD_BUG_ON(sizeof(*ptr) > sizeof(void *)); \
-    __atomic_store_n(ptr, i, __ATOMIC_RELAXED);       \
+        ATOMIC_BUILD_BUG_ON(ptr);                     \
+        __atomic_store_n(ptr, i, __ATOMIC_RELAXED);   \
 } while(0)

 /* See above: most compilers currently treat consume and acquire the
diff mbox

Patch

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 336a57c..c3596a6 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -30,7 +30,7 @@ 
    addresses in userspace mode.  Define tb_page_addr_t to be an appropriate
    type.  */
 #if defined(CONFIG_USER_ONLY)
-typedef abi_ulong tb_page_addr_t;
+typedef target_ulong tb_page_addr_t;
 #else
 typedef ram_addr_t tb_page_addr_t;
 #endif