diff mbox series

[PULL,1/3] linux-user: un-parent OBJECT(cpu) when closing thread

Message ID 20220816122621.2066292-2-alex.bennee@linaro.org
State Accepted
Commit 52f0c1607671293afcdb2acc2f83e9bccbfa74bb
Headers show
Series [PULL,1/3] linux-user: un-parent OBJECT(cpu) when closing thread | expand

Commit Message

Alex Bennée Aug. 16, 2022, 12:26 p.m. UTC
While forcing the CPU to unrealize by hand does trigger the clean-up
code we never fully free resources because refcount never reaches
zero. This is because QOM automatically added objects without an
explicit parent to /unattached/, incrementing the refcount.

Instead of manually triggering unrealization just unparent the object
and let the device machinery deal with that for us.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220811151413.3350684-2-alex.bennee@linaro.org>

Comments

Richard Henderson Aug. 19, 2022, 1:02 a.m. UTC | #1
On 8/16/22 05:26, Alex Bennée wrote:
> While forcing the CPU to unrealize by hand does trigger the clean-up
> code we never fully free resources because refcount never reaches
> zero. This is because QOM automatically added objects without an
> explicit parent to /unattached/, incrementing the refcount.
> 
> Instead of manually triggering unrealization just unparent the object
> and let the device machinery deal with that for us.
> 
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
> Message-Id: <20220811151413.3350684-2-alex.bennee@linaro.org>
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index f409121202..bfdd60136b 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
>           if (CPU_NEXT(first_cpu)) {
>               TaskState *ts = cpu->opaque;
>   
> -            object_property_set_bool(OBJECT(cpu), "realized", false, NULL);
> +            if (ts->child_tidptr) {
> +                put_user_u32(0, ts->child_tidptr);
> +                do_sys_futex(g2h(cpu, ts->child_tidptr),
> +                             FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
> +            }
> +
> +            object_unparent(OBJECT(cpu));

This has caused a regression in arm/aarch64.

We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling into 
helper_{get,set}cp_reg{,64}.  So we have a race condition between whichever cpu thread 
translates the code first (encoding the pointer), and that cpu thread exiting, so that the 
next execution of the TB references a freed data structure.

We shouldn't have N copies of these pointers in the first place.  This seems like 
something that ought to be placed on the ARMCPUClass, so that it could be shared by each cpu.

But that's going to be a complex fix, so I'm reverting this for rc4.


r~
Alex Bennée Aug. 19, 2022, 8:37 a.m. UTC | #2
Richard Henderson <richard.henderson@linaro.org> writes:

> On 8/16/22 05:26, Alex Bennée wrote:
>> While forcing the CPU to unrealize by hand does trigger the clean-up
>> code we never fully free resources because refcount never reaches
>> zero. This is because QOM automatically added objects without an
>> explicit parent to /unattached/, incrementing the refcount.
>> Instead of manually triggering unrealization just unparent the
>> object
>> and let the device machinery deal with that for us.
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/866
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
>> Message-Id: <20220811151413.3350684-2-alex.bennee@linaro.org>
>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> index f409121202..bfdd60136b 100644
>> --- a/linux-user/syscall.c
>> +++ b/linux-user/syscall.c
>> @@ -8594,7 +8594,13 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
>>           if (CPU_NEXT(first_cpu)) {
>>               TaskState *ts = cpu->opaque;
>>   -            object_property_set_bool(OBJECT(cpu), "realized",
>> false, NULL);
>> +            if (ts->child_tidptr) {
>> +                put_user_u32(0, ts->child_tidptr);
>> +                do_sys_futex(g2h(cpu, ts->child_tidptr),
>> +                             FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
>> +            }
>> +
>> +            object_unparent(OBJECT(cpu));
>
> This has caused a regression in arm/aarch64.
>
> We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling
> into helper_{get,set}cp_reg{,64}.  So we have a race condition between
> whichever cpu thread translates the code first (encoding the pointer),
> and that cpu thread exiting, so that the next execution of the TB
> references a freed data structure.

What is the test case that breaks this? I guess a multi-threaded
sysregs.c would trigger it?

> We shouldn't have N copies of these pointers in the first place.  This
> seems like something that ought to be placed on the ARMCPUClass, so
> that it could be shared by each cpu.
>
> But that's going to be a complex fix, so I'm reverting this for rc4.
>
>
> r~
Richard Henderson Aug. 19, 2022, 2:36 p.m. UTC | #3
On 8/19/22 01:37, Alex Bennée wrote:
>> This has caused a regression in arm/aarch64.
>>
>> We hard-code ARMCPRegInfo pointers into TranslationBlocks, for calling
>> into helper_{get,set}cp_reg{,64}.  So we have a race condition between
>> whichever cpu thread translates the code first (encoding the pointer),
>> and that cpu thread exiting, so that the next execution of the TB
>> references a freed data structure.
> 
> What is the test case that breaks this? I guess a multi-threaded
> sysregs.c would trigger it?

E.g. tests/tcg/aarch64-linux-user/signals.


r~
diff mbox series

Patch

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f409121202..bfdd60136b 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8594,7 +8594,13 @@  static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
         if (CPU_NEXT(first_cpu)) {
             TaskState *ts = cpu->opaque;
 
-            object_property_set_bool(OBJECT(cpu), "realized", false, NULL);
+            if (ts->child_tidptr) {
+                put_user_u32(0, ts->child_tidptr);
+                do_sys_futex(g2h(cpu, ts->child_tidptr),
+                             FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
+            }
+
+            object_unparent(OBJECT(cpu));
             object_unref(OBJECT(cpu));
             /*
              * At this point the CPU should be unrealized and removed
@@ -8604,11 +8610,6 @@  static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
 
             pthread_mutex_unlock(&clone_lock);
 
-            if (ts->child_tidptr) {
-                put_user_u32(0, ts->child_tidptr);
-                do_sys_futex(g2h(cpu, ts->child_tidptr),
-                             FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
-            }
             thread_cpu = NULL;
             g_free(ts);
             rcu_unregister_thread();