Message ID | 20240530194250.1801701-6-alex.bennee@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | cpus: a few tweaks to CPU realization | expand |
On 5/30/24 12:42, Alex Bennée wrote: > Originally I tried to move where vCPU thread initialisation to later > in realize. However pulling that thread (sic) got gnarly really > quickly. It turns out some steps of CPU realization need values that > can only be determined from the running vCPU thread. > > However having moved enough out of the thread creation we can now > queue work before the thread starts (at least for TCG guests) and > avoid the race between vcpu_init and other vcpu states a plugin might > subscribe to. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > --- > hw/core/cpu-common.c | 20 ++++++++++++-------- > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c > index 6cfc01593a..bf1a7b8892 100644 > --- a/hw/core/cpu-common.c > +++ b/hw/core/cpu-common.c > @@ -222,14 +222,6 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp) > cpu_resume(cpu); > } > > - /* Plugin initialization must wait until the cpu start executing code */ > -#ifdef CONFIG_PLUGIN > - if (tcg_enabled()) { > - cpu->plugin_state = qemu_plugin_create_vcpu_state(); > - async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); > - } > -#endif > - > /* NOTE: latest generic point where the cpu is fully realized */ > } > > @@ -273,6 +265,18 @@ static void cpu_common_initfn(Object *obj) > QTAILQ_INIT(&cpu->watchpoints); > > cpu_exec_initfn(cpu); > + > + /* > + * Plugin initialization must wait until the cpu start executing > + * code, but we must queue this work before the threads are > + * created to ensure we don't race. > + */ > +#ifdef CONFIG_PLUGIN > + if (tcg_enabled()) { > + cpu->plugin_state = qemu_plugin_create_vcpu_state(); > + async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); > + } > +#endif > } > > static void cpu_common_finalize(Object *obj) Could you check it works for all combination? - user-mode - system-mode tcg - system-mode mttcg When I tried to move this code around, one of them didn't work correctly. Else, Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
On 30/5/24 21:42, Alex Bennée wrote: > Originally I tried to move where vCPU thread initialisation to later > in realize. However pulling that thread (sic) got gnarly really > quickly. It turns out some steps of CPU realization need values that > can only be determined from the running vCPU thread. FYI: https://lore.kernel.org/qemu-devel/20240528145953.65398-6-philmd@linaro.org/ > However having moved enough out of the thread creation we can now > queue work before the thread starts (at least for TCG guests) and > avoid the race between vcpu_init and other vcpu states a plugin might > subscribe to. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > --- > hw/core/cpu-common.c | 20 ++++++++++++-------- > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c > index 6cfc01593a..bf1a7b8892 100644 > --- a/hw/core/cpu-common.c > +++ b/hw/core/cpu-common.c > @@ -222,14 +222,6 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp) > cpu_resume(cpu); > } > > - /* Plugin initialization must wait until the cpu start executing code */ > -#ifdef CONFIG_PLUGIN > - if (tcg_enabled()) { > - cpu->plugin_state = qemu_plugin_create_vcpu_state(); > - async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); > - } > -#endif > - > /* NOTE: latest generic point where the cpu is fully realized */ > } > > @@ -273,6 +265,18 @@ static void cpu_common_initfn(Object *obj) > QTAILQ_INIT(&cpu->watchpoints); > > cpu_exec_initfn(cpu); > + > + /* > + * Plugin initialization must wait until the cpu start executing > + * code, but we must queue this work before the threads are > + * created to ensure we don't race. > + */ > +#ifdef CONFIG_PLUGIN > + if (tcg_enabled()) { > + cpu->plugin_state = qemu_plugin_create_vcpu_state(); Per https://etherpad.opendev.org/p/QEMU_vCPU_life, plugin_state could be initialized in AccelCPUClass::cpu_instance_init (although this callback is called at CPUClass::instance_post_init which I haven't yet figured why). > + async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); > + } > +#endif > } > > static void cpu_common_finalize(Object *obj)
Philippe Mathieu-Daudé <philmd@linaro.org> writes: > On 30/5/24 21:42, Alex Bennée wrote: >> Originally I tried to move where vCPU thread initialisation to later >> in realize. However pulling that thread (sic) got gnarly really >> quickly. It turns out some steps of CPU realization need values that >> can only be determined from the running vCPU thread. > > FYI: > https://lore.kernel.org/qemu-devel/20240528145953.65398-6-philmd@linaro.org/ But this still has it in realize which would still race as the threads are started before we call the common realize functions. > >> However having moved enough out of the thread creation we can now >> queue work before the thread starts (at least for TCG guests) and >> avoid the race between vcpu_init and other vcpu states a plugin might >> subscribe to. >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> >> --- >> hw/core/cpu-common.c | 20 ++++++++++++-------- >> 1 file changed, 12 insertions(+), 8 deletions(-) >> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c >> index 6cfc01593a..bf1a7b8892 100644 >> --- a/hw/core/cpu-common.c >> +++ b/hw/core/cpu-common.c >> @@ -222,14 +222,6 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp) >> cpu_resume(cpu); >> } >> - /* Plugin initialization must wait until the cpu start >> executing code */ >> -#ifdef CONFIG_PLUGIN >> - if (tcg_enabled()) { >> - cpu->plugin_state = qemu_plugin_create_vcpu_state(); >> - async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); >> - } >> -#endif >> - >> /* NOTE: latest generic point where the cpu is fully realized */ >> } >> @@ -273,6 +265,18 @@ static void cpu_common_initfn(Object *obj) >> QTAILQ_INIT(&cpu->watchpoints); >> cpu_exec_initfn(cpu); >> + >> + /* >> + * Plugin initialization must wait until the cpu start executing >> + * code, but we must queue this work before the threads are >> + * created to ensure we don't race. >> + */ >> +#ifdef CONFIG_PLUGIN >> + if (tcg_enabled()) { >> + cpu->plugin_state = qemu_plugin_create_vcpu_state(); > > Per https://etherpad.opendev.org/p/QEMU_vCPU_life, plugin_state could > be initialized in AccelCPUClass::cpu_instance_init (although this > callback is called at CPUClass::instance_post_init which I haven't > yet figured why). Why are x86 and RiscV special in terms of calling this function? > >> + async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); >> + } >> +#endif >> } >> static void cpu_common_finalize(Object *obj)
Pierrick Bouvier <pierrick.bouvier@linaro.org> writes: > On 5/30/24 12:42, Alex Bennée wrote: >> Originally I tried to move where vCPU thread initialisation to later >> in realize. However pulling that thread (sic) got gnarly really >> quickly. It turns out some steps of CPU realization need values that >> can only be determined from the running vCPU thread. >> However having moved enough out of the thread creation we can now >> queue work before the thread starts (at least for TCG guests) and >> avoid the race between vcpu_init and other vcpu states a plugin might >> subscribe to. >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> >> --- >> hw/core/cpu-common.c | 20 ++++++++++++-------- >> 1 file changed, 12 insertions(+), 8 deletions(-) >> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c >> index 6cfc01593a..bf1a7b8892 100644 >> --- a/hw/core/cpu-common.c >> +++ b/hw/core/cpu-common.c >> @@ -222,14 +222,6 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp) >> cpu_resume(cpu); >> } >> - /* Plugin initialization must wait until the cpu start >> executing code */ >> -#ifdef CONFIG_PLUGIN >> - if (tcg_enabled()) { >> - cpu->plugin_state = qemu_plugin_create_vcpu_state(); >> - async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); >> - } >> -#endif >> - >> /* NOTE: latest generic point where the cpu is fully realized */ >> } >> @@ -273,6 +265,18 @@ static void cpu_common_initfn(Object *obj) >> QTAILQ_INIT(&cpu->watchpoints); >> cpu_exec_initfn(cpu); >> + >> + /* >> + * Plugin initialization must wait until the cpu start executing >> + * code, but we must queue this work before the threads are >> + * created to ensure we don't race. >> + */ >> +#ifdef CONFIG_PLUGIN >> + if (tcg_enabled()) { >> + cpu->plugin_state = qemu_plugin_create_vcpu_state(); >> + async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); >> + } >> +#endif >> } >> static void cpu_common_finalize(Object *obj) > > Could you check it works for all combination? > - user-mode > - system-mode tcg > - system-mode mttcg I was hand testing against the record replay tests in avocado (rr single thread tcg) and general system and user mode stuff. I haven't run a full pipeline yet, although I did apply your IPS patches ontop and run that: https://gitlab.com/stsquad/qemu/-/pipelines/1312869352 but all those failures are user-mode build failures due to the missing time stubs AFAICT. > > When I tried to move this code around, one of them didn't work correctly. > > Else, > Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
On 31/5/24 10:47, Alex Bennée wrote: > Philippe Mathieu-Daudé <philmd@linaro.org> writes: > >> On 30/5/24 21:42, Alex Bennée wrote: >>> Originally I tried to move where vCPU thread initialisation to later >>> in realize. However pulling that thread (sic) got gnarly really >>> quickly. It turns out some steps of CPU realization need values that >>> can only be determined from the running vCPU thread. >> >> FYI: >> https://lore.kernel.org/qemu-devel/20240528145953.65398-6-philmd@linaro.org/ > > But this still has it in realize which would still race as the threads > are started before we call the common realize functions. > >> >>> However having moved enough out of the thread creation we can now >>> queue work before the thread starts (at least for TCG guests) and >>> avoid the race between vcpu_init and other vcpu states a plugin might >>> subscribe to. >>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> >>> --- >>> hw/core/cpu-common.c | 20 ++++++++++++-------- >>> 1 file changed, 12 insertions(+), 8 deletions(-) Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c index 6cfc01593a..bf1a7b8892 100644 --- a/hw/core/cpu-common.c +++ b/hw/core/cpu-common.c @@ -222,14 +222,6 @@ static void cpu_common_realizefn(DeviceState *dev, Error **errp) cpu_resume(cpu); } - /* Plugin initialization must wait until the cpu start executing code */ -#ifdef CONFIG_PLUGIN - if (tcg_enabled()) { - cpu->plugin_state = qemu_plugin_create_vcpu_state(); - async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); - } -#endif - /* NOTE: latest generic point where the cpu is fully realized */ } @@ -273,6 +265,18 @@ static void cpu_common_initfn(Object *obj) QTAILQ_INIT(&cpu->watchpoints); cpu_exec_initfn(cpu); + + /* + * Plugin initialization must wait until the cpu start executing + * code, but we must queue this work before the threads are + * created to ensure we don't race. + */ +#ifdef CONFIG_PLUGIN + if (tcg_enabled()) { + cpu->plugin_state = qemu_plugin_create_vcpu_state(); + async_run_on_cpu(cpu, qemu_plugin_vcpu_init__async, RUN_ON_CPU_NULL); + } +#endif } static void cpu_common_finalize(Object *obj)
Originally I tried to move where vCPU thread initialisation to later in realize. However pulling that thread (sic) got gnarly really quickly. It turns out some steps of CPU realization need values that can only be determined from the running vCPU thread. However having moved enough out of the thread creation we can now queue work before the thread starts (at least for TCG guests) and avoid the race between vcpu_init and other vcpu states a plugin might subscribe to. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> --- hw/core/cpu-common.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-)