diff mbox series

[RFC] target/arm: de-duplicate our register XML definitions

Message ID 20220610154036.1253158-1-alex.bennee@linaro.org
State New
Headers show
Series [RFC] target/arm: de-duplicate our register XML definitions | expand

Commit Message

Alex Bennée June 10, 2022, 3:40 p.m. UTC
We generate the XML for each vCPU we create which for heavily threaded
linux-user runs can add up to a lot of memory. Unfortunately we can't
only do it once as we may have vCPUs with different capabilities in
different cores but we can at least free duplicate definitions if we
find them.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1070
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 target/arm/gdbstub.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

Comments

Peter Maydell June 10, 2022, 4:03 p.m. UTC | #1
On Fri, 10 Jun 2022 at 16:40, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> We generate the XML for each vCPU we create which for heavily threaded
> linux-user runs can add up to a lot of memory. Unfortunately we can't
> only do it once as we may have vCPUs with different capabilities in
> different cores but we can at least free duplicate definitions if we
> find them.

How big actually is the XML here? 400 bytes? 4K? 40K? 400K?

> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1070
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

Can we defer the work until/unless gdb actually asks for it?

I have some details-of-the-code comments below, but first:
does gdb even support having the XML for each CPU be different?
Our gdbstub.c seems to always ask for the XML only for the first
CPU in the "process". If gdb does support heterogenous CPU XML,
does it require that different XML descriptions have different
names? We always call them system-registers.xml and sve-registers.xml
regardless of whether different CPUs might have different contents
for those XML blobs...

> ---
>  target/arm/gdbstub.c | 38 ++++++++++++++++++++++++++++++++++++--
>  1 file changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
> index 2f806512d0..85ef6dc6db 100644
> --- a/target/arm/gdbstub.c
> +++ b/target/arm/gdbstub.c
> @@ -318,7 +318,24 @@ int arm_gen_dynamic_sysreg_xml(CPUState *cs, int base_reg)
>      g_string_append_printf(s, "<feature name=\"org.qemu.gdb.arm.sys.regs\">");
>      g_hash_table_foreach(cpu->cp_regs, arm_register_sysreg_for_xml, &param);
>      g_string_append_printf(s, "</feature>");
> -    cpu->dyn_sysreg_xml.desc = g_string_free(s, false);
> +
> +    /* De-duplicate system register descriptions */
> +    if (cs != first_cpu) {
> +        CPUState *other_cs;
> +        CPU_FOREACH(other_cs) {
> +            ARMCPU *other_arm = ARM_CPU(other_cs);
> +            if (g_strcmp0(other_arm->dyn_sysreg_xml.desc, s->str) == 0) {

If you check whether the dyn_sysreg_xml.num matches first, that will
probably be a much faster way of checking in almost all cases than
doing the strcmp on the however-long-it-is XML string: it seems unlikely
that in a heterogenous config the CPUs will manage to have exactly the
same number of registers.

If you have a setup with, say, 4 CPUs of type X and then 4 of type Y,
for every type Y CPU this loop will do the strcmp of Y's XML against
the type X XML for cpu 0, then again for 1, then again for 2, then for 3,
even though in theory we already know at that point that 0,1,2,3 all
have the same XML and so if it didn't match for cpu 0 it won't match
for 1,2,3...  But maybe the comparison against the number-of-registers
saves us from having to try to optimise that away.

> +                cpu->dyn_sysreg_xml.desc = other_arm->dyn_sysreg_xml.desc;
> +                g_string_free(s, true);
> +                s = NULL;
> +                break;
> +            }
> +        }
> +    }
> +
> +    if (s) {
> +        cpu->dyn_sysreg_xml.desc = g_string_free(s, false);
> +    }
>      return cpu->dyn_sysreg_xml.num;
>  }

-- PMM
Alex Bennée June 11, 2022, 7:05 a.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 10 Jun 2022 at 16:40, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> We generate the XML for each vCPU we create which for heavily threaded
>> linux-user runs can add up to a lot of memory. Unfortunately we can't
>> only do it once as we may have vCPUs with different capabilities in
>> different cores but we can at least free duplicate definitions if we
>> find them.
>
> How big actually is the XML here? 400 bytes? 4K? 40K? 400K?

For linux-user the sysreg XML is about 2k and the sve XML is about 8k
but it all adds up pretty quickly if you have lots of threads. BTW I
think the cpreg handling is also ranking quite highly in valgrinds
reporting but I haven't looked into it too deeply yet.

>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1070
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>
> Can we defer the work until/unless gdb actually asks for it?

Yes. In fact really I'd like to remove all the XML generation details
from the various front-ends and move it to a common register API which
could do exactly that. We could then just share a uniform view of the
registers and perhaps extend it to understand per-core register sets if
we ever get to that point.

I also wonder if we can drop the individual XML segments and static GDB
register mappings in favour of one complete XML for whatever registers
QEMU wants to expose at the point? CC'ing some of our GDB colleges for
their input.

>
> I have some details-of-the-code comments below, but first:
> does gdb even support having the XML for each CPU be different?
> Our gdbstub.c seems to always ask for the XML only for the first
> CPU in the "process". If gdb does support heterogenous CPU XML,
> does it require that different XML descriptions have different
> names? We always call them system-registers.xml and sve-registers.xml
> regardless of whether different CPUs might have different contents
> for those XML blobs...
>
>> ---
>>  target/arm/gdbstub.c | 38 ++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 36 insertions(+), 2 deletions(-)
>>
>> diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
>> index 2f806512d0..85ef6dc6db 100644
>> --- a/target/arm/gdbstub.c
>> +++ b/target/arm/gdbstub.c
>> @@ -318,7 +318,24 @@ int arm_gen_dynamic_sysreg_xml(CPUState *cs, int base_reg)
>>      g_string_append_printf(s, "<feature name=\"org.qemu.gdb.arm.sys.regs\">");
>>      g_hash_table_foreach(cpu->cp_regs, arm_register_sysreg_for_xml, &param);
>>      g_string_append_printf(s, "</feature>");
>> -    cpu->dyn_sysreg_xml.desc = g_string_free(s, false);
>> +
>> +    /* De-duplicate system register descriptions */
>> +    if (cs != first_cpu) {
>> +        CPUState *other_cs;
>> +        CPU_FOREACH(other_cs) {
>> +            ARMCPU *other_arm = ARM_CPU(other_cs);
>> +            if (g_strcmp0(other_arm->dyn_sysreg_xml.desc, s->str) == 0) {
>
> If you check whether the dyn_sysreg_xml.num matches first, that will
> probably be a much faster way of checking in almost all cases than
> doing the strcmp on the however-long-it-is XML string: it seems unlikely
> that in a heterogenous config the CPUs will manage to have exactly the
> same number of registers.
>
> If you have a setup with, say, 4 CPUs of type X and then 4 of type Y,
> for every type Y CPU this loop will do the strcmp of Y's XML against
> the type X XML for cpu 0, then again for 1, then again for 2, then for 3,
> even though in theory we already know at that point that 0,1,2,3 all
> have the same XML and so if it didn't match for cpu 0 it won't match
> for 1,2,3...  But maybe the comparison against the number-of-registers
> saves us from having to try to optimise that away.
>
>> +                cpu->dyn_sysreg_xml.desc = other_arm->dyn_sysreg_xml.desc;
>> +                g_string_free(s, true);
>> +                s = NULL;
>> +                break;
>> +            }
>> +        }
>> +    }
>> +
>> +    if (s) {
>> +        cpu->dyn_sysreg_xml.desc = g_string_free(s, false);
>> +    }
>>      return cpu->dyn_sysreg_xml.num;
>>  }
>
> -- PMM

I'm travelling today so I'll see if I can use the time to sketch up a
common API and make this fugly approach obsolete.
diff mbox series

Patch

diff --git a/target/arm/gdbstub.c b/target/arm/gdbstub.c
index 2f806512d0..85ef6dc6db 100644
--- a/target/arm/gdbstub.c
+++ b/target/arm/gdbstub.c
@@ -318,7 +318,24 @@  int arm_gen_dynamic_sysreg_xml(CPUState *cs, int base_reg)
     g_string_append_printf(s, "<feature name=\"org.qemu.gdb.arm.sys.regs\">");
     g_hash_table_foreach(cpu->cp_regs, arm_register_sysreg_for_xml, &param);
     g_string_append_printf(s, "</feature>");
-    cpu->dyn_sysreg_xml.desc = g_string_free(s, false);
+
+    /* De-duplicate system register descriptions */
+    if (cs != first_cpu) {
+        CPUState *other_cs;
+        CPU_FOREACH(other_cs) {
+            ARMCPU *other_arm = ARM_CPU(other_cs);
+            if (g_strcmp0(other_arm->dyn_sysreg_xml.desc, s->str) == 0) {
+                cpu->dyn_sysreg_xml.desc = other_arm->dyn_sysreg_xml.desc;
+                g_string_free(s, true);
+                s = NULL;
+                break;
+            }
+        }
+    }
+
+    if (s) {
+        cpu->dyn_sysreg_xml.desc = g_string_free(s, false);
+    }
     return cpu->dyn_sysreg_xml.num;
 }
 
@@ -436,7 +453,24 @@  int arm_gen_dynamic_svereg_xml(CPUState *cs, int base_reg)
                            base_reg++);
     info->num += 2;
     g_string_append_printf(s, "</feature>");
-    cpu->dyn_svereg_xml.desc = g_string_free(s, false);
+
+    /* De-duplicate SVE register descriptions */
+    if (cs != first_cpu) {
+        CPUState *other_cs;
+        CPU_FOREACH(other_cs) {
+            ARMCPU *other_arm = ARM_CPU(other_cs);
+            if (g_strcmp0(other_arm->dyn_svereg_xml.desc, s->str) == 0) {
+                cpu->dyn_sysreg_xml.desc = other_arm->dyn_svereg_xml.desc;
+                g_string_free(s, true);
+                s = NULL;
+                break;
+            }
+        }
+    }
+
+    if (s) {
+        cpu->dyn_svereg_xml.desc = g_string_free(s, false);
+    }
 
     return cpu->dyn_svereg_xml.num;
 }