diff mbox series

[v3,05/17] plugins: scoreboard API

Message ID 20240206092423.3005995-6-pierrick.bouvier@linaro.org
State New
Headers show
Series TCG Plugin inline operation enhancement | expand

Commit Message

Pierrick Bouvier Feb. 6, 2024, 9:24 a.m. UTC
We introduce a cpu local storage, automatically managed (and extended)
by QEMU itself. Plugin allocate a scoreboard, and don't have to deal
with how many cpus are launched.

This API will be used by new inline functions but callbacks can benefit
from this as well. This way, they can operate without a global lock for
simple operations.

At any point during execution, any scoreboard will be dimensioned with
at least qemu_plugin_num_vcpus entries.

New functions:
- qemu_plugin_scoreboard_find
- qemu_plugin_scoreboard_free
- qemu_plugin_scoreboard_new

In more, we define a qemu_plugin_u64, which is a simple struct holding
a pointer to a scoreboard, and a given offset.
This allows to have a scoreboard containing structs, without having to
bring offset for all operations on a specific field.

Since most of the plugins are simply collecting a sum of per-cpu values,
qemu_plugin_u64 directly support this operation as well.

New functions:
- qemu_plugin_u64_add
- qemu_plugin_u64_get
- qemu_plugin_u64_set
- qemu_plugin_u64_sum
New macros:
- qemu_plugin_scoreboard_u64
- qemu_plugin_scoreboard_u64_in_struct

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 include/qemu/plugin.h        |  5 +++
 include/qemu/qemu-plugin.h   | 83 ++++++++++++++++++++++++++++++++++++
 plugins/plugin.h             |  7 +++
 plugins/api.c                | 53 +++++++++++++++++++++++
 plugins/core.c               | 67 +++++++++++++++++++++++++++++
 plugins/qemu-plugins.symbols |  7 +++
 6 files changed, 222 insertions(+)

Comments

Richard Henderson Feb. 7, 2024, 3:21 a.m. UTC | #1
On 2/6/24 19:24, Pierrick Bouvier wrote:
> We introduce a cpu local storage, automatically managed (and extended)
> by QEMU itself. Plugin allocate a scoreboard, and don't have to deal
> with how many cpus are launched.
> 
> This API will be used by new inline functions but callbacks can benefit
> from this as well. This way, they can operate without a global lock for
> simple operations.
> 
> At any point during execution, any scoreboard will be dimensioned with
> at least qemu_plugin_num_vcpus entries.
> 
> New functions:
> - qemu_plugin_scoreboard_find
> - qemu_plugin_scoreboard_free
> - qemu_plugin_scoreboard_new
> 
> In more, we define a qemu_plugin_u64, which is a simple struct holding
> a pointer to a scoreboard, and a given offset.
> This allows to have a scoreboard containing structs, without having to
> bring offset for all operations on a specific field.
> 
> Since most of the plugins are simply collecting a sum of per-cpu values,
> qemu_plugin_u64 directly support this operation as well.
> 
> New functions:
> - qemu_plugin_u64_add
> - qemu_plugin_u64_get
> - qemu_plugin_u64_set
> - qemu_plugin_u64_sum
> New macros:
> - qemu_plugin_scoreboard_u64
> - qemu_plugin_scoreboard_u64_in_struct

I think the u64 stuff should be a second patch built upon the basic scoreboard support.

> +/* A scoreboard is an array of values, indexed by vcpu_index */
> +struct qemu_plugin_scoreboard {
> +    GArray *data;
> +};

Unnecessary?  Generates an extra pointer dereference for no apparent benefit. 
Alternately, might be useful for other data structure changes...

> +/**
> + * typedef qemu_plugin_u64 - uint64_t member of an entry in a scoreboard
> + *
> + * This field allows to access a specific uint64_t member in one given entry,
> + * located at a specified offset. Inline operations expect this as entry.
> + */
> +typedef struct {
> +    struct qemu_plugin_scoreboard *score;

Embed the struct instead?

> @@ -31,6 +31,9 @@ struct qemu_plugin_state {
>        * but with the HT we avoid adding a field to CPUState.
>        */
>       GHashTable *cpu_ht;
> +    /* Scoreboards, indexed by their addresses. */
> +    GHashTable *scoreboards;

Why a hash table?  All you want is to be able to iterate through all, and add/remove 
easily.  Seems like QLIST from <qemu/queue.h> would be better, and the QLIST_ENTRY member 
would make struct qemu_plugin_scoreboard useful.


r~
Pierrick Bouvier Feb. 7, 2024, 5:59 a.m. UTC | #2
On 2/7/24 07:21, Richard Henderson wrote:
> On 2/6/24 19:24, Pierrick Bouvier wrote:
>> We introduce a cpu local storage, automatically managed (and extended)
>> by QEMU itself. Plugin allocate a scoreboard, and don't have to deal
>> with how many cpus are launched.
>>
>> This API will be used by new inline functions but callbacks can benefit
>> from this as well. This way, they can operate without a global lock for
>> simple operations.
>>
>> At any point during execution, any scoreboard will be dimensioned with
>> at least qemu_plugin_num_vcpus entries.
>>
>> New functions:
>> - qemu_plugin_scoreboard_find
>> - qemu_plugin_scoreboard_free
>> - qemu_plugin_scoreboard_new
>>
>> In more, we define a qemu_plugin_u64, which is a simple struct holding
>> a pointer to a scoreboard, and a given offset.
>> This allows to have a scoreboard containing structs, without having to
>> bring offset for all operations on a specific field.
>>
>> Since most of the plugins are simply collecting a sum of per-cpu values,
>> qemu_plugin_u64 directly support this operation as well.
>>
>> New functions:
>> - qemu_plugin_u64_add
>> - qemu_plugin_u64_get
>> - qemu_plugin_u64_set
>> - qemu_plugin_u64_sum
>> New macros:
>> - qemu_plugin_scoreboard_u64
>> - qemu_plugin_scoreboard_u64_in_struct
> 
> I think the u64 stuff should be a second patch built upon the basic scoreboard support.
> 

You're right, should be easier to review.

>> +/* A scoreboard is an array of values, indexed by vcpu_index */
>> +struct qemu_plugin_scoreboard {
>> +    GArray *data;
>> +};
> 
> Unnecessary?  Generates an extra pointer dereference for no apparent benefit.
> Alternately, might be useful for other data structure changes...
> 

Thought to change it to a typedef after removing other members. Will do 
if you noticed this too.

>> +/**
>> + * typedef qemu_plugin_u64 - uint64_t member of an entry in a scoreboard
>> + *
>> + * This field allows to access a specific uint64_t member in one given entry,
>> + * located at a specified offset. Inline operations expect this as entry.
>> + */
>> +typedef struct {
>> +    struct qemu_plugin_scoreboard *score;
> 
> Embed the struct instead?
> 

Several qemu_plugin_u64 can point to the same scoreboard, so it has to 
be a pointer. It saves a scoreboard pointer + offset for a given entry.

>> @@ -31,6 +31,9 @@ struct qemu_plugin_state {
>>         * but with the HT we avoid adding a field to CPUState.
>>         */
>>        GHashTable *cpu_ht;
>> +    /* Scoreboards, indexed by their addresses. */
>> +    GHashTable *scoreboards;
> 
> Why a hash table?  All you want is to be able to iterate through all, and add/remove
> easily.  Seems like QLIST from <qemu/queue.h> would be better, and the QLIST_ENTRY member
> would make struct qemu_plugin_scoreboard useful.
> 

Thought that having O(1) removal was a nice property, compared to a 
linked list. I can switch to a QLIST if you still think it's better.

What do you mean by "make struct qemu_plugin_scoreboard useful"?

> 
> r~
Richard Henderson Feb. 11, 2024, 12:41 a.m. UTC | #3
On 2/6/24 19:59, Pierrick Bouvier wrote:
>> Why a hash table?  All you want is to be able to iterate through all, and add/remove
>> easily.  Seems like QLIST from <qemu/queue.h> would be better, and the QLIST_ENTRY member
>> would make struct qemu_plugin_scoreboard useful.
>>
> 
> Thought that having O(1) removal was a nice property, compared to a linked list. I can 
> switch to a QLIST if you still think it's better.

QLIST is double-linked, so it's still O(1).


> What do you mean by "make struct qemu_plugin_scoreboard useful"?

As opposed to simply a typedef of GArray.


r~
Pierrick Bouvier Feb. 11, 2024, 2:26 p.m. UTC | #4
On 2/11/24 04:41, Richard Henderson wrote:
> On 2/6/24 19:59, Pierrick Bouvier wrote:
>>> Why a hash table?  All you want is to be able to iterate through all, and add/remove
>>> easily.  Seems like QLIST from <qemu/queue.h> would be better, and the QLIST_ENTRY member
>>> would make struct qemu_plugin_scoreboard useful.
>>>
>>
>> Thought that having O(1) removal was a nice property, compared to a linked list. I can
>> switch to a QLIST if you still think it's better.
> 
> QLIST is double-linked, so it's still O(1).
> 
>

Is it an "intrusive" linked list (where next and prev are part of the 
struct entry itself)?

>> What do you mean by "make struct qemu_plugin_scoreboard useful"?
> 
> As opposed to simply a typedef of GArray.
> 
> 
> r~
>
Richard Henderson Feb. 11, 2024, 7:13 p.m. UTC | #5
On 2/11/24 04:26, Pierrick Bouvier wrote:
> On 2/11/24 04:41, Richard Henderson wrote:
>> On 2/6/24 19:59, Pierrick Bouvier wrote:
>>>> Why a hash table?  All you want is to be able to iterate through all, and add/remove
>>>> easily.  Seems like QLIST from <qemu/queue.h> would be better, and the QLIST_ENTRY member
>>>> would make struct qemu_plugin_scoreboard useful.
>>>>
>>>
>>> Thought that having O(1) removal was a nice property, compared to a linked list. I can
>>> switch to a QLIST if you still think it's better.
>>
>> QLIST is double-linked, so it's still O(1).
>>
>>
> 
> Is it an "intrusive" linked list (where next and prev are part of the struct entry itself)?

Yes.


r~
Pierrick Bouvier Feb. 12, 2024, 6:24 a.m. UTC | #6
On 2/11/24 23:13, Richard Henderson wrote:
> On 2/11/24 04:26, Pierrick Bouvier wrote:
>> On 2/11/24 04:41, Richard Henderson wrote:
>>> On 2/6/24 19:59, Pierrick Bouvier wrote:
>>>>> Why a hash table?  All you want is to be able to iterate through all, and add/remove
>>>>> easily.  Seems like QLIST from <qemu/queue.h> would be better, and the QLIST_ENTRY member
>>>>> would make struct qemu_plugin_scoreboard useful.
>>>>>
>>>>
>>>> Thought that having O(1) removal was a nice property, compared to a linked list. I can
>>>> switch to a QLIST if you still think it's better.
>>>
>>> QLIST is double-linked, so it's still O(1).
>>>
>>>
>>
>> Is it an "intrusive" linked list (where next and prev are part of the struct entry itself)?
> 
> Yes.
>

Sounds good then, I'll switch to QLIST.

> 
> r~
diff mbox series

Patch

diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index b0c5ac68293..af4aeef4d78 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -112,6 +112,11 @@  struct qemu_plugin_insn {
     bool mem_only;
 };
 
+/* A scoreboard is an array of values, indexed by vcpu_index */
+struct qemu_plugin_scoreboard {
+    GArray *data;
+};
+
 /*
  * qemu_plugin_insn allocate and cleanup functions. We don't expect to
  * cleanup many of these structures. They are reused for each fresh
diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index e94ae4d2abb..a48586ef0c1 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -222,6 +222,19 @@  void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
 struct qemu_plugin_tb;
 /** struct qemu_plugin_insn - Opaque handle for a translated instruction */
 struct qemu_plugin_insn;
+/** struct qemu_plugin_scoreboard - Opaque handle for a scoreboard */
+struct qemu_plugin_scoreboard;
+
+/**
+ * typedef qemu_plugin_u64 - uint64_t member of an entry in a scoreboard
+ *
+ * This field allows to access a specific uint64_t member in one given entry,
+ * located at a specified offset. Inline operations expect this as entry.
+ */
+typedef struct {
+    struct qemu_plugin_scoreboard *score;
+    size_t offset;
+} qemu_plugin_u64;
 
 /**
  * enum qemu_plugin_cb_flags - type of callback
@@ -753,5 +766,75 @@  int qemu_plugin_read_register(unsigned int vcpu,
                               struct qemu_plugin_register *handle,
                               GByteArray *buf);
 
+/**
+ * qemu_plugin_scoreboard_new() - alloc a new scoreboard
+ *
+ * @element_size: size (in bytes) for one entry
+ *
+ * Returns a pointer to a new scoreboard. It must be freed using
+ * qemu_plugin_scoreboard_free.
+ */
+QEMU_PLUGIN_API
+struct qemu_plugin_scoreboard *qemu_plugin_scoreboard_new(size_t element_size);
+
+/**
+ * qemu_plugin_scoreboard_free() - free a scoreboard
+ * @score: scoreboard to free
+ */
+QEMU_PLUGIN_API
+void qemu_plugin_scoreboard_free(struct qemu_plugin_scoreboard *score);
+
+/**
+ * qemu_plugin_scoreboard_find() - get pointer to an entry of a scoreboard
+ * @score: scoreboard to query
+ * @vcpu_index: entry index
+ *
+ * Returns address of entry of a scoreboard matching a given vcpu_index. This
+ * address can be modified later if scoreboard is resized.
+ */
+QEMU_PLUGIN_API
+void *qemu_plugin_scoreboard_find(struct qemu_plugin_scoreboard *score,
+                                  unsigned int vcpu_index);
+
+/* Macros to define a qemu_plugin_u64 */
+#define qemu_plugin_scoreboard_u64(score) \
+    (qemu_plugin_u64) {score, 0}
+#define qemu_plugin_scoreboard_u64_in_struct(score, type, member) \
+    (qemu_plugin_u64) {score, offsetof(type, member)}
+
+/**
+ * qemu_plugin_u64_add() - add a value to a qemu_plugin_u64 for a given vcpu
+ * @entry: entry to query
+ * @vcpu_index: entry index
+ * @added: value to add
+ */
+QEMU_PLUGIN_API
+void qemu_plugin_u64_add(qemu_plugin_u64 entry, unsigned int vcpu_index,
+                         uint64_t added);
+
+/**
+ * qemu_plugin_u64_get() - get value of a qemu_plugin_u64 for a given vcpu
+ * @entry: entry to query
+ * @vcpu_index: entry index
+ */
+QEMU_PLUGIN_API
+uint64_t qemu_plugin_u64_get(qemu_plugin_u64 entry, unsigned int vcpu_index);
+
+/**
+ * qemu_plugin_u64_set() - set value of a qemu_plugin_u64 for a given vcpu
+ * @entry: entry to query
+ * @vcpu_index: entry index
+ * @val: new value
+ */
+QEMU_PLUGIN_API
+void qemu_plugin_u64_set(qemu_plugin_u64 entry, unsigned int vcpu_index,
+                         uint64_t val);
+
+/**
+ * qemu_plugin_u64_sum() - return sum of all vcpu entries in a scoreboard
+ * @entry: entry to sum
+ */
+QEMU_PLUGIN_API
+uint64_t qemu_plugin_u64_sum(qemu_plugin_u64 entry);
 
 #endif /* QEMU_QEMU_PLUGIN_H */
diff --git a/plugins/plugin.h b/plugins/plugin.h
index 00b3509f708..fd93a372803 100644
--- a/plugins/plugin.h
+++ b/plugins/plugin.h
@@ -31,6 +31,9 @@  struct qemu_plugin_state {
      * but with the HT we avoid adding a field to CPUState.
      */
     GHashTable *cpu_ht;
+    /* Scoreboards, indexed by their addresses. */
+    GHashTable *scoreboards;
+    size_t scoreboard_alloc_size;
     DECLARE_BITMAP(mask, QEMU_PLUGIN_EV_MAX);
     /*
      * @lock protects the struct as well as ctx->uninstalling.
@@ -101,4 +104,8 @@  void exec_inline_op(struct qemu_plugin_dyn_cb *cb);
 
 int plugin_num_vcpus(void);
 
+struct qemu_plugin_scoreboard *plugin_scoreboard_new(size_t element_size);
+
+void plugin_scoreboard_free(struct qemu_plugin_scoreboard *score);
+
 #endif /* PLUGIN_H */
diff --git a/plugins/api.c b/plugins/api.c
index dbfc5e83729..15edad6769b 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -521,3 +521,56 @@  static void __attribute__((__constructor__)) qemu_api_init(void)
     qemu_mutex_init(&reg_handle_lock);
 
 }
+
+struct qemu_plugin_scoreboard *qemu_plugin_scoreboard_new(size_t element_size)
+{
+    return plugin_scoreboard_new(element_size);
+}
+
+void qemu_plugin_scoreboard_free(struct qemu_plugin_scoreboard *score)
+{
+    plugin_scoreboard_free(score);
+}
+
+void *qemu_plugin_scoreboard_find(struct qemu_plugin_scoreboard *score,
+                                  unsigned int vcpu_index)
+{
+    g_assert(vcpu_index < qemu_plugin_num_vcpus());
+    /* we can't use g_array_index since entry size is not statically known */
+    char *base_ptr = score->data->data;
+    return base_ptr + vcpu_index * g_array_get_element_size(score->data);
+}
+
+static uint64_t *plugin_u64_address(qemu_plugin_u64 entry,
+                                    unsigned int vcpu_index)
+{
+    char *ptr = qemu_plugin_scoreboard_find(entry.score, vcpu_index);
+    return (uint64_t *)(ptr + entry.offset);
+}
+
+void qemu_plugin_u64_add(qemu_plugin_u64 entry, unsigned int vcpu_index,
+                         uint64_t added)
+{
+    *plugin_u64_address(entry, vcpu_index) += added;
+}
+
+uint64_t qemu_plugin_u64_get(qemu_plugin_u64 entry,
+                             unsigned int vcpu_index)
+{
+    return *plugin_u64_address(entry, vcpu_index);
+}
+
+void qemu_plugin_u64_set(qemu_plugin_u64 entry, unsigned int vcpu_index,
+                         uint64_t val)
+{
+    *plugin_u64_address(entry, vcpu_index) = val;
+}
+
+uint64_t qemu_plugin_u64_sum(qemu_plugin_u64 entry)
+{
+    uint64_t total = 0;
+    for (int i = 0; i < qemu_plugin_num_vcpus(); ++i) {
+        total += qemu_plugin_u64_get(entry, i);
+    }
+    return total;
+}
diff --git a/plugins/core.c b/plugins/core.c
index 609d9d5c184..fd8604bcb79 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -209,6 +209,44 @@  plugin_register_cb_udata(qemu_plugin_id_t id, enum qemu_plugin_event ev,
     do_plugin_register_cb(id, ev, func, udata);
 }
 
+static void plugin_resize_one_scoreboard(gpointer key,
+                                         gpointer value,
+                                         gpointer user_data)
+{
+    struct qemu_plugin_scoreboard *score =
+        (struct qemu_plugin_scoreboard *) value;
+    size_t new_alloc_size = (size_t) user_data;
+    g_array_set_size(score->data, new_alloc_size);
+}
+
+static void plugin_grow_scoreboards__locked(CPUState *cpu)
+{
+    if (cpu->cpu_index < plugin.scoreboard_alloc_size) {
+        return;
+    }
+
+    bool need_realloc = FALSE;
+    while (cpu->cpu_index >= plugin.scoreboard_alloc_size) {
+        plugin.scoreboard_alloc_size *= 2;
+        need_realloc = TRUE;
+    }
+
+
+    if (!need_realloc || g_hash_table_size(plugin.scoreboards) == 0) {
+        /* nothing to do, we just updated sizes for future scoreboards */
+        return;
+    }
+
+    /* cpus must be stopped, as tb might still use an existing scoreboard. */
+    start_exclusive();
+    g_hash_table_foreach(plugin.scoreboards,
+                         &plugin_resize_one_scoreboard,
+                         GSIZE_TO_POINTER(plugin.scoreboard_alloc_size));
+    /* force all tb to be flushed, as scoreboard pointers were changed. */
+    tb_flush(cpu);
+    end_exclusive();
+}
+
 void qemu_plugin_vcpu_init_hook(CPUState *cpu)
 {
     bool success;
@@ -219,6 +257,7 @@  void qemu_plugin_vcpu_init_hook(CPUState *cpu)
     success = g_hash_table_insert(plugin.cpu_ht, &cpu->cpu_index,
                                   &cpu->cpu_index);
     g_assert(success);
+    plugin_grow_scoreboards__locked(cpu);
     qemu_rec_mutex_unlock(&plugin.lock);
 
     plugin_vcpu_cb__simple(cpu, QEMU_PLUGIN_EV_VCPU_INIT);
@@ -572,6 +611,8 @@  static void __attribute__((__constructor__)) plugin_init(void)
     qemu_rec_mutex_init(&plugin.lock);
     plugin.id_ht = g_hash_table_new(g_int64_hash, g_int64_equal);
     plugin.cpu_ht = g_hash_table_new(g_int_hash, g_int_equal);
+    plugin.scoreboards = g_hash_table_new(g_int64_hash, g_int64_equal);
+    plugin.scoreboard_alloc_size = 16; /* avoid frequent reallocation */
     QTAILQ_INIT(&plugin.ctxs);
     qht_init(&plugin.dyn_cb_arr_ht, plugin_dyn_cb_arr_cmp, 16,
              QHT_MODE_AUTO_RESIZE);
@@ -582,3 +623,29 @@  int plugin_num_vcpus(void)
 {
     return plugin.num_vcpus;
 }
+
+struct qemu_plugin_scoreboard *plugin_scoreboard_new(size_t element_size)
+{
+    struct qemu_plugin_scoreboard *score =
+        g_malloc0(sizeof(struct qemu_plugin_scoreboard));
+    score->data = g_array_new(FALSE, TRUE, element_size);
+    g_array_set_size(score->data, plugin.scoreboard_alloc_size);
+
+    qemu_rec_mutex_lock(&plugin.lock);
+    bool inserted = g_hash_table_insert(plugin.scoreboards, score, score);
+    g_assert(inserted);
+    qemu_rec_mutex_unlock(&plugin.lock);
+
+    return score;
+}
+
+void plugin_scoreboard_free(struct qemu_plugin_scoreboard *score)
+{
+    qemu_rec_mutex_lock(&plugin.lock);
+    bool removed = g_hash_table_remove(plugin.scoreboards, score);
+    g_assert(removed);
+    qemu_rec_mutex_unlock(&plugin.lock);
+
+    g_array_free(score->data, TRUE);
+    g_free(score);
+}
diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols
index 27fe97239be..6204453d0fd 100644
--- a/plugins/qemu-plugins.symbols
+++ b/plugins/qemu-plugins.symbols
@@ -37,10 +37,17 @@ 
   qemu_plugin_register_vcpu_tb_exec_inline;
   qemu_plugin_register_vcpu_tb_trans_cb;
   qemu_plugin_reset;
+  qemu_plugin_scoreboard_free;
+  qemu_plugin_scoreboard_find;
+  qemu_plugin_scoreboard_new;
   qemu_plugin_start_code;
   qemu_plugin_tb_get_insn;
   qemu_plugin_tb_n_insns;
   qemu_plugin_tb_vaddr;
+  qemu_plugin_u64_add;
+  qemu_plugin_u64_get;
+  qemu_plugin_u64_set;
+  qemu_plugin_u64_sum;
   qemu_plugin_uninstall;
   qemu_plugin_vcpu_for_each;
 };