diff mbox series

[for-8.1,v10,10/14] util/selfmap: Rewrite using qemu/interval-tree.h

Message ID 20230807163705.9848-11-richard.henderson@linaro.org
State Superseded
Headers show
Series linux-user: image mapping fixes | expand

Commit Message

Richard Henderson Aug. 7, 2023, 4:37 p.m. UTC
We will want to be able to search the set of mappings.
For this patch, the two users iterate the tree in order.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/qemu/selfmap.h |  20 ++++----
 linux-user/elfload.c   |  14 +++--
 linux-user/syscall.c   |  15 +++---
 util/selfmap.c         | 114 +++++++++++++++++++++++++----------------
 4 files changed, 96 insertions(+), 67 deletions(-)

Comments

Richard Henderson Aug. 7, 2023, 6:17 p.m. UTC | #1
On 8/7/23 09:37, Richard Henderson wrote:
> We will want to be able to search the set of mappings.
> For this patch, the two users iterate the tree in order.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   include/qemu/selfmap.h |  20 ++++----
>   linux-user/elfload.c   |  14 +++--
>   linux-user/syscall.c   |  15 +++---
>   util/selfmap.c         | 114 +++++++++++++++++++++++++----------------
>   4 files changed, 96 insertions(+), 67 deletions(-)

I should note that, for 8.2, this will enable a rewrite of open_self_maps_1 so that it 
does not require page-by-page checking of page_get_flags.

My idea is that open_self_maps_1 would use walk_memory_regions to see all guest memory 
regions.  The per-region callback would cross-check with the host-region interval tree to 
find the dev+inode+path.

Cc Ilya and Helge, since there are two outstanding changes to open_self_maps.


r~
Michael Tokarev Aug. 8, 2023, 6:15 a.m. UTC | #2
07.08.2023 19:37, Richard Henderson wrote:
> We will want to be able to search the set of mappings.
> For this patch, the two users iterate the tree in order.

> diff --git a/include/qemu/selfmap.h b/include/qemu/selfmap.h

>   /**
>    * read_self_maps:
>    *
>    * Read /proc/self/maps and return a list of MapInfo structures.

Nitpick: this comment still says "a list", while in all other places
it has been changed to "a tree".

>    */
> -GSList *read_self_maps(void);
> +IntervalTreeRoot *read_self_maps(void);

/mjt
Richard Henderson Aug. 9, 2023, 3:23 p.m. UTC | #3
On 8/9/23 08:11, Helge Deller wrote:
> Fix a crash in qemu-user when running
> 
>      cat /proc/self/maps
> 
> in a chroot, where /proc isn't mounted.
> 
> The problem was introduced by commit 3ce3dd8ca965 ("util/selfmap:
> Rewrite using qemu/interval-tree.h") where in open_self_maps_1() the
> function read_self_maps() is called and which returns NULL if it can't
> read the hosts /proc/self/maps file. Afterwards that NULL is fed into
> interval_tree_iter_first() which doesn't check if the root node is NULL.
> 
> Fix it by adding a check if root is NULL and return NULL in that case.
> 
> Signed-off-by: Helge Deller <deller@gmx.de>
> Fixes: 3ce3dd8ca965 ("util/selfmap: Rewrite using qemu/interval-tree.h")
> 
> diff --git a/util/interval-tree.c b/util/interval-tree.c
> index f2866aa7d3..53465182e6 100644
> --- a/util/interval-tree.c
> +++ b/util/interval-tree.c
> @@ -797,7 +797,7 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
>   {
>       IntervalTreeNode *node, *leftmost;
> 
> -    if (!root->rb_root.rb_node) {
> +    if (!root || !root->rb_root.rb_node) {


I guess this is good enough for 8.1.  Before the conversion to interval-tree we would also 
emit nothing.

I've already done a rewrite for 8.2, and I noticed this problem.  There I emit what 
mapping information that I have, which is everything except for the device+path data.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
Helge Deller Aug. 9, 2023, 3:53 p.m. UTC | #4
On 8/9/23 17:23, Richard Henderson wrote:
> On 8/9/23 08:11, Helge Deller wrote:
>> Fix a crash in qemu-user when running
>>
>>      cat /proc/self/maps
>>
>> in a chroot, where /proc isn't mounted.
>>
>> The problem was introduced by commit 3ce3dd8ca965 ("util/selfmap:
>> Rewrite using qemu/interval-tree.h") where in open_self_maps_1() the
>> function read_self_maps() is called and which returns NULL if it can't
>> read the hosts /proc/self/maps file. Afterwards that NULL is fed into
>> interval_tree_iter_first() which doesn't check if the root node is NULL.
>>
>> Fix it by adding a check if root is NULL and return NULL in that case.
>>
>> Signed-off-by: Helge Deller <deller@gmx.de>
>> Fixes: 3ce3dd8ca965 ("util/selfmap: Rewrite using qemu/interval-tree.h")
>>
>> diff --git a/util/interval-tree.c b/util/interval-tree.c
>> index f2866aa7d3..53465182e6 100644
>> --- a/util/interval-tree.c
>> +++ b/util/interval-tree.c
>> @@ -797,7 +797,7 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
>>   {
>>       IntervalTreeNode *node, *leftmost;
>>
>> -    if (!root->rb_root.rb_node) {
>> +    if (!root || !root->rb_root.rb_node) {
>
>
> I guess this is good enough for 8.1.  Before the conversion to interval-tree we would also emit nothing.

Yes and yes.

> I've already done a rewrite for 8.2, and I noticed this problem.
> There I emit what mapping information that I have, which is
> everything except for the device+path data.

nice.

> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Shall I send a pull request?
If so, is it OK that I include this patch in the pull-request as well?
   linux-user: Fix openat() emulation to correctly detect accesses to /proc
   https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg00165.html
which already has been R-b: Daniel P. Berrangé

Helge
Richard Henderson Aug. 9, 2023, 4:33 p.m. UTC | #5
On 8/9/23 08:53, Helge Deller wrote:
> On 8/9/23 17:23, Richard Henderson wrote:
>> On 8/9/23 08:11, Helge Deller wrote:
>>> Fix a crash in qemu-user when running
>>>
>>>      cat /proc/self/maps
>>>
>>> in a chroot, where /proc isn't mounted.
>>>
>>> The problem was introduced by commit 3ce3dd8ca965 ("util/selfmap:
>>> Rewrite using qemu/interval-tree.h") where in open_self_maps_1() the
>>> function read_self_maps() is called and which returns NULL if it can't
>>> read the hosts /proc/self/maps file. Afterwards that NULL is fed into
>>> interval_tree_iter_first() which doesn't check if the root node is NULL.
>>>
>>> Fix it by adding a check if root is NULL and return NULL in that case.
>>>
>>> Signed-off-by: Helge Deller <deller@gmx.de>
>>> Fixes: 3ce3dd8ca965 ("util/selfmap: Rewrite using qemu/interval-tree.h")
>>>
>>> diff --git a/util/interval-tree.c b/util/interval-tree.c
>>> index f2866aa7d3..53465182e6 100644
>>> --- a/util/interval-tree.c
>>> +++ b/util/interval-tree.c
>>> @@ -797,7 +797,7 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root,
>>>   {
>>>       IntervalTreeNode *node, *leftmost;
>>>
>>> -    if (!root->rb_root.rb_node) {
>>> +    if (!root || !root->rb_root.rb_node) {
>>
>>
>> I guess this is good enough for 8.1.  Before the conversion to interval-tree we would 
>> also emit nothing.
> 
> Yes and yes.
> 
>> I've already done a rewrite for 8.2, and I noticed this problem.
>> There I emit what mapping information that I have, which is
>> everything except for the device+path data.
> 
> nice.
> 
>> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> Shall I send a pull request?
> If so, is it OK that I include this patch in the pull-request as well?
>    linux-user: Fix openat() emulation to correctly detect accesses to /proc
>    https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg00165.html
> which already has been R-b: Daniel P. Berrangé

I can pick them both up -- I have other linux-user patches to send.


r~
Ilya Leoshkevich Aug. 10, 2023, 9:31 p.m. UTC | #6
On Mon, 2023-08-07 at 11:17 -0700, Richard Henderson wrote:
> On 8/7/23 09:37, Richard Henderson wrote:
> > We will want to be able to search the set of mappings.
> > For this patch, the two users iterate the tree in order.
> > 
> > Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> > ---
> >   include/qemu/selfmap.h |  20 ++++----
> >   linux-user/elfload.c   |  14 +++--
> >   linux-user/syscall.c   |  15 +++---
> >   util/selfmap.c         | 114 +++++++++++++++++++++++++-----------
> > -----
> >   4 files changed, 96 insertions(+), 67 deletions(-)
> 
> I should note that, for 8.2, this will enable a rewrite of
> open_self_maps_1 so that it 
> does not require page-by-page checking of page_get_flags.
> 
> My idea is that open_self_maps_1 would use walk_memory_regions to see
> all guest memory 
> regions.  The per-region callback would cross-check with the host-
> region interval tree to 
> find the dev+inode+path.
> 
> Cc Ilya and Helge, since there are two outstanding changes to
> open_self_maps.
> 
> 
> r~

My outstanding change should not be sensitive to this; it should be
possible to put it in both before or after the rewrite.



I really like this idea though, since I looked into ppc64le and there
printing maps is quite broken: it's not just that QEMU can't determine
the names of the mapped files, but also a number of regions are simply
missing. This also affects core dumps generated by GDB attached to
gdbstub.

For example, cat /proc/self/maps has the following internal page
layout:

start            end              size             prot
0000000010000000-000000001000d000 000000000000d000 r-x
000000001000d000-0000000010010000 0000000000003000 ---
0000000010010000-000000001001f000 000000000000f000 r--
000000001001f000-0000000010020000 0000000000001000 r--
0000000010020000-0000000010021000 0000000000001000 rw-
0000100000000000-0000100000010000 0000000000010000 ---
0000100000010000-0000100000810000 0000000000800000 rw-
0000100000810000-0000100000830000 0000000000020000 r-x
0000100000830000-000010000083d000 000000000000d000 r-x
000010000083d000-0000100000840000 0000000000003000 ---
0000100000840000-000010000084f000 000000000000f000 r--
000010000084f000-0000100000850000 0000000000001000 r--
0000100000850000-0000100000851000 0000000000001000 rw-
0000100000851000-0000100000852000 0000000000001000 rw-
0000100000860000-0000100000861000 0000000000001000 r-x
0000100000880000-0000100000a50000 00000000001d0000 r-x
0000100000a50000-0000100000a60000 0000000000010000 r--
0000100000a60000-0000100000a70000 0000000000010000 rw-
0000100000a70000-0000100000b70000 0000000000100000 rw-
0000100000b70000-000010000742d000 00000000068bd000 r--
00007fffb22b0000-00007fffb22e0000 0000000000030000 rw-

but prints only:

100000000000-100000010000 ---p 00000000 00:00 0                       
100000010000-100000810000 rw-p 00000000 00:00 0                       
[stack]
100000810000-100000830000 r-xp 00000000 fd:00 3049136                 
/usr/lib64/ld-2.17.so
100000880000-100000a50000 r-xp 00000000 fd:00 3017372                 
/usr/lib64/libc-2.17.so
100000a50000-100000a60000 r--p 001c0000 fd:00 3017372                 
/usr/lib64/libc-2.17.so
100000a60000-100000a70000 rw-p 001d0000 fd:00 3017372                 
/usr/lib64/libc-2.17.so
100000a70000-100000b70000 rw-p 00000000 00:00 0                       
7fffb22b0000-7fffb22e0000 rw-p 00000000 00:00 0                       

I don't see a good way to prevent page_check_range() from rejecting
most of the mappings with the current code structure, but I think that
after the proposed rewrite it should begin to just work.
Helge Deller Aug. 10, 2023, 10:06 p.m. UTC | #7
On 8/10/23 23:31, Ilya Leoshkevich wrote:
> On Mon, 2023-08-07 at 11:17 -0700, Richard Henderson wrote:
>> On 8/7/23 09:37, Richard Henderson wrote:
>>> We will want to be able to search the set of mappings.
>>> For this patch, the two users iterate the tree in order.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>>    include/qemu/selfmap.h |  20 ++++----
>>>    linux-user/elfload.c   |  14 +++--
>>>    linux-user/syscall.c   |  15 +++---
>>>    util/selfmap.c         | 114 +++++++++++++++++++++++++-----------
>>> -----
>>>    4 files changed, 96 insertions(+), 67 deletions(-)
>>
>> I should note that, for 8.2, this will enable a rewrite of
>> open_self_maps_1 so that it
>> does not require page-by-page checking of page_get_flags.
>>
>> My idea is that open_self_maps_1 would use walk_memory_regions to see
>> all guest memory
>> regions.  The per-region callback would cross-check with the host-
>> region interval tree to
>> find the dev+inode+path.
>>
>> Cc Ilya and Helge, since there are two outstanding changes to
>> open_self_maps.

I think the rewrite is good.
My patches regarding the map aren't important, I can adjust them
afterwards and resend (if necessary).

Helge
diff mbox series

Patch

diff --git a/include/qemu/selfmap.h b/include/qemu/selfmap.h
index 3479a2a618..aacd6ae0a0 100644
--- a/include/qemu/selfmap.h
+++ b/include/qemu/selfmap.h
@@ -9,9 +9,10 @@ 
 #ifndef SELFMAP_H
 #define SELFMAP_H
 
+#include "qemu/interval-tree.h"
+
 typedef struct {
-    unsigned long start;
-    unsigned long end;
+    IntervalTreeNode itree;
 
     /* flags */
     bool is_read;
@@ -19,26 +20,25 @@  typedef struct {
     bool is_exec;
     bool is_priv;
 
-    unsigned long offset;
-    gchar *dev;
+    uint64_t offset;
     uint64_t inode;
-    gchar *path;
+    const char *path;
+    char dev[];
 } MapInfo;
 
-
 /**
  * read_self_maps:
  *
  * Read /proc/self/maps and return a list of MapInfo structures.
  */
-GSList *read_self_maps(void);
+IntervalTreeRoot *read_self_maps(void);
 
 /**
  * free_self_maps:
- * @info: a GSlist
+ * @info: an interval tree
  *
- * Free a list of MapInfo structures.
+ * Free a tree of MapInfo structures.
  */
-void free_self_maps(GSList *info);
+void free_self_maps(IntervalTreeRoot *root);
 
 #endif /* SELFMAP_H */
diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index c9e176a9f6..f497286abe 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -2622,7 +2622,8 @@  static uintptr_t pgd_find_hole_fallback(uintptr_t guest_size, uintptr_t brk,
 static uintptr_t pgb_find_hole(uintptr_t guest_loaddr, uintptr_t guest_size,
                                long align, uintptr_t offset)
 {
-    GSList *maps, *iter;
+    IntervalTreeRoot *maps;
+    IntervalTreeNode *iter;
     uintptr_t this_start, this_end, next_start, brk;
     intptr_t ret = -1;
 
@@ -2640,12 +2641,15 @@  static uintptr_t pgb_find_hole(uintptr_t guest_loaddr, uintptr_t guest_size,
     /* The first hole is before the first map entry. */
     this_start = mmap_min_addr;
 
-    for (iter = maps; iter;
-         this_start = next_start, iter = g_slist_next(iter)) {
+    for (iter = interval_tree_iter_first(maps, 0, -1);
+         iter;
+         this_start = next_start,
+         iter = interval_tree_iter_next(iter, 0, -1)) {
+        MapInfo *info = container_of(iter, MapInfo, itree);
         uintptr_t align_start, hole_size;
 
-        this_end = ((MapInfo *)iter->data)->start;
-        next_start = ((MapInfo *)iter->data)->end;
+        this_end = info->itree.start;
+        next_start = info->itree.last + 1;
         align_start = ROUND_UP(this_start + offset, align);
 
         /* Skip holes that are too small. */
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 7c2c2f6e2f..a15bce2be2 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8070,16 +8070,17 @@  static int open_self_maps_1(CPUArchState *cpu_env, int fd, bool smaps)
 {
     CPUState *cpu = env_cpu(cpu_env);
     TaskState *ts = cpu->opaque;
-    GSList *map_info = read_self_maps();
-    GSList *s;
+    IntervalTreeRoot *map_info = read_self_maps();
+    IntervalTreeNode *s;
     int count;
 
-    for (s = map_info; s; s = g_slist_next(s)) {
-        MapInfo *e = (MapInfo *) s->data;
+    for (s = interval_tree_iter_first(map_info, 0, -1); s;
+         s = interval_tree_iter_next(s, 0, -1)) {
+        MapInfo *e = container_of(s, MapInfo, itree);
 
-        if (h2g_valid(e->start)) {
-            unsigned long min = e->start;
-            unsigned long max = e->end;
+        if (h2g_valid(e->itree.start)) {
+            unsigned long min = e->itree.start;
+            unsigned long max = e->itree.last + 1;
             int flags = page_get_flags(h2g(min));
             const char *path;
 
diff --git a/util/selfmap.c b/util/selfmap.c
index 2c14f019ce..4db5b42651 100644
--- a/util/selfmap.c
+++ b/util/selfmap.c
@@ -10,74 +10,98 @@ 
 #include "qemu/cutils.h"
 #include "qemu/selfmap.h"
 
-GSList *read_self_maps(void)
+IntervalTreeRoot *read_self_maps(void)
 {
-    gchar *maps;
-    GSList *map_info = NULL;
+    IntervalTreeRoot *root;
+    gchar *maps, **lines;
+    guint i, nlines;
 
-    if (g_file_get_contents("/proc/self/maps", &maps, NULL, NULL)) {
-        gchar **lines = g_strsplit(maps, "\n", 0);
-        int i, entries = g_strv_length(lines);
+    if (!g_file_get_contents("/proc/self/maps", &maps, NULL, NULL)) {
+        return NULL;
+    }
 
-        for (i = 0; i < entries; i++) {
-            gchar **fields = g_strsplit(lines[i], " ", 6);
-            if (g_strv_length(fields) > 4) {
-                MapInfo *e = g_new0(MapInfo, 1);
-                int errors = 0;
-                const char *end;
+    root = g_new0(IntervalTreeRoot, 1);
+    lines = g_strsplit(maps, "\n", 0);
+    nlines = g_strv_length(lines);
 
-                errors |= qemu_strtoul(fields[0], &end, 16, &e->start);
-                errors |= qemu_strtoul(end + 1, NULL, 16, &e->end);
+    for (i = 0; i < nlines; i++) {
+        gchar **fields = g_strsplit(lines[i], " ", 6);
+        guint nfields = g_strv_length(fields);
+
+        if (nfields > 4) {
+            uint64_t start, end, offset, inode;
+            int errors = 0;
+            const char *p;
+
+            errors |= qemu_strtou64(fields[0], &p, 16, &start);
+            errors |= qemu_strtou64(p + 1, NULL, 16, &end);
+            errors |= qemu_strtou64(fields[2], NULL, 16, &offset);
+            errors |= qemu_strtou64(fields[4], NULL, 10, &inode);
+
+            if (!errors) {
+                size_t dev_len, path_len;
+                MapInfo *e;
+
+                dev_len = strlen(fields[3]) + 1;
+                if (nfields == 6) {
+                    p = fields[5];
+                    p += strspn(p, " ");
+                    path_len = strlen(p) + 1;
+                } else {
+                    p = NULL;
+                    path_len = 0;
+                }
+
+                e = g_malloc0(sizeof(*e) + dev_len + path_len);
+
+                e->itree.start = start;
+                e->itree.last = end - 1;
+                e->offset = offset;
+                e->inode = inode;
 
                 e->is_read  = fields[1][0] == 'r';
                 e->is_write = fields[1][1] == 'w';
                 e->is_exec  = fields[1][2] == 'x';
                 e->is_priv  = fields[1][3] == 'p';
 
-                errors |= qemu_strtoul(fields[2], NULL, 16, &e->offset);
-                e->dev = g_strdup(fields[3]);
-                errors |= qemu_strtou64(fields[4], NULL, 10, &e->inode);
-
-                if (!errors) {
-                    /*
-                     * The last field may have leading spaces which we
-                     * need to strip.
-                     */
-                    if (g_strv_length(fields) == 6) {
-                        e->path = g_strdup(g_strchug(fields[5]));
-                    }
-                    map_info = g_slist_prepend(map_info, e);
-                } else {
-                    g_free(e->dev);
-                    g_free(e);
+                memcpy(e->dev, fields[3], dev_len);
+                if (path_len) {
+                    e->path = memcpy(e->dev + dev_len, p, path_len);
                 }
+
+                interval_tree_insert(&e->itree, root);
             }
-
-            g_strfreev(fields);
         }
-        g_strfreev(lines);
-        g_free(maps);
+        g_strfreev(fields);
     }
+    g_strfreev(lines);
+    g_free(maps);
 
-    /* ensure the map data is in the same order we collected it */
-    return g_slist_reverse(map_info);
+    return root;
 }
 
 /**
  * free_self_maps:
- * @info: a GSlist
+ * @root: an interval tree
  *
- * Free a list of MapInfo structures.
+ * Free a tree of MapInfo structures.
+ * Since we allocated each MapInfo in one chunk, we need not consider the
+ * contents and can simply free each RBNode.
  */
-static void free_info(gpointer data)
+
+static void free_rbnode(RBNode *n)
 {
-    MapInfo *e = (MapInfo *) data;
-    g_free(e->dev);
-    g_free(e->path);
-    g_free(e);
+    if (n) {
+        free_rbnode(n->rb_left);
+        free_rbnode(n->rb_right);
+        g_free(n);
+    }
 }
 
-void free_self_maps(GSList *info)
+void free_self_maps(IntervalTreeRoot *root)
 {
-    g_slist_free_full(info, &free_info);
+    if (root) {
+        free_rbnode(root->rb_root.rb_node);
+        g_free(root);
+    }
 }