Message ID | 20230807163705.9848-11-richard.henderson@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | linux-user: image mapping fixes | expand |
On 8/7/23 09:37, Richard Henderson wrote: > We will want to be able to search the set of mappings. > For this patch, the two users iterate the tree in order. > > Signed-off-by: Richard Henderson <richard.henderson@linaro.org> > --- > include/qemu/selfmap.h | 20 ++++---- > linux-user/elfload.c | 14 +++-- > linux-user/syscall.c | 15 +++--- > util/selfmap.c | 114 +++++++++++++++++++++++++---------------- > 4 files changed, 96 insertions(+), 67 deletions(-) I should note that, for 8.2, this will enable a rewrite of open_self_maps_1 so that it does not require page-by-page checking of page_get_flags. My idea is that open_self_maps_1 would use walk_memory_regions to see all guest memory regions. The per-region callback would cross-check with the host-region interval tree to find the dev+inode+path. Cc Ilya and Helge, since there are two outstanding changes to open_self_maps. r~
07.08.2023 19:37, Richard Henderson wrote: > We will want to be able to search the set of mappings. > For this patch, the two users iterate the tree in order. > diff --git a/include/qemu/selfmap.h b/include/qemu/selfmap.h > /** > * read_self_maps: > * > * Read /proc/self/maps and return a list of MapInfo structures. Nitpick: this comment still says "a list", while in all other places it has been changed to "a tree". > */ > -GSList *read_self_maps(void); > +IntervalTreeRoot *read_self_maps(void); /mjt
On 8/9/23 08:11, Helge Deller wrote: > Fix a crash in qemu-user when running > > cat /proc/self/maps > > in a chroot, where /proc isn't mounted. > > The problem was introduced by commit 3ce3dd8ca965 ("util/selfmap: > Rewrite using qemu/interval-tree.h") where in open_self_maps_1() the > function read_self_maps() is called and which returns NULL if it can't > read the hosts /proc/self/maps file. Afterwards that NULL is fed into > interval_tree_iter_first() which doesn't check if the root node is NULL. > > Fix it by adding a check if root is NULL and return NULL in that case. > > Signed-off-by: Helge Deller <deller@gmx.de> > Fixes: 3ce3dd8ca965 ("util/selfmap: Rewrite using qemu/interval-tree.h") > > diff --git a/util/interval-tree.c b/util/interval-tree.c > index f2866aa7d3..53465182e6 100644 > --- a/util/interval-tree.c > +++ b/util/interval-tree.c > @@ -797,7 +797,7 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root, > { > IntervalTreeNode *node, *leftmost; > > - if (!root->rb_root.rb_node) { > + if (!root || !root->rb_root.rb_node) { I guess this is good enough for 8.1. Before the conversion to interval-tree we would also emit nothing. I've already done a rewrite for 8.2, and I noticed this problem. There I emit what mapping information that I have, which is everything except for the device+path data. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~
On 8/9/23 17:23, Richard Henderson wrote: > On 8/9/23 08:11, Helge Deller wrote: >> Fix a crash in qemu-user when running >> >> cat /proc/self/maps >> >> in a chroot, where /proc isn't mounted. >> >> The problem was introduced by commit 3ce3dd8ca965 ("util/selfmap: >> Rewrite using qemu/interval-tree.h") where in open_self_maps_1() the >> function read_self_maps() is called and which returns NULL if it can't >> read the hosts /proc/self/maps file. Afterwards that NULL is fed into >> interval_tree_iter_first() which doesn't check if the root node is NULL. >> >> Fix it by adding a check if root is NULL and return NULL in that case. >> >> Signed-off-by: Helge Deller <deller@gmx.de> >> Fixes: 3ce3dd8ca965 ("util/selfmap: Rewrite using qemu/interval-tree.h") >> >> diff --git a/util/interval-tree.c b/util/interval-tree.c >> index f2866aa7d3..53465182e6 100644 >> --- a/util/interval-tree.c >> +++ b/util/interval-tree.c >> @@ -797,7 +797,7 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root, >> { >> IntervalTreeNode *node, *leftmost; >> >> - if (!root->rb_root.rb_node) { >> + if (!root || !root->rb_root.rb_node) { > > > I guess this is good enough for 8.1. Before the conversion to interval-tree we would also emit nothing. Yes and yes. > I've already done a rewrite for 8.2, and I noticed this problem. > There I emit what mapping information that I have, which is > everything except for the device+path data. nice. > Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Shall I send a pull request? If so, is it OK that I include this patch in the pull-request as well? linux-user: Fix openat() emulation to correctly detect accesses to /proc https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg00165.html which already has been R-b: Daniel P. Berrangé Helge
On 8/9/23 08:53, Helge Deller wrote: > On 8/9/23 17:23, Richard Henderson wrote: >> On 8/9/23 08:11, Helge Deller wrote: >>> Fix a crash in qemu-user when running >>> >>> cat /proc/self/maps >>> >>> in a chroot, where /proc isn't mounted. >>> >>> The problem was introduced by commit 3ce3dd8ca965 ("util/selfmap: >>> Rewrite using qemu/interval-tree.h") where in open_self_maps_1() the >>> function read_self_maps() is called and which returns NULL if it can't >>> read the hosts /proc/self/maps file. Afterwards that NULL is fed into >>> interval_tree_iter_first() which doesn't check if the root node is NULL. >>> >>> Fix it by adding a check if root is NULL and return NULL in that case. >>> >>> Signed-off-by: Helge Deller <deller@gmx.de> >>> Fixes: 3ce3dd8ca965 ("util/selfmap: Rewrite using qemu/interval-tree.h") >>> >>> diff --git a/util/interval-tree.c b/util/interval-tree.c >>> index f2866aa7d3..53465182e6 100644 >>> --- a/util/interval-tree.c >>> +++ b/util/interval-tree.c >>> @@ -797,7 +797,7 @@ IntervalTreeNode *interval_tree_iter_first(IntervalTreeRoot *root, >>> { >>> IntervalTreeNode *node, *leftmost; >>> >>> - if (!root->rb_root.rb_node) { >>> + if (!root || !root->rb_root.rb_node) { >> >> >> I guess this is good enough for 8.1. Before the conversion to interval-tree we would >> also emit nothing. > > Yes and yes. > >> I've already done a rewrite for 8.2, and I noticed this problem. >> There I emit what mapping information that I have, which is >> everything except for the device+path data. > > nice. > >> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> > > Shall I send a pull request? > If so, is it OK that I include this patch in the pull-request as well? > linux-user: Fix openat() emulation to correctly detect accesses to /proc > https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg00165.html > which already has been R-b: Daniel P. Berrangé I can pick them both up -- I have other linux-user patches to send. r~
On Mon, 2023-08-07 at 11:17 -0700, Richard Henderson wrote: > On 8/7/23 09:37, Richard Henderson wrote: > > We will want to be able to search the set of mappings. > > For this patch, the two users iterate the tree in order. > > > > Signed-off-by: Richard Henderson <richard.henderson@linaro.org> > > --- > > include/qemu/selfmap.h | 20 ++++---- > > linux-user/elfload.c | 14 +++-- > > linux-user/syscall.c | 15 +++--- > > util/selfmap.c | 114 +++++++++++++++++++++++++----------- > > ----- > > 4 files changed, 96 insertions(+), 67 deletions(-) > > I should note that, for 8.2, this will enable a rewrite of > open_self_maps_1 so that it > does not require page-by-page checking of page_get_flags. > > My idea is that open_self_maps_1 would use walk_memory_regions to see > all guest memory > regions. The per-region callback would cross-check with the host- > region interval tree to > find the dev+inode+path. > > Cc Ilya and Helge, since there are two outstanding changes to > open_self_maps. > > > r~ My outstanding change should not be sensitive to this; it should be possible to put it in both before or after the rewrite. I really like this idea though, since I looked into ppc64le and there printing maps is quite broken: it's not just that QEMU can't determine the names of the mapped files, but also a number of regions are simply missing. This also affects core dumps generated by GDB attached to gdbstub. For example, cat /proc/self/maps has the following internal page layout: start end size prot 0000000010000000-000000001000d000 000000000000d000 r-x 000000001000d000-0000000010010000 0000000000003000 --- 0000000010010000-000000001001f000 000000000000f000 r-- 000000001001f000-0000000010020000 0000000000001000 r-- 0000000010020000-0000000010021000 0000000000001000 rw- 0000100000000000-0000100000010000 0000000000010000 --- 0000100000010000-0000100000810000 0000000000800000 rw- 0000100000810000-0000100000830000 0000000000020000 r-x 0000100000830000-000010000083d000 000000000000d000 r-x 000010000083d000-0000100000840000 0000000000003000 --- 0000100000840000-000010000084f000 000000000000f000 r-- 000010000084f000-0000100000850000 0000000000001000 r-- 0000100000850000-0000100000851000 0000000000001000 rw- 0000100000851000-0000100000852000 0000000000001000 rw- 0000100000860000-0000100000861000 0000000000001000 r-x 0000100000880000-0000100000a50000 00000000001d0000 r-x 0000100000a50000-0000100000a60000 0000000000010000 r-- 0000100000a60000-0000100000a70000 0000000000010000 rw- 0000100000a70000-0000100000b70000 0000000000100000 rw- 0000100000b70000-000010000742d000 00000000068bd000 r-- 00007fffb22b0000-00007fffb22e0000 0000000000030000 rw- but prints only: 100000000000-100000010000 ---p 00000000 00:00 0 100000010000-100000810000 rw-p 00000000 00:00 0 [stack] 100000810000-100000830000 r-xp 00000000 fd:00 3049136 /usr/lib64/ld-2.17.so 100000880000-100000a50000 r-xp 00000000 fd:00 3017372 /usr/lib64/libc-2.17.so 100000a50000-100000a60000 r--p 001c0000 fd:00 3017372 /usr/lib64/libc-2.17.so 100000a60000-100000a70000 rw-p 001d0000 fd:00 3017372 /usr/lib64/libc-2.17.so 100000a70000-100000b70000 rw-p 00000000 00:00 0 7fffb22b0000-7fffb22e0000 rw-p 00000000 00:00 0 I don't see a good way to prevent page_check_range() from rejecting most of the mappings with the current code structure, but I think that after the proposed rewrite it should begin to just work.
On 8/10/23 23:31, Ilya Leoshkevich wrote: > On Mon, 2023-08-07 at 11:17 -0700, Richard Henderson wrote: >> On 8/7/23 09:37, Richard Henderson wrote: >>> We will want to be able to search the set of mappings. >>> For this patch, the two users iterate the tree in order. >>> >>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> >>> --- >>> include/qemu/selfmap.h | 20 ++++---- >>> linux-user/elfload.c | 14 +++-- >>> linux-user/syscall.c | 15 +++--- >>> util/selfmap.c | 114 +++++++++++++++++++++++++----------- >>> ----- >>> 4 files changed, 96 insertions(+), 67 deletions(-) >> >> I should note that, for 8.2, this will enable a rewrite of >> open_self_maps_1 so that it >> does not require page-by-page checking of page_get_flags. >> >> My idea is that open_self_maps_1 would use walk_memory_regions to see >> all guest memory >> regions. The per-region callback would cross-check with the host- >> region interval tree to >> find the dev+inode+path. >> >> Cc Ilya and Helge, since there are two outstanding changes to >> open_self_maps. I think the rewrite is good. My patches regarding the map aren't important, I can adjust them afterwards and resend (if necessary). Helge
diff --git a/include/qemu/selfmap.h b/include/qemu/selfmap.h index 3479a2a618..aacd6ae0a0 100644 --- a/include/qemu/selfmap.h +++ b/include/qemu/selfmap.h @@ -9,9 +9,10 @@ #ifndef SELFMAP_H #define SELFMAP_H +#include "qemu/interval-tree.h" + typedef struct { - unsigned long start; - unsigned long end; + IntervalTreeNode itree; /* flags */ bool is_read; @@ -19,26 +20,25 @@ typedef struct { bool is_exec; bool is_priv; - unsigned long offset; - gchar *dev; + uint64_t offset; uint64_t inode; - gchar *path; + const char *path; + char dev[]; } MapInfo; - /** * read_self_maps: * * Read /proc/self/maps and return a list of MapInfo structures. */ -GSList *read_self_maps(void); +IntervalTreeRoot *read_self_maps(void); /** * free_self_maps: - * @info: a GSlist + * @info: an interval tree * - * Free a list of MapInfo structures. + * Free a tree of MapInfo structures. */ -void free_self_maps(GSList *info); +void free_self_maps(IntervalTreeRoot *root); #endif /* SELFMAP_H */ diff --git a/linux-user/elfload.c b/linux-user/elfload.c index c9e176a9f6..f497286abe 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -2622,7 +2622,8 @@ static uintptr_t pgd_find_hole_fallback(uintptr_t guest_size, uintptr_t brk, static uintptr_t pgb_find_hole(uintptr_t guest_loaddr, uintptr_t guest_size, long align, uintptr_t offset) { - GSList *maps, *iter; + IntervalTreeRoot *maps; + IntervalTreeNode *iter; uintptr_t this_start, this_end, next_start, brk; intptr_t ret = -1; @@ -2640,12 +2641,15 @@ static uintptr_t pgb_find_hole(uintptr_t guest_loaddr, uintptr_t guest_size, /* The first hole is before the first map entry. */ this_start = mmap_min_addr; - for (iter = maps; iter; - this_start = next_start, iter = g_slist_next(iter)) { + for (iter = interval_tree_iter_first(maps, 0, -1); + iter; + this_start = next_start, + iter = interval_tree_iter_next(iter, 0, -1)) { + MapInfo *info = container_of(iter, MapInfo, itree); uintptr_t align_start, hole_size; - this_end = ((MapInfo *)iter->data)->start; - next_start = ((MapInfo *)iter->data)->end; + this_end = info->itree.start; + next_start = info->itree.last + 1; align_start = ROUND_UP(this_start + offset, align); /* Skip holes that are too small. */ diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 7c2c2f6e2f..a15bce2be2 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8070,16 +8070,17 @@ static int open_self_maps_1(CPUArchState *cpu_env, int fd, bool smaps) { CPUState *cpu = env_cpu(cpu_env); TaskState *ts = cpu->opaque; - GSList *map_info = read_self_maps(); - GSList *s; + IntervalTreeRoot *map_info = read_self_maps(); + IntervalTreeNode *s; int count; - for (s = map_info; s; s = g_slist_next(s)) { - MapInfo *e = (MapInfo *) s->data; + for (s = interval_tree_iter_first(map_info, 0, -1); s; + s = interval_tree_iter_next(s, 0, -1)) { + MapInfo *e = container_of(s, MapInfo, itree); - if (h2g_valid(e->start)) { - unsigned long min = e->start; - unsigned long max = e->end; + if (h2g_valid(e->itree.start)) { + unsigned long min = e->itree.start; + unsigned long max = e->itree.last + 1; int flags = page_get_flags(h2g(min)); const char *path; diff --git a/util/selfmap.c b/util/selfmap.c index 2c14f019ce..4db5b42651 100644 --- a/util/selfmap.c +++ b/util/selfmap.c @@ -10,74 +10,98 @@ #include "qemu/cutils.h" #include "qemu/selfmap.h" -GSList *read_self_maps(void) +IntervalTreeRoot *read_self_maps(void) { - gchar *maps; - GSList *map_info = NULL; + IntervalTreeRoot *root; + gchar *maps, **lines; + guint i, nlines; - if (g_file_get_contents("/proc/self/maps", &maps, NULL, NULL)) { - gchar **lines = g_strsplit(maps, "\n", 0); - int i, entries = g_strv_length(lines); + if (!g_file_get_contents("/proc/self/maps", &maps, NULL, NULL)) { + return NULL; + } - for (i = 0; i < entries; i++) { - gchar **fields = g_strsplit(lines[i], " ", 6); - if (g_strv_length(fields) > 4) { - MapInfo *e = g_new0(MapInfo, 1); - int errors = 0; - const char *end; + root = g_new0(IntervalTreeRoot, 1); + lines = g_strsplit(maps, "\n", 0); + nlines = g_strv_length(lines); - errors |= qemu_strtoul(fields[0], &end, 16, &e->start); - errors |= qemu_strtoul(end + 1, NULL, 16, &e->end); + for (i = 0; i < nlines; i++) { + gchar **fields = g_strsplit(lines[i], " ", 6); + guint nfields = g_strv_length(fields); + + if (nfields > 4) { + uint64_t start, end, offset, inode; + int errors = 0; + const char *p; + + errors |= qemu_strtou64(fields[0], &p, 16, &start); + errors |= qemu_strtou64(p + 1, NULL, 16, &end); + errors |= qemu_strtou64(fields[2], NULL, 16, &offset); + errors |= qemu_strtou64(fields[4], NULL, 10, &inode); + + if (!errors) { + size_t dev_len, path_len; + MapInfo *e; + + dev_len = strlen(fields[3]) + 1; + if (nfields == 6) { + p = fields[5]; + p += strspn(p, " "); + path_len = strlen(p) + 1; + } else { + p = NULL; + path_len = 0; + } + + e = g_malloc0(sizeof(*e) + dev_len + path_len); + + e->itree.start = start; + e->itree.last = end - 1; + e->offset = offset; + e->inode = inode; e->is_read = fields[1][0] == 'r'; e->is_write = fields[1][1] == 'w'; e->is_exec = fields[1][2] == 'x'; e->is_priv = fields[1][3] == 'p'; - errors |= qemu_strtoul(fields[2], NULL, 16, &e->offset); - e->dev = g_strdup(fields[3]); - errors |= qemu_strtou64(fields[4], NULL, 10, &e->inode); - - if (!errors) { - /* - * The last field may have leading spaces which we - * need to strip. - */ - if (g_strv_length(fields) == 6) { - e->path = g_strdup(g_strchug(fields[5])); - } - map_info = g_slist_prepend(map_info, e); - } else { - g_free(e->dev); - g_free(e); + memcpy(e->dev, fields[3], dev_len); + if (path_len) { + e->path = memcpy(e->dev + dev_len, p, path_len); } + + interval_tree_insert(&e->itree, root); } - - g_strfreev(fields); } - g_strfreev(lines); - g_free(maps); + g_strfreev(fields); } + g_strfreev(lines); + g_free(maps); - /* ensure the map data is in the same order we collected it */ - return g_slist_reverse(map_info); + return root; } /** * free_self_maps: - * @info: a GSlist + * @root: an interval tree * - * Free a list of MapInfo structures. + * Free a tree of MapInfo structures. + * Since we allocated each MapInfo in one chunk, we need not consider the + * contents and can simply free each RBNode. */ -static void free_info(gpointer data) + +static void free_rbnode(RBNode *n) { - MapInfo *e = (MapInfo *) data; - g_free(e->dev); - g_free(e->path); - g_free(e); + if (n) { + free_rbnode(n->rb_left); + free_rbnode(n->rb_right); + g_free(n); + } } -void free_self_maps(GSList *info) +void free_self_maps(IntervalTreeRoot *root) { - g_slist_free_full(info, &free_info); + if (root) { + free_rbnode(root->rb_root.rb_node); + g_free(root); + } }
We will want to be able to search the set of mappings. For this patch, the two users iterate the tree in order. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> --- include/qemu/selfmap.h | 20 ++++---- linux-user/elfload.c | 14 +++-- linux-user/syscall.c | 15 +++--- util/selfmap.c | 114 +++++++++++++++++++++++++---------------- 4 files changed, 96 insertions(+), 67 deletions(-)