
perf tools: Rebuild rbtree when adjusting symbols for kcore

Message ID 1446803172-83107-1-git-send-email-wangnan0@huawei.com
State New

Commit Message

Wang Nan Nov. 6, 2015, 9:46 a.m. UTC
In dso__split_kallsyms_for_kcore(), the current code adjusts a symbol's
address but only reinserts it into the rbtree if the symbol belongs to
another map. However, the expression for adjusting a symbol (pos->start -=
curr_map->start - curr_map->pgoff) can change the relative order between
two symbols (even if the affected symbols are in different maps, in the
kcore case they may share the same dso), which corrupts the rbtree.

For example:

When using kcore:

# readelf -a /proc/kcore

  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  ...
  LOAD           0x0000000000002000 0xffffffc000000000 0x0000000000000000    <-- kernel
                 0x000000007fc00000 0x000000007fc00000  RWE    1000
  LOAD           0xfffffffffc002000 0xffffffbffc000000 0x0000000000000000    <-- module
                 0x0000000004000000 0x0000000004000000  RWE    1000

For the module memory area:
  map->start = 0xffffffbffc000000, map->pgoff = 0xfffffffffc002000
For the normal kernel memory area:
  map->start = 0xffffffc000000000, map->pgoff = 0x0000000000002000

Function A is a normal kernel function at:	0xffffffc00021b428.
Function B is a function in module at: 		0xffffffbffc000000.

&A > &B before calling dso__split_kallsyms_for_kcore(), and they are
already in the rbtree.

During dso__split_kallsyms_for_kcore(), when adjusting symbols using
 pos->start -= curr_map->start - curr_map->pgoff

pos->start for A becomes: (0xffffffc00021b428 - 0xffffffc000000000 + 0x0000000000002000) = 0x21d428
pos->start for B becomes: (0xffffffbffc000000 - 0xffffffbffc000000 + 0xfffffffffc002000) = 0xfffffffffc002000

Now &A < &B: the order has changed.

This patch rebuilds the rbtree unconditionally to ensure that it is
always healthy.

Signed-off-by: Wang Nan <wangnan0@huawei.com>

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---

Here is my test result on my aarch64 system:

*Step 1: create kprobes*

[root@localhost ~]# ./perf_arm64 probe -m /tmp/kernel_module.ko my_func
Added new event:
  probe:my_func (on my_func in kernel_module)

You can now use it in all perf tools, such as:

	perf record -e probe:my_func -aR sleep 1

[root@localhost ~]# ./perf_arm64 probe sys_write
Added new event:
  probe:sys_write      (on sys_write)

You can now use it in all perf tools, such as:

	perf record -e probe:sys_write -aR sleep 1

[root@localhost ~]# cat /sys/kernel/debug/kprobes/list 
ffffffbffc000000  k  my_func+0x0  kernel_module [DISABLED]
ffffffc00021b428  k  SyS_write+0x0    [DISABLED]


*Step 2: rebuild perf without commit 98d3b25*

$ git log --oneline
3321d2b Revert "perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success"
e054731 perf stat: Make stat options global
0014de1 perf sched latency: Fix thread pid reuse issue
98d3b25 perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success
956959f perf trace: Fix documentation for -i


*Step 3: test and get the buggy result*

[root@localhost ~]# PAGER=cat ./perf_arm64 probe -l 
  Error: Failed to show event list.

[root@localhost ~]# PAGER=cat ./perf_arm64 probe -v -l 
map_groups__set_modules_path_dir: cannot open /lib/modules/4.1.12+ dir
Problems setting modules path maps, continuing anyway...
Opening /sys/kernel/debug/tracing//kprobe_events write=0
Opening /sys/kernel/debug/tracing//uprobe_events write=0
Parsing probe_events: p:probe/my_func kernel_module:my_func
Group:probe Event:my_func probe:p
Looking at the vmlinux_path (7 entries long)
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel object code
Using /proc/kallsyms for symbols
try to find information at 3ffc000000 in kernel_module
Failed to find module kernel_module.
Failed to find the path for kernel_module: [kernel_module]
Failed to find corresponding probes from debuginfo.
Failed to synthesize perf probe point: 0
  Error: Failed to show event list. Reason: Invalid argument (Code: -22)


*Step 4: Introduce this patch*

$ git log --oneline
36a8201 perf tools: Rebuild rbtree when adjusting symbols for kcore
3321d2b Revert "perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success"
e054731 perf stat: Make stat options global
0014de1 perf sched latency: Fix thread pid reuse issue
98d3b25 perf tools: Fix find_perf_probe_point_from_map() which incorrectly returns success


*Step 5: Try again*

[root@localhost ~]# PAGER=cat ./perf_arm64 probe -l 
  probe:my_func (on my_func in kernel_module)
  probe:sys_write      (on sys_write)
[root@localhost ~]# PAGER=cat ./perf_arm64 probe -v -l 
map_groups__set_modules_path_dir: cannot open /lib/modules/4.1.12+ dir
Problems setting modules path maps, continuing anyway...
Opening /sys/kernel/debug/tracing//kprobe_events write=0
Opening /sys/kernel/debug/tracing//uprobe_events write=0
Parsing probe_events: p:probe/my_func kernel_module:my_func
Group:probe Event:my_func probe:p
Looking at the vmlinux_path (7 entries long)
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel object code
Using /proc/kallsyms for symbols
Failed to find corresponding probes from debuginfo.
Failed to find probe point from both of dwarf and map.
  probe:my_func (on my_func in kernel_module)
Parsing probe_events: p:probe/sys_write _text+1684520
Group:probe Event:sys_write probe:p
try to find information at 19b428 in kernel
Looking at the vmlinux_path (7 entries long)
symsrc__init: cannot get elf header.
Failed to find the path for kernel: Invalid ELF file
Failed to find corresponding probes from debuginfo.
  probe:sys_write      (on sys_write)

---
 tools/perf/util/symbol.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Wang Nan Nov. 6, 2015, 1:34 p.m. UTC | #1
On 2015/11/6 21:19, Arnaldo Carvalho de Melo wrote:
> On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>> address but only reinsert it into rbtree if the symbol belongs to
>> another map. However, the expression for adjusting symbol (pos->start -=
>> curr_map->start - curr_map->pgoff) can change the relative order between
>> two symbols (even if the affected symbols are in different maps, in
>> kcore case they are possible to share one same dso), which damages the
>> rbtree.
> Right, some code does change the symbol values it gets from whatever
> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
> ->reloc, members for that :-\
>
> I.e. 'struct dso' should be just what comes from the symtab, while
> 'struct map' should be about where that DSO is in memory.
>
> With that in mind, do you still think your fix is the correct one?


I'm not sure. I'm not familiar with this part of the code. To be honest,
I don't understand the relationship between what you said and what I
found...

I spent a whole day answering Masami's question of why
kernel_get_symbol_address_by_name() succeeds but __find_kernel_function()
fails on my platform, and described the result in the commit message.
This patch is the best one I can find. It solves my problem but may be
incorrect. I just want you and others to know my result. Please let
me know if you or others want further information. Its priority is low
for now because patch 98d3b25 and Masami's update are already enough
for me.

I'll go back to the BPF stuff. There is still much work to do :-)

Thank you.

> Adrian?
>
> - Arnaldo
>



Wang Nan Nov. 11, 2015, 7:02 a.m. UTC | #2
On 2015/11/6 21:59, Adrian Hunter wrote:
> On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
>> On Fri, Nov 06, 2015 at 09:46:12AM +0000, Wang Nan wrote:
>>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>>> address but only reinsert it into rbtree if the symbol belongs to
>>> another map. However, the expression for adjusting symbol (pos->start -=
>>> curr_map->start - curr_map->pgoff) can change the relative order between
>>> two symbols (even if the affected symbols are in different maps, in
>>> kcore case they are possible to share one same dso), which damages the
>>> rbtree.
>> Right, some code does change the symbol values it gets from whatever
>> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
>> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
>> ->reloc, members for that :-\
>>
>> I.e. 'struct dso' should be just what comes from the symtab, while
>> 'struct map' should be about where that DSO is in memory.
>>
>> With that in mind, do you still think your fix is the correct one?
>>
>> Adrian?
> The problem is when the order in memory (in kallsyms) is different
> to the order on the dso (kcore).
>
> I think to make it more general it needs to insert to a new tree.
> e.g.
>


I have tested this patch and it works for me.

Thank you.

> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index b4cc7662677e..09343a880c0b 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -654,19 +654,24 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
>   	struct map_groups *kmaps = map__kmaps(map);
>   	struct map *curr_map;
>   	struct symbol *pos;
> -	int count = 0, moved = 0;
> +	int count = 0;
> +	struct rb_root old_root = dso->symbols[map->type];
>   	struct rb_root *root = &dso->symbols[map->type];
>   	struct rb_node *next = rb_first(root);
>
>   	if (!kmaps)
>   		return -1;
>
> +	*root = RB_ROOT;
> +
>   	while (next) {
>   		char *module;
>
>   		pos = rb_entry(next, struct symbol, rb_node);
>   		next = rb_next(&pos->rb_node);
>
> +		rb_erase_init(&pos->rb_node, &old_root);
> +
>   		module = strchr(pos->name, '\t');
>   		if (module)
>   			*module = '\0';
> @@ -674,28 +679,21 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
>   		curr_map = map_groups__find(kmaps, map->type, pos->start);
>
>   		if (!curr_map || (filter && filter(curr_map, pos))) {
> -			rb_erase_init(&pos->rb_node, root);
>   			symbol__delete(pos);
> -		} else {
> -			pos->start -= curr_map->start - curr_map->pgoff;
> -			if (pos->end)
> -				pos->end -= curr_map->start - curr_map->pgoff;
> -			if (curr_map->dso != map->dso) {
> -				rb_erase_init(&pos->rb_node, root);
> -				symbols__insert(
> -					&curr_map->dso->symbols[curr_map->type],
> -					pos);
> -				++moved;
> -			} else {
> -				++count;
> -			}
> +			continue;
>   		}
> +
> +		pos->start -= curr_map->start - curr_map->pgoff;
> +		if (pos->end)
> +			pos->end -= curr_map->start - curr_map->pgoff;
> +		symbols__insert(&curr_map->dso->symbols[curr_map->type], pos);
> +		++count;
>   	}
>
>   	/* Symbols have been adjusted */
>   	dso->adjust_symbols = 1;
>
> -	return count + moved;
> +	return count;
>   }
>
>   /*
>




Patch

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index b4cc766..09bb6e8 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -654,7 +654,7 @@  static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 	struct map_groups *kmaps = map__kmaps(map);
 	struct map *curr_map;
 	struct symbol *pos;
-	int count = 0, moved = 0;
+	int count = 0;
 	struct rb_root *root = &dso->symbols[map->type];
 	struct rb_node *next = rb_first(root);
 
@@ -677,25 +677,23 @@  static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
 			rb_erase_init(&pos->rb_node, root);
 			symbol__delete(pos);
 		} else {
+			rb_erase_init(&pos->rb_node, root);
+
 			pos->start -= curr_map->start - curr_map->pgoff;
 			if (pos->end)
 				pos->end -= curr_map->start - curr_map->pgoff;
-			if (curr_map->dso != map->dso) {
-				rb_erase_init(&pos->rb_node, root);
-				symbols__insert(
-					&curr_map->dso->symbols[curr_map->type],
-					pos);
-				++moved;
-			} else {
-				++count;
-			}
+
+			symbols__insert(
+				&curr_map->dso->symbols[curr_map->type],
+				pos);
+			++count;
 		}
 	}
 
 	/* Symbols have been adjusted */
 	dso->adjust_symbols = 1;
 
-	return count + moved;
+	return count;
 }
 
 /*