
[RFC] perf tools: Don't set inherit bit for system wide evsel

Message ID 1445597029-133332-1-git-send-email-wangnan0@huawei.com
State New

Commit Message

Wang Nan Oct. 23, 2015, 10:43 a.m. UTC
The inherit bit is useless for a system-wide evsel [1]. Further kernel
improvements are adding more constraints [2] on inherit events. This
patch sets the inherit bit to 0 to avoid potential constraints.

[1] http://lkml.kernel.org/r/20151022124142.GQ17308@twins.programming.kicks-ass.net
[2] http://lkml.kernel.org/r/1445559014-4667-1-git-send-email-ast@kernel.org

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Li Zefan <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/ebpf-0tgilipxoo6fiebcxu3ft866@git.kernel.org
---

evsel->system_wide doesn't correctly reflect whether this evsel is system
wide or not, so check pid when invoking perf_event_open instead, which is
always correct.

---
 tools/perf/util/evsel.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
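
A minimal sketch of the open-time approach described above, assuming a
hypothetical stand-in type (struct demo_attr) and helper
(demo_attr_for_open) that are not part of perf. It shows why the patch works
on a copy: the stored attr keeps its inherit bit for per-task opens, and only
the copy passed for pid == -1 has it cleared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one bit of struct perf_event_attr used here. */
struct demo_attr {
	unsigned int inherit : 1;
};

/*
 * Return the attr to hand to the syscall for one (cpu, thread) slot.
 * pid == -1 means "every task on this CPU", where inherit has no effect,
 * so the bit is cleared in the returned copy only.
 */
static struct demo_attr demo_attr_for_open(const struct demo_attr *stored, int pid)
{
	struct demo_attr attr;

	assert(stored != NULL);
	attr = *stored;		/* leave the stored attr untouched */
	if (pid == -1)
		attr.inherit = 0;
	return attr;
}
```

Calling demo_attr_for_open() with a real pid returns the stored attr
unchanged, so a later per-task open still requests inheritance.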

Comments

Arnaldo Carvalho de Melo Oct. 23, 2015, 1:51 p.m. UTC | #1
On Fri, Oct 23, 2015 at 10:43:49AM +0000, Wang Nan wrote:
> The inherit bit is useless for a system-wide evsel [1]. Further kernel
> improvements are adding more constraints [2] on inherit events. This
> patch sets the inherit bit to 0 to avoid potential constraints.
> 
> [1] http://lkml.kernel.org/r/20151022124142.GQ17308@twins.programming.kicks-ass.net
> [2] http://lkml.kernel.org/r/1445559014-4667-1-git-send-email-ast@kernel.org
> 
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Li Zefan <lizefan@huawei.com>
> Cc: pi3orama@163.com
> Link: http://lkml.kernel.org/n/ebpf-0tgilipxoo6fiebcxu3ft866@git.kernel.org
> ---
> 
> evsel->system_wide doesn't correctly reflect whether this evsel is system
> wide or not, so check pid when invoking perf_event_open instead, which is
> always correct.

Can't we do this at perf_evlist__config() or perf_evsel__config() time?

We have record_opts at perf_evsel__config() time and I think we should
leave changing the attr at perf_evsel__open() time for feature
fallbacks, i.e. something we will only know when trying to use, which is
different from this inherit-on-syswide case, that we know far in advance
we will not need.

- Arnaldo
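
The config-time alternative suggested above can be sketched roughly as
follows. This is only an illustration: struct demo_opts, struct demo_evsel
and demo_evsel_config() are hypothetical stand-ins, not the real record_opts,
evsel or perf_evsel__config(). The point is that the inherit decision depends
only on options known before any open attempt, so it does not need to be
made in the open loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real perf structures carry many more fields. */
struct demo_opts {
	bool system_wide_target;	/* e.g. the user asked for all CPUs */
	bool no_inherit;		/* user explicitly disabled inheritance */
};

struct demo_evsel {
	unsigned int inherit : 1;
};

/* Decide the inherit bit once, from options known before any open attempt. */
static void demo_evsel_config(struct demo_evsel *evsel, const struct demo_opts *opts)
{
	assert(evsel != NULL && opts != NULL);
	evsel->inherit = !opts->no_inherit && !opts->system_wide_target;
}
```

With this split, the open path would only adjust the attr for fallbacks
that are discovered by actually trying the syscall.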
 
> ---
>  tools/perf/util/evsel.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 5566b16..e2d6c9a 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1337,6 +1337,7 @@ retry_sample_id:
>  
>  		for (thread = 0; thread < nthreads; thread++) {
>  			int group_fd;
> +			struct perf_event_attr attr;
>  
>  			if (!evsel->cgrp && !evsel->system_wide)
>  				pid = thread_map__pid(threads, thread);
> @@ -1346,7 +1347,10 @@ retry_open:
>  			pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx\n",
>  				  pid, cpus->map[cpu], group_fd, flags);
>  
> -			FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
> +			attr = evsel->attr;
> +			if (pid == -1)
> +				attr.inherit = 0;
> +			FD(evsel, cpu, thread) = sys_perf_event_open(&attr,
>  								     pid,
>  								     cpus->map[cpu],
>  								     group_fd, flags);
> -- 
> 1.8.3.4
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
pi3orama Oct. 23, 2015, 1:58 p.m. UTC | #2
Sent from my iPhone

> On Oct. 23, 2015, at 9:51 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> On Fri, Oct 23, 2015 at 10:43:49AM +0000, Wang Nan wrote:
>> The inherit bit is useless for a system-wide evsel [1]. Further kernel
>> improvements are adding more constraints [2] on inherit events. This
>> patch sets the inherit bit to 0 to avoid potential constraints.
>> 
>> [1] http://lkml.kernel.org/r/20151022124142.GQ17308@twins.programming.kicks-ass.net
>> [2] http://lkml.kernel.org/r/1445559014-4667-1-git-send-email-ast@kernel.org
>> 
>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> Cc: Li Zefan <lizefan@huawei.com>
>> Cc: pi3orama@163.com
>> Link: http://lkml.kernel.org/n/ebpf-0tgilipxoo6fiebcxu3ft866@git.kernel.org
>> ---
>> 
>> evsel->system_wide doesn't correctly reflect whether this evsel is system
>> wide or not, so check pid when invoking perf_event_open instead, which is
>> always correct.
> 
> Can't we do this at perf_evlist__config() or perf_evsel__config() time?

perf_evlist__config() is excluded because perf record doesn't use it.

> 
> We have record_opts at perf_evsel__config() time and I think we should
> leave changing the attr at perf_evsel__open() time for feature
> fallbacks, i.e. something we will only know when trying to use, which is
> different from this inherit-on-syswide case, that we know far in advance
> we will not need.

I tried to set this bit based on evsel->system_wide, but it doesn't seem as
reliable as it should be, so I was wondering whether it is designed for some
other use. I will look into this next week.

Thank you.
> 
> - Arnaldo
> 
>> ---
>> tools/perf/util/evsel.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index 5566b16..e2d6c9a 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -1337,6 +1337,7 @@ retry_sample_id:
>> 
>>        for (thread = 0; thread < nthreads; thread++) {
>>            int group_fd;
>> +            struct perf_event_attr attr;
>> 
>>            if (!evsel->cgrp && !evsel->system_wide)
>>                pid = thread_map__pid(threads, thread);
>> @@ -1346,7 +1347,10 @@ retry_open:
>>            pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx\n",
>>                  pid, cpus->map[cpu], group_fd, flags);
>> 
>> -            FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
>> +            attr = evsel->attr;
>> +            if (pid == -1)
>> +                attr.inherit = 0;
>> +            FD(evsel, cpu, thread) = sys_perf_event_open(&attr,
>>                                     pid,
>>                                     cpus->map[cpu],
>>                                     group_fd, flags);
>> -- 
>> 1.8.3.4

Wang Nan Oct. 26, 2015, 9:08 a.m. UTC | #3
On 2015/10/24 0:17, Arnaldo Carvalho de Melo wrote:
> On Fri, Oct 23, 2015 at 09:58:20PM +0800, pi3orama wrote:
>>
>> Sent from my iPhone
>>
>>> On Oct. 23, 2015, at 9:51 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>>
>>> On Fri, Oct 23, 2015 at 10:43:49AM +0000, Wang Nan wrote:
>>>> The inherit bit is useless for a system-wide evsel [1]. Further kernel
>>>> improvements are adding more constraints [2] on inherit events. This
>>>> patch sets the inherit bit to 0 to avoid potential constraints.
>>>>
>>>> [1] http://lkml.kernel.org/r/20151022124142.GQ17308@twins.programming.kicks-ass.net
>>>> [2] http://lkml.kernel.org/r/1445559014-4667-1-git-send-email-ast@kernel.org
>>>>
>>>> Signed-off-by: Wang Nan <wangnan0@huawei.com>
>>>> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
>>>> Cc: Alexei Starovoitov <ast@plumgrid.com>
>>>> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
>>>> Cc: Li Zefan <lizefan@huawei.com>
>>>> Cc: pi3orama@163.com
>>>> Link: http://lkml.kernel.org/n/ebpf-0tgilipxoo6fiebcxu3ft866@git.kernel.org
>>>> ---
>>>>
>>>> evsel->system_wide doesn't correctly reflect whether this evsel is system
>>>> wide or not, so check pid when invoking perf_event_open instead, which is
>>>> always correct.
>>> Can't we do this at perf_evlist__config() or perf_evsel__config() time?
>> perf_evlist__config() is excluded because perf record doesn't use it.
> Yeah, we need to make it use it :-\


My mistake: perf record *does* use perf_evlist__config(), but 'perf stat'
doesn't.

>
>>> We have record_opts at perf_evsel__config() time and I think we should
>>> leave changing the attr at perf_evsel__open() time for feature
>>> fallbacks, i.e. something we will only know when trying to use, which is
>>> different from this inherit-on-syswide case, that we know far in advance
>>> we will not need.
>> I tried to set this bit based on evsel->system_wide, but it doesn't seem as
>> reliable as it should be, so I was wondering whether it is designed for some
>> other use. I will look into this next week.


evsel->system_wide was introduced by commit
bf8e8f4b832972c76d64ab2e2837a48397144887 (perf evlist: Add 'system_wide'
option), but Adrian only added the new field to perf without really making
it active. Until now its only user has been arch/x86/util/intel-pt.c, and
I'm not sure why IPT uses that field.

If I understand correctly, it should be okay for a normal system-wide evsel
to have this variable set. I'll try another RFC for it.

Thank you.

> Ok, thanks in advance, lemme go back looking at eBPF :-)
>
> - Arnaldo



Wang Nan Oct. 26, 2015, 11:48 a.m. UTC | #4
On 2015/10/26 17:25, Adrian Hunter wrote:
> On 26/10/15 11:08, Wangnan (F) wrote:
>>
>> evsel->system_wide was introduced by commit
>> bf8e8f4b832972c76d64ab2e2837a48397144887 (perf evlist: Add 'system_wide'
>> option), but Adrian only added the new field to perf without really making
>> it active. Until now its only user has been arch/x86/util/intel-pt.c, and
>> I'm not sure why IPT uses that field.
>>
>> If I understand correctly, it should be okay for a normal system-wide evsel
>> to have this variable set. I'll try another RFC for it.
> evsel->system_wide is for mixing evsels that aren't system-wide with ones
> that are.
>
> It might work to set it for all system-wide evsels but you will have to
> check the code and test it, because that would be using it in a new way
> that has never been tested.


I have checked all occurrences of system_wide I could find and found
only one behavior change, which I believe should be okay. Please
have a look at [1].

Thank you.

[1] http://lkml.kernel.org/g/1445859720-146146-1-git-send-email-wangnan0@huawei.com
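
Adrian's description of evsel->system_wide as a per-evsel flag for mixing
system-wide and per-task evsels can be sketched as below. The names
(struct demo_evsel, demo_pid_for_slot) are hypothetical, and the cgroup case
is simplified: this sketch just leaves pid at -1 there. The condition mirrors
the `!evsel->cgrp && !evsel->system_wide` test visible in the patch context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two evsel fields tested in the open loop. */
struct demo_evsel {
	bool has_cgrp;
	bool system_wide;
};

/*
 * Pick the pid for one thread slot. An evsel flagged system_wide keeps
 * pid == -1 even when the thread map names real tasks; an unflagged,
 * non-cgroup evsel uses the task's pid from the map.
 */
static int demo_pid_for_slot(const struct demo_evsel *evsel,
			     const int *thread_pids, size_t thread)
{
	int pid = -1;	/* default: every task on the CPU */

	assert(evsel != NULL && thread_pids != NULL);
	if (!evsel->has_cgrp && !evsel->system_wide)
		pid = thread_pids[thread];
	return pid;
}
```

In a mixed list, two evsels opened against the same thread map therefore end
up with different pid arguments, which is why the flag alone may not say
whether the whole session is system wide.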


Patch

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 5566b16..e2d6c9a 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1337,6 +1337,7 @@  retry_sample_id:
 
 		for (thread = 0; thread < nthreads; thread++) {
 			int group_fd;
+			struct perf_event_attr attr;
 
 			if (!evsel->cgrp && !evsel->system_wide)
 				pid = thread_map__pid(threads, thread);
@@ -1346,7 +1347,10 @@  retry_open:
 			pr_debug2("sys_perf_event_open: pid %d  cpu %d  group_fd %d  flags %#lx\n",
 				  pid, cpus->map[cpu], group_fd, flags);
 
-			FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
+			attr = evsel->attr;
+			if (pid == -1)
+				attr.inherit = 0;
+			FD(evsel, cpu, thread) = sys_perf_event_open(&attr,
 								     pid,
 								     cpus->map[cpu],
 								     group_fd, flags);