[v2,1/4] perf trace: Exit when build eBPF program failure

Message ID 20190606094845.4800-2-leo.yan@linaro.org
State New
Headers show
Series
  • [v2,1/4] perf trace: Exit when build eBPF program failure
Related show

Commit Message

Leo Yan June 6, 2019, 9:48 a.m.
On my Juno board with ARM64 CPUs, perf trace command reports the eBPF
program building failure but the command will not exit and continue to
run.  If we define an eBPF event in config file, the event will be
parsed with below flow:

  perf_config()
    `> trace__config()
	 `> parse_events_option()
	      `> parse_events__scanner()
	           `-> parse_events_parse()
	                 `> parse_events_load_bpf()
	                      `> llvm__compile_bpf()

Though the low level functions return back error values when detect eBPF
building failure, but parse_events_option() returns 1 for this case and
trace__config() passes 1 to perf_config(); perf_config() doesn't treat
the returned value 1 as failure and it continues to parse other
configurations.  Thus the perf command continues to run even without
enabling eBPF event successfully.

This patch changes error handling in trace__config(), when it detects
failure it will return -1 rather than directly pass error value (1);
finally, perf_config() will directly bail out and perf will exit for
this case.

Signed-off-by: Leo Yan <leo.yan@linaro.org>

---
 tools/perf/builtin-trace.c | 8 ++++++++
 1 file changed, 8 insertions(+)

-- 
2.17.1

Comments

Arnaldo Carvalho de Melo June 6, 2019, 1:30 p.m. | #1
Em Thu, Jun 06, 2019 at 05:48:42PM +0800, Leo Yan escreveu:
> On my Juno board with ARM64 CPUs, perf trace command reports the eBPF

> program building failure but the command will not exit and continue to

> run.  If we define an eBPF event in config file, the event will be

> parsed with below flow:

> 

>   perf_config()

>     `> trace__config()

> 	 `> parse_events_option()

> 	      `> parse_events__scanner()

> 	           `-> parse_events_parse()

> 	                 `> parse_events_load_bpf()

> 	                      `> llvm__compile_bpf()

> 

> Though the low level functions return back error values when detect eBPF

> building failure, but parse_events_option() returns 1 for this case and


(gdb) n
parse_events__scanner (str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", parse_state=0x7fffffff7fa0,
    start_token=258) at util/parse-events.c:1870
1870		parse_events__delete_buffer(buffer, scanner);
(gdb) n
1871		parse_events_lex_destroy(scanner);
(gdb) n
1872		return ret;
(gdb) p ret
$53 = 1
(gdb) bt
#0  parse_events__scanner (str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", parse_state=0x7fffffff7fa0,
    start_token=258) at util/parse-events.c:1872
#1  0x000000000050a926 in parse_events (evlist=0xb9e5d0, str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o",
    err=0x7fffffff8020) at util/parse-events.c:1907
#2  0x000000000050ad94 in parse_events_option (opt=0x7fffffff8080,
    str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", unset=0) at util/parse-events.c:2007
#3  0x0000000000497fa8 in trace__config (var=0x7fffffff8150 "trace.add_events",
    value=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", arg=0x7fffffffa1c0) at builtin-trace.c:3706
#4  0x00000000004e9a79 in perf_config (fn=0x497ee4 <trace__config>, data=0x7fffffffa1c0) at util/config.c:738
#5  0x0000000000498c97 in cmd_trace (argc=2, argv=0x7fffffffd690) at builtin-trace.c:3865
#6  0x00000000004d8c17 in run_builtin (p=0xa0e600 <commands+576>, argc=2, argv=0x7fffffffd690) at perf.c:303
#7  0x00000000004d8e84 in handle_internal_command (argc=2, argv=0x7fffffffd690) at perf.c:355
#8  0x00000000004d8fd3 in run_argv (argcp=0x7fffffffd4ec, argv=0x7fffffffd4e0) at perf.c:399
#9  0x00000000004d933f in main (argc=2, argv=0x7fffffffd690) at perf.c:521
(gdb)

So its parse_events__scanner() that returns 1, parse_events() propagate
that and:

parse_events_option (opt=0x7fffffff8080, str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", unset=0)
    at util/parse-events.c:2009
2009		if (ret) {
(gdb) p ret
$56 = 1
(gdb) n
2010			parse_events_print_error(&err, str);
(gdb) n
event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o'
                     \___ Kernel verifier blocks program loading

(add -v to see detail)
2011			fprintf(stderr, "Run 'perf list' for a list of valid events\n");
(gdb)

So the -4007 error is printed, and all we can say is that parsing events
failed, but we end up not propagating that error back when we use
parse_events_option(), we could use instead:

        struct parse_events_error err = { .idx = 0, };
        int ret = parse_events(evlist, str, &err);

And make parse_events_error have the raw err, i.e. -4007 in this case:

        [ERRCODE_OFFSET(VERIFY)]        = "Kernel verifier blocks program loading",

In your case would be something else, I'm just trying to load the
precompiled .o that does things the BPF kernel verifier doesn't like.

So yeah, your patch looks ok, i.e. parse_events_option() returning !0
should make trace__config() return -1.

But see below:

- Arnaldo

> trace__config() passes 1 to perf_config(); perf_config() doesn't treat

> the returned value 1 as failure and it continues to parse other

> configurations.  Thus the perf command continues to run even without

> enabling eBPF event successfully.

> 

> This patch changes error handling in trace__config(), when it detects

> failure it will return -1 rather than directly pass error value (1);

> finally, perf_config() will directly bail out and perf will exit for

> this case.

> 

> Signed-off-by: Leo Yan <leo.yan@linaro.org>

> ---

>  tools/perf/builtin-trace.c | 8 ++++++++

>  1 file changed, 8 insertions(+)

> 

> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c

> index 54b2d0fd0d02..4b5d004aab74 100644

> --- a/tools/perf/builtin-trace.c

> +++ b/tools/perf/builtin-trace.c

> @@ -3664,6 +3664,14 @@ static int trace__config(const char *var, const char *value, void *arg)

>  					       "event selector. use 'perf list' to list available events",

>  					       parse_events_option);

>  		err = parse_events_option(&o, value, 0);

> +

> +		/*

> +		 * When parse option successfully parse_events_option() will

> +		 * return 0, otherwise means the paring failure.  And it

> +		 * returns 1 for eBPF program building failure; so adjust the

> +		 * err value to -1 for the failure.

> +		 */

> +		err = err ? -1 : 0;


I'll rewrite the comment above to make it more succint and fix things
like 'paring' (parsing):

		/*
		 * parse_events_option() returns !0 to indicate failure
		 * while the perf_config code that calls trace__config()
		 * expects < 0 returns to indicate error, so:
		 */

		 if (err)
		 	err = -1;
>  	} else if (!strcmp(var, "trace.show_timestamp")) {

>  		trace->show_tstamp = perf_config_bool(var, value);

>  	} else if (!strcmp(var, "trace.show_duration")) {

> -- 

> 2.17.1


-- 

- Arnaldo
Arnaldo Carvalho de Melo June 6, 2019, 1:34 p.m. | #2
Em Thu, Jun 06, 2019 at 10:30:19AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Jun 06, 2019 at 05:48:42PM +0800, Leo Yan escreveu:

> > +++ b/tools/perf/builtin-trace.c

> > @@ -3664,6 +3664,14 @@ static int trace__config(const char *var, const char *value, void *arg)

> >  					       "event selector. use 'perf list' to list available events",

> >  					       parse_events_option);

> >  		err = parse_events_option(&o, value, 0);

> > +

> > +		/*

> > +		 * When parse option successfully parse_events_option() will

> > +		 * return 0, otherwise means the paring failure.  And it

> > +		 * returns 1 for eBPF program building failure; so adjust the

> > +		 * err value to -1 for the failure.

> > +		 */

> > +		err = err ? -1 : 0;

> 

> I'll rewrite the comment above to make it more succint and fix things

> like 'paring' (parsing):

> 

> 		/*

> 		 * parse_events_option() returns !0 to indicate failure

> 		 * while the perf_config code that calls trace__config()

> 		 * expects < 0 returns to indicate error, so:

> 		 */

> 

> 		 if (err)

> 		 	err = -1;


Even shorter, please let me know if I can keep your
Signed-off-by/authorship for this one.

- Arnaldo

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index f7e4e50bddbd..1a2a605cf068 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3703,7 +3703,12 @@ static int trace__config(const char *var, const char *value, void *arg)
 		struct option o = OPT_CALLBACK('e', "event", &trace->evlist, "event",
 					       "event selector. use 'perf list' to list available events",
 					       parse_events_option);
-		err = parse_events_option(&o, value, 0);
+		/*
+		 * We can't propagate parse_event_option() return, as it is 1
+		 * for failure while perf_config() expects -1.
+		 */
+		if (parse_events_option(&o, value, 0))
+			err = -1;
 	} else if (!strcmp(var, "trace.show_timestamp")) {
 		trace->show_tstamp = perf_config_bool(var, value);
 	} else if (!strcmp(var, "trace.show_duration")) {
Leo Yan June 6, 2019, 1:56 p.m. | #3
Hi Arnaldo,

On Thu, Jun 06, 2019 at 10:30:19AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 06, 2019 at 05:48:42PM +0800, Leo Yan escreveu:

> > On my Juno board with ARM64 CPUs, perf trace command reports the eBPF

> > program building failure but the command will not exit and continue to

> > run.  If we define an eBPF event in config file, the event will be

> > parsed with below flow:

> > 

> >   perf_config()

> >     `> trace__config()

> > 	 `> parse_events_option()

> > 	      `> parse_events__scanner()

> > 	           `-> parse_events_parse()

> > 	                 `> parse_events_load_bpf()

> > 	                      `> llvm__compile_bpf()

> > 

> > Though the low level functions return back error values when detect eBPF

> > building failure, but parse_events_option() returns 1 for this case and

> 

> (gdb) n

> parse_events__scanner (str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", parse_state=0x7fffffff7fa0,

>     start_token=258) at util/parse-events.c:1870

> 1870		parse_events__delete_buffer(buffer, scanner);

> (gdb) n

> 1871		parse_events_lex_destroy(scanner);

> (gdb) n

> 1872		return ret;

> (gdb) p ret

> $53 = 1

> (gdb) bt

> #0  parse_events__scanner (str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", parse_state=0x7fffffff7fa0,

>     start_token=258) at util/parse-events.c:1872

> #1  0x000000000050a926 in parse_events (evlist=0xb9e5d0, str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o",

>     err=0x7fffffff8020) at util/parse-events.c:1907

> #2  0x000000000050ad94 in parse_events_option (opt=0x7fffffff8080,

>     str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", unset=0) at util/parse-events.c:2007

> #3  0x0000000000497fa8 in trace__config (var=0x7fffffff8150 "trace.add_events",

>     value=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", arg=0x7fffffffa1c0) at builtin-trace.c:3706

> #4  0x00000000004e9a79 in perf_config (fn=0x497ee4 <trace__config>, data=0x7fffffffa1c0) at util/config.c:738

> #5  0x0000000000498c97 in cmd_trace (argc=2, argv=0x7fffffffd690) at builtin-trace.c:3865

> #6  0x00000000004d8c17 in run_builtin (p=0xa0e600 <commands+576>, argc=2, argv=0x7fffffffd690) at perf.c:303

> #7  0x00000000004d8e84 in handle_internal_command (argc=2, argv=0x7fffffffd690) at perf.c:355

> #8  0x00000000004d8fd3 in run_argv (argcp=0x7fffffffd4ec, argv=0x7fffffffd4e0) at perf.c:399

> #9  0x00000000004d933f in main (argc=2, argv=0x7fffffffd690) at perf.c:521

> (gdb)

> 

> So its parse_events__scanner() that returns 1, parse_events() propagate

> that and:

> 

> parse_events_option (opt=0x7fffffff8080, str=0xb9d170 "/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o", unset=0)

>     at util/parse-events.c:2009

> 2009		if (ret) {

> (gdb) p ret

> $56 = 1

> (gdb) n

> 2010			parse_events_print_error(&err, str);

> (gdb) n

> event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o'

>                      \___ Kernel verifier blocks program loading

> 

> (add -v to see detail)

> 2011			fprintf(stderr, "Run 'perf list' for a list of valid events\n");

> (gdb)

> 

> So the -4007 error is printed, and all we can say is that parsing events

> failed, but we end up not propagating that error back when we use

> parse_events_option(), we could use instead:

> 

>         struct parse_events_error err = { .idx = 0, };

>         int ret = parse_events(evlist, str, &err);

> 

> And make parse_events_error have the raw err, i.e. -4007 in this case:

> 

>         [ERRCODE_OFFSET(VERIFY)]        = "Kernel verifier blocks program loading",

> 

> In your case would be something else, I'm just trying to load the

> precompiled .o that does things the BPF kernel verifier doesn't like.


Yes, exactly.  My failure is compilation failure but not BPF verifier
failure.

> So yeah, your patch looks ok, i.e. parse_events_option() returning !0

> should make trace__config() return -1.

> 

> But see below:

> 

> - Arnaldo

> 

> > trace__config() passes 1 to perf_config(); perf_config() doesn't treat

> > the returned value 1 as failure and it continues to parse other

> > configurations.  Thus the perf command continues to run even without

> > enabling eBPF event successfully.

> > 

> > This patch changes error handling in trace__config(), when it detects

> > failure it will return -1 rather than directly pass error value (1);

> > finally, perf_config() will directly bail out and perf will exit for

> > this case.

> > 

> > Signed-off-by: Leo Yan <leo.yan@linaro.org>

> > ---

> >  tools/perf/builtin-trace.c | 8 ++++++++

> >  1 file changed, 8 insertions(+)

> > 

> > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c

> > index 54b2d0fd0d02..4b5d004aab74 100644

> > --- a/tools/perf/builtin-trace.c

> > +++ b/tools/perf/builtin-trace.c

> > @@ -3664,6 +3664,14 @@ static int trace__config(const char *var, const char *value, void *arg)

> >  					       "event selector. use 'perf list' to list available events",

> >  					       parse_events_option);

> >  		err = parse_events_option(&o, value, 0);

> > +

> > +		/*

> > +		 * When parse option successfully parse_events_option() will

> > +		 * return 0, otherwise means the paring failure.  And it

> > +		 * returns 1 for eBPF program building failure; so adjust the

> > +		 * err value to -1 for the failure.

> > +		 */

> > +		err = err ? -1 : 0;

> 

> I'll rewrite the comment above to make it more succint and fix things

> like 'paring' (parsing):

> 

> 		/*

> 		 * parse_events_option() returns !0 to indicate failure

> 		 * while the perf_config code that calls trace__config()

> 		 * expects < 0 returns to indicate error, so:

> 		 */

> 

> 		 if (err)

> 		 	err = -1;


This looks good to me.  Thanks a lot for the reviewing.

Leo.

> >  	} else if (!strcmp(var, "trace.show_timestamp")) {

> >  		trace->show_tstamp = perf_config_bool(var, value);

> >  	} else if (!strcmp(var, "trace.show_duration")) {

> > -- 

> > 2.17.1

> 

> -- 

> 

> - Arnaldo
Leo Yan June 10, 2019, 7:38 a.m. | #4
On Thu, Jun 06, 2019 at 10:34:24AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jun 06, 2019 at 10:30:19AM -0300, Arnaldo Carvalho de Melo escreveu:

> > Em Thu, Jun 06, 2019 at 05:48:42PM +0800, Leo Yan escreveu:

> > > +++ b/tools/perf/builtin-trace.c

> > > @@ -3664,6 +3664,14 @@ static int trace__config(const char *var, const char *value, void *arg)

> > >  					       "event selector. use 'perf list' to list available events",

> > >  					       parse_events_option);

> > >  		err = parse_events_option(&o, value, 0);

> > > +

> > > +		/*

> > > +		 * When parse option successfully parse_events_option() will

> > > +		 * return 0, otherwise means the paring failure.  And it

> > > +		 * returns 1 for eBPF program building failure; so adjust the

> > > +		 * err value to -1 for the failure.

> > > +		 */

> > > +		err = err ? -1 : 0;

> > 

> > I'll rewrite the comment above to make it more succint and fix things

> > like 'paring' (parsing):

> > 

> > 		/*

> > 		 * parse_events_option() returns !0 to indicate failure

> > 		 * while the perf_config code that calls trace__config()

> > 		 * expects < 0 returns to indicate error, so:

> > 		 */

> > 

> > 		 if (err)

> > 		 	err = -1;

> 

> Even shorter, please let me know if I can keep your

> Signed-off-by/authorship for this one.


Sorry I miss this email.

> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c

> index f7e4e50bddbd..1a2a605cf068 100644

> --- a/tools/perf/builtin-trace.c

> +++ b/tools/perf/builtin-trace.c

> @@ -3703,7 +3703,12 @@ static int trace__config(const char *var, const char *value, void *arg)

>  		struct option o = OPT_CALLBACK('e', "event", &trace->evlist, "event",

>  					       "event selector. use 'perf list' to list available events",

>  					       parse_events_option);

> -		err = parse_events_option(&o, value, 0);

> +		/*

> +		 * We can't propagate parse_event_option() return, as it is 1

> +		 * for failure while perf_config() expects -1.

> +		 */

> +		if (parse_events_option(&o, value, 0))

> +			err = -1;

>  	} else if (!strcmp(var, "trace.show_timestamp")) {

>  		trace->show_tstamp = perf_config_bool(var, value);

>  	} else if (!strcmp(var, "trace.show_duration")) {



Yeah, the change looks good to me. And very appreciate your effort to
improve the patch quality.

Thanks,
Leo.

Patch

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 54b2d0fd0d02..4b5d004aab74 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -3664,6 +3664,14 @@  static int trace__config(const char *var, const char *value, void *arg)
 					       "event selector. use 'perf list' to list available events",
 					       parse_events_option);
 		err = parse_events_option(&o, value, 0);
+
+		/*
+		 * When parse option successfully parse_events_option() will
+		 * return 0, otherwise means the paring failure.  And it
+		 * returns 1 for eBPF program building failure; so adjust the
+		 * err value to -1 for the failure.
+		 */
+		err = err ? -1 : 0;
 	} else if (!strcmp(var, "trace.show_timestamp")) {
 		trace->show_tstamp = perf_config_bool(var, value);
 	} else if (!strcmp(var, "trace.show_duration")) {