Message ID | 1457949585-191064-3-git-send-email-wangnan0@huawei.com |
---|---|
State | Superseded |
On Wed, Mar 23, 2016 at 06:50:21PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 14, 2016 at 09:59:42AM +0000, Wang Nan wrote:
> > +++ b/arch/arm/kernel/hw_breakpoint.c
> > @@ -631,7 +631,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
> >  	info->address &= ~alignment_mask;
> >  	info->ctrl.len <<= offset;
> >
> > -	if (!bp->overflow_handler) {
> > +	if (is_default_overflow_handler(bp)) {
> >  		/*
> >  		 * Mismatch breakpoints are required for single-stepping
> >  		 * breakpoints.
> > @@ -754,7 +754,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr,
> >  		 * mismatch breakpoint so we can single-step over the
> >  		 * watchpoint trigger.
> >  		 */
> > -		if (!wp->overflow_handler)
> > +		if (is_default_overflow_handler(wp))
> >  			enable_single_step(wp, instruction_pointer(regs));
> >
> >  unlock:
> > diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
> > index b45c95d..4ef5373 100644
> > --- a/arch/arm64/kernel/hw_breakpoint.c
> > +++ b/arch/arm64/kernel/hw_breakpoint.c
> > @@ -616,7 +616,7 @@ static int breakpoint_handler(unsigned long unused, unsigned int esr,
> >  			perf_bp_event(bp, regs);
> >
> >  		/* Do we need to handle the stepping? */
> > -		if (!bp->overflow_handler)
> > +		if (is_default_overflow_handler(bp))
> >  			step = 1;
> >  unlock:
> >  		rcu_read_unlock();
> > @@ -712,7 +712,7 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
> >  			perf_bp_event(wp, regs);
> >
> >  		/* Do we need to handle the stepping? */
> > -		if (!wp->overflow_handler)
> > +		if (is_default_overflow_handler(wp))
> >  			step = 1;
> >
> >  unlock:
>
> Will, why does it matter what the overflow handler is for this stuff?

Because ptrace registers an overflow handler for raising a SIGTRAP and
ptrace users (e.g. GDB) expect to handle the single-stepping themselves.
Perf, on the other hand, will livelock if the kernel doesn't do the
stepping.

FWIW, I hate this whole thing. The only users of the perf side just seem
to be people running whacky test cases and then pointing out the places
where we're not identical to x86 :(

Will
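The distinction drawn in the reply above can be illustrated with a small user-space model in C. This is only a sketch, not kernel code: `struct bp_event`, `default_output`, `ptrace_style_handler` and `kernel_must_step` are hypothetical stand-ins for `struct perf_event`, `perf_event_output`, the handler ptrace registers, and the arch breakpoint handlers. It assumes the post-patch behaviour, where the handler pointer is never NULL and the kernel steps only when the default handler is installed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_event; only the fields we need. */
struct bp_event;
typedef void (*overflow_handler_t)(struct bp_event *ev);

struct bp_event {
	overflow_handler_t overflow_handler;
	int samples_written;	/* bumped by the default (perf-style) handler */
	bool sigtrap_pending;	/* set by the ptrace-style handler */
};

/* Stand-in for perf_event_output(): records a sample, nothing else. */
static void default_output(struct bp_event *ev)
{
	ev->samples_written++;
}

/* Stand-in for a ptrace-style handler: the debugger gets a SIGTRAP
 * and is expected to do the single-stepping itself. */
static void ptrace_style_handler(struct bp_event *ev)
{
	ev->sigtrap_pending = true;
}

/* Mirrors is_default_overflow_handler() from the patch. */
static bool is_default_handler(const struct bp_event *ev)
{
	return ev->overflow_handler == default_output;
}

/* Models the arch handler: run the overflow handler, then report
 * whether the kernel itself has to step over the trigger. */
static bool kernel_must_step(struct bp_event *ev)
{
	assert(ev->overflow_handler != NULL);
	ev->overflow_handler(ev);
	return is_default_handler(ev);
}
```

With a perf-style event `kernel_must_step()` returns true, so the trigger is stepped over and cannot fire forever; with the ptrace-style handler it returns false and the stepping is left to the debugger.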
On Wed, Mar 23, 2016 at 08:29:38PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 23, 2016 at 06:13:49PM +0000, Will Deacon wrote:
> > On Wed, Mar 23, 2016 at 06:50:21PM +0100, Peter Zijlstra wrote:
> > > On Mon, Mar 14, 2016 at 09:59:42AM +0000, Wang Nan wrote:
> > > > +++ b/arch/arm/kernel/hw_breakpoint.c
> > > > @@ -631,7 +631,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
> > > >  	info->address &= ~alignment_mask;
> > > >  	info->ctrl.len <<= offset;
> > > >
> > > > -	if (!bp->overflow_handler) {
> > > > +	if (is_default_overflow_handler(bp)) {
> > > >  		/*
> > > >  		 * Mismatch breakpoints are required for single-stepping
> > > >  		 * breakpoints.
> > > > @@ -754,7 +754,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr,
> > > >  		 * mismatch breakpoint so we can single-step over the
> > > >  		 * watchpoint trigger.
> > > >  		 */
> > > > -		if (!wp->overflow_handler)
> > > > +		if (is_default_overflow_handler(wp))
> > > >  			enable_single_step(wp, instruction_pointer(regs));
> > > >
> > > >  unlock:
> > > > diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
> > > > index b45c95d..4ef5373 100644
> > > > --- a/arch/arm64/kernel/hw_breakpoint.c
> > > > +++ b/arch/arm64/kernel/hw_breakpoint.c
> > > > @@ -616,7 +616,7 @@ static int breakpoint_handler(unsigned long unused, unsigned int esr,
> > > >  			perf_bp_event(bp, regs);
> > > >
> > > >  		/* Do we need to handle the stepping? */
> > > > -		if (!bp->overflow_handler)
> > > > +		if (is_default_overflow_handler(bp))
> > > >  			step = 1;
> > > >  unlock:
> > > >  		rcu_read_unlock();
> > > > @@ -712,7 +712,7 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
> > > >  			perf_bp_event(wp, regs);
> > > >
> > > >  		/* Do we need to handle the stepping? */
> > > > -		if (!wp->overflow_handler)
> > > > +		if (is_default_overflow_handler(wp))
> > > >  			step = 1;
> > > >
> > > >  unlock:
> > >
> > > Will, why does it matter what the overflow handler is for this stuff?
> >
> > Because ptrace registers an overflow handler for raising a SIGTRAP and
> > ptrace users (e.g. GDB) expect to handle the single-stepping themselves.
> > Perf, on the other hand, will livelock if the kernel doesn't do the
> > stepping.
>
> Would it, perhaps, make sense to invert this test and check for
> ->overflow_handler == ptrace_hbptriggered instead? That way nobody gets
> surprise live-locks, endlessly triggering the same trap.

Not sure... I can imagine kgdb, for example, wanting to handle the
stepping itself. You also need to play clever tricks if you want to step
through LL/SC atomics, which the code here doesn't even try to handle
(because it involves disassembling the instructions and applying a bunch
of heuristics), so I imagine most debuggers wanting to take care of the
step themselves.

> But yes, this kinda blows.

Yup.

Will
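The two candidate tests discussed in this exchange differ only for handlers that are neither the default nor ptrace's. The following user-space sketch in C makes that visible. All names are hypothetical stand-ins (`default_output` for `perf_event_output`, `ptrace_style_handler` for `ptrace_hbptriggered`, `other_debugger_handler` for something like a kgdb-owned handler); it is a model of the pointer comparisons only, not of the real breakpoint code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_event. */
struct bp_event;
typedef void (*overflow_handler_t)(struct bp_event *ev);

struct bp_event {
	overflow_handler_t overflow_handler;
	char last_handler;	/* tag written by whichever handler ran */
};

/* Three distinct handlers; the bodies differ so they are clearly
 * separate functions with separate addresses. */
static void default_output(struct bp_event *ev)         { ev->last_handler = 'o'; }
static void ptrace_style_handler(struct bp_event *ev)   { ev->last_handler = 'p'; }
static void other_debugger_handler(struct bp_event *ev) { ev->last_handler = 'd'; }

/* Test as posted in the patch: step only for the default handler. */
static bool step_if_default(const struct bp_event *ev)
{
	assert(ev != NULL);
	return ev->overflow_handler == default_output;
}

/* Inverted test floated in the thread: step unless ptrace owns the event. */
static bool step_unless_ptrace(const struct bp_event *ev)
{
	assert(ev != NULL);
	return ev->overflow_handler != ptrace_style_handler;
}
```

Both predicates agree for the default and ptrace-style handlers. For the third handler, `step_if_default()` leaves stepping to the debugger while `step_unless_ptrace()` would have the kernel step, which is the case the reply above is wary of.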
diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
index 6284779..b8df458 100644
--- a/arch/arm/kernel/hw_breakpoint.c
+++ b/arch/arm/kernel/hw_breakpoint.c
@@ -631,7 +631,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
 	info->address &= ~alignment_mask;
 	info->ctrl.len <<= offset;
 
-	if (!bp->overflow_handler) {
+	if (is_default_overflow_handler(bp)) {
 		/*
 		 * Mismatch breakpoints are required for single-stepping
 		 * breakpoints.
@@ -754,7 +754,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr,
 		 * mismatch breakpoint so we can single-step over the
 		 * watchpoint trigger.
 		 */
-		if (!wp->overflow_handler)
+		if (is_default_overflow_handler(wp))
 			enable_single_step(wp, instruction_pointer(regs));
 
 unlock:
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index b45c95d..4ef5373 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -616,7 +616,7 @@ static int breakpoint_handler(unsigned long unused, unsigned int esr,
 			perf_bp_event(bp, regs);
 
 		/* Do we need to handle the stepping? */
-		if (!bp->overflow_handler)
+		if (is_default_overflow_handler(bp))
 			step = 1;
 unlock:
 		rcu_read_unlock();
@@ -712,7 +712,7 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
 			perf_bp_event(wp, regs);
 
 		/* Do we need to handle the stepping? */
-		if (!wp->overflow_handler)
+		if (is_default_overflow_handler(wp))
 			step = 1;
 
 unlock:
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index a9d8cab..d5f99cd 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -833,6 +833,12 @@ extern void perf_event_output(struct perf_event *event,
 				struct perf_sample_data *data,
 				struct pt_regs *regs);
 
+static inline bool
+is_default_overflow_handler(struct perf_event *event)
+{
+	return (event->overflow_handler == perf_event_output);
+}
+
 extern void
 perf_event_header__init_id(struct perf_event_header *header,
 			   struct perf_sample_data *data,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1a1312e..ed69532 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6467,10 +6467,7 @@ static int __perf_event_overflow(struct perf_event *event,
 		irq_work_queue(&event->pending);
 	}
 
-	if (event->overflow_handler)
-		event->overflow_handler(event, data, regs);
-	else
-		perf_event_output(event, data, regs);
+	event->overflow_handler(event, data, regs);
 
 	if (*perf_event_fasync(event) && event->pending_kill) {
 		event->pending_wakeup = 1;
@@ -7963,8 +7960,13 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		context = parent_event->overflow_handler_context;
 	}
 
-	event->overflow_handler = overflow_handler;
-	event->overflow_handler_context = context;
+	if (overflow_handler) {
+		event->overflow_handler = overflow_handler;
+		event->overflow_handler_context = context;
+	} else {
+		event->overflow_handler = perf_event_output;
+		event->overflow_handler_context = NULL;
+	}
 
 	perf_event__state_init(event);
 
Set a default event->overflow_handler in perf_event_alloc() so that
__perf_event_overflow() no longer needs to check event->overflow_handler.
Following commits can then install a different default overflow_handler.

No extra overhead is introduced into the hot path, because the original
code already had to read this handler from memory. A conditional branch
is avoided, so this actually removes some instructions.

The initial idea comes from Peter at [1].

Since the default value of event->overflow_handler is no longer NULL,
the existing 'if (!overflow_handler)' checks need to be changed.
is_default_overflow_handler() is introduced for this.

[1] http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
---
 arch/arm/kernel/hw_breakpoint.c   |  4 ++--
 arch/arm64/kernel/hw_breakpoint.c |  4 ++--
 include/linux/perf_event.h        |  6 ++++++
 kernel/events/core.c              | 14 ++++++++------
 4 files changed, 18 insertions(+), 10 deletions(-)

-- 
1.8.3.4
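The pattern described in the commit message (install a default handler at allocation time so the hot path can call through the pointer unconditionally) can be sketched in plain user-space C. This is a minimal model under stated assumptions, not the kernel implementation: `struct event`, `event_init`, `event_overflow`, `default_output` and `custom_handler` are hypothetical stand-ins for `struct perf_event`, `perf_event_alloc()`, `__perf_event_overflow()`, `perf_event_output()` and a caller-supplied handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct perf_event. */
struct event;
typedef void (*overflow_handler_t)(struct event *ev);

struct event {
	overflow_handler_t overflow_handler;
	void *overflow_handler_context;
	int default_calls;	/* times the default handler ran */
	int custom_calls;	/* times a caller-supplied handler ran */
};

/* Stand-in for perf_event_output(), the default handler. */
static void default_output(struct event *ev)
{
	ev->default_calls++;
}

/* Example of a caller-supplied handler. */
static void custom_handler(struct event *ev)
{
	ev->custom_calls++;
}

/* Models the perf_event_alloc() hunk: a NULL handler is replaced by
 * the default, so the pointer is never NULL after initialisation. */
static void event_init(struct event *ev, overflow_handler_t handler,
		       void *context)
{
	ev->default_calls = 0;
	ev->custom_calls = 0;
	if (handler) {
		ev->overflow_handler = handler;
		ev->overflow_handler_context = context;
	} else {
		ev->overflow_handler = default_output;
		ev->overflow_handler_context = NULL;
	}
}

/* Models the __perf_event_overflow() hunk: one indirect call, no
 * NULL check. The assert only documents the invariant. */
static void event_overflow(struct event *ev)
{
	assert(ev->overflow_handler != NULL);
	ev->overflow_handler(ev);
}
```

Calling `event_init(&ev, NULL, ctx)` followed by `event_overflow(&ev)` runs `default_output`, while passing `custom_handler` routes the call there instead. In both cases the overflow path is a single indirect call.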