Message ID | 20200914130143.1322802-4-daniel.thompson@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | kgdb: Honour the kprobe blocklist when setting breakpoints | expand |
Hi, On Mon, Sep 14, 2020 at 6:02 AM Daniel Thompson <daniel.thompson@linaro.org> wrote: > > During debug trap execution we expect dbg_deactivate_sw_breakpoints() > to be paired with an dbg_activate_sw_breakpoint(). Currently although > the calls are paired correctly they are needlessly smeared across three > different functions. Worse this also results in code to drive polled I/O > being called with breakpoints activated which, in turn, needlessly > increases the set of functions that will recursively trap if breakpointed. > > Fix this by moving the activation of breakpoints into the debug core. > > Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> > --- > kernel/debug/debug_core.c | 2 ++ > kernel/debug/gdbstub.c | 1 - > kernel/debug/kdb/kdb_debugger.c | 2 -- > 3 files changed, 2 insertions(+), 3 deletions(-) I like the idea, but previously the kgdb_arch_handle_exception() was always called after the SW breakpoints were activated. Are you sure it's OK to swap those two orders across all architectures? -Doug
On Mon, Sep 14, 2020 at 05:13:28PM -0700, Doug Anderson wrote: > Hi, > > On Mon, Sep 14, 2020 at 6:02 AM Daniel Thompson > <daniel.thompson@linaro.org> wrote: > > > > During debug trap execution we expect dbg_deactivate_sw_breakpoints() > > to be paired with an dbg_activate_sw_breakpoint(). Currently although > > the calls are paired correctly they are needlessly smeared across three > > different functions. Worse this also results in code to drive polled I/O > > being called with breakpoints activated which, in turn, needlessly > > increases the set of functions that will recursively trap if breakpointed. > > > > Fix this by moving the activation of breakpoints into the debug core. > > > > Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> > > --- > > kernel/debug/debug_core.c | 2 ++ > > kernel/debug/gdbstub.c | 1 - > > kernel/debug/kdb/kdb_debugger.c | 2 -- > > 3 files changed, 2 insertions(+), 3 deletions(-) > > I like the idea, but previously the kgdb_arch_handle_exception() was > always called after the SW breakpoints were activated. Are you sure > it's OK to swap those two orders across all architectures? Pretty sure, yes. However, given the poor attention to detail I demonstrated in patch 2/3, I figure I'd better write out the full chain of reasoning if I want you to trust me ;-) . kgdb_arch_handle_exception() is already called frequently with breakpoints disabled since it is basically a fallback that is called whenever the architecture neutral parts of the gdb packet processing cannot cope. So your question becomes more specific: is it OK to swap orders when the architecture code is handling a (c)ontinue or (s)tep packet[1]? The reason the architecture neutral part cannot cope is because it because *if* gdb has been instructed that the PC must be changed before resuming then the architecture neutral code does not know how to effect this. In other words the call to kgdb_arch_handle_exception() will boil down to doing whatever the architecture equivalent of writing to linux_regs->pc actually is (or return an error). Updating the PC is safe whether or not breakpoints are deactivated and activating the breakpoints if the architecture code actually censors the resume is a bug (which we just fixed). Daniel. [1] The fallthroughs aren't a whole lot of fun to read but if gdb_cmd_exception_pass() provokes a fallthrough then it will have rewritten the packet as a (c)ontinue.
Hi, On Tue, Sep 15, 2020 at 6:45 AM Daniel Thompson <daniel.thompson@linaro.org> wrote: > > On Mon, Sep 14, 2020 at 05:13:28PM -0700, Doug Anderson wrote: > > Hi, > > > > On Mon, Sep 14, 2020 at 6:02 AM Daniel Thompson > > <daniel.thompson@linaro.org> wrote: > > > > > > During debug trap execution we expect dbg_deactivate_sw_breakpoints() > > > to be paired with an dbg_activate_sw_breakpoint(). Currently although > > > the calls are paired correctly they are needlessly smeared across three > > > different functions. Worse this also results in code to drive polled I/O > > > being called with breakpoints activated which, in turn, needlessly > > > increases the set of functions that will recursively trap if breakpointed. > > > > > > Fix this by moving the activation of breakpoints into the debug core. > > > > > > Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> > > > --- > > > kernel/debug/debug_core.c | 2 ++ > > > kernel/debug/gdbstub.c | 1 - > > > kernel/debug/kdb/kdb_debugger.c | 2 -- > > > 3 files changed, 2 insertions(+), 3 deletions(-) > > > > I like the idea, but previously the kgdb_arch_handle_exception() was > > always called after the SW breakpoints were activated. Are you sure > > it's OK to swap those two orders across all architectures? > > Pretty sure, yes. > > However, given the poor attention to detail I demonstrated in patch 2/3, > I figure I'd better write out the full chain of reasoning if I want > you to trust me ;-) . > > kgdb_arch_handle_exception() is already called frequently with > breakpoints disabled since it is basically a fallback that is called > whenever the architecture neutral parts of the gdb packet processing > cannot cope. > > So your question becomes more specific: is it OK to swap orders when the > architecture code is handling a (c)ontinue or (s)tep packet[1]? > > The reason the architecture neutral part cannot cope is because it > because *if* gdb has been instructed that the PC must be changed before > resuming then the architecture neutral code does not know how to effect > this. In other words the call to kgdb_arch_handle_exception() will > boil down to doing whatever the architecture equivalent of writing to > linux_regs->pc actually is (or return an error). > > Updating the PC is safe whether or not breakpoints are deactivated > and activating the breakpoints if the architecture code actually > censors the resume is a bug (which we just fixed). OK, fair enough. Without digging into all the arch code, I was assuming that some arch somewhere could be playing tricks on us. After all, the arch code is told about all the breakpoints (kgdb_arch_set_breakpoint) so, in theory, it could be doing something funky (not applying the breakpoint until it sees the "continue" or something?). I guess the one thing that seems the most plausible that an architecture might be doing is doing something special if it is told to "continue" or "single step" an address that had a breakpoint on it. I guess I could imagine that on some architectures this might require special handling (could it be somehow illegal to run the CPU in step mode over a breakpoint instruction?). ...or maybe if it was using hardware breakpoints those don't trigger properly if you're continuing to the address of the breakpoint? I'm making stuff up here and presumably none of this is true, but it's what I was worried about. From a quick glance, I don't _think_ anyone is doing this. Presumably today they'd either need a) a funky implementation for kgdb_arch_set_breakpoint() or they'd need b) code somewhere which read memory and looked for "gdb_bpt_instr". a) Looking at kgdb_arch_set_breakpoint(), I don't see anything too funky. They all look like nearly the same code of writing the breakpoint to memory, perhaps taking into account locked .text space by using a special patch routine. b) Looking at users of "gdb_bpt_instr" I do see some funkiness in "h8300" and "microblaze", but nothing that looks like it fits the bill of what we're looking for. In any case, even if someone was doing something funky, it would be possible to adapt the code to the new way of the world. You'd just check the PC when applying breakpoints rather than checking the breakpoints when continuing. So I'm going to go ahead and say: Reviewed-by: Douglas Anderson <dianders@chromium.org> ...and I guess if it really truly causes problems for someone we'll figure it out. Feel free to ignore the below. I wrote it up but realized it probably wasn't useful but couldn't bear to delete it. :-P One somewhat plausible example (I don't think this happens, so feel free to let your eyes glaze over and ignore). Assume that the kernel has a mix of ARM and Thumb code. When told to set a breakpoint at an address the kernel wouldn't know whether the address refers to ARM or Thumb code. In the past I've solved this type of problem by using an instruction as a breakpoint that is an illegal instruction when interpreted any which way (the instruction is illegal as an ARM instruction and both halves illegal as a Thumb instruction). Right when we're about to continue, though, we actually have extra information about the PC we're going to continue to. If we know we're about to continue in Thumb mode and the address we're going to continue on is the 2nd half of a breakpoint instruction we suddenly know that the breakpoint should have been a Thumb-mode instruction and we could fix it up. AKA: 1. kernel is asked to set a breakpoint at 0x12340000. We don't know if this is arm or thumb so we set a 32-bit (ARM) breakpoint at 0x12340000 2. We're told to continue in thumb mode at address 0x12340002 3. We suddenly know that the breakpoint at 0x12340000 should have been a 16-bit (Thumb) breakpoint, not a 32-bit (ARM) breakpoint, so we could fix it up before continuing. OK, that probably was just confusing, and, like I said, I don't think kdb in Linux does that. ;-) -Doug
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c index 9618c1e2faf6..74d42d3f3180 100644 --- a/kernel/debug/debug_core.c +++ b/kernel/debug/debug_core.c @@ -763,6 +763,8 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs, } } + dbg_activate_sw_breakpoints(); + /* Call the I/O driver's post_exception routine */ if (dbg_io_ops->post_exception) dbg_io_ops->post_exception(); diff --git a/kernel/debug/gdbstub.c b/kernel/debug/gdbstub.c index cc3c43dfec44..e9a9c3097089 100644 --- a/kernel/debug/gdbstub.c +++ b/kernel/debug/gdbstub.c @@ -1061,7 +1061,6 @@ int gdb_serial_stub(struct kgdb_state *ks) error_packet(remcom_out_buffer, -EINVAL); break; } - dbg_activate_sw_breakpoints(); fallthrough; /* to default processing */ default: default_handle: diff --git a/kernel/debug/kdb/kdb_debugger.c b/kernel/debug/kdb/kdb_debugger.c index 53a0df6e4d92..0220afda3200 100644 --- a/kernel/debug/kdb/kdb_debugger.c +++ b/kernel/debug/kdb/kdb_debugger.c @@ -147,7 +147,6 @@ int kdb_stub(struct kgdb_state *ks) return DBG_PASS_EVENT; } kdb_bp_install(ks->linux_regs); - dbg_activate_sw_breakpoints(); /* Set the exit state to a single step or a continue */ if (KDB_STATE(DOING_SS)) gdbstub_state(ks, "s"); @@ -167,7 +166,6 @@ int kdb_stub(struct kgdb_state *ks) * differently vs the gdbstub */ kgdb_single_step = 0; - dbg_deactivate_sw_breakpoints(); return DBG_SWITCH_CPU_EVENT; } return kgdb_info[ks->cpu].ret_state;
During debug trap execution we expect dbg_deactivate_sw_breakpoints() to be paired with an dbg_activate_sw_breakpoint(). Currently although the calls are paired correctly they are needlessly smeared across three different functions. Worse this also results in code to drive polled I/O being called with breakpoints activated which, in turn, needlessly increases the set of functions that will recursively trap if breakpointed. Fix this by moving the activation of breakpoints into the debug core. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> --- kernel/debug/debug_core.c | 2 ++ kernel/debug/gdbstub.c | 1 - kernel/debug/kdb/kdb_debugger.c | 2 -- 3 files changed, 2 insertions(+), 3 deletions(-) -- 2.25.4