[v2] arm: Fix backtrace generation when IPI is masked

Message ID 1442315112-14039-1-git-send-email-daniel.thompson@linaro.org
State New

Commit Message

Daniel Thompson Sept. 15, 2015, 11:05 a.m.
Currently on ARM when <SysRq-L> is triggered from an interrupt handler
(e.g. a SysRq issued using UART or kbd) the main CPU will wedge for ten
seconds with interrupts masked before issuing a backtrace for every CPU
except itself.

The new backtrace code introduced by commit 96f0e00378d4 ("ARM: add
basic support for on-demand backtrace of other CPUs") does not work
correctly when run from an interrupt handler because IPI_CPU_BACKTRACE
is used to generate the backtrace on all CPUs but cannot preempt the
current calling context.

This can be fixed by detecting that the calling context cannot be
preempted and issuing the backtrace directly in this case. Issuing
directly leaves us without any pt_regs to pass to nmi_cpu_backtrace()
so we also modify the generic code to call dump_stack() when its
argument is NULL.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
---

Notes:
    Changes in v2:
    
    * Improved commit message to better describe the changes to the generic
      code (Hillf Danton).

 arch/arm/kernel/smp.c | 7 +++++++
 lib/nmi_backtrace.c   | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

--
2.4.3
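
The two decisions described in the commit message can be sketched in
userspace C. This is only a rough model under stated assumptions, not the
kernel code: struct model_regs, the dump_kind enum and both model_*
functions are hypothetical names invented for illustration, and the CPU
mask is reduced to a plain unsigned long with one bit per CPU.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pt_regs. */
struct model_regs {
	unsigned long pc;
};

/* Which kind of dump the model would produce (illustrative only). */
enum dump_kind { DUMP_SHOW_REGS, DUMP_STACK };

/*
 * Models the branch added to nmi_cpu_backtrace(): use the register
 * snapshot when one is available, otherwise fall back to a plain
 * stack dump of the current context.
 */
static enum dump_kind model_choose_dump(const struct model_regs *regs)
{
	return regs ? DUMP_SHOW_REGS : DUMP_STACK;
}

/*
 * Models the test added to raise_nmi(): the backtrace is generated
 * directly only when the calling CPU is one of the targets and
 * interrupts are masked, i.e. the IPI could not preempt the caller.
 */
static bool model_must_dump_directly(unsigned long mask,
				     unsigned int this_cpu, bool irqs_off)
{
	/* Sanity check: the CPU number must fit in the one-word mask. */
	assert(this_cpu < sizeof(mask) * CHAR_BIT);

	return ((mask >> this_cpu) & 1UL) && irqs_off;
}
```

With a mask of 0xF, CPU 0 and interrupts masked, model_must_dump_directly()
returns true, and the direct call then passes NULL, so model_choose_dump()
selects the stack-dump path.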

Comments

Russell King - ARM Linux Sept. 15, 2015, 11:30 a.m. | #1
On Tue, Sep 15, 2015 at 12:05:12PM +0100, Daniel Thompson wrote:
> Currently on ARM when <SysRq-L> is triggered from an interrupt handler
> (e.g. a SysRq issued using UART or kbd) the main CPU will wedge for ten
> seconds with interrupts masked before issuing a backtrace for every CPU
> except itself.
> 
> The new backtrace code introduced by commit 96f0e00378d4 ("ARM: add
> basic support for on-demand backtrace of other CPUs") does not work
> correctly when run from an interrupt handler because IPI_CPU_BACKTRACE
> is used to generate the backtrace on all CPUs but cannot preempt the
> current calling context.

This patch needs a little more work - what happens to the IPI_CPU_BACKTRACE
we've sent to ourselves?  (It fires after the interrupt handler for the
UART/kbd has finished.)  It ought to be masked out if we're going to
handle it a different way.
Daniel Thompson Sept. 15, 2015, 1:15 p.m. | #2
On 15/09/15 12:30, Russell King - ARM Linux wrote:
> On Tue, Sep 15, 2015 at 12:05:12PM +0100, Daniel Thompson wrote:
>> Currently on ARM when <SysRq-L> is triggered from an interrupt handler
>> (e.g. a SysRq issued using UART or kbd) the main CPU will wedge for ten
>> seconds with interrupts masked before issuing a backtrace for every CPU
>> except itself.
>>
>> The new backtrace code introduced by commit 96f0e00378d4 ("ARM: add
>> basic support for on-demand backtrace of other CPUs") does not work
>> correctly when run from an interrupt handler because IPI_CPU_BACKTRACE
>> is used to generate the backtrace on all CPUs but cannot preempt the
>> current calling context.
>
> This patch needs a little more work - what happens to the IPI_CPU_BACKTRACE
> we've sent to ourselves?  (It fires after the interrupt handler for the
> UART/kbd has finished.)  It ought to be masked out if we're going to
> handle it a different way.

Actually it already gets masked out. The argument to raise_nmi() points 
to a data structure owned by the backtrace library functions and this 
structure is altered during the execution of nmi_cpu_backtrace() to 
clear the calling CPU.

I had originally planned to use cpumask_test_and_clear_cpu() for the 
conditional branch but that would be broken because nmi_cpu_backtrace() 
would become a nop if we clear anything from the mask before calling it!

I guess I should add a comment about this to save us from broken but 
"obviously correct" cleanups in the future...


Daniel.
Russell King - ARM Linux Sept. 15, 2015, 1:42 p.m. | #3
On Tue, Sep 15, 2015 at 02:15:10PM +0100, Daniel Thompson wrote:
> Actually it already gets masked out. The argument to raise_nmi() points to a
> data structure owned by the backtrace library functions and this structure
> is altered during the execution of nmi_cpu_backtrace() to clear the calling
> CPU.
> 
> I had originally planned to use cpumask_test_and_clear_cpu() for the
> conditional branch but that would be broken because nmi_cpu_backtrace()
> would become a nop if we clear anything from the mask before calling it!
> 
> I guess I should add a comment about this to save us from broken but
> "obviously correct" cleanups in the future...

Absolutely.
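
The mask behaviour discussed in this exchange can be illustrated with a
small userspace model. It is a sketch under assumptions, not the kernel
implementation: model_backtrace_mask, MODEL_NR_CPUS and all model_*
functions are hypothetical, and a single unsigned long stands in for the
cpumask owned by the backtrace library.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* One bit per CPU that still has to report a backtrace. */
static unsigned long model_backtrace_mask;

/*
 * Models nmi_cpu_backtrace(): it only does work if the CPU's bit is
 * set in the shared mask, and it clears that bit when it is done.
 */
static bool model_nmi_cpu_backtrace(unsigned int cpu)
{
	unsigned long bit = 1UL << cpu;

	assert(cpu < MODEL_NR_CPUS);
	if (!(model_backtrace_mask & bit))
		return false;		/* not requested: no-op */
	model_backtrace_mask &= ~bit;	/* caller is now "masked out" */
	return true;
}

/* Variant matching the patch: test the bit, leave clearing to the callee. */
static bool model_raise_test_only(unsigned int cpu)
{
	if (model_backtrace_mask & (1UL << cpu))
		return model_nmi_cpu_backtrace(cpu);
	return false;
}

/* Broken variant: test-and-clear first, so the callee sees a clear bit. */
static bool model_raise_test_and_clear(unsigned int cpu)
{
	unsigned long bit = 1UL << cpu;
	bool was_set = (model_backtrace_mask & bit) != 0;

	model_backtrace_mask &= ~bit;
	if (was_set)
		return model_nmi_cpu_backtrace(cpu);
	return false;
}
```

In this model the test-only variant produces a backtrace and leaves the
calling CPU's bit cleared, so a later self-IPI would find nothing to do.
The test-and-clear variant clears the bit too early and the backtrace for
the calling CPU is silently skipped.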

Patch

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 48185a773852..4d8a80328c74 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -748,6 +748,13 @@  core_initcall(register_cpufreq_notifier);

 static void raise_nmi(cpumask_t *mask)
 {
+	/*
+	 * Generate the backtrace directly if we are running in a
+	 * calling context that is not preemptible by the backtrace IPI.
+	 */
+	if (cpumask_test_cpu(smp_processor_id(), mask) && irqs_disabled())
+		nmi_cpu_backtrace(NULL);
+
 	smp_cross_call(mask, IPI_CPU_BACKTRACE);
 }

diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 88d3d32e5923..be0466a80d0b 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -149,7 +149,10 @@  bool nmi_cpu_backtrace(struct pt_regs *regs)
 		/* Replace printk to write into the NMI seq */
 		this_cpu_write(printk_func, nmi_vprintk);
 		pr_warn("NMI backtrace for cpu %d\n", cpu);
-		show_regs(regs);
+		if (regs)
+			show_regs(regs);
+		else
+			dump_stack();
 		this_cpu_write(printk_func, printk_func_save);

 		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));