[RFC,2/2] locking/spinlock/debug: Add checks for kgdb trap safety

Message ID 20200522145510.2109799-3-daniel.thompson@linaro.org
State New
Series
  • Introduce KGDB_DEBUG_SPINLOCKS

Commit Message

Daniel Thompson May 22, 2020, 2:55 p.m.
In general it is not safe to call spin_lock() whilst executing in the
kgdb trap handler. The trap can be entered from all sorts of execution
contexts (NMI, IRQ, irqs disabled, etc.) and kgdb/kdb needs to be
as resilient as possible.

Currently it is difficult to spot mistakes in the kgdb/kdb logic
(especially so for kdb because it uses more kernel features than
pure-kgdb). Let's provide a means to bring attention to deadlock
risks in the debug code.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>

---
 include/linux/kgdb.h            | 16 ++++++++++++++++
 kernel/locking/spinlock_debug.c |  4 ++++
 lib/Kconfig.kgdb                | 11 +++++++++++
 3 files changed, 31 insertions(+)

-- 
2.25.4
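
The idea in the commit message can be modelled outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: `in_debug_trap` is a hypothetical stand-in for in_dbg_master(), and `struct model_lock`, `model_lock_before()` and `model_trap_handler()` are names invented for illustration. It only shows the shape of the check: a hook that runs before the lock starts spinning and reports when the caller is inside the debug trap.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for in_dbg_master(): true while the "trap" runs. */
static bool in_debug_trap;

/* Number of lock attempts made from inside the modelled trap handler. */
static int trap_lock_violations;

/* Toy spinlock; the real raw_spinlock_t carries far more debug state. */
struct model_lock {
	atomic_flag held;
	const char *name;
};

#define MODEL_LOCK_INIT(n) { ATOMIC_FLAG_INIT, (n) }

/* Plays the role of debug_spin_lock_before(): check context, then wait. */
static void model_lock_before(struct model_lock *lock)
{
	if (in_debug_trap) {
		trap_lock_violations++;
		fprintf(stderr, "BUG: lock '%s' taken in debug trap\n",
			lock->name);
	}
}

static void model_lock(struct model_lock *lock)
{
	model_lock_before(lock);
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&lock->held,
						 memory_order_acquire))
		;
}

static void model_unlock(struct model_lock *lock)
{
	atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

/* Models debug code that (wrongly) takes a lock while the system is halted. */
static void model_trap_handler(struct model_lock *lock)
{
	in_debug_trap = true;
	model_lock(lock);
	model_unlock(lock);
	in_debug_trap = false;
}
```

In this model the lock is uncontended, so the trap handler completes and the only effect is the report. In the situation the patch targets, the lock owner would be halted by the trap and the spin loop would never terminate, which is why the check fires before waiting rather than after.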

Comments

Peter Zijlstra May 22, 2020, 4:35 p.m. | #1
On Fri, May 22, 2020 at 03:55:10PM +0100, Daniel Thompson wrote:
> In general it is not safe to call spin_lock() whilst executing in the
> kgdb trap handler. The trap can be entered from all sorts of execution
> contexts (NMI, IRQ, irqs disabled, etc.) and kgdb/kdb needs to be
> as resilient as possible.
> 
> Currently it is difficult to spot mistakes in the kgdb/kdb logic
> (especially so for kdb because it uses more kernel features than
> pure-kgdb). Let's provide a means to bring attention to deadlock
> risks in the debug code.

I really dislike this thing. Also, commit:

  f6f48e180404 ("lockdep: Teach lockdep about "USED" <- "IN-NMI" inversions")

should be able to trigger here when the kgdb traps are marked as NMI.
x86 will soon have that.
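
For readers unfamiliar with the check that commit refers to, here is a heavily simplified userspace model of it. This is a sketch under assumptions, not lockdep's code: `struct model_lock_class`, `model_in_nmi` and `model_lock_acquire()` are invented names, and the rule modelled (a non-trylock acquisition in NMI context of a lock class that already has prior use gets reported, while a first use from NMI context cannot be recorded) is a reading of the commit's description rather than a transcription of its logic.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-class state; real lockdep tracks far more than this. */
struct model_lock_class {
	const char *name;
	bool used;	/* class has been acquired outside NMI context */
};

static bool model_in_nmi;	/* stand-in for in_nmi() */
static int model_inversions;	/* number of reports issued */

static void model_lock_acquire(struct model_lock_class *class, bool trylock)
{
	if (!model_in_nmi) {
		/* Normal context: remember that the class has been used. */
		class->used = true;
		return;
	}

	/*
	 * NMI context: nothing is recorded, only prior use is checked.
	 * A trylock cannot spin forever, so it is not reported.
	 */
	if (class->used && !trylock) {
		model_inversions++;
		fprintf(stderr, "inversion: '%s' USED, then taken IN-NMI\n",
			class->name);
	}
}
```

Under this model, a spinlock that kdb takes from a trap marked as NMI would be reported only if the same lock class had been acquired earlier from ordinary context, which matches the "USED" then "IN-NMI" ordering in the commit title.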

Patch

diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index b072aeb1fd78..de30ce8078cf 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -332,4 +332,20 @@  extern void kgdb_panic(const char *msg);
 #define dbg_late_init()
 static inline void kgdb_panic(const char *msg) {}
 #endif /* ! CONFIG_KGDB */
+
+#ifdef CONFIG_KGDB_DEBUG_SPINLOCK
+/**
+ * check_kgdb_context_before() - Check whether to issue a spinlock warning
+ *
+ * Currently this only reports when the master processor violates the
+ * locking rules (because we are using the in_dbg_master() macro since
+ * we are confident that will avoid false positives).
+ *
+ * Return: True if we are executing in the debug trap
+ */
+static inline int check_kgdb_context_before(void) { return in_dbg_master(); }
+#else
+static inline int check_kgdb_context_before(void) { return 0; }
+#endif
+
 #endif /* _KGDB_H_ */
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index b9d93087ee66..b49789e0fed8 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -12,6 +12,7 @@ 
 #include <linux/debug_locks.h>
 #include <linux/delay.h>
 #include <linux/export.h>
+#include <linux/kgdb.h>
 
 void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
 			  struct lock_class_key *key, short inner)
@@ -84,6 +85,7 @@  debug_spin_lock_before(raw_spinlock_t *lock)
 	SPIN_BUG_ON(READ_ONCE(lock->owner) == current, lock, "recursion");
 	SPIN_BUG_ON(READ_ONCE(lock->owner_cpu) == raw_smp_processor_id(),
 							lock, "cpu recursion");
+	SPIN_BUG_ON(check_kgdb_context_before(), lock, "in debug trap");
 }
 
 static inline void debug_spin_lock_after(raw_spinlock_t *lock)
@@ -174,6 +176,7 @@  int do_raw_read_trylock(rwlock_t *lock)
 void do_raw_read_unlock(rwlock_t *lock)
 {
 	RWLOCK_BUG_ON(lock->magic != RWLOCK_MAGIC, lock, "bad magic");
+	RWLOCK_BUG_ON(check_kgdb_context_before(), lock, "in debug trap");
 	arch_read_unlock(&lock->raw_lock);
 }
 
@@ -183,6 +186,7 @@  static inline void debug_write_lock_before(rwlock_t *lock)
 	RWLOCK_BUG_ON(lock->owner == current, lock, "recursion");
 	RWLOCK_BUG_ON(lock->owner_cpu == raw_smp_processor_id(),
 							lock, "cpu recursion");
+	RWLOCK_BUG_ON(check_kgdb_context_before(), lock, "in debug trap");
 }
 
 static inline void debug_write_lock_after(rwlock_t *lock)
diff --git a/lib/Kconfig.kgdb b/lib/Kconfig.kgdb
index 933680b59e2d..4d57900d6c53 100644
--- a/lib/Kconfig.kgdb
+++ b/lib/Kconfig.kgdb
@@ -29,6 +29,17 @@  config KGDB_SERIAL_CONSOLE
 	  Share a serial console with kgdb. Sysrq-g must be used
 	  to break in initially.
 
+config KGDB_DEBUG_SPINLOCK
+	bool "KGDB: Check for spin lock usage when system is halted"
+	select DEBUG_SPINLOCK
+	default n
+	help
+	  Say Y here to catch spin lock waiting when we are running
+	  in the kgdb trap handler and report it. When the trap handler
+	  is executing all other system activity is halted and spin lock
+	  contention will lead to deadlock. This makes any spin lock wait
+	  from this execution context risky.
+
 config KGDB_TESTS
 	bool "KGDB: internal test suite"
 	default n