diff mbox

[3.19-rc2,v14,0/7] arm: Implement arch_trigger_all_cpu_backtrace

Message ID 54BF83C9.5060300@linaro.org
State New
Headers show

Commit Message

Daniel Thompson Jan. 21, 2015, 10:47 a.m. UTC
On 20/01/15 20:53, Stephen Boyd wrote:
> On 01/20/2015 02:25 AM, Daniel Thompson wrote:
>> On 13/01/15 10:26, Daniel Thompson wrote:
>>> Hi Thomas, Hi Jason:
>>>     Patches 1 to 3 are for you (and should be separable from the rest
>>>     of the series). The patches haven't changes since the last time
>>>     I posted them. The changes in v14 tidy up the later part of the
>>>     patch set in order to share more code between x86 and arm.
>> No review comments! Have I finally got this right?
>>
>> If so it possible and/or sensible to get patches 1-3 in a tree that
>> feeds linux-next. I'd really like the gic changes to meet the various
>> ARM build and boot bots.
> 
> With this patchset, is it possible to call sched_clock() from within NMI
> context? I ask because the generic sched_clock() code is not NMI safe
> today. We were planning on making it NMI safe by doing something similar
> to what was done for ktime_get_mono_fast_ns() but we haven't gotten
> around to it. Mostly because no architecture that uses generic
> sched_clock() has support for NMIs right now.

I've not done any work to make sched_clock() safe to call from NMI.
However since my patchset does not introduce any calls to sched_clock()
from NMI I think this is OK!

I ported Steven Rostedt's work to make arch_trigger_all_cpu_backtrace()
safe from NMI from x86 to ARM. One result of Steven's approach are that
printk() timestamping is deferred until we return to normal context.
Thus even with CONFIG_PRINTK_TIME we do not call local_clock() during
NMI processing.

To confirm the above I have added the code below to my kernel and ran it
with a fairly paranoid set of debugging options. The check does not fire.


Daniel.

Comments

Daniel Thompson Jan. 21, 2015, 1:48 p.m. UTC | #1
On 21/01/15 13:06, Steven Rostedt wrote:
> On Wed, 21 Jan 2015 10:47:37 +0000
> Daniel Thompson <daniel.thompson@linaro.org> wrote:
> 
> 
>>> With this patchset, is it possible to call sched_clock() from within NMI
>>> context? I ask because the generic sched_clock() code is not NMI safe
> 
> That's not good. Better not run function tracing, as that could trace
> functions in NMI context (I depend on that it does), and it uses
> sched_clock() as the default clock.

I think sched_clock is unsafe as in "may sometimes give the wrong value"
rather than "can lock up arbitrarily".  Thus the impact is unlikely to
be harmful enough to want to avoid tracing altogether.

It would require special care be taken when interpreting the timestamps
however. Also since update_sched_clock() is a notrace function its very
hard to figure out when timestamps are at risk.

Anyhow, the fix doesn't seem that hard. I can take a look.


Daniel.
Daniel Thompson Jan. 22, 2015, 11:21 a.m. UTC | #2
On 21/01/15 13:48, Daniel Thompson wrote:
> On 21/01/15 13:06, Steven Rostedt wrote:
>> On Wed, 21 Jan 2015 10:47:37 +0000
>> Daniel Thompson <daniel.thompson@linaro.org> wrote:
>>
>>
>>>> With this patchset, is it possible to call sched_clock() from within NMI
>>>> context? I ask because the generic sched_clock() code is not NMI safe
>>
>> That's not good. Better not run function tracing, as that could trace
>> functions in NMI context (I depend on that it does), and it uses
>> sched_clock() as the default clock.
> 
> I think sched_clock is unsafe as in "may sometimes give the wrong value"
> rather than "can lock up arbitrarily".  Thus the impact is unlikely to
> be harmful enough to want to avoid tracing altogether.

Just to update the record...

The above paragraph is wrong in every possible way. It is a livelock
(and I'm working on it).



> It would require special care be taken when interpreting the timestamps
> however. Also since update_sched_clock() is a notrace function its very
> hard to figure out when timestamps are at risk.
> 
> Anyhow, the fix doesn't seem that hard. I can take a look.
> 
> 
> Daniel.
>
diff mbox

Patch

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 630dd2372238..fea0deeb524b 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -111,8 +111,10 @@  extern void warn_slowpath_null(const char *file,
const int line);
        int __ret_warn_once = !!(condition);                    \
                                                                \
        if (unlikely(__ret_warn_once))                          \
-               if (WARN_ON(!__warned))                         \
+               if (unlikely(!__warned)) {                      \
                        __warned = true;                        \
+                       __WARN();                               \
+               }                                               \
        unlikely(__ret_warn_once);                              \
 })

diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index 01d2d15aa662..81ea469b7e68 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -63,6 +63,8 @@  unsigned long long notrace sched_clock(void)
        u64 cyc;
        unsigned long seq;

+       WARN_ON_ONCE(in_nmi());
+
        if (cd.suspended)
                return cd.epoch_ns;