Message ID | CAKohpok3U_8zQurs219fOmwERFrnZ0sOE-YUc7B+fKYfFLmEpg@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Wed, Nov 12, 2014 at 11:41:09AM +0530, Viresh Kumar wrote: > On 11 November 2014 22:45, Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > Here is a summarized list: > > > > * Unbound workqueues affinity (to housekeeper) > > * Unbound timers affinity (to housekeeper) > > * 1 Hz residual scheduler tick offlining to housekeeper > > * Fix some scheduler accounting that don't even work with 1 Hz: cpu load > > accounting, rt_scale, load balancing, etc... > > * Lighten the syscall path and get rid of cputime accounting + RCU hooks > > for people who want isolation + fast syscalls and faults. > > * Work on non-affinable workqueues > > * Work on non-affinable timers > > * ... > > + spurious interrupts with NOHZ_FULL on all architectures which break isolation > but doesn't get caught with traces. Can be observed with this: > > diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c > index 481fa54..91d490d 100644 > --- a/kernel/time/hrtimer.c > +++ b/kernel/time/hrtimer.c > @@ -1244,7 +1244,8 @@ void hrtimer_interrupt(struct clock_event_device *dev) > { > struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases); > ktime_t expires_next, now, entry_time, delta; > - int i, retries = 0; > + int i, retries = 0, count = 0; > + static int total_spurious; > > BUG_ON(!cpu_base->hres_active); > cpu_base->nr_events++; > @@ -1304,10 +1305,14 @@ void hrtimer_interrupt(struct clock_event_device *dev) > break; > } > > + count++; > __run_hrtimer(timer, &basenow); > } > } > > + if (!count) > + pr_err("____%s: Totalspurious: %d\n", __func__, > ++total_spurious); > + I'd rather leave that to tracepoints. Like trace_hrtimer_spurious(). Or better yet: have trace_hrtimer_interrupt() which we can compare against trace_hrtimer_expire_entry/exit() to check if any hrtimer callback have run in the interrupt. This way we avoid workarounds like the above count. > /* > * Store the new expiry value so the migration code can verify > * against it. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On 12 November 2014 19:24, Frederic Weisbecker <fweisbec@gmail.com> wrote: > I'd rather leave that to tracepoints. Like trace_hrtimer_spurious(). Yeah, it was just to prove things right on the console without getting into traces. > Or better yet: have trace_hrtimer_interrupt() which we can compare against > trace_hrtimer_expire_entry/exit() to check if any hrtimer callback have run > in the interrupt. This way we avoid workarounds like the above count. Yeah, I also believe we better add this debug information to mainline kernel. I will try to get a patch for that soon. Would it be recommended to add both trace points? i.e. trace_hrtimer_interrupt() and trace_hrtimer_spurious() ? -- viresh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On Wed, Nov 12, 2014 at 08:26:05PM +0530, Viresh Kumar wrote: > On 12 November 2014 19:24, Frederic Weisbecker <fweisbec@gmail.com> wrote: > > I'd rather leave that to tracepoints. Like trace_hrtimer_spurious(). > > Yeah, it was just to prove things right on the console without getting > into traces. > > > Or better yet: have trace_hrtimer_interrupt() which we can compare against > > trace_hrtimer_expire_entry/exit() to check if any hrtimer callback have run > > in the interrupt. This way we avoid workarounds like the above count. > > Yeah, I also believe we better add this debug information to mainline kernel. > I will try to get a patch for that soon. > > Would it be recommended to add both trace points? > i.e. trace_hrtimer_interrupt() and trace_hrtimer_spurious() I don't think you need to add anything. We already have tracepoints for every single interrupt (and therefore also for the hrtimer one) and we have expiry tracepoints. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On 12 November 2014 20:36, Peter Zijlstra <peterz@infradead.org> wrote: > I don't think you need to add anything. We already have tracepoints for > every single interrupt (and therefore also for the hrtimer one) and we > have expiry tracepoints. I will crosscheck this again but as far as I remember these spurious interrupts aren't caught with tracers currently. Not even the irq_enter/exit ones. I saw that piece of code long back (and obviously don't understand it completely). But the problem was that because these are spurious clockevent interrupts, we don't have anything to do on that interrupt. No tick/sched/softirq processing. And the irq-handling code returns before even calling trace_irq_enter(). And that's why I had to add those prints initially. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On 12 November 2014 20:46, Viresh Kumar <viresh.kumar@linaro.org> wrote: > On 12 November 2014 20:36, Peter Zijlstra <peterz@infradead.org> wrote: >> I don't think you need to add anything. We already have tracepoints for >> every single interrupt (and therefore also for the hrtimer one) and we >> have expiry tracepoints. > > I will crosscheck this again but as far as I remember these spurious interrupts > aren't caught with tracers currently. Not even the irq_enter/exit ones. So yes, we do get irq_handler_entry/exit traces for the clockevent device. The problem here is that people miss these as there are no warnings issued here. And so nobody tried to resolve these spurious interrupts. This is how it looks on a ARM machine (With trace_hrtimer_spurious), though it is relevant for all architectures which emulate ONESHOT mode over PERIODIC mode: <idle>-0 [001] d.h2 719.642085: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.646506: hrtimer_spurious: count:105 <idle>-0 [001] d.h2 719.646511: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.646522: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719640000000 softexpires=719640000000 <idle>-0 [001] d..3 719.646545: hrtimer_cancel: hrtimer=ee7bb740 <idle>-0 [001] d.h2 719.647086: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.651540: hrtimer_spurious: count:106 <idle>-0 [001] d.h2 719.651545: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.651554: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719645000000 softexpires=719645000000 <idle>-0 [001] d..3 719.651576: hrtimer_cancel: hrtimer=ee7bb740 <idle>-0 [001] d.h2 719.652084: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.656571: hrtimer_spurious: count:107 <idle>-0 [001] d.h2 719.656575: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.656583: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719650000000 softexpires=719650000000 <idle>-0 [001] d..3 719.656601: hrtimer_cancel: hrtimer=ee7bb740 <idle>-0 [001] d.h2 719.657084: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.661611: hrtimer_spurious: count:108 <idle>-0 [001] d.h2 719.661616: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.661627: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719655000000 softexpires=719655000000 <idle>-0 [001] d..3 719.661652: hrtimer_cancel: hrtimer=ee7bb740 <idle>-0 [001] d.h2 719.662086: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.666643: hrtimer_spurious: count:109 <idle>-0 [001] d.h2 719.666648: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.666657: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719660000000 softexpires=719660000000 <idle>-0 [001] d..3 719.666680: hrtimer_cancel: hrtimer=ee7bb740 <idle>-0 [001] d.h2 719.667084: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.671678: hrtimer_spurious: count:110 <idle>-0 [001] d.h2 719.671682: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.671693: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719665000000 softexpires=719665000000 <idle>-0 [001] d..3 719.671714: hrtimer_cancel: hrtimer=ee7bb740 <idle>-0 [001] d.h2 719.672085: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.676712: hrtimer_spurious: count:111 <idle>-0 [001] d.h2 719.676717: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.676726: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719670000000 softexpires=719670000000 <idle>-0 [001] d..3 719.676749: hrtimer_cancel: hrtimer=ee7bb740 <idle>-0 [001] d.h2 719.677087: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.681745: hrtimer_spurious: count:112 <idle>-0 [001] d.h2 719.681750: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.681759: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719675000000 softexpires=719675000000 <idle>-0 [001] d..3 719.681780: hrtimer_cancel: hrtimer=ee7bb740 <idle>-0 [001] d.h2 719.682084: irq_handler_entry: irq=30 name=arch_timer <idle>-0 [001] d.h3 719.686783: hrtimer_spurious: count:113 <idle>-0 [001] d.h2 719.686787: irq_handler_exit: irq=30 ret=handled <idle>-0 [001] d..3 719.686797: hrtimer_start: hrtimer=ee7bb740 function=tick_sched_timer expires=719680000000 softexpires=719680000000 <idle>-0 [001] d..3 719.686819: hrtimer_cancel: hrtimer=ee7bb740 Normally this happens on a non-spurious interrupt: is-cpu-isolated-1963 [000] d.h1 719.642080: irq_handler_entry: irq=30 name=arch_timer is-cpu-isolated-1963 [000] d.h2 719.642086: hrtimer_cancel: hrtimer=ee7b3740 is-cpu-isolated-1963 [000] d.h1 719.642089: hrtimer_expire_entry: hrtimer=ee7b3740 function=tick_sched_timer now=719635008926 is-cpu-isolated-1963 [000] d.h1 719.642095: softirq_raise: vec=1 [action=TIMER] is-cpu-isolated-1963 [000] d.h1 719.642099: softirq_raise: vec=9 [action=RCU] is-cpu-isolated-1963 [000] d.h2 719.642104: sched_stat_runtime: comm=trace-isolation pid=1963 runtime=348042 [ns] vruntime=7804506025 [ns] is-cpu-isolated-1963 [000] d.h1 719.642111: hrtimer_expire_exit: hrtimer=ee7b3740 is-cpu-isolated-1963 [000] d.h2 719.642115: hrtimer_start: hrtimer=ee7b3740 function=tick_sched_timer expires=719640000000 softexpires=719640000000 is-cpu-isolated-1963 [000] d.h1 719.642121: irq_handler_exit: irq=30 ret=handled .. followed by softirq processing. Another trace point with trace_hrtimer_spurious() might end up being useful :) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c index 481fa54..91d490d 100644 --- a/kernel/time/hrtimer.c +++ b/kernel/time/hrtimer.c @@ -1244,7 +1244,8 @@ void hrtimer_interrupt(struct clock_event_device *dev) { struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases); ktime_t expires_next, now, entry_time, delta; - int i, retries = 0; + int i, retries = 0, count = 0; + static int total_spurious; BUG_ON(!cpu_base->hres_active); cpu_base->nr_events++; @@ -1304,10 +1305,14 @@ void hrtimer_interrupt(struct clock_event_device *dev) break; } + count++; __run_hrtimer(timer, &basenow); } } + if (!count) + pr_err("____%s: Totalspurious: %d\n", __func__, ++total_spurious); + /* * Store the new expiry value so the migration code can verify * against it.