From patchwork Tue Oct 25 13:58:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 618626
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 2CD9CC04A95
 for ; Tue, 25 Oct 2022 13:59:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S231696AbiJYN72 (ORCPT ); Tue, 25 Oct 2022 09:59:28 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47330
 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S231191AbiJYN7Y (ORCPT );
 Tue, 25 Oct 2022 09:59:24 -0400
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D25561B9CD;
 Tue, 25 Oct 2022 06:59:22 -0700 (PDT)
From: Anna-Maria Behnsen
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de;
 s=2020; t=1666706361;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=JsI07DTqXjWlRwYDoljjsPLtGx2xh4qIy28E4h8N7no=;
 b=T3EsepFnG1VEPakgQAPv6utfeISI5b9SCb3qu9nw9yeGr8jw8t1zR6Dx6ch73PUNFHVOGL
 oqi15SnAg6HRSDAkl4rHMmQDmCPs6a9Lv63YmgSPuSPwPNSps9UbplBD71AHpxg2pJhoxM
 FyKJ6I+g+ewtLZORJV916KEkkleW6We3dblqJb1P61KSpLSm5qD968Gg2KoZJ148/HVXoU
 TIw427aYJT6plp81tvDQn1+JvmlMVHx6z/OXTHK8T9S81zxHoAeP4HroqZjYTVa+HVrrka
 zTpa7evXpC5fGDXUF91xC6m66JO487MZ65rfWuv1Y0eNCYhcD7momR1AF1QKSQ==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de;
 s=2020e; t=1666706361;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=JsI07DTqXjWlRwYDoljjsPLtGx2xh4qIy28E4h8N7no=;
 b=i7qAAP55lL27scwdgwJ3teCam7iQjkFLDDmuZcmI7LTkktrsPSWpugrETm6inYdHUHJQBi
 M/D/ekZWd+6fl2BQ==
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Eric Dumazet, Thomas Gleixner,
 "Rafael J. Wysocki", linux-pm@vger.kernel.org, Arjan van de Ven,
 "Paul E. McKenney", Frederic Weisbecker, Rik van Riel,
 Anna-Maria Behnsen, "Rafael J . Wysocki", Viresh Kumar, Michael Ellerman
Subject: [PATCH v3 01/17] cpufreq: Prepare timer flags for hierarchical timer pull model
Date: Tue, 25 Oct 2022 15:58:34 +0200
Message-Id: <20221025135850.51044-2-anna-maria@linutronix.de>
In-Reply-To: <20221025135850.51044-1-anna-maria@linutronix.de>
References: <20221025135850.51044-1-anna-maria@linutronix.de>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-pm@vger.kernel.org

Note: This is a proposal only. I was waiting for input on how to change
this driver properly to use the already existing infrastructure. See the
thread on the linux-pm mailing list:

  https://lore.kernel.org/linux-pm/4c99f34b-40f1-e6cc-2669-7854b615b5fd@linutronix.de/

The gpstates timer is the only timer using both the TIMER_PINNED and the
TIMER_DEFERRABLE flag. When moving to the hierarchical timer pull model,
pinned and deferrable timers are stored in separate bases. To ensure the
gpstates timer always expires on the CPU it is pinned to, keep only the
TIMER_PINNED flag and drop the TIMER_DEFERRABLE flag.

While at it, rewrite the comment explaining the rule for the timer expiry
of the next interval and fix whitespace damage.

Signed-off-by: Anna-Maria Behnsen
Cc: linux-pm@vger.kernel.org
Cc: Rafael J. Wysocki
Cc: Viresh Kumar
Cc: Michael Ellerman
---
 drivers/cpufreq/powernv-cpufreq.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index fddbd1ea1635..08d6bd54539d 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -640,18 +640,18 @@ static inline int calc_global_pstate(unsigned int elapsed_time,
 	return highest_lpstate_idx + index_diff;
 }

-static inline void queue_gpstate_timer(struct global_pstate_info *gpstates)
+static inline void queue_gpstate_timer(struct global_pstate_info *gpstates)
 {
 	unsigned int timer_interval;

 	/*
-	 * Setting up timer to fire after GPSTATE_TIMER_INTERVAL ms, But
-	 * if it exceeds MAX_RAMP_DOWN_TIME ms for ramp down time.
-	 * Set timer such that it fires exactly at MAX_RAMP_DOWN_TIME
-	 * seconds of ramp down time.
+	 * Timer should expire next time after GPSTATE_TIMER_INTERVAL. If
+	 * the resulting interval (elapsed time + interval) between last
+	 * and next timer expiry is greater than MAX_RAMP_DOWN_TIME, ensure
+	 * it is maximum MAX_RAMP_DOWN_TIME when queueing the next timer.
 	 */
 	if ((gpstates->elapsed_time + GPSTATE_TIMER_INTERVAL)
-	 > MAX_RAMP_DOWN_TIME)
+	 > MAX_RAMP_DOWN_TIME)
 		timer_interval = MAX_RAMP_DOWN_TIME - gpstates->elapsed_time;
 	else
 		timer_interval = GPSTATE_TIMER_INTERVAL;
@@ -865,8 +865,7 @@ static int powernv_cpufreq_cpu_init(struct cpufreq_policy *policy)

 	/* initialize timer */
 	gpstates->policy = policy;
-	timer_setup(&gpstates->timer, gpstate_timer_handler,
-		    TIMER_PINNED | TIMER_DEFERRABLE);
+	timer_setup(&gpstates->timer, gpstate_timer_handler, TIMER_PINNED);
 	gpstates->timer.expires = jiffies +
 				msecs_to_jiffies(GPSTATE_TIMER_INTERVAL);
 	spin_lock_init(&gpstates->gpstate_lock);

From patchwork Tue Oct 25 13:58:38 2022
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 618624
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Eric Dumazet, Thomas Gleixner,
 "Rafael J. Wysocki", linux-pm@vger.kernel.org, Arjan van de Ven,
 "Paul E. McKenney", Frederic Weisbecker, Rik van Riel,
 Anna-Maria Behnsen
Subject: [PATCH v3 05/17] timer: Rework idle logic
Date: Tue, 25 Oct 2022 15:58:38 +0200
Message-Id: <20221025135850.51044-6-anna-maria@linutronix.de>
In-Reply-To: <20221025135850.51044-1-anna-maria@linutronix.de>
References: <20221025135850.51044-1-anna-maria@linutronix.de>
X-Mailing-List: linux-pm@vger.kernel.org

From: Thomas Gleixner

To improve readability of the code, split the base->is_idle calculation
and the expires calculation into separate parts.

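For illustration only, the split can be modelled in userspace C roughly as
follows. This is a sketch, not the kernel code: `model_base` and
`model_next_expiry()` are made-up names, the TICK_NSEC value assumes HZ=250,
and the two comparison macros are simplified copies of the kernel's
wrap-safe jiffies helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for the sketch: HZ=250, i.e. one tick is 4 ms. */
#define TICK_NSEC 4000000ULL
#define KTIME_MAX ((uint64_t)INT64_MAX)

/* Simplified, wrap-safe jiffies comparisons modelled on the kernel macros. */
#define time_after(a, b)     ((long)((b) - (a)) < 0)
#define time_before_eq(a, b) ((long)((b) - (a)) >= 0)

/* Hypothetical stand-in for the two timer_base fields used here. */
struct model_base {
	bool is_idle;
	bool timers_pending;
};

static uint64_t model_next_expiry(struct model_base *base,
				  unsigned long nextevt,
				  unsigned long basej, uint64_t basem)
{
	uint64_t expires = KTIME_MAX;

	/* Part 1: idle state depends only on the distance to the next event. */
	base->is_idle = time_after(nextevt, basej + 1);

	/* Part 2: the expiry value is only computed if a timer is pending. */
	if (base->timers_pending) {
		/* If a tick was missed already, force a zero delta. */
		if (time_before_eq(nextevt, basej))
			nextevt = basej;
		expires = basem + (uint64_t)(nextevt - basej) * TICK_NSEC;
	}
	return expires;
}
```

With the two parts separated, the idle decision no longer depends on the
value of `expires`, which is what makes each part readable on its own.
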
Signed-off-by: Thomas Gleixner
Signed-off-by: Anna-Maria Behnsen
Reviewed-by: Frederic Weisbecker
---
 kernel/time/timer.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index f34bc75ff848..cb4194ecca60 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1727,21 +1727,20 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 			base->clk = nextevt;
 	}

-	if (time_before_eq(nextevt, basej)) {
-		expires = basem;
-		base->is_idle = false;
-	} else {
-		if (base->timers_pending)
-			expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
-		/*
-		 * If we expect to sleep more than a tick, mark the base idle.
-		 * Also the tick is stopped so any added timer must forward
-		 * the base clk itself to keep granularity small. This idle
-		 * logic is only maintained for the BASE_STD base, deferrable
-		 * timers may still see large granularity skew (by design).
-		 */
-		if ((expires - basem) > TICK_NSEC)
-			base->is_idle = true;
+	/*
+	 * Base is idle if the next event is more than a tick away. Also
+	 * the tick is stopped so any added timer must forward the base clk
+	 * itself to keep granularity small. This idle logic is only
+	 * maintained for the BASE_STD base, deferrable timers may still
+	 * see large granularity skew (by design).
+	 */
+	base->is_idle = time_after(nextevt, basej + 1);
+
+	if (base->timers_pending) {
+		/* If we missed a tick already, force 0 delta */
+		if (time_before_eq(nextevt, basej))
+			nextevt = basej;
+		expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
 	}

 	raw_spin_unlock(&base->lock);

From patchwork Tue Oct 25 13:58:40 2022
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 618623
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Eric Dumazet, Thomas Gleixner,
 "Rafael J. Wysocki", linux-pm@vger.kernel.org, Arjan van de Ven,
 "Paul E. McKenney", Frederic Weisbecker, Rik van Riel,
 Anna-Maria Behnsen, Richard Cochran
Subject: [PATCH v3 07/17] timer: Retrieve next expiry of pinned/non-pinned timers separately
Date: Tue, 25 Oct 2022 15:58:40 +0200
Message-Id: <20221025135850.51044-8-anna-maria@linutronix.de>
In-Reply-To: <20221025135850.51044-1-anna-maria@linutronix.de>
References: <20221025135850.51044-1-anna-maria@linutronix.de>
X-Mailing-List: linux-pm@vger.kernel.org

For the conversion of the NOHZ timer placement to a pull at expiry time
model, separate expiry times are required for the pinned and the
non-pinned (movable) timers. Therefore struct timer_events is introduced.

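A minimal userspace sketch of how a caller can consume the two values. It
mirrors what tick_nohz_next_event() does in this patch, where the global
expiry is folded into the local one via min_t() so that behaviour stays
unchanged. The helper name `model_fold_events()` is made up for
illustration and fixed-width userspace types replace u64.

```c
#include <stdint.h>

#define KTIME_MAX ((uint64_t)INT64_MAX)

/* Same two fields as the struct introduced by this patch. */
struct timer_events {
	uint64_t local;   /* next expiry of pinned (CPU local) timers */
	uint64_t global;  /* next expiry of non-pinned (movable) timers */
};

/*
 * Hypothetical helper: return the earlier of both expiries. An empty
 * queue is represented by KTIME_MAX and therefore never wins.
 */
static uint64_t model_fold_events(const struct timer_events *tevt)
{
	return tevt->local < tevt->global ? tevt->local : tevt->global;
}
```
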
No functional change.

Originally-by: Richard Cochran (linutronix GmbH)
Signed-off-by: Anna-Maria Behnsen
---
 kernel/time/tick-internal.h |  8 ++++-
 kernel/time/tick-sched.c    | 20 ++++++++----
 kernel/time/timer.c         | 65 ++++++++++++++++++++++++++++---------
 3 files changed, 70 insertions(+), 23 deletions(-)

diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 649f2b48e8f0..fcb2d45c2934 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -8,6 +8,11 @@
 #include "timekeeping.h"
 #include "tick-sched.h"

+struct timer_events {
+	u64 local;
+	u64 global;
+};
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS

 # define TICK_DO_TIMER_NONE -1
@@ -163,7 +168,8 @@ static inline void timers_update_nohz(void) { }

 DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);

-extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
+extern void get_next_timer_interrupt(unsigned long basej, u64 basem,
+				     struct timer_events *tevt);
 void timer_clear_idle(void);

 #define CLOCK_SET_WALL \
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 7ffdc7ba19b4..78f172d1f3d2 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -784,7 +784,8 @@ static inline bool local_timer_softirq_pending(void)

 static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)
 {
-	u64 basemono, next_tick, delta, expires;
+	struct timer_events tevt = { .local = KTIME_MAX, .global = KTIME_MAX };
+	u64 basemono, delta, expires;
 	unsigned long basejiff;
 	unsigned int seq;

@@ -809,7 +810,11 @@ static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)
 	 */
 	if (rcu_needs_cpu() || arch_needs_cpu() ||
 	    irq_work_needs_cpu() || local_timer_softirq_pending()) {
-		next_tick = basemono + TICK_NSEC;
+		/*
+		 * If anyone needs the CPU, treat this as a local
+		 * timer expiring in a jiffy.
+		 */
+		tevt.local = basemono + TICK_NSEC;
 	} else {
 		/*
 		 * Get the next pending timer. If high resolution
@@ -818,17 +823,18 @@ static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)
 		 * disabled this also looks at the next expiring
 		 * hrtimer.
 		 */
-		next_tick = get_next_timer_interrupt(basejiff, basemono);
-		ts->next_timer = next_tick;
+		get_next_timer_interrupt(basejiff, basemono, &tevt);
+		tevt.local = min_t(u64, tevt.local, tevt.global);
+		ts->next_timer = tevt.local;
 	}

 	/*
 	 * If the tick is due in the next period, keep it ticking or
 	 * force prod the timer.
 	 */
-	WARN_ON_ONCE(basemono > next_tick);
+	WARN_ON_ONCE(basemono > tevt.local);

-	delta = next_tick - basemono;
+	delta = tevt.local - basemono;
 	if (delta <= (u64)TICK_NSEC) {
 		/*
 		 * Tell the timer code that the base is not idle, i.e. undo
@@ -861,7 +867,7 @@ static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)
 	else
 		expires = KTIME_MAX;

-	ts->timer_expires = min_t(u64, expires, next_tick);
+	ts->timer_expires = min_t(u64, expires, tevt.local);

 out:
 	return ts->timer_expires;
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index b3eea90cb212..069478e65230 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1664,7 +1664,7 @@ static void __next_timer_interrupt(struct timer_base *base)
  * Check, if the next hrtimer event is before the next timer wheel
  * event:
  */
-static u64 cmp_next_hrtimer_event(u64 basem, u64 expires)
+static void cmp_next_hrtimer_event(u64 basem, struct timer_events *tevt)
 {
 	u64 nextevt = hrtimer_get_next_event();

@@ -1672,15 +1672,17 @@ static u64 cmp_next_hrtimer_event(u64 basem, u64 expires)
 	 * If high resolution timers are enabled
 	 * hrtimer_get_next_event() returns KTIME_MAX.
 	 */
-	if (expires <= nextevt)
-		return expires;
+	if (tevt->local <= nextevt)
+		return;

 	/*
 	 * If the next timer is already expired, return the tick base
 	 * time so the tick is fired immediately.
 	 */
-	if (nextevt <= basem)
-		return basem;
+	if (nextevt <= basem) {
+		tevt->local = basem;
+		return;
+	}

 	/*
 	 * Round up to the next jiffie. High resolution timers are
@@ -1690,7 +1692,7 @@ static u64 cmp_next_hrtimer_event(u64 basem, u64 expires)
 	 *
 	 * Use DIV_ROUND_UP_ULL to prevent gcc calling __divdi3
 	 */
-	return DIV_ROUND_UP_ULL(nextevt, TICK_NSEC) * TICK_NSEC;
+	tevt->local = DIV_ROUND_UP_ULL(nextevt, TICK_NSEC) * TICK_NSEC;
 }

@@ -1703,26 +1705,31 @@ static unsigned long next_timer_interrupt(struct timer_base *base)
 }

 /**
- * get_next_timer_interrupt - return the time (clock mono) of the next timer
+ * get_next_timer_interrupt
  * @basej:	base time jiffies
  * @basem:	base time clock monotonic
+ * @tevt:	Pointer to the storage for the expiry values
  *
- * Returns the tick aligned clock monotonic time of the next pending
- * timer or KTIME_MAX if no timer is pending.
+ * Stores the next pending local and global timer expiry values in the
+ * struct pointed to by @tevt. If a queue is empty the corresponding field
+ * is set to KTIME_MAX.
  */
-u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
+void get_next_timer_interrupt(unsigned long basej, u64 basem,
+			      struct timer_events *tevt)
 {
 	unsigned long nextevt, nextevt_local, nextevt_global;
 	struct timer_base *base_local, *base_global;
 	bool local_first, is_idle;
-	u64 expires = KTIME_MAX;
+
+	/* Preset local / global events */
+	tevt->local = tevt->global = KTIME_MAX;

 	/*
 	 * Pretend that there is no timer pending if the cpu is offline.
 	 * Possible pending timers will be migrated later to an active cpu.
 	 */
 	if (cpu_is_offline(smp_processor_id()))
-		return expires;
+		return;

 	base_local = this_cpu_ptr(&timer_bases[BASE_LOCAL]);
 	base_global = this_cpu_ptr(&timer_bases[BASE_GLOBAL]);
@@ -1779,16 +1786,44 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 	/* We need to mark both bases in sync */
 	base_local->is_idle = base_global->is_idle = is_idle;

-	if (base_local->timers_pending || base_global->timers_pending) {
+	/*
+	 * If the bases are not marked idle, i.e one of the events is at
+	 * max. one tick away, then the CPU can't go into a NOHZ idle
+	 * sleep. Use the earlier event of both and store it in the local
+	 * expiry value. The next global event is irrelevant in this case
+	 * and can be left as KTIME_MAX. CPU will wakeup on time.
+	 */
+	if (!is_idle) {
 		/* If we missed a tick already, force 0 delta */
 		if (time_before_eq(nextevt, basej))
 			nextevt = basej;
-		expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
+		tevt->local = basem + (u64)(nextevt - basej) * TICK_NSEC;
+		goto unlock;
 	}
+
+	/*
+	 * If the bases are marked idle, i.e. the next event on both the
+	 * local and the global queue are farther away than a tick,
+	 * evaluate both bases. No need to check whether one of the bases
+	 * has an already expired timer as this is caught by the !is_idle
+	 * condition above.
+	 */
+	if (base_local->timers_pending)
+		tevt->local = basem + (u64)(nextevt_local - basej) * TICK_NSEC;
+
+	/*
+	 * If the local queue expires first, then the global event can be
+	 * ignored. The CPU wakes up before that. If the global queue is
+	 * empty, nothing to do either.
+	 */
+	if (!local_first && base_global->timers_pending)
+		tevt->global = basem + (u64)(nextevt_global - basej) * TICK_NSEC;
+
+unlock:
 	raw_spin_unlock(&base_global->lock);
 	raw_spin_unlock(&base_local->lock);

-	return cmp_next_hrtimer_event(basem, expires);
+	cmp_next_hrtimer_event(basem, tevt);
 }

 /**

From patchwork Tue Oct 25 13:58:44 2022
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 618621
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Eric Dumazet, Thomas Gleixner,
 "Rafael J. Wysocki", linux-pm@vger.kernel.org, Arjan van de Ven,
 "Paul E. McKenney", Frederic Weisbecker, Rik van Riel,
 "Richard Cochran (linutronix GmbH)", Anna-Maria Behnsen
Subject: [PATCH v3 11/17] timer: Restructure internal locking
Date: Tue, 25 Oct 2022 15:58:44 +0200
Message-Id: <20221025135850.51044-12-anna-maria@linutronix.de>
In-Reply-To: <20221025135850.51044-1-anna-maria@linutronix.de>
References: <20221025135850.51044-1-anna-maria@linutronix.de>
X-Mailing-List: linux-pm@vger.kernel.org

From: "Richard Cochran (linutronix GmbH)"

Move the locking out from __run_timers() to the call sites, so the
protected section can be extended at the call site. Preparatory patch for
changing the NOHZ timer placement to a pull at expiry time model.

No functional change.

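The resulting call structure can be sketched in userspace as shown here.
This is only a model under stated assumptions: model_lock()/model_unlock()
stand in for raw_spin_lock_irq()/raw_spin_unlock_irq(), a plain assert()
stands in for lockdep_assert_held(), the jiffies comparisons ignore
wraparound, and all `model_*` names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

struct model_base {
	bool locked;                /* stand-in for base->lock */
	unsigned long next_expiry;
	unsigned long clk;
	int expired;                /* counts simulated expiry passes */
};

static void model_lock(struct model_base *b)
{
	assert(!b->locked);
	b->locked = true;
}

static void model_unlock(struct model_base *b)
{
	assert(b->locked);
	b->locked = false;
}

/* Inner function: takes no lock, only checks that the caller holds it. */
static void model_run_timers(struct model_base *b, unsigned long now)
{
	assert(b->locked);          /* models lockdep_assert_held() */

	while (now >= b->clk && now >= b->next_expiry) {
		b->expired++;
		b->clk++;
		b->next_expiry = b->clk + 10;   /* arbitrary rearm for the model */
	}
}

/* Call site: owns the early exit and the lock, so the section can grow here. */
static void model_run_timer_base(struct model_base *b, unsigned long now)
{
	if (now < b->next_expiry)
		return;

	model_lock(b);
	model_run_timers(b, now);
	model_unlock(b);
}
```

Because the lock is taken by the caller, a later caller can do additional
work inside the same locked section without touching the inner function.
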
Signed-off-by: Richard Cochran (linutronix GmbH)
Signed-off-by: Anna-Maria Behnsen
---
 kernel/time/timer.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 54631e86763f..bce2f87d5e70 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1911,11 +1911,7 @@ static inline void __run_timers(struct timer_base *base)
 	struct hlist_head heads[LVL_DEPTH];
 	int levels;

-	if (time_before(jiffies, base->next_expiry))
-		return;
-
-	timer_base_lock_expiry(base);
-	raw_spin_lock_irq(&base->lock);
+	lockdep_assert_held(&base->lock);

 	while (time_after_eq(jiffies, base->clk) &&
 	       time_after_eq(jiffies, base->next_expiry)) {
@@ -1935,21 +1931,36 @@ static inline void __run_timers(struct timer_base *base)
 		while (levels--)
 			expire_timers(base, heads + levels);
 	}
+}
+
+static void __run_timer_base(struct timer_base *base)
+{
+	if (time_before(jiffies, base->next_expiry))
+		return;
+
+	timer_base_lock_expiry(base);
+	raw_spin_lock_irq(&base->lock);
+	__run_timers(base);
 	raw_spin_unlock_irq(&base->lock);
 	timer_base_unlock_expiry(base);
 }

+static void run_timer_base(int index)
+{
+	struct timer_base *base = this_cpu_ptr(&timer_bases[index]);
+
+	__run_timer_base(base);
+}
+
 /*
  * This function runs timers and the timer-tq in bottom half context.
  */
 static __latent_entropy void run_timer_softirq(struct softirq_action *h)
 {
-	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_LOCAL]);
-
-	__run_timers(base);
+	run_timer_base(BASE_LOCAL);
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON)) {
-		__run_timers(this_cpu_ptr(&timer_bases[BASE_GLOBAL]));
-		__run_timers(this_cpu_ptr(&timer_bases[BASE_DEF]));
+		run_timer_base(BASE_GLOBAL);
+		run_timer_base(BASE_DEF);
 	}
 }

From patchwork Tue Oct 25 13:58:45 2022
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 618622
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Eric Dumazet, Thomas Gleixner,
 "Rafael J. Wysocki", linux-pm@vger.kernel.org, Arjan van de Ven,
 "Paul E. McKenney", Frederic Weisbecker, Rik van Riel,
 Anna-Maria Behnsen
Subject: [PATCH v3 12/17] timer: Check if timers base is handled already
Date: Tue, 25 Oct 2022 15:58:45 +0200
Message-Id: <20221025135850.51044-13-anna-maria@linutronix.de>
In-Reply-To: <20221025135850.51044-1-anna-maria@linutronix.de>
References: <20221025135850.51044-1-anna-maria@linutronix.de>
X-Mailing-List: linux-pm@vger.kernel.org

Due to the conversion of the NOHZ timer placement to a pull at expiry time
model, the per CPU timer bases with non-pinned timers are no longer
handled only by the local CPU. If a remote CPU is already expiring the
non-pinned timer base of the local CPU, nothing more needs to be done by
the local CPU. A check at the beginning of the expire timers routine is
required, because the timer base lock is dropped before executing the
timer callback function.

This is preparatory work and has no functional impact right now.

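The early exit can be modelled like this. In the sketch, `running_timer`
is a plain pointer that is non-NULL while a callback of this base executes
(possibly on behalf of a remote CPU). The `model_*` names are invented and
no real locking is shown; the caller is assumed to hold the base lock.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_timer {
	int id;
};

struct model_base {
	struct model_timer *running_timer;  /* non-NULL while a callback runs */
	int passes;                         /* expiry passes actually performed */
};

/*
 * Returns true if this call handled the base, false if another context
 * is already expiring it and there is nothing left to do here.
 */
static bool model_run_timers(struct model_base *base)
{
	if (base->running_timer)
		return false;

	base->passes++;
	return true;
}
```
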
Signed-off-by: Anna-Maria Behnsen
---
 kernel/time/timer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index bce2f87d5e70..0790ec8efe82 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1913,6 +1913,9 @@ static inline void __run_timers(struct timer_base *base)

 	lockdep_assert_held(&base->lock);

+	if (!!base->running_timer)
+		return;
+
 	while (time_after_eq(jiffies, base->clk) &&
 	       time_after_eq(jiffies, base->next_expiry)) {
 		levels = collect_expired_timers(base, heads);

From patchwork Tue Oct 25 13:58:46 2022
X-Patchwork-Submitter: Anna-Maria Behnsen
X-Patchwork-Id: 618620
From: Anna-Maria Behnsen
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, John Stultz, Eric Dumazet, Thomas Gleixner,
 "Rafael J. Wysocki", linux-pm@vger.kernel.org, Arjan van de Ven,
 "Paul E. McKenney", Frederic Weisbecker, Rik van Riel,
 "Richard Cochran (linutronix GmbH)", Anna-Maria Behnsen
Subject: [PATCH v3 13/17] tick/sched: Split out jiffies update helper function
Date: Tue, 25 Oct 2022 15:58:46 +0200
Message-Id: <20221025135850.51044-14-anna-maria@linutronix.de>
In-Reply-To: <20221025135850.51044-1-anna-maria@linutronix.de>
References: <20221025135850.51044-1-anna-maria@linutronix.de>
X-Mailing-List: linux-pm@vger.kernel.org

From: "Richard Cochran (linutronix GmbH)"

The logic to get the time of the last jiffies update will be needed by
the timer pull model as well. Move the code into a global function in
anticipation of the new caller.

No functional change.

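The read side that is being moved follows the usual sequence counter retry
pattern. A single-writer userspace model using C11 atomics could look like
the following. It is a sketch and not the kernel's seqcount
implementation; every `model_*` name is made up for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

static atomic_uint model_seq;               /* even: stable, odd: update in progress */
static _Atomic uint64_t model_last_update;  /* models last_jiffies_update */
static _Atomic unsigned long model_jiffies; /* models jiffies */

/* Writer side: bump the sequence before and after changing both values. */
static void model_write(unsigned long jiffies, uint64_t mono)
{
	atomic_fetch_add(&model_seq, 1);
	atomic_store(&model_last_update, mono);
	atomic_store(&model_jiffies, jiffies);
	atomic_fetch_add(&model_seq, 1);
}

/* Reader side: retry until both values were read under one even sequence. */
static uint64_t model_get_jiffies_update(unsigned long *basej)
{
	unsigned long basejiff;
	unsigned int seq;
	uint64_t basemono;

	do {
		seq = atomic_load(&model_seq);
		basemono = atomic_load(&model_last_update);
		basejiff = atomic_load(&model_jiffies);
	} while ((seq & 1) || atomic_load(&model_seq) != seq);

	*basej = basejiff;
	return basemono;
}
```

Returning the monotonic time and handing back jiffies through a pointer
keeps both values as a consistent pair for any caller.
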
Signed-off-by: Richard Cochran (linutronix GmbH) Signed-off-by: Anna-Maria Behnsen --- kernel/time/tick-internal.h | 1 + kernel/time/tick-sched.c | 20 ++++++++++++++++---- 2 files changed, 17 insertions(+), 4 deletions(-) diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h index 7b65abc9d803..8f99e35e75b4 100644 --- a/kernel/time/tick-internal.h +++ b/kernel/time/tick-internal.h @@ -158,6 +158,7 @@ static inline void tick_nohz_init(void) { } #ifdef CONFIG_NO_HZ_COMMON extern unsigned long tick_nohz_active; extern void timers_update_nohz(void); +extern u64 get_jiffies_update(unsigned long *basej); # ifdef CONFIG_SMP extern struct static_key_false timers_migration_enabled; # endif diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 7f7bfe8b498d..868679ff3421 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -782,19 +782,31 @@ static inline bool local_timer_softirq_pending(void) return local_softirq_pending() & BIT(TIMER_SOFTIRQ); } -static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu) +/* + * Read jiffies and the time when jiffies were updated last + */ +u64 get_jiffies_update(unsigned long *basej) { - struct timer_events tevt = { .local = KTIME_MAX, .global = KTIME_MAX }; - u64 basemono, delta, expires; unsigned long basejiff; unsigned int seq; + u64 basemono; - /* Read jiffies and the time when jiffies were updated last */ do { seq = read_seqcount_begin(&jiffies_seq); basemono = last_jiffies_update; basejiff = jiffies; } while (read_seqcount_retry(&jiffies_seq, seq)); + *basej = basejiff; + return basemono; +} + +static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu) +{ + struct timer_events tevt = { .local = KTIME_MAX, .global = KTIME_MAX }; + u64 basemono, delta, expires; + unsigned long basejiff; + + basemono = get_jiffies_update(&basejiff); ts->last_jiffies = basejiff; ts->timer_expires_base = basemono; From patchwork Tue Oct 25 13:58:48 2022 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 618618 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E77F1C38A2D for ; Tue, 25 Oct 2022 13:59:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232222AbiJYN7r (ORCPT ); Tue, 25 Oct 2022 09:59:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48134 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232174AbiJYN7h (ORCPT ); Tue, 25 Oct 2022 09:59:37 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8AB56E0BC; Tue, 25 Oct 2022 06:59:30 -0700 (PDT) From: Anna-Maria Behnsen DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1666706369; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3r8eFvXX7ewTWx91C3hEXLRF8LG7h66hSM9FnLvuvbY=; b=wClEbjJseJEAEj1agp1ZPP+vXzt++cAMTswcwsjpza8uM4kPm3Ndw5nGzdByxBJyy6XnYj pKpCZFQpmh2c26XnKRSz2WgpFCQotX3eW5Fl6LqxEJvEqyWf2vTcRrbps7Xob1oLfyJapq BYP/EkZHcZDe/4v3t2ExSLJf3vjmRMo1lIDEUgc+IR/LPlMVrezjVaXZSa9Gi+CXTtA1m6 dvvT+VSBkd3B3dw8amtXgd6Ga1aTUIuzJakP/Gn61aQatEke0E7eh16pqQDfCzazq4HTgk PC5z4UiGNzZHgekKKOa0fRBfJo6IESMAXwb9L5XqijA7qUoY8WOa5h8ycpTftQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1666706369; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=3r8eFvXX7ewTWx91C3hEXLRF8LG7h66hSM9FnLvuvbY=; b=JleMxpC5zVweo7sN4lVtElzmCTyxCBVd3omw897K6J4kqYM9TFA2vNfwz1/gzczHD76eER EnsNeZmfc+s3mnAw== To: linux-kernel@vger.kernel.org Cc: Peter Zijlstra , John Stultz , Eric Dumazet , Thomas Gleixner , "Rafael J. Wysocki" , linux-pm@vger.kernel.org, Arjan van de Ven , "Paul E. McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v3 15/17] timer_migration: Add tracepoints Date: Tue, 25 Oct 2022 15:58:48 +0200 Message-Id: <20221025135850.51044-16-anna-maria@linutronix.de> In-Reply-To: <20221025135850.51044-1-anna-maria@linutronix.de> References: <20221025135850.51044-1-anna-maria@linutronix.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org The timer pull logic needs proper debugging aids. Add tracepoints so the hierarchical idle machinery can be diagnosed. Signed-off-by: Anna-Maria Behnsen --- include/trace/events/timer_migration.h | 277 +++++++++++++++++++++++++ kernel/time/timer_migration.c | 24 +++ 2 files changed, 301 insertions(+) create mode 100644 include/trace/events/timer_migration.h diff --git a/include/trace/events/timer_migration.h b/include/trace/events/timer_migration.h new file mode 100644 index 000000000000..0c4824056930 --- /dev/null +++ b/include/trace/events/timer_migration.h @@ -0,0 +1,277 @@ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM timer_migration + +#if !defined(_TRACE_TIMER_MIGRATION_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_TIMER_MIGRATION_H + +#include + +/* Group events */ +TRACE_EVENT(tmigr_group_set, + + TP_PROTO(struct tmigr_group *group), + + TP_ARGS(group), + + TP_STRUCT__entry( + __field( void *, group ) + __field( unsigned int, lvl ) + __field( unsigned int, numa_node ) + ), + + TP_fast_assign( + __entry->group = group; + __entry->lvl = group->level; + __entry->numa_node = group->numa_node; + ), + + TP_printk("group=%p lvl=%d numa=%d", + __entry->group, __entry->lvl, __entry->numa_node) +); + 
+TRACE_EVENT(tmigr_connect_child_parent, + + TP_PROTO(struct tmigr_group *child), + + TP_ARGS(child), + + TP_STRUCT__entry( + __field( void *, child ) + __field( void *, parent ) + __field( unsigned int, lvl ) + __field( unsigned int, numa_node ) + __field( unsigned int, num_childs ) + __field( u32, childmask ) + ), + + TP_fast_assign( + __entry->child = child; + __entry->parent = child->parent; + __entry->lvl = child->parent->level; + __entry->numa_node = child->parent->numa_node; + __entry->num_childs = child->parent->num_childs; + __entry->childmask = child->childmask; + ), + + TP_printk("group=%p childmask=%0x parent=%p lvl=%d numa=%d num_childs=%d", + __entry->child, __entry->childmask, __entry->parent, + __entry->lvl, __entry->numa_node, __entry->num_childs) +); + +TRACE_EVENT(tmigr_connect_cpu_parent, + + TP_PROTO(struct tmigr_cpu *tmc), + + TP_ARGS(tmc), + + TP_STRUCT__entry( + __field( void *, parent ) + __field( unsigned int, cpu ) + __field( unsigned int, lvl ) + __field( unsigned int, numa_node ) + __field( unsigned int, num_childs ) + __field( u32, childmask ) + ), + + TP_fast_assign( + __entry->parent = tmc->tmgroup; + __entry->cpu = tmc->cpuevt.cpu; + __entry->lvl = tmc->tmgroup->level; + __entry->numa_node = tmc->tmgroup->numa_node; + __entry->num_childs = tmc->tmgroup->num_childs; + __entry->childmask = tmc->childmask; + ), + + TP_printk("cpu=%d childmask=%0x parent=%p lvl=%d numa=%d num_childs=%d", + __entry->cpu, __entry->childmask, __entry->parent, + __entry->lvl, __entry->numa_node, __entry->num_childs) +); + +DECLARE_EVENT_CLASS(tmigr_group_and_cpu, + + TP_PROTO(struct tmigr_group *group, union tmigr_state state, u32 childmask), + + TP_ARGS(group, state, childmask), + + TP_STRUCT__entry( + __field( void *, group ) + __field( void *, parent ) + __field( unsigned int, lvl ) + __field( unsigned int, numa_node ) + __field( u8, active ) + __field( u8, migrator ) + __field( u32, childmask ) + ), + + TP_fast_assign( + __entry->group = group; + 
__entry->parent = group->parent; + __entry->lvl = group->level; + __entry->numa_node = group->numa_node; + __entry->active = state.active; + __entry->migrator = state.migrator; + __entry->childmask = childmask; + ), + + TP_printk("group=%p lvl=%d numa=%d active=%0x migrator=%0x " + "parent=%p childmask=%0x", + __entry->group, __entry->lvl, __entry->numa_node, + __entry->active, __entry->migrator, + __entry->parent, __entry->childmask) +); + +DEFINE_EVENT(tmigr_group_and_cpu, tmigr_group_set_cpu_inactive, + + TP_PROTO(struct tmigr_group *group, union tmigr_state state, u32 childmask), + + TP_ARGS(group, state, childmask) +); + +DEFINE_EVENT(tmigr_group_and_cpu, tmigr_group_set_cpu_active, + + TP_PROTO(struct tmigr_group *group, union tmigr_state state, u32 childmask), + + TP_ARGS(group, state, childmask) +); + +/* CPU events*/ +DECLARE_EVENT_CLASS(tmigr_cpugroup, + + TP_PROTO(struct tmigr_cpu *tmc), + + TP_ARGS(tmc), + + TP_STRUCT__entry( + __field( void *, parent) + __field( unsigned int, cpu) + ), + + TP_fast_assign( + __entry->cpu = tmc->cpuevt.cpu; + __entry->parent = tmc->tmgroup; + ), + + TP_printk("cpu=%d parent=%p", __entry->cpu, __entry->parent) +); + +DEFINE_EVENT(tmigr_cpugroup, tmigr_cpu_new_timer, + + TP_PROTO(struct tmigr_cpu *tmc), + + TP_ARGS(tmc) +); + +DEFINE_EVENT(tmigr_cpugroup, tmigr_cpu_active, + + TP_PROTO(struct tmigr_cpu *tmc), + + TP_ARGS(tmc) +); + +DEFINE_EVENT(tmigr_cpugroup, tmigr_cpu_online, + + TP_PROTO(struct tmigr_cpu *tmc), + + TP_ARGS(tmc) +); + +DEFINE_EVENT(tmigr_cpugroup, tmigr_cpu_offline, + + TP_PROTO(struct tmigr_cpu *tmc), + + TP_ARGS(tmc) +); + +DEFINE_EVENT(tmigr_cpugroup, tmigr_handle_remote_cpu, + + TP_PROTO(struct tmigr_cpu *tmc), + + TP_ARGS(tmc) +); + +TRACE_EVENT(tmigr_cpu_idle, + + TP_PROTO(struct tmigr_cpu *tmc, u64 nextevt), + + TP_ARGS(tmc, nextevt), + + TP_STRUCT__entry( + __field( void *, parent) + __field( unsigned int, cpu) + __field( u64, nextevt) + ), + + TP_fast_assign( + __entry->cpu = tmc->cpuevt.cpu; + 
__entry->parent = tmc->tmgroup; + __entry->nextevt = nextevt; + ), + + TP_printk("cpu=%d parent=%p nextevt=%llu", + __entry->cpu, __entry->parent, __entry->nextevt) +); + +TRACE_EVENT(tmigr_update_events, + + TP_PROTO(struct tmigr_group *child, struct tmigr_group *group, + union tmigr_state childstate, union tmigr_state groupstate, + u64 nextevt), + + TP_ARGS(child, group, childstate, groupstate, nextevt), + + TP_STRUCT__entry( + __field( void *, child ) + __field( void *, group ) + __field( u64, nextevt ) + __field( u64, group_next_expiry ) + __field( unsigned int, group_lvl ) + __field( u8, child_active ) + __field( u8, group_active ) + __field( unsigned int, child_evtcpu ) + __field( u64, child_evt_expiry ) + ), + + TP_fast_assign( + __entry->child = child; + __entry->group = group; + __entry->nextevt = nextevt; + __entry->group_next_expiry = group->next_expiry; + __entry->group_lvl = group->level; + __entry->child_active = childstate.active; + __entry->group_active = groupstate.active; + __entry->child_evtcpu = child ? child->groupevt.cpu : 0; + __entry->child_evt_expiry = child ? 
child->groupevt.nextevt.expires : 0; + ), + + TP_printk("child=%p group=%p group_lvl=%d child_active=%0x group_active=%0x " + "nextevt=%llu next_expiry=%llu child_evt_expiry=%llu child_evtcpu=%d", + __entry->child, __entry->group, __entry->group_lvl, __entry->child_active, + __entry->group_active, + __entry->nextevt, __entry->group_next_expiry, __entry->child_evt_expiry, + __entry->child_evtcpu) +); + +TRACE_EVENT(tmigr_handle_remote, + + TP_PROTO(struct tmigr_group *group), + + TP_ARGS(group), + + TP_STRUCT__entry( + __field( void * , group ) + __field( unsigned int , lvl ) + ), + + TP_fast_assign( + __entry->group = group; + __entry->lvl = group->level; + ), + + TP_printk("group=%p lvl=%d", + __entry->group, __entry->lvl) +); + +#endif /* _TRACE_TIMER_MIGRATION_H */ + +/* This part must be outside protection */ +#include diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c index d68b8fd06a97..6b954e034959 100644 --- a/kernel/time/timer_migration.c +++ b/kernel/time/timer_migration.c @@ -13,6 +13,9 @@ #include "timer_migration.h" #include "tick-internal.h" +#define CREATE_TRACE_POINTS +#include + /* * The timer migration mechanism is built on a hierarchy of groups. 
The * lowest level group contains CPUs, the next level groups of CPU groups @@ -317,6 +320,8 @@ static bool tmigr_active_up(struct tmigr_group *group, */ set_bit(0, &group->groupevt.ignore); + trace_tmigr_group_set_cpu_active(group, newstate, childmask); + return walk_done; } @@ -341,6 +346,7 @@ void tmigr_cpu_activate(void) raw_spin_lock(&tmc->lock); tmc->idle = 0; tmc->wakeup = KTIME_MAX; + trace_tmigr_cpu_active(tmc); __tmigr_cpu_activate(tmc); raw_spin_unlock(&tmc->lock); } @@ -435,6 +441,9 @@ static bool tmigr_update_events(struct tmigr_group *group, data->nextexp = tmigr_next_groupevt_expires(group); } + trace_tmigr_update_events(child, group, data->childstate, + data->groupstate, nextexp); + unlock: raw_spin_unlock(&group->lock); @@ -478,6 +487,8 @@ static u64 tmigr_new_timer(struct tmigr_cpu *tmc, u64 nextexp) if (tmc->remote) return KTIME_MAX; + trace_tmigr_cpu_new_timer(tmc); + clear_bit(0, &tmc->cpuevt.ignore); data.groupstate.state = atomic_read(tmc->tmgroup->migr_state); @@ -576,6 +587,8 @@ static bool tmigr_inactive_up(struct tmigr_group *group, } } + trace_tmigr_group_set_cpu_inactive(group, data->childstate, childmask); + return walk_done; } @@ -652,6 +665,7 @@ u64 tmigr_cpu_deactivate(u64 nextexp) tmc->idle = 1; unlock: + trace_tmigr_cpu_idle(tmc, ret); raw_spin_unlock(&tmc->lock); return ret; } @@ -678,6 +692,8 @@ static u64 tmigr_handle_remote_cpu(unsigned int cpu, u64 now, return next; } + trace_tmigr_handle_remote_cpu(tmc); + tmc->remote = 1; /* Drop the lock to allow the remote CPU to exit idle */ @@ -728,6 +744,7 @@ static bool tmigr_handle_remote_up(struct tmigr_group *group, childmask = data->childmask; + trace_tmigr_handle_remote(group); again: /* * Handle the group only if @childmask is the migrator or if the @@ -963,6 +980,7 @@ static struct tmigr_group *tmigr_get_group(unsigned int cpu, unsigned int node, tmigr_init_group(group, lvl, node, migr_state); /* Setup successful. 
Add it to the hierarchy */ list_add(&group->list, &tmigr_level_list[lvl]); + trace_tmigr_group_set(group); return group; } @@ -981,6 +999,8 @@ static void tmigr_connect_child_parent(struct tmigr_group *child, raw_spin_unlock(&parent->lock); raw_spin_unlock_irqrestore(&child->lock, flags); + trace_tmigr_connect_child_parent(child); + /* * To prevent inconsistent states, active childs needs to be active * in new parent as well. Inactive childs are already marked @@ -1067,6 +1087,8 @@ static int tmigr_setup_groups(unsigned int cpu, unsigned int node) raw_spin_unlock_irqrestore(&group->lock, flags); + trace_tmigr_connect_cpu_parent(tmc); + /* There are no childs that needs to be connected */ continue; } else { @@ -1135,6 +1157,7 @@ static int tmigr_cpu_online(unsigned int cpu) tmc->wakeup = KTIME_MAX; } raw_spin_lock_irqsave(&tmc->lock, flags); + trace_tmigr_cpu_online(tmc); __tmigr_cpu_activate(tmc); tmc->online = 1; raw_spin_unlock_irqrestore(&tmc->lock, flags); @@ -1148,6 +1171,7 @@ static int tmigr_cpu_offline(unsigned int cpu) raw_spin_lock_irq(&tmc->lock); tmc->online = 0; __tmigr_cpu_deactivate(tmc, KTIME_MAX); + trace_tmigr_cpu_offline(tmc); raw_spin_unlock_irq(&tmc->lock); return 0; From patchwork Tue Oct 25 13:58:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anna-Maria Behnsen X-Patchwork-Id: 618619 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B6DABECDFA1 for ; Tue, 25 Oct 2022 13:59:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231526AbiJYN7q (ORCPT ); Tue, 25 Oct 2022 09:59:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48122 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232169AbiJYN7h (ORCPT 
); Tue, 25 Oct 2022 09:59:37 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C446FAC9; Tue, 25 Oct 2022 06:59:31 -0700 (PDT) From: Anna-Maria Behnsen DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1666706370; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=t38ZOSk7RADUS9yQPx+jhxDPVyx+Ww7vTN6ZsN90lKg=; b=bEwlhGzpXpX8hMG8IqtPZmhlQRTQikKxqNs+gbQgfv3WFD96OO3tUpQT4EGa3g/BzQnubQ YJSxR22rP/7t7RYGTs+tiAigsONu+jLzODJiGTM/3aFyR1hciqB+lBSGwaCnd2Th7JbGQq uwSOnET8vkJWh2fJdQx7Pue1BuPsppC5/Vnq2Kf4G1QWNwkR7MP7bc6yZ3yGh4vYfD9AGW NcMkYkdYIzu/1fLOmu4BzSl8JZJExpqx06XXG7SZsxGs6MycZ4DWUgN6eLrtxp1JzRVe61 FhrUh86UP3a4VJCA0facxuPBjkZB5p26oaplpC/jxnH/cB37pSiJfUs6KLYmgQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1666706370; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=t38ZOSk7RADUS9yQPx+jhxDPVyx+Ww7vTN6ZsN90lKg=; b=WydopO46wwu45sP+yRfKw5ZwrGNEVisF1hJAUAO3vVDu9Yj/PohrKCIGU8fAEgVTpFn5D9 G2QbL4hSwWjxuBDg== To: linux-kernel@vger.kernel.org Cc: Peter Zijlstra , John Stultz , Eric Dumazet , Thomas Gleixner , "Rafael J. Wysocki" , linux-pm@vger.kernel.org, Arjan van de Ven , "Paul E. 
McKenney" , Frederic Weisbecker , Rik van Riel , Anna-Maria Behnsen Subject: [PATCH v3 16/17] add_timer_on(): Make sure callers have TIMER_PINNED flag Date: Tue, 25 Oct 2022 15:58:49 +0200 Message-Id: <20221025135850.51044-17-anna-maria@linutronix.de> In-Reply-To: <20221025135850.51044-1-anna-maria@linutronix.de> References: <20221025135850.51044-1-anna-maria@linutronix.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org The hierarchical timer pull model at expiry time is now in place. Timers which should expire on a dedicated CPU need the TIMER_PINNED flag. Otherwise they will be queued on the dedicated CPU but in the global timer base, and those timers could also expire on other CPUs. Only timers with the TIMER_PINNED flag will end up in the local timer base. Therefore add the missing TIMER_PINNED flag for those callers that use add_timer_on() without the flag. Signed-off-by: Anna-Maria Behnsen --- arch/x86/kernel/tsc_sync.c | 3 ++- kernel/time/clocksource.c | 2 +- kernel/workqueue.c | 7 +++++-- 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c index 9452dc9664b5..eab827288e0f 100644 --- a/arch/x86/kernel/tsc_sync.c +++ b/arch/x86/kernel/tsc_sync.c @@ -110,7 +110,8 @@ static int __init start_sync_check_timer(void) if (!cpu_feature_enabled(X86_FEATURE_TSC_ADJUST) || tsc_clocksource_reliable) return 0; - timer_setup(&tsc_sync_check_timer, tsc_sync_check_timer_fn, 0); + timer_setup(&tsc_sync_check_timer, tsc_sync_check_timer_fn, + TIMER_PINNED); tsc_sync_check_timer.expires = jiffies + SYNC_CHECK_INTERVAL; add_timer(&tsc_sync_check_timer); diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index 8058bec87ace..f8c310e62758 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -523,7 +523,7 @@ static inline void clocksource_start_watchdog(void) { if (watchdog_running || !watchdog || list_empty(&watchdog_list)) return; - timer_setup(&watchdog_timer, 
clocksource_watchdog, 0); + timer_setup(&watchdog_timer, clocksource_watchdog, TIMER_PINNED); watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL; add_timer_on(&watchdog_timer, cpumask_first(cpu_online_mask)); watchdog_running = 1; diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 7cd5f5e7e0a1..a0f7bf7be6f2 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -1670,10 +1670,13 @@ static void __queue_delayed_work(int cpu, struct workqueue_struct *wq, dwork->cpu = cpu; timer->expires = jiffies + delay; - if (unlikely(cpu != WORK_CPU_UNBOUND)) + if (unlikely(cpu != WORK_CPU_UNBOUND)) { + timer->flags |= TIMER_PINNED; add_timer_on(timer, cpu); - else + } else { + timer->flags &= ~TIMER_PINNED; add_timer(timer); + } } /**