From patchwork Fri Apr 11 16:38:30 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Viresh Kumar X-Patchwork-Id: 28290 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-qa0-f71.google.com (mail-qa0-f71.google.com [209.85.216.71]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 32C3120822 for ; Fri, 11 Apr 2014 16:38:51 +0000 (UTC) Received: by mail-qa0-f71.google.com with SMTP id j7sf14287554qaq.10 for ; Fri, 11 Apr 2014 09:38:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:mime-version:in-reply-to:references :date:message-id:subject:from:to:cc:sender:precedence:list-id :x-original-sender:x-original-authentication-results:mailing-list :list-post:list-help:list-archive:list-unsubscribe:content-type; bh=DMMcNjT6T8TZrM7J3iDXYpMRETxUMmJvb9eMNMeyhME=; b=eirbaWW1oPmQtSZgPEaEuYOy3bS857k2vNKN7NyM8vlB6Dw5FXonCo/9MV4kqemo65 GWoTweaqWGS5OK7HavkLqOjnaODD/HHv3Dx7AJoW5zYayBUzIs6SQeP+p36SD0wfIzUo ohWEymYl3mHQBYoRTriyBcxun9gey/BPdmR+87ZYwxRd1bdGr/wxRsuCprceqWsM0e+D ma9gItmuGA8eKIX8fnMHk54yaGTde3+ytj8+r97t2j1qU9DVmFBdW71H4sw5w9c78hED EHyNgySQy3Ph11w9EdpZsSvmboNPZnaCrysOyS6u4bEhSRZ+wd1salwnovxcBuNYYMO9 mOqA== X-Gm-Message-State: ALoCoQlnLyIRw5+Qm8PMIrywrD6deNs+Wg9XshEYIdXiSM81kj6Q9aLhaZZUkxR1WkvtGoIlxgOy X-Received: by 10.58.141.200 with SMTP id rq8mr11955655veb.31.1397234330943; Fri, 11 Apr 2014 09:38:50 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.140.100.137 with SMTP id s9ls1752463qge.57.gmail; Fri, 11 Apr 2014 09:38:50 -0700 (PDT) X-Received: by 10.220.163.3 with SMTP id y3mr4744295vcx.7.1397234330742; Fri, 11 Apr 2014 09:38:50 -0700 (PDT) Received: from mail-vc0-f176.google.com (mail-vc0-f176.google.com [209.85.220.176]) by mx.google.com with ESMTPS id sc7si1418590vdc.193.2014.04.11.09.38.50 for (version=TLSv1 
cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 11 Apr 2014 09:38:50 -0700 (PDT) Received-SPF: neutral (google.com: 209.85.220.176 is neither permitted nor denied by best guess record for domain of patch+caf_=patchwork-forward=linaro.org@linaro.org) client-ip=209.85.220.176; Received: by mail-vc0-f176.google.com with SMTP id lc6so5047517vcb.35 for ; Fri, 11 Apr 2014 09:38:50 -0700 (PDT) X-Received: by 10.52.51.226 with SMTP id n2mr923651vdo.57.1397234330654; Fri, 11 Apr 2014 09:38:50 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.220.221.72 with SMTP id ib8csp70652vcb; Fri, 11 Apr 2014 09:38:50 -0700 (PDT) X-Received: by 10.66.141.197 with SMTP id rq5mr28786133pab.64.1397234329361; Fri, 11 Apr 2014 09:38:49 -0700 (PDT) Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id et3si4553482pbc.420.2014.04.11.09.38.48; Fri, 11 Apr 2014 09:38:48 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759697AbaDKQik (ORCPT + 26 others); Fri, 11 Apr 2014 12:38:40 -0400 Received: from mail-oa0-f45.google.com ([209.85.219.45]:65419 "EHLO mail-oa0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1422913AbaDKQib (ORCPT ); Fri, 11 Apr 2014 12:38:31 -0400 Received: by mail-oa0-f45.google.com with SMTP id eb12so6395289oac.4 for ; Fri, 11 Apr 2014 09:38:30 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.60.51.227 with SMTP id n3mr20337858oeo.33.1397234310402; Fri, 11 Apr 2014 09:38:30 -0700 (PDT) Received: by 10.182.28.168 with HTTP; Fri, 11 Apr 2014 09:38:30 -0700 (PDT) In-Reply-To: <20140411151825.GX11096@twins.programming.kicks-ass.net> References: <20140410143857.GA27654@localhost.localdomain> 
 <20140411145333.GC3438@localhost.localdomain>
 <20140411151825.GX11096@twins.programming.kicks-ass.net>
Date: Fri, 11 Apr 2014 22:08:30 +0530
Message-ID:
Subject: Re: [Query]: tick-sched: why don't we stop tick when we are running idle task?
From: Viresh Kumar
To: Peter Zijlstra
Cc: Frederic Weisbecker , Thomas Gleixner , Linux Kernel Mailing List , Lists linaro-kernel
Sender: linux-kernel-owner@vger.kernel.org
Precedence: list
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-Removed-Original-Auth: Dkim didn't pass.
X-Original-Sender: viresh.kumar@linaro.org
X-Original-Authentication-Results: mx.google.com; spf=neutral (google.com: 209.85.220.176 is neither permitted nor denied by best guess record for domain of patch+caf_=patchwork-forward=linaro.org@linaro.org) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org
Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org
X-Google-Group-Id: 836684582541
List-Post: ,
List-Help: ,
List-Archive:
List-Unsubscribe: ,

On 11 April 2014 20:48, Peter Zijlstra wrote:
> On Fri, Apr 11, 2014 at 04:53:35PM +0200, Frederic Weisbecker wrote:

> I think there's assumptions that tick runs on the local cpu;

Yes, many functions behave that way, i.e. they use smp_processor_id() as
the CPU.

> also what
> are you going to do when running it on all remote cpus takes longer than
> the tick?
>
>> Otherwise (and ideally) we need to make the scheduler code able to handle long periods without
>> calling scheduler_tick(). But this is a lot more plumbing. And the scheduler has gazillions
>> accounting stuffs to handle. Sounds like a big nightmare to take that direction.
>
> So i'm not at all sure what you guys are talking about, but it seems to
> me you should all put down the bong and have a detox round instead.
>
> This all sounds like a cure worse than the problem.

So, what I was working on isn't ready yet, but I would like to show the
approach I have been trying.
Tell me if it is completely incorrect and I should stop working on it :)

Please share your feedback about this (yes, several parts are currently
broken, especially the assumption that the tick runs on the local CPU):

NOTE: It's rebased over some cleanups I did which haven't been sent to LKML yet.

---------x-------------------------x---------------

tick-sched: offload NO_HZ_FULL computation to timekeeping CPUs

Signed-off-by: Viresh Kumar
---
 include/linux/tick.h     |  2 ++
 kernel/time/tick-sched.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 2 files changed, 82 insertions(+), 10 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 97bbb64..f8efa9f 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -62,6 +62,7 @@ enum tick_nohz_mode {
  */
 struct tick_sched {
 	struct hrtimer sched_timer;
+	struct cpumask timekeeping_pending;
 	enum tick_nohz_mode nohz_mode;
 	unsigned long check_clocks;
 	unsigned long idle_jiffies;
@@ -77,6 +78,7 @@ struct tick_sched {
 	ktime_t iowait_sleeptime;
 	ktime_t sleep_length;
 	ktime_t idle_expires;
+	unsigned int cpu;
 	int inidle;
 	int idle_active;
 	int tick_stopped;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 2d0b154..7560bd0 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -532,13 +532,56 @@ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
 }
 EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
 
+static int tick_nohz_full_get_offload_cpu(unsigned int cpu)
+{
+	const struct cpumask *nodemask = cpumask_of_node(cpu_to_node(cpu));
+	unsigned int offload_cpu;
+	cpumask_t cpumask;
+
+	/* Don't pick any NO_HZ_FULL cpu */
+	cpumask_andnot(&cpumask, cpu_online_mask, tick_nohz_full_mask);
+	cpumask_clear_cpu(cpu, &cpumask);
+
+	/* Try for same node. */
+	offload_cpu = cpumask_first_and(nodemask, &cpumask);
+	if (offload_cpu < nr_cpu_ids)
+		return offload_cpu;
+
+	/* Any online will do */
+	return cpumask_any(&cpumask);
+}
+
+static void tick_nohz_full_offload_timer(void *info)
+{
+	struct tick_sched *ts = info;
+	hrtimer_start_expires(&ts->sched_timer, HRTIMER_MODE_ABS_PINNED);
+}
+
+static void tick_nohz_full_offload_tick(struct tick_sched *ts, ktime_t expires,
+					unsigned int cpu)
+{
+	unsigned int offload_cpu = tick_nohz_full_get_offload_cpu(cpu);
+
+	/* Offload accounting to timekeeper */
+	hrtimer_cancel(&ts->sched_timer);
+	hrtimer_set_expires(&ts->sched_timer, expires);
+
+	/*
+	 * This is triggering a WARN_ON() as below routine doesn't support calls
+	 * while interrupts are disabled. Need to think of some other way to get
+	 * this fixed.
+	 */
+	smp_call_function_single(offload_cpu, tick_nohz_full_offload_timer, ts,
+				 true);
+}
+
 static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 					 ktime_t now, int cpu)
 {
 	unsigned long seq, last_jiffies, next_jiffies, delta_jiffies;
 	ktime_t last_update, expires, ret = { .tv64 = 0 };
 	unsigned long rcu_delta_jiffies;
-	struct clock_event_device *dev = tick_get_cpu_device()->evtdev;
+	struct clock_event_device *dev = tick_get_device(cpu)->evtdev;
 	u64 time_delta;
 
 	time_delta = timekeeping_max_deferment();
@@ -661,8 +704,12 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
 		}
 
 		if (ts->nohz_mode == NOHZ_MODE_HIGHRES) {
-			hrtimer_start(&ts->sched_timer, expires,
-				      HRTIMER_MODE_ABS_PINNED);
+			if (ts->inidle)
+				hrtimer_start(&ts->sched_timer, expires,
+					      HRTIMER_MODE_ABS_PINNED);
+			else
+				tick_nohz_full_offload_tick(ts, expires, cpu);
+
 			/* Check, if the timer was already in the past */
 			if (hrtimer_active(&ts->sched_timer))
 				goto out;
@@ -687,9 +734,7 @@ out:
 static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 {
 #ifdef CONFIG_NO_HZ_FULL
-	int cpu = smp_processor_id();
-
-	if (!tick_nohz_full_cpu(cpu) || is_idle_task(current))
+	if (!tick_nohz_full_cpu(ts->cpu) || is_idle_task(current))
 		return;
 
 	if (!ts->tick_stopped && ts->nohz_mode == NOHZ_MODE_INACTIVE)
@@ -698,7 +743,7 @@ static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 	if (!can_stop_full_tick())
 		return;
 
-	tick_nohz_stop_sched_tick(ts, ktime_get(), cpu);
+	tick_nohz_stop_sched_tick(ts, ktime_get(), ts->cpu);
 #endif
 }
 
@@ -824,11 +869,21 @@ EXPORT_SYMBOL_GPL(tick_nohz_idle_enter);
 void tick_nohz_irq_exit(void)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
+	unsigned int cpu;
 
-	if (ts->inidle)
+	if (ts->inidle) {
 		__tick_nohz_idle_enter(ts);
-	else
+	} else {
 		tick_nohz_full_stop_tick(ts);
+
+		while (!cpumask_empty(&ts->timekeeping_pending)) {
+			cpu = cpumask_first(&ts->timekeeping_pending);
+			cpumask_clear_cpu(cpu, &ts->timekeeping_pending);
+
+			/* Try to stop tick of NO_HZ_FULL cpu */
+			tick_nohz_full_stop_tick(tick_get_tick_sched(cpu));
+		}
+	}
 }
 
 /**
@@ -1090,13 +1145,23 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
 {
 	struct tick_sched *ts =
 		container_of(timer, struct tick_sched, sched_timer);
+	struct tick_sched *this_ts = &__get_cpu_var(tick_cpu_sched);
 	ktime_t now = ktime_get();
 
-	tick_sched_do_timer(now);
+	/* Running as timekeeper ? */
+	if (likely(ts == this_ts))
+		tick_sched_do_timer(now);
+	else
+		cpumask_set_cpu(ts->cpu, &this_ts->timekeeping_pending);
+
 	tick_sched_handle(ts);
 
 	hrtimer_forward(timer, now, tick_period);
 
+	/*
+	 * Yes, we are scheduling next tick also while timekeeping on this_cpu.
+	 * We will handle that from irq_exit().
+	 */
 	return HRTIMER_RESTART;
 }
 
@@ -1149,6 +1214,11 @@ void tick_setup_sched_timer(void)
 		tick_nohz_active = 1;
 	}
 #endif
+
+#ifdef CONFIG_NO_HZ_FULL
+	ts->cpu = smp_processor_id();
+	cpumask_clear(&ts->timekeeping_pending);
+#endif
 }
 #endif /* HIGH_RES_TIMERS */
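
For anyone who wants to poke at the CPU selection outside the kernel, here is
a minimal user-space sketch of what tick_nohz_full_get_offload_cpu() in the
patch is trying to do. It is only a model and not part of the patch: CPU sets
are plain 64-bit masks instead of struct cpumask, and MODEL_NR_CPUS,
model_first_cpu() and model_get_offload_cpu() are hypothetical names made up
for this illustration. As an assumption of the model, an empty candidate set
is reported as MODEL_NR_CPUS, which a caller would have to check.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64u	/* hypothetical limit, for this model only */

/* Index of the lowest set bit, or MODEL_NR_CPUS if the mask is empty. */
static unsigned int model_first_cpu(uint64_t mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;

	return MODEL_NR_CPUS;
}

/*
 * Pick a housekeeping CPU for 'cpu': never a NO_HZ_FULL CPU, never 'cpu'
 * itself, and a CPU from the same node ('node' mask) if one is available.
 * Returns MODEL_NR_CPUS when no candidate exists.
 */
static unsigned int model_get_offload_cpu(unsigned int cpu, uint64_t online,
					  uint64_t nohz_full, uint64_t node)
{
	uint64_t candidates;
	unsigned int offload_cpu;

	assert(cpu < MODEL_NR_CPUS);

	/* Don't pick any NO_HZ_FULL cpu, nor the cpu we are offloading */
	candidates = online & ~nohz_full;
	candidates &= ~(UINT64_C(1) << cpu);

	/* Try for same node. */
	offload_cpu = model_first_cpu(candidates & node);
	if (offload_cpu < MODEL_NR_CPUS)
		return offload_cpu;

	/* Any remaining online housekeeping cpu will do */
	return model_first_cpu(candidates);
}
```

With CPUs 0-7 online, CPUs 0 and 4 left as housekeepers (nohz_full mask 0xee)
and node masks 0x0f/0xf0, model_get_offload_cpu(5, 0xff, 0xee, 0xf0) picks
CPU 4 from the same node, while a node mask that contains no housekeeper
falls back to CPU 0.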