From patchwork Thu Mar 26 05:39:01 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Viresh Kumar
X-Patchwork-Id: 46342
From: Viresh Kumar
To: akpm@linux-foundation.org, hannes@cmpxchg.org, cl@linux.com
Cc: linaro-kernel@lists.linaro.org, linux-kernel@vger.kernel.org,
 vinmenon@codeaurora.org, shashim@codeaurora.org, mhocko@suse.cz,
 mgorman@suse.de, dave@stgolabs.net, koct9i@gmail.com, linux-mm@kvack.org,
 Viresh Kumar
Subject: [RFC] vmstat: Avoid waking up idle-cpu to service shepherd work
Date: Thu, 26 Mar 2015 11:09:01 +0530
Message-Id: <359c926bc85cdf79650e39f2344c2083002545bb.1427347966.git.viresh.kumar@linaro.org>
X-Mailer: git-send-email 2.3.0.rc0.44.ga94655d
Sender: linux-kernel-owner@vger.kernel.org
Precedence: list
X-Mailing-List: linux-kernel@vger.kernel.org
X-Original-Sender: viresh.kumar@linaro.org

A delayed work that runs vmstat_shepherd() is queued at periodic intervals
as part of the vmstat core's internal operation. This work and its timer
sometimes end up waking an idle CPU, because they always stay on CPU0.

Because we re-queue the work from its handler, idle_cpu() returns false and
so the timer (used by the delayed work) never migrates to any other CPU.

This is not always the desired behavior, as waking up an idle CPU just to
queue work on a few other CPUs isn't good from a power-consumption point of
view.

In order to avoid waking up an idle core, we can replace
schedule_delayed_work() with a normal work plus a separate timer. The timer
handler will then queue the work after re-arming the timer.
If the CPU was idle before the timer fired, idle_cpu() will mostly return
true and the next timer shall be migrated to a non-idle CPU. But the timer
core has a limitation: when a timer is re-armed from its handler, the timer
core disables migration of that timer to other cores. Details of that
limitation are present in the kernel/time/timer.c:__mod_timer() routine.

Another simple yet effective solution is to keep two timers with the same
handler and keep toggling between them, so that the above limitation no
longer applies.

This patch replaces schedule_delayed_work() with schedule_work() plus two
timers. After this, it was seen that the timer and its work do get migrated
to other non-idle CPUs when the local CPU is idle.

Tested-by: Vinayak Menon
Tested-by: Shiraz Hashim
Signed-off-by: Viresh Kumar
---
This patch isn't sent to say it's the best way forward, but to get a
discussion started on the same.

 mm/vmstat.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index 4f5cd974e11a..d45e4243a046 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1424,8 +1424,18 @@ static bool need_update(int cpu)
  * inactivity.
  */
 static void vmstat_shepherd(struct work_struct *w);
+static DECLARE_WORK(shepherd, vmstat_shepherd);
 
-static DECLARE_DELAYED_WORK(shepherd, vmstat_shepherd);
+/*
+ * Two timers are used here to avoid waking up an idle CPU. If a single timer is
+ * kept, then re-arming the timer from its handler doesn't let us migrate it.
+ */
+static struct timer_list shepherd_timer[2];
+#define toggle_timer() (shepherd_timer_index = !shepherd_timer_index, \
+		&shepherd_timer[shepherd_timer_index])
+
+static void vmstat_shepherd_timer(unsigned long data);
+static int shepherd_timer_index;
 
 static void vmstat_shepherd(struct work_struct *w)
 {
@@ -1441,15 +1451,19 @@ static void vmstat_shepherd(struct work_struct *w)
 				&per_cpu(vmstat_work, cpu), 0);
 
 	put_online_cpus();
+}
 
-	schedule_delayed_work(&shepherd,
-		round_jiffies_relative(sysctl_stat_interval));
+static void vmstat_shepherd_timer(unsigned long data)
+{
+	mod_timer(toggle_timer(),
+		  jiffies + round_jiffies_relative(sysctl_stat_interval));
+	schedule_work(&shepherd);
 
 }
 
 static void __init start_shepherd_timer(void)
 {
-	int cpu;
+	int cpu, i = -1;
 
 	for_each_possible_cpu(cpu)
 		INIT_DELAYED_WORK(per_cpu_ptr(&vmstat_work, cpu),
@@ -1459,8 +1473,13 @@ static void __init start_shepherd_timer(void)
 		BUG();
 	cpumask_copy(cpu_stat_off, cpu_online_mask);
 
-	schedule_delayed_work(&shepherd,
-		round_jiffies_relative(sysctl_stat_interval));
+	while (++i < 2) {
+		init_timer(&shepherd_timer[i]);
+		shepherd_timer[i].function = vmstat_shepherd_timer;
+	};
+
+	mod_timer(toggle_timer(),
+		  jiffies + round_jiffies_relative(sysctl_stat_interval));
 }
 
 static void vmstat_cpu_dead(int node)
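
To make the reasoning in the changelog easier to follow, here is a minimal
userspace sketch of the migration rule it describes. This is a toy model and
not kernel code: sim_timer, sim_nohz_target(), sim_mod_timer() and sim_run()
are hypothetical names, and the model assumes two CPUs where CPU0 is always
idle and CPU1 is always busy. It only captures the two rules stated above: a
timer is steered to a non-idle CPU when the local CPU is idle, and a timer
that is re-armed while its own handler is running cannot change CPU.

```c
#include <assert.h>
#include <stdbool.h>

enum { NCPUS = 2 };

/* Toy stand-in for a kernel timer: only tracks where it is queued. */
struct sim_timer {
	int cpu;	/* CPU the timer is currently queued on */
	bool running;	/* true while this timer's handler executes */
};

/* Assumed rule: stay local unless the local CPU is idle, then pick a busy one. */
static int sim_nohz_target(int local_cpu, const bool idle[NCPUS])
{
	if (!idle[local_cpu])
		return local_cpu;
	for (int c = 0; c < NCPUS; c++)
		if (!idle[c])
			return c;
	return local_cpu;
}

/* Assumed rule: a timer whose handler is running keeps its current CPU. */
static void sim_mod_timer(struct sim_timer *t, int local_cpu,
			  const bool idle[NCPUS])
{
	int target = sim_nohz_target(local_cpu, idle);

	if (target != t->cpu && !t->running)
		t->cpu = target;
}

/*
 * Fire the shepherd timer `rounds` times using `ntimers` timers (1 or 2)
 * and return how many expirations landed on an idle CPU.
 */
int sim_run(int ntimers, int rounds)
{
	struct sim_timer timers[2] = { { 0, false }, { 0, false } };
	const bool idle[NCPUS] = { true, false };	/* CPU0 idle, CPU1 busy */
	int cur = 0, idle_wakeups = 0;

	assert(ntimers == 1 || ntimers == 2);

	for (int r = 0; r < rounds; r++) {
		struct sim_timer *t = &timers[cur];
		int cpu = t->cpu;

		if (idle[cpu])
			idle_wakeups++;

		t->running = true;
		cur = (cur + 1) % ntimers;	/* analogue of toggle_timer() */
		sim_mod_timer(&timers[cur], cpu, idle);
		t->running = false;
	}
	return idle_wakeups;
}
```

Under these assumptions, sim_run(1, 5) reports 5 idle wakeups, since the
single timer is always re-armed from its own handler and stays on CPU0, while
sim_run(2, 5) reports 1, since the first handler arms the other timer, which
is free to move to CPU1.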